AFIPS
CONFERENCE
PROCEEDINGS
VOLUME 35
1969
FALL JOINT
COMPUTER
CONFERENCE
November 18 - 20, 1969
Las Vegas, Nevada
The ideas and opinions expressed herein are solely those of the authors and are
not necessarily representative of or endorsed by the 1969 Fall Joint Computer
Conference Committee or the American Federation of Information Processing
Societies.
Library of Congress Catalog Card Number 55-44701
AFIPS PRESS
210 Summit Avenue
Montvale, New Jersey 07645
© 1969 by the American Federation of Information Processing Societies, Montvale,
New Jersey, 07645. All rights reserved. This book, or parts thereof, may not be
reproduced in any form without permission of AFIPS Press.
Printed in the United States of America
CONTENTS
OPERATING SYSTEMS
A survey of techniques for recognizing parallel processable streams
in computer programs ......................................
1
Performance modeling and empirical measurements in a system
designed for batch and time-sharing users .................... .
17
M. J. Gonzalez
C. V. Ramamoorthy
J. E. Shemer
D. W. Heying
Dynamic protection structures ............................... .
The ADEPT-50 time sharing system .......................... .
39
An operational memory share supervisor providing multi-task
processing within a single partition .......................... .
51
27
B. W. Lampson
R. R. Linde
C. Weissman
C. Fox
J. E. Braun
A. Gartenhaus
ARRAY LOGIC-LOGIC DESIGN OF THE 70's
Structured logic ............................................
61
R. A. Henle
I. T. Ho
Characters-Universal architecture for LSI .................... .
Fault location in cellular arrays ...............................
Fault multiplication cellular arrays for LSI implementation ...... .
The pad relocation technique for interconnecting LSI arrays of
imperfect yield ............................................ .
69
81
89
G. A. Maley
R. Waxman
F. D. Erwin
K. J. Thurber
C. V. Ramamoorthy
S. C. Economides
99
D. F. Calhoun
111
R. O. Skatrud
C. Weissman
E. V. Comber
COMPUTERS FOR CONGRESS
(Panel Session-No papers in this volume)
THE COMPUTER SECURITY AND PRIVACY CONTROVERSY
The application of cryptographic techniques to data processing ... .
Security controls in the ADEPT-50 time-sharing system ......... .
Management of confidential information ....................... .
119
135
PROGRAMMING LANGUAGES AND LANGUAGE PROCESSORS
Some syntactic methods for specifying extendible programming
languages ................................................ .
SYMPLE-A general syntax directed macro processor .......... .
An algebraic extension to LISP ............................... .
An on-line machine language debugger for OS/360 .............. .
The Multics PL/1 compiler .................................. .
145
157
169
179
187
V. Schneider
J. E. Vander Mey
R. C. Varney
R. E. Patchen
P. Knowlton
W. H. Josephs
R. A. Freiburghouse
FORTHCOMING COMPUTER ARCHITECTURES
A design for a fast computer for scientific calculations ........... .
A display processor design ................................... .
209
The system logic and usage recorder. . . ....................... .
Implementation of the NASA modular computer with LSI functional characters .......................................... .
219
P. M. Melliar-Smith
R. W. Watson
T. H. Myer
I. E. Sutherland
M. K. Vosbury
R. W. Murphy
231
J. J. Pariser
H. E. Maurer
247
G. A. Korn
255
D. S. Miller
M. J. Merritt
M. A. Franklin
J. C. Strauss
W. L. Graves
R. A. MacDonald
DIGITAL SIMULATION OF CONTINUOUS SYSTEMS
Project DARE: Differential analyzer replacement by on-line
digital simulation ......................................... .
MOBSSL-UAF: An augmented block structured continuous systems simulation language for digital and hybrid computers ..... .
A hybrid computer programming system .......................
275
Hybrid executive-User's approach ........................... .
287
PROBLEMS IN MEDICAL DATA PROCESSING
A system for clinical data management ........................ .
297
R. A. Greenes
A. N. Pappalardo
C. W. Marble
G. O. Barnett
Medical education: A challenge for natural language analysis,
artificial intelligence, and interactive graphics ................. .
307
J. C. Weber
W. D. Hagamen
Design principles for processor maintainability in real-time systems ..
319
Effects and detection of intermittent failures in digital systems ....
329
Modular computer architecture strategies for long-term mission ...
337
A compatible airborne multiprocessor ......................... .
347
H. Y. Chang
J. M. Scanlon
M. Ball
F. Hardie
F. D. Erwin
E. Bersoff
E. J. Dietrich
L. C. Kaye
ARCHITECTURES FOR LONG TERM RELIABILITY
PUBLISHING VERSUS COMPUTING
(Panel Session-No papers in this volume)
INFORMATION MANAGEMENT SYSTEMS FOR THE 70's
(Panel Session-No papers in this volume)
WHAT HAPPENED TO LSI PROMISES
LSI-Past promises and present accomplishment-The dilemma
of our industry ........................................... .
What has happened to LSI-A supplier's view ................. .
359
369
H. G. Rudenberg
C. G. Thornton
Real-time graphic display of time-sharing system operating
characteristics ............................................
A graph manipulator for on-line network picture processing ...... .
379
387
On-line recognition of hand generated symbols ................. .
399
J. M. Grochow
H. A. DiGiulio
P. L. Tuan
G. M. Miller
TOPICS IN ON-LINE TECHNIQUES
MANAGING MONEY WITH COMPUTERS
(Panel Session-No papers in this volume)
DATA BASE AND FILE MANAGEMENT STRATEGIES
Common file organization techniques compared ................. .
An information retrieval system based on superimposed coding ... .
413
423
Establishment and maintenance of a storage hierarchy for an
on-line data base under TSS/360 ........................... .
433
Resources management subsystem for a large corporate information system .............................................. .
441
N. Chapin
J. R. Files
H. D. Huskey
J. P. Considine
A. H. Weiss
H. Liu
W. S. Peck
P. T. Pollard
Incorporating complex data structures in a language designed for
social science research ..................................... .
453
S.
Jr. Kidd
CIRCUIT/MEMORY INNOVATIONS
A nanosecond threshold logic gate ............................ .
Silicon-on-sapphire complementary MOS circuits for high speed
associative memory ....................................... .
463
L. Micheel
469
J. R. Burns
J. H. Scott
A main frame semiconductor memory for fourth generation
computers ................................................ .
479
A new approach to memory and logic-Cylindrical domain devices.
489
A new integrated magnetic memory ........................... .
499
T. W. Hart, Jr.
D. W. Hillis
J. Marley
R. C. Lutz
C. R. Hoffman
A. H. Bobeck
R. F. Fischer
A. J. Perneski
M. Blanchon
M. Carbonel
Mated film memory-Implementation of a new design and
production concept ........................................
505
L. A. Prohofsky
D. W. Morgan
THE IMPACT OF STANDARDIZATION FOR THE 70's
(Panel Session-No papers in this volume)
USING COMPUTERS IN EDUCATION
A computer engineering laboratory ............................ .
Evaluation of an interactive display system for teaching numerical
analysis ..................................................
Computer based instruction in computer programming: A symbol
manipulation-List processing approach ..................... .
515
D. M. Robinson
525
P. Oliver
F. P. Brooks, Jr.
535
P. Lorton, Jr.
J. Slimick
545
A. M. Hlady
T. W. Gay, Jr.
M. L. Dertouzos
K. Nezu
S. Naito
COMPUTER RELATED SOCIAL PROBLEMS: EFFECTIVE
ACTION ALTERNATIVES
(Panel Session-No papers in this volume)
DEVELOPING A SOFTWARE ENGINEERING DISCIPLINE
(Panel Session-No papers in this volume)
PROPRIETARY SOFTWARE PRODUCTS
(Panel Session-No papers in this volume)
HARDWARE TECHNIQUES FOR INTERFACING MAN WITH
THE COMPUTER
A touch sensitive X-Y position encoder for computer input ...... .
A queueing model for scan conversion ......................... .
Character generation from resistive storage of time derivatives .... .
Economical display generation of a large character set ........... .
553
561
569
COMPUTER-AIDED DESIGN OF COMPUTERS
ISDS: A program that designs computer instruction sets ........ .
Directed library search to minimize cost ....................... .
Computer-aided-design for custom integrated systems ........... .
575
581
599
F. M. Haney
B. A. Chubb
W. K. Orr
613
625
D. M. Avedon
S. A. Brown
629
J. K. Koeneman
J. R. Schwanbeck
MANAGEMENT PROBLEMS IN HYBRID COMPUTER
FACILITIES
(Panel Session-No papers in this volume)
COMPUTER OUTPUT MICROFILM SYSTEMS
An overview of the computer output microfilm field ............. .
The microfilm page printer-Software considerations ............ .
Computer microfilm: A cost cutting solution to the EDP output
bottleneck ............................................... .
THE FUTURE IN DATA PROCESSING WITH
COMMUNICATIONS
A case study of a distributed communications-oriented data
processing system ......................................... .
Analysis of the communications aspects of an inquiry-response
system .................................................. .
A study of asynchronous time division multiplexing for time-sharing
computer systems ........................................ .
637
N. Nisenoff
655
J. S. Sykes
669
TOPICAL PAPERS
The involved generation: Computing people and the disadvantaged .
The CUE approach to problem solving ........................ .
Self-contained exponentiation ................................ .
679
691
701
DCDS digital simulating system .............................. .
707
Pattern recognition in speaker verification. . ................... .
721
D. B. Mayer
J. D. McCully
N. W. Clark
W. J. Cody
H. Potash
D. Allen
S. Joseph
S. K. Das
W. S. Mohn
HYBRID TECHNIQUES AND APPLICATIONS
A hybrid/digital software package for the solution of chemical
kinetic parameter identification problems .....................
Extended space technique for hybrid computer solution of partial
differential equations ..................................... .
733
A. M. Carlson
751
D. J. Newman
J. C. Strauss
761
771
N. H. Kemp
P. Balaban
A time-shared I/O processor for real-time hybrid computation .....
781
On-line software checkout facility for special purpose computers ...
789
T. R. Strollo
R. S. Tomlinson
E. R. Fiala
T. H. Witzel
S. S. Hughes
A hybrid frequency response technique and its application to
aircraft flight flutter testing ................................ .
801
Extension and analysis of use of derivatives for compensation of
hybrid solution of linear differential equations ................ .
HYPAC-A hybrid-computer circuit simulation program ........ .
REAL-TIME HYBRID COMPUTATIONAL SYSTEMS
J. M. Simmons
W. Benson
J. P. Fiedler
A survey of techniques for recognizing
parallel processable streams in
computer programs*
by C. V. RAMAMOORTHY and M. J. GONZALEZ
The University of Texas
Austin, Texas
* This work was supported by NASA Grant NGR 44-012-144.
INTRODUCTION
State-of-the-art advances-in particular, anticipated
advances generated by LSI-have given fresh impetus
to research in the area of parallel processing. The
motives for parallel processing include the following:
1. Real-time urgency. Parallel processing can
increase the speed of computation beyond the
limit imposed by technological limitations.
2. Reduction of turnaround time of high priority
jobs.
3. Reduction of memory and time requirements
for "housekeeping" chores. The simultaneous
but properly interlocked operations of reading
inputs into memory and error checking and
editing can reduce the need for large intermediate storages or costly transfers between
members in a storage hierarchy.
4. An increase in simultaneous service to many
users. In the field of the computer utility, for
example, periods of peak demand are difficult to
predict. The availability of spare processors
enables an installation to minimize the effects
of these peak periods. In addition, in the event
of a system failure, faster computational speeds
permit service to be provided to more users
before the failure occurs.
5. Improved performance in a uniprocessor multiprogrammed environment. Even in a uniprocessor environment, parallel processable segments of high priority jobs can be overlapped so
that when one segment is waiting for I/O, the
processor can be computing its companion
segment. Thus an overall speed up in execution
is achieved.
With reference to a single program, the term "parallelism" can be applied at several levels. Parallelism
within a program can exist from the level of statements
of procedural languages to the level of micro operations.
Throughout this paper, discussion will be confined to
the more general "task" parallelism. The term "task"
(process) generally is intended to mean a self-contained
portion of a computation which once initiated can be
carried out to its completion without the need for
additional inputs. Thus the term can be applied to a
single statement or a group of statements.
In contrast to the way the term "level" was used
above, task parallelism can exist at several levels within
a hierarchy of levels. The statements of the main
program of a FORTRAN program, for example, are
said to be tasks of the first level. The statements within
a subroutine called by the main program would then
be second level tasks. If this subroutine itself called
another subroutine, then the statements within the
latter subroutine would be of the third level, etc. Thus
a sequentially organized program can be represented
by a hierarchy of levels as shown in Figure 1. Each
block within a level represents a single task; as before,
a task can represent a statement or a group of statements.
Figure 1-Hierarchical representation of a sequentially organized program (levels 1 through n)
Figure 2-Sequential and parallel execution of a computational process (panels a, b, and c)
Once a sequentially organized program is resolved
into its various levels, a fundamental consideration of
parallel processing becomes prominent-namely that
of recognizing tasks within individual levels which can
be executed in parallel. Assuming the existence of a
system which can process independent tasks in parallel,
this problem can be approached from two directions.
The first approach provides the programmer with
additional tools which enable him to explicitly indicate
the parallel processable tasks. If it is decided to make
this indication independent of the programmer, then
it is necessary to recognize the parallel processable
tasks implicitly by analysis of the relationship between
tasks within the source program.
After the information is obtained by either of these
approaches, it must still be communicated to and
utilized by the operating system. At this point, efficient
resource utilization becomes the prime consideration.
The conditions which determine whether or not two
tasks can be executed in parallel have been investigated by Bernstein.1 Consider several tasks, Ti, of a
sequentially organized program illustrated by a flow
chart as shown in Figure 2(a). If the execution of
task T3 is independent of whether tasks T1 and T2 are
executed sequentially as shown in Figure 2(a) or 2(b),
then parallelism is said to exist between tasks T1 and
T2. They can, therefore, be executed in parallel as
shown in Figure 2(c).
This "commutativity" is a necessary but not sufficient condition for parallel processing. There may exist,
for instance, two processes which can be executed in
either order but not in parallel. For example, the inverse of a matrix A can be obtained in either of the
two ways shown below.
(1)
a) Obtain transpose of A
b) Obtain matrix of cofactors of the transposed matrix
c) Divide result by determinant of A
(2)
a) Obtain matrix of cofactors of A
b) Transpose matrix of cofactors
c) Divide result by determinant of A
Thus obtaining the matrix of cofactors and the transposition operation are two distinct processes which can
be executed in alternate order with the same result.
They cannot, however, be executed in parallel.
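The equivalence of the two orderings can be checked numerically. The following Python sketch is only an illustration: the helper names (`cofactors`, `inverse_order_1`, and so on) and the sample matrix are assumptions, not part of the paper. It computes the inverse by both orderings with exact rational arithmetic. Each step consumes the matrix produced by the step before it, which is why the two operations commute here yet cannot overlap in time.

```python
from fractions import Fraction

def minor(m, i, j):
    """Return m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if not m:
        return 1  # determinant of the empty matrix
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def cofactors(m):
    """Matrix of cofactors of m."""
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, i, j)) for j in range(n)]
            for i in range(n)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def divide(m, d):
    """Divide every element of an integer matrix by the integer d, exactly."""
    return [[Fraction(x, d) for x in row] for row in m]

def inverse_order_1(a):
    # Ordering (1): transpose, then cofactors, then divide by det(A).
    return divide(cofactors(transpose(a)), det(a))

def inverse_order_2(a):
    # Ordering (2): cofactors, then transpose, then divide by det(A).
    return divide(transpose(cofactors(a)), det(a))

A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]  # hypothetical sample matrix, det = 8
SAME = inverse_order_1(A) == inverse_order_2(A)
```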
Other complications may arise due to hardware
limitations. Two tasks, for example, may need to access
the same memory. In this and similar situations,
requests for service must be queued. Dijkstra, Knuth,
and Coffman2,3,4 have developed efficient scheduling
procedures for using common resources.
In terms of sets representing memory locations,
Bernstein has developed the conditions which must be
satisfied before sequentially organized processes can be
executed in parallel. These are based on four separate
ways in which a sequence of instructions can use a
memory location:
(1) The location is only fetched during the execution
of Ti.
(2) The location is only stored during the execution
of Ti.
(3) The first operation within a task involves a fetch
with respect to a location; one of the succeeding operations of Ti stores in this location.
(4) The first operation within a task involves a store
with respect to a location; one of the succeeding operations of Ti fetches this location.
Assuming a machine model in which processors are
allowed to communicate directly with the memory
and multi-access operations are permitted, the conditions for strictly parallel execution of two tasks or
program blocks can be stated as follows.
(1) The areas of memory which Task 1 "reads"
and onto which Task 2 "writes" should be mutually
exclusive, and vice-versa.
(2) With respect to the next task in a sequential
process, Tasks 1 and 2 should not store information in
a common location.
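The two conditions can be expressed as set operations on the locations each task fetches and stores. The sketch below is a hedged illustration, not Bernstein's formulation verbatim: the function name `can_run_in_parallel`, its parameters, and the sample statements are assumptions. In particular, reading condition (2) as a restriction only on locations that the next task fetches is an interpretation; when the next task's read set is not supplied, the sketch conservatively forbids any shared store.

```python
def can_run_in_parallel(reads1, writes1, reads2, writes2, reads_next=None):
    """Check the two stated conditions for tasks 1 and 2.

    Each argument is a set of memory locations (variable names here).
    reads_next is the set fetched by the following task, if known.
    """
    # Condition (1): what one task reads, the other must not write.
    if reads1 & writes2 or reads2 & writes1:
        return False
    # Condition (2): no common store location that matters to the next task.
    shared_writes = writes1 & writes2
    if reads_next is None:
        # Assumption: without knowledge of the next task, forbid any overlap.
        return not shared_writes
    return not (shared_writes & reads_next)

# Hypothetical statements:  X = (A+B)*(A-B);  Y = (C-D)/(C+D);  Z = X+Y
T1 = ({"A", "B"}, {"X"})
T2 = ({"C", "D"}, {"Y"})
T3 = ({"X", "Y"}, {"Z"})

FIRST_TWO = can_run_in_parallel(*T1, *T2, reads_next=T3[0])  # independent
FIRST_AND_THIRD = can_run_in_parallel(*T1, *T3)              # T3 reads X
```

Here `FIRST_TWO` is true because neither statement touches the other's variables, while `FIRST_AND_THIRD` is false because the third statement fetches the location the first one stores.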
The conditions listed by Bernstein are sufficient to
guarantee commutativity and parallelism of two
program blocks. He has shown, however, that there do
not exist algorithms for deciding the commutativity or
parallelism of arbitrary program blocks.
As an example of what has been discussed here,
consider the tasks shown below which represent FORTRAN statements for evaluation of three arithmetic
expressions.
X = (A+B) * (A-B)
Y = (C-D) / (C+D)
Z = X+Y
Because the execution of the third expression is independent of the order in which the first two expressions
are executed, the first two expressions can be executed
in parallel.
Parallelism within a task can also exist when individual components of compound tasks can be executed
concurrently. In the same manner that individual
processors can be assigned to independent tasks,
individual functional units can be assigned to independent components within a task. The motivation
remains the same-a decrease in execution time of
individual tasks. The CDC 6600, for example, can
utilize several arithmetic units to perform several
operations simultaneously. This type of parallelism can
be illustrated by the arithmetic expression which
follows.
X = (A+B) * (C-D)
Normally, this expression would be evaluated in a
manner similar to that shown in Figure 3(a). The
independent components within the expression, however, permit parallel execution as shown in Figure
3(b) with the same results.
Figure 3-Illustration of parallelism within a compound task
Explicit and implicit parallelism
In the explicit approach to parallelism, the programmer himself indicates the tasks within a computational
process which can be executed in parallel. This is
normally done by means of additional instructions in
the programming language. This approach can be
illustrated by the techniques described by Conway,
Opler, Gosden, and others.5,6,7 FORK in the FORK
and JOIN technique6 indicates the parallel processability of a specified set of tasks within a process. The
next sequence of tasks will not be initiated until all
the tasks emanating from a FORK converge to a
JOIN statement.
In some instances, some of the parallel operations
initiated by the FORK instruction do not have to be
completed before processing can continue. For example,
one of these branch operations may be designed to
alert an I/O unit to the fact that it is to be used momentarily. The conventional FORK must be modified
to take care of these situations. Execution of an IDLE
statement, for example, permits processors to be
released without initiation of further action.7 The
FORK and JOIN technique is illustrated in
Figure 4.
Figure 4-FORK and JOIN technique
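A present-day analogue of FORK and JOIN can be sketched with Python's standard `concurrent.futures` module. This is an illustration under assumptions, not the notation of the cited proposals: the helper names `fork` and `join` and the sample values are hypothetical. `fork` starts a set of independent tasks, and `join` blocks until every one of them has finished before the next task is allowed to run.

```python
from concurrent.futures import ThreadPoolExecutor, wait

def fork(executor, tasks):
    """Start every task (a zero-argument callable) and return its futures."""
    return [executor.submit(task) for task in tasks]

def join(futures):
    """Block until all forked tasks are done, then collect their results."""
    wait(futures)
    return [f.result() for f in futures]

def run_example():
    a, b, c, d = 6, 2, 9, 3  # hypothetical input values
    with ThreadPoolExecutor() as pool:
        # FORK: the two expressions share no variables, so they may overlap.
        branches = fork(pool, [lambda: (a + b) * (a - b),
                               lambda: (c - d) / (c + d)])
        # JOIN: the sum below must wait for both branches.
        x, y = join(branches)
    return x + y

RESULT = run_example()  # 32 + 0.5 = 32.5
```

The sketch does not model the IDLE statement; a branch whose result is not needed could simply be left out of the list passed to `join`.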
Another example of the explicit approach is the
PARALLEL FOR7 which takes advantage of parallel
operations generated by the FOR statement in ALGOL
and similar constructs in other languages. For example,
the sum of two n X n matrices consists essentially of
n^2 independent operations. If n processors were available, the addition process could be organized such that
entire rows or columns could be added simultaneously.
Thus the addition of the two matrices could be accomplished in n units of time. Another example of this
approach is the programming language PL/1 which
provides the TASK option with the CALL statement
which indicates concurrent execution of parallel
tasks.
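The row-at-a-time organization of matrix addition can be sketched as follows. This is a minimal Python illustration and not the PARALLEL FOR construct itself; the function names are assumptions. Each row pair is handed to its own worker, so with n workers the n row additions are mutually independent. In CPython the threads mainly show the structure of the computation rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

def add_rows(row_pair):
    """Add two rows element by element (n sequential additions)."""
    row_a, row_b = row_pair
    return [x + y for x, y in zip(row_a, row_b)]

def parallel_matrix_add(a, b):
    """Add two n x n matrices, one worker per row."""
    workers = max(1, len(a))  # at least one worker, even for an empty matrix
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves row order, so the result rows line up with a and b.
        return list(pool.map(add_rows, zip(a, b)))

SUM = parallel_matrix_add([[1, 2], [3, 4]], [[10, 20], [30, 40]])
```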
An additional way of indicating parallelism explicitly
is to write a language which exploits the parallelism in
algorithms to be implemented by the operating system.
This is the case with TRANQUIL,8,21 an ALGOL-like language to be utilized by the array processors of
the ILLIAC IV. The situation is unique in that the
language was created after a system was devised to
solve an existing problem. "The task of compiling a
language for the ILLIAC IV is more difficult than
compiling for conventional machines simply because of
the different hardware organization and the need to
utilize its parallelism efficiently." A limitation of this
approach is that programs written in that particular
language can only be run on array-type computers and
are, therefore, heavily machine dependent.
The implicit approach to parallelism does not depend
on the programmer for determination of inherent
parallelism but relies instead on indicators existing
within the program itself. In contrast to the relative
ease of implementation of explicit parallelism, the
implicit approach is associated with complex compiling
and supervisory programs.
The detection of inherent parallelism between a set
of tasks depends on thorough analysis of the source
program using Bernstein's conditions. Implementation
of a recognition scheme to accomplish this detection
is dependent on the source language. Thus a recognizer
which is universally applicable cannot be implemented.
An algorithm developed by Fisher9 approaches the
problem of parallel task detection in a general manner.
His algorithm utilizes the input and output sets of
each task (process) to determine essential ordering
and thus inherent parallelism. Given such information
as the number of processes to be analyzed, the input
and output set for each process, the given permissible
ordering among the processes, and any initially known
essential order among the processes, the algorithm
generates the essential serial ordering relation and the
covering for the essential serial ordering relation. This
covering provides an indication of the tasks within the
overall process which can be executed concurrently.
Basically, this work formalizes in the form of an
algorithm the conditions for parallel processing developed by Bernstein. The conditions for parallel processing
between two tasks are extended to an overall process.
Detection of task parallelism-A new approach
The next subject covered in this paper involves
implicit detection of parallel processable tasks within
programs prepared for serial execution. An indication
is desired of the tasks which can be executed in parallel
and the tasks which must be completed before the
start of the next sequence of tasks. Thus the problem
can be broken down in two parts-recognizing the
relationships between tasks within a level and using
this information to indicate the ordering between tasks.
The approach presented here is based on the fact
that computational processes can be modeled by
oriented graphs in which the vertices (nodes) represent
single tasks and the oriented edges (directed branches)
represent the permissible transition to the next task
in sequence. The graph (and thus the computational
process) can be represented in a computer by means
of a Connectivity Matrix, C.10,11 C is of dimension
n X n such that Cij is a "1" if and only if there is a
directed edge from node i to node j, and it is "0"
otherwise. The properties of the directed graph and
hence of the computational process it represents can
be studied by simple manipulations of the connectivity
matrix.
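As a concrete illustration of this representation, the Python sketch below builds a connectivity matrix from a list of directed edges and reads successors and predecessors back out of it. The function names and the four-task example graph are hypothetical, chosen only to show the idea; tasks are numbered from 1 and stored at index task minus 1.

```python
def connectivity_matrix(n, edges):
    """Return the n x n matrix C with C[i][j] = 1 iff there is an edge i -> j."""
    c = [[0] * n for _ in range(n)]
    for i, j in edges:
        c[i - 1][j - 1] = 1  # tasks are numbered from 1
    return c

def successors(c, i):
    """Tasks reachable from task i in one step: the 1s in row i."""
    return [j + 1 for j, bit in enumerate(c[i - 1]) if bit]

def predecessors(c, j):
    """Tasks with an edge into task j: the 1s in column j."""
    return [i + 1 for i, row in enumerate(c) if row[j - 1]]

# Hypothetical program graph: 1 -> 2, 2 -> 3, 2 -> 4, 3 -> 4
C = connectivity_matrix(4, [(1, 2), (2, 3), (2, 4), (3, 4)])
```

Row i of the matrix lists where control may go after task i, and column j lists the tasks that may transfer control to task j.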
A graph consisting of a set of vertices is said to be
strongly connected if and only if any node in it is reachable from any other. A subgraph of any graph is defined
as consisting of a subset of vertices with all the edges
between them retained. A maximal strongly connected
(M.S.C.) subgraph is a strongly connected subgraph
that includes all possible nodes which are strongly
connected with each other. Given a connectivity matrix
of a graph, all its M.S.C. subgraphs can be determined
simply by well-known methods.10 A given program
graph can be reduced by replacing each of its M.S.C.
subgraphs by a single vertex and retaining the edges
connected between these vertices and others. After
the reduction, the reduced graph will not contain any
strongly connected components.
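One way to carry out this reduction is sketched below in Python. The paper refers only to "well-known methods", so the choice of Warshall's transitive-closure procedure here is an assumption, as are the function names and the small example graph. Two vertices are placed in the same M.S.C. subgraph when each is reachable from the other; every such group is then collapsed to one vertex.

```python
def reachability(c):
    """Transitive closure of the connectivity matrix (Warshall's procedure)."""
    n = len(c)
    r = [row[:] for row in c]
    for k in range(n):
        for i in range(n):
            if r[i][k]:
                for j in range(n):
                    if r[k][j]:
                        r[i][j] = 1
    return r

def msc_subgraphs(c):
    """Group vertices that are mutually reachable (0-based indices)."""
    r = reachability(c)
    seen, groups = set(), []
    for i in range(len(c)):
        if i in seen:
            continue
        group = [i] + [j for j in range(i + 1, len(c)) if r[i][j] and r[j][i]]
        seen.update(group)
        groups.append(group)
    return groups

def reduce_graph(c):
    """Collapse each group to one vertex, keeping edges between groups."""
    groups = msc_subgraphs(c)
    owner = {v: g for g, members in enumerate(groups) for v in members}
    reduced = [[0] * len(groups) for _ in groups]
    for i, row in enumerate(c):
        for j, bit in enumerate(row):
            if bit and owner[i] != owner[j]:
                reduced[owner[i]][owner[j]] = 1
    return groups, reduced

# Hypothetical graph with a loop between vertices 1 and 2: 0->1, 1->2, 2->1, 2->3
C = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 1, 0, 1], [0, 0, 0, 0]]
GROUPS, REDUCED = reduce_graph(C)
```

For this example the loop formed by vertices 1 and 2 becomes a single vertex, leaving a three-vertex chain with no loops.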
The paragraphs which follow will describe the sequence of operations needed to prepare for parallel
processing in a multiprocessor computer a program
written for a uniprocessor machine.
(1) The first step is to derive the program graph
which identifies the sequence in which the computational tasks are performed in the sequentially coded program. Figure 5(a) illustrates an example program
graph. The program graph is represented in the computer by its connectivity matrix. The connectivity
matrix for the example is given in Figure 5(b).
(2) By an analysis of the connectivity matrix, the
maximal strongly connected subgraphs are determined
by simple operations.10 This type of subgraph is illustrated by tasks 2 and 12 in Figure 5. Each M.S.C.
subgraph is next considered as a single task, and the
graph, called the reduced graph, is derived. The reduced graph does not contain any loops or strongly
connected elements. In this graph, when two or more
edges emanate from a vertex, a conditional branching
is indicated. That is, the execution sequence will take
only one of the indicated alternatives. A vertex which
initiates the branching operation will be called a
decision or branch vertex. The reduced graph for the
example program graph is shown in Figure 6. In this
graph, vertex 3 represents a branch vertex.
Figure 5-Program graph of a serially coded program and its connectivity matrix: (a) program graph, (b) connectivity matrix over tasks 1, 2a, 2b, 3 to 11, 12a, 12b, 12c, 13, 14
(3) The next step is to derive the final program
graph and its connectivity matrix T. The elements of
T are obtained by analyzing the inputs of each vertex
in the reduced graph. An element, Tij, is a "1" if
and only if the j-th task (vertex) of the reduced graph
has as one of its inputs the output of task i; otherwise
Tij is a "0". Figure 7 illustrates the final program graph for
the example after consideration is given to the input-output relationships of each task. The connectivity
matrix for the final program graph is shown in Figure 8.
From the sufficiency conditions for task parallelism,
two tasks can be executed in parallel if the input set of
one task does not depend on the output set of the other
and vice versa. The technique outlined in Step 4 detects
this relationship and uses it to provide an ordering
for task execution.
Figure 6-Reduced program graph of the serially coded program
Figure 7-Final program graph of the parallel processable program
Figure 8-Connectivity matrix of the final program graph. Precedence partitions: {1}, {2}, {3,8}, {4,5,9,10}, {6,11,12}, {7,13}, {14}
(4) The vertices of the final program graph are
partitioned into "precedence partitions" P as follows.
Using the connectivity matrix T, a column (or columns)
containing only zeroes is located. Let this column
correspond to vertex v1. Next delete from T both the
column and the row corresponding to this vertex. The
first precedence partition is P1 = {v1}. Using the remaining portion of T, locate vertices {v21, v22, ...} which
correspond to columns containing only zeroes. The
second precedence partition P2 thus contains vertices
{v21, v22, ...}. This implies that tasks in set P2 =
{v21, v22, ...} can be initiated and executed in parallel
after the tasks in the previous partition (i.e., P1) have
been completed. Next delete from T the columns and
rows corresponding to vertices in P2. This procedure is
repeated to obtain precedence partitions P3, P4, ... Pp,
until no more columns or rows remain in the T matrix.
It can be shown that this partitioning procedure is
valid for connectivity matrices of graphs which contain
no strongly connected components.
The implication of this precedence partitioning is
that if P1, P2, ... Pp corresponds to times t1, t2, ... tp, the
earliest time that a task in partition Pi can be initiated
is ti.
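The column-deletion procedure of Step 4 can be sketched in Python as shown below. The function name is an assumption, and the edge list is a transcription of the "inputs" column of Table I of this paper, so any transcription error would carry over. Rather than physically deleting rows and columns, the sketch keeps a set of remaining vertices and looks for columns that are all zero within that set.

```python
def precedence_partitions(t):
    """Partition the vertices of T by repeatedly removing all-zero columns."""
    remaining = set(range(len(t)))
    partitions = []
    while remaining:
        # A vertex is ready when no remaining vertex feeds it.
        ready = sorted(j for j in remaining
                       if all(t[i][j] == 0 for i in remaining))
        if not ready:
            raise ValueError("T contains a strongly connected component")
        partitions.append(ready)
        remaining -= set(ready)
    return partitions

# Edges i -> j meaning "task j takes an input from task i" (from Table I).
EDGES = [(1, 2), (2, 3), (2, 8), (3, 4), (3, 5), (8, 9), (8, 10), (5, 6),
         (9, 11), (9, 12), (4, 7), (6, 7), (10, 13), (11, 13), (12, 13),
         (7, 14), (13, 14)]
N = 14
T = [[0] * N for _ in range(N)]
for i, j in EDGES:
    T[i - 1][j - 1] = 1

PARTITIONS = [[v + 1 for v in group] for group in precedence_partitions(T)]
print(PARTITIONS)
# [[1], [2], [3, 8], [4, 5, 9, 10], [6, 11, 12], [7, 13], [14]]
```

The k-th group in the result corresponds to time tk, the earliest time at which its tasks can be initiated.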
The final program graph contains the following types
of vertices: (1) The branch or decision type vertex
from which the execution sequence selects a task from
a set of alternative tasks. (2) The Fork vertex which
can initiate a set of parallel tasks. (3) The Join vertex
to which a set of parallel tasks converge after their
execution. (4) The normal vertex which receives its
input set from the outputs of preceding tasks. Figure 7a
indicates the final program graph with the first three
types of vertices indicated by B, F, and J, respectively.
(5) From precedence partitioning and the final
program graph, a Task Scheduling Table can be
developed. This table, shown in Table I, serves as an
input to the operating system to help in the scheduling
of tasks. For example, if the task being executed is a
Fork task, a look-ahead feature of the system can
prepare for parallel execution of the tasks to be initiated upon completion of the currently active task.
(6) The precedence partitions of Step 4 provide an
indication of the earliest time at which a task may be
initiated. It is also desirable, however, to provide an
indication of the latest time at which a task may be
initiated. This information can be obtained by performing precedence partitions on the transpose of the
T matrix. This process can be referred to as "row partitions". The implication here is that if task i is in the
partition corresponding to time period tj, then tj is
the latest time that task i can be initiated.
Using both the row and column partitions, the permissible initiation time for each task can be derived as
shown in Table II. Task 4, for example, can be initiated during t4 or t5 depending on the availability of
processors.
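As a rough sketch of how the two sets of partitions combine, the Python below computes the earliest period from the graph and the latest period from the reversed graph (the transpose of T). It assumes an acyclic graph and uses the edges of Table I; the names `levels`, `initiation_windows`, and `WINDOWS` are hypothetical, not taken from the paper.

```python
def levels(edges, tasks):
    """Return {task: partition index}, counting partitions from 1."""
    preds = {t: set() for t in tasks}
    for src, dst in edges:
        preds[dst].add(src)
    remaining, level, step = set(preds), {}, 0
    while remaining:
        step += 1
        ready = {t for t in remaining if not (preds[t] & remaining)}
        if not ready:
            raise ValueError("graph contains a cycle")
        for t in ready:
            level[t] = step
        remaining -= ready
    return level


def initiation_windows(edges, tasks):
    """Return {task: (earliest period, latest period)}."""
    tasks = list(tasks)
    earliest = levels(edges, tasks)                  # column partitions
    reversed_edges = [(dst, src) for src, dst in edges]
    from_end = levels(reversed_edges, tasks)         # row partitions, counted from the end
    depth = max(earliest.values())
    return {t: (earliest[t], depth + 1 - from_end[t]) for t in tasks}


EDGES = [(1, 2), (2, 3), (2, 8), (3, 4), (3, 5), (8, 9), (8, 10),
         (5, 6), (9, 11), (9, 12), (4, 7), (6, 7),
         (10, 13), (11, 13), (12, 13), (7, 14), (13, 14)]

WINDOWS = initiation_windows(EDGES, range(1, 15))
```

Only tasks 4 and 10 come out with a window wider than one period (t4 through t5), which agrees with Table II.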
At this point it is desirable to clarify some possible
misinterpretations of the implications of this method.
The method presented here does not try to determine
whether any or all of the iterations within a loop can
be executed simultaneously. Rather the iterations
executed sequentially are considered as a single task.
TABLE I-Task scheduling table

TIME   INPUTS FROM TASKS   TASK NUMBER   TASK TYPE
t1     -                   1
t2     1                   2             FORK
t3     2                   3             BRANCH
t3     2                   8             FORK
t4     3                   4
t4     3                   5
t4     8                   9             FORK
t4     8                   10
t5     5                   6
t5     9                   11
t5     9                   12
t6     4,6                 7             JOIN
t6     10,11,12            13            JOIN
t7     7,13                14            JOIN
For this reason, the undecidability problem introduced
by Bernstein is not a factor here.
In addition, precedence partitions may place the
successors of a conditional within the same partition.
The interpretation of this is that only one of the successors will be executed, and it can be executed in
parallel with the other tasks within that partition.
The FORTRAN parallel task recognizer
In order to determine the degree of applicability of
the method described above, it was decided to apply
the method to a sample FORTRAN program. This
was accomplished by writing a program whose input
consists of a FORTRAN source program; its output
consists of a listing of the tasks within the first level
of the source program which can be executed in parallel.

TABLE II-Permissible task initiation time

COLUMN PARTITIONS          ROW PARTITIONS
TIME   TASKS               TIME   TASKS
t1     1                   t1     1
t2     2                   t2     2
t3     3,8                 t3     3,8
t4     4,5,9,10            t4     5,9
t5     6,11,12             t5     4,6,10,11,12
t6     7,13                t6     7,13
t7     14                  t7     14

PERMISSIBLE TASK INITIATION PERIODS
TASK   TIME
1      t1
2      t2
3      t3
4      t4, t5
5      t4
6      t5
7      t6
8      t3
9      t4
10     t4, t5
11     t5
12     t5
13     t6
14     t7

The program written to accomplish this parallel task
detection is known in its final form as a FORTRAN
Parallel Task Recognizer [13].
The recognizer, also written in FORTRAN, relies
on indicators generated by the way in which the
program is actually written. Consider the expressions
given below.
    X1 = f1(A,B)
    X2 = f2(C,D)
Because the right-hand side of the second expression
does not contain a parameter generated by the computation which immediately precedes it, the two expressions can be executed in parallel. If, on the other hand,
the expressions were rewritten as shown below, the
termination of the first computation would have to
precede the initiation of the second.
    X1 = f1(A,B)
    X2 = f2(X1,C)
The recognizer performs this determination by comparing the parameters on the right-hand side of the equality
sign to outcomes generated by previous statements.
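The comparison just described can be illustrated with a short Python sketch. It is not the recognizer itself (which was written in FORTRAN); it handles only simple assignment statements and records a connection whenever a right-hand side names a variable produced by an earlier statement. The function name `connectivity` and the string format of the statements are assumptions made for the example.

```python
import re


def connectivity(statements):
    """Return the set of (predecessor, successor) statement numbers,
    numbering statements from 1 in the order given."""
    last_writer = {}   # variable name -> number of the statement that last set it
    edges = set()
    for number, statement in enumerate(statements, start=1):
        target, expression = (part.strip() for part in statement.split("=", 1))
        # Every identifier on the right-hand side is a candidate input.
        for name in re.findall(r"[A-Za-z][A-Za-z0-9]*", expression):
            if name in last_writer:
                edges.add((last_writer[name], number))
        last_writer[target] = number
    return edges


INDEPENDENT = connectivity(["X1 = f1(A,B)", "X2 = f2(C,D)"])
DEPENDENT = connectivity(["X1 = f1(A,B)", "X2 = f2(X1,C)"])
```

The first pair produces no connection, so the two statements may run in parallel; the second pair produces the single connection (1, 2), which forces sequential execution.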
Other FORTRAN instructions can be analyzed
similarly. Consider the arithmetic IF:
IF (X - Y) 3,4,5
Here the parameters within the parentheses must be
compared to the outputs of preceding statements in
order to determine essential order.
Other FORTRAN instructions are analyzed in a
similar manner in order to generate the connectivity
matrix for the source program. During this analysis
the recognizer assigns numbers to the executable
statements of the source program. After this is completed, the recognizer proceeds with the method of
precedence partitions described earlier. Precedence
partitions yield a list of blocks which contain the statement numbers which can be executed concurrently.
Figure 9 shows a block diagram of the steps taken by
the recognizer to generate the parallel processable
tasks within the first level of a FORTRAN source
program.
Some statements within the FORTRAN set are
treated somewhat differently. The DO statement, for
example, does not itself contain any input or output
parameters but instead generates a series of repeated
operations. Because of the loop considerations mentioned earlier, and because the rules of FORTRAN
require entrance into a loop only through the DO
statement, all the statements contained within a DO
loop are considered as a single task. A loop, however,
may contain a large number of statements, and a great
amount of potential parallelism may be lost if consideration is not given to the statements within the
loop. For this reason, the recognizer generates a separate connectivity matrix for each DO loop within the
program.
The recognizer itself possesses limitations which
must be eliminated before it can be applied to programs
of a complex nature. For example, only a subset of
the entire FORTRAN set is considered for recognition.
This could be corrected by expanding the recognition
process to include a more complete set of instructions.
Figure 9-Block diagram of the FORTRAN parallel task recognizer. The blocks read:
   Read next source program instruction.
   Record input and output parameters required by this task.
   If this task is the successor of a branch or transfer operation, record this information.
   Scan executable statements and compare input parameters to outputs of previous statements.
   When match is found, make entry in C, i.e., show a connection from predecessor to successor using the assigned statement numbers.
   After generation of C is complete, generate precedence partitions.
   Indicate those tasks within the first level which can be done in parallel.
In addition to the DO statement, loops can also be
created by branch and transfer operations such as
the IF and GO TO instructions. To eliminate these
loops, it would be necessary to analyze the connectivity matrix in the manner mentioned earlier before
beginning the process of precedence partitions. The
recognizer does not presently perform this analysis.
Nested DO loops are not permitted, and the source
program size is limited in the number of executable
statements it may have and in the number of parameters any one statement can contain.
Some of these limitations could be eliminated quite
easily; others would require a considerable amount of
effort. To allow a source program of arbitrary size
would require a somewhat more elaborate handling of
memory requirements and associated problems. At the
present time the recognizer consists of a main program
and six subroutines. In its present form the recognizer
consists of approximately 1300 statements.

(a)
      C     THIS IS A TEST PROGRAM DESIGNED TO CHECK PPS
            DIMENSION A1(10),A2(10),A3(10)
            INTEGER A1,A2,ABC,A2X2,B,C,D
  1         READ 100,(A1(I),I=1,10),B,C,D
  2         READ 100,(A2(I),I=1,10),NS,NST,NSTU
  3         DO 10 I=1,10
  4         IF (A1(I)-A2(I)) 20,30,40
  5   20    X1=(A1(I))*(B-C)
  6   30    X2=D+(B/C)
  7   40    A3(I)=X1*X2
  8   10    CONTINUE
      C     THIS IS A TEST COMMENT
  9         PRINT 200,B,C,D
 10         CALL ALPHA(A1,A2,ABC,B4,B5)
 11         PRINT 3057,X1,X2,(A3(I),I=1,10)
 12         CALL BETA(X1,X2,A3,B6)
 13         IF(B4-B5) 50,50,60
 14   50    READ 315,E,F,G,H
 15   60    X3=(E*F)+(G-H)
 16         X4=B6+G
 17         X5=X3-X4
 18         X6=(B4+B5)*X5
 19         PRINT 4,X3,X4,X5
 20         PRINT 52,(A1(I),I=1,10),ABC,C,(A3(I),I=1,10)
     100    FORMAT(10I2,3I3)
     200    FORMAT(1H0,8 B C D*,/,3I3)
    3057    FORMAT(1H ,2I3,10F7.1)
     315    FORMAT(4F7.4)
       4    FORMAT(3F7.4)
      52    FORMAT(12I3,10F7.1)
 21         END

(b)
PARALLEL PROCESSABLE TASKS
(1,2)
(3)
(9,10,11,12)
(13)
(14)
(15,16)
(17)
(18,19,20)

Figure 10-An example of the recognition process.
The recognizer is presently written in such a manner
that it will detect only first level parallelism. The
method it uses, however, can be applied to parallelism
at any level.
The theory of operation of the FORTRAN parallel
task recognizer will be illustrated by applying the
recognition techniques to a sample FORTRAN program.
Figure 10(a) is a listing of the sample program showing
the individual tasks. Figure 10(b) is a listing of the
parallel processable tasks as determined by precedence
partitions. The numbers to the left of the executable
statements are the numbers assigned by the recognizer
during the recognition phase.
Elimination of the limitations mentioned here and
other limitations not mentioned explicitly will be the
subject of future effort.
Observations and comments
Regardless of the manner in which the subject of
parallel processing is approached, common problems
arise. Prominent among these is a need to protect
common data. If two tasks are considered for concurrent execution and one task accesses a memory
location and the other amends it, then strict observance
must be paid to the order in which this is done. The
FORTRAN recognizer, for example, may determine
that two subroutines can be executed in parallel. At
the present time no consideration is given to the fact
that both subroutines may access common data
through COMMON or EQUIVALENCE statements.
In order to truly optimize execution time for a
program which is set up for parallel processing, it
would be highly desirable to determine the time required for execution of the individual tasks within
the process. It is not enough to merely determine that
two tasks can be executed concurrently; the primary
goal is that this parallel execution result in higher
resource utilization and improved throughput. If the
time required for the execution of one task is 100 times
that of the other, for example, then it may be desirable
to execute the two tasks serially rather than in parallel.
The reasoning here is that no time would be spent
in allocating processors and so forth.
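A small worked example makes the trade-off concrete. The model below is an assumption introduced only for illustration: serial time is the sum of the two task times, parallel time is the longer task plus a fixed allocation overhead, and `parallel_gain` and its parameters are hypothetical names.

```python
def parallel_gain(t1, t2, overhead):
    """Time saved by running two tasks in parallel instead of serially.
    A negative result means serial execution is faster under this model."""
    serial = t1 + t2
    parallel = max(t1, t2) + overhead
    return serial - parallel


# One task takes 100 time units, the other 1; allocation costs 2 units.
UNBALANCED = parallel_gain(100, 1, 2)
# Two equal tasks of 50 units each with the same overhead.
BALANCED = parallel_gain(50, 50, 2)
```

Under these assumed numbers the unbalanced pair loses one time unit by going parallel, while the balanced pair saves 48, which is the point of the 100-to-1 example in the text.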
Determination of task execution time, however, is
not a simple matter. Exhaustive measurements of the
type suggested by Russell and Estrin [14] would provide
the type of information mentioned here.
Another problem area involves implementation of
special purpose languages such as TRANQUIL. It
was mentioned earlier that programs written in a
language of this type are highly machine-limited. It
would be highly desirable to be able to implement
programs written in these languages in systems which
are not designed to take advantage of parallelism.
Along these lines, the programming generality suggested by Dennis [15] may be significant.
It should be pointed out that all the techniques
which have been discussed here will create a certain
amount of overhead. For this reason it is felt that a
parallel task recognizer, for example, would be best
suited for implementation with production programs.
Thus even though some time would be lost initially,
in the long run parallel processing would result in a
significant net gain.
Conclusions
The method of indicating parallel processable tasks
introduced here and illustrated in part by the FORTRAN Parallel Recognizer appears to provide enough
generality that it is independent of the language, the
application, the mode of compilation, and the number
of processors in the system. It is anticipated that this
method will remain as the basis for further effort in
this area.
In addition to the comments made earlier, some
possible future areas of effort include determination of
possible parallelism of individual iterations within a
loop. It is hoped that additional information can be
provided to the operating system other than a mere
indication of the tasks which can be executed in parallel. This would include the measurements mentioned
earlier and an indication of the frequency of execution
of individual tasks.
It is also hoped that a sub-language may be developed which can be added to existing languages to
assist in the recognition process and the development
of recognizer code.
Detection of parallel components within
compound tasks
Several algorithms exist for the detection of independent components within compound tasks [16,17,18,19].
These algorithms are concerned primarily with detection of this type of parallelism within arithmetic
expressions. The first three algorithms referenced
above are summarized in [19] where a new algorithm
is also introduced.
The arithmetic expression which will be used as an
example for each algorithm is given below.
A+B+C+D*E*F+G+H
Throughout this discussion, the usual precedence
between operators will apply. In order of increasing
precedence, the operators are
as follows: + and -, * and /, and ↑, where ↑ stands
for exponentiation.
Hellerman's algorithm
This algorithm assumes that the input string is
written in reverse Polish notation and contains only
binary operators. The string is scanned from left to
right replacing by temporary results each occurrence
of adjacent operands immediately followed by an
operator. These temporary results will be considered
as operands during the next passes. Temporary results
generated during a given pass are said to be at the
same level and therefore can be executed in parallel.
There will be as many passes as there are levels in the
syntactic tree. The compilation of the expression
listed above is shown in Figure 11.
Although this algorithm is simple and fast, it has
two shortcomings. The first is a possible difficulty in
implementation since it requires the input string to
be in Polish notation; the second is its inability to
handle operators which are not commutative.
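The pass structure described above can be sketched in Python as follows. This is an illustrative reading of the description, not Hellerman's published code: tokens are single symbols, only binary operators are handled, and the names `OPERATORS`, `hellerman_passes`, and `PASSES` are hypothetical.

```python
OPERATORS = {"+", "-", "*", "/"}


def hellerman_passes(tokens):
    """Scan a reverse Polish token list repeatedly. In each pass, every
    'operand operand operator' triple found in the pass input is replaced
    by a temporary result. Returns the temporaries created in each pass."""
    tokens = list(tokens)
    passes, count = [], 0
    while len(tokens) > 1:
        out, created, i = [], [], 0
        while i < len(tokens):
            window = tokens[i:i + 3]
            if (len(window) == 3 and window[0] not in OPERATORS
                    and window[1] not in OPERATORS and window[2] in OPERATORS):
                count += 1
                name = f"R{count}"
                created.append(f"{name}={window[0]}{window[2]}{window[1]}")
                out.append(name)
                i += 3
            else:
                out.append(tokens[i])
                i += 1
        if not created:
            raise ValueError("not a well-formed reverse Polish string")
        passes.append(created)
        tokens = out
    return passes


PASSES = hellerman_passes("AB+C+DE*F*+G+H+")
```

For the example expression the sketch needs five passes; the first pass produces R1=A+B and R2=D*E, which are the temporaries that can be computed in parallel at the lowest level.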
PASS   INPUT STRING AFTER THE ith PASS   TEMPORARY RESULTS GENERATED DURING ith PASS
0      AB+C+DE*F*+G+H+
1      R1 C+R2 F*+G+H+                   R1=A+B, R2=D*E
2      R3 R4+G+H+                        R3=R1+C, R4=R2*F
3      R5 G+H+                           R5=R3+R4
4      R6 H+                             R6=R5+G
5      R7                                R7=R6+H

Figure 11-Parallel computation of
A+B+C+D*E*F+G+H using Hellerman's
algorithm
Stone's algorithm
The basic function of this algorithm is to combine
two subtrees of the same level into a level that is one
higher. For example, A and B, initially of level 0, are
combined to form a subtree of level 1. The algorithm
then searches for another subtree of level 1 by attempting to combine C and D. Since precedence relationships between operators prohibit this combination, the
level of subtree (A+B) is incremented by one. The
algorithm now searches for a subtree of level 2 by
attempting to combine C, D, and E. Since this combination is also prohibited, subtree (A+B) is incremented to level 3. The next search is successful, and a
subtree of level 3 is obtained by combining C, D, E
and F. These two subtrees are then combined to form a
single subtree of level 4.
In a similar manner the subtree (G+H), originally
of level 1, is successively incremented until it achieves
a level of 4; at that time it is combined with the other
subtree of the same level to form a final tree of level 5.
The algorithm yields an output string in reverse
Polish which does not expressly show which operations
can be performed in parallel. Even though the output
string is generated in one pass, the recursiveness of
the algorithm causes it to be slow, and at least one
additional pass would be required to specify parallel
computations.
Squire's algorithm
The goal of this algorithm is to form quintuples of
temporary results of the form:
Ri (operand 1, operator, operand 2, start level, end level),
where start level = max [end level of operand 1; end level of operand 2]
and end level = start level + 1.
All temporary results which have the same start level
can be computed in parallel. Initially, all variables
have a start and end level equal to zero.
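The level rule for quintuples is easy to check mechanically. The Python sketch below applies it to the seven temporary results listed in Figure 12; it does not perform Squire's scans, and the names `assign_levels`, `TRIPLES`, and `QUINTUPLES` are hypothetical.

```python
def assign_levels(triples):
    """Extend each (name, operand 1, operator, operand 2) with start and
    end levels. Variables have end level 0; a temporary starts at the
    larger end level of its operands and ends one level later."""
    end_level = {}
    quintuples = []
    for name, op1, operator, op2 in triples:
        start = max(end_level.get(op1, 0), end_level.get(op2, 0))
        end_level[name] = start + 1
        quintuples.append((name, op1, operator, op2, start, start + 1))
    return quintuples


# Temporary results in the order they appear in Figure 12.
TRIPLES = [("R1", "D", "*", "E"), ("R2", "F", "*", "R1"),
           ("R3", "A", "+", "B"), ("R4", "C", "+", "G"),
           ("R5", "H", "+", "R3"), ("R6", "R4", "+", "R5"),
           ("R7", "R2", "+", "R6")]

QUINTUPLES = assign_levels(TRIPLES)
START_LEVELS = [q[4] for q in QUINTUPLES]
```

R1, R3, and R4 all receive start level 0 and can therefore be computed in parallel, followed by R2 and R5 at level 1.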
Scanning begins with the rightmost operator of the
input string and proceeds from right to left until an
operator is found whose priority is lower than that of
the previously scanned operator. In the example the
scan would yield the following substring:
D*E*F+G+H
Now a left to right scan proceeds until an operator is
found whose priority is lower than that of the leftmost operator of the substring. This yields: D*E*F.
At this point a temporary result R1 is available of the
form:
R1 (D, *, E, 0, 1).
The temporary result, R1, replaces one of the operands
and the other is deleted together with its left operator.
The new substring is then:
R1*F+G+H.
The left to right scans are repeated until no further
quintuple can be produced, and at that time, the right
to left scan is re-initiated. The results of the process
are shown in Figure 12.
Although the example shows the algorithm applied
to an expression containing only binary operators, the
algorithm can also handle subtraction and division
with a corresponding increase in complexity.
A significant feature of this algorithm is that Polish
notation plays no part in either the input string or
the output quintuples. Because of the many scans and
comparisons the algorithm requires, it becomes more
complex as the length of the expression and the diversity of operators within the expression increase.
INITIAL STRING: A+B+C+D*E*F+G+H

SUCCESSIVE STRINGS (RIGHT TO LEFT AND LEFT TO RIGHT SCANS)
D*E*F+G+H
R1*F+G+H
R2+G+H
A+B+C+R2+G+H
R3+C+R2+G+H
R4+R3+R2+H
R4+R5+R2
R6+R2
R7

QUINTUPLES
       Op.1   OPERATOR   Op.2   START LEVEL   END LEVEL
R1     D      *          E      0             1
R2     F      *          R1     1             2
R3     A      +          B      0             1
R4     C      +          G      0             1
R5     H      +          R3     1             2
R6     R4     +          R5     2             3
R7     R2     +          R6     3             4

Figure 12-Parallel computation of
A+B+C+D*E*F+G+H using Squire's algorithm
Baer and Bovet's algorithm
The algorithm uses multiple passes. To each pass
corresponds a level. All temporary results which can
be generated at that level are constructed and inserted
appropriately in the output string produced by the
corresponding pass. Then, this output string becomes
the input string for the next level until the whole
expression has been compiled. Thus the number of
passes will be equal to the number of levels in the
syntactic tree. During a pass the scanning proceeds
from left to right and each operator and operand is
scanned only once.
The simple intermediate language which this algorithm produces is the most appropriate for multiprocessor compilation in that it shows directly all
operations which can be performed in parallel, namely
those having the same level number. The syntactic
tree generated by this algorithm is shown in Figure
13.
A new algorithm
This section will introduce a technique whose goals
are: (1) to produce a binary tree which illustrates the
parallelism inherent in an arithmetic expression; and
(2) to determine the number of registers needed to
evaluate large arithmetic or Boolean expressions without intermediate transfers to main memory.
Figure 13-Parallel computation of
A+B+C+D*E*F+G+H using Baer and
Bovet's algorithm
This technique is prompted by the fact that existing
computing systems possess multiple arithmetic units
which can contain a large number of active storages
(registers). In addition, the superior memory bandwidths of the next generation of computers will simplify
some of the requirements of this technique.
In the material presented below, a complex arithmetic expression is examined to determine its maximum
computational parallelism. This is accomplished by
repeated rearrangement of the given expression. During
this process the given expression in reverse Polish form
is also tested for "well formation", i.e., errors and
oversights in the syntax, etc.
The arithmetic expression which was used as a model
earlier will also be used here, namely A+B+C+D*E*F+G+H.
The details of the algorithm follow:
(1) The first step is to rewrite the expression in
reverse Polish form and to reverse its order.
+ H + G + * F * E D + C + B A
(2) Starting with the rightmost symbol of the string,
assign a weight to each member of the string based on
the following procedure:
Assign to symbol $S_i$ the value
$$V_i = V_{i-1} + R_i, \qquad i = 1, 2, \ldots, n,$$
and $V_{i-1} = V_{i-2} + R_{i-1}$, $V_{i-2} = V_{i-3} + R_{i-2}$, etc.,
such that $V_{i-(i-1)} = V_1 = R_1$ and $V_0 = 0$,
where $R_i = 1 - O(S_i)$ given that
$O(S_i) = 0$ if $S_i$ is a variable,
$O(S_i) = 1$ if $S_i$ is a unary operator,
$O(S_i) = 2$ if $S_i$ is a binary operator.
Using this procedure, the following expression results:

i     15  14  13  12  11  10   9   8   7   6   5   4   3   2   1
Si     +   H   +   G   +   *   F   *   E   D   +   C   +   B   A
Vi     1   2   1   2   1   2   3   2   3   2   1   2   1   2   1

Note that for a "well-formed expression" of n symbols
$V_n = 1$.
(3) At this point the root node of the proposed
binary tree can be determined. Thus the given string
can be divided into two independent sub-strings. To
determine the root node, draw a line to the left of the
first symbol with a weight of 1 ($i = 11$, $S_i = +$, $V_i = 1$)
to the left of the symbol with the highest weight,
$V_m$ ($i = 7$, $S_i = E$, $V_i = V_m = 3$). The two independent
substrings consist of the strings to the left and to the
right of this line. The root node will be the leftmost
member of the string to the left of the line ($i = 15$,
$S_i = +$, $V_i = 1$). Note that $V_i$ also equals 3 for $i = 9$;
however $V_m$ is chosen from the earliest occurrence of
a symbol with the highest weight.
(4) The next step is to look for parallelism within
each of the new substrings. Consider the rightmost
substring. Form a new substring consisting of the
symbols within the values of $V_i = 1$ to the right and to
the left of $V_m$. Transpose this substring with the substring to the right of it whose leftmost member has a
weight of $V_i = 1$.

INITIAL RIGHTMOST SUBSTRING
i     11  10   9   8   7   6   5   4   3   2   1
Si     +   *   F   *   E   D   +   C   +   B   A
Vi     1   2   3   2   3   2   1   2   1   2   1

FINAL RIGHTMOST SUBSTRING
i     11  10   9   8   7   6   5   4   3   2   1
Si     +   +   C   +   B   A   *   F   *   E   D
Vi     1   2   3   2   3   2   1   2   1   2   1

This procedure is repeated until the initial $V_m$ occupies
the position $i = 2$ in the substring. For this example
this is already the case. Thus the rightmost substring
is in the proper form.
(5) The transposition procedure of step 4 is applied
next to the leftmost substring. However, since the
leftmost substring of this example consists of only two
operands and one operator, no further operations are
necessary.
(6) The resultant binary tree is shown in Figure 14.
The numbers assigned to each node represent the final
weight $V_i$ of the symbol as determined in steps 1-5
above.
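The weight assignment of step 2 and the division at the root node in step 3 can be sketched in Python as follows. The sketch assumes single-character symbols, only binary operators (unary operators are left out), and an expression with at least one operator; the names `BINARY`, `weights`, and `split_at_root` are hypothetical.

```python
BINARY = {"+", "-", "*", "/"}


def weights(reversed_polish):
    """Return the weights V listed left to right (positions i = n ... 1).
    Position i = 1 is the rightmost symbol; a variable adds 1 and a
    binary operator adds 1 - 2 = -1 to the running weight."""
    running, result = 0, []
    for symbol in reversed(reversed_polish):
        running += 1 - (2 if symbol in BINARY else 0)
        result.append(running)
    return result[::-1]


def split_at_root(reversed_polish):
    """Split the string into the leftmost and rightmost substrings."""
    w = weights(reversed_polish)
    if w[0] != 1:
        raise ValueError("expression is not well formed (V_n != 1)")
    highest = max(w)
    # Earliest occurrence of the highest weight = smallest i = largest index.
    index_vm = max(k for k, value in enumerate(w) if value == highest)
    # First symbol of weight 1 to the left of V_m; the line goes to its left.
    index_line = max(k for k in range(index_vm) if w[k] == 1)
    return reversed_polish[:index_line], reversed_polish[index_line:]


STRING = "+H+G+*F*ED+C+BA"
WEIGHTS = weights(STRING)
LEFT, RIGHT = split_at_root(STRING)
```

For the example string the line falls to the left of position i = 11, giving the leftmost substring +H+G (whose first symbol is the root node) and the rightmost substring +*F*ED+C+BA.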
Some observations and comments on this algorithm
are given below.
(1) The two branches on either side of the root node
can be executed in parallel. Within each main branch,
the transposition procedure of step 4 yields supplementary root nodes. The sub-branches on each side of the
supplementary nodes can be executed in parallel.
(2) The number of levels in the binary tree can be
predicted from the Polish form of the original string.
No. of LEVELS = MAX [NUMBER OF 1's; Vm]
in the substring (rightmost or leftmost) containing Vm.
Figure 14-Binary tree for parallel computation of
A+B+C+D*E*F+G+H
(3) The tree is traversed in a modified postorder
form [20]. The resulting expression is
D*E*F+A+B+C+G+H
(4) An added feature of this technique is that the
number of registers required to evaluate this expression
without intermediate STORE and FETCH operations
is obtained directly from the binary tree. This information is provided by the highest weight assigned to
any node within the tree. Thus for this example the
expression could be evaluated using at most two
registers without resorting to intermediate stores and
fetches.
(5) This technique of recognizing parallelism on a
local level has been applied to a single instruction, in
particular, an arithmetic expression. It is worthwhile
mentioning that each variable within the expression
can itself be the result of a processable task. Thus this
technique can be extended to a higher level of parallel
stream recognition, i.e., level parallelism.
In order to implement the techniques mentioned
here for components within tasks and the techniques
mentioned earlier for individual tasks, several system
features are desirable. Schemes for detecting parallel
processable components within compound tasks are
oriented primarily toward arithmetic expressions. For
these situations string manipulation ability would be
highly desirable. Since individual tasks are represented by a graph and its matrix, the ability to manipulate rows and columns easily would be very important. In this same area, an associative memory
could greatly reduce execution time in the implementation of precedence partitions.
ACKNOWLEDGMENTS
The authors would like to thank the referees of the
FJCC for their comments and suggestions which
resulted in improvements of this paper.
REFERENCES
1 A J BERNSTEIN
Analysis of programs for parallel processing
IEEE Trans on EC Vol 15 No 5 757-763 Oct 1966
2 E W DIJKSTRA
Solution of a problem in concurrent programming control
Comm ACM Vol 8 No 9 569 Sept 1965
3 D KNUTH
Additional comments on a problem in concurrent programming control
Comm ACM Vol 9 No 5 321-322 May 1966
4 E G COFFMAN R R MUNTZ
Models of pure time sharing disciplines for resource allocation
Proc 1969 Natl ACM Conf
5 M E CONWAY
A multiprocessor system design
Proc FJCC Vol 23 139-146 1963
6 A OPLER
Procedure-oriented statements to facilitate parallel processing
Comm ACM Vol 8 No 5 306-307 May 1965
7 J A GOSDEN
Explicit parallel processing description and control in programs for multi- and uni-processor computers
Proc FJCC Vol 29 651-660 1966
8 N E ABEL P P BUDNIK D J KUCK Y MURAOKA R S NORTHCOTE R B WILHELMSON
TRANQUIL: A language for an array processing computer
Proc SJCC 57-68 1969
9 D A FISHER
Program analysis for multiprocessing
Burroughs Corp May 1967
10 C V RAMAMOORTHY
Analysis of graphs by connectivity considerations
Journal ACM Vol 13 No 2 211-222 April 1966
11 C V RAMAMOORTHY M J GONZALEZ
Recognition and representation of parallel processable streams in computer programs-II (task/process parallelism)
Proc 1969 Natl ACM Conf
12 C V RAMAMOORTHY
A structural theory of machine diagnosis
Proc SJCC 743-756 1967
13 M J GONZALEZ C V RAMAMOORTHY
Recognition and representation of parallel processable streams in computer programs
Symposia on Parallel Processor Systems Technologies and Applications Ed. L C Hobbs Spartan Books June 1969
14 E C RUSSELL G ESTRIN
Measurement based automatic analysis of FORTRAN programs
Proc SJCC 1969
15 J B DENNIS
Programming generality, parallelism and computer architecture
Proc IFIPS Congress 68 C1-C7
16 H HELLERMAN
Parallel processing of algebraic expressions
IEEE Trans on EC Vol 15 No 1 Feb 1966
17 H S STONE
One-pass compilation of arithmetic expressions for a parallel processor
Comm ACM Vol 10 No 4 220-223 April 1967
18 J S SQUIRE
A translation algorithm for a multiprocessor computer
Proc 18th ACM Natl Conf 1963
19 J L BAER D P BOVET
Compilation of arithmetic expressions for parallel computation
Proc IFIPS 68 B4-B10
20 D KNUTH
The art of computer programming, Vol. 1, fundamental algorithms
Addison-Wesley 316
21 R S NORTHCOTE
Software developments for the array computer ILLIAC IV
Univ of Illinois Rpt No 313 March 1969
Performance modeling and empirical
measurements in a system designed for
batch and time-sharing users
by JACK E. SHEMER and DOUGLAS W. HEYING
Scientific Data Systems, a Xerox Company
El Segundo, California
INTRODUCTION
If any design goal is common to all computer system
organization schemes, it is that of providing "effective
service" both externally to the user of the computational
facility and internally with respect to utilization of
system resources. Thus, generally speaking, there are at
least two dimensions to this design objective. On the one
hand, effective service is the external satisfaction of a
broad spectrum of user demands. For example, the ideal
system might be visualized as one which economically
provides a large number of programming languages;
machine compatibility with other computers of widely
diverse hardware; and rapid computation. On the other
hand, effective service is the internal utilization of all
system components so as to increase computational
efficiency. In this respect, system structures are implemented which strive to maximize sub-system
simultaneity and system throughput. For example, a
degree of macro-parallelism is attained in many present
day systems by allowing a central processing unit (CPU)
and input/output controller to share the use of a main
memory register, thereby enabling processing and
input/output (I/O) to proceed concurrently (for one or
several independent programs, depending upon the
system software).
In general, external effectiveness is all that the user
sees, and it is therefore of primary interest to him.
Whereas, the purveyor of the equipment is vitally
concerned with internal utility and coordination.
However, this latter consideration indirectly relates to
the quality of service the user receives (his waiting time
for service completion, the price he is charged for
service, etc.).
The ramifications of hardware and software designs to
achieve such service can be investigated both internally
and externally; yet, a particular design strategy need
not supplement effective service from both viewpoints.
On the contrary, schemes tailored to improve external
utilization often degrade internal service effectiveness
and vice versa. Unfortunately, in confronting these
design trade-offs, the designer often had to rely upon
heuristic and intuitive arguments, since there is a
general lack of design models which quantitatively
relate system variables to reflect a priori performance
estimates. Hence, the design is complicated not only by
trade-offs between the often dissimilar aims of external
and internal effective service, but also by a deficiency of
design tools for investigating various implementation
alternatives.
These problems are especially amplified with the
advent of time-shared computer systems. In time-sharing systems, an ideal goal is to respond to interactive
on-line users such that each user receives the impression
that he has his own computer, yet at a price he can
afford. Thus in these systems, the computer complex is
shared among a number of independent users who are
concurrently communicating with the system, generating programs and interactive service requests via
on-line remote terminal equipment. This action enables
one to achieve economies of scale and distribute the cost
of the system among all users according to their usage
of the facilities. Similarly, the objective of rapid response
is realized by time slicing CPU service and sharing it
among the on-line users. A request for program execution
is not necessarily serviced to completion; but rather jobs
are granted finite intervals (quanta) of processing time.
If a job fails to exhaust its demands during a quantum
allocation, then it is truncated and postponed according
to a scheduling discipline, thereby facilitating rapid
response to short requests [1-4]. This preferential treatment
of short jobs increases the programmer's productiveness,
since one-attempt efforts, editing, debugging, and other
typically short interactive demands often encounter
exorbitant turn-around times in batch processing
environments (i.e., in relation to the amount of actual
processing time consumed, due to problems of key
punching, printer output, card stacking, and total
system demand).
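The effect of time slicing on short requests can be seen in a toy comparison. The Python below is only an illustration under strong assumptions that are not taken from the paper: all jobs arrive at time zero, swapping and scheduling cost nothing, and the quantum is one time unit. The names `round_robin` and `run_to_completion` are hypothetical.

```python
from collections import deque


def round_robin(demands, quantum):
    """Completion time of each job when the CPU is shared in fixed quanta."""
    queue = deque(enumerate(demands))
    clock, finish = 0, [0] * len(demands)
    while queue:
        job, left = queue.popleft()
        run = min(quantum, left)
        clock += run
        if left > run:
            queue.append((job, left - run))   # truncated and postponed
        else:
            finish[job] = clock
    return finish


def run_to_completion(demands):
    """Completion time of each job when every job is served to completion."""
    clock, finish = 0, []
    for demand in demands:
        clock += demand
        finish.append(clock)
    return finish


DEMANDS = [10, 1, 1]               # one long job followed by two short ones
SLICED = round_robin(DEMANDS, 1)
BATCH = run_to_completion(DEMANDS)
```

With these assumed demands the two short jobs finish at times 2 and 3 under time slicing instead of 11 and 12, while the long job finishes at 12 instead of 10.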
However, since computation is not necessarily run to
completion and main memory size is limited (by both
economic and physical reasons), programs must be
swapped into and out of main memory as the CPU
commutates its service from request to request.
Therefore, unless swapping is achieved with no loss in
time, it is obvious that service in the time-sharing sense
is less efficient in CPU utilization than service to
completion. Also, the time spent scheduling, allocating
buffers, and controlling swap input/output represents
overhead or wasted processing time which, due to
incomplete servicing, is greater in time-sharing systems
than batch processing systems. Furthermore, if the
system is dedicated to servicing on-line requests, the
CPU is essentially idle during periods of low on-line
input traffic. Hence, a design compromise must be
attained between external response rapidity and internal
efficiency since system performance, in the general case,
is a function of both response to selected classes of users
and utilization of system resources.
Yet, exploring such problem areas prior to design is
complicated, because any performance investigation is
incorrigibly statistical. Performance is not only a
function of software characteristics such as the input/
output, memory, and processing requirements of each
on-line request together with the occurrence rate of such
requests, but is also dependent upon hardware characteristics such as the instruction processing rate and the
access rates of secondary memory.
This paper presents one approach to mitigating some
of these difficulties. A system design is briefly described
and then analyzed utilizing a mathematical model. The
system is structured to accommodate both batch and
time-sharing users with the goal being to achieve a
balance of system efficiency and responsiveness. A set
of variables is defined which characterize on-line user
demands and the servicing capacity of various units
within the system. These variables are then quantitatively related in a mathematical model to derive salient
performance measures. Examples are given which
graphically display these measures versus various ranges
of the system variables. These a priori performance
estimates are then compared with empirical data
extracted from the system during its actual operation.
Here the emphasis is given to mathematical modeling
because this analysis method is more expedient and
generally less costly than the alternative approach of
simulation. Moreover, since many of the variables are
non-independent and rely upon characterization of user
demands, and since these are difficult to accurately
describe prior to actual operation, the macroscopic and
statistical indications provided by a mathematical model
are perhaps all that one can feasibly obtain.
Design and performance study
System design
The Batch/Time-Sharing Monitor (BTM) is designed
to afford SDS Sigma 5 and Sigma 7 users with interactive
and on-line time-sharing without disrupting batch
operations. For considerations of efficiency, the primary
objective of the BTM design is to provide limited time-sharing service while concentrating on throughput of
batch jobs. The servicing of time-sharing users is
allocated to minimize response time for interactive users, with
no special service given to the compute-bound on-line
users (because high-efficiency batch service is available).
Thus, the system is structured with resources for the
batch and time-sharing portions of the system separated
as much as possible. Different areas of main memory are
allocated so that a (compute bound) batch user is
always "ready to run." The file device is common
because files may be shared between batch and time-sharing users. However, the management technique
used minimizes the interference from this factor. The
swapping Rapid Access Disc (RAD) for time-sharing
users is independent of the file device, thus insuring that
swaps in process do not affect on-going batch programs.
The batch user is kept essentially compute bound by
buffering all of his unit record I/O via a RAD. This
allows the compute portion of each job to follow that
of the previous job without waiting for the printout,
etc., to complete. Thus, there is no need to attempt to
reclaim swap time from one time-sharing user to
another: a natural claimant, the batch job, is readily
available.
Hence, a very simple (and low overhead) swapping
and scheduling algorithm can be used. As a particular
user is dismissed, other users are polled in turn to see
who is "ready to run." If someone is found (not the
same user), a replacement swap is initiated and the
CPU is allocated to the batch job. When the swap-out/
swap-in is complete, the new user is given one quantum
(i.e., provided the batch job has already had at least its
quantum); then the cycle is repeated.
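The dismissal-and-swap cycle just described can be sketched in a few lines. This is only an illustrative reading of the prose, not BTM code: the names `next_ready` and `one_cycle`, the boolean `ready` list, and the choice to give batch a plain quantum when no on-line user is ready are all assumptions introduced here.

```python
def next_ready(ready, last):
    """Poll the other users in turn, starting just after `last`.

    `ready[i]` is True when user i is "ready to run". The user who was
    just dismissed (`last`) is never chosen. Returns an index or None.
    """
    n = len(ready)
    for step in range(1, n):
        candidate = (last + step) % n
        if ready[candidate]:
            return candidate
    return None


def one_cycle(ready, last, swap_ms, q_batch_ms, q_online_ms):
    """Return (events, new_last) for one scheduling cycle.

    Each event is a (who, duration_ms) pair.
    """
    user = next_ready(ready, last)
    if user is None:
        # Assumption: with nobody to swap in, batch simply keeps the CPU.
        return [("batch", q_batch_ms)], last
    # Batch runs while the replacement swap proceeds, and for at least
    # its own quantum, before the new user gets one on-line quantum.
    batch_ms = max(swap_ms, q_batch_ms)
    return [("batch", batch_ms), (user, q_online_ms)], user
```

In this sketch a 443 ms swap with a 200 ms batch quantum leaves the batch job on the CPU for the whole swap, while an 85 ms swap leaves it there for its full 200 ms quantum.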
In this way, batch is guaranteed a certain percentage
of the machine (and typically gets much more), and a
moderate number of time-sharing users receive rapid
response to conversational requests. Yet with this
relatively simple framework, a number of questions are
unavoidable: How do on-line response and batch
throughput vary with the number of on-line users, and
how do other variables such as quantum size and swap
time relate to system performance? Moreover, how
does one characterize system performance and the
variables which influence it?
Parameterizations and performance measures
The subject of "on-line" response is unfortunately
plagued by many interpretations of what constitutes
response (and, moreover, what defines adequate
response). For the purposes of this paper, "typical
on-line requests" are those which require minimal
central processor time, less than one quantum allocation. Thus, the response time C1 to a "typical on-line
demand" is that period elapsing between request
generation (the keying in of a control character such as
"carriage return") and the termination of the first time
quantum * which is allocated to the servicing of the
request. This definition provides the basis upon which
the on-line performance of the BTM system is analyzed
in this paper, since it is assumed that on-line users are
typically in phases of program preparation. ** Thus,
providing the quantum is large enough, the great
majority of user interactions (e.g., "open the next
line," "delete source image," "perform syntax check
and insert into text," etc.) can be satisfied with single
quantum allocations.
The mathematical model developed in the Appendix
enables one to characterize the system by selecting
values for the variables:
N = total number of active on-line communication
sources (i.e., the number of remote users who
are concurrently using the system).
λ = average user interaction rate (frequency at
which a single user requests service by the
CPU).
μ = mean rate at which on-line requests are
serviced by the CPU (1/μ = average
amount of CPU time required to complete
each request given that the CPU was
dedicated to the servicing of the request).
S = the average amount of time required to swap
an old user out of core and load a new user
(clearly, S is dependent upon the swapping
device as well as program size).
qR = time quantum allocated to on-line requests
(time-sharing users).
qB = time quantum given to batch requests
(background users).
m̄ = the average cumulative quantum extension
(for monitor services such as scheduling, file
I/O, service calls, etc.) incurred during the
period elapsing between successive quantum
allocations to on-line jobs.
* Also note that if the scheduling algorithm is round-robin then
C1 provides a basis for approximating the response time for
a request which requires multiple quanta.
** Note that this is not the case in system environments in which
the on-line users run production (compute bound) programs.
To supplement analysis efforts, the BTM system
software is capable of monitoring these (and other)
variables and accumulating their statistical distributions
during actual system operation. This does not impose
any significant overhead since much of this data is
already accumulated in the accounting log, and (as in
many other commercial systems) used as a basis for
charging users.
Upon establishing reasonable values for the above
variables, the model can then be used to derive performance measures. In terms of response, the salient
performance index is E[C1] where
E[C1] = the expected response time which "typical
on-line demands" experience (see definition
given above).
In addition, the model can readily be used to estimate
the percentage of CPU time available for batch jobs; the
percentage of CPU time received by time-sharing users;
utilization of the swapping RAD; expectations of
system revenues; and a variety of other indices obtained
from combinations of the derived parameters.
Figure 1-E[C1] vs. N (μ = 2.5 requests/sec.)
[Plot: average response to "typical on-line demands" E[C1] (sec.) versus number of concurrent users N; "swap limited" curves for the 7204, 7232 and 7212 RAD and a "batch limited" curve for the 7212 RAD. Parameters: qR = 200 ms.; S = 85 ms. if 7212 RAD, 248 ms. if 7232 RAD, 443 ms. if 7204 RAD; λ = 1 request/20 user-sec.; 1/μ = 400 ms./request; m̄ = 100 ms.]
Figure 2-E[C1] vs. N (μ = 5 requests/sec.)
[Plot: same axes and curves as Figure 1, with 1/μ = 200 ms./request and all other parameters unchanged.]
Figure 3-Relative batch capability
[Plot: percent of CPU time available for batch jobs (Pr[B] × 100%) versus number of concurrent users N.]
Figure 4-N max vs. CPU speed
[Plot: maximum number of concurrent users versus CPU speed (log scale).]
A priori estimates for some of these performance
measures are given in Figures 1-5 for reasonable ranges
of the variables N, λ, μ, S, qR, qB, and m̄. Obviously,
these variables will differ from one environment to
another. Therefore, before discussing conclusions which
can be drawn from these graphical results, it is appropriate to clarify the parameterizations and assumptions
which were used in the calculations:
1. The average swap time S was conservatively
calculated assuming that four RAD accesses are
required per swap with an average total of 16K
words transferred during each swap. (The RADs
are head-per-track rotating memories operating
at 1800 rpm; and the SDS model 7204, 7232 and
7212 RAD transfer data at rates of 187 × 10^3
bytes/sec., 384 × 10^3 bytes/sec. and 3 × 10^6
bytes/sec., respectively.)
2. The user interaction rate λ was estimated from
statistics gathered at RAND⁵ and other data
extracted from the GE/Dartmouth BASIC
system⁶ and the SDS 940 system.
3. The selection of qR = 200 ms. was established
such that the majority of user interactions are
satisfied with single quantum allocations, whereas selecting qB = 85 ms. and 200 ms. was done
merely to demonstrate "swap limited" and
"batch limited" operation, respectively.
4. The value of the average monitor time m̄ per
on-line/batch quantum cycle was approximated
utilizing batch accounting information and
timing studies of monitor services.
5. Values of μ were chosen such that the average
on-line quantum q̄R would be ≈ 125 ms. to
150 ms. when 200 ms. was allocated. This
selection was inferred from data extracted from
the SDS 940 System and BTM code traces. (Yet,
note that a single parameter μ does not provide a
characterization covering the more general case
in which the processing time distribution is
multi-modal.† However, for purposes of studying
interactive response, it provides a good approximation and lends itself to the mathematical
analysis.)
† The multi-modal case arises because of a multiplicity of language facilities and the natural division of requests into interactive
or compute demands.
Figure 5-E[C1] vs. qR (N = 18)
[Plot: E[C1] (sec.) versus on-line quantum allocation qR (sec.) for the 7204, 7232 and 7212 RAD at μ = 2.5 and μ = 5 requests/sec. Parameters: N = 18; qB = 85 ms. (i.e., "swap limited"); S = 85 ms. if 7212 RAD, 248 ms. if 7232 RAD, 443 ms. if 7204 RAD; λ = 1 request/20 user-sec.; m̄ = 100 ms.]
Mathematical results
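Two of the parameter choices listed above can be checked with simple arithmetic. The sketch below is not taken from the paper: it assumes 4-byte words, 16K = 16 × 1024 words, an average rotational latency of half a revolution per RAD access, and (for the second function) exponentially distributed CPU demand cut off at the quantum. The function names are illustrative.

```python
import math

BYTES_PER_WORD = 4            # assumption: 32-bit words
WORDS_PER_SWAP = 16 * 1024    # "16K words" read as 16 * 1024
ACCESSES_PER_SWAP = 4         # four RAD accesses per swap (from the text)
RPM = 1800                    # RAD rotation speed (from the text)


def swap_time_ms(bytes_per_sec):
    """Rough swap time: rotational latency for each access plus transfer."""
    revolution_ms = 60_000 / RPM
    latency_ms = ACCESSES_PER_SWAP * revolution_ms / 2   # half a turn each
    transfer_ms = 1000 * WORDS_PER_SWAP * BYTES_PER_WORD / bytes_per_sec
    return latency_ms + transfer_ms


def mean_online_quantum_ms(mu, q_r):
    """E[min(X, q_r)] in ms for X exponential with rate mu (q_r in sec)."""
    return 1000 * (1 - math.exp(-mu * q_r)) / mu
```

Under these assumptions the estimates land within about ten percent of the quoted swap times of 85, 248 and 443 ms, and the mean CPU time consumed per 200 ms quantum comes out near 126 ms for μ = 5 and near 157 ms for μ = 2.5, roughly in line with the range cited in item 5.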
Given this framework, let us now turn our attention
to the figures. Employing the mathematical model,
a priori estimates of average interactive response time
E[C1] are displayed versus N in Figure 1 and Figure 2
for μ = 2.5 requests/sec. and μ = 5 requests/sec.,
respectively. Here, three different curves are plotted in
each figure to demonstrate the limiting effects of each
swapping device (i.e., "swap limited" operation when
the batch quantum qB is less* than the swap time S).
Also, note that an additional curve is given for the
model 7212 RAD to display the effects of selecting a
batch quantum which exceeds the swap time (i.e.,
"batch limited" operation). This latter curve shows that
the fastest swapping device effectively becomes a slower
device when qB is set such that operation is "batch
limited"; the model 7212 RAD is almost equivalent to
a model 7232 RAD when qB = 200 ms.
Now since N is the total number of concurrent users
(active communication sources), Figures 1 and 2 enable
one to estimate a value for the maximum number of
users N max which the system can simultaneously
accommodate by: (1) assuming "swap limited" operation
and (2) defining what constitutes adequate response to
typical on-line demands. For example, if one assumes
that adequate interactive response is achieved if ≈ 80%
of the time a user experiences a delay of less than 5 sec.,
then, depending upon μ, one concludes:**
i. the model 7204 RAD will accommodate a
maximum of 10 to 16 concurrent users for***
μ = 2.5 requests/sec. to μ = 5 requests/sec.,
respectively;
ii. the model 7232 RAD will accommodate a
maximum of 16 to 26 concurrent users for
μ = 2.5 requests/sec. to μ = 5 requests/sec.,
respectively;
iii. the model 7212 RAD will accommodate a
maximum of 26 to 38 users for μ = 2.5 requests/sec.
to μ = 5 requests/sec., respectively.
* For this situation, the actual batch quantum allocation is the
swap time S.
** These conclusions were made by assuming that the probability distribution for response time C1 is such that twice the mean
E[C1] is (at least) the 80 percent point. This is a reasonable assumption in light of both the mathematical characterizations used in
the model and empirical measurements.
*** Note that reducing μ from 5 requests/sec. to 2.5 requests/sec.
is tantamount to halving the processing speed.
However, the actual number of on-line users who
concurrently use the system is a statistical parameter
which generally is less than N max and varies according
to the total number of on-line subscribers, their
demands, processing speed, N max, etc. In practice, the
total number of on-line subscribers typically exceeds
N max by at least a factor of three.
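The response criterion used above converts a percentile requirement (about 80% of responses under 5 sec.) into a bound on the mean, by assuming that twice the mean is at least the 80 percent point. As a quick plausibility check, and purely as an assumption made here for illustration, suppose the response time were exponentially distributed:

```python
import math


def prob_within(multiple_of_mean):
    """P(X <= k * E[X]) for an exponentially distributed X."""
    return 1 - math.exp(-multiple_of_mean)


# Probability that a response finishes within twice its mean.
share_within_twice_mean = prob_within(2)
```

For this distribution the probability is 1 - e^(-2), or about 0.86, so holding E[C1] to 2.5 sec. or less would keep roughly 86% of responses under 5 sec. Other distributions would give other values; the paper's own justification rests on its model and measurements.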
For the above cases, nominally 50-80% of the CPU
time is available for batch jobs. This is shown in
Figure 3. Similarly, utilizing this same response
criterion, it is interesting to observe the effects of
increasing**** CPU speed μ. This is demonstrated in
Figure 4 for each of the swapping devices. As CPU speed
increases indefinitely, the capacity of the system to
service on-line requests approaches a limit established
by the swapping device.
Additional insight into system responsiveness is
provided by Figure 5. Here, E[C1] is graphically
displayed versus the on-line user quantum qR for "swap
limited" operation and N = 18 (with all other variables
the same as those employed in Figures 1 and 2). Note
that the selection of a minimum qR is very critical;
however, having established a minimum qR, the variations are not dramatic for a relatively large range above
minimum qR. Also, notice that as μ is reduced from 5
requests/sec. to 2.5 requests/sec., a model 7232 RAD
must be used to achieve what a model 7204 RAD
accomplished in the former case; and similarly, a model
7212 RAD is required to equal the performance of a
model 7232 RAD.
Experimental results
Extensive statistics were gathered from the system
(while running typical jobs) with a twofold purpose in
mind. First, it was necessary to substantiate the validity
of the assumptions employed in the model; i.e., establish
that the chosen parameters were indeed consistent with
the actual environment. Secondly, a correlation between
empirically measured performance and the results of the
model would lend credence to the validity of the model,
and therefore allow us to extrapolate and predict
performance for other user environments and system
configurations.
The first objective was accomplished by observing a
BTM system which used a model 7212 RAD for
swapping with quanta qR = qB = 200 ms. Values for
λ, μ, m̄ and program size were tabulated for many
different observation periods. For each of these monitoring sessions different average values were obtained, but
the values μ = 3.5 requests/sec., λ = 1 request/15
user-sec., S = 85 msec. and m̄ = 100 msec. were found
to be quite representative of most samples. The variables
μ and λ were most subject to variation and ranged from
2 to 6 requests/sec. and from 1 request/25 user-sec. to
1 request/10 user-sec., respectively. Also, the data
indicated that the assumptions of exponentially distributed CPU time and request inter-arrival time
provided good approximations of user demands.
Given that the first objective was satisfied, realization
of the second objective is buttressed by Figure 6 which
plots the average of all sampled values for two of the key
performance indications (average response time E[C1]
and CPU time available for batch Pr[B]) as a function
of the number of users N. Upon comparing these results
with the mathematical predictions (also see Figures 1-3),
one can infer that (at least for the range of variables
considered) the mathematical model is reasonably
consistent with actual system operation.
Figure 6-Empirical results
[Plot: sampled average response time E[C1] (sec.) and measured percentage of CPU time available for batch jobs (Pr[B] × 100%) versus number of concurrent users N, with the predictions obtained from the model shown for comparison.]
**** Note that this latitude is only possible on a limited basis
(e.g., code optimization, faster memory, faster operation unit,
multi-processing, etc.)
Comments
The analysis presented above primarily focused attention on the system's capacity to accommodate user
demands. Even though no mention was given to
cost/performance tradeoffs, the model lends itself to
this latter design consideration. For example, the
variables N, Pr[B], and μ might be combined to reflect
the revenue derived for service to batch jobs and the
revenue obtained for servicing interactive users, which
could then be weighed against the cost expended to
provide (and maintain) the system complement. This
would provide a basis for the designer to balance CPU
cost/performance with that of other system elements.
The process of selecting and examining performance
indexes similar to those discussed here enables the
designer to better appraise the many implementation
tradeoffs which confront him. Moreover, when supplemented with empirical data, these techniques provide a
basis for not only configuring existing systems but also
synthesizing new systems. However, it should be
emphasized that apart from the mathematical model
itself and its macroscopic treatment of the system, the
fidelity of the results and conclusions obtained in this
analysis (or any analysis of this sort) can only be as good
as the accuracy attributed to the independent variables
(N, λ, μ, m̄, S). The values possessed by these variables
dramatically affect performance and will vary from one
environment to another. Therefore, one should be
cautious before inferring any explicit and universal
characterizations of system performance.
REFERENCES
1 B KRISHNAMOORTHI R C WOOD
Time-shared computer operations with both interarrival and
service time exponential
J A C M Vol 13 317-338 July 1966
2 E G COFFMAN JR
Stochastic models of multiple and time-shared computer
operations
Report 66-38 Dept of Eng Univ of Calif Los Angeles
June 1966
3 L KLEINROCK
Time-shared systems: A theoretical treatment
J A C M Vol 14 242-261 April 1967
4 J E SHEMER
Some mathematical considerations of time-sharing scheduling
algorithms
J A C M Vol 14 262-272 April 1967
5 G E BRYAN
JOSS: 20,000 hours at a console-a statistical summary
Proc F J C C 769-777 1967
6 H CANTRELL
Time-sharing data
General Electric Technical Information Series Report
R65CD12 December 1965
7 T L SAATY
Elements of queueing theory
McGraw-Hill New York 1961
APPENDIX
BTM mathematical model
Consider the generation of on-line requests on each
communication channel to be an exponential process with
parameter λ. Hence, the time interval x between
completion of a request and generation of a new request
on a given line is described by the distribution function
\[ A(x) = \begin{cases} 1 - e^{-\lambda x} & \text{for } x \ge 0 \\ 0 & \text{for } x < 0 \end{cases} \]
Similarly, assume that the service time t required by
each on-line request is exponentially distributed with
parameter μ and characterized by the distribution
function
\[ \Pr[\text{service time} \le t] = \begin{cases} 1 - e^{-\mu t} & \text{for } t \ge 0 \\ 0 & \text{for } t < 0 \end{cases} \]
Given that there are N channels, let pn(t) denote the
probability that n on-line requests are queued at some
arbitrary time t for n = 0, 1, ..., N; then
\[ \frac{dp_n(t)}{dt} = \begin{cases} -N\lambda\, p_0(t) + \mu \Pr[R(t)]\, p_1(t) & \text{for } n = 0 \\ -\bigl[(N-n)\lambda + \mu \Pr[R(t)]\bigr]\, p_n(t) + (N-n+1)\lambda\, p_{n-1}(t) + \mu \Pr[R(t)]\, p_{n+1}(t) & \text{for } 0 < n < N \\ -\mu \Pr[R(t)]\, p_N(t) + \lambda\, p_{N-1}(t) & \text{for } n = N \end{cases} \]
where Pr[R(t)] denotes the probability that at time t
the computer is servicing one of the remotely generated
on-line requests. Note that in the above equations, the
input rate is (N - n)λ when n requests are queued.
Thus the model accounts for the natural variations in
demand intensity which result because there are a finite
number N of input sources.
From these equations, the stationary probability⁷
that n on-line requests are queued is
\[ p_n = \frac{N!}{(N-n)!} \left( \frac{\lambda}{\mu \Pr[R]} \right)^{n} p_0 \]
where
\[ \Pr[R] = \lim_{t \to \infty} \Pr[R(t)] \]
and
\[ p_0 = \frac{1}{1 + \sum_{n=1}^{N} \frac{N!}{(N-n)!} \left( \frac{\lambda}{\mu \Pr[R]} \right)^{n}} \]
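The stationary distribution above is straightforward to evaluate once a value of Pr[R] is supplied. The following is a minimal sketch, not the authors' program; the function name and the trial value of Pr[R] used in any call are assumptions for illustration.

```python
import math


def stationary_probabilities(N, lam, mu, pr_r):
    """Return [p_0, ..., p_N] for the finite-source queue in the text.

    lam  : per-user request rate (lambda)
    mu   : CPU service rate for on-line requests
    pr_r : fraction of time the CPU serves on-line requests, Pr[R]
    """
    rho = lam / (mu * pr_r)
    # Unnormalised weights N!/(N-n)! * rho**n; the n = 0 weight is 1.
    weights = [math.perm(N, n) * rho**n for n in range(N + 1)]
    total = sum(weights)          # equals 1 / p_0
    return [w / total for w in weights]
```

Each returned probability satisfies the balance relation (N - n + 1) λ p(n-1) = μ Pr[R] p(n), which is what the differential equations reduce to in the steady state.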
The probability Pr[R] can be estimated by considering
the interval which elapses between successive allocations
of a quantum to on-line users. Let Tk denote the total
time between the 0th on-line quantum completion and
the kth on-line quantum completion. If the kth completion leaves the on-line queue in an empty state, then the
expected value of the time ΔTk until the next on-line
quantum completion is
\[ E[\Delta T_k] = \bar{q}_B + \bar{q}_R + \bar{m} + \frac{1}{N\lambda} \]
where q̄B is the average quantum which batch users
receive; q̄R is the expected duration of an on-line
(remote user) quantum; (1/Nλ) is the mean time until
the generation of the next on-line request; and m̄ is the
expected monitor overhead time per batch/on-line
quantum cycle. Here, m̄ accounts for any scheduling,
I/O overhead, file operations, and any other CPU time
pre-empted by the monitor which results during the
cycle of a quantum allocation to a batch job followed by
a quantum allocation to an on-line job.
In the case when the kth on-line quantum completion
does not leave the interactive user queue empty, then
with probability (1 - p0)
\[ E[\Delta T_k] = \bar{q}_B + \bar{q}_R + \bar{m} \]
The variables q̄B and q̄R are heavily influenced by
quantum periods and swap time. If one assumes that
(with the exception of a batch quantum allocation every
other quantum) on-line jobs run on a demand basis
(i.e., the batch quantum qB is less than the swap time S),
then q̄B = S. Hence, the swap time limits the rate at
which successive quantum allocations are provided to
the on-line requests (i.e., maximum service capacity is
given to on-line requests). Whereas, if the batch
quantum limits the servicing of on-line requests
(qB > S), then q̄B = qB. Therefore, for completeness
\[ \bar{q}_B = \begin{cases} q_B & \text{if } S < q_B \\ S & \text{if } S \ge q_B \end{cases} \]
Now let TB, TR, and Tm denote the length
of time out of Tk which the system spends servicing
batch jobs, on-line jobs, and monitor functions,
respectively.
Then as k goes to infinity, the ratios TB/k, TR/k, and
Tm/k converge with probability one to (q̄B + p0/Nλ),
q̄R, and m̄, respectively. Therefore, in the limit, an
approximation to the fraction of the time which the
system spends servicing on-line requests is
\[ \Pr[R] = \lim_{k \to \infty} \left[ \frac{T_R}{T_k} \right] = \lim_{k \to \infty} \left[ \frac{T_R / k}{T_k / k} \right] \]
[the closed-form right-hand side, which contains the scale factor f, is illegible in the source]
Here, f is an appropriate scale factor introduced to
facilitate solving for {pn}, n = 0, ..., N. The numerical technique
is to let f increase by some small Δf until a solution for
p0 is obtained which is consistent with Pr[R]. The
variable f satisfying this criterion will vary dramatically
depending upon N, m̄, μ, λ and qB.
Upon solving for p0, the percentage of CPU time
available for batch jobs is
\[ \Pr[B] = \frac{\bar{q}_B + p_0\,(1/N\lambda)}{\bar{q}_R + \bar{q}_B + \bar{m} + p_0\,(1/N\lambda)} \]
[the expression for the expected response time E[C1] is illegible in the source]
where
\[ E[n] = \sum_{n=0}^{N} n\, p_n \]
and E[T0] is the expected time remaining subsequent to
the arrival of an on-line request before the next quantum
allocation is initiated. The value of E[T0] is difficult to
accurately express since it is a function of the probability
densities for qB and m together with machine state
probabilities; however, it is clear that
[inequality illegible in the source]
At any rate, E[T0] is not a dominant factor in E[C1]
unless E[C1] is extremely small (i.e., E[C1] ≈ qR + E[T0],
for example). Hence, the precise value of E[T0] is not
critical in those cases which are of particular interest
(namely, those resulting when the on-line queue tends
toward saturation; i.e., E[n] → N).
In addition to the above result for E[C1], since the
scheduling discipline is round-robin, it is possible to
estimate²⁻⁴ the expected total response time E[R|t] for
an on-line request which requires a processing time t in
excess of a single quantum qR
[the expression for E[R|t] is only partly legible in the source: it begins E[R|t] ≈ t + ..., and its remaining terms combine E[C1], p0 E[T0], qR, q̄B and m̄]
where <a/b> is the smallest integer greater than a/b.
Alternate model
Let pmn(Tk) denote the probability that n on-line
requests are queued at epoch Tk marking the completion
of the kth on-line quantum allocation, given that at
epoch Tk-1 there were m on-line requests awaiting
service from the system.¹,² Then independent of k,
since the CPU servicing of requests is characterized as
an exponential process
\[ p_{mn} = \begin{cases} \displaystyle\int_{0}^{y + q_R - \varepsilon} \Pr[\,n - m + 1 \mid m, t\,]\; p_{B+R}(t)\, dt & \text{for } 1 \le m \le n \\ 0 & \text{for } n \le m - 2;\ m \ge 1 \\ \displaystyle\int_{0}^{y + q_R - \varepsilon} \Pr[\,0 \mid m, t\,]\; p_{B+R}(t)\, dt & \text{for } n = m - 1 \ge 0 \end{cases} \]
where ε → 0 and Pr[k|m,t] denotes the conditional
probability of generating k new on-line requests in a
time interval t given that m requests are queued. For
example, with exponential inter-arrival
[expression illegible in the source]
Also, in the above equations
\[ y = \begin{cases} S_{\max} & \text{if service to on-line customers is swap limited (i.e., } q_B < S\text{)} \\ q_B & \text{if batch quantum limits on-line service (i.e., } q_B \ge S\text{)} \end{cases} \]
Here, pB denotes the probability density function which
describes the batch quantum allocation, and pB+R is
the convolution of pB with the density function pR
defining the distribution of an on-line quantum allocation. Both pB and pR include overhead functions to
account for file I/O, monitor overhead, etc.
The density function pB is derived from the swap time
distribution when qB < S; whereas, it depicts the CPU
servicing of batch requests when S < qB. For example,
in the latter case with δ(z) representing the Dirac-delta
function describing an independent variable z, one
could characterize the constant batch allocation interval
by
\[ p_B(t) = \delta\bigl(t - (\gamma_B + q_B)\bigr) \]
where the constant γB reflects batch overhead. Similarly,
letting γR denote the overhead incurred during an
on-line quantum allocation
\[ p_R(t) = \begin{cases} 0 & \text{for } t \le \gamma_R \text{ or } t > \gamma_R + q_R \\ \mu e^{-\mu t} + e^{-\mu q_R}\, \delta\bigl(t - (q_R + \gamma_R)\bigr) & \text{for } \gamma_R \le t \le \gamma_R + q_R \end{cases} \]
For completeness, the transitions from the 0-state are
assumed to be
[expression illegible in the source]
Then, having formulated the state transitions {pmn}
and defined the density functions pB(t) and pB+R(t), the
problem remains to solve for the steady-state probabilities. This is accomplished by noting that the pmn's
define an ergodic Markovian chain whereby in matrix
form with P = (pmn) there exists a unique set of numbers
{pm}, m = 0, ..., N, such that
\[ \mathbf{p} = \mathbf{p}\,\mathbf{P}, \qquad \mathbf{p} = (p_0, p_1, \ldots, p_N) \]
and
\[ \sum_{n=0}^{N} p_n = 1 \]
The solution of these equations produces the limiting
stationary probabilities {pn}, n = 0, ..., N, which could be used in
calculating E[n] to provide a more accurate estimate of
E[C1]. (That is, providing one can accurately describe
pB, pB+R, λ, etc.).
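Once a transition matrix is available, the stationary probabilities of an ergodic chain can be obtained numerically. The sketch below uses plain power iteration on a small made-up matrix; the matrix entries are purely illustrative and are not derived from pB or pB+R, and the function name is an assumption.

```python
def stationary_distribution(P, iterations=10_000, tol=1e-13):
    """Iterate p <- p P from a uniform start until p stops changing."""
    n = len(P)
    p = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(p[m] * P[m][k] for m in range(n)) for k in range(n)]
        if max(abs(a - b) for a, b in zip(p, nxt)) < tol:
            return nxt
        p = nxt
    return p


# Illustrative 3-state chain. As in the text, the queue can shrink by at
# most one per step, so the entry for row 2, column 0 is zero.
P_EXAMPLE = [
    [0.5, 0.5, 0.0],
    [0.3, 0.4, 0.3],
    [0.0, 0.6, 0.4],
]
p_example = stationary_distribution(P_EXAMPLE)
```

For this example the exact solution of p = pP with the probabilities summing to one is (2/7, 10/21, 5/21), which the iteration approaches. A direct linear solve would serve equally well for the small state spaces (N + 1 states) considered here.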
However, since the accuracy of such variables would
be highly questionable in the absence of any empirical
information and since this latter model presents a
number of non-trivial mathematical difficulties, it was
not utilized to derive the results given in this paper.
Yet, in the future, as sufficient data is accumulated from
the actual operation of BTM systems, the latter
model will enable us to extrapolate and better predict
the effects of alterations to the system (e.g., improvements resulting from faster swapping devices or
increases in CPU speed).
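The first model in this Appendix couples p0 and Pr[R]: each depends on the other, so they must be solved together. The sketch below is one hedged way to do that and is not the authors' procedure. It assumes Pr[R] = q̄R / (q̄R + q̄B + m̄ + p0/Nλ), which follows from the stated limits of TR/k and Tk/k but omits the scale factor f; it assumes q̄R is the mean of an exponential CPU demand cut off at qR; and it uses bisection rather than stepping a trial value upward. All names are illustrative.

```python
import math


def solve_model(N, lam, mu, q_r, q_b, swap, m_bar, tol=1e-12):
    """Return (Pr[R], p0, Pr[B]) for the first Appendix model (times in sec)."""
    q_r_bar = (1 - math.exp(-mu * q_r)) / mu      # assumed mean on-line quantum
    q_b_bar = q_b if swap < q_b else swap         # piecewise rule from the text

    def p0_given(pr_r):
        rho = lam / (mu * pr_r)
        return 1 / sum(math.perm(N, n) * rho**n for n in range(N + 1))

    def implied_pr_r(pr_r):
        idle = p0_given(pr_r) / (N * lam)
        return q_r_bar / (q_r_bar + q_b_bar + m_bar + idle)

    # implied_pr_r(x) - x is positive near 0 and negative at 1, and it
    # decreases in x, so bisection finds the single consistent value.
    lo, hi = 1e-6, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if implied_pr_r(mid) > mid:
            lo = mid
        else:
            hi = mid
    pr_r = (lo + hi) / 2
    p0 = p0_given(pr_r)
    idle = p0 / (N * lam)
    pr_b = (q_b_bar + idle) / (q_r_bar + q_b_bar + m_bar + idle)
    return pr_r, p0, pr_b
```

Calling `solve_model(18, 1/20, 5.0, 0.2, 0.085, 0.085, 0.1)` corresponds to the "swap limited" 7212 RAD case with the parameter values quoted for Figure 2; increasing N in this sketch lowers the batch share Pr[B] and raises Pr[R].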
ACKNOWLEDGMENT
The authors are indebted to M. Leavitt, D. Cumming,
J. Doeppel, T. Martin and G. E. Bryan for their many
contributions to the BTM design effort and also wish to
extend thanks to all those other individuals at Scientific
Data Systems who helped to make this project possible.
In particular, the authors are grateful to D. Cota,
E. Maso and Dr. R. Spinrad for their guidance in these
efforts.
Dynamic protection structures
by B. W. LAMPSON
Berkeley Computer Corporation
Berkeley, California
INTRODUCTION
A very general problem which pervades the entire field
of operating system design is the construction of protection mechanisms. These come in many different
forms, ranging from hardware which prevents the execution of input/output instructions by user programs,
to password schemes for identifying customers when
they log onto a time-sharing system. This paper deals
with one aspect of the subject, which might be called
the meta-theory of protection systems: how can the
information which specifies protection and authorizes
access itself be protected and manipulated? Thus, for
example, a memory protection system decides whether a
program P is allowed to store into location T. We are
concerned with how P obtains this permission and how
he passes it on to other programs.
In order to lend immediacy to the discussion, it
will be helpful to have some examples. To provide
some background for the examples, we imagine a
computation C running on a general multi-access
system M. The computation responds to inputs from
a terminal or a card reader. Some of these look like
commands: to compile file A, load B and print the
output double-spaced. Others may be program statements or data. As C goes about its business, it executes
a large number of different programs and requires at
various times a large number of different kinds of
access to the resources of the system and to the various
objects which exist in it. It is necessary to have some
way of knowing at each instant what privileges the
computation has, and of establishing and changing
these privileges in a flexible way. We will establish a
fairly general conceptual framework for this situation,
and consider the details of implementation in a specific
system.
Part of this framework is common to most modern
operating systems; we will summarize it briefly. A
program running on the system M exists in an environment created by M, just as does a program running in
supervisor state on a machine unequipped with software. In the latter case the environment is simply the
available memory and the available complement of
machine instructions and input/output commands;
since these appear in just the form provided by the
hardware designers, we call this environment the bare
machine. By contrast, the environment created by M
for a program is called a virtual or user machine.⁶ It
normally has less memory, differently organized, and
an instruction set in which the input/output at least
has been greatly changed. Besides the machine registers and memory, a user machine provides a set of
objects which can be manipulated by the program. The
instructions for manipulating objects are probably
implemented in software, but this is of no concern to
the user machine program, which is generally not able
to tell how a given feature is implemented.
The basic object which executes programs is called
a task or process;⁶ it corresponds to one copy of the
user machine. What we are primarily concerned with
in this paper is the management of the objects which
a process has access to: how are they identified, passed
around, created, destroyed, used and shared.
Beyond this point, three ideas are fundamental to
the framework being developed:
1. Objects are named by capabilities,3 which are
names that are protected by the system in the
sense that programs can move them around but
not change them or create them in an arbitrary
way. As a consequence, possession of a capability can be taken as prima facie proof of the
right to access the object it names.
2. A new kind of object called a domain is used to
group capabilities. At any time a process is
executing in some domain and hence can exercise
the capabilities which belong to the domain.
When control passes from one domain to another (in a suitably restricted fashion) the capabilities of the process will change.
3. Capabilities are usually obtained by presenting
domains which possess them with suitable
authorization, in the form of a special kind of
capability called an access key. Since a domain
can possess capabilities, including access keys,
it can carry its own identification.
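The first two ideas can be made concrete with a minimal sketch in Python. The class names `Capability`, `Domain` and `Process`, their methods, and the object identifier 17 are illustrative assumptions and not Model I interfaces; an immutable object merely stands in for the protection which the system itself would have to enforce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """A protected name: programs may pass it around but not alter it."""
    kind: str       # e.g. "file", "page", "access key"
    object_id: int  # hypothetical internal identifier of the named object


class Domain:
    """A domain groups capabilities."""

    def __init__(self, name):
        self.name = name
        self._capabilities = set()

    def grant(self, capability):
        self._capabilities.add(capability)

    def holds(self, capability):
        return capability in self._capabilities


class Process:
    """A process executes in exactly one domain at any time."""

    def __init__(self, domain):
        self.domain = domain

    def enter(self, domain):
        # A real system would restrict how control passes between domains.
        self.domain = domain

    def access(self, capability):
        # Possession by the current domain is taken as proof of the right.
        if not self.domain.holds(capability):
            raise PermissionError(f"{self.domain.name} lacks {capability}")
        return capability.object_id


cp, x = Domain("CP"), Domain("X")
command_file = Capability("file", 17)
cp.grant(command_file)

process = Process(cp)
process.access(command_file)  # allowed while the process runs in CP
process.enter(x)              # the capabilities of the process change
```

After the last line the same process can no longer exercise `command_file`, since domain X was never granted it.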
A key property of this framework is that it does not
distinguish any particular part of the computation. In
other words, a program running in one domain can
execute, expand the computation, access files and in
general exercise its capabilities without regard to who
created it or how far down in any hierarchy it is. Thus,
for example, a user program running under a debugging
system is quite free to create another incarnation of
the debugging system underneath him, which may in
turn create another user program which is not aware
in any way of its position in the scheme of things. In
particular, it is possible to reset things to a standard
state in one domain without disrupting higher ones.
The reason for placing so much weight on this property is two-fold. First of all, it provides a guarantee
that programs can be glued together to make larger
programs without elaborate prearrangements about
the nature of the common environment. Large systems
with active user communities quickly build up sizable
collections of valuable routines. The large ones in the
collections, such as compilers, often prove useful as
sub-routines of other programs. Thus, to implement
language X it may be convenient to translate it into
language Y, for which a compiler already exists. The X
implementor is probably unaware that Y's implementation involves a further call on an assembler. If the
basic system organization does not allow an arbitrarily
complex structure to be built up from any point, this
kind of operation will not be feasible.
The second reason for concern about extendibility
is that it allows deficiencies in the design of the system
to be made up without changes in the basic system
itself, simply by interposing another layer between the
basic system and the user. This is especially important
when we realize that different people may have different
ideas about the nature of a deficiency.
We now have outlined the main ideas of the paper.
The remainder of the discussion is devoted to filling
them out with examples and explanations. The entire
scheme has been developed as part of the operating
system for the Berkeley Computer Corporation Model
I. Since many details and specific mechanisms are
dependent on the characteristics of the surrounding
system and underlying hardware, we digress briefly
at this point to describe them.
Environment
The BCC Model I is an integrated hardware and software system designed to support a large number (up to
500) of time-sharing users. This system consists of
two central processors, several small processors, a large
central (core and integrated circuit) memory, and rotating magnetic memory. The latter contains more than
500 x 10^6 bytes, including approximately 12 x 10^6 bytes
of drum having a transfer rate of more than 5 x 10^6
bytes per second.
The hardware allows each process more than 512k
bytes of virtual memory. The central processors can
accommodate operands of various sizes including 48- and 96-bit floating point numbers. The addressing
structure allows characters, part-word fields and array
elements to be referenced directly. The subroutine-calling instruction passes parameters and allocates
stack space automatically. System calls are handled
exactly like ordinary function calls; when arrays or
labels are passed to the system they are checked automatically by the hardware so that they can be used
by the system without further ado.
The memory management system organizes memory
into pages. A page is identified by a 48-bit unique name
which is guaranteed different for each page ever created
in the system. Tables are maintained in the central
memory which allow the page to be found in the various
levels of the memory system. These tables are automatically accessed by the address mapping hardware
the first time the page is referenced after the processor
starts to run a new process. Thereafter its real core
address is kept in fast registers. It is therefore unnecessary for any program other than a small part of the
basic system to be concerned about the location of a
page in the memory system; when it is referenced, it
will be brought into the central memory if it is not
already there. Extensive facilities are provided, however, to allow a process to control the level in the memory hierarchy of the pages it is interested in. The work
of managing the memory is done by a processor with
read-only program memory and data access to the
central memory; this processor has a 100 ns cycle
time, so that it can handle the large amount of computing required to keep up with demands placed on
the memory system. Another small processor handles
the remote terminals, which are multiplexed in groups
of 20 to 100 at remote concentrators and brought
into the system over high-speed lines.
Pages are grouped into files, which are treated as
randomly addressable sequences of pages. The only
mechanism provided to access the data in a file is to
put a page of the file into the virtual memory of a
process. Files and processes are named and have protection information associated with them.
Domains in action
Before plunging into a detailed analysis of capabilities and domains, we will look at some of the practical situations which these facilities are designed to
serve. They all have the same general character: several
programs with different privileges exist. Each program
corresponds to one domain. Some of the domains control others, in the sense that the capabilities of a controlled domain are a subset of those of its controlling
domain. As a first example, consider the command
process CP of an operating system. This program
accepts a command, perhaps from a remote terminal,
and attempts to recognize it as a call on a program X
which CP knows about. If it succeeds, CP calls on X for
execution, passing it any parameters which were included in the command. To do this, CP must set up
a suitable environment for X to function in. In particular, enough memory must be provided for X to
run, X must be loaded properly, and suitable input/
output must be available. When X is finished, it will
return and CP can process a new command.
The key point is that we want CP to be protected
from X, to ensure that the user's commands continue
to be processed even if X has bugs. In particular, we
want to be sure that
1. X does not destroy CP's memory or files, so
that CP can continue to run when X returns.
2. CP can stop X if it goes wild. Usually we want
the ability to set a time limit and also to intervene from the terminal.
In other words, we want CP and X to run in separate
domains, as illustrated in Figure 1 (since this is an
informal discussion, we do not trouble to distinguish
carefully between the program X and the domain in
which it runs). Here we have shown the call from CP
to X in two forms: in the picture on the right, and as
a return capability in X. The reason for the capability
is that X cannot return with a simple branch operation, since it would then be able to start CP running
at any point, which would destroy the protection.
[Figure 1-A command processor and its command. Panels: Calls; Domains. Labels: CP: command processor; X: command; command input; command output; Directory of commands; Capabilities required by X; Domain X; Return to CP.]
Suppose now that we want to allow X to get additional commands executed. X might, for example, be a
Fortran compiler whose output must be passed
through an assembler. A simple way to do this is to
put the assembler input on a file called, say, FORTRANTEMP, and issue the command
ASSEMBLE FORTRANTEMP, BINARY
This command is just a string, which can easily be
constructed by the compiler X. To get it executed,
however, X must be able to call CP. This situation
is illustrated in Figure 2; note the call capability in X,
which is quite different from the return capability.
We are ignoring for the moment the question of how
CP knows that X is authorized to call the assembler.
[Figure 2-A recursive command processor. Labels: CP: command processor; X: command; Y: command; command input; command output; Directory of commands; Capabilities required by X; Capabilities required by Y; Domain X; Domain Y; call CP; Return to CP; Return to X.]
If the idea of the preceding paragraph is pursued, it
suggests the value of being able to switch the source
of command input and the destination of command
output in a flexible way. By these terms we mean the
traffic between a program and the entity by which it
is directed. In a time-sharing system this is normally
a terminal at which the user is sitting; in a non-interactive system it will be a file of control cards. It is
often desirable, however, to switch between the two,
so that routine processing can be done automatically
when the user's attention is elsewhere, yet he can
regain control when things go awry. Again, it is not
uncommon to wish to capture a complete record of a
conversation between user and machine for later
analysis and replay. More radical, it may be of interest
to replace the user at his terminal with a program
which can manipulate the strings of characters which
constitute commands and responses. In this way major
changes in the external appearance of a system can
be obtained with little effort.
All of these things can be accomplished by giving
interactions with the command I/O device the form of
calls to a different domain which acts as a switch. A
generalization to include the possibility of different
command devices for different domains is easy. Thus,
a user may initiate a program in a domain X which,
while continuing to communicate with him, starts a
subsidiary domain Y and feeds it commands. The subsidiary, unaware of the way in which it is being driven,
may iterate the process by creating Z. The key fact
which makes it all work is the isolation of one domain
from others. Thus, Y may decide to close all its files
without disturbing X, since Y has no way of even
knowing about X's files, much less accessing them. Z,
on the other hand, can be an open book to Y. Various
aspects of the situation are illustrated in Figure 3.
[Figure 3a-Switchable control I/O-the domains. Domains shown: CP1 (command processor 1), MC (macro command), CP2 (command processor 2), X (user program) and CIO (control I/O). Capabilities shown include the directory of commands, call CIO, call CP1, call CP2, call MC and the corresponding returns.]
[Figure 3b-Switchable control I/O-the calls. The top-level command processor initiates a command MC which wants to drive another command processor with some pre-stored or computed input. It therefore creates another CP and calls it, telling CIO to use MC for its I/O. The lower CP is given a command to call the user program X. This program needs input, which it gets by calling CIO, the domain which is switching the control I/O. CIO calls the current input source, which is MC.]
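The switching role of the control I/O domain can be sketched as follows. This is only an illustration under stated assumptions: the class `ControlIO`, its methods `use` and `read`, the helper `make_macro` and the command string "QUIT" are hypothetical, and ordinary function calls stand in for calls between domains.

```python
class ControlIO:
    """Stand-in for the CIO domain: routes command input per domain."""

    def __init__(self, terminal):
        self._terminal = terminal  # default source: the user's terminal
        self._sources = {}         # domain name -> input source

    def use(self, domain, source):
        """Direct the command input of `domain` to come from `source`."""
        self._sources[domain] = source

    def read(self, domain):
        """Called by a domain which needs its next line of command input."""
        source = self._sources.get(domain, self._terminal)
        return source()


def terminal():
    return "typed by the user"


def make_macro(commands):
    """A macro command MC which feeds pre-stored commands, one per read."""
    pending = iter(commands)
    return lambda: next(pending)


cio = ControlIO(terminal)
cio.use("CP2", make_macro(["ASSEMBLE FORTRANTEMP, BINARY", "QUIT"]))

first = cio.read("CP2")  # CP2 is driven by the macro command
other = cio.read("CP1")  # CP1 still reads from the terminal
```

Neither command processor needs to know where its input comes from; only the table inside `ControlIO` changes.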
This section concludes by analyzing a problem of
great practical importance: how to construct a debugging system. This example is a good source of insights
into the facilities required of a protection system because of the great variety of things which can be expected to go wrong during debugging. There are two
domains, one for the debugger D and one for the program X being debugged. We of course want D to be
protected from X. Equally important, we want X to
be completely open to D, so that every object accessible
to X is also accessible to D, and furthermore that D
can find all the objects accessible to X as well as access
them. Otherwise D will not be able to find out what X
has done or to undo any damage. Furthermore, we
want D to be able to imitate any actions which X
can take, so that D can create suitable initial conditions
for debugging parts of X. Thus, D needs operations
which, given a capability for X, allow D to
find all the capabilities in X
copy capabilities between D and X
destroy capabilities in X
enter X at any point with any machine state
With these powers, D can also handle domains which
X has created, since it can get hold of X's capabilities
for them. Breakpoints can be inserted in X in the
form of calls on D.
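The four operations listed above might look as follows in a small Python model. All names here (`Domain`, `DomainCapability`, `find_capabilities`, `copy_capability`, `destroy_capability`, `enter`) are assumptions made for illustration; the paper does not give the actual Model I calls, and forced entry is only recorded, not executed.

```python
class Domain:
    def __init__(self, name):
        self.name = name
        self.capabilities = {}  # local name -> capability (any object here)
        self.entered_at = None  # last forced entry: (location, machine state)


class DomainCapability:
    """Holding this object stands for holding a capability for `domain`."""

    def __init__(self, domain):
        self.domain = domain


def find_capabilities(cap):
    """Find all the capabilities in the domain named by `cap`."""
    return dict(cap.domain.capabilities)


def copy_capability(source, name, target, new_name=None):
    """Copy one capability from the source domain to the target domain."""
    target.domain.capabilities[new_name or name] = source.domain.capabilities[name]


def destroy_capability(cap, name):
    """Destroy a capability in the domain named by `cap`."""
    del cap.domain.capabilities[name]


def enter(cap, location, machine_state):
    """Enter the domain at any point with any machine state (recorded only)."""
    cap.domain.entered_at = (location, dict(machine_state))


d, x = Domain("D"), Domain("X")
x.capabilities["scratch"] = "file capability 9"  # made-up capability value
d_cap, x_cap = DomainCapability(d), DomainCapability(x)

seen = find_capabilities(x_cap)          # D inspects X
copy_capability(x_cap, "scratch", d_cap) # D obtains X's file capability
destroy_capability(x_cap, "scratch")     # D removes it from X
enter(x_cap, 0o100, {"A": 0})            # D restarts X at a chosen point
```

Every operation takes a capability for X as its first argument, so only a domain which holds that capability, such as D, can use them on X.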
Domains and capabilities
The nature of capabilities
As we have already said, a capability is a protected
name of an object. When any object is created, a
capability is created to name it; without the capability
the object might as well not exist, since there is no
way to talk about it. The capability may be thought
of as an ordinary data item enclosed in a box which
prevents tampering with the contents. Thus, for example, it may be convenient to make a capability for
a file consist of simply the disc address of its index.
This is entirely satisfactory, since programs which
handle the capability cannot modify it. If they could,
disaster would ensue, since any program could put
any desired disc address into a file capability, and
there would be no protection at all. If the machine
hardware allows a word to be tagged so that it cannot
be modified except by the supervisor, then we have
precisely what we want for a capability. The situation
is illustrated in Figure 4. It should be possible to load
and store such a word (including the tag bits) in order
to give programs the necessary freedom to manipulate
the names of the objects they are working with.
If this kind of hardware is not available a different
and potentially confusing implementation is required.
The potential can be kept from realization by referring
back to the "pure" implementation of the last paragraph. What is required is to hide the capabilities
away in the supervisor and provide programs with
unprotected names which can be used to refer to them.
When a program running in domain D presents one
of these names, it is necessary to check that it actually
names a capability which belongs to D. This can easily
be done, if there are n such capabilities, by using
numbers between 1 and n for the names.3 An attractive
alternative, if domains can be grouped into larger units
which share many capabilities, is to number the
domains from 1 to i and the entire collection of capabilities from 1 to n and to attach a string of i bits to
each capability. Bit d is on exactly when the capability
belongs to domain d. Figure 5 illustrates.
[Figure 4-Structure of a capability. Fields: TAG = read-only, except to supervisor; TYPE = FILE; VALUE = disk address of index.]
[Figure 5-Capabilities and unprotected names. (a) capabilities grouped, with bits for ownership; (b) capabilities separate for each domain.]
A somewhat more expensive implementation is to
search a table associated with the domain whenever
an unprotected name is used. This scheme shares with
the bit-string idea the advantage that it is easy for
different domains to use the same names for the same
object.
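The bit-string scheme can be sketched in a few lines of Python. The class `CapabilityTable` and its methods are hypothetical names chosen for this illustration; the capabilities themselves are represented by plain strings.

```python
class CapabilityTable:
    """Capabilities numbered 1..n, each with one ownership bit per domain."""

    def __init__(self, domain_count):
        self.domain_count = domain_count  # domains are numbered 1..i
        self._entries = []                # list of (capability, ownership bits)

    def add(self, capability, owners):
        """Store a capability and return its unprotected name (1..n)."""
        bits = 0
        for d in owners:
            if not 1 <= d <= self.domain_count:
                raise ValueError(f"no domain {d}")
            bits |= 1 << (d - 1)          # bit d is on when domain d owns it
        self._entries.append((capability, bits))
        return len(self._entries)

    def lookup(self, domain, name):
        """Check that `name` names a capability belonging to `domain`."""
        if not 1 <= name <= len(self._entries):
            raise KeyError(f"no capability named {name}")
        capability, bits = self._entries[name - 1]
        if not bits & (1 << (domain - 1)):
            raise PermissionError(f"domain {domain} does not own name {name}")
        return capability


table = CapabilityTable(domain_count=4)
shared = table.add("file capability", owners=[1, 2])
private = table.add("page capability", owners=[2])
```

Because the names index one shared collection, domains 1 and 2 both refer to the file by the same number, which is the advantage noted above.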
There are capabilities for all the different kinds of
objects in the system. On the Model I these are
files
pages of memory
processes
domains
interrupt calls
terminals
access keys
Domains and memory
The nature of a domain is considerably more dependent on the underlying system than is the case
for capabilities, mainly because of the treatment of
memory. From a purist's viewpoint, every access to a
memory word is an exercise of a capability for that
word. A more moderate position, and one which is
quite feasible on suitable hardware, is to view each
access as the exercise of a capability for a segment
which contains the word.2 The mapping hardware
which implements segmentation is thus viewed as part
of the capability system, and a satisfying unity of
outlook is gained. Since a segment is identified by
number, the preceding section applies. We shall not
consider the formidable difficulties which arise if different domains use different names for the same segment.
If segments are accessed through capabilities like
everything else, then a domain consists of nothing more
than a collection of capabilities. On machines not
equipped with the proper hardware a domain has an
address space as well. In the Model I this is a list of
the pages which occupy each of the 64 slots for pages
in the 128k memory which is accessible to a user program.
It is also necessary to deal with the fact that the
hardware does not allow one domain to access the
address space of another one directly. This fact is of
great importance when we consider how data is passed
back and forth between domains, since it implies that
arrays cannot be passed simply by specifying their
addresses. It is therefore extremely convenient to include as part of a call the ability to pass scalar data
items, and essential to include the ability to pass capabilities. From this foundation arbitrarily complex communication can be built, since capabilities for pages,
files and domains can be passed. Thus, if an array needs
to be passed as a parameter, it is sufficient to pass
capabilities for the pages or file containing the array,
together with its base address and length. The called
domain can then put the pages into its address space
and access the array. This is of course much less convenient than passing an entire segment as a parameter,
but it is quite workable.
An alternative approach is to organize the hardware
so that the address space of one domain is a subset of
that of another. This eliminates all problems when the
smaller one calls the larger, although it does not help
at all when we want to share only part of the address
space. A subset organization fits well with a linear or
"ring"-like system4 in which the domains are numbered,
and the capabilities of domain i are a subset of those
of domain i-1. As we shall see, there are good reasons
for wanting a more flexible scheme, but for a great
many applications a linear ordering is quite satisfactory.
To allow these to be handled more efficiently, the
Model I hardware breaks the address space of a process
into three rings:
monitor
utility
user
in decreasing order of strength. The hardware enforces
a restriction that addressing cannot go into a higher
ring. It also provides protected entry points into the
utility and monitor rings and automatically checks
addresses passed into these rings as parameters to
ensure that they are legal in the ring from which they
came.
This simple hardware-implemented structure permits
three domains to transfer control around among each
other and to address each other's memory in a very
convenient and efficient way. The price paid is a rigidity in structure, and a drastic incompatibility with
the main, software-implemented domain mechanism.
The incompatibility is resolved by requiring a change
in ring to be reported to the software, except when the
only processing to be performed before returning to the
original ring can be done with the capabilities of the
original ring. Short calls thus remain cheap, while the
overhead added to longer ones is not excessive.
Domains and processes
The relationship between domains and processes is
another area greatly influenced by the surrounding
system. The logical nature of the two kinds of object
allows a great deal of freedom: in fact, a domain has
much the same appearance to a process that a segment
of memory does. The storage for capabilities provided
by a domain can accommodate many processes, and a
single process can switch from one domain to another
(subject to restrictions which are considered in the
next section).
In the Model I, however, storage is allocated in 2k
pages, and one of these, called the context block, is
used to hold the system-maintained private data for
each process. The cost of having a process is thus high,
and there is considerable incentive to minimize the
number of processes; usually one is enough per computation, if advantage is taken of the interrupt facilities
described later. When the usage of space in the context
block is analyzed, it turns out that there are only two
items which would have to be duplicated to allow
several processes to run with the same address space.
These are a 14-word machine state and a stack used
for local storage when the supervisor is executing in
the process. This stack has a minimum of about 60
words and can grow to several hundred words at certain
points during supervisor execution. It is therefore the
main barrier to the existence of cheap processes. The
problem can be greatly alleviated by allocating stack
space dynamically at each function call and releasing
it at each return, but this would require some major
changes in system organization.
Although processes are expensive, domains are quite
cheap, since the bit-string method is used to assign
capabilities to domains. Each process in the Model I
can have about a dozen domains associated with it.
The process can run in any of its associated domains
but in no others. This implies that two processes never
run in the same domain.
In a system in which processes are cheap, it is possible
to take an entirely different approach which encourages
the creation of processes for every purpose. In such a
system, parallel processing is of course greatly facilitated. In addition, free creation of processes can be
used to give a somewhat different form to many of
the facilities described in this paper.3
It is perhaps worthwhile to point out that a machine
whose addressing is not organized around a stack or
base registers cannot reasonably run several processes
out of the same domain unless they are executing totally disjoint code, because of the problem of address
conflicts.
Transfers of control
Calls
The only reason for creating a domain is to establish
an environment in which a process may execute with
different protection than that provided by any existing
domain. If this objective is to be fulfilled, transfers of
control between domains must be handled with great
care, since they generally imply the acquisition of
new capabilities. If it is possible for a process running
in domain X to suddenly jump into domain Y and
continue execution at any arbitrary point, X can certainly induce Y to damage the objects accessible
through Y's capabilities.
To provide an adequate mechanism for transfers
between domains, we introduce the idea of a protected
entry point or gate, and make the rule that transfer
into a domain is normally allowed only at a gate. A
gate is a new kind of capability which can be created
by anyone with a capability for the domain. It specifies
a location to which control is to go when the gate is
used. Gates can be passed around freely like other
capabilities, and each one may be viewed as conferring
a certain amount of power, namely the power to accomplish whatever the routine entered by the gate is
designed to do. With gates it is possible to selectively
distribute the powers of a domain in a flexible way.
A transfer through a gate usually takes the form of
a subroutine call; some provision must therefore be
made for a return. It is not satisfactory to create
another gate which the called process may return
through, since he might save it away and use it to
return at some later and unexpected time. Instead,
the domain and location to return to are saved on a
call stack in the supervisor, from which the return
operation can retrieve them. It is possible to call a
domain recursively with this mechanism, a feature
which is generally desirable and also quite important
for the trap and interrupt system about to be described.
In order to allow the stack to be reset in case of an
error, or for any of the other reasons which prompt
programmers to reset stacks, a jump-return (n) operation is provided which returns to the domain n levels
back. Protection is maintained by requiring the domain
doing the jump-return to have capabilities for all the
domains being jumped over.
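The call, return and jump-return operations can be modelled roughly as below. This is a sketch under assumptions: `Gate`, `Supervisor` and the method names are invented here, domains are represented by their names, and jump-return(n) is read as "discard the n-1 most recent entries, then return through the next one", so that jump-return(1) is an ordinary return.

```python
from collections import namedtuple

# A gate names a domain and the location at which it may be entered.
Gate = namedtuple("Gate", ["domain", "entry"])


class Supervisor:
    """Keeps the call stack out of reach of the domains themselves."""

    def __init__(self, initial_domain):
        self.current = initial_domain
        self._stack = []  # entries: (domain to return to, return location)

    def call(self, gate, return_location):
        self._stack.append((self.current, return_location))
        self.current = gate.domain
        return gate.entry  # execution continues at the gate's location

    def ret(self):
        self.current, location = self._stack.pop()
        return location

    def jump_return(self, n, domain_capabilities):
        """Return n levels back if the caller holds every skipped domain."""
        if not 1 <= n <= len(self._stack):
            raise ValueError("no such call level")
        keep = len(self._stack) - (n - 1)
        skipped = [domain for domain, _ in self._stack[keep:]]
        if any(domain not in domain_capabilities for domain in skipped):
            raise PermissionError("missing capability for a skipped domain")
        del self._stack[keep:]
        return self.ret()


sup = Supervisor("A")
sup.call(Gate("B", 100), return_location=10)  # A calls B
sup.call(Gate("C", 200), return_location=20)  # B calls C
location = sup.jump_return(2, {"B"})          # C returns directly to A
```

Since the return information never leaves the supervisor, the called domain has nothing it could save away and reuse later.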
Traps
A trap is caused by the occurrence of some unusual
event in the execution of the program which requires
special handling, such as a floating point overflow, a
memory protection violation or an end of file. When a
trap occurs, it forces control to go to a specified place,
where presumably a routine has been put to deal with
the event. Whether any particular event causes a trap
or simply sets a flag which can be tested by the program
is a decision which should be under the programmer's
control. Traps may be initiated by hardware (e.g.,
floating overflow) or may be artifacts of the software;
as with most distinctions between hardware and software implementation, this one is of little importance,
and we expect all traps to be transmitted to the program
in the same form, regardless of their origin.
These are all obvious points which are generally
accepted, and have even become embedded in the
definition of PL/I. What concerns us here is the relationship between traps and domains, which is not
quite so obvious. The basic problem is that the response to a trap must be made to depend on the environment in which it occurs. The occurrence of, say, a
floating overflow is simply a fact, and has nothing to
do with who is running. The action to be taken, on the
other hand, is entirely a function of the situation.
Consider the example in Figure 6. If a floating overflow
occurs with the call stack in state (b), it is clear that
C should have the first chance to handle the trap. If
it is not interested, the domain B which called it should
have the second chance. In state (c), on the other hand,
domain B should have the first chance, and then A.
The reason for this is that we do not wish to give up
control to a weaker domain when a trap occurs.
[Figure 6-Traps and trap-returns. Panels: (a) domains and enabled traps; (b) the call stack during matrix inversion; (c) the matrix inverter processes a floating overflow; (d) the matrix inverter returns with trap-return (SINGMTX); (e) the matrix inverter returns with trap-return (BAD DATA). Domains shown: A, command processor; B, statistical package; C, matrix inversion. Trap names shown: FLTOV, SINGMTX, CATCHALL.]
The idea is then the following: Each domain is
considered to have a father. When a trap occurs, it is
first directed to the domain S which is running. If S
does not have the trap enabled, the father of S is
tried in the same way. If no one can be found to handle
the trap, there are two possibilities:
ignore it;
generate a catchall trap which any domain that
lacks a father is forced to handle.
If a domain T is found with the trap enabled, it is
called with the name of the trap as argument. It can
then return and allow execution to proceed if it is
able to clear things up. Alternatively, it can do a
jump-return to someone farther back on the call stack
if it finds the situation to be hopeless. An important
property of this scheme is that the trap routine can do
arbitrarily complex processing without disturbing the
situation at the time of the trap.
Conceptually, we wish to think of traps as identified
by symbolic names. Each domain must then include a
list of names of the traps it has enabled. Corresponding
to each hardware-generated trap is a standard name.
Software-generated traps can use any names, including
the ones for hardware traps. This makes it easy for a
subroutine to simulate the occurrence of a hardware
condition which it may not be convenient to produce.
A simple extension of the return operation to a
trap-return allows a routine to signal an error without
leaving any traces of itself; the trap-return does a
return and immediately causes the specified trap,
without allowing any execution beyond the return
point. The domain which handles the trap then sees
it as having occurred in the calling routine, which is
exactly what is wanted. Thus in Figure 6 we have a
matrix inversion routine which processes its own
floating overflows, but reflects two other conditions
to its caller with trap-return. Another useful convention is to disable the trap when it occurs. This
makes it much less likely that the program will get
into a loop, especially for such traps as illegal instruction and memory protection violation.
Interrupts
There remains one more way to cause a transfer
between domains: the occurrence of an interrupt. This
is not intended to be the normal mechanism for communication between cooperating processes; the basic
block and wake-up mechanisms6 are expected to perform that function. There are times, however, when it
is desirable to force a process to do something, even
if it is not paying attention. Two obvious reasons for
this are:
a quit signal from the terminal, which indicates
that the user wants to regain control over a process
which has gone into a loop, or perhaps simply
become unnecessarily wordy;
the elapse of a certain amount of time, which
has much the same meaning.
The action required in these two cases is different.
When a timer interrupt is requested (and there may be
two kinds, for real time and CPU time) the desired
action is usually to call a specific domain, often the
one which is setting the timer. If another domain
wants a timer, it will use one which is logically different.
The user's quit signal, on the other hand, is context
dependent like a trap; the desired action is a function
of the routine which is running when the signal arrives.
Thus an iterative root-finder may interpret a quit as
an indication that the solution is accurate enough,
but the debugging system under which it may be running will curtail its printing when it sees a quit and
await a new command. This analysis suggests a simple
implementation: convert the quit into a trap from the
currently executing domain. Each interrupt, then, will
give rise to a call or a trap, depending on its type as
declared by the programmer.
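The search for a trap handler, which the quit signal reuses once it has been converted into a trap, can be sketched as follows. The names `Domain`, `find_handler` and `deliver` are assumptions, as is the assignment of enabled traps to the three domains (chosen to resemble Figure 6) and the trap name "QUIT". Of the two possibilities for an unhandled trap, this sketch takes the second: a catchall trap goes to the domain which lacks a father.

```python
class Domain:
    def __init__(self, name, father=None, enabled=()):
        self.name = name
        self.father = father         # None for a domain without a father
        self.enabled = set(enabled)  # symbolic names of the enabled traps


def find_handler(running, trap):
    """Try the running domain, then its father, and so on upward."""
    domain = running
    while domain is not None:
        if trap in domain.enabled:
            return domain
        domain = domain.father
    return None


def deliver(running, trap):
    """Return (name of handling domain, name of trap it is called with)."""
    handler = find_handler(running, trap)
    if handler is not None:
        return handler.name, trap
    root = running
    while root.father is not None:
        root = root.father
    return root.name, "CATCHALL"     # forced on the fatherless domain


a = Domain("A", enabled={"CATCHALL"})                      # command processor
b = Domain("B", father=a, enabled={"FLTOV", "SINGMTX"})    # statistical package
c = Domain("C", father=b, enabled={"FLTOV"})               # matrix inversion

overflow = deliver(c, "FLTOV")    # handled by C itself
singular = deliver(b, "SINGMTX")  # after trap-return, the trap occurs in B
quit_signal = deliver(c, "QUIT")  # a quit converted into a trap from C
```

A trap-return is modelled here simply by delivering the trap with the caller as the running domain, which is how the handling domain is meant to see it.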
Even when we see how to convert them into operations within the process, interrupts still present one
serious problem which does not arise in the handling
of traps. This is the fact that a program occasionally
needs to be allowed to compute for a while without
losing control. Usually this happens when modifications are being made to a data base; if a quit signal
should appear or a timer run out halfway through this
operation, the data is left in a peculiar state. The
obvious solution is to allow a process to become noninterruptible for a limited period of time. The function
of the limit is to prevent the process from getting into
a state from which it cannot be retrieved; exceeding
it is a programming error and always causes the process
to become interruptible again and an error trap to
occur, regardless of whether an interrupt is actually
pending. The limit is properly measured in real time,
since its primary purpose is to put a bound on the
frustration of the user at his console.
Non-interruptibility is a process-wide condition. It
must be possible, however, for a newly-called domain
to extend the limit exactly once, so that it can function
properly even though its caller is about to exceed his
limit. The limit is thus part of a call stack entry. When
a return occurs, the old limit comes back into force,
and an immediate trap may occur if it has been exceeded.
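A minimal sketch of this bookkeeping follows, assuming that time is passed in explicitly as a number and that "exactly once" means a second extension within the same call is refused. The class and method names are hypothetical, and the exception `LimitTrap` merely stands in for the error trap.

```python
class LimitTrap(Exception):
    """Stands in for the error trap raised when the limit is exceeded."""


class Process:
    def __init__(self):
        # One entry per active call; the limit is part of the stack entry.
        self.stack = [{"deadline": None, "extended": False}]

    def become_noninterruptible(self, now, limit):
        self.stack[-1]["deadline"] = now + limit

    def call(self):
        # The called domain starts under the caller's deadline.
        inherited = self.stack[-1]["deadline"]
        self.stack.append({"deadline": inherited, "extended": False})

    def extend(self, now, limit):
        entry = self.stack[-1]
        if entry["extended"]:
            raise ValueError("the limit may be extended only once per call")
        entry["deadline"] = now + limit
        entry["extended"] = True

    def check(self, now):
        deadline = self.stack[-1]["deadline"]
        if deadline is not None and now > deadline:
            self.stack[-1]["deadline"] = None   # interruptible again
            raise LimitTrap("non-interruptible limit exceeded")

    def ret(self, now):
        self.stack.pop()
        self.check(now)        # the old limit comes back into force


process = Process()
process.become_noninterruptible(now=0, limit=10)   # caller's deadline: 10
process.call()
process.extend(now=9, limit=10)                    # callee's deadline: 19
process.check(now=15)                              # still within the callee's limit
```

Returning at time 15 would trap at once, since the caller's own deadline of 10 has already passed.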
Table I summarizes the operations connected with
transfers of control between domains.
TABLE I-Operations for transfers

Operation      Arguments
Call           Gate, Parameters
Return         Parameters
Jump           Gate, Parameters
Jump-return    Depth, Parameters
Trap           Trap number
Trap-return    Trap number
Proprietary programs
The remainder of this paper deals with the protection problems introduced when objects are allowed
to have external, mnemonic names. The examples in
this section are intended to introduce this subject, and
are also of interest in their own right. Suppose then
that a user U has a program executing in domain P
and wishes to perform a circuit analysis. P has generated the input data for the analysis, and intends to
use the results for further calculation. Within the
system M on which P is running, some user V has
written a suitable analysis program A which he has
offered for sale, and U has decided to use V's program.
It happens that U and V are competitors.
Both users in this situation have selfish interests
to protect. First, and most obvious, V does not want
his program stolen. He therefore insists that while it
is executing U must not be allowed to read it. Equally
important, however, is the fact that U does not want
V's program to be able to read the calling program P
and its data; although U may not be trying to market
P, it, and especially its data, contain valuable information about U's current development work which
must be kept from competitors. The relationship
between U and V, and between their programs P and A,
is therefore one of mutual suspicion. Each is willing
to entrust the other with just enough information
to allow the circuit analysis to be completed, and no
more. The system must support this requirement if it
is to be a suitable vehicle for selling programs.
Furthermore, care must be taken beyond the programs. While P is running it needs the ability to access U's files by name, to read input data and record
results. This privilege must certainly not be extended
to A, since it can learn even more about U's secrets
by examining his files than by looking at his program,
not to mention the possibility of modifying them. On
the other hand, A may need access to V's files to obtain
data for the analysis and to collect statistics and accounting information; this access must not be available
to P. The protection mechanisms must therefore provide for isolating P and A at the level of file naming as
well as on the lower levels which have been the subject
of this paper so far.
What is required then is a system facility something
like this. V establishes A as a proprietary program,
specifying the file on which it resides. Another user's
program P may then ask the system to attach this
file. To do this, the system creates a new domain A,
installs the program in it, provides it with some storage,
and returns to P a gate into A. When P wants to call
A, he uses the gate and passes whatever parameters
he thinks are needed for A to function. When A is
finished, he returns. The protection mechanisms we
have been discussing prevent undesired interference
between P and A. Safeguards for the files are discussed
below.
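The attach facility might be sketched as follows. All names here (`System`, `establish`, `attach`, `Gate`) are hypothetical, and Python cannot enforce the isolation that the protection hardware and system would provide; the sketch only shows the shape of the interface, in which P receives a gate and never the program itself.

```python
class Domain:
    """A fresh domain holding the proprietary program and some storage."""
    def __init__(self, program):
        self._program = program
        self._storage = {}


class Gate:
    """Capability that permits calling into a domain and nothing else."""
    def __init__(self, domain):
        self._domain = domain

    def call(self, *parameters):
        return self._domain._program(*parameters)


class System:
    def __init__(self):
        self._proprietary = {}             # file name -> program

    def establish(self, file_name, program):
        # V registers A as a proprietary program residing on a file.
        self._proprietary[file_name] = program

    def attach(self, file_name):
        # Create a new domain, install the program, return a gate into it.
        return Gate(Domain(self._proprietary[file_name]))


def analysis(circuit):
    """Stand-in for V's analysis program A."""
    return sum(circuit)


system_m = System()
system_m.establish("ANALYSIS", analysis)   # done by V
gate = system_m.attach("ANALYSIS")         # done by U's program P
result = gate.call([1, 2, 3])              # P passes only these parameters
```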
The example above is one of a great variety of similar
situations. The system itself creates many of them. A
LOGOUT command, for example, requires special access to accounting files and to capabilities for destroying
a process, but it would be nice to call it with the
standard command processor. Similarly, driving a
special peripheral like a printer requires special capabilities. If a company maintains a large data base, it
may wish to give different classes of users access to
different parts of it by allowing them to call different
accessing programs. These and many other applications
fall within the general outline established by our proprietary program example. We now proceed to consider
how to handle the file naming problems it presents.
External names
Table II lists the goals of a naming system for objects,
and indicates some of the distinctions between the
use of capabilities as names, which have been discussed
in previous sections, and the use of external names,
which are strings of characters such as 'FILE1' or
'CIRCUIT'. In summary, it says that capabilities are
very convenient for use by a program, since they are
cheap and self-validating. On the other hand, they are
very bad for people, since they cannot be typed in or
remembered. Names for people should also have the
property that the same name can refer to many different objects, the distinctions to be made by context.
Thus, Smith's file 'ALPHA' is not the same as Jones'
'ALPHA'.
TABLE II-Goals of a naming system for objects

Goal                                     Achieved by     Achieved by
                                         capabilities    external names
Names are mnemonic                                       X
Names can be relative to other names                     X
Names can be used externally                             X
Possession of name authorizes access     X
Names are cheap to use                   X
Names can be manipulated by programs     X               X
Techniques for achieving all these goals are well
known. They depend on the introduction of a new kind
of object called a directory, which consists of pairs:
<external name, capability>, and an operation of
opening an object by supplying the name to obtain
the capability. Since the external name is interpreted
relative to a directory, there is a suitable basis for
establishing the context of a name. A tree-structured
naming system is implicit in the scheme, because
directories are themselves objects accessed by capabilities. It is now easy to see how a program in a domain
D accesses the objects belonging to owner U. When D
is created, it is supplied with a capability for U's
directory, which it simply exercises.
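The directory idea can be illustrated with a short sketch. The names `Capability`, `Directory` and `resolve` are assumptions made for this example; the point is only that a directory maps external names to capabilities, and that names are interpreted relative to a directory, which yields a tree.

```python
class Capability:
    def __init__(self, obj, access):
        self.obj = obj
        self.access = frozenset(access)    # e.g. "R", "W" or "RW"


class Directory:
    def __init__(self):
        self.entries = {}                  # external name -> Capability

    def enter(self, name, capability):
        self.entries[name] = capability

    def open(self, name):
        # Opening supplies the external name and obtains the capability.
        return self.entries[name]


def resolve(directory_capability, path):
    """Follow a sequence of names, each relative to the previous directory."""
    capability = directory_capability
    for name in path:
        capability = capability.obj.open(name)
    return capability


root, smith, jones = Directory(), Directory(), Directory()
root.enter("SMITH", Capability(smith, "R"))
root.enter("JONES", Capability(jones, "R"))
smith.enter("ALPHA", Capability("Smith's data", "RW"))
jones.enter("ALPHA", Capability("Jones' data", "R"))
root_capability = Capability(root, "R")
```

The same external name 'ALPHA' then refers to different objects depending on the directory in which it is opened.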
There is more controversy over the proper methods
of accessing objects belonging to other users. A popular
approach is to use passwords: a public read-only
directory is filled with capabilities for all other directories which allow the objects in them to be accessed
provided a correct password (usually different for each
object) is supplied as part of the opening operation.
This method is not satisfactory. First, it is inconvenient,
since it requires the person accessing the file to remember the password. Second, it is insecure. If he
writes the password down, or includes it in a program,
the possibility increases that it will become known. It
is bad enough to have to use a password to obtain
entry to the system, but at least only one password is
involved, it is used only once per session, and it can
be changed, if need be after each session, without too
much fuss. None of these things is true of passwords
attached to files: there are many of them, many people
need to know them, and one must be used each time
a file is opened. This scheme has no advantage except
economy of implementation.
A method based entirely on capabilities suffers only
one of these drawbacks: it is inconvenient, but secure.
It is also, however, quite complex. The idea is that if
a file (or anything else) is to be shared, a capability
for it should be passed from its owner to those who
wish to share it. The problem is that a capability,
being a protected object, must be passed through protected channels; it cannot be sent in a letter, even a
registered letter. The solution is illustrated in Figure
7. Every user has (at least) two directories, a private
one which he works with, and a transfer directory. The
public directory PUB, for which every user has a read
capability, contains write capabilities for all the transfer
directories. The object is to move the capability
for X from PDA to PDB. Proceed as follows:
Figure 7-Sharing capabilities without access keys (directories shown: PUB, the public directory, containing a write-only capability for the transfer directory of each user, with entries A and B giving W access to TDA and TDB; user A's private directory; user B's transfer directory; user B's private directory. Legend: * temporary capability for copying; ** final copied capability; arrows mark the path for copying)
A moves a capability for TDB into PDA
Using it, A moves his capability for X to TDB
B moves the capability for X from TDB to PDB
Since only B can access TDB, security is preserved. A
malicious user can confuse things by writing random
capabilities into the TDs, but it is easy for B to check
that he has gotten the right thing. Furthermore, if X
is a directory, future communication can be carried
out quite conveniently, since A and B can then communicate through X without any worries about outside interference.
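The three steps can be traced in a small sketch. The helper names `read` and `write`, and the modeling of access rights as strings, are assumptions made for illustration; the directory names PUB, PDA, TDB and PDB follow the text.

```python
class Directory:
    def __init__(self, name):
        self.name = name
        self.entries = {}


class Capability:
    def __init__(self, obj, access):
        self.obj = obj
        self.access = access               # "R", "W" or "RW"


def read(directory_capability, name):
    if "R" not in directory_capability.access:
        raise PermissionError("no read access to " + directory_capability.obj.name)
    return directory_capability.obj.entries[name]


def write(directory_capability, name, capability):
    if "W" not in directory_capability.access:
        raise PermissionError("no write access to " + directory_capability.obj.name)
    directory_capability.obj.entries[name] = capability


pub, pda, tdb, pdb = (Directory(n) for n in ("PUB", "PDA", "TDB", "PDB"))
pub.entries["B"] = Capability(tdb, "W")    # write-only capability for B's TD

a_pub, a_pda = Capability(pub, "R"), Capability(pda, "RW")   # held by A
b_tdb, b_pdb = Capability(tdb, "RW"), Capability(pdb, "RW")  # held by B
write(a_pda, "X", Capability("object X", "R"))               # A owns X

write(a_pda, "TDB", read(a_pub, "B"))             # step 1
write(read(a_pda, "TDB"), "X", read(a_pda, "X"))  # step 2
write(b_pdb, "X", read(b_tdb, "X"))               # step 3
```

Because A holds only write access to TDB, A (or any other user) can deposit capabilities there but cannot read what others have deposited.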
A much better method is based on the simple idea
of attaching to a directory entry a list of the users
who are allowed to access it; with each user we can
also specify options, so that Rosenkrantz may be
granted write access to the file while Guildenstern can
only read it. This scheme, which was first used in
CTSS,1 has two drawbacks. The first is that if the list
of users who are authorized to access a file is long, it
takes a lot of space to store it; this problem is especially
annoying if there are several files to be accessed by the
same group of users. The second drawback is that there
is no provision for giving different kinds of access to
different domains of a computation. Both difficulties
can be overcome in a rather straightforward manner.
Before we pursue this point, it is important to notice
why the difficulty encountered above in the capability-passing scheme does not arise here. We can think of
the computation of a logged-in user as possessing a
special kind of capability which identifies it as belonging to him. If SMITH is the user, we will refer to
this capability as SMITH*, meaning that the string
'SMITH' has been enclosed in a tamper-proof box.
When JONES wishes to give SMITH access to his
file ALPHA, he puts the name SMITH on the access
list; JONES can do this since he has a capability for
ALPHA. When a computation presents the capability
SMITH*, the system observes that the string (or user
number) which is the contents of the capability matches
the string on the access list and grants the access.
At no time is it necessary for JONES to have SMITH*
in his possession. He needs only the name SMITH
which, since it is not a protected object, can be communicated to him by shouting across the room. Figure
8 illustrates.
Figure 8-Use of access keys (panels: capabilities for SMITH's computation before opening the file, including SMITH*; JONES' directory, with its entry for ALPHA; capabilities for SMITH's computation after opening the file)
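The check made at opening time can be sketched as below. The class and function names are hypothetical, and a Python object is of course not tamper-proof; the sketch only shows that the owner needs the unprotected name to fill in the access list, while the opener must present the protected key.

```python
class AccessKey:
    """Stand-in for a protected object such as SMITH*."""
    def __init__(self, contents):
        self.contents = contents


class DirectoryEntry:
    def __init__(self, obj):
        self.obj = obj
        self.access_list = {}      # user name -> options such as "R" or "RW"


def open_entry(entry, key):
    options = entry.access_list.get(key.contents)
    if options is None:
        raise PermissionError("key does not match the access list")
    return entry.obj, options


alpha = DirectoryEntry("contents of ALPHA")
alpha.access_list["SMITH"] = "R"   # JONES needs only the name SMITH
smith_star = AccessKey("SMITH")    # held by SMITH's computation
opened = open_entry(alpha, smith_star)
```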
To generalize the method we need two ideas. One
is that of an access key. This is an object (i.e., it can
be referenced only by using a capability) which consists simply of a bit string of modest length, long
enough that the number of different access keys is
larger than the number of microseconds the system
will be in existence. Any user may ask the system for a
new access key; the system will create one never seen
before and return a capability for it. The object SMITH*
mentioned in the last paragraph is an example of an
access key; one is kept for each user in the system.
Since an access key is an object, capabilities for it
appear in the directories and are protected exactly as
is done for any other object (since the access key is a
small object, it may be convenient for the implementation not to give it any existence independently
of the capabilities for it, i.e., to make the value of the
capability the object itself, rather than a pointer to
it as in the case of files). To give a group of users access
to some files, all we have to do is distribute a new
access key GROUP* to the users and put GROUP
on the access list for each file. The distribution is
accomplished by creating GROUP* and putting all
the users on its access list; once they have copied it
into their directories they can be removed from the
access list, so that no space need be wasted. In practice,
as we have pointed out, numbers of perhaps 64 bits
would be used instead of strings like 'GROUP'.
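A short sketch of key creation and group distribution follows, together with the arithmetic behind the 64-bit figure. The counter-based generator and the set-based directories are assumptions made for illustration, not the paper's implementation.

```python
import itertools

KEY_BITS = 64
_counter = itertools.count(1)


def new_access_key():
    """Return a key value never issued before (a simple counter here)."""
    value = next(_counter)
    assert value < 2 ** KEY_BITS
    return value


# Distribute one key to a group, then list only that key on each file.
group_key = new_access_key()
access_lists = {"FILE1": {group_key}, "CIRCUIT": {group_key}}
directories = {"SMITH": set(), "JONES": set(), "BROWN": set()}
for member in ("SMITH", "JONES"):
    directories[member].add(group_key)     # members copy the key


def may_open(user, file_name):
    return bool(directories[user] & access_lists[file_name])


# Issuing one key per microsecond, how long would 64-bit keys last?
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years_to_exhaust = 2 ** KEY_BITS / 1e6 / SECONDS_PER_YEAR
```

Each file's access list holds a single entry regardless of the size of the group, which is the space saving the text describes.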
The second idea is not new at all. It consists of the
observation that since an access key is just an object,
different domains can have different access keys and
hence different kinds of access to the file system. Thus,
for example, a user's computation may be started with
two domains, one for his program with his name as
access key, and the other for system accounting with
an access key which allows it to write into the billing
files. With a single suitable access key, a domain can
easily get hold of an arbitrarily large collection of
other objects which are protected by other keys, since
the first key can be used to obtain other keys from the
directory system.
SUMMARY
We have described a very general scheme for distributing access to objects among the various parts of
a computation in an extremely specific and flexible
way. The scheme allows two domains to work together
with any degree of intimacy, from complete trust to
bitter mutual suspicion. It also allows a domain to
exercise firm control over everything created by it or
its subsidiaries.
REFERENCES
1 P A CRISMAN editor
The compatible time-sharing system: A programmer's guide
MIT Press 2nd ed Cambridge Mass 1965
2 J B DENNIS
Segmentation and the design of multi-programmed computer systems
J ACM Vol 12 Oct 1965 589
3 J B DENNIS E C VAN HORN
Programming semantics for multiprogrammed computations
CACM Vol 9 No 3 March 1966 143
4 R M GRAHAM
Protection in an information processing utility
CACM Vol 11 No 5 May 1968 368
5 B W LAMPSON
A scheduling philosophy for multi-processing systems
CACM Vol 11 No 5 May 1968 347
6 B W LAMPSON et al
A user machine in a time-sharing system
Proc IEEE Vol 54 No 12 Dec 1966
The ADEPT-50 time-sharing system
by R. R. LINDE and C. WEISSMAN
System Development Corporation
Santa Monica, California
and
C. E. FOX
King Resources Company
Los Angeles, California
INTRODUCTION
In the past decade, many computer systems intended
for operational use by large military and governmental organizations have been "custom made" to
meet the needs of the particular operational situation
for which they were intended. In recent years, however, there has been a growing realization that this
design approach is not the best method for long term
system development. Rather, the development of
general purpose systems has been promoted that
provide a broad, general base on which to configure
new systems. The concepts of time-sharing and general-purpose data management have been under development for several years, particularly in university
or research settings.1,2,3 These methods of computer
usage have been tested, evaluated, and refined to
the point where today they are ready to be exploited
by a broad user community.
Work on the Advanced Development Prototype
(ADP) contract was begun in January 1967 for the
purpose of demonstrating-in an operational environment-the potential of automatic information-handling made possible by recent advances in computer technology, particularly advances in time-sharing executives and general-purpose data management techniques. The result of this work is a large-scale, multi-purpose system known as ADEPT, which
operates on IBM System 360 computers.*
The entire ADEPT system is now being used at
four field installations in the Washington, D. C. area,
as well as at SDC in Santa Monica. The system was
installed at the National Military Command System
Support Center in May 1968, at the Air Force Command Post in August 1968, and at two other government agencies in January 1969. These four field sites
collectively run ADEPT from 80 to 100 hours per
week, providing a total of some 2000 terminal hours
of time-sharing service monthly to their users.
The ADEPT system consists of three major components: a time-sharing executive; a data management system adapted from SDC's Time-Shared Data
Management System (TDMS) described by Bleier,4
and a programmer's package. This paper deals exclusively with the ADEPT Time-Sharing Executive,
and particularly with the more novel aspects of its
architecture and construction. Before examining these
aspects it will be instructive if we review the basic
design and hardware configuration of the system.
* Development of ADEPT was supported in part by the Advanced Research Projects Agency of the Department of Defense.
A general purpose operating system
The ADEPT executive is a general-purpose time-sharing
system. The system operates on a 360 Model
50 with approximately 260,000 bytes of core memory,
4 million bytes of drum memory, and over 250 million
bytes of disc memory, shown graphically in Figure
1 and schematically in the appendix. With this machine
configuration, ADEPT is designed to provide responsive on-line interactive service, as well as background
service to approximately 10 concurrent user jobs. It
handles a wide variety of different, independent application programs, and supports the use of large
random-access data files. The design-basically a
swapping system-provides for flexibility and expansion of system functions, and growth to more powerful
models in the 360 family.
ADEPT functions both as a batch processor (whereby jobs are accumulated and fed to the CPU for operation one by one) and as an interactive, on-line system
(in which the user controls his job directly in real
time simply by typing console requests).
Viewed as a batch system, ADEPT allows jobs to
be submitted to console operators or submitted from
consoles via remote batch commands (remote job
entry). In either case, jobs are "stacked" for execution
by ADEPT in a first-in/first-out order. The stack is
serviced by ADEPT as a background task, subject
to the priorities of the installation and the demands
of "foreground" interactive users. Viewed as an interactive system, ADEPT allows the user to work with
a typewriter, allowing computer-user dialog in real
time. Via ADEPT console commands, the user identifies himself, his programs, and his data files, and
selectively controls the sequence and extent of operation of his job in an ad lib manner. A prime advantage
of the interactive use of ADEPT is that the system
provides an extendable library of service programs
that permit the user to edit data files, compile or
assemble programs, debug and eliminate program
errors, and generally manage large data bases in a
responsive on-line manner.
System architecture
The architecture of the ADEPT executive is that
of the "kernel and the shell". The "kernel," referred
to as the Basic Executive (BASEX), handles the
major problems of allocating and scheduling hardware resources. It is small enough to be permanently
resident in low core memory, permitting rapid response
to urgent tasks, e.g., interrupt control, memory allocation, and input/output traffic. The "shell," referred to as the Extended Executive (EXEX), provides
the interface between the user's application program
and the "kernel". It contains those non-urgent, large-task
extensions of the basic "kernel" processes that
are user-oriented rather than hardware-oriented;
they may, therefore, be scheduled and swapped.
Figure 1-Relative capacity of various ADEPT direct-access
storage media available in less than 0.2 seconds. The initial
system that operates at SDC utilizes core, 2303 drum, 2311 and
2314 disc packs, and 2302 disc storage. The NMCSSC system
utilizes 2314 disc storage in lieu of 2311 or 2302 discs. The architecture of the ADEPT executive is such that it permits any
combination of the above types of disc storage in varying amounts
(capacities labeled in the figure: core, 0.26M bytes; 2303 drum, 3.9M bytes; 2311 disc packs, 7.25M bytes per pack; 2314 disc storage, 207M bytes; 2302 disc storage, 226M bytes)
The version of the ADEPT time-sharing system
thus far developed has multiple levels of control
beyond the two-level "kernel-shell" structure-i.e.,
it can be thought of figuratively as an "onion skin".
Figure 2 shows these relationships graphically.
Beyond EXEX, "object systems" may exist as
subsystems of ADEPT (developed by the user community without modification to EXEX or BASEX),
thus further distributing and controlling the system
resources for the object programs that form still
another level of the system. The design ideas embodied
in ADEPT parallel those of Dijkstra,5 Corbato,6
and Lampson,7 but differ in techniques of implementation.
The ADEPT Basic Executive operates in the lower
quarter of memory, thereby providing three quarters
of memory for user programs. With the current H
core configuration, ADEPT preempts the first 65,000
bytes of core memory, the bulk of which is dedicated
to BASEX; EXEX must then operate in user memory
in a fashion similar to user programs. ADEPT is
designed to operate itself and user programs as a
collection of 4096-byte pages. BASEX is identified
as certain pages that are fixed in main storage and
that cannot be overlaid or swapped. EXEX and
other programs are identified as sets of pages that
move dynamically between main storage and swap
storage (i.e., drum). It is necessary to maintain considerably more descriptive information about these
swappable programs than about BASEX. This
descriptive information is carried in a set of system
tables that, at any point in time, describe the current
state of the system and each program.
Figure 2-Multiple levels of control in ADEPT (one level in the figure is labeled OTHER FUNCTIONS)
ADEPT views the user as a job consisting of some
number of programs (up to four for the 360/50H
configuration) that were loaded at the user's request.
These programs may be independent of one another
or, with proper design, different segments of a larger
task. Implicitly, EXEX is considered to be one of
these programs. To simplify system scheduling, communication, and control, only one program in the
user's set may be active (eligible to run) at a time.
When ADEPT scheduling determines that a job may
be serviced, the current job in core is saved on swap
storage, and the active program of the next job is
brought into core from swap storage and executed
for a maximum period of time, called a quantum. The
process then repeats for other jobs. Figures 3 and 4
schematically depict these relationships.
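The swap-and-run cycle can be illustrated with a plain round-robin simulation. This is a simplification under stated assumptions: ADEPT's actual scheduler (SKED) takes priorities and other quantum-ending conditions into account, and the function name `commutate` and the abstract time units are hypothetical.

```python
from collections import deque


def commutate(jobs, quantum):
    """Simulate simple commutation of jobs.

    jobs maps a job name to the time its active program still needs.
    Returns the sequence of (job, time run) service periods.
    """
    queue = deque(jobs.items())
    trace = []
    while queue:
        name, remaining = queue.popleft()      # swap the job into core
        used = min(quantum, remaining)         # run for at most one quantum
        trace.append((name, used))
        if remaining > used:
            # Quantum expired: swap out and wait for the next turn.
            queue.append((name, remaining - used))
    return trace


service_periods = commutate({"JOB1": 5, "JOB2": 2}, quantum=3)
```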
Figure 3-Simple commutation of users' programs. This figure
illustrates the relationship between users' programs, EXEX,
and BASEX. Each spoke represents a user's job, with his EXEX
providing the interface between BASEX and the hardware
resources. The maximum number of interactive jobs for the
IBM 360/50H configuration is ten.
Figure 4-ADEPT's basic sequence of operation. This figure
shows the basic operating system cycle: the idle loop is interrupted
by an external interrupt (an activity request); a program is
scheduled, swapped into core from the drum, and executed;
escape from the execution phase occurs when a quantum termination condition (e.g., time expiration, service or I/O call, error
condition) is met; the program is then swapped out and control
is returned to the idle loop (if no other programs are eligible to
be scheduled).
Basic executive (BASEX)
Table I lists the BASEX components and their
general functions as of the eighth and latest executive
release. These basic system components form an
integrated, non-reentrant, non-relocatable, permanently-resident
core memory package 16 pages long
(each page is 4096 bytes). They are invoked by hardware interrupts in response to service requests by
users of terminals and their programs. Note the
division of input/output control into cataloged (SPAM
and IOS), terminal (TWRI), and drum (BXEC)
activities to permit local optimization for improved
system performance.
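The page figures quoted here agree with the earlier statement that BASEX occupies the lower quarter of memory. The check below assumes that the H core configuration holds exactly 262,144 bytes, the value that the text rounds to "approximately 260,000"; the variable names are illustrative only.

```python
PAGE_BYTES = 4096
BASEX_PAGES = 16
CORE_BYTES = 256 * 1024        # assumed exact size of the H configuration

basex_bytes = BASEX_PAGES * PAGE_BYTES       # the "first 65,000 bytes"
total_pages = CORE_BYTES // PAGE_BYTES
user_pages = total_pages - BASEX_PAGES       # pages left for EXEX and users
basex_fraction = basex_bytes / CORE_BYTES
```

Under that assumption BASEX takes 65,536 bytes, one quarter of core, and 48 pages remain for swappable programs.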
TABLE I-Basic executive components

Component        Function
ALLOC            Drum and core memory allocation.
BXBUG            Debugger for executive programs.
BXEC             Basic sequence and swap control.
BXECSVC          SVC handlers for WAIT, TIME, DEVICE, STOP and DISMISS
                 calls.
EXEX             Linkage routines for EXEX (BASEX/EXEX interfaces); also
                 services commands DIALOFF, DIALON.
INTRUP           First-level interrupt control.
IOS              Channel-program level input/output supervisory control.
RECORD           Records SVC, interrupt activity in BASEX.
SKED             Scheduler.
SPAM             Input/output access methods to cataloged storage.
TWRI             Terminal input/output control.
System Tables    Resident system data areas for communication table
                 (COMTAB), logged-in user's table (JOB), loaded programs
                 table (PQU), drum and core status tables (DSTAT, GSTAT),
                 and a variety of other tables.
Extended executive (EXEX)
Unlike the tight, closed package of integrated
BASEX components, EXEX is a loose, open-ended
collection of semiautonomous programs. Table II
lists this collection of programs. EXEX is treated
by BASEX as a user program, with certain privileges,
and each user is given his own "copy" of the EXEX.
It is transparent to the user that EXEX is reentrant
and is being shared with other users, except for its
data space. Each job has its own "machine state"
tables saved in its unique set of environment pages.
This structure permits flexible modification and orderly
system expansion in a modular fashion. EXEX is
always scheduled in the same way as other user programs.

TABLE II-Extended executive components

Component    Function
AUDIT        Maintains a real-time recording of all security
             transactions as an accountability log.
BMON         Batch monitor for control of background job execution.
CAT          Cataloger for file storage access control; also services
             FORGET command.
DTD          Transfers recording information from drum to disc.
DBUG         Debugger for non-executive (user) programs.
LOGIN        User authentication and job creation.
SERVIS       Library of service commands that are reentrant,
             interruptible and scheduled: APPEND, CHANGE, CREATE,
             CYLS, DELETE, DRIVES, INIT, LISTF, LISTU, LOAD, LOADD,
             LOAD and GO, OVERLAY, REPLACE, RESTORE, RESTORED, SAVE,
             SEARCH, VARYOFF, VARYON.
RUN          Remote batch job submission control servicing commands
             RUN and CANCEL.
XXTOO        Library of small, fast, executive service commands: CPU,
             BGO, BQUIT, BSTOP, DIAL, DRUMS, GO, LOGOUT, QUIT,
             RESTART, SKED, SKEDOFF, STATUS, STOP, TIME, USERS.
SYSDEF       Defines input/output hardware configuration at time of
             system start up.
SYSLOG       Defines authorized user/terminal security profiles at
             time of system start up.
TEST         Initializes system tables at time of system start up.
SYSDATA      Non-resident, shared, system data table for dial messages
             and other common data, e.g., lists of all logged-in
             users; other non-resident, job-specific tables also
             exist, e.g., job environment page, push-down list data
             page.

Though EXEX components are, in large part,
non-self-modifying reentrant routines and thus could,
at small cost, be relocatable, neither user programs
nor EXEX components are relocated between swaps.
The lack of any mapping hardware on the IBM 360/50
and the design goal and knowledge that most user
programs would be of maximum size made unnecessary
a software provision to relocate programs dynamically.
User programs may be relocated once at load time,
however.
Communication and control techniques used in ADEPT
Communication is the generic term used to cover those
services that permit two (or more) programs to intercommunicate, be they system program, user program,
or both. From this communication vantage point we
shall examine the connective mechanism used between
the Basic and Extended Executives; the techniques
that allow components within the EXEX to make
use of one another; and the system design that permits
an object program to control its own behavior as well
as to communicate with the system and with other
object programs.
The ADEPT job or process
Before we discuss the system mechanics, let us
examine how the system treats each user logically.
A user in the system is assigned a job number. Each
job in the system may be viewed as a separate process,
and each process is, by definition, independent of all
other processes running on the machine. A process-or job-is not a program. It is the logical entity for
the execution of a program on the physical processor,
and it may contain as many as four separate programs.
A program consists of the set of machine instructions
swapped into the processor for execution, and the
Extended Executive is one of these programs.
The ADEPT executive requires a large number of
system tables to permit Basic and Extended Executive communication. Conceptually, the use of descriptive tables defining the condition of a user's process
is analogous to the state vector (or state word) discussed by Lampson and Saltzer.8,9 That is, the collection of information contained by these tables is
sufficient to define an inactive user's process state
at any given moment. By resetting the central processor from the state vector, a user's job proceeds
from an inactive to an active state as if no interruption had occurred. The state vector contains such
items as the program counter, the processor's general
registers, the core and drum map of all the programs
in the job, and the peripheral storage file data. All
of the collective data for each program or task in the
process are contained in the state vector.
Basic and extended executive communication
Each ADEPT user (i.e., any person who initiates
some activity within the system by typing in commands) is given a job number and assigned an entry
in the JOB table. The JOB table contains the system's
top-level bookkeeping on user activity. It contains
the user's identification, his location, his security
clearance, and a pointer to his program queue. Each
user is assigned one entry, or JOB, in the table. Associated with each JOB are the one or more programs
that the user is running.
Top-level bookkeeping on programs is contained
in the Program Queue (PQU) table. Each PQU entry
contains a program identification and some (but not
all) information that describes that program in terms
of its space requirements, its current activity, its
scheduling conditions, and its relationship to other
programs in the PQU that belong to the same JOB.
The detailed descriptive information and the status
of each JOB and its programs are carried in the swappable environment space.
The environment pages (there can be as many as
four) comprise a number of separate tables that contain such information as the contents of the general
registers, the swap storage page numbers where the
balance of the program resides, the program map,
and lists of all active data files. A single environment
page (or pages) is shared by all programs that belong
to the same JOB (user). The system design allows for
environment page overflow at which time additional
pages are assigned dynamically. The environment
pages, PQU table, JOB table, and data pages comprise the state vector of the user's job.
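The relationship among these tables can be sketched with a few record types. The field names and the `save_state` and `restore_state` helpers are assumptions made for illustration; the real tables hold considerably more (core and drum maps, file data, scheduling conditions).

```python
from dataclasses import dataclass, field

MAX_PROGRAMS = 4      # limit quoted for the 360/50H configuration


@dataclass
class PQUEntry:
    program_id: str
    active: bool = False


@dataclass
class Environment:
    registers: list = field(default_factory=lambda: [0] * 16)
    program_counter: int = 0
    open_files: list = field(default_factory=list)


@dataclass
class Job:
    user_id: str
    clearance: str
    programs: list = field(default_factory=list)        # PQU entries
    environment: Environment = field(default_factory=Environment)

    def load(self, program_id):
        if len(self.programs) >= MAX_PROGRAMS:
            raise RuntimeError("job already holds the maximum number of programs")
        self.programs.append(PQUEntry(program_id))

    def activate(self, program_id):
        # Only one program of a job may be active at a time.
        for entry in self.programs:
            entry.active = entry.program_id == program_id


def save_state(cpu, job):
    job.environment.registers = list(cpu["registers"])
    job.environment.program_counter = cpu["pc"]


def restore_state(cpu, job):
    cpu["registers"] = list(job.environment.registers)
    cpu["pc"] = job.environment.program_counter


cpu = {"registers": list(range(16)), "pc": 0x2000}
job = Job(user_id="SMITH", clearance="SECRET")    # hypothetical values
job.load("EXEX")
job.load("EDITOR")
job.activate("EDITOR")
save_state(cpu, job)                      # job becomes inactive
cpu = {"registers": [0] * 16, "pc": 0}    # another job uses the processor
restore_state(cpu, job)                   # job resumes as if uninterrupted
```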
To permit storage of "global" system variables,
and to allow system components to reference system
data that may be periodically relocated, there exists
a system communication table, which resides in low
core so that it can be referenced without loading a
base register.
The IBM 360 supervisor call (SVC) is used exclusively
by EXEX components and object programs to
request BASEX services. Though additional overhead
is incurred in the handling of the attendant interrupt,
the centralization of context switching provided is
of considerable value in. system design, fabrication,
and checkout.
Extended executive communication
An EXEX may make use of another EXEX function
by use of the SVC call mechanism. To support
the recursive EXEX, an additional SVC processing
routine is required to manage the different recursive
contexts. This routine, called the SVC Dispatcher,
processes calls from user and EXEX functions alike,
manages a swappable data page, and switches to an
interface linkage routine. The data page contains
a system communication stack that consists of a
program's general registers and the Program Status
Word at the time of the SVC. This technique is
analogous to the push-down logic of recursive procedure
calls found in ALGOL or LISP language
systems. The stack provides a convenient means of
passing parameters between routines in the EXEX.
Since each job has its own unique data page and
environment page, EXEX is both recursive and reentrant.
The environment status table (ESTAT) contains
the swap and core location for each component in
the EXEX and for each program in the job. It resides
in the job environment page. When an EXEX service
is requested, only that particular EXEX program is
brought in from swap storage, rather than the full
service library. The interface linkage routine provides
this management function; it lies as a link between
the SVC Dispatcher and the particular EXEX
function. The interface routine picks up necessary
work pages for the EXEX component involved and
branches to that component after it is brought into
core. The interface routine maintains a separate
push-down stack of return addresses, providing the means
for the EXEX component to properly exit and return
control to its interface routine and then to the system.
The EXEX component called may make additional
EXEX SVC calls before exiting. To provide correct
work page allocation during recursive calls, the
interface routine also saves the work page core and drum
page addresses in the push-down stack. Upon completion
of a call, the EXEX component returns to
its interface routine; the interface routine releases
all allocated work pages to the system and branches
to a common unwind procedure.
The unwind procedure, like the SVC Dispatcher,
is simply a switching mechanism. It determines, via
the stack, whether to return to a still higher level
EXEX function, or to turn the EXEX off and exit
to the Basic Sequence. This recursive/reentrant control
is the most complex portion of ADEPT and is
the "glue" that binds BASEX and EXEX together.
Figure 5 illustrates the recursive process.
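The push-down behavior of the SVC Dispatcher and the unwind procedure can be illustrated with a small model. The sketch below is not ADEPT code: the names `svc`, `unwind`, `JobContext`, and the components "A" and "B" are hypothetical, and the interface linkage routine (loading the component and its work pages) is reduced to a direct function call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Frame:
    """One stack entry: the caller's saved context at the time of the SVC."""
    caller: str
    registers: List[int]
    psw: int


@dataclass
class JobContext:
    """Per-job data page; one per job is what keeps the scheme reentrant."""
    stack: List[Frame] = field(default_factory=list)
    trace: List[str] = field(default_factory=list)


# Hypothetical EXEX components, keyed by name.
COMPONENTS: Dict[str, Callable[[JobContext], None]] = {}


def svc(job: JobContext, caller: str, target: str) -> None:
    """Dispatcher: push the caller's context, run the target, then unwind."""
    job.stack.append(Frame(caller, registers=[0] * 16, psw=0))
    job.trace.append(f"enter {target} depth={len(job.stack)}")
    COMPONENTS[target](job)  # stands in for the interface linkage routine
    unwind(job, target)


def unwind(job: JobContext, finished: str) -> None:
    """Pop one frame and decide, from the stack, where control returns."""
    frame = job.stack.pop()
    where = frame.caller if job.stack else "basic sequence"
    job.trace.append(f"exit {finished} -> {where}")


def component_b(job: JobContext) -> None:
    pass  # a leaf service: does its work and returns


def component_a(job: JobContext) -> None:
    svc(job, "A", "B")  # an EXEX component calling another EXEX component


COMPONENTS.update({"A": component_a, "B": component_b})
```

Calling `svc(job, "user", "A")` on a fresh `JobContext` pushes two frames and pops them in reverse order. Because each job carries its own stack, two jobs can be inside the same component at once without interfering.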
Object program communication
One of the more stringent services required of an
operating system is the rapid interchange of large
quantities of data between object programs. The
interchange of even simple arrays, matrices, and tables
via stack parameters or a common file suffers from the
inadequacy of limited capacity or extensive I/O time.
Many operating systems ignore this requirement,
thereby restricting the general-purpose applications.
Yet there are solutions to this problem, and one
successful technique employed in the ADEPT system is
that of "shared memory". Shared memory is achieved
by using the basic mechanism for managing reentrancy,
namely the program environment page map. Through
the ADEPT SHARE Page call, an object program
can request that designated pages of another program
in the job be added to its map. If core page numbers
are passed as parameters in various service calls, whole
pages of data may be passed between programs. EXEX
and many object programs operating under this system
use this method for inter-program communication.
Figure 5-Block diagram of EXEX behavior and control
ADEPT operating on the IBM 360/50H restricts
its user programs to 46 active core pages. However,
by utilizing the GETPAGE call, an object program
may acquire up to 128 drum pages and may subsequently
activate and deactivate various page sets
by utilizing another service call, ACTDEACT
(activate/deactivate). This scheme permits bulk data from
disc storage to be placed on drum and operated upon
at "swap" speeds. Thus skilled system users can
achieve efficient use of time and memory by managing
their own "paging". We consider this the best alternative
considering the questionable state of other,
automatic paging algorithms.10,11,12,13 Most EXEX
components use these calls for just such purposes. For
example, the interface routines mentioned above use
activate calls to "turn on" called components of the
EXEX.
The Allocator component of ADEPT manages the
page map for each program. This software map reflects
the correspondence between drum and core
pages, established initially by the SERVIS (service)
component at load time. The Allocator's function is
to inventory available core and drum pages by maintaining
two resident system tables: one for core, the
other for drum. Whenever drum pages are released
or obtained, the Allocator updates the page map in
the job's environment page. The Allocator processes
the SHARE (page), GETPAGE, FREEPAGE, and
ACTDEACT calls from EXEX and object programs.
SERVIS allows a program at run time to add data
pages or to overlay program segments from disc or
tape. In so doing, SERVIS makes use of the various
Allocator calls.
Simulating console commands
An important attribute of ADEPT time-sharing
is that nearly all the functions and services that can
be initiated at the user's console can also be called
forth within a user's program. A program designer
can, for example, build a system of programs, which
can operate in batch mode under the control of a
program by issuing internal commands in much the same
manner as the user sitting at the console. With this
approach, the ADEPT batch monitor controls background
tasks by simulating user terminal requests.
Batch requests can be enqueued by users from any
console and then processed in turn by this supervisor
function.
Armed interrupts and rescue function
The basic design of ADEPT conveniently provides
for processing object program "armed" interrupt
calls. This means that an object program is able to
conditionally start (wakeup) and stop (sleep) the
execution of its own programs, and others as well.
The conditions for employing wakeup calls include
too much elapsed time, or the occurrence of unpredictable
but anticipated events, e.g., errors and other
program calls. In "arming" these "software-interrupt"
conditions by object program calls, the program
entry point(s) for the various conditions are specified.
When such conditions occur, the operating system
transfers to the specified entry point and gives the
appropriate condition code. (Note that if we take this
call one step further, and permit one object program
to arm the software and hardware interrupts of another
object program, we have the basic control mechanism
necessary to permit the operation of "object systems,"
i.e., subexecutives, another level in the "onion skin"
of ADEPT control.)
User programs interface with the ADEPT system
primarily via the supervisor call (SVC) instruction;
a secondary interface is provided via the program
check interrupt that protects the program and system
after various error conditions. The executive design
allows user programs to trap all such interfaces with
the system via its rescue arming mechanism. This
means that one program can trap and get first-level
control of all occurrences of SVC's and program checks
within a single job. This mechanism also means, then,
that the responsibility and meaning for these interfaces
can be redefined at the user program level.
As of this writing, this mechanism is being employed
to construct object systems for an improved batch
monitor, an interface for the proposed ARPA Network,14
and to experiment with automatic translators
for compatibility with other operating systems. Other
uses include improvements in program recovery in
a variety of user tools, e.g., compiler diagnostics.
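The arming of software-interrupt conditions can be pictured as a per-job table that maps each condition to a program entry point. The sketch below is a loose illustration under that assumption; the class name, the method names, and the condition codes "TIMEOUT" and "IOERR" are hypothetical and do not come from the ADEPT call set.

```python
from typing import Callable, Dict, List


class ArmedInterrupts:
    """Per-job table of armed software-interrupt conditions (illustrative)."""

    def __init__(self) -> None:
        self._entry_points: Dict[str, Callable[[str], None]] = {}
        self.unhandled: List[str] = []

    def arm(self, condition: str, entry_point: Callable[[str], None]) -> None:
        """Record the entry point to receive control when condition occurs."""
        self._entry_points[condition] = entry_point

    def disarm(self, condition: str) -> None:
        self._entry_points.pop(condition, None)

    def raise_condition(self, condition: str) -> bool:
        """Transfer to the armed entry point, passing the condition code."""
        handler = self._entry_points.get(condition)
        if handler is None:
            self.unhandled.append(condition)  # left to default system handling
            return False
        handler(condition)
        return True


# Usage: a program arms an elapsed-time condition and later receives it.
seen: List[str] = []
table = ArmedInterrupts()
table.arm("TIMEOUT", seen.append)
table.raise_condition("TIMEOUT")   # control reaches the armed entry point
table.raise_condition("IOERR")     # nothing armed, falls through
```

Letting one program arm conditions in another program's table is the extension the parenthetical note above describes as the basis for "object systems."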
Resource allocation, access, and management
ADEPT system design, of course, includes a complete set of resource controls that monitor secondary
storage devices.
The cataloger
The Cataloger, an EXEX component, is functionally
analogous to the core/drum Allocator, but is used
for devices accessible by user programs. It maintains
an inventory of all assignable storage devices, assigns
unused storage on the devices, maintains descriptions
of the files placed on these devices, controls
access to these files, and, upon authorized request,
deletes any file. Specifically, the Cataloger:
• Assigns storage on 2302, 2311 and 2314 discs.
• Assigns tape drives.
• Locates an inventoried file by its name and certain qualifiers that uniquely identify the file.
• Issues tape or disc pack mounting instructions
to the operator when necessary.
• Verifies the mounting of labeled volumes.
• Passes descriptive information to the user program opening a file.
• Allows the user of a file to request more storage
for the file.
• Denies unauthorized users access to files.
• Returns assigned storage to available storage
whenever a file is deleted.
• Maintains a table of contents on each disc volume.
As the largest single component of the ADEPT
Executive (65,000 bytes), the Cataloger was written
in a new, experimental programming language called
MOL-360 (Machine-Oriented Language for the 360).15
It is a "higher-level machine language" developed
under an ARPA-sponsored SDC research project on
metacompilers. It resolved the dilemma involving
our desire for higher-level source language and our
need to achieve flexibility with machine code. The
Cataloger design and checkout, enhanced by the use
of MOL-360, showed simultaneously the validity
of MOL compilers for difficult machine-dependent
programming.
The SPAM component
SPAM is a BASEX component that permits symbolic,
user-oriented I/O. It can be viewed as a special-purpose
compiler that compiles symbolic user program
I/O calls into 360 channel programs, and delivers them
to the Input/Output Supervisor (IOS) for execution
via the EXCP (execute channel program) call. The
results of EXCP for the call are "interpreted" by
SPAM and returned to the user program as status
information. As such, SPAM represents a more symbolic
I/O capability than the EXCP level. It provides a
relatively simple method for executing the operations
of reading, writing, altering, searching for, and
positioning records within ADEPT cataloged and
controlled disc-based and tape-based file structures.
Resource management
As of this writing, the computer operator has a set
of commands at his disposal that allow him to control
the system resources. Various privileged on-line
commands enable him to monitor the terminal activities
of system users and to control assignment and
availability of storage devices. However, there is an
increasing need for a "manager" to be given more
latitude in dynamically controlling the system
resources and observing the status of system users,
particularly because ADEPT was designed to handle
sensitive information in classified government and
military facilities. To meet these objectives, a design
effort is under way that gives the computer operator
system-manager status, with the ability to observe
and control the actions of system users. The result
will be a program that encompasses some of the
management techniques reported by Linde and Chaney16
tailored to present needs.
Swapping and scheduling user programs
Most of the programs that run under ADEPT
occupy all of the core memory that is not used by
the resident Basic Executive (46 pages on the 360/50H).
If the set of needed pages could be reduced,
considerable reduction in swap overhead could be
expected. One way to achieve this is to mark for
swap-out only those pages that were changed during
program execution. The hardware needed to automatically
mark changed pages is unavailable for the 360/50;
however, through use of the store-protect feature on
the Model 50, ADEPT software can simulate the effect
and produce noteworthy savings in swap time.
Page marking
Whenever a user program is swapped into core, its
pages are set in a read-only condition. As the program
executes, it periodically attempts to store data (write)
in its write-protected pages. The resulting interrupt
is fielded by the system. After satisfying itself that
the store is legal for the program, the executive marks
the target page as "written," turns off write-protect
for that page, and resumes the program's execution.
The situation repeats for each additional page written.
At the completion of the program's time slice, the
swapper has a map of all the program pages that
were changed (implied in the storage keys with no
write protection). Only the changed pages are swapped
out of core. Measurement of this scheme shows that
about 20 percent of the pages are changed; hence,
for every five pages swapped in, only one need be
swapped out, for a total swap of six pages, rather
than the full swap of ten pages (five in, five out). The
scheme makes the drum appear to be 40 percent faster.
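The page-marking scheme and its swap arithmetic can be checked with a short simulation. This is a sketch of the idea only: the function names are assumptions, pages are plain integers, and the protection interrupt is modeled as a counter rather than a hardware trap.

```python
def run_time_slice(resident_pages, stores):
    """Simulate write-protect page marking for one time slice.

    resident_pages: page numbers swapped in; all start read-only.
    stores: page numbers the program writes to, in order.
    Returns (written_pages, protection_faults).
    """
    write_protected = set(resident_pages)
    written = set()
    faults = 0
    for page in stores:
        if page in write_protected:
            # First store to this page: take the interrupt once,
            # mark the page "written", and drop its write protection.
            faults += 1
            write_protected.remove(page)
            written.add(page)
        elif page not in written:
            # Neither protected nor already marked: not this program's page.
            raise ValueError(f"illegal store to page {page}")
    return written, faults


def swap_traffic(pages_in, written):
    """Pages moved for one slice: everything in, only changed pages out."""
    return pages_in + len(written)
```

With five pages swapped in and one of them written, `swap_traffic` gives 6 page transfers against 10 for a full swap. That is 40 percent fewer transfers, which is the sense in which the drum appears 40 percent faster. Each written page costs exactly one protection interrupt, however many times it is stored into afterward.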
The use of the storage protection keys is based on
the functional status of each page rather than on
some user identity. User programs always run with
a program status word key of one, and the bits in
the storage key associated with the programs start
out at zero. After a page has been initially changed,
its key is set to one also. The other bits in the key are
used to indicate: first, a page is transient, not yet
completely moved to or from swap storage; second,
a page is unavailable, i.e., it belongs to someone else;
third, a page is locked and cannot be swapped or
changed; and finally, a page is fetch-protected because
it may contain sensitive information.
Scheduling algorithm
The scheduling algorithm provides for three levels
of scheduling. Jobs that are in a "terminal I/O complete" state get first preference in the schedule. Jobs
in the second level, or background queue, are run if
there are no level-one jobs to run. A job is placed in
level two when the two-second quantum clock alarm
terminates its operation two consecutive times. Compute
and I/O-bound programs are treated alike. A
level-two job, when allowed to run, is given a quantum
interval equal to the basic quantum time multiplied
by the scheduling level (i.e., 2 sec × 2 = 4 sec).
However, a level-two background job may be preempted
after two seconds for terminal I/O. Any operation
a level-two job makes that terminates its quantum
prematurely will return the job to a level-one
status. The batch monitor job is run when the first
two queues are empty. User programs may be written
to overlap execution and I/O activity. Our choice of
scheduling parameters for quantum size and number
of service levels was made empirically and as a
result of prior experience.17
A command SKED, which is limited to the operator's
terminal, has the effect of forcing top priority
for a job (the job stays at level one all the time). Only
one job may run in this privileged scheduling state
at a time.
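The queue discipline described above might be modeled as follows. This is a simplified sketch with assumed names (`Scheduler`, `SchedJob`, `pinned` for the effect of SKED); it does not model the two-second preemption of a level-two job for terminal I/O, and the batch monitor is represented only by `pick` returning no job.

```python
from collections import deque
from dataclasses import dataclass

BASIC_QUANTUM = 2.0  # seconds


@dataclass
class SchedJob:
    name: str
    level: int = 1
    consecutive_expiries: int = 0
    pinned: bool = False  # set by the operator's SKED command


class Scheduler:
    def __init__(self) -> None:
        self.queues = {1: deque(), 2: deque()}

    def add(self, job: SchedJob) -> None:
        self.queues[job.level].append(job)

    def pick(self):
        """Return (job, quantum); (None, 0.0) means run the batch monitor."""
        for level in (1, 2):
            if self.queues[level]:
                job = self.queues[level].popleft()
                return job, BASIC_QUANTUM * level
        return None, 0.0

    def finished_slice(self, job: SchedJob, used_full_quantum: bool) -> None:
        """Requeue a job after it ran, adjusting its scheduling level."""
        if used_full_quantum:
            job.consecutive_expiries += 1
            if job.consecutive_expiries >= 2 and not job.pinned:
                job.level = 2  # two quantum expiries in a row
        else:
            # Ending a quantum early returns the job to level one.
            job.consecutive_expiries = 0
            job.level = 1
        self.queues[job.level].append(job)
```

A job that exhausts two quanta in a row drops to level two and then receives 4 seconds per turn, but only when no level-one job is waiting.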
Pervasive security controls
Integrated throughout the ADEPT executive are
software controls for safeguarding security-sensitive
information. The conceptual framework is based
upon four "security objects": user, terminal, file,
and job. Each of these security objects is formally
identified in the system and is also described by a
security profile triplet: Authority (e.g., TOP SECRET, SECRET), Need-to-Know Franchise, and
Special Category (e.g., EYES ONLY, CRYPTO).
At system initialization time, user and terminal
security profiles are established by security officers
via the system component SYSLOG. SYSLOG also
permits the association of up to 64 passwords with
each user. At LOGIN time, a user identifies himself
by his unique name, up to 12 characters, and enters
his private password to authenticate his identity. The
LOGIN component of ADEPT validates the user
and dynamically derives the security profile for the
user's job as a complex function of the user and terminal security profiles. The job security profile is
used subsequently as a set of "keys," used when access
is made to ADEPT files. The file security profile is
the "lock" and is under control of the file subsystem.
File access Need-to-Know is permitted for Private,
Semi-Private, and Public use. With the CREATE
command, a list of authorized users and the extent of
their access authorization (i.e., read-only, write-only,
read and write) can be established easily for Semi-Private
files. Newly created files are automatically
classified with the job's "high water mark" security
triplet, a cumulative security profile history of the
security of files referenced by the job. Through judicious
use of the CHANGE command, these properties
may be altered by the owner of the file.
Security controls are also involved in the control
of classified memory residue. Software and hardware
memory protection is extensively used. Software
memory protection is achieved by interpretive, legality checking of memory bounds for I/O buffer
transfers, legality checking of device addresses for
unauthorized hardware access, and checks of other
user program attempts to seduce the operating system
into violating security controls.
The hardware protection keys are used to fetch-protect
all address space outside the user program and
data area. Also, newly allocated space to user programs
is zeroed out to avoid classified memory residue.
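The "key" and "lock" comparison and the high-water mark can be illustrated with a toy model. The sketch below rests on several assumptions that the text does not state: that Authority levels form a simple ordering (the list below, including levels other than SECRET and TOP SECRET, is assumed), that Special Categories behave as sets, and that an empty franchise means a Public file. The real derivation of a job's profile is described only as "a complex function" and is not modeled here.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Assumed ordering, lowest to highest; the text names only the top two.
AUTHORITY_ORDER = ["UNCLASSIFIED", "CONFIDENTIAL", "SECRET", "TOP SECRET"]


@dataclass(frozen=True)
class Profile:
    """Security triplet: Authority, Special Category, Need-to-Know Franchise."""
    authority: str
    categories: FrozenSet[str] = frozenset()
    franchise: FrozenSet[str] = frozenset()  # empty set: Public (assumption)


def rank(authority: str) -> int:
    return AUTHORITY_ORDER.index(authority)


def may_access(job: Profile, user: str, file: Profile) -> bool:
    """Job profile is the set of keys; the file profile is the lock."""
    return (rank(job.authority) >= rank(file.authority)
            and file.categories <= job.categories
            and (not file.franchise or user in file.franchise))


def raise_high_water(mark: Profile, file: Profile) -> Profile:
    """Fold one referenced file into the job's cumulative high-water mark."""
    authority = max(mark.authority, file.authority, key=rank)
    # The franchise is carried over unchanged here, a simplification.
    return Profile(authority, mark.categories | file.categories, mark.franchise)
```

Under this model a newly created file would be stamped with the mark returned by the last `raise_high_water` call. The mark never decreases as the job references more files.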
Typically, the complete system reaches "on the air"
status in less than a minute.
System instrumentation
Many of the parameters built into the scheduling
and swapping of early ADEPT versions were based
upon empirical knowledge. The latest versions of
the Basic and Extended Executives include routines
to record system performance, reliability, and security
locks.
Built into the BASEX is a routine to measure the
overall and the detailed system performance.20 Such
factors as the number of users, file usage, hardware
and software errors, and page transaction response
time are recorded on unused portions of the 2303
drum. These measurements provide a better understanding of the system under a variety of inputs and
give the designers insight into how the hardware and
software components of the system affect the performance of the human user.
An AUDIT program was made part of the EXEX
to record the security interaction of terminals, users,
and files. AUDIT records EXEX activity in the areas
of LOGIN, LOGOUT, and File Manipulation. This
routine strengthens the security safeguards of the
executive. Specific items that are recorded involve:
type of event, user identification, user account number, job security, device identification, time of event,
file identification, file security, and event success. In
addition, this routine provides accounting information and is used as a means of debugging the security
locks of new system releases.
In addition to the BASEX recording function,
several object programs have been written that simulate various modes of user activity and provide controlled job distributions. These programs, called
"benchmarks," run under controlled conditions and
enhance the means of improving system performance
and throughput, as described elsewhere by Karush.21
The programs are designed to gather performance
measures on the major routines of the executive and
have been of considerable help in system "tuning,"
because they reflect the effect of coding and design
changes to various system routines. The routines in
the executive that are of primary concern are the
swapper, the scheduler, the terminal read/write package,
and the interrupt handling processes. Attempts
are being made to design a set of benchmarks that
represent a typical job mix. However, we are primarily
interested in measuring the performance of our system
against various modifications of itself and in measuring
its behavior with respect to different job mixes.
SUMMARY
The ADEPT executive is a second-generation, general-purpose,
time-sharing system designed for IBM 360
computers. Unlike the monolithic systems of the past,1,2
it is structured in modular fashion, employing distributed
executive design techniques that have permitted
evolutionary development. This design has not only
produced a flexible executive system but has given the
user the same facilities used by the executive for
controlling the behavior of his programs. ADEPT's
security aspects are unique in the industry, and the
testing and fabrication methods employ a number
of novel approaches to system checkout that contribute
to its operational reliability.
It is important to note that this system deals
particularly well with size limitation problems of very
large files and very large programs. The provisions
made for multiple programs per job, active/inactive
page status for programs larger than core size, page
sharing between programs, common file access across
programs within jobs, and the commitment of considerable
space to active file environment tables (up
to four pages worth) contribute to this success.
Nevertheless, all these capabilities are designed to handle
the smaller entities as well. We feel ADEPT-50 is
a significant contribution to the technology of
general-purpose time-sharing.
ACKNOWLEDGMENTS
We would like to express our appreciation for the
dedicated efforts of some very adept individuals who
participated in the design and building of this time-sharing
system. Our thanks go to Mr. Salvador Aranda,
Mr. Peter Baker, Mrs. Martha Bleier, Mr. Arnold
Karush, Mrs. Patricia Kribs, Mr. Reginald Martin,
Mr. Alexander Tschekaloff and all the others who
have followed their lead.
REFERENCES
1 P CRISMAN editor
The compatible time-sharing system: A programmer's guide
MIT Press Cambridge Mass 1965
2 J SCHWARTZ et al
A general-purpose time-sharing system
Proc SJCC Vol 25 1964 397-411 Spartan Books Baltimore
3 E W FRANKS
A data management system for time-shared file-processing
using a cross-index file and self-defining entries
AFIPS Proc Vol 28 1966 79-86 Also available as SDC
document SP-2248 21 April 1966
4 R E BLEIER
Treating hierarchical data structures in the SDC time-shared
data management system (TDMS)
Proc 22nd Nat ACM Conf Thompson Book Co 1967 41-49
5 E W DIJKSTRA
The structure of T.H.E. multi-programming system
C A C M Vol 11 No 5 May 1968
6 F J CORBATO V A VYSSOTSKY
Introduction and overview of the Multics system
Proc FJCC Nov 30 1965 Las Vegas Nevada
7 B W LAMPSON
Time-sharing system reference manual
Working Doc Univ of Calif Doc No 30.1030
Sept 1965 Dec 1965
8 B W LAMPSON
A scheduling philosophy for multi-processing systems
C A C M Vol 11 No 5 May 1968
9 J H SALTZER
Traffic control in a multiplexed computer system
MAC-TR-30 thesis MIT Press July 1966
10 G H FINE et al
Dynamic program behavior under paging
Proc ACM 1966 223-228 Thompson Book Co Wash D C
11 E G COFFMAN L C VARIAN
Further experimental data on the behavior of programs in a
paging environment
C A C M Vol 11 No 7 July 1968 471-474
12 L A BELADY
A study of replacement algorithms for a virtual storage computer
IBM Systems Journal Vol 5 No 2 1966
13 R W O'NEIL
Experience using a time-shared multi-programming system
with dynamic address relocation hardware
Proc SJCC 1967 Vol 30 611-627 Thompson Book Co
Washington D C
14 L G ROBERTS
Multiple computer networks and intercomputer communication
ACM Symposium on Operating System Principles
Oct 1-4 1967 Gatlinburg Tenn
15 E BOOK D C SCHORRE S J SHERMAN
Users manual for MOL-360
SDC Doc TM-3086/003/01
16 R R LINDE P E CHANEY
Operational management of time-sharing systems
Proc ACM 1966 149-159
17 P V McISAAC
Job descriptions and scheduling in the SDC Q-32 time-sharing system
SDC Doc TM-2996 June 1966 28
18 C WEISSMAN
Security controls in the ADEPT-50 time-sharing system
AFIPS Proc FJCC Vol 35 1969
19 W A BERNSTEIN J T OWENS
Debugging in a time-sharing environment
AFIPS Proc FJCC Vol 33 1968 7-14
20 A D KARUSH
The computer system recording utility: application and
theory
SDC Doc SP-3303 Feb 1969
21 A D KARUSH
Benchmark analysis of time-sharing system
SDC Doc SP-3343 April 1969
APPENDIX A: Advanced development prototype system block diagram.
An operational memory share supervisor
providing multi-task processing within a
single partition
by J. E. BRAUN
Penna.-N. J.-Md. Interconnection
Philadelphia, Pa.
and
A. GARTENHAUS
Applied Programming Services, Inc.
Philadelphia, Pa.
INTRODUCTION
The real-time digital process control system, of which
the Partition Share Supervisor is an operational feature,
was designed and implemented to assist in the functions
of monitoring, evaluating and controlling an
interconnected system of electrical power utility
companies. The main processing unit is located at the
central control office with teleprocessing communications
to remote lower level control centers.
The basic addressable unit within the main processor
is the byte (8 data bits + 1 parity bit), with a word
consisting of four bytes. There is a storage protect
option which is implemented through assignment of
storage and "keys" to contiguous 2048 byte blocks of
memory. A group of memory blocks with matching
protect keys comprise a partition or task area. This
protection feature permits nondestructive read-out
across partition boundaries but will cause termination
of any task which attempts to write in another task's
memory area.
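The storage-protect rule described above (keys attached to 2048-byte blocks, reads allowed across partitions, writes refused on a key mismatch) can be sketched as a small model. The class and method names are assumptions, and the model leaves out any supervisor-state privileges; on the real machine a violation terminates the offending task rather than raising an exception.

```python
BLOCK_SIZE = 2048  # bytes per protected block


class ProtectionViolation(Exception):
    """Stands in for termination of the offending task."""


class Memory:
    def __init__(self, size_bytes: int) -> None:
        self.data = bytearray(size_bytes)
        self.keys = [0] * (size_bytes // BLOCK_SIZE)

    def assign_partition(self, first_block: int, block_count: int, key: int) -> None:
        """Give a run of contiguous blocks the same protect key."""
        for block in range(first_block, first_block + block_count):
            self.keys[block] = key

    def read(self, address: int) -> int:
        """Reads are permitted across partition boundaries."""
        return self.data[address]

    def write(self, task_key: int, address: int, value: int) -> None:
        """Writes succeed only when the task's key matches the block's key."""
        block = address // BLOCK_SIZE
        if self.keys[block] != task_key:
            raise ProtectionViolation(f"key {task_key} cannot write block {block}")
        self.data[address] = value
```

For example, with blocks 0 to 3 keyed 1 and blocks 4 to 7 keyed 2, a task running with key 2 can read address 100 but an attempt to write there is refused, leaving the other task's data intact.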
The arithmetic-logic unit maintains its current status
in a program status word which contains such information as whether or not I/O is currently being permitted on each of the data channels, the protect key for
the instruction presently being executed, present
machine status, length of current instruction, the address of the next instruction to be fetched, etc. There
are certain instructions within the instruction set
which can only be executed when the machine is in
the "supervisor" state, i.e., when the portion of the
program status word which indicates machine status
is correctly set. These instructions are classified as
"privileged" instructions and perform such functions
as disabling data channel interrupts, altering storage
keys, resetting the program status word, etc.
The ability of the computer to disallow certain of
its instructions when operating in the normal problem
program state prevents inadvertent destruction of
critical storage area or catastrophic conditions being
caused by problem programs which could lead to
system shutdown.
This system utilizes the independent I/O channel
concept which permits the main processor to continue
execution of program instructions while the channel
transfers data from I/O devices into main storage by
cycle interleaving.
The multi-tasking capability of the manufacturer
supplied software support system permits priority
scheduling of several tasks all utilizing the resources of
one processing unit. The design of the real-time control
system requires that it perform certain of its functions
on a cyclic basis. Therefore, the internal storage has been
divided into four task areas (partitions) with time dependent and critical programs placed in partitions
with relatively higher priorities. The following task descriptions are listed in order of task priorities:
Task 1 (core requirement = 42K)
Task 1 is dedicated to the manufacturer supplied
operating system (O/S) which contains supervisory
routines, data management routines, priority scheduler,
etc.
Task 2 (core requirement = 72K)
Task 2 incorporates the process control family of
programs. It also includes the remote typewriter/card
reader communications programs since they use little
processing time and benefit from both the independence
of input/output channel operations and quick response
time available to the task. During power system
emergency situations, Task 2 additionally initiates
routines which, due to their critical nature, retain
system resources and dispatch emergency communications until the disturbance is relieved.
Task 3 (core requirement = 40K)
Task 3 contains special digital console message processing routines, text output generators for programs
operational within Task 2, routines for processing card
inputs from the telecommunications system and routines which monitor and control inter-task communications.
Task 4 (core requirement = 6K)
Task 4 is the Partition Share Supervisor (PSS)
which causes Tasks 5 and 6 to share the remaining
available memory. The detailed description of this
task is the subject of this paper.
Task 5 (core requirement = 96K)
Task 5 consists primarily of scientific application
programs. These programs are run as required either on
special demand from real-time on-line tasks or periodically with the length of the period depending on
the nature of the program.
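Task 6 (core requirement = 96K)
This task is the off-line* task and is dedicated to
miscellaneous uses such as compiles, assemblies, accounting routines, etc.
Figure 1 is a functional diagram of the tasks just
discussed and shows their relative locations in computer memory.
Figure 1-Initial memory configuration with task
functional descriptions and relative locations shown
(From high to low memory address:
Task 2, 72K: real time process control; typewriter/card reader telecommunications control; analog/digital telecommunications control; emergency dispatch routines.
Task 3, 40K: digital console message processing; output text generation for Task 2; remote card input processor; task-to-task communications monitoring.
Task 4, 6K: Partition Share Supervisor (PSS).
Task 5/Task 6, 96K, shared partition: time dependent and special demand scientific application programs (Task 5) or off-line miscellaneous uses (Task 6).
Task 1, nucleus, 42K: operating system consisting of supervisory, data management, priority scheduler routines, etc.)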
General discussion
Task dispatching
Task dispatching is under the control of the operating system. From a conceptual standpoint, the
operating system can be considered to be the only
main program in storage and all other tasks within
the computer as subroutines.
* The term off-line is used in this paper when referring to tasks
which do not directly operate within the real-time environment.
This use is similar to the term "background" which the reader
may have previously encountered.
The dispatching function consists of allocating the
resources of the processor to the highest priority task
which is in the "ready" state. When no tasks are in
the ready state, the processor is not working and is in
a wait state. When any task reaches a point where it
no longer can process until the completion of some
event (such as an I/O operation), it relinquishes control of computer facilities to lower priority tasks via
the scheduler. It will regain these facilities when the
event it is awaiting is completed and there are no
higher priority tasks which are in the ready state.
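The dispatching rule just described can be sketched in a few lines. This is only an illustration under the assumption that the scheduler holds its tasks in a list ordered from highest to lowest priority; the Task class and dispatch function are hypothetical names, not part of the operating system described.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    ready: bool = False


def dispatch(tasks_by_priority):
    """Return the highest priority ready task, or None for the wait state."""
    for task in tasks_by_priority:  # list is ordered highest priority first
        if task.ready:
            return task
    return None


tasks = [Task("TASK 2"), Task("TASK 3"), Task("PSS"), Task("OFF-LINE", ready=True)]
first = dispatch(tasks)          # only the off-line task is ready
tasks[0].ready = True            # an awaited event completes for Task 2
second = dispatch(tasks)         # Task 2 now takes precedence over lower priority work
```

When every task is waiting, dispatch returns None, which corresponds to the processor wait state mentioned above.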
Inter partition communication
The subject real-time system requires that operational tasks be able to communicate for the purpose
of exchanging information such as live data, requests
to run various subtask routines, etc. Tasks which
communicate with other tasks are equipped with intertask communication routines which are considered the
highest priority routines within the individual task. In
this fashion, when the task is dispatched, the internal
task priority scheme allows the communication routines
to be processed first. Furthermore, any task can be
interrupted to allow its communication routines to
operate. Thus tasks can communicate at any time
(asynchronously).
Partition sharing
The Partition Share Supervisor (PSS) is required to
be able to handle three basic functions:
1. Suspend processing of the off-line task when
required.
2. Load and process the lowest priority on-line
task (LPOL).
3. Upon completion of (2) above, be able to restore
and restart the off-line task.
There are two conditions under which PSS suspends
off-line processing. One is when the previously set
real-time clock causes an interrupt. This interrupt is
recognized as indicating the LPOL is to be recycled
for a periodic run. The other is when a communication
is received from another task indicating that one of the
routines within the LPOL task is to be executed.
Figure 1 shows the computer configuration in the
normal mode. Normal mode is considered to be when
the shared partition is occupied by off-line programs.
Note that there are four problem program partitions
(excluding the nucleus).
Figure 2 shows the configuration when the off-line
programs are "rolled out" and the LPOL programs
are operational. There are now three problem program
partitions and the area dedicated to the PSS and LPOL
tasks is one contiguous partition.
Figure 2-Showing memory configuration when low
priority on line (LPOL) task is active
(From high to low memory address: Task 2, 72K; Task 3, 40K; Task 4, the Partition Share Supervisor (PSS), 6K, and Task 5, the low priority on-line task, 96K, combined into a single 102K task area; Task 1, the nucleus, 42K.)
Detailed discussion
The following description details the operations involved in reconfiguring the system from that of
Figure 1 to that of Figure 2 and returning to that of
Figure 1.
As previously stated, the PSS task is initiated for
one of two reasons:
1. Timer interrupt indicating a need to run the
LPOL task for time dependent programs.
2. External interrupt triggered by communication
from another task indicating a need to process
a requested program.
Prior to either type of interrupt, the PSS task is
in a wait state (i.e., the task cannot be dispatched
until the completion of one of the above two events).
Upon being initiated, PSS takes the following steps:
1. Places its own task in the supervisor state in
order to allow execution of privileged instructions
required to modify system control blocks in the
nucleus, override the storage protection feature,
and disable system interrupts at critical times.
2. Allows all outstanding I/O to complete in the
off-line partition (quiescing the partition).
3. Erases the boundary between the PSS task
and off-line task.
4. Deletes reference to the now non-existent off-line task from operating system control blocks.
5. Writes a copy of the off-line partition, which is
now an extension of the memory area of the
PSS task, on a disc file.
6. Reads the LPOL task into the vacated area.
7. Executes the LPOL task.
At this point, we have gone from the configuration
shown in Figure 1 to that of Figure 2 and the LPOL
task is now able to process its requests. Upon completion by the LPOL task of all required processing,
the following steps are taken by PSS to return to the
off-line configuration:
8. Writes the LPOL task on a disc file.
9. Reads the off-line task into the vacated area.
10. Re-establishes task boundaries erased in 3.
11. Restores system reference to the off-line task.
12. Places the PSS task in a "wait state" awaiting
an interrupt which will cause a recycle.
At this point, the off-line task is fully restored to the
system and in a "ready state". It will then be redispatched by the task dispatching routines on a priority
basis.
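The swap at the heart of steps 5 through 9 can be sketched as follows. This is a deliberately simplified model: core and the disc file are represented by plain dictionaries, the partition images by strings, and the function and key names are hypothetical. Boundary erasure, control block changes and quiescing are left out.

```python
def run_lpol_cycle(memory, disc, run_lpol):
    """Swap the shared partition to the LPOL task, run it, and swap back.

    memory and disc are plain dicts standing in for core and the disc file.
    """
    # Steps 5-6: save the off-line image, then load the LPOL image.
    disc["off-line"] = memory["shared"]
    memory["shared"] = disc["lpol"]
    # Step 7: let the LPOL task process its request.
    memory["shared"] = run_lpol(memory["shared"])
    # Steps 8-9: save the updated LPOL image and bring the off-line task back.
    disc["lpol"] = memory["shared"]
    memory["shared"] = disc["off-line"]


memory = {"shared": "off-line image"}
disc = {"lpol": "lpol image v1"}
run_lpol_cycle(memory, disc, lambda image: image.replace("v1", "v2"))
```

After the cycle the shared partition again holds the off-line image, while the disc copy of the LPOL task reflects whatever state it reached during its run.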
System control blocks
Prior to a detailed discussion of PSS mechanics, we
will discuss relevant system control blocks utilized in
effecting partition sharing.
Task Control Block (TCB)
There is a TCB associated with each task. Contained
in the TCB are various boundaries, indicators, etc.,
used in performing task control. Figure 3 shows those
fields (with references labeled as used in this paper)
which are accessed or modified by PSS.
FIELD      CONTENTS
TCBTABB    Pointer to task MSS (boundary box; see Figure 5)
TCBPKE     Contains storage protection key for the task
TCBIDF     Task identification number
TCBTCB     Pointer to next lower priority task TCB
Figure 3-Task control block (TCB)
TCB List (TCBLIST)
The TCBLIST is located in the nucleus and is a
list of TCB locations in order of task priority. There
is an entry in the list for each task in the system (see
Figure 4).
Figure 4-TCB list (TCBLIST)
(Entries in order: pointer to TCB of highest priority task; pointer to TCB of next highest priority task; and so on, down to pointer to TCB of lowest priority task.)
Task Area Boundary Block (TABB)
There is a TABB associated with each task. The
TABB contains addresses defining the upper and lower
boundaries of the task region and also has a pointer
to the first free area label within the task. The format
of a TABB is shown in Figure 5.
LABEL      CONTENTS
FALPT      Pointer to first free area label (FAL) within task area (see Figure 8)
LOADDR     The address of the low boundary of the task
HIADDR     The address of the high boundary of the task
Figure 5-Task area boundary block (TABB)
Free Area Label (FAL)
There is an FAL which is an integral part of every
available free storage area in memory. An FAL is
effectively a label for each free storage area which
defines its size and contains a linkage pointer to
the next FAL. The format of an FAL is shown in
Figure 6.
FIELD      CONTENTS
FALNXT     Pointer to next FAL in the chain of FALs. Note: if this field is all zeros, this is the last FAL in the chain.
FALCOUNT   Amount of free memory available starting at the beginning of this FAL
Figure 6-Free area label (FAL)
Input/Output Request Element (IORE)
There is a chain of IOREs for all outstanding or
queued I/O operation requests from any partition.
Each IORE contains information used by the system
I/O interrupt handling routines as I/O operations are
completed. Figure 7 shows the format of an IORE.
FIELD      CONTENTS
IORESTAT   Status indicator for this IORE. The last IORE in the chain has an IORESTAT field with a value of 1.
IOREID     Field set to same ID number as that of the TCBIDF field of the task which initiated I/O request (see Figure 3)
Figure 7-I/O request element (IORE)
System Vector Table (SVT)
The SVT is resident in the nucleus and contains
essential pointers required by the operating system.
Included is a pointer to the start of the IORE chain.
The location of the SVT is retrieved from a fixed memory location which is conditioned with the SVT address
during system initialization.
As mentioned under General Discussion, the PSS
task is required to run in supervisor state at times.
Although the state of the PSS task changes from
problem to supervisor and back throughout its execution, these changes of state will not be noted in
this discussion. It should be understood that PSS
operates in problem state at all times where it is not
required to be executing privileged instructions, modifying storage in another partition or the nucleus, or
disabling I/O interrupts.
Quiescing a partition
Prior to rolling out the off-line partition, PSS must
be sure all I/O is quiesced in order to prevent the I/O
supervisor routines from accessing some storage area
which is in a transitory state.
There is an IORE for all outstanding and queued
I/O requests. Within each IORE is an identification
number field (IOREID; see Figure 7) which links it
with the initiating task. When that task is involved in
an I/O operation, the TCBIDF field of the TCB
(Figure 3) has a task identification number that will
match the IOREID field of some active IORE.
As I/O interruptions occur, the I/O Interrupt Handler services the interrupt and removes the appropriate
IORE from the chain and makes it inactive.
Partition quiescing is accomplished by initially disabling I/O interrupts, obtaining the TCBIDF field
from the TCB of the task involved, locating the IORE
chain by using the pointer in the SVT, and scanning
the IOREs checking for IOREID fields which match
the TCBIDF field of the TCB. If none are found, there
are no IOREs for the task and it is already in a quiescent
state. If any are found, then the task has a pending
I/O interrupt or outstanding I/O requests. If this is
the case, PSS enables interrupts, allowing the I/O
Supervisor to process, if necessary, and then immediately disables them. If the I/O in question has been completed, the IORE will have been removed from the
chain during the time interrupts were enabled.
PSS restarts at the beginning of the chain and checks
again, repeating the above steps until it comes to the
end of the chain without having found any active
elements for the task. When it reaches this point, there
are no longer any IOREs associated with the task and
it is in fact quiescent.
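The scan-and-retry loop described above can be sketched as follows. The sketch assumes the IORE chain is a singly linked list reached through the SVT; the class layouts, the allow_interrupts hook (which stands for briefly enabling I/O interrupts), and the max_passes limit are all hypothetical, the last being a stand-in for whatever time limit an installation applies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IORE:
    ioreid: int                     # matches TCBIDF of the initiating task
    next: Optional["IORE"] = None   # link to the next element in the chain


@dataclass
class SVT:
    iore_chain: Optional[IORE] = None  # pointer to the start of the IORE chain


def has_outstanding_io(svt, tcbidf):
    """Scan the IORE chain for an element owned by the given task."""
    element = svt.iore_chain
    while element is not None:
        if element.ioreid == tcbidf:
            return True
        element = element.next
    return False


def quiesce(svt, tcbidf, allow_interrupts, max_passes=1000):
    """Repeat the scan until no IORE for the task remains.

    allow_interrupts is a hypothetical hook that stands for briefly enabling
    I/O interrupts so the interrupt handler can remove completed IOREs.
    Returns True when quiescent, False if max_passes is exhausted.
    """
    for _ in range(max_passes):
        if not has_outstanding_io(svt, tcbidf):
            return True
        allow_interrupts(svt)
    return False


def complete_first_io(svt):
    """Toy interrupt handler: the head of the chain completes and is unlinked."""
    if svt.iore_chain is not None:
        svt.iore_chain = svt.iore_chain.next


svt = SVT(IORE(6, IORE(2, IORE(6))))
quiescent = quiesce(svt, tcbidf=6, allow_interrupts=complete_first_io)
```

The toy handler simply completes whichever request heads the chain; a real interrupt handler would remove only the element whose operation actually finished.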
It should be noted that since the PSS task has a
higher priority than the task to be quiesced, it does
not allow any new I/O requests to be initiated by that
task since PSS retains the computer resources.
Erasing of a partition boundary and
task deletion
There is control information which is received by
the communications routines within the PSS task
which must be accessible to the LPOL task for both
reading and writing (such as indications of which LPOL
routine is to be run, the replacement value for the
next cycle time which is calculated by the LPOL task
as a function of its current running time, entry point
addresses of routines mutually shared by the PSS and
LPOL tasks, etc.). Additionally, task management is
greatly facilitated by extending the PSS task area to
include the LPOL function while controlling via the
PSS Task Control Block (TCB) rather than modifying
the off-line task TCB or creating a new one.
In order to make the shared task area a memory
extension of the PSS task, the memory areas must be
linked. This is achieved by modifying the TABB (see
Figure 5) of the PSS task so that the LOADDR field
points to the low address of the shared task. Figures
8 and 8a show the pointer relationships before and
after these TABB modifications.
The storage protection feature must now be satisfied
to make the two storage areas completely contiguous.
Since there is a mismatch in storage keys between the
PSS and shared tasks, the keys associated with each
protected block of memory within the shared task are
reset to match those of the PSS task. At this point,
the two task areas have become a contiguous block of
memory assigned to the PSS task area.
Figure 8a-TABB pointers after modification
(Diagram of the PSS TABB and the off-line TABB in the nucleus, showing their LOADDR and HIADDR pointers into the PSS task area and the shared task area.)
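The two modifications just described, lowering LOADDR in the PSS TABB and resetting the storage keys of the shared area, can be sketched together. The addresses below are assumptions derived from the partition sizes in Figure 1 with the nucleus taken to start at address zero; the key values 4 and 5 and the function name are hypothetical.

```python
from dataclasses import dataclass

BLOCK_SIZE = 2048


@dataclass
class TABB:
    loaddr: int  # low boundary of the task area
    hiaddr: int  # high boundary of the task area


def absorb_lower_partition(pss, shared, keys, pss_key):
    """Extend the PSS task area downward over the adjacent shared task area."""
    assert shared.hiaddr == pss.loaddr, "areas must be adjacent"
    pss.loaddr = shared.loaddr
    # Reset the storage key of every block in the shared area to the PSS key.
    for block in range(shared.loaddr // BLOCK_SIZE, shared.hiaddr // BLOCK_SIZE):
        keys[block] = pss_key


K = 1024
shared = TABB(loaddr=42 * K, hiaddr=138 * K)    # 96K shared task area
pss = TABB(loaddr=138 * K, hiaddr=144 * K)      # 6K PSS task area above it
keys = {block: 5 for block in range(42 * K // BLOCK_SIZE, 138 * K // BLOCK_SIZE)}
absorb_lower_partition(pss, shared, keys, pss_key=4)
```

After the call the PSS TABB spans 102K, matching the combined single task area of Figure 2, and all 48 blocks of the former shared area carry the PSS key.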
Figure 9 shows how TCBs are linked together within
the system. Note that each entry in the TCBLIST
points to a TCB and each TCB points to the next
lowest priority TCB in the chain. Figure 9a shows the
arrangement of the TCBLIST and the TCBTCB field
in the next-to-last TCB in the chain after modification
to three partitions. This has been done by replacing
the pointer to the last TCB in the TCBLIST with a
pointer to the next-to-last TCB, and setting the TCBTCB
field of the next-to-last TCB to zero. These modifications have additionally made the last task nonexistent to the operating system.
Figure 8-TABB pointers in PSS and off-line task
prior to modification
Figure 9-Portion of nucleus showing TCBLIST and
TCBTCB pointer relationship prior to modification
Figure 9a-TCBLIST and TCBTCB pointers after
modification
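The pointer changes of Figure 9a can be sketched with a small linked structure. The sketch assumes the TCBLIST is an ordered list of references and that a zero TCBTCB field can be modeled as None; the task numbers used as identifiers and the function names are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TCB:
    tcbidf: int                      # task identification number
    tcbtcb: Optional["TCB"] = None   # next lower priority TCB; None stands for zero


def build_chain(ids):
    """Create TCBs in priority order and link each to the next lower one."""
    tcbs = [TCB(i) for i in ids]
    for upper, lower in zip(tcbs, tcbs[1:]):
        upper.tcbtcb = lower
    return tcbs


def delete_lowest_task(tcblist):
    """Make the lowest priority task nonexistent to the system.

    Returns the removed TCB so it can be restored later.
    """
    removed = tcblist[-1]
    tcblist[-1] = tcblist[-2]    # last entry now points to the next-to-last TCB
    tcblist[-2].tcbtcb = None    # TCBTCB of the next-to-last TCB set to zero
    return removed


tcblist = build_chain([2, 3, 4, 6])   # task numbers used as illustrative IDs
saved = delete_lowest_task(tcblist)
```

Keeping the removed TCB in hand is what allows the reverse operation, restoring system reference to the off-line task, to be a matter of putting the two pointers back.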
Rollout/Rollin
The process of rolling out the off-line task and rolling
in the LPOL task is a straightforward write/read
operation to a disc file. Since storage is divided into
2048 byte units for assignment of storage keys, the
task area read or written is some multiple of 2048
bytes in length. Thus the records are read or written
in 2048 byte blocks for purposes of simplicity and
efficiency.
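As an illustration of the block-oriented transfer, the following sketch writes a task area as 2048-byte records and reads it back. An in-memory io.BytesIO object stands in for the disc file, and the function names and sample data are hypothetical.

```python
import io

BLOCK_SIZE = 2048


def roll_out(task_area, disc_file):
    """Write a task area to a file-like object as 2048-byte records."""
    if len(task_area) % BLOCK_SIZE != 0:
        raise ValueError("task area must be a multiple of 2048 bytes")
    for offset in range(0, len(task_area), BLOCK_SIZE):
        disc_file.write(task_area[offset:offset + BLOCK_SIZE])
    return len(task_area) // BLOCK_SIZE


def roll_in(disc_file, record_count):
    """Read record_count 2048-byte records back into one contiguous area."""
    area = bytearray()
    for _ in range(record_count):
        area += disc_file.read(BLOCK_SIZE)
    return bytes(area)


off_line_area = bytes(range(256)) * 8 * 48   # 96K of sample data
disc = io.BytesIO()
records = roll_out(off_line_area, disc)
disc.seek(0)
restored = roll_in(disc, records)
```

A 96K partition is exactly 48 records of 2048 bytes, so no partial record ever has to be handled.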
Free area modification
The PSS and LPOL tasks now occupy the same task
area. It is necessary, therefore, to make certain modifications which will cause all requests for work storage
to be satisfied from that portion of the task area wholly
dedicated to the LPOL task. Although no task boundary exists between LPOL and PSS, if work storage
were to be allocated from the PSS domain, it would
not be subsequently saved and restored in future
cycles since the PSS area is not included in the dynamic
area which is stored on the disc file.
Figures 10 and 10a show how these modifications
are accomplished. Initially (Figure 10) the FALPT field
of the PSS TABB is pointing to the free area within
what was its own task area. This is the normal condition
for this pointer when there is an operating off-line
task. However, we have modified the configuration to
three task areas and we now wish to make the only
available free area exist in the LPOL area. Figure
10a shows that the FALPT field of the PSS TABB
has been re-pointed to the first FAL within the LPOL
task area.
At this point, the LPOL task is ready to process
whatever request caused it to be activated. We have
now covered steps 1 through 7, under General Discussion. In returning from the three partition to the
four partition environment, the steps are essentially
the reverse of those detailed.
Figure 10-FALPT relationship with FAL locations
prior to modification
Figure 10a-FALPT fields after modification
Upon restoring the off-line task, PSS enters a wait
state and will be restarted as previously outlined. The
task dispatcher portion of O/S will restart the off-line
task as soon as there is available computer time and
no higher priority tasks require the computer resources.
Initialization
The initialization process for PSS consists of:
1. Suspending off-line processing.
2. Reconfiguration from four to three partitions.
3. Rolling out the off-line task.
4. Making the off-line task area one contiguous
free area.
5. Loading the LPOL task and allowing it to
initialize itself.
6. Rolling out the LPOL task.
7. Rolling in and restarting the off-line task.
8. Entering the normal cycle at the wait point.
Step 4 above has not been previously covered in
detail. In order to force the initial loading of LPOL into
the desired location, the FALs for PSS are initially
modified. Figures 10 and 10a show the PSS TABB
before and after this is done. The FALPT field of the
PSS TABB initially points to the first FAL within
the PSS area. The FALPT field of the LPOL TABB
points to the first FAL of its task area. By altering
the FALPT of the PSS TABB to make it point to the
LPOL first FAL and by altering the FAL by both
making it the last FAL in the chain and indicating
one large block of free memory, we have created a
large free area available to PSS for loading the LPOL
programs.
As the LPOL task acquires and releases memory
blocks for work storage, the FALs within the area
are modified by the operating system consistent with
memory availability. PSS simply saves the pointer to
the first LPOL FAL prior to each rollout and restores it
after rollin and prior to reinitiating LPOL. Continuity
of FAL linking is maintained in this fashion.
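The free area manipulation can be sketched with a linked chain of labels. The sketch assumes an all-zero FALNXT field can be modeled as None and that the whole 96K area is offered as one free block at initialization; the class layouts, function names and starting chain are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FAL:
    falcount: int                    # amount of free memory at this label
    falnxt: Optional["FAL"] = None   # next FAL; None stands for all zeros


@dataclass
class TABB:
    falpt: Optional[FAL]             # first FAL within the task area


def free_bytes(tabb):
    """Total free storage reachable from a TABB by following the FAL chain."""
    total, fal = 0, tabb.falpt
    while fal is not None:
        total += fal.falcount
        fal = fal.falnxt
    return total


def point_pss_at_lpol(pss_tabb, lpol_first_fal, area_size):
    """Initial set-up: one large free block, last in the chain."""
    lpol_first_fal.falcount = area_size
    lpol_first_fal.falnxt = None
    pss_tabb.falpt = lpol_first_fal


K = 1024
pss_tabb = TABB(falpt=FAL(2 * K))             # free area inside the PSS region
lpol_fal = FAL(10 * K, FAL(4 * K))            # arbitrary starting chain
point_pss_at_lpol(pss_tabb, lpol_fal, 96 * K)

saved_falpt = pss_tabb.falpt                  # saved prior to rollout
pss_tabb.falpt = None                         # pointer is meaningless while rolled out
pss_tabb.falpt = saved_falpt                  # restored after rollin
```

Because the FALs themselves live inside the LPOL area and travel with it to the disc file, saving and restoring the single FALPT value is enough to keep the chain intact across cycles.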
Special handling
There are occasions when the off-line partition cannot be quiesced. This could be caused by a card reader
jam, a printer being out of paper, etc., causing an
IORE associated with the I/O to remain linked in the
chain beyond some reasonable amount of time (presently 10 seconds). These conditions are relatively
infrequent; however, provision has been made for them
by advising the operator via the computer console
typewriter and an attention bell that the off-line task
is non-quiescent and requires attention.
The memory area actually required by PSS is less
than 6K. However, in order to initially load PSS into
memory, a large enough partition must be available to
furnish the operating system job scheduler routines
their required amount of core. This requirement is in
the order of 24K. Thus there is a pre-initialization
phase during which PSS changes the initial configuration (Figure 11) of 50K and 52K to 6K and 96K for
the PSS and off-line tasks, respectively (Figure 1).
The technique for doing this will not be detailed; however, the essential steps are as follows:
1. Referring to Figure 12, the initial PSS task area
is shown in three segments (B, C, D) and the
initial off-line task area is shown in one segment
(A). The PSS Pre-Initializer is loaded by the
operating system into area B.
2. In order to place the PSS main program in the
area where it can control storage, it must be
forced into area D. To achieve this, the task
area boundary block is modified to make area
D free and areas B and C unavailable.
3. The PSS main program is loaded into area D.
4. The off-line boundary block is modified to include areas B and C as free areas.
5. Control is passed to PSS main.
The configuration is now that of Figure 1.
Figure 11-Initial task core allocations
(From high to low memory address: Task 2 (on-line), 72K; Task 3 (on-line), 40K; Task 4 (PSS), 50K; Task 5/6 (off-line/LPOL), 52K; Task 1 (operating system nucleus), 42K.)
Figure 12
(Memory map showing Task 2 (on-line), 72K; Task 3 (on-line), 40K; the initial 50K PSS task area in segments B, C and D, with the 6K PSS task in D and the pre-initialization program; the initial 52K off-line task area A; and the 42K operating system nucleus. The upper boundary of the PSS task area and its lower boundary, both initially and after pre-initialization, are marked.)
CONCLUSION
Implementation of PSS has effectively added 96K of
additional processor memory to the real-time system
of which it is an integral part. This, coupled with the
facility to process off-line tasks while having an available stand-by on-line task, has greatly enhanced the
capability of the system. The application of PSS has
effected a maximal utilization of computer resources
by the system.
REFERENCES
1 IBM System/360 operating system control blocks
Form No C28-6628
2 IBM System/360 operating system input/output supervisor
Program Logic Manual Form No Y28-6616
3 IBM System/360 operating system control program with MFT
Program Logic Manual Form No Y27-712~
4 IBM System/360 operating system fixed task supervisor
Program Logic Manual Form No Y28-6612
Structured logic
by R. A. HENLE, I. T. HO, G. A. MALEY
and R. WAXMAN
IBM Components Division
Hopewell Junction, N.Y.
INTRODUCTION
Large-scale integration for computer applications
has been predicted for several years, but close examination shows that the progress has been uneven. Memory
designers continually demand higher levels of integration for larger and faster memory systems, and
new memory concepts are being developed to further
exploit the characteristics of large-scale integration.
The one-thousand-circuit chip will become nothing
more than a milestone.
But what of the logic area? Here, we struggle along
hoping to find some high-volume applications for chips
with a mere fifty circuits. When we design a medium-sized machine we find that so much unit logic is required that the average level of integration falls below
ten. Orderly memory and random logic integrated
circuit fabrication procedures are growing so different
that thought is being given to building different types
of manufacturing facilities. This represents a rather
drastic approach and in the authors' opinions may
prove unnecessary.
The success to date in memory is encouraging, for
it gives direction to logic. Memory products should
therefore be examined critically for they may well
hold the key to success for logic products. The salient
features of a chip used in a memory product are:
• Regularity. Memory arrays are regular in components and wiring. The layout geometry is well
defined and can be highly optimized for total
chip utilization.
• Low Power. Memory systems are designed and
partitioned so that all circuits on a chip do not
dissipate maximum power at the same time.
• Well-Defined Function. The memory chip designer knows exactly how his chip fits into the
entire memory system. He therefore can optimize on a high level. As examples, he uses special
circuits for the latch functions and uses decoders redundantly to save pads.
• Volume. While the initial memory chip design
is quite complex, the volume requirement makes
the initial design cost nearly negligible. With
this ground rule the chip can be highly engineered,
and nearly an order-of-magnitude improvement
can be expected and obtained.
Structured logic, or array logic as it is sometimes
called, is an attempt to design logic with more of the
characteristics of memory. Many unsuccessful starts
have taken place, but we shall discuss some of the
more successful efforts. We shall also add some thoughts
of our own, but it should be pointed out that the problem is far from solved.
Logic arrays
The basis of all array logic is a matrix of elements
with programmable interconnections. Diode structures
have been proposed in the past, and a matrix of common collector transistors is of recent interest. The
transistor array is programmed in the factory by
connecting or not connecting the emitter of each
transistor to a common line. (See Figure 1.) We shall
use transistor arrays in our examples, for that is what
we have been working with, but diode arrays should
not be ruled out.
Figure 1-A transistor array
The ROS
The read-only store (ROS) array in its simplest
form uses two decoders to feed the array: one feeds
the horizontal lines and the other the vertical lines,
as shown in Figure 2. A particular grid position in the
array is selected by activating the appropriate horizontal and vertical decoder lines. The addressed cell
of the array is located at the intersection of the two
activated lines. If the emitter at this address is connected to the horizontal decoder line, then a 1 has
been programmed into this particular cell in the array.
If the emitter is unconnected, a 0 is said to be programmed into the array. The presence of the programmed 1 or 0 is sensed at the output when that
particular cell is addressed. The horizontal output lines
are dot ORed together to produce one common output
line, as shown in Figure 3.
Conceptually, the ROS is related directly to a
Karnaugh map, one bit position in the array for each
square in the appropriate Karnaugh map. Figure 4
depicts the four-variable K-map that relates to the
ROS of Figure 2. This relationship proves the universality of a ROS, for any Boolean function that
can be K-mapped can be implemented directly. Universality is the feature of the ROS chip most often
described as an asset, but in practice it is seldom useful except in code translators. The Boolean functions
used in the design of any computer are definitely not
random and not evenly distributed among all possible functions of n variables. This fact is well documented in the many failures with other universal
logic blocks (ULB's). The real problem with the ROS
array is that it doubles in size each time an input
variable is added. This doubling in size is necessary
to maintain the dubious value of being universal.
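The table-lookup nature of the ROS, and the doubling in size with each added input, can be illustrated with a short sketch. It models only the logical behavior (one stored bit per input combination), not the decoders or circuits; the function names and the parity function used as the stored pattern are illustrative choices.

```python
def make_ros(function, n_inputs):
    """Program a read-only store: one stored bit per input combination."""
    return [function(address) for address in range(2 ** n_inputs)]


def read_ros(ros, address):
    """Address a cell and sense the programmed 1 or 0."""
    return ros[address]


def parity(address):
    """Example function: 1 when the address has an odd number of 1 bits."""
    return bin(address).count("1") % 2


ros4 = make_ros(parity, 4)
sizes = [len(make_ros(parity, n)) for n in range(1, 6)]
```

Any function of four variables fits in the sixteen cells of ros4, which is the universality property; the sizes list shows the cost, with the cell count doubling for each added variable.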
Figure 2-Read-only store
Figure 3-Read-only-store circuits
Figure 4-Karnaugh map
The ROAM
The read-only associative memory (ROAM) is a
matrix of common collector transistors that may be
programmed by connecting or not connecting the base
of each transistor to a common line in its own column
(Figure 5). The emitters of each row are commoned
and feed the emitter of an output transistor. Each
row of array transistors and the associated output
transistor form a current switch.
Through phase splitters, each input variable has
both true and complement lines available to the array.
Hence, each variable controls a true line and a complement line (column) in the array. This gives rise
to the word "associative" in the name. By programming each row in the array to a particular pattern
of 1's and 0's, the input word pattern will "associate"
(compare) with the appropriate row in the array. If
there is no match, the outputs will remain logical zeros.
If at least one row has a pattern the same as the input
pattern, there will be a logical one output on that
horizontal line (row).
Figure 5-Read-only associative memory
To program the array, each base is tied to a true
line (column), a complement line (column), or is
left floating. Thus, for a base tied to a true line, a 1
on that input line will yield a 1 at the emitter and a
1 at the output, since the row of emitters effectively
forms a DOT-OR (positive logic). Bases tied to a true
line are equivalent to a logical 1, since a 1 at that input causes a 1 at the output.
Conversely, a base tied to a complement line is
equivalent to a logical 0. A 0 at a particular input
raises the complement line of the phase splitter,
thereby raising to the 1 level all emitters of transistors
in that column that have their bases tied to the complement line (column).
If the base is left floating, that array grid position
is effectively a DON'T CARE. That is, the output
line will not be raised to 1 by either a 1 or 0 at that
transistor's column input.
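The programming rules above can be sketched as a small logical model. This is only an illustration of the "associate" behavior, not of the transistor circuit: the function name `roam_outputs` and the pattern alphabet (`1` for a base tied to a true line, `0` for a base tied to a complement line, `-` for a floating base) are assumptions made for the example.

```python
def roam_outputs(rows, inputs):
    """Return one output bit per row of an idealized ROAM array.

    rows   -- list of pattern strings, one character per input variable:
              '1' = base tied to the true line,
              '0' = base tied to the complement line,
              '-' = base left floating (DON'T CARE).
    inputs -- string of '0'/'1' characters, one per input variable.
    """
    outputs = []
    for pattern in rows:
        # A row "associates" with the input word when every programmed
        # position agrees with the corresponding input bit.
        match = all(p == "-" or p == bit for p, bit in zip(pattern, inputs))
        outputs.append(1 if match else 0)
    return outputs


if __name__ == "__main__":
    rows = ["11-", "-11", "1-1"]
    print(roam_outputs(rows, "110"))
```

With no matching row, every output stays at logical zero, which mirrors the behavior described in the text.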
Figure 6 illustrates the implementation of an adder
position with SUM and CARRY outputs using a
ROAM array. A black triangle connecting a vertical
line and a horizontal line indicates a base connection;
lack of a black triangle indicates a floating base. Note
that if a true line is connected, then the complement
line is not connected, and vice versa for each array
grid position. Thus, at most, only 50 percent of the horizontal and vertical intersections will ever be used.
To understand the ROAM conceptually and relate
it to the Karnaugh map, it is convenient to think in
terms of negative logic. Thus, down levels are logical
1, the commoned emitters of each row form a DOT-AND (all emitters down results in a down level, any
emitter up results in an up level), and dotting the output
transistors results in a DOT-OR.
Each row of the ROAM represents a term of a
logical expression in the sum-of-products form. The
logical expression CARRY = B . C + A . B + A . C
is in sum-of-products form, and B . C, A . B, and
A . C are each terms of the expression. Each term
may be implemented on one row of the ROAM. For
example, Figure 6 illustrates the implementation of
the CARRY function. Note that the A true and B
true columns are both connected to a transistor base
in the second row of the ROAM array, yielding the
term A . B. The three rows B . C, A . B, and A . C
are DOT-ORed at the output to yield B . C + A .
B + A . C = CARRY. In forming the term A . B,
the variable C does not have its true or complement
column line connected to a base. CARRY is 1 if A is
1 and B is 1 regardless of the value of C.
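As a check on the adder position, the rows can be written as patterns and compared with ordinary arithmetic. This is a sketch under stated assumptions: the variable order is A, B, C, the helper names are hypothetical, and the SUM rows are the four single-square terms of a full adder (the inputs containing an odd number of 1's).

```python
from itertools import product

# Row patterns in the order A, B, C; '-' marks a floating base.
CARRY_ROWS = ["-11", "11-", "1-1"]           # B.C, A.B, A.C
SUM_ROWS = ["100", "010", "001", "111"]      # odd number of 1's


def row_matches(pattern, bits):
    """True when every programmed position of the row agrees with the input."""
    return all(p == "-" or p == b for p, b in zip(pattern, bits))


def dot_or(rows, bits):
    """OR together the row outputs, as the dotted output transistors do."""
    return int(any(row_matches(row, bits) for row in rows))


def adder_position(a, b, c):
    """Return (carry, sum) for one adder position built from the rows above."""
    bits = f"{a}{b}{c}"
    return dot_or(CARRY_ROWS, bits), dot_or(SUM_ROWS, bits)


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        carry, total = adder_position(a, b, c)
        assert 2 * carry + total == a + b + c
    print("rows reproduce binary addition for all 8 input words")
```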
Each term of a logical expression in sum-of-products
form is an "implicant" on a Karnaugh map. An implicant is formed by looping the 1's in the Karnaugh
map and "reading" the loops from the map. Loops
can only contain adjacent 1's and the number of ones
in a loop must be equal to 1, 2, 4, ... , a power of 2.
This results from the fact that adjacent squares on a
Karnaugh map always differ only by the value of
one variable. Two squares looped yields a term with
n-1 variables (n = number of variables), four squares
looped yields a term with n-2 variables, etc. Thus,
each implicant requires one row in a ROAM. The
bigger the loop of 1's, the fewer connections need be
made in that row. The complete expression is formed
by DOT-ORing the rows, which is the same as ORing
the implicants.
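The relation between loop size and row connections can be stated as a short calculation. The sketch below assumes the same hypothetical pattern notation as before (`-` for a variable dropped from the term); a loop of 2^k squares drops k variables and so needs n - k base connections.

```python
def implicant_cost(pattern):
    """Describe one ROAM row given as a pattern of '1', '0', and '-'.

    A '-' marks a variable that the Karnaugh-map loop eliminated, so the
    loop covers 2**k squares when k variables are dropped.
    """
    n = len(pattern)
    dropped = pattern.count("-")
    return {"squares_covered": 2 ** dropped, "connections": n - dropped}


if __name__ == "__main__":
    for row in ("11-", "1--", "101"):
        print(row, implicant_cost(row))
```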
The example of Figure 6 uses three loops of two
1's each to form the CARRY. The SUM is formed
by four loops of one 1 each. In this case three connections must be made in each of the four required rows
to obtain

$$\mathrm{SUM} = \bar{A}\,\bar{B}\,C + \bar{A}\,B\,\bar{C} + A\,B\,C + A\,\bar{B}\,\bar{C}$$

Figure 6-ROAM adder position

TABLE I-Bits required for n variables in ROS and ROAM arrays

                                        VARIABLES
                              2     3     4     5     6     7     8      n
ROS (always universal)        4     8    16    32    64   128   256    2^n
ROAM rows (implicants)
  2                           8    12    16    20    24    28    32    4n
  3                                18    24    30    36    42    48    6n
  4                                24    32    40    48    56    64    8n
  5                                      40    50    60    70    80   10n
  6                                      48    60    72    84    96   12n
  7                                      56    70    84    98   112   14n
  8                                      64    80    96   112   128   16n
  9                                            90   108   126   144   18n
  16                                          160   192   224   256   32n
  2^n/2 rows (universal)      8    24    64   160   384   896  2048   n·2^n
In contrast to the ROS, the ROAM can have universal
capability with only one-half the number of
rows as the ROS needs bits for the same number of
variables. Moreover, the ROAM does not need to be
universal to be useful, thus allowing even further
reduction in size. Table I illustrates the difference
brought about by the ROS requiring one bit per K-map
position and the ROAM requiring one row per K-map
implicant.
Historically, computer functions are composed of
about four implicants or terms. The chart shows that
a four-implicant function is cheaper to implement
with a ROAM than with a ROS when the function
contains six variables or more. When the decoders
required for the ROS are considered, even four-variable functions with four implicants are more economical in ROAM than in ROS.
Two useful formulas to compare ROS bits required
with ROAM bits required for a given function are:

$$\text{ROS bits} = 2^n, \qquad \text{ROAM bits} = 2In,$$

where n = number of variables, I = number of implicants. Thus, it is more economical to build a function
with the ROAM when $2In < 2^n$. This does not
consider the cost of the ROS decoders, which add a
factor to the inequality.
If we assume that the decoders for n even take
$2n \cdot 2^{n/2}$ bits, and for n odd take $(n+1)\,2^{(n+1)/2} + (n-1)\,2^{(n-1)/2}$ bits, then the cases for which ROAM
should be used are:
1. n even: $2In < 2^n + 2n \cdot 2^{n/2}$;
2. n odd: $2In < 2^n + (n+1)\,2^{(n+1)/2} + (n-1)\,2^{(n-1)/2}$.
Thus, ROAM is more economical than ROS in most
practical problems.
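The two inequalities can be evaluated directly. The following is a minimal sketch of the bit counts as stated in the text; the function names are hypothetical, and the decoder sizes are the ones assumed in the paragraph above rather than measured values.

```python
def ros_bits(n):
    """Bits in a read-only store for n variables (one bit per K-map square)."""
    return 2 ** n


def roam_bits(n, implicants):
    """Bits in a ROAM: one row per implicant, a true and a complement column per variable."""
    return 2 * implicants * n


def ros_decoder_bits(n):
    """Decoder size assumed in the text for n even and n odd."""
    if n % 2 == 0:
        return 2 * n * 2 ** (n // 2)
    return (n + 1) * 2 ** ((n + 1) // 2) + (n - 1) * 2 ** ((n - 1) // 2)


def roam_is_cheaper(n, implicants, include_decoder=False):
    """Apply the inequality 2In < 2^n, optionally adding the ROS decoder cost."""
    ros_total = ros_bits(n)
    if include_decoder:
        ros_total += ros_decoder_bits(n)
    return roam_bits(n, implicants) < ros_total


if __name__ == "__main__":
    for n in range(2, 9):
        print(n, ros_bits(n), roam_bits(n, 4), roam_is_cheaper(n, 4),
              roam_is_cheaper(n, 4, include_decoder=True))
```

For four implicants the crossover without decoders falls at six variables, and with decoders at four variables, in agreement with the statements made about Table I.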
A realistic example of control logic for a small machine model has been implemented using the ROAM
array. Table II gives a comparison of the number of
bits required for a ROAM implementation versus the
number of bits required for a ROS implementation.
Note that the ROAM is significantly more economical.
A partitioning of functions could have been devised
for the ROS implementation. The ROAM would still
be more economical than the ROS, however, especially
when one considers the additional wiring complication
of connecting several small ROS arrays and the additional design time required to effectively partition
the functions.

TABLE II-ROS vs. ROAM: a control logic example

TOTAL NUMBER OF VARIABLES ...........................  14
TOTAL NUMBER OF FUNCTIONS ...........................   6
TOTAL NUMBER OF IMPLICANTS ..........................  12
    One 7-implicant function of 13 variables
    Four 1-implicant functions of 7 variables
    One 1-implicant function of 11 variables

ROAM
    ARRAY SIZE: 28 X 12 .............................     336 BITS

ROS 1
    ARRAY SIZE/FUNCTION: 2^14 .......................  16,384 BITS
    6 ARRAYS FOR 6 FUNCTIONS: 6 X 16,384 ............  98,304 BITS
    SHARED DECODER ..................................   3,584 BITS
    TOTAL BITS ...................................... 101,888

ROS 2
    ARRAY SIZE FOR 13 VARIABLES: 2^13 ...............   8,192 BITS
    ARRAY SIZE FOR 7 VARIABLES: 2^7 X 4 .............     512 BITS
    ARRAY SIZE FOR 11 VARIABLES: 2^11 ...............   2,048 BITS
    SHARED DECODER ..................................   3,584 BITS
    TOTAL BITS ......................................  14,336
The optimum size for a ROAM has not been determined, but chips with at least 512 bits on them are
desirable. This capacity would provide between eight
8-variable, 4-implicant functions, and one 64-variable,
4-implicant function (an extreme case, needless to say)
on a chip. The practicality of building and using such
a chip is yet to be determined.
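The two capacity figures follow from the ROAM bit count of 2In per function. A minimal sketch, with a hypothetical helper name and assuming functions on one chip share no rows or columns:

```python
def functions_per_chip(chip_bits, n_variables, implicants):
    """How many independent functions of the given size fit in chip_bits.

    Each function is assumed to need 2 * implicants * n_variables bits
    and to share nothing with its neighbours on the chip.
    """
    return chip_bits // (2 * implicants * n_variables)


if __name__ == "__main__":
    print(functions_per_chip(512, 8, 4))
    print(functions_per_chip(512, 64, 4))
```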
The SLT array
Arrays can be designed so that they may be used for
direct replacement of present logic. The SLT array
performs the function AND-OR-INVERT in negative logic or OR-AND-INVERT in positive logic
and can be used directly to replace SLT logic. While
direct replacement of random logic with array chips
may prove to be the wrong approach in the long run,
it may well be the only way to get array logic started.
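The statement that one array is AND-OR-INVERT in negative logic and OR-AND-INVERT in positive logic is an instance of De Morgan duality, and it can be checked exhaustively. The sketch below assumes a hypothetical gate with two groups of two inputs; the real array dimensions are not given in the text.

```python
from itertools import product


def oai_positive(a, b, c, d):
    """OR-AND-INVERT on voltage levels, reading a high level as logical 1."""
    return int(not ((a or b) and (c or d)))


def aoi(a, b, c, d):
    """AND-OR-INVERT on logical values."""
    return int(not ((a and b) or (c and d)))


def negative_logic_view(gate):
    """Reinterpret a gate defined on levels so that a low level means logical 1."""
    def viewed(*logical_inputs):
        levels = [1 - x for x in logical_inputs]   # logical 1 -> low level
        return 1 - gate(*levels)                   # low output level -> logical 1
    return viewed


if __name__ == "__main__":
    same = all(
        negative_logic_view(oai_positive)(*bits) == aoi(*bits)
        for bits in product((0, 1), repeat=4)
    )
    print("same hardware, two readings:", same)
```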
The SLT array has the same advantages over ordinary logic that all arrays have: orderliness of design
and layout, and high density with relatively low cost.
In addition, this type of array has a higher bit usage
than other arrays, since it more closely resembles the
familiar random logic, functionally. The SLT array
does not have decoders or phase splitters on its input
lines, as do other types of arrays. This makes the array
less universal than even the ROAM array but more
effective for random logic. It is fair to say that arrays
of this type make poor code translators just as SLT
logic builds poor translators. It is difficult to believe
that any array will be effective in both random logic
and code translation problems.
As already stated, the ROAM array has specific
applications to decoders and associative memory
problems. The SLT array may very well be the element required to do general logic design. The reason
for this is the placement of the inverters as shown in
Figure 7. This movement of the inverters to the output lines may appear a minor modification, but it
should be remembered that there has never been a
useful logic block with inverters on the input lines. It
may pay to have both true and complemented outputs from a current switch logic block. Figure 8 shows
a full adder implementation in SLT logic and in an
SLT array.
Array-driving arrays
The SLT array in Figure 8 demonstrates one necessary feature of an array that has yet to be discussed:
Any logic array must be able to drive any other array
in the same family, including itself. Note in Figure
8 the CARRY output fed back into the array. This
line probably will be an external wire. This technique
is required since it is in effect Boolean factoring, a
proven necessity. This type of feedback is also needed
to produce sequential circuits, giving memory to the
arrays.
Figure 7-SLT array
Figure of merit
It is less meaningful to compare array logic with
random logic in each individual term of power consumption, propagation delay time, and silicon area,
since one can usually be traded for the other, such as
power with delay. Instead a comparison is made of
their figures of merit, chosen to be the product of
power consumption P, delay time T, and silicon area
A, all with weight function of one (PTA). Since no
isolation wall is needed between collector transistors,
a ROS or ROAM cell including appropriate interconnections can be laid out on a silicon chip area equivalent to 20-25 percent of that occupied by a transistor
that needs isolation walls. As shown in Figures 5 and 7,
the delay time of an array is two levels of current
switch emitter follower (CSEF) independent of the
number of inputs. For sophisticated functions, such
as the one-bit adder shown in Figure 8, more than two
levels of logic may be required.
Figure 8-SLT full adder position
Some typical comparisons of array logic and random
logic include the sampling design of array logic chips
to perform the same function a random logic chip
would. This comparison helps to partially discover
the merit and the limitation of the array logic. In
comparison with random logic chips that perform
sophisticated functions or have two or more cascading
levels of CSEF's, array logic chips have superior
PTA figures.
CONCLUSIONS
Various array configurations described here suggest
that random logic may be implemented by use of an
array of programmable crosspoints. Comparisons of
array logic with conventional logic indicate that in
many cases the PTA figure of merit is superior for
arrays. The most significant problem with arrays appears to be the limited useful size of a single array,
and the difficulty in standardizing a particular array
configuration. As a minimum achievement at this
time, it appears that arrays will be useful in development of complex functions within a silicon chip.
Array logic will not eliminate the need for a circuit
designer in the future, since specialized designs will
be needed to optimize circuit and component technology. In some of these design cases, the importance of
array logic techniques will be obvious, but in others
it will not be.
At this point, array logic does not appear to strongly
affect the system designer's approach to machine design, and a knowledge of array logic may never be required.
In the future, however, to the extent that array
logic techniques influence the design and optimization
of highly efficient functions, the system designer's
work will be significantly influenced by progress made
in developing array logic techniques.
BIBLIOGRAPHY
1 R RICE
Computers of the future
IBM Research Report RC-151 April 20 1959
2 R RICE
Systematic procedures for digital system realization from logic
design to production
Proc IEEE Vol 52 12 1691-1702 Dec 1964
3 R C MINNICK
Application of cellular logic to the design of monolithic digital
systems
Microelectronics and Large Systems
Spartan Books Wash D C 1965 225-247
4 L C HOBBS
Effects of large arrays on machine organization and hardware
software tradeoffs
Proc FJCC 1966 Vol 29 89-96
5 R C MINNICK
Cutpoint cellular logic
IEEE Transactions on Electronic Computers Dec 1964
6 W E KING III A GUISTI
Can logic arrays be kept flexible?
AFCRL Report 65-547 Aug 1965
7 D C FORSLUND R WAXMAN
The universal logic block (ULB) and its application to logic
design
IEEE Conference Record 1966 Seventh Annual Symposium
on Switching and Automata Theory 236-250
8 S S YAU C K TANG
Universal logic circuits and their modular realization
Proc SJCC 1968
9 R C MINNICK
A survey of microcellular research
Jour ACM Vol 14 2 April 1967 203-241
Characters-Universal architecture
for LSI
by F. D. ERWIN and J. F. McKEVITT
Hughes Aircraft Company
Fullerton, California
BACKGROUND
Since the advent of LSI technology, several schemes
have evolved for the utilization of large arrays to their
full potential. A common and straightforward approach
involves the designer restricting himself to the equipment being designed at the moment. Faced with only
a limited set of problems, it is not difficult to specify
a small number of LSI array types which will efficiently
complete the design. While the results are quite encouraging for specific cases,1 the drawbacks of any mass
adoption of these techniques are obvious. This, the
so-called "custom approach," would require the semiconductor manufacturer to be responsive to each customer with numerous low-output production runs of
highly specialized devices. The per-unit cost to the
user, for his own efforts as well as those of the manufacturer, would be quite high due to the inability to
spread initial costs over many devices. In addition,
the complexity of 100-gate-plus arrays is such that it
is difficult to substitute one for another (with efficient
results). This would severely limit the off-the-shelf
capabilities of both user and manufacturer.
An obvious solution to these problems is the introduction of a small set of standard LSI chips. Semiconductor suppliers, making tentative advances into
LSI product marketing, have already proposed such
devices as adders, counters, and shift registers. However, this does not represent the solution to the general
problem. A design heavily committed to the use of these
devices must fall back on MSI or standard IC for the
large remainder of the circuitry. The reason is that
adders, counters, registers and other orderly, well-defined areas represent the regions of the system with
the highest gate-to-pin ratios. After these portions are
lifted out of the system, the remainder is characterized
by very low gate-to-pin ratios (notably control and
data routing functions). Unable to satisfy the LSI
design criteria of high gate-to-pin ratios any longer,
the designer must look to more standard components.
Unfortunately, any proposed solution to the LSI
partitioning problem which lacks a total system approach tends to drift towards this pitfall.
Researchers striving towards partitioning for total
or near-total LSI implementation tend to diverge
along one of two conceptual paths: bit-slicing and
functional partitioning. To illustrate the difference,
consider the data portion of the computer. In functional
partitioning one may specify an adder as one LSI array, registers as another, a shift register as a third, and
so forth. On the other hand, in bit-slicing one would
design an LSI array consisting of a combined one- or
two-bit adder, registers, shift registers, etc., then build
up his system from this chip type according to the desired word length.
The bit-slice approach has resulted in some notable
advantages, particularly the ability to achieve very
high gate-to-pin ratios and implement systems using
a small number of different array types.1,2 However,
bit-sliced modules have the basic flaw of being system-dependent, a drawback described by Pariser in an
early paper.3 This means that behind such bit-slicing
approaches there lie systems, real or implied, for which
the resulting arrays are most efficient. An attempt to
apply the arrays to a significantly different system
results in a poor design. Considering the types of bit-slice devices being proposed, inefficiencies would most
often be manifest in the design of a simple device in
which the majority of the gates of the array intended
to accomplish complex functions are wasted. Although
this may be acceptable in some situations, it is unlikely that it would satisfy the strict requirements of
size, weight, power, and reliability imposed by aerospace and military systems.
It is the contention of this paper that a judicious
partitioning of digital systems in general, divorced
from bias towards any particular system, results in a
set of LSI devices that can entirely implement many
different computer systems of varying functional complexities and word lengths.
The resulting group of arrays, referred to as a
"character set" and each one individually as a different
"character", is sufficiently small in number (10), with
each type having acceptable size and gate/pin ratio,
to be considered acceptable and desirable in view of its
wide range of applications. These building blocks are
referred to as characters because of the metaphor that
may be made between the building blocks and characters of the alphabet (letters). Letters form words
to express the language whereas building blocks form
units to build the machine. In both cases a closed set
(of characters) is used to produce the desired end.
Although the character set is neither rigidly functionally-partitioned nor bit-sliced, it is biased towards
functional partitioning to give it the versatility to
efficiently implement both complex and simple digital
devices. As an approach, functional partitioning has
a detailed and successful background.3,4 Bit-slicing
considerations give the character set its ability to
implement systems of varying word lengths.
In addition to providing the user with a standard
set of chips to implement many different digital machines, the completeness of the approach (the ability
of the characters to implement the whole machine)
relieves the user of the burden of logic design. These
tasks are reduced to the selection of character types
and word lengths.
Introduction to the character set
A universal conclusion among LSI researchers is
that control functions are more difficult to modularize
than functions related to data operations. Micromemory control technique was chosen as the solution
for LSI implementation for several reasons. A micromemory, meaning here a read-only solid-state memory
with its sequencer and instruction register, is easily
partitioned into the large modules necessary for LSI
implementation. Control functions in this form are
then amenable to reproduction in large quantities
of identical units. Also, design with control centered
in one level of micro memory is more orderly and
straightforward.
The micro memory has been provided with a relatively sophisticated microprogram instruction repertoire. This means that the microprogram contains the
essence of the machine's major mathematical functions, such as multiply and complex sequencing. This
is desirable since it represents an efficient use of hardware for these purposes and also reduces the number of
different array types necessary. Also, a versatile repertoire leaves the designer free to make units which
operate as simply or as complexly as desired. The
degree of flexibility which this repertoire gives the
character set is a major factor in its success. It should
be stressed that the "micro operations" of the character
set are as important a factor as its logic design. This
fact, a critical one in all LSI solutions committed to
micromemory control, cannot be overemphasized.
Interest in designing a character set at Hughes was
concurrent with the development of an advanced computer system. The character set itself was developed
with the ultimate objective of implementing all future
Hughes digital data processing equipment with a common family of LSI circuits.
The outcome of that original effort revealed that
computer structures in general are frequently ordered,
or at least amenable to such ordering, as shown in
Figure 1.
The divisions of Figure 1 are functional. That is,
regardless of the hardware characteristics, the computer
philosophy is such that its functions may be identified,
separated, and diagrammed as shown in the figure.
From Figure 1 came the concept of the functional
character set. With the fundamentals of LSI design
in mind, logic was designed to accomplish each computer
function indicated by the picture. Each unique LSI
chip type which resulted was referred to as a different
character type and given an identifying name and
number. Figure 2 shows the character set which resulted from the logic design according to the concepts
outlined in Figure 1.
Figure 1-Computer functional organization (blocks: Boolean logic functions, minor and major; computer control functions; input/output functions; fast access register storage; auxiliary devices; core memory)
Figure 2-Functional character set
The character set and repertoire have been through
several improvement cycles and used in the test implementation of a NASA computer to be discussed
later. Current plans include test design of the H4400
(a new Hughes computer) with the improved character
set, implementation of the character set with high
speed MOS circuits, and construction of one computer
using the characters.
These ten LSI characters alone provide the entire
hardware complement for the logic of a broad range of
computers and digital equipment. No extra logic in
the form of either IC, MSI, or custom LSI need be
added to the characters to finish the job. An important
by-product of this is that the user need never consider
logic design. His tasks are reduced to selection of the
necessary characters and the writing of the appropriate
microprograms for them. In fact, it is possible for the
character set to fit into a realistic total design automation procedure as discussed later.
Description of the character set
This section describes each of the ten characters.
They are summarized below for reference.
G1 Register storage
L1 General logic
L2 Arithmetic logic
L3 Input/Output
M1 Micromemory counter
M2 Micro-instruction register
MM Micro-array
P1 Scratch pad memory
P2 Up/Down counter
P3 Switch
Characters of the same letter are logically grouped
into a common unit as illustrated in Figure 3.
Figure 3-Typical functional character configuration
G1 character
The G1 character provides the bulk of storage for
operands of the microprogram. Each character contains four registers of eight bits each accompanied by
reading and writing selector gates. The storage element
is provided with simultaneous dual reading and
writing capability. The storage flip flop itself is designed
for minimum read after write delay.
Each of the two input busses is common to all
registers and carries to the G1 character eight lines
per bus, one line from each bus for each bit of the
register. Input data selection is accomplished at the
memory element by a coincidence of positive information on a particular input bus and register selection
for that bus by destination decoding logic within the
character. The destination decoding logic is duplicated
to provide for writing from the two input busses into
the same character under control of two different microcommands. As will be illustrated later, this is a key
factor for the machine expandability property of the
character set as it allows G1 to form a data path link
between individual logic units under control of up to
two different micromemories. Different registers in
the character may be written into simultaneously.
Reading of the register is provided by dual source
decoding logic which gates data to independent dual
output busses. This duality provides for information
from any two registers to be simultaneously placed on
two output busses. The conceptual structure of the G1
character is shown in Figure 4.
Several G1 characters placed in parallel provide
registers of more than eight bits in length.
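The register organization of the G1 character can be modelled in a few lines. This is a behavioral sketch only: the class and method names are hypothetical, and the handling of two ports addressing the same register in one step (the second port wins) is an assumption, since the text states only that different registers may be written simultaneously.

```python
class G1Character:
    """Toy model of one G1 character: four 8-bit registers, dual write, dual read."""

    def __init__(self):
        self.registers = [0] * 4

    def write(self, port_a=None, port_b=None):
        """Each port is None or a (register_index, value) pair from one input bus."""
        for port in (port_a, port_b):
            if port is not None:
                index, value = port
                self.registers[index] = value & 0xFF   # keep eight bits

    def read(self, source_a, source_b):
        """Place two registers on the two output busses at the same time."""
        return self.registers[source_a], self.registers[source_b]


def read_wide(characters, register):
    """Combine the same register of several parallel characters, low byte first."""
    word = 0
    for position, character in enumerate(characters):
        word |= character.registers[register] << (8 * position)
    return word


if __name__ == "__main__":
    low, high = G1Character(), G1Character()
    low.write(port_a=(1, 0x34))
    high.write(port_a=(1, 0x12))
    print(hex(read_wide([low, high], 1)))
```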
L1 character
The L1 character provides the basic logic functions
selectable by microprogram. In addition, input bussing
is provided for nine channels (eight bits/channel).
One channel of the bus is required for each G1, L2, or
L3 character connected to the L1 character. The
logic functions provided consist of the rotates, shifts
(logical), no-operation, complement, and incrementation. Also associated with the L1 character is the decoding logic for these logic operations. The type of
microprogramming used with the functional character
system relies heavily upon the fast and efficient manipulation of bits within the various operands. To this
end, shifts and rotates have been provided which execute from 1 to 31 positions in a single step (as opposed to serial operation). Incrementation is accomplished with the use of a logic register which may also
be used as a simple holding register. The L1 character
is eight bits wide and contains the following logic:
1. Bussing gates
2. Decoding logic
3. Rotate, shift, and complement logic
4. Incrementer
5. L register
6. Gating to output bus
In Figure 5 is shown a block diagram of the L1
character. Several L1 characters may be connected
together to form logic operations on words longer than
one byte. A limit of four bytes exists in order to maintain consistency of definition in the rotates and shifts.
Figure 4-G1 character block diagram
Figure 5-L1 character block diagram
Information entering the L1 card from the various
sources is bussed to form the input bus. Then it is
operated upon and the resultant is bussed to the output bus where it leaves the character or is optionally
stored in the L register (where it would thus be available
at the next micro-instruction time for use in the increment operation or as an "L" source).
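The L1 operations can be illustrated on a four-byte word, the stated limit. The sketch below is a functional model with hypothetical function names; it says nothing about how the single-step shift network is built, and the default width of 32 bits is chosen only to match the 1 to 31 position range.

```python
def rotate_left(word, positions, width=32):
    """Rotate a width-bit word left by the given number of positions."""
    mask = (1 << width) - 1
    positions %= width
    return ((word << positions) | (word >> (width - positions))) & mask


def shift_left_logical(word, positions, width=32):
    """Logical left shift: vacated low-order bits are filled with zeros."""
    return (word << positions) & ((1 << width) - 1)


def shift_right_logical(word, positions, width=32):
    """Logical right shift: vacated high-order bits are filled with zeros."""
    return (word & ((1 << width) - 1)) >> positions


def complement(word, width=32):
    """Bitwise complement within the word width."""
    return ~word & ((1 << width) - 1)


def increment(word, width=32):
    """Add one, wrapping around at the word width."""
    return (word + 1) & ((1 << width) - 1)


if __name__ == "__main__":
    print(hex(rotate_left(0x80000001, 1)))
    print(hex(shift_right_logical(0x80000000, 31)))
```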
L2 character
The L2 character provides the major arithmetic
functions used by the microprogram. The arithmetic
unit provides the 2's complement sum of the contents of the A and B registers. Addition is performed
with carry look-ahead, byte parallel. Control signals
may condition the adder to alternately provide either
of two special results: (a) a mod 2 addition instead
of full addition or (b) an input carry to the lowest order
bit for full addition (this forced carry in conjunction
with a negated operand accomplishes a 2's complement operand for subtraction). The L2 character
consists of two holding registers for the operands of
the adder, the adder itself, decoding and error logic,
and bussing gates. Figure 6 diagrams the L2 character
function-wise.
A typical arithmetic operation using the L2 character might proceed as follows: (1) first operand transferred to B register (from output bus), (2) second
operand transferred to A register, (3) after appropriate
delay access result and transfer out of L2 character via
the input bus. The error logic provides overflow and
carry-out information.
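The adder options described above can be sketched as follows. The function names and the eight-bit default width are assumptions; "mod 2 addition" is taken here to mean bit-by-bit addition without carries (exclusive OR), and the overflow test is the usual signed 2's complement rule rather than a description of the actual error logic.

```python
def l2_add(a, b, carry_in=0, mod2=False, width=8):
    """Return (result, carry_out, overflow) for one L2-style addition."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    if mod2:
        # Carry-free addition of each bit pair, i.e. exclusive OR.
        return a ^ b, 0, 0
    total = a + b + carry_in
    result = total & mask
    carry_out = total >> width
    sign = 1 << (width - 1)
    # Signed overflow: both operands differ in sign from the result.
    overflow = int(((a ^ result) & (b ^ result) & sign) != 0)
    return result, carry_out, overflow


def l2_subtract(a, b, width=8):
    """a - b as a + (negated b) with a forced carry into the lowest-order bit."""
    mask = (1 << width) - 1
    return l2_add(a, ~b & mask, carry_in=1, width=width)


if __name__ == "__main__":
    print(l2_subtract(5, 3))
    print(l2_add(0x7F, 1))
```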
Figure 6-L2 character block diagram
L3 character
The L3 character provides input/output capability
for the microprogram machine. For purposes here
input/ output includes not only the usual peripherals
but also main memory, scratch pads, real time clocks,
and P-characters, namely all elements of the computer
not directly controlled by the micromemory. The L3
character provides input gating for external devices: four buffered and three non-buffered channels. The
buffered-input gating may be controlled either by the
microprogram or the external I/O device itself. Four
I/O output channels are provided. Interrupt signal
storage and interrupt mask storage for four channels are
available. Parity generation and checking along with
odd/even control is provided for the four buffpre tl and the error type in cell m has not changed.
Further, the input and output leads of the cascade do
not fail.
I t is assumed that the 12 allowable cell functions for a
Maitra cascade are fI, f2, f3, f4, f5, f6, 17, fs, f9, flO, fn, f13,
and f14. (See Definition 1 for an explanation of the notation
Ii.) Seven allowable errors are assumed for each cell;
these are hb (s-a-l; stuck-at-one), fo (s-a-O; stuck-atzero), fl5-p (complementation where p is the cell
function), f12 (the input X), f3 (the complement of the
input X), flO (the input V), and f5 (the complement of
the input V). These seven errors consist of the two
failure types (s-a-O and s-a-l) usually assumed by
most fault diagnosticians augmented by f15-p, h2, fa, fIr'>
and fs. [Note that flO and i5 have different allowable
error sets; i.e., Ehu = (fr, i15, f5, f12, f3) and Ef5
(fr, f15, flO, f3,
i12).J
Definition 1.
follows:
The cell functions are numbered as
Xi Y i-I fo fl h f3 f4 f5 f6 17 /s /9 flO /n !I2 !I3 f14
Assumptions and definitions
Figure 1 illustrates the interconnection structure of a
Maitra cascade. 3 Every cell in the cascade is a two-input,
one output cell. It is assumed that the Boolean variables
applied to the cascade are numbered as illustrated on the
cascade shown in Figure 1. All testing of the cascade is
accomplished using only the input leads and the output
lead of each cascade (and of arrays). The ability to
measure the functional value produced by a cell by
means of probing a buss connecting two adjacent cells js
not assumed. To minimize the "uncertainties" (the
functional values between cells cannot. be measured and
the location of the error is unknown; therefore, the
functional values between cells are uncertain) involved
in testing cascades, it is assumed that cell n is tested first
(see Figure 1), then cell n-l, etc. If an error occurs in
cell n-j, its propagation may be stopped by one of cells
n-1, n-2, ... , n-j + 1. Once cell n is tested, it may be
set such that it transmits the output of cell n-1 to the
output terminal of the cascade. In this manner (under
certain error assumptions) the cells may be tested in the
following order until error location results: n, n-1, ... , 1.
The number of tests needed to test a cellular cascade is
O(n) *, where n is the number of cells in the cascade.
It is assumed that only one error (faulty cell) may
appear in a cascade. Also, the interconnections between
cells do not fail, and the error is time independent; i.e.,
* See Definition 7.
X_i Y_i-1 | f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
 0    0   |  0  1  0  1  0  1  0  1  0  1   0   1   0   1   0   1
 0    1   |  0  0  1  1  0  0  1  1  0  0   1   1   0   0   1   1
 1    0   |  0  0  0  0  1  1  1  1  0  0   0   0   1   1   1   1
 1    1   |  0  0  0  0  0  0  0  0  1  1   1   1   1   1   1   1
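The numbering of Definition 1 can be read as a binary encoding: the truth-table column of f_p, taken from the row (X, Y) = (1, 1) up to the row (0, 0), spells p in binary. The following minimal Python sketch illustrates that reading; the helper names cell_function and truth_table are hypothetical and are not taken from the paper.

```python
from itertools import product


def cell_function(p, x, y):
    """Value of f_p for inputs (x, y); row (x, y) selects bit 2*x + y of p."""
    return (p >> (2 * x + y)) & 1


def truth_table():
    """Rows (X, Y, f0, ..., f15) in the order 00, 01, 10, 11."""
    return [
        (x, y) + tuple(cell_function(p, x, y) for p in range(16))
        for x, y in product((0, 1), repeat=2)
    ]


if __name__ == "__main__":
    for row in truth_table():
        print(*row)
```

Under this encoding f12 reproduces X, f10 reproduces Y, f3 and f5 are their complements, and f0 and f15 are the constants, which matches the way these functions are described in the error assumptions.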
Definition 2. An error occurs in a cell whenever the
cell produces a function that is not the same as the
function specified for that cell.
Definition 3. G = (f1, f2, f4, f5, f6, f7, f8, f9, f10, f11, f13, f14).
Definition 4. I_p denotes (1, 2, 3, 4, ..., p).
Definition 5. The error function E is a mapping
from G x I_n to G, where E(f_i, j) = f_k denotes that cell j
was theoretically to produce f_i ∈ G but instead it
produced f_k ∈ G. Clearly, E(f_i, j) = f_i indicates that cell j
does not have an error occurring in it.
Definition 6. X* means either X or X', but not both.
Definition 7. O(n) means the same order of magnitude as n.
A necessary and sufficient condition for fault
location in cascades
Location of a single fault in a cascade is considered in
this section. A necessary and sufficient condition for
location of a single fault in a cascade is proven. The
proof of Theorem 1 can be utilized to obtain an algorithm
to locate faults in a cellular cascade or array.
Theorem 1. Given a cascade with n cells, then the error
can be located if and only if for every i ∈ I_n - (1)
(1) E(f14, i) ≠ f15, f12
(2) E(f11, i) ≠ f15, f3
(3) E(f8, i) ≠ f0, f12
(4) E(f2, i) ≠ f0, f3
(5) E(f6, i) ≠ f9, f12, f3
(6) E(f9, i) ≠ f6, f12, f3
(7) E(f13, i) ≠ f12, f15
(8) E(f7, i) ≠ f3, f15
(9) E(f4, i) ≠ f0, f12
(10) E(f1, i) ≠ f0, f3
(11) E(f10, i) ≠ f0, f15, f5
(12) E(f5, i) ≠ f10, f0, f15
Proof:
The proof is by induction. Clearly,
the theorem is true for the case n = 1.
Assume that the theorem is true for a
positive integer k and consider a cascade
with k + 1 cells. Given the cell function
for cell k + 1, if it can be shown that the
error can be located in cell k + 1 if and
only if assumptions (1) through (12) are
valid for cell k + 1, then the proof is
complete.
Figure 3-Test decision map for f14
Figure 4-Test decision map for f11
Figure 5-Test decision map for f8
Figure 6-Test decision map for f2
Figure 7-Test decision map for f6
Assume conditions (1) through (12).
This part of the proof is now completed in
Figures 3 through 14. Note that if C_0,
C_1, ..., C_i are used to set Y_i = C at time
t_1, then if Y_i = C is wanted at time t_2 and
C_0, C_1, ..., C_i are utilized again, Y_i is the
same value as it was at t_1; however, all that
can be said about Y_i is that it is either C
or C', but not both. This fact is used in the
proof of this theorem. In the figures with
the circled function number it may be
necessary to add one more test to determine
whether the cell is in error or is
receiving the complemented sequence.
Figure 8-Test decision map for f9
Figure 9-Test decision map for f13
Figure 10-Test decision map for f7
Figure 11-Test decision map for f4
Figure 12-Test decision map for f1
Figure 13-Test decision map for f10
Figure 14-Test decision map for f5
The proof of the other half of the
theorem will be by contradiction. Assume
that the error can be located, but that the
restrictions (1) through (12) are not
needed. Then it can be verified that the
following pairs of conditions give the same
output at the cascade's terminal. Since the
two conditions give the same outputs, the
error cannot be located, which is a
contradiction of the assumption; therefore,
the assumption that the restrictions are
not needed is incorrect and the proof is
completed. After (1) an abbreviated notation is used. Note:
Using the Test Decision Maps and the contradiction part
of this proof one can actually determine
the values of Y_i-1.
(1) Y_k = 1, 1, 1 and E(f14, k + 1) = f14
    are equivalent to Y_k = 0, 1, 0 and
    E(f14, k + 1) = f15 at the cascade's
    output terminal.
    Y_k = 0, 0, 0 and E(f14, k + 1) = f14
    are equivalent to Y_k = 0, 1, 0 and
    E(f14, k + 1) = f12 at the cascade's
    output terminal.
(2) Y_k = 0, 0, 0 and E(f11, k + 1) = f11;
    Y_k = 0, 0, 1 and E(f11, k + 1) = f3.
    Y_k = 1, 1, 1 and E(f11, k + 1) = f11;
    Y_k = 0, 0, 1 and E(f11, k + 1) = f15.
(3) Y_k = 1, 1, 1 and E(f8, k + 1) = f8;
    Y_k = 1, 0, 1 and E(f8, k + 1) = f12.
    Y_k = 0, 0, 0 and E(f8, k + 1) = f8;
    Y_k = 1, 0, 1 and E(f8, k + 1) = f0.
(4) Y_k = 1, 1, 1 and E(f2, k + 1) = f2;
    Y_k = 0, 1, 1 and E(f2, k + 1) = f3.
    Y_k = 0, 0, 0 and E(f2, k + 1) = f2;
    Y_k = 0, 1, 1 and E(f2, k + 1) = f0.
(5) Y_k = 1, 1, 1 and E(f6, k + 1) = f6;
    Y_k = 0, 1, 0 and E(f6, k + 1) = f3.
    Y_k = 0, 0, 0 and E(f6, k + 1) = f6;
    Y_k = 0, 1, 0 and E(f6, k + 1) = f12.
    Y_k = 1, 0, 1 and E(f6, k + 1) = f6;
    Y_k = 0, 1, 0 and E(f6, k + 1) = f9.
(6) Y_k = 1, 1, 1 and E(f9, k + 1) = f9;
    Y_k = 0, 1, 0 and E(f9, k + 1) = f12.
    Y_k = 0, 0, 0 and E(f9, k + 1) = f9;
    Y_k = 0, 1, 0 and E(f9, k + 1) = f3.
    Y_k = 1, 0, 1 and E(f9, k + 1) = f9;
    Y_k = 0, 1, 0 and E(f9, k + 1) = f6.
(7) Y_k = 1, 1, 1 and E(f13, k + 1) = f13;
    Y_k = 0, 1, 1 and E(f13, k + 1) = f12.
    Y_k = 0, 0, 0 and E(f13, k + 1) = f13;
    Y_k = 0, 1, 1 and E(f13, k + 1) = f15.
(8) Y_k = 1, 1, 1 and E(f7, k + 1) = f7;
    Y_k = 1, 0, 1 and E(f7, k + 1) = f3.
    Y_k = 0, 0, 0 and E(f7, k + 1) = f7;
    Y_k = 1, 0, 1 and E(f7, k + 1) = f15.
(9) Y_k = 1, 1, 1 and E(f4, k + 1) = f4;
    Y_k = 0, 0, 1 and E(f4, k + 1) = f0.
    Y_k = 0, 0, 0 and E(f4, k + 1) = f4;
    Y_k = 0, 0, 1 and E(f4, k + 1) = f12.
(10) Y_k = 1, 1, 1 and E(f1, k + 1) = f1;
    Y_k = 0, 1, 0 and E(f1, k + 1) = f0.
    Y_k = 0, 0, 0 and E(f1, k + 1) = f1;
    Y_k = 0, 1, 0 and E(f1, k + 1) = f3.
(11) Y_k = 1, 1, 1 and E(f10, k + 1) = f10;
    Y_k = 0, 1, 0 and E(f10, k + 1) = f15.
    Y_k = 0, 0, 0 and E(f10, k + 1) = f10;
    Y_k = 0, 1, 0 and E(f10, k + 1) = f0.
    Y_k = 1, 0, 1 and E(f10, k + 1) = f10;
    Y_k = 0, 1, 0 and E(f10, k + 1) = f5.
(12) Y_k = 1, 1, 1 and E(f5, k + 1) = f5;
    Y_k = 0, 1, 0 and E(f5, k + 1) = f0.
    Y_k = 0, 0, 0 and E(f5, k + 1) = f5;
    Y_k = 0, 1, 0 and E(f5, k + 1) = f15.
    Y_k = 1, 0, 1 and E(f5, k + 1) = f5;
    Y_k = 0, 1, 0 and E(f5, k + 1) = f10.
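The indistinguishable pairs used in the contradiction argument can be checked mechanically. The sketch below assumes the bit encoding of Definition 1 (row (X, Y) selects bit 2X + Y of the function number); the helper names are hypothetical. For each allowable cell function it finds which of the Y-independent errors f0, f3, f12, f15 a fault-free cell imitates when Y is held constant, and which functions are imitated by their complement f_{15-p} when the incoming Y sequence is complemented.

```python
def cell_function(p, x, y):
    """Value of f_p for inputs (x, y); row (x, y) selects bit 2*x + y of p."""
    return (p >> (2 * x + y)) & 1


G = (1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14)  # allowable cell functions
Y_INDEPENDENT = (0, 3, 12, 15)                # constants, X' and X


def masking_errors(p):
    """Y-independent errors that f_p with a constant Y cannot be told from."""
    found = set()
    for y in (0, 1):
        for q in Y_INDEPENDENT:
            if all(cell_function(p, x, y) == cell_function(q, x, 0)
                   for x in (0, 1)):
                found.add(q)
    return found


def complement_is_masked(p):
    """True if f_p fed Y equals f_(15-p) fed the complemented Y."""
    return all(cell_function(p, x, y) == cell_function(15 - p, x, 1 - y)
               for x in (0, 1) for y in (0, 1))


if __name__ == "__main__":
    for p in G:
        print(p, sorted(masking_errors(p)), complement_is_masked(p))
```

Under these assumptions the excluded errors of conditions (1) through (12) are exactly the sets this search returns, with the complement error appearing only for f5, f6, f9 and f10.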
If the cascade meets the assumptions of Theorem 1,
then Theorem 1 can be used to determine test schedules
for the location of an error in cascades. It should be
noted that when cell k is tested, one obtains information
about the cells k - 1, k - 2, ..., 1, and therefore a test
schedule with O(n) tests will test any cascade with n
cells under the allowable error set (reference 6). Clearly, if the
conditions of Theorem 1 are relaxed, then fault detection
(and maybe isolation) can be accomplished in the same
number of tests; however, if one is only interested in
fault detection, Theorem 2 is the best technique to use.
If a more complex cascade than the cascades considered
here is under consideration, then a good
understanding of the method used to derive the
theorems in this paper will allow one to extend the
theories presented. If the cell functions f0, f3, f12, and f15
are allowed, then the fault techniques may be easily
extended since none of these functions depend on the Y
value; however, one must exercise care in the use of the
theory because it is based on the ability of the tester to
place theoretically both a 0 and a 1 on the Y
interconnection, and examples (trivial) in which this cannot
be accomplished do exist.
Fault detection in Maitra cascades
In this section the detection of a single fault in a
cascade is considered. The theory for this section is
based on the observation that every n cell Maitra
cascade (as defined in this paper) produces a function
dependent on X_0 (reference 6).
The purpose of this detection scheme is to utilize
exactly two tests to detect whether a cascade has a
faulty cell.
Theorem 2. Let the Maitra cascade have n cells. If
c_1, c_2, ..., c_n are such that f(X_0, c_1, c_2, ..., c_n) = X_0*, then
(1) f(1, c_1, ..., c_n) = f(0, c_1, ..., c_n)
    implies that there exists a cell i such
    that E(f_p, i) = f0, f15, f12, or f3.
(2) f(1, c_1, ..., c_n) = (1*)' and
    f(0, c_1, ..., c_n) = (0*)' imply that there
    exists a cell i such that E(f_p, i) =
    f_{15-p} or f5.
(3) f(1, c_1, ..., c_n) = 1* and
    f(0, c_1, ..., c_n) = 0* imply that there is
    no error in the cascade or that there
    exists a cell i such that E(f_p, i) = f10
    and p ≠ 10.
Proof: In part (1) f does not depend on X_0;
therefore, there must be a cell i such that
E(f_p, i) = f0, f15, f12, or f3. In part (2) f
depends on (X_0*)'; therefore, there is a
cell i such that E(f_p, i) = f_{15-p} or f5.
The proof of part (3) is now obvious.
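The two-test detection scheme of Theorem 2 can be sketched in Python as follows. The cascade model is an assumption made for illustration: cell i computes f_p(X_i, Y_i-1) with Y_0 = X_0 and the bit encoding of Definition 1, and the function names are hypothetical. FIG15 is the cell sequence (f14, f14, f8, f6), which realizes [(X_0 + X_1 + X_2) X_3] ⊕ X_4, and the constants (0, 0, 1, 0) make the fault-free output equal to X_0.

```python
def cell_function(p, x, y):
    """Value of f_p for inputs (x, y); row (x, y) selects bit 2*x + y of p."""
    return (p >> (2 * x + y)) & 1


def cascade_output(functions, x0, xs, fault=None):
    """Output of the cascade; fault = (cell index from 1, erroneous function)."""
    y = x0
    for i, (p, x) in enumerate(zip(functions, xs), start=1):
        if fault is not None and fault[0] == i:
            p = fault[1]
        y = cell_function(p, x, y)
    return y


def two_test_detection(functions, constants, fault=None):
    """Return 1, 2 or 3: the part of Theorem 2 matched by the two tests."""
    exp1 = cascade_output(functions, 1, constants)
    exp0 = cascade_output(functions, 0, constants)
    if exp1 == exp0:
        raise ValueError("constants must make the fault-free output X0*")
    got1 = cascade_output(functions, 1, constants, fault)
    got0 = cascade_output(functions, 0, constants, fault)
    if got1 == got0:
        return 1  # output no longer depends on X0
    if got1 != exp1:
        return 2  # output is the complement of the expected value
    return 3      # no error, or an error the two tests cannot see (f10)


FIG15 = (14, 14, 8, 6)
CONSTANTS = (0, 0, 1, 0)
```

For example, two_test_detection(FIG15, CONSTANTS, fault=(3, 15)) falls under part (1), while a fault of f10 in cell 3 is reported as part (3), which is the undetected case the theorem allows for.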
X_0 was chosen as the variable to be used in Theorem 2
because of the symmetry of the resulting theorem.
Since X_1 can be made (by a suitable choice of constants)
to pass theoretically through every cell*, the theorem
could be rewritten in terms of X_1. In terms of the
complexity of the detection scheme it is seen that
cascades could have a very simple detection test
schedule. It should be noted that Theorem 2 can very
easily be adapted to provide fault detection in cascades
if it is assumed that f10 is not an allowable error for any
of the 12 cell functions.
Examples
This section consists of examples of the use of
Theorems 1 and 2. f_A denotes the measured value of
f whereas f_T denotes the theoretical value of f.
* Assuming the cell function for cell 1 is not f10 or f5.
Example 1. Assume that there is no error in the
cascade shown in Figure 15.
Test (entries listed by column group, in reading order)
X_0 X_1: 0 0 0 0 0 0 0 0 0 1 1 0
X_2 X_3: 1 0 1 1 0 1 1 0 1 0 1 0
X_4 f_T f_A: 0; 0 0; 0; 1 1; 1; 1 1; 0 0; 0; 1 1; 0; 1 1; 0
Conclusion
E(f6, 4) = f6
E(f8, 3) = f8
E(f14, 2) = f14
E(f14, 1) = f14
Example 2. Assume that E(f8, 3) = f15 in the cascade
shown in Figure 15.
Test (entries listed by column group, in reading order)
X_0 X_1: 0 0 0 0 0 0 0 0
X_2 X_3: 1 0 1 1 0 1 1 0
X_4 f_T f_A: 0; 0 1; 0; 1 1; 1; 1 0; 0; 0 1
Conclusion
E(f6, 4) = f6
E(f8, 3) = f15
Example 3. Assume that E(f14, 2) = f3 in the cascade
shown in Figure 15.
Test (entries listed by column group, in reading order)
X_0 X_1: 0 0 0 0 0 0 0 0
X_2 X_3: 1 0 1 1 1 0 1 0
X_4 f_T f_A: 0 1; 0; 1 0; 0; 1; 1 0; 0 0; 0
0000000
0101011
Conclusion
E(f6, 4) = f6
E(f8, 3) = f5, so an extra test is needed.
E(f8, 3) ≠ f5 and the complemented sequence Y_2 is being
received.
E(f14, 2) = f3
Example 4. This example satisfies the hypothesis of
Theorem 2. Assume that E(f8, 4) = f0 for
the cascade shown in Figure 15.
f_T(X_0, X_1, X_2, X_3, X_4) = [(X_0 + X_1 + X_2) X_3] ⊕ X_4
f_T(X_0, 0, 0, 1, 0) = X_0
f_A(0, 0, 0, 1, 0) = f_A(1, 0, 0, 1, 0) = 0 implies that there
is a cell i such that E(f_p, i) = f0, f15, f12, or f3.
Figure 15-A cascade to be tested
CONCLUSION
Techniques for fault location and detection in cellular
arrays with an allowable error set of f0, f15, f_{15-p}, f3, f12,
f5, or f10 were described in this paper. It was shown that
the problem of testing an array could be reduced to the
problem of testing a cascade. The solutions presented
are particularly attractive because of their simplicity.
To locate an error, O(n) tests are needed for an n cell
cascade. Detection of an error requires only two tests
if the allowable error set is reduced by one error (f10).
A necessary and sufficient condition for single-error
location was given. If the restrictions of this condition
are relaxed, then an isolation theorem such as given by
Thurber (references 6 and 7) can be derived; however, this
isolation condition will be more complex than the theorem
given by Thurber (references 6 and 7). A criterion that
enables detection of a single error in only two tests was
derived.
Although the theories presented were derived for
regular arrays of logic, they have potentially wide areas
of application. A good understanding of the philosophies
presented here will allow the extension of the results to
cascades of m input n output cells. Also, some irregular
arrays may be tested using this theory if they can be
decomposed into sections composed of some form of a
cascaded structure (or sections composed of structures
closely resembling a cascaded structure).
ACKNOWLEDGMENT
The author wishes to thank R. C. Minnick for his help
in the preparation of this paper.
REFERENCES
1 W H KAUTZ
Testing for faults in combinational cellular logic arrays
1967 Switching and Automata Theory Symposium
2 W H KAUTZ
Diagnosis and testing of cellular arrays, properties of
cellular arrays for logic and storage
SRI Project 5876 Scientific Rpt No 3 July 1967 119-145
3 K K MAITRA
Cascaded switching networks of two-input flexible cells
IRE Trans on Electronic Computers Vol EC-11 April
1962 136-143
4 R C MINNICK
Cutpoint cellular logic
IEEE Trans on Electronic Computers Vol EC-13 Dec
1964 685-698
5 R C MINNICK
A survey of microcellular research
Journal Association for Computing Machinery Vol 14 April
1967 203-241
6 K J THURBER
Fault location in cellular arrays
PhD dissertation Montana State Univ June 1969
7 K J THURBER
Fault location in cellular cascades
Submitted to IEEE Trans on Computers
8 L M SPANDORFER J V MURPHY
Synthesis of logic functions on an array of integrated circuits
Scientific Rpt No 1 for UNIVAC Project 4645 AFCRL-63-528
Contract AF 19(628)2907 Sperry Rand Corp
UNIVAC Engineering Center Oct 1963
Fast multiplication cellular arrays for
LSI implementation
by C. V. RAMAMOORTHY and
S. C. ECONOMIDES
The University of Texas at Austin
Austin, Texas
The methodology and retroactive design procedures
of the Multiplication Array are presented. Interconnection
arrangements at the cell level, for the array
formation, as well as at the module level, by bringing all
module inputs and outputs to the terminals of the
"package" for the purpose of assembling larger
multiplication units, are also shown.
Since in any LSI circuit testing imposes a complex
problem, some diagnostic schemes are suggested for
reconfiguration and operation under reduced capabilities
or even by automatically switching in a permanently
connected spare module.
Other LSI considerations in terms of cell or module
fan-in/fan-out, total number of pins required per
package, chip sizes and densities and rough cost
estimates are also discussed.
INTRODUCTION
The inherent capabilities of Large Scale Integration
technology have recently shifted attention toward two
major concepts in the design of functional computer
subsystems: the concepts of Functional Modules and
Cellular Arrays.
The Functional Module concept emphasizes the
possible standardization of frequently used common
digital subsystem units such as registers, adders,
counters, etc. Because of the unique iterative properties
also displayed by these units it is common to view
them as building blocks (functional modules), built
on a single substrate of material, the interconnection
of which can expand significantly their functional
capabilities. In addition to standardization, their
massive production may suggest low cost subsystems.
The Cellular Array concept allows the interconnection
of several types of mutually independent logic
blocks, the cells, in various geometric configurations
to perform a desired operation.
This paper is an attempt to combine the above two
approaches in the realization of a Binary Cellular
Array multiplication unit easily adaptable to the
LSI realization techniques, and to speculate on the
possibilities of the realization of other similar functional
units, aiming to lower the cost per unit of computation
and possibly increase the overall system reliability.
Multiplication was chosen in the study because it
forms the basis of division and square root operations
by iterative methods as well as others indicated by
the design trend of present day computing systems.
Single bit multiplier
Figures 1 and 2 show the integral parts and the detailed
cellular array structure of the multiplication
unit, in which each row of the array corresponds to
one bit of the multiplier. The array uses K-bit operands
producing a 2K-bit product.
To achieve fast execution time the multiplication
is done by performing K-1 carry save additions (simple
EXCLUSIVE-OR operations) followed by a full
binary addition. Since the cells in the array operate
asynchronously, the unit as a whole can operate faster
without using a clock pulse.
We shall next explain the single-bit multiplication
unit in some detail.
Figure 1-The integral parts of the asynchronous
multiplication array
Figure 2-The "single-bit" asynchronous multiplication
cellular array
Let the multiplicand be represented by the binary
vector M = (m_1, m_2, ..., m_k) and the multiplier by the
binary vector N = (n_1, n_2, ..., n_k).
A k x 2k P matrix is now generated starting from right
to left, whose elements P_ij ∈ {0, 1} are computed from M
and N under the following conditions:
P_ij = 0          if n_i = 0 and/or j lies outside the range
                  i - 1 < j < k + i, for i = 1, 2, 3, ..., k
P_ij = m_(j-i+1)  if n_i = 1 and i - 1 < j < k + i,
                  for i = 1, 2, 3, ..., k
In terms of the array to be implemented, this condition
implies that for the range of i, j where P_ij = 0 no cell
will be required to perform a logic function. Thus the
[P] matrix has the following form:
P_1,2k-2 ... P_1,k ... P_13 P_12 P_11
P_2,2k-2 ... P_2,k ... P_23 ... P_21
...
P_k,2k-1 ...
The following example will illustrate the above
matrix formation.
EXAMPLE
Let M = (10101) and N = (11111);
then the P matrix is
P =
0 0 0 0 1 0 1 0 1
0 0 0 1 0 1 0 1 0
0 0 1 0 1 0 1 0 0
0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 0 0 0
The above matrix can be realized by selective ANDing
of components of M and N. This "Shifting Network"
accomplishes the proper positioning of the
numbers to be added before their addition, just as in
the conventional multiplication. Arrays of Carry
Save Adders are used to perform the addition of these
binary numbers utilizing Wallace's algorithm (reference 1).
The first stage of the Carry Save Adder adds the
first two rows of the P matrix (first two generated
partial products), thus generating two vectors, the
first partial sums and the first carry, having the form:
S = (S_1,2k-1 S_1,2k-2 ... S_1,k ... S_11)
The double subscript is used to identify the above
vectors with corresponding positions of the P matrix
that contribute to their generation.
The logic functions yielding the elements S_2j and C_2j,
where j = 2, 3, ..., 2k - 1, are realized by the composite
cells shown in Figure 3a.
In the subsequent stages the Carry Save Adder will
add three vectors: the sum vector generated at the
previous stage, the carry vector generated at the
previous stage shifted once to the left, and the next row
vector of the P matrix.
The logic functions producing the new S and C vectors
are the EXCLUSIVE-OR function of three variables (sum)
and the MAJORITY function of three variables (carry),
applied to S_ij, C_i,j-1 and P_i+1,j.
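The single-bit scheme described above (one shifted partial product per multiplier bit, K-1 carry save additions, then one full binary addition) can be sketched in Python. This is an illustrative model on integers, not the authors' cell-level design; the function names are hypothetical, and the operands are treated as unsigned integers whose bit i corresponds to row i + 1 of the P matrix.

```python
def partial_products(m, n, k):
    """Rows of the P matrix as integers: M shifted by i when multiplier bit i is 1."""
    return [(m << i) if (n >> i) & 1 else 0 for i in range(k)]


def carry_save_step(s, c, p):
    """One carry save stage: three-input XOR for the sum, majority for the carry."""
    total = s ^ c ^ p
    carry = ((s & c) | (s & p) | (c & p)) << 1  # carry vector shifted once left
    return total, carry


def multiply(m, n, k):
    """k-bit multiply: k-1 carry save additions, then one full binary addition."""
    rows = partial_products(m, n, k)
    s, c = rows[0], 0
    for p in rows[1:]:
        s, c = carry_save_step(s, c, p)
    return s + c  # the final full binary adder


if __name__ == "__main__":
    for row in partial_products(0b10101, 0b11111, 5):
        print(format(row, "09b"))
    print(multiply(0b10101, 0b11111, 5))
```

Each stage preserves the value s + c + (remaining rows), so no carry has to ripple until the final addition; this is why the carry save stages contribute only one cell delay each.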
L1 activates the single multiple of the multiplicand (first
"AND" gate row of each group of rows in the ESA).
L2 activates the 2's complement of the multiplicand
(second "AND" gate row, directly under each row of
inverters). L3 activates the double multiple of the
multiplicand. Therefore, the typical cell of the ICC has
B1, B2, and C0 as inputs and L1, L2 and L3 as outputs.
Its logic functions are shown below. B1 and B2 are any
two consecutive bits and C0 is the carryout. The logic:
            M   -M   2M
B1 B2 C0 |  L1   L2   L3
 0  0  0 |   0    0    0
 1  0  0 |   1    0    0
 0  1  0 |   0    0    1
 1  1  0 |   0    1    0
 0  0  1 |   1    0    0
 1  0  1 |   0    0    1
 0  1  1 |   0    1    0
 1  1  1 |   0    0    0
Note: The interpretation of B1, B2 = 01 is not one times
the multiplier as it would obviously appear, but it is
instead two times the multiplicand because of the way
the multiplier is placed in the register, vertically with
the least significant bit on the top. The B1, B2 = 10
combination is interpreted in a similar manner.
The typical cell "K" of the ICC is shown in detail in
Figure 5b.
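The ICC logic amounts to recoding the multiplier two bits at a time into the multiples M, -M and 2M. The sketch below is one way to model it in Python. The outputs L1, L2, L3 follow the description above; the carry passed to the next pair (set when the pair value plus incoming carry is 3 or 4) is an assumption made here for illustration, and the function names are hypothetical.

```python
def icc_cell(b1, b2, c_in):
    """One ICC cell: B1 is the low bit of the pair, B2 the high bit."""
    v = b1 + 2 * b2 + c_in      # value 0..4 of the pair plus incoming carry
    l1 = int(v == 1)            # select +M
    l2 = int(v == 3)            # select -M (and carry 1 into the next pair)
    l3 = int(v == 2)            # select +2M
    c_out = int(v >= 3)         # assumed carry rule: 3 = 4 - 1, 4 = 4 + 0
    return l1, l2, l3, c_out


def recode(multiplier, pairs):
    """Digits in {-1, 0, 1, 2} for each bit pair, low pair first, plus final carry."""
    digits, carry = [], 0
    for g in range(pairs):
        b1 = (multiplier >> (2 * g)) & 1
        b2 = (multiplier >> (2 * g + 1)) & 1
        l1, l2, l3, carry = icc_cell(b1, b2, carry)
        digits.append(l1 - l2 + 2 * l3)
    return digits, carry


def multiply(m, n, pairs):
    """Sum the selected multiples, each weighted by 4**g for pair g."""
    digits, carry = recode(n, pairs)
    total = sum(d * m * 4 ** g for g, d in enumerate(digits))
    return total + carry * m * 4 ** pairs
```

With this recoding only one row of the array is needed per pair of multiplier bits, which is the source of the factor-of-two reduction in cell rows.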
The carry save adder, end around carry
accumulator and full binary adder
A layout of the inputs to the CSA stages, the EACA
and FBA is displayed below. The groups of binary
numbers between the lines represent the actual inputs
to a particular row of cells. The first three groups are
CSA row inputs. The fourth group represents the EACA
inputs and the final group, those of the FBA. All binary
numbers representing partial products are of course
P matrix row vectors activated by the ICC lines due
to a particular multiplier bit pair combination.

  11111110100   1st partial product
  111110100     2nd partial product

  00000100100   1st partial sum
 111110100000   1st carry
  1110100       3rd partial product

 100011000100   2nd partial sum
0111001000000   2nd carry
  01011         4th partial product

 010001000100   3rd partial sum
1010110000000   3rd carry
      0 1 1 1   End Around Carries

1000111010001   4th partial sum
0100000001000   4th carry

1100111011001   Final Sum (Result)

Figure 6-The binary multiplying cellular array
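The layout above can be replayed stage by stage. The Python sketch below assumes that each partial product is shifted two bit positions left of the previous one and that the end around carries enter at bit positions 0, 2 and 4; both are readings of the layout rather than statements from the text, and the function name csa is hypothetical.

```python
def csa(a, b, c):
    """Carry save adder row: three-input XOR sum and left-shifted majority carry."""
    return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1


# Partial products as laid out, each shifted two places left of the previous one.
pp = [0b11111110100, 0b111110100 << 2, 0b1110100 << 4, 0b01011 << 6]

s1, c1 = csa(pp[0], pp[1], 0)      # first CSA row (two inputs only)
s2, c2 = csa(s1, c1, pp[2])        # second CSA row
s3, c3 = csa(s2, c2, pp[3])        # third CSA row
eac = 0b0010101                    # end around carries 0 1 1 1 at bits 6, 4, 2, 0
s4, c4 = csa(s3, c3, eac)          # end around carry accumulator
final = s4 + c4                    # full binary adder
```

Under these assumptions every intermediate row matches the layout. The low-order eleven bits of the final sum equal 473 = 11 x 43, which is consistent with a multiplicand 01011 whose complement is selected for the first three bit pairs and whose true value is selected for the fourth; this interpretation is an inference from the numbers, not a statement made in the paper.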
Figure 6 shows the array after superimposing the
individual circuits.
It can be easily noticed that there is a reduction
by a factor of two in the total number of cell rows
required for the array, and therefore in the total final
propagation T_p, at the expense of some additional
control logic, a number of inverters and an additional
stage for the EACA. No further complexity in the
cell structure results; thus the originally developed
cells were used, with a minor modification for cell S
as shown in Figure 7a. This cell may also be present
in the single bit multiplication array.
It must also be noticed that the overflow of bits
resulting in the left-most significant part of the final
product register may be advantageously utilized for
sign and decimal point considerations.
Figure 7a-Cell "S"-A form of Cell "S"
Figure 7b-Cell "R"-Reconfiguration cell
Diagnostics and reconfiguration
In order to incorporate diagnostics in the array
and study the interconnection problem, a standard
size module had to be assumed. It was felt that the
implementation of a 64 X 64 bit multiplier would be
a good choice for all practical purposes. An
interconnecting scheme of standard dimension 64 X 8 bit
modules to realize the 64 bit multiplier was then
devised, aiming to minimize the number of pins per
module necessary for the interconnection.
Figure 8-Example of an assembled 64 X 64-bit
multiplication unit using the pipelining scheme
As seen in Figure 8, the resulting 64 X 64
multiplication unit requires 2 Full Binary addition stages
and 4 Carry Save addition stages per module, a
total of 32 Carry Save additions and 15 Binary
Additions (only one for the first module). However, there
is a real time overlap between these various stages,
and by utilizing a pipelining technique and a series
of flip-flops after each FBA, a 100 percent utilization
of the unit during computation is achieved, and the
multiplication cycle is considerably faster. This is
illustrated shortly in connection with Table III.
The basic module as displayed in Figure 6 has to be
modified further for the interconnection. An extra
FBA and additional gating for diagnostic purposes is
introduced in every module between the output of its
respective FBA and what is shown as a product
register. The typical newly developed cell for the
diagnostics and reconfiguration is shown in Figure
7b, while the above mentioned modifications are
displayed in detail in Figure 9 for a typical module.
As seen, three additional control lines are needed
to perform the following functions.
a. To relay a Fault or No-Fault signal, indicating
that a fault has or has not occurred in one particular
module (NF/F) (e.g., if F = 0, NF = 1).
b. To relay a No Shift signal for the output of this
module (NS = 1) if no fault has occurred in
the preceding module.
c. To relay a shift, eight bits to the right (S = 1),
for the output of this and all subsequent modules
if a fault has been detected in the preceding
module.
The detection of the fault could be accomplished by
a software routine which may check the final product
of the unit periodically and appropriately set the
flip-flops of the control signals.
By shifting the outputs of all modules subsequent
to the malfunctioning one eight bit positions to the
right, while forcing the output of the faulty module to
zero and simultaneously introducing the spare module
which is permanently connected to the unit, one can
still achieve 100 percent computational efficiency. If
another module fails to function properly, by applying
again the same reconfiguration scheme the unit will
function with a reduced capability since the eight
least significant bits of the multiplier will be lost.
No provision has been made at this point if two
modules fail to function properly at the same time.
At least one of them must be replaced
to put the multiplication unit back in service.
Aiming to maximize the number of multiplications
per unit time, as already mentioned, one can introduce
storage elements at intermediate points. This allows
the unit to accept a new set of operands without waiting
for the total completion of the present computation.
Consider an m X m bit multiplier module. If the
intermediate computations are stored after the Carry
Save adders, the first Binary adder and the second
Binary adder, the rate of multiplications in the module
per unit time will be
R_m = 1 / max[t_cs, t_b]
where
t_cs = Total time propagation through the CSA.
t_b = Total time propagation through the FBA for
the binary addition of two m-bit binary
numbers.
Then the number of storage elements required per
module is 2m + m + m = 4m. If, however, storage
elements are inserted at the outputs of the two Binary
Adders only, as shown in Figure 8, the maximum rate
of multiplications in each module per unit time will be
R_m = 1 / (t_cs + t_b)
while the total number of storage elements required
will be decreased by half, that is 2m.
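The trade-off between the two storage placements can be put into numbers with a short Python sketch. The first rate is the expression given in the text; the second, 1 / (t_cs + t_b), is an assumption inferred from the statement that each time unit of Table III corresponds to t_b + t_cs. The function names and the sample delays are hypothetical.

```python
def rate_storage_after_csa_and_adders(t_cs, t_b):
    """Multiplications per unit time with storage after the CSA and both adders."""
    return 1.0 / max(t_cs, t_b)


def rate_storage_after_adders_only(t_cs, t_b):
    """Assumed rate with storage at the two binary adder outputs only."""
    return 1.0 / (t_cs + t_b)


def storage_elements(m, after_csa):
    """Storage elements per module: 2m + m + m = 4m, or 2m without the CSA stage."""
    return 4 * m if after_csa else 2 * m


if __name__ == "__main__":
    t_cs, t_b = 3.0, 5.0  # illustrative delays in arbitrary units
    print(rate_storage_after_csa_and_adders(t_cs, t_b), storage_elements(64, True))
    print(rate_storage_after_adders_only(t_cs, t_b), storage_elements(64, False))
```

The sketch makes the design choice visible: dropping the storage after the CSA halves the number of storage elements at the cost of a longer pipeline time unit.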
The table below gives the sequence of events in
the first four modules of the 64 X 64 composite multiplier unit of eight modules, based on the pipelining
technique.
Figure 9-The combinational logic gating
for reconfiguration
Table III
MODULES
TIME UNITS
1
2
3
4
5
1
Bll
B2l
Bn
B41
B51
2
Bll
B 2t , Bl2
B SI, B22
B 41, B32
B n , B41
4
3
Bll
Bu
Bal
B 41 , B 23
B 51, B33
Bll
BZI
B13
B 41, B14
B n , B24
Each time unit in the above table corresponds to the
factor t_b + t_cs, and B_ij represents the j th binary
addition of the i th multiplication.
Another interconnecting scheme which has not been
investigated yet in detail but seems to be equally as
efficient, considerably faster and adaptable to the
proposed reconfiguration technique is the one shown in
Figure 10, where each level of nodes represents FBA's
performing in parallel with an anticipated multiplication
cycle of
Figure 10-An alternate interconnecting scheme for
the 8-modules of the 64 X 64 multiplication unit
LSI implementation
The implementation shown for the 64 X 8 module
reveals a number of characteristics suitable for large
scale integration. Among them are the repetitive
interconnections of simple identical cells and the
modularity suitable for expansion and reconfiguration.
Below some of the approximate hardware requirements
are pointed out.
Approximate number of PINS/MODULE
1. m + n + 2 needed for the multiplicand register
2. m + n + 2 needed as inputs to the second FBA
3. m + n + 2 needed for the product
4. n + 2 needed for the multiplier register
5. three control pins for reconfiguration
Approximate number of CELLS/MODULE
The cells are the kinds already discussed: C, S,
S', R, K. All are present in a module.
1. m X n/2 cells needed for the CSA stages
2. m + n cells needed for the EACA stage
3. m + n reconfiguration cells
4. 2(m + n + 2) cells needed for the two FBA's
5. n/2 + 1 cells needed for the ICC.
Approximate number of GATES/CELL*
For cell "C" approximately seven gates are required
For cells "S" and "S'" approximately three gates are
required
For cell "R" approximately two gates are required
For cell "K" approximately nine gates are required
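The pin and cell estimates can be totalled for a given module size. The Python sketch below simply adds up the itemized expressions for an m X n module; the function names are hypothetical, and the evaluation at m = 64, n = 8 is an illustration for the 64 X 8 module rather than a figure quoted in the paper.

```python
def pins_per_module(m, n):
    """Sum of the five itemized pin requirements for an m X n module."""
    multiplicand = m + n + 2
    second_fba_inputs = m + n + 2
    product = m + n + 2
    multiplier = n + 2
    control = 3
    return multiplicand + second_fba_inputs + product + multiplier + control


def cells_per_module(m, n):
    """Sum of the five itemized cell requirements (n assumed even)."""
    csa = m * n // 2
    eaca = m + n
    reconfiguration = m + n
    fba = 2 * (m + n + 2)
    icc = n // 2 + 1
    return csa + eaca + reconfiguration + fba + icc


if __name__ == "__main__":
    print(pins_per_module(64, 8), cells_per_module(64, 8))
```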
The above estimates point out the fact that testing
at the individual cell or circuit level (item yet to be
examined) becomes a problem, especially when the
complexity of the chip is increased, with a paralleled
decrease in reliability and yield of non-defective chips.
However, using the modular approach it is advisable
to perform the testing externally on the module and
discard the malfunctioning units. This would considerably decrease the amount of logic on a chip, which would
otherwise have to be inserted for the testing of the
individual circuits. This approach seems to be economically feasible since it is estimated that by 1970
an LSI chip of 100 X 100 mils in size may contain
200 components, at five cents per component, while
by 1975 an LSI chip of 300 X 300 mils in size may
contain as many as 3,600 components at the cost of
about one cent per component. Therefore, miniaturization
of LSI chips will discourage the testing on the
individual circuit level, while the loss due to the
discarding of modules after testing at the frame level
will be negligible.
In view of the above considerations and since the
present state-of-art high density MOS circuits are
being driven at 10 MHz, implementation of the
multiplier modules as the one presented by MOS circuits appears very desirable from a manufacturing
viewpoint. A reasonable building block might be a
64 X 64 bit multiplication unit requiring an approximate number of 5000 active elements (field effect
transistors) . One could also visualize the whole unit
incorporated in one or two chips. Where speed is the
primary requirement, the unit can be designed using
fast bipolar transistors, with an expected five ns delay.
Assuming then a 64 X 64 bit module is implemented
by bipolar transistors, the execution time could be
in the neighborhood of 0.225 μs, and when pipelined,
the maximum number of multiplications per second may
be approximately 5 X 10^6. An MOS array of the same
module will perform an order of magnitude slower
than in the bipolar case.
* The above gates are mostly "AND" gates with the "OR" gate
not included in the count. There are also 2(m + n) additional
gates needed for the reconfiguration scheme and m X n gates for
shifting each array.
The pin count also indicates that the current design
is within the state-of-art of the MOS technology.
The performance figures given above are educated
guesses since the circuit and intermodule delays are
dependent on the circuit types, their interconnections,
the chip topology, etc. In addition, the design examples
described in the previous sections indicate the ease
with which the array could be partitioned to fit
reasonable unit or chip sizes.
CONCLUSION
Since fast multiplication has become the basis of
iterative divisions and square roots in fast computers,6,7
there appears to be a need for cheap, array-type, LSI-realizable
multiplication subsystems. This paper reports
the design methodology and the detailed implementation of one such structure.
Ease of diagnosis and capability of reconfiguration were used as twin requirements
in the final design. When the unit is composed of a
number of modules and a malfunction is detected in
one of them, a method of switching automatically in
a spare module was presented. An estimate of the
logic circuitry in the hard core (that portion of the
unit which must be operating without any faults)
during testing is found to be less than 14 percent for
a 32 X 32 module, 9.7 percent for a 64 X 64 module
and 4 percent for a 128 X 128 module. Therefore,
as the size of the multiplication module-unit increases,
the relative size of the hard core decreases very rapidly.
To conclude, the cellular array implementation of an
asynchronous multiplication unit using mostly non-carry-propagating
Carry Save adders was accomplished.
The final cell design and the control and the reconfiguring circuitry are quite simple.
A number of additional studies need to be done in
the future. The design of self-diagnosable and repairable
functional arrays appears quite feasible and worth
considering. The possibility of composite design of
a multiplication, division and square rooting unit using
techniques presented in this paper could be very useful, particularly if the division and square root algorithms are based on the availability of fast multiplication units such as those discussed in this paper.
ACKNOWLEDGMENTS
The authors would like to thank Mr. Gary Wang of
the NASA Electronics Research Center for sharing
with them some of his thoughts on the subject, and also
Mr. W. R. Adrion, graduate student at the University
of Texas at Austin, for his constructive suggestions.
REFERENCES
1 C S WALLACE
A suggestion for a fast multiplier
IEEE Trans Prof Group on Electronic Computers Vol 13
No 1 Feb 1964
2 Methods for high-speed addition and multiplication
NBS Cir No 591 1958
3 O L MacSORLEY
High-speed arithmetic in binary computers
Proc IRE Vol 49 No 1 Jan 1961
4 M LEHMAN
Short-cut multiplication and division in automatic binary
digital computers
Proc Inst Elec Eng Paper No 2693M Vol 105B Sept 1958
5 I FLORES
The logic of computer arithmetic
Prentice-Hall Inc 1963
6 D FERRARI
A division method using a parallel multiplier
IEEE Trans Prof Group on Electronic Computers Vol 16
No 2 April 1967
7 S F ANDERSON et al
IBM system model 91: Floating point execution unit
IBM Journal of Research and Development Jan 1967
The Pad Relocation technique for
interconnecting LSI arrays of imperfect
yield
by D. F. CALHOUN
Hughes Aircraft Company
Culver City, California
INTRODUCTION
The interconnection of circuits required in Large Scale
Integration (LSI) using multi-level metalization above
monolithic semiconductor arrays is taking basically
two approaches. One is predicated on processing with
a reasonable yield entire arrays without any semiconductor defects (i.e., 100 percent yield chips) which
allows once-generated fixed-wiring patterns to obtain
the required interconnect. The second approach aims
at much larger semiconductor arrays (i.e., full-slice
LSI) for which defect-free processing cannot be expected. Thus, probe tests are made of the semiconductor circuits processed on each LSI slice (or wafer)
and a record is made of the good and bad circuit positions. Unique interconnection masks are then generated
to interconnect good circuits in each wafer's particular
yield pattern using certain "discretion" in avoiding
the bad circuits. As a result, the 100 percent yield
approach emphasizes the need to use standard interconnect masks but is complexity limited by the occurrence of defective circuits in larger arrays, whereas
approaches capable of routing around the defective
circuits have required a full set of unique signal interconnect masks for each wafer's particular yield pattern.
The Pad Relocation approach, however, allows the
interconnection of full-slice LSI arrays containing defective circuits to be accomplished with a minimal
amount of unique interconnect per array. Only a
portion of one of the typically three interconnect levels
varies from array to array, thus allowing significant
improvements in the cost, reliability, and testability
of the finished arrays as well as less limitation on cell
yields and array complexities.
Description of the Pad Relocation technique
Pad Relocation is a technique which allows a predetermined standard pattern of good circuits to be
established on all LSI slices used to perform the same
array function regardless of the varying yield patterns
determined by DC wafer probe tests. This is accomplished by relocating the pads of nearby good circuits
to the positions where good circuits were specified
by a prescribed master pattern, but were not found
during wafer probe tests. The pad positions above a
bad circuit (or any unused circuit) are isolated from
that circuit by a layer of dielectric. Where good circuits are found in expected good circuit locations,
those circuits are used without relocation. Thus, the
Pad Relocation technique functionally establishes a
specified pattern of good circuits as if there had actually
been a 100 percent circuit yield in that pattern. A
single wiring pattern can then be generated for all
the LSI arrays of the same function to accomplish the
much more complex signal interconnect between the
master pattern circuits. By determining standard
cross-under areas within the Pad Relocation layer
where relocation lines need never occur, it has been
shown that large arrays can be interconnected with
the same number of total interconnect layers as required by discretionary techniques.
With each wafer's good circuits located in the predetermined master pattern, an optimal standard
interconnect of the circuits can be made for each
wafer. Since this signal routing and mask-making
expense is incurred only once for each function, much
more effort can be spent optimizing the signal routing.
As a result, the total number of interconnect levels
(including Pad Relocation) may actually be fewer
(for very complex arrays) than with other techniques by
which the interconnect is generated for each wafer's
particular yield pattern.
The Pad Relocation technique has been 100 percent successful for all integrated circuit and special
LSI wafers considered so far. The "master pattern"
gives the prescribed locations of good circuits to
which each LSI array's particular yield will be tailored.
Statistically, if M is the percentage of wafer circuits
in the master pattern and Y is the wafer circuit yield
from probe tests, then only M(100 - Y)/100 percent
of all wafer circuits need to be relocated. For example,
if Y = 35 percent and M = 30 percent, then the
relocation (as a statistical average) of 19.5 percent
of the wafer circuits will establish a master pattern
that uses 86 percent of all the good wafer circuits.
This would allow 120 good circuits to be located in
prescribed positions, leaving an average of only 20
good circuits unused.
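The arithmetic of the example above can be checked with a short sketch. The function name and the wafer size of 400 circuits are assumptions made for illustration; 400 is simply the total implied by 120 master pattern circuits at M = 30 percent.

```python
def relocation_statistics(master_pct, yield_pct, total_circuits):
    """Average relocation figures for a master pattern (illustrative sketch).

    master_pct     -- M, percentage of wafer circuits in the master pattern
    yield_pct      -- Y, wafer circuit yield from probe tests, in percent
    total_circuits -- assumed number of circuit positions on the wafer
    """
    # Master positions that hold a bad circuit must be filled by relocation.
    relocated_pct = master_pct * (100 - yield_pct) / 100
    # Share of the good circuits that end up used by the master pattern.
    utilization_pct = 100 * master_pct / yield_pct
    good = total_circuits * yield_pct / 100
    in_master = total_circuits * master_pct / 100
    return {
        "relocated_pct": relocated_pct,
        "utilization_pct": utilization_pct,
        "good_circuits": good,
        "master_circuits": in_master,
        "unused_good_circuits": good - in_master,
    }


stats = relocation_statistics(master_pct=30, yield_pct=35, total_circuits=400)
```

With these inputs the sketch gives 19.5 percent relocated, a utilization of about 86 percent, 120 circuits in the master pattern, and 20 good circuits left unused, in agreement with the figures quoted in the text.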
An example
The methodology of the Pad Relocation technique
is best described by example. Figure 1 shows the mapping of circuits on an LSI wafer. Each dot represents
the position of a semiconductor cell such as a full
adder, or a quad two-input NAND gate cell, or a flip-flop, etc. Figure 2 identifies with a slash (/) the location of all circuits determined to be good by dc wafer
probe tests on a particular slice. The yield of wafer
circuits varies from 10 percent to 90 percent depending
on the circuit complexity, and the locations of the
good circuits cannot be predicted from wafer to wafer.
This makes it impossible to use standard interconnect patterns without first transforming the various
wafer yield patterns to a single standard pattern.
The circuit yield (the percent of total circuits which
are good) for the wafer in Figure 2 is nearly 30 percent
and yet there is not a single area of 100 percent yield
that is larger than three circuits by two circuits. Thus,
the 100 percent yield approach could obtain units with only about
5 percent of the complexity allowed by full-slice interconnection techniques. The goal is to tailor by some
efficient means the locations of the good circuits in
Figure 2 to a standard pattern that may be used for
all wafers with about the same circuit yield. For higher
yield wafers, there are other standard patterns which
use more good circuits.
Figure 1-Integrated circuit wafer
Figure 2-Wafer after test-Slashes show good circuit
positions
Figure 3 shows a master pattern (in heavy dots)
which can be used for wafers having at least a 25 percent yield. That pattern is characterized by a more
dense usage of good circuits toward the center of the
wafer with good circuit positions never adjoined on
more than one side by another circuit in the master
pattern. The latter characteristic facilitates the routing
of standard signal interconnect as well as the relocation of circuits in at least three directions. The matching
of the master pattern to the expected yield distribution as a function of distance from the wafer center
optimizes the conflicting goals of minimum number of
relocations and maximum probability of fulfilling the
master pattern.
Figure 3-A master pattern of good circuits-All wafers
will be matched to this pattern by the Pad
Relocation technique
Figure 4 shows the Figure 3 master pattern superimposed on the particular wafer yield of Figure 2.
The objective now is to route a nearby good circuit,
shown by a slash, to each heavy dot (i.e., master pattern position) which initially is without a good circuit. This specification can be completed manually
giving a coding sheet description of necessary circuit
relocations; or a simple computer routing program can
output a punched tape or cards that can be used to
make a mask automatically. The computer routine for
Pad Relocation will use about two orders of magnitude less run time than a customized signal routing
primarily because no circuit placement or logic signal
routing is required. Pad Relocation requires only
that a good circuit be identified for relocation to each
position in the master pattern which did not initially
have a good circuit. A later paper will present work
that is under way to automate the Pad Relocation
selection and specification with the use of interactive
graphics.
Figure 4-Master pattern superimposed on the particular
yield of the Figure 2 wafer
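The selection step described above, identifying a good circuit for each master pattern position that lacks one, might be sketched as follows. This is a minimal greedy illustration and not the routine used by the author: the function name, the (column, row) cell coordinates, and the use of rectilinear distance as the measure of "nearby" are all assumptions, and the sketch does not check that a free path exists in the Pad Relocation level.

```python
def select_relocations(master_positions, good_positions):
    """Pick a nearby unused good circuit for each unfilled master position.

    Both arguments are iterables of (column, row) cell coordinates.
    Returns a dict mapping each unfilled master position to the good
    circuit chosen to be relocated to it.
    """
    master = set(master_positions)
    good = set(good_positions)
    # Good circuits found in expected positions are used without relocation.
    unfilled = sorted(master - good)
    spares = good - master
    relocations = {}
    for target in unfilled:
        if not spares:
            raise ValueError("too few good circuits to fulfill the master pattern")
        # Rectilinear distance approximates the relocation line length.
        source = min(
            spares,
            key=lambda s: (abs(s[0] - target[0]) + abs(s[1] - target[1]), s),
        )
        relocations[target] = source
        spares.remove(source)
    return relocations


plan = select_relocations(
    master_positions=[(0, 0), (2, 2), (4, 0)],
    good_positions=[(0, 0), (2, 3), (5, 0), (9, 9)],
)
```

In this small case the master position (0, 0) already holds a good circuit, while (2, 2) and (4, 0) are filled from the adjacent good circuits at (2, 3) and (5, 0).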
Figure 5 shows a manually generated specification
of possible relocations that completely satisfies the
master pattern of Figure 3, using the good circuit
positions of the wafer in Figure 2. The longest relocation line length is less than 0.45 inch. Figure 6
shows how the relocation in area A of Figure 5 can be
accomplished without crossovers for a quad two-input
gate cell. Each gate of the bad circuit at the lower left
is functionally replaced with a good gate from the top
right circuit. It should be noted that the computer
needs only subroutines for leaving (or entering) a cell
from the top, bottom, left, and right, for moving parallel lines over some number of cells, and for making
ninety degree turns in order to do all the possible Pad
Relocation routing patterns. Figure 7 shows the actual
Pad Relocation of an SN5480 gated full adder above
a silicon wafer using 0.002 inch aluminum lines on
0.0035 inch centers. Figure 8 shows how simple the
Pad Relocation mask is if it is considered as a set of
the above mentioned subroutines.
Figure 5-Specification of a set of relocations necessary
to completely implement the master pattern of
Figure 3
Figure 7-Pad Relocation of an SN5480 gated full
adder above a silicon wafer (using 0.002-inch
aluminum lines on 0.0035-inch centers)
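The small set of routing subroutines mentioned above (leaving a cell on one of four sides, running over some number of cells, and turning ninety degrees) can be illustrated with the following sketch. The function name, the cell-grid coordinates, and the segment encoding are assumptions for illustration only; the sketch traces a single relocation path and does not model the parallel pad lines or the mask geometry.

```python
STEP = {"top": (0, 1), "bottom": (0, -1), "left": (-1, 0), "right": (1, 0)}
OPPOSITE = {"top": "bottom", "bottom": "top", "left": "right", "right": "left"}


def trace_relocation(start_cell, segments):
    """Trace a relocation route across the cell grid.

    start_cell -- (column, row) of the good circuit being relocated
    segments   -- list of (direction, cells); the first segment leaves the
                  start cell on that side, and every later segment begins
                  with a ninety degree turn
    Returns the list of cells visited, ending at the cell that is entered.
    """
    x, y = start_cell
    path = [(x, y)]
    previous = None
    for direction, cells in segments:
        if direction not in STEP or cells < 1:
            raise ValueError("bad segment: %r" % ((direction, cells),))
        if previous is not None and direction in (previous, OPPOSITE[previous]):
            raise ValueError("consecutive segments must turn ninety degrees")
        dx, dy = STEP[direction]
        for _ in range(cells):
            x, y = x + dx, y + dy
            path.append((x, y))
        previous = direction
    return path


# Leave the top of cell (3, 1), run three cells to the left, then enter
# the target cell from above.
route = trace_relocation((3, 1), [("top", 1), ("left", 3), ("bottom", 1)])
```

Here the route ends at cell (0, 1) after crossing five cell boundaries, using only the three kinds of primitive named in the text.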
Intermediate step to full wafer LSI
Figure 9 shows an intermediate step to full-wafer
LSI using the Pad Relocation technique. Three 4-bit
Modular Multiplier modules are to be fabricated from
the three bordered half-inch square areas (as was suggested in a 1968 FJCC paper by D. F. Calhoun).
Within the three bordered areas, slashes again represent good circuits and circles show the master pattern
locations. The lines terminating in arrowheads show
how three, eight, and five good circuits can be relocated into the positions circled to establish the same
pattern of good circuits for each module, thus allowing
the use of one standard signal interconnect pattern
for all subsequent modules tailored to that pattern.
Figure 10 demonstrates the simplicity of a coding
sheet specification of the necessary circuit relocations
for the three multipliers of Figure 9. Figure 11 shows
the four possible Pad Relocation interconnect patterns
which are necessary for the LSI multipliers. For these
modules it seems appropriate to incorporate simple
signal cross-under lines and power distribution in
the Pad Relocation level so as to require only two
additional levels of interconnect above the tested
LSI chips.
Figure 6-A set of pad relocations necessary to replace
functionally the quad two-input gate circuit in
area A of Figure 5
Figure 8-Mask pattern for the pad relocations specified
in Figure 5
Figure 9-Pad Relocation routing for three 200-gate
modules on a single 1-1/2 inch wafer
Figure 11-Four relocation patterns for SN5480's
A Pad Relocation LSI hardware program
An LSI hardware development program began in
January 1969 (in which Hughes Aircraft Company
contracted Texas Instruments to do the multi-level
processing) and which resulted in fully tested and
packaged 207 gate arrays in May 1969. During this
program, (1) TI fabricated and tested one type of
their LSI wafers having a certain mix of gates and
flip-flops, (2) TI supplied the yield information on
each wafer to be processed for Hughes, (3) Hughes
generated both the one standard signal interconnect
mask for all wafers as well as an individual Pad Relocation mask for each wafer, and (4) using the mask specifications from Hughes, TI processed the two additional
levels of interconnect and tested and packaged each
of the finished units. Similar programs for higher
complexity arrays have since been initiated. The
results of this program are described below.
Figure 10-Coding sheet specification
The logic array to be built in LSI
Investigations were made three years ago at Hughes
Aircraft Company into the application of LSI arrays
to techniques for doing the very high speed sum-of-products
computations required in advanced digital
filtering systems. A result of this study was the development
of the high speed "Modular Carry Advance
Multiplier" which was described in a 1968 Fall Joint
Computer Conference paper by D. F. Calhoun. Among
its characteristics is its modularity which allows
longer wordlength multiplications to be efficiently accomplished
(in terms of speed and parts) simply by
paralleling more of the identical modules. A 5-bit
sign-and-magnitude Modular Multiplier designed with
four types of logic gates and a JK flip-flop was thus
chosen as the vehicle for LSI development on this
program. Such an array forms and stores in a register
the 9-bit sign-and-magnitude product of two 5-bit
operands. The 5-bit multiplier design uses 153 NAND
gates and 9 flip-flops (each equivalent to six NAND
gates) for a total of 207 interconnected gates per LSI
wafer.
The logical interconnection of 207 gates using less
than one square inch of an LSI wafer represents well
any state-of-the-art bipolar LSI approach. Two levels
of interconnect (including the Pad Relocation) were
used above the tested wafer which already had a first
level of metalization for component interconnect.
In terms of cross-over complexity, signal line lengths,
and circuit fan-outs, the Modular Multiplier design
can be considered typical of a 200 gate logic array.
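The 207-gate figure follows from counting each flip-flop as six NAND gates. A brief check, with a function name chosen here only for illustration:

```python
def equivalent_gate_count(nand_gates, flip_flops, gates_per_flip_flop=6):
    """Total gate count with each flip-flop counted as equivalent NAND gates."""
    return nand_gates + flip_flops * gates_per_flip_flop


# 153 NAND gates plus 9 JK flip-flops at six gates each.
multiplier_gates = equivalent_gate_count(nand_gates=153, flip_flops=9)
```

The result, 153 + 54 = 207, matches the total stated for the 5-bit multiplier.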
Figure 12-Texas Instruments LSI type "K" slice
(HAC Photo 4R07185)
Description of the chosen LSI slice
The chosen semiconductor slice for this LSI development program was the Texas Instruments type "K"
slice. Basically, the K slice is a bipolar array of transistor-transistor logic (TTL) gates and flip-flops occupying an active area of about 1.1 square inches. A
picture of this LSI wafer is shown in Figure 12. The
array is subdivided into 298 cells of dimension 0.084
inch by 0.044 inch. Of the 298 basic wafer cells, 170
are split into two 42 by 44 mil half-cells for gates while
the 128 JK flip-flops on the wafer occupy full 84 by
44 mil cells. The distribution of logic elements on the
K slice is shown in Figure 13. Each cell labeled "3"
has two independent three-input NAND gates while
the adjacent cells labeled "5" have an independent
five-input NAND gate and a one-input NAND gate.
In three of the rows of gates a single seven-input
NAND gate designated by a "7" was processed instead
of two three-input NAND gates. The rows of full-sized 84 by 44 mil cells contain the JK flip-flops, which
are labeled "FF". In total there are 642 logic gates
(170 ones, 264 threes, 170 fives, and 38 sevens) and
128 JK flip-flops processed on the wafer.
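The inventory given above is internally consistent, as the following sketch shows. The constant and function names are assumptions; the numbers are those quoted in the text. The cell dimensions imply an active area of roughly 1.1 square inches, and the gate counts account for exactly two half-cells in each of the 170 split cells.

```python
CELL_WIDTH_IN = 0.084      # full cell width in inches (84 mils)
CELL_HEIGHT_IN = 0.044     # cell height in inches (44 mils)
TOTAL_CELLS = 298
SPLIT_CELLS = 170          # each split into two half-cells for gates
FLIP_FLOP_CELLS = 128      # one JK flip-flop per full cell
GATES_BY_INPUTS = {1: 170, 3: 264, 5: 170, 7: 38}


def k_slice_summary():
    """Cross-check the K slice cell, gate, and area figures."""
    # A "5" half-cell holds one five-input and one one-input gate.
    five_half_cells = GATES_BY_INPUTS[5]
    # A "3" half-cell holds two three-input gates.
    three_half_cells = GATES_BY_INPUTS[3] // 2
    # A "7" half-cell holds a single seven-input gate.
    seven_half_cells = GATES_BY_INPUTS[7]
    return {
        "cells": SPLIT_CELLS + FLIP_FLOP_CELLS,
        "half_cells": five_half_cells + three_half_cells + seven_half_cells,
        "logic_gates": sum(GATES_BY_INPUTS.values()),
        "active_area_sq_in": TOTAL_CELLS * CELL_WIDTH_IN * CELL_HEIGHT_IN,
    }


summary = k_slice_summary()
```

The summary gives 298 cells, 340 half-cells (two per split cell), 642 logic gates, and an active area of about 1.10 square inches.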
Figure 13-LSI array slice "K" (cells labeled 3, 5, 7, and F as described in the text; marked dimensions 44 mils, 1176 mils, and 1188 mils)
Selection of the master pattern and
pad relocation patterns
First, a master pattern of circuits was chosen to
define the standard circuit positions on the K slice
that would be interconnected to form the Modular
Multiplier function. This master pattern (shown in
Figure 14) was defined with respect to (1) maximizing
the probability of successful fulfillment, Pr(M), of
the master pattern, (2) facilitating the standard signal
interconnect, and (3) using a minimum number of
relocation patterns efficiently. After the master pattern and the repertoire of relocation patterns to be
used were determined, restricted areas in the Pad
Relocation level were defined to allow signal cross-unders from the standard top level signal interconnect. Sufficient cross-under capability for this design
was found in the flip-flop cells alone by using certain
areas of these cells which are not required by any of
the defined relocation patterns. Other cross-under
areas can be defined for any more complex designs
so as to still use only two metalization layers above
the tested circuits. A set of Pad Relocation patterns
was prepared to allow the efficient selection of the
particular patterns and their positions necessary to
fulfill each wafer's master pattern. The chosen set
of K slice relocation patterns is shown in Figure 15.
This semiautomated specification has facilitated a
very fast turnaround and low cost capability for the
generation of Pad Relocation masks and for working
with new routing requirements, wafer layouts and
logic designs.
LSI program results
The end results of the Hughes effort described in
this section were the two metalization mask specifications used by TI to process each wafer. Only one of
these is unique since the use of Pad Relocation allows
all signal interconnect to be obtained from a once-generated standard mask. Figure 14 shows the worksheet specification of how the yield of a typical LSI
slice can be tailored to the chosen master pattern.
The lines with arrowheads at the end specify relocation patterns from the set of patterns shown in Figure
15. The completion of the K slice master pattern was
accomplished successfully on each of the 30 wafers
attempted. A typical time for a man to complete and
verify manually the specification shown in Figure 14 was two
minutes.
From the specifications like those in Figure 14, the
necessary relocation patterns were selected from the
standard set shown in Figure 15 and were added to
the standard cross-under pattern to complete the Pad
Relocation mask such as the one shown in Figure 16.
Only the particular circuit relocation patterns vary
within this mask, which allows the least possible variation of interconnect and testing from one array to
another. The more complex but standard mask is the
one shown in Figure 17 which accomplishes all necessary signal interconnect (except the cross-unders to
the Pad Relocation level) and the power distribution
for the 5-bit multiplier design. The design for this
mask can efficiently be done manually for arrays of
this and larger size since the master pattern is well
distributed. In mask plotting time alone, the Pad
Relocation mask required only about 20 percent of the
time required to plot the signal interconnect metalization patterns. A photograph of the final 207 gate
LSI multiplier is shown in Figure 18.
Figure 14-Pad Relocation worksheet with master
pattern locations shown (with a master pattern cell designation key for the gate and JK flip-flop cells)
Figure 15-Set of K slice relocation patterns
Statistics of Pad Relocation master patterns
The choice of a master pattern for Pad Relocation
is important since its definition affects the average
number of relocated circuits (and thus the routing
time and mask complexity) as well as the number and
simplicity of the signal interconnect levels. Also a good
statistical match between the master pattern and the
expected wafer yield distribution will result in a higher
probability of successful relocation. As an example,
consider a master pattern that is defined too densely
about a wafer's periphery. Since peripheral wafer
circuits show a much lower yield than the more central
ones, there will statistically be more relocations, longer
relocation lengths, more difficulty in satisfying the
master pattern, and a higher concentration of signal
interconnect above the master pattern than if the
master pattern had been chosen to match the "expected" yield distribution as was done for the example
shown in Figure 3.
Figure 16-Pad Relocation mask with standard cross-unders
Figure 17-5-bit modular multiplier standard interconnect mask
Figure 18-207 gate multiplier LSI array using Pad
Relocation (HAC Photo 4R09152)
A first question that must be answered is what is the
"expected" yield distribution? Investigations thus far
have pointed out only that there is significant decrease
in yield as a function of the distance from the wafer
center which can be attributed to boundary defects,
and that when good or bad circuits occur, there is a
more than random clustering effect. No ability to
predict the locations of these clusters has been obtained.
What must be done is to examine the yield of large
samples of the wafer types that will be used to determine the distribution that best describes their
expected yield patterns. This distribution will be different for different ranges of yield as well as for different
circuit complexities and wafer types. The master pattern for a specific range of yield, wafer type, and wafer
size should be matched to the expected distribution
so as to take advantage of any knowledge of where
good circuits are more probable. By so doing, the
probability of successfully fulfilling a master pattern
is maximized while minimizing the expected length of
the longest relocations.
Statistical techniques have been developed to determine and compare the efficiency of various master
patterns in terms of maximizing both the utilization
of good circuits and the probability of successfully
fulfilling the master pattern. For example, if y is the
fraction of the total circuits that were found to be
good (i.e., the yield), m the fraction of total circuits that are in the master pattern, and r the number
of unused circuits from which a relocation could be
made to each master pattern circuit, then the probability of successfully fulfilling each master pattern
circuit independently is:
$$P(1) = y + (1 - y)\,y + (1 - y)^2\,y + \cdots + (1 - y)^r\,y = y \sum_{k=0}^{r} (1 - y)^k \qquad (1)$$
where the first term is the probability that the master
pattern circuit itself is good, and each succeeding term
is the conditional probability of needing to examine
another candidate for relocation times its probability
of being good. Equation (1) can be simplified as follows:
$$y \sum_{k=0}^{r} (1 - y)^k = -\sum_{k=0}^{r} (1 - y - 1)(1 - y)^k = -\sum_{k=0}^{r} (u - 1)\,u^k \qquad (2)$$
with
$$u = (1 - y)$$
and
$$-\sum_{k=0}^{r} (u - 1)\,u^k = -\left(u^{r+1} - 1\right) = 1 - (1 - y)^{r+1} \qquad (3)$$
therefore,
$$P(1) = 1 - (1 - y)^{r+1} \qquad (4)$$
If the master pattern has a total of M circuits in
it, then the joint probability of successfully fulfilling
all of the M circuits becomes:
$$P(M) = P(1)^M = \left[1 - (1 - y)^{r+1}\right]^M \qquad (5)$$
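Equations (1), (4), and (5) can be evaluated numerically with the sketch below. The function names are assumptions introduced for illustration, and y is taken as a fraction between 0 and 1. The last function solves Equation (5) for the largest M that still meets a target probability; values computed this way need not coincide with values read from a plotted curve.

```python
import math


def p_single(y, r):
    """Equation (4): probability of fulfilling one master pattern circuit."""
    return 1.0 - (1.0 - y) ** (r + 1)


def p_single_series(y, r):
    """Equation (1): the same probability as a term-by-term sum."""
    return sum(y * (1.0 - y) ** k for k in range(r + 1))


def p_master(y, r, m):
    """Equation (5): probability of fulfilling all m master pattern circuits."""
    return p_single(y, r) ** m


def largest_master_pattern(y, r, target=0.5):
    """Largest M for which Equation (5) is at least the target probability."""
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie strictly between 0 and 1")
    return math.floor(math.log(target) / math.log(p_single(y, r)))


m_max = largest_master_pattern(y=0.5, r=9)
```

The closed form and the series agree for any y and r, which is the content of Equations (2) and (3); `m_max` is the pattern size at which the joint probability for y = 0.5 and r = 9 is still at least one half.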
Equation (5) is based on an uncorrelated and pseudorandom distribution of good circuits (see Reference 10
with y 2:: 0.25) as well as the same assumption as
Equation (1) that there are r circuits (good or bad) for
each master pattern circuit fsom which a relocation can
be made independently of the other master pattern
circuits. It is, however, an unnecessary restriction to
assign r circuit positions which could only be used to
fulfill each master pattern circuit. Instead, consider
successively examining up to r circuit positions which
are the closest to each particular master pattern position
and, for which, there is still a free path in the Pad.
Relocation level to the master pattern position. Then
Equation (5) will give the probability of successfully
relocating (if necessary) to each of the M required
master pattern positions at least one of the r closest and
free circuit positions.
Equation (5) determines a family of curves. for
P reM) versus M for various yields and values o~ r.
Figure 19 shows the curves of PrOf) versus M With
y = 0.5 for r = 4 and r = 9. It should be noted that
each circuit of M may actually be many interconnected
gates of logic and M = 100 would represent 1000 gates
if each circuit of M had 10 gates of equivalent logic
complexity. If it is desired to successfully fulfill the
master patterns of at least half the wafers considered,
Figure 19 shows that 220 circuits (and thus probably
750 or more gates) can be used if r = 4, and 680 circuits
can be used if r = 9. Of course, any wafers for
which the master pattern was not easily fulfilled are
not lost since they can be inventoried and used for
other master patterns, or for integrated circuits, or
diced and bonded to substrates. As a comparison the
most complex current bipolar discretionary unit has
an equivalent M of 169 while the 100 percent yield
approach has reached an equivalent M of only 24.
Figure 19--The probability P_r(M) of successfully
fulfilling a master pattern of M circuits by relocating from
one of up to r nearby circuits. Each circuit is a tested unit
which may have many gates of logic complexity. (Plotted
for a circuit yield y = 0.5, with P_r(M) from 0.0 to 1.00
against M from 20 to 1000; the point M = 220 for P = 0.5
is marked.)
Advantage of Pad Relocation to LSI
signal interconnect
The prime advantage of Pad Relocation LSI which
has been described above is that it places the pads of
all used circuits in standard positions which both allows
fixed-pattern signal routing between these circuits as
well as the utilization of more circuits than
allowed by other LSI techniques. There are further
advantages, however, to the routing of the standard
signal interconnect. For example, the positions to
which circuit pads will always be brought can be modified
and optimized to facilitate the necessary routing
of signals as well as to minimize the lengths of the
longest or the most critical signal paths. This will also
allow the standard signal interconnect to be designed
to require the minimum number of levels and the
minimum area per level. Thus, chip areas can be less
interconnect limited.
Improvement of testing and reliability of large
scale integrated systems
Semiconductor device reliability, as well as propagation
delay, is highly dependent on proper maintenance of
junction temperatures within certain bounds. From the
maximum specified junction temperature, a maximum power
dissipation per wafer area can be computed which is
dependent on the heat conductive characteristics of the
wafer and the cooling techniques used, as well as on the
area and power dissipation of the particular circuits.
Thus there will be a maximum number of circuits that
should be powered up on the wafer. In addition, no region
of the wafer should exceed a certain maximum power density
in order to insure that the wafer will not have relative
"hot spots" where too many powered circuits are located.
Pad Relocation LSI can help insure that the wafer power
dissipation density is not excessive by specifying the
relocated circuits to be primarily those from areas of
sparse circuit utilization, thus obtaining a more uniform
power density across the entire wafer. By so doing, the
system cooling requirements can be relaxed and/or more
circuits can be used on the same wafer. This more uniform
power dissipation could be quite difficult to insure with
other routing techniques since there is less choice in
the used circuit positioning.
A simple means by which a Pad Relocation
0
(12)
Likewise, define the Category history, C_h, at the eth
event,
(13)
$F_f = u_j$    (19)
From equations (11) through (14) we see how the
Authority and Category histories accumulate as a
function of event e. These events are the specific times
when files are accessed by a job. To maintain security
TABLE I-Security property determination matrix

Object           Authority, A                 Category, C               Franchise, F
User, u          Given Constant               Given Constant            u
Terminal, t      Given Constant               Given Constant            set of user:ids
Job, j           min(A_u, A_t)                C_u ∩ C_t                 set of user:ids
File, f
  Existing file  Given Constant               Given Constant            set of user:ids
  New file       max(A_h(e-1), A_f), e > 0    C_h(e-1) ∪ C_f, e > 0     u_j

integrity, these histories can never exceed (i.e., be
greater than) the job security profile. This is specified as,
$A_h(\infty) \le A_j$    (20)
(21)
For e = 0, we see the properties initialized to their
simplest form. However, as e gets large, the histories
accumulate, but never exceed that upper limit set by the
job. A_h(e) and C_h(e) are important new concepts,
discussed in further detail later. We speak of them,
affectionately, as the security "high-water mark," with
analogy to the bath tub ring that marks the highest
water level attained.
The Franchise of a new file is always obtained from
the Franchise of the job given by equation (6). When
μ = 0, the job is controlled by the single user u_j who
becomes the owner and creator of the file with the sole
Franchise for the file.
Access control
Our model is now rich enough to express the equations
of access control. We wish to control access by a user to
the system, to a terminal, and to a file. Access is granted
to the system if and only if
$u \in U$    (22)
where U is the set of all sanctioned users known to the
system.
Access is granted to a terminal if and only if
(23)
If equations (22) and (23) hold, then by definition
$u = u_t = u_j$    (24)
Access is granted to a file if and only if
(25)
for properties A and C according to equations (8) and
(9), and
(26)
If equations (25) and (26) hold, then access is granted
and A_h(e) and C_h(e) are calculated by equations (12)
and (14).
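The accumulation of the high-water mark can be sketched in a few lines of Python. This is only an illustration: the class and method names are hypothetical, Authority is modeled as a small integer and Category as a set of compartment names, and the update rules are taken from the "New file" row of Table I (a maximum for Authority, a union for Category). The starting values (lowest Authority, empty Category set) and the subset test used for Category in `within` are assumptions, since the corresponding equations are not legible in this copy.

```python
from dataclasses import dataclass, field


@dataclass
class HighWaterMark:
    """Running Authority and Category histories of one job (hypothetical)."""
    authority: int = 0                      # assumed starting value
    categories: frozenset = field(default_factory=frozenset)  # assumed empty

    def record_access(self, file_authority, file_categories):
        # Authority history: the highest Authority of any file accessed so far.
        self.authority = max(self.authority, file_authority)
        # Category history: the union of the Categories of all files accessed.
        self.categories = self.categories | frozenset(file_categories)

    def within(self, job_authority, job_categories):
        # The histories must never exceed the job security profile.
        return (self.authority <= job_authority
                and self.categories <= frozenset(job_categories))


hwm = HighWaterMark()
hwm.record_access(2, {"CRYPTO"})       # e = 1
hwm.record_access(1, {"SENSITIVE"})    # e = 2: Authority stays at 2
```

After the two accesses the mark holds Authority 2 and both compartments, so a new file created at that point would be classified at least that high.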
Model interpretation
Three different dimensions for restricting access to
sensitive information and information processes are
possible with the security profile triplet. The generality
of this technique has considerable application to public
and military systems. For the system of interest,
however, the Authority property corresponds to the Top
Secret, Secret, etc., levels of government and military
security; Category corresponds to the host of special
control compartments used to restrict access by project
and area, such as those of the Intelligence and Atomic
Energy communities; and the Franchise property
corresponds to access sanctioned on the basis of
need-to-know. With this interpretation, the popular
security terms "classification" and "clearance" can be
defined by our model in the same dimensions--as a
min/max test on the security profile triplet. Classification
is attached to a security object to designate the
minimum security profile required for access, whereas
clearance grants to a security object the maximum
security profile it has permission to exercise. Thus, legal
access obtains if the clearance is greater than or equal
to the classification, i.e., if equation (25) holds.
Another observation on the model is the "job
umbrella" concept implied by equations (22) through
(26); i.e., the derived clearance of the job (not the
clearance of the user) is used as the security control
triplet for file access. The job umbrella spreads a
homogeneous clearance to normalize access to a
heterogeneous assortment of program and data files.
This simplifies the problem of control in a multi-level
security system. Also note how the job umbrella's
high-water mark (equations (11) through (14)) is used
to automatically classify new files (equations (17) and
(18)); this subject is discussed further below.
A final observation on the model is its application of
need-to-know to terminal access, equation (23). This
feature allows terminals to be restricted to special
people and/or special groups for greater control of
personnel interfaces--i.e., systems programmers, computer
operators, etc.
Security control implementation
The selection of a set-theoretic model of security
control was not fortuitous, but a deliberate choice biased
toward computational efficiency and ease of implementation.
It permits the clean separation and isolation of
security control code from the security control data,
which enables ADEPT's security mechanisms to be
openly discussed and still remain safe--a point advocated
by others.14,16 We achieve this safety by "arming"
the system with security control data only once at
start-up time by the SYSLOG procedure discussed later.
Also, the model improves the credibility of the security
system, enhancing its understanding and thereby promoting
its certification.
Security objects: Identity and structure
Each security object has a unique identification (ID)
within the system such that it can be managed individually.
The form of the ID depends upon the security-object
type; the syntax of each is given below.
User identification
For generality of definition, each user is uniquely
identified by his user:id, which must be less than 13
characters with no embedded blanks.
The user:id can be any meaningful encoding for the
local installation. For example, it can be the individual's
Social Security number, his military serial number, his
last name (if unique and less than 13 characters), or
some local installation man-number convention. The set
of all user:ids constitutes the universal set, U.
Terminal identification
All peripheral devices in ADEPT are identified
uniquely by their IBM 360 device addresses. Besides
interactive terminals, this includes disc drives, tape
drives, line printer, card reader-punch, drums, and 1052
keyboard. Therefore, terminal:id must be a two-digit
hexadecimal number corresponding to the unit address
of the device.
Job identification
ADEPT consists of two parts: the Basic Executive
(BASEX), which handles the allocation and scheduling
of hardware resources, and the Extended Executive
(EXEX), which interfaces user programs with BASEX.
ADEPT is designed to operate itself and user programs
as a set of 4096-byte pages. BASEX is identified as
certain pages that are fixed in main core, whereas EXEX
and user programs are identified as sets of pages that
move dynamically between main and swap memory.
A set of user programs are identified as a job, with page
sets for each program (the program map) described in
the job's environment area, i.e., the job's "state tables."
Every job in ADEPT has an environment area that
is swapped with the job. It contains dynamic system
bookkeeping information pertinent to the job, including
the contents of the machine registers (saved when the
job is swapped out), internal file and I/O control tables,
a map of all the program's pages on drum, user:id, and
the job security control parameters. The environment
page(s) are memory-protected against reading and
writing by user programs, as they are really swappable
extensions of the monitor's tables.
The job:id is then a transitory internal parameter
which changes with each user entrance and exit from the
system. The job:id is a relative core memory address
used by the executive as a major index into central
system tables. It is mapped into an external two-digit
number that is typed to the user in response to a
successful LOGIN.
File identification
ADEPT's file system is quite rich in the variety of
file types, file organization, and equipment permitted.
There are two file types: temporary and permanent.
Temporary files are transitory "scratch" disc files,
which disappear from the system inventory when their
parent job exits from the system. They are always
placed on resident system volumes, and are private to
the program that created them.
Permanent files constitute the majority of files
cataloged by the system. Their permanence derives from
the fact that they remain inventoried, cataloged, and
available even after the job that created or last referenced
them is no longer present, and even if they are not
being used. Permanent files may be placed by the user
on resident system volumes or on demountable private
volumes.
There are six file organizations from which a user may
select to structure the records of his file: physical-sequential,
S1; non-formatted, S2; index-sequential, S3;
partitioned, S4; multiple volume fixed record, S5; and
single volume fixed record, S6. Regardless of the
organization of the records, ADEPT manages them as a
collection, called a file. Thus, security control is at the
file level only, unlike more definitive schemes of
sub-element control.8,10-12
All the control information of a file that describes
type, organization, physical storage' location, date of
creation, and security is distinct from the data records
of the file, and is the catalog of the file.
All cataloged ADEPT files are uniquely identified by
a four-part name; each part has various options and
defaults (system assumptions). This name, the file:id,
has the following form:
file:id ::= name, form, user:id, volume:id
Name is a user-generated character string of up to
eight characters with no embedded blanks. It must be
unique on a private volume as well as for Public files
(described below).
Form is a descriptor of the internal coding of a file.
Up to 256 encodings are possible, although only these
seven are currently applicable:
1 = binary data
2 = relocatable program
3 = non-relocatable program
4 = card images
5 = catalog
6 = DLO (Delayed Output)
7 = line images
User:id corresponds to the owner of the file, i.e., the
creator of the file.
Volume:id is the unique file storage device (tape, disc,
disc pack, etc.) on which the file resides. For various
reasons, including reliability, ADEPT file inventories
are distributed across the available storage media,
rather than centralized on one particular volume. Thus,
all files on a given disc volume are inventoried on
that volume.
Security properties: Encoding and structure
Implementation of the security properties in ADEPT
is not uniform across the security objects as suggested
by our model, particularly the Franchise property. Lack
of uniformity, brought about by real-world considerations, is not a liability of the system but a reflection of
the simplicity of the model. Extensions to the model are
developed here in accordance with that actually
implemented in ADEPT.
Authority
Authority is fixed at four levels (ω = 3 for equation
(1)) in ADEPT, specifically, UNCLASSIFIED, CONFIDENTIAL,
SECRET, and TOP SECRET in
accordance with Department of Defense security
regulations. The Authority set is encoded as a logical
4-bit item, where positional order is important. Magnitude
tests are used extensively, such that the high-order
bits imply high Authority in the sense of equation (8).
Category
Category is limited to a maximum of 16 compartments
(ψ ≤ 15 for equation (2)), encoded as a logical
16-bit item. Boolean tests are used exclusively on this
datum. The definition of (and bit position correspondence to) specific compartments is an installation option
at ADEPT start-up time (see SYSLOG). Typical
examples of compartments are EYES ONLY,
CRYPTO, RESTRICTED, SENSITIVE, etc.
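The two encodings just described can be illustrated with a short Python sketch. The function names are hypothetical, and the exact bit layout is an assumption: one bit per Authority level, with a higher level in a higher-order bit so that a plain magnitude comparison orders the levels, and one bit per compartment in the order given by the installation's compartment dictionary.

```python
AUTHORITY_LEVELS = ["UNCLASSIFIED", "CONFIDENTIAL", "SECRET", "TOP SECRET"]


def authority_code(level):
    """4-bit Authority item: one bit per level (assumed layout)."""
    return 1 << AUTHORITY_LEVELS.index(level)


def category_mask(compartments, dictionary):
    """16-bit Category item: bit i is set if dictionary[i] is held.

    `dictionary` stands for the installation-defined list of compartment
    mnemonics loaded at start-up (at most 16 entries).
    """
    if len(dictionary) > 16:
        raise ValueError("at most 16 compartments are supported")
    mask = 0
    for name in compartments:
        mask |= 1 << dictionary.index(name)
    return mask


dictionary = ["EYES ONLY", "CRYPTO", "RESTRICTED", "SENSITIVE"]
secret = authority_code("SECRET")                       # 0b0100
held = category_mask(["CRYPTO", "SENSITIVE"], dictionary)  # 0b1010
```

With this layout the Authority test is an integer comparison and the Category tests are Boolean AND/OR operations on the masks.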
Franchise
Property Franchise corresponds to the military
concept of need-to-know. Essentially, this corresponds
to a set of user:ids; however, the ADEPT implementation of Franchise is different for each security object:
1. User: All users wishing ADEPT service must be
known to the system. This knowledge is imparted
by SYSLOG at start-up time and limited to
approximately 500 user:ids (max(U) ≤ 500).
2. Terminal: Equation (5) specifies the Franchise
of a given terminal, F_t, as a set of user:ids. In
ADEPT, F_t does not exist. One may define all
the users for a given terminal, i.e., F_t; or alternatively,
all the terminals for a given user. Because
SYSLOG orders its tables by user:id, the latter
definition was found more convenient to
implement.
3. Job: The Franchise of a job is the user:id of the
creator of the job at the time of LOGIN to the
system. Currently, only one user has access to
(and control of) a job (μ = 0 for equation (6)).
4. File: Implementation of Franchise for a file (F_f)
is more extensive than equation (7). In ADEPT,
we wish to control not only who accesses a file,
but also the quality of access granted. We have
defined a set of four exclusive qualities of access,
such that a given quality, q, is defined if
$q \in \{\text{READ}, \text{WRITE}, \text{READ-AND-WRITE}, \text{READ-AND-WRITE-WITH-LOCKOUT-OVERRIDE}\}$    (27)
ADEPT permits simultaneous access to a file by
many jobs if the quality of access is for READ
only. However, only one job may access a file
with WRITE, or READ-AND-WRITE quality.
ADEPT automatically locks out access to a file
being written to avoid simultaneous reading and
writing conflicts. A special access quality, however,
does permit lockout override. Equation (7)
can now be extended as a set of pairs,
$F_f = \{(u_f^0, q^0), (u_f^1, q^1), \ldots, (u_f^\gamma, q^\gamma)\}$    (28)
where the q^i are not necessarily distinct and are given
by equation (27).
The implementation of equation (28) is dependent
upon γ, the number of franchised users.
When γ = 0, we have the ADEPT Private file,
exclusive to the owner, u_f; for γ = max(U), we
have the Public file; values of γ between these
extremes yield the Semi-Private file. γ is
implicitly encoded as the ADEPT "privacy"
item in the file's catalog control data, and takes
the place of F_f for all cases except a Semi-Private
file. For that case exclusively, equation (28) holds
and an actual F_f list of user:id, quality pairs
exists as a need-to-know list. The owner of a file
specifies and controls the file's privacy, including
the composition of the need-to-know list.
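A minimal sketch of the file Franchise as a "privacy" item plus an optional need-to-know list is given below in Python. The class, its field names, and the rule that granting access to another user turns a file Semi-Private are illustrative assumptions, not a description of the ADEPT catalog format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# The four qualities of access of equation (27).
QUALITIES = ("READ", "WRITE", "READ-AND-WRITE",
             "READ-AND-WRITE-WITH-LOCKOUT-OVERRIDE")


@dataclass
class FileFranchise:
    owner: str                       # the owner's user:id
    privacy: str = "PRIVATE"         # "PRIVATE", "SEMI-PRIVATE", or "PUBLIC"
    # Need-to-know list of equation (28): user:id -> quality of access.
    need_to_know: Dict[str, str] = field(default_factory=dict)

    def grant(self, user_id, quality):
        """Owner adds a (user:id, quality) pair; the file becomes Semi-Private."""
        if quality not in QUALITIES:
            raise ValueError("unknown quality of access: " + quality)
        self.need_to_know[user_id] = quality
        self.privacy = "SEMI-PRIVATE"

    def listed_quality(self, user_id) -> Optional[str]:
        """Quality listed for a user; the list is consulted only when Semi-Private."""
        if self.privacy != "SEMI-PRIVATE":
            return None
        return self.need_to_know.get(user_id)


f = FileFranchise(owner="JONES")
f.grant("SMITH", "READ")
```

Keeping the list only for the Semi-Private case mirrors the text: for Private and Public files the privacy item alone decides access.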
Security control initialization: SYSLOG
SYSLOG is a component of the ADEPT initialization
package responsible for arming the security controls. It
operates as one of a number of system start-up options
prior to the time when terminals are enabled. SYSLOG
sets up the security profile data for user:id and
terminal:id, i.e., the "given constants" of Table I.
SYSLOG creates or updates a highly sensitive
system disc file, where each record corresponds to an
authorized user. These records are constructed from a
deck of cards consisting of separate data sets for
compartment definitions, terminal:id classification, and
user:id clearance. The dictionary of compartment definitions contains the less-than-9-character mnemonic for
each member of the Category set. Data sets are formed
from the card types shown in Table II. Use of passwords
is described later in the LOGIN procedure.
An IDT card must exist for each authorized user; the
PWD, DEV, SEC, and CAT card types are optional.
Other card types are possible, but not germane to
security control, e.g., ACT for accounting purposes.
More than one PWD, DEV, and CAT card is acceptable
up to the current maximum data limits (i.e., 64 passwords,
48 terminal:ids, and 16 compartments).
A variety of legality checks for proper data syntax,
quantity, and order are provided. SYSLOG assumes the
following default conditions when the corresponding
card type is omitted from each data set:

PWD    No password required
DEV    All terminal:ids authorized
SEC    A = UNCLASSIFIED
CAT    C = null (all zero mask)

This gives the lowest user clearance as the default,
while permitting convenient user access. Various options
exist in SYSLOG to permit maintenance of the internal
SYSLOG tables, including the replacement or deletion
of existing data sets in total or in part.
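The default rules for one user data set can be sketched as follows in Python. The function name, the record layout, and the representation of a card as a (type, value) pair carrying a single value are assumptions made for illustration; the card types, defaults, and limits are those stated above.

```python
LIMITS = {"PWD": 64, "DEV": 48, "CAT": 16}   # maximum cards of each type


def build_user_record(cards):
    """Build one user's security record from (card_type, value) pairs.

    The data set must start with an IDT card; omitted card types fall
    back to the defaults listed in the text.
    """
    if not cards or cards[0][0] != "IDT":
        raise ValueError("a user data set must start with an IDT card")
    record = {
        "user_id": cards[0][1],
        "passwords": None,            # PWD omitted: no password required
        "terminals": None,            # DEV omitted: all terminal:ids authorized
        "authority": "UNCLASSIFIED",  # SEC omitted
        "categories": [],             # CAT omitted: null Category set
    }
    key = {"PWD": "passwords", "DEV": "terminals", "CAT": "categories"}
    for card_type, value in cards[1:]:
        if card_type == "SEC":
            record["authority"] = value
        elif card_type in key:
            if record[key[card_type]] is None:
                record[key[card_type]] = []
            if len(record[key[card_type]]) >= LIMITS[card_type]:
                raise ValueError("too many " + card_type + " cards")
            record[key[card_type]].append(value)
        # Other card types (e.g., ACT) are not security data and are skipped.
    return record


default = build_user_record([("IDT", "JONES")])
cleared = build_user_record([("IDT", "SMITH"), ("SEC", "SECRET"),
                             ("PWD", "ALPHA"), ("DEV", "1F"), ("CAT", "CRYPTO")])
```

A data set holding only an IDT card therefore yields the lowest clearance with unrestricted terminal access, as the text describes.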
The sensitivity of the information in the security
control deck is obvious. Procedures have been developed
at each installation that give the function of deck
creation, control, and loading to specially cleared
security personnel. The internal SYSLOG file itself is
protected in a special manner described later.
Access control
A fundamental security concern in multi-access systems
is that many users with different clearances will be
simultaneously using the system, thereby raising the
possibility of security compromise. Since programs are
the "active agents" of the user, the system must
maintain the integrity of each and of itself from
accidental and/or deliberate intrusion. A multifile
system must permit concurrent access by one or more
jobs to one or more on-line, independently classified files.
ADEPT is all these things--multiuser, multiprogram,
and multifile system. Thus, this section deals with access
control over users, programs, and files.

TABLE II-SYSLOG control cards

Card Type                              Purpose
DICT                                   Identifies start of data set of compartment definitions.
compartment1 ... compartment16         Defines up to 16 compartments.
TERMINAL                               Identifies start of data sets of terminal definitions.
UNIT terminal:id                       Identifies start of a terminal data set.
IDT user:id                            Identifies start of a user data set.
PWD password1 ... password64           Defines legal passwords for user:id up to 64.
DEV terminal:id1 ... terminal:id48     Defines legal terminals for user:id up to 48.
SEC Authority                          Defines user:id Authority.
CAT compartment1 ... compartment16     Defines user:id Category set.

User access control: LOGIN
To gain admittance to the system, a user must first
satisfy the ADEPT LOGIN decision procedure. This
procedure attempts to authenticate the user in a fashion
analogous to challenge-response practices.
The syntax of the ADEPT LOGIN command, typed
by a user on his terminal, is as follows:
/LOGIN user:id password accounting
Figure 1 pictorially displays the LOGIN decision
procedure based upon the user-specified input parameters.
User:id is the index into the SYSLOG file used to
retrieve the user security profile. If no such record exists
(i.e., equation (22) fails), the LOGIN is unsuccessful and
system access is denied. If the security profile is found,
LOGIN next retrieves the terminal:id for the keyboard
in use from internal system tables, and searches for a
match in the terminal:id list for which the user:id was
franchised by SYSLOG. An unsuccessful search is an
unsuccessful LOGIN.
If the terminal is franchised, then the current password
is retrieved from the SYSLOG file for this user:id
and matched against the password entered as a keyboard
parameter to LOGIN. An unsuccessful match is again
an unsuccessful LOGIN. Furthermore, the terminal is
ignored (will not honor input) for approximately 30
seconds to frustrate high-speed, computer-assisted
penetration attempts. If, however, the match is
successful (equation (22) holds), the current password in
the SYSLOG file for this user:id is discarded, and
LOGIN proceeds to create the job clearance.
Figure 1-LOGIN decision procedure (flowchart; its steps
are annotated with equations (22), (23), (22), and (15) and (16))
Passwords in ADEPT obey the same syntax conventions
as user:id. (See the earlier description of User
Identification.) Although easily increased, currently
SYSLOG permits up to 64 passwords. Each successful
LOGIN throws away the user password; 64 successful
LOGINs are possible before a new set of passwords
need be established. If other than random, once-only
passwords are desired, the 64 passwords may be encoded
in some algorithmic manner, or replicated some number
of times. Once-only passwords is an easily implemented
technique for user authentication, which has been
advocated by others.2,7 It is a highly effective and
secure technique because of the high permutability of
12-character passwords and their time and order
interdependence, known only to the user.
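The LOGIN decision procedure with once-only passwords can be sketched in Python as follows. The function and the dictionary layout standing in for the SYSLOG file are hypothetical; `None` is used here to mean "no restriction" (no password required, or all terminals authorized), and treating the first remaining password as the "current" one is an assumption. The 30-second terminal lockout after a failed match is omitted.

```python
def login(syslog, user_id, terminal_id, password):
    """Return True if the user is admitted, False otherwise."""
    record = syslog.get(user_id)
    if record is None:
        return False                      # unknown user: equation (22) fails
    terminals = record["terminals"]
    if terminals is not None and terminal_id not in terminals:
        return False                      # terminal not franchised for this user
    passwords = record["passwords"]
    if passwords is None:
        return True                       # no password required for this user
    if not passwords or password != passwords[0]:
        return False                      # list exhausted, or wrong password
    passwords.pop(0)                      # once-only: discard the used password
    return True


syslog = {"JONES": {"terminals": ["1F"], "passwords": ["ALPHA", "BRAVO"]}}
first = login(syslog, "JONES", "1F", "ALPHA")    # admitted, ALPHA discarded
replay = login(syslog, "JONES", "1F", "ALPHA")   # refused: BRAVO is now current
```

Distinguishing an exhausted password list from "no password required" matters in such a sketch: an empty list must deny access until a new set of passwords is established.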
Once the authentication process is completely satisfied, LOGIN creates the job security profile according to
equations (15) and (16) of our model. That is, the lower
Authority of the user and the terminal becomes A_j, and
the intersection (logical AND) of the user and terminal
Category sets becomes the Category of the job, C_j. For
example, a user with TOP SECRET Authority and a
Category set (1001 1001 0000 1101) operating from a
SECRET level terminal with a Category set (0000 0000
0000 0010) controls a job cleared to SECRET with an
empty Category set.
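The example above can be reproduced with a few lines of Python. The function name is hypothetical; Authority is represented by its level name and Category by a 16-bit integer mask, which is one plausible reading of the encodings described earlier.

```python
LEVELS = ["UNCLASSIFIED", "CONFIDENTIAL", "SECRET", "TOP SECRET"]


def job_clearance(user_authority, user_categories,
                  terminal_authority, terminal_categories):
    """Job profile: lower Authority, intersection of Category sets."""
    a_j = min(user_authority, terminal_authority, key=LEVELS.index)
    c_j = user_categories & terminal_categories   # logical AND of the masks
    return a_j, c_j


a_j, c_j = job_clearance("TOP SECRET", 0b1001_1001_0000_1101,
                         "SECRET", 0b0000_0000_0000_0010)
# a_j is "SECRET" and c_j is 0 (the empty Category set)
```

The terminal's single compartment is not among those held by the user, so the AND of the two masks is zero even though both Category sets are non-empty.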
Program access control: LOAD
As noted earlier, the ADEPT Executive consists of
two parts: BASEX, the resident part, and EXEX, the
swapped part. EXEX is a body of reentrant code
shared by all users; however, it is treated as a distinct
program in each user's job. Up to four programs can
exist concurrently in the job. Each operates with the job
clearance-the job clearance umbrella.
LOAD is the ADEPT component used to load the
programs chosen by the user; it is part of EXEX and
hence operates as part of the user's job with the job's
clearance. Programs are cataloged files and as such may
be classified with a given security profile. As is described
in "File Access Control" below, LOAD can only load
those programs for which the job clearance is sufficient.
Once loaded, however, the new program operates with
the job clearance.
In this manner, we see the power of the job umbrella
in providing smooth, flexible user operation concurrent
with necessary security control. Program files may be
classified with a variety of security profiles and then
operate with yet another, i.e., the job clearance. By this
technique security is assured and programs of different
classifications may be operated by a user as one job. It
permits, for example, an unclassified program file (e.g.,
a file editor) to be loaded into a highly classified job to
process sensitive classified data files.
File access control: OPEN
Before input/output can be performed on a file,
a program must first acquire the file by an OPEN call
to the Cataloger. Each program must OPEN a file for
itself before it can manipulate the file, even if the file is
already OPENed for another program. A successful
OPEN requires proper specification of the file's descriptors-some of which are in the OPEN call, others of
which are picked up directly by the Cataloger from the
job environment area (e.g., job clearance, user:id)-and
satisfactory job clearance and user:id need-to-know
qualifications according to equations (25) and (26) of
our model. Equation (25) is implemented by (8) as a
straightforward magnitude comparison between A_j and
A_f. Equation (25) is implemented by (9) as an equality
test between C_f and (C_j ∧ C_f). We use (C_j ∧ C_f) to
ensure that C_f is a subset of the job categories; i.e., the
job umbrella. Lastly, equation (26) is a NOP if the file
is Public; a simple equality test between u_j and u_f if the
file is Private; and a table search of F_f for u_j if the file
is Semi-Private. These tests do increase processing time
for file access; however, the tests are performed only
once at OPEN time, where the cost is insignificant
relative to the I/O processing subsequently performed
on the file.
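The three OPEN-time tests can be sketched in Python as below. The function name and argument layout are hypothetical; Authority is modeled as an integer whose magnitude orders the levels, Category as a bit mask, and the need-to-know list as a plain collection of user:ids (the quality of access is left out of this sketch).

```python
def open_permitted(job_authority, job_categories, job_user,
                   file_authority, file_categories,
                   file_owner, privacy, need_to_know=()):
    """Return True if the job may OPEN the file."""
    # Authority: magnitude comparison between job and file.
    if job_authority < file_authority:
        return False
    # Category: the file's compartments must all be held by the job,
    # tested as an equality between C_f and (C_j AND C_f).
    if file_categories != (job_categories & file_categories):
        return False
    # Franchise: no test for Public, owner equality for Private,
    # table search of the need-to-know list for Semi-Private.
    if privacy == "PUBLIC":
        return True
    if privacy == "PRIVATE":
        return job_user == file_owner
    return job_user in need_to_know


ok = open_permitted(2, 0b0110, "JONES", 1, 0b0100, "SMITH", "PUBLIC")
denied = open_permitted(2, 0b0110, "JONES", 1, 0b1000, "SMITH", "PUBLIC")
```

Each test is a comparison, a mask operation, or a short table search, which is consistent with the remark that the cost is small relative to the I/O that follows.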
The quality of access granted by a successful OPEN,
and subsequently enforced for all I/O transfers, is that
requested, even if the user has a greater Franchise. For
example, during program debugging, the owner of a file
may OPEN it for READ access only, even though
READ-AND-WRITE access quality is permitted. He
thereby protects his file from possible uncontrolled
modification by an erroneous WRITE call.
Considerable controversy surrounds the issue of
automatic classification of new files formed by subset or
merger of existing files. The heart of the issue is the poor
accuracy of many such classification techniques17 and
the fear of too many over-classified files (a fear of
operations personnel) or of too many under-classified
files (a fear of the security control officers). ADEPT
finesses the problem with a clever heuristic--most new
files are created from existing files, hence classify the new
file as a private file with the composite Authority and
Cate.
[Table: AUDIT events and the information recorded for each.
Events listed: LOGIN, LOGOUT, CHANGE FILE, CLOSE FILE,
DELETE FILE, RECLASS, OPEN FILE, REOPEN FILE (1), REPLACE,
DEVICE LIST (2), CATEGORY DICTIONARY (3), RESTART (4),
WRAPUP (5).]
1 This is the "OPEN existing file" command.
2 A list of all the terminal devices and their assigned security and categories is recorded at each system load.
3 A list of the prose category names is recorded at each system load.
4 Whenever the system is restarted on the same day (and AUDIT had been turned on earlier that day) the time of
the restart is recorded.
5 The time that the AUDOFF action was taken, or the time that the WRAPUP function called AUDIT, to terminate the
AUDIT function.
fully demonstrated a security control mechanism that
more than adequately supports heterogeneous levels and
types of classification. Of note in this regard is the
LOGIN decision procedure, access control tests, job
umbrella, high-water mark, and audit trails recording.
The approach can be improved in the direction of more
compartments (on the order of 1000 or more), extension
of the model to include system files, and the implementation of a single Franchise test for all security
objects. The implementation needs redundant encoding
and error detection of security profile data to increase
confidence in the system-though we have not ourselves
experienced difficulty here. The increase in memory
requirements to achieve these improvements may force
numerical encoding of security data, particularly
Category, as suggested by Peters.7
Second, SYSLOG has been highly successful in
demonstrating the concept of "security arming" of the
system at start-up time. Our greatest difficulty in this
area has been with the human element--the computer
operators--in preparing and handling the control deck.
In opposition to Peters,7 we believe the operator should
not be "designed out of the operation as much as
possible," but rather his capabilities should be upgraded
to meet the greater levels of sophistication and responsibility
required to operate a time-sharing system.20 He
should be considered part of line management. ADEPT
is oriented in this direction and work now in progress is
aimed at building a real-time security surveillance and
operations station (SOS).
Third, we missed the target in our attempt to isolate
and limit the amount of critical coding. Though much
of the control mechanism is restricted to a few components--LOGIN,
SYSLOG, CATALOGER, AUDIT--enough
is sprinkled around in other areas to make it
impossible to restrict the omnipotent capabilities of the
monitor, e.g., to run EXEX in Problem state. Some
additional design forethought could have avoided some
of this dispersal, particularly the wide distribution in
memory of system data and programs that set and use
these data. The effect of this shortcoming is the need for
considerably greater checkout time, and the lowered
confidence in the system's integrity.
Lastly, on the brighter side, we were surprisingly
frugal in the cost of implementing this security control
mechanism: It took approximately five percent of our
effort to design, code, and checkout the ADEPT
security control features. The code represents about ten
percent of the 50,000 instructions in the system. Though
the code is widely distributed, SYSLOG, security
commands, LOGIN, AUDIT, and the CATALOGER
account for about 80 percent of it. The overhead cost of
operating these controls is difficult to measure, but it is
quite low, in the order of one or two percent of total
CPU time for normal operation, excluding SYSLOG.
(SYSLOG, of course, runs at card reader speed.) The
most significant area of overhead is in the checking of
I/O channel programs, where some 5 to 10 msec are
expended per call (on the average). Since this time is
overlapped with other I/O, only CPU bound programs
suffer degradation. AUDIT recording also contributes
to service call overhead. In actuality, the net operating
cost of our security controls may be zero or possibly
negative, since AUDIT recordings showed us numerous
trivial ways to measurably lower system overhead.
ACKNOWLEDGMENTS
I would like to acknowledge the considerable encouragement
I received in the formative stages of the ADEPT
security control design from Mr. Richard Cleaveland, of
the Defense Communications Agency (DCA). I would
like to thank Mrs. Martha Bleier, Mr. Peter Baker, and
Mr. Arnold Karush for their patient care in designing
and implementing much of the work I've described.
Also, I wish to thank Mr. Marvin Schaefer for assisting
me in set theory notation. Finally, I would like to
applaud the ADEPT system project personnel for
designing and building a time-sharing system so
amenable to the ideas discussed herein.
REFERENCES
1 A HARRISON
The problem of privacy in the computer age: An annotated
bibliography
RAND Corp Dec 1967 RM-5495-PR/RC
2 L J HOFFMAN
Computers and privacy: A survey
Stanford Linear Accelerator Center Stanford Univ Aug
1968 SLAC-PUB-479
3 H E PETERSEN R TURN
System implications of information privacy
Proc SJCC Vol 30 1967 291-300
4 W H WARE
Security and privacy in computer systems
Proc SJCC Vol 30 1967 279-282
5 W H WARE
Security and privacy: Similarities and differences
Proc SJCC Vol 30 1967 287-290
6 R LINDE C WEISSMAN C FOX
The ADEPT-50 time-sharing system
Proc FJCC Vol 35 1969 Also issued as SDC Doc SP-3344
7 B PETERS
Security considerations in a multi-programmed computer
system
Proc SJCC Vol 30 1967 283-286
8 RYE CAPRI COINS OCTOPUS SADIE Systems
NOC Workshop National Security Agency Oct 1968
9 H W BINGHAM
Security techniques for EDP of multi-level classified
information
Rome Air Development Center Dec 1965 RADC-TR-65-415
10 R M GRAHAM
Protection in an information processing utility
ACM Symposium on Operating Systems Principles Oct
1967 Gatlinburg Tenn
11 L J HOFFMAN
Formularies-Program controlled privacy in large data bases
Stanford Univ Working Paper Feb 1969
12 D K HSIAO
A file system for a problem solving facility
Dissertation in Electrical Engineering Univ of Pa 1968
13 J I SCHWARTZ C WEISSMAN
The SDC time-sharing system revisited
Proc ACM Conf 1967 263-271
14 P BARAN
On distributed communications: IX, security, secrecy, and
tamper-free considerations
RAND Corp Aug 1964 RM-3765-PR
15 C WEISSMAN
Programming protection: What do you want to pay?
SDC Mag Vol 10 No 8 Aug 1967
16 J P TITUS
Washington commentary-Security and privacy
CACM Vol 10 No 6 June 1967 379-380
17 I ENGER et al
Automatic security classification study
Rome Air Development Center Oct 1967 RADC-TR-67-472
18 A KARUSH
The computer system recording utility: Application and
theory
System Development Corp March 1969 SP-3303
19 A KARUSH
Benchmark analysis of time-sharing systems: Methodology and
results
System Development Corp April 1969 SP-3343
20 R R LINDE P E CHANEY
Operational management of time-sharing systems
Proc 21st Nat ACM Conf 1966 149-159
Management of confidential information
by EDWARD V. COMBER
System Dynamics, Inc.
Oakland, California.
INTRODUCTION
For many years, informed persons have expended
considerable time and energy attempting to evolve
an acceptable philosophic assessment of the concept
of "privacy." Studies made in the fields of anthropology,
psychology, and sociology are in general agreement
that both the mental and physical well-being of an
individual require freedom to experience some degree
of personal anonymity within the environment.
While the significance of "privacy" has been recognized,
it has eluded the constraint of an acceptable definition. The search for a workable definition continues
as man seeks a means for establishing practical bounds
for inter-personal relations.
Recently, the concern for "privacy" has become a
rallying point for those who see the present growth
and applications of data automation as a threat to
the "rights of privacy" of the individual. These advocates lament that the individual is unaware of the
threat to his "loss of privacy" as his attention is
diverted by the glowing promises of anticipated
benefits that may become available through data
automation.
It is the writer's belief that through the proper and
reasonable utilization of the tools of modern data technology man will have within his power a mechanism
that has the potential of becoming his strongest ally
in his search for means to preserve the values of "privacy." In reality, the critical element in this question
of "privacy" should not address itself to the electromechanical capability of the computer or system telecommunications functions. The true focal point is the
direct challenge to the discipline and conduct of man
who is the designer and user of the data system.6 Man
must be willing to abide by the standards he derives
from his own "privacy" criteria. He must staunchly
forego any temptation to engage in system shortcuts,
and he must hold to the position that he will not accept
lightly any violations of the "confidentiality controls"
established for system operation. Any breach in the
integrity of the system must be viewed as a direct
personal challenge to the integrity of each person
associated with the undertaking.
SUMMARY
The following is a brief resume of significant elements
that have been identified with the question of "privacy." These comments are not offered as final nor
are they to be considered as embracing the entire
area of concern. The summary is presented simply as
a means of bringing together some key factors that
could serve as a foundation for a basic "privacy" control system. The working standards will evolve as
man gains more experience with this powerful ally
and is able to resolve philosophical and ethical questions that are inherent in the concept of "privacy".
As the environment and pace of modern life adjust
to current needs, the nature of "privacy" will probably
also reflect changes in priorities and the character of
the social stresses.
Elements in the invasion of privacy
No definitive statement exists which provides a
clear and acceptable statement of what is "private
information," or what constitutes an "unwarranted
invasion of privacy." Any criteria proposed to date
to identify "private information," or describe an act
that would constitute "unwarranted invasion of privacy," must take into account whether or not such
disclosure of the specific data:
A. Would relate to an individual, a family or other
small group in such manner as to facilitate the
likelihood of the unwarranted identification of
the individuals, or
B. Would involve data not considered public information by
provision of legal statute, or
C. Would cause or be the basis for unjust economic
loss or social stigma or harassment to the
individual, or
D. Would result in the unnecessary loss of a property
right.
What is private vs. what is confidential?
When attempting to discuss "privacy," the term
"confidentiality" inevitably will join the debate, but
does not promote clarification. What sort of personal
information do reasonable men interpret as "private?"
The answer to this question depends upon many
things; for example, any one or more of the following
factors may apply:
A. The context within which the specific information
is embedded,
B. The amount of information assembled and accessible,
C. The intrinsic nature of the information,
D. The sophistication of the social values held by
the individuals concerned,
E. The character and scope of the sub-culture,
F. Significance of personal attributes such as: age,
ancestry, social status, race, etc.
Recently, the California Intergovernmental Board
on EDP was established by statute.1 It is charged
with responsibility to provide for intergovernmental
representation in the coordination of the many government sponsored EDP programs and to take leadership
in the establishment of intersystem standards. The
Intergovernmental Board appointed a select Technical Advisory Committee to assist in the preparation
of a Manual to serve as a guideline for all agencies
in the development of local systems and facilitate
adequate interface capability as required. The manual
was completed and is under review by the Intergovernmental Board prior to general release to official
agencies throughout the State of California.
A sub-committee of the Technical Advisory Committee was specifically assigned to address the question
of "privacy". The members of the Privacy Sub-committee
concluded, after some study, that there are a
number of personal information items that could be
made accessible to an integrated data system without
any threat to the individual "privacy". It was also
recognized that there are many other data items that
for one reason or another should be restricted from
wide access in the absence of an established right to
know. Some examples of these data items are as shown
below:
A. Information that may not be relevant to personal
privacy:
Name
Maiden Name
Address
Age or DOB
Race
Sex
Marital Status
Name of Spouse
Next of Kin
B. Information that would probably be relevant to
personal privacy:
Occupation
Education
Income
Religious Preference
Political Preference
Family Size
Number of Children
Ages of Children
Taxes Paid
History of Residence
Attitudes Toward Social Issues
Property Ownership
Value of Real Property
Marital History
Drinking Practices
Hospitalization Record
Medical Record
Symptoms of Illness
Record of Arrest
Ancestry
Nationality
Name of Relatives
Response to Psychological
or Medical Questions
Proliferation of data items throughout culture
While some of the information items mentioned
above may be found on records that are classified as
confidential, many of the information items may also
be found on records that are not subject to restriction
by law or policy. The current trend in social intercourse and information exchange reflects an ever-broadening depth of self-revealment by individuals.
Private and governmental services are being extended
into newer areas and thereby attracting the participation of an ever-growing segment of the citizenry.
The integration of interagency information systems
with data exchange introduces a new dimension associated with the creation of composite record images
of persons known to the total system. These images are
the product of independent and frequently unrelated
inputs of data to serve other specific needs. Any
integrated interagency information system with this
potential capability must be administered by professionally qualified persons who remain sensitive to the
need to verify both the identification of the subject
of inquiry and the inquirer's "right to know". As more
data systems are activated and interfaces are established, the individual who is the initial source of the
data becomes more remote and isolated from the
operational inquiry that relates to his record. It should
be the constant aim of the system design, operational
programming, and user discipline to assure that system
integrity is not subverted.
Significance of developing standards for data verification
Attention should not be directed solely to provide
for the identification and classification of personal
data items. What is equally important, standards
must be developed and adopted to guide data acceptance and utilization with respect to the ability
to verify the information. For example, the confidence
in the operating system will be increased and utilization encouraged if the user is assured that data items
are subject to verification as to:
A. Accuracy
B. Bias
C. Completeness
D. Currency
E. Documentation
F. Satisfaction of Legal Requirements
A safety valve that will support a sound verification program is to initiate a practical data purge
system. The best data system in terms of cost/benefit
analysis is one that has a high content of active data
and one that is adequately updated. The effect of establishing a continuous and critical purge system is to
provide an orderly review of file content, to remove
inactive or low value data.
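The purge review described above can be illustrated with a minimal sketch. The record fields, the one-year idle threshold, and the function name are assumptions made for illustration only; the paper does not prescribe specific purge criteria.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    key: str          # hypothetical record identifier
    last_used: date   # date of the most recent inquiry or update
    verified: bool    # whether the item passed its last verification check


def purge_candidates(records, today, max_idle_days=365):
    """Return records flagged for purge review.

    A record is flagged when it has been idle longer than max_idle_days
    or when it has not been verified. Flagged records are candidates for
    human review, not automatic deletion.
    """
    cutoff = today - timedelta(days=max_idle_days)
    return [r for r in records if r.last_used < cutoff or not r.verified]


records = [
    Record("A-100", date(1969, 10, 1), True),
    Record("A-101", date(1967, 3, 15), True),
    Record("A-102", date(1969, 9, 20), False),
]
flagged = purge_candidates(records, today=date(1969, 11, 18))
print([r.key for r in flagged])
```

Running the review on a fixed schedule would give the orderly, continuous examination of file content that the text calls for.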
One approach to a data classification plan
A number of studies have been undertaken in an
attempt to identify and define data items that should
be processed as classified or confidential. There have
been perhaps as many solutions offered as there have
been studies proposed. The Privacy Sub-committee
mentioned above proposed a simple three category
data plan for consideration and approval of the California Intergovernmental Board on EDP.2 The concept is summarized below:
A. Confidential:
This classification has the highest level of
restriction, and should be limited to data which
is prohibited from free and full disclosure by
statutory regulation (law).
B. Restricted:
This is data which:
1. Is not prohibited from full and free disclosure by statute (confidential), and
2. Is such that an unauthorized intrusion could constitute an unwarranted invasion of personal privacy, and
3. Has been administratively assigned a
security classification-restricted.
C. Unclassified:
All data maintained by a public agency not
otherwise classified as confidential or restricted
as defined above.
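The three-category plan above amounts to a simple decision rule, sketched below. The function and parameter names are hypothetical; only the three categories and their conditions are taken from the plan as summarized.

```python
from enum import Enum


class Classification(Enum):
    CONFIDENTIAL = "confidential"
    RESTRICTED = "restricted"
    UNCLASSIFIED = "unclassified"


def classify(prohibited_by_statute, privacy_sensitive, administratively_restricted):
    """Apply the three-category plan to one data item.

    prohibited_by_statute: disclosure is barred by law (category A).
    privacy_sensitive: unauthorized intrusion could be an unwarranted
        invasion of personal privacy (condition B.2).
    administratively_restricted: the agency has assigned the restricted
        classification (condition B.3).
    """
    if prohibited_by_statute:
        return Classification.CONFIDENTIAL
    if privacy_sensitive and administratively_restricted:
        return Classification.RESTRICTED
    return Classification.UNCLASSIFIED


print(classify(True, True, False).value)
print(classify(False, True, True).value)
print(classify(False, True, False).value)
```

Note that under this reading a privacy-sensitive item stays unclassified until an administrator assigns the restricted classification, which is why periodic review of assignments matters.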
Sources of classification criteria
The criteria for the establishment of classification
of data arise from a variety of sources. In many instances, the criteria are a result of the interaction of
one or more of the following:
A. Public Policy:
The living residue of tradition and social acceptance.
B. Statutory Law:
The formalized and legal codification of social
needs and standards of conduct.
C. Legal Interpretation:
The implementation of judicial and administrative decisions that have been sanctioned
through public acceptance.
D. User Agency Specifications:
Operational decisions that have been adopted
and enunciated to promote agency goals in
an atmosphere of public support.
E. Personal Needs of The Individual:
Acceptance of the system integrity by the public who participate and furnish personal information to assist an agency function with respect
to the needs of the individual (Federal Census,
Social Security, etc.).
Each of the sources of criteria utilized is subject
to its own characteristic variations, and will require
continuous reevaluation. The scope of data items
subject to the confidential classification is under
constant adjustment and reassessment due to the
dynamic character of the social conditions which give
rise to the data.
Identification of areas sensitive to intrusion3
One of the main deterrents to the development of
new ideas about privacy has been the lack of specificity
as to where the threats to privacy may arise. Many
agree that at some future date, a serious threat may
develop. That a real danger exists today is not universally accepted.
Let us consider the potential challenge to "privacy"
that may originate from any of the following sources:
A. The accidental observance of data by an individual.
B. The accidental dumping of a volume of confidential data to general view.
C. The solitary snoop.
D. The snoop-for-pay (hired spy).
E. The file stealer.
F. Misuse of confidential file by administrator having
access to system.
G. Organized crime.
H. Totalitarian government.
I. Another possibility might be the intrusion of the
private sector into government data files.
Establish policy on data classification
Before any acceptable automation program can
be developed to process information that may be considered "private" or "confidential," certain policy
decisions must be resolved.
A. The responsible administrators representing users
of the system must reach agreement on the data
content of the information system. This agreement must include the identification of any
data items or files that would be subject to
restricted access or inquiry. If the restriction
is pursuant to current policy, said policy should
be specified:
1. General Public Policy
2. Agency Administrative Policy
3. Statutory Provision
4. Judicial Ruling
B. Specific criteria should be established based on
the accepted policy statements, and serve as a
guide to test the classification of all data introduced into the system. The continued validity
of a classification should be based upon periodic
challenge and justification.
C. A policy manual should be prepared and maintained as a ready reference to facilitate system
operation.
1. Personnel participating in the system should
be held individually accountable for full
compliance with the "policy guidelines."
2. The policy manual should be subject to
continuous review and update to remain
current with system requirements, technology, and legal specifications.
D. Additional considerations in the development of
an Interagency Information System to maintain privacy control. Decisions regarding the
following elements of the system design and
operation will prove significant:
1. Facility Security:
(a) Location of Hardware
Single vs. Multiple Facility
(b) Physical Adequacy
Equipment
Personnel
(c) Access to Facility
Normal
Emergency
2. Equipment:
(a) Selection
(b) Configuration
(c) Operating Characteristics
Multi-processing
Multi-programming
Remote Terminals
3. Program Control:
(a) Single Management Responsibility
(b) User Representation and Participation
(c) Operating System
Monitor of System Services And Access
(d) System Applications
Man Machine Interface (Key Consideration)
(e) Modularization of System Applications
Does Modularization Weaken Privacy Control?
(f) Integration of Compatible Systems
Does Program Control Reside With The Core System?
4. The Human Factor:
This is the critical and perhaps most
unpredictable element in the functioning
process.
(a) Personnel Recruitment, Selection
And Appointment
(b) Personnel Training And Supervision
(c) Maintenance of Operating Discipline
(d) Personnel Retention
Precautions to minimize potential for "privacy" violations
The same versatility and power that make the
computer valuable as a data manipulator can be employed to monitor system services and support human
supervision procedures. The operating information
system should provide (assuming an adequate system
analysis and design):
A. A Sound Data Classification System
1. Specify data subject to restricted access
and special protection.
2. Provide for isolated storage of restricted
data if necessary.
3. Determine who has right to access to
confidential data and under what operating
conditions.
4. User agency personnel should be certified for access by administration.
B. Physical Conditions:
What levels of control should be imposed to
promote system integrity and at the same time
provide a functional environment that will
encourage system utilization by the participants for which it was designed.
1. Equipment (system hardware):
(a) Location and physical security of
equipment.
(1) Central Computer Installation
(2) Associated Peripheral
Equipment
(3) Back-Up Facilities-Duplicate Files
(b) Remote terminal installations
(I/O devices)
(c) Circuit Security
2. System Configuration
(a) Central Data Bank vs. Dispersed
Data Bases
(b) Central Data File vs. Central Index Concept
(c) Central System Control vs. Remote
Terminal Activation
(1) Restricted Terminal Operation
(2) Multiple Function Remote Terminal
3. Software System Support-Programming
must be developed with an awareness of
the need for system integrity and data
security. Provision must be made to provide control over basic software components, such as:
(a) Program Library
(b) Back-Up Documentation
(c) Diagnostic And Test Routines
(d) Continuous Coding of Update
Schedules That Support The
Identification Schemes Inherent
to The Confidentiality Control
Programs
(e) Transaction Monitor Logs Should
Be Designed to Provide The
Basis For Operational Supervision But Not Reveal The
Location or Content of The Confidential Files Which Are Subject
to Monitor Control
4. Personnel Requirements-If the system
equipment and facilities justify particular planning to minimize the hazards to
confidentiality, it is certain that consideration must be given to the personnel who will
function in the system. The scope of attention should extend through both the
employees who perform the technical
services associated with EDP, and the
operating personnel of the agency for
which the information system was developed. Despite all that has been said
heretofore, the "key" to security of information rests with the individuals who
have access to the data system. Our
personnel planning should encompass
many specific areas. The following relate
most directly to physical factors:
(a) Personal Safety
(1) Area Accessibility
(2) Emergency Provisions
(b) Personal Accountability
(1) Identification Control
Plan
(a) Access to Installation
(b) Access to Specific
Work Areas
(2) Is the Plan Practical-Used?
(c) Conveniences And Necessities
(1) Are They Adequate?
(2) Are They Properly Located?
(d) What Special Precautions Are Warranted When Non-employee Personnel Are Permitted Access to
The Installation Area?
(1) Equipment Maintenance
(2) Building Service Maintenance
C. System Design Considerations:
Control provided through specific programming techniques.
1. Limiting Terminal Access to The System-Programming
(a) Classification Schedule (Data Level
Control)
(1) Terminal Identification
(2) Terminal Verification
(3) User Identification
(4) User Verification
(5) Call-Back Concept
(b) Restriction of Detail of Information in Response to Inquiry
(Data Item Control)
(1) Refer to Index - Pointer
to Source Data
(2) Status Indicator
(3) Advise Supervisory Station
(a) Secure Permission to Interrogate The Restricted File
(b) Receive Selected Response Through Monitor Agent
(4) Specific Limitation on Terminal Operation
(a) Data Input
(b) Data Manipulation
(c) Data Output
(d) Data Change or
Update
(e) Data Purge
2. Establish A Monitor On All Terminal
Action to Intercept and Identify unauthorized attempts to access the system.
(a) Identify Transmitting Terminal
And Location
(b) Identify Terminal Operator(?)
(c) Identify Specific Nature of Restricted Access Attempt
(d) Provide For Supervisory Level
Notification of The Attempt to
Support Maintenance of System
Discipline
(e) Abort The Unauthorized Attempt
to Secure Data
3. Maintain audit review of selected files to
facilitate the orderly purge of files and to
check levels of file activity
(a) Establish, as necessary, periodic
file review procedures to challenge the continued "confidential" status of individual data
items to assure conformity with
system policy and user need
(b) Maintain necessary statistical
measures of activity in restricted
files to document operational
policy decisions.
(c) Provide special test routines to
challenge the confidentiality
procedures and verify system
functional integrity
(d) The Human Factor- The concern
for confidentiality of data and
file security eventually will focus on an assessment of problems
that arise from the human element in the man-machine system. Despite the sophistication
exercised in system analysis, design and implementation, specific
recognition must be given to
the fact that people participate
in system operations.
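The terminal monitor outlined under System Design Considerations (identify the terminal and operator, record the nature of the attempt, notify a supervisor, and abort unauthorized requests) can be sketched as follows. All class, field, and level names are hypothetical, and the rule that an unregistered user is treated as cleared only for unclassified data is an assumption of this sketch rather than a rule stated in the paper.

```python
from dataclasses import dataclass, field

# Ordering of the three classification levels, lowest to highest.
LEVELS = {"unclassified": 0, "restricted": 1, "confidential": 2}


@dataclass
class AccessMonitor:
    terminals: dict    # terminal id -> location (hypothetical registry)
    clearances: dict   # user id -> highest level the user may read
    log: list = field(default_factory=list)     # every transaction
    alerts: list = field(default_factory=list)  # supervisor notifications

    def request(self, terminal_id, user_id, file_class):
        """Grant or abort one access request and record the transaction."""
        known_terminal = terminal_id in self.terminals
        user_level = LEVELS[self.clearances.get(user_id, "unclassified")]
        granted = known_terminal and user_level >= LEVELS[file_class]
        # The log entry names the terminal, operator, and level requested,
        # but never the location or content of the file itself.
        entry = {
            "terminal": terminal_id,
            "location": self.terminals.get(terminal_id, "unknown"),
            "user": user_id,
            "class": file_class,
            "granted": granted,
        }
        self.log.append(entry)
        if not granted:
            self.alerts.append(entry)  # supervisory-level notification
        return granted


monitor = AccessMonitor(
    terminals={"T1": "Oakland"},
    clearances={"clerk": "restricted"},
)
print(monitor.request("T1", "clerk", "restricted"))
print(monitor.request("T1", "clerk", "confidential"))
print(monitor.request("T9", "clerk", "unclassified"))
print(len(monitor.alerts))
```

Keeping alerts separate from the full log lets supervisors review refused attempts without reading every transaction, while the full log supports the activity statistics mentioned in item 3(b).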
What about a future computer utility?4
With the rapid and diverse growth of computer
services and recognizing the intimate relation between
hardware facilities, communication channels and the
users of the systems, it is no accident that discussion
should arise about the future establishment of a computer-communication utility. The need for such a
service becomes more apparent as we see the introduction of time-sharing systems and the implementation of large integrated data services that support
major regional and even statewide programs. The
arguments pro and con the justification for a computer-communication utility are beyond the scope of
this paper. However, the utility concept does provide
the opportunity to propose several avenues of approach
to improving the "privacy" control aspect in personal
data systems. One of the recurring suggestions has
been to establish a system of certification and licensing
for persons directly involved with the design, installation, management, and the operation of data systems
containing sensitive personal information. A second
device that could prove of value would be to effect
control through regulation of the computer-communication utility service.
CONCLUSION
The challenge of privacy control
Violations of standards regarding confidentiality
or privacy of information occur when particular items
of personal data furnished to an information system
for approved selective use are released to unauthorized
persons or in a manner that jeopardizes expected
system integrity.
A. The Predominance of The Human Factor
The integrity of any information system regarding confidentiality or invasion of privacy
will eventually be resolved at the level of the
human factor. Machines, data sets, file cabinets,
index cards, tape drives, disk files, memory
modules, computers, report registers-each of
these devices is an inanimate object devised by
man to receive, transfer, or hold information
items made available to the system through
human intervention. Data stored in these devices are significant only insofar as the output
is meaningful to man, and subject to change
or exposure by the action of an individual. Data
stored in an inactive or inaccessible device
without human interaction will not reveal information that would provide the basis for a
violation of privacy. The relationship between
man and his information system can be described as consisting of the following basic elements:
(1) Man conceives the system.
(2) Man builds the elements necessary to provide the system.
(3) Man organizes the elements and establishes a scheme of operation.
(4) Man gathers the data that he introduces into the system.
(5) Man activates the system.
(6) Man commands the resources of the
system.
(7) Man utilizes the results of the system in
his external contacts in society.
The consistent factor in the above summation
is the predominant relationship of man to the
system. Man is responsible for creation of
the system, the input of information, the
manipulation of that information, and the final
disposition of the data produced or revealed by
the system.
B. Personnel Standards Are Necessary
Due to the prime significance of the human
element in the integrity of any data automated
system, the programs must address the following problems in a forthright manner:
(1) Personnel standards must be established
for all participants.
(2) All accepted personnel must be indoctrinated on a continuing basis regarding
the system objectives, functions, operational responsibility, etc.
(3) Specific training must be provided regarding system participation and
terminal operation.
(4) Each installation should have competent
supervision and a plan of routine
inspection of operations.
(5) Each agency participating in a larger
shared system must be accountable
for the performance and integrity
of its representatives. It must also be
responsible for the release of any
system information that is received
from a classified file.
(6) All personnel who have access to the system should be required to sign a voluntary statement acknowledging their
individual responsibility to protect
the integrity of the system and respect
the confidentiality of classified data.
This statement could be a factor in the
initial as well as continued employment.4
The operating system must prove convenient
and satisfactory to the User. It must provide
an effective service with assurance as to its
accuracy and adequacy. Outputs should be
tailored to meet the user need under the circumstances of the inquiry. The efficiency of
the system should discourage any user development or maintenance of alternate or substitute systems. The man-machine interface should
be maintained through the use of simple, direct
devices with a minimum requirement for coding,
progressive verification, etc. An automated
data system should be so designed and supported that the user is free to direct his full
attention to his prime functional responsibility.
The information system must be a viable and
practical tool. It should function at the convenience of the user, with intelligible outputs
consistent in time and content to satisfy the
service requirement. Where a system requires specific security restrictions, these must
be furnished and function without imposing
any awkward limitation on the legitimate user
of the system.
C. Weak Policy And Discipline Results in An
Inferior System
Recent critics have voiced objection to the development of major data banks and interagency
information sharing systems in government service. Their objection has been based, in part,
on certain practices associated with private
credit bureau operations. The lament, properly
uttered, pointed to a lack of data control and
exercise of discretion by a number of these
private agencies. While the economic and social
value of credit rating bureaus is readily admitted, the loose policies regarding "privacy
of data" cast a shadow regarding the ability
to maintain integrity in a major information
system. I believe it is an unfortunate and improper inference to conclude that public information systems cannot protect the "privacy"
of information due to questionable practices
among some business organizations established
to collect and merchandise private information for profit.
D. Limitation of Data Access of Specific
Authorization
Suggestions have been made that an individual
should specify the extent of utilization of personal information and then the system be required to conform to the intention expressed
by the individual. This proposal sounds reasonable, but on further consideration presents
subsequent problems in data management,
modification of data use authorization, etc.,
that demand thorough study.
E. Individual Right of Inspection of Record - File
Correction
Perhaps one of the most practical approaches
toward satisfaction of individual "right to privacy," and at the same time facilitate the
availability of the maximum of information resources to solve social needs is to make provision
so that the individual can inspect the
system files that contain his personal data.
The individual should also have means to seek
correction of any data item that is in error and
subject to biased interpretation.
F. Develop Realistic Data Purge Policy
Attention should be given to the development
of basic guidelines regarding the longevity of
data resident in a file or information system.
The current trend is to collect and classify
more and more data on more and more people.
While hopefully most of the data will have
social value, I am sure that a significant quantity will provide little benefit to the individual
or the community. It is not too early to consider the need for sound purge criteria so that
the data retained in an operating system will
offer the highest potential return for the energy
expended.
G. Adequate Training Programs Must Be Developed
And Employed For The EDP Staff And Personnel of The User Agency Who Have Occasion
to Engage The Data System
The content should include an introduction to
system design concepts, the overall functions
and data processing applications that are components of the system and a thorough instruction in terminal man-machine dialog. In addition, some attention should be given to explaining the service philosophy with particular attention to the rules regarding access to
and utilization of any information from confidential or restricted files. The legal and moral
issues must be clearly defined, and an understanding accepted by all who engage the system
that a violation of the security code regarding
restricted data may be sufficient grounds for
removal from system participation or dismissal.
The training program must be viewed as a continuing support function with periodic refresher
classes, problem sessions, review of privacy
criteria, etc. It is most important that the
agency administrators and key supervisory
personnel become involved in this program and
not leave the system discipline task to the technical staff who are neither equipped nor responsible
for this duty.
H. Despite much uncertainty and misgivings as to
the effectiveness in terms of "privacy" control
that will result from the imposition of a licensing
scheme, such a potential mechanism will be the
subject of more intense consideration with the
passage of time.
REFERENCES
1 Intergovernmental Board on Electronic Data Processing
created by statute passed by Legislature of the State of
California. S B No 1100. This statute established under
sections No 11710-11720 of the Government Code
2 File Security Procedures-Report by Sub-Committee on
Privacy and Confidentiality of the Intergovernmental
Board on Electronic Data Processing Oct 18 1969
3 Ibid
4 D E SCHWEINFURTH
The coming computer utility-Laissez-Faire licensing or
regulation?
Computer Digest May 1968
5 A F WESTIN
Privacy and freedom
Atheneum New York 1967
6 Hearings Before a Sub-Committee of the Committee on
Government Operations House of Representatives-89th
Congress (Second Session) July 26 27 and 28 1966
7 System Development Corp "SDC Magazine" Vol 10 Nos
7 and 8 July Aug 1967 (This issue focussed on the question
of computer privacy.)
Some syntactic methods for specifying
extendible programming languages
by VICTOR SCHNEIDER
Purdue University
Lafayette, Indiana
Model of translator system
Our model of a programming-language translator
system is represented schematically in the block diagram of Figure 1. This diagram divides the translator
system into two components. The first component T is
a translator program that reads in and translates the
valid programs of some programming language L.
The output of the translator is a subset T(L) of the
intermediate language. The second component is a
system M for executing the programs translated into
the intermediate language. It will be seen that, in this
intermediate language, the operators follow their
operands in postfix (reverse Polish) form, and they are
relatively machine independent. In this paper, we will
be mainly concerned with defining the operation of
the translator component by specifying the input-output relationships of the translator for a particular
programming language. These relationships will be
described in a syntactic notation that is independent
of the particular translation algorithm used for implementing the translator T.
The language that was chosen as an example for this
paper is Wirth and Weber's EULER.14 EULER is
quite similar to ALGOL 60 in appearance and capabilities, and it has additional features found in the
LISP list-processing language. The original EULER
syntax was written to conform to the requirements of
a precedence translation algorithm,14 and contains a
number of syntactic rules whose purpose is to facilitate
construction of a precedence translator from these rules.
Because of the presence of these stylized rules, it was
decided to rewrite the EULER grammar into a more
compact and transparent form than the one in which
it originally appeared. An Irons-style notation2,3 was
used to specify the translation of this new EULER
grammar.
[Figure 1-Simplified block diagram of a translator system. Surviving label: "Input Programs in Language L".]
Reverse Polish translation of programming languages
To illustrate what we mean by a syntactic specification of a programming-language translator, let us
consider as an example the following small portion of
the EULER syntax and examine some of the basic
devices used by our EULER system:
Grammar 1. A Simplified Subset of EULER
Syntactic Rule                                 Rule of Translation
<expr> -> <var> = <expr>                       <var> <expr> assign
        | <sum>                                I
<sum> -> <sum> + <term>                        <sum> <term> add
       | <term>                                I
<term> -> <term> * <factor>                    <term> <factor> multiply
        | <factor>                             I
<factor> -> ( <sum> )                          <sum>
          | at <var>                           <var>
          | <var>                              <var> in
          | <var> .( <expr-sequence> ).        <expr-sequence> <var> in
<var> -> <name>                                variable <name>
<expr-sequence> -> <expr>                      I
                 | <expr-sequence> , <expr>    <expr-sequence> <expr>
Note that the rules of translation above refer to
sequences of symbols on the right parts of syntactic
rules. In this example, we see that the rules of translation specify how symbols and sequences of symbols in
the source language are rearranged and rewritten in the
translated language. Where no change at all is indicated
in the translation of a particular rule, the symbol
"I" appears as a translation rule. As an example of how
sequences of symbols are rearranged for translation, the
infix addition of
<sum> + <term>
is translated into the reverse Polish sequence of symbols
consisting of a "<sum>" followed by a "<term>"
followed by the intermediate-language command for
adding together the values resulting from evaluation
of the previous two subexpressions. As in good Polish
notation, parentheses are removed from around expressions, and this process is specified by associating
the translation rule "<sum>" with the syntactic rule
<factor> -> ( <sum> )
The remaining rules with <factor> on the lefthand
side are used for translating arithmetic operands into
the intermediate language. For example, the syntactic
rule
<factor> -> <var>
indicates that operands in arithmetic expressions are
variable names, and the translation of a <factor> into
the sequence
<var> in
indicates that the "in" command is used for fetching
the value associated with <var> and for storing that
value on top of the run-time operand stack of system
M.
The other syntactic rule
<factor> -> at <var>
reflects the fact that EULER permits use of program
variables that are pointers to data named by other
program variables. Hence, the effect of the "at" command of the source language is to suppress the appearance of "in" in the translated program after the translated variable name. In this case, a pointer to the data
stored in <var> is left on top of the operand stack in
system M at run time. Finally, the rule
<var> -> <name>
means that the names of program variables are translated into the sequence "variable <name>." Here, the
effect of the "variable" command is to find a pointer to
the data stored in the following name by system M and
to place this pointer on top of the run-time operand
stack.
The sequence "<var> .( <expr-sequence> )." on
the right part of the remaining <factor> rule is the
definition of an EULER function call. Function
calls are translated with the parameters preceding the
function name in the translated program. In this way,
the function call can be made to look like a reverse
Polish operator having n operands, with n the number of
parameters. A parameterless function call is translated
exactly the same way as a program variable. Thus,
the sequence
"variable <name> in"
in a translated program serves both to fetch data and
to initiate a call on a function, depending on the
<name> involved. This calling sequence will be
referred to in the following discussion of extendible
language features.
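The rules of Grammar 1 can be exercised with a small recursive-descent translator. The following Python sketch is an illustration only and is not the translation algorithm used for the EULER system: the function name `translate`, the regular-expression tokenizer, and the use of loops in place of the left-recursive rules are all assumptions made for the example.

```python
import re

# Tokens of the Grammar 1 subset: ".(" and ")." call brackets, single
# punctuation marks, and names (the keyword "at" is matched as a name).
TOKEN = re.compile(r"\.\(|\)\.|[()+*=,]|[A-Za-z][A-Za-z0-9]*")


def translate(source):
    """Translate a Grammar 1 expression into its reverse Polish form."""
    toks = TOKEN.findall(source)
    pos = 0
    out = []

    def peek(k=0):
        return toks[pos + k] if pos + k < len(toks) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, found {tok!r}")
        pos += 1
        return tok

    def var():                      # <var> -> <name> : variable <name>
        name = take()
        if not name[0].isalpha() or name == "at":
            raise SyntaxError(f"expected a name, found {name!r}")
        out.extend(["variable", name])

    def expr():                     # <expr> -> <var> = <expr> | <sum>
        if peek(1) == "=":
            var()
            take("=")
            expr()
            out.append("assign")
        else:
            sum_()

    def sum_():                     # <sum> -> <sum> + <term> | <term>
        term()
        while peek() == "+":
            take("+")
            term()
            out.append("add")

    def term():                     # <term> -> <term> * <factor> | <factor>
        factor()
        while peek() == "*":
            take("*")
            factor()
            out.append("multiply")

    def factor():
        if peek() == "(":           # parentheses vanish in the output
            take("(")
            sum_()
            take(")")
        elif peek() == "at":        # pointer: "in" is suppressed
            take("at")
            var()
        elif peek(1) == ".(":       # function call: parameters come first
            name = take()
            take(".(")
            expr()
            while peek() == ",":
                take(",")
                expr()
            take(").")
            out.extend(["variable", name, "in"])
        else:                       # plain operand: fetch its value
            var()
            out.append("in")

    expr()
    if pos != len(toks):
        raise SyntaxError(f"unexpected token {toks[pos]!r}")
    return " ".join(out)


print(translate("x = a + b * c"))
print(translate("f.(a, at b)."))
```

Under these assumptions the assignment yields `variable x variable a in variable b in variable c in multiply add assign`, and the call places its two parameters ahead of `variable f in`, so the call reads like a postfix operator with two operands.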
In the full translation grammar for EULER given
in Appendix 2, it is possible to see how the methods
presented in the preceding example are applied to the
specification of a complete programming language.
Note that this larger grammar uses, e.g., the symbol
"+" in place of the "add" instruction of our small
example, and, in general, translates as many source-language symbols as possible directly into commands
of the intermediate language. The description of EULER
programming given in Appendix 1 of this paper should
clarify the meaning of the EULER operators used,
and the following section in this paper will discuss the
syntactic methods for optimizing and extending
EULER as they are developed in the EULER grammar. A full description of the intermediate reverse-Polish language specified by the EULER rules of
translation can be found in Schneider.10
Syntactic methods of optimizing expressions
In the EULER grammar of Appendix 2, the rules of
translation specify that a conditional statement or
expression of the form
"IF <expr>1 THEN <expr>2 ELSE <expr>3"
is translated into its intermediate language version in
the form
"<expr>1 $IF <expr>2 $THEN <expr>3 $ELSE"
Note that each of the expressions in a conditional can itself
contain conditional expressions of any desired degree
of nesting, and each of the subexpressions will be rearranged as shown above. In this intermediate language
the "$IF" command causes an interpretive scan to
the matching "$THEN" label if <expr>1 is false.
Otherwise execution continues until a "$THEN" is
reached, at which point a scan occurs to the "$ELSE"
label that matches this "$THEN". In this way,
"$THEN" and "$ELSE" behave like balanced parentheses around expressions, and also serve as placemarkers to which control can be transferred in the
translated program.
This mechanism for executing translated conditional
expressions is used also as the basis for translating
logical expressions into a partially optimized form.
To take an example, the EULER sequence corresponding to a disjunction is represented by
"<disj> OR <conj>".
Its translated form is
"<disj> $IF $TRUE $THEN <conj> $ELSE".
Here, if the first operand "<disj>" of the expression
is true, the entire expression is true. Therefore, the
second operand is evaluated only if the first operand
is false. A similar mechanism is used for the sequence
"<conj> AND <neg>".
Here, if the first operand is false, the second operand
need not be evaluated. Hence, the translated conjunction is of the form
"<conj> $IF <neg> $THEN $FALSE $ELSE."
Some syntactic methods of extending EULER
After developing the appropriate techniques for
translating conditional expressions and for optimizing
logical expressions, the next order of business is to
use these syntactic tricks to provide extended facilities
in the EULER language. The introduction of full
string-processing facilities into the EULER system is
the first example to be considered. Without altering
the EULER interpreter, and with a little reprogramming of the translator, we can effect the following
improvement:
Syntactic Rule                            Rule of Translation
<prim> -> <stringprim>                    I
<stringprim> -> <stringhead> I            <stringhead> ).
<stringhead> -> I
              | <stringhead> <symbol>     <stringhead> .*<symbol>,
Here, a string of arbitrary length is translated into a
list whose cells store the symbols in the string, one
symbol per cell, in sequence. With this arrangement,
it is possible to manipulate strings using the list concatenation operator provided by EULER, and using
EULER subroutines to perform tests for list equality
and containment.
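The effect of this arrangement can be imitated in a few lines of Python. This is only a sketch of the idea: Python lists stand in for EULER lists, and the helper names `string_to_cells` and `contains` are hypothetical, standing in for the EULER subroutines the text mentions.

```python
def string_to_cells(text):
    """Translate a string into a list holding one symbol per cell."""
    return [symbol for symbol in text]


def contains(cells, pattern):
    """True if pattern occurs as a contiguous run of cells."""
    width = len(pattern)
    return any(cells[i:i + width] == pattern
               for i in range(len(cells) - width + 1))


a = string_to_cells("EULER")
b = string_to_cells("LIST")
joined = a + b                                  # list concatenation
print(joined == string_to_cells("EULERLIST"))   # list equality
print(contains(joined, string_to_cells("ERL"))) # containment
```

Once strings are lists, concatenation, equality and containment need no support from the interpreter beyond the list operations it already has, which is the point of the translation.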
The second example involves the addition of facilities for reading in data at run time within the framework of the EULER system. In this case, additional
facilities must be provided in the EULER Polish string
interpreter. These facilities take the form of routines
for converting numbers into their internal representation and for packing string data. The added syntax
consists of the following set of rules:
Syntactic Rule                                   Rule of Translation
<program> -> .ENTRY <block> .EXIT.               <block>
           | .ENTRY <data>., <block> .EXIT.      <data> <block>
<data> -> <datahead> END                         I
<datahead> -> DATA <item>                        $DATA <item>
            | <datahead>., <item>                I
<item> -> <number>                               I
        | <stringprim>                           I
        | <datalist>                             I
<datalist> -> .( ).                              I
            | <datalisthead> <item> ).           I
<datalisthead> -> .(                             I
                | <datalisthead> <item>,         I
With this program structure, the data portion could
be read in by a run-time subroutine that leaves the
data in a pre-arranged location of memory. The
interpreter routine could then be read in over the data
routine, and the translated program would be executed.
A statement of the form "READ <prim>" would
then store an appropriate link to some segment of
the read-in <data> on top of the run-time operand
stack.
The third example involves the use of a syntactic
notation to expand the EULER language into a self-extendible programming language similar to MAD/I
(4) and ALGOL 68 (11). By an extendible programming
language, people currently mean the following two
things.
a. A language in which the programmer can specify
new data types and data structures composed
of novel configurations of data elements.
b. A language in which the programmer is able to
reorder the priorities of expression operators and
is able to specify arbitrary new operations at
will.
In EULER, there already exists a general mechanism
for allowing programmers to manipulate data structures,
namely, the list mechanism. EULER lists can be
constructed from arbitrary combinations of data
elements. However, EULER only has eight data types
with no facilities for extending their ranges. Such range-extension facilities depend on the machine on which
the language is implemented, and algorithms for specifying such data types as numbers of arbitrary precision
must be written for the machine in question. Hence,
our example will concentrate on the machine-independent
problem of specifying new operators in programs.
Any reasonable programming language must presuppose the existence of a standard set of expression
operators before provision is made for allowing programs to expand this set of operators. With each
standard operator will be associated a standard precedence level, and the operators to be introduced by
the programmer must also have precedence levels. As
the term is currently used, operator precedence (or
priority) is a measure of how expression operators
compare in binding power. For example, exponentiation
is said to have lower precedence than addition, because
exponentiation is performed before addition in
arithmetic expressions. Thus, precedence imposes an
ordering on the operations of a language. This ordering
is reflected in the ordering of syntax rules in programming language grammars. In the EULER grammar
above, rules are ordered so that list concatenation is
performed first, then exponentiation, and so on, until
the operation of value assignment. From concatenation
to assignment of value there are nine levels of precedence.
Our approach in providing for the programming of
new operations is to assign these operations to one of
nine classes of operators, reflecting the nine levels in
the original grammar. This means that the translator must
now treat operators as though they are procedure calls
that can only be written into the translated program
where their associated precedence level permits their
operations to occur. In order to permit the programmer
to tell the translator what precedence is associated with
a newly defined operator, we require an additional
operator declaration in our language. This declaration,
together with the precedence syntax of expressions
that follows, is sufficient to provide the expanded
operator-definition facility.
Grammar 2. An Expression Grammar for Defining New Operators
Syntactic Rule                                  Rule of Translation
<expr> -> <var> <opname> <expr>                 <var> <expr> $VARBL <opname> $IN
        | <disj>                                I
<disj> -> <disj> <opname> <conj>                <disj> <conj> $VARBL <opname> $IN
        | <conj>                                I
<conj> -> <conj> <opname> <neg>                 <conj> <neg> $VARBL <opname> $IN
        | <neg>                                 I
<catena> -> <catena> <opname> <prim>            <catena> <prim> $VARBL <opname> $IN
          | <prim>                              I
<blockhead> -> <blockhead> <operatordec>.,      <blockhead> <operatordec>
<operatordec> -> OPERATOR <opname>              $NEW <opname>
               | <operatordec>, <opname>        <operatordec> $NEW <opname>
<expr> -> <opname> = <opdef>                    <opname> <opdef> =
<opdef> -> <defhead> <expr> $.                  I
<defhead> -> <rankpart> <operandpart>.,         <rankpart> <operandpart>
<rankpart> -> RANK OF <digit>.,                 (Not Translated)
<operandpart> -> OPERANDS <name>                $FORMA <name>
               | <operandpart>, <name>          $FORMA <name> <operandpart>
<opname> -> <symbol>                            I
          | <opname> <symbol>                   I
In the expression syntax above, the <opname>
in each rule is translated into a procedure call, with
parameters consisting of the one or more operands
associated with each <opname>. These procedure
calls either refer to the "standard" operator associated
with a particular precedence level or refer to the translated <opdef> declared by the programmer. It is
assumed that the translator will automatically enclose
each translated program with an extra outer block
containing procedure definitions for the set of standard
operators basic to the language. In this way, the
standard operators can be redefined within a particular
program, but will regain their usual meaning upon exit
from the block in which the redefining statement
occurred. A consequence of this method of allowing
new operator definitions is that program subroutines
may use operators global to their definitions, but may
not have operators passed to them as parameters,
since all assignment of precedence is performed at
translation time.
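The idea that every operator belongs to one of nine precedence classes and is emitted as a procedure call can be sketched in Python as follows. This is an illustration under stated assumptions, not the translator described in this paper: the function `to_postfix` and the rank table are hypothetical, every operator is treated as binary and left-associative, and the numbering (rank 1 binds most tightly, rank 9 least) is inferred from the paper's own example, in which AND is declared with RANK OF 7.

```python
def to_postfix(tokens, ranks):
    """Translate operand/operator tokens into postfix procedure calls.

    ranks maps each operator name to a class from 1 (tightest) to 9.
    """
    pos = 0

    def parse(max_rank):
        nonlocal pos
        out = [tokens[pos]]                    # an operand
        pos += 1
        # Accept only operators whose class is allowed at this level.
        while pos < len(tokens) and ranks.get(tokens[pos], 10) <= max_rank:
            op = tokens[pos]
            pos += 1
            out += parse(ranks[op] - 1)        # right operand binds tighter
            out += ["$VARBL", op, "$IN"]       # the operator becomes a call
        return out

    result = parse(9)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result


ranks = {"AND": 7, "+": 4, "*": 3}     # assumed "standard" operators
ranks["MAX"] = 4                       # a newly declared operator
print(" ".join(to_postfix("a + b * c".split(), ranks)))
print(" ".join(to_postfix("a + b MAX c".split(), ranks)))
```

Because the rank table is consulted only while the token stream is being parsed, the sketch also shows why an operator cannot be passed as a run-time parameter: its precedence has to be known at translation time.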
A certain amount of optimization is still possible
within the framework of this extendible translator. As
an example, suppose that we write the following procedure corresponding to the standard operator for logical
conjunction:
AND = RANK OF 7., OPERANDS X, Y., IF Y
THEN X ELSE FALSE $.
The actual parameters in the procedure call for logical
AND above are expressions surrounded by ".$" and
"$.". Thus, the effect of the conditional expression in
the operator definition given above is to evaluate the
Y parameter only once and not to evaluate the X
parameter unless Y is true.
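The same effect can be reproduced in Python by passing each actual parameter as a zero-argument function, which plays the role of an expression enclosed in ".$" and "$.". The names `op_and`, `operand` and `calls` are assumptions introduced for this sketch.

```python
calls = []                       # records which operands were evaluated


def operand(name, value):
    """Wrap a value as a parameterless procedure that logs its evaluation."""
    def thunk():
        calls.append(name)
        return value
    return thunk


def op_and(x, y):
    """AND as defined above: IF Y THEN X ELSE FALSE."""
    return x() if y() else False


print(op_and(operand("X", True), operand("Y", False)), calls)
```

In this run only Y appears in `calls`, because X is never evaluated when Y is false.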
Programmer-defined syntactic augments to existing languages
As a next step in allowing programmers to decide on
the nature of their own programming languages, we
could conceive of a translator facility for allowing
programmer-specified syntactic and semantic augments
to existing programming languages. The idea behind
this definitional facility is that the translator can be
provided with facilities for accepting new syntactic
rules and associating their right parts with rules of
translation that are essentially calls on global procedures. The operands within the new syntactic augments are then translated as parameters supplied to
the procedures for executing the augments. The
feasibility of such augments, provided they do not
lead to problems of syntactic ambiguity, can be inferred
from the algorithms presented in Schneider.9,10
As an example of what a programmer might be
tempted to add to his language, and of the methods he
could use, we consider the problem of adding ALGOL
W-style iteration to the EULER language. In the
following translation grammar, the global procedures
used in translated programs are "$FOR" and
"$WHILE", corresponding to the incremented variable and logical iterations, respectively.
Grammar 3. A Programmer-Defined Syntax of Iterative Statements
Syntactic Rule
(a) <expr> -> WHILE <expr>1 DO <expr>2
(b) <expr> -> FOR <var> FROM <expr>1 UNTIL <expr>2 BY <expr>3 DO <expr>4
Rule of Translation
(a) .$ <expr>1 $. .$ <expr>2 $. $VARBL $WHILE $IN
(b) <var> <expr>1 <expr>2 <expr>3 .$ <expr>4 $. $VARBL $FOR $IN
Note that the controlled statement in the syntax
above is translated with procedure definition brackets
".$" and "$.". In this way, whenever the corresponding
formal parameter in the "$FOR" or "$WHILE"
procedure definition is executed, the entire controlled
statement is executed as a procedure. The procedure
definitions of "$FOR" and "$WHILE" that follow
are the "semantics" of Grammar 3:
$FOR = .$ FORMAL VAR, EXP1, EXP2, EXP3, STAT.,
    BEGIN LABEL TEST, CYCLE.,
    VAR = EXP1., GO TO TEST.,
    CYCLE.. VAR = VAR + EXP2.,
    TEST.. IF (VAR - EXP3) * SIGN(EXP2) GT 0
        THEN UNDEFINED
        ELSE BEGIN STAT., GO TO CYCLE END
    END $.
$WHILE = .$ FORMAL LOGEXP, STAT.,
    BEGIN LABEL CYCLE.,
    CYCLE.. IF LOGEXP THEN BEGIN STAT., GO TO CYCLE END
        ELSE UNDEFINED
    END $.
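For readers more familiar with structured loops, the two procedures can be paraphrased in Python. This is a sketch under assumptions: the controlled variable is modelled as a one-entry dictionary so that the procedure can assign to it, the controlled statement and the WHILE condition are passed as parameterless functions, and `None` stands for UNDEFINED. The parameter roles follow the $FOR definition as printed, where EXP2 is the increment and EXP3 is the limit.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def for_loop(var, exp1, exp2, exp3, stat):
    var["value"] = exp1                              # VAR = EXP1., GO TO TEST.,
    while (var["value"] - exp3) * sign(exp2) <= 0:   # TEST..
        stat()                                       # STAT.,
        var["value"] = var["value"] + exp2           # CYCLE.. VAR = VAR + EXP2.,
    return None                                      # UNDEFINED


def while_loop(logexp, stat):
    while logexp():                                  # CYCLE.. IF LOGEXP THEN ...
        stat()
    return None                                      # ELSE UNDEFINED


i = {"value": None}
seen = []
for_loop(i, 1, 2, 7, lambda: seen.append(i["value"]))
print(seen)
```

Multiplying by the sign of the increment lets the single test serve both ascending and descending loops, exactly as in the EULER text.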
The flowchart of Figure 2, showing the transitions to
and from the box corresponding to <expr>, illustrates
how the EULER translator was programmed.
[Figure 2: flowchart of the EULER translator around the <expr> box. Surviving labels: THEN, <consequence>, <alternative>, Outcode(N1), Outcode(Sj), j = j + 1, "TO INITIAL POINT FOR".]
<expr> -> X1 <alternative>
X1 -> X2 <consequence>
X2 -> <condition>
By letting X1 be THEN and X2 be IF in the translator,
the coding is greatly simplified, and no ambiguities
are introduced, since the Xi can be treated as "new
and distinct" symbols of the normal-form grammar.
REFERENCES
1 R W FLOYD
A descriptive language for symbol manipulation
JACM Vol 8 1961 579-584
2 E T IRONS
A syntax directed compiler for ALGOL 60
CACM Vol 4 1961 51-55
3 P M LEWIS R E STEARNS
Syntax-directed transduction
JACM Vol 15 1968 465-488
4 D L MILLS
The syntactic structure of MAD/I
DDC Rpt No AD-671-68:-3 1968
5 P NAUR editor
Revised report on the algorithmic language ALGOL 60
CACM Vol 6 1963 1-17
6 V B SCHNEIDER
The design of processors for context-free languages
NSF Memo Northwestern Univ 1965
7 V B SCHNEIDER
Pushdown-store processors of context-free languages
Dept of Industrial Engineering Northwestern Univ 1966
Evanston Ill
8 V B SCHNEIDER
Syntax-checking and parsing of context-free languages by
pushdown-store automata
Proc SJCC 1967 71-75
9 V B SCHNEIDER
A system for designing fast programming language translators
Proc SJCC 1969 777-792
10 V B SCHNEIDER
A translator system for the EULER programming language
Tech Rpt 68-76 Computer Science Center Univ of Md
College Park 1969
11 A VAN WIJNGAARDEN editor
Report on the algorithmic language ALGOL 68
Mathematisch Centrum 49 2e Boerhaavestraat Amsterdam
The Netherlands 1969
12 J WEIZENBAUM
A symmetric list processor
CACM Vol 6 1963 524
13 N WIRTH
A generalization of ALGOL
CACM Vol 6 1963 547-554
14 N WIRTH H WEBER
A generalization of ALGOL and its formal definition: Parts
I and II
CACM Vol 9 1966 13-25 89-99
Appendix 1
Features of the EULER language
EULER is a nested block-structure language,
similar to ALGOL. Thus, every block, consisting of a
sequence of statements surrounded by BEGIN and
END parentheses, can be treated as a single statement
in ALGOL fashion. An EULER program consists of
an EULER block preceded by .ENTRY and followed
by .EXIT.
In EULER, there are three declarations. One declaration is for data variables, one for program labels,
and one for formal parameters of procedures. In the
program
".ENTRY BEGIN NEW X, Y.,
LABEL Z., ...
Z.. X + Y END .EXIT."
X and Y will store data, and Z will be a label preceding some statement.
Assigning a data type to a declared variable is
accomplished by writing an assignment statement with
data of the appropriate type on the right-hand side
of the assignment. Thus, typing of variables in EULER
is dynamic, since any assignment statement can change
the data type stored in a variable. In addition, data typing
is implicit, since there are no declarations like real,
integer, etc., as appear in ALGOL. The following is a
list of the eight EULER data types:
I. Number-In the EULER system, all numbers are assumed to be floating
point numbers. The assignment statement
"V = E.,"
with E a numerical expression or number, causes variable V
to become a numerical variable.
II. Symbol-In this EULER implementation, an assignment statement
such as
"V = .*ALPHAN.,"
causes the six characters "ALPHAN" to be stored in the
location named by variable V.
III. Logical-The logical constants are TRUE and FALSE, standing
respectively for logical truth and falsehood. The assignment
statement
"V = L.,"
with L a logical constant or logical expression, causes variable
V to become a logical variable.
IV. Label-EULER programs use two declarations. "NEW" is used to
declare a data variable, and "LABEL" is used to declare the
presence of a label in some block of a program. Interestingly,
if V is a variable in some EULER block, and V is not in a
block global to the block of label L, then the assignment
statement
"V = L.,"
causes V henceforth to be of type label, and to be interchangeable with L in GO TO statements.
V. Reference-In EULER, if V1 is a variable not in a block global to the
block of variable V2, then the assignment statement
"V1 = AT V2.,"
makes V1 a pointer to the data stored in V2. After V1 is
turned into such a pointer, the two statements
"V2 = V2 + 1.,"
and
"V1 IN = V1 IN + 1.,"
will have exactly the same effect of manipulating whatever
data is stored in V2.
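The behaviour of a reference can be imitated in Python with a small object that remembers where a variable is stored. This is a sketch only: the class `Reference`, its method names, and the dictionary `store` that models the variables of a block are assumptions made for illustration.

```python
class Reference:
    """A pointer to the storage of a named variable (EULER's AT)."""

    def __init__(self, store, name):
        self.store = store
        self.name = name

    def fetch(self):               # "V1 IN" used as an operand
        return self.store[self.name]

    def assign(self, value):       # "V1 IN = ..." used as a target
        self.store[self.name] = value


store = {"V2": 10.0}
v1 = Reference(store, "V2")        # V1 = AT V2.,
store["V2"] = store["V2"] + 1      # V2 = V2 + 1.,
v1.assign(v1.fetch() + 1)          # V1 IN = V1 IN + 1.,
print(store["V2"])
```

Both statements change the same cell, so after the two increments the value stored in V2 has gone up by two.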
VI. Procedure-An assignment statement of the form
"V1 = .$ <expr> $.,"
causes V1 to become the name of a parameterless procedure
call with body given by <expr>. As a programming example,
we might consider the following EULER block:
"BEGIN NEW X, Y., X = 2.,
Y = .$ FORMAL Z., X = X + Z $.,
OUT Y.(5). END"
When Y.(5). is operated on by the "OUT" operator, the
value 7.0000 will be written out.
VII. List-In EULER, lists can be constructed in three distinct ways:
(a) On command: "V1 = LIST 300.,"
This statement creates a list of 300 undefined cells and makes
V1 their name.
(b) By explicit notation: "V2 = .(1, .(2, 3)., 4).,"
This statement creates a list consisting of two numbers and a
sublist and makes V2 the name of that list.
(c) By concatenation: "V1 = V1 CONCAT V2.," Using the
CONCATenation operator, small lists can be joined into
larger ones.
In addition, lists can be subscripted in the same way as
ALGOL arrays, and each element of a list can be any EULER
data type, including label, reference, and procedure. The
following EULER block is a small example of the generality
of the list notation:
"BEGIN NEW X, Y., LABEL Z.,
Y = .(2, .$ BEGIN X = X + 1., Y(X) END $.,
.$ OUT X $., Z).,
X = Y(1)., Y(X)., GO TO Y(4).,
Z.. OUT .*FINISH END"
With this program segment, first 3.0000, then FINISH will
be written out by the executed program.
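The order of events in this block can be followed with a Python paraphrase. It is an illustration under assumptions, not EULER semantics in full: a procedure stored in a list is modelled as a Python function that is called when the element is fetched, the label Z is modelled as the string "Z", and the names `run_list_example`, `fetch` and `written` are hypothetical.

```python
def run_list_example():
    written = []                       # collects what OUT writes
    x = None                           # NEW X (initially UNDEFINED)

    def fetch(i):
        """Y(i): 1-based subscripting; a procedure element is called."""
        item = y[i - 1]
        return item() if callable(item) else item

    def bump_and_fetch():              # .$ BEGIN X = X + 1., Y(X) END $.
        nonlocal x
        x = x + 1
        return fetch(x)

    def out_x():                       # .$ OUT X $.
        written.append(f"{x:.4f}")
        return x

    y = [2, bump_and_fetch, out_x, "Z"]
    x = fetch(1)                       # X = Y(1).,    X becomes 2
    fetch(x)                           # Y(X).,        X becomes 3, then OUT X
    if fetch(4) == "Z":                # GO TO Y(4).,  jumps to label Z
        written.append("FINISH")       # Z.. OUT .*FINISH
    return written


print(run_list_example())
```

The paraphrase writes 3.0000 followed by FINISH, in agreement with the output stated above.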
VIII. Undefined-Every variable declared by "NEW" in an EULER program
is initially of type "UNDEFINED." In addition, "UNDEFINED" is used as a data constant occasionally and as an
empty option in conditional statements such as:
"V = IF L1 THEN .(1, 5). ELSE UNDEFINED.,"
For more details on EULER programming, the reader is referred to the Wirth and
Weber EULER paper.14
Appendix 2
A new translation grammar for EULER
Syntactic Rule                                           Rule of Translation
 1: <program> -> .ENTRY <block> .EXIT.                   <block>
 2: <block> -> <blockhead> <body> END                    <blockhead> <body> $END
 3: <blockhead> -> BEGIN                                 $BEGIN
                 | <blockhead> <labeldec>.,              <blockhead> <labeldec>
                 | <blockhead> <vardec>.,                <blockhead> <vardec>
 4: <vardec> -> NEW <name>                               $NEW <name>
              | <vardec>, <name>                         <vardec> $NEW <name>
 5: <labeldec> -> LABEL <name>                           $LABEL <name>
                | <labeldec>, <name>                     <labeldec> $LABEL <name>
 6: <body> -> <body>., <stat>                            I
            | <stat>                                     I
 7: <stat> -> <labdef> <stat>                            I
            | <expr>                                     I
 8: <labdef> -> <name> ..                                $LBDF <name>
 9: <expr> -> GO TO <expr>                               <expr> $GOTO
            | OUT <expr>                                 <expr> $OUT
            | <var> = <expr>                             <var> <expr> =
            | <disj>                                     I
            | <condition> <consequence> <alternative>    I
10: <condition> -> IF <expr>                             <expr> $IF
11: <consequence> -> THEN <expr>                         <expr> $THEN
12: <alternative> -> ELSE <expr>                         <expr> $ELSE
13: <disj> -> <conj>                                     I
            | <disj> OR <conj>                           <disj> $IF $TRUE $THEN <conj> $ELSE
14: <conj> -> <neg>                                      I
            | <conj> AND <neg>                           <conj> $IF <neg> $THEN $FALSE $ELSE
15: <neg> -> <relation>                                  I
           | NOT <relation>                              <relation> $NOT
16: <relation> -> <sum>                                  I
                | <sum>1 <relop> <sum>2                  <sum>1 <sum>2 <relop>
17: <relop> -> EQ | NEQ | GEQ | LEQ | GT | LT            $EQ | $NEQ | $GEQ | $LEQ | $GT | $LT
18: <sum> -> <term>                                      I
           | + <term>                                    <term>
           | - <term>                                    <term> $NEG
           | <sum> {+|-} <term>                          <sum> <term> {+|-}
19: <term> -> <factor>                                   I
            | <term> {*|/|./.|MODULO} <factor>           <term> <factor> {*|/|./.|$MODUL}
20: <factor> -> <catena>                                 I
              | <factor> ** <catena>                     <factor> <catena> **
21: <catena> -> <prim>                                   I
              | <catena> CONCAT <prim>                   <catena> <prim> $CONCA
22: <prim> -> UNDEFINED                                  $UNDEF
            | <var>                                      <var> $IN
            | <label>                                    <label> $IN
            | ( <expr> )                                 <expr>
            | <block>                                    I
            | <procdef>                                  I
            | <referenceprim>                            I
            | <listprim>                                 I
            | <numberprim>                               I
            | <logicalprim>                              I
            | TAIL <prim>                                <prim> $TAIL
            | <var> .( <expr-sequence> ).                <expr-sequence> <var> $IN
            | <symbolprim>                               I
23: <label> -> <name>                                    $VARBL <name>
24: <var> -> <name>                                      $VARBL <name>
           | <var> IN                                    <var> $IN
           | <var> ( <sum-sequence> )                    <var> <sum-sequence> )
25: <expr-sequence> -> <expr>                            I
                     | <expr-sequence>, <expr>           <expr-sequence> <expr>
26: <sum-sequence> -> <sum>                              I
                    | <sum-sequence>, <sum>              <sum-sequence> <sum>
27: <referenceprim> -> AT <var>                          <var>
28: <listprim> -> <list>                                 I
                | LIST <sum>                             <sum> $LIST
29: <list> -> .( ).                                      I
            | <listhead> <expr> ).                       I
30: <listhead> -> .(                                     I
31: <numberprim> -> <number>                             $NUMBR <number>
                  | REAL <disj>                          <disj> $REAL
                  | LENGTH <catena>                      <catena> $LENGT
                  | ABSOLUTE <sum>                       <sum> $ABSOL
                  | INTEGER <sum>                        <sum> $INTEG
32: <logicalprim> -> TRUE                                $TRUE
                   | FALSE                               $FALSE
                   | LOGICAL <sum>                       <sum> $LOGIC
                   | <typeinquiry> <var>                 <var> <typeinquiry>
33: <typeinquiry> -> ISNU | ISLO | ISLA | ISLI           $ISNU | $ISLO | $ISLA | $ISLI
                   | ISPR | ISRE | ISSY | ISUN           | $ISPR | $ISRE | $ISSY | $ISUN
34: <symbolprim> -> .* <6-symbolstring>                  I
35: <procdef> -> <prochead> <expr> $.                    I
36: <prochead> -> .$                                     .$
                | <prochead> <formaldec>.,               <prochead> <formaldec>
37: <formaldec> -> FORMAL <name>                         $FORMA <name>
                 | <formaldec>, <name>                   $FORMA <name> <formaldec>
38: <6-symbolstring> ->                                  I
      {<letter> | <digit> | <blank> | , | . | $ | * | ? | = | + | - | > | <}6
      (i.e., a string of 6 characters.)
39: <name> -> <letter>                                   I
            | <name> <letter>                            I
            | <name> <digit>                             I
      (For the IBM 7094 and the UNIVAC 1108, only the first six characters of a
      <name> are translated.)
40: <number> -> <integer>                                Converted to octal.
              | <integer>.<integer>                      Converted to octal floating point.
41: <integer> -> <digit>
               | <integer> <digit>
42: <digit> -> 0 | 1 | ... | 9                           I
43: <letter> -> A | ... | Z                              I
SYMPLE-A general syntax directed
macro preprocessor
by JAMES E. VANDER MEY
The Pennsylvania State University
University Park, Pennsylvania
ROBERT C. VARNEY
The Pennsylvania State University
McKeesport, Pennsylvania
and
ROBERT E. PATCHEN
IBM Corporation
Boston, Massachusetts
INTRODUCTION
The subject of this paper is a general syntax directed
macro preprocessor system. One of the suggested potential uses of this system is that of evaluating new or
extended programming languages by the technique of
syntax directed macros. This led to the association of
the acronym SYMPLE (SYntax Macro Preprocessor
for Language Evaluations) with this system.
A preprocessor is a processor intended to be used prior
to another processing stage. In our case, it is assumed
that the SYMPLE preprocessor system will generally
be used in processing higher level language texts (ones
which are user oriented), producing output text in the
same or a similar higher level language.
The term "macro" is used in a very general sense in
this paper. As in other macro systems, the macro mechanism consists of the recognition of a macro "reference"
in the source text being processed, and a macro "definition" defining a translation procedure invoked by
some corresponding macro reference.
A SYMPLE macro definition consists of two parts:
the "macro semantic portion" or "macro body"; and
the "macro templates."
The macro semantic portion is the translation procedure and consists of the instructions to be executed when the macro is "invoked". A macro is
invoked when a pattern described in one of its
macro templates is recognized by the parser in
the source input text. This macro reference pattern
may have identifiable parts which are then considered as arguments for the semantic portion.
A macro template defines a possible macro reference pattern for this macro and consists of two
distinct parts: A specification of a general syntactic substructure of the source input text in which
a given macro reference may occur (i.e., context);
and any necessary further syntactic qualifications
within that general syntactic substructure (e.g., a
specific pattern). The actual pattern matching
technique for macro reference is thus a two level
syntax directed matching procedure. This syntax
directed macro reference technique is the method
by which SYMPLE achieves both simplicity and
generality.
The SYMPLE system as a macro system is not tied
to any particular programming language. The base
(source input) language and the object (output) language of the macro facility could in fact be entirely
different languages.
The syntax of the languages to be processed and/or
extended must be adequately described through the
syntax description metalanguage of the SYMPLE
system. This syntactic description is used for determining "context" for macro references and thus the requirements for a minimally "adequate" syntactic description
of a language are proportional to the degree of context
required to isolate macro references.
As a very simple example, assume all macro references
must occur in only a single specific syntactic unit (syntactic substructure) of the base language (e.g., only
labels of Fortran statements). Then to facilitate the
recognition. of macro references in the source language,
the syntax of the base language need only be described
via the metalanguage to the extent that it can isolate
this syntactic unit type (i.e., Fortran labels.) vVhen
recognized, this syntactic unit will then be considered
as a candidate for containing a macro reference.
After a candidate syntactic unit is isolated in the
source input a check can be made for the existence of
specific macro references by testing for further qualifying patterns within that syntactic unit. For instance,
a Fortran label of "three blanks followed by t"yO numbers" might be a specific macro reference. A check would
thus he made for this reference according; to the syn-'
tactic pattern defining "three blanks followed by two
numbers" whenever a Fortran label is recognized. This
process of local syntax investigation is called "template
matching" for a macro reference.
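The two-level matching described above can be sketched in Python. This is only an illustration, not SYMPLE's implementation: the five-column label field, the regular expression, and the names `LABEL_WIDTH`, `TEMPLATE`, and `find_macro_reference` are all assumptions made for the example.

```python
import re

# Assumption: the label field occupies the first five columns of a card.
LABEL_WIDTH = 5

# Local template: three blanks followed by two digits.
TEMPLATE = re.compile(r" {3}[0-9]{2}")


def find_macro_reference(card: str):
    """Return the label field if it is a macro reference, else None."""
    label = card[:LABEL_WIDTH]      # level 1: isolate the candidate syntactic unit
    if TEMPLATE.fullmatch(label):   # level 2: template matching inside that unit
        return label
    return None
```

The same digits appearing elsewhere on the card are never examined, which is the contextual restriction the text describes.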
It is also through the template matching facility
that translation parameters in the source language
(e.g., arguments, conditions, etc.) are recognized and
passed to the actual macro facility. These translation
parameters, which we shall call argument strings, can
be manipulated by the instructions contained in the
body of the macro (semantic portion).
Since the primary function of the SYMPLE system
is that of a preprocessor, the translation process is mainly that of a manipulation of argument strings and the
insertion of modified and/or created strings back into
the source input. Hence, the actual semantic portion
of the macro is implemented in a language oriented to
the manipulation of character strings. Thus translation
due to macro references and related translation parameters generally results in the insertion of the translation code in the base language into the body of the
code being processed. It will be shown that this "in
place" translation in the SYMPLE system does not
necessarily imply expansion in exactly the same place
(i.e., at the lexicographical location of the macro
reference).
Figure 1-A general flow of the SYMPLE macro
preprocessor system
An attempt will now be made to summarize and
interrelate the functions of the SYMPLE system by
outlining the system functional flow via a system flow
diagram (Figure 1) and the following brief description.
The preprocessor operates as follows:
1. The first items processed contain control information which includes such items as the device(s)
from which subsequent information is to be read,
the device(s) designed for system output, the
names of special edit macros, specific listing
options, etc. Control information may occur
in the input stream at other logical stages of
processing.
2. A description of the base language syntactic
structure is read as input and processed to
build a data base for the recognition portion.
This data base will be used later by a parser.
3. Macros (templates and associated semantic
translation routines) are read in, stored, and
used to create necessary data bases for later
processing.
4. A source deck is read in and parsing of the
source input begins. (Probable entry point for
most users.)
a. As a syntactic unit is recognized, a check
is made to see if any macros have templates
to be matched in this syntactic unit.
Templates of edit macros, if any, are tested
last. When there are no templates left to
be checked and if the end of the total
parse has not been encountered, the parse
is continued.
b. If a macro template match is successful,
the argument strings are passed to its
associated macro semantic portion. There
may be any number of macro templates
associated with a given macro semantic
portion, and identical template patterns
can be associated with different macro
semantic portions.
c. The instructions in the current macro
semantic portion are executed (actually
interpreted) and the results of their operations are effected (e.g., storage manipulation, insertion of translation into input
source, dynamic creation of new macro
templates or semantics for this or other
macros). Upon completion of execution
control is returned to 4a above.
5. When the source deck has been completely
parsed and thus source time translations, including any necessary editing, have been completed, the file is then ready for output in a
manner specified by the control information.
6. Processing is now completed, but by appropriate control information another cycle may
be initiated on (a) new information or (b) on
a previous preprocessor output file. Thus, in the
latter case, we have the possibility of a multipass preprocessor, if desired.
The remainder of this paper will be devoted in the
main to the details of what the SYMPLE system can
do and in general how one goes about using the SYMPLE system. The syntax description metalanguage is
introduced first followed by an introduction to the
macro translation (semantic) and insertion capabilities
of SYMPLE.
Syntax description metalanguage
The syntax description metalanguage is used to describe a parsing "grammar" of the base language in
which macro references are to be embedded and thereby
outline the manner in which the source input is to be
parsed. For example, suppose a label field is one syntactic structure to be parsed. The parser should then be
told that a label field consists of, say, five characters
which are either all digits, all blanks, or a string of
blanks followed by a string of digits.
The grammatical metalanguage used to direct
SYMPLE's parser is similar to the Backus-Naur
Form 4 (BNF) metalanguage. For example, similar
grammatical productions are used to define syntactic
structures; the nonterminals and terminals of BNF are
also used being renamed syntactic units and literal
strings, respectively. There are, however, several features in SYMPLE's metalanguage which were incorporated to extend the power and simplicity of grammatical description over that of standard BNF.
Actual productions in SYMPLE's metalanguage to
define the parsing desired in the preceding example are
(LABEL-FIELD):5&5(0$' '0$(DIGIT))
(DIGIT):'0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
The first production above is interpreted as: a label
field is defined as not less than five nor more than five
characters of a string of zero or more blanks immediately followed by zero or more digits.
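For comparison, the same label-field rule can be written as a small Python check. This is a sketch using a regular expression rather than SYMPLE's parser; the names `LABEL_FIELD` and `is_label_field` are made up for the illustration.

```python
import re

# 0$' ' 0$(DIGIT): zero or more blanks, then zero or more digits.
LABEL_FIELD = re.compile(r" *[0-9]*")


def is_label_field(text: str) -> bool:
    """5&5 bound: exactly five characters, all matching the inner structure."""
    return len(text) == 5 and LABEL_FIELD.fullmatch(text) is not None
```

All blanks, all digits, and blanks followed by digits are accepted; a blank after a digit is rejected.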
Productions
The syntactic units of the base language are defined
by productions in the metalanguage. These productions are of the form:
(LHS): right side
where (LHS) represents the syntactic unit being defined on the left side and the right side contains metalinguistic descriptions of other syntactic unit(s) and/or
literal string(s) in the left to right order in which they
comprise the structure of (LHS). The colon (:) separates the defined syntactic unit on the left side from
the defining information on the right side.
The first production of the base language grammar
must be the definition of the syntactic unit representing
the total syntactic structure of the base language (i.e.,
the initial or distinguished symbol of BNF). Other
productions may be in any order.
(Named) Syntactic units
The metalinguistic representation of a syntactic unit
in a production is a string of arbitrary length enclosed
in parentheses. The string (called the name of the
syntactic unit) may be composed of any characters
with the exception of those used as special delimiters
in the syntax description metalanguage (i.e., illegal
characters are ( ) : ; ' | $ &).
Literal strings
A literal string is represented in the metalanguage
by the desired string of characters enclosed in single
quotation marks ('). Any character may be used within
a literal string, except that a single quotation mark is
represented by two adjacent single quotes for each
occurrence in the literal string in order to differentiate
it from the ending delimiter of the literal string.
Alternatives
If a syntactic unit in the base language may have
alternative representations, these alternatives may be
represented in the metalanguage as a single production
with the alternatives of the syntactic unit each appearing on the right side and separated from each other by
the conventional OR symbol (|).
Example:
(DIGIT):'1'|'2'|'3'|(OTHER)
Complex substructures (Unnamed syntactic
units)
If one does not wish to break down and label a syntax substructure in detail, but simply label an entire
complex substructure as a syntactic unit, pairs of parentheses may be used as grouping indicators. Consider
the following equivalent examples of a definition of
the syntactic unit (NUM4).
Example:
(NUM):'2'|'3'|'4'
(NUM2):'3'|'4'|'5'
(NUM3):'5'|'6'|'7'
(NUM4):'1'(NUM)(NUM2)|'1'(NUM3)
Example:
(NUM4):'1'(('2'|'3'|'4')('3'|'4'|'5')|('5'|'6'|'7'))
Grouping may occur to any depth desired and each
quantity within the grouping parentheses must have
the form of any legal right side of a production.
Quantity repetition and bounds
Often in the syntax of a base language a (named or
unnamed) syntactic unit or literal string may be required to occur several times. Or it may be desirable
to specify that a syntactic structure be a function of
the length of an input string in addition to other qualifications (e.g., a label field of exactly five characters
and consisting of . .. ).
To indicate either the repetition of a string (i.e., the
input string defined by a syntactic structure) or the
length bound on the number of characters in some
string, an operator group must precede the respective
quantity in the syntax. The operator group is of the
form n$m or n&m for the string and character counters
respectively, where n is an integer representing the
lower bound and m, an integer representing the upper
bound.
Consider the following example.
(A): 3$3 (SUB-STRUCTURE)
(B): 3&3 (SUB-STRUCTURE)
(C): 'C'
(SUB-STRUCTURE): 0$5 (C) 1$3 'AB'
The first production defines (A) as exactly three strings
of (0$5(C)1$3'AB'). Thus, acceptable strings for (A)
might be ABABAB or ABCABCC.CCABAB or CCABABCABAB, etc. However, (B) is defined as exactly
three characters which are otherwise defined as in (A).
Thus, (B) can be only CAB; no other combinations
will yield exactly three characters. Notice that the
string counter differs from the character counter in that
it is distributed over all inner strings whereas the character counter represents an absolute bound over a given
substructure.
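The difference between the two counters can be checked with a short Python sketch. It uses a backtracking regular expression as a stand-in for the parser, which is an assumption: SYMPLE's own matching strategy may accept or reject some inputs differently. The names `SUB`, `matches_a`, and `matches_b` are illustrative only.

```python
import re
from itertools import product

# (SUB-STRUCTURE): zero to five C's followed by one to three AB's.
SUB = r"C{0,5}(?:AB){1,3}"

A = re.compile(rf"(?:{SUB}){{3}}")   # string counter: three substructures
B = re.compile(SUB)                  # character counter applied separately below


def matches_a(s: str) -> bool:
    return A.fullmatch(s) is not None


def matches_b(s: str) -> bool:
    # Character counter: the whole substructure spans exactly three characters.
    return len(s) == 3 and B.fullmatch(s) is not None


# Enumerate every three-character string over {A, B, C} accepted as (B).
b_strings = ["".join(p) for p in product("ABC", repeat=3) if matches_b("".join(p))]
```

Enumerating the three-character candidates confirms that CAB is the only string satisfying the character-bounded form.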
When productions include quantities with repetition
counts, the parser which utilizes these productions will
attempt to find the largest number of those quantities
in the input source consistent with the upper bound of
repetitions. If the input contains more than the upper
bound of these quantities, the input string corresponding to the upper bound count of quantities will be recognized and succeeding repetitions will be analyzed according to the syntax following. A lower bound count
of zero is allowable and simply indicates the optional
omission of the quantity.
The absence of an explicit lower bound implies a
lower bound of one. The absence of an explicit upper
bound implies an upper bound which is the maximum
bound allowable in the system. In the present implementation it is 32767. It should be noted that
1$1 (SYUN) and (SYUN) are equivalent as are
$(SYUN) and 1$32767 (SYUN)
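The default-bound rule can be stated compactly in Python. The function below is a hypothetical helper written for this illustration (it is not part of SYMPLE); it takes an operator group such as `0$5` or `$` and fills in the implied bounds.

```python
MAX_BOUND = 32767  # maximum bound quoted for the present implementation


def parse_bounds(group: str, op: str = "$"):
    """Split an operator group 'n<op>m' into (lower, upper) with defaults."""
    lower, _, upper = group.partition(op)
    return (int(lower) if lower else 1,          # missing lower bound means 1
            int(upper) if upper else MAX_BOUND)  # missing upper bound means maximum
```

With these defaults, `$` expands to the same bounds as `1$32767`, matching the equivalence noted above.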
Complement look-ahead
The symbol ¬ preceding a literal string, syntactic unit
or grouping indicates that at that point in the syntax
the quantity indicated must not occur. This is called a
complement look-ahead for the indicated quantity at
parse time. If the quantity is found, the parse being
attempted has failed. (Any syntactic units found on the
look-ahead will not result in macro template match
attempts.) If the quantity is not found, the parse continues as before the complement look-ahead.
Example:
(LETTER):'A'|'B'|'C'|'D'|'E'
(SPLTRSTRG):$(¬'C'(LETTER))
The strings recognized as (SPLTRSTRG) will be any
string which consists of one or more of A, B, D or E,
but not C.
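A complement look-ahead behaves much like a negative look-ahead assertion in modern regular expressions. The Python sketch below expresses the same example that way; the pattern and function names are assumptions for the illustration, not SYMPLE notation.

```python
import re

# One or more letters A-E, each preceded by a check that it is not C.
SPLTRSTRG = re.compile(r"(?:(?!C)[A-E])+")


def is_spltrstrg(s: str) -> bool:
    return SPLTRSTRG.fullmatch(s) is not None
```

The look-ahead consumes no input: it only vetoes the match, after which the letter itself is parsed as usual.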
Scan positioning
The production defining a syntactic unit can be made
to include, without investigation as to structure, an
arbitrary length of input, or it may require that a
particular syntactic unit in the input conform to more
than one syntactic structure. This is done by explicitly
positioning the location at which the parser is "looking."
This location, called the scan position, can be adjusted
either relative to its present position or to the beginning
reference points in the syntax of the parsed input.
a-X(Space) positioning
The occurrence of the symbol X immediately followed
by an unsigned integer number and delimited by bracketing commas at any point in the right side of a production will cause the scan position to be adjusted
rightward from its present location the integer number
of positions specified. The symbol X and following
number must be bracketed on both sides by commas
except in the following cases: X is the first (last)
symbol of a grouping level or the first (last) symbol of
the right side of a production, in which case the left
(right) comma is not required.
Example: Define an (END-CARD) to be an
80 character string. The first six characters must be
blanks, the next 66 characters must have the word
END somewhere with the rest blanks, and the last
eight characters may be anything.
(END-CARD):6&6' '66&66(0$' '('END')0$' '),X8
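The (END-CARD) description translates directly into slicing on an 80-character record. The following Python sketch is illustrative only; `is_end_card` is a hypothetical name, and the final slice plays the role of the X8 skip.

```python
def is_end_card(card: str) -> bool:
    """Check an 80-column card against the (END-CARD) description."""
    if len(card) != 80:
        return False
    label, body = card[:6], card[6:72]
    # card[72:] corresponds to X8: the last eight columns are skipped unexamined.
    return label == " " * 6 and body.strip(" ") == "END"
```

Stripping blanks from the 66-column body and comparing with END covers the "END somewhere with the rest blanks" requirement.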
b-T (Tab) positioning
The format is similar to that of X positioning, except
a T is used instead of an X.
The T scan positioning results in the scan position
being moved the specified number of places to the
right of the beginning location at which the parse began
at (1) this grouping level, if the T positioning appears
within a grouping parenthesis pair, or (2) the right side
of the production otherwise.
Example: A syntactic unit (EMPLOYEE-NO.)
is defined to be an 80 character string with a syntactic
unit (LAST-NAME) beginning in position one, followed by a single blank and then the syntactic unit
(FIRST-NAME). Exactly 15 spaces after the beginning of (FIRST-NAME) is to appear the syntactic
unit (CODE). Finally (NUMBER) will be 75 spaces
from the beginning of (EMPLOYEE-NO.).
(EMPLOYEE-NO.):(LAST-NAME)' '
((FIRST-NAME),T15,(CODE)),T75,
(NUMBER)
Recursive grammars in the metalanguage
Recursive grammars (i.e., productions with the
syntactic unit of the left side occurring as well on the
right side, or being in the derivation of a syntactic
unit of the right side) are allowed in the metalanguage
subject to certain conditions.
For instance, left recursive productions are not allowable, but other recursive productions are allowable.
Further, the character (&) bound counts are cumulative from the initial (top) occurrence in a recursive
parse while the repetition bounds ($) are effective at
each level of recursion.
Non-specific grammars in the metalanguage
Let a non-specific grammar be one in which the
particular alternatives of structure for a syntactic unit
may have structurally the same headings (i.e., leading
components which are structurally the same). The metalanguage allows the specification of such grammars
and at recognition time the parser always picks the
first specified (or left most) alternative as its initial
guess. Subsequent guesses continue with the next
specified alternatives.
The user must be aware of the possible consequences
if the apparent ambiguity in a non-specific grammar
causes the recognition of syntactic units to be rejected
later as a result of an unsuccessful parse. Though the
back-up to the next alternative is handled automatically by the parser, the syntactic units recognized may
result in macro invocations, the results of which will
not automatically be negated. Relevant user aids in
this area are provided by the system.
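The hazard described above, a back-up that restores the scan position but not the effects of macros already invoked, can be sketched in Python. Everything here is a simplified assumption: alternatives are reduced to a (name, heading, rest) triple, and a macro invocation is modeled as appending to a log.

```python
def parse_with_backup(text, alternatives, log):
    """Try alternatives left to right; `log` records macro invocations."""
    for name, heading, rest in alternatives:
        if not text.startswith(heading):
            continue
        log.append(name)                 # macro invoked on the recognized heading
        if text[len(heading):] == rest:
            return name                  # this alternative parses completely
        # Back-up: the next alternative starts from the same scan position,
        # but the entry already appended to `log` is not removed.
    return None


# Two alternatives sharing the same heading (six blanks), as in a
# non-specific grammar.
ALTERNATIVES = [
    ("STMT", " " * 6, "X=1"),
    ("END-CARD", " " * 6, "END"),
]
```

Parsing six blanks followed by END succeeds as END-CARD, yet the log still shows the earlier, rejected STMT attempt.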
The following example illustrates a parsing grammar
for a language which is context sensitive and not context free and which utilizes recursive productions.
L = {0^n 1^n 0^n : n ≥ 1}
(LANG):(LSTR)¬'1',T1,$'0'(RSTR)
(LSTR):'0'(LSTR)'1'|'01'
(RSTR):'1'(RSTR)'0'|'10'
The parser first determines that the input string
belongs to the context-free language 0^n 1^n x; checks to
make sure x does not begin with a 1; repositions to the
beginning of the parsed substring of 1's and then determines that the remaining substring of the input
string belongs to the context-free language 1^n 0^n. If
the above conditions are true, then the input string
belongs to the context-sensitive language 0^n 1^n 0^n.
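The two-pass recognition with repositioning can be mimicked in Python. This is a sketch of the idea only, using run counting instead of recursive productions; `count_run` and `in_language` are names invented for the example.

```python
def count_run(s: str, start: int, ch: str) -> int:
    """Length of the run of `ch` beginning at index `start`."""
    end = start
    while end < len(s) and s[end] == ch:
        end += 1
    return end - start


def in_language(s: str) -> bool:
    """Recognize 0^n 1^n 0^n (n >= 1) by two context-free style passes."""
    n = count_run(s, 0, "0")
    ones = count_run(s, n, "1")          # pass 1: 0^n 1^n, all 1's consumed
    if n < 1 or ones != n:
        return False
    rescan = n                           # reposition to the start of the 1's
    ones_again = count_run(s, rescan, "1")
    tail = count_run(s, rescan + ones_again, "0")   # pass 2: 1^n 0^n
    return tail == ones_again and rescan + ones_again + tail == len(s)
```

Because the first pass consumes every 1, the "x does not begin with a 1" check is implicit here; the rescan index plays the part of the T positioning.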
The SYMPLE macro facility
The macro facility of SYMPLE provides the actual
translation mechanisms. The macros themselves are
read in to the system following the base language
grammar and prior to the user's source deck. The individual macro definitions are described in this section.
MACRO FORMAT
The overall format of an individual macro definition
is as follows:
<macro name> (<syntactic unit>) = <template body> / (<syntactic unit>) = .... ;
macro semantic statements
END;
The exact format and meaning of the various parts
are described in the balance of this section.
Macro name
The first item to appear in the macro is the name of
the macro. The name may be any string of characters,
excluding those special characters previously mentioned as excluded from a syntactic unit name. The
macro name is used exclusively as a "handle" for the
user's organization and SYMPLE's internal system
and macro referencing. The macro name should not
be confused with a macro reference in the source text.
A source reference to the macro is completely independent of its name.
Templates
Following the macro name are a series of macro
templates which are descriptions of possible macro
references that will cause the invocation of the macro.
A single macro template is of the form:
(<syntactic unit>) = <template body>
where the syntactic unit is any syntactic unit that may
occur in the base language, and the template body, if
present, consists of a description of a specific structure
to be found within that syntactic unit. The syntax and
semantics of template body are identical with those
of the metalanguage of SYMPLE except for an extension to make it possible to identify and name argument
strings for the macro.
The extension added to facilitate the identification
and naming of argument strings was simply to allow
the enclosing of the desired argument location in the
syntactic structure of the template within bracketing
parentheses and preceding the left enclosing parenthesis
with a name (with the same character restrictions as a
macro name) to be associated with the enclosed argument string. These enclosed argument strings may
occur anywhere within the template, and in fact may
even enclose other argument strings. The names associated with the argument strings must be unique within a
single macro template.
A macro template may cause a macro invocation in
the following manner. When the syntactic unit designated on the left of the equal sign in a macro template
is recognized by the parser, the actual structure of the
syntactic unit found is compared with the specific
syntax specified in the template body. A successful
comparison results in the invocation of the macro and
the passing to the macro of identified argument strings
in the macro reference, if any. If no template body is
specified, then the macro is immediately invoked with
no arguments passed.
The syntax structure defined in a template body
need not be structurally consistent with that of the
object syntactic unit in which it will be compared.
However, if the template body contains syntactic units,
these units must have been in the productions submitted
with the description of the base language. These productions though can be stand-alone productions (not
logically in the normal base language structure) included solely for use within templates. The use of these
stand-alone syntactic units, literal strings, and alternative arrangements and selection of syntactic units in
the base language can result in template structures
quite different from those recognized in the process of
finding the object syntactic unit. Thus the template
comparison is actually an attempted reparsing within
the physical bounds of the object syntactic unit according to the template syntax description.
Any number of macro templates may follow the
macro name, with a slash (/) separating each, except
that the last template is followed by a semicolon (;).
Example:
NO! (LABEL) = A1 ('    ' A2 ((NUM)))/(STMT) = 'C' A3 (X79);
macro semantic statements
END;
Macro NO! will be invoked when either
1. A (LABEL) is found consisting of four blanks
followed by a (NUM), or else
2. A (STMT) is found beginning with the letter C.
In case 1 two argument strings will be available for
manipulation and testing by the macro semantic
statements; that associated with the string name A1
will be four blanks and the found (NUM); that associated with the string name A2 will be just the found
(NUM). In case 2, the argument string associated with
string name A3 will be the 79 characters following the
initial letter C.
Argument string names which are not in a matched
template or which are associated with null argument
strings in the matched template are associated with
the null string (i.e., have a length attribute of zero).
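Named, nested argument strings correspond closely to nested named groups in a regular expression. The Python sketch below models only the (LABEL) template of the example (four blanks followed by one digit); the names `TEMPLATE`, `ALL_NAMES`, and `match_label` are assumptions for the illustration.

```python
import re

# A1 encloses the four blanks and the digit; A2 encloses only the digit.
TEMPLATE = re.compile(r"(?P<A1> {4}(?P<A2>[0-9]))")

# Every argument string name known to the macro, including A3, which
# belongs to the other template and so stays null on a (LABEL) match.
ALL_NAMES = ("A1", "A2", "A3")


def match_label(label: str):
    """Return the argument strings for a matching label, else None."""
    m = TEMPLATE.fullmatch(label)
    if m is None:
        return None
    args = dict.fromkeys(ALL_NAMES, "")   # unmatched names get the null string
    args.update(m.groupdict())
    return args
```

A name that does not occur in the matched template is simply left as the empty string, mirroring the null-string rule.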
Macro semantics
a-General
The macro semantics facility in SYMPLE is based
on a string oriented language which drives an interpretive mechanism. This language closely parallels
SNOBOL and has a simple syntax. The basic form of
most semantic statements is
<action verb>, <string name> = <string reference>, ... ;
where the action verb is a key word describing some
action to be performed on the referenced strings (literal
strings, string names, etc.) with the resultant string
generally being associated with the given string name.
The details of the semantic language facility are
described in another paper.13 The use of relatively
simple semantic statements in later examples should be
intuitively understandable.
This semantic language provides the ability to:
1. manipulate strings of characters
2. reference strings literally, directly, indirectly
3. reference strings with concise notations
4. communicate between macros
5. execute subroutine-like macros
6. manipulate strings of values
7. alter sequential execution (branch)
8. insert strings back into the ground language code
9. loop repetitively
10. perform string comparisons
11. display string-string name associations
12. terminate interpretive action
to which needs to be added for our discussion one
capability not explicitly mentioned: the ability to
dynamically alter entire macros (templates and semantics).
This last capability mentioned and number 8 listed
above are the means by which the macros effect
their results in the translation process.
b-Output string insertion
Strings which are produced in the macro semantic
portion of a macro may be inserted into the source code
in any of several ways. The semantic language statement which directs the insertion of a string is of the
form:
INSERT, <directive> = <string name>, ... ;
The directive is a code rather than a string name which
specifies the type of insertion to be performed. The
directive codes are I, IA, IB, A, B, A,<digit>, B,<digit>,
PI, PIA, PIB, PA, PB, PA,<digit>, PB,<digit>
and MADD.
They are explained below.
I-The string name(s) is an argument
string name(s). The associated
string is to replace the argument
string occurrence in the macro
reference.
IA-The string name(s) is an argument
string name(s). The associated
string is to be inserted immediately
after the referenced argument
string in the macro reference. In
this, and for all remaining insertion directives, the macro reference
itself remains unchanged.
IB-Same as IA except read "before"
instead of "after".
A-The string(s) associated with the
string name(s) is to be inserted
immediately after the syntactic
unit in which the current macro
reference occurred.
B-Same as A except read "before"
instead of "after."
A,<digit>-The string(s) associated with the
string name(s) is to be inserted
after a particular syntactic unit
or grouping level of the parsed
tree, called the referenced syntactic unit (RFSYUN). The RFSYUN is the first syntactic unit
(at the same or higher level) to
the immediate left of the syntactic
unit or grouping level on the
parsed tree, whose derivation includes, and is the value of
<digit> levels above, the present macro
reference. If a RFSYUN does
not exist by the above definition
then the directive A,<digit>
references the beginning of the
input stream.
B,<digit>-Same as A,<digit> except read "before"
instead of "after".
P prefix directives-(e.g., PI, PIA, etc.) Each P
prefix directive results in the
same type of insertion as the nonprefixed directives. However, the
string inserted is transparent to
all future attempts at parsing or
template matching (i.e., "protected"). The only exception to
this is that a P prefix inserted
string will be visible to the template matching of a specially
designated macro, called the "edit"
macro, whose name is specified
at submission time via the processor control language. All P
prefix inserted strings, if unaltered by the edit macro, will appear in their inserted locations
in the final output.
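The five simplest directives (I, IA, IB, A, B) differ only in where the string is placed relative to the argument string or the enclosing syntactic unit. The Python sketch below illustrates that placement; it is an assumption-laden model in which spans are (start, end) index pairs, and it omits the A,<digit>, B,<digit>, P-prefix, and MADD forms entirely.

```python
def apply_insert(source, arg_span, unit_span, directive, text):
    """Apply one of the directives I, IA, IB, A, B to `source`.

    arg_span  -- (start, end) of the referenced argument string
    unit_span -- (start, end) of the syntactic unit holding the macro reference
    """
    a0, a1 = arg_span
    u0, u1 = unit_span
    if directive == "I":                 # replace the argument string itself
        return source[:a0] + text + source[a1:]
    positions = {"IA": a1, "IB": a0,     # after / before the argument string
                 "A": u1, "B": u0}       # after / before the syntactic unit
    at = positions[directive]
    return source[:at] + text + source[at:]
```

For instance, with the unit `X=F(Y);` and the argument `Y`, directive IA yields `X=F(YQ);` while directive A places `Q` after the semicolon.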
c-Dynamic macro modification
In addition to inserting strings in the source submission, strings may be treated as new/changed macros
via the following directive.
MADD-The string associated with the string
name is a macro and includes macro templates
and/or macro semantics. If the macro is new (no
other macro with the same name) it will be added to
the present library of macros for this submission.
If the macro name is that of a current macro, macro
templates, if present, will be added to those presently
associated with the macro and macro semantics, if
present, will replace those of the present macro.
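The MADD update rule (templates accumulate, semantics are replaced) can be modeled with a dictionary. This Python sketch is hypothetical: the library layout and the `madd` signature are assumptions, and a real MADD string would first have to be parsed into its templates and semantics.

```python
def madd(library, name, templates=(), semantics=None):
    """Add or update macro `name` in `library` following the MADD rule."""
    entry = library.setdefault(name, {"templates": [], "semantics": None})
    entry["templates"].extend(templates)   # new templates are added
    if semantics is not None:
        entry["semantics"] = semantics     # new semantics replace the old
    return library
```

Submitting a macro a second time with only templates leaves its semantic portion untouched, and vice versa.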
CONCLUSION
The purpose of the SYMPLE system is to provide a
general language-independent macro preprocessor. The
syntax directed approach was used to allow both general
and flexible macro referencing techniques.
The SYMPLE syntax description metalanguage
was designed from the premise that the metalanguage
should be a practical tool for real programming languages with their many syntactic idiosyncrasies (e.g.,
imbedded blanks, fields of specified length, continuation columns, etc.). As far as possible and practical
these real problems should be easily describable in
the SYMPLE syntax description metalanguage. In a
standard BNF metalanguage, such problems are at
best very awkward to describe. This led to such concepts as length and repetition binding, and explicit
scan positioning.
Explicit scan positioning added the ability to perform successive analyses, even within a local template
match, by repositioning the analyzer for rescan of
already parsed information. This rescanned information
may, of course, contain different information as the
result of insertions from macro invocations.
The insertion of information in the "protected" mode
(P-prefix directive insertions) further extends the
power of the scan and rescan mechanism of the syntax
analyzer. It allows the user the option to insert code
which either may possibly affect the future syntax
analysis (normal mode), or be completely "transparent"
and thus not possibly affect subsequent syntactic
analyses.
Systems such as TMG,10 COGENT14 and similar
syntax directed compilers or compiler-compilers have
their semantic actions hooked to the parsed syntactic
units of a source submission, much the way SYMPLE
would do without the local syntax parsing of a macro
template. In the context of macro processors, however,
the application of global syntactic analysis followed
by local syntax analysis for the macro references appears to be a new application. The obvious advantage
of this technique is that it provides a means of specifying a contextual dependence for macro references.
Patterns in the source input which would qualify as
macro references on a local syntax basis will qualify
only if they are in the correct global context.
Several previous macro systems [notably XPOP,7
ML/I,2 LIMP19] use some sort of a generalized macro
reference technique. Most used a template matching
technique based on pseudo-syntax methods (e.g.,
noise word structuring of XPOP, specific literal template structures of LIMP). In each case, however, the
scope of applicability of these macro references was
not controlled on a global syntactic basis. ML/I, for
instance, depends on the occurrence of a name of a
macro in a statement for the recognition of a macro
reference. XPOP looks for a macro reference in each
statement based on word structures, with non-"noise"
words in these structures being the arguments. Macro
references in LIMP are perhaps the most general of
the above mentioned systems. However, the templates
of LIMP are (1) literal templates (i.e., character
strings-not defined syntax structures) with "holes"
in them, the "holes" being filled by the required arguments; and (2) each template is eligible in any given
"line" of input. Thus there is no discrimination in regard to the applicability of a template on a global
basis in any of the above mentioned systems; nor is
there structuring of the templates themselves on a
general syntactic basis; nor can the arguments be
identified in a really general manner. It would take
little to show that, at least from a macro reference
point of view, these systems would be relatively simple
special instances in SYMPLE.
The general applicability of the SYMPLE system
has been alluded to, and a few mostly simple examples
are illustrated in the appendix. These examples illustrate the use of SYMPLE as a language extension
facility, in handling "sift" problems, and text editing.
There are certain to be many other areas of applicability not mentioned.
APPENDIX I
SYMPLE processing examples
Example 1
The first example of this appendix is designed to
take OS/360 Fortran IV input and condense all noncomment statements into single condensed strings by
eliminating unnecessary blanks, sequence number
fields, and continuation fields. Each condensed statement will be separated by a record mark (!). Processor
control information is included for completeness.
SYNTAX;
(PROG):$(STMT)(END-CARD)
(STMT):(LABEL-FIELD)('0'|' ')(UNLAB-STMT)|(COMMENT)
(COMMENT):'C',T81,'!'
(LABEL-FIELD):5&5((BLKSTRG)(NUM))
(UNLAB-STMT):¬(END-BODY)(BLKSCN)(SEQFIELD)0$19((CONT-FIELD)(BLKSCN)(SEQFIELD))
(END-CARD):6$6' '(END-BODY)(SEQFIELD)
(END-BODY):66&66(0$1(BLKSTRG)'END'0$1(BLKSTRG))
(BLKSCN):66&66(0$1(BLKSTRG)$(NONBLK))
(BLKSTRG):$' '
(NONBLK):¬' '¬'''',X1|''''$(¬'''',X1)''''
(NUM):'0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
(SEQFIELD):X7,'!'
(CONT-FIELD):¬'C',T6,¬('0'|' '),X1
SYNEND;
MACROS;
CONDENSE(BLKSTRG)=A1((BLKSTRG))/
(SEQFIELD)=A1((SEQFIELD))/
(CONT-FIELD)=A1((CONT-FIELD));
REPLACE,A1=;
INSERT,I=A1;STOP;END;
END-STMT(UNLAB-STMT)=;INSERT,A='!';STOP;END;
MACEND;
SOURCE,RECMK;
[Source deck of Fortran IV card images; the column layout was lost in reproduction. The deck spreads the statements shown in the output below across cards with embedded blanks, continuation cards, and sequence fields.]
Output from SYMPLE after processing above input
INTEGER*2AA(4)/'AB C D'/,BLK/' '/,VAL
(100)/100*0/!DO10K=1,100!
VAL(K)=VAL(K)+95*K-AA(4)!10
AA(K/100)=BLK!1000STOP!END!
Notes on example:
1. The grammar of Fortran IV is detailed here
166
Fall Joint Computer Conference, 1969
only to a level which will distinguish major
substructures. If one wished to further detail
the syntax structure, the syntax of Fortran
statements in the condensed form would be
relatively simply since all extraneous clutter
has been removed. The P-prefix insert capability
could be used to ignore clutter for possible
reparsing without actually removing it from the
input (and thus output).
2. The grammar is non-specific with at least one
point of apparent ambiguity. The beginning
characters of an (END-CARD) will qualify
as the beginning charatters of a (ST1\lT) (i.e.,
6$6"= (LABEL-FIELD)' '). Thus upon encountering an (END-CARD) there ,vill be a
back-up, since an attempt is first made to parse
it as a (STMT). In this case, of course, the
back-up will not have any bearing on the total
processing result.
3. (LABEL-FIE.LD) will accept a label, say
bblb5 and the compressed result would be 15.
The structure of this particular (LABELFIELD) would be
bb
1
''0
5
(BLKSTRG) (NUM) (BLKSTRG) (NUM).
4. Note how (NONBLK) includes all non-blank
characters and literal strings (including blanks.)
5. The macro template (UNLAB-STMT) =; in
the second macro results in the macro END-STMT being invoked with no arguments.
6. The use of the processor control RECMK parameter results in a ! being added to the end of
each logical record on input. The syntax grammar used assumes this, though an equivalent
grammar without the RECMK could easily
be used in this case.
Example 2
This example is designed to remove all redundant
parentheses in a language which uses pairs of left and
right parentheses for grouping. A redundant parenthesis pair is any pair of parentheses which encloses
a string which is also totally enclosed in parentheses.
SYNTAX;
(FLANG) :$(PAREN)
(PAREN) :'(,(INARDS)O$l(INTOO)
(INARDS) :(PAREN)I (INTOO)
(INTOO) :0$(-, ')'"""1 ')"Xl)O$l(PAREN)
SYNEND;
MACROS;
REDUN(PAREN) =' ('AA((INARDS))')';
SEPART,AA='(', AA, ')'/F, Ll; INSERT, I
AA; Ll :STOP; END;
MACEND;
SOURCE, LIST;
(((A(B)))C)((((XY~((Cl))(A))F)))
/*
Output from SYMPLE after processing
((A(B) )C) ((XY~(Cl) (A))F)
Note:
In a recursive parse, inner-most (lowest)
recursive syntactic units [e.g., (PAREN)] are
recognized first and subject to macro expansion
first.
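The effect of Example 2 can be imitated outside SYMPLE with a short stack-based sketch. This is not the SYMPLE grammar-and-macro mechanism; it is a plain Python substitute, and the function name `remove_redundant_parens` is an assumption made for illustration. It treats a pair as redundant when its contents are exactly one other matched pair, which is the definition given above.

```python
def remove_redundant_parens(text):
    """Drop every parenthesis pair whose contents are exactly one other pair."""
    stack = []   # indices of '(' that are still open
    match = {}   # index of '(' -> index of its matching ')'
    for i, ch in enumerate(text):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            match[stack.pop()] = i
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])

    drop = set()
    for start, end in match.items():
        # Redundant when the pair directly wraps a single inner pair.
        if match.get(start + 1) == end - 1:
            drop.update((start, end))
    return "".join(ch for i, ch in enumerate(text) if i not in drop)


print(remove_redundant_parens("(((A(B)))C)"))
```

Unlike the recursive parse described in the note, this sketch finds all pairs in one pass and removes the redundant ones afterwards; the result is the same because redundancy of a pair depends only on the pair immediately inside it.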
Example 3
A final example shows a simple extension of OS/360
Fortran IV obtained by adding a different statement type to the grammar. This different statement
type will contain a macro reference. The format of,
and argument location in, the macro reference will
be strictly dependent on the local syntax specified in
the templates of the macros.
A different statement type could be designated
simply as starting with a non-numeric non-blank
character after column 1 and before column 6.
The grammar defining this basic extension could appear in a submission as follows.
SYNTAX,PUT;
(PROG) :$(STMT) (END-CARD)
(STMT) :(NEW-STMT)I (END-CARD),T80
(NEW-STMT) :5&5($' '$(NONNUM-BLK),T80
(END-CARD) : 6&6' , 66&66 (0$' , CEND')
0$' '),X8
(NUM):' 0'1' 1'1' 2'1' 3'1' 4'1' 5'1' 6'1 '7'1 '8'1'9'
(NONNUM-BLK):-j (NUM) -;' ',Xl
SYNEND;
At this point the syntax description differentiating
this new statement type is defined and any user could
take advantage of the description which via the processor control PUT parameter has been saved. Using the
appropriate processor control and job control statements to retrieve the above syntactic specification, a
user could make submissions similar to the following.
SYNTAX,GET;
(NOISE) : $' 'I'STORE'I'IN'I'To'I'INTO'I'THE'1
'PUT'I 'oF'1 'AND'
(NON-NOISE): $ (-j (NOISE),XI)
SYNEND;
MACROS;
SUM (NEW-STMT) = Al ($ (NOISE) (' ADD'I
'SUM')$(NOISE)A2((NON-NOISE» $
(NOISE) A3 ((NON-NOISE» $ (NOISE)
A4 ((NON-NOISE», T80);
CONCAT, Al ='
" A4, '=', A2, '+', A3;
INSERT,I=AI;
STOP; END
MACEND;
SOURCE;
C THIS IS A FORTRAN COMMENT
ADD A TO B AND STORE IN C
SUM A AND B AND PUT INTO C
STORE THE SUM OF A AND B IN C
END
/*
Output of SYMPLE after processing
C THIS IS A FORTRAN COMMENT
C=A+B
C=A+B
C=A+B
END
The macro used above is a simple macro using a keyword and non-noise positional parameters. The illustrated new type of statement, if imbedded in any
Fortran source deck, would, when processed, be converted to the Fortran type statements listed, which would replace the new statements.
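The keyword-and-noise-word matching performed by the SUM macro can be approximated by the following sketch. It is not SYMPLE; it is a hedged Python imitation that ignores card columns entirely, and the function name `expand_sum_statement` is hypothetical. The noise words and the ADD/SUM keywords are taken from the grammar and macro template above.

```python
NOISE = {"STORE", "IN", "TO", "INTO", "THE", "PUT", "OF", "AND"}
KEYWORDS = {"ADD", "SUM"}


def expand_sum_statement(line):
    """Return 'C=A+B' for an ADD/SUM statement, or None if it does not match."""
    words = line.split()
    # Skip leading noise words, then require the ADD or SUM keyword.
    pos = 0
    while pos < len(words) and words[pos] in NOISE:
        pos += 1
    if pos == len(words) or words[pos] not in KEYWORDS:
        return None
    # The remaining non-noise words are the three positional parameters.
    operands = [w for w in words[pos + 1:] if w not in NOISE]
    if len(operands) != 3:
        return None
    first, second, target = operands
    return "%s=%s+%s" % (target, first, second)


for card in ("ADD A TO B AND STORE IN C",
             "SUM A AND B AND PUT INTO C",
             "STORE THE SUM OF A AND B IN C"):
    print(expand_sum_statement(card))
```

Lines that do not fit the template, such as the comment card or END, yield None here and would be passed through unchanged by a caller.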
REFERENCES
1 M J BAILEY M P BARNETT P B BURLESON
Symbol manipulation in Fortran - SASP I subroutines
CACM Vol 7 No 6 June 1964 339-346
2 P J BROWN
The ML/I macro preprocessor
CACM Vol 10 No 10 Oct 1967 618-623
3 J A FELDMAN
A formal semantics for computer languages and its
application in a compiler-compiler
CACM Vol 9 No 1 Jan 1966 3-9
4 J A FELDMAN D GRIES
Translator writing systems
CACM Vol 11 No 2 Feb 1968 77-113
5 D E FERGUSON
Evolution of the meta-assembly program
CACM Vol 9 No 3 March 1966 190-196
6 M L GRAHAM P Z INGERMAN
A universal assembly mapping language
Proc ACM Aug 1965 409-421
7 M I HALPERN
XPOP: A metalanguage without metaphysics
Proc FJCC Vol 26 1964 57-68
8 M I HALPERN
Toward a general processor for programming languages
CACM Vol 11 No 1 Jan 1968 15-26
9 B M LEAVENWORTH
Syntax macros and extended translation
CACM Vol 9 No 11 Nov 1966 790-793
10 R M McCLURE
TMG-A syntax directed compiler
Proc ACM Aug 1965 262-274
11 M D McILROY
Macro instruction extensions of compiler languages
CACM Vol 3 No 4 April 1960 214-220
12 C N MOOERS L P DEUTSCH
TRAC-A text handling language
Proc ACM Aug 1965 229-246
13 R E PATCHEN
String oriented macro language and interpreter
Penn State Univ Dec 1968 Thesis in Computer Science
14 J E REYNOLDS
An introduction to the COGENT programming system
Proc ACM Aug 1965 422-437
15 S ROSEN
A compiler building system developed by Brooker and Morris
CACM Vol 7 No 7 July 1964 403-414
16 E F STORM
CHAMP - Character manipulation procedures
CACM Vol 11 No 8 Aug 1968 561-566
17 J E VANDER MEY
A general syntax directed macro preprocessor
Penn State Univ March 1969 Thesis in Computer Science
18 R C VARNEY
The central portion of the SYMPLE system-Tree
construction and parsing
Penn State Univ June 1969 Thesis in Computer Science
19 W M WAITE
A language independent macro processor
CACM Vol 10 No 7 July 1967 433-441
An algebraic extension to LISP
by PRENTISS HADLEY KNOWLTON
Harvard University
Cambridge, Massachusetts
INTRODUCTION
An algebraic facility for LISP is quite desirable.
Such a capability is motivated by the desire to utilize the primitive LISP arithmetic functions at the
algebraic expression level. The requirement for a
means of evaluating expressions might very well arise
from applications in algebraic manipulation. Thus,
the user, having performed some sort of transformation
on an algebraic expression, might wish to have
the resulting expression evaluated for a specific set
of values. This facility, in response to this requirement, has the acronym "LEAF" (LISP Extended
Algebraic Facility).
Design considerations and FORTRAN language
facilities provided by LEAF include:
1. a list structured organization compatible with
existing LISP;
2. an arithmetic assignment statement;
3. a DO statement;
4. a logical IF statement;
5. an unconditional GO TO statement; and
6. an INPUT and OUTPUT statement.
Since LEAF is designed in the "spirit" of LISP,
built in functions in a given LISP system which provide for such conveniences as "pretty printing" of
functions and editing facilities may also be applied to
LEAF programs.
The list structured organization of LEAF
Although the initial motivation in developing LEAF
was to extend the LISP language, a number of other
motivating properties of the LEAF concept make
themselves apparent as one uses the LEAF facility.
In order to attain compatibility with the existing
LISP language, LEAF is essentially a dialect of
FORTRAN in list structure. Hence, a program is a
list whose elements are statements. A simple LEAF
program to accept two numbers from the teletype,
determine their sum, and type out the result might be
written as follows:
((INPUT A B)
 (C = A + B)
 (OUTPUT C))
In similar manner, a statement is a list whose elements are the components of that statement. In order
to execute a statement, the LEAF interpreter typically
looks at the keyword (e.g., INPUT), the first element
of the statement, to determine how the statement
should be processed. This is analogous to the LISP
interpreter, in which the first element of a LISP
command is a function, and the remaining elements
of that command constitute the arguments of the
function.
In the "assignment" function, unlike the other
LEAF commands, the keyword, or "=", is the second
element of the list. If the item on the left hand side
of the equal sign is an array reference, the subscripting
can be thought of as a single list element, a sublist
whose elements constitute the subscripts. In SDS 940
LISP as well as in other LISP implementations, commas are perfectly acceptable list element delimiters.
Thus, the user is free to use commas for readability
in subscript lists if he desires, and he is not constrained
to always delimit list elements with blanks. It is important to note in the case of a subscripted variable
on the left hand side of the equal sign in the assignment
statement that the "=" is in fact the third element of the list. Nevertheless, recognition and processing of the assignment statement is still a relatively
straightforward procedure.
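The recognition rule described above (keyword first, except for assignment, where "=" is second or third) can be sketched with Python lists standing in for LISP lists. This is an illustrative assumption, not the LEAF interpreter itself: the function name `classify` is hypothetical, and optional statement labels are ignored.

```python
def classify(statement):
    """Name the kind of LEAF statement held in a Python list."""
    if not statement:
        raise ValueError("empty statement")
    head = statement[0]
    if head in ("COMMENT", "DO", "INPUT", "OUTPUT", "IF", "STOP"):
        return head
    if head == "GO" and statement[1:2] == ["TO"]:
        return "GO TO"
    # Assignment: "=" is the second element, or the third when the
    # left-hand side carries a subscript sublist.
    if statement[1:2] == ["="]:
        return "ASSIGNMENT"
    if len(statement) > 2 and isinstance(statement[1], list) and statement[2] == "=":
        return "ASSIGNMENT"
    raise ValueError("unrecognized statement: %r" % (statement,))


print(classify(["INPUT", "A", "B"]))
print(classify(["C", "=", "A", "+", "B"]))
print(classify(["A", ["I"], "=", "B", ["I"]]))
```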
In addition to the properties LISP and LEAF share,
it is interesting to note that the conveniences which
exist for displaying and modifying LISP functions are
also applicable to the display and modification of
LEAF programs. The nesting of DO loops is readily
apparent from the indented listing one obtains from
the LISP "pretty printing" facility:
((DO I = 1 TO 10
    (A(I) = B(I))
    (DO J = 1 TO 10
        (.)
        (.)
        (.))
    (.)
    (.)
    (.)))
In like manner, one may utilize the editing facilities
available on a given LISP system to modify a LEAF
program with equivalent flexibility as modifying a
LISP function.
Justifications for a list structure
It is worthwhile noting that the list structured approach to the design of an algebraic language lends
itself well to the concepts of program block structure,
program editing, adaptability to a time sharing environment, and, most important of all, language and
data structure compatibility.
Program block structure of the LEAF system is
best illustrated by the DO statement, in which a list
whose elements are statements constitutes the range
of the DO specification. This program block structure
lends itself well to editing operations, since, armed
with an indented listing of his program, one is able to
quickly and accurately access and work with his program at any level. An example of program modification
using the editing facility of SDS 940 BBN LISP is
given in Appendix C.
Like the LISP language, LEAF lends itself well to
a time sharing environment, in that LEAF programs
are easily interpreted at the source language level.
List structured organization of LEAF programs permits several users to work independently with the same
reentrant interpreter, even when two separate programs are "intertwined" in the same storage region.
A particularly significant observation one might
make of the LEAF language is that it possesses the
same basic structure as its data. Hence, there is no
reason why one might not wish to devise a program
which performs operations upon itself, such as the
changing of a "+" to an "*" in an arithmetic expression.
In this sense, within the framework of the LEAF
language, a statement might be thought of as an alphanumeric vector whose elements are keywords,
operators, and operands.
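Because a statement has the same structure as data, changing an operator is an ordinary list operation. The sketch below uses Python lists in place of LISP lists; the function name `replace_operator` is an assumption made for illustration and does not appear in LEAF.

```python
def replace_operator(statement, old, new):
    """Return a copy of a statement list with every `old` symbol swapped for `new`."""
    return [
        replace_operator(item, old, new) if isinstance(item, list)
        else (new if item == old else item)
        for item in statement
    ]


statement = ["SUM", "=", "SUM", "+", "COUNT"]
print(replace_operator(statement, "+", "*"))
```

The original list is left untouched; a program that really edits itself in place would instead overwrite the element, as the editor command in Appendix C does.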
Fortran language facilities provided by LEAF
1. The Assignment Statement
The assignment statement of LEAF is identical
to that of FORTRAN IV with the additional
flexibility of mixed mode arithmetic. Thus,
one may work interchangeably with both integer and real data in arithmetic expressions
without worrying about problems of mode
conversion, since the existing LISP floating
point functions are designed to handle such
situations automatically.
2. The DO Statement
The DO specification of LEAF is similar to
that of PL/I. The remainder of the statement
consists of a list whose elements as statements
constitute the range of the DO. Any level of
nesting is permissible, and the LISP "pretty
printing" facility shows the nesting quite
clearly as illustrated earlier.
3. The Logical IF Statement
Like PL/I, the logical IF statement consists
of an "IF" part followed by a "THEN" part.
The "IF" part consists of two arithmetic
expressions separated by a relational operator
(without periods). The true or false value of
the relation determines the execution or nonexecution of the "THEN" part. In either event,
the next statement in sequence is reached.
4. The Unconditional GO TO Statement
The GO TO statement of LEAF, like that of
PL/I, specifies destination by means of a name
rather than by means of a statement number
as is the case with FORTRAN IV.
5. The INPUT Statement
The INPUT statement consists of the keyword
"INPUT" followed by the variables to
be defined. The "RATOM" (read atom) function
of SDS 940 BBN LISP permits relatively
free formatting of input data.
6. The OUTPUT Statement
Similarly, the OUTPUT statement consists
of the keyword "OUTPUT" followed by the
variables to be printed. The "PRINT" function
of SDS 940 BBN LISP is utilized in this context.
CONCLUSIONS
The LEAF approach seems to be an answer to certain
problems facing users who are dissatisfied with present
day LISP and present day FORTRAN. Feasibly, programs
already written in FORTRAN IV might be converted
to LEAF. The advantages of indented display
of program nesting as well as the facilities of the
LISP editor would certainly warrant this activity.
Working with an algebraic language at the source
language level has many distinct advantages. Among
these advantages, this writer suggests that the COMMENT statement should be treated as an executable
statement, whose text could be made to be listed by
user request during program execution.
The author sincerely hopes that the philosophy of
the LEAF system is given some consideration by the
implementers of future algebraic compilers.
ACKNOWLEDGMENTS
The author wishes to extend special thanks to Dr.
Daniel G. Bobrow of Harvard University's Applied
Mathematics Department, under whom this work was
done as independent study. Dr. Bobrow is also responsible for many of the facilities present in SDS
940 BBN LISP.1 Special thanks are also due to Aiken
Computation Laboratory of Harvard University who
graciously provided SDS 940 computer time for the
carrying out of this work.
Mr. Cornelius Peterson, manager of the Boston
Office of Computer Usage Company, provided the
necessary facilities for the writing of this paper. Finally, Mr. Burton Bloom, Senior Staff Analyst of the
CUC Boston Office, provided many helpful suggestions during the technical revision of this work.
Finally, the author extends appreciation to Jet
Propulsion Laboratory, Pasadena, California, for
the use of their facilities in preparing visual aids in
the presentation of this paper.
REFERENCE
1 D G BOBROW et al
The BBN 940 LISP system
Bolt Beranek and Newman Inc Cambridge Mass April 1968
APPENDIX A
Syntax description of the LEAF system
I. Fundamental Language Components:
<letter> ::= A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z
<digit> ::= 0|1|2|3|4|5|6|7|8|9
<identifier> ::= <letter> { <letter> | <digit> } ;
<variable> ::= <identifier>
<unsigned-integer-constant> ::= <digit> { <digit> } ;
<sign> ::= + | -
<integer-constant> ::= [<sign>] <unsigned-integer-constant>
<real-constant> ::= [<sign>] <unsigned-integer-constant> . [<unsigned-integer-constant>] [<exponent-part>] |
    [<sign>] [<unsigned-integer-constant>] . <unsigned-integer-constant> [<exponent-part>] |
    [<sign>] <unsigned-integer-constant> <exponent-part>
<exponent-part> ::= [<sign>] { <digit> } ~
II. Basic Language Elements
<program> ::= ( { <statement> } i )
<statement> ::= (<comment-statement>) | (<optional-statement-label> <statement-body>)
<comment-statement> ::= COMMENT <commentary> | * <commentary> *
<optional-statement-label> ::= [<identifier>]
<statement-body> ::= <do-statement> | <input-statement> |
    <output-statement> | <assignment-statement> |
    <go-to-statement> | <if-statement> | <stop-statement>
<do-statement> ::= DO <index> = <initial-value> TO <final-value>
    (<do-block>)
<do-block> ::= { <statement> } i
<input-statement> ::= INPUT <argument-list>
<argument-list> ::= { <variable> } i
<output-statement> ::= OUTPUT <argument-list>
<assignment-statement> ::= <variable> = <arithmetic-expression>
<arithmetic-expression> ::= <term> <plus-or-minus> <arithmetic-expression> | <term>
<plus-or-minus> ::= + | -
<term> ::= <factor> <star-or-slash> <term> | <factor>
<star-or-slash> ::= * | /
<factor> ::= <variable> | <constant> | (<arithmetic-expression>)
<constant> ::= <integer-constant> | <real-constant>
<go-to-statement> ::= GO TO <identifier>
<if-statement> ::= IF <arithmetic-expression> <relational-operator>
    <arithmetic-expression> THEN (<statement>)
<relational-operator> ::= GT | GE | LT | LE | EQ | NE
<stop-statement> ::= STOP
APPENDIX B
Some representative functions of the LEAF interpreter
(STATEMENT
  (LAMBDA (COMMAND)
    (COND
      ((COMMENT-STATEMENT COMMAND)
        NIL)
      ((DO-STATEMENT COMMAND)
        NIL)
      ((INPUT-STATEMENT COMMAND)
        NIL)
      ((OUTPUT-STATEMENT COMMAND)
        NIL)
      ((ASSIGNMENT-STATEMENT COMMAND)
        NIL)
      ((GO-TO-STATEMENT COMMAND)
        NIL)
      (T (IF-STATEMENT COMMAND)))))
(COMMENT-STATEMENT
  (LAMBDA (COMMAND)
    (EQ (CAR COMMAND)
        (QUOTE COMMENT))))
(DO-STATEMENT
  (LAMBDA (COMMAND)
    (PROG (INDEX FROM TO)
          (COND
            ((NEQ (CAR COMMAND)
                  (QUOTE DO))
              (RETURN NIL)))
          (SETQ INDEX (CADR COMMAND))
          (SETQ FROM (CADDDR COMMAND))
          (SETQ INDEX FROM)
          (SETQ TO (CADDDDDR COMMAND))
     LOOP (COND
            ((GREATERP INDEX TO)
              (RETURN T)))
          (LEAF (CADDDDDDR COMMAND))
          (ADD1 INDEX)
          (GO LOOP)
          )))
(INPUT-STATEMENT
  (LAMBDA (COMMAND)
    (PROG (ARGUMENT-LIST)
          (COND
            ((NEQ (CAR COMMAND)
                  (QUOTE INPUT))
              (RETURN NIL)))
          (SETQ ARGUMENT-LIST (CDR COMMAND))
     LOOP (COND
            ((NULL (CAR ARGUMENT-LIST))
              (RETURN T)))
          (SET (CAR ARGUMENT-LIST)
               (RATOM NIL))
          (SETQ ARGUMENT-LIST (CDR ARGUMENT-LIST))
          (GO LOOP)
          )))
(OUTPUT-STATEMENT
  (LAMBDA (COMMAND)
    (PROG (ARGUMENT-LIST)
          (COND
            ((NEQ (CAR COMMAND)
                  (QUOTE OUTPUT))
              (RETURN NIL)))
          (SETQ ARGUMENT-LIST (CDR COMMAND))
     LOOP (COND
            ((NULL (CAR ARGUMENT-LIST))
              (RETURN T)))
          (PRINT (CAAR ARGUMENT-LIST))
          (SETQ ARGUMENT-LIST (CDR ARGUMENT-LIST))
          (GO LOOP)
          )))
(ASSIGNMENT-STATEMENT
  (LAMBDA (COMMAND)
    (PROG NIL
          (COND
            ((NEQ (CADR COMMAND)
                  (QUOTE =))
              (RETURN NIL)))
          (SET (CAR COMMAND)
               (ARITHMETIC-EXPRESSION (CDDR COMMAND)))
          (RETURN T)
          )))
(ARITHMETIC-EXPRESSION
  (LAMBDA (LIST)
    (PROG (VALUE)
          (SETQ POINTER LIST)
          (SETQ VALUE (TERM NIL))
     LOOP (COND
            ((NULL (CAR POINTER))
              (RETURN VALUE))
            ((EQ (CAR POINTER)
                 (QUOTE +))
              (SETQ POINTER (CDR POINTER))
              (SETQ VALUE (FPLUS VALUE (TERM NIL)))
              (GO LOOP))
            ((EQ (CAR POINTER)
                 (QUOTE -))
              (SETQ POINTER (CDR POINTER))
              (SETQ VALUE (FDIFFERENCE VALUE (TERM NIL)))
              (GO LOOP))
            (T (RETURN VALUE)))
          )))
(TERM
  (LAMBDA NIL
    (PROG (VALUE)
          (SETQ VALUE (FACTOR NIL))
     LOOP (COND
            ((NULL (CAR POINTER))
              (RETURN VALUE))
            ((EQ (CAR POINTER)
                 (QUOTE *))
              (SETQ POINTER (CDR POINTER))
              (SETQ VALUE (FTIMES VALUE (FACTOR NIL)))
              (GO LOOP))
            ((EQ (CAR POINTER)
                 (QUOTE /))
              (SETQ POINTER (CDR POINTER))
              (SETQ VALUE (FQUOTIENT VALUE (FACTOR NIL)))
              (GO LOOP))
            (T (RETURN VALUE)))
          )))
(FACTOR
  (LAMBDA NIL
    (PROG (VALUE POINTER-SAVE)
          (COND
            ((NUMBERP (CAR POINTER))
              (SETQ VALUE (CAR POINTER))
              (SETQ POINTER (CDR POINTER))
              (RETURN VALUE))
            ((ATOM (CAR POINTER))
              (SETQ VALUE (CAAR POINTER))
              (SETQ POINTER (CDR POINTER))
              (RETURN VALUE))
            (T (SETQ POINTER-SAVE POINTER)
               (SETQ VALUE (ARITHMETIC-EXPRESSION (CAR POINTER)))
               (SETQ POINTER POINTER-SAVE)
               (SETQ POINTER (CDR POINTER))
               (RETURN VALUE)))
          )))
(FDIFFERENCE
  (LAMBDA (A B)
    (FPLUS A (FMINUS B))))
(LEAF
  (LAMBDA (PROGRAM)
    (PROG (LOCATION LABEL)
          (SETQ LOCATION PROGRAM)
     LOOP (COND
            ((NULL (CAAR LOCATION))
              NIL)
            ((STOP-STATEMENT (CAR LOCATION))
              (RETURN (QUOTE STOP))))
          (STATEMENT (CAR LOCATION))
          (SETQ LOCATION (CDR LOCATION))
          (GO LOOP)
          )))
(STOP-STATEMENT
  (LAMBDA (COMMAND)
    (EQ (CAR COMMAND)
        (QUOTE STOP))))
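The division of labor among ARITHMETIC-EXPRESSION, TERM, and FACTOR is a conventional recursive descent over a list, with a shared pointer advancing through the elements. The following Python sketch restates that idea under stated assumptions: nested Python lists stand in for LISP lists, a dictionary stands in for atom values, and the name `evaluate` is hypothetical. It is an illustration, not a transcription of the listing.

```python
def evaluate(expression, variables):
    """Evaluate a LEAF-style expression given as a nested Python list."""
    pos = 0  # plays the role of the shared POINTER in the LISP listing

    def factor():
        nonlocal pos
        item = expression[pos]
        pos += 1
        if isinstance(item, (int, float)):
            return float(item)
        if isinstance(item, str):
            return float(variables[item])
        return evaluate(item, variables)  # parenthesized sub-expression

    def term():
        nonlocal pos
        value = factor()
        while pos < len(expression) and expression[pos] in ("*", "/"):
            op = expression[pos]
            pos += 1
            value = value * factor() if op == "*" else value / factor()
        return value

    value = term()
    while pos < len(expression) and expression[pos] in ("+", "-"):
        op = expression[pos]
        pos += 1
        value = value + term() if op == "+" else value - term()
    return value


env = dict(A=1.0, B=2.0, C=3.0, D=4.0, E=5.0, F=6.0, G=7.0)
print(evaluate(["A", "+", "B", "*", "C"], env))
print(evaluate(["A", "*", "B", "+", "C"], env))
```

With A through G bound to 1.0 through 7.0, as in the Appendix C session, the two calls give 7.0 and 5.0, showing that multiplication binds more tightly than addition.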
APPENDIX C
Representative applications of the LEAF system
Examples of input statements, output statements, the assignment statement, and arithmetic expressions
← INPUT-STATEMENT ((INPUT A B C D E F G))
1.0 2.0 3.0 4.0 5.0 6.0 7.0
T†
† The "T" indicates that the invoked function succeeded.
← OUTPUT-STATEMENT ((OUTPUT A B C D E F G))
1.000000000
2.000000000
3.000000000
4.000000000
5.000000000
6.000000000
7.000000000
T
← ASSIGNMENT-STATEMENT ((H = A + B + C + D + E + F + G))
T
← OUTPUT-STATEMENT ((OUTPUT H))
28.00000000
T
← ARITHMETIC-EXPRESSION ((A * B * C * D * E * F * G))
5040.000000
← ARITHMETIC-EXPRESSION ((A + B * C))
7.000000000
← ARITHMETIC-EXPRESSION ((A * B + C))
5.000000000
← ARITHMETIC-EXPRESSION ((((((((((((A)))))))))) - ((B))/(C + D) *(E + F]†
-2.142857143
← ARITHMETIC-EXPRESSION (((A + B) * (C + D - F)))
3.000000000
← ARITHMETIC-EXPRESSION (((A - B + C) / (D + F * (((G))))))
4.347826087E-02
← ARITHMETIC-EXPRESSION ((A / B - C / D + F * G))
41.75000000
† The "]" causes a sufficient number of right parentheses to be
generated.
A program using input, output, and assignment statements
← E(SETQQ PROGRAM ((INPUT A B) (C = A + B) (D = A - B) (E = A * B)
(F = A / B) (OUTPUT A B C D E F) (STOP)))‡
← E(LEAF PROGRAM)§
2.0 3.0
2.000000000
3.000000000
5.000000000
-1.000000000
6.000000000
6.666666667E-01
STOP
‡ At this point, the atom "PROGRAM" is bound with the LEAF
program as shown. The top-level function "E" merely means
"execute the given function (first elements) on its arguments
without prior evaluation of those arguments."
§ The LEAF interpreter is now applied to the designated program.
The user satisfies the INPUT statement by typing "2.0
3.0 (CR)," and the LEAF system responds with the desired
output, followed by "STOP" as generated by the STOP statement.
A program using the DO statement
← PRETTYPRINT(SUMMATION)‖
((SUM = 0.000000000)
 (COUNT = 0.000000000)
 (DO I = 1 TO 10 ((COUNT = COUNT + 1.000000000)
                  (SUM = SUM + COUNT)
                  (OUTPUT SUM)))
 (STOP))
‖ In this instance, we assume that the "SUMMATION" program
has already been defined; hence, we need only print it out using
the "PRETTYPRINT" of SDS 940 BBN LISP. Note how transparent program block structure becomes via this facility.
← E(LEAF SUMMATION)
1.000000000
3.000000000
6.000000000
10.00000000
15.00000000
21.00000000
28.00000000
36.00000000
45.00000000
55.00000000
STOP
Modification of a program using the editing facility
← EDITV(SUMMATION)†
EDIT
*(1 (SUM = 1.0))
*3
*7
*2
*P
(SUM = SUM + COUNT)
*(4 *)
*↑
*PP
((SUM = 1.000000000)
 (COUNT = 0.000000000)
 (DO I = 1 TO 10 ((COUNT = COUNT + 1.000000000)
                  (SUM = SUM * COUNT)
                  (OUTPUT SUM)))
 (STOP))
*OK
SUMMATION
← E(LEAF SUMMATION)
1.000000000
2.000000000
6.000000000
24.00000000
120.0000000
720.0000000
5040.000000
40320.00000
362880.0000
3628799.999
STOP
† At this point we wish to edit our sample SUMMATION example to no longer produce successive sums, but to produce
successive products or factorials. The "*" tells us we are talking
to the editor. The command "*(1 (SUM = 1.0))" updates the
first statement of our original summation program (1.0 is the
identity element for multiplication). "*3" focuses our attention
on the DO statement, "*7" focuses our attention on the range
of the DO, and "*2" focuses our attention on the second statement of the range of the DO. "*P" causes that statement to be
printed out, the operation "(4 *)" causes the "+" of that statement to be changed to an "*", "↑" returns our attention to the top
level, "*PP" "pretty prints" the edited function, and "OK" tells
the editor we are all done.
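The SUMMATION runs can be reproduced with a very small interpreter over nested lists. The sketch below is an assumption-laden illustration, not the LEAF interpreter: Python lists replace LISP lists, the name `run` is hypothetical, assignments are limited to a single operand or one binary operation, and STOP simply ends the statement list in which it appears.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def run(program, env=None, output=None):
    """Interpret a small LEAF-like program; return the list of OUTPUT values."""
    env = {} if env is None else env
    output = [] if output is None else output

    def value(x):
        return float(env[x]) if isinstance(x, str) else float(x)

    for stmt in program:
        head = stmt[0]
        if head == "STOP":
            break
        elif head == "OUTPUT":
            output.extend(value(name) for name in stmt[1:])
        elif head == "DO":
            # (DO index = first TO last (block))
            _, index, _, first, _, last, block = stmt
            i = value(first)
            while i <= value(last):
                env[index] = i
                run(block, env, output)
                i += 1
        elif stmt[1] == "=":
            rhs = stmt[2:]
            if len(rhs) == 1:
                env[stmt[0]] = value(rhs[0])
            else:
                left, op, right = rhs
                env[stmt[0]] = OPS[op](value(left), value(right))
    return output


def summation(start, op):
    """Build the SUMMATION program with a chosen starting value and operator."""
    return [["SUM", "=", start],
            ["COUNT", "=", 0.0],
            ["DO", "I", "=", 1, "TO", 10,
             [["COUNT", "=", "COUNT", "+", 1.0],
              ["SUM", "=", "SUM", op, "COUNT"],
              ["OUTPUT", "SUM"]]],
            ["STOP"]]


print(run(summation(0.0, "+")))
print(run(summation(1.0, "*")))
```

The first call yields the running sums 1 through 55 and the second the factorials of 1 through 10, mirroring the session before and after the edit.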
An on-line machine language debugger
for OS/360
by WILLIAM H. JOSEPHS
The Rand Corporation
Santa Monica, California
INTRODUCTION
The environment provided by the multiprogrammed
options of Operating System 360 is not the most
suitable for debugging. It is primarily a batch system
with a programmer's card deck disappearing into the
card reader and reappearing at some future time on a
printer. What happens in between is often impossible
to discern; any attempt to monitor a program's execution (e.g., the setting of an address stop) is so complicated that it is nearly impossible. In this environment,
debugging is difficult: at the conclusion of a program,
the programmer either has successful execution or some
indication of program error. If he planned ahead
(and was lucky), his output will include not only an
indication of the actual error, if one occurred, but
trace information (either through OS TESTRAN
facilities or his own printouts) to help him determine
the problem. However, he is usually presented with a
dump, containing a numerical reference to the completion-codes manual. More importantly, the dump
represents the state of the system when OS decided
it could not continue the program's execution; the
user must discover why it went wrong by educated
guesses and by "playing computer" with his program.
The difficulty and sheer wastefulness of this procedure
is extremely evident. For this purpose, an on-line
symbolic debugger can be invaluable.
One traditional environmental requirement for on-line
debugging is an on-line system with remote job-entry
capabilities and file-management functions or
a dedicated machine and its operator console. DYDE
(Dynamic Debugger), the system described herein,
was developed in and for the former environment
using the RAND Simultaneous Graphics System.
However, the debugger can be used in a normal OS
batch environment using any available 2260 graphic-display terminal or even the on-line operator's typewriter.
The text that follows includes an external description,
including invocation procedures and command formats,
followed by a brief explanation of the internal operation
of the debugger (including the "pingpong" SVC).
DYDE
Invocation of DYDE
DYDE is executed as an OS job using a standard
set of Job Control Statements (see Figure 1). These
define the library in which DYDE resides (JOBLIB),
a library containing the program or programs to be
debugged (SYSLIB), and a scratch file for organizing
the symbol table (SYSUT1). In addition, any JCL
statements defining data sets that are used by the
program to be debugged must be included (in this
context, DYDE contains a facility for overriding
both the SYSLIB and the SYSUT1 ddnames if the
program being debugged needs them). Figure 2 illustrates a procedure for assembling, link editing, and
debugging. In any of these procedures, as soon as
DYDE receives control, it writes out a message indicating its readiness for user commands.
//           JOB
//JOBLIB     DD    library definition
//S1         EXEC  PGM=DYDE
//SYSLIB     DD    library definition
//SYSUT1     DD    UNIT=SYSDA,SPACE=(TRK,(5,1))
//SCOPE      DD    UNIT=040
Figure 1-Sample JCL for invocation of DYDE using
the 2260 version
//           JOB
//JOBLIB     DD    library definition
//STEP1      EXEC  ASMFCL,PARM.ASM='TEST',PARM.LKED='TEST'
//ASM.SYSIN  DD    *
             source deck
/*
//STEP3      EXEC  PGM=DYDE
//SYSLIB     DD    DSNAME=*.STEP1.LKED.SYSLMOD,DISP=(OLD,DELETE)
//SAMPLEDD   DD    data set description
//SYSUT2     DD    UNIT=SYSDA,SPACE=(TRK,(5,1))
//SCOPE      DD    UNIT=040
Figure 2-Sample assemble, link edit, and debug JCL
Note: The SYSLIB card points to the output of the
Link Edit step, and the user will override (using the
*DDNAME command) the SYSUT1 default
name with SYSUT2
Device dependencies
DYDE can interact with the user through either
an IBM 2260 display station or the IBM 1052 operator's
console. For this purpose, two versions of DYDE
exist: one for the 2260 interaction, the other for the
1052 (described in Appendix A). Because these devices
are extremely different, the mechanics of the interaction
differ significantly. However, the basic operations are
the same.
The more natural mode of operation, and the one
for which DYDE was originally designed, uses the
IBM 2260 graphics-display station. This is an alphanumeric device with a CRT capable of displaying up
to twelve lines of text; each line can contain a maximum
of 80 characters. The control unit for the 2260, the
IBM 2848, buffers typed messages, displays typed
characters, and handles display regeneration and
cursor advancement. The main CPU is presented with
an attention interrupt only when the enter key is
depressed. The OS Graphics Access Method (GAM)
schedules an asynchronous routine of DYDE that,
in turn, activates the main routine in DYDE. The
message is then read and acted upon.
The twelve-line screen face is divided into two
logical sections:
1. The first three lines (0, 1, and 2) are for DYDE-user communication;
2. The remaining nine lines (3 through 11) are
for data display.
Data is written in the second area in a wrap-around
fashion: the first data item is displayed starting on
line 3, the next on line 4, and so on until the screen
is full. At this point, new data is displayed starting
again on line 3 (erasing automatically the previously
displayed data); and line 4 is erased, providing a
visual delimiter between old data and the most recent
display. Each new line of data display is handled in
this manner, with the data overwriting the oldest
data on the screen, and the next numbered line blanked
as a delimiter.
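The wrap-around discipline of the data area can be modeled in a few lines. The sketch below is an assumption for illustration only: the class name `DataArea` is hypothetical, no 2260 hardware or GAM call is involved, and it assumes that when line 11 is written, line 3 counts as the "next numbered line" to be blanked.

```python
class DataArea:
    """Lines 3..11 of a twelve-line screen, written in wrap-around fashion."""

    FIRST, LAST = 3, 11

    def __init__(self):
        self.lines = {n: "" for n in range(self.FIRST, self.LAST + 1)}
        self.next_line = self.FIRST

    def _after(self, n):
        # Line numbers wrap from 11 back to 3.
        return self.FIRST if n == self.LAST else n + 1

    def write(self, text):
        """Show `text` on the next line and blank the line after it."""
        line = self.next_line
        self.lines[line] = text[:80]   # each line holds at most 80 characters
        follower = self._after(line)
        self.lines[follower] = ""      # visual delimiter between new and old data
        self.next_line = follower
        return line


area = DataArea()
used = [area.write("item %d" % i) for i in range(10)]
print(used)
```

After ten writes the tenth item occupies line 3, line 4 is blank, and lines 5 through 11 still hold the older items, which is the visual effect described in the text.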
The three remaining lines (0, 1, and 2) are used
for command processing. The user enters his commands
on line 2 beginning with a start symbol (displayed as
~ and usually written automatically by DYDE)
followed by the command; this is followed by the
attention or the enter key (displayed as • ) that
interrupts the CPU. DYDE reads the message and
immediately echoes (i.e., rewrites) it on line 0. This
provides not only positive verification of the transmission but also, as the user prepares to type the next
message, a useful indicator of the last operation performed. Any data display requested is displayed on
the first free line of the data area, and the line following
is blanked. Finally, DYDE writes a confirmation
message on line 1 and prepares line 2 for the next
command by erasing it, writing the start symbol, and
positioning the cursor at the first free space. Should
the command be syntactically incorrect, an error
message is written on line 1 (the echo message on
line 0 provides the user with ready reference for discovering his error) and the data region of the display
is not disturbed.
The discussion that follows is concerned primarily
with the 2260 version of DYDE rather than the 1052.
Significant differences will be noted; however, all
command and message formats, as well as operational
details, are described for the graphic station version
rather than for the typewriter version.
Typical debugging session
A typical debugging session begins when DYDE
gains control and writes its READY message. At this
point, the user can identify the program to be debugged,
perhaps overriding one or more of the ddnames that
DYDE normally uses. After the program has been
successfully LOADed, the full spectrum of DYDE
commands is available to the user. He may indicate
to DYDE that he wishes execution of his program to
be temporarily suspended when control reaches specified locations; this is done by inserting breakpoints
at these locations. Commands exist for modifying
parts of his code or his data. He can then request
DYDE to begin execution of his program. At this
point, four events can suspend program execution
and transfer control to DYDE:
1. Control reaching a previously defined breakpoint;
2. Executing the pingpong supervisor call as an
assembled instruction in the user's program
(e.g., useful when debugging an overlay program
when a particular load is not originally in
core) ;
3. An asynchronous interrupt from the user at
his 2260 (not available for 1052 users);
4. The program incurs a program check (e.g., it specifies
an invalid address or operation code).
For release 17 of the operating system, a fifth event
can suspend program execution:
5. Whenever the user's program is terminated
abnormally by the operating system. *
At any of the above halting points, the user may,
for example: (1) display data in his program, (2)
modify data, instructions, or register contents, (3)
create hardcopy of specified areas within his program,
(4) insert new breakpoints, or (5) delete old breakpoints. He may resume execution of his program
~rom the point at which it last halted (the "current"
breakpoint) in either the instruction step mode (execute one instruction at a time) or in the uncontrolled
mode, in which _case only one of the above events
can suspend program execution again. In this manner,
the user can watch his program's execution to catch
an error as it is occurring as well as test his program
with sample data or temporary patches.
DYDE commands
The available commands that the user may issue
fall into two general categories: (1) those that create
the proper environment for debugging the program,
* Items two through five are considered by DYDE to be implicit
breakpoints.
and (2) those that cause actual data display from the
program.
All "environmental" commands begin with an
asterisk, followed by the command keyword. If parameters are necessary, the keyword is followed by an
equal sign; then the parameters are entered and
delimited by one of several special characters (the
selection of the special characters is made by the user).
These special characters include the following symbols:
',', '-', ':', ';', '/', '.', '$', and '@'. In the command
descriptions that follow, the '/' is used.
Most of the commands allow different forms of the
parameters; however, each legal form is stated explicitly, and no other form may be used. Within the
parameter descriptions, the user substitutes the indicated quantity for lower-case items and supplies the
operand exactly as shown for upper-case items. Several
commands contain a quantity called "loc" as a parameter. In general, this refers to a location within the
user's program; its actual use is described at the end
of this section.
The commands (with the preceding start symbol
and the trailing, end-of-message symbol omitted)
follow.
1.
*NAME = pgmname
defines the linkage-editor-assigned member name of
the program to be debugged. This program is LOADed
from the data set defined by the SYSLIB DD card
(or any overrides; see the *DDNAMES command below).
While LOADing the program, the debugger organizes
the symbol table, if present, and writes it out on the
data set defined by the SYSUT1 DD card (also overridable; see the *DDNAMES command). The command may be issued at any time; if a previous program
is in core, it is deleted, and the debugger reinitializes
itself before LOADing the new program.
2.
*FINISH
terminates the debugger.
3.
*PARM = parameter information
sets up pointers so that the information following the
equal sign is passed to the program according to normal
OS standards. *
* If the parameters are coded PARM='XYZ' on the EXEC
card, the command should be *PARM = XYZ.
4.
*DDNAME=syslib/sysut1/sysprint
causes the debugger to override, in its DCBs, the
default names for the library data set, the symbol-table
utility-work data set, and the data set to contain
hardcopy output. The normal names are SYSLIB,
SYSUT1, and SYSPRINT. However, as indicated
previously, the user may need these names for his
program's execution. In this case, he may, using this
command, override one, all, or any combination of
these three names; e.g., if the user included a DD
card named PRIVLIB instead of the SYSLIB card,
he would issue *DDNAME=PRIVLIB. If he needed
the names SYSUT1 and SYSPRINT for his program's
execution, he could include DD cards named A and B
and issue the command *DDNAME=/A/B. To be
effective, this command must be issued before the
associated data set is needed; to issue a *NAME
command followed by the *DDNAME would be
meaningless unless the user wished to debug two
programs from two different libraries.
5.
(1)
*SETMODE=NEXT=ON
(2)
*SETMODE=NEXT=OFF
causes the debugger to change its global mode setting.
NEXT = ON tells the debugger to recognize the
next *GO (or a null command) as a command to execute the next instruction; in this way, the background
program can be run one instruction at a time. NEXT =
OFF resets this.
6.
*TRACE
causes DYDE to print the current contents of the
screen face into the SYSPRINT data set and, thereafter, to print each displayed line. If DYDE is tracing
currently, *TRACE turns off tracing.
7.
*PRINT
requests the debugger to copy everything displayed
currently on the 2260 screen face into the SYSPRINT
data set (this name is overridable; see the *DDNAME
command). In this way, the user may keep a history
of his debugging sessions and also develop a hardcopy
trail of errors for later analysis. This command does not
exist in the 1052 version.
8.
(1) *BREAK=name
(2) *BREAK=name/DEL
(3) *BREAK=/DEL
(4) *BREAK=name/loc
(5) *BREAK=name/loc/verify string
instructs the debugger to insert a breakpoint (cases
1, 4, and 5) or delete a breakpoint (cases 2 and 3). In
the former case, a breakpoint, with the given name, is
inserted at a specified location. In case 1, it is inserted
at the last displayed position; in case 4, at the named
location; and in case 5, at the named location, after
DYDE has verified that the supplied string (in hex)
matches the information that is actually in core at
that location. If the two strings do not match, the
location is displayed, but no breakpoint is inserted
nor is any other change made. Case 2 tells the debugger
to delete the named breakpoint; and case 3 tells the
debugger to delete the current breakpoint (if one
exists).
9.
*GO
instructs the debugger to execute (or resume) the
current program. If this is the first *GO issued after
an *NAME, the program begins at the link-editor-assigned entry point. If the program is halted currently
at a breakpoint, control is resumed at the breakpoint's
location unless an *RESUME has modified this address. If the program has program checked (a specific
type of 360 interrupt such as an invalid address specification), the only way to resume it without reloading
a fresh copy is through the *RESUME.
10.
*RESUME = loc
specifies that when program execution is restarted, the
debugger should resume execution at the specified
address rather than starting at the current breakpoint.
This is the only way to resume a program that has
program checked. Note that great care must be exercised when using this command to guarantee that
registers and program cells are properly set so that
another program check does not occur.
11.
*DUMP
tells the debugger to dump itself and the program
as if an ABEND (an abnormal termination SVC with
the code of 100) were located at the current breakpoint rather than the machine instruction actually
there.
12.
(1) *MODIFY='COND'/value
(2) *MODIFY=loc/value
(3) *MODIFY=reg no/value
(4) *MODIFY=value
(5) *MODIFY=loc/rep value/verify value
instructs the debugger to modify the program being
debugged. In cases 1 and 3, the debugger modifies
either the condition code set when the program resumes
or the value of the specified register. For the condition
code, the user supplies the mask as if he were testing
it; *MODIFY='COND'/8 would cause the instruction BC 8 to branch, whereas BC 7 would not. For
the register, the hex digits supplied replace the same
number of digits in the register; if register 3 contains
ABCD1234 and if the command *MODIFY=#3/0000
were issued, the new value would be 00001234.
In case 2, the specified location is modified by the
supplied value; in case 5, the specified location is
modified by the rep value, after comparing it with
the verify value; and in case 4, the last displayed
location is modified. All hex digits supplied are modified
in all cases; i.e., if location 1000 contained 47F0,1234
and if the command *MODIFY=47AF were issued,
the new value would be 47AF,1234. Note that in
cases 2 and 4 the value supplied may contain imbedded
commas.
13.
(1)
(2)
*CSECT = loc
*CSECT
defines a new context for the evaluation of expressions
used for the loc parameters. In case 1, the location
specified is used as the new base. Case 2 resets the
program's base to the first byte of the load module.
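The digit-replacement rule of the *MODIFY command (the supplied hex digits replace the same number of leading digits, and the rest of the old value is kept) can be sketched as follows. This is only an illustration in Python, not DYDE's code; the function name `modify_hex` and the treatment of commas as pure visual separators are assumptions.

```python
HEX_DIGITS = set("0123456789ABCDEF")


def modify_hex(current: str, supplied: str) -> str:
    """Overlay the supplied hex digits on the leading digits of `current`.

    Assumption: commas are only visual separators; those in `current`
    stay where they are and those in `supplied` are ignored.
    """
    digits = [c for c in supplied.upper() if c != ","]
    if any(c not in HEX_DIGITS for c in digits):
        raise ValueError("value must consist of hex digits and commas")

    result = []
    used = 0
    for ch in current:
        if ch != "," and used < len(digits):
            result.append(digits[used])  # replace this digit
            used += 1
        else:
            result.append(ch)            # keep separator or untouched digit
    if used < len(digits):
        raise ValueError("more digits supplied than the target holds")
    return "".join(result)


# The two examples given in the text for command 12.
print(modify_hex("ABCD1234", "0000"))   # register 3 after *MODIFY=#3/0000
print(modify_hex("47F0,1234", "47AF"))  # location 1000 after *MODIFY=47AF
```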
Several previous commands contain a location
specification as a parameter (signified by loc in the
command's syntax). Wherever this is required, the
user may code the sum or difference of any combination
of the following elements:
1. ?hex value: hex displacement from the current
base point (see *CSECT);
2. &hex value: absolute displacement from the
first addressable byte in the machine;
3. decimal value: decimal displacement from the
current base point (see *CSECT);
4. *: location of the current breakpoint;
5. # followed by a register number (e.g., #3);
6. character string: absolute location of the
specified symbol;
7. any sum or difference of the above enclosed in
parentheses (no limit on the depth): the contents
of the location given by the expression within the
parentheses.
Cases 1, 2, and 7 require further explanation. When
the program is loaded initially, all displacements are
evaluated with reference to the first byte of the load
module. This is independent of the linkage-editor-assigned entry point. Thus, ?44 refers to 68 (decimal)
bytes after the first byte of the load module. The
*CSECT command may be used to modify this; i.e.,
if an *CSECT = ?44 is issued, the reference to ?44
refers to a location 136 (decimal) bytes after the first byte of the load module.
In this way, the user may move from one control
section to another without having to compute displacement plus linkage-editor-assigned control section
address. This feature may be used when, for example,
one program dynamically loads another. The user
may plant a breakpoint just before the actual transfer
of control, discover the location of the entry point of
the LOADed program (it should be in a register),
and plant a breakpoint there (perhaps using the
*BREAK=/(#15) command). When the second
breakpoint is reached, the user may issue a *CSECT
= * command to set the context to the LOADed
program. *
Examples of valid loc parameters follow:
1. (((&10))+4) would locate the current TCB
(location x'10' in the machine contains the
address of the communications vector table;
the first word points to a double word in core,
and the second word contains the address of
the current TCB).
2. If register 3 contained the value x'10',
((((#3)))+4) would accomplish the same thing.
If cell CVTLOC in the user's program contained
the value x'10', ((((CVTLOC)))+4) would also
locate the current TCB.
3. SAVE+4 would specify a location 4 bytes
after the symbol SAVE.
4. (#15) would specify the location pointed to
by register 15.
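The loc rules above can be made concrete with a small evaluator. The following Python sketch is an illustration under stated assumptions, not DYDE's implementation: the `Machine` model, the `Reg` marker, and the function names are hypothetical; memory is modeled as a dictionary of words; and the base point is added only to a ?hex or decimal element that leads an expression, so that the +4 in the examples acts as a plain offset.

```python
import re
from dataclasses import dataclass, field

TOKEN = re.compile(
    r"\s*(\?[0-9A-Fa-f]+|&[0-9A-Fa-f]+|#\s*\d+|\d+|[A-Za-z][A-Za-z0-9]*|[()+*-])")


@dataclass
class Machine:                       # hypothetical model of the debugged job
    memory: dict = field(default_factory=dict)       # address -> word
    registers: list = field(default_factory=lambda: [0] * 16)
    symbols: dict = field(default_factory=dict)      # name -> absolute address
    base: int = 0                    # current base point (see *CSECT)
    breakpoint: int = 0              # location of the current breakpoint


@dataclass(frozen=True)
class Reg:                           # "#n": names a register, not an address
    number: int


def tokenize(text):
    text, pos, tokens = text.strip(), 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"bad character at offset {pos}")
        tokens.append(match.group(1).replace(" ", ""))
        pos = match.end()
    return tokens


def evaluate(text, machine):
    tokens, pos = tokenize(text), 0

    def element(leading):
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":               # parentheses mean "contents of"
            inner = expression()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing right parenthesis")
            pos += 1
            if isinstance(inner, Reg):
                return machine.registers[inner.number]
            return machine.memory[inner]
        if tok == "*":
            return machine.breakpoint
        if tok[0] == "#":
            return Reg(int(tok[1:]))
        if tok[0] == "&":
            return int(tok[1:], 16)
        if tok[0] == "?":
            return int(tok[1:], 16) + (machine.base if leading else 0)
        if tok.isdigit():
            return int(tok) + (machine.base if leading else 0)
        if tok in machine.symbols:
            return machine.symbols[tok]
        raise ValueError(f"unexpected token {tok!r}")

    def expression():
        nonlocal pos
        value = element(leading=True)
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            rhs = element(leading=False)
            if isinstance(value, Reg) or isinstance(rhs, Reg):
                raise ValueError("a register must be enclosed in parentheses")
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expression()
    if pos != len(tokens) or isinstance(result, Reg):
        raise ValueError("not a valid location expression")
    return result
```

With memory set up so that location x'10' holds the CVT address, the three TCB expressions in example 1 and example 2 all evaluate to the same address, which is the point of those examples.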
The other general category of commands requests
displays of items or status about the program being
debugged. These do not begin with an asterisk followed
by a keyword, but are merely commands that specify
what is to be displayed. These commands follow:
* In this case, the symbol table is unavailable for the LOADed
program.
1.
(1) 'R'
(2) # followed by register number
requests the debugger to display either the contents of
all registers (case 1) or only the specified register
(case 2). For the 2260 version of DYDE, either one
line is written for a single register display or four lines,
each containing the contents of four registers, are
written. For the 1052 version, case 1 calls for writing
three messages to the operator (without reply) for
registers 0 through 11 and one WTOR (which forms
the basis for the next command) for the remaining
four registers. In either case and for either version, the
registers are displayed as they were at the last breakpoint, including any subsequent manual modification
(or all zero if the program has not yet begun execution).
2.
'COND'
requests a display of the current condition code as a
decimal value between 0 and 8; i.e., if the condition
code is displayed as 8, a BC 8 will branch but a BC 7
will not.
3.
'BREAK'
requests a display of the current breakpoint information. All data regarding currently active breakpoints
are displayed as well as identification of the current
breakpoint.
4.
(1) loc
(2) loc/length
(3) loc//modifier
(4) loc/length/modifier
causes the display of a particular location (see the loc
parameter discussion above), and defines a 'current'
location to be used if the next *MODIFY or *BREAK
does not specify an explicit one. If no length or modifier
information is supplied and the loc specification contains no symbol, a 4-byte hexadecimal value is displayed. If a symbol is present, its length and type
attributes are used. A length, which must be a decimal
less than 32, determines how many digits will form
the final display. The modifier may be C, B, or R or
it may be omitted. If C is coded, the value will be
displayed as characters; B requests the display as a
bit string of ones and zeros; and R requests a display
relative to the current base point. However, if R is
qualified by some value in parentheses (e.g., loc//R(BASE2)),
the displayed value is relative to the value
of BASE2.
One other command to DYDE exists: the asynchronous interrupt to the user's executing program. After
a user has indicated his desire to resume execution of
his program, DYDE does not receive control again
until another breakpoint is encountered. However, if
the user provides an asynchronous interrupt (by
simultaneously depressing the enter and shift keys on
the 2260), DYDE is given control by OS, interrupting
the program being debugged (which is currently
executing). DYDE plants a breakpoint where the
program will resume and then terminates interrupt
processing. When OS resumes the program, this breakpoint is executed, and DYDE is entered. In this manner,
the user, after requesting resumption of his program,
may interrupt it from the console and use all of DYDE's
facilities.
Symbol table
To allow the user to make symbolic references to
his program, DYDE uses the OS TESTRAN facility
to provide a symbol table. The assembler's test option
tells it to provide the symbol table as part of its output
object module. Similarly, the linkage-editor's test
option tells it to write a composite symbol table (a
concatenation of each symbol table present in the
input load or object modules) along with the load
module. Under normal processing this symbol table
is ignored; i.e., when a load module is brought into
core, the symbol table is stripped off. However, before
loading a program in response to an *NAME command,
DYDE checks the disk data set containing the program
for a symbol table. If the load module on the disk
does not contain symbol table entries, it is simply
loaded into core, and the user is informed that symbols
are not available.
However, if symbol table entries are present, they
are read into core; an index is built through a hash
technique; and they are written into the SYSUT1
data set. Each symbol used is present along with its
attributes of type and length and its displacement.
The composite external symbol dictionary (CESD)
of control sections, produced by the linkage-editor,
is used to build a map of the program so that each
symbol may be assigned an address relative to the
load point rather than a displacement from its control-section origin. As each symbol is retrieved, the first
four characters are multiplied by the last four, and
the middle seven bits of the resulting 64-bit product
are used to index a 128-entry hash table. Each table
entry contains an index to a block of data on external
storage and a displacement within that block. All
symbols with the same hash entry are chained together,
each pointing to the block and displacement of the
next symbol. Each block contains enough space for
200 symbols; the most recently referenced block is
kept in core to minimize disk accesses. This method
seems to work efficiently for the on-line user expecting
rapid response.
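The hashing step can be illustrated with a short Python sketch. Several details are assumptions rather than facts from the text: symbols are taken to be at most eight characters, blank padded and encoded in EBCDIC (Python's `cp037` codec); "the middle seven bits" is read as bits 28 through 34 of the product, since seven bits cannot be exactly centered in 64; and the chains are kept in ordinary lists instead of blocks on external storage.

```python
TABLE_SIZE = 128          # seven index bits select one of 128 entries
MIDDLE_SHIFT = 28         # assumption: one of the two near-central positions


def hash_index(symbol: str) -> int:
    """Multiply the first four characters by the last four; keep 7 middle bits."""
    padded = symbol.upper().ljust(8)[:8].encode("cp037")  # EBCDIC, blank padded
    first = int.from_bytes(padded[:4], "big")             # 32-bit value
    last = int.from_bytes(padded[4:], "big")              # 32-bit value
    product = first * last                                # fits in 64 bits
    return (product >> MIDDLE_SHIFT) & (TABLE_SIZE - 1)


def build_table(symbols):
    """Chain every symbol under its hash entry (in core here, on disk in DYDE)."""
    table = [[] for _ in range(TABLE_SIZE)]
    for name, address in symbols.items():
        table[hash_index(name)].append((name, address))
    return table


def lookup(table, name):
    """Follow the chain for the symbol's hash entry."""
    for candidate, address in table[hash_index(name)]:
        if candidate == name.upper():
            return address
    return None


table = build_table({"SAVE": 0x600, "CVTLOC": 0x500, "LOOP": 0x120})
print(hex(lookup(table, "SAVE")))
```

Taking bits from the middle of the product, rather than the low-order end, lets every character of the name influence the index.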
[Figure 3. User program-DYDE interfaces. The diagram shows the SVC work area (the program's resume PSW, DYDE's resume PSW, the program register save area, and the DYDE register save area), the entry from the pingpong SVC, the entry taken when the debugged program is to be restarted, and the branch to the special area.
Format of the special area:
SPECIAL  SVC  PINGPONG
LIFT     DC   3X'0700'
         SVC  PINGPONG
The lifted instruction is moved to LIFT and control is passed to SPECIAL. If the instruction branches, DYDE is done. If it drops through, the SVC is issued again, and control is passed to PGMENTRY.]
User SVC
One major deficiency of the 360 hardware, which
any debugging system must overcome, is the requirement that any transfer of control be accompanied by
the setting (and the destruction) of one of the sixteen
general-purpose registers. Thus, the transfer of control
from the debugged program at breakpoints cannot be
accomplished merely by a branch, but must be performed by an instruction that is independent of register
settings. The most likely candidate is a supervisor
call (SVC) and its associated supervisor call routine,
which can arrange for saving all sixteen registers and
the transfer of control. However, the modification of
the user's program when such an SVC is inserted to
represent a breakpoint requires that destroyed instructions be executed interpretively out of line, if
the breakpoint is to be used in the future. This is
quite expensive since approximately 120 instructions
are in the 360 repertoire, and each one's interpretation
must be coded separately. Using the EXECUTE
instruction to execute the one modified instruction
out of line is another possibility. However, this requires
that all sixteen registers be properly set before the
EXECUTE instruction is issued, and that control be
transferred to the next instruction in the program
without destroying any register contents.
To solve this problem, DYDE employs a type III
user-written supervisor call that allows both DYDE
and the program to be debugged to reside as "co-routines" in the same job. This SVC can be viewed
from the outside as having a pingpong effect on the
control flow. Each time the SVC is issued, after an
initial call, control is passed to the other co-routine;
i.e., the first call passes to the SVC routine an address
within DYDE for register and program-status-word
(PSW) save areas, one for itself and one for the program
being debugged. Thereafter, each issuance saves the
registers and PSW of the issuing co-routine in its
area and restores the registers and PSW of the other
member of the pair. Thus, each breakpoint inserted
in the program being debugged calls for DYDE to
lift and save the current instruction at that location
and to plant the two-byte SVC. When the SVC is
executed, control passes to DYDE at an entry point
specified by it; a note is made of the location where
the SVC was issued. When the user indicates he wants
his program resumed, the lifted instruction is moved
into a special area in DYDE; the program's resume
address is updated to point to this location; and DYDE
issues the SVC. This causes the program's registers
to be restored and control to be passed to the lifted
instruction. If it is a branch, control passes directly
back to the program. However, if it is not a branch,
control will pass to the next instruction in this special
area, which happens to be another pingpong SVC call.
Since it was issued while the program was in execution,
control is passed to DYDE, which notes that the SVC
was issued from within its own address space and
that the lifted instruction dropped through. DYDE
then calculates the address of the instruction following
the lifted instruction, places it in the program's resume program-status word, and reissues the SVC. This
causes control to return to the program, which remains
in control until another breakpoint is reached (see
Figure 3). The only instructions that cannot be executed
when moved are the Branch and Link and the Branch
and Link Register, which are location dependent; they
load a specific register with the current contents of the
location counter and then branch to another location.
DYDE interprets both instructions.
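The pingpong exchange and the drop-through address calculation can be modeled in a few lines of Python. This is a simplified sketch, not the SVC routine itself: the class and function names are hypothetical, and a co-routine's state is reduced to sixteen registers and a resume address. The instruction-length rule (two high-order opcode bits select 2, 4, 4, or 6 bytes) is the standard System/360 encoding.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Context:
    """Registers and resume address (from the PSW) of one co-routine."""
    registers: List[int] = field(default_factory=lambda: [0] * 16)
    resume_address: int = 0


class PingPong:
    """Hypothetical model of the SVC: save the issuer, restore its partner."""

    def __init__(self) -> None:
        self.save_areas: Dict[str, Context] = {
            "DYDE": Context(), "program": Context()}

    def issue(self, issuer: str, current: Context) -> Context:
        partner = "program" if issuer == "DYDE" else "DYDE"
        # Save the issuing co-routine's state in its own area ...
        self.save_areas[issuer] = Context(
            list(current.registers), current.resume_address)
        # ... and hand back a copy of the partner's saved state.
        saved = self.save_areas[partner]
        return Context(list(saved.registers), saved.resume_address)


def instruction_length(opcode: int) -> int:
    """Length in bytes, from the two high-order bits of the opcode."""
    return {0b00: 2, 0b01: 4, 0b10: 4, 0b11: 6}[(opcode >> 6) & 0b11]


def resume_after_drop_through(breakpoint_address: int, lifted_opcode: int) -> int:
    """Address DYDE places in the program's resume PSW when the lifted
    instruction did not branch."""
    return breakpoint_address + instruction_length(lifted_opcode)


svc = PingPong()
svc.save_areas["program"] = Context([7] * 16, 0x5000)
restored = svc.issue("DYDE", Context([1] * 16, 0x100))
print(hex(restored.resume_address))
print(hex(resume_after_drop_through(0x5000, 0x58)))  # 0x58 is L, a 4-byte RX
```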
APPENDIX A
1052 Operation
The 1052 is the normal OS operator's console. DYDE
uses the Write to Operator (WTO) and the Write
to Operator with Reply (WTOR) facilities to communicate
with the user. These macros allow any
program to type a message on the typewriter, or to
type a message and wait for a reply. This facility
provides a very rudimentary form of interaction; not
only is the typewriter slow, but the form of user commands is, of necessity, burdensome. More importantly, the console is used by OS for communications
with the operator. As such, it types out not only
declarative but informative messages and expects
some replies. Thus, a user wishing to use DYDE on
a 1052 must tolerate other console activity; separate
those messages sent to him by DYDE from other
operator messages, usually by noting the message
content; and tag his commands with the number of
the message to which he is replying.
The mechanism for these replies is bothersome. The
user first depresses the REQUEST key, then, when
the system responds with the proceed light, he must
type the character R (short for REPLY), leave a
space, and then type the following: (1) the number of
the outstanding message to which he is replying, (2)
a quote, (3) the message body, (4) a terminal quote,
and (5) the end of block. Assuming the user has received a proceed light, and is replying to message 3,
he must type:
R 03,'THIS IS AN EXAMPLE.'
followed by an end of block.
Using this operation, DYDE initially types out a
READY message and waits for a reply. The user
responds to this message using the reply mechanism, by issuing a legal command, and being careful to note
the number (or tag) associated with the READY
message. DYDE responds to each request with a
message. If the request requires more than one line,
at least one WTO is issued, with no wait for reply;
it is followed by a WTOR and a wait for reply. In
this manner, DYDE can debug a program that resides
as one of many jobs in a multiprogrammed environment, and still keep the interference with normal
system operations at a minimum.
APPENDIX B
Command abbreviations
The following command abbreviations are available:
Abbreviation                       Full Form
*NA                                *NAME
*M                                 *MODIFY
*BR                                *BREAK
*FI                                *FINISH
*DD                                *DDNAMES
*CS                                *CSECT
*RE                                *RESUME
*TR                                *TRACE
*S                                 *SETMODE
null command (i.e., just the       if mode is next, then *NEXT;
enter symbol)                      if mode is not next, then *GO
The Multics PL/1 compiler
by R. A. FREIBURGHOUSE
General Electric Company
Cambridge, Massachusetts
INTRODUCTION
The Multics PL/1 compiler is in many respects a
"second generation" PL/1 compiler. It was built at a
time when the language was considerably more stable
and well defined than it had been when the first
compilers were built. 1,2 It has benefited from the
experience of the first compilers and avoids some of the
difficulties which they encountered. The Multics compiler is the only PL/1 compiler written in PL/1 and is
believed to be the first PL/1 compiler to produce high
speed object code.
The language
The Multics PL/1 language is the language defined
by the IBM "PL/1 Language Specifications" dated
March, 1968. 1 At the time this paper was written most
language features were implemented by the compiler
but the run time library did not include support for
input and output, as well as several lesser features.
Since the multi-tasking primitives provided by the
Multics operating system were not well suited to PL/1
tasking, PL/1 tasking was not implemented. Interprocess communication (Multics tasking) may be
performed through calls to operating system facilities.
The system environment
The compiler and its object programs operate within
the Multics operating system. 3,4,6 The environment
provided by this system includes a virtual two dimensional address space consisting of a large number of
segments. Each segment is a linear address space whose
addresses range from 0 to 64K. The entire virtual store
is supported by a paging mechanism which is invisible
to the program. Each program operating in this
environment consists of two segments: a text segment
containing a pure re-entrant procedure, and a linkage
segment containing out-references (links), definitions
(entry names), and static storage local to the program.
The text segment of each program is sharable by all
other users on the system. Linking to a called program is
normally done dynamically during program execution.
Implementation techniques
The entire compiler and the Multics operating system
were written in EPL, a large subset of PL/1 containing
most of the complex features of the language. The EPL
compiler was built by a team headed by M. D. McIlroy
and R. Morris of Bell Telephone Laboratories. Several
members of the Multics PL/1 project modified the
original EPL compiler to improve its object code
performance, and utilized the knowledge acquired from
this experience in the design of the Multics PL/1
compiler. EPL and Multics PL/1 are sufficiently
compatible to allow the Multics PL/1 compiler to
compile itself and the operating system.
The Multics PL/1 compiler was built and de-bugged
by four experienced system programmers in 18 months.
All program preparation was done on-line using the
CTSS time-sharing system at MIT. Most de-bugging
was done in a batch mode on the GE645, but final
de-bugging was done on-line using Multics.
The extremely short development time of 18 months
was made possible by these powerful tools. The same
design programmed in a macro-assembly language using
card input and batched runs would have required twice
as much time, and the result would have been extremely
unmanageable.
Design objectives
The project's design decisions and choice of techniques
were influenced by the following objectives:
1. A correct implementation of a reasonably
complete PL/1 language.
2. A compiler which produced relatively fast object
code for all language constructs. For similar
language constructs, the object code was expected to equal or exceed that produced by most
Fortran or COBOL compilers.
3. Object program compatibility with EPL object
programs and other Multics languages.
4. An extensive compile time diagnostic facility.
5. A machine independent compiler capable of
bootstrapping itself onto other hardware.
The compiler's size and speed were considered less
important than the above mentioned objectives. Each
phase of the original compiler occupies approximately
32K, but after the compiler has compiled itself that
figure will be about 24K. The original compiler was
about twice as slow as the Multics Fortran compiler.
The bootstrapped version of the PL/1 compiler is
expected to be considerably faster than the original
version but it will probably not equal the speed of
Fortran.
An overview of the compiler
The Multics PL/1 compiler is designed along
traditional lines. It is not an interactive compiler nor
does it perform partial compilations. The compiler
translates PL/1 external procedures into relocatable
binary machine code which may be executed directly or
which may be bound together with other procedures
compiled by any Multics language processor.
The notion of a phase is particularly useful when
discussing the organization of the Multics PL/1
compiler. A phase is a set of procedures which performs
a major logical function of compilation, such as syntactic analysis. A phase is not necessarily a memory load or
a pass over some data base although it may, in some
cases, be either or both of these things.
The dynamic linking and paging facilities of the
Multics environment have the effect of making available in virtual storage only those specific pages of those
particular procedures which are referenced during an
execution of the compiler. A phase of the Multics PL/1
compiler is therefore only a logical grouping of procedures which may call each other. The PL/1 compiler
is organized into five phases: Syntactic Translation,
Declaration Processing, Semantic Translation, Optimization, and Code Generation.
The internal representation
The internal representation of the program being
compiled serves as the interface between phases of the
compiler. The internal representation is organized into
a modified tree structure (the program tree) consisting
of nodes which represent the component parts of the
program, such as blocks, groups, statements, operators,
operands, and declarations. Each node may be logically
connected to any number of other nodes by the use of
pointers.
Each source program block is represented in the
program tree by a block node which has two lists
connected to it: a statement list and a declaration list.
The elements of the declaration list are symbol table
nodes representing declarations of identifiers within that
block. The elements of the statement list are nodes
representing the source statements of that block. Each
statement node contains the root of a computation tree
which represents the operations to be performed by that
statement. This computation tree consists of operator
nodes and operand nodes.
The operators of the internal representation are
n-operand operators whose meaning closely parallels
that of the PL/I source operators. The form of an
operand is changed by certain phases, but operands
generally refer to a declaration of some variable or
constant. Each operand also serves as the root of a
computation tree which describes the computations
necessary to locate the item at run time.
This internal representation is machine independent
in that it does not reflect the instruction set, the
addressing properties, or the register arrangement of
the GE645. The first four phases of the compiler are also
machine independent since they deal only with this
machine independent internal representation. Figure 1
shows the internal representation of a simple program.
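The node kinds described in this section can be sketched as plain data structures. The Python below is only an illustration of the shape of such a program tree; the class names, field names, operator names ("mult", "call", and so on), and the nesting of inner blocks under a `blocks` list are assumptions, not the compiler's actual declarations.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class SymbolNode:
    """Symbol table node: one declaration of an identifier in a block."""
    name: str
    attributes: List[str]


@dataclass
class Operand:
    """Refers to a declaration (or a constant); may carry the computation
    needed to locate the item at run time."""
    declaration: Union[SymbolNode, int]
    locator: Optional["Operator"] = None


@dataclass
class Operator:
    """n-operand operator node of a computation tree."""
    name: str
    operands: List[Union["Operator", Operand]]


@dataclass
class StatementNode:
    kind: str
    root: Optional[Operator]          # root of the computation tree


@dataclass
class BlockNode:
    name: str
    declarations: List[SymbolNode] = field(default_factory=list)
    statements: List[StatementNode] = field(default_factory=list)
    blocks: List["BlockNode"] = field(default_factory=list)


def count_operators(node) -> int:
    """Walk a computation tree and count its operator nodes."""
    if isinstance(node, Operator):
        return 1 + sum(count_operators(child) for child in node.operands)
    return 0


# RETURN(N*F(N-1)) inside an internal procedure F of FACT.
n = SymbolNode("N", ["FIXED", "PARAMETER"])
f = SymbolNode("F", ["ENTRY", "RETURNS(FIXED)"])
call = Operator("call", [Operand(f), Operator("sub", [Operand(n), Operand(1)])])
ret = StatementNode("RETURN", Operator("return", [Operator("mult", [Operand(n), call])]))
fact = BlockNode("FACT", [f], [], [BlockNode("F", [n], [ret])])
print(count_operators(ret.root))
```

Nothing in these structures mentions registers or addressing modes, which is the sense in which such a representation is machine independent.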
Syntactic translation
Syntactic analysis of PL/I programs is slightly more
difficult than syntactic analysis of other languages such
as Fortran. PL/I is a larger language containing more
syntactic constructs, but it does not present any
significantly new problems. The syntactic translator
consists of two modules called the lexical analyzer and
the parse.
Lexical analysis
The lexical analyzer organizes the input text into
groups of tokens which represent a statement. It also
creates the source listing file and builds a token table
which contains the source representation of all tokens in
the source program. A token is an identifier, a constant,
an operator or a delimiter. The lexical analyzer is called
by the parse each time the parse wants a new statement.
[Figure 1. The internal representation of a program. The example is greatly simplified. Only the statements of procedure F are shown in detail. The diagram shows the block nodes for procedure FACT and its internal function F, their statement nodes, and the symbol table nodes for I, PRINT, F, and N.]
[Figure 2. The output of the lexical analyzer. The diagram shows the token table produced for a procedure PRINT(MESSAGE, VALUE) that calls DISPLAY, and the vector of pointers that represents its call statement; the vector is created by the lexical analyzer and serves as input to the parse.]
The lexical analyzer is an approximation to a finite
state machine. Since the lexical analyzer must produce
output as well as recognize tokens, action codes are
attached to the state transitions of the finite state
machine. These action codes result in the concatenation
of individual characters from the output until a
recognized token is formed. Constants are not converted
to their internal format by the lexical analyzer. They are
converted by the semantic translator to a format which
depends on the context in which the constant appears.
The token table produced by the lexical analyzer
contains a single entry for each unique token in the
source program. Searching of the token table is done
utilizing a hash coded scheme which provides quick
access to the table. Each token table entry contains a
pointer which may eventually point to a declaration of
the token. For each statement, the lexical analyzer
builds a vector of pointers to the tokens which were
found in the statement. This vector serves as the input
to the parse. Figure 2 shows a simple example of lexical
analysis.
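The token table and the per-statement vector of pointers described above can be sketched as follows. This is an illustrative sketch only, not the compiler's code: the regular expression, the class name TokenTable and the function lex_statement are assumptions made for the example, and a Python dictionary stands in for the hash coded search.

```python
import re

# Hypothetical token pattern: identifiers, numeric constants, quoted strings,
# the two-character operator ||, then any single non-blank character.
TOKEN_RE = re.compile(r'[A-Za-z_][A-Za-z0-9_]*|\d+|"[^"]*"|\|\||\S')


class TokenTable:
    """One entry per unique token; a dict stands in for the hash coded search."""

    def __init__(self):
        self.entries = {}

    def intern(self, text):
        # Each entry keeps the source text of the token and a slot which may
        # eventually point to a declaration of the token.
        if text not in self.entries:
            self.entries[text] = {"text": text, "declaration": None}
        return self.entries[text]


def lex_statement(source, table):
    """Return the vector of token table pointers for one statement."""
    # Constants are kept as source text; they are not converted here.
    return [table.intern(tok) for tok in TOKEN_RE.findall(source)]


table = TokenTable()
vector = lex_statement("CALL DISPLAY(MESSAGE || VALUE);", table)
print([entry["text"] for entry in vector])
```

A second statement that mentions MESSAGE or VALUE reuses the existing entries, so every reference to an identifier leads to the same token table entry.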
The parse
The parse consists of a set of possibly recursive
procedures each of which corresponds to a syntactic
unit of the language. These procedures are organized to
perform a top down analysis of the source program. As
each component of the program is recognized, it is
transformed into an appropriate internal representation.
The completed internal representation is a program tree
which reflects the relationships between all of the
components of the original source program. Figure 3
shows the results of the parse of a simple program.
Syntactic contexts which yield declarative information are recognized by the parse, and this information is
passed to a module called the context recorder which
constructs a data base containing this information.
Declare statements are parsed into partial symbol table
nodes which represent declarations.
[Figure 3-The output of the parse]

The problem of backup
The top down method of syntactic analysis is used
because of its simplicity and flexibility. The use of a
simple statement recognition algorithm made it possible
to eliminate all backup. The statement recognizer
identifies the type of each statement before the parse of
that statement is attempted. The algorithm used by
this procedure first attempts to recognize assignment
statements using a left to right scan which looks for
token patterns which are roughly analogous to X = or
X ( ) =. If a statement is not recognized as an
assignment, its leading token is matched against a
keyword list to determine the statement type. This
algorithm is very efficient and is able to positively
identify all legal statements without requiring keywords
to be reserved.
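The two-step recognition just described can be sketched as follows. This is a simplified illustration under stated assumptions: the keyword list KEYWORDS, the helper skip_parens and the function statement_type are hypothetical names, statements are assumed to arrive as lists of token strings without label prefixes, and the real patterns are only "roughly analogous" to the two handled here.

```python
# Hypothetical keyword list; the real compiler's list is not given in the text.
KEYWORDS = {"DCL": "declare", "DECLARE": "declare", "DO": "do", "IF": "if",
            "CALL": "call", "RETURN": "return", "END": "end"}


def skip_parens(tokens, i):
    """Given tokens[i] == '(', return the index just past its matching ')'."""
    depth = 0
    while i < len(tokens):
        if tokens[i] == "(":
            depth += 1
        elif tokens[i] == ")":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    return i  # unbalanced parentheses: ran off the end


def statement_type(tokens):
    if not tokens:
        return "unknown"
    # Step 1: scan left to right for the patterns  X =  or  X ( ... ) =
    i = 1
    while i + 1 < len(tokens) and tokens[i] == ".":
        i += 2                      # allow a qualified name such as S.B
    if i < len(tokens) and tokens[i] == "(":
        i = skip_parens(tokens, i)
    if i < len(tokens) and tokens[i] == "=":
        return "assignment"
    # Step 2: otherwise the leading token selects the statement type.
    return KEYWORDS.get(tokens[0], "unknown")


print(statement_type(["DO", "I", "=", "1", "TO", "N", ";"]))
print(statement_type(["DO", "=", "5", ";"]))
```

Because the assignment test runs first, a variable named DO can still be assigned to, which is how the keywords avoid being reserved in this sketch.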
Declaration processing
PL/1 declaration processing is complicated by the
great variety of data attributes and by the context
sensitive manner in which they are derived. Two
modules, the context processor and the declaration
processor, process declarative information gathered by
the parse.
The context processor
The context processor scans the data base containing
contextually derived attributes produced during the
parse by the context recorder. It either augments the
partial symbol table created from declare statements or
creates new declarations having the same format as
those derived from declare statements. This activity
creates contextual and implicit declarations.
The declaration processor
The declaration processor develops sufficient information about the variables of the program so that they
may be allocated storage, initialized and accessed by the
program's operators. It is organized to perform three
major functions: the preparation of accessing code, the
computation of each variable's storage requirements,
and the creation of initialization code.
The declaration processor is relatively machine
independent. All machine dependent characteristics,
such as the number of bits per word and the alignment
requirements of data types, are contained in a table.
All computations or statements produced by the
declaration processor have the same internal representation as source language expressions or statements. Later
phases of the compiler do not distinguish between them.
The use of based references by the declaration processor
The concept of a based reference is useful to the
understanding of PL/1 data accessing and the implementation of a number of language features. A based
declaration of the form DCL A BASED is referenced
by a based reference of the form P -> A, where P is a
pointer to the storage occupied by a value whose
description is given by the declaration of A. Multiple
instances of data having the characteristics of A can be
referenced through the use of unique pointers, i.e.,
Q -> A, R -> A, etc.
The declaration processor implements a number of
language features by transforming them into suitable
based declarations. Automatic data whose size is
variable is transformed into a based declaration.
For example the declaration:
DCL A(N) AUTO;
becomes
DCL A(N) BASED (P) ;
where: P is a compiler produced pointer which is set
upon entry to the declaring block.
Based declarations are also used to implement
parameters. For example:
X: PROC (C); DCL C;
becomes
X: PROC (P); DCL C BASED (P);
where: P is a pointer which points to the argument
corresponding to the parameter C.
Data accessing
The address of an item of PL/1 data consists of three
basic parts: a pointer to some storage location, a word
offset from that location and a bit offset from the word
offset. Either or both offsets may be zero. The term
"word" is understood to refer to the addressable unit
of a computer's storage.
Example 1
DCL A AUTO;
The address of A consists of a pointer to the declaring
block's automatic storage, a word offset within that
automatic storage and a zero bit offset.
Example 2
DCL 1 S BASED(P),
2 A BIT(5),
2 B BIT(N);
When referenced by P -> B, the address of B is a
pointer P, a zero word offset and a bit offset of 5. The
word offset may include the distance from the origin of
the item's storage class, as was the case with the first
example, or it may be only the distance from the
level-one containing structure, as it was in the last
example. The term "level-one" refers to all variables
which are not contained within structures. Subscripted
array element references, A(K, J), or sub-string
references, SUBSTR(X, K, J), may also be expressed
as offsets.
Offset expressions
The declaration processor constructs offset expressions which represent the distance between an element
of a structure and the data origin of its level-one
containing structure. If an offset expression contains
only constant terms, it is evaluated by the declaration
processor and results in a constant addressing offset. If
the offset expression contains variable terms, the
expression results in the generation of accessing
instructions in the object program. The discussion which
follows describes the efficient creation of these offset
expressions.
Given a declaration of the form:
DCL 1 S,
2 A BIT(M),
2 B BIT(5),
2 C FLOAT;
The offset of A is zero, the offset of B is M bits, and the
offset of C is M + 5 bits rounded upward to the
nearest word boundary.
In general, the offset of the nth item in a structure is:
\[ b_n(c_{n-1}(s_{n-1}) + b_{n-1}(c_{n-2}(s_{n-2}) + b_{n-2}(\cdots b_3(c_2(s_2) + b_2(c_1(s_1))) \cdots ))) \]
where: b_k is a rounding function which expresses the
boundary requirement of the kth item, s_k is the size of
the kth item, and c_k is the conversion factor necessary
to convert s_k to some common units such as bits.
The declaration processor suppresses the creation of
unnecessary conversion functions (c_k) and boundary
functions (b_k) by keeping track of the current units and
boundary as it builds the expression. As a result the
offset expressions of the previous example do not contain
conversion functions and boundary functions for A
and B.
During the construction of the offset expression, the
declaration processor separates the constant and variable terms so that the addition of constant terms is done
by the compiler rather than by accessing code in the
object program. The following example demonstrates
the improvement gained by this technique.
DCL 1 S,
2 A BIT(5),
2 B BIT(K),
2 C BIT(6),
2 D BIT(10);
The offset of D is K+11 instead of 5+K+6.
The word offset and the bit offset are developed
separately. Within each offset, the constant and variable parts are separated. These separations result in the
minimization of additions and unit conversions. If the
declaration contains only constant sizes, the resulting
offsets are constant. If the declaration contains expressions, then the offsets are expressions containing the
minimum number of terms and conversion factors.
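The separation of constant and variable terms can be illustrated with a small sketch. It is not the declaration processor's algorithm: the class Offset and the function member_offsets are names invented for the example, all sizes are assumed to be in bits, and boundary rounding and unit conversion are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Offset:
    """An offset kept as one folded constant plus a list of symbolic terms."""
    constant: int = 0
    variables: list = field(default_factory=list)

    def plus(self, size):
        # Constant sizes are added at compile time; names stay symbolic.
        if isinstance(size, int):
            return Offset(self.constant + size, list(self.variables))
        return Offset(self.constant, self.variables + [size])

    def __str__(self):
        parts = list(self.variables)
        if self.constant or not parts:
            parts.append(str(self.constant))
        return "+".join(parts)


def member_offsets(members):
    """members: list of (name, bit size); a size is an int or a variable name."""
    offsets, running = {}, Offset()
    for name, size in members:
        offsets[name] = running
        running = running.plus(size)
    return offsets


offsets = member_offsets([("A", 5), ("B", "K"), ("C", 6), ("D", 10)])
print(offsets["D"])   # K+11
```

Only the single addition K+11 is left for the object program; the additions 5+6 were folded while the expression was built.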
The development of size and offset expressions at
compile time enables the object program to access data
without the use of data descriptors or "dope vectors."6
Most existing PL/1 implementations make extensive
use of such descriptors to access data whose size or
offsets are variable. Unless these descriptors are
implemented by hardware, their use results in rather
inefficient object code. The Multics PL/1 strategy of
developing offset expressions from the declarations
results in accessing code similar to that produced for
subscripted array references. This code is generally
more efficient than code which uses descriptors.
In general, the offset expressions constructed by the
declaration processor remain unchanged until code
generation. Two cases are exceptions to this rule:
subscripted array references, A(K, J), and sub-string
references, SUBSTR(X, K, J). Each subscripted
reference or sub-string reference is a reference to a
unique sub-datum within the declared datum and,
therefore, requires a unique offset. The semantic
translator constructs these unique offsets using the
subscripts from the reference and the offset prepared by
the declaration processor.
Allocation
The declaration processor does not allocate storage
for most classes of data, but it does determine the
amount of storage needed by each variable. Variables
are allocated within some segment of storage by the code
generator. Storage allocation is delayed because, during
semantic translation and optimization, additional declarations of constants and compiler created variables
are made.
Initialization
The declaration processor creates statements in the
prologue of the declaring block which will initialize
automatic data. It generates DO statements, IF
statements and assignment statements to accomplish
the required initialization.
The expansion of the initial attribute for based and
controlled data is identical to that for automatic data
except that the required statements are inserted into
the program at the point of allocation rather than in the
prologue.
Since array bounds and string sizes of static data are
required by the language to be constant, and since all
values of the initial attribute of static data must be
constant, the compiler is able to initialize the static data
at compile time. The initialization is done by the code
generator at the time it allocates the static data.
Semantic translation
The semantic translator transforms the internal
representation so that it reflects the attributes (semantics) of the declared variables without reflecting the
properties of the object machine. It makes a single scan
over the internal representation of the program. A compiler which had no equivalent of the optimizer phase,
and which did not separate the machine dependencies
into a separate phase, could conceivably produce object
code during this scan.
Organization of the semantic translator
The semantic translator consists of a set of recursive
procedures which walk through the program tree. The
actions taken by these procedures are described by the
general terms: operator transformation and operand
processing. Operator transformation includes the creation of an explicit representation of each operator's
result and the generation of conversion operators for
those operands which require conversion. Operand
processing determines the attributes, size and offsets of
each operator's operands.
Operator transformation
The meaning of an operator is determined by the
attributes of its operands. This meaning specifies which
conversions must be performed on the operands, and it
decides the attributes of the operator's result.
An operator's result is represented in the program
tree by a temporary node. Temporary nodes are a
further qualification of the original operator. For
example, an add operator whose result is fixed-point is a
distinct operation from an add operator whose result is
floating-point. There is no storage associated with
temporaries; they are allocated either core or register
storage by the code generator. A temporary's size is a
function of the operator's meaning and the sizes of the
operator's operands. A temporary, representing the
intermediate result of a string operation, requires an
expression to represent its length if any of the string
operator's operands have variable lengths.
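Operator transformation can be pictured with a small sketch. The rule used here (a floating-point operand forces a floating-point result) is a deliberately simplified assumption, not the PL/1 conversion rules, and the names Node, convert and transform_add are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    op: str            # "ref", "convert", or a qualified operator name
    type: str          # "fixed" or "float": the only attributes modelled here
    operands: tuple = ()
    name: str = ""


def convert(node, target):
    """Wrap node in an explicit conversion operator when its type differs."""
    return node if node.type == target else Node("convert", target, (node,))


def transform_add(left, right):
    # Simplified assumption: the result is float if either operand is float.
    result_type = "float" if "float" in (left.type, right.type) else "fixed"
    operands = (convert(left, result_type), convert(right, result_type))
    # The operator is qualified by its result, so a fixed-point add and a
    # floating-point add become distinct operations in the tree.
    return Node("add_" + result_type, result_type, operands)


i = Node("ref", "fixed", name="I")
x = Node("ref", "float", name="X")
tree = transform_add(i, x)
print(tree.op, [operand.op for operand in tree.operands])
```

The result node plays the role of the temporary: it records the attributes of the operator's result, and the conversion of I appears as an explicit operator beneath it.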
Operand processing
Operands consist of sub-expressions, references to
variables, constants, and references to procedure names
or built-in functions. Sub-expression operands are
processed by recursive use of operator transformation
and operand processing. Operand processing converts
constants to a binary format which depends on the
context in which the constant was used. References to
variables or procedure names are associated with their
appropriate declaration by the search function. After
the search function has found the appropriate declaration, the reference may be further processed by the
subscriptor or function processor.
The Search function
During the parse, it is not possible for references to
source program variables to know the declared attributes
of the variable because the PL/l language allows
declarations to follow their use. Therefore, references to
source program variables are parsed into a form which
contains a pointer to a token table entry rather than to
a declaration of the variable. Figure 3 shows the output
of the parse. The search function finds the proper
declaration for each reference to a source program
variable. The effectiveness of the search depends heavily
on the structure of the token table and the symbol table.
After declaration processing, the token table entry
representing an identifier contains a list of all the
declarations of that identifier. See Figure 4.
The search function first tries to find a declaration
belonging to the block in which the reference occurred.
If it fails to find one, it looks for a declaration in the next
containing block. This process is repeated until a
declaration is found. Since the number of declarations
on the list is usually one, the search is quite fast. In its
attempt to find the appropriate declaration, the search
function obeys the language rules regarding structure
qualification. It also collects any subscripts used in the
reference and places them into a subscript list. Depending on the attributes of the referenced item, the
subscript list serves as input to the function processor or
subscriptor.

[Figure 4-The relationship between the token table and the symbol table]

[Figure 5-A simplified diagram showing the effects of subscripting]

The declaration processor creates offset expressions
and size expressions for all variables. These expressions,
known as accessing expressions, are rooted in a reference
node which is attached to a symbol table node. The
reference node contains all information necessary to
access the data at run time. The search function
translates a source reference into a pointer to this
reference node. See Figure 5.
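The block-by-block search can be sketched as follows, using three nested declarations of B like those of Figure 4. The class Block, the function search and the dictionary layout of the token table entry are assumptions made for the example; structure qualification and subscript collection are omitted.

```python
class Block:
    """A block node with a pointer to its containing block."""

    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent


def search(token_entry, block):
    """Find the declaration of a token that is visible from `block`.

    token_entry["declarations"] is the list of (block, attributes) pairs
    hung on the token table entry after declaration processing.
    """
    while block is not None:
        for decl_block, attributes in token_entry["declarations"]:
            if decl_block is block:
                return attributes
        block = block.parent       # try the next containing block
    return None                    # no declaration found


top = Block("TOP")
first = Block("first BEGIN", top)
second = Block("second BEGIN", first)
b_entry = {"text": "B",
           "declarations": [(top, "POINTER"), (first, "FLOAT"), (second, "FIXED")]}

print(search(b_entry, second))
print(search(b_entry, first))
```

Since most identifiers carry a single declaration, the inner loop usually examines one list element per block.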
Subscripting
Since each subscripted reference is unique, its offset
expression is unique. To reflect this in the internal
representation, the subscriptor creates a unique reference node for each subscripted reference. See Figure 6.
The following discussion shows the relationship between
the declared array bounds, the element size, the array
offset and subscripts.
Let us consider the case of an array declared:
a(l_1:u_1, l_2:u_2, ..., l_n:u_n)
Its element size is s and its offset is b.
The multipliers for the array are defined as:
\[ m_n = s \]
\[ m_{n-1} = (u_n - l_n + 1)\, s \]
\[ m_{n-2} = (u_{n-1} - l_{n-1} + 1)\, m_{n-1} \]
\[ \vdots \]
\[ m_1 = (u_2 - l_2 + 1)\, m_2 \]
The offset of a reference a(i_1, i_2, ..., i_n) is computed as:
\[ v + \sum_{j=1}^{n} i_j m_j \]
where: v is the virtual origin. The virtual origin is the
offset obtained by setting the subscripts equal to zero.
It serves as a convenient base from which to compute
the offset of any array element.
During the construction of all expressions, the
constant terms are separated from the variable terms
and all constant operations are performed by the
compiler. Since the virtual origin and the multipliers are
common to all references, they are constructed by the
declaration processor and are repeatedly used by the
subscriptor.
Arrays of PL/1 structures which contain arrays may
result in a set of multipliers whose units differ. The
declaration:
DCL 1 S(10),
2 A PTR,
2 B(10) BIT(2);
yields two multipliers of different units. The first
multiplier is the size of an element of S in words, while
the second multiplier is the size of an element of B
in bits.
Array parameters which may correspond to an array
cross section argument must receive their multipliers
from an argument descriptor. Since the arrangement
of the cross section elements in storage is not known to
the called program, it cannot construct its own multipliers and must use multipliers prepared by the calling
program. Note that the current definition of PL/1
allows any array parameter to receive a cross section
argument.
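The multiplier and virtual origin computation for a simple array can be sketched as follows. The function names are invented for the example, and the closed form v = b - (l_1 m_1 + ... + l_n m_n) is an assumption derived from the statement that the first element lies at offset b and that v is the offset obtained with all subscripts zero; a single unit of measure is assumed.

```python
def multipliers(bounds, s):
    """bounds: list of (l_j, u_j) for j = 1..n; s: element size.

    Returns [m_1, ..., m_n] with m_n = s and m_{j-1} = (u_j - l_j + 1) * m_j.
    """
    m = [0] * len(bounds)
    m[-1] = s
    for j in range(len(bounds) - 1, 0, -1):
        low, high = bounds[j]
        m[j - 1] = (high - low + 1) * m[j]
    return m


def virtual_origin(bounds, s, b):
    # Assumption: element (l_1, ..., l_n) lies at offset b, so setting every
    # subscript to zero gives v = b - sum(l_j * m_j).
    m = multipliers(bounds, s)
    return b - sum(low * mj for (low, _), mj in zip(bounds, m))


def element_offset(subscripts, bounds, s, b):
    """Offset of a(i_1, ..., i_n) computed as v + sum(i_j * m_j)."""
    m = multipliers(bounds, s)
    v = virtual_origin(bounds, s, b)
    return v + sum(i * mj for i, mj in zip(subscripts, m))


bounds = [(1, 3), (0, 4)]          # a(1:3, 0:4)
print(multipliers(bounds, 2))      # [10, 2]
print(element_offset((3, 4), bounds, 2, 10))
```

In a compiler the multipliers and v would be computed once per declaration; the functions recompute them here only to keep the sketch short.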
[Figure 6-The internal representation of a statement before and after the execution of the search function. The broken lines show the statement's operands before the search.]

The function processor
An operand which is a reference to a procedure is
expanded by the function processor into a call operator
and possible conversion operators. Built-in function
references result in new operators or are translated into
expressions consisting of operators and operands.
Generic procedure references
A generic entry name represents a family of procedures whose members require different types of
arguments.
DCL ALPHA GENERIC
(BETA ENTRY(FIXED),
GAMMA ENTRY(FLOAT));
A reference to ALPHA(X) will result in a call to
BETA or GAMMA depending on the attributes of X.
The declaration processor chains together all members
of a generic family and the function processor selects the
appropriate member of the family by matching the
arguments used in the reference with the declared
argument requirements of each member. When the
appropriate member is found, the original reference is
replaced by a reference to the selected member.
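The selection of a generic family member can be sketched as a walk over the chained members. The function select_generic and the list layout of the family are assumptions for the example, and the match is a plain equality test on type names rather than the full PL/1 attribute matching rules.

```python
def select_generic(family, argument_types):
    """family: list of (entry name, tuple of required argument types)."""
    for entry_name, required in family:
        if tuple(argument_types) == tuple(required):
            return entry_name      # the reference is replaced by this member
    return None                    # no member matches


# The family declared by DCL ALPHA GENERIC (BETA ENTRY(FIXED), GAMMA ENTRY(FLOAT));
ALPHA = [("BETA", ("FIXED",)), ("GAMMA", ("FLOAT",))]

print(select_generic(ALPHA, ["FIXED"]))
print(select_generic(ALPHA, ["FLOAT"]))
```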
Argument processing
The function processor matches arguments to user-declared procedures against the argument types required
for the procedure. It inserts conversion operators into
the program tree where appropriate, and it issues
diagnostics when it detects illegal cases.
The return value of a function is processed as if it
were the n + 1th argument to the procedure, eliminating
the distinction between subroutines and functions.
The function processor determines which arguments
may possibly correspond to a parameter whose size or
array bounds are not specified in the called procedure.
In this case, the argument list is augmented to include
the missing size information. A more detailed description
of this issue is given later in the discussion of object
code strategies.
The built-in function processor
The built-in function processor is basically a table
driven device. The driving table describes the number
and kind of arguments required by each function and is
used to force the necessary conversions and diagnostics
for each argument. Most functions require processing
which is unique to that function, but the table driven
device minimizes the amount of this processing.
The SUBSTR built-in function is of particular
importance since it is a basic PL/1 string operator. It is
a three argument function which allows a reference to
be made to a portion of a string variable, i.e.,
SUBSTR (X, I, J) is a reference to the ith through
(i + j - 1)th character (or bit) in the string X.
This function is similar to an array element reference
in the sense that they both determine the offsets of the
reference. The processing of the SUBSTR function
involves adjusting the offset and length expressions
contained in the reference node of X. As is the case in
all compiler operations on the offset expressions, the
constant and variable terms are separated to minimize
the object code necessary to access the data.
The optimizer
The compiler is designed to produce relatively fast
object code without the aid of an optimizing phase.
Normal execution of the compiler will by-pass the
optimizer, but if extensively optimized object code is
desired, the user may set a compiler command option
which will execute the optimizer. The optimizer consists
of a set of procedures which perform two major optimizations: common sub-expression removal and removal
of computations from loops. The data bases necessary
for these optimizations are constructed by the parse
and the semantic translator. These data bases consist of
a cross-reference structure of statement labels and a
tree structure representing the DO groups of each
block. Both optimizations are done on a block basis
using these two data bases.
Although the optimizer phase was not implemented
at the time this paper was written, all data bases
required by the optimizer are constructed by previous
phases of the compiler and the abnormality of all
variables is properly determined.
Optimization of PL/1 programs
The on-condition mechanism of the PL/1 language
makes the optimization of PL/1 programs considerably
more difficult than the optimization of Fortran programs. Assuming that an optimized version of a
program should yield results identical to those produced
by the un-optimized version, then if any on-conditions
are enabled in a given region of the program, the
compiler cannot remove or reorder the computations
performed in that region. (Consider the case of a divide
by zero on-unit which counts the number of times that
the condition occurs.)
Since some on-conditions are enabled by default,
most PL/1 programs cannot be optimized. Because of
the difficulty of determining the abnormality of a
program's variables, the optimization of those programs
which may be optimized requires a rather intelligent
compiler. A variable is abnormal in some block if its
value can be altered without an explicit indication of
that fact present in that block. An optimizing PL/I
compiler must consider all based variables, all arguments
to the ADDR function, all defined variables, and all
base items of defined variables to be abnormal. If the
compiler expects values of variables to be retained
throughout the execution of a call, it must also consider
all parameters, all external variables, and all arguments
of irreducible functions to be abnormal.
Because of the difficulty of optimizing programs
written in the current PL/1 language, compilers should
probably not attempt to perform general optimizations
but should concentrate on special case optimizations
which are unique to each implementation. Future
revisions to the language definition may help solve the
optimization problem.
The code generator
The code generator is the machine dependent portion
of the compiler. It performs two major functions: it
allocates data into Multics segments and it generates
645 machine instructions from the internal representation.
Storage allocation
A module of the code generator called the storage
allocator scans the symbol table allocating stack
storage for constant size automatic data, and linkage
segment storage for internal static data. For each
external name the storage allocator creates a link (an
out-reference) or a definition (an entry point) in the
linkage segment. All internal static data is initialized as
its storage is allocated.
Due to the dynamic linking and loading characteristics of the Multics environment, the allocation and
initialization of external static storage is rather unusual.
The compiler creates a special type of link which causes
the linker module of the operating system to create and
initialize the external data upon first reference. Therefore, if two programs contain references to the same
item of external data, the first one to reference that data
will allocate and initialize it.
Code generation
The code generator scans the internal representation
transforming it into 645 machine instructions which it
outputs into the text segment. During this scan the
code generator allocates storage for temporaries, and
maintains a history of the contents of index registers to
prevent excessive loading and storing of index values.
Code generation consists of three distinct activities:
address computation, operator selection and macro
expansion. Address computation is the process of
transforming the offset expressions of a reference node
into a machine address or an instruction sequence which
leads to a machine address. Operator selection is the
translation of operators into n-operand macros which
reflect the properties of the 645 machine.
A one-to-one relationship often exists between the
macros and 645 instructions but many operations (load
long string, etc.) have no machine counterpart. All
macros are expanded in actual 645 code by the macro
expander which uses a code pattern table (macro
skeletons) to select the specific instruction sequences
for each macro.
Object code strategies
The object code design
The design of the object code is a compromise between
the speed obtainable by straight in-line code and the
necessity to minimize the number of page faults caused
by large object programs.
The length of the object program is minimized by the
extensive use of out-of-line code sequences. These
out-of-line code sequences represent invariant code
which is common to all Multics PL/1 object programs.
Although the compiled code makes heavy use of out-of-line code sequences, the compiled code is not in any
respect interpretive. The object code produced for each
operator is very highly tailored to the specific attributes
of that operator.
All out-of-line sequences are contained in a single
"operator" segment which is shared by all users. The
in-line code reaches an out-of-line sequence through
transfer instructions, rather than through the standard
subroutine mechanism. We believe that the time
overhead associated with the transfers is more than
redeemed by the reduction in the number of page faults
caused by shorter object programs. System performance
is improved by insuring that the pages of the operator
segment are always retained in storage.
The stack
Multics PL/1 object programs utilize a stack segment
for the allocation of all automatic data, temporaries,
and data associated with on-conditions. Each task
(Multics process) has its own stack which is extended
(pushed) upon entry to a block and is reverted (popped)
upon return from a block. Prior to the execution of each
statement it is extended to create sufficient space for
any variable length string temporaries used in that
statement. Constant size temporaries are allocated at
compile time and do not cause the stack to be extended
for each statement.
Prologue and epilogue
The term prologue describes the computations which
are performed after block entry and prior to the
execution of the first source statement. These actions
include the establishment of the condition prefix, the
computation of the size of variable size automatic data,
extension of the stack to allocate automatic data, and
the initialization of automatic data. Epilogues are not
needed because all actions which must be undone upon
exit from the block are accomplished by popping the
stack. The stack is popped for each return or non-local
go to statement.
Accessing of data
Multics PL/1 object code addresses all data, including members of variable sized structures and arrays,
directly through the use of in-line code. If the address
of the data is constant, it is computed at compile time.
If it is a mixture of constant and variable terms, the
constant terms are combined at compile time. Descriptors are never used to address or allocate data.
String operations
All string operations are done by in-line code or by
"transfer" type subroutinized code. No descriptors or
calls are produced for string operations. The SUBSTR
built-in function is implemented as a part of the normal
addressing code and is therefore as efficient as a
subscripted array reference.
String temporaries
A string temporary or dummy is designed in such a
way that it appears to be both a varying and non-varying string. This means that the programmer does not
need to be concerned with whether a string expression is
varying or non-varying when he uses such an expression
as an argument.
Varying strings
The Multics PL/1 implementation of varying strings
uses a data format which consists of an integer followed
by a non-varying string whose length is the declared
maximum of the varying string. The integer is used to
hold the current size of the string in bits or characters.
Using this data format, operations on varying strings
are just as efficient as operations on non-varying strings.
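The varying string format can be modelled in a few lines. The class VaryingString and its methods are invented for the example, and truncating an over-long value to the declared maximum is an assumption about assignment behavior that the text does not state.

```python
class VaryingString:
    """A current-length integer followed by storage of the declared maximum."""

    def __init__(self, maximum):
        self.length = 0                    # the leading integer
        self.storage = [" "] * maximum     # non-varying part, always full size

    def assign(self, text):
        text = text[:len(self.storage)]    # assumed: truncate to the maximum
        self.storage[:len(text)] = text    # overwrite in place, size unchanged
        self.length = len(text)

    def value(self):
        return "".join(self.storage[:self.length])


v = VaryingString(5)
v.assign("abc")
print(v.value(), v.length, len(v.storage))
```

Because the storage never changes size, the characters are addressed exactly as those of a non-varying string; only the length integer is consulted in addition.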
On-conditions
The design of the condition machinery minimizes the
overhead associated with enabling and reverting on-units and transfers most of the cost to the signal
statement. All data associated with on-conditions,
including the condition prefix, is allocated in the stack.
The normal popping of the stack reverts all enabled
on-units and restores the proper condition prefix. Stack
storage associated with each block is threaded backward
to the previous block. The signal statement uses this
thread to search back through the stack looking for the
first enabled unit for the condition being signalled.
Figure 7 shows the organization of enabled on-units in
the stack.
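The search performed by the signal statement can be sketched with stack frames threaded backward. The class Frame, its method on and the function signal are names invented for the example, and Python callables stand in for the procedure variables of the on-unit control data.

```python
class Frame:
    """Stack storage for one block activation, threaded to the previous one."""

    def __init__(self, name, previous=None):
        self.name, self.previous = name, previous
        self.on_units = {}                  # condition name -> procedure value

    def on(self, condition, handler):
        self.on_units[condition] = handler  # enabling is one store in the frame


def signal(condition, frame):
    """Follow the back thread and invoke the first enabled on-unit found."""
    while frame is not None:
        handler = frame.on_units.get(condition)
        if handler is not None:
            return handler()
        frame = frame.previous
    return None                             # no enabled on-unit was found


# A enables an on-unit for X and calls B; B enables on-units for X and Y
# and calls C; C enables nothing.
a = Frame("A")
a.on("X", lambda: "A handles X")
b = Frame("B", a)
b.on("X", lambda: "B handles X")
b.on("Y", lambda: "B handles Y")
c = Frame("C", b)

print(signal("X", c))
```

Dropping the frames of B and C, as popping the stack would, leaves only A's on-unit reachable, so no separate revert step is needed in this model.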
[Figure 7: stack diagram. Procedure A enabled an on-unit for condition X and called procedure B; its stack storage holds on-unit control data for X. Procedure B enabled a new on-unit for condition X and an on-unit for condition Y, then called procedure C; its stack storage holds on-unit control data for X and Y. Procedure C did not enable any on-units; the last entry is the stack storage for C.]
Figure 7-Stack storage and the signal mechanism
A signal for condition X causes the signal mechanism to search
back through the stack until it finds the first enabled on-unit
for condition X.
An on-unit is compiled as an internal procedure. The execution
of an ON statement creates a block of on-unit control data. This
control data consists of the name of the condition for which the
unit was enabled and a procedure variable. The signal mechanism
uses the procedure variable to invoke the on-unit. All data
associated with the enabled on-unit is stored in the stack storage
of the procedure which enabled it. Normal popping of the stack
reverts the on-units enabled during the execution of the
procedure.
Argument passing
The PL/1 language permits parameters to be
declared with unknown array bounds or string lengths.
In these cases, the missing size information is assumed
to be supplied by the argument which corresponds to the
parameter. This missing size information is not explicitly
supplied by the programmer as is the case in Fortran,
rather it must be supplied by the compiler as indicated
in the following example:
SUB: PROC(A);
     DCL A CHAR(*);

MAIN: PROC;
     DCL SUB ENTRY;
     DCL B CHAR(10);
     CALL SUB(B);
Since parameter A assumes the length of the argument B, the compiler must include the length of B in the
argument list of the call to SUB.
The declaration of an entry name may or may not
include a description of the arguments required by that
entry. If such a description is not supplied, then the
calling program must assume that argument descriptors
are needed, and must include them in all calls to the
entry. If a complete argument description is contained
in the calling program, the compiler can determine if
descriptors are needed for calls to the entry.
In the previous example the entry SUB was not fully
declared and the compiler was forced to assume that an
argument descriptor for B was required. If the entry
had been declared SUB ENTRY (CHAR(*)) the
compiler could have known that the descriptor of B was
actually required by the procedure SUB. Since descriptors are often created by the calling procedure but not
used by the called procedure, it is desirable to separate
them from the argument information which is always
used by the called procedure.
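The decision the calling program has to make can be sketched as below. The function names and the string form of the parameter descriptions are illustrative assumptions; the sketch only captures the rule that an undeclared entry forces descriptors, while a fully declared entry needs them only for parameters with unknown (`*`) extents, and that descriptors are kept apart from the argument information.

```python
def needs_descriptors(entry_parameters):
    """entry_parameters is None when the ENTRY declaration
    gives no description of the arguments."""
    if entry_parameters is None:
        return True                       # must assume descriptors are needed
    # With a complete description, only '*' extents require descriptors.
    return any("*" in p for p in entry_parameters)


def build_argument_list(arguments, entry_parameters=None):
    """arguments: list of (address, descriptor) pairs; both are stand-ins."""
    pointers = [address for address, _ in arguments]
    if needs_descriptors(entry_parameters):
        descriptors = [descriptor for _, descriptor in arguments]
    else:
        descriptors = []                  # optional part omitted entirely
    return pointers, descriptors


call_args = [("address of B", "CHAR(10)")]
```

For example, `build_argument_list(call_args)` models the call to SUB with `DCL SUB ENTRY;`, while passing `["CHAR(*)"]` models the fully declared entry.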
Communication between procedures written in PL/1
and other languages is facilitated if the other languages
do not need to concern themselves with PL/1 argument
descriptors. The Multics PL/1 implementation of the
argument list is shown in Figure 8. Note that the
argument pointers point directly to the data (facilitating
communication between languages) and that the
descriptors are optional. Also note that PL/1 pointers
must be capable of bit addressing in order to implement
unaligned strings. Since descriptors contain no addressing information, they are quite often constant and can
be prepared at compile time.
[Figure 8: the argument list prepared for the call to X in the procedure below holds pointers to the actual values of A, B and C.]
TAG: PROC;
     DCL A(10) BIT(N), B CHAR(1), C AREA(1024);
     CALL X(A,B,C);
     END;
SUMMARY
Our experiences both as users and implementors of
PL/1 have led us to form a number of opinions and
insights which may be of general interest.
1. It is feasible, but difficult, to produce efficient
object code for the PL/1 language as it is currently defined. Unless a considerable amount of
work is invested in a PL/1 compiler, the object
code it generates will generally be much worse
than that produced by most Fortran or COBOL
compilers.
2. The difficulty of building a compiler for the
current language has been seriously underestimated by most implementors. Unless the
language is markedly improved and simplified
this problem will continue to restrict the availability and acceptance of the language and will
lead to the implementation of incompatible
dialects and subsets.7
3. Simplification of the existing language will make
it more suitable to users and implementors. We
believe that the language can be simplified and
still retain its "universal" character and
capabilities.
4. The experience of writing the compiler in PL/1
convinced us that a subset of the language is well
suited to system programming. This conviction
is supported by Professor Corbato in his report on
the use of PL/1 as an implementation language
for the Multics system.8 Many PL/1 concepts
and constructs are valuable, but PL/1 structures
and list processing seem to be the principal
improvement over alternative languages.9
Figure 8-An argument list showing the relationship
between arguments and their descriptors (descriptor of A, descriptor of B, descriptor of C). The
broken lines indicate that descriptors
are optional.
ACKNOWLEDGMENTS
The author wishes to express recognition to members
of the General Electric Multics PL/1 Project for their
contributions to the design and implementation of the
compiler. J. D. Mills was responsible for the design and
implementation of the syntactic analyzer and the
Multics system interface, B. L. Wolman designed and
built the code generator and operator segment, and
G. D. Chang implemented the semantic translator.
Valuable advice and ideas were provided by A. H.
Kvilekval. The earlier work of M. D. McIlroy and
R. Morris of Bell Telephone Laboratories and numerous
persons at MIT's Project MAC provided a useful guide
and foundation for our efforts.
REFERENCES
1 PL/1 language specifications
Form Y33-6003-0 IBM Corp March 1968
2 The formal definition of PL/I as specified by technical
reports TR25.081, TR25.082, TR25.083, TR25.084,
TR25.085, TR25.086 and TR25.087, IBM Corp
Vienna Austria June 1968
3 F J CORBATO V A VYSSOTSKY
Introduction and overview of the multics system
Proc FJCC 1965
4 V A VYSSOTSKY F J CORBATO R M GRAHAM
Structure of the multics supervisor
Proc FJCC 1965
5 R C DALEY J B DENNIS
Virtual memory, processes, and sharing in multics
CACM Vol 11 No 5 May 1968
6 PL/1 (F) programmer's guide
Form C28-6594-3 IBM Corp Oct 1967
7 R F ROSIN
PL/1 implementation survey
ACM SIGPLAN Notices Feb 1969
8 F J CORBATO
PL/1 as a tool for system programming
Datamation May 1969
9 H W LAWSON JR
PL/1 list processing
CACM Vol 10 No 6 June 1967
A design for a fast computer for
scientific calculations
by P. M. MELLIAR-SMITH
The General Electric and English
Electric Companies Limited
Borehamwood, Hertfordshire, U. K.
Recently developed techniques, such as the associative
fast store and Tomasulo's algorithm, will enable
typical large scale computers to achieve 15 to 20
million instructions per second. The hardware of such
machines has a very much greater potential power,
but it is inefficiently used, being limited to decoding a
single instruction per logic cycle. This paper proposes
a technique whereby the programmer is provided with
complex instructions capable of controlling the operation of the whole machine during one logic cycle.
The use of such instructions for the inner loops of
programs yields substantial performance improvements without significantly increased costs.
Recent efforts to develop very fast computers have
generated two elegant techniques for increasing the
speed of computers.
The first is the associative fast store, first used for
the Titan computer at the University of Cambridge,
England (a 32 word 'slave' store), and more recently
for the IBM 360/85 (a 16 K byte 'buffer' store or
'cache'). The associative fast store seeks to overcome
the major problem in the design of very fast computers,
the disparity between the access time of suitable main
stores and the potential operation time of the arithmetic units, by providing a small quantity of very fast
integrated circuit store. This can be made as fast as
the arithmetic units but it cannot contain more than
a fraction of the information used by typical programs.
However it has been found experimentally that, in
any short period of time, programs do not access the
whole of their storage and that a fast store, which
retains a few hundred of the words most recently
used by the program, is able to provide without delay
almost all the information needed by the processor.
A possible method of implementation is shown in
Figure 1. The fast store holds a number of words of
code and data, together with their addresses. When
the processor requires a particular item, the address
is first sent to the fast store where it is compared
simultaneously with the addresses of all the words
in the fast store. Should the required item be present
in the fast store, then its address will match that sent
by the processor and the data can be returned to the
processor with minimal delay. If none of the addresses
match, then the required item must be fetched from
the main store and the processor may be held up.
But when the data word has been fetched, in addition
to being sent to the processor, it can also be inserted
into the fast store, displacing some other item, so
that should it be needed again it will be immediately
available.
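The lookup and insertion behaviour just described can be sketched as a small simulation. The class name is hypothetical, and the choice of displacing the least recently used word is an assumption (the text only says that "some other item" is displaced); the parallel address comparison of the hardware is modelled by a dictionary lookup.

```python
from collections import OrderedDict


class FastStore:
    """Sketch of an associative fast store in front of a main store."""

    def __init__(self, capacity, main_store):
        self.capacity = capacity
        self.main_store = main_store
        self.words = OrderedDict()     # address -> data, least recent first
        self.accesses = 0
        self.failures = 0              # references that went to main store

    def read(self, address):
        self.accesses += 1
        if address in self.words:              # address matched: no delay
            self.words.move_to_end(address)
            return self.words[address]
        self.failures += 1                      # fetch from main store
        data = self.main_store[address]
        if len(self.words) >= self.capacity:
            self.words.popitem(last=False)      # displace least recently used
        self.words[address] = data              # keep it for next time
        return data

    def failure_rate(self):
        return self.failures / self.accesses


main = {address: address * 10 for address in range(8)}
store = FastStore(capacity=2, main_store=main)
results = [store.read(a) for a in (0, 1, 0, 2, 1)]
```

In this toy run the third reference (address 0 again) is satisfied from the fast store, while the others have to go to the main store.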
The success of this technique is entirely dependent
on the proportion of data items needed by the processor which have to be fetched from the main store,
and this proportion, the failure rate, is the primary
criterion of the effectiveness of the fast store. The
speed of the computer is determined by:
$$\text{Effective Access Time} = \text{Fast Store Access Time} + \left(\text{Main Store Access Delay} \times \text{Failure Rate}\right)$$
[Figure 1: block diagram with blocks labelled CENTRAL PROCESSOR, FAST STORE and MAIN STORE, linked by address and data paths.]
Figure 1-The use of an associative fast store to reduce
the access time of the main store
If the Main Store Access Delay, which must include
organisational overheads as well as the Main Store
Access Time but which may be partially overlapped,
is equivalent to ten fast store accesses then a failure
rate of 3 percent must be attained to achieve 75 percent of the potential processor speed.
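The arithmetic behind this figure can be checked directly. The sketch below takes the effective access time to be the fast store access time plus the main store access delay multiplied by the failure rate, and measures time in units of one fast store access; the function and variable names are illustrative.

```python
def effective_access_time(fast_access, main_delay, failure_rate):
    """Fast store access time plus the delay incurred on failures."""
    return fast_access + main_delay * failure_rate


# Main store delay equivalent to ten fast store accesses, 3 percent failures.
t = effective_access_time(fast_access=1.0, main_delay=10.0, failure_rate=0.03)
fraction_of_potential_speed = 1.0 / t
```

With these values the effective access time is 1.3 fast store accesses, so the processor runs at a little over three quarters of the speed it would reach if every reference were satisfied by the fast store.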
Experimental simulations with actual programs have
shown that the three characteristics of the fast store
which most affect the failure rates are its organisation,
its size, and the size of the unit of information transferred from the main store to the fast store. The
organisation of the fast store need not concern us
here, except to remark that the type of organisation
described above is to be preferred to alternative
methods which avoid the associative access to large
numbers of addresses.
The experimental simulations show that the primary
method of obtaining an adequately low failure rate
is to make the fast store large enough. If the fast
store is smaller than several hundred words then
programs refer to many items not held in the fast
store and the full performance of the machine is not
obtained. However the fast store must not be made
too large, even without cost considerations. As the
size of the fast store is increased so its access time is
also inevitably increased, and eventually this increase
in the physical access time of the fast store overwhelms
any further reduction in the number of references
to the main store. Figure 2 shows the result of simulations to obtain the effective access time of a particular integrated circuit fast store operating with a thin
film main store, for seven sample programs. It can
be seen that for many of the programs the optimum
size of the fast store is about 1000 words.
[Figure 2: graph of fast store access time against fast store size in words, with one effective access time curve for each sample program and the physical access time shown as a broken line.]
Figure 2-The physical access time of a fast store (broken line)
and the effective access times (continuous lines) for sample programs. Access time is a percentage of main store access time,
storage size is in words, and line size is 4 words
The description of a fast store given above assumed
that the unit of information held in the fast store
was a single word, and that information is transferred
from main store in single words. The experimental
simulations have shown that a more efficient unit
would be a block of a small number of consecutive
words accessed simultaneously from the main store.
Such a block is very similar to, though much smaller
than, a page in a paging system and will be called a
line. Figure 3 shows how the effective access time of
an integrated circuit fast store varied with line size
during the simulation of seven sample programs. It
can be seen that, when the line size is small, increasing
it not only improves the physical access time of the
fast store but also reduces the failure rate, resulting
in an impressive performance improvement. But for
larger line sizes any further improvement in the access
time from the increased line size is offset by increasing
failure rates and overall performance deteriorates. It
appears that a line size of between four words and
sixteen words is suitable, providing that the line size
is not allowed to exceed the total width of the main
store.
[Figure 3: graph of fast store access time against line size for the fast store in words, with one effective access time curve for each sample program and the physical access time shown as a broken line.]
Figure 3-The physical access time of a fast store (broken line)
and the effective access times (continuous lines) for several sample
programs. Access time is a percentage of main store access time,
line size is in words, and store size is 1024 words
The associative fast store technique provides very
fast effective access times and overcomes this problem
in the design of very fast computers. Thus the onus
is placed back onto the processor to make full use of
the speed of the fast store, both by the provision of
fast arithmetic units and by the execution of lengthy
arithmetic operations in parallel. A beautiful technique
for overlapping arithmetic operations has been developed by R. M. Tomasulo for the IBM 360/91
and is known as Tomasulo's algorithm.
Consider, for instance, the typical tight loop containing floating point load, multiply, add, and store
instructions operating on the same register. As shown
in Figure 4a the conventional machine places the
result of each operation in the register before extracting
it again to perform the next operation. The register
has no substantial function in this loop which would
be more efficiently performed as shown in Figure 4b.
Here the partial result is passed directly from one
arithmetic unit to the next without first being placed
in the register, a technique known as forwarding. Not
only is this faster, but it also frees the register from
interlocks, which would prevent its concurrent use
for subsequent calculations. Thus for the example
loop, it might be possible to launch the second iteration
of the loop before the first iteration has been completed.
The basic structure of a floating point arithmetic
unit using Tomasulo's algorithm is shown, slightly
simplified, in Figure 5. Separate arithmetic units are
provided for addition and multiplication, and there
are also units to hold the floating point registers and
to buffer operands to be written to store. The arithmetic units are pipelines so that several independent
operations, in different stages of completion, can be
processed simultaneously within each arithmetic unit.
Thus, for instance, the addition unit can start a further
addition operation each logic cycle even though the
individual addition operation takes three to four cycles
to complete.
In front of each arithmetic unit there is a block of
registers in pairs, the reservation stations. These serve
to gather the operands required for the arithmetic
operations as and when they become available. As
soon as a reservation station has collected both the
required operands, the relevant arithmetic operation
can be started at this, the earliest possible, moment.
Operands are made available to the reservation stations
as early as possible by the cross bar switch which
connects the outputs of all the arithmetic units, the
registers and the store buffers to the inputs of all the
reservation stations, so that any operand can be routed
directly to any reservation station where it is required.
Tomasulo's algorithm applies only to operations
between registers. Consequently arithmetic operations
that derive one of their operands from store are performed in two stages, the first of which loads the
operand from store into one of the store buffer registers
while the second is a register to register operation
between that buffer register and the specified floating
point register.
Figures 4a and 4b-The use of forwarding to speed
arithmetic calculations (A: without forwarding; B: with forwarding)
Figure 5-Typical floating point unit for use with
Tomasulo's algorithm
Under Tomasulo's algorithm, instructions are still
decoded sequentially but their execution proceeds as
and when the required operands become available.
Arithmetic operations between registers are performed in four stages:
select a suitable vacant reservation station,
obtain both operands and place them in the
reservation station,
execute the arithmetic operation,
transmit the result directly to all registers and
reservation stations waiting for it.
The identity of the destination register must not be
held with the operation as it is being processed, for
arithmetic operations can be performed out of sequence
and the result of some subsequent operation may
already have been placed in that register. The essence
of Tomasulo's algorithm is that a record is kept, for
each register, of the origin of the result for which it is
waiting, the result most recently assigned to it. Previous
results, directed by the program to pass through the
register, will be forwarded directly to the relevant
arithmetic units and can be ignored by the register.
The same technique is used for reservation stations,
recording for each which operand or result it is waiting
for.
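The record keeping described above can be illustrated with a much simplified software model. This is a sketch under stated assumptions, not the 360/91 hardware: the class and method names are hypothetical, each issued operation receives a fresh integer tag that doubles as its reservation station, and timing, pipelining and the limited number of stations are ignored.

```python
import operator


class FloatingPointUnit:
    """Simplified model of register tags and reservation stations."""

    def __init__(self, registers):
        self.value = dict(registers)                 # register -> value
        self.waiting = {r: None for r in registers}  # register -> awaited tag
        self.stations = {}                           # tag -> [op, a, b]
        self.next_tag = 0

    def issue(self, op, dest, src_a, src_b):
        tag = self.next_tag
        self.next_tag += 1
        operands = []
        for r in (src_a, src_b):
            pending = self.waiting[r]
            if pending is not None:
                operands.append(("tag", pending))    # wait for that result
            else:
                operands.append(("val", self.value[r]))
        self.stations[tag] = [op] + operands
        self.waiting[dest] = tag     # register now awaits only this result
        return tag

    def ready(self):
        return [t for t, (op, a, b) in self.stations.items()
                if a[0] == "val" and b[0] == "val"]

    def execute(self, tag):
        op, a, b = self.stations.pop(tag)
        result = op(a[1], b[1])
        for station in self.stations.values():       # forward to stations
            for i in (1, 2):
                if station[i] == ("tag", tag):
                    station[i] = ("val", result)
        for r, awaited in self.waiting.items():      # registers still waiting
            if awaited == tag:
                self.value[r] = result
                self.waiting[r] = None
        return result


fpu = FloatingPointUnit({"F0": 2.0, "F2": 5.0, "F4": 4.0, "F6": 1.0})
t0 = fpu.issue(operator.mul, "F0", "F0", "F2")   # F0 = F0 * F2
t1 = fpu.issue(operator.add, "F4", "F0", "F4")   # F4 = F0 + F4, needs t0
t2 = fpu.issue(operator.add, "F0", "F2", "F6")   # F0 = F2 + F6, independent
ready_after_issue = fpu.ready()
fpu.execute(t2)      # completes first, out of sequence
fpu.execute(t0)      # forwarded to t1 only; F0 no longer awaits it
fpu.execute(t1)
```

The product computed by `t0` never reaches register F0, because F0 already records `t2` as the origin of its most recent result; the product is forwarded only to the station that needs it.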
As an example of the required effect, consider the
short loop referred to above. The first instruction loads
an operand from store to a floating point register.
Obtaining the operand from store will take a small
interval of time, even with an integrated circuit fast
store, and so a store buffer is allocated, the register
is set t
[Figure 3c: bit-layout diagram of the data commands; the only legible command label ends "No., Skip Two".]
Figure 3c-Data commands
[Figure 3d: bit-layout diagrams of 24-bit command words (bits 0-23). Legible command names: Halt; Pop; Pop, But Skip if Jump; General Register; Register Exchange; Data Out (From R15); Data In (To R15); Direct Output; Test and Skip. Legible field labels: OP2, RN, RQ, J, Data, Device, Register Number or Command, Spare.]
Figure 3d-Miscellaneous and I/O commands
Display Processor and 940 Core Memory, and two
connections, the I/O and interrupt lines, between the
Display and the 940 Processor.
Display commands
One of our goals was to design a rich but "clean"
series of display commands. In particular, we wanted
to avoid a difficulty we encountered in several other
display systems-the fact that word length restrictions
force reliance on two word instructions or on dual
operating modes in which the machine will treat all
words either as display data or instructions, depending
on its mode. The 24 bit word length of the 940 provided enough space (just barely) to allow all instructions
to carry OPCODE and X-Y or character data in a
single word.
Figure 3a shows the six Display Commands. The
display generator can produce lines and characters.
Lines are drawn in 2 + 3L microseconds, where L is
the length of the line in inches. The beam can be
randomly positioned anywhere on the screen in 7
microseconds maximum. Characters are drawn in from
4 to 12 microseconds, depending on size and number of
strokes required. One command plots three characters
in "typewriter" format*; the remaining commands
specify the endpoints of displayed lines.** The endpoint
of a line can be specified, in two's complement,
as an absolute location on the 1024 by 1024 coordinate grid of the display screen or as a relative displacement from the current beam location. One pair of
commands allows endpoints to be specified in relative
or absolute terms. Another pair allows mixed specifications, one coordinate absolute, the other relative.
The remaining command allows three endpoints to be
specified as short, relative displacements. Each X or
Y component of a short displacement specification
(Figure 3a) is represented, in two's complement, by
one sign bit and two magnitude bits. The two magnitude
bits are treated by the hardware as the two high order
bits of a three bit magnitude representation. The low
order bit is assumed to be 0. This allows displacements
of about 0.1 inch in X and Y to be specified. Each
line specification carries an unblank bit (U). If set,
the line will appear; otherwise it will produce an invisible beam movement.
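One possible reading of the short displacement format is sketched below: the three stored bits are extended with the implied low-order 0 and the resulting four bits are interpreted as a two's complement number. The function name and this interpretation of the hardware are assumptions; the result is expressed in units of the coordinate grid.

```python
def decode_short_displacement(field):
    """Decode a 3-bit component: sign bit followed by two magnitude bits."""
    if not 0 <= field <= 0b111:
        raise ValueError("short displacement component is three bits")
    value = field << 1          # append the low order bit, assumed to be 0
    if value & 0b1000:          # sign bit set: 4-bit two's complement
        value -= 0b10000
    return value


all_values = [decode_short_displacement(f) for f in range(8)]
```

Under this reading every short displacement is an even number of grid units between -8 and +6.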
The appearance of displayed elements is controlled
by the three fields of the display parameter register
(R10) (Figure 4). Eight intensity levels and four
character sizes are available. A line can be drawn solid,
in a variety of dotted and dashed formats, or as a single
dot at its terminal point (point plotting). To allow
independent control of the three parameters, a masking
mechanism is included.† To change parameters one
uses a Load Command (Figure 3c) with bits 12-14
specifying which parameters are to be affected.
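The effect of the masking mechanism can be sketched as follows. The field names, their order, and the use of one mask flag per field are assumptions made for illustration; the text states only that three mask bits select which of the three parameters a Load Command changes.

```python
FIELDS = ("intensity", "line_type", "char_size")   # assumed names and order


def load_parameters(current, mask, new):
    """Return the parameter register after a masked load.

    mask holds one flag per field; unselected fields keep their old value.
    """
    return {name: (new[name] if selected else current[name])
            for name, selected in zip(FIELDS, mask)}


register = {"intensity": 3, "line_type": 0, "char_size": 1}
register = load_parameters(register, (True, False, False),
                           {"intensity": 7, "line_type": 5, "char_size": 2})
```

Only the intensity changes in this example, so a subroutine can adjust one parameter without knowing the current settings of the other two.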
* A null code can be placed in the unused character position when
it is desired to plot one or two characters. In addition, the character generator has an unusually rich complement of control characters, including space and half space up, down, backwards, and
forwards. Full details are covered in reference 10.
** The starting point for a line or group of characters is the
current beam position. The X and Y registers always contain this
value; their contents are appropriately updated as each Display
Command is executed.
† A similar scheme was used in the Digital Equipment Corporation (DEC) 340 and 338.
[Figure 4: operand field diagram spanning bits 12-23, with field boundaries at bits 14/15, 17/18 and 20/21; the legible field labels are Intensity and Char. Size.]
Figure 4-Operand field for display parameter command
The pushdown stack
One of our key goals was to achieve a display system
that would allow us to represent pictures by means of
complex data structures. Behind this goal was a desire
to eliminate or minimize the separation that is necessary in many systems between a "master representation" and a "display file". Looking at this rather
general goal in more detail, we wanted the ability to:
1. Execute nested picture subroutines to arbitrary
depth.
2. Create "transparent" subroutines-save and
restore selected display registers such as the X
and Y beam position and display parameters
on entering and leaving a subroutine.
3. Pass parameters to subroutines.
4. Easily identify objects selected by light pen
or stylus in terms of the picture structure.
5. Perform certain forms of general list processing.
Nested subroutines can be handled by a variety of
subroutine mechanisms. The need for easy light pen
selection led us to use a pushdown stack. When processing a light pen or stylus "hit" one must trace one's
path back through the subroutine hierarchy in order to
relate the object selected to the drawing structure.
Without a pushdown stack this requires search through
the subroutine structure. With a stack system, however,
the required trace is maintained compactly and automatically by the return addresses stored in the stack.
The use of a pushdown stack is not in itself new with
this design. The DEC 338 display, for example, made
very successful use of a stack system. What is unique
in this display is the way in which the stack was implemented. The need to save and restore information
other than return addresses meant that it had to be
possible to push any register into the stack. In order
to get the information back into the right register,
data in the stack had to be marked in some way. After
considering several marking schemes, we hit on the
idea of placing instructions rather than data in the
stack. When a display register is "pushed" into the
stack, what actually appears in memory is an instruction to reload the register in question with its
original contents.
The notion of putting instructions in the stack, of
course, changes one's conception of the whole stack
mechanism. The POP instruction (counterpart to
PUSH), for example, becomes a special variety of
"execute", and the stack pointer a kind of auxiliary
program counter. In recognition of this, we reversed
the direction in which stacks usually build. As information is pushed into the stack, the stack pointer is
decremented. This means that instructions in the stack
are "popped" (executed) in the usual low-to-high
address order.
Treating the stack pointer as an auxiliary program
counter suggested that we make it accessible, as is
the program counter, to certain processor instructions.
By doing so, we freed the stack from a fixed location
in core. Because one can load the stack pointer, one is
free to start the stack where one pleases. Moreover,
as we shall see below, one can even achieve a stack
that occupies disjoint areas of memory by saving the
old stack pointer at the beginning of each new section
of stack.
With this background, we can now look at some
details of the stack system. The Push, Load/Push,
and Push Data commands (Figures 3b and 3c) place
information in the stack; Pop and Pop but Skip if
Jump (Figure 3d) get it back out. As mentioned above,
the Push commands assemble instructions in memory;
the Pop commands execute these instructions. The
Push operation may seem complex, but is in fact quite
simple. To see this, let us examine a Push command
in detail.
1. Assume "push the X Register" has been fetched
into the Instruction Register (R1).
2. The register field (bits 4-7) of R1 selects the
X register (R8). The contents of R8 are copied
to bits 12-23 of R1.
3. Bits 8 and 9 of R1 are cleared to 0. The remainder of R1 is left unchanged.
4. R1 is copied back into the memory at the
location selected by the Stack Pointer (R3).
5. The Stack Pointer is decremented.
6. The net result in memory is a "Load the X
Register" command * with the current X value
in its operand field.
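The six steps above can be followed in a short sketch. Several details are assumptions made for illustration: bits are numbered from 0 at the most significant end of the 24 bit word, the Push command is given the value 3 in bits 8-9 (the text says only that these bits are cleared to produce a Load), and memory and registers are modelled as Python dictionaries.

```python
WORD_BITS = 24


def field_mask(first, last):
    """Mask for bits first..last, with bit 0 the most significant (assumed)."""
    width = last - first + 1
    return ((1 << width) - 1) << (WORD_BITS - 1 - last)


def get_field(word, first, last):
    return (word & field_mask(first, last)) >> (WORD_BITS - 1 - last)


def set_field(word, first, last, value):
    mask = field_mask(first, last)
    return (word & ~mask) | ((value << (WORD_BITS - 1 - last)) & mask)


def push(instruction, registers, memory, stack_pointer):
    """Execute a Push held in the instruction register; return the new SP."""
    selected = get_field(instruction, 4, 7)                    # step 2
    word = set_field(instruction, 12, 23, registers[selected])
    word = set_field(word, 8, 9, 0)                            # step 3
    memory[stack_pointer] = word                               # step 4
    return stack_pointer - 1                                   # step 5


X_REGISTER = 8
registers = {X_REGISTER: 0o1234}
push_x = set_field(set_field(0, 4, 7, X_REGISTER), 8, 9, 0b11)  # hypothetical
memory = {}
new_sp = push(push_x, registers, memory, stack_pointer=100)
```

The word left at address 100 differs from the Push command only in bits 8-9 and in its operand field, which now carries the saved X value.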
The main use for Push is to save register contents
for later restoration at the end of a subroutine. As
indicated in Figures 3a and 3b, Push can be brought
to bear on any register accessible to the programmer.
Because the stack is marked, a single instruction restores the information regardless of where it came
from.
* A variant of Push will place an Add Command in the stack.
A Display Processor Design
In dealing with display structures, it is convenient
to supply names or tags for the objects being presented.
These may, for example, be pointers to other areas of
memory that describe non-graphic properties of the
objects. The No-Op command (Figure 3b) allows
names to be included in a display file. It causes no
action, but its operand field may contain tag information. Push Data allows names to be pushed into
the stack, a further convenience when tracing back
through a subroutine hierarchy. This command writes
its own operand field into the stack in the form of
a No-Op command.
The third Push variant-Load/Push-exchanges its
operand field with the selected register before writing
the original register contents into the stack. Load/Push
the Program Counter provides a standard subroutine
call. The current program location is stored in the
stack (as a Jump instruction) while the Program
Counter is simultaneously reset to the subroutine
entry point specified by the Load/Push command.
Load/Push can be used in a similar way to save and
simultaneously reset any other register.
Load/Push the Stack Pointer deserves special attention. Because the Stack Pointer is loaded with the
new value before its original contents are pushed, the
old value will be pushed into the new stack. Thus,
the first word put into the new stack is a pointer that
links it to the old stack. It is this feature that allows
one to create disjoint stacks; the saved stack pointers
provide an automatic address chain back to the original
stack. We have chosen to call these stored links "Stack
Jumps."
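The way a Stack Jump links two disjoint stack areas can be traced in a small model. Stacked instructions are represented symbolically as `("LOAD", register, value)` tuples rather than as 24 bit words, and all addresses and register values are invented for the example.

```python
def push(memory, registers, register):
    """Store a Load of the register's current value, then decrement SP."""
    memory[registers["SP"]] = ("LOAD", register, registers[register])
    registers["SP"] -= 1


def load_push_stack_pointer(memory, registers, new_stack):
    """Load SP first, then push the old SP into the new stack."""
    old = registers["SP"]
    registers["SP"] = new_stack
    memory[registers["SP"]] = ("LOAD", "SP", old)    # the Stack Jump
    registers["SP"] -= 1


def pop(memory, registers):
    """Increment SP and execute the instruction it now selects."""
    registers["SP"] += 1
    _, register, value = memory[registers["SP"]]
    registers[register] = value


memory = {}
registers = {"SP": 100, "X": 5, "Y": 7}
push(memory, registers, "X")                      # saved in the old stack
load_push_stack_pointer(memory, registers, 500)   # start a new stack area
push(memory, registers, "Y")                      # saved in the new stack
registers["X"], registers["Y"] = 0, 0
pop(memory, registers)                            # restores Y
pop(memory, registers)                            # Stack Jump: back to old stack
sp_after_jump = registers["SP"]
pop(memory, registers)                            # restores X
```

The second Pop executes the stored link, so the third Pop continues in the original stack without any special handling by the program.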
Pop, the counterpart to Push, causes the display
processor to execute instructions in the stack. When
the processor encounters a Pop, it increments the
Stack Pointer, fetches the instruction selected by the
new pointer value, executes that instruction, and then
returns to normal instruction execution under control
of the Program Counter. Typically, the instructions
executed by Pop will be Load or No-Op commands
created by one of the Push instructions. However,
any instruction can be executed through Pop.
With the Pop instruction in hand, we can now examine a typical subroutine linkage. Having entered
the subroutine through a Load/Push Program Counter,
one can use Push or Load/Push commands to save
any other registers. The net result is a series of Load
Commands in the stack with a Load Program Counter
occupying the last (highest numbered) address. Two
commands, Pop followed by a Jump to the previous
instruction, will restore the saved registers and provide
a subroutine return. The processor loops on these two
commands, reloading the saved registers, until the
stored Load Program Counter removes it from the
loop and returns control to the main program.
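The two-command return sequence can be traced with a symbolic model. Stacked instructions are represented as `("LOAD", register, value)` tuples, the addresses and register values are invented, and the ordering inside `subroutine_return` (the Program Counter advances before the stacked instruction takes effect) is an assumption about sequencing.

```python
def pop(memory, registers):
    """Pop: advance the Stack Pointer and execute the selected instruction."""
    registers["SP"] += 1
    _, target, value = memory[registers["SP"]]   # only Load is modelled
    registers[target] = value


def subroutine_return(memory, registers, loop_address):
    """Run the loop: Pop at loop_address, Jump back at loop_address + 1."""
    registers["PC"] = loop_address
    while registers["PC"] in (loop_address, loop_address + 1):
        if registers["PC"] == loop_address:
            registers["PC"] += 1            # normal sequencing past the Pop
            pop(memory, registers)          # a stacked Load PC overrides it
        else:
            registers["PC"] = loop_address  # Jump to the previous instruction


# Stack left by a call made from address 10 that then saved X and Y.
memory = {100: ("LOAD", "PC", 11), 99: ("LOAD", "X", 1), 98: ("LOAD", "Y", 2)}
registers = {"PC": 0, "SP": 97, "X": 500, "Y": 600}
subroutine_return(memory, registers, loop_address=50)
```

The loop needs no count of saved registers: it ends when the stored Load Program Counter, at the highest address, is finally executed.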
The Pop but Skip on Jump command allows one
to restore saved registers without returning from a
subroutine. This command behaves exactly like Pop
except upon encountering a Load Program Counter
in the stack. In this event the stacked instruction is
ignored, the Stack Pointer decremented and the
Program Counter incremented an extra time. The net
result is that the processor breaks out of a loop such
as the one suggested above, just before executing the
return Jump.
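The difference from plain Pop can be seen in the same kind of symbolic trace. As before, stacked instructions are `("LOAD", register, value)` tuples, the addresses are invented, and the order in which the Program Counter is advanced is an assumption.

```python
def pop_but_skip_on_jump(memory, registers):
    registers["PC"] += 1                    # normal sequencing
    registers["SP"] += 1
    _, target, value = memory[registers["SP"]]
    if target == "PC":
        registers["SP"] -= 1                # leave the return link in the stack
        registers["PC"] += 1                # extra increment skips the Jump
    else:
        registers[target] = value


def restore_registers(memory, registers, loop_address):
    """Loop: Pop but Skip on Jump at loop_address, Jump back at the next word."""
    registers["PC"] = loop_address
    while registers["PC"] in (loop_address, loop_address + 1):
        if registers["PC"] == loop_address:
            pop_but_skip_on_jump(memory, registers)
        else:
            registers["PC"] = loop_address


memory = {100: ("LOAD", "PC", 11), 99: ("LOAD", "X", 1), 98: ("LOAD", "Y", 2)}
registers = {"PC": 0, "SP": 97, "X": 500, "Y": 600}
restore_registers(memory, registers, loop_address=50)
```

The saved registers are restored, execution continues at the word after the loop, and the stored Load Program Counter remains in the stack for a later return.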
The above discussion has suggested some conventional uses for the stack instructions. However, such
features as the ability to manipulate the Stack Pointer
in various ways permit the user to devise more sophisticated uses for the stack mechanism. We have
made heavy use of this flexibility in the software
support package. One example application is the handling of rubber band lines and other simple constraints
within the display processor. We accomplish these
functions by performing list processing in the display
file using the stack feature.n
Experience in working with the system has shown
that the heavy use of multiple stacks could be more
efficient if another stack pointer were available or if
a 14 bit address length general purpose register were
available for temporary storage of the Stack Pointer.
The Shell system is being modified to add two such
14 bit general registers. The ability to execute instructions in the stack has given generality and power
to the display processor at modest cost.
Memory sharing
A consequence of our desire to achieve close coupling
between pictorial and other information was the need
to allow easy access to display files from programs in
the 940. As well as permitting advanced graphics
applications, we felt that close access would simplify
the general software support for the display.
To realize this goal we attached the display processor directly to the core memory of the central
computer rather than relying on a separate buffer
memory. * The display processor addresses the 1.75
microsecond 940 memory through its program counter
and stack pointer. In operation, the display processor
refreshes the display consoles by executing display
commands stored in 940 memory and passing the data
they contain to the display generator.
* This connection utilizes the 940's second memory port.4
Given this close interconnection between display
and main computer, considerable care was necessary
to ensure a display system that could operate effectively
without degrading or endangering the supporting timeshared computer system. One potential danger, competition between display and central processors
for memory access, was reduced to an acceptable
level by use of dual access priorities on the second
path to memory.**4
A second and more serious danger, inadvertent
alteration of 940 memory by a display program, was
eliminated by including memory mapping and protection hardware in the display processor. This
equipment is identical in function to equivalent hardware in the 940.4 By means of this mapping, the 16K
word "virtual" memory that can be accessed by the
display (and 940) instructions is mapped into 2K
word physical pages that may be scattered through
the 64K words of 940 core memory. At any one time
only a few of these pages may be assigned to the
display, and those pages that are assigned may be
made accessible for reading only or for reading and
writing.
Registers in the mapping hardware indicate, for
each of the eight pages that the display might address,
whether or not a physical page is assigned, and if
assigned its status (read only or read/write). Only
the 940 monitor can change the contents of the map
registers. As shown in Figure 5, memory addresses
transmitted by the display processor are processed
through the mapping hardware before accessing 940
memory. Any attempt to address an unassigned page
or to write into a read-only page stops the display
processor and sends an interrupt signal to the 940.
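The mapping rules above can be sketched in a few lines of Python. This is only an illustrative model, not the hardware's register layout: the class and method names (`DisplayMap`, `MapEntry`, `MapViolation`, `assign`, `translate`) are hypothetical, and the exception merely stands in for "stop the display processor and interrupt the 940". The page size, the eight map entries, and the 64K physical memory come from the text.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_WORDS = 2048             # 2K-word pages (from the text)
VIRTUAL_PAGES = 8             # 16K-word virtual space / 2K words per page
PHYSICAL_WORDS = 64 * 1024    # 64K words of 940 core memory


class MapViolation(Exception):
    """Hypothetical stand-in for 'stop the display and interrupt the 940'."""


@dataclass
class MapEntry:
    physical_page: Optional[int] = None   # None means no page is assigned
    writable: bool = False                # False means read only


class DisplayMap:
    def __init__(self):
        self.entries = [MapEntry() for _ in range(VIRTUAL_PAGES)]

    def assign(self, virtual_page, physical_page, writable=False):
        # In the real system only the 940 monitor can change the map.
        if not 0 <= physical_page < PHYSICAL_WORDS // PAGE_WORDS:
            raise ValueError("no such physical page")
        self.entries[virtual_page] = MapEntry(physical_page, writable)

    def translate(self, virtual_address, write=False):
        if not 0 <= virtual_address < VIRTUAL_PAGES * PAGE_WORDS:
            raise ValueError("address outside the 16K virtual space")
        page, offset = divmod(virtual_address, PAGE_WORDS)
        entry = self.entries[page]
        if entry.physical_page is None:
            raise MapViolation("unassigned page")
        if write and not entry.writable:
            raise MapViolation("write into read-only page")
        return entry.physical_page * PAGE_WORDS + offset
```

For example, after `assign(0, 5)` a read of virtual address 10 reaches physical word 5 * 2048 + 10, while a write to the same address, or any access to virtual page 1, raises the violation.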
One consequence of mapping is that undebugged
display programs are of no danger to the system or
to other users. Mapping has the additional benefit
of allowing users and system software designers to
treat display programs in exactly the same way as
940 user programs. In fact, because mapping for a
user's 940 program can be made identical to the mapping for his display file, the two can share the same
** The 940 CPU accesses memory through the first path to memory. The display accesses memory through a second path. Devices
on the second path can request access with either higher or lower
priority than the first path. The display processor overlaps the
drawing of a vector or character with the fetch of the next command. Memory accesses at this time are with low priority.
When the display operation is completed, access is made with
high priority, if not previously successful. Non-overlapped accesses are made with high priority. Using the above mechanism,
reasonable assumptions on command mix and the fact that the
940 memory has 4 independent interleaved modules, it has been
estimated that the 940 CPU will be blocked from immediate
memory access less than 2 percent of the time.10
Figure 5-Interface structure
[Diagram not reproduced. Legible labels: Memory Output, Output Bus (24), Parity Error (1), Input Bus (24), Parity Test, Memory Input (24), Memory Address (16), Computer Output (24), Time-out (1), Interrupt Alert, Interrupt, Computer Input (24), Control, EOM/POT/PIN.]
address space and thus be merged in any way the
user pleases. Thus, the user can, if he wishes, create a
common data structure that represents pictorial and
other properties of the objects to be viewed. In addition, he can achieve an unprecedented richness of
interaction between operations performed at a display
console and the underlying processing in the main
computer.
Processing tasks-Display vs. 940
The issue of how much power to include in the display
processor is a complicated one. This issue is discussed
more fully in an earlier paper that was inspired by the
difficulties we encountered on this project. We chose
to include enough computing power to handle the
immediate response to interactive events such as
light pen "hits" or the depression of push buttons.
Less than this would yield sluggish interaction; tasks
requiring more power could, we felt, be relegated to
the 940 processor.
With these ideas in mind, we equipped the display
processor with a set of commands aimed specifically
at interactive situations. As shown in Figure 3c, these
include bit manipulating and skip commands and an
arithmetic compare operation. The bit manipulating
and skip instructions include Clear, Toggle (Complement), And, Set, Skip on 0, Skip on 1, Skip on 1 and
Clear, all handled under the mask in the operand field
of the instruction. These commands are used to test
or change status, control interrupt masking and so
forth. There is also a three way arithmetic compare of
a selected register with the operand giving a skip of
0, 1, or 2, depending on the result. This command
allows one to branch on the X or Y location of the
display beam or of a coordinate input device. Taken
together with the Add, Register Exchange and General
Register Commands,† and the stack mechanism, these
interactive commands have allowed us to do such
things as handle light buttons, produce point rasters,
and perform the work involved in light pen tracking,
all without intervention from the 940. Control of the
display processor is implemented with microcoding and
a read-only memory. The time required per microstep
is 400 nanoseconds. Command fetch, decoding, and
program counter update require 6 microsteps plus a
memory read time. The number of microsteps required
per command execution is variable: Load requires 1,
Push 3, and Pop 9, for example. The Pop and General
Register Commands have the longest execution time.
The read-only memory can be easily modified or inexpensively replaced. This feature will be used to
modify or add commands thought to be useful from
the software experience.11
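The microstep figures above allow a rough per-command time estimate. The sketch below is arithmetic only and rests on two assumptions that the text does not state: that the "memory read time" equals the 1.75 microsecond cycle quoted earlier for the 940 memory, and that no further memory accesses or overlap occur during execution. The names are illustrative.

```python
MICROSTEP_NS = 400        # time per microstep (from the text)
FETCH_MICROSTEPS = 6      # fetch, decode, program counter update (from the text)
MEMORY_READ_NS = 1750     # ASSUMPTION: one 1.75 microsecond 940 memory cycle

# Execution microsteps quoted in the text for three commands.
EXECUTE_MICROSTEPS = {"Load": 1, "Push": 3, "Pop": 9}


def command_time_ns(name):
    """Estimated time for one command: fetch + one memory read + execution."""
    steps = FETCH_MICROSTEPS + EXECUTE_MICROSTEPS[name]
    return steps * MICROSTEP_NS + MEMORY_READ_NS


for name in EXECUTE_MICROSTEPS:
    print(name, command_time_ns(name) / 1000, "microseconds")
```

Under these assumptions a Load takes about 4.55 microseconds and a Pop about 7.75 microseconds; the true figures depend on memory contention and on any memory references made during execution.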
In spite of its power, the display processor must
call on the 940 for assistance in tasks beyond its capabilities. In addition, the 940 must, of course, have
ultimate control over the display. We satisfied both
needs by connecting the display processor to the I/O
and interrupt systems of the 940. Through these connections the display processor can transmit service
requests to the 940. The 940 processor can in turn
interrogate and set the registers of the display. Together with the shared memory mechanism, these two
connections yield a closeness of coupling that contributes importantly to the ability of the two machines
to share their processing resources.
Through its I/O lines the 940 processor can directly
access all registers of the display. Any display register
can be brought into the 940 processor by a 940 Parallel
Input (PIN) instruction. Conversely, the 940 processor
can set any display register through a Parallel Output
(POT) instruction. This feature aids the 940 in initializing the display and in processing interrupt
requests. If the 940 sets the display's Instruction
Register (through a POT instruction), the display will
treat the information as a command, execute it, and
then halt. Unless directly altered by a command
executed in this way, the display's Program Counter
is not changed. The net result is that the 940 can, in
effect, "execute" any display instruction. As well as
access to the display registers, the direct I/O connection allows the 940 to stop and start the display and
to set the display's memory map and the "device map"
described in the next section.
† Though not directed at any particular interactive function, our
implementation of the processor design allowed us to include
these commands at little cost. They have proven more than worth
the price. The Add (Figures 3b, 3c) and Register Exchange
(Figure 3d) generates a new processor instruction in which OP2
operates on RN using the contents of RQ as operand. This allows
one, for example, to add or compare two registers.
The interrupt system gives the display a means for
requesting help from the 940. Some events in the
display (irrecoverable errors, for example) can only be
dealt with by the 940. Either the 940 or the display
processor can cope with other situations (light pen
hits, scope edge violations). In recognition of this, we
grouped all interrupt as well as other control and status
information into one register, the System Parameter
Register (R11), shown in detail in Table I. The bottom
twelve bits of this register are accessible both to the
bit manipulation commands of the display and, via
the POT/PIN instructions, to the 940. The top seven
bits are accessible only to the POT/PIN instructions
because only the 940 can deal with the information
they contain.
TABLE I-System parameter register*

Bit   Function

      (Bits Accessible to 940 Only)
5-6   These two bits assist the 940 in interpreting certain interrupt events.
7     Parity Error Flag.
8     Memory Map Violation Flag.
9     Time-Out Flag (the display has a built-in down-counting clock).
10    Halt Mask.
11    Halt Flag.

      (Bits Accessible to 940 or Display)
12    Unused.
13    X Edge Overflow Flag.
14    Y Edge Overflow Flag.
15    Edge Overflow Mask.
16    Synchronous Hit Flag (e.g., light pen).
17    Synchronous Hit Mask.
18    Asynchronous Hit Flag (e.g., pushbutton or keyboard).
19    Asynchronous Hit Mask.
20    Blink (toggles continuously at blink rate).
21    Blink Control.
22    Slow Mode Control (for storage tube consoles).
23    Master Unblank (if 0 unconditionally blanks the display).

* Nineteen of the possible 24 bits in this register were implemented.
The lower bits in R11 handle several kinds of events,
for each of which there is a flag bit and a mask bit.
The flag bit is set whenever the event occurs; the
setting of the mask bit determines whether or not an
interrupt signal is sent to the 940. This arrangement
allows the programmer to cope with events through the
bit oriented instructions of the display processor, or,
ignoring them in his display program, to pass them on
as interrupt signals to the 940. In addition, a display
program can request service from the 940 by executing
a Halt and Interrupt instruction (Figure 3d).
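The flag and mask arrangement can be illustrated with a small model of the lower bits of R11. The bit positions come from Table I; everything else is an assumption made for the sketch: that bit 0 is the most significant bit of the 24-bit word (so that bits 12 to 23 are the "bottom twelve"), that a mask bit of 1 means "send the interrupt", and that Skip on 1 and Clear skips when any bit selected by the operand mask is set. The function names are hypothetical.

```python
WORD_BITS = 24


def bit(n):
    """Mask for bit n, ASSUMING bit 0 is the most significant of 24 bits."""
    return 1 << (WORD_BITS - 1 - n)


# Bit assignments taken from Table I.
EDGE_X_FLAG, EDGE_Y_FLAG, EDGE_MASK = bit(13), bit(14), bit(15)
SYNC_HIT_FLAG, SYNC_HIT_MASK = bit(16), bit(17)
ASYNC_HIT_FLAG, ASYNC_HIT_MASK = bit(18), bit(19)

# Each event pairs a flag bit with the mask bit that governs it.
EVENTS = {
    "edge_x": (EDGE_X_FLAG, EDGE_MASK),
    "edge_y": (EDGE_Y_FLAG, EDGE_MASK),
    "sync_hit": (SYNC_HIT_FLAG, SYNC_HIT_MASK),
    "async_hit": (ASYNC_HIT_FLAG, ASYNC_HIT_MASK),
}


def post_event(r11, event):
    """Set the event's flag; return (new R11, whether the 940 is interrupted)."""
    flag, mask = EVENTS[event]
    r11 |= flag
    # ASSUMPTION: a mask bit of 1 passes the event on as an interrupt.
    return r11, bool(r11 & mask)


def skip_on_1_and_clear(r11, operand_mask):
    """Sketch of 'Skip on 1 and Clear' under mask; returns (new R11, skip)."""
    skip = bool(r11 & operand_mask)      # ASSUMPTION: skip if any selected bit is 1
    return r11 & ~operand_mask & 0xFFFFFF, skip
```

With the mask bit clear, a light pen hit only sets the flag, and the display program can test and clear it itself with `skip_on_1_and_clear(r11, SYNC_HIT_FLAG)`; with the mask bit set, the same event is passed on to the 940.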
Because the 940 must assist the display processor
in certain situations, it was necessary to allow display
users to write real-time 940 programs. The problem
of preventing real-time programs from degrading the
time sharing performance of the 940 was handled by
setting limits on a display user's CPU usage during
each refresh cycle of the display.
Consoles and other I/O devices
So far, we have considered the display processor
and its relationship to the parent computer. We were
also concerned with display consoles and other peripheral devices, and their relationship, in turn, to the
display processor and generator. Our main goal in this
area was flexibility. We wanted the ability to attach
a variety of display consoles, differing in some cases
in their equipment complements, as well as other non-display devices including graphic input tablets, and
specialized analog equipment, such as circle or raster
generators. We met this need by dissociating from
the display processor design any consideration of
individual consoles or other devices. Instead, we elected
to treat these as I/O devices, and to handle their
control and the transmission of information to and
from them by means of a very general I/O bus system.
The digital portion of this bus system is similar in
nature to the bussing schemes used on several general
purpose computers. Devices are selected by an address field in the I/O instructions; all devices are
treated homogeneously as collections of registers; and
a given register may contain control or status information, input or output data, or a mixture of these.
Figure 3d shows the Input/Output commands. Two
of these permit the user to transmit information between the I/O register (R15) and the registers of
external devices. Incoming data and status information can then be examined by the Display Processor,
through the test and skip instructions described in the
last section, or dealt with by the 940 through the
POT/PIN commands. The remaining two commands
permit somewhat faster direct output of key commands
and direct testing of key device status bits. As mentioned in the last section, another component in the
digital I/O bus system is the channeling, through OR
gates, of synchronous and asynchronous events in the
peripheral devices into the H1 and H2 bits of the System Parameter Register.
Corresponding to this treatment of digital information, the transmission of analog signals within the
system was also handled through a bussing scheme,
which allows input of analog signals to summing points
within the display generator as well as output of display drive signals.* Because of this treatment of peripheral devices, one can view the display processor
and generator taken together as a specialized hybrid
computer whose main job is to handle a series of I/O
devices through a combined analog/digital bus system.
Just as the 940 processor is time-shared, we wanted
the ability to time-share the display processor and
generator among a number of user consoles without
danger of interference between them. This was
achieved by giving the 940 processor the ability to
control, and thus schedule, usage of the display processor, and by allowing for device protection hardware
in the display's I/O bus design. This hardware utilizes
a mapping scheme similar to the memory mapping
and protection hardware in the 940 and has the additional advantage of allowing a user to refer to peripheral devices through "virtual" addresses that can
remain constant even though he may be assigned a
different console at different times.
CONCLUSION
The stack mechanism in this design is the most significant departure from previous machine design
practice. The features of a marked stack, and the
ability to create disjoint stacks (through the "stackjump" linkage) are both easy to implement and useful.
As is by now well known, the stack feature in a display
processor is essential for orderly treatment of "hits"
detected by the light pen or other stylus devices.
Close coupling between display information and 940
programs has been achieved by the mechanism of
shared memory. Other general purpose display systems
seem to be relying more and more on small local computers for interactive service and to shield the main
computer from the display. By contrast, we deliberately
set out to achieve a rich interaction between display
and parent computer, and the extremely close coupling
* Whether a device generates or responds to analog signals depends upon bit settings in its control register.
of the two machines reflects this goal. Our experience
so far indicates that this coupling can be achieved
without serious degradation of the 940 time-sharing
system.
Until now most displays have been treated strictly
as I/O equipment. As displays have grown in complexity over the years, however, we have come to
recognize that display processors have many of the
attributes of general purpose computers. In recognition of this, we deliberately approached the design
problem with a processor-oriented rather than I/O
device-oriented approach. This thinking is reflected
in the display's extensive instruction set, in the use
of memory and device mapping, in the uniform treatment of consoles as peripheral devices, and finally, in
the microcoding and uniform bussing scheme that
dominate the display processor design.
ACKNOWLEDGMENT
The authors would like to acknowledge the contributions of their co-workers at Shell Development,
Bolt Beranek and Newman, and Sanders Associates,
during the various design and implementation phases
of this project. The Bolt Beranek and Newman effort
was supported by the Advanced Research Projects
Agency of the Defense Department under contract
F 19628-68-C-0125.
REFERENCES
1 L C HOBBS
Display application and technology
Proc IEEE 54 12 1870-1884 1966
2 N A BALL H Q FOSTER W H LONG
I E SUTHERLAND R L WIGINGTON
A shared memory computer display system
IEEE Trans on Electronic Computers EC-15 5 750-756
1966
3 K H KONKLE
An analog comparator as a pseudo-light pen for computer
displays
IEEE Trans on Computers C-17 1 54-55 1968
4 W W LICHTENBERGER M W PIRTLE
A facility for experimentation in man-machine interaction
AFIPS Proc 27 589-598 1965
5 C MACHOVER
Graphic CRT terminals-characteristics of commercially
available equipment
AFIPS Proc 31 149-160 1967
6 T H MYER I E SUTHERLAND
On the design of display processors
Com ACM 11 6 410-414 1968
7 M W PIRTLE
Intercommunication of processors and memory
AFIPS Proc 31 621-634 1967
8 H H POOLE
Fundamentals of display systems
Spartan Books Washington D C 1966
9 C SEITZ G F PFISTER
A display processor for a small computer
AFIPS Proc This issue 1969
10 R W WATSON
The design of a general purpose graphic terminal for a timesharing system
Shell Development Co Technical Progress Report 138-68
July 1968
11 R W WATSON et al
Paper in preparation describing the design philosophy of
the software for use with the display system reported here.
The system logic and usage recorder
by R. W. MURPHY
International Business Machines Corporation
Poughkeepsie, New York
INTRODUCTION
A fundamental problem in monitoring the performance
of a system with a hardware device is too much data.
Inside the System/360 Model 40, for example, seventeen address bits and sixteen data bits may be processed
every 2.5 microseconds; this rate is equivalent in
bulk to about three novels per second but not generally
equivalent in interest or information. The design objective for any hardware monitor, therefore, is to
reduce the data it sees as soon as possible.
The associative memory (AM) is an excellent means
for not recording data beyond significance. The memory
can be instructed to record data only if they are new;
if the data have already been seen and stored, no more
space need be squandered upon them. This philosophy
of monitoring and measurement has been expanded
into the System Logic and Usage Recorder, an experimental device under test in IBM Poughkeepsie's
SDD Advanced Technology group.
In the Recorder, the basic associative processes of
interrogation and storage are extended, by means of a
system of data routing and field control, into a capability for performing advanced data reduction and
data processing algorithms. The algorithms are programmed and retained in a control storage where they
may be added to or modified by the user.
Data to be analyzed in the Recorder are collected
at the host computer through a special monitor interface which detects and transmits such signals as instruction and data addresses, operation codes, and the
statuses of channels and internal computer conditions.
The monitor interface, which consists of 48 lines, is
one-way, and does not affect the operation of the host
computer. In addition to the monitor interface, there
is a standard input/output interface which is used to
pre-load the associative memory when this is required
by an algorithm, and over which the collected and
reduced data are transmitted as the Recorder's output.
In this paper, some simple data-gathering procedures
are discussed first in order to introduce the design
concepts of the Recorder. This is followed by descriptions of the organization and programming of the
system, and finally some specific data reduction algorithms are given.
Simple data gathering and basic operation
A question asked in performance measurement is,
"How much time is spent in executing programs out of
various areas of storage?" To determine these times, a
counter must be assigned to each of the active areas;
when an instruction is fetched from an area, clock
pulses begin incrementing the corresponding counter,
and continue until an instruction is brought from some
different area.
In the Recorder, the counters are assigned to storage
areas automatically, through associative memory. Initially the memory is blank and the counters stand at
zero; but when the first instruction address is received
in the Recorder from the computer being monitored, it
is stored in an associative memory word cell as shown
in Figure 1.
This word cell then becomes responsible for monitoring the storage area 00100 through 001FF, which the
word cell does by comparing its contents with each
new instruction address brought into the AM input
register. As long as there is equality in the high-order
bits of the address (the low-order bits are ignored by
means of a mask), a match will be indicated, and the
match indicator for that cell will continue the selection
of the corresponding counter, allowing it to accumulate
time intervals.
Figure 1-Assignment of counter to initial execution
area
Figure 3-Correlation of executed area with
channel activity
[Diagrams not reproduced. Legible labels: computer's storage address register, instruction signal, monitor interface, AM input register, mask register, AM word cells, match indication, counters.]
This process of interrogation is repeated until an
inequality between the value stored in the cell and an
instruction address produces a mismatch, signalling
that program execution has moved to a different area
of monitored storage. The mismatch will deselect the
counter, and will cause the controlling program to
branch into a write cycle in order to record a new
active area as shown in Figure 2.
The process diagrammed in the figure will assign
counters as they are needed, and record their assignment in the associative word cells. Since interrogation
of the associative memory is a single operation, it does
not matter how many of the cells contain meaningful
data, and the fineness of the measurements can be
adjusted by means of masking to take advantage of
the available memory space. If execution in the host
computer should revert to an area already identified
by the Recorder, such as 001 in the example, the
original cell's contents will again match the address
and reactivate the counter for additional accumulations.
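The two-branch procedure just described (interrogate; on a mismatch, write a new cell and select its counter) can be mimicked in software. The sketch below is a sequential stand-in for what the associative memory does in a single parallel operation. The class name, the default of 16 cells, and the policy of simply not counting when the memory is full are assumptions for illustration; masking is modeled by dropping the low-order 8 bits, which reproduces the 00100 through 001FF example.

```python
class AreaTimer:
    """Assigns one counter per storage area, keyed by masked instruction address."""

    def __init__(self, cells=16, low_bits_ignored=8):
        self.capacity = cells
        self.shift = low_bits_ignored
        self.cells = []        # stored area keys (the AM word cells)
        self.counters = []     # one counter per cell, in one-to-one correspondence
        self.selected = None   # index of the counter currently accumulating

    def present(self, address):
        """Handle one instruction address arriving from the monitored computer."""
        key = address >> self.shift          # masking: ignore the low-order bits
        if key in self.cells:                # interrogation produced a match
            self.selected = self.cells.index(key)
        elif len(self.cells) < self.capacity:
            self.cells.append(key)           # mismatch: record the new active area
            self.counters.append(0)
            self.selected = len(self.cells) - 1
        else:
            self.selected = None             # ASSUMPTION: memory full, stop counting

    def tick(self, pulses=1):
        """Clock pulses increment whichever counter is currently selected."""
        if self.selected is not None:
            self.counters[self.selected] += pulses
```

Presenting 0x00128 and later 0x001F0 selects the same cell (key 001), so both intervals accumulate in one counter, while 0x00456 causes a second cell to be written.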
The two-branched monitoring procedure is a basic
one and can be made to yield many kinds of information. For example, if channel activity is also monitored and presented at the interface as a field of bits,
this field can be juxtaposed with the instruction address field as in Figure 3.
With this process, which has the same flow chart as
in Figure 2, a correlation will be made automatically
between storage usage and channel activity. It is, of
course, immaterial what kind of data is being brought
to the interface; the user can perform the correlation
on any combinations of events which are represented
by digital signals brought over the monitor interface.
Another form of correlation is of interest because it
yields information about the sequence of events taking
place in the monitored system. This procedure consists
of relating each event to its predecessor by forming an
ordered pair at the AM input register as in Figure 4.
Two kinds of events are recorded in this process:
the occupancy of a particular area, and the transition
from one area to another. The procedure is essentially
the same as that given by the flow chart of Figure 2,
except that an additional data routing is programmed.
Each address is first placed into the left-hand field
(the current field) and the interrogation is performed.
Following the action consequent on the interrogation,
the address is then put into the right-hand field (the
previous field) and is retained there until the next address arrives and the cycle is repeated.
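The ordered-pair procedure can be sketched as follows. Each (current, previous) pair plays the role of one associative word, and the count attached to it plays the role of its counter. This is a minimal illustration: the function name is hypothetical, the use of `None` for the previous field before the first address arrives is an assumption, and events are counted rather than timed.

```python
from collections import Counter


def record_transitions(addresses, low_bits_ignored=8):
    """Count (current area, previous area) pairs over a stream of addresses."""
    counts = Counter()
    previous = None                           # ASSUMPTION: empty previous field at start
    for address in addresses:
        current = address >> low_bits_ignored     # masked area key
        counts[(current, previous)] += 1          # interrogate, then write or count
        previous = current                        # route the address to the previous field
    return counts


trace = [0x00123, 0x00145, 0x00410, 0x00420, 0x00150]
for (current, previous), n in record_transitions(trace).items():
    kind = "occupancy" if current == previous else "transition"
    print(current, previous, n, kind)
```

Pairs whose two fields are equal record continued occupancy of an area; pairs whose fields differ record a transition, so the collected pairs form the edges of the graph discussed in the text.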
Figure 2-Assignment of next counter to next
execution area
[Diagram not reproduced. Legible labels: AM input register, mask, counters, AM word cells after interrogation, AM word cells after writing.]
This procedure develops a graph of the system's
operation in associative memory, and could be used to
study the operation of paging algorithms. If the full
instruction address were applied to the memory by
modifying the mask, all the linkages of a program
would be recorded and could be used to draw the
program's block diagram as it was actually executed.
The application would be very wasteful of space, however, and impractical except for very small programs.
There is a more complex procedure, to be discussed
later, which eliminates much of the redundant information and makes block diagramming feasible with
associative memories that will be available in the near
future.
Figure 4-Recording occupancy and transitions
of execution areas
[Diagram not reproduced. Legible labels: computer's storage address register, input register, mask register, AM word cells, match indication, counters.]
Emphasis so far has been placed upon the associative
operations and what might be called the logic recording
capability. The usage recording functions take place
in the counters, which are actually cells in a supplementary storage addressed by the associative memory
as a result of interrogation operations. These cells may
be set up in various ways to record counts, times, or
the presence of computer conditions, according to the
measurements required.
General design concepts
The examples of data gathering just discussed show
that a variety of performance measurements can be
made, simply by changing the nature and the positioning of data applied to the associative memory. This
variety is enhanced greatly by means of a stored
program control system which gives the user full control over the functions available in the Recorder. In
general, each step of a data reduction procedure will
specify the following elements:
Routing. The source, length, and terminus of a
field of data to be processed.
Masking. The suppression of part or all of a field
at a particular step in the procedure.
Operation. Interrogate, store, or read for associative memory.
Branching. Choice of the next step, based upon
results of previous steps.
The specification of these elements applies primarily to associative memory as it processes the data
received from the monitor interface, and is incorporated
in the AM format instruction:
[Instruction format diagram not reproduced. Legible field labels: Routing 2, Mask.]
The operation code for the AM format instruction
will specify one of the following:
INTERROGATE: compare contents of input
    register with all stored words and turn on
    match indicators for cells with equal contents.
INTERROGATE NEXT: same as above, except
    that the match indicator for the next cell is
    turned on.
WRITE: store the contents of the input register
    into all cells whose match indicators are on.
WRITE NEW: store the contents of the input
    register in the first vacant word cell.
WRITE ONE: store the contents of the input
    register in the first cell whose match indicator is on.
WRITE ALL: store the contents of the input
    register in all cells regardless of the match
    indicators.
READ: put the contents of the first cell whose
    match indicator is on into the output register.
Two fields of data may be moved simultaneously by
means of the two routing specifications. These fields
may be one, two, or three bytes in length, or, alternatively, a literal constant of one byte may be substituted
for one of the routing specifications. The routing of
data will be discussed in more detail in the section on
Data Paths and Routing Control in conjunction with
the data paths of the Recorder.
In general, the fields of data processed are of variable
length, on a byte basis. The associative memory is
eight bytes in width, and its masking is also generally
controlled on a byte basis. However, many algorithms
require status bits which must be masked or unmasked
by bit. The mask specification in the instruction, therefore, consists of fifteen bits, of which the first seven
apply to the first seven bytes of the associative memory,
and the remaining eight to the individual bits of the
eighth byte. In addition, it is also possible to apply a
literal mask to any byte by placing it in a routing
specification along with an identifying code. This literal
mask has precedence over the normal mask, and remains until removed by another literal. This mask is
not normally used in data reduction, but is necessary
for such algorithms as simultaneous addition into
associative memory or ordered retrieval from it.
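The fifteen-bit mask specification can be expanded into a full 64-bit mask as sketched below. Several details are assumptions, since the text gives only the bit counts: the seven byte-level bits are taken to be the high-order bits of the specification, bits are read most significant first, and a 1 is taken to mean that the byte or bit takes part in the comparison. The literal mask mentioned in the text is not modeled, and the function name is hypothetical.

```python
def expand_mask(spec15):
    """Expand a 15-bit mask specification into a 64-bit comparison mask."""
    if not 0 <= spec15 < 1 << 15:
        raise ValueError("mask specification must fit in 15 bits")
    byte_bits = spec15 >> 8      # ASSUMPTION: high 7 bits, one per byte 1 to 7
    last_byte = spec15 & 0xFF    # low 8 bits, one per bit of the eighth byte
    mask = 0
    for i in range(7):
        selected = (byte_bits >> (6 - i)) & 1          # byte 1 is the leftmost bit
        mask = (mask << 8) | (0xFF if selected else 0x00)
    return (mask << 8) | last_byte


# Compare only the first three bytes of the eight-byte word:
print(hex(expand_mask(0b1110000_00000000)))
```

Byte-level control keeps the instruction short, while the eight individually controlled bits of the last byte allow status flags to be masked one at a time.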
The next two instruction specifications of each instruction provide conditional branching to the program,
based upon the collective condition of the match indicators. The choice of the next instruction depends on
the following:
Operation        Condition                  Next instruction
INTERROGATE      single or multiple match   Instr. 1
                 no match                   Instr. 2
WRITE or READ    one or more MI's are on    Instr. 1
                 no MI's are on             Instr. 2
Data paths and routing control
Figure 5 is a schematic diagram of the data registers
and paths of the system. Each line represents a path
for one byte of data, and a dot where two lines cross
indicates a programmable connection. One group of
six paths (48 bits) carries monitored data from the
interface with the host computer to the input of the
associative memory. The various registers and the
crossbar switch provide buffering and field control over
these data. Another path, one byte wide, connects
memory outputs to memory inputs through an adder
to allow internal processing functions.
The word logic circuits link the supplementary
storage with the associative memory and provide an
addressing function for the two memories. This addressing function is initiated by interrogating the
associative memory with data in its input register; if
the data in any associative word cell compare equally
with the interrogating data, either that word cell, or
a word cell in supplementary storage in one-to-one
correspondence with it, or both may be selected for
the entry or recovery of data. Explicit addresses for
these word cells do not appear in the instructional
control system. The word logic circuits also provide
other functions, including tie-breaking in the case of
multiple matches and a match/no-match signal for
conditional branching in the program.
Figure 5-Recorder data paths
[Diagram not reproduced. Legible labels: crossbar switch, AM input, mask, associative memory, word logic, AM output, output.]
Control over the data routing is accomplished within
the instruction by means of routing specifications. The
standard instruction format contains two routing
specifications, each controlling one field of data; a
special instruction format is used for supplementary
storage operations which are to be overlapped with
the associative operations. The routing specification in
the standard format contains 16 bits, identified as
follows:
Change Code (one bit). A zero indicates that the
    A Register is to be left unchanged; a one
    causes the specified field to be entered into
    the A Register before being routed further.
Literal Code (one bit). A one causes a one byte
    constant from the instruction to be entered
    into the A Register before being routed
    further. This constant replaces the field length
    and source address specification.
Length Field (three bits). Specifies the number of
    bytes of the field being routed. The maximum
    field length from the monitor register is three
    bytes, and from other sources, seven. A
    length of zero causes no transfer of data.
Source Address (six bits). Specifies the location at
    which the lowest-order byte of the field to
    be routed is to be found. Successive bytes of
    the same field are moved in accordance with
    the length specification.
Terminus Address (five bits). Specifies the location
    to which the lowest-order byte of the field
    is to be routed. Addresses are tabulated below.
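The five fields just listed account for exactly 16 bits (1 + 1 + 3 + 6 + 5). A packing and unpacking sketch follows; the order of the fields within the word (change code in the most significant bit, terminus address in the low five bits) is an assumption that simply follows the order of the list, and the names are illustrative. The literal-constant variant, in which the constant replaces the length and source fields, is not modeled.

```python
from typing import NamedTuple


class RoutingSpec(NamedTuple):
    change: int     # 1 bit
    literal: int    # 1 bit
    length: int     # 3 bits, field length in bytes
    source: int     # 6 bits
    terminus: int   # 5 bits


def encode_routing(spec):
    """Pack the fields into 16 bits, ASSUMING the listed order, MSB first."""
    limits = (1, 1, 7, 0x3F, 0x1F)
    if any(not 0 <= v <= hi for v, hi in zip(spec, limits)):
        raise ValueError("field out of range")
    return (spec.change << 15 | spec.literal << 14 | spec.length << 11
            | spec.source << 5 | spec.terminus)


def decode_routing(word):
    """Unpack a 16-bit routing specification into its five fields."""
    if not 0 <= word < 1 << 16:
        raise ValueError("routing specification must fit in 16 bits")
    return RoutingSpec(
        change=(word >> 15) & 0b1,
        literal=(word >> 14) & 0b1,
        length=(word >> 11) & 0b111,
        source=(word >> 5) & 0b111111,
        terminus=word & 0b11111,
    )
```

For instance, a three-byte field with the change code set, source address 20 and terminus address 10 (both hexadecimal) packs to 9C10 under this assumed layout. The six-bit source and five-bit terminus widths are just enough to cover the address ranges tabulated in the paper.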
Sources                   Address    Termini                     Address
Supp. Store Output        00-0F      Supp. Store Input           00-0F
Assoc. Mem. Output        10-17      Assoc. Mem. Input           10-17
Void                      1A         Void                        1A
I/O Input to Recorder     1B         I/O Output from Recorder    1B
Clock                     1C-1F
Monitor Interface         20-25
Constant                  26
Notes: Addresses are given in hexadecimal.
The address for the constant is not
used when the constant is specified as a literal,
but if the value of the constant is unchanged
the constant may be routed either alone as a
one-byte field, or as part of a two- or three-byte
field at addresses 25 or 24.
If a void is specified as a source, the
corresponding terminus is reset to zeros.
If a void is specified as a terminus,
positions of the A Register corresponding to
the source are reset to their new values.
the monitored machine's state. This process will usually
be completed only when the computer has assumed a
new state, but a match indicator will be on, pointing
to the record of the previous state. If the algorithm
provides an SS instruction at this time, the SS cell
will be selected and updated according to the SS instruction. Once the selection has been made, it is not
affected by any alteration of the match indicators
until the SS instruction is completed and another one
issued.
It may be seen from Figure 5 that the updating is
accomplished through the adder and the SS input and
output registers, and that it is possible for AM and
SS operations to proceed independently once the selection of an SS cell has been made. This overlap will
take place automatically for all AM instructions except
those which call for the transfer of data between associative memory and supplementary storage or over the
I/O channel. The overlapped processing may be
represented as follows:
Timing diagram: New Monitored Data; Recog. of New State; AM Proc. (state i, state i+1); SS Proc.
The two routing specifications per instruction permit
two fields to be moved simultaneously and in parallel
from the monitor interface to the associative memory
input register via the A register and the crossbar
switch. Transfers of data from sources other than the
monitor register take place over a bus which is one
byte wide, and are therefore serial by byte. As a result,
only one such transfer can be called for in each instruction, using the first routing specification. The
second routing specification can be used, however, for
a simultaneous transfer through the crossbar. A literal
can be specified only with the second specification.
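The 16-bit routing specification described above can be illustrated with a short Python sketch that packs and unpacks its five fields. The field widths come from the text; the bit ordering (change code in the high-order bit, terminus address in the low-order bits) and the names `RoutingSpec`, `pack`, and `unpack` are assumptions made for illustration. The literal case, in which a constant replaces the length and source fields, is not modeled.

```python
from dataclasses import dataclass

# Field names and widths follow the text; the bit ordering (change code in the
# high-order bit, terminus address in the low-order bits) is an assumption.
FIELD_WIDTHS = (
    ("change", 1),
    ("literal", 1),
    ("length", 3),
    ("source", 6),
    ("terminus", 5),
)


@dataclass(frozen=True)
class RoutingSpec:
    change: int
    literal: int
    length: int
    source: int
    terminus: int


def pack(spec: RoutingSpec) -> int:
    """Pack the five fields into one 16-bit integer."""
    word = 0
    for name, width in FIELD_WIDTHS:
        value = getattr(spec, name)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        word = (word << width) | value
    return word


def unpack(word: int) -> RoutingSpec:
    """Split a 16-bit integer back into its five fields."""
    fields = {}
    for name, width in reversed(FIELD_WIDTHS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return RoutingSpec(**fields)


# Route a three-byte field from source address 20 (monitor interface)
# to terminus address 10 (associative memory input), both in hexadecimal.
spec = RoutingSpec(change=1, literal=0, length=3, source=0x20, terminus=0x10)
assert unpack(pack(spec)) == spec
```

The widths sum to 16 bits, so two such specifications occupy 32 bits of the standard instruction format.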
The time at which the monitored computer assumes
a new state is taken to be the time of receipt of new
monitored data, as indicated by the appropriate strobe
signal from the computer. Since there is generally a
lag of one cycle before the new state is recognized, the
clock is buffered so that it may be reset to record a new
time period starting from the strobe while the old time
period is retained pending use in the SS instruction.
If no new state has occurred, the old and new time
periods are combined.
The updating of a word in supplementary storage
is controlled by a single instruction containing specifications for performing different operations on four fields
of the word. These fields may be from one to seven
bytes in length individually, the combined length not
exceeding the sixteen bytes of the SS word. The SS
instruction occupies control storage as part of the
programmed algorithm, but it differs in format from
the AM instruction:
Supplementary storage
Supplementary storage (SS) is used to retain times,
counts, and condition codes for which associative processing is not required. However, each word cell of
supplementary storage corresponds to a unique cell of
associative memory and may be selected whenever an
interrogation of associative memory turns on the match
indicator for the corresponding AM cell. The general
concept is that the AM cell retains data describing the
state of the monitored machine, while the SS cell
collects the statistics relative to that state.
The character of the monitoring algorithms is that
there is a series of operations involving associative
memory only, establishing or identifying a record for
Notes: RC = reset controls
LF = length of field
OF = operation on field
The starting location specifies the low-order byte of
field 1, which is updated according to its length and
operation specification. The remaining fields are contiguous in the SS word, and are processed in succession.
If the entire sixteen bytes of the word are not utilized
in an application, the starting location may be other
than zero, and the time of completion of the SS instruction will be lessened.
In addition to length, the field specification may call
for one of the following operations:
1. Increment field
2. Add clock to field
3. Put the lesser of the clock reading and the old
field value in field.
4. Put the greater of the clock reading and the
old field value in field
5. OR the interface byte to the field
6. No operation
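The field-by-field update of an SS word can be sketched in Python as follows. This is a minimal illustration, not the recorder's actual logic: the byte order within a field (low-order byte first), the wrap-around on overflow, and the names `Op` and `update_ss_word` are assumptions.

```python
from enum import Enum


class Op(Enum):
    """The six field operations listed above."""
    INCREMENT = 1
    ADD_CLOCK = 2
    MIN_CLOCK = 3
    MAX_CLOCK = 4
    OR_INTERFACE = 5
    NO_OP = 6


def update_ss_word(word, start, fields, clock, interface_byte):
    """Update up to four contiguous fields of a 16-byte SS word in place.

    word           -- bytearray of 16 bytes
    start          -- location of the low-order byte of field 1
    fields         -- list of (length_in_bytes, Op) pairs, processed in succession
    clock          -- current clock reading (integer)
    interface_byte -- byte presented by the monitor interface
    """
    if len(word) != 16:
        raise ValueError("an SS word is sixteen bytes")
    if len(fields) > 4:
        raise ValueError("an SS instruction specifies at most four fields")
    pos = start
    for length, op in fields:
        if not 1 <= length <= 7 or pos + length > len(word):
            raise ValueError("bad field length or field runs past the word")
        old = int.from_bytes(word[pos:pos + length], "little")
        if op is Op.INCREMENT:
            new = old + 1
        elif op is Op.ADD_CLOCK:
            new = old + clock
        elif op is Op.MIN_CLOCK:
            new = min(old, clock)
        elif op is Op.MAX_CLOCK:
            new = max(old, clock)
        elif op is Op.OR_INTERFACE:
            new = old | interface_byte
        else:
            new = old
        new %= 1 << (8 * length)  # assumed: overflow wraps within the field
        word[pos:pos + length] = new.to_bytes(length, "little")
        pos += length
    return word


# A usage count, a total time, a maximum time, and a condition byte.
layout = [(2, Op.INCREMENT), (3, Op.ADD_CLOCK), (3, Op.MAX_CLOCK), (1, Op.OR_INTERFACE)]
ss_word = update_ss_word(bytearray(16), 0, layout, clock=500, interface_byte=0x05)
```

Starting at a location other than zero, or naming fewer than four fields, shortens the loop in the same way the text describes a shorter SS instruction.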
Application examples
In the application examples to follow, the algorithms
are given as block diagrams, in which each block represents one instruction, including data routing, the
operation, and the masking for AM operations. Data
are routed by fields, which are constant within each
application and are designated by capital letters
generally mnemonic with their meaning. The location
of a field is indicated by a subscript identifying the
register involved in the routing or the memory itself.
These subscripts are:
b-monitor interface buffer
a-crossbar entry register
i-associative memory input register
s-storage cells of associative memory
o-output register from associative memory
p-input/output registers of supplementary storage
The various fields used in an algorithm form an
ordered set at the input to associative memory and
after being written into a particular word cell. The
notation for such an ordered set is:
<Ss Ps Cs> for a particular stored word
If interrogation is to be performed, it is generally
on a set of such words. This set is not ordered and is
written as follows:
{<Ss Ps ->}
In this example, S and P identify the fields active
in the interrogation, and the dash indicates that the
field occupying that relative location in the word is
masked.
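The masked interrogation just described can be mimicked in a few lines of Python. The sketch assumes that a stored word is a tuple of field values and that `None` in the search key plays the role of the dash; the function name `interrogate` is hypothetical, and the loop only imitates what the associative memory does for all words at once.

```python
def interrogate(memory, key):
    """Return one match indicator per stored word.

    memory -- list of stored words, each a tuple of field values
    key    -- tuple of the same length; None marks a masked field (the dash)
    """
    return [
        all(k is None or k == field for k, field in zip(key, word))
        for word in memory
    ]


# Three stored words with fields <S P C>; interrogate on S and P, masking C.
stored = [(3, 7, 1), (3, 7, 0), (2, 7, 1)]
match_indicators = interrogate(stored, (3, 7, None))
```

Here the first two words match because the masked C field is ignored, while the third differs in S.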
Application 1: Combinations of events and states
Problem
To find out what system states occur over a period of
operation of a host system, how many times each
state occurs, and how much time is spent in each state.
For this application, a system state is defined to be
one combination within the following classes of monitored signals:
Signal class            Possibilities    Bits
Stopped/operating       2                1
Running/waiting         2                1
Supervisor/problem      2                1
Channels busy           8                3
Page of instruction     256              8
The monitor interface is set up to provide all of the
above signals except page of instruction on an on-off
basis. The page of instruction is the high-order 8-bit
group of the instruction address, whose presence at
the interface is signaled by means of the instruction
strobe. An evaluation of the system state is to take
place at each instruction strobe, or, if instructions are
not being executed, at each change in the remaining
conditions.
Procedure
Each system state is represented by a particular bit
pattern in the above array of 14 bits, and is recorded
in one word of associative memory. The time interval
and usage of each state is totaled in the corresponding
word of supplementary storage. If instructions are being
executed (operating and running program states), the
entire bit pattern is used, otherwise only program and
channel statuses are stored.
Whenever a change of state occurs, the appropriate
bit pattern is compared simultaneously against all those
previously stored. If no match is found, indicating a
new state, the bit pattern is stored in the next vacant
word, and the statistical fields in supplementary storage
are initialized. If a match is found, indicating a repetition, the statistics are updated.
Interrogations of associative memory may occur as
a result of instruction strobes without a change from
the state of the previous interrogation. To detect
changes, a control bit is added to the array of 14 bits
and is set to one in the word representing the current
state of the system.
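The procedure can be sketched in Python with a dictionary standing in for the paired associative memory and supplementary storage. The bit layout in `encode_state`, the class name `StateRecorder`, and the use of a `current` attribute in place of the control bit are assumptions; the sketch also omits the special handling of non-executing states.

```python
def encode_state(operating, running, problem, channels_busy, page):
    """Pack the monitored signals into a 14-bit pattern (layout assumed)."""
    return (operating << 13) | (running << 12) | (problem << 11) \
        | (channels_busy << 8) | page


class StateRecorder:
    def __init__(self):
        self.table = {}      # bit pattern -> [usage count, total time]
        self.current = None  # stands in for the "last state" control bit

    def observe(self, pattern, elapsed):
        """Record one evaluation of the system state.

        A new pattern gets a fresh entry; a change of state increments the
        usage count; time is accumulated on every evaluation.
        """
        entry = self.table.setdefault(pattern, [0, 0])
        if pattern != self.current:
            entry[0] += 1
            self.current = pattern
        entry[1] += elapsed


recorder = StateRecorder()
state_a = encode_state(1, 1, 0, 2, 0x10)
state_b = encode_state(1, 1, 1, 0, 0x11)
for pattern, elapsed in [(state_a, 5), (state_a, 3), (state_b, 4), (state_a, 1)]:
    recorder.observe(pattern, elapsed)
```

Repeated strobes in an unchanged state add time without adding to the usage count, which is the purpose the text gives for the control bit.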
Results
At the end of the evaluation, there will be one word
of data for each different system state which has
actually occurred. These can then be printed out using
the ordered retrieval procedure to present the nonexecuting states first, then the states in page order.
only the last state indicator is stored in preparation
for the next cycle; otherwise, the entire contents of
the I/E register are written into the next vacant
word to record the new state.
Application 2: Distributions of events
Problem
Algorithm for combination of events and states
S - Field combining program status and
busy channels bits (6 bits total)
P - Page of instruction (8 bits)
C - Last state indicator
codes arrive at the monitor interface they are routed
to successive fields in the Interrogate/Entry register
and also to a field set aside for comparison against the
set of conditional branch codes which occupy a special
set of preloaded words. When one of these codes is
found, the array of six fields in the I/E register is
used to interrogate the rest of associative memory
which holds the arrays already found, and the appropriate entry or updating of usage is performed. The
I/E register is reset to zeros, and the next operation
code starts a new sequence.
The sequence may go beyond five codes before a
conditional branch is found. In that case, the seventh
code takes the place of the first, and so on until a
conditional branch is found.
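The filling of the six-field I/E register, including the end-around overwrite, can be sketched as follows. The sketch assumes that operation code zero never occurs in the instruction stream (so that zero can mark an empty field) and that branch codes appear only as terminators; `collect_mixes` and `decode` are hypothetical names.

```python
FIELDS = 6  # one-byte fields per associative memory word


def collect_mixes(opcodes, branch_codes):
    """Return a dict mapping each stored six-field word to its usage count."""
    mixes = {}
    register = [0] * FIELDS  # the I/E register; 0 marks an empty field
    position = 0
    for code in opcodes:
        register[position % FIELDS] = code  # the seventh code replaces the first
        position += 1
        if code in branch_codes:
            word = tuple(register)
            mixes[word] = mixes.get(word, 0) + 1
            register = [0] * FIELDS  # reset to zeros for the next sequence
            position = 0
    return mixes


def decode(word, branch_codes):
    """Recover the codes of a stored word in execution order."""
    branch_at = next(i for i, code in enumerate(word) if code in branch_codes)
    if 0 in word[branch_at + 1:]:
        return word[:branch_at + 1]  # no wrap: the sequence is as recorded
    return word[branch_at + 1:] + word[:branch_at + 1]  # read end-around


BRANCHES = {0x47}
mixes = collect_mixes([1, 2, 3, 4, 5, 6, 7, 8, 0x47], BRANCHES)
```

In this example the stored word is (7, 8, 0x47, 4, 5, 6), and `decode` returns the last six codes in the order they were executed.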
Results
Each instruction strobe initiates a test to find if a
branch was taken for one of the prespecified operation
codes. These need not be the entire set of the host
computer.
If no actual branch is found, the running count is
incremented by selecting the word where it is stored
with an interrogation for its code. The field is read
out of supplementary storage, routed through the
incrementer, and restored in the same word.
If the branch has taken place, the running count is
routed into the I/E register where it becomes the path
length. The combination of operation code and path
length then is either stored, or if already in storage,
causes an increment to be made to its frequency field.
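A minimal Python sketch of this counting scheme follows. It assumes a single running count that is reset after each taken branch and that does not include the branch instruction itself; neither point is stated in the text, and the function name `path_length_distribution` is hypothetical.

```python
from collections import Counter


def path_length_distribution(trace, monitored_codes):
    """Tally (operation code, path length) combinations.

    trace           -- iterable of (opcode, branch_taken) pairs, one per strobe
    monitored_codes -- the prespecified branch operation codes
    """
    frequency = Counter()
    running = 0
    for opcode, taken in trace:
        if taken and opcode in monitored_codes:
            frequency[(opcode, running)] += 1  # store, or bump the frequency
            running = 0                        # assumed: the count starts over
        else:
            running += 1
    return frequency


trace = [
    (1, False), (2, False), (0x47, True),
    (1, False), (0x47, False), (3, False), (0x47, True),
    (1, False), (2, False), (0x47, True),
]
distribution = path_length_distribution(trace, {0x47})
```

A branch instruction that is executed but not taken simply adds to the running count, as in the second group of the example trace.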
Each word contains one mix of six or fewer operation codes. The terminating conditional branch code
may occupy any of the six fields, but if there is at
least one zero after it, the entire sequence is as recorded; if not, the preceding five codes are read in
"end-around" fashion.
Algorithm for finding short sequences
O - Operation code received at monitor interface
B - Pre-stored branch operation codes
Fields in storage (examples shown in the figure): the set of branch
codes; a mix of four codes in a sequence of four; and a mix of ten
codes in a sequence of at least sixteen.
Application 3: Short sequences and mixes
Problem
Knowledge of instruction mixes can be an important
factor in the planning of new systems. There are a
number of ways in which the collection of mix data
can be specified, all involving some form of sequence
following or finding. In this example, the problem is to
find what operation codes immediately precede the
conditional branch types of instruction, up to a maximum of six including the branch.
Procedure
One word of associative memory is to be used for
each mix, with the operation codes distributed across
the word in six fields of one byte each. As operation
Successive operation codes are placed in successive
fields across the I/E register, by means of a string
of macroinstructions differing only in the routing
microinstruction. When a branch operation code is
received, a common routine is followed to add the new
mix to storage or increment the usage field of an existing
mix.
complete test record in a format permitting an item
by item comparison with results of tests of variations
of load or system.
Application 4: Long sequences
Algorithm for following long sequences
Problem
One way of determining the performance of a system
is to see how often prespecified sequences of events
occur. In this example an operating system is to be
tested with a known load to determine if predicted
sequences of supervisor calls, interrupts, and object
programs are being followed. The sequences may be
very long, may overlap or include each other, and
may start or end with any arbitrary element.
The change to a new current PSW represents a step
in the sequence, and can be detected by the fact that
there is an interruption in the host system or that a
LOAD PSW instruction is executed. The address of
the PSW identifies the sequence element and is obtained from the monitor interface whenever a change
occurs.
Procedure
Associative memory is preloaded with the sequences
to be followed, the elements of each sequence being
placed in successive memory words. In the word the
code for each element occupies one field, in this case
24 bits of address. The word also contains two single-bit fields, one of which contains a one for the start
and the other a one to indicate the end element.
This procedure makes use of a special interrogation
operation for associative memory in which, when a
word is matched, the next succeeding word in physical
order is selected for the entry of data. In this case, a
status bit is entered after this form of interrogation
in order to keep track of progress through the sequence,
and the crucial interrogation is made simultaneously
on the address and status bit. If the interrogation is
successful after the next element has been received,
the status bit is moved to the next word.
In addition to recording successes in traversing
complete sequences, statistics can be compiled on
partial traverses in the words of supplementary storage
corresponding to intermediate sequence elements.
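The status-bit procedure can be sketched in Python as shown below. The list of dictionaries stands in for the preloaded associative memory, the `selected` set stands in for the selection left by the interrogate-and-select-next operation, and the completion counts stand in for statistics kept in supplementary storage. All names are hypothetical, and the loops only imitate what the memory does in parallel.

```python
def follow_sequences(sequences, stream):
    """Count complete traversals of each preloaded sequence.

    sequences -- dict mapping a sequence name to its list of element codes
    stream    -- iterable of element codes as received from the monitor
    """
    # Preload: one word per element, in physical order, with start/end bits.
    words = []
    for name, elements in sequences.items():
        last = len(elements) - 1
        for i, code in enumerate(elements):
            words.append({"code": code, "begin": i == 0, "end": i == last,
                          "status": i == 0, "seq": name})
    completed = {name: 0 for name in sequences}

    for code in stream:
        # 1. Expected last elements that match: update statistics, reset status.
        for word in words:
            if word["end"] and word["status"] and word["code"] == code:
                completed[word["seq"]] += 1
                word["status"] = False
        # 2. Match on code and status; select the next word in physical order.
        selected = {i + 1 for i, word in enumerate(words)
                    if word["status"] and word["code"] == code
                    and i + 1 < len(words)}
        # 3 and 4. Clear all status bits, then enter ones into the union of
        # the selected words and all first elements.
        for i, word in enumerate(words):
            word["status"] = word["begin"] or i in selected
    return completed


counts = follow_sequences({"AAB": list("AAB"), "ABCD": list("ABCD")},
                          "GHAAAABCDE")
```

Because every first element is re-armed on each step, overlapping sequences and sequences that begin in the middle of another are followed independently.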
Results
At the end of the test, associative memory will
contain the sequences tested for, and supplementary
storage the record of how well these sequences were
followed. The sequences could then be printed as a
C - Code for sequence element
B, E - Start and end
S - Status bit
Blocks shown: NEXT CODE; INTERROGATE; UPDATE STATISTICS + RESET M.I.'S; INTERROGATE NEXT; SET 0; INTERROGATE; ENTER
Initially, and at the end of each cycle, the status
bits are set to one for all first elements.
When the next code is received from the monitor
interface, an interrogation is first made to find out if
that code matches any expected last elements of sequences so far successfully followed. If so, the statistics are updated and that element is reset to zero
status (without affecting other elements in that sequence).
The same code then is used to interrogate the set
of all elements whose status bit is one. This operation
uses the INTER NEXT operation to prepare for the
eventual entry of a one in the status bit of the next
word. Figure 6 shows the match indicators turned on
for the word actually matched.
Zeros are then set into all status bits, regardless of
the match indicators, and without resetting them. This
step clears any elements which may not have been
matched with this last code.
Finally, all first elements are selected for entry by
the use of a normal interrogate operation. This selection is OR'd with the selection obtained by the
INTER NEXT operation above, so that ones can now
be entered into the union of the two sets.
Figure 6-Steps in following the sequence
GHAAAABCDE
Application 5: Block diagramming
Problem
In debugging or in evaluating the performance of a
program it is important to know whether program
segments are executed in the proper order, how much
time is spent in each segment, how well they were
overlapped with channel activity, and if execution
was forced to wait. Although one or a few segments
might be singled out for examination by methods
similar to those of the preceding applications, there is
difficulty in predicting where and what to look for,
and a chance of missing something significant.
If every instruction address were paired with its
successor in the instruction stream and the combination
applied to associative memory, eventually the memory
would contain all the links between instructions for
that program. However, most instructions have unique
successors, and the technique would waste memory
space on redundant information. The essential information is contained in just those linkages from or to
instructions which have several successors or predecessors. These linkages can be identified from addresses
and operation codes in the instruction stream.
Procedure
Each word of associative memory contains three
address fields, the "entry," "exit," and "destination."
The entry and exit addresses are the first and last of a
block of sequential instructions, and the destination
is the entry of a succeeding block, so that each stored
word represents one linkage in the logical structure of
the program.
Certain addresses are identified as exits when they
occur in the instruction stream accompanied by a
branch operation code. The first address after an exit is
automatically an entry to a current block, which will
occupy one of four possible relationships to blocks
already found. As the entry and succeeding addresses
appear in the instruction stream, they are compared
with previously stored entries and exits to resolve
whether the current block is new or one being retraced,
or whether either the current block or an old block is
to be partitioned.
As execution of the program proceeds, with repetitions of its segments, most of the linkages will be
followed one or more times, and the corresponding
division of the address stream into blocks will be
established. When these elements are found or repeated,
their time and usage is noted, and channel and wait
statuses are correlated with them, using supplementary
storage for this additional data.
Results
It can be shown that each conditional branch instruction will result in at least two, and no more than
four linkages, and that the number of blocks established
by the branch is always one less than the number of
linkages. Since one word of storage is required for each
linkage, approximately 2700 blocks can be recorded in
a 4096-word memory. Depending upon the complexity
of the program's structure, the memory can cope with
programs of between 6,000 and 16,000 instructions.
At the conclusion of a block diagramming evaluation,
associative memory will contain the structural composition of the program according to its actual execution, and supplementary storage will contain the
statistics correlated with each structural element. The
standard presentation of this information would be a
listing of the blocks with their exit linkages governing
their order.
Once the information has been collected, other output procedures can be used to meet special requirements. For documentation of the program, it may be
desirable to present the block diagram in pictorial
form, using the host computer to compute and print
the diagram. When the program is being optimized by
trial, it will not always be necessary to print out the
entire listing, but only the more time-consuming
elements.
Detailed description of procedure
If an instruction is a conditional branch, the first
time its operation code is found in the instruction
stream, it is recognized to have the potential for a
different successor in some future execution and therefore it is recorded as the "exit" of a block. Its successor
of the moment is one "destination" and also an "entry"
to another, or possibly the same, block. The basic
record thus consists of three addresses, identifying the
entry, exit, and one destination of the block.
When a conditional branch identifies the next address as an entry to a block, this block may intersect
some block already derived from the instruction stream.
There are four possible relationships of a current block
to blocks already traced out, as shown in this diagram
Diagram (along a time axis): a previously stored block running from entry Np to exit Xp, and the four current block possibilities, numbered 1 to 4, each running from entry Nc to exit Xc.
In order to partition the intersected block discovered
in this case, it is necessary to determine the address
one location less than the current entry. This exit is
not computable exactly when variable-length instructions are being executed, but it might occur again
in the instruction stream and be recognized because its
successor matches the entry in question. To cause this
to occur, a flag is added to the intersected block, removing it from use by the algorithm, so that if the
block should be repeated from its original entry, the
situation will resolve itself into case 2.
The flagged block might include an initializing routine
which is never repeated, and the block will contain
time and status data which cannot be distributed to its
partitions. Therefore, the flagged block is retained for
the ultimate readout and presentation of results.
Special operations in the program, such as multiway
branches, cause no difficulties to the operation of the
algorithm when they are based on recognized operation
codes. If the program changes an operation code to a
branch, as mentioned in case 3 above, the algorithm
must be altered to take into account some cases in
addition to the four cases described. An algorithm
which makes use of addresses only, and is thus unaffected by a changed operation code, has been worked
out by the author but is not included here.
In the first possibility, none of the addresses from
the current entry, Nc, through the current exit, Xc,
will be found to match any previously stored entries
or exits, Np or Xp; the block is therefore new and can
be added to the store.
A current entry may not be recognized, but may be
followed eventually by an address which does match
some previously stored entry. The address just previous
to that matching Np becomes the current exit of a
block, as shown in 2. above, and the block is recorded
with Np as destination. The program will continue
by repeating <Np Xp>, because Xc is not a branch.
The destination of a block may be to an entry already
recorded, as shown in case 3. Assuming that no change
of operation code has taken place, the same exit must
follow, and the block need not be recorded again unless
the destination is different. Eventually, in the program's
execution only case 3 will be found.
If a branch, conditional or unconditional, has led to
a new entry within a block, as shown in case 4, this
fact will not be known immediately. However, sooner
or later an address will match the exit, Xp, to signal
the condition. The current block can be added to the
store, but the previous block is intersected by it.
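The four relationships can be told apart by scanning the addresses that follow a current entry, as in the Python sketch below. This is a simplified reading of the cases as described here, not the recorder's algorithm: it ignores operation codes, flags, and destinations, and the function name `classify_block` is hypothetical.

```python
def classify_block(addresses, stored_blocks):
    """Return the case number (1 to 4) for a current block.

    addresses     -- instruction addresses in execution order, beginning with
                     the current entry Nc
    stored_blocks -- iterable of (entry, exit) pairs, i.e. (Np, Xp)
    """
    stored = list(stored_blocks)
    entries = {entry for entry, _ in stored}
    exits = {exit_ for _, exit_ in stored}

    first, rest = addresses[0], addresses[1:]
    if first in entries:
        return 3  # the entry is already recorded: a block being retraced
    if first in exits:
        return 4  # entered a stored block at its last instruction
    for address in rest:
        if address in entries:
            return 2  # ran into a stored entry: current block ends just before
        if address in exits:
            return 4  # reached a stored exit from inside: blocks intersect
    return 1      # no match from Nc through Xc: a new block


stored = [(100, 120)]
cases = [
    classify_block([200, 204, 208], stored),
    classify_block([50, 54, 100], stored),
    classify_block([100, 104], stored),
    classify_block([110, 114, 120], stored),
]
```

With one stored block from 100 to 120, the four example streams fall into cases 1, 2, 3, and 4 respectively.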
ACKNOWLEDGMENTS
The author expresses his appreciation to Mr. R. R.
Seeber for his advice and encouragement on this
project.
The associative memory for the System Logic and
Usage Recorder was designed by Mr. A. W. Bidwell,
and system design, construction, and debugging were
performed by Messrs. F. E. Jordan, H. L. Wetzel, and
G. T. Manelski.
Block Diagramming Algorithm
N - Entry address
X - Exit address
D - Destination address
O - Operation code
S - Status bit: 1 for new block
F - Flag bit: 1 for intersected block (case 4)
(statistics are held in supplementary storage)
Implementation of the NASA modular
computer with LSI functional characters
by J. J. PARISER
Hughes Aircraft Company
Fullerton, California
and
H. E. MAURER
NASA Electronics Research Center
Cambridge, Massachusetts
INTRODUCTION
The NASA Electronics Research Center (ERC) in
Cambridge, Massachusetts, has undertaken a broad
program to satisfy flight computer system requirements
for future missions, including versatility and long term
reliability. Specific attention to these requirements is
necessary because flight qualified aerospace computers,
and even some still under development, have been
designed for increased computational speed and
arithmetic capability, but not for the long life reliability
and application flexibility that will be required for
future space missions.1,2 For example, the mean time
between failure (MTBF) of available aerospace computers lies in the range of 2,000 to 5,000 hours, whereas
long space missions will require an MTBF of 10^6 hours.
Several computer organizations have been described
in the literature which include redundancy for increasing mission reliability, but still neglect applications versatility.3,4 Some non-spaceborne computers
of the array or multiprocessor type are currently being
developed.5,6 These systems, although potentially capable of meeting ERC's versatility and reliability objectives, lack design features for space applications
(component reliability, weight, volume, radiation
hardness, etc.).
This paper describes the architecture of a modular
computer which can be configured to operate as a
number of parallel processors, with each segment or
column solving an independent problem that may be
different or identical. Each column in turn contains a
number of blocks called modules, which may be configured so as to form patched columns, using modules
from different physical locations; for example, a diagonal (see Figure 1). This structure meets the high
speed computational requirements for attitude control
associated with strap down systems, and also achieves
the reliability required for long time mission success.
The modular computer requirements have been derived through simulations which yielded speed, word
length, and memory requirements.
A breadboard model consisting of two columns has
been built and is currently in the terminal stage of
system checkout. Software is being developed concurrently with hardware. This Modular Computer
Breadboard (MCB) will be used for experimenting with
different structures in order to enhance the NASA
ERC modular computer objective. The body of this
paper describes the LSI implementations of the modular
computer, with requirements and organization given
in the following sections.
The NASA modular computer requirements 7 *
* A summary is included here for easy reference.
The functional design requirements can be characterized by high probability of success over a short period
for high speed computations and survival for long
periods at low computation rates.
Figure 1-NASA modular computer showing columns and modules (an array of modules, MODULE 1,1 through MODULE n,n, arranged in n columns)
The Modular Computer, as a potential component
of a guidance and navigation subsystem of several
potential space booster configurations, must be applicable to at least four distinct missions: the synchronous
satellite, lunar orbiter, Mars orbiter, and Jupiter fly-by
solar probe. Computer memory size, word length, and
speed requirements for each phase of these four missions
have been estimated by means of computer simulations.
The object computer was assumed to have single-address and sequential operation.
Figure 2 shows the computational requirements as a
function of injection velocity accuracy. Next to reliability, computational speed is the most critical parameter. Only one set of curves is shown for all missions
since it has been assumed that the guidance computational requirements up to and including injection are
the same for all missions. The speed (instructions per
second) axis represents equivalent additions per second
at a rate of 1 multiply equals 6 adds. The memory
requirements include approximately 1,400 words for
executive and IO operations, for a total of 12,800 words.
In terms of physical parameters, it is estimated that
radiation, temperature, and computer operability
requirements represent the most critical environmental
conditions which the modular computer must meet.
The proposed trajectories could subject the spacecraft
to 3 to 48 hours of 1-MeV electron flux of 10^9 e/cm^2-sec
and 80-MeV proton flux of 10^7 p/cm^2-sec. Representative calculations of anticipated ambient thermal
environments clearly indicate that an environmental
control system is needed. The mission time requirement
for navigation varies from six hours for the synchronous
satellite to 436 days for the Jupiter fly-by. These times
pose stringent reliability requirements.
Figure 2-Computational requirements for injection into parking orbit (curves for word length, memory size, and speed plotted against injection velocity accuracy in fps)
The modular computer architecture
Design philosophy8
The most severe requirements in terms of speed and
accuracy occur during boost.7 Post injection computational requirements are low and the accuracy of computations is far less critical. Therefore, to satisfy the
composite requirements a Modular Computer (MC)
organization as shown in Figure 3 has been structured.
Each column of the MC can satisfy the 1.5 X 10^6 instructions/sec requirement.
During boost, three columns of the modular computer
operate concurrently in a triple modular redundant
(TMR) mode, with majority voting at the outputs.
After orbit injection, the TMR mode is terminated
and the ensemble of modules is configured so that only
one computer remains operating; the others are turned
off to conserve power and improve reliability.* System
interlocks are provided which insure that the on-computer performs correctly (within bounds). If this
is not the case, the Configuration Assignment Unit
(CAU) is triggered. It is the task of this unit to assemble
at least one computer out of all the available modules.
The availability of good modules is determined by
means of hardware-software tests with interlocks. As
may be seen from Figure 3, each of the computers has
been separated into four functional modules: a Memory
Unit, Control Unit, Arithmetic Unit, and an IO Unit.
The Configuration Assignment Unit (CAU) in conjunction with the CU, together with the Configuration
Control Switches (CCS), can automatically reconfigure
the ensemble so as to form an operating computer.
Such a computer may consist of any combination of
MU-i, CU-i, AU-i, IO-i.
* The failure rate of non-operating circuits is assumed to be
lower than that for operating ones.
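Majority voting over three column outputs, as used in the TMR mode during boost, can be illustrated with a bitwise voter. The paper does not describe the voter circuit, so the two-out-of-three bitwise form and the names `vote3` and `dissenting_columns` are assumptions.

```python
def vote3(a, b, c):
    """Bitwise two-out-of-three majority of three output words."""
    return (a & b) | (a & c) | (b & c)


def dissenting_columns(outputs):
    """Return the indices of columns whose output differs from the vote."""
    voted = vote3(*outputs)
    return [i for i, value in enumerate(outputs) if value != voted]


# Column 2 delivers a faulty word; the vote masks it and identifies it.
outputs = (0b1010, 0b1010, 0b0110)
result = vote3(*outputs)
faulty = dissenting_columns(outputs)
```

A single faulty column is outvoted in every bit position, which is why three columns suffice to mask one failure.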
The breadboard version of the modular computer
contains two columns. This is sufficient for the intended
experiments:
1. Determination of mission algorithms within
specified accuracy limitations and consistent
with the intended application.
2. The use of parallel processing to achieve higher
effective computational speed.
3. Automatic detection and isolation of the occurrence of a computer module failure, and automatic reconfiguration to eliminate the effects of
the faulty element.
Computer structure
Although Figure 3 shows a tri-column configuration,
the actual flight computer may require additional
columns and some configuration adjustment in order
to meet the mission time requirements.9
In general terms, the modular computer consists of:
k-Configuration Assignment Units (CAU)
one set of Configuration Control Switches (CCS)
m-Control Units (CU)
n-Arithmetic Units (AU)
p-Input Output Units (IOU)
q-Memory Units (MU)
r-Power Supply Systems
The values of k through r are determined from
reliability requirements and configuration alternatives.
In the preliminary design, k = r = 1 and m = n = p
= q = 3. This configuration will be adjusted as required.
The configuration assignment unit (CAU)
The Configuration Assignment Unit controls the
switches which interconnect the various modules to
produce the necessary computer or computers. The
CAU monitors CU requests for changes in the computer's configuration and, based on a predefined test,
may accept or reject these requests. It determines if
no operating "computer" exists and then establishes
new configurations until a working "computer" is
assembled. The CAU contains registers which permit
communication between control units. CU interrupts
are generated in the CAU by means of Status and
Mask registers. The system clock is also located in
the CAU. The primary tasks of the CAU are:
1. To validate requests for change from a CU by
monitoring the elapsed time and the result of a
diagnostic, and then accepting and implementing the request.
2. To connect all possible configurations one at a
time until one operating computer is found,
based on diagnostics.
3. To initiate an interrupt in a newly configured
computer to start a diagnostic.
4. To provide and monitor a counted delay of
about 30 seconds which, if not reset in time,
will be interpreted as the absence of a working
computer, which will initiate task 2 above.
5. To maintain configuration and status information during a shut-down if power is maintained to the CAU. When power is restored,
the two previously stored configurations will be
exercised first to locate an operating computer.
If these fail, task 2 above is initiated.
6. To accept from the executive CU requests for
changes in IO configuration.
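The search behaviour described in tasks 2 and 5 can be sketched as follows. This is only an illustration under stated assumptions, not the CAU's actual logic: the function name `find_working_computer`, the module labels, and the `passes_diagnostic` callback are hypothetical, and the real unit is hardware that relies on a timed diagnostic rather than a software callback.

```python
from itertools import product

ROWS = ("MU", "CU", "AU", "IOU")  # one module of each kind forms a computer


def find_working_computer(modules, passes_diagnostic, stored=()):
    """Return the first configuration whose diagnostic passes, or None.

    modules: dict mapping each row name to a list of module labels.
    passes_diagnostic: hypothetical callback standing in for the
        interrupt-started diagnostic of task 3.
    stored: previously stored configurations, exercised first (task 5).
    """
    tried = set()
    # Stored configurations first, then every combination one at a time.
    candidates = list(stored) + list(product(*(modules[r] for r in ROWS)))
    for config in candidates:
        if config in tried:
            continue
        tried.add(config)
        if passes_diagnostic(config):
            return config
    return None


# Preliminary design: three modules per row (m = n = p = q = 3).
modules = {r: [f"{r}-{i}" for i in (1, 2, 3)] for r in ROWS}
failed = {"CU-1", "MU-2"}  # assumed failures, for illustration only


def healthy(config):
    """A configuration works only if it contains no failed module."""
    return not failed.intersection(config)


print(find_working_computer(modules, healthy))
```

With three modules in each of the four rows there are 3^4 = 81 candidate configurations, so an exhaustive search of this kind stays small.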
Configuration control switches (CCS)
As seen from Figure 3, the CCS's provide a path
between any module in a row and any module in the rows
immediately above and below. In addition, the switches
provide for traffic between the CU and IOU modules.
All paths are under the control of the CAU.
Control unit (CU) and Arithmetic unit (AU)
The Control Unit determines the sequence of operations within the computer, which consists of one or
more MU's, one AU, and any applicable IOU; i.e., all
computer memory, arithmetic, and input/output operations are under the control of the CU. As is seen from
Figure 3, the traditional ACP (Arithmetic and Control
Processor) has been split into separate functions of
CU and AU. This is done to enhance processing speed
and long term reliability. Each unit has a set of 16
temporary registers much like the multi-usage registers of third generation computers, except that there
are three index registers which are separate and distinct
in addition to the temporary registers. The AU and
CU operate concurrently. The AU accepts data and
instructions from the coupled CU and executes these
instructions under internal control, making the results
available to the same CU. Two's complement arithmetic, both floating and fixed point, is included.
The input output units (IOU)
The Input Output Units (IOU) are of the direct
memory access type, which provide cycle-stealing access for IO transfers. Each IOU provides two input and
two output channels. The IOU contains two registers
which can be loaded by the CU. These registers hold
the priority and normal operation control words. When
data for the priority channel is absent, the normal
transaction is served.
Figure 3-Modular computer organization
[Rows AU-1 to AU-3, CU-1 to CU-3, MU-1 to MU-3, and IOU-1 to IOU-3 joined by CCS switches; the CAU; primary and standby power system feeding all modules]
The memory unit (MU)
The Memory Unit (MU) receives, parity checks, and
stores incoming data in the assigned address. The address is also checked for parity. At present, each memory unit can store 4,096 words of 36 bits each. A read-restore cycle is completed in 1 microsecond. Each
memory is addressable by any CU, as permitted by
the CAU. Two CU's are not allowed to be associated
with one MU. The CAU may permit a CU access to
more than one memory. Memory access is through a
combination of sequential and priority control. First
access is assigned to data from the IOU, while second
priority is assigned to the CU.
LSI implementation of the modular computer
Overview
Size, power, and reliability constraints demand that
the modular computer be implemented with LSI
circuits, but the question of how to achieve an LSI
implementation remains. To date, several approaches
to logic partitioning for LSI have been reported, ranging
from the conventional approach, where partitioning is
done after the logical equations have been written, to
the "cellular" type approach, where a group of logical
gates are structured to be programmed on the cell to
form specific functions.9,10,11,12
The conventional approach includes both manual
and automatic partitioning. This approach appears
undesirable for the modular computer implementation
because the design process tends to be lengthened18 and
the number of LSI chip types tends to increase, particularly as applications are broadened outside of the
computer proper. A small number of LSI chip types
is an important factor towards achieving the very
tight quality and process controls required for the
realization of very low component failure rates. The
latter is a must for long time mission reliability.
In the cellular approach, the cell design is such that
all combinations of n variables must be implementable
in order for the cell to be of universal use. Proofs have
been developed showing that such a cell can indeed
form all functions of n variables. The cell, although a
universal device, still requires the process of writing
logic and determining which paths in the cell structure
should be connected or cut (physically or logically)
in order for the universal cell to assume the unique
logic posture specified by the logic designer.
A functionally organized set of building blocks with
predetermined* logic interconnects has been chosen
for the modular computer implementation. This set,
called functional characters, tends to satisfy the requirements of a small number of LSI types.
The set of characters, 10 in all, was selected through
a pragmatic approach to logic partitioning. As for the
cellular technique, the characters have predetermined
logic interconnects but do not require restructuring of
interconnections in order to achieve the logical design
objective. The design process with functional characters
is analogous to programming using a compiler. The
characters are analogous to compiler statements. The
designer specifies inputs, outputs, and control for each
character's micro-operation. Micro-programming is
used as the control structure. Three of the 10 characters
comprise the micro-program store. Perhaps designing
with pre-specified large functions without the utilization of Boolean equations marks the greatest departure and contribution of the functional characters.
No attempt is made to demonstrate that a character
or the set can implement all combinations of n variables.
All combinations are not required in order to build
effective computing machines. The design philosophy
permits the introduction of new characters if the
existing ones are shown to be ineffective.
The ten functional characters exist as logical blocks
containing approximately 350 gates per block. These
blocks can be subpartitioned into smaller blocks with
fewer gates per block or chip, whereby several smaller
blocks would compose a functional character (see Table
V). A reduced-width set of functional characters has
been breadboarded using conventional IC circuits.
This demonstrated the modularity and versatility of
the characters.
The characters can be implemented with LSI circuits,
using cellular or threshold logic, or any other appropriate technique. An overview of the characters is presented here.** Statistics are given comparing the functional character design of the Modular Computer
Breadboard with the implementation utilizing custom
logic design and partitioning, as found in the implemented Modular Computer Breadboard (MCB). Regrettably, there is no means for a one-to-one comparison
using identical stages of MCB implementation. To
the extent practical, the comparisons address the same
system parameters. The comparisons assume that all
cards of MCB containing IC's have been converted to
equivalent LSI chips.
Description of the functional characters
The functional character set is a group of logic
arrays forming a self-sufficient family of building blocks
that reduce computer design to a determination of
character types and number, followed by microprogramming of the set. Ten character types have been
shown to be sufficient for the building of both special
purpose and stored program general purpose digital
equipments. These characters are:
G1   Register storage
P1   Scratch pad memory
L1   General logic
L2   Arithmetic logic
L3   Input/Output
M1   Micromemory sequencer       }
M2   Micro-instruction register  }  Microprogram memory
MM   Micromemory array           }
P2   Up/Down counter
P3   Switch
Table I shows the gates, pins, and gates/pin ratios for
these functions.
Characters of the same letter are logically grouped
into a common unit, as illustrated in Figure 4. This
arrangement extends the register count and word
length. The complexity of logical operation can also
be extended by the cascading of characters. Several
microprogram strings can be executed simultaneously.
The micromemory function was divided into three
characters in order to provide for greater versatility.
The array can be adapted to different size programs.
The instruction register may be cascaded using two or
more M2 characters, and still operate under a single
sequencer control.
* The logic of the block is designed prior to the computer design.
** More detailed discussion on the subject is found in the paper
by F. D. Erwin and J. F. McKevitt of this Proceedings.
Figure 4-Typical functional character configuration
TABLE I-Composition of the ten character types
sufficient for building special purpose and stored
program GP digital equipments
Character             Gates   Pins   G/P
G1 General Register   224     62     3.4
P1 Scratch Pad        Depends on system architecture; (8 X 16) bits/block
L1 Boolean            274     145    1.8
L2 Arithmetic         250     77     3.3
L3 I/O                377     149    2.5
M1 Sequencer          348     91     3.8
M2 Instruction        323     131    2.5
MM Array              Depends on size of program; 2048 bits/block
P2 Up/Down Counter    147     81     1.8
P3 Switch             210     118    1.8
Functional character implementation of the
modular computer breadboard (MCB)
The functional character appears to have a broad
range of applications. This was demonstrated in the
study by implementing an A to D converter, a DDA, and the
modular computer.16 For the purpose of evaluation,
the breadboard version (MCB) was implemented using
the functional characters. The MCB is a two column
configuration of the modular computer. The functional
and operational aspects of the MCB have been preserved in the functional character implementation.
However, the implementation detail was tailored to
the functional character set. This includes the grouping
of registers into memory arrays and complete microprogramming, which are not part of the MCB.
Figure 5 shows the block diagram of the existing
MCB. This diagram has been overlaid with the
character implementation as shown in Figure 6. Note
the P3 blocks of Figure 6 are equivalent to the switches
(CCS) of Figure 5. In the block diagram form, the
MCB implementation using functional characters is
depicted as an assemblage of characters each under
microprogram control. The microprogram resides in
the micromemory, which consists of the MM, M1, and
M2 characters. The word length is determined by the
number of juxtaposed characters of the same type.
In general, the characters are 8 bits wide. The P1
character is 16 bits wide. The number of G1 or P1
rows identifies the number of registers of the P or G
type. The G type operates directly under microprogram control, whereas the P type operates indirectly
under microprogram control. The G1 character contains four registers for a 4 X 8-bit array. The P1 character contains 16 registers for a 16 X 16-bit array.
Figure 6 also shows the character content of each
module adjacent to the name of the module. The number
to the left of the slash (/) is the total number of characters used per module, regardless of type. These numbers
represent the first microprogram pass referred to in
Table III. The number to the right of the slash is the
number of character types used in each module regardless of the number of modules. Note that the number
of characters is additive, whereas the number of character types is not; the sum of the character types is 10.
Evaluation of the functional character design
Table II shows the comparison data of the functional
character implementation versus the existing MCB
implementation. As may be seen, in all aspects, except
gates committed**, the functional character implementation results in a significant improvement over the
existing MCB design. The number of gates committed
is 35 percent higher for the functional character approach. In the LSI area, the tradeoff will no doubt
recognize the functional character approach as significantly superior. An increase of 35 percent in the number
of gates committed is a small price to pay for the
reduction in the number of chip types and pins.
As will be shown later for reliability purposes, a
small number of pins in the system is far more important
than a small number of gates, all other factors
being equal. As seen in Table II, the number of pins
required for the MCB implementation is 2.6 times the
number required for the functional character implementation.
As the column headings show, the comparison in
Table II is made between two LSI implementations:
one representing the functional character technique,
the other representing the conventional approach
where every MCB card containing X number of IC's
has been converted to an equivalent IC with the number
of card terminals becoming the equivalent LSI package
pins.
** "Committed" rather than "used" is the proper descriptor since
some gates on the chip or conventional card are unused but yet
they are committed by virtue of being part of the chip or card.
Figure 5-Block diagram of existing modular computer breadboard overlaid with the character
implementation shown in Figure 6.
[Modules AU-1, AU-2, CU-1, CU-2, MU-1, MU-2, IOU-1, and IOU-2 joined by CCS switches; CAU; control panel; power supply]
The implementation with the functional characters
resulted in a 35 percent greater throughput. This is
because the functional character assumed a 32 percent
faster gate. For equal gate delays the two implementations would yield approximately equal throughput.
The most significant point from a quality control
point of view is that the entire computer was implemented with ten character types. Three of these belong
to the "micromemory" domain used for microprogramming
of the computer modules. The micromemory array (MM) is the storage element which contains
the control information. If permanent memory is used,
it may be necessary to generate the desired information
content on a number of different chips. However,
effort is being expended in industry towards producing
electronically alterable, read only memory arrays.17
Figure 6-Functional character implementation of MCB
[Character content per module, given as total characters/character types: AU-1 25/7, AU-2 25/7, CAU 39/7, CU-1 38/9, CU-2 38/9, MU-1 11/6, MU-2 11/6, IOU-1 17/6, IOU-2 17/6, each CCS 2/1 (P3, P3)]
Progress to date shows that there is promise of being
able to use only one array with identical metalization
patterns. This array will be encoded with the proper
information content at the time of use.
TABLE II-Comparison of functional character
implementation and existing MCB
implementation16
                     Functional       MCB Implementation     Percent Improvement   High/
                     Character        Assuming Each Card     Over MCB              Low
Item                 Implementation   Is An LSI Chip         Implementation        Ratio
                     (Units)          (Units)
Types                10               23                     +56                   2.30
Cards (LSI Chips)    206              554                    +63                   2.70
Pins Committed       18,200           47,600                 +62                   2.62
Gates Committed      47,200           35,000                 -35                   1.35
Gates/Pin            2.6              0.75                   +250                  3.47
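The derived columns of Table II can be re-computed from the two unit columns. The sketch below is a cross-check only; the function names and the convention that a smaller count is better for every row except Gates/Pin are assumptions made here, and the results agree with the printed figures only to within the rounding used in the table.

```python
# Table II unit columns: (functional character, MCB as LSI, lower_is_better)
TABLE_II = {
    "Types": (10, 23, True),
    "Cards (LSI Chips)": (206, 554, True),
    "Pins Committed": (18_200, 47_600, True),
    "Gates Committed": (47_200, 35_000, True),
    "Gates/Pin": (2.6, 0.75, False),
}


def percent_improvement(fc, mcb, lower_is_better):
    """Improvement of the functional character value relative to the MCB value."""
    change = (mcb - fc) if lower_is_better else (fc - mcb)
    return 100.0 * change / mcb


def high_low_ratio(fc, mcb):
    """Ratio of the larger entry to the smaller one."""
    return max(fc, mcb) / min(fc, mcb)


for item, (fc, mcb, lower) in TABLE_II.items():
    print(f"{item:18s} {percent_improvement(fc, mcb, lower):+7.1f}%  "
          f"{high_low_ratio(fc, mcb):.2f}")
```

The only negative entry is Gates Committed, which matches the text's statement that the functional character approach commits about 35 percent more gates.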
It is reasonable to project that ten characters and
ten masks are sufficient to implement the MCB and
the majority of digital equipments. Other types of
equipment were implemented, including A to D and
D to A conversion logic and a DDA. All designs
utilized the same characters but different microprograms. The efficiency of gate usage was best in the
MCB implementation and worst in the DDA.15 It is
premature to conclude that a different character is
required for a more efficient implementation of the
DDA. The MCB design was optimized through remicroprogramming, but this was not done with the DDA
and A to D equipments.
Design with functional characters saves time. During
a six month period, the entire MCB was designed,
microprogrammed, and remicroprogrammed several
times. This illustrates the ease and speed of the design
process. The improvements gained through microprogramming are demonstrated in Tables III and IV.
Table III shows the improvements in terms of the
number of characters and character types required for
the two microprogram passes. The characters remained
unchanged. In this comparison, the configuration of
the MCB was identical with the presently implemented IC version.
TABLE III-Effects of microprogram improvement
on the functional character implementation
of the MCB
                        No. of Characters Used     No. of Character Types Used
                        Microprogram Pass          Microprogram Pass
Unit                    First     Subsequent       First     Subsequent
MU                      11        7                6         5
CAU                     39        38               7         7
CU                      38        35               9         9
AU                      25        21               7         7
Switches                8         8                1         1
I/O                     17        17               6         6
Computer Total System   229       206              10        10
Further improvements were gained, as shown in
Table IV, by restructuring the MCB with the appropriate
remicroprogramming. The AU and CU were
combined into one unit, eliminating some logic and the
switch between them. This reimplementation was
feasible with the functional character set due to the
more general nature of the characters as contrasted
with the custom implementation of the existing MCB.
TABLE IV-Effects of combining the
AU and CU of MCB
                          Functional Character        Same Except
                          Implementation of the       AU and CU Were
Parameter                 Existing Configuration      Combined
No. of Characters         229                         182
No. of Character Types    10                          10
Fixed Point Direct Add    9.9 us                      4.2 us
Fixed Point Add           11.6 us                     6.4 us
Fixed Point Subtract      11.6 us                     6.4 us
Inclusive or              11.5 us                     6.2 us
Exclusive or              11.5 us                     6.4 us
Logical and               11.5 us                     6.2 us
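The system totals of Table III can be reconciled with the per-module counts as sketched below. The reconciliation rests on two assumptions that are not stated outright in the text: the two-column MCB contains two each of the MU, CU, AU, and I/O modules plus a single CAU, and the Switches entry of 8 is already a system-wide count. The names used are illustrative only.

```python
# Characters per module for the (first, subsequent) microprogram passes.
PER_MODULE = {
    "MU": (11, 7),
    "CAU": (39, 38),
    "CU": (38, 35),
    "AU": (25, 21),
    "I/O": (17, 17),
}
# Assumed module counts for the two-column breadboard.
MODULE_COUNT = {"MU": 2, "CAU": 1, "CU": 2, "AU": 2, "I/O": 2}
# Assumed to be a system-wide total for all CCS switches.
SWITCH_CHARACTERS = (8, 8)


def system_total(pass_index):
    """Total characters in the system for pass 0 (first) or 1 (subsequent)."""
    modules = sum(
        counts[pass_index] * MODULE_COUNT[name]
        for name, counts in PER_MODULE.items()
    )
    return modules + SWITCH_CHARACTERS[pass_index]


first, subsequent = system_total(0), system_total(1)
print(first, subsequent)
print(f"saving from remicroprogramming: {100 * (first - subsequent) / first:.1f}%")
```

Under these assumptions the sums reproduce the 229 and 206 characters listed for the total system.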
Combining the AU and CU into one unit may affect
the long term reliability. This and curiosity about the
relative merits of multiprocessor structures, such as
the Hughes H4400 (currently being built), vs. modular
computers, such as the MCB, led Hughes to study
factors affecting long term reliability. In this study,
modules of equal complexity, with the exception of
the switches, were assumed. The results are presented
in Reference 9.
Several interesting points are worth mentioning here:
a. Multiprocessors have an improved short term
reliability, but the long term reliability is degraded somewhat.
b. Different configurations, or organizations, significantly affect long term reliability.
c. Component reliabilities (failure rate of the
characters) markedly affect the mission reliability.
d. The failure rates quoted for existing IC's of
10^-8 failures per gate-hour will have to be significantly reduced in order for either the multiprocessing or the modular computer organization
to reach the desired long time mission reliability
objectives.
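A rough order-of-magnitude check illustrates point d. The sketch below assumes a simple series model with constant failure rates in which any gate failure fails the system; that is a deliberate simplification made here, it ignores the redundancy and reconfiguration that the modular computer relies on, and it is not the analysis of Reference 9. The gate count is the 47,200 committed gates of Table II and the target is the 10^6 hours between system failures cited in the conclusions.

```python
def series_mtbf_hours(gates, failures_per_gate_hour):
    """MTBF of a non-redundant system whose gates all must work."""
    return 1.0 / (gates * failures_per_gate_hour)


def required_gate_failure_rate(gates, target_mtbf_hours):
    """Per-gate failure rate needed to reach a target MTBF without redundancy."""
    return 1.0 / (gates * target_mtbf_hours)


GATES = 47_200          # gates committed, functional character MCB (Table II)
QUOTED_RATE = 1e-8      # failures per gate-hour quoted for existing IC's

print(f"MTBF at quoted rate: {series_mtbf_hours(GATES, QUOTED_RATE):,.0f} hours")
print(f"rate needed for 1e6 hours: {required_gate_failure_rate(GATES, 1e6):.2e}")
```

On this simplified model the quoted rate gives roughly two thousand hours, several hundred times short of the goal, which is consistent with the statement that both better components and a better configuration are needed.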
Circuit realization of the functional characters
This section presents circuit considerations for the
LSI realization of the functional characters. The circuits
must not only reflect the correct logical functions but
also, because of the potential space applications,
satisfy the electrical, thermal, and mechanical constraints.
The circuit solutions are to be designed to reflect a
set of NASA design guidelines that are intended to
insure a high probability of mission success. These
guidelines are:
Gates per chip       -About 100, no more than 150
Circuit yield        -100% without yield enhancement
Conductor spacing    -0.1 mil minimum
Conductor width      -Current density not to exceed 10^6 amps/cm2
Metalization layers  -No more than 2
Circuit type         -Bipolar TTL
The 100 gate per chip function size limit reflects
the 100 percent yield and TTL technology constraints.
It is expected that LSI and TTL circuits containing
about 100 gates will be producible with 100 percent
yield. Other circuit technologies such as MOS may
accommodate a larger number of gates per chip.
As may be recalled from Table I, some functional
characters require about 350 gates per function. The
natural tendency would be to implement one character
per chip. However, this is not an acceptable solution
for TTL circuits in view of the above constraints.
Therefore, the functional characters were subpartitioned as shown in Table V.
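The need for subpartitioning follows directly from the gate limit. As a minimal sketch, assuming the only constraint is the guideline of no more than 150 gates per chip (pins, chip types, and the optional alternatives of Table V are ignored), the smallest possible chip count per character can be computed from the Table I gate counts. The function name `min_chips` is illustrative.

```python
import math

# Gate counts per character from Table I (P1 and MM depend on the system).
GATES = {
    "G1": 224, "L1": 274, "L2": 250, "L3": 377,
    "M1": 348, "M2": 323, "P2": 147, "P3": 210,
}


def min_chips(gates, limit=150):
    """Fewest chips that hold `gates` gates with at most `limit` gates each."""
    return math.ceil(gates / limit)


for name, gates in GATES.items():
    print(name, min_chips(gates))
```

Only P2 fits on a single chip under this limit; every other character needs two or three chips.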
The intent of the table is not to select the optimal
subpartition, but to enumerate some logical choices.
The optimal choice will depend on assigned weightings
for gates and pins per chip, as well as the other design
constraints mentioned earlier. The table thus shows
each character and the character's composition, using
one or more custom or commercially available LSI/MSI
chips. More than one subpartitioned chip is required
to implement the functional character. The number of
chips and chip types required is given in the second
column as a descriptor and also in the sixth and seventh
columns under "composite." The columns under the
"composite" heading state the total number of items
required to implement one functional character. The
columns under the first and second chip heading contain similar information on a per chip basis.
A comparison of Tables I and V shows the following
changes:
1. The number of chip types is at least 20%
greater than the number of characters, thus
paying a small penalty in terms of part number
problems.
2. The number of gates per chip dropped (approximately by a factor of 0.5) and the number of
pins per chip remained about equal, resulting in an increased number of pins in the system by a
factor of about 2.
3. The total number of gates per function increased
an insignificant amount.
As is shown below, these changes tend in the wrong
direction for obtaining improved MTBF's of the modular computer. As is seen from the above and Table V,
the subpartitioned characters would require a greater
number of bonds (pins) and will therefore operate at
higher temperatures than the non-subpartitioned set.
The temperature rise is due to the increased number
of gates required and the higher current required due
to a larger number of external gates.* More pins require
more external gates to drive the capacitance of the
external pins. Both factors, increased pins and higher
temperature, increase the failure rate of the device
and thus lower the probability of mission success.
Reliability considerations require a minimum number
of bonds (pins) and the lowest junction temperature
practicable.18 Several other factors affect reliability.
These are either less influential on the operational
failure rate, or on a relative basis do not affect the
tradeoff. For example, the quality of the package's
hermetic seal may be an important factor in development and acceptance testing. But once a good seal
has been established, it will remain good. Furthermore,
the difficulty of making a good seal is proportionate
to the length of the seal interface. The latter in turn is
a function of the number of pins per package, which
for the cases in question is about the same.
* LSI circuits are generally built with tailored lower power
internal gates for driving low capacitance and limited fanout within
the chip's boundaries and higher power gates at the chip output
in order to overcome the input output capacitance and chip fanout.
TABLE V-Alternate schemes for sub-partitioning
                                                              Composite Character               1st Chip                   2nd Chip
Name                  Composition                             Gates  Pins  G/P  Chips  Types    Gates  Pins  G/P  Used     Gates  Pins  G/P   Used
G1-Register           2 custom chips, single type             224    62    3.6  2      1        112    52    2.2  2
L1-Logic              2 custom chips, single type             274    145   1.9  2      1        137    138   1.0  2
L2-Adder              2 identical custom chips                258    77    3.4  2      1        129    60    2.2  2
                      1 custom and 1 commercial chip          228    77    3.0  2      2        117    88    1.3  1        111*   43*   2.6*  1
L3-Input/Output       4 identical chips                       454    150   3.1  4      1        114    72    1.6  4
                      2 identical chips with optional parity  410    149   2.8  3      2        150    87    1.7  2        110    68    1.8   1
                      Alternate scheme: 2 chips + optional
                      parity chip                             398    149   2.8  3      2        150    95    1.6  2        98     66    1.5   1
                      Optimal 3-chip configuration            377    149   2.5  3      2        129    85    1.5  2        119    86    1.4
M1-Micromemory        3 chips-2 types                         358    91    3.9  3      2        150    92    1.6  1        104    85    1.2   2
Sequencer             3 chips-2 types, 150 gates if I.C.      348    91    3.8  2      2        142    73    1.9  1        206    73    2.8   1
P2-Counter            1 custom chip                           147    81    1.8  1      1        147    81    1.8  1
                      1 custom and 2 commercial chips         163    81    2.0  3      2        83     82    0.9  1        40*    *     *     2*
P3-Switch             2 identical chips                       210    118   1.8  2      1        105    75    1.4  2
M2-Micro Instruction  3 chips-2 types                         323    131   2.5  3      2        100    51    2.0  2        123    89    1.4   1
Register
*Commercially available chip
Temperature is a very important consideration since
the failure rate of the device increases about 1.8 times
per 25°C temperature rise.18
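The temperature rule of thumb can be turned into a small helper. The sketch assumes the 1.8-times-per-25°C figure compounds geometrically for rises that are not multiples of 25°C; the text gives only the single data point, so the interpolation and the function name are assumptions made here.

```python
def failure_rate_multiplier(delta_t_c, factor=1.8, step_c=25.0):
    """Relative failure rate after a temperature rise of delta_t_c degrees C.

    Assumes the rate multiplies by `factor` for every `step_c` degrees,
    compounding geometrically in between.
    """
    return factor ** (delta_t_c / step_c)


for rise in (0, 10, 25, 50):
    print(f"{rise:3d} C rise -> x{failure_rate_multiplier(rise):.2f}")
```

On this assumption a 50°C rise more than triples the failure rate, which is why the text treats junction temperature as a first-order reliability factor.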
Within specific cooling capacity, circuits, and packaging technology, two factors affect the device's temperature:
a. The number of gates per system.
b. The number of IO package pins per system.
For example, in the natural and subpartitioned
functional characters (Tables I and V) the number of
gates per system remains approximately constant.
However, the number of pins nearly doubled for the subpartitioned case. Typically, in TTL circuits the power
dissipation of the subpartitioned implementation is
expected to increase. Specifically, the dissipation is
increased by a factor of 1.08. Using the data from Table
II, the total number of functional character gates in
the MCB is 47,200 and the number of pins is 18,200.
Assuming a power dissipation p and 2p or more for
internal and external gates, respectively, the power
dissipation for the MCB is:
P = p (M + N/3)
where
M = total number of gates
and
N = total number of pins
P for the subpartitioned implementation is 57,700 p
= p (47,200 + 10,500).
Using the same formula, the power dissipation for
the functional character implementation (Table II)
versus conventional MCB implementation is 53,300 p
versus 50,900 p, respectively.
Even though the number of gates is 35 percent
greater for the functional character implementation,
the power dissipation is about 5 percent greater than
that of the MCB's, were it implemented with LSI's
representing present MCB cards. This 5 percent difference will disappear in practice. The actual power
difference relative to the present IC implementation
would be in favor of the functional implementation.
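The power figures quoted above can be reproduced with a short calculation. The relation used here, P = p (M + N/3), is inferred from the printed totals rather than taken from a legible formula: it treats every gate as dissipating p and adds one extra p for each output pin, with one third of the pins assumed to be outputs. The function name and the `output_fraction` parameter are illustrative.

```python
def power_units(gates, pins, output_fraction=1 / 3):
    """Power in units of p: one p per gate plus one extra p per output pin."""
    return gates + output_fraction * pins


functional = power_units(47_200, 18_200)   # functional character MCB (Table II)
conventional = power_units(35_000, 47_600)  # MCB with each card as an LSI chip
subpartitioned = 47_200 + 10_500            # pin term as printed in the text

print(f"functional character: {functional:,.0f} p")
print(f"conventional MCB:     {conventional:,.0f} p")
print(f"functional / conventional:   {functional / conventional:.3f}")
print(f"subpartitioned / functional: {subpartitioned / functional:.3f}")
```

The two ratios come out near 1.05 and 1.08, matching the "about 5 percent" and "factor of 1.08" statements in the text.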
In addition to the number of pins causing increased
power dissipation, which may be equated with increased failure rates, there are other reliability and
cost penalties associated with an increased number of
pins. These all result from bonding. Each pin requires
two internal bonds (one to the metalization, the other
to the pin). Each pin must in turn be fastened to some
external holder (card, connector, wire, etc.).
Every one of these junctions is a potential failure
and a fabrication cost factor. Thus, the number of
pins as a contributor to increased system failure rate
manifests itself in several ways. Every effort must be
made to keep the pin count low.
The "ideal" LSI chip, assuming it could be built,
would contain the largest number of gates and use
the lowest speed-power product circuit. Figure 7 shows
the various circuits currently available and the speed-power-product lines (PL).19 Note that the "ideal"
circuit for space applications would be located in the
lower left corner of the figure. The ion implanted and
complementary MOS circuits come closest to the
"ideal" circuit. The shaded area shows the speed-power
coverage of the P channel ion implanted MOS
(IMOS). The area for the N channel IMOS is forecast
to be below that shown for the P-IMOS. At the speed
considered the complementary MOS would straddle the
P and N areas. The complementary circuit is attractive
as a compromise speed-power option. However, it
requires about twice as many devices per circuit as
single channel. Thus, a single chip would be unable to
support a complete function, resulting in increased pins
per system. This is undesirable, as pointed out earlier.*
From this, we conclude that the ion implanted MOS
type circuit (single channel, high speed, low power)
is optimal for the functional character implementation
of the MCB, barring producibility problems. It provides
the desired density at 100 percent yield, lower power
dissipation, and desired circuit producibility.19,20,21
There are not sufficient practical data to make a judgment. If the "ideal" circuit is not available, a meaningful system can be built using TTL circuitry for the
functional character implementation. The penalty is
increased power and pins required at a very significant
gain of availability of proven circuit technology.
** Each pin must require at least one external gate, and each
external gate dissipates at least p more units of power. Typically,
1/3 of the pins are used for output; the others are used for inputs,
power, and ground.
CONCLUSIONS
It has been demonstrated that digital equipments can
be designed using pre-specified logical building blocks
called functional characters. Once the logical design of
the functional character has been accomplished, the
system designer no longer needs to employ Boolean
equations to specify the system. He needs only to
specify the inputs and outputs of the characters and
microprogram the sequence of their operations. The
set of functional characters can be considered as
standard and "universal" LSI chips that are sufficient
to implement most digital equipments. Two desirable
features of the characters are that the number of chip
types and pins in the system are significantly reduced.
It may be inferred that standard design automation
programs which have as inputs Boolean statements or
their equivalent will not be applicable as functional
character design aids. Routing programs have the
greatest potential of being useful. Simulation programs
will have to operate at a macro level. A microprogram
assembler is a desirable program.
In order to obtain the required 106 hours between
system failures, it will be necessary to improve the
system configuration of the modular computer and to
improve the basic circuit or module reliability. The
* The ratio of power dissipation for internal and external gates
is much greater for MOS at the desired speed.
Figure 7-Speed-power products of some bipolar and MOS circuits (axis: power dissipation in mW/gate)
functional character implementation of the modular
computer will readily allow configuration changes. The
module content and overall system configuration can
be readily changed. The characters improve the
module's MTBF because of the significant reduction
in the number of pins.
The reduced number of chip types facilitates quality
control, thereby potentially improving the module's
MTBF.
Any circuit or chip wiring technique can be used to
implement the characters. Currently, the modular
computer is planned to be implemented with TTL, 100
percent yield LSI technology. Other circuits and
technologies are being evaluated.
ACKNOWLEDGMENT
The research reported in this paper was sponsored in
part by the Electronics Research Center under Contract NAS 12-665.
The authors express their appreciation to Mr. W. L.
Martin of Hughes Aircraft Company for his many
suggestions for improving this report.
REFERENCES
1 D O BAECHLER
Trends in aerospace digital computer design
Computer Group News Vol 2 No 7 Jan 1966 18-32
2 A study of Jupiter fly-by missions
General Dynamics Rpt FZM-4625 May 17 1966 3-159 to 3-202
3 A AVIZIENIS
Design of fault-tolerant computers
Proc FJCC Vol 31 1967
4 M M DICKINSON J B JACKSON G C RANDA
Saturn V launch vehicle digital computer and data adapter
Proc FJCC Vol 26 1964 501-516
5 G H BARNES R M BROWN M KATO D J KUCK D L SLOTNICK R A STOKES
The ILLIAC IV computer
IEEE Trans on C Vol 17 No 8 Aug 1968
6 E J DIETERICH L C KAYE
A compatible airborne multiprocessor
In this Proc
7 H E MAURER R C RICCI
Horizons in guidance computer component technology
IEEE Trans on C Vol 17 No 7 July 1968
8 E H BERSOFF E HOPE F TUNG
IEEE transactions on aerospace and electronic systems
To be published
9 F DERWIN E H BERSOFF
Modular computer architecture strategies for long term missions
In this Proc
10 R C JENNINGS
Design and fabrication of a general purpose airborne computer using LSI arrays
IEEE Computer Group Conf Digest June 1968
11 H R BEELITZ S Y LEVY R J LINHARDT H S MILLER
System architecture for large-scale integration
Proc FJCC Vol 31 1967
12 R C MINNICK
Cutpoint cellular logic
IEEE Trans on EC Dec 1964
13 R C MINNICK
A survey of microcellular research
Journal of the Association for Computing Machinery Vol 14 No 2 April 1967
14 J J PARISER
Connection considerations with a view toward batch fabrication
Proc Nat Symposium of the Impact of Batch Fabrication on Future Computers April 1965
15 J J PARISER F DERWIN J F McKEVITT J A BURKE C P DISPARTE
Research in the effective implementation of guidance computers with large scale arrays
First Interim Rpt Submitted to NASA ERC Oct 1968
16 J J PARISER F DERWIN J F McKEVITT C P DISPARTE J A BURKE
Research in the effective implementation of guidance computers with large scale arrays
Second Interim Rpt Submitted to NASA ERC 1969
17 H G DILL R W BOWER K G AUBUCHON T N TOOMBS
Anomalous behavior in stacked-gate MOS tetrodes
International Solid State Circuit Conference, Phila 19-21 February 1969
18 G R VAN HOODE
Evaluation of experience with micro-electronic integrated circuits
TRW No 9990-6183-R000 May 1967
19 J SEGAL
Speed/power chart for digital IC's
The Electronic Engineer June 1968
20 R W BOWER H G DILL K G AUBUCHON S A THOMPSON
Characterization of MOS FETs formed by gate masked ion implantation
Given at the Internat Electron Devices Meeting Wash Oct 1967
21 H G DILL
Offset gate field effect transistors with high drain breakdown potential and low miller feedback capacitance
IEEE Trans on Electron Devices Oct 1968
Project DARE: Differential Analyzer REplacement by on-line digital simulation
by GRANINO A. KORN
University of Arizona
Tucson, Arizona
INTRODUCTION
While batch-processed applications of convenient,
highly developed digital continuous-system simulation
languages are now commonplace,1,2 such systems do
not provide the intimate man-machine intercourse
cherished in analog/hybrid simulation. The DES-1
system,2 which combined a special simulation console
and a digital plotter with an SDS 9300 (medium-sized) computer, was, then, a pioneering effort, unfortunately abandoned by its manufacturer. The only
commercially available interactive system appears
to be the IBM CSMP 1130 system which, like its
predecessor PACTOLUS,2 can be programmed from
a simple typewriter terminal. This is an interpreter
system implemented on a small computer and thus
yields relatively slow execution.
The writer has felt quite strongly for some time3 that
digital on-line simulation is ready to go; we do have
simple simulation-language programming, plus very
reasonably priced, fast digital computers, plus new
graphic displays. All that would seem to be needed was
a system design which would combine these items
(Table I), with a good deal of human-factors engineering
to make the operator happy as well as efficient. Project
DARE (Differential Analyzer REplacement), sponsored by the National Science Foundation at the
University of Arizona, is a continuing attempt to
develop a series of such systems.
Project DARE demonstrates all-digital on-line
simulation of dynamical systems. Each DARE system
adds a very convenient but still relatively inexpensive simulation console to a small or large digital
computer and can replace conventional analog computers in many applications. System equations or
block-statements and input data are entered and
conveniently edited on a cathode-ray-tube typewriter.
Solutions or phase-plane plots appear on a second
cathode-ray-tube display; system parameters and
initial conditions are readily changed for successive
runs; displayed data can be stored for comparisons;
programs and results may be printed and plotted for
hard-copy report preparation; and automatic iterative
operation is possible. With a reasonably fast digital
computer, man-machine interaction at the console
is rather more comfortable than with even a modern
analog/hybrid computer.
DARE I is a flexible CSSL-type floating-point
system permitting relatively slow computation with
the PDP-9 computer. DARE II is a block-diagram-based system which accepts fixed-point operation in return for
relatively very high speed on the small PDP-9, permitting, for instance, real-time flight simulation.
DARE III and DARE IV are only in the planning
stage and will implement economical and fast floating-point simulation on a time-shared CDC 6400.
A critical study of future possibilities indicates
that DARE-type systems could permit flight simulations including 40 Hz frequencies by 1975, but that
modern analog computers are still a hundred times
faster. Actual present-day practical applications, however, employ really fast (and therefore relatively inaccurate) analog computation so rarely that much
analog simulation could well give way to the more
accurate, convenient, and often more economical
digital methods demonstrated by Project DARE.
DARE I: An on-line CSSL-type system
DARE I software, written for the PDP-9 by J.
Goltz as a Ph.D. dissertation,5 produces a complete
floating-point simulation system, including the basic
monitor, editor, and loader used also by DARE II.
DARE I source language is essentially similar to the
SCI-sponsored CSSL.1 Though basically equation-oriented, DARE I will also implement user-created
analog or hybrid blocks as FORTRAN functions.
DARE I employs the FORTRAN compiler supplied
with the digital computer and will be described in
detail in a separate paper.!>
DARE I accepts system differential equations in first-order (state-equation) form. These equations are
simply typed in FORTRAN notation on the screen
of a CRT typewriter at the right of the DARE console (Figure 1). An interactive CRT typewriter program proceeds to ask for problem data and simulation
parameters. Of the latter, the frame time DT, the
maximum computing time TMAX, and also the error
EMAX for variable-increment integration, can be
entered either with the CRT keyboard or by console
digiswitches, whichever the operator prefers. Console
buttons can recall selected program or data pages
to the CRT screen for editing, or cause them to be
printed out for report preparation.
TABLE I-A list of requirements for an on-line digital simulation system
A useful on-line continuous-system simulation
system must provide for:
1. Entry of system differential equations (in
equation and/or block-statement form).
2. Entry of data (system parameters, initial
conditions, function tables, etc.).
3. Entry of simulation parameters (frame
time, communication interval or display
sampling interval, maximum computation
time, integration routine used, maximum
tolerable error in variable-increment integration routines, choice of variables for
display).
4. Editing, modification, and correction of
the above entries.
5. Display of state variables vs. the independent variable (usually the time) and
against each other (phase-plane plots).
6. Preparation of hard copy for reports in the
form of printed tables, xy recorder plots,
or strip-chart records.
In addition, a sophisticated simulation system
must permit "simulation studies," viz.:
7. Computations based on results from multiple
differential-equation-solving runs (statistics, cross-plots).
8. Iterative computation, i.e., repeated runs
with system parameters and/or initial
conditions recomputed on the basis of
preceding runs (for optimization, boundary-value problems).
Figure 1-DARE simulation console for use with a PDP-9 or
PDP-15 computer. Programs and data are entered, edited, and
modified on the CRT typewriter at right. Up to four solution
curves, or a phase-plane plot, are produced on-line on the output
graphic CRT display at left. A simulation control panel underneath the output display controls simulation and display, with
special push-buttons producing hard copy of programs, data, and
solutions when desired. The teletypewriter and plotter used for
this purpose are not shown.
Console switches (lower left) are sampled by the computer to
provide control inputs:
Method Switch: A rotary switch used to select the integration routine.
DT, TMAX, EMAX: 4-decade thumbwheel switches in an
adapted FORTRAN format.
The third decade reads from -5 through 0 to +5, and with
the fourth decade indicates a power of 10.
Elapsed Time: A strip of 12 lamps to indicate the progress
of computation, and to reassure the user that the computer
is actually operating when computation exceeds a few
seconds.
Sense Switches: 2-position switches for various functions,
determined by program.
Trace Finder: Pushbuttons to identify one of 5 traces on
scope display, probably by momentarily blanking it out.
Command Push-buttons (lower two rows):
Lighted pushbuttons, for purposes marked on buttons.
"Type eqns," "type data," and "select display" are indicators only, offering suggestions to the user from the computer.
Such suggestions can also appear on the alphanumeric CRT
display.
As the differential-equation solution proceeds, all
state-variable values are read onto DECtape once
per "communication interval"1 (typically every 10
to 50 DT). Thus any selected state variable can be
brought back for single or multiple displays and
printout; it is possible to compare a current solution
with a selected earlier solution display. Permanent
graphic records are obtained with an xy recorder and
a four-channel stripchart recorder connected to the
display.
The choice of integration routines for differential-equation solution has been discussed and rediscussed
in many survey papers.2,4 All DARE systems (like
the better batch-processing systems2) offer a choice of
integration formulas. With the on-line systems, console selection of integration routine and frame time
(time increment DT) permits very convenient comparison of different integration methods in terms of stored
solution displays.
The flexible and convenient DARE CRT Editor
program5,6 permits overwriting and correction, insertion of text, and automatic search for lines containing
selected strings.
A SORT/EDIT program (precompiler) sorts the
symbol string constituting the program and creates
a FORTRAN differential-equation-solving program,
which is then compiled and executed. After the first
run, data such as system parameters and initial conditions may be changed on the CRT screen, and successive differential-equation-solving runs are obtained
without recompilation. Iterative and statistical simulation studies can be programmed with FORTRAN
statements.5
A new homemade graphic display7 associated with
our DARE console displays up to four variables against
time, or selected phase-plane plots. The display uses
one dual 9-bit (18-bit) word per display point to save
memory and refresh time, can generate line segments
for curve interpolation, and shares the processor
memory through a standard PDP-9 data channel.
This permits fast display refreshing with a minimum
of time-wasting instructions.
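The one-word-per-point storage can be illustrated with a short Python sketch that packs a pair of 9-bit coordinates into one 18-bit word. The field order (x in the upper half, y in the lower half) and the function names are assumptions made for illustration; the paper does not give the actual word layout of the display.

```python
# Hypothetical layout: x in the upper 9 bits, y in the lower 9 bits.
FIELD_BITS = 9
FIELD_MASK = (1 << FIELD_BITS) - 1  # 0o777, largest 9-bit value


def pack_point(x, y):
    """Pack two 9-bit screen coordinates into one 18-bit display word."""
    if not (0 <= x <= FIELD_MASK and 0 <= y <= FIELD_MASK):
        raise ValueError("coordinates must fit in 9 bits (0..511)")
    return (x << FIELD_BITS) | y


def unpack_point(word):
    """Recover the (x, y) pair from an 18-bit display word."""
    return (word >> FIELD_BITS) & FIELD_MASK, word & FIELD_MASK
```

With such a layout a 2400-point display occupies 2400 words, and the display hardware can fetch both coordinates of a point in a single memory reference.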
DARE II: A fast block-macro system with an efficient precompiler
The DARE I system demonstrates the convenience
and power of a scale-factor-free, floating-point, equation-oriented, on-line simulation at relatively low
computing speed. But we also wanted to demonstrate a
much faster on-line simulation system, which would
permit true real-time flight simulation, still using the
same small and inexpensive digital computer. With
the PDP-9, this meant giving up floating-point operation. DARE II machine equations must be scaled
(much like those in analog computers) between -1 and
1 machine unit; with the PDP-9, ones-complement
coding is employed. Overloads are detected and displayed by a special subroutine.
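A minimal sketch of what such scaling involves, assuming an 18-bit ones-complement word with one sign bit and 17 fraction bits. The function names, the scale-factor argument, and the clamping behavior on overload are assumptions for illustration and are not taken from the DARE II software.

```python
WORD_BITS = 18
FRACTION_BITS = WORD_BITS - 1           # one sign bit, 17 fraction bits
FULL_SCALE = (1 << FRACTION_BITS) - 1   # largest magnitude, just under 1 machine unit
WORD_MASK = (1 << WORD_BITS) - 1


def to_machine(value, scale):
    """Scale a problem variable into a ones-complement word; report overloads."""
    x = value / scale
    overload = abs(x) > 1.0             # variable left the -1..+1 machine-unit range
    if overload:
        x = max(-1.0, min(1.0, x))      # clamp so that a word can still be formed
    magnitude = min(int(round(abs(x) * (1 << FRACTION_BITS))), FULL_SCALE)
    # Ones-complement negation is a bitwise complement of the whole word.
    word = magnitude if x >= 0 else (~magnitude) & WORD_MASK
    return word, overload


def from_machine(word, scale):
    """Convert a ones-complement word back to problem units."""
    if word & (1 << FRACTION_BITS):     # sign bit set: negative value
        magnitude = (~word) & WORD_MASK
        return -magnitude / (1 << FRACTION_BITS) * scale
    return word / (1 << FRACTION_BITS) * scale
```

As on an analog computer, the user must choose each scale factor so that the variable never exceeds one machine unit; the overload flag plays the role of the overload indicator.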
To provide high execution speed, DARE II uses
the PDP-9 macro-assembler to create macros corresponding to analog computing blocks, an approach
first used by Gaskill and McKnight in their batch-processed DAS system on the IBM 7090.2 Our system
permits especially convenient block programming,
with each block named by type and by the actual
output-variable name. The example of Figure 2 is
represented by
    SUM   F1, S1DOT, S2DOT
    COS   COSA, A
    MULT  S1DOT, COSA, RDOT        (1)
where the first argument of each block-macro represents
the block output. Note the convenient mnemonics used.
DARE II block-statements and data are entered
on the dual-CRT console used also with DARE I and
can be edited, modified, and printed out with the aid
of the same string-processing editor.6 DARE II
simulations of many small systems (second to sixth
order) are, however, so fast that repetitive simulation
and display at two to 20 computer runs per second is
possible. Keyboard entry of parameters is then too
slow for CRT demonstration of parameter-change
effects, and a "diddle knob" or joystick permitting
rapid changes of a keyboard-addressed parameter will
be added. The knob or joystick will control incrementation of an up-down counter holding the parameter
value.
Figure 2-A block diagram (signal labels: F1, S2DOT, RDOT)
DARE II software incorporates substantial improvements over the DAS system. Block-macros may be
typed in any order. An optimizing precompiler sorts
statements like those in our example (1) before assembly, so that each block of the sorted program can
operate on already computed quantities:
    COS   COSA, A
    MULT  S1DOT, COSA, RDOT
    SUM   F1, S1DOT, S2DOT        (2)
This will then permit, say, integration of the output F1. DARE II next employs conditional assembly10
to completely eliminate the assembly of code for redundant
store/fetch pairs corresponding to outputs and inputs of
interconnected blocks. Thus, the first macro COS COSA,
A in (2) would ordinarily end with
    STORE COSA        (3)
while the second macro MULT S1DOT, COSA, RDOT
would start with
    FETCH COSA        (4)
DARE II automatically cancels the redundant
pair of instructions (3), (4), although (3) would be
kept if it were needed elsewhere in the program. The
pair
    STORE S1DOT, FETCH S1DOT
will be similarly cancelled, unless S1DOT is needed
elsewhere. The DARE II precompiler program is specifically designed to permit elimination of as many store/fetch pairs as reasonably possible. In addition, conditional assembly also eliminates code for unused multi-input-summer inputs and similar unused options.
As a result, DARE II produces code which is essentially
as efficient as well-written PDP-9 machine-language
code and permits relatively very fast execution (Table II).
If core storage is scarce, DARE II block macros can
be subroutine calls to save core at the expense of some
computing time.
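The two precompiler steps described above, sorting the blocks and cancelling redundant store/fetch pairs, can be sketched in Python using the three blocks of example (1). This is only an illustration under stated assumptions: the tuple representation of a block, the function names, and the rule that only a block's first input can be taken from the accumulator are hypothetical simplifications, not the DARE II precompiler itself.

```python
# Each block is (type, output, inputs); names follow example (1).
BLOCKS = [
    ("SUM", "F1", ["S1DOT", "S2DOT"]),
    ("COS", "COSA", ["A"]),
    ("MULT", "S1DOT", ["COSA", "RDOT"]),
]


def sort_blocks(blocks):
    """Order blocks so that each one uses only already computed quantities."""
    produced = {out for _, out, _ in blocks}
    ready = set()                       # block outputs computed so far
    pending = list(blocks)
    ordered = []
    while pending:
        runnable = [b for b in pending
                    if all(i in ready or i not in produced for i in b[2])]
        if not runnable:
            raise ValueError("algebraic loop: blocks cannot be sorted")
        for b in runnable:
            ordered.append(b)
            ready.add(b[1])
            pending.remove(b)
    return ordered


def emit(ordered):
    """Emit pseudo-instructions, dropping redundant STORE/FETCH pairs."""
    uses = {}
    for _, _, inputs in ordered:
        for i in inputs:
            uses[i] = uses.get(i, 0) + 1
    code, in_acc = [], None             # in_acc: variable held in the accumulator
    for n, (op, out, inputs) in enumerate(ordered):
        if inputs[0] != in_acc:
            code.append("FETCH " + inputs[0])
        code.append((op + " " + ", ".join(inputs[1:])).strip())
        nxt = ordered[n + 1] if n + 1 < len(ordered) else None
        feeds_next = nxt is not None and nxt[2][0] == out
        # Keep the STORE unless the value is used only by the next block.
        if not (feeds_next and uses.get(out, 0) == 1):
            code.append("STORE " + out)
        in_acc = out
    return code
```

For the example, `sort_blocks(BLOCKS)` yields the order COS, MULT, SUM of listing (2), and `emit` then produces a single FETCH of A at the start and a single STORE of F1 at the end, with both intermediate store/fetch pairs cancelled.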
TABLE II-Estimated computation times for a typical aerospace-vehicle simulation
(times are in µsec except as noted)
OPERATION                         NUMBER    DARE I             DARE II                DARE III/IV        197X System
                                  REQUIRED  PDP-9/FORTRAN      PDP-9/Macro-assembler  CDC 6400
                                            (Floating-point)   (Fixed-point)          (Floating-point)   (Floating-point)
X + Y + Z                         100       ×1000 = 100,000    ×5  = 500              ×3.4 = 340         ×0.2 = 20
XY                                80        ×700  = 56,000     ×24 = 1920             ×7   = 280         ×1.2 = 96
AX                                60        ×700  = 42,000     ×21 = 1260             ×7   = 420         ×1.2 = 72
F(X)                              8         ×4000 = 32,000     ×52 = 416              ×80  = 640         ×10  = 80
SIN X or COS X                    10        ×6000 = 60,000     ×60 = 600              ×100 = 1000        ×15  = 150
TOTAL, ONE DERIVATIVE EVALUATION            290 msec           4.7 msec               2.7 msec           0.46 msec
Two Derivative Evaluations                  580 msec           9.4 msec               5.4 msec           0.9 msec
RK2 Integration                   12        ×3000 = 36,000     ×120 = 1440            ×25 = 300          ×4 = 48
Total Frame Time DT                         616 msec           11 msec                5.7 msec           1.4 msec
Max. Frequency at 25 Frames/cycle           0.07 Hz            4 Hz                   7 Hz               30 Hz
Although the basic PDP-9 instruction set is quite
limited (no byte manipulation, spare registers, or add-into-memory), many analog-computer blocks can be
emulated quite nicely. As an example, a single-variable
function with 256 uniformly spaced breakpoints can
be formed by table lookup and interpolation in 50
µsec, and a two-variable function with 16 × 16
breakpoints can be formed in 120 µsec.9 It is also readily
possible to add to the DARE II macro-block repertoire; one can, for instance, create blocks which precisely correspond to the computing elements of any
given analog computer.
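Function generation by table lookup and interpolation over uniformly spaced breakpoints can be sketched as follows. The sketch uses Python floating-point arithmetic and linear interpolation; the function names and the handling of arguments outside the table range (the end intervals are simply extended) are assumptions, and the DARE II block would of course work with scaled fixed-point words.

```python
import math


def make_table(f, lo, hi, n=256):
    """Tabulate f at n uniformly spaced breakpoints between lo and hi."""
    step = (hi - lo) / (n - 1)
    return [f(lo + i * step) for i in range(n)], lo, step


def lookup(table, lo, step, x):
    """Linear interpolation between the two breakpoints surrounding x."""
    pos = (x - lo) / step
    i = min(max(int(pos), 0), len(table) - 2)   # clamp to a valid interval
    frac = pos - i
    return table[i] + frac * (table[i + 1] - table[i])


sine_table, sine_lo, sine_step = make_table(math.sin, 0.0, math.pi)
approx = lookup(sine_table, sine_lo, sine_step, 1.0)   # close to math.sin(1.0)
```

Uniform spacing is what makes the lookup fast: the interval index follows directly from the argument, with no search through the breakpoints.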
Like DARE I, DARE II offers a choice of integration routines. Because the PDP-9 lacks true index registers,
the second-order Runge-Kutta routine4
$$
\begin{aligned}
{}^{k+1}X &= {}^{k}X + \tfrac{1}{2}(K_1 + K_2) \\
K_1 &= DT \, F({}^{k}X,\ k\,DT) \\
K_2 &= DT \, F[{}^{k}X + K_1,\ (k+1)\,DT]
\end{aligned}
\tag{5}
$$
is probably the most useful, although it requires two
evaluations of the derivative F(X, T) at each integration step. To implement Eq. (5), our program does
not first evaluate all n $K_1$'s and then proceed to add
half of each to its ${}^{k}X$, as might be done with a real
index register. The program instead computes each
${}^{k}X + \tfrac{1}{2}K_1$ and ${}^{k}X + K_1$ before the next $K_1$ is evaluated. When this is finished for all X, the program sets a
tally switch to mark the second part of the Runge-Kutta
routine, increments the independent variable, and
uses the ${}^{k}X + K_1$ to produce the $K_2$ and the ${}^{k+1}X$
as each derivative is computed. All integrand accumulation is done in double precision to reduce roundoff-error effects.
With suitable interrupts from a real-time clock, a
DARE II simulation could be readily linked to a
hybrid-computer setup and/or to real system hardware
(autopilot, operator positions). Note, in this connection, that the macro-assembler system would circumvent the reentrancy problems usually encountered
in attempts to service multiple system interrupts with
FORTRAN programs.3
A look into the future: DARE III and DARE IV
The DARE I and DARE II systems are expected
to be completed in 1969. A useful and readily feasible
next step could employ a modern 24 to 36 bit machine
somewhat larger than our PDP-9 (e.g., SEL 840B,
SDS Sigma 5, DEC PDP-10) to speed DARE I execution, or to add floating-point capability to DARE II.
Such a system would cost between $120,000 and
$200,000, which still matches the cost of a comparable
analog-hybrid computer. Far more interesting from
the point of view of economy as well as computing
speed, however, is the possibility of time-sharing a
substantially larger central digital computer, such as
a CDC 6400. In fact, economical operation of even
a medium-sized digital machine mainly intended for
simulation should provide for time sharing with a
"background" batch-processing program.
Our proposals for follow-on projects, then, envisage
implementation of DARE I-like and DARE II-like simulation systems with the University's CDC 6400, using
the existing PDP-9/console combination as a remote
user's station. 6400 activity would be restricted to
very fast and efficient compiling and execution of
differential-equation-solving programs, while the string-processing CRT editor, data entry and display, and
also some iterative and statistics routines in slow
simulations, would be performed by the small processor associated with the user's console. It is interesting
to note that the simulation programs and data sent
to the central computer involve only character strings
transmitted at type-in rates. Alphanumerical data
from the central computer do not require much higher
rates; extensive numerical tables could be line-printed
at the central installation. Each DARE CRT display,
which is refreshed by the console processor, involves
at most 2400 9-bit data samples. For typical 10 sec
flight simulations, this would require transmission of
21,600 bits every 10 sec, or less than 2500 bits/second,
so that a telephone line would do. Such operation is thus
ideally suitable for remote time-sharing, provided
that the 10-second-plus-overhead computer runs can
be made available without excessive delays.
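The transmission-rate estimate above is simple arithmetic, reproduced here as a short Python check. The constant and function names are chosen for this illustration only; the figures are those quoted in the text.

```python
SAMPLES_PER_DISPLAY = 2400   # at most 2400 data samples per CRT display
BITS_PER_SAMPLE = 9          # each sample is a 9-bit quantity
RUN_SECONDS = 10             # typical flight-simulation run length


def display_bits(samples=SAMPLES_PER_DISPLAY, bits=BITS_PER_SAMPLE):
    """Bits that must be sent to fill one complete display."""
    return samples * bits


def required_rate(run_seconds=RUN_SECONDS):
    """Average line rate, in bits/second, for one display per run."""
    return display_bits() / run_seconds


bits_per_run = display_bits()      # 21,600 bits
bits_per_second = required_rate()  # 2,160 bits/second, under the 2500 quoted
```

The result leaves some margin below 2500 bits/second for framing and control characters, which is consistent with the claim that an ordinary telephone line would suffice.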
Based on initial DARE II experience, smaller simulation problems would be solved much more rapidly,
say in 0.1 sec of central-processor time. Repetitive
console displays demonstrating parameter-change effects would not be possible with reasonable data-transmission rates (nor would many such demonstrations
be economically feasible)! Our proposed time-sharing
scheme is, however, ideally suited to fast iterative
simulation or statistics-taking by the central processor.
In this type of operation, only successive criterion-function values, accumulated statistics, or similar
numbers, need to be transmitted and displayed during
the iteration runs, and low transmission rates would
again suffice.
In a console simulation system specifically designed
for remote time sharing, our PDP-9 is really unnecessarily elaborate and could very effectively be replaced
by the less costly 8K PDP-15, with DECtape but
without extended arithmetic. Such a system, including
very reasonable display facilities, would cost well
under $50,000. An even less expensive system could
be readily based on an even smaller 12- to 16-bit computer. This would save another $10,000; but the 18-bit word length of the PDP-15 is especially efficient
for display-refreshing purposes and adds to the stand-alone capabilities of the console. Note, in this connection, that our own PDP-9-based console could
employ DARE I for complete problem debugging
before ever using CDC 6400 time.
With the large central computer and its relatively
efficient compiler available, the proposed DARE III
and DARE IV systems, corresponding to the FORTRAN-based DARE I and the assembler-based DARE
II, may well merge into each other. The multiple
indexing needed for efficient implementation of integration routines may well be done best by the CDC
6400 FORTRAN compiler, while derivative computations would probably still be executed more efficiently
by an assembler-based system employing conditional
assembly, as in the DARE II scheme.
Digital vs. analog/hybrid simulation: Computing-speed considerations
Table II lists detailed estimates for various digital
computation times required in a typical medium-sized aerospace simulation. Our example involves 12
state-variable-derivative integrations, 100 three-term additions, 140 products, and 18 functions of one
variable. The DARE I and DARE II systems are
implemented on a Digital Equipment Corporation
PDP-9 (one µsec cycle time). This machine was chosen
because it has an 18-bit rather than a 16-bit word
length, although some of the newer 16-bit machines
have much better instruction sets. The PDP-9 FORTRAN compiler appears to be designed mainly to
save core storage and produces relatively very slow
execution. At a reasonably conservative 25 frames
(time increments DT) per period, the resulting 616-msec frame time for our aerospace simulation would
permit the DARE I system to produce sinusoidal
oscillations at 0.07 Hz. Speedwise, we see that the only
differential analyzer our DARE I system replaces
is an old-fashioned Bush or General Electric mechanical
differential analyzer!
Figure 3-DARE console in operation with the PDP-9
A notable and inexpensive improvement in this
situation is afforded by the fact that several PDP-9-sized digital computers are already available with
hardware floating-point arithmetic. No such option
is available with the PDP-9, but we ourselves have
designed a current-mode logic, floating-point arithmetic unit for the PDP-9 which, if and when installed,
would yield a speed improvement by a factor of at
least 15 for the DARE I system, so that our simulated
aerospace vehicle could wiggle at about 1 Hz, floating-point.
Our block-oriented DARE II system, also running
on the PDP-9, was specifically designed to demonstrate
relatively high-speed, real-time flight simulation on
the inexpensive computer. The price paid for this is
fixed-point operation, but DARE II's efficient execution and 11-msec frame time permit about 4 Hz in the
aerospace-simulation example.
An improved 18- to 24-bit stand-alone computer
of the future could probably produce comparable
floating-point simulation at 4 Hz. As we have noted,
though, the DARE III/IV systems will implement
the economically much more important goal of time-shared operation with a large central digital computer,
in this case the CDC 6400. As we have seen, very
efficient and still relatively machine-independent execution will be obtained by FORTRAN integration
and macro-assembler implementation of derivative
computations, although many operators may prefer
an entirely equation-oriented approach. In either
case, Table II indicates estimated frame times of the order
of 5.7 msec, thus permitting about 7 Hz operation at
25 frames per cycle. Note that this system would provide
floating-point aerospace-vehicle simulation in real time.
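The frame times and maximum frequencies quoted in this section follow from Table II by simple arithmetic, which the Python sketch below retraces for the DARE II column. The dictionary and function names are chosen for this illustration; the operation counts and per-operation times are those read from the table, and the frequency rule assumed is one cycle per 25 frames.

```python
# Operation counts for the example and DARE II times in microseconds (Table II).
COUNTS = {"X+Y+Z": 100, "XY": 80, "AX": 60, "F(X)": 8, "SIN/COS": 10}
DARE_II_USEC = {"X+Y+Z": 5, "XY": 24, "AX": 21, "F(X)": 52, "SIN/COS": 60}
RK2_STATES = 12              # state variables integrated per frame
RK2_USEC_PER_STATE = 120     # DARE II integration time per state variable
FRAMES_PER_CYCLE = 25


def frame_time_usec(counts, op_usec, n_states, rk2_usec):
    """Two derivative evaluations plus the RK2 integration overhead."""
    one_evaluation = sum(counts[op] * op_usec[op] for op in counts)
    return 2 * one_evaluation + n_states * rk2_usec


def max_frequency_hz(frame_msec, frames_per_cycle=FRAMES_PER_CYCLE):
    """Highest sinusoidal frequency at the given number of frames per cycle."""
    return 1000.0 / (frames_per_cycle * frame_msec)


dare_ii_usec = frame_time_usec(COUNTS, DARE_II_USEC, RK2_STATES, RK2_USEC_PER_STATE)
for name, msec in [("DARE I", 616), ("DARE II", 11), ("DARE III/IV", 5.7)]:
    print(name, round(max_frequency_hz(msec), 2), "Hz")
```

The DARE II frame time comes out just under 11 msec, and the three frame times give roughly 0.065 Hz, 3.6 Hz, and 7.0 Hz, which the text rounds to 0.07 Hz, 4 Hz, and 7 Hz.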
The last column of Table II extrapolates the DARE
III/IV system to a hypothetical 197X digital computer permitting an approximately fivefold increase
in computing speed through faster hardware and/or
multiprocessing, instruction look-ahead, or hard-wired
subroutines. This is in no sense a way-out extrapolation,
since digital-computer projects now on the drawing
boards already plan for a fifty-fold speed increase. Probably the most time to be gained in simulation calculations would be through the availability of fast scratchpad memories or multiple registers, which would permit derivative computations with as few core-memory
references as possible; this will already be approximated
in the assembler version of our CDC-6400 simulation
program. Additional computing bandwidth would
readily be obtained with computer systems employing
parallel multiple processors, which would fit nicely
into differential-equation solving schemes. Note, however, that no manufacturer of large digital computers
would even consider a special design for continuous
system simulation, so that all improvements must
make, as it were, incidental usage of developments
in large-scale scientific and business computers.
Let us now consider the computing-speed situation
on the analog/hybrid computer side. One or two analog computers available for sale in 1970 will offer not
only 0.02 percent of half-scale static accuracy, but
also 0.1 percent of half-scale error in linear computations at frequencies up to 1 KHz; multiplication
and function generation are somewhat less accurate.
In applications where such component accuracies
suffice, even existing analog computers are thus seen
to have a 20:1 speed advantage over the fastest
digital-simulation systems. This bandwidth advantage
is, moreover, not likely to decrease within the next ten
years; since 1965, improved ±10-V hybrid computers
developed in our laboratory have operated with errors
below 0.2 percent for linear and one percent for nonlinear operations up to 10 KHz, at perfectly reasonable
cost.11,12
Digital versus analog/hybrid: Economics
Our DARE system is implemented on about $90,000
worth of PDP-9 and simulation console; another
$25,000 could be very advantageously spent on a
disk to speed compilation. When implementing the
fixed-point DARE II language, our stand-alone system
is roughly comparable to a modest 150-amplifier hybrid computer of 1960 vintage, say, an Electronic
Associates 231-R together with a small digital computer used for potentiometer setup, static checking,
and some function generation.
At a more or less comparable price, the on-line
digital system is incomparably more convenient to
program, check out, and operate (this is, of course,
doubly true of the floating-point system). We also
have, of course, all the possibilities of the 16K PDP-9
with dual display and can produce floating-point check
solutions with DARE I.
Our PDP-9 installation is, however, mainly intended
as a demonstration. A more useful stand-alone installation, based perhaps on the SDS Sigma 5, would
roughly double our cost, but would permit real-time
floating-point flight simulation, plus some foreground-background time sharing. Although such a system
would be economically competitive with a 1970 analog/hybrid computer in many applications, the full
economic potential of on-line digital simulation will be
realized only in a time-sharing system. The tremendous
advantage of the time-sharing system is, simply,
that the central processor is free for other business
while the simulation user looks at his console-refreshed
display, or simply scratches himself. We have already
seen that the communication requirements for time-shared simulation are quite small.
I believe that the foregoing considerations clearly
indicate the area of future analog/hybrid vs. digital
simulation competition. In applications where analog /
hybrid and digital simulation systems compete at equal
computing speeds, i.e., in most real-time or "slow"
simulation, the new digital systems will win overwhelmingly both on economic and on human-engineering grounds.
Since, on the other hand, reasonably complex nonlinear
digital simulations will not be able to run at frequencies
much in excess of 100 Hz, faster simulation will still
belong on analog/hybrid computers.
A crucial question confronting the simulation community (and specifically the analog-computer industry)
is, then, this: where, and how large, are the application
areas of really fast analog/hybrid computation? The
most immediately important would seem to be:
1. Parameter and functional optimization, including
trajectory optimization.
2. Random-process simulation, including optimization of statistics, communication-system simulation, and parameter-tolerance studies.
3. Solution of partial differential equations, including techniques requiring multiplexing of
analog computing elements.
It is in precisely these applications that the very
large number of computer runs needed may give the
analog/hybrid computer a measure of economic advantage even over digital batch processing. Even
here, only important and frequent applications could
tilt the balance away from time-shared digital simulation, which saves much analog-computer scaling,
setup, checkout, and "head-scratching" time, not to
speak of computer amortization. Cost estimates for
different simulation methods sometimes omit these
"hidden" costs.
I wonder, finally, how much practical high-speed
analog/hybrid computation is really done in the aerospace, chemical and nuclear-energy industries, which
are, at this time, the principal consumers of continuous-system simulation. Our own laboratory's work on the
design and applications of very fast analog/hybrid
computers,11,12 for instance, has always elicited much
polite interest, but very little imitation. By contrast,
much current aerospace work involves "slow" or real-time hybrid simulation of aerospace systems, with the
digital computer doing housekeeping functions such
as static checking, plus function generation and, perhaps, some accurate trajectory integration. The resulting accuracy and software problems combine all
the worst features of both analog and digital computation; the main reason for employing hybrid simulation at all is either the existence of actual hardware
in the loop or some 20- to 50-Hz components due to
hydraulic servos and/or aeroelasticity. This type of
hybrid simulation can be swallowed by future on-line
digital systems like Jonah by the whale. For the 1970s,
the simulation community would be well advised to
include on-line digital simulation in its planning,
together with some careful reconsideration of faster
analog/hybrid techniques.
ACKNOWLEDGMENTS
The writer is grateful to the National Science Foundation for supporting Project DARE under NSF Grant
GK 1860, and to Dr. R. Mattson, Head, Electrical
Engineering Department, The University of Arizona,
for contributing University facilities. Project DARE
software and hardware are being developed by a group
of graduate students in the Electrical Engineering Department, including H. M. Aus, D. Chinnock, J. Goltz,
T. Liebert, J. Puls, and A. Trevor. Professor J. V.
Wait is co-principal investigator.
REFERENCES
1 SCI SOFTWARE COMMITTEE
The SCi continuous-system simulation language
Simulation Dec 1967
2 R D BRENNAN
R N LINEBARGER
A survey of digital simulation
Simulation Dec 1964
3 B JOHNSON
Real-time digital simulation
Proc IBM Symposium on Digital Simulation 196,1
4 P R BENYON
Review of numerical methods for digital simulation
Simulation Nov 1968
5 J GOLTZ
The DARE I on-line continuous-system simulation system
ACL Memo 169 Electrical Engineering Dept
The Univ of Ariz 1969
6 A PDP-9/Cathode-ray-typewriter editor
ACL Memo 164 Electrical Engineering Dept The Univ of
Ariz 1968
7 G A KORN et al
A new graphic display/plotter for small digital computers
Proc SJCC 1969
8 A TREVOR J V WAIT
DIFFE: An on-line differential-equation solving routine with
automatically scaled display
ACL Memo 153 Electrical Engineering Dept The Univ of
Ariz 1968
9 H M AUS G A KORN
Table-lookup/interpolation function generation for fixed-point
digital computations
IEEE TEC August 1969
10 M D McILROY
Macro-instruction extensions of compiler languages
CACM April 1960
11 G A KORN
Progress of analog/hybrid computation
Proc IEEE Dec 1966
12 B K CONANT
A new solid-state iterative differential analyzer making maximum use of integrated circuits
Proc FJCC 1968
MOBSSL-UAF-An augmented block
structured continuous system simulation
language for digital and hybrid
computers
by M. J. MERRITT and D. S. MILLER
USC School of Engineering
Los Angeles, California
INTRODUCTION
The motivation for the development of digital simulation languages may be seen by tracing the thoughts
of two widely different people preparing to analyze a
continuous dynamic system. Both are experienced
engineers and mathematicians, but the first is a novice
programmer with little or no FORTRAN experience.
Both have access to one or more digital computers.
The novice's thoughts might be as follows: "I do not
know FORTRAN and I'm not really interested in
learning it just to solve this problem. I have heard
that digital continuous simulation languages are simple
and easy to use. I'll try one." The experienced programmer, on the other hand, might think, "I only need a
few quick solutions, why bother with a FORTRAN
program. I'll use a simulation language for convenience."
Clancy and Fineberg,1 in 1965, compiled a comprehensive list of some 31 simulation languages. One of
these would fit the needs as well as the computer of
both individuals. The novice is looking for a simple,
easy to use language. The experienced programmer is
looking for one that compiles and runs efficiently while
providing as much flexibility and convenience as
possible. Since none of the presently available languages
achieve the same running efficiency as a FORTRAN
program written specifically to solve the same problem,
its conveniences must weigh heavily in the programmer's mind.
If a language is to satisfy the needs of these, as well
as a broad spectrum of users in between, then it must
possess the following characteristics:
1. It must be easy to learn.
2. Its language statements must be simple and
easy to interpret.
3. It should not require any knowledge of FORTRAN.
4. It should allow on-line interaction during both
problem preparation and problem execution.
5. The language should contain sufficient computational, control, and input/output elements so
that only exceptionally complex tasks require
FORTRAN or other non-simulation language
statements.
Of the widely distributed languages, PACTOLUS2
and IBM 1130 CSMP3 come the closest to meeting
these requirements. Unfortunately, they lack many
necessary computational and control functions. The
popular MIDAS4 language is not interactive, while
MIMIC, DSL 90 and 360 CSMP5 are difficult to learn
and very FORTRAN oriented.
All of these requirements may be met by combining
two things: a computer graphics terminal, and an
augmented block structured simulation language. The
graphics terminal is chosen for its interactive communication
abilities and the block structured simulation language
because of its simple language statements. Further,
the graphics terminal's ability to display large quantities
of instructional and reference information quickly
allows it to guide the new programmer through each
step of the problem preparation.
A block structured language may be visualized as
a collection of input-output boxes (see Table 1), each
of which carries out a basic mathematical operation.
The user's inputs, the language statements, describe
the way in which these pre-defined functional blocks
are to be inter-connected. A typical language statement
might be: 54, M, 1, 7 which might mean: the output of
the block element designated as #54 is the product of
the outputs of the block elements designated #1 and
#7. The advantages of block structured languages
(MIDAS, PACTOLUS, 1130 CSMP) lie in the simplicity
of their language statements. Their major disadvantage
is their rigidity, i.e., the user is restricted to those operations which may be mechanized with the available
mathematical and control operations. This disadvantage may be overcome by constructing process
oriented block elements which cause higher order
mathematical operations to be carried out. The Gradient Processor and Disk Input/Output block elements,
described below, are two such elements.
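The way a block structured statement such as 54, M, 1, 7 could be interpreted can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the operation table, the extra code "S", and the function names parse_statement and evaluate are hypothetical and are not taken from MOBSSL; statements are assumed to be listed so that every input already has an output.

```python
# Hypothetical operation table: "M" follows the example statement in the
# text; "S" (sum) is an extra illustrative code, not a MOBSSL type code.
OPERATIONS = {
    "M": lambda inputs: inputs[0] * inputs[1],  # product of two block outputs
    "S": lambda inputs: sum(inputs),            # sum of all inputs
}


def parse_statement(text):
    """Split 'block, type, in1, in2, ...' into (block number, type code, input numbers)."""
    fields = [field.strip() for field in text.split(",")]
    return int(fields[0]), fields[1], [int(field) for field in fields[2:]]


def evaluate(statements, known_outputs):
    """Evaluate statements in the order given; inputs must already have outputs."""
    outputs = dict(known_outputs)
    for text in statements:
        block, code, inputs = parse_statement(text)
        outputs[block] = OPERATIONS[code]([outputs[i] for i in inputs])
    return outputs


# Blocks 1 and 7 are assumed to hold 3.0 and 4.0; block 54 multiplies them.
outputs = evaluate(["54, M, 1, 7", "60, S, 54, 1"], {1: 3.0, 7: 4.0})
print(outputs[54], outputs[60])  # 12.0 15.0
```

The rigidity noted in the text shows up directly here: an operation that is not in the table cannot be expressed, which is what process oriented block elements are meant to relieve.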
The MOBSSL language
MOBSSL-UAF, which stands for Merritt and
Miller's Own Block Structured Simulation Language-Unpronounceable Acronym For, is a descendant of
MIDAS through PACTOLUS and IBM 1130 CSMP.
It differs from its antecedents in the following ways:
1. Continuous and iterative gradient modeling
and optimization procedures are performed by a
Gradient Processor block element.
2. Analog to Digital and Digital to Analog conversion block elements facilitate closed loop
hybrid computation, on-line interaction and
control of analog plotting devices: x-y plotters,
strip chart recorders, memoscopes and oscilloscopes.
3. A Disk Output block element allows up to 10
block outputs to be written in a pre-defined
disk data set. A Disk Input block element reads
up to 10 inputs from a pre-defined data set.
Utility subroutines allow these data sets to be
referenced by FORTRAN programs.
4. Iterative and parametric computations are
facilitated by allowing control cards to specify a
SIMULATION MODE. When a solution is
completed, the SIMULATION MODE determines which of the following is to occur:
STOP-terminates the job.
PCHG-read data cards and modify parameters and initial conditions accordingly. The last data card specifies the SIMULATION MODE for the next solution, which is begun immediately.
RUN-begin a new solution immediately. Successive solutions may be modified by on-line control or the gradient and iterative block elements. This mode contains no exit and must be terminated by operator intervention or by forcing an error exit, e.g., taking the square root of a negative number.
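The three SIMULATION MODE choices amount to a small control loop around successive solutions. The sketch below is a hedged illustration only: run_job, solve, the card format (a dictionary of changes paired with the next mode) and the max_solutions guard are assumptions introduced here, the guard standing in for operator intervention in RUN mode.

```python
def run_job(solve, initial_mode, data_cards, max_solutions=100):
    """Run successive solutions under a SIMULATION MODE-style control loop."""
    mode, params, results = initial_mode, {}, []
    cards = iter(data_cards)
    while len(results) < max_solutions:   # guard standing in for operator intervention
        results.append(solve(dict(params)))
        if mode == "STOP":                # terminate the job
            break
        if mode == "PCHG":                # read changes; last card names the next mode
            changes, mode = next(cards)
            params.update(changes)
        # mode == "RUN": fall through and begin the next solution immediately
    return results


# Hypothetical "solution": doubles a gain parameter that defaults to 1.0.
solve = lambda params: 2.0 * params.get("gain", 1.0)
cards = [({"gain": 2.0}, "PCHG"), ({"gain": 3.0}, "STOP")]
results = run_job(solve, "PCHG", cards)
print(results)  # [2.0, 4.0, 6.0]
```

With initial mode "RUN" and no cards, the loop only ends at the guard, which mirrors the statement that RUN mode contains no exit of its own.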
Process oriented block elements, like the Gradient
Processor and the Disk Input/Output blocks, make it
possible for unsophisticated programmers to study
complex dynamic systems, modeling and optimization
problems, and exercise on-line control without first
learning the FORTRAN language.
The communication and interactive features of
MOBSSL, the Hybrid block elements and the SIMULATION MODE, were dictated by the computational
facilities of the System Simulation Laboratory at the
University of Southern California. In this laboratory,
each user receives ten minutes of computer time on a
first come, first served, programmer present basis. This
period is too short to encourage the use of the console
typewriter for communication purposes. Instead, the
user may read pre-planned parameter changes from
punched cards, or operate control switches and potentiometers connected to the Hybrid block elements.
The effects of these changes are observed in the line
printer listings or on analog displays operated by the
Hybrid elements.
The gradient processor
Optimization and modeling of dynamic systems may
be re-formulated as a search for the extrema of a
scalar functional of a vector with free parameters.
TABLE I-Definition of MOBSSL elements

ELEMENT TYPE          TYPE CODE   DESCRIPTION & COMMENTS
BANG-BANG             B           eo = -1 for -∞ < e1 < 0; eo = 0 for e1 = 0; eo = +1 for 0 < e1 < +∞
DEAD SPACE            D           eo = e1 - P2 for -∞ < e1 ≤ P2; eo = 0 for P2 < e1 < P1; eo = e1 - P1 for P1 ≤ e1 < +∞ (P2 ≤ 0, P1 ≥ 0)
EXPONENT              E           eo = e^(P1 e1 + P2 e2 + P3 e3)
FUNCTION GENERATOR    F           eo = F(e1)
GAIN                  G           eo = P1 e1
HALF POWER            H
*INTEGRATOR           I
*JITTER               J           Random number generator. Generates random number between ±1: -1 ≤ eo ≤ +1
CONSTANT              K
TABLE I-Definition of MOBSSL elements (continued)

ELEMENT TYPE          TYPE CODE   DESCRIPTION & COMMENTS
LIMITER               L           eo = P2; eo = e1; eo = P1
QUIT                              Quit element terminates the run at the end of the DT step in progress
RELAY                 R           SPDT: eo = e3; eo = e2
*STORAGE              S
TABLE I-Definition of MOBSSL elements (continued)

ELEMENT TYPE            TYPE CODE   DESCRIPTION & COMMENTS
*TIME PULSE GENERATOR   T           Generates impulse train of unit amplitude and period P1 which starts when e1 ≥ 0 (to delay start of pulse train keep e1 negative)
*UNIT DELAY             U           eo(t) = P1 at t = 0; eo(t) = e1(t - Δt) for t > 0. Max. no. of unit delays = 75. Used as a delay element and in conjunction with Z element for sampled data systems and difference equation computations
*VACUOUS                V           Used in conjunction with WYE element for implicit function generation
WEIGHTED SUMMER         W
MULTIPLIER              X
*WYE                    Y           Logical branch element used in implicit function generation
*ZERO-ORDER HOLD        Z           eo = P1 if t = 0 and e2 ≤ 0; eo = e1 if e2 > 0; eo unchanged if e2 ≤ 0
SUMMER                  +           eo = ±e1 ± e2 ± e3. This is the only element that accepts negative block numbers
TABLE I-Definition of MOBSSL elements (continued)

ELEMENT TYPE          TYPE CODE   DESCRIPTION & COMMENTS
DIVIDER                           eo = e1/e2. If e2 = 0, program interrupt occurs and 360 supervisor generates message indicating exponent overflow exception
INVERTER                          eo = -e1
POWER                 **          eo = (e1)^e2, e1 > 0. If e1 ≤ 0, problem processing is terminated and 360 supervisor generates error message indicating an attempt to take logarithm of a number ≤ 0 has occurred
SINE (DEGREES)        SD          eo = SIN (P1 e1 + P2 e2 + P3 e3). Inputs in degrees
SINE (RADIANS)        SR          eo = SIN (P1 e1 + P2 e2 + P3 e3). Inputs in radians
COSINE (DEGREES)      CD          eo = COS (P1 e1 + P2 e2 + P3 e3). Inputs in degrees
COSINE (RADIANS)      CR          eo = COS (P1 e1 + P2 e2 + P3 e3). Inputs in radians
TANGENT (DEGREES)     TD          eo = TAN (P1 e1 + P2 e2 + P3 e3). Inputs in degrees
TABLE I-Definition of MOBSSL elements (continued)

ELEMENT TYPE           TYPE CODE   DESCRIPTION & COMMENTS
TANGENT (RADIANS)      TR          Inputs in radians
COMMON LOGARITHM       LG          Base 10
NATURAL LOGARITHM      LN          eo = LOGE (P1 e1 + P2 e2 + P3 e3). Base E
ARCSINE (DEGREES)      AS          Output in degrees
ARCSINE (RADIANS)      IS          Output in radians
ARCCOSINE (DEGREES)    AC          Output in degrees
ARCCOSINE (RADIANS)    IC          eo = COS^-1 (P1 e1 + P2 e2 + P3 e3). Output in radians
ARCTANGENT (DEGREES)   AT
ARCTANGENT (RADIANS)   IT
TABLE I-Definition of MOBSSL elements (continued)

ELEMENT TYPE                  TYPE CODE   DESCRIPTION & COMMENTS
ANALOG to DIGITAL CONVERTER   AD
DIGITAL to ANALOG CONVERTER   DA
MALE                                      Used in conjunction with FEMALE block to produce osculations. For additional details see page 74310
FEMALE                                    Accepts only MALE block as input. Produces more blocks
*SPECIAL ELEMENTS             1-5         User supplied Fortran subroutines not restricted to 3 inputs, 3 parameters and 1 output. Inputs and outputs of all blocks + all MOBSSL variables (T, ΔT, nDT, TSAMP, etc.) are available. Approximately 1100 words of core available for this purpose

Consider the modeling problem shown in Figure 1.
The task is to select those values of the parameter
vector, $\alpha$, which result in minimizing the output of the
Criterion Function Evaluator. The integral squared
difference between the model output and the output of
the unknown system is often selected as the scalar
criterion function. The Gradient Processor, GP, block
element controls the systematic variation of the parameter vector, $\alpha$, so as to locate the desired minima.
Let the criterion function or cost function which is
to be extremized be denoted as $\phi(\alpha_1, \dots, \alpha_n)$ or just
$\phi(\alpha)$. If $\phi$ is a non-linear function of the parameter
vector, $\alpha$, as it usually is, then iterative search procedures must be employed to find the extrema. Of the
procedures described by Bekey and Karplus,6 the most
often used is the method of steepest ascent. The Gradient Processor, GP, block element mechanizes an
iterative form of the method of steepest ascent described below.
The gradient of the criterion function on the ith
iteration, $\nabla\phi^i$, in the $\phi \times \alpha_1 \times \dots \times \alpha_n$ space may be estimated by perturbing each parameter $\alpha_j$ by an amount
$\Delta\alpha_j^+$ and $\Delta\alpha_j^-$:
$$\nabla\phi^i = \begin{bmatrix} \vdots \\ \phi(\alpha_1^i, \alpha_2^i, \dots, \alpha_n^i + \Delta\alpha_n^+) - \phi(\alpha_1^i, \alpha_2^i, \dots, \alpha_n^i - \Delta\alpha_n^-) \end{bmatrix}$$
This computation requires 2n solutions of the
equations which determine $\phi$, with appropriate cyclic
control of the parameter vector $\alpha$. At the conclusion
of the 2n solutions, the gradient vector, $\nabla\phi$, is computed.
Let the ith estimate of the parameter vector be
denoted by $\alpha^i$. Let $\alpha^0$ be any arbitrarily selected set
of parameters, then the successive estimates of the $\alpha$
are computed from
$$\alpha^{i+1} = \alpha^i + M^i \nabla\phi^i$$
where $M^i$ is an n by n diagonal matrix of the form
$$M^i = \mathrm{diag}(m_1, \dots, m_n)$$
The $m_j$ are positive if a maximum of $\phi$ is sought
(steepest ascent) and they are negative if a minimum is
sought (steepest descent). The magnitudes of the $m_j$
may be used to restrict the size of each parameter step
as follows:
Let $\Delta\alpha^i$, the unnormalized parameter step, be
defined as
Let MSL be a pre-defined constant, equal to the
largest parameter step, $\Delta\alpha$, desired.
If
then
otherwise
where the $k_j$ are constants supplied as inputs to the
GP blocks.
[Block diagram: CRITERION FUNCTION EVALUATOR; user supplied information; GP, GRADIENT PROCESSOR, with output $\alpha^{i+1}$]
Figure 1-Application of the gradient processor element
to parameter identification
As with all iterative procedures, it is difficult to
determine when to stop the iteration. The Gradient
Processor block element offers three separate stopping
options, all controlled by input parameters:
1. If the number of iterative cycles exceeds a
specified number, the simulation is terminated.
2. If $\phi$ is being maximized and exceeds a given
value, or $\phi$ is being minimized and is less than a
given value, then the simulation is terminated.
3. If $|\phi(\alpha^{i+1}) - \phi(\alpha^i)|$ is less than a given constant,
the simulation is terminated.
At the conclusion of each iterative cycle, a total of
2n + 1 runs, the values of the new parameter vector,
$\alpha^{i+1}$, the old criterion function, $\phi(\alpha^i)$, the new
criterion function, $\phi(\alpha^{i+1})$, and the magnitude of the
stopping criteria being used are printed. Additional
listings of the gradient vector, and individual parameter
changes both before and after normalization are
optional.
MOBSSL will accept up to 11 GP blocks. They
must all be assigned sequential block numbers. The
first GP block is not associated with a parameter, but
accepts inputs and constants used to control the sequencing of the remaining GP blocks. The outputs of
the last n GP blocks are the values of the n parameters
$\alpha_1, \dots, \alpha_n$ where n ≤ 10. A single parameter modeling
problem is shown in Figure 2.
The functions of the GP blocks' inputs and parameters are given in Table II.
Figure 2-MOBSSL block diagram for second-order
system damping ratio example
TABLE II-Gradient processor inputs and parameters
FIRST GP BLOCK
INPUT 1         The criterion function, $\phi$.
INPUT 2         A stopping criterion: maximum or minimum value of $\phi$ desired; usually supplied by a constant block.
INPUT 3         Not used.
PARAMETER 1     A stopping criterion: if $|\phi(\alpha^{i+1}) - \phi(\alpha^i)| \le$ PAR 1, stop. If zero, use another criterion.
PARAMETER 2     If positive, maximize $\phi$. If negative, minimize $\phi$. Magnitude is largest allowable parameter step $\|\Delta\alpha\|$.
PARAMETER 3     A stopping criterion: number of allowable iterations. If positive, print optional information. If negative, suppress it.
ALL OTHER BLOCKS
INPUT 1         Steepest ascent gain constant $k_j$, usually supplied by constant block element.
INPUT 2         Not used.
INPUT 3         Not used.
PARAMETER 1     Initial estimate of parameter value, $\alpha_j^0$.
PARAMETER 2     Positive parameter perturbation, $\Delta\alpha_j^+$.
PARAMETER 3     Negative parameter perturbation, $\Delta\alpha_j^-$.
Parameter identification using the GP element
The damping ratio, ζ, of a linear, second order
system is not known. The response of this second order
system to a step input is available. A model equation is
The actual damping ratio, ζ, of the system was set to
0.7.
The MOBSSL program to carry out the iterative
steepest descent minimization of the integral of the
absolute value of the difference between the two step
responses is shown in Figures 2 and 3. The initial value
of the parameter $\alpha_1$ is selected as 0.4. The MOBSSL
results for the first iteration are shown in Figure 4.
The first column of tabular data shown is time, the
second is the output of the 2nd GP block, $\alpha_1$, the 3rd
column is the output of the Criterion Function Evaluator, which is $\phi(\alpha)$ at the end of a solution, the 4th
column is the step response of the system containing
the unknown parameter, and the last column and plot
show the step response of the model containing parameter $\alpha_1$.
As can be seen from these results, $\alpha_1$ started at 0.4
and after one iteration had reached 0.6455, heading
towards 0.7.
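The iteration carried out by the Gradient Processor can be sketched in Python as follows. This is a hedged illustration, not the MOBSSL code: the function name gradient_search, the division of each difference by the total perturbation, and the rule that scales the whole step so that its largest component equals MSL are all assumptions made here, and the quadratic cost is a hypothetical stand-in for a criterion function that would normally require a full simulation run.

```python
def gradient_search(phi, alpha0, k, d_plus, d_minus, msl,
                    maximize=True, max_iter=50, target=None, tol=0.0):
    """Iterate steepest ascent (or descent) on phi; return (alpha, phi, cycles)."""
    sign = 1.0 if maximize else -1.0
    alpha = list(alpha0)
    phi_old = phi(alpha)
    for cycle in range(1, max_iter + 1):           # stopping option 1
        grad = []
        for j in range(len(alpha)):                # 2n perturbed solutions
            up, down = alpha[:], alpha[:]
            up[j] += d_plus[j]
            down[j] -= d_minus[j]
            # Assumption: difference divided by the total perturbation.
            grad.append((phi(up) - phi(down)) / (d_plus[j] + d_minus[j]))
        step = [sign * k[j] * grad[j] for j in range(len(alpha))]
        largest = max(abs(s) for s in step)
        if largest > msl:                          # assumed normalization rule
            step = [s * msl / largest for s in step]
        alpha = [a + s for a, s in zip(alpha, step)]
        phi_new = phi(alpha)                       # run 2n + 1 of the cycle
        reached = target is not None and sign * (phi_new - target) > 0.0
        if reached or abs(phi_new - phi_old) < tol:    # options 2 and 3
            break
        phi_old = phi_new
    return alpha, phi_new, cycle


# Hypothetical criterion with its minimum at 0.7, searched from 0.4.
cost = lambda a: (a[0] - 0.7) ** 2
best, value, cycles = gradient_search(cost, [0.4], k=[0.4], d_plus=[0.05],
                                      d_minus=[0.05], msl=0.3,
                                      maximize=False, tol=1e-9)
print(round(best[0], 3), cycles)
```

Each cycle costs 2n + 1 evaluations of phi, which is why the gain constants and the step limit matter when every evaluation is a complete solution.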
Iterative computational elements
Iterative computational processes are facilitated by
two MOBSSL elements: the STORAGE element and
the VARIABLE CONSTANT element, designated S
and VK respectively. These elements allow the results
obtained in previous solutions to modify future solutions.
When MOBSSL ends a solution, it examines the SIMULATION MODE established by the programmer's
control cards. If the RUN mode is in effect, and the
Gradient Processor, GP, element is not in use, the
STORAGE and VARIABLE CONSTANT elements
are processed to determine their new outputs. All other
elements are reset to their initial conditions and the
independent variable is reset to zero. The solution
counter, N, which begins at 1, is incremented by 1.
When all of the bookkeeping is completed, MOBSSL
begins the new solution.
[Line-printer listing: MOBSSL-UAF, MK II MOD 2, JAN 01 1969. Sections: configuration specifications (output name, block number, type, inputs); initial conditions and parameters; program mode STOP; integration interval 0.04000; total time 10.0000; print interval 0.40000; blocks to be printed; block to be plotted; range of plotted variable 0.0 to 2.00000.]
Figure 3-MOBSSL listing of configuration, parameter and timing data for damping ratio example
The VARIABLE CONSTANT element
The VARIABLE CONSTANT, VK, element is
programmed in the same manner as the CONSTANT,
K, element. In both cases, the element's output remains
constant during a solution. The constant stored by the
VK element is recomputed between successive solutions. Consequently, the VK element utilizes only the
terminal values of its possibly time varying inputs. If
information available interior to a solution is needed to
modify the next solution, it must be stored in a sample
and hold element until the end of the solution, at which
time it may be used by a VK element.
The VK element presents a constant output for an
entire solution, equal to its output on the previous run
plus the sum of its first input, P2 times its second input,
and P3 times its third input, all at the end of the previous
solution. The constants P1, P2 and P3 are the block
element's three parameters. Its output is set equal to
P1 during the first solution. The VK element is similar
to a mechanical ratchet or to an accumulator.
The VK element is equivalent to an analog computer
iterative accumulator. It allows solution to solution
parameter variations. These may be systematic changes,
random changes or solution dependent changes. For
example, a frequency response can be implemented as
shown in Figure 5. The VK block causes the radial
frequency $\omega$ to be incremented by k at the end of
each solution.
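The accumulator behaviour of the VK element described above can be sketched as a small Python class. The class and method names are hypothetical; only the update rule (previous output plus first input, plus P2 times the second input, plus P3 times the third, all taken at the end of a solution, with P1 as the output during the first solution) comes from the text.

```python
class VariableConstant:
    """Sketch of a VK-style element: constant within a solution, updated between solutions."""

    def __init__(self, p1, p2=0.0, p3=0.0):
        self.p2, self.p3 = p2, p3
        self.output = p1          # output equals P1 during the first solution

    def end_of_solution(self, e1, e2=0.0, e3=0.0):
        """Accumulate the terminal input values into the next solution's output."""
        self.output += e1 + self.p2 * e2 + self.p3 * e3
        return self.output


# Frequency sweep: omega starts at 1.0 and is incremented by k = 0.5
# at the end of each solution (both values are illustrative).
omega = VariableConstant(p1=1.0)
sweep = []
for _ in range(4):
    sweep.append(omega.output)    # value held for the whole solution
    omega.end_of_solution(e1=0.5)
print(sweep)  # [1.0, 1.5, 2.0, 2.5]
```

Because only terminal input values enter the update, anything computed in the middle of a solution has to be held elsewhere until the solution ends, as the text notes.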
Consider the following parameter identification
problem. Let
$$\dot{y}_D + \alpha\, y_D = 1$$
be a first order system containing an unknown parameter $\alpha$. Let
$$\dot{y} + \alpha_1 y = 1$$
be a model containing an adjustable parameter $\alpha_1$.
Further, define an error measure
$$e(t) = y_D(t) - y(t)$$
and a criterion function
The derivative of the criterion function with respect to
the parameter $\alpha_1$ may be computed continuously during
the solution.
[Figure 4: line-printer listing and plot of the MOBSSL results for the first iteration. Annotated runs: computes $\phi(\alpha^i)$; computes $\phi(\alpha^i + \Delta\alpha^+)$; computes $\phi(\alpha^i - \Delta\alpha^-)$; summary of previous 2N+1 solutions, followed by steepest ascent parameter step $\alpha^{i+1}$; computes $\phi(\alpha^{i+1})$.]
[Figure 6 block diagram; legible labels: 2e(t), ∂y(t)/∂α1, Influence Coefficient Producer]
Figure 6-MOBSSL block diagram demonstrating use of
VK block as iterative element in parameter search
by discrete sensitivity difference equation technique

where $\partial y(t)/\partial \alpha_1$ is the solution of an associated
differential equation described by Meissinger:7

$$\frac{d}{dt}\left(\frac{\partial y(t)}{\partial \alpha_1}\right) + \alpha_1 \left(\frac{\partial y(t)}{\partial \alpha_1}\right) = -y(t)$$

The parameter adjustment algorithm is
[equation not legible in the source] and i = 1, 2, ...
This process is easily mechanized by the VK element.
The incremental changes in the parameter $\alpha_1$ are
accumulated from solution to solution. The MOBSSL
program is seen in Figure 6. The computational results
of this program, for an $\alpha$ of 1.0, are seen in Figure 7.
The four columns of tabular data are: time, $y_D(t)$,
$y(t)$, and $\partial y(t)/\partial \alpha_1$, respectively. The last column and the
plot show the growth of the criterion function during
the solution. The initial value of the adjustable parameter,
$\alpha_1$, is 0.1. It becomes -0.5342, -0.5928, and
-0.6365 in three successive solutions, converging
towards -1.0. As can be seen from the plot, the error
decreases considerably from solution to solution.

The STORAGE element
The STORAGE element is very similar to the VK
element, except that its output remains constant for
one or more solutions depending on the block's parameters.
During the first solution, the outputs of the
STORAGE blocks are set equal to their first parameters'
values. Thereafter they are set equal to the
sum of their inputs. The change in output takes place
every P3 solutions, when
[(N - P2) MODULO P3] = 0
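The update rule just stated can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MOBSSL implementation: the class and function names are hypothetical, N is taken to be the number of the solution that has just finished (counted from 1), and Python's modulo convention is assumed for any case where N is smaller than P2.

```python
class StorageElement:
    """Hypothetical sketch of a STORAGE block that holds its output across solutions."""

    def __init__(self, p1, p2, p3):
        self.p2 = p2          # offset of the update schedule
        self.p3 = p3          # number of solutions between updates
        self.output = p1      # during the first solution the output is parameter 1

    def end_of_solution(self, n, inputs):
        """Call once after solution number n (1-based) has finished."""
        if (n - self.p2) % self.p3 == 0:
            # The new output is the plain sum of the inputs; the old output
            # is not added in (the element does not accumulate).
            self.output = sum(inputs)
        return self.output


def outputs_during(element, input_sums):
    """Return the output seen during each solution for a list of input sums."""
    seen = []
    for n, total in enumerate(input_sums, start=1):
        seen.append(element.output)          # value held during solution n
        element.end_of_solution(n, [total])  # possible reset between solutions
    return seen
```

With P1 = 3.0, P2 = 0 and P3 = 3, `outputs_during` holds 3.0 for three solutions and then switches to the input sum captured between solutions three and four, which is the schedule the text goes on to describe.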
For example, if P2 = 0 and P3 = 3, then the output of
the storage block will be P1 during solutions one, two
and three and will be reset to the sum of its inputs
between solutions three and four. It will hold this
output during solutions four, five and six, recomputing
its output between solutions six and seven, etc. Notice
that the STORAGE element does not accumulate in
computing its new output. If P2 = 0 and P3 = 1 the
STORAGE element is a non-accumulating VK element.
In Figure 8, the S element is used to pass the
output of integration element 1 from one solution to
the next. During the first solution the output of the
S element is 3.0. A more complex application is shown
in Figure 9 where the output of the S element is used
to modify the structure of the simulation from solution
to solution. In odd numbered solutions, N = 1, 3, 5,
etc., the output of the S element will be +1 and the
output of RELAY element 4 will be -1. As with all
[Figure 7 printer listing: MOBSSL configuration specifications, initial conditions and parameters, tabulated output for three successive runs, printer plots of the criterion function, and the value of the variable constant at the end of each run]
Figure 7-MOBSSL data and computational results from first three iterations of parameter search by
sensitivity equation method
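The parameter search whose results Figure 7 reports can be illustrated with a short Python sketch. Several things here are assumptions rather than facts from the paper: the model is taken to be y' + alpha*y = 1 with y(0) = 0, which is consistent with the sensitivity equation in the text; the reference response uses alpha = 1.0; the criterion is the integral of the squared error; the gain of 0.01 and the Euler integration step are hypothetical choices; and the rate constant is kept positive, so the numbers are not expected to match the sign convention of the listing.

```python
def run_solution(alpha, alpha_ref=1.0, dt=0.01, t_end=4.2):
    """One solution: integrate the model, the reference and the sensitivity equation.

    Returns dC/d(alpha) for the criterion C = integral of e(t)**2 dt,
    where e = y - y_ref.
    """
    y = y_ref = s = grad = 0.0              # s approximates dy/d(alpha)
    for _ in range(int(round(t_end / dt))):
        e = y - y_ref
        grad += 2.0 * e * s * dt            # accumulate 2 * e * dy/d(alpha)
        dy = 1.0 - alpha * y                # assumed model: y' + alpha*y = 1
        dy_ref = 1.0 - alpha_ref * y_ref    # reference run with the desired parameter
        ds = -y - alpha * s                 # sensitivity equation from the text
        y, y_ref, s = y + dt * dy, y_ref + dt * dy_ref, s + dt * ds
    return grad


def parameter_search(alpha0=0.1, gain=0.01, solutions=3):
    """Adjust alpha between solutions by one steepest-descent step per solution."""
    alphas = [alpha0]
    for _ in range(solutions):
        alphas.append(alphas[-1] - gain * run_solution(alphas[-1]))
    return alphas
```

Calling `parameter_search()` returns the starting value followed by one adjusted value per solution; under these assumptions the sequence moves from 0.1 toward the reference value without overshooting it, which mirrors the accumulate-between-solutions role the text assigns to the VK element.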
Figure 8-Use of storage element to store and transfer
integrator output between simulations during
iterative operation

[Figure 10 block diagram; legible labels: ADAGE AGT-10 graphics terminal, ADAGE high speed D-A and A-D converters, IBM discrete data interface, IBM 360-44 digital computer, Beckman 2132 analog computer]
Figure 10-USC system simulation laboratory computer
complement
block structured languages, the sorting algorithms
experience difficulty with purely algebraic loops. The
STORAGE-RELAY element loop is rendered sortable
by the inclusion of the UNIT DELAY element. The
UNIT DELAY has no effect on the computations,
and the output of RELAY element 3 will be f1(x).
During even numbered solutions the output of the
STORAGE element will be -1 and the output of
RELAY element 3 will be f2(x).

Figure 9-Storage element used to modify structure
of simulation from one solution to next

Hybrid computational elements
MOBSSL-UAF has been developed for use in the
USC System Simulation Laboratory. The System
Simulation Laboratory's computer complement is
shown in Figure 10. The software and hardware linkage
between the computer graphics terminal and the
IBM 360 is not yet installed. The analog computer
is equipped with a multi-channel strip chart recorder,
one and two pen x-y recorders as well as oscilloscopes
and memoscopes. Software presently exists to allow
the digital computer to carry out the following interface operations:
a. digital to analog conversions
b. analog to digital conversions
c. read discrete data lines
d. set discrete output lines
e. control the mode of the analog computer
f. operate the analog computer's select system
g. process external interrupt signals
h. set potentiometers in the analog computer
As yet, only the first two functions, analog to digital
and digital to analog conversions, with element designations
of AD and DA respectively, are available
within MOBSSL. MOBSSL programs may contain up
to 10 DA elements and up to 32 AD elements, limited
only by the available hardware. The hybrid elements
may be used separately as I/O elements or together
as part of a closed loop hybrid operation.
The DA element is often used as an output element
in MOBSSL simulations. As shown in Table I, the
DA element causes a voltage, equal to its first input, to
appear at the output of the digital to analog converter
selected by its first parameter. If the input exceeds
±100.0, the output voltage will not be correct. DA
elements may be used to drive recording devices in
order to obtain graphical presentations of MOBSSL
results. Because of large variations in computation
times and input-output times, unless special timing
routines are used, the amount of real time between
successive outputs will not be constant during a solution.
There are several ways of getting around this.
1. Use the graphical results qualitatively and obtain
quantitative results from the printer listing.
2. If a multichannel strip chart recorder is used,
place a known function of time on one channel
and derive timing information from it. The independent
variable, sine waves, output of
timing elements, etc., are convenient signals.
3. If an x-y plotter is used, place the independent
variable, the output of block 201, on one axis.
4. Plot two dependent variables against each other,
so that no timing information is required.
Methods 3 and 4 are used in the example described
below.
The AD element type is useful for changing parameters
and initial conditions. As shown in Table I, the
input to an AD block is supplied by an ADC located
on the analog computer patchboard. Parameter 1 of
the AD block determines the ADC number. The AD
block output is a floating point number between
±100.0. If the input exceeds ±100.0 volts, the output
of the AD block will be incorrect. On-line parameter
changes can be achieved by connecting the outputs of
manually operated potentiometers to the input of an
A-D converter as shown in Figure 11a. Figure 11b
demonstrates the use of the AD block to permit on-line
adjustment of constants and coefficients appearing
in MOBSSL block diagrams. Figure 11c demonstrates
the use of the AD block to allow on-line changes in
integrator initial conditions. This is valid since the
output of an integrator is:

$$e_o(t) = e_o(0) + \int_0^t e_{in}(t)\,dt$$

and $e_o(0)$ can be any number summed with the output
of an integrator having zero as its "initial condition."

[Figure 11 panels: (a) analog patchboard hookup of a manually operated three terminal pot, -100 ≤ Y ≤ +100 V; (b) MOBSSL configuration for variable constant and variable gain; (c) MOBSSL configuration for integrator IC]
Figure 11-Use of AD element for on-line parameter
changes

[Figure 12 block diagram; legible annotations: "Stores AD input present at last DT prior to t = 1.0", e_o(1.0-)]
Figure 12-AD block used to modify a parameter at the
beginning of a run

When MOBSSL is being used in an iterative mode,
on-line adjustments are needed only at the beginning
of a solution. Parameter variations during the solution
are undesirable and may be avoided by using the
zero order hold as a sample and store element. When
input number 2 to the ZOH element is less than or
equal to zero it holds its previous output. When it is
positive it samples, stores and holds the present input. In
the example shown in Figure 12, the ADC is effective
only during the first second of the solution, after which
it may be pre-set in preparation for the next solution.
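The sample-and-store behaviour of the zero order hold described above can be sketched as follows. This is a minimal illustration, not MOBSSL code: the class name, the control signal 1.0 - t (positive only during the first second) and the drifting ADC reading are all hypothetical choices made for the example.

```python
class ZeroOrderHold:
    """Hypothetical sketch of a ZOH block used as a sample and store element."""

    def __init__(self, initial=0.0):
        self.output = initial

    def step(self, value, control):
        """Sample `value` while `control` is positive; otherwise hold."""
        if control > 0.0:
            self.output = value     # sample, store and hold the present input
        return self.output          # control <= 0: previous output is held


def run_with_gated_adc(dt=0.25, t_end=2.0):
    """Gate a drifting ADC reading so it only takes effect before t = 1.0."""
    zoh = ZeroOrderHold()
    history = []
    for k in range(int(round(t_end / dt)) + 1):
        t = k * dt
        adc_reading = 10.0 + t      # stand-in for a pot being turned during the run
        history.append((t, zoh.step(adc_reading, 1.0 - t)))
    return history
```

In this sketch the held value stops changing at the last step before t = 1.0, so the potentiometer can be moved afterwards without disturbing the current solution.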
An AD element can be used as the input to a QUIT
block to terminate a run from the analog console.
Other applications of the AD element include sampling
and processing of analog data where synchronous
sampling is not required. The output of the gaussian
noise generator, both direct and filtered, located in
the Beckman Analog Computer, may be sampled and
used in place of the output of the uniform distribution
Random Number Generator block type.
Attempts to use the hybrid block elements in real
time applications have brought to light the need for a
whole series of timing and interrupt processing elements.
These elements will expand the real time capability
of MOBSSL considerably.
The following example, often referred to as the Host-Parasite
problem, demonstrates the use of the DA
block to drive an X-Y plotter. It is a set of differential
equations which represents the population of hosts and
parasites as a function of time. The physical situation
from which the differential equations are abstracted
comes about when there is a host (i.e., food for a
parasite) which would reproduce at a known rate if
there were no parasites. The parasites die off at a known
rate if there are no hosts. Finally, a decrease in the
number of hosts and an increase in the number of
parasites is a function of the number of "encounters"
between hosts and parasites. Whenever a host is unlucky
enough to encounter a parasite, the parasite
eats him up. The equations implemented are:

$$\dot{H} = K_1 H - K_4 H P$$
$$\dot{P} = -K_2 P + K_3 H P$$

where
H = host population as a function of time
P = parasite population as a function of time
K1 = overall growth rate of hosts per hour
assuming no parasites
K2 = overall decay rate of parasites per hour
assuming no hosts
K3, K4 = number of host-parasite encounters per
hour
t = time in hours

K1 = 0.05/hour (+5% per hour)
K2 = 0.10/hour (-10% per hour)
K3 = 2 X 10^-4/host-hour
K4 = 2 X 10^-4/parasite-hour
(one encounter per 5000 hours for every host-parasite pair)

Initial Conditions:
        Run I   Run II   Run III   Run IV
H(0)    100     1200     600       500
P(0)    200     1200     500       250

Figure 13-MOBSSL block diagram for the host-parasite
problem
The MOBSSL diagram is shown in Figure 13; a listing
of the MOBSSL configuration specifications, parameters
and other simulation data is shown in Figure
14. Figure 15 is a graph of hosts vs. time and parasites
vs. time obtained using the PCHG mode and interchanging
parameter 1 of blocks 6 and 8 on the second
run. Time is obtained from DA block 9, appropriately
scaled by gain block 4 from block 201, which provides
the independent variable. Figure 16 is a phase plane
plot of hosts vs. parasites for four sets of IC's. DA 6
provides the input for the plotter's X axis and DA 8
drives the plotter's Y axis. Note that the existence of
closed orbits for all physically realizable IC's is clearly
demonstrated, as well as the existence of a stationary
point at (H,P) = (500,250).
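The stationary point and the closed orbits can be checked numerically with a short Python sketch. It assumes the host-parasite equations in the usual predator-prey form, hosts growing at K1*H and lost at K4*H*P, parasites decaying at K2*P and gaining at K3*H*P, with the rate constants and the Run I initial conditions quoted above. The fourth-order Runge-Kutta integrator, the step size and the function names are choices made for this illustration, not MOBSSL's own integration method. The quantity returned by `orbit_invariant` is constant along exact solutions of these equations, so a small drift indicates a closed orbit.

```python
import math

K1, K2, K3, K4 = 0.05, 0.10, 2e-4, 2e-4   # rate constants quoted in the text


def derivatives(h, p):
    """Host and parasite growth rates for the assumed equations."""
    return K1 * h - K4 * h * p, -K2 * p + K3 * h * p


def rk4_step(h, p, dt):
    """Advance (h, p) by one classical Runge-Kutta step."""
    k1h, k1p = derivatives(h, p)
    k2h, k2p = derivatives(h + 0.5 * dt * k1h, p + 0.5 * dt * k1p)
    k3h, k3p = derivatives(h + 0.5 * dt * k2h, p + 0.5 * dt * k2p)
    k4h, k4p = derivatives(h + dt * k3h, p + dt * k3p)
    return (h + dt * (k1h + 2 * k2h + 2 * k3h + k4h) / 6.0,
            p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0)


def simulate(h0, p0, t_end=320.0, dt=0.1):
    """Return the list of (hosts, parasites) pairs from t = 0 to t_end."""
    trajectory = [(h0, p0)]
    for _ in range(int(round(t_end / dt))):
        trajectory.append(rk4_step(*trajectory[-1], dt))
    return trajectory


def orbit_invariant(h, p):
    """Quantity conserved along exact solutions; each orbit is one of its level curves."""
    return K3 * h - K2 * math.log(h) + K4 * p - K1 * math.log(p)
```

Setting both derivatives to zero gives P = K1/K4 = 250 and H = K2/K3 = 500, the stationary point quoted in the text; `simulate(100.0, 200.0)` traces the Run I orbit around it.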
Disk input and disk output elements
Through the use of the Disk Input, DI, and Disk
Output, DO, blocks, vector functions of the independent
variable may be respectively read out of and
written into previously allocated data sets on disk
storage during a simulation. The DI block is used when
[Figure 14 printer listing: MOBSSL configuration specifications, initial conditions and parameters, tabulated host and parasite populations versus time, and a printer plot]
Figure 14-MOBSSL printer listing for the host-parasite problem
[Figure 15 X-Y plot; vertical scale to 1600, horizontal axis: time, hours]
Figure 15-X-Y plotter graph of hosts and parasites vs
time

[Figure 16 phase plane plot; vertical scale to 1600; legend: IC 1 Po = 200, Ho = 100; IC 2 Po = 1200, Ho = 1200; IC 3 Po = 500, Ho = 600; IC 4 Po = 250, Ho = 500]

[Figure fragment; legible labels: "a) up to 10 DI blocks", "I = 7 + (n-1)"]
( < input list> )
This analog macro operator is defined in the macro
definition and is any identifier not already used as a
micro or macro operator. To call a macro, one simply
writes the Macro HAL instruction with the appropriate
macro operator, connection symbols and parameters in
their places.
Table VI indicates the format required in a macro
definition. A header, AMACRO, and a trailer,
AMEND, define the beginning and end of the macro
definition. The prototype statement is the Macro HAL
instruction in the format given above. This defines
the macro operator and the number and position of
the input and output symbols to be expected. The
next statements required are declaration statements.
These macro control statements indicate which
identifiers are to be considered as connection symbols
and which as parameter identifiers. The body of the
program consists of Macro HAL instructions described
earlier, and Macro Assembly instructions. The Macro
Assembly instructions provide for symbol operations
such as substitution, arithmetic operations on parameter
values and identifiers, and conditional operations
for expansion time component and parameter changes.
With these facilities, very flexible macros can be
written which conditionally adapt the implementation
structure to the requirements of the problem.

TABLE VI-Macro definition format
AMACRO
"Prototype Statement"
"Declaration Statements"
"Body"
    1. Macro HAL Instructions
    2. Macro Assembly Instructions
AMEND
Most of the major operators such as integrate,
sum, etc., associated with digital simulation languages
and requiring several components for analog implementation
are provided as system macros. These
assembly language system macros also represent a
target language into which the differential equation
based notation of the Allocated Source Language is
to be translated. IS
Processing the hybrid assembly language
The processing required for HAL is indicated in
Table IV. The first tasks of syntax checking, macro
expansion and list building result in the production
of a linked list and several associated tables. Together
they represent an easily accessed and processed internal
digital representation of the analog problem.
Scaling the problem and producing static check
values, though not yet implemented, also occur at this
level. These tasks may be performed at higher language
levels; however, this level has been chosen to facilitate
rapid on-line programmer interaction. Thus once
changes have been made in the structure of the analog
problem on the patchboard, the equivalent changes
can be made in the internal digital computer representation of the problem and new scaling and static check
information can be requested.
Allocation of components to the analog patchboard
is automatically done at this level. When performed
manually, the allocation involves matching components
to blocks in a block diagram problem solution. The
criteria for such assignments are often qualitative and
include such notions as compactness of patchboard
wiring and neatness or symmetry in wiring appearance,
both of which aid in problem debugging. The assignment itself, though sometimes tedious, is easily effected
using these visual qualitative criteria.
When mechanizing the component allocation task
on the digital computer, two main approaches are
available. The first attempts to give meaning to
qualitative criteria such as compactness and neatness
through the development and subsequent optimization
of appropriate objective functions. Objective functions
such as wire length, wire crossovers, and area covered
on the board are often used in digital computer backboard
wiring and, to some degree, do reflect the concept
of compactness. The general problem of allocating
objects (components) to locations on a board, subject
to restrictions on object placement, with the goal of
minimizing some objective function is often referred to
in the literature as the "assignment" or "placement"
problem.16 Algorithms for the solution of such optimization
problems can take several hours of computing
time16 when several hundred objects and locations are
present. This is due largely to the astronomical number
of ways one can allocate a given problem and the slow
and not easily predicted convergence properties of
available algorithms. The large computing time requirements
make this approach unsuitable for a short
compile time or on-line programming system. In addition,
it is not clear that these objective functions
meaningfully quantify the qualitative allocation criteria
generally employed by programmers.
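The objective-function approach, and the reason it scales badly, can be illustrated with a small Python sketch. Everything in it is an assumption made for the example: the Manhattan wire-length measure, the exhaustive search, and the names `wire_length`, `best_assignment` and `count_assignments` are hypothetical and do not come from the system described in this paper.

```python
from itertools import permutations
from math import factorial


def wire_length(assignment, wires, positions):
    """Total Manhattan wire length for a component-to-location assignment."""
    total = 0
    for a, b in wires:
        (xa, ya) = positions[assignment[a]]
        (xb, yb) = positions[assignment[b]]
        total += abs(xa - xb) + abs(ya - yb)
    return total


def best_assignment(components, wires, positions):
    """Try every placement and keep the one with the shortest wiring."""
    best = None
    for perm in permutations(range(len(positions)), len(components)):
        assignment = dict(zip(components, perm))
        cost = wire_length(assignment, wires, positions)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


def count_assignments(n_components, n_locations):
    """Number of distinct placements an exhaustive search would have to visit."""
    return factorial(n_locations) // factorial(n_locations - n_components)
```

Three components on four locations give only 24 placements, but `count_assignments(200, 200)` is 200 factorial, which is why exhaustive or slowly converging optimization is impractical at the several-hundred-component scale mentioned above.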
The second approach to the component allocation
task is to develop a set of heuristic algorithms which
try to embody concepts such as compactness and neatness while at the same time keeping computing costs
at a minimum. As with most heuristic algorithms, the
one currently implemented in this system has worked
well on most, but not all, of the problems it has encountered.
In every problem it does, however, find
a legal allocation if one is possible. The basic assumptions
of the heuristic are given below.
a. Components used as integrators and summers
are generally the key elements in determining the way
a programmer patches a board, or draws a flow diagram.
These components should therefore be allocated to
preserve, as much as possible, the visual signal flow
patterns between them.
b. Patching situations such as initial condition
pots and pots tied to the inputs or output of amplifiers
should be considered as special cases. In many of these
and other cases, the patchboard of the EAI 680 has
been designed for neatness and compactness by providing
special patchplugs which may be used instead
of wires. Since plug patching is both neat and represents
a minimum wire length, it should be utilized
where possible.
c. A certain amount of patching compactness is
desirable. The remainder of the components should
therefore be allocated on the basis of their closeness
to already allocated components.
These assumptions form the basis of a three phase
allocation algorithm with each phase corresponding
to one of the assumptions above. The details are not
discussed in this paper. The example provided in the
next section, however, demonstrates the results of the
algorithm.
An example
Figure 2 contains the scaled block diagram of an
automobile suspension simulation.16 Table VII is an
input listing of the problem as represented in the Basic
HAL language. The same problem represented with
the use of the system macros integrate (INTG), summation
(SUM), and invert (INVT) is given in Table
VIII. There is approximately a three to one reduction
in code lines required when the problem is represented
using the macro facility and this representation is
reasonably clear and compact. The resulting allocation
of the problem to the EAI 680 patchboard is given
in Figure 3 and indicates that much of the problem's
visual signal flow patterns have been preserved.
The programming system, which to date includes
the syntax checking and the macro, list building and
allocation processing described previously, has been
implemented in FORTRAN IV. For the example
above, this processing took approximately four seconds
when executed on a Univac 1108 computer. A more
complex problem, the Cable Arrestor Problem,17 containing
roughly twice as many components, took nine
seconds.
Figure 2-Simulation of an automobile suspension
system

TABLE VII-Basic HAL input for automobile
suspension problem
PVALUE (K1 = .16, K2 = .5, K3 = .8)
P9 = POT(S3, K1)
P10 = POT(X1, .5)
P11 = POT(C2, K2)
P12 = POT(J4, .4)
P13 = POT(X1, .1)
P14 = POT(X2, .5)
P15 = POT(J4, .5)
P16 = POT(NREF, .5)
P17 = POT(NX2, .15)
P18 = POT(X2, .5)
P19 = POT(S8, K3)
P20 = POT(S3, .2)
(,R1) = RCNC(,,,P9,P12,,)
(,R2) = RCNC(P10,,,,,)
(,R6) = RCNC(,,,P15,P20,P19,)
(,R7) = RCNC(,,,P17,,,)
R3 = RCNS(P11,P14,,,,,S3)
R8 = RCNS(P16,P18,,,,,S8)
R5 = RCNS(P13,NX2,,,,,S5)
J4 = AMPJ(S5,,)
X1 = AMPC(,R1,,X1,,,,,,)
C2 = AMPC(,R2,,C2,,,,,,)
NX2 = AMPC(,R6,,NX2,,,,,,)
X2 = AMPC(,R7,,X2,,,,,,)
S3 = AMPS(R3,,,,,)
S8 = AMPS(R8,,,,,)
S5 = AMPS(R5,,,,,)
NREF = REFN
HALEND

TABLE VIII-Macro HAL input for automobile
suspension problem
PVALUE (K1 = .16, K2 = .5, K3 = .8)
X1 = INTG(K1, S3, .4, J4)
C2 = INTG(.5, X1)
NX2 = INTG(.5, J4, .2, S3, K3, S8)
X2 = INTG(.5, NX2)
S5 = SUM(.1, X1, 1, NX2)
S3 = SUM(K2, C2, .5, X2)
S8 = SUM(.5, X2, .5, NREF)
J4 = GAIN(S5, 1)
NREF = REFN
HALEND

Figure 3-Automobile suspension problem allocated
to EAI 680 patchboard

CONCLUSIONS
This paper proposes a hybrid programming system
in terms of four language levels and the processing
required between them. Some of the details of the
lowest language levels which have been implemented
are presented and an example demonstrating the use
of the system is given.
Currently the authors are engaged in completely
specifying the modifications necessary for transforming
CSSL into a desirable allocated source language. A
continuing study is also being made of the interface
requirements between the analog and digital program
subsections. In addition, implementation continues on
the lower level processing tasks, the initial goal being
a subsystem which handles the analog subsection of
a program at the Macro HAL level, and contains a
limited interactive mode (Figure 1) capable of on-line
scaling and static checking in response to patchboard
configuration changes.
Programming costs for hybrid computers have mushroomed
to the point where the economic justification
of hybrid simulation projects is being questioned. It
is hoped that this proposal will both stimulate discussion
in this area and fill a current and growing need
for an effective hybrid programming system.
REFERENCES
1 G GORDAN
GPSS-A general purpose systems simulation program
IBM Systems Journal Vol 1 1962 18-32
2 H M MARKOWITZ B HAUSNER H W KARR
SIMSCRIPT: A simulation programming language
Prentice-Hall Inc N J 1963
3 J J CLANCY M S FINEBERG
Digital simulation languages: A critique and a guide
Proc FJCC Vol 27 1965 23-36
4 J C STRAUSS
Digital simulation of continuous systems: An overview
Proc FJCC Vol 33 1968 339-343
5 T D TRUITT
Hybrid computation ... What is it? Who needs it?
IEEE Spectrum Vol 1 No 6 1964 132-146
6 EAI 680 Reference Handbook
Electronic Associates Inc N J 1967
7 C GREEN H D'HOOP A DEBROUX
APACHE-A breakthrough in analog computing
IRE Trans on E C Vol 11 1962 699-706
8 APACHE: Analog programming and checking programmers
manual
Euratom Doc EUR 2437 e 1966
9 APACHE: Analog programming and checking system
programmers guide
Euratom Doc EUR 3052 e 1966
10 W OCKER S TEGER
HYTRAN-A software system to aid the analog programmer
Proc FJCC Vol 26 1964 291-298
11 W MIESSNER
APACHE Subcommittee report
SCI Simulation Software Committee 1965
12 J KOVACS J C STRAUSS
An approach to a hybrid programming language
SCI Third Annual Simulation Software Meeting 1967
13 T J GRACON J C STRAUSS
A decision procedure for selecting among proposed analog
computer patching systems
Simulation Vol 13 No 2 1969
14 J C STRAUSS editor
CSSL-The SCI continuous system simulation language
Simulation Vol 9 No 6 1967
15 M BREUER
Design automation of digital computers
Proc IEEE Vol 15 1966 1700-1720
16 EAI handbook of analog computation
Electronic Associates Inc N J 1967 Chapt 3 119
17 A E ROGERS T W CONNOLLY
Analog computation in engineering design
McGraw-Hill Co Inc 1960 379
18 M STEIN
Automatic digital programming of analog computers
IEEE Trans on E C Vol 12 1963 100-111
19 H PAYNTER J SUEZ
Automatic digital set-up and scaling of analog computers
ISA Trans Vol 3 1964 55-64
Hybrid executive-User's approach
by W. L. GRAVES and R. A. MacDONALD
TRW Systems Group
Redondo Beach, California
INTRODUCTION
Hybrid executive programs have long been prevalent
in the hybrid computer simulation industry; however,
what the essential features of a hybrid executive should be is still a controversial subject. For the most part,
the design of hybrid executives has been undertaken
by the manufacturers of hybrid systems, and in many
designs the complexity in the operation of these
programs has resulted in their usage only on large
class digital systems. Consequently, hybrid facilities
which employ a small to medium class digital computer
system are faced with the task of developing an executive program compatible with the facility environment. However, in many of these small to medium
hybrid facilities, the separate program development
effort for a hybrid executive is not undertaken until
considerable time after the installation of the hybrid
system. The usual reasons are inadequate programming funds or a higher priority assignment of available
personnel to satisfy programming and development
needs of existing hybrid simulations.
For hybrid computation, specifications for the executive design must include sufficient flexibility to enable
the user to easily alter the mode of the executive
execution at run time as well as at compilation time
to meet the requirements of the particular engineering
problem being simulated. In hybrid executives existing
today, such flexibility does not generally exist. These
executives usually consist of a conglomeration of many
programs that perform specific functions and are
linked together only to the extent that the order of
their execution is controlled by a simple monitor.
However, the nature of these functions is such that
the provision of linkage between control and problem
data could considerably reduce the complexity of their
implementation while increasing flexibility.
In this paper, the philosophy for a hybrid executive
design, which has evolved from extensive user experience, is described. Since it is a user philosophy, it is
relatively unique in the hybrid simulation industry,
wherein most designs are specified by "software experts" who usually have attained their expertise
in an all-digital environment. A definition of the term
"user" is in order. A user is defined as a person in the
role of either an applications programmer or engineering analyst, as opposed to a system software programmer or analyst. The hybrid executive (hereafter referred to as the TRW executive) discussed in this
paper was primarily developed to satisfy the simulation requirements for a large aerospace engineering
problem. However, the authors feel that the extended
usage of this executive to other applications, whatever
the size, is reasonable. The general requirements for
this problem and the rationale used in the design of
the executive programs are discussed.
Typical executive requirements for hybrid simulation
In early 1967, the TRW Analog/Hybrid Facility
was requested to develop a large multi-use
hybrid simulation capability in support of the Apollo
program. For this study, which involved several independent simulations, each basically simulating two
vehicles in six degrees of freedom (6 DOF) and employing as many as two
control systems for each vehicle, it became apparent that total executive control for each of these
simulations would be required for the following reasons:
• The size and complexity of the simulations would
require an extensive daily checkout to assure
simulation readiness. To accomplish this task by
manual means on the analog would be impractical,
and therefore potentiometer setup and static
checkout using digital control would be required.
Also, since it was expected that the definitions of
the simulation state would change frequently,
either due to changes in parameters or to different
selections of program options, the pot setup and
static checkout programs should have sufficient
flexibility to assure analog or system readiness
for the current simulation definition.
• Complete flexibility in the data input and output
formats would be needed, such that either the simulation staff or
the various engineering analysts assigned to this
project could communicate with the simulations
in a familiar, user-oriented language, without
being burdened by the details of specific data formats.
• A large simulation staff of programmers of varying
experience and backgrounds would be assigned to
the program; therefore, generalized software to
handle control functions such as interrupts, the analog/digital
interface, sampling, etc., needed to be developed so
that program interfacing would not be a difficult
task.
• Because of the size and complexity of the simulation and because of an additional requirement
to be able to use the simulations for multiple
studies, scaling of both amplitude and time would
be difficult to specify prior to execution. Therefore,
the capability to rescale at run time would be
necessary to reduce considerably the recompilations that would be required if this information were fixed within
the program.
• A capability to display the dynamic status of
up to several hundred variables digitally
and/or via the analog would be necessary. Because
digital display using a line printer during problem
execution would be time prohibitive, a dynamic
dump capability to external bulk storage (disc
drives or magnetic tapes) for later recovery or
further processing would be required.
• Because of the multiple potential uses for the
simulation programs, data I/O requirements from
study to study would be expected to vary considerably. Since it would be highly inefficient to
recompile the programs for each new I/O configuration, the executive capability must include
a means for defining the I/O processes at execution time rather than at compilation time.
• Since the total digital program storage requirements were expected to exceed available memory,
the executive program structure must provide
capability for program overlay and data interfacing in a manner not overburdening to either
the user or the respective programmers.
In satisfying the requirements for executive control
of the Apollo simulations, two important constraints
were applied. First, development and design effort of
the executive must be done within the budget and
schedule allotted by the Apollo simulation task, and
second, sufficient generalization and compatibility must
be maintained in the design for adaptation to other digital software systems, if necessary, during the simulation
effort. This latter constraint implies that the design
and implementation should not require modification
of software provided by the computer manufacturer,
(loader, compiler, I/O, etc.) for operation.
Evolution of the executive design and development
In the Hybrid Computation Facility at TRW
Systems Group, which currently employs a medium
class digital computer (CDC-3100) linked to four
analog computers (two Beckman 2132's and two Comcor CI-5000's), a generalized hybrid executive program
was not available for nearly three years from the time
of installation in 1964. A reasonable software development activity within TRW could not be initiated with
the available personnel because of commitments to
simulation development for several large programs.
Prior to late 1967, executive control for hybrid simulations was tailored specifically to fulfill the requirements for the particular study and was generally not
applicable from study to study. However, valuable
experience had been gained in realizing, from a usage
point of view, the total requirements and capabilities
for a generalized hybrid executive program.
Upon the initiation of the Apollo simulations in
1966, two approaches for developing a hybrid executive
were considered. One approach was to develop a complete executive separate from the problem implementation and later integrate the two programs for
final checkout. A second approach was to develop the
executive in parallel with the problem implementation
and integrate and check out the combined modules of
the simulation as they were developed. From the
stringent Apollo simulation schedule, it was apparent
that the latter approach would be more feasible. Consequently, the design evolution of the executive was
dictated by satisfying the particular simulation requirements at the time of implementation. As a result,
many of the capabilities presently existing in the TRW
executive have resulted from second or third generation
design changes as user flexibility and program efficiency
so required.
Program description
Several basic philosophies were adhered to during
the executive design and development:
1. Any information required in defining the simulation which may change frequently is entered
as data at run time. This class of information
includes items such as scale factors, linkage
assignments, analog component or console
assignments, required program sequencing control flags and all problem parameters.
2. Any information that is changed only if the
engineering system being studied is redefined is
compiled into the system. This would include
items such as problem equations, etc.
3. All control or problem executions which are
non-time critical, that is, not required for the
dynamic execution of the problem, need not
reside in memory during the time critical execution. Functions such as pre-data and post-data processing, initialization, pot value determination and setting, static check determination and interrogation are non-time critical
and are usually executed once per run sequence
and therefore may be program overlaid, thus
optimizing or reserving resident core for the
time critical or "Real Time" program.
4. All data values required to transfer information
or problem status between major program
functions must reside in core using a "COMMON"
reserved data area. It is this important constraint on implementation that permits the
usage of program overlaying and aids significantly in the executive design.
Five or six separate computer functions or programs
can be defined, which satisfy the total simulation
requirements: data I/O processing, initialization, pot
evaluation and setting, real time execution, static
check evaluation and interrogation, and possibly, post
data processing. Figure 1 depicts the general organization of these functions. It should be noted that the
order of execution of these functions is completely
determined by the user at run time from data input,
and that any single function can be executed separately or by an automatic sequencer.
Figure 1-Hybrid executive program structure
Since overlay processing is used, each function
or program usually, but not necessarily, comprises a separate
computer overlay, with each in turn further overlaid
(with the exception of the real time program) as increased core requirements are experienced. Each of
these programs is executed by a simple driver or
monitor upon command by the user, utilizing the
resident COMMON for data transfer. In the following
sections, the design for each of the five major programs
and the control of their execution is briefly discussed.
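The driver arrangement just described might be pictured as follows. This is only a minimal sketch in Python, not the FORTRAN of the TRW executive; the function names, the request names, and the `common` dictionary standing in for the resident COMMON area are all hypothetical.

```python
# Hypothetical sketch: a simple monitor that runs executive functions in the
# order the user requests at run time. A dict stands in for resident COMMON.

def initialization(common):
    # Non-time-critical setup; results are left in COMMON for later functions.
    common["state"] = [0.0, 0.0]
    common["log"].append("INIT")

def pot_setting(common):
    common["log"].append("POTSET")

def real_time(common):
    # The time-critical program reads what earlier functions left in COMMON.
    common["state"] = [x + 1.0 for x in common["state"]]
    common["log"].append("RUN")

def static_check(common):
    common["log"].append("STATIC")

# Table linking request names to functions (names are illustrative only).
FUNCTIONS = {
    "INIT": initialization,
    "POTSET": pot_setting,
    "RUN": real_time,
    "STATIC": static_check,
}

def monitor(requests, common):
    """Execute the requested functions in the order given by the user."""
    for name in requests:
        if name not in FUNCTIONS:
            raise ValueError("unknown function request: " + name)
        FUNCTIONS[name](common)
    return common

if __name__ == "__main__":
    common = {"log": []}
    # Order is chosen at run time from data input, not fixed at compile time.
    monitor(["INIT", "POTSET", "STATIC", "RUN"], common)
    print(common["log"], common["state"])
```

Because every function communicates only through the shared area, any one of them can be run alone or as part of a sequence, which is the property the overlay scheme relies on.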
Data I/O processing
Because the most frequent interaction between the
user and the system occurs through the I/O portion of
the executive, special attention is warranted to make
the interaction as painless as possible. Since the external
characteristics of entering both data and action requests
are identical for the TRW executive, the following
comments generally apply to both classes of information.
The essential task performed by I/O software is
the conversion between the data representations required
internally and those required externally to the computer. Each time an item of
information is processed for I/O, a description of the
item sufficient to allow conversion must be available.
The TRW executive requires inclusion of descriptors
that specify the following. Names entered must be
defined as data identifiers or action request identifiers.
The internal classification of the data, REAL, INTEGER, OCTAL, etc., must be specified. Differentiation
must be made between data that is part of an array
and data that is not. Conversion from one set of engineering units to another is also allowed and must be
specified.
Clearly, any I/O format that requires specification
of all of these descriptors every time an item is referenced is untenable. In the TRW executive, the approach
used to reduce the problem requires the user to provide
a list of all names that are to be accepted and the
required descriptors of each. Specification of the
required descriptors is done using FORTRAN oriented
names such as REAL, INTG, etc. This list is compiled
to allow ease of linkage with appropriate I/O handling
routines. Once the list is defined, entry of data requires
only a name and a numeric value. Since all conversion
is pre-specified, no artificial indicators, such as a
decimal point to specify a floating point number, are
required. Since it is reasonable to expect the descriptors
defined for each data value will not change unless the
problem definition changes, no appreciable loss in
flexibility for I/O processing is realized when the descriptor list is compiled.
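A descriptor list of this kind might look like the sketch below. The table layout, the names MASS, NSTEP, MASK and ALT, the type code OCTL, and the feet-to-meters factor are assumptions made for illustration; only the type names REAL and INTG come from the text.

```python
# Hypothetical sketch of a descriptor list: each accepted name carries the
# information needed to convert an external value to its internal form.

DESCRIPTORS = {
    # name: (kind, internal type, is_array, unit conversion factor)
    "MASS":  ("data", "REAL", False, 1.0),
    "NSTEP": ("data", "INTG", False, 1),
    "MASK":  ("data", "OCTL", False, 1),
    "ALT":   ("data", "REAL", False, 0.3048),  # entered in feet, stored in meters
}

def convert(name, text):
    """Convert the external text of one value using the descriptor for name."""
    kind, dtype, is_array, factor = DESCRIPTORS[name]
    if dtype == "REAL":
        # No decimal point is needed: the descriptor already says REAL.
        return float(text) * factor
    if dtype == "INTG":
        return int(text) * factor
    if dtype == "OCTL":
        return int(text, 8)
    raise ValueError("unsupported type: " + dtype)

def enter(common, name, text):
    """Store one name/value pair in the shared data area."""
    common[name] = convert(name, text)
    return common
```

With the descriptors fixed in advance, the user supplies only a name and a number, e.g. `enter(common, "MASS", "12")` stores a floating point 12.0 even though no decimal point was typed.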
The internal definition of conversion requirements
also permits extremely simple definitions of display
requirements. In this case, the data value already
exists within the computer and only the name of a
variable is necessary to complete the information
needed within the computer to define output requirements. Indirectly, this has allowed requesting all
display functions by simply entering a list of names.
The implications contained here are best illustrated
in the case of specifying "Dynamic Dump" requirements. This is an output function that should be time
optimized. Unfortunately, optimization of a routine
to output floating point data requires different instructions than those needed for output of fixed point
variables. In view of this, a problem arises when it is
desired to intermix floating and fixed point numbers
in a single general request list. The TRW executive,
since it has access to all pertinent descriptive information, can handle this problem internally without
the user even being aware that it is happening. The
allowance of such mixed mode lists is provided for
printing and dynamic dumps.
Another problem often encountered in trying to
enter data into a computer is caused by the presence
of rigid format structures, such as requiring that items
be aligned to specific card columns. Where users are
often required to hurriedly keypunch or type in their
own data for performance of runs, such rigidity becomes
too restrictive. Thus, one design criterion for the I/O
package was the elimination of this problem. A solution
was achieved through use of an input string scanning
routine which searches an entire input record for
appropriate data fields.
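A free-field scanner in this spirit can be sketched in a few lines. The `NAME = value` syntax and the separators accepted here are assumptions; the paper does not give the actual input grammar.

```python
import re

# Hypothetical free-field scanner: NAME = value pairs may appear anywhere in
# the record, separated by blanks or commas, with no fixed column alignment.
FIELD = re.compile(
    r"([A-Za-z][A-Za-z0-9]*)\s*=\s*([-+]?[0-9.]+(?:[Ee][-+]?[0-9]+)?)"
)

def scan_record(record):
    """Return the (name, value text) pairs found in one input record."""
    return [(m.group(1), m.group(2)) for m in FIELD.finditer(record)]
```

The value text is returned unconverted so that conversion can be left to whatever descriptor is on file for each name.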
In the case of action requests, two forms exist and
are distinguished only by their manner of use. The
first form, which is the larger class, is referred to as
an I/O action request, and the functions performed
are restricted to various manipulations of data. Requests for saving program status on a disk or transferring data from cards to tape are examples. Basic
to this class is the requirement that the subroutine
used to process the request return control to the
executive input/output controller. The second class of
action request, referred to as program execution
requests, is used to initiate execution of hybrid functions
not related to I/O. In this case the routine used to
satisfy the request passes control to the executive
execution sequence controller rather than the I/O
controller. In both cases, the specification of the
request to the executive program is the same, and the
user implies through his own subroutine the class to
which the request belongs. Figure 2 shows the control
used for I/O processing and how this
control interfaces with the executive control of the
functional blocks indicated in Figure 1.
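The two classes of action request might be distinguished as in the following sketch. The request names SAVE and GO, the return codes, and the handler bodies are hypothetical; the point illustrated is only that the request is specified identically in both cases and the user's routine determines where control goes next.

```python
# Hypothetical sketch: both request classes are specified the same way; the
# user's routine itself signals which controller receives control next.

IO_CONTROLLER = "io"
SEQUENCER = "sequencer"

def save_status(common):
    # I/O action request: manipulates data, then returns to the I/O controller.
    common["saved"] = dict(common.get("params", {}))
    return IO_CONTROLLER

def run_problem(common):
    # Program execution request: hands control to the execution sequencer.
    common["pending"] = ["INIT", "RUN"]
    return SEQUENCER

REQUESTS = {"SAVE": save_status, "GO": run_problem}

def io_controller(request_names, common):
    """Process requests in order until one passes control to the sequencer."""
    handled = []
    for name in request_names:
        handled.append(name)
        if REQUESTS[name](common) == SEQUENCER:
            break
    return handled
```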
Figure 2-Executive and I/O processing control
Potentiometer evaluation and setting
As part of performing each and every computer run,
potentiometers must be set to the proper values. In
most small and medium sized hybrid labs the ability
to do this from the digital is provided with one of
two levels of sophistication. The first requires specifying
the address of the potentiometer to be set and the
value to which it must be set. The second requires
specification of the potentiometer address, parameter
values and a FORTRAN-like expression used in computing the setting. The latter then both computes the
setting and automatically sets the potentiometer using
an interpretive compiler. Both of the methods require
that the user select those potentiometers whose settings
will change. This selection is based on the engineer's
knowledge of parameter value changes, and with lumped
parameter definitions, or where the same parameter
is used repeatedly throughout the problem, this can
be very cumbersome.
The TRW executive automatically includes the
necessary setting changes in the digital program and
thus relieves the user of an unnecessary burden. Since
the actual setting is the only number associated with a
potentiometer that reflects parameter variations, it is
used to initiate resetting of potentiometers. The method
used is as follows: a list containing all setting values
is retained on bulk storage; as part of each run, all
potentiometer settings are computed and compared to
the list; a difference between the two values automatically results in a resetting of the potentiometer and
the list being changed to reflect the new value.
Although the concept used is very simple, there are
implications that markedly affect program implementation. The most pertinent of these is the requirement that all current parameter values be available
to the program which computes the potentiometer
settings. To easily make these values available and to
still retain the speed necessary to make computation
of all settings feasible, requires compilation of the
setting evaluation routine instead of using an interpretive routine as do many of the hybrid computer manufacturers. Clearly, interpretive methods offer considerable flexibility in specifying potentiometer values,
but the authors believe that this degree of flexibility
is not necessary.
Before clarifying this point of view, a definition is
in order. Let

$$P_s = A_{sf} \cdot D_p$$

where
$P_s$ = Potentiometer setting
$D_p$ = Pot definition
$A_{sf}$ = Analog scale factor

Assuming the reader is familiar with the meaning of
"potentiometer setting" and "analog scale factor", the
given equation will suffice to define "pot definition".
The important characteristics of a pot definition are
its dependency only on physical parameter values and
its corresponding independence of potentiometer address or analog scale factor.
Dependency only on physical parameters implies
that a "pot definition" changes only when the problem
being solved is redefined in a manner such that
equations are changed, which in most simulations is
relatively infrequent. Thus, if the "potentiometer
setting" program requires recompilation only when
"pot definitions" are changed, no significant loss of
flexibility is encountered.
As a result of these considerations, the routine was
formulated such that analog scale factors and potentiometer addresses were entered as data and "pot definitions" were coded into a FORTRAN subroutine.
Use of this method utilizes the full capability of
COMMON while retaining the flexibility at run time
in specifying values (scale factor, component address)
most likely to vary.
Since the program (Figure 3) necessary to compute
the actual setting (i.e., form the product of the "pot
definition" and the scale factor), compare old and new
values and handle bulk storage files is the same for
any problem, it is formulated as part of the executive.
Definition of the pot setting requirements for a given
problem consists of coding the FORTRAN list of
"pot definitions" and preparing the list of pot addresses
and scale factors. The analog data associated with a
pot is stored in a serial file on a disk. This data consists
of the pot address, analog console number, analog
scale factor, present value of pot setting, and an index
(I) which defines where in an array ($D_p$) the value of
the pot definition is stored by the FORTRAN routine
used to evaluate the pot definitions.
Figure 3-Potentiometer setting control
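The pot setting pass described in this section could be sketched roughly as follows. The record fields mirror the serial file contents listed above, but the field names, the example pot definitions, the `set_pot` callback, and the comparison tolerance are assumptions introduced for illustration.

```python
# Hypothetical sketch of the pot setting pass. Pot definitions depend only on
# physical parameters; addresses and scale factors are run-time data.

def pot_definitions(params):
    """Stand-in for the compiled FORTRAN routine that fills the pot definition array."""
    return [
        params["k"] / params["m"],   # illustrative definitions only
        params["c"] / params["m"],
    ]

def update_pots(pot_file, params, set_pot, tol=1e-4):
    """Recompute every setting and reset only the pots whose value changed."""
    d_p = pot_definitions(params)
    changed = []
    for pot in pot_file:
        # Setting = analog scale factor times pot definition.
        new = pot["scale"] * d_p[pot["index"]]
        if abs(new - pot["setting"]) > tol:
            set_pot(pot["console"], pot["address"], new)
            pot["setting"] = new      # keep the stored list current
            changed.append(pot["address"])
    return changed
```

Since every setting is recomputed on each run, the user never has to work out which pots are affected by a parameter change; a change in one parameter simply shows up as a difference on the pots that depend on it.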
Initialization or finalization
In engineering simulations, the analyst prefers to
have the mechanization in a form which is either
familiar to him or closely related to the physical system
being simulated. In digital simulation, the analyst is
usually far removed from the program, and if his
results are of a suitable form, the actual formulation
of the equations is of little interest and can therefore
be optimized for computer efficiency and stability.
By contrast, in analog or hybrid simulation,
where a close rapport with the program is desirable,
mechanization in either an optimum or in a less computer sensitive manner is often traded off against a
more realizable formulation. As an example, an analyst
might have access only to data determined in a reference frame that differs from the reference frame best
suited for use within the computer (e.g., gimbal angles
vs direction cosines). In such cases, reformulation of
values for computer initialization may require extensive
computation. In hybrid computer simulations the
digital computer can be used to determine such values
regardless of the complexity. With this capability, the
total simulation can be formulated for optimum execution, and often better computer stability, and the
results transformed to the user's preference without
decreasing the flexibility to the user or analyst.
To accomplish the reformulation from the
user desired input form to the program execution form
and then to the user desired output form, non-time critical
digital calculations, which may be considerable, need
to be performed. Examples would be coordinate transformations, root extraction, curve fitting, data analysis,
etc. Since these types of calculations are executed only
once each run cycle, they can be programmed using
FORTRAN, extended precision, and non-optimal
programming techniques with a negligible effect on
system execution time or throughput. It is this purpose
that the preinitialization and/or finalization programs
serve. Since these programs are entirely dependent on
the problem being simulated, the only executive
function is the call to these programs and the provision for data linkage through the use of COMMON.
Real time program
The specification of software that would appreciably
aid in getting the real time program operational was
based on a generalization of the kind of problem that
would be solved using the system. It was assumed
that the physical system being studied could consist
of several interacting subsystems each having a unique
frequency content. For example, in the Apollo studies
the kinematics and dynamics are frequency separable.
This kind of system implied a computer program
consisting of several loosely interacting subprograms
each having its own timing and sampling requirements.
Two primary questions to be answered were "What
can an executive do that will provide assistance in
programming each subprogram?" and "What aid may
be provided in correlating the subprograms to represent
a complete system?"
Two facts immediately suggested general answers to
the questions above. Because each simulation represents a different system, the equations solved in the
real time program are essentially unique for each new
problem, and can be considered only by the user. At
the same time, certain functions such as mode control
and inter-computer data transfer are common to all
simulations and characteristically depend only upon
the computer system being used. Experience has shown
that the user normally displays considerable ability
to solve problems associated with his equations, but that
his performance deteriorates markedly when dealing
with computer system dependent functions. Clearly,
a general executive can only address itself to aid in
handling the computer system dependent problems
present in the real time program. It is also clear,
however, that these are the areas where the user most
needs aid.
The TRW executive includes three major activities
within the real time part. It provides generalized software to handle ADC/DAC specification, control of the
"dynamic dump" (time histories), and mode control
and interrupt processing. Relating these activities to
the questions above, it is found that generalization of
these functions provides assistance to the user both in
programming individual subprograms and in overall
system correlation. Justification of this last statement
requires a more detailed description of each of the
functions considered and their interaction with the
user.
Although extensive details pertaining to the methods
used in implementing the TRW executive are not
appropriate, some indication of the gross approach used
is in order. The available interrupt structure allows
execution of up to eight concurrent real time subprograms (this limit of eight is caused by the maximum number of programmable interrupts available in
the TRW hybrid system). A subprogram naming convention has been adopted to allow flexibility in choosing
the interval at which variables are stored on bulk
storage or at which variables are transferred for display
purposes. The subprograms are arbitrarily named
LOOP1, LOOP2, etc., up to the maximum number
allowed by the interrupts available. The number
associated with the loop is then used as a key to initiate
certain action. In the case of dynamically dumping
variables, the following scheme is used: each subprogram includes a call to the routine which performs
the dump operation; the parameter passed with the
call is the loop number; this number is compared to a
number entered as data which specifies the subprogram,
and thus the time interval, at which the dump is to be
made; if the numbers agree, a dump occurs. A
similar system is used for selecting inter-computer
display transfers.
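The loop-number keying scheme can be illustrated with a short sketch. The class, the variable names, and the two example subprograms are hypothetical; only the comparison of the passed loop number against a number entered as data is taken from the text.

```python
# Hypothetical sketch of the loop-number keyed dump. Each real time subprogram
# calls the same routine with its own loop number; a dump occurs only when
# that number matches the one entered as data.

class DynamicDump:
    def __init__(self, dump_loop, names):
        self.dump_loop = dump_loop   # entered as data: which LOOPn triggers dumps
        self.names = names           # entered as data: variables to save
        self.records = []            # stands in for bulk storage

    def call(self, loop_number, common):
        if loop_number != self.dump_loop:
            return False
        self.records.append([common[n] for n in self.names])
        return True

def loop1(common, dump):
    common["t"] += 0.01               # fast subprogram
    dump.call(1, common)

def loop2(common, dump):
    common["x"] = 2.0 * common["t"]   # slower subprogram
    dump.call(2, common)
```

Changing the dump interval then requires only a different loop number in the input data; none of the subprograms is recompiled.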
Associated with each subprogram are the following
parameters which may be entered as data:
Present problem time
Time interval at which the subprogram is executed
Address of the first ADC channel used
Number of ADC channels used
Address of the first DAC channel used
Number of DAC channels used
Interrupt priority level
This data is stored as blocks in a predefined order
known to each executive subroutine used in the real
time program. Such a block structure permits usage
of the same calling sequence to execute all executive
subroutines, thereby reducing the chance of programmer error to a minimum.
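One way to picture such a block is sketched below. The field order follows the parameter list above, while the field names and the three example routines are illustrative assumptions; what matters is that every routine accepts the same single argument.

```python
from dataclasses import dataclass

# Hypothetical layout of the per-subprogram parameter block. Field order
# follows the list above; the field names themselves are illustrative.
@dataclass
class LoopBlock:
    time: float       # present problem time
    interval: float   # time interval at which the subprogram is executed
    adc_first: int    # address of the first ADC channel used
    adc_count: int    # number of ADC channels used
    dac_first: int    # address of the first DAC channel used
    dac_count: int    # number of DAC channels used
    priority: int     # interrupt priority level

# Every executive routine takes the block as its only argument.
def adc_channels(block):
    return list(range(block.adc_first, block.adc_first + block.adc_count))

def dac_channels(block):
    return list(range(block.dac_first, block.dac_first + block.dac_count))

def advance_time(block):
    block.time += block.interval
    return block.time
```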
ADC/DAC specifications
The handling of ADC/DAC specifications within a
program would seem to present little difficulty since
even the most sophisticated DAC or ADC routine
should require no more than three or four parameters.
However, in many systems, specification of these
parameters requires compilation. Such a requirement
not only removes flexibility by requiring recompilation
to incorporate changes, but also forces the assignment
of specific equipment in a relatively early stage of
program development. At the time a particular subprogram is written, it is usually not convenient to
assign specific ADC's or DAC's since requirements
for all subprograms must be considered in determining
the best distribution. Similarly, conversion scale factors
may change at any time. Another capability convenient for the user is flexibility in specifying inter-computer data transfer for purposes of display. This
requires specification of specific DAC's or ADC's, the
variables to be transferred, the scale factors to be used,
and the time interval at which the transfer occurs. In
view of these considerations, it seems reasonable to
require software that allows assignment of all parameters associated with ADC's and DAC's at run time.
The actual assignments are made by entering two
lists of data; the first containing the names of the
variables to be transferred, and the second containing
the conversion scale factor. The lists are entered in an
order corresponding to the ADC or DAC line that is
being used. In the I/O processor, the list of names is
replaced by a list of the addresses of those names and
this, along with the scale factor list, is passed to the real
time program for tailoring of the specific transfer
routines for a run. Because intermixing of floating
and fixed point computations within the same program
is rarely encountered, DAC and ADC lists have been
restricted to include either floating point variables or
fixed point variables, but not both. This enables performance of ADC/DAC functions in simple indexed
loops which are easily tailored.
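The two-list arrangement might be sketched as follows. The symbol table, the slot indices standing in for addresses, and the `write_dac` and `read_adc` callbacks are hypothetical stand-ins for the hardware interface.

```python
# Hypothetical sketch: the I/O processor resolves names once, before the run;
# the real time transfer is then a simple indexed loop over parallel lists.

def resolve(names, symbol_table):
    """Replace each variable name by its slot index in the data area."""
    return [symbol_table[n] for n in names]

def dac_transfer(slots, scales, data, write_dac):
    """Send data[slot] * scale to DAC line i for each list position i."""
    for line, (slot, scale) in enumerate(zip(slots, scales)):
        write_dac(line, data[slot] * scale)

def adc_transfer(slots, scales, data, read_adc):
    """Read ADC line i and store the scaled value into data[slot]."""
    for line, (slot, scale) in enumerate(zip(slots, scales)):
        data[slot] = read_adc(line) * scale
```

The position of a name in the list selects the converter line, so reassigning a variable to a different DAC is a matter of reordering the input data.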
Dynamic dump
The capability for dynamically dumping variable
values onto bulk storage during a run and processing
them later when time becomes less restrictive, is a
desirable feature in any hybrid system. In addition to
providing information for analysis purposes, it is very
useful for dynamic debugging. Two essentially distinct
functions are associated with a dynamic dump capability. The first involves the specification of those
variables which are to be dumped and the actual
performance of the dump during execution. The second
involves the capability to display either the same
variables that are dumped or a set of variables which
are derived from the original variables by a user written
processing' program. Three user requirements affect the
specification of the dump function. First, he must
have freedom to specify those variables which he wishes
to dump and the frequency at which they are to be
saved. Second, for ease in interpreting the r~:mlt3,
the values dumped should be coherent in time. That
is, all values saved from a given interrupt level should
represent functions of the same time, otherwise, a
time skew in interpretation of the results will occur.
Third, if post run processing of data is present, the
user must be allowed to easily specify the form of
processing and a display list that is different from
the list of variables dumped.
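The three requirements above can be illustrated with a short Python sketch. The class `DynamicDump`, the function `post_process`, and the variable names are hypothetical; the sketch only shows one way to keep each saved record coherent in time and to derive a separate display list after the run.

```python
class DynamicDump:
    """Hypothetical sketch: save selected variables every `interval` frames."""

    def __init__(self, names, interval):
        self.names = list(names)
        self.interval = interval
        self.records = []      # stands in for bulk storage
        self._frame = 0

    def on_interrupt(self, time, variables):
        """Call once per frame at a given interrupt level."""
        self._frame += 1
        if self._frame % self.interval == 0:
            # Copy all values in one step so every record is coherent in time.
            self.records.append((time, tuple(variables[n] for n in self.names)))

def post_process(records, derive):
    """Apply a user-written function to each dumped record to build a display list."""
    return [(t, derive(*values)) for t, values in records]

# Example run: dump X and V on every second frame, then display X + V.
dump = DynamicDump(["X", "V"], interval=2)
for step in range(1, 5):
    dump.on_interrupt(time=step, variables={"X": step * 10, "V": step, "UNUSED": 0})
display = post_process(dump.records, lambda x, v: x + v)
```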
Mode control and interrupt processing
In considering the most suitable form for mode
control and interrupt handling routines, the situation
is somewhat different from that of inter-console data
transfer. Usually the programs necessary to handle
these functions are very hardware dependent and
generally so complex that only a highly experienced
programmer can adequately cope with the problem
involved. Here the obvious approach to specifying
executive requirements is to remove flexibility, and
therefore, the need for user intervention from the
system. Some user control is necessary, however, and
the amount of flexibility allowed by the executive
should be sufficient to satisfy his reasonable needs.
Certainly the user must be permitted to specify what
subprogram he wants executed when the computer is
in a given mode or when a particular interrupt occurs.
He must also be able to specify the priority of each
interrupt. It is also reasonable that an executive should
expect the user to specify the frequency and perhaps
the source of an interrupt. Beyond these few items, it
should not be necessary and, in fact, it is not desirable
for the user to intervene in the operation of mode or
interrupt control software. The other user consideration
that should be included is a "no penalty clause". Thus,
if a user requires only three interrupt levels, he should
not be required to inform the system that the other
available levels are not required. In general, the user
should only be required to specify those items which
he needs for solving his problem.
The procedure required to specify the specific interrupt structure for a given problem is as follows. The
address of a list is passed as a parameter to a standard
executive routine which tailors a general interrupt
structure to meet the user's requirements. The list
includes the names of the subprograms included in
the real time program and the names of their associated
data blocks. A similar list method is used to specify
routines that are to be executed when the computer
is placed in a given mode.
The standard executive routine is written such that
it completely handles all normal mode control and
interrupt servicing. Dummy subroutine calls are
included to allow user definition of special mode or
interrupt routines. During initialization the executive
extracts information from the lists described above
and modifies the dummy calls with appropriate user
supplied routine addresses. Similar dummy instructions
are used to permit generalization of other functions.
Since all dummy entries initially consist of "NOP"
instructions, failure to specify all modes or interrupt
levels will not affect execution.
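The dummy-call mechanism can be sketched in Python as a table of handlers that all start as no-ops. The names `InterruptStructure`, `tailor`, and `service`, and the inclusion of a level number in each list entry, are assumptions made for this illustration; the paper states only that the list holds subprogram names and their data block names.

```python
def nop(*args):
    """Default entry: does nothing, like the initial "NOP" instructions."""
    return None

class InterruptStructure:
    def __init__(self, levels):
        # Every level starts as a dummy entry, so unused levels need no action.
        self.handlers = [nop] * levels

    def tailor(self, user_list):
        """user_list holds (level, routine, data_block) entries supplied by the user."""
        for level, routine, data_block in user_list:
            # Bind the routine to its data block and patch the dummy entry.
            self.handlers[level] = lambda r=routine, d=data_block: r(d)

    def service(self, level):
        return self.handlers[level]()

# Example: the user specifies only one of eight available levels.
def integrate(block):
    block["count"] += 1
    return block["count"]

block = {"count": 0}
structure = InterruptStructure(levels=8)
structure.tailor([(2, integrate, block)])
first_result = structure.service(2)     # runs the user routine
unused_result = structure.service(5)    # unspecified level: harmless no-op
```

This also shows the "no penalty clause": levels the user never mentions cost him nothing and require no declaration.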
It was claimed earlier that the structures described
serve to simplify the preparation of individual subprograms
and the correlation of these into a unified
system. A review of the necessary steps will illustrate
this. While writing a subprogram, the user must only
be aware of the name assigned to the subprogram, the
name of the data block associated with it, and the
names of the executive sub-routines he wishes to call.
The total number of names needed in the TRW executive is six.
Integration of the subprograms demands very little
more from the user. Before final compilation of the
real time program, lists defining the mode control
and interrupt structure must be prepared. Since this
is done very late in the development of the program,
all information is readily obtainable. Preparation of
lists describing the details of interrupt priorities, execution intervals, etc., may be left until computer runs
are planned. Since the entire problem should be well
defined at this time, little difficulty is encountered in
selecting specific values for these parameters.
Static check
A major task which must be performed in any
simulation is static verification of both the hardware
used and the program being executed. An effective
digital program can greatly aid in carrying out many
parts of this task. The items that can be provided by
the digital computer system for static checking are:
• Initialization of the system using parameter values
chosen for the check.
• Comparison of computed values determined from
the physical equations in the digital with
those values sampled from the analog.
• Information useful in verifying the validity of
the equation values computed by the digital, that
is, debugging aids.
At TRW, the first requirement is met by the normal
executive system used for analysis runs. When a static
check is requested, normal run setup procedure is
followed to the end of the initialization phase of the
real time program (Figure 2). At this point, the static
test request is recognized and execution of the static
check program begins. Using this method of establishing the check case provides the advantages of convenience and flexibility in three ways. First, it allows
rapid switching to the check mode using actual run
values if a problem arises during analysis. Second,
after the check is made and the problem corrected,
the return to normal running conditions requires
absolutely no action. Third, the system allows rapid
definition and execution of several different check
cases. All that is necessary to perform a static check
is the entry of desired parameter values, using exactly
the same methods as any other analysis run, and a
request for execution of a static check. Since defining
a single check case that effectively verifies an entire
analog program is virtually impossible, the ability to
perform a series of checks is very important. Proceeding through the initialization phase of the real
time program has the advantage that the ADC and
DAC values which are sampled and presented during
initialization represent realistic problem values. This
is sufficient to complete the set of values needed to
base the entire static check on direct evaluation of
the physical equations.
The static check overlay consists of two programs.
The first is a FORTRAN subroutine in which the user
codes his equations for use as the check reference.
Because the system is dependent upon having access
to normal run parameters which are stored in the
computer COMMON area, the use of FORTRAN was
a natural choice. Also, the use of FORTRAN rather
than an interpreter program does not constrain the
user in coding the analog equations, as encountered
in some executive approaches.
The second program comprises the executive part
of the overlay (Figure 4). It compares the equation
values computed in the user FORTRAN program with
the output of an analog component and generates
appropriate error messages. The address of the analog
component, the analog scale factor, and two indices
which are used to correlate the component and the
appropriate equation value are entered as a data
record. Since only physical equations are coded in the
FORTRAN program, recompilation is necessary only
if these equations are redefined.
The correlation of an equation to an analog component and scale factor is achieved by using two indices
specified in the FORTRAN routine as follows. The
terms or factors of an equation that appear at a particular analog output are coded individually and stored
in a one dimensional array. The section of coding
for each equation is identified by a statement number
or index. The statement number and array index are
then included on a data card with the component
address and scale factor to provide the necessary
correlation. Since the computation of the terms of an
equation is done only after a complete set of data
cards for a given equation is read, the array used need
only be large enough to store all of the values computed
for the largest equation.
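The comparison performed by the executive part of the overlay can be sketched as below. This Python fragment is an assumption-laden simplification: it omits the statement number index, treats each data record as a tuple of component address, scale factor, and array index, and uses a hypothetical `tolerance` parameter that the paper does not specify.

```python
def static_check(equation_values, records, read_analog, tolerance=0.01):
    """Compare scaled equation values with sampled analog outputs.

    records: (component_address, scale_factor, array_index), one per data card.
    Returns a list of error messages (empty when every component agrees).
    """
    errors = []
    for address, scale, index in records:
        expected = equation_values[index] * scale
        measured = read_analog(address)
        if abs(expected - measured) > tolerance:
            errors.append(f"{address}: expected {expected:.4f}, measured {measured:.4f}")
    return errors

# Example with made-up component addresses and voltages.
equation_values = [2.0, 8.0]                    # computed by the user's equations
records = [("A01", 0.5, 0), ("A02", 0.125, 1)]
analog = {"A01": 1.0, "A02": 0.5}               # stands in for the analog computer
errors = static_check(equation_values, records, analog.__getitem__)
```

Replacing `read_analog` with a function that merely prints the expected value gives the off-line mode in which no analog computer is required.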
Figure 4-Static check control
The executive also provides user options that allow
extensive verification of the user program and the
data files without requiring the presence of an analog
computer. This option is usually not available in interpreter programs. The first option is a data card
editing function that detects obvious format and keypunching errors. The second performs the normal
static check procedure but replaces the interrogation
of the analog computer with a printout of both the
actual and scaled equation values. This data may
then be used to do off-line debugging of the static
check routine. Use of this feature can assure that only
analog program debugging will be necessary when the
analog computer is finally checked.
Other options are available to provide flexibility.
One allows a choice between checking all analog components or just components that represent the total
value of an equation. In the latter case, an error in the
final equation value will direct the program to check
all of the terms of that equation. A second option
permits skipping a check of selected parts of the
program.
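The option of checking only total values first can be sketched as follows. The function name `check_equation` and the use of callables for the individual checks are assumptions for illustration only.

```python
def check_equation(total_ok, term_checks):
    """Check the total first; examine individual terms only if the total fails.

    total_ok: callable returning True when the final equation value agrees.
    term_checks: list of (term_name, callable) pairs for the separate terms.
    Returns the names of the failing terms (empty when the total agrees).
    """
    if total_ok():
        return []
    return [name for name, ok in term_checks if not ok()]

# Example: the total is wrong, and the second term turns out to be the cause.
failing = check_equation(lambda: False, [("T1", lambda: True), ("T2", lambda: False)])
passing = check_equation(lambda: True, [("T1", lambda: False)])
```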
Operating procedures
Operationally the TRW executive has proven very
effective. The entire procedure for executing a computer run consists of entering desired parameter values
and a single command "RUN". From that point, all
setup, operating, and display functions are performed
automatically in a manner predefined by simple list
inputs. The provision that analog setup routines have
access to normal data parameters is of course the key
to making such a simple run procedure possible.
Future hybrid executive development
As it was indicated earlier, the TRW executive was
developed through a process of evolution under the
pressure of developing concurrently a large simulation.
Although the operational characteristics which have
resulted from this evolution are generally very good,
many of the systems software aspects leave room for
development. With a recent expansion in the number
of systems software personnel at TRW, it is now possible to reimplement the executive on a sounder systems
basis and integrate it into a more comprehensive software system. As proposed, the new system will provide
a multi-user capability, simplified file processing, a
more powerful I/O structure, accounting control and
extensive debugging aids.
ACKNOWLEDGMENT
The authors wish to express a special appreciation to
Charles E. Vaughnn for his contributions in the design
and especially in the implementation. It was by his
outstanding efforts and his expertise on the CDC 3100
digital computer that the details of the design were
worked out to assure compatibility throughout the
executive and with the system software.
BIBLIOGRAPHY
1 G A BEKEY
Hybrid computation
John Wiley & Sons Inc N Y 1968 7 177
2 D R MILLER G N GRADO B R BAKER
The philosophy and the result: Comcor's CI-5000 hybrid
computing system
Simulation July 1965 39-46
3 T D TRUITT
A discussion of the EAI approach to hybrid computation
Simulation Oct 1965 248-257
4 B R WILSON
The Boeing integrated hybrid operating system
Simulation Nov 1967 209-223
5 R B McGHEE A Y LEW
Software for hybrid computers
Simulation Dec 1965 367-373
6 C K BEDIENT L L DIKE
The Lockheed hybrid system - A giant step
Proc FJCC Vol 33 Part 1 1968
7 G N SOMA J D CRUMKLETON
A priority interrupt oriented hybrid executive
Proc FJCC Vol 33 Part 1 1968
8 M D THOMPSON
Growing pains in the evolution of hybrid executives
Proc FJCC Vol 33 Part 1 1968
9 D A WILLARD
The Boeing/Vertol hybrid executive system
Proc FJCC Vol 33 Part 1 1968
10 E A JACOBY J S RABY D E ROBINSON
Family I: Software for NASA-Ames simulation system
Proc FJCC Vol 33 Part 1 1968
11 W GILOI D BECKERT H C LIEBIG
A flexible standard programming system for hybrid
computation
Proc SJCC Vol 34 1969
A system for clinical data management
by R. A. GREENES, A. N. PAPPALARDO, C. W. MARBLE,
and G. O. BARNETT
Massachusetts General Hospital
Boston, Massachusetts
INTRODUCTION
The application of computers to the delivery of patient
care is more a problem of "data management" than of
"data processing." Although calculations and interpretation of data are often required, of much greater
concern are the problems involved in the collection,
communication, coordination, and presentation of
information. As the process of delivery of medical care
becomes increasingly complex, and involves increasing
numbers of professional and nonprofessional personnel,
responsibility for achieving the continuity and comprehensiveness that is essential to medical care seems to
rest heavily on the development of appropriate computer-based data management systems. Such systems
may further provide the primary feasible means by
which quality control, auditing of the medical care
process, and research into the diagnosis and treatment
of disease can be achieved.
These functions now are dependent on the use of
the patient medical record, although they are fulfilled
only to a minimal extent by it. Despite changing
functions and increased demands on it, the medical
record has changed little in form over the past century.
Medical records possess no organization by diagnostic
or therapeutic problem; notes relevant to a particular
aspect of a patient's health may be accessed only by
leafing through an entire volume. Terminology is not
standard, data is not organized in well-defined formats,
and notes are often illegible. As a consequence, the
objective of using the computer for clinical data
management is gaining considerable impetus.
This paper will describe a number of criteria which
the authors have found to be important in the design
of systems for clinical data management, and a novel
system which has been implemented to meet these
requirements. The system to be described has been in
operation for over a year. The extent to which it has
proved useful has led the authors to believe that the
criteria defined have general applicability for clinical
data management. In the discussion to follow, the term
"clinical data management system" refers to a timeshared computer system which supports on-line input,
inquiry, and retrieval of clinical information from a
central data base.
Design and implementation
The internal design of an information system dictates
constraints on the external attributes of such a system.
The characteristics that must be resolved include the
number, priority, and level of responsiveness of the
users, both active and inactive; the ratios among CPU
time, connect time, and input/output time; the structure, magnitude, and timeliness of file information;
the profile of application programs in regard to size,
type, and interactiveness; user requirements for development and service modes of operation; and finally,
the overall economic justification for the system.
High level programming language
One of the most time-consuming aspects of the
development of information system programs involves
the optimal interfacing of the system with its users in
a particular application area. This requires much
attention to human engineering, and repeated modification and revision of programs. The implementation of
clinical data management applications has generally
begun on relatively small computers. This has, in many
cases, been necessary because development was a
gradual process and started with limited objectives.
Since high level languages have not typically been
available on small machines, most programming has
been done in machine language.
The expense and inefficiency of writing, debugging,
and modifying such programs have been serious obstacles to active research and development. A few
clinical data management systems have used large
general purpose computers which could provide much
increased flexibility. However, the overhead of a large
operating system on a major computer has often
seemed excessive, because of the rather small amount
of processing involved in many of these applications.
Furthermore, because of the reliability requirements of
a clinical data management system, modularity and
duplication of hardware is desirable and often essential.
Because of the expense entailed by hardware redundancy, this is typically feasible only with inexpensive,
minimal equipment configurations.
The MGH Utility Multi-Programming System (MUMPS) is a compact time-sharing system on a
medium scale computer, dedicated to clinical data
management applications. It is currently implemented
on a PDP-9 (Digital Equipment Corporation) with
24,000 words of 18 bit memory and a Burroughs fixed
head disk with three million characters of storage
capacity. A set of terminal scanners is used to interface to remote devices: teletypes, buffered display
scopes, line printers, card readers, and A/D converters.
Both memory size and peripheral storage capacity can
be expanded in the system. In the current version, 16
users may run simultaneously.
All application programs in this system are written
in a high-level interpretive language, a distant ancestor
of which is JOSS,1 developed at the Rand Corporation
in 1964. It has also been influenced by related languages
such as STRINGCOMP (developed by Bolt, Beranek
and Newman, Inc.), and FILECOMP (specified by
Medinet Division of General Electric Corp.). The
MUMPS language allows the programmer to write a
program, debug it, edit it, run it, and modify it concurrently during an interactive session at a console.
The interpreter itself is a part of the executive system
and is re-entrant. The total space taken up by the
time-sharing monitor, the I/O monitor, buffers, and
re-entrant interpreter is currently about 8,000 words
of memory. The time-sharing and I/O monitors have
been specifically tailored to work efficiently with the
interpreter. No attempt has been made to accommodate
machine language user programs. All active users are
assigned partitions of core memory. Activating a
program consists of finding an available partition and
bringing the program into it from disk; as long as it
remains active, it occupies its partition. Core and disk
storage allocation are depicted in Figure 1.
Figure 1-A schematic diagram of the core memory
allocation of the MUMPS system and user partitions. A single
partition is expanded to show its internal structure. The use of
secondary storage (disk) for global data and inactive programs
is represented.
The basic orientation of the language is procedural,
much as FORTRAN and ALGOL. The largest unit
of a program is a group of statements called a "part"
indicated by an integer part number. A single line or
statement of the program is a "step"; it is identified
by a step number consisting of a decimal fraction
appended to the part number. Multiple commands may
be entered in a single step and executed one after
another. A conditional statement which when evaluated
has a false value will, however, cause the rest of the
commands in that step to be ignored. Commands may
be stated in a long mnemonic form, or for the experienced user, in a much more compact form in which
only the first letter of the command is used. A statement preceded by a step number is considered to be
in "indirect" or "program" mode, and is stored to be
executed as part of a program. A statement without a
step number is in "direct" mode, which indicates that
it is to be executed immediately after it is entered
from the user terminal.
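The distinction between "indirect" and "direct" mode can be sketched in Python. This is not the MUMPS interpreter: the function names, the dictionary used as program storage, and the sample statements are hypothetical, and the sketch handles only the classification and ordering of entered lines.

```python
import re

# A line in "indirect" mode begins with a step number such as 1.10.
STEP_PREFIX = re.compile(r"^(\d+)\.(\d+)\s+(.*)$")

def enter_line(line, program, execute):
    """Store a numbered step (indirect mode) or execute the line at once (direct mode)."""
    match = STEP_PREFIX.match(line.strip())
    if match:
        part, fraction, text = match.groups()
        # Key by (part number, decimal fraction) so 1.1 and 1.10 name the same step.
        program[(int(part), float("0." + fraction))] = text
        return "stored"
    execute(line.strip())
    return "executed"

def steps_of_part(program, part):
    """Return the statements of one part in step-number order."""
    return [text for (p, _), text in sorted(program.items()) if p == part]

# Example session: two steps entered out of order, then a direct command.
program, executed = {}, []
modes = [
    enter_line("1.15 IF X<0 TYPE X", program, executed.append),
    enter_line("1.10 READ X", program, executed.append),
    enter_line("DO 1", program, executed.append),
]
listing = steps_of_part(program, 1)
```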
Interface flexibility
Clinical information about a patient derives from a
variety of sources-the patient, the attending physician,
consultants, the radiologists, the clinical laboratory,
etc. Problems of using the computer to obtain information from each of these sources have begun to receive attention. Perhaps the most widespread activity
of this type has been the development of systems for
clinical laboratory information processing.2,3,4,5
With the exception of laboratory data, which is
either numeric or simple text, much of the clinical
information in the medical record is generally recorded
in narrative or free text form. Most investigators are
convinced that natural language is not in general
suitable for computer record keeping applications,
except perhaps in certain circumscribed areas with
limited vocabulary and syntax.6,7 As a result, there is a
significant amount of work currently being devoted to
the development of methods for structuring this
narrative data.8,9,10 It is generally recognized that this
may be best achieved by introduction of new ways of
capturing such information, e.g., entry of data by use
of check lists, forms, or direct user-computer dialogue.
Interactive dialogues for the capture of narrative data
may be based on hierarchical organization and presentation to the user of the subject material. Any particular
topic may then be pursued to an arbitrary depth, by
means of a succession of increasingly discriminating
selections by the user from the options presented. A
variety of programs for interactive acquisition of clinical data have been developed, and have generated
needs for special terminals, display formats, and conversational languages. Conversational programs have, for
example, been devised for the on-line acquisition of a
patient's medical history.11,12 Other systems aimed
primarily at the physician have been designed for the
purpose of entry of physical examination notes,13 the
recording of progress notes, or the generation of X-ray
reports by the radiologist.14,15 In the development of
such applications, the emphasis is placed primarily on
the interface (hardware, software, and environmental)
of the system with the individuals who have to use it.
As the potential of clinical data management systems
is recognized, they will be called upon to fulfill a diversity of output functions, e.g., the display of reports or
summaries, organized chronologically or topically, the
production of tables or graphs. Information obtained
by dialogue must often be translated into more precise
medical terminology, or compacted into coded representations. Flexibility in output and presentation of
information, as well as in its acquisition, is essential.
The philosophy of MUMPS has emphasized the need
for ease in interfacing and adapting programs to the
requirements of the application. Programs written in
the interpretive language do not require any compiling
or assembling. Error comments during execution are
typed out at the user's console, and allow quick recovery, modification of the program, and reexecution
of it. All debugging and modification is done in the
same language in which the program is written and
can be done entirely from the user terminal. This
makes modification especially convenient, particularly
in a service environment where the trouble shooting
necessary to interface a program with an application
area is a time consuming process. The MUMPS environment allows a programming session to take the form of
a conversational dialogue between the programmer and
the terminal device, thus minimizing the user's time in
programming a problem, the computer's time needed
in checking it out, and most important, the elapsed
time required to obtain a final running application
program.
Text handling capabilities
The complexity and variety of data that must be
handled in a clinical information system impose a
number of requirements on the system. A considerable
amount of information that is input is in the form of
text strings of variable length. The processing of input
often requires syntax checking or limit checking. String
comparisons, extractions, and concatenations need to
be performed. When special driver languages or monitor subsystems are employed to control dialogues
between the user and the computer, string processing
capabilities are mandatory. Most existing higher level
languages do not provide the needed combination of
algebraic and boolean expression handling capabilities
with the ability to handle string information.
The MUMPS language has been designed to meet
this need. In addition to algebraic and boolean processing, a MUMPS program can perform string extraction, locations, comparisons, and checking of
syntax and form of information. These features are
illustrated in Figures 2 and 3. Figure 2 shows a portion
of a program written in MUMPS to read a hospital
unit number from a Teletype (i.e., entered by a user),
to check its syntax, and to reject any improperly
formatted responses. Figure 3 shows statements in a
program for the clinical chemistry laboratory, which
permit entry of a test name and its result. Checks are
made on the legality of the test name and the reasonableness of the result. Some of the interactive editing
capabilities are shown in the figure.
Figure 2-A portion of a MUMPS program to input
a seven digit unit number from the teletype (accomplished by
step 1.10). The value entered is stored as the variable named X;
a check is made that X has the correct form, i.e., 3 digits, followed
by a hyphen, 2 digits, a hyphen, and 2 more digits (step 1.15).
Improper values cause an error message, and request of a new
value. The WRITE command lists the statements. The DO
command causes execution, which is illustrated. (In this and
other figures, user input is underlined to distinguish it from the
response of the computer.)
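The form check on the unit number (three digits, a hyphen, two digits, a hyphen, and two more digits) can be expressed outside MUMPS as a regular expression. The following Python sketch is an equivalent written for illustration, not a transcription of the MUMPS program; the function names and the sample entries are assumptions.

```python
import re

# Three digits, hyphen, two digits, hyphen, two digits.
UNIT_NUMBER = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{2}")

def is_legal_unit_number(text):
    """True only when the whole entry has the form NNN-NN-NN."""
    return UNIT_NUMBER.fullmatch(text) is not None

def read_unit_number(entries):
    """Consume entries until one is legal; return it with the rejected entries."""
    rejected = []
    for entry in entries:
        if is_legal_unit_number(entry):
            return entry, rejected
        rejected.append(entry)   # a program would report the error and ask again
    return None, rejected

# Example: two malformed entries are rejected before a legal one is accepted.
accepted, rejected = read_unit_number(["123-45-67R", "12-345-67", "123-45-67"])
```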
Terminal device flexibility
An important feature of the language is its input/
output scheme, which permits programs to be written
independently of the particular device for which one is
programming. One may use any device for which the
hardware system has been appropriately interfaced by
merely assigning a device number to a system variable
indicating the device to be utilized. This makes it
possible to generate a report on a display scope, for
example, and then to use the same program to type
out the report on a typewriter, merely by changing,
during execution, the value of the device number
assigned to the input/output variable. Formatting and
control of position on a page are made very simple by
utilization of special format characters and variables
indicating current position and line spacing.
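Device independence of this kind can be sketched in Python by routing all output through a single "current device" variable. The classes `Terminal` and `IOSystem`, the device numbers, and the sample report are hypothetical and stand in for the system variable described above.

```python
class Terminal:
    """Hypothetical device that collects its output as a list of lines."""
    def __init__(self):
        self.lines = []

    def write(self, text):
        self.lines.append(text)

class IOSystem:
    def __init__(self, devices):
        self.devices = devices   # device number -> device object
        self.current = None      # plays the role of the input/output system variable

    def type_out(self, text):
        self.devices[self.current].write(text)

def report(io, results):
    """The report program never names a device; it only writes."""
    for name, value in results:
        io.type_out(f"{name}: {value}")

# Example: the same program writes to a display scope, then to a typewriter.
scope, typewriter = Terminal(), Terminal()
io = IOSystem({1: scope, 2: typewriter})
io.current = 1
report(io, [("NA", 140)])
io.current = 2
report(io, [("NA", 140)])
```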
Multi-user access to a central data base
A major requirement of a clinical data management
system is that the information stored be accessible to a
variety of users concurrently. Access may be from a
variety of terminals, by a variety of programs in the
system, at varying frequencies. Among the possible
purposes for accessing a file might be to report a
laboratory result, to enter an X-ray impression, to
record a progress note, or to enter a specific inquiry.
Although many of these activities occur independently,
they must share a common data base. Nevertheless,
manipulation of the data base must occur without
time sharing conflict, such as might occur if two users
were to update a portion of the data base simultaneously. Without special provision, this might result in
loss of information.
Figure 3-A section of a MUMPS program that might
be used in a clinical chemistry laboratory information system.
Step 2.05 sets the variable DCT to the list of test determinations
that are valid for this particular laboratory. Step 2.10 then accepts
a test name from a technician. The $PIECE function in step
2.20 then extracts substrings (between commas) from DCT and
compares them to the variable TES whose value is the test name
entered. It does this repeatedly for values of I = 1, ..., 14 until
a match is found; at this point the iteration is terminated and
execution continues at step 2.30. If no match is found, an error
comment is printed (step 2.25) and step 2.10 is repeated. Step
2.30 accepts a test result, and goes to a part in the program
dependent on the particular value of I for which the match was
found.
Part 9 illustrates a specific check for results entered for the test
name, NA (in which case I = 6). The result is compared to
prescribed limits, in step 9.10, and if it exceeds either limit, control goes to step 2.40. Here the user is asked to verify the value.
The user's response is inspected to see if it contains a "Y", in
which case a YES response is implied. Otherwise, a new result
is requested, in step 2.30. If either the user verifies it, or the
result is within limits set by step 9.10, control goes to step 2.50.
Step 2.50 calls part 100 to file the value and then returns to step
2.10.
The DO command causes execution, which illustrates operation
of the program. Note that the user has interrupted the program
from his teletype (indicated by the "? 2.10 IOINT" error comment, showing where the interrupt occurred). In this case, a
programmer has decided to edit the program to make the limits
for a sodium determination more stringent, by retyping step
9.10. The program is then re-executed.
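The test-name lookup and limit check described for Figure 3 can be sketched in Python. This is an illustration under stated assumptions: `piece` imitates only the comma-delimited behaviour described for $PIECE, the test list uses placeholder names apart from NA in sixth position, and the limits 120 and 160 are invented for the example.

```python
def piece(text, index, delimiter=","):
    """Return the index-th substring (1-based) between delimiters, or "" if absent."""
    parts = text.split(delimiter)
    return parts[index - 1] if 1 <= index <= len(parts) else ""

def find_test(dct, name, count):
    """Return the 1-based position of name in dct, or 0 when no piece matches."""
    for i in range(1, count + 1):
        if piece(dct, i) == name:
            return i          # iteration stops at the first match
    return 0

def needs_verification(result, low, high):
    """True when the result lies outside the prescribed limits."""
    return result < low or result > high

# Placeholder list: only the position of NA (sixth) follows the figure caption.
DCT = "T1,T2,T3,T4,T5,NA"
position = find_test(DCT, "NA", 6)
unknown = find_test(DCT, "XX", 6)
```

A result that `needs_verification` flags would lead to the "verify" question of step 2.40; a result within limits would go directly to filing.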
Efforts to develop specialized clinical data management applications are still relatively primitive. There
have been very few concerted efforts devoted to the
general problem of management of medical record data,
the development of integrated patient data files, and
the implementation of systems for long term storage
and retrieval of this data.16,17 Among the difficulties
faced by the few developmental efforts that have been
undertaken have been the lack of generality in their
approaches, and the reliance on highly specific programming languages, file structures, and file handling
routines.
MUMPS provides application programs with the
ability to create and utilize their own "local" data
as well as to manipulate "global" data, shared by
other programs in the system. Local data utilized by
a program is referenced symbolically, and space for
it is allocated as needed. Local data is that set of variables established within the domain of a particular
program, and available and defined only within that
program. The data actually resides within the user
partition, and functions as scratch or transient data.
Local arrays are assumed to be sparse or of varying
dimensions, and only subscripts for which data are
defined are allocated space. A symbolic variable used
in a program may be given either a numerical value or
a variable-length string value. When it has a string
value, only that space required by the string is actually
allocated. Thus for both strings and sparse arrays,
MUMPS avoids the overhead of a compiler system, in which
maximum sizes of arrays and maximum
lengths for string variables must typically be allocated.
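Sparse allocation of this kind can be sketched with a Python dictionary keyed by subscript tuples. The class `SparseArray` and its methods are hypothetical names; the sketch shows only that space is used for defined subscripts and that an element may hold either a number or a string of any length.

```python
class SparseArray:
    """Hypothetical sketch: only subscripts that have been assigned occupy space."""

    def __init__(self):
        self._cells = {}

    def set(self, subscripts, value):
        # The value may be a number or a variable-length string.
        self._cells[tuple(subscripts)] = value

    def get(self, subscripts):
        return self._cells[tuple(subscripts)]    # KeyError if undefined

    def defined(self, subscripts):
        return tuple(subscripts) in self._cells

    def allocated(self):
        """Number of elements actually stored."""
        return len(self._cells)

# Example: two widely separated subscripts occupy two cells, not a full block.
a = SparseArray()
a.set((3,), 98.6)
a.set((5000, 2), "SODIUM")
```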
This philosophy is extended to the management of
data on the random access disk. Elements stored in
data files are referenced entirely symbolically; the
file name is similar to that of a local variable name in
a program. Fields in the data file are treated as array
elements and referenced by means of subscripts; subfields are referenced by appending additional subscripts. Data files on the disk thus comprise an external
system of arrays, which provide a common data base
available to all programs. The arrays which make up
this external system are called global variables and
are identified by global array names. A global name
(or file name) consists of the character up-arrow (↑)
followed by at least one alphabetic character. The
form of the subscript portion of an array reference
consists of an arbitrary number of numeric expressions
separated by commas and enclosed by parentheses.
To avoid time-sharing conflicts, a program may
prevent other programs from having access to one or
more global arrays which it is in the process of altering
in some way, by the use of the command OPEN. The
argument of OPEN may be one array name or a list
of array names. OPEN prevents any other program
from altering data in any of the specified arrays. The
effect of OPEN is cancelled when the program ends or
at the occurrence of the command CLOSE, which does
not require any arguments, and releases all opened
arrays to other users in the system.
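The OPEN and CLOSE behavior described above might be modeled roughly as follows. This is a minimal Python sketch, not the MUMPS implementation: the class name GlobalLocks, the program identifiers, and the choice to refuse (rather than delay) a conflicting OPEN are all assumptions made for illustration.

```python
class GlobalLocks:
    """Toy model of OPEN/CLOSE: at most one owning program per global array."""

    def __init__(self):
        self._owner = {}  # array name -> id of the program holding it open

    def open(self, program, *names):
        # Assumption: refuse the whole request if any array is held elsewhere.
        busy = [n for n in names if self._owner.get(n, program) != program]
        if busy:
            return False
        for n in names:
            self._owner[n] = program
        return True

    def may_alter(self, program, name):
        # Altering is allowed unless a different program has the array open.
        return self._owner.get(name, program) == program

    def close(self, program):
        # CLOSE takes no arguments: release everything this program opened.
        for n in [n for n, p in self._owner.items() if p == program]:
            del self._owner[n]


locks = GlobalLocks()
locks.open("A", "APR", "LAB")
print(locks.may_alter("B", "APR"))   # False while program A holds APR open
locks.close("A")
print(locks.may_alter("B", "APR"))   # True once A has issued CLOSE
```

A program that ends without issuing CLOSE would need the same release step applied on its behalf.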
Hierarchical data base organization
A most important requirement for clinical data
management is the ability to handle the several levels
of structure of a medical record data base, and to
support the rather complex updating and retrieval
needs of such a system. An example of a typical patient
data file, such as exists in the information system under
development at the Massachusetts General Hospital,
is illustrated in Figure 4. This indicates the typically
hierarchical (tree-like) structure of the data base, which
has both a topical and a chronological organization.
Most computer systems currently available do not have
the ability to utilize hierarchical file organizations
conveniently.
The global array facility in MUMPS has been designed to meet this need. The structure of global
arrays is hierarchical, and any node within the array
tree may possess a numeric or string data value and/or
a pointer to a lower level in the tree. Data may be
stored at any level, and there are no constraints to
the dimension or the size of the array. In addition
the quantity and magnitude of subscripts for an array
are dynamic, so that not only may the content of an
array change during usage, but also its structure may
vary.
Since modification of content and structure of a
global array may be caused by a variety of programs
in the system, a particular program must sometimes
examine the current configuration of an array before
attempting to access or update it. MUMPS provides a
set of global array functions to determine the type
and structure of a global array. These functions permit
the programmer to locate the nodes where information
is stored within an array, and nodes within the array
which are empty and thus available for data storage.
Figure 4-A tree-structured patient data file, indicating: (1) the use of certain levels in the tree to group information in
specific topics, e.g., basic identifying and administrative data, review of systems, physical examination, and (2) other levels to
group information into sets which differ by date or by some other sequencing field.
The storage of data into an array is accomplished
solely by the assignment command, SET. Consider
the following statement:
SET ↑APR(UN,NAME) = "JOHN DOE",
↑APR(UN,AGE) = 34
Assume the global array name ↑APR is reserved for
the active patient record file. Each patient in the file is
accessed through his hospital unit number, in this
case, a local variable UN. Both NAME and AGE are
also local variables whose values indicate particular
categories represented by subscripts at the second level
of the array. This statement then assigns the string
value "JOHN DOE" and the numeric value 34 to the
specific second level categories, name and age respectively. Subsequently, a statement such as:
SET ↑APR(UN,CHEM,N) = DATE.",".TEST
might define the Nth laboratory test in the chemistry
lab with the double field entry of the date concatenated
(by means of the dot operator) with a comma and the
test name.
Retrieving data from global arrays is no different
from retrieving data from local arrays. Both consist
of ascertaining the value of a subscripted variable by
using it within a numeric or string valued expression.
The statement:
TYPE "THE AGE OF ", ↑APR(UN,NAME),
" IS ", ↑APR(UN,AGE)
will effect the printout:
THE AGE OF JOHN DOE IS 34
To print out a list of a patient's laboratory tests
(assuming ↑APR(UN,CHEM) is the total number
of tests defined) the following statement might be
used:
FOR I=1:1:↑APR(UN,CHEM) TYPE
↑APR(UN,CHEM,I)
The KILL command, when applied to a specific
node in a global array, prunes the array tree at that
node. Any data value and/or array pointers to lower
level nodes are removed, and that node reverts back
to an undefined status. The statement KILL ↑APR(UN)
would delete all information for the patient
defined by the local variable UN.
Included in the global array syntax is the "naked"
global variable. The form of the naked variable consists of the up-arrow followed by a subscript enclosed
in parentheses. This notation is equivalent to the last
previously used global array reference except that the
value of the last subscript is replaced with the value
of the subscript in the naked variable. For example,
the statement:
TYPE "THE AGE OF ", ↑APR(UN,NAME),
" IS ", ↑(AGE)
is equivalent to the example cited earlier.
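For readers more familiar with present-day languages, the SET, retrieval, KILL, and naked-reference behavior described above can be approximated in a few lines of Python. This is only a sketch of the semantics as stated in the text; the class names Node and Globals, the unit number 123456, and the use of string literals in place of the local variables NAME and AGE are assumptions for illustration.

```python
class Node:
    def __init__(self):
        self.value = None    # numeric or string datum stored at this node, if any
        self.children = {}   # subscript -> Node (the pointer to a lower level)


class Globals:
    def __init__(self):
        self.roots = {}      # global array name -> root Node
        self.last = None     # (name, subscripts) of the last global reference

    def _find(self, name, subs, create=False):
        node = self.roots.setdefault(name, Node()) if create else self.roots.get(name)
        for s in subs:
            if node is None:
                return None
            node = node.children.setdefault(s, Node()) if create else node.children.get(s)
        return node

    def set(self, name, subs, value):
        self.last = (name, tuple(subs))
        self._find(name, subs, create=True).value = value

    def get(self, name, subs):
        self.last = (name, tuple(subs))
        node = self._find(name, subs)
        return None if node is None else node.value

    def naked(self, sub):
        # Reuse the last reference, replacing only its final subscript.
        name, subs = self.last
        return self.get(name, subs[:-1] + (sub,))

    def kill(self, name, subs):
        # Prune the tree: the node's value and everything below it disappear.
        if not subs:
            self.roots.pop(name, None)
            return
        parent = self._find(name, subs[:-1])
        if parent is not None:
            parent.children.pop(subs[-1], None)


g = Globals()
UN = 123456                                    # hypothetical hospital unit number
g.set("APR", (UN, "NAME"), "JOHN DOE")
g.set("APR", (UN, "AGE"), 34)
print("THE AGE OF", g.get("APR", (UN, "NAME")), "IS", g.naked("AGE"))
g.kill("APR", (UN,))
print(g.get("APR", (UN, "NAME")))              # None: the patient's subtree is gone
```

Because only defined subscripts create nodes, the sketch also shows why sparse arrays cost nothing for their undefined elements.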
MUMPS requires that reference to all file information be done symbolically, in the syntax of hierarchical global arrays. This replaces the classical manner
of sequentially accessing record files on secondary
memory devices. Instead, an attempt is made to logically map the content and structure of the tree-like
data arrays into the physical storage medium of the
system. The general technique is to map logical information at a specific level of an array into fixed size
blocks chained together linearly to contain all the data
values stored at that level, and all the pointer words
which link it to the chains of the next lower level. The
implementation of this design requires a careful consideration of the timing and size constraints of the
physical device in relation to the overall system. The
actual memory device used in the system is a large
fixed head disk. The organization of this type of disk
is two dimensional, wherein any physical block has a
track and a segment coordinate. Initially a set of free
lists are formed which chain all blocks possessing the
same segment address together. Whenever a continuation block at the same level or a header block at a
new level is required, the appropriate block in the free
list whose segment address is a few segments away is
utilized. This method makes it possible to trace down
the many levels of a tree structure required to access
a datum during a fraction of a disk revolution, in
addition to the average access time of the disk unit
required to reach the first level of the tree. As a consequence, the time required to retrieve a particular
datum is virtually independent of the depth of subscripting required to specify the datum. Space is
conserved by utilizing small sized physical blocks such
that at any subscript level an average of one continuation block is required. When data is updated, care is
taken to repack and sometimes reorganize the individual
data elements within a chain to ensure maximum
utilization of space for variable length data. Whenever
a part of the global structure is deleted, it is passed
to the garbage collector routines to be disassembled
from tree-like chains back into linear chains and appended to the appropriate free lists. This is done during
periods of low CPU activity so as to avoid competition
with the active programs.
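The block placement rule described above, one free list per segment address with new blocks taken a few segments further round the disk, might look something like the following Python sketch. The figures SEGMENTS, TRACKS, and GAP are hypothetical, as are the function names; the paper does not give the actual disk geometry or the exact distance used.

```python
from collections import deque

SEGMENTS = 16   # segments per track (hypothetical figure)
TRACKS = 4      # number of tracks (hypothetical figure)
GAP = 3         # preferred rotational distance in segments (hypothetical figure)

# One free list per segment address, each holding (track, segment) blocks.
free = {s: deque((t, s) for t in range(TRACKS)) for s in range(SEGMENTS)}


def allocate_near(current_segment):
    """Take a free block a few segments past the block just read."""
    for step in range(SEGMENTS):
        seg = (current_segment + GAP + step) % SEGMENTS
        if free[seg]:
            return free[seg].popleft()
    raise RuntimeError("no free blocks left")


def release(block):
    # A deleted block is appended to the free list for its own segment.
    free[block[1]].append(block)


header = allocate_near(0)          # block for a new subscript level
lower = allocate_near(header[1])   # block for the next level down
print(header, lower)               # (0, 3) (0, 6)
```

Since each level sits a short rotational distance after its parent, a descent through several levels can complete within one revolution, which is the property the text relies on.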
Once a block of data accommodating a single level
of subscripting is referenced, it is maintained in core
memory until a reference is given to a different block
by the program. Use of the naked variable then permits
other data at the same level to be referenced merely
by specifying a terminal subscript, so that once a level
is reached, often no further disk access need be made
to manipulate associated information. If any data in
a block is altered, it is only written back on the disk
when a reference is made to a block other than the
one that is in core memory, or when a CLOSE command
is given.
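The buffering policy just described, one block held in core and written back only when it has been altered and is about to be displaced, can be sketched as below. This is an illustrative Python model under stated assumptions: the class BlockBuffer, the block identifiers, and the read and write counters are not from the paper.

```python
class BlockBuffer:
    """Holds one disk block in core; writes it back only if altered and displaced."""

    def __init__(self, disk):
        self.disk = disk          # block id -> dict of terminal subscript -> value
        self.block_id = None
        self.data = None
        self.dirty = False
        self.reads = 0
        self.writes = 0

    def _load(self, block_id):
        if block_id == self.block_id:
            return                # already in core: no disk access needed
        self.flush()              # a different block displaces the current one
        self.data = dict(self.disk[block_id])
        self.block_id = block_id
        self.reads += 1

    def get(self, block_id, sub):
        self._load(block_id)
        return self.data.get(sub)

    def set(self, block_id, sub, value):
        self._load(block_id)
        self.data[sub] = value
        self.dirty = True

    def flush(self):
        # Also the action a CLOSE command would trigger.
        if self.dirty:
            self.disk[self.block_id] = dict(self.data)
            self.writes += 1
            self.dirty = False


disk = {"patient-1": {"NAME": "JOHN DOE"}, "patient-2": {}}
buf = BlockBuffer(disk)
buf.get("patient-1", "NAME")
buf.set("patient-1", "AGE", 34)   # same block: no further disk access
buf.get("patient-2", "NAME")      # different block: forces the write-back
print(buf.reads, buf.writes)      # 2 1
```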
Large storage capacity
The conversational environment in which a clinical
data management system is designed to operate demands little computer processing power. When data is
entered, a program need only check on its legality,
decide where to file it, and select an appropriate response to the user. Generation of reports may involve
manipulation of information from peripheral storage
to assemble the data needed, but only a small amount
of processing to actually format or produce the report.
Large volumes of data need to be available for low
level, low frequency usage. Thus one does not need
computing power as much as the availability of peripheral storage of large capacity. Much of the data
may be potentially accessed at any time, and therefore
needs to be stored on a random access device. Because
of the large quantities of data that may be anticipated
in such systems, it is necessary to provide hierarchies
of peripheral storage, in which the access time of the
storage device used is commensurate with the frequency or urgency of the need for retrieval.
In MUMPS the fixed head disk provides fast random
access storage, whereas slower access requirements are
currently met by three Dectape units. A large movable
head disk unit is being installed to permit intermediate
access times for other data.
Efficient Time Sharing
In a conversational data management system,
programs spend much of their time in an input/output
hung status, i.e., doing disk activity or completing a
transaction at a terminal. As a result, there is again
not a large demand by a program for the central
processor. In contrast to most numerical applications
where central processing power is the limiting factor,
in a conversational environment the time necessary to
complete a task is often determined by the speed of
the input/output equipment or the human response
time at a terminal. As a consequence of the small
demand for the central processor by an individual
program, one can theoretically time share a large
number of programs. Efficiency of the use of the central
processor is in this situation determined by how rapidly
the time-sharing monitor can change from one user
to another. This swapping overhead is the delay before
a particular user program can run after a previous
user has quit the run state, due to an input/output
hang, expiration of time slice, or termination of its
task. When the central processor is not being fully
utilized, swapping overhead tends to determine response time of the system.
TABLE I-A comparison of execution times for various numeric processing
examples in MUMPS and FORTRAN
                                   CPU Time (Microseconds)
Statement                          MUMPS    FORTRAN    MUMPS/FORTRAN Ratio
FOR/DO (iteration, per cycle)        250       12*            20.8
1+2                                  800        7*           114.3
2*3                                  850       44             19.3
1 + 2*3                             1050       48             21.9
1 + 2 - -3*4/5                      1550      120             12.9
* These are the only operations compiled by the PDP-9 FORTRAN Compiler as in-line code. All other operations
beside integer addition (in DO loops and arithmetic expressions) are compiled as subroutine calls.
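As a quick arithmetic check, the ratio column of Table I is simply the MUMPS time divided by the FORTRAN time for each statement. The short Python sketch below recomputes it from the two timing columns; the dictionary layout and variable names are ours.

```python
# (MUMPS microseconds, FORTRAN microseconds) as listed in Table I
timings = {
    "FOR/DO iteration": (250, 12),
    "1+2": (800, 7),
    "2*3": (850, 44),
    "1 + 2*3": (1050, 48),
    "1 + 2 - -3*4/5": (1550, 120),
}

# Ratio of interpreted to compiled execution time, to one decimal place.
ratios = {stmt: round(m / f, 1) for stmt, (m, f) in timings.items()}
for stmt, r in ratios.items():
    print(f"{stmt:18s} {r:6.1f}")
```

The one large ratio, for 1+2, corresponds to a case the footnote identifies as compiled in-line by FORTRAN; the remaining ratios cluster around 20.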
In the MUMPS system, the use of a partitioned
memory has been dictated by the overwhelming concern for response time. As a result of partitioning, the
time sharing monitor can switch between users in
minimum time without having to resort to swapping
of programs in from a drum or disk. In addition, the
monitor automatically overlays external program
segments invoked by an active program. Proper linkages are set up to return automatically to the invoking
program when execution of a segment terminates.
Execution speed of an interpretive program doing
pure numeric processing may be slower by a factor of
about 20 to 1 than corresponding code generated in a
compiler or assembly language system.
Table I illustrates some timing comparisons between
a single user version of the MUMPS interpreter and
the manufacturer-supplied FORTRAN compiler for
this computer, for statements involving pure numeric
processing activity of varying complexity. As has been
indicated above, however, few programs do pure numeric processing in a clinical data management environment. Input/output conversion in FORTRAN and
most other compiler systems is handled in a purely
interpretive fashion, and thus, for this activity, very
little difference in the performance between the two
kinds of systems may be expected. Furthermore, a
significant part of the processing done by programs in
clinical data management systems involves file manipulation, or text string processing activities; in all assembly or compiler language systems these functions
are usually handled by the use of subroutines. Therefore, the employment of an interpreter as a means of
generating calls to these subroutines rather than compiling the calls themselves requires only a small amount
of processing overhead.
The foregoing observations refer to comparisons
between execution speeds of MUMPS interpretive
language statements and compiler-generated object
code on a single-user computer, with no other processes competing for the processor. More significantly,
in a data management environment, a re-entrant
interpreter such as MUMPS may provide the most
economical means of achieving a highly responsive
time-shared information system. In the MUMPS
system with sixteen typical users active, response
times (a most sensitive measure of efficiency in a timesharing system) are always less than a second and
usually appear instantaneous.
There are several reasons that account for this, all
of which are related to very efficient use of core storage.
First, a typical program written in the interpretive
language takes up 10 percent to 20 percent of the
space taken up by the object code generated for a
similar program written in a compiler language. Also,
dynamic allocation of data and efficient storage of
variable length strings and of sparse arrays are standard
features of the interpreter. Thus data also take up
considerably less space in this kind of environment.
In addition, since the interpreter is re-entrant, all
programs may share the same utility routines and
operating system capabilities. This contrasts rather
sharply with conventional compiler language operating
systems, in which each running program must have
its own copy of the necessary system routines that it
will utilize.
The significant advantage that results from the above
features is that programs take up much less space;
therefore, a partitioned memory system on a medium
or small scale computer becomes feasible. Active
programs are typically highly interactive, and are
therefore doing only small amounts of processing
between input/output requests. Therefore the timesharing monitor is invoked frequently to pass control
from one user to another, in order to utilize the central
processor as much as possible. In a partitioned system,
swapping of the users is very rapid. In systems that
use various schemes for submerging disk or drum
swapping, users that are running in a conversational
mode often do not stay in the run state long enough
to submerge the concurrent swapping process. Therefore potential CPU time is unavailable; this unused
time may be on the order of 20 to 50 percent of the
total amount available. The speed that results from
not using disk or drum swapping appears, in our
experience, to more than offset the overhead of interpretation, with greatly increased efficiency in the utilization of space.
CONCLUSION
The convenience occasioned by the utilization of a
high level language with symbolic referencing capability for data stored in complex tree structures on
peripheral storage has greatly simplified the development of application programs for clinical data management. This is the only system that we know of, on a
computer of medium or small scale, which supports
such extensive file manipulation, string handling, and
input/output flexibility. It is the only system we have
encountered on any computer which allows all these
manipulations to occur entirely in a high level language.
This system has been used at the MGH for all of our
programming research and development activities.
Equally important, because of its compactness and
efficiency in this environment, we use it for the implementation of our service programs, including a chemistry laboratory reporting system,18 a patient history
taking system, and a number of programs for physician
entry of narrative record information.
An advantage of this approach to clinical data
management over the use of a large commercially
available general purpose time-sharing computer with its complex operating system has been the increased flexibility that is possible with a specially
designed system. This increased flexibility results
because the system has been built to meet specific
objectives, in contrast to having been implemented
within the often arbitrary and inefficient constraints
of a general-purpose time-sharing facility. In addition,
with a special purpose system, it is possible to achieve
the efficiency required for service operation with a
computer whose size and cost are well matched to the
requirements of the problem area.
BIBLIOGRAPHY
1 J C SHAW
JOSS: A designer's view of an experimental on-line
computing system
Proc FJCC Vol 26 1964 455-464
2 D A LINDBERG
Collection, evaluation, and transmission of hospital laboratory
data
Meth Inform Med July 1967 Vol 6 97-107
3 H C PRIBOR W R KIRKHAM R S HOYT
Small computer does a big job in this hospital laboratory
Mod Hosp Vol 110 April 1968 104-107
4 G O BARNETT P B HOFMANN
Computer technology and patient care: Experiences of a
hospital research effort
Inquiry V 1968 51-57
5 G P HICKS M M GIESCHEN W V SLACK et al
Routine use of a small digital computer in the clinical
laboratory
JAMA Vol 196 June 13 1966 973-978
6 H JACOBS
A natural language information retrieval system
Meth Inform Med Vol 7 Jan 1968 8-16
7 A W PRATT L B THOMAS
An information processing system for pathology data
Pathology Annual Vol 1 Century Appleton N Y 1966
8 G O BARNETT R A GREENES
Interface aspects of a hospital information system
Ann N.Y. Acad Sci (in press)
9 R D YODER
Preparing medical record data for computer processing
Hospitals Vol 40 Aug 16 1966 75-76
10 L L WEED
Medical records that guide and teach
New Eng J Med Vol 278 1968 652-657
11 W V SLACK G P HICKS C E REED et al
A computer-based medical history system
New Eng J Med Vol 274 Jan 27 1966 194-198
12 J G MAYNE W WEKSEL P N SHOLTZ
Toward automating the medical history
Mayo Clin Proc Vol 43 Jan 1968 1-25
13 J M KIELY J L JUERGENS B L HISEY
P E WILLIAMS
A computer-based medical record
JAMA Vol 205 1968 571-576
14 A W TEMPLETON P L REICHERTZ E PAQUET
J L LEHR G W LODWICK F I SCOTT
RADIATE-Updated and redesigned for multiple cathode-ray
tube terminals
Radiol Vol 92 1969 30-36
15 H P PENDERGRASS R A GREENES
G O BARNETT J W POITRAS C W MARBLE
A N PAPPALARDO
An on-line computer facility for systematized input of
radiology reports
Radiol Vol 92 1969 709-713
16 P HALL C MELLNER T DANIELSSON
Medical education-A challenge for
natural language analysis, artificial
intelligence, and interactive graphics
by J. C. WEBER and W. D. HAGAMEN
Cornell University Medical College
New York, New York
INTRODUCTION
In a functional sense, Computer Assisted Instruction
(CAI) has not advanced from the primary grades, yet
its implications for higher education cannot be ignored.
Most of the work that has been done in CAI falls into
the category of drill and practice or straight tutorial
presentation. Logically, both the hardware and software that have been developed or modified to support
CAI have been tailored with these goals in mind. In
medical education, multiple choice questions would
neither hold the interest of the average student nor
challenge his intellectual abilities. Since we can formally
present only a small fraction of the problems our
students may some day have to deal with, we are concerned not only with presenting factual information,
but even more with developing their power to reason
and handle new problems. Medical students have widely
divergent backgrounds and needs, as well as differing
interests. For these and other reasons, we need a truly
two-way, free-format discussion where each student is
treated as an individual. Anatomy, the field in which
we teach, is very much a visual science. Consequently,
graphic capabilities are important. Here also the student
needs to interact and be treated as an individual.
It should be pointed out that we are computer naive
people who have been working without professional
help. We have been using a system and a language
which nicely meet the requirements for which they
were designed, but in approaching the needs of higher
education, programming becomes laborious and circuitous. We are well aware that others working with
more sophisticated systems have produced more sophisticated results. Indeed, to many our methods may
seem primitive. However, our challenge has been to
implement natural language analysis, self-adaptive
programming, and interactive graphics within a framework of restricted costs. It is important that people in
the computer field be made aware of the systems and
language requirements of people in various areas of
education. For CAI ever to become a reality, it must
first become an interdisciplinary endeavor.
The system and language
Our work has been centered around the IBM 1500
Instructional System and its associated language, COURSEWRITER II. The 1500 is supported on 1130
(32 K) hardware. Peripheral equipment consists of 32
terminals, each with a cathode ray tube (CRT), a 128
character keyboard input and a light pen, a typewriter
unit, and a 16 mm random access image projector.
COURSEWRITER II is an interpretive, noncomputational language. Both COURSEWRITER II
and the 1500 Instructional System are described in
detail in IBM publications.
Natural language analysis
Our basic format is schematically illustrated in
Figure 1. There are two different types of discussions.
The large circle represents an anatomical discussion
and the smaller satellites clustered around it represent
what we call clinical problems. The cluster of clinical
problems surrounding each anatomical unit is directly
related to that block of anatomical material. We try
to have a ratio of at least ten clinical problems to each
anatomical discussion. Thus the organization is modular
and any number of these modules may be linked together to form a course segment or topic of discussion.
At the present time we try to keep these course segments
small enough that the average student can complete
them in 30-60 minutes.
Figure 1-A schematic representation of the modular
unit. The large circle represents an anatomical
discussion, the smaller satellites represent
clinical problems. A large number of these
modular units are interconnected to
form a course segment or topic of
discussion
To facilitate the description we shall consider a
discussion of the extrinsic muscles of the eye and their
nerve supply. There are seven such muscles and they
are supplied by three nerves. This course segment
consists of 13 modules, i.e., 13 anatomical discussions
and their associated clinical problems.
A student signs on a terminal for a particular course
segment, i.e., he chooses the general topic he wants to
discuss. He is then presented with a choice:
DO YOU WANT TO BEGIN BY ASKING
QUESTIONS? (SQ)
OR
DO YOU WANT ME TO INITIATE THE
DISCUSSION? (CP)
If he indicates that he wants to ask a question, he is
branched to a subroutine which handles the analysis
of student questions (SQ). If he indicates he wants the
computer to initiate the discussion, he is branched to
one of the clinical problems (CP) in one of the clusters.
Clinical problems
On the initial branch, i.e., if the student elects to
have the computer initiate the discussion, both the
cluster and the specific clinical problem in the cluster
are randomly selected. Each clinical problem is a relatively brief linear presentation, i.e., three or four statements, each illustrated by a picture (with the film
strip projector), followed by one key anatomical
question. For example, after describing and illustrating
a patient's signs, symptoms, and history, the student
might be asked:
WHICH MUSCLE IS INVOLVED?
If he answered this question correctly, he would branch
to another clinical problem in another cluster or module.
The student is taken from cluster to cluster in a prescribed sequence. However, the specific clinical problem
in each cluster is randomly, but non-repetitively
selected. As long as he continues to respond appropriately, he branches from one clinical problem to another
without ever entering into the underlying anatomical
discussions. Each time he successfully completes a
clinical problem, a scoring counter is incremented by
one. If he were to progress through six of these clinical
problems, he would have been examined on three of
the seven muscles, and three of the seven branches of
the three nerves supplying them. Since the general
principles of function and methods of testing one
muscle or nerve are similar to those underlying the
others, it is our judgment that a student who completes six successive clinical problems correctly,
in this predetermined sequence, has demonstrated
mastery of this block of subject matter, and he is told
so. He may then either sign off this course segment or
continue in it as long as he desires. It is possible for
someone to sign on, complete six successive clinical
problems, and be finished in as little as two minutes.
(The value that we require in this scoring counter to
demonstrate mastery is dependent on the length,
complexity, and nature of the material discussed.)
However, if he misses the one key question in any
clinical discussion, his scoring counter is set to zero
and he is branched to the corresponding anatomical
discussion.
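The branching and scoring rules for clinical problems can be summarized in a short Python sketch. COURSEWRITER II is not shown here; the class CourseSegment, the problem identifiers, the refill of an exhausted cluster, and the decision to stay on the same cluster after a miss are assumptions made only to keep the example self-contained.

```python
import random

MASTERY = 6  # successive correct answers required in the eye-muscle segment


class CourseSegment:
    def __init__(self, clusters, rng=None):
        # clusters: lists of clinical-problem ids, in the prescribed cluster order
        self.all = [list(c) for c in clusters]
        self.unused = [list(c) for c in clusters]
        self.rng = rng or random.Random()
        self.position = 0   # index of the current cluster in the sequence
        self.score = 0      # the scoring counter

    def next_problem(self):
        i = self.position % len(self.unused)
        if not self.unused[i]:
            self.unused[i] = list(self.all[i])   # assumption: refill when exhausted
        # Random but non-repetitive selection within the cluster.
        return i, self.unused[i].pop(self.rng.randrange(len(self.unused[i])))

    def record(self, correct):
        """Return the next destination: 'mastery', 'clinical' or 'anatomical'."""
        if not correct:
            self.score = 0           # a miss sets the counter back to zero
            return "anatomical"      # and sends the student to the discussion
        self.score += 1
        self.position += 1           # on to the next cluster in the sequence
        return "mastery" if self.score >= MASTERY else "clinical"


clusters = [[f"CP{c}.{k}" for k in range(10)] for c in range(13)]
seg = CourseSegment(clusters, random.Random(0))
outcomes = []
for answer in [True, True, False] + [True] * 6:
    seg.next_problem()
    outcomes.append(seg.record(answer))
print(outcomes[2], outcomes[-1], seg.score)   # anatomical mastery 6
```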
Anatomical discussions
The anatomical discussions differ from the clinical
problems in several important respects:
(1). They are highly branched. For some questions
there are as many as 35 anticipated answers with up
to ten different branches, depending on which answer
is given. It does not require many such nodal points
to produce a highly complex network. It is possible
for a student to stay in a single anatomical discussion
for 30-40 minutes without retracing his steps. However, it is unlikely for him to have to do so, since
hopefully he is learning at every decision point.
(2). For each anatomical discussion there usually is
only one starting point, and one logical exit point.
Despite the complexity implied above, the entrance
and exit points may be adjacent to each other, i.e.,
it is possible to come in, answer two questions, and
be out. In practice this seldom happens, since some
subset of the question that permits him to get out
is included in the clinical problem which sent him
into the anatomical discussion. We simply are following the well known pedagogical axiom that one
can only hope to get across one or two major points
in a discussion. Some individuals can appreciate these
general principles in their barest form, while others
need elaboration.
Let us illustrate this with one example. The student
misses a clinical problem and enters an anatomical
discussion. The first question he is asked may be:
WHAT IS THE ACTION OF THE RIGHT
SUPERIOR RECTUS MUSCLE?
The correct answer, assuming the patient is looking
straight ahead to start with, is:
IT MOVES THE EYE SUPERIORLY, MEDIALLY, AND ROTATES IT IN A CLOCKWISE
DIRECTION AS YOU FACE THE PATIENT.
This may sound like a fairly difficult question and
certainly we obtain a variety of answers. However, the
student is shown how to reason out the answer by a
series of leading questions and explanatory pictures.
The question that follows the correct answer to the
first question, and the one he has to answer to get out
of this part of the discussion, in essence is: "What would
you do with this knowledge?" More specifically he is
asked:
WHAT WOULD YOU ASK A PATIENT TO DO
THAT WOULD TEST THE ACTION OF THE
RIGHT SUPERIOR RECTUS AND ONLY
THIS MUSCLE?
Here is where our challenge lies: to teach the student
to question the validity and significance of facts, to
train him to reason. What good is it that a physician
know the action of a muscle if he cannot utilize this
knowledge by testing the muscle in his patient?
(3). If the student entered the anatomical discussion
via a clinical problem and reaches this normal exit
point, i.e., has answered the above question correctly,
he will branch to the next clinical problem and once
again try to answer six in a row correctly.
(4). At any point in an anatomical discussion, but at
no point in a clinical problem, the student may ask
any question he wants. He is then branched to a subroutine which analyzes student questions.
Student questions
At any point in an anatomical discussion when he is
asked a question he may choose not to answer it, but
rather to ask a question of his own. His motivation may
be that he thinks his own question will lead him to the
answer he is lacking, or he may in effect be saying:
"Okay, I've had enough of this particular line of conversation, let's proceed to something I don't already
understand." Whenever he asks a question three
things are permanently recorded: his name, his
question, and where in the program he asked the
question. The address of the question he avoided
answering is stored so he may be returned to this
point.
His question is first prescanned (key letter analysis)
in order to determine whether it is germane. If not,
he is told so and returned immediately to the question
he avoided answering. However, if his question is
germane, it is further analyzed and he is branched to
some other point in that anatomical discussion or into
another module, where he is shown how to reason out
the answer to his question. We prefer this to giving
direct answers to his questions. If he is branched to a
place where his question is answered immediately, the
reasoning behind this answer and a probing analysis
follow.
Once the student is in the question asking routine,
and after his question has been answered, he has several
options open to him. (1) He may continue to ask as
many questions as he likes, thus branching from point
to point within a given anatomical module or, more
commonly, branching from one anatomical discussion
to another. (2) He may signal the computer at any
time that he is ready to return to the point where he
asked his initial question. (3) He may, without knowing
it, reach the normal exit point of an anatomical discussion. However, since he is in the question asking
mode, he is treated differently than if he had entered
via a clinical problem. Instead of being branched to
another clinical problem, he is returned to the point
where he asked his initial question. Thus there is no
way he can avoid the question he originally chose not
to answer.
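The bookkeeping behind the question asking mode, recording each question and always returning the student to the question he first side-stepped, might be sketched as follows. The paper does not describe the key letter analysis in detail, so the keyword test below is only a hypothetical stand-in, and the class QuestionMode, the student name, and the location labels are likewise invented for the example.

```python
class QuestionMode:
    """Tracks the question a student avoided so he can be sent back to it."""

    def __init__(self, germane_keys):
        self.germane_keys = germane_keys   # hypothetical stand-in for key letter analysis
        self.return_to = None              # address of the avoided question
        self.log = []                      # permanent (name, question, location) records

    def ask(self, name, question, location):
        self.log.append((name, question, location))
        if self.return_to is None:
            self.return_to = location      # only the initial question is remembered
        if not any(k in question.upper() for k in self.germane_keys):
            return self._go_back()         # not germane: straight back
        return "discussion"                # branch into an anatomical discussion

    def reached_exit(self):
        # In question mode the normal exit leads back, not to a clinical problem.
        return self._go_back()

    def done_asking(self):
        # The student signals that he is ready to return.
        return self._go_back()

    def _go_back(self):
        target, self.return_to = self.return_to, None
        return target


q = QuestionMode({"MUSCLE", "NERVE"})
print(q.ask("SMITH", "Which nerve supplies it?", "AD3.Q2"))         # discussion
print(q.ask("SMITH", "What about the trochlear nerve?", "AD7.Q1"))  # discussion
print(q.reached_exit())                                             # AD3.Q2
```

However many questions are chained together, the only address retained is that of the first question avoided, which is why it cannot be escaped.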
Remember that when he first signed on the course
segment he was given an option as to whether he wanted
to ask a question or whether he wanted the computer
to initiate the discussion. If he chose to ask a question
at that time, he would have entered exactly the same
subroutine. He would have been handled in the same
manner with the following minor exceptions. If he
reached the normal exit point of an anatomical discussion, he would be branched back to his starting
point and given the option again. If he signalled the
computer that he had tired of asking questions, he
would in essence be saying that he wanted the computer
to take the initiative and would then be branched to a
randomly selected clinical problem.
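The bookkeeping behind the question asking mode can be illustrated with a minimal sketch. It is not the authors' program: the class name, method names, and location labels are hypothetical, and only the mid-course case is modeled (a student who chose to ask questions at sign-on is handled with the minor exceptions noted above).

```python
import random


class Session:
    """Tracks the question asking mode for one student (illustrative only)."""

    def __init__(self, clinical_problems, seed=None):
        self.clinical_problems = list(clinical_problems)
        self.rng = random.Random(seed)
        self.return_point = None   # address of the question the student avoided
        self.records = []          # permanent record: (name, question, location)

    def ask_question(self, name, question, location):
        # Three things are recorded for every question: who, what, and where.
        self.records.append((name, question, location))
        # Only the first question in a chain fixes the point of return.
        if self.return_point is None:
            self.return_point = location

    def in_question_mode(self):
        return self.return_point is not None

    def leave_question_mode(self):
        # Option (2): the student signals that he is ready to go back.
        target, self.return_point = self.return_point, None
        return target

    def exit_discussion(self):
        """Called at the normal exit point of an anatomical discussion."""
        if self.in_question_mode():
            # Option (3): he cannot avoid the question he chose not to answer.
            return self.leave_question_mode()
        # Entered via a clinical problem: go on to another clinical problem.
        return self.rng.choice(self.clinical_problems)
```

In this sketch, asking further questions (option 1) only adds records; the point of return stays fixed until the student exits or signals that he is ready to go back.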
There are several distinct advantages to the experimental format currently being used by our students.
(1). Authoring is greatly facilitated by the use of
modular units for course construction. It is one thing
to sit down and write a lecture or linear presentation,
but quite another to outline a highly branched, open-ended discussion. The smaller the modules, the easier
this is to perform.
(2). Relevance and interest are maintained through
the "clinical problem" approach to human anatomy.
The clinical problems, however, are just one type of
application question which is common to many disciplines. They provide a certain amount of interest or
spice to the learning of what otherwise might appear
to be a series of facts or skills which often seem irrelevant. The question of "relevance" is even more important than providing interest. There is more to
learn than we have time to teach and sometimes we,
as teachers, tend to get carried away by details that
happen to have special interest for us. Thus the application
questions help to keep us "honest" and relevant.
If a piece of anatomical knowledge cannot be accessed
via a clinical discussion, perhaps we should question
its significance to the student.
(3). The ability of the student to ask free format
questions and be shown how to reason out the answers,
gives him the feeling of being treated as an individual.
He can literally chart his own path through a discussion until he is ready to be evaluated, i.e., to enter
the clinical problems. Teaching the ability to ask
questions and to reason out the answers is one of the
most difficult tasks we face as teachers.
(4). The high ratio of clinical problems to anatomical
discussions, the redundancy and highly branched nature
of the anatomical discussions themselves, and the
ability of the student to ask free format questions, all
contribute in permitting students taking the same
course segment to have relatively unique experiences.
Not only do they get different clinical problems, but
they may not even be taken to the same anatomical
discussions. We find that this variety of experience
inside the classroom stimulates discussion outside the
classroom.
(5). It is the combination of the features discussed
above that permits one student to be told he has
mastered the material in one course segment in as
little as two minutes, while another student may spend
several hours to attain the same degree of mastery.
This raises the interesting implication of informing a
student when he has attained sufficient mastery of
the entire subject matter, rather than giving him a
course grade. Some students, either because of ability
or previous experience, might achieve this level of
mastery in a month, while another student might
require the present six months. The faster students
would have a lot of free time which could be spent on
other courses, independent study, electives, or research. Thus it is conceivable that once a curriculum
were implemented on the computer, the student's
medical college transcript might more meaningfully
consist of a record of how many things he accomplished
during his training, rather than a series of numerical
grades.
Artificial intelligence
This may be a rather grandiose term for the rather
primitive examples we have, but we want to discuss
two general topics, i.e., improving the methods by
which we handle the student questions, and developing
self-adaptive programs.
Student questions
As a result of experience with students on the initial
course segments, we found that a large percentage of
the questions they asked either were not answered or
were not handled appropriately. This does not have to
happen very often to discourage a student from asking
any further questions. However, we recorded every
question a student asked, so we were able to review
them. We found there were three main reasons for
mismatches on the questions: (1) the question was not
related to the subject matter being discussed, (2) the
student did not provide enough information, and (3)
he provided too much information.
(1). If the question is not pertinent to the subject
being discussed, we have no need to answer it. This
was determined by prescanning for keywords. We
found, however, that we could not always tell whether
the question was not pertinent, or whether it was just
not specific enough. Basically we solved this by equating
certain synonymous terms and by adding to the number
of keywords in the prescan. We also added two other
levels of scanning. The first is for such things as leg,
arm, thorax, etc., which are parts of the body far
removed from the eye. If these are detected, the student
is told, for example:
WE ARE NOT DISCUSSING THE LEG AT
THIS TIME. PLEASE LIMIT YOUR QUESTIONS TO THE SUBJECT UNDER DISCUSSION.
We have a second level which includes keywords related to the region, but not to the subject. Thus if
his question referred to the maxillary nerve, part of
which does run through the orbit, he would be told:
THE MAXILLARY NERVE IS RELATED TO
THE ORBIT, BUT DOES NOT INNERVATE
ANY OCULAR MUSCLES.
If he did not match on any of these three levels of
prescanning, he would simply be told:
YOUR QUESTION DOES NOT APPEAR TO
BE WITHIN THE SCOPE OF OUR DISCUSSION. DO YOU WANT TO REPHRASE
IT?
If he does not choose to rephrase it, he is branched
back to where he asked the question. Differentiating
whether his question is not germane or whether it is
not specific enough is almost essential. Trying to determine in what way it is not related simply makes
the dialogue a little more personal and gives the
student the feeling he is being treated as an individual.
(2). The most common difficulty was that the student
did not supply us with enough keywords, i.e., his
question was not specific enough. Thus we have developed a little subroutine which helps him make his
question more specific. For example, if the only keyword we detect is MUSCLE, we ask him:
WHICH MUSCLE OF THE EYE AND WHAT
DO YOU WANT TO KNOW ABOUT IT?
He then is given the chance to rephrase his question.
Thus with relatively little programming we can interact with the student in a conversational manner until
his question is understood. On the basis of previous
experience we feel we will be able to handle most of
the questions asked.
(3). The third area where we sometimes had difficulty
was when the student provided us with too many
keywords. It is a surprising fact that the number of
keywords required in a given course segment to provide
us with enough information to answer a question is
remarkably constant. In the program on the muscles
of the eye it was three. When there were too many
keywords, analysis showed he was usually asking more
than one question, or at least what he thought was
a single question could be broken down into two
smaller ones. Less frequently he was simply being too
verbose. Formerly he would branch on the basis of
the first three words that matched, but this was not
always appropriate. Now we count and store the
number of keywords in his question. If this exceeds
our magic number, in this case three, the words we
have detected can be displayed for him on the screen.
He then is asked to rephrase his question using no
more than three of these words, or to ask only one
question at a time.
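The three situations above (a question outside the subject, too few keywords, too many keywords) can be sketched as a single keyword count. This is an illustration only: the keyword tables are invented for the example, since the paper does not list its vocabulary, and the equating of synonymous terms is omitted.

```python
# Hypothetical keyword tables for the segment on the muscles of the eye.
SUBJECT_KEYWORDS = {"MUSCLE", "ACTION", "SUPERIOR", "INFERIOR", "RECTUS", "OBLIQUE"}
REMOTE_KEYWORDS = {"LEG", "ARM", "THORAX"}   # parts far removed from the eye
REGION_KEYWORDS = {"MAXILLARY"}              # related to the orbit, not the subject
MAGIC_NUMBER = 3                             # keywords needed in this segment


def analyze(question):
    """Classify a free format question by the keywords it contains."""
    words = [w.strip("?.,:;").upper() for w in question.split()]
    # Count each subject keyword once, in the order the student used it.
    hits = list(dict.fromkeys(w for w in words if w in SUBJECT_KEYWORDS))
    if not hits:
        remote = [w for w in words if w in REMOTE_KEYWORDS]
        if remote:
            return ("remote", remote)        # "WE ARE NOT DISCUSSING THE LEG ..."
        region = [w for w in words if w in REGION_KEYWORDS]
        if region:
            return ("region", region)        # related to the region only
        return ("out_of_scope", [])          # offer the chance to rephrase
    if len(hits) < MAGIC_NUMBER:
        return ("too_few", hits)             # ask him to be more specific
    if len(hits) > MAGIC_NUMBER:
        return ("too_many", hits)            # display the words, ask for one question
    return ("branch", hits)                  # enough information to branch
```

For example, a question containing only MUSCLE falls in the "too_few" case, which corresponds to the prompt WHICH MUSCLE OF THE EYE AND WHAT DO YOU WANT TO KNOW ABOUT IT?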
Self-adaptive programming
We would like the program to modify itself on the
basis of experience, much as a teacher learns from his
experience with students. As a result of our own research in neurophysiology, we feel that two basic
aspects of learning are: (A) an increase in seeking or
exploratory behavior following cessation of a rewarding
stimulus,1,2 and (B) habituation or the dropping out
of unrewarded components of a response.3 A teacher,
at least a good teacher, when challenged is ready to
increase the variety of his response. This is an example
of exploratory behavior. He may do this by retrying
responses that were previously part of his repertoire,
but had been temporarily discarded, i.e., had undergone
habituation. He may also increase his repertoire of
response by incorporating responses acquired from
experience with students. At the present time we have
only begun to incorporate these learning concepts into
instructional programs.
The following are examples of capabilities we consider
necessary for the computer if it is to approach the
versatility required in tutorial discussions. The first
two exist only as isolated demonstrations at selected
points, because COURSEWRITER II does not permit
us the computational ability to do this on a large
scale. The third example, which we consider of utmost
importance, has not actually been implemented as yet,
but we foresee no major obstacles, except for the limited
computational capacity of the system.
(1). If a certain percentage of the students (currently
20 percent) all ask the same question at the same point
in the program, subsequent students are branched as
though they had asked the same question. They are
treated as though they were in the question asking
mode, e.g., when they reach the normal exit point of
an anatomical discussion, they are returned to the
point where they came from. This branching is dynamic
and reversible in the sense that the need for asking the
question is constantly evaluated. Thus if two students
in any series of ten ask the same question at the same
point, every odd-numbered student that follows is
branched as though he had asked the question. Even-numbered students are not branched. If nine of the
next ten even-numbered students fail to ask the question, the branch is deleted. However, if two or more of
them do, then the branch is reinforced, i.e., three out of
every four successive students will be branched.
Certainly if a significant number of students did
ask the same question at the same point in a discussion, we as teachers would probably modify our
approach. How often this will occur and whether the
percentage should be greater or less than 20 percent
are questions we cannot answer until we can test it on
a larger scale.
(2). There often are several places in a program to
which we could branch a student in response to his
question. At present we make this choice for the
students. We plan to give them some degree of control
by forming a hierarchy of possible branch points.
Originally these will be evaluated by us as first, second,
third, or fourth choices. However, each time a student
is branched and reaches the point where we think his
question should have been answered, he will in effect
be asked: "Okay?" or "Does that answer your question?" If he says yes, the likelihood of that branch
will be augmented. If he says no, he will be branched
to another point and the likelihood of the original
branch will be decremented. Thus what we thought
was the least plausible response to a given question
may be shown to be the most desirable on the basis of
experience with students, and it will achieve the status
of the initial branch without any manual interference
by the author.
(3). One of the most significant ways a teacher learns
from experience with his students concerns the unanticipated but appropriate answer. Right now we record
all unanticipated answers and review them periodically.
Occasionally an unanticipated answer proves to be
more perceptive than the anticipated answers the
author programmed. At present such a student is
treated as though he were wrong.
When a student gives an unanticipated answer and
feels he is treated in an inappropriate manner, why not
permit him the option of repeating his answer and
treating it much as we would a question? Essentially
he would be entering a "debate mode." We feel that
our question answering routine is sufficiently flexible
now that he would eventually be taken to a point
where he could decide whether his original answer was
valid or not. If it proved invalid, he would be branched
back and his pathway erased. However, if he felt he
had won his point, then his route could be preserved.
This would then become an anticipated answer for
subsequent students. In interpersonal discussions our
students often challenge us and not infrequently they
win their point. However, even if this occurred only
once in a thousand times, these are the type of responses we would least like to discourage. How can we
profess to encourage our students to question and
reason and then give an inflexible response? This is a
level where computers are not presently competitive
with a human tutor.
Since we have not yet implemented this, and do not
want to be considered idle dreamers, we shall elaborate
on how we intend to program this type of ability.
First it should be made clear that we are not talking
about situations where the student's response involves
evidence not available in the program. We are talking
about situations where he reasons from one logical
statement to another. Let us cite a specific example.
In our original version of the discussion of the eye, we
programmed many anticipated answers to the question:
WHAT IS THE ACTION OF THE SUPERIOR
RECTUS?
One answer we did not program was:
THAT DEPENDS ON THE STARTING
POSITION OF THE EYE.
We subsequently modified the program to include this
as an anticipated answer. However, the inherent logic
was already present for the student to have won his
point. If he had asked in debate mode:
WHAT IS ITS ACTION IF THE PATIENT
STARTS BY LOOKING MEDIALLY?
he would have been given one answer. If he then asked:
WHAT WOULD ITS ACTION BE IF THE
PATIENT STARTS BY LOOKING LATERALLY?
he would have been given a very different answer.
Clearly this would prove that the action depends on
the starting position of the eye.
The computer has no such ability to reason, but the
student does. Thus we are permitting him to make
a value judgment. He could signal the computer that
these two answers made his point and subsequent
students would then branch there, rather than along
the path previously followed. Since we are permitting
the student to make a value judgment that affects the
subsequent course of his fellows, the process must be
reversible. Thus the next ten students who gave the
same answer would be asked by the program whether
they understood the line of reasoning that followed. If
the consensus were yes, then the branch would remain;
if it were no, then the branch would be deleted.
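The first two capabilities in this section can be sketched in a few lines. The sketch is not the COURSEWRITER II code and makes simplifying assumptions that are not in the text: students are evaluated in consecutive series of ten rather than in any series of ten, scores move by one unit per reply, and a reinforced branch continues to be evaluated by the same rule. All class and method names are hypothetical.

```python
class AutoBranch:
    """The 20 percent rule for one question at one point in the program."""

    WINDOW = 10      # students per evaluation series
    THRESHOLD = 2    # two of ten, i.e., 20 percent

    def __init__(self):
        self.state = "off"   # "off", "trial" (odd students) or "reinforced" (3 of 4)
        self.seen = 0        # students counted since the branch was created
        self.series = []     # whether each unbranched student asked the question

    def should_branch(self):
        """Return True if the next student is branched automatically."""
        if self.state == "off":
            return False
        self.seen += 1
        if self.state == "trial":
            return self.seen % 2 == 1     # odd-numbered students only
        return self.seen % 4 != 0         # reinforced: three of every four

    def observe(self, asked):
        """Record whether a student who was not branched asked the question."""
        self.series.append(bool(asked))
        if len(self.series) < self.WINDOW:
            return
        hits, self.series = sum(self.series), []
        if hits >= self.THRESHOLD:
            if self.state == "off":
                self.seen = 0
                self.state = "trial"
            else:
                self.state = "reinforced"
        else:
            self.state = "off"            # the branch is deleted


class BranchChoices:
    """Ranked branch points for one question, adjusted by student replies."""

    def __init__(self, targets_in_author_order):
        n = len(targets_in_author_order)
        # The author's first choice starts with the highest score.
        self.score = {t: n - i for i, t in enumerate(targets_in_author_order)}

    def best(self, exclude=()):
        candidates = [t for t in self.score if t not in exclude]
        # Ties keep the author's original order.
        return max(candidates, key=self.score.get) if candidates else None

    def feedback(self, target, answered):
        # "Does that answer your question?" yes augments, no decrements.
        self.score[target] += 1 if answered else -1
```

With enough negative replies on the first choice and positive replies on the fourth, `best()` returns the branch originally judged least plausible, without any manual change by the author.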
Interactive graphics
Gross anatomy is very largely a visual science.
Knowing the three dimensional relationship of one
structure to another is a fundamental basis for clinical
diagnosis. The best way to organize this information
is with pictures, so our students are encouraged to
spend a lot of their time sketching. In our linear (noncomputer) programmed teaching they can actually sit
and copy pictures that are projected. The question
then arises, does the computer offer interactive graphic
capacities that are competitive? In order to explain
what we have done and our problems in this area, it
will be necessary to go into some of the details of the
system with which we work, since it is quite different
from what most people think of when they speak of
CRT graphics.
The 1510, which is the CRT, light pen and keyboard
unit, was designed primarily for the rapid display of
text, and its designers assumed that its graphic applications would be limited. The usable area on the face
of the CRT is 4% X 8 inches. It may be thought of as
a grid consisting of 32 rows and 40 columns (Figure 2).
A standard alphanumeric character would occupy two
of these boxes, i.e., two rows by one column. Each box
on this grid, i.e., each one row by one column unit,
may be thought of as a matrix of 48 potential dots of
light, six dots high and eight dots wide (Figure 3).
Thus the entire screen consists of a maximum of 61,440
dots (192 vertical X 320 horizontal). Actually these are
more accurately described as horizontal slashes; the
dots are wider than they are high as may be seen in
Figure 4. This is a significant factor which must be
considered in preparing the drawings, to prevent distortion.
Figure 2-This shows the organization of the screen into 32 rows
and 40 columns forming 1280 addressable units. The standard
alphameric characters occupy two of these boxes, i.e., two rows
by one column
Figure 3-The 48 dot matrix defined by the intersection
of one row and one column
Figure 4-A CRT display of the skull and mandible
from the side
The system provides a standard character dictionary
and the user may define additional graphic sets. These
graphic characters, as defined by the system, occupy
such a large part of the screen that the likelihood of
being able to use the same graphic character to construct more than one picture is almost nil. It is analogous to taking a printed page, dividing it into four
quadrants and saying you can use these quadrant units
to write anything else you desire, as long as you don't
change any of the letters or words. It quickly became
obvious that we did not want graphic characters of
the type just defined, but rather we needed a graphic
alphabet. Just as in the case of the English language,
given the 26 letters of the alphabet, one can write anything he likes, so given the means to directly access
each of the 48 dots in each box, we could draw any
pictures we desired.
That is basically what we did; we defined a character
dictionary with each character being a single dot
(Figure 5). Thus with what amounts to little more than
1/3 of one character dictionary area we can draw as
many pictures as we desire. The backspace function
permits superimposition of characters. Thus if our
display instruction were to contain the following characters, as defined in our graphic alphabet, i.e., BCDEFJNRVbfjklmn, and there were a backspace command
between each of them except the "m" and the "n,"
we would get the dot pattern shown in Figure 6. Notice
that omission of the backspace instruction caused the
"n" dot to appear in the six by eight box one column
to the right.
A B C D E F G H
I J K L M N O P
Q R S T U V W X
a b c d e f g h
i j k l m n o p
q r s t u v w x
Figure 5-The characters used to define our graphic
alphabet. The character plus its case determines
the position of the dot within the matrix
Figure 6-The pattern produced by the following coding
%B%F%J%K%L%MN
The "<" defines subsequent characters as upper
case. The ">" defines subsequent characters
as lower case. The "%" is the code for
the backspace function
We always limit our display instructions to one row
at a time and we put as many instructions on each row
as possible, i.e., we try to break our pictures up into
the smallest units we can. This permits us greater
freedom with the input buffer (250 character limit),
facilitates debugging, allows us to modify pictures with
a maximum of ease, provides animation capacity, and
is especially useful when we give the student the capacity to draw his own pictures. However, there is at
present one very serious limitation to putting multiple
display instructions on a single row. No erasure occurs
between adjacent rows, i.e., the upper and lower limits
of each box are inviolate. However, erasure does occur
between adjacent columns. Let us assume that we have
two adjacent boxes on the same row filled completely
with dots [Figure 7 (A)]. If subsequently any pattern
were displayed on the same row in the column just to
the left of this, e.g., a vertical line in the extreme left
of the box, we would get the pattern shown in Figure
7 (B). With a subsequent display instruction on the same
row but in the column just to the right of our original
display, e.g., three dots vertically arranged in the extreme right of the box, we would get the pattern shown
in Figure 7 (C). Thus a display insert command erases
five dot columns to the right of the insert and three
dot columns to the left on the same row. We have been
told that this can be improved on a hardware level so
no erasure will occur. This would be of utmost importance to anyone who wants to exploit the graphic
capacities, especially in having the student draw on
the CRT.
Figure 7-This shows the problem of erasure with multiple display inserts on the same row. The insertion of the single dot
column in (B) causes erasure of 5 dot columns from the original
pattern in (A). Insertion of the 1/2 dot column in (C) causes
erasure of 3 dot columns from the original pattern
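As a rough check on the screen geometry and the single-dot alphabet described in this section, the following sketch computes dot positions. The letter-to-dot mapping (upper case A to X filling the top three dot rows of a box, lower case a to x the bottom three, each row read left to right) is an assumption read off the layout of Figure 5, and the function names are hypothetical. The erasure behavior of the hardware is not modeled.

```python
ROWS, COLS = 32, 40      # addressable boxes on the screen
BOX_H, BOX_W = 6, 8      # dots per box: six high, eight wide


def dot_in_box(ch):
    """Return (dot_row, dot_col) inside a box for one graphic-alphabet character."""
    if "A" <= ch <= "X":
        index = ord(ch) - ord("A")         # upper case: dot rows 0 to 2
    elif "a" <= ch <= "x":
        index = 24 + ord(ch) - ord("a")    # lower case: dot rows 3 to 5
    else:
        raise ValueError("not a graphic-alphabet character: %r" % ch)
    return divmod(index, BOX_W)


def screen_dot(row, col, ch):
    """Absolute (y, x) dot coordinate for character ch displayed in box (row, col)."""
    dot_row, dot_col = dot_in_box(ch)
    return row * BOX_H + dot_row, col * BOX_W + dot_col


def box_pattern(chars):
    """Dots lit in one box when chars are superimposed by backspacing."""
    grid = [["."] * BOX_W for _ in range(BOX_H)]
    for ch in chars:
        dot_row, dot_col = dot_in_box(ch)
        grid[dot_row][dot_col] = "X"
    return ["".join(line) for line in grid]
```

Under this assumed mapping, superimposing B, C, D, E and F lights five adjacent dots in the top dot row of a box, and a character displayed without an intervening backspace would instead be placed with `screen_dot` in the next column.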
The resolution of the light pen is limited to one box
as defined by one row, one column. Light detected from
one box can be differentiated from light in any other
such box. Two lighted dots are required to produce a
detect and these two dots must be separated by one
dot row. Thus the patterns shown in Figure 8 (A) would
all permit detection; those in Figure 8 (B) would not.
We use the light pen as a pointer. We have not been
able to devise any means of using it as a stylus, although
we do have various ways in which we can have the
student draw on the CRT. Some of the ways we use
CRT graphics are enumerated and briefly described.
Identification
In the CRT display shown in Figure 9, we have the
student use the light pen to identify the structures
labeled in Figure 10. We feel that since we are dealing
with a picture approximately four inches in height,
this is pretty good resolution. As with verbal questions,
we branch selectively not only according to whether
he is right or wrong, but also on the basis of what the
nature of his error is. Thus his thinking is analyzed
and he is led by discussion or demonstration to the
correct answer. Since the face of the CRT is behind a
glass cover, we have a parallax problem. The boxes
that he is trying to define measure only 1/6 X 1/5
inches. We cannot vary the intensity of the beam by
tracking the pen. We can require a double detect, i.e.,
on the first detect temporarily erase the adjacent boxes
and ask in effect: "Is this what you want to point to?"
However, in practice we do not find this necessary.
After a little experience the students make very few
parallax errors.
Figure 8-(A) shows three dot patterns which permit light pen
detection; (B) shows three dot patterns which would not be detected by the light pen
Figure 10-This shows some of the structures that we
require the student to identify using the light pen
on the graphic display shown in Figure 9
(Labeled structures: anterior palatine foramen, foramen lacerum, foramen ovale, spinous process, petrotympanic fissure, hypoglossal canal, carotid canal)
Animation
We use a few examples of animation in the usual
sense such as moving the eyes, swallowing, etc., which
can be done in the insert mode. This is quite effective
as long as only part of the picture has to be regenerated.
More commonly we employ animation in the sense of
drawing something slowly for purposes of emphasis.
For example, when we ask a student to point to where
a nerve originates, after he does so correctly, we may
respond by having the nerve "grow" out along its
course.
Enlargement
The 1510 has no vector or scaling capacities. However, we do present a small scale view of a structure
such as the skull and then enlarge certain parts of it
in 2X steps until we get the desired resolution for
light pen interaction or to show greater detail.
Figure 9-A CRT display of the base of the skull
"Drawing" on the CRT
The quotation marks are to emphasize that the light
pen cannot be used as a stylus. This would be desirable,
of course. However, this is not as great a limitation
as it might seem, since we are trying to get the students
to appreciate spatial relationships and proportions,
rather than training them as artists. There are several
means by which we permit students to generate their
own pictures and have them evaluated. In each of
these instances, the erase feature is a distinct limitation,
and we are actually delaying much of our development
in graphics until a hardware modification comes through.
(1). We present the student with our dot matrix and
have him input from the keyboard, evaluating his
picture segment by segment. This may sound artificial
but it works quite well. However, from the keyboard
there is a 100 character input buffer, so here, more than
anywhere else, we feel the limitation of the erase
feature.
Figure 11-(A) shows a crude form of light pen drawing by the
student; (B) represents the computer evaluation of the drawing
using our graphic alphabet.
(2). We put a lighted square in each box. The student
has three modes of operation from which to choose. If
he is in the insert mode, touching a lighted box causes
the square to be replaced by an asterisk like symbol.
The replace mode causes the square to replace the
asterisk, e.g., if he changes his mind. When he is finished
he enters the erase mode in which every square he
touches disappears and he is left to view his finished
drawing [Figure 11 (A)]. The drawing is then evaluated
by the computer, and those parts of his drawing that
are judged to be accurate are regenerated using our
graphic alphabet. Thus his drawing, represented in
Figure 11 (A), would be presented back to him as in
Figure 11 (B). However, any parts of his drawing not
judged accurate would be left alone and he would have
to try again. A photograph of this view of the skull is
shown in Figure 12.
(3). We have every bone in the body drawn on
coordinate paper. On the CRT a graph paper grid
provides the lighted matrix for the light pen detect.
In essence we have him point to a series of points and
if he is correct, we generate the line of appropriate
contour between successive points. With soft tissues,
e.g., organs, muscles, etc., we are concerned with their
relation to bones. The bony skeleton then becomes the
lighted matrix upon which he draws. For example, it
is of vital importance that the student know the normal
projections of the heart and its various subdivisions
onto the thoracic cage from every angle. Thus we
present him with a graphic of the bony rib cage and
ask him to point to where each chamber or structure
crosses the bones, and generate the pictures as he
progresses.
Figure 12-A CRT display of the skull from the front
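The three light pen modes of method (2) can be pictured as a small state table. This is a sketch under stated assumptions, not the authors' program: the cell names, the class, and the default 32 by 40 grid of boxes are chosen for illustration, and the evaluation of the finished drawing against the graphic alphabet is left out.

```python
SQUARE, ASTERISK, BLANK = "square", "asterisk", "blank"


class DrawingGrid:
    """One lighted square per box; the light pen changes boxes according to mode."""

    def __init__(self, rows=32, cols=40):
        self.cells = {(r, c): SQUARE for r in range(rows) for c in range(cols)}
        self.mode = "insert"

    def touch(self, row, col):
        cell = self.cells[(row, col)]
        if self.mode == "insert" and cell == SQUARE:
            self.cells[(row, col)] = ASTERISK    # mark a point of the drawing
        elif self.mode == "replace" and cell == ASTERISK:
            self.cells[(row, col)] = SQUARE      # the student changed his mind
        elif self.mode == "erase" and cell == SQUARE:
            self.cells[(row, col)] = BLANK       # clear the unused background

    def drawing(self):
        """Boxes that currently hold an asterisk, i.e., the student's drawing."""
        return sorted(p for p, state in self.cells.items() if state == ASTERISK)
```

The list returned by `drawing()` is what would be handed to the evaluation step, which regenerates the accurate parts with the graphic alphabet and leaves the rest for another attempt.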
SUMMARY
We have tried to describe some of the natural language
analysis, self-adaptive programming, and interactive
graphic capabilities we feel are required for medical
education. Although the system and language we have
been using were designed for CAI, they were not designed for the further capacities toward which we have
tried to force them. We would like to have a system
and a language that were tailored to meet the needs of
higher education.
CAI is expensive, but so is medical education in
its present form. Any tool that would significantly
improve the quality of medical education can hardly
be denied on the basis of cost. The real question is
whether CAI can justify itself on a performance basis.
Perhaps in two, five, or ten years the computer industry will feel the state of the art justifies a real commitment to this field. However, will what they produce
truly meet the needs of the medical educator unless a
really interdisciplinary phase of research and development is undertaken now?
ACKNOWLEDGMENTS
The authors gratefully acknowledge the help and encouragement
of the many people at The IBM Systems
Research Institute.
This investigation was supported by General Research Support Grant FR-0539G from the General
Research Support Branch, Bureau of Health Professions and Manpower Training, National Institutes
of Health and by Grant #50/68 from the National
Fund for Medical Education.
REFERENCES
1 W D HAGAMEN
Responses of cats to tactile and noxious stimuli: Temporal
summation, facilitation, internal inhibition, and external
inhibition as examples of interactions between stimuli on a
behavioral level
Arch Neurol Vol 1 1959 203-215
2 N F O'DONOHUE W D HAGAMEN
A map of the cat brain for regions producing self-stimulation
and unilateral inattention
Brain Research Vol 5 1967 289-305
3 S L JAFFE P F BOURLIER W D HAGAMEN
Adaptation of evoked auditory potentials: A midbrain through
frontal lobe map in the unanesthetized cat
Brain Research Vol 14 1969 111-127
Design principles for processor maintainability
in real-time systems
by H. Y. CHANG and J. M. SCANLON
Bell Telephone Laboratories
Naperville, Illinois
INTRODUCTION
With the arrival of large real-time, time-shared
systems, the requirement of system reliability has become even more demanding. The result of even a
momentary system misbehavior could be catastrophic,
since any disruption of service is experienced by all
the users on-line at that time. Thus for real-time
systems such as telephone switching systems, airline
reservation systems, on-line teaching machine, etc.,
where numerous users are served, and critical real-time
systems such as command and control, a high degree
of system dependability and maintainability must be
realized.
Since many of the real-time systems employ the
concept of centralization of logic, the overall system
reliability objective in large part depends on how well
the central processor itself meets the dependability
and maintainability objectives. For a processor, the
dependability objective often calls for the use of reliable
components, conservative circuit design techniques and
various redundancy methods. The maintainability
objective, on the other hand, demands a processor
architecture that is best suited for automatic trouble
detection, recovery from faults and fault isolation, so
as to insure operational survivability in an environment
which is not free of faults.
The purpose of this paper is to describe several
design principles which may be used in planning processor organization, designing logic circuits, and fault
detection and diagnostic tests in order to facilitate
the design of a highly maintainable processor for
real-time systems. Our scope will be limited to presenting
a unified account of some design guidelines, most
of which reflect material assembled from a combination
of analytical study and practical experience on a real-time time-shared system.1 The problem of achieving
high dependability by the use of various error detection
and correction codes or redundancy techniques has
received adequate treatment in the literature,2-4 and
will not be included here. In the second section we describe the various observed trouble symptoms and their
manifestation in the system. A maintenance sequence
for preserving the system's integrity upon occurrence
of faults is then suggested. Guidelines for planning a
processor organization to achieve high maintainability
are discussed in the third section. Several principles
for designing logic circuits and fault detection and
diagnostic tests are described in the fourth section.
System malfunctions and recovery procedures
An important first step in establishing a fault recovery and detection philosophy for a particular
system is to establish the possible failure modes of
both system and device components. On a system
level, trouble symptoms usually manifest themselves
in some form of mutilated data. They can be caused
by errors in transmission or reception of data among
the various units, e.g., a bit erroneously set on a memory
access. Or they can result from errors in internal
data manipulation, e.g., attempting to reach an address which has been incorrectly computed.
On a device level, the trouble symptoms with discrete logic implementation usually correspond to
single, hard faults (by common assumption). A permanently
open diode and a transistor output stuck-at-1
(s-a-1) are some examples of this class of faults.
However, these troubles usually manifest themselves
in some observable system malfunction. With the
advent of integrated circuit technology, more complex
and varied device failure modes may be expected.
As one of the principal requirements in a real-time
facility is to provide continuous service, the system
must remain operational even in a fault environment.
This dictates that trouble symptoms be recognized
and the associated fault be isolated and repaired, with
little or no interference from the user's standpoint.
This objective can be implemented by devising a fault
recovery procedure. A fault recovery procedure usually
consists of the following steps: fault detection, fault
recognition, system recovery, and fault diagnosis.
Fault detection is usually performed entirely by hardware: circuits implementing error
detection codes such as parity checks and one-out-of-N
codes, and analogue signal margin checkers.
Systems incorporating some level of redundancy may
also use matching between duplicated modules as a
means of fault detection. In all cases the checker itself
should routinely be examined by programs to insure
its validity.
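As a rough illustration of the hardware checks named above, the following Python sketch models a parity check and a one-out-of-N check in software. The function names, the 8-bit word, and the choice of even parity are assumptions made for this example; the paper specifies none of them.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit for a data word: 1 if the word holds an odd number of ones."""
    return bin(word).count("1") % 2

def parity_ok(word: int, stored_parity: int) -> bool:
    """True when the word read back still agrees with its stored parity bit."""
    return parity_bit(word) == stored_parity

def one_out_of_n_ok(lines: int) -> bool:
    """True when exactly one line of a one-out-of-N code is active."""
    return lines != 0 and lines & (lines - 1) == 0

# A word is stored with its parity bit, then read back with one bit flipped.
stored = 0b1011_0010
check = parity_bit(stored)
corrupted = stored ^ 0b0000_1000
single_error_detected = not parity_ok(corrupted, check)

# Two flipped bits cancel in the parity sum, so this error goes unnoticed.
double_error = stored ^ 0b0001_1000
double_error_missed = parity_ok(double_error, check)
```

The double-error case is one reason the text asks that checkers be combined with other mechanisms and be examined routinely themselves.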
The objective of fault recognition is to resolve a
failure to a particular subsystem (e.g., a memory
module, an input/output channel controller, etc.).
This is done by first establishing the type of error
which has occurred, such as a parity failure on a memory read, and determining from that information,
through some analysis procedure, what subsystem
contains the fault. The analysis procedure may include
a sequence of instruction retries in order to distinguish
the hard faults from the transients, and then to resolve
the failure to the subsystem level by alternately exercising various suspected candidates. It may also examine subsystem error indicators, over some period
of time, to accumulate clues pointing to the source
of malfunction.
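The retry-and-exercise analysis described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the retry count, the function names, and the stand-in subsystems are all hypothetical, and real hardware operations are replaced by callables that return True on success.

```python
from typing import Callable, Dict, List

def classify_by_retry(operation: Callable[[], bool], retries: int = 3) -> str:
    """Retry a failed operation; any successful retry marks the fault as transient."""
    for _ in range(retries):
        if operation():
            return "transient"
    return "hard"

def resolve_to_subsystem(exercisers: Dict[str, Callable[[], bool]]) -> List[str]:
    """Exercise each suspected subsystem in turn and return those that fail."""
    return [name for name, exercise in exercisers.items() if not exercise()]

# Hypothetical stand-ins for hardware: one read fails once and then succeeds,
# the other fails every time.
_outcomes = iter([False, True, True])

def flaky_read() -> bool:
    return next(_outcomes)

def dead_read() -> bool:
    return False

transient_case = classify_by_retry(flaky_read)
hard_case = classify_by_retry(dead_read)
suspects = resolve_to_subsystem({
    "memory module 0": lambda: True,
    "memory module 1": dead_read,
    "I/O channel controller": lambda: True,
})
```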
Once the failure is resolved to a subsystem, choosing
the next step in the fault recovery procedure depends
upon whether or not a spare subsystem is available.
If a spare is not available, diagnostic action must
be initiated to determine the identity and location of
the fault. The normal system operation, which had
momentarily been interrupted at the time of fault
detection, must now be suspended through diagnosis
and repair. The system must then be recovered to a
hardware state and program point where normal processing can be resumed. This sequence of events is
depicted by Figure 1.
[Figure 1: time line of the recovery steps; surviving labels are FAULT DETECTION, SYSTEM RECOVERY, DIAGNOSIS, and REPAIR, with time scales marked in microseconds, milliseconds, and minutes.]
Figure 1-Fault recovery sequence (without spare)
However, if a spare is available, a different strategy
could be taken. The system is first reconfigured by
interchanging the faulty subsystem with its corresponding spare, using some method of program controlled
switching.1 The recovery procedure is then initiated to
restore the system to a normal processing state, in
order to reduce the period of interrupted service. The
task of diagnosis and repair can be postponed and
offered to the system as a relatively low priority job
since it is the most time consuming step of the recovery
procedure. This sequence of events is depicted in
Figure 2.
[Figure 2: time line of the recovery steps; surviving labels are FAULT RECOGNITION, SWITCH IN SPARE, SYSTEM RECOVERY, and DIAGNOSIS AND REPAIR along a TIME axis, with time scales marked in microseconds and milliseconds.]
Figure 2-Fault recovery sequence (with spare)
A comparison of Figures 1 and 2 illustrates some of
the maintenance advantages of hardware redundancy.
First, the diagnostic task, which generally consumes
more time than all the other recovery steps combined,
can be deferred and interleaved with normal system
processing on a time-shared basis after the system is
restored to sanity. Secondly, the availability of a
spare permits a "good" vs. "bad" comparison type of
diagnostic testing where the "good" machine interrogates
the faulty machine. This type of testing is
readily programmed because of the availability of a
spare and hence can be automatic. Without some level
of redundancy, an approach must be used whereby
the operator acts as the interrogator. This implies
manually forcing the machine through recovery steps
as illustrated in Figure 1. However, in practice it is
often advisable to provide some subsystems with
spares, and some without, to arrive at a balance of
cost versus reliability.
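The difference between the two recovery sequences can be made concrete with a small calculation. The durations below are invented for illustration; only their orders of magnitude (microseconds, milliseconds, minutes) follow the text, and the step names are taken from the figure captions and labels.

```python
# All durations are assumed values, in seconds.
STEP_SECONDS = {
    "fault detection": 5e-6,
    "fault recognition": 20e-3,
    "switch in spare": 1e-3,
    "system recovery": 50e-3,
    "diagnosis and repair": 30 * 60.0,
}

# Without a spare, service stays interrupted through diagnosis and repair.
WITHOUT_SPARE = ["fault detection", "fault recognition",
                 "diagnosis and repair", "system recovery"]

# With a spare, diagnosis and repair run later as a low priority job, so they
# do not count toward the interruption seen by users.
WITH_SPARE = ["fault detection", "fault recognition",
              "switch in spare", "system recovery"]

def interruption_seconds(steps):
    """Total time during which normal processing is suspended."""
    return sum(STEP_SECONDS[step] for step in steps)

outage_without_spare = interruption_seconds(WITHOUT_SPARE)
outage_with_spare = interruption_seconds(WITH_SPARE)
```

Under these assumed figures the interruption drops from roughly half an hour to well under a second, which is the point the comparison of Figures 1 and 2 makes qualitatively.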
In most applications, the central processor, whether
under program control or some combination of manual
and program control, acts as the executor of any
system recovery scheme. Thus it is of paramount
importance that the central processor itself be highly
maintainable. With this in mind then, the remainder
of this paper will concentrate on outlining maintenance
design principles for the central processor, regardless
of the system environment in which it must perform.
Structural considerations for processor maintainability
Past experience has indicated that the effectiveness
of programmed testing depends not merely on the
techniques used in deriving tests and test results, but
also on the inherent structural maintainability of the
central processor unit. The central processor maintainability is generally constrained by such factors as
the modularity of the logic organization, the availability of accessible test points, etc. It is, therefore,
appropriate to list some of the desirable guidelines to
be included for consideration in order to achieve overall
processor maintainability.
Modularization
In planning a processor organization for maintainability, modularization is of utmost importance.
The processor should be composed of well defined
functional modules, with a minimum number of intermodular feedbacks. * This is desirable to confine the
effects of malfunctions as well as to facilitate programmed testing. Specifically: (a) the function of each
module should be definable as a register, decoder,
sequencer, etc. Irregularities such as scattered special
flip-flops imbedded into a well-defined decoder or
sequencer, or circuits with a mixed mode of synchronous
and asynchronous operations should be avoided. The
symmetry and the regularity exhibited by the
structure of these modules often imply uniformity in
the trouble symptoms caused by faults in these modules.
As a result, a considerable amount of effort in designing
tests and deriving test results can be saved. For example, an attempt should be made to keep general
purpose registers logically equivalent so that a single
set of diagnostic tests will be applicable to all registers;
(b) the interface between modules should be "controllable" and be as simple as possible. This implies
that the number of intermodular feedbacks be minimized and that a uniform and consistent method of
controlling information flow between modules be
established. A common practice in designing tests for
a large processor is to treat each functional module
individually. As a result it is usually difficult to foresee
global problems created by interaction among modules.
Many of these interactions can lead to inconsistent
test results, i.e., test results that may change from
diagnosis to diagnosis.5 For example, a fault in module
A may prevent the initialization of some circuits in
module B. If the test(s) for detecting this fault, due
to the presence of global feedbacks, also depend on
the proper initialization of these circuits in module B,
the test results become inconsistent. In a large processor with many functional modules, the testing
problems created by these interactions can be extremely
complicated. Thus, a "clean" interface between modules is very desirable. This means that in the test
mode, every module should be, either directly or indirectly, controllable and monitorable.
* An intermodular feedback is a control and/or data path that
traverses a ring of functional modules.
Accessibility and observability
The result of segmentation of a processor into
functional modules permits the strategic placement of
test points for purposes of controlling and/or monitoring the state of the machine during programmed
testing. A method for test point placement has been
considered by Ramamoorthy, with the use of graph
theory.6 The functional modules of a processor can be
considered to correspond to nodes of a directed graph,
and signal paths to edges. The nodes of a graph are
partially ordered, from primary inputs to primary
outputs. Feedback loops between nodes can be
"broken" under the constraint that all nodes remain
reachable from primary inputs. Additional control
points are then inserted at places where feedbacks
have been broken. Test points for monitoring purposes
should also be added to modules whose outputs are
not observable, either directly or indirectly, at primary
outputs. The resultant processor organization is,
therefore, one in which every module is controllable
and monitorable for programmed testing. Consequently,
the accessibility and observability are greatly improved.
Our experience indicates that such a facility can often
simplify the design of tests and may well improve the
resolvability of faults.
As an example, consider the organization
shown in Figure 3(a). Each box represents a functional module.
Global feedback loops (BEFB), (CEDC) and (CDC)
are broken at edges FB and DC. Every module is
still accessible from its primary input (through module
A). Control points are added at FB and DC to enable
modules B and C. An additional test point is also
required at output of module F for monitoring purposes.
The resultant organization, with modules partially
ordered, is shown in Figure 3(b). Note that every
module is accessible from its primary input and/or
added control points and the outputs of every module
are monitorable at its primary output and/or added
test point(s).
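The example can be checked mechanically. The sketch below does not reproduce Ramamoorthy's placement method; it only verifies a given choice of broken edges against the stated constraints. The edge list is inferred from the loops named in the text (BEFB, CEDC, CDC) plus the statement that every module is reached through A; taking the primary output from module D is an assumption, since the figure itself did not survive.

```python
from graphlib import CycleError, TopologicalSorter

EDGES = {("A", "B"), ("A", "C"), ("B", "E"), ("E", "F"), ("F", "B"),
         ("C", "E"), ("E", "D"), ("D", "C"), ("C", "D")}
PRIMARY_INPUT = "A"
PRIMARY_OUTPUT = "D"   # assumed; not recoverable from the figure residue

def reachable_from(edges, start):
    """Set of nodes reachable from start along the given directed edges."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)
    seen, stack = {start}, [start]
    while stack:
        for nxt in successors.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def evaluate_breaks(edges, broken):
    """Return (partial order or None, uncontrollable modules, unobservable modules)."""
    kept = set(edges) - set(broken)
    nodes = {node for edge in edges for node in edge}
    predecessors = {node: set() for node in nodes}
    for src, dst in kept:
        predecessors[dst].add(src)
    try:
        order = list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        order = None   # a feedback loop is still closed
    uncontrollable = nodes - reachable_from(kept, PRIMARY_INPUT)
    reversed_edges = {(dst, src) for src, dst in kept}
    unobservable = nodes - reachable_from(reversed_edges, PRIMARY_OUTPUT)
    return order, uncontrollable, unobservable

order, uncontrollable, unobservable = evaluate_breaks(
    EDGES, {("F", "B"), ("D", "C")})
```

With edges FB and DC broken, the remaining graph admits a partial order starting at A, no module is cut off from the primary input, and only module F is left unobservable, which matches the added test point at the output of F.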
As will be seen in a later section, a modular organization with adequate test points will greatly simplify
the design of tests. Thus far only the design guidelines
for the structure of a processor have been touched
upon. Some principles for the behavior aspect are in
order.
Interrupt and rollback mechanisms
As was mentioned earlier, a prime maintainability
objective of real-time, time-shared systems is to
preserve the system integrity in the presence of faults.
The use of error detection and correction circuits may
detect and mask out the misbehavior caused by some
faults. For example, a system employing a Hamming
code can effectively mask out single errors and recognize double errors. However, in real-time operations
the tasks of recovery from a fault occurrence usually
requires a combination of program and hardware
mechanisms. Special interrupt circuitry must be provided which is triggered by fault detection circuits to
initiate the recovery process. Protected storage must
also be provided to preserve the state of the machine
in order to restart the program after the system has
been recovered from a hardware failure.
The use of interrupt and rollback mechanisms can
be illustrated by the following example (Figure 4).
Suppose the normal sequence of operation is S1, S2,
..., Sn, where Si denotes a steady-state point, or a
point to which the program can be rolled back. A
fault is detected while the transition from S2 to S3 is
being executed. To prevent mutilation of data, this
transition should be interrupted and all pertinent information on the state of the machine stored away. The
system will then enter a maintenance mode to isolate
and repair the fault. Once the trouble is cleared, normal
operation can then resume by rolling back to steady
state S1.
[Figure 3a: block diagram of functional modules between PRIMARY INPUT and PRIMARY OUTPUT.]
Figure 3a-Functional modules of a processor: an example
[Figure 3b: the same modules in partial order from PRIMARY INPUT, with the ADDED CONTROL points and the ADDED TEST POINT marked.]
Figure 3b-Partially ordered functional modules of a processor
[Figure 4: state sequence with paths labeled NORMAL OPERATION, INTERRUPT OPERATION, and MAINTENANCE OPERATION; boxes labeled STORE STATE OF MACHINE, DIAGNOSIS, and RECOVERY AND REPAIR.]
Figure 4-The use of interrupt and rollback mechanisms
Interrupt and rollback mechanisms have proven to
be extremely valuable in real-time operations, especially when there are excessive intermittent troubles
in the system.7 Maintenance of this additional
hardware should be performed periodically to insure that
it is in proper working condition.
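A software analogue of the interrupt and rollback mechanism might look like the sketch below. It is only an illustration under simplifying assumptions: every completed step is treated as a steady-state point, the "protected storage" is a deep copy of a Python dictionary, and the exception, function, and variable names are all hypothetical.

```python
import copy

class FaultDetected(Exception):
    """Raised by a step when a (simulated) fault detection circuit fires."""

def run_with_rollback(steps, state, repair):
    checkpoint = copy.deepcopy(state)      # protected copy of the last steady state
    log = []
    index = 0
    while index < len(steps):
        try:
            steps[index](state)
        except FaultDetected:
            log.append(("interrupt", index))
            state.clear()                  # discard the possibly mutilated data
            state.update(copy.deepcopy(checkpoint))
            repair()                       # maintenance mode: isolate and repair
            log.append(("rollback", index))
            continue                       # re-execute the interrupted transition
        checkpoint = copy.deepcopy(state)  # new steady-state point reached
        index += 1
    return state, log

fault = {"active": True}

def step_a(state):
    state["x"] = 1

def step_b(state):
    state["y"] = 99                        # partial update before the fault shows
    if fault["active"]:
        raise FaultDetected
    state["y"] = 2

def repair():
    fault["active"] = False

final_state, event_log = run_with_rollback([step_a, step_b], {}, repair)
```

The partially written value 99 never survives: the state is restored from the checkpoint before the interrupted step runs again.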
Interface with external devices
In many systems the central processor and its
external devices such as memories are interconnected
via common buses. To test the circuits of the central
processor that are associated with buses, it is often
necessary to send data and/or addresses to these
external devices. This mode of testing is often inefficient as it requires extensive initialization of devices
in the external communities. Furthermore, the test
results may be inconsistent since the data is highly
dependent on the states of these devices at the time
the central processor is to be tested. In a large system
a central processor may communicate with numerous
devices, and interfacing with these units for testing
presents a serious problem. To avoid this situation, a
separate return path should be provided (see Figures
5(a) and 5(b)) so that the testing of interface circuits
can be simplified. The return path concept establishes
a testing environment in which the state of the processor during testing need not be dependent upon the
states of other external subsystems or devices. In some
cases a saving of twenty to thirty percent of time and
program space for testing central processor interface
circuits can be achieved.
[Figures 5a and 5b: block diagrams; surviving labels are CENTRAL PROCESSOR, EXTERNAL DEVICES, and CONTROL.]
Figure 5a-Processor interface with external system
(operational mode)
Figure 5b-Processor interface with external system
(maintenance mode)
Circuit and test design
Processor maintainability can be greatly facilitated
if appropriate design principles are followed in circuit
design and in developing diagnostic tests and programs.
In this section, we recommend several such techniques,
most of which are suggested by our experience and
by the results of other workers in the diagnosis field.
Circuit design
Circuit Redundancy-A fundamental assumption
shared by all diagnostic methods is the single-fault
assumption, i.e., one and only one fault may occur
since last diagnosis. The presence of an undetected
fault may invalidate this assumption. Consequently,
the effectiveness of the diagnostic can be weakened.
Thus the requirement of deriving a complete test set,
one which is capable of detecting all faults under
consideration, is necessary in order to reduce the set
of undetected faults that can occur in the field. The
presence of redundant circuits greatly complicates the
design of detection and diagnostic tests and generally
weakens system maintainability. Although faults in
redundant circuits may not affect system operation,
they could invalidate certain tests designed for other
faults under the single-fault assumption.8 For example,
the fault α stuck-at-1 (s-a-1) of a redundant circuit
shown in Figure 6 is not detectable. The presence or
the absence of fault α s-a-1 has no effect on circuit
operations. However, suppose α s-a-1 exists and another fault β stuck-at-0 (s-a-0) occurs. The test
vector (x=0, y=1), which was originally designed for
detecting β s-a-0, is no longer valid, as the path
y→β→z has been "desensitized".9-11
As verifying the validity of all tests under all combinations of undetectable or redundant faults is impracticable, circuit redundancy should be eliminated
whenever possible.
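The masking effect can be reproduced on a small circuit. Figure 6 did not survive extraction, so the circuit below is a hypothetical stand-in chosen to behave as the text describes: z = (x AND y) OR y, a redundant realization of z = y, with α taken as the x input of the AND gate and β as the y input of the OR gate.

```python
from itertools import product

def circuit(x, y, alpha_sa1=False, beta_sa0=False):
    """Evaluate z = (x AND y) OR y with optional stuck-at faults injected."""
    alpha = 1 if alpha_sa1 else x      # x input of the AND gate
    beta = 0 if beta_sa0 else y        # y input of the OR gate
    return (alpha & y) | beta

VECTORS = list(product((0, 1), repeat=2))   # every (x, y) combination

def detecting(**fault):
    """Input vectors whose output differs from that of the fault-free circuit."""
    return [v for v in VECTORS if circuit(*v) != circuit(*v, **fault)]

alpha_tests = detecting(alpha_sa1=True)                 # α s-a-1 alone
beta_tests = detecting(beta_sa0=True)                   # β s-a-0 alone
both_tests = detecting(alpha_sa1=True, beta_sa0=True)   # β s-a-0 after α s-a-1
```

In this stand-in, no vector detects α s-a-1, the vector (x=0, y=1) detects β s-a-0 on its own, and once α s-a-1 is present no vector detects β s-a-0 at all.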
[Figure 6: logic diagram; surviving labels are x, y, and z.]
Figure 6-Example of a redundant circuit
Failure Modes-Many manufacturers have indicated that the use of integrated circuits yields a highly
reliable design at low cost. However, to the best of
our knowledge the failure modes of integrated circuits
and their effect on the test design methods have not
been fully explored. From samples that have been
studied, the dominant failure modes are still the same
as those of discrete components, i.e., stuck-at-1 and
stuck-at-0 types. However, there are also many other
new modes of failure that may require special attention.12 Since the integrated circuit configuration can
introduce a number of parasitic components (such as
diodes and capacitances) between connections, inputs
(of NAND gates, for example) can be grounded due to
a parasitic diode short. Other modes of failure that
are characteristic of physical design include inputs
crossing, inputs simultaneously s-a-1 (due to a mechanical bond lifting), collector to emitter short, etc.
Until a better understanding of this subject is obtained,
one must be cautious in adopting a given integrated
circuit for production. A careful study of the feasibility of designing tests for detecting faults exhibiting
possible abnormal trouble symptoms should be made.
Circuit Behavior-It is generally known that one
of the most difficult maintenance tasks is to handle
faults, which may be intermittent or marginal, yielding
inconsistent failure symptoms. Many of these faults
are caused by gradual component deterioration due
to aging, manufacturing defects, etc., which are unavoidable. There are others that are caused by overly
critical timing, or unrealistically tight tolerances, and
can probably be avoided by careful design. Examples
of these cases include (a) a hard fault in one circuit
which causes marginal operation in another circuit
(e.g., hard fault in a voltage regulator), (b) a hard
fault in one circuit which prevents the initialization
of another circuit (e.g., a fault in a clock gate), (c)
faults which cause circuit operation that is dependent
upon equipment options employed in the unit being
diagnosed, and many others. The test results obtained
under these circumstances are usually unpredictable.
To avoid diagnostic inconsistencies, the test designers
are required to perform the time consuming, arduous
task of reviewing the entire unit to uncover these
deficiencies, and organizing test sequences by the use
of early terminations or selected test skipping techniques. The scope of this task can be minimized if
circuit designers are encouraged to design circuits
which are well-behaved even under failure. However,
since it is unrealistic to assume that diagnostic and
circuit designers will be completely successful in preventing marginal or intermittent faults, some tools
should be provided to aid maintenance personnel in
resolving abnormal fault conditions.
Connectivity and Packaging-With the use of large
scale integration and the increase in logic density, the
relative cost of factory testing and field maintenance
is rapidly escalating. Minimizing the number of global
feedbacks between modules makes the system less
sequential (more combinational); the task of testing,
as well as that of generating field maintenance tests
and a fault dictionary or catalog, is thus simplified.
However, the situation could be further improved if
the circuit designer would reduce, wherever possible,
the number of fan-ins and fan-outs, and especially,
the number of reconvergent fan-outs. * The problem
created by reconvergent fan-outs in deriving tests has
been noted by many workers.9,10 It greatly complicates
the test generation process and can also affect the
fault resolvability, as in many cases faults in fan-out
regions are not distinguishable from those in fan-in
regions. Thus, reconvergent fan-outs should be avoided,
wherever possible.
* Suppose gate B is reachable through gate A along some path(s).
Reconvergent fan-outs of gate A are those fan-out paths that
reconverge at gate B.
A common practice in circuit packaging has been
to assemble each type of plug-in package to contain
several of the same type of logic elements such as
flip-flops, p-input NANDs, etc. However, this practice
is not necessarily an optimal one from the viewpoint
of attaining maximum fault resolvability. As diagnostics are generally associated with "actions" rather than
with circuits,13 serial packaging (i.e., organizing logic
elements along paths from inputs to outputs) would
yield a far better diagnostic resolvability than parallel
packaging. Admittedly, serial packaging will result in
more types of plug-in packages. Since in Medium
Scale Integration or Large Scale Integration a system
may only be composed of several of these packages, the
requirement that faults be isolated to one and only
one suspected package is quite necessary in order to
reduce repair time and/or possible additional unnecessary
package replacement. This implies that the
use of a serial packaging technique to improve resolution should be carefully considered in the design
stage, along with other attributes such as cost of spares,
size, quantity, complexity and production yields, etc.,
to achieve an economic balance.
Design of maintenance tests
The design objectives of maintenance procedures
to enhance processor maintainability are basically twofold: (1) to design a set of tests capable of detecting
and isolating all single, solid faults to a replaceable
package level, (2) to insure that test results will be
consistent for all faults from diagnosis to diagnosis.
The aforementioned design principles for processor
architecture and circuit design were aimed at facilitating the design and the application of maintenance
tests. In this section, we present our recommendations
on methods of deriving tests and generating test
results, on techniques of structuring fault detection
and diagnostic programs, and diagnostic data
interpretation.
Test Derivation-Methods of deriving tests for
logic circuits have been extensively explored.9,10,14,15
The objective is to generate a set of tests capable of
detecting each member of a prescribed fault set. The
most significant result that is applicable to circuits of
practical size is the path sensitizing concept or the
D-algorithm technique.9-11 The idea is to assign a
certain input test vector to a circuit so that faults
along some path from input to output will cause the
circuit output vector to be different from that obtained
under the fault-free condition. For combinational logic,
programs for deriving sensitized paths are fairly simple
to implement. The running speed is also moderate
for circuits with very few reconvergent fanouts. For
sequential logic, there is no known technique that can
efficiently handle circuits with even a moderate number
of feedback paths. A practical approach, therefore,
would be to design the processor organization with a
minimum number of controllable feedback paths, as
was suggested in an earlier section, making the logic
purely combinational for the purpose of testing. The path
sensitizing techniques can then be used to derive a
complete test set.
Generation of Test Results-The pros and cons of
developing a digital fault simulator for generating
test results as opposed to other alternatives (e.g., the
manual method and the physical simulation approach)
have been discussed by Manning and Chang.16 It was
concluded that the digital method is extremely useful
in the early design stage to provide immediate feedback
on the adequacy of hardware design and processor
maintainability. The physical method seems to have
an edge in computer time required to generate all the
test results. However, it is not clear how the physical
method can be used for a circuit realized with integrated circuit technology.
At present, for circuits with 100 logic gates a typical
estimate of required computer time to generate test
results is about one hour.17 With improved techniques
for fault simulation, the running time can be substantially
reduced so as to make the digital approach even more
attractive.16 Those readers who are interested in the
detailed description of the development of a digital
fault simulator can refer to several articles by Seshu.
See References 19-21.
Test Ordering and Minimization-The test set and
the test results generated through the simulation process usually contain redundancy. In some real-time
systems in which both program space and time are
at a premium, it is desirable to select a minimum or
near minimum set of tests that isolate faults only to
the circuit package level. To accomplish this, the
tests should first be arranged in "logical" order in the
same manner as modules of the processor are ordered
(see Figure 3(b)). This in effect constitutes a parallelism between the organization of the processor and
the structure of testing procedure, which is considered
to be a useful aid in isolating marginal and/or intermittent faults that produce inconsistent test results
from diagnosis to diagnosis. Then, the test set for
each module can be reduced by using one of the known
methods for selecting an optimum set of diagnostic
tests.22-24
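As a simple stand-in for the selection methods cited above, the following sketch applies a greedy covering heuristic to a table of which test detects which fault. It is not one of the referenced methods, it yields a near-minimum rather than a guaranteed minimum set, and it covers detection only, not isolation to a package. The table contents and names are hypothetical.

```python
def greedy_test_selection(detects):
    """Repeatedly take the test that detects the most still-uncovered faults.

    Ties are broken by test name so that the result is reproducible.
    """
    uncovered = set().union(*detects.values())
    chosen = []
    while uncovered:
        best = max(sorted(detects), key=lambda t: len(detects[t] & uncovered))
        if not detects[best] & uncovered:
            break                      # defensive: nothing left to gain
        chosen.append(best)
        uncovered -= detects[best]
    return chosen

# Hypothetical table: test name -> faults that the test detects.
DETECTS = {
    "T1": {"f1", "f2", "f3"},
    "T2": {"f3", "f4"},
    "T3": {"f4", "f5"},
    "T4": {"f1", "f5"},
    "T5": {"f2"},
}
selected = greedy_test_selection(DETECTS)
```

Here two of the five tests, T1 and T3, already detect all five faults, so the other three are redundant for detection purposes.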
Program Structure-The final phase in the design
of a diagnostic testing procedure is to incorporate the
tests and test results (obtained through the simulation process) into a diagnostic program. In order to
minimize the overall program development effort (e.g.,
programming, debugging, integration and documentation) and to reduce the program maintenance effort
(e.g., updating changes, etc.), the program structure
should be modular, uniform and consistent. To accomplish these objectives, the use of the "data table"
approach is recommended.
Basically, the program is composed of two parts:
the control section and the data table section. The
data table section consists of a sequence of standard
entries, each of which specifies the operation of a
particular test or test sequence for certain modules.
Typically, each entry specifies (a) the input test vector(s) to be applied, (b) the prescribed length of time
or number of central processor cycles the circuit is
forced or allowed to operate, (c) the expected circuit
response or output(s), (d) the information necessary
to interpret the results, and (e) the required initialization information, if any. The information contained in the data table can be derived with the aid of
a digital simulator. The control section, on the other
hand, is a program designed to interpret these entries
and perform functions such as initialization, segmentation of tests, interfacing with other programs, manipulation of test outputs, etc. Figure 7 illustrates the
layout of a typical diagnostic program structure.
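The split between a control section and a data table can be sketched as follows. The field names mirror items (a) through (e) above, but the entry layout, the 4-bit counter used as the module under test, and the injected fault are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass(frozen=True)
class TableEntry:
    name: str
    module: str                  # which module the entry exercises
    init: int                    # (e) initialization information
    vectors: Tuple[int, ...]     # (a) input test vector(s)
    cycles: int                  # (b) processor cycles to run
    expected: Tuple[int, ...]    # (c) expected outputs
    note: str                    # (d) aid to interpreting the result

Module = Callable[[int, int, int], int]

def control_section(table: List[TableEntry],
                    modules: Dict[str, Module]) -> Dict[str, int]:
    """Interpret each entry; record 0 for a passed test and 1 for a failed one."""
    results = {}
    for entry in table:
        unit = modules[entry.module]
        observed = tuple(unit(entry.init, v, entry.cycles) for v in entry.vectors)
        results[entry.name] = 0 if observed == entry.expected else 1
    return results

def counter(init, enable, cycles):
    """Fault-free model of a 4-bit counter."""
    return (init + enable * cycles) & 0xF

def counter_bit0_stuck_at_0(init, enable, cycles):
    """Same counter with output bit 0 stuck at 0."""
    return counter(init, enable, cycles) & 0xE

DATA_TABLE = [
    TableEntry("T1", "counter", 0, (1,), 3, (3,), "count up from zero"),
    TableEntry("T2", "counter", 14, (1,), 2, (0,), "wrap around at 16"),
    TableEntry("T3", "counter", 5, (0,), 4, (5,), "hold when disabled"),
]
good = control_section(DATA_TABLE, {"counter": counter})
faulty = control_section(DATA_TABLE, {"counter": counter_bit0_stuck_at_0})
```

A circuit change would touch only `DATA_TABLE`; `control_section` stays as it is, and the pass/fail results use the same 0 and 1 convention as a fault dictionary.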
Experience in using this particular design approach
reveals several advantages: (a) the design process
becomes standardized, which in turn results in a large
saving in program development; (b) programs are
more easily modified, e.g., if the circuit changes, the
majority of program alterations will be restricted to
the data table section; (c) test results are easier to
interpret; and (d) the control section can be written
and debugged independently of the data table section.
Also, the data table lends itself well to the participation
of many designers, e.g., the register designers develop
the data table for the registers, the decoder designers
develop the data table for the decoders, etc. However,
in systems where a large number of memory fetches
can be penalized in time, the approach may suffer a slight drawback in that an increase in execution time of the
program over the conventional approach may be realized. This problem is not serious, as the diagnostic program is not a frequently executed program.
[Figure 7: two columns. CONTROL SECTION: initialize circuit (I), execute test T(I), analyze and store results in M(I); then initialize circuit (I+1), execute test T(I+1), analyze and store results in M(I+1). DATA TABLE SECTION: for each of test T(I) and test T(I+1), the entries input test vector(s), operation, expected outputs, and analysis.]
Figure 7-Layout of a diagnostic program structure
Data Interpretation-In large real-time systems, the
diagnostic output usually corresponds to an enormous
amount of data (e.g., for a processor of 10^4 gates, a
test vector might be represented by about 5,000 bits,
where each bit designates the pass or fail of a test).
In addition, the observed fact that test results of some
faults are inconsistent from diagnosis to diagnosis
demands a flexible data interpretation procedure.
Several techniques for resolving diagnostic data
into faulty components or circuit packages have been
described in the literature.5 These techniques employ
the concept of some form of fault dictionary in which
each entry of the dictionary points to the set of faulty
components or circuit packages producing the particular failure pattern(s). These patterns can be derived
by simulation.11,20,26
The simplest form of dictionary is a listing of test
results where a "0" indicates a test passed and "1"
indicates a test failed. Faults are located by finding a
match between the observed failure pattern and the
entry listed in the dictionary. This technique is adequate to analyze failure patterns consisting of a relatively small number of failing tests. For fault conditions
producing a large number of failing tests, a data
compression technique to represent the pattern in
some compact form (e.g., a fixed length number) is
desirable in order to minimize the system repair time.
The tradeoff between the isolation accuracy of the
dictionary and the resolution provided by each of
these techniques is discussed in Reference 26. The final
choice of methods for interpreting diagnostic data for
fault isolation depends on the allowable system downtime and the availability of skilled maintenance
personnel.
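Both dictionary forms can be sketched in a few lines. The patterns and package names below are invented, and CRC-32 is merely one convenient way to obtain a fixed-length number; the paper does not say which compression technique was used.

```python
import zlib

# Hypothetical dictionary: failure pattern (one character per test,
# "0" = passed, "1" = failed) -> packages that can produce that pattern.
FAULT_DICTIONARY = {
    "10000000": {"package A3"},
    "01100000": {"package B1"},
    "11111101": {"package C7", "package C8"},
}

def signature(pattern: str) -> int:
    """Compress a pass/fail pattern into a fixed-length 32-bit number."""
    return zlib.crc32(pattern.encode("ascii"))

COMPRESSED = {}
for pattern, packages in FAULT_DICTIONARY.items():
    # Patterns that collide on one signature share an entry: less storage,
    # but possibly coarser isolation.
    COMPRESSED.setdefault(signature(pattern), set()).update(packages)

def lookup(observed: str):
    """Exact match of the observed pattern against the full dictionary."""
    return FAULT_DICTIONARY.get(observed, set())

def lookup_compressed(observed: str):
    """Match on the compressed signature instead of the full pattern."""
    return COMPRESSED.get(signature(observed), set())
```

An observed pattern with no entry returns an empty set from `lookup`, which is where a more flexible interpretation procedure, or a skilled maintenance person, has to take over.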
CONCLUSION
In this article we have given a unified account of design
principles for processor maintainability in real-time
systems. The processor should be functionally well
modularized with a minimum number of intermodular
feedbacks. This is necessary to confine the effects of
malfunctions as well as to facilitate programmed
testing. To insure the validity of diagnostic data, the
amount of hardcore should be minimized, and ample
test points must be provided to control the state of
the machine, even under faulty conditions. Adequate
system recovery mechanisms must also be incorporated
to insure system sanity in a fault environment. Furthermore, the processor should have a clean interface with
the external devices such as memory units and peripheral systems to enable the rapid identification of trouble
symptoms to a subsystem level.
The design of processor logic circuits should be
preceded by a thorough understanding of the failure
modes of the circuit technology chosen for implementation. The elimination of circuit redundancy and the
incorporation of a packaging scheme which provides
good diagnostic resolvability are some other desirable
prerequisites for a good maintenance scheme. Finally,
individual circuits should be examined to determine
whether an all hardware, hardware-software, or all
software maintenance facility should be provided.
The diagnostic program should be structured to
efficiently implement the selected testing procedure
(combinational or sequential). It should also provide
a flexible operator interface to aid in isolating intermittent faults. Computerized fault simulation methods,
which enable one to generate and evaluate the diagnostics, should be used throughout the design stages
to provide adequate feedback on the effectiveness of
the system's diagnosability.
It is recommended that designers consider these
guidelines in planning a new machine organization and in
designing logic circuits and maintenance tests so that
an optimum mix of software and hardware for processor maintainability can be achieved. Because of the
increased complexities of present and next generation
computing systems, and because of the rapidly changing
technologies, new maintenance techniques will have to
evolve at an accelerated rate. We have only documented
a few thoughts on guidelines for processor maintainability in real-time systems. Our opinions are obviously
influenced by our training and experience. Since there
are only a limited number of published documents
on this subject, we encourage other workers in this
field to present similar results.
REFERENCES
1 W KEISTER et al
No. 1 electronic switching system
Bell System Tech Journal Vol 43 No 5 parts 1 and 2
Sept 1964
2 R H WILCOX W C MANN editors
Redundancy techniques for computing systems
Spartan Books 1962
3 W H PIERCE
Failure-tolerant computer design
Academic Press 1965
4 R E BARLOW F PROSCHAN
Mathematical theory of reliability
John Wiley and Sons Inc 1965
5 H Y CHANG W THOMIS
Methods of interpreting diagnostic data for locating faults in
digital machines
Bell System Tech Journal Vol 46 No 2 Feb 1967 289-317
6 C V RAMAMOORTHY
A structural theory of machine diagnosis
Proc SJCC Vol 30 April 1967 743-756
7 R E STAEHLER
No. 1 ESS service experiences-Hardware
IEE Conf on Switching Techniques for Telecommunication
Networks April 1969 463-466
8 A D FRIEDMAN
Fault detection in redundant circuits
IEEE Trans on Electronic Computers Vol 16 No 1 Feb
1967 99-100
9 D B ARMSTRONG
On finding a nearly minimal set of fault detection tests for
combinational logic nets
IEEE Trans on Electronic Computers Vol 15 No 1 Feb
1966 66-73
10 J P ROTH
Diagnosis of automata failures: A calculus and a method
IBM Journal of Research and Development Vol 10 1966
278-291
11 J P ROTH W G BOURICIUS P R SCHNEIDER
Programmed algorithms to compute tests to detect and
distinguish between failures in logic circuits
IEEE Trans on Electronic Computers Vol 16 No 5 1967
567-579
12 W WORKMAN
Failure modes of integrated circuits and their relationship to
reliability
Microelectronics and Reliability Vol 7 1968 257-264
13 J B KRUSKAL R E HART
A geometric interpretation of diagnostic data from a digital
machine
Bell System Tech Journal Vol 45 Oct 1966 1299-1338
Based on a Study of the Morris, Ill Electronic Central
Office
14 R D ELDRED
Test routines based on symbolic logic statements
Journal of ACM Vol 6 No 1 1969 33-36
15 J F POAGE
Derivation of optimal tests to detect faults in combinational
circuits
Math Theory of Automata Polytechnic Press 1963
Brooklyn N Y 483-528
16 E G MANNING H Y CHANG
A comparison of fault simulation methods for digital systems
Digest of the First Annual IEEE Computer Conf 1967
10-13
17 E R JONES C H MAYS
Automatic test generation methods for large scale integrated
logic
IEEE Journal of Solid State Circuits Vol 2 Dec 1967 221
18 E G MANNING H Y CHANG
Functional techniques for efficient digital fault simulation
Digest of IEEE Internat Conv March 1968 194
19 S SESHU D N FREEMAN
The diagnosis of asynchronous sequential switching systems
IRE Trans on EC Vol 11 No 4 Aug 1962 459-465
20 S SESHU
On an improved diagnosis program
IEEE Trans on EC Vol 14 No 1 1965 76-79
21 S SESHU
The logic organizer and diagnosis programs
Rpt R-226 Coordinated Science Lab Univ of Ill 1964
(AD605627)
22 R A JOHNSON
An information theory approach to diagnosis
Proc 6th Nat Symposium on Reliability and Quality
Control 1960 102-109
23 H Y CHANG
An algorithm for selecting an optimum set of diagnostic tests
IEEE Trans on EC Vol 14 No 5 1965 706-711
24 H Y CHANG
A distinguishability criterion for selecting efficient diagnostic
tests
Proc SJCC Vol 32 1968 529-534
25 F J HACKL R W SHIRK
An integrated approach to automated computer maintenance
Conf Record on Switching Theory and Logical Design
1965 289-302
26 H Y CHANG
Figures of merit for the diagnostics of a digital system
IEEE Trans on Reliability Vol 17 No 3 Sept 1968 147-153
Effects and detection of intermittent
failures in digital systems
by M. BALL and F. HARDIE
IBM Corporation
Owego, New York
INTRODUCTION
A great deal has been written during the past few years
on the subject of diagnostic test procedures for digital
systems. Almost without exception, however, the investigators have limited their interest to the detection
and location of solid faults, and their test procedures are
usually based on the assumption that either the fault
exists for the running time of the test procedure or the
time interval between the fault occurrence is less than
the required time to run the test.
In practice, experience has shown that field failures in
digital systems used for aerospace application (e.g.,
Titan and Saturn vehicle guidance computers) tend to
be intermittent in nature. The authors believe that this
experience is testimony to the efficiency of the current
diagnostic test procedures in screening solid faults
from digital systems before delivery for field use, not
that failures which develop in the field tend to be
intermittent. That is, diagnostic testing of aerospace
digital systems using the advanced test procedures
available today generally detects all solid faults but
only a small portion of the intermittent faults that
exist in any digital system prior to delivery to the field.
The residue of intermittents in the system which
escaped detection eventually make their presence known
during field operation.
The reason for the emphasis on diagnosis of solid
faults is the relative complexity involved in the diagnosis of intermittent faults. This is the natural course of
evolution in system design as well as in biology: adaption
to basic environmental requirements with later complex
specialization.
In an attempt to direct the evolution of diagnostic
techniques along the channels leading to efficient
detection and location of intermittent failures in digital
systems, the authors conducted a series of experiments on the effects and detection of intermittent
failures. Over 500 hours of IBM 7090 time were accumulated using a sophisticated logic simulator to evaluate
the Saturn V Launch Vehicle aerospace computer operation in both normal and failure modes. The purpose of
these experiments was to determine the effects of intermittent failures on computer operation rather than to
investigate the mechanisms of failure, and to evaluate
the detectability of classes of failure rather than to
develop specific techniques for failure detection.
In this study solid faults were treated as a special
case of the general class of intermittents. That is, a
solid fault was treated as an intermittent whose duration exceeds the running time of the test program.
The simulated intermittents were made to vary in
duration from 500 nanoseconds (one clock
time of the simulated computer) upwards, and were specified
in the computer logic at randomly chosen points of
combinational and sequential circuits. A total of
792,884 intermittent failures were simulated to give a
realistic statistical sample. These intermittent points
were chosen to occur in the program control and
arithmetic sections of the simulated computer.
For each intermittent a record was kept of the time
of error occurrence, time of error detection and the
number of failures which caused a difference in operation from a "good" machine. From these records the
probability of detection was calculated assuming a
"perfect" error detector. The results showed that many
intermittent failures exert only a weak influence on the
correct operation of synchronous logic circuits. As
shown in Table I, approximately eight percent of the
simulated failures caused the arithmetic element to
perform incorrectly, with a comparable (five percent)
probability of detection by the "perfect" error detector.
TABLE I-Intermittent detection capabilities
Unit              Total     Failure                    Failures Causing     %         Logic Failures  %
                  Failures  Duration                   Incorrect Operation  Affected  Detected        Detected
Adder/Subtractor  267,894   500 nanosec                22,276               8.4       1,113           5
Multi/Divide      269,590   500 nanosec to 5 millisec  22,376               8.3       1,122           5
Program Control   255,400   500 nanosec to 5 millisec  44,704               17        252             0.5
The system simulator
One of the most serious problems confronting the
designers of digital systems is the task of verifying
proposed design features. Both manual analysis and
simulation techniques are used to aid in this task.
During the design and development phase of the
Saturn V Launch Vehicle Digital Computer, a Fault
System Simulator was developed* by IBM to provide
the means of (1) verifying the logical integrity of the
digital equipment, (2) evaluating design changes before
commitment to hardware, and (3) evaluating test
programs. During the course of its use, however,
emphasis gradually shifted to a special simulator
application which generated information on the characteristics of machine operation to aid the engineer in
diagnosing malfunction symptoms. One of the most
significant series of simulator experiments was concerned with evaluating the sensitivity of the digital
logic to intermittents.
* Design and Use of Fault Simulation for Saturn Computer
Design, by F. Hardie & R. J. Suhocki, IEEE Trans. on Electronic Computers Vol EC-16, No. 4, August 1967, p. 412-29.
The system simulator consisted of a compiler, failure
injector, logic simulator, and evaluation programs.
These programs operated on the IBM 7090 computer
as shown in Figure 1. The compiler program produced
7090 instructions for the logic portion of the simulator
program. The failure injection program allowed the
introduction of selected faults into the logic portion of
the simulator program on the component level, that
is, open or shorted diodes and transistor outputs
stuck to a logical zero or a logical one. The simulator
program operated on a 7090 description of the digital
equipment (a logic master tape) to simulate the logical
behavior of the equipment in normal operation and in
various failure environments.
Figure 1-Simulator flow diagram
The simulator program executed special test programs and displayed, by means of print-outs, the state
of selected logic nodes or register contents at every
clock time of an instruction cycle. In investigating
the behavior of equipment containing logic failures,
simultaneous failure environments were provided by
using parallel simulation techniques, and the system
states for each environment were determined simultaneously. Of the 36-bit 7090 word, 3 bits were used
to represent the normal system state and each of the
remaining 33 bits was used to represent a failed state.
Multiple faults were simulated by injecting 2 to 25
failures into a single bit position.
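The parallel simulation technique can be illustrated with a minimal sketch in which each bit position of a word carries one copy of the machine. The three-node circuit, the node names, and the use of a single reference bit (bit 0) are simplifying assumptions; the simulator described here reserved 3 bits of the 36-bit word for the normal state.

```python
WIDTH = 36                      # bits per word, as on the 7090
ALL = (1 << WIDTH) - 1


def spread(value):
    """Replicate one logic value into every bit position (every machine)."""
    return ALL if value else 0


def inject(word, stuck_at_0, stuck_at_1):
    """Force the selected machines' copies of a node to 0 or to 1."""
    return (word & ~stuck_at_0 & ALL) | stuck_at_1


def simulate(a, b, c, faults):
    """Evaluate out = (a AND b) OR c for all machines in one pass.

    faults maps a node name to (stuck_at_0_mask, stuck_at_1_mask).
    """
    none = (0, 0)
    wa = inject(spread(a), *faults.get("a", none))
    wb = inject(spread(b), *faults.get("b", none))
    wc = inject(spread(c), *faults.get("c", none))
    and_out = inject(wa & wb, *faults.get("and", none))
    return inject(and_out | wc, *faults.get("out", none))


def failing_machines(out_word):
    """Bit positions whose output differs from the good machine in bit 0."""
    diff = out_word ^ spread(out_word & 1)
    return [i for i in range(1, WIDTH) if (diff >> i) & 1]


# Hypothetical fault assignment: one failed machine per bit position.
FAULTS = {
    "a": (1 << 1, 0),      # machine 1: input a stuck at 0
    "and": (0, 1 << 2),    # machine 2: AND output stuck at 1
    "c": (0, 1 << 3),      # machine 3: input c stuck at 1
}
```

With inputs a=1, b=1, c=0 only machine 1 deviates from the good machine; with all inputs at 0, machines 2 and 3 deviate. A multiple fault corresponds to setting the same bit position in the masks of several nodes.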
Up to 100 logic test nodes were available for printout in each normal or failure environment. Special
pseudo operation codes allowed additional nodes to be
retrieved as required. Another pseudo operation code
caused the contents of selected registers to be placed
on the simulator output tape for use by the evaluator
program.
The evaluator programs identified fault symptoms
and correlated these symptoms with the injected
failures. The output of the evaluator was a report of
detected errors, undetected errors, accuracy of diagnosis, and general behavior of the digital equipment.
Simulator applications
The primary applications of the system simulator
can be grouped into four general categories: design
evaluation, failure evaluation, data generation, and
data analysis. The obvious use of the simulator was to
provide early and rapid verification of the logical
integrity of the basic hardware designs of digital
equipment. In addition to checking the basic logic, the
simulator was used to determine whether certain design
ground rules were satisfied by the circuit designs, and
even whether the ground rules themselves were adequate. For example, individual circuits were checked
against fan-in and fan-out constraints. In addition,
the constraints themselves were checked against drive
and load requirements by applying random and worst
case parameter values to the drives, driven circuits, and
circuit loads.
Delay simulation, incorporating logical element delay
characteristics in the logic simulator, was used to
analyze the nature of digital signal propagation in the
computer designs. Several race conditions were detected by the delay simulations and were corrected
by modifying equipment initialization procedures or
by design changes.
Operational and test programs were evaluated on
the system simulator. Although functional program
simulators provide nearly error-free programs from
the standpoint of information flow, an appreciable
amount of program debugging is usually required
when the program is first used with the hardware.
Logic simulator evaluation of programs reduced this
final debugging phase to a minimum.
The applications discussed so far pertain to properly
operating equipments. The logic simulator should be
regarded in such applications as a tool to aid in design
analysis and not as a replacement of manual analysis
and engineering judgment. In the area of failure mode
analysis, however, the simulator as a tool becomes
even more important because of the inherent difficulty
in determining the behavior of failed machines, and
especially in identifying the fault from the failure
symptoms.
The failure injection program and diagnostic evaluation programs provide a failure evaluation capability
for the system simulator. Test programs for equipments
were evaluated for their failure detection and fault
isolation capabilities. Built-in test circuitry and test
point configurations were evaluated in the same manner.
Optimum placement of detection circuits and test
points was determined by successive simulations.
Although the evaluation applications represent perhaps the most important use of the system simulator,
the simulator also possesses a capability of generating
data which is useful not only in design and test of the
system but also in increasing the capability of the
simulator itself. For example, a diagnostic catalog can
be generated as a by-product of a test program evaluation which relates each injected fault to the resulting
failure symptoms. The catalog is then available for use
in evaluating diagnostic programs or procedures in
further simulations.
One of the applications of the logic simulator which
is generally very difficult to perform manually is to
trace the propagation of an error caused by a component failure, especially when the failure produces a
loss of program control. Such traces can be generated
by logic simulation, however, and have important diagnostic value in identifying system faults. The status of
the failed equipment at every clock time can be determined by monitoring over a hundred nodes or test
points internal to the equipment logic, as well as the
equipment interface. A summary of simulator applications is shown in Figure 2.
Design Evaluation
  Hardware
    Basic Logic
    Design Ground Rules
    Delay Simulation
  Software
    Operational Programs
    Test Programs
  Design Changes
Failure Evaluation
  Test Programs
    Error Detection Efficiency
    Diagnostic Capabilities
  Circuit Sensitivity
    Error Propagation
    Failure Effects
Data Generation
  Node Data
  Error Traces
  Diagnostic Catalog
Data Analysis
  Laboratory Support
  Field Failure Analysis
Figure 2-Simulator applications
Simulation of intermittent failures
The application of the simulator which is the primary
concern of this paper was a series of experiments to
determine the sensitivity of logic to intermittent
failures. Intermittents simulated by the failure injection program were made to vary from one clock
time to the cycle time of the test program (representing
a solid failure). These faults were injected at randomly
chosen points in the equipment logic and at random
points in the test program.
For each intermittent a record was kept of the time
of occurrence, time of detection, and the number of
failures which caused a difference from the "good"
machine. The results of the simulation indicated that
many intermittent logic failures had very little effect
on the operation of the digital equipment: less than
ten percent of the total failures injected into the simulator program caused the logic to perform incorrectly.
Analysis of the simulation results disclosed that this
masking of failures by the logic was due primarily to
• the extensive use of combinational logic
• the clocking of the AND gates which feed and/or
gate the logic levels from the sequential circuits.
• the duration and frequency of the intermittent
failure.
These simulation results and conclusions were based
on a relatively small statistical sample (a few hundred
simulated failures). In order to obtain a realistic statistical sample, the failure injection program was
modified to execute the following procedure:
1. The logic failure was initiated at the first clock
time of the test program.
2. The test program was executed until a state
difference was detected by the simulator program
between the logic under examination and a
"good logic" reference.
3. Upon failure detection, the time of detection
and failure symptoms were recorded, the logic
under examination reset to the same state as
the reference logic, and the test program advanced to the next clock time.
4. The procedure was repeated for one full cycle
of the test program.
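The four steps above can be sketched as a single loop. The function names, the 4-bit accumulator used as the logic under examination, and the representation of a failure symptom as the XOR of the two states are hypothetical choices made for illustration, not details taken from the failure injection program.

```python
def sensitivity_pass(step, initial_state, program, fault):
    """Run one full cycle of the test program with one fault present.

    step(state, instruction, fault) returns the next state; passing
    fault=None gives the "good logic" reference behaviour.
    Returns a list of (clock time, symptom) pairs.
    """
    good = failed = initial_state
    detections = []
    for clock, instruction in enumerate(program):    # step 4: one full cycle
        good = step(good, instruction, None)
        failed = step(failed, instruction, fault)    # step 1: fault is active
        if failed != good:                           # step 2: state difference
            detections.append((clock, good ^ failed))  # step 3: record symptom
            failed = good                            # step 3: reset to reference
    return detections


def toy_step(state, instruction, fault):
    """Hypothetical logic: a 4-bit accumulator that adds the instruction."""
    result = (state + instruction) & 0xF
    if fault is not None:
        result &= ~(1 << fault)      # output bit number `fault` stuck at 0
    return result


PROGRAM = [1, 1, 1, 1, 2, 0]
DETECTIONS = sensitivity_pass(toy_step, 0, PROGRAM, fault=1)
```

Because the logic under examination is reset after every detection, each recorded clock time marks a point at which a failure lasting a single clock time would have disturbed the machine.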
The immediate data from this simulation provided
a measure of the sensitivity of the logic to intermittent
failures of one clock time duration. That is, the portions
of the test program during which the injected faults
cause a deviation from normal operation were identified.
The same data was used to provide a measure of the
sensitivity of the logic to intermittent failures of longer
durations than one clock time by manipulating the
data with simple editing programs rather than by
further simulation, making it feasible to accumulate
information on an equivalent of over a half million
simulated failures.
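The editing programs are not described in the paper. One plausible reading, offered here purely as an assumption, is that an intermittent of longer duration is counted as causing a malfunction whenever its interval covers at least one clock time found sensitive in the one-clock data. A minimal sketch under that assumption, with a hypothetical function name and a uniformly random start time:

```python
def malfunction_probability(sensitive_clocks, program_length, duration):
    """Fraction of start times for which a fault of the given duration
    (in clock times) overlaps at least one sensitive clock time.

    Assumption: the fault starts at a uniformly random clock time and must
    fit entirely within one cycle of the test program.
    """
    sensitive = set(sensitive_clocks)
    starts = program_length - duration + 1
    if starts <= 0:
        raise ValueError("duration exceeds the test program length")
    hits = sum(
        1
        for start in range(starts)
        if any(clock in sensitive for clock in range(start, start + duration))
    )
    return hits / starts
```

For a fault location with a single sensitive clock time (clock 3) in an 8-clock program, this gives 0.125 for a duration of one clock time, 0.8 for four, and 1.0 for eight, reproducing the qualitative rise of sensitivity with duration.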
To assure the validity of the above techniques, the
quantitative results concerning the sensitivity of the
logic to intermittents obtained by the first method of
actually simulating failures of longer duration, and by
the second method of simulating failure durations of one clock period
and then manipulating the data with special edit
programs, were compared and found to be closely
correlated. The combined data from both simulation
experiments was then used to derive a series of curves
representing the sensitivity of the logic to intermittents
of various durations, two of which are shown in Figures
3 and 4. The ordinate in each figure is the probability
that the intermittent will cause a malfunction in logic
operation, while the abscissa is the duration of the
intermittent.
Figure 3-Sensitivity of arithmetic logic
Figure 4-Sensitivity of multiply/divide logic
(Axes in both figures: probability of malfunction, 0 to 1.0, versus duration of intermittent failure in clock times, 1 to 10000.)
The sensitivity of the logic was found to vary appreciably not only with the class of logic (combinational
or sequential) but with the operational function of the
logic circuitry as well. This condition necessitated the
plotting of sensitivity versus fault duration individually
for different areas in order to obtain meaningful relationships.
A summary of these results is given below:
• There is a smaller probability of detecting intermittent failures in combinational (AND-OR)
circuits than in sequential (LATCH) circuits.
• There is a very low probability of detecting a
single occurrence intermittent failure on a logic
page (average population of 120 AND, OR, invert
type circuits). This condition exists because many
intermittents do not make the "failed" logic act
different from the "good" logic and the detection
of intermittents requires that the logic must be
exercised by appropriate data for the failure to be
detected.
• For these injected intermittents, a fault existing
for one clock time was virtually undetectable; one
existing for ten computer word times was about
50 percent detectable; and one existing for 50
computer word times was almost 100 percent
likely to be detected.
• There is a wide variation of error detection sensitivities between computer modules.
Test program efficiency
An analysis of simulation results was performed to
determine the quantity and type of information which
should be generated by a test program to assure a
reasonable probability of error detection and fault
location in the digital equipments. Figure 5 shows the
efficiency of the test program versus the size of the
test program for various types of failures. Curve a
represents a solid failure. Curves b and c represent an
intermittent failure of 100 clock time duration in typical
sequential and combinational logic, respectively. Note
that, although a reasonably high efficiency of detecting
a solid failure was achieved with a relatively short
test program (90 percent with 200 instructions), the
probability of detecting the intermittent was almost
linear with program size.
Figure 5-Efficiency of test program vs. program size
(Plot of test program efficiency, 0 to 1.0, against number of instructions in test program, 0 to 500; one curve is labeled "Intermittent in Sequential Logic (100 clock time duration)".)
Many different types of error symptoms were produced as a by-product of the simulation experiments.
Each symptom was analyzed to determine its individual
and combined value in identifying logic failures. Figure
6 is a summary of the results of this analysis for solid
faults. The relative diagnostic values of the error
symptoms in identifying intermittent failures are about
the same except that the percentages will be less according to the duration of the intermittent.
Due partly to unavoidable redundancy in a test
program (by which a logic element is exercised more
than once) and due to error propagation in digital
systems, an error in logic operation resulting from a
failure of a logic element can occur several times during
the execution of the test program. The first line of
Figure 6 indicates the program instruction during
which an error was first detected. The second line
indicates the phase, bit and clock time that the error
was first detected. The third line indicates the first
three program instructions during which an error was
detected. The remaining lines indicate various combinations of the above test parameters.
Observed Failure Symptom or Parameter              Failures Identified
First Program Step of Detected Error               10.5%
Time of First Detected Error                       28.1
First Three Program Steps of Detected Error        6:J.2
First Program Step of Detected Error and
Time of First Detected Error                       71.8
First Three Program Steps of Detected
Errors and Time of First Detected Error            83.2
First Three Program Steps of Detected
Errors and Time of Each Detected Error             9~r.5
Figure 6-Symptom failure correlation
CONCLUSIONS
The series of simulation experiments described above
strengthened the authors' opinion that the prevalence
of intermittent failures of digital equipments in the
field is due to the relatively low efficiency of current
test techniques in screening such failures before delivery of the equipment to the field. That is, although
current test techniques cause most of the solid faults
which are "built into" the equipment during fabrication
to be discovered before release to the field, a large
residue of intermittents slips through the test screen
and causes operational errors during field use.
The simulation experiments described above did very
little in the way of deriving a solution to the problem
of intermittent failures. No attempt was made to
determine the mechanisms or characteristics of actual
intermittent faults in existing digital equipment. The
experiments were designed only to examine the sensitivity of digital logic to intermittent faults in general,
without regard to mechanisms of failure.
The simulation results indicated to the authors that
current test techniques, slanted toward detection and
location of solid faults in digital equipment, are inadequate for solving the problem of intermittents. The
experiments showed a rather surprising insensitivity to
intermittents of short duration. Although this insensitivity may seem to be a fortunate characteristic for
actual operation, it makes the problem of testing
infinitely more difficult.
Two general approaches to the test problem are obvious:
• develop better test techniques for detecting and
locating intermittent faults, and
• develop techniques for making the intermittents
appear solid.
The second approach has found widespread acceptance, as indicated by the common use of vibrational
and thermal stimuli to force intermittent faults to
expose themselves during factory checkout. In this
way many intermittent faults are detected that may
otherwise have slipped through the factory test screen.
The prevalence of intermittent failures during field
operation, however, testifies to the inadequacy of this
approach by itself.
A third approach is, of course, to design the equipment to be absolutely insensitive to intermittent logic
failures. Instruction retry, check point rollback and
redundancy are being advanced as possible solutions.
Redundancy, especially triplicated logic with voting,*
has proven very effective in this area, but not without
cost in hardware and power. Eventually, when logic
hardware becomes sufficiently inexpensive, redundancy
may very well be the way of life and the intermittent
problem will have been solved.** Meantime, there
remains urgent need for developing better test techniques for detecting and locating intermittent faults
in digital equipment.
The greater part of test and maintenance cost of
computer systems today is spent on detecting and
isolating intermittent failures. Intermittents have comprised over thirty percent of pre-delivery failures
and almost ninety percent of field failures in several
computer systems known to the authors, and this
seems to be the trend in present computer technology.
Unfortunately, most of the current research in diagnostic techniques is concerned with the detection and
location of solid failures.
Logic simulation has provided a powerful tool for
studying the effects of intermittents in specific computer organizations, but in itself is not a solution to the
cost problem. Even when these effects have been
identified, the techniques for designing a computer to
be intermittent-resistant or for testing a computer to
locate intermittent failures are not yet state-of-art.
* IBM Proposes Triple-Redundant Computer, by M. Ball and
F. Hardie, Computer Design Vol. 6, pages 34-36, Nov. 1967.
** Self-Repair in a TMR Computer, by M. Ball and F. Hardie,
Computer Design Vol. 8, No. 4, pages 54-57, April 1969.
Figure 7-Cost of failures
(Curves of total test and maintenance costs and of cost of solid failures against calendar time of computer life, with field delivery marked.)
Figure 7 shows a typical curve of the relationship
of the costs of testing and maintaining a computer
system from its initial assembly to the end of its useful
life. The following conclusions may be evident from
the figure:
• Intermittent failures are far more costly in test
and maintenance than solid failures.
• The cost ratio of intermittent to solid failures
increases with system usage, especially following
delivery to the field. The reason for this trend is
probably the better screening of solid failures by
current test techniques.
• The cost of field maintenance remains high with
usage, and most of the cost is due to intermittent
failures. This large residue of intermittent faults
is probably due to inefficient test screening rather
than to new faults.
• The costs of a computer system tend to be monotonically decreasing with use. End-of-life is
forced by obsolescence rather than by wear-out.
Modular computer architecture strategy
for long term missions
by
F. D. ERWIN
Hughes Aircraft Company
Fullerton, California
and
E. BERSOFF
NASA Electronics Research Center
Cambridge, Massachusetts
INTRODUCTION
Long term mission reliability of a modular computer
has been studied at Hughes Aircraft Company as a
consequence of a study with NASA ERC.1,2 Particular interest lay in the attainment of long term
reliability with modular computer organization and the
effects on reliability of variations in modular organization. The results of this investigation are presented
in this paper.
In the past, the designers of aerospace computers
have concentrated on increasing computational speed
and arithmetic capability within stringent weight and
power limitations. There seems to be little doubt that
aerospace computers will soon be extremely fast,
versatile and compact. A requirement for long term
system reliability has been developing and may drastically change the nature of the on-board computer.
Extremely long missions are being planned which
require a computer to operate for one to five or more
years after launch. Current on-board computer systems
are not adequate for this task.
One promising approach for achieving reliability
and flexibility is through mQ.
(Curves plotted against normalized time T, 0 to 5.0; curve labels include TMR and 2 OF 3; read "2 OF 6" as L of N, typ.)
NOTES:
(1) d = 0.1 except as noted
(2) R = \frac{L}{d} \binom{L/d + N - L}{N - L} e^{-LT} \sum_{K=0}^{N-L} \binom{N-L}{K} \frac{(-1)^K}{L/d + K} e^{-KdT}
Figure 4-Long term reliability curves
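The reliability expression in the notes to Figure 4 is badly garbled in this copy. The sketch below evaluates one reading of it, in which L modules are active, N - L are standby spares with relative dormant failure rate d, and T is normalized time. The function names are hypothetical, and the reading itself is a reconstruction to be treated as an assumption. It is checked against the CAU entry quoted in the text (N = 3, L = 1, T = 1.05, d = 0.1, R read from the graph as .89).

```python
from math import comb, exp


def generalized_binomial(x, k):
    """Binomial coefficient C(x, k) for real x and integer k >= 0."""
    result = 1.0
    for i in range(k):
        result *= (x - i) / (i + 1)
    return result


def standby_reliability(n, l, d, t):
    """Reliability of an L-of-N module set with N - L standby spares.

    Reconstructed form (an assumption, see lead-in):
      R = (L/d) C(L/d + N - L, N - L) e^(-L T)
          * sum over K of C(N - L, K) (-1)^K / (L/d + K) e^(-K d T)
    """
    spares = n - l
    a = l / d
    total = sum(
        comb(spares, k) * (-1) ** k / (a + k) * exp(-k * d * t)
        for k in range(spares + 1)
    )
    return a * generalized_binomial(a + spares, spares) * exp(-l * t) * total


# CAU module set as tabulated in the text: N = 3, L = 1, T = 1.05, d = 0.1.
R_CAU = standby_reliability(3, 1, 0.1, 1.05)

# System reliability as the product of the module-set readings quoted in
# the text for the CAU, AU, CU, MU and I/O sets.
READINGS = [0.89, 0.97, 0.8, 0.98, 0.98]
R_SYSTEM = 1.0
for reading in READINGS:
    R_SYSTEM *= reading
```

Under these assumptions R_CAU evaluates to about 0.89, matching the graph reading, and the rounded readings multiply to roughly 0.66, in line with the system figure of .67 quoted in the text.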
        CAU   AU    CU    MU    I/O
N       3     6     7     4     4
L       1     2     2     1     1
T       1.05  1.49  3.54  .85   .88
Reading the graphs for module set reliability:
        CAU   AU    CU    MU    I/O
T       1.05  1.44  3.54  .85   .88
R       .89   .97   .8    .98   .98
Thus, R_B = .67, a significant gain in system reliability, though additional steps must yet be made to
reach the desired reliability. When a potential configuration is reached one must determine the additional
gates added to each module by virtue of the new
configuration and then calculate a corrected system
reliability. Usually one must iterate through several
configurations many times to reach the desired reliability with a minimal gate count. At this point
Equation (8) may be used for a more accurate
reliability value.
CONCLUSIONS
A method of estimating long term reliability of modular
computers has been presented and two sample cases
examined. In the second example 240 percent additional hardware was used to improve five year predicted reliability from .018 to .67. To this must be added
the additional switches to accommodate the increased
modules (from 13 in first example to 24 in second). To
obtain a reliability of the order of .99 for a five year
mission perhaps the additional hardware necessary
would amount to as much as four times that required
for the actual computing. Gate failure rates used in
the examples are for present day high quality IC's. If
the basic gate reliability could be increased by a factor
of ten this total additional hardware could be approximately halved.
The modular approach with standby modules appears
capable of servicing long missions with feasible costs.
                N       L
CAU's           3       1
AU's            6       2
CU's            7       2
MU's            4       1
I/O             4       1
Figure 5-Multi-module modular computer
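As a rough check on the second example, the sketch below multiplies the module-set reliabilities read from the graphs and totals the module counts of Figure 5. It assumes the system fails if any one module set fails (a series model), which the text implies but does not state in this section; the dictionary layout and function names are illustrative only.

```python
import math

# (N, L, R) per module set: N and L from Figure 5, R as read from the graphs.
MODULE_SETS = {
    "CAU": (3, 1, 0.89),
    "AU": (6, 2, 0.97),
    "CU": (7, 2, 0.80),
    "MU": (4, 1, 0.98),
    "I/O": (4, 1, 0.98),
}

def system_reliability(module_sets):
    """Series model: every module set must survive the mission."""
    return math.prod(r for _, _, r in module_sets.values())

def total_modules(module_sets):
    """Total number of modules carried, active plus standby."""
    return sum(n for n, _, _ in module_sets.values())

print(total_modules(MODULE_SETS))
print(round(system_reliability(MODULE_SETS), 3))
```

The module count comes to 24, as stated, and the product to about .66, in line with the quoted .67 given the precision of values read from a graph.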
ACKNOWLEDGMENT
The authors express their appreciation to Mr. Jack
L. Bricker of Hughes Aircraft Company for his effort
and guidance in developing the mathematical model.
REFERENCES
1 J J PARISER H E MAURER
Modular computer implementation with LSI
In these proceedings
2 F D ERWIN J F McKEVITT
Characters-Universal architecture for LSI
In these proceedings
3 R A SHORT
The attainment of reliable digital systems through the use of redundancy-A survey
Computer Group News March 1968
4 BERSOFF HOPE TUNG
Modular computer research
To be published
5 E J KLETSKY
Upper bounds on mean life of self-repairing systems
IRE Trans on Reliability and Quality Control Oct 1962 43-48
6 P O NERBER
Power-off time impact on reliability estimates
IEEE Internat Conv Record Part 10 March 22-26 1965 NY 1-8
7 L K DAVIS G A WATSON T G SCHAIRER
Advanced computer dormant reliability study, Final Report
Autonetics Div of No American Rockwell Corp Oct 14 1967
8 J L BRICKER
Reliability studies of the NASA deep space computer and the H-4400 computer
To be published
TABLE I-Two column component breakdown (approximate)
modular computer breadboard (separate arithmetic & control modules)
MODULE          GATES/MODULE    %       IC/MODULE       %
CAU             4800            24      1440            25
Switches        180             1       55              1
I/O             2000            10      495             9
Memory          1950            10      495             9
Control         8100            40      2175            39
Arithmetic      3400            15      975             17
TOTAL           20430           100     5635            100
ALTERNATE APPROACH (COMBINED ARITHMETIC & CONTROL MODULES)
MODULE          GATES/MODULE    %       IC/MODULE       %
CAU             4800            25      1440            27
Switches        180             1       55              1
I/O             2000            11      495             10
Memory          2000            11      495             10
Processor       9800            52      2700            52
TOTAL           18780           100     5185            100
A compatible airborne multiprocessor
by E. J. DIETERICH and L. C. KAYE
RCA Aerospace Systems Division
Burlington, Massachusetts
INTRODUCTION
The control of large military forces is creating the need
for large data-processing systems located in transport
aircraft and in other situations where tight quarters
and hostile environments call for the design features
found in airborne systems. In these applications the
configuration of the computer and its peripheral equipment strongly resembles what is found in a typical
commercial data-processing system, with some additional requirements for reliability. In particular, the
functional programs are complex and extensive, and
the availability of a complete package of support software, including compilers and utility routines as well
as the resident executive, is likely to be of critical
importance. Because of its cost, so complete a software
package cannot reasonably be developed specifically
to answer a particular military need; it must be captured from an existing software system. The only
source of complete data-management software packages
is commercial data-processing; and thus it makes
practical sense for a large, militarized data-processing
computer to be strictly compatible with an existing
commercial product. As a bonus, the commercial
computer can then be used as a support computer
for compilation and program checkout. An example
of a program in which an airborne computer is supported by an existing ground-based commercial computer is found in the Strategic Air Command's Post
Attack Command and Control System-Airborne
Data Automation.1 In this system the airborne computer is the RCA/USAF Variable Instruction Computer2 and the ground support computer is the IBM
7090.
The hardware compatibility required for capturing
system software is rigorous.3 It is not sufficient that
the militarized computer contain a large subset of the
commercial instruction list, or that it obtain nearly
identical results when executing the same programs.
Bit for bit, the militarized computer must possess all
the instructions and non-instructional features of the
commercial machine, including input-output features,
with the possible exception of privileged instructions
usable only by the resident executive program; even
here the exceptions must be few or else an entirely
new executive will be required.
On long missions, especially when critical command
data are being handled, the military user must have
assurance that a certain minimum capability will
always be available. Even with the best modern
technology it is prohibitively costly to provide assured
availability in a single-thread system. The classical
method of coping with failure-complete duplication
of the hardware, with a stand-by unit for every unit
in active operation-is also unduly expensive. In most
applications there are peak loads which occur relatively rarely, but which must be within the capacity
of the system in its normal state, and the minimum
essential capability is substantially less than the peak.
What is called for is a fail-soft approach in which
major components are duplicated but not allowed to
remain idle. All components are used simultaneously
to obtain the peak throughput, but the system can
continue operation at reduced throughput in case of
a failure. The failed component can be diagnosed and
repaired without interrupting the operation of the
surviving portions of the system and in a time short
compared to the expected time to failure of the identical surviving component. Thus the user has nearly
complete assurance against collapse of the entire
system.4
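The argument turns on the repair time being short compared with the time to failure of the surviving unit. A minimal sketch of that comparison follows; it assumes an exponential failure model, and the function name and the numbers are hypothetical illustrations rather than figures from the paper.

```python
import math

def p_second_failure_during_repair(repair_hours, mtbf_hours):
    """Probability that the surviving unit also fails before the repair is finished.

    Assumes a constant failure rate of 1 / mtbf_hours (exponential model).
    """
    return 1.0 - math.exp(-repair_hours / mtbf_hours)

# Hypothetical numbers: a 1-hour repair against a 1000-hour mean time between failures.
risk = p_second_failure_during_repair(1.0, 1000.0)
print(risk)
```

When the repair time is a small fraction of the mean time between failures, the risk is close to the ratio of the two.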
The multiprocessor hardware
A data-processing system capable of graceful degradation is illustrated in Figure 1. Clearly, many other
types of peripheral equipment could be included. All
the peripheral control units are co-channelled, so that
if one input-output section of the central computer
should fail, another path would remain open.
The central computer, the Model 215 multiprocessor,
is shown in more detail in Figure 2. It consists of two
Central Processor Units (CPU), two Input-Output
Units (IOU), and from two to eight Main Memory
Units, interconnected by an essentially passive Signal
Distribution Unit (SDU). By a conceptually simple
redesign of the SDU, requiring, however, substantially
more hardware, the system could be expanded to include four CPU's, four IOU's, and sixteen Main Memory Units. Each of the active units is separately powered
and operates independently of other units of the same
type-for instance, any number of memories can execute independent, overlapped cycles simultaneously.
The SDU is merely a mechanical package housing the
interconnections among the active units; as the diagram suggests, such circuits as it contains (largely
line-drivers and receivers) are partitioned and powered
from the active units. The logical and electrical designs
conform to the constraint that a failure in any active
unit, or in its partition of the SDU, must not interfere
with continued operation of the remainder of the
system. Multiprocessors for ground-based application
similar in many respects to this one have been previously described.5,6
[Figure 1: block diagram; labels include redundant central computer, central processor (two), I/O processor (two), memory units, channels, peripheral equipment control, mass memory, display control, displays and keyboards, and tapes, printers, and other peripheral devices.]
Figure 1-Typical multiprocessor application
[Figure 2: block diagram; labels include central processor (two), I/O processor, and memory (two).]
Figure 2-Fail-soft computer configuration
If one IOU and one CPU are turned off or disconnected, the uniprocessing system that remains is functionally compatible with the RCA Spectra 70 series of commercial computers.7,8 The entire instruction
set of the Spectra 70, including privileged instructions,
is contained within the Model 215, as well as the four
Program States, the input-output channel control, the
interrupt management scheme, and all other fea
[Figure 1: complexity of integrated circuits plotted against year, 1948 to 1968, along a curve labeled laboratory progress; marked points include transistor invented, first silicon transistor, mesa transistor, planar transistor, stepping switch, and integrated circuit.]
Figure 1-Complexity of integrated circuits versus year
of laboratory accomplishment
Large-Scale Integration
largest machines4 and their memories (Figure 2). There
was much discussion on how this was to be achieved
economically, practically, and without an unreasonably
large effort in component assembly.
As a result, computers were ready for integrated
circuits-and they are now eagerly utilizing each
generation of more complex ones, as each of these in
turn offers acceptable performance, ever higher speeds,
lower cost per device, and greater packing density.
The expectations for large-scale integration have been
derived from various pronouncements made by device
makers as early as 1964, and in the several succeeding
years. Initially, a greater pervasiveness of integrated
electronics was proposed.5 There followed a number of
laboratory investigations of complex integrated circuits
and extrapolations of their characteristics were published.6,7,8 These were quickly followed by analysis
of the potential advantages of LSI from the user's
standpoint,9,10 analyses of computer organization
architecture and partitioning,11,12 as well as tempered
discussions of possible areas of utilization13 and cost.14
Computer architecture has developed that permits
interaction and utilization of large blocks of components-i.e., subsystems-without delineating all combinations of signals and their paths one by one. Thus,
computer theory is capable of dealing with large-scale
integrated circuits, and engineers examine all new offerings of component manufacturers to assess their
suitability for one or another potential application.
On the other hand, such complex subsystems as an
LSI chip must embody far more thought and care in
design15 than a simple gate circuit. The LSI chip must
contain more than just a repetition and interconnection
of dozens of simple integrated gate circuits. In the past,
subsystems of discrete components also had to be tested,
modified, remeasured, and remodified many times
before they were ready for use in a large complex
computer. That represents a significant change from
early days, in which a transistor-if it had enough
sustaining and saturation voltage, gain, and switching
speed at a given cost-was considered satisfactory for
a new generation of transistorized computer.3 At that
time, it may have required one year to shake down a
transistor in a new logic circuit and three to five years
to develop the rest of a complex system-or two years
to develop the concepts of integrated circuits, with
two to four years to complete the system.16 It might
now take three years to shake down LSI ideas, and
another one to three years to complete the system
using them. This accounts not only for the development times required for a given product but for the
total time required for developing the subsystem concepts and configurations, and adapting these to the
newly conceived systems.
From such considerations various authors have derived these expectations5-14 for LSI circuits:
• Much more complex functions-logic, memory,
or other-on a single chip or a single package.
• Very low cost per elementary function or per bit.
• Far smaller size and relatively few connecting
leads than present computer circuits using integrated circuits on printed circuit cards.
• Complete circuit compatibility with other semiconductor active devices.
• Off-the-shelf circuits or at least readily designed
custom circuits, available with the strokes of a
computer-controlled mask generator.
• A silicon device factory operated like a "Brownie"
photoprint shop: put in a negative and out comes
a ten-cent deckle-edged glossy print.
[Figure 2: plot on a logarithmic vertical scale, 10 to 1,000,000, against year, 1950 to 1970; regions are labeled tubes, transistors, integrated circuits, and LSI.]
Figure 2-Functional complexity of electronic computers
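The development-time estimates quoted earlier in this section (a shake-down period for the device concept, followed by a period to complete the system) can be tallied as follows. This is only an arithmetic illustration; the table layout and function name are assumptions, and the year figures are the ones given in the text.

```python
# (shake-down years, (min, max) further years to complete the system), as quoted in the text
PHASES = {
    "transistor": (1, (3, 5)),
    "integrated circuit": (2, (2, 4)),
    "LSI": (3, (1, 3)),
}

def total_span(shake_down, completion):
    """Return the (shortest, longest) total development time in years."""
    low, high = completion
    return shake_down + low, shake_down + high

for name, (shake_down, completion) in PHASES.items():
    print(name, total_span(shake_down, completion))
```

On these figures the total comes to four to six years in each case; what changes is the share spent on the device concepts.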
With such great expectations, it is not surprising
that many predictions and promises were made by the
device manufacturers. And many-even the more
extravagant-promises were believed. Most promises
made by device manufacturers were based on the
concept that a further aggregation of existing standard
logic or memory circuits would be sufficient to fulfill
such promises. Little did they anticipate that much
new technology had to be developed in order to fulfill
simultaneously all or most of the above expectations
on cost, ease of design, compact packaging, and so on
which had been individually predicted and promised.
Furthermore, computer makers have scaled up their
demands and expectations, and are attempting to
clarify the technical and interface requirements on
purchased subsystems-and a subsystem is what
LSI circuits really are.
Yet the makers of peripheral equipment, like displays or desk calculators, or of small memory buffers,
are close to the realization of such promises, and are
probably within a year of producing the equipment
based on the expectations for LSI circuits and the
promises of their vendors. The interaction between
vendor and user of LSI circuits is less time-consuming
for peripheral equipment systems which are much
less complex than large computers.
The accomplishments of LSI
Devices
At this point, it may prove instructive to look at
some of the accomplishments of the semiconductor industry in more detail, from the invention of the transistor in 1948 to the complex circuits of the present
time.17-22
Table I illustrates some key events in the steady
progression of innovations utilized by the computer
industry. The table shows not only the date of the
laboratory announcement, but also the time (one to
three years later) when such devices became available
for purchase. Figure 3 presents this data in graphic form,
plotting the circuit's complexity as a function of time.
Note that, in addition to a delay in moving from the
laboratory into first production, the mass production
of silicon transistors really followed only after the
planar process provided commercially useful devices
at costs competitive with germanium transistors. This
happened after 1960.
Figure 3 also shows the complexity of the integrated
circuits actually used in computers as a function of the
system's introduction date-another year or two
after production of such circuits was in full swing and
produced reliable units at reasonable cost. The horizontal spread in years between these curves is a measure of the time required-again and again, one might
add-to turn new concepts from the laboratory into
producible devices, and finally into reliable devices
manufactured in large numbers at low cost.
[Figure 3: complexity plotted on a logarithmic scale, 10 to 10,000, against year, 1948 to 1972; one curve is labeled laboratory progress, starting at transistor invented, and a second is labeled integrated circuitry used in computers.]
Figure 3-Twenty-year growth of complexity toward
LSI
The time span also indicates the time required for
systems manufacturers to become acquainted with
the properties of such devices, utilize them in prototype
designs, buy a few, and again a few more; and finally
to purchase many more as their systems are sold. One
must remember that a device reaches large-scale, low-cost manufacture only when the system for which it is
destined is also sold in large numbers.
To illustrate how many innovations must be accomplished in translating a concept into a finished device
and a manufactured integrated circuit, one can look
at some of the key technical innovations23,24 and developments which led to the Minuteman II system in
1966 (Figure 4). This system employed integrated circuits in its guidance computer.
Systems
Let us now consider several computer systems25 that
first utilized various new semiconductor devices
(Table II). The years 1951-1952, when the transistor
had already been in existence for three or four years,
saw the advent of some of the first electronic computers
using vacuum tubes. The first commercial computers
with germanium transistors were introduced in 1956,
when the silicon diffusion techniques were just being
announced by Bell Laboratories. Diffused silicon
transistors did not find their way into computers until
about 1963 with Minuteman I, and 1965 with the IBM
SLT hybrids in the commercial Model 360. Monolithic
integrated circuits did not appear until 1966-1967
in military systems (Minuteman II) and commercial
computers (RCA, Honeywell, UNIVAC, Burroughs).
The first large computers that will incorporate LSI
are still on the drawing boards, and are expected to
emerge in the early 1970's.
TABLE I-Dates of announcement of devices and circuits
DEVICE: Transistor discovery; Germanium transistor; Grown silicon transistor; Diffused silicon transistor; PNPN stepping switch; Planar silicon transistor; Integrated circuits; MOS registers (100 bit); Bipolar memory array (64 bit)
DATE OF ANNOUNCEMENT, LABORATORY: 1948, 1951, 1956, 1956, 1958, 1958, 1966, 1968
DATE OF ANNOUNCEMENT, FOR SALE: 1952, 1954, 1957, 1959, 1961, 1968, 1969
TABLE II-Computer active devices and dates of first system shipment
YEAR FIRST PRODUCED   SYSTEM               DEVICE TYPE
1951                  Univac I             Tube
1953                  IBM 701              Tube
1956                  Univac 1101          Germanium transistor
1962                  Telstar I            Silicon transistor
1963                  Minuteman I          Silicon transistor
1965                  IBM 360              SLT hybrid (silicon)
1966                  Minuteman II         Integrated circuit
1966                  Univac, RCA, etc.    Integrated circuit
1968                  Various              MSI scratch pad memory
1969                  Calculators          MOS-LSI registers
1969                  IBM                  LSI buffer (CACHE)
1970                  Various              LSI memory
[Figure 4: timeline chart, 1940 to 1970, showing research, development, and production phases for successive solid-state innovations, ending with Minuteman II.]
Figure 4-From transistor to Minuteman II, a twenty-year sequence of innovations in solid-state devices
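The lag between a device's announcement and its first use in a computer system can be tallied from dates given in the prose. The sketch below uses only two pairs of dates stated in the text (the transistor, 1948, to the first commercial computers with germanium transistors, 1956; silicon diffusion, 1956, to Minuteman I, about 1963). The data layout and function name are illustrative assumptions.

```python
# (device, year announced, year first used in a computer system), from the prose
MILESTONES = [
    ("transistor (germanium)", 1948, 1956),
    ("diffused silicon transistor", 1956, 1963),
]

def adoption_lag(announced, first_system):
    """Years between announcement and first system use."""
    return first_system - announced

for device, announced, first_system in MILESTONES:
    print(device, adoption_lag(announced, first_system))
```

Both cases show a lag of seven to eight years between the laboratory and a shipped system.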
The relationship between component and
system innovations
One of the reasons why systems do not immediately
adopt a revolutionary concept is that the concept must
have not only promise for the future, it must also compete in cost or performance with existing technologies
in practical applications. Consequently, except for
military applications that value light weight or other
factors of engineering performance more than cost,
the germanium transistor was used in commercial
computers only after it provided both higher speed
and a lower cost than vacuum tubes.
The same principle holds for each later development.
In fact, integrated circuits exceeded the frequency
performance and cost less than most discrete silicon
or germanium transistors only after 1965, and thus were
not applied to commercial computers until about that
time (Figure 5).
The same applies to LSI; most types described or
available today are not yet out of the laboratory or
are only in pilot production.22-29 These just about match
the costs of more conventional MSI or low-cost integrated circuits. Vigorous competition is not yet apparent, though it is anticipated.
Figure 6 traces the path of a system's components
to some of its subsystems and systems, relating the
previous data on the development dates of components
and systems. For additional perspective, we have
traced a few initial pertinent developments in materials
and basic research.
[Figure 5: plot against year, 1945 to 1975, with a region labeled LSI arrays and integrated circuits.]
Figure 5-Switching rate per dollar for computer logic
elements
[Figure 6: chart against year of first introduction, 1945 to 1975, ending at the LSI computer.]
Figure 6-Tracing the development of new components
into systems
Interpretation
One can examine6,8,11,12 what must be accomplished
in order to turn an assemblage of integrated circuits
into a useful series of computer subsystems, whether
logic, memory, or other. For example:
• Improvements in LSI Manufacturing
    smaller devices with finer mechanical and optical
    accuracies
    greater processing yields and lower costs
    new package developments
    multiple layer metallization and interconnections
• Improvements in Design
    computer-aided logic and circuit design and
    tolerancing
    automatic mask generation
    test sequence and operation by computer
• Improvements in Applications and Development
    diagnostic routines and their automation
    simulation
    development of more appropriate architecture
    and hierarchies for systems and memories
    improved methods for reliability assessment
It is apparent that much of the implementation in
development of such LSI circuits borrows heavily from
the computer field itself in terms of mechanizing the
performance of engineering design, development, test
and diagnosis at many levels of device, circuit, and
subsystem engineering. This is in addition to the process improvements required in manufacturing the
circuits.
In the new medium of the silicon crystal, one cannot test, trouble-shoot, and correct breadboards in
the traditional way-that is, by using an oscilloscope
or meter and test probes. The circuits are too minute,
too buried under other connections and insulating
layers, for point-by-point signal tracing to be effective.
Thus, both systems and device engineers must use new
methods of diagnosis and analysis, must develop software and simulation techniques in order to understand
what is going on within their own devices. This is
clearly an age of computers building computers. The
needs of the LSI laboratories in the semiconductor
industry regarding computer design, simulation, and
test make this very clear.
The device maker and the computer builder are
inevitably linked to one another. In fact, the device
maker might turn to the computer builder and say,
"We thought you already knew how to design, test,
and diagnose logic and memory circuits by use of
computers. But now we find that we have to learn this
from the beginning."
Even with more rapid and effective utilization of
computers in LSI design, manufacture, test, and improvement, time delays must be expected between the
first versions of this new concept and its becoming a
reliable low-cost product, and between this intermediate
step and the ultimate utilization in a large commercial
electronic system such as a computer. Many interfaces
must be matched, the previous, but still advancing,
technologies must be overtaken, economic trade-offs
performed, and investment decisions reached. Financial
decisions are generally the most important, and these
frequently require the longest to resolve in large organizations. Confidence in the new LSI product must
be established, its reliability examined, the credibility
of its manufacturer and his delivery and cost promises
examined, and finally any alternative approaches again
compared.
Of course this all takes time. But therein lies a dilemma. A new product will not get off the ground if someone does not risk using it; its manufacture will not be
initiated if there are not at least prospective customers,
and establishing reliability is difficult and expensive
without prototype system use and field testing. Consequently, it is tempting to brush away the dilemma
by early promises and premature announcements.
Many observers believe that without forward looking
claims such new concepts and developments would
only evolve at a snail's pace. "Nothing ventured,
nothing gained" certainly applies in this case. And the
only valid realization of the promise of LSI is the delivery of such circuits and their successful use in an
electronic system.
In interpreting the accomplishments to date, and
the reasons why some expectations have not been
realized, one discovers the following:
• The definition and structure of an LSI computer
are not fully understood, but are still evolving.
Yet progress toward large-scale integration appears inevitable. The semiconductor industry has
a tremendous commitment and momentum toward
further integration of circuits.
• Considerable time is required for the exchange of
ideas and their assimilation, in order to accomplish
the experimental interaction required to turn concepts into practical embodiments in systems and
to test these in the field.
• While both component and computer industries
may be learning from previous difficulties, many
of the interactions required now between system
and device designers remind one of a similar mismatch of expectations and performance requiring
further interactions30 during the early days of
simple integrated circuits.
Further expectations for LSI
Some might conclude that the next step inevitably
leads to the further integration of LSI-integration
cubed or GSI (for Grand-Scale Integration). More
likely, however, the evolutionary process will approach
in various ways the concepts of molecular electronics
in which simple as well as extremely complex electronic
functions are delineated and designed into the molecular
arrangements of solids, such as a small chip of a crystal
of silicon. Furthermore, it seems that the concept
applied over and over is that of batch fabrication, applied to a medium particularly well suited to this concept.
The computer-on-a-slice concept may not soon be
here; instead the memory-on-a-slice, the arithmetic
processor-on-a-slice, the internal communication system-on-a-slice, or whatever, will be. The most likely
subsystems amenable to implementation in LSI will
be those suitable to repetitive batch processing, and
those requiring relatively few connections to interface
with other parts of the system. Significant in all cases
is the repetitiveness of internal structure and partitioning that provides great functional capability with
relatively few external leads. When there are a million
devices inside one LSI chip, such will be called a "megaelectronic" device. But this version is still far into the
future. Less complex circuits now provide good performance at low cost, and so will continue to be used
for some time to come. But the underlying assumptions--that by shrinking device size further one will
gain both more devices per unit area and higher internal speed-are real and lead to the expectations of
still further increased performance at lower cost per
function.
While many practical difficulties must still be overcome, the fundamental physical limits3 permit at least
another order-of-magnitude improvement over the
performance-to-cost ratios of many present integrated
circuits of medium and large scale.
Near-term applications most likely for LSI circuits
are the following.31,32
Memory buffers
The rapid increase in speed of logic circuits has forced
modest progress in core memory speed and cost, but
has far outstripped improvements in the speed of access
of disc memories. Thus, opportunities for buffers between disc and core memories, and between core memories and fast logic circuits, exist. The LSI (MOS and
bipolar) circuits are well suited to these respective applications, and are now being tried aggressively by some
designers.
Small memories and logic systems
The ease of interfacing with related integrated circuit
logic makes semiconductor LSI memories very suitable
and reasonably inexpensive for use in small systems.
Pending applications are in desk calculators and in
character generators for display. Existing medium-scale integrated (MSI) circuits are being used in lamp
drivers and scratch pad 16-bit memory devices.
Linear circuits
Just as applications for linear integrated circuits
lagged behind those for digital circuits, so LSI linear
circuits are expected to develop more slowly than
digital ones. However, a number of quite complex circuits for TV and stereo radio have been developed by
now, all of which certainly may be classed as medium-scale integration. Sophisticated operational amplifiers
and active filters are also worthy of consideration.
Other applications
Another widely used circuit of the future is likely
to be a serial or parallel address encoder/decoder, which
can be set by means of external connections or preset
by the manufacturer. This class of circuit will be utilized
in remote signaling and TV tuning, intercoms, mobile
communication sets, and automobile or other command
multiplexing systems. It also resembles certain address
encoders/decoders used in computer circuits. While
most of the cited applications have not yet been developed widely, they will require circuits ranging from
four to 32 bits, which would barely be considered in
the LSI class. Further applications are in digital differential analyzers and other specialized calculator or
function generator circuits.
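To make the address decoder concrete, the sketch below models a serial decoder in Python: address bits arrive one at a time and are compared with a preset address. The class, its methods, and the 4-bit example address are hypothetical; the paper does not describe a particular circuit.

```python
from collections import deque

class SerialAddressDecoder:
    """Shift-register model: reports a match when the last N bits equal the preset address."""

    def __init__(self, preset_bits):
        # The preset stands in for external wiring or a manufacturer's setting.
        self.preset = tuple(preset_bits)
        self.window = deque(maxlen=len(self.preset))

    def shift_in(self, bit):
        """Clock in one bit; return True if the register now holds the preset address."""
        self.window.append(bit)
        return tuple(self.window) == self.preset

decoder = SerialAddressDecoder([1, 0, 1, 1])  # hypothetical 4-bit address
stream = [0, 1, 0, 1, 1, 0]
matches = [decoder.shift_in(b) for b in stream]
print(matches)
```

Only the fifth bit completes the pattern 1, 0, 1, 1, so the decoder signals a match once. A 32-bit address would need only a longer preset.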
CONCLUSION
This paper has looked at some of the promises made by
device developers about LSI and examined their accomplishments so far. The inescapable conclusion is
that only medium-scale integration is here today. It
will be another year before large-scale integration will
be available, reliably manufactured, and accepted for
use in critical portions of electronic computers.
It is also apparent from this paper that, in order to
be applied in useful computer systems, technical innovations must undergo further adaptation to the
specific systems, and vice versa. This mutual improvement and development requires human interaction
and communication33 during months or years of time.
Of course, one can only predict the orderly progression
of technology and its gestation with time, and progression and gestation may be speeded by new developments
or delayed by unfortunate experiences.
One can certainly expect the future evolution of
large-scale integrated circuits and their increased
participation in electronic systems-not only in computers, memories, and peripherals, but also in telephone and industrial systems; and in automobile,
appliance and entertainment consumer products.
Only the time scale is unknown. These visions of LSI
are on the horizon-to predict when they will draw
within arm's reach is not the purpose of this paper. But
once the first application has been successfully introduced, many more will follow rapidly.
REFERENCES
1 J K AYLING R D MOORE G K TU
A high-performance monolithic store
ISSCC Digest of Tech Papers Vol 12 1969 36-37
2 A B PHILLIPS
Private communication
3 R L PETRITZ
Technological foundations and future directions of large scale integrated electronics
Proc FJCC Vol 29 1966 65-87
4 Air Force Systems Command
Integrated circuits come of age
Andrews AFB publication 1966
5 P E HAGGERTY
Integrated electronics-A perspective
Proc IEEE Vol 52 Dec 1964 1400-1405
6 R D LOHMAN
LSI-The fabricator's viewpoint
ISSCC Digest of Tech Papers Vol 10 Feb 1967 30-31
7 R L PETRITZ
Current status of large scale integration technology
Proc FJCC Vol 31 1967 65-86
8 J S KILBY
Device fabrication for large scale integration
ISSCC Digest of Tech Papers 9,30 Feb 1966
9 M G SMITH W A NOTZ
Large scale integration from the user's point of view
Proc FJCC Vol 31 1967 87-94
10 G C FETH M G SMITH
Large scale integration perspectives
Computer Group News Nov 1968 24-32
11 H R BEELITZ S Y LEVY R J LINHARDT H S MILLER
System architecture for large scale integration
Proc FJCC Vol 31 1967 185-200
12 L C HOBBS
Effects of large arrays on machine organization and hardware/software trade-offs
Proc FJCC Vol 29 1966 89-96
13 M E CONWAY L M SPANDORFER
A computer system designer's view of large scale integration
Proc FJCC Vol 33 1968 835-845
14 R N NOYCE
A look at future costs of large integrated arrays
Proc FJCC Vol 29 1966 111-114
15 N CSERHALNI O LOWENSCHUSS B SCHAFF
Efficient partitioning for the batch-fabricated fourth-generation computer
Proc FJCC Vol 33 1968 857
16 E G FUBINI
The implications of solid-state technology on electronic systems
ISSCC Digest of Tech Papers Vol 10 Feb 1967 29
17 J BARDEEN W H BRATTAIN
The transistor-A semi-conductor triode
Physical Review Vol 74 1948 230
18 W SHOCKLEY M SPARKS G K TEAL
P-N junction transistors
Physical Review Vol 83 1951 151
19 R L WALLACE JR W J PIETENPAL
Some circuit properties and applications of N P N transistors
Bell System Tech Journal Vol 30 1951 530
20 C A LEE
A high-frequency diffused-base germanium transistor
Bell System Tech Journal Vol 35 1956 23
21 I M ROSS
A four-circuit silicon diffused P N P N stepper switch
1956 Device Research Com
22 64-bit read-write memory cell
Fairchild Semiconductor Preliminary Data Sheet No MML 9035 Sept 1968
23 Patterns and problems of technical innovation in American industry
Arthur D Little Inc Federal Clearinghouse U S Dept of Commerce Rpt to Natl Science Foundation PB 181573 Sept 1963
24 Management factors affecting research and exploratory development
Arthur D Little Inc Federal Clearinghouse U S Dept of Commerce Rpt SD 235 to Director of Defense Research and Engineering AD 618321 April 1965
25 Annual supplement of computer characteristics quarterly
Adams Associates Inc Bedford Mass 1968
26 A F BEER K H NICHOLAS I H LEVIN
A MOST memory using discretionary wiring
ISSCC Digest of Tech Papers Vol 12 Feb 1969 142-143
27 R F HERLEIN A V THOMPSON
An integrated associative memory element
ISSCC Digest of Tech Papers Vol 12 Feb 1969 42-43
28 A RASHID
High-speed LSI current mode-logic arrays for LIMAC
ISSCC Digest of Tech Papers Vol 12 Feb 1969 68-69
29 H YAMAMOTO M SHIRAISHI T KWOSAWA
A 40-NS, 144-bit N-channel MOS-IC memory
ISSCC Digest of Tech Papers Vol 12 1969 40-41
30 J A MORTON
The Microelectronics dilemma
International Science and Technology Vol 55 July 1966 35-44
31 B AGUSTA
A 64-bit planar double-diffused monolithic memory chip
ISSCC Digest of Technical Papers Vol 12 1969 38-39
32 T R FINCH
LSI-Digital electronics
ISSCC Digest of Tech Papers Vol 10 Feb 1967 32-33
33 J N SHIVE
The properties, physics and design of semi-conductor devices
D Van Nostrand Co Inc Princeton N J 1959 471
What has happened to LSI-A supplier's view
by C. G. THORNTON
Philco-Ford Corporation
Blue Bell, Pennsylvania
INTRODUCTION
Three years ago at the Solid-State Circuits Conference
in Philadelphia, the concept of large-scale integration
was already considered to be sufficiently far advanced
as to be the main theme of the Conference, with a
large number of technical papers showing beautiful
colored slides of potential "products," containing
several hundred transistors interconnected with two
layers of metallization on a single chip of silicon. Related papers were presented at the Fall Joint Computer
Conference that year, and semiconductor vendors had,
for some time, been indicating the benefits that would
accrue to the straightforward extension of the principles of planar integrated circuits to more complex
"subsystems" on a single piece of silicon. The concept
appeared to be clear-all that remained was its implementation; yet, as of the start of 1969, no major
systems had been constructed with LSI and predictions
of significant volume usage were still one to two years
away. One can legitimately ask whether the darling
of the industry a few short years ago has become the
"bete noir" of today's computer industry, or whether
most of the problems have been solved and we are
well on the way to practical commercial utilization? This paper reviews some of the more significant
problems that have required solution during the past
four years, in order for LSI to now begin to play its
role as a major element in new system design.
The situation can best be discussed in terms of the
specific problem areas that have been encountered since
1964, in attempting to implement LSI. These include:
1. System design.
2. Product design.
3. Fabrication capability.
4. Testing.
5. Packaging.
6. Reliability.
It is the thesis of this paper that a number of specific
problems existed in each of the above areas which
would logically have been expected to require several
years of effort in their solution. Each of these is discussed.
System design
Since the functional density which can be practically
obtained on a single MOS chip has led that obtainable
with the bipolar approach, early LSI systems design
experience was based on the use of MOS technology.
Although individual MOS-LSI circuits were commercially available four years ago, sales for such devices to be used in conjunction with conventional
components were very limited. It was quickly realized
that it was nearly as difficult to build a cost effective
computer system which partially used MOS-LSI, as
it is for a person to become partially pregnant. For
example, compatibility problems arose when systems
were redesigned to use MOS rather than bipolar shift
registers. Mixed systems were designed, only to find
that by the time the cost of the interface circuitry and
the clock drivers were included, it was more economical
to use a larger number of smaller bipolar register
elements. More significantly, attempts to partition
parts of existing systems into blocks containing 100
gates or more led to excessive interconnections to the
discrete IC control circuitry, and to new packages
containing up to 60 leads. Chip sizes tended to be
14,000 mils² or larger, and it became a costly experience
both to user and supplier to learn that such chips were
at that time well beyond the state of the fabrication
art. For optimal utilization of LSI, the system designer
has found that he must rethink his system from scratch
in terms of the new technology, he must be able to
partition the system into tractable chip sizes with
reasonable gate-to-pin count ratios, with considerable
advanced care required at the partitioning step to
insure the ability to test the resulting functions. It
has also required studies, such as the LIMAC LSP
demonstration vehicle, the design of small calculators
and the appearance of a variety of standard LSI
functions to assist in shaping new design concepts.
These concepts include distributed control and memory,
with integral chip decoding and encoding, and the use
of read-only memory subroutines, among other techniques. Just as the active device count required to
perform a function went up dramatically when the
designer went from the use of discrete components to
integrated circuits, the systems designer has had to
learn to waste LSI circuitry efficiently in order to
make his system design compatible with the technology.
Given that the entire system must be redesigned
and the associated expense, it is not surprising that
most initial LSI equipments have been limited in
scope. To attack the broader problem of designing
large LSI computer systems or major product lines of
peripheral systems, only a few user companies out of
the entire industry initially made the total commitment required (i.e., 20 to 50 engineers with available
in-house or vendor prototype device fabrication facilities). Such programs typically started three to four
years ago using MOS technology, and have just this
year reached a level of completion where prior system
commitments can be made.
Product design
Progress toward LSI may also have been impeded
by the proffered viewpoint that the semiconductor
vendor would supply the necessary partitioning and
design capability. The semiconductor vendor suggested
that he would integrate his facility upward to encompass subsystem design in much the same fashion as
he had previously taken over much of the computer
circuit design. On the contrary, many of the more
successful total system programs today seem to be
those where the vendor is supplying design rules relating
to his fabrication capability, and the custom chip
designs are being accomplished directly within the
systems houses. In 1969, requirements have already
existed for over 500 specialized custom chip designs
needed by approximately a dozen users to implement
prototype systems. The number of engineers required
to accomplish these designs, even with a modern
computer-aided design capability, far exceeds the
number available in vendor companies. It would be
irrational, moreover, to expect semiconductor device
manufacturers with their general purpose circuit engineers
to compete with major equipment houses in
optimizing the partitioning and chip design in a variety
of special system applications. Failure on the part of
many system groups to get sufficiently involved in
the design of custom LSI has slowed the rate of usage.
The main thrust of the component vendors has
been to increase the breadth and complexity of their
"standard product" lines, since it is only through
volume production of such standard products that the
ultimate lowest costs per chip will be obtained. For
certain classes of circuits, the standard product approach
is moving rapidly, with the development of
such devices as shift registers, read-only memories,
random access memories, A-to-D converters, D-to-A
converters, BDA's, parallel-to-serial and serial-to-parallel
converters, counters, etc., being made available.
Regardless of who designs the LSI components, the
tools were simply not available to do the job until
recently. As a minimum, the following are required:
Logic simulation techniques
Techniques are required for simulating the performance
of the blocks obtained by a trial system partitioning.
Such simulation should include not only logic
simulation, but should ideally take into account circuit
delays. Some LSI systems designers have not been
content to rely on computer simulation, but have
constructed simulation cells, or macro versions of the
subcircuits that they plan to work with, so that they
can physically simulate the performance of the entire
LSI chip. Such simulation techniques have been in
development in a number of laboratories for several
years, and several computer programs have also now
been developed to attack this problem.
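A minimal sketch of what logic simulation with circuit delays can look like is given here. It is not taken from any of the programs mentioned above: the gate library, the delay values, and the netlist format are all hypothetical, chosen only to illustrate event-driven evaluation in which every gate output changes a fixed delay after its inputs.

```python
import heapq

# Hypothetical two-input gate library: name -> (logic function, delay in ns).
# The delay figures are illustrative assumptions, not measured values.
GATES = {
    "NAND": (lambda a, b: 1 - (a & b), 2),
    "NOR": (lambda a, b: 1 - (a | b), 3),
}


def simulate(netlist, initial, stimuli, t_end=1000):
    """Event-driven simulation of a small gate network.

    netlist: {output_net: (gate_type, input_net_a, input_net_b)}
    initial: {net: 0 or 1} for every net, assumed consistent at time 0
    stimuli: [(time_ns, net, value)] input changes applied from outside
    Returns the final net values and a trace of (time, net, value) changes.
    """
    values = dict(initial)
    queue = list(stimuli)
    heapq.heapify(queue)
    trace = []
    while queue and queue[0][0] <= t_end:
        now = queue[0][0]
        changed = set()
        # Apply every event scheduled for this instant before evaluating gates,
        # so that simultaneous input changes are seen together.
        while queue and queue[0][0] == now:
            _, net, value = heapq.heappop(queue)
            if values[net] != value:
                values[net] = value
                changed.add(net)
                trace.append((now, net, value))
        # Re-evaluate each gate fed by a changed net; its output follows later.
        for out, (kind, a, b) in netlist.items():
            if a in changed or b in changed:
                fn, delay = GATES[kind]
                heapq.heappush(queue, (now + delay, out, fn(values[a], values[b])))
    return values, trace


# Example network: n1 = NAND(a, b), y = NOR(n1, c).
netlist = {"n1": ("NAND", "a", "b"), "y": ("NOR", "n1", "c")}
initial = {"a": 0, "b": 0, "c": 0, "n1": 1, "y": 0}
final, trace = simulate(netlist, initial, [(0, "a", 1), (0, "b", 1)])
for event in trace:
    print(event)
```

The trace lists when each net settles, which is the information a designer would compare against the timing assumed in a trial partitioning.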
Standardized design approaches
During the past three years, the cost of obtaining a
few custom LSI chips from a vendor has remained
remarkably constant in the range of $25,000 to $50,000,
with several months to supply prototypes. Vendors
and users have both attempted to improve the situation by using computer aids, and in some cases by
using a standard cell or building block approach. A
typical design approach is shown in Figure 1. The
individual steps may all be performed manually or
they can be accomplished by a computer operation.
The numbers in the corners of the blocks give a rough
indication of the priorities in terms of the development
of computer techniques to replace manual methods.
It is noted that, after simulation and testing, higher
priority is given to automatic mask generation than
to the more complex problem of placement and routing.
This stems from the need to eliminate the time consuming and error prone operation of ruby cutting as
well as the need to obtain the required precision without
excessively large camera reduction. Most large MOS-LSI chips to date have been accomplished with manual
placement and routing, with computer placement and
routing just becoming an effective tool.
The "standard cell" may vary in complexity all the
way from a complete gate or flip-flop configuration
to cells as small as individual transistor or line segments.
The larger cells are easier to use in computer-aided design, and computer-aided placement and
routing programs are more successful with this approach. Although the technique does not achieve
minimum area, it has permitted major reduction in
prototype design and turn-around time. The near
practical development of all of these techniques has
taken three to four years to accomplish, with more
improvement to come.
Common design rules
Another major obstacle has resulted from the fact
that multiple sourcing of user design circuits requires
a certain degree of unanimity among suppliers' design
rules and processes. After four years of MOS process
evolution, it is only this year that parts can be ordered
from as many as three suppliers, using nearly the
same set of masks. The situation in bipolar has been
equally chaotic, with no effective second source capability. More than one major system has gotten into
serious trouble with a single source of LSI-MSI that
failed to materialize. Other users are going to be very
reluctant to move ahead with LSI, until some types
of multiple sourcing can be found.
[Figure 1-LSI product design. Flow chart with blocks labeled: choice of design rules or cell assignment; logic blocks; trial cell placement or device layout; interconnection routing; digitize; artwork generation; testing. A priority number appears in the corner of each block.]
Fabrication capability
The ease with which photographs of large complex
chips with multilayer metallization could be obtained
for publication a few years ago has proved to be grossly
misleading in terms of the magnitude of the technical
problems. As a matter of fact, a number of fundamental
technical problems initially existed which made it
economically impossible to produce LSI devices. Three
of the more significant of these are discussed here.
These are:
1. Defect density.
2. Multilayer metallization.
3. Mask making.
Defect density
Chief among problems discussed was that of defect
density, but the tendency was to greatly oversimplify
the expected solution to the problem. Many managers
felt that the defect density would be reduced largely
by "greater care in processing," or "use of clean room
facilities," rather than requiring the development and,
in some cases, the invention of totally new fabrication
techniques to successfully produce these devices.
In 1964, the defect problem was treated analytically
by Murphy,2 who showed that with the existing defect
densities of several hundred/cm², economically practical arrays could be expected to contain about 10 gates
per chip on the order of 30 to 60 mils² in size. Further
studies have shown that even with appreciable clustering of defects, a 98 percent yield of single gates is
required to obtain a reasonable yield at the 100 gate/circuit
level.3 One approach to finessing the problem
was through the discretionary wiring approach. Unfortunately, this technique developed its own set of
problems, which took twice as long to solve as originally
estimated. The problem of eliminating the defects was
also greatly underestimated by single chip LSI suppliers, and large chip yield forecasts were made which
could not be met. The sources of defects were subtle
in nature, and their solution has required chemical and
metallurgical process changes in wafer preparation,
photoengraving, metallization, and mask making. It
has only been during the past year that, under laboratory conditions and with several new process innovations, the required low defect densities (less than
10/cm²) have been attained to permit fabrication of
bipolar arrays containing hundreds of components on
large chips. A good yield, circa 1969 (greater than 20
percent), is illustrated in the wafer map shown in
Figure 2 for 256-bit shift registers, each containing
2067 transistors on 100 X 100 mil chips.
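The dependence of yield on defect density and chip area can be illustrated with a short calculation. The sketch below uses two textbook yield expressions as assumptions: the simple Poisson model and the closed form commonly attributed to Murphy. Whether either matches the exact analysis of reference 2 is not established by the text, and treating every counted defect as fatal is a further simplification. The last function restates the single-gate argument above under the assumption that gates fail independently.

```python
import math

MIL_CM = 2.54e-3  # one mil (0.001 inch) expressed in centimetres


def chip_area_cm2(side_mils):
    """Area of a square chip, given its side in mils."""
    return (side_mils * MIL_CM) ** 2


def poisson_yield(defects_per_cm2, area_cm2):
    """Yield if fatal defects are uniformly and randomly distributed."""
    return math.exp(-defects_per_cm2 * area_cm2)


def murphy_yield(defects_per_cm2, area_cm2):
    """Closed form commonly attributed to Murphy (assumed, not quoted from ref. 2)."""
    x = defects_per_cm2 * area_cm2
    return 1.0 if x == 0 else ((1.0 - math.exp(-x)) / x) ** 2


def independent_gate_yield(per_gate_yield, gates):
    """Chip yield if each gate is good independently of the others."""
    return per_gate_yield ** gates


area = chip_area_cm2(100)  # the 100 X 100 mil chip size quoted above
for density in (300, 10):  # "several hundred" versus "less than 10" per cm^2
    print(density, poisson_yield(density, area), murphy_yield(density, area))

# A 98 percent single-gate yield at the 100-gate level, without clustering.
print(independent_gate_yield(0.98, 100))
```

Under these assumptions a 100 X 100 mil chip is essentially unmanufacturable at several hundred defects/cm² and becomes practical near 10/cm², which is consistent with the trend described in the text.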
The defect problem was thought to be simpler with
MOS, in view of the smaller number of processing
steps. MOS arrays did, in fact, initially yield better
in somewhat larger chip sizes than bipolar, with considerably higher yields on a per-component basis because the active devices require less area than bipolar
devices. However, the MOS limit was soon reached
at less than twice the chip size of bipolar, as it was
found that each MOS process step was more critical
than its bipolar counterpart. Specific MOS problems
relate to the surface-sensitive nature of the devices,
to the high fields which exist, and to the susceptibility
of the thin gate oxide to contain specific types of defects.
Figure 2-Map of wafer of 256-bit shift registers
Many 1964-65 MOS circuits were fabricated with
only 1000 Å of oxide in the gate region. The thin oxide
was required in order to overcome the high level of
fixed charge density, Qss, in the oxide, and obtain
tractable levels of threshold voltage. Clock voltages
in the range of 25 to 30 V were used to overcome the
high threshold voltage characteristics of these devices,
and obtain reasonable speeds. Thus, fields as high as
3 X 10^6 V/cm were impressed across the oxide, with
even higher fields at any thin spots that might be
process induced. If one examines the detailed topology
of MOS integrated circuits, one also finds stepped
regions in the oxide and metal edges where even higher
field concentrations exist, where defect-free devices
break down when overstressed. The maximum oxide
breakdown field for near perfect planar metal-SiO2-silicon structures has been determined in this and
other laboratories4 to be approximately 10^7 V/cm, not
allowing for thin spots in the oxide. Thus, these devices
were extremely marginal in design. It remained for
the industry to learn how to reduce and control the
oxide charge, permitting thicker gate oxides to be
used with greater safety margins.
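The field figures quoted above follow from dividing the clock voltage by the oxide thickness. The short calculation below reproduces that arithmetic using only the numbers given in the text (25 to 30 V clocks, 1000 Å of gate oxide, and a breakdown field of about 10^7 V/cm); the function names are introduced here for illustration, and the uniform parallel-plate field is an idealization that ignores the edge and step concentrations mentioned above.

```python
ANGSTROM_CM = 1e-8  # one angstrom expressed in centimetres
BREAKDOWN_FIELD_V_PER_CM = 1e7  # approximate figure quoted in the text


def oxide_field_v_per_cm(voltage, thickness_angstrom):
    """Uniform field across a planar oxide of the given thickness."""
    return voltage / (thickness_angstrom * ANGSTROM_CM)


def thickness_at_breakdown_angstrom(voltage):
    """Oxide thickness at which the given voltage reaches the breakdown field."""
    return voltage / BREAKDOWN_FIELD_V_PER_CM / ANGSTROM_CM


for clock_volts in (25, 30):
    field = oxide_field_v_per_cm(clock_volts, 1000)
    margin = BREAKDOWN_FIELD_V_PER_CM / field
    print(clock_volts, f"{field:.2e} V/cm", f"margin x{margin:.1f}")

# A local thin spot of roughly this thickness would break down at 30 V.
print(round(thickness_at_breakdown_angstrom(30)), "angstroms")
```

With a 30 V clock the nominal margin is only a little over three, so a spot thinned to about a third of the nominal 1000 Å reaches the breakdown field, which illustrates why these designs are described as marginal.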
In addition to the problem of leakage through the
oxide, MOS device performance and stability depend
on the control of a number of interface effects at the
dielectric semiconductor interface, most of which have,
during the past four years, become well understood by
physicists working in research and development laboratories, but whose control at the production level is
only now becoming a reality. An example of the type
of problem is that of field inversion where MOS devices
lose their inherent isolation properties when the interface state density or field charge in the field oxide are
allowed to vary. Specifically, in 1964, there were only
a few effects associated with planar oxides that were
of much concern to integrated circuit manufacturers.
These included surface recombination which affected
transistor β and diode leakage, and the presence or
absence of surface contaminants on the oxide which
were believed responsible for the occasional channelling
problems on life tests. The high doping density and low
voltages used in most bipolar circuitry made these
devices relatively resistant to surface problems. With
the advent of MOS devices, a number of additional
effects became important, and new discoveries were
made which have now been determined to affect the
yield of both MOS devices and the smaller geometry
bipolar devices desired for LSI. These include the
presence of fast and slow ion diffusion in the oxide,
the presence of fixed charges in the oxide whose magnitude is a function of processing conditions and applied
fields, and a number of different kinds of minority
carrier trapping effects in the oxide and at the interface. The complexity of the problem is seen in Figure 3,
which shows the location of charges in a planar oxide
structure.
[Figure 3-Distribution of charges in a MOS structure. Labels: surface ionic contaminants; neutral ionizable states in oxide, ionizable by radiation; slow states in oxide near interface; fast states in silicon at interface; donor states; acceptor states; immobile charge; electron.]
High yield production of LSI devices requires special
tests at each manufacturing step to control the important oxide charge effects. Control charts in new
areas must be maintained, and the effects of process
variability on these effects must be well understood
by production engineering. Whereas such control has
been readily understood and applied in the R&D line
and at the pilot line level, many companies have been
slow to implement these procedures in production,
due to the considerable re-education process that is
required. More than one company has been severely
disappointed in their attempts to place LSI in production.
Another important limitation in increasing the
yield and reliability of LSI devices has been the fact
that these very complex structures literally defy
analysis of internal yield and reliability problems as a
function of the terminal parameters of the finished
device. Traditionally, single transistors had been incorporated on each chip as an aid in process control,
and for determining causes of low yield.
As the detailed nature of the many sources of device
problems has evolved, it has been necessary to devise
special test structures, each used to examine a particular effect in the absence of other effects. Two
structures which are used for this purpose are illustrated
in Figures 4(a) and 4(b). The structures shown test
for the following individual effects:
1. Transistor properties and field inversion,
2. Mobile and fixed charge in the oxide,
3. Fast and slow interface states,
4. Surface ion migration and surface conductivity,
5. Leakage between p regions and leakage in large
and small periphery p-n junctions under a
variety of oxide thicknesses and metal overlayers,
6. Shorts and leakage through different thicknesses of oxides over different surface conductivity types and with varying topologies (small and
large oxide steps),
7. Metal and p-region resistance and electromigration susceptibility under various localized
conditions,
8. Metal continuity over steps,
9. Contact resistance,
10. Resistance of multilayer vias,
11. Leakage through multilayer dielectric,
12. First- and second-level metal resistance.
Figure 4-Test vehicles
a. Surface effects test vehicle
b. Oxide integrity and metallization test vehicle
Such test vehicles must be used in the laboratory,
pilot production, and production operations to control
the process and optimize the yield, yet they took two
years to develop and apply after the basic effects
were known.
Multilayer metallization
Most of the early LSI demonstration photographs
showed multilayer metallization. In many cases, it is
possible to obtain a 2:1 reduction in chip area with the
application of an additional layer of interconnections.
In the case of the discretionary wiring approach, its
use was absolutely essential. Initially, it had been
expected that the major problem in the use of two-layer metal would be due to shorts through the dielectric. This did turn out to be a very significant
problem in the case of discretionary wiring, where an
entire wafer is covered with second- and third-level
insulated interconnections which must be free of shorts.
In the case where the actual shorts through the
oxide are not present, there may still be a large number
of thin or weak spots which are susceptible to premature breakdown. The evolution of a uniform high
strength dielectric for multilayer technology involved
tests such as those shown in Figures 5(a) and 5(b).
In this type of test, thin metal is used in the upper
layer so that when a short develops, the energy dissipated will evaporate the metal away from holes, thereby "clearing" the short, and restoring the original
condition. Thus, it is possible to impress consecutively
higher and higher voltages between the two layers,
exposing the weak spots one-by-one, until the ultimate
dielectric strength is determined. The "stair-step"
plots, shown in Figure 5(b), show the wide differences
between silicon oxide dielectric layers prepared with
differing processing conditions. It is now possible to
prepare both chemical vapor-deposited and sputtered
SiO2 layers which are virtually free of shorting-type
defects within the area of a single LSI chip, and success
is also being reported on large discretionary wired
wafers with a combination of these techniques.
Figure 5-Breakdown strength of oxides in a multilayer test vehicle
a-1. Dielectric strength test vehicle
a-2. Enlargement showing self-healed pinhole
b. Pinhole density for silane-vapor-plated and R-F sputtered SiO2 on delineated aluminum 5,000 Å and 10,000 Å thick (horizontal axis: voltage in volts)
In the case of smaller chips, opens proved to be of
more significance than shorts, with problems developing
at the vias between upper and lower metallization
levels. In order to limit their size, such vias must be
kept small in area, and it was quickly determined
that the presence of thin oxide layers or other contaminants at these points would produce either opens
or an unacceptable amount of via resistance. Under
non-ideal conditions, a test structure such as included
in Figure 4(b), containing 18 vias in series, commonly
shows resistances on the order of 10 to 20 ohms. In
some LSI circuits, the tendency for a high resistance
to be present is increased by the occurrence of cell
potentials, which produce an anodizing effect during
via etching, and which is a function of a particular
circuit topology. Thus, the same four vias in a circuit
containing 120 vias might be found to be open without
any obvious reason. Metallization problems also developed with electrical opens occurring at the point
where the upper-level metal steps down over an abrupt
oxide cut to reach the first-level metal, and the metal
at these points tends to become constricted. It appears
to have taken the better part of two years of effort in
various industry laboratories to develop multilayer
processes to the point where they can be used to
achieve competitive yield and reliability levels with
single-layer metal products. Even so, rules governing
the via area and shape of the via cut must be carefully
chosen and strictly adhered to.
The application of multilayer metallization to MOS
is less critical for via resistance, since the circuits
operate at high impedance levels. A different type of
fundamental problem arose, however, when it was
found that the application of the second layer of dielectric caused drastic changes in the electronic properties of the first-level silicon-oxide interface.
Not only temperature and radiation effects (in the
case of sputtering processes) exist, but rapid diffusion
impurities can be introduced which penetrate to the
original interface and alter the charge condition. Thus,
the same level of new understanding and special process
control is required as was the case in the original development of stable high performance MOS devices.
Mask making
In 1964 and 1965, severe problems existed in mask
making which alone would have made it impossible
to manufacture LSI. Problems existed in both image
quality and image registration.
In the case of image quality, lenses were not generally available to handle the conventional 10X final step-and-repeat reduction with a sharp field in an area
greater than a 75 X 75 mil chip. Attempts to step at
a larger size and then reduce a multiple pattern were
also limited by the lens quality and photoprocessing
techniques, so that considerable size and corner compensation had to be built into the original artwork to
obtain something close to a usable mask.
As better lenses became available, image quality
improved, but problems remained in sizing and registration which still limit maximum practical array size.
A high yield of circuits of typical "state-of-the-art"
design generally requires the placement of successive
images, one within the other, with a separation of a
tenth of a mil, and a tolerance on this of 0.05 mil. In
an LSI device, one might logically wish to obtain
such registration at opposite ends of the diagonal of
a 115 mils² chip. In the mask-making stepping process
alone, three sources of error occur (under optimum
conditions) which affect this registration: (1) vertical
stepping error ±0.01 mil, (2) rotational stepping error
±0.01 mil, (3) size reduction error ±0.02 mil (1 part
in 4000 reduction error over a 2" stepping table travel).
Adding these tolerances leads to ±0.040-mil registration error in the mask, which means that the processing operator must align her mask during device
fabrication to ±0.01 mil-a bare possibility. Thus,
any attempt to fabricate circuits at sizes larger than
115 mils on a side with 0.010 mil registration requirements has automatically placed severe limitations on
the expected yield, and practical bipolar LSI design
rules have therefore been kept to larger tolerances or
smaller chip sizes. Unfortunately, optimum MOS performance demands even tighter design rule tolerance
(±0.08 mil on gate overlap).
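The registration budget described above can be written out as a small worked calculation. The figures are those quoted in the text (three mask-making errors and a 0.05 mil tolerance between successive images); the dictionary and function names are introduced here for illustration, and the straight worst-case addition simply follows the text's own "adding these tolerances."

```python
# Mask-making error sources under optimum conditions, in mils (from the text).
MASK_ERRORS_MIL = {
    "vertical stepping": 0.01,
    "rotational stepping": 0.01,
    "size reduction": 0.02,
}
IMAGE_TOLERANCE_MIL = 0.05  # allowed error between successive images


def worst_case_mask_error(errors):
    """Straight sum of the individual error magnitudes."""
    return sum(errors.values())


def alignment_allowance(tolerance, errors):
    """What is left of the tolerance for the operator's mask alignment."""
    return tolerance - worst_case_mask_error(errors)


mask_error = worst_case_mask_error(MASK_ERRORS_MIL)
allowance = alignment_allowance(IMAGE_TOLERANCE_MIL, MASK_ERRORS_MIL)
print(f"mask registration error: +/-{mask_error:.3f} mil")
print(f"left for alignment:      +/-{allowance:.3f} mil")
```

Four-fifths of the tolerance is consumed before the wafer reaches the alignment step, which is why the remaining ±0.01 mil is described as a bare possibility.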
Testing
LSI raised many new problems in testing, some of
which were initially recognized and some which only
became evident when manufacturers attempted to
move LSI testing to the production level. It is now
generally recognized that for circuits containing more
than 50 gates, one cannot practically exercise all of
the logic contained on the chip as a method of testing,
since the time required to accomplish this quickly
stretches into many hours or days per circuit; rather,
test programs must be computer-generated which rely
on the fact that only certain kinds of faults can practically exist in the device, and which merge redundant
test patterns. Some fault conditions can only be detected by the introduction of specially constructed
error inputs. Even with such factors taken into account, however, an effective test sequence can only be
expected to become available when the test problem
has been taken into account at the time of system
partitioning and circuit design. In some cases, it is
necessary to break feedback lines on a chip to reduce
sequential networks to combinational networks, albeit
at a sacrifice of gate-to-pin ratio.
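The idea of generating tests from a restricted fault model and then merging redundant patterns can be sketched on a toy circuit. Everything in the example below is an assumption made for illustration: the three-input circuit, the single stuck-at fault model, and the greedy selection rule are not taken from any test program described in the text.

```python
from itertools import product

NETS = ("a", "b", "c", "n1", "n2", "y")
# Single stuck-at faults: each net held permanently at 0 or at 1.
FAULTS = [(net, value) for net in NETS for value in (0, 1)]


def evaluate(a, b, c, fault=None):
    """Output of y = (a AND b) OR (NOT c), optionally with one net stuck."""
    def fix(net, value):
        return fault[1] if fault and fault[0] == net else value

    a, b, c = fix("a", a), fix("b", b), fix("c", c)
    n1 = fix("n1", a & b)
    n2 = fix("n2", 1 - c)
    return fix("y", n1 | n2)


def detection_table():
    """For every input pattern, the set of faults that change the output."""
    table = {}
    for pattern in product((0, 1), repeat=3):
        good = evaluate(*pattern)
        table[pattern] = {f for f in FAULTS if evaluate(*pattern, fault=f) != good}
    return table


def greedy_test_set(table):
    """Repeatedly pick the pattern that detects the most still-untested faults."""
    remaining = set().union(*table.values())
    chosen = []
    while remaining:
        best = max(table, key=lambda p: len(table[p] & remaining))
        chosen.append(best)
        remaining -= table[best]
    return chosen


table = detection_table()
tests = greedy_test_set(table)
covered = set().union(*(table[p] for p in tests))
print(len(table), "exhaustive patterns,", len(tests), "selected:", tests)
print("faults covered:", len(covered), "of", len(FAULTS))
```

Even on this three-input circuit the selected set is half the exhaustive one; the saving grows rapidly with input count, since exhaustive exercise doubles with every added input while the number of modeled faults grows only with the number of nets.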
At best, a formidable problem still presents itself.
Two of the major contributors to this problem are:
first, the inability to test the "inner stages" of the
array, resulting in an inordinate number of tests necessary at the inputs to guarantee the proper outputs,
and second, the complex test sequence generally exceeds
the capability of available test equipment and might
be expected to add a disproportionate amount to the
total cost of the device. As the level of integration
increases, the number of actual chips per system will
decrease, but the cost of testing fewer (but more complex)
chips can become the most significant contributor
to the final cost of the unit.
The testing of sequential logic can be considerably
complicated by the necessity of first applying a sequence of input patterns to force the output into a
particular state. Consequently, consideration must be
given to the sequence of the input patterns to ensure a
complete functional test. As with combinational networks, a test pattern for sequential networks can be
reduced through the use of computer-aided test minimization programs. However, these programs can be
quite long, hence, expensive, since many distinguishing
sequences are necessary to check the possible failure
modes.
Although the problem of generating sufficient test
programs has in many cases been satisfactorily resolved,
the problem of how the testing is to be performed on a
manufacturing basis is still largely undecided. As of
the New York IEEE Show in March 1969, for example,
only two or three pieces of commercial equipment
were being offered for LSI testing, and in general
these equipments are either considerably limited in
capability or are very expensive, as applied to single
operator handling. Examination of these equipments
and other individual test equipments, which exist in
individual companies, would suggest that we are still
in the first generation of LSI test equipment development. Progress in the commercial use of LSI will
continue to be impeded until this problem is resolved.
Packaging
LSI raised many new problems associated with
packaging these devices. Early attempts at LSI system
partitioning led to poor gate-to-pin ratios in an attempt
to maintain maximum system flexibility, which
in turn required large numbers of bonds. Initially,
attempts at packaging such LSI were extensions of
the then available flat pack techniques attempting to
maintain a minimal periphery chip with a large number
of closely spaced leads. This configuration led to a
shorter seal length than had been determined by the
package industry to be required for freedom from
leakers. The urgent necessity for having packages
suitable for prototypes also led to the use of less than
optimum procedures for fabricating and sealing these
packages. Sealing techniques which worked well on
small integrated circuit packages failed to seal properly
when the package periphery became large, and special
techniques had to be developed. Conventional leak
test procedures cannot be applied, since the larger flat
packages will not withstand the same test pressures
and the larger internal volume requires excessively
long pressure tests to detect small leaks. The larger
number of pins also put new requirements on the wafer
and chip bonding processes. Large chips are more
likely to have voids in the chip-to-header bond, and
a larger number of wire bonds have to be made without
a bad bond in order to obtain a finished device at high
yield. One solution to this problem has appeared to
be in the direction of beam lead or flip-chip techniques.
The applicability of such techniques to large numbers
of interconnections has relatively recently been demonstrated, as in the case of the semiconductor memories
described by Kraynak,5 Agusta6 and Alexander.7 Most
of these approaches have required additional processing steps on the wafer to obtain the required bonding
materials at each interconnection site.
High speed LSI arrays have also placed new demands
on packaging from a power dissipation standpoint. For
example, an array of 100 high speed gates, each dissipating 50 mW, would produce a total dissipation of
5 watts, which has been beyond the state-of-the-art
of conventional IC packaging. High speed LSI has
therefore required considerable research into methods
of obtaining high speed at lower power levels, and this
has required smaller geometry structures to minimize
capacitance-thereby making the large LSI circuits
more difficult to produce at a reasonable yield.
Reliability
One of the originally stated reasons for going to
LSI has been to increase reliability by decreasing the
total number of interconnections and packages in the
system. This may be true for a system of fixed capability, such as a desk calculator or a computer terminal.
On the other hand, in large systems, LSI is more often
viewed as a means of economically increasing the total
system complexity to perform more tasks, rather than
as a means of decreasing the package count for previously designed systems, in which case the MTBF
for the total enlarged system is of concern.
The advent of LSI brought into the picture a new
range of potential reliability problems that have to be
resolved.
Since LSI devices are more complex, they require
more metallization per chip. The larger number of
pins in LSI leads to an increased number of interfaces
between the chip and the package, and it is at these
locations-wire bonds and chip bonds-that the principal failure modes occur in silicon integrated circuits. In
fact, metallization and wire bond failures account for
approximately 60 percent of all conventional integrated circuit failures. Thus, reliability may suffer on
a per package basis.
In the case of high density LSI in a conventional
type of IC package, dissipation is increased to the
point where the circuit elements are operating considerably closer to the maximum allowable junction
temperature than would be the case for individually
packaged lower complexity IC's. Derating to increase
reliability is not as feasible and it has become important to explore the long term degradation of devices at these higher temperatures that can no longer
be considered an accelerated condition.
Failure rates on a per package basis are necessarily
increased by this effect, and the MTBF for the entire
system must be re-evaluated to make sure that the
expected benefits are in fact being obtained.
Perhaps the area of greatest difficulty in insuring
LSI reliability is in the application of the screening
techniques that have been accepted for use in integrated circuits. Typically, visual, mechanical, thermal
and operational screening of the final product is required. The final in-process screens should be performed at stress levels sufficiently stringent to remove
all devices which contain potential reliability hazards,
but the screen levels imposed must not degrade the
inherent reliability of those devices which survive the
screening sequence. Unfortunately, the screening levels
adopted for conventional integrated circuits
may not be applicable in general to LSI and MSI
devices.
Because of the larger size of LSI packages, the
centrifuge and shock tests applied to conventional IC's
can cause mechanical damage and loss of hermeticity
unless special precautions are taken.
Because of the increased complexity of LSI and
MSI devices, it must be assumed that the effectiveness
of a preseal visual inspection will not be as great as it
is for conventional integrated circuits. The sheer complexity of these devices outstrips the ability of a human
operator working with a microscope. This is particularly true when one considers the increased number
of possibilities for scratches and open metallizations
at oxide steps, the possibility of shorting between upper
and lower metallization levels because of pinholes or
cracks in the insulating oxide, the possibility for opens
due to marginal metallization alignments, and the
possibility of failure because of high leakage between
adjacent metallization stripes because of photolith defects, resulting in poor delineation.
SUMMARY
The promises of LSI are still basically valid; however,
the electronics industry has had to face tremendous
problems in its efforts to make LSI a production
reality. The solution to these problems has required
the development of new approaches in almost every
aspect of integrated circuit technology, and has required close cooperation between the vendor and the
user. It is, in fact, remarkable how much progress has
been made in the past four to five years. At present,
there are over 200 catalog part numbers for LSI devices and several LSI systems are programmed for
some 1970 production. It now appears that 1970 will
be the year of reality for LSI.
REFERENCES
1 G HERZOG
The LIMAC-An LSI demonstration vehicle
IEEE International Convention Digest N Y Mar 26 1969
2 B T MURPHY
Cost optima of monolithic integrated circuits
Proc IEEE Vol 52 1964 1537-1545
3 A G F DINGWALL
High yield processing for fixed-interconnect large scale arrays
IEEE Trans on Electron Devices Vol 15 No 9 1968 631-637
4 N KLEIN H GAFNI
The maximum dielectric strength of thin oxide films
IEEE Trans on Electron Devices Vol 13 No 12 1966 281-289
5 P KRAYNAK P FLETCHER
Wafer-chip assembly for large-scale integration
IEEE Trans on Electron Devices Vol 15 No 9 1968 660-663
6 B AGUSTA
Planar double diffused monolithic memory chip
Digest of Technical Papers Solid-State Circuits Conference
Feb 19 1969 Philadelphia Pa 38-39
7 E J ALEXANDER
P-channel MOSFET memories
IEEE Internat Convention Digest March 26 1969 N Y
Real-time graphic display of timesharing system operating characteristics *
by JERROLD MARVIN GROCHOW**
Massachusetts Institute of Technology***
Cambridge, Massachusetts
INTRODUCTION
The Graphic Display Monitoring System (GDM) is an
experimental monitoring facility for Multics, a general
purpose time-sharing system implemented at Project
MAC cooperatively with General Electric and the Bell
Telephone Laboratories.2,7 GDM allows design, systems
programming, and operating staff to graphically view
the dynamically changing properties of the time-sharing system. It was designed and implemented by
the author to provide a medium for experimentation
with the real-time observation of time-sharing system
behavior. GDM has proven to be very useful both as a
measuring instrument and a debugging tool and as
such finds very general use.
Monitoring the activity of a traditional computer
system (one with only a single active process) is a fairly
simple task. Hardware and software devices can easily
be devised to keep track of almost any parameter.
Asking the question "What are you doing right now?"
to a computer system controlling multiple processes or
servicing multiple interactive users, however, proves
particularly difficult to answer meaningfully. It becomes necessary to "snapshot" the system (record in
some manner its state at a specific time) and interpret
* Work reported herein was supported (in part) by Project MAC,
an M.I.T. research program sponsored by the Advanced Research Projects Agency, Department of Defense, under Office of
Naval Research Contract Nonr-4102(01).
** This paper is based on a thesis submitted in partial fulfillment
for the degree of Master of Science at the Massachusetts
Institute of Technology, Department of Electrical Engineering.
*** Project MAC
this information for the inquirer. Since a basic property
of a time-sharing system is that, in fact, it is "doing
something else" a few milliseconds from now, what the
inquirer really wants to ask is "What are you doing
now, and now, and now ... ?" Implicitly, he is also
asking to be shown what is happening in an easily
interpretable format. The GDM solution to his problem
is to provide the user with a real-time, graphical output "eavesdropper."
Statistical studies of time-sharing systems have been
performed1,5,11 in an attempt to provide "after-the-fact" monitoring (in effect answering the question
"On the average, what is happening?") and there have
been simulations in an effort to provide "predictive
monitoring."6,11 One company has even produced a
hardware device to receive system status information
over a special wired-in channel and record the results
on magnetic tape.12 Other than the "SNUPER Computer"6 which, however, still requires engineer-installed hardware probes, there has been little work
directed towards providing a generalized, real-time,
time-sharing system monitoring device. It is felt that
while the hardware used for this implementation of
GDM is perhaps unusual, the design principles involved
and the monitoring methods explored are sufficiently
general to provide a framework and a guide for other
designers.
The basic goal in designing the GDM System was
to produce a time-sharing system monitoring device
for use by the staff of the Multics project. Initial
requirements implied that it would be on-line, that is,
active while Multics was in operation-not just collecting data for future analysis, and would provide
dynamically changing graphic output (as well as hard
copy if desired). It was to be designed such that the
act of monitoring did not cause significant interference
to the time-sharing system or perturbations in its
behavior and such that it would not be necessary to
make more than a few minor additions to supervisory
procedures in order to incorporate the GDM System
(as opposed to monitoring done by inserting entire
procedures in critical points in the supervisor in order
to collect data; see Scherr11 for an example). Since the
GDM System was to be an experimental tool, it was
also considered especially important that it be easily
expandable and adaptable to new or different monitoring
requests. Coupled with these requirements was the need
to involve the expected user community as early as
was possible in the project in order to insure its continued use after initial implementation. In this regard,
acceptance by the systems programming staff was very
encouraging and many currently make use of the GDM
facility.
The original GDM System embodies these goals
while making use of existing hardware at Project MAC.
The Digital Equipment Corporation 338 (see Figure 2)
was already on site for use in other experimental work.
A more extensive (and less expensive) monitoring
system could perhaps be designed if it were possible to
choose both the display processor and the method of
interface to the time-shared computer. This was not,
however, viewed as a major handicap in developing a
useful system.
Succeeding sections will discuss the various components of the GDM System and will describe in detail
initial experiments and current usage at Project MAC.
Compromises in design and special problems due to
the particular constraints of the display hardware or
software and the Multics system to which they interfaced are also discussed.
What is the GDM system?
Subsystems
The GDM System consists of four major components:
A. An input-output procedure running under
Multics to transmit data as requested to the
display computer.
B. A monitor system operating on the display
computer to facilitate the creation, storage, and
retrieval of display templates (see below) and
to perform various other housekeeping functions.
C. A series of display computer subroutines for
manipulating data and generating command
sequences for the display.
D. A language for describing desired data manipulation and display formats (Display Description
Language), a (planned) compiler for translating
such descriptions into display computer assembly
language programs, and a set of macro-definitions for simplifying display computer programming and for calling the subroutines mentioned
under C.
Figure 1 gives a functional representation of the
various GDM subsystems showing the interaction
among them, the two computers, and the user. Figure
2 shows the complete hardware configuration. Reference 8 goes into considerable detail about the GDM
monitor system software including system flow charts.
Figure 1-GDM subsystem interactions
Figure 2-Hardware configuration (thin lines represent data transmission; heavy lines represent transmission of status information and interrupts)
Figure 3-XRAY display
Modes of operation
Use of the GDM System generally falls into one of
three classes of operation:
1. Demonstration mode: any of a number of
library displays may be viewed to get a general
picture of Multics operation at the moment.
Data used in these displays is updated periodically according to preprogrammed instructions.
2. XRAY mode (so named because of its similarity
to the X-ray System4): the user may type the
segment number and offset of a datum (see
Reference 3 for a description of the addressing
scheme used in Multics) on the teletype of the
display computer and see displayed the octal
and ASCII character representation of its
contents, updated every second (Figure 3,
XRAY display).
3. Display creation mode: the user will go through
the process of creating his own display (as outlined in Figure 1) in order to gain desired flexibility in data displayed, format of display, or
data sampling rate. Displays are then saved in a
special format, the "display template," for use
in later experiments or as part of the library.
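The XRAY mode described in the list above shows a datum in octal and as ASCII characters. The following is a minimal sketch of that formatting step, not the GDM code itself. It assumes a 36-bit word holding four 9-bit characters, and the names `xray_format` and `pack_ascii` are hypothetical.

```python
def pack_ascii(text: str) -> int:
    """Pack exactly four characters into one 36-bit word, 9 bits each (assumed layout)."""
    if len(text) != 4:
        raise ValueError("expected exactly four characters")
    word = 0
    for ch in text:
        word = (word << 9) | (ord(ch) & 0o777)
    return word


def xray_format(word: int) -> tuple[str, str]:
    """Return the (octal, ascii) renderings of one 36-bit word."""
    if not 0 <= word < (1 << 36):
        raise ValueError("expected a 36-bit unsigned value")
    octal = f"{word:012o}"  # 36 bits -> 12 octal digits
    chars = []
    for shift in (27, 18, 9, 0):
        byte = (word >> shift) & 0o777
        # Non-printing codes are shown as a dot (a choice made for this sketch).
        chars.append(chr(byte) if 32 <= byte < 127 else ".")
    return octal, "".join(chars)
```

A display loop would call `xray_format` once per sample (once a second in the mode described above) and rewrite the two text fields on the screen.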
All modes of operation employ the same type of
display template and are listed only to differentiate
between the applications of the GDM System. System
programmers have been trained in five minutes to
utilize the many displays already in the library (operation under "Demonstration Mode"). Some use the
XRAY display when there are one or two locations of
interest at a particular moment, as in the current
number of available disk pages or the value of a particular time-dependent variable. Display creation
mode, the most general use of the GDM System, requires the most work on the part of the user. He must
decide what data items to display, how to display them,
and how often to sample them. He must then create
the data manipulation routines and the display list
comprising his particular "display template." Until
the DDL compiler is constructed, this work must be
done in an extended version of the PDP-8 Assembly
Language as seen in Table I (the 338 computer uses
the same systems software as its sister PDP-8). It is
in this mode of use that all the facilities of the GDM
System come into play and in which the most fruitful
experimental work can be performed.
Examples
Figures 4 and 5 show typical examples of GDM
displays. Figure 4, Core Memory Summary Display,
displays real-time information on the usage of Multics
core memory pages; Figure 5, Active Process State
Display, displays user activity information (see below).
The display templates for both figures were constructed
in about two hours apiece by an experienced user and
have provided many hours of system observation for
experienced and inexperienced alike.
The display in Figure 5 causes information about
each process in Multics to be extracted from the traffic
controller data base. The column labelled "MP" is
the "multiprogramming state," an indication of a process' eligibility to receive CPU time. Stars to the right
of this column indicate the processes that are eligible
(state 4). The column "ST" is the "activity state":
running, ready to run (waiting to be serviced), or not
ready to run. The star is next to the process currently
running, state 1. In a multi-processor configuration,
there would be more than one such process.
Figure 4-Multics core memory summary display
Figure 5-Active process state display
The associated bar graphs also provide a descriptive
measure of overall system activity. By "eyeball integration" of the length of the bars, one can get a fairly
accurate idea of system loading. Several means of
calculating graph lengths have been used (in different
display templates all using the same basic form) :
1. Whenever a process is ready or running, the
length of the bar is increased. When the process
is not ready, the associated bar decreases. Each
bar changes length as an exponentially weighted
sum of ready-running and not-ready time.
(This is seen in Figure 5.)
2. Whenever a process is ready, its bar grows in
equal time increments. When the process is
finally serviced (receives processor time), its bar
is reset to zero length.
The display of type 1 gives a general picture of
system loading but also shows something of the behavior of the individual process. The scale is calibrated
in percentage to indicate that the bar shows the percentage of time a process is requesting or receiving CPU
time-a measure of the process' activity. The type 2
display is more useful in getting an uncluttered picture
of just how long a "ready" user must wait to run, i.e.,
how long each process is spending in the queues waiting
for service.
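The two bar-length rules can be sketched as update functions applied once per sample. This is an illustration under stated assumptions, not the `calc` routine of the paper: the weight of 0.25, the unit step, the state names, and the choice to leave a type 2 bar unchanged while a process is not ready are all hypothetical.

```python
def ewma_bar(prev: float, active: bool, weight: float = 0.25) -> float:
    """Type 1: exponentially weighted sum of ready/running (1.0) and not-ready (0.0) samples."""
    sample = 1.0 if active else 0.0
    return (1.0 - weight) * prev + weight * sample


def wait_bar(prev: float, state: str, step: float = 1.0) -> float:
    """Type 2: grow in equal increments while ready; reset to zero when serviced."""
    if state == "running":
        return 0.0
    if state == "ready":
        return prev + step
    return prev  # not ready: left unchanged (an assumption of this sketch)
```

With these definitions a type 1 bar stays between 0 and 1 and can be read as a percentage, while a type 2 bar measures how many sample intervals a ready process has waited for service.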
The display templates for these two displays differ
in about ten instructions (the computation of bar
length). The two hours of editing and assembling to get
a "first draft" of the display is even less if averaged
over the two displays. Herein lies a basic flexibility of
the GDM System: once the data to be displayed have
been decided upon, it need be only a matter of minutes
before it is viewed. Display formats can be easily
experimented with and a finished display template can
be added to the GDM library for future monitoring
without any costly "dedicated system" monitoring
runs.
The examples discussed above show simply two ideas.
Others have included collecting (and displaying) data
on the mean lifetime of a page in the Multics memory
(how long does it take before the page is swapped out
to secondary storage), the distribution over time of
the number of active time-sharing users (very nicely
displayed as a graph similar to Figure 6D), and the
average number of users referencing particular supervisor segments (built up during the length of the monitoring session). There is a great deal of work yet to be
done before we run out of ideas or into the limitations
of the GDM system.
More on the display template
A display template (DT) consists of three sections:
1. A list of the time-sharing system data items to
be sampled (segment number and data base
format are sufficient since absolute core locations are determined by GDM at monitoring
time).
2. Instructions on display type (numerical, ASCII,
bar graph, other graphics, etc.), sampling rate,
and data manipulations (averaging, scaling, etc.)
for each data item or group of items.
3. A display list: machine instructions for the 338
Display giving text, formatting information, and
storage for items to be displayed.
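The three sections of a display template can be pictured as a small record. The sketch below is only a modern illustration of that structure; the class and field names are hypothetical, and it simplifies the description above by giving every data item its own handling entry rather than allowing groups of items.

```python
from dataclasses import dataclass, field


@dataclass
class DataItem:
    """Section 1: where the datum lives, by segment and offset (not absolute address)."""
    segment: str
    offset: int


@dataclass
class Handling:
    """Section 2: how one item is shown, sampled, and manipulated."""
    display_type: str          # e.g. "numerical", "ascii", "bar"
    sample_period_s: float     # seconds between samples
    scale: float = 1.0         # simple stand-in for "data manipulations"


@dataclass
class DisplayTemplate:
    items: list[DataItem]
    handling: list[Handling]
    display_list: list[str] = field(default_factory=list)  # Section 3

    def validate(self) -> bool:
        """Check one handling entry per item and positive sampling periods."""
        return (len(self.items) == len(self.handling)
                and all(h.sample_period_s > 0 for h in self.handling))
```

For instance, a template with two items in one segment, each sampled once a second and shown as a number, would carry two `DataItem` entries, two `Handling` entries, and a display list of formatting commands.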
For example, to display a single process' activity as
in Figure 5, a DT would contain about twenty instructions (Table I).
The various non-PDP-8 instructions (call, do,
dlstart, etc.) are macro calls to a set of definitions designed as part of the GDM System. Various subroutines (nplot, ge645, sked, etc.) are also provided as
interfaces to the GDM monitor and to simplify programming. These features allow the programmer without
PDP-8 experience to design a display template with a
minimal apprenticeship. (Implementation of a DDL
compiler should simplify this even further.) Of course,
since all the facilities of the computer are available,
data manipulations can be quite complex (although
subroutines are provided for such common operations
as scaling and masking) and displays quite unusual
(Figure 6 shows standard types for which GDM provides some macro facility). The only limit is the designer's imagination and the size of the PDP-8 core
memory.
Figure 6-Other standard display types (panels 6a through 6d; legible axis labels include TIME UNITS and DATUM UNITS)
The Multics/GDM interface
The GDM System is designed for use in a symbiotic
relationship with a time-shared computer. The computer must be capable of supporting a display processor
functioning basically independently of the time-sharing
system but occasionally interjecting requests for data
transmission.
The Multics environment is particularly friendly to
this type of system as it is possible to make data requests through the generalized input-output controller
(GIOC) of the GE-645 (Reference 9) without interrupting the
central processing unit (Figure 2). It is necessary, however, to dedicate two of the 2048 GIOC channel pairs
(one for transmitting and one for receiving) to the
display processor. Those problems introduced by this
relationship are discussed further below.
The Multics/GDM interface procedures are capable
of providing the following services:
1. Accept address request by segment number and
offset of data to be displayed (GDM).
2. Convert this address to an absolute memory
location for interpretation by the GIOC (GDM
to Multics).
3. Transmit the datum from the GE-645 memory
to the 338 (Multics).
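The three services listed above can be sketched as a small class. This is a deliberately simplified model: it treats address conversion as a lookup of a segment base plus an offset, which ignores the real Multics segmentation and paging machinery, and the names `MulticsInterface`, `segment_bases`, and `memory` are hypothetical.

```python
class MulticsInterface:
    """Toy model of: accept request, convert address, transmit datum."""

    def __init__(self, segment_bases: dict[int, int], memory: dict[int, int]):
        self.segment_bases = segment_bases  # segment number -> absolute base (assumed known)
        self.memory = memory                # absolute address -> word contents

    def resolve(self, segno: int, offset: int) -> int:
        """Service 2: convert (segment number, offset) to an absolute location."""
        if segno not in self.segment_bases:
            raise KeyError(f"unknown segment {segno:o}")
        return self.segment_bases[segno] + offset

    def fetch(self, segno: int, offset: int) -> int:
        """Services 1 and 3: accept the request and return the datum to the display side."""
        return self.memory.get(self.resolve(segno, offset), 0)
```

In this model the display computer would call `fetch` for each item named in its display template and hand the returned word to its plotting subroutines.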
In general, a GDM-type monitor requires only the
simplest method possible of getting data from the time-shared computer to the display processor. On the
Project MAC system, this means sending requests to a
short I/O program running on the GE-645 GIOC. The
2400 bit per second Dataphone (201B modems) used
for this transmission limits the request rate to approximately twenty per second (a negligible disturbance on
a one-and-one-half microsecond per instruction processor). Higher data rate transmission can be used
with corresponding increases in interference (if we increase the rate to 40,000 bps, the perturbance is still
less than 0.1 percent) and special telephone lines.
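The interference figures quoted above can be checked with a line of arithmetic. The sketch below assumes that each request costs the time-shared machine roughly one instruction time and that the request rate scales linearly with line speed; both are assumptions made for illustration, and the function name is hypothetical.

```python
def interference_fraction(requests_per_s: float, stolen_time_s: float) -> float:
    """Fraction of processor time lost if each request steals `stolen_time_s` seconds."""
    return requests_per_s * stolen_time_s


INSTRUCTION_TIME_S = 1.5e-6        # one-and-one-half microseconds per instruction
BASE_RATE = 20.0                   # requests per second over the 2400 bps line
FAST_RATE = BASE_RATE * 40_000 / 2_400   # about 333 requests per second at 40,000 bps

slow = interference_fraction(BASE_RATE, INSTRUCTION_TIME_S)   # about 0.003 percent
fast = interference_fraction(FAST_RATE, INSTRUCTION_TIME_S)   # about 0.05 percent
```

Under these assumptions the faster line still costs about 0.05 percent of processor time, which agrees with the bound of 0.1 percent stated in the text.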
All displays currently in use sample the GE-645 at
rates at or near the available maximum. Displays with
a number of data items occasionally resort to special
sampling methods in order to update important items
at least once a second: about the rate at which the
human eye can follow a dynamic display with that much
information.
TABLE I-A display template to monitor a T-S user's activity
*address table
tc data              /segment name
540;541              /locations within the segment
*data routines
a                    /name table of routines to be called by the GDM monitor
7777                 /end of table
a, 0                 /PDP-8 subroutine format
call ge645, 1, 2     /get first data item
call nplot, mp, 1    /plot "MP" state number
call ge645, 1, 3     /get next data item
call nplot, st, 1    /plot "ST" state number
jms calc             /call to machine language subroutine to calculate bar graph length
do hplot, bar        /plot horizontal bar graph
call sked, 144, a    /reschedule "a" to be called by monitor in one second
jmp i a              /PDP-8 subroutine format
*display-list
dlstart              /macro instruction to start display
nl; nl               /"new line" for formatting
mp, 0                /storage for "MP"
sp2                  /spaces for display formatting
st, 0                /storage for "ST"
sp2                  /formatting
hbar bar             /macro to create bar graph display
escape               /display instruction macro
top                  /display instruction macro to cause refreshing of display
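The data routine in Table I ends by asking the monitor to call it again (`call sked, 144, a`), which is how a once-a-second sampling rate is obtained. The sketch below imitates that self-rescheduling pattern with a small priority queue. It is not the GDM monitor: the class `Monitor`, the helper `make_sampler`, and the reading of octal 144 (decimal 100) as 100 ticks of 10 milliseconds are assumptions.

```python
import heapq


class Monitor:
    """Minimal tick-driven scheduler standing in for the GDM monitor."""

    def __init__(self):
        self.now = 0          # current time in ticks
        self._queue = []      # heap of (due_tick, sequence, routine)
        self._seq = 0         # tie-breaker so routines are never compared

    def sked(self, delay_ticks: int, routine) -> None:
        """Schedule `routine` to be called `delay_ticks` from now."""
        heapq.heappush(self._queue, (self.now + delay_ticks, self._seq, routine))
        self._seq += 1

    def run_until(self, end_tick: int) -> None:
        """Run every routine due at or before `end_tick`, in time order."""
        while self._queue and self._queue[0][0] <= end_tick:
            self.now, _, routine = heapq.heappop(self._queue)
            routine(self)
        self.now = end_tick


def make_sampler(log: list, period_ticks: int = 0o144):
    """Build a routine that records its call time and reschedules itself."""
    def routine(monitor: Monitor) -> None:
        log.append(monitor.now)              # stands in for the ge645/nplot/hplot calls
        monitor.sked(period_ticks, routine)  # corresponds to "call sked, 144, a"
    return routine
```

Scheduling the sampler once and running the monitor for 350 ticks records calls at ticks 0, 100, 200 and 300, one per assumed second.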
Advantages and disadvantages of GDM
Advantages, disadvantages, capabilities, and limitations of GDM can be grouped into two categories:
those relating to its monitoring ability; and those
relating to its ability to report the information monitored.
Monitoring ability
Several factors determine the usefulness of any type
of monitor. These include the number and type of
events it can monitor, the rate at which it can monitor
them, and the interference that this observation will
cause to the system being monitored.
One of the capabilities of GDM is a facility to change
the point of observation easily: this is accomplished
through the use of the display template. A new display
template can be designed and operational in a short
time and, once constructed, can be added to a library
for future recall. No hardware changes need be made,
no plug boards rewired, no probes changed to monitor
a new or different event. Another display template
with a few basic instructions is all that is needed to
change the "probe" of GDM.
GDM, as constructed, is a sampling monitor. Current
dataphone connections limit requests for data items to
about twenty per second as mentioned above. Faster
dataphone, direct connections or other means can be
used to influence sampling rate. The current rate is
such that "microsecond" events cannot be monitored.
Transient data items will be missed if their core location changes many times in a second. Current displays, therefore, limit themselves to observing only
"wired" data, that is, data whose core location need be
determined only once during a particular monitoring
period although the data itself may change many times.
As approximately 80 percent of the Multics supervisor's
data bases fall into this category at the current time,
this is not particularly restrictive.
Monitoring which requires the collection of a large
number of statistics over a very short time period
similarly is hindered by the current configuration
although "long-time" statistics are collected and displayed by a number of display templates.
Under Multics, short-time event monitoring is performed by special software embedded in the Multics
supervisor.10 A GDM display is used to observe, in
real time, the data base of this monitor in order to see
the time build up of the statistics and to note any
abnormalities that might be missed by observing
averages after an hour or more of operation. In this
way, the advantages of a real-time display are combined
with monitoring embedded in the time-sharing system
(which causes significant interference when turned on)
to provide a very useful tool.
The area of system interference has already been
discussed but one item should be emphasized. In the
Multics configuration, GDM need take only GIOC
time-not CPU time. In computer systems where this
is not possible, interference will still be negligible if
the GDM monitor "steals" only enough information
to make a useful display. Five hundred cycles per
second is still only 0.1 percent on a two-microsecond
cycle time computer and this is more than sufficient
for even the most complex display.
Reporting ability
Output of information is another area in which
flexibility is crucial. Displays in Figures 3, 4, and 5
show only numbers, characters, and bar graphs. Displays have also been constructed with the types of
graphs shown in Figure 6 and many others have been
suggested for particular applications. It has been found
that displaying the same information in different ways
often presents an entirely different picture of what is
going on. The only price to be paid for this flexibility
is programmer time and even then it is no more difficult
to display a bar graph (or any other type) than it is
to simply show a number. Several display templates
showing the same data in different formats can be
made almost as easily as a single one and the best
added to the GDM library.
For those who desire hard copy, GDM, in its current
configuration, offers only photographs of its displays
(stopped at any instant of time, saved on tape for
future reference or photographing). Plotters of various
kinds could perhaps be connected in tandem with a
dynamic display and requested to plot a particular
instance, even while the CRT display is still changing.
Here again, the designer is limited only by the hardware available and his imagination.
CONCLUSIONS AND OBSERVATIONS
The GDM System at Project MAC has served in two
major capacities:
1. As a monitoring "control center".
2. As a debugging tool.
The very nature of a multiple-access computer
system makes it very difficult to determine at one
location exactly what is happening at all terminals.
The GDM display, conveniently located near the main
body of Multics programmers, is readily consulted to
determine the state of a rampant user program, the
availability of secondary storage space, or just the
general health of the system (a slave display might
possibly be installed near the computer itself or in the
office of the system administrator as well). Many
system programmers have, at one time or another,
brought up the GDM System on their own initiative
to find out various, otherwise unobtainable, pieces of
information (a "cookbook" instruction sheet has been
provided for just this purpose). A visit to the GDM
display is always included as part of the standard
system tour for visitors.
As a debugging aid, GDM has been invaluable. It
is responsible for the detection of many system bugs, often transient or time dependent, that were not
easily isolatable by previously available means.
One of the features of GDM that has made it so
useful is its ability to simplify the act of dynamic
display creation to the point where this is no more
difficult than writing a simple assembly language
program. This flexibility has paid many times over for
the effort of implementation.
Finally, GDM can be readily adapted for use with
other time-sharing systems: only two Multics-dependent modules exist in the monitor and display templates
can be designed to suit any system.
GDM was designed as an experimental system and
as such has been very useful at Project MAC. Its use
during a period of intense debugging of the Multics
system has proven its development worthwhile.
ACKNOWLEDGMENTS
The author would like to express his gratitude to
Professor F. J. Corbató, advisor for his Master's
Thesis, for his continued support and aid during the
period of this work. Thanks are also due to many
members of the Multics development group at Project
MAC without whose help this work could not have
been undertaken and, in particular, Thomas Skinner,
Noel Morris, and Professor J. H. Saltzer.
REFERENCES
1 E G COFFMAN L C VARIAN
Further experimental data on the behavior of programs in a
paging environment
CACM Vol 11 No 7 1968 471-474
2 F J CORBATÓ V A VYSSOTSKY
Introduction and overview of the Multics system
Proc FJCC 1965 185-196
3 R C DALEY J B DENNIS
Virtual memory, processes, and sharing in Multics
CACM Vol 11 No 5 1968 306-333
4 D J EDWARDS
GE-645 core memory X-ray program
Multics System Programmers' Manual Section BE.13
Cambridge Mass MIT Project MAC internal doc 1966
5 G ESTRIN L KLEINROCK
Measures, models, and measurements for time-shared computer utilities
Proc ACM Nat Meeting 1967 85-95
6 G ESTRIN et al
SNUPER COMPUTER: a computer in instrumentation
automation
Proc SJCC 1967 645-656
7 E L GLASER J F COULEUR G A OLIVER
System design of a computer for time sharing applications
Proc FJCC 1965 185-196
8 J M GROCHOW
The graphic display as an aid in the monitoring of a time-shared computer system
Project MAC Tech Rpt MAC-TR-54 Thesis Cambridge
Mass Sept 1968
9 J F OSSANA L E MIKUS S D DUNTEN
Communications and input/output switching in a multiplex
computing system
Proc FJCC 1965 231-240
10 J H SALTZER J W GINTELL
The instrumentation of Multics
Presented at the Second ACM Symposium on Operating
System Principles Princeton N J 1969
11 A L SCHERR
An analysis of time shared computer systems
Project MAC Tech Rpt MAC-TR-18 Thesis Cambridge
Mass June 1965
12 F D SCHULMAN
Hardware measurement device for IBM System/360 time
sharing evaluation
Proc ACM Nat Meeting 1967 103-109
A graph manipulator for on-line network
picture processing
by HUGO A. DI GIULIO
Stanford University
Stanford, California
and
PAUL L. TUAN
Stanford Research Institute
Menlo Park, California
INTRODUCTION
This paper describes research which involves the use
of interactive computer graphics for processing systems
analysis networks. The term "systems analysis network"
is used to include project scheduling, task-resource
simulation, computer programming flow diagrams,
decision trees, assembly line balancing, flows in networks,
etc. These network pictures usually characterize the
precedence relations and the logical and data flow
among network component parts, and are traditionally
the planning tools for industrial engineers, operations
research analysts, and management and systems
planners. In this research, a system is developed to
provide a "drawing board," through the use of interactive computer graphics, to compose, transform,
decompose, partition, simplify, merge, and regenerate
network pictures for the purpose of facilitating rapid
convergence in man-computer experiments.
First, a study of the characteristics of network
pictures, in the light of graph theory, is conducted. It
provides a theoretical framework within which interactive graphics operations can be structured. Next,
a system of representing and processing network
pictures through boolean matrix operations is developed.
This is followed by the development of algorithms with
which to regenerate network pictures such that a regenerated
picture is isomorphic with its original drawing while,
at the same time, its visual effectiveness is maximized.
Finally, a system which enables the user to perform
various manipulation and transformation schemes is
described.
This research is in connection with the Biotechnology
Laboratory of the Department of Industrial Engineering
at Stanford University. An ADAGE computer system
(AGT/30) with an on-line graphics terminal is being
used under the sponsorship of N.I.H. project NLM
00525-2 and the School of Engineering, Stanford University.
Characteristics of network pictures for systems analysis
In this study we shall limit our attention to only the
following types of network:
Activity network (e.g., PERT, CPM)
Project scheduling
Job-resource simulation
Flows in networks (e.g., maximal flow, shortest route)
Decision tree
Computer program flow diagrams
Assembly-line balancing
For convenience, henceforth they will be collectively called
"systems analysis networks," or "SA networks." The
logical structure of these networks gives rise to some
common characteristics in their graphic representation.
We shall describe some of them below:
Independence of geometric constraints
By independence of geometric constraints we mean
that an SA network picture does not require rigid
coordinate positions for its picture parts, as is the
case in the drawing of a physical object. An SA network is
essentially a directed line graph³ with only precedence
relations to be considered. In fact, it can be constructed
with only nodes and arcs. A node generally represents
an event, a machine, an operation, etc.; an arc may
represent an activity or a flow and, at the same time,
gives a sense of precedence.
The order in which operations or decisions are
performed in an SA network is expressed by precedence
relations. A precedence relationship exists between nodes
belonging to the same path. We say node x precedes
node y to imply that y cannot occur until x has occurred.
This relation may be expressed by precedence operators
with symbols >, <. The expression x > y implies x
precedes y, or equivalently, y < x (y is preceded by x).
The precedence relationship is transitive, i.e., if x > y
and y > z, then x > z. All nodes in the same network
which can relate to each other in this manner belong to a
partially ordered field which we shall call a "transitivity
closure." An immediate precedence relationship between
two nodes is represented by → or ←. x → y implies x
immediately precedes y; x ← y implies x is immediately
preceded by y. The →, ← operators (link operators) do
not have transitive properties.
Therefore, an SA network picture can be defined the
same as a directed graph which we shall denote by G.
G = (X, F) where F is a "precedence function" defined
over X. X is the set of all nodes in G. F(x) is the set of
all immediate successors of node x in G. The expression
$y \in F(x)$, or simply $y \in Fx$, implies that node x and node y
(both belonging to X) are connected by a directed arc
(an arrow) pointing from x toward y. We denote this
arc by (x, y) where x is called the first node, and y the
second node. x and y are called "adjacent nodes."
(Henceforth, an individual node will be identified by a
lower case English letter, with or without subscript.)
The letter A denotes the set of all arcs in G. The
expressions G = (X, F) and G = (X, A) are equivalent.
F is not necessarily a single valued function; for
example, we may have F(x) = {u, v, w}, i.e., there are
three arcs emanating from node x: (x, u), (x, v), and
(x, w). $F^{-1}$ is an inverse function (the set of all immediate
predecessors) where $F^{-1}(y) = \{x \mid y \in F(x)\}$. Thus, if
(u, y) and (x, y) are the arcs with y as their second
node, then $F^{-1}(y) = \{u, x\}$.
The functions $F^2, F^3, \ldots, F^n$ are defined by: $F^2 x = F(Fx)$, $F^3 x = F(F^2 x)$, ..., $F^n x = F(F^{n-1} x)$. Likewise,
$F^{-2} y = \{x \mid y \in F^2 x\}$, $F^{-3} y = \{x \mid y \in F^3 x\}$, ..., $F^{-n} y = \{x \mid y \in F^n x\}$. $F^n x$ is called the "nth generation successor
set of x," and $F^{-n} x$ the "nth generation predecessor set of x."
To preserve the consistency of the transitivity relationship (so that y cannot be both a successor and a
predecessor to x) we shall regard all SA networks as
being acyclic (i.e., there is no directed cycle in G). The
cyclic conditions may be treated as acyclic with the use
of "equivalent nodes" as will be discussed later.
We call $\hat{F}x$ the "successor set" of x: $\hat{F}x = \bigcup_{i=1}^{h} F^{i} x$, where $F^{h} x \neq \emptyset$ and $F^{h+1} x = \emptyset$. We call $\check{F}x$ the "predecessor set" of x: $\check{F}x = \bigcup_{i=1}^{k} F^{-i} x$, where $F^{-k} x \neq \emptyset$ and $F^{-(k+1)} x = \emptyset$. We define the "forward transitivity
closure" of x by $\{x\} \cup \hat{F}x$, the "inverse transitivity
closure" of x by $\{x\} \cup \check{F}x$, and the "transitivity
closure" of x, $\bar{F}(x)$, by $\{x\} \cup \hat{F}x \cup \check{F}x$.
Subgraphs, partial graphs, partitions, and
reduced graphs
A subgraph of G = (X, F) is a graph $(Z, F_Z)$ where
$Z \subset X$, and for all nodes x in Z, $F_Z x = (Fx) \cap Z$, i.e.,
a subgraph of G is the result of taking away at least one
node, and its associated arcs, from G. A partial graph
of G is a graph of the form (X, F') where $F'x \subset Fx$ for
all x in X, i.e., a partial graph of G has all the nodes
of G but without some (at least one) of its arcs.
$X_1, X_2, \ldots, X_r$ constitute a partition of X if: (1) $\bigcup_{i=1}^{r} X_i = X$; (2) for every i and j, $i \neq j$, and $i, j \leq r$,
$X_i \cap X_j = \emptyset$. A graph $G^0 = (X^0, A^0)$, where $X^0 = \{X_1, X_2, \ldots, X_r\}$, and $A^0$ is the set of arcs, is called a
reduced graph of G. $(X_i, X_j) \in A^0$, $i \neq j$, if and only
if there exist a node $x \in X_i$ and a node $y \in X_j$ such that
$(x, y) \in A$.
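As a rough illustration of the reduced-graph definition, the following Python sketch maps every arc of G onto the partition blocks of its end nodes and keeps only arcs that join different blocks. The function name, the use of block indices as reduced nodes, and the example data are assumptions made for this sketch.

```python
def reduced_graph(arcs, partition):
    """Return the arc set of the reduced graph as pairs of block indices.

    arcs      -- iterable of (x, y) pairs, the set A of G
    partition -- list of disjoint node sets X_1 ... X_r covering X
    """
    block_of = {}
    for i, block in enumerate(partition):
        for node in block:
            if node in block_of:
                raise ValueError("partition blocks must be disjoint")
            block_of[node] = i
    reduced = set()
    for x, y in arcs:
        i, j = block_of[x], block_of[y]
        if i != j:  # arcs inside one block vanish in the reduced graph
            reduced.add((i, j))
    return reduced


# Hypothetical example: nodes b and c are merged into one block.
arcs = {("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")}
partition = [{"a"}, {"b", "c"}, {"d"}]
```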
Common basic diagrams
Basic diagrams are subgraphs which possess certain
topological characteristics into which an SA graph can
be decomposed. We consider all SA graphs as aggregates
of some basic diagrams. It is advantageous that these
basic diagrams be prestored in a dictionary; it is then
not necessary to enumerate the topological details
of a basic diagram each time it occurs in a graph.
A graph may be collapsed into a simpler form by
reducing the number of nodes and arcs in some, or all, of
the basic diagrams contained in the graph (graph
reduction is explained in the last section of this paper).
We introduce some of the most commonly used basic
diagrams below:
Closed Serial Path (see Fig. 1.a): Many SA networks
are constructed with individual paths (e.g., job-resource
simulation). A serial path is a simple and elementary path
R(x, y) having n nodes and exactly n - 1 arcs. x and y
are exterior points of R(x, y). Nodes in R(x, y) which
are successors of x and predecessors of y are interior
points of R(x, y). A closed serial path has the properties
that: (1) $Fx$ and $F^{-1}y$ are singletons; (2) for each interior
point z, $Fz$ and $F^{-1}z$ are singletons. A serial path which
violates the aforementioned properties is an open serial
path. The simplest form of a closed serial path is two
nodes linked by an arc. In such a case, there is no
interior point.
Simple Out-Branch (see Fig. 1.b): Branches in SA
networks often indicate decision points. An out-branch
occurs at a node x if $|Fx| > 1$. A simple out-branch
requires that there is no path between members of Y
where Y is a subset of F(x). In our example (Fig. 1.b),
Y = {a, b, c}.
Simple In-Branch (see Fig. 1.c): An in-branch occurs at
a node x if $|F^{-1}x| > 1$. To be a simple in-branch there
must not exist a path between members of W where W
is a subset of $F^{-1}(x)$. In our example, W = {a, b, c}.
Closed Parallel Path (see Fig. 1.d): A closed parallel
path PP(x, y) implies that there is more than one
closed serial path from x to y with x, y as their exterior
points.
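The out-branch test can be illustrated with a short Python sketch. It assumes that Y is taken to be the whole of F(x), which is one reading of "Y is a subset of F(x)"; the function names and the two small graphs are hypothetical.

```python
def reachable(F, x):
    """All nodes that can be reached from x along directed arcs."""
    seen, stack = set(), list(F.get(x, ()))
    while stack:
        y = stack.pop()
        if y not in seen:
            seen.add(y)
            stack.extend(F.get(y, ()))
    return seen


def is_simple_out_branch(F, x):
    """True if |Fx| > 1 and no member of F(x) can reach another member."""
    Y = F.get(x, set())
    if len(Y) <= 1:
        return False  # no out-branch at x at all
    return all(v not in reachable(F, u) for u in Y for v in Y if u != v)


# Hypothetical examples: a clean three-way branch, and a branch in which
# one successor leads to the other.
F_simple = {"x": {"a", "b", "c"}, "a": set(), "b": set(), "c": set()}
F_linked = {"x": {"a", "b"}, "a": {"b"}, "b": set()}
```

An in-branch check follows the same pattern with the inverse function in place of F.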
The concept of weighted arcs and nodes
In an SA network an arc serves two functions: (1) to
connect two nodes and give a sense of precedence, and
(2) to carry values. For example, cost, capacity,
distance, flow units, data string, time, speed, and probabilities are values which may be associated with an
arc. We call those arcs which serve both functions
weighted arcs. Similarly, a node may be weighted. For
example, in a job-resource simulation a node is typically
a processing station which contains channel capacity,
mean processing rate, probability distribution function,
queue storage, etc.
Picture composition and storage
Picture composition
Figure 1-Basic diagrams: (a) closed serial path; (b) simple out-branch; (c) simple in-branch; (d) closed parallel path
To begin an interactive experiment a user must be
able to draw a diagram on the CRT similar to the kind
of diagrams he usually draws on paper. An SA network
picture may be drawn by either manual input from the
ADAGE graphics terminal, using joystick and light pen,
or by programmed statements, or a combination of both.
During the drawing phase of the experiment the console
input involves the permanent display of certain
"function keys" and "graphic primitives" on the bottom
and the right edge of the CRT screen. A graphic
primitive (one of the node symbols) is picked up and
moved to the desired position on the screen with the
movement of the tracking cross which is directed by the
console joystick. Directed arcs are created by connecting
nodes with the movement of the joystick. Alphanumerical labels for each node may be entered via the
console typewriter. Figure 2.a shows the free-hand
drawing of a project scheduling³ network picture on
the CRT.
Equivalently, a picture may be composed by programming. We introduce some of the commonly used
operators, together with some examples below (with
the convention that a network picture progresses from
left to right):
The Link Operators (see Fig. 3 for examples):
x → y  x links y and creates arc (x, y).
x ← y  x links from y and creates arc (y, x).
Graph G1 links graph G2. $\{x_i\}$ and $\{y_i\}$ are lists of concatenation points. For each $x_i$ there is a $y_i$ such that $x_i$ links $y_i$.
G1 links x via (c, x) where $c \in G_1$, $x \notin G_1$.
x → [a, b, c]  Out-branch from x to a, b, c.
[a, b, c] → x  In-branch from a, b, c to x.
The Precedence Operators (see Fig. 4 for examples):
x > y  x precedes y
x < y  x is preceded by y
Figure 2a-Initial drawing of a project scheduling network
Figure 2b-The connection matrix (C-matrix) of a project scheduling network
Figure 3-Examples of link operators
Figure 4-Examples of precedence operators
Retention of picture information with Boolean
matrix operations
The special topological characteristics of the SA
networks (i.e., a picture is defined by the precedence and
logical relationships among network components rather
than their geometric attributes) permit us to make a
radical departure from the conventional means of
picture storage in which the coordinates and other
geometric specifications (e.g., radius, angles) of vectors
or primitives are to be remembered. The picture
retention scheme for the SA networks involves a
minimum amount of information, yet it preserves the
isomorphism of the picture topology as well as the
"meaning" of the picture. Under this scheme, a picture
may be regenerated for the purpose of CRT display or
for the purpose of revision, decomposition, reduction
or merging with other pictures. This is done through
the Network Picture Processing Language (NPPL)
which employs boolean matrix operations for various
picture manipulations.
During the picture composing phase, while a picture
is being drawn on the CRT by the user, a "connection
matrix," or "C-matrix" is constructed in the working
storage of the computer. The C-matrix is a boolean matrix¹
with dimension n x n where n is the total number of
nodes in G. If we label the row corresponding to $x_i$ by i,
and the column corresponding to $x_j$ by j, then the
element of C has the value $c_{ij} = 1$ if $(x_i, x_j) \in A$;
$c_{ij} = 0$ if $(x_i, x_j) \notin A$. $x_i$ is a source node if $\sum_{k=1}^{n} c_{ki} = 0$ and $\sum_{k=1}^{n} c_{ik} > 0$. $x_i$ is a sink node if $\sum_{k=1}^{n} c_{ik} = 0$ and $\sum_{k=1}^{n} c_{ki} > 0$. $x_i$ is an isolated point if $\sum_{k=1}^{n} c_{ki} = \sum_{k=1}^{n} c_{ik} = 0$.
For example, Figure 2.b shows the C-matrix for the
project scheduling network sketch given in Figure 2.a.
In order to conserve computer memory the C-matrix
is converted into a "precedence matrix," or "P-matrix,"
before storage. A precedence matrix is a connection
matrix with its rows (columns) arranged in accordance
with the precedence relations in G. The rules of arrangement are as follows:
1. If $y \in \hat{F}x$ (y is a successor of x), then y must be placed after x (i.e., the
row (column) associated with y must have a
larger index number than that of x).
2. If $y \in \check{F}x$ (y is a predecessor of x), then y must be placed before x.
3. If $y \notin \bar{F}x$ (y is not in the transitivity closure of x), then the order between x and y is
irrelevant.
Figure 5 shows the P-matrix of the project scheduling
network of Figure 2.a. We notice that the P-matrix is
triangular (this will always be true if the precedence
relations hold), and it is predominantly inhabited
with zeros. Both features contribute to the economical
use of core storage.
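The construction of the C-matrix, the source/sink tests, and the reordering into a P-matrix can be sketched as follows. The paper does not say which procedure NPPL uses to find an ordering that satisfies the three rules; this sketch substitutes Kahn's topological sort, which is one standard way to do it. All names and the four-node example are assumptions made for illustration.

```python
def connection_matrix(nodes, arcs):
    """Build the boolean C-matrix with rows and columns in the order of `nodes`."""
    index = {x: i for i, x in enumerate(nodes)}
    C = [[0] * len(nodes) for _ in nodes]
    for x, y in arcs:
        C[index[x]][index[y]] = 1
    return C


def classify(C, i):
    """Classify node i from its row sum (out-arcs) and column sum (in-arcs)."""
    out_deg = sum(C[i])
    in_deg = sum(row[i] for row in C)
    if in_deg == 0 and out_deg == 0:
        return "isolated"
    if in_deg == 0:
        return "source"
    if out_deg == 0:
        return "sink"
    return "interior"


def precedence_order(nodes, arcs):
    """Order nodes so that every node comes after all of its predecessors."""
    indeg = {x: 0 for x in nodes}
    succ = {x: [] for x in nodes}
    for x, y in arcs:
        succ[x].append(y)
        indeg[y] += 1
    ready = [x for x in nodes if indeg[x] == 0]
    order = []
    while ready:
        x = ready.pop(0)
        order.append(x)
        for y in succ[x]:
            indeg[y] -= 1
            if indeg[y] == 0:
                ready.append(y)
    if len(order) != len(nodes):
        raise ValueError("graph has a directed cycle")
    return order


def is_upper_triangular(M):
    """True if every entry on or below the diagonal is zero."""
    return all(M[i][j] == 0 for i in range(len(M)) for j in range(i + 1))


# Hypothetical example, with nodes deliberately listed out of precedence order.
nodes = ["d", "b", "a", "c"]
arcs = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
C = connection_matrix(nodes, arcs)
P = connection_matrix(precedence_order(nodes, arcs), arcs)
```

Here C is not triangular because of the arbitrary node order, while P, built from the same arcs in precedence order, is.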
Picture regeneration
Convention of network picture arrangement
The network pictures stored in the computer file may be
retrieved in their entirety, or in part, for CRT display. It
is also desirable to redisplay a picture immediately
after it is drawn because invariably the computer will
generate a "better" picture than the one drawn by the
user. In our present effort the convention of a network
picture generated by the computer includes the
following rules:
1. All the source nodes are placed at the left end
of the screen, which means that the network
pictures progress from left to right.
2. Only forward arrows are allowed, i.e., no backward or vertical arrows.
3. All arcs are made of linear segments.
4. Line crossings are to be minimized.
5. Other visual effectiveness considerations.
Figure 5-The precedence matrix (P-matrix) of a
project scheduling network
Algorithms for optimum routing for
interconnections
As mentioned before, the special structure of the SA
networks allows us to generate a network picture with
an efficient generator which presupposes the topological
characteristics of the graph, thus reducing drastically
the storage requirement. The graph generator of the
NPPL operates on the P-matrix (or C-matrix) and
transforms it into a graph image with all routes of
interconnections "optimized." For example, Figure 6 is
the same network picture as shown in Figure 2.a, but it is
drawn with the convention and constraints set by
NPPL.
Figure 6-Regenerated picture of a project
scheduling network
We consider the graph area as being a rectangular grid.
The nodes of a graph are always placed at the intersections of the vertical and horizontal lines. We call the
vertical lines "stages" and the horizontal lines "levels."
If we can place each node of the graph at its proper stage
and level, and connect them according to the F function,
then a graph is generated. Figure 7 shows the project
scheduling network with each "stage" indicated. If the
"stage" and "level" assignments are not properly made,
the result may be backward arrows and frequent
occurrence of line crossings. Both features are undesirable from a visual effectiveness point of view. We shall
briefly list the procedure for assigning stages as follows,
using the example of the project scheduling network:
Figure 7-"Stages" of a network picture
1. Place all source nodes in stage 0, S(0). e.g.,
S(0) = {s}.
2. Obtain $S'(1) = \bigcup_{x \in S(0)} F(x)$. e.g., S'(1) = {e}.
3. Obtain $S'(2) = \bigcup_{x \in S'(1)} F(x)$. e.g., S'(2) = {f}.
In general, $S'(n) = \bigcup_{x \in S'(n-1)} F(x)$.
4. If $S'(n) \cap S'(n + k) = Y \neq \emptyset$ (where $k \geq 1$)
then S'(n) would be modified by labeling members of Y in S'(n) as "dummy nodes," and the
successors of dummy nodes would be deleted
from any subsequent stages. The dummy nodes
will be repeated at each succeeding stage until
S'(n + k). The function of the dummy node is to
be a "place marker" for an arc which crosses
several stages (this is often necessary in order to
avoid backward or vertical arrows) and to keep
it free from interference from other arcs or nodes.
5. The "scan" process will continue until stage m is
reached where $S(m) = \emptyset$, and all S'(j)'s, j = 1,
2, ..., m - 2, have been modified (i.e., the
labeling of dummy nodes and the deletion of
their successors). The modified stages are then
denoted by S(j) for all j ($\geq 0$). Figure 8 shows
the result of the scan process as it applies to the
project scheduling network.
S(0) = {S}
S(1) = {E}
S(2) = {F}
S(3) = {BP, WF, SO}
S(4) = {RP, BF, RW, H*, B, G*}
S(5) = {H, P*, R, G*}
S(6) = {P, GO, G*}
S(7) = {Fl, G}
S(8) = {K, C, FP, L}
S(9) = {Pt, V*, T*}
S(10) = {V, EF, T*}
S(11) = {T}
*Dummy node
Figure 8-Assigning stages
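One way to read the scan procedure is sketched below in Python. This is an interpretation, not the NPPL implementation. It assumes that a node takes its final stage one step after its latest predecessor, and that dummy markers (written with a trailing `*`) fill every stage from one step after its earliest predecessor up to that final stage. F is a dictionary that lists every node as a key; the function name and example graph are hypothetical.

```python
def assign_stages(F):
    """Return a list of stages; dummy place markers carry a trailing '*'."""
    nodes = list(F)
    preds = {x: [] for x in nodes}
    for x in nodes:
        for y in F[x]:
            preds[y].append(x)

    # Kahn's algorithm: visit each node only after all of its predecessors.
    indeg = {x: len(preds[x]) for x in nodes}
    ready = [x for x in nodes if indeg[x] == 0]
    order = []
    while ready:
        x = ready.pop(0)
        order.append(x)
        for y in F[x]:
            indeg[y] -= 1
            if indeg[y] == 0:
                ready.append(y)
    if len(order) != len(nodes):
        raise ValueError("directed cycle: stages are undefined")

    final, first = {}, {}
    for x in order:
        if preds[x]:
            final[x] = 1 + max(final[p] for p in preds[x])  # real position
            first[x] = 1 + min(final[p] for p in preds[x])  # first appearance
        else:
            final[x] = first[x] = 0  # source nodes go in stage 0

    stages = [[] for _ in range(max(final.values()) + 1)]
    for x in order:
        for s in range(first[x], final[x]):
            stages[s].append(x + "*")  # dummy marker for a long arc
        stages[final[x]].append(x)
    return stages


# Hypothetical example: the arc (e, t) spans three stages.
F = {"s": ["e"], "e": ["f", "t"], "f": ["g"], "g": ["t"], "t": []}
```

In the example, node t is first reached from e but must wait for g, so it appears as a dummy in stages 2 and 3 and as a real node in stage 4, much as G* precedes G in the listing for Figure 8.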
The next step is to assign "levels." Figure 9.b shows a
graph of the project scheduling network from stage 7
through 11, using the order of node appearance in each
stage (Figure 8) as the initial "level" assignment.
Figure 9.a gives its associated P-matrix with stage
partitions shown. As can be seen in Figure 9.b, the
"unoptimized" version of node positioning resulted in
two line-crossings ((FP, Pt) with (C, V*), and (Pt, EF)
with (V*, V)). The crossing violations can also be
detected from the P-matrix as shown in Figure 9.a.
The P-matrix of Figure 10.a shows proper level assignment (thus, the matrix is called P*-matrix). This is done
by interchanging row (column) positions of K with C in
stage 8, and Pt with V in stage 9. The optimized graph
is shown in Figure 10.b.
Figure 9-An example of improper "level" assignment: (a) P-matrix; (b) graph
The criteria of optimizing the rows and columns of
the P-matrix, in order to minimize line crossings, are:
1. Interchange rows and columns only within each
stage.
2. The non-zero elements of each row should be
consecutively located.
3. If the non-zero element of a row begins in
column j, then no non-zero element of any
previous row may begin in a column with
column index less than j.
An optimal, or near optimal condition may be
achieved by rearranging the columns and rows of
P-matrix belonging to the same stage such that the
resultant matrix meets, or most nearly meets, the above
criteria.
The dummy nodes are not displayed on the CRT as
full symbols; instead they are merely treated as point
vectors which often serve as pivot points for arcs. For
example, see arc (T*, T) in Figure 10.a.
Figure 10-An example of proper "level" assignment: (a) P*-matrix; (b) graph
The handling of circuits
A graph which contains one or more circuits (i.e.,
directed cycles) is called a cyclic graph. A computer
program flow diagram is typically a cyclic graph since
program loops are the rule rather than the exception.
Certain job-resource simulation models in which a
rejected product recycles back for rework at a previously
encountered work station also constitute cyclic
graphs. While a cyclic condition can always be handled
with connectors, thus making the resultant graph
acyclic, it is more desirable to use directed arcs to
display the actual circuits. In NPPL, the backward
arrow, which represents the feed-back portion of the
circuit, ... regeneration phase. Upon completion of the line, an
arrowhead facing the opposite direction is placed at the
beginning of the arc (i.e., the left end of the arc). The
presence or absence of circuits can be detected by
examining the diagonal elements of the T-matrix.
A non-zero diagonal element signifies the existence of a
circuit.
Picture manipulation
Union of graphs G1 + G2
... or (2) $X_1 \cap X_2 \neq \emptyset$ and the
set of common nodes (i.e., $X_1 \cap X_2$) have the same order
of presence in both P-matrices $P(G_1)$ and $P(G_2)$. The
procedure of obtaining the P-matrix associated with the
graph G1 + G2, provided that $P(G_1)$ and $P(G_2)$ have
been obtained, is as follows:
If $X_1 \cap X_2 = \emptyset$:
1. "Fill" Operation for $P(G_1)$: Expand $P(G_1)$ by
adding $|A_2|$ zero row vectors of dimension
$|A_1| + |A_2|$ to the bottom of $P(G_1)$, and $|A_2|$ zero
column vectors to the right of $P(G_1)$, i.e., after
the last column in $P(G_1)$. Thus, we have obtained
the expanded $P(G_1)$, $\rho P(G_1)$.
2. "Fill" Operation for $P(G_2)$: Expand $P(G_2)$ by
adding $|A_1|$ zero row vectors to the top, and $|A_1|$
zero column vectors to the left, of $P(G_2)$. Thus,
we have obtained $\rho P(G_2)$.
3. Finally, $P(G_1 + G_2) = \rho P(G_1) \dot{+} \rho P(G_2)$. The
symbol $\dot{+}$ stands for element-to-element boolean
"inclusive-or" operations.
If $X_1 \cap X_2 \neq \emptyset$:
1. "Fill" Operation for $P(G_1)$: To obtain $\rho P(G_1)$ we
fill $P(G_1)$ with zero row (column) vectors of
dimension $|A_1| + |A_2|$ corresponding to those
nodes which are in G2 but are not in G1.
2. "Fill" Operation for $P(G_2)$: To obtain $\rho P(G_2)$ we
fill $P(G_2)$ with zero row (column) vectors of
dimension $|A_1| + |A_2|$ corresponding to those
nodes which are in G1 but not in G2.
3. $P(G_1 + G_2) = \rho P(G_1) \dot{+} \rho P(G_2)$.
See Figure 11 for an example of G1 + G2.
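The "fill" and inclusive-or steps can be sketched in Python as below. The sketch assumes each P-matrix travels together with its list of node labels, and that the caller supplies the merged node order (which must keep the common nodes in the same relative order as both inputs). The function names and the example matrices are hypothetical.

```python
def fill(labels, P, merged):
    """Expand P to the merged node list, inserting zero rows and columns."""
    pos = {x: i for i, x in enumerate(labels)}
    return [[P[pos[x]][pos[y]] if x in pos and y in pos else 0
             for y in merged]
            for x in merged]


def union(labels1, P1, labels2, P2, merged):
    """Element-by-element boolean inclusive-or of the two filled matrices."""
    A = fill(labels1, P1, merged)
    B = fill(labels2, P2, merged)
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


# Hypothetical example: G1 has arcs a->b, b->c, b->d;
# G2 has arcs b->d, b->e, d->e. Nodes b and d are common.
labels1 = ["a", "b", "c", "d"]
P1 = [[0, 1, 0, 0], [0, 0, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
labels2 = ["b", "d", "e"]
P2 = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
merged = ["a", "b", "c", "d", "e"]
P_union = union(labels1, P1, labels2, P2, merged)
```

The same `fill` routine covers both the disjoint case and the overlapping case, since a node missing from one graph simply contributes a zero row and column.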
Intersection of graphs G1 * G2
Assuming $G_1 \cap G_2 \neq \emptyset$ we first obtain a compressed
matrix $\gamma P(G_1)$ by striking out all rows and columns
which are associated with non-common nodes. By a
similar method, we also obtain $\gamma P(G_2)$. As in G1 + G2,
nodes in $X_1 \cap X_2$ must have the same order in both
P-matrices. $P(G_1 * G_2) = \gamma P(G_1) \dot{*} \gamma P(G_2)$ where the $\dot{*}$
symbol indicates an element-by-element "and" operation of two boolean matrices, i.e., $A \dot{*} B = C$ implies
$c_{ij} = a_{ij} \wedge b_{ij}$. We use G1 and G2 as illustrated in
Figures 11.a and 11.b to show the result of G1 * G2 in
Figure 12.
Figure 11-An example of G1 + G2
Figure 12-An example of G1 * G2
Deletion operations G1 - G2
Since G1 - G2 implies G1 - (G1 * G2) we strike out,
from $P(G_1)$, all those rows and columns which belong to
the set of common nodes, i.e., all the common nodes of
G1 and G2, together with their associated arcs, would be
deleted from G1.
The - operator is also a unary operator, e.g., the
expression -G1 produces the complement of subgraph
G1, or G - G1, where $G_1 \subset G$; the expression -A
produces G - A where $A \subset G$; -x means the deletion
of node x, together with its associated arcs, from G;
-(x, y) deletes the arc (x, y) from G.
Properties of +, *, and - operators
The operators described in the previous three sections
have the following algebraic properties:
1. Commutative:
G1 + G2 = G2 + G1
G1 * G2 = G2 * G1
2. Associative:
G1 + G2 + G3 = G1 + (G2 + G3) = (G1 + G2) + G3
G1 * G2 * G3 = G1 * (G2 * G3) = (G1 * G2) * G3
3. Distributive:
G1 * (G2 + G3) = (G1 * G2) + (G1 * G3)
G1 + (G2 * G3) = (G1 + G2) * (G1 + G3)
4. De Morgan's Law:
G1 - (G2 + G3) = (G1 - G2) * (G1 - G3)
G1 - (G2 * G3) = (G1 - G2) + (G1 - G3)
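The intersection and deletion operations described above amount to striking out rows and columns and, for the intersection, an element-by-element "and". A minimal Python sketch follows, again assuming each P-matrix is paired with its list of node labels; the function names and the example data are illustrative assumptions.

```python
def compress(labels, P, keep):
    """Strike out every row and column whose node is not in `keep`."""
    idx = [i for i, x in enumerate(labels) if x in keep]
    return [labels[i] for i in idx], [[P[i][j] for j in idx] for i in idx]


def intersection(labels1, P1, labels2, P2):
    """P(G1 * G2): boolean 'and' of the two compressed matrices."""
    common = set(labels1) & set(labels2)
    kept1, A = compress(labels1, P1, common)
    kept2, B = compress(labels2, P2, common)
    if kept1 != kept2:
        raise ValueError("common nodes must appear in the same order")
    return kept1, [[a & b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def delete(labels1, P1, labels2):
    """P(G1 - G2): remove the common nodes and their arcs from G1."""
    return compress(labels1, P1, set(labels1) - set(labels2))


# Hypothetical example: G1 has arcs a->b, b->c, b->d;
# G2 has arcs b->d, b->e, d->e.
labels1 = ["a", "b", "c", "d"]
P1 = [[0, 1, 0, 0], [0, 0, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
labels2 = ["b", "d", "e"]
P2 = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
```

In this example the only arc shared by both graphs is (b, d), and deleting the common nodes b and d from G1 leaves a and c with no arcs between them.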
Generation of subgraphs
A standard operation in graph theory¹ is to compute
$C^n$, which gives the number of paths of length n (length
is defined as the number of arcs between two communicating nodes on a particular path) between any two
nodes in G. As can be envisioned, this is a costly operation, particularly if the graph is large. Instead of asking
how many paths of length n there are between x and y we now ask
whether there is a path from x to y of length n. We may
achieve this by using boolean matrix operations. Thus,
in raising the C-matrix to a power we replace all ordinary
summations by boolean summations. If P is the product
matrix of A x B where matrices A and B have dimensions
m x r and r x n respectively, and x is the symbol for
boolean matrix multiply, then $p_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{ir} b_{rj}$ (where $p_{ij}$, $a_{ij}$, $b_{ij}$ are elements of matrices
P, A, B respectively). We denote the nth power of the
C-matrix resulting from boolean matrix multiply by
$C^n$. $C^n$ is a zero-one matrix. It is associated with a
graph which possesses an arc (x, y) if and only if there
exists a path of length n from x to y. For example,
Figure 13 shows the C-matrix associated with a graph and
the powers of C. $C^2$ shows that there exist paths from
a to d, b to e, and c to e, of length 2. $C^3$ shows that there
is only one path of length 3 in the entire graph, and that
is from a to e. $C^4 = 0$, which implies that there is no
path of length 4 or more in the graph.
Figure 13-Examples of $C^n$ matrices
Boolean summation may also apply to the addition of
connection matrices. $A \dot{+} B = C$ (where A and B have
the same dimensions) implies $c_{ij} = 0$ if $a_{ij} = b_{ij} = 0$;
$c_{ij} = 1$ otherwise. Using the examples given in Figure 13
we show the result of $C \dot{+} C^2 \dot{+} C^3$ in Figure 14. The
type of boolean matrix in Figure 14, which we shall call
"transitivity matrix," or "T-matrix," indicates whether
there is a path between any pair of nodes. It also gives
$\hat{F}x$ (the successor set) and $\check{F}x$ (the predecessor set) for any x. For example, the column labels
associated with the non-zero elements of row vector b
constitute the set $\hat{F}b$ (i.e., {d, e}), and the row labels
associated with the non-zero elements of column
vector b constitute the set $\check{F}b$ (i.e., {a}).
Figure 14-A transitivity matrix
With the utility of the T-matrix there are a number
of standard functions under NPPL for generating
subgraphs. We shall introduce a few below:
FTRAN(G, x)  Construct subgraph (X', A') of G = (X, A) where $X' = \{x\} \cup \hat{F}x$
ITRAN(G, x)  Construct subgraph (X', A') where $X' = \{x\} \cup \check{F}x$
TRAN(G, x)  Construct subgraph (X', A') where $X' = \bar{F}x$
FTRAN(G, x, k)  Construct subgraph (X', A') where $X' = \{x\} \cup (\bigcup_{i=1}^{k} F^{i} x)$
ITRAN(G, x, k)  Construct subgraph (X', A') where $X' = \{x\} \cup (\bigcup_{i=1}^{k} F^{-i} x)$
For example, if we name the graph in Figure 6 (a project
scheduling network) G0, the statement G1 = ITRAN
(G0, H) would produce a graph showing all the activities
which are prerequisite to the installation of heating
(H), including the node H (Figure 15). The graph which
represents activities between foundation (F) and flooring
(Fl) may be obtained by the statement G2 = FTRAN
(G0, F) * ITRAN(G0, Fl) (Figure 16).
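The boolean powers of the C-matrix, the T-matrix, and the node sets selected by FTRAN and ITRAN can be sketched in Python as follows. The five-node graph is an assumption: its arcs were chosen only to be consistent with the path lengths the text reports for Figure 13, and it is not taken from the figure itself. The helper names are likewise hypothetical and return node sets rather than full subgraphs.

```python
def bool_mul(A, B):
    """Boolean matrix product: p_ij = OR over k of (a_ik AND b_kj)."""
    return [[int(any(A[i][k] and B[k][j] for k in range(len(B))))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def bool_add(A, B):
    """Element-by-element boolean sum (inclusive-or)."""
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def transitivity_matrix(C):
    """T = C + C^2 + ... + C^n under boolean arithmetic, n = number of nodes."""
    T = [row[:] for row in C]
    power = C
    for _ in range(len(C) - 1):
        power = bool_mul(power, C)
        T = bool_add(T, power)
    return T


def ftran_nodes(labels, T, x):
    """Node set of FTRAN(G, x): x plus the labels of non-zero entries in row x."""
    i = labels.index(x)
    return {x} | {labels[j] for j in range(len(labels)) if T[i][j]}


def itran_nodes(labels, T, x):
    """Node set of ITRAN(G, x): x plus the labels of non-zero entries in column x."""
    j = labels.index(x)
    return {x} | {labels[i] for i in range(len(labels)) if T[i][j]}


def has_circuit(T):
    """A non-zero diagonal element of T signals a directed cycle."""
    return any(T[i][i] for i in range(len(T)))


# Assumed graph: a->b, a->c, b->d, c->d, d->e.
labels = ["a", "b", "c", "d", "e"]
C = [
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
C2 = bool_mul(C, C)
C3 = bool_mul(C2, C)
C4 = bool_mul(C3, C)
T = transitivity_matrix(C)
```

Intersecting `ftran_nodes(labels, T, "b")` with `itran_nodes(labels, T, "e")` gives the nodes lying between b and e, in the same spirit as the FTRAN(G0, F) * ITRAN(G0, Fl) statement above.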
Graph reduction and expansion
Operating on the "basic diagrams" as explained
earlier, the NPPL can successively reduce a graph to
various levels of complexity as may be specified by the
user. Some of the standard reduction functions are as
follows:
RSPI(G, x, y)   Reduce the interior nodes of closed serial path R(x, y) in G, and the arcs between them, into a single node.
RSP(G, x, y)    Reduce closed serial path R(x, y) into a single node.
ROB(G, x)       Reduce out-branches of node x in G such that the set Y (defined earlier) is a single node.
RIB(G, x)       Reduce in-branches of node x such that the set W (defined earlier) is a single node.
RPPI(G, x, y)   Reduce interior nodes of closed parallel path PP(x, y) in G into a single node.
RPP(G, x, y)    Reduce closed parallel path PP(x, y) in G into a single node.
Figure 15-An example of ITRAN(G, x): all activities prerequisite to heating, G1 = ITRAN(G0, H)
Figure 16-Activities between foundation and flooring, G5 = FTRAN(G0, F) ∩ ITRAN(G0, Fl)
Figure 17-Examples of graph reduction (panels show successive applications of RSP, RPPI, RPP, ROB, and RIB)
As an example of utilizing some of the functions
mentioned, Figure 17 shows a series of reductions
beginning with the full graph of the project scheduling
network G0 of Figure 6.
Graph expansion is essentially the reverse of graph
reduction. The nodes to be operated on must be
compressed (macro) nodes. The expansion may be done
in a single phase, or in several phases, e.g., the out-branches are expanded first, then the in-branches, then
the serial paths, etc.
In conclusion, we would like to mention that the
network picture processing language (NPPL) is not
designed solely for the purpose of generating and
manipulating network pictures. A greater objective is to
provide an over-all control system through which
man-computer experiments can be performed. We
envision that once a picture is constructed (whether by
initial composition, or by merging/decomposition
operations) the user may assign (or reassign) input data
to any node (or arc) by selecting the desired node
(or arc) on the CRT. An input page would then appear
on the CRT with a pre-designated format to guide the
user in inputting data. The matrix representation of
each graph, as previously explained, would also serve as
pointers to the storage areas of data pages. The simulation phase would then follow the input phase. During
any phase of the experiment, controls may be returned
to the picture composition and processing phase in order
to maximize man-computer interaction.
On-line recognition of hand-generated
symbols*
by GEORGE M. MILLER
University of California
Berkeley, California
INTRODUCTION
With the growth of information processing systems
incorporating large data bases, many situations arise
in which the data to be entered is a human's analysis
of a problem. Often it will be undesirable to require the
user to learn to type, and this mode can be cumbersome
for random two-dimensional entry on a form or drawing.
Using an electronic tablet coupled to a display tube
would make it convenient for the user to point to a correct answer or print it in a very natural way. This
paper describes a new technique for converting these
handwritten symbols to code words which can subsequently be processed by a computer.
It might be supposed that handwriting is not speed
competitive with keyboard methods. Donald Devoe1
of Sylvania's Applied Research Laboratory has recently conducted several experiments which indicate
the contrary. Although handprinting of capitals and
numerals is about five times slower than a skilled typist
copying prose, the former compares favorably with the
rate for untrained typists (i.e., about 60 characters/
minute). In a task of making geometrical measurements
on a drawing and recording this data in a table, Devoe
found that handprinting required only about two-thirds
of the time required using a keyboard. This difference
was still evident with his subjects after six days of
practice. Hence it may be anticipated for such application areas as computer-assisted instruction,2 input of
mathematical, logical and chemical formulas in canonical forms, input and manipulation of matrices, program
debugging,3 specifying and designing systems by means
of flowchart symbols, and two-dimensional game
playing, that hand printing will not only be desirable
to users, but also an efficient means of computer entry.
* This research was supported by the Advanced Research Projects
Agency of the Department of Defense under Contract No. SD185.
Research in hand-printed symbol recognition has
been evident in the technical journals for more than
a decade. The reader will soon discover that most
symbol recognition literature is concerned with hard
copy or off-line input. Typically, an optical scanner is
used to obtain a two-dimensional array of points from
a completed hand-printed character. The major effort
of many researchers has been the exploration of unique
methods of preprocessing or feature extraction to reduce the dimensionality of this raw data.4 Others have
placed greater relative emphasis on classification techniques and on the selection of features from a feature
set or pool.5 More recently several workers, including
Duda and Hart,6 have made use of context to improve
recognition performance.
The electronic tablets used to obtain on-line source
data provide a nearly exact trace of the path of the
writing instrument and the order of the composite
strokes used to inscribe a symbol. This time-sequence
information is a great boon to machine recognition,
but cannot be obtained by scanning off-line source
images. For example, many individuals make 5's
which look almost identical to their S's. However, an
on-line recognizer will have no difficulty in distinguishing between this pair if the former is made with two
strokes while the latter with only one. Similarly a
lower case b and a numeral 6 are readily distinguished
if their loops are inscribed in opposite directions of
rotation.
Although on-line recognition systems have the advantage of low noise input data with higher information
content, a number of challenges face the designer. He
may desire a recognizer program which is invariant
to size and position of the input symbol, has automatic
means for detecting when a symbol is completed, is
relatively insensitive to minor perturbations from ideal
symbol shapes, has sufficient resolution to accommodate
the wide range of symbols used in languages and the
professions, is easily trained to the writing style of an
individual, and requires a minimum amount
of memory space and computation time.
The author's on-line recognizer has been implemented on hardware typical of that found in a modern
computer graphics environment. The components are
shown in Figure 1 and include a time-shared computer,
a CRT display, and a Rand Tablet.7 When the pen
switch is closed by pressing the stylus on the tablet
surface, the sequence of filtered pen track coordinates,
along with control bits to indicate the end of strokes,
is temporarily stored in a buffer. In order to permit
immediate display of the "ink trace," this operation is
performed in a PDP-5 peripheral processor. The
PDP-5 and the display control use a common core
memory. Communication with the Berkeley Time-Sharing System8 (TSS) is by means of a half duplex
link with a capacity of approximately 50K bits per
second. The TSS schedules the user's dictionary building
or recognition routines and has access to the PDP-5
memory. The user interacts with and controls these
routines using either a teletype or by pointing to light
buttons on the display. The recognition routine operates
on the track coordinate data to determine when an
input symbol has been completed and whether the
symbol closely matches any of those previously stored
in the user's dictionary. The output consists of an
identification code, and data on the size and location
of the recognized symbol.
Figure 1-Hardware used in recognition research (display control; PDP-5 with 4K 12-bit core; link; SDS-940 with 32K 24-bit, 1.75 µs core; tablet control; TTY)
BACKGROUND
Two recognition algorithms developed by other researchers will be partially described in the next few
paragraphs. (Additional background in on-line recognition techniques is contained in the dissertation from
which this paper is abstracted.9) The purpose of including this material is not to make a thorough comparison or evaluation, but simply to point out several
limitations in their methods which led to a search for
the techniques described in the body of this report.
G. F. Groner,10 of the Rand Corporation, has developed an on-line recognizer which has successfully
been applied to a larger system for creating, editing,
and executing computer code and flow charts. Strokes
are identified via a data-dependent sequence of tests
determined by the system designer. The first four
stylus directions are used to divide the strokes into
groups. Further tests depend upon the particular subset of strokes and are chosen from the following: the
number and/or relative position of corners, the relative position of starting and ending points, the number
and/or positions of relative maxima and minima in y,
and the fifth and succeeding stylus directions. The
recognition of multiple-stroke symbols is based on correctly classifying the constituent strokes and their
spatial relationships.
The Rand recognizer has several limitations. It cannot conveniently be modified for individual printing
styles. Adding or deleting symbols is complicated because these operations frequently require changes in
the tests used on resident symbols. The selection of
features and the ordering of tests are based on an
intuitive analysis of data obtained from a subset of
potential users. There does not appear to be any convenient way of optimizing this design procedure.
M. I. Bernstein and T. G. Williams,11 of the System
Development Corporation, have recently described
an on-line recognizer in which each user of the system
may build a dictionary of the symbols he desires for
his particular application. Strokes are divided into
segments if they contain corners. Segments with a
large or small aspect ratio are coded as vertical or
horizontal lines respectively. Otherwise the segment is
circumscribed with a minimum rectangle divided into
the five sub-areas shown in Figure 2. The path of the
segment is now retraced and each time a boundary
is crossed, the number of the newly entered sub-area
is added to a string to form an "area-sequence signature." In addition to the segment signature, the dictionary entries specify the geometrical relationship
between the component segments and strokes of symbols. The distance between the center of each successor
segment with respect to the center of the collection of
its predecessors is quantized as coincident (C), proximate (P), or far (F). If the successor segment or stroke
is proximate or far, the direction of its center with
respect to the center of the collection of predecessors
is quantized to one of eight sectors.
The SDC system requires an exact match of segment
signatures and their spatial data for recognition so that
a user's dictionary should contain all of the variations
anticipated. As an example, Figure 2 shows that the
first stroke of a numeral 4 could have three different
area-sequence signatures. For each of these the second
stroke could be in any of the four spatial positions
shown. It is very unlikely that a particular user would
produce all of the twelve possible combinations, but
half this number is likely. Mr. Bernstein has indicated
that on the average he requires three or four dictionary
entries per symbol and that certain symbols require two
or three times this number.
Figure 2-Multiplicity of dictionary entries (area-sequence signatures possible for the first stroke of a 4; spatial variations possible for the second stroke)
Although multiplicity of dictionary entries may not
be a serious limitation of the SDC recognizer, it seemed
desirable to this author to find a symbol descriptor and
recognition technique which would permit a high
recognition rate but require fewer dictionary entries
per symbol. Two concepts are used to obtain this goal.
On the one hand it does not seem necessary or desirable
to require a rigid geometrical relationship between the
component strokes or segments of a symbol unless this
information is needed for classification. If the numeral
4 is the only symbol which is generated using a two-stroke sequence similar to L followed by |, then there is
no need to require any particular spatial relationship
between the strokes. It follows, however, that some
sort of automatic procedure is needed to determine
which spatial information in a large set of symbols is
redundant. A second way to reduce the number of
dictionary entries is to devise a segment signature
scheme which lends itself to the use of best-match
techniques. The idea is to compute the degree of
similarity of an input segment with a set of prototypes
and choose the closest match. With this capability it
should not be necessary to store combinations of
moderately distorted segments, but only nominal
shapes.
CVS signature and Lee metric
General description
In both of the above-mentioned recognition methods,
an input symbol is classified on the basis of a number
of discrete decisions. As a general principle it seems
preferable to retain full information at each intermediate stage in the symbol recognition process.12 Stated
another way, it is desirable to have a smooth transformation between data spaces. A segment descriptor
can be thought of as performing a mathematical transformation on the sequence of pen track coordinates.
The idea of a smooth transformation is analogous to
that of a continuous transformation in the mathematical sense.
The argument for the principle of smooth transformations can be made by an example. Consider as two
segment classes the numeral 1 and the right angle L. As
the lower half of an ideal 1 is rotated counterclockwise
the generated symbol will pass through a transition
region where the probability of its being in class L increases and the probability of its being in class 1 decreases. A good segment descriptor and classification
method should reflect this continuous change. In the
case of handwritten symbols, it is also desirable to have
a feature space which is invariant to symbol size and
position.
The author of this paper has conceived and tested
a segment descriptor and metric which attains the
goals mentioned in the previous paragraph. This new
method employs what will be called the contour vector
sequence (CVS) and has some similarity to an encoding
scheme described by Freeman.13 In Freeman's method
a square mesh is superimposed on the arbitrary curve
to be encoded. Mesh nodes lying closest to the intersections between the curve and the mesh define a
straight-line approximation to the given curve. The
scheme is illustrated for two symbols in Figure 3. Successive nodes can only be one of eight, so the resulting
encoding is a sequence of octal digits. The number of
elements in the chain is directly proportional to the
length of the curve. In a subsequent paper Feder and
Freeman14 use this encoding technique to fit a given
curve to a similarly-shaped section of a larger curve.
However the method is size variant and cannot be used
for measuring the degree of similarity between two
arbitrary segments.
In the author's CVS encoding scheme the contour
of a segment is subdivided into six nearly equal length
arcs which are approximated by their associated chords.
Each of the chords is quantized to a vector having one
of eight possible directions. Hence the resulting signature is a vector CVS = s1 s2 s3 s4 s5 s6 of six components,
where each component takes a slope value between
zero and seven. (See Figure 4.)
The degree of dissimilarity between two segments
is obtained by summing the absolute rotational difference, expressed in angular units of π/4 radians, between corresponding components of the associated
contour vector sequences. This distance measurement
is equivalent to the Lee metric used in coding theory.15
Specifically, if segment A has

$$\mathrm{CVS}_A = a_1 a_2 a_3 a_4 a_5 a_6 \tag{1}$$

and segment B has

$$\mathrm{CVS}_B = b_1 b_2 b_3 b_4 b_5 b_6 \tag{2}$$

then the Lee distance (which will also be called the mismatch)
between the segments A and B is given by

$$D_L(A, B) = MM_{A-B} = \sum_{i=1}^{6} |c_i| \tag{3}$$

where

$$c_i = |a_i - b_i| \tag{4}$$

$$|c_i| = 8 - c_i, \quad \text{for } 5 \le c_i \le 7 \tag{5}$$

$$|c_i| = c_i, \quad \text{otherwise} \tag{6}$$

As $|c_i|$ cannot exceed 4 angular units, the maximum Lee
distance between two segments is 24.
Figure 3-Freeman code (octal encoding of adjacent mesh points)
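The mismatch calculation can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation; the function names are assumptions, and the signatures 557133 (alpha), 531702 (delta) and 457134 (input alpha) are the values that appear in Figure 4.

```python
def rotational_difference(a, b):
    """Absolute rotational difference between two directions, in units of pi/4."""
    c = abs(a - b)
    return 8 - c if c >= 5 else c


def lee_distance(cvs_a, cvs_b):
    """Lee distance (mismatch) between two contour vector sequences."""
    if len(cvs_a) != len(cvs_b):
        raise ValueError("signatures must have the same number of components")
    return sum(rotational_difference(a, b) for a, b in zip(cvs_a, cvs_b))


def nearest_prototype(cvs, prototypes):
    """Name of the stored prototype with the lowest mismatch to the input."""
    return min(prototypes, key=lambda name: lee_distance(cvs, prototypes[name]))


def parse(signature):
    """Turn a string such as '557133' into a list of direction codes."""
    return [int(ch) for ch in signature]


PROTOTYPES = {"alpha": parse("557133"), "delta": parse("531702")}
INPUT_ALPHA = parse("457134")
best = nearest_prototype(INPUT_ALPHA, PROTOTYPES)
```

With these values the alpha and delta prototypes are ten units apart, while the input lies two units from alpha and twelve from delta, so the nearest-prototype choice is alpha.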
Figure 4 illustrates the contour vector sequences for
an alpha and a delta, and calculates a Lee distance of
ten between these two symbols. The figure also shows
a mismatch of only two between somewhat different
alpha segments. The latter is an example of the smooth
transformation between the pattern space and the CVS
feature space. Data obtained from an experimental
recognition program has shown that similar segments
are mapped into points in the feature space which are
close together in terms of the Lee metric. This clustering
of segments which look alike to humans makes it
possible to store only nominal segments and use the
metric to recognize non-ideal segments on a nearest
prototype basis. The property also can be used to advise
a person that certain symbol pairs which he creates are
very "close" to each other and may give trouble in
either human or machine recognition.
Figure 4-CVS encoding and mismatch calculation (stored prototypes: CVS_α = 557133 and CVS_δ = 531702, with |c_i| = 0 2 2 2 3 1; an input α with CVS = 457134 gives D_L = 12 against δ and D_L = 2 against α, the lowest mismatch)
Choice of six component CVS
Several factors contributed to the choice of six components in the CVS. In order to reduce storage requirements for a user's dictionary, it is desirable to use
as few components as possible. On the other hand the
CVS must provide sufficient resolution to distinguish
between classes in a large set of symbols. Experiments
were conducted with a variety of symbol shapes in
order to obtain a compromise between these two goals.
As a minimal requirement it was felt that an on-line
recognizer should accommodate handprinting of the
teletype symbols shown in Figure 5. If strokes containing cusps (such as the 3 and 9) are subdivided into
less intricate segments, this set of symbols can be
conveniently printed using the 28 segments shown in
Figure 6. As many of these prototype segments are
symmetric about an axis, it appears desirable to have
an even number of components in the CVS. Figure 6
gives visual evidence that very little shape information
is lost if these segments are approximated with six
components. This number of components also provides
a minimum Lee distance of four between any pair of segments. The symbol pairs having this lowest mismatch
are 1 - f, c - <, S - f, and U - V.
Figure 5-Hand printed teletype symbols
Although a contour vector sequence having four
components probably would be sufficient for many applications involving a small number of symbols, six
components are needed to distinguish between the
symbol pair S - f of Figure 7. The figure also shows
that only four components provide a rather poor
straight line approximation of intricate curves such
as the theta or the lower case e. A final factor affecting
the choice of a six component CVS for further experimental investigations was the 24 bit word length of
the computer, leaving six additional bits for other
kinds of segment data.
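The word-length argument can be made concrete with a short sketch. Six components of three bits each occupy 18 bits, so a 24-bit word has six bits to spare. The bit layout below (first component in the most significant position) and the function names are assumptions made for illustration; the paper does not specify how the word is laid out.

```python
def pack_cvs(components):
    """Pack six direction codes (0..7) into one integer, three bits per component."""
    if len(components) != 6 or any(not 0 <= c <= 7 for c in components):
        raise ValueError("expected six components in the range 0..7")
    word = 0
    for c in components:
        word = (word << 3) | c
    return word


def unpack_cvs(word):
    """Recover the six direction codes from the low 18 bits of a packed word."""
    return [(word >> shift) & 0b111 for shift in range(15, -1, -3)]


packed = pack_cvs([5, 5, 7, 1, 3, 3])
spare_bits = 24 - 6 * 3
```

With this layout the packed value reads as the signature itself when written in octal.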
Figure 6-Segments used to print teletype symbols
Figure 7-Four and six component CVS's
Computational algorithms
The computation of the contour vector sequence
begins with a pre-processing operation on the raw pen
data. The Rand Tablet (see Figure 1) has a resolution
of 0.01 inches and is sampled every seven milliseconds
to obtain the location of the stylus. To reduce redundancy and filter out spurious noise from the tablet, the
PDP-5 accepts a new coordinate $(x_j, y_j)$ only if the
following three conditions are satisfied:
where $(x_i, y_i)$ is the last coordinate accepted, K2 defines
an inner window, and K1 defines an outer window.
K1 and K2 are preset to ten and three respectively,
but may be changed from the teletype using the command SET PARAMETERS. Owing to the high
sampling rate, a new point is stored whenever the x or
y coordinate changes 0.03 inches from the previously
accepted point. This amount of resolution has been
found sufficient for subsequent computations on 1/4
inch high symbols, but K2 may be increased for larger
symbols.
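The exact acceptance conditions are not legible in this copy, so the following Python sketch rests on an assumed reading of the two windows: a sample is kept when it has moved at least K2 tablet units in x or y from the last accepted point (it lies outside the inner window) and no more than K1 units in either coordinate (it lies inside the outer window). The function name and the conditions themselves are assumptions, not the author's stated rules.

```python
def filter_track(samples, k1=10, k2=3):
    """Keep samples that leave the inner window but stay within the outer window.

    Coordinates are in tablet units of 0.01 inch. The window tests are an
    assumed interpretation of K1 (outer) and K2 (inner), not the paper's text.
    """
    accepted = []
    for x, y in samples:
        if not accepted:
            accepted.append((x, y))  # the first sample of a stroke is always kept
            continue
        xi, yi = accepted[-1]
        dx, dy = abs(x - xi), abs(y - yi)
        outside_inner = dx >= k2 or dy >= k2   # moved far enough to be new data
        inside_outer = dx <= k1 and dy <= k1   # not a spurious jump
        if outside_inner and inside_outer:
            accepted.append((x, y))
    return accepted


track = filter_track([(0, 0), (1, 0), (2, 1), (3, 1), (50, 50), (4, 2), (6, 2)])
```

Under these assumptions the slow-moving samples are thinned to points roughly 0.03 inch apart and the isolated jump to (50, 50) is discarded as noise.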
From the above mentioned filtering process the
contour of a segment is represented by a sequence of
nearly equally spaced x-y coordinates. These points are
used to obtain a six-chord approximation to the segment. The algorithm (see Figure 8) consists of dividing
the number of coordinate points less one by six, and
taking the quotient (Q) as the nominal distance between adjacent chord points $(z_i, z_{i+1})$. If the division
produces a remainder (R) it is distributed between the
chords. If R ≥ 3, an extra point is added when forming
each of the odd chords $z_0z_1$, $z_2z_3$, and $z_4z_5$. If R is equal
to one or four the last coordinate is removed, and if
R is equal to two or five the first is also neglected. These
rules are summarized in a table of Figure 8.
The final step in computing the CVS signature is a
quantization of the chord directions into one of eight
sectors. The first component of the CVS is computed
using the ΔY and ΔX associated with the chord $z_0z_1$. In
the example of Figure 8, tan 22.5° ≤ |ΔY/ΔX| < tan
67.5°, ΔX < 0, and ΔY < 0, indicating the direction 5.
A similar application of the Quantization Table to the
remaining chords results in a signature of 543175 for the
numeral 6 shown.
Figure 8-Algorithm for computing CVS (example with (number of coordinate points) - 1 = 20, giving Q = 3 and R = 2; the figure also contains the remainder table and the quantization table)
If the number of coordinate points is less than seven,
the CVS is computed in a different manner. Segments
having four to six points are assumed to be straight
lines and only the two end points are used. The
quantized slope of the chord between these points is
assigned to each CVS component. For example a short
mid bar in the letter F would have a 222222 signature.
If the number of coordinate points is in the range one
to three, the segment is assumed to be a dot and assigned a CVS of 000000. Although this signature is
also that of a vertical bar drawn bottom to top, the
latter is not commonly inscribed. However, the dot
could just as well be assigned any signature having a
large Lee distance with respect to the other segments
employed. By treating short segments in the manner
described, it is possible to utilize a larger K2 (inner
window) and thereby reduce storage requirements
and computational time.
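The chord selection, quantization, and short-segment rules described in this section can be sketched together in Python. This is an illustrative reading, not the author's PDP-5 or SDS-940 code. It assumes that y increases upward and that the direction codes run clockwise from 0 (up), which agrees with the examples in the text (000000 for an upward bar, 222222 for a left-to-right bar, and direction 5 for ΔX < 0, ΔY < 0). The treatment of ties on sector boundaries is likewise an assumption.

```python
import math

TAN_22_5 = math.tan(math.radians(22.5))
TAN_67_5 = math.tan(math.radians(67.5))


def quantize(dx, dy):
    """Quantize a chord direction into one of eight sectors (0 = up, clockwise)."""
    ratio = math.inf if dx == 0 else abs(dy / dx)
    if ratio >= TAN_67_5:                 # nearly vertical
        return 0 if dy > 0 else 4
    if ratio >= TAN_22_5:                 # diagonal
        if dx > 0:
            return 1 if dy > 0 else 3
        return 7 if dy > 0 else 5
    return 2 if dx > 0 else 6             # nearly horizontal


def chord_points(points):
    """Select the seven chord end points z0..z6 from seven or more coordinates."""
    q, r = divmod(len(points) - 1, 6)
    pts = list(points)
    if r in (1, 2, 4, 5):
        pts.pop()                         # remove the last coordinate
    if r in (2, 5):
        pts.pop(0)                        # also neglect the first coordinate
    extra = 1 if r >= 3 else 0            # extra point on chords z0z1, z2z3, z4z5
    indices = [0]
    for chord in range(6):
        indices.append(indices[-1] + q + (extra if chord % 2 == 0 else 0))
    return [pts[i] for i in indices]


def contour_vector_sequence(points):
    """Six-component CVS of a segment given its filtered coordinates."""
    n = len(points)
    if n == 0:
        raise ValueError("a segment needs at least one coordinate")
    if n <= 3:                            # dot
        return [0] * 6
    if n <= 6:                            # short straight line: end points only
        (x0, y0), (x1, y1) = points[0], points[-1]
        return [quantize(x1 - x0, y1 - y0)] * 6
    z = chord_points(points)
    return [quantize(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(z, z[1:])]


# An L-shaped segment: seven points straight down, then six more to the right.
l_shape = [(0, y) for y in range(6, -1, -1)] + [(x, 0) for x in range(1, 7)]
l_signature = contour_vector_sequence(l_shape)
```

For the L-shaped example the first three chords point down and the last three point right.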
The Lee distance between two segments is obtained
by summing the absolute rotational difference $|c_i|$
between corresponding components $(a_i, b_i)$ of the associated CVS's. As indicated by equation (5), $|c_i|$ cannot
always be obtained by simply computing the absolute
algebraic difference $c_i$ between $a_i$ and $b_i$. For example
two components with quantized directions 7 and 2 have
$c_i = 5$, but $|c_i| = 3$. Equation (5) also shows that
when $c_i$ is 6 or 7, the respective $|c_i|$'s are 2 and 1.
However the lower table in Figure 9 demonstrates that
when $5 \le c_i \le 7$, the correct $|c_i|$ is obtained from the
three least significant bits of the 2's complement of
$c_i$. This simple algorithm has been implemented with
standard machine instructions.
Figure 9-Algorithm for computing $|c_i|$ (upper table: $c_i = |a_i - b_i|$ and $|c_i|$ for all component pairs; lower table: $c_i$ = 5, 6, 7 in binary ...0101, ...0110, ...0111, with 2's complements ...1011, ...1010, ...1001)
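The two's complement shortcut can be checked with a short Python sketch. Python integers behave as two's complement values of unlimited width under bitwise operators, so masking the negated difference with binary 111 reproduces the three least significant bits described above. The function name is an illustrative assumption.

```python
def rotational_difference_bits(a, b):
    """Absolute rotational difference using the two's complement shortcut."""
    c = abs(a - b)
    if c >= 5:
        return (-c) & 0b111   # low three bits of the 2's complement of c
    return c


example = rotational_difference_bits(7, 2)
```

The shortcut is applied only when the difference is 5, 6 or 7; smaller differences are used unchanged.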
Dissecting strokes into segments
It is well known in mathematics that continuous
transformations depend upon well behaved functions.
If a function is not continuous and/or analytic, it may
be necessary to apply the transformation separately
to a piecewise approximation of the function. In an
analogous way the smooth transformation property
of the CVS signature and Lee metric is related to the
geometrical properties of the two dimensional entities
on which it operates. Discontinuities in the pattern
space are easily handled because in an on-line implementation the start and end of a stroke are reliably
indicated by a micro-switch in the writing stylus. Hence
it is possible to compute a separate CVS for each component stroke in a symbol. The manner in which the
Lee metric is applied to multi-stroke symbols is discussed in the next section.
Sharp corners or cusps in a stroke correspond to
points in which the derivative of a function is not defined, and can be troublesome to the CVS transformation. A lower case z and a numeral 3 will be used as
an example. Figure 10 shows that a slightly distorted
3 may have less mismatch with the prototype z than
with the prototype 3. However, the lower half of
the figure demonstrates that the smooth transformation property can be restored if each of the
symbols is separated at the cusp into two segments.
Now the upper left-cup common to both symbols has
the same prototype CVS, and classification depends
only upon the dissimilar lower parts. Experiments
have shown that if a large number of symbols in a user
set contain cusps, segmentation on cusps results in
a higher percentage of correct classifications.
Figure 10-Improvement obtained by segmenting (unsegmented case: input 3 with CVS 252466; prototype 3 with CVS 246246, D_L = 9; prototype z with CVS 253471, D_L = 5)
Several methods were investigated for dissecting
strokes into segments. The first technique tried depended upon detecting the relatively slow pen velocity in the vicinity of cusps. An inverse measurement
of pen velocity was obtained by counting the number of
tablet coordinates rejected in the pre-processing operation. Experimental data showed that the writing velocity differed between users and also for different symbols
inscribed by the same user. Relatively slow velocity
was observed for smooth portions of strokes as well as
at cusps. In general the results of segmentation on
velocity measurements were found to be unreliable.
The dissecting method used in the experimental
recognizer locates cusps using geometrical measurements and is insensitive to pen velocity. The algorithm
operates on the sequence of filtered coordinates which
approximate a stroke. Cusps are isolated when the included angle between three successive points is less
than a constant (normally set to 30°). Although cusps
are reliably detected, sharp corners may or may not
cause segmentation. Hence if a user desires to employ
such one stroke symbols as a narrow V or N, he may
need to include alternative descriptors in his dictionary.
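The geometrical cusp test, in which a stroke is split where the included angle between three successive filtered points falls below a threshold (normally 30°), can be sketched as follows. This is a minimal illustration under assumptions: the function names are hypothetical, the cusp point is shared by both resulting segments, and coincident points are treated as forming a straight angle.

```python
import math


def included_angle(p0, p1, p2):
    """Angle in degrees at p1 between the directions toward p0 and toward p2."""
    ax, ay = p0[0] - p1[0], p0[1] - p1[1]
    bx, by = p2[0] - p1[0], p2[1] - p1[1]
    na, nb = math.hypot(ax, ay), math.hypot(bx, by)
    if na == 0 or nb == 0:
        return 180.0                      # coincident points: treat as no cusp
    cosine = max(-1.0, min(1.0, (ax * bx + ay * by) / (na * nb)))
    return math.degrees(math.acos(cosine))


def split_at_cusps(points, threshold_deg=30.0):
    """Dissect a stroke into segments at points whose included angle is small."""
    segments, start = [], 0
    for i in range(1, len(points) - 1):
        if included_angle(points[i - 1], points[i], points[i + 1]) < threshold_deg:
            segments.append(points[start:i + 1])   # cusp ends this segment
            start = i                              # and begins the next one
    segments.append(points[start:])
    return segments


# A stroke that runs right and then doubles back sharply at (3, 0).
hairpin = [(0, 0), (1, 0), (2, 0), (3, 0), (2, 0.2), (1, 0.4), (0, 0.6)]
pieces = split_at_cusps(hairpin)
```

A right-angle corner has an included angle of 90° and is therefore not split by this test, which is consistent with the remark that sharp corners may or may not cause segmentation.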
Experimental symbol recognizer
Multiple stroke symbols
A symbol may be composed of several strokes and
one symbol can be a subset of others. Consequently
an on-line recognition program must provide some
means to detect symbol completion. One possible
technique divides the writing surface into a grid and
each symbol must start in a new space. This constraint may be acceptable when the data to be entered
is in tabular form, but the technique is unsuitable for
randomly placed symbols of varying sizes. A second
method makes use of a tree-structured dictionary.10,11
After a particular stroke has been classified, the dictionary is referred to for a list of permissible successor
strokes. If the next stroke does not have an allowable
identity and/or geometrical position, it is assumed to
be associated with a new symbol. This technique has
the advantage of reducing dictionary search time, a
desirable feature when there are a number of entries
for each symbol type. However, a poorly inscribed or
positioned stroke may not correspond to an allowable
successor and can abort the recognition process. In
many of these abort instances the complete symbol
contains sufficient information for correct classification.
In the author's recognition system classification is
obtained from the best match of the complete input
symbol with dictionary entries having the same number
of strokes and segments. No attempt is made to identify
the component segments of a multi-segment symbol.
Hence it was necessary to devise a symbol completion
algorithm which operates independently of the recognition process. The basic technique is to center a
stroke or a subsymbol in a somewhat larger rectangle.
If the next stroke does not enter this rectangle it is
assumed to start a new symbol. This procedure automatically adjusts to varying symbol sizes.
The precise dimensions of the enclosing rectangle
are determined by the aspect ratio of the stroke, and in
some cases the predecessor stroke is also a factor. If
the height-to-width ratio R of the initial stroke is
in the range of 1/3 to 3, the stroke is bordered on
the top and bottom by m/2 and on the sides by m/4,
where m is the maximum of the stroke width or height.
If the second stroke enters this rectangle, the combined
strokes are enclosed with the minimum rectangle plus
the m/2 and m/4 borders, where m is now the maximum
dimension of the symbol or subsymbol. The procedure
is repeated until a new stroke fails to enter the rectangle. (See example at the top of Figure 11.)
When the initial stroke is tall and narrow with R > 3,
two different enclosing rectangles are employed.
Rectangle A is of the previously mentioned type and
borders the top and bottom of the stroke by m/2 and
the sides by m/4. If the 2nd stroke enters this rather
narrow box, the two strokes are assumed to belong to
the same symbol or subsymbol and the A-test is
repeated. Second strokes which do not enter the A-rectangle but have an R > 3 are tested to see if they
enter the B-rectangle. The latter is actually a square
of dimension 2m. When the B-test has a positive result,
the first two strokes are enclosed with an A-rectangle.
If the 3rd stroke is completely within this rectangle,
all three strokes are assumed to belong to the same
symbol or subsymbol and the A-test is continued.
However, if the 3rd stroke is not enclosed by the A-rectangle, the first stroke is assumed to be a complete
symbol and the 3rd stroke is treated as a possible second stroke for the next symbol. As shown in the middle of Figure 11, this feature of the symbol separation
algorithm permits 1's to be more closely spaced than
the vertical bars of an H. When the initial stroke is
wide and short with R < 1/3, the B-test is applied to
the second stroke. If the result is positive, the A-test
is applied to subsequent strokes. (See example at the
bottom of Figure 11.)
In addition to satisfying one of the spatial tests which
have been mentioned, a component stroke of a symbol
must be made within an interval of time (prescribed
by the user) after the previous stroke. In this manner
the program can detect the completion of an isolated
symbol or the last symbol in a string.
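The A-test described above can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the author's program: strokes are reduced to hypothetical bounding boxes, a stroke is taken to "enter" a rectangle when its bounding box overlaps it, and the B-test, the R thresholds, and the time limit are left out.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box of a stroke or subsymbol (y grows upward)."""
    left: float
    bottom: float
    right: float
    top: float

    @property
    def width(self):
        return self.right - self.left

    @property
    def height(self):
        return self.top - self.bottom


def a_rectangle(box):
    """Border the box by m/2 above and below and by m/4 on each side."""
    m = max(box.width, box.height)
    return Box(box.left - m / 4, box.bottom - m / 2,
               box.right + m / 4, box.top + m / 2)


def enters(stroke, rect):
    """Assumption: a stroke enters a rectangle if the two boxes overlap."""
    return (stroke.left <= rect.right and stroke.right >= rect.left
            and stroke.bottom <= rect.top and stroke.top >= rect.bottom)


def merge(a, b):
    """Minimum rectangle enclosing both boxes."""
    return Box(min(a.left, b.left), min(a.bottom, b.bottom),
               max(a.right, b.right), max(a.top, b.top))


def split_symbols(strokes):
    """Group strokes into symbols, applying the A-test to each new stroke."""
    symbols, members, current = [], [], None
    for stroke in strokes:
        if current is not None and enters(stroke, a_rectangle(current)):
            current = merge(current, stroke)
            members.append(stroke)
        else:
            if members:
                symbols.append(members)
            current, members = stroke, [stroke]
    if members:
        symbols.append(members)
    return symbols


strokes = [Box(0, 0, 10, 10), Box(11, 0, 12, 10), Box(30, 0, 40, 10)]
groups = split_symbols(strokes)
```

Because m is recomputed from the merged box after every accepted stroke, the test rectangle grows with the symbol, which is how the procedure adjusts to varying symbol sizes.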
[Figure 11 graphic not recoverable. Legible panel labels: "A-test for stroke 2", "A-test for stroke 3", "A-test for stroke 4", "A-B tests for stroke 2", "B-test for stroke 2", "R>3".]
Segment spatial data
[Figure 13 graphic not recoverable.]
Figure 13-Test symbol set
[Figure 14 printout not recoverable. It lists each output code beside its eight-digit octal segment descriptors.]
Figure 14-Dictionary entries
for 85 different symbols. The set includes the ten numerals, the upper and lower case alphabetical characters, and 23 common teletype symbols. Lower case
letters which are normally printed the same as upper
case were changed to a cursive style. This was necessary because the experimental recognizer does not
make use of relative size information.
A printout of the dictionary entries created by the
author is shown in Figure 14. The output code selected for lower case alphabetics consists of the corresponding upper case symbol preceded by an ampersand. Each segment is represented by eight octal digits
and the six on the right are used for the CVS. The 2nd
digit from the left contains numbers 1 through 5
to indicate the location of the geometric center of the
segment in a rectangle enclosing the complete symbol
(See Figure 12). This geometrical relationship is required during recognition if the user has set the most
significant bit of the first digit on the left. A "1" in the
least significant bit of the 1st digit indicates that the segment continues a stroke and this information is used
to partition the dictionary into subsets of symbols
having the same number of strokes and segments.
As the middle bit is currently unused, the first digit
can be only a 0, 1, 4, or 5.
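The layout of a segment descriptor can be made concrete with a short Python sketch. The function name and the field names in the returned dictionary are hypothetical; only the digit and bit assignments come from the description above.

```python
def decode_segment(entry):
    """Split one eight-digit octal segment descriptor into its fields."""
    if len(entry) != 8 or any(c not in "01234567" for c in entry):
        raise ValueError("expected eight octal digits")
    flags = int(entry[0], 8)
    return {
        # Most significant bit of the first digit: spatial bit set by the user.
        "spatial_bit": bool(flags & 0b100),
        # Least significant bit of the first digit: segment continues a stroke.
        "continues_stroke": bool(flags & 0b001),
        # Second digit: location (1 through 5) of the segment's geometric center.
        "location": int(entry[1]),
        # Remaining six digits: the CVS.
        "cvs": [int(c) for c in entry[2:]],
    }


fields = decode_segment("41222222")
```

With the middle bit unused, the two flags account for the four permitted first digits: 0 (neither set), 1 (continuation), 4 (spatial bit), and 5 (both).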
The TROUBLESHOOT parameter was set to list all
symbol pairs having a mismatch of four or less. Figure
15 shows that the pairs C-<, L-f, U-V, and V-W
each had a Lee distance of exactly four. The author decided to accept this level of mismatch for single segment symbols unless subsequent tests suggested a
change of form. Figure 15 also lists ten pairs of multisegment symbols having a total Lee distance of four
or less. Seven of these pairs were made more robust
by taking advantage of reliable differences in the relative position of component segments. Only one spatial
bit was set in each symbol, thus allowing sloppy positioning of segments not needed for distinction. For
example the second stroke of the f-T was chosen because it was felt that the horizontal bars of these two
symbols would always be coincident and up respectively. However the first stroke of the f might at times
be coincident and would then provide no spatial difference with the T. The SPATIAL DATA mode is
called by typing SP and the number in the dictionary
list of the desired symbol. A routine then automatically
requests a no (N) or yes (Y) decision to set the spatial
bit for each segment of the symbol.
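A listing of the TROUBLESHOOT kind can be sketched as follows. The sketch assumes the usual definition of the Lee distance, taken modulo 8 and summed over the six CVS digits; the CVS values and the function names are made up for illustration and are not the author's dictionary or program.

```python
from itertools import combinations


def lee_distance(a, b, q=8):
    """Sum over digit positions of the shorter way around a circle of q values."""
    return sum(min(abs(x - y), q - abs(x - y)) for x, y in zip(a, b))


def troubleshoot(dictionary, threshold=4):
    """List every pair of single-segment entries whose mismatch is <= threshold."""
    close_pairs = []
    for (name1, cvs1), (name2, cvs2) in combinations(dictionary.items(), 2):
        mismatch = lee_distance(cvs1, cvs2)
        if mismatch <= threshold:
            close_pairs.append((name1, name2, mismatch))
    return close_pairs


# Hypothetical CVS digits for three single-segment symbols.
sample = {
    "C": [6, 5, 4, 2, 1, 0],
    "U": [4, 4, 3, 1, 0, 0],
    "V": [3, 3, 3, 1, 1, 1],
}
close = troubleshoot(sample)
```

Pairs reported this way are the candidates for a change of form or, in multisegment symbols, for setting a spatial bit.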
Although the manual setting of spatial bits is greatly
facilitated by the TROUBLESHOOT mode, the
procedure does require familiarity with the fundamental
principles of the recognizer. The task could be accomplished automatically, but would require the user to
provide additional input samples. The F-I pair in the
upper part of Figure 16 will be used as an example. Assume that additional training samples produce the
spatial variations shown in the middle of the figure.
The spatial bit routine would determine from all five
samples that the third stroke provides reliable distinction, and that two dictionary entries are required for the
F. (X means that the spatial bit is not set.) In contrast
to other training methods, the user is required to provide samples only for symbol pairs having low mismatch. The minimal use of spatial information results
in a recognizer which is very tolerant of inaccurate
positioning of the component strokes of most symbols.
[Figure 15 printout not recoverable. It lists symbol pairs with their segment descriptors.]
Figure 15-TROUBLESHOOT list
[Figure 16 graphic not recoverable. Legible panel labels: "Original dictionary entries", "Spatial variations provided by user", "Final dictionary entries".]
Figure 16-Automatic setting of spatial bits
[Figure 17 printout not recoverable. Each inscribed symbol is listed with segment descriptors and a MIS-MATCH value.]
Figure 17-TEST on upper-case letters
A TEST mode allows the user to further evaluate
his dictionary. He simply draws a sequence of symbols
which are separated from each other by at least one
quarter of the maximum symbol dimension. The
symbol separation algorithm determines when a symbol
has been completed, and the recognition routine guesses
the identity of the symbol on the basis of lowest mismatch. Mismatch calculations are made between the
input symbol and all dictionary entries having the
same number of strokes and segments. If two or more
symbols have the lowest mismatch, the first one encountered in the dictionary search is chosen. Dictionary entries in which a spatial bit is set require a
specific location for the corresponding input segment.
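The selection rule of the TEST mode can be sketched in Python as below. The sketch is a simplification under stated assumptions: entries are compared only when they have the same number of segments (stroke counts and spatial bits are ignored), the Lee distance is taken modulo 8, and all names and CVS values are hypothetical.

```python
def lee_distance(a, b, q=8):
    """Lee distance between two equal-length sequences of octal digits."""
    return sum(min(abs(x - y), q - abs(x - y)) for x, y in zip(a, b))


def classify(input_segments, dictionary):
    """Return (name, mismatch) of the closest entry with the same segment count.

    dictionary is a list of (name, segments) pairs in search order; each
    segment is a list of six CVS digits.
    """
    best_name, best_mismatch = None, None
    for name, segments in dictionary:
        if len(segments) != len(input_segments):
            continue
        mismatch = sum(lee_distance(s, t)
                       for s, t in zip(segments, input_segments))
        # The strict '<' keeps the first entry encountered when mismatches tie.
        if best_mismatch is None or mismatch < best_mismatch:
            best_name, best_mismatch = name, mismatch
    return best_name, best_mismatch


dictionary = [
    ("X", [[2, 2, 2, 2, 2, 2]]),
    ("Y", [[2, 2, 2, 2, 2, 4]]),
    ("Z", [[0, 0, 0, 0, 0, 0], [4, 4, 4, 4, 4, 4]]),
]
one_segment = classify([[2, 2, 2, 2, 2, 3]], dictionary)
two_segment = classify([[0, 0, 0, 0, 0, 1], [4, 4, 4, 4, 4, 4]], dictionary)
```

In the first call X and Y both have a mismatch of 1, so X is chosen because it comes first in the search order.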
[Figure 18 printout not recoverable. Each inscribed symbol is listed with segment descriptors and a MIS-MATCH value.]
Figure 18-TEST on lower-case letters
Figures 17 and 18 show some test results obtained
on the author's symbol set. The displayed or teletype
output from the TEST mode includes the dictionary
entry guessed, the segment descriptors for the input
symbol, and the mismatch between the former and the
latter. In this particular test the phrase "the quick
fox jumps over the lazy brown dog" was inscribed in
upper and lower case letters. Out of a total of 70 inscribed symbols, the only error was a lower case x which
misread as an upper case X.
CONCLUSION
The author's dictionary entries (see Figure 13) were
also used to classify the distorted one and two stroke
letters of Figure 19. Except for a T which misread as
a t, all of the characters in this figure were classified
correctly (the amount of mismatch is shown below each
symbol). Although previously developed on-line recognition schemes also are capable of recognizing distorted symbols, they require the user to provide a
large number of training samples. The nearest prototype technique described in this paper performs the
task with a single dictionary entry per symbol.
The symbol recognizer has been used by many different people and all have found it enjoyable to operate.
In one experiment three subjects were asked to construct personalized dictionaries consisting of the numerals, the upper case letters, and the lower case letters
which differed from upper case. Each of these persons
adjusted to using the tablet and CRT display within
15 minutes and then took about a minute to make each
dictionary entry. As the automatic means for setting
spatial bits has not been implemented, the subjects
were given brief instructions on the manual procedure.
The operation itself took about 15 minutes.
After their dictionaries were constructed the subjects were asked to write the complete alphabets and
the phrase "the quick fox jumps over the lazy brown
dog" in upper and lower case. From this test of 132
symbols a user typically had two to five misreads. With
additional experience and very slight refinements of
dictionaries, all subjects obtained recognition rates
in excess of 98 percent. An error rate of 5 percent is
generally considered acceptable in on-line systems,
because each character can be classified, displayed, and
corrected immediately by the writer if it is wrong.
The compiled program for the symbol recognizer
requires approximately 9K 24-bit words of memory.
On the average an additional 4 words are required for
each dictionary entry. Owing to the simplicity of the
mismatch calculations and the high speed of the SDS
940 computer, the recognizer can easily accommodate
normal writing rates of symbols from a set of 100.
The CVS signature and Lee metric is a fundamental
technique for measuring the similarity of two arbitrary
curves and can be applied to a wide spectrum of
pattern classification problems. The author is currently
investigating the usefulness of the method for machine
recognition of cursive writing in lower case letters.
Preliminary results from this research are contained
in Reference 9.
ACKNOWLEDGMENTS
The author wishes to express his gratitude to the many
people of Project Genie whose work has produced the
TSS used to implement this research. Chacko Neroth
and Ken Pier gave valuable assistance in early experimental work on the basic algorithms, Barry Borgeson
provided software for the tablet and display, and Bo
Lewendal wrote much of the program incorporated
in the experimental recognizer.
Support for this work has come from the Advanced
Research Projects Agency of the Office of Secretary
of Defense under Contract No. SD-185.
[Figure 19 graphic not recoverable.]
Figure 19-Intentionally distorted symbols
REFERENCES
1 D B DEVOE
Alternatives to handprinting in the manual entry of data
IEEE Trans of Human Factors in Electronics Vol 8 No 1
March 1967 21-32
2 T G WILLIAMS C H FRYE
An instruction application of computer graphics
Educational Tech 5-10 June 15 1968
3 G D HORNBUCKLE
The computer graphics user machine interface
IEEE Trans of Human Factors in Electronics Vol 8 No 1
March 1967 17-20
4 J H MUNSON
Experiments in the recognition of hand-printed text: Part I
character recognition
Proc FJCC 1968 1125-1138
5 G NAGY
State of the art in pattern recognition
Proc IEEE Vol 56 No 5 May 1968 836-862
6 R O DUDA P E HART
Experiments in the recognition of hand-printed text: Part II
context analysis
Proc FJCC 1968 1139-1149
7 M R DAVIS T O ELLIS
The RAND tablet: A man-machine graphical communication
device
Proc FJCC 1964 325-331
8 B W LAMPSON W W LICHTENBERGER
M W PIRTLE
A user machine in a time-sharing system
Proc IEEE Vol 54 No 12 Dec 1966 1766-1774
9 G M MILLER
On-line computer recognition of handwritten symbols
Elec Engrg Dept Univ of Wis 1969 PhD Dissertation
10 G F GRONER
Real-time recognition of hand-printed text
Proc FJCC 1966 591-601
11 M I BERNSTEIN T G WILLIAMS
A two-dimensional programming system
I F I P Congress Edinburgh Scotland Aug 5-10
1968 C84-C89
12 J H MUNSON
Some views on pattern-recognition methodology
Internat Conf of Methodologies of Pattern Recognition
Univ of Hawaii Honolulu Jan 24-26 1969
13 H FREEMAN
On the encoding of arbitrary geometric configurations
IEEE Trans of EC Vol 10 No 2 June 1961 260-268
14 J FEDER H FREEMAN
Digital curve matching using a contour correlation algorithm
Proc IEEE Int Conf March 1966 69-85
15 E R BERLEKAMP
Algebraic coding theory
McGraw-Hill Book Co 1968 Chapt 8 204-205
Common file organization techniques compared
by NED CHAPIN
InfoSci Inc.
Menlo Park, California
INTRODUCTION
In order to make a comparison of file organization
techniques, concurrence is needed on terminology. To
that end, this introduction offers some definition of
terms. Unfortunately, many of these terms do not
have universally accepted definitions. A general definition of terms can be found elsewhere. 6
In offering definitions of terms, this paper does not
suggest that those who give different definitions are
wrong. On the contrary, the differences in definition
that exist reflect in part imperfect communication
among people in the field, and in part real differences
in the concerns of the people in the field. Hopefully,
papers such as this one will help improve communication. But the differences in concern will continue
to exist, and to spawn both new differences and new
terms.
As used in this paper, the term "file organization"
is not synonymous with file structure, data structure,
data base, data organization, or data management. A
file organization is viewed as a way of putting together the components of a file. "File structure" is
viewed as synonymous with file organization, but is not
used in order to help distinguish it from "data structure." A "data structure" is a more general term than
file organization, since a file is viewed as but one general
organization of data. Some people use the term data
structure to refer only to vertical relationships among
data. "Data organization" is viewed as synonymous
with data structure. A data base is viewed here as a
group of files or alternatively as a controlled aggregation of data which can be regarded as organized into
files.
The term "data management" is used with a variety
of meanings in the field. Sometimes it is narrowly used
to refer to movement and formatting of data to and
from internal storage, and the supporting software.
Sometimes in a broader sense it also refers to the
identification of data and procedures to maintain the
integrity and security of the data. At other times, the
term is used also to refer to file organization. In a very
broad sense, it refers also to the maintenance of files,
the handling of inquiries, and the preparation of
reports.
These definitions raise questions about the definition
of the vertical and horizontal organization of data.
Looking first vertically, this paper views a file as an
arbitrary but usually homogeneous but not exhaustive
aggregation of records. Records are collections of data
all of which share some attribute in common, usually
the name of a thing the data are about. For example, a
record of employee job attendance might contain data
about number of days worked, number of days absent,
the usual work station, the parking lot location, the
home address, the home telephone, the usual days of
the week absent, and the like. When these data are
drawn together and grouped in terms of the identification of the employee (such as by employee identification number), the individual groupings thus formed
are here viewed as records. The components of the
record are data items (usually fields), as diagrammed in
Figure 1.
[Figure 1 graphic not recoverable. Labels legible in the source: OPERATIONS FILE, BASE, FILE, RECORD, ITEM.]
Figure 1-Condensed diagram of the vertical hierarchy
of data
The definition of a record implies no specific ordering
of the data items. The definition of the file implies
no ordering of the records within the file. By ordering
is meant the application of a collating sequence or
pattern template to data items at a uniform level in
the vertical hierarchy of data. When records are ordered,
the data items used for the ordering are referred
to here collectively as the key. For example, the records
in the attendance file just cited might be ordered using
an ascending numeric collating sequence with the
employee identification numbers serving as the key.
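As a small illustration of ordering records on a key, the following Python sketch sorts a hypothetical attendance file on the employee identification number. The field names and values are invented for the example.

```python
from operator import itemgetter

# An unordered file: each record is a collection of data items about one employee.
attendance_file = [
    {"employee_id": 3071, "days_worked": 212, "days_absent": 8},
    {"employee_id": 1204, "days_worked": 219, "days_absent": 1},
    {"employee_id": 2650, "days_worked": 205, "days_absent": 15},
]

# Apply an ascending numeric collating sequence with employee_id as the key.
ordered_file = sorted(attendance_file, key=itemgetter("employee_id"))
ordered_ids = [record["employee_id"] for record in ordered_file]
```

The records themselves are unchanged; only their sequence within the file differs.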
The horizontal organizations of data reflected in
this paper require definitions of table, tree, string,
and list. A "table" is a series of pairs of data items,
which are the argument and the function. The table by
its form permits the table user to establish by inference a relationship between a particular argument and
its associated function. A telephone book and a statement of tax rates are examples of tables.
Three important tables for the comparison of file
organizations are indexes, directories, and tables of
contents. An "index" has the arguments in a specific
order but the function, which may consist of multiple
data items, may be in any order. By contrast, a "table of
contents" cites the functions in a specific order but
leaves the arguments in any order. "Directories" may
have the arguments or functions or both ordered in
any manner. For this reason, the term directory serves
as a general term covering in practice both indexes
and tables of contents.
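The practical value of an index's ordered arguments can be shown with a short Python sketch. The table contents are hypothetical (names as arguments, record numbers as functions); because the arguments are kept in order, a binary search finds an entry without scanning the whole table.

```python
from bisect import bisect_left

# An index: argument/function pairs with the arguments in ascending order.
index = [("ADAMS", 17), ("BAKER", 3), ("CLARK", 42), ("DAVIS", 8)]
arguments = [argument for argument, _ in index]


def look_up(argument):
    """Return the function paired with the argument, or None if it is absent."""
    position = bisect_left(arguments, argument)
    if position < len(index) and index[position][0] == argument:
        return index[position][1]
    return None


found = look_up("CLARK")
missing = look_up("EVANS")
```

Note that the functions (17, 3, 42, 8) are in no particular order, as the definition of an index permits.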
[Figure 2 graphic not recoverable.]
Figure 2-A partial representation of a tree as a
horizontal organization for a file
A "tree" can be used to represent vertical relationships among data. 4 A tree may also be used for horizontal organization of data, as shown in Figure 2. For
example, data about a firm's operations might be broken
into divisions such as production, sales, engineering,
and the like. These divisions in turn can be broken
into subdivisions. For example, sales might be broken
into territories, and production into the product categories. Engineering might incorporate new product categories currently not in production, as well as those in
production. These categories can in turn be broken
still further. Thus in production they might be broken
by production equipment or in terms of a bill of material. In sales they might be broken down into products
or into salesmen. In summary, the term tree gets its
name from the graphic representation of the processes
of subdividing.
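The subdividing described above can be sketched as nested Python dictionaries. The division names follow the text; the particular categories and territories are invented for illustration.

```python
# Each key is a division; each value is the tree of its subdivisions.
operations = {
    "production": {"product category A": {}, "product category B": {}},
    "sales": {"territory east": {}, "territory west": {}},
    "engineering": {"product category A": {}, "new product category C": {}},
}


def leaf_paths(tree, prefix=()):
    """List the path from the root to every lowest-level subdivision."""
    paths = []
    for name, subtree in tree.items():
        path = prefix + (name,)
        if subtree:
            paths.extend(leaf_paths(subtree, path))
        else:
            paths.append(path)
    return paths


paths = leaf_paths(operations)
```

Each path records the successive subdivisions that lead to one category, which is the association a tree organization provides.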
By contrast, a string organization is viewed as a
series of things, one after the other, where the elements composing the series are similar. Examples of
strings are series of characters, of digits, of names,
or of numbers.
A "list" is viewed as a series of records or data
items each accompanied by one or more pointers to
other elements in the series. These pointers are here
termed "links" and are themselves data items. Some
people prefer the term "chain" to refer to a list.
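A list in this sense can be sketched in Python with records that each carry one link. The storage addresses (10, 20, 30) and the names are hypothetical.

```python
# Records stored at arbitrary addresses; each carries a link to its successor.
records = {
    10: {"name": "ADAMS", "link": 30},
    20: {"name": "CLARK", "link": None},  # None marks the end of the list
    30: {"name": "BAKER", "link": 20},
}
head = 10


def follow(records, head):
    """Visit the records in the sequence given by the links."""
    names = []
    address = head
    while address is not None:
        names.append(records[address]["name"])
        address = records[address]["link"]
    return names


sequence = follow(records, head)
```

The sequence ADAMS, BAKER, CLARK comes entirely from the links, not from where the records happen to be stored.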
Irrespective of vertical or horizontal aspects of the
file organization, a file may exhibit a simple or a
compound organization. A "simple" organization has
only one major structural pattern. A "compound"
organization has two or more distinct and different
structural patterns which taken together comprise the
file organization.
Classifications
A number of people in the field have proposed
classifications of file organization. A brief review of
some of these will serve as a basis for selecting one
for use in making comparison here.
A team headed by Anthony J. Dowkart has offered
an extensive basis for comparison. 9 In summary, this
basis is: the data definition provided, the facilities
for file creation and maintenance, the retrieval mechanism, the processing procedures, the output characteristics, and the operating environment. This basis of
classification is concerned not with file organization
alone, but also with data management in the broad
sense. Looking at the matter of file creation and maintenance, and of data definition, the classification bases
suggested are performance oriented, rather than
structure or pattern oriented.
Richard G. Canning has suggested classifying file
organization into two general classes based upon type
and upon structure. 3 Within type he proposes recognizing sequential, indexed, and chained files. Within
structure, he proposes recognizing linear, hierarchical,
and involute files. These classifications are more
structure and pattern oriented than those just cited,
but they lack a consistently applied, obvious basis.
Minker and Sable in reviewing data management
systems suggested a basis of classification as user
language, file structure, system processing capability,
and user interface. 13 This again shares the same general
user basis cited previously. Looking more particularly
at the basis identified as file structure, Minker and
Sable suggested classifying on the basis of the implementing storage media (such as tape or disk) and the
variety of field and record lengths permitted. Among
those that permit greater variety and which are disk
based, Minker and Sable suggested a classification of
indexed, tree-ordered, and linked, or chained. These
suggestions share many of the features of those of
Canning as noted earlier.
David Lefkovitz has suggested a classification of
file organization based upon a combination of the hardware and software components utilized to implement
the file. 12 These he viewed from a functional point of
view, particularly with regard to the retrieval process.
Thus a file organization may be classified on the basis
of which software-hardware components it utilizes and
in what way. For example, does it use a directory,
does it use a randomizing or a tree approach? If it
uses a tree approach, does it use a fixed length key or
a variable length key? And so on. Such a basis of classification results in a very large number of possible
classes. In a sense, each non-identical existing file
organization becomes a separate classification.
Ned Chapin has suggested a classification scheme
based fundamentally upon the way of indicating association at a given vertical level within a file. 4 At
one extreme he placed the attributed organization
which provides explicit identification with the data
at some given level. This obviates the necessity for
providing a means of association below this level.
At another extreme, he placed the linked or list organization, where each data element at a given level incorporates a specific indication of association. Two
varieties of this he singled out for particular attention:
the complex ring which is a complex list that forms
closed loops, and the double or multiple double-linked
list which provides two or more links. At another
extreme, he placed the hierarchical organization, which
provides a tree-like association on a horizontal basis.
Finally, at another extreme, he placed the positional
organization. This provides association in terms of
placement in relation to other data, at a given vertical
level. Thus, field A is always known to precede field
B, and field B is always known to precede field C,
and all three fields are always present in a record.
Hence, values from the third field position have a
known identification and association.
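The contrast between the positional and the attributed organization can be sketched in Python. This is one possible reading of the two definitions, with invented field names and values: in the positional record the identity of a value follows from its place, while in the attributed record each value carries its own identification.

```python
# Positional: field A always precedes B, B precedes C, and all are present.
FIELD_ORDER = ("employee_id", "days_worked", "days_absent")
positional_record = ("1204", "219", "1")

# Attributed: each data item carries explicit identification, in any order.
attributed_record = [
    ("days_absent", "1"),
    ("employee_id", "1204"),
    ("days_worked", "219"),
]


def from_positional(record):
    """Recover identities from position alone."""
    return dict(zip(FIELD_ORDER, record))


def from_attributed(record):
    """Recover identities from the identification stored with each item."""
    return dict(record)


same = from_positional(positional_record) == from_attributed(attributed_record)
```

The positional form stores less, since no identification accompanies the data, but it depends on every field being present in its fixed place.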
The Chapin classification utilizes an important
feature of the way people think about data, as its
basis for classification. As such, it avoids the mixed
base problems inherent in the other classification
schemes it reviewed, without the gaps or holes characteristic of the other systems.
This classification approach lends itself to a graphic
representation, as diagrammed in Figure 3. The diagram
uses time as the left to right distance, but not in strict
scale units. 7 The vertices or nodes are the identity of
data. The solid arcs or lines are the sequence of the
active (pointed to) data. Vertically, the diagram has
two parts, an upper or demand (D) part, and a lower
or supply (S) part. A perfect match of the file organization
to the demands upon it occurs when the data (indicated by broken lines) demanded and supplied occur
at the same time.
[Figure 3 graphic not recoverable. Legible labels: D, S, T.]
Figure 3-A graphic representation of associations
showing the ideal pattern for a file organization
Characteristics
The point is well taken that users by and large are
unconcerned with the classification of a particular file
organization technique. They are concerned with the
functional characteristics of the file organization technique in action. Some of these of course are hardware and software dependent. But within those bounds,
they are determined largely by the file organization
itself. Among the common characteristics are the speed
and basis of access, the use of storage capacity, the
ease of maintenance (for insertions, alterations, and
deletions), and the extent of software support available.
The speed and basis of access is fundamentally
affected by the association provided in the file organization because access uses the association for its realization. The hardware, the software, and the association together set the limits. The basis of access may
be by attribute, by value, or by property as has been
pointed out elsewhere. 4
The use of storage capacity reflects two aspects
of file organization, each of which in turn rests upon
the basis of association. One aspect is that compound
organizations commonly use more storage capacity than
do simple ones. Another is that hardware and software
factors also affect the use of storage, given the file
organization.
The procedures, the convenience, and the time required
for maintenance operations, such as insertion,
deletion, and alteration of data in a file, depend obviously upon the hardware and software used. But they
also depend importantly upon the association provided
by the file organization, since maintenance involves
access, but is more than access. Common maintenance practice is not always a corollary of the features
of the file organization.
The extent of software support available is a very
significant determinant of the degree to which people
are willing to use a file organization. Even if it be
theoretically attractive, a file organization unsupported
by software is in practice ignored in favor of anything
that is supported by debugged software.
Common techniques
Techniques covered
The most common file organization techniques are
those proselytized and supported with software by the
computer vendors. These are normally part of the
operating system and are accessible to anyone who
programs in the symbolic language for a particular
computer. Some of them are available to users of
higher level languages such as COBOL and FORTRAN.
Less commonly used are the file organization techniques supported by software available from the computer vendors but not provided normally as part of
the operating systems. These usually take the form of
"packages" capable of a variety of functions.
A third category comprises the file organization techniques
available in the software market from independent
suppliers of software. None of these are as common as
those available in the first category, but some are as
common as some in the second category.
For contrast, this paper looks also at the extensions
to COBOL proposed to CODASYL in the area of file
organization techniques.
Vendor supported techniques
Historically the oldest, the most popular, and by
far the most common, is the strict sequential file
organization. The strict sequential is a positional
organized file commonly consisting of ordered records
which are themselves positionally organized. 4,10 As such,
its use of storage is the most economical of all. It is a
simple, not a compound organization.
The strict sequential enjoys a rapid next-record
access by attribute, but a slow random access by attribute, as diagrammed in Figure 4. That is, as long as the
sequence in which access is demanded conforms to the
sequence in which the file was sorted, access is rapid
unless the number of records to be passed over is large.
Unfortunately, access is often desired on some other
key. This requires first a reordering of the file which
involves a time-consuming sorting operation, or an
exhaustive search of the file. Even with this sorting
operation, access by value and by property involves
search.
[Figure 4 graphic not recoverable. Legible panel labels: CONSECUTIVE, RANDOM.]
Figure 4-Diagram of the strict sequential file
organization
Maintenance for sequential files is logically straightforward, but slow. It requires typically a complete
passage of the file with a complete copying of it. Each
record must be read and written in order to do maintenance on the file. Because of this, insertions and deletions
are easily accomplished. Alterations are also simple as long as the typical fixed length restrictions on field
sizes are observed. Where variable length fields are permitted, alterations become a little more complex but are
still logically straightforward.
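The complete-pass maintenance described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout (key, data), the transaction layout (key, action, data), and the action names "insert", "delete", and "alter" are hypothetical and do not come from any vendor's software.

```python
def update_sequential(master, transactions):
    """Make one maintenance pass over a strict sequential file.

    master: list of (key, data) pairs sorted by key (the old file).
    transactions: list of (key, action, data) sorted by key, where action
    is "insert", "delete", or "alter" (hypothetical names).
    Returns a complete new copy of the file.
    """
    new_master = []
    i = 0
    for key, action, data in transactions:
        # Copy every untouched record that precedes this transaction's key.
        while i < len(master) and master[i][0] < key:
            new_master.append(master[i])
            i += 1
        present = i < len(master) and master[i][0] == key
        if action == "insert":
            if present:
                raise ValueError(f"key {key!r} already in file")
            new_master.append((key, data))
        elif action in ("delete", "alter"):
            if not present:
                raise KeyError(key)
            if action == "alter":
                new_master.append((key, data))
            i += 1  # the old version of the record is not copied
        else:
            raise ValueError(f"unknown action {action!r}")
    # Copy the remainder of the old file.
    new_master.extend(master[i:])
    return new_master
```

Every record is read and rewritten once, which is why the procedure is simple but slow for a small number of changes.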
Software support for sequential organization is
extremely good. Its popularity is attested by Table I.
It is the most widely supported of all the file organization
techniques.
The indexed sequential is a compound file organization technique, historically younger than the strict
sequential.4,10 This too is a positional organization.
The main file is a strict sequential file. With it is a
sequential organized index using the same key. Sometimes indexes to indexes are provided depending upon
the size of the main file and the storage space available.
Random access for the indexed sequential file is
superior in speed to the strict sequential because the
index search requires less time than a search of the
main file. From the index the location of the desired
record can be found and the record then accessed without search. But for a next-record access, the same
procedure usually is required, which slows such access
(see Figure 5). Access by attribute, by value, and by
property follow the same pattern as for the sequential
organized file.
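A minimal sketch of random access through an index may make the comparison concrete. It assumes, purely for illustration, an in-memory main file of (key, data) pairs sorted by key and a dense index holding one entry per record; the function names are hypothetical.

```python
import bisect


def build_index(main_file):
    """Return (keys, positions) for a main file sorted by key."""
    keys = [key for key, _ in main_file]
    positions = list(range(len(main_file)))
    return keys, positions


def random_access(main_file, index, key):
    """Search the index, then go to the record without searching the main file."""
    keys, positions = index
    slot = bisect.bisect_left(keys, key)  # binary search of the index only
    if slot == len(keys) or keys[slot] != key:
        return None
    return main_file[positions[slot]]
```

Because the index holds only keys and positions, searching it touches far less data than searching the records themselves.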
The use of storage space for the indexed sequential
is larger because of the additional space required for
the indexes. An added inefficiency in the use of storage
space is the typical requirement for overflow areas to
permit insertions in the main files. This overflow may
amount to as much as a third to a half more space for
the main file, although typically this can be held to
about one-tenth more space.
Figure 5-Diagram of the indexed sequential file organization
TABLE I-Summary of the file organization techniques
supported by the eight largest computer vendors

Strict Sequential    Indexed Sequential    Direct or Random
IBM                  IBM                   IBM
RCA                  RCA                   RCA
CDC                  CDC                   UNIVAC
UNIVAC               UNIVAC                NCR
Burroughs            NCR                   GE
NCR                  Honeywell             Honeywell
GE
Honeywell
The maintenance of the indexed sequential file
differs considerably from that for strict sequential.
Maintenance does not require rewriting the entire file;
only those specific records in the file that are altered
are rewritten back into their same places. This saving
in maintenance time can be more than offset by other
factors.
An insertion in an indexed sequential file requires
that adjustments be made to the index and to the
main file. The inserted record typically must be written
in the main area displacing a record into the overflow
area. Links are inserted if more than one such overflow
occurs in a given area. By contrast, deletion is
simpler. The record to be deleted is simply marked for
deletion but is not physically deleted from the file nor
from the indexes. Periodically, the entire file is rewritten in order to eliminate the accumulated deletions,
to pull the insertions into the main sequence, to reapportion the overflow areas, and to clean the index.
In sum, whether or not the maintenance time for an
indexed sequential file exceeds that for a strict sequential file depends upon the volume of insertions and
alterations. For low to moderate volume, the strict
sequential is usually slower over-all. An indexed sequential suffers from the same single-key limitations
as the strict sequential.
The software support for indexed sequential generally
is good. The software operates more slowly per random
access than for strict sequential because of the decreased buffering possible.
The direct or random file organization is also a
positional organization.4,10 It is like strict sequential
in that it is simple, not compound. The direct or
random file organization is a variation of the strict
sequential. It uses a transformation of the key. Whatever the key would be is passed through an algorithm
to calculate a position in storage. Because of the
possible occurrence of multiple records having the same
key, or of closely spaced keys, provision is made in the
algorithm to handle some conditions. One is to place or
find a record when its transformed key is the same as
another transformed key. This can be handled by links
and overflow areas, or by shifting records to maintain
a sequence in order to restrict the search domain. Another is to set up the initial spacing of records in the
file to permit room for the later insertions. The amount
of storage space allocated for this purpose is usually
not less than that allowed for overflow areas in the
case of an indexed sequential file.
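The key transformation with links and an overflow area can be sketched as below. The division-remainder transformation, the class name, and the three-field record layout are assumptions chosen for illustration; the paper does not prescribe a particular algorithm.

```python
class DirectFile:
    """Toy direct (random) file: a prime area plus a linked overflow area."""

    def __init__(self, prime_slots):
        # Each stored record is [key, data, link]; link indexes the overflow area.
        self.prime = [None] * prime_slots
        self.overflow = []

    def _transform(self, key):
        # Assumed transformation: remainder after division by the prime area size.
        return key % len(self.prime)

    def insert(self, key, data):
        slot = self._transform(key)
        if self.prime[slot] is None:
            self.prime[slot] = [key, data, None]
            return
        # Transformed key already in use: follow the links to the end of the chain.
        record = self.prime[slot]
        while record[2] is not None:
            record = self.overflow[record[2]]
        self.overflow.append([key, data, None])
        record[2] = len(self.overflow) - 1

    def find(self, key):
        record = self.prime[self._transform(key)]
        while record is not None:
            if record[0] == key:
                return record[1]
            record = self.overflow[record[2]] if record[2] is not None else None
        return None
```

A record whose transformed key is unused is reached in one step; each duplicate transformed key adds one link to follow.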
The random access provided by the direct or random
file organization is slightly faster than that for an
indexed sequential organized file, since no index reference is needed. But for next-record access, it is slower
because the transformed key order is not the same as
the ordinary key. Hence, every access is a random
access, as diagrammed in Figure 6. The access basis is
the same as noted earlier for the positional organized
files. Also, only one key can be used, as noted earlier.
The use of storage space for the direct or random
file organization is about as efficient as that for the
indexed sequential, and is less efficient than for the
strict sequential. This is because of the voids that
must be left in the spacing of the records to accommodate inserts, and the use of overflow areas. No space
is needed for an index.
The maintenance for a direct or random organized
file resembles the indexed sequential more than the
strict sequential. This may also extend to alterations
and deletions. For insertions, no index needs to be adjusted. If the record to be inserted must go into a place
that is already occupied (that is, the transformed key
is a duplicate of an already existing transformed key)
then provision must be made for moving records or for
use of overflow area and links.
The software support for the direct or random file
organization is less troublesome and less burdensome
than that for the indexed sequential. Also, less supporting software is needed to accomplish the job. The
user does not even need to rely upon manufacturer-provided
software but can make do by providing his
own algorithm for key transformations and by using a
strict sequential file organization. Many vendors have
been supplying this software for a longer period of
time than they have supplied indexed sequential software.
Another type of common file organization technique
available from the computer vendors and incorporated
as a normal part of their operating systems is the partitioned file organization.4,10 This is a hierarchical file
organization. But it is normally not accessible to the
programmer even though it is utilized routinely by the
operating system for its own functions such as program
libraries. Typically, the hierarchical file organizations
are compound because they require directories and
sometimes even hierarchies of directories to maintain
association and provide access. These directories usually include one that is of the table of contents type.
Access by attribute is the most common. The speed
of access depends mostly upon the size and number of
directories used (see Figure 7). Maintenance is usually
done by making deletions by altering only the directories. Insertions are entered in the directories and the
new data placed in any available space. Alterations
are often treated as combined deletions and insertions.
The software support is usually inadequate to enable
the use of the partitioned file organization by programmers in their own programs. The organization
becomes increasingly uneconomical of storage space as
deletions accumulate. To eliminate them requires rewriting the entire file and recreating the directories,
an operation equivalent to that needed for the indexed
sequential file organization.
Figure 6-Diagram of the direct or random file organization
Figure 7-Diagram of the partitioned file organization
TABLE II-Summary of selected vendor augmentations

Strict Sequential    Indexed Sequential    Direct or Random    Other Techniques
GIS                  GIS                   GIS                 IDS (ring)
FORTE                UNIMS                 FORTE               FORTE (list)
MARS                 FORTE
                     MARS
                     UL/I
Vendor augmentation
Computer vendors over the years have made a
number of augmentations and elaborations of the implementation of file organizations just compared. The
best known of these are listed in Table II.
One of these has been IBM's GIS (Generalized Information System).2 This elaboration provides a number
of features that add greatly to the power and convenience available to the user. Underlying it are the
two positional organized file organizations, the strict
sequential and direct or random. The use of indexed
sequential is optional depending upon the scope of the
GIS implemented. GIS is a free-standing package, not
an extension of COBOL, but GIS can be used with
COBOL.
The access for the GIS is slower because of the
additional software. But that software yields greater
convenience of user access by reducing programming
effort to file and retrieve data. The use of storage
space is but little more extensive, ignoring the space
for the additional software. Maintenance follows the
usual procedures but is more convenient from the user's
point of view because he does not need to write all of
the programs for doing it. The software support is
comprehensive.
The Integrated Data Store (IDS) is available from
General Electric,1 and is similar to the General Motors
Associative Programming Language. IDS offers a complex ring file organization where the number of links
possible at the record level in the file may be made as
extensive as the user desires. In practice, it is used
most often as an extension of COBOL.
Access by attribute beyond the first access is slightly
facilitated because of the links. Access by property is
much facilitated as a practical matter because of the
links which provide quick reference to the records with
related contents. The use of storage space is greater
than for a strict sequential organization because of the
space occupied by the links. Since in practice, directories are used to locate or serve as pointers to rings, a
little additional storage is also needed for them.
Although insertions, deletions, and alterations are
handled by the software, the procedures are considerably more complicated for IDS than for the positional
organized file. This is because of the need to adjust
the links whenever insertions and deletions are made. If
the insertion cannot be made physically nearby, then
subsequent accesses following the links are slowed.
This maintenance problem compounds as the number
of links to be adjusted increases. The software support
available for IDS is comprehensive and has been extensively tested in use.
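The link adjustment that makes ring maintenance more involved can be sketched with a single forward link per record. This is a generic illustration, not the IDS interface: the class and function names are hypothetical, and an actual ring organization may carry several links per record.

```python
class Record:
    """A record in a ring: the last member links back to the owner."""

    def __init__(self, name):
        self.name = name
        self.next = self  # a ring containing only this record


def insert_after(position, new_record):
    # Two links change for every insertion.
    new_record.next = position.next
    position.next = new_record


def delete(owner, target):
    # Deletion must first find the predecessor by following the ring.
    prev = owner
    while prev.next is not target:
        prev = prev.next
        if prev is owner:
            raise ValueError("record is not in this ring")
    prev.next = target.next
    target.next = target


def members(owner):
    """Follow the links from the owner around the ring and back."""
    names = []
    record = owner.next
    while record is not owner:
        names.append(record.name)
        record = record.next
    return names
```

With more than one ring passing through a record, each insertion or deletion repeats this adjustment once per ring, which is the compounding effect noted above.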
The UNIMS (Univac Information Management
System) is available from the Univac Division of
Sperry Rand. It offers a modified indexed sequential
file organization in a package of software, in a similar
manner to that noted earlier for GIS and IDS. It too
can serve as an extension to COBOL.
The access and maintenance for UNIMS are similar
in character to that noted earlier for indexed sequential
files. But to the user the procedures appear easier
because of the assistance provided by the software.
UNIMS uses little more storage space than the indexed sequential noted earlier. The software support
is comprehensive.
The UL/I (User Language/I) from RCA offers a
more convenient language for the handling of access,
maintenance, and reports from files than the usual
programming languages. As such it has similar objectives to GIS noted earlier. UL/I uses a modified
indexed sequential file organization in a way that gives
the appearance of a hierarchical file organization.11 The
characteristics of this software system were still fluid
at the time of this paper.
FORTE is available from Burroughs Corporation.
It provides unordered (sequential), indexed sequential,
random, and a combination of indexed sequential and
random. Further, it provides list file organization in
two forms, a two-cell list, and a usual double-linked
list (but not a multiple-linked list or ring structure).4,14
As such it represents an improvement over the FORGE
software which Burroughs has offered. FORTE is designed for use as an extension of COBOL, not as a
free standing software package for file organization
and use.
Another relatively new entry in the field is MARS
from CDC. In giving the user the appearance of a
range of file organizations, it like UL/I relies primarily
upon the strict sequential and indexed sequential file
organizations. Like GIS noted earlier, MARS is a
generalized system providing access, maintenance, and
report capabilities. It does however provide the capability of building an inverted list organization. Its
characteristics were still fluid at the time of preparing
this paper.
Non-vendor augmentation
A number of implementations of file organization
alternatives are available in the software market from
sources other than computer vendors. With IBM's
Summer 1969 announced changes in software policy,
this number of alternatives can be expected to grow still
larger. Only a brief selection is covered here, based
primarily on age and popularity (see Table III).
Two distinct classes of offering are available in the
software market. One uses and elaborates upon the
vendor provided file organization and software support.
Another replaces the vendor provided file organization
and hence also provides its own software. A brief look
at each of the groups will round out the comparison,
since these offerings may soon become more popular
in the market.
In the first group, some of the best known are the
MARK-IV, the FILE EX, SCORE-II, and INQUIRE.
The first two of these use the vendor-provided strict
sequential and indexed sequential file organization techniques. To these they add an important software superstructure for report preparation, data retrieval, and file
maintenance. As such they provide an alternative to
the user for preparing his own programs to accomplish
similar ends, and to the use of the vendor-provided
software.
The SCORE-II also uses the vendor-supported sequential and indexed sequential file organizations. In
addition it also provides tree structure, not directly
but based upon a combination of the strict sequential
and indexed sequential. This adds flexibility to the
package of report preparation, retrieval, and maintenance facilities.
TABLE III-Summary of selected non-vendor augmentations

Strict Sequential    Indexed Sequential    Direct or Random    Other
MARK-IV              MARK-IV                                   DM-5 (hierarchy)
FILE EX              FILE EX                                   SCORE-II (tree)
SCORE-II             SCORE-II                                  INQUIRE (list)
Differing in its choice of the underlying file organization is INQUIRE. This utilizes the indexed sequential and the direct or random file organizations. But
these are not directly accessible to the programmer.
Rather, INQUIRE combines them to form a modification of an inverted list file structure.* This gives
added power to the file maintenance, retrieval, and
report capabilities of INQUIRE. Access by attribute
and by property is facilitated by the inverted list
organization, but maintenance requires adjustment of
the lists as additional operations.4
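A small sketch of an inverted list organization shows why access on several keys is easy while maintenance touches every list. The record layout (a dictionary with a "keys" field) and the function names are hypothetical, and nothing here reflects the internals of INQUIRE.

```python
def build_inverted_lists(main_file):
    """Map each key to the list of positions of the records that carry it."""
    lists = {}
    for position, record in enumerate(main_file):
        for key in record["keys"]:
            lists.setdefault(key, []).append(position)
    return lists


def retrieve(main_file, lists, *keys):
    """Return the records that carry every one of the requested keys."""
    if not keys:
        return []
    hits = set(lists.get(keys[0], []))
    for key in keys[1:]:
        hits &= set(lists.get(key, []))  # combine pointer lists; records untouched
    return [main_file[position] for position in sorted(hits)]
```

The main file is consulted only after the pointer lists have been combined. Inserting or deleting a record, however, means adjusting one list for each key the record carries.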
In the second group, the oldest and most publicized
entry is the DM-5 (Data Manager-5) which has been
described in the literature of the field.8 DM-5, like the
others, includes the software for retrieval, maintenance,
and report preparation. DM-5 utilizes a hierarchical
file organization of a compound form. Tables are used
at several levels. Both random and next-record access
are handled by use of the tables, and are of about equal
speed for access by attribute. Since the records are
not ordered by a key, but many keys can be used in the
construction of the tables, the single key restriction of
the positional file organization is avoided with a result
similar to that for the inverted list file organization.
In summary, the non-vendor offerings in the software market typically combine into a single package
both file organization and convenient aids to using it.
The offerings thus far do not attempt to replace the
file organizations supported by the computer vendors.
* The inverted list was developed about 1964 under the leadership of Dr. Jack Minker as a modification of the inverted file.
The inverted file organization was in use in the information
retrieval field in the years 1957-1958. The inverted file is a positional file organization with an ordering determined by multiple
keys. Records in the file reoccur as many times as they may have
keys, which need not be the same from record to record. By
contrast, an inverted list is a list file organization of a compound
form. The main portion of the file need not be and usually is not
in a list form. The key portion of the file is organized as a set of
lists consisting of pointers for each key to records in the main
file. Since as a practical matter, the links are unnecessary, common practice is to elide them. The result is conceptually equivalent to an inverted file with all records replaced by surrogates
(a common practice now), and with the records drawn into a
subfile of their own with no redundancy. (The inverted list can
also be viewed as resulting from a consolidation of the links in
one direction from a muble chain or multilist file.4,12) In net
effect in their modern forms, and as a practical matter, an inverted list differs from an inverted file primarily in emphasis and
manner of implementation.
COBOL extensions
The Data Base Task Group proposed last year to
the CODASYL COBOL Committee an extension of
COBOL to incorporate provisions for the complex ring
file organization.6 Although the discussion devotes considerable attention to the other file organization techniques, the proposal is for the inclusion of only one of
them, the complex ring. In substance, this is very
similar to the IDS noted earlier. The discussion included with the proposal indicates that ring file organization can be used to simulate or serve as other file
organizations, such as sequential, random, hierarchical
or tree, and inverted file. Although not presented in
the discussion, it can also be used for muble chains
or a multilist file organization.
One of the major objectives of the Data Base Task
Group was to work toward keeping the description of
data stored with the data itself. This is in effect an
attempt to delay binding time. Since delayed binding
time in general improves the flexibility and power of
the resources available to the programmer, the objective is commendable. Providing linkage among data
can be a definite step in this direction. The question
to be argued is whether or not the ring file organization
is the best choice of means for accomplishing this
objective as well as serving as a worthwhile extension
of COBOL.
From the comparisons presented, it can be argued
that replacing a ring file organization by a frankly
compound file organization sans links, would gain more
for COBOL. Examples of candidate file organizations
are the inverted list and the hierarchical. Access for
both is faster and more powerful; maintenance for
both is simpler.
CONCLUSION
Automatic computers during the middle and late
1950's had, by present day standards, relatively slow
execution times and great restrictions upon the availability of both internal and external storage. The trend
has been toward increasing the availability of larger
and larger amounts of storage capacity, and toward
faster and faster operating speeds.
These changing computer capabilities suggest the
desirability of seriously rethinking the historic preference for positional organized files. This was certainly an
appropriate choice of file organization technique, when
storage capacity was extremely limited and operating
speed was slow. It required the least storage space and
the least direct overhead within the program at the
time of file use. The positional organized file entails a
very heavy cost of additional operating time in order
to reorder (sort) the file. It also involves the time to
rewrite the file periodically as a part of the maintenance
of the file, depending for its extent upon the form
of the positional file organization.
Now that computers have much more extensive external and internal storage capacity and operate more
rapidly, it appears appropriate to reappraise our continued reliance upon positional file organization techniques. Let us consider briefly the alternatives. The
attributed file organization is still too expensive of
storage space and of machine time for serious attention
in pure form. The list file organizations in general
suffer from costly maintenance. The exception is the
inverted list. The hierarchical file organizations appear
attractive, but like the inverted list, are in practice
compound file organizations.
It is significant that these latter two file organization
techniques are generally not available to computer
users because the supporting software is not generally
available. The software exists, but the form of most
puts it beyond the reach or scope of operations for
most computer users. But this gap is narrower now
than it was. Some vendors such as CDC and Burroughs
have started to move to provide a wider range of file
organization techniques. Independent software firms
are starting to offer a wider variety of alternatives.
But a gap still exists.
REFERENCES
1 C W BACHMAN
Integrated data store
DPMA Quarterly Vol 1 No 2 Jan 1965 10-30
2 J H BRYANT P SEMPLE
GIS and file management
Proc 21st Natl ACM Conf 1966 Thompson Book Co
Washington D C 97-107
3 R G CANNING
Data management: file organization
EDP Analyzer Vol 5 No 12 Dec 1967 14 pages
4 N CHAPIN
A comparison of file organization techniques
Proc 24th ACM Natl Conf 1969 ACM New York 273-283
5 N CHAPIN
Data structures
Automatic Computers N Y Van Nostrand Reinhold Co
in press
6 Data Base Task Group
COBOL extensions to handle data bases
SIGPLAN Notices Vol 3 No 5 April 1968 1-45
7 M E D'IMPERIO
Data structures and their representation in storage
Annual Review of Automatic Programming Vol 5 Oxford
1969 Pergamon Press 1-75
8 P J DIXON S JEROME
DM-1-a generalized data management system
Proc SJCC Vol 30 1967 185-198
9 A J DOWKART et al
A methodology for comparison of generalized data management systems
CFSTI No AD-811-682 March 1967 287 pages
10 IBM CORP
Introduction to IBM System/360 direct access storage
devices and organization methods
IBM Corp 1968 White Plains N Y 70 pages
11 W I LANDAUER
The balanced tree and its utilization in information retrieval
IEEE Trans on Electronic Computers Vol 12 No 6 Dec
1963 863-871
12 D LEFKOVITZ
File structures for on-line systems
Spartan Books 1969 Washington DC 215 pages
13 J MINKER J SABLE
File organization and data management
Annual Review of Information and Technology 1967
John Wiley and Sons Inc N Y 123-160
14 N S PRYWES H J GRAY
Outline for a multilist organized system
ACM Natl Meeting 1959
An information retrieval system based
on superimposed coding *
by JOHN R. FILES and HARRY D. HUSKEY
University of California
Santa Cruz, California
The cost of storing information in machine-accessible
form has declined markedly in the last decade, and
promises are such that one can look forward to having
complete libraries available in such form. This places
increased importance on algorithms which make it
possible to search large files efficiently.
This paper describes an approach to this problem.
In practice, information in a large file can be more
efficiently accessed if it is indexed in some manner. The
method of indexing which will be discussed is particularly well suited for a file which:
1. Is very dynamic with both deletions and additions frequently occurring.
2. Contains an extensive vocabulary which is to
be encoded.
Both of these characteristics are frequently found in
files that are to be coded. A file of information on recently published articles about a given subject and a
card catalogue for a large library are good examples
of files which require a large amount of maintenance.
If updating the index (code file) is expensive and time-consuming, updating is put off until it is felt that the
performance of the system has deteriorated enough to
justify the effort required to update it. Until the updating takes place, information which is no longer of
use is still retrieved, and the new information, if present,
is in a secondary file. Keeping a secondary file containing recent additions avoids the serious problem of
not having new material available, but it does decrease
the efficiency of the system since such a file must be
searched separately each time an inquiry is made of
the main file.
* The research reported on here was done at the University of
California at Santa Cruz with the partial support from Project
Genie at the University of California at Berkeley (Contract
SD-185 with The Advanced Research Projects Agency of the
Department of Defense).
The ability to utilize an extensive vocabulary is
also very important. In the proposed system the vocabulary to be used is derived directly from words
used in the original documents, thereby eliminating
the time-consuming and expensive practice of manually
abstracting and choosing indexing terms. Machine-generated derivatives of the original vocabulary retain
more information about the original content of the
item than does the manual system of assigning descriptors. In the manual case when selected descriptors
are assigned to a document, associations of descriptors
to words and to phrases are made. Such associations
are not made in exactly the same manner by two
trained indexers, and it is likely that the associations
made by the average interrogator of an information
retrieval system will be even more diverse. Because of
this lack of uniformity in assigning descriptors it is
desirable to allow each searcher to determine words
and phrases that he wishes to associate with the concept
on which he is doing a search. Postponing such associations until the time of the search can be accomplished
only if the entire word content is preserved in the
coded form.
Ease of update and freedom of vocabulary are not
enough in themselves to make a coding algorithm
worthwhile. Factors such as speed of access, ability
to make searches for combinations of words and compactness of code file are also important considerations.
All of these characteristics will be discussed for the
coding scheme outlined below.
The system
The information retrieval system which was investigated can be divided into three components:
preparation of the text, generation of the code file,
and the searching procedure. A general outline of the
first two components can be seen in Figure 1.
Since the form and format of the text to be used
can be expected to vary greatly, the text is standardized as it is read in. Flags are set to indicate boundaries
between records as well as at the ends of lines to make
it easier to reproduce the document when it is retrieved.
Also, as a measure to reduce the bulk of the file generated (text file) extra blanks in the input text are removed. In the pilot system the text file was generated
from two sources: a bibliography of computer science
and a listing of authors and titles from recent issues of
The Computer Group News of the IEEE. Both of these
texts were read, processed, and stored on a disk. The
text file generated was [...] characters stored one
character per byte.
Once the text file is generated coding can proceed.
The text file is examined character by character until
the end of a string which is to be coded (word) is encountered. The unit coded is a string of at least three
alphabetic characters surrounded by non-alphabetic
symbols (an English word). After the word is found it
is compared with a list of non-content words (i.e., the
Delete List containing words such as: of, the, and and).
If the word is found in the Delete List there is no further
processing of that word, and the next word is considered.
When a word is found that is not in the Delete
List, the trimming algorithm is applied to reduce the
word to a pseudo-root. Common endings such as s,
ed, ing and compound endings such as fully (as in
carefully) are removed. By removing endings, different
forms of the same word are made into synonyms. For
example, the words 'computer' and 'computers' will
both be reduced to the base 'comput.' This derived
root is then passed on to the coding procedure. (Further
discussion of the trimming algorithm is in Appendix C.)
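The word-finding, Delete List, and trimming steps can be sketched as follows. The actual trimming rules are those of Appendix C and are not reproduced here; the ending list and the Delete List contents below are assumptions chosen only so that the examples in the text come out as stated.

```python
import re

# Illustrative lists only; the paper's own lists are not given in this section.
DELETE_LIST = {"the", "and", "for", "with"}
ENDINGS = ("fully", "ation", "ers", "ing", "ed", "er", "al", "s")  # longest first


def trim(word):
    """Strip one common or compound ending, keeping at least three letters."""
    word = word.lower()
    for ending in ENDINGS:
        if word.endswith(ending) and len(word) - len(ending) >= 3:
            return word[: -len(ending)]
    return word


def content_roots(text):
    """Find words of three or more letters, drop Delete List words, trim the rest."""
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", text) if len(w) >= 3]
    return [trim(w) for w in words if w not in DELETE_LIST]
```

With these assumed endings, 'computer' and 'computers' both reduce to 'comput', so the two forms are treated as synonyms.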
In the coding pr.ocedure, a c.ode word is generated f.or
each rec.ord. The c.ode w.ord can be th.ought .of as a
bit string c.ontaining N bits, all .of :which are initialized
t.o zer.o at the beginning .of the c.odi.ng .operati.on. When
a trimmed w.ord is t.o be c.oded into the c.ode w.ord, the
numeric value .of the letters in the w.ord is summed,
1°°1°00
DOII,.J.,.ie,
Remove. b/o. .. b
~----------~----~~--------<3--
c;.. t Go 'Word
01 text'
T r;", to
l',evJo-
root
Acid bit
to code
worc/
yes
Stor~
Code
o.nc/ text
pointer
No
Figure I-Coding procedure
giving a number which is used t.o ch.oose an element
fr.om the unif.orm distri buti.on .of integers between 1
and N. Thus the resultant integer (c.ode value .of the
w.ord) is generated by an alg.orithm which giv'en the
same trimmed w.ord in the future will generB~te the
identical c.ode value f.or that w.ord. "By using :1 fixed
arithmetic pr.ocedure t.o pr.oduce the c.ode value f.or a,
w.ord, the need f.or a dicti.onary .of w.ords and assigned
c.ode values disappears-. This frees the large amount .of
st.orage which such a dicti.onary w.ould .occupy :!l.S well
as saving the time required t.o search such a file~ If
Information Retrieval System Based on Superimposed Coding
TRIMMED WORD      NUMERIC VALUE   CODE VALUE
INFORM(ation)          5226           15
RETRIEV(al)           42483           13
SYSTEM                11947            3
BAS(ed)               95060            9
SUPERIMPOS(ed)        22151            7
COD(ing)              87008            3

CODE WORD         0001000101000101
(bit positions run from 0 at the left to 15 at the right)

Figure 0-Coding "An information retrieval system
based on superimposed coding"
for a particular word the code value generated is K,
then the K'th bit in the code word is set to one (Figure 0).
The entire operation of finding a word, checking the
Delete List to see if it should not be coded, trimming
and coding is repeated until the entire record is processed. The code word, which is uniquely determined
by the words in the record, is then stored in a file (code
file) along with a pointer to the beginning of the record
in the text file. This procedure is repeated until all
the records have been coded.
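The coding loop described above can be sketched in Python as follows. This is only an illustration: the paper does not give the fixed arithmetic procedure that maps the summed letter values onto 1 through N, so `code_value` below is a stand-in (a polynomial letter sum taken modulo N), the Delete List shown is a small subset, and the names `code_value`, `code_record` and the `trim` parameter are assumptions made for the example.

```python
import re

N_BITS = 16  # code word length; the example in Figure 0 uses 16 bit positions
DELETE_LIST = {"a", "an", "in", "is", "it", "the", "to"}  # illustrative subset


def code_value(word, n_bits=N_BITS):
    """Map a trimmed word to an integer in 1..n_bits, always the same for the same word.

    Stand-in for the paper's arithmetic procedure, which is not specified.
    """
    total = 0
    for ch in word.lower():
        total = total * 31 + (ord(ch) - ord("a") + 1)
    return total % n_bits + 1


def code_record(text, trim=lambda w: w, n_bits=N_BITS):
    """Superimpose the code values of all significant words into one code word."""
    code_word = 0
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in DELETE_LIST:
            continue  # words on the Delete List are not coded
        k = code_value(trim(word), n_bits)
        code_word |= 1 << (k - 1)  # set the K'th bit to one
    return code_word
```

Because no dictionary is consulted, the same function can be applied to the words of a query to build the query code.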
After coding, the file is ready for searching. The
searching program accepts any number of words, each
of which is processed in the same manner as the words
in the text file. Each is looked up in the Delete List,
trimmed, and used to generate a code value. This code
value is then used to produce a query code in exactly
the same way as the code words were produced in
the code file. Upon generation of the query code the
actual search may begin. Each code word in the code
file is matched against the query code to see if the
query code is a subset of it. (Here a bit string X is
said to be a subset of another, Y, if when the I'th bit
in X is one, the I'th bit in Y is also one, i.e., 1010 is a
subset of 1011 while 0101 is not.) Each time that the
query code is a subset of the code word, the pointer to
the text file is used to gain access to the corresponding
record which can be further processed to see not only
if it contains the relevant words, but that the words
are in the correct order.
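The subset test and the serial scan of the code file can be sketched as below. The representation of the code file as a list of (code word, text pointer) pairs and the function names are assumptions made for illustration.

```python
def is_subset(x, y):
    """True if every bit that is one in x is also one in y."""
    return (x & y) == x


def search(code_file, query_code):
    """Scan the code file serially and return the text pointers of candidate records.

    code_file is a list of (code_word, text_pointer) pairs.
    """
    return [ptr for code_word, ptr in code_file if is_subset(query_code, code_word)]
```

With the example from the text, `is_subset(0b1010, 0b1011)` holds while `is_subset(0b0101, 0b1011)` does not. The records returned are only candidates and still need the final check against the text.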
The above is a brief description of the coding suggested for a file of an information scanning program.
Some details such as the exact procedure for removing
endings and the use of several independently generated
code values to produce multiple code words for a given
record, were not dealt with here. A more detailed
treatment of these problems can be found in the appendix.
Figure 2-Inverted file
Results
From the pilot system, data was gained on the performance of such a system of superimposed coding.
When possible, the performance of the superimposed
coding system will be compared with that of a threaded
list and inverted file. (Figures 2 and 3) The following
factors received major consideration:
1. Ease of update
2. Effect of a large vocabulary
3. Amount and type of storage
4. Speed of search
5. Cost
Before making any comparisons it would be best to
give a brief description of threaded lists and inverted
files. An inverted file consists of two main parts, a
vocabulary file and an occurrence file. As records are
processed, each significant word is looked up in the
vocabulary file. If the word has appeared before, it
has associated with it a pointer to an area in the occurrence file; if not, then an area in the occurrence file
is set aside for the word and a pointer to the first
location in that area is entered in the vocabulary file.
After this pointer is found, an entry is made in the
first free location in the corresponding area of the
occurrence file to indicate the record in which the
word occurred.
Figure 3-Threaded list
The threaded list, on the other hand, has the same
type of vocabulary file, but the occurrence file is arranged in a different manner. The pointer in the vocabulary file now indicates a location associated with
the first record containing the given word. This location in the occurrence file, in turn, contains a pointer
to another location in the occurrence file associated
with the second record which contains the word, and
the pointer in this location points ... Thus a linked
list of all the occurrences of the word is formed. 2
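The difference between the two organizations can be sketched as follows. This is a simplified in-memory illustration, not the file layout of any particular system: records are lists of words, record numbers stand in for pointers, and all function names are assumptions.

```python
from collections import defaultdict


def build_inverted(records):
    """Vocabulary word -> list of record numbers (the word's area in the occurrence file)."""
    occurrence = defaultdict(list)
    for rec_no, words in enumerate(records):
        for w in sorted(set(words)):
            occurrence[w].append(rec_no)
    return dict(occurrence)


def build_threaded(records):
    """Vocabulary word -> index of its first cell; each cell links to the next occurrence."""
    cells = []            # occurrence file: [record number, index of next cell or None]
    head, tail = {}, {}   # first and last cell index for each word
    for rec_no, words in enumerate(records):
        for w in sorted(set(words)):
            cells.append([rec_no, None])
            idx = len(cells) - 1
            if w in tail:
                cells[tail[w]][1] = idx   # thread the previous occurrence to this one
            else:
                head[w] = idx
            tail[w] = idx
    return head, cells


def follow(head, cells, word):
    """Walk the linked list of occurrences for one word."""
    out, idx = [], head.get(word)
    while idx is not None:
        rec_no, idx = cells[idx]
        out.append(rec_no)
    return out
```

Both structures need a vocabulary table, which is the part superimposed coding does away with.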
1. Ease of update
In the proposed system a record can be added or
deleted very easily. To delete a record a search is
performed which will retrieve the desired document.
This produces not only the pointer to the record in
the text file but the location of the record's code in
the code file. The code word and pointer are removed
from the code file, and their location is recorded as
being free to be used for a new entry to the code file.
The space that the text was occupying in the text file
is now also free to contain new text. In order to add a
record, which is the more common situation, the text
of the new record is added to the text file in the first
free location of a suitable size or at the end. It is then
processed in the same manner as all the other records
have been. The generated code word and pointer are
inserted in the first free space in the code list. Here
no room is wasted since all of the code word and pointer
combinations are of the same length. Thus any type
of update in the code file will affect only the code for
the record which is being changed.
The threaded list can be updated with slightly more
effort. The problem, and a minor one, is that the
records in the occurrence file are not all of the same
length, making it necessary to see if there is enough
room in a given free area to insert the new entry.
The inverted file, on the other hand, is far more difficult to update than either of the others. If a record is
to be removed all that need be done is to delete all
pointers to it in the occurrence file. The addition of a
record, however, becomes a serious problem. If for every
word in the record there is room for an additional
pointer in the areas set aside for pointers to records
containing that word, then the update is easy. But if
there is no room, a secondary file must be set up. The
number of such files will grow until it is felt that a
thorough update should be made. Then the entire text
file must be re-inverted to produce a new vocabulary
and occurrence file. This is a very time-consuming and
expensive project.
2. Effect of a large vocabulary
With the superimposed coding there is no problem
associated with having an arbitrarily large vocabulary.
This is true because the superimposed coding does not
require a table of vocabulary words like the inverted
and threaded list files do. Since the vocabulary file is
not present and does not have to be searched, increasing
the vocabulary neither lengthens the time required for
a search nor increases the amount of storage required
to contain the coded information.
3. Storage requirements
The major advantage of superimposed coding lies in
the great economy of storage. In the pilot program
which was run, a text file of 100,000 bytes was used to
produce a code file requiring 3,000 bytes. This reduction
of 30 to 1 from the text to the code file is far better
than the ratio obtained with the threaded list and
inverted files. Such reductions are largest with small
files such as the one experimented with, but substantial
reductions do exist even in larger files. For example,
assume that the text file consisted of 10,000,000 bibliographic entries, each containing 12 words which will
be coded. Such an author-title entry was found to have
roughly 300 characters in it, implying that the text
file would be roughly 3 × 10^9 characters. Also assume
that an average search contains at least three significant words. Such an assumption is made on the grounds
that a search based on fewer words would tend to return
more titles than would be of interest due to the very
large size of the bibliography. From these two assumptions, utilizing considerations explained in Appendix B,
it is found that the code file would consist of seven
code words and one pointer for each record. Each of
the code words is produced in a manner similar to the
single code word mentioned before. Now, however,
once the trimmed form of the word is found seven
different procedures are applied to produce the pseudorandom number between 1 and N for each of the seven
code words. Each of the code words will have 24 bits
and the pointer will have 32 bits, thus indicating that
each record will produce 25 bytes of code in the code
file. The total size of the code file would then be 2.5
× 10^8 bytes, which still is a reduction of better than
10 to 1.
Such a reduction is far out of reach of an inverted
file since each record in the text would have to have
twelve 24 bit pointers pointing to it, and one 32 bit
pointer from the record to the starting position of that
record in the text file. This requires a total of 4 × 10^8
bytes and indicates only a portion of the room taken
up by the inverted file. It does not include the vocabulary file, which would be substantial, nor does it encompass the overhead of the occurrence file consisting
of markers for the boundary between lists of pointers
for a given word. Also it ignores the room which must
be set aside for a linking pointer in case a new occurrence is to be added.
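The storage figures quoted above can be checked with a few lines of arithmetic. The sketch assumes one character per byte; the variable names are illustrative only.

```python
RECORDS = 10_000_000
CHARS_PER_RECORD = 300

text_bytes = RECORDS * CHARS_PER_RECORD        # roughly 3 x 10^9 characters

code_bits = 7 * 24 + 32                        # seven 24 bit code words plus a 32 bit pointer
code_bytes = RECORDS * code_bits // 8          # superimposed code file

inverted_bits = 12 * 24 + 32                   # twelve 24 bit pointers plus a 32 bit pointer
inverted_bytes = RECORDS * inverted_bits // 8  # pointer portion of the inverted file only

print(code_bits // 8, code_bytes, inverted_bytes, text_bytes / code_bytes)
```

The code file comes to 25 bytes per record, and the ratio of text to code is 12 to 1, in line with "better than 10 to 1".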
An additional advantage of the superimposed coding
lies in the type of storage which can be used to store
the code file. Since the file will be searched serially
the storage media need not be random access. This
permits the use of a cheaper sequential access storage
device such as magnetic tape, which could greatly
decrease the cost of such a system.
4. Speed of search
Evaluating the speed of a search using superimposed
coding is difficult since the speed of any implemented
system depends heavily on the characteristics of the
storage media containing the code file as well as on
the obvious consideration of the size of the text file.
The search can be performed by reading the code file
from bulk storage into addressable memory and comparing the query codes with the code words in
software. If this is done then the time required to
search the code file can be cut to less than 6 × (the
memory cycle time of the machine) × (the total number
of code words in the code file). This speed can be
achieved due to the simplicity of the comparison which
the software must make. The program only needs to
test if X is a subset of Y by loading the accumulator
with Y, doing a logical AND of the accumulator with
a register which contains X, and testing to see if the
accumulator equals X. When large text files are used,
and there are several independently assigned code words
for each record, time is saved by being able to reject a
record when any one of the query codes fails to be a
subset of the corresponding code word. By taking
advantage of this a substantial amount of time can
be saved. In the previously mentioned large file, with
seven code words for each record and an average
search of three words, more than 90 percent of the
records would be rejected after only the first comparison was made. This means that there would be
36 memory cycle times (the time allotted for the six
comparisons which did not have to be made) free to
take care of the overhead in the searching program.
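The early rejection described above can be sketched as a scan that stops testing a record at the first query code that fails the subset test. The data layout (a tuple of code words plus a text pointer per record) and the function name `scan` are assumptions made for the example; the comparison counter is included only to show how many tests are skipped.

```python
def scan(code_file, query_codes):
    """Return (text pointers of candidate records, number of subset tests made).

    code_file is a list of (record_codes, text_pointer) pairs, where record_codes
    holds one code word per independent coding procedure, in the same order as
    query_codes.
    """
    hits, comparisons = [], 0
    for record_codes, ptr in code_file:
        for q, c in zip(query_codes, record_codes):
            comparisons += 1
            if (q & c) != q:
                break              # reject the record on the first failed test
        else:
            hits.append(ptr)       # every query code was a subset
    return hits, comparisons
```

A record rejected on its first code word costs one comparison instead of seven.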
Even with this simple and fast searching procedure,
a search does require longer than the threaded list or
inverted file. Although the implementation of this technique in software is slower, there are several methods
that radically reduce the amount of time required to
search the code file.
Since the algorithm for searching the code file is
simple, the actual testing to see if X is a subset of Y
can be done with very simple hardware. If the I'th
bit of X is 1 and the I'th bit of Y is 0 for any of the
values of I from 0 through 7, then X is not a subset of
Y and the value of Z will be 1. If in no case is bit I
of Y = 0 and bit I of X = 1, then X is a subset of Y
and Z is 0.
Figure 4-Hardware to test if X is a subset of Y
$$Z = (\overline{Y_0} \wedge X_0) \vee (\overline{Y_1} \wedge X_1) \vee \cdots \vee (\overline{Y_7} \wedge X_7)$$
Considering the speed of present day circuitry the
time required to search a code file would be reduced
to the time required to transfer the data from bulk
storage. Since the hardware is so simple, it is practical
to scan data from several sources simultaneously. An
alternative to having the file searched externally would
be to wire into read only memory the commands to
test for a subset. By adding instructions to use the
next code word and repeat the operation if the test
fails, the search will proceed through core memory at
a rapid rate making only one core access for each test.
The end of the list of code words can be marked by a
code word containing all ones. This has any possible
query as a subset and would assure that the loop
was interrupted at that point.
A second technique which would reduce the time
required to search the file is to sort it in some manner.
One such method which generates a superimposed 8
bit code from a 24 bit code is discussed in Appendix A.
Other methods such as carefully dividing the code file
into small groups and then doing a logical OR of the
chosen code words to form rejector vectors have been
suggested. 4
In comparing the speed of the search it should be
noted that with superimposed coding and when searching for several words, the search for all of the words
is carried out at once. In the threaded list and inverted
file a search for several words is made by making a
list of occurrences for each word and then finding the
intersection of the lists. Due to this parallelism of the
search superimposed coding can handle a multiple word
search in a more efficient manner than the other two
methods.
At first glance it appeared that searching the entire
code file would preclude the use of superimposed coding
on a large file. With more careful examination, however,
it is apparent that this type of code file can be searched
as rapidly as either the threaded list or the faster
inverted file.
Factors which lead to this conclusion include:
A. The code file search can easily be implemented
in hardware. Such hardware is simple and
very fast as well as being able to handle
several streams of data simultaneously.
B. If several sequential access devices or a
random access storage device is used then the
code file may be structured to allow large
blocks of the code file to be rejected with
only one test.
C. The superimposed coded file is much more
efficient at handling searches for records containing several desired keys.
5. Cost
The cost of implementing an information retrieval
system utilizing the type of superimposed coding suggested would be substantially less than the cost of
implementing a threaded list or inverted file using the
same text file. The reasons for this stem from the
reduced requirement for computational capability of
the computer, as well as a substantial reduction in
the amount of storage required for the coded information.
All three systems must dedicate a large amount of
storage to the actual text. This, in all of the cases,
can be either directly accessible to the computer such
as a large disk file, or may be only machine referable
such as a machine controllable microfilm display, like
the proposed system at the University of California,
Santa Cruz or the one being used as part of Project
Intrex at M.I.T. 6 The difference of storage cost is not
found in the storage of the text file but in the comparison of the cost of the storage of the code file of
the superimposed coding system with the cost of storing
the vocabulary and occurrence files of the threaded
list and inverted file. The code file is smaller and can
be stored in a sequential access device rather than a
random access device. Both of these factors tend to
reduce the cost of the system.
If scanning of the code file is implemented in hardware then the requirements on the computer become
very small. All that it is responsible for is processing
the words in the inquiry in order to generate the query
codes, and then, while the search is in progress, standing
by to store the pointers to the text file which the one
or possibly several hardware scanners pass to it.
The trial program, which processed the questions,
generated the query codes and handled the searching
in software, was substantially under 16,000 bytes of
code on an IBM 1130 with no overlaying. Thus the
requirement for expensive core storage is low. The
cost of the hardware which would do the testing for
the query code being a subset of the code word and
its interfacing with the computer would be very small
compared to the cost of the necessary storage devices.
One phenomenon which is found in the superimposed
coding and not in some other forms of coding is the
presence of spurious matches. These occur because,
in a given code word the fact that the I'th bit is zero
signifies that any word assigned the code value I is
not in the record. The converse is not true. Since many
vocabulary words could cause the I'th bit to be one,
the I'th bit being equal to one does not indicate that
a specific word is present. By generating several independent code words for each record the number of
times that superimposing will cause an irrelevant
record to be retrieved can be made arbitrarily small.
Take for example the case where twelve words were
coded into seven 24 bit code words. In that case the
probability that a record in which all seven of the
query codes for a question were a subset of the code
words, and none of the three words involved in the
search were in the given record, was 3 × 10^-10. (See
formula in Appendix B, bd = .35, cw = 2.8, qc = 7)
Since the number of such spurious matches can be
limited to any desired extent, although not entirely
eliminated, it is convenient to perform some final
verifying operation to assure that the words specified
in the search are actually present. This verification in
the case of the pilot program was accomplished as a
side result of the check to see that the desired words
occurred in the specified order. Consequently there
was no penalty in making this extra check on the
records which were retrieved.
The requirement that additional checking be done
is not an unreasonable one. The fact that a document
contains the words in which one is interested does
not necessarily indicate that the document is of interest.
Therefore any key word searching procedure can only
be the first step of an information retrieval system.
The job of a key word search is to quickly reject
records that do not contain information of interest.
In this sense any of the three types of key word information retrieval systems which have been mentioned
are more properly information screening procedures
which can rapidly eliminate a large portion of the
text file as unlikely to contain relevant information.
Such a system should be used to identify those records
which warrant further and more extensive examination.
CONCLUSION
The method of superimposed coding which has been
discussed is a simple and relatively inexpensive manner
of scanning a large text file. With a simple check for
spurious matches made after the search, such a system
can stand alone as a key word information retrieval
system. On the other hand, since the actual scanning
of the text can be easily and rapidly handled by peripheral hardware, the method is very attractive as a
first stage screening method. Although the prospect of
having to search the entire code file for every inquiry,
at first glance, appears discouraging, the simplicity of
the scanning algorithm and the ease with which searches
can be carried out in parallel makes such a linear
search very reasonable.
APPENDIX A
Besides implementation in hardware, measures can
be taken to eliminate the need for searching the entire
code file, thus reducing the required search time. One
manner of doing this is to use the first code word of
each record to generate a shortened code word for it.
In the case of a 24 bit code word, the first bit of the
second level code word is the logical OR of the
first three bits of the first level code word. Bits 4
through 6 could also be ORed and used as the second
bit of the second level code word. Continuing this
process an 8 bit second level code word is produced
based on the bits 1 through 24 of the original code
word. Since there are only 256 of these second level
codes possible, with each record's first code word being
mapped into one and only one of these classes, the file
is partitioned into 256 sets characterized by the numbers
0 through 255. When it is time to search the code file,
the element of the partition that the first query code
belongs to is determined. If for example the query
code is 000100000010000001000000 it would belong to
set 84 (01010100). The only sets which would have to
be searched would be those characterized by numbers
which have 84 as a subset (i.e., 11111111, 11111110,
11111100 would have to be searched, but 11111011
would not have to be examined further). There would
be only 32 out of the 256 sets which would have to be
searched, thus the number of code words which would
have to be compared with the query codes would be
reduced. Using the scheme of coding 12 words into 24
bits would cause roughly 10 percent of the code file to
be classified as 255 (11111111) and just over 3 percent
to be classified by a number whose binary representation
contains 7 ones and one zero. Due to the non-uniform
distribution of the code words over the 256 sets, the
reduction in the amount of the code file to be searched
would not be the 7/8 suggested by the reduction in
the number of sets which must be searched. The reduction would, however, be in the neighborhood of
30 percent (3/8 of the sets whose binary representation
has seven ones and one zero and 18/28 of those with
six ones and two zeros can be eliminated).
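The second level code and the choice of sets to search can be sketched as follows. The function names are illustrative, and bit 1 of the code word is taken to be the leftmost bit of the 24 bit string, as in the example above.

```python
def second_level(code_word, bits=24, group=3):
    """OR each group of three bits (left to right) of a code word into one bit."""
    out = 0
    for g in range(bits // group):
        shift = bits - group * (g + 1)
        chunk = (code_word >> shift) & ((1 << group) - 1)
        out = (out << 1) | (1 if chunk else 0)
    return out


def sets_to_search(query_class, width=8):
    """All set numbers that have the query's second level code as a subset."""
    return [s for s in range(1 << width) if (s & query_class) == query_class]


query = 0b000100000010000001000000
print(second_level(query), len(sets_to_search(second_level(query))))  # 84 32
```

For this query, set 11111011 is skipped because it lacks one of the three bits of 01010100.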
APPENDIX B
Since care was taken to assign the code values using
numbers from a uniform distribution, the expected
number of spurious matches can be predicted. By
varying the length and number of the code words the
frequency of spurious matches can be controlled. The
number of spurious matches is a function of the bit
density, bd (i.e., the number of ones in the code word
divided by the number of bits in the code word); the
number of code words per record, cw; the number of
ones in the query code, qc; and the number of records
which are coded into the code file N.
The expected number of spurious matches =
$N \times (bd)^{cw \cdot qc}$
The number of bits used to code one record =
$cw \times (\text{the number of bits in the code word})$
By keeping the number of bits used and the number of
ones in a code word constant in the above two equations,
it is found that the minimum number of spurious
matches occurs when the number of bits in the code
word is e times the number of ones in the code word.
That is, when the bit density is 1/e. The number of
bits B to use for the code word when there are M
words to be coded in each record is roughly 2.2M.
This is found by considering that the probability that
a given position will be left blank is $(1 - 1/B)^M$. The
expected bit density would then be $1 - (1 - 1/B)^M$.
Setting this equal to 1/e and solving for B yields
the desired result. 3
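The two relations of this appendix can be evaluated numerically as in the sketch below. The function names are assumptions; the formulas are the ones stated above, with B solved from 1 - (1 - 1/B)^M = 1/e.

```python
import math


def expected_spurious(n_records, bd, cw, qc):
    """Expected number of spurious matches: N * bd ** (cw * qc)."""
    return n_records * bd ** (cw * qc)


def bit_density(bits, words):
    """Expected fraction of ones when `words` code values are set in `bits` positions."""
    return 1 - (1 - 1 / bits) ** words


def bits_for_density(words, target=1 / math.e):
    """Solve 1 - (1 - 1/B)**M = target for B."""
    return 1 / (1 - (1 - target) ** (1 / words))


b = bits_for_density(12)
print(round(b / 12, 2))   # ratio B/M for M = 12, close to the 2.2 quoted above
```

Adding code words or query bits raises the exponent, so the expected number of spurious matches falls off geometrically.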
APPENDIX C
The trimming program was divided into three
sections. The first step removes all 'e's, 'd's and 's's
from the end of the word. These letters were removed
since there are many words such as 'attractions' which
have compound endings terminating in s, es, d, and
ed. By removing these letters, in the above, the suffix
'tion' is left on the end of the word where it can be
easily identified and removed in a later section of the
program. Once this operation is completed the endings
'er' then 'ly' and then 'al' are searched for and removed
if found. This procedure removes endings such as the
'ally' on the end of 'functionally' and again is a technique to handle compound endings.
After the above two trimmings have been accomplished, the Trim List is consulted. Suffixes found in the
Trim List are arranged in order by length, starting
with the longest. The ending found in the list is compared letter by letter with corresponding letters on
the end of the word remaining after the first two trimming stages have been completed. Since all of the 's's,
'e's and 'd's have been removed, the suffixes are in an
unusual form. For example, 'ness' would have been
trimmed to 'n' by the first stage of the trimming
procedure. Also 'ance' appears as 'anc' in the Trim
List.
The reason for having suffixes in this form can be
seen by considering the problem of trimming the two
words 'finance' and 'financed'. In the second case,
when the 'ed' is found on the end of the word, it is
difficult to decide if the 'ed' or just the 'd' should be
removed. The decision was made to remove the 'ed'.
This means that to trim 'financed', 'anc' must be in
the Trim List. However, 'finance', which should be reduced to the same pseudo-root, requires either the
ending 'ance' to appear in the list or the 'e' removed
before the ending is compared with endings in the
Trim List. The second course of action was chosen
because it reduces the length of the Trim List and makes
the first step of the trimming operation very simple.
The comparison of the endings in the Trim List is
continued until either the list is exhausted or a match
is found and the ending removed. There are two more
checks to be made on the trimmed word. First, the
last two letters of the word are compared. If they
are the same, then the last letter is removed. This is
done so that a word such as 'trimming' will be cut
back to 'trim'. First the 'ing' is removed to give 'trimm',
and then the second 'm' removed to give the desired
root.
The final action provides some protection against
trimming words too severely. The word 'deeds' would
be trimmed to nothing. To prevent such loss of information, any word which has been reduced to less
than three letters is restored to a length of three. At
this point the word is considered trimmed.
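One reading of the trimming stages described above is sketched below. It is not the author's program: the Trim List is abbreviated, restoring a short word is assumed to mean taking the first three letters of the original word, and the names `TRIM_LIST` and `trim` are illustrative.

```python
TRIM_LIST = ("ation", "ition", "ment", "ing", "ion", "anc", "al", "er", "ic")  # abbreviated


def trim(word, trim_list=TRIM_LIST):
    """Reduce a word to a pseudo-root following the stages of Appendix C."""
    original = word.lower()
    w = original.rstrip("eds")                 # stage 1: drop trailing e's, d's and s's
    for ending in ("er", "ly", "al"):          # stage 2: 'er', then 'ly', then 'al'
        if w.endswith(ending):
            w = w[: -len(ending)]
    for suffix in sorted(trim_list, key=len, reverse=True):
        if w.endswith(suffix):                 # stage 3: longest Trim List match
            w = w[: -len(suffix)]
            break
    if len(w) >= 2 and w[-1] == w[-2]:         # doubled final letter
        w = w[:-1]
    if len(w) < 3:                             # assumed: restore from the original word
        w = original[:3]
    return w
```

Under these assumptions `trim('computers')` and `trim('computer')` both give 'comput', `trim('trimming')` gives 'trim', and 'finance' and 'financed' reduce to the same pseudo-root.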
There is one major problem which occurs with the
use of a trimming algorithm. Words which do not
convey the same meaning can be reduced to the same
root. An example would be that both 'information'
and 'informal' are reduced to 'inform'. Such a result
may be undesirable; it is unlikely that when searching
for one of the words, the other would be of interest.
Unfortunately the effect of this type of false retrieval
could not be observed in the small pilot program.
Such confusion of terms was rare due to the specialized
nature of the text. In a system utilizing a larger text
file containing a more generalized vocabulary, the
number of such erroneous replies may become substantial. If a system utilizing a trimmed form of the
vocabulary words is used for the first stage of an information retrieval system, the problem of such extra
records is not a serious one, since the purpose of the
search is to locate information-rich sections of the
text. Further examination would determine whether
the record is of interest or not.
The decision to utilize a trimming algorithm in the
pilot program was based on the feeling that the error
of failing to retrieve information was less tolerable
than retrieving some irrelevant information.
Figure 5-Trimming procedure (flowchart steps: remove last letter; remove last two letters; remove the ending; restore to length 3)
TRIM LIST
ology
ement
icant
ition
ation
orial
iting
ating
istic
ancy
ment
ient
ator
ical
ying
ary
eou
est
ent
ion
ern
dom
ful
val
ial
cal
ing
enc
anc
iz
ry
iv
it
at
or
er
en
al
ag
id
ic
ab
y
n
e
APPENDIX D
DELETE LIST
a
am
an
as
go
in
is
it
so
to
we
all
and
are
but
can
for
had
his
how
may
nor
our
the
was
also
does
from
have
more
must
that
this
thus
ways
were
what
will
with
being
would
every
might
other
since
their
there
these
which
while
would
should
another
however
either
without
BIBLIOGRAPHY
1 R S CASEY et al C S WISE
Mathematical analysis of coding systems
Punched Cards Their Applications to Science and Industry
Reinhold Pub Co 1958
2 T C LOWE
Design principles for an on-line information retrieval system
Doctoral dissertation submitted to the Univ of Pa 1966
Philadelphia
3 C N MOOERS
Coding, information retrieval and the rapid selector
American Documentation 4:225 Oct 1950
4 R T MOORE
A screening method for large information retrieval systems
Proc SJCC Vol 19 1961 259
5 C F OVERHAGE et al
Massachusetts Institute of Technology
MIT Project Intrex March 15 1968 to Sept 15 1968
Semi-annual Activity Rpt Cambridge 1968
6 E B PARKER
SPIRES (Stanford Public Information REtrieval Service)
1968 Annual Rpt to Nat Science Foundation
Project GN 600 742 Jan 1969
Establishment and maintenance of a
storage hierarchy for an on-line data
base under TSS/360
by JAMES P. CONSIDINE and ALLAN H. WEIS
Thomas J. Watson Research Center, IBM Corporation
Yorktown Heights, New York
INTRODUCTION
As on-line interactive systems increase in popularity,
several problem areas become more and more apparent.
One of these is the management of the on-line accessible
data base. It has been the experience of installations
throughout the country that such a data base tends, if
ungoverned, to increase in size as the system continues
in operation, bounded only by the size of the storage
available to contain it. It is, therefore, essential for
the continuance of a viable system that this data base
be examined and methods devised to control its growth.
In the first section of this paper we record some
observations we have made on the nature of one particular on-line data base, specifically its growth and
usage characteristics. The second section details a
system we have designed to control the growth of the
data base and insure maximum utilization of the on-line
devices available. The third section describes the
results of operating with the system. The fourth section
details future amplifications and modifications to overcome some foreseeable difficulties in the present version.
Finally we summarize our observations and re-state
the conclusions we have reached.
TSS/360 data base at T. J. Watson Research Center
Since our system first went on a somewhat regular
schedule of four-hour-a-day user sessions in June 1968,
it was clear that, even under these conditions of relatively low availability, managing the on-line storage
was going to be one of our primary problems. The
amount of on-line storage occupied by user data sets at
that time was approximately 20,000 pages, or 80,000,000
characters (1 page = 4096 characters or 8192 hexadecimal digits). It was a matter of a few months before
the amount rose to what is our working optimum,
30,000 pages or 120,000,000 characters. This optimum
is dictated by the maximum number of devices we
wish to devote to on-line storage. The distinction
between devices and volumes should be made clear. A
volume is a unit on which data are actually recorded.
There are in principle large numbers of volumes available. A device is a unit on which a volume is mounted
and which carries out the transmission of data to and
from the volume. Devices are necessarily limited in
number. A tape reel is a volume; the tape drive is a
device.
To return to the data base, observations made at
the time indicated that perhaps 10-20 percent of this
data was non-useful. Examples of this are data sets
defined but not used and never erased, output listings
of assemblies and compilations done many days previous to the current date and other such system- and
user-generated residues. Measures were devised to
periodically and systematically remove such unwanted
data from the on-line storage, thereby achieving a
small amount of leeway while the problem was being
further studied.
In an effort to acquire information on the usage of
the data base, we implemented a means of marking a
data set with the date on which it was used. Report
programs were written to process the data thus recorded and the results were very informative to management and system programmers alike.
Extracts from a typical report are presented in
Figure 1. Among the facts which can be determined
from such reports are the names of the authorized users
actually using the system currently, how much storage
each user is occupying, how much he is using, and how
the amount of storage used by each user varies from
observation period to observation period. The total
amount of on-line storage that is being currently used
by all users is also recorded. In addition, the data
recorded can be processed to yield an on-line storage
profile, as shown in Figure 2.
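The kind of report described above can be sketched as follows. This is a minimal illustration, not the TSS/360 report program: the `DataSet` record, its `last_used` field, the user names and the figures are all hypothetical, and only the idea of comparing storage occupied with storage used since a given date comes from the text.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DataSet:
    owner: str
    pages: int
    last_used: date  # hypothetical field: the usage date marked on the data set

def usage_report(data_sets, since):
    """Return {owner: (pages_occupied, pages_used_since)}."""
    report = {}
    for ds in data_sets:
        occupied, used = report.get(ds.owner, (0, 0))
        occupied += ds.pages
        if ds.last_used >= since:
            used += ds.pages
        report[ds.owner] = (occupied, used)
    return report

# Made-up catalog entries for two users.
catalog = [
    DataSet("USERA", 400, date(1969, 1, 20)),
    DataSet("USERA", 7600, date(1968, 9, 3)),
    DataSet("USERB", 250, date(1969, 1, 28)),
]
report = usage_report(catalog, since=date(1969, 1, 1))
for owner, (occupied, used) in sorted(report.items()):
    unused = 100 * (occupied - used) / occupied
    print(owner, occupied, used, f"{unused:.0f}% unused")
```

Summing the first element of each pair gives the total storage occupied, and summing the second gives the total storage actually used in the observation period.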
For instance, in the reports formulated from data
gathered on February 1, 1969, we discovered that of
our 160 or so authorized users, some 50 had actually
used the system since the beginning of the year. We
also found that most of these 50 were not actually
using all of the storage they were occupying. In one
case, up to 95 percent of the storage of a particular
user had not been used during the period. In total we
discovered that of some 28,000 pages of storage on
the system only 13,000 pages had been used in the
last month. These figures were based on information
recorded after all the waste space occupied by obviously
Figure 5-Flow of program control during on-line
processing
where the incoming message and its internal format are to be placed. TPCHUG will do the following things:
• Get an incoming message from the waiting
queue if one is available. If it is not, it will
wait until there is one so that other subtasks
may be executed.
• Place the message in the TRP.
• Edit and translate the message into the internal format.
• Mark COP's event control block for this TRP
to show that a message is available for processing.
• Wait until TPCHUG's event control block
is again posted by COP to show the TRP is
again ready to accept a message.
When TPCHUG waits, Path 2 back to COP is
effectively taken. COP will then check its event
control blocks to determine which TRP requires
service. The following actions are taken by COP:
• Via a program control list constructed by TPCHUG, COP will determine the next module
to be applied to the current message in a TRP.
• Mark SLINK's event control block for this
TRP to show that action is to be taken.
• Wait for the event control blocks to be marked
by TPCHUG or SLINK.
Path 3 is now completed and SLINK will gain
control when this subtask is made active. At this
time the following functions will be performed:
• The required applications program will be
loaded, if it is not already in core memory.
• Control will be given to the appropriate program so that it may execute.
• Upon return from the program, SLINK will
mark COP's event control block to show that
the program has completed its processing.
• Wait upon its event control block for this TRP.
Paths 4, 5, and 6 have been taken, and the same
sort of thing occurs for Paths 7, 8, 9 and 10.
The main difference is that when TPMSGOUT
has finished putting the response(s) on the output waiting queue, COP's event control block
is complete and the TRP is now free to be used
again.
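The hand-off between TPCHUG, COP and SLINK through event control blocks can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions: the `ECB` and `TRP` classes, the program names VALIDATE and UPDATE, and the round-robin loop are hypothetical stand-ins, real subtasks would wait rather than poll, and the TPMSGOUT step is folded into COP for brevity.

```python
from collections import deque

class ECB:
    """Event control block: a flag one subtask posts and another waits on."""
    def __init__(self):
        self.posted = False

    def post(self):
        self.posted = True

    def consume(self):
        was_posted, self.posted = self.posted, False
        return was_posted

class TRP:
    """Holds one message, its program control list and its responses."""
    def __init__(self):
        self.message = None
        self.program_list = deque()
        self.responses = []
        self.free = True
        self.cop_ecb = ECB()    # posted for COP by TPCHUG or SLINK
        self.slink_ecb = ECB()  # posted for SLINK by COP

def tpchug(trp, input_queue):
    # Take a waiting message, edit it, and tell COP it is ready.
    if trp.free and input_queue:
        trp.message = input_queue.popleft().strip().upper()
        trp.program_list = deque(["VALIDATE", "UPDATE"])  # hypothetical list
        trp.free = False
        trp.cop_ecb.post()

def cop(trp, output_queue):
    # Pick the next module, or release the TRP when none remain.
    if trp.cop_ecb.consume():
        if trp.program_list:
            trp.slink_ecb.post()
        else:
            output_queue.extend(trp.responses)
            trp.responses = []
            trp.free = True

def slink(trp, programs):
    # Give control to the application program, then report back to COP.
    if trp.slink_ecb.consume():
        name = trp.program_list.popleft()
        trp.responses.append(programs[name](trp.message))
        trp.cop_ecb.post()

programs = {
    "VALIDATE": lambda msg: f"VALIDATE ok: {msg}",
    "UPDATE": lambda msg: f"UPDATE done: {msg}",
}
input_queue = deque([" deposit 100 ", "balance"])
output_queue = []
trp = TRP()
while input_queue or not trp.free:
    tpchug(trp, input_queue)
    cop(trp, output_queue)
    slink(trp, programs)
print(output_queue)
```

Each subtask acts only when its own event control block has been posted, which is what lets several TRPs share the same three subtasks in the system described.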
• Termination of the On-Line Day
Messages may be in waiting queues or in various
stages of processing when termination of the online system occurs. COP must assure that the
teleprocessing subsystem has received all incoming
messages, the input waiting queues have all been
emptied and all messages have completed processing before the file management routines close
the files and COP releases the subtasks. While
the teleprocessing programs are emptying the
output waiting queues of messages and transmitting them, COP is editing the statistics which
have been gathered that day and producing a
report from them. After all processing has been
completed, control is returned to the operating
system.
Batch system
• Initialization
Control information must be gathered and set
up in main storage to effect the proper sequence
of jobs to be run during a particular batch job
stream. The aforementioned control information
will contain such things as:
• Program name.
• Resultant condition code, if job fails to run.
• Condition code or codes resulting from other
jobs which this job depends on in order to know
whether or not to execute.
• Program level denoting whether or not this
job can run concurrently with other jobs
based on core usage, sequence of jobs to run and
shared use of data files.
• Names and device locations of conditional
files; i.e., files which might or might not be
present during any one job within a job stream.
Figure 6-BESS program control blocks (one block per program: program name, core space required, reserved area, ABEND condition code, inhibiting condition codes, job control code, program level, and the names and locations of conditional input files 1 and 2)
• Processing
During the so-called batch processing, depending
on the information supplied for control during
initialization time, jobs will be scheduled either
alone or as subtasks, depending on core requirements, availability of data files, and shared access
of data files.
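A scheduling decision driven by such control information might look like the sketch below. It is illustrative only: the `ProgramBlock` fields, the decision labels, the job names and the rule that a missing conditional file prevents a job from running are assumptions, not the BESS implementation.

```python
from dataclasses import dataclass, field

@dataclass
class ProgramBlock:
    name: str
    level: int        # jobs at the same level are candidates to run together
    core_kb: int      # core space required
    inhibiting: dict = field(default_factory=dict)  # {earlier job: blocking codes}
    conditional_files: list = field(default_factory=list)

def schedule(blocks, completed_codes, files_present, core_kb):
    """Return {job name: decision} for one pass over the job stream."""
    decisions = {}
    for block in sorted(blocks, key=lambda b: b.level):
        inhibited = any(
            completed_codes.get(job) in codes
            for job, codes in block.inhibiting.items()
        )
        missing = [f for f in block.conditional_files if f not in files_present]
        if inhibited:
            decisions[block.name] = "SKIP"
        elif missing:
            decisions[block.name] = "NO INPUT"
        elif block.core_kb > core_kb:
            decisions[block.name] = "WAIT"
        else:
            decisions[block.name] = "RUN"
    return decisions

# Hypothetical job stream: PREP has already ended with condition code 8.
blocks = [
    ProgramBlock("EDIT", level=1, core_kb=52),
    ProgramBlock("POST", level=2, core_kb=99, inhibiting={"PREP": {8, 12}}),
    ProgramBlock("REPORT", level=2, core_kb=60, conditional_files=["ADJUST"]),
]
decisions = schedule(
    blocks,
    completed_codes={"PREP": 8},
    files_present={"MASTER"},
    core_kb=128,
)
print(decisions)
```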
• Termination
Upon completion of all the jobs which could be
processed, the operator or other appropriate personnel
will be notified as to the status of all the
jobs. Such information as:
• Jobs which were not run due to data not being
available.
• Jobs which were not run to completion
Figure 7-BESS status report (report title: Systems/Resources Management, Analysis of Batch Program Activity; columns include job/step name, program name, size, level, result, completion type, allotted/actual/wait time, and job library number)
Pertaining to the responsiveness of a computer
system. To respond in a timely fashion to the needs
of users who have direct access to the computer via
data entry devices, terminals and displays; real-time.
Figure 12-Control block relationship in the on-line
system
SLINK
Subtask Linkage
An on-line resources management program that is a
constant subtask to COP and links to each processing program.
• find the various system support modules within
a dynamic environment.
• establish the linkage between the program and
the system support module.
COP
Controller of On-Line Processing
Subsystem
A system of interrelated programs that is subordinate
in control and execution to another system.
Subtask
An executable program that has all the attributes
of a task but is subordinate to and under the control
of another task.
Task
One of two or more programs, or series of programs
which execute concurrently in a single CPU.
TPCHUG
A teleprocessing program that is a constant subtask to COP. It reads transactions from the input
waiting queue, edits them, and translates them into
their processing format.
TPMSGOUT
A teleprocessing subroutine that converts a response
to a terminal's format and places it into the output
waiting queue.
TRP
Transaction Response Pool
A block of memory which contains a single raw transaction (message), some of its control information,
its intermediate forms and its response(s). A TRP
is assigned to one transaction at a time for its active
life within the CPU. It contains all data associated
with the transaction in chronological sequence so
it is useful for debugging.
ACKNOWLEDGMENT
The authors wish to thank Mr. J. R. Kleespies for
his encouragement and support; Mr. W. D. Ayers and
Mr. R. T. St. Germain for their dedicated efforts
in design and programming; Miss Agnes Wolf and
Mrs. L. J. Fiore for their meticulous typing; Mr. Leo
Karl of IBM for initial analysis and Messrs. F. J.
Thomason and J. W. Nixon of Haskins & Sells for
their invaluable advice.
REFERENCES
1 S ROSEN
Electronic computers: A historical survey
ACM Computing Surveys Vol 1 No 1 March 1969
2 R F ROSIN
Supervisory and monitor systems
ACM Computing Surveys Vol 1 No 1 March 1969
3 J DIEBOLD
Thinking ahead: Bad decision on computer use
Harvard Business Review Jan-Feb 1969
4 H LIU
A file management system for a large corporate information
system data bank
Proc FJCC Vol 33 1968
The following references are selected by the authors for general
background information, but are not mentioned in the text.
5 J MARTIN
Programming real-time computer systems
Prentice-Hall 1965
6 J MARTIN
Design of real-time computer systems
Prentice-Hall 1967
7 M G GINZBERG
Notes on testing real-time systems programs
IBM Systems Journal Vol 4 No 1 1965
8 J D ARON
Real-time systems in perspective
IBM Systems Journal Vol 6 No 1 1967
9 J W HAVENDER
Avoiding deadlock in multitasking systems
IBM Systems Journal Vol 7 No 2 1968
10 B I WITT
Job and task management
IBM Systems Journal Vol 5 No 1 1966
11 D D KEEFE
Hierarchical control programs for systems evaluation
IBM Systems Journal Vol 7 No 2 1968
12 W C McGEE
On dynamic program relocation
IBM Systems Journal Vol 4 1965
13 IBM System/360 operating system
MVT Control Program Logic Summary Form Y28-6658
14 IBM System/360 operating system
MVT Supervisor Form Y28-6659
15 IBM System/360 operating system
MVT Job Management Form Y28-6660
16 IBM System/360 operating system
Supervisor and Data Management Services Form C28-0046
Incorporating complex data structures
into a language for social science
research
by STEPHEN W. KIDD
The Brookings Institution
Washington, D. C.
INTRODUCTION
This paper presents a set of augmentations to the
language BEAST* (Brookings Economics and Statistical Translator) as part of a continuing effort to define
a language for a particular group of computer users,
social scientists. In this nebulous group we include
professional economists, political scientists, psychologists, sociologists, and a large number of university
students in those disciplines. An important assumption
underlying our work has been that the cost of not
having substantially better software than presently
exists is very large and should be measured in terms
of researchers' time. The true cost of inappropriate
methods of computer utilization should not be measured by staff and computer costs, but by the social
cost of the output foregone. When answers to questions
of importance for national public policy formation require weeks, months, or even years to obtain, the cost
becomes a social cost that we all eventually bear.
BEAST is a computer language designed to embody
many of the concepts of the more quantifiable social
sciences. The present version of the BEAST operates
primarily upon "rectangular" data files, that is, files
having observations on attributes of enumeration
units. In other words, acceptable files consist of fixed
length logical records, one record for each enumeration
unit. Many social science data files either have this
structure or can be cast in this structure with little
effort, and the majority of "general purpose programs"
written for social scientists also assume this data
structure. However, many social science data files have
a more complex structure and cannot be processed
either by the present version of the BEAST language
or by most existing computer programs.
* Jeffrey W. Bean, Stephen W. Kidd, George Sadowsky, Beverly
D. Sharp, THE BEAST: A User-Oriented Procedural Language
for Social Science Research. (The Brookings Institution, June 13,
1968). Reference to "the current BEAST" should be understood
to refer to that paper.
This paper describes possible extensions to the
BEAST language to make it useful for processing data
with a more complicated structure. Though the data
structures and language constructs described here could
be applied to extensions of other languages, we feel
that they have particular utility when combined with
features already available in the BEAST. The intent
of the proposed extensions is not to introduce a general
list processing capability into the language as has been
done with some other languages,6,9,16,18 but to accommodate a particular class of files characterized by
hierarchical record structures. We have deliberately
decided in favor of a limited structure that permits
the ease of reference that is essential for the users we
envision for the language. The generality of those
complex structures which have been disallowed in
the current proposal is a luxury which can only be
bought for a significant price: the increased specificity
required in a language to reference such structures.
The user who wants such generality pays the price in
other languages in the increased tedium of writing
his program.
Consider a slight variant of the 1966 Survey of
Economic Opportunity (SEO) File constructed by
the U. S. Office of Economic Opportunity. The organization of data within each enumeration unit is tree-structured, that is, each level or segment of data may
be followed by a variable number of segments of data
at the next lower level. Figure 1 illustrates the structure
of this file.
Disaggregation by respondent characteristics, time
period, income group, geographic area or other conditions is often very fruitful for social science research.
For example, using this file it should be possible to
define a subset of households or families based upon
person characteristics, or the reverse. Such groups
might be (1) the set of all families such that no persons
are 65 or more years old, (2) the set of all households
such that at least two persons earn $5,000 or more
per year in wages and salaries, (3) the set of all families
such that exactly two persons are less than 21 years old,
(4) the set of all persons whose households are headed
by a woman, (5) the set of all families that live in the
northeast, and (6) the set of all persons whose families
are at least five persons in size and which live in the
southwest.
The current BEAST language provides the DEFINE
SAMPLE statement for defining a subset of the user's
original population and the ON SAMPLE suffix for
restricting computations to observations within that
subset. The format of the DEFINE SAMPLE statement is:
DEFINE SAMPLE sample name AS logical
expression
An example of a DEFINE SAMPLE statement would
be
DEFINE SAMPLE OLDMEN AS AGE > 65
AND SEX EQ 'M'
That sample definition could be invoked using the ON
SAMPLE suffix to compute the average income of the
old men in a set of data:
LET AVINC = MEAN (INCOME) ON
SAMPLE OLDMEN
The ON SAMPLE suffix can be used in a similar way
to define a restricted domain for calculation of derived
variables, statistical procedures, and input and output.
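For readers more familiar with general-purpose languages, the effect of DEFINE SAMPLE and the ON SAMPLE suffix on a rectangular file can be sketched in Python. This is an analogy, not BEAST itself: the records are made up, the helper names `oldmen` and `mean_on_sample` are hypothetical, and the OLDMEN condition is assumed to be an age above 65 combined with SEX EQ 'M'.

```python
from statistics import mean

# A small rectangular file: one record per enumeration unit (made-up values).
records = [
    {"AGE": 70, "SEX": "M", "INCOME": 4000},
    {"AGE": 68, "SEX": "F", "INCOME": 3500},
    {"AGE": 66, "SEX": "M", "INCOME": 6000},
    {"AGE": 40, "SEX": "M", "INCOME": 9000},
]

def oldmen(record):
    """Sample definition: a logical expression evaluated on each record."""
    return record["AGE"] > 65 and record["SEX"] == "M"

def mean_on_sample(attribute, sample, data):
    """MEAN (attribute) ON SAMPLE sample: restrict, then compute."""
    return mean(r[attribute] for r in data if sample(r))

avinc = mean_on_sample("INCOME", oldmen, records)
print(avinc)
```

The sample is a reusable predicate, so the same definition can restrict any later computation without being restated.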
While the current definition of the language is
sufficient to express extremely general conditions on
rectangular data files, the syntax for logical expressions is insufficient for defining samples of the type
mentioned above for the SEO file. The next two sections
describe an augmented I/O facility and an expanded conditional expression syntax designed to
evaluate logical functions on data structures of the
type indicated.
Before proceeding further, it is useful to formalize
somewhat the data structure indicated in Figure 1.
Data related to a single entity like a person, a family,
a state, or a company we shall call a segment.* An occurrence of a segment resembles one row of a rectangular data matrix: it is one set of values for a list of
attributes, and it is defined by the list of attributes included in one occurrence of the segment. For example,
a segment describing a person (a PERSON segment)
might be defined by the list of attributes AGE, SEX,
INCOME, and RENT. We denote that a PERSON
segment is composed of values for those four attributes
by writing
PERSON [AGE, SEX, INCOME, RENT]
or in general with the notation
segmentname [attributelist]
Figure 1-A logical structure for the survey of
economic opportunity file (segment 1, household data; segment 2, family data; segment 3, person data; segment 4, work experience data)
* The concept of a segment as described here should not be confused with its usage in discussions of virtual memories and address
spaces. Our usage is close to what R. M. Balzer has called a
"collection" in "Dataless Programming", (Rand Corporation
July, 1967) Memorandum RM-5290-ARPA. It also resembles
the usage in COLINGO of "group": COLINGO C-10 User's
Manual, (Mitre Corporation, May 1968) Document ESD-TR-66-653; and the POP-2 usage of the term "record." R. M. Burstall,
J. S. Collins, "An Introduction to the POP-2 Programming
Language," (University of Edinburgh, October, 1967,) MinMAC Reports, No. 4. The term segment has been adopted for
IBM's GIS file management system.
Figure 2 shows an example of one such PERSON
segment. A rectangular matrix would be composed
of a set of such "segments" conceptually placed one
below the other.
As an extension of that structure, segments can be
combined by linking them to construct a "tree". The
tree has as its "root" a single segment, and has as its
"branches" one or more different segments. Figure
3 shows one such tree structure representing one
FAMILY and three PERSON's. * **
A tree such as in Figure 3 is the basic unit in our
augmented data structure.
We shall call each successive tier of the data hierarchy a level. Levels are numbered and begin at one,
the level for those segment types not contained in any
other segment. Level one is the highest segment level
possible. Every segment type has a unique level associated with it, though more than one segment type
may occur at any level. When a segment S is connected
to segment T by a single path through one or more
segments, we shall say that S contains T (conversely,
T is contained in S). All segments contained in segment
S are called subsegments of S. Segments are contained
in a unique segment of the next higher level. This
restriction on the data structure permits simplification of
the language we use to reference the structures. In
particular, it permits attributes of segments at one
level to be "imputed" to segments at a lower level, and
it obviates explicit upward-references when referring
to low level segments.
We shall call information about containment (which
segments contain or are contained in which other segments) structural information about the data, as distinct from the data itself. The structural information
of a file is often contained only implicitly in the
physical arrangement of the data in the file. When data
are read into memory, the structural information
should become explicitly represented as a list structure
for efficiency in processing.
* These figures give no indication of the physical structure of
the data. There are several reasonable ways in which such data
could be arranged, but the language used to talk about such data
should be independent of the physical arrangement of the data.
** For convenience we will call "an occurrence of the structure
defined by X" simply "an X".
Figure 2-A PERSON segment
Figure 3-A simple enumeration unit
Trees of the forms described above can often represent naturally the structure of the enumeration units
(EU's) encountered in social science research. For
the purposes of this paper the tree that represents
an enumeration unit consists of a unique segment type
at level one called the root segment together with all its
subsegments. A file is an ordered set of such enumeration
units. To denote that an enumeration unit has a
structure we shall give the entire aggregate a name and
define its constituents according to their relations. The
simple tree structure in Figure 3 would be defined in
BEAST by writing
DEFINE EU FAMSTRUCT AS
1 FAMILY [REGION, WEALTH,
URBANRURAL]
2 PERSON [AGE, SEX, INCOME,
RENT]
The purpose of such a definition is to describe the set
of possible occurrences of the enumeration unit, since
an EU definition says nothing about whether a particular occurrence of the structure will actually have any
subsegments, the number of subsegments, or the physical order or type of the attributes in the segments.
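One way to hold an occurrence of FAMSTRUCT in memory, with the structural information made explicit, is sketched below. It is an illustration under assumptions: the `Segment` class, its method names and the attribute values are hypothetical, and the paper does not prescribe this representation.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Segment:
    seg_type: str
    level: int
    attrs: dict
    # Each segment is contained in a unique segment of the next higher level.
    parent: Optional["Segment"] = field(default=None, repr=False, compare=False)
    children: list = field(default_factory=list)

    def add(self, seg_type, **attrs):
        """Attach a subsegment one level below this segment."""
        child = Segment(seg_type, self.level + 1, attrs, parent=self)
        self.children.append(child)
        return child

    def subsegments(self):
        """Yield every segment contained in this one, at any depth."""
        for child in self.children:
            yield child
            yield from child.subsegments()

    def imputed(self, name):
        """Look up an attribute here or in a containing segment."""
        seg = self
        while seg is not None:
            if name in seg.attrs:
                return seg.attrs[name]
            seg = seg.parent
        raise KeyError(name)

# One enumeration unit: a FAMILY root with three PERSON subsegments.
family = Segment("FAMILY", 1, {"REGION": "NE", "WEALTH": 12000, "URBANRURAL": "URBAN"})
head = family.add("PERSON", AGE=44, SEX="F", INCOME=6500, RENT=90)
family.add("PERSON", AGE=19, SEX="M", INCOME=1200, RENT=0)
family.add("PERSON", AGE=12, SEX="F", INCOME=0, RENT=0)

print(len(list(family.subsegments())), head.level, head.imputed("REGION"))
```

Because every segment has exactly one parent, an attribute of the FAMILY can be imputed to each PERSON without any explicit upward reference in the user's statement.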
Another example of an EU definition is:
DEFINE EU CONGCOMMITTEE AS
1 COMMITTEE [NAME, BUDGET]
2 MEMBER [LAST, FIRST, STATE,
PARTY]
This definition specifies a tree structure with two levels
that represents a Congressional committee. The root
segment is a COMMITTEE segment, and for the purposes
of CONGCOMMITTEE it has only two attributes,
NAME and BUDGET. Segments of type COMMITTEE are assumed to contain only segments of type
MEMBER. On input only structural information
relating COMMITTEE and MEMBER segments will
be retrieved from the file though the file may contain
other segment types and attributes. On output only
the structural information indicated will be displayed.
As a third example consider the structure defined by the
statement
DEFINE EU DWELLING AS
1 HOUSEHOLD [AGEOFHEAD,
SEXOFHEAD]
2 FAMILY
3 PERSON [AGE,
INCOME]
When a segment name (FAMILY) is included in an
EU definition with no attributes listed, then only the
structural information at that level is extracted from
the file. In this example, DWELLING would have the
form indicated in Figure 4. With such an EU structure
one could evaluate logical expressions that required
structural information, but no attributes, at the family
level. We might, for example, reference
PERSON'S IN FAMILY'S WITH AT LEAST 4
PERSON'S
No FAMILY attributes are needed because only
structural information is required to evaluate this
expression.
New attributes for FAMILY segments could also
be generated from such a structure that begins with a
null attribute list. We could compute the total income
in each FAMILY segment (the sum of all PERSON'S
income contained in that segment) using the TOTAL
function:
LET FAMINC = TOTAL INCOME WITHIN
FAMILY
The function TOTAL has the general form
TOTAL {attribute | segmntid} [WITHIN segmntid [subscript] [boolprim]*]
In the example above we have taken the total of an
attribute (INCOME), where the summation is taken
over all values for INCOME contained within the
specified segment, FAMILY. BEAST assumes that
iteration is intended over all FAMILY segments since
no subscript or modifier is put on the segment identifiers.
To explicitly assign a new attribute to a segment we
will use the notation
LET segmntid: attribute name = expression
Using this notation and the TOTAL function to count
PERSON subsegments, we can create an attribute
in each FAMILY segment equal to the average income
of all persons in the FAMILY:
LET FAMILY: AVINC = (TOTAL INCOME
WITHIN FAMILY)/
(TOTAL PERSON'S
WITHIN FAMILY)
Again iteration over FAMILY segments is implied
because the segment identifier is unqualified. The
value of the function would become a scalar if the
second segment identifier were qualified with either a
simple logical condition or a BEAST subscript.
For example, the statement
LET X = TOTAL INCOME WITHIN
FAMILY'S (1...100)
would compute the sum of the income of all persons
contained in the first 100 families and assign the value
to the scalar variable X.
* The syntactic type boolprim represents a single logical term.
Figure 4-Example 3
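The three uses of TOTAL discussed above (a per-family total, a per-family average, and a scalar over a subscripted range) can be mimicked in Python as follows. This is a sketch only: the nested-dictionary layout, the helper names and the data are hypothetical, and returning `None` for a family with no persons is an assumption, since the text does not say how BEAST treats that case.

```python
def total(attribute, family):
    """TOTAL attribute WITHIN FAMILY: sum over the contained PERSON segments."""
    return sum(person[attribute] for person in family["PERSON"])

def count(seg_type, family):
    """TOTAL seg_type'S WITHIN FAMILY: number of contained subsegments."""
    return len(family[seg_type])

# Made-up enumeration units; the last family has no PERSON subsegments.
families = [
    {"PERSON": [{"INCOME": 5000}, {"INCOME": 3000}]},
    {"PERSON": [{"INCOME": 7000}]},
    {"PERSON": []},
]

# Unqualified FAMILY: iterate over every family and attach new attributes.
for fam in families:
    fam["FAMINC"] = total("INCOME", fam)
    n = count("PERSON", fam)
    fam["AVINC"] = fam["FAMINC"] / n if n else None  # assumed handling of n == 0

# Subscripted FAMILY'S (1...100): one scalar over the first 100 families.
x = sum(total("INCOME", fam) for fam in families[:100])
print([f["FAMINC"] for f in families], [f["AVINC"] for f in families], x)
```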
Conditions on structures
Logical expressions to deal with tree structures of the
type described in the above section must be capable
of expressing both intra-segment relations (analogous
to present BEAST logical expressions) and interlevel
relations among segments contained in or containing
the reference segment of an expression. The reference
segment of an expression is that segment with which
the value of the expression is associated, distinguishing
it from the other segments upon which the value of the
expression may also depend. For example, the reference
segment of the logical expression
FAMILY'S IN HOUSEHOLD'S WITH LESS
THAN 10 PERSON'S
is the FAMILY segment because the expression defines
a condition on FAMILY segments. Three segment
types appear in the expression (FAMILY, HOUSEHOLD, and PERSON) but the value of the entire
expression is clearly a condition on each FAMILY.
Had the expression been simply
HOUSEHOLD'S WITH LESS THAN 10
PERSON'S
then the reference segment would have been HOUSEHOLD.
Table I gives a formal syntax for sample definitions
using the proposed extensions to logical expressions.
The set of words WITH, IN WHICH, etc., are used as
"noise" words and are not significant for the interpretation of an expression. The construct 'S is used optionally to imply a plural and not a possessive. Note
that, for example, the plural of FAMILY becomes
FAMILY'S, not FAMILIES.
TABLE I-Syntax for conditions on structures
samplestatement := DEFINE SAMPLE name AS [refsegmntid] logexp
refsegmntid := segmntid
segmntid := segmntname | segmntname'S | segmntname (subscript)
logexp := boolprim | boolprim {AND | OR} logexp
boolprim := inboolprim | IN {samplename | segmntid [boolprim]} | quantifier {inboolprim ['S] | segmntid [boolprim]} | {ALL | EVERY} {segmntid boolprim | inboolprim} | NOT boolprim | (logexp)
noise words := WITH | IN WHICH | FOR WHICH | FOR WHOM | HAVE | INCLUDE | INCLUDES
inboolprim := (logexp) | logicalvar | numexp relop numexp
quantifier := NO | ANY | A | AN | ONLY | [quantop] integerexp
quantop := EXACTLY | AT LEAST | AT MOST | MORE THAN | LESS THAN
logicalvar := variable of type logical
numexp := arithmetic expression
relop := EQ | NE | GT | LT | GE | LE
integerexp := expression with an integral value
The primary additions to the current BEAST's logical expression syntax are the three logical primitives
defined by the syntax specifications
(1) IN {samplename | segmntid [boolprim]}
(2) quantifier {inboolprim ['S] | segmntid [boolprim]}
(3) {ALL | EVERY} {segmntid boolprim | inboolprim}
where
1. It is often useful to test whether a segment is
contained in another segment having certain
characteristics, e.g., whether a PRODUCT segment is contained in a COMPANY segment of
a particular sort or whether a segment with
quarterly data is contained in a segment with
particular annual data. To make such a test we
have added a logical operator with the form
IN {samplename | segmntid [boolprim]}
If a segment identifier immediately precedes the
word IN then the test is applied to that segment.
If no explicit identifier is used then the test is
applied to the reference segment of the expression.
For convenience let us call the segment being
tested 'S'. Considering the form
S IN segmntid [boolprim]
the system first checks whether the segment
containing S is of type segmntid. If no condition
is specified on segmntid then the value of the
IN phrase is the truth value of that inclusion
test. If the segment S is contained in segmntid,
then any condition on segmntid is also evaluated
and the value of the IN phrase becomes the
truth or falsity of the condition on the segment
at the higher level.
If the IN operator has the form
S IN samplename
the test is TRUE if S is a member of the sample
defined by samplename and FALSE otherwise.
In this case the reference segment of the sample
definition must be (1) the same as S or (2) a
segment that contains S.
For example, using the SEO file, one might say
DEFINE SAMPLE SI AS FAMILY'S
IN HOUSEHOLD WITH AGEOFHEAD OVER 65
The reference segment is explicitly specified
(after AS) as being the FAMILY segment. The
segmntid is HOUSEHOLD, and the Boolean
primitive modifying HOUSEHOLD is WITH
AGEOFHEAD OVER 65. A particular FAMILY segment will be a member of the sample SI
if it is contained in a HOUSEHOLD segment
with an elderly head.
For a second example, let us assume that
there are two types of family segments, called
COUNTRYFAM and CITYFAM, each of
which may contain PERSON segments. We
define as a sample called CITYFOLK all PERSON segments contained in CITYFAM segments by the statement:
DEFINE SAMPLE CITYFOLK AS
PERSON'S IN CITYFAM'S
2. While the first logical operator gave us the
ability to express conditions on the segments
that contain the reference segment of an expression, the second operator puts conditions
on segments that the reference segment may
contain. This operator has the general form
quantifier {inboolprim ['S] | segmntid [boolprim]}
As with IN, the segment S to which this phrase
refers will be the reference segment unless it
immediately follows a different segment identifier.
The quantifier operator tests whether S
contains a specified number (given by the
quantifier) of occurrences of some condition
among its subsegments. The permissible forms
for a quantifier are (1) A, AN, ANY, NO,
ONLY (2) any of the relations
EXACTLY
AT LEAST
AT MOST
MORE THAN
LESS THAN
followed by an integral scalar expression or (3)
simply an integer expression. A, AN, and ANY
are equivalent to AT LEAST 1, and NO is
equivalent to EXACTLY 0. The quantifier
ONLY indicates no specific number of occurrences, but is TRUE if and only if S contains at
that level only segments of type segmntid, and
they satisfy the condition imposed on them, if
any.
The condition referenced by the quantifier
may be subsegments that satisfy some condition
or simply the existence of the subsegments. A
segment satisfying a condition is specified by
either a segment identifier with a logical primitive, or simply an intrasegment boolean primitive (inboolprim) which is a condition made from
attributes all in the same segment type.
Since the value of an inboolprim is uniquely
associated with a particular segment, an inboolprim is equivalent to a segment with a
condition on it (See example below).
The following sample definitions illustrate
the use of the quantifier logical operator applied
to a file of household survey data.
DEFINE SAMPLE BIGFAMS AS
FAMILY'S WITH AT LEAST 4
PERSON'S
This sample definition has FAMILY'S as its
reference segment. The FAMILY segments in
the sample are defined by a single logical
primitive. According to the syntax specification,
AT LEAST 4 is a quantifier, composed of a
quantop (AT LEAST) followed by an integerexp
which in this example is simply the number 4.
In this example the quantifier is followed by a
simple unqualified segment identifier, PERSON'S.
DEFINE SAMPLE CROWDED AS
FAMILY'S IN HOUSEHOLD'S WITH
AT LEAST 10 PERSON'S
The sample CROWDED is defined using both
the primitives IN and a quantifier. IN is a
condition on FAMILY'S because it follows
immediately after the declaration of the reference segment. IN is followed here by the segment
identifier HOUSEHOLD'S qualified by the
phrase AT LEAST 10 PERSON'S. Evaluation
of this expression involves a relatively complex
computation on each enumeration unit, since for
each FAMILY (a level 2 segment) it is necessary
to find the total number of PERSON'S (at
level 3) contained in the parent HOUSEHOLD
segment at level 1.
DEFINE SAl\lPLE ELDERL YFAl\lS AS
F AlVIILY'S WITH A T LEAST 2
AGE'S> 60
This example shows one use of the construction
called an intrasegment boolean primitive. Assuming that AGE is an attribute of the segment
type PERSON, the quantifier 'phrase above
would be equivalent to AT LEAST 2 PERSON'S WITH AGE> 60.
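The quantifier samples above can be illustrated with a minimal Python sketch. The dictionary layout, the helper names seg, count and at_least, and the NAME attribute are assumptions made for illustration only; they are not part of BEAST, whose internal representation is not described here.

```python
# Hypothetical in-memory form of one enumeration unit (not BEAST's storage format).
def seg(segtype, children=(), **attrs):
    return {"type": segtype, "attrs": attrs, "children": list(children)}

def count(segment, segtype, condition=None):
    """Count subsegments of `segtype` at any lower level that satisfy `condition`."""
    n = 0
    for child in segment["children"]:
        if child["type"] == segtype and (condition is None or condition(child)):
            n += 1
        n += count(child, segtype, condition)
    return n

def at_least(n, segment, segtype, condition=None):
    """Quantifier AT LEAST n applied to the subsegments of `segment`."""
    return count(segment, segtype, condition) >= n

fam_a = seg("FAMILY", [seg("PERSON", AGE=a) for a in (67, 71, 30, 5)], NAME="A")
fam_b = seg("FAMILY", [seg("PERSON", AGE=a) for a in (40, 38)], NAME="B")
household = seg("HOUSEHOLD", [fam_a, fam_b])
families = household["children"]

# BIGFAMS: FAMILY'S WITH AT LEAST 4 PERSON'S
bigfams = [f["attrs"]["NAME"] for f in families if at_least(4, f, "PERSON")]
# ELDERLYFAMS: FAMILY'S WITH AT LEAST 2 AGE'S > 60
elderly = [f["attrs"]["NAME"] for f in families
           if at_least(2, f, "PERSON", lambda p: p["attrs"]["AGE"] > 60)]
# CROWDED: FAMILY'S IN HOUSEHOLD'S WITH AT LEAST 10 PERSON'S (tested on the parent)
crowded = [f["attrs"]["NAME"] for f in families if at_least(10, household, "PERSON")]
print(bigfams, elderly, crowded)
```

For CROWDED the count is taken over the parent HOUSEHOLD rather than the FAMILY itself, which is why that evaluation is the more expensive one.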
3. The final condition on a segment is also an
operator applied to its subsegments. Though
similar to ONLY, ALL and EVERY are evaluated using only the segment type indicated,
and are independent of any other subsegment
types which S may contain at the same level as
segmntid. Also, a condition must be specified on
the segment identifier. The words ALL and
EVERY are equivalent. The general form is
{ALL | EVERY} {segmntid boolprim | inboolprim}
Input and output
An integral part of the BEAST language is its
reliance upon machine readable codebooks for describing
data files. The machine readable codebook includes a
format description, including the physical and logical
formats of the data file, the name and positions of all
data items in the file, and the meaning of their permissible values. The BEAST system automatically references this information to interpret any user commands
relating to a file.
As an example of a simple input request, suppose a
user is investigating the relation between housing
costs and income for different age-sex combinations.
He knows that a given file, SURVEYFILE, contains
the results of a sample survey useful to his investigation; he also knows that the file contains at least the
following four attributes of each respondent: age, sex,
rent, and monthly income. In order to access this body
of information using the BEAST, he writes:
SELECT SURVEYFILE
to designate SURVEYFILE as the current input file.
The execution of the SELECT statement causes the
BEAST to read the codebook associated with SURVEYFILE in preparation for an actual input request.
The codebook contains attribute names for each
respondent item; suppose that those corresponding to
the above attributes are AGE, SEX, RENT, and INCOME. For each attribute, the set of measurements
for all respondents is represented as a column vector.
To extract these attributes, the user writes in the
BEAST:
GET SEX, AGE, INCOME, RENT
For example, considering a structure of the
form
1 FAMILY
2 CHILD
2 ADULT [ASSETS]
we could say
FAMILY'S IN WHICH EVERY ADULT
HAS ASSETS > 500
and the value of the expression would be independent of the contents of the CHILD segments
contained in any FAMILY segment.
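The difference between EVERY and ONLY in this example can be sketched in Python as follows. The dictionary layout and the function names every and only are illustrative assumptions; in particular, treating EVERY as true when no segment of the named type is present is an assumption of this sketch, not something the text specifies.

```python
def every(segment, segtype, condition):
    """ALL/EVERY: each immediate subsegment of `segtype` satisfies `condition`;
    subsegments of other types are ignored."""
    return all(condition(c) for c in segment["children"] if c["type"] == segtype)

def only(segment, segtype, condition=None):
    """ONLY: every immediate subsegment is of `segtype` and satisfies `condition`."""
    return all(c["type"] == segtype and (condition is None or condition(c))
               for c in segment["children"])

family = {"type": "FAMILY", "children": [
    {"type": "CHILD", "attrs": {}},
    {"type": "ADULT", "attrs": {"ASSETS": 800}},
    {"type": "ADULT", "attrs": {"ASSETS": 1200}},
]}

def has_assets(adult):
    return adult["attrs"]["ASSETS"] > 500

# FAMILY'S IN WHICH EVERY ADULT HAS ASSETS > 500
every_adult = every(family, "ADULT", has_assets)
# The same condition under ONLY fails because a CHILD segment is present.
only_adult = only(family, "ADULT", has_assets)
print(every_adult, only_adult)
```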
Execution of this GET statement causes four vectors
to be extracted from the file and placed in working
storage. There is no ordering rule for the input list;
the order of the names has no relation to their physical
arrangement on the file.
The remainder of this section shows how the "access
by name" referencing of files can be extended to incorporate the more complex structures described in
this paper. When only one segment type is considered
there is no change from the current BEAST specification because there is no structural information. To
signal the system that structural information exists
in a file the user replaces the simple attribute list in a
GET statement with either the name of an EU structure or an actual EU specification. Such a GET statement indicates that the structural information as well
as the data values should be retrieved from the current
input file. Similarly, an EU specification used in an
output list will result in the display of only the attributes
and structure indicated in the specification. As with the
current BEAST, if no subscripts or sample qualifications
are specified in an I/O list then every occurrence of the
elements specified in the list will be retrieved or printed.
Using this form of I/O list one could write
DEFINE EU DWELLING AS
1 HOUSEHOLD [CITY, STATE]
2 FAMILY [FAMTYPE]
SELECT SURVEYFILE
GET DWELLING (1...100)
The first statement defines DWELLING as a tree
structure with two levels, the household level and the
family level. There are two attributes at the household level; they give the city and state where the
household is located. There is only one attribute at the
family level, an indication of family type. The GET
statement results in the extraction of the first 100 of
these enumeration units from the data file called
SURVEYFILE. The resulting number of FAMILY
segments in these 100 HOUSEHOLDS is unknown,
but it can be found by using the TOTAL function.
LET NFAMS = TOTAL FAMILY'S IN
HOUSEHOLD'S (1...100)
Since the segment identifier FAMILY'S is used with
an explicit qualifier as the object of TOTAL, the value
of the function will be a scalar equal to the number of
FAMILY segments contained in the first 100 HOUSEHOLD segments.
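As a rough illustration of what TOTAL computes here, the following Python sketch counts FAMILY segments over a subscript range of HOUSEHOLD segments. The list-of-dictionaries layout, the synthetic data, and the name total_families are assumptions for illustration; the subscripts are taken to be 1-based and inclusive, as the (1...100) notation suggests.

```python
# Synthetic data: household k holds (k % 3) + 1 families. Purely illustrative.
households = [{"families": ["F"] * (k % 3 + 1)} for k in range(150)]

def total_families(households, first, last):
    """TOTAL FAMILY'S IN HOUSEHOLD'S (first...last), 1-based inclusive subscripts."""
    return sum(len(h["families"]) for h in households[first - 1:last])

# LET NFAMS = TOTAL FAMILY'S IN HOUSEHOLD'S (1...100)
nfams = total_families(households, 1, 100)
print(nfams)
```

The result is a single scalar, independent of how the families are distributed over the households.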
A small BEAST program
We conclude with a small but complete program
utilizing the data structures and statement types
described in this paper. This example also illustrates
two other BEAST statement types, the REPEAT and
COMPUTE statements. The iteration statement in
BEAST is distinguished by the fact that its dummy
argument is defined "by name" rather than "by value."
This is a useful device permitting the dummy to be used
on the left side of an assignment statement, to be only
partially defined on entry of a repeat block, and to
assume as a value any entity in the language that may
be named. The general form of the iteration statement
is given by
[label:] REPEAT FOR dummy 1 = namelist 1
[AND FOR dummy 2 = namelist 2]...
END [label]
The dummy variable must be used in such a way within
the range of a REPEAT that substitution of all elements of the list results in syntactically correct BEAST
statements.
The COMPUTE statement is used to execute complex statistical procedures and print their results. The
COMPUTE statement has the general form
COMPUTE procedure OF dataphrase [WITH optionsphrase]
[ON SAMPLE name]
The procedure may specify any of a number of procedures including cross-tabulation, correlation, multiple regression, and analysis of variance. The data to
which the procedure is to be applied is specified in the
dataphrase, and the exact form of the dataphrase depends on the procedure being invoked. The parameters
of the procedure can be modified using the optionsphrase.
One may, for example, specify that the residuals of a
regression equation are to be printed as part of the
output.
When the arguments of a procedure are at more than
one level the number of "observations" derived from
an enumeration unit equals the number of occurrences
of the lowest level reference. In such a case the values
of the higher level references are distributed over their
subsegments giving a rectangular expansion of the tree
structure. When the phrase ON SAMPLE name is appended to a COMPUTE statement the procedure is
executed using only the observations that are included
in the sample name. The reference segment of the sample must be at least as high as the lowest level attribute
in the dataphrase.
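The rectangular expansion can be sketched in Python as follows: each lowest level segment yields one observation, and the attributes of its ancestors are copied onto it. The nested-dictionary layout, the function name expand, and the form of the sample predicate are assumptions made for this sketch only.

```python
def expand(household, sample=None):
    """Return one observation per PERSON, carrying HOUSEHOLD and FAMILY attributes.
    `sample` is an optional predicate standing in for ON SAMPLE name."""
    rows = []
    for family in household["families"]:
        for person in family["persons"]:
            if sample is None or sample(household, family, person):
                rows.append({**household["attrs"], **family["attrs"], **person})
    return rows

household = {
    "attrs": {"SEXOFHEAD": "F"},
    "families": [
        {"attrs": {"REGION": 1},
         "persons": [{"AGE": 34, "SKILEVEL": 2}, {"AGE": 9, "SKILEVEL": 0}]},
        {"attrs": {"REGION": 1},
         "persons": [{"AGE": 70, "SKILEVEL": 1}]},
    ],
}

observations = expand(household)
# A FAMILY level sample: persons in families with at least 2 persons.
in_sample = expand(household, lambda h, f, p: len(f["persons"]) >= 2)
print(len(observations), len(in_sample))
```

Because the sample predicate is evaluated at the FAMILY level, it selects or rejects all PERSON observations of a family together.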
Table II shows a program that uses the six subsets
defined in the introduction as selection criteria for two
cross tabulations using the Survey of Economic Opportunity file. The program will calculate and print a total
of 12 cross tabulations, two on each of the six samples
defined. Because the variables in the COMPUTE are
at the PERSON level we may use either PERSON,
FAMILY, or HOUSEHOLD level samples.
TABLE II-Sample program
'DEFINE THE STRUCTURE OF THE ENUMERATION UNIT'
DEFINE EU DWELLING AS 1 HOUSEHOLD [SEXOFHEAD, AGEOFHEAD]
2 FAMILY [FAMTYP, REGION, URBANRURAL]
3 PERSON [WAGES, SALARY, AGE, SKILEVEL]
DEFINE SAMPLE S1 AS FAMILY'S WITH NO AGE'S > 65
DEFINE SAMPLE S2 AS HOUSEHOLD'S IN WHICH AT LEAST 2 PERSON'S HAVE
(WAGES + SALARY) ~ 5000
DEFINE SAMPLE S3 AS FAMILY'S WITH 2 AGE'S < 21
DEFINE SAMPLE S4 AS PERSON'S IN HOUSEHOLD'S WITH SEXOFHEAD EQ 'F'
DEFINE SAMPLE S5 AS FAMILY'S WITH REGION = 1
DEFINE SAMPLE S6 AS PERSON'S IN FAMILY'S (WITH AT LEAST 5 PERSON'S
AND WITH REGION EQ 7)
SELECT SE066 'SE066 IS NOW THE INPUT FILE.'
GET DWELLING (1...5000)
L: REPEAT FOR X = S1, S2, S3, S4, S5, S6 'ITERATE OVER SAMPLE DEFINITIONS'
'REPEATS MAY BE NESTED TO ANY DEPTH.'
'THE PERMISSIBLE CATEGORIES FOR EACH VARIABLE ARE GIVEN IN THE
CODE BOOK OF SE066.'
COMPUTE CROSSTAB OF SEX, AGE, SKILEVEL
ON SAMPLE X
COMPUTE CROSSTAB OF URBANRURAL, SKILEVEL
ON SAMPLE X
END L
STOP
CONCLUSION
Languages designed for statistics have tended to operate
upon the simplest data structures, while languages
with facilities for the more complex structures have
seldom had the statistical operations that have made
the current version of BEAST attractive. By extending
BEAST to include the tree structures described here
we hope to increase the usefulness of the language
without sacrificing any of the convenience of the current
language. While the methods of referencing such structures have been stressed here it is nonetheless important
to be able to manipulate such structures to add and
delete individual segments and entire levels. We have
not presented our tentative solutions to the problems
of manipulating segments.
Nanosecond threshold logic gates for
16 X 16 bit, 80 ns LSI multiplier
by LUTZ J. MICHEEL
Air Force Avionics Laboratory
Wright-Patterson Air Force Base, Ohio
INTRODUCTION
Previous research and development efforts in digital
monolithic integrated circuits and arrays were almost
exclusively concerned with Boolean logic. However,
by introducing threshold logic, considerable savings in
gate count as well as in subsystem processing speed
are evident. When logic subsystems, such as registers,
adders, counters or combinational control logic, designed with common NOR logic, were replaced by
subsystems employing threshold logic, average savings
in gate count of three to one have been demonstrated.15,16
Furthermore, the number of consecutive logic levels
necessary to implement a given switching function,
and thus the relative processing delay, is also generally
reduced by the same ratio.
The full adder function requires two inverting
threshold gates, and carry propagation is accomplished
with only one gate delay per stage. Basic flip-flop
types can be implemented by a single threshold gate.
Advanced parallel adders for three addends would
consist of three threshold gates per bit, and functional
multipliers should also become practical in iterative
array implementation.
For full utilization of the much greater logical
capability of threshold gates, the employed technologies
should be amenable to large scale integration which
excludes hybrid approaches. Utilization of such monolithic threshold gates and arrays is possible in most
kinds of computers, data processing and control
equipment.
Threshold logic gates with nonlinear feedback
Smith and Pohm have demonstrated the ultra-high
speed capability of threshold logic gates in the form
of RTL gates modified with negative, nonlinear current
feedback.1,2 In these gates VCE was clamped to Vref
= VBE by tunnel or backward diodes (Figure 1) which
prevented both saturation and cutoff; thus, the transistor always operated in the ON condition near the
fT peak. In reverse direction, of course, the diodes
functioned as the familiar Baker clamp. Propagation
delays between 5 and 1.2 ns were achieved in breadboard implementation with fan-out of 3 and 2. By
varying the bias current Ibb, the authors implemented
NAND, Majority, and NOR for various values of
fan-in ≤ 8 and also the threshold functions lying between these special cases. The gates were not amenable
to monolithic integration, however, because of the
tunnel diodes and because high-accuracy resistors
were required for current biasing.
Figure 1-Modified RTL gate with tunnel diode feedback
Threshold logic gates for LSI
When the first experimental Schottky barrier diodes
became available,3 circuits similar to the gates described
in the first section, but having symmetrical transfer
functions, were studied by the author. A pair of antiparallel diodes were used for the collector base feedback
(Figure 2). The plastic-encapsulated diodes, with
molybdenum silicon interface, had 0.8 pF capacity
at 0 V, and VD = 0.4 V and ΔV/ΔI = 25 Ω at 1 mA.
Type 2N 918 transistors were selected for maximum
fT and minimum rb'. With 19.2 mW average power and
i = 0.8 mA, the switching times (td + tr) and (ts
+ tf) were between 1.65 and 1.85 ns.
Luce confirmed these switching time measurements,4
and using experimental transistors SMX2-T with
fT = 2.8 GHz at VCB = 0 V and IC = 2 mA, he
achieved propagation delays of 1.8 ns with only 5.6 mW
circuit power. The 400 mV, TO-18 Schottky diodes
contributed 2 pF Miller capacity. With reduced voltage
swing of ±300 mV, Luce attained average tpd = 1.4 ns
and minimum tpd = 0.8 ns.
Figure 3 depicts the symmetrical current-in, voltage-out transfer function and the summing point characteristics of the new gate which exhibits, at Ith, a switching
step in VBE of only 23 mV.
A basic improvement of this symmetrical threshold
gate over RTL circuits should be pointed out. In RTL
it is the sum of input currents Σ Iin, plus the (negative)
base bias current Ibb, which turns on the npn transistor.
In the new threshold gate, the (positive) base bias
current keeps the transistor at the threshold point
Ith in ON condition. The input current sum
±Σ Iin is merely required to switch the gate from
Ith to its high or low state.
First order worst case analysis of the basic 5-input
gate was performed under the assumption of temperature tracking of communicating circuits in monolithic
LSI environment. The transistors should have β ≥ 40
and VBE matching of ± 5 mV, while a resistor ratio
tolerance of ± 3 percent is required. These characteristics can be attained in LSI with good manufacturing
yield. The Schottky diodes should have VD = 0.4 V
± 15 percent at 0.8 mA and very low capacitance for
the high-speed version of the gate (i ~ 0.8 mA).
Several other versions are discussed in the next section.
Figure 2-Threshold gate with Schottky diode feedback
Figure 3-Transfer function of the gate shown in Figure 2
Compatible metal-silicon Schottky diodes have been
proposed (MOIiI17) and implemented (MO,6 A}7 ,8 , 19 Pt18 )
as Baker clamps in integrated circuits which were mostly
of the TTL type. The same technology is applicable
to the modified RTL threshold gates. Transistors with
fT ~ 2 GHz are now available for LSI utilization at low
IC and VCB values.5,10
Various optimizations of the integrated gate
While the experiments described in an earlier section
were concentrated on high speed gates, with unity
current step i = 0.8 mA, other circuit options would
emphasize optimization in the following areas.
Low power.
For low fan-in gates, power consumption can be
reduced by small input and collector currents and by
lower collector voltage. The former implies transistor
beta ≥ 80 in order to minimize the influence of absolute
variations in Gbb, Gk and β; unity currents i ~ 0.2 mA
are attainable with this beta. The latter requires an
active source for the collector current. This would
improve the circuit dc performance since collector
current variations between the high and the low output states would be minimized. Trade-off studies are
required in order to determine whether current source
or collector resistor contributes lower collector load
capacities. The high area consumption and the low
beta of lateral pnp transistors makes the current source
less attractive for LSI circuits at this time.
High input weight count.
When many or all inputs are low, the high negative
summation current must be accommodated by feedback diode current and collector resistor current. Minimum input current (i/2 ~ 0.1 mA) and high transistor
beta are again required. Tantalum thin-film overlay
resistors would provide high sheet resistivity for accommodating the large number of input resistors
without requiring overly large substrate area consumption. Decoupling diodes1,2 would ease the problems
of leakage currents and of resistor tolerance requirements.
Improved noise immunity.
Two Schottky diodes in series per feedback branch
(or simply two anti-parallel silicon diodes) would
increase the voltage swing to ±0.8 V (or ±0.6 V).
Collector biasing would be required in order to avoid
saturation.1,2
High fan-out.
Collector biasing in combination with an emitter-follower output stage (Figure 4) would greatly improve
the output drive capability.1
RC inputs.
As Smith and Pohm pointed out, capacitive shunting
of the input resistors would increase the gate switching
speed for very low fan-out. Capacitive shunting, however, is an acceptable method only for low-noise array
environment and for non-reversing switching transitions (no spikes), such as in a ripple carry.
Reduced Miller effect.
The detrimental Miller effect could be reduced if
only one (symmetrical) Schottky diode were used with
VD = ±0.4 V at ID = ±i/2. Following a suggestion
by Schuermeyer,11 this type of diode can be obtained
through very high concentration of surface states.
Proposed functional LSI multiplier
Recent advancements in the state-of-the-art of
silicon processing for medium scale and large scale
integration have made possible the implementation of
monolithic arrays composed of the new threshold gate.
High densities with 11 mil² average area consumption
per component have been achieved in pilot line LSI
with good processing yield;10 this includes intra/interconnections and three layers of metallization which
facilitate optimum layout. The array was an 8-bit
adder employing ECL trees with 1.2 ns carry propagation. The transistors have 0.15 X 0.8 mil emitters
and 100 Ω/square base resistivity. The resistors were
0.5 mil wide with values in the 100 ... 4000 Ω range and
exhibited 6 percent ratio tolerance on 60 Ω/square
material.
Figure 4-Collector biasing and emitter follower output
The threshold gates of the second section require
clusters of equal resistors in the 1 ... 4 kΩ range
with 3 percent ratio tolerance.
This tolerance could
be attained with 0.5 mil wide resistors on 100
Ω/square base material. The 2 GHz transistors with
0.1 X 0.4 mil emitters discussed by Phillips et al.10
should also be amenable to LSI in the early Seventies.
The 10 mW high speed gate with i = 0.8 mA uses
1 kΩ resistors; with fan-in of 5, this gate would encompass a substrate area of approximately 6 X 13
mil².
A 16 X 16 bit functional multiplier is proposed for
LSI implementation using the 10 mW, 1 ns threshold
gate. Figure 5 shows the matrix of multiplier cells in
skewed form with all sum and output bits having a
given binary weight arranged in the same column.
Each cell Mij of the multiplier (Figure 6) is composed
of a full adder and an AND gate which performs the
multiplication. The cell in Figure 6 operates on Xi and
Yj, and the adder inputs are Pij = XiYj, C(i-1)(j),
and S(i+1)(j-1). If the gates of the third section with
symmetrical transfer function are used and if T'w0
(X1, X2, ..., Xk) = T'w0(Ia) is the inverting threshold
function, all three multiplier-cell gates can be implemented with the threshold w0 = 0:
\[ P_{ij} = T'_0\left(X'_i,\; Y'_j,\; +0.5\right) \]
\[ C'_{ij} = T'_0\left(P_{ij},\; C_{(i-1)(j)},\; S_{(i+1)(j-1)}\right) \]
\[ S'_{ij} = T'_0\left(P_{ij},\; C_{(i-1)(j)},\; S_{(i+1)(j-1)},\; 2C'_{ij}\right) \]
Figure 5-16 X 16 bit functional multiplier
Figure 6-Functional multiplier cell
Figure 7-16 X 16 bit functional multiplier on four
LSI dies
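The three gate equations of the multiplier cell can be checked exhaustively with a short Python sketch. It assumes logic levels of -1 and +1, models the inverting threshold gate with threshold 0 as a sign test on the weighted input sum, and reads the equations as P = T'(X', Y', +0.5), C' = T'(P, Cin, Sin) and S' = T'(P, Cin, Sin, 2C'); the function names t_inv, cell and bit are illustrative, not taken from the paper.

```python
from itertools import product

def t_inv(*weighted_inputs):
    """Inverting threshold gate, threshold 0: output -1 if the input sum is
    positive, +1 if it is negative (the sum is never zero for these gates)."""
    return -1 if sum(weighted_inputs) > 0 else +1

def cell(x, y, c_in, s_in):
    """One multiplier cell built from three inverting threshold gates."""
    p = t_inv(-x, -y, 0.5)                     # AND of x and y from inverted inputs
    c_not = t_inv(p, c_in, s_in)               # inverted carry (inverted majority)
    s_not = t_inv(p, c_in, s_in, 2 * c_not)    # inverted sum, using the weight-2 carry
    return -c_not, -s_not                      # true carry and sum

def bit(level):
    """Map logic levels -1/+1 to binary 0/1."""
    return (level + 1) // 2

# Exhaustive check: the cell adds the partial product to the incoming carry and sum.
ok = all(
    (bit(x) & bit(y)) + bit(c) + bit(s) == 2 * bit(carry) + bit(total)
    for x, y, c, s in product((-1, +1), repeat=4)
    for carry, total in [cell(x, y, c, s)]
)
print(ok)
```

The carry depends on a single gate, while the sum gate waits for the inverted carry, which is consistent with one gate delay per stage for carry propagation.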
The proposed multiplier would be implemented on
four LSI dies with 64 iterative cells each (Figure 7)
with two layers of metallization. For attaining optimum
layout and, thereby, higher component density, an
implementation with three layers may be preferable.10
The three gates per cell would occupy an area of 13
X 18 mil², and each LSI die would have an area of
~ 115 X 155 mil². For 10 mW power per gate, the
array will consume 1.92 W power, and 44-pin 1 X 1
in² stud packages would provide adequate thermal
management.
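The power and area figures can be reproduced with a few lines of Python. The sketch assumes that the 1.92 W figure refers to one die of 64 cells, since that is what three 10 mW gates per cell yields; the variable names are illustrative.

```python
CELLS_PER_DIE = 64
GATES_PER_CELL = 3
POWER_PER_GATE_MW = 10
DIES = 4

power_per_die_w = CELLS_PER_DIE * GATES_PER_CELL * POWER_PER_GATE_MW / 1000
total_power_w = DIES * power_per_die_w

cell_area_mil2 = 13 * 18                      # three gates per cell
cells_area_mil2 = CELLS_PER_DIE * cell_area_mil2
die_area_mil2 = 115 * 155                     # approximate die size from the text
cell_fraction = cells_area_mil2 / die_area_mil2

print(power_per_die_w, total_power_w, round(cell_fraction, 2))
```

Under these assumptions the cells occupy about 84 percent of each die, leaving the remainder for bussing and bonding pads.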
The multiplier cells are used in two complementary
ways: as Type 1 cells with positive inputs/negative
outputs (Figure 8a) and as Type 2 cells with negative
inputs/positive outputs (Figure 8b). Equivalent to the
odd/even levels in NOR logic design,12 alternating
matrix columns (Figure 9a, b) are composed of Type 1
cells using inverted inputs X'i (i = odd, e.g.) and of
Type 2 cells using true inputs Xi+1 (i + 1 = even).
Only one bus connection to the matrix is required per
flip-flop in the X-register, whereas both Yj and Y'j
are bussed through each horizontal row. An additional
column i = 17 of carry inverters (Figure 10) converts
C(16)j into S(17)j.
Although the average carry ripple length is much
shorter13 than the full length of each partial product
adder (having j = const.), no carry look-ahead circuitry is included since it would corrupt the iterative
structure of the multiplier and also the approach of
minimum wafer area consumption of the LSI array.
The worst case multiplication time tM for two k-bit
numbers includes 2k - 1 carry delays and k - 1
sum delays (Figure 10). Three nanoseconds should be
allowed for each package-to-package transition ttr
assuming matched transmission lines, and a settling
time of less than 2 ns is required for the output flip-flops Q each of which consists of a single threshold
gate.14 For k = 16,
\[ t_M = t_{pd}(\mathrm{AND}) + 31\, t_{pd}(\mathrm{Carry}) + 15\, t_{pd}(\mathrm{Sum}) + 5\, t_{tr} + t_{set} = (1 + 31 + 30 + 15 + 2)\ \mathrm{ns} = 79\ \mathrm{ns}. \]
Figure 8a-Type 1 cell
Figure 8b-Type 2 cell
Figure 9a-Alternating type 1/type 2 cells in the
multiplier matrix
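The worst case timing estimate can be written as a small Python function. The per-term delays are assumptions read off the numerical sum above: 1 ns for the AND gate and for each carry, 2 ns for each sum (15 sum delays contribute 30 ns), 3 ns per package-to-package transition, and 2 ns settling time. The count of five transitions is taken from the k = 16, four-die case and is not derived from k.

```python
def worst_case_multiply_ns(k, t_and=1.0, t_carry=1.0, t_sum=2.0,
                           t_tr=3.0, n_transitions=5, t_set=2.0):
    """Worst case multiplication time for two k-bit numbers, in nanoseconds."""
    carry_delays = 2 * k - 1     # 2k - 1 carry delays
    sum_delays = k - 1           # k - 1 sum delays
    return (t_and
            + carry_delays * t_carry
            + sum_delays * t_sum
            + n_transitions * t_tr
            + t_set)

t_m = worst_case_multiply_ns(16)
print(t_m)
```

With these values the 31 carry delays and the 15 sum delays contribute about equally, so the array delay dominates the package transitions.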
CONCLUSIONS
New low-power nanosecond threshold logic gates which
are amenable to monolithic LSI have been discussed.
These gates require high-performance integrated devices, and the necessary advanced silicon processing
techniques should be available with high manufacturing yield in the early Seventies. Functional LSI multipliers with 80 ns multiplication time for two 16-bit
numbers have been proposed. Such multipliers and
similar monolithic LSI arrays, e.g., high-speed adders,
counters, and control logic subsystems, can be advantageously implemented with threshold logic;16 the
average savings in gate count is 3:1, and the number of
interconnections is reduced by 2:1 or more. LSI arrays
with the new 10 mW, 1 ns threshold gate would be
applicable to future ultra-fast, low-power data processing systems. Practical procedures for logic design
with threshold logic gates were published elsewhere
by Winder.15
Figure 9b-Alternating columns X'/X composed of
type 1/type 2 cells
Figure 10-Longest path for worst-case multiplication
time
ACKNOWLEDGMENTS
The stimulating discussions with R. Winder, R. Luce,
J. Tellier, and C. Huang are gratefully acknowledged.
REFERENCES
1 W R SMITH A V POHM
A new approach to resistor-transistor-tunnel diode nanosecond
logic
IRE Trans EC Vol 11 Oct 1962 658-664
2 W R SMITH
Resistor-transistor-backward diode nanosecond logic
Semiconductor Products and Solid-State Tech Vol 6
March 1963 17-23
3 Samples of hot carrier diodes developed under Contract
AF33(615)-2629 by Motorola Inc Semiconductor Div
for the U S Air Force Avionics Lab
The samples became available in Oct 1965
4 R L LUCE
Personal communication May 1966
5 W SEELBACH
Monthly Status Letter for May 1966 Motorola Inc under
Contract AF33(615)-5205 with the Air Force Avionics Lab
See also Reference 10
6 Y TARUI Y HAYASHI H TESHIMA
T SEKIGAWA
Transistor Schottky-diode integrated-logic circuit
Internat Solid-State Circuits Conf Phila Pa Feb 1968
7 R A ALDRICH
Low storage Schottky barrier diode transistors
Internat Electron Devices Meeting Wash D C Oct 1968
8 J E PRICE
A high-speed integrated Schottky-diode transistor logic circuit
Internat Electron Devices Meeting Wash D C Oct 1968
9 W SEELBACH D METZ
Compatible semiconductor thin film techniques
AF Avionics Lab Tech Rpt AFAL-TR-66-305 Oct 1966
AD 802 677 Prepared under Contract AF33(615)-2629 by
Motorola Inc Semiconductor Products Div Phoenix Ariz
10 C PHILLIPS G RUPPRECHT F LEE
Advanced integration techniques for low power, 100-500 MHz
digital processing
AF Avionics Lab Tech Rpt AFAL-TR 68-226 Sept 1968
AD 843997 Prepared under Contract AF33(615)-5205 by
Motorola Inc Semiconductor Products Div Phoenix Ariz
11 F SCHUERMEYER
Personal communication Sept 1968
12 G A MALEY J EARLE
The Logic Design of Transistor Digital Computers
Prentice-Hall Inc 1963 Englewood Cliffs N J
13 B GILCHRIST J H POMERENE S Y WONG
Fast carry logic for digital computers
IRE Trans EC Vol 4 Dec 1955 133-135
14 J I AMODEI D HAMPEL T R MAYHEW
R O WINDER
An integrated threshold gate
Internat Solid-State Circuits Conf Phila Pa Feb 1967
15 R O WINDER
The status of threshold logic
First Annual Princeton Conf on Info Sciences and Systems
Princeton N J March 1967
16 J H BEINART D HAMPEL K PROST
R O WINDER
Integrated threshold logic for LSI
USAF Avionics Lab Final Rpt No AFAL-TR-69-195
on Contract F33615-68-C-1536 Prepared by RCA
Advanced Communications Lab Somerville N J
Published in Aug 1969 AD 857477
Silicon-on-sapphire complementary MOS
circuits for high speed associative
memory *
by J. R. BURNS and J. H. SCOTT
RCA Laboratories
Princeton, New Jersey
INTRODUCTION
The utility of associative memory in a wide variety of information handling systems has been long recognized
and in the early 1950's such memory systems were proposed for implementation through cryotron logic and
storage arrays. Cryogenic element technology afforded
the ingredient of compatible logic and memory within
a basic cell, a requirement essential to the practical
realization of associative memories. To date, such an
approach has not been successful due mainly to processing difficulties connected with thin film elements
operating in a liquid helium environment. Other approaches, involving the use of multi-apertured magnetic elements, have been proposed and implemented,
but the resultant cost was prohibitive due to complexities of peripheral electronics as well as the magnetic
storage element itself. Furthermore, systems of this
type have relatively long parallel search times (~ 10
μsecs) especially if access is on a serial-by-bit basis.
These considerations have seriously limited the applicability of associative concepts in all forms of data processing and have resulted in a situation where system
designers do not consider associative memory as a
solution to a given problem in spite of many obvious
advantages in applications such as sorting, merging,
pattern recognition, and most recently memory allocation in time shared computers.
* The research reported in this paper was sponsored by the Electronic Research Directorate, Rome Air Development Center, Air
Force Systems Command, Griffiss Air Force Base, New York
under contract F30602-68-C-0197.
Many of the objections mentioned above are not
valid today because of the rapid evolution of integrated
circuit technology. This is particularly the case for
semiconductor memory arrays where the universality
and regularity of such subsystems take full advantage
of the low cost potential of Large Scale Integration
(LSI). Considerable effort has been expended throughout the industry on high speed random access memory
arrays having non-destructive read-out in the sub-100 nanosecond range where the cost of competitive
magnetic memories is dictated by the high quality
peripheral electronic circuitry. Although a substantial
part of this effort has been on bipolar memories, the
dominant trend is toward MOS memories because of
the simpler processing technology, lower power dissipation, and smaller silicon area per bit, all of which
lead to lower cost systems. Monolithic silicon MOS
memories generally take two forms, i.e., single polarity
MOS arrays, usually P type, and complementary MOS,
a unique circuit configuration capable of higher speed
and extremely low power but at the expense of more
complex processing technology and slightly higher
costs. This is the approach taken here for the realization of a sophisticated associative memory with one
important difference, namely, the utilization of thin
film silicon-on-sapphire technology¹ for the fabrication
of high quality complementary MOS arrays. Silicon-on-sapphire combines the best features of monolithic
silicon and thin film integrated circuits through the
epitaxial growth of thin films of single crystal silicon on sapphire substrates; the films can be selectively removed so that parasitic reactance which degrades
the performance of monolithic circuits is effectively
eliminated. Coupled with the improved circuit performance is a potentially simpler processing sequence for
CMOS integrated arrays (requiring only two noncritical
source-drain diffusions) which will substantially reduce
costs as well.
Associative memory design
General considerations
Several considerations influence the design of an
associative memory array, the majority of these having
to do with the sophistication required of the memory.
Based on requirements believed to be minimal in most
associative applications, the following features are
desirable:
1. Normal operation as a read-write random access
memory (having high speed non-destructive
read out) in addition to completely parallel
associative search operation.
2. "Masked" search capability so that any part of
the total field can be eliminated from the search
word. This will also provide a "masked" write
wherein partial updating of the field of a selected
word is possible.
3. Modular array design so that associative memories
of arbitrary numbers of words and bits
per word can be constructed by "wired OR"
connections of the word and bit lines of individual modules.
Accordingly, the module was chosen to be one of
four words, each four bits long, and has the basic block
diagram shown in Figure 1.
Operation of the module is summarized in Table I.
As shown, the module performs as a read-write
memory in addition to the ability to perform a completely parallel search. In the "don't care" condition,
where both lines of a bit line pair are "0," the
digits masked off in this way
will not produce a mismatch signal in any word regardless of the contents of that bit in the word. Design
of the basic cell which performs these various functions
is discussed in the next section.
Associative cell design
The circuit diagram of the basic cell, designed to
implement the aforementioned functions, is shown in
Figure 2 and is seen to consist of 14 MOS devices, 10
N channel and 4 P channel. The flip-flop portion consists of the cross-coupled CMOS inverters N1, P1, and
N2, P2. To write a "1" into the cell, W and D1 are raised
to +Vo volts and D2 remains at ground. This combination cuts off P3 while the series combination of N3
and N6 pulls the "0" side of the flip-flop down toward
TABLE I-Associative module system operation

FUNCTION          WORD AND BIT LINE CONDITIONS           RESPONSE
Write             Wi = "1"; D1j = "1", D2j = "0"         Write "1" in jth bit of ith word.
                  Wi = "1"; D1j = "0", D2j = "1"         Write "0" in jth bit of ith word.
Read              Wi = "1"; all Dj lines = "0"           Non-destructive read of ith word; stored contents
                                                         determined by presence or absence of Is on lines D2j.
Parallel Search   All Wi = "0"; D1j = "1", D2j = "0"     Search for "1" in jth bit.
                  All Wi = "0"; D1j = "0", D2j = "1"     Search for "0" in jth bit.
                  All Wi = "0"; D1j = "0", D2j = "0"     Don't care.
                                                         Mismatch of any bit in word indicated by current
                                                         on W lines.
Silicon-on-Sapphire Complementary MOS Circuits
Figure 1-Associative memory block diagram
Figure 2-Complementary MOS associative cell ("1" = +Vo, "0" = GND)
ground and after one stage delay the "1" side is up
at +Vo. Similarly, a "0" is written by raising W and
D2 to +Vo with D1 at ground potential. Note that
when both D lines are grounded and W is high, the state
of the cell is unchanged.
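The write and hold behaviour just described can be sketched as a small truth-table function. This is only a logical model of one cell as summarized in Table I, not a circuit simulation; the function name `write_cell` and the decision to reject the case where both bit lines are high (which the text does not describe) are assumptions made for illustration.

```python
def write_cell(state, w, d1, d2):
    """Return the stored bit of one cell after applying word line W and bit lines D1, D2.

    Logic levels are 1 for +Vo and 0 for ground, following Table I.
    """
    if d1 == 1 and d2 == 1:
        # Not covered by the text; treated as invalid in this sketch.
        raise ValueError("D1 and D2 both high is not a defined condition")
    if w == 1 and (d1, d2) == (1, 0):
        return 1  # write "1"
    if w == 1 and (d1, d2) == (0, 1):
        return 0  # write "0"
    # Word not selected, or both bit lines grounded: the flip-flop holds its state.
    return state


if __name__ == "__main__":
    for state in (0, 1):
        for w, d1, d2 in [(1, 1, 0), (1, 0, 1), (1, 0, 0), (0, 1, 0)]:
            print(state, (w, d1, d2), "->", write_cell(state, w, d1, d2))
```

Because a selected word with both bit lines grounded leaves the cell unchanged, grounding the bit line pairs of some digits during a write gives the "masked" write listed among the design features.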
Non-destructive read out is obtained by again selecting W, thereby turning on transistor N10, and keeping all D lines at ground. Depending on the state of
the cell, N8 is either on or off and a large or negligibly
small current is produced on the low impedance D2 line.
Mismatch detection by means of a parallel search is
accomplished with all W lines grounded and placing
the search criterion on each bit line pair, i.e., D1 = +Vo,
D2 = 0 for "1"; D1 = 0, D2 = +Vo for "0"; and
D1 = D2 = 0 for "don't care" or "∅". Transistors
N6, N7, N8, and N9 form the local exclusive OR circuit.
If the stored information mismatches the information
on the bit lines, one of the pairs N6-N7 or N8-N9
will form a conducting path from the positive supply
to the W line (at ground potential) and produce a large
current (~1 mA) in the W line. Both pairs will be cut
off if there is a match or if a "don't care" condition
prevails in that bit location. Since all such circuits are
OR'd together across the word, a match occurs only if
all exclusive OR gates are open, i.e., a negligibly small
current appears on the W line. Any bit of the word mismatching the search bit will generate a mismatch current for the entire word.
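The masked parallel search can be illustrated with a short logical sketch. The bit line encoding follows the text (D1 high searches for "1", D2 high searches for "0", both grounded is "don't care"); the function names, the use of "X" for a masked digit, and the memory contents are hypothetical and are not taken from the figures.

```python
def bit_lines(search_digit):
    """Map a search digit to the (D1, D2) bit line pair; 'X' marks a masked digit."""
    return {"1": (1, 0), "0": (0, 1), "X": (0, 0)}[search_digit]


def cell_mismatch(stored, d1, d2):
    """Exclusive OR at one cell: conducts when the driven line disagrees with the stored bit."""
    return (d1 == 1 and stored == 0) or (d2 == 1 and stored == 1)


def search(words, pattern):
    """Return the number of mismatching bits per word; 0 means the word matches."""
    lines = [bit_lines(digit) for digit in pattern]
    return [
        sum(cell_mismatch(bit, d1, d2) for bit, (d1, d2) in zip(word, lines))
        for word in words
    ]


if __name__ == "__main__":
    # Hypothetical four-word, four-bit module contents.
    memory = [[1, 0, 0, 1], [0, 0, 0, 1], [0, 1, 0, 1], [1, 1, 0, 1]]
    print(search(memory, "0001"))  # [1, 0, 1, 2]
    print(search(memory, "X001"))  # [0, 0, 1, 1]
    print(search(memory, "XX01"))  # [0, 0, 0, 0]
```

Since the cell circuits of a word are OR'd onto one W line, a nonzero count for a word corresponds to a mismatch current on that line, and masked digits never contribute.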
It should be noted that read out and mismatch detection are both accomplished by current sensing in a
low impedance line. This is an extremely high speed
operation as the relatively large capacitance on the
word and digit lines can be swamped by the low input
resistance of a grounded base bipolar transistor and the voltage
conversion done at the relatively low capacitance collector and at essentially the same current level. (A
complementary emitter follower performs more than
adequately as a combination drive-sense circuit on
both word and digit lines.) In high speed table look-up
applications¹ such as "paging" in time shared computers, fast parallel search and access is extremely
desirable as this function is carried out once every main
memory cycle.
Another aspect of current sensing on the mismatch line is that the magnitude of the mismatch current is
directly proportional to the number of bits in error in
that particular word. Utilization of analog detection
circuitry on this line will then enable the determination
of the word which most closely matches the search
word, independent of the significance of the bit. The
so-called "proximity match" condition is quite useful
in many aspects of pattern recognition, for example,
or other applications where incomplete information is
available for the search criterion.
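A proximity match can be sketched by treating the mismatch current of each word as proportional to its number of bits in error and picking the word with the smallest current. The figure of roughly 1 mA per mismatching bit follows the current level quoted in the text; the names, the "X" mask symbol, and the stored words below are illustrative assumptions.

```python
MA_PER_MISMATCHED_BIT = 1.0  # assumed: about 1 mA of mismatch current per bit in error


def mismatch_current_ma(word, pattern):
    """Idealized W line current for one word; digits marked 'X' are masked out."""
    errors = sum(1 for bit, digit in zip(word, pattern) if digit != "X" and int(digit) != bit)
    return MA_PER_MISMATCHED_BIT * errors


def closest_word(words, pattern):
    """Return (index of the best matching word, list of all mismatch currents)."""
    currents = [mismatch_current_ma(word, pattern) for word in words]
    best = min(range(len(words)), key=currents.__getitem__)
    return best, currents


if __name__ == "__main__":
    # Hypothetical contents in which no word matches the search word exactly.
    memory = [[1, 1, 1, 0], [0, 1, 0, 1], [1, 0, 0, 0]]
    print(closest_word(memory, "0001"))  # (1, [4.0, 1.0, 2.0])
```

In hardware the comparison would be made by analog detection circuitry on the W lines rather than by counting bits; the sketch only shows why a current proportional to the error count is enough to rank the words.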
Processing of silicon-on-sapphire COS/MOS
Technical considerations
Great difficulty has been experienced and reported
by workers² attempting to build high quality, active
silicon devices on sapphire substrates by the straightforward application of standard bulk silicon technology
to these heteroepitaxial films. These difficulties can
be traced, in general, to two problems.
The first is contamination from the substrate, epitaxial system or handling procedures, and the second
is due to disorder in the epitaxial layer caused by the
growth interface. It can, therefore, be expected that
devices and circuits fabricated in heteroepitaxial material must have the silicon processing adjusted in order
to account for these deviations in properties.
TABLE II-Physical characteristics of heteroepitaxial system components

                                        Silicon (Si)                 Sapphire (Al2O3)
Crystal unit cell                       face-centered cubic          a = 4.758
                                        a = 5.4301                   c = 12.991
Density (g/cc)                          2.33                         3.98
Hardness (Mohs)                         7                            9
Melting point (°C)                      1412                         2030
Dielectric constant                     11.7                         9.4 (⊥ to C)
                                        (500 Hz - 30 MHz)            (100 Hz - 100 kHz)
Dissipation factor, tan δ               10^-3 - 10^-4                10^-4
Refractive index                        3.4975 at 1.357 μ            1.7707 at 5461 Å
Thermal conductivity                    0.30                         0.065 (60° to C)
(cal/cm sec °C at 25°C)
Thermal expansion coefficient           3.59 X 10^-6                 8.4 X 10^-6
(1/°C, 25 - 800°C)
Table II is a comparison of some of the physical
characteristics of the components of the heteroepitaxial
system that must be taken into account if high quality
silicon-on-sapphire devices are to be built. From this
data, it is evident that some physical stress and disorder due to the mismatch of these characteristics is
inevitable.
The effects of disorder and strain on the properties
of bulk silicon are well known, e.g., base "push out"
in bipolar transistors. Comparison of what is known to
occur in bulk and what is observed in SOS yields some
insight into the processing considerations. The most
severe problems are:
1. Accelerated Diffusion
2. Accelerated Oxidation
3. Contamination
The change of diffusion coefficient in bulk silicon
is a function of surface concentration and dislocation
density. The distribution of disorder sites in SOS has
been shown to be highest at the silicon to substrate
interface and decreases as the thickness of the film increases.³ Due to this distribution, there is a change in
diffusion coefficient causing the impurities to move
faster as they penetrate the film, and therefore lateral
diffusion can increase with depth. The resultant diffusion profiles are depicted in Figure 3. The following
Figure 4 is a photograph of an actual "angle lap and
stain" demonstrating the results of too long a diffusion
of the source and drain regions. Note that the resulting
short of the source to the drain is at the silicon to
sapphire interface. Because SOS has no bulk to dilute
the fast diffusing contaminants, and because of the additional
complication that the substrate can contribute to the
contamination (Al, O2, etc.), much greater care must
be taken in handling and substrate preparation. This
Figure 3-Diffusion failures in bulk silicon and silicon-on-sapphire
Figure 4-Photomicrograph of diffusion failure in SOS
consideration is further compounded by the affinity
of contaminants for disorder sites.
Finally, oxidation and its effects must be considered
in the light of what is known to occur in bulk silicon,
for this is the pillar on which silicon technology is
built. Here, one finds three major effects. The first is
dissolution of O2 from the ambient and the generation
of donor states with reported values on the order of
lOts/cc. The second is the precipitation of these impurities causing large changes in mobility and, finally,
segregation of impurities in the oxide.⁴
From the previous discussion, it is obvious, without
going into the details of the phenomena involved,
that bulk silicon technology is not directly applicable
to the fabrication of high quality complementary
MOS devices on insulating substrates. The necessary
alterations in the process involve elimination of
oxidation where possible and minimization of the time
that the wafer is exposed to high temperatures. In
addition, advantage can be taken of the thin film nature of the technology by utilizing the "deep depletion"
MOS structure,⁵ thereby enabling construction of
complementary devices in a film of common conductivity type.
Figure 5(a) depicts a wafer of silicon-on-sapphire with
a 300°C deposited oxide defined by photolithographic
techniques in the pattern to etch away that silicon
not utilized by active devices. Figure 5(b) shows the
pattern left after the silicon is etched from the undesired
region.
After the desired pattern is achieved in the silicon,
a thin layer of boron doped oxide [cross-hatched area,
Figure 5(c)] covered with pure SiO2 is deposited (300°C)
and etched into the desired P+ regions. This is followed
by phosphorus doped oxide covered by pure SiO2 and
etched to define the N+ regions as shown in Figure 5(d).
This structure has never been above 300°C and has
the appropriate doped oxide source defined in the P+
and N+ regions with the channel regions clean and free
of oxide. The wafer is then subjected to its only high
temperature treatment for the time required to drive
in the diffusants and grow the channel oxide. This is
indicated in Figure 5(e) with the appropriate diffusions
drawn in. The final device structure (see Figure 6),
including metallization, shows the built-up oxides at
the edge of the active gate region minimizing the
parasitic overlap capacitance.
The metallization utilized to complete these structures was evaporated aluminum and posed some problems in continuity due to the relatively large silicon
steps the metal was required to pass over (~1 micron).
Figure 7 is a scanning electron photograph of one such
crossover. Note that the metal is thinner than the one
micron of silicon and that the continuity appears suspect. In fact, it was continuous. By increasing the
thickness of the aluminum used to 15,000 Å or 1.5
microns, this problem was virtually eliminated.
Figure 5-Low temperature CMOS process (steps A through E)
Figure 6-Final device structure
Figure 7-Aluminum metallization over oxidized silicon edge
Unique features of SOS technology
Several significant advantages result from the utilization of SOS in the areas of process simplification as
well as improved device and circuit performance. The
use of the "deep depletion" MOS⁵ eliminates the need
for a difficult counter diffusion to form complementary
devices, while selective silicon removal restricts the
critical silicon areas to the channel regions of the
transistors since all metal interconnects are routed
over the sapphire substrate. This gives complete
freedom from metal to substrate shorts and spurious
channel formation, two major sources of yield reduction
in monolithic MOS arrays.
The most obvious advantage of this technology is the
substantial improvement in circuit speed due mainly to
the virtual elimination of all parasitic capacitances
within the array. As shown in Figure 6, the through
diffusion of the source and drain contacts to the
sapphire reduces the contribution of the junction
capacitance to the side-wall area, one dimension of
which is only 1 micron, thereby cutting this capacitance
by approximately two orders of magnitude over a bulk
device of the same surface dimensions.
Doped oxides as solid diffusion sources serve to further reduce parasitics in the form of gate overlap and
crossover capacitances and all wiring capacitance is
completely eliminated. Combined with the low threshold
voltages (typically 0.5 volt enhancement for both
device types) and the high field effect carrier mobilities
of the transistors, the overall result is the realization of
the full high frequency capability of MOS devices
within an array environment. Inasmuch as the gain
bandwidth product of the MOS is comparable to that
of double diffused bipolar devices, circuit speeds approaching those obtained with non-saturating bipolar
logic gates (nano-second stage delays) can indeed be
achieved with this technology while retaining the
power and noise immunity features inherent in complementary MOS circuitry.
Integrated circuit design and experimental results
The fabrication of the associative array requires a
total of five photo-masks, each of which was generated
with the aid of an automatic drafting machine. These
masks define, in order of processing sequence, the
isolated silicon islands, boron doped oxide, phosphorus
doped oxide, contact opening and aluminum metallization patterns. Heavily doped N+ silicon bars are used
throughout as a first layer of interconnection. Extensive use of symmetry and mirror imaging was made in
the layout as an effective means of reducing chip area.
A photomicrograph of the completed silicon-on-sapphire associative array is seen in Figure 8 with the
bonding pads appropriately labeled. The chip has
an active area of 77 X 53 mils, is packaged in a 14
lead flat pack, and contains a total of 224 MOS devices.
A test complementary inverter is included within the
pattern for initial wafer evaluation. Each transistor
in the array (including the test units) has identical
channel widths of two mils, lengths of 0.4 mils, and
channel oxide thickness of 1800 Å. Characteristics of
typical test devices are shown in Figure 9. Based on
these parameters and the assumed lateral diffusion
of about 1 micron on both the source and drain regions,
field effect mobilities of 150 cm²/volt-second for holes
and 300 cm²/volt-second for electrons are obtained from
the characteristics.
Experiments conducted on fully packaged arrays
show that a storage cell can be written into with a
10 volt, 10 nano-second duration pulse with the array at the 10 volt supply level. Minimum sense current during read out is 1 mA as is the minimum
Figure 8-16 bit SOS associative array
Figure 9-Test device characteristics: (a) N-channel test unit, 0.2 mA/div, 2 V/div, 1 volt/step; (b) P-channel test unit, 0.1 mA/div, 2 V/div, 1 volt/step
Figure 10-Associative memory operation (memory contents and mismatch current waveforms MM1 through MM4 for three search words)
Figure 11-Analog mismatch signal (0, 1, 2, 3 and 4 bits in error)
value of mismatch current. Associative memory operation is best illustrated by referring to Figure 10, which
shows the contents of the memory as well as mismatch
current waveforms generated for three different search
criteria. The result of the first search, for contents
0001, correctly indicates a match in word two only.
Note that the mismatch current in word four, which
has two bits in error, is in excess of 2 mA while that
in words one and three is only 1 mA, corresponding
to a single incorrect bit. The second and third photographs again show the proper mismatch
waveforms for search criteria ∅001 and ∅∅01, the last
of which correctly shows a match for all four words
if the first two bits are ignored.
The additional feature of "proximity" matching
alluded to previously is shown more clearly in Figure
11, where the mismatch output is shown for zero, one,
two, three and four bits in error in a given word. Use
of analog detection circuitry at this point will greatly
enhance the utility of the resultant associative memory
system.
Although the work discussed here is of a research and
development nature and the volumes of arrays obtained
are relatively small, it would be remiss at this point to
avoid any discussion of yield, an all-important factor
in integrated electronics. It is perhaps even more difficult to discuss this area when one considers the fact
that in this new technology a number of problem
areas had to be overcome before any complex arrays
were obtained. From that point on, however, the results were extremely encouraging, as yields of 30 to
50 percent on packaged arrays were obtained with
extremely reproducible device characteristics. These
represent relatively high yields when compared with
monolithic MOS circuits of comparable complexity.
It is believed, again with limited data, that these figures
are a direct result of SOS technology wherein the
amount of critical silicon is limited to the channel
regions of the devices, and that yield depends only
on this area rather than on total chip area as in monolithic circuits. In the 16-bit associative array, the critical
area described represents 180 square mils whereas the
total chip size is in excess of 5000 square mils, so that
significant improvements in yield should and do
result.
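The area argument can be made concrete with a little arithmetic. The two areas are taken from the text; the Poisson yield model Y = exp(-D·A) and the defect density used below are assumptions introduced only for illustration and are not part of the reported work.

```python
import math

CRITICAL_AREA_SQ_MILS = 180.0  # channel regions, from the text
TOTAL_AREA_SQ_MILS = 5000.0    # total chip size is "in excess of" this value


def poisson_yield(defect_density, area):
    """Assumed Poisson yield model: Y = exp(-D * A)."""
    return math.exp(-defect_density * area)


# Fraction of the chip that is yield-critical in the SOS array.
critical_fraction = CRITICAL_AREA_SQ_MILS / TOTAL_AREA_SQ_MILS

# Hypothetical defect density in defects per square mil, chosen for illustration.
DEFECT_DENSITY = 0.005
yield_critical_area = poisson_yield(DEFECT_DENSITY, CRITICAL_AREA_SQ_MILS)
yield_total_area = poisson_yield(DEFECT_DENSITY, TOTAL_AREA_SQ_MILS)

if __name__ == "__main__":
    print(f"critical fraction of chip area: {critical_fraction:.1%}")
    print(f"yield if only critical area matters: {yield_critical_area:.2f}")
    print(f"yield if total chip area matters:    {yield_total_area:.2e}")
```

Under any model in which yield falls with defect-sensitive area, confining the critical silicon to under four percent of the chip favors the SOS array; the specific numbers printed depend entirely on the assumed defect density.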
SUMMARY
System, circuit, and device processing concepts have
been developed and have resulted in the successful realization of high performance silicon-on-sapphire associative memory arrays. Features of the array include high
speed current sensing for mismatch detection and non-destructive
read out. The complementary MOS process
sequence utilized in the array fabrication resulted in
yields as high as 50 percent and produced high quality
complementary devices with field effect mobilities
of 300 and 150 cm²/volt-sec for electrons and holes,
respectively. The drastic reduction of parasitic capacitance inherent in SOS technology combined with these
device characteristics provides a performance level
equivalent to the highest speed bipolar circuits while
retaining all of the other desirable circuit and processing
features of MOS arrays.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the contributions
of D. J. Dumin and his associates in providing the
silicon-on-sapphire films used in this work. A. O'Toole,
J. Sokoloski and Mrs. R. Gilchrist are responsible
for the array processing while W. Salt and Mrs. B.
Denton diced, mounted, and bonded the sapphire pellets. Thanks are due to G. Cullen and G. Gottlieb for
supplying the material from which the low temperature
processing sequence evolved. R. Powlus contributed
to the initial phase of associative array design. J.
Previte of Rome Air Development Center deserves
considerable credit for recognizing the potential benefits of thin film active devices.
REFERENCES
1 J ALLISON J BURNS F HEIMAN
Silicon-on-sapphire complementary MOS memory cells
IEEE J Solid State Circuits Dec 1967
2 C Y WRIGLEY L J KROKO
Properties of the silicon-sapphire interface in heteroepitaxy
Electrochemical Society Semiconductor Silicon Abstracts
May 1969
3 D DUMIN
Deformation of and stress in epitaxial silicon films on single
crystal sapphire
J Appl Phys Vol 36 1965 2700
4 E CROSS G WARFIELD
Effects of oxidation on electrical characteristics of silicon-on-sapphire
J Appl Phys Vol 40 1969 2339
5 F HEIMAN
Thin film silicon-on-sapphire deep depletion MOS transistors
IEEE Trans on Electron Devices Vol 13 1966 855
A main frame semiconductor memory
for fourth generation computers
by THOMAS W. HART, JR., DURRELL W. HILLIS,
JOHN MARLEY, ROBERT C. LUTZ and CHARLES R. HOFFMAN
MOTOROLA, SPD
Phoenix, Arizona
INTRODUCTION
It has been obvious for several years that Large Scale
Integration could be applied to memories. Memories
offer several advantages in that a large volume of one
type of device can be manufactured, and that the design can be optimized for one application. There exists
a wide spectrum of memory product areas with varying
size, costs, speed and environmental performance. Most
of these application areas are presently serviced by
various forms of magnetic storage.
Semiconductor memories have been encroaching into
some of these areas. First, the "scratchpad" was replaced by semiconductor memories yielding a better
performance at lower cost. Secondly, the small buffer
memories are now being implemented by various forms
of semiconductor storage, mainly by MOS shift registers. Large very high speed semiconductor buffers are
being built for large systems such as the IBM 360/85
to effect a hardware performance increase of slower
core main memories.
It is felt that the advent of an all semiconductor
main frame memory is fast approaching. The initial
market penetration will be in the high performance
area (100-300 ns) replacing flat-film memory designs
where costs per bit are quite high. Eventually, most
memory application areas will be vulnerable to semiconductor implementation on a price and performance
basis. This paper will describe a memory module which
will be used as a building block to implement high
performance memories in the next generation of computers.
Module description
Under various engineering and marketing constraints, a module building block concept evolved. This
module in its general form contains 8192 bits. Interface to and from the module is performed with standard
current-mode logic levels. MECL levels were chosen
because that logic family provides the fastest interface when connecting many modules into a large memory system. Also, most of the customers and potential
customers working on high speed systems are using some
form of current-mode logic. In any event, it is not difficult to interface from other logic families to MECL
levels.
By varying the logical connections to the module, an
organization of 8192X1, 4096X2, 2048X4, or 1024X8
can be obtained. Figure 1 shows a block diagram of
the module. Addressing is binary. Inputs and outputs
may be bussed with other modules for expansion of
the number of locations in a memory system. No complicated timing is necessary to operate the module.
When an address is applied, the contents of the specified address will appear at the output terminals within
85 ns and remain until a new address is presented.
Writing in a specified location is accomplished by
pulsing the write enable line after the address and data
have been presented. The module can be cycled every
100 ns.
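The four organizations of the 8192 bit module differ only in how the fixed number of bits is divided between words and bits per word. The short sketch below tabulates the number of binary address bits each organization needs; the function and constant names are illustrative.

```python
MODULE_BITS = 8192  # capacity of one module, from the text


def organization(bits_per_word):
    """Return (number of words, number of binary address bits) for one organization."""
    words = MODULE_BITS // bits_per_word
    address_bits = (words - 1).bit_length()
    return words, address_bits


if __name__ == "__main__":
    for width in (1, 2, 4, 8):
        words, address_bits = organization(width)
        print(f"{words} X {width}: {address_bits} address bits")
```

The result ranges from 13 address bits for 8192X1 down to 10 for 1024X8, in line with the "10 - 13 bits" address input labeled on the block diagram of Figure 1.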
The memory module uses p-channel MOS flip-flops
for storage. Address decoding, word drive, sense, and
digit drive are accomplished with bipolar circuits.
This combination results in a low power, low cost
memory array, while retaining high speed module
performance because of the bipolar circuits. The memory array itself contributes only a small fraction of the
time used in a memory cycle (see timing diagram,
Figure 14). The cycle-time is mainly determined by
the bipolar circuits peripheral to the MOS storage
array.
The memory module was designed to operate on
±5 V power supplies since these are fairly standard in
integrated logic circuits. Total power dissipation is
about six watts. While readily accomplished, no attempt was made to reduce power by various switching
and pulse powering schemes since this level of power
density can be easily handled in most applications by
forced air cooling.
Figure 1-Block diagram 8192 bit module
Electrical description
The module is a multi package hybrid assembly.
Four different integrated circuits are used to construct
the module. These chips are (1) 256 bit MOS storage
array, (2) Array Select Circuit, (3) Word Decode and
Drive Circuit, and (4) Sense-Digit Circuit. The complete module has 32 Storage Arrays, four Array Select
Circuits, two Word Drive Circuits, and four Sense-Digit Circuits.
Storage array chip
A block diagram of the 256 bit MOS Storage Array chip is shown in Figure 2. The array is organized
in 2D fashion as 32 words X 8 bits. The linear
select organization minimizes the number of devices per
storage cell and also the number of interconnections
on the chip. Unfortunately, linear select organization
causes some complications in packaging. These problems are circumvented here by placing sense line
switches on the same chip as the array. This provides
two benefits. First, additional addressing can be performed with the sense switches, improving decoder
Figure 2-Block diagram MOS storage array
Figure 3-Storage cell circuit schematic
efficiency. Second, the internal capacitance of the
bit-lines can be isolated from the external bit lines by
the sense switches, substantially improving the sense
loop time constant.
A schematic of a storage cell and the MOS sense switches at the end of the bit line is shown in Figure
3. Q1 and Q2 are the active devices of the flip-flop, R1
and R2 are the flip-flop load devices, and Q3 and Q4
are the series gating devices which connect a selected
cell to the bit line pair. Each bit line has a transistor
QEN in series with the connection to the bonding pad
and a second transistor, driven by the complementary enable signal, which terminates the bit line
to ground when it is on. The geometries of the active devices are designed to provide a sense current of
80 microamperes under worst case processing and
operating conditions. The load resistor device geometries determine the standby power dissipation of the
chip, which in this case is about 40 milliwatts.
Figure 4-256 Bit MOS storage array
Figure 4 is a photomicrograph of the chip. The
dimensions of the chip are 138 mils X 141 mils. A
low threshold process using <100> material is used.
The substrate serves as the buss for the +5 volt supply.
One layer of metal interconnection is used. A high
concentration P-diffusion (15-20 Ω/square) is used for
crossunders so as to minimize series resistance. In the
layout the bit lines have no crossunders. The word
lines have nine crossunders. The resistance of these
crossunders and capacitance associated with the word
gates on the memory cell form an RC delay line. In
this design the delay is about 2.5 ns.
Chip selection circuit
Figure 6-Chip select circuit logic
A bipolar circuit which decodes three binary bits is
used to select one of eight MOS Storage Array Chips.
Each of the output driver stages provides the complementary enable signals (EN and its complement) necessary to drive the
sense switches on the MOS Storage Array. Additional
inputs to the chip-selection circuit are provided to
select groups of eight arrays.
Emitter Coupled Logic (ECL) input signals are
translated to saturated logic which is referenced to
the negative supply (-5.0). The complementary output stages provide logic levels near the positive (+5)
and negative (-5) supplies for driving the MOS sense
switches. Block and Logic Diagrams are shown in
Figures 5 and 6.
Memory package
Eight MOS Storage Array chips and one chip selection circuit are contained in a 1.25 inch square memory
package. Interconnection of these nine chips is made
by a beam lead laminate as described later in this paper.
Each memory package contains a total of 2048 bits
as shown in Figure 7. Four such packages form the
storage portion of the 8192 bit memory module. This
assembly of four packages results in a total capacitance
buildup of 250 picofarads on the word lines and 70
picofarads on the sense-digit lines.
Decoding word driver
Selection of the storage array word lines is accomplished by a bipolar circuit which decodes four address
bits and drives one out of sixteen word lines. As in the
Chip Selection Circuit, ECL input signals are translated
to saturated logic whose outputs provide logic levels
near +5 and -5 volts. Block and logic diagrams are
shown in Figures 8 and 9. Two of these chips are
packaged in a 1.25 inch square package similar to the
memory array package except that interconnection
within the package is made with a thick film metallization and wire bonds. Two address enable inputs are
provided. One is used as a master enable and the
other is used as a one bit decode to select one or the
Figure 5-Chip select circuit
Figure 7-Memory package (256 words X 8 bits, nine chip hybrid assembly)
Figure 8-Decoding word driver chip
Figure 9-Decoding word driver chip logic
other of two Decoding Word Driver chips sharing the
same package. A block diagram of this package is
shown in Figure 10.
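The word-line selection described above can be sketched in a few lines of Python. This is only an illustrative logical model, not the circuit itself: the function names are hypothetical, signal polarities are simplified to active-high (the real inputs are ECL levels), and treating the one bit decode as the top bit of a five-bit address is an assumption made for the example.

```python
def select_word_line(address, master_enable, chip_enable_bit, chip_index):
    """Return the driven word line (0-15) of one decoder chip, or None if idle.

    address         -- four-bit word address (0..15)
    master_enable   -- True when the package is enabled (polarity assumed)
    chip_enable_bit -- the one bit decode input, 0 or 1
    chip_index      -- which of the two chips in the package this is, 0 or 1
    """
    if not 0 <= address < 16:
        raise ValueError("address must fit in four bits")
    if not master_enable or chip_enable_bit != chip_index:
        return None
    return address


def package_word_lines(address5, master_enable=True):
    """Model the two-chip package as a 1-of-32 decoder (assumed bit split)."""
    chip_bit, low4 = address5 >> 4, address5 & 0xF
    lines = [0] * 32
    for chip in (0, 1):
        line = select_word_line(low4, master_enable, chip_bit, chip)
        if line is not None:
            lines[16 * chip + line] = 1
    return lines
```

With the master enable asserted exactly one of the 32 outputs is driven; with it removed, none are.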
Sense amplifier-digit driver
The sense amplifier-digit driver subassembly contains four identical sense amplifier-digit driver
integrated circuit chips. Each chip receives and sends
read and write signals to the MOS storage array, accepts ECL level data input and data enable signals,
and generates ECL data output signals.
The purpose of each chip is twofold.
First, when it has been properly enabled for writing,
it must transmit a write signal to the appropriate bit(s)
of the selected word in the storage array. Second, when
properly enabled, it must sense the storage cell currents of the selected word and translate them to ECL
signals at the data output terminals.
The logic diagram shown in Figure 11 is functionally
equivalent to the sense amplifier-digit driver circuit.
In addition to showing the basic sense amplifier, digit
driver, and gate blocks of the sense amplifier-digit
driver chip, Figure 11 also shows the existence of a bit
line recovery circuit. The purpose of this circuit is
to rapidly return all bit line voltages to zero, immediately after each write operation.
To thoroughly understand the sense amplifier-digit driver logical organization, consider the sequence
of events which must occur to perform the read and
write operations.
To accomplish a write operation, the desired input
data is placed at the DATA IN terminals of the chip.
The data is enabled by a coincidence of logical zeroes
at the DATA ENABLE inputs. When the WRITE
ENABLE input is forced to a logical zero, one of the
bit line voltage drivers in each half of the circuit
drives one line of each bit line pair to approximately
+4v. This voltage impressed on a bit line accomplishes
the write in the storage array. The leading negative
edge of the WRITE ENABLE signal also sets the
recovery circuit flip-flop. The following positive edge
of the WRITE ENABLE signal turns the digit driver(s)
off and turns the recovery circuit driver on. When
recovery of all bit lines is accomplished, the recovery
circuit flip-flop resets and the recovery circuit driver
is shut off. Both the digit driver and the recovery circuit driver are designed to exhibit a very high output
impedance when off, such that they do not interfere
with the read operation.
Figure 10-Decoding word driver package
Figure 11-Sense amplifier/digit driver chip logic
Notes: The numbered blocks correspond to the following: 1. bit line voltage driver, 2. sense amplifier, 3. voltage comparator, 4. bit line recovery current driver.
Reading is accomplished by enabling either one or
both halves of the chip with the DATA ENABLE
signals. If the WRITE ENABLE is held at logical
one, the bit line currents flow into the sense amplifier
inputs. The sensed information is made available at
the DATA OUT terminals. Since the I/O signals are
ECL, uncommitted emitter outputs are used so that
wired OR'ing of the positive going output signals is
possible.
Figure 12 shows a block diagram of the sense amplifier-digit driver package. Since the DATA OUT
signals from all four sense amplifier-digit driver chips
can be OR'ed, various connections of the DATA ENABLE and DATA IN signals are possible. If the
DATA ENABLES are connected for maximum decoding, a one-out-of-eight selection can be accomplished. With all eight DATA IN inputs and DATA
OUT outputs strapped together, the module organization becomes 8192 words of one bit. Similarly, if all
DATA ENABLES are tied together, each DATA
IN and DATA OUT is used as a separate information
channel, and the resultant module organization is
1024 words of eight bits. Other connections result in
4096 words of two bits, and 2048 words of four bits.
These various connections occur external to the module.
Hence, the sense amplifier-digit driver plane organization is the same regardless of the final module organization desired.
Figure 12-Sense-digit package
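The arithmetic behind these module organizations can be checked with a short Python sketch. The function name and the tuple layout are illustrative only; the enable-decode widths for the two intermediate organizations (4 and 2) are inferred from the one-out-of-eight and all-tied-together cases stated in the text, not given by it.

```python
PACKAGE_BITS = 2048        # bits per memory package (from the text)
PACKAGES_PER_MODULE = 4    # packages forming the storage portion of a module
DATA_CHANNELS = 8          # DATA IN / DATA OUT pairs available on the module

MODULE_BITS = PACKAGE_BITS * PACKAGES_PER_MODULE  # 8192


def organizations(total_bits=MODULE_BITS, channels=DATA_CHANNELS):
    """List (words, bits_per_word, enable_ways) for each external strapping.

    enable_ways is how many groups the DATA ENABLES must select among:
    8 when all data lines are strapped into one bit, 1 when every
    channel is used separately.
    """
    result = []
    width = 1
    while width <= channels:
        result.append((total_bits // width, width, channels // width))
        width *= 2
    return result


for words, bits, ways in organizations():
    print(f"{words} words x {bits} bit(s), 1-of-{ways} enable decode")
```

Every row multiplies back to the same 8192 bits, which is why the plane organization inside the module does not change.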
Module electrical organization
Figure 13 shows an integrated electrical schematic
Figure 13-8192 bit memory stack-electrical organization
Figure 16-Packaging for memory stack
or one chip at a time. It also permits quite stringent
quality control measures to be implemented since the
beams can be examined individually.
Memory package
A sketch of the package which is being used is illustrated in Figure 16. It consists of a 1.28 inch square,
96 percent alumina base, which is metallized to a custom pattern containing 68 metal film leads which go
under a glass-sealed side wall. The base of the usable
1.0 square inch interior contains the power distribution pattern. The headroom within the package is 60
mils.
As can be seen by inspection of the figure, the area
occupied by the IC memory chips and the control
chip is approximately 25 percent of the area, the remaining area being used for the X-Y interconnect and
exit bond functions.
In the assembly cycle, a total of 448 beam leads
are bonded to the IC chips, which is half of the bonds
required by wire bonding techniques. The laminate
contains 480 electrically active plated feed throughs.
Larger beam leads are employed to connect the interconnection laminate to the exit bond pads and the
power distribution. A total of 73 such bonds are required. In the computer program which generated the
interconnect laminate artwork master sets, approximately 1400 conductor track segment instructions
were generated. The cover is alloyed to the package
subassembly after precap testing. The result is a
memory component containing 2048 MOS memory
cells and having only 68 leads to the outside world.
CONCLUSION
A high performance memory module has been described
which is suitable for use as a building block for large
mainframe memories. Mass production of this memory
module is planned. Costs per bit of a memory system
using these modules as basic building blocks will be
much lower than that of other technologies giving a
similar performance. In the near future the competitive pressure of semiconductor memories will be felt in
most performance ranges. Magnetics watch out!
A new approach to memory and logic - cylindrical domain devices
by A. H. BOBECK, R. F. FISCHER and
A. J. PERNESKI
Bell Telephone Laboratories
Murray Hill, New Jersey
INTRODUCTION
Magnetic domain behavior in single crystal magnetic
oxides has been studied extensively over the last
several decades. These investigations, both theoretical
and experimental, are an attempt to better understand
these materials and their complex domain structures.
Recently single crystal oxides have been utilized in
memory and logic devices. This paper will update
work on cylindrical domains in orthoferrites first
published in 1967 and later discussed at the 1968 and
1969 Intermag Conferences.[1,2,3]
A cylindrical domain, sometimes referred to as a
bubble, is a localized high energy magnetic state.
Such a domain is stable and resists any attempt to
deform it. Domains can be moved about in much the
same way as a charged particle. A domain can be
moved one domain diameter in less than 100 nanoseconds, thus indicating that data rates in excess of
10^7 bits/sec can be realized in this technology. As yet
no upper limit to the cylindrical domain velocity has
been found experimentally.
Successful device utilization of cylindrical domains
depends upon developing techniques for generating,
propagating, interacting and detecting these domains.
Domains can be generated by sectioning an existing
domain into halves. Each new domain can be considered as an information input if the splitting operation
is selectively controlled. A stream of domains, fed into
a propagation channel and transmitted to an output
point, can be detected by optical, Hall or induced
voltage readout. Although all these readout techniques
have been studied, only induced voltage readout will
be detailed in this paper.
A new class of materials, the orthoferrites,[4-6] is
now available which, in addition to supporting cylindrical domains at densities approaching 10^6 per square
inch, has the combined properties of high nucleation
fields (so domains will not spontaneously appear), low
domain wall coercivities, and high domain wall mobilities. A description of the general properties of
cylindrical domains[6,7] in orthoferrites is followed by
a section on the behavior of domains in gradient
fields. Conductor circuits, "angelfish" circuits[8] and
in-plane rotating field circuits[9] are presented as general
methods to propagate domains. Finally the relevance
of domain wall devices to the computing field is discussed.
General observations
If we take a thin platelet of orthoferrite above its
Néel temperature and cool it to room temperature,
spontaneously nucleated serpentine-like strip domains
will be present. Such a domain pattern, as seen in
Figure 1, will usually include a number of single wall
domains. A single wall domain can be identified by
noting whether the wall which bounds it closes upon
itself. If a prescribed magnetic field, the bias field,
is applied normal to the surface of the platelet the
single wall domains become cylindrical. An array of
such domains is shown in Figure 2. The 1.7 mil thick
platelet of Sm0.55Tb0.45FeO3 orthoferrite is subjected to
a 42 Oe bias field.
Figure 1-Strip domains, 1.5 mils in width, in a 1.7 mil thick
platelet of Sm0.55Tb0.45FeO3 orthoferrite viewed by Faraday effect.
Note the single wall domains. Bias field is zero
Figure 2-With a 42 Oe bias field the single wall strip
domains of Figure 1 become cylinders each 1.8 mils
in diameter
Those familiar with the earlier references recall
that cylindrical domains are stable over a limited
range of the bias field (typically 10 percent of 4πMs).
An excess bias causes the domain to collapse inward.
On the other hand, as the bias is decreased the domains
grow in size, eventually reaching a diameter at which
they become unstable to elliptical perturbations and
then suddenly grow into long strip domains.
A strip domain can also be cut by energizing a
conductor positioned in contact with the orthoferrite
and intercepting the strip domain at right angles.
For SmTb orthoferrite a current of 300 mA is sufficient. Later, in the discussions of conductor propagating circuits, a technique for splitting cylindrical
domains will be presented.
Manipulation of cylindrical domains-General
Domains in orthoferrites are maintained in the
preferred cylindrical form by an overall uniform bias
field applied normal to the platelet surface. As discussed previously, an increase in the bias field decreases
the domain diameter and vice versa. Now consider
the reaction of a cylindrical domain subjected to a
nonuniform rather than a uniform field. The response
will be complex and could involve a change in size,
motion at a nonuniform rate or even the collapse of
a domain. However, it is possible to treat the case
in which a uniform gradient field is applied.
Consider, as shown in Figure 3, a cylindrical domain
of diameter 2r in a uniform gradient field. The domain
will experience a force attempting to move it toward
a position of reduced bias. To overcome the wall
coercivity, $H_c$, the following condition must be met:
$$\Delta H > \frac{8 H_c}{\pi}. \tag{1}$$
Furthermore, it can also be shown that the domain wall
velocity, $v$, is given by
$$v\ (\mathrm{cm/sec}) = \frac{\Delta H\ (\mathrm{Oe})\ \mu_w\ (\mathrm{cm/sec/Oe})}{2} \tag{2}$$
where $\mu_w$ is the usual domain wall mobility.[6]
Figure 3-A cylindrical domain of diameter 2r positioned
in a uniform gradient field
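As a numerical illustration, Equations (1) and (2) can be evaluated directly, reading them as ΔH > 8Hc/π and v = ΔH·μw/2. The sketch below is not from the paper: the function names are hypothetical, ΔH is taken (as an assumption) to be the field difference across the domain diameter, and the mobility used in the example call is a placeholder value, not a measured one.

```python
import math


def threshold_field_difference(h_c):
    """Equation (1): minimum field difference (Oe) needed to overcome a wall coercivity h_c (Oe)."""
    return 8.0 * h_c / math.pi


def wall_velocity(delta_h, mobility):
    """Equation (2): velocity in cm/sec for delta_h in Oe and mobility in cm/sec/Oe."""
    return delta_h * mobility / 2.0


def domain_moves(delta_h, h_c):
    """True when the applied field difference exceeds the coercive threshold."""
    return delta_h > threshold_field_difference(h_c)


# A coercive force of 0.25 Oe gives a threshold of about 0.64 Oe.
print(round(threshold_field_difference(0.25), 3))
# Placeholder mobility of 100 cm/sec/Oe, chosen only to show the units.
print(wall_velocity(2.0, 100.0))
```

Note that Equation (2) as printed does not subtract the coercive threshold, so the sketch keeps the two relations separate.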
One method to see the effect of a gradient field is to
interact one domain with another. In the case of
domains widely separated the far field of a cylindrical
domain can be approximated as that of a dipole and the
following relationship derived (see Figure 4).
(3)
Equation (3) specifies $\ell_{12}$, the minimum stable separation between a pair of domains as they repel one another because of their mutual gradient fields.
Finally it has been found useful to interact high
permeability magnetic film patterns with cylindrical
domains. Consider, for example, a matrix of permalloy
dots positioned on the surface of an orthoferrite platelet. One finds, by experiment, that a cylindrical domain
prefers a position in contact with the permalloy as
shown in Figure 5. The permalloy dot diameters and
separations have been chosen to be consistent with
the stable cylindrical domain size in the orthoferrite
under study. The dots serve as localized flux closure
paths thereby reducing the magnetostatic energy.
They provide a shift register, a memory array, etc.,
with well defined domain positions.
Figure 5-Interaction between a matrix of high
permeability disks and a cylindrical domain
Conductor circuits
Figure 4-Two domains, mutually repelled in a
material whose coercive force is $H_c$, reach a
stable separation $\ell_{12}$.
In order to utilize cylindrical domains in shift
registers, memories and logic circuits, we require
motion in discrete steps at specific times. Therefore,
highly localized fields are needed. Such fields can be
produced by small conductive loops placed flat on a
platelet surface. Since thin film techniques are used
to fabricate the conductor circuits, a completely closed
loop is not practical.
Figure 6 illustrates the basic conductive loop configuration and the resulting field profiles. These were
obtained by measuring the fields produced by an
expanded scale replica of the thin film circuits. The
circuit dimensions were chosen to provide controlled
motion of domains whose diameters range from 3.5 to
6 mils. In order for a domain to move to an adjacent
loop it must initially be in contact with some portion
of the positive gradient field produced by that loop.
Figure 6-Conductor circuit used to propagate
cylindrical domains and the resulting field
profiles for 200 mA applied current
Figure 7-Quasi-static operating contour for 2.0 mil
thick platelet of YbFeO3.
This puts a lower limit on the domain size. The limit
of maximum domain size is reached when a disparity
of domain to applied field area results in reduced control of the domain position.
The most important feature of the semiclosed conductive loop circuit is that the field is confined to
an area consistent with that of a domain. Therefore,
the field may far exceed the value which would transform a domain from a cylinder to a strip. The upper
limit on this field, however, is that value which would
stretch the cylindrical domain into the strip area defined
by the connections between the loops.
The limits of the applied drive and bias fields are
illustrated in Figure 7. The data were obtained using
a 2.0 mil thick platelet of YbFeO3 operated in a quasi-static fashion on a conductor pattern similar to that of
Figure 10. The operating contour resides within the
bias field extremes required to maintain a cylindrical
domain. The position and size of the operating contour
within the bias field boundaries are determined primarily
by the range of domain sizes which the circuit can
accommodate.
In Figure 8, velocity curves are given of domains
in YFeO3, TmFeO3 and YbFeO3 platelets. These are
functional measurements obtained using the circuit
of Figure 6. Rossol has shown that YFeO3 exhibits an
extraordinarily high mobility.[10] Functional velocity
measurements of YFeO3 have confirmed this. Data
rates in excess of 3 × 10^6 bits/sec have been reached.
A direct comparison of device speed and domain wall
mobility cannot be made because of the complex nature
of the field profile. Notice that threshold currents as
low as 10 mA have been measured, representing drive
fields less than 1.0 Oe.
Figure 8-Functional velocity curves of YFeO3, YbFeO3
and TmFeO3 platelets.
Figure 9-Thin film conductor pattern for two dimensional propagation of cylindrical domains. Conductor
dimensions identical with that of Figure 6.
A conductor pattern is shown in Figure 9. Note
that the series of loops are interconnected such that
there are three separate interleaved circuits. Thus,
with a three phase system, a domain at position A can
be propagated to C with the sequential application of
currents IY1, IY2 and IY3. Two dimensional propagation
can be performed by simply aligning two identical
circuits orthogonal to each other. The domain at position A can now be propagated to B with currents
IX1, IX2 and IX3. Bidirectional propagation merely
requires a reversal in the three phase sequence. The
domains (bits) are spaced on 10.5 mil centers or every
third propagate loop. This is adequate spacing to
avoid interactions in materials having a coercive
force of 0.25 Oe or higher such as YbFeO3. The resulting
packing density is over 6 × 10^3 bits/in^2.
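The three phase stepping described above can be illustrated with a toy model in Python. It is a sketch under stated assumptions rather than a description of the actual circuit: loops are indexed along one row, loop k is assumed to belong to phase k mod 3, and a current pulse is assumed to attract a domain only from an immediately adjacent loop.

```python
def propagate(position, pulse_sequence, n_loops):
    """Step a domain along a row of loops wired as three interleaved circuits.

    position       -- index of the loop currently holding the domain
    pulse_sequence -- phases (0, 1 or 2) pulsed in order
    n_loops        -- number of loops in the row
    """
    for phase in pulse_sequence:
        for neighbour in (position + 1, position - 1):
            # The domain follows a pulse only if an adjacent loop is on that phase.
            if 0 <= neighbour < n_loops and neighbour % 3 == phase:
                position = neighbour
                break
    return position


# Forward: pulsing phases 1, 2, 0 carries a domain from loop 0 to loop 3,
# that is, one bit position (every third loop).
print(propagate(0, [1, 2, 0], 12))
# Reverse: pulsing the phases in the opposite order brings it back.
print(propagate(3, [2, 1, 0], 12))
```

Reversing the pulse order reverses the direction, which is the bidirectional property noted in the text.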
Figure 10 is a photograph of a unidirectional shift
register circuit. The register is equipped with an input
and output circuit. Information is written by controlled
domain replication and the output circuit detects a
change in flux. The circuit is operated with a biphase
propagating source. Directionality is achieved with
the help of permalloy dots. The dots, which provide
low energy sites for the domains, are uniformly shifted
with respect to the conductive loops. This asymmetry
places the domains in a consistent, preferred position
prior to each propagate phase. The permalloy in essence provides a five Oe third phase drive. The permalloy
dots are, typically, 4000 Å thick, one mil in diameter
and spaced on four mil centers along the propagating
track. They are deposited on pedestals, fabricated
as part of the conductor circuit. This is done to ensure
that the permalloy is in intimate contact with the
orthoferrite. The biphase register design provides a means
of constructing long serial registers without necessitating conductor crossovers. With a biphase system,
however, the packing density of domains is about one
fourth the propagate positions rather than one third,
as in the case of the three phase system. In addition,
speed is reduced by virtue of the limit of the pseudodrive provided by the permalloy.
Figure 10-Photograph of the conductor pattern of a unidirectional shift register utilizing a biphase propagating source. The
circuit contains a controlled replicate input and an output circuit
which detects a change in flux. Circuit is capable of propagating
domains having a nominal diameter of 4 mils
A suitable material for use with the device is TmFeO3.
A platelet two mils thick, exhibiting domains three to
five mils in diameter, was used. Operation is initiated
by placing a "source" domain in the starting loop. To
insert a bit, the larger loop encompassing the replication (hairpin-like) conductor is energized, centering
the source domain over the replication conductor.
After the domain is split, one section is returned to
the start position and the other is simultaneously
shifted two loop positions to the start of the register.
The domain is shifted through the register until it
reaches the output circuit. The two outer conductor
loops are part of the readout drive circuit while the
two inner loops comprise the sense circuit. The readout drive loop nearer the domain is energized, drawing
the domain into the loop and then expanding it to
the extent of the loop. The domain is then collapsed
by a reversed drive through both loops. The resulting
flux change is detected on one sense loop and the
induced voltage due to di/dt is cancelled with the
other. The domain is expanded to an area forty times
the area of the cylindrical domain and provides an
output of 1.0 mV-μsec. A photograph of the output
waveform is shown in Figure 11. Notice the bipolar
nature of the waveform. The output circuit has been
shaped to not only increase the area of the output
domain but also to maximize the rate of change of
flux linkages during the collapse phase.
The circuit has been operated at speeds in excess of
10^6 bits/sec using 350 mA propagate currents. The
minimum replicate drive pulse is 750 mA, 1 μsec wide.
The nominal readout drives for domain expansion
and collapse are 530 mA and 700 mA, respectively.
"Angelfish" circuits
We have progressed through three phase conductor
circuits where the propagation direction is determined
by the sequence in which current pulses are applied
and two phase conductor-permalloy circuits where
the propagation direction is built in by a nonsymmetric
conductor-permalloy alignment. A logical progression
is the possibility of an all permalloy circuit to interact
with, and thereby propagate, domains in orthoferrite.
There are, in fact, two such general classes of circuits
and they will be discussed in this and the following
section.
The first class, coined the "angelfish" circuits,
utilizes the fact that a cylindrical domain can be modulated in size by increasing or decreasing the bias field.
Motion is achieved by maneuvering this pulsating
cylindrical domain in and out of asymmetrical energy
traps. The traps are created by wedge shaped films of
high permeability permalloy placed in contact with
the orthoferrite platelet.
Figure 11-Photograph of the output waveform from
circuit shown in Figure 10. Horizontal scale is 1
μsec/div; vertical scale is 2 mV/div
Figure 12-Domain positioned on a wedge-shaped high
permeability permalloy thin film. The domain is more
easily moved off the point of the wedge (a)
than the blunt edge (b)
The interaction which exists between a cylindrical
domain and a wedge is illustrated in Figure 12. The
domains assume a position on a wedge where the
magnetostatic energy is minimized. It was confirmed
by experiment that from this position a domain is
more easily moved off the point (a) rather than the
blunt end (b). The mechanical analogy is that it is
easier to walk up a ramp than to scale a wall. A shift
register can be built which propagates domains along
a series of wedges by means of a periodic modulation
of the diameter of the domains. During the expansion
phase the leading domain wall reaches out to latch
onto the blunt edge of the next wedge and during the
contraction phase the trailing domain wall slides off
the point of the wedge that held it. This pushing and
pulling action provides the unidirectional motion
desired.
An experimental 32-step shift register, shown in
Figure 13, propagates domains continuously around
a circle. The permalloy circuit is photoetched from a
4000 Å permalloy film. The size can be estimated by
noting that the outer ring is 50 mils in diameter. The
inner and outer permalloy rings provide lateral stability to the domain as it travels. Lateral stability is
not required because of any inertia associated with
the domain, but to ensure that the domain will expand
and contract along the direction of motion rather than
across. Operation is obtained as the bias field is oscillated between the extremes of 38 and 44 Oe. The orthoferrite used was a 2.3 mil thick platelet of Tb0.5Tm0.5FeO3.
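The expand-and-contract stepping of the ring register can be captured in a small state model. The Python sketch below is an idealization written for illustration only: the function name is hypothetical, the domain is assumed to latch onto the next wedge whenever the bias drops to the low extreme and to release the previous wedge whenever it rises to the high extreme, and intermediate field values are ignored.

```python
def angelfish_ring(position, bias_sequence, n_wedges=32, low=38.0, high=44.0):
    """Advance a domain around a ring of wedges as the bias field is modulated.

    A low bias expands the domain so its leading wall reaches the next wedge;
    the following high bias contracts it off the wedge that held it. One
    low-then-high cycle therefore moves the domain forward by one wedge.
    """
    expanded = False
    for bias in bias_sequence:
        if bias <= low and not expanded:
            expanded = True                         # leading wall latches onto next wedge
        elif bias >= high and expanded:
            position = (position + 1) % n_wedges    # trailing wall slides off
            expanded = False
    return position


# The four fields quoted for Figure 13 (38, 44, 38, 44 Oe) give two steps.
print(angelfish_ring(0, [38, 44, 38, 44]))
# Thirty-two full cycles bring the domain once around the 32-step ring.
print(angelfish_ring(0, [38, 44] * 32))
```

Because the traps are asymmetric, the model has no way to step backward, which mirrors the built-in directionality of the wedges.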
Figure 13-A section of a 32-step unidirectional ring
"angelfish" register. The bias field is 38 Oe (a),
44 Oe (b), 38 Oe (c), 44 Oe (d). Motion is
counterclockwise
Figure 14-Isometric view of permalloy T-BAR
pattern in contact with surface of orthoferrite
platelet. Rotating in-plane field generates poles
which cause the domain to move
Propagation by "T-BAR" permalloy circuits
In a second method of propagation an in-plane
rotating field acting on a structured permalloy pattern generates traveling positive and negative magnetic poles to selectively attract and repel and thereby
control the motion of a cylindrical domain. A variety
of permalloy patterns are suitable and one such pattern, the T-BAR, is illustrated in Figure 14. The
name, T-BAR is, of course, identified with the high
permeability thin film permalloy pattern shown in
contact with the upper surface of an orthoferrite
platelet.
The operation of this circuit will be most readily
understood after a study of Figures 14 and 15. First
the bias field is adjusted to maintain a stable cylindrical domain. Next assume that a field is applied in
the plane of the orthoferrite and directed as illustrated
in Figure 14. This in-plane field, which has very little
direct effect on the orthoferrite, produces magnetic
poles in the structured permalloy circuit thereby providing the cylindrical domain with the low energy rest
position shown. Clockwise rotation of the in-plane
field causes a systematic redistribution of the magnetic
poles in the permalloy and the domain responds by
moving from left to right as photographed in Figure
15(a)-15(e). With each rotation of the field the domain advances one period of the circuit. The propagation direction may be reversed by rotating the field
in the counterclockwise sense.
Figure 16 shows a typical domain generator. The
entrance to the T-BAR propagating channel is from
the left if the field is rotating clockwise. The large
generator disk at the entrance maintains a domain
which stays in contact with the + poles formed on the
disk by a rotating transverse magnetic field. As the
field rotates to the position shown in Figure 16a, the
domain is forced to pass over the first + pole formed
at the left end of the propagating channel. When the
field rotates another quarter cycle, Figure 16b, one
end of the domain becomes attached to the advancing
+ poles of the propagating channel while the other
remains attached to the + poles of the disk. As the
field rotates further, Figure 16c, the two ends of the
domain are forced to travel in opposite directions, and
a negative pole distribution begins to build up near the
center of the stretching domain, forcing it away from
the disk. When the negative pole distribution is maximum near the stretched portion of the domain, Figure
16d, the field from the disk shrinks that portion of
the domain width until it becomes unstable and the
domain suddenly ruptures into two portions, one
remaining on the disk and the other remaining in the
propagation channel. Both domains then return to a
domain size determined by the bias field with the result
shown in Figure 16e.
Figure 15-Sequence of photographs showing a 2 mil
diameter domain propagating as the field rotates
clockwise through 360°
Figure 16-Domain generation-A permanent domain
associated with the rotating + pole configuration of the generator disk is forced to stretch when one end becomes trapped in
the T-BAR propagate channel. When the in-plane rotating
field HR is directed upward, the - poles near the stretched
portion of the domain cause it to sever into two, leaving a newly
formed domain in the propagate channel
In general the minimum transverse field required for
domain generation is larger than the minimum field
for propagation; therefore, insertion of domains
into a single channel can be controlled by increasing
the amplitude of the rotating transverse field for either
an entire cycle or for only that portion of the cycle
(approximately 3/4 cycle) where the domain becomes
stretched to its maximum. Insertion of information in
multichannel devices (say up to ten channels) can
be controlled by designing the geometry of the generators so that either the amplitude of the rotating
field, or the portion of the cycle it must be increased,
or both, is different for different channels.
An example of domain generation uses a magnetic
overlay made from 8900 Å isotropic permalloy. The
T-BAR propagation channel has the same dimensions
as previously stated and the generator disk is 9
mils in diameter with a 2.5 mil protrusion into the
TABLE I
Experimental
Rare Earth
Y
La
Pr
Nd
3m
Eu
Gct
Tb
Dy
Ho
Er
Tm
Yb
Lu
. 3mO.6Er O.4
3mO.55TbO.45
(milS)
2r
411Ms
~
105
83
71
62
84
83
94
137
128
91
81
140
143
119
8.4
6.6
5.7
4.9
6.7
6.6
7.5
10.9
10.2
7.3
6.5
11.2
11.4
9.5
7.5
6.0
5.5
3.7
1.7
2.0
4.5
6.0
2.3
3.8
7.5
83
108
6.6
8.6
1.0
0.75
(mils)
Thick, h
3.0
33
Not Available
Not Available
2.0
3.2
1.1
3.0
2.0
10.5
2.4
16
2.2
51
1.6
32
2.1
12
2.0
8
2.3
37
41
3.0
2.0
10.5
3.0
td
(Oe)
field
33
61
=
1.8
2.0
-
CAlculated
(mils) ( ergs/c
i
X 'N
.--l
2.5
1.8
4.4
2.9
3.7
2.9
1.4
1.7
3.3
3.9
1.9
3.0
4.3
1.1
1.3
1.6
1.7
1.7
1.8
1.7
1.6
2.4
3.9
3.9
0.80
0.40
0.35
0.30
a
4:2
s
propagate channel. The orthoferrite is a 2 mil
thick platelet of Sm0.55Tb0.45FeO3 with 4πMs = 108
gauss. The bias field is 42 Oe producing approximately
1.5 mil diameter domains. The transverse field amplitude necessary to generate domains is 20 Oe peak
while 10 Oe peak is sufficient to propagate domains.
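The way field amplitude can gate the insertion of domains is easy to see in a simplified model of a single T-BAR channel. The Python sketch below uses the 20 Oe and 10 Oe peak values from the example above as thresholds; everything else is an assumption made for illustration: the function name is hypothetical, one list entry stands for one full rotation of the in-plane field, and a field below the propagation threshold is assumed to leave all domains where they are.

```python
GENERATE_OE = 20.0    # peak in-plane field needed to generate a domain (from the example)
PROPAGATE_OE = 10.0   # peak in-plane field sufficient to propagate domains


def run_channel(peak_fields, length):
    """Return channel occupancy after applying one field rotation per entry.

    channel[i] is True when a domain sits at period i of the channel.
    Each rotation at or above PROPAGATE_OE shifts every domain one period;
    a rotation at or above GENERATE_OE also inserts a new domain at the
    entrance. Domains shifted past the last period are dropped.
    """
    channel = [False] * length
    for peak in peak_fields:
        if peak < PROPAGATE_OE:
            continue  # idealization: too weak a field moves nothing
        channel = [peak >= GENERATE_OE] + channel[:-1]
    return channel


# Writing the bit pattern 1, 0, 1, 0 by raising the field on alternate cycles.
print(run_channel([20, 10, 20, 10], 4))
```

Multichannel insertion, as described in the text, would correspond to giving each channel's generator its own threshold.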
Domain logic
Logic can be performed in cylindrical domain devices by utilizing the repelling forces between domains.
T-BAR-like overlays are used to transport domains
close enough to allow the interactions to occur. An
overlay arrangement particularly useful for performing
logic functions is that of an idler position into which
a domain can be inserted and forced to circulate within
a relatively fixed position as the transverse field rotates.
An example of domain logic uses the permalloy
overlay of Figure 17. A logic variable N is determined
by the presence or absence of a domain circulating in
the idler position formed by the four bars which provide the pole positions four, five, six, seven. The input
variable X is determined by the presence or absence
of a domain in the T -BAR track defined by pole
positions ... -3, -2, -1, 1, and two output tracks
3', 4', 5', ... and 7', 8', 9' ... deliver the logic function
X • N. N is the flip flop function N = X • (N -1) +
X • (N -1) where (N -1) is the previous state of the
flip flop. Poles 2 and 6 are positioned so that if
there is a domain on one of the poles, and none on the
other, poles 3 or 7, respectively, are preferred
Figure 17-Cylindrical domain flip flop-The state of
the flip flop is determined by the presence or absence of a trapped
circulating domain at the sequencing pole positions (idler) 4, 5,
6, 7. Each new domain entering the input channel X changes the
state of the flip flop by becoming trapped in the idler if it is full
over poles 3' or 7' for the next step. As the transverse
field rotates counterclockwise, a domain entering this
device will travel along successively generated poles
-3, -2, -1 and 1. When it reaches pole 2, it
goes next to pole 3' or pole 3, depending
on whether or not a domain is present on the idler
position 6. If a domain is present on 6, the two
domains repel each other and go to poles 3' and 7'
when the field rotates the next quarter cycle, and
henceforth stay on the output tracks 3', 4', 5', ... and
7', 8', 9', ..., leaving the idler position empty. However,
if there is no domain on pole 6 when the input domain
reaches pole 2, the input domain goes next to pole
3 and becomes trapped in the successively generated
idler poles 4, 5, 6, 7, 4, 5, ... until a new domain
from the input track forces it out. The device,
therefore, acts like a flip flop with one input
and two identical outputs. The presence or absence
of a domain in the idler position determines the state
of the flip flop. A binary counter can be made by using
one of the outputs as a carry to succeeding stages.
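The toggling and carry behavior of the idler can be illustrated with a short Python sketch. It is a logical abstraction only, not a model of the device: the names DomainFlipFlop, count_domains and as_int are hypothetical, each stage is reduced to one Boolean (idler full or empty), and the propagation time along the permalloy tracks is ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DomainFlipFlop:
    """One idler stage; idler_full is True while a domain circulates in it."""
    idler_full: bool = False

    def clock(self, x: bool) -> bool:
        """Apply one input slot; return True if domains leave on the outputs."""
        if not x:
            return False             # no input domain: state kept, no output
        if self.idler_full:
            self.idler_full = False  # both domains are repelled to the outputs
            return True
        self.idler_full = True       # input domain is trapped in the empty idler
        return False


def count_domains(n_stages: int, n_inputs: int) -> List[bool]:
    """Feed n_inputs domains into a chain of stages; return states, LSB first."""
    stages = [DomainFlipFlop() for _ in range(n_stages)]
    for _ in range(n_inputs):
        carry = True
        for stage in stages:
            carry = stage.clock(carry)  # one output is used as the carry
            if not carry:
                break
    return [s.idler_full for s in stages]


def as_int(bits: List[bool]) -> int:
    """Read the idler states as a binary number, LSB first."""
    return sum(1 << i for i, b in enumerate(bits) if b)
```

With four stages, as_int(count_domains(4, 5)) gives 5, and sixteen input domains return all idlers to the empty state.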
Flip flops have been operated by using 11,000 Å
permalloy with the overlay design consisting of the
usual one mil by five mil rectangles. The orthoferrite
was TbFeO3, with a 54 Oe bias producing 3 mil
diameter domains. The rotating field peak amplitude
was approximately 17 Oe.
CONCLUSIONS
We have seen that the orthoferrites provide interesting
research material for both the theoretician and the
experimentalist. Papers covering the wide swathe
from materials preparation to device applications
have been published. All available orthoferrites have
been evaluated as potential domain wall device
materials. It was found, for example, that the use of
Sm0.55Tb0.45FeO3 orthoferrite will maximize the storage
density since in this compound the smallest domains
are found. Stable cylindrical domains 0.5 mil in diameter
allow storage densities of 10^6 bits/in^2.
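The quoted density can be checked with simple geometry. The sketch below assumes one bit per square cell and a center-to-center spacing of two domain diameters; that spacing is an assumption chosen because it reproduces the stated figure, and it is not given in the text. The function name is hypothetical.

```python
def bits_per_square_inch(domain_diameter_mil: float,
                         pitch_in_diameters: float = 2.0) -> float:
    """Bits per square inch for a square grid of bit positions.

    pitch_in_diameters is the ASSUMED spacing between bit positions,
    expressed in domain diameters (1 inch = 1000 mils).
    """
    pitch_mil = domain_diameter_mil * pitch_in_diameters
    cells_per_inch = 1000.0 / pitch_mil
    return cells_per_inch ** 2


density = bits_per_square_inch(0.5)  # 0.5 mil domains on a 1 mil pitch
```

Under this assumption 0.5 mil domains give 10^6 bits per square inch, and the density falls with the square of the domain diameter.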
Techniques for propagating domains at data rates
in excess of three megabits/sec have been demonstrated
using conductor circuits. The upper limit on the data
rate for either the "angelfish" or "T-BAR" is yet to be
determined, although it is expected that the rate for
the latter will be in excess of one megabit/sec. Thus we
believe that one of the future applications of domain
wall devices will be in large capacity shift registers: a solid state disk file.
Although most of the device work presented in this
paper concerned the propagation of domains, other
efforts have pursued the areas of information insertion
and detection, and magnetic logic. Magnetic logic is
readily implemented using interactions of domains.
Therefore, a second application is expected in special
purpose memory-logic systems.
Domain wall devices are fabricated using the production techniques pioneered by the semiconductor
industry. Thus these devices should be a compatible
companion to LSI in future systems. Domain wall
devices require few process steps and as such should
be manufacturable in high storage capacity units.
ACKNOWLEDGMENTS
The authors would like to acknowledge the interest
and encouragement shown by H. E. D. Scovil in the
domain wall device project. J. J. McNicol, R. H.
Morrow and R. J. Psonak built and tested many of
the circuits. We are all indebted to A. A. Thiele for
supplying most of the theoretical background for this
work and for allowing us to publish many of his results.
REFERENCES
1 A H BOBECK
Properties and device applications of magnetic domains in
orthoferrites
Bell Syst Tech J Vol 46 Oct 1967 1901-1925
2 A H BOBECK
Properties of cylindrical magnetic domains in orthoferrites
IEEE Trans on Mag Vol 4 Sept 1968 450
3 A H BOBECK R F FISCHER A J PERNESKI
J P REMEIKA L G VAN UITERT
Application of orthoferrites to domain wall devices
1969 Intermag Conf April 15-18 1969 Amsterdam
4 D TREVES
Studies on orthoferrites at the Weizmann Institute of Science
J Appl Phys Vol 36 March 1965 1033-1039
5 S GELLER
Crystal structure of gadolinium orthoferrite GdFeO3
J Chem Phys Vol 24 June 1956 1236-1239
6 C KOOY U ENZ
Experimental and theoretical study of the domain configuration in thin layers of BaFe12O19
Philips Research Rpt Vol 15 Feb 1960 7-29
7 A A THIELE
The theory of circular magnetic domains
To be published
8 A H BOBECK U F GIANOLA
Magnetic domains
Science and Technology No 86 Feb 1969
9 A J PERNESKI
Propagation of cylindrical magnetic domains in orthoferrites
1969 Intermag Conf April 15-18 1969 Amsterdam
Netherlands
10 F C ROSSOL
To be published
A new integrated magnetic memory
by M. BLANCHON and M. CARBONEL
THOMSON-CSF
Laboratoire Central de Recherches
Essonnes, France
INTRODUCTION
Very thin permalloy sheets were used by RCA [1,2] in 1963
in order to achieve integrated magnetic memories. In
1964, LFE [3] described an approach to mass memories
(10^8-10^9 bits) using this material and integrated
wiring. For different reasons, these two projects were
abandoned. This paper shows that the two conditions
of success are the choice of the shape of the element
and the integration process.
First, the shape of the element is discussed, and it
appears that the toroidal shape is unsuitable for the
realization of large integrated memory planes. Unlike the ordinary core, the three-hole element has
very broad tolerances on driving currents and on magnetic characteristics of the material. Therefore, the
three-hole core was chosen for the integrated memory
plane described in the third part of the paper.
Then, the drawbacks of the usual integration processes are underlined and a new, much more reliable
method is proposed. A 16 X 8 bit plane and a 32 X 36 bit
plane were realized using this fabrication process.
The characteristics of these memories are presented in
the last part of the paper.
Memory device characteristics
Batch-fabrication of memory planes necessitates
a careful study of the characteristics of the memory
element. The simplest shape is the toroid.
Characteristics of the toroid
Consider the element shown in Figure 1 and let
us plot the switched flux versus the driving current,
when only one current pulse is present (curve A) and
when a large number of identical pulses is sent (curve
B). For correct memory operation the toroid must
switch completely for I and must not switch for I/2.
Let us name IMIN the minimum current needed to
switch 90 percent of the flux with a single pulse (curve
A), and IMAX/2 the maximum current allowed to switch
only 10 percent of the flux with a large number of
pulses (curve B). The required conditions are I ≥ IMIN
and I/2 ≤ IMAX/2. This is not possible for the permalloy
1/2 mil toroid, where IMAX/2 < (1/2) IMIN. For other
pulse widths, other shapes or other thicknesses (1/8
mil to one mil) this is still not possible. Thus, one is
then led to use more elaborate driving currents such
as bipolar digits or doublet currents [4,5]. With these
improvements, the permalloy toroid memory will
work, but with relatively tight tolerances. However,
in batch-fabrication of a large number of toroids,
tight tolerances will lead to low yield. Therefore, toroids
were abandoned.
Figure 1-S-curves, 1/2 mil thick etched permalloy toroids
The three-hole element
Intricate magnetic elements are
easily obtained by etching permalloy sheet. The three apertured element has many advantages for storage. Diagrams illustrating the operation of the element are shown in
Figure 2. The four legs of the element are of equal
width. Starting from the clear state, the one-state is
written by applying the word-write drive alone. This
will work for any value of the word current provided
iw > iwo, where iwo is the magnetic threshold of the element. A zero state is written by applying simultaneously
a digit and a word current, the only condition being
that the digit current exceed the word current. Applying a disturb digit drive has no effect on the zero
but produces a flux rearrangement on the one state,
magnetically decoupling the left hole from the output hole. Subsequent disturbs will therefore have no
effect. As may be seen from the bottom of Figure 2,
the operating range is very wide and is not closely
dependent on the magnetic characteristics of the material.
This results in wide tolerances and a wide operating
temperature range. Furthermore, a lack of reproducibility in the material or in the element shape is not
important. The three apertured element is therefore
very suitable for batch processing integrated magnetic
memories.
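By contrast, the selection condition stated for the toroid (I ≥ IMIN together with I/2 ≤ IMAX/2) leaves a usable current window only when IMAX/2 is at least half of IMIN. A minimal Python sketch of that check follows; the function name and the numerical values are hypothetical, in arbitrary units, and are not measurements from this paper.

```python
from typing import Optional, Tuple


def coincident_current_window(i_min: float,
                              i_max_half: float) -> Optional[Tuple[float, float]]:
    """Return (lowest, highest) usable full-select current I, or None.

    i_min      -- IMIN, current that switches 90 percent of the flux in one pulse
    i_max_half -- IMAX/2, largest half-select current that switches only
                  10 percent of the flux after many pulses
    """
    lowest = i_min               # from I >= IMIN
    highest = 2.0 * i_max_half   # from I/2 <= IMAX/2
    if highest < lowest:
        return None              # IMAX/2 < (1/2) IMIN: no operating point
    return (lowest, highest)


no_window = coincident_current_window(1.0, 0.4)  # toroid-like case
window = coincident_current_window(1.0, 0.6)     # illustrative workable case
```

The first call corresponds to the situation reported for the 1/2 mil permalloy toroid, where no single drive current satisfies both conditions.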
Memory plane fabrication
The processing technique is extremely important
for obtaining a good yield. Let us consider an example
(Figure 3). The element here is a simple toroid, and in
the usual integration technique we find a lower winding
and an upper winding tied together by a through-connection, thus creating one or two interfaces. These
interfaces may be a thin layer of vacuum deposited
copper [3] or an electrolytic solder [6]. This results in a serious
lack of reliability (broken conductors).
Figure 2-Operation of the three hole element
Figure 3-Cross sectional view of an ordinary
integrated toroid
Another drawback comes from the insulation between the winding and the elements. Since there are
always pinholes in the insulators, there are often short
circuits. One should note that the insulation of the
edge of very thin elements is generally extremely difficult [1,2].
Finally, the strains induced by the deposition of the
windings may decrease the uniformity of the output
signals.
All these drawbacks lower the fabrication yield, and
the permalloy sheet memories become uneconomical.
The new method described here starts from a permalloy sheet (1/2 mil thick). The permalloy is electroplated with copper (1/2 mil). Using positive photoresist techniques, holes are etched in the plate (Figure
4a and Figure 5). Then the sheet is exposed to the
wiring pattern and developed, and gold is electrodeposited
to make the winding (Figure 4b). The copper is selectively removed, leaving intact the permalloy and the
gold winding. At this point, the winding is held only
by the edges of the holes in the permalloy sheet,
forming small bridges over the permalloy. The sheet
is then dipped in photoresist, which takes the place of
the copper. After an exposure to the element pattern,
the magnetic elements are etched (Figure 4c). If
desired, the memory may be completed by an encapsulation.
Figure 4-The new fabrication process: cross sectional
view
Figure 5-Top view
This method is attractive for several reasons:
• Since the upper winding, the through-connection
and the lower winding have been deposited at
the same time, the wiring is continuous without
any interface, and this is why it is extremely rare to find a broken conductor.
• Since electroplating tends to fill up all the holes,
there are no pinholes at all in a 1/2 mil copper layer. Therefore, there are no short circuits in these
memories.
• Since the elements are etched after the wiring,
there are no insulation edge problems.
• Mechanical stresses may arise from the electroplating of the copper layer and the gold winding.
Removing the copper and etching the element
shape relieves the residual stresses of the permalloy.
Experimental results
Memory plane models of 16 words X 8 bits and
32 words X 36 bits were easily fabricated using these
techniques (Figure 6). A high yield of acceptable planes
seems possible even with larger planes.
Figure 6-Photographs of 128 and 1152 bit memory
planes
Figure 7-Superimposed outputs of 32 elements (zero,
one and disturb one). Hor 100 ns/cm, Ver 1 mV/cm
Characteristics for the 1152 bit storage planes are
given in Table I.
TABLE I-Memory Plane Characteristics
Permalloy thickness     0.5 mil
Word write current      50 mA
Digit current           50 mA
Read current            100 mA
Density                 250 bits/cm^2 (1560 bits/sq in)
Cycle time              < 5 µs
Vout                    1.6 mV; 0.7 µs
The uniformity of the output signals is excellent, as
may be seen from Figure 7, where the outputs of 32
three apertured elements are shown superimposed.
CONCLUSION
Until now, integrated permalloy sheet memories were
not a success. This comes from the choice of the element
shape and the processing technique. By using a three
apertured element and a new, much more reliable fabrication method, these memories seem to have a bright
future for mass memories. Higher densities and larger
planes (256 X 72) are under study.
ACKNOWLEDGMENTS
The authors would like to thank J. P. Dupeyron
for his assistance on the experiments.
The work reported in this paper was supported by
"Direction des Recherches et Moyens d'Essais".
REFERENCES
1 G R BRIGGS J W TUSKA
Permalloy sheet transfluxor array memory
J Appl Phys Suppl Vol 33 No 3 1065-1066 March 1962
2 G R BRIGGS J W TUSKA
Design and operating characteristics of a high bit density
permalloy sheet transfluxor memory stack
Proc INTERMAG Conf 3-4-1 3-4-8 1963
3 H W FULLER T L McCORMACK
C P BATTAREL
System and fabrication technique for a solid state random
access mass memory
Proc INTERMAG Conf 5-5-1 5-5-4 1964
4 J A BALDWIN JR J L ROGERS
Inhibited flux-A new operation of the three hole memory
core
J Appl Phys Suppl Vol 30 No 4 58-59 April 1959
5 H CHANG
Coupled memory elements
J Appl Phys Vol 38 1203 March 1967
6 M CARBONEL V CHAPTAL
Batch fabricated integrated all magnetic logic
IEEE Trans Magnetics Vol MAG 3 535-537 Sept 1967
Mated film memory-Implementation of
a new design and production concept
by L. A. PROHOFSKY and D. W. MORGAN
UNIVAC, Division of Sperry Rand Corp.
St. Paul, Minnesota
INTRODUCTION
A high performance computer memory must operate
at high speed, require a minimum amount of power,
and be capable of operating under extreme environmental conditions. Thin film memories meet these requirements; however, anyone who expected them to
become the primary memory technology was certainly
premature. Despite its superior performance features,
the thin film memory has encountered producibility
problems which have prevented it from becoming cost
competitive. Univac has developed the MATED
FILM* memory concept and a continuous vacuum
deposition system which together have overcome
previous producibility obstacles and now make the
evaporated film memory a serious contender for main
store applications [1]. The features which are new and
unique to this approach are:
1. Economical continuous deposition for 16-hour
periods with all deposition parameters maintained in equilibrium.
2. The closed-flux path design has wide operating
margins and provides an exceptionally low
susceptibility to process variations.
3. Changing the film array organization from a
word-bit matrix to a bit-slice array has greatly
reduced the number of connections and process
steps required to fabricate the memory stack.
This paper describes: (1) the MATED FILM
memory design, which can be adapted to a wide range
of capacity and speed; (2) the continuous vacuum deposition facility which has been developed for the
production of MATED FILM memories; and (3) a
wide temperature, 500 nanosecond, 5 X 10^5 bit memory
which has been built and tested.
* Trademark of Sperry Rand Corporation.
Storage element
Construction
The storage element (Figure 1) is formed by a
deposit of two layers of nickel-iron separated by a thin,
deposited, copper conducting strip. Silicon monoxide
layers isolate the nickel-iron layers from the copper
layer. The layers of silicon monoxide are sufficiently
thin so they do not interfere with the magnetic coupling
of the two nickel-iron layers.
Each layer is deposited through masks on glass
substrates in a vacuum chamber (Continuous Vacuum
Deposition System). When completed, the copper
conducting strips form the sense/digit line enclosed
by the two magnetic layers.
An etched high permeability keeper is placed in
close proximity to the deposited element (Figure 2).
The storage element and the keeper are separated by
a one mil insulating coating to avoid any shunt current
paths through the keeper. The storage element now
has a closed magnetic flux path for both the transverse
and longitudinal axes. The advantages of this configuration are: (1) The transverse and longitudinal
demagnetizing fields are reduced. This results in lower
drive currents and improved operating margins. (2) Interaction between adjacent bits is reduced to a negligible
level. (3) Word line to sense line capacitance, which is
a source of word noise, is minimized.
Figure 1-Storage element (exploded view)
Figure 2-Storage element drive fields
Theory of operation
During deposition, a strong magnetic field produces
a uniaxial magnetic anisotropy in the films of the
storage element. Therefore, magnetization of the storage element exhibits a preferred axis in the plane of
the element normal to the deposited sense/digit line [2].
A stored "1" or "0" magnetic state of the storage
element is determined by the direction of magnetization
around the sense/digit line and parallel to the anisotropy axis of the film. The magnetic flux resulting from
a stored "1" or "0" closes through the silicon monoxide
insulating layers and around the sense/digit line.
Readout of the storage element is accomplished by
passing word current through the word line. The
resultant transverse field rotates the magnetization
of the storage element, which induces a voltage in the
sense/digit line.
The initial direction of magnetization determines the
polarity of the induced voltage. The relationship of
the word current, film signal, and digit current is shown
in Figure 3. The rotation of the storage element
magnetization occurs during the rise time of the word
current.
Passing a current of selected direction through the
sense/digit line restores or writes a "1" or "0" in the
storage element. The resultant longitudinal field overlaps the trailing edge of the word current field and
steers the magnetization to a state determined by the
direction of the digit current.
Figure 3-Signal drive current relationship (traces: word current, film signal, digit current)
Nominal operating characteristics
Storage element operating characteristics are obtained by plotting output flux as a function of drive
currents for prescribed reading and writing conditions.
The total output flux is obtained by integrating the
output voltage with respect to time. Since in normal
operation digit current is common mode in the sense/digit
line pair, all digit currents are given as total
array currents. This is twice the single element current.
Output flux vs read word current
Figure 4 shows the output flux of a typical element
as a function of element word current for both the
"1" and "0" states.
Figure 4-Output flux vs. read word current (output flux in mV-ns against read word current in mA; complete rotation occurs at 500 mA)
Figure 5-Output flux vs. digit current (output flux in mV-ns against digit current in mA; digit disturb region marked)
Figure 6-Output flux vs. write word current (output flux in mV-ns against write word current in mA; write threshold and saturating write marked)
Figure 7-The mated film core array
The curve provides information on element output,
symmetry (skew), and operating word current amplitude requirements. The curve is an actual plot obtained
by:
1. Writing adverse history 256 times.*
2. Writing once in the opposite direction.
3. Reading once and recording flux output at the
indicated word current level.
4. Repeating steps (1), (2), and (3), incrementing
the read word current each time.
History and write word current amplitude: 500
milliamperes.
Digit current amplitude: 50 milliamperes.
* For transverse fields exceeding the write threshold but below
the saturating write level, the degree of saturation achieved
becomes a function of the number of pulses applied. The first
pulse will write a portion of the film while each succeeding pulse
writes a little more. In this way, the film asymptotically approaches the maximum magnetized state for the given field.
Adverse history consists of a sufficient number of pulses to
ensure that the element is conditioned prior to write with the
magnetic state worst case for the write operation. It has been
observed that there are no significant history effects beyond 256
pulses. In memory applications, the element is operated beyond
the saturating write level, where history effects are negligible.
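The four-step measurement sequence can be written as a short test loop. The Python sketch below is illustrative only: ToyElement is a hypothetical stand-in for a storage element and is not a physical model. It assumes a full-scale output of 200 mV-ns, an output that grows linearly with read word current up to complete rotation at 500 mA, and a write that saturates in one pulse at 500 mA word current; destructive readout and partial writes are not modeled.

```python
class ToyElement:
    """Hypothetical stand-in for a storage element (NOT a physical model)."""
    FULL_FLUX = 200.0     # mV-ns, assumed full-scale output flux
    ROTATION_MA = 500.0   # mA, read word current for complete rotation

    def __init__(self) -> None:
        self.state = 0.0  # signed fraction of the film written, -1.0 to +1.0

    def write(self, direction: int, word_ma: float, digit_ma: float) -> None:
        # Assumed: 500 mA word current and at least 25 mA digit current
        # write the element fully in a single pulse.
        if word_ma >= 500.0 and digit_ma >= 25.0:
            self.state = float(direction)

    def read(self, word_ma: float) -> float:
        # Assumed linear rise of output flux up to complete rotation.
        rotation = min(abs(word_ma) / self.ROTATION_MA, 1.0)
        return self.FULL_FLUX * rotation * self.state


def flux_vs_read_current(element, read_currents_ma, history_pulses=256,
                         write_ma=500.0, digit_ma=50.0):
    """Return (read current, output flux) pairs following steps 1 to 4."""
    curve = []
    for read_ma in read_currents_ma:                  # step 4: next current
        for _ in range(history_pulses):               # step 1: adverse history
            element.write(-1, write_ma, digit_ma)
        element.write(+1, write_ma, digit_ma)         # step 2: opposite write
        curve.append((read_ma, element.read(read_ma)))  # step 3: read once
    return curve


curve = flux_vs_read_current(ToyElement(), [0.0, 250.0, 500.0, 1000.0])
```

Replacing ToyElement with an interface to real drive and sense electronics would leave the loop unchanged, which is the point of separating the procedure from the element.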
Output flux vs digit current
Figure 5 shows the output flux level obtained with
a fixed word current of 500 milliamperes as a function
of digit current after repeated digit disturbs.
The plot was obtained by:
1. Writing adverse history 256 times.
2. Writing once in the opposite direction with the
indicated digit current.
3. Digit disturbing 256 times with the indicated
digit current.
4. Reading once and recording the output flux.
5. Repeating steps (1), (2), (3), and (4) while
incrementing digit current each time.
Write word current amplitude: 500 milliamperes.
Write digit current amplitude: 50 milliamperes.
A digit current of 25 milliamperes is sufficient to
write, while a current of over ~O milliamperes is required to digit disturb the storage element.
Output flux vs write word current
Figure 6 shows the output flux level as a function of
write word current with fixed read word current and
fixed write digit current.
The plot was obtained by:
1. Writing adverse history 256 times.
2. Writing once in the opposite direction at the
indicated word current.
3. Reading once and recording flux output.
4. Repeating steps (1), (2), and (3) while incrementing the write and history word current
each time.
Read word current amplitude: 500 milliamperes.
Digit current amplitude: 50 milliamperes.
The write threshold occurs at 300 milliamperes and
a saturated write is accomplished at 500 milliamperes.
Memory array
An array of 1024 active storage elements plus 32
spares is vacuum deposited on a photo-etched glass
substrate (Figure 7). The deposited sense/digit line
pair links all bits on the array, making this a 1024 word
by one bit slice of the memory. The storage element,
in the shape of a capital I, is shown in the enlarged
view of the array (Figure 8). The body of the I is the
active region of the element. The remainder of the
element is always in a demagnetized state; however, it
serves the useful function of reducing the transverse
demagnetizing field. The two holes, which straddle
each element, accommodate the word lines.
Figure 8-Enlargement of memory array
Continuous vacuum deposition system
The continuous vacuum deposition system is the
single most significant feature which sets MATED FILM
memory array processing apart from conventional
batch processing systems. Operational shakedown tests
on the system have been completed. These tests
demonstrated the system's feasibility as well as its
capability. The capacity of the present system is
10^8 bits per year. A program to increase this rate will
put the facility in a full capacity mode of 1.6 X 10^9
bits per year by early 1970.
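These capacity figures can be put in terms of arrays with a little arithmetic. The Python sketch below assumes that the quoted bits are the 1024 active elements of each array (spares excluded) and that every deposited array is usable; yield is not given in the text, so the results are upper bounds on what the stated rates imply. The names are hypothetical.

```python
BITS_PER_ARRAY = 1024  # active storage elements per array, spares excluded


def arrays_per_year(bits_per_year: float,
                    bits_per_array: int = BITS_PER_ARRAY) -> float:
    """Arrays needed per year for a given bit rate, assuming 100 percent yield."""
    return bits_per_year / bits_per_array


present = arrays_per_year(1.0e8)   # present system: about 97,656 arrays
planned = arrays_per_year(1.6e9)   # full capacity mode: 1,562,500 arrays
scale_up = 1.6e9 / 1.0e8           # planned increase: a factor of 16
```

Any yield below 100 percent raises the number of arrays that must be deposited in proportion.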
Continuous fabrication
MATED FILM memory arrays are fabricated by a
continuous vacuum distillation process using an in-line
concept of material flow. Glass substrate blanks travel
sequentially through four deposition chambers (Figure
9) where progressive layers of magnetic alloy, copper,
and insulator material are deposited through precision
contact masks. The lost time due to pump down, substrate heating, and substrate cooling in a batch process
is saved in this continuous process once steady state
vapor composition is achieved. Typically this is 20
minutes after start-up.
Within the vacuum chambers, the various materials
are vaporized continuously and the deposition is
monitored and controlled automatically. A production
cycle of 16 hours during a 24-hour period is realized
using this process.
In conventional batch distillation processes, the
composition of a multi-component vapor is a time
dependent function. The higher volatility fraction
vaporizes in a proportion greater than its melt fraction.
To achieve a deposited alloy film of a precise composition, for example zero-magnetostriction iron-nickel
alloy, the vapor stream must be captured at a point
in time determined by composition versus distillation
time [3]. With continuous fabrication, the process control
is built around negative feedback techniques which
routinely control the composition of the alloy vapor
for continuous periods of 16 hours. The vapor distilled
by this steady-state process produces constant zero-magnetostriction nickel-iron vapor for time periods
measured in hours rather than minutes.
Figure 9-Schematic, thunderbird facility (chamber 1: bottom magnetic layer and insulation layer; chamber 2: copper interconnecting line; chamber 3: copper digit-sense line and insulation layer; chamber 4: top magnetic layer and insulation layer)
The separate production stations (the four deposition
chambers) of the continuous system permit corrections
to be made easily and quickly. Also, the continuous
emergence of arrays allows for prompt monitoring
of the system. After each deposition stage, the substrates are removed and inspected. As soon as a defect
is detected, the continuous system can be stopped and
the problem isolated and corrected. Loss of process
control in the batch system, no matter when it is detected, usually results in loss of the entire batch.
System description
MATED FILM memory arrays are fabricated in
four identical continuous vacuum evaporators. Each
evaporator (Figure 10) consists of a deposition system,
a transport system, and a pumping system.
Figure 10-Continuous vacuum evaporator
The deposition system is capable of evaporating up
to three source materials concurrently at specific
rates. Electron beam heated sources are used for nickel-iron, copper, and chromium. The SiO source is resistance heated. Evaporant shutters above each source
automatically expose the substrate for a predetermined
time interval. The evaporation rate of the nickel-iron
and copper sources is controlled using a vapor rate
monitor. The monitor signal is used to regulate the
electron beam gun emission current. Evaporated
materials are replenished by wire feeders which draw
nickel-iron or copper wire from a spool and guide it
into the molten source. The removable base plate,
which contains all of the deposition equipment except
the shutters and vapor rate monitor, fits onto the bottom of the main chamber.
The transport system moves substrates from a
magazine in the input chamber to the main chamber,
where the depositions are made, and then into the
output chamber. Heaters raise the substrate to deposition temperature during transit from the input
chamber to the deposition chamber. The substrates
pass through a water-cooled tunnel in the cooling section of the transport system, which cools them to
handling temperature before they enter the exit
chamber.
The automatic pumping system has three interlocked
subsystems controlled from a single console. The
pumping system maintains high vacuum in the deposition section of the evaporator, while cycling the input
and output chambers from atmospheric pressure down
to high vacuum as required by the transport section.
System operation
After the substrate is inspected for possible defects,
it is placed in the substrate holder and covered with
the first mask. Subsequent substrates and masks are
loaded in holders and placed in a cartridge. A cartridge
of holders is loaded into the input chamber of Station 1.
The holders are automatically ejected from the cartridge and pushed sequentially toward the deposition
chamber. Within each deposition chamber the substrate
is exposed at two of the three positions or windows
available. At the first position the bottom magnetic
layer is deposited in the memory bit pattern. At another
position the silicon monoxide is deposited over the
magnetic alloy through the same mask. When this
process is completed, the holders are pushed to the
exit chamber.
After it has been removed from the exit chamber, the
substrate with the first magnetic alloy and silicon
monoxide layers is inspected and returned to the
substrate holder with the mask for the interconnecting
elements. The cartridge is then reloaded into the input
chamber of Station 2. Using the same procedure, a thin
adhesion layer of chromium is deposited for the sense
line interconnecting elements, followed by an overlaying
deposit of copper.
At Station 3, the substrate is again removed, inspected, and loaded into the input chamber using
different masks for the sense/digit conductor deposition. Chromium, copper, and silicon monoxide are
deposited using all three exposure positions.
At Station 4, the top magnetic layer and silicon
monoxide are deposited. The outer film of silicon
monoxide seals and insulates the memory bits.
At this point, the completed film arrays are ready
for functional testing before being assembled into
memory stacks.
Memory stack construction
The MATED FILM memory can be thought of as
a two wire system. One axis of stringing and its associated connections are an integral part of the previously
described deposition process. To complete the stack
it is only necessary to string the word axis and terminate
these word lines in the word diode selection matrix.
The memory plane assembly is formed by bonding
two film arrays to a single keeper, as shown in Figure 11.
In this form the array is less susceptible to scratching
or cracking during subsequent assembly.
Figure 11-Memory plane assembly
The film arrays are combined to form a 1024 word
by n bit substack with one array for each bit in the
memory word. The substack can then be arranged in
various series/parallel configurations to meet specific
system requirements. The design will accommodate
word lengths up to 256 bits without affecting cycle time.
Figure 12 is an exploded view showing the substack
construction. The memory planes are stacked with the
etched holes vertically aligned; half of each word loop
is connected to the bussed word line header and is
threaded down through the substack, while the remaining half of each word loop is threaded from the
bottom of the substack. The preformed wire wraps
connect the word loops at the bottom. The top ends of
the word loops are wire wrapped to the diode leads.
The wire wrap connections are then mass soldered to
ensure a reliable electrical connection. The completed
substack contains 32 spare words and 10 percent
spare planes which are externally accessible. These
spare words and planes may be used without restriction
anywhere in the substack. This means that the substack will never require rework unless all of the spare
words or spare planes are consumed.
Figure 12-The substack
Figure 13-The memory chassis
WEIGHT                  49 lbs.
DIMENSIONS              18.3" x 11.3" x 5.5"
INPUT POWER             190 Watts
INPUT VOLTAGE           90 volt (internal dc to ac power converter)
COOLING                 Conduction to a convection heat exchanger
CAPACITY                16K words, 32 bits
CYCLE TIME              500 nanoseconds
ACCESS TIME             225 nanoseconds
INTERFACE               8 Channel Asynchronous
ENVIRONMENT TOLERANCE   Mil-E-16400 Class 1
Memory system
The memory substack and element design does not
vary with the application; however, some of the memory
electronics must be tailored to the specific capacity
and speed required. One typical configuration which
has been built and tested is a 16K word, 32-bit militarized memory with a cycle time of 500 nanoseconds.
A sketch of this memory (Figure 13) shows the location
of the memory subassemblies.
The heat exchanger which mounts on the front face
of the chassis is not shown in this sketch. Cooling is
accomplished via thermal conduction from the components to the heat exchanger which is convection
cooled by external air.
The stack module (Figure 14) contains a pair of
1024 word, 64-bit substacks mounted on a common
plug-in header. The connectors on each side carry the
drive lines leading to the diode selection matrix. The
stack modules are field interchangeable within and
between chassis.
Figure 14-The stack module
Sense/digit configuration
The total sense/digit line is formed by interconnecting the 1024-bit sections, which are part of the
individual substacks. Figure 15 shows one of 64 complete sense/digit lines. Stack modules 1 and 2 form
the left and right halves, respectively, of the 4096-bit bridge. Stack modules 3 and 4 form a second
bridge and are connected in parallel to a common sense
amplifier and digit driver.
Figure 15-Sense/digit line configuration
The common mode choke ensures that currents
flowing in and out of the bridge are equal, and provides
both a common mode and differential null at the sense
terminals to the degree the legs of the bridge are balanced. This unbalance is controlled so that the digit
noise induced into the amplifier is less than three times
the signal, a level within the tolerance of the amplifiers.
The center driver transformer reduces the time required
for the digit current to achieve steady state throughout the line to 40 nanoseconds. Without this transformer, the time would be 80 nanoseconds.
Word selection
Words are selected by the following method. The
four stack modules, each containing 2048 double length
words, combine to form the system capacity of 8192
double length words arranged in a 64 by 128 matrix
(Figure 16). On the driver side of the matrix, address
bits S6, S7, and S8 along with S9, S10, and S11 are decoded
to form an eight by eight matrix which selects one of
64 drivers. Similarly, other address bits are decoded to
select one of 16 diverters and one of eight diverter
selectors. Word current passes through the word loop
which lies at the intersection of the drive line and the
diverter line. The word current generator controls the
amplitude and timing of the word current pulse.
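The selection scheme just described can be sketched in a few lines. This is only an illustration: the text gives the field sizes (64 drivers, 16 diverters, 8 diverter selectors) but not the full assignment of address bits, so the bit positions and the names `decode_word_address`, `driver_row`, `driver_col`, and `diverter_line` below are assumptions.

```python
# Sketch of the word-selection decode: a 13-bit address picks one of 8192
# double-length words at the intersection of a drive line and a diverter line.
# The field layout is hypothetical; only the field sizes come from the text.

DRIVER_BITS = 6      # two 3-bit groups form an 8 x 8 matrix -> one of 64 drivers
DIVERTER_BITS = 4    # one of 16 diverters
SELECTOR_BITS = 3    # one of 8 diverter selectors


def decode_word_address(address: int) -> dict:
    """Split a 13-bit word address into (assumed) selection fields."""
    total_bits = DRIVER_BITS + DIVERTER_BITS + SELECTOR_BITS
    if not 0 <= address < (1 << total_bits):
        raise ValueError("address out of range for an 8192-word system")
    driver = address & 0x3F
    diverter = (address >> DRIVER_BITS) & 0xF
    selector = (address >> (DRIVER_BITS + DIVERTER_BITS)) & 0x7
    return {
        "driver_row": driver & 0x7,        # first 3-bit group of the 8 x 8 matrix
        "driver_col": driver >> 3,         # second 3-bit group
        "driver": driver,                  # one of 64 drive lines
        "diverter": diverter,
        "diverter_selector": selector,
        "diverter_line": selector * 16 + diverter,  # one of 128 diverter lines
    }


if __name__ == "__main__":
    print(decode_word_address(0))
    print(decode_word_address(8191))
```

Every address maps to a distinct (drive line, diverter line) pair, which is the 64 by 128 arrangement of the matrix.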
Timing
Figure 17 shows the timing for a typical memory
cycle. Prior to time zero, all requests were processed
and the memory was waiting. Then, at time zero, a
memory request arrived at the memory interface.
For this condition, 165 nanoseconds are required to
acknowledge the request, process it through the priority
network and gate the address into the memory address
register.
At t = 235 nanoseconds, the address is decoded and
the proper word and diverter switches have been
turned on. Word current is driven through the selected
word loop, interrogating the films in that word. The
sense signal peaks within the 50 nanosecond rise time
of the word current. The polarity of the film signal
indicates the stored state. The sense preamplifier output
is shown for both a stored "1" and "0."
Figure 16-Word selection
Figure 17-Timing for a typical memory cycle (traces: memory request, priority evaluation, translation and selection, word current, digit current, sense preamp output, data register)
A sense signal from the near end of the sense line
has only a 10 nanosecond delay through the preamplifier; a signal from the far end of the sense line has the
additional 40 nanosecond delay of the sense line.
The bottom trace shows the length of time the contents of the data register are valid. During this time,
the digit driver is turned on; the polarity of the digit
current determines which state is to be stored. On a
read cycle, the data is recirculated from the data
requester. At this time if the data from the requester
is not available, the memory performs a split write
cycle while waiting for the data to arrive.
At t = 600 nanoseconds, priority evaluation of active
requests begins. If active requests are present, the
memory will recycle every 500 nanoseconds.
Test results
A preproduction model of the memory system described was completed in April 1969 and has been
undergoing environmental evaluation. Figure 18 contains "schmoo" data which indicates the threshold of
the first bit failure, with the memory system running a
comprehensive pattern of writes, reads, and disturbs.
Word and digit currents are shown as a percentage of
deviation from nominal, Iw = 700 mA, Id = 45 mA.
The center square represents the system's drive current
limits; these limits are ± 5 percent. This is safely within
the usable operating region, as indicated by the
"schmoos", for ambient temperature ranges of -55°C
to 65°C. The degree of overlap of the high and low
temperature "schmoos" eliminates the need for drive
current temperature compensation.
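As a small worked example of the drive-current limits quoted above, the following sketch converts the ± 5 percent limits into absolute currents. The nominal values come from the text; the function names `drive_limits` and `within_limits` are illustrative only.

```python
# Worked arithmetic for the "center square" of the operating-margin plot:
# nominal drive currents from the text, with the stated +/- 5 percent limits.

NOMINAL_MA = {"word": 700.0, "digit": 45.0}   # Iw and Id, in milliamperes
LIMIT_PERCENT = 5.0


def drive_limits(nominal_ma: float, percent: float = LIMIT_PERCENT) -> tuple:
    """Return (minimum, maximum) current for a given percentage deviation."""
    delta = nominal_ma * percent / 100.0
    return (nominal_ma - delta, nominal_ma + delta)


def within_limits(kind: str, current_ma: float) -> bool:
    """True if a measured current lies inside the system drive limits."""
    low, high = drive_limits(NOMINAL_MA[kind])
    return low <= current_ma <= high


if __name__ == "__main__":
    for kind, nominal in NOMINAL_MA.items():
        print(kind, drive_limits(nominal))
```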
These results show the nominal characteristics of
the storage element to be quite representative of the
entire memory. The results also show that there are
no noise or signal interaction conditions in the stack
or electronics that will compromise system margins.
Above the maximum digit current, failure is caused by the
disturb of unselected bits. This limit approaches the
Hc of the films since the element design very effectively
minimizes transverse fields on these bits, which would
otherwise aggravate the condition. Film dispersion and
skew determine the minimum limit of digit current
for an adequate write. Word current could not be
varied above 20 percent of nominal so the schmoo
in this region is not known. Minimum word current
failure is caused by the reduced effective rise time resulting in a delayed and reduced signal peak.
CONCLUSIONS
The existing MATED FILM memory design is conservative, yet competitive. As with any new technology,
future development can be expected to enhance performance and reduce costs. The two most significant
growth areas for MATED FILM are higher speed and
higher bit density. The feasibility of a 200 nanosecond
cycle time for systems up to 10^6 bits has been demonstrated by several partially populated breadboards.
Expansion of memory in the word direction has little
effect on cycle time. The high bit density in this direction minimizes delay and loading effects.
Part of the future plan for this memory is to double
the bit density on the present size array so that each
array will contain 2048 bits. This will provide such
direct improvements as reduced costs, increased production capacity, and smaller physical size.
Figure 18-Memory system operating margins
A computer engineering laboratory
by D. M. ROBINSON
University of Delaware
Newark, Delaware
INTRODUCTION
The advent of modern electronic computers has expanded the scope of nearly all areas of scientific
endeavor. The electrical engineer is perhaps most
acutely affected by this expansion by virtue of his
two-fold interest in computer processes. He is, as are
his colleagues of other scientific disciplines, excited
by the computing capabilities now at his disposal.
Even more, he is deeply involved by virtue of his responsibility for the conception and design of the computer and its hardware adaptation to a variety of applications. It is to the second phase of the electrical
engineer's involvement with computers that our
educational activities are directed, that is, to his involvement in the realization of computers or computer-like systems.
The environment
In order to adequately portray this educational activity, it is necessary to describe the environment in
which it takes place. This environment will be described as it applies to electrical engineering students
at the University of Delaware. However, this is not
an atypical situation and the description could apply
to many of our universities.
Present status
Our senior students are now beginning to come from
a generation which has grown with the computer. Some
have started their association with computing machines in high school or even earlier. All have been
through some sort of a problem-oriented first course
which leads to machine solutions employing a language
like FORTRAN. All have become familiar with
the power of the computer for problem solving as
early as their first course in Linear Circuit Theory (a
candid admission here is that some problems at this
level are indeed a bit forced). By the time these students have become juniors, they are aware of user-oriented packages such as ECAP (Electronic Circuit
Analysis Program, an IBM applications program)
and have employed this type of program in analysis
of active and passive networks. Modeling and simulation have become familiar terms and tools to these
students.
Except in the very earliest courses, machine computation is not introduced artificially. The students have
been challenged by the problems. Courses have not
been modified to simply introduce computational
techniques; rather, the problem areas have no longer
been artificially compressed to exclude the large system or the nonlinear problem which motivates the computational techniques. It should be mentioned that
closed-form solutions and functional relationships are
sought first. We do not seek to relegate all problems
to computer solutions but rather to find a reasonable
balance between this and the more traditional treatment of problems.
All of these activities are motivated by the search
for solutions to generally traditional problems in
electrical engineering; these activities have been termed
applications oriented. For the most part, engineering educators tend to center their computer-related
activities about the capability of machines for solving
traditional problems and the vehicle by which this
computational power may be focused on their particular discipline. In such application areas, our educational
system seems to be responsive to the student's requirements.
Changes
The electrical engineer's environment is dynamic.
An educational system which was responsive to the
needs of the past may not now serve. There are new
problems of importance, problems which have been
spawned by the very existence of the computer. Recent electrical engineering graduates are concerned
with the design of systems which may involve a general-purpose digital computer in an on-line control function,
a data-retrieval and signal-processing operation or
some similar real-time application. Control, communication, pattern recognition, filtering, and numerous
other system functions are frequently developed about
special-purpose digital computers. As a class, such
systems certainly represent a significant portion of
today's electrical engineering effort. With these problems for motivation, electrical engineering students
view a casual user relationship with computers as
simply not being relevant to their educational needs.
Their interests and future responsibilities can only be
served by an involvement which gives them an intimate experience with this developing environment.
The importance of this changing situation has been
recognized at the University of Delaware and over the
past five years, several curriculum modifications have
been made to strengthen and update our related activities. The subjects which have the strongest relation to this area and, as such, the ones which have
received the greatest attention in our revisions, cover
such topics as logical design, switching theory and
computer organization. The curriculum modifications have extended into such traditional courses as
electronic circuits, control systems, communication
systems, and information theory. These courses have
been modified to emphasize the role of discontinuous
elements or discrete systems or to introduce the notion of digital processes. Some course work is immediately related to digital systems and their design while
more remotely related course work simply encourages
thinking in terms of digital problem solutions.
Role of the laboratory
These curricular innovations have permitted the development of the general analysis, synthesis or design
techniques required for the examination of digital
systems. Mathematical descriptions of the situation
are developed from models of these systems. As in any
physical situation, the conclusions drawn from manipulation of the mathematical models are no better than
the original representation of the system; in addition,
the modeling process itself is often tempered by the
degree of rigor which may be mathematically tractable.
Consequently, the conclusions drawn from analysis
of the models may fail to give a complete or accurate
representation of the physical digital system's behavior. In this area then, as in all areas of engineering,
it is felt that laboratory experience acts as a medium
through which the reality of the physical situation
may be brought to the student. He is made aware of
the limitations of his system models and the implications of his modeling process. It is in the laboratory
that a student must pursue the details of the subject;
this is where he "puts it all together." Thus, progress
in the discipline area requires progress in related laboratory experiences.
Enhancing the quality of laboratory studies in digital systems is a process which is not accomplished
without assiduous attention. This is true of laboratory
studies in general and it is especially the case for a
digital systems laboratory. This is at least partially
due to the plague which has been termed the "tyranny
of numbers." A common characteristic of digital
systems is certainly that large numbers of elements
are required and that large numbers of connections
must be established. Only trivial problems can be
attempted in an afternoon spent in the laboratory.
Even trivial systems can quickly spread into a maze
if usual breadboard techniques are used. Laboratory
budgets can rapidly become unrealistic if even only
one or two students wish to retain a problem of moderate complexity. Some early efforts were made to develop small patching stations and arrangements
which would help alleviate these problems. These
efforts served some pedagogical purpose; however,
their limited versatility and the relatively slow expansion process did not permit them to foster the
desired growth of this area.
The state of our laboratory has been enhanced by
the acquisition of a small digital computer and the
introduction of this machine into a system which approximates a generalized interface. This system permits physical access to all of the essential computer
functions and incorporates facilities for patching
connections to external digital logic-modules so that
an extension of the computer or an interfacing system
may be rapidly established. We have dubbed the
system with the acronym DADEC (Design and Demonstration Electronic Computer). This system, which
represents only a modest investment, has proved to
be a boon in the inspiration of interest and stimulation
of growth in this study area.
Several laboratory experiments and exercises have
been developed about this DADEC system. Some of
these are extremely simple exercises which serve to
establish familiarity with the machine, its coding,
logic levels, etc. Some experiments are rather sophisticated real-time data processing adventures. The set
of experiments was designed to support course work
from sophomore computer science level through electrical engineering senior projects.
In this paper, the DADEC system will be described
and several example problems outlined. The examples
have been chosen to illustrate the range of educational
levels which may be served using the experimental
system, the versatility of the system, an example from
several of the particular related course areas, and
some problems which may be of general interest.
The DADEC system
The DADEC system is conceptually and practically
very simple; a block diagram of the system is shown in
Figure 1. Central to the system is a small general-purpose digital computer. A number of digital logic modules (flip-flops, gates, one-shots, line drivers, etc.)
are mounted in adjacent frames with a patch panel
which permits the rapid establishment of interconnections between these peripheral elements and the
computer. All of the computer interfacing lines are
available at terminals on this patch panel.
The majority of the logical building blocks are
completely unspecified, that is, any available logic
module may be substituted in the patching arrangement. It has been found that a few specific functions
are repeated in a great many interfacing problems,
and these functions have therefore been prewired on
the patch panel (two binary up-counters and one
binary up-down-counter). Switch-registers, light-registers, some momentary contact switches and free indicator lights are available as a portion of this generalized
interface. Trunk lines are available for connection to
remote equipment such as analog tape transports,
signal sources, etc.
Figure 1-DADEC system - Block diagram (central processor (PDP 8), paper tape reader/punch, fixed inputs, display, logic module array, patch system)
An analog-to-digital converter is included in this system. Students have designed, built and added a
four-channel analog multiplexer. Students have also
designed, built and added ten channels of digital-to-analog conversion. A portion of this D-A converter is
used to drive a storage oscilloscope facility. This system is by no means static; we are presently adding
additional equipment racks for the inclusion of micrologic modules. Plans include the addition of a paper-tape reader-punch and a disc to the system. An incremental digital tape recorder for accumulation of data
for later off-line processing is to be interfaced by the
students and added to the system.
A few comments are in order regarding the selection
of the particular computer for use in the DADEC
system. While the computer is general-purpose, it is
not subject to the same set of constraints which govern
the selection of a machine for a user-oriented computing center. For our purposes, the most important
criterion for evaluating a machine is its ability to
contribute to the educational process. In order to
contribute, it need not have a tremendous core storage
capacity or a rapid thru-put capability. Since the
machine has been in use, its applications have been
concerned with interface problems or the demonstration of system functions and not with its use simply
as a computational device. The machine need not
have a long word length; there is very little pedagogy
which is served by a twenty-four bit machine which
is not adequately served less expensively by a twelve-bit machine. Indeed, the short word length and the
resulting abbreviated instruction list and core paging
system actually serve our instructional purposes. The
computer should be easy to interface and adaptable
to a large variety of peripheral equipment. It should
have inherent compatibility with a family of logic
circuits which are readily available. The machine
should be easy to service; frequent failures of the system are observed, since many of the experiments involve hardware entry into the internal operation of
the machine. Finally, it is a desirable attribute if
the machine has at least a limited FORTRAN language
compatibility. This enables inexperienced coders to
immediately use the system once any additional software is established for addressing peripheral devices.
One currently has a rather large selection of machines
which meet these objectives (at least 25 such machines).
At the time our decision was made, the list was not so
extensive, but we have found that the Digital Equipment Corporation's PDP-8 is a very satisfactory,
moderately priced machine.
Example experiments
Several example experiments will be outlined in this
section. Some of the experiments are, of course, prompted by the requirement that students must first be
introduced to this system; however, the predominant
motivation is problem solving. When the system was
first conceived, the faculty felt responsible for specification of a number of problems to be implemented. We
felt that we would be hard pressed to find a sufficient
number of examples to insure full utilization of the
system; however, the students have been encouraged
to suggest problems and their exuberance now prevails. We encourage the students to seek problems
from other departments on campus and their suggestions have covered the gamut from exotic time-sharing activities to automatic control of oyster reproduction. The following few examples were chosen
from student-suggested projects.
An introduction to the system
The Electrical Engineering Department is responsible for the instruction of computer science majors
of the College of Arts and Science in a course that is
oriented toward the hardware and architecture of
computing systems. For the most part, these students
will have had no experience with a digital computer
at a more intimate language level than FORTRAN.
We find that a simple machine-language program tracing
experiment is extremely effective in establishing both
an introduction to the DADEC system and the operation of a compiled language. A simple type-out routine
is coded in FORTRAN; this program is compiled
and loaded along with the operating system. The routine is then executed in a single-step machine-language
mode so that all of the required steps of masking, code
conversion, communication with a peripheral device,
etc., may be examined using the register information
supplied by the DADEC system. This experiment is,
of course, extremely simple; however, it does illustrate the fact that this somewhat generalized digital
system finds use even at early instructional levels.
An extension of the computer
These computer science students soon become moderately proficient at programming in the assembly language of this machine. Programming instruction is
not a part of the course per se; but the relation between
"hardware" and "software," which is discussed quite
often, naturally brings up coding problems. Near the
end of the course they are capable of more ambitious
experiments in which additional commands are added
to the repertoire of the computer. An example of this
is the addition of a "hardware" EXCLUSIVE-OR
command. In this experiment, a program controlled
input/output transfer is initiated to transfer the contents of two memory locations to external registers.
The peripheral portion of the system performs the
EXCLUSIVE-OR operation and transfers the data
back into the accumulator. Now, of course, a programmer can accomplish a similar result with a subroutine
of some fifteen or so statements. The student is thus
faced with an example of what is often called the "hardware-software" trade-off.
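The software side of this trade-off can be illustrated with a short sketch. It is not the students' subroutine; it assumes only that the machine offers a logical AND, a two's complement add, and a complement operation on 12-bit words (as the PDP-8 does), and it uses the identity A + B = (A XOR B) + 2(A AND B). The name `xor_without_xor` is illustrative.

```python
# Sketch: EXCLUSIVE-OR built from AND, add, and complement on 12-bit words.
# Identity used: A + B = (A XOR B) + 2 * (A AND B), so
#                A XOR B = A + B - 2 * (A AND B)   (modulo 4096).

MASK = 0o7777  # 12-bit word


def xor_without_xor(a: int, b: int) -> int:
    """Compute a XOR b using only AND, addition, and complement."""
    both = a & b & MASK                      # bits set in both operands
    twice = (both + both) & MASK             # 2 * (A AND B), truncated to 12 bits
    negated = ((~twice & MASK) + 1) & MASK   # two's complement negation
    return (a + b + negated) & MASK


if __name__ == "__main__":
    a, b = 0o5252, 0o3300
    print(oct(xor_without_xor(a, b)), oct(a ^ b))
```

Each step in the function corresponds to a handful of machine instructions with loads and stores between them, which is consistent with a subroutine of roughly the size mentioned above.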
Automatic testing
Within the electrical engineering curriculum, emphasis is placed on designing the class of electronic
circuitry which is usually involved in computers. Each
student is assigned the problem of accomplishing a
"worst-case" design of a discrete element NAND/NOR
gate. This design requires that a certain fan-in,
fan-out requirement be met at room temperature with
any transistor from a given distribution. The DADEC
system is used in the evaluation of the students' design, that is, in testing of the circuits. The students go
through the procedures of design computation, breadboarding, testing, reevaluation of their design, and
finally, fabrication of their design on a printed wiring
board which is acceptable in the DADEC interface
system. The system then exercises their circuit by
connecting output loads and applying worst-case
signals while circuit conditions are tested with the
analog-to-digital converter. The computer gives the
student a grade on the lab experiment which indicates
how well he met the design objectives.
Encoding and decoding
A course discipline area is developed in the theory
of simple sequential systems. As an example problem,
and one which draws upon the student's information
theory background, a single error correction digital
transmission system is designed. An asynchronous,
sequential coder and decoder are realized using NAND
gates. This sub-system is patched into the DADEC interface and the computer is used to generate code
groups which are transmitted to and received from the
transmission system. A random error generator (a computer subroutine) creates a noisy channel or errors
in the transmission path. The computer further analyzes
the transmission and reports the performance statistics of the system.
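The computer's role in this experiment can be sketched as follows. The text does not say which single-error-correcting code the students used; a Hamming (7,4) code is assumed here purely for illustration, and the function names (`encode`, `decode`, `noisy_channel`, `simulate`) and the statistics reported are hypothetical.

```python
import random

# Sketch of the experiment's software side, assuming a Hamming (7,4) code.
# Codeword layout (positions 1..7): p1 p2 d1 p3 d2 d3 d4.


def encode(data):
    """Encode 4 data bits into a 7-bit single-error-correcting codeword."""
    d1, d2, d3, d4 = data
    return [d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d1, d2 ^ d3 ^ d4, d2, d3, d4]


def decode(received):
    """Correct at most one bit error; return (data bits, error position or 0)."""
    c = list(received)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]     # parity over positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]     # parity over positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]     # parity over positions 4, 5, 6, 7
    position = s1 + 2 * s2 + 4 * s3    # syndrome names the bit in error
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


def noisy_channel(codeword, rng, error_probability):
    """Random error generator: flips one randomly chosen bit with given probability."""
    sent = list(codeword)
    flipped = rng.random() < error_probability
    if flipped:
        sent[rng.randrange(7)] ^= 1
    return sent, flipped


def simulate(n_words, error_probability, seed=0):
    """Send random code groups through the channel and gather statistics."""
    rng = random.Random(seed)
    stats = {"words": n_words, "errors_injected": 0,
             "errors_corrected": 0, "residual_errors": 0}
    for _ in range(n_words):
        data = [rng.randint(0, 1) for _ in range(4)]
        received, flipped = noisy_channel(encode(data), rng, error_probability)
        decoded, position = decode(received)
        stats["errors_injected"] += flipped
        stats["errors_corrected"] += position != 0 and decoded == data
        stats["residual_errors"] += decoded != data
    return stats


if __name__ == "__main__":
    print(simulate(1000, 0.3))
```

Because this channel model never flips more than one bit per code group, every injected error should be corrected; a channel that allowed multiple errors would show residual errors in the statistics.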
Understanding the computer functions
The particular computer employed in this system
has two rapid input-output data transfer mechanisms.
These are called single-cycle and three-cycle data-break transfers. These are rather difficult mechanisms
for the students to assimilate. This is not because they
are conceptually difficult but because of the large
number of signals which must be recognized and carefully timed. A simple experiment serves to illustrate
both of these data-break facilities. We call this experiment a hardware clear core. In this interface,
the single cycle data-break is first called to set zeros
into core location zero and one into location one. The
three cycle data-break is then initiated with a word
count register as location zero accompanied by presentation of all zeros on the data lines. This has the net
effect of clearing all core locations except zero and
one. The single cycle data-break is then again called
to clear these two locations. This is all accomplished
with a sequenced switch operation in the interface.
While the interface is particularly simple, the experiment does require a sophisticated understanding of
the operations of the computer.
Some more challenging experiments
Student projects are being executed using the
DADEC system. In this project environment, rather
comprehensive problem areas are either suggested to
the students or suggested by the students. They may
then pursue a solution of the problem for one or perhaps two terms of their senior year. Several of these
problems will be described in greater detail than have
the previous problems, since these serve to illustrate
the student's approach to problem solving.
HL
HL
00 01
f • f2
519
10
\I
Z
00 01
• f2
00 00 01
10
00
(0
2
4
5
I
01
3
®
4
5
I
01
I I 01
10
I I
®
2
4
5
0
II
I I 01
10 10
10
I
I
10 00 10
10 10
® @) ®
10 10
10
FLOW TABLE
EXCITATION TABLE
SIGNA~_
F. = H + [ f 2+ L f. f~
~:,,~: ~/~
V
L
COMPARATORS
L
Fa =H'f2 + H'Lf.'
FUNCTIONS
Figure 2-Pulse-height detector -
Z =( f. fa )'.
Description
designing an asynchronous sequential circuit which
transmits a standardized pulse whenever its input
pulses meet the proper amplitude criterion. A description of thi$ system is shown in Figure ·2. Two comparators are used as decision elements to determine if
the input signal has passed either the low threshold
voltage (VL) or the high threshold voltage (VH)' The
results of these decisions i.e., the output of the comparators, are described by -Boolean variables Hand L.
A flow table which summarizes the required circuit
action for any input sequence is shown in Figure 2
(note that flow tables of this type are described in
references such as Maley7).
This fl9w table may successfully be assigned internal
state variables (fl and f2 ) as shown. The excitation
table may be formed, and from these tables excitation functions (F 1 and F 2 ) and the output function
(Z) may be derived.
A puise-height.analyzer1
The analysis of pulse-height information is quite
suitable for digital sub-system soluti~. This particular pulse-height analyzer is unique in that the pulses
are of only about 30 nanoseconds duration and the
counting interval must be short (about 50 microseconds) with no dead time between successive count
intervals. The student approached the problem by
II
Figure 3-Pulse-height detector -
Logic diagram
520
Fall Joint Computer Conference, 1969
A logic diagram realizing these excitation and output functions using NAND elements is shown in
Figure 3. The Z function feeds a pulse amplifier which
produces standardized pulses upon a logical 1 to 0
input transition. Figure 4 indicates the performance of this pulse-height detector in response to
pulses which dwell at the threshold level for only some
10 to 15 nanoseconds. Notice that pulses less than the
low-threshold or greater than the high-threshold
produce no output. Pulses with amplitudes between
these thresholds produce standard 10.0 nanosecond
output pulses.
Figure 4-Pulse-height detector - Performance (time scale 100 ns/div; amplitude 500 mV/div; output 2 volts/div)
These output pulses are directed to one of a pair of
up-counting registers in a synchronous sequential
sub-system. These registers alternately store the count
for the appropriate counting interval and then dump
the stored count directly into a memory location using
the computer data-break facility. The entire analyzer
interface, which consists of some 45 flip-flops, 50 gates,
and about five other miscellaneous circuits, is patched
on the DADEC system. The computer controls the
counting interval and keeps track of the appropriate
core locations for data storage.
The computer also controls the two threshold voltages VH and VL by directing appropriate numerical
values to two channels of the digital-to-analog converter. Two additional D/A channels are employed
for graphical display of the accumulated count as either
a function of the threshold voltages or time. This
is accomplished by simply presenting these two analog
channels and a device selection channel to the X,
Y and Z axis of a cathode-ray tube with storage facilities.
For this problem, and indeed for all problems of a
project nature, the software support must also be developed by the students. In this instance, there are
very few calculations accompanying the process and a
rather short symbolic program suffices to control the
experiment, accumulate the data, present the display,
and punch out information for later entry into a larger
computer for analysis. Here the DADEC
system is functioning as an on-line data retrieval
system with quick-look facilities and off-line data
processing.
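The counting scheme of the pulse-height analyzer described above can be modeled in a few lines of software. The following Python fragment is only an illustrative sketch, not the students' hardware design or control program: the function names, the representation of a pulse as a (time, amplitude) pair, and the use of a Python list in place of core storage are all assumptions made for the example. It shows the two ideas in the text: a pulse is counted only when its amplitude lies between the low and high thresholds, and two registers alternate so that one can be dumped while the other keeps counting.

```python
def in_window(amplitude, v_low, v_high):
    """True when the pulse amplitude lies between the two thresholds."""
    return v_low < amplitude < v_high


def count_intervals(pulses, v_low, v_high, interval_us=50.0):
    """Count in-window pulses per counting interval.

    pulses: iterable of (time_us, amplitude) pairs sorted by time
            (a hypothetical input format chosen for this sketch).
    Returns one count per counting interval, in time order.
    """
    memory = []          # stands in for the core locations filled by data break
    registers = [0, 0]   # the pair of up-counting registers
    active = 0           # index of the register that is currently counting
    boundary = interval_us
    for time_us, amplitude in pulses:
        while time_us >= boundary:
            full = active
            active ^= 1                      # other register takes over at once
            memory.append(registers[full])   # dump the finished count
            registers[full] = 0
            boundary += interval_us
        if in_window(amplitude, v_low, v_high):
            registers[active] += 1
    memory.append(registers[active])         # dump the last, partial interval
    return memory


# Hypothetical pulse train: three counting intervals of 50 microseconds each.
example = [(5, 1.0), (10, 3.0), (20, 0.2), (60, 1.5), (70, 1.2), (120, 1.1)]
counts = count_intervals(example, v_low=0.5, v_high=2.0)
```

Sweeping `v_low` and `v_high` over a range of values and repeating the count would give the accumulated count as a function of threshold voltage, which is the kind of display the text describes.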
Play ball
An interesting set of experiments is developing in
the area of physiological monitoring of athletes. Through
the cooperation of the coaches and players of a baseball team, it has been possible for us to introduce strain
gages and other transducers in the players' bats,
switches in the players' shoes, contact assemblies in
the bases and ball-speed monitoring equipment in
the playing field. Small digital sub-systems have been
designed and built to time the player's run to first
base after the crack of a bat, to time the pitch, and
to monitor the position of the pitcher's and batter's
feet. The DADEC system is used to collect and correlate these data and also to sample and digitally represent the bat acceleration during the swing. These
processes are all moderately simple and their implementation is straightforward; they will not be further
described.
In this application, the DADEC system is used for
data accumulation. Information is produced on punched
paper tape for later analysis on large data-processing
machines. For the baseball fans, a typical set of data
Figure 5-Shock measurement system - Front end
Figure 6-Shock measurement system - Logic diagram
several hundred of such shock waves which are generated in bursts at a possible rate of some 6,000 shocks
per minute.
The support programming for this system was also
executed by the students. In this instance, considerable calculation must be applied to the data. It was
felt that the FORTRAN language was an efficient
vehicle for such calculations. The FORTRAN program must communicate with the interface and such
programming problems must be solved by the students.
What's under way
A large number of problems have been suggested
for solution on this DADEC system. A listing of problems which have been accepted and are in various
stages of progress is given below. It should be noted
that these are undergraduate project problems and
as such need not necessarily be new or spectacular in
their implications. The sole requirement is that the
problems have engineering application and will allow
the student to follow a reasonable design procedure
to achieve his goal. The problem areas under study
include signal analysis using exponential basis functions, Lebesgue sampling, speech analysis and generation, automatic x-ray data processing, on-line correlation analysis, physo-acoustic reverberation studies,
graphic displays, and control of psychological experiments.
Spin-off projects
Several projects have developed which are not directly related to the DADEC system but are inspired
by it or find use and application in design with the
system. One example is a Boolean string manipulation program which accepts long strings of Boolean
expressions combined with a variety of operators
(EXCLUSIVE-OR, AND, OR, NOT, STROKES,
etc.).5 The string manipulation program operates
on this set of characters and yields a sum-of-products
type expression for the Boolean function. Boolean
simplification algorithms have also been developed.
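As a rough illustration of what a sum-of-products result looks like, the sketch below enumerates the truth table of a Boolean function and writes out one product term per true row. It is not the student program cited above, whose string-handling method is not described here; the function name `sum_of_products`, the use of a Python callable in place of a character string, and the prime notation for complemented variables are assumptions of this example. The result is the canonical (unsimplified) form.

```python
from itertools import product


def sum_of_products(func, names):
    """Return the canonical sum-of-products expression of a Boolean function.

    func:  a callable taking one 0/1 argument per variable.
    names: the variable names, in the same order as the arguments.
    A trailing apostrophe marks a complemented variable.
    """
    terms = []
    for values in product((0, 1), repeat=len(names)):
        if func(*values):
            literals = [n if v else n + "'" for n, v in zip(names, values)]
            terms.append("".join(literals))
    return " + ".join(terms) if terms else "0"


# EXCLUSIVE-OR and the stroke (NAND) operator on two variables.
xor_form = sum_of_products(lambda a, b: a ^ b, ["A", "B"])
nand_form = sum_of_products(lambda a, b: not (a and b), ["A", "B"])
```

A simplification pass of the kind the text mentions would then merge adjacent terms; that step is not attempted here.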
A family of programs that permit a high degree of
operator-machine interaction have been developed
for the manipulation of flow tables.6 These programs
are useful in flow table manipulations such as the
elimination of superfluous states, or accomplishing
appropriate mergers and they are helpful in solving
the state assignment problem.
CONCLUSIONS
The system has been in use for about thirteen school
months. Our classes are generally small; we graduate
about thirty electrical engineers per year. The list
of problems presented is perhaps a measure of the enthusiasm with which students have accepted this
problem area and DADEC system. The anticipated
problem of problem suggestion is itself no longer a
problem. We are now in the enviable position of being
able to be discriminating in the suggestions which we
allow to go to completion. The students are beginning
to vie for time on the system and in order to qualify
for this time they must present an acceptable technical
proposal outlining their application.
The present status of this DADEC system then is
one in which a number of experiments have been
developed in support of a variety of course efforts.
A tremendous possibility exists for future developments of this sort. That is, the system configuration is
sufficiently versatile so that only lack of the student's
imagination precludes his open-minded approach to
a problem. It thus seems that this modest investment
has sparked considerable interest and motivated the
students to pursue the detail necessary to solve the
problems of our new environment.
REFERENCES
The first six references are to student reports which
are available from the Morris Library of the University of Delaware.
1 J F BENNETT
On-line processing of nanosecond pulses
Dept of Electrical Engineering Univ of Delaware 1968
2 D L CLARK
Analog and digital data recovery from magnetic tape
Dept of Electrical Engineering Univ of Delaware 1968
3 J A BRCICR
A stellar occultation digital data sub-system
Dept of Electrical Engineering Univ of Delaware 1969
4 L T QUICK
Digital processing of analog stellar occultation data
Dept of Electrical Engineering Univ of Delaware 1969
5 L R NICHOLS
Computer manipulation of boolean character strings
Dept of Electrical Engineering Univ of Delaware 1968
6 G D EARLE
Automatic flow table manipulation
Dept of Electrical Engineering Univ of Delaware 1969
7 G A MALEY J EARLE
The logic design of transistor digital computers
Prentice-Hall Inc Englewood Cliffs N J 1963
8 M P MARCUS
Switching circuits for engineers
Prentice-Hall Inc Englewood Cliffs N J 1962
Evaluation of an interactive display
system for teaching numerical analysis
by P. OLIVER and F. P. BROOKS, JR.
University of North Carolina
Chapel Hill, North Carolina
INTRODUCTION
The purpose of this study was to develop, use, and
evaluate an interactive display system for teaching
selected topics in elementary numerical analysis. We
were interested in giving students a thorough intuitive
understanding of the pertinent mathematical functions
and in measuring the learning effects of an on-line
graphical capability.
This system was developed in the spirit of the Culler-Fried on-line system.1 It is similar to it in its emphasis
on the combination of an interactive and a display
capability, and its mathematical orientation; it differs
from it in that it is designed primarily as a teaching
tool rather than for problem solving.
The system developed enables the instructor or
student to enter a variety of mathematical equations
into the computer in a FORTRAN-like format and
obtain graphical displays of these functions. In addition, the user can illustrate a number of elementary
numerical methods, such as Newton's method for
locating roots of equations, the Euler-Heun method for
solving ordinary differential equations, and the use of
interpolating polynomials. The hardware consists of a
display unit with lightpen and function keyboard and
a background computer. The software consists of a
monitor; programs which interpret requests from the
display user; and programs which produce displays.
A quantitative evaluation of the feasibility and usefulness of computer graphic techniques in teaching
elementary numerical analysis raises the following
questions:
1. Does the system developed perform a useful
function?
2. Does it perform this function better than
currently available visual facilities, e.g., slides
or film? Does it help the instructor to prepare
more informative and interesting lectures? Does
it give the instructor more flexibility in the
classroom? Does it encourage the students to
take a more active interest? Does it improve
student retention?
3. Can it be integrated into the teaching process
so as to avoid being a distracting curiosity?
4. What does it cost to teach with such a system,
and how can it be economically feasible?
5. What sort of computer system (software and
hardware) is required?
6. How much manpower, time, and money is
required to develop such a system?
Procedures
A brief non-credit course in elementary numerical
analysis was offered by the Department of Computer
and Information Science in the summer of 1968. The
course was held twice. One group was taught with the
aid of the on-line graphic system; the other was taught
conventionally. The class met for thirteen periods, two
hours nightly. Prerequisites for this course were elementary calculus and a familiarity with ordinary
differential equations.
The topics selected for use in the course and evaluation were
1. Polynomial approximation and interpolation.
2. Iterative methods of solving for the real roots
of algebraic equations.
3. Numerical solutions to ordinary differential
equations.
The system was used by the instructor to show
examples during lectures and by the students in a
laboratory session devoted to the properties of polynomials.
The system had been tested qualitatively by similar
use during its development. We learned at that time
that hands-on time by students was useful in removing
the novelty of the display unit, allowing the students
to concentrate on the material illustrated. It was also
found that presenting a series of illustrations concentrating on a single topic, e.g., iterative methods to
find roots of equations, was an effective way of imparting the key concepts of the material to the students.
Example
The use for lecture illustration can be seen from
an example. The topic roots of equations was
introduced with two specific examples from physics
-a column-buckling problem and a pipe-flow
problem. Each problem required solving for the
real roots of an equation.
Then there was a brief discussion of the techniques available for solving equations, and the
field was narrowed to iterative methods. The
properties common to all iterative methods were
discussed, and the practical questions which face
the problem solver, e.g., rate of convergence and
computational efficiency, were presented.
The first specific method, linear functional
iteration with acceleration, was introduced by
presenting the necessary theorems on the existence
of solutions and convergence.
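A minimal sketch of functional iteration and of Aitken's acceleration may help fix the ideas. It uses the equation $x^3 + 2x^2 + 10x - 20 = 0$ and the two rearrangements $x = 20/(x^2 + 2x + 10)$ and $x = (20 - 2x^2 - x^3)/10$ from the lecture example. The function names, the starting value of 1.0, and the iteration counts are assumptions chosen for illustration; the display system itself was written in S/360 assembly language, and this Python fragment is not a transcription of it. Acceleration is applied here in the restarted form, in which each accelerated value becomes the next starting point.

```python
def f(x):
    """The example equation, written as f(x) = 0."""
    return x ** 3 + 2.0 * x * x + 10.0 * x - 20.0


def g_a(x):
    """First rearrangement: x = 20 / (x^2 + 2x + 10)."""
    return 20.0 / (x * x + 2.0 * x + 10.0)


def g_b(x):
    """Second rearrangement: x = (20 - 2x^2 - x^3) / 10."""
    return (20.0 - 2.0 * x * x - x ** 3) / 10.0


def fixed_point(g, x0, steps):
    """Plain functional iteration: x <- g(x), repeated `steps` times."""
    x = x0
    for _ in range(steps):
        x = g(x)
    return x


def aitken_step(g, x0):
    """One Aitken delta-squared extrapolation from x0, g(x0), g(g(x0))."""
    x1 = g(x0)
    x2 = g(x1)
    denom = x2 - 2.0 * x1 + x0
    if denom == 0.0:          # already converged to machine precision
        return x2
    return x0 - (x1 - x0) ** 2 / denom


def accelerated(g, x0, steps):
    """Restart the iteration from each accelerated value."""
    x = x0
    for _ in range(steps):
        x = aitken_step(g, x)
    return x


plain_a = fixed_point(g_a, 1.0, 60)    # converges slowly but surely
fast_a = accelerated(g_a, 1.0, 6)      # same root in far fewer evaluations
fast_b = accelerated(g_b, 1.0, 10)     # g_b alone does not settle on the root
```

Near the root the slope of the first rearrangement is less than one in magnitude and that of the second is greater than one, so only the first converges under plain iteration. The accelerated scheme reaches the root from either rearrangement in this example.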
This was followed by a series of illustrative
examples. These consisted of polynomial and
non-polynomial equations. The iterative method
was applied to each and the regions and rates
of convergence were discussed for each case. In
applying functional iteration to the equation
$x^3 + 2x^2 + 10x - 20 = 0$,
for example, the several ways in which the iterative
scheme could be set up (e.g., $x = 20/(x^2 + 2x + 10)$, or $x = (20 - 2x^2 - x^3)/10$)
and the effects on convergence were illustrated by
actually displaying each of the cases.
The Aitken acceleration scheme was then applied to each of the cases previously illustrated,
and its effects on non-converging as well as converging sequences of iterates were explored.
Finally, a brief review of the techniques discussed
and the key concepts discovered through the
illustrative examples was given by the instructor.
This cycle of introduction, presentation of theory,
illustrative examples, and review was followed in each
of the classroom lectures.
Besides the lectures, each group was given a laboratory exercise designed to lead the student to the important properties of polynomials. The test group
worked the exercises using the interactive display
system. The students themselves operated the display
device after receiving instructions on its use. The
control group worked the exercises using the blackboard
as a graphic device.
An examination was given on each of the three topics,
as well as a final comprehensive examination covering
these three topics. Each group was given a one-hour
examination (the pre-examination) during the first day
of class. This examination tested mathematical maturity and previous knowledge of numerical analysis.
Circumstances did not permit a random assignment
of students to groups. Students attended the session
of their choice.
The course was open to anyone possessing the necessary prerequisites. Each group was composed largely
of advanced graduate students with backgrounds in
statistics, mathematics, and physics, and no previous
experience in numerical analysis. In each group there
was one non-student. These two non-students had
college backgrounds (mathematics and physics) similar
to those of the students, plus professional backgrounds.
The test group was composed of four subjects; the
control group consisted of six. Three additional subjects
were available for measurements on the second topic,
the roots of non-linear equations; two belonged to the
first group, one to the second. These three subjects
were given the same pretest as the others.
Design of the experiment
The experiment performed was of nonrandomized,
control-group, pretest-posttest design.
The two groups of observations were viewed as
independent samples from a population composed of
two normally distributed subpopulations. It was further
assumed that each sample group was drawn from a
distinct subpopulation, and that the subpopulation
variances were the same, and equal to the population
variance.
With these assumptions, the following tests were
performed:2
1. A variance-ratio test for each of the post-examination results to determine the validity of
the assumption of equal variances of the two
groups.
2. A multivariate F-test to determine if the difference in performance of the two groups, taking
the results of all four post-examinations into
consideration, was due to chance or to the
difference in treatments. The mean score of
each group on the pre-examination was taken
as the covariate, and the mean scores on the
four post-examinations were the variables.
3. A t-test on the within-classes regression coefficient to determine if the difference in the
initial ability of the two groups as measured by
the pre-examination scores had a significant
effect on the post-examination results.
4. A univariate F-test for each of the four post-examinations to test the null hypothesis
$m_1 = m_2$
versus its alternative
$m_1 \neq m_2$
where $m_1$ and $m_2$ are the mean scores of the
test and control groups, respectively. The pre-examination mean for each group was used as
a covariate. A significance level of .05 was
chosen prior to performing the experiment.
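The first of these tests, the variance ratio, is simple enough to sketch. The fragment below computes the ratio of the two sample variances and the associated degrees of freedom; the ratio would then be compared with a tabulated F value at the chosen significance level. The scores shown are hypothetical and are not the data from this experiment; only the group sizes (four and six) follow the text. The function name and the convention of placing the larger variance in the numerator are likewise assumptions of the sketch.

```python
from statistics import variance


def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator).

    The larger sample variance is placed in the numerator, so F >= 1.
    """
    var_a = variance(sample_a)
    var_b = variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical scores, for illustration only (NOT the experimental data).
test_scores = [14.0, 16.0, 15.0, 17.0]
control_scores = [10.0, 14.0, 9.0, 13.0, 12.0, 8.0]

f_ratio, df_num, df_den = variance_ratio(test_scores, control_scores)
```

With samples this small the critical value of F is large, which is one reason the tests are described later as weak.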
Instrumentation
Hardware
The IBM 2250 Display Unit, Model 1, was used
for this experiment. This unit is attached to an
IBM System/360 Model 40H (256K bytes) computer via a selector channel.
Images are generated by the 2250 on a cathode
ray tube which has a display area of 12" X 12"
in size, with 1024 by 1024 addressable points.3
The following special features were available on
the unit used for this experiment:
An 8K byte buffer used for image regeneration.
A character generator.
Absolute vector graphics, which allows the
plotting of vectors by specifying only the
coordinates of the end points.
An alphanumeric keyboard for entering characters into the buffer.
A function keyboard consisting of thirty-two
pushbutton keys, an indicator light for each,
and eight overlay code sensing switches.
A lightpen.
Programming system
The graphic programming system used in this
experiment operates under Operating System/360
(MFT, Version 16).
At Initial Program Load time a monitor module
is loaded into a 44K partition reserved specifically
for graphics. This monitor brings the application
program residing in the system linkage library into
the graphic partition and transfers control to it.
The graphic system is composed of seven load
modules totaling approximately 5,500 S/360 assembly language instructions. No more than three
load modules are ever in core at the same time. A
dynamic overlay structure is used, so that at most
35K bytes of memory are used at any one time.
The multiprogramming environment in which the
system operates allows the user to operate while
batch processing and other tasks take place using
other core partitions.
The user has the following functions available
to him:
General Functions:
Grid Display-
The user defines his coordinate system by
providing upper and lower bounds for the x
and y axes, and increments (from the lower
bounds of each axis) at which he desires
vertical and horizontal lines to be displayed.
Polynomial Display-
Polynomials may be displayed by entering
their coefficients or their real roots in the
appropriate data area. Figure 1 displays the
polynomial $x^3 - x$, and shows the grid parameters along the margins of the display.
Point Display-
Up to fifteen points may be displayed by
entering the (x,y) coordinates.
Function Displays-
Functions of one variable may be displayed
by defining them in a PL/I-like format.
Figure 2 is the display of the function tan(x) - x.
Redraw Feature-
All the polynomials in a current display,
plus the most recently entered points and the
most recently displayed non-polynomial function may be redrawn on a new grid.
Erase Feature-
Any single vector or set of points may be
erased from the screen via use of the lightpen.
Numerical Analysis Teaching Function-
The following numerical analysis techniques
may be illustrated:
Polynomial Interpolation
Iterative Methods for Roots of Equations
Linear Iteration
Newton's Method
Secant Method
Method of False Position
Solution of Ordinary Differential Equations
Multipoint Methods
Predictor-corrector Methods
Runge-Kutta Method
Figure 1-Display of the polynomial $x^3 - x$
Figure 2-Display of the function tan(x) - x
Using the display system
The system was designed as a teaching tool, not a
problem-solving device, although it has been used as
such. Ease of use, flexibility, and hardness-i.e., the
capability of continued operation in the presence of
disruptions such as invalid entries by users-were
prime considerations in the system's design.
Ease of use is facilitated by use of the programmed
function keyboard (PFK) as the sole source of "commands" from the user-this is in contrast with using a
command language via the alphanumeric keyboard,
which would require the user to learn the command
syntax as well as more manual effort on his part.
Each command is serviced by a subroutine. This
modularity of program design makes it easy to add,
delete, or modify sections of code. The calling sequence
is uniform for all subroutines.
The steps required to define a problem and illustrate
its solution are designed to parallel those a student
should perform if defining and solving the problem
with pencil and paper.
The following example illustrates this. The use of a
single function key will be considered an "instruction," and will be designated by naming the key.
(Keys are labeled on the PFK overlay.) Setting of
parameters on the designated screen locations will be
indicated by writing the parameter name, followed by
an equal sign, followed by its value. The meta-instruction <initialize> indicates the setting of the
screen dimension. In the example which follows the
coordinates of the lower left-hand corner of the screen
are (-5, -5), those of the upper right-hand corner
(5, 5).
The problem is to illustrate three iterations of
Newton's method to locate the real root of the equation
$x^3 - x - 1 = 0$, using x = 2 as an initial estimate of
the root. DATAPAD1 refers to a program-defined
screen location used for entering parameters and
functions.
Figure 3 gives the program which will generate the
desired display. Figures 4-6 represent the resulting
display after each iteration.
Thus, to illustrate the use of Newton's method to
locate the real root of the polynomial $x^3 - x - 1$ the
user performs the following steps:
1. Define the domain and range of $x^3 - x - 1$ in
which he is interested. This is done via the
alphanumeric keyboard.
2. Use a PFK key to display the desired coordinate
system.
3. Define and display the polynomial, entering its
coefficient with the alphanumeric keyboard, and
using a PFK key to enter this definition into
main core and cause display.
4. In a similar fashion, define and store the initial
estimate of the root.
5. Use a PFK key to illustrate each iteration.
These actions are those the student or the instructor
would ordinarily take in solving or illustrating the
problem, and are taken in the same order.
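The arithmetic behind the three NEWTON key presses can be followed with a short sketch. It applies Newton's method to $x^3 - x - 1 = 0$ from the initial estimate x = 2, using the derivative $3x^2 - 1$. The Python function name and the choice to return the whole list of iterates are assumptions made for illustration; the fragment mirrors the steps listed above and is not the display system's code.

```python
def newton_iterates(f, df, x0, iterations):
    """Return [x0, x1, ..., xn] produced by Newton's method."""
    xs = [x0]
    for _ in range(iterations):
        x = xs[-1]
        xs.append(x - f(x) / df(x))   # tangent-line intersection with the x axis
    return xs


def f(x):
    return x ** 3 - x - 1.0


def df(x):
    return 3.0 * x ** 2 - 1.0


iterates = newton_iterates(f, df, 2.0, 3)
residuals = [abs(f(x)) for x in iterates]
```

The first iterate is 2 - 5/11, since f(2) = 5 and f'(2) = 11. Each later iterate lies closer to the root, which is what the successive displays show geometrically.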
Instructions               Comments
<initialize>
DATAPAD1 =
XP3 - X - 1;               define function, $x^3 - x - 1$
STOREF                     store definition
PLOTF                      interpret definition and plot function
PLACE                      place cursor in DATAPAD1 area
DATAPAD1 =
3*XP2 - 1;                 define derivative, $3x^2 - 1$
STORED                     store derivative definition
PLACE                      place cursor in DATAPAD1 area
DATAPAD1 = 2,              define initial estimate, 2
DATA                       store initial estimate
INIT                       identify stored value as initial estimate
NEWTON                     illustrate first iteration
NEWTON                     illustrate second iteration
NEWTON                     illustrate third iteration
Figure 3-Illustrative program
Illustration of Newton's method for finding the real
root of $x^3 - x - 1 = 0$
Figure 4-Illustration of Newton's method for finding
the real root of $x^3 - x - 1 = 0$, first iteration
Figure 5-Illustration of Newton's method for finding
the real root of $x^3 - x - 1 = 0$, second iteration
As a second example representative of the capabilities of the programming system, we illustrate the
use of Euler's method for solving the differential
equation
$y' = -2xy$
with initial condition
$y = 1$ at $x = 0$.
The domain and range are $0 \le x \le 3$, $-1.5 \le y \le 1.5$. A
step size of .3 will be used. The large step size is chosen
so as to emphasize the properties of the method.
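The Euler computation for this second example can be sketched as follows. The fragment integrates $y' = -2xy$ from $y(0) = 1$ with step size 0.3 over $0 \le x \le 3$ and records the true solution $y = e^{-x^2}$ at the same points for comparison. The function name and the list-of-pairs output are assumptions made for illustration only.

```python
import math


def euler(f, x0, y0, h, n_steps):
    """Euler's method: y_{k+1} = y_k + h * f(x_k, y_k)."""
    points = [(x0, y0)]
    x, y = x0, y0
    for k in range(1, n_steps + 1):
        y = y + h * f(x, y)     # advance along the slope at the left end point
        x = x0 + k * h          # recompute x from k to avoid accumulated drift
        points.append((x, y))
    return points


def slope(x, y):
    return -2.0 * x * y


approx = euler(slope, 0.0, 1.0, 0.3, 10)
exact = [(x, math.exp(-x * x)) for x, _ in approx]
```

With this step size the first Euler step leaves y unchanged at 1, because the slope at x = 0 is zero, and the second gives 0.82 where the true value is about 0.70. The straight-line segments therefore lag the true curve noticeably, which is the behavior the large step is meant to bring out.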
Figure 6-Illustration of Newton's method for finding
the real root of $x^3 - x - 1 = 0$, third iteration
Figure 8-Scatter diagram, interpolation and approximation (posttest score versus pretest score; test group, control group, and group means)
Figure 7 illustrates the approximate solution (the
straight line segments) together with the true solution $y = e^{-x^2}$.
Results of the tests
The three subjects who participated only in the
pre-examination and the roots of equations examination
were not considered in performing the multivariate
F-test, since the test requires that the number of subjects
from a particular group be equal for each of the examinations considered. Their scores were used in all the
other tests.
The variance-ratio test supports the hypothesis of
equal variances for each of the four cases.
The result of the multivariate F-test indicates that
the total differences in performance of the two groups
have only a 5.8 percent probability of being due to
chance. It appears likely, therefore, that the treatment
differences had a significant effect on the performance
differences, taking all four examinations into
consideration.
There was significant correlation between the pretest
and posttest scores for only one of the four cases-the
final examination.
Figure 7-Illustration of Euler's method to approximate
the solution of $y' = -2xy$, $y(0) = 1$ in the range
$0 \le x \le 3$
Figure 9-Scatter diagram, roots of equations
Figure 10-Scatter diagram, differential equations
Figures 8-11 give the scatter diagrams for the four
examinations. The scores on each post-examination are
plotted versus the pre-examination scores. These
diagrams show that the test group average scores
improved steadily from test to test, while the control
group performance fluctuated considerably. The difference in the means for the post-examinations increased
from test to test and was particularly large for the final
examination. This seems to indicate that use of the
graphic on-line system helped on retention, and that
there was greater carry-over of learning from topic to
topic on the part of the test group. The scatter diagrams
also indicate greater correlation between pre- and
post-examination scores for the test group.
The univariate F-tests for each of the post-examinations show that the use of the graphic system made a
significant difference for the roots of equations, differential equations, and final examinations.
The data does not indicate a significant difference in
performance on the approximation and interpolation
examination. One may conclude that there was no
difference, or else that there is insufficient data to
warrant a definite conclusion. The small sample size
makes the test performed very weak. Reference to
power curves shows there would be a probability of .6 of
error if the hypothesis was accepted that the graphic
system made no difference.2 A definite conclusion cannot
be reached from these data on the effects of the system
for the topic of approximation methods.
Figure 11-Scatter diagram, final examination
Validity of the results
The data support the assumptions of normal distributions and equal group variances. The possible effects of
previous knowledge or experience in numerical analysis
were controlled by the use of a pre-examination. Even
so, these effects were small. The t-tests performed on the
within-classes regression coefficients indicate that the
adjustment made for pre-test scores did not affect any
of the raw scores except those of the final examination.
The intelligence of the subjects is the major uncontrolled variable in this experiment. It was not possible
to adjust for intelligence, because scores on a common
measure of intelligence were not available. If the members of the test group were much brighter than those
of the control group, the experimental data could be
explained thusly. Such a difference is doubtful in view
of the similar backgrounds and educational levels of
the two groups, and in view of the pretest scores.
Would these results apply to other groups? We
cannot tell for certain until the experiment has been
repeated for groups of different backgrounds, scholastic
levels, and motivation. There is no a priori reason
to doubt that it can be extended.
In summary, the following conclusions can be made
regarding the quantitative results of the experiment:
1. There is evidence to support the thesis that the
graphic on-line system provides a useful and
efficient aid in teaching numerical methods in
roots of equations and differential equations.
This effect is sufficient to be demonstrated even
though weak tests were used.
2. The graphic on-line capability has a positive
effect on retention.
3. Further experimentation with an improved
system and a larger sample must be made in
order to reach conclusive results for the topic
of approximation.
Qualitative observations
Besides the numerical data, a number of observations
can be made regarding the use of the graphic system as
a result of the course conducted.
1. Preparation time on the part of the instructor
averaged about four hours per class hour, considerably longer than is generally required.
2. Up to twenty-five percent more time is required
to present an equivalent amount of material
using the graphic system than when not using
it. This time is used in setting up illustrative
displays.
3. This set-up time is distracting to the student.
Intermittent use of the graphic device during
a class session is especially distracting. A good
procedure is to introduce the material briefly;
present the necessary theorems; give a series of
examples illustrating the methods and algorithms; terminate the session with a brief summary of the material.
4. The amount of information displayed is important: each display should illustrate a single
principle rather than several.
5. The ability to regenerate an entire display on
a changed grid size proved very useful. The
instructor can illustrate a particular problem
in the large, and then enlarge a particular part
to fill the entire screen.
6. A system will fail at times. The instructor must
be ready to continue the illustration in progress
at the blackboard. He must be thoroughly
familiar with the problems he is presenting.
7. Whenever possible the instructor should encourage the students to discover the point of a
display.
8. Hands-on time on the part of the students is
very useful. One problem of the final examination consisted of determining the parameters
a, b, and c in the polynomial form $a(x + b)^2 + c$
so that the resulting polynomial would pass
through three given points.
The test group handled this with ease, and each
individual was able to find the correct values
and explain the steps taken to arrive at them.
Most of the control group subjects were not
successful, and those that were were not systematic in their approach. The purpose of this
exercise was not simply to find the coefficients.
Rather, it was to illustrate the effects of varying
the three parameters on the behavior of the
polynomial.
9. Class participation was much greater in the
test group. The students in this group were
eager to pursue topics which were not directly
covered in the lectures. During the lecture on
iterative methods for finding roots of equations
the students in the test group discovered the
effects of applying acceleration to diverging
sequences of iterates, and did so by their own
initiative. The test group also worked the examination questions much faster than the control group, usually starting by drawing a picture.
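The final-examination problem mentioned in item 8 has a direct computational form, sketched below for readers who wish to check a hand solution. Expanding $a(x + b)^2 + c$ gives $ax^2 + 2abx + (ab^2 + c)$, so a follows from the second divided difference of the three points, b from the coefficient of x, and c from any one of the points. The function name and the sample points are assumptions of this example; the examination's actual points are not given in the text.

```python
def vertex_form_through(p1, p2, p3):
    """Find (a, b, c) so that a*(x + b)**2 + c passes through three points.

    The three x values must be distinct and the points must not be collinear.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d12 = (y2 - y1) / (x2 - x1)          # first divided differences
    d23 = (y3 - y2) / (x3 - x2)
    a = (d23 - d12) / (x3 - x1)          # second divided difference
    if a == 0:
        raise ValueError("points are collinear; no parabola of this form")
    linear = d12 - a * (x1 + x2)         # coefficient of x, equal to 2*a*b
    b = linear / (2.0 * a)
    c = y1 - a * (x1 + b) ** 2
    return a, b, c


# Hypothetical points taken from 2*(x - 1)**2 + 3.
a, b, c = vertex_form_through((0.0, 5.0), (1.0, 3.0), (3.0, 11.0))
```

Here a sets the width and direction of opening, b shifts the curve horizontally, and c shifts it vertically, which are the three effects the exercise was meant to bring out.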
Findings and conclusions
Experience to date gives tentative answers to the
questions initially posed:
1. The results indicate that the interactive display
system is a valuable and powerful aid in teaching
selected topics in numerical analysis.
2. The system performs this function better than
visual facilities generally used. The graphic and
the interactive capabilities enable the instructor
to develop a large number of significant examples to illustrate his classroom lectures and
to make them more interesting. The interactive
capability provides a flexibility not available
through slides or filmstrips. Complete response to
student questions stimulates student inquisitiveness. Student retention is improved, and there
is a greater carry-over of learning from topic
to topic.
3. The system can be effectively integrated into
the teaching process, but delay time (the time
necessary to generate new displays) and reliability are problems which require an unusual
level of instructor preparation.
4. The cost of teaching with such a system is not
high except for the cost of the display unit.
Running the system requires very little processing time. Preparing class problems requires
about five minutes of Model 40 CPU time per
display hour. Classroom presentation averaged
about two minutes of CPU time per display hour.
The display unit is costly, but this application
could use a simpler and cheaper display device.
Both cost and reliability can be improved by
using this system to prepare slides for classroom use, but extemporaneity and flexibility
will be sacrificed.
5. In determining the hardware and software
capability required for such an interactive display system, a number of items must be considered. A 12" X 12" screen size is about average
for display units with vector capability. A smaller screen size could be tolerated for individual
use, but not for classroom use. The alphanumeric keyboard is essential for entering data
into the system, but the function keyboard
could be eliminated. One could use the standard
alternative of a menu of lightpen buttons displayed on the screen. One could not readily
substitute the alphanumeric keyboard for
function buttons without seriously impairing ease
of use. The 8K buffer used in this experiment
could be reduced to 4K without impairing
system efficiency.
A graphic programming support such as the
IBM Basic Programming Services is useful but
not vital. The applications facilities required
would depend on the use to be made of the
system. Those used in this investigation were
minimal though adequate for teaching the
selected topics in numerical analysis.
6. Development of the system described here
required about 1200 man-hours, with one
individual devoted to this task over a one-year
period. Development also required about 163
hours of S/360 Model 40 time.
The results of this experiment indicate that use of
an interactive display system can significantly increase
the active role of the learner and improve student insight and understanding of elementary topics in numerical analysis.
This is a pilot study. It demonstrates the usefulness
of such a system only for one group of students with
one particular subject-matter. To generalize, one would
have to replicate this experiment with other groups of
students.
The study is, however, as useful for what it suggests
as for what it proves. It suggests specific techniques
for using such a system. It suggests that we measure
the separate effect of student hands-on time. A controlled experiment should be run in which students
use the graphic system to work a given set of problems,
studying a set of notes presenting the necessary background material. This treatment would not involve an
instructor except as a monitor.
Finally, the study suggests the desirable characteristics of follow-on systems and ways of making them
more economical.
REFERENCES
1 B D FRIED
Solving mathematical problems
McGraw-Hill Book Co Inc NY 1967 In On-line
Computing edited by W J Karplus
2 B J WINER
Statistical principles in experimental design
McGraw-Hill Book Co Inc NY 1962
3 IBM System/360 component description, IBM 2250 display
unit model 1
IBM Corp Form A27-2701 1969
4 J C R LICKLIDER W E CLARK
On-line man-computer communication
Proc SJCC Vol 21 1962 113-128
Computer based instruction in computer
programming: A symbol manipulation-list processing approach
by P. LORTON, JR. and J. SLIMICK
Institute for Mathematical Studies in the Social Sciences
Stanford, California
INTRODUCTION
Since February, 1969, a computer based course in
computer programming has been running at an "inner
city" high school in San Francisco, California. Each
day ninety high school juniors and seniors in classes
of fifteen interact with a course designed to teach the
fundamentals of computer programming for business
applications. For fifty minutes a day each student is
on-line with a computer located thirty miles away on
the Stanford University campus. The purpose of this
paper is to describe the rationale and the major components of the software system used to implement the
project.
Lesson material and programming problems for
the students are presented on teletypewriters linked
via telephone lines to the Computer Based Laboratory
of the Institute of Mathematical Studies in the Social
Sciences on the Stanford University campus. In this
laboratory are several computers which form a unique
system for presenting instructional material.
The main computer in the system for this project is
a Digital Equipment Corporation model PDP-1D.
The PDP-1D is a single address, 18 bit binary machine.
The machine has 32,768 words of core memory of
which 20,480 words are used by the time-sharing
operating system. User programs are permitted up to
12,288 words of core. The time-sharing system allows
up to 26 users to run concurrently on the computer.
This is made possible by the addition to the PDP-1D
of a very high speed drum with 26 tracks, each capable
of holding 4096 words. The time-sharing system swaps
programs in and out of core memory very rapidly using
a simple priority scheme based on "time-slicing."
Because of the necessity for user micro time-sharing,
the programs in this project occupy 10 of the 26
available tracks.
The PDP-1 communicates with the students at the
high school through a smaller computer (DEC PDP-8)
used to buffer text output. A PDP-8I has been installed
at the school to perform a similar function at the other
end of the line. Collins data sets were used in place of
the PDP-8I during the first year.
Aim and purpose of the course
The main goal of this course is to present in very
general terms the concept of a digital computer as a
tool for solving business-related problems. As computers
proliferate in business and industry there will be an
increased demand for people who can see their jobs in
terms amenable to computerized operation. Such tasks
as filing and stockroom control, now available to
minimally trained individuals, will soon require personnel able to see and solve problems in terms understandable to a computer.
With the goal of training for applications on these
kinds of problems, the need for something other than
a "formula translation" approach is evident. Using
filing and stock control as sample problem areas, an
approach which stresses symbol-manipulation and
list-processing suggests itself. Inventories can easily
be viewed as ordered pairs (a symbol-manipulation
concept) of item names and counts. Retrieving information from a file can be thought of as a "tree search"
(a list-processing concept).
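The two views named above can be made concrete with a short Python sketch. It is an illustration only, not course material from the project: the item names and counts are invented, and the helper names build_tree and tree_search are hypothetical. The inventory is a list of ordered pairs, and retrieval walks a binary search tree built from those pairs.

```python
# Inventory as ordered pairs of (item name, count); the entries are invented.
inventory = [("bolts", 120), ("gaskets", 14), ("nuts", 95), ("washers", 300)]

def build_tree(pairs):
    """Build a balanced binary search tree (nested tuples) from pairs sorted by name."""
    if not pairs:
        return None
    mid = len(pairs) // 2
    return (pairs[mid], build_tree(pairs[:mid]), build_tree(pairs[mid + 1:]))

def tree_search(tree, name):
    """Return the count filed under name, or None if the item is absent."""
    while tree is not None:
        (key, count), left, right = tree
        if name == key:
            return count
        # Descend left for names that sort earlier, right otherwise.
        tree = left if name < key else right
    return None

tree = build_tree(sorted(inventory))
```

Here tree_search(tree, "nuts") returns 95, and a name that is not on file returns None after at most a few comparisons.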
The advantages of teaching a symbol manipulation-list processing (abbreviated: SMLP) language are
best shown in an analysis of the properties of SMLP
languages.
A. SMLP languages operate primarily on symbols
and sets of symbols and, secondarily, on quantities. This implies that problems as conceptually complex as text scanning become more
manageable. Once text scanning becomes manageable, then many applications such as natural
language-based information retrieval or dialogue
systems for management information collapse
into programmable problems. The power of an
approach which emphasizes symbol manipulation is that conceptually difficult problems
often become readily programmable.
B. The list structure in SMLP languages provides
an absolutely general form of data and program
storage. A programmer, given a universal data
storage facility, can give some attention to optimization of the structure of his data. The
optimization of data structure cannot be overemphasized
since information retrieval (among
other applications) is not economically possible
without structuring the data so that the computer answers efficiently the most frequently asked
questions.
C. SMLP languages teach the use of pointers and
indices. While properly part of (B), the simplest
definition of a pointer is that it is a quantity that
specifies the location or existence of some other
quantity; an index can be defined as a quantity
specifying some base location. The concepts of
pointer and index are useful in teaching the
manipulation of data by using references rather
than moving blocks of data from one place to
another. An immediate example of an application
of pointers is data sorting.
D. SMLP languages allow simple implementation
of push-down stacks. While not of great intrinsic value, push-down stacks simplify the
calling and structure of subroutines, particularly
recursive ones.
E. SMLP languages simplify the treatment of name
scope problems in a hierarchical store. A fundamental concept of symbolic programming is
that a quantity can have a name; furthermore,
it may be desirable to limit the area of the program in which a given name refers to a particular quantity. Thus, it is desirable to have a
method of associating a given name to the relevant quantity on the basis of "area"; this association is referred to as "name-scope."
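Property C mentions data sorting as an immediate application of pointers. A minimal Python sketch of that idea follows; it is not taken from the course, the records are invented, and the name sort_by_pointer is hypothetical. The records stay where they are, and only a list of indices (standing in for pointers) is put in order.

```python
# Records stay in place; only the list of indices ("pointers") is ordered.
records = [("washers", 300), ("bolts", 120), ("nuts", 95)]

def sort_by_pointer(records, key):
    """Return indices into records in sorted order without moving any record."""
    return sorted(range(len(records)), key=lambda i: key(records[i]))

by_name = sort_by_pointer(records, key=lambda r: r[0])
by_count = sort_by_pointer(records, key=lambda r: r[1])

# Following the pointers reads the records in the chosen order.
names_in_order = [records[i][0] for i in by_name]
```

The same block of data can carry several orderings at once (here by name and by count), which is the saving over moving blocks of data from one place to another.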
In general, languages possessing properties A-E provide exceptionally general approaches to programming
digital computers. It can also be pointed out that the
COmmon Business Oriented Language (COBOL) resembles this kind of language more than it resembles a
"formula translation" language. The general concepts
available through an SMLP language would, it is believed, be of considerable help to the students in their
future efforts to build an understanding of COBOL and
related languages.
Basic concepts
Good computer programming, under the philosophy
advanced here, depends on the understanding of certain concepts not particularly oriented toward any
one machine or language. The basic concepts which
seem necessary for understanding the kind of applications programming taught in this project seem to
divide into concepts which are related to making a
stored program machine work for the user and concepts
which are related to what is felt to be the basic task
of business applications programming: symbol manipulation-list processing. It is these concepts which form
the basic content for this course.
The first nine general concepts in the following list
are of the first type. The tasks described are all associated with the how and why of making stored program machines do the work required of them.
I. "Machine" related concepts:
A. Stored Program. Refers to the ability to have
a set of imperative actions implying some overall task stored in a machine which can execute
it in some sequential fashion.
B. Stored Data. Refers to the ability of a machine to store quantities like "stored program"
actions but not encompassing an overall meaning.
C. Variable. Refers to the ability to name some
part of the stored program and refer to the
properties or value of this part through reference to its name.
D. Operations. Refers to the capabilities contained
in the Central Processing Unit. Two main classes
of operations are felt important: Arithmetic and
Non-Arithmetic.
E. Addressing. Refers to the capability of pointing
to various parts of the stored program as well
as the ability to form data into clusters or arrays in some useful way. Three sub-concepts
are felt noteworthy: Indexing, Base addressing,
and Indirect addressing.
F. Branching. Refers to the ability of a stored
program to reorder the sequence of events it
performs in completing a task.
G. Loops. Refers to the ability to re-execute a
subsequence of the stored program to complete
a repetitive task.
H. Blocks/Sub-Programs/Procedures. Refers with
minor differences in emphasis to sub-groupings
of the stored task which form semi-self contained programs often capable of being introduced into the main event sequence by being
"called."
I. Input-Output. Refers to the machine's methods
for listening and talking to the user.
The following concepts are more directly related to
the symbol manipulation-list processing approach to
the problem space than they are to the problem of
making a machine work. This does not mean that the
concepts listed above are unrelated to issues associated
with the nature of the problem space. Neither does it
mean that a symbol manipulation-list processing language is unsuited to presenting them.
II. "Language" related concepts:
A. Data Handling. Refers to the method of viewing
and manipulating the data a program is to handle.
B. Recursion. Refers to a "self calling" ability of
sub-blocks of the program in an SMLP type
language.
C. Arrays and Strings. Refers to a more general
and efficient way of clustering stored data so
that its manipulation becomes a simpler task.
D. Data Structures. Refers to named functions
which use indexing and pointers to locate elements in the stored data. Examples might be
"trees," "lists", "graphs", etc.
Languages selected for the project
Given the conclusions on the advantages of teaching
a "symbol manipulation-list processing" language and
the fact that some machine level concepts might usefully be introduced into the course, a language appropriate to each conclusion was selected: a simple assembly language and a fundamental SMLP language.
Each of these languages is briefly described below.
Major components of the project
The implementation of the conclusions reached in
the preceding discussion involved developing three
separate programs which, when loaded into the PDP-1D, operate as the software system for this project.
The three programs include a "driver" (SLAKER) to
supervise the interaction of the student with the curriculum material and the language processors, an
interactive assembly language processor (SIMPER),
and an interpretive SMLP language processor (SLOGO). Each of these parts of the software package is
described below. Appendix A contains a sample lesson
illustrating many of the components described below.
Major component: SLAKER
Introduction
SLAKER [Slimick-Lorton All Knowing Educator
Routine] is designed to provide the interface between the student at a teletypewriter and the curriculum
material of the project. The over-riding concern in
the development of this driver was to provide as much
freedom and flexibility for each user as is consistent
with service at reasonable intervals.
If a student's program would cause a real machine
to enter an infinite loop or write over his data, then
this would happen to him in the instructional setting.
Certain obvious restrictions have been placed on this
goal. A student's work is not free to "clobber" other
users (although this might well happen on a "real"
machine). A student can wipe out his own effort and
experience the pain of having to recover from the error.
Functions
The balance of the description of SLAKER is devoted to the major functions it is designed to perform.
1. Text Emission
One of the major tasks SLAKER has is the presentation of problems to the student at his teletypewriter.
Several of the disc files attached to the driving program
contain the curriculum material which is organized
into four sequences of lessons and problems through
which the student is to proceed. In addition to the
lesson-text, the problem code contains certain values
which indicate various subsections of the problem such
as the "correct answer" or the "hint," as well as the
problem type to the driver.
The four strands into which lessons and problems are
grouped for this project are: Lesson, Homework, Extra
Credit, and Test. For problems in the Lesson strand,
SLAKER is charged with waiting until the student
enters the correct answer before going on to the next
problem. With the other three strands, SLAKER
presents the next problem as soon as any answer is
entered. In every case the student is informed of the
correctness of his answer.
2. Response Evaluation
After emitting the text for a problem to the user,
SLAKER monitors his output, collecting it as an
answer. When the user enters an "evaluate my work"
request, SLAKER checks his answer according to the
type of problem the student was given.
A. Multiple Choice
Under this format the answer is first compressed
so that all duplicate characters are eliminated.
Then the answer is searched for matches with
the characters recorded as the correct answer.
Up to twenty characters are collected from the
student as possible answers for problems stated
in this format. Only alphabetic characters are
collected so that spaces, punctuation marks,
or numbers can be inserted in the answers without affecting the correctness of the alphabetic
string.
B. Constructed Response
When the student's input is a response to
this type of problem, all the characters he types,
with the exception of carriage returns and line
feeds, are collected. The checking routine then
examines the response string looking for two
kinds of characters: those that must be present
and those designated as optional. The search
and match routine is of such generality that it
is felt all possible correct answers will be marked
correct if they are defined in the curriculum.
C. Anticipated Alternative
Although not a separate type of problem,
this checking capacity is a separate skill of
SLAKER. If alternative answers are expected
they can be specified and checked for. If a correct response is not found, then the answer
evaluation routine checks the student's effort,
in the same fashion, against the strings specified as possible alternatives. If a match is found,
then an appropriate comment is given and the
student is told to try again, just as if he were
wrong. At present this capability is available
on constructed response problems and single
choice-multiple choice problems.
D. Programming Problems
Evaluation of these problems is done by
asking the student questions about his program
after he has written and debugged it with given data.
This method of evaluation allows the student
flexibility in programming a different solution
than the solution the curriculum writers had
in mind.
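The multiple choice check (A) and the anticipated alternative check (C) can be sketched in Python as follows. This is a reconstruction under stated assumptions, not SLAKER's routine: the paper does not give the exact matching rule, so the sketch assumes an answer is correct when its set of letters equals the recorded set, and it assumes answers are folded to upper case. The names collect and check_answer are hypothetical.

```python
def collect(raw, limit=20):
    """Collect up to `limit` alphabetic characters, then drop duplicates."""
    # Assumption: answers are folded to upper case before comparison.
    letters = [ch for ch in raw.upper() if ch.isalpha()][:limit]
    return "".join(dict.fromkeys(letters))

def check_answer(raw, correct, alternatives=None):
    """Return (verdict, comment) for a typed answer.

    Assumption: an answer matches when its set of letters equals the
    recorded set; the paper does not spell out the rule.
    """
    answer = set(collect(raw))
    if answer == set(correct):
        return ("CORRECT", None)
    # Anticipated alternatives earn a comment, then count as a wrong try.
    for letters, comment in (alternatives or {}).items():
        if answer == set(letters):
            return ("TRY AGAIN", comment)
    return ("WRONG", None)
```

Because only letters are collected, check_answer("C-A", "AC") and check_answer("a, c", "AC") are both marked correct, which matches the statement that spaces, punctuation marks, or numbers do not affect the answer.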
3. Communication with Language Interpreters
Since the main aim of the course is to provide rich
and varied experience in programming, a main responsibility of SLAKER is to provide this contact readily. Each language differs slightly in how it wants to
be told a student is using it but, basically, SLAKER's
role is to make the initial contact with the language
processor, pass subsequent information to it and await
the user's indicated wish to return to the main program.
4. Special requests from the Student Station
The following activities can be requested from a
student station. As a group they provide the student
with considerable flexibility in how he proceeds through
the course.
A. Restart Station. Allows a user to request a
station be restarted from the sign-on point.
Used to correct improper sign-on efforts by
students.
B. Sign-off Station. Allows a user to terminate his
lesson when he is ready. Part of the execution
of this command involves storing where the
student left off on his history file so that he may
restart from this point on the following day.
C. Go to Choice Point. Places the user at a point
where one of the following choices can be made:
1. Return to Last Problem. Allows the student to continue working from where he
last signed off in the strand he specifies.
2. Go to Specific Lesson. Allows the student
to begin working on the lesson number in
the strand he indicates.
3. Attach a Language Processor. Allows the
student to call forth one of the language
processors available in the course.
D. Skip Problem. In the Lesson strand, only a
correct answer will advance a student on to the
next problem. This feature allows a student
to skip out of this loop. As the next problem is
called, the correct answer to the skipped problem is printed.
E. Give Hint. Commands SLAKER to print the
"hint" provided for the particular problem.
F. Erase Answer. The user has the option of
erasing all of the answers he has typed or merely
the last character. Erasing the last character
can be repeated until the entire answer is erased
if wished.
G. Communicate with Stanford Monitor. This
feature allows student stations to type messages to the monitor teletypewriter at Stanford.
Usually, its use is reserved for the classroom
teachers who may want to correct a lesson, enter
a new student, or ask a question. As part of
this feature it is also possible to communicate
from the monitor teletypewriter to any of the
student stations.
Major component: SIMPER
Introduction
SIMPER [Simple Instructional Machine for the
Purpose of Educational Research] represents an attempt to make available to the student at a teletypewriter a simple computer which he can program in a
manner analogous to "assembly language programming" on digital computers of modest size.
This instructional package can be most easily understood when viewed as consisting of two main parts:
a machine (SIMPER) and an assembler (SASS).
The latter is designed to generate the machine code
for SIMPER. The "machine" is a mythical digital
computer which can be described in a formal way and
for which programs can be written. Although the machine responds to 18 bit instructions in its "machine
language," there is no direct access to the machine via
18 bit numbers. The purpose of the machine is to teach
students to program so the machine is programmable
only through a symbolic assembly language.
The assembler generates code for SIMPER from
Assembly Language instructions typed by the student.
Assemblers generate code instruction by instruction.
This one generates code for SIMPER immediately
after each instruction is typed in by the student. This
feature enables the student to receive immediate correction for most syntax errors and, when the student
avails himself of the option, each line of code can be
checked immediately to assure the student that the
assembler translated the student's instruction as he
wished.
The current version of SIMPER is designed to time
share up to 15 students concurrently. The interpreter
occupies 4096 words of PDP-1D core memory while
the arrays representing the simulated machines for all
15 possible users occupy an additional 4096 words of
memory.
Description of the SIMPER machine
SIMPER is a fixed-point, single address machine
with a memory of variable size (currently 128 words).
Operations are performed in two general purpose
registers. Instructions are six digits in length: two digit
operation code, one digit register specification field,
and a three digit address field. At present, 16 operations
can be performed.
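The instruction layout described above can be sketched as a pair of Python helpers. This is an illustration, not SIMPER's own code: it assumes the three fields appear in the word in the order the text lists them (operation code, register, address), and the names Instruction, decode, and encode are hypothetical.

```python
from typing import NamedTuple

class Instruction(NamedTuple):
    opcode: str    # two digits
    register: str  # one digit
    address: str   # three digits

def decode(word: str) -> Instruction:
    """Split a six-digit instruction word into its three fields."""
    if len(word) != 6 or not (word.isascii() and word.isdigit()):
        raise ValueError("a SIMPER instruction is exactly six digits")
    # Assumed field order: operation code, register, address.
    return Instruction(word[:2], word[2], word[3:])

def encode(ins: Instruction) -> str:
    """Rebuild the six-digit word from its fields."""
    return ins.opcode + ins.register + ins.address
```

The fields are kept as digit strings so that the sketch makes no claim about the radix in which the digits are read.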
The size of the machine's memory is variable depending on the available space. For this project the
memory size is 128 decimal (200 octal) locations. This
size was chosen because it allows the fifteen students
to run parallel in the space available on the PDP-1D
and it also means the students' daily programming effort can be "saved" on a disk scratch file of convenient
length, enabling the student to continue programming
efforts from session to session.
Operation of the SIMPER machine
SIMPER runs by executing the six digit number it
finds in the memory location pointed to by the program counter. The program counter is updated as part
of the instruction-fetching activity. An instruction by
instruction-execution of a program is printed on the
Teletype. By monitoring the execution of his program in this way, a student is, it is hoped, given
special insight into how each instruction operates and
how a sequence of instructions can be converted into
meaningful work. This "printing out" of the execution
sequence also slows down the speed of execution so
that the work of the machine is easily followed. The
student can also watch the effects of "bugs" arise and
develop into problems which require attention. This
feature is intended to make the debugging of machine
language programs an easier task. A special flag can
be set at execution time to suspend this feature. Execution speed is then improved by a factor of four.
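The fetch, execute, and trace cycle described above can be sketched in Python. This is a toy under loud assumptions, not the SIMPER interpreter: the four operation codes (00 HALT, 01 LOAD, 02 ADD, 03 STORE), the decimal reading of the six digits, the register labels A and B, and the trace format are all invented for illustration. The paper states only that sixteen operations exist and that each executed instruction is printed.

```python
def run(memory, trace=True, max_steps=1000):
    """Execute from location 0 until HALT; return (registers, trace lines)."""
    regs, pc, log = [0, 0], 0, []
    for _ in range(max_steps):
        word = memory[pc]
        pc += 1  # the counter is updated as part of the fetch
        op, reg, addr = word // 10000, (word // 1000) % 10, word % 1000
        if op == 0:        # HALT (hypothetical code)
            break
        elif op == 1:      # LOAD register from memory (hypothetical code)
            regs[reg] = memory[addr]
        elif op == 2:      # ADD memory to register (hypothetical code)
            regs[reg] += memory[addr]
        elif op == 3:      # STORE register into memory (hypothetical code)
            memory[addr] = regs[reg]
        else:
            raise ValueError(f"unknown operation code {op:02d}")
        if trace:          # setting trace=False mimics the flag that suspends printing
            log.append(f"{pc - 1:03d} {word:06d} A={regs[0]} B={regs[1]}")
    return regs, log

memory = [0] * 128
memory[0:4] = [10010, 20011, 30012, 0]  # LOAD A,10; ADD A,11; STORE A,12; HALT
memory[10], memory[11] = 7, 5
registers, log = run(memory)
```

Each trace line shows the location, the instruction word, and both registers after the instruction has run, so a student could see 7 and 5 being combined into 12 one step at a time. The max_steps limit is a convenience of the sketch only; the paper says a student's infinite loop is allowed to happen.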
The assembler
Description
The assembler receives its instructions from a student through a teletypewriter keyboard. Each student
interacting with the program is listened to for characters
which are collected as an instruction to be assembled.
Students are served by the assembler in a manner
which both time shares and "oils the squeaky wheel
first."
When the student is given a problem involving assembly language programming, he is told to sign on to
SIMPER. He calls the choice point option and, in
response to "Where to? →", types "SIMPER." The
student is then in contact with the assembler. He is
informed that he may now write his program and
columns labeled "LOC" and "INSTRUCTION"
are created. In the LOC column the assembler prints
the number of the memory location into which the
instruction being written will be assembled. The assembler then awaits an instruction from the student.
The student types his instruction and an indicator
that he is finished. The assembler immediately examines
the text string and attempts to generate SIMPER
executable code. If all is in order, programming advances to the next memory location. If all is not in
order, the assembler generates an appropriate error
message. By assembling in real-time after each instruction is entered, the assembler can give immediate
feedback on syntax errors to the student.
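The line-at-a-time behavior of the assembler can be sketched in Python. The mnemonics, register names, operand order, and error messages below are hypothetical (the paper does not list SASS syntax); only the control flow follows the text: each line is translated as soon as it is entered, and the location advances only when the line assembles.

```python
OPCODES = {"HALT": 0, "LOAD": 1, "ADD": 2, "STORE": 3}  # hypothetical mnemonics
REGISTERS = {"A": 0, "B": 1}                            # hypothetical register names

def assemble_line(text, memory_size=128):
    """Translate one typed line into a six-digit word, or raise ValueError."""
    parts = text.upper().split()
    if not parts or parts[0] not in OPCODES:
        raise ValueError("UNKNOWN OPERATION")
    if parts == ["HALT"]:
        return "000000"
    if len(parts) != 3:
        raise ValueError("EXPECTED: OPERATION REGISTER ADDRESS")
    op, reg, addr = parts
    if reg not in REGISTERS:
        raise ValueError("UNKNOWN REGISTER")
    if not addr.isdigit() or int(addr) >= memory_size:
        raise ValueError("BAD ADDRESS")
    return f"{OPCODES[op]:02d}{REGISTERS[reg]}{int(addr):03d}"

def session(lines):
    """Assemble lines one at a time; the location advances only on success."""
    program, transcript = [], []
    for text in lines:
        loc = len(program)
        try:
            program.append(assemble_line(text))
            transcript.append(f"{loc:03d}  {text}")
        except ValueError as err:
            # Immediate feedback: the same location is offered again.
            transcript.append(f"{loc:03d}  {text}   ** {err}")
    return program, transcript

program, transcript = session(["LOAD A 10", "ADDD A 11", "ADD A 11", "HALT"])
```

In this run the mistyped second line is rejected at location 001, and the corrected line is then accepted at the same location, which is the kind of immediate correction the text describes.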
Major component: SLOGO
SLOGO (Stanford LOGO) is the I.M.S.S.S. implementation of LOGO, a computer language developed
by Wallace Feurzeig and Seymour Papert of Bolt,
Beranek, and Newman expressly for teaching the
principles of computer programming. SLOGO is
similar to LISP 1.5 in that both are left prefix languages, both have a simple type of function definition,
and both have similar sets of primitive operations.
SLOGO functions, unlike LISP, have predefined
numbers of arguments which, along with the left prefix notation, allow SLOGO to require minimal user
punctuation.
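Why fixed argument counts remove the need for punctuation can be seen in a small Python sketch of a left prefix reader. It is not the SLOGO interpreter: the function table is hypothetical (DIFF and TIMES appear in the sample lesson of Appendix A, SUM and NEG are invented), only integer arguments are handled, and the connective words such as OF and AND that appear in actual SLOGO lines are left out.

```python
import operator

# Hypothetical function table: name -> (number of arguments, implementation).
FUNCTIONS = {
    "SUM": (2, operator.add),
    "DIFF": (2, operator.sub),
    "TIMES": (2, operator.mul),
    "NEG": (1, operator.neg),
}

def parse(tokens):
    """Consume one expression from the front of tokens; return a nested tuple."""
    head = tokens.pop(0)
    if head in FUNCTIONS:
        arity, _ = FUNCTIONS[head]
        # The known argument count tells the reader where the expression ends.
        return (head, *[parse(tokens) for _ in range(arity)])
    return int(head)

def evaluate_prefix(tree):
    """Evaluate a tree produced by parse."""
    if isinstance(tree, int):
        return tree
    name, *args = tree
    return FUNCTIONS[name][1](*[evaluate_prefix(a) for a in args])

tree = parse("TIMES 4 DIFF 9 6".split())
value = evaluate_prefix(tree)
```

No parentheses are needed: because TIMES and DIFF each take exactly two arguments, the line can only be read as TIMES of 4 and (DIFF of 9 and 6), which evaluates to 12.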
While SLOGO is an ideal symbol manipulation and
string processing language, it has substantial weakness
in not providing structures that are effectively lists of
lists a la LISP 1.5. While generality is very desirable
to the programmer, the choice of LISP 1.5 as the symbol manipulation-list processing language for this
project posed such severe curriculum problems that
the attempt to use it was abandoned; thus, SLOGO,
which has less generality, was implemented instead.
SLOGO currently time shares five concurrent users;
each user has a 4096 word drum track that contains
his own functions, execution stack, etc. SLOGO is a
re-entrant program when executing commands from
a user, but it is not re-entrant with respect to console
input and the queuing apparatus. The currently available functions with short definitions attached are listed
in Appendix B. In the following sections, first, the
basic data types used in SLOGO are described, and
immediately thereafter is a discussion of the two
processing modes of SLOGO.
Data types in SLOGO
There are three basic data types in SLOGO: word,
sentence, and number. A brief explanation of each follows.
(1) A "word" consists of a string of letters, digits,
or certain punctuation marks; punctuation marks
that cannot be used are blank, single quote, ">",
"<", "-", and possibly others that depend on which
version of SLOGO is being run.
(2) A "sentence" consists of a group of words. Although one can argue that sentences could consist
of one or more words, to avoid ambiguity we assume
that sentences consist of two or more words.
(3) A "number" consists of a string of decimal
digits plus a leading minus sign, if the number is negative. The largest number acceptable is ±131,071.
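The three data types can be told apart mechanically. The Python sketch below is an illustration rather than SLOGO's read-in routine: it classifies a literal as typed, assumes quoted text is a word or sentence and bare digits are a number, and uses only the forbidden characters listed above (the version-dependent ones are unknown). The name classify_literal is hypothetical. The limit 131,071 equals 2**17 - 1, which fits a sign and magnitude in an 18 bit word.

```python
MAX_NUMBER = 131071        # largest magnitude quoted above; equals 2**17 - 1
FORBIDDEN = set("'<>-")    # characters a word may not contain (blank separates words)

def classify_literal(token):
    """Classify a literal as typed: 'word', 'sentence', or 'number'."""
    if len(token) >= 2 and token[0] == "'" and token[-1] == "'":
        words = token[1:-1].split()
        if not words:
            raise ValueError("empty literal is outside this sketch")
        if any(FORBIDDEN & set(w) for w in words):
            raise ValueError("word contains a forbidden character")
        # One word is a word; two or more words make a sentence.
        return "word" if len(words) == 1 else "sentence"
    digits = token[1:] if token.startswith("-") else token
    if digits.isdigit() and int(digits) <= MAX_NUMBER:
        return "number"
    raise ValueError("not a literal this sketch recognizes")
```

Under these rules a quoted string of digits is classified as a word, while the same digits typed without quotes are a number.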
There are three methods of referring to data: function values, pointer variables, and literals. A brief
explanation of each follows.
(1) A literal is a direct reference to the indicated
data. Word and sentence literals are written with the
single quote (') surrounding the desired data. Literal
numbers appear as the number itself, without quotes.
A quoted number is assumed to be a word.
Example:
The following are word literals:
'AARDVARK'
'45'
'3A'
'MIXTEC'
'THISISAWORD'
The following are sentence literals:
'AARNOLD IS A APATHETIC AARDVARK'
'ONTOGENY RECAPITULATES PHYLOGENY'
'12345'
'THIS IS A SENTENCE'
The following are number literals:
1
1776
-10
131071
(2) Function values. Most of SLOGO's built-in
functions and all of the defined functions return a
value. This value may be subsequently referenced by
other functions, and the type of this function may be
any of the three basic types.
(3) Pointer variables are in reality name pairs,
where one part of the pair is the name and the other
part is the value. Names must have type values of
either word or sentence but never number. The value
type can be word, sentence, or number. Names are
written inside closed symbols, which can be either "<"
and ">" or "-" for left and right sides.
Example:
<NURNDY IS A GAME>
_POINTER_
The peculiar literal '' is accepted by the read-in
routines, can be generated internally, and is always
printed by SLOGO as "NIL".
To illustrate the difference between literals and
pointer variables, assume there is a name pair whose
name is "HEROINE" and whose value is the sentence
"OUR GAL SUNDAE."
The value, then, of <HEROINE>
is OUR GAL SUNDAE.
The value of 'HEROINE'
is HEROINE.
SLOGO processing modes
SLOGO operates in two modes, command and definition. There is a special character printed at the extreme
left-hand end of the type line to indicate which mode
SLOGO is in.
"Command" mode is indicated by a ">" ("greater
than") sign, and is the normal mode of operation. In
command mode, as soon as a line of functions and
arguments is typed in, terminating with a "." (period),
the line is converted to a Polish string of interpretive
code and then interpreted by the SLOGO interpreter.
Upon detection of an error or the successful execution
of the Polish string of code, whatever output produced
is printed (if PRINT is used) and SLOGO returns to
a listen state while the next line is being typed in.
"Definition" mode is indicated by a "→" ("right
arrow") sign, and is the exceptional mode of operation.
It is entered from command mode when an input line
has been terminated with a period and begun with a
"TO". At that point definition mode is entered and
cannot be left until the command "END" is entered.
There is no attempt at function execution while in
definition mode. The only use of definition mode is
to define a SLOGO function by entering successive
lines of functions and arguments. During definition
mode, checking is done on the function names, validity
of arguments, etc., but no functions are executed.
SUMMARY
The purpose of this paper has been to describe the
software and corresponding rationale for a project
designed to teach high school students how to use
computers. The main thought behind the project is
that, especially for business applications, an approach
which stressed symbol manipulation and list processing skills would very likely prove of long-term use
to the students.
To implement this course, a three-part software
package has been developed which provides guided
interaction for each student with important programming concepts. The software package includes a
"driver" to shepherd the student through the course
material, an assembly language interpreter to provide
him with an understanding of basic machine operation
and a symbol manipulation-list processing language
interpreter to provide him with experience in solving
problems in a suitable higher level language.
It is worth noting that all of these programs are
written in a subset of ALGOL-60. A course dedicated
to the teaching of higher level computer languages could
show the utility of such languages in no better way
than to have its software packages written in such a
language. One of the very useful demonstrations this
project has made has been to show that complete, useful and efficient computer-based instruction systems
can be written in a higher level language.
Preliminary and informal results from the students
in the course are quite encouraging and tend to support the basic philosophy of this approach. There is
every reason to believe that the future statistical
analysis of the effects of this course will confirm these
initial observations.
APPENDIX A
Sample lessons
(The following are short examples from the actual curriculum; they have been retyped. Comments
within brackets are parenthetical comments added to indicate various features.)
3 JULY 1969
SLAKER (VERSION OF 28 MAY 69)
PLEASE TYPE YOUR NUMBER ... -> 11
(CTRL G TO BEGIN-CTRL T TO RESTART) ->
[sign-on]
[start at Lesson 68]
WHERE TO? -> L68
LESSON 68: USING TESTS
[a SLOGO lesson]
WE CAN USE 'FIRST,' 'BF,' AND SO ON WITH 'CALL'
IF YOU TYPE THIS:
CALL FIRST OF BF OF 'BEARS HIBERNATE IN WINTER' 'X'
IF WORD? < X > THEN P < X >.
THEN SLOGO FINDS THAT < X > IS 'H,' WHICH IS A WORD,
SO SLOGO REPLIES:
H
FOR PROBLEMS 1-6, TYPE: WHAT SLOGO REPLIES
TYPE 'N' IF NOTHING IS PRINTED.
1. CALL FIRST OF 'BLUE SKIES' 'W'
IF WORD? < W > THEN P < W >.
-> BLUE ... CORRECT
[constructed response]
2. CALL DIFF OF 9 AND 6 'X.'
IF NUMBER? < X >, THEN P TIMES OF 4 AND < X >.
DOES SLOGO THINK 3 IS A NUMBER?
[a hint]
-> 12 ... CORRECT
[another constructed response]
NOW SIGN ON TO SLOGO AND DO PROBLEMS 7-10.
AFTER ALL 4 PROBLEMS ARE DONE, TYPE CONTROL A.
7. TEST TO SEE IF 'PLACE KICK' IS A SENTENCE.
IF IT IS, PRINT 'IS SEN.'
8. TEST TO SEE IF 7 IS A WORD. IF IT IS, PRINT 'IS WORD.'
9. TEST TO SEE IF '1 4'8' IS A NUMBER. IF IT IS, PRINT THE NUMBER.
10. TEST TO SEE IF 'P' IS A WORD. IF IT IS, PRINT BF OF THE WORD.
[sign-on to SLOGO]
WHERE TO? -> SLOGO ... OK
[hello from SLOGO]
SLOGO ... THE ORIGINAL CONJURING CAT
["P" is "PRINT." ("BF" is "BUTFIRST"); this is solution to 10 above]
> IF WORD? 'P' THEN P BF 'P.'
= NIL
[Sample of SLOGO programming]
> TO REVERSE < A >.
-> IS < A > '.'
-> IF YES RETURN '.'
-> RETURN WORD LAST < A > AND REVERSE OF BUTLAST < A >.
-> END.
> P REVERSE '1234567890.'
= 0987654321
> ... OK
[return to SLAKER]
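The recursion in REVERSE can be mirrored in a modern language. The following Python sketch is an illustration only and is not part of the original curriculum; the helper names last, butlast and word are stand-ins assumed here for the SLOGO primitives as they apply to words (character strings).

```python
def last(w: str) -> str:
    """LAST of a word: its final character."""
    return w[-1]


def butlast(w: str) -> str:
    """BUTLAST of a word: everything except the final character."""
    return w[:-1]


def word(x: str, y: str) -> str:
    """WORD of X and Y: X concatenated with Y."""
    return x + y


def reverse(a: str) -> str:
    # IS <A> ''  followed by  IF YES RETURN ''
    if a == "":
        return ""
    # RETURN WORD LAST <A> AND REVERSE OF BUTLAST <A>
    return word(last(a), reverse(butlast(a)))


print(reverse("1234567890"))  # prints 0987654321
```

Each call peels off the last character and recurses on the remainder, so the empty-literal test is what stops the recursion.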
OUTPUT SHOULD BE:
7. IS SEN
8. NO OUTPUT
9. NO OUTPUT
10. NIL
[correct answers to 7-10; control
S ("SNIP") takes one on to 11]
LESSON 11: PROBLEM SOLVING
[a SIMPER lesson]
WRITE A SIMPER PROGRAM TO SOLVE EACH OF THESE PROBLEMS FOR YOU
1. MARY BOUGHT 3 POUNDS OF CANDY AT 29 CENTS PER POUND.
WHAT WAS HER BILL?
WHERE TO? -> SIMPER ... OK
SIMPER (VERSION OF 6 JUN 69)
BEGIN PROGRAMMING
LOC INSTRUCTION
000 -> BEGIN
001 -> GET X
002 -> GET Y
003 -> LOAD X
004 -> MUL Y
005 -> STOR X
006 -> PUT X
007 -> END
008 ->
EXECUTE ... STARTING LOC -> 0 AND ENDING LOC -> 7
PROGRAM EXECUTED ON 3 JULY 1969
P C   INSTR   REG A   REG B
000   BEGN    0       32768
INPUT -> 3
INPUT -> 29
003   LOAD    3       32768
004   MUL     32768   87
005   STOR    32768   87
OUTPUT = 87
007   END     87      32768
... END OF EXECUTION, CONTINUE
008 -> ... OK
HER BILL WAS 87 CENTS. IF YOUR PROGRAM SAID
OUTPUT = 87, SKIP ON.
->
[go to SIMPER]
[hello from SIMPER]
[possible student solution to this problem]
[execution of solution]
[back to SLAKER]
[answer]
[skip on]
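The student program above can be traced with a very small interpreter. The Python sketch below is hypothetical: the meanings given to GET, LOAD, MUL, STOR and PUT are inferred from the sample trace rather than taken from a SIMPER specification, and a single accumulator stands in for SIMPER's REG A and REG B.

```python
def run_simper(program, inputs):
    """Run a list of (opcode, operand) pairs and return the printed values."""
    memory = {}              # named storage cells such as X and Y
    acc = 0                  # one accumulator (a simplification of REG A / REG B)
    pending = list(inputs)   # values the student would type at INPUT prompts
    outputs = []
    for opcode, operand in program:
        if opcode == "BEGIN":
            pass                                  # nothing to do in this sketch
        elif opcode == "GET":
            memory[operand] = pending.pop(0)      # read the next input into a cell
        elif opcode == "LOAD":
            acc = memory[operand]                 # copy a cell into the accumulator
        elif opcode == "MUL":
            acc *= memory[operand]                # multiply accumulator by a cell
        elif opcode == "STOR":
            memory[operand] = acc                 # copy accumulator back to a cell
        elif opcode == "PUT":
            outputs.append(memory[operand])       # print a cell
        elif opcode == "END":
            break
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return outputs


candy_bill = [
    ("BEGIN", None), ("GET", "X"), ("GET", "Y"), ("LOAD", "X"),
    ("MUL", "Y"), ("STOR", "X"), ("PUT", "X"), ("END", None),
]
print(run_simper(candy_bill, [3, 29]))  # prints [87]
```

With inputs 3 and 29 the sketch reproduces the OUTPUT = 87 shown in the sample session.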
2. A RECTANGLE IS 8 INCHES LONG AND 4 INCHES WIDE.
FIND ITS AREA.
-> TO FIND THE AREA OF A RECTANGLE, MULTIPLY THE LENGTH
TIMES THE WIDTH.
[hint]
WHERE TO? -> SIMPER ... OK
[sign-on to SIMPER]
APPENDIX B
Concise guide to SLOGO
(Optional words are italic).

WORDS OF X AND Y
    produces a word which is X concatenated with Y.
SENTENCE OF X AND Y
    produces a sentence of Y appended to X.
FIRST OF X
    if X is a word, result is the first letter; if X is a sentence,
    result is the first word.
BUTFIRST OF X
    if X is a word, result is all but the first letter; if X is a
    sentence, result is all but the first word.
LAST OF X
    if X is a word, result is the last character; if X is a
    sentence, result is the last word.
BUTLAST OF X
    if X is a word, result is all but the last character; if X is
    a sentence, result is all but the last word.
SUM OF X AND Y
    X + Y
DIFFERENCE OF X AND Y
    X - Y
TIMES OF X AND Y
    X × Y
QUOTIENT OF X AND Y
    X ÷ Y
IS X Y
    sets internal flag to true if X = Y (equality of arguments
    for numbers; character by character equality of words;
    word by word equality of sentences); false otherwise.
IF YES THEN S1, where S1 is some executable statement
    execute S1 if internal flag is true; ignore S1 if false.
IF NO THEN S1
    execute S1 if internal flag is false; ignore S1 if true.
IF WORD? OF X THEN S1
    executes S1 if X is a word.
IF SENTENCE? OF X THEN S1
    executes S1 if X is a sentence.
IF NUMBER? OF X THEN S1
    executes S1 if X is a number.
TO NAME OF < X > AND < Y >
    begins definition of a function named "name" and whose
    formal parameters are X and Y.
RETURN X
    exit from current function with value X.
END
    complete definition of function and insert RETURN ''
    in the code for safety's sake.
GO TO LINE N
    branching statement to be used inside of user-defined
    functions.
CALL THING X NAME Y
    associates the name produced by evaluating Y with the
    value produced by evaluating X.
LOGO
    reset.
ERASE name
    erase the function named "name."
TRACE
    turn on trace for all user-defined functions.
UNTRACE
    turn off the trace.
PRINT X
    print the value of X on the user's teletype.
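The word/sentence behavior of FIRST, BUTFIRST, LAST and BUTLAST can be sketched in Python. This is an illustration under stated assumptions, not the SLOGO implementation: a value is treated as a sentence when it contains a blank and as a word otherwise, numbers are left out, and the helper show is a hypothetical stand-in for the rule that the empty literal prints as NIL.

```python
def is_sentence(x: str) -> bool:
    """Assumption: text containing a blank is a sentence, otherwise a word."""
    return " " in x


def first(x: str) -> str:
    # first word of a sentence, or first letter of a word
    return x.split()[0] if is_sentence(x) else x[:1]


def butfirst(x: str) -> str:
    # all but the first word of a sentence, or all but the first letter of a word
    return " ".join(x.split()[1:]) if is_sentence(x) else x[1:]


def last(x: str) -> str:
    # last word of a sentence, or last character of a word
    return x.split()[-1] if is_sentence(x) else x[-1:]


def butlast(x: str) -> str:
    # all but the last word of a sentence, or all but the last character of a word
    return " ".join(x.split()[:-1]) if is_sentence(x) else x[:-1]


def show(x: str) -> str:
    """Render a value as PRINT would, with the empty literal shown as NIL."""
    return x if x != "" else "NIL"


print(show(first("BLUE SKIES")))  # prints BLUE
print(show(butfirst("P")))        # prints NIL
```

The two printed results agree with the replies to problems 1 and 10 in the sample lesson of Appendix A.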
A touch sensitive X-Y position encoder
for computer input
by A. M. HLADY
National Research Council
Ottawa, Canada
INTRODUCTION
Any input device used in conjunction with a computer
controlled display for interactive information exchange
between man and computer must function
as a position encoder. Input devices for handling two
dimensional positional information can be grouped
into two general types, one type encoding absolute
positions and the other encoding changes in position.
Devices accepting absolute positions rely on a direct
mapping of positions from an input surface to a display
surface. The input surface is usually a flat plate
or tablet on which positions are indicated with a movable
hand held stylus. One consideration in developing
a device of this type is the location of the input surface
with respect to the display surface. The mapping
relationship between surfaces is simplified for the user
to the extent of being instinctive if the two surfaces
are coincident. If the input surface is superimposed on
the display surface with a finite separation, the user
has to cope with the problem of parallax. A transparent
input surface and a one to one mapping scale are implicit
in these two arrangements. A third possibility
is that the two surfaces are in different physical locations.
This makes it necessary for the user to rely on a
visual feedback process by observing the mapping of
his selected position in relation to the desired position
and then modifying his selection to decrease the
difference.
The stylus used for indicating positions on the surface
is typically an active one which contains a signal
sensor, as for example, in the RAND Tablet,1 or a
signal radiator, as in a magnetically coupled device
described by Lewin.2 The stylus must be large enough
to accommodate the necessary components, and, in
addition, present devices require a cable connecting
the stylus to the console for signal transmission. This
makes some active styli difficult to use with dexterity.
Input devices for encoding position increments do
not have separate input surfaces, and their operation
depends entirely on visual feedback from the display
surface. This type of device consists of a mechanical
assembly having at least two degrees of freedom, such
as a joy-stick or track-ball, which can be manipulated
to indicate changes in the position of a cursor displayed
on the screen.
Touch sensitive overlay
Work on the device described in this paper began
with several primary objectives which are related to
the considerations outlined above. These objectives
are:
1. The device must encode absolute positions
indicated by the user.
2. The input surface must be as close as possible
to the display surface.
3. Positions are to be indicated with a passive
stylus, including a human finger.
The first two objectives ensure that the relationships between the positional information that the
user must provide and the information he observes
on the screen are fundamental ones. This reduces
the time and mental effort expended, especially when
the device is used for item selection, that is the selection of a sub-set from a set of items shown on the display surface.
Assuming that the first two objectives are met, the
third allows one to select items or positions on the
screen merely by pointing at them with a finger. Because pointing with a finger is man's most natural
method of indicating selection, a touch activated device creates a minimum of distraction for the user.
In fact, an ideal implementation of the three objectives listed above would result in an input device
that was apparent to the user in function rather than
in substance.
Admittedly, the human finger is a rather coarse
stylus but the resolution attainable is sufficient for
many types of manual information entry. The words
or phrases displayed for selection in an information
retrieval system could be in a format suitable for this
type of input technique. If a conventional keyboard
is used in conjunction with the display terminal, a
touch activated display overlay reduces the time spent
in going from keyboard to display by eliminating the
intermediate step of picking up a stylus. In addition,
a portion of the display screen could be used as a
touch sensitive keyboard with dynamic computer
control of the associated key functions. The apparent
simplicity, both physically and functionally, of this
type of input device is a significant advantage if the
user is a young child communicating with a computer-assisted instruction system.
For information entry requiring more resolution
than one can obtain with a finger, a suitable passive
stylus could resemble an ordinary pencil with its convenient size, light weight, and freedom of movement.
One touch sensitive device3 that has been developed
for use with a CRT consists of a number of wires
terminating at the front surface of the display tube.
Each wire forms the arm of an AC bridge which is
unbalanced by body capacitance. A second device,
developed by Control Data Corporation, has a series
of translucent, touch-activated strips in front of a
CRT display.
The approach taken in our case was to use an echo
ranging technique with elastic surface waves. Echo
ranging with pulsed ultrasonic surface waves has been
applied successfully for a number of years in the field
of flaw detection for structural materials. The propagation delay of ultrasonic elastic waves has been used
as the basis for graphic input devices for a computer.
However, these devices do not employ echo ranging
and consist basically of fixed sources or radiators with
the sensor in a movable stylus. One of these, developed
by Woo at IBM,4 also uses surface waves on a glass
plate. The Lincoln Wand5 provides a three dimensional
input capability by using ultrasonic waves propagating
in air.
In the device developed at NRC, the radiator and
sensor are physically the same piezoelectric transducer
which is electrically switched between the driving
circuitry and the echo receiving circuitry. Pulse modulated
surface waves are produced on a transparent
glass plate, and any object contacting the surface
reflects some of the wave energy back to the source.
The distance from the radiator/sensor to the target
is proportional to the time between the radiator pulse
and the reception of the echo pulse.
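The proportionality between echo delay and target distance can be written out directly. The Python sketch below is illustrative only; it assumes the plate-glass surface wave velocity of 10,400 ft/sec quoted in the next section, and the function name is hypothetical.

```python
# Surface wave velocity on plate glass, converted from ft/sec to in/sec.
SURFACE_WAVE_VELOCITY_IN_PER_S = 10_400 * 12


def echo_distance_inches(round_trip_s: float) -> float:
    """One-way distance to the target for a measured two-way echo delay."""
    # The pulse travels out and back, so only half the path is range.
    return SURFACE_WAVE_VELOCITY_IN_PER_S * round_trip_s / 2.0


print(round(echo_distance_inches(100e-6), 2))  # prints 6.24
```

An echo arriving 100 microseconds after the radiator pulse therefore corresponds to a target a little over six inches from the transducer.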
Surface wave characteristics
An elastic surface wave can be represented mathematically as a combination of inhomogeneous longitudinal and transverse waves. This is exemplified by
the particle displacements for a surface wave. The
particles describe elliptical orbits with the major axis
perpendicular to the surface and the minor axis parallel
to the direction of propagation, corresponding to the
transverse and longitudinal components respectively.
The particle displacements decrease exponentially
with depth into the material, the depth decay factor
being a function of the wavelength and the material.
For glass, the wave energy at a depth of one wavelength is only about three percent of its value at the
surface. A practical implication of this result is that,
to a close approximation, a plate several wavelengths
thick appears as the solid half-space necessary for
true surface wave propagation.
Waves on the free surface of a solid half-space, which
are also known as Rayleigh waves, are not dispersive
and their phase velocity depends only on the properties
of the material on which they are propagating. For
plate glass the velocity is 10,400 ft/sec.
The amplitude of all elastic waves decreases with
distance from the source through three mechanisms: beam divergence, scattering, and absorption. Because
a surface wave is essentially a two-dimensional phenomenon, the decrease in amplitude due to beam divergence is proportional to 1/√r, compared to 1/r for
spatial waves, where r is the distance from the source.
The attenuation due to scattering and absorption is
related to that of spatial waves, with the attenuation
factor being approximately proportional to frequency
in the ultrasonic range. The attenuation coefficient of
plate glass measured at 8 MHz is 0.40 nepers/inch.
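To give a feel for how quickly the signal falls off, the two loss mechanisms can be combined in a short calculation. The Python sketch below is an assumption-laden illustration: it multiplies the 1/√r divergence factor by an exponential loss using the 0.40 nepers/inch coefficient, considers a one-way path only, and normalizes to a hypothetical reference distance.

```python
import math

ATTENUATION_NEPERS_PER_INCH = 0.40  # plate glass at 8 MHz, from the text


def relative_amplitude(r_in: float, r_ref_in: float = 1.0) -> float:
    """One-way amplitude at r_in relative to the amplitude at r_ref_in."""
    # Two-dimensional beam divergence: amplitude falls as 1/sqrt(r).
    divergence = math.sqrt(r_ref_in / r_in)
    # Scattering and absorption: exponential decay in nepers.
    absorption = math.exp(-ATTENUATION_NEPERS_PER_INCH * (r_in - r_ref_in))
    return divergence * absorption


for r in (1.0, 2.0, 6.0, 12.0):
    print(f"{r:5.1f} in  ->  {relative_amplitude(r):.4f}")
```

The steep fall-off with range is the reason a receiver in such a system would need its gain raised progressively during each scan.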
An interesting property of surface waves is their
ability to propagate along curved surfaces. If the radius of curvature is large with respect to the wavelength, there is only a slight change in attenuation and
velocity. This property makes it possible to employ
the echo ranging principle described to produce a device which uses the curved front face of a CRT as the
input surface, reducing parallax to a practical minimum.
Echo ranging parameters
All systems using echo ranging for target location
have similar design parameters. Although considerable
effort has gone into the refinement of echo ranging
techniques for radar and sonar, the additional complexity and cost of such developments as signal correlation makes them impractical for this application.
For two dimensional space, the stylus location can
be determined by measuring its distance from two
fixed points or its normal distance from two fixed
lines. The latter method was chosen and implemented
by alternately scanning the surface in orthogonal
directions using linear transducer arrays fixed at the
edges of a square plate. This method can provide the
stylus location directly in terms of x-y coordinates
without additional computation. The line reference
method also avoids the problem of edge reflections
obscuring valid echoes. Furthermore, with the large
beamwidths needed in the first method, it is difficult
to achieve an adequate surface wave power density
at frequencies in the megahertz range.
The choice of plate material was limited by the
transparency requirement. Ordinary plate glass was
found to be satisfactory although its attenuation coefficient is higher than that of fused quartz and some
optical glass. All the glass tested had several surface
flaws per square foot but most of these were shallow
enough to be eliminated by localized hand grinding
and polishing. The plate size was chosen to provide a
usable surface of 10 X 10 inches.
Factors involved in the choice of carrier frequency
include the positional resolution, the surface wave
attenuation, the radiator beamwidth, the gain in
reflected energy for a given target size, and the availability of piezoelectric transducers. A carrier frequency
of 8 MHz was chosen for the initial device with the
corresponding wavelength on glass being about
0.015 inch.
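As a check on the quoted wavelength, it follows from the plate-glass surface wave velocity given earlier (10,400 ft/sec, or 124,800 in/sec) and the 8 MHz carrier. The symbols c_s and f are used here for the surface wave velocity and the carrier frequency.

```latex
\lambda = \frac{c_s}{f}
        = \frac{124{,}800\ \text{in/sec}}{8 \times 10^{6}\ \text{Hz}}
        \approx 0.0156\ \text{in}
```

This agrees with the figure of about 0.015 inch.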
Radiator/sensor development
One of the most efficient and convenient ways of
generating surface waves at frequencies in the low
megahertz range is by the mode conversion of a longitudinal spatial wave. This occurs when a longitudinal
wave is incident on an interface between two solid
materials with an angle of incidence large enough
that total internal reflection occurs, and no energy is
refracted into the second material. In order that the
boundary conditions remain satisfied at the interface
for this case, inhomogeneous longitudinal and transverse waves are produced in the second material. In
other words, a surface wave is generated.

Figure 1-Surface wave radiator/sensor (labels: piezoelectric transducer, prism)

A practical implementation of this, shown in Figure
1, consists of a thickness mode piezoelectric transducer mechanically coupled to a solid prism. Maximum surface wave output occurs for a prism angle, α_i,
such that the spatial period of the surface perturbations corresponds to the wavelength of the resultant
surface waves at the frequency of the incident wave.
That is, when

$$\sin \alpha_i = \frac{c_L}{c_s}$$

where c_L is the longitudinal wave velocity in the
prism,
and c_s is the surface wave velocity.
For this optimum angle to be real, the prism material
must be chosen so that c_L < c_s. One of the commonly
available materials that meets this velocity requirement for generating surface waves on glass is an acrylic
resin such as Plexiglass or Lucites.
The same configuration also makes an efficient surface wave sensor. In this case, incident surface waves
excite spatial waves in the prism with an angle of
propagation determined by the velocity ratio. When the
same transducer is used for both sending and receiving,
the energy that was internally reflected within the
prism during the send interval appears as clutter or
noise during the receive interval. Although this excess
energy is gradually absorbed by the prism material,
its effect can be reduced by modifying the prism shape
and coating it with an absorbent material. For the
transducers actually constructed, the first two inches
of range could not be used because of the clutter.
The piezoelectric transducers are made of a lead
zirconate-lead titanate ceramic having a thickness
mode electro-mechanical coupling coefficient of 0.66.
This material is relatively good for energy transformation in both directions. The bandwidth and mechanical
output power of a piezoelectric transducer are related
to the mechanical impedance of the materials to which
it is coupled. After some experimentation with quarter
wave impedance matching transformers and various
backing materials, it was decided to sacrifice bandwidth
for sensitivity by using air-backed transducers
bonded directly to the prism. The result was a radiator
fractional bandwidth of 20 percent. The parallel components of the electrical input impedance for a small
test array constructed in this way are shown in Figure
2.
For an 8 MHz pulse modulated signal with a 1.6
MHz bandwidth, the minimum resolvable stylus
movement should be about 0.04 inch. As will be explained later, this resolution was attained but unusable
in the first device constructed.
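The 0.04 inch figure is consistent with a common rule of thumb for pulse-echo systems, in which the range resolution is roughly the wave velocity divided by twice the signal bandwidth. The paper does not state how its value was derived, so the relation below is an assumption offered only as a plausibility check, with c_s the surface wave velocity (124,800 in/sec) and B the bandwidth.

```latex
\Delta r \approx \frac{c_s}{2B}
         = \frac{124{,}800\ \text{in/sec}}{2 \times 1.6 \times 10^{6}\ \text{Hz}}
         \approx 0.039\ \text{in}
```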
Array design
The method of target location being used requires
a line source of waves having uniform amplitude and
phase across a ten inch width. To combine separate
radiator elements into a linear array with the desired
characteristics, the radiation pattern of individual
elements must be known. An expression for the directivity characteristics of a prism type of radiator has
been derived,6 and it yields results similar to the sin
x/x function for spatial radiators. Figure 3 compares
values computed for an 8 MHz radiator using this expression with experimentally measured values.
For practical plate dimensions and transducer sizes,
the usable surface area lies in the far-field region of the
individual elements but in the near-field region of the
overall array. By computing the response for various
linear array configurations, a radiator width of 0.465
inch, and a spacing of 0.565 inch, were selected.
After the arrays were assembled and tested, the
measured radiation pattern was more irregular than the
computations indicated. This discrepancy was attributed to the variations in spacing, orientation, and
bond characteristics due to assembly tolerances and
the variations in transducer sensitivity. The gaps in
the pattern were sufficiently large and numerous that
it was necessary to add a second set of arrays on the
opposite sides of the plate. These are offset with respect
to the first so that the beams from opposite arrays are
effectively interleaved. The arrays are energized sequentially to avoid mutual interference.
The maximum two-way propagation time for a ten
inch usable surface and a two inch buffer zone is about
200 µsec. Therefore, even with four separate arrays,
the sampling rate can be greater than 1 kHz, which
is more than adequate to follow normal stylus motion.
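The timing figures above can be reproduced with a few lines of arithmetic. The Python sketch below assumes the plate-glass velocity of 10,400 ft/sec given earlier and assumes that one complete position sample needs one scan from each of the four arrays in sequence; the function name is hypothetical.

```python
VELOCITY_IN_PER_S = 10_400 * 12   # plate glass surface wave velocity, in/sec
USABLE_SURFACE_IN = 10.0          # usable surface along one axis
BUFFER_ZONE_IN = 2.0              # unusable range near the transducer
ARRAYS = 4                        # arrays energized one after another


def round_trip_time_s(range_in: float) -> float:
    """Two-way propagation time to a target at the given range."""
    return 2.0 * range_in / VELOCITY_IN_PER_S


scan_time = round_trip_time_s(USABLE_SURFACE_IN + BUFFER_ZONE_IN)
sample_rate_hz = 1.0 / (ARRAYS * scan_time)

print(f"{scan_time * 1e6:.0f} usec per scan, {sample_rate_hz:.0f} Hz")
# prints: 192 usec per scan, 1300 Hz
```

The scan time of about 192 microseconds matches the quoted "about 200 µsec", and four sequential scans still leave a sampling rate above 1 kHz.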
Figure 2-Parallel impedance components for a series
connected array of four 1/2 X 1/4 inch transducers
(vertical axis: ohms; horizontal axis: frequency, MHz)
Electronic circuitry
The signal processing circuitry consists of a radiator
driver, an electronic switch, and an echo receiver. The
timing circuitry digitizes the signal propagation time,
and the control logic maintains the correct operating
sequence. Figure 4 shows how these components are
interconnected.

Figure 3-Directivity pattern for a surface wave radiator
at 8 MHz with 0.23 inch width (horizontal axis: degrees)

The radiator driver and the arrays are matched to
50 ohms allowing them to be connected with standard
coaxial cable. The' diode switch, with a four-pole
double-throw action, permits the four arrays to be
multipIexed into a single driver and receiver, and it
also isolates the receiver during the driver pulse. The
echo receiver consists of an RF amplifier followed by
a demodulator and a threshold detector. The receiver
gain is electronically swept during each scan to compensate for the signal attenuation with range. A range
gate rejects echoes originating outside of the designated
area. Figure 5 shows the demodulator and threloJhold
detector outputs for a single scan. The signal at. the
centeris the echo from a finger touching the g1a.ss.
Echo timing is performed by a free running counter.
Both up and down counting are required to digitize
scans originating at opposite sides of the input surface.
The coordinate grid is considered to have X and Y
axes coincident with the edges of the usable surface,
the origin being in the lower left corner. Adjustments
on the range gates and counting circuitry allow the
size and position of the coordinate grid to be varied
slightly to permit registration with the grid of an associated display device.
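The reason for both up and down counting can be seen in a small model. The Python sketch below is a hypothetical illustration, not the circuit described: it works in inches rather than counter ticks, assumes a two inch buffer zone between each array and the usable surface, and assumes a ten inch usable surface with the origin at the edge nearest the first array.

```python
VELOCITY_IN_PER_S = 10_400 * 12   # plate glass surface wave velocity, in/sec


def coordinate_from_echo(round_trip_s: float,
                         from_far_side: bool,
                         buffer_in: float = 2.0,
                         surface_in: float = 10.0) -> float:
    """Coordinate of a touch along one axis, measured from the origin edge."""
    # Range from the scanning array to the edge of the usable surface it faces.
    distance = VELOCITY_IN_PER_S * round_trip_s / 2.0 - buffer_in
    # A scan from the opposite side measures distance from the far edge,
    # which corresponds to counting down from full scale instead of up.
    return surface_in - distance if from_far_side else distance


# A touch 3 inches from the origin, seen by both arrays on one axis:
near = coordinate_from_echo(2 * (2.0 + 3.0) / VELOCITY_IN_PER_S, False)
far = coordinate_from_echo(2 * (2.0 + 7.0) / VELOCITY_IN_PER_S, True)
print(round(near, 6), round(far, 6))  # prints 3.0 3.0
```

Both scans yield the same coordinate, which is what allows interleaved beams from opposite arrays to share one coordinate grid.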
Figure 4-Position encoder block schematic

Figure 5-Echo receiver response
Vertical: Upper 0.5 v/div. Lower 5.0 v/div.
Horizontal: 25 µsec/div.

The control circuitry allows two modes of operation:
a continuous mode and a discrete mode. In the continuous mode, a Data Ready pulse signals the computer
for every set of coordinates generated while stylus
contact is maintained. In the discrete mode, on the
other hand, only the location of the initial contact is
transferred to the computer. The stylus must be lifted
and repositioned to initiate another data transfer.
The discrete mode significantly reduces the amount of
data that must be handled without degrading the
response time when the device is being used for item
selection purposes only.
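The difference between the two modes can be sketched as a filter on a stream of position samples. The Python below is a hypothetical model, not the control logic of the device: each sample is either an (x, y) pair while something touches the surface or None when nothing does, and the returned list stands for the coordinates that would be accompanied by a Data Ready pulse.

```python
def data_ready_events(samples, discrete: bool):
    """Return the coordinates that would be passed to the computer."""
    events = []
    touching = False
    for sample in samples:
        if sample is None:
            touching = False          # stylus lifted; re-arm the discrete mode
            continue
        # Continuous mode reports every sample; discrete mode reports only
        # the first sample after contact is made.
        if not discrete or not touching:
            events.append(sample)
        touching = True
    return events


stream = [None, (1, 1), (1, 2), None, (5, 5), (5, 5)]
print(data_ready_events(stream, discrete=False))  # prints [(1, 1), (1, 2), (5, 5), (5, 5)]
print(data_ready_events(stream, discrete=True))   # prints [(1, 1), (5, 5)]
```

For item selection the discrete mode halves the data in this small example, and the saving grows with the length of each contact.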
In applications such as CAI which require a cluster
of computer terminals in one location, it becomes feasible to time-share the electronic circuitry among several
terminals, thereby decreasing cost per unit.
Device performance
The complete device is shown in Figure 6 with a
static display card behind the glass for demonstration
purposes. It has been interfaced with a Digital Equipment Corporation PDP-8 computer for testing and
evaluation.
Tests have shown that stylus movements of 0.04
inch could be resolved, which corresponds to the calculated value mentioned earlier. However, it was found
that a contact area approximately 1/4 inch in diameter
is necessary to ensure operation anywhere on the
10 X 10 inch surface. The contact area must be as
large as that to bridge the regions of low sensitivity
which result from the irregularities in the surface wave
radiation pattern. This means that even though the
device has an inherent positional resolution of 0.04
inch, the usable working resolution is considerably
lower.
When using the device with a finger, a pressure of
only a few ounces is adequate for operation over most
of the surface. In a few places, the pressure has to be
increased to enlarge the contact area sufficiently. In
the former case, pointing with a finger to items displayed behind a seemingly ordinary glass plate is
quite natural, and, except for the parallax, a person
can make use of the device without consciously being
aware of its presence.
The position encoding is accurate and linear to about
0.5 percent. This figure takes into account the variation
in wave velocity due to temperature change and material inhomogeneity, nonlinearity of the radiated
wavefront, and the stability of the timing circuits.
Because scratches and marks on the glass produce
small echoes which contribute to the background
noise level in the receiver, some care must be used to
keep the surface clean. The accumulation of fingerprints
on the glass also contributes to the background
noise. However, this is not a serious problem when the
device is used with reasonably clean hands.
The initial device as described has served to demonstrate
the feasibility of using surface wave echo ranging
as the basis for a touch-sensitive position encoder.
The experience gained in constructing and testing
the device has been useful in determining where improvements
are needed and how they should be implemented.
Further computations indicate that a more
sophisticated approach to the array design and assembly
should improve the radiation pattern uniformity
and thereby reduce the present disparity
between the minimum contact size and the inherent
resolution. Tests have shown that lowering the
carrier frequency to about 4 MHz should increase
the signal-to-noise ratio of usable stylus echoes by decreasing the signal attenuation and lowering the sensitivity to surface contamination. The overall consequences of these changes will be to improve the performance with medium and low resolution styli and
also to simplify the circuitry, and hence reduce the
cost, by using two arrays instead of four. Work is
progressing on the construction of a device which incorporates the improvements described.
Figure 6-Touch sensitive position encoder

REFERENCES
1 M R DAVIS T O ELLIS
The RAND tablet: A man-machine communication device
AFIPS FJCC Proc Vol 26 325 1964
2 M H LEWIN
A magnetic device for computer graphical input
AFIPS FJCC Proc Vol 27 831 1965
3 E A JOHNSON
Touch display: A novel input/output device for computers
Electronics Letters Vol 12 1964 Vol 13 1965
4 P W WOO
A proposal for input of hand drawn information to a digital
system
IEEE Trans on Electronic Computers EC-13 609 1964
5 L G ROBERTS
The Lincoln wand
AFIPS FJCC Proc Vol 28 223 1966
6 I A VIKTOROV O M ZUBOVA
Directivity diagrams of radiators of Lamb and Rayleigh waves
Soviet Physics-Acoustics Vol 9 1H62 Vol 138 1963
7 I A VIKTOROV
Rayleigh waves in the ultrasonic range
Soviet Physics-Acoustics Vol 8 1962 Vol 11 X 1962
A queueing model for scan conversion
by T. W. GAY, JR.
IBM Systems Development Division
Kingston, New York
STATEMENT OF PURPOSE AND EXPECTED
RESULTS
The purpose of this paper is to present a queueing
model for analyzing a video scan converter (VSC).
The system analyst constantly strives for quicker
methods, parallel approaches, and more accurate results.
Queueing theory is generally useful in the first
and second of these categories. How then does the
analyst develop a queueing model of a VSC in the
hardware development and design stage?
The model is constructed through study of the internal
functioning of the VSC and a queueing model
is then developed which functions analogously with it.
The queueing model developed for the VSC was an
extension and adaptation of the known queueing model
called "the machine interference queueing model."
(See first section for an explanation).
The general machine interference queueing model was
extended and modified to permit the servicing of
multicharacter conversions in lieu of single character
conversions.
The results are presented in the first two sections of
this paper.
INTRODUCTION AND EXPLANATION OF THE
VIDEO SCAN CONVERTER
Queueing analysis is a recent branch of probability
theory which studies the characteristics and effects
of congestion in systems subject to random flows. The
system under study may be a supermarket, a busy
airport, or a real-time message processor. Ideally,
the behavior of each of these systems could be represented
in mathematical terms, the common elements
identified, and the appropriate analysis applied to
determine the expected effects of various modes of
operation. Practically, however, no such extensive
analysis can be carried out. This is due in part to the
lack of complete knowledge of the system specifications
at the time analysis is required. But more important
is the present limitation of the mathematics.
The function of the video (analog) scan converter is
to effect the conversion of characters which have been
generated by a computer in directed beam format into a
video scan format. In fulfilling this function, a video
scan converter ordinarily "paints" character(s) on the
face of a cathode ray tube and "converts" their image
by scanning the image with a Vidicon. The directed
beam character appears to be painted on with no
presence of dots (or scanning lines). The painted image
is converted to a video scan character and is composed
of horizontal dots conforming to the character shape.
The smooth painted character has now become a
configuration of dots close enough together so that the
eye perceives an entire character(s).
One video scan converter is normally used to service
a group of video displays. If a keyboard is attached
to a video display, then the operator can enter keystrokes thru the keyboard into the computer. The keystroke(s) are converted to the video scan format and
appear on the operator's display screen. If characters
appear on the display, one by one, this is called "single
character conversion", a subject not discussed in this
paper. However, characters frequently appear on the
display screen in groups (batches) because of
high traffic. This paper assesses this multiple character
conversion phenomenon.
Fall Joint Computer Conference, 1969
A queueing model for the video scan converter
Explanation of the general case "machine interference" queueing model, with development of associated equations. (Reference: Feller, W., An Introduction to Probability Theory and its Applications,
2nd edition, New York: J. Wiley and Sons, 1957,
pages 416-418.)
The machine interference model is a general class of
queueing models. We are here specifically interested in
the "Machine Servicing With Single Serviceman."
This model has a finite number of customers arriving
randomly at a single server. It was originally applied
in Swedish industry to determine how many machines
(customers) one setup man (server) could tend without
undue waiting delays resulting from several machines
requiring service at the same time.
Assume there are "m" identical machines assigned
to one serviceman. Each machine is in one of two
states:
1. "up" (operating)
2. "down" (requiring service)
When a machine goes "down", it joins the queue for
the serviceman. If the serviceman is free, he immediately begins to service the machine. If he is busy,
the machine must wait for service. The queue (waiting
line) is organized on a "first-in, first-out" basis. The
design is shown in Figure 2.
To obtain the only readily available solution, the
following assumptions are made:
1. Service time for all machines is exponentially
distributed with mean time $T_s$.
2. The "up" time for each machine is exponentially
distributed with mean time $T_a$.
These assumptions result in worst-case answers if
the actual service and "up" time distributions are
more regular.
Since there is a "fixed" number of customers, we
can readily see that the arrival rate of the machines
to the service queue is proportional to the number still
operating. If all machines are in the queue, the arrival
rate is reduced to zero. Because of this "captive audience" characteristic, the system has a built-in limiting
effect and cannot become unstable (no infinite number
of customers in the queue). For a relatively efficient
machine, the mean operating time, $T_a$, is
large compared to the mean servicing time, $T_s$. The ratio
of these two values is termed here the "service ratio",
$z$:
$$z = \frac{T_a}{T_s} \qquad (1)$$
If $P_k$ denotes the probability that k machines are
"down", then let $P_0$ denote the probability that all
machines are operating and the serviceman is idle. No
machines are in the waiting line or being serviced.
$P_0$ is the probability which represents the fraction of
time the serviceman is idle. Thus, $1 - P_0$ is the fraction of time the serviceman is busy, and can be called
the server utilization.
Figure 1-Poisson ratio function vs. service ratio
Figure 2-Model of machine servicing with single
serviceman
Hence:
$$P_k = \frac{e^{-z} z^{(m-k)} / (m-k)!}{\sum_{i=0}^{m} e^{-z} z^{i} / i!} \qquad (2A)$$
where $P_k$ is the probability that k machines are
"down". Equation 2A is the ratio of two Poisson expressions, both obtainable from Poisson tables, and
is known as the truncated Poisson distribution. If
k = 0, then Equation 2A gives the probability
of no machines in the service queue. If $1 - P_0$ is server
utilization, then substituting k = 0 into Equation 2A
and subtracting it from 1 gives:
$$\text{Server Utilization} = (1 - P_0) = r_m(z)$$
For convenience, this function has been plotted in
Figure 1, "Poisson Ratio Function versus Service
Ratio." Given $T_a$ and $T_s$, z can be calculated using
Equation 1. Given m, the number of individual queues,
$r_m(z)$ can be found at the intersection of z and m and
its value read from the "y" axis, Figure 1. Note that
$r_m(z)$ denotes server utilization.
For each machine, a breakdown is followed by a
wait for service, a service time, and an operating time
until the next time it has a breakdown. In equation
form:
$$T_b = T_w + T_s + T_a \qquad (3)$$
Where:
$T_b$ is the mean time between breakdowns per machine
$T_w$ is the mean time waiting to be serviced per machine
$T_s$ is the mean time to service a "down" machine
$T_a$ is the mean time a machine is "up" (operating)
The mean rate of machine breakdown is $1/T_b$. Since
there are m machines, the total mean rate of machine
breakdowns entering the service queue is $m/T_b$. Each
breakdown requires a service time $T_s$. Therefore, the
server utilization, $r_m(z)$, must be:
$$r_m(z) = \frac{m T_s}{T_b} \qquad (4)$$
But if $r_m(z)$ is already known through use of Figure 1, then
Equation 4 can be solved for $T_b$:
$$T_b = \frac{m T_s}{r_m(z)} \qquad (5)$$
where $T_b$ is mean time between breakdowns.
By further examination of Equation 3, it can be seen
that the mean time a machine stays in the "down"
state is:
$$T_w + T_s = T_b - T_a = \left[\frac{m T_s}{r_m(z)}\right] - T_a \qquad (6)$$
A correlation which will be made later is that $T_w + T_s$
is sometimes referred to as average response time, $T_r$.
A useful alternate form to Equation 6, obtained by substituting $T_a = z T_s$ in Equation 6, is:
$$T_w + T_s = \left[\frac{m}{r_m(z)} - z\right] T_s \qquad (7)$$
The mean number of down machines in the waiting
line, not including the one in service, is given by:
$$L_w = \sum_{k=1}^{m} (k - 1) P_k = m - (z + 1)\, r_m(z) \qquad (8)$$
The mean number of all "down" machines, including
the one in service, is given by:
$$L_q = \sum_{k=1}^{m} k P_k = L_w + r_m(z) = m - z\, r_m(z) \qquad (9)$$
Where, in Equations 8 and 9 above:
$P_k$ is the probability that k machines are down
k is the number of machines "down"
m is the total number of machines in the system
and is a constant
z is the ratio of the machine "up" time to the
machine service time
$r_m(z)$ is the server utilization
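As a rough numerical check on Equations 1 through 9, the following Python sketch evaluates the truncated Poisson distribution directly instead of reading Figure 1. The function names (`down_probabilities`, `server_utilization`, `operating_characteristics`) are illustrative assumptions and do not come from the paper.

```python
import math


def down_probabilities(m, z):
    """Return [P_0, ..., P_m] from Equation 2A.

    The factor e^{-z} appears in numerator and denominator, so it cancels.
    """
    weights = [z ** (m - k) / math.factorial(m - k) for k in range(m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def server_utilization(m, z):
    """r_m(z) = 1 - P_0, the quantity plotted in Figure 1."""
    return 1.0 - down_probabilities(m, z)[0]


def operating_characteristics(m, t_a, t_s):
    """Apply Equations 1, 5, 6, 8 and 9 for m machines."""
    z = t_a / t_s                      # Equation 1
    r = server_utilization(m, z)
    t_b = m * t_s / r                  # Equation 5
    return {
        "z": z,
        "r": r,
        "T_b": t_b,
        "T_down": t_b - t_a,           # Equation 6: T_w + T_s
        "L_w": m - (z + 1.0) * r,      # Equation 8
        "L_q": m - z * r,              # Equation 9
    }
```

For m = 8 and z = 11 the computed utilization is about 0.617, close to the value .62 read from Figure 1 in the example that follows.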
The proportion of time that a machine spends in the
"down" state is found by dividing Equation 6 by $T_b$:
$$\text{Prob (machine k and only machine k is "down")} = \left(1 - \frac{z}{m}\, r_m(z)\right) \qquad (10)$$
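The step from Equation 6 to Equation 10 can be spelled out as follows. This is a short derivation sketch that uses only Equations 1, 5 and 6 and introduces no new symbols.

```latex
\frac{T_w + T_s}{T_b}
  = \frac{T_b - T_a}{T_b}
  = 1 - \frac{T_a}{T_b}
  = 1 - \frac{z\,T_s}{m\,T_s / r_m(z)}
  = 1 - \frac{z}{m}\, r_m(z)
```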
Example
Suppose that eight machines are tended by one
serviceman. The mean operating time of a machine is
380 seconds, and mean service time is 34.5 seconds.
Assume that both operating and service times are exponentially distributed. Determine the operating
characteristics of this system.
The service ratio is $z = T_a / T_s = 380\ \text{seconds} / 34.5\ \text{seconds} = 11$.
a. What is the serviceman's utilization?
Using Figure 1, with z = 11 and m = 8,
$r_m(z) = r_8(11) = .62 = 62\%$, which is the
serviceman's utilization.
b. What is the average number of "down" machines?
Using Equation 9, with m = 8, $r_8(11) = .62$,
and z = 11: $L_q = 8 - 11(.62) = 8 - 6.82 = 1.18$,
which is the average number of machines that are
"down" and located in the waiting line or in
service.
c. What is the average time a machine spends in
the "down" state?
Using Equation 6, with m = 8, $r_8(11) = .62$,
$T_a$ = 380 seconds, $T_s$ = 34.5 seconds:
$$(T_w + T_s) = \frac{m T_s}{r_8(11)} - T_a = \frac{8\,(34.5)\ \text{sec}}{.62} - 380\ \text{sec}$$
$(T_w + T_s)$ = 445 sec - 380 sec = 65 seconds,
which is the average time a machine is "down."
d. What fraction of the total time is a machine in
the "down" state?
Using Equation 10, with m = 8, $r_8(11) = .62$,
z = 11: Prob = fraction of total time = (1 - 11/8 (.62)) = .15 = 15%
Let us consider the same example again, the one we
have just used to determine operating characteristics.
To illustrate the practicality of the case of "Machine
Servicing With a Single Serviceman" let us transform
the example by considering the analogies we wish to
introduce.

| ITEMS FOUND IN ORIGINAL EXAMPLE | ANALOGOUS ITEM NOW REPLACING THE ORIGINAL |
|---|---|
| 8 machines, m = 8 | 8 independent sources for incoming data; m = 8 |
| 380 seconds = the mean operating time, $T_a$, per machine (time frame is immaterial) | 380 milliseconds = the average inter-arrival time, $T_a$, between characters coming from any one source of data |
| 34.5 seconds = the mean service time, $T_s$, per machine (time frame is immaterial) | 34.5 milliseconds = the average service time per character of input from any source of data, $T_s$ |
| one serviceman, n = 1 | one service facility, n = 1, required to service all eight sources of data (the video scan converter) |
| serviceman's utilization, $r_m(z)$ | utilization of video scan converter (servicer), $r_m(z)$ |
| down machines, $L_q$ | total characters waiting or being serviced in the system, $L_q$ |
| the average time a machine spends in the "down" state = $(T_w + T_s)$ | the average time a character spends waiting for and receiving service = $(T_w + T_s)$, response time |
| $T_b$ is the average time between "breakdowns" per machine, and is the sum of $T_a$, $T_w$ and $T_s$ | The average time interval between services of a specific queue; $1/T_b$ is the average number of queues serviced during this time interval |
| Prob (machine k is in the "down" state) = fraction of the total time a machine is in the "down" state | Prob (that any character in the system is waiting or is being serviced) = fraction of the total time any character in the system is waiting or is being serviced |
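The arithmetic of the worked example can be reproduced with a few lines of Python. This is only a sketch: the value r = 0.62 is the figure read from Figure 1 in the text, and the variable names are chosen here for illustration.

```python
# Inputs taken from the worked example in the text.
m = 8          # machines (or data sources)
z = 11         # service ratio, rounded as in the text
t_a = 380.0    # mean "up" time (seconds, or milliseconds in the analogy)
t_s = 34.5     # mean service time (same unit as t_a)
r = 0.62       # server utilization read from Figure 1

l_q = m - z * r                    # Equation 9: mean number "down"
t_down = m * t_s / r - t_a         # Equation 6: T_w + T_s
fraction_down = 1 - (z / m) * r    # Equation 10

print(f"L_q = {l_q:.2f}")
print(f"T_w + T_s = {t_down:.1f}")
print(f"fraction down = {fraction_down:.4f}")
```

Equation 6 evaluates to about 65.2 here; the text first rounds 276/.62 to 445 and so reports 65.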
Queueing Model for Scan Conversion
Transformed example continued
The service ratio is $z = T_a / T_s = 380\ \text{ms} / 34.5\ \text{ms} = 11$.
a. What is the average scan converter utilization?
Using Figure 1, with z = 11, m = 8:
$r_m(z) = r_8(11) = .62 = 62\%$ utilization
b. What is the average number of characters in
the system?
Using Equation 9, with m = 8, $r_8(11) = .62$,
z = 11: $L_q = m - z\, r_m(z) = 8 - 11(.62) = 8 - 6.82 = 1.18$ characters on the average are
in the system waiting or being serviced.
c. What is the average response time per character?
Using Equation 6, with m = 8, $r_8(11) = .62$,
$T_a$ = 380 ms, $T_s$ = 34.5 ms:
$$(T_w + T_s) = \left(\frac{m T_s}{r_8(11)} - T_a\right) = \left[\frac{8\,(34.5\ \text{ms})}{.62} - 380\ \text{ms}\right]$$
$(T_w + T_s)$ = [445 ms - 380 ms] = 65 ms,
average response per character
d. What fraction of the total time does a character
spend waiting for or being serviced?
Using Equation 10, with m = 8, $r_8(11) = .62$,
z = 11:
Fraction of total time = $(1 - (z/m)\, r_8(11))$ = (1 - 11/8 (.62)) = .15 = 15%
Extension and adaption of the general case "machine
interference" queueing model to permit multiple character updates per service cycle.
Consider now that we wish to adapt the single character update model to one which is capable of representing
a multiple character update. Specifically we mean the
ability to service "N" characters coming from the
same source and residing in the same queue in the same
34.5 millisecond service cycle. In effect, the service
time per character reduces to:
$$T_s' = \frac{T_s}{N} = \frac{34.5\ \text{milliseconds}}{N\ \text{characters}} \qquad (11)$$
We are especially interested in the response time,
$T_r$, since this provides a measure of "machine responsiveness" to a keyboard operator entering a character
stream into the system. $T_r$ is meant to be the average
response time per character, since the response time for
the first character will be longer than that of the last
character awaiting service from the same source.
As with our previous model, our service ratio is defined as:
$$z = \frac{T_a}{T_s} \qquad (12)$$
Also using Figure 1, the server utilization, $r_m(z)$,
can be found at the intersection of z and m, and its
value read from the "y" axis.
$$\text{Let } T_r^{*} = (T_w + T_s) = \left[\frac{m}{r_m(z)} - z\right] T_s \qquad (13)$$
During the time interval between services, $T_b$, the
number of characters which arrived at a specific queue
is:
(14)
Where N is the character contents of an individual
queue and is the average number of characters serviced
as a "batch".
Referring to Figure 3, in a typical multiple character
service there are N characters and N - 1 time intervals,
$T_a$, between the characters. $T_a$ is the average interarrival time of the incoming character stream. As the
wait time becomes longer, more characters arrive at
the individual queue, awaiting service simultaneously
with the first character in the queue. When the queue
is serviced, all characters residing in the queue at that
point in time are serviced in the same constant service
time of 34.5 milliseconds for all N of them. Note that
the service time per character has been effectively reduced to 34.5 ms/N characters.
The response time per character must reflect, how-
[Figure 3: character arrival position versus time; the sum of the individual waiting times, for the case N = 5]
Since the service time was subtracted out in Equation 15, it must be re-entered so that each and every
character in the "batch" is charged with $T_s$. Inserting
$T_s$ into Equation 17 gives the average response time
per character, $T_r$:
$$T_r = \frac{T_w + (T_w - T_a) + (T_w - 2T_a) + \cdots + (T_w - nT_a)}{N} + T_s \qquad (18)$$
Equation 18 is important since it is the mathematical
expression we originally set out to find. The reader
should note that $T_r$ is the Overall Average Response
Time per Character. The following example should be
of interest.
$$T_{sum} = [(760.5) + (447.5) + (134.5)] = 1{,}342.5\ \text{ms}$$
$T_r$ = 414.5 ms
As compared to the value obtained with a simulation
model, the following is the % difference:
$$\%\ \text{Difference} = \frac{(414.5 - 390)\ \text{ms}}{390\ \text{ms}} \times 100\% = +6.3\%\ \text{Difference}$$
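The summation of individual waiting times can be sketched in Python as below. It assumes, as the worked example suggests, that successive waits are $T_w$, $T_w - T_a$, $T_w - 2T_a$, and so on, and that only positive waits are counted; the function name `average_response_time` is hypothetical. The inputs are the m = 32 values quoted in Table 1 ($T_w$ = 760.5 ms, $T_a$ = 313 ms, $T_s$ = 34.5 ms, N = 3.54).

```python
def average_response_time(t_w, t_a, t_s, n_batch):
    """Sum the positive waits T_w - n*T_a, then apply T_r = sum/N + T_s.

    Assumption: characters arriving later in the batch wait T_a less than
    their predecessor, and terms that would be negative are dropped.
    """
    waits = []
    wait = t_w
    while wait > 0:
        waits.append(wait)
        wait -= t_a
    total_wait = sum(waits)
    return waits, total_wait, total_wait / n_batch + t_s


waits, total_wait, t_r = average_response_time(760.5, 313.0, 34.5, 3.54)
percent_difference = (414.5 - 390.0) / 390.0 * 100.0   # values from the text
```

Evaluated this way, `total_wait` is 1,342.5 ms as in the text, while `t_r` comes out near 413.7 ms; the tabulated 414.5 ms follows from the rounded table entry of 380.0 ms for the quotient.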
Presentation of results with comparison to a
simulation model
Table 1 and Figure 4 show the computed and simulated values from the queueing model described in
an earlier section and a simulation (GPSS) model respectively. The purpose of simulating the video scan converter was to establish validity of the queueing model
results, within a range of + or - 20%.
The argument is valid that error in modeling can
exist in:
a. The queueing model
b. The simulation model
c. Both models
Of paramount importance, however, is the underlying principle that the probability is least that both
models will be in error. As a general rule for confirming
validity:
a. Values from both models should be in the same
"ball park."
b. Output values from both should increase
or decrease as independent variables are changed
by like amounts.
c. Both models should produce approximately the same slope of values.
d. There should be a reasonable division of positive and
negative % "differences" over the range of the
model's output.
Figure 4-Attached video displays with keyboard, m
Using this as criteria to determine validity, the following is my evaluation of the results:
a. The values differed by 10% maximum (at
m = 16, the queueing model value for $T_r$ =
162.0 milliseconds versus $T_r$ = 180 milliseconds
for the simulation model).
b. Over the complete range of m = 0 thru m =
32, the values of $T_r$ from both models increased
as m was increased a like amount.
c. The slope of $T_r$ values from both models differed
over the range of m = 0 thru m = 32. They
were:

| Range | Queueing Model | Simulation Model |
|---|---|---|
| m = 0 thru 8 | Slope = +4.77 | Slope = +5.06 |
| m = 8 thru 16 | Slope = +11.16 | Slope = +13.12 |
| m = 16 thru 32 | Slope = +15.78 | Slope = +13.12 |
| Totals | +31.71 | +31.30 |

Even though the slopes are somewhat different,
they are not appreciably so. It appears the
queueing model does not take into account some
factor at the low end of m and somewhat overcompensates at the high end of m.
d. As shown in Table 1, the maximum negative
difference is -10.0%, the maximum positive
difference is +6.3%. This is calculated as follows:
$$\%\ \text{Difference} = \frac{(\text{queueing value, } T_r - \text{simulated, } T_r) \times 100\%}{(\text{simulated, } T_r)} \qquad (19)$$
There appears to be a reasonable division between positive and negative % differences.
TABLE I-Computation of values for Figure 4: Overall average response time per character, $T_r$

| Quantity | Equation | m = 1 | m = 4 | m = 8 | m = 16 | m = 32 |
|---|---|---|---|---|---|---|
| $r_m(z)$ | Fig. 1 | .10 | .43 | .77 | .995 | 1.0 |
| $T_r^{*}$ | 13 | 34.5 ms | 45.0 ms | 83.0 ms | 276.0 ms | 795.0 ms |
| N | 14 | 1.110 | 1.146 | 1.270 | 1.893 | 3.540 |
| $T_w$ | 15 | 0.00 ms | 10.5 ms | 48.5 ms | 241.5 ms | 760.5 ms |
| $\Sigma T_w$ | 16 | 0.00 ms | 10.5 ms | 48.5 ms | 241.5 ms | 1,342.5 ms |
| $\Sigma T_w / N$ | 17 | 0.00 ms | 9.15 ms | 38.2 ms | 127.5 ms | 380.0 ms |
| $T_r = \Sigma T_w / N + T_s$ | 18 | 34.5 ms | 43.65 ms | 72.7 ms | 162.0 ms | 414.5 ms |
| $T_r$ (SIM. MODEL) | | 34.5 ms | 46.00 ms | 75.0 ms | 180.0 ms | 390.0 ms |
| DIFFERENCE | | 0.00 ms | -2.35 ms | -2.3 ms | -18.0 ms | +24.5 ms |
| % DIFFERENCE | 19 | 0.00% | -5.1% | -3.07% | -10.0% | +6.3% |

$z = T_a / T_s = 278.0\ \text{ms} / 34.5\ \text{ms} = 8$ for m = 1, 4, 8, 16; $z = 313.0\ \text{ms} / 34.5\ \text{ms} = 9$ for m = 32
CONCLUSIONS
The queueing model as presented, in my opinion,
provides a very satisfactory mathematical representation of a video scan converter and better than originally
anticipated.
ACKNOWLEDGMENTS
The author wishes to thank Mr. P. H. Seaman for
his help in the form of technical discussion and evaluation; Mr. T. G. Greene for his cooperation in developing
the simulation model; Mr. V. L. Hoberecht for his consultation and advice regarding video display systems;
and Mr. D. H. Rumble for his encouragement.
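The slope and % difference figures quoted above can be recomputed from the $T_r$ rows of Table 1. The sketch below assumes, as the quoted slopes imply, that the m = 1 value is used as the starting point of the "m = 0 thru 8" range; the variable and function names are illustrative.

```python
# T_r values (ms) from Table 1 at the range boundaries m = 0(1), 8, 16, 32.
boundaries = [0, 8, 16, 32]
queueing_at_boundaries = [34.5, 72.7, 162.0, 414.5]
simulated_at_boundaries = [34.5, 75.0, 180.0, 390.0]


def slopes(values, m_points):
    """Change in T_r per added display over each consecutive range."""
    return [
        (values[i + 1] - values[i]) / (m_points[i + 1] - m_points[i])
        for i in range(len(values) - 1)
    ]


def percent_difference(queueing_value, simulated_value):
    """Equation 19."""
    return (queueing_value - simulated_value) / simulated_value * 100.0


queueing_slopes = slopes(queueing_at_boundaries, boundaries)
simulated_slopes = slopes(simulated_at_boundaries, boundaries)

# Full T_r rows of Table 1 for m = 1, 4, 8, 16, 32.
queueing_tr = [34.5, 43.65, 72.7, 162.0, 414.5]
simulated_tr = [34.5, 46.0, 75.0, 180.0, 390.0]
differences = [percent_difference(q, s)
               for q, s in zip(queueing_tr, simulated_tr)]
```

The three slopes of each model sum to about 31.7 and 31.3, which is the basis for the remark that the overall trends agree even though the individual ranges differ.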
REFERENCES
1 W FELLER
An introduction to probability theory and its applications
John Wiley and Sons N Y 1957 2nd ed 416-418
2 P H SEAMAN
Analysis of some queueing models in real time systems
IBM Data Processing Techniques Manual F20-0007-0
39-42 47-48 (Note: The Poisson ratio function $r_m(z)$ vs.
service ratio $z = T_a/T_s$ was obtained from this IBM
publication.)
3 W FELLER
An introduction to probability theory and its applications
John Wiley and Sons N Y 1965 3rd ed 460-468
4 J MARTIN
Design of real-time computer systems
Prentice-Hall Inc Englewood Cliffs N J 1967 396-426
5 D R COX W L SMITH
Queues
John Wiley and Sons N Y 1961 general reference to various classes of queueing problems
Character generation from resistive
storage of time derivatives
by MICHAEL L. DERTOUZOS
Massachusetts Institute of Technology
Cambridge, Massachusetts
INTRODUCTION
Recent advances in man-machine communication
have stimulated increased interest in techniques and
special circuits that generate characters, for graphical
and alphanumeric Cathode-Ray-Tube (CRT) display
terminals, at the display site. The primary advantage
in employing such local character generation is compression of the data that is required to store and communicate a character from the computer to the display: a single binary word of length n is all that is
required to instruct the character generator to display one of $2^n$ possible characters. The primary disadvantage of local character generation is display
cost, for it is generally considerably less expensive
to generate characters from a longer sequence of more
elementary commands, for example commands that
cause the CRT beam to move right, left, up or down
by a minimum resolvable increment. Besides these
conflicting costs of data storage and transmission
versus local-display generation, several other less
tangible criteria, such as character stability and fidelity (aesthetics), are instrumental in the design and
evaluation of a local character-generation approach.
This paper discusses a character-generation technique which requires, for each character, the storage
in a resistive memory of the time derivative functions
for the horizontal and vertical CRT deflection signals.
The first section of the paper describes specific geometrical primitive segments that can compose a large class
of characters and symbols; the choice of such primitives is important, since it affects directly the quality
of the displayed characters and the display cost. Also
given in this section is a complete list of primitive
sequences for the 94-character ASC-II set. The second
section of the paper describes a character-generation
system that stores the above primitives in a resistor
matrix, and uses them to compose desired characters
on a CRT display. In the third section, this approach
is evaluated and compared to more conventional
methods of dot intensification, in terms of cost, speed,
and fidelity.
Character primitives
Characters and symbols, generated on CRT displays, are made up of certain elementary graphical
segments. Character primitives over a character set
will be called those segments which are (i) atomic or
indivisible to smaller segments, and (ii) sufficient in
number and quality to compose within acceptable
accuracy every character in that set. At one extreme,
the points of a uniformly spaced grid are adequate
character primitives (Figure 1a); however, as the
number of these points is reduced (Figures 1b and c),
it becomes progressively more difficult to recognize
the displayed characters. At the other extreme, the
set of all characters may be considered itself as a set
of character primitives. This set, however, is not very
useful, for while it is generally easy to construct a
system capable of implementing the primitives of
Figure 1, it is considerably more difficult and expensive to implement the primitives at the other extreme.
Conversely, it takes only seven bits to specify one
of the 94 characters of the ASC-II set, while it takes
49 bits to specify every one of the possible subsets of
dots of Figure 1b. These simple observations on the
above two extremes are characteristic of the problems
of character generation and of the objectives in the
design of an effective character generator, that is,
the desirability for a small number of primitives which
can be economically implemented.
Figure 1-Points as character primitives: (a) 169 points, (b) 49 points, (c) 16 points
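The bit counts quoted for the two extremes can be checked with a small Python sketch. The helper name `bits_to_select` is an assumption made for illustration.

```python
def bits_to_select(n_items):
    """Smallest word length n such that 2**n >= n_items."""
    return (n_items - 1).bit_length()


character_code_bits = bits_to_select(94)   # one code per whole character
dot_pattern_bits = 49                      # one on/off bit per dot of a 49-point grid
dot_patterns = 2 ** dot_pattern_bits       # number of possible subsets of dots
```

Seven bits select one whole character, whereas the 49-point grid needs one bit per dot, which is the storage trade-off the paragraph above describes.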
The primitives used in the character generation
technique of this paper are continuous strokes which
are either (i) straight lines or (ii) so-called "cusps".
A straight-line primitive is specified relative to a
point P by increments $\Delta x$, $\Delta y$ which are real numbers;
in our notation each such primitive is denoted, when
visible, by $(\Delta x, \Delta y)$ or, when invisible, by an underscore $\underline{(\Delta x, \Delta y)}$. Figure 2a shows two such primitives.
The equation of primitive $(\Delta x, \Delta y)$ is relative to a
coordinate center at point P as follows:
$$\frac{y}{\Delta y} = \frac{x}{\Delta x} \quad \text{for } 0 \le \frac{y}{\Delta y} \le 1,\; 0 \le \frac{x}{\Delta x} \le 1 \qquad (1)$$
where x and y are the horizontal and vertical coordinates
of every point on that primitive.
The cusp primitive, on the other hand, is specified
relative to a point P by increments $\Delta x$, $\Delta y$, which are
real; moreover, one of these increments is overscored,
and is called the cusp increment; that is, either $(\overline{\Delta x}, \Delta y)$
or $(\Delta x, \overline{\Delta y})$ are valid cusp primitive notations.
Geometrically, a cusp primitive is, as shown in Figure
2b, contained in a rectangle of dimensions $\Delta x$, $\Delta y$;
the curved segment corresponding to the overscored
increment is obtained by dividing the other increment
into three equal parts, fitting a straight line in the
middle section and a parabola in each of the other two
sections so that the parabolas are tangent to the above
straight line. More precisely, the cusp $(\Delta x, \overline{\Delta y})$, shown
normalized in Figure 2c, is given, relative to a coordinate center at point P, by
$$\text{In Region I } \left(0 \le \frac{x}{\Delta x} \le \frac{1}{3}\right):\quad \frac{y}{\Delta y} = 1 - \left(1 - 3\frac{x}{\Delta x}\right)^2 \qquad (a)$$
$$\text{In Region II } \left(\frac{1}{3} \le \frac{x}{\Delta x} \le \frac{2}{3}\right):\quad \frac{y}{\Delta y} = 1 \qquad (b) \qquad (2)$$
$$\text{In Region III } \left(\frac{2}{3} \le \frac{x}{\Delta x} \le 1\right):\quad \frac{y}{\Delta y} = 1 - \left(3\frac{x}{\Delta x} - 2\right)^2 \qquad (c)$$
The cusp $(\overline{\Delta x}, \Delta y)$ is obtained from Equations (2) by
interchanging literal x with literal y everywhere in
these equations. A cusp is always visible. These apparently mysterious primitives are justified on two
counts: (i) ability to represent a large class of characters and symbols with a small number of primitives,
as discussed immediately below and (ii) ease of implementation, as discussed in the following section.
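The three-region cusp of Equations (2) is easy to evaluate numerically. The following Python sketch covers the cusp whose overscored increment is $\Delta y$; the function name `cusp_y` is hypothetical and the code is only meant to illustrate the shape (rise, flat top, fall).

```python
def cusp_y(x, dx, dy):
    """Vertical coordinate of the cusp with overscored increment dy.

    x is measured from the starting point P; u = x/dx runs from 0 to 1.
    """
    u = x / dx
    if not 0.0 <= u <= 1.0:
        raise ValueError("x lies outside the primitive")
    if u <= 1.0 / 3.0:                       # Region I: rising parabola
        return dy * (1.0 - (1.0 - 3.0 * u) ** 2)
    if u <= 2.0 / 3.0:                       # Region II: straight top
        return dy
    return dy * (1.0 - (3.0 * u - 2.0) ** 2)  # Region III: falling parabola


# Sample the normalized cusp (dx = dy = 1) at a few points.
samples = [(u / 6.0, cusp_y(u / 6.0, 1.0, 1.0)) for u in range(7)]
```

The curve starts and ends at height zero and is continuous at the region boundaries, so the net displacement along the overscored axis is zero.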
A character or symbol is composed from a sequence
of these two types of primitives; here the first primitive
is specified relative to the lower left corner of the
character field, and each subsequent primitive is
specified relative to the terminating point of the
preceding primitive. For example, capital letter A
is formed in Figure 3a by the primitive sequence
$$S_A = (.45, 1.2)\;(.45, -1.2)\;\underline{(-.788, .3)}\;(.676, 0)$$
Observe that the first segment is a visible straight
primitive which starts at the lower left corner and
terminates at the point [.45, 1.2]. The second segment
is again a visible straight primitive, which starts at
point [.45, 1.2] and terminates .45 units to the right
and 1.2 units below that point. Observe further that
the third segment is invisible, and that the direction
and order in the sequence of each primitive is shown
adjacent to each segment in Figure 3a. Capital letter
P of Figure 3b is formed by the primitive sequence
$$S_P = (0, 1.2)\;(.4, 0)\;\underline{(-.4, -.5)}\;(.4, 0)\;(\overline{.2}, .5)$$
Here, the first four primitives are straight with the
third primitive invisible. The fifth primitive, however,
is a cusp which starts at the point [.4, .7] and ends at
the point [.4, 1.2].
Figure 2-Straight/cusp primitives
Figure 3-Character composition by straight/cusp
primitives
Figure 4 shows the primitive sequences corresponding
to all 94 alphanumeric characters and symbols of the
ASC-II code. This table is arranged exactly as the
table of the ASC-II code for reference purposes. Some
statistics of interest here are as follows:
1. The average number of primitive segments per
character is 4.43.
2. The maximum number of primitives per character is eight.
3. The total number of different magnitudes for
the primitive increments is 13.
4. No character uses more than two cusp primitives; these primitives occur (intentionally) either
at the fifth, at the seventh, or at both the fifth
and seventh segments of that character's primitive
sequence.*
* Or they can be made to occur at these segment positions by
introducing primitives (0,0) anywhere in the sequence.
Figure 4-Straight/cusp primitive sequences for 94-character ASC-II set
Of the above observations, 1, 2, and 3 indicate that
a relatively small number of primitives can form a
relatively large class of symbols. The fourth as well
as the other observations above will be used in the
following section in connection with the implementation of this character generation technique.
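The chaining rule for primitives (each one starts where the previous one ended) can be traced with a short Python sketch. The tuple encoding and the kind labels `"line"`, `"hidden"` and `"cusp_x"` are assumptions made for this illustration, not notation from the paper; a cusp whose overscored increment is x is taken to return to its starting x, consistent with the end points quoted for letter P.

```python
def trace(sequence, start=(0.0, 0.0)):
    """Return the end point reached after each primitive in the sequence."""
    x, y = start
    end_points = []
    for dx, dy, kind in sequence:
        if kind == "cusp_x":      # overscored dx: bulges in x, net move is dy only
            y += dy
        elif kind == "cusp_y":    # overscored dy: bulges in y, net move is dx only
            x += dx
        else:                     # "line" (visible) or "hidden" (invisible) stroke
            x += dx
            y += dy
        end_points.append((round(x, 3), round(y, 3)))
    return end_points


S_A = [(0.45, 1.2, "line"), (0.45, -1.2, "line"),
       (-0.788, 0.3, "hidden"), (0.676, 0.0, "line")]
S_P = [(0.0, 1.2, "line"), (0.4, 0.0, "line"),
       (-0.4, -0.5, "hidden"), (0.4, 0.0, "line"),
       (0.2, 0.5, "cusp_x")]

points_a = trace(S_A)
points_p = trace(S_P)
```

Tracing S_A places the crossbar of the A between [.112, .3] and [.788, .3], and tracing S_P ends the cusp at [.4, 1.2], as stated in the text.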
The character generator
A local character generator for a CRT display is
generally a system (Figure 5) with input a seven-bit
word, denoting a character, and output two deflection
and one beam-intensification waveforms (functions
of time), which when applied to the CRT deflection
and beam controls, respectively, display that character
relative to beam position, $x_p$ and $y_p$. Character and
line spacing is usually accomplished by a control unit
external to the generator, which varies $x_p$ and $y_p$ upon
completion of each character and line, respectively.
If the CRT display module is of the refresh type, then
the codes of characters to be displayed are stored in
a local storage medium, usually a delay line, and are
presented periodically, usually every 1/30 to 1/40
sec, to the character generator. If the CRT display
module is of the storage type, then the character
generator generates the waveforms x, y and b only
once for each character to be displayed, and the corresponding character is stored on the screen of the CRT.
Figure 5-Local character generator
Any given character primitive y = f(x) can be
generated by such a system in an infinite number of
ways, since for every one of many possible choices
for a horizontal deflection waveform x(t), where t
is time, there is always a vertical deflection waveform
y(t) = f(x(t)) which, when applied simultaneously with
x(t), causes the CRT beam to trace the primitive y =
f(x). Two particular types of waveforms, s(t) and
c(t), were chosen to implement the primitives of the
preceding section; they are shown in Figure 6a, and
their time derivatives in Figure 6b.
A straight-line primitive about any point is generated
by applying waveform s(t), appropriately scaled, to
both the horizontal and vertical axes. Thus, setting
$$x(t) = \Delta x\, s(t) + x_1 \qquad (a)$$
$$y(t) = \Delta y\, s(t) + y_1 \qquad (b) \qquad (3)$$
and shown in Figure 6c. This is the desired primitive
of Equation (1).
A cusp primitive about any point is generated by
applying waveform s(t) to one axis and waveform c(t)
to the other, after each waveform has been appropriately scaled. Figure 6d shows the resulting segment
when s(t) is applied to the horizontal axis, and c(t)
to the vertical axis, and Figure 6e shows a segment
obtained with different scaling and interchange of the
two waveforms. More generally, setting
$$x(t) = \Delta x\, s(t) + x_1 \qquad (a)$$
$$y(t) = \Delta y\, c(t) + y_1 \qquad (b) \qquad (5)$$
where $\Delta x$ and $\Delta y$ are real numbers, yields a cusp
primitive about point $[x_1, y_1]$ described as follows:
$$\text{for } 0 \le \frac{x - x_1}{\Delta x} \le \frac{1}{3}:\quad y = y_1 + \Delta y\left[1 - \left(1 - 3\,\frac{x - x_1}{\Delta x}\right)^2\right] \qquad (a)$$
$$\text{for } \frac{1}{3} \le \frac{x - x_1}{\Delta x} \le \frac{2}{3}:\quad y = y_1 + \Delta y \qquad (b) \qquad (6)$$
$$\text{for } \frac{2}{3} \le \frac{x - x_1}{\Delta x} \le 1:\quad y = y_1 + \Delta y\left[1 - \left(3\,\frac{x - x_1}{\Delta x} - 2\right)^2\right] \qquad (c)$$
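That Equations (3) trace a straight segment can be seen by eliminating s(t). The sketch below assumes only that s(t) runs from 0 to 1 over the stroke; the exact shape of s(t) is given in Figure 6a, which is not reproduced here, so this range is an assumption.

```latex
s(t) = \frac{x(t) - x_1}{\Delta x} = \frac{y(t) - y_1}{\Delta y}
\quad\Longrightarrow\quad
\frac{y - y_1}{\Delta y} = \frac{x - x_1}{\Delta x},
\qquad 0 \le \frac{x - x_1}{\Delta x} \le 1
```

With $[x_1, y_1]$ taken as the coordinate center P, this has the same form as Equation (1), whatever the time course of s(t).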
Figure 1-Schematic diagram of the character generator
before the succeeding projection of a different character grid is made. Accordingly, the erasing scan of the
vidicon face should precede each reading scan. However, if an erasing scan is applied to the entire face of
the vidicon, a certain time would be wasted prior to
each reading scan. In our pattern generating system the
erasing scan or prescanning is restricted to the area
where the reading scan is to follow. The remaining area
is occupied with residual image.
Character generating unit
Figure 1 shows a schematic diagram of the character
generator designed for Kanji. Characters and symbols
are printed in a 16 by 16 matrix form on each of four
character grids. Four miniature flash lamps, whose
light emission timing is determined by a control circuit, are used to project the real image of the four
character grids. When one of the flash lamps is selectively energized, all the patterns printed on the corresponding character grid are projected on the full
effective area of the target face of the vidicon (type
8572) by means of a half mirror and a lens having a reduction ratio of 1/2. The F-number of the lens is 5.6.
Generating cycle
The vidicon consists of a highly evacuated envelope
containing an electron gun at one end and a transparent
optical flat target face at the other (Figure 2). A transparent conductive layer is deposited on the inner surface of the target face as a signal plate. A photoconductive film is deposited on this layer so as to form
capacitors. At the site of electron impact, the surface
of the photoconductive layer acquires a negative charge from the
electrons. When no light falls on the photoconductive
layer, its surface is maintained at the cathode (ground)
potential by electron beam scanning because the layer
is a good insulator. When a pattern is projected, conduction increases in the bright areas. The bright part
of the pattern enhances the leakage current through
the layer and lets the capacitors discharge during exposure. The reading scan which follows the exposure
restores the negative charge, and the current for the
restoration produces a video signal across the target
resistor.
Figure 2-Schematic cross-section of the vidicon
Generation of a character is accomplished by the
following sequential steps:
1. Prescanning the area where the desired character is to be projected.
2. Flashing a xenon lamp in order to project and
store the character image on the vidicon target.
3. Scanning the area of a particular character in
order to pick up the video signal.
In the prescanning step, the deflection yoke moves
the electron beam to the position on the vidicon target where the character is to be projected, and lets
the beam form a small raster throughout the area to
cover the image of the character. The raster size is 0.7 mm
square (about 1/250 of the full effective area of the vidicon
target). It takes 1.5 ms to erase completely the residual image stored by preceding flashes.
Two factors are specified for the image persistence,
viz., the transient response of the photoconductive material, and the time lag which results from incomplete
charging of the target, which has a large capacitance, by a scanning beam of low landing efficiency.
Generally the photoconductive decay time constant
is very short, of the order of one ms. On the other hand,
the capacitive lag makes the predominant contribution
to the image persistence (of 10 ms) in the standard
TV application. Since the total target capacitance is
proportional to the size of the raster, there is no significant
capacitance in the present application, where
about 1/250 of the total surface is used. The localized
scanning reduces the resultant image persistence time
from the 10 ms of the TV application to about one millisecond.
Economical Display Generation of Large Character Set
The flashing illumination just after the prescanning
lasts only 5 μs. Each miniature xenon flash lamp
is energized by the discharge of a capacitor, which is
triggered by a selection pulse. Although more than two
hundred characters are projected on the target face
of the vidicon, only one of them is exactly stored on the
vidicon target, because only the corresponding part of the
vidicon target has been prescanned.
The last step is the reading scan. The deflection of the
scanning beam during this step is the same as that of
the prescanning. However, the video signal on the output of the vidicon is taken out through a video gate
circuit.
Figure 3 illustrates a portion of a real image. It is
projected from a character grid onto the target face
of the vidicon. Owing to the image persistence of the
vidicon target, the image focused persists for a certain
period even if the projection is executed for a very
short time. The prescanning and read scanning of
a particular area are accomplished by the X and Y deflection yoke. The prescanning and read scanning modes
are illustrated in Figure 3. Examples of prescanning
and read scanning are shown by the lines superimposed
on the letter 3.
The deflection of the scanning beam to any position
on the vidicon target can be accomplished in 5 μs.
Linearity and stability of the deflection amplifier are
approximately 0.1 percent. The bandwidth of the deflection
amplifier is 500 kHz.
Processing of video signal
The video signal output of the vidicon is amplified
and converted to a two-level signal by a video-processing circuit.
As the aperture of the scanning electron beam is not
very sharp, the video signal contains an intermediate
level notwithstanding the fact that the character
grid has only two levels, black and white. Considerable
variations in both modulation depth and dc level occur in the video signal depending upon detailed patterns
of projected characters. Shading of the vidicon also
causes variations. Thus a simple clipping circuit with a constant clipping level cannot be used.
Figure 4 shows a block diagram of the video processing circuit. The increment of the video signal is
detected in the differentiation circuit, which consists
of a 0.5 μs delay line and an integrated differential
amplifier μPC53. This circuit eliminates the dc-level
shift from the video signal and sends trigger pulses to
a flip-flop which converts the video signal to a two-level signal.
The control circuit in Figure 1 decodes a pattern-representing binary signal, selectively energizes one
of the flash lamps, and controls the prescanning and
read scanning in the vidicon so that the desired one
of the projected patterns is scanned.
Perpendicularity and residual magnetism of the deflection yoke, and pin cushion or barrel distortion of
the vidicon are the other factors influencing the positioning accuracy of the electron-beam deflection. The
pin cushion distortion of the vidicon diminishes the
accuracy considerably. Four small magnets each of
the size 2 X 2 X 3 mm placed close to the vidicon
target correct the distortion. Residual magnetism of
each magnet is about 2000 G along the longitudinal
axis. Positions of magnets are adjusted by means of
screws. In the present system the overall error of beam
positioning is kept within 0.5 percent of full deflection.
This is sufficient because the projected characters are
larger than those in the flying spot system.
System operation
Figure 3-Figure of the image of a character grid
projected on the face of the vidicon
Figure 4-Block diagram of the video processing circuit
Fall Joint Computer Conference, 1969
Two significant bits of the character-representing
signal are decoded to energize one of the four flash lamps, selecting one
character grid out of the four. The remaining eight bits are supplied to the X and Y direction D-A converters in the deflection circuits.
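The split of the character-representing signal described above can be sketched as follows. This is only an illustration: the text gives two bits for grid selection and eight bits for the X and Y D-A converters, while the even four-plus-four split between X and Y (suggested by the 16 by 16 matrix), the bit ordering, and the function name are assumptions.

```python
def decode_character_code(code: int) -> tuple[int, int, int]:
    """Split a 10-bit character code into (grid, x, y).

    Assumed layout (not stated in the paper): the two most significant
    bits select the grid, the next four bits the X position, and the
    last four bits the Y position within the 16 by 16 matrix.
    """
    if not 0 <= code < 1024:
        raise ValueError("code must fit in 10 bits (1024 fonts)")
    grid = (code >> 8) & 0b11   # selects one of the four flash lamps
    x = (code >> 4) & 0b1111    # assumed input to the X D-A converter
    y = code & 0b1111           # assumed input to the Y D-A converter
    return grid, x, y
```

With four grids of 16 by 16 characters each, every one of the 1024 codes maps to a distinct (grid, x, y) triple.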
There are two saw-tooth waveform generators in
the control circuit, one for X-scanning and the other
for Y-scanning. The repetition frequency of Y-scanning is 20 kHz while that of X-scanning is 0.67 kHz.
The ratio of these frequencies is determined by the
number of scanning lines for one character. In the
present system, each character is scanned by 30 vertical lines. The scanning signals from the saw-tooth
wave generators are respectively added to the character
selection signals of the X- and Y-axes, which are supplied
from the D-A converters. Figure 5 shows a block diagram of the deflection circuit.
The control circuit produces the gate pulse for the
video signal as soon as the read-scanning starts. Synchronizing pulses for X- and Y-axes are available from
the control circuit for the reconstruction of the character images either at a display or at a printer unit.
Figure 6-The optical unit of character generator
Operating characteristics.
The optical unit of the character generator is shown in
Figure 6. The size of this unit is 500 mm wide, 600 mm
long and 150 mm high. The weight is 20 kilograms. Almost all the electronic circuits are constructed with ICs.
The quality of the characters generated by the
present 1024 font capacity system is sufficiently high.
Figure 7 shows an example of Japanese sentences displayed on a CRT display unit. The storage CRT, which
needs no costly memory devices for the refreshment
of the information, is suitable for this application.
Figure 8 shows an example of a page printed
by a fiber optics CRT unit.
The generating speed of the present system is 330
characters per second. The machine speed is restricted
by the persistent lag in the vidicon. In order to decrease
the time for erasing, a new photoconductive layer for
the vidicon is required.
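The quoted speed can be checked against the timing figures given earlier in the paper. The sketch below assumes, for illustration only, that one character cycle is simply the sum of beam deflection, prescan, flash, and reading scan, and that the reading scan takes as long as the prescan (the text says only that the two deflections are the same); the constant and function names are hypothetical.

```python
Y_SCAN_HZ = 20_000          # Y-scanning repetition frequency (20 kHz)
LINES_PER_CHARACTER = 30    # vertical lines per character
DEFLECTION_S = 5e-6         # time to deflect the beam to a new position
FLASH_S = 5e-6              # duration of the xenon flash


def scan_time_s() -> float:
    """Time for one 30-line raster over a single character."""
    return LINES_PER_CHARACTER / Y_SCAN_HZ


def characters_per_second() -> float:
    """Rate if a cycle is deflection + prescan + flash + read scan (assumed)."""
    cycle = DEFLECTION_S + scan_time_s() + FLASH_S + scan_time_s()
    return 1.0 / cycle
```

Under these assumptions one raster takes 1.5 ms, which matches the stated erasing time, and the resulting rate is a little over 330 characters per second.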
Figure 7-Displayed Japanese sentences on CRT
The reliability of the vidicon operated under the unusual
condition of selective scanning on the target was investigated by a running test of about 1000 hours;
no noticeable change was observed.
CONCLUSION
Figure 5-Block diagram of the deflection circuit
It has been confirmed that the new opto-electronic
character generating system with 1024 font capacity
has many advantages such as high font capacity, high
speed, high quality, low cost and small size. The advantages have been achieved by utilizing four character grids and one single vidicon. Each character grid
The generation speed of 330 characters per second
was realized with the new character generating system.
Excellent stability was confirmed for a long period of
operation.
ACKNOWLEDGMENT
<instruction> ::= <operation part> <left operand part> <right operand part> <result part> <condition part> <if part> <else part>
The <operation part> of an instruction in GIS
may be one of a list of 36 operations including
add, subtract, multiply, divide, compare, branch,
shift, move, logical operations, and others.
The <left operand part>, <right operand part>,
<result part>, <if part>, and <else part> are
<addresses>.
An <address>, in turn, consists of many parts including displacement information, indexing, indirect
addressing, bits to distinguish between references to
various types of memory such as main memory or
register memory, and other special techniques for
specifying memory locations.
Each part of an instruction has an interpretation.
The right and left operand parts specify operands
which are to participate in the operation. The <result
part> specifies an address where the result of an operation is to be stored. The <condition part> specifies
some internal condition which may be set as the result
of the operation. The <if part> specifies the address
of the next instruction provided that the internal condition is satisfied, and the <else part> specifies the
address of the next instruction if the internal condition is not satisfied.
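The seven parts of a GIS instruction and the role of the if and else parts can be pictured with a small record type. This is a sketch only: the class and field names are hypothetical, and addresses are reduced to plain integers even though the text describes an address as having many parts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GISInstruction:
    """One GIS instruction; addresses are simplified to integers (assumption)."""
    operation: str       # e.g. "add", "compare", "branch"
    left_operand: int    # address of the left operand
    right_operand: int   # address of the right operand
    result: int          # address where the result is stored
    condition: str       # internal condition set by the operation
    if_part: int         # next instruction if the condition is satisfied
    else_part: int       # next instruction otherwise


def next_instruction_address(instr: GISInstruction, condition_satisfied: bool) -> int:
    """Select the successor address from the if and else parts."""
    return instr.if_part if condition_satisfied else instr.else_part
```

In a conventional single-address machine, both `if_part` and `else_part` would implicitly hold the address of the next instruction in memory for every non-branching operation.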
In most instruction sets, some of the GIS parts
have implicit values. For example, in a single-address
instruction format one of the operand addresses is always assumed to refer to the accumulator. The same
is true of the result address. The if and else instruction
addresses are assumed to refer to the next instruction
in memory. To completely specify an instruction set
by means of GIS, it is necessary to indicate whether
each instruction part is implicit or explicit. The assumed value must also be specified for implicit parts
while, for explicit parts, the parts of the instruction
format used to encode the value of the part must be
precisely specified.
GIS can be used to represent almost any instruction
format in use in existing computers. From a syntactic
point of view its primary limitation is its list of operations, which is necessarily restrictive since some operations in actual computers deal with special features
and cannot be generalized. From a semantic point of
view, GIS is not capable of all the subtle nuances assigned to certain instructions in some computers. For
example, GIS makes no distinction between post-indexing and pre-indexing. In most cases, however,
these subtleties have little effect on the design of the
syntax of the instruction language which is of primary
concern.
The most important attribute of GIS so far as the
design program is concerned is that it is a design concept for instruction sets which it appears to represent
at an appropriate level.
GIS meets the requirement of generality because
it contains all the important addressing methods as
special cases. It can be used to represent single-address instructions, double- or triple-address instructions, memory-to-register instructions, and register-register operations, as well as others.
Another requirement is that a program using GIS
as its model of an instruction set should be able, without a great deal of effort, to generate instruction sets
that are plausible solutions to a design problem. GIS
possesses this feature in the sense that any instance of
the GIS model is indeed a valid instruction set.
ISDS: The design program
The first step in the construction of ISDS was the
selection of a method for storing GIS representations
of instruction sets in the memory of a computer. The
Backus Normal Form representation of GIS suggests
a tree-like data structure. The structure actually used,
called a "form-variable", is an IPL-V (Information
Processing Language-V) list structure containing
each instruction part identified by name and an attribute-value description list for each part to store important information about the part (whether it is implicit,
whether the specification is a list of possible values or
the number of bits needed to encode the item, and
other descriptive information).
All of the programs of ISDS are written in IPL-V,
the primary reason being that IPL-V contains instructions for manipulating the tree-like data structure that
is most appropriate for representing GIS instruction
sets in the memory of a computer. However, the form-variable is a slightly more specialized data structure
than the IPL-V list structure. Hence it was necessary
to write a set of programs for manipulating form-variables.
Figure 1-Hierarchy of routines and data in ISDS
These form-variable routines add items to form-variables, delete items, search for items, find attribute
values on item description lists, and insert and delete
attribute values on item description lists. The form-variable is a recursive data structure since an item may
be a single value, a list of values or another form-variable.
The form-variables of ISDS are at the lowest level of a
hierarchy of routines (see Figure 1*) and are the building
blocks of other routines in the sense that the higher-level routines make use of them to store new items in
an instruction set, search for an item, and so on.
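The bookkeeping role of the form-variable routines can be illustrated with a small recursive container. The sketch below uses Python dictionaries in place of IPL-V list structures; the class and method names are hypothetical and do not come from ISDS.

```python
class FormVariable:
    """Named items, each with an attribute-value description list (sketch)."""

    def __init__(self):
        self._items = {}   # item name -> single value, list, or FormVariable
        self._attrs = {}   # item name -> dict of descriptive attributes

    def add_item(self, name, value, **attributes):
        self._items[name] = value
        self._attrs[name] = dict(attributes)

    def delete_item(self, name):
        self._items.pop(name, None)
        self._attrs.pop(name, None)

    def find_item(self, name):
        """Search this form-variable, then any nested form-variables."""
        if name in self._items:
            return self._items[name]
        for value in self._items.values():
            if isinstance(value, FormVariable):
                found = value.find_item(name)
                if found is not None:
                    return found
        return None

    def get_attribute(self, name, attribute, default=None):
        return self._attrs.get(name, {}).get(attribute, default)

    def set_attribute(self, name, attribute, value):
        self._attrs[name][attribute] = value   # KeyError if the item is missing

    def delete_attribute(self, name, attribute):
        self._attrs.get(name, {}).pop(attribute, None)
```

As in the text, nothing in this container knows about instruction sets; a higher-level routine would decide which items and attributes to store.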
The form-variable routines are general in that they
contain no information about instruction sets, GIS, or
any aspects of the design process but are merely bookkeeping programs. ISDS contains another set of programs that are general in the sense that they perform
the numerous computational tasks that must be undertaken during the design of an instruction set. These
tasks include counting the number of items on a list
and determining the number of bits required to encode
a list of items.
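One of the computational tasks named above, determining the number of bits required to encode a list of items, reduces to a ceiling of a base-2 logarithm. The function below is a minimal sketch with a hypothetical name, not the ISDS routine itself.

```python
def bits_needed(item_count: int) -> int:
    """Smallest number of bits that can distinguish item_count items."""
    if item_count < 1:
        raise ValueError("item_count must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for every n >= 1.
    return (item_count - 1).bit_length()
```

For instance, this gives 16 bits for a memory of 65536 words and four bits for a list of 16 operations, the figures used in the example later in the paper.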
At the level above the form-variable and computational routines, ISDS contains routines that add single parts to an instruction set. One such routine, for
example, adds a specified number of bits for designating
index registers in a memory address. The number of
bits is an input to this routine.
* Figures 1 through 4 from thesis, "Using A Computer to Design Computer Instruction Sets", by Dr. Fred M. Haney, Carnegie-Mellon University.
This routine performs no analysis, but merely the
bookkeeping required to add a new part to an instruction set. The analysis required to determine the number of bits to be added for indexing is performed at the
next level of the hierarchy. One routine, for example,
adds indexing to the address references of an instruction set. For this routine the number of bits is not
specified. The routine performs the analysis to determine the number of bits to be specified and then calls
its counterpart which adds the specified number of
bits. The routines which add specified parts to an
instruction set are called "strategy-level utility routines". The routines which perform analysis and call
for specific parts to be added are called "operators".
The routines in the higher levels of ISDS are much
more specialized than the low-level form-variable
routines that can be used to represent many different
kinds of objects. At the next higher level, the strategy-level utility routines are intended specifically for constructing instruction sets, although they could be used
in any design strategy since they have no decision
power. Some decision power begins at the level of the
operators, which are based on a particular view of the
relationships between the different parts of an instruction set. Each operator uses the values of certain parts
of the instruction set to determine the value of some
new part. The types of possible relationships are illustrated in Figure 2.
In many cases, the relationships between parts of
the instruction set are relatively obvious, but different results could be obtained with a different set
of operators.
So far, nothing has been said about how the operators
of ISDS are applied. One way is to write a program
consisting of a sequence of calls on the operators.
Operators that might be called, for example, are the
address operator (which selects the number of addresses per instruction and the size of each address), the
indexing operator, the indirect addressing operator,
the arithmetic operator, and the logical instruction
operator. (This program would be a specific design
strategy for the instruction set design problem.) It
must be recalled that a design strategy is a particular
method for selecting the parts of a solution to a design
problem. In particular, a design strategy is a specific
choice of the independent variables that determine
each part of the solution, together with a particular
sequence in which the design decisions are made. As
was pointed out, the operators represent a particular
view of the independent variables and their influence
on each part of the instruction set. The operators could
have been used to write a set of different design strategies. Instead, however, a heuristic program that would
determine its own strategy according to the demands
of the design problem was written.
The statement of the design problem to this program
consists of the following information:
1. An optional GIS representation of a particular
instruction set containing features which must
be included in the final product.
2. A cost-value matrix which assigns a relative
cost and value to each instruction feature of
GIS. The cost-value matrix also specifies a
maximum cost for the instruction set.
3. Optional constraints on instruction features.
4. Memory size, word size, and byte size of the
computer.
Figure 2-Relationships between the design variables
The heuristic design program consists of two routines: a basic strategy and a search routine. The basic
strategy uses the memory size and word size to determine the number of addresses in each instruction and
the general format of each address (whether it is a
memory reference or an address augmented by a base
register, page bit, etc.).
After this basic strategy has provided a starting
point, the search routine adds one instruction part at
a time until there is no remaining space in the instruction format or the cost limit is reached. At each stage
of the specification of the solution, the search routine
tries every operator and evaluates the result with respect to the value coefficients provided in the statement of the problem (see Figure 3).
Corresponding to each operator there is a routine
that restores the instruction set to its status before the
operator was applied.
Hence, the sequence of events at each stage of specification is "apply an operator", "evaluate", "restore",
"apply the next operator", etc., until all operators have
been applied, at which time the search routine reapplies the operator that resulted in the greatest improvement in the instruction set.
Figure 3-Optimization in ISDS
The search described above is a one-step search in
the sense that the instruction set is evaluated after
application of a single operator. Presumably much
more interesting strategies could be obtained by evaluating after the application of sequences of operators,
but the geometric increase in the computing time
required made this approach impractical.
This example illustrates the operation of the heuristic program described above.
The following inputs were presented to the heuristic program:
1. A cost-value matrix as follows:
Cost
Indexing
Indirect Addressing
General Registers
Partial Word Address
Extra Operations
Permanent Adjustment To Index Registers
Value
10
0
0
0
0
10
20
10
1
1
0
10
2. A cost constraint of 10.
3. Required operations of add, subtract, multiply,
divide, compare, and absolute value for fixed
point and floating point arithmetic.
4. Required operations of "negate", "and", "or",
and "no operation" for logical data.
5. A move operation.
6. Memory size and word size of 65536 words and
36 bits respectively.
The basic strategy determines that 16 bits are required
for each main memory address. Since five bits
are needed to encode the required operations, there
is only room in an instruction word for one address
without some augmented addressing scheme. The basic
strategy can specify augmented addressing, but for
this case it specifies a single, main memory address
specification of 16 bits. The search strategy specifies
additional instruction features in the following sequence: general registers, indirect addressing, additional operations, additional operations, indexing, a permanent adjustment to an index register after indexing,
operations, operations, partial word addressing. The
resulting instruction set has the following format:
Bits     Field
0-5      Operation Code
6-9      Partial Word Address
10-13    General Register Address
14       Index Adjust
15-18    Index
19       Indirect
20-35    Memory Address
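The layout of the resulting 36-bit instruction word can be made concrete with a pack and unpack pair. The field widths below are read off the bit positions in the format above (6, 4, 4, 1, 4, 1, and 16 bits); treating bit 0 as the most significant bit, and the field and function names, are assumptions for illustration.

```python
# (field name, width in bits), from bit 0 to bit 35; widths sum to 36.
FIELDS = [
    ("operation_code", 6),
    ("partial_word_address", 4),
    ("general_register_address", 4),
    ("index_adjust", 1),
    ("index", 4),
    ("indirect", 1),
    ("memory_address", 16),
]


def pack(values: dict) -> int:
    """Pack field values into one word, with bit 0 assumed most significant."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word: int) -> dict:
    """Inverse of pack: recover each field from a 36-bit word."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

The six-bit operation code leaves room for up to 64 operations, which accommodates the 52 operations of the ISDS instruction set.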
This format is almost identical to the format of
the Univac 1108 computer; however, the instruction set designed by ISDS is not. The primary
difference is in the number of operations in the
two instruction sets. The 1108 permits over 150
operations, whereas the ISDS instruction set contains only 52 operations.
The instruction sets also differ in their interpretation of some of the instruction features. However, this example shows that ISDS is capable
of designing an instruction language that in its
essential features resembles the instruction language of the Univac 1108.
It is interesting to note in the above example that
if only 16 or fewer operations are required in the statement of the problem, then the basic strategy assigns
four bits for the operation code and the remaining 32
bits permit two 16-bit memory references. In this case
the search routine would not be able to apply any of
the operators since every bit of the instruction word
is used by the basic strategy. This illustrates a practical value of the present heuristic program; i.e., it
permits a designer to learn by experimentation how
the different design variables interact and how minor
changes in one part affect the final product.
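The one-step search used by the heuristic program (apply an operator, evaluate, restore, then reapply the best one) can be sketched as a greedy loop. This is an illustration under stated assumptions, not the ISDS code: the instruction set is reduced to a dictionary of free bits, accumulated cost and accumulated value, and the names `make_operator` and `one_step_search` are hypothetical.

```python
def make_operator(bits, cost, value):
    """Return an (apply, restore) pair for a feature with the given demands."""
    def apply_op(state):
        state["bits_free"] -= bits
        state["cost"] += cost
        state["value"] += value

    def restore_op(state):
        state["bits_free"] += bits
        state["cost"] -= cost
        state["value"] -= value

    return apply_op, restore_op


def one_step_search(state, operators, cost_limit):
    """Greedy one-step search; returns the indices of the operators applied."""
    applied = []
    while True:
        best_index, best_value = None, state["value"]
        for i, (apply_op, restore_op) in enumerate(operators):
            apply_op(state)                       # try the operator
            fits = state["bits_free"] >= 0 and state["cost"] <= cost_limit
            if fits and state["value"] > best_value:
                best_index, best_value = i, state["value"]
            restore_op(state)                     # undo before trying the next
        if best_index is None:                    # nothing fits or improves
            return applied
        operators[best_index][0](state)           # reapply the best operator
        applied.append(best_index)
```

Each pass costs one trial per operator; evaluating sequences of operators instead would multiply that cost at every added level of depth, which is the geometric growth the paper mentions.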
SUMMARY
Working with ISDS indicates that for some design
problems it is plausible to write programs that solve
the design problem without human intervention. In
general, the approach consists of the following steps:
1. Select a design concept: a model of solutions
to the design problem.
2. Select a data structure for instances of the design concept.
3. Create operators that perform analysis and
specify single parts of the model.
4. Create programs that use cost, value and constraint information from the statement of the
problem to apply the operators in some sequence
that results in a solution to the problem.
This process, as it is applied in ISDS, is illustrated
in Figure 4.
To be of practical use, a design program based on
the ISDS approach would require a more sophisticated search strategy than the one used in the present
version of ISDS. In general, it is probably possible to
find clever ways of selecting the operators to be applied without actually trying every one. Any such
scheme would give the search much more direction and
enable the program to evaluate strategies of depth
greater than one.
Figure 4-ISDS as a design model
The approach to automated design described is of
limited use in many practical design problems. However, as designers experiment with interactive design
systems they are likely to discover problems for which
the so-called creative effort is relatively routine. For
such problems, the approach of ISDS offers the prospect of more efficient automation than can be achieved
in an interactive system.
REFERENCES
1 M ASIMOW
Introduction to design
Prentice Hall 1962 Englewood Cliffs N J
2 C ALEXANDER
Notes on the synthesis of form
Harvard University Press 1964 Cambridge Mass
Directed library search to
minimize cost
by DR. BRUCE A. CHUBB
Lear Siegler, Incorporated
Grand Rapids, Michigan
Statement of the problem
The system engineer operating within the framework of a typical manufacturing organization operates
from the following basic information and constraints:
a. A set of customer specifications to be met,
b. A basic system configuration to be used in realizing these specifications,
c. A set of standard components that fit into this
configuration. The problem is to determine the
collection of components that satisfies the given
specification at minimum total dollar cost.
The above described situation exists in every area of
system engineering where the configuration is "fixed"
and a multitude of candidate components are available.
The characteristics of these components can be stored
in computer libraries by part numbers and an analysis
program can be written to systematically analyze the
system for any candidate set of components by merely
inserting the appropriate part numbers. Such computer
programs are structured so as to retrieve the data for
each particular component, proceed with the various
performance calculations and display the results to
the designer for each set of part numbers manually
selected.
This paper goes one step further and presents techniques and procedures for the effective use of computers
in automating the solution to the above class of design
problem.
Theoretical development
Development of the analysis program
The analysis section is the starting point of any computer-aided or automated design program. Optimization, in the design context, is derived from an efficient
use of iterative analysis techniques. Devoid of a good
analysis capability, the designer has nothing. Its presence provides a powerful tool in itself. In this case,
however, it is simply a means to an end-Automated
Design.
Although the internal details of the analysis program
vary greatly for different applications, the input-output characteristics can be readily defined as shown in
Figure 1. The first, and primary, requirement of the
analysis program is that it must accurately represent
the hardware. This requires a significantly detailed
model, including often overlooked nonlinearities, and a
realistic consideration of component tolerance effects.
Second, the outputs of the analysis program must have
a one-to-one correspondence with the list of system
specifications. That is, if the customer specifies overshoot, response time, accuracy, etc., then the program
must have the capability of calculating the system
performance characteristics in this form. Third and last,
since the analysis is to be repeated many times in an
iterative fashion, the solution time should be a minimum.
Figure 1-Input-output characteristics of system
analysis program
The analysis problem is now defined mathematically
by letting S, Y, and X be vectors, defined in general as:
System Specification Vector: $S = (S_1, S_2, \ldots, S_k)$
System Performance Vector: $Y = (Y_1, Y_2, \ldots, Y_k)$
Component Parameter Vector: $X = (X_1, X_2, \ldots, X_n)$ (1)
where
k = number of performance specifications
n = number of component parameters
$S_i$ = numerical value for the ith specification ($1 \le i \le k$)
$Y_i$ = system performance function corresponding to the ith specification ($1 \le i \le k$)
$X_j$ = numerical value for the jth component parameter ($1 \le j \le n$)
Thus one can write in general that
$$Y_1 = F_1(X_1, X_2, X_3, \ldots, X_n)$$
$$Y_2 = F_2(X_1, X_2, X_3, \ldots, X_n)$$
$$\vdots$$
$$Y_k = F_k(X_1, X_2, X_3, \ldots, X_n) \qquad (2)$$
where the F's represent the functions that need to be
programmed to provide the system analysis. It is
only necessary, at this time, that the X vector contain
the elements as required to calculate the system performance function vector Y. However, it is convenient
to include the component costs as part of the X vector
[even though they will not appear explicitly in (2)] since
they are required to calculate the optimization function
that is introduced later.
Figure 2-Computer aided design program flow chart
* This path automatically followed for initial guess.
Thus (2) can be used to calculate the system performance vector (Y) given any component vector (X).
By programming this equation as presented, one obtains the desired analysis program except for one
deficiency. That is, due to manufacturing tolerances,
the X vector varies from unit to unit, and we are interested not in a particular value of Y but in what spread
or limits to expect. The tolerance effects can be included by using either the Monte Carlo or Moment
Methods.1,2 The latter technique is used in this paper
since it also provides information that is extremely useful in minimizing the system cost.
The Moment technique makes use of an expansion
of the function about the mean parameters using a
Taylor series. The higher order terms of the series are
neglected. This requires taking the partial derivative
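of each performance variable with respect to each
component parameter. Assuming that the component
performance parameters are independent and noting
that $\partial Y_i / \partial X_j = 0$ if $X_j$ is a component cost, the
mean value of $Y_i$ is given by the equation
$$\mu_{Y_i} = F_i(\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n}) \qquad (3)$$
and its standard deviation by
$$\sigma_{Y_i} = \sqrt{\left[\sigma_{X_1}\frac{\partial Y_i}{\partial X_1}\right]^2 + \left[\sigma_{X_2}\frac{\partial Y_i}{\partial X_2}\right]^2 + \cdots + \left[\sigma_{X_n}\frac{\partial Y_i}{\partial X_n}\right]^2} \qquad (4)$$
where i = 1, 2, ..., k and the partial derivatives are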
evaluated while all other parameters are held at their
mean value. As can be seen from (4), the use of the
Moment method requires that we calculate the partial
derivatives of each system performance function with
respect to each component parameter. The matrix
of these partials is the Jacobian.
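$$
J = \begin{bmatrix}
\dfrac{\partial Y_1}{\partial X_1} & \dfrac{\partial Y_1}{\partial X_2} & \cdots & \dfrac{\partial Y_1}{\partial X_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial Y_k}{\partial X_1} & \dfrac{\partial Y_k}{\partial X_2} & \cdots & \dfrac{\partial Y_k}{\partial X_n}
\end{bmatrix}
\tag{5}
$$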
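As a minimal sketch of how (4) and (5) might be evaluated together, the following Python fragment perturbs each component parameter about its mean, forms the Jacobian by forward differences, and then applies the root-sum-square of (4). The function `performance`, the step size `rel_step`, and all numerical values are illustrative assumptions and are not taken from the paper; any set of F's from (2) could be substituted.

```python
import math

def performance(x):
    """Hypothetical two-output model standing in for the F's of (2)."""
    x1, x2, x3 = x
    return [x1 * x2 + x3, x1 / x2]

def jacobian(f, mean_x, rel_step=1e-6):
    """Forward-difference estimate of J[i][j] = dY_i/dX_j at the mean point."""
    mean_y = f(mean_x)
    k, n = len(mean_y), len(mean_x)
    jac = [[0.0] * n for _ in range(k)]
    for j in range(n):
        # Assumed step rule: small relative to the parameter magnitude.
        step = rel_step * max(abs(mean_x[j]), 1.0)
        perturbed = list(mean_x)
        perturbed[j] += step          # only X_j leaves its mean value
        y = f(perturbed)
        for i in range(k):
            jac[i][j] = (y[i] - mean_y[i]) / step
    return mean_y, jac

def moment_sigma(jac, sigma_x):
    """Equation (4): root-sum-square of sigma_Xj * dY_i/dX_j over j."""
    return [math.sqrt(sum((s * d) ** 2 for s, d in zip(sigma_x, row)))
            for row in jac]

mean_x = [2.0, 4.0, 1.0]
sigma_x = [0.1, 0.2, 0.0]   # zero sigma models a tolerance-free parameter
mean_y, jac = jacobian(performance, mean_x)
sigma_y = moment_sigma(jac, sigma_x)
```

The Jacobian is returned alongside the mean performance vector because, as the later sections show, the same matrix is reused for the object function derivatives.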
The entries in the Jacobian are obtained numerically
by programming (2) and using a subroutine to make the
following steps:
1. Set all the $X_j$'s equal to their mean value ($\mu_{X_j}$),
and the calculated Y vector is taken to be the
mean value $\mu_Y$.
2. $X_1$ is replaced by ($\mu_{X_1} + \Delta X_1$) and the corresponding value of Y is calculated with all other
X's at their mean value. From this, we obtain
the first column of the Jacobian matrix using
$$
\frac{\partial Y_i}{\partial X_j} \cong \frac{Y_i - \mu_{Y_i}}{\Delta X_j} \quad \text{for } i = 1, 2, \ldots, k \text{ and } j = 1
$$
3. Step 2 is repeated for each $X_j$ for j = 1, 2,
..., n thereby obtaining the complete Jacobian
matrix.
Development of computer optimization design
procedure
Use of the computer-aided design procedure described
in the previous section, although many times more
effective than any manual method, nevertheless represents only a passive use of the digital computer. That
is, the engineer makes all the design decisions and the
computer only serves as a fast calculator. The next
logical step toward optimized design is to use the computer to determine how the components should be
varied to converge on the desired minimum cost system.
Figure 2 illustrates in general how a computer could
be used in a dynamic sense. The prerequisite to design
is to input the data for all components. This is accomplished by loading in the component data cards prepunched in a prescribed format. This need be done only
the first time and thereafter only if that data is to be
changed; e.g., updated. These data are then stored by
part number in an easily retrievable form on magnetic
disk and are referred to as the "component libraries."
In order to provide the mainline design program with a
guide as to part number selection, some ordered array
of these is desired. This is accomplished by using a
"search matrix library," the precise working of which
is explained later. Thus, immediately after generation
of the component libraries, the computer calculates
the component search matrices and stores these in a
second block of data-the search matrix library. Now
the program is ready to be used. The designer inputs
the system specifications, fixed production labor costs,
and any initial set of components of his choice. The
latter item could be made a random selection if desired.
In either event, the computer retrieves the component
data from libraries and proceeds to calculate the system
performance. The component parameters are then perturbed one at a time and the partials of each system
performance function with respect to each component
parameter are determined. Once this is completed the
partials are stored in the form of a Jacobian matrix.
The calculated performance limits are then compared to
the specification limits. The fraction of the units produced that statistically fall outside of the specification
limits is then calculated as the "rejection ratio." From
this rejection ratio, the fixed labor cost, and the summation
of the parts cost, the total cost is calculated. A
printout is then made so that the user can follow the
steps that a computer makes. Following this, some
method must be employed to determine if cost is a
minimum. If it is, then a final printout can be made.
If it is not, then an option is shown as to how one wants
to optimize. This can be accomplished by the user
reading in another set of part numbers or the computer
automatically can select a set in the manner described
in a later section using the search matrix library. This
procedure is repeated in an iterative manner until the
optimum design is reached.
Generation of object functions
The first question that must be answered in an
optimization problem is, "What is to be optimized
and what is optimum?" Often, this is not a trivial
problem in itself since there are many separate and
usually conflicting factors; i.e., minimum cost, maximum accuracy, small volume, best response, etc. These
factors may be considered simultaneously by defining
a scalar P of the form
$$
P = \sum_{i=1}^{k} A_i (Y_i - D_i)^2
\tag{6}
$$
where
P = object function to be minimized
k = number of desired properties
$A_i$ = weight factor selected to give the ith property the desired priority
$Y_i$ = current value of ith property
$D_i$ = desired value for ith property
A serious difficulty inherent in this approach, however,
consists in finding a set of weighting factors $A_1, A_2, \ldots, A_k$
such that scaling between the various terms is
properly considered in order to maintain sensitivity
and obtain good convergence. Considering properties
such as accuracy, weight, cost and response, these
weight selections often become subjective in nature.
It is proposed in this paper that an entirely different
object function shall be used. It is founded on the competitive philosophy that the manufacturer wants a
design that fulfills the customer requirements at minimum overall cost. With this result, he can either maximize his chances of competing or, if his sale price is
"fixed," he maximizes his profits. Using this minimum
cost philosophy, an appropriate object function can be
generated in the following manner.
The total cost to build a given number of systems
is represented by the equation
$$
\text{Total Cost} = \text{Number Built} \times \left[\text{Labor Cost} + \sum \text{Component Costs}\right] \left[1 + \text{Overhead Ratio}\right]
\tag{7}
$$
However, the number that must be built for a given
contract is given by
$$
\text{Number Built} = \frac{\text{Number Required}}{1 - \text{Rejection Ratio}}
\tag{8}
$$
Thus, we have for the total cost
$$
\text{Total Cost} = \text{Number Required} \times \frac{\left[\text{Labor Cost} + \sum \text{Component Costs}\right]\left[1 + \text{Overhead Ratio}\right]}{1 - \text{Rejection Ratio}}
\tag{9}
$$
Since the number of required units and (1 + overhead
ratio) are product terms which are not functions of the
components, one obtains the same cost minimizing
set of components using the function
$$
\text{Cost} = \frac{\text{Labor Cost} + \sum \text{Component Costs}}{1 - \text{Rejection Ratio}}
\tag{10}
$$
Equation (10) is the object function used for what is
defined later as "the fine search mode." When it is at a
minimum, the desired optimum set of components has
been defined. However, one problem may exist in the
early portion of the iteration cycle. That is, the design
can be so far away from specification that, for all
practical purposes, the rejection ratio is unity, the
denominator of (10) goes to zero, resulting in infinite
cost. As long as this occurs, (10) has no practical value.
In fact, one loses all sensitivity in calculating partials,
and there is no way of telling if one design is better
than another. For this reason, a "coarse search mode"
is defined. Its corresponding object function is:
$$
Q = \sum_{i=1}^{k} A_i R_i (Y_i - S_i)^2
\tag{11}
$$
where
Q = object function to be minimized
k = number of specifications to be met
$A_i$ = weight factor for ith specification
$R_i$ = rejection ratio for ith specification
$Y_i$ = calculated system performance 3 sigma limit corresponding to ith specification
$S_i$ = ith specification limit
It should be further noted that
$Y_i = \mu_{Y_i} - 3\sigma_{Y_i}$ if $S_i$ is a lower limit, and
$Y_i = \mu_{Y_i} + 3\sigma_{Y_i}$ if $S_i$ is an upper limit.
Since Equation (11) is used only in the coarse search
mode, selection of the weight factors is not too critical.
For this study, $A_i$ was set at $1/S_i^2$ except for the case
when $S_i$ equals zero and then $A_i$ was arbitrarily set
equal to unity.
In the coarse search mode, cost is neglected in an
attempt to determine the performance such that the
rejection ratio becomes less than unity. The incorporation of the $R_i$ term in (11) greatly aids in the accomplishment of this condition. First it nulls each term in
the summation which represents an overdesigned condition (i.e., $R_i = 0$) and secondly it applies a linearly
increasing weight on the others according to their
significance.
Once each of the $R_i$'s is driven less than unity, the
cost becomes finite, and the optimization process is
switched from the coarse to the fine search where (10)
is used as the object function.
Calculation of rejection ratio
The total rejection ratio R is the probability of a
design falling outside of the specification, and assuming
that the specification limits are constant, it is given by
$$
R = 1 - \int_{L_{11}}^{L_{12}} \int_{L_{21}}^{L_{22}} \cdots \int_{L_{k1}}^{L_{k2}} f_{Y_1 \cdots Y_k}(Y_1, Y_2, \ldots, Y_k)\, dY_1\, dY_2 \cdots dY_k
\tag{12}
$$
where
$L_{i1} = -\infty$ and $L_{i2} = S_i$ for the ith specification an upper bound
$L_{i1} = S_i$ and $L_{i2} = \infty$ for the ith specification a lower bound
The joint density of the Y's is given by:
$$
f_{Y_1 \cdots Y_k}(Y) = \frac{1}{(2\pi)^{k/2}\,|M_Y|^{1/2}} \exp\left\{-\tfrac{1}{2}\left[(Y - \bar{Y})\, M_Y^{-1}\, (Y - \bar{Y})^T\right]\right\}
\tag{13}
$$
where $\bar{Y}$ is the mean value of the Y vector
and the (k X k) covariance matrix $M_Y$ is
$$
M_Y = J M_X J^T
\tag{14}
$$
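Since the component performance parameters are
assumed independent and $\sigma_{X_i} = 0$ if $X_i$ is a component
cost, one can write the component covariance matrix
$M_X$ as
$$
M_X = \begin{bmatrix}
\sigma_{X_1}^2 & 0 & \cdots & 0 \\
0 & \sigma_{X_2}^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_{X_n}^2
\end{bmatrix}
\tag{15}
$$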
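The propagation in (14) with the diagonal $M_X$ of (15) reduces to a weighted sum over the component parameters. The short Python sketch below illustrates this under assumed values; the Jacobian entries and standard deviations are hypothetical, and the helper name `propagate_covariance` is introduced here only for illustration. The diagonal of the result reproduces the squares of the standard deviations given by (4).

```python
import math

def propagate_covariance(jac, sigma_x):
    """Return M_Y = J * diag(sigma_x**2) * J^T as a k-by-k nested list."""
    k = len(jac)
    var_x = [s * s for s in sigma_x]
    return [[sum(jac[a][j] * var_x[j] * jac[b][j] for j in range(len(var_x)))
             for b in range(k)]
            for a in range(k)]

# Hypothetical 2-by-3 Jacobian and component standard deviations.
jac = [[4.0, 2.0, 1.0],
       [0.25, -0.125, 0.0]]
sigma_x = [0.1, 0.2, 0.0]   # a cost-type parameter carries zero sigma

m_y = propagate_covariance(jac, sigma_x)
sigma_y = [math.sqrt(m_y[i][i]) for i in range(len(m_y))]
```

The off-diagonal entries of `m_y` carry the correlation between performance variables, which is what makes the full integral (12) expensive and motivates the approximations that follow.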
In order to evaluate R using (12), one must evaluate
the multiple integral of dimension k. This can be accomplished using numerical techniques; however, the
process is very time consuming. In the interest of
minimizing computer time, one of the three alternate
procedures listed in Table I is best implemented. Each
of these approximations requires calculating only the
individual specification rejection ratios ($R_i$ for i = 1,
2, ..., k) which are given by
$$
R_i = 1 - \frac{1}{\sqrt{2\pi}\,\sigma_{Y_i}} \int_{L_{i1}}^{L_{i2}} \exp\left[-\frac{1}{2}\left(\frac{y - \mu_{Y_i}}{\sigma_{Y_i}}\right)^2\right] dy
\tag{16}
$$
Equation (16) can be evaluated by using the standard
error function
$$
\mathrm{ERF}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\, dt
\tag{17}
$$
using the relationships summarized in Table II.
Since the upper bound approximation is always on
the safe side, it is the one used here. However, the independent approximation does lie between the two
extremes and thus might be closer to the actual case.

Table I-Estimates of total rejection ratio (R)
Lower Bound:  $R_j$, where $R_j \ge R_i$ for all $1 \le i \le k$
Upper Bound:  $\sum_{i=1}^{k} R_i$ if $\sum_{i=1}^{k} R_i < 1$; 1 otherwise
Independent:  $1 - \prod_{i=1}^{k} (1 - R_i)$

Object function derivatives
It is of necessity that the partial derivatives of the
object function be calculated in the steepest ascent
method of optimization. If these derivatives were
somehow known for the direct search technique, it
would be of advantage since one could then conduct
exploratory moves in descending order of importance.
In our case, it would be a major task to perturb each
of the component parameters again and calculate the
resulting change in the object function to obtain the
partial derivatives. It is shown, however, that these
can be obtained directly from the Jacobian matrix
which is already available from the tolerance calculations; namely, Equation (5). This is accomplished
in the following manner as derived first for the fine
search and then for the coarse search.
The object function used in fine search, Equation
(10), can be written as
$$
C(X) = [K + f(X)]\,[1 - R(X)]^{-1}
\tag{18}
$$
where:
X = component parameter vector $[X_1, X_2, \ldots, X_n]$
C(X) = total system cost
K = labor cost
R(X) = rejection ratio
f(X) = component cost
Taking the partial derivative of C with respect to
$X_i$ and expanding to include all $X_i$
$$
\left[\frac{\partial C}{\partial X_1}, \ldots, \frac{\partial C}{\partial X_n}\right]
= \frac{1}{1 - R(X)} \left[\frac{\partial f}{\partial X_1}, \frac{\partial f}{\partial X_2}, \ldots, \frac{\partial f}{\partial X_n}\right]
+ \frac{K + f(X)}{(1 - R(X))^2} \left[\frac{\partial R}{\partial X_1}, \frac{\partial R}{\partial X_2}, \ldots, \frac{\partial R}{\partial X_n}\right]
\tag{19}
$$
Expanding the $\partial R/\partial X$ vector in terms of the Jacobian
defined by (5) one obtains the desired matrix equation
for the fine search cost derivative vector as
$$
\frac{\partial C}{\partial X}
= \frac{1}{1 - R(X)} \left[\frac{\partial f}{\partial X_1}, \frac{\partial f}{\partial X_2}, \ldots, \frac{\partial f}{\partial X_n}\right]
+ \frac{K + f(X)}{(1 - R(X))^2} \left[\frac{\partial R}{\partial Y_1}, \frac{\partial R}{\partial Y_2}, \ldots, \frac{\partial R}{\partial Y_k}\right] J
\tag{20}
$$
where
$$
\frac{\partial f}{\partial X_i} = \begin{cases} 1 & \text{if } X_i \text{ is a component cost} \\ 0 & \text{otherwise} \end{cases}
$$
and the vector
$$
\left[\frac{\partial R}{\partial Y_1}, \frac{\partial R}{\partial Y_2}, \ldots, \frac{\partial R}{\partial Y_k}\right]
\tag{21}
$$
is referred to as the "rejection ratio derivative vector"
and given the notation $\partial R/\partial Y$.
The calculation of the $\partial R/\partial Y$ vector, as required for
the fine search mode, depends on the particular equation
used in approximating the rejection ratio R [see Table
I]. We consider here only the case where R is approximated by the upper bound [see Reference 3 for other
cases]. Since in the fine search mode
one has
$$
R(\text{upper bound}) = R_1 + R_2 + \cdots + R_k
\tag{22}
$$
and since $R_i$ is a function of $Y_j$ only for i = j
$$
\frac{\partial R(\text{upper bound})}{\partial Y_i} = \frac{\partial R_i}{\partial Y_i} \quad \text{for } i = 1, 2, \ldots, k
\tag{23}
$$
and only the partials of the individual rejection ratios
are required.
Considering the specification limit a constant, the
magnitude of $\partial R_i/\partial Y_i$ is given by the $Y_i$ density function evaluated at the point $Y_i = S_i$ and the sign of
$\partial R_i/\partial Y_i$ depends on whether $S_i$ is an upper or a lower
bound. That is
$$
\frac{\partial R_i}{\partial Y_i} = \pm \frac{1}{\sqrt{2\pi}\,\sigma_{Y_i}} \exp\left[-\frac{1}{2}\left(\frac{S_i - \mu_{Y_i}}{\sigma_{Y_i}}\right)^2\right]
\tag{24}
$$
where
$S_i$ = ith specification limit
$\mu_{Y_i}$ = mean value of $Y_i$ distribution
$\sigma_{Y_i}$ = standard deviation of $Y_i$ distribution
and the + sign is taken if $S_i$ is an upper limit and the - sign is taken if $S_i$ is a lower limit.
The object function used for coarse search is of the
form [see (11)]
$$
F(X) = A_1 R_1(X)\,[Y_1(X) - S_1]^2 + A_2 R_2(X)\,[Y_2(X) - S_2]^2 + \cdots + A_k R_k(X)\,[Y_k(X) - S_k]^2
\tag{25}
$$
Following the same type of procedure, as for the fine
search, the coarse derivative vector is found to be
$$
\left[\frac{\partial F}{\partial X_1}, \frac{\partial F}{\partial X_2}, \ldots, \frac{\partial F}{\partial X_n}\right]
= \left[\,2A_1(Y_1 - S_1)R_1 + A_1(Y_1 - S_1)^2\frac{\partial R_1}{\partial Y_1},\;\; 2A_2(Y_2 - S_2)R_2 + A_2(Y_2 - S_2)^2\frac{\partial R_2}{\partial Y_2},\;\; \ldots,\;\; 2A_k(Y_k - S_k)R_k + A_k(Y_k - S_k)^2\frac{\partial R_k}{\partial Y_k}\right]
\begin{bmatrix}
\dfrac{\partial Y_1}{\partial X_1} & \dfrac{\partial Y_1}{\partial X_2} & \cdots & \dfrac{\partial Y_1}{\partial X_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial Y_k}{\partial X_1} & \dfrac{\partial Y_k}{\partial X_2} & \cdots & \dfrac{\partial Y_k}{\partial X_n}
\end{bmatrix}
\tag{26}
$$
Equation (26) gives the desired partial derivatives
of the coarse search object function with respect to
each component parameter in the system. Again, like
(20), it is in terms of the already available Jacobian
matrix and no further parameter perturbations are
required.
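The quantities of this section can be sketched in a few lines of Python. The fragment below is only an illustration under stated assumptions: it uses the standard library `math.erf` in place of the paper's ERF relationships, it adopts the upper bound estimate of R, and every numerical value (means, sigmas, limits, labor and parts cost) is hypothetical. The function names are introduced here and do not come from the original program.

```python
import math

def rejection_ratio(mean, sigma, spec, upper):
    """Eq. (16): probability that a normal Y_i violates its spec limit."""
    z = (spec - mean) / (sigma * math.sqrt(2.0))
    above = 0.5 * (1.0 - math.erf(z))          # P(Y_i > S_i)
    return above if upper else 1.0 - above

def rejection_slope(mean, sigma, spec, upper):
    """Eq. (24): normal density at Y_i = S_i, + for upper, - for lower."""
    u = (spec - mean) / sigma
    density = math.exp(-0.5 * u * u) / (math.sqrt(2.0 * math.pi) * sigma)
    return density if upper else -density

def total_rejection(ratios):
    """Table I: (lower bound, upper bound, independent) estimates of R."""
    independent = 1.0 - math.prod(1.0 - r for r in ratios)
    return max(ratios), min(sum(ratios), 1.0), independent

def fine_cost(labor, parts_total, total_ratio):
    """Eq. (10); treated as infinite when the rejection ratio reaches unity."""
    if total_ratio >= 1.0:
        return math.inf
    return (labor + parts_total) / (1.0 - total_ratio)

def coarse_scalar(weights, ratios, limits, specs):
    """Eq. (11): sum of A_i * R_i * (Y_i - S_i)**2 over the specifications."""
    return sum(a * r * (y - s) ** 2
               for a, r, y, s in zip(weights, ratios, limits, specs))

def fine_gradient(labor, parts_total, total_ratio, df_dx, dr_dy, jac):
    """Eq. (20): dC/dX built from the already available Jacobian."""
    keep = 1.0 - total_ratio
    scale = (labor + parts_total) / keep ** 2
    dr_dx = [sum(dr_dy[i] * jac[i][j] for i in range(len(dr_dy)))
             for j in range(len(df_dx))]
    return [df_dx[j] / keep + scale * dr_dx[j] for j in range(len(df_dx))]

# Hypothetical two-specification example (all numbers are illustrative).
r_upper = rejection_ratio(mean=0.30, sigma=0.02, spec=0.35, upper=True)
r_lower = rejection_ratio(mean=0.60, sigma=0.05, spec=0.50, upper=False)
lower, upper, independent = total_rejection([r_upper, r_lower])
cost = fine_cost(labor=200.0, parts_total=167.0, total_ratio=upper)
```

Returning infinity from `fine_cost` mirrors the situation described earlier in which (10) has no practical value, so a caller would fall back on `coarse_scalar` until the cost becomes finite.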
Design program strategy
The design program developed as part of this study
has two basic operating options: analysis and directed
search. When operating with the analysis option, the
component part numbers required for each analysis
may be either read in from cards or selected at random
by the program. In either case, as many consecutive
runs are made as requested and a final printout is
provided summarizing the best design obtained. Thus
the engineer can make a rapid evaluation of a selected
number of designs of his choosing, or he can perform
Monte Carlo runs by letting the computer select the
part numbers at random.
With the directed search option, the computer program uses the object derivatives in connection with
search matrices to direct the next component selection
in an attempt to reduce the object function. This process is repeated in an iterative fashion until a local
minimum is obtained. Since there is no guarantee that
this condition is the absolute minimum, numerous
starting points are employed and the one with the
lowest cost is assumed to be the best design. The
starting points for each search may be specified by the
user or otherwise selected at random by the program.
The generation of the search matrices is a prerequisite
to a directed search. A separate search matrix is used
along with each component library and their generation
automatically follows each library update. These
matrices consist of an ordered array of the component
part numbers defined by
$$
S_i = \begin{bmatrix}
s_{11} & s_{12} & \cdots & s_{1l} \\
s_{21} & s_{22} & \cdots & s_{2l} \\
\vdots & \vdots & & \vdots \\
s_{m1} & s_{m2} & \cdots & s_{ml}
\end{bmatrix}
\tag{27}
$$
where
l = the number of parameters used to describe the ith component
m = the number of part numbers for ith component stored in the library
$s_{nj}$ = a component part number for $1 \le n \le m$ and $1 \le j \le l$
Each column of $S_i$ corresponds to a particular parameter of the ith component and the entries of the column
consist of all the ith component part numbers arranged
in ascending order of the mean value of that parameter.
That is, let the jth column of $S_i$ correspond to the kth
component parameter of the X vector. Then $s_{1j}, s_{2j}, \ldots, s_{mj}$
are chosen such that
$$
X_k(s_{1j}) \le X_k(s_{2j}) \le \cdots \le X_k(s_{mj})
\tag{28}
$$
where
$X_k(s_{nj})$ signifies the mean value of the component
parameter $X_k$ for the part number stored in
location $s_{nj}$
In order to explain the strategy used by the design
program to conduct a search, the following definitions
are established.
search = minimization process which begins with the initial set of part
numbers and ends once a local minimum is found.
base point = set of part numbers for which the
object function is less than that calculated for any previous set of
part numbers in a given search.
sub-search = that part of a search which takes
place between successive base points.
exploratory move = a set of part numbers which are at
least tentatively being considered for a system performance analysis.
failure = an exploratory move which is analyzed and the object function
obtained is greater than (or equal to) that of the base point.
success = an exploratory move for which the object function obtained is less
than that of the base point.
local minimum = the object function corresponding
to the base point which remains once all the exploratory moves
analyzed in a given sub-search result in failure.
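The search matrix of (27) and (28) amounts to sorting the part numbers of one library once per parameter. A minimal Python sketch follows, assuming the library is held as a mapping from part number to a list of mean parameter values; the part data shown are hypothetical and the helper `neighbour` is an assumption about how an adjacent part might be looked up, not a description of the original program.

```python
def build_search_matrix(library):
    """library maps part number -> list of mean parameter values.

    Column j of the result lists every part number in ascending order
    of its jth mean parameter, as required by (28).
    """
    part_numbers = sorted(library)            # ties then break by part number
    n_params = len(library[part_numbers[0]])
    columns = [sorted(part_numbers, key=lambda part, j=j: library[part][j])
               for j in range(n_params)]
    return [[column[n] for column in columns]
            for n in range(len(part_numbers))]

def neighbour(matrix, column, part, direction):
    """Part adjacent to `part` in one column; None at either end."""
    ordered = [row[column] for row in matrix]
    position = ordered.index(part) + direction
    return ordered[position] if 0 <= position < len(ordered) else None

# Hypothetical library: [cost, gain, accuracy] mean values per part number.
library = {
    1001: [300.0, 22.5, 0.5],
    1002: [24.0, 11.5, 5.0],
    1003: [35.0, 22.4, 3.5],
    1004: [200.0, 0.5, 30.0],
}
matrix = build_search_matrix(library)
cheaper = neighbour(matrix, column=0, part=1003, direction=-1)
```

Because each column is already ordered, stepping one row up or down a column gives the part whose mean value of that parameter is next lower or next higher, which is the kind of lookup a derivative-directed selection would need.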
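Table II-Equations for calculating individual rejection ratios ($R_i$)
Condition                $S_i$ Upper Bound                                                                   $S_i$ Lower Bound
$S_i \ge \mu_{Y_i}$      $0.5\left[1 - \mathrm{ERF}\left(\frac{S_i - \mu_{Y_i}}{\sqrt{2}\,\sigma_{Y_i}}\right)\right]$      $0.5\left[1 + \mathrm{ERF}\left(\frac{S_i - \mu_{Y_i}}{\sqrt{2}\,\sigma_{Y_i}}\right)\right]$
$S_i < \mu_{Y_i}$        $0.5\left[1 + \mathrm{ERF}\left(\frac{\mu_{Y_i} - S_i}{\sqrt{2}\,\sigma_{Y_i}}\right)\right]$      $0.5\left[1 - \mathrm{ERF}\left(\frac{\mu_{Y_i} - S_i}{\sqrt{2}\,\sigma_{Y_i}}\right)\right]$

where
m, the dimension of IPAR, equals the number
of component parameters. The direction vector
(IDEX) is defined by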
(31)
where i = $IPAR_{II}$
XMAXS = a vector containing the mean + 3 sigma values for
the total X parameter vector for the base point.
XMINS = a vector containing the mean - 3 sigma values for
the total X parameter vector for the base point.
This normalized distance is then compared to a
program input parameter XNN. For XNN >
1, one is assured that the $X_{IPAR_{II}}$ random variable
has been varied so that its frequency distribution
inside the 3 sigma limits lies outside the
distribution for the corresponding base point
parameter. Thus by selecting the value of XNN,
the program user can control the extent to which
exploratory moves are made. A value of XNN =
1.5 was found to give satisfactory results. By
making XNN larger one explores more possibilities at the expense of increased computer
time. Thus, for DIST < XNN the program
returns the part numbers to the base point,
increments to the next most significant parameter, incrementing the sub-search progress
number by one, and returns to step 6 above by
calling SEARCH. If DIST ≥ XNN, the program continues to make the second check.
This second check consists of calculating the
estimated change in the object function based
on its first derivative vector using the equation
$$
\Delta\text{object} = \sum_{i=1}^{m} \frac{\partial\,\text{object}}{\partial X_i}\,(XNOM_i - XNOMS_i)
\tag{32}
$$
where XNOM and XNOMS are the mean component parameter vectors corresponding respectively to the exploratory part number
vector and the base part number vector. Since
the i = $IPAR_{II}$ term in (32) is negative, one
knows that if Δobject turns out to be positive,
the summation of the changes caused by the
parameters in $IPN_{JJJ}$ other than $IPAR_{II}$ have
resulted in an estimated increase in the object
function. Since an increase in Δobject is undesirable, one returns to step 6 above when
Δobject > 0 and calls SEARCH keeping the
same sub-search progress number (II). If
Δobject ≤ 0, a complete system performance
analysis is made using the exploratory move
part numbers.
9. If the exploratory move turns out to be "a
success" (i.e., the object function is reduced)
one returns to step 2 above and the process
is repeated. If it is "a failure" (i.e., the object
function is not reduced) one returns to step 6
and the next exploratory move is investigated.
10. The optimization procedure terminates once
all the exploratory moves made from a given base
point are completed "without success." This
base point defines the local minimum.
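The second check described in step 8 is a first-order estimate and can be sketched very compactly. In the Python fragment below, the derivative vector and the mean parameter vectors are hypothetical values chosen for illustration, and the function names are assumptions; only the form of the estimate follows (32).

```python
def estimated_change(gradient, xnom, xnoms):
    """First-order estimate of the object-function change, as in (32)."""
    return sum(g * (new - base) for g, new, base in zip(gradient, xnom, xnoms))

def worth_analyzing(gradient, xnom, xnoms):
    """True when the estimate is non-positive, so a full analysis is run."""
    return estimated_change(gradient, xnom, xnoms) <= 0.0

# Hypothetical derivative vector and mean parameter vectors.
gradient = [1.0, -2.0, 0.5]
base_point = [10.0, 5.0, 3.0]          # XNOMS
move_a = [12.0, 7.0, 3.0]              # XNOM for one exploratory move
move_b = [12.0, 5.5, 3.0]              # XNOM for another exploratory move

analyze_a = worth_analyzing(gradient, move_a, base_point)
analyze_b = worth_analyzing(gradient, move_b, base_point)
```

Screening moves this way costs one dot product per candidate, whereas a complete system performance analysis requires re-evaluating every performance function and its tolerances.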
Figure 3 summarizes the described design strategy
in the form of a flow chart for the computer program.
For simplicity's sake, only the logic fundamental to
the directed search option is included.
Figure 3-Directed search basic program logic
Automated design example
Application problem
The example presented here is the automated design of an instrument servomechanism consisting of a
follow-up device, electronic amplifier, drive motor with
feedback generator, and geartrain. A pictorial diagram
showing a fixed system configuration using these components is shown as Figure 4.
It is assumed that a design of this configuration must
meet up to five preassigned specifications in the areas
of damping, accuracy, and time response. Table III
lists the specifications by name and vector notation,
tells whether each specification is an upper or lower
bound, and the units used.

Table III-System specifications
Name              Symbol   Boundary   Units
Static accuracy   S1       upper      degrees
Resolution        S2       upper      degrees
Velocity lag      S3       upper      degrees
Follow-up rate    S4       lower      deg/sec
Damping ratio     S5       lower      -
Figure 4-Schematic diagram of motor-generator
instrument servomechanism
Four component libraries are established to list the
part characteristics as follows:
a. Follow-up: 25 part numbers
b. Amplifier: 50 part numbers
c. Motor-generator: 25 part numbers
d. Geartrain: 25 part numbers
Even though the size of each demonstration library was
purposely kept small, the number of theoretically possible
candidate systems is large; namely, 25 X 50 X 25 X
25 = 781,250.
The optimum collection of components is defined as
"the one that satisfies the given specification in a manner resulting in minimum total cost."
Component libraries and search matrices
The design equations corresponding to the five
specifications are listed in Table IV [see Reference 4
for their derivation]. By grouping the parameters
shown in Table IV according to component and adding
the corresponding component cost, one obtains the X
parameter vector as summarized in Table V.

Table IV-System design equations
Name              Symbol   Equation Used
Static Accuracy
Resolution
Velocity Lag
Follow-up Rate
Damping Ratio

In addition to specifying any desired combination
of the above described five performance requirements,
the user must also define the load that the servo is to
drive. For the example program developed, the load
is represented by an inertia ($J_l$) and a coulomb friction
($T_l$). These are shown as $X_{21}$ and $X_{22}$ of Table V.
The components selected to make up the libraries for
this study, chosen so as to provide a broad base of design, are typical of those used throughout the servomechanism industry. An example of the parameter
values used is shown in Table VI, which consists of the
follow-up component library.
Each column of the library data is labeled with the
appropriate X-vector notation; i.e., $X_1, X_2, \ldots, X_{20}$,
each of which is assumed to be a random variable with
a normal distribution defined for each component by
the mean ± 3 sigma limits given by the MAX
and MIN values shown. The variables $X_i$ for i = 1, 4,
9, 16, 17, and 20, which are the individual component
costs, motor rated voltage and the gear ratio and have
no manufacturing tolerance, are still treated as random variables but having zero variance; $XMAX_i = XMIN_i$.
The search matrices are generated immediately after
the library data is stored in the computer system. The
search matrix for the follow-up is shown as Table VII
and consists of the follow-up component part numbers
arranged in an ordered array.
Computer solution
In order to demonstrate the application of the program in its most comprehensive form, a customer requirement is assumed which makes use of all five specifications. The particular set is:
1. Static accuracy = 0.35 degrees
2. Resolution = 0.3 degrees
3. Velocity lag for 300 deg/sec input = 5 degrees
4. Follow-up rate = 300 deg/sec
5. Damping ratio = 0.5
The assumed labor cost is $200.
The results obtained using the program in the direct
search mode now are illustrated in detail for three
searches. The first, shown in Table VIII, is a case where
the initial guess fails completely to meet three out of
Table V-Component vector notation for library
COMPONENT          VAR   PARAMETER NAME            SYMBOL    UNITS
FOLLOWUP           X1    Cost                      C_f       dollars
                   X2    Gain                      K_f       volts/rad
                   X3    Accuracy                  θ_f       minutes
AMPLIFIER          X4    Cost                      C_a       dollars
                   X5    Gain to Followup          K_af      volts/volt
                   X6    Gain to Generator         K_ag      volts/volt
                   X7    Output Saturation Level   E_sat     volts
                   X8    Output Null Voltage       E_an      volts
MOTOR GENERATOR    X9    Cost                      C_m       dollars
                   X10   Stall Torque              T_s       oz-in
                   X11   No-Load Speed             S_m       rpm
                   X12   Inertia                   J_m       gm-cm^2
                   X13   Starting Voltage          E_s       volts
                   X14   Generator Gain            K_g       volts/1000 rpm
                   X15   Generator Null            E_gn      millivolts
                   X16   Rated Control Voltage     E_c       volts
GEARTRAIN          X17   Cost                      C_g       dollars
                   X18   Inertia                   J_g       gm-cm^2
                   X19   Friction                  T_g       oz-in
                   X20   Gear Ratio                N         --
LOAD               X21   Inertia                   J_l       gm-cm^2
                   X22   Friction                  T_l       oz-in

Table VI-Followup library data
Xl
PART
NU.
COST
f)OLLARS
1001
100?
lJ'H
1004
10'15
10')6
1') )7
10)8
1(119
1011
10 II
1012
1013
1014
1015
1016
300.00
24.00
35.00
20C./)0
61)0.f)O
28.00
40.00
3£:.00
27.')0
30.0CI
95.00
90.0r
300.00
60.0(
16.0('1
3').00
If)! 7
260./)0
1018
15C'.OO
2C'.:JG
101Q
1021
10?1
1022
1023
1::>24
L125
ze.nC'
26.00
V'i.O('
2{i. ')O
28.f)C
18.00
X VECTOR NOTATION
X2
FClLOWUP C;AIN
(VOLT S/I1.AD)
MAX
23.6O(;C
12.7rcc
24.flOCC
0.5050
/').502"
24.80CO
12.100C
12.100(
12.70CI)
O.50~0
11.700('
0.5050
0.505e
24.8'JOC
12.7C~C
2~.q()CC
11.7(')CO
23.tOCC
27.00CC
C.515C
5.5CeC
5.25(0
'5.5CCO
5.25(0
5.500C
M1N
21.4000
1('.3000
2C.2000
C.495C
0.4975
20.200('
1(,.900Co
1 C.9000
1('.3000
[.495('
11.300('1
0.4950
C.495C
20.2000
1 o. :'IOOC
19. 100C
11.3000
21.4;1)00
lA.OOOO
O.505C
4.5;OCC
4.7500
4.5000
4.1500
4.500C
T~ble
X3
COST
ACCUR ACY
(MIN OF ARCI
MAl(
1.0
10.0
7.0
"0.0
10.0
15.0
3.0
7.0
15. (',
120.0
2.0
60.0
15.0
3.0
30.0
10.0
I.e
7.0
30.0
180.0
10.0
5.0
1<;.0
7.0
30.0
VII-Followup search matrix
MJN
/).')
f).0
('.0
O./)
0.0
0.0
0.0
~.O
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
).0
C.O
I)./)
n.n
0.0
1015
1025
1023
101q
1009
lOr) 2
1021
1006
1024
10?O
1022
1010
1016
1013
100R
1007
1014
1012
1011
1010
1005
1004
1012
1013
1020
1024
1021
1025
1072
1023
1015
1007
1011
1009
1004
1017
1013
10'11
1017
1:)08
1]02
1014
10(')6
1003
101R
1816
100 1
IO()5
IOlY
101~
the five required specifications, thus resulting in an
infinite cost (shown as **** when cost ≥ 1 × 10^6
dollars). Each line represents an analysis run and lists
the cost (10), scalar (11), total reject [upper bound of
(12)], the four component part numbers used, and the
individual specification rejection percentages [R_i
using (16) for i = 1, ..., 5]. Fifty-five iterations
are required by the program to minimize the scalar
object function to the point where the cost becomes finite and the program switches from the coarse
to the fine search mode. It should be noted that for
this and subsequent computer runs, the intermediate
printout is eliminated for all iterations where the
scalar (cost when in fine search) is not reduced. These
are considered "failure iterations" as is the case for
numbers 2, 6, 7, etc., for the coarse search in Table
VIII.
Once the program is in the fine search mode, the
cost is minimized up to run number 202 where it is
reduced from $38,261.30 to $374.27. As shown, an additional 23 iterations are required according to the
termination procedure, as explained in an earlier section, in order to establish that part numbers 1009,
2003, 3002, and 4014 establish a local minimum.
Table IX illustrates the results obtained from the
second search. This case represents the opposite condition where the initial guess at first hand looks like a
K~F
THETA
101 7
1001
1018
1011
1014
1007
1022
1074
1003
100R
1021
10C5
1002
1016
1013
1('06
1023
1 !)( 9
1 en 5
1025
1 01 9
1004
1012
1010
1020
"reasonable design"; i.e., the rejection is only 0.77
percent. However, after 74 iterations in the direct
search mode, the cost has been reduced from the
original design value of $555.30 to only $374.27,
a savings of $181.03 per unit! The computer run time
was less than one minute.
The third search is shown in Table X where this
time the initial parts result in a design which fails completely to meet four out of the five specifications. After
55 iterations, the program has reduced the scalar from
59,610,000 to 3.396 and only one specification remains a
complete failure; however, this point turns out to be
a local minimum and no further reduction is obtained.
A total of 15 searches was made and the local
minimums found and their frequencies are summarized
in Table XI. Based on the results listed in Table XI,
the system obtained using part numbers 1009, 2003,
3002 and 4014 is assumed to be the best design. The
final computer printout sheet summarizing this combination is shown as Table XII.
Table VIII-Directed search with initial guess underdesigned
····*··***BEGIN SEARCH
,
RUN
NO.
C )ST
1 *********'
~
••••• ****
4 **.******
5 **.******
>l *********
12 .********
16 *********
B *********
34 **** •• ***
35 *********
40 *****:l<***
41 *********
55
3 B 261. 30
76
2999.5C
109
79,.05
119
565. ')4
120
544.01
122
537. ·)1
127
527.15
521.0(
133
136
512.11
14q
4d1.JC
159
432. Jr
423. ')(
162
172
42'>. H
118
424.26
41 b. ')(
181
194
414.')('
202
314.21
22'5
314.27
SCAlAQ.
3. 722 [+i:'l2
7.147E+02
3.)42E+J2
8. 662E +CO
8.61 Ef +C'O
8.611E+:}·,)
5. BO[ +00
4.497E +00
4. 'B4E' +::0
3.l59E+SO
3.291E:+on
3.207F+(lO
3.222HO(,
3.213E:+00
PEI'CENT
DEJECT
lCJ.'C
1").OC
1(,:). C'O
101. 0('
1 C). oc.
1')).00
In').0C
101').11(,
1(1).('(,
1 C'O. 0(·
100 •.1(
1 '"'". CC
qR.76
h4.6S
? B. 12
4.742E+OJ
4.254F+JI)
5.27bF+')O
5.349[+1'\0
1 •.') 1
0.1)0
1.1(
'J.Gl
2.89b~+SI
1.~:6E+Cl
J.O
b. 4"'94E' +00
5.166[+00
0.0('
S.694E:+0~
1.00
5.4~OE+')a
(\.:)0
'5.!>9t-F+C'J
5. 522E+'J8
5.293E+),)
5. 144,E:+00
6.973F+QO
6. 973E +00
1. 1 f.
1 • C"l
o.ro
o. or
!."C
1.41
1.4)
~U"'BEk
2
•••••••••
COMPCNE NT S SHECTED
·······INDIVIDUAL REJECTIONS·····
FOUP A,..,P ~OGEN GRTR
STAlIC RES
LAG
FURA TE
OAMP
1"0') Z(46 3015 402i IIUJ.L'U 1()u.00 IOO.O~
0.00
0.03
10C5 2('46 3e12 4022 100.0C 100.00 1 CO. 0'
0.00
1.09
10(5 2(46 3021 4022 10O.OC 100.00 IOC.O)
0.0
100.00
Ion 2C46 3021 4(\22 18,,45 82.34
0.0
0.0
100.00
1024 2(46 3(>21 4022
4 .. 18
o.c
82.88
0.0
100.01)
1022 2('46 3021 4022
0 .. 90
82.88
100. (II)
0.0
0.0
lJ22 2("46 3021 4024
0 .. 01
21.95
C.O
0.00 100.0')
1022 204h 3012 4024
55.55 100.OC
O.C
0.03 11)0.00
10G7 2('46 3012 4024
0.0
O.CO
C.O
0.03 100.00
1 ('C' 7 2('46 3"23 4024
O.CO 63.32
0.2:)
56.16
94.76
If)G7 2046 3005 4(:24
iJ.CO
74.21
27.87
B8.46
55.68
1('('7 2(46 3011 4024
0.00
21.45
18.67
54.19
6.71)
1 I) 1 1 2046 3011 4024
0.00
21.11
16.88
54.19
6.52
10 l I 2(46 3011 4023
0.0
11.20
16.3()
51.22
5.93
1(11 2('2'5 3011 4023
0.0
0.0
O.OJ 28.32
0.0
1 r~: 1 2C2'J '3 ('11 4020
O.C
0.0
0.0
C.01
0.00
1 (I) 1 ?r.? 5 3002 4020
0.0
0.0
0.00
0.00
0.00
101 1 2r2'5 ~OC2 4014
0.0
0.0
O.OJ
0.00
0.00
1111 2(.41 3002 4014
0.0
0.0
0.01
0.0
0.0
1 ~ll 2030 3002 4014
0.0
0.0
0.0
0.0
o..e
Hl1 20B 30e2 4014
0.(\
0.0
0.0
0.00
0.01')
1') 1 1 204A 30(2 4014
0.0
0.0
0.0
0.00
0.00
trH'7 2(413 30('2 4014
0.0
0.0
0.0
0.00
0.00
10(,8 204A 3002 4014
0.0
0.0
0.0
0.00
0.00
H~ 16 2('48 3007 4014
0.0
0.0
0.0
0.00
1.16
),)(,6 2C48 3002 4(\14
0.00
0.0
0.0
0.00
1.01
10('2 2C48 3C07 4('14
0.00
0.0
0.0
0.00
0.00
10('9 2048 3002 4(114
0.0C'
1).0
0.0
0.00
0.01)
lCC9 2e03 3002 4014
0.00
0.0
1.4:>
0.00
0.00
lOOq 2('03 3()02 4014
o.co 0.0
1.4)
0.00
0.00
M
C
0
E
1
2
2
2
2
2
2
2
2
2
2
2
2
'3
3
3
'3
'3
'3
3
'3
'3
'3
'3
'3
3
3
3
3
4
Table IX-Directed search with initial guess overdesigned

[Table IX columns: RUN NO., COST, SCALAR, PERCENT REJECT, COMPONENTS SELECTED (FOUP, AMP, MOGEN, GRTR), INDIVIDUAL REJECTIONS (STATIC, RES, LAG, FURATE, DAMP), MODE. Runs listed: 1, 2, 5, 7, 8, 12, 26, 27, 36, 42, 44, 56, 74, 97. Runs 74 and 97 select components 1009, 2003, 3002, 4014 at a cost of $374.27 with 1.41 percent reject.]

The claim that the above $374.27 local minimum
is also the absolute minimum can be checked, for this
example, by the following procedure.
The lowest possible cost for a system made up of any
collection of components is the sum of the individual
component costs and the labor cost, since any rejects
only increase this cost. Therefore,
to test whether a local minimum is also the absolute minimum,
one need analyze only the subset of the total combinations for which

labor cost + Σ component costs < local minimum     (33)

If it turns out that analyzing each system in this subset
Table X-Directed search resulting in an unsatisfactory local minimum

[Table X columns: RUN NO., COST, SCALAR, PERCENT REJECT, COMPONENTS SELECTED (FOUP, AMP, MOGEN, GRTR), INDIVIDUAL REJECTIONS (STATIC, RES, LAG, FURATE, DAMP), MODE. Runs listed: 1, 2, 3, 5, 7, 8, 10 through 19, 55, 156. Every run shows 100.00 percent reject; the search ends at run 156 with components 1022, 2006, 3015, 4017.]

Table XII-Best design obtained using directed search

DESIGN AUTOMATED RESEARCH PROGRAM
JANUARY 15, 1969

**** DEFINITION OF LOAD ****
                                  MAX         MIN
INERTIA (GM-CMSQR)                9.000E+02   7.000E+02
FRICTION (OZ-IN)                  5.000E-01   4.000E-01

**** PART NUMBERS OF COMPONENTS SELECTED ****
FOLLOWUP    AMPLIFIER    MOTOR-GEN    GEAR TRAIN
1009        2003         3002         4014

**** PERFORMANCE ****
                                  MAXIMUM     MINIMUM     SPEC LIMIT   PCT REJ
TOTAL INERTIA (GM-CMSQR)          4.011E+03   2.928E+03
TORQUE CONSTANT (OZ-IN/RAD)       4.578E+03   2.056E+03
DAMPING COEFFICIENT (OZ-IN-SEC)   6.169E+01   2.711E+01
NATURAL FREQUENCY (HERTZ)         4.895E+01   3.218E+01
STATIC ACCURACY (DEG)             3.033E-01   4.712E-02   0.350        0.00
RESOLUTION (DEG)                  5.457E-02   2.666E-02   0.300        0.0
LAG FOR 300. DEG/SEC RAMP (DEG)   5.295E+00   3.091E+00   5.000        1.40
FOLLOWUP RATE (DEG/SEC)           9.702E+02   6.192E+02   300.000      0.00
DAMPING RATIO                     2.227E+00   1.213E+00   0.500        0.00

**** COST SUMMARY ****
PCT REJECTION (UPPER BOUND)          1.41
PCT REJECTION (INDEPENDENT)          1.41
PCT REJECTION (LOWER BOUND)          1.40
LABOR COST                         200.00
PARTS COST                         169.00
TOTAL COST (USING R-UPPER BOUND)   374.27

E SIGNIFIES CONVENTIONAL POWER-OF-TEN NOTATION
Table XI-Local minimums obtained for design example

Number Times   System     Rejection   Component Part Numbers
Occurred       Cost                   P1     P2     P3     P4
1              ∞          100%        1016   2004   3024   4025
1              ∞          100%        1022   2006   3015   4017
1              $410.01    0.25%       1023   2008   3006   4014
1              $394.63    1.43%       1009   2003   3002   4002
1              $394.29    0.07%       1009   2012   3006   4014
9              $374.27    1.41%       1009   2003   3002   4014
1              Search terminated as iterations exceeded maximum allowed of 300
results in a total system cost higher than the local
minimum being investigated, the latter is the absolute
minimum.
For the above $374.27 local minimum there are
17,835 combinations which satisfy (33). This number,
although large, is much less than the 781,250 total possible
combinations, and it becomes a practical value
when one considers the solution time. The 17,835
combinations were, therefore, analyzed (at a cost of
1.7 hours of computer time, compared with 74.4 hours for
a complete exhaustive search), and each resulted in a
total system cost > $374.27, thus proving the latter
to be the absolute minimum.
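The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the library layout (lists of `(part_number, cost)` pairs), the function names, and the `total_cost` callback that stands in for the full statistical analysis of one system are all hypothetical and are not taken from the author's program.

```python
from itertools import product

def candidates(libraries, labor_cost, local_minimum):
    """Yield every combination that satisfies inequality (33).

    libraries: one list per component type, each holding (part_number, cost)
    pairs. This layout is an assumption made for the sketch.
    """
    for combo in product(*libraries):
        parts_cost = sum(cost for _part, cost in combo)
        if labor_cost + parts_cost < local_minimum:
            yield combo

def is_absolute_minimum(libraries, labor_cost, local_minimum, total_cost):
    """Return True if no candidate costs less than the local minimum.

    total_cost(combo) is a hypothetical callback that performs the full
    analysis (including rejects) of one system and returns its cost.
    """
    return all(total_cost(combo) >= local_minimum
               for combo in candidates(libraries, labor_cost, local_minimum))
```

Only the combinations passing the cheap test in `candidates` are handed to the expensive analysis, which is the source of the saving reported above (17,835 systems analyzed instead of 781,250).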
REFERENCES
1 D MARK
Choosing the best method of variability analysis
Electronic Design Nov 8 1963
2 D G MARK L H STEMBER JR
Variability analysis
Electro-Technology Vol 76 July 1965 35-48
3 B A CHUBB
Computer aided optimization of nonlinear servomechanism
employing a directed search of multiparameter component
libraries and statistical tolerancing
Michigan State Univ 1969 PhD Thesis
4 B A CHUBB
Modern analytical design of instrument servomechanisms
Addison-Wesley 1967
Computer-aided design for custom
integrated systems
by W. K. ORR
The Singer Company-Friden Research Center
Palo Alto, California
INTRODUCTION
The computer-aided design (CAD) system described
herein was developed to aid in the design of digital
systems to be implemented by custom integrated circuits (CIC)
and multi-chip hybrid custom integrated systems (CIS).
The terms MSI/LSI are avoided here because of the general
confusion which exists in the literature as to what
constitutes an MSI/LSI circuit. The
CAD system philosophy is that each CIC is implemented
from a selected set of "library elements".
This design approach results in some size inefficiencies
compared with manual designs, but provides many
advantages, of which flexibility and a shortened design
cycle are the most important. This CAD system
captures fundamental design information in a machine-readable
form early in the design process, thus maximizing
potential computer assistance and minimizing
costly and time-consuming errors. This paper contains
an overview of the complete CAD system, highlighting
its more distinctive features. The complete system
has been operational on a 360/30 for several months,
and specific experiences with it can therefore be discussed.
Overview
Following are the major sections of the complete
CAD system, and the distinctive features to be discussed more fully in later sections.

Logic design
These programs convert a description of the logical
function required of a CIS into the corresponding
functional logic.
Distinctive features include:
1. Logicspec, a special register-transfer source language,
2. Compiled functional logic independent of hardware implementation,
3. Designer control of factoring and gathering.

Logic simulation
These programs provide a complete simulated environment for the CIS, and a bit-simulation of response
to input pattern sequences.
Distinctive features include:
1. Random access and cyclic memory,
2. Read-only-memory,
3. Time-dependent and conditional input signals,
4. Logic level statistics,
5. Selective output facilities.

Logic conversion
These programs convert the functional logic to the
logic family selected for hardware implementation, and
create the design data-base.
Distinctive features include:
1. Efficient NAND/NOR logic generation,
2. Wired-OR

Logic element design
These programs facilitate origination and revision
of the library elements used in final system implementation.
Distinctive features include:
1. Graphospec, a special graphic source language,
2. Logic element library,
3. Artwork generation facility.

Partitioning
These programs enable the designer to explore alternative partitionings, and post the final locations
of all logical elements to the design data-base.
Distinctive features include:
1. Minimization of total pad-count for the CIS, and
2. Extensive designer/computer interaction.

Element selection
These programs select the smallest eligible element
meeting all the circuit requirements.
Distinctive features include:
1. Automatic insertion of gate expanders and intra-CIS pads, and
2. Capabilities for handling variable size elements.

Element interconnection
This program establishes the X-Y interconnection
routing.

[Figure 1: flow diagram of the CAD system; legible box labels include LIBRARY MAINTENANCE, GRAPHOSPEC, INTERCONNECTION, CONNECTION DECK and VERIFICATION.]
Figure 1-Computer-aided design system for CIS

Because of the nature of this paper, references are
not cited in the text; instead, an annotated bibliography
is given at the end of the paper.
Logic design
The initial input to the CAD system, as shown in
Figure 1, consists of a set of Logicspec statements.
Logicspec is a language which has been developed to
simplify the task of describing a logic design in machine-readable form.
The Logicspec language permits the designer to
avoid many of the burdensome details of logic design.
These details are filled in by the Logicspec Translator,
which converts a Logicspec description into a complete set of design equations. These design equations
are essentially Boolean equations, the operators being
AND, OR and NOT. However, they are written in a
modified form of Polish notation. In this notation the
equation
A = B·C·D + E·F
appears as
A = ((B C D ·) (E F ·) +)
An important characteristic of this notation is that
each operator corresponds to a gate in an AND/OR
implementation of the equation. This greatly simplifies
those programs in the CAD system which must operate
on these equations.
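The one-operator-per-gate property can be illustrated with a small Python sketch. It assumes simple signal names without embedded parentheses and writes AND as "." and OR as "+"; the function names and the tuple representation are illustrative choices, not part of Logicspec or of the CAD programs.

```python
def tokenize(text):
    """Split a modified Polish expression into parentheses and names."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Parse one operand: a signal name or a group '(operand ... operator)'."""
    token = tokens.pop(0)
    if token != "(":
        return token
    items = []
    while tokens[0] != ")":
        items.append(parse(tokens))
    tokens.pop(0)                      # drop the closing parenthesis
    *operands, operator = items        # the operator is written last
    return (operator, operands)

def count_gates(tree):
    """Each operator in the tree corresponds to one AND or OR gate."""
    if isinstance(tree, str):
        return 0
    _operator, operands = tree
    return 1 + sum(count_gates(o) for o in operands)

def evaluate(tree, values):
    """Evaluate the expression for a dict of signal values."""
    if isinstance(tree, str):
        return values[tree]
    operator, operands = tree
    results = [evaluate(o, values) for o in operands]
    return all(results) if operator == "." else any(results)
```

For the equation above, `count_gates(parse(tokenize("((B C D .) (E F .) +)")))` counts two AND gates and one OR gate.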
Since the Logicspec language is similar to other
register transfer languages which have been proposed,
only some of its more distinctive features will be
discussed here; a full description will be published
elsewhere.
Flip-flops are the only memory elements dealt with
directly in a Logicspec description. Memory systems
such as core and delay line memories are treated as
systems interfaced to the logic design through signal
lines. The description of these memories is deferred
until simulation, where the simulator controller governs
the manner in which the various memories interact
via the signal lines, with the logic design.
Flip-flops are introduced in a description through
the use of a Flip-Flop Collection declaration such as
FFC 12 A(1, 8*), B(1*, 8)
The foregoing indicates that the collections A and B
both contain eight type-12 flip-flops. The "*" identifies
the high order end of the collections for decoding references such as "A = 2". The type code ("12") is used
by the simulation system to determine how the associated flip-flops are to be simulated and by other programs
to determine how the flip-flops are to be implemented. This information is contained in an on-line
disk library which can be expanded as required. Each
flip-flop declared may have a maximum of five input
and two output terminals:
A(1)/R, A(1)/S, A(1)/T, A(1)/P, A(1)/C, A(1) and A(1)'.
The function of each of these terminals is determined by
the information contained in the corresponding library
entry.
The bulk of a Logicspec description consists of a
set of statements which specify that if a certain condition C is true then an action S, or set of actions S1,
S2, ..., Sn, occurs.
The statement form actually used by the designer
is the more concise conditional statement:
IF C THEN S
or a corresponding form when several actions are involved.
The actions prescribed may include such operations
as SET A, CLEAR B, C → D (transfer C to D), and
INHIBIT TX. Conditional statements can be nested,
i.e., Si could be another conditional statement. The
condition C may be any Boolean expression formed
using the operators + (OR), · (AND) and ' (NOT).
It is permissible to describe an entire design using only
Boolean equations; one need not use conditional statements if he so desires.
Most designers who have used the system feel that
the conditional statement is rather cumbersome, and
generally prefer to use an alternate form referred to
as a qualification statement. This statement takes the
form:
*C:
All subsequent statements are conditioned by C
until another qualification statement occurs which
overrides the condition C. To illustrate this consider
the following:
FFC 12 A(1*,3), B(1*,2), C(1*,2);
*(A = 3):
2 → B;
CLEAR C;
*(A = 0):
B → C/S;
The modified Polish equations produced by the Logicspec translator for the above description are:
C(1)/C = *(A = 3)*
C(1)/S = (*(A = 0)* B(1)/1 ·)
C(2)/C = *(A = 3)*
C(2)/S = (*(A = 0)* B(2)/1 ·)
B(1)/R = *(A = 3)*
B(2)/S = *(A = 3)*
*(A = 3)* = (A(1)/0 A(2)/1 A(3)/1 ·)
*(A = 0)* = (A(1)/0 A(2)/0 A(3)/0 ·)
Qualification statements may be nested using a form
of subscripting:
*C1: S1;
     S2;
   *1 C2: S3;
          S4;
   *1 C3: S5;
*C4: S6;
In the above, S1 and S2 are conditioned by C1, S3 and
S4 by C1·C2, S5 by C1·C3 and S6 by C4 only. Logicspec is a free-form language, thus the indentations above
are for documentation only.
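The way nested qualification statements accumulate conditions can be sketched as follows. The encoding is an assumption made for illustration: each qualification statement is given as a tuple (level, condition, actions), with level 0 for the "*C:" form and level 1 for the "*1 C:" form; the translator's real internal representation is not described in the text.

```python
def effective_conditions(statements):
    """Map each action to the list of conditions that qualify it.

    statements: list of (level, condition, actions) tuples, an assumed
    encoding of qualification statements in source order.
    """
    stack = []      # conditions currently in force, outermost first
    result = {}
    for level, condition, actions in statements:
        del stack[level:]          # a new qualifier overrides same-or-deeper ones
        stack.append(condition)
        for action in actions:
            result[action] = list(stack)
    return result

# The nested example from the text, in the assumed encoding.
example = [
    (0, "C1", ["S1", "S2"]),
    (1, "C2", ["S3", "S4"]),
    (1, "C3", ["S5"]),
    (0, "C4", ["S6"]),
]
conditions = effective_conditions(example)
```

Each list in `conditions` is read as the AND of its members, so S3 is conditioned by C1·C2 and S6 by C4 alone, as stated above.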
The structure of a Logicspec description contains
important "clues" which are used by the translator
to produce efficient logic. As an example, the majority
of common control conditions are described using
qualification statements. Referring to the above, C1
is a common control condition in that it controls the
actions S1 and S2 and, in conjunction with C2, S3 and
S4. The Logicspec translator searches all qualification
statements for such common conditions, and may
either duplicate the gates involved every time the
condition is used or generate a new signal which is
used wherever the condition appears. This decision
is under the control of the designer, who specifies the
minimum number of times a condition must be used
before a new signal is generated. The designer can also
control the generation of new signals based on how the
condition is used and the number of gates required to
generate the condition.
The designer can use the flexibility described above to
reduce the time required to simulate a design by instructing the translator to generate a new signal for
every common condition. This generally reduces the
number of gates in a design and thus the gate evaluation time during simulation.
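A minimal sketch of the designer-controlled threshold follows. It assumes that each use of a condition is recorded as a (target, condition) pair and that generated signals receive names of the form *K1*, *K2*; both the data layout and the naming are hypothetical, and the other controls mentioned above (how the condition is used, number of gates) are left out.

```python
from collections import Counter

def factor_conditions(uses, min_uses):
    """Decide which conditions get a signal of their own.

    uses: list of (target, condition) pairs, where condition is a tuple of
    literals. Returns (shared, equations): shared maps each generated signal
    name to its condition; equations pairs each target with either the
    shared signal name or the duplicated condition.
    """
    counts = Counter(condition for _target, condition in uses)
    names = {}
    equations = []
    for target, condition in uses:
        if counts[condition] >= min_uses:
            name = names.setdefault(condition, f"*K{len(names) + 1}*")
            equations.append((target, name))       # reuse one generated signal
        else:
            equations.append((target, condition))  # duplicate the gates
    shared = {name: condition for condition, name in names.items()}
    return shared, equations
```

Setting `min_uses` to 1 generates a new signal for every condition, which corresponds to the simulation-oriented setting described above.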
The basic Logicspec language is very simple, but
means are provided for extending the language through
the use of subsystem definitions. A subsystem definition
for the four bit ring-counter pictured in Figure 2 is
given in Figure 3.
In Figure 3, line three is a signal collection declaration for the single rail bus OUT (double rail bus declarations begin with SIGC/2). Line four indicates that
the words COUNT and SET0 are to be added to the
basic Logicspec vocabulary whenever a RINGC is
used. Lines five and six simply describe fixed connections.
Once a subsystem has been defined and added to
the subsystem library, the designer may use it in one
of two ways: he may INCLUDE it or simply SIMULATE it as part of his design.
The INCLUDE option specifies that the actual
text describing the subsystem is to be passed to the
Logicspec translator and processed along with the
text describing the rest of the design, in much the same
way as a macro call functions in programming languages.
The SIMULATE option makes the logical description
of the subsystem available for simulation purposes
only; the rest of the logic design must interact with
the subsystem through its input/output terminals.
The subsystem logic does not become part of the system being designed: subsystem simulation information
is passed directly to the simulation program, and is
not processed by the Logicspec translator.
The same subsystem may be included and simulated
in the same design. For example
INCLUDE RINGC A(AO), B(BO);
SIMULATE RINGC C(CO);
indicates that two ring-counters, A and B, whose output
buses are AO and BO respectively, are to be included in a design, whereas C is only to be simulated.
The efficiency of the logic produced by the Logicspec
translator has been evaluated using designs for
two systems which were in production before Logicspec
was developed. These two systems were described
in Logicspec, processed through the translator, and
the resultant logic compared against that in the production systems. In both cases the logic produced
by the translator contained five percent more gates
than the production designs.
[Figure 2: logic diagram with inputs COUNT and SET 0 and outputs OUT 1 through OUT 4.]
Figure 2-Four bit ring-counter

DEFINE RINGC (OUT);          (1)
FFC 12 A(1*, 4);             (2)
SIGC OUT(1*, 4);             (3)
OPERATION COUNT, SET0;       (4)
A → OUT;                     (5)
A(4)' → A(1);                (6)
*SET0: CLEAR A;              (7)
*COUNT: SHR A;               (8)
END;                         (9)
Figure 3-Subsystem definition for ring-counter

Logic simulation
The electronics industry increasingly uses logic
simulation to eliminate logic design errors before committing
a design to hardware. Many designers, however, insist on building breadboards to isolate lead-length
and other circuit problems. In some cases, this
is still a valid position. However, whenever the product
will ultimately use CIC's, a breadboard serves only
to correct logic errors, simply because of the difference
between the breadboard and final product technologies.
The creation of a logic simulation program begins
with the simulator ordering program. This program
orders the design equations in preparation for the
simulator compiler, which produces the simulation
code. The equation order E1, E2, ..., En produced
by the ordering program has the following property:
the variable defined by equation Ei is a function of
flip-flop outputs, system inputs (external inputs) or
variables which have been defined in the preceding
equations E1, ..., Ei-1. In addition a level list is produced
which gives the number of gate delays in the definition
of each signal. This list is used by the designer to isolate signal paths which contain excessive delays. These
may be eliminated by changing the Logicspec description.
Whenever an equation occurs which defines a signal
as a function of itself, the program will fail to order it.
At the completion of the ordering process a list of all
unordered equations is produced. The designer must
change his description such that every equation can
be ordered before proceeding to the simulator compiler.
From this the reader may wonder how flip-flops built
from cross-coupled gates (latches) are processed. The
answer is that the designer uses a flip-flop which has
the characteristics of a latch, but he does not write
the equations which describe the latch itself.
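The ordering property can be sketched as a repeated-pass topological ordering. The sketch below is not the ordering program itself: the dictionary layout and names are assumptions, and for simplicity each equation is counted as a single level of delay, whereas the level list described above counts gate delays.

```python
def order_equations(equations, primary):
    """Order equations so each uses only previously defined signals.

    equations: dict mapping a defined signal to the set of signals it uses.
    primary: set of flip-flop outputs and system inputs.
    Returns (order, levels, unordered).
    """
    levels = {name: 0 for name in primary}
    order = []
    remaining = dict(equations)
    progress = True
    while remaining and progress:
        progress = False
        for signal, inputs in list(remaining.items()):
            if all(i in levels for i in inputs):
                levels[signal] = 1 + max((levels[i] for i in inputs), default=0)
                order.append(signal)
                del remaining[signal]
                progress = True
    # Anything left depends, directly or indirectly, on itself.
    return order, {s: levels[s] for s in order}, sorted(remaining)

order, levels, unordered = order_equations(
    {"X": {"A", "B"}, "Y": {"X", "C"}, "Q": {"Q", "A"}, "Z": {"Y", "X"}},
    primary={"A", "B", "C"},
)
```

Here Q is defined as a function of itself, so it is reported in `unordered` rather than placed in the order, mirroring the behavior described above.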
The simulator compiler generates code to evaluate
each equation in the order specified by the ordering
program. One pass through this code may represent
one simulated clock time; the equivocation is clarified by the discussion of the simulator controller.
The simulator controller simulates all memory elements in a given CIS design, monitors various signals
to find predesignated error conditions, and applies
time-varying input signals so as to provide a realistic
simulation of the environment in which the CIS must
operate. A set of powerful commands has been developed to facilitate the designer's interaction with
the simulator, and to maximize the information he
receives about the simulation results. Concise statements are provided for describing wave forms which
are to be applied to the machine's inputs (system inputs). Commands are provided to control the display
of selected signals and flip-flops during simulation,
as well as the status of any delay line or core memories
involved in the design.
The flip-flop control procedure used by the simulator controller is outlined in Figure 4. A pass through
the simulation code will define each signal and flip-flop
input. If any asynchronous (non-clocked) flip-flop
changes are required, the controller makes these changes
and another pass is made through the simulation code
to propagate the effect of these changes. The controller
counts the number of times recycling is required between clock times. If this count exceeds a limit specified
by the designer, an error message is generated, thus
permitting detection of any oscillating conditions
which may be present in a given design. When there
are no more asynchronous changes, a clock time is defined and all clocked flip-flop changes are made. This
procedure for handling asynchronous flip-flop changes
is also used to handle asynchronous changes in all other
types of memories.
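The control procedure for one clock period can be sketched as a loop. The three callbacks and the dictionary of flip-flop states are assumptions introduced for this illustration; they stand in for the compiled simulation code and the library-driven flip-flop models, whose actual interfaces are not given in the text.

```python
def run_clock_period(state, evaluate, async_changes, clocked_changes, limit):
    """Advance `state` (a dict of flip-flop values) by one clock period.

    evaluate(state) -> dict of signal values (one pass through the code).
    async_changes(state, signals) -> dict of non-clocked flip-flop updates.
    clocked_changes(state, signals) -> dict of clocked flip-flop updates.
    Returns the number of recycles needed before the clock time.
    """
    recycles = 0
    while True:
        signals = evaluate(state)
        pending = {ff: value
                   for ff, value in async_changes(state, signals).items()
                   if state[ff] != value}
        if not pending:
            break
        recycles += 1
        if recycles > limit:
            raise RuntimeError("recycle limit exceeded; design may oscillate")
        state.update(pending)          # make the non-clocked changes, then re-evaluate
    state.update(clocked_changes(state, signals))   # clock time
    return recycles

# Toy design (hypothetical): CLR asynchronously clears Q; P toggles each clock.
def evaluate(state):
    return {"clear_q": state["CLR"], "d_p": 1 - state["P"]}

def async_changes(state, signals):
    return {"Q": 0} if signals["clear_q"] else {}

def clocked_changes(state, signals):
    return {"P": signals["d_p"]}

state = {"Q": 1, "P": 0, "CLR": 1}
recycles = run_clock_period(state, evaluate, async_changes, clocked_changes, limit=10)
```

When no asynchronous change is pending, the loop exits after a single evaluation pass, which is consistent with the low overhead claimed for designs without asynchronous events.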
Simulation running time is clearly increased whenever asynchronous events occur. However, in the absence of asynchronous events there is virtually no run
time overhead associated with the capability to handle
such events. As regards running time, a logic system
containing 100 flip-flops and 600 gates is simulated
at a rate of 18 clock periods per second.
[Figure 4 flowchart boxes, in order: EVALUATE BOOLEAN EQUATIONS (SIMULATION CODE); MAKE ANY NECESSARY NON-CLOCKED FLIP-FLOP STATE CHANGES (SUCH AS A DC PRESET OR CLEAR); MAKE ANY NECESSARY CLOCKED FLIP-FLOP STATE CHANGES.]
Figure 4-Simulator controller
Logic conversion
As discussed earlier, the logic produced by the
Logicspec translator consists of a set of Boolean equations.
Generally our logic is implemented in either
NANDS or NORS, thus the design equations must
be converted to one of these logic families.
The Logic Conversion Program is a one pass table
driven program capable of converting the design
equations into either NANDS or NORS. When strapping (OR-tieing) is permitted, the program will use
it when it yields a savings in gates and/or logic levels.
One of the unique features of this program is the
order in which it converts the design equations. The
conversion produced for the ith equation can be done
efficiently (in terms of the number of gates required)
only when it is known how the signal defined by this
equation has been used: positively, negatively or both.
In other words, to produce an efficient conversion for
equation i one must first produce a conversion for
each equation which uses the signal defined by equation i.
On the surface this seems like a difficult problem,
or at least a time consuming task; however, as it turns out,
all of the necessary information is produced by the
ordering program used in simulation.
Recall that the simulator ordering program produces
the design equation ordering E1, E2, ..., En, where every
signal in equation Ei has either been defined by a
preceding equation or is a flip-flop output or system
input. The conversion program converts the design
equations in the order En, En-1, ..., E1. That is, the first
equation converted is the one which appears at the
end of the list produced by the simulator ordering
program.
As the conversion is done, the program maintains
a "usage list" which indicates how each signal has
been used. As an example, if the equation A = B + C
is converted to NANDS the program records the fact
that B and C have been used negatively, since the
NAND conversion for this equation is A = B' @ C',
where @ represents a NAND gate. Thus, we see that
when the program reaches equation Ei the usage list
entry for the signal Vi defined by Ei contains all of the
information as to how Vi has been used. Returning to
the previous example, if B was used only in the equation
which defines A then the conversion program would
produce an equation for B' rather than B.
The table used by the conversion program to convert
the design equations to NANDS, assuming a
strapping capability, is shown below. This table is
somewhat simpler than others which have appeared
in the literature.
The entries in Table I give the NAND gate replacements for each Boolean operator as a function of the
polarity that is required at a given level in the logic
network. The "positive", "negative" entries which
appear in the table are the polarities required on the
inputs to the gate(s) which replace the Boolean operator.
"Strap" implies that under the indicated conditions strapping may be used. Whenever the NOT
operator occurs, it is simply removed with the indicated
polarity reversal.
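A polarity-driven conversion of this kind can be sketched recursively. The sketch below covers only the case without strapping and converts a single expression rather than a full equation list with a usage list; the tuple representation and function names are assumptions, and a negative-polarity leaf is realized here as a one-input NAND used as an inverter.

```python
def to_nand(expr, positive=True):
    """Convert an AND/OR/NOT expression to NAND gates (no strapping).

    expr is a signal name or a tuple (op, operands) with op in
    {"and", "or", "not"}. `positive` is the polarity required of the result.
    """
    if isinstance(expr, str):
        return expr if positive else ("nand", [expr])   # inverter on a leaf
    op, operands = expr
    if op == "not":
        return to_nand(operands[0], not positive)       # eliminate NOT, flip polarity
    # AND is built from positive inputs, OR from negative inputs.
    gate = ("nand", [to_nand(x, op == "and") for x in operands])
    single_level = (op == "and") != positive
    return gate if single_level else ("nand", [gate])

def eval_bool(expr, values):
    """Evaluate the original AND/OR/NOT expression."""
    if isinstance(expr, str):
        return values[expr]
    op, operands = expr
    results = [eval_bool(x, values) for x in operands]
    if op == "not":
        return not results[0]
    return all(results) if op == "and" else any(results)

def eval_nand(expr, values):
    """Evaluate a NAND network produced by to_nand."""
    if isinstance(expr, str):
        return values[expr]
    _op, operands = expr
    return not all(eval_nand(x, values) for x in operands)

def count_gates(expr):
    """Count the NAND gates in a converted network."""
    if isinstance(expr, str):
        return 0
    return 1 + sum(count_gates(x) for x in expr[1])
```

A single NAND results when an AND is required negatively or an OR is required positively; the other two cases need a second gate, which is where strapping could save a gate in the scheme described above.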
Figures 5a and 5b illustrate how the conversion
table is used. In Figure 5a the implication is that a
conversion is to be produced for H rather than H'; thus
the first conversion table access is made with (Polarity,
Boolean operator) = (POSITIVE, AND).
To insure that conversion is done correctly, the
designer must supply a list of the system inputs and
outputs with their required polarities. In addition,
he must specify the polarity required at each flip-flop
input.
Figure 7 shows a conversion produced for the design
equations given in Figure 6. The symbol $ is used to
indicate strapping. Each operator, @/$, is followed
by an operator number. The signal Z', which appears
as *Z'* in Figure 7, was produced rather than Z because this signal was described separately to the conversion program as a negative polarity system output.

TABLE I-NAND conversion table

                       Boolean Operator
Required polarity      AND            OR             NOT
POSITIVE               @@ (STRAP)     @              ELIMINATE
  input polarity       POSITIVE       NEGATIVE       NEGATIVE
NEGATIVE               @              @@ (STRAP)     ELIMINATE
  input polarity       POSITIVE       NEGATIVE       POSITIVE

[Figure 5a: AND/OR network for H with inputs A through G, annotated with levels 3, 2 and 1, the required polarity at each level, and the point where conversion begins.]
Figure 5a-H = (A · B + C · D + E · F) · G
Figure 5b-NAND equivalent for H (without strapping)

1   Z = (((B A ·) C(1)/1 +)' D(1)/1 ·)
2   A = ((L K ·) M +)
3   G = (M N +)
Figure 6-Design equations

Generally when AND-OR-NOT logic is converted
to NAND or NOR logic, additional levels are introduced. The designer normally will pass the logic produced by the conversion program back through the
ordering program to determine if excessive logic levels
have been introduced. If there are excessive levels,
the designer must eliminate them by changing the
original Logicspec description.
To facilitate further processing, implementation
equations such as those of Figure 7 are compacted into
a file which resembles a wiring list. This file, referred
to as the design Data Base, is used by all subsequent
programs.
1        *M'* = (M @10)
2        *N'* = (N @11)
3 (1)    *Z'* = (((B A @4) (C(1)/1 @3) $2) D(1)/1 @1)
4 (2)    *A'* = ((L K @9) (M @8) $7)
5        A = (*A'* @6)
6 (3)    G = (*M'* *N'* @5)
Figure 7-Implementation logic

Logic implementation
Following logic conversion, artwork must be generated to produce the CIS which implements the logic
contained in the design data base. In part this involves
the selection of an IC equivalent for each gate and
flip-flop in the data base. The central information
source used in establishing these equivalences is the
Element Library. Since this library is used by all subsequent programs, it is appropriate to introduce it at
this time.
The library elements important for the following
discussion are gates (NAND/NOR), flip-flops, line
drivers, and expanders. Although the Element Library
contains a much broader range of digital elements, the
CAD system is presently only capable of utilizing
these simple logic elements to implement a CIS. The
effectiveness of the CAD system will increase as the
complexity and variety of library elements that can
be used to implement a CIS is increased.
The library entry for each element contains all of
the information required to produce the artwork for
the several mask levels for the given element. This information is stored in a disc file in relocatable form so
that an element may be positioned at any location on
a chip in one of four possible rotations, and optionally
as a mirror image. Dimensions, fan-in and fan-out
capabilities and logic type are included in each element entry.
A complete set of programs accomplishes element
library maintenance. Most important of these are
the programs which the element designer uses in the
creation and modification of library elements. Elements are generally built up in a bootstrap fashion.
Resistors, diodes, and transistors are described to the
system, in a special Graphospec language, as a collection
of rectangles. In this language the description
of a rectangle consists of the coordinates of one vertex, the length of the associated diagonal and a mask
layer designation. More complex elements such as
gates are described as collections of these elements.
There is virtually no limit to the complexity of the
elements that can be built up in this fashion.
The computing equipment currently available at
the Research Center for use in CAD does not include
a graphic display terminal. In anticipation that one
will be available in the future, the Graphospec language
was designed for use on such a terminal.
Partitioning
The logic contained in the data base, representing
that to be implemented by a CIS, may exceed the
capacity of a single IC chip. It is then necessary to
partition the logic into groups (partitions), each of
which can be implemented by a single IC chip. It is
important to note that partitioning is accomplished
before IC equivalents have been selected for each gate
and flip-flop in the data base. One reason for this is
that there can be a significant size difference between
gates and flip-flops whose inputs are generated and
outputs used on the same chip and those whose inputs
(outputs) originate (terminate) on a different chip.
Thus, it is not possible to know exactly the area required for each element until partitioning has been done.
The approach taken toward partitioning was to
develop a set of manipulation and reporting programs
that the designer can put together to implement a wide
variety of partitioning strategies. Some understanding
of what is done by these programs can be gained from
the following brief descriptions.
The input program
Calculates approximate areas for each logic module.
A given logic module consists of either a flip-flop, a
collection of flip-flops, or the gates required to implement a design equation. As an illustration, the
equation A = ((B C @3) (D E @2) @1) is treated
as a four input logic module which has one output (cf.
Figure 8). The area of this logic module is the sum of
the approximate areas for the gates which make it up.
The locate program
Places named logic modules on specified chips. Additionally, the designer can specify that a module is
to be locked in place, so that it cannot be moved from its
designated location by subsequent programs. The
name of a logic module is defined to be the name of
the output signal (the name of the logic module in
Figure 8 is A). Flip-flop outputs bear the name of the
flip-flop.

Figure 8-Logic module as defined for partitioning purposes
The randomize program
Randomly distributes all logic modules which have
not been placed over the chips which the designer
designates as available. The designer can elect to
begin partitioning with any number of chips.
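A minimal sketch of the random distribution step follows. The function name, the dictionary representation of chip assignments, and the optional seed are all illustrative assumptions; the paper does not describe how the program is implemented.

```python
import random


def randomize(modules, placed, available_chips, seed=None):
    """Assign every module that has no chip yet to a randomly chosen chip.

    modules: iterable of logic module names (hypothetical representation)
    placed: dict of module name -> chip number for modules already located
    available_chips: chips the designer has designated as available
    """
    rng = random.Random(seed)
    location = dict(placed)            # placed modules keep their chips
    for name in modules:
        if name not in location:
            location[name] = rng.choice(available_chips)
    return location


if __name__ == "__main__":
    print(randomize(["A", "B", "C", "D"], {"A": 1}, [1, 2, 3], seed=0))
```

Modules already placed by the locate program are left where they are, which matches the statement that only unplaced modules are distributed.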
The weld program
Creates a new logic entity by associating any
specified set of logic modules together. For example,
one might weld the reset logic for a flip-flop collection
to the flip-flop collection itself.
The reduction program
Moves logic modules, or logic module sets, between
chips whenever a move will result in a reduction in
the total number of interconnection pads required
within the CIS. Moves are made subject to the area
and pad limitations the designer has given for each
chip.
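The reduction step can be pictured as a greedy improvement loop. The sketch below is one possible reading, not the program described in the paper: it assumes that a signal spanning several chips costs one pad on each chip it touches, and the names pads_by_chip and reduce_pads, the net representation, and the single area and pad limits shared by all chips are hypothetical simplifications.

```python
def pads_by_chip(nets, location):
    """Count pads per chip: a net spanning several chips needs one pad on each."""
    pads = {}
    for net in nets:
        chips = {location[m] for m in net}
        if len(chips) > 1:
            for chip in chips:
                pads[chip] = pads.get(chip, 0) + 1
    return pads


def reduce_pads(location, areas, nets, chips, max_area, max_pads, locked=frozenset()):
    """Move unlocked modules between chips while each move lowers the total pad count."""
    location = dict(location)
    improved = True
    while improved:
        improved = False
        for module in sorted(location):
            if module in locked:
                continue                      # locked modules never move
            home = location[module]
            best_total = sum(pads_by_chip(nets, location).values())
            best_chip = home
            for chip in chips:
                if chip == home:
                    continue
                location[module] = chip       # trial move
                pads = pads_by_chip(nets, location)
                used = sum(areas[m] for m, c in location.items() if c == chip)
                fits = used <= max_area and all(n <= max_pads for n in pads.values())
                total = sum(pads.values())
                if fits and total < best_total:
                    best_total, best_chip = total, chip
            location[module] = best_chip      # keep the best move, or stay home
            if best_chip != home:
                improved = True
    return location
```

Because a move is accepted only when the total pad count strictly decreases, the loop must terminate. It can stop in a local minimum, which is consistent with the paper leaving the overall strategy to the designer.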
The display program
Produces a chip interconnection table which gives
the name of each "back-plane" signal, the number of
the chip which generates the signal and the number(s)
of the chip(s) the signal is connected to. This is only
one of the several reports designed to aid the designer
in executing his partitioning strategy.
Frequently an "optimum" partitioning job can be
done only if the designer is willing to change his
design. Gates can often be traded for pads, reducing
system cost; the duplication of registers, especially
those that are extensively decoded, may also reduce cost.
The cost effectiveness of trade-offs such as these will
of course change as packaging techniques improve;
however, the situation will still arise when, for want
of a pad, a chip must be added to a CIS.
To simplify the task of implementing design changes
made to improve partitioning results, a facility is
provided which allows the designer to obtain a "location deck" at any time during the partitioning
process. Each card in this deck contains a logic module
name and the number of the chip on which the module
is located. The cards are punched in the format accepted by the locate program.
All design changes must be made to the associated
Logicspec description-this fundamental design document is always kept up to date. Once a change has
thus been made and a new data base created the location deck is processed to obtain the new partitioning
results. If design changes eliminated certain logic
modules the associated cards in the location deck are
rejected. Further, if logic modules were added the
designer is required to include cards for these in the
location deck.
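The handling of a location deck after a design change might be sketched as follows. The card layout (a module name and a chip number) comes from the text; the function name and the three-part return value are assumptions made for illustration.

```python
def apply_location_deck(deck, modules):
    """Re-apply a location deck to a data base rebuilt after a design change.

    deck: list of (module_name, chip_number) cards
    modules: set of module names present in the new data base
    Returns (placed, rejected, missing): accepted placements, cards for
    modules that no longer exist, and new modules that still need a card.
    """
    placed, rejected = {}, []
    for name, chip in deck:
        if name in modules:
            placed[name] = chip
        else:
            rejected.append((name, chip))   # module was eliminated
    missing = sorted(set(modules) - set(placed))
    return placed, rejected, missing


if __name__ == "__main__":
    deck = [("A", 1), ("OLD", 2), ("B", 1)]
    print(apply_location_deck(deck, {"A", "B", "NEW"}))
```

In this sketch the designer would add a card for each name reported in missing before running the deck again.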
Element selection
Chip locations for each logic module established by
the partitioning process are posted to the design data
base. The element selection program then selects
the library elements that are to be used to implement
the logic on each chip. Selection is controlled by a
list, which the designer prepares, of eligible library
elements. From this list, the program selects for each
gate and flip-flop the element of smallest area which:
1. Provides the logic function required by the
associated element in the data base,
2. Has the required fan-out capability, and
3. Has the required fan-in capability.
Whenever there is no eligible library element with
adequate fan-in, gate input expanders are automatically
added. Whenever the source and destination of a
signal are on different chips, appropriate output and
input pads are added automatically. To effect further
area minimization, the selection program recognizes
special combinations of logic elements and substitutes
corresponding special library elements. At the moment
the only special element substituted is a dual output
NAND.
From this point on all processing is done on an individual chip basis.
Placement and interconnection
Three layers of metal interconnections are generally
required for the chips within a CIS. In such three
layer systems the first metal layer is used solely for
element intraconnections, and the second and third
layers are used for element interconnections. Thus,
the CIS placement and interconnection task is equivalent to the two-sided PC card placement and interconnection task. The algorithms used are modifications
of those which have proved effective tools for generating
PC card artwork.
Element placement and interconnection are always
done using the power bus, ground bus and pad layout
prescribed by the designer. Several "standard" chip
layouts are stored in the element library and the particular layout specified by the designer is referenced
by the programs as required. A typical chip layout
is shown in Figure 9.
Figure 9-Typical chip layout
The CAD system can handle chips of various sizes;
however, there are certain aspects of chip layout which
are standard from chip to chip:
1. Pads are located on the perimeter;
2. Power and ground busses are on separate metal
layers, one under the other;
3. The minimum horizontal dimension of the
region bounded by two segments of a bus or
by a column of pads and a bus segment is C; the maximum must be 2C (cf. Figure 9).
The CIS placement problem is complicated by
the fact that the library elements which must be
placed on a chip are not all the same size. This is
simplified somewhat by the restrictions imposed on
chip layout and library element design. Reflecting
the restrictions discussed regarding chip layout, all
library elements must be designed with one or the
other of the aspect ratios pictured in Figure 10.
Placement is accomplished in three steps. First
the elements and pads are placed on a regular grid,
assuming that all the elements are the same size; the
particular size chosen is that of the smallest element
which must be placed on the chip.
Element pairs are then interchanged on this grid
until a minimum approximate interconnection distance
is found. Second, the elements are expanded to their
full size into a new, initially empty, grid which actually
represents the chip. Elements are processed one at
a time starting at the center of the "small" grid and
moving outward along a spiral path. For each element
processed all possible positions on the new grid are
evaluated with respect to three criteria: (1) the distance
from the ideal position as defined by the small grid,
(2) the degree of occupancy of this position by elements
already processed, and (3) the angle of rotation between the lines defined by the grid center and ideal
point and grid center and position being evaluated.
The third criterion is designed to keep the expansion
progressing outward from the center point. If the
position picked as minimal with respect to the above
criteria is partially or fully occupied, a search is
entered to find other positions for the occupying
elements. The third placement step is to again interchange pairs of elements so as to minimize interconnection distance, although this time only elements of
the same size may be interchanged.
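The pair interchange used in the first and third placement steps can be illustrated with a short sketch. It assumes that "approximate interconnection distance" means the sum of Manhattan distances between connected elements; the paper does not define the measure, and the function names and data layout are hypothetical.

```python
from itertools import combinations


def wire_length(position, connections):
    """Total Manhattan distance over all connected element pairs."""
    total = 0
    for a, b in connections:
        (xa, ya), (xb, yb) = position[a], position[b]
        total += abs(xa - xb) + abs(ya - yb)
    return total


def interchange(position, connections, sizes=None):
    """Swap element pairs until no single swap shortens the total wire length.

    If sizes is given, only elements of equal size are swapped, as in the
    third placement step.
    """
    position = dict(position)
    best = wire_length(position, connections)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(sorted(position), 2):
            if sizes is not None and sizes[a] != sizes[b]:
                continue
            position[a], position[b] = position[b], position[a]
            trial = wire_length(position, connections)
            if trial < best:
                best, improved = trial, True
            else:                             # undo a swap that did not help
                position[a], position[b] = position[b], position[a]
    return position, best


if __name__ == "__main__":
    row = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (3, 0)}
    print(interchange(row, [("a", "d"), ("b", "c")]))
```

Like any pairwise exchange, this stops at a placement that no single swap can improve, which need not be the global minimum.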
Figure 10-Permissible library element aspect ratios
For each chip processed, the placement program
produces two outputs. The first includes a list of all
of the library elements on each chip, with their absolute chip location given; this is entered in the element
library. The second is a list of required interconnections;
this is the input for the wiring program.
The wiring program makes all power and ground
connections first, using a simple heuristic. Given that
the point x, y is to be grounded or connected to power,
a bi-directional search beginning at x, y is made in
a direction perpendicular to the two closest segments
of the appropriate bus. If an obstruction is encountered
during the search a turn is made perpendicular to
the preferred search direction. When one of the bus
segments is encountered the required connection is
made.
When all power and ground connections have been
processed, element interconnections are made using
the Lee-algorithm. To speed up this process these
connections are made in two steps. At first each pair
of points to be connected is enclosed in a rectangle
and the Lee-search is restricted to this enclosing rectangle. The particular rectangle chosen for a given
pair of points is the one whose diagonal passes through
the two points and is four units longer than the line
joining the two points. If the program fails to make
the connection within the enclosing rectangle the
pair of points is added to a "failure list" and processing
continues with the next pair. Once all point pairs
have been processed, pairs in the failure list are again
processed; this time, however, the search area is not
restricted.
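The two-step use of the Lee search can be sketched as a breadth-first search with an optional window. This is an illustration under stated assumptions, not the wiring program itself: it routes on a single layer, and the enclosing rectangle is approximated by padding the bounding box of the two points with a fixed margin rather than lengthening its diagonal by four units. The names lee_route and route_all are hypothetical.

```python
from collections import deque


def lee_route(grid, start, goal, window=None):
    """Breadth-first (Lee) search on a grid of 0 = free, 1 = blocked cells.

    window = (xmin, ymin, xmax, ymax) restricts the search area; None means
    the whole grid. Returns a list of (x, y) cells, or None if unroutable.
    """
    height, width = len(grid), len(grid[0])
    xmin, ymin, xmax, ymax = window or (0, 0, width - 1, height - 1)
    xmin, ymin = max(xmin, 0), max(ymin, 0)
    xmax, ymax = min(xmax, width - 1), min(ymax, height - 1)
    came_from = {start: None}
    frontier = deque([start])
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:           # trace back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = xmin <= nx <= xmax and ymin <= ny <= ymax
            if inside and nxt not in came_from and (grid[ny][nx] == 0 or nxt == goal):
                came_from[nxt] = cell
                frontier.append(nxt)
    return None


def route_all(grid, pairs, margin=2):
    """Route each pair inside a padded window; retry failures without a window."""
    grid = [row[:] for row in grid]           # work on a copy
    routes, failures = {}, []

    def commit(pair, path):
        routes[pair] = path
        for x, y in path:                     # routed cells become obstacles
            grid[y][x] = 1

    for a, b in pairs:
        window = (min(a[0], b[0]) - margin, min(a[1], b[1]) - margin,
                  max(a[0], b[0]) + margin, max(a[1], b[1]) + margin)
        path = lee_route(grid, a, b, window)
        if path is None:
            failures.append((a, b))           # the "failure list"
        else:
            commit((a, b), path)
    for a, b in failures:
        path = lee_route(grid, a, b)          # second pass, unrestricted
        if path is not None:
            commit((a, b), path)
    return routes
```

Restricting the window bounds the number of cells the wavefront can visit for most pairs, which is presumably where the reported saving in running time comes from.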
Restricting the Lee-search as described above in
some cases improves running time by as much as 28 percent.
The average density, in interconnections/square,
of the chips processed to date has been 3.8, where a
square is 10 wiring grid units on a side. At this density
manual completion has been required for less than
1 percent of the interconnections processed.
To facilitate manual completion the output of the
wiring program is a card deck referred to as a connection deck, which can be manually manipulated to
make those connections which were not made automatically. These cards actually contain a description
of the connections in the Graphospec language accepted by the element library maintenance programs.
Thus, these programs can be used to plot metal masks,
as a basis for deciding how to make the remaining
connections.
This manual wiring completion procedure is a
potential source of errors. A verification program is
therefore provided to validate all manually introduced
connections. Actually this program checks all connections in the connection deck against the design
data base and produces an error list of all missing
and erroneous connections. When a final connection
deck is obtained it is entered in the element library.
At this point the element library contains all of the
information required to produce the artwork for a
given chip.
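The check performed by the verification program amounts to comparing two sets of connections. A minimal sketch follows, assuming each connection can be reduced to an unordered pair of terminal names; the real connection deck holds Graphospec descriptions, whose format is not given here, and the function name is hypothetical.

```python
def verify_connections(deck, required):
    """Compare the connection deck with the connections the data base requires.

    Both arguments are iterables of two-terminal connections; the order of
    the two terminals is ignored. Returns (missing, erroneous).
    """
    def normalize(pairs):
        return {frozenset(pair) for pair in pairs}

    in_deck, needed = normalize(deck), normalize(required)
    missing = sorted(tuple(sorted(c)) for c in needed - in_deck)
    erroneous = sorted(tuple(sorted(c)) for c in in_deck - needed)
    return missing, erroneous


if __name__ == "__main__":
    deck = [("A", "B"), ("C", "D"), ("E", "F")]
    required = [("B", "A"), ("C", "D"), ("G", "H")]
    print(verify_connections(deck, required))
```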
Non-recurring engineering costs
In June 1969 an experiment was performed to
measure the non-recurring engineering costs of custom
integrated circuit design.
For this experiment, a digital system whose logic
design was already complete was chosen as a starting
point. This system, as it existed in prototype form,
consisted of 42 flip-flops, 215 NAND gates and 20
NOR gates implemented in 69 conventional dual inline
packages.
The experiment began when the system design, in
the form of four D-size logic diagrams, was received
at the Research Center. Members of the CAD staff
transformed the design into a machine-readable form
using the Logicspec language. It should be noted that
Logicspec was not being used as a design language,
but merely as a means of conveying design information to the computer.
Following the transformation to Logicspec a complete
logic simulation was performed to identify any
errors introduced by the manual transformation.
Several such errors were found. In addition, two errors
were found in the logic diagrams.
At the beginning of the experiment it was decided
that the CIS would be implemented using chips measuring 140 mils on a side with a maximum of 39 signal
pads/chip. On these chips power and ground buses
and pads occupied approximately 7,100 square mils of the
available area, leaving 12,500 square mils for the placement
of library elements.
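The area figures can be checked with a few lines of arithmetic. The constants come from the text; the function name is illustrative, and the 10,762 square mil example is the element area reported for chip 1 of the six chip partition.

```python
CHIP_SIDE_MILS = 140        # chip edge length, from the text
BUS_AND_PAD_AREA = 7_100    # square mils used by buses and pads, from the text

total_area = CHIP_SIDE_MILS ** 2                 # 19,600 square mils
available_area = total_area - BUS_AND_PAD_AREA   # 12,500 square mils


def utilization(element_area):
    """Fraction of the available area occupied by library elements."""
    return element_area / available_area


print(total_area, available_area)      # 19600 12500
print(f"{utilization(10_762):.0%}")    # 86%
```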
Using a parts list submitted with the logic diagrams
it was estimated that with the selected chip size the
system could be implemented using six chips. The
six chip partition obtained is characterized in Table II.
The area utilization figures given there were obtained
after element selection had been performed.
The area utilization figures given in Table II clearly
indicate that the six chip partition makes somewhat
inefficient use of the available area. At the time it was
not obvious that fewer chips could be used due to pad
limitations. For this reason the experiment was completed using the six chip partition. Subsequently a five
chip partition was obtained; this is characterized in Table III.
The five chip partition required more gates than the
six chip partition because it was necessary to trade gates
for pads in order to stay within the prescribed pad limits.
Following element selection each of the six chips was
processed through the placement and wiring programs.
Of the six chips processed only one required manual
completion: one connection was made manually.
The end product of the experiment was a complete set
of rubylith mask masters (11 mask layers) for one chip
(chip five). In determining costs it was assumed that
the mask masters for each of the remaining chips would
cost approximately the same.
The professional manpower and computer costs
required to perform the experiment are summarized
in Tables IV and V. At the Research Center all plotting is done in a
multiprogramming environment (i.e., it is overlapped);
for this reason the summary is broken into two parts.
The entire experiment was completed in an elapsed time
of three weeks.
TABLE II-Six chip partition.
Chip       Gates   Flip-Flops   Pads Used   Area (sq. mils)   % of Available Area
1            26         9           31          10,762              86%
2            40         7           38          11,343              90%
3            38         9           38          11,438              91%
4            32         4           37           7,446              59%
5            43         6           37          11,052              88%
6            28         7           36           8,696              69%
Total       207        42          217          60,757
Average      34         7           36          10,126
TABLE III-Five chip partition.
Chip       Gates   Flip-Flops   Pads Used   Area (sq. mils)   % of Available Area
1            33        10           37          12,321              98%
2            44         7           36          12,117              97%
3            37        10           39          12,136              97%
4            51         8           38          12,117              97%
5            45         7           39          12,162              97%
Total       210        42          189          60,853
Average      42         8.4         37.8        12,170
TABLE IV-Manpower and computer costs for experiment (exclusive of plotting).
                                Professional   % of    Computer Hours   % of
                                Man Hours      Total   (IBM 360/30)     Total
Transfer Design to Logicspec        40.5        39%         1.8           7%
Logic Simulation                    26.0        25%         1.6           7%
Convert to Nand Logic                2.0         2%         0.4           2%
Partition System to 6 chips         31.0        30%        11.0          45%
Library Element Selection            0.5       0.5%         0.6           3%
Placement of Elements                3.0         3%         4.1          17%
Interconnection of Elements          1.0         1%         4.5          19%
Totals                             104.0                   24.2
TABLE V-Computer costs for plotting portion of experiment.
                                        Computer Time (IBM 360/30)
                                        Non-Overlapped   Overlapped (Plotting)
Prepare Composite for Manual
Interconnection Completion                    0.3               0.8
Prepare Mask Masters Chip 5                   3.3              11.8
Prepare Mask Masters for Other
Chips (Extrapolation)                        16.5              59.0
Totals                                       20.1              71.6
The non-overlapped time shown above was the time
required to load a disk file with the information which
was to be plotted.
Experience at the Research Center indicates that
approximately 2.5 nonprofessional man hours are
required to strip and check a rubylith mask master.
Including this the total cost for a final set of mask
masters for the six chips is as summarized in Table VI.
TABLE VI-Non-recurring engineering costs of CIS design.
Professional Man Hours                  104.0
Non-Professional Man Hours              165.0
Non-Overlapped IBM 360/30 Hours          44.3
Overlapped IBM 360/30 Hours              71.6
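The Table VI totals appear to follow from figures given earlier in the paper. The check below is a reconstruction, not a calculation stated by the author: it assumes that all six chips need the same 11 mask layers as chip five, and that the non-overlapped computer hours are the sum of the Table IV total and the Table V non-overlapped total.

```python
MAN_HOURS_PER_MASK = 2.5   # strip and check one rubylith master (from the text)
MASK_LAYERS = 11           # mask layers per chip (from the text, chip five)
CHIPS = 6

nonprofessional_hours = MAN_HOURS_PER_MASK * MASK_LAYERS * CHIPS

design_computer_hours = 24.2   # Table IV total
plot_load_hours = 20.1         # Table V non-overlapped total
non_overlapped_hours = round(design_computer_hours + plot_load_hours, 1)

print(nonprofessional_hours)   # 165.0
print(non_overlapped_hours)    # 44.3
```

Both results agree with the values reported in Table VI.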
CONCLUSION
Development of the CAD system described herein
required approximately twelve man years of effort. The
system is now providing the tools which make the task
of developing a CIS as simple as, and, as regards
non-recurring costs, no more expensive than, the task of
developing the same system using discrete IC packages
and printed circuit boards.
ACKNOWLEDGMENTS
The author is indebted to Messrs. L. P. Robinson and
G. Hare for their continued support and encouragement.
The development and implementation of the system was
carried out by the author, J. Landau, L. P. Robinson,
R. Quick, A. Watson, and V. Wilson. We are all
grateful to those designers who struggled with it during
its infancy.
REFERENCES
An excellent source paper on computer-aided design is:
1 M A BREUER
General survey of design automation
Proc IEEE Vol 54 1966 1708-1721
The basic CAD System philosophy is similar to Motorola's
Polycell approach
2 M S CALLAHAN
Moving into MOS production
Electronic News Vol 4 1968
A classical paper on register transfer languages is:
3 D F GORMAN J P ANDERSON
A logic design translator
Proc FJCC 1962 86-96
The modified polish notation is discussed in:
4 W K ORR J M SPITZE
Design automation utilizing a modified polish notation
Proc FJCC 1964 643-650
A good description of equation ordering techniques appears
in:
5 I H YETTER
High-speed fault simulation for Univac 1107 computer system
Proc ACM Nat Conf 1968 265-277
The simulator compiler was patterned after:
6 R A RUTMAN
LOGIK a syntax-directed compiler for computer bit-time
simulation
Masters Thesis Univ of Calif at Los Angeles 1964
A logic conversion technique similar to the one described
appears in:
7 M KLERER G KORN
Digital computer user's handbook
McGraw-Hill Book Co Inc N Y 1967 4-185-4-192
The exchange algorithm used in placement is similar to:
8 J POMENTULE
An algorithm for minimizing backboard wiring functions
CACM Vol 8 1965 699-703
The wiring algorithm used appears in:
9 C Y LEE
An algorithm for path connections and its application
IRE Trans on Electronic Computers Vol 10 1961 346-365
An overview of the computer output
microfilm field
by DON M. AVEDON*
Scan Graphics Corporation
Stamford, Connecticut
* Also Director, National Microfilm Association, Annapolis,
Maryland.
INTRODUCTION
From the earliest times, man has made his mark. At
first his marks were made with his own fingers on
walls of caves. He used a chisel or brush to create
pictures of animals. He developed symbols, alphabet
and languages. Man used marks to pass information
from person to person and from generation to generation. Through the ages, man recorded information
to be used again and again. He recorded history,
mathematics and law. These things brought order to
his life. The history of civilization is the history of
man's ability to communicate, record and make marks.
In making marks, there is almost always a moving
object. Man used his own fingers. Today most marks
are made by a type slug, a print hammer, a moving
drum, or some mechanical device. And now man has
electronic digital computers. These machines manipulate and generate information at unprecedented speed.
Man's need to make marks has multiplied many times
in the past few decades. Much of the drudgery of
handling information has been relegated to the computer. The speed of computers is so great that mechanical mark-making devices can no longer keep pace.
Devices using a stylus or print hammer will not move
fast enough and require too much maintenance.
This is the beginning of our story: a new method
for making marks, COM.
What does COM mean?
1. Computer Output Microfilm: microfilm containing data, produced by a recorder from
computer generated electrical signals.
2. Computer Output Microfilmer: a recorder which
converts data from a computer into human
readable language and records it on microfilm.
3. Computer Output Microfilming: a method of
converting data from a computer into human
readable language onto microfilm.
This paper will describe COM technology and the
various types of COM recorders. Some of the uses and
applications will be explored. A description of the
various recorders and a comparison of the units will
be made. Microfilm origination, dissemination and
retrieval systems will be reviewed. Some COM market
forecasts will be looked at and a survey of the field
by the National Microfilm Association will be presented.
General
Over the past several years, American industry as
well as the scientific community have turned increasingly to the use of computers and microfilm as a means of
controlling what is referred to as the "paperwork
explosion." Computers and microfilm have been
generally used independently to cope with the same
problem. Both have been successful, but neither alone
has completely solved the problem. The effect of combining microfilm and the computer in a system for
information handling may turn out to be more dramatic
than the effect of either alone.
Computer systems of all generations, first, second and
third, have been plagued by an imbalance of speeds.
The functions of computer systems, namely input, processing, and output, though intertwined as functions,
have been sadly imbalanced in their speed relationships one to the other. The computer itself, or the main
frame, has seen an ascension of speed and power of
phenomenal proportions from the mid-1950's to the
present. The older vacuum tube equipment could process at thousandths of seconds or milliseconds. The
transistor and solid state technology brought forth
microseconds or a millionth of a second speeds. Finally,
the third generation in this evolution, the micrologic
of integrated circuits, has caused nanosecond speeds,
a billionth of a second, to be realized. However, the
input/ output twins have seen no similar evolution.
On the input side the basic medium of data input is
still the EAM card which is over 30 years old. On the
output side mechanical printing and its hardcopy paper
medium has been the major avenue of getting the
information to the user.
Although there have been several major efforts to
improve the input/output situation, and especially to
eliminate the output bottleneck, none has succeeded
until now. The Computer Output Microfilmer, or COM
recorder provides the solution to the computer output
problem. A COM recorder has the output equivalent of
as many as 30 impact printers operating simultaneously.
Some COM units have a transfer rate as high as 100,000
characters per second (transfer rate: the speed at
which information can be transferred from magnetic
tape to microfilm).
The COM is a device which records computer data on
microfilm in human readable form. It is a recorder
which may be connected directly to the computer for
"on line" operation or to a magnetic tape unit for
"off-line" operation. The magnetic tape unit "reads"
information into the COM from a magnetic tape which
previously has been recorded directly from the computer.
There are three types of COM devices:
Business: alphanumeric printer
Scientific: alphanumeric printer and plotter
Graphic Arts: special quality alphanumeric
printer and plotter
Recording the output of a digital computer directly
on microfilm is not new. As early as 1955 at least one
COM recorder was in use for this purpose. The early
units as well as some of the new units were designed
for scientific work. These recorders are printer-plotters;
that is, they are capable of reducing the digital output
of computers to convenient, usable plots and curves
that are annotated with alphanumeric information.
Figures 1 and 2 are typical scientific plots. This was
the role of the COM until recent years when some of
the scientific users began using the printing capability
for non-scientific alphanumeric listings.
Figure 1-Typical scientific plot
Figure 2-Typical scientific plot
Figure 3-Typical business information-alphanumerics
These non-scientific (business) applications prompted
the development of special COM devices which are
designed for high speed recording of alphanumeric
computer output. These units record the same type of
information as impact printers only they are much
faster and the information is placed on microfilm
instead of paper. Figure 3 shows an example of this
type of information. Thousands of computers in use
today do not yield full capacity. The computer systems
are slowed down by their output devices, the impact
printers, which produce too much paper. The mountains
of printout they produce are smothering the very
efficiencies for which computers were designed. These
thousands of computers do not put vital information
into the hands of the right people in the right places
in time for the right decisions. These new business
COM recorders can solve this problem. The problem
is solved by the following advantages the COM system
has over the impact printing system:
1. Printing at computer tape speeds.
2. Forms printed with data simultaneously.
3. Retrieval coding placed on records as it is
created.
4. Smaller records storage.
5. Reduced cost of supplies and material.
6. Weight of information significantly reduced.
7. Microfilm doesn't have to be decollated, burst
or bound.
Figure 4-Typical business report-graphics
The third type of COM is the graphic arts printer.
This is an electronic composition system. This type
of recorder can produce alphanumerics and graphics
with graphic arts quality at data processing speeds.
The evolution of the COM is quite interesting; it
began with the scientific device being used for plotting
technical data in graphic form. Now it is being used
extensively by business for alphanumerics as a replacement for impact printers. I predict that business
management will quickly realize that they too would
have great advantages from the scientific type of
system and have their business information plotted
and presented in graphic form instead of as alphanumerics or having a draftsman manually prepare charts from
alphanumeric data. Figure 4 shows a business report
produced by a scientific COM recorder.
Technology
Speed
The most obvious technological advantage of a COM
is the speed at which computer information is translated
into human readable form on microfilm. It is difficult
to visualize or appreciate this speed. I, therefore,
present these comparisons:
5,000 electric typewriters = 30 impact printers = 1 COM recorder
Looking at it another way:
                  Characters   Lines       Lines        Pages
                  per sec.     per min.    per hr.      per hr.
Typewriter             15           7           400           6
Impact Printer      2,420       1,100        66,000       1,031
COM recorder       70,000      32,000     1,900,000      30,000
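The per-hour columns of the speed comparison appear to follow from the lines-per-minute figures. The sketch below reproduces them under one assumption that is not stated in the paper: a page of 64 lines, inferred from the ratio of lines per hour to pages per hour. The function name is illustrative.

```python
LINES_PER_PAGE = 64   # assumption inferred from the table, not stated in the text


def hourly_rates(lines_per_minute, lines_per_page=LINES_PER_PAGE):
    """Return (lines per hour, whole pages per hour) for a given line rate."""
    lines_per_hour = lines_per_minute * 60
    return lines_per_hour, lines_per_hour // lines_per_page


for device, lpm in [("Typewriter", 7), ("Impact printer", 1_100), ("COM recorder", 32_000)]:
    lines, pages = hourly_rates(lpm)
    print(f"{device:15s} {lines:>10,} lines/hr {pages:>7,} pages/hr")
```

The impact printer and COM recorder rows match the published page rates exactly; the published lines-per-hour figures for the typewriter (400) and the COM recorder (1,900,000) look like rounded versions of 420 and 1,920,000.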
Cathode ray tube (CRT) systems
The Computer Output Microfilmer, as the name
implies, produces computer generated microfilm records
with no intervening paper copy. This is achieved by
converting the computer digital signals to voltages
which are applied to a cathode ray tube. (Another
method, electron beam recording, will be described
later.) This conversion process results in the information being displayed on the cathode ray tube screen
in human understandable form. The microfilm record
is produced by photographing the information displayed on the cathode ray tube. The basic nature of
this process is illustrated in Figure 5.
Figure 5-CRT microfilm recording
Electron beam recording (EBR) systems
The second method of recording directly on microfilm uses an electron
beam (see Figure 6). Using the stroke method, much like that of a pencil
writing on paper, the electron beam writes a latent image directly
on dry-silver microfilm. The electron beam originates
at the cathode of the electron gun, located on the top of
a sealed housing. Electrostatic plates and electromagnetic yokes, or
magnetic lenses, deflect the beam to form characters and position them
on the microfilm frame. The housing is similar in principle to a
cathode ray tube except in place of a phosphor screen
it has a small aperture through which the beam passes
to write directly on the microfilm. Vacuum pumps
reduce the air pressure within the housing to a level
low enough to facilitate generation and precise control
of the beam. Because the beam has practically no
inertia, it can be deflected, or modulated, rapidly
enough to keep pace with the data transfer rates of
the tape drive.
Figure 6-Electron beam recording on microfilm
Character generation
There are several methods of creating characters for
COM recording. Stromberg-Datagraphix has developed a special cathode
ray tube called a Charactron®* Shaped Beam Tube. The Charactron tube
creates an image by directing an electronic beam through individual
characters cut in a matrix, a thin precise disc
with alphanumeric and symbolic characters etched
through it. This matrix is located within the neck of
the tube. This method extrudes the beam into the shape
of the character being printed. This has the effect of
stenciling each character onto the tube face.
Another method of creating characters is by the
use of a "stroke" generator. In this type of system a
spot is deflected to trace the shape of the character
desired. The voltages necessary to deflect the spot
are generated by sweep generators, one for X deflection
and one for Y deflection. Instructions for the characters
are stored in memory. About 16 strokes are used
on an average per character.
Characters can also be created by point plotting.
This method is generally used for special symbols or
typefaces.
* Charactron is a trademark of Stromberg-Datagraphix, Inc.
Line generation
Line generation in scientific COM recorders is done
by the use of a "line" or "vector" generator. This is
known as a vector stroke generator and is capable of
drawing vectors. Line width, vector direction and
intensity levels are all generally programmable.
Forms overlay
Forms overlay features are provided on most units.
The forms overlay feature provides the capability of
superimposing predetermined, fixed forms with the
generated image. Forms are interchangeable by an
operator or on some units may be called in by program.
These forms may contain maps, company logos, charts
and graphs such as the one in Figure 7.
Figure 7-Forms overlay
Retrieval coding
COM recorders can generate retrieval codes and
patterns for each or selected frames of information.
The following coding systems are usually standard
features: Codeline, Image Blip (Image Count) and
Miracode. These indexing identifiers are recorded on
film simultaneously with the data. This feature is the
key to push button easy retrieval of information on
microfilm.
Films
There are two recording films in use today in the
COM field. Almost all CRT systems use Kodak (Recordak) Dacomatic*
film, types 5461 and 7461. The EBR systems use 3M Computer Film,
type 761 (dry-silver). The Dacomatic film is available in the following
sizes:
a. 105mm nonperforated
b. 35mm with perforations
c. 16mm with perforations
d. 16mm nonperforated
The 3M dry-silver film is available in 16mm nonperforated form.
The 105mm film is used in the business type COM
and the film is cut and used as microfiche. The 35mm
film is generally used for scientific work (graphics). The
16mm nonperforated film is used for almost all business
applications. The perforated 16mm film is only used
for special high precision applications.
* Dacomatic is a trademark of Eastman Kodak.
Users and applications
Scientific
A few of the scientific applications are: circuits,
printed wiring board masters, thin film masks, animated
movies, graphs and charts. See Figures 1, 2 and 3.
The following are some of the organizations using
scientific COM recorders:
North American Aviation
NASA
Collins Radio
Bell Telephone Laboratories
Lawrence Radiation Laboratories
MIT Lincoln Laboratories
Business
Business applications include all types of listings,
account reports, management reports and anything
that might have been produced by a computer and
impact printer. The following are some of the
organizations using COM's in business applications:
Sears Roebuck & Company
J. C. Penney Company
Social Security Administration
Equitable Life Assurance Society
Bureau of the Census
International Harvester Company
Systems service centers
At the present time there are over 40 systems service
companies operating COM service centers in the following cities in the United States:
California: Canoga Park, Culver City, El Segundo, Glendale, Los Angeles, San Francisco, Stockton
Colorado: Boulder, Denver
Connecticut: Hartford, Westport
Florida: Miami
Georgia: Atlanta
Illinois: Chicago
Indiana: Indianapolis
Louisiana: New Orleans
Maryland: Baltimore, College Park
Massachusetts: Boston, Springfield, Wilmington
Michigan: Dearborn, Detroit
Missouri: St. Louis
New Jersey: Cherry Hill, Dayton
New York: Binghamton, Buffalo, New York City, Rochester, Spring Valley, White Plains
North Carolina: Winston-Salem
Ohio: Columbus, Cleveland, Dayton
Pennsylvania: Philadelphia, Pittsburgh
Texas: Austin, Dallas, Houston
Utah: Salt Lake City
Virginia: Arlington
Washington, D. C.
Wisconsin: Brookfield
COM recorders
At the writing of this paper the following companies
were marketing COM units:
a. AMETEK/Straza (Scientific)
b. Beta Instrument (Scientific)
c. California Computer Products (Scientific)
d. Canon (Business)
e. Computer Micro-Data Systems (Scientific)
f. Computer Industries (Scientific & Business)
g. Control Data (Scientific)
h. Eastman Kodak (Business)
i. Information International (Scientific)
j. 3M (Business)
k. RCA (Graphic Arts)
l. Scan Graphics (Scientific)
m. Singer-Link (Scientific)
n. Stromberg-Datagraphix (Scientific & Business)
Total COM systems
As can be seen in Figure 8 there is very little difference
between photographing a CRT or a paper document.
In selecting the film for recording from a CRT it
should be matched to the phosphor of the tube in
sensitivity. The polarity of the image on a CRT is
negative (light lines on a dark background) and on
paper it is usually positive. Therefore, with normal
film processing the image of the CRT will be reversed
and appear on film as a positive and a microfilm of a
positive paper document will appear as a negative.
Since most users of microfilm prefer to use negative
images in readers and for making hardcopy, it is necessary to obtain a negative image of the COM film. This
is done in one of two ways. In the first, at the time of processing the
recording film is flashed and developed in a special
movie processor which provides a negative image on
film from the negative CRT image. The second method
of obtaining negative film images is to make a second
generation duplicate on Kalvar or silver film, which
will reverse the polarity again. Therefore, from a
negative CRT image we get, with normal processing,
a positive first generation recording film and then a
negative second generation duplicate.
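The polarity bookkeeping described above can be sketched in a few lines. This is only an illustration under the assumptions stated in the text: normal processing reverses the source polarity once, flash (reversal) processing preserves it, and each duplicate generation of the kind described reverses it again. The function and constant names are hypothetical.

```python
NEGATIVE = "negative"  # light lines on a dark background
POSITIVE = "positive"  # dark lines on a light background


def flip(polarity: str) -> str:
    """Return the opposite polarity."""
    return POSITIVE if polarity == NEGATIVE else NEGATIVE


def film_polarity(source: str,
                  reversal_processing: bool = False,
                  duplicate_generations: int = 0) -> str:
    """Polarity of the final film image.

    Normal processing reverses the source image once; reversal processing
    keeps the source polarity. Each duplicate generation is assumed to
    reverse the polarity once more, as the text describes.
    """
    image = source if reversal_processing else flip(source)
    for _ in range(duplicate_generations):
        image = flip(image)
    return image


if __name__ == "__main__":
    # CRT image, normal processing, then one duplicate generation.
    print(film_polarity(NEGATIVE))
    print(film_polarity(NEGATIVE, duplicate_generations=1))
    # CRT image with flash reversal processing on the original film.
    print(film_polarity(NEGATIVE, reversal_processing=True))
```

Both routes to a negative image (reversal processing, or normal processing plus one duplicate) give the same result in this model, which matches the two methods the text lists.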
Figure 9 depicts the various systems used for COM
operations. In scientific applications the film is most
often put in aperture cards or used as short strips or on
reels. In most business applications the film is used in
roll form in cartridges. There are a few systems where
the film is cut and pasted up to make a master microfiche. A recent development, the 105mm film head for
a COM provides microfiche directly and therefore
eliminates much of the manual labor in producing
microfiche. In most business applications film duplicates
are required to disseminate the information to many
users. In all systems, readers, reader-printers, retrieval
devices and enlarger-printers are needed by the end
users of microfilm. Additional information on these
items can be obtained from the National Microfilm
Association's "Guide to Microreproduction Equipment" now in its fourth edition.
There are six generally used methods of making
copies of computer generated reports. Figure 10 provides a cost comparison of a 100 page report. As can
be seen on the graph, distribution of microfilm duplicates is the lowest cost method at any quantity of
copies.
[Figure 10 plot: cost of copies versus number of distribution points, comparing microfilm copies with paper copies (COM recorder to enlarger to offset duplication). Cost of microfilm retrieval equipment NOT included.]
Figure 10-Comparative costs of creating copies of
computer generated output
... 67 percent of COM's are business type today.
Comment: By the end of 1970 there will probably
be more than 1,000 COM's in use.
[Figure 11 plot: number of units by year.]
Figure 11-COM recorder unit placement forecast
... members are interested in the COM field.
4. Positive versus negative original recording film:
56 using positive (normal processing)
29 using negative (flash reversal processing)
A few organizations use both positive and negative.
Comment: Even though it requires special
processing equipment to obtain a
negative image on the original film
it is being done; there must be a
need.
5. The following film processors are being used in
COM systems:
Fulton-10 users
Kodak-18 users
Remington Unipro-3 users
Stromberg-6 users
Other-35 users
6. The following is the quantity of original recording film being used per month by 48 respondents
who gave figures:
16mm perforated-80,000 feet
16mm nonperforated-852,000 feet
35mm perforated-62,000 feet
35mm nonperforated-31,000 feet
105mm nonperforated-6,000 feet
Comment: The following is an estimate of
the quantity of recording film all
COM's are currently using per
month.
16mm perforated-400,000 feet
16mm nonperforated-4,200,000 feet
35mm perforated-300,000 feet
35mm nonperforated-150,000 feet
7. 57 of the 74 users duplicate their film.
23 Diazo
29 Kalvar
22 Silver
Some used more than one process.
Comment: Convenience and turnaround time
are most important.
8. Duplicating film being used per month by 40
COM systems reporting:
16mm-3,800,000 feet
35mm-35,000 feet
105 x 148mm-34,000 fiche
3-1/4" x 7-3/8"-270,000 fiche
6" x 8"-34,000 fiche
Aperture cards-367,000 cards
Comment: The following is an estimate of the
quantity of duplicating film being
used per month by all COM
systems:
16mm-22,800,000 feet
35mm-200,000 feet
Aperture cards-2,200,000 cards
Microfiche (various sizes)-2,000,000 fiche
9. Microforms being used in COM systems reporting:
Roll Film (including cartridges)-56 users
Microfiche-21 users
Jackets-13 users
Aperture cards-13 users
Comment: The following are the percentages
of COM systems using each microform:
Roll film (including cartridges)-55 percent
Microfiche-21 percent
Jackets-12 percent
Aperture cards-12 percent
10. For those using roll film and cartridges the
following indexing systems are in use:
Miracode*-10 users
Image Blip (Image Count)-24 users
Code Line-7 users
Flash-9 users
Other-12 users
Comment: Most COM systems are now using
the Image Blip (Image Count)
system of retrieval.
11. Regarding a question on the use of hardcopy
the following responses were received:
Never used-5 users
Seldom used-20 users
Frequently used-34 users
Always used-6 users
Comment: Hardcopy is required, but on a
selected basis.
12. For 35 respondents, 844,000 pages of hardcopy
are produced each month.
Comment: The average COM system produces 24,000 pages of hardcopy per
month.
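The survey does not state how the respondent totals were scaled up to the all-COM estimates, but the implied factor can be recovered by simple division. The sketch below is an illustration only; the pairing of film types with footage figures follows the reading of survey item 6 given here and is an assumption where the printed layout is ambiguous. The hardcopy average from item 12 is computed the same way.

```python
# Original recording film per month, in feet, as reported by respondents
# (item 6) and as estimated for all COM's (the comment on item 6).
SAMPLE_FEET = {
    "16mm perforated": 80_000,
    "16mm nonperforated": 852_000,
    "35mm perforated": 62_000,
    "35mm nonperforated": 31_000,
}
ESTIMATE_FEET = {
    "16mm perforated": 400_000,
    "16mm nonperforated": 4_200_000,
    "35mm perforated": 300_000,
    "35mm nonperforated": 150_000,
}


def implied_scale(film: str) -> float:
    """Ratio of the all-COM estimate to the respondent total for one film."""
    return ESTIMATE_FEET[film] / SAMPLE_FEET[film]


def average_hardcopy_pages(total_pages: int = 844_000,
                           respondents: int = 35) -> float:
    """Average hardcopy pages per responding system per month (item 12)."""
    return total_pages / respondents


if __name__ == "__main__":
    for film in SAMPLE_FEET:
        print(f"{film}: estimate is {implied_scale(film):.2f} x sample")
    print(f"average hardcopy: {average_hardcopy_pages():,.0f} pages/month")
```

Under this reading every film type is scaled by a factor of a little under five, and the hardcopy average rounds to the 24,000 pages quoted in the comment.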
Standards
In February of 1968 the National Microfilm Association (NMA) established a committee to investigate
and recommend standards for microfilm produced by
COM recorders. This committee has members from
most of the COM manufacturers, several COM systems
service companies and many users in government and
industry. There are three sub-committees, each with a
mission, as follows:
Format, Quality and Glossary.
The National Microfilm Association is attempting to
coordinate the activities of this new microfilm application by considering standards, reporting of many
specific applications in its Journal and having COM
exhibits at its annual convention.
For additional information on the COM field, write
to the National Microfilm Association, P.O. Box 386,
250 Prince George Street, Annapolis, Maryland 21404.
* Miracode is a trademark of Eastman Kodak.
The microfilm page printer-Software
considerations *
by S. A. BROWN
Datalogics, Inc.
Chicago, Illinois
INTRODUCTION
Magnetic tape microfilm recorders have been available
in the market place for the past several years. It has
been only within the last eight months or so that a
general awareness of these devices has developed.
Trade magazines and journals are now carrying feature
articles describing computer based, microfilm information systems. Investment houses are releasing
surveys and market evaluations of this area. Talks on
the economics and human engineering aspects of this
approach are being presented at many technical conferences. Little, however, has been said about programming considerations for the preparation of the specially
formatted magnetic tape required for the operation of
these devices. The purpose of this paper, then, is to
examine the flexibilities and capabilities of magnetic
tape microfilming as viewed by the programmer, to
discuss the software problems that he faces when
attempting to use such a device and to describe several
generalized solutions to these problems.
* The work described in this paper was sponsored by 3M Company, Computer Graphics, St. Paul, Minnesota
The machine
A typical such device is the Series F Electron Beam
Recorder manufactured by 3M Company. It is a microfilm page printer with an extended graphic set. The
page area is a 132 X 64 character array organized as
64 lines each containing maximally 132 characters.
Data to be printed reside as line images on magnetic
tape. Each line is represented as a character string
prefaced by one or two characters of coordinate data
and terminated by a delimiter character. The coordinate
characters specify the position of the line on the page.
This specification may be absolute, in terms of a
specific character in the page array, or relative to the
last printed line.
A page printer differs from a conventional impact
line printer in allowing the line to be placed randomly
on the page rather than in ascending line sequence. It
is possible to skip from the bottom of a page to the
top as easily as from the top to the bottom. This is
illustrated in Figure 1.
This may be contrasted to an impact printer which
can only advance.
The flexibilities of a page printer can best be appreciated by considering the problem of printing a report
containing, say, three vertical columns. To print such a
report on a conventional line printer would require
buffering an entire page in core or at least the first two
columns before any data could be printed. In a page
printer environment, this restriction is removed.
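The three-column case can be illustrated with a small model of the 132 X 64 page array. This is a sketch only: the actual coordinate and delimiter characters of the tape format are not given in the text, so positions are passed here as plain (line, column) integers, and all function names are hypothetical. Items are placed in the order they arrive, filling one column top to bottom and then jumping back to the top of the page for the next column, with no page buffer held by the application.

```python
PAGE_LINES = 64    # lines per page, as stated for the Series F recorder
PAGE_COLS = 132    # characters per line


def blank_page():
    """A page modeled as a PAGE_LINES x PAGE_COLS array of spaces."""
    return [[" "] * PAGE_COLS for _ in range(PAGE_LINES)]


def place(page, line, col, text):
    """Write text at an absolute (line, col) position, both 0-based."""
    if not 0 <= line < PAGE_LINES:
        raise ValueError("line outside page")
    if col < 0 or col + len(text) > PAGE_COLS:
        raise ValueError("text does not fit on the line")
    page[line][col:col + len(text)] = text


def three_column_page(items, col_width=44):
    """Lay out items column by column in arrival order.

    Item i goes to column i // PAGE_LINES, line i % PAGE_LINES, so the
    writer moves from the bottom of one column to the top of the next.
    """
    page = blank_page()
    for i, item in enumerate(items[:3 * PAGE_LINES]):
        column, line = divmod(i, PAGE_LINES)
        place(page, line, column * col_width, item[:col_width])
    return ["".join(row) for row in page]


if __name__ == "__main__":
    rows = three_column_page([f"item{i:03d}" for i in range(192)])
    print(rows[0].rstrip())
    print(rows[63].rstrip())
```

On a line printer the same layout would need items 0, 64 and 128 together before the first line could be printed, which is the buffering requirement the text refers to.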
Extended character set
Electronic rather than mechanical generation of the
character set provides a wide variety of available
graphics. In addition to the standard upper case set,
most microfilm printers provide a lower case as well
as a bold face. The Series F Electron Beam Recorder further
includes a large size set.
This graphic variety allows design of highly legible
microfilm documents that previously could be obtained
only at the expense of typesetting.
[Figure 1 diagram: a magnetic tape record mapped to a line of the microfilm page, 132 characters wide by 64 lines deep.]
Figure 1-Microfilm page organization
Forms or line art
This machine provides a means of inserting line
drawings or ruled forms with printed headings similar
to custom line printer forms. This capability allows
insertion of single fixed forms, random retrieval from
a library of 30 images and sequential retrieval from
a file of 2000 images.
Applications
The application for this device ranges from that of
replacing current impact line printers printing on stock
or custom forms to preparation of material that is
typeset or typed, such as illustrated parts catalogues
and directories.
Software implications
Typically, a microfilm printer is used to replace some
or all of the functions of a line impact printer. Conceptually this is easy to visualize: line printing is a
subset capability of page printing. The user, however,
finds himself in one of two situations. Either his
program prints directly on-line or formats a tape for
off-line printing. The former case obviously implies
changes to the application program; the latter implies
either program modification or a tape to tape transcription pass. The low unit page cost exhibited by microfilm
recorders makes even a nominal tape to tape computer
charge relatively expensive. It may amount to 20-40
percent of the total microfilm cost. The apparent
alternative is program modification. Typical program
conversion takes from two hours to two days, depending on the availability of program source, documentation and test data. Although the reprogramming
time is minimal for a single program, if the universe
consists of hundreds, as is normal, a major expenditure
of effort is required. Further, in those instances where
the user wishes to retain the original program for
back-up purposes, he is forced to maintain both
programs. The user is confronted with the potential
requirement for a large re-programming investment
and must weigh this against the economics of a microfilm system.
The user is in a similar position when he requires the
extended character set, page printer or forms capabilities. This time, however, he is really modifying his
application and can be expected to expend programming effort. He has more than a simple media conversion problem to solve; he is designing a microfilm
format that did not previously exist.
In so modifying his application, he has to consider
all of the characteristics and idiosyncrasies of the
specific microfilm printer he is going to use. These
include placement of inter-record gaps and control
codes within the text of the microfilm document.
Solutions
In the best of circumstances the user would prefer
to see extensions to his operating system and programming languages to support output devices with extended
graphic and page printing capabilities. If he desires
merely the same output on microfilm that he obtained
on hard copy, he should have to change only a peripheral assignment statement in his job deck and
execute his program. In the case where the user requires full utilization of the microfilm device's capabilities, he would prefer to resort to new statements in
his application languages such as COBOL, PL/I, RPG,
etc. These might include facilities for declaring multicolumn output, invoking alternate character sets or
specifying insertion of graphics.
3M Company has recognized the need for system
software with these capabilities and feels that as the
computer microfilm user community grows, operating
system and language implementors will include them
in future systems. During the initial design phases of the
Series F EBR, they asked us to formulate interim
solutions for several specific computer systems. We were
instructed that these solutions remain valid until such
time that microfilm page printers were recognized by
operating system implementors as standard peripherals.
These solutions can be categorized as either conversion
support or new application support.
Conversion support
Support software has been written for the IBM 360
DOS and O/S operating systems. This has taken the
form of extensions to the operating systems. Considerable
care has been taken to insure that the change
was local and did not disturb the rest of the system.
The DOS extension is a supplement to DOS Logical
IOCS and provides object program compatibility with
problem programs written in PL/I, COBOL, RPG and
Assembly Language. I/O Modules similar in concept
to the ones that comprise Logical IOCS were written
to interface a printer file definition with a physical
magnetic tape drive. This interface routine is responsible for adding the EBR control codes to the print
image and forwarding it to the magnetic tape drive. A
series of these routines reside, together with standard
IBM supplied I/O routines, in the relocatable library.
To invoke the extension, the user adds a single link
editor control card and re-links his program. The output that normally appeared on a line printer is directed
to magnetic tape in a format appropriate to the EBR.
The result of processing this tape on the microfilm
page printer is identical in all respects to that previously
obtained on the line printer.
A similar extension was provided for IBM's O/S 360
operating system. In this case the user is provided
with load module compatibility. Extensions were written for the four QSAM move and locate mode modules.
The modules have been modified to examine the volume
serial number of the output data set; if the first
three characters are "EBR" and the data set is in
ASA mode, the file contents are re-formatted before
being written on the assigned device. Operationally, the
user is required to include only one control card to
divert his output from a standard system output
writer (SYSOUT) to a magnetic tape in EBR format.
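The decision rule described above, together with the kind of re-formatting it triggers, can be sketched as follows. This is not the QSAM extension itself: it is a simplified model in Python, the output tuple (page, line, text) is a hypothetical stand-in for the EBR record format (which the text does not specify), and treating an unrecognized control character as a single space is an assumption. The ASA carriage control characters used are the standard ones: blank for single space, "0" for double space, "-" for triple space, "+" for no advance, and "1" for skip to a new page.

```python
# Lines advanced before printing, keyed by ASA carriage control character.
ASA_ADVANCE = {" ": 1, "0": 2, "-": 3, "+": 0}


def should_reformat(volume_serial: str, asa_mode: bool) -> bool:
    """Apply the test from the text: serial starts with EBR, ASA mode set."""
    return bool(asa_mode and volume_serial[:3] == "EBR")


def asa_to_positions(records, lines_per_page=64):
    """Convert ASA print records to (page, line, text) tuples, 1-based.

    The first character of each record is the carriage control; the rest
    is the print image. Page overflow simply starts a new page at line 1.
    """
    positions = []
    page, line = 1, 0          # line 0: nothing printed on this page yet
    for record in records:
        control, text = record[:1], record[1:]
        if control == "1":
            if line > 0:       # current page already used: start the next
                page += 1
            line = 1
        else:
            line = max(line + ASA_ADVANCE.get(control, 1), 1)
            if line > lines_per_page:
                page += 1
                line = 1
        positions.append((page, line, text))
    return positions


if __name__ == "__main__":
    if should_reformat("EBR001", asa_mode=True):
        for entry in asa_to_positions(
                ["1HEADER", " line a", "0line b", "+over", "1NEXT"]):
            print(entry)
```

Because the output carries an explicit line position, the same records could be written to tape in any order, which is what distinguishes the page printer from the line printer they were originally formatted for.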
A similar system involving modification of the IBM
1401 Autocoder assembler provides source language
compatibility for the EBR. Further object program
support is being developed for the CDC 3300, GE 400
and RCA Spectra 70 TDOS operating systems. I
personally feel that object program support is extremely
important, particularly in this age of proprietary software where source programs may not even be available.
New application support
Here, the user requires an output format not obtainable on a conventional line printer. He must develop a
new program or at least modify a current one. Again,
he should be insulated from certain details of the
microfilm printer, such as placement of control codes
and inter-record gaps. The approach in this case was
to provide a general purpose output package written
in COBOL. This package, called EBRPACK, provides
entry points to select form overlays and character sets,
plus additional entries to replace the standard COBOL
printer command "WRITE dataname 1 AFTER
ADVANCING dataname 2 LINES."
SUMMARY
Software support for microfilm page printers is necessary and desirable; it must ultimately come from
operating system and programming language implementors. In the interim, operating system extensions
providing microfilm-line printer interchangeability may
readily be prepared. Applications requiring specific
features associated only with microfilm page printers
may be designed and implemented utilizing output
packages written in machine independent languages.
Computer microfilm-A cost cutting
solution to the EDP output bottleneck
by JOHN K. KOENEMAN and JOHN R. SCHWANBECK
Oppenheimer & Company
New York, New York
SUMMARY
Although the computer microfilm recorder has received little attention to date, this new output device
represents a technological breakthrough which will
have a major impact on the computer industry. Installation of a recorder generally results in a tenfold
increase in the speed of computer output and a concomitant
substantial reduction in CPU time which can
result in major data processing and report production
cost savings. As an added bonus, a microfilm system
is the equal of most electronic time-sharing systems
for information storage and retrieval applications.
Consequently, we feel that computer microfilm, although little noticed thus far, represents a major industrial and investment concept.
The electromechanical line printer, heretofore the
only practical means of obtaining hardcopy rapidly
from the computer, has a maximum output rate of
only 2,500 characters per second. But the computer's
throughput capability is 25,000 to 100,000 characters
per second. Owing to the severe output bottleneck
that results from this imbalance of speeds, the bulk
of information ingested and produced by computers
has, until now, essentially been locked on magnetic
tape and not easily available to the computer user.
With the advent of the computer microfilm recorder,
which can produce output as fast as the computer can
process data, this mass of stored information has suddenly become readily available in humanly readable
form. One of the most important questions which must
therefore be asked is: "How much information is
stored on magnetic tape and how badly is it desired
by the computer user?" Our field work has consistently
shown that an early Xerox type phenomenon exists: user volume rises rapidly to meet capacity.
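The size of the imbalance follows directly from the rates quoted above. The sketch below is illustrative only; the nine-million-character job is an arbitrary example chosen for round numbers, not a figure from the paper.

```python
PRINTER_CPS = 2_500                  # line printer maximum, per the text
COMPUTER_CPS = (25_000, 100_000)     # computer throughput range, per the text


def bottleneck_ratio(computer_cps: float,
                     printer_cps: float = PRINTER_CPS) -> float:
    """How many times faster the computer produces output than it prints."""
    return computer_cps / printer_cps


def hours_to_output(characters: float, cps: float) -> float:
    """Hours needed to emit a given number of characters at a given rate."""
    return characters / cps / 3600


if __name__ == "__main__":
    low, high = (bottleneck_ratio(c) for c in COMPUTER_CPS)
    print(f"computer outruns printer by {low:.0f}x to {high:.0f}x")
    job = 9_000_000  # hypothetical job size in characters
    print(f"printer: {hours_to_output(job, PRINTER_CPS):.2f} h")
    print(f"computer: {hours_to_output(job, COMPUTER_CPS[1]):.3f} h")
```

At these rates the printer is between ten and forty times slower than the machine feeding it.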
Because the microfilm recorder eliminates the computer output bottleneck, it also results in a major cost
savings. This effect is most readily apparent in the
data processing service industry, where a customer
can now realize an approximate 40 percent to 50 percent reduction in his monthly service bureau bill if
microfilm rather than continuous paper forms is accepted as computer output.
Even greater relative savings can be realized by
companies with medium to large-scale in-house data
processing departments. Overall, it can be shown that
the lowest data processing costs, at all levels of use,
are achieved when microfilm recorders are employed
to produce alphanumeric or graphic computer reports.
Moreover, acceptance of computer output in film
form automatically creates an information storage
and retrieval system which is the equal of most electronic systems. Although microfilm has gained a bad
reputation because of the poorly designed equipment
and improperly processed film which library users
have been forced to endure for years, newly introduced
microfilm equipment can now easily provide the quality
of image and speed of retrieval of the most expensive
time-sharing terminals.
In addition to the standard data processing market,
there is another separate and distinct market, that of
pure information storage and retrieval, for which computer
microfilm can compete very effectively because
of its low cost. In fact, computer microfilm is frequently
referred to as "the poor man's time-sharing." The
service bureau charge for processing and producing one
page of computer generated microfilm daily for one
month is 10 percent to 40 percent that of storing one
page of information on magnetic disc for the same time
period. When the terminal and communications costs
of electronic time-sharing systems are also considered,
the cost advantage weighs even more heavily in favor
of microfilm. In large measure, this dramatic cost
difference is the result of the substantially greater
density of data storage which film (1,000,000 bits/
sq. cm.) enjoys over magnetic media (1,000 bits/sq.
cm.). Thus, although highly optimistic forecasts have
been made for the growth of electronic systems for
use in information storage and retrieval applications,
we feel that fundamental economic considerations
strongly suggest that computer generated microfilm,
instead, will become the most common (although,
obviously, not the sole) method of computer information storage and retrieval.
Several other benefits are derived from the computer
microfilm recorder which are normally of peripheral,
but can on occasion be of prime, importance:
• An unlimited number of report copies can be
obtained from one computer run with no loss of
clarity; by contrast, only four or five truly readable
copies can be obtained from a single run when an
impact line printer and continuous paper forms are
used.
• Owing to its compactness, microfilm essentially
eliminates the problems and costs of computer
report storage.
• Microfilm permits dramatic reductions in computer report transportation or communications
costs.
Computer microfilm is not without certain drawbacks, however. A computer microfilm information
system cannot be used in situations where the data
base changes rapidly, such as in airline reservations
or stock market quotations. It also cannot be employed
where user interaction with the data base is desired.
Additionally, paper possesses a distinct advantage as
data processing output where computer usage is very
light, or scientific applications (i.e., high computation, low output) are involved.
In summary, with the development of the computer
microfilm recorder, the most efficient processor of
information, the computer, has finally been directly
linked with the most efficient means of information
storage and retrieval, microfilm. User experience to
date strongly suggests that very large and potentially
vast demand exists for the inexpensive and fast access
to computerized information that this combination
provides. Indeed there is every indication that computer microfilm could bring about a real information
explosion. Certainly all ingredients necessary for such
pyrotechnics are present: a sudden quantum jump in
the speed of information output, low cost, and ease
of use (Exhibit 1). As a consequence, we feel that the
computer microfilm service, hardware, and supplies
industries will experience impressive growth over the
near and intermediate term. Indeed, output of microfilm recorders, which should jump from 100 units in
1968 to about 400 units in 1969, presently is production
limited.
The microfilm recorder substantially reduces data
processing and report generation costs for all users
Although there are considerable variations in volume
discounts and prime or off shift machine rates, a 50
percent cost saving is common when a data processing
service organization customer changes from paper to
microfilm as computer output. Similarly, cost reductions of 40 percent to 70 percent have been documented
by heavy in-house computer users even though, in
most cases, the availability of computer reports has
been substantially increased as well. Although the
relative cost savings of the in-house user and the service
bureau customer are similar, the source of these savings
is not. Whereas essentially all the service bureau cost
reduction can be attributed to lower computer time
charges, the bulk of in-house economies derives from
labor and material savings. On balance, however, it
can be shown that the lowest data processing costs are
always obtained when a microfilm recorder is employed.
Service center cost reductions
To obtain 1,000 pages (and three carbon copies)
of processed information, a data processing service
organization customer presently accepting paper output will incur about one hour of IBM 360/30 machine
rental at $65.00 per hour and a materials charge of
$30.00 for continuous forms. Thus, total service bureau
charges for the processing and production of 1,000
pages of information will come to about $95.00 when paper
is used as the computer output medium.
[Exhibit 1-Computer microfilm vs impact printers: Distinct advantages. The exhibit compares the film recorder with the impact printer on printout time, retrieval time, cost of materials, and physical volume and weight of printout.]
If, however, a change to computer microfilm is
made, the cost of a similar run drops to about $40.00
to $45.00. Because the economics of large, fast computers can be used to advantage when the machine is
no longer output bound, most computer microfilm
programs are run on an IBM 360/65 or equivalent.
Because the time necessary to process 1,000 pages of
information on a 360/65 is about 0.2 minutes, total
data processing charges at $600 per hour amount to
only about $2 or $3. Conversion from magnetic tape
to a single microfilm original can be accomplished for
about $30.00 (three cents per original page), and the
cost of three copies will add an additional $10.00
(3.3 mills per page). Thus, for comparable data processing and report production services, a computer microfilm service bureau will cost only $40.00 to $45.00, in
contrast to about $95.00 for a traditional data processing service organization (Exhibit 2).
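The arithmetic behind this comparison can be laid out as a short calculation. The Python sketch below uses only the rates quoted above for a 1,000-page run with three copies; the function and parameter names are illustrative and are not part of the original analysis.

```python
def paper_run_cost(machine_hours=1.0, rate_per_hour=65.00, forms_cost=30.00):
    """Service bureau charge for a paper run: machine rental plus continuous forms."""
    return machine_hours * rate_per_hour + forms_cost


def microfilm_run_cost(cpu_minutes=0.2, rate_per_hour=600.00,
                       original_cost=30.00, copies_cost=10.00):
    """Service bureau charge for a microfilm run: processor time plus film original and copies."""
    cpu_cost = cpu_minutes / 60.0 * rate_per_hour
    return cpu_cost + original_cost + copies_cost


paper = paper_run_cost()
film = microfilm_run_cost()
saving = 1.0 - film / paper

print(f"paper run:     ${paper:.2f}")
print(f"microfilm run: ${film:.2f}")
print(f"saving:        {saving:.0%}")
```

With these inputs the microfilm run falls inside the $40.00 to $45.00 range given in the text, and the saving is a little over one half.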
[Exhibit 2-Computer microfilm vs impact line printer: Service center costs. Total cost per month; 50% savings in data processing and report production costs. Source: service center price lists.]
In-house cost reductions
In the next exhibit (3), it can be seen that although
the installation of a microfilm recorder (SD4360) increases the fixed cost of a data processing installation
by about $2,000 per month, variable costs for materials
are so low that the recorder becomes economically
advantageous after 90,000 to 100,000 pages per month
of output, or the equivalent of five to six machine
hours per day of a relatively small four-tape System
360/30. Thus, an in-house installation operating two
shifts can achieve a 25 percent to 30 percent cost reduction through the elimination of machine shift premiums,
labor, and materials savings. Extensive Army studies 1
have shown that operating savings of 40 percent to
70 percent can be achieved when three-shift operation
or multiple satellite computers with attached line
printers are involved.
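The breakeven point quoted above implies a per-page saving in variable cost. The following Python sketch backs that figure out under a simplifying assumption that is ours, not the paper's: that both cost curves are straight lines in monthly page volume, so the added fixed cost is recovered exactly at the breakeven volume. The function name is hypothetical.

```python
def implied_saving_per_page(added_fixed_cost, breakeven_pages):
    """Variable-cost saving per page needed to recover the added fixed cost
    at the breakeven volume (assumes linear cost curves)."""
    return added_fixed_cost / breakeven_pages


ADDED_FIXED_COST = 2000.0  # dollars per month, from the text

for pages in (90_000, 100_000):
    cents = implied_saving_per_page(ADDED_FIXED_COST, pages) * 100
    print(f"breakeven at {pages:,} pages/month -> about {cents:.1f} cents saved per page")
```

A saving of roughly two cents per page is in line with the three cents per page ($30.00 per 1,000 pages) quoted earlier for continuous forms.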
[Exhibit 3-Computer microfilm vs impact line printer: Leased in-house costs. Total cost per month vs number of original pages per month (000).]
The magnitude of the demand for computer reports
that is presently unsatisfied because such reports are
considered uneconomical can perhaps be judged by
noting that if the management of a corporation with as
little as $15 million in annual sales desired detailed
daily reports on finished parts inventory, accounts
receivable, and unfilled orders, almost seven hours
of computer time would be consumed in printing out
these reports. 2 Incremental costs of about $3,000 to
$4,000 per month for materials and possibly $2,000 to $2,500 for additional labor would probably thereby
be incurred. Thus, although the utility of detailed
management reports such as these is probably high,
we think it likely that the operational difficulties and
the extremely high EDP costs necessary to produce
such information have led many manufacturing companies to forego such data until now. However, with
the installation of a computer microfilm recorder, the
same $15 million company described above could produce the same reports at an incremental cost of only
$400 to $500 per month for materials and no incremental
cost for labor. Thus, the company would then find it
feasible to produce these reports. Operating experience
to date of computer microfilm recorder owners certainly
would point toward such a conclusion.
Moreover, it is important to note that the cost
curve of a computer microfilm data processing installation is essentially flat out to very large quantities
of output (Exhibit 3). Thus, the corporate manager
would now be able to obtain additional detailed reports almost instantaneously at virtually no incremental cost.
Experience to date indicates that most managements
will quickly begin to utilize the full capacity of a newly
installed recorder.
For example, in one case, a large insurance
company installed a microfilm recorder in May 1967.
Although the equipment operated only five hours
per week when first installed, after approximately
one year, utilization had increased tenfold to 50
hours per week. In another case, a manufacturing
concern which began using prototype computer
microfilm equipment in 1967 had increased its film
consumption to 20 million feet per year (400 million
pages) by 1967 and reached 38 million feet (760 million pages) in 1968.
The substantial savings in consumable materials
costs, labor costs, and machine rental are, of course,
the three major cost elements considered in calculating operational savings (Exhibit 4).
Additionally, however, considerable savings in computer report shipping and storage costs can frequently
be realized, although these expense elements have not
been included in our calculations (Exhibit 5).
[Exhibit 4-Computer microfilm vs paper continuous forms output: Substantial savings in materials, machine rental, and labor costs. Panels: materials, labor costs, machine rental; axes include cost per original page, cost per duplicate, duplicates/page, and hours utilization/month, plotted for paper and film.]
[Exhibit 5-Computer microfilm vs paper continuous forms output: Substantial savings in storage and shipping costs. The exhibit compares volume, weight, approximate storage cost, and approximate first class mail cost.]
                             Microfilm     Paper
Volume                       0.10 cu ft    4.50 cu ft
Weight                       3.0 lbs       150.0 lbs
Approximate storage cost     $0.05         $4.15
Cost reductions for all users
In summary, then, by superimposing the costs of
service centers (Exhibit 2) on those of in-house installations (Exhibit 3), it can be seen that the use of
a computer microfilm recorder will always result in
the lowest data processing cost at all levels of usage
(Exhibit 6).
These facts should be apparent:
1. A computer microfilm service center is always
about 50 percent cheaper than a paper service
center, and this cost advantage probably will
go higher.
2. A computer microfilm service center is the
least expensive data processing alternative up
to about 200,000 pages of output per month.
(200,000 pages per month is the maximum output of a single shift working six days per week
on a 360/30 with one attached line printer.)
3. Beyond 200,000 pages of output per month,
an in-house computer with a microfilm recorder
is by far the least expensive data processing alternative.
4. An in-house computer/microfilm recorder can
bring about a cost saving vis-a-vis a computer/
line printer installation beyond about 90,000
to 100,000 pages per month, or only five to
six hours of computer time per day, with paper
output.
Thus, if decisions regarding an in-house capability
versus utilization of a service bureau were always rational and financially sound, 100 percent conversion
from paper to computer microfilm output could be
expected. To anticipate a conversion ratio of 100 percent is, of course, unrealistic. Nonetheless, the pricing
revolution which the computer microfilm service companies have brought about in the data processing
service industry should result in very extensive use of
the computer microfilm recorder in this segment of
the computer industry. The small data processing user
will be the primary beneficiary of the dramatic reduction in data processing service bureau costs. Similarly,
medium-scale to heavy computer users will find the
substantial cost and operating advantages of an in-house recorder sufficiently compelling to bring about
heavy conversion to microfilm output in this market
segment.
[Exhibit 6-Summary of cost comparisons: Computer microfilm recorder results in the lowest costs at all levels of usage. Total cost per month ($000) vs number of original pages per month (000). Source: service center price lists.]
Microfilm is the most efficient medium for storing
and accessing generated computer data
Computer microfilm is actually, by a wide margin,
the most efficient and economical storage and retrieval
system for computer generated information. Microfilm
has always been superior to paper from a bulk handling
and storage standpoint. With the introduction of the
computer microfilm recorder, it can now also approximate electronic time-sharing systems in performance
for the great majority of information storage and retrieval applications. Thus, computer output on microfilm can provide a simple, fast information system far
superior to those currently in use. Indeed, computer
microfilm service bureau managements indicate that
it is not the substantial cost advantage of film over
paper computer output which is most attractive to
prospective customers, but rather its usefulness as an
effective information system. The dramatic cost benefits, however, can be an extremely effective sales tool
in getting the customer to consider microfilm seriously.
Microfilm joins the computer era
Development of the computer microfilm recorder
has brought in its wake a flurry of product development activity aimed at greatly facilitating access to
information on microfilm. Most individuals think of
microfilm only as an archival medium-for storing
outdated information for which a need might or might
not arise at some time in the future. Actually, the active use of microfilm for the storage and retrieval of
information in daily use has been practiced by some
pioneering users and companies for years. For the most
part, these have been extremely large users (e.g.,
Social Security Administration). We feel that in large
part the reluctance to adopt active microfilm systems
has been due to the fact that information in such systems had to be manually sorted, updated, and coded-a tedious and time-consuming task.
Now, however, this task has been eliminated through
the development of computer microfilm coding systems
which can provide manual access to one page out of
73,500 in one to five seconds.
Additionally, the time required to obtain computer information on microfilm has
been reduced from days to literally minutes. One
manufacturer has adopted a marketing program stressing "on time" information rather than "real time",
which is, in fact, an accurate description. There is
virtually no computerized information which cannot
be obtained overnight in a fully useful, properly indexed
format.
In sum, the user of computer microfilm has access
to a "poor man's time-sharing" information system,
as some have termed it, with no addition to his CPU
costs.
Computer microfilm competes effectively
against time-sharing
Many feel that time-sharing will become the most
common method of providing access to computer
generated information. But, it can be shown that for
most applications, the storage and retrieval of information electronically is very uneconomical relative to
a computer microfilm system.
For example, in one specific application, a data
storage capacity of 15,000 pages, to be updated
daily, was required. The effective cost of this application on a commercial time-sharing disc file system
equalled about $3.00 per page or a total of $45,000
per month. On microfilm, this same information
can be updated once a day for approximately $0.60
per page or $9,000 per month, a storage cost reduction of almost 80 percent. Moreover, the time-sharing system would incur additional costs for
terminal connect time and computer search time.
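The storage comparison in this example reduces to a per-page rate multiplied by file size. The Python sketch below restates it using only the figures quoted above; the names are illustrative, and terminal connect time and search time are deliberately left out, as they are in the text's $45,000 figure.

```python
FILE_PAGES = 15_000  # pages in the file, updated daily


def monthly_storage_cost(pages, cost_per_page):
    """Effective monthly cost of keeping the file current at a given per-page rate."""
    return round(pages * cost_per_page, 2)


time_sharing = monthly_storage_cost(FILE_PAGES, 3.00)  # commercial disc file system
microfilm = monthly_storage_cost(FILE_PAGES, 0.60)     # daily microfilm update
reduction = 1.0 - microfilm / time_sharing

print(f"time-sharing: ${time_sharing:,.0f} per month")
print(f"microfilm:    ${microfilm:,.0f} per month")
print(f"reduction:    {reduction:.0%}")
```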
Therefore, we feel that microfilm, as a medium of
access to computer information, will become much
more commonly employed than time-sharing in the
future. Time-sharing, however, will always be required
for applications in which immediate interaction with
the data base is desired.
Microfilm permits the storage cost savings just
described because it has a significantly greater storage
density capacity than the magnetic storage media used
in time-sharing systems (i.e., disc packs and data cells).
While it is only possible to store approximately 1,050
bits per square centimeter on computer magnetic materials, it is possible to store 1,000 times this amount,
or over one million bits, on a square centimeter of microfilm.
In addition to storage costs, the relative disadvantages of time-sharing for information storage and retrieval include substantially higher terminal and communications costs (Exhibit 7).
As shown in the exhibit, a full page of information
can be accessed in one to four seconds on the CARD
device. To equal this speed with a time-sharing system
a high-cost video terminal and Telpak-D communications line must also be employed.
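The display times listed in Exhibit 7 for a full page of 8,000 characters can be turned into effective character rates, which makes the gap between typewriter terminals and video terminals easier to see. The Python sketch below does only that division; the dictionary and function names are illustrative, and the three times used are the ones the exhibit gives for those devices.

```python
PAGE_CHARS = 8_000  # characters in a full page, as defined in Exhibit 7

# Seconds to display a full page, as listed in Exhibit 7.
display_seconds = {
    "Teletype KSR 33": 800,
    "IBM Selectric 2741": 533,
    "IBM 2265 video": 3.1,
}


def chars_per_second(seconds, page_chars=PAGE_CHARS):
    """Effective character rate implied by a full-page display time."""
    return page_chars / seconds


for device, seconds in display_seconds.items():
    print(f"{device}: about {chars_per_second(seconds):,.0f} characters per second")
```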
[Exhibit 7-Microfilm viewers vs time sharing systems: Favorable comparison in terms of cost and speed]

                                        Terminal cost     Typical rental   Time to display full
                                        (terminal only)   per month        page (8,000 characters)
MICROFILM VIEWERS
HF Image Card System                    $2,000-$3,450     from $95         1-4 seconds
Stromberg DatagraphiX 1700
  (Automatic Magazine)                  $ 1,248           $ 42             4-15 seconds
Kodak PVM (Manual Roll)                 $   600           $ 21             8-20 seconds
COMPUTER TERMINALS
Sanders 720 (Video)                     $ 9,025           $468             0.2 seconds
IBM 2265 Video (Plus Controls)          $15,000           $471             3.1 seconds
IBM Selectric 2741                      $ 3,100           $130             533 seconds (9 min.)
Teletype KSR 33                         $   450           $ 90             800 seconds (13 min.)

                                        Approximate                        Time to transport full
COMPUTER COMMUNICATIONS                 transmission cost                  page (8,000 characters)
High speed, broad band (Telpak-D)       $45/mile/month                     0.1 seconds
WATS service                            $240-$2,000/month                  2.7 seconds
Dedicated voice                         $2-3/mile/month                    27 seconds
Low speed                               $1/mile/month                      800 seconds (13 min.)

Rental figures are marked in the exhibit as including maintenance and modem.
Computer communications costs must be included with terminal and storage costs in time sharing systems.
Source: Auerbach Info, Inc., Communications Report; Auerbach Info, Inc., Std EDP Reports; company price lists.

As a result of these cost factors, microfilm is the more
economical of the two systems for most normally encountered information storage and distribution problems. The surface illustrated in Exhibit 8 delineates
the points (determined by file size, number of users,
and update frequency) at which a microfilm system
is roughly cost equivalent to an electronic information
storage and retrieval system.
For the problems located within the surface, a microfilm system is less expensive; for those outside the
surface, electronic systems are less expensive.
For example, the exhibit demonstrates that when
information must be available to 200 users and updated every business day, a microfilm system is more
economical for files of 14,000 pages or less. A file of
this size could contain the daily closing stock quotations for the NYSE, ASE, and OTC market for over
four years. Similarly, a 14,000 page file could contain
all the records for payroll, personnel, and finished
goods inventory (plus 10,000 accounts receivable
records) for an average industrial corporation with
sales of $800 million per year. 3
There are two types of commonly encountered applications for which microfilm is not a suitable replacement for time-sharing: when the user wishes to
input, manipulate, and extract data at will, and when
updating is required more than once a day, such as
in transportation reservation systems (these cases are
located above the update frequency = 20 times/month
plane in Exhibit 8). Whereas time-sharing allows information stored in a computer to be updated immediately and made readily available in updated form
to all users, with a microfilm system four to six hours
is the minimum time one may expect for file update,
preparation, and distribution.
However, in most other commonly encountered information storage and retrieval applications, computer
processed data is required for informational purposes
only, such as in referencing records to service a customer inquiry. In these cases, a microfilm information
system is as effective as, and far less expensive than,
a time-sharing system.
Recently, hybrid information systems have been
introduced in which a data base is stored on microfilm
while recent updates and changes can be retrieved
electronically from computer memory. These systems,
which utilize the advantages of both microfilm and
time-sharing systems, should find widespread acceptance in the future.
[Exhibit 8-Computer generated microfilm vs electronic
computer systems: Most economical information
storage and retrieval system for most commonly
encountered applications]
REFERENCES
1 Report on non-impact printing project
Army Materiel Command Jan 1968
2 The computer and the small company
Auerbach Info. Inc.
3 Statistical Abstract of the U S 1968 (89th edition)
U S Bureau of Census
Washington D C 1968 and
The computer and the small company
Auerbach Info. Inc.
Design of distributed communications
system-A case study
by N. NISENOFF
Computer Command and Control Company
Washington, D. C.
INTRODUCTION
The development of a concept for a Department of
the Army Civilian Personnel Management and Manpower Data Reporting System and an Optimum Automatic Data Processing System was undertaken by
Computer Command and Control Company in June,
1967.*
The work was initiated by the Department of the
Army to meet the increasing demand for more detailed
information about civilian employees, as required in
connection with Army-wide civilian personnel career
management programs, and in view of new and more
detailed general governmental reporting requirements.
In addition, the system was to be capable of maintaining data concerning the wide range of skills and
experience of Army personnel. A further goal was the
reduction to a minimum of the time delay in communicating relevant personnel data for the purpose
of applicant screening.
The system, as developed, is a generalized civilian
personnel information system that embraces all aspects of the Army's civilian personnel management
activity and control. It provides the information
gathering, processing, storing, querying and reporting
capabilities to meet the requirements of Headquarters,
Department of the Army; all echelons of field commands; the Department of Defense; the Bureau of
the Budget; the Civil Service Commission; and other
government agencies.
The design concept:
1. Provides a powerful, efficient, open-ended,
processing capability at a cost level that is
the minimum commensurate with the system
requirements.
2. Utilizes the most advanced (yet proven) hardware and information entry, storage and retrieval
techniques available so as to effect data entry,
validation, distribution, storage and organized
retrieval with minimal human intervention.
3. Offers direct, rapid, complete and easy exchange
of both formatted and unformatted personnel
information among authorized individuals and
offices at all levels.
4. Provides standardized functional personnel management information formats and processing
techniques, together with adequate on-line
analytical tools.
5. Makes exchange of data with the Civil Service
Commission, the Department of Defense and
with other Army systems simple and easy, providing data definitions have been standardized.
* This effort has been performed for the Deputy Chief of Staff
for Personnel, United States Army, under contract DAHC15
67 C 0265.
Insofar as practicable, use has been made of present
data bases. By applying automatic file conversion
techniques previously developed, it will be possible
to efficiently convert many existing data bases into
random access files that can be electronically updated
and queried. Particular attention has also been given
to the problems of interfacing with and making the
best use of existing automated or partly automated
general management information systems within the
Department of the Army.
The scope of the project is in part indicated by the
following: There are over half a million civilian employees of the Army paid from appropriated funds,
of whom about 140,000 are foreign nationals. In addition, there are about 200,000 civilian employees overseas who are paid from non-appropriated funds but
who are administered by Civilian Personnel Offices.
To service these employees, there are some 200 Civilian
Personnel Offices scattered around the world. For just
the United States Civil Service employees, it is estimated that approximately two billion characters of
information will need to be carried in the Army Civilian Personnel Management and Manpower Data
Reporting System.
Following the initial data gathering and analysis
phase, the record and file structuring effort was undertaken. During this phase all candidate data elements
were identified, classified and organized into files.
File usages were then examined and files were assigned
to appropriate storage media. For example, the data
elements required to develop a reply to a relatively
frequent query were placed in a fast mass random access storage subsystem. On the other hand, any data
element required only infrequently was placed in the
magnetic tape storage subsystem.
Given the results of the file structuring study, four
separate and distinct Continental United States
hardware configurations were postulated, and two
additional ones were prepared for overseas components. Each
configuration is capable of performing the data processing functions required. The Continental United
States configurations specified were:
1. A centralized single computer system;
2. A regionalized five computer system;
3. A decentralized twelve computer system; and
4. A localized twenty-one computer system.
A cost analysis was then performed to evaluate each
configuration. The overall results of this evaluation,
including the cost of initial loading of the data bank,
are shown in Figure 1.
After considering this cost data plus the other advantages and disadvantages, Configuration I was selected.
It is of interest to note that the total cost of the system
per Civilian Personnel Office for Configuration I is
about the same as the salary and overhead cost of
one GS-5.**
** At the time the report was prepared, a GS-5 earned $5,732.00
per year.
[Figure 1-Monthly cost per CPO for four examined configurations. Horizontal axis: number of computer sites (1, 5, 12, 21) and configuration number (I, II, III, IV); vertical axis: monthly cost per CPO, $400 to $3,200. Cost components shown: system design and implementation, cost of loading data into the automated system, hardware and communications, and range of hardware cost.]
To establish the practicality of the implementation
of the proposed system, a break-even analysis was
performed. A reduction in work force of four percent
within the Civilian Personnel Offices is the break-even point, while a six percent reduction would produce net savings of approximately $1.25 million per
year. There is evidence that in an automated system,
this reduction in work force could be made with no
loss of efficiency or productivity. In fact, the automated
system could be expected to greatly increase staff
efficiency and productivity, as well as provide management information vastly improved with respect to
timeliness, completeness, accuracy and internal consistency. Finally, analytical services would be available
which cannot be achieved with a manual system.
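The two break-even figures quoted above are enough to back out the rough scale of the costs involved. The Python sketch below does so under two assumptions that are ours and not stated in the paper: that savings grow linearly with the percentage reduction in work force, and that the 1.25 million figure is in dollars per year. The function name and its return values are hypothetical.

```python
def implied_annual_figures(breakeven_pct=4.0, target_pct=6.0, net_savings=1_250_000.0):
    """Back out annual cost figures from the break-even statement.

    Assumes savings are linear in the work-force reduction, so the net savings
    at the target reduction are earned by the points beyond break-even.
    """
    value_per_point = net_savings / (target_pct - breakeven_pct)
    system_cost = value_per_point * breakeven_pct  # cost recovered at break-even
    staff_cost = value_per_point * 100.0           # implied total staff cost
    return value_per_point, system_cost, staff_cost


per_point, system_cost, staff_cost = implied_annual_figures()
print(f"value of one percentage point: ${per_point:,.0f} per year")
print(f"implied system cost:           ${system_cost:,.0f} per year")
print(f"implied CPO staff cost:        ${staff_cost:,.0f} per year")
```

These are order-of-magnitude checks only; the paper's own cost analysis is summarized in Figure 1.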
The software and programming aspects of the overall
problem were not examined as thoroughly as desired.
Certain assumptions were made, among them:
1. The computer hardware would be dedicated
to the application.
2. Computer manufacturer's software support
would be adequate for all needs except specific
applications packages.
3. The query language and the storage and retrieval subsystems were not specified.
Fortunately, the study was not performed without
prior experience or knowledge concerning these points.
Previous efforts by the Company, as well as members
of the team, had been concerned with these very
points. Estimates were made and employed.
A subsequent investigation*** required a more
detailed and thorough examination of these very
points. The results of that study will be reported upon
in the near future.
*** Contract No. FA68WA-1913, Design of FAA Manpower
and Personnel Information System.
The dimensions of the problem
General description
Within the Department of the Army, military and
civilian personnel administration is centered in single
offices, at all levels, wherever both military and civilian
personnel are found in significant numbers. However,
at the execution level, branching is noted between the
military and civilian personnel staffs in executing
day-to-day detailed, direct operational responsibilities.
This will continue to be the case in the future.
Table I indicates the distribution of civilian employees with respect to citizenship; by Army area in
the Continental United States, or geographical area
outside Continental United States, and membership
of the staff of Army Materiel Command, Corps of
Engineers or "other" organizations. Additionally, it
presents the number of Civilian Personnel Offices
servicing the ten designated groupings.

TABLE I-Distribution of civilian employees and civilian personnel offices

Army or       No. of   No. of   Total U.S.   Foreign     Grand
Geog. Area    Orgs.    CPO's    Employees    Nationals   Total
I               87       63      147,800        -        147,800
III             34       22       57,200        -         57,200
IV              25       20       45,200        -         45,200
V               37       30       63,600        -         63,600
VI              29       25       42,600        -         42,600
Hawaii           7        1        5,900        -          5,900
Alaska           6        3        3,100        -          3,100
Far East         -        5        5,100      81,000      86,100
Europe           -       16        7,600      56,200      63,800
SOCOM            -        2        1,900       2,800       4,700
Totals         250      187      380,000     140,000     520,000

Organization counts for Far East, Europe and SOCOM appear in the source as 19, 2 and 4, but their assignment to rows is not recoverable.

Breakdown of U.S. Employees

Army or                  Corps of                 Total U.S.
Geog. Area    AMC        Engineers    Other       Employees
I              72,400     14,300       61,100      147,800
III            19,700     10,700       26,800       57,200
IV             16,400      7,000       21,800       45,200
V              31,700      8,500       23,400       63,600
VI             19,800      6,700       16,100       42,600
Hawaii            -          -          5,900        5,900
Alaska            -         500         2,600        3,100
Far East          -          -          5,100        5,100
Europe            -         300         7,300        7,600
SOCOM             -          -          1,900        1,900
Totals        160,000     48,000      172,000      380,000

Information and processing requirements
As a basic premise, it is assumed that processed numeric, textual and graphic information delivered to the user
must be adequate to meet both predictable and ad hoc
needs. Processing and delivery must be timely.
The basic parts of such an automated system are:
1. A central processor (or processors).
2. Data storage capacity and retrieval capability.
3. A means for inputting and outputting information.
4. Adequate data communications.
Central processor
In discussing central processors, there are two basic
factors to consider. If there is to be but one central
processor, then the only requirement of importance is
that it have the capacity and time available so as
to be able to handle the input and output loads and
perform the required processing.
On the other hand, if a multiple computer configuration is decided on, in addition to the requirement set
forth above, there has to be a distinct set of software
programs for each local or decentralized computer
type which is not internally compatible with the master
computer, plus additional programs to provide for
transfer of data from one computer to another. This
adds considerably to both the cost and the complexity
of the system.
If the central processor is not dedicated to civilian
personnel use, but is shared, the particular priority
that would most probably be accorded civilian personnel information processing would entail delays of
indeterminate length. As the number of computers
handling civilian personnel information is increased,
the likelihood of sharing the computer increases
greatly. At the same time, any procedure that involves
the output of more than one computer would not be
completed until time is available on the last available
computer. It is not only the delay that can prove to
be vexing; it is also the fact that it is most difficult to
ascertain how long the delay might be.
Data storage capacity and retrieval capability
When, as is expected, 40 percent of all employees
are in the career management program, storage for
approximately 1.8 billion characters of information
will be required for the records of the United States
Army civilian personnel. Storage for an additional
600 million characters will be necessary for United
States employees overseas and foreign national employees.
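The 1.8 billion character estimate is consistent with multiplying the U.S. employee count in Table I by the average record length given with Table II. The paper does not state that the estimate was derived this way, so the Python sketch below is only a plausibility check under that assumption; the constant and function names are illustrative.

```python
US_EMPLOYEES = 380_000       # total U.S. employees, from Table I
AVG_RECORD_SINGLE = 4_700    # average record length, single computer system (Table II note)
AVG_RECORD_MULTI = 7_100     # average record length, multicomputer installation (Table II note)


def storage_chars(employees, avg_record_chars):
    """Total characters of storage for one record per employee."""
    return employees * avg_record_chars


single_total = storage_chars(US_EMPLOYEES, AVG_RECORD_SINGLE)
multi_total = storage_chars(US_EMPLOYEES, AVG_RECORD_MULTI)

print(f"single computer system:     {single_total / 1e9:.1f} billion characters")
print(f"multicomputer installation: {multi_total / 1e9:.1f} billion characters")
```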
Data elements
In determining how large an individual record
would be, it is recognized that the record of a new employee will not be as extensive as that of an employee
who has worked many years. To measure that difference, the number of characters for a typical personnel
record of a GS-1 through GS-5, of a GS-6 through
GS-11 and a GS-12 through GS-18 were recorded.
Table II is a summary of information concerning
the data elements which are required to meet both
present and anticipated needs. It also provides the
numbers of characters required for each of three record
categories.
Input and output requirements
There are 160 Civilian Personnel Offices in the Continental United States, which will require a total of
from 300 to 400 input/output consoles, depending on
the make or type finally chosen. The overseas Civilian
Personnel Offices will require an additional 60-90
consoles.
Both from qualitative and cost standpoints, the effects of proper or improper console selections will
clearly be very significant in view of the large number
of units involved.
Four broad categories of information are present
within the system.
1. Information necessary to update the files and
records.
2. Queries and responses.
3. Statistical data.
4. Processing outputs in general.
Detailed analysis of data presently handled or
required in updating files and records and in meeting
all except ad hoc query demands results in 20 characters/
man/day of data input and 18 characters/man/day
output. To handle ad hoc queries and their response,
an additional 12 characters/man/day (six input,
six output) are considered adequate. These statistics
are based on there being 240 working days a year.
Table III shows the known inputs to and outputs
from the central processor and the numbers of characters/man/year and characters/man/day.
Alternative configurations
As a starting point, a single computer system was
considered. Next, an integrated system of six computers
was examined, then one with 13, and finally, one with
22. The process halted at this point because it was
becoming apparent that the complexities of a multicomputer system, and with them the unpredictable
time responses as well as costs, were growing at a
far higher rate than any advantages that might accrue.
Design of Distributed Communications System
TABLE II-Summary information: Data elements
                                          Maximum   Average Sized Record (Char)
                                          Record    GS 1-5               GS 6-11 Plus         GS 12-18
                                          Size      Plus W. B.           W. B. Supervisors
Data Elements                             (Char)    Single    Multiple   Single    Multiple   Single    Multiple
                                                    Computer  Computer   Computer  Computer   Computer  Computer
Organization Elements* (O)
Position Elements (P)                        164       164       164        164       164        164       164
Civilian Personnel Elements (CP)          11,132     2,796     2,796      4,335     4,335      6,192     6,192
Career Management Program Elements (CM)    7,548                         2,562#    7,548#     2,456#    7,548#
Executive Assignment** Elements (EA)       2,467
Primary Personnel (PP) Elements              139                 139                  139                  139
Statistics (ST)*
TOTAL                                                2,960     3,099      7,061    12,186      8,812    14,043
Using the statistical data tabulated below, the average record length would be 4,700 for a single computer system
and 7,100 for a multicomputer installation.
GS 1-5 plus W. B. less W. B. Supervisors    230,000 employees = 60%
GS 6-11 plus W. B. Supervisors              113,000 employees = 30%
GS 12-18 and PL Appointees                   37,000 employees = 10%
(U.S. citizens only)
* The storage requirements for organizational and statistical elements are insignificant as compared with the
personnel data storage requirements. Therefore, no values are assigned.
** Executive assignment data not carried in Army System but by Civil Service Commission; presented here for
information only.
# The character count for average record length in the CM category is kept at maximum record length since it is
expected that personnel in this program will have been in the civil service for several years and therefore will
have need for all the record space available.
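The record-length and storage figures quoted with Table II can be checked with a short calculation. The sketch below is illustrative only: it weights the TOTAL row of Table II by the three grade-group populations, and the names `GROUPS` and `storage_summary` are assumptions introduced here, not part of the study. The paper quotes rounded values (4,700 and 7,100 characters), so the computed averages are close to those figures but not identical.

```python
# Grade-group populations and average record sizes (characters per record),
# taken from the TOTAL row of Table II and the statistics listed beneath it.
GROUPS = [
    # (group, employees, single-computer record, multicomputer record)
    ("GS 1-5 plus W. B.", 230_000, 2_960, 3_099),
    ("GS 6-11 plus W. B. Supervisors", 113_000, 7_061, 12_186),
    ("GS 12-18 and PL Appointees", 37_000, 8_812, 14_043),
]


def storage_summary(groups):
    """Return (employees, single_total, multi_total), totals in characters."""
    employees = sum(n for _, n, _, _ in groups)
    single_total = sum(n * s for _, n, s, _ in groups)
    multi_total = sum(n * m for _, n, _, m in groups)
    return employees, single_total, multi_total


employees, single_total, multi_total = storage_summary(GROUPS)
avg_single = single_total / employees  # population-weighted average record
avg_multi = multi_total / employees

print(f"employees: {employees:,}")
print(f"single computer: {avg_single:,.0f} chars/record, "
      f"{single_total / 1e9:.2f} billion chars in all")
print(f"multicomputer:   {avg_multi:,.0f} chars/record, "
      f"{multi_total / 1e9:.2f} billion chars in all")
```

Under these inputs the single-computer total comes to roughly 1.8 billion characters, in line with the storage estimate given for United States Army civilian personnel records.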
The first configuration involves a single computer
located in the Washington area. Figure 2 shows a
schematic of this. Each Civilian Personnel Office
would input to and receive data from this one computer. A complete record concerning each civilian
working for the Department of the Army would be maintained in this computer, except for foreign nationals
in Europe and the Far East. This would result in a
total of approximately 380,000 records.
At each Civilian Personnel Office, there will be at
least one input/output console. Here, all information
would be entered that would update personnel records,
as well as Civilian Personnel Office queries. Each
console would also receive all query-response outputs
as well as general information outputs and record
printouts that are needed for all personnel management
purposes.
Fall Joint Computer Conference, 1969
TABLE III-Major inputs to and outputs from system (Annual)
                                                        No. of Characters
INPUTS
Payroll Change Slip (2515)                                        150
Request for Personnel Action (52)                                 300
Application for Federal Employment (57)                           750
Request for Referral List (2302.2)                                350
Installation Training (750)                                     1,000
Employee Performance & Career Appraisal (2302.4)                1,600
Employee Performance Rating (1052)                                100
Job Description Rewrite (374)                                     400
Qualification Record (2302)                                       100
Total Inputs                                                    4,750
OUTPUTS
Referral List Response (2302.2)                                   165
Notification of Personnel Action (50)                             300
Career Employment Record (2302-5)                               3,600
Position Review (275)                                             100
Occupational Inventory of Civilian Positions (1629)               100
Table of Distribution & Allowances (2952)                          70
Civilian Personnel by Basic Rate (3100)                            70
Civilian Personnel Employment Report (3250)                        22
Total Outputs                                                   4,427
Known Inputs         4,750 = 20 Characters/man/day
Known Outputs        4,427 = 18 Characters/man/day
Known Requirements   9,177 = 38 Characters/man/day
For ad hoc queries and presently unanticipated requirements, 6 characters/day each for input and output are
added.
For system concept development the following values are used:
Total Inputs         26 Char/man/day
Total Outputs        24 Char/man/day
Total Requirements   50 Char/man/day
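The daily rates in Table III follow from dividing the annual character counts by the 240 working days assumed in the text. The following sketch reproduces that arithmetic; the dictionary and function names are illustrative assumptions, and the form numbers are used only as labels.

```python
WORKING_DAYS_PER_YEAR = 240      # stated in the text
AD_HOC_PER_DIRECTION = 6         # characters/man/day added for ad hoc queries

# Annual characters per employee, by form, as listed in Table III.
INPUTS = {
    "Payroll Change Slip": 150,
    "Request for Personnel Action": 300,
    "Application for Federal Employment": 750,
    "Request for Referral List": 350,
    "Installation Training": 1_000,
    "Employee Performance & Career Appraisal": 1_600,
    "Employee Performance Rating": 100,
    "Job Description Rewrite": 400,
    "Qualification Record": 100,
}
OUTPUTS = {
    "Referral List Response": 165,
    "Notification of Personnel Action": 300,
    "Career Employment Record": 3_600,
    "Position Review": 100,
    "Occupational Inventory of Civilian Positions": 100,
    "Table of Distribution & Allowances": 70,
    "Civilian Personnel by Basic Rate": 70,
    "Civilian Personnel Employment Report": 22,
}


def per_day(chars_per_year):
    """Convert an annual per-employee volume to characters/man/day."""
    return chars_per_year / WORKING_DAYS_PER_YEAR


total_in = sum(INPUTS.values())
total_out = sum(OUTPUTS.values())
known_in = round(per_day(total_in))     # rounded as in the table
known_out = round(per_day(total_out))
design_in = known_in + AD_HOC_PER_DIRECTION
design_out = known_out + AD_HOC_PER_DIRECTION

print(total_in, total_out, total_in + total_out)
print(known_in, known_out, design_in, design_out, design_in + design_out)
```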
The system will be able to accept inputs of authorized
manpower spaces and changes to them as these allocations or changes to specific spaces are made. By
comparing the data with position information reported by Civilian Personnel Offices, it will be easy to
determine, at any time, discrepancies between vacancies and established positions.
The second configuration provides for five computers
carrying personnel records in addition to the RAPID
system computer complex in Washington. Figure 3
is a schematic diagram of this.
These five processors would be located, with numbers
of Army and Civilian Personnel Offices serviced, as
shown in Table IV.
Personnel records of employees assigned to each
Civilian Personnel Office would be stored in the computer
for the respective Army area. Each Civilian
Personnel Office would use its consoles to communicate
with the computer in the Army area where it was
located, in the same manner as described previously
for the operation of a single computer system.
These five area computers would be connected
electrically to the RAPID computer in Washington.
The central (Washington) computer would store some
20 critical elements of information (approximately
100 characters) concerning each non-career employee
plus additional career management information for
each employee in the career management system.
This master computer would thus be able to produce
statistical data as well as provide responses to many
queries without requiring access to the area computers.
Answers not obtainable from the RAPID system
computer complex would be generated by polling the
computer(s) able to furnish the answer, or if the specific
computer containing the desired information was not
known, all five computers would be queried.
Figure 2-Single computer configuration data input and output (160 CPOs in U.S.)
Figure 3-Six computer configuration (computer site in each Army; each installation includes central processor, mass storage and peripherals)
The third configuration provides for twelve computers carrying (primarily) personnel records, plus
the RAPID system computer in Washington. Figure
4 shows a schematic diagram of this.
In this approach, a configuration was developed
wherein the two major employers of civilian personnel,
the Army Materiel Command and the Corps of Engineers, retained the records of their own personnel
in their own computers. The balance of the employees
are serviced in Army area computers as in Configuration II.
In an effort to store the data as close to the Civilian
Personnel Offices as feasible, a twenty-two computer
configuration was also postulated and studied. A
schematic of this is shown in Figure 5. Again, the
RAPID system computer would serve the same
function as described in the explanation of Configuration II.
TABLE IV-Six computer configuration description
Command        Location                                  Personnel Serviced   Number of CPO's
DCSPER         Pentagon
First Army*    Ft. Meade, Maryland                            163,000               86
Third Army     Ft. McPherson, Georgia                          57,000               22
Fourth Army    Ft. Sam Houston, Texas                          45,000               20
Fifth Army     Ft. Sheridan, Illinois                          63,000               30
Sixth Army**   The Presidio, San Francisco, California         52,000               29
Total                                                         380,000              187
* To include records for U. S. personnel in Europe, the Far East and Southern Command.
** To include records for U. S. employees in Hawaii and Alaska.
Figure 4-13 computer configuration (computer site in each Army and at indicated commands; each installation includes central processor, mass storage and peripherals)
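As a quick consistency check on Table IV, the per-area figures can be summed and the largest area separated from the rest, which is how the "maxima", "range" and "average for range" entries for Configuration II in Table X appear to be formed. This is a sketch under that assumption; the variable names are illustrative.

```python
# (personnel serviced, number of CPOs) per Army area, from Table IV.
AREAS = {
    "First Army": (163_000, 86),
    "Third Army": (57_000, 22),
    "Fourth Army": (45_000, 20),
    "Fifth Army": (63_000, 30),
    "Sixth Army": (52_000, 29),
}

total_personnel = sum(people for people, _ in AREAS.values())
total_cpos = sum(cpos for _, cpos in AREAS.values())

# Treat the largest area as the "maximum" and summarize the remaining sites.
largest = max(AREAS, key=lambda name: AREAS[name][0])
others = [people for name, (people, _) in AREAS.items() if name != largest]
range_low, range_high = min(others), max(others)
range_average = sum(others) / len(others)

print(total_personnel, total_cpos)
print(largest, range_low, range_high, range_average)
```

The totals agree with the 380,000 records and 187 Civilian Personnel Offices shown in the table, and the four smaller areas span 45,000 to 63,000 employees with an average of about 54,000.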
Hardware technology
The general considerations which must be taken
into account in designing a complex system such as the
one under discussion can be divided into two major
areas. One is concerned with the hardware to be employed and the other with the software. The hardware
area, in turn, can be subdivided into four parts, while
there are two distinct aspects of software use to be
considered.
The following discussion will be concerned only with
the hardware aspects, specifically:
1. The central processor hardware considered
during the study,
2. The mass random access storage systems,
3. The input/output terminals to be located at
the various Civilian Personnel Offices and
Headquarters offices throughout the Continental United States and,
4. The communications techniques or channels to
be utilized.
Figure 5-22 computer configuration
Of the four specific hardware subsystems, the communication channels, the terminals and the computers
are highly interrelated. Further, the communication
software package either provided by the hardware
manufacturer or developed by the contractor must
be integrated in such a fashion as to permit the large
volumes of data transfer to work in a smooth, well-integrated fashion. This was assumed to be true for
this study.
Automatic data processing systems
Following a detailed examination of the automatic
data processing systems available at the time of the
study (1967), a representative selection was made.
Summary data concerning these systems will be found
in Table V.
TABLE V
[Table body not recoverable from the source scan. Column groups: manufacturer and model; date first installed; CPU and memory (memory range and cycle time); printer, card reader and magnetic tape (model and speed or transfer rate); purchase cost.
Systems listed: RCA Spectra 70/35, 70/45 and 70/46; Burroughs B5500 and B6500; SDS Sigma 5 and Sigma 7; Control Data 3100, 3300 and 3500; IBM 360/40, 360/50 and 360/65; General Electric 435.]
Random access mass storage systems
Of the four configurations specified, Configuration I
requires the largest and fastest mass random access
storage subsystem. Under this configuration, an on-line storage capacity (at one computer site) of an estimated 1.8 billion bytes of information will be required.
Today, no single device is available which can meet
this requirement. To attain this storage volume a
number of units must be integrated into the total
system. For example, use of the RCA RACE unit,
actually the cheapest available device on a dollar per
byte basis, would still require three units to accommodate the total volume of data. Purchasing these
units (including their individual control units) will
cost nearly half a million dollars, while rental would
be about $12,000 a month.
Table VI presents a summary description of the
most likely candidates for the mass random access
storage systems. Note that this listing of devices includes the largest mass storage units that are presently
available as well as smaller units which have been
considered for utilization with the smaller decentralized
centers described in Configuration IV. Figure 6 shows
the storage capacity as the independent variable with
the cost/performance ratio shown in cents per byte
stored.
TABLE V (Contd)
[Table body not recoverable from the source scan. Column groups: magnetic tape (model, transfer rate); communications controller (model, maximum rate); software; cost. Data as of 12/67.]
Communications channels
Initially, the U. S. Postal System was considered
as a valid technique of transmitting the daily accumulated data (from each Civilian Personnel Office) to
the computer site. Further examination raised two
objections. These were:
1. Cost involved.
2. Transmission delays.
The use of the Postal Service entails several costs
which can be summarized, on a monthly basis, as:
Postage                              $37,884
Packaging                                233
Addressing and Handling                  375
Replacing Damaged and Lost Reels         375
Total                                $38,867 or $240/month/CPO
TABLE VI
Name                     Storage Capacity   Media    Purchase   Cost Per Byte
                         (Bytes)                     Price*     (Cents)
RCA 568-11 (RACE)             560M          Strip    $145K         .026
IBM 2321 (Data Cell)          400M          Strip     136.5K       .034
Data Products 5085            400M          Disk      143K         .036
Bryant 2AC 4000               419M          Disk      375K         .089
Burroughs 9375                500M          Disk      590K         .118
IBM 2314                      210M          Disk      250K         .118
Control Data 814              151M          Disk                   .152
Univac Fastrand II            100M          Drum      165K         .165
NCR 353-3 (CRAM)                            Strip      35.5K       .195
SDS 7202                                    Disk       18K         .245
IBM 2311                      7.3M          Disk       26.3K       .360
Arranged in order of cost per byte of storage.
* Price for 1 (one) storage module only.
[Columns for units per control, tracks per surface, number of heads per mechanism, access time (minimum, average and maximum, in milliseconds) and rental price, and the cells left blank above, are not recoverable from the source scan.]
Figure 6-Cost (cents per byte)/capacity (bytes)
[Plot on logarithmic axes of cost in cents per byte (0.001 to 1.0) against capacity in bytes. Storage unit legend: 1. RCA 568-11 (RACE); 2. IBM 2321 (Data Cell); 3. Data Products 5085; 4. Bryant 2AC 4000; 5. Burroughs 9375; 6. IBM 2314; 7. CDC 814; 8. Univac Fastrand II; 9. NCR 353-2 (CRAM); 10. SDS 7202; 11. IBM 2311 (Disc Pack).]
For electrical communications, two distinct and
different classification schemes or methods can be
employed to facilitate the analysis. These are:
1. Governmental/non-governmental facilities.
2. TWX voice grade/broad band facilities.
Comparisons among costs for each of the services
noted become very involved and complicated for a
single configuration, let alone for four. However, each
service is described below, following a brief discussion
of the data volumes expected. With approximately
380,000 United States citizens covered by the system,
and with a flow of forty to fifty characters per man per
day over the communications channels, it seems almost
mandatory that a "dedicated" communications system
be available.
In the event personnel records are processed by a
computer used for other applications as well, it is
assumed that the personnel system will be available
for major update processing and for query response
during the third shift (eight hours). With an average
of fifty characters/man/day traffic for 380,000 records,
there will be a traffic flow of nearly twenty million
bytes (eight-bit characters) a day. With a 70 percent
line utilization, this requires a 1,000 bytes/second
transmission capability.
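The line-rate requirement above can be reproduced directly from the stated assumptions (380,000 records, fifty characters per record per day, an eight-hour shift, 70 percent line utilization). The function name and parameter names below are illustrative assumptions.

```python
def required_line_rate(records, chars_per_record_per_day, shift_hours, utilization):
    """Return (daily traffic in bytes, required line rate in bytes/second).

    All of the day's traffic is assumed to move during one shift, and the
    line is assumed to carry useful data only for the given fraction of time.
    """
    daily_bytes = records * chars_per_record_per_day
    shift_seconds = shift_hours * 3600
    rate = daily_bytes / shift_seconds / utilization
    return daily_bytes, rate


daily_bytes, rate = required_line_rate(
    records=380_000,
    chars_per_record_per_day=50,
    shift_hours=8,
    utilization=0.70,
)
print(f"{daily_bytes:,} bytes/day, about {rate:,.0f} bytes/second")
```

The result is 19 million bytes a day and a rate a little under 950 bytes/second, which the text rounds up to 1,000 bytes/second.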
Autodin
Autodin can be used to provide the type of service
required. However, the following points serve to
eliminate it from consideration:
1. The service is not available at approximately
20 percent of the Civilian Personnel Office sites.
2. It is an extremely costly means of transmitting
civilian personnel data. The charge is a fixed
rate per site and is high, in part at least, because
the system must be able to pass classified information. Since civilian personnel data would
not be classified, except possibly for occasional
specific information that would be afforded
special handling, this costly aspect would not be
needed.
3. Civilian personnel information would be afforded a low priority as compared to other data
using Autodin. This would cause delays of
variable and indeterminate length.
The cost, per site or terminal, depends upon the
line bandwidth required, not the distance the message
is sent or the line usage (time). These costs are shown
in Table VII.
Autovon
Autovon is a military leased, voice grade, direct
dial telephone system. There is no apparent reason
for not employing this system to transmit digital
data during off-hours (6 p.m. to 6 a.m.).
Charges are variable, but an estimate of $315-$372/month/CPO seems reasonable.
Hard decisions concerning the use of Autovon for
digital data transmission were not obtained, though
statements were made that Autovon is used in some
cases for data transmission. Nor could any indication be obtained that lower or preferential
rates were available for night use.
Wide area telephone service (WATS)
The most attractive data transmission channel
studied during this effort, from the viewpoint of the
Army Civilian Personnel Program, is the Wide Area
Telephone Service (WATS). WATS offers two billing
plans; a 24-hour, unlimited service, and a measured
time service.
Under a measured time WATS contract, a basic
monthly charge for the first ten hours of usage is
made and an additional charge per hour of actual
usage is levied. The tariff which governs this service
is extremely detailed and a full discussion is beyond
the scope of this report. However, a single computation will indicate the method of selecting between the
unlimited and the measured WATS.
A single WATS line with a six band capability (full
48 state coverage), based in Washington, D.C., costs
$2,250 per month. The measured WATS, with the
same capability, costs:
C = 370 + 29H,
where
C = cost in dollars per month,
H = hours of usage beyond the first ten hours, per month.
The break-even point can be calculated by setting
C = 2,250 and solving for H. This yields a value of
H = 65 hours, that is, 75 hours/month of circuit time, or
approximately three hours/day.
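The break-even computation can be sketched as follows, using only the tariff figures quoted above ($2,250 per month unlimited; $370 for the first ten hours plus $29 per additional hour measured). The constant and function names are illustrative assumptions.

```python
UNLIMITED_MONTHLY = 2_250   # dollars/month, 24-hour unlimited WATS
MEASURED_BASE = 370         # dollars/month, covers the first ten hours
MEASURED_HOURLY = 29        # dollars per hour beyond the first ten
INCLUDED_HOURS = 10


def measured_cost(total_hours):
    """Monthly cost of measured WATS for a given total usage in hours."""
    extra_hours = max(0, total_hours - INCLUDED_HOURS)
    return MEASURED_BASE + MEASURED_HOURLY * extra_hours


def break_even_hours():
    """Total monthly hours at which measured WATS costs as much as unlimited."""
    extra = (UNLIMITED_MONTHLY - MEASURED_BASE) / MEASURED_HOURLY
    return INCLUDED_HOURS + extra


hours = break_even_hours()
print(f"break-even at about {hours:.1f} hours/month")
print(measured_cost(74), measured_cost(75))
```

Below roughly 75 hours of circuit time a month the measured plan is cheaper; above it, the unlimited plan is.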
Definitive evaluation for the Army Civilian Personnel System must await final implementation decisions. However, a computation concerning the use
of WATS for several possible configurations has been
carried out, and is detailed in Table VIII.
TABLE VII-Costs
Autodin Bandwidth (Baud)    Cost/Month/Terminal Site
75                               $ 1,188
150                                2,375
1,200                              9,504
2,400                             14,250
Leased broad band lines
Finally, use of leased broad band lines was examined.
Their use was considered only for Configuration I
implementation.
Many possible line linkages can be conceived. The
one demonstrated here is for illustrative purposes only,
but is typical.
Postulate that a concentration device, or subsystem
(such as a very small digital computer with magnetic
tape or disk), will be located at each Army Headquarters. A WATS system, similar to the system described
for Configuration II (see Table VIII), will be installed.
Leased broad band lines would also be installed from
the five Army Headquarters to the RAPID site.
A broad band channel capable of transmitting 5,100
characters/second costs $15/mile/month. In addition,
each terminal requires a termination which rents for
$250 each. As a result, monthly rental for the communication channels (and their terminations), but
not the concentration, would be $18,465.
TABLE VIII-Number of WATS lines required vs configurations
Configuration   Location                             Number     Bands        Cost/
                                                     of Bands                Month*
I               Washington, D. C.                       4       1, 3, 6, 6   $ 6,100
                Total                                                        $ 6,100
II              Ft. Meade, Maryland                     2       1, 3         $ 2,075
                Ft. McPherson, Georgia                  1       1              1,300
                Ft. Sam Houston, Texas                  1       1              2,015
                Ft. Sheridan, Illinois                  1       5              2,075
                Presidio, California                    1       2              2,000
                Subtotal                                                     $ 9,465
                RAPID System Input                      1       6            $ 2,250
                Total                                                        $11,715
III             Ft. Meade, Maryland                     1       3            $ 1,575
                Ft. McPherson, Georgia                  1       1              1,300
                Ft. Sam Houston, Texas                  1       1              2,015
                Ft. Sheridan, Illinois                  1       5              2,075
                Presidio, California                    1       2              2,000
                Corps of Engineers, D.C.
                AMC, D. C.
                MUCOM, Edgewood, Maryland
                T & E Com., Aberdeen, Maryland
                WeapCom., Rock Island, Illinois
                ECom., Ft. Monmouth, New Jersey
                Missile Com., Huntsville, Alabama
                [Entries for the seven command sites, as printed: number of bands 1, 1, 1, 1, 1; bands 6, 6, 6**, 6**, 5, 1, 6; costs 2,250, 2,250, 2,250, 1,500. Row alignment not recoverable from the source scan.]
                Subtotal                                                     $17,215
                RAPID System Input                                           $ 2,250
                Total                                                        $19,465
* Includes intrastate charges, as required.
** Shared.
TABLE VIII (Contd)-Number of WATS lines required vs configurations
Configuration IV
Location              Cost/Month*
Boston                $   330
New York City           1,150
Philadelphia              875
Baltimore                 375
D. C. Area                500
Atlanta                 1,300
Kentucky                1,200
Chicago                 1,475
St. Louis                 645
Kansas City               610
Colorado                1,825
New Orleans             1,400
Texas Gulf                815
NE Texas                1,815
West Texas              1,915
South California        1,850
San Francisco             650
Hawaii                  No information
Utah                      500
Seattle                 1,775
Alaska                  No information
Total                 $21,005
X = Intrastate charges.
[Number-of-bands and bands columns for Configuration IV are not recoverable from the source scan.]
Terminals
Terminals must have certain attributes. The requirements will vary, depending on where the
terminal is located (for example, at a major command
headquarters, or in an operating Civilian Personnel
Office), on the final network configuration selected,
and on the communication means employed. However, certain minimal capabilities can be specified:
1. The data terminal must possess an "extended"
(ASCII) keyboard, for entry of data.
2. The terminal must have an extended storage
ability to retain up to one day's input for
transmission to the data processing site, with
due allowances for peaks.
3. The terminal must be capable of receiving and
storing information transmitted by the servicing computer (in general, during the night)
in response to ad hoc requests or standing
requirements.
4. The terminal must be capable of producing in
hard-copy form all information transmitted
from the computer site to the Civilian Personnel
Office or other location either on-line, or on a delayed basis.
5. The unit must be able to communicate, via
low cost (voice grade) telephone lines with
the computer center.
After careful consideration of the characteristics
of the many available principal Input/Output equipments, we have narrowed the field for further consideration to three. These are:
1. Mohawk 1103.
2. Dartex 1022.
3. Communitype 100SB.
The selection
In performing the cost analysis for each of the four
configurations examined, four different manufacturers'
equipments were examined. These manufacturers'
equipments were examined in the context of the processing loads required for both the centralized and
distributed configurations. Table IX shows the specific
central processors considered.
Although there are many approaches that can be
followed to select a central processor for each of these
configurations, one dominant constraint controlled
the selections. This constraint was the requirement
that the computer be able to utilize the amount of
random access storage needed by the system at each
location. Thus, although there are smaller computers
available, not only from the four manufacturers
whose equipments were examined, but also from other
sources, the equipment selected represented, in general,
the smallest computers that could do the job.
TABLE IX-Central processors employed in the cost analysis for each configuration
Manufacturer    I         II        III       IV
CDC             3304      3304      3114      3114
BURROUGHS       3501      3501      2501      2501
IBM             360/40    360/40    360/30    360/20
RCA             70/45     70/45     70/35     70/35
Configuration I
Configuration I has a single centralized site into
which all Civilian Personnel Offices address their data.
Detailed examination showed that approximately 1.8
billion bytes of information would be the largest amount
of dedicated random access storage required at this
location. Supporting that subsystem would be a high-speed disk subsystem of approximately 10.0 million
bytes. This subsystem would act as a directory and
contain a high-use skeletal record for each employee
whose total record was contained in the random access storage system. The central processor at this
site is provided with approximately 65,000 bytes of
high-speed core storage and a normal complement of standard peripherals, such as a high-speed printer, a card
reader and punch, and a communications control unit, as
well as eight high-performance magnetic tape units.
The magnetic tape units are employed to maintain an
on-line journal of all system actions. They also act as
replicate security repositories of current information
in the event of equipment destruction, electrical information losses, over-writing, etc. In addition, the
magnetic tape drives can be employed for other purposes during non-civilian personnel operations at the
site. Finally, magnetic tape would be used to store
trailer information (overflow beyond single fixed-format
record storage capacity) and archival information.
Communications with the centrally located computer
site can be handled by four WATS lines. Two of these
would cover the 48 states, one for the Eastern one-fourth of the United States, and one for the Eastern
Seaboard. The extent of these lines, that is, the number
of WATS bands, has been selected to provide the
optimal coverage of CONUS based upon the geographic
distribution of Civilian Personnel Offices and the
populations they support.
Finally, each Civilian Personnel Office was examined
to determine the number of terminals required. A
keyboard, hardcopy printer, and an intermediate
storage capability are considered a mandatory requirement for this system application.
To better understand the operation of the Configuration I
system, consider the requirements for data
transmission from and into a Civilian Personnel Office on a daily basis; 26 characters per man per day
(on the average) are inputted to the computer from
a Civilian Personnel Office, while 24 characters per
man per day are outputted from the computer to a
Civilian Personnel Office. With these figures, an estimate of the communication requirements can be made.
Similarly, the estimate of the actual keyboard typing
or data outputting can be obtained.
Taking all these facts plus the data provided in an earlier section into account, it can be shown that the cost
for Configuration I will be in the order of $130,000 to
$150,000 rental per month.
Configurations II, III and IV
Configuration II is schematically represented in
Figure 3. Configuration II represents a total of five
computer sites and thus the amount of rental required
to support these sites does increase. Similarly, the communication cost rises from approximately $6.1K to $11.7K
per month.
Configurations III and IV have been treated in the
same fashion as Configuration II.
Detailed Configurational Comparison
The results developed thus far may now be applied
to the crucial problem: which of the four postulated
configurations is recommended and why. The central
processors selected for examination with respect to
the four configurations postulated have been specified
in Table IX.
In presenting a detailed description and price comparison, Table X summarizes the key requirements,
i.e., the number of personnel serviced, mass random
access storage capacities, and the data transmission
volumes for each of the four configurations. These data
were employed as guides in the hardware selections.
Detailed equipment specifications and pricing/rental
were also examined and are summarized in Table XI.
TABLE X-Summary of requirements for the four configurations examined
Configuration Number                           I       II            III                 IV
Number of Computer Sites                       1       5             12                  21
People Serviced
  Maxima (Excluded from Range Figures)         380K    163K          48K, 75K, 83K       47K, 58K
  Minimum                                              45K           11K                 2K
  Range (Not including Maxima)                         45K-63K (4)   11K-27K (9)         2K-31K (19)
  Average for Range                                    54K           19K                 14K
RAM Capacity Requirements (Bytes)
  Maxima (Excluded from Range Figures)         1.8G    655M          225M, 350M, 390M    223M, 180M
  Minimum                                              162M          52M                 15M
  Range (Not including Maxima)                         162M-251M     52M-127M            15M-118M
  Average for Range                                    201M          90M                 62M
RAPID Supplemental RAM Capacity Required       0       1.2G          1.2G                1.2G
Data Transmission Volumes (Characters per Second)
  Maxima (Excluded from Range Figures)         19M     8.2M          2.4M, 3.6M, 4.2M    2.4M, 2.9M
  Minimum                                              2.3M          0.6M                0.1M
  Range                                                2.3M-3.2M     0.6M-1.4M           0.1M-1.6M
  Average for Range                                    2.7M          1.0M                0.7M
Two key aspects of the information contained in
Table XI have been plotted to provide a clearer view.
These are:
1. Hardware comparisons (exclusive of communications and terminal costs) for the four manufacturers, for each of the four configurations.
2. Comparison of monthly rentals for all hardware aspects (using an average set of values
for the on-site computers and their conventional
peripherals).
Point 1 is amply described in Figure 7, while Point
2 is presented in Figure 8.
At this point, Configurations III and IV were dropped
from further consideration. The few advantages which
could be enumerated in their favor were not sufficient
to outweigh the added costs.
The selection between Configurations I and II
appears less clear cut. Although the monthly rental
for Configuration II is approximately 70 percent
greater than the monthly rental for Configuration I,
other factors must be examined. Only then can a
decision be made.
In favor of Configuration I implementation are:
1. Lower monthly rental.
2. File centralization in one physical location close
to Department of the Army and Department of
Defense headquarters activities.
3. No undesirable redundancy in either hardware,
software or machine processing. Also, if this
processor is identical with that of the present
RAPID system, then each can act as back-up
for the other.
4. Availability of a "dedicated" computer for
Army Civilian Personnel record-keeping. This
implies that a self-established priority system
can be employed.
5. A minimum of highly skilled ADP programmers,
operator personnel, etc., required.
6. Data base "timeliness" and uniformity.
7. No limitations on "cross servicing."
TABLE XI-Purchase and rental comparison-four computer manufacturers and four system configurations
                                   Configuration I    Configuration II   Configuration III  Configuration IV
Subsystem                          Purchase  Rental   Purchase  Rental   Purchase  Rental   Purchase  Rental
                                   M$        K$*      M$        K$*      M$        K$*      M$        K$*
Computer Site (less Mass RAM)
  CDC                              1.0       26       4.6       115      9.6       226      14.4      384
  Burroughs                        0.8       16       2.6       60       5.6       132      10.1      231
  IBM                              1.1       17       4.0       83       8.7       179      15.4      316
  RCA                              1.2       26       4.7       96       8.6       196      17.0      345
Mass RAM (Including RAPID Supplement)
  CDC                              0.9       19       1.7       41       2.4       43       4.1       90
  Burroughs                        2.4       40       2.1       67       3.4       55       7.5       107
  IBM                              0.8       17       1.9       40       3.5       65       4.4       95
  RCA                              0.6       12       2.0       40       3.6       65       4.4       95
Communications Channels-WATS                 6                  12                 20                 21
Terminals                          3.5       88       3.5       88       3.5       88       3.5       88
Totals
  CDC                              5.5       138      9.7       255      15.6      371      22.0      583
  Burroughs                        6.6       149      10.5      227      13.5      294      21.2      447
  IBM                              5.3       131      8.9       221      15.7      351      23.3      519
  RCA                              5.3       131      10.1      235      16.7      378      24.8      549
* Monthly
M - Millions
K - Thousands
On the other hand, Configuration II provides:
1. Local, autonomous control at the Army level
of each computer system.
2. Redundancy of equipment which offers an
alternative processing site in the event a system
is down.
3. With lower processing loads per machine, cost
sharing could be practiced.
No numerical weighting of these advantages seems
appropriate. However, after a thoughtful review of
each point, and a careful summation of all the points
concerning each alternative, one is left with but one
reasonable choice-Configuration I.
SUMMARY
The investigation demonstrated that a highly distributed, Automated Personnel and Manpower System
was feasible and would be cost-effective. It also showed,
rather forcefully, that although the terminals were
located throughout the country, a single concentrated
central processing site was by far the most economical
approach to the system implementation.
[Figure 7-Monthly rental of CONUS computer systems for four manufacturers for all four configurations (Computer hardware only). Axes: configuration number and number of computer sites; rental scale 80 K to 480 K.]
[Figure 8-A typical total monthly rental breakdown for all four configurations (CONUS only). Axes: configuration number and number of computer sites; rental scale 80 K to 480 K.]
An interesting fallout of the study was the fact
that the cost of the communications channels required
to support the system accounted for only one and
one-half to three percent of the cost of the fully-implemented system.
Finally, the broadest result of the study was the
conclusion that real time, on-line (or quasi on-line)
systems were practical, cost-effective and currently
attainable.
ACKNOWLEDGMENTS
The material presented in this paper has been almost
completely drawn from a report* submitted to the
Department of the Army in February, 1968. The
report was prepared by Mr. H. H. Lowell, Mr. J. V.
Heimark, Mr. G. A. Koehler, Mr. R. J. Gibbons and
the author.
Particular appreciation is due to Miss L. Richard
for all her help and assistance in preparing this paper.
Without her help it would not have been submitted.
* Final Report, "Civilian Personnel Management and Manpower Information System Concept for Department of the
Army," 28 February 1968, H. H. Lowell, J. V. Heimark, G. A.
Koehler, N. Nisenoff, R. J. Gibbons, Computer Command and
Control Company.
Analysis of the communications aspects
of an inquiry-response system
by J. S. SYKES
Bell Telephone Laboratories, Incorporated
Holmdel, New Jersey
INTRODUCTION
In order to meet the information retrieval needs of
various industries, inquiry-response systems are being
implemented by storing large data bases in centralized
computer files. In some systems, the files are accessed
by personnel primarily as the result of telephone calls
from customers. As an example, in the airlines industry,
computer files are accessed by reservation clerks to
determine the availability of reservations for a specific
flight. In this example, and in similar applications
involving queries or requests from customers, input
messages requesting certain information are generated
by a customer representative and then transmitted
to a computer from an input-output terminal such as
a visual display device. When the computer has obtained the requested information, a response message is
transmitted back to the requesting terminal, and the
representative continues her dialogue with the customer.
For an inquiry-response system to function properly,
the system must be designed to meet two grade-of-service, or performance, objectives. One objective is
concerned with the interval a customer must wait
before his call is answered by a representative. The
other objective is concerned with the interval a customer must wait during the conversation until the
customer representative can secure the necessary information from the computer; the naturalness of the
dialogue degenerates as the retrieval time* increases.
In order to meet the first objective, sufficient representatives must be available to handle the incoming
voice traffic. To meet the second objective, an adequate
data communications subsystem and sufficient computer processing capability must be provided.
In this paper an analytical model is presented that
approximates the interaction of the voice and data
communications subsystems in an inquiry-response
system. The model can be used during preliminary investigations to gain insight and to obtain conservative
estimates of communications capabilities required in
order for a system to meet specified grade-of-service
objectives. The model consists of relationships that
involve basic communications parameters such as the
following:
a. Rate at which calls are received from customers
b. Interval required for representatives to handle
incoming calls
c. Number of input and corresponding computer
response messages generated as the result of
a customer call
d. Data transmission rates to and from the computer
e. Lengths of input and response messages.
The model uses these relationships in order to estimate the following quantities:
1. The number of customer representatives required in order to handle a given volume of
offered calls at a specified grade of service
2. The volume of data traffic generated as a result
of the incoming voice traffic
3. The number of equivalent active terminals
that can be served by a data link while meeting
a specified retrieval time objective; an estimate
of the retrieval time as a function of carried data
traffic is used to obtain this quantity.
4. The number of data links required in order to
meet a specified retrieval time objective
5. The average occupancy of the one or more data
links serving the input-output terminals.
* In this analysis, the retrieval time is defined as the interval from
the time an input message is generated until the complete response
has been received.
For illustrative purposes, the analytical model is
applied to a hypothetical inquiry-response system.
Both half-duplex** and full-duplex** methods of operation are considered for the data communications subsystem. For this example, estimates of average retrieval time are obtained with mathematical queuing
models.
Assumed system characteristics
The analysis and its application presented in this
paper are based on assumptions concerning the incoming voice traffic, the characteristics of the data
communications subsystem connecting the input-output terminals to the computer, and the computer
processing capability. These assumptions are considered
in this section.
The basic assumptions that have been made concerning the origination and nature of the voice traffic are
the following:
1. The overall system is in a state of statistical
equilibrium.
2. Calls are generated by customers in accordance
with a Poisson distribution, which implies a
large group of potential customers.
3. Durations required for representatives to handle
incoming calls *** are distributed according to a
negative exponential probability law.
4. Calls are answered immediately when there is
a customer representative not currently engaged
in a conversation; all other calls experience
delay.
5. Delayed calls are answered in a first-come, first-served order as representatives become free.
** With half-duplex operation, message transmission is allowed in
either direction, but not both directions simultaneously; simultaneous transmission in both directions is called full-duplex
operation.
*** These durations would consist of
the talking time with the
customer plus subsequent time (if any) required to perform
call-related tasks.
The assumed overall configuration of the voice-access network as well as the data communications
subsystem is illustrated in Figure 1. The voice-access
network is assumed to consist of the established telephone network that provides line-switched connections
from the customer to the business location. Calls are
automatically routed to an idle representative unless
they must be delayed; if so, the call distributor maintains the calls in a queue until a representative becomes
free.
The data communications subsystem is assumed
to consist of a group of input-output terminals such as
visual display devices that are associated with a
common control unit, which is connected to a computer
by means of a data link. Various methods of operation
are possible for this data communications configuration.
These possibilities depend on whether or not message
transmission is allowed simultaneously in both directions, whether or not the computer requests traffic by
means of polling, whether the polling characters are
directed to individual terminals or to a control unit
that gathers input messages from all of the terminals,
etc. This analysis considers both full-duplex and half-duplex methods of operation. Polling of the control
unit by the computer and multimessage transmissions
in each direction are assumed.
Computer processing time, as used in this paper,
refers to the overall interval from the instant an input
message* enters the multiprocessing computer until
the corresponding response is placed in queue for
transmission back to the requesting customer representative. Thus, processing time includes input message
analysis, data retrieval from one or more memory files
(perhaps even from another computer), and response
preparation; in addition, the processing times may be
prolonged by queuing delays within the computer. It
has been assumed that an estimate of the average computer processing time for a system is available; as will
be explained, this estimate is used in determining the
average retrieval time.
System analysis
In this section the analytical model of the communications aspects of an inquiry-response system is developed.
* Examples of input messages are initial inquiries, requests for
page flips, and any subsequent inquiries generated during a
customer's call. In addition, in some systems, update messages
may be sent to the computer, perhaps after a call has been terminated. If so, it is assumed that for each updating message the
computer returns some type of acknowledgment.
[Figure 1-Inquiry-response system. Block diagram showing voice traffic, input-output terminals, common control unit, data link, and computer.]
Personnel required to handle offered voice load
Assume that during the period of maximum incoming
customer calls, i.e., the system busy hour, the calls are
received at a rate λv. Assume further that the average
duration required for representatives to handle incoming calls is V. The voice load Ev handled by the
representatives is therefore given by
$$E_v = \lambda_v V, \tag{1}$$
where Ev is commonly expressed in erlangs, a dimensionless unit. The number S of personnel required to
handle Ev erlangs during the busy hour is dependent
on the grade of service G(x) to be offered customers,
i.e., the promptness with which customers' calls would
be answered. An example of G(x) is the following: at
least 0.95 of the customers' calls should be answered
within x = 20 seconds from the time ringing begins.
If the assumptions previously stated concerning the
voice traffic are met, values of S can be obtained for
a specified G(x) by using the following formulas:1
$$G(x) = 1 - \mathrm{Prob}[\text{Answering Delay} > x \text{ secs}] = 1 - P[D > x] = 1 - \left\{ P[D > 0] \exp\left[ -\frac{x (S - E_v)}{V} \right] \right\}; \quad 0 \le E_v < S \tag{2}$$
where P[D > 0], commonly called the Erlang C function, is given by
$$P[D > 0] = \frac{\dfrac{E_v^{S}}{(S-1)!\,(S - E_v)}}{\displaystyle\sum_{n=0}^{S-1} \frac{E_v^{n}}{n!} + \frac{E_v^{S}}{(S-1)!\,(S - E_v)}}; \quad 0 \le E_v < S. \tag{3}$$
Equation (2) is a result of A. K. Erlang's exponential
holding time analysis. A summary of his analysis along
with various delay curves was published by E. C.
Molina.2 For specified values of S, values of P[D > 0]
are tabulated in Reference 1 as a function of the ratio
Ev/S.
Conversion of offered voice load to data traffic
The amount of data traffic generated as the result
of a customer call is a random variable. Some calls
may involve only one or possibly two input messages
and the associated responses. Other calls, which may be
multipurpose, may require six or eight such interactions; in addition, some updating of the computer files
may be involved. In this paper, Ī will be used to represent the average number of interactions generated as
the result of a call.
Let λi represent the average rate during the busy
hour at which input messages are generated by the
group of customer representatives served by one data
link. By using Ī, λi can be related to λv as follows:
$$\lambda_i = \bar{I} \lambda_v. \tag{4}$$
Let λr represent the average rate during the busy
hour at which corresponding response messages are
prepared by the computer and placed in the output
queue for the data link. Since it is being assumed that
each input message to the computer results in a response, the average rates λi and λr are equal.
The second factor influencing the volume of generated
data traffic is the average time ti required to transmit
a message to the computer. This quantity is the quotient of
li = the average number of characters that comprise messages transmitted to the computer,
and
τi = the rate of transmission from the control
unit to the computer, i.e.,
$$t_i = \frac{l_i}{\tau_i}. \tag{5}$$
Correspondingly, tr, the average time required to
transmit response messages from the computer to the
control unit is given by
$$t_r = \frac{l_r}{\tau_r}. \tag{6}$$
The product of λi and ti, which will be denoted by
ρi(τi), represents the erlangs of data traffic generated
during the busy hour for transmission at a rate τi from
the control unit to the computer. Likewise, the product
of λi and tr, which will be denoted by ρr(τr), represents
the erlangs of response data traffic transmitted at a
rate τr from the computer to the control unit during
the busy hour.
With full-duplex message transmission, separate
one-way transmission facilities carry ρi(τi) and ρr(τr).
Therefore, expressions for the magnitudes of ρi(τi) and
ρr(τr) (in erlangs) can be independently determined
by using Equations (1), (4), (5), and (6), i.e.,
$$\rho_i(\tau_i) = \lambda_i t_i = \bar{I} \lambda_v \frac{l_i}{\tau_i},$$
which leads to
$$\rho_i(\tau_i) = \frac{E_v \bar{I} l_i}{V \tau_i}. \tag{7a}$$
Similarly, since it is being assumed that λi = λr,
$$\rho_r(\tau_r) = \frac{E_v \bar{I} l_r}{V \tau_r}. \tag{7b}$$
With half-duplex message transmission, the same
facility is alternately used for input and output traffic. Therefore, ρi(τi) and ρr(τr) can be combined to give
ρtot(τi, τr), i.e.,
$$\rho_{tot}(\tau_i, \tau_r) = \frac{E_v \bar{I}}{V} \left[ \frac{l_i}{\tau_i} + \frac{l_r}{\tau_r} \right]. \tag{8}$$
Let a represent the ratio of the average length of
response messages to the average length of input messages; if τi = τr, Equation (8) then reduces to
$$\rho_{tot}(\tau_i, \tau_r) = \frac{E_v \bar{I} l_i}{V \tau_i} [1 + a]. \tag{9}$$
In summary, Equations (7), (8) and (9) reveal the
manner in which the various communications parameters affect the amount of generated data traffic.
Volume of data traffic allowed per data link
As indicated by the notation, calculated erlang
values obtained for ρi(τi), ρr(τr), and ρtot(τi, τr) are
based on specified transmission rates. Erlang values
are not sufficient by themselves, however, to determine
the number of data links operating at the assumed rates
that would be required to implement the data communications subsystem. For example, if ρtot(τi, τr)
were less than one erlang, it could be inferred that one
data link would suffice for that traffic. However, in
order to avoid excessive storage usage and extended
retrieval times, data links cannot be used to their
full capacity. In fact, as will be shown, the average
retrieval time increases without bound as the average
occupancy of a data link approaches unity; average
data link occupancy refers to the average portion of
the busy hour that the data link is being used for message transmission.
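The traffic relations of Equations (7a), (7b), (8) and (9) can be illustrated numerically. The following is a minimal Python sketch, assuming that the traffic in each direction is the message rate multiplied by the mean transmission time, and that the message rate is the call rate Ev/V multiplied by the average number of interactions per call. The function name and the values chosen for input message length and line rate (50 characters, 150 characters per second) are hypothetical and are not taken from the paper.

```python
def data_traffic_erlangs(E_v, V, I_bar, l_i, l_r, tau_i, tau_r):
    """Return (rho_i, rho_r, rho_tot) in erlangs.

    E_v    voice load in erlangs
    V      average call handling time in seconds
    I_bar  average number of interactions per call
    l_i    average input message length in characters
    l_r    average response length in characters
    tau_i  transmission rate to the computer, characters per second
    tau_r  transmission rate from the computer, characters per second
    """
    calls_per_second = E_v / V                      # call rate, from Equation (1)
    messages_per_second = I_bar * calls_per_second  # input rate equals response rate
    rho_i = messages_per_second * l_i / tau_i       # input traffic, Equation (7a)
    rho_r = messages_per_second * l_r / tau_r       # response traffic, Equation (7b)
    return rho_i, rho_r, rho_i + rho_r              # total traffic, Equation (8)


# Hypothetical cluster: 30 erlangs of voice load, 180-second calls,
# 3 interactions per call, 50-character inputs, 300-character responses,
# and 150 characters per second in each direction.
rho_i, rho_r, rho_tot = data_traffic_erlangs(30, 180, 3, 50, 300, 150, 150)
print(rho_i, rho_r, rho_tot)
```

With these assumed values the total exceeds one erlang, so a single half-duplex link could not carry the traffic even at full occupancy.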
Although data link occupancy must be limited, it
is desirable to use data links as efficiently as possible.
Let ρ̂tot(τi, τr) denote the maximum volume (in
erlangs) of data traffic that can be carried by the data
link operating at transmission rates τi and τr. For half-duplex operation, ρ̂tot(τi, τr) is numerically equal to
ρmax(τi, τr), where ρmax(τi, τr) represents the maximum
allowable occupancy for a data link operating at rates
τi and τr.
For full-duplex operation, ρ̂tot(τi, τr) is the sum of
ρ̂i(τi) and ρ̂r(τr), which represent the maximum data
volume (in erlangs) that can be carried on the input
and output links operating at rates τi and τr, respectively. For systems in which the input and output
traffic volumes are unequal, the average occupancy
of the input and output links may be considerably
different. For this case, [equation illegible in source]
where ρmax(τi) and ρmax(τr) correspond to ρ̂i(τi) and
ρ̂r(τr), respectively. For a full-duplex system, ρ̂tot(τi, τr)
is not a constant but is dependent on the ratio tr/ti.
The value of ρ̂tot(τi, τr) for a particular subsystem
is governed by the specified retrieval time objective
for the system. A commonly used objective is as follows: The average retrieval time should be Tmax
seconds or less. Another type of objective* can be
expressed similarly to the voice traffic grade of service
G(x), e.g., 0.95 of the retrievals should be received
within T' seconds.
Either a computer simulation or analytical means
can be used to determine values of ρ̂tot(τi, τr) that
correspond to a specified retrieval time objective.
With a properly written simulation, one can obtain
probability distributions as well as all moments of
interest. However, using a simulation can be costly
during preliminary investigations in which one is
studying the effects of various communications parameters on the retrieval time. For this reason, a well-formulated mathematical queuing model can be useful
and rewarding for these investigations, even though
results from queuing models that represent complex
systems are often limited to average values.
Number of input-output terminals allowed
per data link
An important consideration in the communications
design of an inquiry-response system is the maximum
number of active input-output terminals, or equivalently the maximum number of active personnel, that
can be served by a particular data link without exceeding a specified retrieval time objective. This maximum,
which will be denoted by Smax, is obviously related to
ρ̂tot(τi, τr). In this section, a method is outlined for
approximating values of Smax for specified values of
the grade-of-service objectives and the other communications parameters. As will become apparent, the
method may be used iteratively to determine which
combinations of parameter values permit specified
voice traffic and retrieval time objectives to be met.
If costs are associated with these combinations, insight
can be gained concerning which means of implementation is most economical.
* System studies are often desirable in the final design stages of
a system to determine whether a given design will allow a specified
percentile-type objective to be met. Because of mathematical
complexity, however, analytical methods can seldom if ever be
used for such studies; a simulation is normally required. For
preliminary investigations, analyses based on average values can
be used to obtain valuable insight concerning the sensitivity of
the retrieval time to various system parameters. This insight can
be very helpful in designing and running a subsequent simulation.
Lack of such insight often results in very costly system simulations.
The first step towards getting values of Smax is to
use the specified values to construct graphs (using
Equations (3) and (8), respectively) that show S
versus Ev and ρtot(τi, τr) versus Ev, where ρtot(τi, τr)
represents the sum of ρi(τi) and ρr(τr) for both full-duplex and half-duplex cases. Corresponding points
from these two graphs are then plotted to give a third
graph showing S versus ρtot(τi, τr); the value of S
corresponding to the point ρtot(τi, τr) = ρ̂tot(τi, τr)
is Smax.
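Because the generated data traffic is proportional to the voice load, the graphical construction just described amounts to finding the voice load at which the traffic reaches its allowed maximum and then sizing the staff for that load. The sketch below assumes the Erlang C expressions of Equations (2) and (3). The function names and numerical inputs are illustrative assumptions, and the results need not coincide with values read from the paper's figures.

```python
import math


def erlang_c(S, E_v):
    """Probability of delay P[D > 0], Equation (3); requires 0 <= E_v < S."""
    if not 0 <= E_v < S:
        raise ValueError("need 0 <= E_v < S")
    tail = E_v ** S / (math.factorial(S - 1) * (S - E_v))
    head = sum(E_v ** n / math.factorial(n) for n in range(S))
    return tail / (head + tail)


def grade_of_service(S, E_v, V, x):
    """G(x), Equation (2): fraction of calls answered within x seconds."""
    return 1.0 - erlang_c(S, E_v) * math.exp(-x * (S - E_v) / V)


def staff_required(E_v, V, x, target):
    """Smallest number of representatives S for which G(x) >= target."""
    S = math.floor(E_v) + 1  # Erlang C needs strictly more staff than erlangs
    while grade_of_service(S, E_v, V, x) < target:
        S += 1
    return S


def s_max(rho_hat_tot, rho_tot_per_erlang, V, x, target):
    """Staff count at the voice load where data traffic equals its allowed maximum.

    rho_tot_per_erlang is the data traffic generated per erlang of voice
    load, i.e. Equation (8) divided by E_v.
    """
    E_v_limit = rho_hat_tot / rho_tot_per_erlang
    return staff_required(E_v_limit, V, x, target)


# Hypothetical inputs: 180-second calls, 0.95 of calls answered within 20 seconds,
# a link allowed to carry 0.7 erlang, and 7/6 erlangs of data per 30 erlangs of voice.
print(staff_required(30, 180, 20, 0.95))
print(s_max(0.7, 7 / 6 / 30, V=180, x=20, target=0.95))
```

The direct search in `staff_required` is adequate for the cluster sizes considered here; for very large S the factorial terms would be better replaced by a recursive evaluation.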
The second step is to relate the values of Smax and
the values of the retrieval time objective, e.g., Tmax,
that correspond to equal values of ρ̂tot(τi, τr). Thus,
a graph such as Smax versus Tmax can be constructed
for given transmission rates τi and τr. The benefit of
such graphs can be increased considerably if the ordinate also shows the values of Ev that correspond to
the values of Smax. By using estimates of the expected
voice load incoming to a cluster of customer representatives, the number of data links required to accommodate the cluster can be readily deduced from
the graph for each specified retrieval time objective.
This technique will be discussed further in the model
application section.
Plots of Smax and Ev versus the retrieval time objective can aid investigations of the cost of implementing a system to meet a specified average retrieval
time objective. For example, a designer may discover
that for a relatively small increase in the allowable
Tmax, considerable savings in transmission and computer port costs could be achieved by serving more
representatives with a single data link.
Number of data links required
As was mentioned above, graphs showing Smax and
Ev versus the retrieval time objective can be used to
estimate L(τi, τr), the number of data links required
to interconnect the computer and the input-output
terminals serving the customer representatives. There
is also a more analytical method for estimating L(τi, τr)
in which values of ρ̂tot(τi, τr) are used. The same general
method can be applied to full-duplex and half-duplex
message transmission subsystems; however, it should
be remembered that for the full-duplex case, the value
of ρ̂tot(τi, τr) may change if the value of the ratio
tr/ti is changed.
Let k represent the ratio of the total volume of
generated input and output data traffic (in erlangs)
to ρ̂tot(τi, τr), i.e., let
$$k = \frac{\rho_{tot}(\tau_i, \tau_r)}{\hat{\rho}_{tot}(\tau_i, \tau_r)}.$$
Let K represent the integer part of the ratio k. The
number of data links required to serve the cluster is
given by
$$L(\tau_i, \tau_r) = K + 1. \tag{10a}$$
If K = k, i.e., k is an integer, then
$$L(\tau_i, \tau_r) = K.$$
For half-duplex links, if it can be assumed that the
total volume ρtot(τi, τr) is divided evenly among them,
the average occupancy of each link is given by
$$\rho = \frac{\rho_{tot}(\tau_i, \tau_r)}{L(\tau_i, \tau_r)}. \tag{10b}$$
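The link count can be sketched in a few lines of Python. This is an illustration only, assuming the rule is the integer part of k plus one unless k is itself an integer, and assuming the traffic is split evenly among half-duplex links. The function names and sample values are hypothetical.

```python
import math


def data_links_required(rho_tot, rho_hat_tot):
    """Number of data links for generated traffic rho_tot (erlangs),
    given the maximum traffic rho_hat_tot (erlangs) allowed per link."""
    k = rho_tot / rho_hat_tot
    K = math.floor(k)                # integer part of the ratio k
    return K if K == k else K + 1    # K links when k is an integer, else K + 1


def average_occupancy(rho_tot, links):
    """Average occupancy per link when traffic is divided evenly."""
    return rho_tot / links


# Hypothetical case: 1.2 erlangs generated, 0.5 erlang allowed per link.
links = data_links_required(1.2, 0.5)
print(links, average_occupancy(1.2, links))
```

Since k is computed in floating point, a ratio that is an integer on paper may not compare as exactly equal; a tolerance could be added if that matters for a given design.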
Model application
In this section, the analytical model is applied
to a hypothetical information retrieval design problem.
For this example, it is assumed that information required for the operation of a business, such as customer
service and billing records, is to be stored in a computer. Input-output terminals will permit access to the
computer files; it is assumed that retrievals are primarily required in order to intelligently handle telephone calls from customers. Several clusters of input-output terminals are to be served by the same computer
complex. The cluster to be considered in this example
is concerned only with information retrieval; it is
assumed that file modifications are done by other personnel.
The basic configuration proposed for this cluster is
illustrated in Figure 1. Telephone calls from customers
are routed by the automatic call distributor to idle
customer representatives. Each customer representative is equipped with an input-output terminal. These
terminals are associated with a common control unit,
which is connected to the computer by means of a data
link.
One objective of this analysis is to determine the
basic requirements of the data communications subsystem, i.e., how many common control units in conjunction with their data links are required to accommodate the number of customer representatives that
will be needed to handle the incoming telephone calls?
To help answer this question, both full-duplex and
half-duplex methods of operation are considered. Following the description of these assumed methods of
operation, representative parameter values are used
to indicate how these two proposals can be quantitatively compared.
Description of assumed methods of operation
The first assumed method of operation to be described involves half-duplex message transmission,
which may have some economic advantages over full-duplex operation for some geographical configurations.
Half-duplex operation is more suited for clusters
generating and receiving relatively low data traffic
volumes and for which retrieval time objectives are
not critical. One disadvantage of half-duplex operation
is the line time required to reverse the direction of
transmission; this interval will be referred to as the
reversal time.
The disadvantage of reversal times can be partially
overcome if the computer polls and delivers groups
of messages to the common control unit instead of
single messages to the individual input-output terminals. This method of operation will be referred to as
group poll and delivery as opposed to single poll and
delivery operation. When a large number of terminals
are served by a control unit, group polling significantly
reduces the line time required for reversing the direction of transmission and transmitting polling characters.
In addition, as the volume of data traffic increases,
group polling lessens the variance of the interval from
the time an input message is ready for transmission
until it has actually been transmitted to the computer.
When the reversal time durations are comparable
to message transmission times, data link efficiency is
increased considerably by allowing multimessage transmissions for both input and response messages, i.e.,
priority is not assigned to either type of message.
Line efficiency increases because a reversal is not required following the transmission of each lower priority
message in order to check the status of the higher
priority message queue.
With group polling and multimessage transmissions,
all input messages generated by the terminals since
the last poll are sequentially transmitted to the computer. Only when the input message queue becomes
empty is the direction of transmission reversed. After
the reversal, the computer begins delivering the queue
Analysis of Communications Aspects
h
Dp
h
P:
R:
Tj:
T,:
TRANSMISSION OF POLLING CODE
TRANSMISSION FACILITY REVERSAL
TRANSMISSION OF INPUT MESSAGES
TRANSMISSION OF RESPONSES
Dp: COMPUTER DELAY PRECEDING NEXT POLL
(ASSUMED ZERO IN THIS PAPER)
Figure 2A-Typical cycle of operation (Half-duplex
message tran~mi.:;sion)
661
_ ~s soon as .a response is prepared by the computer,
It IS entered Into an output queue for delivery. It is
transmitted immediately unless -another transmission
is already in progress; if so, the response is delayed until
all responses ahead of it in the queue have been sent.
Thus, while input messages are being transmitted by
the ?ontr~l unit, response messages corresponding to
preVIOUS mput messages are being received by the
control unit. Full-duplex message transmission is
illustrated in Figure 2B.
Personnel to handle incoming voice traffic
of responses to the control unit, which distributes each
It will be assumed that the assumptions stated
response to the appropriate terminal. After all responses
previously
concerning voice-access sUbsystems apply to
have been delivered, the computer polls the control
this example. It will further be assumed for this example
unit either immediately, or optionally after some
that during the busy hour of the busy day the average
specified deJay* D p , and the cycle repeats. A fixed
number of characters that identify the control unit is . number of calls per hour are not expected to exceed
600; the average duration of each call is expected to
sent preceding message transmissions from the control
be approximately three minutes. By using Equation
unit. This cycle of operation is illustrated in Figure 2A.
(1), it is found that Ev, the expected voice traffic load,
During a given cycle, either queue or even both
should not exceed 30 erlangs. In order to determine S,
queues can be found empty. If, for example, both are
the number of customer representatives required to
found empty, a group poll and delivery cycle dehandle this traffic volume, a grade-of-service objective
generates to a polling sequence followed by a succession
must be specified. In Figures 3A and 3B, S has been
of reversal times, which are separated by a "No Traf~
plotted as a function of Ev. Figure 3A shows 0(10),
fie" character sequence.that identifies the control unit.
0(20), and 0(30), where each is assumed equal to
Such degenerate cycles are assumed to reoccur until
0.95. Figure 3B shows the effect of varying the value
at least one message accumulates in either the input
of 0(20) from 0.90 to 0.9,75.
or the response message queue.
Figures 3A and 3B reveal that the grade-of-service
With the full-duplex case, reversal times are unnecesstandard ,for answering voice calls can be improved
sary, since the control unit can be transmitting and
considerably with the addition of a relatively few
receiving simultaneously. However, group polling of
representatives. For example, assuming that the averthe control unit is still beneficial, since polling interference on the delivery line occurs less frequently. With
full-duplex operation, input messages generated by
TRAFFIC FROM COMPUTER
the customer representatives are ordered in a first-come,
I
first-served manner for transmission from the control
unit. Transmissions to the computer begin immediately
after a polling code is received from the computer;
P: TRANSMISSION OF POLLING CODE
Tr : TRANSMISSION OF ONE OR MORE RESPONSES
the polling codes are interspersed among messages'reI: IDLE LINE
ceived from the computer. All messages that have
accumulated awaiting the polling code, as well as those
that are generated during the transmission, are transTRAFFIC TO COMPUTER
mitted to the computer. An interval of duration** Dp
P
Tj
Dp
P
1i
starts at the end of a transmission from the control
. IliIii!AOO!!0
~> 35
0::0::
IA.I
d'?
z"zO
30
~~ 25
0::0
IA.I«
~ffi 20
ASSUMPTIONS
0 .....
ERLANG C APPLIE S
~~ 15
m:::E
:::E
::>
z
PR[O~ISECS] -G(l)
10
Ij- 3 MINUTES
5
IL-_._J~_---.l
L - - _ . L -_ _- L - _ - 1 I_ _
o
5
10
EV.
15
20
25
30
AVERAGE VOICE LOAD
(ERLANGS)
35
40
Figure 3A-Effect of grade of service on number (If
personnel required I~(x) = 0.95)
EFFECT OF GRADE - OF- SERVICE
ON NUMBER OF PERSONNEL REQUIRED
(G (20) • Y)
5 0 r - - - - - - - · - - - - - - - - - - - - - - - - - - - - - - - - - -__~
o
....
45
o
~
of representatives from 38 to 42. These a,dditional
representatives could be individuals that are assigned
as representatives only during busy hour conditions.
40
5~
~> 35
0::0::
IA.I
'?
jj 30
z"zO
~~ 25
The amount of data traffic generated is dependent on
the degree of interaction between customer representatives and the computer. As an example, procedures could be outlined that would minimize the number of computer interactions per call by simply transmitting in a single response as much as possible of the
information in a computer file. On the other hand, if the
intent were to minimize the information that must be
read by representatives, several interactions could be
used during which the computer eliminated most of
the undesired information. Computer processing limitations would favor the former method of operation;
human factors considerations may favor the latter.3
An illustration of the effect of interactions on the
amount of data traffic generated for a half-duplex
method of operation is presented in Figure 4, which
shows Ptot(Ti, Tr) as a function of Ev. Let Type I interactions be those in which whole pages of information
are transmitted to the representative; parameter
values assumed are tr = 300 characters and j = 3
interactions. Let Type II interactions be those in
which more specific items of information can be requested; values assumed are tr = 75 characters and
j = 6 interactions. Figure 4 indicates that in order to
accommodate 30 erlangs, the less interactive method
would require at least two half-duplex data links, whereas one link may suffice for the Type II method, depending on the specified retrieval time objective.
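The two-links-versus-one conclusion can be checked roughly with the short sketch below. It is a back-of-envelope illustration, not the paper's conversion formula. It assumes that the 3-minute value listed in the plot assumptions is the average call duration, that each interaction carries one 15-character input message and one response, and that polling and line-reversal overhead can be ignored. The function name is hypothetical.

```python
def link_occupancy(voice_erlangs, call_seconds, interactions,
                   input_chars, response_chars, rate_cps):
    """Approximate data-link occupancy (erlangs) generated by a voice load.

    Assumption: every call produces `interactions` input/response pairs and
    nothing else is sent on the link (no polling or reversal overhead).
    """
    calls_per_second = voice_erlangs / call_seconds
    chars_per_call = interactions * (input_chars + response_chars)
    return calls_per_second * chars_per_call / rate_cps


# Type I: 3 interactions with 300-character responses.
type_1 = link_occupancy(30, 180, 3, 15, 300, 120)
# Type II: 6 interactions with 75-character responses.
type_2 = link_occupancy(30, 180, 6, 15, 75, 120)
print(f"Type I occupancy:  {type_1:.4f} erlangs")
print(f"Type II occupancy: {type_2:.4f} erlangs")
```

Under these assumptions Type I offers about 1.31 erlangs to the link, more than one half-duplex link can carry, while Type II offers about 0.75 erlangs, which one link can carry at the cost of longer queuing delays.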
Volume of data traffic allowed per data link
Figure 3B-Effect of grade of service on number of
personnel required [G(20) = y]
(Plot: number of personnel required versus Ev, average voice load in erlangs. Assumption noted on the plot: Erlang C applies.)
age voice load during the busy hour is 30 erlangs and
the grade-of-service objective is such that calls should
be answered within 20 seconds, Figure 3B indicates
that the fraction of calls that meet the objective can
be increased from 0.9 to 0.975 by increasing the number
For this example it has been assumed that the retrieval time objective would be stated as an average
value, i.e., as Tmax. Mathematical queuing models
have therefore been used for this example to aid in
determining values of Ptot(Ti, Tr). Separate models were
used to represent the half-duplex and full-duplex
methods of operation; descriptions of these models
and associated formulas are presented in the Appendix*;
a derivation of the formulas for the half-duplex model
appears in Reference 4.
* A computer simulation was used to verify the queuing model
of the half-duplex method of operation. Values of T obtained with
the queuing model were found to be conservative estimates.
Additional discussion concerning the results of the queuing model
and the simulation appears in the Appendix.
Figure 4-Conversion of voice load to data traffic
(Plot: Ptot(Ti, Tr) versus Ev, average voice load in erlangs, for Type I and Type II interactions. Assumptions noted on the plot: ti = 15 characters; Ti = Tr = 120 characters per second.)
The correspondence between Ptot(Ti, Tr) and Tmax was
actually established in reverse, i.e., values of the average
retrieval time T were calculated as a function of the
total volume (in erlangs) of input and output data traffic
carried by the data link. The five durations included
in this retrieval time calculation are the following:
Di = Delay of an input message awaiting transmission
ti = Transmission time of the input message
Cp = Computer processing time, i.e., interval from
the arrival of the input message until the
appropriate response is entered in an output
queue
Dr = Delay of the response in a computer output
queue
tr = Transmission time of the complete response.
T was obtained by summing the mean values of these
intervals, i.e.,
$$\bar{T} = \bar{D}_i + \bar{t}_i + \bar{C}_p + \bar{D}_r + \bar{t}_r$$
The queuing models were used to determine values
of Di and Dr. Since the server in these models represents the data link, these delay values depend on
the average occupancy of the data link. For the half-duplex case, the average occupancy is numerically equal
to Ptot(Ti, Tr), providing Ptot(Ti, Tr) < 1. For the
full-duplex case, the average occupancies of the input
and output links are numerically equal to Pi(Ti) and
Pr(Tr), respectively, providing Pi(Ti) and Pr(Tr) are
both less than 1.
Values for ti and tr were obtained from Equations
(5) and (6). The value of Cp was chosen to be two
seconds; for other analyses, the value should be chosen
to fit the characteristics and expected load of the
system computer. With Cp = 0, it should be noted
that T represents the average retrieval time due solely
to data communications, i.e., message queuing and
message transmission.
In Figures 5A and 5B, T is plotted as a function of
the erlangs of data carried per link for the half-duplex
and the full-duplex cases, respectively; in each plot,
T is shown for different average response lengths. For
the half-duplex case, the erlangs of data carried per
link is equivalent to the average data link occupancy.
For each of the plots, as the erlangs of carried traffic
approaches zero, T approaches the sum of ti, tr, Cp, and
R, where R = 0.2 seconds for the half-duplex case
and zero for the full-duplex case. Figure 5A can be
converted into plots of T versus Ev by reference to
Figure 4.
Other communications parameter values assumed
for the plots in Figures 5A and 5B are as follows:
ti = 15 characters
c2(ti) = 0.1*
c2(tr) = 0.5
Cp = 2 seconds
Ti = Tr = 120 characters per second.
The queuing models permit values of each of these
parameters to be varied individually or in various
combinations; by observing the results of such variations, insight is gained concerning which parameters
most significantly affect T. As was mentioned previously, the graphs can also be used in reverse to determine the effect of parameter variation on values of
Ptot(Ti, Tr) for specified values of Tmax.
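The zero-load limit of T noted for Figures 5A and 5B can be illustrated with a few lines of arithmetic. The sketch assumes that a transmission time is simply the message length divided by the line rate (Equations (5) and (6) are not reproduced in this section, so this is an assumption), and it uses the parameter values listed above. The function name is hypothetical.

```python
def zero_load_retrieval_time(input_chars, response_chars, rate_cps,
                             processing_s, reversal_s):
    """Retrieval time as carried traffic approaches zero: ti + tr + Cp + R.

    Assumption: transmission time = characters / (characters per second).
    """
    t_i = input_chars / rate_cps
    t_r = response_chars / rate_cps
    return t_i + t_r + processing_s + reversal_s


for response_chars in (75, 150, 300):
    half = zero_load_retrieval_time(15, response_chars, 120, 2.0, 0.2)
    full = zero_load_retrieval_time(15, response_chars, 120, 2.0, 0.0)
    print(f"{response_chars:3d}-character response: "
          f"half-duplex {half:.3f} s, full-duplex {full:.3f} s")
```

With these values the half-duplex floor is about 2.95 seconds for 75-character responses and about 4.83 seconds for 300-character responses, so queuing delay has little room before a five-second objective is exceeded in the latter case.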
Number of customer representatives allowed
per data link
By relating values of S and Ptot(Ti, Tr) appearing in
Figures 3 and 4, respectively, that correspond to equal
values of Ev, a graph of S versus Ptot(Ti, Tr) was obtained. This graph was then used in conjunction with
Figure 5A, which shows T versus Ptot(Ti, Tr), in order
to obtain Figure 6A, which shows Smax versus Tmax
for the half-duplex method of operation. Values of Ev
that correspond to values of Smax for G(20) = 0.95
are indicated on the right-hand vertical boundary of
the graph. Curves are plotted to depict the effect of
Type I and Type II interactions. Values of Tmax that
fall to the right of these curves can be achieved.
* The coefficient of variation of a random variable y, which is
denoted by c2(y), is defined as follows: $c^2(y) = \mathrm{Var}(y)/\bar{y}^2$.
Figure 5A-Effect of response length on average
retrieval time (Half-duplex message transmission)
(Plot: average retrieval time in seconds versus erlangs of data carried per link, i.e., average data link occupancy.)
Figure 5B-Effect of response length on average retrieval time
(Full-duplex message transmission)
(Plot: average retrieval time in seconds versus erlangs of data carried per link pair, input plus output traffic.)
Figure 6A-Effect of response length on personnel
allowed per data link (Half-duplex message
transmission)
(Plot: personnel allowed per data link versus Tmax, average retrieval time objective in seconds. Tmax can be achieved if it falls to the right of the curve.)
The graph indicates that for the indicated parameter values, the more interactive procedure allows
considerably more personnel to be served by a single
data link. With Type I interaction, i.e., the less interactive procedures, a Tmax of five seconds cannot be
met. However, with Type II interactions, this objective can be met for values of S less than approximately
35 representatives, which would be required to handle
an incoming voice load of approximately 2(], erlangs.
If a Tmax of three seconds is desired, it is obvious that
some of the parameter values must be changed. Perhaps
the transmission rates Ti = Tr could be increased, or
if possible, Cp could be reduced. Trade-offs can thus
be studied between data communications and computer
processing capabilities.
Figure 6B shows Smax versus Tmax for full-duplex
message transmission. As expected, the graph indicates
that full-duplex operation allows more representatives
to be served on one data link than does half-duplex
operation. Figure 6B also reveals that with full-duplex
message transmission, a Tmax of five or possibly four
seconds can be met with one data link while handling
30 erlangs of incoming voice traffic. In comparison,
reference to Figure 6A reveals that with half-duplex
operation, two data links would be required to meet a
Tmax of five seconds with Ev = 30 erlangs; with one
data link, T would equal approximately seven seconds.
Figure 6B-Effect of response length on personnel allowed
per data link (Full-duplex message transmission)
(Plot: personnel allowed per data link versus Tmax, average retrieval time objective in seconds. Tmax can be achieved if it falls to the right of the curve.)
SUMMARY
An analytical model has been presented that can be
used for preliminary investigations of the voice and
data communications aspects of inquiry-response systems. The model can be used to gain insight and to
obtain conservative estimates of communications capabilities required in order for a system to meet specified grade-of-service objectives.
In particular, the mathematical relationships in
the model can be used to estimate quantities such as
the number of customer representatives required to
handle incoming voice traffic and the volume of data
traffic generated as a result of this voice traffic.
These estimates in conjunction with retrieval time
estimates are used to predict the number of data links
required and the number of equivalently active input-output terminals that can be served by a data link
without exceeding a specified retrieval time objective.
The model is useful for studying the sensitivity of
the voice and data communications requirements to
changes in various communications parameter values.
This insight can aid in limiting the cost of subsequent
detailed system simulations. Also, the model can be used
iteratively to determine which combinations of parameter values permit specified voice traffic and retrieval
time objectives to be met most economically.
As an illustration, the model is applied to a hypothetical system. Requirements for full-duplex and half-duplex message transmission are compared. The assumed methods of operation are characterized by group
polling of and delivery to a common control unit rather
than individual input-output terminals. For this application, estimates of average retrieval time as a function
of erlangs of input and output data traffic were obtained
by using delay formulas from mathematical queuing
models.
ACKNOWLEDGMENT
Mr. E. J. Rodriguez wrote and ran the computer
simulation referred to in this paper. His assistance in
obtaining this data as well as other data appearing in
the figures is much appreciated.
REFERENCES
1 A DESCLOUX
Delay tables for finite- and infinite-source systems
McGraw-Hill Book Company Inc N Y 1962
2 E C MOLINA
Application of the theory of probability to telephone trunking
problems
Bell System Tech Journal Vol 6 1927 461-494
3 D MEISTER D E FARR
The utilization of human factors information by designers
Human Factors Vol 10 1967 71-87
4 J S SYKES
Analytical model of half-duplex interconnections of computers
IEEE Trans on Com Tech Vol 17 1969 235-238
5 Analysis of some queuing models in real-time systems
IBM Tech Pub Dept F20-0007-1
APPENDIX
Queuing models were used in the application section
of this paper to represent the assumed methods of
operation of the data communications subsystem. This
appendix contains a description of these queuing models
as well as the associated delay formulas used for calculating Di and Dr, two of the terms in the expression
for T.
Half-duplex message transmission
The queuing model selected to represent the assumed
half-duplex method of operation is a single-server
dual-queue model4 in which service is alternated between the two queues; a finite interval is required to
switch service from one queue to the other. Each queue
is assumed to have an independent Poisson input and
an independent general service time distribution. The
alternating priority rule is followed. With this rule,
all customers entering a queue while that queue is
being served are also served; when that queue eventually becomes empty, service can be switched to the
other queue.
In this model the single-server represents the data
link that alternately allows transmission of the input
messages that accumulate in the control unit and the
responses that accumulate in the computer. The
service times in the model represent the intervals ti
and tr required to transmit individual messages. The
switching, or reversal, times represent the intervals
required to reverse the direction of data link transmission. For calculation purposes, it can be assumed
that the reversal times also include the constant
intervals required to transmit a fixed number of characters for supervisory purposes. Examples are polling
sequences to request input messages from a control
unit and identification sequences that precede input
messages to identify the transmitting control unit.
Assuming the transmission times ti and tr have
mean values $\bar{t}_i$ and $\bar{t}_r$ and coefficients of variation
c2(ti) and c2(tr), and assuming the facility reversal
time R, the polling time P1, and the control unit identification time P2 have constant durations, the formula for
Di is as follows:
$$\bar{D}_i = \frac{\rho_i \bar{t}_i g_i}{2(1 - \rho_i)} + \frac{\rho_r g_r \bar{t}_r (1 - \rho_i)^2 + \rho_i g_i \bar{t}_i \rho_r^2}{2(1 - \rho_i)(1 - \rho)(1 - \rho + 2\rho_i \rho_r)} + \frac{(1 - \rho_i)(J_1 + J_2)}{2(1 - \rho)}$$
where
$\rho = \rho_i + \rho_r < 1$ is the average occupancy of the data link
$\rho_i = \lambda_i \bar{t}_i < 1$
$\rho_r = \lambda_r \bar{t}_r < 1$
$J_1 = (R + P_1)$
$J_2 = (R + P_2)$
$g_i = [1 + c^2(t_i)]$
$g_r = [1 + c^2(t_r)]$
Note that as $\rho \to 0$,
$$\bar{D}_i \to \frac{J_1 + J_2}{2}$$
The formula for Dr is identical to the one shown for Di
with all i subscripts changed to r's and vice versa.
Two additional formulas that may be helpful in
estimating storage usage at the control unit and at the
computer are the following, which give the average
number of input messages and responses, respectively,
that would be included in a multimessage transmission:
$$\bar{N}_i = \frac{\lambda_i [J_1 + J_2]}{1 - \rho}$$
$$\bar{N}_r = \frac{\lambda_r [J_1 + J_2]}{1 - \rho}$$
A computer simulation was used to determine how
well this queuing model represents the assumed method of operation. Values of T obtained with the queuing
model were found to be conservative. In general, the
best agreement was obtained as long as values of ρ were
less than 0.6 to 0.7; differences were within a range
from zero to 15 percent. With most combinations of
parameter values, the disparity increased significantly
for values of ρ exceeding 0.8; i.e., the queuing model gave
overly conservative estimates of T. Agreement improved as the value of $a = \bar{t}_r/\bar{t}_i$ decreased and/or the
value of Ti = Tr increased.
The disparity can be explained as follows: in the
simulation the arrival pattern of responses in the computer output queue was not quite as random as is expected for Poisson arrivals, which are assumed in the
queuing model. A principle of queuing theory is that
as regularity of arrivals and service times increases,
the average delay decreases.5 Excellent agreement
between the results of the queuing model and the simulation was achieved when the value of $\lambda_r$ used in the
queuing model was set equal to 0.9 times the $\lambda_r$ used
for the simulation.
Full-duplex message transmission
Independent models were selected to represent the
input message queue and the response queue in the
assumed full-duplex method of operation. Polling interference on the delivery line was assumed to be negligible.
For the response queue, the classical M/G/1 model was
assumed. For the input message queue, an accumulation
interval of Dp seconds was assumed prior to each poll.
This situation was modeled as an M/G/1 queue with a
setup time of Dp. Assumptions for ti and tr are the
same as stated for the half-duplex case. Formulas for
Di and Dr are as follows:
$$\bar{D}_i = \frac{\lambda_i \bar{t}_i^2 [1 + c^2(t_i)]}{2(1 - \rho_i)} + \frac{D_p}{2}$$
$$\bar{D}_r = \frac{\lambda_r \bar{t}_r^2 [1 + c^2(t_r)]}{2(1 - \rho_r)}$$
Expressions for Ni and Nr for this case are as follows:
$$\bar{N}_i = \frac{\lambda_i D_p}{1 - \rho_i}; \quad D_p > 0$$
$$\bar{N}_r = \frac{1}{1 - \rho_r}$$
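The appendix delay formulas can be sketched in code as a quick sanity check. This is an illustrative transcription of the half-duplex and full-duplex expressions as read from the appendix, not the author's program. The function names are hypothetical, and the values of P1 and P2 used in the example are assumptions, since the paper does not list them in this section.

```python
def half_duplex_input_delay(lam_i, lam_r, t_i, t_r, c2_i, c2_r, R, P1, P2):
    """Mean input-queue delay for the alternating-priority (half-duplex) model.

    lam_*: arrival rates (messages/s); t_*: mean transmission times (s);
    c2_*: squared coefficients of variation; R, P1, P2: constant overheads (s).
    """
    rho_i, rho_r = lam_i * t_i, lam_r * t_r
    rho = rho_i + rho_r
    if rho >= 1.0:
        raise ValueError("data link occupancy must be below 1")
    g_i, g_r = 1.0 + c2_i, 1.0 + c2_r
    j_sum = (R + P1) + (R + P2)                      # J1 + J2
    own_queue = rho_i * t_i * g_i / (2.0 * (1.0 - rho_i))
    cross = ((rho_r * g_r * t_r * (1.0 - rho_i) ** 2 + rho_i * g_i * t_i * rho_r ** 2)
             / (2.0 * (1.0 - rho_i) * (1.0 - rho) * (1.0 - rho + 2.0 * rho_i * rho_r)))
    switching = (1.0 - rho_i) * j_sum / (2.0 * (1.0 - rho))
    return own_queue + cross + switching


def full_duplex_delays(lam_i, lam_r, t_i, t_r, c2_i, c2_r, dp):
    """Mean delays (input, response) for the two independent M/G/1 queues."""
    rho_i, rho_r = lam_i * t_i, lam_r * t_r
    if rho_i >= 1.0 or rho_r >= 1.0:
        raise ValueError("each link occupancy must be below 1")
    d_i = lam_i * t_i ** 2 * (1.0 + c2_i) / (2.0 * (1.0 - rho_i)) + dp / 2.0
    d_r = lam_r * t_r ** 2 * (1.0 + c2_r) / (2.0 * (1.0 - rho_r))
    return d_i, d_r


# Example with the paper's ti = 15 and tr = 75 characters at 120 characters/s;
# P1 = P2 = 0.05 s and the arrival rates are assumed values for illustration.
d_half = half_duplex_input_delay(0.5, 0.5, 0.125, 0.625, 0.1, 0.5, 0.2, 0.05, 0.05)
d_full = full_duplex_delays(0.5, 0.5, 0.125, 0.625, 0.1, 0.5, 1.0)
print(d_half, d_full)
```

The response-queue delay for the half-duplex case follows by calling the first function with the input and response arguments exchanged, as the appendix describes.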
A study of asynchronous time division
multiplexing for time-sharing computer
systems
by W. W. CHU*
Bell Telephone Laboratories, Incorporated
Holmdel, New Jersey
* Present address: Computer Science Dept., UCLA, Los Angeles,
California, 90024.
INTRODUCTION
In order to reduce the communications costs in time-sharing systems and multicomputer communication
systems, multiplexing techniques have been introduced
to increase channel utilization. A commonly used
technique is Synchronous Time Division Multiplexing
(STDM). In Synchronous Time Division Multiplexing
(considering, for example, the transmission of messages
from terminals to computer), each terminal is assigned
a fixed time duration. After one user's time duration
has elapsed, the channel is switched to another user.
With synchronous operation, buffering is limited to
one character per user line, and addressing is usually
not required. The STDM technique, however, has
certain disadvantages. As shown in Figure 1, it is
inefficient in capacity and cost to permanently assign
a segment of bandwidth that is utilized only for a
portion of the time. A more flexible system that efficiently uses the transmission facility on an "instantaneous time-shared" basis could be used instead. The
objective would be to switch from one user to another
user whenever the one user is idle, and to asynchronously time multiplex the data. With such an arrangement, each user would be granted access to the channel
only when he has a message to transmit. This is known
as an Asynchronous Time Division Multiplexing
System (ATDM). A segment of a typical ATDM data
stream is shown in Figure 2. The crucial attributes of
such a multiplexing technique are:
1. An address is required for each transmitted
message, and
2. Buffering is required to handle the random
message arrivals.**
If the buffer is empty during a transmission interval,
the channel will be idle for this interval.
An operating example of an ATDM system for
analog speech is the "Time Assignment Speech Interpolation" (TASI) system used by the Bell System on
the Atlantic Ocean Cable.1 Using TASI, the effective
transmission capacity has been doubled and the system
operates with a negligible (with respect to voice transmission) overflow probability of about 0.5 percent,
even without buffering.
The feasibility of the ATDM system depends on:
(1) an acceptably low overflow probability, of the
same or lower order of magnitude as the line error
rate, that can be achieved by a reasonable buffer
size, and (2) an acceptable expected message queuing
delay due to buffering. To estimate these parameters,
analyses of the statistical behavior of the buffer are
presented below. The user-to-computer traffic is in
units of characters, while the computer-to-user traffic is
in units of strings of characters, which we shall call bursts.
The lengths of the bursts differ from one to
another and are treated as random variables. Because
of the asymmetrical nature of the traffic characteristics,
the statistical behavior of the buffer in the user-to-computer multiplexer and the computer-to-user multiplexer are quite different and, therefore, are treated
separately. An example is given to illustrate the multiplexer design in a time-shared computer-communications system that employs the ATDM technique.
** There may be other reasons for providing buffering such as:
tolerating momentary loss of signals (e.g., fading), momentary
interruptions of data flow, permitting error control on the line,
etc. Under these conditions, the buffer should be designed to
satisfy also the above specific requirements.
Figure 1-Time-division multiplexing
Figure 2-Asynchronous time division multiplexing
data stream
((a) User-to-computer data structure; (b) computer-to-user data structure. ADS = address; E = end of message.)
Figure 3-Asynchronous time division multiplexing
system for time-sharing computer communications
Analysis of buffer behavior
User-to-computer buffer
An ATDM system consists of a buffer, an encoding/decoding
circuit, and a switching circuit (in the case of
multiple multiplexed lines) as shown in Figure 3. For
the analysis of the statistical behavior of the user-to-computer buffer, the character (fixed length) arrivals
from the sources to the buffer are assumed to be generated from a renewal counting process; that is, the
character interarrival times are independent and
identically distributed. Since the line transmits with
constant speed, the time it takes to transmit each
fixed length character (service time), 1/μ, is assumed
to be constant. For reliability and simplicity in data
transmission, synchronous transmission is assumed.
The data are taken out synchronously from the buffer
for transmission at each discrete clock time. The data
arriving at the buffer during the periods between clock
times have to wait to begin transmission at the beginning of the next clock time, even if the transmission
facility is idle at the time of arrival. In queuing theory
terminology, the above system implies there is a gate
between the server and waiting room which is opened
at fixed intervals. Thus we shall analyze the queuing
model† with finite buffer size (waiting line) and synchronous multiple transmission channels (servers). Powell
and Avi-Itzhak2 analyzed a similar queuing model
with an unlimited waiting line. Birdsall3 and later
Dor4 analyzed a queuing model with limited waiting
room but with a single server. Here, the model is
generalized to accommodate multiple servers with
limited waiting room.
To establish the set of state equations for analysis
of a buffer with a size of N characters and c servers,
we assume that the system has reached its equilibrium.
Let $p_n$ be the probability that there are exactly n
characters in the system (in the buffer and in service)
at the end of a service time, and $a_c$ be the probability
that there are no more than c characters in the system at
that time, i.e.,
$$a_c = \sum_{i=0}^{c} p_i \qquad (1)$$
† The results derived from this study can also be used as a conservative estimate (upper bound) for the case in which the lines
are permitted to transmit characters that arrive during the
service interval. The estimate is a better approximation for
heavy than for light traffic intensity, because under heavy
traffic the lines are usually all busy, and characters that
arrive during the service interval have to wait and cannot be
serviced during that interval. The maximum overdesign in
a buffer system with c transmission lines that is permitted to transmit
characters arriving during the service interval is c characters.
Without loss of generality, we can set the service
interval equal to unity. We shall express the probability of the number of characters present in the buffer at
the end of the unit service time interval (left side of
equation (2)) in terms of the probability of the number
present in the system at the beginning of the interval
(right side of equation (2)), multiplied by the probability of a given number of characters arriving during
the service interval. As this can occur in different
combinations, we add the probabilities. With synchronous transmission, all characters in service would
finish their service and leave the system at the end of
a service interval.
Thus in a unit service interval of time, we have
$$p_0 = a_c \pi_0$$
$$p_1 = a_c \pi_1 + p_{c+1} \pi_0$$
$$p_2 = a_c \pi_2 + p_{c+1} \pi_1 + p_{c+2} \pi_0$$
$$p_n = a_c \pi_n + p_{c+1} \pi_{n-1} + \cdots + p_{c+n-1} \pi_1 + p_{c+n} \pi_0, \quad \text{for } n \le N - c \qquad (2)$$
$$p_n = a_c \pi_n + p_{c+1} \pi_{n-1} + \cdots + p_{N-1} \pi_{n+1-(N-c)} + p_N \pi_{n-(N-c)}, \quad \text{for } N - 1 \ge n > N - c$$
$$\sum_{i=0}^{N} p_i = 1$$
Due to limited buffer size,
$$p_{i>N} = 0 \qquad (3)$$
where
$\pi_n$ = probability of n characters originating from a
renewal counting process during a service
interval
N = buffer length in characters
c = number of transmission lines
The first equation describes the case in which the
buffer is vacant, if no more than c characters are in
transmission at the beginning of the interval, and no
arrivals occur during the interval. The second equation
describes the case in which one character is in the buffer
if no more than c characters are in transmission at the
beginning and one arrives during the service time
interval; or there are c + 1 in the buffer at the beginning and no character arrives during the service
interval, etc. In the numerical computation carried
out in this paper, we assume the character arrivals
are generated from a Poisson process; that is, $\pi_n = \exp(-\lambda_u)\lambda_u^n/n!$, where $\lambda_u$ is the average character arrival rate to the user-to-computer buffer (offered load)
from the m independent users. Since the buffer has a
finite size of N, $p_{i>N} = 0$. Thus, when a character
arrives and finds the buffer is full, an overflow will
result. Therefore, the average character departure rate
from the user-to-computer buffer (carried load), $\alpha_u$, is
less than the offered load from the users, $\lambda_u$. The carried
load can be computed from the buffer busy period
$$\alpha_u = \sum_{i=0}^{c-1} i\,p_i + c \sum_{i=c}^{N} p_i \qquad (4)$$
The overflow probability of the user-to-computer
buffer, the expected fraction of total number of characters rejected by the buffer, is then equal to
$$P_{of} = \frac{\text{offered load} - \text{carried load}}{\text{offered load}} = 1 - \alpha_u/\lambda_u \qquad (5)$$
The traffic intensity from user-to-computer, $\rho_u$,
measures the degree of congestion and indicates the
impact of a traffic stream upon the service streams.
It is defined as
$$\rho_u = \lambda_u/(c\mu) \qquad (6)$$
Channel (server) utilization, $\eta$, measures the fraction
of time that the lines are busy. It can be expressed as
$$\eta = \alpha_u/(c\mu) = \rho_u (1 - P_{of}) \qquad (7)$$
Since physically it is impossible for the transmission
lines to be more than 100 percent busy, the utilization
is limited to a numerical value less than unity. In the
no-loss case (unlimited buffer size), $P_{of} = 0$, then $\eta = \rho_u$.
The time average queuing length in the user-to-computer buffer, $L_u$, is equal to
$$L_u = \sum_{i=c}^{N} (i - c)\,p_i + \lambda_u/2 \ \text{characters}, \quad \text{for } N > c \qquad (8)$$
The first term in Equation (8) is the expected number
of characters in the system at the beginning of a service
interval. Since the characters could not leave the
system during the service interval, we add the time
average number of character arrivals (for Poisson
arrivals) during the service interval, which is $\lambda_u/2$. The
expected (time average) queuing delay of each character at the user-to-computer buffer due to buffering,
$D_u$, can be evaluated by using Little's5 result. We have
$$D_u = L_u/(\lambda_u (1 - P_{of})) \ \text{service times} \qquad (9)$$
For the single server case, that is, c = 1, the set of
state equations (2) becomes an imbedded Markov
chain, and can be solved iteratively to obtain the
state probabilities as shown in References 3 and 4.
For the multiple server case, however, the multiple
dependence on the various states prevents us from
using the iterative techniques for solution. Thus, the
set of state probabilities, the $p_i$'s, must be solved from the
set of linear matrix equations (2). The overflow probability, queuing delay, and queue length are then computed from the $p_i$'s via Equations 4, 5, 8 and 9.
The size of the matrix (Equation 2) corresponds to
the buffer length. The matrix equation was solved by
the Gauss elimination method.6 For purposes of accuracy, double precision was used in all phases of the
computation. From the character arrival rate, $\lambda_u$, the
coefficient values can be computed from (2) and they
are stored in the computer program. Due to the limitation of the computer word size, double precision on
the IBM 360/65 provides 15-digit accuracy. Therefore,
when the coefficient value is less than $10^{-16}$, it is set
equal to zero. The computation time required to solve
this type of system equation is largely dependent on
its size. For a 10 X 10 matrix the computation time is
about 0.8 seconds, while a 50 X 50 matrix equation
takes about 1.67 minutes.
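The solution procedure described above can be sketched as follows. This is a minimal illustration, assuming Poisson arrivals and a unit service interval, and it uses a small pure-Python Gaussian elimination in place of the routine run on the IBM 360/65. All function names are hypothetical. The balance equations for p_0 through p_{N-1} are combined with the normalization condition, and the overflow probability, queue length, and delay are then taken from Equations (4), (5), (8) and (9).

```python
import math


def poisson_pmf(n, lam):
    """Probability of n Poisson arrivals in one unit service interval."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def buffer_state_probs(N, c, lam):
    """State probabilities p_0..p_N from the state equations (requires c <= N)."""
    size = N + 1
    a = [[0.0] * size for _ in range(size)]
    b = [0.0] * size
    for n in range(N):                               # equations for p_0..p_{N-1}
        a[n][n] += 1.0
        for i in range(c + 1):                       # a_c * pi_n term
            a[n][i] -= poisson_pmf(n, lam)
        for k in range(1, N - c + 1):                # p_{c+k} * pi_{n-k} terms
            if n - k >= 0:
                a[n][c + k] -= poisson_pmf(n - k, lam)
    a[N] = [1.0] * size                              # normalization row
    b[N] = 1.0
    return solve_linear(a, b)


def buffer_measures(N, c, lam):
    """Return (overflow probability, queue length, queuing delay)."""
    p = buffer_state_probs(N, c, lam)
    carried = sum(i * p[i] for i in range(c)) + c * sum(p[c:])
    p_of = 1.0 - carried / lam
    queue = sum((i - c) * p[i] for i in range(c, N + 1)) + lam / 2.0
    delay = queue / (lam * (1.0 - p_of))
    return p_of, queue, delay


print(buffer_measures(N=10, c=1, lam=0.6))
```

For c = 1 and N = 1 the system reduces to p_0 = exp(-λ), which gives a simple hand check of the matrix construction.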
Numerical results are presented in Figures 4, 5 and
6. These results reveal the relationships among the
overflow probabilities, number of transmission lines
used, traffic intensities, and buffer sizes.
LCr
Lc=}4
Lc=Y!
Lc:Y!
LCr
I x10 -10 L.---L_..JI.,..-....J.._~I::---1-~~-..J._ _I~--L1 - - L -
Co: I
0
C-S
5
10
20
25
IS
40
35
SO
55
45
BUFFER SIZE IN CHARACTER LENGTH. N
Figure 4-0verflow probability vs buffer size
c
en
1>.1
~.S
::I
j:
CI>
z
Q
-I
0
3.0
:c
II:
~
C-I
U
c[
II:
2.S
c[
:c
u
~
~
~
2.0
CI>
il
~
I.S
C· i!
0
~u
It!
x
1.0
1>.1
Computer-to-user buffer
In a previous section, the buffer behavior has been
ana1yzed for a finite queue with multiple server,
Poisson arrivals, and constant service time, which
corresponds to the users-to-colnputer traffic. The
5
10
o.S
~
0
~
__-+____
~
10
__
~~
IS
__
~~
20
__
~
25
___L - _ . - - L - 30
BUFFER SIZE IN CHARACTER$, N
Figure 5-Expected queuing delay vs buffer size
35
40
Study of Asynchronous Time Division Multiplexing
f
= 1,2,
...
n = 0,1,2, ...
673
(10 )
(11)
The total number of characters that arrived during
the time to transmit a character on. the mUltiplexed
line is a random sum, SN, and is equal to
(12)
2.5
0.5L---::-,--",==::~:::::;;;;;~~~;::;;;;~_~~~~~~
o
0.1
0.2
0.3' 0.4
0.5
0.'
0.9
TRAFFIC INTENSITY.
II' 'A IC,.
Figure 6-Expected queuing delay vs traffic intensit y
computer-to-user traffic, however, is quite different
from the users-to-computer traffic. The central processor of a time-sharing computer sequentially performs fractions of each user's job and the output
traffic to the users are strings of characters which we
shall call bursts. The length of the bursts are different
from one to another and are treated as random variables. It is assumed that the internal processing speed
of the computer is very fast as compared to the line
transmission speed. Further, it is assumed that the
various processing tasks generated by the user-computer interaotions are independent from one user to
another and have exponential interarrival times for
a given user. In ATDM operation with these assumptions, the arrivals of bursts at the common output
transmission buffer for the group of users are approximated as random. In this section, we shall analyze this
buffer behavior under the assumptions of a finite queue,
single server with batch (burst) arrivals, and constant
service time.
Using the burst length and traffic intensity as parameters, we would like to find the relationships among
the overflow pr"'obabiJities, expected burst delays due
to buffering, and buffer Edzes.
Let us consider the case that the burst length, L
is geometrically distributed with mean, = 1/0; and
the number of bursts arrived during a unit service
interval (time to transmit a character from the multiplexed line), N, is Poisson distributed with mean, Ac
bursts/service time. The distributions of Land N are
as follows:
t
where L i , a random variable distributed as (10), is
the number of characters contained in the ith arriving
burst. N, a random variable distributed as (11), is
the total number of bursts arriving during the unit
service interval. For simplicity in notation, we let
S = S~
The characteristic function of S, cf> s(u), can be expressed in terms of the characteristic function of. Lt.
cf>t(u) , and Ac.
Since the burst lengths are geometrically distributed,
the characteristic function of L is

$$\phi_L(u) = \frac{\theta \exp(iu)}{1 - (1 - \theta)\exp(iu)} \tag{14}$$

where $i = \sqrt{-1}$. Substituting (14) into (13), then

$$\phi_S(u) = \exp\left[-\lambda_c + \frac{\lambda_c \theta \exp(iu)}{1 - (1 - \theta)\exp(iu)}\right] \tag{15}$$
From (15), it can be shown that the probability
density of j characters arriving during a unit service
interval, $f(S = j) = f_j$, is a compound Poisson distribution as shown in (16)

$$f_j = f(S = j) = \begin{cases} \displaystyle\sum_{k=1}^{j} \binom{j-1}{k-1} (\lambda_c \theta)^k (1 - \theta)^{j-k} \exp(-\lambda_c)/k!, & j = 1, 2, \ldots \\[2ex] \exp(-\lambda_c), & j = 0 \end{cases} \tag{16}$$
The expected value of S is given by $E[S] = E[L]E[N] = \lambda_c/\theta$,
and the variance of S is given by

$$\mathrm{Var}[S] = \lambda_c (2 - \theta)/\theta^2 \tag{17}$$
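As an illustrative check on (16) and (17), the following minimal Python sketch evaluates the compound Poisson probabilities term by term (in log space, to avoid overflow in the factorials) and compares the resulting mean and variance with the closed forms. The function name `compound_poisson_pmf` and the parameter values are assumptions chosen only for illustration; they do not appear in the original analysis.

```python
import math

def compound_poisson_pmf(lam, theta, j_max):
    """Return [f_0, ..., f_{j_max}] from Eq. (16); requires 0 < theta < 1."""
    log_a = math.log(lam * theta)      # log of (lambda_c * theta)
    log_b = math.log(1.0 - theta)      # log of (1 - theta)
    f = [math.exp(-lam)]               # f_0 = exp(-lambda_c)
    for j in range(1, j_max + 1):
        total = 0.0
        for k in range(1, j + 1):
            # log of C(j-1, k-1) * (lam*theta)^k * (1-theta)^(j-k) / k!
            log_term = (math.lgamma(j) - math.lgamma(k) - math.lgamma(j - k + 1)
                        + k * log_a + (j - k) * log_b - math.lgamma(k + 1))
            total += math.exp(log_term)
        f.append(math.exp(-lam) * total)
    return f

# Assumed example values: mean burst length 4 characters, 0.2 bursts per service time.
lam, theta = 0.2, 0.25
f = compound_poisson_pmf(lam, theta, 200)
mean = sum(j * p for j, p in enumerate(f))
var = sum(j * j * p for j, p in enumerate(f)) - mean ** 2
print(mean, lam / theta)                         # numerical mean vs E[S]
print(var, lam * (2.0 - theta) / theta ** 2)     # numerical variance vs Eq. (17)
```

The double loop makes the cost grow quadratically with the largest j, which is the practical motivation for a transform-based method when j is large.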
The time required to compute the probability density
function of S, $f_j$, from (16) depends on the size of j.
For large j (e.g., j > 1000), the computation time
could be prohibitively large. A convenient and
less time-consuming way to compute $f_j$ is from $\phi_S(u)$ by
using the Fast Fourier Transform7 inversion method as
follows:
$$f_j = \sum_{r=1}^{M} \phi_S(r) \exp[-2\pi i r j/M], \qquad j = 0, 1, 2, \ldots, M - 1 \tag{18}$$

where
$u = 2\pi r/M$
M = total number of input points to represent
$\phi_S(r)$ = total number of output values of $f_j$.
In order to accurately determine $\phi_S(r)$, it is computed
with double precision on the IBM 360/65. Further, we
would like to use as many points as possible to represent $\phi_S(r)$; that is, we would like to make M as large
as possible. Because of the word length limitation of
the computer, double precision provides 15-digit accuracy. Therefore, when $f_j < 10^{-15}$, it is set equal to
zero. M is selected such that $f_{j>M} < 10^{-15}$. The M's
are different for different values of $\lambda_c$ and $\bar{\ell}$.
The following is the set of state equations for a
buffer size of N characters with batch renewal arrivals,
single server, and constant output rate.

$$p_n = \pi_0 p_{n+1} + \sum_{i=1}^{n} \pi_{n-i+1} p_i + \pi_n p_0$$

or

$$p_{n+1} = \frac{1}{\pi_0}\left[p_n - \sum_{i=1}^{n} \pi_{n-i+1} p_i - \pi_n p_0\right], \qquad n = 0, 1, 2, \ldots, N - 1 \tag{19}$$

$$\sum_{i=0}^{N} p_i = 1 \tag{20}$$

and the traffic intensity from computer to user is

$$\rho_c = \lambda_c \bar{\ell} = \lambda_c/\theta \tag{21}$$

The above equations are reduced from Equation (2) by
letting c = 1.
The average character departure rate from the buffer
(carried load), $\alpha_c$, is less than the average character
arrival rate to the buffer (offered load), $\beta = \lambda_c/\theta$,
from the computer. The carried load can be computed
from the probability that the buffer is idle,

$$\alpha_c = 1 - p_0 \tag{22}$$

The overflow probability of the buffer with burst
input, the expected fraction of the total number of characters rejected by the buffer, is equal to

$$P_{of} = \frac{\text{offered load} - \text{carried load}}{\text{offered load}} = 1 - \alpha_c/\beta \tag{23}$$

The set of state Equations (19) is an imbedded
Markov chain. In the following numerical computations, we shall assume that the character arrivals
are generated from a compound Poisson process, i.e.,
$\pi_i = f_i$. The state probabilities can be solved iteratively and expressed in terms of $p_0$. From (20), we can
find the value of $p_0$. Thus we find all the state probabilities. The overflow probabilities for various burst
lengths can then be computed from (23). These results
are presented in Figure 7, which provides the relationships (at $P_{of} = 10^{-6}$) between burst lengths and buffer
sizes for selected traffic intensities.

Figure 7-Buffer length vs average burst length, $P_{of} = 10^{-6}$

In the above analysis, we have treated each character as a unit. However, in computing the expected
burst delay, $D_c$, due to buffering, we should treat each
burst as a unit. The service time is now the time required to transmit the entire burst. For a line with
constant transmission rate, the service time distribution
is the same as the burst length distribution except for a
constant transmission rate factor. When the overflow
probability is very small, for example, $P_{of} = 10^{-6}$,
then $D_c$ can be approximated by the expected burst
delay of the infinite waiting room model with Poisson arrivals and a single server with geometric service time,
the M/G/1 model.8,9 Hence

$$D_c = \frac{\lambda_c E(L^2)}{2(1 - \rho_c)} = \frac{\lambda_c (2 - \theta)}{2\theta(\theta - \lambda_c)} \quad \text{character-holding times} \tag{25}$$

where $E(L^2)$ = second moment of burst length, L. The
delays are computed from (25) for selected traffic
intensities and burst lengths. The results are portrayed in Figure 8.
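For reference, the closed form in (25) can be obtained from the standard M/G/1 mean waiting time when the service time is the geometrically distributed burst length, measured in character-holding times. The short derivation below is a sketch under that assumption, with $\rho_c = \lambda_c/\theta$ taken as the traffic intensity; the intermediate steps are not spelled out in the original text.

```latex
\begin{align*}
E(L)   &= \frac{1}{\theta}, \qquad
E(L^2) = \operatorname{Var}(L) + E(L)^2
        = \frac{1-\theta}{\theta^2} + \frac{1}{\theta^2}
        = \frac{2-\theta}{\theta^2}, \\
\rho_c &= \lambda_c E(L) = \frac{\lambda_c}{\theta}, \\
D_c    &= \frac{\lambda_c E(L^2)}{2(1-\rho_c)}
        = \frac{\lambda_c (2-\theta)/\theta^2}{2(\theta-\lambda_c)/\theta}
        = \frac{\lambda_c (2-\theta)}{2\theta(\theta-\lambda_c)}
        = \frac{\rho_c\,(2\bar{\ell}-1)}{2(1-\rho_c)} .
\end{align*}
```

The last form, with $\bar{\ell} = 1/\theta$, makes the dependence on traffic intensity and mean burst length explicit.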
Figure 8-Traffic intensity vs expected burst queuing delay (horizontal axis: traffic intensity, $\rho = \lambda\bar{\ell}/\mu$; vertical axis: expected burst queuing delay in character-holding times)
Discussion of results
We shall first discuss the user-to-computer buffer
behavior. Figure 4 portrays the relationships between
overflow probabilities and buffer size for selected
traffic intensities and selected numbers of servers. The
curves for two-, three-, and four-servers lie in the
region between the single and the five-server curves.
For a given traffic intensity, the overflow probability
decreases exponentially with buffer size. For a typical
traffic intensity of 0.8, a buffer of twenty-eight character length will achieve an overflow probability on the
order of $10^{-6}$. A larger buffer size is needed for $\rho_u > 0.8$
in order to achieve the same degree of buffer performance. For a given $\rho$, the queuing delay increases as
the overflow probability decreases (or the buffer size
increases). When the overflow probability is less than
$10^{-4}$ (for $\rho_u = 0.8$, this overflow probability corresponds
to a buffer size of about eighteen characters), the delay
increment with buffer length becomes negligible and
the delay can be approximated as independent of buffer
size as shown in Figure 5.
For the data transmissions in time-sharing systems,
the buffer overflow probability should be somewhat
less than the line error rate. For currently available
lines, the error rate is about $10^{-5}$. Therefore from
Figure 5, we know that the queuing delay range of
interest is almost independent of the buffer length.
Figure 6 describes the queuing delays (at overflow
probability = $10^{-6}$) for various traffic intensities. The
queuing delay increases exponentially with $\rho$. For a
given $\rho$, the queuing delay decreases as the
number of servers increases. Figures 4 and 6 agree with our
intuition that whenever multiple servers are needed,
it is always advantageous to use a common buffer
rather than using several single lines with separate
buffers.
Next we shall discuss the computer-to-user buffer
behavior. The overflow probability depends upon the
buffer size, the traffic intensity, and expected burst
length. For a given average buffer length, the overflow
probability increases as the traffic intensity increases.
For a given traffic intensity, and a desired buffer
overflow probability, the required buffer size increases
as the average burst length increases. Figure 7 provides
the relationships between the average burst length
and required buffer size to achieve an overflow probability of $10^{-6}$ for selected traffic intensities.
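The qualitative dependence described above can be explored numerically. The sketch below, written under the assumptions of the computer-to-user model (compound Poisson character arrivals with $\pi_i = f_i$, single server, constant service time), obtains $f_j$ by inverting the characteristic function (15) with an FFT, solves the state equations (19) and (20) iteratively in terms of $p_0$, and evaluates the overflow probability as one minus the ratio of carried to offered load, as in (23). The function names, the choice M = 4096, the example parameters, and the 1/M normalization applied to the NumPy transform are assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def pmf_via_fft(lam, theta, M):
    """Approximate f_0..f_{M-1} by inverting the characteristic function (15)."""
    r = np.arange(M)
    z = np.exp(2j * np.pi * r / M)                   # exp(i*u) at u = 2*pi*r/M
    phi = np.exp(-lam + lam * theta * z / (1.0 - (1.0 - theta) * z))
    f = np.fft.fft(phi).real / M                     # 1/M normalization assumed here
    return np.clip(f, 0.0, None)                     # remove tiny negative round-off

def state_probabilities(pi, N):
    """Solve (19) forward in terms of p_0, then normalize with (20)."""
    q = np.zeros(N + 1)
    q[0] = 1.0                                       # provisional value for p_0
    for n in range(N):
        # sum over i = 1..n of pi[n-i+1] * q[i]
        s = np.dot(pi[n:0:-1], q[1:n + 1])
        q[n + 1] = (q[n] - s - pi[n] * q[0]) / pi[0]
    return q / q.sum()

def overflow_probability(lam, theta, N, M=4096):
    """Fraction of offered characters not carried, following (22) and (23)."""
    pi = pmf_via_fft(lam, theta, M)
    p = state_probabilities(pi, N)
    carried = 1.0 - p[0]                             # server busy unless buffer empty
    offered = lam / theta                            # mean characters per service time
    return 1.0 - carried / offered

# Assumed example: traffic intensity 0.8, mean burst length 4 characters.
rho, mean_burst = 0.8, 4.0
theta = 1.0 / mean_burst
lam = rho * theta
for N in (10, 20, 40, 80):
    print(N, overflow_probability(lam, theta, N))
```

Because the overflow probability is formed as a difference of nearly equal quantities, very small target values would call for more careful numerics than this sketch attempts.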
When the average burst length equals unity, then
the result reduces to the case of Poisson arrivals,
single server and constant service time as had been
analyzed.8,4 For a given traffic intensity, the required
buffer size for average burst length $\bar{\ell}$ ($\bar{\ell} > 1$), $N_{\bar{\ell}}$, to
achieve the same degree of overflow probability is
much greater than that for unity burst length, $N_1$. In
general, $N_{\bar{\ell}} > \bar{\ell} \times N_1$. As $\bar{\ell}$ increases, the difference between $N_{\bar{\ell}}$ and $\bar{\ell} \times N_1$ increases. For example, for $\rho_c$
= 0.8, $\bar{\ell} = 1$, the required buffer size to achieve $P_{of}$
= $10^{-6}$ is $N_1 = 28$ characters. When $\bar{\ell} = 4$, then from
Figure 7, $N_4 = 212 > 4 \times 28 = 112$ characters. In
the same manner, if $\bar{\ell} = 20$, $N_{20} = 1200 > 20 \times 28$
= 560 characters. This is due to the fact that the
variance of S is proportional to $\bar{\ell}$ as shown in (17).
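The role of the variance can be made explicit by rewriting (17) at a fixed traffic intensity. The following is a brief worked sketch, assuming the traffic intensity is $\rho_c = \lambda_c/\theta$ with $\theta = 1/\bar{\ell}$; the numerical multiples are simple evaluations of that expression and are not quoted from the paper.

```latex
\begin{align*}
\operatorname{Var}[S] &= \frac{\lambda_c (2-\theta)}{\theta^2}
  = \rho_c \,\frac{2-\theta}{\theta}
  = \rho_c \,(2\bar{\ell} - 1),
  \qquad \lambda_c = \rho_c \theta,\quad \theta = 1/\bar{\ell}, \\
\bar{\ell} = 1 &: \ \operatorname{Var}[S] = \rho_c, \qquad
\bar{\ell} = 4 : \ \operatorname{Var}[S] = 7\rho_c, \qquad
\bar{\ell} = 20 : \ \operatorname{Var}[S] = 39\rho_c .
\end{align*}
```

At the same traffic intensity, the per-slot arrival count therefore becomes much more variable as the mean burst length grows, which is consistent with the required buffer size growing faster than $\bar{\ell} \times N_1$.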
Figure 8 portrays the relationship between expected
burst queuing delay and traffic intensity for selected
expected burst lengths. For a given expected burst
length, the expected queuing delay increases as traffic
intensity increases; for a given traffic intensity, the
expected queuing delay increases with burst length.
These are important factors that affect the delay.
Optimal design of multiplexing system
Let us first consider the design of the user-to-computer multiplexer. Based on the user-to-computer
traffic characteristics, the number of user terminals,
maximum allowable queuing delay, and overflow
probability, several different buffer system configurations might satisfy the desired requirements. Hence
there are trade-offs among the number of transmission
lines we might use, the transmission rates of the lines,
and the buffer sizes. We would like to design the multiplexing system whose total cost (transmission cost and
buffer storage cost) is minimum. One way to proceed
with this is first to select the set of possible multiplexing
system configurations based on the queuing delay
requirements from Figure 6. Based on the maximum
allowable overflow probability, we can obtain the
required buffer length for this set of possible multiplexing system configurations. The optimal user-to-computer part of the multiplexing system can then be
selected as that which minimizes the cost of the system.
Next, we shall consider the optimization of the
computer-to-user multiplexer. Data collected from
several operating time-sharing systems10 revealed that
the average number of characters sent by the computer
to the group of users is an order of magnitude greater
than the number of characters sent by the group of
users to the computer. Thus, using a high transmission
rate line for computer output data would significantly
reduce the buffer size and the queuing delay due to
buffering. Further, changes in the computer system,
such as changes in the scheduling algorithm11-17 in the
central processor, can strongly influence the computer
output traffic statistics, which will directly affect the
buffer performance and the design of the decoding
system.
In practice, we would like to design a system that
has minimum total cost yet satisfies all the requirements such as the inquiry-response delay, average
holding time of each user, etc. Since the multiplexing
system and the central processor intimately interact
with each other, the multiplexing system should be
treated as a subsystem of the time-shared computer
system. The economic and performance optimization
should be carried out jointly between the central processor and available communication facilities.
Example
Consider the design of a time-sharing system that
consists of many remote terminals and that employs
the ATDM technique with full duplex operation between the terminals and the central processor. Measurements of the traffic characteristics from several operating systems have revealed that the character interarrival time per user line can be approximated as
exponentially distributed with mean about 0.5 seconds.10
Thus, the character arrivals can be treated as Poisson
arrivals with a rate of 2 char/sec. A reasonable conservative guess is that 50 percent of the transmitted
information is sufficient for addressing and framing.
Voice-grade private lines can easily transmit 240 char/sec
from users. Suppose this operating system consists
of m = 48 terminals, all the terminals are assumed
to be independent and have the same traffic characteristics. The buffer is designed such that the overflow
probability is less than about $10^{-6}$. We shall use our
model to determine the buffer size and the average
queuing delay incurred by each character.
The traffic intensity is $\rho_u = 1.5 \times m\lambda_u/(c\mu_u) = 1.5 \times
48 \times 2/240 = 0.6$. To achieve the desired overflow
probability, from Figure 4, the required buffer length
is 14 characters. From Figure 6, the normalized queuing
delay due to buffering is equal to 1.25 holding times.
Since each holding time is equal to $1/\mu_u$ = 1/240 second = 4.16
milliseconds, the waiting time of each character is 5.06
milliseconds. Now suppose the number of terminals
is increased from 48 to 96. In order that traffic intensity
be less than unity, two transmission lines are required
and the traffic intensity is still equal to 0.6. From
Figure 5, the buffer length corresponding to the desired
overflow probability for two transmission lines is
about 14 characters. The waiting time is about 0.8
holding times which is equal to 3.33 milliseconds.
Although the difference between 5.06 milliseconds and
3.33 milliseconds may not be detected by a user at a
terminal, a common buffer of the same size operating
with two output lines can handle twice the number of
input lines as with one output line. Thus, the common
buffer approach permits handling a wide range of
traffic without substantial variation in buffer size.
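The arithmetic of the user-to-computer example can be sketched as follows. The helper names `traffic_intensity` and `delay_ms` are hypothetical and introduced only for illustration; the overhead factor 1.5, the arrival rate of 2 char/sec per terminal, the 240 char/sec line rate, and the 0.8 holding-time delay read from the figures are the values quoted in the text.

```python
def traffic_intensity(overhead, terminals, char_rate, lines, line_rate):
    """rho_u = overhead * m * lambda_u / (c * mu_u)."""
    return overhead * terminals * char_rate / (lines * line_rate)

def delay_ms(holding_times, line_rate):
    """Convert a normalized delay (in character-holding times) to milliseconds."""
    return holding_times * 1000.0 / line_rate

# 48 terminals on one 240 char/sec line, then 96 terminals on two such lines.
rho_48 = traffic_intensity(1.5, 48, 2.0, 1, 240.0)
rho_96 = traffic_intensity(1.5, 96, 2.0, 2, 240.0)

# Delay for the two-line case, using the 0.8 holding times quoted in the text.
wait_96_ms = delay_ms(0.8, 240.0)

print(rho_48, rho_96, round(wait_96_ms, 2))
```

Doubling both the number of terminals and the number of lines leaves the traffic intensity unchanged, which is why the same figure curves for a given intensity can be reused in the comparison.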
Next, we shall consider the buffer design problem
that employs the ATDM technique to transmit data
from the central processor to remote terminals. The traffic
statistics as well as the message lengths are different
from those of the users. The burst interarrival time10
can be approximated as exponentially distributed
with a mean of 2.84 seconds. Thus, the bursts can be
approximated as Poisson arrivals with a rate of $\lambda_c$ =
0.35 bursts/sec. Further, data collected in the same
study indicate that the burst length can be approximated as geometrically distributed with a mean of $\bar{\ell}$
= 20 characters. Suppose we use a wideband transmission line that transmits 480 char/sec to provide
communications from the central processor to 48 terminals. Assuming 20 percent of the transmitted
information is used for addressing and framing, then
the traffic intensity is $\rho_c = 1.2 \times m\lambda_c\bar{\ell}/\mu_c \approx 0.84$. To achieve
an overflow probability of $10^{-6}$, from Figure 7, we
find that the required buffer size is 1,400 characters.
From Figure 8, the expected queuing delay for each
burst is 85 character-holding times, or 85/480 = 0.177
seconds.
Suppose now we changed our transmission rate from
480 to 960 char/sec; then the traffic intensity $\rho_c \approx 0.42$.
The corresponding required buffer size in order to
achieve an overflow probability of $10^{-6}$ is 480 characters, and the delay is 15 character-holding times or
16 milliseconds. Thus, these results also provide insight regarding the trade-off between transmission
costs and storage costs.
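The two line-rate scenarios can be compared with a short calculation. The sketch below computes the traffic intensity for each rate and, as a rough cross-check, the infinite-buffer M/G/1 burst delay with geometric burst lengths, written here as $\rho_c(2\bar{\ell}-1)/(2(1-\rho_c))$ character-holding times. That algebraic form and the function names are assumptions of this sketch; its values need not coincide with the readings from Figure 8 quoted in the text (85 and 15 character-holding times).

```python
def traffic_intensity_c(overhead, terminals, burst_rate, mean_burst, line_rate):
    """rho_c = overhead * m * lambda_c * mean burst length / mu_c."""
    return overhead * terminals * burst_rate * mean_burst / line_rate

def mg1_burst_delay(rho, mean_burst):
    """Expected burst queuing delay in character-holding times (M/G/1, geometric bursts)."""
    theta = 1.0 / mean_burst
    lam = rho * theta                        # bursts per character-holding time
    return lam * (2.0 - theta) / (2.0 * theta * (theta - lam))

results = {}
for line_rate in (480.0, 960.0):
    rho = traffic_intensity_c(1.2, 48, 0.35, 20.0, line_rate)
    delay = mg1_burst_delay(rho, 20.0)
    results[line_rate] = (rho, delay, delay / line_rate)   # last entry in seconds
    print(line_rate, round(rho, 2), round(delay, 1), round(delay / line_rate, 3))
```

Doubling the line rate halves the traffic intensity and shortens each holding time, so the delay in seconds falls by much more than a factor of two; this is the transmission-versus-storage trade-off the example points to.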
The above example is based on the output traffic
characteristics of a specific computer scheduling algorithm. As the output traffic statistics change with
different scheduling algorithms, the buffer performance
in the multiplexing system is affected. To design an
optimal system, we should jointly optimize the scheduling algorithm and the multiplexing system so as to
yield minimum total cost and also meet the required
system performance such as maximum allowable
inquiry-response delay, desired overflow probability,
etc.
CONCLUSIONS
Queuing analyses indicate that for an allowable overflow probability and queuing delay, moderate buffer
sizes can be achieved for asynchronous time division
multiplexing for time-sharing computer systems.
Further, when multiple transmission lines are required,
better buffer performance will be achieved by using a
common buffer rather than by using separate ones.
Because of the asymmetric nature of the traffic
characteristics of user-to-computer transmission versus
computer-to-user transmission, a much larger buffer
is required for the computer-to-user multiplexer to
handle the larger volume of data generated by the
central processor.
The multiplexing system and the central processor
in a time-shared environment directly interact with
each other. To design an optimal operating system,
we should jointly optimize the central processor and
the multiplexing system (for example, the interaction
between scheduling algorithm and buffer performance)
to obtain a minimum cost system that meets the system
performance requirements. It is apparent that closer
coordination between the computer and communication system designs would be fruitful in terms of
economics and technological improvements to the
overall system design.
ACKNOWLEDGMENTS
The author wishes to thank E. Fuchs and D. Heyman
of Bell Telephone Laboratories for their helpful discussions.
REFERENCES
1 K BULLINGTON J M FRASER
Engineering aspects of TASI
B S T J March 1959 353-364
2 B A POWELL B AVI-ITZHAK
Queuing system with enforced idle time
Operations Research Vol 15 No 6 Nov 1967 1145-1156
3 T G BIRDSALL et al
Analysis of asynchronous time multiplexing of speech sources
IRE Trans on Communications Systems Dec 1962 390-397
4 N M DOR
Guide to the length of buffer storage required for random
(Poisson) input and constant output rates
IEEE Trans on E C Oct 1967 683-684
5 J D C LITTLE
A proof of the queuing formula L = λW
Operations Research Vol 9 1961 383-387
6 R W HAMMING
Numerical methods for scientists and engineers
McGraw-Hill Book Co Inc N Y 1962 363-364
7 W M GENTLEMAN G SANDE
Fast Fourier transforms-for fun and profit
Proc FJCC Vol 29 563-578
8 N U PRABHU
Queues and inventories
John Wiley and Sons Inc N Y 1965 42
9 P M MORSE
Queues inventories and maintenance
John Wiley and Sons Inc 1958 15-18
10 P E JACKSON C D STUBBS
A study of multiaccess computer communications
Proc SJCC Vol 34 1969 491-504
11 A L SCHERR
An analysis of time-shared computer systems
MIT Research Monograph No 36 MIT Press Cambridge
Mass 1967
12 P E DENNING
Effect of scheduling on file memory operations
Proc SJCC Vol 30 1967 9-21
13 J E SHEMER
Some mathematical considerations of time-sharing scheduling
algorithms
JACM Vol 14 No 2 April 1967 262-272
14 E G COFFMAN JR
Analysis of two time-sharing algorithms designed for
limiting swapping
JACM July 1968
15 E G COFFMAN L KLEINROCK
Feedback queuing models for time-shared system
JACM Vol 15 No 4 Oct 1968 549-576
16 L KLEINROCK
Certain analytic results for time-shared processors
Proc IFIP Congress 1968 Edinburgh Scotland Aug 5-10
1968 D119-D125
17 W W CHU
Optimal file allocation in a multicomputer information system
Proc IFIP Congress 1968 Edinburgh Scotland Aug 5-10
F80-85
The involved generation-Computing
people and the disadvantaged
by DAVID B. MAYER
IBM Systems Development Division
White Plains, New York
INTRODUCTION
Motivated computer professionals all over the United
States have undertaken a most special and extraordinary task: they are involving themselves in every
way possible in the training of disadvantaged and educationally-deficited men and women from the so-called
ghetto and poverty areas of the country. They are
exhibiting a special and wonderful tension which impels them to appear at that interface between their own
computing community and those underprivileged who
wish to enter it.
As Chairman of the new ACM Committee On Computing And The Disadvantaged (ACM-CCD) I have
been privileged to visit or directly participate in ten
projects in New York City, Boston, Los Angeles, San
Francisco, Sacramento, St. Louis, and Philadelphia.
From them can be drawn some broad brush pictures
of such projects, some of their special problems, and
their relative probabilities of success.
The disadvantaged-Who are they?
The term "disadvantaged" was originally coined in
connection with educational grants from the government, for potentially very bright youths from poverty
backgrounds for experiments in educational techniques
programs. Since that time, it has broadened to include all those who are educationally-deficited (and
with minimal hope of retrieval of those years they are
behind), including those from both poor white and
non-white communities.
Typically, computer projects have undertaken to train
some of the disadvantaged either as operators or
programmers. Generally the participants have been
characterized as follows:
• 19-23 years old
• dropped out of ninth or tenth grade
• are black or brown
• are two to three years behind their white counterparts who are at the same grade level in terms of
tested comprehension
• about two-thirds male
• have a job of some kind, but are underemployed
apparently by reason of race or language
• come from a poverty-stricken area, often an urban "ghetto"
• have police records in about one-third the cases
• evidently have some motivation to better themselves
• have children or heavy "family" responsibilities
• on aptitude tests score over the complete range
from high to low
More particularly though, a review of some other
statistics may help us to orient ourselves:17
For Negroes in the 25-34 year old age bracket:
• 47.0 percent dropped out before graduation from
high school
• 45.6 percent completed high school
• 7.4 percent completed high school and college
• A Negro sixth grader was 2 1/2 grade levels behind his white counterpart in general scholastic
achievement
• A Negro ninth grader was three grade levels behind
his white counterpart
This three-year deficit picture persists, through 12th
grade and graduation, in general.
Remediation, restructuring, and 'relevancy'
What does this mean to the computer training course,
or to the jobs which people with such backgrounds can
undertake?
It means some tutoring in the technical concepts
during the computer operator's or other courses. It
almost certainly will mean lengthening the course deliberately. Currently computer operator and programming (usually Cobol, by the way) courses run two to
five times longer than the equivalent course given in
the regular industrial milieu.
It means teaching only 'relevant' material, only
the guts of content, only that which is directly applicable to that job waiting at the end of the course:
ergo, no frills.
It means employers will have to restructure some
jobs, in smaller, less complex, carefully detailed clusters, so that a rather straightforward set of behaviors
can be carried out by new employees.
It is possible to take small top-level segments of the
disadvantaged populace and train them directly in
computer tasks without remediation. But in general if
we want to really dig into the American dilemmas of
today, remedial training will be needed for any broad
training program developed to bring students up to
the level of comprehension needed to understand some
of the computer concepts of our more abstruse computer texts.
Trade-offs in training
There is, then, a kind of balance of course content
requirements versus several variables-principally time-which one can invest to obtain effective training and
eventual on-the-job performance results.
For example, most disadvantaged projects teaching
key punch operators required that trainees be able to
type 20 to 40 words per minute prior to entering key
punch classes.2,3,4 Where the normal key punch class is
five days, in projects for the disadvantaged they run
15 to 20 days.
A project choosing high school graduates can train
computer operators quite effectively, and include
theoretical material on operating systems, programming techniques, and the internal supervisor/program
coupling within the computer, enough so as to allow
an operator to make some reasoned judgements in
error situations. This is obtained through trade-offs
such as (a) lengthening the course, or (b) intensifying
the hands-on experience. This probably gives the disadvantaged person who graduates one of the finest
running starts in 'operations' in the country. (N. B.
Particularly true of the Urban League/IBM/Bank of
America project in Los Angeles.4)
The placement problem and the assumed job market
Most projects have been located in large urban,
highly computerized geographical areas; groups in
the planning stages have typically looked about themselves and faced the combinatorial possibilities of
probable jobs available and probable people they were
hoping to train. Almost invariably they concluded that
three possible combinations were feasible:
• key punch operator
• computer operator, either as a trainee handling
tapes and discs and peripherals primarily, or as
a trainee console operator
• a trainee Cobol programmer
Generally rejected for training were job descriptions
which involved:
• Fortran or basic language programmers
• pure EAM or "unit record equipment" operators;
however, this was sometimes appended to the
computer operator trainee position description
• tape librarians, dispatchers, I/O clerks, and the
like.
Most projects made only a cursory pass at the actual
placement planning question and generally assumed
that any graduates they offered the marketplace would
be snapped up with only a modicum of effort to find
interviews. Inevitably, halfway through the training
when efforts turned toward placement interviews
there were some rather rude awakenings to several
facts: the students' color, language, and prior records
were obstacles that required active selling to overcome. More often than not there was a mad scramble
toward the end of the training period to find employers
willing and able to hire trainee computer operators
from the poverty sector of our population. Only heroic
efforts upon the part of placement committees would
slowly find openings for interviews, much less precommitted employment slots.
Hence, if there were one piece of advice this author
would give it would be: plan your placement process
first; involve would-be employers at the earliest planning stages to test the marketplace, to involve them in
the training stages, to be interested in the graduates,
and to assure jobs at the end of the course. It is almost
axiomatic that if you should fail to place your 'disadvantaged' trainee within a very few weeks of his
graduation you may have lost him or her forever and
all the training investment will have been for naught.
The computer operator-What is he?
In order to converge upon concrete results and make
some comparisons only the training surrounding the
job of Trainee Computer Operator will be described.
Let us consider three different, but related, aspects
of the Computer Operator position description:
• EAM or "unit record equipment" knowledge and/
or ability;
• pure computer operating, highly structured, highly
practical, based on detailed specified stimuli and
response patterns.
• an "understanding" computer operator, who has
sufficient theoretical knowledge about operating systems to
solve unexpected error situations, so as not to
abort, but successfully run a job.
In a typical job description, the Trainee Computer
Operator works under close supervision, performs the
simpler operations on peripheral devices and on the
console, expedites the data in and out of the system
and the installation, and is generally a careful intelligent follower. He is usually expected to have two
years of college (possibly an AA degree) or several
years of tabulating machine (EAM) experience or
a 200-hour hands-on computer operating course.16
The Journeyman Computer Operator is expected
to do more: based upon six or more months of actual
Trainee experience, he checks input and output for
general results, analyzes stops and takes corrective
action, and runs test programs. He is also required to
know the principles of operations and basic elements of
programming, follow directions carefully and analyze
data, and perform arithmetic computations.
It is the author's contention that the EAM tasks
and training are frills and basically obsolete and should
not be taught (excepting a little keypunching for error
corrections); that the second or "structured, practical"
job description is the one for minimal entry level jobs
for the disadvantaged; and that the third description
adds a requirement for "theoretical understanding"
for computer operator training projects. This latter
requirement is significantly high in terms of language
comprehension and acts as a deterrent to large sectors
of the disadvantaged population trying to take advantage of the training.
It is interesting to note that in almost every case of
the disadvantaged training projects with which the
author is familiar, nowhere nearly such stiff conditions
are placed either upon the students for entry into the
computer operator course, or upon them for eventual
hire. And there is every indication they can perform
successfully upon the job with considerably less stringent qualifications.2,7,12
The author therefore urges that to be able to develop
the truly disadvantaged, educationally-deficited person (a dropout from as low as the ninth grade), computer installations should re-structure their basic
computer operator job specifications, and training
projects their content, to reflect the entry-level requirements for the "practical, structured" computer
operator trainee. This would give gainful employment
of a meaningful type to many more people in the total
community, particularly the disadvantaged.
Computer operator curricula
In this section are described two examples of operator curricula to exhibit the basic approaches, typical
content, and a preliminary view of some of the training
techniques employed. (Treated more fully in another
section.)
In the Mitre Corporation's fully in-house, fully
funded (internally) on-the-job (OJT) format' students
are paid at regular industrial rates. Remedial training
in basic language skills and mathematics takes up
most training hours daily, for the first few of the 26
weeks total. Gradually it is replaced by computer
operations training on both the IBM 7030 (Stretch)
and the IBM 360/30 and 360/40 systems; and of
course gradually the students work out on the line.
Instructors are internal, paid, staff members; four
students started, and three finished successfully. They
were part of a 12-trainee Mitre project for clerks,
operators, and the like.
The salient features of this Project's outline include:
(a) deliberately assigning their second shift Supervisor
for nine months as Training Coordinator to prepare the
technical curriculum, instruct, supervise the OJT
aspects, and coordinate with the remedial training; (b)
giving all first shift personnel a stake in the outcome, and
including them in the evaluation process; (c) providing
separate remedial training on a descending scale
concurrent with increasing line operations training
and experience. A full Outline is available from the
ACM.18,28
The second basic approach, and the one most often
used, was the external, separate training program; it
is typified by the CPDA project in New York City.1
Using a "self-selection" process,12 75 prospective students went through an 'orientation' to computer operating, and then 48 volunteered for actual training. Thirty-two stayed with it, 27 graduated on the first round,
three more were tutored to completion, and 17 of the
30 were placed as of this writing.
The program, approach, and Syllabus Outline of
CPDA are given in Figures 1 and 1a.
EAM/Unit Record equipment training is not given
in this course. Several of the courses did offer as much
as a week's equivalent of such training, on the basis
of its relevance still in today's card-oriented input/
output part of the computing world. The Urban
League/IBM/Bank of America project in Los Angeles,
the Philadelphia ACM/Board of Education project,
and the St. Louis IBM/Board of Education are noted
in particular.4,10,11 The latter have a regular EAM
course available in their vocational schools as well.
Training techniques
It would seem obvious that sorne specialized training
techniques would have to be employed to reach disadvantaged or educationally-deficited people, and a
few such techniques have been attempted in the computer training field. Such experiments should be carried
out in a professional, measured" feedback atmosphere,
but rarely has that been available. In the projects
studied, probably the three aspects of training that
have paid off the most are:
• lengthening the courses by two to five times the
average;
1. ORIENTATION PROGRAM
a. Registration
• Welcome 75 prospective students
• describe project
• history and needs of the computer field
• introduction to the computer
• film on computer and operations.
b. Introduction To Computer And Business Environment
• devices used at an installation
• business environment and general working
conditions
• employment prospects for the computer
operator.
c. Computer Installation Visit
d. The Computer Operator And The Training
Program
• computer concepts
• general responsibilities of the operator on the
job
• the operator's relationship to the computer
field
• the training program
• general discussion and individual counseling.
Extensively, throughout these orientation sessions
instructors and counsellors mingle with the students,
interact on questions, and encourage self-selection into,
or not-into, the actual training.
Figure 1-A no-frills training syllabus. Computer
Operations Training for the IBM S/360 Models 30 and
40. Training consists of two parts. Since motivation
and interest are prime factors in training completion and job performance, a preliminary
four-session orientation program was designed
to give the candidate a data base for
making up his own mind, to enter or
not-enter training. (After CPDA1,13).
• making the classes small; or alternatively assigning two instructors per class to bring the
pupil/teacher ratios down to as low as four-to-one;
• allowing the class to teach itself, to a certain extent,
by teaming or as a full group.
Interestingly enough, no project used any specialized
audio-visual material (other than hands-on work with
the computer itself), most of them depending upon
the standard available texts, programmed instruction
books, or books of illustrations.
Nevertheless, a kind of experiment did take place
in the CPDA project, observed by the teachers, staff
professional guidance counselors, and the students
themselves. It involved trying three differing teaching
techniques:
1. The 'classical' approach consists of a teacher
lecturing to his students, with the teacher as
focus for feedback (answers, discussions, questions). This can be characterized as a 'vertical'
organization of class structure.
2. The 'teams' approach consisted of the teacher
breaking up the group into five teams of three
students each. This came about to solve the problem of demonstrating the computer console and
2. COMPUTER OPERATIONS TRAINING
PROGRAM
a. Course Structure
• The training program will consist of both
classroom sessions and computer room visits.
• It is expected that approximately 30-45 students will complete the Orientation Program
and enter the Operations Training Program.
• There will be three sections, each with 10-15
students.
• Each section will have one primary teacher
and one assistant teacher.
• Classroom sessions will meet twice a week for
two hours.
• Computer room visits will be scheduled as
required by the Syllabus and will be from two
to three hours in length.
b. Educational Material. The basic student text
for the course will be:
IBM System 360 Model 30 DOS System
Operation Training Manual and Book of
Illustrations (Student Text; Forms C20-1676-0, C20-1677-0).
• Examples of I/O media will be available in
the classroom for student familiarity with
cards, tapes, disk packs, printer forms, carriage control tapes, etc.
SYLLABUS
Section A (INTRODUCTION AND PERIPHERALS) covers:
• Introduction to Input/Output Media
• Computer Room Procedures
• Computer Room Visits (hands-off demonstrations and hands-on practicums)
• Operations of Peripheral Devices
• General Review
Section B (SOFTWARE INTERFACE) covers:
• Introduction to "Operating Systems"
• Control Information
• Operator Interface With DOS (Disk Operating System)
• Computer Room Visits (hands-on practicum)
• Stand-Alone Programs
• Compatibility Modes-Emulation
• Course Review
Figure 1A-A no-frills training syllabus (cont'd). Note
the absence of EAM/unit record equipment training,
and a maximum of immediately-applicable job
knowledge given in a 54-hour course over a period
of 2.5 months (20 sessions). (After CPDA1,13).
More detailed versions of this and other
curricula, syllabi, and lesson plans are available through the ACM Committee on
Computing and The Disadvantaged
(ACM-CCD).
peripherals effectively. Having 15 students stand
around in a large semi-circle proved boring and
ineffectual; by placing the few most hep students
with two of each of the others, he could in effect
assign problems to teams to work out, and allow
students to teach each other within teams.
When competition rather than cooperation
started to raise its head, the team members
were rotated. In addition, the instructors, after
giving the teams a problem, deliberately gave
the impression that they would answer no
further questions. After computer runs the
whole class would hold a post-mortem. Instructors also created unexpected problems, such as
casually dereadying a printer, or flipping a
tape into 'file protect' mode without a file
protect ring being inserted. Furthermore, students would be called upon at random at the
beginning of a class to recapitulate the previous
session's work and lessons, taking the
instructors' place in essence. The remainder
of the class usually jumped in to help the hapless classmate, after waiting an appropriately
gruesome few minutes. This 'team' process,
a combined 'horizontal' and 'vertical' class
structure, and the random 'instructor', all
created an involvement within the class. It
worked, and beautifully; in fact the class got
ahead of the syllabus.
problems, and must be handled carefully. It has been
used very effectively for teaching programmers and
systems analysts (advantaged),8 and it is strongly urged
that the Montessori techniques and environment be
attempted on the disadvantaged in all occupations. At
least one project, the Sacramento/ACM Education
Committee,16 is planning to use it for a computer
operators' course.
3. The 'fully horizontal' or 'group/workshop' approach was occasionally attempted by the third
pair of instructors. This normally involved the
teacher bringing much of the material to the
students' attention via lectures and some
reading, but required that answers to problems
and operations come from the class as a group.
In this particular instance, the structure worked
fairly well, and the class completed the material on
schedule, but as an experiment it was relatively
inconclusive. This was partly because the
amount of lecture required, and individual help
given, was more than normally used in a true
'horizontal' workshop situation. That is, in this
case, the technique never got a thorough
workout.
There is still one more important aspect of training
which will aid a project immeasurably: the maintenance of continuity of warm, stable teachers with
whom the class can identify, and the assured continuity of class sessions: the same physical facilities,
knowing that the class is going to meet, and that there will
be a job waiting at the end of the course. Changing
classrooms every few weeks and uncertainties of computer time when promised try the motivations of the
students (and instructors) sorely, at times. Those
situations which had good steady facilities and the same
instructors throughout (usually paid, and professional at teaching itself) have the highest attendance
and morale levels. Though these items go almost without
saying, the proliferation of volunteer projects impels
the author to issue this type of warning, for the sake of
everyone involved, particularly the disadvantaged
students; they have been through enough instability
in life already.
To summarize: the lecture technique worked fairly
well on the brighter students, who expected it as a
matter of previous exposure. Their class suffered the
greatest number of dropouts, but not from the training
technique used.
The 'teams' approach was very effective for both
morale and learning. The class was able to cover a few
items the others didn't.
The 'interactive' 'fully-horizontal' group organized
as a workshop also finished, and reasonably well, certainly comparable to the others in content knowledge.
But it is predicted that the stronger extension of that,
the new Montessori/workshop group involutional
methods should give far better results for disadvantaged
people, especially when the staff and the facilities can
be structured properly.16
The Montessori environment requires careful guidance upon the part of the instructor, and a special quality
of allowing the class to explore freely the alternative
paths to answers. The instructor, in a sense, must be
willing to subdue his usual position of center-focus role,
become a part of the discussion, part of the group, almost at their own level. Over a period of weeks, the
group should become highly interactive, over the
material, over technical and occasionally external life
Training and stability
Curricula comparison: Methodology of evaluation
Now that some specific curricula have been presented,
we wish to set down some criteria and the method by
which we will compare the various content and techniques; to do so we have prepared ourselves in the preceding paragraphs with the job specifications, the
required curriculum content, and the training approaches. We consider some of the following points of
comparison, expanded in Figure 2.
• Is content aimed at the structured, stimulus-response, practical type of course?
• Are the results of the course immediately applicable to a job in a computer installation?
• Does curriculum allow a "flexible tail" so that
graduate can go to work in an installation that
has computers and operating systems other than
the particular one taught?
• Does the course lead into on-the-job training
(OJT) easily?
FIGURE 2-Comparison of curricula for computer operator training projects for disadvantaged peoples
Projects compared (columns): ACM + Phila. Board of Education; Boston MITRE Corp.; LA/Urban League + IBM + Bank of America; St. Louis Board of Education & IBM; San Francisco IBM; CPDA NYC; Midwest Side NYC.
Criteria compared (rows): Practical Structured?; Immediately Applicable?; Any Computer?; Theory/Oper. Systems; Programming? (Basic Language, Higher Language, Program Structures, Prog's vs. Op. Systems); Flexible Tail-Toward Other Computers; Lead into OJT; OJT Part of Course?; Training Measurements (Written Exams, Hands-on Exams, Post-Grad Performance); "No Failures" Policy?; Support: Remedial; Support: Tutoring; Support: Guidance; No. Students/Teacher; Non-English Help?; Buddies? Liaisons?
Figure 2-Comparison of curricula for computer operator
training projects for disadvantaged peoples (Cont'd)
Projects compared (columns): CPDA; ACM + Phila. Bd. of Education; MITRE Corp.; Urban League + IBM + Bank of America; St. Louis Bd. of Education + IBM; San Francisco IBM Education Center.
Criteria compared (rows): Attrition Rates (Number Started, Number Finished, Number Graduated, Number Placed); Length of Course in Hours (Orientation, Hands-On Time, Technical Class, Other Classes (Remedial), Total Class Hours, Number of Sessions, Elapsed Time in Weeks, Hours per Session); EAM Taught?; Only Key-Punching?; IBM 360 DOS; OS; Other (1401, 7030); Teacher Stipends; Student Stipends; Sponsors.
*Optional
†Class Hours Only
• How are the results of training measured?
• How is the post-graduate performance measured?
• How much time is given in classroom lecture? In
hands-on experience per student?
• Is there a "no failures" policy of teaching?
• How much supportive remedial help is built into
curriculum?
supportive hours for every student class hour.12 Professional staffs and funded projects tend to use a lower
ratio of time (about one-to-one) but this is often balanced by a much larger expenditure of up to approximately
$1,000 per student, involving facilities and
professionals.
Other measures, such as performance on the job,
have been followed up by projects too cursorily to
warrant reporting at this time.
• Are non-English students helped?
• How many teachers per student?
• Course content: was EAM/Unit Record equipment taught fully, or was only the keypunch
taught (for computer room use)? Was operating
system taught as button-pushing course? Or was
a 'theory of operating systems' taught in addition,
and the relationship between the resident applications programs and the operating system concepts taught?
With these criteria and questions in mind, a chart
of the Yes/No/Comments type (Figure 2) gives a picture to the reader of the various projects, and their relative strengths and weaknesses.
Performance criteria-Some measures of the projects
If one were to attempt to measure the results of such
training for the disadvantaged, one might look to the
annual salaries accruing to those who obtained jobs,
versus the expenditure for the project. MWSDPS16
suggests that for approximately $14,000 they graduated
28 students of all types (key punch and computer
operators, and programmers), and placed 20 of them,
for a job value of $98,000 annually, relieving the welfare rolls of eight people at the same time. By the same
token, CPDA graduated and placed 17 computer
operators, who now earn about $74,000 per year, all
for something less than $1,000 cash, but using six
teachers, eight guidance counselors, and about a dozen
more in placement, tutorial, curriculum development,
measurement, and find-a-computer chores, all
volunteer.12
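The salary-versus-expenditure comparison suggested above reduces to a few divisions. The sketch below works them out in Python from the figures quoted in this paragraph only; the dictionary keys and the function name `summarize` are illustrative. Two assumptions should be kept in mind: the CPDA cash figure of $1,000 is an upper bound ("something less than $1,000"), and neither cost includes the value of volunteer labor.

```python
# Figures as quoted in the paragraph above. The CPDA cost is an upper
# bound, and volunteer time is not priced into either project's cost.
projects = {
    "MWSDPS": {"cost": 14_000, "placed": 20, "annual_salaries": 98_000},
    "CPDA": {"cost": 1_000, "placed": 17, "annual_salaries": 74_000},
}


def summarize(project):
    """Derive simple per-placement measures from cost and salary totals."""
    return {
        "cost_per_placement": project["cost"] / project["placed"],
        "mean_annual_salary": project["annual_salaries"] / project["placed"],
        "salary_dollars_per_cost_dollar": project["annual_salaries"] / project["cost"],
    }


for name, project in projects.items():
    measures = summarize(project)
    print(name, {key: round(value, 2) for key, value in measures.items()})
```

On these numbers the cash outlay per placement is $700 for MWSDPS and under $60 for CPDA, which illustrates how strongly the comparison depends on whether volunteer effort is counted as a cost.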
In another performance measure, it is evident that
the more deeply into the social problems' fabric a
project wishes to penetrate, the more 'underpinning'
or training support techniques in which one must invest energy
and staff: these include tutoring, remedial training,
high school equivalency aid, teaching English as a
second language, and both vocational and life-guidance
counseling. The volunteer projects expend 10 to 20
The 'shadow programming aide'
It seems to this writer that there are considerably
more jobs available at a slightly higher level of complexity: that of a 'shadow programming aide'.27 Not
just a coder, this person works in shadow relationship
to a regular programmer, carrying out some of the
more onerous details of programming, such as flowcharting from given coding, setting up debug runs,
keypunching, expediting the debug process, or carrying
out some of the detailed, but highly supervised coding.
This kind of programming technician could very well
have real upward mobility, dependent primarily on
the trainee's learning rate, general intelligence, interest, and proven ability. The national crush for programmers is far greater currently than for operators;
and in addition, the direct personal involvement of a
disadvantaged person with a regular, stable, accepting
programmer or two, would be one of the best stabilizing
entries into the computing field.
SUMMARY-PITFALLS AND SUGGESTIONS
A number of points may be abstracted from the
foregoing, in addition to others not made in the main
text:
• The 'typical project' for training disadvantaged
people is created by a highly motivated group of
computer (and other) professionals, with a desire
to get involved, to do something.
• The question each project must face is just how
deeply into the social problems it wants to delve:
underemployed, unemployed, educationally-deficited, hard-core unemployed, or whatever.
• Most computer training projects aim at the 19-23 year olds, who are two or three years behind
in their education, and won't need very much
remedial training to get them through a computer
operator (or programming) course and into a meaningful job.
• It is a surprising note, but there seem to be relatively few trainee computer operator courses in
the country: mostly these have been given on the
job (OJT), or from reference manuals supplied
by the manufacturers; leastwise the author has
detected very few. This points up a real economic
advantage for the prospective employers: very
little OJT need be spent to start these employees; they're ready to go from Day One. For once, in
this case at least, the disadvantaged are probably
starting out ahead of their advantaged brethren.
• The first psychological jolt for motivated whites
is to discover they will not, most of them, be acting
at the actual 'interface' between the disadvantaged and the advantaged community. More of
them will have to find psychic reward in support
functions, such as finding employers with available
jobs, writing curricula, obtaining free computer
time, finding teachers and class space, obtaining
funds.
• The second psychological jolt comes in discovering
how much planning is to be done (or should have
been done).
• The third jolt is the marketplace: obtaining job
slots requires persuading people outside your little narrow project, and it's tough. Start very early
to involve would-be employers, even at the planning stages.
• Courses, so far, have usually involved computer
operator training on IBM machines, particularly
the 360 series, in DOS or OS (Disk Operating System or Operating System/360). Programming usually means COBOL, rather than basic
languages. A way to teach a 'core' curriculum applicable to any operating system for any manufacturer's machine needs to be developed. Such
a step has begun to take shape with the ACM-Philadelphia/Board of Education project, using
the 'flexible tail' method for phasing to OJT.
(Ref. 11).
• The single greatest lack is funds. They are hard
to get. Working with the local Board of Education
to obtain government funds, or with local industry for underwriting expenses privately, seems to
be most effective.
• Volunteer projects usually failed of their total
objectives, but succeeded partially; but they tend
also to peter out. Plans for real continuity must
be built in.
• One cannot stress too much the responsibility we
have in changing the lives of would-be trainees.
• The one thing training projects are prepared for
is attrition; their expectations are that they will
graduate 30 or 40 percent of the people who arrive
at their door for the first formal class day. It
works out that way, though some projects have
gotten over 80 percent.
• What is needed for the classes is a new, simple,
straightforward text on operating a computer, written in the language which the disadvantaged can
understand. It may have to be aimed at the ninth
grade level, for both the English and the commercial algebra comprehension. It could start with
what they know already: numbers systems can
come from the numbers game, and you could go
on from there. Here's a set of books someone could
write, and the whole computing profession may
benefit.
• Serious consideration of new, less complex, job
descriptions can be attained in the operations and
the programming area. It is up to employers and
the industry to develop them, in order to include
a larger part of our population.
• Finally, the author strongly urges that the computing community initiate a national broad-based
organized effort, to develop jobs, regular training
projects, and adaptations of training techniques
for disadvantaged to enter the field. Two major
proposals are now before the ACM Committee on
Computing and the Disadvantaged: one from the
Sacramento Chapter, one of the few ACM chapters actively pursuing a generalized, funded-plus-volunteers approach; and the other from ACM's
Special Interest Group on Computer Personnel
Research (SIGCPR) for a Massive Training
Project, involving 50 cities and 1000 students
per year, and fully documented, measured research on selection, training, and performance
within such projects and their graduates.
All these are part of bringing students from the dark
into the light, to help them enter the world of working
and earning peoples, to stand on their own two feet
with self-respect, and dignity: in the words of the
Prophet Micah, "that they shall sit every man under
his vine and under his fig-tree, and none shall make
him afraid."
APPENDIX A
Computer training projects for the disadvantaged
-Brief characterizations and descriptions
1. NEW YORK CITY: CPDA (Computer Professionals Development Association); Fall, 1968. Three
parallel classes of 16 men each, Computer Operators
only; completely volunteer teachers and staff. First
pilot course completed November, 1968, hopefully
leading to a funded and/or at least a teachers-paid
project. Dr. Allen Morton, IBM/SRI, NYC, President.
• Unique aspects: candidates self-selected based
upon own interests after four-session "orientation." Three differing teaching techniques tried:
classical lectures, two- and three-man teams, and
semi "workshop" group involutional approach,
yielding differing results. Strong professional guidance
counseling, heavy tutorial aid, and high school
equivalency training available. 48 started, 32
finished, 17 placed. Cost less than $1,000; jobs
worth $74,000 per year.
2. NEW YORK CITY: MWSDPS (Middle West
Side Data Processing School); with Puerto Rican
community group; a semi-funded project; started with
ACM, then became industry-supported and volunteer.
Summer, 1968. L. Barnett, Long Island University,
Director.
• Unique aspects: started with 164 off-the-street
prospects through advertising, etc. Interviewed
applicants for "logic capability", "motivation"
and language comprehension. Started 19 programmers in COBOL, graduated 14, placed 9.
Started 17 keypunchers, graduated 8, placed
7; started 12 computer operators in IBM 360/DOS,
graduated 6, placed 4. Cost about $14,000, and jobs
worth $108,000 annually.
3. NEW YORK CITY: Harlem; a series of IBM
keypunch operator courses organized by W. DeLegall,
Columbia University Computing Center. Basic lesson
learned: after first course taught basic typing from
scratch as preliminary to keypunch training, subsequent classes required candidates to have 40 wpm
typing skill before entering.
4. LOS ANGELES: ULDPTC (Urban League
Data Processing Training Center), 7226 S. Figueroa
Ave.; jointly sponsored by League, with professional
IBM teachers and IBM donated equipment, in Bank
of America donated building. Urban League both
selected and placed candidates. This is the most professional
and thoroughly equipped computer field training
project for disadvantaged in the country. Supported
completely by private/industrial funding. John O.
Adams, (IBM), Training Director.
• Unique aspects: three parallel courses in keypunching (12 people, four weeks), computer operations (IBM 360/30, DOS, 12 students, six weeks),
programming (COBOL, 12 students, 12 weeks).
Full daytime staff and students; no student stipends. Has full IBM 360/30 with tapes, discs,
printer, card reader/punch, dedicated to project
only (i.e., no production, only classes). Runs two
years, about 250 students per year. Attrition rate
very low (5 percent-20 percent) and placement
rate very high.
5. LOS ANGELES: Maywebb Sciences Corp. Originally a Watts area volunteer project, spearheaded by
Louis Webb. Has graduated programmers primarily.
After two years is offering courses on regular paid
"private EDP school" basis, and to private industry
on government funds.
6. LOS ANGELES: Operation Bootstrap; part of
Watts area self-help in manufacturing and retail
stores. Also started as key punching classes and programming; 47 enrolled in latter, plus remedial training.
Founder: Louis Smith.
7. SAN FRANCISCO: IBM; computer operator
course (EAM + DOS) completely staffed and equipped
by IBM's Branch Office Education Center; concentrated on somewhat older group (average age =
28.5 years, top is 45 years), a good portion of whom
had seen tab equipment before. High percentage of
police records and unemployed. Very successful employment placement. Director, Philip Braverman,
IBM San Francisco.
8. SACRAMENTO: ACM CHAPTER, Education
Committee. Specially planned computer operator
training project, to begin classes in September 1969.
Includes long, detailed market study of job specifications to determine job placement availability and
committed slots for graduates in state government
and private industry. Detailed employer/project interaction; year-long careful planning in all phases (systems analysis approach); use of PMS (Program
Management System) for scheduling; professional,
external structured measures of selection, evaluation
of curricula and training performance, being woven
in from the start; 'Montessori/workshop' group involutional training techniques; paid teachers; possible
student stipends. Totally ACM-directed project, with
industry/government cooperation. Organizer: Elizabeth R. Alexander.
9. BOSTON: MITRE Corp.1; totally funded by
MITRE: the only fully in-house OJT in this survey;
see section on Computer operator curricula.
10. ST. LOUIS: Board of Education and IBM; 11
students, eight weeks, 5.5 hours per day, in programming and operations; 8 out of 11 graduates are
either computer operators or teleprocessing clerks;
1968 Summer project, not being repeated. Funded by
Board of Education (Dr. La~son, Treasurer) and
staffed by regular IBM Systems Engineers, as professional teachers. Students received stipends, and
were chosen from group of 60 not planning to go on to
college and with average grades. Former Director: Ron
Dobies, IBM/DPD, Clayton, Missouri.
11. PHILADELPHIA: ACM and Board of Education; Delaware Valley Chapter ACM in cooperation
with Thomas Edison High School counsellors; an
after-school hours project (two days/week) for 20
volunteer students; paid teachers, two "buddies"
per student volunteer from ACM; runs about 25 weeks;
includes EAM through full 360/DOS operations;
furnishes some OJT at end of course to dovetail with
prospective employers' non-IBM computers. Director,
Milton Bauman, ACM and Price, Waterhouse and
Co., Phila.
REFERENCES
12 J P GILBERT D B MAYER
Experiences in selection of disadvantaged people into a self
computer data processing training program
Proc Seventh Annual Conf on Computer Personnel Research
ACM-SIGCPR (to be published) 1969
13 J BURROWS
Report on Mitre OJT training project
Personal communication to the author
14 W A DE LEGALL
Teaching techniques and quality education/training for the
disadvantaged
Proc Seventh Annual Conf on Computer Personnel Research
ACM 1969 to be published
15 E R ALEXANDER
Montessori techniques applied to programmer training in a
workshop environment
Proc SJCC Vol 34 1969 373-379
16 Data processing training for the underemployed: an evaluation
of an experiment
The Diebold Group Inc 1969 27
17 U S Dept of Labor Bureau of Labor Statistics Nov 1967
on a Census of the Negro Community
18 J EYELER
From welfare rolls to over $i,OOO a year in four months
Datamation Mag April 1969 175-177
19 M BAUMAN
Computers and the underprivileged
Proc SJCC Vol 34 1969 35
20 J J DONOVAN
A program for the underprivileged and the overprivileged in the
Boston community
Proc SJCC Vol 34 1969 36
21 W B LEWIS
What the JOBS program is all about
Proc SJCC Vol 34 1969 37
22 A L MORTON JR
Computers and the underprivileged
Proc SJCC Vol 34 1969 38
23 J SEILER
Experimental and demonstration manpower projects
Proc SJCC Vol 34 1969 38
24 H GRIFFIN P GRAVELLE
Report on the philosophy and mechanics of the urban
education committee of Philadelphia
Proc Seventh Annual Conf on Computer Personnel
Research Assoc for Computing Machinery 1969 to be
published
25 California State Personnel Board: Specifications for computer
operator trainee and for computer operator
1969
26 The Mitre Corp: Example of an in-house computer operator
training using OJT techniques
File Memorandum Outline ACM Committee on Computing
and the Disadvantaged May 1969
27 D B MAYER
A suggested new entry-level job for the disadvantaged:
Shadow programming aide
File Memorandum Job Spec ACM-CCD 1969
The Q approach to problem solving
by J. D. McCULLY
TRW Systems
Redondo Beach, California
INTRODUCTION
The problem of determining derivatives on a digital
computer has received a great deal of attention for
several years. Some exotic systems have been developed
and numerous papers have treated the problem. In
1964 it was suggested by Wengert1 that the chain rule
could be applied to values for the determination of
derivatives.
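The idea of applying the chain rule to values rather than to symbolic expressions can be illustrated with a short sketch. The Python below is not the TRW implementation and makes no claim about how Q or ROP is organized; it is a minimal illustration, under our own naming (`Dual`, `sin`, `f`), in which each intermediate quantity carries both its numerical value and the numerical value of its first derivative with respect to one chosen independent variable.

```python
import math


class Dual:
    """A value paired with its derivative with respect to one chosen variable."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(x):
        # Plain numbers are constants, so their derivative is zero.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule applied to numbers, not to symbols.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def sin(x):
    """Sine with the chain rule applied to the carried derivative."""
    x = Dual._lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def f(x):
    # Hypothetical example function: f(x) = x*x + 3*sin(x).
    return x * x + 3 * sin(x)


# Seed the independent variable with derivative 1 and evaluate once.
y = f(Dual(2.0, 1.0))
print(y.value, y.deriv)
```

A single evaluation of `f` yields both the function value and the derivative at the chosen point; no symbolic expression for the derivative is ever formed.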
This general concept has served as the basis for a
series of programs developed at TRW Systems. It
has been expanded to permit the essentially simultaneous computation of first and second partial derivatives with respect to several independent variables.
Second partials are especially valuable in optimization
problems, and excellent results have been obtained
with this technique. The first program written at TRW
some years ago to apply Wengert's chain rule concept
was called ROP (for Restricted Optimization Program)
and has been used to optimize sets of algebraic equations. After some experience with this program it was
decided that a complete system should be devised to
permit wider application of the technique to problems
where partial derivatives would be of value. The system
was initially named CUE, for Computer Utility for
Engineers, but was recently renamed Q in deference
to another system named CUE.
The intent was to make Q essentially a computer
operating system. On the other hand, it was to be used
within an already existing operating system (SCOPE
2.1) on TRW's CDC 6500 machine without modification to the existing system. A good discussion of this
type of system is found in Glass.2 The consequence
was necessarily some added overhead operating cost,
but it was hoped that two factors would offset this
added cost. One of these factors was the planned machine-independent
characteristic of the Q system
which essentially uses only FORTRAN and FORTRAN
routines (including I-O). In practice, some of the machine-oriented functions of the SCOPE operating
system proved impossible to resist and conversion to
another machine may be less easy than was originally
planned.
The second factor that would make Q attractive
despite the increased machine time was the inclusion
of several unique features in the system. The most important of these features is the above mentioned partial derivatives. Another is dynamic storage, and a
third feature of interest is a macro processor for the
input language. With this feature the system is suitable
for use by the engineer who is more or less familiar with
FORTRAN and wants his job done quickly even at
the expense of some extra machine time.
Sample problems
Before the structure and characteristics of the Q
system are described in detail, it may be useful to
give some examples of the kind of problem for which
it has proved most useful. These examples are taken
from INTRODUCTION to SLANG.3 In general it
can be said that Q is suitable for mathematically complex problems. It has been designed to relieve the user
of most of the complex calculations involved and to
provide him with a short turnaround time that makes
practical a series of alternate approaches or formulations.
As an essential part of making Q user-oriented, a
high-level language called SLANG has been evolved
to allow easy communication with the computer by
engineers with little programming knowledge. For
purposes of the sample problems it is necessary to keep
in mind that the problem statements shown are written in SLANG. The convenience of formulating problems in this way will be apparent.
The first example illustrates the use of SLANG for
solving a typical optimization problem with nonlinear
implicit equations imbedded in the engineering model.
The problem is to minimize the weight of a three-stage
liquid rocket vehicle boosting a payload from the surface of Mars. The optimum values of thrust level and
burn time for each stage are to be determined for the
specified mission. Total burn time, total velocity increment, and payload weight are given. The SLANG
statements required to solve this problem are shown
in Figure 1.
    * VARIABLE ISP(3), ISPVAC(3), TBURN(3), THRUST(3), XIP(3),
    *    WPROP(3), WSTAGE(3), STRFAC(3), DELV(3), MR(3)
    1 READ DATA
    OPTIMIZE WTOT
    INDEPENDENT THRUST(2), THRUST(3), TBURN(1), TBURN(2)
    OLIMITS(FPRIN = 0)
    SOLVE G1, G2
    INDEPENDENT THRUST(1), TBURN(3)
    DLVTOT = 0
    W = WPAYLD
    TBTOT = 0
    DO FOR L = 1 TO 3
    I = 4 - L
    ISP(I) = ISPVAC(I) * (1 - XIP(I))
    WPROP(I) = THRUST(I) * TBURN(I) / ISP(I)
    WSTAGE(I) = 0.0234 * THRUST(I) + WPROP(I)
         + 1.255 * WPROP(I) ** 0.704 + 4
    STRFAC(I) = WPROP(I) / WSTAGE(I)
    W = W + WSTAGE(I)
    MR(I) = W / (W - WPROP(I))
    DELV(I) = GC * ISP(I) * LOGN(MR(I))
    DLVTOT = DLVTOT + DELV(I)
    TBTOT = TBTOT + TBURN(I)
    REPEAT
    G1 = DLVTOT - DELVIP
    G2 = TBTOT - TBTIP
    END LOOP
    WTOT = W
    PRINT VARIABLES
    END LOOP
    GO TO 1
    END
    DATA
    THRUST=5400,1237,317, TBURN=142,127,131, GC=32.174, WPAYLD=
    DELVIP=2.8E4, TBTIP=400, ISPVAC=315,315,315, XIP=0,0,5E-3,
    $END
Figure 1-SLANG formulation of sample optimization
problem

In this problem, the quantity being minimized is
WTOT. The statement
      OPTIMIZE WTOT                                             (1)
identifies the payoff function and establishes an optimization loop which ends with the second END
LOOP card. The statement
      INDEPENDENT THRUST(2), THRUST(3), TBURN(1), TBURN(2)      (2)
designates thrust levels of two stages and burn times
of two stages as independent variables which are being
determined by the optimization. Equations G1 and G2
are being solved to constrain the solution such that
total velocity increment and burn time match specified
values. The statement
      SOLVE G1, G2                                              (3)
identifies the implicit simultaneous equations being
solved and establishes an equation solving loop which
ends with the first END LOOP card. The independent
variables of the SOLVE loop are identified by the
statement
      INDEPENDENT THRUST(1), TBURN(3)                           (4)
Even though they are expressed in terms of intermediate variables, the equations G1 and G2 are equivalent to the ultimate form
      G1 = G1(THRUST(1), TBURN(3))
      G2 = G2(THRUST(1), TBURN(3))                              (5)
The purpose of the SOLVE loop is to find the values
of THRUST(1) and TBURN(3) that satisfy G1 = 0 and G2 = 0. Engine performance and vehicle weight
quantities are computed in a loop beginning with the
statement
      DO FOR L = 1 TO 3                                         (6)
and ending with
      REPEAT                                                    (7)
The equations between these two statements are used
three times, one time for each of the three stages. Two
characteristics of SLANG should be evident from this
example. One is that the SLANG expressions used to
describe the engineering model closely resemble those
of FORTRAN. The other is that numerical algorithms
for optimization and nonlinear equation solving are
invoked using the commands OPTIMIZE and SOLVE.
The total running time for this problem was eight
seconds on the CDC 6500. The printout of the solution
is shown in Figure 2.
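The model equations inside the DO FOR loop are ordinary arithmetic, so the printed solution can be checked by re-evaluating them outside Q. The Python sketch below is only an illustration and is not SLANG: the function name `evaluate_vehicle` is hypothetical, LOGN is assumed to be the natural logarithm, and the inputs are the converged values read from the Figure 2 printout (including a payload weight of 50). At those values the two constraint residuals G1 and G2 should come out close to zero.

```python
import math

def evaluate_vehicle(thrust, tburn, ispvac, xip, gc, wpayld, delvip, tbtip):
    """Evaluate the Figure 1 model; returns (WTOT, G1, G2).

    Lists are indexed 0..2 for stages 1..3. As in the SLANG deck, the loop
    runs from the top stage down (I = 4 - L), accumulating weight upward.
    """
    w = wpayld
    dlvtot = 0.0
    tbtot = 0.0
    for l in (1, 2, 3):
        i = (4 - l) - 1                      # 0-based index of stage I = 4 - L
        isp = ispvac[i] * (1.0 - xip[i])
        wprop = thrust[i] * tburn[i] / isp
        wstage = 0.0234 * thrust[i] + wprop + 1.255 * wprop ** 0.704 + 4.0
        w += wstage
        mr = w / (w - wprop)                 # mass ratio of this stage
        dlvtot += gc * isp * math.log(mr)    # LOGN assumed to be natural log
        tbtot += tburn[i]
    return w, dlvtot - delvip, tbtot - tbtip

# Converged values as printed in Figure 2 (six significant figures).
wtot, g1, g2 = evaluate_vehicle(
    thrust=[5110.94, 1277.46, 328.288],
    tburn=[151.068, 133.079, 115.853],
    ispvac=[315.0, 315.0, 315.0],
    xip=[0.0, 0.0, 5.0e-3],
    gc=32.174, wpayld=50.0, delvip=2.8e4, tbtip=400.0,
)
print(wtot, g1, g2)
```

Because the printed inputs carry only six significant figures, the residuals are small but not exactly zero.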
    Variable Values
    DELV     1.06014E+04   9.30107E+03   8.09754E+03
    DELVIP   2.80000E+04
    DLVTOT   2.80000E+04
    G1       0.
    G2       0.
    GC       3.21740E+01
    I        1.00000E+00
    ISP      3.15000E+02   3.15000E+02   3.13425E+02
    ISPVAC   3.15000E+02   3.15000E+02   3.15000E+02
    L        3.00000E+00
    MR       2.84635E+00   2.50361E+00   2.23222E+00
    STRFAC   8.51072E-01   7.95069E-01   7.14542E-01
    TBTIP    4.00000E+02
    TBTOT    4.00000E+02
    TBURN    1.51068E+02   1.33079E+02   1.15853E+02
    THRUST   5.11094E+03   1.27746E+03   3.28288E+02
    W        3.77865E+03
    WPAYLD   5.00000E+01
    WPROP    2.45111E+03   5.39694E+02   1.21347E+02
    WSTAGE   2.88002E+03   6.78801E+02   1.69824E+02
    WTOT     3.77865E+03
    XIP      0.            0.            5.00000E-03
Figure 2-SLANG printout of results from problem
shown in Figure 1

The second example demonstrates how a SOLVE
loop can be used to match an integration boundary
condition. The complete set of input is shown in Figure
3.

    /SOLID ROCKET ENGINE START-UP TRANSIENT PROBLEM
    /  THE PURPOSE OF THIS PROBLEM IS TO DETERMINE
    /  THE PERCENTAGE OF EQUILIBRIUM CHAMBER PRESSURE
    /  ATTAINED BY AN END BURNING SOLID ROCKET ENGINE
    /  AT A SPECIFIED TIME (TSPEC) DURING ITS STARTUP
    /  TRANSIENT
    /  THE PROBLEM INVOLVES INTEGRATION, BOUNDARY CONDITION
    /  MATCHING, AND HAS A SOLVE LOOP
    READ DATA
    PCEQ = (12/32.174 * RHOP * CSTARO * A * K) ** (1/(1 - N - Q))
    SOLVE CONST
    INDEPENDENT PCSPEC
    FAC = VC / (GAM ** 2 * AT * 12)
    LET TINTEG = INTEGRAL (1 / (CSTARO * PC **
    *      Q * PC * (RHOP * CSTARO * PC ** Q * A *
    *      K * PC ** (N - 1) * 12 / 32.174 - 1)),
    *      PC = PCIG TO PCSPEC IN 10 STEPS)
    TCOMP = FAC * TINTEG
    CONST = TCOMP - TSPEC
    PRINT VARIABLES
    END LOOP
    PERCNT = PCSPEC * 100 / PCEQ
    PRINT VARIABLES PERCNT
    STOP
    END
    DATA
    TSPEC = 0.5,
    PCSPEC = 1500,
    PCIG = 700,
    RHOP = 0.064
    CSTARO = 3320,
    A = 4.4 E-4,
    K = 172.65,
    N = 0.745,
    Q = 0.015,
    VC = 220,
    GAM = 0.66175,
    AT = 0.35,
    $END
Figure 3-SLANG formulation of boundary matching
problem
The expression in the argument of the integration
statement is an equation for dt/dPc (where t = time,
Pc = chamber pressure) during the start up transient
of a solid rocket engine. The problem is to determine
the value of chamber pressure at a specified time. This
value is the upper limit of integration, and is being
computed such that the integrated time (TCOMP)
matches the specified time (TSPEC). That is, when
the value of the constraint, CONST, is zero, the upper
integration limit PCSPEC is the value of chamber
pressure at TSPEC. The final calculation of PERCNT
computes the percentage of equilibrium chamber
pressure, PCEQ, achieved at time TSPEC. PCEQ
is computed from input data. The lower limit of integration, PCIG, is the ignition pressure, and is an input
constant.
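The boundary-matching idea can be tried outside Q with the data of Figure 3. The following Python sketch is an illustration under stated assumptions, not a reproduction of what Q does: Simpson's rule with 10 intervals stands in for the "IN 10 STEPS" integration (the actual quadrature used by Q is not stated here), and plain bisection on the upper limit stands in for the SOLVE loop, which in Q uses Newton-Raphson with partial derivatives. All function names are hypothetical.

```python
# Input constants from the DATA section of Figure 3.
RHOP, CSTARO, A, K = 0.064, 3320.0, 4.4e-4, 172.65
N, Q, VC, GAM, AT = 0.745, 0.015, 220.0, 0.66175, 0.35
PCIG, TSPEC = 700.0, 0.5

def dt_dpc(pc):
    """Integrand of the INTEGRAL statement in Figure 3."""
    burn = RHOP * CSTARO * pc ** Q * A * K * pc ** (N - 1) * 12 / 32.174
    return 1.0 / (CSTARO * pc ** Q * pc * (burn - 1.0))

def simpson(f, a, b, steps=10):
    """Composite Simpson rule (steps must be even); an assumed quadrature."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0

def tcomp(pcspec):
    """Computed time to reach chamber pressure pcspec (TCOMP in Figure 3)."""
    fac = VC / (GAM ** 2 * AT * 12)
    return fac * simpson(dt_dpc, PCIG, pcspec)

def solve_pcspec(tol=1e-9):
    """Find PCSPEC with tcomp(PCSPEC) = TSPEC by bisection below PCEQ."""
    pceq = (12 / 32.174 * RHOP * CSTARO * A * K) ** (1 / (1 - N - Q))
    lo, hi = PCIG, pceq * (1.0 - 1e-9)   # the integrand is singular at PCEQ
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if tcomp(mid) < TSPEC:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), pceq

pcspec, pceq = solve_pcspec()
print(pcspec, 100.0 * pcspec / pceq)     # PCSPEC and PERCNT
```

Bisection is used here only because it cannot overshoot the singularity at PCEQ; it needs no derivative information, which is exactly what the Q system supplies to its own solver.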
Structure of the Q system
The Q system is basically a Compiler/Interpreter
type package with the four major elements of the system shown in Figure 4. The user's input language
(SLANG) is converted by a set of system-supplied
macros into the MODTRAN language. The MODTRAN compiler then converts this language into an
assortment of pseudo instructions and some associated
tables. These are processed by the link editor before
going to the interpreter for execution.
With this system it is possible to omit the macro
processor if the user chooses to write directly in MODTRAN. On the other hand, a user might wish to use
only the macro processor to perform some transformations on BCD data.
The ML/I processor was originally designed by P.
J. Brown4 of Cambridge University, who supplied the
logic to TRW. The processor was converted to FORTRAN with little difficulty, and this version was included in the CUE system for making an initial pass
at the input of non-programmer users. It was found
that the average engineer in a hurry (for whom the
system was designed) was unwilling to take the trouble
of writing his own macros. Ideas for suitable macros
were solicited from potential engineer users, and the
resulting language was christened SLANG. Additions
are continuously being made to SLANG to make it
more useful. At one time it was planned to have four
"dialects" of SLANG of increasing degrees of sophistication, but this idea was abandoned in favor of a
single version.

Figure 4-Basic Q system elements (ML/I macro processor for SLANG, MODTRAN compiler, link editor, interpreter)

An example of how the processor converts SLANG
macros to MODTRAN is shown in Figure 5. It is
worth noting that the writing and debugging of macro
definitions is considerably easier than would be the
modification of the MODTRAN compiler itself. The
programmer need in general be concerned only with
the particular macro definition he is working on, and
both his inputs and his outputs are in BCD.
It was originally planned to incorporate some of
the more popular SLANG variations into MODTRAN,
thus reducing processing time; unfortunately this
project has been continuously postponed because of
more pressing work. The more recent versions of the
Q system allow for relocatable subroutines, which have
served to reduce machine time considerably. Previously
an illusion of subroutines was created by suitable
macros, but it was necessary to process the user's entire input deck each time the equations were modified.
The MODTRAN language bears a strong resemblance to FORTRAN or BASIC, since it was designed
by FORTRAN programmers. Algebraic statements are
essentially the same, and DO loops are provided that
have the same function except that they provide for
backward stepping when desired. Arrays are as in
FORTRAN except that they are limited to two indexes.
READ and WRITE statements are similar, as are
FORMAT statements. All variables are floating point
as in BASIC, and corrections are automatically made
for round-off errors on comparisons.
Some MODTRAN statements are unusual, as for
example EXECUTE label, which will cause a transfer
to the label. When a JUMPBACK statement is encountered, control is transferred to the statement following the EXECUTE label.
The FORTRAN subroutine concept is used in
MODTRAN, but the COMMON method of communicating between subroutines was eliminated in favor
of using the names of the variables themselves to
communicate locations, as in BASIC and other languages. Another provision is that a variable can be
typed as LOCAL to a particular subroutine, permitting
subroutines to be written independently. The FORTRAN concept of calling sequence/argument list is
used for communication between such subroutines,
so that MODTRAN subroutines may be written and
placed in the system library for general use.
The MODTRAN compiler has no provision for
user-written functions (arithmetic or other), which
makes it possible to determine an indexed variable
even though no suitable allocation statement has appeared. When the compiler encounters what appears to
be an array (which could be a misspelled system function), it processes the indices and assumes that by the
time the statement is executed another statement
making the allocation for the array will have been
previously executed. The allocation statement can be
either GLOBAL or LOCAL. For example, the statement:
      GLOBAL X(NROW, NCOL), Y(10), Z                            (8)
will cause the release of any arrays previously associated with X and Y and the allocation of ten words
to Y as well as the generation of an array NROW rows
by NCOL columns for X. Such statements are executable, and once executed will apply to all other subroutines where the variables X and Y appear as globals.
The variable Z in this statement is only given a global
assignment by the compiler and that portion of the
statement is not executable. If the compiler encounters a variable not defined as GLOBAL or LOCAL it
assigns the variable to the nominal category previously
defined by the user (normally GLOBAL).

SLANG
    IF A LE S
    THEN IF A LE 4
    THEN GO TO NODE(A)
    ELSE GO TO ERROREXIT
    REJOIN
    ELSE IF A EQ 0 THEN STOP
    REJOIN ALL
MODTRAN
    IF (A.LE.S) GO TO 10204
    GO TO 10200
    10204 IF (A.LE.4) GO TO 10210
    GO TO 10206
    10210 GO TO 10212
    GO TO 10214
    10212 HCNO=A
    GO TO 20000
    10214 GO TO 10208
    10206 GO TO 10216
    10208 GO TO 10202
    10200 IF (A.EQ.0) GO TO 10222
    GO TO 10218
    10222 CONTINUE
    10218 CONTINUE
    10202 CONTINUE
Figure 5-Example of SLANG/MODTRAN conversion

Generation of partial derivatives
Perhaps the most interesting feature of the Q system
is the way in which partial derivatives are treated.
The MODTRAN language provides for specification
of three levels of partials:
      NO PARTIALS
      FIRST PARTIALS List
      SECOND PARTIALS List                                      (9)
In these statements, List specifies which variables
are to be the independent variables. An INDEPENDENT List statement might also be used for this
purpose. A typical set of statements might be:
      SECOND PARTIALS X, Y, Z
      F = Y*X/Z
      D = F*F                                                   (10)
These statements will cause the dependent variables
D and F to be evaluated and all of the first and second
partial derivatives of these two variables with respect
to X, Y, and Z will be computed. The resulting storage
requirements can become quite large; in the case of
three independent variables one word is required for
the value, three for first partials, and six for second
partials, making a total of ten words (see equation
11). In the case of 15 independent variables 136 words
of storage are required for each dependent variable.
The system tries to hold down the total storage required by returning the partial storage to the free
area wherever possible. We are considering a scheme
to reduce the number of words required in the case of
a dependent variable that is not a function of all the
independent variables.
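The storage figures quoted above follow from counting one word for the value, one per independent variable for the first partials, and one per unordered pair of independent variables for the (symmetric) second partials. A minimal Python sketch of that count, with a hypothetical function name:

```python
def words_per_variable(n_independent, second_partials=True):
    """Words of storage for one dependent variable.

    One word for the value, n for the first partials and, if second
    partials are kept, n(n+1)/2 for the distinct second partials.
    """
    n = n_independent
    words = 1 + n
    if second_partials:
        words += n * (n + 1) // 2
    return words

print(words_per_variable(3), words_per_variable(15))
```

For three and fifteen independent variables this reproduces the ten and 136 words stated in the text.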
The actual operation of computing partial derivatives is carried out by the interpreter in the course of
evaluating the given expressions of the problem. This
evaluation consists essentially of a sequence of operations, which may be unary (performed on a single
variable), for example SIN(X), or binary (performed
on two variables), for example X*Y. The result of an
operation either becomes one of the variables going
into the next operation or, if the sequence is complete,
the result is stored as the answer in the appropriate
location. An operation is performed by the interpreter
causing a transfer to one of the appropriate subroutines. Each subroutine has either one primary input
(unary), or two primary inputs (binary), and a single
output. The inputs (operands) may or may not have
partials, and if they do it may be necessary to compute
only first partials or both first and second partials.
Consider the division operator, for example; either or
both the divisor and dividend may or may not have
partials, leading to four different possible cases. Each
case is different with respect to how the partials of
the resultant variables are computed, and four separate
subroutines have been written for the division operator;
the appropriate subroutine is selected by the interpreter
during the execution of the user's program. If an equation is evaluated several times, it is entirely possible
that a variable may have partials during one evaluation and none during another, in which case the appropriate subroutine would be executed during each
evaluation. At the time that the link edit is performed
every variable is given a core location assignment.
If the variable has no partials then the value associated with the variable is stored in this location. If,
however, during the execution of the model the variable
develops partial derivatives by being a function of
variables which have partials, then a vector is opened
for the variable and the initial location replaced by a
pointer to this vector. As an illustration, consider the
following sample vector for a variable F when there
are three independent variables X, Y, and Z:
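$$\left[\, F,\ \frac{\partial F}{\partial X},\ \frac{\partial F}{\partial Y},\ \frac{\partial F}{\partial Z},\ \frac{\partial^2 F}{\partial X\,\partial X},\ \frac{\partial^2 F}{\partial X\,\partial Y},\ \frac{\partial^2 F}{\partial X\,\partial Z},\ \ldots \,\right] \qquad (11)$$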
All of the variables which have partials will have similar
associated vectors. The independent variables will
each have such a vector where all of the partials are
zero except for the one corresponding to the derivative
of the independent with respect to itself where a value
of one will be stored. When an INDEPENDENT
statement is encountered all of the vectors which
happen to be active at that point are deleted and a
new set of independent vectors set up. As the run
progresses new dependent vectors will be allocated.
In MODTRAN statements for unary operations, the
subroutines tend to be similar except for the three
lines for the evaluation of F, S1, and S2 (see below for
definition of S1 and S2). In the example of Figure 6,
SINX is used as the name of the interpreter subroutine for evaluating the sine of a variable. NUMIND
indicates the number of independent variables, E is
the operand vector, and F is the resultant vector.
There would of course be similar routines for COS,
EXP, TAN, etc., which might appear in the user's
input. In the general case all of these subroutines would
be identical except for F, S1 and S2. Suppose oper
corresponds to the unary operator that is being used,
then F, S1 and S2 can be expressed in general as follows:

$$F = \mathrm{oper}(E), \qquad S1 = \frac{\partial\,\mathrm{oper}(E)}{\partial E}, \qquad S2 = \frac{\partial^2\,\mathrm{oper}(E)}{\partial E^2} \qquad (12)$$

          SUBROUTINE SINX(E,F)
          DIMENSION E(1),F(1)
          COMMON/NUMIND/NUMIND
          M=NUMIND
        1 F(1) = SIN(E(1))
        2 S1 = COS(E(1))
        3 S2 = -SIN(E(1))
          DO 20 K=1,NUMIND
          IF (FIRST) GO TO 20
          S3=S2*E(K)
          DO 10 L=1,K
          M=M+1
       10 F(M)=E(M)*S1+S3*E(L)
       20 F(K)=F(K)*S1
          RETURN
          END
Figure 6-Sample interpreter subroutine for unary
operation

Should it be necessary to evaluate only first partials
then at the time each of the subroutines is executed
the logical variable FIRST will be set to true and the
computing of the second partials will be bypassed.
Binary functions vary considerably, but an example
of this type of function is given in Figure 7 for the
multiplication operation. D and E are the operands
and F is the resultant vector.

          SUBROUTINE MUL(D,E,F)
          DIMENSION D(1),E(1),F(1)
          COMMON/NUMIND/NUMIND
          M=NUMIND
          DO 20 K=1,NUMIND
          DO 10 L=1,K
          IF(FIRST) GO TO 20
          M=M+1
       10 F(M)=D(M)*E(1)+E(M)*D(1)+D(K)*E(L)+D(L)*E(K)
       20 F(K)=D(1)*E(K)+D(K)*E(1)
          F(1)=D(1)*E(1)
          RETURN
          END
Figure 7-Sample interpreter subroutine for binary
operation

Perhaps it would be useful to demonstrate the manner in which the equations of the MUL routine were
derived. Assuming for purposes of explanation that
X and Y are the only independent variables then we
know that

$$F = D \cdot E \qquad (13)$$

Then from any table of derivatives

$$\frac{\partial F}{\partial X} = D \cdot \frac{\partial E}{\partial X} + \frac{\partial D}{\partial X} \cdot E \qquad (14)$$

while

$$\frac{\partial^2 F}{\partial X\,\partial Y} = D \cdot \frac{\partial^2 E}{\partial X\,\partial Y} + \frac{\partial D}{\partial Y}\,\frac{\partial E}{\partial X} + \frac{\partial D}{\partial X}\,\frac{\partial E}{\partial Y} + \frac{\partial^2 D}{\partial X\,\partial Y} \cdot E \qquad (15)$$

The reader should be able to convince himself that
the statement at label 20 on Figure 7 corresponds
to (14) while the statement at label 10 corresponds
to (15). It should also be possible to place these statements in the context of a generalized number of independent variables by referencing equation No. 11.
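The bookkeeping in the SINX and MUL routines can be restated compactly. The sketch below is written in Python rather than FORTRAN and is not the Q interpreter's code: the packed layout (value, then the first partials, then one second partial for each pair k >= l), the helper names, and the 0-based indexing are assumptions chosen for illustration. The unary rule uses S1 and S2 as defined in the text, and the binary rule is the product rule of (14) and (15).

```python
import math

def vector_length(n):
    """Words per variable: value + n first partials + n(n+1)/2 second partials."""
    return 1 + n + n * (n + 1) // 2

def second_index(n, k, l):
    """Slot of the second partial for the pair (k, l), 0-based with l <= k."""
    return 1 + n + k * (k + 1) // 2 + l

def independent(n, which, value):
    """Vector for an independent variable: unit first partial with respect to itself."""
    vec = [0.0] * vector_length(n)
    vec[0] = value
    vec[1 + which] = 1.0
    return vec

def unary(e, n, f0, s1, s2, first_only=False):
    """Chain rule for F = oper(E), given F, S1 = oper'(E), S2 = oper''(E)."""
    f = [0.0] * vector_length(n)
    f[0] = f0
    for k in range(n):
        f[1 + k] = s1 * e[1 + k]
        if first_only:
            continue                      # mirrors the FIRST bypass
        for l in range(k + 1):
            m = second_index(n, k, l)
            f[m] = s1 * e[m] + s2 * e[1 + k] * e[1 + l]
    return f

def sinx(e, n, first_only=False):
    return unary(e, n, math.sin(e[0]), math.cos(e[0]), -math.sin(e[0]), first_only)

def mul(d, e, n, first_only=False):
    """Product rule for F = D * E, first and second partials."""
    f = [0.0] * vector_length(n)
    f[0] = d[0] * e[0]
    for k in range(n):
        f[1 + k] = d[0] * e[1 + k] + d[1 + k] * e[0]
        if first_only:
            continue
        for l in range(k + 1):
            m = second_index(n, k, l)
            f[m] = (d[0] * e[m] + d[m] * e[0]
                    + d[1 + k] * e[1 + l] + d[1 + l] * e[1 + k])
    return f

# Two independent variables X = 3 and Y = 4; form F = X * Y and G = sin(X).
n = 2
x = independent(n, 0, 3.0)
y = independent(n, 1, 4.0)
f = mul(x, y, n)
g = sinx(x, n)
print(f)
print(g)
```

Every other unary operator would reuse `unary` with its own three quantities, which is the point made in the text about the subroutines differing only in F, S1 and S2.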
Tabular functions defined by arrays of input data
are handled by a system routine which fits a polynomial to the data and then assumes that the derivatives of the polynomial correspond to those of the
function. This is of course rather cumbersome and
the results may not be accurate for many functions.
System supplied routines
In addition to the usual system-supplied routines
such as those illustrated above, the Q system attempts
to provide rather elaborate sets of routines which are
called algorithms. These routines should remove some
of the burden from the user of providing a method of solution. They are kept in the Q FORTRAN library
and are called as needed. Since one of the main features of the system is the ability to take partial derivatives, it is not surprising that most of these routines
are built around this capability. The most important
and most frequently used of these algorithms are
called SOLVE, OPTIM, and INTEG.
The SOLVE algorithm makes use of the Newton-Raphson technique in order to drive specified functions to zero. In order to do this it is necessary to
evaluate the first partial derivatives of the functions
and apply correction factors to the independent variables based on this information until the convergence
criterion is met. Since it is possible to obtain first
partial derivatives by numerical techniques this
method of solving functions is rather common. The
partials of the Q system should be more accurate, however, especially in the neighborhood of singularities.
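A Newton-Raphson iteration of the kind described can be sketched in a few lines. The Python below is a generic two-equation illustration, not the Q SOLVE routine: the function name, the calling convention, and the test problem are all assumptions, and the first partials are supplied by the caller where Q would take them from its partial vectors.

```python
def newton_raphson_2x2(funcs, partials, x, y, tol=1e-12, max_iter=50):
    """Drive g1(x, y) and g2(x, y) to zero using supplied first partials.

    funcs(x, y)    -> (g1, g2)
    partials(x, y) -> ((dg1/dx, dg1/dy), (dg2/dx, dg2/dy))
    """
    for _ in range(max_iter):
        g1, g2 = funcs(x, y)
        if abs(g1) < tol and abs(g2) < tol:
            return x, y
        (a, b), (c, d) = partials(x, y)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J * (dx, dy) = -(g1, g2) by Cramer's rule.
        dx = (-g1 * d + b * g2) / det
        dy = (-a * g2 + c * g1) / det
        x, y = x + dx, y + dy
    raise RuntimeError("no convergence")

# Illustrative problem: x + y = 3 and x * y = 2, started near the root (1, 2).
root = newton_raphson_2x2(
    lambda x, y: (x + y - 3.0, x * y - 2.0),
    lambda x, y: ((1.0, 1.0), (y, x)),
    0.5, 3.0,
)
print(root)
```

With exact partials the correction step is limited only by rounding, whereas finite-difference partials add a truncation error of their own; that is the accuracy advantage the text claims for Q.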
The optimization function is initiated by writing
MAXIMIZE, MINIMIZE, or CRITICALIZE followed by the variable to be optimized and an INDEPENDENT statement for the variables the system
will vary in an attempt to find a solution. The partial
derivatives play a major role in this algorithm. Originally the system made use of Lagrangian multipliers in
conjunction with the Newton-Raphson technique for
optimization, but this method has been superseded
by a modified version of rotational discrimination, as
described by Law and Fariss.5
The INTEG algorithm is used to integrate a set of
simultaneous differential equations by a fourth-order
Runge-Kutta method. It can be combined with the
SOLVE algorithm to solve two-point boundary value
problems, as in the second SLANG example given
earlier. In this case the INTEG routine is imbedded
within a SOLVE loop, where the solution to the
SOLVE operation is the end points to match certain
expressions. Other routines are available to save and
restore partial derivatives, to add and delete independent variables, to input or printout all global variables,
etc.
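For reference, the classical fourth-order Runge-Kutta step for a set of simultaneous equations looks as follows. This Python sketch is generic and makes no claim about how INTEG is coded; the function name and the fixed step size are assumptions for illustration.

```python
import math

def rk4_integrate(deriv, t0, y0, t_end, steps):
    """Integrate dy/dt = deriv(t, y) for a list-valued state y with classical RK4."""
    h = (t_end - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        k1 = deriv(t, y)
        k2 = deriv(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = deriv(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = deriv(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

# Simultaneous pair u' = v, v' = -u with u(0) = 1, v(0) = 0 (solution u = cos t).
u, v = rk4_integrate(lambda t, y: [y[1], -y[0]], 0.0, [1.0, 0.0], math.pi, 100)
print(u, v)
```

Embedding such an integrator inside a root-finding loop on an end condition gives the two-point boundary value scheme described above.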
Implementation of the system
As implemented on the CDC SCOPE system Q
requires two back-to-back executions under SCOPE
with a compilation by the SCOPE FORTRAN compiler separating the executions. The user need not be
aware of these efforts in his behalf, however, as he submits one job and gets one output. It is even possible
to place the SCOPE control cards necessary to run
the Q system onto a file, along with the various other
files required by the system, so that the user need only
see a few of the SCOPE control cards.
In the first execution under the Q system a basic
monitor surveys the user control cards to determine
the objective of the decks which the user supplies.
Thus in one run the user might have some SLANG
decks to be sent via the ML/I processor to the MODTRAN compiler, some MODTRAN decks which
would go directly to that compiler, some FORTRAN
decks for compilation by the SCOPE FORTRAN
compiler when it is called in between the executions,
and perhaps even some FORTRAN and/or MODTRAN relocatable decks. Control cards are intermixed
with other input and some action is normally taken
immediately with the cards following a control card
until the next control card is encountered. Sometimes
a set of cards is sent directly to a processor, such as a
MODTRAN deck going to the MODTRAN compiler,
while in other cases it is necessary to place the deck
on a file for later processing, such as a FORTRAN
deck. The flow of operations is shown in Figure 8.
Once all the user's input except for data cards has
been read and either processed or assigned, the link
editor is called in to tie together the various MODTRAN routines. The link editor assigns all of the
variables to their final locations in the data portion
of the bucket and performs the required relocation of
the pseudo instructions. An attempt is made to satisfy
all of the externals referenced from MODTRAN
routines with MODTRAN entry points, including
a search of the Q MODTRAN library file. The references
which are still unsatisfied are assumed to be for FORTRAN routines and a search of the Q FORTRAN
library file is performed. Any routines found there are
pulled off for loading by the next execution. At this
point the link editor writes a FORTRAN routine
which will be compiled by the CDC FORTRAN
compiler. This routine consists of a computed GO TO
followed by a call to each of the routines which it has
determined are FORTRAN. For example it might
write:
          SUBROUTINE CALLI
          COMMON/N/N
          GO TO (1,2),N
        1 CALL INTEG
          RETURN
        2 CALL BROP
          RETURN
          END                                                   (15)
Actually the routine CALLI will be more complicated
than this example, since the user is allowed to have
arguments to these FORTRAN routines. The basic
concept is, however, that this is the manner in which
it is made possible to call a FORTRAN routine from a
MODTRAN routine. Should the user, for example,
write
      CALL INTEG                                                (16)
in MODTRAN he will in actual fact be calling subroutine CALLI with N set equal to 1. Since the
routine CALLI is placed in the input stream to the
FORTRAN compiler the user receives a listing of
this routine in the middle of his output. It was not
deemed worthwhile to try to suppress this listing,
since the user might very well be compiling some of
his own routines on the same call to the FORTRAN
compiler.

Figure 8-Flow diagram of the Q system

After the link editor has relinquished control to
the FORTRAN compiler and that processor has completed its task, the second execution of the user's job
begins. This consists of a loading of the Q interpreter
and all of the FORTRAN routines which have been
collected by the link editor on the previous execution
and placed on the FORTRAN relocatable file. The
nature of this core load varies radically depending on
what the user requires. Control is initially passed to
the main MODTRAN routine but after that the user
is on his own.
During execution of a MODTRAN routine, the
pseudo instructions put out by the MODTRAN compiler are being interpreted. As is usual with interpretive
schemes, quite a bit of control can be exercised in
making sure that the user is not getting into trouble
and in taking some appropriate action when he attempts to do something which would be improper.
There are three user data areas in the Q system:
variables, arrays, and partials. The three areas are
rather heavily intertwined with pointers, a pointer
being distinguishable from a value by the fact that
it is a positive integer while a value is a normalized
floating point number. Initially only the variable
area is assigned (by the link editor) and the interpretation of the user's program causes the buildup of the
other two areas. Thus suppose the user says
      GLOBAL X(10)                                              (17)
where X was previously only a value. An array will be
opened in the array area and the location at which X
was assigned will be replaced by a pointer to the array.
A double tag system as described by Knuth6 is used
for the allocation of arrays, a system which allows a
good method of returning variable length arrays to
the free area. Two more words are used to specify the
dimensions on the array, causing the use of four words
in addition to the actual size of the user's array. When
an array is released, the two words which were used
for indexing are replaced by linking pointers to facilitate the search for free areas of adequate size. The user
of course need not be aware of this process when he
opens or closes an array.
It is also frequently the case that a variable will
not only have a value associated with it but will have
some partial derivatives. In this case the location of
the variable, or the indexed location within its array,
is replaced by a pointer into the partial area. At the
location in the partial area the value and the associated partials are stored. Some rather complicated
chaining-down pointers may result before the desired
location is finally achieved; but normally if the user
is taking partials he will be spending most of his time
during execution doing just that, computing partials,
and the time spent on pointers will be relatively small.
It was also necessary to make some provision for returning these partial vectors to the free area, but this
is a rather simple matter since all of these vectors
are of the same length.
Additional complications are entered into the system
when the user performs such operations as saving
partials and beginning a new set. This is basically
performed by closing off the current partial area and
opening up a new one. A swapping of pointers with
values occurs so that the partials can be restored later.
SUMMARY
No claims are made that the Q system is a direct challenge to other computer systems. It does, however,
offer an approach to some rather difficult problems.
As was pointed out earlier, it is easy to introduce
modifications into the SLANG language, a characteristic which is not common to programming languages. It is also rather easy to introduce new algorithms into the system, thereby expanding its
problem solving capability. It is hoped that the Q
system constitutes a basis for further development
along these lines, since the user is frequently denied
this flexibility in a computer system.
ACKNOWLEDGMENTS
Bob Kennedy helped in the preparation of this
article and Dave Adamson provided the sample
SLANG problems along with the discussion of them.
REFERENCES
1 R E WENGERT
A simple automatic derivative evaluation program
CACM Vol 7 1964 463-464
2 H L GLASS
An elementary discussion of compiler/interpreter writing
Computing Surveys 1 1969 55-77
3 D ADAMSON
Introduction to SLANG
TRW Doc 99900-6672-HO-OO 1968
4 P J BROWN
Macro processors and their use in implementing software
Thesis Univ Math Lab Cambridge England 1968
5 V J LAW R H FARISS
Rotation discrimination for optimization with limits on the
variables
Preprint 19B Second Joint AICHE-IIQPR Meeting
May 19-22 1968 Tampa Fla
6 D E KNUTH
The art of computer programming
Addison Wesley Publishing Co Vol 1 Chap 2 1968 p 442
Self-contained exponentiation*
by NANCY W. CLARK and W. J. CODY
Argonne National Laboratory
Argonne, Illinois
INTRODUCTION
The traditional implementation for floating-point
exponentiation, x raised to the y power, is to compute
exp(y ln(x)) using standard subroutines for the
logarithm and the exponential function. While it is
possible to provide extremely accurate subroutines
for these latter functions, we shall shortly see that
this is seldom done. Even in those rare cases where
excellent subroutines are available, the exponentiation
routine, for sound theoretical reasons, is poor. In this
paper, we present brief statistics indicative of the
quality of these three subroutines in the basic Fortran
libraries provided by various manufacturers, a detailed error analysis for exponentiation, and a method
for exponentiation via self-contained subroutines.
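One way to see why the composite exp(y ln(x)) is sensitive: if the intermediate product w = y ln(x) carries a small relative error, the final exponential turns it into a relative error roughly |w| times larger. The Python sketch below illustrates that amplification numerically; it is not the authors' error analysis, the function name is hypothetical, and the perturbation of 2^-24 is chosen only to resemble one rounding error in a 24-bit mantissa.

```python
import math

def amplification(x, y, rel_err):
    """Relative error in exp(w) when w = y*ln(x) carries relative error rel_err."""
    w = y * math.log(x)
    exact = math.exp(w)
    perturbed = math.exp(w * (1.0 + rel_err))
    return (perturbed - exact) / exact, w

rel, w = amplification(2.0, 100.0, 2.0 ** -24)
print(rel / 2.0 ** -24, w)   # observed magnification next to w = y*ln(x)
```

For x = 2 and y = 100 the magnification is close to 69, so a single rounding error in the logarithm can cost about six bits in the result.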
In the following discussion we will use the term
exponentiation to refer to x^y where we will always assume x > 0. The term exponential will refer to c^y
where c is a fixed constant base, usually either 2 or e.
* Work performed under the auspices of the U. S. Atomic Energy
Commission.

The present situation
With the cooperation of a number of different individuals and computing centers, we ran some simple
tests on the exponential, logarithm and exponentiation subroutines in the basic Fortran libraries on eight
different computers representing six different manufacturers. The only version of the single-precision
library on the CDC-3600 available to us contained
routines we had written according to the methods to
be described and does not necessarily represent the
manufacturer's library. We also tested our own version
of the library for the IBM S/360 in addition to the
standard library.
These tests were not intended to be complete certifications of the routines tested, but were designed to
lightly probe areas where such subroutines are most
likely to have trouble. The tests consisted of computations with a series of arguments exactly representable
in binary notation. The corresponding function values
were output in octal or hexadecimal form and compared
against similar computations in 96-bit arithmetic on
a CDC 6400. The computations involved were:
      exp(n),   n = 40(1)88,
      ln(x),    x = .25(.015625)2.0,
      x**y,     (x, y) = (2^n, 22 - n) and (4^n, 11 - n/2),         n = 0(1)22,
                (x, y) = (2^n, 44 - 4n) and (.75 x 2^n, 46 - 4n),   n = 1(1)11.
The test results are summarized in Table I.
Certain of the computers used have either octal or
hexadecimal floating-point arithmetic. On these computers, a mantissa can be properly normalized and
still have the first two or three bits zero. This accounts
for the apparent tabular discrepancies between the
sum of the maximum number of bits in error and the
minimum number of correct bits, and the total number
of bits in the mantissa on these machines.
TABLE I-Accuracy Test Results

                                 Single-Precision      Double-Precision
Machine and Subroutine              M      N              M      N

Burroughs B-5000                 (39 bit mantissa)     (78 bit mantissa)
  EXP                               9     30              8     69
  LN                                3     35              7     71
  X**Y                              7     31             11     67
Control Data 3600                (36 bit mantissa,     (84 bit mantissa,
                                  Argonne library)      CDC library)
  EXP                               1     35              4     80
  LN                                2     34              5     79
  X**Y                              1     35              8     76
Control Data 6400                (48 bit mantissa)
  EXP                               1     47
  LN                                2     46
  X**Y                              7     41
G.E. 225                         (30 bit mantissa, FIZMOP system)
  EXP                               3     27
  LN                               12     18
  X**Y                             10     20
G.E. 645                         (27 bit mantissa)     (63 bit mantissa)
  EXP                               1     26             14     49
  LN                                4     23              4     59
  X**Y                              1     26             14     49
IBM 360/75, IBM library          (24 bit mantissa)     (56 bit mantissa)
  EXP                               1     21              7     49
  LN                                3     20              3     52
  X**Y                             10     14             10     46
IBM 360/75, Argonne library      (24 bit mantissa)     (56 bit mantissa)
  EXP                               1     21              1     52
  LN                                2     21              2     52
  X**Y                              2     21              1     52
SDS Sigma 7                      (24 bit mantissa)     (56 bit mantissa)
  EXP                               4     20              8     48
  LN                                4     19              4     50
  X**Y                              8     15              8     46
Univac 1107                      (27 bit mantissa)     (54 bit mantissa)
  EXP                               2     25              4     50
  LN                                6     21              7     47
  X**Y                              6     21             10     44
Univac 1108                      (27 bit mantissa)     (60 bit mantissa)
  EXP                               2     25              8     52
  LN                                6     21              6     54
  X**Y                              8     19              9     51

M = maximum number of bits in error.
N = minimum number of correct significant bits.

We will show presently that accuracy in exponentiation depends
very heavily on the accuracy in the calculation of the exponential
function. Note, however, that even with a good exponential
function, as is apparently the case in the single precision CDC 6400
and the original IBM 360 libraries, the exponentiation
routine can still be in error by two to three significant
decimal places or more. Also note that the exponentiation
routines corresponding to our methods as well
as the single-precision routine on the G.E. 645 display
primarily round-off error in these tests.
Error analysis
There are two major types of error in any function
subroutine. The first is transmitted error, i.e., error
due to small errors in the arguments. If we assume
$$z = f(x)$$
where $f(x)$ is differentiable, then
$$\delta z \approx \frac{x f'(x)}{f(x)}\, \delta x \tag{1}$$
where
$$\delta z = \Delta z / z \approx dz / z \tag{2}$$
denotes the relative error in $z$, and $\Delta z$ denotes the
absolute error in $z$. It is clear that the transmitted
error, $\delta z$, depends solely on the inherited error, $\delta x$,
and not on the subroutine. The second type of error
is generated error, i.e., that error generated by the
computational process. This includes both errors due
to truncating an infinite process at some finite point
and roundoff errors.
Even infinitely precise subroutines have no control
over inherited error. Therefore, in designing subroutines
we assume there is no inherited error and seek to
minimize the generated error.
Now let us consider the logarithm-exponential
method for exponentiation. We use the relation
$$x^y = c^w, \qquad x > 0, \tag{3}$$
where
$$w = ys$$
and
$$s = \log_c(x).$$
From (1) and (2), and recalling our assumption that
$\Delta x = \Delta y = 0$, we see
$$\Delta w = y\, \Delta s$$
where $\Delta s$ represents only the generated error from the
logarithm computation.
If
$$u = c^w$$
then
$$\delta u = \ln c \; \Delta w + \delta_G(w) \tag{4}$$
where $\delta_G(w)$ denotes the generated relative error from
the exponential computation. For good exponential
routines $\delta_G(w)$ affects only the least significant one
or two bits of $u$. Thus, the relative error in the exponentiation
is essentially proportional to the absolute
error in $w$. Clearly, we want to minimize $\Delta w$ as it
appears to the exponential routine.
There are two major contributions to this error:
the generated error from the logarithm calculation,
and the finite word length of the computer. The
second is by far the more important of the two. Suppose
the floating-point mantissa of the calculator
contains $2t$ significant bits, but $w$ is of the order of
$2^t$. Then the floating-point representation of $w$, the
argument to be passed to a standard exponential routine,
may have a rounding error as large as $2^{-t}$, i.e., $\Delta w \approx 2^{-t}$.
Consequently, $u$ may be accurate to only about $t$ bits
independently of the accuracy of the logarithm calculation.
This is the reason some of our tests found inaccurate
exponentiation even though the logarithm
and exponential routines appeared to be reasonably
accurate.
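The effect described by Eq. (4) can be checked numerically. The following is a minimal sketch, not part of the original routines: it assumes IEEE double precision (a 53-bit mantissa rather than the machines of Table I), takes c = e, and uses the hypothetical helper names `rounding_error_bound`, `relative_error_from_w` and `bits_lost`. It requires Python 3.9 or later for `math.ulp`.

```python
import math


def rounding_error_bound(w: float) -> float:
    """Half a unit in the last place of w: the largest absolute error
    introduced merely by storing w as a double."""
    return math.ulp(w) / 2


def relative_error_from_w(w: float, dw: float) -> float:
    """Relative change in u = e**w caused by an absolute change dw in w.
    Eq. (4) with c = e predicts that this is close to dw itself."""
    u = math.exp(w)
    return (math.exp(w + dw) - u) / u


def bits_lost(w: float) -> float:
    """Low-order bits of u that Eq. (4) predicts are spoiled by the
    rounding of w alone (assumption: 53-bit mantissa)."""
    unit_roundoff = 2.0 ** -53
    return max(0.0, math.log2(rounding_error_bound(w) / unit_roundoff))


if __name__ == "__main__":
    for w in (1.0, 16.0, 400.0):
        print(w, bits_lost(w))
```

The larger the magnitude of w, the more bits of the result are beyond the control of even a perfect exponential routine, which is the behavior the accuracy tests exposed.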
A new approach
There are at least two alternatives to the traditional
computation. One is to resort to "overkill" by carrying
out the traditional computation in a higher precision
arithmetic. This is expensive in time; it is easy to do
for single-precision routines, but difficult for double
precision routines. (Is this the approach on the G. E.
645?) The second alternative is to raise the status
of exponentiation routines. At the moment they are
considered to be secondary routines which call upon
the primary routines for the exponential and logarithm.
We propose that they become primary, self-contained
routines with possible secondary entry points for the
exponential and logarithm.
If we accept this major reversal in philosophy, we
free the computation of several restrictions. For
example, we need not pick c = e in Eqs. (3) and (4),
but can make the choice c = 2 which appears most
natural for a computer, and which introduces the
factor ln 2 = .69315 in Eq. (4). This permits us to
obtain extra significance in the results of the logarithm
computation, as we shall shortly see, and to retain
this significance throughout the remainder of the calculation.
The first implementations of the algorithm we will
outline were programmed using single-precision fixed-point
arithmetic to do single-precision exponentiation
on both the CDC 3600 and the IBM 360 computers.
Because neither computer allows efficient double-precision
fixed-point arithmetic, the algorithm has to
be modified to use double-precision floating-point
arithmetic to do double-precision exponentiation. So
that the presentation will not be too abstract, we will
present basically the algorithm as used on the IBM
360 in double-precision. Modifications for single-precision
floating point or fixed point versions, or for
other machines should be obvious.
We first reduce the range over which the logarithm
must be approximated. Let
$$x = 2^k \cdot m, \qquad 1/2 \le m < 1,$$
and choose
$$b = n/16,$$
$n$ an odd positive integer less than 16, such that
$$x = 2^{k-b}\, m/a$$
where
$$a = 2^{-b}, \qquad |\log_2 (m/a)| \le 1/16.$$
Then
$$s = \log_2 x = s_1 + s_2$$
where
$$s_1 = k - b, \qquad s_2 = \log_2 (m/a) = \log_2 \frac{1+z}{1-z}$$
and
$$z = \frac{m-a}{m+a}.$$
Since $z$ is quite small ($|z| \le .022$), $s_2$ is easily computed
to full floating-point accuracy using a low order rational
approximation, or even the first few terms of the Taylor
series, provided $z$ is computed accurately. (A little
extra care is necessary at this point in base 16 floating-point
but we will not go into the details here.) Since $x$
is assumed to be exact, $m$ is exact and we can achieve
full precision in $m - a$ by breaking the constant $a$ into
two parts such that
$$a = a_1 + a_2$$
to the precision desired and such that the exponent on
$a_2$ is much less than that on $a_1$. Then the computation
$$(m - a_1) - a_2$$
will retain the low order bits of $a$. Normal floating-point
can be used for the rest of the evaluation of $z$.
Note that by carrying $s_1$ as one floating point number,
and $s_2$ as another, we have rather painlessly
achieved a logarithm accurate to well beyond usual
working precision. Since $|s_2| \le 1/16$, the absolute error
in $s$ is now about $2^{-4}$ times the normal relative error in
floating point. Careful multiplication of $s$ by $y$ will
minimize the crucial quantity $\Delta w$. At this point, the
usefulness of fixed-point arithmetic with the extra
significant bits in the representation of a number is
apparent. When such arithmetic is not available, as
we have assumed is the case, it is necessary to arrange
the floating-point computations to achieve the extra
significance at minimal cost. This is done as follows.
Let us say we reduce a number $z$ when we write it
in the form
$$z = z_1 + z_2$$
such that $16 z_1$ is the integer part of $16z$. Essentially,
then, $s$ is already in reduced form. We compute the
exponent $w$ in reduced form by writing
$$y = y_1 + y_2$$
where $y_1$ and $y_2$ are the double-precision representations
of the most significant and least significant halves
of $y$ respectively, and forming the products $s_1 y_1$, $s_2 y_1$,
and $s y_2$. Each of these quantities is again reduced and
the results combined to form the reduced
$$w = w_1 + w_2.$$
Now $w_1$ is of the form
$$w_1 = t + j/16$$
where $t$ and $j$ are integers. We then finally compute the
exponential value
$$u = 2^w = 2^t \cdot 2^{j/16} \cdot 2^{w_2}. \tag{5}$$
Since $|w_2| \le 1/16$, a Taylor series computation of the
exponential is quite efficient, although we used rational
Chebyshev approximations. The quantities $2^{j/16}$ can be
carried in a table. In fact, if Eq. (5) is rewritten as
$$u = 2^{t+1} \cdot 2^{-n/16} \cdot 2^{w_2}$$
(with $n = 16 - j$) and the quantities $2^{-n/16}$ are tabulated,
the same table can be used for the constant $a$ needed in the
logarithm computation. This dictates the form of the earlier
decomposition of $a$ into $a_1$ and $a_2$. Clearly $a_1$ should be
the value of $a$ correctly rounded to working precision
while $a_2$ becomes a positive or negative correction term.
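The steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' program: it uses IEEE doubles, a truncated Taylor series instead of the rational Chebyshev approximations, and exact rational arithmetic (`fractions.Fraction`) as a stand-in for the careful splitting of y into halves. The correction term a2 is omitted, so the logarithm does not gain the extra bits the paper obtains. The names `POW2_16THS`, `log2_split` and `self_contained_pow` are hypothetical.

```python
import math
from fractions import Fraction

LN2 = math.log(2.0)
# Table of 2**(j/16), j = 0..16 (the paper tabulates 2**(-n/16) instead).
POW2_16THS = [2.0 ** (j / 16.0) for j in range(17)]


def log2_split(x: float):
    """Return (s1, s2) with log2(x) ~ s1 + s2, where s1 = k - n/16 is an
    exact multiple of 1/16 and |s2| is at most about 1/16."""
    if not x > 0.0:
        raise ValueError("x must be positive")
    m, k = math.frexp(x)                      # x = 2**k * m, 1/2 <= m < 1
    # Odd n such that a = 2**(-n/16) lies within a factor 2**(1/16) of m.
    n = next(n for n in range(1, 16, 2) if m >= 1.0 / POW2_16THS[n + 1])
    a = 1.0 / POW2_16THS[n]                   # a1 only; a2 is omitted here
    z = (m - a) / (m + a)                     # |z| <= .022
    z2 = z * z
    # log2((1+z)/(1-z)) = (2/ln 2) * (z + z**3/3 + z**5/5 + ...)
    series = z * (1.0 + z2 * (1 / 3 + z2 * (1 / 5 + z2 * (1 / 7 + z2 / 9))))
    return Fraction(16 * k - n, 16), 2.0 * series / LN2


def self_contained_pow(x: float, y: float) -> float:
    """x**y computed as 2**t * 2**(j/16) * 2**w2, following Eq. (5)."""
    s1, s2 = log2_split(x)
    w = (s1 + Fraction(s2)) * Fraction(y)     # exact product, no rounding of w
    q = math.floor(16 * w)                    # w1 = q/16 = t + j/16
    t, j = divmod(q, 16)
    w2 = float(w - Fraction(q, 16))           # 0 <= w2 < 1/16
    return math.ldexp(POW2_16THS[j] * 2.0 ** w2, t)


if __name__ == "__main__":
    print(self_contained_pow(2.0, 10.0), self_contained_pow(3.0, 0.5))
```

Because w is never rounded to working precision as a single number, the error passed to the exponential step grows with the magnitude of y rather than with the magnitude of w, which is the point of the reduced form.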
TABLE II-Random argument tests on conventional double-precision X**Y on IBM 360/75

    Argument Range                       Frequency of Bit Errors
                                           No. of bits in error                                  Max. Rel.   RMS Rel.
  x                y             0    1    2    3    4    5    6    7    8    9   10  other      Error       Error
  (1/16, 16)       (-4, 4)     272  467  405  371  240  197   47    1    0    0    0    0      1.25E-15    3.65E-16
  (2^-16, 2^16)    (-16, 16)    78  123  153  168  247  377  321  294  195   44    0    0      8.82E-15    2.70E-15
  (2^-32, 2^32)    (-8, 8)      80  109  131  152  216  288  295  234  241  120   86   48      5.08E-14    9.60E-15
  (2^-64, 2^64)    (-4, 4)      57   95  115  126  161  215  293  352  303  192   82    9      2.68E-14    6.97E-15
  (2^-8, 2^8)      (-32, 32)    59   90  115  109  199  312  406  343  253  107    7    0      1.40E-14    4.02E-15
  (1/16, 16)       (-64, 64)    60   96  110  128  196  275  318  318  281  167   48    3      1.95E-14    5.73E-15

Average execution time for (x, y) random in (0, 1) = 195 μsecs.

TABLE III-Random argument tests on self-contained double-precision X**Y on IBM 360/75

    Argument Range               Frequency of Bit Errors
                                   No. of bits in error        Max. Rel.   RMS Rel.
  x                y              0     1     2     3     4    Error       Error
  (1/16, 16)       (-4, 4)     1301   677    22     0     0    2.22E-16    6.24E-17
  (2^-16, 2^16)    (-16, 16)   1206   759    35     0     0    2.22E-16    6.11E-17
  (2^-32, 2^32)    (-8, 8)     1314   667    19     0     0    2.22E-16    5.81E-17
  (2^-64, 2^64)    (-4, 4)     1350   634    16     0     0    2.21E-16    5.44E-17
  (2^-8, 2^8)      (-32, 32)   1097   812    89     2     0    2.22E-16    6.31E-17
  (1/16, 16)       (-64, 64)    872   823   250    52     3    2.22E-16    6.94E-17

Average execution time for (x, y) random in (0, 1) = 180 μsecs.
Since the last two factors in $u$ are each less than unity
in magnitude, and the $2^{t+1}$ factor affects only the
floating-point exponent, we see that the construction
of $u$ from its factors is a stable process. Note that the
error $\Delta w$, hence by Eq. (4) the error $\delta u$, now depends
primarily on the magnitude of $y$. Using Eq. (4), and
noting that we have gained an extra four bits in our
calculation of $s$, we see that $y$ must be greater than
roughly 32 before the inaccuracies in $w$ become large
enough to greatly affect $\delta u$. To verify this point, and
to provide an in-depth comparison of our method and
of the traditional computation, we have subjected our
routine for the IBM 360 and the original IBM routine
to a full certification as described in references one
and two. The results, for identical tests, are presented
in Tables II and III.
One final word about the fixed point version of
this algorithm. In fixed point, the extra bits over the
normal floating point mantissa length are already
available. As we have indicated, the decomposition of
$a$ and $y$ and the reduction of $s$, $w$, etc. are no longer
necessary. This constitutes a savings in storage as
well as in the number of instructions to be executed.
But no matter which approach is taken, the fixed
point or the floating point, the self-contained routine
can be expected to be competitive timewise with the
traditional routine because we have saved the overhead
of linking with other subroutines. All three of
the self-contained programs we have written
are actually faster than their traditional counterparts.
The price is paid in terms of storage. This price can
be minimized by incorporating entries for the exponential
and logarithm routines into the exponentiation
routine, thus eliminating separate routines for
the former.
ACKNOWLEDGMENTS
We would like to acknowledge the assistance of J. Boyle,
P. Businger, L. Fosdick, C. Hammer, J. Pagels, R.
Royston, H. C. Thacher, Jr., J. F. Traub and the
computing centers at Argonne, Bell Telephone Laboratories,
University of Illinois, Northwestern University,
Univac, and the University of Notre Dame for their
assistance in the tests reported in Table I.
REFERENCES
1 W J CODY
Performance testing of function subroutines
Proc SJCC 1969 759-763
2 N CLARK W J CODY K E HILLSTROM
E A THIELEKER
Performance statistics of the Fortran IV(H) library for the
IBM system/360
Report ANL 7321 Argonne Natl Lab 1967
DCDS digital simulating system *
by H. POTASH, A. TYRRILL, D. ALLEN,
S. JOSEPH, and G. ESTRIN
University of California
Los Angeles, California
* This research was supported in part by the Atomic Energy
Commission AT(11-1) Gen 10 Project 14, and the Office of Naval
Research, Information Systems Branch, N00014-67-A-0111-0016.
INTRODUCTION-SIMULATION SYSTEMS
To see a world in a grain of sand
And a heaven in a wild flower,
Hold infinity in the palm of your hand
And eternity in an hour.
-William Blake
This article is concerned with the problems of digital
simulation and describes methods used in the Digital
Control Design System (DCDS) [1] for the simulation
of digital structures. The paper is divided into five
parts:
• A short introduction to DCDS, its structure and
purposes.
• A discussion of simulation techniques, entities and
attributes.
• The DCDS pseudo machine simulator.
• The pseudo machine program.
• A simple example of a DCDL program.
DCDS, its structure and purposes
The Digital Control Design System (DCDS) was
developed at the University of California at Los
Angeles to aid in the design and architecture of computer
systems. The design system operates under the
following assumptions:
1. A set of basic building blocks whose properties
are known is available.
2. An instruction set or task assignment for the
computer system is defined along with cost and
performance constraints.
3. Using his experience and intuition, the designer
generates an ensemble of modules. These modules
form the system's building blocks which the
designer believes will perform the stated functions
effectively.
Given the above (1-3), the digital system must
be describable to a design aid system. The designer
then needs a language, its translator, and an operating
system with the following properties:
4. The set of functions to be performed can be
described.
5. The building blocks, their interconnection, and
their place and function within the ensemble
can be described.
6. A computer program can generate a fabrication
description of control modules capable of going
through a sequence of states necessary to have
the system perform the above functions. The
designer may specify synchronous or asynchronous
control systems.
7. A simulator can accept the descriptions in (4)
and (5), and the sequence description generated
in (6), and produce measures of accuracy and
performance.
8. If the performance of the ensemble is "good",
the description of the computer system is in
such form that it may be fed into a more detailed
design process. If not, the designer may
alter his architecture.
To satisfy the above needs, Digital Control Design
Language (DCDL) has been implemented as part of
design automation research being conducted at the
University of California at Los Angeles [2-6]. A compiler
for DCDL has been implemented for the SDS SIGMA
7 using a META 5 compiler writing system [6,7]. The
DCDL compiler is currently also being implemented
for the IBM/360.
The DCDL system illustrated in Figure 1 contains
two compiler processors written in META 5, a pseudo
machine (which is the subject of this paper) written in
FORTRAN IV and the machine language, and two
control implementation modules written in FORTRAN
IV. The input processor is a DCDL syntactic analyzer (1);
this program translates the digital system description
(example in Part IV) into an interpretive code
used by the pseudo-machine for simulation of the
described hardware. The second META 5 processor
(2) produces a numerical code which is then transformed
into a binary control program and a fabrication description
of a control subsystem for the computer system
being designed. The implementation specifications for
the wiring of the control matrices are produced by the
two FORTRAN IV programs (3,4). Control modules
implied by microprograms have their wiring lists
automatically generated by the Control Matrix Processors
in DCDS. The hardware construction of the control
processor is then effected by using a set of one or more
similar building blocks (Control Matrix Building
Blocks), according to wiring specifications given by
DCDS.
The software module described in Part III is a pseudo
machine (5) in charge of executing simulation runs.
The pseudo machine is composed of a combination of
FORTRAN IV and machine language subroutines. The
simulation runs are designed to check test cases in
order to assess the validity of a described design as well
as to calculate its estimated execution time.
DCDS is designed to analyze asynchronous as well
as clocked systems, with the former posing a special
problem: dynamic reevaluation of variables. Any time
a logical variable is changed, the system must, as a
consequence of this change, reevaluate any other
variable which is a function of the changed variable.
This process must continue until no further "consequential
changes" occur.
DCDS's capability to dynamically reevaluate variables
allows the designer to describe his system using
the same logical equations and timing relations which
he uses to implement it. Programming in a form (see
Part IV) which is highly related to the actual hardware
provides for a system directly used by the designer,
eliminating the programmer as a "middle-man". This
direct correspondence also makes the DCDL program
an up-to-date documentation of the system designed.
The syntax analyzer accepts a description which images
the hardware and translates this description into
simulation code. Thus the designer is freed from the
tedious job of programming the structure of the model
required, a process sometimes more time-consuming
than building a hardware prototype and testing it on
the bench.
The Digital Control Design Language (DCDL)
is built as a cluster of three main sublanguages: a
language intended for expressing Boolean equations
and time relations; a microprogramming language;
and an algorithmic language. DCDL uses FORTRAN
as the algorithmic sublanguage. The user may choose
any one of the three sublanguages to describe any of
the parts or modules in the described design. The logical
and microprogramming sublanguages use the same
declarations and access the same variables by their
names. The execution statements of sublanguages and
their syntactic formats differ and one cannot combine
statements of different sublanguages. Thus DCDS
provides the user with a powerful means of expression,
since he can select the most convenient and expressive
form from among the three sublanguages to describe
a hardware module.
Figure 1-DCDS system flow chart.
Entities and attributes in simulation systems
For our observations herein, we consider the simulation
of a system to be the modeling and associated
measurement of a system by a STRUCTURE in which
EVENTS occur in TIME according to a set of RULES.
Thus there are four sets of basic elements which must
be dealt with in simulation:
STRUCTURES, EVENTS, TIMES, and RULES
Different simulation methods neglect one or more of
these sets (e.g., time independent models). Any one of
the four sets may be selected as primary entities and
the others treated as attributes of that set.
One may choose to consider an analytic closed form
solution to be a simulation of a real system. In this
case, the process of simulation becomes a transformation.
Assume for example the transfer equation for an
electronic circuit. Both internal events (voltages and
currents in the individual elements) and structure
(topology of the circuit) may be neglected and one
manipulates the set of rules (i.e., Kirchhoff's law and
Ohm's equations) to produce a transfer function which
gives the output events as a function of time and input
events.
Thus whenever the rules are considered to be the
main entities, then either an analytic transformation or
an algorithmic procedure is used for simulation. The
type and form of the information transferred into the
simulation system as well as the simulation systems
themselves vary from one another depending upon
which of the four sets was chosen as the main set of
entities. Due to these differences, different languages
or input rules are used to describe the simulated system
to the software package designed to perform the simulation.
The following examples of different programming
structures will serve to illustrate the previous discussion.
Main Entities-EVENTS
Examples of programming structures:
SIMULA [8], GASP [9], SIMSCRIPT [10], [11],
[12], [13], GPSS [14].
A simulated system is described by an event flow
chart. The programming systems above use input
language formats suitable for the description of events
in such a form.
Main Entities-RULES
Examples:
NASAP [15], LISA [16], Boolean Analyzer [17].
The input to circuit analysis programs like NASAP
and LISA or to the Boolean Analyzer is in table-form
which either explicitly gives the set of rules (Boolean
equations) or gives a table that implies a unique set
of rules (Kirchhoff's and Ohm's equations for the circuit).
Main Entities-STRUCTURES
Examples:
LOGIK [18],
Weather Simulation Program [19],
Partial Differential Equation Simulation [20].
The input format is any form suitable for describing
the physical or hierarchical structure of the simulated
system.
Modeling and approximations
After the selection of the entity and attribute relations,
the next step for simulating a system is to decide
what can be approximated and how the selected
approximations can be done. The choice of what to
approximate can be categorized as:
a. making certain entities (inputs) constants; for
example t = 0 in time independent modeling.
b. neglecting parts of the attributes; for example
in simulation of partial differential equations by
Monte Carlo methods, the field constants are
calculated for only a small number of selected
field points in the structure.
c. modifying the set of rules; the use of difference
equations to solve partial differential equation
problems is an example of modifying the rules.
For a different example of rule modification,
consider a simulation program simulating another
program on a digital system. The purpose
of the simulated program is to execute a matrix
inversion in which the inversion is performed
on a 2 X 2 part of the matrix instead of the entire
n X n array. In this case, the system rules may be
modified to obtain fast simulation time for a
simulation that "takes the system through the
motions" without obtaining the actual numerical
result. Thus for such approximations,
one may simulate the system faster than real
run time.
Event directed simulation can be expected to be faster
than structural simulation since structure simulation
has to go through all possible events in the system,
while event simulation takes the system only through
the prescribed events. This is, of course, also the main
pitfall of event simulation; it does not point out events
that might occur in the system but are unforeseen by
the programmer.
DCDS pseudo machine simulator
A computer module in DCDL may be described by
its structure (LOGIC), by the set of events that it
controls (PROGRAM), or by the algorithmic rules
(SIMULATE). In order to perform this task, the DCDS
pseudo machine simulator operates as an algorithmic
simulator by calling on the FORTRAN programs; as
a structure simulator when simulating a logical structure
(operating from the Call Stack); or as an event
simulator when processing a microprogram. The
Program Stack (see Figure 2) operates the sequence
of events generated by the control microprogram. The
Call Stack operates all the logical details occurring in
the logical structures forced by the control events.
The DCDL event simulation is limited to operations
within a logical structure. The events that are generated
by the control as time moves forward force the
simulator to follow all consequences of the events within
the described logical structure. For example, the event
simulator may directly order (by executing an instruction
in the program stack) transfer of data to
register A. All the other consequences of this action
(i.e., all the outputs of gates whose input is A) are
simulated from the Call Stack (structure simulation).
The pseudo machine program
A pseudo machine processor is a program written in
machine language or higher level language for the
machine on which one performs the simulation runs.
In the present implementation on the SIGMA 7 this
program is written using FORTRAN and assembly
language.
The process in which the translation is separated
from the simulation allows one to write the translator
program independently of the machine in use. The
separation of the compiler program and the pseudo
machine program allows independent debugging and
changes in each. Modifications in DCDL and its
compiler are done by changing the META 5 compiler
program. FORTRAN changes in the pseudo machine
provide for changes in simulation methods as well as
insertion by the designer of other features expressed
in FORTRAN to capture event information relevant to
one design or another.
Thus, by the process of programming in DCDL and
by translation one obtains:
A. Documentation of the design;
B. A check on the consistency and completeness
of all logical variables and all logical functions;
C. Automatic implementation of control sections;
D. Simulation runs for given sets of input data; and
E. The amount of time a certain run will take on
the described design.
Following is a discussion specifying the pseudo
machine structure and operation codes.
Instructions, interpretation, addressing, and
indexing
This unit contains the following parts (see Figure 2).
Figure 2-Pseudo machine structure (blocks shown: indexing and
arithmetic unit, instruction register and counter, program stack,
regular logical variables, temporary logical variables, delayed
storage table, arithmetic variables, instruction table, memory)
(a) Time counter and time registers.
The counter counts simulated execution
time. The time registers are used to store
time counts of different parallel branches.
At a parallel junction, comparison between
duration of operation on each branch is made
and the highest time count will be the new
value of the simulation time counter.
(b) Indexing arithmetic unit.
This unit is capable of fixed point operation
(plus, minus, multiplication, and division)
and is used for indexing arithmetic.
(c) Call-stack and Program-stack.
Two push down (LIFO) stacks. One of the
elements in the stacks is the operative address;
i.e., the address of the instruction to
be executed next. The operative address
is usually the word at the top of the call-stack.
If the call-stack is empty, the operative
address is the word at the top of the
program-stack.
A control branch to a lower (subordinate)
control level (CALL) is instrumented by
putting the first address of the lower control
level program into the call stack, thus making
the call address the operative address. When
the lower control level is of type PROGRAM,
the address is put in the program stack.
The operative address is incremented by
1 after an instruction is executed or the
address is replaced by another due to the
execution of a branch (a normal branch that
occurs within the program being executed).
An exit or return from the subordinate program will
cause the stack to pop while a further entry into another
subordinate program brings a new address into the call
stack. The consequential calls are put into the call
stack but their execution is delayed until all the parallel
operations have been carried out and then all
consequential calls are carried out. Two key words in DCDL
indicate parallel structures. *GROUP indicates a set
of similar modules operating in parallel and controlled
by the same binary control variable (for example,
a set of 32 single bit adder modules in a 32 bit binary
adder). *PART indicates a set of dissimilar modules
operating in parallel under the control of a single
binary control variable (for example, shifter and counter
in floating point normalization). A *PART may contain
simple and nested *GROUPS in which case the
whole structure is operating simultaneously under
the supervision of a single control variable. The stacks
have three pointers: TOPC (top of the call stack), TOPP
(top of the program stack) and OPR (the operative
address).
OPR = TOPP if call stack is empty
OPR = TOPC if call stack is not empty
OPR = TOPC at the time of entry to *GROUP
or *PART if executing inside a *GROUP
or a *PART.
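The first two rules for the operative address can be sketched as follows. This is a hypothetical model for illustration only: the class name `Stacks` and its methods are not part of DCDS, addresses are plain integers, and the third rule (the operative address held fixed inside a *GROUP or *PART) is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Stacks:
    """Two push down (LIFO) stacks holding instruction addresses."""
    call_stack: List[int] = field(default_factory=list)
    program_stack: List[int] = field(default_factory=list)

    def operative_address(self) -> Optional[int]:
        """OPR = TOPC if the call stack is not empty, otherwise TOPP."""
        if self.call_stack:
            return self.call_stack[-1]
        if self.program_stack:
            return self.program_stack[-1]
        return None

    def call(self, first_address: int, is_program: bool = False) -> None:
        """CALL: enter a subordinate control level. Levels of type
        PROGRAM go on the program stack, all others on the call stack."""
        target = self.program_stack if is_program else self.call_stack
        target.append(first_address)

    def step(self) -> None:
        """Advance the operative address by 1 after an instruction."""
        stack = self.call_stack or self.program_stack
        stack[-1] += 1

    def ret(self) -> None:
        """Exit or return: pop the stack that holds the operative address."""
        stack = self.call_stack or self.program_stack
        stack.pop()


if __name__ == "__main__":
    s = Stacks()
    s.call(100, is_program=True)
    s.step()
    s.call(500)
    print(s.operative_address())
    s.ret()
    print(s.operative_address())
```

Keeping the two stacks separate lets a logical routine run to completion on the call stack while the microprogram position on the program stack is left untouched.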
Consequential calls are intended for the dynamic
reevaluation of variables. The STORE instruction
invoking the consequential calls puts new addresses
of variable reevaluation routines into the Call Stack.
This is accomplished according to the following steps:
1. The old and the new value of the variable are
compared.
2. The new variable value is stored.
3. If the comparison mentioned above shows a
difference between the old and new value, the
address of the subroutine that calculates the
new value of the dynamically dependent variable
is put into the Call Stack.
4. The address of the next instruction is the address
on the top of the Call Stack. Thus, if there were
any consequential calls, they would be executed
prior to the completion of the execution of the
subroutine that invoked those consequential
calls.
When there are no more changes in the values of the
variables, the instructions proceed to the end of the
reevaluation routine, which contains RETURN as
the last instruction. The RETURN instruction pops
the Call Stack, sending the program to finish operations
in the routine which invoked the consequential calls.
The process of dynamic reevaluation will stop only
if the variable values and the logical functions are
consistent. Assume the following st~tements:
A = /\ (B,C);
D = V (A,E);
B = ~ D;
with initial conditions A = 0, B = 1, C = 0, D = 0,
E = O. This set of relations and values is consistent.
Now (lonsider that the variable C is changed to one. The
new set up of variables and relations is inconsistent
and the reevaluation of variables will not reach a
steady state. Each reevaluation will put a new address
in the call-stack:
A change in operation occurs once an address is
put into location n in the stack. The pseudo machine
prints an error message which is followed by the names
and values of variables partaking in a STORE instruction. This process continues allowing the program to
put addresses in the next ten slots of the call stack.
When the execution calls for storing an address at
n+11 the call stack is cleared (TOPC = 0) and the
operative address is taken as the instruction on top
of the program stack. This debug feature allows the
program to check for logical inconsistencies without
getting into an infinite loop or having to stop simulation runs.
5. Delay table. The result of a logical transformation specified in DCDL can be effected directly
or after a specified time. For example, in the
statement
A = 'DELAY(3)' & (C, D, E): OP1;
the transformation &(C, D, E) is performed if
control variable OP1 is activated, but the content of A will be changed only three time units
later.
To facilitate translation of the delay modifier, the
pseudo machine contains a delay table. An entry into
the delay table contains three parts: variable name,
variable's new value, time of exit.
Variable name | Variable value | Exit time
Each time the time counter is incremented, all time
of exit entries in the delay table are checked, and
each entry with a time of exit matching the time counter
activates a store operation, storing the new value in
the appropriate variable and invoking consequential calls
if such are present.
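The delay table mechanism can be sketched as follows. The class `DelayTable` and its method names are hypothetical; the sketch only assumes what the text states, namely that an entry holds a variable name, a new value and a time of exit, and that entries are checked each time the time counter is incremented.

```python
# Sketch of the delay table; class and method names are assumptions.

class DelayTable:
    def __init__(self, variables):
        self.variables = variables   # variable name -> current value
        self.time = 0                # the time counter
        self.entries = []            # (name, new value, time of exit)

    def delay(self, name, value, count):
        """Enter a delayed store that takes effect `count` units from now."""
        self.entries.append((name, value, self.time + count))

    def tick(self):
        """Advance the time counter by one and perform the stores now due.

        Returns the names whose value changed, i.e. the variables whose
        consequential calls (if any) would be invoked.
        """
        self.time += 1
        due = [e for e in self.entries if e[2] == self.time]
        self.entries = [e for e in self.entries if e[2] != self.time]
        changed = []
        for name, value, _ in due:
            if self.variables[name] != value:
                changed.append(name)
            self.variables[name] = value
        return changed


table = DelayTable({"A": 0})
table.delay("A", 1, 3)               # A = 'DELAY(3)' ... evaluated to 1
for _ in range(3):
    print(table.time + 1, table.tick(), table.variables["A"])
```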
Logical manipulating accumulators
The pseudo machine contains two string accumulators, A and B. The machine performs the operations
of AND, OR and EQUAL between the respective bits
of the string accumulators and the result is stored in
string accumulator A. The current size of both string
accumulators is given by the content of String Accumulator Size Register (SASR).
All operations are performed on words of the same
size. Calling an operand of the wrong size causes an
error message printout and the machine goes to the
next instruction. An exception to this occurs when the
operand is of size one bit. In this case, the one bit is
extended to a word that contains all zeros or all ones
of the size indicated by the SASR. A special instruction
sets the size of the string accumulator (i.e., the content
of SASR) thus setting the size of all following logical
operations.
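The behavior of the string accumulators may be illustrated by a short sketch. The class below is hypothetical; in particular, EQUAL is assumed to mean a bit by bit equivalence (the complement of exclusive OR), which the text does not state explicitly.

```python
# Sketch of string accumulators A and B with a size register (SASR).
# EQUAL is assumed to be bitwise equivalence; names are illustrative.

class StringAccumulators:
    def __init__(self, size):
        self.a = 0
        self.b = 0
        self.sasr = size             # current size of both accumulators

    @property
    def mask(self):
        return (1 << self.sasr) - 1

    def load_b(self, value, size):
        """Load an operand into B; returns False if it is skipped."""
        if size == 1:
            # A one bit operand is extended to all zeros or all ones.
            self.b = self.mask if value & 1 else 0
            return True
        if size != self.sasr:
            print("error: operand size", size, "differs from SASR", self.sasr)
            return False             # the machine goes to the next instruction
        self.b = value & self.mask
        return True

    def operate(self, op):
        """Combine A and B bit by bit; the result is left in A."""
        if op == "AND":
            self.a &= self.b
        elif op == "OR":
            self.a |= self.b
        elif op == "EQUAL":
            self.a = ~(self.a ^ self.b) & self.mask
        else:
            raise ValueError(op)
        return self.a


acc = StringAccumulators(8)
acc.a = 0b11001010
acc.load_b(1, 1)                     # one bit operand, extended to 11111111
print(format(acc.operate("AND"), "08b"))
```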
Data Blocks
Data blocks have different lengths and contain
binary arrays. A binary array can possess up to three
dimensions. Only a single bit or a binary word string
can be addressed in the blocks. Each data block contains a two word header containing the variable name
followed by the structure described below.
Storage for a Single Bit
The storage block for a single bit is one word (four
bytes) plus a word for each consequential call. A
consequential call occurs when a variable A is a dynamic function of a variable B. B forms the input to
the gate, the output of which is A. When B is changed,
a consequential call causes the pseudo-machine to
reevaluate the variable A. Thus, the storage location
of variable B contains the addresses of sets of instructions which will reevaluate all variables which are
dynamically dependent on the variable B.
The single bit storage word format
[Figure: the variable storage word, bytes 1 to 4 (number of consequential calls, indicator flags, unused, variable value), followed by one word per consequential call holding the cc directive and the cc address]
Byte 1: number of consequential calls
invoked by a change in the
stored binary variable.
Byte 2: this byte contains indicators for
high bit position, number of
dimensions of the logical variable, and variable type. Each
indicator occupies two bits.
Format:
variable dimension
00: bit variable
01: one dimension array (word)
10: two dimensional variable
11: three dimensional variable
variable type
00: logical point, the variable does not
contain memory
01: 1 level storage, declared as *RS
10: 2 level storage (clocked)
position of the high order bit
00: the high order bit is the most significant bit
(leftmost bit)
01: the high order bit is the least significant bit
(rightmost bit)
Byte 3: not used
Byte 4: variable value
The following words (if any) contain the consequential
call address in bytes 3 & 4 and its directive in byte 1.
Byte 1: consequential call type (directives)
011: calls on any change in the variable
001: consequential call, only if the variable
changes from 0 to 1
010: consequential call, on the change of the
variable from 1 to 0
1xx: consequential call of an entry to a PROGRAM, put a new address on top of
program stack (operation on the last 2 bits
same as above).
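The directive codes above can be read as a small decision function. The following sketch is illustrative only; the function name is hypothetical, and the code 00 in the last two bits, which the text does not list, is assumed never to fire.

```python
# Sketch of interpreting a consequential call directive.
# The treatment of the unlisted code x00 is an assumption.

def decode_directive(directive, old, new):
    """Return (fires, stack) for a 3 bit directive and a bit change."""
    edge = directive & 0b011
    if old == new:
        fires = False                       # no change, no call
    elif edge == 0b011:
        fires = True                        # any change in the variable
    elif edge == 0b001:
        fires = (old, new) == (0, 1)        # change from 0 to 1 only
    elif edge == 0b010:
        fires = (old, new) == (1, 0)        # change from 1 to 0 only
    else:
        fires = False                       # code not listed in the text
    stack = "program" if directive & 0b100 else "call"
    return fires, stack


print(decode_directive(0b001, 0, 1))
print(decode_directive(0b110, 0, 1))
```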
One dimension array storage
In a one dimensional binary array storage, the first
word contains the range and type of the stored variable.
The following words contain the binary variable and
then the consequential calls (if any).
First word: Byte 1: number of consequential calls
Byte 2: variable dimension, high order
bit position and variable type
(same as for bit storage)
Byte 3: lowest subscript of variable
Byte 4: size of variable.
The second word through the nth word, where $n = \frac{\text{word size}}{32} + 1$, contain the value of the binary
word. If the variable is a clocked F/F, the amount of
space for variable storage is doubled and each bit has
two storage locations, primary and secondary.
The last set of words contains consequential call
addresses and their directives.
Two dimensional binary storage
[Figure: two dimensional storage layout; header bytes 1 to 6, followed by the variable storage and then one word per consequential call holding the cc directive and the cc address]
A two dimensional arrangement contains at least 3
words. The first 2 words are used for bookkeeping in the
same format as the 1 dimensional arrangement, with
byte 5 indicating the lowest value of the second
subscript, and byte 6 indicating the range of the second
subscript.
Three dimensional binary storage
[Figure: three dimensional storage layout; header bytes 1 to 8, followed by the variable storage and then one word per consequential call holding the cc directive and the cc address]
In a three dimensional arrangement, byte 7 indicates
the lowest value of the third subscript and byte 8
indicates the range.
Arithmetic variable storage
The third entity stored in pseudo memory is a block
of 256 arithmetic variables used for indexing and address manipulations.
Temporary logical variables
The memory contains a block of 256 one dimensional logical temporary variables, each one 128 bits
long.
Pseudo machine instruction set
Most of the pseudo machine instructions closely
resemble general purpose computer instruction lists.
The main exception is that the addresses of logical
variables contain the variable address as well as bit
and word indices.
In the following paragraphs we will discuss specific
instructions which are unique to the DCDL pseudo
machine and will give the reader more insight into
DCDS simulating programs.
A pseudo machine logic instruction is contained in
a 64 bit word (eight bytes).
As implemented on the SDS Sigma 7, the most
common format of the pseudo-machine logic instruction code contains
a. operation code (one byte)
b. operation code modifiers (one byte)
c. operand address (two bytes)
d. three address subscripts and a set of subscript
tags.
The actual operand address is a function of the
main address (i.e., array address), the three subscripts,
and the subscript tags. The main address corresponds
to the name of the data block (i.e., the name of the
variable). The subscript tags indicate whether the
subscripts are to be used directly, indirectly, or by
word size.
Each index byte has a two bit tag. The interpretation of the tag is:
If the tag is 00: This subscript is not currently effective. For example, in A(1, 3), A is a two dimensional array and the third index is not used.
If the tag is 01: The subscript is indicated directly by the numerical content of the corresponding subscript byte.
If the tag is 10: The subscript is given indirectly;
i.e., the corresponding number is the location
of an indexing word in memory.
If the tag is 11: It is used for word variables and
the word is the entire range of this subscript.
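The four tag values can be summarized by a short sketch that resolves the three subscript bytes of an instruction. The function name and the arguments `index_memory` (the arithmetic variables used for indexing) and `ranges` (the declared range of each subscript) are assumptions made for illustration.

```python
# Sketch of resolving subscripts from their two bit tags.
# Argument names are hypothetical; tag meanings follow the text.

def resolve_subscripts(subscript_bytes, tags, index_memory, ranges):
    """Return the effective subscripts of an operand address."""
    resolved = []
    for value, tag, full_range in zip(subscript_bytes, tags, ranges):
        if tag == 0b00:
            continue                              # subscript not effective
        if tag == 0b01:
            resolved.append(value)                # direct: the byte itself
        elif tag == 0b10:
            resolved.append(index_memory[value])  # indirect: indexing word
        else:
            resolved.append(full_range)           # 11: entire range (word)
    return resolved


# A(1, 3): two direct subscripts, third index unused.
print(resolve_subscripts([1, 3, 0], [0b01, 0b01, 0b00], {}, [None] * 3))
```

An indirect subscript paired with a whole-word subscript would be resolved in the same way, by looking up the indexing word for the first and substituting the declared range for the second.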
The following section contains pseudo machine instruction examples from the set of pseudo machine
instructions.
Store with invoked consequential calls
STDC (a): (a) ← A, Call Stack ← consq (a)
If there is a difference between (a) and A, all the
consequential call addresses associated with (a)
are put into the call stack. To avoid redundant
operation, a duplicate of an address already
inside the call stack will not be inserted; i.e., when
two or more successive operations request the
same consequential call, this mechanism sets the
operation such that the call will be executed
only once. When the receiving variable (a) is a
clocked element (two storage levels) both levels
change to match the content of A.
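The duplicate suppression described for this instruction can be sketched as below. The function `stdc` and its arguments are hypothetical stand-ins for the pseudo machine's memory, consequential call table and call stack; the two storage levels of a clocked element are not modeled.

```python
# Sketch of a store with invoked consequential calls.
# All names are illustrative; clocked (two level) storage is omitted.

def stdc(memory, consq, call_stack, name, acc_a):
    """Store accumulator A into `name`, queuing calls on a change."""
    if memory[name] != acc_a:
        for address in consq.get(name, []):
            if address not in call_stack:    # never queue a duplicate
                call_stack.append(address)
    memory[name] = acc_a


memory = {"X": 0, "Y": 0}
consq = {"X": ["R1", "R2"], "Y": ["R2"]}     # reevaluation routines
stack = []
stdc(memory, consq, stack, "X", 1)
stdc(memory, consq, stack, "Y", 1)           # R2 is already queued
print(stack)
```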
Store in secondary level
SSEC (a): (a1) ← A.
Stores into first level of a clocked storage element
(a clocked element has two storage levels). This
instruction does not initiate consequential calls.
Secondary to primary storage level
transfer, entire array
TRANS (a): (a2) ← (a1)
Transfers the data from secondary to primary
level in clocked memory elements. This instruction
initiates consequential calls if consequential call
addresses are present and the content of primary
and secondary differ.
Secondary to primary transfer, only
designated bit(s)
BTRANS (a): (a2) ← (a1)
Instruction execution same as above except
transfer is performed only on bit(s) designated by
the instruction. Note: consequential calls are not
associated with single bits; a change in a variable
invokes all consequential calls for the array.
Delayed storage
DELAY (a), i: DELAY TABLE ← a, i, A
i, the delay count, is put in the second byte of the
eight byte instruction (as a modifier). Delayed
storage invokes consequential calls when they are
associated with the stored variable. The consequential calls as well as storage will be activated
after i time units.
Instruction format
'7 C'- -'- -'- -'- -'- -'- -'- -'
1 2 3 4 5 6 7 8
Byte 2: delay count
Byte 3-8: logical variable address
Delayed secondary to primary transfer
CKDLY (a), i: DELAY TABLE ← a, i
This instruction stores the address and time count
in the Delay Table. The variable value does not
have to be stored in the Delay Table since it is
stored in the secondary register of the variable.
*PART entry point
PARTIN:
Changes the GROUP flag to 1. As long as the
GROUP flag is not equal to zero (GROUP ≠ 0)
the operative address does not change due to the
placement of an address in the call stack.
*PART exit point
PARTOUT:
Turns the GROUP flag to "0" thereby releasing
the consequential calls mechanism. Thus, if
consequential calls have been invoked within
PART, this instruction causes the effective address
to be the top of the stack and execution of consequential calls to begin.
*GROUP entry point
GRUPIN, K1, XR:
Loads the value K1 into the arithmetic variable
serving as index register (XR).
Increments the GROUP flag by one (GROUP =
GROUP + 1).
Format
'E 2'X X'X X'- -'X X'X X'- -'- -'
1 2 3 4 5 6 7 8
Byte 4: arithmetic variable serving as index
register (XR).
7&8: number (K1) loaded into the index
register (XR).
*GROUP exit point
GROUP, K2, i, n, XR:
(1) Compares K2 with the value stored in the
appropriate index register (XR).
If the values are equal:
Decrements the GROUP flag (GROUP =
GROUP - 1) and proceeds with the execution of
next instruction. Note that if GROUP flag is
decremented to zero (GROUP = 0) the stack
pointer is moved to the highest occupied position
POINT-TOP and stored consequential calls are
executed.
If the values are not equal:
The index register variable XR is changed by 1
or by -1.
The operative address (next instruction address)
is changed to the value n.
Format
'E 3'- -'X X'- -'- -'- -'- -'- -'
1 2 3 4 5 6 7 8
Byte 2: (i) Incrementing or decrementing value
(1 or -1)
4: (XR) Address of index register
5&6: (n) Label of the instruction at the
top of the *GROUP loop
7&8: (K2) upper limit of index register.
The operative address cannot change as long as
execution is within a *PART or *GROUP
(GROUP ≠ 0). The consequential calls will be
stored in the call stack and evaluated once the
program exits all the nesting of *GROUP and
*PART.
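The way the GROUP flag holds back consequential calls inside *PART and *GROUP sections can be sketched as follows. The class and method names are hypothetical, and the execution of a reevaluation routine is reduced to recording its label.

```python
# Sketch of deferring consequential calls while GROUP != 0.
# Names are illustrative; routines are only recorded, not run.

class Machine:
    def __init__(self):
        self.group = 0            # the GROUP flag
        self.call_stack = []
        self.executed = []

    def request_call(self, address):
        """Place a consequential call address in the call stack."""
        if address not in self.call_stack:
            self.call_stack.append(address)
        if self.group == 0:       # outside *PART / *GROUP: run at once
            self.run_calls()

    def run_calls(self):
        while self.call_stack:
            self.executed.append(self.call_stack.pop())

    def part_in(self):            # PARTIN
        self.group = 1

    def part_out(self):           # PARTOUT
        self.group = 0
        self.run_calls()

    def group_in(self):           # GRUPIN
        self.group += 1

    def group_out(self):          # GROUP, after the last pass of the loop
        self.group -= 1
        if self.group == 0:
            self.run_calls()


m = Machine()
m.part_in()
m.request_call("R1")
m.group_in()
m.request_call("R2")
m.group_out()                     # still inside the *PART
print(m.executed)
m.part_out()
print(m.executed)
```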
Unconditional branch
GOTO n:
Unconditional branch to n: the value n replaces
the operative address.
Conditional branch
GOTC (k) n:
Branch is taken if the logical accumulator A = 0
and k = 0 or A ≠ 0 and k = 1. When the branch
is taken, n replaces the operative address in the
CALL or PROGRAM Stack.
Call
CALL n:
Control transfer. The label n is put on top of the
call stack making it the new current operative
address.
Return from a substructure
RETURN:
The instruction causes the call stack to pop making
the next label in the stack the operative address.
Call microprogram controller
CALP (n):
Puts (n) on top of the program stack
Return from a microprogram
RETRNP:
Pop the program stack
Check bit
CHECK (a)
The instruction contains a bit indicator (byte 2).
The bit indicator is compared with a bit in memory
addressed by bytes three-eight. If the bits are the
same, the result is no operation; if the bits are
different, the instruction executes a RETURN.
Count time
TIME, n: (Timer) ← (Timer) + n, Evaluate delay table.
Counts n time units; note that with each count
the delay table will be reevaluated and the instruction will activate delayed storage.
Store timer
TIMS (n): (n) ← (Timer)
Stores the content of the timer in n
Return to time count routine
TRET:
This instruction pops the call stack then returns
control to the timer control subroutine.
Bring timer
TIMI (n): (Timer) ← (n)
Sets the timer according to the value stored in n.
Set timer
TIMO n, m, k: (Timer) ← n, (timer subroutine) ← m, k.
The instruction contains a new initial value for
the timer.
Gather point for parallel branches in a
microprogram
GATHER (b), j, k:
This instruction appears at the gather point of
parallel operation. The instruction contains two
numbers, j and k, each stored in a two byte location and used for parallel branch count. k contains
the total number of parallel branches coming in
to the gather point; j contains the number of
branches not yet executed. The arithmetic variable b is used to store the maximum operation time
on the parallel branches.
operation: if j ≠ 0
a. j ← j - 1
b. (b) ← MAX ((b), (timer))
c. Pop the call stack
if j = 0
a. j ← k
b. (timer) ← MAX ((b), (timer))
c. (b) ← 0
d. go to next instruction (past parallel gather)
'D 4'X X'X X'- -'- -'- -'- -'- -'
1 2 3 4 5 6 7 8
Byte 4: Arithmetic variable storing time count
5&6: value of k, total number of parallel
paths
7&8: value of j, number of parallel paths
to be executed
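The two cases of the GATHER operation can be written out literally as a small function. This is only a sketch of the steps as listed; the function name is hypothetical, and the starting value of j in the example (k - 1, so that the last branch to arrive passes the gather point) is an assumption, since the text does not give it.

```python
# Literal sketch of the GATHER steps. The initial value of j used in
# the example below is an assumption, not stated in the text.

def gather(j, k, b, timer):
    """Return (j, b, timer, action) after one arrival at the gather."""
    if j != 0:
        # A branch arrives while others are outstanding.
        return j - 1, max(b, timer), timer, "pop"
    # The last branch arrives: the slowest branch sets the timer.
    return k, 0, max(b, timer), "continue"


k = 3
j, b = k - 1, 0                      # assumed starting state
for branch_time in (5, 9, 7):        # timer value at each arrival
    j, b, timer, action = gather(j, k, b, branch_time)
    print(j, b, timer, action)
```

In this run the timer leaving the gather point carries the time of the slowest branch, which is the stated purpose of the arithmetic variable b.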
Call simulation section
CALSIM, n: B ← 0, CALL simulation section.
Resets B, then activates the FORTRAN or machine language simulation section. n is the number
of the subroutine called.
Logical to numerical variable transfer,
first word
SIN1 (n), (v): B ← (v), (n) ← B(0-31)
The content of the logical variable v is loaded
into B accumulator. Then the rightmost bits of
B(0-31) are loaded into the arithmetic variable
n. This arithmetic variable is to be transferred into
the simulated section. If the size of B is less than
32, zeros will be put into the leftmost bits of the
word.
Logical to numerical variable transfer,
additional words
SIN2 (n), k: n ← B(32*k to 31+32*k)
This instruction must follow SIN1 or another
SIN2 instruction. The instruction transfers the
kth word from B to the arithmetic variable n
to be transferred into the simulation section.
Format
'5 1'- -'X X'- -'X X'X X'X X'X X'
1 2 3 4 5 6 7 8
Byte 2: contains the address of the arithmetic
variable
4: k, position of the word in B.
Numerical to logical variable transfer,
first word
SOUT1 (n), (v): B ← n, (v) ← B, B ← 0
This instruction transfers the bits of an arithmetic
word n into the rightmost 32 bits of B, then stores
the content of B in v, and then resets B (the instruction may invoke consequential calls if they
are associated with v). Byte 2 contains the
arithmetic variable address.
Numerical to logical variable transfer,
additional words
SOUT2 (n), k: B(32*k to 31+32*k) ← n.
Loads the content of (n) into the kth word of B.
This instruction must be followed by SOUT1 or
another SOUT2.
Error trap
TRAP:
This instruction must follow a conditional branch.
The execution of the instruction consists of printing an error message and then following the branch
of the previous instruction, even though the branch
conditions were NOT satisfied.
The logic design of a serial adder
Figure 3 gives the block diagram of a design specification for a serial adder. The adder contains two clocked
shift registers, A and B, containing 16 bits each. Other
parts of the adder are a four bit counter COUNT, a
carry flip flop C, a single bit sum and carry logic, the
adder controller AUC, and a PANEL section.
The sum of A and B generated by the adder replaces
the content of B. A is connected to perform a cyclic
shift such that at the conclusion of the addition it
contains its initial value. The sum bit generated at each
cycle is stored in position B (16).
Figure 3-Serial adder
Design Example, Serial Adder
Figure 4 contains a DCDL program specifying the
serial adder. The program starts by declaring a UNIT
named ADDER at control level # 1. The declaration
section starting with the key word *DECLARE
specifies that the UNIT ADDER receives three control
signals (ORDERs) from its supervisor(s). The ORDERs are <A + B>, CNT and RESET. The functions controlled by these ORDERs will be specified
later in the LOGIC part of this UNIT.
Other parts declared in this DECLARE section are
the 16 bit register A, the 16 bit register B, and the
flip flop C. A, B, and C are composed of clocked RS
flip flops (type *CRS). The next declaration is a four
bit register COUNT constructed from TRIGGER flip
flops and a DATA_BUS logic variable TEST. The
value of the variable TEST will be specified in the
LOGIC part as a logical function of memory elements.
    *UNIT ADDER, LEVEL=1
    *DECLARE
    *ORDER <A+B>,CNT,RESET
    *CRS A(16:1), B(16:1), C
    *TRIGGER COUNT(4:1)
    *DATA_BUS TEST
    *END
    *LOGIC
    *PART: CNT,
    COUNT(1)%='X1',
    COUNT(2)%=COUNT(1),
    COUNT(3)%=&(COUNT(1),COUNT(2)),
    COUNT(4)%=&(COUNT(1),COUNT(2),COUNT(3))
    *END
    *PART: RESET
    C='X0',
    COUNT(*)='X',
    *END
    *PART: <A+B>,
    A(*)%='CYCLE(-1)' A(*),
    B(16)%=|(&(A(1),B(1),C),
    &(-A(1),B(1),-C),
    &(-A(1),-B(1),C),
    &(A(1),-B(1),-C)),
    *GROUP I=1,15
    B(I)%=B(I+1)
    *SET,
    *END
    C%=|(&(A(1),B(1)),&(A(1),C),
    &(B(1),C))
    *END
    TEST=&(COUNT(1),COUNT(2),COUNT(3),COUNT(4))
    *END
    *END ADDER
    *UNIT AUC, LEVEL=2
    *DECLARE
    *ORDER ADD
    *REPLY FIN
    *END
    *PROGRAM
    ADD: RESET: A1,
    A1: <A+B>: A2,
    A2: <A+B>,CNT: A3,
    A3: *GO_TO TEST: (A4, A2)
    A4: *RETURN FIN
    *END
    *END AUC
    *PANEL AAA, LEVEL=3
    *SYSTEM_RESET: A(*)='X7~37', B(*)='X2EC)', *TIME=0;
    *AT_TIME_INTERVAL = 2, WRITE A(*):X, B(*):X, COUNT(*):X,
    *START ADD,
    *FINISH FIN
    *END AAA
Figure 4-Serial adder DCDL program
The declaration section ends with the key word *END.
The logical and control relations in the ADDER
UNIT are specified in the LOGIC section which starts
with the key word *LOGIC. The LOGIC section contains three PART sections and one direct transfer
statement.
The first PART section is controlled by the ORDER_VARIABLE CNT. This section contains the
input statements to the four COUNT flip flops. The
statements specify that the input to COUNT (1) is
a "ONE" ('X1' specifies a one in a hexadecimal format).
The input to COUNT (2) is the output of COUNT (1).
Similarly the input to COUNT (3) is the AND of
(COUNT (1), COUNT (2)) and the input to COUNT
(4) is the AND of (COUNT (1), COUNT (2), COUNT
(3)).
The first PART section, controlled by CNT, uses
clocked transfers (%=) which are associated with the
clocked input of the registers' flip flops. The next
PART section controlled by the ORDER RESET
specifies a direct connection (=) into the clocked variables C and COUNT. Therefore, the PART controlled
by CNT changes the clocked input of the COUNT
register. The PART controlled by RESET changes
the content of COUNT and C using direct set (DC set)
and direct reset (DC reset).
The last PART section is controlled by the ORDER
VARIABLE <A + B>. Activated by the <A + B>
control variable are the following transformations:
a. The content of A is shifted cyclically by one
to the right, the result is stored in A(*);
b. B(16) receives the sum function of A(1), B(1),
and C;
c. The GROUP of bits B(1) to B(15) are shifted
by one to the right;
d. The carry flip flop C receives the carry which is
a function of A(1), B(1) and C.
Note the PARTs containing a clocked transfer refer
to double rank clocked elements. Whenever the controlling variable is activated, the specified function
(to the right of %=) is stored in the secondary rank
of the variable to the left of %=. In the succeeding time
unit, a secondary to primary transfer is activated.
The last statement in LOGIC is a dynamic specification of the variable TEST as an AND function of the
bits of COUNT.
The next UNIT to be specified is the adder controller,
AUC. AUC introduces two new variables in its declaration section: an ORDER ADD which it receives
from its supervisors, and a reply FIN which it sends
back to the supervisors.
The control function of AUC is specified by a microprogram in the PROGRAM section of AUC. The interpretation of the microprogram is as follows:
a. When a controller receives the ORDER ADD,
it issues the ORDER RESET. After the default time lapse, two time units, the controller
switches to state A1.
b. In state A1, the controller issues the ORDER
<A + B>. After two time units, the controller
moves to A2;
c. At state A2 the controller issues two ORDERs,
<A + B> and CNT. The next state is A3;
d. A3 is a conditional branch. If TEST is "ONE",
the next state is A4. If TEST is "ZERO", the
next state is A2. The GO_TO line is an internal
control branch specification which does not
require any additional cycle. Therefore the
execution time of this line is zero time units;
e. The last microprogram line states that when
the controller is in state A4 it issues the REPLY
pulse FIN, and returns to its zero state.
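The combined effect of the ADDER logic and the AUC microprogram can be checked with a short sketch. It is not a translation of the DCDL program: registers are held as Python integers with bit 1 as the least significant bit, the two rank clocking and the timing are ignored, and the function names are introduced here for illustration.

```python
# Sketch of the serial adder under the AUC state sequence.
# Registers are integers (bit 1 = least significant bit); timing and
# two rank clocking are not modeled. Names are illustrative.

WIDTH = 16


def add_step(a, b, c):
    """One <A+B> order: shift A cyclically, shift the sum bit into B(16)."""
    a1, b1 = a & 1, b & 1
    total = a1 ^ b1 ^ c                          # sum of A(1), B(1), C
    carry = (a1 & b1) | (a1 & c) | (b1 & c)      # carry of A(1), B(1), C
    a = (a >> 1) | (a1 << (WIDTH - 1))           # cyclic shift right
    b = (b >> 1) | (total << (WIDTH - 1))        # B(16) receives the sum
    return a, b, carry


def run_adder(a, b):
    """Follow states ADD, A1, A2, A3, A4 of the microprogram."""
    c, count = 0, 0                              # ADD: issue RESET
    a, b, c = add_step(a, b, c)                  # A1: issue <A+B>
    while True:
        a, b, c = add_step(a, b, c)              # A2: issue <A+B> and CNT
        count = (count + 1) & 0xF                # four bit counter
        if count == 0xF:                         # A3: TEST is "ONE"
            break
    return a, b, c                               # A4: reply FIN


a, b, c = run_adder(0x1234, 0x0FF0)
print(hex(a), hex(b), c)
```

Sixteen <A + B> orders are issued in all (one in A1 and fifteen in A2), so A returns to its initial value and B holds the sum modulo 2^16.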
The highest controller in the structure is AAA
PANEL at level 3. The PANEL specifies the system's
initial conditions (placing initial values in A and B)
using the SYSTEM RESET statement. The initial
condition for the timer is specified by the statement
*TIME = O. The key word *START indicates the
initiating variable, and the key word *FINISH is
followed by the variable signaling completion. The
last statement in PANEL is *END followed by
PANEL's label AAA.
More Complex Structures
The above description has illustrated the use of
DCDL to design a simple adder. The language and
system have been used to design more complex structures including a multiplier and special purpose logic
card tester.1
CONCLUSION
The scope of the DCDS study was limited to systems
for which a set of predefined building blocks and a defined structure are present. A total design automation
system requires programming tools capable of studying,
simulating, and gathering statistics and thereby able
to evaluate conjectures about the behavior of structures and sequences of events before the details of
the structures and events are known. We hope that
further extension of DCDS and further study in simulation and modeling will add the capability to make
conjectures based on systems less rigorously defined
than DCDS presently requires them to be.
The DCDL implementation by sublanguages which
are compiled by META5 allows a simple insertion of
other sublanguages designed to study the architectures
of systems. The DCDL pseudo machine operates as
a FORTRAN based simulator either to describe the
simulated system or to augment the pseudo machine
instruction set.
BIBLIOGRAPHY
1 H POTASH
A digital control design system
UCLA Dept of Engineering Rpt No 69-21 May 1969
PhD Dissertation
2 R L MANDELL
Tools for the construction of design automation systems
UCLA 1968 PhD Dissertation
3 R MANDELL G ESTRIN
A meta-compiler as a tool for design automation
Proc SHARE Design Automation Workshop 1966
New Orleans Louisiana
4 R A RUTMAN
LOGIK, a syntax-directed compiler for computer bit-time
simulation
UCLA Masters Thesis Aug 1964
5 K P GOSTELOW
LOGIK, a system for the computer-aided selection and
assignment of electronic modules
UCLA Rpt No 68-8 March 1968
6 D OPPENHEIM
The META 5 language and system
Tech Memo TM-2396/000/01 System Development Corp
Santa Monica Jan 1966
7 D OPPENHEIM D HAGGERTY
META 5: A tool to manipulate strings of data
Proc 21st Nat Conf of Association for Computing
Machinery 1966
8 O DAHL K NYGAARD
SIMULA, a language for programming and description of
discrete event systems
Introduction and User's Manual Norwegian Computing
Center Forskningsveien 1B Oslo 3 Norway May 1966
9 P J KIVIAT A COLKER
GASP-a general activity simulation program
P2864, RAND Corp Santa Monica 1964
10 B DIMSDALE H M MARKOWITZ
A description of the SIMSCRIPT language
IBM Systems Journal Vol 3 No 1 1964
11 M A GEISLER H M MARKOWITZ
A brief review of SIMSCRIPT as a simulating technique
RAND Corp RM-3778-PR Santa Monica 1963
12 B HAUSER H M MARKOWITZ
Technical appendix on the SIMSCRIPT simulation programming language
RAND Corp RM-2813-PR Santa Monica 1963
13 H M MARKOWITZ
SIMSCRIPT, A simulation language
Prentice-Hall Englewood Cliffs N J 1963
14 R EFRON G GORDON
A general purpose digital simulator and examples of its
application: Part 1-description of the simulator
IBM Systems Journal Vol 3 No 1 1964
15 L P McNAMEE H POTASH
A user's guide and programming manual for NASAP
UCLA Dept of Engineering Rpt No 68-38 Aug 1968
16 K L DECKERT E T JOHNSON
User's guide for LISA 360, a program for linear systems
analysis
IBM System Development Division TR 02-432 San Jose
July 31 1968
17 M A MARIN
Applications for the Boolean analyzer
UCLA Dept of Engineering 1968 PhD Dissertation
18 R A RUTMAN
LOGIK, a syntax-directed compiler for computer bit-time
simulation
UCLA Masters Thesis Aug 1964
19 Y MINTZ
Very long term global integration of the primitive equations of
atmospheric motion
Meteorology Monographs Vol 8 No 30 Feb 1968
20 A F CARDENAS
A problem oriented language and a translator for partial
differential equations
PhD Dissertation UCLA 1968
Pattern recognition in speaker verification
by S. K. DAS and W. S. MOHN
IBM Corporation
Research Triangle Park, North Carolina
INTRODUCTION
There are many ways in which a pattern recognition
system may be implemented. In the specific problem of
speaker verification,1,13 a two-class recognition scheme
is of interest. A speaker who desired verification of
his identity based upon some previously stored characteristics of his speech represents one of the two classes
(real), whereas the other class (impostor) encompasses
all other speakers.
In implementing such a system, it is convenient,
first, to obtain a representation for each of the utterances of interest in the form of a time-frequency-amplitude matrix.2,3 The conventional method of deriving
this representation is by means of a filter-bank analyzer.2,3 Speech signals are inputted to the analyzer
and the outputs of the various filters are sampled and
averaged over the appropriate time interval. This process generates a set of short-term average spectra with
which to form the time-frequency-amplitude matrix.
Normally, only those components of this matrix
which contain significant speaker characteristics need
be retained. Identification of such speaker-dependent
components is somewhat arbitrary although several
guide lines are available.2,3
The next step is to regard all the pertinent elements
of the above-mentioned matrix as constituting a single
vector. Thus, the net result of the previous processing
steps is a vector representation for each utterance.
At this stage, several mathematical and statistical
tools may be applied appropriately to the data. For
example, the vector representation of an utterance may
exhibit high dimensionality. For further computational
advantage, it is desirable to reduce this dimensionality
of the vector. It is also helpful to achieve as much
intra-class clustering and inter-class separation as
possible. Methods such as analysis of variance,4 discriminant analysis5 and mutual information calculation10 are available for this purpose. The analysis of
variance and mutual information methods can be
conveniently used even if the initial dimensionality of
the vectors is rather high. The disadvantage of these
two methods is that each element of the vectors is
considered independent of the other elements; this is
not desirable since the interrelationships between the
elements which may be important for the purpose of
speaker verification are completely ignored. On the
other hand, while discriminant analysis treats the
vectors in multi-dimensional space, thereby preserving
the interrelationships, the computation time required
may be impractical if the vectors are initially of inappropriately high dimensionality.
Finally, a method for discriminating among the
vectors of the real class and of the impostor class is
required. This is usually done by means of a reference
vector. There are again several alternatives here. For
example, it has been pointed out that if a suitable
representation for the impostor class is not available,
it is possible to derive a reference vector based on the
real class data only.6 But, if the impostor class is
properly characterized, Adaline-type linear threshold
elements,7 which attempt explicit discrimination among
the real and the impostor classes, may be used to advantage.
There are many other methods for feature selection
and reference vector generation.7,8 Each method has
its particular advantages and shortcomings. In addressing the speaker verification problem, it is convenient to use the analysis of variance technique for
feature selection and a modified form of the Adaline-type linear threshold device for deriving a reference
vector. Previous unreported in-house experimentation
has indicated that the two techniques, analysis of
variance and mutual information calculation, produce
rather similar sets of features. The rationale for using
the modified form of the Adaline device will be presented in the next section.
Another important aspect of the pattern recognition
problem is to conduct a significant experiment with
true data. Too often, for reasons of economy and time
limitations, artificially generated data or a very limited
quantity of true data is used to perform experiments.
Experience has shown that conclusions based on such
experimentation are often misleading. A primary
achievement of the present experiments is the use of
a large true data base. The value of this large data
base will be further appreciated in the following sections.
The next section outlines the modification of the
Adaline-type linear threshold element. The analysis
of variance technique, being a standard tool in statistics,
is not treated here. The details of the experimental
part are listed in the third section. Finally, some conclusions and observations regarding the whole procedure are made.
Theory
At first, in this section, a brief description of the
classical Adaline7 procedure will be given and some
of its shortcomings will be pointed out. Next, a modified
form of the above procedure will be developed. In the
standard Adaline technique,7 a reference (or weight)
vector W is derived by utilizing vectors from both
of the two classes to be recognized, C and C̄. The
vectors in the two classes are assumed to be linearly
separable.7 For convenience in describing the technique, the negatives of the vectors belonging to C̄ are
assigned to C. Next, denote the vectors that are now
attributed to C as Y_1, ..., Y_m, where m is the total
number of vectors. The Adaline procedure7 is an iterative method of determining a weight vector, W, such
that
$$Y_j \cdot W > 0, \qquad j = 1, 2, \ldots, m \tag{1}$$
where the operator (.) signifies the inner product operation of two vectors. The iteration process is described
by the rule
$$W_{k+1} = \begin{cases} W_k & \text{if } Y_k \cdot W_k > 0, \\ W_k + Y_k & \text{otherwise.} \end{cases}$$
Using this weight vector W, a new test vector may be
classified to C or C̄ depending on whether the inner
product of the test vector with W is greater than zero
or not. The drawback of this procedure of decision
making is that the test pattern vectors belonging to
C which would normally produce slightly positive
inner products may, in the presence of some noise,
lead to negative inner products and be misclassified.
Similar statements may be made about the patterns
belonging to C̄.
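The standard procedure just described can be sketched in a few lines of Python. This is a minimal illustration only, assuming a zero starting vector, a small made-up two-dimensional data set, and hypothetical function names; none of these details come from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_fixed_increment(vectors, max_passes=100):
    """Iterate W <- W + Y whenever Y . W is not positive.

    `vectors` are the sign-normalized training vectors Y_1..Y_m
    (vectors of the second class already negated).
    Returns the weight vector, or None if no pass was error-free.
    """
    w = [0.0] * len(vectors[0])  # assumed starting point
    for _ in range(max_passes):
        changed = False
        for y in vectors:
            if dot(y, w) <= 0:
                w = [wi + yi for wi, yi in zip(w, y)]
                changed = True
        if not changed:
            return w
    return None


# Illustrative data: class C as given, second class negated before training.
class_c = [[2.0, 1.0], [1.0, 2.0]]
class_c_bar = [[-1.0, -1.5], [-2.0, -0.5]]
ys = class_c + [[-a for a in v] for v in class_c_bar]
w = train_fixed_increment(ys)
```

Note that a vector whose inner product with `w` is only slightly positive satisfies the stopping test, which is exactly the sensitivity to noise discussed above.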
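In order to avoid this difficulty, a weight vector W'
which satisfies the inequality
$$Y_j \cdot W' > K\,|Y_j|^{\gamma}\,|W'|^{\alpha}, \qquad j = 1, 2, \ldots, m \tag{2}$$
$$0 \le K < \infty, \quad 0 \le \gamma < \infty, \quad 0 \le \alpha < \infty$$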
may be tentatively proposed for classification. Clearly,
the advantage of this inequality is that for non-zero
K, a dead-zone is created. This zone is symmetric
about zero and equal in magnitude to the right-hand
side of equation (2). The dead-zone may be designated
as an interval of no decision. As a result, some tolerance
to noise is provided. A noisy test pattern vector which
would otherwise satisfy equation (2) may lead to an
inner product lying in the dead-zone, but is unlikely
to be misclassified.
However, only some special cases of equation (2)
will really concern us. The cases which will not be of
interest are
$$Y_j \cdot W' > K\,|W'|^{\alpha}, \qquad j = 1, 2, \ldots, m \tag{3}$$
$$0 \le K < \infty, \quad 0 \le \alpha < \infty, \quad \alpha \ne 1$$
The reason they are not of interest will now be given.
It will be demonstrated that a W' satisfying equation
(3) may be derived from a W satisfying equation (1) by
a simple change in magnitude; thus since W and W'
would differ only in magnitude and not in orientation,
the classification ability of W' would be identical to that
of W. (It should be pointed out that if an actual iteration process is carried out to arrive at a weight vector
W for equation (1) and a weight vector W' for equation
(3), K ≠ 0, the weight vectors are likely to be oriented
differently in the multi-dimensional space and would
thus lead to different generalizations. The important
point to note is the possibility that W' may be oriented
in the same direction as one of the possible W vectors.)
Assume that a weight vector W has been found by
some means which satisfies equation (1). Denote the
minimum of the inner product values indicated in the
left-hand side of equation (1) by δ, i.e.,
$$\delta = \min_{j = 1, 2, \ldots, m} Y_j \cdot W > 0 .$$
The postulate will be shown to be true by deriving
a scalar constant S > 0 such that the weight vector
$$W' = SW$$
will satisfy equation (3). Since the above operation on
W changes its magnitude only and not its direction, the
minimum of the inner products Y_j · W' still occurs at
the same value of j (more than one value of j may
produce the minimum value, but this fact is of no
concern to the present development). Let this value
of j be designated as j'. Thus,
$$Y_{j'} \cdot W = \delta .$$
Then multiplying both sides of the above equation by
S yields
$$Y_{j'} \cdot SW = S\delta$$
or
$$Y_{j'} \cdot W' = S\delta .$$
In order to achieve equation (3), it is necessary to find
an S such that
$$S\delta = K\,|W'|^{\alpha} = K\,S^{\alpha}\,|W|^{\alpha} .$$
Rewriting the above equation,
$$S = \left( \frac{K\,|W|^{\alpha}}{\delta} \right)^{1/(1-\alpha)}$$
which is the suitable value for satisfying equation (3).
Note that S cannot be determined for α = 1.
If among all possible cases of equation (2) the cases
given in equation (3) are not considered, the equation
of interest is an inequality in the form of equation
(3) with α = 1, γ = 0:
$$Y_j \cdot W' > K\,|W'|, \qquad j = 1, 2, \ldots, m \tag{4}$$
From equation (4), it is apparent that the inner products of the weight
vector, normalized to unity, and the vectors Y are
computed in this algorithm. This implies that the only
way the weight vector can affect the value of the inner
products is by changing its direction in space and not
by changing its magnitude. This fact is of considerable
interest since it has already been demonstrated how a
simple change in magnitude of a weight vector satisfying the simple equation (1) can make the new weight
vector satisfy the more complex equation (3), even
though the generalization property remains unchanged.
The use of equation (4) is advocated in this paper.
In the following paragraphs, however, an approach to
the more general equation (2) will be considered first.
Substitution of suitable values for the different parameters of equation (2) will then realize the results for
equation (4). It will be found that a bound for the
convergence rate may not always be obtained for the
approach adopted in this paper.
The procedure parallels the proof for the standard
Adaline technique.7 The iteration method is defined by
the equations
$$W'_{k+1} = \begin{cases} W'_k & \text{if } Y_k \cdot W'_k > K\,|Y_k|^{\gamma}\,|W'_k|^{\alpha}, \\ W'_k + \left( \lambda / |Y_k|^{\beta} \right) Y_k & \text{otherwise;} \end{cases}$$
O 0, 'Y =
0,
"'--k-ID...
1Yr..12
°
the standard bounds described in the literature7 for
conventional Adaline-type devices are obtained. For
the experiments reported in this paper, the parameters
are
{3
A2
a
=
Figure 1-Cases showing presence and absence of upper bound
1
Substitution of these parameter values in equation (10)
leads to
(11)
a definite upper bound on the number of steps k = k_M
exists, provided, of course, that the solution vector
exists. On the other hand, for cases (2) and (3) where
where
Min
Am =
j
.A
= 1, 2, .... , k Yj'W'
Max
and
BM
=j =
1, 2, .... , k - 1
Thus, the above inequality leads to
The left- and right-hand sides of this inequality have
been plotted in Figure 1. It is clear when the two
straight lines intersect, as in case (1),
no such upper bound exists and convergence of the
procedure is not guaranteed.
As in standard literature,7 the bound is not useful
in estimating how many steps will be required in a
given situation, since it depends on the knowledge of
a solution vector W'.
It has been shown that the algorithm in equation
(4) is desirable because it forces the gain vector to
change its direction in space; a simple change in magnitude cannot help in satisfying the inequality of equation (4). At the same time, the possibility of obtaining
a solution in a finite number of steps exists. In the
next section, this algorithm is the basis for some
experiments with real data.
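A training loop of the kind argued for in this section, together with the dead-zone decision rule, might be sketched as follows. The sketch assumes the normalized form of the inequality (inner product compared with K times the length of the weight vector), a zero starting vector, and illustrative values for the increment parameters `lam` and `beta`; the values actually used in the experiments are not legible in this copy, and all names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    return math.sqrt(dot(v, v))


def train_dead_zone(vectors, k, lam=1.0, beta=0.0, max_passes=1000):
    """Adapt W until every sign-normalized vector Y gives Y . W > k * |W|.

    lam and beta scale the correction (lam / |Y|**beta) * Y; both are
    assumed values. Returns (W, passes), or (None, max_passes) on failure.
    """
    w = [0.0] * len(vectors[0])
    for p in range(1, max_passes + 1):
        changed = False
        for y in vectors:
            if dot(y, w) <= k * norm(w):
                step = lam / (norm(y) ** beta)
                w = [wi + step * yi for wi, yi in zip(w, y)]
                changed = True
        if not changed:
            return w, p
    return None, max_passes


def classify(x, w, k):
    """Three-way decision with a dead zone of half-width k."""
    s = dot(x, w) / norm(w)  # inner product with the unit weight vector
    if s > k:
        return "C"
    if s < -k:
        return "C-bar"
    return "no decision"


ys = [[2.0, 1.0], [1.0, 2.0], [1.0, 1.5], [2.0, 0.5]]
w, passes = train_dead_zone(ys, k=0.5)
```

Because `classify` divides by the length of `w`, rescaling the weight vector leaves every decision unchanged, which mirrors the argument that only a change of direction can satisfy the inequality.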
Experimental results
The nature of speaker verification allows one to
perform experiments which are fairly well controlled.
Since most speaker verification applications provide
cooperative users (individuals desiring verification), it is possible to require each user to utter a particular
phrase. The phrase can be designed to carry a maximum
of speaker-dependent information. The choice for
the experiments being reported here was "Check
Available Terminals." Each speaker included in the
test was asked to utter this and four other such phrases
in a predefined but randomized order, interspersing
each utterance with an utterance-labeling task to
prevent interaction between adjacent phrases. Recordings of these utterances were made in an acoustically treated room using a wide-band recording system.
A boom-mounted microphone and headset combination
assured constant microphone placement. Each subject
was asked to speak in a normal voice, and a level adjustment was made to provide approximately the right
input signal level to the tape recorder. It was felt that
these rather idealized conditions would allow evaluation of optimum verification performance. In addition,
in certain applications the real data may approach this
idealized high quality.
TABLE I-Filter bank specifications

Filter    Center Frequency    ± db Bandwidth
Number    (Hz)                (Hz)
 1          188                 250
 2          459                 250
 3          715                 250
 4          969                 250
 5         1220                 250
 6         1472                 250
 7         1725                 250
 8         1975                 250
 9         2225                 250
10         2475                 250
11         2725                 250
12         2991                 290
13         3300                 330
14         3659                 390
15         4083                 460
16         4586                 550
17         5194                 670
18         5954                 860
19         6932                1110
20         8203                1450
In total, utterances from 118 male speakers were
used. Fifty of these were arbitrarily assigned as "reals"
and 100 utterances of each phrase were collected from
each speaker over about a five-to-ten-week period.
Each of the other speakers was assigned to the "impostor" class; each uttered each of the five phrases 20
times, all at one time.
The analog recordings were digitized using the hardware shown in Figure 2 and Table I. It consisted primarily of 20 bandpass filters covering the range of
center frequencies 188 Hz to eight kHz. The lower
frequency filters had 250-Hz bandwidths while the
higher frequency filters were somewhat broader. A
20-ms sampling interval was employed. The output
of each filter was rectified and integrated over each
sampling interval. The value of this integral was converted logarithmically into a four-bit value spanning
a 32-dB range. Only two other pieces of hardware were
used: an automatic level control (ALC) and a fundamental frequency detector. The former maximized
use of the full dynamic range of the A/D conversion
system. Further, to allow reconstruction of the original
absolute signal level, the value of the gain of this ALC
circuit was digitized for each sampling interval. The
fundamental frequency (pitch) detector also passed a
digital estimate of the pitch period to the computer
for each sampling interval. Otherwise, this pitch information would have been unavailable because of
the width of the bandpass filters. Smith has described
the pitch determination method used.15
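The logarithmic conversion of each integrated filter output can be illustrated with a short Python sketch. A four-bit code over a 32-dB range implies 16 levels of 2 dB each; the uniform step, the clipping at both ends, and the `full_scale` reference are assumptions made for illustration, since the paper does not describe the converter in that detail.

```python
import math


def quantize_log(power, full_scale=1.0, bits=4, range_db=32.0):
    """Map a non-negative power value to an integer code in 0..2**bits - 1.

    The top code corresponds to full scale; each lower code is one
    step (range_db / 2**bits dB) further below it. Values outside the
    range are clipped.
    """
    levels = 2 ** bits
    step_db = range_db / levels  # 2 dB per code for the defaults
    if power <= 0:
        return 0
    db_below = 10 * math.log10(full_scale / power)
    code = levels - 1 - int(db_below // step_db)
    return max(0, min(levels - 1, code))


codes = [quantize_log(p) for p in (1.0, 0.5, 1e-6, 0.0)]
```

With these defaults a signal about 3 dB below full scale falls one code below the top, and anything more than 32 dB down collapses to code 0.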
The remaining experimental steps were executed
through programming. It was felt that implementing
most of the system by software and using a general-purpose hardware analyzer would maximize the flexibility of the system. Even greater flexibility could be
obtained by simply sampling the analog speech waveform and storing digitized samples, but the quantity
of the data to be processed would be prohibitive. In
total, approximately 13 hours of analog recordings of
the phrase, "Check Available Terminals," were processed.
Figure 2-Functions of analyzing hardware (outputs labeled "To Digital Computer" and "Voltage Proportional to ALC Gain")
The first step of utterance-processing, segmentation,
was a speech-recognition process which would operate
with good reliability over a large population of speakers
because the phrase to be recognized was known. This
step automatically eliminated improperly spoken or
digitized utterances. It also served a time-alignment
function, allowing comparison of like sounds from utterance to utterance and from speaker to speaker.
Ten points in time were found for each utterance.
Each segmentation point was defined by a precise
set of acoustic rules which will not be given here. The
points were given the following symbols, which correspond roughly to the standard orthography of the
words:
checK aVaILaBle TERMinalS
(S_1 = onset of S; S_2 = end of S.)
The segmentation rules were determined by the following iterative process. A group of ten speakers was
selected arbitrarily and programs were designed to
segment their utterances properly. Accuracy of segmentation was verified by studying digital spectrogram
patterns of each of the utterances. Once designed, the
rules were tested on another arbitrary set of ten speakers. The accuracy of segmentation was improved by
accounting for factors manifested in these new speakers.
Once the rules seemed sufficiently accurate, in terms
of testing on new utterances by this combined set of
20 speakers, these rules were evaluated on an independent group of 20 speakers. Performance appeared
consistent; that is, no significant segmentation problems were apparent in this new set of speakers and the
segmentation programs were considered complete.
Space does not permit detailed description of the final
segmentation rules. Roughly speaking, they involved
the following functions: voicing detection, frication
detection, total signal power, and second formant frequency. Consideration of these functions and the known
context of a fixed phrase permitted quite accurate
segmentation over a broad population of speakers.
As a preliminary rule, all utterances were required
to have defined locations for all ten segmentation points.
This restriction resulted in ten percent of the phrases
being rejected. Phrase rejection, which implies no decision by the machine as to speaker identity, should
be contrasted with speaker rejection. Most applications
would be less sensitive to unnecessary phrase rejection
than to speaker misclassification. Furthermore, the phrase-rejection rate could be reduced substantially if later
stages of recognition were designed to operate on a
partially segmented utterance.
The next phase, feature extraction, used a segmented
utterance for input and produced a vector of features.
For this set of experiments, determining the features
was a two-step process. First, a large set of "proposed"
features was selected. This choice was based upon previous research by the authors and their colleagues, as
well as on published results of experiments involving
human and automatic speaker identification.1,4,8,9,11
Second, the list of features was shortened for economy
of implementation. The "goodness" criterion used to
determine whether or not to include a particular feature
was the F-ratio of analysis of variance.4
A detailed list of the proposed features would be
too lengthy to include here; instead, the general types
of functions employed will be described. A complete
description of the features is given elsewhere.14 The
most common function was an integration of the
power in one or more filters over a number of time
samples. To perform this integration, the log power
values determined by the hardware were converted
to a linear scale, summed, and then reconverted to
log scale. This had the effect of simulating the same
type of analyzer with broader filters and longer integration intervals. Three "bandwidths" were chosen
for integration: a single filter, a band of several filters,
roughly approximating a single formant region, and
the entire set of filters, corresponding to the power in
the original signal during the 20-ms sampling interval.
Three intervals of integration were also used: a short
period of two to four time samples centered at a segmentation point, medium-length intervals extending
from one segmentation point to the next, and long intervals encompassing several segmentation points.
Most of the combinations of these integration regions
were employed at each segmentation point.
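The integration described above (convert log power to a linear scale, sum, reconvert) can be sketched as follows. The layout `frames[t][f]` for the time-by-filter matrix, the decibel units, and the function names are assumptions for illustration only.

```python
import math


def integrate_log_power(db_values):
    """Sum power values given in dB on a linear scale; return the total in dB."""
    linear = [10 ** (db / 10) for db in db_values]
    return 10 * math.log10(sum(linear))


def region_power(frames, t_start, t_end, f_start, f_end):
    """Integrated power over time samples [t_start, t_end) and
    filters [f_start, f_end) of a frames[t][f] matrix of dB values."""
    cells = [frames[t][f]
             for t in range(t_start, t_end)
             for f in range(f_start, f_end)]
    return integrate_log_power(cells)


# Two time samples, three filters, every cell at 10 dB.
frames = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
one_filter_two_samples = region_power(frames, 0, 2, 0, 1)
all_filters_one_sample = region_power(frames, 0, 1, 0, 3)
```

Choosing the filter range selects the "bandwidth" (single filter, formant-sized band, or all filters) and choosing the time range selects the interval of integration.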
In order to detect finer differences between utterances, a section of each utterance was subjected to
"time normalization." The time-frequency matrix of
filter values from the sample labeled "V" to that labeled "L" was "stretched" or "shortened" by linearly
interpolating the sampled output of each filter integrator to provide a fixed number of samples. Various
integrals like those described above were determined
during this time-normalized section also.
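Time normalization by linear interpolation can be illustrated with the sketch below. It assumes the first and last samples of the section are kept as end points and that at least two output samples are requested; the paper does not state either detail, and the names are hypothetical.

```python
def resample_linear(track, n_out):
    """Stretch or shorten one filter's sample sequence to n_out samples
    (n_out >= 2) by linear interpolation between neighbouring samples."""
    n_in = len(track)
    if n_in == 1:
        return [track[0]] * n_out
    out = []
    for i in range(n_out):
        pos = i * (n_in - 1) / (n_out - 1)  # position in input samples
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        out.append(track[lo] * (1 - frac) + track[hi] * frac)
    return out


def time_normalize(frames, n_out):
    """Apply resample_linear to every filter column of a frames[t][f] matrix."""
    n_filters = len(frames[0])
    columns = [resample_linear([row[f] for row in frames], n_out)
               for f in range(n_filters)]
    return [[columns[f][t] for f in range(n_filters)] for t in range(n_out)]


stretched = resample_linear([0.0, 10.0], 5)
shortened = resample_linear([0.0, 1.0, 2.0, 3.0, 4.0], 3)
```

After this step every utterance contributes the same number of samples for the section, so integrals can be taken over corresponding positions.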
Programs were written to estimate approximate
formant frequencies and amplitudes as well. Formants
are characterized by amplitude maxima in the frequency spectrum and are the result of the transfer
function of the vocal tract.2 There is reason to believe
that consistent differences exist among various speakers
in absolute formant frequencies and detailed formant
transitions from sound to sound, even though the approximate motions are the same from talker to talker.
These would reflect an interplay between individual
structural and behavioral differences.
These various functions resulted in a total of 405
proposed features. It was obvious by their design that
they were not independent, neither functionally nor
statistically, but no logical basis was available to select
independent features that would be good, a priori, for
speaker verification. The second step of reducing the
feature set employed analysis of variance, ranking the
405 features according to their F-ratio. This measures
a quantity proportional to the variance of the speakers'
means divided by the mean of each speaker's variance.
Such a measure has the desirable properties of invariance to translation and scaling. No measure of feature
dependence was calculated. The rank orders were tested
for consistency across different speaker populations.
The rankings were determined for two different groups
of 25 speakers each and rank correlation coefficients
were calculated.12 It was determined that the F-ratio
was a consistent measure of relative feature worth
when computed over a set of 25 speakers. All of the
experiments to be reported used the same feature set,
the best 200 features being determined by a composite
ranking based on 50 speakers. Details of the ranking
are given elsewhere.14
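The ranking criterion can be illustrated with a small Python sketch that computes, for one feature, the variance of the speakers' means divided by the mean of each speaker's variance. The use of sample variances and the toy data are assumptions; the paper states only that the F-ratio is proportional to this quotient.

```python
from statistics import mean, variance


def f_ratio(values_by_speaker):
    """values_by_speaker: one list of feature values per speaker
    (at least two speakers, at least two values each)."""
    speaker_means = [mean(v) for v in values_by_speaker]
    within = mean([variance(v) for v in values_by_speaker])
    return variance(speaker_means) / within


# Feature A separates the two speakers well; feature B does not.
feature_a = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]]
feature_b = [[1.0, 6.0, 11.0], [2.0, 7.0, 12.0]]
ranked = sorted({"A": feature_a, "B": feature_b}.items(),
                key=lambda item: f_ratio(item[1]), reverse=True)

# Shifting and scaling every value leaves the ratio unchanged.
shifted_scaled = [[3 * x + 5 for x in v] for v in feature_a]
```

Sorting features by this value in descending order and keeping the top of the list corresponds to the selection step described above.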
Provision had to be made for features that sometimes
did not exist or for which an estimate of value did not
exist. For example, for certain portions of some utterances the system was unable to determine, adequately, pitch frequency or some formant frequency.
This phenomenon will occur to some degree in all feature-extraction systems. A missing feature value poses
interesting theoretical problems in the design of a
decision method. Should one estimate a value for it
on the assumption that the feature really did exist but
the system was not sophisticated enough to determine
its value? Or should the feature really be presumed
missing in the original signal and the utterance considered in a special manner indicating that it is not
like utterances in which the feature appeared to exist?
Sebestyen8 addressed these questions in relation to
probabilistic decision methods, but another approach
seemed needed for the non-parametric Adaline technique used here. One possible good approach would be
to determine the relative frequency with which each
feature was missing in both the real speaker training
data and that of the training impostors. During recognition, a value would be substituted for each missing
feature which favored neither the real nor the impostor
class. Such a value could be the mean of the feature
value averaged over both real and impostors. The fact
that it was missing would be realized by changing the
a priori probabilities of the two classes in accordance
with the previously stored relative frequencies. Thus,
if a real speaker consistently had a feature missing
during training, and that same feature was missing
during recognition, the recognition threshold would
be shifted in favor of accepting the utterance as that
of the real speaker.
In the experiments reported here a simpler strategy
was employed because of the relative infrequency of
missing features. A mean value was retained for each
feature that was ever missing from the real speaker's
training data. During recognition, if one of these
features was missing, the stored mean value was used
as an estimate of the missing feature. If a feature was
non-existent during recognition but always existed
during training, the utterance was ignored entirely.
Almost all utterances ignored in this way were impostor
utterances and recognition performance would probably
not be degraded significantly if each of these utterances
was classified as being that of an impostor, but these
statistics were not calculated. Approximately four
percent of the recognition impostor set of utterances
were ignored in this way.
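The simpler strategy for missing features can be sketched as follows. Representing a missing value by `None`, and taking the stored mean over the training utterances in which the feature was present, are assumptions made for this illustration.

```python
def stored_means(training):
    """For each feature index that was missing in at least one training
    utterance, keep the mean of the values that were present."""
    means = {}
    for i in range(len(training[0])):
        column = [utt[i] for utt in training]
        present = [x for x in column if x is not None]
        if present and len(present) < len(column):
            means[i] = sum(present) / len(present)
    return means


def fill_missing(utterance, means):
    """Substitute stored means for missing features.

    Returns None (utterance ignored) if a feature is missing that was
    never missing during training.
    """
    filled = []
    for i, x in enumerate(utterance):
        if x is None:
            if i not in means:
                return None
            x = means[i]
        filled.append(x)
    return filled


training = [[1.0, None], [2.0, 4.0], [3.0, 6.0]]
means = stored_means(training)
```

Here feature 1 was missing once in training, so a recognition utterance lacking it is completed with the stored mean, while an utterance lacking feature 0 is ignored.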
The adaptive linear decision algorithm described
earlier was used for all experiments described here.
Preliminary experiments were performed to determine
a good value for K, the relative training threshold. It
was determined that K = 5 provided a good tradeoff since convergence was obtained in a reasonable
amount of time and higher values of K significantly
increased training time with little improvement in
generalization performance.
In order to perform training, the set of real utterances
was stored in memory. The larger set of impostor data
resided on direct access storage. The algorithm proceeded through the data, obtaining utterances alternately from the real and impostor sets. When the end
of either set was reached, selection began again at the
first of the completed set but continued from wherever
it happened to be in the other set. The method provided
rapid convergence since the algorithm was always presented with a member of the "other" class after
adapting to the first class. A "pass" was defined to be
one complete loop through the longer of the two lists
of utterances, in our case the impostor set.
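The presentation order described above can be sketched in Python. The sketch assumes the impostor list is the longer one (as in the experiments), that a real utterance is presented before each impostor utterance, and that the position in the real list is carried from one pass to the next; the function name is hypothetical.

```python
def one_pass(reals, impostors, real_start=0):
    """Return the presentation order for one pass and the position in
    the real list at which the next pass should continue."""
    order = []
    r = real_start
    for imp in impostors:  # one complete loop through the longer list
        order.append(("real", reals[r]))
        order.append(("impostor", imp))
        r = (r + 1) % len(reals)  # wrap around the shorter list
    return order, r


reals = ["r0", "r1"]
impostors = ["i0", "i1", "i2"]
first, pos = one_pass(reals, impostors)
second, pos2 = one_pass(reals, impostors, pos)
```

The second pass starts the real list where the first pass left off, so the shorter list is cycled evenly rather than restarted with each pass.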
The training data consisted of approximately the
first 50 utterances of the real speaker being tested and
nine from each of the 29 impostors. These rather arbitrary numbers were the result of practical factors,
such as program running time, storage space, and the
total number of speakers and utterances available.
Further experiments have indicated that generalization
results do not depend strongly on the exact amount of
impostor training data unless one significantly reduces
the number of impostors involved.
The recognition data consisted of the remaining utterances from the real speaker (about 50) and all available utterances (20 or less) from each of 39 other impostors. Thus, testing generalization of acceptance of
the real speaker involved utterances produced by him
after producing all of the training data, while generalization of impostor rejection was tested using entirely
new people that the training algorithm had never
processed.
Computation time on an IBM System/360 Model
40 was approximately one minute both for each training
pass and for recognition of 700 utterances.
Table II lists the results of these experiments. The
accuracy figure tabulated for each real speaker is the
misclassification rate (impostor as real and real as
impostor) for the case of the two classes being equally
likely. In many applications the a priori probabilities
would be unequal and the costs associated with the
two types of errors would be different, thereby making
a statement of a single misclassification probability
uninformative. Ignoring the question of cost differences, the distribution of errors was usually such that
unequal a priori probabilities should allow reduction,
or at least no increase, in the probability of system
misclassification (both types combined). Figure 3
shows typical distributions of recognition dot products
for two real speakers. The probability-density function
of the dot product has been integrated from the left
for the real speakers and from the right for the set
of recognition impostors. The ordinate value corresponding to a particular abscissa value corresponds to
the percentage error that would be experienced for
that class (real or impostor) if the recognition threshold
were placed at that value.
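The two cumulative error curves, and a crossover error rate of the kind tabulated in Table II, can be illustrated with the following sketch. It searches only the observed dot-product values for the threshold at which the two error rates are closest and does not reproduce any smoothing of the curves; the scores and names are invented for illustration.

```python
def error_rates(real, impostor, threshold):
    """Fraction of real utterances rejected and of impostor utterances
    accepted when dot products above the threshold are accepted."""
    real_rejected = sum(s <= threshold for s in real) / len(real)
    impostor_accepted = sum(s > threshold for s in impostor) / len(impostor)
    return real_rejected, impostor_accepted


def crossover(real, impostor):
    """Return (threshold, error) where the two error rates are closest."""
    best = None
    for t in sorted(set(real) | set(impostor)):
        fr, fa = error_rates(real, impostor, t)
        gap = abs(fr - fa)
        if best is None or gap < best[0]:
            best = (gap, t, (fr + fa) / 2)
    return best[1], best[2]


real_scores = [-1.0, 2.0, 3.0, 4.0]
impostor_scores = [-4.0, -3.0, -2.0, 1.0]
threshold, rate = crossover(real_scores, impostor_scores)
```

In this toy case one real utterance and one impostor utterance fall on the wrong side of the chosen threshold, giving equal error rates for the two classes.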
TABLE II-Generalization error over fifty speaker real set (Crossover error rate)

Speaker  Passes*  Error (%)    Speaker  Passes*  Error (%)
 1        5         .3         26        3         .2
 2       10         .8         27        4        5.1
 3        7        1.2         28        3        1.4
 4        6         .0         29        2         .0
 5        5        1.4         30        9         .0
 6        9         .0         31        2         .6
 7        5         .1         32       14         .0
 8        4         .1         33        2         .3
 9        4         .7         34        2         .0
10        8        2.0         35        5         .0
11        2         .2         36        6         .4
12        6         .2         37        9        1.3
13       15**      2.2         38        6         .1
14        5         .5         39        6         .7
15        5         .0         40        6         .7
16       11        1.2         41        8        3.1
17        9        3.4         42        4         .0
18       15**      2.3         43        4         .3
19       11         .2         44        4        2.3
20        5         .4         45        5        1.2
21        5        1.8         46        8         .0
22        3         .0         47        4        1.8
23       12        7.3         48        3         .2
24        3        2.1         49        2         .7
25        4         .3         50        4         .1

* Number of passes to reach convergence.
** Convergence not reached by 15 passes, non-converged gain used.
Figure 3-Typical results of generalization tests (cumulative % versus recognition dot product, with training thresholds marked; panels: Real Speaker 3, total real utterances = 50, total impostor utterances = 641; Real Speaker 5, total real utterances = 50, total impostor utterances = 642)
Figure 5-Probable true distribution of accuracy across many real speakers (abscissa: classification error)
The training algorithm was designed to produce an
optimum recognition threshold of zero (positive-dot
product corresponding to the real speaker, negative to
impostors), but the resulting decision function was not
symmetrical about the origin. Thus, the accuracy figures
in Table II are based upon adjusting the recognition
threshold to produce equal misclassification probabilities
on recognition data. To obtain an intercept, the step-like nature of the cumulative real distribution was
smoothed by linear interpolation. In practice, the
recognition threshold must be set in some other way
since independent data from the real speaker may not
be immediately available. One method might be to
set the threshold to produce a fixed rate of impostor
acceptance (Type II error in statistical terms) and let
the real rejection rate be undetermined until the real
speaker uses the system a number of times.
Figure 4 shows a histogram of the 50 accuracies in
Table II. This distribution bears a resemblance to an
exponential form, as might be expected. One would
always expect a small percentage of people to have unusually high error rates but no one can have negative
error rates; hence, the skewed distribution. If a sufficient number of recognition utterances was available
from the real speakers to allow accurate estimation of
very low error rates, the true distribution of error
would probably look more like Figure 5. This fundamentally imperfect accuracy would result from the inevitable variation in speech patterns with time and because, in the limit of a large enough recognition impostor
set, someone would probably be found who is similar to
any given speaker, at least within the precision of the
features being used.
Figure 4-Generalization error histogram (abscissa: crossover generalization error rate (%); mean = 1.0%, mode = 0%, median = .4%)
CONCLUSIONS AND FURTHER WORK
A more general technique than the conventional
Adaline7 approach has been treated in this paper. The
upper and lower bounds of equation (11), applicable
to the present method, have been derived from the
general bounds of equation (10). These general bounds
may be exploited for other applications of equation (2).
In the conventional Adaline method, the iteration
process guarantees a solution in a finite number of
steps if a solution exists. In the approach adopted in
this paper, the iteration process guarantees a solution
in a finite number of steps if a solution exists and if
this solution satisfies the condition of equation (11).
Since in either of the above two cases the solution
vector is not known beforehand, the difference is only
a philosophical one. The experiments reported in this
paper, however, demonstrate that a solution vector
can indeed be found in most cases.
The value of the large data base is pointed out
again. First of all, this large data base is directed
toward an adequate representation for the real and
the impostor classes. Even after the data base is divided
to conduct independent design and test experiments,
the above postulate remains largely valid. Also, in
many phases of the speaker-verification work (e.g.,
feature selection), an iterative method is unavoidable.
Thus, once a tentative design is created on some data,
the design is tested on a different set of data. If the
design shows faults (large error rate), a new design
is implemented by using both the former design and
the former test data. This new design must now be
tested on an entirely different set of data. This type of
iterative procedure can only be realized if a large data
base is available.
It is felt that the accuracy obtained in the verification experiments is good and that enough people were
involved in the test to produce meaningful results.
The most comparable previously reported experimental
results1,13 state average accuracies of about ten percent
with no provision for "No Decision." Differences in
data bases prohibit exact comparison of verification
systems. The authors' results cover a significantly
larger base of reals than either of the previous experiments.
The authors feel that much of the improvement in
accuracy is the result of phrase selection and carefully
designed segmentation algorithms, but some of the
improvement must be attributed to the rather idealized
conditions under which utterances were gathered.
However, the procedure was automatic once the
segmentation program was designed. Further work is
being pursued to determine the effect on current results of degrading the signal in both bandwidth and
signal-to-noise ratio. Female speakers will also be considered. Improved results are most likely to be obtained through improving segmentation accuracy and
flexibility, and the use of more sophisticated features
(given better segmentation). It is felt that the present
accuracy could be attained with fewer than 200 features
by combining dependent features, if storage space
presented a significant problem.
ACKNOWLEDGMENTS
The authors are indebted to their associates in the
Speech Processing and other Advanced Technology
departments, IBM Systems Development Division
Laboratory, Research Triangle Park, N. C., and to
consultants Dr. K. P. Li and Dr. D. F. Stanat for
invaluable assistance in theory, hardware design and
construction, data gathering and pre-processing, and
programming. This research is an outgrowth of earlier
work performed in association with the Advanced
Analog Products department.
REFERENCES
1 K P LI J E DAMMANN W D CHAPMAN
Experimental studies in speaker verification, using an
adaptive system
Journal of the Acoustical Society of America Vol 40 Nov
1966 966-978
2 J L FLANAGAN
Speech analysis, synthesis, and perception
Academic Press New York 1965
3 C C TAPPERT N R DIXON D H BEETLE JR
W D CHAPMAN
A dynamic-segment approach to the recognition of continuous
speech: an exploratory program
Tech Rpt No RADC-TR-68-177 Rome Air Development
Center Griffis AFB N Y June 1968
4 S PRUZANSKY
Talker-recognition procedure based on analysis of variance
Journal of the Acoustical Society of America Vol 36 Nov
1964 2041-2047
5 S S WILKS
Math statistics
John Wiley and Sons Inc N Y 1962
6 S K DAS
A method of decision making in pattern recognition
IEEE Trans on Computers Vol 18 April 1969 329-333
7 N J NILSSON
Learning machines
McGraw-Hill Book Co N Y 1967
8 G SEBESTYEN
Decision-making processes in pattern recognition
Macmillan Co N Y 1962
9 G L HOLMGREN
Speaker recognition, speech characteristics, speech evaluation,
and modification of speech signals-A selected bibliography
IEEE Trans on Audio and Electroacoustics Vol 14 March
1966 32-29
10 L A KAMENTSKY C N LIU
Computer-automated design of multifont print recognition logic
IBM Journal of Research and Development Vol 7 Jan
1963 2-13
11 J W GLENN N KLEINER
Speaker identification based on nasal phonation
Journal of the Acoustical Society of America Vol 43 Feb
1968 368-372
12 M G KENDALL
Rank correlation methods
Hafner N Y 1962
13 J E LUCK
Automatic speaker verification, using cepstral measurement
Journal of the Acoustical Society of America Oct 1969
to be published
14 W S MOHN
Statistical feature evaluation in speaker identification
Dept of Electrical Engineering N C State Univ July 1969
PhD dissertation
15 C P SMITH
Speech data reduction
AD-117-290 Clearinghouse for Federal and Scientific
Tech Info 1957
A hybrid/digital software package for
the solution of chemical kinetic
parameter identification problems
by ALAN M. CARLSON
Electronic Associates, Inc.
Princeton, New Jersey
INTRODUCTION
The modern hybrid computer offers many significant
improvements over first generation hybrid systems.
These improvements include:
1. The increased speed of digital computers, enabling
programs to be written in hybrid FORTRAN without
drastically limiting hybrid solution rates.
2. The development of analog/hybrid software
(e.g., hybrid simulation languages and analog
set-up programs).
The net result of these improvements has been an
increase in the scope and complexity of hybrid applications and a reduction in the effort required to program
and debug hybrid problems. Unfortunately, the development of hybrid applications software has not
kept pace with recent hybrid improvements.
Applications software, for purposes of this discussion,
is defined as an integrated set of digital/hybrid programs capable of solving the majority of frequently
occurring problems in a specific applications area.
Based on this definition, little or no tangible information
is currently available on the practicality of developing
hybrid software packages, although their benefits are
obvious.
In mid-1968, EAI's Princeton Computation Center
initiated a development project to determine the
feasibility of hybrid applications software. The objectives of the project were to select a frequently occurring
application area, develop general purpose software
for it, and assess the resultant software based on the
above definition, computer economics, ease of use, etc.
The objectives of this paper are to present and illustrate the use of the software package developed as a
result of the above mentioned project.
The chemical kinetic data analysis problem, which
is often referred to as the chemical model building or
parameter identification problem, was selected as the
applications area. Since the software package, which
will be referred to as the kinetic data analysis or KDA
package, solves chemical kinetic problems via either
all-digital or hybrid simulations, the question of simulation economics and accuracy was investigated and
will also be discussed.
The illustrative problem is the "Monsanto Benchmark Problem," which has been well documented2,8,6-8
and typifies the chemical kinetic problems the KDA
package was designed to solve. This problem requires
the determination of twenty-two unknown parameters
using thirteen sets of experimental data and a mathematical model requiring the simultaneous solution of
seven non-linear differential equations.
Problem analysis
Referring to Figure 1, the kinetic data analysis
problem, which occurs during the initial phases of, say,
plant design and economic optimization projects, has
three essential, related parts. They are:
Figure 1-Typical kinetic data analysis flow diagram
1. Performing kinetic experiments to obtain the
data necessary to determine the model.
2. Proposing one or more mathematical models
representing alternative kinetic mechanisms,
chemical reactions, etc.
3. Computational analysis of the proposed models
by determining values for model parameters
(e.g., rate constants) that minimize the discrepancy between computed and experimental results.
The technology required to design and perform kinetic
experiments is available and the initial derivation of
mathematical models to simulate these experiments
is not generally regarded as a difficult task. However,
the applications software required to evaluate these
models is either unavailable, restrictive in a physical
sense, or fails to provide the user with an efficient solution to his problem.
The project manager responsible for the solution of
a kinetic data analysis problem, based on an impromptu
survey, is not interested in becoming deeply involved in
programming or underwriting extensive program development studies to solve his problem. With the exception of a few industrial organizations, the computational alternatives at his disposal are not consistent
with his interests. The computational alternatives are:
1. Direct Simulation-The classical analog computer or digital simulation language study10
where the analyst adjusts model parameters in
a trial and error fashion. This technique is generally successful; however, it is very time consuming, susceptible to human error, and inefficient
except for small problems. The advantage of
direct simulation is that it provides the analyst
with a great deal of knowledge about the physical
behavior of the system being simulated.
2. Parameter Estimation-A variety of digital
computer programs that solve kinetic problems
using, for example, statistical techniques, linear
and non-linear least squares, etc. Specific illustrations may be found in a recent article by
Lapidus and Bard.5 Unless the analyst is familiar
with these programs and is capable of using
them without making major modifications, their
utilization creates a number of problems. These
problems include:
A. The mathematical techniques restrict the
form of the data or the model, thereby
influencing the design of kinetic experiments (e.g., batch-isothermal experiments).
B. The infrequent use of statistical techniques or lack of a working knowledge
of statistics makes it difficult for the user
to evaluate program results and equate
them to the physical problem.
Parameter estimation programs do, however,
represent a relatively economical means of
solving kinetic problems if they can be used
efficiently and without major revisions.
3. Parameter Optimization-This technique uses
general purpose optimization algorithms (e.g.,
gradient search) to automate the above mentioned direct simulation technique. Referring
to Figure 2, the optimization variables, λ,
which are unknown parameters in the kinetic
model, are varied so as to minimize an objective
function. The objective function, F, is a scalar
quantity representing the error between computed and experimental results which may be
obtained using a variety of mathematical relationships (e.g., sum of squares, integral of the
absolute error, etc.). As shown in Figure 2, the
best current values of the algorithm variables,
λB, are those model parameters resulting in the
"best fit" between experimental and computed
concentration data, AF, when the algorithm can
no longer improve the objective function. This
technique is:
A. Theoretically the most general purpose
approach to solving kinetic data analysis
problems. It may be used in either all-digital or hybrid simulation, and the
mathematical forms of the kinetic models
and physical systems that can be investigated are not restricted.
B. Not generally used because many organizations do not have access to appropriate
software and the development of this
software imposes an intolerable financial
burden on any one project. In the past,
this technique was not widely used due
to high digital production costs. The
"Parameter Optimization" technique requires several hundred simulations of
individual experiments per optimization
run.
Figure 2-Simplified parameter optimization flow diagram
Figure 3-KDA program organization
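The parameter optimization loop described in item 3 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the KDA implementation: the single-parameter first-order model, the data values, the crude sign-of-slope search, and every function name are hypothetical and chosen for brevity.

```python
import math

# Hypothetical experiment: first-order decay C(t) = exp(-k t) with true k = 0.5.
TIMES = [0.0, 1.0, 2.0, 3.0]
EXPERIMENTAL = [math.exp(-0.5 * t) for t in TIMES]


def simulate(k):
    """Stand-in for one analog or digital simulation of the kinetic model."""
    return [math.exp(-k * t) for t in TIMES]


def objective(k):
    """Scalar F: sum of squared errors between computed and experimental data."""
    return sum((c - e) ** 2 for c, e in zip(simulate(k), EXPERIMENTAL))


def optimize(k, step=0.1, tol=1e-12, max_iter=1000):
    """Step against the finite-difference slope; halve the step when a trial fails."""
    best = objective(k)
    h = 1e-6
    for _ in range(max_iter):
        slope = (objective(k + h) - objective(k - h)) / (2 * h)
        if slope == 0.0:
            break
        trial = k - step * math.copysign(1.0, slope)
        f_trial = objective(trial)
        if f_trial < best:
            k, best = trial, f_trial
        else:
            step *= 0.5
            if step < tol:
                break
    return k, best


k_best, f_best = optimize(0.1)
```

Each pass through the loop costs several calls to `simulate`, which is consistent with the remark that an optimization run requires several hundred simulations of individual experiments.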
The results of the above mentioned survey indicated
that a significant market existed for general purpose kinetic
data analysis applications software if it could produce
easily interpretable results, require minimal user participation, and solve kinetic data analysis problems at a
reasonable cost using the "Parameter Optimization"
technique. These results were used as guidelines for
the software development project.
Software description
The Kinetic Data Analysis package consists of
several digital/hybrid processors whose individual
functions and interactions are too complex to describe
in this paper. However, referring to Figure 3, the current version of these processors may be visualized as
five FORTRAN programs under the control of a Program Executive. The Program Executive restores and
executes programs requested by the user, provides the
software package with a convenient mechanism to add
programs, etc.
The five programs shown in Figure 3 are an Analog
Set-Up Program, a Data Preparation Processor, and
three optimization programs. The optimization programs are identical with the exception of the mathematical form and/or computer used to simulate the
kinetic model or models. These programs, which have
identical executive, optimization, and objective function programs are:
1. A hybrid optimization program using the analog
computer to simulate kinetic models.
2. An all-digital optimization program for kinetic
models requiring the solution of one or more
ordinary differential equations.
3. An all-digital optimization program for kinetic
models requiring the solution of a set of algebraic equations (e.g., continuous stirred-tank
reactor experiments.)
The Analog Set-Up Program is an interactive program used, for example, to static check analog patch
panels prior to executing hybrid production runs. Since
programs of this type are generally part of the operating
system software for a hybrid computer, a description
of this program will not be presented in this paper.
Subsequent discussions will also exclude the Program
Executive, since its function has, for all practical purposes, already been defined. Therefore, the description
of the Kinetic Data Analysis package will be limited
to the Data Preparation Processor and the optimization
programs.
A brief description of how the user interacts and communicates with the software package to solve a kinetics
problem will be presented first to clarify later discussions.
Figure 4-Typical KDA data form
Figure 5-Data preparation processor flowchart
Figure 6-Flowchart for first phase of KDA study
User interaction and communication
The user's first contact with the Kinetic Data
Analysis package is a set of data forms (see Figure 4)
that request experimental data and other related information in kinetic rather than computer terminology.
These forms are transformed into a deck of punched
cards and fed to the Data Preparation Processor. Referring to Figure 5, if no errors are detected, the data
are processed and the results are printed out and stored
on tape. This tape contains all optimization algorithm
and kinetic information required for the execution of
the optimization program.
To complete the data forms the user is required to
provide a "yes" or "no" answer to the question, "All-Digital Solution?" The initial answer to this question
is "yes" regardless of the user's intention to perform
a hybrid simulation because, referring to Figure 6, the
all-digital optimization program has a built-in mechanism for obtaining:
1. An analog static check and dynamic check solution.
2. A cost estimate of the all-digital solution versus
the hybrid solution cost for problems where the
most economic alternative is questionable.
3. An accurate estimate for all unknown analog
scale factors.
4. An overall dynamic test for hybrid simulations
all of which are required to program and debug the analog
model for hybrid studies.
For all-digital studies, the Kinetic Data Analysis
package supplies three partially programmed FORTRAN IV subroutines and a "Block Data" subroutine for kinetic models consisting of either algebraic
equations (e.g., stirred-tank reactor) or ordinary differential equations (e.g., batch or flow reactors). The
integration package uses a fourth order Runge-Kutta
integration algorithm, and a readily implemented
mechanism is available to obtain the classical "error
versus step size" data to determine the correct and
most economical step size for the integration process.
The three subroutines require the user to:
1. Store initial values of the variables being integrated in an integration initial condition array.
2. Store computed results in a specified array.
3. Compute intermediate variables and model
derivatives or, for example, stage outputs using
FORTRAN IV statements.
Items one and two typically require two or three
statements, and the requirements for item three are
a function of the complexity of the kinetic model.
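The fourth order Runge-Kutta integration and the "error versus step size" data mentioned above can be illustrated with a short sketch. It is written in Python rather than FORTRAN IV, and the two-species model R -> T, its rate constant, and all names are assumptions made only for this example; the `model` function plays the role of the user-supplied derivative subroutine.

```python
import math


def rk4(deriv, y0, t_end, n_steps):
    """Integrate dy/dt = deriv(t, y) from t = 0 to t_end with fixed-step RK4."""
    h = t_end / n_steps
    t, y = 0.0, list(y0)
    for _ in range(n_steps):
        k1 = deriv(t, y)
        k2 = deriv(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = deriv(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = deriv(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y


K = 2.0  # hypothetical rate constant for the reaction R -> T


def model(t, y):
    """Derivatives of the concentrations [R, T] for a first-order batch reaction."""
    r, _product = y
    return [-K * r, K * r]


def error_versus_step(step_counts):
    """Return (step size, absolute error in R at t = 1) for each step count."""
    exact = math.exp(-K * 1.0)
    return [(1.0 / n, abs(rk4(model, [1.0, 0.0], 1.0, n)[0] - exact))
            for n in step_counts]


table = error_versus_step([5, 10, 20])
```

Halving the step size should reduce the error by roughly a factor of sixteen for a fourth order method, so the largest step that still meets the accuracy requirement is the most economical choice.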
The "Block Data" subroutine is used to define total
number of and names of intermediate and integration
variables for control and printout purposes.
These four programs (in object form) are incorporated
into the Kinetic Data Analysis package to form an
executable program which, upon request, will read in
the data prepared by the Data Preparation Processor
and print out the values of intermediate and dependent
variables as a function of the independent variable.
For all-digital studies, the user now has an executable
optimization program capable of solving his problem.
For hybrid studies, this program provides static
check, dynamic check and scale factor information. If
the user executes one digital solution to his problem
(this will be clarified later), the results provide the
information required to test the overall accuracy of a
hybrid simulation and the running time of the alldigital model to compare hybrid versus digital economics.
With the exception of reprocessing the card deck
obtained from the data forms and requesting hybrid
processing, no digital programming is required for
hybrid studies. The Data Preparation Processor, in
the hybrid mode, assigns hybrid interface channels to
operate in conjunction with preprogrammed hybrid
interface programs. Since data are transferred to and from
the analog model in a predefined sequence,
the analog logic and interface circuits are also predefined and can be prepatched. Therefore, the additional
effort required for hybrid studies is limited to the analog programming required to actually simulate the
kinetic model.
The Kinetic Data Analysis package has, in effect,
organized the hybrid study and, with the aid of the
static check, dynamic check, and scale factors determined earlier, made programming and debugging the
analog model a relatively simple task. The aforementioned card input analog set-up program limits the
time required to set up and check out analog programs
to a few minutes.
At execution time, the user communicates with the
Data Preparation Processor, the optimization programs, and the Program Executive through a set of
predefined user oriented commands. These commands
can be entered via cards for batch-unattended runs or
a console typewriter. Since the Kinetic Data Analysis
package uses a "space" as a delimiter, commands are
entered in "free format." For example, the command
"INPUT DATA 8", which is used to read in the data
tape from FORTRAN I/O unit 8, may start at
any location on a punch card.
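A minimal sketch of free-format command reading is given below. It assumes only what the text states, namely that a space is the delimiter and that a command may start in any column; the function name and the returned (command, numbers) pair are hypothetical and do not describe the package's actual internals.

```python
def parse_command(card):
    """Split a free-format command card on spaces; leading blanks are ignored."""
    tokens = card.split()
    if not tokens:
        raise ValueError("blank card")
    words = [t for t in tokens if not t.isdigit()]
    numbers = [int(t) for t in tokens if t.isdigit()]
    return " ".join(words), numbers


command, arguments = parse_command("          INPUT DATA 8")
```

Because the card is split on runs of spaces, the same command is recognized whether it begins in column 1 or column 40.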
The above mentioned command list, which contains
more than fifty individual commands, is too extensive
to discuss in detail. The commands can, however, be
classified into the six areas of control they make available to the user.
1. Program Control: Select I/O devices, call Kinetic
Data Analysis programs, add to the program library, etc.
2. Kinetic Data Handling: Control I/O options and
computations performed on experimental and computed
kinetic data.
3. Optimization Data Handling: Control I/O options and
computations associated with optimization variables.
4. Objective Function Control: Control the mathematical
form, weighting and the components or data sets
used to compute the objective function (see later
discussion).
5. Optimization Algorithm Control: Select the mode (e.g.,
maximize, minimize) and other options (e.g., iterative,
cyclic operation) associated with the optimization algorithm.
6. Model Control and Diagnostic: Select hybrid diagnostic
options (e.g., scan for interface error messages) or
digital model control options (e.g., set or reset a
one/zero model switch to modify kinetic model).
The form of the results obtained by the user during
program execution will be discussed later.
Data capacity and classification
The Kinetic Data Analysis package is capable of
processing up to fifteen sets of experimental kinetic
data (or data sets) which may contain concentration
data for a maximum of fifteen chemical species or components. Each data set may contain up to ten values
of an independent or sampling variable (e.g., time for
batch reactor, volume for flow reactor, etc.) and fifteen
concentration points per sampling variable. These data
must be common to all data sets and the sampling
variable must be a monotonic increasing function
whose initial value is zero. However, equal sampling
variable increments are not required. Each data set also
contains provision for a catalyst concentration, a temperature, and an alphanumeric user identifier. The
purpose and manipulation of the catalyst and temperature data will be discussed later.
Up to fifteen unknown reaction rate constants, which
are assumed to obey the Arrhenius equation, can be
processed. This limit is independent of the thermal
state of the system (i.e., isothermal or non-isothermal
data sets). In addition, the Kinetic Data Analysis
package can process up to fifteen unknown model or
individual parameters (e.g., reaction orders, heat
transfer coefficients, etc.).
The above mentioned limits apply to all-digital
studies and hybrid systems whose interface contains
a minimum of sixteen analog to digital and digital to
analog channels.
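The capacity limits and sampling variable rules above lend themselves to a simple input check. The sketch below is an assumption about how such a check could look, not a description of the Data Preparation Processor: the function name and messages are hypothetical, and "monotonic increasing" is taken here to mean strictly increasing.

```python
MAX_DATA_SETS = 15        # data sets per study, as stated in the text
MAX_SPECIES = 15          # chemical species per data set
MAX_SAMPLING_VALUES = 10  # sampling variable values per data set


def check_data_set(sampling_values, n_species):
    """Return a list of violations of the stated limits for one data set."""
    problems = []
    if n_species > MAX_SPECIES:
        problems.append("too many chemical species")
    if len(sampling_values) > MAX_SAMPLING_VALUES:
        problems.append("too many sampling values")
    if not sampling_values or sampling_values[0] != 0.0:
        problems.append("sampling variable must start at zero")
    if any(b <= a for a, b in zip(sampling_values, sampling_values[1:])):
        problems.append("sampling variable must increase monotonically")
    return problems


ok = check_data_set([0.0, 1.0, 2.0, 3.0], 7)
```

Unequal increments such as 0, 1, 2.5, 6 pass the check, in line with the statement that equal sampling variable increments are not required.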
The Data Preparation Processor categorizes experimental kinetic data into one of three classes called
KDA Case Numbers. They are:
Case #1. One or more experiments performed
under non-isothermal conditions.
Case #2. Two or more experiments performed
under isothermal conditions where
the difference between the maximum
and minimum temperature levels is
greater than 5°C or °F.
Case #3. One or more experiments performed
under isothermal conditions where
the temperature range is less than or
equal to 5°C or °F.
This data categorization is one of the key factors required, for example, to organize optimization algorithm
input data and the transfer of rate constants to the
kinetic model.
Optimization variable transformations
Two transformations, which play an important part
in the data flow between the various KDA processors,
are:
1. The transformation of rate constants and model
parameters into optimization algorithm variables.
2. The transformation and transfer of these variables to the kinetic model.
Both transformations are a function of the aforementioned KDA Case Number and the Arrhenius equation
$$K = A \exp(-B/T) \tag{1}$$
where
K = reaction rate constant
A, B = Arrhenius coefficients
T = absolute temperature
The Kinetic Data Analysis package uses an alternative, but rigorously correct, form of the Arrhenius
equation whose derivation is shown in Appendix A. This
relationship is
$$K = K_R \, K_{HL}^{\beta} \tag{2}$$
where $\beta$ is defined as
$$\beta = \frac{1/T_R - 1/T}{1/T_L - 1/T_H} \tag{3}$$
In equation 3, $T_H$ and $T_L$ are the maximum and
minimum experimental data temperatures, respectively,
and $T_R$ is a mid-range reference temperature defined by
the equation
(4)
In equation 2, $K_R$ denotes the reaction rate constant at
$T_R$ and $K_{HL}$ is the ratio of the maximum to minimum
rate constants ($K_{HL} = K_H/K_L$).
For experimental data categorized as KDA Case #3,
the optimization variables, $\lambda_i$, are defined as:
(5)
where
i = rate constant index, i = 1, 2, ..., NRC
NRC = the total number of rate constants.
For the two remaining data categories
(6)
and
(7)
Individual or model parameters specified by the user
are sequentially added after the last rate constant
variable. For example, the first parameter, $P_1$, is
assigned to $\lambda_{NRC+1}$ for KDA Case #3.
Referring to Figure 7, optimization variables are
transferred to the kinetic model as a function of the
KDA Case Number as shown in Table I. For hybrid
kinetic models, the rate constants are scaled and transferred to the analog computer in a predefined transfer
sequence as shown in Table II. Note that for both
digital and hybrid models concentration initial conditions, sampling points, and a ramp slope (i.e.,
reciprocal of the last data set sampling point) are also
transferred to the kinetic model.
TABLE I-Items influenced by KDA case number
                                         KDA CASE NUMBER
                                         1              2              3
TOTAL NUMBER OF OPTIMIZATION VARIABLES   2·NRC + NPR    2·NRC + NPR    NRC + NPR
FORM ASSIGNED TO OPTIMIZATION VARIABLES
REPRESENTING RATE CONSTANTS              KR, KHL        KR, KHL        K
RATE CONSTANT DATA TRANSFERRED TO DIGITAL MODEL
A, B
A, •
KR, KHL
K
K
RATE CONSTANT DATA TRANSFERRED TO HYBRID MODEL
K
K
LOG (KR)
LOG (KHL)
TABLE II-Typical transfer sequence* for KDA case #2
[Table body not recoverable; legible entries: D/A demultiplexing periods A, B, C; A/D; channel numbers 1, 2, ..., 14, 15; NCS; NRC; CAT**; TEMP]
* Channel zero used by prepatched KDA circuits.
** Transferred when applicable.
Figure 7-Simplified objective function flow diagram
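The case classification rules and the alternative Arrhenius form can be checked numerically with a short sketch. Everything here is illustrative: the function names, the Arrhenius coefficients, and the reference temperature are assumptions, and the relation K = KR·KHL^β is used as it follows from the Arrhenius equation together with the definitions of KR, KHL, and β. The identity holds for any reference temperature, so the value chosen below is arbitrary.

```python
import math


def kda_case(non_isothermal, temperatures):
    """Classify a study as KDA Case 1, 2, or 3 from the rules stated in the text."""
    if non_isothermal:
        return 1
    spread = max(temperatures) - min(temperatures)
    return 2 if spread > 5.0 else 3


def arrhenius(a, b, t):
    """Rate constant K = A exp(-B/T), with T an absolute temperature."""
    return a * math.exp(-b / t)


def beta(t, t_ref, t_low, t_high):
    """Exponent beta = (1/TR - 1/T) / (1/TL - 1/TH)."""
    return (1.0 / t_ref - 1.0 / t) / (1.0 / t_low - 1.0 / t_high)


def rate_from_kr_khl(k_r, k_hl, t, t_ref, t_low, t_high):
    """Rate constant recovered from KR and KHL as KR * KHL ** beta."""
    return k_r * k_hl ** beta(t, t_ref, t_low, t_high)


# Hypothetical coefficients; temperatures are 133 C and 181 C expressed in kelvin.
A, B = 1.0e6, 5000.0
T_LOW, T_HIGH, T_REF = 406.15, 454.15, 430.0
K_R = arrhenius(A, B, T_REF)
K_HL = arrhenius(A, B, T_HIGH) / arrhenius(A, B, T_LOW)
k_direct = arrhenius(A, B, 420.0)
k_alternative = rate_from_kr_khl(K_R, K_HL, 420.0, T_REF, T_LOW, T_HIGH)
```

Searching on KR and KHL instead of A and B keeps both optimization variables tied to quantities that can be read off the experimental temperature range.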
Optimization algorithm and objective function
options
The current version of the Kinetic Data Analysis
package uses a slightly modified version of the PARTAN algorithm described in detail by Harkins.4 Since
a detailed description of the algorithm is available,
this paper will only consider the mathematical form
of the objective function. However, it should be noted
that this algorithm, which can be classified as an "accelerated gradient" algorithm, was selected because of
its proven effectiveness on a number of all-digital and
hybrid kinetic studies performed in recent years at
EAI Computation Centers. The add-on capability of
the software package makes it possible to add other
algorithms if the need exists.
The mathematical form of the objective function
is specified by the user at execution time. Referring
to Figure 7, the objective function is based on the
"total error" or sum of the individual data set errors.
For example, to compute the objective function for
a problem consisting of ten components and ten data
sets, ten analog runs or one hundred digital integrations
are required.
The form of the objective function, its weighting
factors, the exclusion of a chemical species or data sets
from the objective function, etc., are defined by the user
at execution time via the Executive Program. The
software package provides integral and polynomial
objective function options to the user based on the following definitions:
$$E_{n,m,i} = \mathrm{COMP}_i \, \lvert C_{n,m,i} - C'_{n,m,i} \rvert^{\mathrm{EXPN}}$$
COMP_i = 1.0, or 0 when a chemical species is to
be excluded
OMIT_n = 1.0, or 0 when a data set is to be
excluded
i = index denoting a chemical species,
1 ≤ i ≤ J
m = index denoting a sampling point,
1 ≤ m ≤ M
n = index denoting a data set or experiment, 1 ≤ n ≤ N
C_{n,m,i} = computed results (unscaled) array
C'_{n,m,i} = experimental results array
F = total objective function
FRUN_n = data set objective functions
EXPN = a positive, non-zero constant
The polynomial option defines individual data set
objective functions as
$$\mathrm{FRUN}_n = \sum_{m=1}^{M} \mathrm{WGT}_{n,m} \sum_{i=1}^{J} E_{n,m,i} \tag{8}$$
where the weighting factor (WGT_{n,m}), a function of the sampling variable ratio $\theta_m$, is
$$\mathrm{WGT}_{n,m} = 1 + \mathrm{PW1}\cdot(\ldots) + \mathrm{PW2}\cdot(\ldots) \tag{9}$$
In the above relationship $\theta$ denotes a positive sampling
variable ratio whose maximum value is unity:
$$\theta_m = \frac{SV_m}{SV_M} \tag{10}$$
The weighting factor is unity if PW1 and PW2 are
zero. If PW1 = 1.0 and PW2 = 0, initial values are
weighted, and if PW2 = 1.0 and PW1 = 0, final
values are weighted. Note that PW1 and PW2
cannot both be set to one simultaneously.
The integral option defines individual data set
objective functions as
$$\mathrm{FRUN}_n = \sum_{i=1}^{J} \mathrm{WGT}_{n,i} \int_0^1 E_{n,m,i} \, d(SV) \tag{11}$$
where the integral is computed using a "Trapezoidal
Rule" approximation and the weighting factor
(WGT_{n,i}) is defined as
$$\mathrm{WGT}_{n,i} = 1 + \mathrm{CW1}\cdot C_{n,i} + \mathrm{CW2}\cdot(1 - C_{n,i}) \tag{12}$$
The control constants CW1 and CW2 are identical in
behavior to PW1 and PW2. The C_{n,i} values are concentration weighting factors computed from experimental data by the Data Preparation Processor.
If CW1 = 1.0 and CW2 = 0, large concentrations
are weighted, and if CW2 = 1.0 and CW1 = 0, small
concentrations are weighted. This weighting factor
is useful when, for example, a component whose range
is 0 - 0.05 in a given experiment is more sensitive to
an analytical error of, say, ± 0.01 than a component
whose range is 0.5 - 1.0.
Referring to Figure 7, the total objective function,
F, is obtained by summing the individual data set
errors, FRUN, modified by OMIT_n (1.0 or 0) to control
the inclusion or exclusion of the various data sets.
$$F = \sum_{n=1}^{N} \mathrm{OMIT}_n \cdot \mathrm{FRUN}_n \tag{13}$$
Note that user commands control the values assigned to
OMIT_n, COMP_i, EXPN, PW1, PW2, CW1 and CW2.
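The objective function options above can be sketched in a few lines. This is an illustrative reading of equations 8 and 11 through 13, not the package's code: all function names are hypothetical, the sampling-point weights WGTn,m are supplied by the caller rather than computed, and the integral is taken over the sampling variable ratio with the trapezoidal rule.

```python
def point_errors(computed, experimental, comp, expn):
    """E[m][i] = COMP_i * |C - C'| ** EXPN for one data set (rows are sampling points)."""
    return [[comp[i] * abs(c - e) ** expn
             for i, (c, e) in enumerate(zip(c_row, e_row))]
            for c_row, e_row in zip(computed, experimental)]


def frun_polynomial(errors, wgt_m):
    """Equation 8: weighted sum over sampling points of the summed species errors."""
    return sum(w * sum(row) for w, row in zip(wgt_m, errors))


def concentration_weight(c_ni, cw1, cw2):
    """Equation 12: WGT = 1 + CW1*C + CW2*(1 - C)."""
    return 1.0 + cw1 * c_ni + cw2 * (1.0 - c_ni)


def frun_integral(errors, theta, wgt_i):
    """Equation 11 with a trapezoidal rule over the sampling variable ratio theta."""
    total = 0.0
    for i, w in enumerate(wgt_i):
        area = sum(0.5 * (errors[m][i] + errors[m + 1][i]) * (theta[m + 1] - theta[m])
                   for m in range(len(theta) - 1))
        total += w * area
    return total


def total_objective(frun, omit):
    """Equation 13: F = sum of OMIT_n * FRUN_n."""
    return sum(o * f for o, f in zip(omit, frun))


# Hypothetical data set: two sampling points, two species.
errors = point_errors([[1.0, 0.0], [0.5, 0.5]], [[1.0, 0.0], [0.4, 0.7]],
                      comp=[1.0, 1.0], expn=1.0)
f_poly = frun_polynomial(errors, [1.0, 1.0])
f_int = frun_integral(errors, [0.0, 1.0], [1.0, 1.0])
```

With EXPN and all weights equal to unity, the polynomial option reduces to the total absolute error.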
In addition to the aforementioned objective functions, the software package has provision for the user
to add a digital subroutine to compute the individual
data set errors if the "built-in" options are not applicable. For example, if the data set errors are computed
on the analog computer, this subroutine can be used
to transfer them into the digital computer.
Optimization results include a table containing the
objective function, its fractional contribution to the
total objective function, and the average error per
data point for each data set. The total absolute error*
or standard error is included in all results to allow the
user to compare the relative merits of various objective
functions since their magnitudes depend on their
mathematical form.
Temperature and catalyst data
Each of the data sets has associated with it a single
temperature which is sufficient for experiments performed under isothermal conditions (i.e., KDA Case
#2 and 3.) For non-isothermal situations the data
set temperature is the initial or feed temperature;
therefore, the requirements of kinetic models which
include energy balances (i.e., temperature obtained
from the solution of a differential equation) are also
satisfied.
Studies that require the storage of, say, temperature
versus time data are simulated by:
1. Using "Data" statements to include these data
in the subroutines supplied by the user for all
digital studies.
2. Using, say, card programmed diode function
generators (CPDFG) on the analog computer
for hybrid studies.
The CPDFGs work in conjunction with preprogrammed logic that automatically associates each
function with the appropriate data set during the
simulation.
The software package also allows the user to associate a catalyst concentration with each data set. The
catalyst concentration, which is transferred to the
kinetic model, provides the user with a mechanism for
simulating kinetic models involving a non-reactive or
reactive catalyst. For example, when catalyst concentration data is not available in studies involving reactive catalysts, the catalyst concentration is the initial
condition for the catalyst material balance equation.
Typical application
The following discussion will be devoted to the
solution of the "Monsanto Benchmark Problem" using
the Kinetic Data Analysis package on a fully expanded
EAI 8900 Hybrid Computer. This discussion will include a mathematical description of the problem, illustrate the form of the results obtained during the
preparation and optimization phases of the study,
and summarize the numerical results obtained from
the study. Simulation accuracy, errors in results, and
economics will also be discussed.
* Equation 8 with EXPN and WGTn,m equal to unity.
Problem description
The illustrative problem contains the two essential
ingredients to perform a kinetic data analysis study:
a proposed kinetic model and experimental data. Referring to Table III, each of the thirteen available
data sets contained concentration-time data for seven
chemical species (i.e., R, S, T, U, W, X, and Y), the
concentration of a non-reactive catalyst, and a temperature. These data were obtained from experiments
performed under isothermal conditions over a 133 to
181°C temperature range which included a threefold
variation in catalyst concentration, 117 to 368. No
two data sets had identical initial concentrations and
the number of non-zero sampling variable (i.e., time)
points per data set varied from one to four.
The proposed kinetic model, which is shown in Table
IV, is based on the following chemical equations:
KI
K4
R+S==:;T-~U
Ks
K2
R+S-~U
Ks
R+S~W
K9
Ko
T
K6
+ S -~X.==;U +
S
K IO
The model contained eleven unknown rate constants
(K1 through K11) and since this study falls under the KDA
Case #2 category, there are a total of twenty-two
optimization variables. Each rate constant has one KR
and one KHL optimization variable associated with it.
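The count of twenty-two optimization variables follows from the totals listed in Table I. A small sketch, with a hypothetical function name and NPR taken as zero for this problem (an assumption, since the text mentions only rate constants here):

```python
def optimization_variable_count(case, nrc, npr):
    """Total optimization variables: 2*NRC + NPR for Cases 1 and 2, NRC + NPR for Case 3."""
    if case in (1, 2):
        return 2 * nrc + npr
    if case == 3:
        return nrc + npr
    raise ValueError("KDA Case Number must be 1, 2, or 3")


n_variables = optimization_variable_count(case=2, nrc=11, npr=0)
```

Had the thirteen data sets spanned 5°C or less, the same model would fall under Case #3 and need only eleven variables.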
Data preparation processor results
Processing the card deck corresponding to the KDA
Data Forms produced the results indicated in Figure
5, which are illustrated by Figures 8 through 11. These
figures omit the first phase of the form processing output, that is, the direct playback of the KDA Data
Forms with appropriate error messages when errors
are detected.
TABLE III-Typical data set
IDENTIFIER: RUN TWO
TEMPERATURE: 146°C
CATALYST CONCENTRATION: 117
CONCENTRATION IN MASS FRACTION
TIME (HOURS)   R      S      T      U      W      Y      X
0.0            0.425  0.501  0.018  0.005  0.050  --     --
1.0            0.359  0.465  0.051  0.017  0.106  --     0.002
2.0            0.315  0.442  0.086  0.033  0.120  --     0.004
3.0            0.281  0.424  0.123  0.048  0.116  --     0.008
Figure 8-Hybrid interface assignments
Figure 9-Temperature-catalyst interface data transfer
Figure 8 illustrates the hybrid interface assignments for the eleven reaction rate constants and the seven chemical species involved in the mathematical model, their maximum values, and their scale factors (i.e., the reciprocal of the maximum value). Figure 9 details the scaled temperatures and catalyst concentrations that will be transferred to the analog model during the "B" demultiplexing period on D/A channels 14 and 15. Note that this problem is in the KDA Case #2 category, whose interface transfer sequence has been illustrated in Table II.
Referring to Figure 10, the Data Preparation Processor assigns a number to both the data sets and the chemical species involved in the study. These numbers are required by the user to execute commands that manipulate specific chemical components or data sets. For example, to exclude the eleventh data set from the study, the command is "EXCLUDE 11", not "EXCLUDE RUN 11", where "RUN 11" is the data set identifier specified by the user.
The lower half of Figure 10 illustrates a typical data set printout containing the original "time" units and the scaled (i.e., normalized) values of the sampling variable. The normalized values were obtained by:
1. Performing the catalyst transformation shown in Table IV, which was the result of a "yes" answer to the question, "CATALYTIC REACTIONS?" (see Figure 4).
2. Dividing all values by the maximum sampling point to form the "normalized" values or scaled sampling points.

Figure 10-KDA number assignments and processed data set
Figure 11-Concentration weighting factors and algorithm input data

TABLE IV-Mathematical model
DEFINITION OF TERMS
R1 = K1·R·S      R6 = K6·U·S      R11 = K11·Y
R2 = K2·R·S      R7 = K7·U·S      R12 = R1 + R2 + R3
R3 = K3·R·S      R8 = K8·T        R13 = R4 + R8
R4 = K4·T        R9 = K9·W        R14 = R6 + R7
R5 = K5·T·S      R10 = K10·X
t = time
CAT = Catalyst Concentration
α = CAT/(CAT)MIN = Catalyst Ratio
MRT, MSR, MST, etc. = Molecular Weight Ratios
θ = α t
MATERIAL BALANCE EQUATIONS
$$\frac{dR}{d\theta} = \frac{dR}{\alpha\,dt} = -R_{12} + (MRT)\,R_8 + (MRW)\,R_9$$
$$\frac{dS}{d\theta} = \frac{dS}{\alpha\,dt} = -(MSR)\,R_{12} - R_{14} - R_5 + (MST)\,R_8 + (MSW)\,R_9 + (MSX)\,R_{10} + (MSY)\,R_{11}$$
$$\frac{dT}{d\theta} = \frac{dT}{\alpha\,dt} = (MTR)\,R_1 - R_{13} - (MTS)\,R_5$$
$$\frac{dU}{d\theta} = \frac{dU}{\alpha\,dt} = (MUR)\,R_2 + (MUT)\,R_4 + (MUX)\,R_{10} + (MUY)\,R_{11} - (MUS)\,R_{14}$$
$$\frac{dY}{d\theta} = \frac{dY}{\alpha\,dt} = (MYS)\,R_7 - R_{11}$$
W, X: [equations illegible]
These results also contain concentration and rate summations for each time point to assist the user in evaluating the consistency of the data based on material balance. The rates, which are not shown in Figure 10, were computed numerically by differentiating a polynomial whose coefficients are determined by a least-squares fit of the concentration data.
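The preprocessing steps described above can be sketched in a few lines. This is a minimal illustration, not the KDA implementation: the function names are hypothetical, the catalyst transformation is taken to be theta = alpha * t as in Table IV, and the two-species data set at the bottom is invented solely to exercise the material balance check.

```python
def normalize_sampling_points(times, catalyst_ratio=1.0):
    """Apply theta = alpha * t, then divide by the largest theta."""
    thetas = [catalyst_ratio * t for t in times]
    theta_max = max(thetas)
    if theta_max <= 0.0:
        raise ValueError("need at least one positive sampling point")
    return [theta / theta_max for theta in thetas]


def concentration_sums(concentrations):
    """Sum the mass fractions of all species at each time point.

    `concentrations` maps a species name to its list of mass fractions.
    """
    n_points = len(next(iter(concentrations.values())))
    return [sum(series[i] for series in concentrations.values())
            for i in range(n_points)]


def worst_balance_error(concentrations):
    """Largest deviation of a concentration sum from 1.0 (mass fractions)."""
    return max(abs(total - 1.0) for total in concentration_sums(concentrations))


# Sampling times of 0, 1, 2 and 3 hours with a catalyst ratio of 1.
normalized = normalize_sampling_points([0.0, 1.0, 2.0, 3.0])

# Hypothetical two-species data set, used only to exercise the check.
demo = {"A": [0.60, 0.50, 0.45], "B": [0.40, 0.48, 0.50]}
print(normalized, worst_balance_error(demo))
```

A data set whose worst balance error is large (the study below rejects one at about ten percent) is a candidate for exclusion before optimization starts.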
Figure 11 illustrates the concentration weighting factors and the input data to the PARTAN Algorithm. Note that the Data Preparation Processor has assigned names, for example, "KR01", to the optimization variables and placed them in a "type three" category. This means they are constrained between an upper and lower limit denoted by "MAXVAL" and "MINVAL". The initial values of the variables are in the "VALUE" column.
The results of the preprocessing indicated that the eleventh data set should be excluded from the study because its concentration sums indicated as much as ten percent error. Therefore, optimization results were obtained using twelve, rather than thirteen, data sets.
Optimization results
Figures 12 through 15 illustrate the form of some of the results obtained from the hybrid solution of the problem. Figure 12 illustrates the user commands, which are documented as they are processed, and an optimization summary. The summary is updated every time the algorithm detects an improvement in the objective function during the execution of an optimization run. It keeps a running record of the total number of gradient evaluations and objective function evaluations and notes any optimization variables at their upper or lower limit (not shown in Figure 12). The alpha and beta values pertain to the algorithm perturbations, etc.

Figure 12-Typical executive program output and optimization summary
Figure 13-Typical objective function summary and detailed data set results
Figure 14-Typical line printer objective function-No. of improvements plot
Figure 15-Typical concentration results output
During the optimization process the optimization summary is the only output available to the user, with the exception of a percent improvement indicated on the analog computer digital voltmeter. The percent improvement is relative to the initial or base value of the objective function.
After the optimization process has been completed, the previously mentioned objective function summary is obtained (see Figure 13), which includes a reproducibility error. Referring to Figures 13 and 14, the user may also request a detailed comparison of experimental to computed results and a line printer plot of the objective function or any of the optimization variables as a function of the number of improvements.
The objective function summary allows the user to determine if, for example, any one data set is making an excessively large contribution to the objective function. The reproducibility factor, which is typically zero for all-digital studies, is obtained by re-evaluating the objective function under "best fit" conditions after the optimization process has been completed. The percent error between the two objective functions is the percent reproducibility error shown in Figure 13. It reflects the total error introduced into the objective function by the hybrid interface, analog components, etc. As shown in Figure 13, this error was typically less than one percent.
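The reproducibility figure described above is a simple percent difference between two evaluations of the same objective function. The sketch below is only illustrative: the function name is hypothetical, taking the best-fit value as the denominator is an assumption, and the re-evaluated value of 3.850 is invented for the example.

```python
def percent_reproducibility_error(best_objective, reevaluated_objective):
    """Percent difference between the best-fit objective function value and
    the value obtained when it is re-evaluated at the same variables.

    Assumption: the best-fit value is used as the reference (denominator).
    """
    if best_objective == 0.0:
        raise ValueError("best-fit objective function must be nonzero")
    return 100.0 * abs(reevaluated_objective - best_objective) / abs(best_objective)


# Hypothetical pair of values: best fit 3.827, re-evaluation 3.850.
error = percent_reproducibility_error(3.827, 3.850)
print(f"{error:.2f} percent")
```

For an all-digital run the two evaluations coincide, so the function returns zero, which matches the remark that the factor is typically zero for all-digital studies.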
The objective function plots allow the user to graphically follow the path of the optimization process. However, plots of specific optimization variables versus the number of improvements are more important. They indicate the activity or sensitivity of variables during optimization and allow the user to take appropriate action if, for example, a variable always remained essentially constant.
Figure 15 shows a concise final results plot that can be requested via the appropriate user command. This plot, which is obtained on the analog strip chart recorder, consists of a sample variable ramp (θ) and a set of curves for the computed concentrations. The "blips" on the concentration curves represent the deviation between the curves and experimental data points; therefore, the absence of "blips" represents a near perfect or perfect fit. The pulse prior to each ramp denotes the data set number. The first data set is preceded by a 10 volt pulse, the second by a 20 volt pulse, etc.
Problem solution and results
To avoid the possibility of confusing a local minimum on the error surface with the true minimum, sets of optimization runs were always made starting from four points on the error surface. The four sets of starting values used were the maximum and minimum values of the optimization variables, their arithmetic average values, and the initial or "best guess" values. The problem was solved using the following iterative process:
1. Perform four separate, complete optimization
runs using the maximum, minimum, average,
and initial values of the optimization variables.
2. Examine the results and determine if the final
values of the objective function and optimization variables show good agreement.
3. If the results of step two indicate more runs
are required, refine the four sets of starting values
based on their results and repeat the first step.
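The multi-start procedure above can be sketched as follows. This is an illustration under stated assumptions: a small bounded coordinate search stands in for the PARTAN algorithm (which is not reproduced here), the agreement test is a hypothetical one percent spread in the final objective function values, and the quadratic objective is a toy problem.

```python
def pattern_search(objective, start, lower, upper, step=0.25, tol=1e-6):
    """Very small bounded coordinate search; a stand-in for PARTAN."""
    x = list(start)
    best = objective(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                # Keep every variable between its lower and upper limit.
                trial[i] = min(upper[i], max(lower[i], x[i] + delta))
                value = objective(trial)
                if value < best:
                    x, best, improved = trial, value, True
        if not improved:
            step *= 0.5
    return x, best


def multi_start(objective, lower, upper, initial_guess, rel_tol=0.01):
    """Run one optimization from each of the four starting points and
    report whether the final objective function values agree."""
    average = [(lo + hi) / 2.0 for lo, hi in zip(lower, upper)]
    starts = {"maximum": upper, "minimum": lower,
              "average": average, "initial": initial_guess}
    results = {name: pattern_search(objective, s, lower, upper)
               for name, s in starts.items()}
    finals = [value for _, value in results.values()]
    spread = max(finals) - min(finals)
    agree = spread <= rel_tol * max(abs(min(finals)), 1e-12)
    return results, agree


def toy_objective(x):
    """Toy error surface with a single minimum of 1.0 at (0.3, 0.7)."""
    return 1.0 + (x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2


results, agree = multi_start(toy_objective, [0.0, 0.0], [1.0, 1.0], [0.5, 0.2])
print(agree, results["maximum"])
```

When `agree` is false, the starting values would be refined from the four results and the runs repeated, as in step 3.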
This iteration process was repeated three times using the integral form of the objective function with large concentration weighting and an error exponent equal to unity. Referring to Table V, the values of the objective function for these three iterations are reported in standard error form (i.e., the unweighted sum of the absolute concentration errors). After the third iteration, the mathematical form of the objective function was changed to the standard form to eliminate the effects of the concentration weighting, and the results of this iteration indicated that, for all practical purposes, the "best fit" had been obtained.
The four sets of optimization variables obtained from the fourth iteration showed reasonably good but not perfect agreement. The errors introduced into specific reaction rate constants by differences in the final values of the optimization variables were computed using the error form of equation 2; namely,
$$\frac{\Delta K}{K} = \frac{\Delta K_R}{K_R} + \beta\,\frac{\Delta K_{HL}}{K_{HL}} \qquad (14)$$
where ΔKR and ΔKHL are the most probable errors and KR and KHL are the average values of the individual optimization variables.⁹ The results of this analysis are shown in Table VI. Note that the absolute percent error of any one rate constant is a function of temperature, or β, whose range is ±0.5.
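Equation 14 is the relative-error form of a rate constant written as K = KR·(KHL)^β. The sketch below evaluates both expressions; the function names are hypothetical, and using the magnitude of β so that the two error terms add is an assumption about how the tabulated worst case was formed. The check uses the first row of Table VI (0.32 percent for KR, 3.48 percent for KHL).

```python
def rate_constant(k_r, k_hl, beta):
    """Rate constant at reduced temperature beta, K = KR * KHL**beta."""
    return k_r * k_hl ** beta


def percent_error_in_k(err_kr_percent, err_khl_percent, beta):
    """Equation 14 in percent; |beta| is used so the terms add (assumption)."""
    return err_kr_percent + abs(beta) * err_khl_percent


# First row of Table VI: errors at the reference temperature (beta = 0)
# and at either end of the temperature range (beta = +0.5 or -0.5).
minimum_error = percent_error_in_k(0.32, 3.48, 0.0)
maximum_error = percent_error_in_k(0.32, 3.48, 0.5)
print(round(minimum_error, 2), round(maximum_error, 2))
```

With these inputs the minimum is the KR error itself and the maximum rounds to 2.06 percent, which agrees with the corresponding Table VI entry.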
Simulation accuracy
Comparisons between equivalent hybrid and all-digital optimization runs were made to determine how analog component or digital integration errors influenced results. This comparison was based on the standard objective function value obtained after one function evaluation. Using both single and double precision digital integration, a comparison of objective function values showed good agreement between the digital and hybrid results. Both the hybrid and single precision digital integration results were within approximately ±1% of the results obtained using double precision integration. These minor differences were traced to errors of less than 0.001 in computed concentration data points.
One comparison of equivalent all-digital versus hybrid optimization runs was made. Although the two solutions differed slightly when their optimization summaries (see Figure 12) were compared, the final objective function and optimization variable results obtained were identical for all practical purposes (i.e., one or two percent difference). This would seem to indicate that the errors associated with experimental data and the mathematical model will have a greater influence on results than the relatively minor errors introduced by digital integration or analog components. It was also concluded that double precision integration accuracy was not worth the additional computation time it required compared to single precision integration.

TABLE V-Objective function results
STARTING LOCATION    INITIAL   MAXIMUM   AVERAGE   MINIMUM
STARTING VALUES*     8.28      5.76      5.65      25.2
ITERATION 1*         3.65      3.76      3.15      3.59
ITERATION 2*         3.15      3.08      3.12      3.08
ITERATION 3*         3.06      3.03      3.11      3.08
ITERATION 4*         3.00      2.98      2.99      3.01
*Standard error equivalent of weighted integral objective function.

TABLE VI-Rate constant error analysis results
ABSOLUTE PERCENT ERROR
 i     KR (MINIMUM ERROR)   KHL     Ki (MAXIMUM ERROR)
 1     0.32                 3.48    2.06
 2     0.44                 0.83    0.86
 3     0.26                 5.43    2.98
 4     0.30                 4.27    2.43
 5     1.06                 1.96    2.04
 6     1.60                 4.48    3.84
 7     2.05                 1.98    3.04
 8     0.59                 3.18    2.18
 9     0.23                 3.49    1.97
10     3.22                 5.14    5.79
11     2.46                 4.33    4.62

Figure 16-Typical hybrid-digital economic plot (total cost of study, CH for the hybrid study and CD for the all-digital study, versus total number of optimization runs, NOR)
Simulation economics
The above discussion indicates there is no technical advantage to be gained by using a hybrid rather than an all-digital simulation to solve a kinetics problem with the KDA package. Therefore, two questions of interest are:
1. Is there an advantage to using one type of computer?
2. How does one determine which computer to use for specific problems?
The answer to the first question is that there is an economic "break-even" point (see Figure 16) that governs the selection of a hybrid computer over a digital computer or vice versa. This "break-even" point is created when the simulation of the kinetic model requires the solution of a set of differential equations and the digital cost per optimization run is in excess of the equivalent hybrid cost.
A hybrid solution is practical when the hybrid economic advantage during the production phase of a kinetic study offsets and surpasses the deficit encountered during the problem preparation phase. Recalling previous discussions, to perform a kinetic study using the Kinetic Data Analysis package, the analog programming task is superimposed on the normal preparations required for an all-digital study. This creates an obvious hybrid deficit, which combines with the hybrid cost advantage during the execution of the optimization program to create an economic "break-even" point.
The economics associated with the hybrid versus all-digital question should be considered carefully because significant savings can be realized by making the correct decision. For example, a recent hybrid versus all-digital economic study for a reactor control problem¹ indicated that a large scale hybrid computer had approximately a 20:1 time and 40:1 cost advantage over large scale, third generation digital computers (e.g., $1,200 per hour computation center rate), and a 60:1 hybrid time advantage for the solution of the "Monsanto Benchmark Problem" has been reported in the literature.⁶
The hybrid cost advantage is directly related to the average computation time required to simulate a data set or experiment. The analog computer typically requires 10-20 milliseconds to simulate one data set, which is independent of problem complexity. The time required for the equivalent digital simulation is a function of the speed of the digital computer, the number of equations, their degree of nonlinearity, and the integration algorithm. The influence of the digital integration algorithm on this situation is minor since the analog computer can be "speeded-up" more readily than the algorithm.
The second question, how one decides between an all-digital and a hybrid solution, is very difficult to answer due to lack of information. However, based on information obtained from several hybrid optimization studies performed on EAI 8900 Hybrid Computers, it was possible to derive some "rules-of-thumb" or guidelines. These relationships, which are based on a variety of studies involving up to twenty-six optimization variables, are admittedly crude.
The time required to execute one hybrid optimization run, including detailed printouts and tape manipulation, can be estimated using:
$$T_H \approx 3 \cdot NOV \cdot NDS / 100 \qquad (15)$$
where
TH = time per hybrid optimization run, minutes
NOV = total number of optimization variables
NDS = total number of data sets
An approximate relationship to determine the equivalent time, TD, for a digital optimization is:
$$T_D \approx NOV \cdot NDS \cdot DST / 1500 \qquad (16)$$
where DST is the average number of milliseconds required to simulate one data set. This relationship does not include the time required for on-line I/O operations, which are not important if a competitive hybrid/digital situation exists.
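The two rules of thumb are easy to evaluate side by side. The sketch below assumes, as equation 15 states for TH, that both estimates are in minutes; the function names are hypothetical and the 500 millisecond digital simulation time is an invented input. The problem size (22 optimization variables, 12 data sets) is that of the illustrative study.

```python
def hybrid_minutes(nov, nds):
    """Equation 15: estimated minutes per hybrid optimization run."""
    return 3.0 * nov * nds / 100.0


def digital_minutes(nov, nds, dst_ms):
    """Equation 16: estimated minutes per digital optimization run,
    where dst_ms is the milliseconds needed to simulate one data set."""
    return nov * nds * dst_ms / 1500.0


def equal_time_dst_ms():
    """Data set simulation time at which equations 15 and 16 coincide.

    Setting 3/100 = DST/1500 gives a value independent of NOV and NDS.
    """
    return 3.0 * 1500.0 / 100.0


nov, nds = 22, 12
print(hybrid_minutes(nov, nds))          # about 7.9 minutes
print(digital_minutes(nov, nds, 500.0))  # hypothetical 500 ms per data set
print(equal_time_dst_ms())               # 45 ms
```

Taken at face value, the two estimates cross at 45 milliseconds per data set, so a digital simulation slower than that would take longer per run than the hybrid estimate.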
A crude economic plot (see Figure 16) may be obtained from the equations:
$$C_H = C_H^P + (R_H \cdot T_H + C_H^E)\,NOR \qquad (17)$$
and
$$C_D = C_D^P + (R_D \cdot T_D + C_D^E)\,NOR \qquad (18)$$
where
C_H, C_D = total hybrid and digital simulation costs
C_H^P, C_D^P = estimated hybrid and digital preparation costs
R_H, R_D = hybrid and digital computer rates
C_H^E, C_D^E = engineering costs per optimization run
NOR = estimated number of optimization runs
The hybrid and digital engineering costs per optimization run (C_H^E and C_D^E), which are associated with the execution and analysis of the optimization runs, are not necessarily identical. For example, in the illustrative problem, four sets of four hybrid optimization runs (NOR = 16) were required and the engineering effort was four man-days. An all-digital study could have required as long as, say, sixteen days to execute on a "slow" digital computer and required, say, eight man-days of engineering.
The application of the above-mentioned economic analysis to the "Monsanto Benchmark Problem" indicated that the "break-even" point was slightly less than thirteen optimization runs. Since the problem solution required sixteen optimization runs, the economics were only slightly in favor of a hybrid solution. However, a significant hybrid advantage was indicated if additional work was required, for example, investigation of alternative mathematical models or analysis of additional experimental data.
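The break-even point follows from setting the two total costs equal, each being a preparation cost plus a per-run cost multiplied by the number of runs. The sketch below is a hedged illustration: the dictionary layout and function names are hypothetical, the rate is assumed to be expressed per minute so that it matches the run time, and every figure is invented rather than taken from the study.

```python
def total_cost(prep_cost, rate, run_time, eng_cost_per_run, n_runs):
    """Preparation cost plus (computer cost + engineering cost) per run."""
    return prep_cost + (rate * run_time + eng_cost_per_run) * n_runs


def break_even_runs(hybrid, digital):
    """Number of runs at which hybrid and digital total costs are equal.

    Each argument is a dict with keys prep, rate, time, eng.
    Returns None when the hybrid per-run cost is not lower than the
    digital one, since the hybrid deficit is then never recovered.
    """
    per_run_h = hybrid["rate"] * hybrid["time"] + hybrid["eng"]
    per_run_d = digital["rate"] * digital["time"] + digital["eng"]
    if per_run_h >= per_run_d:
        return None
    return (hybrid["prep"] - digital["prep"]) / (per_run_d - per_run_h)


# Hypothetical figures in arbitrary cost units; rate is cost per minute.
hybrid = {"prep": 5000.0, "rate": 10.0, "time": 8.0, "eng": 20.0}
digital = {"prep": 2000.0, "rate": 5.0, "time": 60.0, "eng": 50.0}
print(break_even_runs(hybrid, digital))
```

Any planned number of optimization runs above the returned value favors the hybrid solution under these assumed costs.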
CONCLUSIONS AND COMMENTS
The present version of the Kinetic Data Analysis package has, based on limited customer utilization in EAI Computation Centers, proven to be both an efficient and an economic means of performing both hybrid and all-digital studies. For example, the time required to obtain the all-digital optimization program has been one man-day or less for small- to medium-sized Kinetic Data Analysis studies.
Of greater significance, however, is the fact that this work has proven the practicality of hybrid applications software. It can be used as an effective tool to solve frequently occurring problems on a routine basis with significant reductions in cost and problem preparation time. Therefore, the development of general purpose packages to solve specific classes of problems on hybrid computers would seem to be a fruitful area for future work.
APPENDIX A
Derivation of alternative reaction rate constant equation
Defining the Arrhenius equation as:
$$K = A \cdot EXP(-B/T) \qquad (1)$$
and a mid-range absolute temperature as
$$1/T_R = (1/T_H + 1/T_L)/2 \qquad (2)$$
where
A, B = Arrhenius coefficients
TH = Maximum absolute temperature
TL = Minimum absolute temperature
TR = Reference absolute temperature
T = Absolute temperature, TL ≤ T ≤ TH
K = Reaction rate constant
one obtains:
$$K_H = A \cdot EXP(-B/T_H) \qquad (3)$$
$$K_L = A \cdot EXP(-B/T_L) \qquad (4)$$
$$K_R = A \cdot EXP(-B/T_R) \qquad (5)$$
Combining equations 3 and 4 and equations 1 and 5 yields:
$$LN\left(\frac{K_H}{K_L}\right) = B\left(\frac{1}{T_L} - \frac{1}{T_H}\right) \qquad (6)$$
and
$$LN\left(\frac{K}{K_R}\right) = B\left(\frac{1}{T_R} - \frac{1}{T}\right) \qquad (7)$$
which can then be combined to obtain
$$K = K_R \cdot (K_{HL})^{\beta}, \quad K_{HL} = K_H/K_L \qquad (8)$$
and
$$\beta = (1/T_R - 1/T)/(1/T_L - 1/T_H) \qquad (9)$$
Note that the range of β, based on equation 2, is ±1/2 and the original Arrhenius coefficients in terms of KR and KHL are:
$$B = LN(K_{HL})/(1/T_L - 1/T_H) \qquad (10)$$
and
$$A = K_R \cdot EXP(+B/T_R) \qquad (11)$$

REFERENCES
1 A CARLSON
Hybrid simulation of an exchanger/reactor control system
Presented at the Tech Conf on Process Control May 1968 Edmonton Alberta Canada
2 C GIESE
Determination of best kinetic coefficients of a dynamic chemical process by on-line digital simulation
Simulation Vol 20 1967 141
3 H H HARA R A NESBIT P E PARISOT
A hybrid program for the solution of the Monsanto Benchmark problem
Presented at Nat A I Ch E Meeting Columbus Ohio May 1966
4 A HARKINS
The use of parallel tangents in optimization
Chem Engr Prog Sym Series 60 35 1964
5 L LAPIDUS Y BARD
Kinetic analysis by digital parameter estimation
Catalysis Review Vol 2 67 1968
6 R A NESBIT H H HARA P E PARISOT
Experiences with hybrid computer solution for kinetics parameters search problem
Presented at Central States Simulation Council Meeting St Louis Missouri Jan 1966
7 P E PARISOT
Parameter search for kinetic models utilizing the hybrid computers
Presented at Midwest Simulation Council Meeting Aug 1965 Pittsburgh Pa
8 P E PARISOT L E FRANK V N SCHRODT
Computer solution of sets of non-linear differential equations
Presented at ACS Meeting Houston Texas Dec 1963
9 J B SCARBOROUGH
Numerical mathematical analysis
Johns Hopkins Press Balto 1958 4th ed 432-438
10 T J WILLIAMS
Computer simulation of chemical reactions
Chem Engr News Vol 20 1962 88
The extended space technique for
hybird computer solution of partial
differential equations *
by DONALD J. NE\VIVfAN and JON C. STRAUSS
Carnegie-Mellon University
Pittsburgh, Pennsylvania
practical technique. However, the technolgoy has
progressed to the point where the large quantity of
algebra no longer prevents accurate solutions- of both
linear and nonlinear problems.
INTRODUCTION
The rapid solution of partial differential equations
(PDE) has been a subject of increasing interest in
recent years. This interest in partly due to advances
in areas of technology which require the solution of
PDEs, but is primarily due to the need to apply modern
optimization and identification techniques to the
spatially continuous systems that are best modeled
by PDEs. The parallel organization of theana~og
subsection of a hybrid computer facilitates extremely
rapid solutions of complicated systems of .ordinary
differential equations (ODEs). Therefore, techniques
to find a system of ODEs that can be solved to obtain
a rapid approximate solution to a PDE on the hybrid
computer have become the subject of intensive investigation.
As digital computers have become faster and their
memories larger, interest in symbol manipulation techniques has also increased, and advances have been made
in the capabilities of computers to perform manipulative tasks once considered impractically large. The
Galerkin technique for transforming a PDE into a
system of ODEs has been known for some time, but
for more than a crude solution of simple, linear
problems, the quantity of algebra is so large that until
recently this method has not been considered as a
The GalE!rkin method employs an assumed solution
consisting of a sum of time weighted spatial funtions;
this separable form is similar to that used in the analytical technique for solution of linear PDEs commonly
known as the separation of variables method. Each
spatial function in the separable form is called a mode,
and these modes are assumed to be known functions
selected to satisfy the boundary conditions. The Galerkin method yields one ODE for each mode; the solution of the resulting system of ODEs yields the time
varying weighting coefficients of the modes.
Recent investigation of the errors in assumed manymode solutions of PDEs has led to the discovery that,
while for the first few modes the Galerkin method is
very effective, its performance for many-mode solutions
is not satisfactory. The Galerkin method with small
numbers of modes has been demonstrated to give more
accurate solutions than other methods for the same
number of ODEs.1 If even more accurate solutions
are required, more modes can be introduced into the
solution, but the Galerkin method fails to produce results with any significant increase in accuracy for
these multi-mode solutions. Although the Galerkin
method has been shown to be convergent,2 advances
in symbol manipulation capability have shown that
the method is limited in accuracy in practice by the
extremely slow rate of convergence. Therefore, a new
technique that is effective for multi-mode solution is
needed.

* This work was supported by National Science Foundation
Grant No. GJ-179. This paper was abstracted from the dissertation of D. J. Newman,9 submitted in partial fulfillment of
the requirements for a Ph.D. in Electrical Engineering from
Carnegie-Mellon University.

In this paper, a technique designed to meet this
need, the extended space technique, is described and
demonstrated. After a description of the PDE and
the notion of assumed modes, a review of the Galerkin
method introduces a thorough tutorial on the nature of
the approximation errors. The linear problem with
polynomial modes is used to further explain the slow
convergence of the Galerkin method and to explain
how the extended space technique overcomes this
defect. Formal notation is introduced to make the
technique applicable to the nonlinear problem. Finally an example problem is presented with comparative results based on an analytic solution.
A review and comparison of other hybrid methods
is presented in a previous paper by the authors.3 A
more thorough explanation of the Galerkin method
and its relationship to other assumed mode methods
is available from a review article by Finlayson and
Scriven,4 the Ph.D. dissertation of D. J. Newman,9 and
a recent tutorial article by R. Vichnevetsky.10
Nonlinear partial differential equation
The form of the PDE of interest is given in (1), where
u(x,t) is the dependent function of independent variables x and t, P is a nonlinear partial differential
operator with respect to x, and f is a forcing function.

$$\frac{\partial}{\partial t} u(x,t) = P[u(x,t)] + f(x,t) \tag{1}$$

The solution to this problem must satisfy an initial
condition in t and homogeneous boundary conditions
in x on the interval [0, 1]. (The [0, 1] interval is chosen
for notational convenience only; the solution so obtained may be scaled to any other interval. Brackets
are used to denote "operates on," and parentheses
are used to denote that the value "depends on.") Thus
(1) is an initial value problem in t, and retention of
this initial value character in the system of ODEs to
be obtained is desirable. The PDE form given in (1)
appears to include only a limited number of PDEs,
but through proper problem formulation a wide class
of problems can be solved by simultaneous solution
of PDEs of this form.

Assumed modes

An approximate solution v(x,t) to (1) is proposed
in the separable form of (2).

$$v(x,t) = \sum_{i=1}^{n} c_i(t)\, h_i(x) \tag{2}$$

The assumed spatial modes hi(x) are preselected
to satisfy the orthogonality conditions of (3) and the
spatial homogeneous boundary conditions on the solution to (1).

$$\int_0^1 h_i(x)\, h_j(x)\, dx = \begin{cases} 0, & i \neq j \\ \|h_i\|, & i = j \end{cases} \tag{3}$$

Since the boundary conditions are homogeneous, v(x,t)
also satisfies the spatial boundary conditions. A previous paper by the authors3 removes the restriction
to homogeneous boundary conditions, but it is retained in this paper to simplify the presentation. The
Ci(t) functions are weighting functions for the
assumed modes.
Subject to the conditions stated above, the selection
of the modes depends on the problem, knowledge of
the solution, and computational convenience. If
specific regions of the space differ in such a way that
the solution has different characteristics there or very
high accuracy is required, the problem should be subdivided into regions. The algebra for each region is a
separate problem, but the resulting ODE systems are
interdependent. A description of the regionalization
problem is presented in Reference 9.
The Ci(t) functions must be determined to give as
nearly as possible the best solution to the PDE in
(1) for the given modes of (2). The best approximation
to the solution is one which matches the modal expansion of the exact solution u(x,t) for each mode in
the approximate solution v(x,t). If u(x,t) is replaced in
(1) by v(x,t), a residual function R(x,t) must be introduced to preserve the equality as shown in (4).

$$\frac{\partial}{\partial t} v(x,t) = P[v(x,t)] + f(x,t) + R(x,t) \tag{4}$$

The approximate solution v(x,t) is an exact solution
to equation (4), but the intent is to solve equation (1)
which differs from (4) by an additional forcing function
R. Analyzing the difference between u(x,t) and v(x,t)
is equivalent to analyzing the effect of adding the
residual function R to the PDE.

Galerkin's approach

The residual R in (4) is determined by the choice
of the weighting functions Ci(t) in the approximate
solution v(x,t). Galerkin5 suggested in 1915 an approximation method based on orthogonalizing the residual
with respect to the assumed modes; this orthogonality
requirement is described by the n equations in (5).

$$\int_0^1 R(x,t)\, h_i(x)\, dx = 0, \qquad i = 1, 2, \ldots, n \tag{5}$$

Galerkin does not give any justification for this method
except to say that it is related to the work of Ritz.6
However, in addition to the strong intuitive appeal, it
is easily shown that the orthogonality condition of
(5) can be obtained by minimizing the integral of the
residual squared with respect to the time derivatives
of the Ci(t). This strong relationship to the variational
methods of Ritz has led some investigators to refer to
this method as the Ritz-Galerkin method.
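
The least squares statement can be checked in two lines. The sketch below assumes the residual is written as equation (4) solved for R, and introduces a symbol J (not used in the original) for the integral of the squared residual over [0, 1].

```latex
\begin{aligned}
J &= \int_0^1 R^2\,dx, \qquad
R = \sum_{k=1}^{n} \dot c_k(t)\,h_k(x) - P[v(x,t)] - f(x,t),\\
\frac{\partial J}{\partial \dot c_i}
  &= 2\int_0^1 R\,\frac{\partial R}{\partial \dot c_i}\,dx
   = 2\int_0^1 R(x,t)\,h_i(x)\,dx = 0, \qquad i = 1, 2, \ldots, n.
\end{aligned}
```

Because P[v] and f do not depend on the time derivatives of the Ci(t), only the first sum contributes to the derivative of R, and setting each partial derivative of J to zero gives exactly the n orthogonality conditions.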
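Substituting (4) into (5) and employing the orthogonality conditions in (3) yields the ODEs in Ci(t) given
by (6).

$$\|h_i\| \frac{d}{dt} c_i(t) = \int_0^1 \left\{ P\left[\sum_{j=1}^{n} c_j h_j\right] + f \right\} h_i\, dx, \qquad i = 1, 2, \ldots, n \tag{6}$$
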
In this paper, the Ci(O) are chosen to give a least squares
fit of v(x,O) in (2) to the initial condition on u(x,t) in
(1). Thus the Ci(t) functions are determined from ODE
initial value problems.
This approach can be generalized to any number of
spatial variables as shown by Stacey.7
Where does the residual go?
To be sure, the residual does not vanish for most
PDEs and most finite sets of modes. The expressions
in (5) ensure that the residual is orthogonal to the
hi(x) functions, hence the residual is not composed
of the modes that are in the approximate solution to
the problem. However, since the hi(x) must satisfy the
boundary conditions, they do not form an appropriate
basis for R and hence determination of Ci(t) as in (6)
does not minimize the residual in the most appropriate
subspace.
A more useful form for investigating the residual is
easily obtained by solving (4) for R and combining
with (6) to obtain (7).
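
$$R(x,t) = \sum_{i=1}^{n} \frac{h_i(x)}{\|h_i\|} \int_0^1 \left\{ P\left[\sum_{j=1}^{n} c_j h_j\right] + f \right\} h_i\, dx - P\left[\sum_{i=1}^{n} c_i h_i\right] - f \tag{7}$$
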
An analysis of (7) reveals that the residual must come
from those parts of P[v(x,t)] and f(x,t) which are orthogonal to hi(x). The conclusion is that R acts as a
forcing function composed of components of P[v(x,t)]
and f(x,t) that are orthogonal to hi(x).
Is this effect good or bad?
With respect to f(x,t) even if the effect is not good
at least the effect can be evaluated in terms of the
physical problem. In short, f(x,t) might as well be
assumed to be a function described by (2), and if certain
properties of f must be considered in the problem, modes
characterizing these properties may be carried in the
solution. This is quite tenable if f(x,t) obeys the boundary conditions, and equally impossible if f(x,t) does
not.
With respect to P[v(x,t)], the effect is not immediately clear in terms of the physical problem. For modes
which are not themselves solutions of the unforced
problem (not natural modes), the effective forcing
function contributed by R with components that are
not in the solution can have effects on the solution.
In the Galerkin method these effects emerge as errors
in the approximate solution v(x,t) in addition to the
error due to the omission of modes that are in u(x,t).
These errors are caused by errors in the Ci(t) functions
and do not disappear very rapidly when more modes are
added such as may be done for f(x, t).
Evidently the effects of the residual on the solution
can be quite pronounced when mode and nonmode
functions interact as may happen if nonnatural modes
are employed.
A special case
The two-point boundary value problem with a
second order linear PDE is a meaningful case to study.
Since the object of this section is to examine the nature of the residual generated by the Galerkin method,
the discussion is made more clear by assuming f(x,t) =
0 and by employing simple polynomial modes.
The modes for this two-point problem are required
to satisfy the condition that hi(x) equal zero at the
ends of the solution interval [0,1] for i = 1, 2, 3, ..., n.
Actually more general conditions involving derivatives
of hi(x) can be used as shown later in an example
problem, and a still wider class of boundary conditions
can be used as described in Reference 3. However,
these conditions are simple and serve to demonstrate
the principles involved.
The simplest polynomial modes that satisfy these
boundary conditions are given in (8).
i=1,2···,n
(8)
The bar on the h̄i(x) indicates that these functions are
not orthogonal, but they are independent. The n
orthogonal functions hi(x) defined in (3) are readily
generated from the h̄i(x) by the Gram-Schmidt procedure.
In order to determine the composition of the residual, P[hi(x)] must be examined to determine which
components are orthogonal to all of the hj(x). For this
purpose, h̄i(x) is an adequate substitute for hi(x) and
considerably simplifies the discussion. Since P is a
linear combination of derivative operators, P[hi(x)]
could not contain any powers of x greater than i + 1
but could have any lower term including a constant
term. In fact an adequate basis for P[hi(x)] includes
in addition to the hi(x) functions two functions 1
and x that do not satisfy the boundary conditions.
Therefore, the residual must be composed of a linear
combination of 1 and x, and the Galerkin solution
for this special case has an effective forcing function
of the form ax + b.
Introducing such an extraneous function or alternatively ignoring such a function if it were part of
f(x,t) does not seem to be reasonable. Ostensibly the
residual in the PDE is due to the omission of modes of
higher degree from the approximate solution; however,
such a residual would not be a function ax + b but
would contain all modes especially those of the highest
degree included in the solution.
The extended space technique for the special case
This technique extends the space of functions being
considered for the solution of the second order linear
PDE to include functions, hn+1(x) and hn+2(x), which
are used to absorb the residual and reduce the error
in the coefficients Ci(t); however, these functions are
not included in the actual approximate solution v(x,t).
In the extended space technique, the residual is not
part of P[v(x,t)]; instead, the residual consists of
functions that are not part of the approximate solution and cannot be generated in the PDE from the
approximate solution. The addition of sufficient
amounts of hn+1(x) and hn+2(x) to remove the ax + b
component from the residual reduces the error in the
coefficients for the modes.
The expression given in (9) is substituted into the
PDE instead of the approximate solution v(x,t) to
generate the extended space residual Re(x,t).

$$\sum_{i=1}^{n+2} h_i(x)\, c_i(t) = v(x,t) + h_{n+1}(x)\, c_{n+1}(t) + h_{n+2}(x)\, c_{n+2}(t) \tag{9}$$

Two functions which are orthogonal to the hi(x),
i = 1, 2, ..., n + 2 can be found from 1 and x and are
denoted g'1(x) and g'2(x). These functions with hi(x),
i = 1, 2, ..., n form a basis for P[v(x,t)]. The g'j(x)
functions are employed in (10a) to give two linear
algebraic equations which when solved simultaneously
with the n linear ODEs in (10b) determine the coefficients Ci(t) in v(x,t).

$$\int_0^1 g'_j(x)\, R_e(x,t)\, dx = 0, \qquad j = 1, 2 \tag{10a}$$

$$\int_0^1 h_i(x)\, R_e(x,t)\, dx = 0, \qquad i = 1, 2, \ldots, n \tag{10b}$$

The two equations in (10a) insure that the residual
will not have 1 and x as a basis. The equations in
(10b) are essentially the same as those in (5) and insure
that the residual is orthogonal to the modes. The
conditions in (10) are necessary for the minimization
of the integral of R squared in the subspace with basis
g'1, g'2, hi (i = 1, ..., n). It has been demonstrated that
this is a more appropriate subspace for the description
of R than that with hi (i = 1, ..., n) alone as a basis. It
should therefore be expected that the extended space
technique gives better results than the Galerkin method.
A close examination of P[u(x,t)] (u(x,t) is the exact
solution) compared to P[v(x,t)] indicates why the
extended space technique does give better results.
P[u(x,t)] can be broken into three important parts:
P[v(x,t)] is one part, a part which has the same basis
as P[v(x,t)] but is generated by functions in u(x,t)
that are not in v(x,t) is a second part, and a part
which has a basis different from P[v(x,t)] is a third.
Because the third part has no effect on (10), it cannot
cause any error in the coefficients Ci(t), but the second
part can. Because the second part is generated by
functions not in v(x,t), it does not appear in (4). Ideally
the residual should be this second part, but since the
second part is functionally indistinguishable from the
first part, the ideal residual cannot be produced. The
extended space technique alleviates the errors caused
by the absence of the second part for two reasons:
(1) the extension functions hn+1(x) and hn+2(x) do
generate some of the second part; (2) since the residual
is composed of these extension functions, the effective
forcing function is not composed of only the g'1(x) and
g'2(x) which should have been cancelled out of (4) by
the second part. Particularly in this special case, linear
P with polynomial modes where the greatest interaction is between adjacent modes, the majority of the
effect of the second part of P[u(x,t)] is absorbed by
these two mechanisms.
A generalization of the technique
The extended space technique can be generalized
to cover a nonlinear PDE with m boundary conditions
where the solution employs nonpolynomial modes.
Unfortunately, the effect of the technique on the
residual and the error in the coefficients cannot be
readily examined under these general conditions.
In order to proceed with the description, a more
general notation is required: G is the set of functions
that are desired as modes and functions to fit the
forcing function, f(x,t). G has an orthogonal basis of
n + m functions denoted gi(x). Hn is a subset of G
such that all functions in Hn satisfy the m boundary
conditions. H is an extension of Hn outside of G, but
all of the functions in H also satisfy the boundary
conditions. H also has an orthogonal basis of n + m
functions denoted hi(x), and the first n of these functions are in Hn. In addition m functions denoted g'j(x)
are defined to be orthogonal to all functions in H and
along with the hi(x) in Hn form a basis for G. The
relationship of these sets of functions is shown pictorially in Figure 1.
The approximate solution retains the form given in
(2), but the residual Re is given by (11).
(11)
The system of equations that are solved to determine
Ci(t) are given by (12) and are derived from orthogonality conditions as in (10).
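
$$0 = \int_0^1 \left\{ P\left[\sum_{i=1}^{n+m} h_i c_i\right] + f \right\} g'_j\, dx, \qquad j = 1, 2, \ldots, m \tag{12a}$$

$$\|h_j\| \frac{d}{dt} c_j = \int_0^1 \left\{ P\left[\sum_{i=1}^{n+m} c_i h_i\right] + f \right\} h_j\, dx, \qquad j = 1, 2, \ldots, n \tag{12b}$$
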
Equations (12a) are a nonlinear algebraic system and
(12b) are a nonlinear ODE system.
A linear PDE problem
The study of heat transfer within a solid is an
interesting problem in connection with this work
because the surface conditions give rise to a two-point
boundary value problem. Problems of this nature are
encountered in heat exchangers where metallic fins
are cooled by a forced flow of a fluid. In this example
problem, the linear diffusion equation shown in (13)
is used to represent one dimensional heat flow within
the metal fin.

$$k \frac{\partial^2 u(x,t)}{\partial x^2} = \mu \frac{\partial u(x,t)}{\partial t} \tag{13}$$

Figure 1-Function spaces

In (13) u(x,t) is the temperature, k the conductivity
and μ the heat capacity of the metal.
Newton's law of cooling shown in (14) is used for the
boundary condition at the fin surface cooled with the
fluid at temperature s(t).

$$\frac{\partial u(x,t)}{\partial x} = a\,(u(x,t) - s(t)) \tag{14}$$

This problem is a linear PDE problem with a linear
differential boundary condition.
The problem is chosen as an example because it
has an analytical solution. The solution is evaluated
and used for comparisons of the accuracy of two, three
and five mode assumed mode solutions employing the
Galerkin technique and with two and three mode
solutions employing the extended space technique.
A metal fin
The example problem deals with a fin of metal
uniform in thickness which is cooled by water on both
sides as shown in Figure 2. The initial temperature
(100°) is uniform throughout the cross section, and
cooler water (0°) begins to circulate by the fin at time
zero. The problem has symmetry so that only half
the fin must be considered in the problem.
The water-metal surface is assumed to obey Newton's law of cooling which requires that the rate of
transfer of energy through a boundary be proportional
to the temperature difference across that boundary.
The rate of transfer is proportional to the derivative
of temperature within the metal at the surface. The
difference in temperature across the boundary is the
difference between the temperature within the metal
at the surface and the water temperature, a function
of time s(t). In (14), the proportionality constant a is
assumed to be equal to one for simplicity, and s(t)
is assumed to be the step function given in (15) which
is chosen so that the problem will have a simple analytic
solution.

$$s(t) = \begin{cases} 100, & t = 0 \\ 0, & t > 0 \end{cases} \tag{15}$$

While the surface provides one boundary condition,
symmetry provides another since the derivative of
temperature must be zero on the axis of symmetry.
The complete PDE problem is given in (16) where
k = 1/10 and μ = 1.
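
$$\begin{aligned}
\frac{1}{10} \frac{\partial^2 u(x,t)}{\partial x^2} &= \frac{\partial u(x,t)}{\partial t} \\
\left. \frac{\partial u(x,t)}{\partial x} \right|_{x=1} &= 0 \\
\left. \left( \frac{\partial u(x,t)}{\partial x} - u(x,t) \right) \right|_{x=0} &= 0 \\
u(x,0) &= 100
\end{aligned} \tag{16}$$

Figure 2-Metallic fin
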
The analytical solution to this problem is found by
classical separation of variables, is quite complicated,
and is not harmonic in nature. The frequencies of the
sine components vary according to the solutions of
ω = a cot(ω). In a sample problem given by Lebedev,
Skal'skaya and Uflyand,8 an answer is given which is
presumably an exact answer for this problem. Actually
their expression is a close approximation to the exact
solution with an accuracy of better than .01 percent
for a = 1.
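
The frequency condition can be explored numerically. The following is a minimal sketch, not the authors' program: it assumes the separated solution takes the form cos(ω(1 - x)) exp(-kω²t), which satisfies the boundary conditions in (16) when ω tan(ω) = a, and it finds the roots by bisection. The function names `bisect`, `fin_frequencies` and `fin_temperature` are illustrative only.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Plain bisection; assumes g changes sign exactly once on [lo, hi]."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_lo < 0) == (g_mid < 0):
            lo, g_lo = mid, g_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

def fin_frequencies(a=1.0, count=5):
    """First `count` positive roots of w*sin(w) - a*cos(w) = 0, i.e. w = a*cot(w)."""
    g = lambda w: w * math.sin(w) - a * math.cos(w)
    eps = 1e-9
    # w*tan(w) rises from 0 to infinity on each interval (j*pi, j*pi + pi/2),
    # so each such interval holds exactly one root.
    return [bisect(g, j * math.pi + eps, j * math.pi + math.pi / 2 - eps)
            for j in range(count)]

def fin_temperature(x, t, k=0.1, a=1.0, u0=100.0, terms=50):
    """Series sum of A_n cos(w_n (1 - x)) exp(-k w_n^2 t) for a uniform initial value u0."""
    total = 0.0
    for w in fin_frequencies(a, terms):
        # A_n is the projection of the constant u0 onto cos(w_n (1 - x)).
        amplitude = u0 * 4.0 * math.sin(w) / (2.0 * w + math.sin(2.0 * w))
        total += amplitude * math.cos(w * (1.0 - x)) * math.exp(-k * w * w * t)
    return total

# Decay rates -k*w^2 with k = 1/10; these play the role of the analytic eigenvalues.
eigenvalues = [-0.1 * w * w for w in fin_frequencies()]
print([round(v, 4) for v in eigenvalues])
```

With a = 1 and k = 1/10 the first few decay rates agree with the analytic eigenvalues tabulated later in the paper to about four significant figures.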
Modes and the ODE system
The modes for this problem are chosen by application
of the method for homogeneous differential boundary
conditions presented in Reference 3. A simple polynomial family gi(x) = ix - x^i is employed as the set
G which satisfies g'i(1) = 0. The modes are integrated
with the boundary condition at x = 0 applied to determine the integration constant, and the h̄i(x) shown in
(17) are produced.

$$\bar{h}_i(x) = (i + 1)(x + 1) - x^{i+1}, \qquad i = 1, 2, \ldots, n \tag{17}$$

The modes hi(x) are obtained by orthogonalization
of h̄i(x) and are shown in Table I.

TABLE I-Orthogonal modes for linear PDE problem

h1(x) = - X^2 + 2X + 2
h2(x) = - X^3 + (691/432)X^2 - (43/216)X - (43/216)
h3(x) = - X^4 + (7932/3905)X^3 - (58089/54670)X^2 + (857/27335)X + (857/27335)
h4(x) = - X^5 + (148725/59152)X^4 - (30469/14788)X^3 + (200593/354912)X^2 - (1129/177456)X - (1129/177456)
h5(x) = - X^6 + (432358/143745)X^5 - (31431/9583)X^4 + (2431444/1581195)X^3 - (1266961/4743585)X^2 + (6746/4743585)X + (6746/4743585)

TABLE III-Coefficient matrices for extended space technique

Two mode:
  -.0740726     .000426177
   .142572     -1.174004

Three mode (row and column positions not recoverable; one entry missing):
  -.0740739   .1431100   -.301688   -1.174666   .371627   .0741377   -4.15921   .000427785

Numerical results
The ODE systems for two, three and five modes
employing the Galerkin technique are obtained by the
application of equation (6). The derivatives of Ci(t)
are linear functions of the Ci(t), and the coefficient
matrices of the equations are given in Table II. The
ODE systems for two and three modes employing the
extended space technique are obtained by substitution
of (12a) into (12b) to eliminate the highest two Ci(t).
Again the derivative functions are linear, and the coefficient matrices are given in Table III.
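
The algebra behind the Galerkin coefficient matrix can be sketched in exact rational arithmetic. The code below is an illustration under stated assumptions, not the authors' symbol manipulation program: it takes the unorthogonalized modes of (17), orthogonalizes them on [0, 1] by Gram-Schmidt, and forms the entries of (6) with P = k ∂²/∂x², k = 1/10 and f = 0. All function names are hypothetical.

```python
from fractions import Fraction

# Polynomials are coefficient lists in ascending powers of x: [a0, a1, a2, ...].

def poly_add(p, q, scale=1):
    """Return p + scale*q."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + scale * b for a, b in zip(p, q)]

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative with respect to x."""
    return [k * c for k, c in enumerate(p)][1:]

def inner(p, q):
    """Integral of p*q over [0, 1]."""
    return sum(c / (k + 1) for k, c in enumerate(poly_mul(p, q)))

def raw_mode(i):
    """Equation (17): (i + 1)(x + 1) - x**(i + 1), for i >= 1."""
    p = [Fraction(0)] * (i + 2)
    p[0] = p[1] = Fraction(i + 1)
    p[i + 1] -= 1
    return p

def orthogonal_modes(n):
    """Gram-Schmidt on the raw modes, without normalization."""
    modes = []
    for i in range(1, n + 1):
        h = raw_mode(i)
        for g in modes:
            h = poly_add(h, g, -inner(h, g) / inner(g, g))
        modes.append(h)
    return modes

def galerkin_matrix(n, k=Fraction(1, 10)):
    """Entry (i, j) is k * <h_j'', h_i> / <h_i, h_i>, so that dc/dt = A c."""
    h = orthogonal_modes(n)
    return [[k * inner(poly_diff(poly_diff(hj)), hi) / inner(hi, hi) for hj in h]
            for hi in h]

A = galerkin_matrix(2)
for row in A:
    print([round(float(entry), 6) for entry in row])
```

Because the Gram-Schmidt modes are nested, the two mode matrix produced here is also the leading 2 x 2 block of any larger Galerkin matrix built the same way.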
The simple boundary condition employed in this
problem to facilitate obtaining an analytical solution
presents some severe difficulties in obtaining a good
fit to the initial condition. The modes must satisfy this
unrealistic boundary condition which imposes a steep
slope at x = 0 where the initial condition is flat.
Figure 3 shows the solution fit to the initial condition
for two, three and five modes for both techniques.
Even at five modes the fit is not entirely satisfactory;
however, for small numbers of modes, the analytical
solution suffers from the same defect. This is the cost
that must be paid to obtain an analytical solution for
comparison.
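
The difficulty of fitting the flat initial condition can be seen with the first two modes of Table I. This is a small sketch under the assumption that, for orthogonal modes, the least squares coefficients are the projections Ci(0) = <u0, hi> / <hi, hi> on [0, 1]; the helper names are illustrative and not from the original.

```python
from fractions import Fraction as F

# First two orthogonal modes from Table I, ascending powers of x.
H1 = [F(2), F(2), F(-1)]                              # -x^2 + 2x + 2
H2 = [F(-43, 216), F(-43, 216), F(691, 432), F(-1)]   # -x^3 + (691/432)x^2 - (43/216)x - 43/216

def integral01(p):
    """Integral of a polynomial over [0, 1]."""
    return sum(c / (k + 1) for k, c in enumerate(p))

def product(p, q):
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def evaluate(p, x):
    return sum(c * x**k for k, c in enumerate(p))

def initial_coefficients(modes, u0=100):
    """Least squares fit of a constant u0 using orthogonal modes."""
    return [u0 * integral01(h) / integral01(product(h, h)) for h in modes]

def fitted_initial_profile(x, modes=(H1, H2)):
    c = initial_coefficients(modes)
    return sum(ci * evaluate(h, x) for ci, h in zip(c, modes))

for x in (F(0), F(1, 2), F(1)):
    print(float(x), round(float(fitted_initial_profile(x)), 2))
```

The two mode fit falls well short of 100 at x = 0, where the boundary condition forces a nonzero slope, and is closer at x = 1.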
Figure 4 shows the solution at times of five, 20 and
100 seconds for both techniques. The solutions, exact,
two, three and five mode are indistinguishable on a
graph of this scale. The hybrid solution also produces
identical results and the analog block diagram for
this problem with three modes is shown in Figure 5.
TABLE II-Coefficient matrices for Galerkin method

(the leading 2 x 2 and 3 x 3 blocks are the two and three mode matrices)

  -.074074    4.2867x10^-4    1.8292x10^-5    1.4610x10^-6    1.5616x10^-7
   .14341    -1.1769          7.7397x10^-3   -3.8192x10^-3    7.5717x10^-5
   .30674     .38796        -19.084            .020270         .034980
   .71994   -5.6258            .59565         -9.4932          .033540
 -1.8134     2.6283         -24.223            .79036        -17.439

Figure 3-Linear PDE problem, T = 0

Figure 4-Linear PDE problem solution (t = 5, 20 and 100)

Figure 5-Linear PDE problem analog diagram

In order to compare the accuracies of the different
solutions, errors for cross sections at five seconds are
chosen because the five second cross section has the
greatest error and because at five seconds the analytic
solution is sufficiently convergent to give an accurate
basis for comparison. Figure 6 shows the error curves
on a greatly magnified scale (full scale is .15 to .2 percent of the solution) for the three assumed mode solutions using the Galerkin method. The improvement
between the two and three mode solution is substantial
but the five mode solution is disappointingly similar
to the three mode solution. The error does not decrease
nor does it change shape. Since in a five mode solution
only the seventh and higher degree polynomials are
excluded from the solution, the logical conclusion would
be that the error would have three maxima and three
minima; but since it does not, possibly an error has
crept into a lower mode which is not diminishing to
zero very rapidly. The analysis performed previously
indicates that this is in fact the case and that even
though this error does slowly diminish as the number
of modes increases, the Galerkin method on linear
problems leaves all the error in the modes that are
part of the solution.

Figure 6-Galerkin method error, T = 5.0

The extended space technique shows a dramatic
improvement in the accuracy of the results for three
modes. Figure 7 shows the error curves for the two
and three mode solution; the three mode solution
matches the analytic solution so well that a five mode
solution is not needed. The error shown for the three
mode solution is so small that it is comparable to the
errors in numerical integration of the ODE system
and is only meaningful in the sense that it is a great
improvement over the Galerkin method.

Figure 7-Extended space error, T = 5.0

The eigenvalues for the various solutions shown in
Table IV indicate why the extended space technique
produces such accurate results. For all solutions the
first two eigenvalues match the eigenvalues obtained
from the exact solution very well. The third eigenvalue
for the Galerkin method is never very near the exact
value even for five modes; however, the extended space
technique produces an eigenvalue very near the exact
solution with only three modes. In fact the extended
space technique produces a much better eigenvalue
for three modes than the Galerkin method does for
five modes. Table V presents a comparison of the
digital computation times to do the algebra necessary
to prepare the ODE systems and times to do the
numerical integration of the systems for the Galerkin
method and the extended space technique. Computation time on the hybrid computer to solve the ODE
system is the same for all cases and may be as small
as 10 milliseconds on the Carnegie-Mellon University
EAI-680/PDP-9 hybrid computer depending on the
I/O device used to monitor the solution.

TABLE IV-Comparison of eigenvalues

            Galerkin                  Analytic     Extended Space
   2-Mode    3-Mode    5-Mode                      2-Mode    3-Mode
  -.07402   -.07402   -.07403        -.07403      -.07402   -.07402
  -1.1770   -1.1768   -1.1742        -1.1734      -1.1741   -1.1738
            -19.084   -9.4916        -4.1439                -4.1601
                      -17.028        -9.0810
                      -19.499        -16.000

TABLE V-Comparison of digital computation times

                 Galerkin                    Extended Space
            Algebra   Integration*      Algebra**   Integration*
  2-Mode       11                           98
  3-Mode       34                          234
  5-Mode      209
  Integration*   10

Times in seconds for IBM-360/65.
* Does not include the time for the compilation of ODE derivative subprogram requiring about )0 seconds.
** Values corrected to remove estimated program compilation time which was not included in other timings.

CONCLUSIONS
The example problem has demonstrated how much of
an improvement the extended space technique can
be over the classical Galerkin method. Both the accuracy of the solution and the eigenvalues of the ODE
system are better for the three mode extended space
technique solution. However, this improvement is not
obtained without some increased cost: the quantity of
algebra that must be performed to determine the
three mode extended space solution is about equal to
the quantity to determine the five mode Galerkin
solution. Even when this increased cost is considered,
the extended space technique is superior because the
three mode solution is better than the Galerkin five
mode solution.
This technique is also applicable to nonlinear problems, but no experimental results are available at
present. The nonlinear application has an additional
complication: the simultaneous solution of nonlinear
algebraic equations and a nonlinear ODE system is
required. Work on a nonlinear problem is currently
being done and results are expected to indicate comparable superiority over the Galerkin method for
nonlinear problems.
REFERENCES
1 W Z COLLINGS
The method of undetermined functions as applied to
nonlinear diffusion problems
MME thesis Univ of Delaware 1962
2 M A KRASNOSEL'SKI
Topological methods in the theory of nonlinear integral
equations
Pergamon-Macmillan 1964
3 D J NEWMAN J C STRAUSS
Hybrid assumed mode solution of nonlinear partial
differential equations
Proc FJCC Vol 33 1968 575
4 B A FINLAYSON L E SCRIVEN
The method of weighted residuals-A review
Applied Mechanics Reviews Vol 19 No 9 Sept 1966 735
5 B G GALERKIN
Rods and plates
Vestn Inzhen i Tekh Petrograd 19 897-908 1915
Translation 63-18924 Clearinghouse Fed Sci-Tech Info
6 W RITZ
Über eine neue Methode zur Lösung gewisser Variationsprobleme
der mathematischen Physik
J f reine u angewandte Mathematik 1909
7 W M STACEY JR
Modal approximations
MIT Press Cambridge 1967
8 N N LEBEDEV I P SKAL'SKAYA Y S UFLYAND
Problems in mathematical physics
Translated by ARM Robson Pergamon Press Oxford 1966
9 D J NEWMAN
Hybrid assumed mode solution of nonlinear partial
differential equations
Carnegie-Mellon Univ Pittsburgh Pa 1969 PhD thesis
Available from Univ Microfilms Ann Arbor Michigan
10 R VICHNEVETSKY
Use of functional approximation methods in the computer
solution of initial value partial differential equation problems
IEEE Transactions on Computers Vol C-18 June 1969
Extension and analysis of use of
derivatives for compensation of hybrid
solution of linear differential equations
by NELSON H. KEMP
Wolf Research and Development Corporation
West Concord, Massachusetts
INTRODUCTION
When compared to continuous (analog) computation,
hybrid computation is subject to two sources of error
not associated with hardware, but caused by its
logical nature. They are often referred to as the time
(or transport) delay, and the reconstruction errors.
This time delay error is caused by the time taken for
the digital computer to process the data sampled from
the analog computer, before sending the updated
results back to the analog. The reconstruction error
results from the hold action of the digital-to-analog
link: the updated value from the digital is sent to
the analog and held fixed until the next updating, instead of being updated continuously.
The effect of these errors on the hybrid solution (as
compared with a pure analog solution) is twofold.
First, inaccuracies are introduced. Second, the hybrid
solution may become instable and grow without bound,
even though the correct solution is bounded or even
decreases to zero.
To prevent instability and minimize error, hybrid
computations utilize compensation techniques. The
variables processed in the digital computer for use in
the analog computer are calculated at some future time,
by an extrapolation scheme, before being sent to the
analog. Depending on the scheme used, this technique
can have a beneficial effect on the accuracy and stability
of the solution, for a given sampling interval.
There are a number of extrapolation techniques
commonly used to achieve compensation. One such
technique is that of multistep extrapolation, or digital
filters, in which values of the variables at earlier times
are used for extrapolation. A good discussion of this
method is given by Mitchell.1 He demonstrates its
shortcomings for heavily damped systems, caused by
the instability of the extraneous solutions introduced
by use of values at earlier times. For each step back
in time, one extraneous solution is introduced, and
these solutions are instable for large enough sampling
intervals. The popular three-step, or parabolic, extrapolation introduces two such solutions, and their
amplitude increases with increasing damping, so that
heavily damped systems require small sampling intervals for stability.
Some years ago, Miura and Iwata2 suggested another
technique of extrapolation. For solving differential
equations, they used the derivative of each variable
to extrapolate, rather in the manner of a Taylor series.
The implementation suggested was to add to the output of an integrator a multiple of the input, the sum
being the extrapolated value of the variable. Further
use of this scheme, for undamped systems, was made
by Gilbert3 and Karplus4,6 with several implementations
suggested. Gilbert3 analyzed the undamped system,
using z-transforms. This extrapolation technique has
the advantage of requiring either no backward steps,
or only one, depending on the implementation, thus
eliminating or reducing the number of extraneous
solutions introduced. The result is a solution which is
not only more accurate than the uncompensated
hybrid solution, but can be more stable. This is in
contrast to the use of multi-step methods, which improve the accuracy but reduce the stability compared
to the uncompensated hybrid solution.
There is apparently only one published reference to
the use of the method of Miura and Iwata for a damped
second order system. Bekey and Karplus,6 on pages
382-383 of Chapter 12, give some results of unpublished*
work of Howe and Fogarty.6 In this work, they extrapolate
$x$ and $\dot{x}$ by using 1.5T times $\dot{x}$ and $\ddot{x}$ respectively,
where T is the sampling interval. They use an implementation
where the extrapolation is performed in
the analog computer; the extrapolated values are
sampled by the digital computer, combined to give
$\ddot{x}$, and then converted D to A and sent to the analog
computer for integration. We can call this calculation
of extrapolated values in the analog computer
analog compensation. The analysis by z-transforms is
based on a timing sequence in which the A to D sampling
occurs before the D to A conversion of $\ddot{x}$. The
result of this compensation scheme is two desirable
solutions whose exponents have errors of order $(\omega T)^2$,
in contrast to errors of order $\omega T$ for the uncompensated
solution, where $\omega$ is the natural frequency.
However, there are two extraneous solutions of the
order $(\zeta\omega T)^{1/2}$, where $\zeta$ is the damping coefficient, in
contrast to the single extraneous solution of order $\zeta\omega T$
for no compensation. Therefore, we see that in this
case derivative compensation improves the accuracy,
but it reduces the stability, compared to no compensation.
This situation can be improved if we change to
what might be called digital compensation. Here, we
sample $x$ and $\dot{x}$, and do the extrapolations in the digital
computer. This is the scheme used in the present report.
For a damped system, it uses no backward time
steps, instead of the one backward step inherent in
the Howe-Fogarty implementation. Therefore, it has
only one extraneous solution, of order $\zeta\omega T$, and is somewhat
more stable than the uncompensated case because
of a better numerical factor. The accuracy of the two
desirable solutions is of the same order as those of
Howe and Fogarty.
The same scheme as that given for digital compensation
in this report can be obtained by the analog compensation
method of Howe and Fogarty if they change
the order of A to D sampling and D to A conversion
and perform D to A before A to D. This may not be a
desirable implementation because the transients set
up by D to A may interfere with the values sampled
A to D immediately thereafter.
The purpose of this report is to extend the use of
derivatives for extrapolation, to apply the method to a
damped second order system typical of control problems,
to analyze the system by use of z-transforms,
and to compare the analysis with hybrid calculations
using both derivative compensation and multi-step
compensation.
The extension of the derivative method, which is
also referred to as Taylor series compensation, is in
several directions. First, we not only correct $x$ by using
$\dot{x}$, but also by using $\ddot{x}$, since that derivative is also
available. Second, we do not assume an extrapolation
ahead by 1.5T, but carry along arbitrary constants
which are then chosen to give greatest accuracy. The
first order corrections are indeed found by this method
to be 1.5T, providing a simple analytical derivation
of this fact. The second order coefficient of $x$ may be
chosen in several ways to enhance accuracy or stability.
The analysis is applied to a linear damped oscillator,
forced by a control function which is a linear combination
of $x$ and $\dot{x}$. The oscillator is implemented on the
analog computer, the control function on the digital
computer.
The z-transform analysis yields formulas which can be
used to predict the stability of both the compensated and
uncompensated cases for any values of the parameters
and sampling interval. Similar results are given for the
three-step compensation scheme, and show it to be
less stable.
A numerical test was made by implementing both
schemes on a Beckman 2200/SDS 9300 hybrid computer.
The hybrid calculations were compared with
continuous calculations of the same system made on
the analog computer. The superior accuracy and stability
of the Taylor series method over the three-step
method is clearly apparent in the strip chart results,
as well as in the digital printouts.
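* Prof. Fogarty kindly sent me a copy of this report, and the
remarks in this paragraph are based on my analysis of Section 5
of the report.

Analysis

Continuous solution

The forced oscillator analyzed is defined by

$$\ddot{x} + 2\zeta\omega\dot{x} + \omega^2 x = \omega^2\delta \tag{2.1}$$

$$\delta = K x_c - K(\tau\dot{x} + x) \tag{2.2}$$

where $K$ and $\tau$ are constant control parameters. The
command input $x_c$ is taken to be a constant here, for
ease of analysis. Further, only the simple initial
conditions $x(0) = 0$, $\dot{x}(0) = 0$ are considered, although
other values bring only algebraic complication.
The exact continuous solution of this problem is
simply obtained by transposing the variables on the
right side and defining total frequency and damping by

$$\omega_T^2 = \omega^2(1 + K), \qquad 2\zeta_T\omega_T = 2\zeta\omega + K\omega^2\tau \tag{2.3}$$

The solution with zero initial conditions is then

$$x = \frac{K x_c}{1 + K}\left[1 - \frac{1}{2}\left(1 - \frac{i\zeta_T}{\zeta_{T1}}\right)e^{\lambda_{T1}t} - \frac{1}{2}\left(1 + \frac{i\zeta_T}{\zeta_{T1}}\right)e^{\lambda_{T2}t}\right], \qquad \zeta_{T1} = (1 - \zeta_T^2)^{1/2} \tag{2.4}$$

where the $\lambda_{T1,2}$ are the roots of the characteristic
equation

$$\lambda_T^2 + 2\zeta_T\omega_T\lambda_T + \omega_T^2 = 0, \qquad \lambda_{T1,2} = \omega_T(-\zeta_T \pm i\zeta_{T1}) \tag{2.5}$$

Hybrid difference equations

The hybrid implementation considers the $\delta$ term as a
control function which is calculated digitally while the
left side of (2.1) is calculated continuously in the analog
computer. Thus, between the sampling times $nT$ and
$(n+1)T$, $\delta$ is held fixed at the value $\delta_{pn}$ supplied to the
analog at $t = nT$.
Therefore during this interval the analog solves

$$\ddot{x} + 2\zeta\omega\dot{x} + \omega^2 x = \omega^2\delta_{pn} \tag{2.6a}$$

with initial conditions

$$x = x_n, \qquad \dot{x} = \dot{x}_n \qquad \text{at } t = nT \tag{2.6b}$$

The solution of (2.6) is

$$x = \delta_{pn} + \frac{\dot{x}_n - \lambda_2(x_n - \delta_{pn})}{2i\omega\zeta_1}\,e^{\lambda_1(t-nT)} - \frac{\dot{x}_n - \lambda_1(x_n - \delta_{pn})}{2i\omega\zeta_1}\,e^{\lambda_2(t-nT)} \tag{2.7a}$$

$$\dot{x} = \lambda_1\,\frac{\dot{x}_n - \lambda_2(x_n - \delta_{pn})}{2i\omega\zeta_1}\,e^{\lambda_1(t-nT)} - \lambda_2\,\frac{\dot{x}_n - \lambda_1(x_n - \delta_{pn})}{2i\omega\zeta_1}\,e^{\lambda_2(t-nT)} \tag{2.7b}$$

where $\lambda_{1,2}$ are the roots of the free-vibration characteristic equation

$$\lambda^2 + 2\zeta\omega\lambda + \omega^2 = 0, \qquad \lambda_{1,2} = \omega(-\zeta \pm i\zeta_1), \qquad \zeta_1 = (1 - \zeta^2)^{1/2} \tag{2.8}$$

At $t = (n+1)T$ these are expressible in real form as

$$x_{n+1} = \delta_{pn} + e^{-\omega\zeta T}\left[(x_n - \delta_{pn})\left(\cos\omega\zeta_1 T + \frac{\zeta}{\zeta_1}\sin\omega\zeta_1 T\right) + \frac{\dot{x}_n}{\omega\zeta_1}\sin\omega\zeta_1 T\right] \tag{2.9a}$$

$$\dot{x}_{n+1} = e^{-\omega\zeta T}\left[\dot{x}_n\left(\cos\omega\zeta_1 T - \frac{\zeta}{\zeta_1}\sin\omega\zeta_1 T\right) - \frac{\omega}{\zeta_1}(x_n - \delta_{pn})\sin\omega\zeta_1 T\right] \tag{2.9b}$$

These two equations are difference relations between
$x_n$, $\dot{x}_n$ and $x_{n+1}$, $\dot{x}_{n+1}$, with given $\delta_{pn}$. Equations (2.7)
show that the analog computer produces segments of
forced damped vibrations between sampling times, each
joined to the adjacent segments with continuous $x$ and $\dot{x}$,
but discontinuous $\ddot{x}$, because $\delta_{pn}$ changes at each
sampling time. The hybrid system solves the difference
equations (2.9), as will we, but first $\delta_{pn}$ must be specified
in terms of $x$ and $\dot{x}$ to model the digital part of the
calculation.

Taylor series compensation

The digital calculation of $\delta_{pn}$, the value sent to the
analog at time $nT$, can only depend on quantities
sampled by the digital at previous sampling times. We
will project $x$ and $\dot{x}$ and take $\delta_{pn}$ to be given by the
projected values according to (2.2):

$$\delta_{pn} = K x_c - K(\tau\dot{x}_{pn} + x_{pn}) \tag{2.10}$$

The projections are accomplished from $x_{n-1}$, $\dot{x}_{n-1}$ by a
Taylor series form

$$x_{pn} = x_{n-1} + lT\dot{x}_{n-1} + kT^2\ddot{x}_{n-1} \tag{2.11a}$$

$$\dot{x}_{pn} = \dot{x}_{n-1} + hT\ddot{x}_{n-1} \tag{2.11b}$$

We have used as many terms as the available derivatives
allow. The quantity $\dot{x}_{n-1}$ can be sampled and made
available in the digital. The second derivative is
calculated from the differential equation (2.6a):

$$\ddot{x}_{n-1} = -2\zeta\omega\dot{x}_{n-1} - \omega^2(x_{n-1} - \delta_{p,n-1}) \tag{2.12}$$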
Equations (2.10)-(2.12) are the essence of the Taylor
series compensation scheme proposed here. In contrast,
a three-step scheme would project $\delta_{pn}$ from previous $\delta$'s:

$$\delta_{pn} = a_0\delta_{n-1} + a_1\delta_{n-2} + a_2\delta_{n-3} \tag{2.13a}$$

where

$$\delta_{n-1} = K x_c - K(\tau\dot{x}_{n-1} + x_{n-1}) \tag{2.13b}$$

and similarly for $\delta_{n-2}$, $\delta_{n-3}$. This scheme goes back to
$(n - 3)T$, two steps further than (2.11).
In both cases the constants $l$, $k$, $h$, or $a_0$, $a_1$, $a_2$ are
available to help improve the solution. For the three-step
method, it is conventional to project to the time
$(n + 1/2)T$, for which the values of the constants are

$$a_1 = -21/4, \qquad a_2 = 15/8, \qquad a_0 = 1 - a_1 - a_2 = 35/8 \tag{2.14}$$

If we project (2.11) the same distance, we find

$$l = h = 3/2, \qquad k = 9/8 \tag{2.15}$$

Instead we will carry the constants along, and choose
their values on the basis of the resulting formulas.
The final form of $\delta_{pn}$ comes by inserting (2.11) and
(2.12) into (2.10) to obtain

$$\delta_{pn} = K x_c - K\left\{x_{n-1}(1 - h\omega\tau\,\omega T - k\omega^2T^2) + \delta_{p,n-1}(h\omega\tau\,\omega T + k\omega^2T^2) + \omega^{-1}\dot{x}_{n-1}\left[\omega\tau + (l - 2\zeta h\omega\tau)\omega T - 2\zeta k\omega^2T^2\right]\right\} \tag{2.16}$$

We now have the three difference equations (2.9a),
(2.9b) and (2.16) for the three unknowns $x_n$, $\dot{x}_n$ and $\delta_{pn}$.
Their solution will provide the result of our model of the
hybrid calculation.

Solution by z-transform

The z-transform provides a simple method of solving
the difference equations. The definition of the z-transform
of the sequence $x_n$ is

$$x^* = \sum_{n=0}^{\infty} x_n z^{-n} \tag{2.17}$$

and for our purposes its important property is

$$\sum_{n=0}^{\infty} x_{n+1} z^{-n} = z(x^* - x_0) \tag{2.18}$$

The inversion of a z-transform follows easily by
observing from the definition (2.17) that

$$z^{k-1}x^* = \sum_{n=0}^{\infty} x_n z^{k-n-1}$$

If this is looked upon as a Laurent expansion in the
complex variable $z$ the residue is the coefficient of the
term for which $n = k$, which is $x_k$. Thus the inversion
of $x^*$ to find $x_n$ is accomplished by finding, for each $n$,

$$\text{Residue}\,(z^{n-1}x^*) = x_n \tag{2.19}$$

The stability of the solution is also indicated by
(2.19). Stability requires that $x_n$ not grow as $n$ increases.
The only factor in the residue which depends on $n$ is $z^n$,
which grows or decreases with $n$ depending on whether
the absolute value of $z$ is greater or less than unity. This
leads to the well-known stability criterion that every
root of the denominator of $x^*$ must have absolute value
equal to or less than unity.
The transformation of (2.9) and (2.16) is accomplished
by multiplying by $z^{-n}$ and $z^{-n+1}$ respectively, summing
and using (2.17) and (2.18), remembering the initial
conditions are zero. The result is

$$(z - 1)x^* - \dot{x}^*\,\frac{e^{-\omega\zeta T}}{\omega\zeta_1}\sin\omega\zeta_1 T + (x^* - \delta_p^*)\left[1 - e^{-\omega\zeta T}\left(\cos\omega\zeta_1 T + \frac{\zeta}{\zeta_1}\sin\omega\zeta_1 T\right)\right] = 0 \tag{2.20a}$$

$$\dot{x}^*\left[z - e^{-\omega\zeta T}\left(\cos\omega\zeta_1 T - \frac{\zeta}{\zeta_1}\sin\omega\zeta_1 T\right)\right] + (x^* - \delta_p^*)\,\frac{\omega}{\zeta_1}\,e^{-\omega\zeta T}\sin\omega\zeta_1 T = 0 \tag{2.20b}$$

$$(z + K)x^* + \dot{x}^*K\omega^{-1}\left[\omega\tau + (l - 2\zeta h\omega\tau)\omega T - 2\zeta k\omega^2T^2\right] - (x^* - \delta_p^*)\left[z + K(h\omega\tau\,\omega T + k\omega^2T^2)\right] = \frac{z^2 K x_c}{z - 1} \tag{2.20c}$$

These equations have been arranged so the variables
are the z-transforms $x^*$, $\dot{x}^*$, and $x^* - \delta_p^*$, and their
solution gives the z-transforms of the problem variables,
which must then be inverted to yield formulas for the
actual solution.
If the three equations are solved by determinants
the denominator is given by the determinant of the
coefficients,
$$\Delta = -z\left[(z - 1)^2 - 2z(e^{-\omega\zeta T}\cos\omega\zeta_1 T - 1) + (e^{-2\omega\zeta T} - 1)\right] + K\Big\{(z + 1)(e^{-\omega\zeta T}\cos\omega\zeta_1 T - 1) + (z - 1)\frac{\zeta}{\zeta_1}e^{-\omega\zeta T}\sin\omega\zeta_1 T - (e^{-2\omega\zeta T} - 1) - (z - 1)\Big[\Big(z - e^{-\omega\zeta T}\big(\cos\omega\zeta_1 T - \frac{\zeta}{\zeta_1}\sin\omega\zeta_1 T\big)\Big)(h\omega\tau\,\omega T + k\omega^2T^2) + \frac{e^{-\omega\zeta T}}{\zeta_1}\sin\omega\zeta_1 T\,\big(\omega\tau + (l - 2\zeta h\omega\tau)\omega T - 2\zeta k\omega^2T^2\big)\Big]\Big\} \tag{2.21}$$

This is a cubic in $z$, whose roots determine the solution
through their residues, according to (2.19).
The solution for $x^*$ is then

$$x^* = \frac{z^2 K x_c}{(z - 1)\Delta}\left[(z + 1)(e^{-\omega\zeta T}\cos\omega\zeta_1 T - 1) - (e^{-2\omega\zeta T} - 1) + (z - 1)\frac{\zeta}{\zeta_1}e^{-\omega\zeta T}\sin\omega\zeta_1 T\right] \tag{2.22}$$

An additional root at $z = 1$ is visible here, whose residue
also makes a contribution.

Expansion of roots

The nature of the roots of $\Delta$ can be seen by letting $T$
approach zero in (2.21). Then all terms approach zero
except the first, so one root must approach zero, the
other two approach unity. The exact roots are complicated
to find since (2.21) is cubic, but we can be
satisfied with expansions of the roots in powers of $\omega T$.
Let us first look for a root of the form:

$$z = 1 + d\,\omega T + e\,\omega^2T^2 + f\,\omega^3T^3 + \ldots \tag{2.23}$$

If the coefficients of (2.21) are also expanded in powers
of $\omega T$, and (2.23) is inserted, setting the lowest two
powers of $\omega T$ to zero yields

$$d^2 + (2\zeta + K\omega\tau)d + (1 + K) = 0 \tag{2.24}$$

$$e = \tfrac{1}{2}d^2 - \frac{Kd\left[d\omega\tau(h - 3/2) + (l - 3/2)\right]}{2(d + \omega_T\zeta_T/\omega)} \tag{2.25}$$

These determine the first two coefficients in (2.23). The
solution of (2.24) is

$$d = \frac{\omega_T}{\omega}(-\zeta_T \pm i\zeta_{T1}) \tag{2.26}$$

where $\omega_T$, $\zeta_T$, $\zeta_{T1}$ are defined in (2.3) and (2.4). Thus the
first coefficient is identical with the exponent of the
exact solution.
To see the significance of this, remember that the
important term in the residue is $z^n$ which can be
written $\exp(n \ln z)$. But $z$ in the form (2.23) can be
used to expand $\ln z$ to yield

$$z^n = \exp\left\{n d\,\omega T + n(e - d^2/2)\omega^2T^2 + n\left[f - d^3/6 - d(e - d^2/2)\right]\omega^3T^3 + \ldots\right\} \tag{2.27}$$

Thus the first term is part of the exact solution at
$t = nT$, and subsequent terms are error terms.
With two roots $z_1$, $z_2$ given as a complex conjugate
pair by (2.23)-(2.26), the third root is simple to find by
dividing $\Delta$ by $(z - z_1)(z - z_2)$. The expanded result is,
using (2.24) and (2.25),

$$z_3 = (1 - h)K\omega\tau\,\omega T + K\left[l - k - 1/2 + K(1 - h)\omega^2\tau^2 + \zeta\omega\tau(1 - 2h)\right]\omega^2T^2 + \ldots \tag{2.28}$$

The solution is usually stable to the roots $z_1$, $z_2$
because the real part of $d$ is negative, so the dominant
term of $z^n$ is a damping. However, it may be unstable to
$z_3$, and will be for large enough $\omega T$.
Before choosing values for the compensation parameters,
we will look at the actual solution generated by
these roots.

Solution in the physical (time) domain

The solution is the sum of the residues of $(z^{n-1}x^*)$
at the poles $z = 1, z_1, z_2, z_3$, with $x^*$ given by (2.22). The
residue at $z = 1$ is easily found by putting $z = 1$ into
$(z - 1)x^*$, which yields

$$\text{Residue}\,(z = 1) = K x_c/(1 + K) \tag{2.29}$$

which is just the constant part of the exact solution
(2.4).
Since $z_1$ and $z_2$ are complex conjugates, so are their
residues, and their sum is twice the real part of either.
If the expansion (2.23) is put into (2.22) and (2.21), the
result for $z_1$ to order $\omega T$ is found to be

$$\text{Residue}\,(z_1) = \frac{K x_c}{2(1 + K)}\left[\,\ldots\,\right] \tag{2.30}$$
Finally, the residue at $z_3$ is found similarly using (2.28):

$$\text{Residue}\,(z_3) = -\frac{K x_c}{2}(\omega T)^{n+3}\left[K\omega\tau(1 - h)\right]^{n+1} \tag{2.31}$$

Choice of compensation constants

Comparison of (2.27) and (2.29) with the exact
solution (2.4) shows that the first deviation of both the
$z^n$ factor, and the rest of the expression, depend on
$e - d^2/2$. If this term is zero, the deviation will then be
$O(\omega^2T^2)$ in both places. And (2.31) shows that the
contribution of the extraneous solution is of high order
in $\omega T$ and should decrease rapidly as long as $|z_3| < 1$.
These observations lead to the conclusion that we
should make $e - d^2/2$ vanish, which means, according
to (2.25),

$$h = l = 3/2 \tag{2.32}$$

The coefficient $k$ is not determined to this order.
However, if $e - d^2/2 = 0$ the next term in (2.27) is
found from the expansion of (2.21) to be

$$f - d^3/6 = -Kd\left\{d\left[k - 13(1 + d\omega\tau)/12\right] - 2K\omega\tau(1 + d\omega\tau)/3\right\}$$

This cannot vanish for any choice of real $k$. One can
make either its real part or its imaginary part vanish,
although $k$ will then depend on the parameters of the
problem. One obvious choice which reduces the size
of $f - d^3/6$ is

$$k = 13/12 \tag{2.33}$$

and this is the one used in the implementation. Further
study would be needed to determine if another, more
complicated, choice were better.
Notice that the values given in (2.32) are exactly
those shown in (2.15), which are obtained by projecting
to $(n + 1/2)T$, while the $k$ of (2.33) is only 1/24 smaller
than the corresponding value of $k$ in (2.15). One can
therefore look upon the analysis as providing a derivation
of the length of the projection interval, in contrast
with the usual graphical or intuitive arguments.

Results for three-step compensation

An entirely analogous solution can be obtained using
the three-step projection of (2.13). The necessary
starting values $\delta_{-1}$ and $\delta_{-2}$ are taken the same as $\delta_0$.
The determinant of the coefficients is now fifth degree,
with five roots. Two are of the form (2.23) with $d$ the
same, (2.24). The next coefficients are

$$e - \tfrac{1}{2}d^2 = \frac{Kd(1 + d\omega\tau)(a_1 + 2a_2 + 3/2)}{2(d + \omega_T\zeta_T/\omega)} \tag{2.34}$$

and, if $e - d^2/2 = 0$,

$$f - d^3/6 = -K(1 + d\omega\tau)\left[Kd\omega\tau/12 + d^2(a_2 - 22/12)\right] \tag{2.35}$$

The other three roots are power series in $(\omega T)^{1/3}$, given
in terms of

$$r = (-1 + i3^{1/2})/2, \qquad \bar{r} = (-1 - i3^{1/2})/2$$

by

$$z_{3,4,5} = (K\omega\tau a_2\,\omega T)^{1/3}(1, r, \bar{r}) + (K\omega\tau a_2\,\omega T)^{2/3}(a_1 + a_2)(1, \bar{r}, r)/3a_2 + K\omega\tau a_2\,\omega T/3a_2 + \ldots \tag{2.36}$$

The residues at $z = 1$ and $z = z_1$ are the same as for
Taylor series compensation, (2.29) and (2.30). The first
terms of the residues of the other three roots are

$$\text{Residue}\,(z_{3,4,5}) = (K x_c/6)(\omega T)^{(n+7)/3}(K\omega\tau a_2)^{(n+1)/3}(1, r, \bar{r})^{n+1} \tag{2.37}$$

To make the $O(\omega T)$ errors vanish we make
$e - d^2/2 = 0$ by taking

$$a_1 + 2a_2 = -3/2$$

which agrees with (2.14). To determine $a_1$, $a_2$ separately
one can go to (2.35) and choose $a_2 = 22/12$, which is
1/24 less than the value in (2.14). So again we come very
close to the usual projection distance by an analytical
derivation.
The error caused by the extraneous roots should not
be as small for this type of compensation, since it
depends on $(\omega T)^{n/3}$, and decreases rather slowly, as $n$
increases.
The solution is also less stable, because of the
one-third power dependence of the roots on $\omega T$. In fact,
the absolute values through the first two terms are

$$|z_{3,4,5}| = \left|(K\omega\tau a_2\,\omega T)^{1/3} + (1, -\tfrac{1}{2}, -\tfrac{1}{2})(K\omega\tau a_2\,\omega T)^{2/3}(a_1 + a_2)/3a_2\right| \tag{2.38}$$

and since $a_1 + a_2$ is negative, the conjugate pair $z_4$, $z_5$ is
the least stable. This is the pair introduced by going
back two steps in time, which shows the destabilizing
influence of that procedure.
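
The projection constants quoted in (2.14) and (2.15), and the two "1/24" comparisons made above, can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original report: it assumes the three-step scheme is ordinary parabolic (Lagrange) extrapolation through the samples at (n-1)T, (n-2)T and (n-3)T, and that the Taylor constants are the coefficients of a truncated Taylor series over the same projection distance. The function name `lagrange_weights` is illustrative only.

```python
from fractions import Fraction


def lagrange_weights(nodes, target):
    """Weights w_j with p(target) = sum_j w_j * f(nodes[j]), where p is the
    polynomial interpolating f at the given nodes."""
    weights = []
    for j, xj in enumerate(nodes):
        w = Fraction(1)
        for m, xm in enumerate(nodes):
            if m != j:
                w *= (target - xm) / (xj - xm)
        weights.append(w)
    return weights


# Sample times in units of T, measured from nT: (n-1)T, (n-2)T, (n-3)T.
nodes = [Fraction(-1), Fraction(-2), Fraction(-3)]
target = Fraction(1, 2)  # project to (n + 1/2)T

# Three-step (parabolic) constants, to be compared with (2.14).
a0, a1, a2 = lagrange_weights(nodes, target)

# Taylor projection from (n-1)T over s = 3/2 sampling intervals, cf. (2.15).
s = target - nodes[0]
l = h = s
k = s * s / 2

print(a0, a1, a2)                  # 35/8 -21/4 15/8
print(l, h, k)                     # 3/2 3/2 9/8
print(k - Fraction(13, 12))        # 1/24, gap between (2.15) and (2.33)
print(a2 - Fraction(22, 12))       # 1/24, gap between (2.14) and a2 = 22/12
print(a1 + 2 * a2)                 # -3/2, the condition that makes e - d^2/2 vanish
```

Both analytically chosen constants sit exactly 1/24 below the conventional half-interval projection values, which is the sense in which the analysis "derives" the projection distance.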
Stability considerations

As mentioned already, it is the extraneous roots
which control the stability of the hybrid calculation.
For the Taylor series compensation, this root is given
by (2.28), and is of the order $K\omega\tau\,\omega T$, the same as for
the uncompensated case, which can be obtained from
(2.28) by putting $k = h = l = 0$. In fact, the compensated
root is somewhat smaller (thus more stable)
since the coefficient of the first term is -1/2 instead
of 1. Notice that one could improve the stability,
at some cost in accuracy, by choosing $k$ so that
the coefficient of the second term in $z_3$ vanishes, although
$k$ would then depend on the parameters of the
problem instead of being constant.
In contrast, the extraneous roots for three-step
compensation are given in (2.38) and are of order
$(a_2 K\omega\tau\,\omega T)^{1/3}$, considerably larger than the uncompensated
or Taylor series cases. Therefore, the three-step
method yields a less stable solution. If $a_2 = 0$,
we then have a two-step scheme, and there are only
two extraneous roots, of order $(K\omega\tau\,\omega T)^{1/2}$, more
stable than the three-step scheme but still less stable
than the uncompensated or Taylor series cases.
If the scheme of Howe and Fogarty, discussed in the
Introduction, were used, there would also be two extraneous
roots of order $(K\omega\tau\,\omega T)^{1/2}$, so the stability
would be about the same as for a two-step scheme. In
fact, the two-step and Howe-Fogarty schemes are
closely related, both going back one step in time.
Computer implementation

The Taylor series (or derivative) method of compensation
was tested, and compared with the three-step
method, by solving the problem posed by (2.1),
(2.2) on the hybrid computer of the NASA Electronics
Research Center. This is a Beckman 2200/SDS 9300
machine with interface built by Beckman.
As described above, the integration of $\ddot{x}$ and $\dot{x}$, and
the combination of $x$ and $\dot{x}$ on the left side of (2.1), were
performed in the analog computer. The value of $\delta$ was
found in the digital computer, by sampling $x$ and $\dot{x}$ from
the analog at intervals of $T$ and extrapolating. Then
$\delta_p$ was calculated and sent back to the analog to be
used to find $\ddot{x}$. The A to D sampling was accomplished
first, followed immediately by the D to A updating.
In order to compare the resulting hybrid solution with
a continuous solution, the complete equation was also
solved in the analog simultaneously as an oscillator
with frequency $\omega_T$ and damping $\zeta_T$ as defined by (2.3).
The details of the analog circuit, the digital programs,
the control circuit, the scaling, etc., are given in Ref.
7, pages 130-141 and Appendix E.
The output of this calculation was a set of strip charts
and digital printouts giving the hybrid and pure
analog values of $x$, $\dot{x}$, $\ddot{x}$, $\delta$, and the difference between
the hybrid and analog values, which may be taken as a
measure of the error of the hybrid solution.
Runs were made for the parameters

$$\omega = 0.412, \qquad \zeta = -0.2425, \qquad \zeta_T = 0.7$$

using the conventional compensation constants

$$a_1 = -21/4, \qquad a_2 = 15/8, \qquad a_0 = 1 - a_1 - a_2 = 35/8$$

for the three-step method, and the set

$$l = h = 3/2, \qquad k = 13/12$$

which we have derived for the Taylor series method.
The values of $\omega_T$ were varied between 0.5 and 15.0. For
each such value, the control parameters $K$ and $\tau$ can
be calculated from (2.3). Runs were made at several
sample intervals $T$ in order to study the stability of
the hybrid calculation. For large enough $T$ it was always
possible to make it unstable.
The relative merits of the Taylor series and three-step
compensation schemes, compared to pure analog
and uncompensated hybrid results, are strikingly
illustrated by excerpts from the strip charts drawn
by the analog computer. The case chosen for illustration
is $\omega_T = 15$, for which (2.3) gives $K \approx 1234$, $\tau =$
0.0942.
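
The step from the run parameters to the control parameters can be sketched numerically. This is an illustration under a stated assumption, since the scanned text of (2.3) is not legible: it supposes that substituting (2.2) into a forced oscillator of the form x'' + 2ζωx' + ω²x = ω²δ gives ω_T² = ω²(1 + K) and 2ζ_Tω_T = 2ζω + Kω²τ. The function name `control_parameters` is hypothetical.

```python
def control_parameters(omega, zeta, omega_t, zeta_t):
    """Solve the assumed form of (2.3) for the control parameters K and tau.

    Assumed relations (not legible in the source scan):
        omega_t**2           = omega**2 * (1 + K)
        2 * zeta_t * omega_t = 2 * zeta * omega + K * omega**2 * tau
    """
    K = (omega_t / omega) ** 2 - 1.0
    tau = (2.0 * zeta_t * omega_t - 2.0 * zeta * omega) / (K * omega ** 2)
    return K, tau


# Run parameters quoted in the text for the illustrated case.
K, tau = control_parameters(omega=0.412, zeta=-0.2425, omega_t=15.0, zeta_t=0.7)
print(round(K, 1), round(tau, 4))
```

Under this assumed form the sketch reproduces the quoted τ = 0.0942 to the printed precision, while K comes out near 1324.5 rather than the printed 1234. The printed K may therefore contain a transposed digit, although the scan alone cannot settle that.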
Figure 1 shows the strip chart record for $x(t)$ for
four cases. At the top is the continuous solution produced
by a pure analog calculation. Below follow the
records for the uncompensated, Taylor series compensated,
and three-step compensated hybrid solutions,
all for a sample interval $T = 25$ milliseconds, which is
17 samples per cycle based on total frequency. In order
to bring out the errors more clearly, Figure 2 shows the
difference signal $x_H - x_A$ on a larger scale, where the
subscripts H and A stand for hybrid and analog, respectively.
The great improvement in accuracy achieved
by going from no compensation to Taylor series
compensation is apparent. On the other hand, the
solution with three-step compensation is unstable and
saturates the amplifiers.

Figure 1-Strip chart records of x(t) for ω = 0.412,
ζ = -0.2425, ζ_T = 0.7, ω_T = 15. The sample
interval T = 25 ms. (Panels: analog; uncompensated
hybrid; 3-step compensated hybrid.)
The stability properties of these three cases are
predicted by the formulas we have developed. For no
compensation ($l = h = k = 0$), (2.28) gives $|z_3| =$
0.736, while for Taylor series compensation the same
formula shows $|z_3| = 0.411$. On the other hand, for
three-step compensation, (2.38) gives $|z_3| = 0.578$,
$|z_4, z_5| = 1.375$. Therefore, the part of the solution
corresponding to the root $z_3$ is stable, but the part
corresponding to the roots $z_4$, $z_5$ is unstable, leading
to an unstable solution, as shown in Figures 1 and 2.
To stabilize the three-step case, the sample interval
$T$ would have to be reduced to 10 ms, or about 42
samples per cycle, for which (2.38) shows $|z_4| = 0.928$.
A case run at this value of $T$ indeed showed three-step
compensation to yield a stable solution.
To destabilize the uncompensated and Taylor series
cases, a run was made at $T = 50$ ms (8.4 samples per
cycle), for which (2.28) gives $|z_3| = 1.89$ and 1.12,
respectively. The results of the run are shown in Figure
3, where the rapid increase of $x$ until the amplifiers
saturate is seen for both cases.
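
The quoted samples-per-cycle figures and the extraneous-root magnitudes for the uncompensated and Taylor series cases can be reproduced with a short calculation. The sketch below is illustrative and rests on two assumptions that are not legible in the scan: the form of (2.3) used to obtain K and τ, and a two-term expansion of the extraneous root of the form z3 = (1 - h)Kωτ·ωT + K[l - k - 1/2 + K(1 - h)ω²τ² + ζωτ(1 - 2h)]ω²T². All function names are hypothetical.

```python
import math

OMEGA, ZETA = 0.412, -0.2425        # analog oscillator parameters from the text
OMEGA_T, ZETA_T = 15.0, 0.7         # total frequency and damping from the text

# Control parameters from the assumed form of (2.3).
K = (OMEGA_T / OMEGA) ** 2 - 1.0
TAU = (2.0 * ZETA_T * OMEGA_T - 2.0 * ZETA * OMEGA) / (K * OMEGA ** 2)


def samples_per_cycle(T):
    """Samples per cycle of the total frequency OMEGA_T at sample interval T (s)."""
    return 2.0 * math.pi / (OMEGA_T * T)


def z3_two_terms(T, l, h, k):
    """Extraneous root through second order in omega*T (assumed form of (2.28))."""
    a = OMEGA * TAU                  # dimensionless omega*tau
    eps = OMEGA * T                  # dimensionless omega*T
    first = (1.0 - h) * K * a * eps
    second = K * (l - k - 0.5 + K * (1.0 - h) * a * a
                  + ZETA * a * (1.0 - 2.0 * h)) * eps ** 2
    return first + second


UNCOMPENSATED = dict(l=0.0, h=0.0, k=0.0)
TAYLOR = dict(l=1.5, h=1.5, k=13.0 / 12.0)

for T in (0.010, 0.025, 0.050):
    print(f"T = {T * 1e3:4.0f} ms: {samples_per_cycle(T):5.1f} samples/cycle, "
          f"|z3| uncompensated = {abs(z3_two_terms(T, **UNCOMPENSATED)):.3f}, "
          f"Taylor = {abs(z3_two_terms(T, **TAYLOR)):.3f}")
```

Under these assumptions the magnitudes at T = 25 ms and T = 50 ms land within about 0.01 of the values quoted in the text (0.736, 0.411, 1.89, 1.12), and the samples-per-cycle figures round to 42, 17 and 8.4. The three-step values from (2.38) are not reproduced here.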
Similar results hold for other values of $\omega_T$. In all cases,
stability or instability exhibited by the numerie used.
Accuracy considerations

HYPAC is both a digital and analog system;
therefore, all factors that produce errors in digital
and analog differential analyzers will also produce
error in this system. These factors are many and are
extensively covered in the literature.2,3,4 They include
finite sampling, round off and quantization in the
digital system and limited bandwidth, noise (limited
dynamic range), accuracy and linearity of components,
etc., in the analog system.
The accuracy consideration which is unique to this
program is connected to the way the initial conditions
are set up in the hybrid element.

1. The initial condition of any particular run has
come through an Analog to Digital to Analog
conversion string and was therefore truncated
by the A/D converter.
2. Switching transients are generated whenever
the integrators are switched from one operation
mode to another. These transients are caused
by charges stored in the parasitic capacities
of the switches, and affect the outputs of the
integrators. The magnitude and polarity of
these small voltage increments caused by switching
transients can be regarded as random.6

The error contributions of these two factors are
minute for each integration period ΔT (a few millivolts)
but since a solution usually consists of a few
hundred integration periods, the propagation of these
errors can be very significant. It is the author's feeling
that the errors will not build up if the system itself
and the hybrid element in particular is stable. Although
our experiments have confirmed this, the above statement
is rather intuitive and needs further investigation.

** If the circuit is nonlinear, the eigenvalues should be evaluated
at the worst possible combination of parameters and biases.

Programming of HYPAC

In order to demonstrate how a problem is prepared
for simulation, let us consider a simple example.
The circuit in Figure 6a is a set-reset flip-flop composed
of two NOR gates; a positive voltage V1
resets the flip-flop and V2 is the set input. The circuit
diagram of the NOR gate used is shown in Figure 6b.
The complete circuit is depicted in Figure 7. Let
us assume that it was decided to use the outlined
subcircuits as the hybrid element in this example.
Since this element is not buffered from its input and
output circuits, some special approach is needed in
order to extract this subcircuit from the whole circuit.
The approach we used is called the partition method
and is described in the Appendix.

Figure 6a-Set-reset flip-flop
Figure 6b-NOR gate
Figure 7-Set-reset flip-flop

Using this method, the circuit can be partitioned
into blocks as shown in Figure 8.
This circuit is identical to the one in Figure 7 if
the relations

$$I_{B1} = i_{B1}; \quad V_{B1} = -v_{B1}; \quad I_{C1} = i_{C1}; \quad V_{C1} = -v_{C1}$$
$$I_{B2} = i_{B2}; \quad V_{B2} = -v_{B2}; \quad I_{C2} = i_{C2}; \quad V_{C2} = -v_{C2}$$

are satisfied.
The programming of HYPAC is reduced to writing
the nodal or loop equations of the circuit. The HYPAC
block diagram is shown in Figure 9. The blocks H1 and
H2 are hybrid elements, blocks >1 and >2 are associated
with the second outputs of the hybrid element, all
other blocks are conventional PACTOLUS elements.
The hybrid element can be patched on the analog
computer as shown in Figure 10. The transistor model
used in this simulation is the analog separation model11
based on the charge control equations. The inputs IB
and IC and the initial conditions are applied through
D/A converters. The outputs of the subcircuit and the
outputs of each integrator are read out through A/D
converters. The outputs of the integrators are stored
to be used as the next initial conditions.
While using this method of programming, one
should be careful not to introduce hybrid algebraic
loops, which, of course, can be highly unstable. Such
loops can be easily spotted by inspection and can usually
be eliminated by placing them wholly in the hybrid
element.

Figure 8-Set-reset flip-flop
Figure 9-HYPAC block diagram for set-reset flip-flop
Figure 10-Analog simulation of a NOR gate
Figure 11-Trigger flip-flop
Figure 12-NOR gate
Experimental results
In order to demonstrate the effectiveness of
Figure 4-Command and data table structure (fields shown:
current data word count, current data table pointer,
cycle data word count, cycle data table pointer)
Time Shared I/O Processor
the Hybrid Processor is in general running several
hybrid processes. When a process requests service, it
is quite likely that another request may be in progress
or that a higher priority request may be granted first.
In order to prevent conflicts from introducing cumulative
timing skew, the individual process clocks are
designed to count through 0 to negative values, and
the Δt field is actually added to the contents of the
process clock when the command word is fetched. As
long as this addition yields a positive
quantity, the process will not be subject to cumulative
skew and will be accurately timed on the average. We
do a bit better than this, however, by taking advantage
of the fact that the clock tick frequency is very accurately
crystal-controlled and that the clock = 0
pulse is a very precisely timed event at exactly the
Δt intervals. This pulse is amplified and available to
users for patching. It can be used to initiate a sample
and hold gate, for example, or to cause the transfer
between buffers in a double buffered D/A converter.
This means extremely precise timing control with a
resolution of 10 μseconds, and crystal accuracy is
achieved.
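
The effect of adding Δt to a clock that has already counted below zero, rather than reloading it, can be illustrated with a small simulation. This is a sketch only: times are in clock ticks, the service delays are made-up values, and the function `clock_zero_times` and its `add_to_clock` flag are hypothetical names introduced for the comparison.

```python
def clock_zero_times(dt_ticks, service_delays, add_to_clock=True):
    """Simulate one process clock; return (zero_times, fetch_times) in ticks.

    dt_ticks       -- the delta-t field of each command word, in clock ticks
    service_delays -- ticks each request waits after clock = 0 before its
                      command word is fetched (contention with other processes)
    add_to_clock   -- True: add delta-t to the (negative) clock, as described
                      in the text; False: naive reload, shown for contrast
    """
    now = 0
    clock = dt_ticks
    zero_times, fetch_times = [], []
    for delay in service_delays:
        now += clock                 # count down to zero
        zero_times.append(now)
        clock = -delay               # clock keeps counting through 0 while waiting
        now += delay
        fetch_times.append(now)
        clock = clock + dt_ticks if add_to_clock else dt_ticks
        if clock <= 0:
            raise ValueError("service delay exceeded delta-t; timing is lost")
    return zero_times, fetch_times


delays = [3, 0, 7, 2]
print(clock_zero_times(100, delays))          # ([100, 200, 300, 400], [103, 200, 307, 402])
print(clock_zero_times(100, delays, False))   # ([100, 203, 303, 410], [103, 203, 310, 412])
```

With the additive scheme the clock = 0 events stay on exact multiples of Δt even though individual fetches are delayed, whereas the naive reload lets every delay accumulate.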
The remainder of the control bits in a command
word are used to stop (H) or restart (R) a process. An
external signal control bit (E) enables a temporary
stop to wait for a signal external to the computer and
the Hybrid Processor to restart the process.
Connection of hybrid processor to SDS-940

The Hybrid Processor is attached to the SDS-940
via a Data Multiplexor Channel (DMC)& with a modified
Data Sub-Channel II (DSC II)1, as shown in
Figure 5. The status of a block transfer for a DMC
sub-channel is normally contained in one of two "internal
interlace words" which are located in fixed
adjacent positions in core memory. These interlace
words contain the remaining word count in the leftmost
bits and current location of the block transfer in
the rightmost bits. The economy of using core memory
words instead of flip-flop registers is quite important,
and this economy is retained by the Hybrid Processor.
However, the DSC II has been modified so that the
locations of these interlace words are no longer fixed
but are uniquely determined by the particular process
selected for service. That is, the first process (Process
A) uses words n,* n+1 as interlace words; the second
process (Process B) uses words n+4, n+5, etc. Also
during the command fetch, the even interlace word is
used as the current command pointer word, while
during the data fetch or store, the odd interlace word
is used as the current data pointer word. Note that, as
before, the state of the block transfers is completely
contained in the memory interlace words.

* where n is a memory address which is 0 MOD 4

Figure 5-Connection of hybrid processor to SDS-940
(blocks shown: access ports 1 and 2; four 16K core
memories; other system data channels; POT/PIN I/O
control lines; A/D converters; D/A converters; D/D
inputs; D/D outputs)
Each hybrid process can switch command and data
tables by using the "cycle" interlace words. Every
process actually has four interlace words. Words n and
n+1 are the current command and data table interlace
words respectively. Words n+2 and n+3 are the cycle
command and data table interlace words. A process
can cause the contents of its cycle interlace words to
be moved into the current interlace words. This effectively
switches the command and/or data tables to
new core areas. This switch is accomplished without
CPU intervention, but the CPU must establish new
cycle words soon after cycling occurs if the next cycle
operation is expected to switch to yet another core
area. In fact, a safety interlock will abort a process
and signal an error interrupt if a cycle attempt is made
before the previous cycle operation was properly responded
to by the CPU. More details of the BBN
Hybrid Processor implementation on the SDS-940 are
available in another document.1
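
The interlace-word layout and the cycle operation described above can be modeled in a few lines. This is a sketch for illustration, not the hardware logic: memory is a plain dictionary, the bit layout inside each interlace word is ignored, and the names `interlace_addresses`, `cycle` and `cpu_refresh` are hypothetical. The interlock is modeled as a simple flag that the CPU must clear by supplying new cycle words.

```python
def interlace_addresses(n, process_index):
    """Addresses of the four interlace words of a process.

    n is the base address (must be 0 MOD 4); process_index is 0 for
    Process A, 1 for Process B, and so on, four words per process.
    """
    if n % 4 != 0:
        raise ValueError("base address n must be 0 MOD 4")
    start = n + 4 * process_index
    return {
        "current_command": start,      # even word: command pointer
        "current_data": start + 1,     # odd word: data pointer
        "cycle_command": start + 2,
        "cycle_data": start + 3,
    }


def cycle(memory, addrs, pending):
    """Move the cycle words into the current words, without CPU help.

    `pending` is a one-element list holding the interlock flag: True means
    the CPU has not yet answered the previous cycle, so this one aborts.
    """
    if pending[0]:
        raise RuntimeError("cycle attempted before CPU responded: process aborted")
    memory[addrs["current_command"]] = memory[addrs["cycle_command"]]
    memory[addrs["current_data"]] = memory[addrs["cycle_data"]]
    pending[0] = True


def cpu_refresh(memory, addrs, pending, new_command, new_data):
    """CPU establishes new cycle words and clears the interlock."""
    memory[addrs["cycle_command"]] = new_command
    memory[addrs["cycle_data"]] = new_data
    pending[0] = False


a = interlace_addresses(0o100, 0)
b = interlace_addresses(0o100, 1)
print(a["current_command"] - 0o100, a["current_data"] - 0o100)   # 0 1
print(b["current_command"] - 0o100, b["current_data"] - 0o100)   # 4 5
```

Keeping the whole transfer state in four memory words per process is what lets the hardware switch tables with no register file of its own; the flag stands in for the safety interlock mentioned in the text.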
Suppose the maximum time to service a hybrid I/O
request were α. This time would be measured from the
start of processing of a request by the Hybrid Processor
to the completion of this request. Then a conservative
estimate of the bandwidth of the Hybrid
Processor would be 1/α.
A simple scheduling technique would involve selling
fractions of this bandwidth to users. Each user would
buy 1/α_i of the bandwidth of the Hybrid Processor,
such that

$$\sum_{\text{all } i} \frac{1}{\alpha_i} \le \frac{1}{\alpha}$$
It would now be necessary for the Hybrid Processor
(or whatever controls the Hybrid Processor) to insist
that user i never perform hybrid I/O faster than α_i
seconds between hybrid interactions. This simple check
could be performed by either hardware or software.
This technique would guarantee that the Hybrid
Processor would never be over-committed.
On the SDS-940 implementation of the Hybrid
Processor, the maximum α is approximately 20 μsec.*
This means the Processor has a guaranteed bandwidth
of 50 KHz.
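The bandwidth-selling rule reduces to two small checks, sketched here under the assumption that all intervals are given in seconds. The function names are hypothetical and are not taken from the monitor software.

```python
def bandwidth_hz(alpha_s):
    """Conservative bandwidth estimate 1/alpha for a worst-case service time."""
    return 1.0 / alpha_s


def can_admit(alpha_s, user_intervals_s):
    """True if the users' purchased fractions satisfy sum(1/alpha_i) <= 1/alpha."""
    return sum(1.0 / a for a in user_intervals_s) <= 1.0 / alpha_s


def too_fast(last_time_s, now_s, alpha_i_s):
    """True if a user's new request arrives sooner than alpha_i after the last one."""
    return (now_s - last_time_s) < alpha_i_s
```

For the 20 μsec worst case quoted above, bandwidth_hz(20e-6) is about 50 000 interactions per second. Users asking for one interaction every 100 μsec and one every 50 μsec together consume 30 000 of those, so they fit; three users at 40 μsec each would not.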
Simultaneous requests for service are resolved by a
simple priority network which selects the highest
priority process currently requesting service to run
(Process A is highest priority; B, C, D are in order of
decreasing priority). This means the higher priority
processes see the least instantaneous skew, but none will
see any cumulative skew if the bandwidth scheduling
rules are adhered to, and none will see instantaneous
skew if the clock = 0 pulse and sample/hold gates are
used correctly.
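In software terms the priority network is a fixed-order scan. The following is a minimal sketch; the function name and the use of a set of requesting process letters are assumptions for illustration.

```python
PRIORITY_ORDER = ("A", "B", "C", "D")  # Process A is highest priority


def select_process(requesting):
    """Return the highest-priority process among those requesting service."""
    for name in PRIORITY_ORDER:
        if name in requesting:
            return name
    return None  # nothing is requesting service
```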
Device and timing protection
The Hybrid Processor commands reference all
hybrid devices in an unrestricted manner. If the user
is given direct access to these commands, he could
detrimentally affect another user's experiment by
changing a value on another user's D/A converter,
for example. Also, one user could easily lock out Hybrid
Processor service from another user if he had a higher
priority process and usurped all of the Hybrid Processor's
capacity. It is therefore not feasible for the
user to construct his own command tables. Instead,
the time-sharing monitor constructs these tables for
the user and keeps them in monitor core. The monitor
makes certain that a user does not access another
user's devices or usurp all of the Hybrid Processor's
time. The data tables are, however, kept in the user's
own address space.

* For D/A conversion; approximately 30 μsec for A/D conversions.
Hybrid processor software
Some very elaborate software (with about 2K words
of machine language code) exists in our time sharing
monitor on the SDS-940 for controlling the Hybrid
Processor. This software locks and unlocks pages of
data tables into core; sets up the transfers of data table
pages from core to drum and vice versa; worries about
anticipating pages before they're needed and getting
the drum requests on a high priority drum queue; and
provides a convenient handle on the hybrid processor
for users.
The user interface to the hybrid processor is provided
by some SYSPOP's,5 which permit the user to
assign and deassign hybrid devices; assign and deassign
hybrid processes; define a sequence of command and
data tables to be executed a specified number of times;
specify a prototype of the command table which gets
set up in the monitor's address space; specify the
boundaries of data tables in the user's address space;
start and stop processes; and interrogate the status
of assigned processes.
Real-time CPU usage
With a Hybrid Processor I/O system, user programs
or user I/O need not be periodic. The I/O can be
precisely timed using the Hybrid Processor independent
of CPU activity. Therefore, it no longer is necessary
to start CPU computation at exact times.
Suppose that for each process the following parameters
were specified:6,7
1. T The period of the process (exact period if
synchronous, minimum period if asynchronous).
2. P The maximum amount of CPU time the
process may require each period.
3. D The maximum tolerable delay between the
moment the process requests service and the
time when all servicing has been completed
(most synchronous processes would allow service
to be completed any time during the period,
i.e., D=T).
4. Whether the process is synchronous or asynchronous.
Using this characterization, the demand of the
process upon the system might be phrased as follows:
"When my process requests service it must be granted
P seconds of CPU time within D seconds of when the
request is made. My process will never request service
more often than T seconds after the previous request."
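The demand quoted above can be written down as a small record and exercised in a toy simulation. The sketch below is an illustration under stated assumptions, not the monitor's scheduler: time is measured in integer ticks, all processes are synchronous and request service at 0, T, 2T, ..., switching time is zero, and the dispatcher (assumed here for the example) always runs the pending request whose deadline, request time plus D, is nearest. The names Contract, simulate and all_met are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    name: str
    T: int   # period in ticks
    P: int   # maximum CPU time needed per period, in ticks
    D: int   # maximum delay from request to completed service, in ticks
    synchronous: bool = True


def simulate(contracts, horizon):
    """Run one CPU for `horizon` ticks and report completed requests.

    Returns a list of (name, request_time, finish_time, deadline) tuples.
    """
    pending = []  # each entry: [deadline, name, request_time, remaining]
    done = []
    for now in range(horizon):
        for c in contracts:
            if now % c.T == 0:                 # a new request arrives
                pending.append([now + c.D, c.name, now, c.P])
        if pending:
            job = min(pending)                 # nearest deadline first
            job[3] -= 1                        # grant one tick of CPU time
            if job[3] == 0:
                pending.remove(job)
                done.append((job[1], job[2], now + 1, job[0]))
    return done


def all_met(done):
    """True if every completed request finished by its deadline."""
    return all(finish <= deadline for _, _, finish, deadline in done)
```

Two processes with (T, P, D) of (4, 1, 4) and (6, 3, 6) ask for 75 percent of the CPU and are all served on time over 24 ticks. A pair asking for 150 percent, such as (2, 2, 2) and (2, 1, 2), misses a deadline almost immediately.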
The parameters P, D, T, and the specification of
whether or not the process is synchronous enable the
system to decide whether the demands of this process
(and all others) can be successfully met. The system
cannot, of course, guarantee service to a set of real-time
processes with arbitrary P's, D's, and T's. In
fact, two restrictions are obvious:

[Restriction (1) is not legible in the source.]    (1)

and

$$\sum_{\text{all } i \text{ processes}} \frac{P_i}{T_i} \le 1 \qquad (2)$$

If the sum in (2) were greater than unity, it would
be possible for the real-time processes to require more
than 100 percent of the available CPU time.
The scheduling algorithm used to select which process
runs at any time is intimately related to the
guarantees which the system can make to a set of
users. It would be desirable to find a scheduling algorithm
which would allow

$$\sum_{\text{all } i \text{ processes}} \frac{P_i}{T_i}$$

to be close to 1 and would minimize the amount of switching between
processes to reduce overhead. It can be shown that if
switching time is negligible, no algorithm can do a
better job of scheduling for synchronous or asynchronous
processes than the following:

Run the process which must be completed soonest

That is, whenever a process requests service, the
system computes the time when the process must
complete service (τ_i), which is equal to the current
time plus D_i. The system then decides to run the process
with the minimum τ_i. Whenever servicing is completed
or aborted (for trying to use more than P_i
CPU time) the system runs next the process with
minimum τ_i. This algorithm and the necessary and
sufficient conditions under which the system can undertake
to run a set of processes are discussed in detail
by Fiala.6
Costs
The Hybrid Processor is not an inexpensive device.
Approximately $20K of digital hardware components
are necessary for a Hybrid Processor, not including
any of the analog or hybrid equipment. The labor
involved in designing and implementing the hardware
and software is approximately 1½ man-years. We
believe this cost is justified by the utility of the processor.
Future work
Future hybrid processor revisions
Several changes will be made in our new hybrid I/O
system for our next research computer (a DEC PDP-10).8
These changes will increase the total available
bandwidth, improve the command/data flow control
so that even less CPU capacity will be required to
direct the Hybrid Processor, make several improvements
to the clock system, etc.
New clock system
A 36-bit time of day clock will be implemented
which counts at 100 KHz. It will be possible to read
this time via an I/O input command over the PDP-10
I/O bus. This 36-bit count will recycle approximately
every eight days.
At least two 36-bit "alarm" registers will be used
in conjunction with the clock. These registers will be
compared with the values in the clock after each
"tick" settles down. If a match on any register is
found, the following events will occur:
A. A CPU interrupt request will be generated so
that a new 36-bit value may be placed in the
alarm register and any CPU action which was
to be initiated at this time will be triggered
(such as the scheduling of a new process to
run).
B. Each alarm register will have eight enable bits
whose set output will be gated with the alarm
pulse and this gated result will be buffered and
available for patching to trigger external devices.
The control of the alarm registers will require the
use of a PDP-10 I/O output instruction to the selected
alarm register to set any combination of the eight
enable bits followed by a PDP-10 I/O output instruction to the selected alarm register to set up the
36 bits of the register itself.
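The eight-day figure follows directly from the word length and the count rate, and the alarm comparison is easy to model. The sketch below is only an illustration of the planned behavior as described above; the class and function names are assumptions, and the real design is hardware, not software.

```python
CLOCK_BITS = 36
TICK_HZ = 100_000
CLOCK_MASK = (1 << CLOCK_BITS) - 1


def rollover_days():
    """Days until a 36-bit counter running at 100 KHz recycles."""
    return (1 << CLOCK_BITS) / TICK_HZ / 86_400


class AlarmRegister:
    def __init__(self):
        self.enable = 0   # eight enable bits gating the external trigger lines
        self.value = 0    # 36-bit value compared with the clock

    def set_enables(self, bits):
        self.enable = bits & 0xFF

    def set_value(self, value):
        self.value = value & CLOCK_MASK


def tick(clock, alarms):
    """Advance the clock by one count and compare it with every alarm register.

    Returns (new_clock, fired), where fired lists (alarm_index, enable_bits)
    for each register that matches; a match would also interrupt the CPU.
    """
    clock = (clock + 1) & CLOCK_MASK
    fired = [(i, a.enable) for i, a in enumerate(alarms) if a.value == clock]
    return clock, fired
```

2^36 counts at 100 KHz last about 687 195 seconds, or roughly 7.95 days, which agrees with the "approximately every eight days" stated above.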
Hybrid I/O
The hybrid I/O capability will be quite similar to
the capability of our current SDS-940 Hybrid Processor.
The channel will operate on command and data
tables with each command paired with a corresponding
word in the data table.
The command format will also be similar to the
current Hybrid Processor on our SDS-940. However,
the flow through these command and data tables will
be directed by two new tables per process called the
command and data flow tables. These will replace the
"cycling" operations by "driving" the Hybrid Processor
through command and data tables a specified
number of times. The "cycling" operation had the
disadvantage of putting a large burden on the CPU
for processes which cycle often (which proved to be
true for many processes).
We will also implement the command and data
table pointer words in hardware to increase Hybrid
Processor bandwidth.
CONCLUSIONS
The use of a Hybrid Processor permits many real-time
experiments which were not possible in the past,
and are not possible on other real-time computer
systems. We are able to handle high speed as well as
asynchronous hybrid interactions. Most of this is
made possible by the separation of the real-time I/O
functions from the computation function. The real-time
I/O functions are performed by a processor
especially designed to handle real-time I/O, and the
computations are performed by a general purpose processor.
ACKNOWLEDGMENT
This work was supported by the Advanced Research
Projects Agency of the Department of Defense
(F19628-68-C-0125).
REFERENCES
1 T R STROLLO R S TOMLINSON E R FIALA
I J ELKIND
The hybrid processor
AFCRL-67-0485 BBN Rpt No 1686
2 R BELLUARDO R E GOCHT G A PAQUETTE
The hybrid computation facility at United Aircraft
Corporation Laboratories
Proc DECUS 1963 Maynard Mass 261-269 1964
3 M CONNELLY
Preliminary design of a time-shared, real-time, simulation
facility
Memo No 1 MIT ESL-DSR 76259 Dec 19 1966
4 M CONNELLY
Preliminary design of a time-shared, real-time, simulation
facility
Memo No 2 MIT ESL-DSR 76259 Jan 30 1968
5 SDS-940 Reference Manual 900640A
Scientific Data Systems Aug 1966
6 E R FIALA
Scheduling of real-time processes in a time-shared environment
MIT Masters Thesis 1968
7 M S FINEBERG O SERLIN
Multiprogramming for hybrid computation
Proc FJCC 1967
8 DEC PDP-10 System Reference Manual
HGAA-D June 1968
On-line software checkout facility for
special purpose computers*
by J. S. HUGHES
IBM Corporation
Huntsville, Alabama
and
T. H. WITZEL
IBM Corporation
Gaithersburg, Maryland

* This work was performed under contract with NASA's Marshall
Space Flight Center.

INTRODUCTION
An on-line software checkout facility for special
purpose computers (referred to as the Flight Software
Development Laboratory) has been created to aid
programmer/engineers in the development of programs
that will operate in a spaceborne computer aboard the
Apollo/Saturn IB and V Launch Vehicles. The Flight
Computer operates as an integral part of various
vehicle subsystems in the Instrument Unit (IU). The
subsystems provide onboard navigation, guidance, control,
sequencing, data compression, and ground communications.
These functions are illustrated in Figure
1. Continued emphasis is placed on error-free flight
software, since it is an essential element in overall
vehicle performance. No opportunity exists to test or
exercise the flight program in its actual flight environment
prior to a mission. Therefore, to ensure the
integrity of the flight program, simulators are used to
accomplish flight testing. The purpose of this paper is
to present the organization of one such simulator that
has been created for the sole purpose of the development
and checkout of Saturn flight software. The
emphasis throughout the design and implementation
of the Laboratory has been that it must be user-oriented
for program checkout. Before the existence of
the Laboratory, available facilities for checking out
flight programs were oriented to hardware checkout.
Although such facilities can be, and have been, rigged
for program checkout, they have not provided the type
of assistance required to produce the quality of software
demanded by spaceborne computers. The Laboratory
is believed to be unique in the capabilities it
provides to the programmer/engineer in controlling and
affecting the operation of the Flight Computer in a
real-time environment.

[Block labels legible in the source at this point: Typewriter; Display Unit, 8K buffer; System/360 Model 44, 262,144 bytes]
Figure 1 - Real and simulated flight equipment

Flight software development begins with a set of
explicit engineering requirements: equation and logic
definition, range of variables, and expected performance
data. After an intensive analysis of the requirements,
the flight software is designed and organized to meet
these engineering requirements with minimal flight computer
memory and reasonable flexibility. After the
flight program has been flowed, scaled (fixed point
computer), coded, assembled, and checked out by the
program unit or module, the flight phases are integrated
and checked out. This process continues until the entire
flight software has been integrated. The procedure
described above requires that the programmer/engineer
be able to measure and evaluate his progress in an
efficient manner. The purpose of this laboratory facility
is to provide the programmer/engineer with a user-oriented
tool by which he is able to test and evaluate
his programs in a simulated flight environment, using
an actual spaceborne computer and interface hardware.
This enables him to measure and evaluate flight software
performance against the engineering requirements
for the many vehicles and environmental variations.
The Laboratory user must produce quality software
in the shortest possible time frame. The key objective
in designing the Laboratory was to provide accurate
simulation models in the form of user-oriented tools.
Thus, the Laboratory user can swiftly determine the
progress and results of his work through real-time man-computer
interaction. The computer offers data,
counsel, and guidance to the man, who in return supplies
certain indispensable knowledge of the overall
system. Systems reliability and effective communications
between the Laboratory and user play a major
role in establishing user confidence. Operating experience
in the Laboratory has clearly demonstrated that
these objectives have been satisfied.
Hardware configuration
The Laboratory has as its main hardware components
an IBM System/360 Model 44, linked through a special
purpose interface to a Saturn Launch Vehicle Digital
Computer and Launch Vehicle Data Adapter. An
IBM 2250 Display Unit is employed as an integral
part of the Laboratory, providing two-way man-computer
communications. Figure 2 illustrates the
organization of the hardware components and in general
indicates the basic paths of information flow.
Figure 2 - Flight software development laboratory: block diagram
One high speed multiplexer channel has been dedicated
to the flight hardware interface. Each of the
subchannels is likewise dedicated, as shown in Figure
2. The dedicated channel and subchannels minimize
interference from other I/O activities and enable the
creation of a special low overhead channel scheduler.
These features incorporated with the 32-level priority
interrupt scheme make the Model 44 highly responsive
to the real-time interface requirements. The other high
speed multiplexer channel is dedicated to disks that
support real-time data collection and permit fast
access for the display system.
In this particular application, six of the 32 levels of
priority interrupt are used by external hardwired equipment.
The others are used by internally generated
software functions for scheduling time-dependent software functions.
The Launch Vehicle Digital Computer and Launch
Vehicle Data Adapter are the two flight components
that have been integrated into the Laboratory.
The Flight Computer is a general purpose computer
which, under control of a stored program, processes
data serially, using fixed-point 2's complement
arithmetic.
The Launch Vehicle Data Adapter serves as an
input/output device for the Flight Computer and the
central station for the signal flow in the Saturn Astrionics
System, which is illustrated in Figure 1. The Data
Adapter accepts discrete input signals from the stage
switch selectors, Instrument Unit command receiver,
ground launch computer, telemetry computer interface
unit, telemetry data multiplexer, control distributor,
and other vehicle equipment. It has output
registers to provide discrete output signals to the
above-mentioned equipment. It also accepts and processes
computer interrupt signals from the ground
launch computer and Instrument Unit equipment.
The interface unit provides all the normal ground
and flight communications paths between the flight
hardware and the central processor. However, this
interface was designed to go beyond these requirements.
The interface is unique in that it was designed to place
emphasis on (1) minimizing the central processor interface traffic and (2) maximizing user visibility by giving
the user the control of internal flight hardware operations and the access to information internal to the
Flight Computer. Also, the unit was designed for ease
of maintainability. Specifically, three major capabilities have been incorporated into the interface unit.
First, the interface unit has been designed so that it
can control the internal operation and timing of the
Flight Computer and Data Adapter. Secondly, the
interface contains special hardware, oriented toward
supporting flight program debug as opposed to program
verification, which is an independent program audit
function performed using the debugged programs.
Finally, the interface unit has been designed so that
extensive automatic diagnostics can be run from the
central processor to isolate suspected interface failures.
The IBM 2250 Display Unit is organized around a
cathode ray tube on which computer-programmed
graphic and alphameric information is displayed at
high speeds. This provides visual communication between the computer and the user. In addition, keyboards and a light pen provide the user with a versatile
means of entering and modifying computer information.
With the display system, the user has direct and rapid
access to stored data which can be selected, processed,
modified, and displayed in alphameric and graphic
representation. For example, the user can display and
modify memory in both the Model 44 and the Flight
Computer through the display unit.
The display unit was configured to minimize central
processor time and core requirements on the Model 44.
A primary feature of the display unit is a buffer storage
of 8,192 bytes, which is used to store images for display
regeneration purposes. The use of a buffer enables the
display unit to operate concurrently with the computer system, freeing the main core and the channel
for other functions. Additional features which greatly
compress the image storage requirements are the
absolute vector and character generator features.
Operating system
The operating system for the Laboratory is designated
as the Checkout Control System (CCS). It is
the operating system which is furnished with the IBM
System/360 Model 44, with additions and modifications to convert the system from a sequential batch
job processor to a real-time multiprogramming processor. However, all the original functions and features
have been retained. Programs not requiring the elements of a real-time multiprogramming system may
operate as though the additional facilities were not
present.
The principal area of the Model 44 Programming
System (44PS) in which additions and changes have
been made is the supervisor. The required functions of
CCS include the ability to support various operations
of computing at precise intervals of time. These operations are selected by a priority scheme which controls
the sequence of execution. Other operations are designed to execute as a result of interrupts induced
outside the central processor. These are generally of
such importance that their priorities are higher than
operations initiated as a result of time. The function
of multiprogramming through a scheme of priority
interrupts and the requirement of real-time operation
are the principal requirements for CCS. To satisfy
these requirements, capabilities in three principal areas
have been added. These are multiprogram scheduling,
real-time input/output scheduling, and application
program phasing control.
A principal element of the program scheduling facility
for CCS is the timer queue (Figure 3). It consists
of a string of items ordered in ascending sequence of
time-to-execute. Each item of the queue contains a
pointer to the routine to be executed at the corresponding
time. When the timer interrupt occurs, the timer
processor routine gains control and the routine corresponding
to the timer interrupt is placed into a state
of execution. Its immediate or deferred execution is a
function of priority levels. When a timer interrupt
occurs, a comparison is made between the priority
level of the routine currently in execution and the level
of the routine for which the timer interrupt has occurred.
If the level of the current routine is higher
than or equal to the other, it resumes execution while
the execution of the lower priority routine is deferred.
Conversely, if the priority level of the current routine
is lower, the other is placed immediately into execution,
temporarily suspending the first. This method of
scheduling uses the hardware priority interrupt system
and additional software of CCS.
Figure 4 illustrates some of the conditions which
may occur with a typical combination of timer-initiated
priority routines. Notice that the execution priority
level of timer interrupts is coincident with the priority
level executing at that time. In addition, the
figure shows how the high priority routine gains control
from a lower priority routine. In this priority system,
low magnitude numbers correspond to high level priority.

Head: pointer to first item

Priority   Time to execute   Pointer
   15            24          Program A
   10            29          Program B
   25            38          Program C
   20            49          Program D
   10            56          Program E

Figure 3 - Timer queue

[Figure 4: timing diagram showing execution on priority levels 10, 15, 20 and 25 and on the non-priority level, with timer interrupts marked]
Figure 4 - Timer-initiated multiprogramming
Figure 3 illustrates a timer queue containing several
items which will initiate programs on various levels at
different times. These items match the information
illustrated in Figure 4. As each item reaches the top
of the list, the internal interval timer is set to the
increment of time from "now" until the program is to
execute. When the timer expires, a priority level request
for the program is set, the item is removed from
the queue, and an interval for the next item is calculated.
The program pointed to by the item which
caused the timer interrupt is attached to its priority
level for execution. When the queue becomes empty,
the non-priority level regains control.
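The timer queue and the priority comparison made at each timer interrupt can be sketched in a few lines. This is a model for illustration only, not CCS code: the function names are hypothetical, the value used for the non-priority level is an arbitrary stand-in, and the queue entries are the values as read from Figure 3.

```python
import heapq

NON_PRIORITY = 99  # assumed stand-in for the non-priority (background) level


def make_queue(items):
    """Build a timer queue from (time_to_execute, priority, program) items."""
    queue = list(items)
    heapq.heapify(queue)  # keeps the earliest time-to-execute at the head
    return queue


def interval_to_next(queue, now):
    """Interval to load into the interval timer, or None if the queue is empty."""
    return queue[0][0] - now if queue else None


def on_timer_expiry(queue, current_level):
    """Remove the head item and decide whether it preempts the running level.

    Low numbers are high priority. The new routine runs at once only if its
    level is strictly higher than the current one; otherwise it is deferred.
    Returns (program, priority, preempts).
    """
    _, priority, program = heapq.heappop(queue)
    return program, priority, priority < current_level
```

With the Figure 3 entries, Program A (level 15, time 24) preempts the non-priority level, Program B (level 10, time 29) preempts A, and Program C (level 25, time 38) is deferred if level 10 is still executing.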
The second major feature of program scheduling is
the supervision of priority interrupts by the priority
interrupt executive. Certain 'housekeeping' functions
are performed by this feature, such as register saving
and restoring, as control passes up and down the priority
levels. Control is automatically given to the
priority interrupt executive whenever any one of the
32 levels is activated. The routine to be given control
is determined, registers are saved as required, and a
pointer to parameters is set. Control is then given to
the priority routine. When the routine concludes its
operation, it returns control to the priority interrupt
executive which restores registers and causes the routine
on the next highest level to resume or begin execution.
Figure 5 illustrates the overall flow of data and
control in CCS. Whereas Figure 4 illustrates the effect
of program scheduling, this figure illustrates the mechanics
involved. A program currently executing may
be interrupted by the timer (1). The timer processor
selects data from the queue (2) and attaches the routine
to execute (3). It sets a new interval in the timer (4)
and initiates a priority interrupt (5) (assuming the
routine is of a higher priority than the current program).
The priority interrupt executive determines the routine
to execute (6) and gives control to the routine (7)
which returns control (8) when finished. The executive
then returns control to the interrupted program (9).
At step 5, the condition may exist that the timer-initiated
routine is of lower priority than the current
program. If so, the timer processor returns control
directly to the current program (10).
Both application and system programs may queue
routines using the timer queue (11). The actual queueing
is done by a system routine.
In Figure 5, the dashed line connecting "programs"
and "current program" is intended to show that the
cessing. The first operation is bandpass filtering.
In each of nine heterodyne-type tracking filters, the
center frequency of the 2 Hz passband is continuously
tuned to track the excitation frequency sweep. The
outputs from the tracking filters oscillate about zero
volts and are available in real-time at one of the four
Ci 5000 analog computers. These filter outputs become
the data signals for input to the hybrid computer
and are quite clean sinusoids with slowly varying
frequency (Figures 2 and 3). The one excitation signal
exhibits approximately constant amplitude but the
eight response signals exhibit maxima at the aircraft
structural resonances.
Data compression
The next operation is the storing, in real-time in the
central memory of the CDC 6400 digital computer, of the
times at which all zero-crossings occur, and the peak
amplitudes of all cycles. The time of a positive-going
zero-crossing, such as t3 in Figure 2, is stored with the
amplitude of the previous positive peak. A negative-going
zero-crossing time such as t4 is stored with the
amplitude a3 of the previous negative peak. This data
compression is possible since the outputs of the tracking
filters are quite clean sinusoids and more frequent
sampling would yield redundant data. If needed for
other applications, further data compression could
be achieved by discarding some amplitudes and zero-crossing
times during parts of the frequency sweep
which contain no structural resonances. The necessary
real-time digital computing could be performed
if the central processor time allocated to the program
is greater than the ten percent presently used.
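The compression scheme can be illustrated in software on a uniformly sampled signal. This is a sketch under assumptions: the real system takes its zero-crossing times and peak amplitudes from analog circuits and interrupts rather than from dense samples, the function name is hypothetical, and linear interpolation between samples is used here only to locate the crossing.

```python
import math


def compress(samples, dt):
    """Reduce a sampled signal to (zero_crossing_time, previous_peak) pairs.

    A positive-going crossing is stored with the preceding positive peak and a
    negative-going crossing with the preceding negative peak. A peak of 0.0
    means that no peak of that sign had been seen yet.
    """
    records = []
    pos_peak = neg_peak = 0.0
    for k in range(1, len(samples)):
        prev, cur = samples[k - 1], samples[k]
        pos_peak = max(pos_peak, prev)
        neg_peak = min(neg_peak, prev)
        if prev < 0.0 <= cur:                        # positive-going crossing
            t = (k - 1 + prev / (prev - cur)) * dt   # linear interpolation
            records.append((t, pos_peak))
            pos_peak = 0.0                           # start tracking the next positive peak
        elif prev >= 0.0 > cur:                      # negative-going crossing
            t = (k - 1 + prev / (prev - cur)) * dt
            records.append((t, neg_peak))
            neg_peak = 0.0                           # start tracking the next negative peak
    return records
```

One second of a 2 Hz sinusoid sampled 1000 times collapses to four records, two per cycle, which is the kind of reduction the text describes.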
The envelope detection circuit
The peak amplitudes are generated by applying each
data signal to one of the nine envelope detection circuits (Figure 4). Each circuit consists of two nearly
identical circuits; one for the positive side and one
for the negative side of the data signal. Amplifiers A,
B, C, D and E in Figure 4 form a positive envelope
circuit utilizing two mode-controlled integrators (D
and E) as a track/hold pair, and a first order loop
(A, B and C) as a maximum-value circuit. Since the
amplifier A represents a perfect diode, the loop acts
like a high-gain lag when the input is greater than the
output of B and the "diode" A is forward biased. When
the input falls below the output of B, the "diode"
becomes reverse biased and B is forced to hold at its
last value.
Figure 4 - The envelope detection circuit
As long as the input is positive, comparator output
U is true, B is in the compute mode, D is tracking B,
and E is holding the previous peak. As the input goes
negative, U goes false, B resets to zero, preparing the
maximum-value circuit for the next positive signal,
D holds the last voltage from B, and E tracks the new
peak from D. Thus, with the positive envelope on E
updating on each negative-going zero-crossing of the
input, the envelope has staircase-like discontinuities
as shown in Figure 2, though it is smoother in practice
when more slowly varying frequencies are used. Figure
3 is a segment of typical aircraft data. The second
half of each circuit generates the negative envelope
in a similar manner (Figure 2). All of these envelope
voltages are input continuously to analog-to-digital
converters. A practical upper frequency limit to the
circuit, using the gains in Figure 4, is 120 Hz. With
other gains the useful frequency range could be shifted
so that the upper frequency limit is approximately
2000 Hz.
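The behavior of the positive half of the circuit can be mimicked sample by sample. The model below is idealized and is offered only as an aid to following the description: it ignores the lag of the first order loop and the finite tracking speed of the integrators, and the function name is an assumption.

```python
def positive_envelope(samples):
    """Idealized model of amplifiers B, D and E for the positive envelope.

    b: output of the maximum-value loop; d, e: the track/hold pair.
    Returns the value held on E after each input sample.
    """
    b = d = e = 0.0
    out = []
    for x in samples:
        if x > 0.0:          # comparator output U is true
            b = max(b, x)    # B follows the input upward and holds the maximum
            d = b            # D tracks B; E keeps holding the previous peak
        else:                # U is false
            e = d            # E tracks the peak now held on D
            b = 0.0          # B resets for the next positive half cycle
        out.append(e)
    return out
```

The output changes only when the input goes negative, which reproduces the staircase shape mentioned above.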
The hybrid interface and data storage
Each analog console, with its associated interface,
contains 32 channels of analog-to-digital conversion.
During real-time, when the hybrid system is sensitive
to interrupts and peripheral processor patterns (precompiled I/O programs stored within and executed
by one of the CDC 64.0~_ peripheral processors), it is
804
Fall Joint Computer Conference, 1969
------------------------------------------------------------------------------------~-----possible to transfer nine data words from the sampledisk, as well as the tapes, becomes available for realand-hold amplifiers, via the analog;..to-digital converters,
time data storage. The size of the program eould be
increased even further by using mUltiple analog conto central memory in less than 300 microseconds.
soles and interfaces to achieve parallel data conversion
The actual mechansim of data storage is initiated
and transmission.
by leading edge I/O interrupts which cause the previously defined patterns (programs) in the peripheral processor to transfer the data to a temporary buffer in central memory. No central processor time is required for this operation. Upon completion of I/O, the central processor takes the data from the temporary buffer and packs it in an array with four data words per central memory word. Here it is stored until the frequency sweep is completed and the post real-time processing is begun. The leading edge I/O interrupt is activated asynchronously by a positive-going zero-crossing of one of the data channels. This causes the instantaneous digital value of the corresponding positive envelope (from the analog-to-digital converter) to be stored together with the time at which the zero-crossing occurs. Similarly, negative-going zero-crossings set interrupts which initiate the storage of negative-going zero-crossing times and corresponding negative envelope amplitudes.
During the real-time phase, when timing is most critical, the requirements on the central processor are reduced to that of transferring and packing data within central memory. This requires approximately ten percent of the central processor for nine channels at signal frequencies of 30 Hz. Thus the central processor is readily available to service other hybrid programs or batch digital programs as required.
Using central memory only, data for up to 32,000 signal cycles can be stored. It might be possible to increase this number greatly by using the disk file or magnetic tape. However, the Lockheed-Georgia Company's hybrid computing facility is a time-shared system and it was necessary to program this problem for time-sharing compatibility. The system contains a CDC 6638 disk and four CDC 607 tape drives, and is strongly file oriented, using the disk for intermediate file storage during input and output. In a time-critical problem such as flight flutter testing, the disk might not be available for mass data storage, since it can be in use on a non-interruptable channel for several seconds under certain I/O conditions. When real-time mass data storage is required, the 607 tape drives are generally used. These drives can be assigned to a specific problem and can normally be accessed within 500 milliseconds. However, it is necessary to use a central memory buffer capable of storing approximately one second duration of data.
If the restrictions of time-sharing are removed and if the system can be dedicated to flutter testing, the
Figure 5-The time measurement circuit
The hybrid time measurement circuit
A problem arose in the accurate measurement (within ten microseconds) of the time intervals between the interrupts on as many as nine channels. The problem was complicated by the possibility of virtually simultaneous zero-crossings. Although the digital computer includes a Precision Interval Generator, which downcounts at a rate of 500 KHz, the attempts to use it for accurate timing of events, external to the digital computer, were unsuccessful. This was mainly because of the difficulty of handling simultaneous interrupts and the effect of data-link delays (including software delays and delays, which could be of the order of a millisecond, arising when the computer must finish a previously initiated or higher priority task before attending to the next).
These difficulties were avoided by developing a new
hybrid technique to take advantage of the sample/
hold feature of the analog-to-digital converters (ADC).
This time measurement circuit is described with reference to Figure 5. The basis is the generation of a time-synchronized voltage waveform which represents the
fine count and is fed into one ADC for each of the
data channels to be monitored. The waveform selected
is a 0-100-0 volt triangular wave with a 20 millisecond period. It is generated by two complementary
integrators controlled by the analog clock which counts
down frequencies from a mega-Hertz crystal oscillator.
Because one integrator is always in reset while the
other is integrating, the synchronization of the output
at zero volts is assured at the beginning of each new
cycle.
By connecting each holding register (which goes
true when an interrupt is set and remains true until
the central processor begins action on the interrupt)
to the sample/hold controller of the corresponding
ADC, this fine count voltage can be held until the
digital computer can read it. The coarse count is
purely digital and is incremented at the beginning of
each cycle by a subroutine which is called from the
highest priority interrupt. A logical signal is required
if the coarse count is incremented during the time
between holding and reading an ADC. This signal
is obtained by "ANDing" the holding registers of
the coarse count and the zero-crossing interrupts. By
using this signal to set a flip-flop which feeds a discrete
control line, and by reading this line at the same time
as the digital computer reads the ADC, the coarse
count portion of the stored time may be decremented
if necessary.
The triangular wave is an easily synchronized signal
with no discontinuities. For timing purposes, however,
it is necessary to know whether the wave is ramping
up or down at the time of reading. This is achieved by
using, for each channel, a flip-flop tied into a discrete
line. The flip-flop normally tracks the analog clock,
but maintains the present state when the holding
register indicates that the interrupt is in progress.
A zero-crossing then triggers an interrupt which simultaneously initiates the digital computer, holds the
ADC and the ramp up/down flip-flop, and actuates the
gate of the coarse count warning flip-flop. After recognizing the interrupt, the digital computer simply reads
the ADC and the two discrete lines and stores their
values together with the coarse count. Further action
may be postponed until the post-real-time phase.
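The post-real-time reconstruction of an event time from the four stored values can be sketched as follows. This is a minimal illustration, not the authors' program: the names `HeldSample` and `event_time_ms` are hypothetical, and it assumes the coarse count starts at zero and equals the number of completed 20 millisecond cycles.

```python
from dataclasses import dataclass

PERIOD_MS = 20.0                                  # triangular wave period (from the text)
PEAK_VOLTS = 100.0                                # 0-100-0 volt excursion (from the text)
SLOPE_V_PER_MS = PEAK_VOLTS / (PERIOD_MS / 2.0)   # 10 volts per millisecond


@dataclass
class HeldSample:
    """Values stored for one zero-crossing interrupt (hypothetical layout)."""
    coarse_count: int     # cycle count at the moment the computer read the ADC
    fine_volts: float     # triangular-wave voltage held by the sample/hold
    ramping_up: bool      # held state of the ramp up/down flip-flop
    coarse_warning: bool  # True if the coarse count advanced between hold and read


def event_time_ms(s: HeldSample) -> float:
    """Combine coarse and fine counts into a time in milliseconds."""
    # Undo a coarse-count increment that happened after the ADC was held.
    cycles = s.coarse_count - 1 if s.coarse_warning else s.coarse_count
    if s.ramping_up:
        # First half-cycle: voltage rises from 0 toward 100 volts.
        fine_ms = s.fine_volts / SLOPE_V_PER_MS
    else:
        # Second half-cycle: voltage falls from 100 volts back to 0.
        fine_ms = PERIOD_MS - s.fine_volts / SLOPE_V_PER_MS
    return cycles * PERIOD_MS + fine_ms
```

The same held voltage maps to two different times depending on the ramp direction, which is why the ramp up/down flip-flop must be read together with the ADC.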
With a ten volt per millisecond excursion of the
analog triangular wave, the ADC's are able to resolve
TABLE I-Successive zero crossing times for five channels interrupted simultaneously at 5 Hz.
.969989
1.169992
1.369 cM8
i;-5'6Q~iH
1.769992
1.969989
2.169989
2.369992
2.1';69991
";769987
~.9699cH
1.169991
3.J699A6
~.5699a9
3.769991
'j";()6Q9M
4.1699149
•• 369994
4.569989
4.769989
4.969992
~.1699H9
5.369988
5.569991
".769991
5.969987
6.169<:190
1,.3'69991
6.S6998A
tI.71,99R9
.969990
.96998~
.96~98~
1.169993
1.369989
1.569988
1.769993
1.9f)9990
2.169989
2.3f)9992
2.569992
1.169992
1.369988
1.569987
1.769992
1.9f)9989
2.169989
2.369991
2.569991
2.76991:11.
2.969990
3.1"9991
3.3699d7
3.';69991
3.7699 cH
1.169995
I.J6998d
1.569987
2.7f)'I9~7
2.96999;:?
3.169991
3.369987
3.569991
3.769992
3.9699A9
4.169989
4.369994
.... 5"991'19
4.769989
4.9699~2
5.1(999)
5.369989
S.5699941
A.C;6q9~1l
"'.71'\9991
b.9f1"'~91
7.96998'1
8.169'191
8.3699 cH
1.7b99~l
1.9b9989
2. 169'1t19
2.369991
2.56Ci991
2.769987
2.969990
3.169991
3.369987
3.569'189
3.76'1991
J.96'1"'8d
4.169989
4.369993
4.56"'98'11
4.1699a~
4.96'199l
5.169990
5.369987
5.56'119'111
5.769990
5.969986
6.169990
6.369991
6.~69"'87
6.709'11t19
6.969991
7.169989
7.36998'1
7.569'1'1 ..
7.709989
7.969988
8.1MI991
8.3699911
8.!:i"996~
8.~6'198d
8.769991
8.76'>1991
.969991
1.169993
1.369989
1.569988
1.769'194
1.969991
2.169989
2.369'1'112
2.569992
2.76991:19
2.969992
3.169992
3.369989
3.569991
3.769'119~
3.96991'19
4.169991
4.369995
4.56'11991
4.769990
4.969994
5.169'i92
5.369989'
5.569992
5.769991
5.969988
6.169991
6.369992
6.Sft99B9
6.7699"'1
6."'69993
7.169989
7.369991
7.569995
7.769991
7.9699a9
8.169'1193
8.3699'11
8.5699d9
8.769'1192
the voltages within 0.1 volts. This is equivalent to a
timing accuracy of ten microseconds. Better accuracy
could be achieved by balancing the integrators and
the ADC's for off-set and drift. Typical results from
a system without special balancing are presented in
Tables I and II. Identical channels were interrupted
simultaneously by an analog clock. Table I contains
interrupt times (zero-crossing times) for successive
interrupts at 5 Hz while Table II contains similar results for interrupts at 100 Hz. The times of simultaneous events on all channels differ by no more than
three microseconds. Also the periods between successive
interrupts on any one channel differ by no more
than three microseconds. At zero-crossing frequencies
as high as 100 Hz, the nine channels of the flight
flutter program can be sampled with this same accuracy.
It is possible to read all 32 ADC channels at each analog console within one millisecond so that, at a sampling
rate of 100 Hz, each interface channel is idle 90
percent of the time. Thus the data frequency or the
number of channels could be increased significantly
without loss of accuracy.
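The two figures quoted above follow from simple arithmetic on values given in the text. The symbols $\Delta V$, $\Delta t$, $T_{\text{read}}$ and $f_s$ are introduced here only for illustration.

```latex
% Timing resolution from ADC voltage resolution and ramp slope
\Delta t = \frac{\Delta V}{dV/dt}
         = \frac{0.1\ \text{V}}{10\ \text{V/ms}}
         = 0.01\ \text{ms} = 10\ \mu\text{s}

% Idle fraction of an interface channel at a 100 Hz sampling rate
1 - T_{\text{read}}\, f_s
  = 1 - (1\ \text{ms})(100\ \text{Hz})
  = 1 - 0.1 = 0.9
```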
Post real-time processing
With a maximum lag of only a few cycles after
occurrences on the aircraft, a digital description of
TABLE II-Successive zero crossing times for five channels interrupted simultaneously at 100 Hz.
ZERO-CROSSING TIMES IN SECONDS
CHANNEL 1
.11Q99Z
.18Q994
.19Q992
.20Q994
.21Q992
.229 99 4
.?39 9q 3
.2.Q-.}9 ..
.25Q9 9 3
.1.bqY9..,
.21Q!,J~3
.2AQq9.
.?'19 4 <.n
• :H\9'1'~4
.'31Q'Nl
.32Q-J'I4
.31Q'N?
• :~4q94"
CH"NNEL 2
... 'iQ~'O
• 4'l4-JY4
.47Q99:'
.4AQ!,J44
.4QQ'I 4 '3
.tlOqQ'J4
.<;144<»)
• "2YY'~"
.119991
.189994
.199992
.201i'i94
.219991
.l21i99 •
• .:!39991
.249994
.471.J9~l
.4F199~4
... q99~Z
.41:1~944
.4~49':14
.41i9.,,91.
.4'19941.
.'i()9944
.:, 1 'JQ .. l
.'i?1.J9 4/+
• "d99<.1?
.C;09c)94
.~O.".,,'#:,
• .,199'.12
.':>19'193
.~099-J4
.~1~91.J?
.C;2~9C)4
• 'il "I 9Cj 4
.'2991.J4
- .~3~"I'i3
.~]499~
,"4~'J"":'
.... 4'l1.J4
• 'j"'i4~]
.Sb 9 9'14
.21'11i944
.24~947
.3(\':199 •
.H'N4?
.37.9944
.33"'94\
.34<:1'14.
,,,(I"'94~
.4149<:'0
.4?1.J91.J.
.43.,,441
.44<19<.14
.4",,9 Q4 1
.54Q'J9'->
.r:.,c..,Q44'>
.... 4"J944
.'J~"'94t'
.r.;3999(!
."44944
• .,599'1 i
.C;f,09-.}''
• "t)-il9 Y4
• "t>999 ..
.r.;~oCNj
0;
• '>t~1199~
.4449"11
.2"'19'~4
.~7~9'1~
.3649'14
._H'19'1\
.3R99 ...
.lq99l.Jt
.43QY Q 3
.44'1":/'14
.179993
CHANNEL
.4h."Q-';"•
• 4 7"'9'~ 1
.23 4 917.
.2499'14
.7511942
.30'Hq"
• :'Hq'N3
.4299'14
4
.4"i'l9'1\
.4b9'11.J4
.47'l'J'J?
.2299'~4
.1.,1i9'·,.
.40'1'144
.41'1997
CHA~NEL
.11999l
.189994
.199992
.209994
.219992
.729994
.1'39992.2.9994
.2C,994;.!
.209994
• 27991.Jt!
.?A9994
."99992
.109994
• 319992
.329994
• 'i399~2
.149Q94
• i5999]
.3t>9994
.379l}92
.lR9994
.3Q99'1t!
•• 0991.J4
.4\1i991
• 479944
.43999t
.1049944
.4,;9991.
.469994
.419991.
.119992
.IR9944
• 191i991
.209994
.219942
.~.,q44J
• 1RQ'I44
• 19Q'~9-1
CHANNEL 3
.1~9994
.191i91i1.
.20999 4
.21999J
• U999':i
.2J999J
.249994
.Z':)9993
.259~9Z
.269'19,
.20999,)
.27999Z
.2ij.,,994
.279992
.2f11i994
.29"'992
.2999'.11
.-J0999!:1
.309994
• 3l9.9~2
.329994
.3199 q )
.329990
.339992
.34999&
.3';)'N9 40
.J0999&
.379991.
.3ij999~
.J9994.:!
.401.J9~"
.41'1992
.42999 ..
... J'i99.:!
.4 .. 91.J9'3
.4""'9I.Jt
.4b'l996
• .,~1.J9,,*l
.,b99 4n
• 331i9'o12
.349'194
.3S999?
.3b'i9'i4
.J1 9 '192
.3f1991.J4
.J99992
.4099"'4
.419991
.4..,99"13
• 43'i'·N7.
,"4'~994
nine signals can be stored in the CDC 6400. The computer can be ordered to start or stop accepting real-time data either at the console or by remote switches
in the flight test monitoring room. After a stop order
the stored data is immediately processed. Both signal
zero-crossing times and amplitudes undergo conventional digital smoothing. Next an amplitude-versus-frequency history is generated for all nine signals
from their peak amplitude values and the time intervals
between zero-crossings of the excitation signal. A
phase-versus-frequency history of each response signal,
relative to the excitation, follows by comparing zero-crossing times in each response with those in the excitation. Thus the hybrid computer is used as a frequency response (or transfer function) analyzer. By
searching through the values of the response envelopes,
it is able to find the resonances, calculate their frequencies, and normalize their amplitudes by the corresponding amplitudes of the excitation. An increase
with airspeed of the normalized amplitude of a resonance can indicate a decrease in its damping and in
this way the aircraft stability trends can be followed.
If the frequency sweep is sufficiently slow and if the
actual forcing of the aircraft is accurately represented
by the excitation signal, the computer can use the
phase information and the technique of Kennedy and
Pancu⁴ to separate closely coupled resonances and
calculate their damping. Because these post real-time
operations involve conventional digital programming,
details are irrelevant in this presentation.
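Although the paper omits the programming details, the core of the frequency, phase and normalized-amplitude calculation described above can be sketched in a few lines. This is an illustrative sketch only: the function name `frequency_response` is hypothetical, smoothing is left out, and it assumes that entry k of each list belongs to the same cycle of the sweep (positive-going crossings only).

```python
from typing import List, Tuple


def frequency_response(
    exc_times: List[float],   # excitation zero-crossing times, seconds
    resp_times: List[float],  # response zero-crossing times, seconds
    exc_peaks: List[float],   # excitation peak amplitude per cycle
    resp_peaks: List[float],  # response peak amplitude per cycle
) -> List[Tuple[float, float, float]]:
    """Return (frequency_hz, phase_lag_deg, normalized_amplitude) per cycle."""
    points = []
    for k in range(len(exc_times) - 1):
        # Period and frequency from successive excitation zero-crossings.
        period = exc_times[k + 1] - exc_times[k]
        freq_hz = 1.0 / period
        # Phase lag from the delay of the response crossing, as a
        # fraction of the excitation period.
        phase_deg = 360.0 * (resp_times[k] - exc_times[k]) / period
        # Response amplitude normalized by the excitation amplitude.
        gain = resp_peaks[k] / exc_peaks[k]
        points.append((freq_hz, phase_deg, gain))
    return points
```

A resonance search would then look for local maxima of the normalized amplitude along the frequency axis. Tracking how those maxima change with airspeed gives the stability trend mentioned in the text.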
Typically, the answers, from eight response signals
and a sweep from 1 to 30 Hz lasting 120 seconds,
begin to appear on the line printer approximately
three seconds after the end of the sweep. For its versatility a facsimile machine is used to transmit copies
of the line printer output at eight pages per minute
to its remote terminal in the flight test monitoring
room. A remote line printer or display scope could
quite easily have been used.
SUMMARY
The hybrid frequency response technique has made
possible very rapid data reduction during aircraft
flight flutter testing when time saving is extremely
important. Previously, such data reduction has been
performed in post real-time, to a large extent by hand,
from chart recordings. The savings in aircraft flight
time, and the increased number of channels which can
be analyzed, fully justify the use of a large computer.
It is worth comparing this hybrid system with other
systems which were considered.
A different approach could be based on the sampling
of the data signals at such a high frequency that peak
amplitudes and zero-crossing times could be detected
digitally, in post real-time, by interpolation between
the samples. When the signals are quite clean sinusoids
of slowly varying frequency, this method leads to much
redundant data and a large storage requirement.
Furthermore, it was found that the use of nine data
signals and frequencies up to 30 Hz requires that the
computer accept a prolonged data input rate far
greater than its capability.
Some data compression can be achieved by the use
of the Fast Fourier Transform⁵ which requires a minimum sampling rate of at least twice the highest frequency of interest.⁶ Thus, a sweep from 1 to 30 Hz
with 120 seconds duration requires at least 7,200
samples per channel. This is approximately twice
the number taken by the hybrid technique. On the
CDC 6400 a Fast Fourier Transform of 8192 samples
takes approximately sixty seconds per channel using
software. This is prohibitively long for flight flutter
testing when compared with the three seconds for
nine channels taken by the hybrid technique. A Fast
Fourier Transform using hardware would be much
faster but such a unit was not available. Transform
techniques are more applicable to transient and
random signals than to slow frequency sweeps.
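The sample counts in this comparison can be checked with a short calculation. The sketch below is illustrative and its function names are hypothetical. It assumes a linear frequency sweep, which the text does not state, and two stored samples per signal cycle (one at each positive-going and negative-going zero-crossing, as described earlier).

```python
def nyquist_samples(f_max_hz: int, duration_s: int) -> int:
    """Minimum samples per channel when sampling at twice the highest frequency."""
    return 2 * f_max_hz * duration_s


def hybrid_samples_linear_sweep(f_start_hz: float, f_end_hz: float,
                                duration_s: float, per_cycle: int = 2) -> int:
    """Samples stored by the zero-crossing method, assuming a linear sweep."""
    # For a linear sweep the number of cycles is mean frequency times duration.
    cycles = 0.5 * (f_start_hz + f_end_hz) * duration_s
    return int(per_cycle * cycles)


def next_power_of_two(n: int) -> int:
    """Smallest power of two not less than n (radix-2 transform length)."""
    return 1 << (n - 1).bit_length()


fft_min = nyquist_samples(30, 120)                # 7200
hybrid = hybrid_samples_linear_sweep(1, 30, 120)  # 3720 under the linear-sweep assumption
fft_len = next_power_of_two(fft_min)              # 8192
```

Under these assumptions the ratio is about 1.9, consistent with the "approximately twice" quoted above, and 8192 is the smallest power-of-two transform length that holds 7,200 samples.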
Separate commercial frequency response analyzers
for each data channel could be interfaced with a digital
computer through analog-to-digital converters but
it is extremely difficult to justify the purchase of a
number of such units when a very large hybrid computer is available. Certainly a digital computer is
needed to perform the many logical operations which
separate the important resonances and discard less
important ones. To obtain numerical values of
damping, digital operations appear necessary. The
hybrid computer has the additional advantage of
making possible many convenient forms of system
control and display to further aid in saving aircraft
flight time.
This data reduction system was developed for use
in a flight flutter test program but it should be adaptable to other situations calling for very fast reduction
of slow frequency sweeps. The present application
requires only one analog console and about 25 percent
of the available central memory but it could be expanded to use all four analog consoles and interfaces
to give a capability for 40 data channels. Of course,
this would dedicate the system. Some elements, such
as the hybrid time measurement circuit, could find
even wider application.
ACKNOWLEDGMENTS
We wish to acknowledge the valuable help of M. E.
McCoy, A. Roberts, 11. Elder and J. Hatley of the
Hybrid Computing Department and L. A. Tolve,
S. W. Robinson and W. F. Grosser of the Aeromechanics
Division. We are also grateful to Mrs. Clara Culpepper
for the typing of the manuscript.
REFERENCES
1 R L BISPLINGHOFF H ASHLEY R L HALFMAN
Aeroelasticity
Addison-Wesley Pub Co Inc Cambridge Mass 1955 Chap 1
2 G GRIMM J PHILBRICK
Flight flutter testing-Recently developed techniques in
excitation and data reduction
IAS Natl Summer Meeting Los Angeles 1960 No 60-91
3 V V SOLODOVNIKOV
Introduction to the statistical dynamics of automatic control
systems
Dover Pub N Y 1960 Chap 2
4 C C KENNEDY D C P PANCU
Use of vectors in vibration measurements and analysis
Journal of Aeronautical Science Vol 14 1947 603-625
5 J W COOLEY J W TUKEY
An algorithm for the machine calculation of complex Fourier
series
Math of Computation Vol 19 1965 297-301
6 C E SHANNON
Communication in the presence of noise
Proc IRE Vol 37 No 11 1949
AMERICAN FEDERATION OF INFORMATION
PROCESSING SOCIETIES (AFIPS)
OFFICERS and BOARD of DIRECTORS of AFIPS
President
Dr. Richard I. Tanaka
California Computer Products, Inc.
305 North Muller Street
Anaheim, California 92803
Vice President
Mr. Keith W. Uncapher
The RAND Corporation
1700 Main Street
Santa Monica, California 90406
Secretary
Mr. R. G. Canning
Canning Publications, Inc.
134 Escondido Avenue
Vista, California 92083
Treasurer
Dr. Walter Hoffman
Computing Center
Wayne State University
Detroit, Michigan 48202
Executive Director
Dr. Bruce Gilchrist
AFIPS Headquarters
210 Summit Avenue
Montvale, New Jersey 07645
Executive Secretary
Mr. H. G. Asmus
AFIPS Headquarters
210 Summit Avenue
Montvale, New Jersey 07645
ACM Directors
Dr. B. A. Galler
Computing Center
University of Michigan
Ann Arbor, Michigan 48104
Professor Anthony Ralston
State University of New York
Computing Center
4250 Ridge Lea Road
Amherst, New York
Mr. Donn B. Parker
Control Data Corporation
3145 Porter Drive
Palo Alto, California 94304
IEEE Directors
Mr. L. C. Hobbs
Hobbs Associates, Inc.
P.O. Box 686
Corona del Mar, California 92625
Dr. Robert A. Kudlich
A C Electronics Division
General Motors
Milwaukee, Wisconsin 53201
Mr. Samuel Levine
Bunker-Ramo Corporation
445 Fairfield Avenue
Stamford, Connecticut 06902
American Society for Information Director
Mr. Herbert Koller
Leasco Systems & Research Corporation
4833 Rugby Avenue
Bethesda, Maryland 20014
Simulation Councils Director
Mr. James E. Wolle
General Electric Company
Missile & Space Division
P.O. Box 8555
Philadelphia, Pennsylvania 19101
Association for Computational Linguistics
Observer
Dr. Donald E. Walker
Head, Language & Text Processing
The Mitre Corporation
Bedford, Massachusetts 01730
Special Libraries Association Observer
Mr. Burton E. Lamkin
National Agricultural Library
U.S. Department of Agriculture
Beltsville, Maryland
Society for Industrial and Applied
Mathematics Observer
Dr. D. L. Thomsen, Jr.
IBM Corporation
Armonk, New York 10504
Society for Information Display
Observer
Mr. William Bethke
RADC-(EME, W. Bethke)
Griffis Air Force Base
New York, New York 13440
AFIPS Committee Chairmen
Abstracting
Dr. Vincent E. Guiliano
School of Information and Library Studies
Hayes C, Room 5
State University of New York
Buffalo, New York 14214
Admissions
Dr. Robert W. Rector
Informatics, Inc.
5430 Van Nuys Boulevard
Sherman Oaks, California 91401
Awards
Dr. Arnold A. Cohen
UNIVAC
2276 Highcrest Drive
Roseville, Minnesota 55113
Ad Hoc Conference Committee
Dr. Barry Boehm
Computer Science Department
The RAND Corporation
1700 Main Street
Santa Monica, California 90406
Constitution & Bylaws
Mr. Richard G. Canning
Canning Publications, Inc.
134 Escondido Avenue
Vista, California 92083
Education
Dr. Melvin A. Sh~er
CSC-Infonet
650 N. Sepulveda Blvd.
El Segundo, California 90245
1969 FALL JOINT CONFERENCE COMMITTEE
Chairman
Jerry L. Koory
Planning Research Corporation
Vice Chairman
Ted Braun
Applied Computer Technology Corporation
R. K. Goran
IBM Corporation
M. C. Rogers
TRW Systems
G. M. Sylvester
Lockheed Electronics
E. G. Walsh
California Computer Products
Ladies' Program
Treasurer
Michael Baran
System Development Corporation
Secretary
Robert A. Berman
The RAND Corporation
Education Program
Fred Gruenberger, Chairman
San Fernando Valley State College
Don Kehbiel
Santa Monica City College
Roger Mills
TRW Systems Group
Robert White
Informatics, Inc.
Exhibits
S. F. Needham, Chairman
TRW Systems
R. D. Blosser
Autonetics
L. J. Bouser
Hewlett Packard
R. A. Burks
Scientific Data Systems
C. R. Cornwell
Lockheed Electronics
P. P. Gehl
Scientific Timesharing Corporation
Ann L. Rataichak, Chairman
IBM Corporation
Mrs. James O. White, Jr.
Mrs. Keith W. Uncapher
Mrs. Fred Gruenberger
Local Arrangements
Al Deutsch, Chairman
Informatics, Inc.
Valerie Maitland
The International Data Exchange
Mel Brown
Compata, Inc.
Les Levitan
The International Data Exchange
Tom Schuman
Informatics, Inc.
Stu Shaffer
System Development Corporation
Jim Smith
Naval Undersea R&D Center
Bob White
Informatics, Inc.
Printing and Mailing
Ed Chappeleas, Chairman
IBM Corporation
Glenn W. Murray, Vice Chairman
Autonetics
Edith Taggart
IBM Corporation
Alex Connolly
IBM Corporation
Charles Adamo
Philco Ford
Robert L. Koppel
Scientific Data Systems
Howard Gorman
Autonetics
Lora Perkins
Autonetics
Public Relations
Robert B. Forest, Chairman
Datamation
Janet Eyler
Datamation
Santo A. Lanzarotto
Scientific Data Systems
Dawn Walker
Dawn Walker Public Relations
Mike Murphy
McGraw Hill
Martha Palubniak
McGraw Hill
Mike Creedman
Publications and Technical Program
E. M. Grabbe, Chairman
TRW Systems
Warren E. Meyer, Vice Chairman
System Development Corporation
J. W. Redd
TRW Systems
Jack J. Pariser
Hughes Aircraft Company
Guy H. Dobbs
Isaacs, Dobbs System
John J. Rosati
TRW Systems
Art M. Rosenberg
Informatics, Inc.
Allan N. Wilson
General Dynamics
Charlie D. Coleman
IBM Corporation
Alex Hurwitz
IBM Corporation
Donald W. Gada
Aerospace Corporation
Robert E. Perry
Hughes Aircraft Company
Esker J. Harris
IBM Corporation
Registration
Frank F. Jurkovich, Chairman
Applied Computer Technology
Patricia M. Riley, Vice Chairman
TRW Systems
Walter L. Dooley
North American Rockwell
Dixie L. Lopez
Precision Data Systems, Inc.
Phyllis W. Yorg
TRW Systems
Irene E. Matthews
TRW Systems
R. A. Hayes
Hughes Ground Systems Support
Special Activities Committee
Smith Dorsey, Chairman
Autonetics
Muriel Gustin
Varian Data Machines
Scott Hillman
Autonetics
Robert McCowan
Scientific Data Systems
Robert Steen
IBM Corporation
Paul Thomas
Autonetics
Liaison
Harry T. Larson
California Computer Products
H. G. Asmus, AFIPS Headquarters
American Federation of Information
Processing Societies
Richard B. Blue, Sr., ACM
TRW Systems Group
Jerry Baker, SCI
Hughes Aircraft Company
Sei Shohara, IEEE
Scientific Data Systems
Finance
Mr. Walter L. Anderson
General Kinetics, Inc.
11425 Isaac Newton Square So.
Reston, Virginia 22070
Harry Goode Memorial Award
Mr. Brian W. Pollard
Radio Corporation of America-ISn
Building 202-2
Cherry Hill, New Jersey 08101
IFIP Congress 71
Dr. Herbert Freeman
Professor of Electrical Engineering
New York University
University Heights
New York, New York 10453
International Relations
Dr. Edwin L. Harder
Westinghouse Electric Corporation
Research & Development Center
Beulah Road, Churchill Borough
Pittsburgh, Penna. 15235
Information Dissemination
Mr. Gerhard L. Hollander
Hollander Associates
P.O. Box 2276
Fullerton, California 92633
JCC Conference
Dr. A. S. Hoagland
IBM Research Center
P.O. Box 218
Yorktown Heights, New York
10598
JCC Technical Program
Dr. David R. Brown
Stanford Research Institute
333 Ravenswood Avenue
Menlo Park, California 94025
JCC General Chairmen
1970 SJCC
1970 FJCC
Mr. Harry L. Cooke
Radio Corporation of America
Princeton, New Jersey
Mr. Robert A. Sibley, Jr.
Department of Computer Science
University of Houston
Cullen Boulevard
Houston, Texas 77004
REVIEWERS, PANELISTS, AND SESSION CHAIRMEN
REVIEWERS
Robert P. Abbott
Chacko T. Abraham
Robert M. Aiken
Richard M. Alden
Roy P. Allen
Edward B. Altman
Saul Amarel
L. D. Amdahl
Juan J. Amodei
Robert H. Anderson
L. V. Anderson
T. C. Anderson
Frank D. Anzelmo
Akio Arakawa
Majid Arbab
Paul Armer
George N. Arnovick
Joel D. Aron
M. M. Astrahan
Pauline A. Atherton
Donald C. Augustin
H. L. Babin
George F. Badger, Jr.
Philip R. Bagley
Jerry H. Baker
N. Addison Ball
Michael Ballot
Robert Balzer
Allen E. Barlow
Ben B. Barnes
Robert M. Barnett
James P. Bartlett
A. Batenburg
Frank Bates
John A. Bayless
W. R. Beam
C. K. Bedient
G. A. Bekey
Robert W. Berner
R. D. Benham
Russell Bennett
Frank Bequaert
Paul T. Berning
M. I. Bernstein
Paul W. Berthiaume
Lawrence Beruc
William P. Bethke
L. L. Bewley
Claude D. Birkhead
Donald V. Black
James A. Bloomfield
Daniel G. Bobrow
Morris J. Bodoia
Garret Boer
Gordon R. Bolton
Harold Borko
E.1. Bosch
Sherman H. Boyd
A. M. Bradley
Robert D. Brandsberg
Harvey Bratman
E. L. Braun
Barbara Brawn
Robert Brennan
Melvin A. Breuer
Carl N. Brooks
Barry W. Brown
J. Reese Brown, Jr.
G. E. Bryan
Werner Buchholz
T. D. Buettell
Leslie L. Burns
Warren P. Burrell
C. A. Caceres
Myron A. Calhoun
Peter Calingaert
E. David Callender
Thomas W. Calvert
D. J. Campbell
Anthony V. Campi
Rudd H. Canaday
David W. Cardwell
Roy B. Carlson
Robert L. Carmichael
Chester C. Carroll
W. C. Carter
Leonard J. Chaitin
James M. Chambers
Stanley K. Chao
G. G. Chapin
T. E. Cheatham
H.. C. Cheek
Li-an L. Chenh
J. Chernak
B. F. Cheydleur
G. Chingari
C. K. Chow
W. F. Chow
R. F. Churchhouse
E. H. Clamons
W. Douglas Climenson
Lawrence J. Clingman
A. Ben Clymer
Edward G. Coffman, Jr.
Dan Cohen
Edmund U. Cohler
Walter L. Colby
L. Stephen Coles
Albert H. Coltin
Steve Condon
Thomas J. Condon
Ralph B. Conn
Michael M. Connors
Barbara Conrad
Robert Constant
Alfred E. Corduan
J. L. Corbett
W. A. Cornell
Ira W. Cotton
George A. Coulman
F. C. Cowburn
T. D. Cox
Richard L. Crandall
Arthur J. Critchlow
D. L. Critchlow
James J. Croke
Herbert A. Crosby
Joseph D. Crunkleton
Nicholas Cserhalmi
Charles A. Csuri
Joseph F. Cunningham
Alfred G. Dale
Donald A. Darms
C. M. Davis
Kenton S. Day
Stephen Peter de Jong
Peter B. Denes
Peter J. Denning
Weldon C. Dennis
Karl S. Detzer
U. Clarke S. Dilks
Heinz Dinter
Donald L. Dittberner
George G. Dodd
Richard K. Dove
John C. Duffendack
Michael A. Duggan
John J. Dulin
Arnold 1. Dumey
T. J. Dylewski
Lester D. Earnest
Lita B. Edwin
C. A. Eggert
Raymond Eisenstark
Robert F. Elfant
Pete England
Warren J. Erikson
F. Dennis Erwin
Edward R. Estes
S. E. Estes
David L. Evans
Carl C. Farrington, Jr.
George A. Fedde
Julian Feldman
Frank R. Field, Jr.
Robert T. Filep
T. R. Finch
Ray Fitzgerald
James L. Flanagan
E. Gil Flores
L. E. Fogarty
F. H. Fowler
Margaret R. Fox
Amalie J. Frank
W. Donald Frazer
Roy N. Freed
I. F. Freibergs
C. V. Freiman
Paul J. Friedl
Joyce Friedman
James P. Fry
Lewis M. Fulton
Adolf Futterweit
L. Gainen
Rodger L. Gamblin
Sherbie G. Gangwere
Manuel G. Garcia
Reed M. Gardner
Clarence Giese
M. C. Gilliland
Michael M. Gold
David G. Gordon
Jerome J. Gordon
Robert M. Gordon
D. F. Gorman
John A. Goaden
Malcolm H. Gotterer
Alan J. Gradwohl
Alonzo G. Grace, Jr.
M. N. Greenfield
Donald W. Grissinger
George F. Frondin
Gabriel F. Groner
W. Groth
Otto A. Gutwin
Adolfo Guzman
Thomas G. Hagan
Murray J. Haims
John E. S. Hale
Mark I. Halpern
Richard G. Hamlet
Carl Hammer
Frederick M. Haney
A. G. Hanlon
P. ~. Hanratty
John W. Harbaugh
Philip A. Harding
Donald R. Haring
Esker J. Harris
J. O. Harrison
Harry P. Hartkemeier
Elbert Hartsfield
R. Dean Hartwick
S. L. Hasin
Theodore F. Hatch, Jr.
Kenneth E. Haughton
Arthur Hausner
Robert M. Have
Lester C. Hazlett
John Heafner
John D. Heightley
Melvin F. Heilweil
Walter A. Helbig
V. E. Henriques
Paul J. Hermann
Bertram Herzog
George E. Heyliger
John H. Hiestand
A. N. Higgins
Richard H. Hill
Leonard Hirsch
Harold M. Hite
Elias H. Hochman
Alistair D. C. Holden
G. L. Hollander
Arthur W. Holt
Robert L. Hooper
James A. Howard
David K. Hsiao
Barbara Huberman
Thomas A. Humphrey
Earl Hunt
Cuthbert C. Hurd
P. J. Hurley
Gilbert P. Hyatt
Manley R. Irwin
Roy A. Ito
Edwin L. Jacks
Albert S. Jackson
Edward A. Jacoby
Leo F. Jarzomb
Ronald J efferiea
Bruce B. Johnson
Edwin G. Johnson
R. E. Johnson
Walter L. Johnson
Edwin R. Jones
Terence G. Jones
Earl C. Joseph
L. E. Justice
Richard Y. Kain
Marvin J. Kaitz
J. F. Kalbach
Ted Kallner
Akira Kasahara
Charles Kellogg
Joseph E. Kernan
C. W. Kessler
Wan-Lin Kiang
Robert E. King
E. S. Kinney
Philip Kiviat
K. E. Knight
Prentiss Knowlton
Manfred Kochen
H. R. Koen, Jr.
Eldo C. Koenig
C. J. Koester
James S. Koford
Igal Kohavi
Ziv Kohavi
Anthony J. Kolk, Jr.
Deena Koniver
John O. Kopf
G. A. Korn
Ladis D. Kovach
R. L. Kuehn
Carl J. Kuehner
J. H. Kuney
Jerome Kurtzberg
Kenneth C. Kwan
Dominic A. Laiti
Butler W. Lampson
Daniel J. Lasser
P. Lazarush
Eric G. A. LeBln
Richard C. T. Lee
Y. C. Lee
M. Lehman
John Lennie
A. S. Lett
William E. Lewis
W. Wayne Lichtenberger
Hans P. Lie
Leonard R. Lindenmeyer
Carroll R. Lindholm
Robert K. Lindsay
Robert N. Linebarger
Thomas P. Linville
Ho-Nien Liu
Kenneth M. Lochner, Jr.
R. D. Lohman
Henry A. Long
Fred Luconi
David K. Lynn
Malcolm Macaulay
B. E. F. Macefield
Walter Main
C. M. MaJoneh
Carl W. Malstrom
Richard L. Mandelh
Michael Marcotty
John Markus
M. E. Maron
Irvin Marshall
William L. Martin
R. L. Mattison
Harold E. Maruer
Lynn H. Maxson
C. Hugh Mays
M. E. McCoy
Andrew J. McGill
J. L. McKenney
P. T. McKiernan
John McLeod
M. W. McMurran
H. W. Margler
Michael J. Merritt
Gene S. Metsker
Charles S. Meyer
James C. Michener
Bart J. Michielsen
Kenneth L. Miller
Stephen W. Miller
W. F. Miller
Jack Minker
Gerald Minton
Baker A. Mitchell
E. E. L. Mitchell
Gordon S. Mitchell
Benjamin Mittman
Owen R. Mock
Marion F. Moon
Dana W. Moore
Richard Kelly Moore
Richard A. Moran
Stanley M. Morris
George J. Moshos
Robert A. Mosier
John H. Munson
John K. Munson
Anthony W. Muoio
John J. Murray
F. W. Murray
Robert P. Myers
Jan A. Narud
David Nee
Gary W. Nelson
Richard A. Nesbit
Peter G. Neumann
Allen Newell
Malcolm C. Newey
Fred Newman
William M. Newman
C. B. Newport
R. V. Niedrauer
Norman R. Nielsen
Nils J. Nilsson
N. Nisenoff
Samuel Nissim
J. D. Noe
Ronald A. Nolby
Paul Northrop
William A. Notz
D. R. O'Bell
Joseph A. O'Brien
A. Ockene
Cedric F. O'Donnell
Ken O'Flaherty
John T. O'Neil, Jr.
Lubomyr S. Onyshkevych
G. Oppenheimer
Richard H. Orenstein
Elmer Edwin Osborne
J. T. Owens
Thomas F. Penderghast
Lysel H. Peterson
J. G. Petitt
James K. Picciano
Melvin W. Pirtle
Warren J. Plath
A. V. Pohm
Robert V. Pole
James M. Pomerene
Sigmund N. Porter
John A. Postley
A. W. Potts
L. K. Pounds
M. J. D. Powell
Rebecca C. Prather
R. J. Preiss
J. Paul Pritchard, Jr.
I. C. Pyle
Jesse T. Quatse
James S. Raby
C. V. Ramamoorthy
Bertram Raphael
M. D. Rapkin
A. Karl Rapp
Louis C. Ray
Stanley G. Reed
Harry C. Reinstein
Irwin Remson
William T. Rhoades
Phyllis A. Richmond
Frank C. Rieman
Joseph W. Rigney
Frank D. Risko
Lawrence G. Roberts
R. W. Roberts
D. E. Robison
Nathaniel Rochester
Alan E. Rogers
R. M. Rojko
Michael W. Rolund
Jack Roseman
C. A. Rosen
Morton Rosenberg
Jack L. Rosenfield
Robert R. Rosin
William Edward Ross
R. E. Roundtree, Jr.
Raymond J. Rubey
Paul M. Rubin
Morris Rubinoff
Seymour Z. Rubenstein
Fred Ruffing
Edward C. Russell, Jr.
Roy L. Russo
Jerome D. Sable
Tak Saisho
Erik Salbu
Gerard Salton
John M. Salzer
P. 1. Sampath
Jere L. Sanborn
Wendell Sander
F. J. Sansom
Lawrence Sashkin
Helmut M. Sassenfeld
E. S. Savas
Don Savitt
David B. Saylors
W. E. Schiesser
Arthur J. Schneider
Larry C. Schooley
Ernest J. Schubert
Melvyn H. Schwartz
J. E. Schwenker
Sally Y. Sedelow
Thomas K. Seehuus
Warren D. Seider
Robert H. Selzer
Arnold B. Shafritz
David Shansky
Elmer B. Shapiro
Jack E. Shemer
Paul C. Sheretz
Jerome S. Shipman
Richard R. Shively
Sei Shohara
Paul N. Sholtz
Gerald E. Short
Richard L. Shuey
George T. Shuster, Jr.
Edgar H. Sibley
Roland Silver
Leonard C. Silvern
Q. W. Simkins
R. Simmons
W. D. Simpson
K. D. Sirakides
Patrick G. Skelly
R. A. Slater
W. U. Slauk
Donald R. Slutz
Terry A. Smay
Bernard L. Smith
Kenneth Creston Smith
Leland Smith
Richard V. Smith
L. A. Smitzer
E. W. Snyder
Terry R. Snyder
Gerald N. Soma
L. M. Spandorfer
Charles F. Spitzer
F. W. Springe
Thomas B. Steel, Jr.
Howard H. Steenbergen
John K. Stephens
David H. Stewart
A. J. Stone
Harold S. Stone
Jon C. Strauss
Walter A. Sturm
Maurice E. Suhre, Jr.
Roger K. Summit
William R. Sutherland
R. Taylor
Arthur Teplitz
Larry Tesler
Alan L. Tharp
R. E. Thoman
E. M. Thomas
Gregory L. Thomas
Martin D. Thompson
W. P. Timlake
August A. Toda
Fred M. Tonge
Douglas M. Towne
George R. Trimble, Jr.
Thomas D. Truitt
H. S. Tsou
Frank Tung
G. H. Turner, Jr.
George J. Turner
G. T. Uber
Leonard Uhr
Erwin A. Ulbrich, Jr.
Man T. Ung
William R. Uttal
Richard L. Van Horn
Richard L. Van Tilburg
R. Vichnevetsky
Sam S. Viglione
R. Von Buelow
Alfred H. Vorhaus
Sigurd Waaben
Sven E. Wahlstrom
John V. Wait
R. V. Wakerling
P. Duane Walker
John B. Wallace
Charles J. Walter
C. A. Walton
Ben C. Wang
Gary Y. Wang
Robert L. Ward
LA. Warheit
Homer R. Warner
M. Cameron Watson
C. W. Watt
Vance Weaver
M. N. Weindling
Leonhard H. Weiner
Ralph R. Wheeler
J. J. Whelan
Malcolm E. White
Gio Wiederhold
Jerome B. Wiener
Ronald L. Wigington
Roger C. Wilborn
Lyle C. Wilcox
M. Wildmann
Donald A. Willard
Theodore J. Williams
Thomas G. Williams
Carrel A. Wilson
Nels Winkless
Howard Wishner
Eric W. Wolf
James E. Woile
John W. Womack
Roger C. Wood
Franz Worth
J. H. Worthington
J. Howard Wright
Kendall R. Wright
Ronald E. Wyllys
J. C. Wyman
J. W. Young
Lawrence S. Young
Daniel C. Zatyko
Norman S. Zimbel
Stuart Zimmerman
Arthur S. Zukin
PANELISTS
Lynn Abbott
Paul Armer
L. A. Avanzino
Robert Barnett
Robert W. Bemer
Sergio Bernstein
Paul Berthiaume
Melvin Breuer
Alan R. Butcher
James C. Castle
Ken Charshaf
Robert L. Chartrand
R. K. Chooljian
A. Ben Clymer
Aaron H. Coleman
Robert S. Cope
Alex d'Agapeyeff
Roy Davia
William E. DeLair
Ben Erdman
L. E. Fogarty
Les Goldberg
G. R. J. Grosch
Alexander C. Grove
Stanley D. Halper
John W. Hamblen
Peter P. Harris
Joseph O. Harrison
Alan Hecht
Elias H. Hochman
Bernard C. Hogan
Joseph Hootman
John F. Horty
Robert Jefferson
Stephen J. Kahne
Dan D. Kassan
Al J. Knite
R. C. Leader
George F. J. Lelaner
Roger Levien
Robert W. Lucky
Tony Lumpkin
Carl W. Malstrom
Michael Marcotty
John Mayne
C. W. Medlock
Mortimer Mendelsohn
K. Stephen Menger
Paul Metzelaar
Edward E. L. Mitchell
Jack E. Myers
Thomas J. McConnel, Jr.
William H. McKeeman
D. C. McElroy
Carl E. Nelson
K. Okashima
K. Otten
Max Palevsky
Thomas M. Rees
John S. Saloma III
Phillip L. Schiedermayer
Kenneth Schurr
Robert J. Seidel
John D. Seiley
John P. Singleton
Thomas B. Steel, Jr.
Howard Steenbergen
Julius T. Tou
John V. Tunney
Lawrence Urdang
Willis Ware
Milton Warshawsky
Lawrence Weed
John Willner
Joseph H. Wimbrow
Eric W. Wolf
William L. Wooley
SESSION CHAIRMEN
Morton I. Bernstein
Leon Blitzer
David H. Brandin
Robert R. Brown
Walter Brunner
James Burrows
Michael P. Burwen
Paul S. Collins
Francis L. Goff
Malcolm H. Gotterer
Martin Greenberger
A. H. Halpin
Robert V. Head
Richard Johns
Kenneth W. Kolence
Walter F. Kosonocky
Roy L. Lawrence
Don Lebell
Arthur H. Lipton
Jerome Lobel
William L. Martin
Donald A. Meier
Robert McClure
Bret Nehel
Louis Robinson
Joseph W. Smith
Merlin G. Smith
Robert Stuckelman
Robert L. Thaler
BEST PRESENTATION AWARD PANEL
Robert H. Glaser
IBM Corporation
James H. Bennett
Applied Logic Corporation
Harry T. Larson
California Computer Products
J. W. Redd
TRW Systems
Rex Rice
Fairchild Semiconductor
PRIZE PAPER COMMITTEE MEMBERS
Mr. Paul Armer
Stanford University
Dr. D. W. Gade
The Aerospace Corporation
Mr. Jules Schwartz
King Resources, Inc.
Dr. Edwin K. Blum
University of Southern California
Mr. Nathaniel Rochester
IBM Corporation
Dr. W. A. Sturm
The Aerospace Corporation
Dr. James A. Ward
The Pentagon
1969 FJCC LIST OF EXHIBITORS
Access Systems, Inc.
Adap-e, Inc.
Addison-Wesley Publishing Company, Inc.
Addressograph Multigraph Corporation
Advanced Programming, Inc.
Advanced Systems Inc.
Advanced Terminals, Inc.
AFIPS Press
Airoyal Manufacturing Company
AL/COM Time Sharing Network
Allen Babcock Computing, Inc.
Allied Computer Technology, Inc./Heuristic Systems Division
Alphameric Data Corporation
American Data Systems
American Telephone & Telegraph Company
AMP, Inc.
Ampex Corporation
Anderson Jacobson, Inc.
APL Computing Services
Applied Data Research, Inc.
Applied Digital Data Systems
Applied Dynamics, Inc.
Applied Magnetics Corporation
Applied Peripheral Systems, Inc.