AFIPS
CONFERENCE
PROCEEDINGS
VOLUME 35

1969
FALL JOINT
COMPUTER
CONFERENCE
November 18 - 20, 1969
Las Vegas, Nevada

The ideas and opinions expressed herein are solely those of the authors and are
not necessarily representative of or endorsed by the 1969 Fall Joint Computer
Conference Committee or the American Federation of Information Processing
Societies.

Library of Congress Catalog Card Number 55-44701
AFIPS PRESS
210 Summit Avenue
Montvale, New Jersey 07645

© 1969 by the American Federation of Information Processing Societies, Montvale,
New Jersey, 07645. All rights reserved. This book, or parts thereof, may not be
reproduced in any form without permission of AFIPS Press.

Printed in the United States of America

CONTENTS

OPERATING SYSTEMS
A survey of techniques for recognizing parallel processable streams
  in computer programs ........................................... 1
    M. J. Gonzalez, C. V. Ramamoorthy
Performance modeling and empirical measurements in a system
  designed for batch and time-sharing users ...................... 17
    J. E. Shemer, D. W. Heying
Dynamic protection structures .................................... 27
    B. W. Lampson
The ADEPT-50 time sharing system ................................. 39
    R. R. Linde, C. Weissman, C. Fox
An operational memory share supervisor providing multi-task
  processing within a single partition ........................... 51
    J. E. Braun, A. Gartenhaus

ARRAY LOGIC-LOGIC DESIGN OF THE 70's
Structured logic ................................................. 61
    R. A. Henle, I. T. Ho, G. A. Maley, R. Waxman
Characters-Universal architecture for LSI ........................ 69
    F. D. Erwin, K. J. Thurber
Fault location in cellular arrays ................................ 81
    C. V. Ramamoorthy
Fault multiplication cellular arrays for LSI implementation ...... 89
    S. C. Economides
The pad relocation technique for interconnecting LSI arrays of
  imperfect yield ................................................ 99
    D. F. Calhoun

COMPUTERS FOR CONGRESS
(Panel Session-No papers in this volume)

THE COMPUTER SECURITY AND PRIVACY CONTROVERSY
The application of cryptographic techniques to data processing ... 111
    R. O. Skatrud
Security controls in the ADEPT-50 time-sharing system ............ 119
    C. Weissman
Management of confidential information ........................... 135
    E. V. Comber

PROGRAMMING LANGUAGES AND LANGUAGE PROCESSORS
Some syntactic methods for specifying extendible programming
  languages ...................................................... 145
    V. Schneider
SYMPLE-A general syntax directed macro processor ................. 157
    J. E. Vander Mey, R. C. Varney, R. E. Patchen
An algebraic extension to LISP ................................... 169
    P. Knowlton
An on-line machine language debugger for OS/360 .................. 179
    W. H. Josephs
The Multics PL/1 compiler ........................................ 187
    R. A. Freiburghouse

FORTHCOMING COMPUTER ARCHITECTURES
A design for a fast computer for scientific calculations .........
    P. M. Melliar-Smith
A display processor design ....................................... 209
    R. W. Watson, T. H. Myer, I. E. Sutherland
The system logic and usage recorder .............................. 219
    M. K. Vosbury, R. W. Murphy
Implementation of the NASA modular computer with LSI functional
  characters ..................................................... 231
    J. J. Pariser, H. E. Maurer

DIGITAL SIMULATION OF CONTINUOUS SYSTEMS
Project DARE: Differential analyzer replacement by on-line
  digital simulation ............................................. 247
    G. A. Korn
MOBSSL-UAF: An augmented block structured continuous systems
  simulation language for digital and hybrid computers ........... 255
    D. S. Miller, M. J. Merritt
A hybrid computer programming system ............................. 275
    M. A. Franklin, J. C. Strauss
Hybrid executive-User's approach ................................. 287
    W. L. Graves, R. A. MacDonald

PROBLEMS IN MEDICAL DATA PROCESSING
A system for clinical data management ............................ 297
    R. A. Greenes, A. N. Pappalardo, C. W. Marble, G. O. Barnett
Medical education: A challenge for natural language analysis,
  artificial intelligence, and interactive graphics .............. 307
    J. C. Weber, W. D. Hagamen

ARCHITECTURES FOR LONG TERM RELIABILITY
Design principles for processor maintainability in real-time
  systems ........................................................ 319
    H. Y. Chang, J. M. Scanlon
Effects and detection of intermittent failures in digital
  systems ........................................................ 329
    M. Ball, F. Hardie
Modular computer architecture strategies for long-term missions .. 337
    F. D. Erwin, E. Bersoff
A compatible airborne multiprocessor ............................. 347
    E. J. Dietrich, L. C. Kaye

PUBLISHING VERSUS COMPUTING
(Panel Session-No papers in this volume)

INFORMATION MANAGEMENT SYSTEMS FOR THE 70's
(Panel Session-No papers in this volume)

WHAT HAPPENED TO LSI PROMISES
LSI-Past promises and present accomplishment-The dilemma
  of our industry ................................................ 359
    H. G. Rudenberg
What has happened to LSI-A supplier's view ....................... 369
    C. G. Thornton

TOPICS IN ON-LINE TECHNIQUES
Real-time graphic display of time-sharing system operating
  characteristics ................................................ 379
    J. M. Grochow
A graph manipulator for on-line network picture processing ....... 387
    H. A. DiGiulio, P. L. Tuan
On-line recognition of hand generated symbols .................... 399
    G. M. Miller

MANAGING MONEY WITH COMPUTERS
(Panel Session-No papers in this volume)

DATA BASE AND FILE MANAGEMENT STRATEGIES
Common file organization techniques compared ..................... 413
    N. Chapin
An information retrieval system based on superimposed coding ..... 423
    J. R. Files, H. D. Huskey
Establishment and maintenance of a storage hierarchy for an
  on-line data base under TSS/360 ................................ 433
    J. P. Considine, A. H. Weiss
Resources management subsystem for a large corporate
  information system ............................................. 441
    H. Liu, W. S. Peck, P. T. Pollard
Incorporating complex data structures in a language designed for
  social science research ........................................ 453
    S. Kidd, Jr.

CIRCUIT/MEMORY INNOVATIONS
A nanosecond threshold logic gate ................................ 463
    L. Micheel
Silicon-on-sapphire complementary MOS circuits for high speed
  associative memory ............................................. 469
    J. R. Burns, J. H. Scott
A main frame semiconductor memory for fourth generation
  computers ...................................................... 479
    T. W. Hart, Jr., D. W. Hillis, J. Marley, R. C. Lutz, C. R. Hoffman
A new approach to memory and logic-Cylindrical domain devices .... 489
    A. H. Bobeck, R. F. Fischer, A. J. Perneski
A new integrated magnetic memory ................................. 499
    M. Blanchon, M. Carbonel
Mated film memory-Implementation of a new design and
  production concept ............................................. 505
    L. A. Prohofsky, D. W. Morgan

THE IMPACT OF STANDARDIZATION FOR THE 70's
(Panel Session-No papers in this volume)

USING COMPUTERS IN EDUCATION
A computer engineering laboratory ................................ 515
    D. M. Robinson
Evaluation of an interactive display system for teaching
  numerical analysis ............................................. 525
    P. Oliver, F. P. Brooks, Jr.
Computer based instruction in computer programming: A symbol
  manipulation-List processing approach .......................... 535
    P. Lorton, Jr., J. Slimick

COMPUTER RELATED SOCIAL PROBLEMS: EFFECTIVE ACTION ALTERNATIVES
(Panel Session-No papers in this volume)

DEVELOPING A SOFTWARE ENGINEERING DISCIPLINE
(Panel Session-No papers in this volume)

PROPRIETARY SOFTWARE PRODUCTS
(Panel Session-No papers in this volume)

HARDWARE TECHNIQUES FOR INTERFACING MAN WITH THE COMPUTER
A touch sensitive X-Y position encoder for computer input ........ 545
    A. M. Hlady
A queueing model for scan conversion ............................. 553
    T. W. Clay, Jr.
Character generation from resistive storage of time derivatives .. 561
    M. L. Dertouzos
Economical display generation of a large character set ........... 569
    K. Nezu, S. Naito

COMPUTER-AIDED DESIGN OF COMPUTERS
ISDS: A program that designs computer instruction sets ........... 575
    F. M. Haney
Directed library search to minimize cost ......................... 581
    B. A. Chubb
Computer-aided design for custom integrated systems .............. 599
    W. K. Orr

MANAGEMENT PROBLEMS IN HYBRID COMPUTER FACILITIES
(Panel Session-No papers in this volume)

COMPUTER OUTPUT MICROFILM SYSTEMS
An overview of the computer output microfilm field ............... 613
    D. M. Avedon
The microfilm page printer-Software considerations ............... 625
    S. A. Brown
Computer microfilm: A cost cutting solution to the EDP output
  bottleneck ..................................................... 629
    J. K. Koeneman, J. R. Schwanbeck

THE FUTURE IN DATA PROCESSING WITH COMMUNICATIONS
A case study of a distributed communications-oriented data
  processing system .............................................. 637
    N. Nisenoff
Analysis of the communications aspects of an inquiry-response
  system ......................................................... 655
    J. S. Sykes
A study of asynchronous time division multiplexing for
  time-sharing computer systems .................................. 669

TOPICAL PAPERS
The involved generation: Computing people and the disadvantaged .. 679
    D. B. Mayer
The CUE approach to problem solving .............................. 691
    J. D. McCully
Self-contained exponentiation .................................... 701
    N. W. Clark, W. J. Cody
DCDS digital simulating system ................................... 707
    H. Potash, D. Allen, S. Joseph
Pattern recognition in speaker verification ...................... 721
    S. K. Das, W. S. Mohn

HYBRID TECHNIQUES AND APPLICATIONS
A hybrid/digital software package for the solution of chemical
  kinetic parameter identification problems ...................... 733
    A. M. Carlson
Extended space technique for hybrid computer solution of partial
  differential equations ......................................... 751
    D. J. Newman, J. C. Strauss
Extension and analysis of use of derivatives for compensation of
  hybrid solution of linear differential equations ............... 761
    N. H. Kemp, W. Benson
HYPAC-A hybrid-computer circuit simulation program ............... 771
    P. Balaban, J. P. Fiedler

REAL-TIME HYBRID COMPUTATIONAL SYSTEMS
A time-shared I/O processor for real-time hybrid computation ..... 781
    T. R. Strollo, R. S. Tomlinson, E. R. Fiala
On-line software checkout facility for special purpose computers . 789
    T. H. Witzel, S. S. Hughes
A hybrid frequency response technique and its application to
  aircraft flight flutter testing ................................ 801
    J. M. Simmons

A survey of techniques for recognizing
parallel processable streams in
computer programs*

by C. V. RAMAMOORTHY and M. J. GONZALEZ
The University of Texas
Austin, Texas

INTRODUCTION

State-of-the-art advances, in particular the anticipated advances
generated by LSI, have given fresh impetus to research in the area
of parallel processing. The motives for parallel processing include
the following:

1. Real-time urgency. Parallel processing can increase the speed of
computation beyond the limit imposed by technological limitations.
2. Reduction of turnaround time of high priority jobs.
3. Reduction of memory and time requirements for "housekeeping"
chores. The simultaneous but properly interlocked operations of
reading inputs into memory and error checking and editing can reduce
the need for large intermediate storages or costly transfers between
members in a storage hierarchy.
4. An increase in simultaneous service to many users. In the field
of the computer utility, for example, periods of peak demand are
difficult to predict. The availability of spare processors enables
an installation to minimize the effects of these peak periods. In
addition, in the event of a system failure, faster computational
speeds permit service to be provided to more users before the
failure occurs.
5. Improved performance in a uniprocessor multiprogrammed
environment. Even in a uniprocessor environment, parallel
processable segments of high priority jobs can be overlapped so
that when one segment is waiting for I/O, the processor can be
computing its companion segment. Thus an overall speedup in
execution is achieved.

* This work was supported by NASA Grant NGR 44-012-144.

With reference to a single program, the term "parallelism" can be
applied at several levels. Parallelism within a program can exist
from the level of statements of procedural languages to the level
of micro operations. Throughout this paper, discussion will be
confined to the more general "task" parallelism. The term "task"
(process) generally is intended to mean a self-contained portion of
a computation which once initiated can be carried out to its
completion without the need for additional inputs. Thus the term
can be applied to a single statement or a group of statements.

In contrast to the way the term "level" was used above, task
parallelism can exist at several levels within a hierarchy of
levels. The statements of the main program of a FORTRAN program,
for example, are said to be tasks of the first level. The
statements within a subroutine called by the main program would
then be second level tasks. If this subroutine itself called
another subroutine, then the statements within the latter
subroutine would be of the third level, etc. Thus a sequentially
organized program can be represented by a hierarchy of levels as
shown in Figure 1. Each block within a level represents a single
task; as before, a task can represent a statement or a group of
statements.

Figure 1-Hierarchical representation of a sequentially
organized program

Figure 2-Sequential and parallel execution of a
computational process
Once a sequentially organized program is resolved
into its various levels, a fundamental consideration of
parallel processing becomes prominent-namely that
of recognizing tasks within individual levels which can
be executed in parallel. Assuming the existence of a
system which can process independent tasks in parallel,
this problem can be approached from two directions.
The first approach provides the programmer with
additional tools which enable him to explicitly indicate
the parallel processable tasks. If it is decided to make
this indication independent of the programmer, then
it is necessary to recognize. the parallel processable
tasks implicitly by analysis of the relationship between
tasks within the source program.
After the information is obtained by either of these
approaches, it must still be communicated to and
utilized by the operating system. At this point, efficient
resource utilization becomes the prime consideration.
The conditions which determine whether or not two
tasks can be executed in parallel have been investigated by
Bernstein.1 Consider several tasks, Ti, of a
sequentially organized program illustrated by a flow
chart as shown in Figure 2(a). If the execution of
task T3 is independent of whether tasks T1 and T2 are
executed sequentially as shown in Figure 2(a) or 2(b),
then parallelism is said to exist between tasks T1 and
T2. They can, therefore, be executed in parallel as
shown in Figure 2(c).
This "commutativity" is a necessary but not sufficient condition
for parallel processing. There may exist, for instance, two
processes which can be executed in either order but not in
parallel. For example, the inverse of a matrix A can be obtained
in either of the two ways shown below.

(1) a) Obtain transpose of A
    b) Obtain matrix of cofactors of the transposed matrix
    c) Divide result by determinant of A

(2) a) Obtain matrix of cofactors of A
    b) Transpose matrix of cofactors
    c) Divide result by determinant of A

Thus obtaining the matrix of cofactors and the transposition
operation are two distinct processes which can be executed in
alternate order with the same result. They cannot, however, be
executed in parallel.
Other complications may arise due to hardware
limitations. Two tasks, for example, may need to access
the same memory. In this and similar situations,
requests for service must be queued. Dijkstra, Knuth,
and Coffman2,3,4 have developed efficient scheduling
procedures for using common resources.
In terms of sets representing memory locations,
Bernstein has developed the conditions which must be
satisfied before sequentially organized processes can be
executed in parallel. These are based on four separate
ways in which a sequence of instructions can use a
memory location:

(1) The location is only fetched during the execution of Ti.
(2) The location is only stored during the execution of Ti.
(3) The first operation within a task involves a fetch with
respect to a location; one of the succeeding operations of Ti
stores in this location.
(4) The first operation within a task involves a store with
respect to a location; one of the succeeding operations of Ti
fetches this location.
Assuming a machine model in which processors are
allowed to communicate directly with the memory
and multi-access operations are permitted, the conditions for strictly parallel execution of two tasks or
program blocks can be stated as follows.
(1) The areas of memory which Task 1 "reads"
and onto which Task 2 "writes" should be mutually
exclusive, and vice-versa.
(2) With respect to the next task in a sequential
process, Tasks 1 and 2 should not store information in
a common location.

individual functional units can be assigned to independent
components within a task. The motivation remains the same: a
decrease in execution time of individual tasks. The CDC 6600, for
example, can utilize several arithmetic units to perform several
operations simultaneously. This type of parallelism can be
illustrated by the arithmetic expression which follows.

X = (A+B) * (C-D)

Normally, this expression would be evaluated in a manner similar
to that shown in Figure 3(a). The independent components within
the expression, however, permit parallel execution as shown in
Figure 3(b) with the same results.

Explicit and implicit parallelism
In the explicit approach to parallelism, the programmer himself indicates the tasks within a computational
process which can be executed in parallel. This is
normally done by means of additional instructions in
the programming language. This approach can be
illustrated by the techniques described by Conway,
Opler, Gosden, and others.5,6,7 FORK in the FORK
and JOIN technique6 indicates the parallel processability of a specified set of tasks within a process. The
next sequence of tasks will not be initiated until all

The conditions listed by Bernstein are sufficient to
guarantee commutativity and parallelism of two
program blocks. He has shown, however, that there do
not exist algorithms for deciding the commutativity or
parallelism of arbitrary program blocks.
As an example of what has been discussed here,
consider the tasks shown below which represent FORTRAN statements
for evaluation of three arithmetic expressions.

X = (A+B) * (A-B)
Y = (C-D) / (C+D)
Z = X + Y

Because the execution of the third expression is independent of
the order in which the first two expressions are executed, the
first two expressions can be executed in parallel.
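In modern terms, Bernstein's conditions on the three statements above can be checked mechanically from read and write sets. The sketch below is ours, not the paper's notation; the task representation is an assumption made for illustration.

```python
# Hedged sketch: Bernstein's conditions applied to the three
# FORTRAN-style statements above. A task is modeled as a pair
# (reads, writes) of memory-location (variable) sets.

def can_run_in_parallel(t1, t2):
    """Two tasks may run in parallel when neither reads what the
    other writes, and they do not write a common location."""
    r1, w1 = t1
    r2, w2 = t2
    return (r1.isdisjoint(w2) and
            r2.isdisjoint(w1) and
            w1.isdisjoint(w2))

tx = ({"A", "B"}, {"X"})   # X = (A+B) * (A-B)
ty = ({"C", "D"}, {"Y"})   # Y = (C-D) / (C+D)
tz = ({"X", "Y"}, {"Z"})   # Z = X + Y

print(can_run_in_parallel(tx, ty))  # True: independent statements
print(can_run_in_parallel(tx, tz))  # False: Z reads what X writes
```

The third statement fails the check against either of the first two, matching the essential ordering the text describes.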
Parallelism within a task can also exist when individual components of compound tasks can be executed
concurrently. In the same manner that individual
processors can be assigned to independent tasks,

Figure 3-Illustration of parallelism within a compound
task

the tasks emanating from a FORK converge to a
JOIN statement.
In some instances, some of the parallel operations
initiated by the FORK instruction do not have to be
completed before processing can continue. For example,
one of these branch operations may be designed to
alert an I/O unit to the fact that it is to be used momentarily. The conventional FORK must be modified
to take care of these situations. Execution of an IDLE
statement, for example, permits processors to be
released without initiation of further action.7 The
FORK and JOIN technique is illustrated in
Figure 4.

Figure 4-FORK and JOIN technique
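The FORK and JOIN control structure described above can be sketched with present-day threads; the task bodies and names below are illustrative assumptions, not the paper's example.

```python
# Minimal sketch of FORK and JOIN using Python threads.
import threading

results = {}

def task(name, work):
    results[name] = work()

# FORK: initiate the parallel processable tasks.
t1 = threading.Thread(target=task, args=("T1", lambda: sum(range(100))))
t2 = threading.Thread(target=task, args=("T2", lambda: max(range(100))))
t1.start()
t2.start()

# JOIN: the next sequence of tasks is not initiated until
# all tasks emanating from the FORK have converged.
t1.join()
t2.join()
print(results)  # both results are present after the join
```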
Another example of the explicit approach is the
PARALLEL FOR7 which takes advantage of parallel
operations generated by the FOR statement in ALGOL
and similar constructs in other languages. For example,
the sum of two n × n matrices consists essentially of
n² independent operations. If n processors were available, the
addition process could be organized such that entire rows or
columns could be added simultaneously. Thus the addition of the
two matrices could be accomplished in n units of time. Another
example of this approach is the programming language PL/1 which
provides the TASK option with the CALL statement
which indicates concurrent execution of parallel
tasks.
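The PARALLEL FOR idea of adding whole rows at once can be sketched with one worker per row; the pool-based phrasing below is a modern stand-in for the 1969 construct.

```python
# Sketch of row-parallel matrix addition: with one worker per row,
# the sum of two n x n matrices takes on the order of n parallel
# element additions rather than n^2 sequential ones.
from concurrent.futures import ThreadPoolExecutor

def add_rows(row_a, row_b):
    return [a + b for a, b in zip(row_a, row_b)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

with ThreadPoolExecutor(max_workers=len(A)) as pool:
    # map dispatches one row pair to each worker concurrently
    C = list(pool.map(add_rows, A, B))

print(C)  # [[6, 8], [10, 12]]
```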
An additional way of indicating parallelism explicitly
is to write a language which exploits the parallelism in
algorithms to be implemented by the operating system.
This is the case with TRANQUIL,8,21 an ALGOL-like
language to be utilized by the array processors of
the ILLIAC IV. The situation is unique in that the
language was created after a system was devised to
solve an existing problem. "The task of compiling a
language for the ILLIAC IV is more difficult than
compiling for conventional machines simply because of
the different hardware organization and the need to
utilize its parallelism efficiently." A limitation of this
approach is that programs written in that particular
language can only be run on array-type computers; such programs
are, therefore, heavily machine dependent.
The implicit approach to parallelism does not depend
on the programmer for determination of inherent
parallelism but relies instead on indicators existing
within the program itself. In contrast to the relative
ease of implementation of explicit parallelism, the
implicit approach is associated with complex compiling
and supervisory programs.
The detection of inherent parallelism between a set
of tasks depends on thorough analysis of the source
program using Bernstein's conditions. Implementation
of a recognition scheme to accomplish this detection
is dependent on the source language. Thus a recognizer
which is universally applicable cannot be implemented.
An algorithm developed by Fisher9 approaches the
problem of parallel task detection in a general manner.
His algorithm utilizes the input and output sets of
each task (process) to determine essential ordering
and thus inherent parallelism. Given such information
as the number of processes to be analyzed, the input
and output set for each process, the given permissible
ordering among the processes, and any initially known
essential order among the processes, the algorithm
generates the essential serial ordering relation and the
covering for the essential serial ordering relation. This
covering provides an indication of the tasks within the
overall process which can be executed concurrently.
Basically, this work formalizes in the form of an
algorithm the conditions for parallel processing developed by
Bernstein. The conditions for parallel processing
between two tasks are extended to an overall process.

Detection of task parallelism-A new approach

The next subject covered in this paper involves
implicit detection of parallel processable tasks within
programs prepared for serial execution. An indication
is desired of the tasks which can be executed in parallel
and the tasks which must be completed before the
start of the next sequence of tasks. Thus the problem
can be broken down into two parts: recognizing the
relationships between tasks within a level and using
this information to indicate the ordering between tasks.
The approach presented here is based on the fact
that computational processes can be modeled by
oriented graphs in which the vertices (nodes) represent
single tasks and the oriented edges (directed branches)
represent the permissible transition to the next task
in sequence. The graph (and thus the computational
process) can be represented in a computer by means
of a Connectivity Matrix, C.10,11 C is of dimension
n × n such that Cij is a "1" if and only if there is a
directed edge from node i to node j, and it is "0"
otherwise. The properties of the directed graph and
hence of the computational process it represents can
be studied by simple manipulations of the connectivity
matrix.
A graph consisting of a set of vertices is said to be
strongly connected if and only if any node in it is reachable
from any other. A subgraph of any graph is defined
as consisting of a subset of vertices with all the edges
between them retained. A maximal strongly connected
(M.S.C.) subgraph is a strongly connected subgraph
that includes all possible nodes which are strongly
connected with each other. Given a connectivity matrix
of a graph, all its M.S.C. subgraphs can be determined
simply by well-known methods.10 A given program
graph can be reduced by replacing each of its M.S.C.
subgraphs by a single vertex and retaining the edges
connected between these vertices and others. After
the reduction, the reduced graph will not contain any
strongly connected components.
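The M.S.C. determination from a connectivity matrix can be sketched with a transitive-closure computation: two nodes belong to the same M.S.C. subgraph exactly when each reaches the other. The matrix below is an invented toy example, not the paper's Figure 5.

```python
# Sketch: find maximal strongly connected (M.S.C.) subgraphs from a
# connectivity matrix C by computing the transitive closure R
# (Warshall's method); i and j are in the same M.S.C. subgraph iff
# R[i][j] and R[j][i] both hold.
def msc_components(C):
    n = len(C)
    R = [row[:] for row in C]
    for k in range(n):
        for i in range(n):
            if R[i][k]:
                for j in range(n):
                    R[i][j] = R[i][j] or R[k][j]
    components, seen = [], set()
    for i in range(n):
        if i in seen:
            continue
        group = {i} | {j for j in range(n) if R[i][j] and R[j][i]}
        seen |= group
        components.append(sorted(group))
    return components

# Toy graph: 0 -> 1 <-> 2 -> 3; nodes 1 and 2 form an M.S.C. subgraph
# and would be replaced by a single vertex in the reduced graph.
C = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 0, 0]]
print(msc_components(C))  # [[0], [1, 2], [3]]
```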
The paragraphs which follow will describe the sequence of
operations needed to prepare, for parallel
processing in a multiprocessor computer, a program
written for a uniprocessor machine.
(1) The first step is to derive the program graph
which identifies the sequence in which the computational tasks
are performed in the sequentially coded program. Figure 5(a)
illustrates an example program
graph. The program graph is represented in the computer by its
connectivity matrix. The connectivity
matrix for the example is given in Figure 5(b).
(2) By an analysis of the connectivity matrix, the
maximal strongly connected subgraphs are determined
by simple operations.10 This type of subgraph is illustrated by
tasks 2 and 12 in Figure 5. Each M.S.C.
subgraph is next considered as a single task, and the
graph, called the reduced graph, is derived. The reduced graph
does not contain any loops or strongly

connected elements. In this graph, when two or more
edges emanate from a vertex, a conditional branching
is indicated. That is, the execution sequence will take
only one of the indicated alternatives. A vertex which
initiates the branching operation will be called a
decision or branch vertex. The reduced graph for the
example program graph is shown in Figure 6. In this
graph, vertex 3 represents a branch vertex.

Figure 5-Program graph of a serially coded program
and its connectivity matrix
(3) The next step is to derive the final program
graph and its connectivity matrix T. The elements of
T are obtained by analyzing the inputs of each vertex
in the reduced graph. An element, Tij, is a "1" if
and only if the j-th task (vertex) of the reduced graph
has as one of its inputs the output of task i; otherwise
Tij is a "0". Figure 7 illustrates the final program graph for
the example after consideration is given to the input-output
relationships of each task. The connectivity
matrix for the final program graph is shown in Figure 8.
From the sufficiency conditions for task parallelism,
two tasks can be executed in parallel if the input set of
one task does not depend on the output set of the other
and vice versa. The technique outlined in Step 4 detects
this relationship and uses it to provide an ordering
for task execution.
(4) The vertices of the final program graph are

partitioned into "precedence partitions" as follows.
Using the connectivity matrix T, a column (or columns)
containing only zeroes is located. Let this column
correspond to vertex v1. Next delete from T both the
column and the row corresponding to this vertex. The
first precedence partition is P1 = {v1}. Using the remaining
portion of T, locate vertices {v21, v22, ...} which
correspond to columns containing only zeroes. The
second precedence partition P2 thus contains vertices
{v21, v22, ...}. This implies that tasks in set P2 =
{v21, v22, ...} can be initiated and executed in parallel
after the tasks in the previous partition (i.e., P1) have
been completed. Next delete from T the columns and
rows corresponding to vertices in P2. This procedure is
repeated to obtain precedence partitions P3, P4, ..., Pp,
until no more columns or rows remain in the T matrix.
It can be shown that this partitioning procedure is
valid for connectivity matrices of graphs which contain
no strongly connected components.
The implication of this precedence partitioning is
that if P1, P2, ..., Pp correspond to times t1, t2, ..., tp, the
earliest time that a task in partition Pi can be initiated
is ti.

Figure 6-Reduced program graph of the serially coded
program

Figure 7-Final program graph of the parallel
processable program

Figure 8-Connectivity matrix of the final program graph,
with precedence partitions {1}, {2}, {3,8}, {4,5,9,10},
{6,11,12}, {7,13}, {14}
The final program graph contains the following types
of vertices: (1) The branch or decision type vertex
from which the execution sequence selects a task from
a set of alternative tasks. (2) The Fork vertex which
can initiate a set of parallel tasks. (3) The Join vertex
to which a set of parallel tasks converge after their
execution .• (4) The normal vertex which receives its
input set from the outputs of preceding tasks. Figure 7a
indicates the final program graph with the first three
types of vertices indicated by B, F, and J, respectively.
(5) From precedence partitioning and the final
program graph, a Task Scheduling Table can be
developed. This table, shown in Table I, serves as an
input to the operating system to help in the scheduling
of tasks. For example, if the task being executed is a
Fork task, a look-ahead feature of the system can
prepare for parallel execution of the tasks to be initated upon compl~tion of the currently active task.
(6) The precedence partitions of Step 4 provide an
indication of the earliest time at which a task may be
initiated. It is also desirable, however, to provide an
indication of the latest time at which a task may be
initiated. This information can be obtained by performing precedence partitions on the transpose of the
T matrix. This process can be referred to as "row partitions". The implication here is that if task i is in the
partition corresponding to time period tj, then tj is
the latest time that task i can be initiated.
Using both the row and column partitions, the permissible initiation time for each task can be derived as
shown in Table II. Task 4, for example, can be initiated during t4 or t5, depending on the availability of
processors.
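The partitioning procedure can be sketched in Python. This is a hypothetical illustration on a small five-task graph, not the authors' implementation; the latest initiation times are read from the partitions of the transpose with the time scale reversed.

```python
def precedence_partitions(T):
    """Repeatedly remove the tasks whose remaining column in the
    connectivity matrix T contains only zeroes (no unfinished
    predecessors); each wave of removals is one precedence partition."""
    remaining = set(range(len(T)))
    partitions = []
    while remaining:
        wave = sorted(j for j in remaining
                      if all(T[i][j] == 0 for i in remaining))
        if not wave:  # only valid for graphs with no strongly connected components
            raise ValueError("graph contains a cycle")
        partitions.append(wave)
        remaining -= set(wave)
    return partitions

# A small hypothetical graph: T[i][j] == 1 iff task i must precede task j.
T = [[0, 1, 1, 0, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0]]

earliest = precedence_partitions(T)                       # column partitions
rows = precedence_partitions([list(r) for r in zip(*T)])  # row partitions
p = len(earliest)
latest = {task: p - k for k, wave in enumerate(rows) for task in wave}
# earliest == [[0], [1, 2], [3], [4]]; task 2 may start at t2 but need
# not start until t3 (latest[2] == 3), so it has scheduling slack.
```

A scheduler can use the gap between the earliest and latest times of a task as the window within which processor availability may be traded off.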
At this point it is desirable to clarify some possible
misinterpretations of the implications of this method.
The method presented here does not try to determine
whether any or all of the iterations within a loop can
be executed simultaneously. Rather, the iterations
executed sequentially are considered as a single task.

TABLE I-Task scheduling table

TIME   INPUTS TO TASK   TASK NUMBER   TASK TYPE
t1     -                1
t2     1                2             FORK
t3     2                3             BRANCH
t3     2                8             FORK
t4     3                4
t4     3                5
t4     8                9             FORK
t4     8                10
t5     5                6
t5     9                11
t5     9                12
t6     4, 6             7             JOIN
t6     10, 11, 12       13            JOIN
t7     7, 13            14            JOIN

For this reason, the undecidability problem introduced
by Bernstein is not a factor here.
In addition, precedence partitions may place the
successors of a conditional within the same partition.
The interpretation of this is that only one of the successors will be executed, and it can be executed in
parallel with the other tasks within that partition.

The FORTRAN parallel task recognizer
In order to determine the degree of applicability of
the method described above, it was decided to apply
the method to a sample FORTRAN program. This
was accomplished by writing a program whose input
consists of a FORTRAN source program; its output
consists of a listing of the tasks within the first level
of the source program which can be executed in parallel.
The program written to accomplish this parallel task

TABLE II-Permissible task initiation times

COLUMN PARTITIONS
TIME   TASKS
t1     1
t2     2
t3     3, 8
t4     4, 5, 9, 10
t5     6, 11, 12
t6     7, 13
t7     14

ROW PARTITIONS
TIME   TASKS
t1     1
t2     2
t3     3, 8
t4     5, 9
t5     4, 6, 10, 11, 12
t6     7, 13
t7     14

PERMISSIBLE TASK INITIATION PERIODS
TASK   TIME
1      t1
2      t2
3      t3
4      t4, t5
5      t4
6      t5
7      t6
8      t3
9      t4
10     t4, t5
11     t5
12     t5
13     t6
14     t7

detection is known in its final form as a FORTRAN
Parallel Task Recognizer.13
The recognizer, also written in FORTRAN, relies
on indicators generated by the way in which the
program is actually written. Consider the expressions
given below.
X1 = f1(A, B)
X2 = f2(C, D)

Because the right-hand side of the second expression
does not contain a parameter generated by the computation which immediately precedes it, the two expressions can be executed in parallel. If, on the other hand,
the expressions were rewritten as shown below, the

termination of the first computation would have to
precede the initiation of the second.
X1 = f1(A, B)
X2 = f2(X1, C)

The recognizer performs this determination by comparing the parameters on the right-hand side of the equality
sign to outcomes generated by previous statements.
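This comparison can be sketched as follows. The sketch is a simplified, hypothetical illustration restricted to plain assignment statements whose right-hand sides contain only identifiers and arithmetic operators; it is not the recognizer itself.

```python
import re

def connectivity(statements):
    """Build a connectivity matrix: C[i][j] == 1 when statement j reads a
    variable that an earlier statement i writes, i.e., statement i must
    terminate before statement j may be initiated."""
    outs, ins = [], []
    for s in statements:
        lhs, rhs = s.replace(" ", "").split("=", 1)
        outs.append(lhs)                                   # output parameter
        ins.append(set(re.findall(r"[A-Za-z]\w*", rhs)))   # input parameters
    n = len(statements)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if outs[i] in ins[j]:
                C[i][j] = 1
    return C

C = connectivity(["X1 = A + B", "X2 = C + D", "X3 = X1 * X2"])
# Statement 2 reads no output of statement 1 (C[0][1] == 0), so the two
# can be executed in parallel; both must precede statement 3.
```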
Other FORTRAN instructions can be analyzed
similarly. Consider the arithmetic IF:
IF (X - Y) 3,4,5
Here the parameters within the parentheses must be
compared to the outputs of preceding statements in
order to determine essential order.
Other FORTRAN instructions are analyzed in a
similar manner in order to generate the connectivity
matrix for the source program. During this analysis
the recognizer assigns numbers to the executable
statements of the source program. After this is completed, the recognizer proceeds with the method of
precedence partitions described earlier. Precedence
partitions yield a list of blocks which contain the statement numbers which can be executed concurrently.
Figure 9 shows a block diagram of the steps taken by
the recognizer to generate the parallel processable
tasks within the first level of a FORTRAN source
program.
Some statements within the FORTRAN set are
treated somewhat differently. The DO statement, for
example, does not itself contain any input or output
parameters but instead generates a series of repeated
operations. Because of the loop considerations mentioned earlier, and because the rules of FORTRAN
require entrance into a loop only through the DO
statement, all the statements contained within a DO
loop are considered as a single task. A loop, however,
may contain a large number of statements, and a great
amount of potential parallelism may be lost if consideration is not given to the statements within the
loop. For this reason, the recognizer generates a separate connectivity matrix for each DO loop within the
program.
The recognizer itself possesses limitations which
must be eliminated before it can be applied to programs
of a complex nature. For example, only a subset of
the entire FORTRAN set is considered for recognition.
This could be corrected by expanding the recognition
process to include a more complete set of instructions.
[Figure 9 block diagram steps: read the next source program instruction; record the input and output parameters required by this task; if this task is the successor of a branch or transfer operation, record this information; scan executable statements and compare input parameters to outputs of previous statements; when a match is found, make an entry in C, i.e., show a connection from predecessor to successor; after generation of C is complete, using the assigned statement numbers, generate precedence partitions and indicate those tasks within the first level which can be done in parallel.]

Figure 9-Block diagram of the FORTRAN parallel task recognizer

(a) The sample program:

  C     THIS IS A TEST PROGRAM DESIGNED TO CHECK PPS
        DIMENSION A1(10),A2(10),A3(10)
        INTEGER A1,A2,ABC,A2X2,B,C,D
  1     READ 100, (A1(I),I=1,10),B,C,D
  2     READ 100, (A2(I),I=1,10),NS,NST,NSTU
  3     DO 10 I=1,10
        IF (A1(I)-A2(I)) 20,30,40
     20 X1=(A1(I))*(B-C)
     30 X2=D+(B/C)
     40 A3(I)=X1*X2
     10 CONTINUE
  C     THIS IS A TEST COMMENT
  9     PRINT 200,B,C,D
  10    CALL ALPHA(A1,A2,ABC,B4,B5)
  11    PRINT 3057,X1,X2,(A3(I),I=1,10)
  12    CALL BETA(X1,X2,A3,B6)
  13    IF(B4-B5) 50,50,60
  14 50 READ 315,E,F,G,H
  15    X3=(E*F)+(G-H)
  16    X4=B6+G
  17    X5=X3-X4
  18    X6=(B4+B5)*X5
  19    PRINT 4,X3,X4,X5
  20 60 PRINT 52, (A1(I),I=1,10),ABC,C,(A3(I),I=1,10)
    100 FORMAT(10I2,3I3)
    200 FORMAT(1H0,* B C D*,/,3I3)
   3057 FORMAT(1H ,2I3,10F7.1)
    315 FORMAT(4F7.4)
      4 FORMAT(3F7.4)
     52 FORMAT(12I3,10F7.1)
        END

(b) Parallel processable tasks:

(1,2)  (3)  (9,10,11,12)  (13)  (14)  (15,16)  (17)  (18,19,20)

Figure 10-An example of the recognition process

In addition to the DO statement, loops can also be
created by branch and transfer operations such as
the IF and GO TO instructions. To eliminate these
loops, it would be necessary to analyze the connectivity matrix in the manner mentioned earlier before
beginning the process of precedence partitions. The
recognizer does not presently perform this analysis.
Nested DO loops are not permitted, and the source
program size is limited in the number of executable
statements it may have and in the number of parameters any one statement can contain.
Some of these limitations could be eliminated quite
easily; others would require a considerable amount of
effort. To allow a source program of arbitrary size
would require a somewhat more elaborate handling of
memory requirements and associated problems. At the

present time the recognizer consists of a main program
and six subroutines. In its present form the recognizer
consists of approximately 1300 statements.
The recognizer is presently written in such a manner
that it will detect only first level parallelism. The
method it uses, however, can be applied to parallelism
at any level.
The theory of operation of the FORTRAN parallel
task recognizer will be illustrated by applying the
recognition techniques to a sample FORTRAN program.
Figure 10(a) is a listing of the sample program showing
the individual tasks. Figure 10(b) is a listing of the
parallel processable tasks as determined by precedence
partitions. The numbers to the left of the executable
statements are the numbers assigned by the recognizer
during the recognition phase.
Elimination of the limitations mentioned here and
other limitations not mentioned explicitly will be the
subject of future effort.

Observations and comments
Regardless of the manner in which the subject of
parallel processing is approached, common problems
arise. Prominent among these is a need to protect
common data. If two tasks are considered for concurrent execution and one task accesses a memory
location and the other amends it, then strict observance
must be paid to the order in which this is done. The


FORTRAN recognizer, for example, may determine
that two subroutines can be executed in parallel. At
the present time no consideration is given to the fact
that both subroutines may access common data
through COMMON or EQUIVALENCE statements.
In order to truly optimize execution time for a
program which is set up for parallel processing, it
would be highly desirable to determine the time required for execution of the individual tasks within
the process.
the process. It is not enough to merely determine that
two tasks can be executed concurrently; the primary
goal is that this parallel execution result in higher
resource utilization and improved throughput. If the
time required for the execution of one task is 100 times
that of the other, for example, then it may be desirable
to execute the two tasks serially rather than in parallel.
The reasoning here is that no time would be spent
in allocating processors and so forth.
Determination of task execution time, however, is
not a simple matter. Exhaustive measurements of the
type suggested by Russell and Estrin14 would provide
the type of information mentioned here.
Another problem area involves implementation of
special purpose languages such as TRANQUIL. It
was mentioned earlier that programs written in a
language of this type are highly machine-limited. It
would be highly desirable to be able to implement
programs written in these languages in systems which
are not designed to take advantage of parallelism.
Along these lines, the programming generality suggested by Dennis15 may be significant.
It should be pointed out that all the techniques
which have been discussed here will create a certain
amount of overhead. For this reason it is felt that a
parallel task recognizer, for example, would be best
suited for implementation with production programs.
Thus even though some time would be lost initially,
in the long run parallel processing would result in a
significant net gain.

Conclusions
The method of indicating parallel processable tasks
introduced here and illustrated in part by the FORTRAN Parallel Recognizer appears to provide enough
generality that it is independent of the language, the
application, the mode of compilation, and the number
of processors in the system. It is anticipated that this
method will remain as the basis for further effort in
this area.
In addition to the comments made earlier, some
possible future areas of effort include determination of
possible parallelism of individual iterations within a
loop. It is hoped that additional information can be
provided to the operating system other than a mere
indication of the tasks which can be executed in parallel. This would include the measurements mentioned
earlier and an indication of the frequency of execution
of individual tasks.
It is also hoped that a sub-language may be developed which can be added to existing languages to
assist in the recognition process and the development
of recognizer code.

Detection of parallel components within
compound tasks
Several algorithms exist for the detection of independent components within compound tasks.16,17,18,19
These algorithms are concerned primarily with detection of this type of parallelism within arithmetic
expressions. The first three algorithms referenced
above are summarized in [19], where a new algorithm
is also introduced.
The arithmetic expression which will be used as an
example for each algorithm is given below.
A+B+C+D*E*F+G+H
Throughout this discussion the usual precedence
between operators will apply. In order of increasing
precedence, the operators are as follows: + and -;
* and /; and ↑, where ↑ stands for exponentiation.
Hellerman's algorithm
This algorithm assumes that the input string is
written in reverse Polish notation and contains only
binary operators. The string is scanned from left to
right, replacing by temporary results each occurrence
of adjacent operands immediately followed by an
operator. These temporary results will be considered
as operands during the next passes. Temporary results
generated during a given pass are said to be at the
same level and therefore can be executed in parallel.
There will be as many passes as there are levels in the
syntactic tree. The compilation of the expression
listed above is shown in Figure 11.
Although this algorithm is simple and fast, it has
two shortcomings. The first is a possible difficulty in
implementation since it requires the input string to
be in Polish notation; the second is its inability to
handle operators which are not commutative.
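The pass structure can be sketched as follows. This is a hypothetical reconstruction from the description above, restricted to single-character operands; it is not Hellerman's published implementation.

```python
OPS = set("+-*/")

def is_operand(tok, fresh):
    # Temporaries created during the current pass are not yet operands.
    return tok not in OPS and tok not in fresh

def hellerman(tokens):
    """Scan a reverse-Polish token list left to right; each pass replaces
    every operand-operand-operator triple with a temporary result.  The
    assignments produced by one pass form one level and can therefore be
    executed in parallel."""
    levels, n = [], 0
    while len(tokens) > 1:
        fresh, assigns, i = set(), [], 0
        while i <= len(tokens) - 3:
            a, b, op = tokens[i], tokens[i + 1], tokens[i + 2]
            if op in OPS and is_operand(a, fresh) and is_operand(b, fresh):
                n += 1
                r = "R%d" % n
                assigns.append("%s=%s%s%s" % (r, a, op, b))
                tokens[i:i + 3] = [r]   # temporary replaces the triple
                fresh.add(r)
            i += 1
        levels.append(assigns)
    return levels

levels = hellerman(list("AB+C+DE*F*+G+H+"))
# Pass 1 yields R1=A+B and R2=D*E, which lie at the same level and can
# be computed in parallel; five passes reduce the string to R7.
```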

LEVEL   INPUT STRING AFTER THE lth PASS   TEMPORARY RESULTS GENERATED DURING lth PASS
0       AB+C+DE*F*+G+H+
1       R1 C+ R2 F* +G+H+                 R1=A+B, R2=D*E
2       R3 R4 +G+H+                       R3=R1+C, R4=R2*F
3       R5 G+H+                           R5=R3+R4
4       R6 H+                             R6=R5+G
5       R7                                R7=R6+H

Figure 11-Parallel computation of A+B+C+D*E*F+G+H using Hellerman's algorithm

Stone's algorithm

The basic function of this algorithm is to combine
two subtrees of the same level into a level that is one
higher. For example, A and B, initially of level 0, are
combined to form a subtree of level 1. The algorithm
then searches for another subtree of level 1 by attempting to combine C and D. Since precedence relationships between operators prohibit this combination, the
level of subtree (A+B) is incremented by one. The
algorithm now searches for a subtree of level 2 by
attempting to combine C, D, and E. Since this combination is also prohibited, subtree (A+B) is incremented to level 3. The next search is successful, and a
subtree of level 3 is obtained by combining C, D, E
and F. These two subtrees are then combined to form a
single subtree of level 4.
In a similar manner the subtree (G+H), originally
of level 1, is successively incremented until it achieves
a level of 4; at that time it is combined with the other
subtree of the same level to form a final tree of level 5.
The algorithm yields an output string in reverse
Polish which does not expressly show which operations
can be performed in parallel. Even though the output
string is generated in one pass, the recursiveness of
the algorithm causes it to be slow, and at least one
additional pass would be required to specify parallel
computations.

Squire's algorithm
The goal of this algorithm is to form quintuples of
temporary results of the form:
Ri (operand 1, operator, operand 2, start level, end level)
where start level = max [end level of operand 1; end
level of operand 2] and end level = start level + 1.
All temporary results which have the same start level
can be computed in parallel. Initially, all variables
have a start and end level equal to zero.
Scanning begins with the rightmost operator of the
input string and proceeds from right to left until an
operator is found whose priority is lower than that of
the previously scanned operator. In the example the
scan would yield the following substring:
D*E*F+G+H
Now a left to right scan proceeds until an operator is
found whose priority is lower than that of the leftmost operator of the substring. This yields: D*E*F.
At this point a temporary result R1 is available of the
form:

R1 (D, *, E, 0, 1).

The temporary result, R1, replaces one of the operands
and the other is deleted together with its left operator.
The new substring is then:
R1*F+G+H.
The left to right scans are repeated until no further
quintuple can be produced, and at that time, the right
to left scan is re-initiated. The results of the process
are shown in Figure 12.
are shown in Figure 12.
Although the example shows the algorithm applied
to an expression containing only binary operators, the
algorithm can also handle subtraction and division
with a corresponding increase in complexity.
A significant feature of this algorithm is that Polish
notation plays no part in either the input string or
the output quintuples. Because of the many scans and
comparisons the algorithm requires, it becomes more
complex as the length of the expression and the diversity of operators within the expression increase.

INITIAL STRING: A+B+C+D*E*F+G+H

RIGHT TO LEFT SCAN       LEFT TO RIGHT SCAN
D*E*F+G+H                R1*F+G+H
                         R2+G+H
A+B+C+R2+G+H             R3+C+R2+G+H
                         R4+R3+R2+H
                         R4+R5+R2
                         R6+R2
                         R7

QUINTUPLES
      Op.1   OPERATOR   Op.2   START   END
R1    D      *          E      0       1
R2    F      *          R1     1       2
R3    A      +          B      0       1
R4    C      +          G      0       1
R5    H      +          R3     1       2
R6    R4     +          R5     2       3
R7    R2     +          R6     3       4

Figure 12-Parallel computation of A+B+C+D*E*F+G+H using Squire's algorithm

Baer and Bovet's algorithm
The algorithm uses multiple passes. To each pass
corresponds a level. All temporary results which can
be generated at that level are constructed and inserted
appropriately in the output string produced by the
corresponding pass. Then this output string becomes
the input string for the next level, until the whole
expression has been compiled. Thus the number of
passes will be equal to the number of levels in the
syntactic tree. During a pass the scanning proceeds
from left to right, and each operator and operand is
scanned only once.
The simple intermediate language which this algorithm produces is the most appropriate for multiprocessor compilation in that it shows directly all
operations which can be performed in parallel, namely
those having the same level number. The syntactic
tree generated by this algorithm is shown in Figure
13.
A new algorithm
This section will introduce a technique whose goals
are: (1) to produce a binary tree which illustrates the
parallelism inherent in an arithmetic expression; and

Figure 13-Parallel computation of A+B+C+D*E*F+G+H using Baer and Bovet's algorithm

(2) to determine the number of registers needed to
evaluate large arithmetic or Boolean expressions without intermediate transfers to main memory.
This technique is prompted by the fact that existing
computing systems possess multiple arithmetic units
which can contain a large number of active storages
(registers). In addition, the superior memory bandwidths of the next generation of computers will simplify
some of the requirements of this technique.
In the material presented below, a complex arithmetic expression is examined to determine its maximum
computational parallelism. This is accomplished by
repeated rearrangement of the given expression. During
this process the given expression in reverse Polish form
is also tested for "well formation", i.e., errors and
oversights in the syntax, etc.
The arithmetic expression which was used as a model
earlier will also be used here, namely A+B+C+D
*E*F+G+H. The details of the algorithm follow:
(1) The first step is to rewrite the expression in
reverse Polish form and to reverse its order.
+H+G+*F*ED+C+BA
(2) Starting with the rightmost symbol of the string,
assign a weight to each member of the string based on
the following procedure:

Assign to symbol Si the value Vi = V(i-1) + Ri,
i = 1, 2, ..., n, where Ri = 1 - O(Si), given that

O(Si) = 0 if Si is a variable
O(Si) = 1 if Si is a unary operator
O(Si) = 2 if Si is a binary operator

and V0 = 0.
Using this procedure, the following expression results:

i    15 14 13 12 11 10  9  8  7  6  5  4  3  2  1
Si    +  H  +  G  +  *  F  *  E  D  +  C  +  B  A
Vi    1  2  1  2  1  2  3  2  3  2  1  2  1  2  1

Note that for a "well-formed expression" of n symbols,
Vn = 1.
(3) At this point the root node of the proposed
binary tree can be determined. Thus the given string
can be divided into two independent substrings. To
determine the root node, draw a line to the left of the
first symbol with a weight of 1 (i = 11, Si = +, Vi = 1)
that lies to the left of the symbol with the highest weight,
Vm (i = 7, Si = E, Vi = Vm = 3). The two independent
substrings consist of the strings to the left and to the
right of this line. The root node will be the leftmost
member of the string to the left of the line (i = 15,
Si = +, Vi = 1). Note that Vi also equals 3 for i = 9;
however, Vm is chosen from the earliest occurrence of
a symbol with the highest weight.
(4) The next step is to look for parallelism within
each of the new substrings. Consider the rightmost
substring. Form a new substring consisting of the
symbols within the values of Vi = 1 to the right and to
the left of Vm. Transpose this substring with the substring to the right of it whose leftmost member has a
weight of Vi = 1. This procedure is repeated until the
initial Vm occupies the position i = 2 in the substring.
For this example this is already the case. Thus the
rightmost substring is in the proper form.
(5) The transposition procedure of step 4 is applied
next to the leftmost substring. However, since the
leftmost substring of this example consists of only two
operands and one operator, no further operations are
necessary.
(6) The resultant binary tree is shown in Figure 14.
The numbers assigned to each node represent the final
weight Vi of the symbol as determined in steps 1-5
above.
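The weight assignment of step 2 can be sketched as follows. This is a hypothetical illustration rather than the authors' implementation; the final weight Vn = 1 doubles as the well-formation test mentioned above.

```python
def weights(symbols, arity):
    """Assign Vi = V(i-1) + 1 - O(Si), scanning the reversed
    reverse-Polish string from its rightmost symbol (i = 1)."""
    v, out = 0, []
    for s in reversed(symbols):      # rightmost symbol first
        v += 1 - arity.get(s, 0)     # O(Si): 0 variable, 1 unary, 2 binary
        out.append(v)
    out.reverse()                    # report weights in string order
    return out

# Reversed reverse-Polish form of A+B+C+D*E*F+G+H:
s = list("+H+G+*F*ED+C+BA")
arity = {"+": 2, "*": 2}
V = weights(s, arity)
# V == [1, 2, 1, 2, 1, 2, 3, 2, 3, 2, 1, 2, 1, 2, 1]; the leftmost
# symbol (i = n) has weight 1, so the expression is well formed, and
# the highest weight Vm == 3 occurs at E and F.
```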

Some observations and comments on this algorithm
are given below.
(1) The two branches on either side of the root node
can be executed in parallel. Within each main branch,
the transposition procedure of step 4 yields supplementary root nodes. The sub-branches on each side of the
supplementary nodes can be executed in parallel.
(2) The number of levels in the binary tree can be
predicted from the Polish form of the original string:
No. of levels = MAX [number of 1's; Vm]
in the substring (rightmost or leftmost) containing Vm.

Figure 14-Binary tree for parallel computation of A+B+C+D*E*F+G+H
(3) The tree is traversed in a modified postorder
form.20 The resulting expression is
D*E*F+A+B+C+G+H
(4) An added feature of this technique is that the
number of registers required to evaluate this expression
without intermediate STORE and FETCH operations
is obtained directly from the binary tree. This information is provided by the highest weight assigned to
any node within the tree. Thus for this example the
expression could be evaluated using at most two
registers without resorting to intermediate stores and
fetches.
(5) This technique of recognizing parallelism on a
local level has been applied to a single instruction, in
particular, an arithmetic expression. It is worthwhile
mentioning that each variable within the expression
can itself be the result of a processable task. Thus this
technique can be extended to a higher level of parallel
stream recognition, i.e., level parallelism.
In order to implement the techniques mentioned
here for components within tasks and the techniques
mentioned earlier for individual tasks, several system
features are desirable. Schemes for detecting parallel
processable components within compound tasks are
oriented primarily toward arithmetic expressions. For
these situations string manipulation ability would be
highly desirable. Since individual tasks are represented by a graph and its matrix, the ability to manipulate rows and columns easily would be very important. In this same area, an associative memory
could greatly reduce execution time in the implementation of precedence partitions.
ACKNOWLEDGMENTS
The authors would like to thank the referees of the
FJCC for their comments and suggestions which
resulted in improvements of this paper.
REFERENCES
1 A J BERNSTEIN
Analysis of programs for parallel processing
IEEE Trans on EC Vol 15 No 5 757-763 Oct 1966
2 E W DIJKSTRA
Solution of a problem in concurrent programming control
Comm ACM Vol 8 No 9 569 Sept 1965
3 D KNUTH
Additional comments on a problem in concurrent programming control
Comm ACM Vol 9 No 5 321-322 May 1966
4 E G COFFMAN R R MUNTZ
Models of pure time sharing disciplines for resource allocation
Proc 1969 Natl ACM Conf
5 M E CONWAY
A multiprocessor system design
Proc FJCC Vol 23 139-146 1963
6 A OPLER
Procedure-oriented statements to facilitate parallel processing
Comm ACM Vol 8 No 5 306-307 May 1965
7 J A GOSDEN
Explicit parallel processing description and control in programs for multi- and uni-processor computers
Proc FJCC Vol 29 651-660 1966
8 N E ABEL P P BUDNIK D J KUCK Y MURAOKA R S NORTHCOTE R B WILHELMSON
TRANQUIL: A language for an array processing computer
Proc SJCC 57-68 1969
9 D A FISHER
Program analysis for multiprocessing
Burroughs Corp May 1967
10 C V RAMAMOORTHY
Analysis of graphs by connectivity considerations
Journal ACM Vol 13 No 2 211-222 April 1966
11 C V RAMAMOORTHY M J GONZALEZ
Recognition and representation of parallel processable streams in computer programs II (task/process parallelism)
1969 Natl ACM Conf
12 C V RAMAMOORTHY
A structural theory of machine diagnosis
Proc SJCC 743-756 1967
13 M J GONZALEZ C V RAMAMOORTHY
Recognition and representation of parallel processable streams in computer programs
Symposium on Parallel Processor Systems, Technologies and Applications, Ed. L C Hobbs, Spartan Books, June 1969
14 E C RUSSELL G ESTRIN
Measurement based automatic analysis of FORTRAN programs
Proc SJCC 1969
15 J B DENNIS
Programming generality, parallelism and computer architecture
Proc IFIP Congress 68 C1-C7
16 H HELLERMAN
Parallel processing of algebraic expressions
IEEE Trans on EC Vol 15 No 1 Feb 1966
17 H S STONE
One-pass compilation of arithmetic expressions for a parallel processor
Comm ACM Vol 10 No 4 220-223 April 1967
18 J S SQUIRE
A translation algorithm for a multiprocessor computer
Proc 18th ACM Natl Conf 1963
19 J L BAER D P BOVET
Compilation of arithmetic expressions for parallel computation
Proc IFIP Congress 68 B4-B10
20 D KNUTH
The art of computer programming, Vol 1, Fundamental algorithms
Addison-Wesley p 316
21 R S NORTHCOTE
Software developments for the array computer ILLIAC IV
Univ of Illinois Rpt No 313 March 1969

Performance modeling and empirical
measurements in a system designed for
batch and time-sharing users
by JACK E. SHEMER and DOUGLAS W. HEYING
Scientific Data Systems, A Xerox Company
El Segundo, California

INTRODUCTION
If any design goal is common to all computer system
organization schemes, it is that of providing "effective
service" both externally to the user of the computational
facility and internally with respect to utilization of
system resources. Thus, generally speaking, there are at
least two dimensions to this design objective. On the one
hand, effective service is the external satisfaction of a
broad spectrum of user demands. For example, the ideal
system might be visualized as one which economically
provides a large number of programming languages;
machine compatibility with other computers of widely
diverse hardware; and rapid computation. On the other
hand, effective service is the internal utilization of all
system components so as to increase computational
efficiency. In this respect, system structures are implemented which strive to maximize sub-system
simultaneity and system throughput. For example, a
degree of macro-parallelism is attained in many present
day systems by allowing a central processing unit (CPU)
and input/output controller to share the use of a main
memory register, thereby enabling processing and
input/output (I/O) to proceed concurrently (for one or
several independent programs, depending upon the
system software).
In general, external effectiveness is all that the user
sees, and it is therefore of primary interest to him,
whereas the purveyor of the equipment is vitally
concerned with internal utility and coordination.
However, this latter consideration indirectly relates to
the quality of service the user receives (his waiting time
for service completion, the price he is charged for
service, etc.).
The ramifications of hardware and software designs to
achieve such service can be investigated both internally
and externally; yet, a particular design strategy need
not supplement effective service from both viewpoints.
On the contrary, schemes tailored to improve external
utilization often degrade internal service effectiveness
and vice versa. Unfortunately, in confronting these
design trade-offs, the designer often had to rely upon
heuristic and intuitive arguments, since there is a
general lack of design models which quantitatively
relate system variables to reflect a priori performance
estimates. Hence, the design is complicated not only by
trade-offs between the often dissimilar aims of external
and internal effective service, but also by a deficiency of
design tools for investigating various implementation
alternatives.
These problems are especially amplified with the
advent of time-shared computer systems. In time-sharing systems, an ideal goal is to respond to interactive
on-line users such that each user receives the impression
that he has his own computer, yet at a price he can
afford. Thus in these systems, the computer complex is
shared among a number of independent users who are
concurrently communicating with the system, generating programs and interactive service requests via
on-line remote terminal equipment. This action enables
one to achieve economies of scale and distribute the cost

of the system among all users according to their usage
of the facilities. Similarly, the objective of rapid response
is realized by time slicing CPU service and sharing it
among the on-line users. A request for program execution
is not necessarily serviced to completion; rather, jobs
are granted finite intervals (quanta) of processing time.
If a job fails to exhaust its demands during a quantum
allocation, then it is truncated and postponed according
to a scheduling discipline, thereby facilitating rapid
response to short requests.1-4 This preferential treatment
of short jobs increases the programmer's productiveness,
since one-attempt efforts, editing, debugging, and other
typically short interactive demands often encounter
exorbitant turn-around times in batch processing
environments (i.e., in relation to the amount of actual
processing time consumed, due to problems of key
punching, printer output, card stacking, and total
system demand).
However, since computation is not necessarily run to
completion and main memory size is limited (by both
economic and physical reasons), programs must be
swapped into and out of main memory as the CPU
commutates its service from request to request.
Therefore, unless swapping is achieved with no loss in
time, it is obvious that service in the time-sharing sense
is less efficient in CPU utilization than service to
completion. Also, the time spent scheduling, allocating
buffers, and controlling swap input/output represents
overhead or wasted processing time which, due to
incomplete servicing, is greater in time-sharing systems
than batch processing systems. Furthermore, if the
system is dedicated to servicing on-line requests, the
CPU is essentially idle during periods of low on-line
input traffic. Hence, a design compromise must be
attained between external response rapidity and internal
efficiency since system performance, in the general case,
is a function of both response to selected classes of users
and utilization of system resources.
Yet, exploring such problem areas prior to design is
complicated, because any performance investigation is
incorrigibly statistical. Performance is not only a
function of software characteristics such as the input/
output, memory, and processing requirements of each
on-line request together with the occurrence rate of such
requests, but also dependent upon hardware characteristics such as the instruction processing rate and the rates of accessing secondary memory.
This paper presents one approach to mitigating some
of these difficulties. A system design is briefly described
and then analyzed utilizing a mathematical model. The
system is structured to accommodate both batch and
time-sharing users with the goal being to achieve a

balance of system efficiency and responsiveness. A set
of variables are defined which characterize on-line user
demands and the servicing capacity of various units
within the system. These variables are then quantitatively related in a mathematical model to derive salient
performance measures. Examples are given which
graphically display these measures versus various ranges
of the system variables. These a priori performance
estimates are then compared with empirical data
extracted from the system during its actual operation.
Here the emphasis is given to mathematical modeling
because this analysis method is more expedient and
generally less costly than the alternative approach of
simulation. Moreover, since many of the variables are
non-independent and rely upon characterization of user
demands, and since these are difficult to accurately
describe prior to actual operation, the macroscopic and
statistical indications provided by a mathematical model
are perhaps all that one can feasibly obtain.
Design and performance study

System design
The Batch/Time-Sharing Monitor (BTM) is designed to provide SDS Sigma 5 and Sigma 7 users with interactive, on-line time-sharing without disrupting batch operations. For considerations of efficiency, the primary objective of the BTM design is to provide limited time-sharing service while concentrating on throughput of batch jobs: the servicing of time-sharing users is allocated to minimize response for interactive users, with no special service given to the compute bound on-line users (because high-efficiency batch service is available).
Thus, the system is structured with resources for the
batch and time-sharing portions of the system separated
as much as possible. Different areas of main memory are
allocated so that a (compute bound) batch user is
always "ready to run." The file device is common
because files may be shared between batch and time-sharing users. However, the management technique used minimizes the interference from this factor. The swapping Rapid Access Disc (RAD) for time-sharing
users is independent of the file device, thus insuring that
swaps in process do not affect on-going batch programs.
The batch user is kept essentially compute bound by
buffering all of his unit record I/O via a RAD. This
allows the compute portion of each job to follow that
of the previous job without waiting for the printout,
etc., to complete. Thus, there is no need to attempt to reclaim swap time from one time-sharing user to another: a natural claimant, the batch job, is readily available.

Performance Modeling and Empirical Measurements
Hence, a very simple (and low overhead) swapping
and scheduling algorithm can be used. As a particular
user is dismissed, other users are polled in turn to see
who is "ready to run." If someone is found (not the
same user), a replacement swap is initiated and the
CPU is allocated to the batch job. When the swap-out/
swap-in is complete, the new user is given one quantum
(i.e., providing the batch job has already had at least its quantum); then the cycle is repeated.
In this way, batch is guaranteed a certain percentage
of the machine (and typically gets much more), and a
moderate number of time-sharing users receive rapid
response to conversational requests. Yet with this
relatively simple framework, a number of questions are
unavoidable: How do on-line response and batch
throughput vary with the number of on-line users, and
how do other variables such as quantum size and swap
time relate to system performance? Moreover, how
does one characterize system performance and the
variables which influence it?

Parameterizations and performance measures
The subject of "on-line" response is unfortunately
plagued by many interpretations of what constitutes
response (and, moreover, what defines adequate
response). For the purposes of this paper, "typical
on-line requests" are those which require minimal
central processor time, less than one quantum allocation. Thus, the response time C1 to a "typical on-line
demand" is that period elapsing between request
generation (the keying in of a control character such as
"carriage return") and the termination of the first time
quantum * which is allocated to the servicing of the
request. This definition provides the basis upon which
the on-line performance of the BTM system is analyzed
in this paper, since it is assumed that on-line users are
typically in phases of program preparation.** Thus,
providing the quantum is large enough, the great
majority of user interactions (e.g., "open the next
line," "delete source image," "perform syntax check
and insert into text," etc.) can be satisfied with single
quantum allocations.
The mathematical model developed in the Appendix
enables one to characterize the system by selecting
values for the variables:
N = total number of active on-line communication sources (i.e., the number of remote users who are concurrently using the system).

λ = average user interaction rate (frequency at which a single user requests service by the CPU).

μ = mean rate at which on-line requests are serviced by the CPU (1/μ = average amount of CPU time required to complete each request given that the CPU was dedicated to the servicing of the request).

S = the average amount of time required to swap an old user out of core and load a new user (clearly, S is dependent upon the swapping device as well as program size).

qR = time quantum allocated to on-line requests (time-sharing users).

qB = time quantum given to batch requests (background users).

m̄ = the average cumulative quantum extension (for monitor services such as scheduling, file I/O, service calls, etc.) incurred during the period elapsing between successive quantum allocations to on-line jobs.

* Also note that if the scheduling algorithm is round-robin then C1 provides a basis for approximating the response time for a request which requires multiple quanta.
** Note that this is not the case in system environments in which the on-line users run production (compute bound) programs.

To supplement analysis efforts, the BTM system
software is capable of monitoring these (and other)
variables and accumulating their statistical distributions
during actual system operation. This does not impose
any significant overhead since much of this data is
already accumulated in the accounting log, and (as in
many other commercial systems) used as a basis for
charging users.
Upon establishing reasonable values for the above
variables, the model can then be used to derive performance measures. In terms of response, the salient performance index is E[C1], where

E[C1] = the expected response time which "typical on-line demands" experience (see definition given above).
In addition, the model can readily be used to estimate
the percentage of CPU time available for batch jobs; the
percentage of CPU time received by time-sharing users;
utilization of the swapping RAD; expectations of
system revenues; and a variety of other indices obtained
from combinations of the derived parameters.
A priori estimates for some of these performance measures are given in Figures 1-5 for reasonable ranges of the variables N, λ, μ, S, qR, qB, and m̄.

[Figure 1: E[C1] vs. N for μ = 2.5 requests/sec. (1/μ = 400 ms./request); qR = 200 ms.; S = 85, 248 and 443 ms. for the 7212, 7232 and 7204 RADs; λ = 1 request/20 user-sec.; m̄ = 100 ms.; "swap limited" and "batch limited" regions marked]

[Figure 2: E[C1] vs. N for μ = 5 requests/sec. (1/μ = 200 ms./request), with the same remaining parameters]

[Figure 3: Relative batch capability: percent of CPU time available for batch jobs (Pr[B] × 100%) vs. number of concurrent users]

[Figure 4: Nmax (maximum number of concurrent users) vs. CPU speed, log scale]

Obviously, these variables will differ from one environment to another. Therefore, before discussing conclusions which can be drawn from these graphical results, it is appropriate to clarify the parameterizations and assumptions which were used in the calculations:

1. The average swap time S was conservatively calculated assuming that four RAD accesses are required per swap with an average total of 16K words transferred during each swap. (The RAD's are head-per-track rotating memories operating at 1800 rpm; and the SDS model 7204, 7232 and 7212 RADs transfer data at rates of 187 × 10^3 bytes/sec., 384 × 10^3 bytes/sec. and 3 × 10^6 bytes/sec., respectively.)

2. The user interaction rate λ was estimated from statistics gathered at RAND5 and other data extracted from the GE/Dartmouth BASIC system6 and the SDS 940 system.

3. The selection of qR = 200 ms. was established such that the majority of user interactions are satisfied with single quantum allocations. Whereas, selecting qB = 85 ms. and 200 ms. was done merely to demonstrate "swap limited" and "batch limited" operation, respectively.

4. The value of the average monitor time m̄ per on-line/batch quantum cycle was approximated utilizing batch accounting information and timing studies of monitor services.

5. Values of μ were chosen such that the average on-line quantum would be ≈125 ms. to 150 ms. when qR = 200 ms. was allocated. This selection was inferred from data extracted from the SDS 940 System and BTM code traces. (Yet, note that a single parameter μ does not provide a characterization covering the more general case in which the processing time distribution is multi-modal.† However, for purposes of studying interactive response, it provides a good approximation and lends itself to the mathematical analysis.)

† The multi-modal case arises because of a multiplicity of language facilities and the natural division of requests into interactive or compute demands.

[Figure 5: E[C1] vs. qR, the quantum allocation to on-line users, for N = 18 and qB = 85 ms. (i.e., "swap limited"); curves for μ = 2.5 and μ = 5 requests/sec. and for each RAD model]

Mathematical results

Given this framework, let us now turn our attention to the figures. Employing the mathematical model, a priori estimates of average interactive response time E[C1] are displayed versus N in Figure 1 and Figure 2 for μ = 2.5 requests/sec. and μ = 5 requests/sec., respectively. Here, three different curves are plotted in each figure to demonstrate the limiting effects of each swapping device (i.e., "swap limited" operation when the batch quantum qB is less* than the swap time S). Also, note that an additional curve is given for the model 7212 RAD to display the effects of selecting a batch quantum which exceeds the swap time (i.e., "batch limited" operation). This latter curve shows that the fastest swapping device effectively becomes a slower device when qB is set such that operation is "batch limited": the model 7212 RAD is almost equivalent to a model 7232 RAD when qB = 200 ms.

Now since N is the total number of concurrent users (active communication sources), Figures 1 and 2 enable one to estimate a value for the maximum number of users Nmax which the system can simultaneously accommodate by: (1) assuming "swap limited" operation and (2) defining what constitutes adequate response to typical on-line demands. For example, if one assumes that adequate interactive response is achieved if ≈80% of the time a user experiences a delay of less than 5 sec., then, depending upon μ, one concludes:**

i. the model 7204 RAD will accommodate a maximum of 10 to 16 concurrent users for*** μ = 2.5 requests/sec. to μ = 5 requests/sec., respectively;

ii. the model 7232 RAD will accommodate a maximum of 16 to 26 concurrent users for μ = 2.5 requests/sec. to μ = 5 requests/sec., respectively;

iii. the model 7212 RAD will accommodate a maximum of 26 to 38 users for μ = 2.5 requests/sec. to μ = 5 requests/sec., respectively.
* For this situation, the actual batch quantum allocation is the swap time S.
** These conclusions were made by assuming that the probability distribution for response time C1 is such that twice the mean E[C1] is (at least) the 80 percent point. This is a reasonable assumption in light of both the mathematical characterizations used in the model and empirical measurements.
*** Note that reducing μ from 5 requests/sec. to 2.5 requests/sec. is tantamount to reducing processing speed by a factor of 1/2.

However, the actual number of on-line users who
concurrently use the system is a statistical parameter which generally is less than Nmax and varies according to the total number of on-line subscribers, their demands, processing speed, Nmax, etc. In practice, the total number of on-line subscribers typically exceeds Nmax by at least a factor of three.

For the above cases, nominally 50-80% of the CPU time is available for batch jobs. This is shown in Figure 3. Similarly, utilizing this same response criterion, it is interesting to observe the effects of increasing**** CPU speed μ. This is demonstrated in Figure 4 for each of the swapping devices. As CPU speed increases indefinitely, the capacity of the system to service on-line requests approaches a limit established by the swapping device.

Additional insight into system responsiveness is provided by Figure 5. Here, E[C1] is graphically displayed versus the on-line user quantum qR for "swap limited" operation and N = 18 (with all other variables the same as those employed in Figures 1 and 2). Note that the selection of a minimum qR is very critical; however, having established a minimum qR, the variations are not dramatic for a relatively large range above the minimum. Also, notice that as μ is reduced from 5 requests/sec. to 2.5 requests/sec., a model 7232 RAD must be used to achieve what a model 7204 RAD accomplished in the former case; and similarly, a model 7212 RAD is required to equal the performance of a model 7232 RAD.
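The swap times assumed in the figures (85, 248 and 443 ms. for the 7212, 7232 and 7204 RADs) can be checked by direct arithmetic from assumption 1 of the parameterization. The sketch below assumes 32-bit words (4 bytes) and an average rotational delay of half a revolution per access; both are our assumptions, not stated in the paper:

```python
def swap_time_ms(rate_bytes_per_sec, accesses=4, words=16 * 1024,
                 bytes_per_word=4, rpm=1800):
    """Average swap time: per-access rotational latency plus data transfer."""
    latency_ms = accesses * 0.5 * (60_000 / rpm)      # half a revolution each
    transfer_ms = words * bytes_per_word * 1000 / rate_bytes_per_sec
    return latency_ms + transfer_ms

for rate in (3_000_000, 384_000, 187_000):            # models 7212, 7232, 7204
    print(round(swap_time_ms(rate)))
```

Under these assumptions the results come out near 89, 237 and 417 ms., within roughly six percent of the paper's figures, which suggests the published numbers rest on slightly different word-size or latency assumptions.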

Experimental results
Extensive statistics were gathered from the system
(while running typical jobs) with a twofold purpose in
mind. First, it was necessary to substantiate the validity
of the assumptions employed in the model; i.e., establish
that the chosen parameters were indeed consistent with
the actual environment. Secondly, a correlation between
empirically measured performance and the results of the
model would lend credence to the validity of the model,
and therefore allow us to extrapolate and predict
performance for other user environments and system
configurations.
The first objective was accomplished by observing a BTM system which used a model 7212 RAD for swapping with quanta qR = qB = 200 ms. Values for λ, μ, m̄ and program size were tabulated for many different observation periods. For each of these monitoring sessions different average values were obtained, but the values μ = 3.5 requests/sec., λ = 1 request/15 user-sec., S = 85 msec. and m̄ = 100 msec. were found to be quite representative of most samples. The variables μ and λ were most subject to variation and ranged from 2 to 6 requests/sec. and from 1 request/25 user-sec. to 1 request/10 user-sec., respectively. Also, the data indicated that the assumptions of exponentially distributed CPU time and request inter-arrival time provided good approximations of user demands.

Given that the first objective was satisfied, realization of the second objective is buttressed by Figure 6, which plots the average of all sampled values for two of the key performance indications (average response time E[C1] and CPU time available for batch Pr[B]) as a function of the number of users N. Upon comparing these results with the mathematical predictions (also see Figures 1-3), one can infer that (at least for the range of variables considered) the mathematical model is reasonably consistent with actual system operation.
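The kind of calculation behind the model's predictions can be reproduced from the Appendix formulas: guess Pr[R], compute p0 from the stationary distribution, recompute Pr[R] from the cycle-time ratio, and repeat until the two agree. The sketch below uses the representative values above; the simple repeated substitution is our own iteration scheme, standing in for the paper's scale-factor search:

```python
from math import factorial

def solve_btm(N, lam, mu, q_r, q_b_bar, m_bar, iters=500):
    """Fixed-point solution for p0, Pr[R] and Pr[B] (all times in seconds)."""
    pr_r = 0.5                       # initial guess for on-line service fraction
    for _ in range(iters):
        rho = lam / (mu * pr_r)
        # pn = [N!/(N-n)!] rho^n p0: the finite-source queue of the Appendix
        weights = [factorial(N) // factorial(N - n) * rho ** n
                   for n in range(N + 1)]
        p0 = 1.0 / sum(weights)
        idle = p0 / (N * lam)        # expected idle wait per cycle, weighted by p0
        cycle = q_r + q_b_bar + m_bar + idle
        pr_r = q_r / cycle           # fraction of time serving on-line requests
    pr_b = (q_b_bar + idle) / cycle  # batch also absorbs the idle periods
    return p0, pr_r, pr_b

p0, pr_r, pr_b = solve_btm(N=18, lam=1 / 15, mu=3.5,
                           q_r=0.2, q_b_bar=0.085, m_bar=0.1)
```

The iteration is a damped oscillation toward the fixed point, so a few hundred substitutions suffice for the parameter ranges considered here.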
[Figure 6: Empirical results: measured percentage of CPU time available for batch jobs (Pr[B] × 100%) and sampled E[C1] vs. N, with the predictions obtained from the model]

**** Note that this latitude is only possible on a limited basis (e.g., code optimization, faster memory, faster operation unit, multi-processing, etc.)

Comments

The analysis presented above primarily focused attention on the system's capacity to accommodate user demands. Even though no mention was given to cost/performance tradeoffs, the model lends itself to this latter design consideration. For example, the variables N, Pr[B], and μ might be combined to reflect the revenue derived for service to batch jobs and the revenue obtained for servicing interactive users, which could then be weighted against the cost expended to
provide (and maintain) the system complement. This
would provide a basis for the designer to balance CPU
cost/performance with that of other system elements.
The process of selecting and examining performance indexes similar to those discussed here enables the designer to better appraise the many implementation tradeoffs which confront him. Moreover, when supplemented with empirical data, these techniques provide a basis for not only configuring existing systems but also synthesizing new systems. However, it should be emphasized that apart from the mathematical model itself and its macroscopic treatment of the system, the fidelity of the results and conclusions obtained in this analysis (or any analysis of this sort) can only be as good as the accuracy attributed to the independent variables (N, λ, μ, m̄, S). The values possessed by these variables dramatically affect performance and will vary from one environment to another. Therefore, one should be cautious before inferring any explicit and universal characterizations of system performance.

REFERENCES

1 B. KRISHNAMOORTHI, R. C. WOOD
Time-shared computer operations with both interarrival and service times exponential
J A C M Vol 13 317-338 July 1966
2 E. G. COFFMAN JR
Stochastic models of multiple and time-shared computer operations
Report 66-38 Dept of Eng Univ of Calif Los Angeles June 1966
3 L. KLEINROCK
Time-shared systems: A theoretical treatment
J A C M Vol 14 242-261 April 1967
4 J. E. SHEMER
Some mathematical considerations of time-sharing scheduling algorithms
J A C M Vol 14 262-272 April 1967
5 G. E. BRYAN
JOSS: 20,000 hours at a console - a statistical summary
Proc F J C C 769-777 1967
6 H. CANTRELL
Time-sharing data
General Electric Technical Information Series Report R65CD12 December 1965
7 T. L. SAATY
Elements of queueing theory
McGraw-Hill New York 1961

APPENDIX

BTM mathematical model

Consider that the generation of on-line requests on each communication channel is an exponential process with parameter λ. Hence, the time interval x between completion of a request and generation of a new request on a given line is described by the distribution function

A(x) = 1 - e^(-λx)  for x ≥ 0
A(x) = 0            for x < 0

Similarly, assume that the service time t required by each on-line request is exponentially distributed with parameter μ and characterized by the distribution function

B(t) = 1 - e^(-μt)  for t ≥ 0
B(t) = 0            for t < 0

Given that there are N channels, let pn(t) denote the probability that n on-line requests are queued at an arbitrary time t, for n = 0, 1, ..., N; then

dp0(t)/dt = -Nλ p0(t) + μ Pr[R(t)] p1(t)                   for n = 0
dpn(t)/dt = -[(N - n)λ + μ Pr[R(t)]] pn(t)
            + (N - n + 1)λ pn-1(t) + μ Pr[R(t)] pn+1(t)    for 0 < n < N
dpN(t)/dt = -μ Pr[R(t)] pN(t) + λ pN-1(t)                  for n = N

where Pr[R(t)] denotes the probability that at time t the computer is servicing one of the remotely generated on-line requests. Note that in the above equations, the input rate is (N - n)λ when n requests are queued. Thus the model accounts for the natural variations in demand intensity which result because there are a finite number N of input sources.

From these equations, the stationary probability7 that n on-line requests are queued is

pn = [N!/(N - n)!] (λ/(μ Pr[R]))^n p0

where Pr[R] = limit(t→∞) Pr[R(t)] and

p0 = 1 / [1 + Σ(n=1 to N) [N!/(N - n)!] (λ/(μ Pr[R]))^n]

The probability Pr[R] can be estimated by considering
the interval which elapses between successive allocations of a quantum to on-line users. Let Tk denote the total time between the 0th on-line quantum completion and the kth on-line quantum completion. If the kth completion leaves the on-line queue in an empty state, then the expected value of the time ΔTk until the next on-line quantum completion is

E[ΔTk] = qR + q̄B + m̄ + (1/Nλ)

In the case when the kth on-line quantum completion does not leave the interactive user queue empty, then with probability (1 - p0)

E[ΔTk] = qR + q̄B + m̄

The variables q̄B and qR are heavily influenced by quantum periods and swap time. If one assumes that (with the exception of a batch quantum allocation every other quantum) on-line jobs run on a demand basis (i.e., the batch quantum qB is less than the swap time S), then q̄B = S. Hence, the swap time limits the rate at which successive quantum allocations are provided to the on-line requests (i.e., maximum service capacity is given to on-line requests). Whereas, if the batch quantum limits the servicing of on-line requests (qB > S), then q̄B = qB. Therefore, for completeness,

q̄B = qB  if S < qB
q̄B = S   if S ≥ qB

Now let TB, TR, and Tm denote respectively the length of time out of Tk which the system spends servicing batch jobs, on-line jobs, and monitor functions. Then as k goes to infinity, the ratios TB/k, TR/k, and Tm/k converge with probability one to (q̄B + p0/Nλ), qR, and m̄, respectively. Therefore, in the limit, an approximation to the fraction of the time which the system spends servicing on-line requests is

Pr[R] = limit(k→∞) [TR/Tk] = qR / (qR + q̄B + m̄ + p0(1/Nλ))

Here, f denotes an appropriate scale factor introduced to facilitate solving for {pn}, n = 0, 1, ..., N. The numerical technique is to let f increase by some small Δf until a solution for p0 is obtained which is consistent with Pr[R]. The variable f satisfying this criterion will vary dramatically depending upon N, m̄, μ, λ and qB.

Upon solving for p0, the percentage of CPU time available for batch jobs is

Pr[B] = (q̄B + p0(1/Nλ)) / (qR + q̄B + m̄ + p0(1/Nλ))

where q̄B is the average quantum which batch users receive; qR is the expected duration of an on-line (remote user) quantum; (1/Nλ) is the mean time until the generation of the next on-line request; and m̄ is the expected monitor overhead time per batch/on-line quantum cycle. Here, m̄ accounts for any scheduling, I/O overhead, file operations, and any other CPU time pre-empted by the monitor which results during the cycle of a quantum allocation to a batch job followed by a quantum allocation to an on-line job.

The expected number of queued on-line requests is

E[n] = Σ(n=0 to N) n pn

and E[T0] is the expected time remaining subsequent to
the arrival of an on-line request before the next quantum allocation is initiated. The value of E[T0] is difficult to accurately express since it is a function of the probability densities for qB and m̄ together with machine state probabilities; however, it is clear that

0 ≤ E[T0] ≤ qR + q̄B + m̄

At any rate, E[T0] is not a dominant factor in E[C1] unless E[C1] is extremely small (i.e., E[C1] ≈ qR + E[T0], for example). Hence, the precise value of E[T0] is not critical in those cases which are of particular interest (namely, those resulting when the on-line queue tends toward saturation; i.e., E[n] ≈ N).

In addition to the above result for E[C1], since the scheduling discipline is round-robin, it is possible to estimate2-4 the expected total response time E[R|t] for an on-line request which requires a processing time t in excess of a single quantum qR:

E[R|t] ≈ t + ⟨(t - qR)/qR⟩ [E[C1] - (p0 E[T0] + qR) + q̄B + m̄]

where ⟨a/b⟩ is the smallest integer greater than a/b.

Alternate model

Let pmn(Tk) denote the probability that n on-line requests are queued at epoch Tk marking the completion of the kth on-line quantum allocation, given that at epoch Tk-1 there were m on-line requests awaiting service from the system.1,2 Then, independent of k, since the CPU servicing of requests is characterized as an exponential process,

pmn = ∫(0 to γ+qR-ε) Pr[n - m + 1 | m, t] pB+R(t) dt    for 1 ≤ m ≤ n

pmn = 0    for n ≤ m - 2; m ≥ 1

pmn = ∫(0 to γ+qR-ε) Pr[0 | m, t] pB+R(t) dt    for n = m - 1

where ε → 0 and Pr[k | m, t] denotes the conditional probability of generating k new on-line requests in a time interval t given that m requests are queued. For example, with exponential inter-arrivals from the N - m idle sources,

Pr[k | m, t] = [(N - m)!/(k!(N - m - k)!)] (1 - e^(-λt))^k e^(-(N - m - k)λt)

Also, in the above equations,

γ = Smax  if service to on-line customers is swap limited (i.e., qB < S)
γ = qB    if the batch quantum limits on-line service (i.e., qB ≥ S)

Here, pB denotes the probability density function which describes the batch quantum allocation, and pB+R is the convolution of pB with the density function pR defining the distribution of an on-line quantum allocation. Both pB and pR include overhead functions to account for file I/O, monitor overhead, etc.

The density function pB is derived from the swap time distribution when qB < S; whereas, it depicts the CPU servicing of batch requests when S < qB. For example, in the latter case with δ(z) representing the Dirac delta function describing an independent variable z, one could characterize the constant batch allocation interval by

pB(t) = δ(t - (γB + qB))

where the constant γB reflects batch overhead. Similarly, letting γR denote the overhead incurred during an on-line quantum allocation,

pR(t) = μe^(-μt) + e^(-μqR) δ(t - (qR + γR))    for γR ≤ t ≤ γR + qR
pR(t) = 0    for t ≤ γR or t > γR + qR

For completeness, the transitions from the 0-state are assumed to be of the same form as those from the 1-state (i.e., p0n = p1n).

Then, having formulated the state transitions {pmn} and defined the density functions pB(t) and pB+R(t), the problem remains to solve for the steady-state probabilities. This is accomplished by noting that the pmn's define an ergodic Markov chain, whereby in matrix form with P = (pmn) there exists a unique set of numbers {pn}, n = 0, ..., N, such that

p = pP

and

Σ(n=0 to N) pn = 1

The solution of these equations produces the limiting stationary probabilities {pn}, n = 0, ..., N, which could be used in calculating E[n] to provide a more accurate estimate of E[C1]. (That is, providing one can accurately describe pB, pB+R, λ, etc.)
However, since the accuracy of such variables would be highly questionable in the absence of any empirical information, and since this latter model presents a number of non-trivial mathematical difficulties, it was not utilized to derive the results given in this paper. Yet, in the future, as sufficient data is accumulated from the actual operation of BTM systems, the latter model will enable us to extrapolate and better predict the effects of alterations to the system (e.g., improvements resulting from faster swapping devices or increases in CPU speed).
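If the alternate model is pursued numerically, the stationary vector of the transition matrix P = (pmn) can be obtained by simple power iteration once the entries have been tabulated from the densities above. A generic sketch follows; the 3-state matrix is made up for illustration, not derived from the BTM densities:

```python
def stationary(P, iters=2000):
    """Solve p = pP with sum(p) = 1 for a row-stochastic matrix P
    by repeated left-multiplication (power iteration)."""
    n = len(P)
    p = [1.0 / n] * n
    for _ in range(iters):
        p = [sum(p[m] * P[m][j] for m in range(n)) for j in range(n)]
    total = sum(p)
    return [x / total for x in p]

P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
print(stationary(P))   # → approximately [0.25, 0.5, 0.25]
```

For an ergodic chain the iteration converges geometrically at a rate set by the second-largest eigenvalue of P, so a few thousand multiplications are ample for the small state spaces (n ≤ N) arising here.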
ACKNOWLEDGMENT

The authors are indebted to M. Leavitt, D. Cumming, J. Doeppel, T. Martin and G. E. Bryan for their many contributions to the BTM design effort and also wish to extend thanks to all those other individuals at Scientific Data Systems who helped to make this project possible. In particular, the authors are grateful to D. Cota, E. Maso and Dr. R. Spinrad for their guidance in these efforts.

Dynamic protection structures
by B. W. LAMPSON
Berkeley Computer Corporation
Berkeley, California

INTRODUCTION

A very general problem which pervades the entire field of operating system design is the construction of protection mechanisms. These come in many different forms, ranging from hardware which prevents the execution of input/output instructions by user programs, to password schemes for identifying customers when they log onto a time-sharing system. This paper deals with one aspect of the subject, which might be called the meta-theory of protection systems: how can the information which specifies protection and authorizes access itself be protected and manipulated. Thus, for example, a memory protection system decides whether a program P is allowed to store into location T. We are concerned with how P obtains this permission and how he passes it on to other programs.

In order to lend immediacy to the discussion, it will be helpful to have some examples. To provide some background for the examples, we imagine a computation C running on a general multi-access system M. The computation responds to inputs from a terminal or a card reader. Some of these look like commands: to compile file A, load B and print the output double-spaced. Others may be program statements or data. As C goes about its business, it executes a large number of different programs and requires at various times a large number of different kinds of access to the resources of the system and to the various objects which exist in it. It is necessary to have some way of knowing at each instant what privileges the computation has, and of establishing and changing these privileges in a flexible way. We will establish a fairly general conceptual framework for this situation,

and consider the details of implementation in a specific system.

Part of this framework is common to most modern operating systems; we will summarize it briefly. A program running on the system M exists in an environment created by M, just as does a program running in supervisor state on a machine unequipped with software. In the latter case the environment is simply the available memory and the available complement of machine instructions and input/output commands; since these appear in just the form provided by the hardware designers, we call this environment the bare machine. By contrast, the environment created by M for a program is called a virtual or user machine.6 It normally has less memory, differently organized, and an instruction set in which the input/output at least has been greatly changed. Besides the machine registers and memory, a user machine provides a set of objects which can be manipulated by the program. The instructions for manipulating objects are probably implemented in software, but this is of no concern to the user machine program, which is generally not able to tell how a given feature is implemented.
The basic object which executes programs is called
a task or process;6 it corresponds to one copy of the
user machine. What we are primarily concerned with
in this paper is the management of the objects which
a process has access to: how are they identified, passed
around, created, destroyed, used and shared.
Beyond this point, three ideas are fundamental to
the framework being developed:
1. Objects are named by capabilities, which are
names that are protected by the system in the


Fall Joint Computer Conference, 1969

sense that programs can move them around but
not change them or create them in an arbitrary
way. As a consequence, possession of a capability can be taken as prima facie proof of the
right to access the object it names.
2. A new kind of object called a domain is used to
group capabilities. At any time a process is
executing in some domain and hence can exercise
the capabilities which belong to the domain.
When control passes from one domain to another (in a suitably restricted fashion) the capabilities of the process will change.
3. Capabilities are usually obtained by presenting
domains which possess them with suitable
authorization, in the form of a special kind of
capability called an access key. Since a domain
can possess capabilities, including access keys,
it can carry its own identification.
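These three ideas can be made concrete with a small sketch in modern pseudocode (Python). The class and method names here are our own inventions for illustration, not part of the Model I:

```python
# A toy model of capabilities, domains, and access keys.
# All names here (Capability, Domain, grant, ...) are illustrative only.

class Capability:
    """A protected name for an object. User code may hold and pass
    these around, but only the supervisor constructs or alters them."""
    def __init__(self, obj_type, value):
        self.obj_type = obj_type   # e.g., "file", "domain", "access key"
        self.value = value         # e.g., the disc address of a file index

class Domain:
    """A grouping of capabilities. A process executing in a domain
    may exercise exactly the capabilities the domain holds."""
    def __init__(self):
        self.capabilities = []

    def grant(self, cap):
        self.capabilities.append(cap)

    def can_access(self, cap):
        # Possession is taken as prima facie proof of the right of access.
        return cap in self.capabilities

# An access key is itself a capability, so a domain can carry
# its own identification by holding one.
key = Capability("access key", "project-alpha")   # hypothetical key
d = Domain()
d.grant(key)

f = Capability("file", 0x1A2B)   # hypothetical disc address
d.grant(f)
assert d.can_access(f) and d.can_access(key)
```

A process running in d may exercise f and present key as authorization; a process running in a fresh domain can do neither.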
A key property of this framework is that it does not
distinguish any particular part of the computation. In
other words, a program running in one domain can
execute, expand the computation, access files and in
general exercise its capabilities without regard to who
created it or how far down in any hierarchy it is. Thus,
for example, a user program running under a debugging
system is quite free to create another incarnation of
the debugging system underneath him, which may in
turn create another user program which is not aware
in any way of its position in the scheme of things. In
particular, it is possible to reset things to a standard
state in one domain without disrupting higher ones.
The reason for placing so much weight on this property is two-fold. First of all, it provides a guarantee
that programs can be glued together to make larger
programs without elaborate prearrangements about
the nature of the common environment. Large systems
with active user communities quickly build up sizable
collections of valuable routines. The large ones in the
collections, such as compilers, often prove useful as
subroutines of other programs. Thus, to implement
language X it may be convenient to translate it into
language Y, for which a compiler already exists. The X
implementor is probably unaware that Y's implementation involves a further call on an assembler. If the
basic system organization does not allow an arbitrarily
complex structure to be built up from any point, this
kind of operation will not be feasible.
The second reason for concern about extendibility
is that it allows deficiencies in the design of the system
to be made up without changes in the basic system
itself, simply by interposing another layer between the
basic system and the user. This is especially important

when we realize that different people may have different
ideas about the nature of a deficiency.
We now have outlined the main ideas of the paper.
The remainder of the discussion is devoted to filling
them out with examples and explanations. The entire
scheme has been developed as part of the operating
system for the Berkeley Computer Corporation Model
I. Since many details and specific mechanisms are
dependent on the characteristics of the surrounding
system and underlying hardware, we digress briefly
at this point to describe them.
Environment

The BCC Model I is an integrated hardware and software system designed to support a large number (up to
500) of time-sharing users. This system consists of
two central processors, several small processors, a large
central (core and integrated circuit) memory, and rotating magnetic memory. The latter contains more than
500 × 10⁶ bytes, including approximately 12 × 10⁶ bytes
of drum having a transfer rate of more than 5 × 10⁶
bytes per second.
The hardware allows each process more than 512k
bytes of virtual memory. The central processors can
accommodate operands of various sizes including 48-
and 96-bit floating point numbers. The addressing
structure allows characters, part-word fields and array
elements to be referenced directly. The subroutine-calling instruction passes parameters and allocates
stack space automatically. System calls are handled
exactly like ordinary function calls; when arrays or
labels are passed to the system they are checked automatically by the hardware so that they can be used
by the system without further ado.
The memory management system organizes memory
into pages. A page is identified by a 48-bit unique name
which is guaranteed different for each page ever created
in the system. Tables are maintained in the central
memory which allow the page to be found in the various
levels of the memory system. These tables are automatically accessed by the address mapping hardware
the first time the page is referenced after the processor
starts to run a new process. Thereafter its real core
address is kept in fast registers. It is therefore unnecessary for any program other than a small part of the
basic system to be concerned about the location of a
page in the memory system; when it is referenced, it
will be brought into the central memory if it is not
already there. Extensive facilities are provided, however, to allow a process to control the level in the memory hierarchy of the pages it is interested in. The work
of managing the memory is done by a processor with

Dynamic Protection Structures
read-only program memory and data access to the
central memory; this processor has a 100 ns cycle
time, so that it can handle the large amount of computing required to keep up with demands placed on
the memory system. Another small processor handles
the remote terminals, which are multiplexed in groups
of 20 to 100 at remote concentrators and brought
into the system over high-speed lines.
Pages are grouped into files, which are treated as
randomly addressable sequences of pages. The only
mechanism provided to access the data in a file is to
put a page of the file into the virtual memory of a
process. Files and processes are named and have protection information associated with them.
Domains in action

Before plunging into a detailed analysis of capabilities and domains, we will look at some of the practical situations which these facilities are designed to
serve. They all have the same general character: several
programs with different privileges exist. Each program
corresponds to one domain. Some of the domains control others, in the sense that the capabilities of a controlled domain are a subset of those of its controlling
domain. As a first example, consider the command
process CP of an operating system. This program
accepts a command, perhaps from a remote terminal,
and attempts to recognize it as a call on a program X
which CP knows about. If it succeeds, CP calls on X for
execution, passing it any parameters which were included in the command. To do this, CP must set up
a suitable environment for X to function in. In particular, enough memory must be provided for X to
run, X must be loaded properly, and suitable input/
output must be available. When X is finished, it will
return and CP can process a new command.
The key point is that we want CP to be protected
from X, to ensure that the user's commands continue
to be processed even if X has bugs. In particular, we
want to be sure that

1. X does not destroy CP's memory or files, so
that CP can continue to run when X returns.
2. CP can stop X if it goes wild. Usually we want
the ability to set a time limit and also to intervene from the terminal.

In other words, we want CP and X to run in separate
domains, as illustrated in Figure 1 (since this is an
informal discussion, we do not trouble to distinguish
carefully between the program X and the domain in
which it runs). Here we have shown the call from CP
to X in two forms: in the picture on the right, and as
a return capability in X. The reason for the capability
is that X cannot return with a simple branch operation, since it would then be able to start CP running
at any point, which would destroy the protection.

[Figure 1 - A command processor and its command: domain CP, the command processor, holds capabilities for command input, command output, and a directory of commands; it calls domain X, the command, which holds a return capability to CP]

Suppose now that we want to allow X to get additional commands executed. X might, for example, be a
Fortran compiler whose output must be passed
through an assembler. A simple way to do this is to
put the assembler input on a file called, say, FORTRANTEMP, and issue the command

ASSEMBLE FORTRANTEMP, BINARY

This command is just a string, which can easily be
constructed by the compiler X. To get it executed,
however, X must be able to call CP. This situation
is illustrated in Figure 2; note the call capability in X,
which is quite different from the return capability.
We are ignoring for the moment the question of how
CP knows that X is authorized to call the assembler.

[Figure 2 - A recursive command processor: domain CP holds command input, command output, and a directory of commands; domain X holds a return capability to CP and a capability to call CP; a second command Y, called through CP, holds a return capability to CP]

If the idea of the preceding paragraph is pursued, it
suggests the value of being able to switch the source
of command input and the destination of command
output in a flexible way. By these terms we mean the

traffic between a program and the entity by which it
is directed. In a time-sharing system this is normally
a terminal at which the user is sitting; in a non-interactive system it will be a file of control cards. It is
often desirable, however, to switch between the two,
so that routine processing can be done automatically
when the user's attention is elsewhere, yet he can
regain control when things go awry. Again, it is not
uncommon to wish to capture a complete record of a
conversation between user and machine for later
analysis and replay. More radical, it may be of interest
to replace the user at his terminal with a program
which can manipulate the strings of characters which
constitute commands and responses. In this way major
changes in the external appearance of a system can
be obtained with little effort.
All of these things can be accomplished by giving
interactions with the command I/O device the form of
calls to a different domain which acts as a switch. A
generalization to include the possibility of different
command devices for different domains is easy. Thus,
a user may initiate a program in a domain X which,
while continuing to communicate with him, starts a
subsidiary domain Y and feeds it commands. The subsidiary, unaware of the way in which it is being driven,
may iterate the process by creating Z. The key fact
which makes it all work is the isolation of one domain
from others. Thus, Y may decide to close all its files
without disturbing X, since Y has no way of even
knowing about X's files, much less accessing them. Z,
on the other hand, can be an open book to Y. Various
aspects of the situation are illustrated in Figure 3.

[Figure 3a - Switchable control I/O: the domains. Command processors CP1 and CP2, a macro command domain MC, and a user program X each call the domain CIO, which switches the control I/O; CIO holds a directory of commands, capabilities to call CP1, CP2, and MC, and the corresponding return capabilities]

[Figure 3b - Switchable control I/O: the calls. The top-level command processor initiates a command MC which wants to drive another command processor with some pre-stored or computed input. It therefore creates another CP and calls it, telling CIO to use MC for its I/O. The lower CP is given a command to call the user program X. This program needs input, which it gets by calling CIO, the domain which is switching the control I/O. CIO calls the current input source, which is MC]
This section concludes by analyzing a problem of
great practical importance: how to construct a debugging system. This example is a good source of insights
into the facilities required of a protection system because of the great variety of things which can be expected to go wrong during debugging. There are two
domains, one for the debugger D and one for the program X being debugged. We of course want D to be
protected from X. Equally important, we want X to
be completely open to D, so that every object accessible
to X is also accessible to D, and furthermore that D
can find all the objects accessible to X as well as access
them. Otherwise D will not be able to find out what X
has done or to undo any damage. Furthermore, we
want D to be able to imitate any actions which X
can take, so that D can create suitable initial conditions
for debugging parts of X. Thus, D needs operations
which, given a capability for X, allow D to

find all the capabilities in X
copy capabilities between D and X
destroy capabilities in X
enter X at any point with any machine state
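A minimal sketch of these four operations, in illustrative Python (the names are hypothetical; on the Model I they would be supervisor operations exercised through D's capability for X):

```python
# Toy sketch of a debugger domain D controlling a debuggee domain X.
# All names here are our own; they are not Model I operations.

class Domain:
    def __init__(self, name):
        self.name = name
        self.capabilities = {}     # unprotected name -> object

def find_capabilities(x):
    """Enumerate all the capabilities in X."""
    return list(x.capabilities.keys())

def copy_capability(src, dst, name):
    """Copy a capability between D and X (either direction)."""
    dst.capabilities[name] = src.capabilities[name]

def destroy_capability(x, name):
    """Destroy a capability in X, e.g., to undo damage."""
    del x.capabilities[name]

def enter(x, location, machine_state):
    """Enter X at any point with any machine state."""
    x.location, x.machine_state = location, machine_state

d, x = Domain("D"), Domain("X")
x.capabilities["scratch-file"] = object()
copy_capability(x, d, "scratch-file")   # D inspects what X has been doing
destroy_capability(x, "scratch-file")   # ... and can take it away again
enter(x, "breakpoint-7", {"pc": 0})     # hypothetical entry for debugging
```

With these operations a breakpoint in X becomes simply a call on D, as the text goes on to note.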

With these powers, D can also handle domains which
X has created, since it can get hold of X's capabilities
for them. Breakpoints can be inserted in X in the
form of calls on D.

[Figure 5(a) table: capabilities A through F listed with NAME, TYPE, and VALUE fields, each carrying one ownership bit per domain]

Domains and capabilities

The nature of capabilities
As we have already said, a capability is a protected
name of an object. When any object is created, a
capability is created to name it; without the capability
the object might as well not exist, since there is no
way to talk about it. The capability may be thought
of as an ordinary data item enclosed in a box which
prevents tampering with the contents. Thus, for example, it may be convenient to make a capability for
a file consist of simply the disc address of its index.
This is entirely satisfactory, since programs which
handle the capability cannot modify it. If they could,
disaster would ensue, since any program could put
any desired disc address into a file capability, and
there would be no protection at all. If the machine
hardware allows a word to be tagged so that it cannot
be modified except by the supervisor, then we have
precisely what we want for a capability. The situation
is illustrated in Figure 4. It should be possible to load
and store such a word (including the tag bits) in order
to give programs the necessary freedom to manipulate
the names of the objects they are working with.
If this kind of hardware is not available a different
and potentially confusing implementation is required.
The potential can be kept from realization by referring
back to the "pure" implementation of the last paragraph. What is required is to hide the capabilities
away in the supervisor and provide programs with
unprotected names which can be used to refer to them.
When a program running in domain D presents one
of these names, it is necessary to check that it actually
names a capability which belongs to D. This can easily
[Figure 4 - Structure of a capability: TAG, TYPE, and VALUE fields; TAG = read-only except to supervisor, TYPE = FILE, VALUE = disk address of index]

[Figure 5 - Capabilities and unprotected names: (a) capabilities grouped, with bits for ownership; (b) capabilities separate for each domain]

be done, if there are n such capabilities, by using
numbers between 1 and n for the names.3 An attractive
alternative, if domains can be grouped into larger units
which share many capabilities, is to number the
domains from 1 to i and the entire collection of capabilities from 1 to n and to attach a string of i bits to
each capability. Bit d is on exactly when the capability
belongs to domain d. Figure 5 illustrates.
A somewhat more expensive implementation is to
search a table associated with the domain whenever
an unprotected name is used. This scheme shares with
the bit-string idea the advantage that it is easy for
different domains to use the same names for the same
object.
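The bit-string scheme lends itself to a short sketch (illustrative Python; the numbers of domains and capabilities, and the names, are made up):

```python
# Capabilities are numbered 1..n and domains 1..i. Each capability
# carries a string of i bits; bit d is on exactly when the capability
# belongs to domain d. An unprotected name is just an index.

N_DOMAINS = 4      # i, chosen arbitrarily for the sketch

class Capability:
    def __init__(self, value):
        self.value = value
        self.owner_bits = [False] * N_DOMAINS

# hypothetical supervisor table of all capabilities (names 1..n)
capabilities = [Capability(v) for v in ("file-A", "file-B", "page-7")]
capabilities[0].owner_bits[1] = True      # capability 1 belongs to domain 2
capabilities[2].owner_bits[1] = True      # capability 3 belongs to domain 2

def exercise(domain, name):
    """Check that the unprotected name actually names a capability
    belonging to the presenting domain before honoring it."""
    cap = capabilities[name - 1]
    if not cap.owner_bits[domain - 1]:
        raise PermissionError("capability does not belong to this domain")
    return cap.value

assert exercise(2, 1) == "file-A"
# exercise(3, 1) would raise PermissionError: domain 3 does not own it
```

Note that two domains sharing a capability simply both have their bit on, so they use the same name for the same object, the advantage mentioned above.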
There are capabilities for all the different kinds of
objects in the system. On the Model I these are
files
pages of memory
processes
domains
interrupt calls
terminals
access keys

Domains and memory
The nature of a domain is considerably more dependent on the underlying system than is the case
for capabilities, mainly because of the treatment of
memory. From a purist's viewpoint, every access to a

memory word is an exercise of a capability for that
word. A more moderate position, and one which is
quite feasible on suitable hardware, is to view each
access as the exercise of a capability for a segment
which contains the word.2 The mapping hardware
which implements segmentation is thus viewed as part
of the capability system, and a satisfying unity of
outlook is gained. Since a segment is identified by
number, the preceding section applies. We shall not
consider the formidable difficulties which arise if different domains use different names for the same segment.
If segments are accessed through capabilities like
everything else, then a domain consists of nothing more
than a collection of capabilities. On machines not
equipped with the proper hardware a domain has an
address space as well. In the Model I this is a list of
the pages which occupy each of the 64 slots for pages
in the 128k memory which is accessible to a user program.
It is also necessary to deal with the fact that the
hardware does not allow one domain to access the
address space of another one directly. This fact is of
great importance when we consider how data is passed
back and forth between domains, since it implies that
arrays cannot be passed simply by specifying their
addresses. It is therefore extremely convenient to include as part of a call the ability to pass scalar data
items, and essential to include the ability to pass capabilities. From this foundation arbitrarily complex communication can be built, since capabilities for pages,
files and domains can be passed. Thus, if an array needs
to be passed as a parameter, it is sufficient to pass
capabilities for the pages or file containing the array,
together with its base address and length. The called
domain can then put the pages into its address space
and access the array. This is of course much less convenient than passing an entire segment as a parameter,
but it is quite workable.
An alternative approach is to organize the hardware
so that the address space of one domain is a subset of
that of another. This eliminates all problems when the
smaller one calls the larger, although it does not help
at all when we want to share only part of the address
space. A subset organization fits well with a linear or
"ring"-like system4 in which the domains are numbered,
and the capabilities of domain i are a subset of those
of domain i-1. As we shall see, there are good reasons
for wanting a more flexible scheme, but for a great
many applications a linear ordering is quite satisfactory.
To allow these to be handled more efficiently, the
Model I hardware breaks the address space of a process
into three rings:

monitor
utility
user

in decreasing order of strength. The hardware enforces
a restriction that addressing cannot go into a higher
ring. It also provides protected entry points into the
utility and monitor rings and automatically checks
addresses passed into these rings as parameters to
ensure that they are legal in the ring from which they
came.
This simple hardware-implemented structure permits
three domains to transfer control around among each
other and to address each other's memory in a very
convenient and efficient way. The price paid is a rigidity in structure, and a drastic incompatibility with
the main, software-implemented domain mechanism.
The incompatibility is resolved by requiring a change
in ring to be reported to the software, except when the
only processing to be performed before returning to the
original ring can be done with the capabilities of the
original ring. Short calls thus remain cheap, while the
overhead added to longer ones is not excessive.

Domains and processes

The relationship between domains and processes is
another area greatly influenced by the surrounding
system. The logical nature of the two kinds of object
allows a great deal of freedom: in fact, a domain has
much the same appearance to a process that a segment
of memory does. The storage for capabilities provided
by a domain can accommodate many processes, and a
single process can switch from one domain to another
(subject to restrictions which are considered in the
next section).
In the Model I, however, storage is allocated in 2k
pages, and one of these, called the context block, is
used to hold the system-maintained private data for
each process. The cost of having a process is thus high,
and there is considerable incentive to minimize the
number of processes; usually one is enough per computation, if advantage is taken of the interrupt facilities
described later. When the usage of space in the context
block is analyzed, it turns out that there are only two
items which would have to be duplicated to allow
several processes to run with the same address space.
These are a 14-word machine state and a stack used
for local storage when the supervisor is executing in
the process. This stack has a minimum of about 60
words and can grow to several hundred words at certain
points during supervisor execution. It is therefore the

main barrier to the existence of cheap processes. The
problem can be greatly alleviated by allocating stack
space dynamically at each function call and releasing
it at each return, but this would require some major
changes in system organization.
Although processes are expensive, domains are quite
cheap, since the bit-string method is used to assign
capabilities to domains. Each process in the Model I
can have about a dozen domains associated with it.
The process can run in any of its associated domains
but in no others. This implies that two processes never
run in the same domain.
In a system in which processes are cheap, it is possible
to take an entirely different approach which encourages
the creation of processes for every purpose. In such a
system, parallel processing is of course greatly facilitated. In addition, free creation of processes can be
used to give a somewhat different form to many of
the facilities described in this paper.3
It is perhaps worthwhile to point out that a machine
whose addressing is not organized around a stack or
base registers cannot reasonably run several processes
out of the same domain unless they are executing totally disjoint code, because of the problem of address
conflicts.
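The array-passing convention described under "Domains and memory" (pass capabilities for the containing pages together with a base address and length) can be sketched as follows. This is illustrative Python; the names are invented, and only the 2k page size comes from the text:

```python
# Passing an array between domains whose address spaces are disjoint:
# the caller passes capabilities for the pages holding the array plus
# its base address and length; the callee maps the pages and reads it.

PAGE_SIZE = 2048   # the Model I allocates storage in 2k pages

class Page:
    def __init__(self):
        self.words = [0] * PAGE_SIZE

class Domain:
    def __init__(self):
        self.address_space = []          # pages this domain has mapped

    def map_page(self, page_cap):
        self.address_space.append(page_cap)

def call_with_array(callee, page_caps, base, length):
    """Hypothetical calling convention: scalar data (base, length)
    and page capabilities cross the domain boundary; addresses do not."""
    for cap in page_caps:                # callee puts the pages into
        callee.map_page(cap)             # its own address space
    return base, length                  # callee can now address the array

caller_page = Page()
caller_page.words[100:103] = [7, 8, 9]   # the array to be passed
callee = Domain()
base, length = call_with_array(callee, [caller_page], 100, 3)
assert callee.address_space[0].words[base:base + length] == [7, 8, 9]
```

As the text observes, this is clumsier than passing a segment, but it requires nothing beyond the ability to pass scalars and capabilities.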

Transfers of control

Calls
The only reason for creating a domain is to establish
an environment in which a process may execute with
different protection than that provided by any existing
domain. If this objective is to be fulfilled, transfers of
control between domains must be handled with great
care, since they generally imply the acquisition of
new capabilities. If it is possible for a process running
in domain X to suddenly jump into domain Y and
continue execution at any arbitrary point, X can certainly induce Y to damage the objects accessible
through Y's capabilities.
To provide an adequate mechanism for transfers
between domains, we introduce the idea of a protected
entry point or gate, and make the rule that transfer
into a domain is normally allowed only at a gate. A
gate is a new kind of capability which can be created
by anyone with a capability for the domain. It specifies
a location to which control is to go when the gate is
used. Gates can be passed around freely like other
capabilities, and each one may be viewed as conferring
a certain amount of power, namely the power to accomplish whatever the routine entered by the gate is


designed to do. With gates it is possible to selectively
distribute the powers of a domain in a flexible way.
A transfer through a gate usually takes the form of
a subroutine call; some provision must therefore be
made for a return. It is not satisfactory to create
another gate which the called process may return
through, since he might save it away and use it to
return at some later and unexpected time. Instead,
the domain and location to return to are saved on a
call stack in the supervisor, from which the return
operation can retrieve them. It is possible to call a
domain recursively with this mechanism, a feature
which is generally desirable and also quite important
for the trap and interrupt system about to be described.
In order to allow the stack to be reset in case of an
error, or for any of the other reasons which prompt
programmers to reset stacks, a jump-return (n) operation is provided which returns to the domain n levels
back. Protection is maintained by requiring the domain
doing the jump-return to have capabilities for all the
domains being jumped over.
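A sketch of the gate and call-stack discipline (illustrative Python; the real operations are the supervisor calls listed in Table I, and the checking here is only schematic):

```python
# Transfers between domains happen only through gates; returns come
# from a call stack kept in the supervisor, never from a saved gate.

class Gate:
    """A capability specifying a domain and the one location at which
    control may enter it."""
    def __init__(self, domain, location):
        self.domain, self.location = domain, location

call_stack = []          # held by the supervisor, out of reach of user code

def call(current_domain, gate):
    call_stack.append(current_domain)   # record where to return
    return gate.domain, gate.location   # enter only at the gate's location

def ret():
    return call_stack.pop()             # supervisor supplies the return point

def jump_return(n, held_capabilities):
    """Return n levels back; legal only if the jumper holds capabilities
    for all the domains being jumped over."""
    skipped = call_stack[-n:]
    if any(d not in held_capabilities for d in skipped):
        raise PermissionError("no capability for a domain being jumped over")
    del call_stack[-n:]
    return skipped[0]

g = Gate("CP", "command-loop")          # hypothetical gate into CP
dom, loc = call("X", g)                 # X calls CP through the gate
assert (dom, loc) == ("CP", "command-loop")
assert ret() == "X"                     # CP returns via the call stack
```

Because X never holds a gate for an arbitrary point in CP, it can start CP running only at the protected entry, which is exactly the property the text demands.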

Traps
A trap is caused by the occurrence of some unusual
event in the execution of the program which requires
special handling, such as a floating point overflow, a
memory protection violation or an end of file. When a
trap occurs, it forces control to go to a specified place,
where presumably a routine has been put to deal with
the event. Whether any particular event causes a trap
or simply sets a flag which can be tested by the program
is a decision which should be under the programmer's
control. Traps may be initiated by hardware (e.g.,
floating overflow) or may be artifacts of the software;
as with most distinctions between hardware and software implementation, this one is of little importance,
and we expect all traps to be transmitted to the program
in the same form, regardless of their origin.
These are all obvious points which are generally
accepted, and have even become embedded in the
definition of PL/I. What concerns us here is the relationship between traps and domains, which is not
quite so obvious. The basic problem is that the response to a trap must be made to depend on the environment in which it occurs. The occurrence of, say, a
floating overflow is simply a fact, and has nothing to
do with who is running. The action to be taken, on the
other hand, is entirely a function of the situation.
Consider the example in Figure 6. If a floating overflow
occurs with the call stack in state (b), it is clear that

34

Fall Joint Computer Conference, 1969

Name
A

Domain

Traps

B

Statl.stl.cal
package

C

Matrl.x
Inversion
a)

FLTOV,

SINGMTX

I

I

FLTOV

Domains and
enabled traps

o

b)

The call stack
during matrix
inversion

o
~SIN~ o
o

o

CATCHALL

8

0FLTOV

(0
c)

o
o
G

ICommand processor ICATCHALL I

the matrix
inverter processes a
floating overflow

d)

the matrix
inverter returns with
trap-return
(SINGMTX)

e)

the matrix
inverter returns
with trapreturn
(BAD DATA)

Figure 6--Traps and trapreturns

C should have the first chance to handle the trap. If
it is not interested, the domain B which called it should
have the second chance. In state Cc}, on the other hand,
domain B should have the first chance, and then A.
The reasons for this, is that we do not wish to give up
control to a weaker domain when a trap occurs.
The idea is then the following: Each domain is
considered to have a father. When a trap occurs, it is
first directed to the domain S which is running. If S
does not have the trap enabled, the father of S is
tried in the same way. If no one can be found to handle
the trap, there are two possibilities:

to each hardware-generated trap is a standard name.
Software-generated traps can use £tny names, including
the ones for hardware traps. This makes it easy for a
subroutine to simulate the occurrence of a hardware
condition which it may not be convenient to produce.
A simple extension of the return operation. to a
trap-return allows a routine to signal an error without
leaving any traces of itself; the trap-return does a
return and immediately causes the specified trap,
without allowing any execution beyond the return
point. The domain which handles the trap then sees
it as having occurred in the calling routine, which is
exactly what is wanted. Thus in Figure 6 we have n
matrix inversion routine which processes its own
floating overflows, but reflects two other conditions
to its caller with trap-return. Another useful convention is to disable the trap when it occurs. This
makes it much less likely that the program will get
into a loop, especially for such traps as illegal instruction and memory protection violation.

Interrupts
There remains one more way to cause n tlmnsfer
Dynamic Protection Structures

Conceptually, we wish to think of traps as identified
by symbolic names. Each domain must then include a
list of names of the traps it has enabled. Corresponding

ignore it;

generate a catchall trap which any domain that
lacks a father is forced to handle.

If a domain T is found with the trap enabled, it is
called with the name of the trap as argument. It can
then return and allow execution to proceed if it is
able to clear things up. Alternatively, it can do a
jump-return to someone farther back on the call stack
if it finds the situation to be hopeless. An important
property of this scheme is that the trap routine can do
arbitrarily complex processing without disturbing the
situation at the time of the trap.

between domains: the occurrence of an interrupt. This
is not intended to be the normal mechanism for communication between cooperating processes; the basic
block and wake-up mechanisms are expected to perform that function. There are times, however, when it
is desirable to force a process to do something, even
if it is not paying attention. Two obvious reasons for
this are:

a quit signal from the terminal, which indicates
that the user wants to regain control over a process
which has gone into a loop, or perhaps simply
become unnecessarily wordy;

the elapse of a certain amount of time, which
has much the same meaning.

The action required in these two cases is different.
When a timer interrupt is requested (and there may be
two kinds, for real time and CPU time) the desired
action is usually to call a specific domain, often the
one which is setting the timer. If another domain
wants a timer, it will use one which is logically different.
The user's quit signal, on the other hand, is context
dependent like a trap; the desired action is a function
of the routine which is running when the signal arrives.
Thus an iterative root-finder may interpret a quit as
an indication that the solution is accurate enough,
but the debugging system under which it may be
running will curtail its printing when it sees a quit and
await a new command. This analysis suggests a simple
implementation: convert the quit into a trap from the
currently executing domain. Each interrupt, then, will
give rise to a call or a trap, depending on its type as
declared by the programmer.
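The trap search just described can be sketched as a toy model; the names here (Domain, raise_trap, the sample stack) are illustrative assumptions, not part of the paper's system:

```python
# Toy model of trap handling: search the call stack for a domain
# that has enabled the trap; if none is found, fall back to a
# catchall handler. All names are illustrative.

class Domain:
    def __init__(self, name, enabled_traps=()):
        self.name = name
        self.enabled_traps = set(enabled_traps)

def raise_trap(call_stack, trap_name):
    """Search from the most recently called domain backward."""
    for domain in reversed(call_stack):
        if trap_name in domain.enabled_traps:
            # The handler is called with the trap name as argument;
            # it may return, or jump-return farther down the stack.
            return f"{domain.name} handles {trap_name}"
    # No domain enabled the trap: the system could ignore it, or
    # generate a catchall trap; the catchall is modeled here.
    return f"catchall handles {trap_name}"

stack = [Domain("root", {"catchall"}),
         Domain("editor", {"quit"}),
         Domain("solver")]
print(raise_trap(stack, "quit"))      # editor handles quit
print(raise_trap(stack, "overflow"))  # catchall handles overflow
```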
Even when we see how to convert them into operations within the process, interrupts still present one
serious problem which does not arise in the handling
of traps. This is the fact that a program occasionally
needs to be allowed to compute for a while without
losing control. Usually this happens when modifications are being made to a data base; if a quit signal
should appear or a timer run out halfway through this
operation, the data is left in a peculiar state. The
obvious solution is to allow a process to become noninterruptible for a limited period of time. The function
of the limit is to prevent the process from getting into
a state from which it cannot be retrieved; exceeding
it is a programming error and always causes the process
to become interruptible again and an error trap to
occur, regardless of whether an interrupt is actually
pending. The limit is properly measured in real time,
since its primary purpose is to put a bound on the
frustration of the user at his console.
Non-interruptibility is a process-wide condition. It
must be possible, however, for a newly-called domain
to extend the limit exactly once, so that it can function
properly even though its caller is about to exceed his
limit. The limit is thus part of a call stack entry. When
a return occurs, the old limit comes back into force,
and an immediate trap may occur if it has been exceeded.
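The limited non-interruptibility described above might be sketched as follows; the Process class, its method names, and the use of monotonic wall-clock deadlines are assumptions for illustration:

```python
# Sketch of limited non-interruptibility: each call-stack entry may
# carry a real-time limit; exceeding it is a programming error that
# makes the process interruptible again and raises an error trap.
import time

class Process:
    def __init__(self):
        self.limit_stack = []   # one deadline per call-stack entry

    def begin_uninterruptible(self, seconds):
        # A newly-called domain may extend the limit exactly once.
        self.limit_stack.append(time.monotonic() + seconds)

    def end_uninterruptible(self):
        # On return, the old limit comes back into force; an
        # immediate trap occurs if the limit was exceeded.
        deadline = self.limit_stack.pop()
        if time.monotonic() > deadline:
            raise RuntimeError("non-interruptible limit exceeded")

p = Process()
p.begin_uninterruptible(60.0)   # e.g., while updating a data base
# ... critical section: modify the data base ...
p.end_uninterruptible()         # finished within the limit: no trap
```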
Table I summarizes the operations connected with
transfers of control between domains.
TABLE I-Operations for transfers

Operation      Arguments

Call           Gate, Parameters
Return         Parameters
Jump           Gate, Parameters
Jump-return    Depth, Parameters
Trap           Trap number
Trap-return    Trap number

Proprietary programs

The remainder of this paper deals with the protection problems introduced when objects are allowed


to have external, mnemonic names. The examples in
this section are intended to introduce this subject, and
are also of interest in their own right. Suppose then
that a user U has a program executing in domain P
and wishes to perform a circuit analysis. P has generated the input data for the analysis, and intends to
use the results for further calculation. Within the
system M on which P is running, some user V has
written a suitable analysis program A which he has
offered for sale, and U has decided to use V's program.
It happens that U and V are competitors.
Both users in this situation have selfish interests
to protect. First, and most obvious, V does not want
his program stolen. He therefore insists that while it
is executing U must not be allowed to read it. Equally
important, however, is the fact that U does not want
V's program to be able to read the calling program P
and its data; although U may not be trying to market
P, it, and especially its data, contain valuable information about U's current development work which
must be kept from competitors. The relationship
between U and V, and between their programs P and A,
is therefore one of mutual suspicion. Each is willing
to entrust the other with just enough information
to allow the circuit analysis to be completed, and no
more. The system must support this requirement if it
is to be a suitable vehicle for selling programs.
Furthermore, care must be taken beyond the programs. While P is running it needs the ability to access U's files by name, to read input data and record
results. This privilege must certainly not be extended
to A, since it can learn even more about U's secrets
by examining his files than by looking at his program,
not to mention the possibility of modifying them. On
the other hand, A may need access to V's files to obtain
data for the analysis and to collect statistics and accounting information; this access must not be available
to P. The protection mechanisms must therefore provide for isolating P and A at the level of file naming as
well as on the lower levels which have been the subject
of this paper so far.
What is required then is a system facility something
like this. V establishes A as a proprietary program,
specifying the file on which it resides. Another user's
program P may then ask the system to attach this
file. To do this, the system creates a new domain A,
installs the program in it, provides it with some storage,
and returns to P a gate into A. When P wants to call
A, he uses the gate and passes whatever parameters
he thinks are needed for A to function. When A is
finished, he returns. The protection mechanisms we


Fall Joint Computer Conference, 1969

have been discussing prevent undesired interference
between P and A. Safeguards for the files are discussed
below.
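The attach-and-call sequence might be sketched like this; Gate, attach, and the entry convention are hypothetical stand-ins for the facility described above, not its actual interface:

```python
# Sketch of a proprietary program facility: the system installs the
# program in a fresh domain and returns only a gate, so P can call A
# but neither can read the other. All names are illustrative.

class Gate:
    def __init__(self, domain, entry):
        self._domain, self._entry = domain, entry

    def call(self, *parameters):
        # P transfers control through the gate; the domain's
        # contents are never directly readable by the caller.
        return self._entry(self._domain, *parameters)

def attach(program_file):
    # Create a new domain, install the program, give it storage.
    domain = {"program": program_file, "storage": {}}
    def entry(dom, *parameters):
        return f"analysis of {parameters} by {dom['program']}"
    return Gate(domain, entry)

gate = attach("A")                   # P asks the system to attach A
result = gate.call("circuit-data")   # P calls A through the gate
```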
The example above is one of a great variety of similar
situations. The system itself creates many of them. A
LOGOUT command, for example, requires special access to accounting files and to capabilities for destroying
a process, but it would be nice to call it with the
standard command processor. Similarly, driving a
special peripheral like a printer requires special capabilities. If a company maintains a large data base, it
may wish to give different classes of users access to
different parts of it by allowing them to call different
accessing programs. These and many other applications
fall within the general outline established by our proprietary program example. We now proceed to consider
how to handle the file naming problems it presents.
External names

Table II lists the goals of a naming system for objects,
and indicates some of the distinctions between the
use of capabilities in names which have been discussed
in previous sections, and the use of external names,
which are strings of characters such as 'FILEl' or
'CIRCUIT'. In summary, it says that capabilities are
very convenient for use by a program, since they are
cheap and self-validating. On the other hand, they are
very bad for people, since they cannot be typed in or
remembered. Names for people should also have the
property that the same name can refer to many different objects, the distinctions to be made by context.
Thus, Smith's file 'ALPHA' is not the same as Jones'
'ALPHA'.
TABLE II-Goals of a naming system for objects

Goal                                    Achieved by    Achieved by
                                        capabilities   external names

Names are mnemonic                                     X
Names can be relative to other names                   X
Names can be used externally                           X
Possession of name authorizes access    X
Names are cheap to use                  X
Names can be manipulated by programs    X

Techniques for achieving all these goals are well
known. They depend on the introduction of a new kind
of object called a directory, which consists of pairs:
< external name, capability>, and an operation of
opening an object by supplying the name to obtain
the capability. Since the external name is interpreted
relative to a directory, there is a suitable basis for
establishing the context of a name. A tree-structured
naming system is implicit in the scheme, because
directories are themselves objects accessed by capabilities. It is now easy to see how a program in a domain
D accesses the objects belonging to owner U. When D
is created, it is supplied with a capability for U's
directory, which it simply exercises.
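A minimal sketch of directories as <external name, capability> pairs with the opening operation; Capability and Directory are illustrative names, not the system's interface:

```python
# Sketch of directories: each entry pairs an external name with a
# capability, and 'open' trades the name for the capability. Since
# directories are themselves objects reached through capabilities,
# a tree-structured naming system is implicit.

class Capability:
    def __init__(self, obj, access):
        self.obj = obj        # the protected object
        self.access = access  # e.g., 'R', 'W', 'RW'

class Directory:
    def __init__(self):
        self.entries = {}     # external name -> capability

    def enter(self, name, capability):
        self.entries[name] = capability

    def open(self, name):
        # The name is interpreted relative to this directory, which
        # supplies the context for the name.
        return self.entries[name]

# Smith's 'ALPHA' and Jones' 'ALPHA' name different objects:
smith = Directory()
smith.enter("ALPHA", Capability(obj="Smith's file", access="RW"))
jones = Directory()
jones.enter("ALPHA", Capability(obj="Jones' file", access="R"))
```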
There is more controversy over the proper methods
of accessing objects belonging to other users. A popular
approach is to use passwords: a public read-only
directory is filled with capabilities for all other directories which allow the objects in them to be accessed
provided a correct password (usually different for each
object) is supplied as part of the opening operation.
This method is not satisfactory. First, it is inconvenient,
since it requires the person accessing the files to remember the password. Second, it is insecure. If he
writes the password down, or includes it in a program,
the possibility increases that it will become known. It
is bad enough to have to use a password to obtain
entry to the system, but at least only one password is
involved, it is used only once per session, and it can
be changed, if need be after each session, without too
much fuss. None of these things is true of passwords
attached to files: there are many of them, many people
need to know them, and one must be used each time
a file is opened. This scheme has no advantage except
economy of implementation.
A method based entirely on capabilities suffers only
one of these drawbacks: it is inconvenient, but secure.
It is also, however, quite complex. The idea is that if
a file (or anything else) is to be shared, a capability
for it should be passed from its owner to those who
wish to share it. The problem is that a capability,
being a protected object, must be passed through protected channels; it cannot be sent in a letter, even a
registered letter. The solution is illustrated in Figure
7. Every user has (at least) two directories, a private
one which he works with, and a transfer directory. The
public directory PUB, for which every user has a read
capability, contains write capabilities for all the transfer directories. The object is to move the capability
for X from PDA to PDB. Proceed as follows:

Figure 7-Sharing capabilities without access keys. PUB is the
public directory, containing a write-only capability for the
transfer directory of each user.

A moves a capability for TDB into PDA
Using it, A moves his capability for X to TDB
B moves the capability for X from TDB to PDB
Since only B can access TDB, security is preserved. A
malicious user can confuse things by writing random
capabilities into the TDs, but it is easy for B to check
that he has gotten the right thing. Furthermore, if X
is a directory, future communication can be carried
out quite conveniently, since A and B can then communicate through X without any worries about outside interference.
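The three steps can be sketched with plain dictionaries standing in for the directories of Figure 7 (PDA and PDB the private directories, TDB the transfer directory, PUB the public one); this is an illustration, not the system's interface:

```python
# Sketch of capability transfer through a transfer directory.
PDA = {"X": "capability-for-X"}   # A's private directory
PDB = {}                          # B's private directory
TDB = {}                          # B's transfer directory
PUB = {"B": TDB}                  # write capabilities for transfer
                                  # directories, readable by everyone

# 1. A moves a capability for TDB into PDA
PDA["TDB"] = PUB["B"]
# 2. Using it, A moves his capability for X to TDB
PDA["TDB"]["X"] = PDA["X"]
# 3. B moves the capability for X from TDB to PDB
PDB["X"] = TDB.pop("X")

# Only B can read TDB, so security is preserved; B can still check
# that he has gotten the right thing.
```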
A much better method is based on the simple idea
of attaching to a directory entry a list of the users
who are allowed to access it; with each user we can
also specify options, so that Rosenkrantz may be
granted write access to the file while Guildenstern can
only read it. This scheme, which was first used in
CTSS,1 has two drawbacks. The first is that if the list
of users who are authorized to access a file is long, it
takes a lot of space to store it; this problem is especially
annoying if there are several files to be accessed by the
same group of users. The second drawback is that there
is no provision for giving different kinds of access to
different domains of a computation. Both difficulties
can be overcome in a rather straightforward manner.
Before we pursue this point, it is important to notice
why the difficulty encountered above in the capability-passing scheme does not arise here. We can think of
the computation of a logged-in user as possessing a
special kind of capability which identifies it as belonging to him. If SMITH is the user, we will refer to
this capability as SMITH*, meaning that the string
'SMITH' has been enclosed in a tamper-proof box.
When JONES wishes to give SMITH access to his
file ALPHA, he puts the name SMITH on the access
list; JONES can do this since he has a capability for
ALPHA. When a computation presents the capability
SMITH*, the system observes that the string (or user
number) which is the contents of the capability matches
the string on the access list and grants the access.
At no time is it necessary for JONES to have SMITH*
in his possession. He needs only the name SMITH
which, since it is not a protected object, can be communicated to him by shouting across the room. Figure
8 illustrates.

Figure 8-Use of access keys, showing the capabilities for
SMITH's computation before and after opening the file
To generalize the method we need two ideas. One
is that of an access key. This is an object (i.e., it can
be referenced only by using a capability) which consists simply of a bit string of modest length, long
enough that the number of different access keys is
larger than the number of microseconds the system
will be in existence. Any user may ask the system for a
new access key; the system will create one never seen
before and return a capability for it. The object SMITH*


mentioned in the last paragraph is an example of an
access key; one is kept for each user in the system.
Since an access key is an object, capabilities for it
appear in the directories and are protected exactly as
is done for any other object (since the access key is a
small object, it may be convenient for the implementation not to give it any existence independently
of the capabilities for it, i.e., to make the value of the
capability the object itself, rather than a pointer to
it as in the case of files). To give a group of users access
to some files, all we have to do is distribute a new
access key GROUP* to the users and put GROUP
on the access list for each file. The distribution is
accomplished by creating GROUP* and putting all
the users on its access list; once they have copied it
into their directories they can be removed from the
access list, so that no space need be wasted. In practice,
as we have pointed out, numbers of perhaps 64 bits
would be used instead of strings like 'GROUP'.
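The key-against-list check might be sketched as follows; AccessKey and grants are illustrative names, and the sealed string stands in for the tamper-proof box:

```python
# Sketch of access keys: a key is a protected object whose contents
# (a name or number) are matched against a file's access list.

class AccessKey:
    def __init__(self, contents):
        self.__contents = contents   # sealed in a tamper-proof box

    def contents(self):
        return self.__contents       # read only by the system

# JONES puts the names on the access list of file ALPHA, with
# per-user options; he never needs to hold the keys himself.
alpha_access_list = {"GROUP": "R", "SMITH": "RW"}

def grants(access_list, key, requested):
    options = access_list.get(key.contents(), "")
    return requested in options

group_star = AccessKey("GROUP")      # the access key GROUP*
assert grants(alpha_access_list, group_star, "R")
assert not grants(alpha_access_list, group_star, "W")
```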
The second idea is not new at all. It consists of the
observation that since an access key is just an object,
different domains can have different access keys and
hence different kinds of access to the file system. Thus,
for example, a user's computation may be started with
two domains, one for his program with his name as
access key, and the other for system accounting with
an access key which allows it to write into the billing
files. With a single suitable access key, a domain can
easily get hold of an arbitrarily large collection of
other objects which are protected by other keys, since

the first key can be used to obtain other keys from the
directory system.

SUMMARY
We have described a very general scheme for distributing access to objects among the various parts of
a computation in an extremely specific and flexible
way. The scheme allows two domains to work together
with any degree of intimacy, from complete trust to
bitter mutual suspicion. It also allows a domain to
exercise firm control over everything created by it or
its subsidiaries.

REFERENCES

1 P A CRISMAN editor
The compatible time-sharing system: A programmer's guide
MIT Press 2nd ed Cambridge Mass 1965
2 J B DENNIS
Segmentation and the design of multiprogrammed computer systems
J ACM Vol 12 Oct 1965 589
3 J B DENNIS E C VAN HORN
Programming semantics for multiprogrammed computations
CACM Vol 9 No 3 March 1966 143
4 R M GRAHAM
Protection in an information processing utility
CACM Vol 11 No 5 May 1968 368
5 B W LAMPSON
A scheduling philosophy for multiprocessing systems
CACM Vol 11 No 5 May 1968 347
6 B W LAMPSON et al
A user machine in a time-sharing system
Proc IEEE Vol 54 No 12 Dec 1966

The ADEPT-50 time-sharing system
by R. R. LINDE and C. WEISSMAN
System Development Corporation
Santa Monica, California

and

C. E. FOX
King Resources Company
Los Angeles, California

INTRODUCTION
In the past decade, many computer systems intended
for operational use by large military and governmental organizations have been "custom made" to
meet the needs of the particular operational situation
for which they were intended. In recent years, however, there has been a growing realization that this
design approach is not the best method for long term
system development. Rather, the development of
general purpose systems has been promoted that
provide a broad, general base on which to configure
new systems. The concepts of time-sharing and general-purpose data management have been under development for several years, particularly in university
or research settings.1,2,3 These methods of computer
usage have been tested, evaluated, and refined to
the point where today they are ready to be exploited
by a broad user community.
Work on the Advanced Development Prototype
(ADP) contract was begun in January 1967 for the
purpose of demonstrating-in an operational environment-the potential of automatic information handling made possible by recent advances in computer technology, particularly advances in time-sharing executives and general-purpose data management techniques. The result of this work is a large-scale, multi-purpose system known as ADEPT, which

operates on IBM System 360 computers.*
The entire ADEPT system is now being used at
four field installations in the Washington, D. C. area,
as well as at SDC in Santa Monica. The system was
installed at the National Military Command System
Support Center in May 1968, at the Air Force Command Post in August 1968, and at two other government agencies in January 1969. These four field sites
collectively run ADEPT from 80 to 100 hours per
week, providing a total of some 2000 terminal hours
of time-sharing service monthly to their users.
The ADEPT system consists of three major components: a time-sharing executive; a data management system adapted from SDC's Time-Shared Data
Management System (TDMS) described by Bleier;4
and a programmer's package. This paper deals exclusively with the ADEPT Time-Sharing Executive,
and particularly with the more novel aspects of its
architecture and construction. Before examining these
design and hardware configuration of the system.
A general purpose operating system

* Development of ADEPT was supported in part by the Advanced Research Projects Agency of the Department of Defense.

The ADEPT executive is a general-purpose time-sharing system. The system operates on a 360 Model
50 with approximately 260,000 bytes of core memory,
4 million bytes of drum memory, and over 250 million
bytes of disc memory, shown graphically in Figure
1 and schematically in the appendix. With this machine
configuration, ADEPT is designed to provide responsive on-line interactive service, as well as background
service to approximately 10 concurrent user jobs. It
handles a wide variety of different, independent application programs, and supports the use of large
random-access data files. The design-basically a
swapping system-provides for flexibility and expansion of system functions, and growth to more powerful
models in the 360 family.
ADEPT functions both as a batch processor (whereby jobs are accumulated and fed to the CPU for operation one by one) and as an interactive, on-line system
(in which the user controls his job directly in real
time simply by typing console requests).
Viewed as a batch system, ADEPT allows jobs to
be submitted to console operators or submitted from
consoles via remote batch commands (remote job
entry). In either case, jobs are "stacked" for execution
by ADEPT in a first-in/first-out order. The stack is
serviced by ADEPT as a background task, subject
to the priorities of the installation and the demands
of "foreground" interactive users. Viewed as an interactive system, ADEPT allows the user to work with
a typewriter, allowing computer-user dialog in real
time. Via ADEPT console commands, the user identifies himself, his programs, and his data files, and
selectively controls the sequence and extent of operation of his job in an ad lib manner. A prime advantage
of the interactive use of ADEPT is that the system
provides an extendable library of service programs
that permit the user to edit data files, compile or
assemble programs, debug and eliminate program
errors, and generally manage large data bases in a
responsive on-line manner.
System architecture

The architecture of the ADEPT executive is that
of the "kernel and the shell". The "kernel," referred
to as the Basic Executive (BASEX), handles the
major problems of allocating and scheduling hardware resources. It is small enough to be permanently
resident in low core memory, permitting rapid response
to urgent tasks, e.g., interrupt control, memory allocation, and input/output traffic. The "shell," referred to as the Extended Executive (EXEX), provides
the interface between the user's application program
and the "kernel". It contains those non-urgent, large-

Figure 1-Relative capacity of various ADEPT direct-access
storage media available in less than 0.2 seconds: core (.26M
bytes), 2303 drum (3.9M bytes), 2311 disc packs (7.25M bytes
per pack), 2314 disc storage (207M bytes), and 2302 disc storage
(226M bytes). The initial system that operates at SDC utilizes
core, 2303 drum, 2311 and 2314 disc packs, and 2302 disc storage.
The NMCSSC system utilizes 2314 disc storage in lieu of 2311 or
2302 discs. The architecture of the ADEPT executive is such that
it permits any combination of the above types of disc storage in
varying amounts.

task extensions of the basic "kernel" processes that
are user-oriented rather than hardware-oriented;
they may, therefore, be scheduled and swapped.
The version of the ADEPT time-sharing system
thus far developed has multiple levels of control
beyond the two-level "kernel-shell" structure--i.e.,
it can be thought of figuratively as an "onion skin".
Figure 2 shows these relationships graphically.
Beyond EXEX, "object systems" may exist as
subsystems of ADEPT (developed by the user community without modification to EXEX or BASEX.),
thus further distributing and controlling the system
resources for the object programs that form still
another level of the system. The design ideas embodied
in ADEPT parallel those of Dijkstra,5 Corbato,6
and Lampson,7 but differ in techniques of implementation.
The ADEPT Basic Executive operates in the lower
quarter of memory, thereby providing three quarters
of memory for user programs. With the current H
core configuration, ADEPT preempts the first 65,000
bytes of core memory, the bulk of which is dedicated
to BASEX; EXEX must then operate in user memory

Figure 2-Multiple levels of control in ADEPT

in a fashion similar to user programs. ADEPT is
designed to operate itself and user programs as a
collection of 4096-byte pages. BASEX is identified
as certain pages that are fixed in main storage and
that cannot be overlayed or swapped. EXEX and
other programs are identified as sets of pages· that
move dynamically between main storage and swap
storage (i.e., drum). It is necessary to maintain considerably more descriptive information about these
swappable programs than about BASEX. This
descriptive information is carried in a set of system
tables that, at any point in time, describe the current
state of the system and each program.
ADEPT views the user as a job consisting of some
number of programs (up to four for the 360/50H
configuration) that were loaded at the user's request.
These programs may be independent of one another
or, with proper design, different segments of a larger
task. Implicitly, EXEX is considered to be one of
these programs. To simplify system scheduling, communication, and control, only one program in the
user's set may be active (eligible to run) at a time.
When ADEPT scheduling determines that a job may
be serviced, the current job in core is saved on swap
storage, and the active program of the next job is
brought into core from swap storage and executed
for a maximum period of time, called a quantum. The
process then repeats for other jobs. Figures 3 and 4
schematically depict these relationships.

Figure 3-Simple commutation of users' programs. This figure
illustrates the relationship between users' programs, EXEX,
and BASEX. Each spoke represents a user's job, with his EXEX
providing the interface between BASEX and the hardware
resources. The maximum number of interactive jobs for the
IBM 360/50H configuration is ten.

Figure 4-ADEPT's basic sequence of operation. This figure
shows the basic operating system cycle: the idle loop is interrupted
by an external interrupt (an activity request); a program is
scheduled, swapped into core from the drum, and executed;
escape from the execution phase occurs when a quantum termination condition (e.g., time expiration, service or I/O call, error
condition) is met; the program is then swapped out and control
is returned to the idle loop (if no other programs are eligible to
be scheduled).
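The swap-in, run-for-a-quantum, swap-out cycle can be sketched as a round-robin loop; the job names and time units below are illustrative, not taken from the system:

```python
# Sketch of ADEPT's basic cycle: the active program of each job is
# swapped in, run for at most one quantum, then swapped out and
# rescheduled until it finishes.
from collections import deque

def run_cycle(jobs, quantum):
    """jobs: deque of (name, time_needed); returns the run trace."""
    trace = []
    while jobs:
        name, remaining = jobs.popleft()    # swap in from drum
        ran = min(quantum, remaining)       # run until the quantum
        trace.append((name, ran))           # ends or the job is done
        remaining -= ran
        if remaining > 0:
            jobs.append((name, remaining))  # swap out, reschedule
    return trace

print(run_cycle(deque([("A", 5), ("B", 2)]), quantum=3))
# [('A', 3), ('B', 2), ('A', 2)]
```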

Basic executive (BASEX)
Table I lists the BASEX components and their
general functions as of the eighth and latest executive
release. These basic system components form an
integrated, non-reentrant, non-relocatable, permanently-resident, core memory package 16 pages long
(each page is 4096 bytes). They are invoked by hardware interrupts in response to service requests by
users of terminals and their programs. Note the
division of input/output control into cataloged (SPAM
and IOS), terminal (TWRI), and drum (BXEC)
activities to permit local optimization for improved
system performance.
TABLE I-Basic executive components

Component       Function

ALLOC           Drum and core memory allocation.
BXBUG           Debugger for executive programs.
BXEC            Basic sequence and swap control.
BXECSVC         SVC handlers for WAIT, TIME, DEVICE,
                STOP, and DISMISS calls.
EXEX            Linkage routines for EXEX (BASEX/EXEX
                interfaces); also services commands DIALOFF,
                DIALON.
INTRUP          First-level interrupt control.
IOS             Channel-program level input/output
                supervisory control.
RECORD          Records SVC, interrupt activity in BASEX.
SKED            Scheduler.
SPAM            Input/output access methods to cataloged
                storage.
TWRI            Terminal input/output control.
System Tables   Resident system data areas for communication
                table (COMTAB), logged-in user's table (JOB),
                loaded programs table (PQU), drum and core
                status tables (DSTAT, GSTAT), and a variety
                of other tables.

Extended executive (EXEX)
Unlike the tight, closed package of integrated
BASEX components, EXEX is a loose, open-ended
collection of semiautonomous programs. Table II
lists this collection of programs. EXEX is treated
by BASEX as a user program, with certain privileges,
and each user is given his own "copy" of the EXEX.
It is transparent to the user that EXEX is reentrant

TABLE II-Extended executive components

Component   Function

AUDIT       Maintains a real-time recording of all security
            transactions as an accountability log.
BMON        Batch monitor for control of background job
            execution.
CAT         Cataloger for file storage access control; also
            services FORGET command.
DTD         Transfers recording information from drum to
            disc.
DBUG        Debugger for non-executive (user) programs.
LOGIN       User authentication and job creation.
SERVIS      Library of service commands that are reentrant,
            interruptible and scheduled: APPEND, CHANGE,
            CREATE, CYLS, DELETE, DRIVES, INIT,
            LISTF, LISTU, LOAD, LOADD, LOAD and GO,
            OVERLAY, REPLACE, RESTORE, RESTORED,
            SAVE, SEARCH, VARYOFF, VARYON.
RUN         Remote batch job submission control servicing
            commands RUN and CANCEL.
XXTOO       Library of small, fast, executive service
            commands: CPU, BGO, BQUIT, BSTOP, DIAL,
            DRUMS, GO, LOGOUT, QUIT, RESTART,
            SKED, SKEDOFF, STATUS, STOP, TIME,
            USERS.
SYSDEF      Defines input/output hardware configuration at
            time of system start up.
SYSLOG      Defines authorized user/terminal security
            profiles at time of system start up.
TEST        Initializes system tables at time of system
            start up.
SYSDATA     Non-resident, shared, system data table for dial
            messages and other common data, e.g., lists of
            all logged-in users; other non-resident,
            job-specific tables also exist, e.g., job
            environment page, push-down list data page.

and is being shared with other users, except for its
data space. Each job has its own "machine state"
tables saved in its unique set of environment pages.
This structure permits flexible modification and orderly
system expansion in a modular fashion. EXEX is
always scheduled in the same way as other user programs.
Though EXEX components are, in large part,
non-self-modifying reentrant routines and thus could,
at small cost, be relocatable; neither user programs
nor EXEX components are relocated between swaps.
The lack of any mapping hardware on the IBM 360/50
and the design goal and knowledge that most user
programs would be of maximum size made unnecessary
a software provision to relocate programs dynamically.
User programs may be relocated once at load time,
however.
Communication and control techniques used in ADEPT

Communication is the generic term used to cover those
services that permit two (or more) programs to intercommunicate, be they system program, user program,
or both. From this communication vantage point we
shall examine the connective mechanism used between
the Basic and Extended Executives; the techniques
that allow components within the EXEX to make
use of one another; and the system design that permits
an object program to control its own behavior as well
as to communicate with the system and with other
object programs.

The ADEPT job or process
Before we discuss the system mechanics, let us
examine how the system treats each user logically.
A user in the system is assigned a job number. Each
job in the system may be viewed as a separate process,
and each process is, by definition, independent of all
other processes running on the machine. A process-or job-is not a program. It is the logical entity for
the execution of a program on the physical processor,
and it may contain as many as four separate programs.
A program consists of the set of machine instructions
swapped into the processor for execution, and the
Extended Executive is one of these programs.
The ADEPT executive requires a large number of
system tables to permit Basic and Extended Executive communication. Conceptually, the use of descriptive tables defining the condition of a user's process
is analogous to the state vector (or state word) discussed by Lampson and Saltzer.8,9 That is, the collection of information contained by these tables is


sufficient to define an inactive user's process state
at any given moment. By resetting the central processor from the state vector, a user's job proceeds
from an inactive to an active state as if no interruption had occurred. The state vector contains such
items as the program counter, the processor's general
registers, the core and drum map of all the programs
in the job, and the peripheral storage file data. All
of the collective data for each program or task in the
process are contained in the state vector.

Basic and extended executive communication
Each ADEPT user (i.e., any person who initiates
some activity within the system by typing in commands) is given a job number and assigned an entry
in the JOB table. The JOB table contains the system's
top-level bookkeeping on user activity. It contains
the user's identification, his location, his security
clearance, and a pointer to his program queue. Each
user is assigned one entry, or JOB, in the table. Associated with each JOB are the one or more programs
that the user is running.
Top-level bookkeeping on programs is contained
in the Program Queue (PQU) table. Each PQU entry
contains a program identification and some (but not
all) information that describes that program in terms
of its space requirements, its current activity, its
scheduling conditions, and its relationship to other
programs in the PQU that belong to the same JOB.
The detailed descriptive information and the status
of each JOB and its programs are carried in the swappable environment space.
The environment pages (there can be as many as
four) comprise a number of separate tables that contain such information as the contents of the general
registers, the swap storage page numbers where the
balance of the program resides, the program map,
and lists of all active data files. A single environment
page (or pages) is shared by all programs that belong
to the same JOB (user). The system design allows for
environment page overflow at which time additional
pages are assigned dynamically. The environment
pages, PQU table, JOB table, and data pages comprise the state vector of the user's job.
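The relationship among the JOB table, the PQU, and the environment pages can be sketched as follows; the rendering in present-day Python and all field names are illustrative only, not ADEPT's actual table layouts.

```python
# Illustrative sketch (not ADEPT's actual tables): the JOB table entry,
# Program Queue (PQU) entries, and environment pages together form a
# job's state vector, from which an inactive job can be resumed.

class EnvironmentPage:
    def __init__(self):
        self.general_registers = [0] * 16   # saved 360 general registers
        self.swap_page_numbers = []         # drum pages holding the program
        self.program_map = {}               # core/drum correspondence
        self.active_files = []              # lists of all active data files

class PQUEntry:
    """Top-level bookkeeping for one program of a job."""
    def __init__(self, program_id):
        self.program_id = program_id
        self.space_requirements = 0
        self.current_activity = "inactive"

class Job:
    """One JOB-table entry: a user's process, up to four programs."""
    MAX_PROGRAMS = 4

    def __init__(self, job_number, user_id, clearance):
        self.job_number = job_number
        self.user_id = user_id
        self.security_clearance = clearance
        self.program_queue = []               # PQU entries for this job
        self.environment = EnvironmentPage()  # shared by the job's programs

    def add_program(self, program_id):
        if len(self.program_queue) >= self.MAX_PROGRAMS:
            raise RuntimeError("a job may contain at most four programs")
        self.program_queue.append(PQUEntry(program_id))

    def state_vector(self):
        # The collective data sufficient to resume the inactive job.
        return (self.program_queue, self.environment)

job = Job(1, "user-a", "SECRET")
job.add_program("EXEX")
job.add_program("editor")
```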
To permit storage of "global" system variables,
and to allow system components to reference system
data that may be periodically relocated, there exists
a system communication table, which resides in low
core so that it can be referenced without loading a
base register.
The IBM 360 supervisor call (SVC) is used exclusively by EXEX components and object programs to
request BASEX services. Though additional overhead
is incurred in the handling of the attendant interrupt,
the centralization of context switching provided is
of considerable value in system design, fabrication,
and checkout.

Extended executive communication
An EXEX may make use of another EXEX function by use of the SVC call mechanism. To support
the recursive EXEX, an additional SVC processing
routine is required to manage the different recursive
contexts. This routine, called the SVC Dispatcher,
processes calls from user and EXEX functions alike,
manages a swappable data page, and switches to an
interface linkage routine. The data page contains
a system communication stack that consists of a
program's general registers and the Program Status
Word at the time of the SVC. This technique is
analogous to the push-down logic of recursive procedure calls found in ALGOL or LISP language
systems. The stack provides a convenient means of
passing parameters between routines in the EXEX.
Since each job has its own unique data page and environment page, EXEX is both recursive and reentrant.
The environment status table (ESTAT) contains
the swap and core location for each component in
the EXEX and for each program in the job. It resides
in the job environment page. When an EXEX service
is requested, only that particular EXEX program is
brought in from swap storage, rather than the full
service library. The interface linkage routine provides
this management function; it lies as a link between
the SVC Dispatcher and the particular EXEX
function. The interface routine picks up necessary
work pages for the EXEX component involved and
branches to that component after it is brought into
core. The interface routine maintains a separate pushdown stack of return addresses, providing the means
for the EXEX component to properly exit and return
control to its interface routine and then to the system.
The EXEX component called may make additional
EXEX SVC calls before exiting. To provide correct
work page allocation during recursive calls, the interface routine also saves the work page core and drum
page addresses in the push-down stack. Upon completion of a call, the EXEX component returns to
its interface routine; the interface routine releases
all allocated work pages to the system and branches
to a common unwind procedure.
The unwind procedure, like the SVC Dispatcher,
is simply a switching mechanism. It determines, via

the stack, whether to return to a still higher level
EXEX function, or to turn the EXEX off and exit
to the Basic Sequence. This recursive/reentrant control is the most complex portion of ADEPT and is
the "glue" that binds BASEX and EXEX together.
Figure 5 illustrates the recursive process.
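The push-down discipline can be sketched as follows; the rendering in present-day Python is illustrative only (the actual Dispatcher is 360 code, and the names here are hypothetical).

```python
# Sketch of the SVC Dispatcher's push-down stack for recursive EXEX
# calls: each SVC pushes the caller's registers and PSW onto the data
# page stack; the unwind procedure pops one level and decides where
# control returns. Names are illustrative, not ADEPT's.

class SVCDispatcher:
    def __init__(self):
        self.stack = []  # system communication stack on the data page

    def svc_call(self, component, registers, psw):
        # Save the caller's context, then switch to the EXEX component.
        self.stack.append({"component": component,
                           "registers": registers, "psw": psw})
        return len(self.stack)  # current recursion depth

    def unwind(self):
        # Pop one level: return the saved context of a still higher
        # level EXEX function, or None when the stack is empty
        # (turn EXEX off and exit to the Basic Sequence).
        if self.stack:
            return self.stack.pop()
        return None

d = SVCDispatcher()
d.svc_call("A", registers=[1], psw="psw-A")           # user calls EXEX "A"
depth = d.svc_call("B", registers=[2], psw="psw-B")   # "A" calls "B"
frame = d.unwind()                                    # "B" exits back to "A"
```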

Object program communication
One of the more stringent services required of an
operating system is the rapid interchange of large
quantities of data between object programs. The
interchange of even simple arrays, matrices, and tables
via stack parameters or a common file suffers from the
inadequacy of limited capacity or extensive I/O time.
Many operating systems ignore this requirement,
thereby restricting the general-purpose applications.
Yet there are solutions to this problem, and one successful technique employed in the ADEPT system is
that of "shared memory". Shared memory is achieved
by using the basic mechanism for managing reentrancy,
namely the program environment page map. Through
the ADEPT SHARE Page call, an object program
can request that designated pages of another program

Figure 5-Block diagram of EXEX behavior and control

The ADEPT-50 Time-Sharing System


in the job be added to its map. If core page numbers
are passed as parameters in various service calls, whole
pages of data may be passed between programs. EXEX
and many object programs operating under this system
use this method for inter-program communication.
ADEPT operating on the IBM 360/50H restricts
its user programs to 46 active core pages. However,
by utilizing the GETPAGE call, an object program
may acquire up to 128 drum pages and may subsequently activate and deactivate various page sets
by utilizing another service call, ACTDEACT (activate/deactivate). This scheme permits bulk data from
disc storage to be placed on drum and operated upon
at "swap" speeds. Thus skilled system users can
achieve efficient use of time and memory by managing
their own "paging". We consider this the best alternative considering the questionable state of other, automatic paging algorithms.10,11,12,13 Most EXEX components use these calls for just such purposes. For
example, the interface routines mentioned above use
activate calls to "turn on" called components of the
EXEX.
The Allocator component of ADEPT manages the
page map for each program. This software map reflects the correspondence between drum and core
pages, established initially by the SERVIS (service)
component at load time. The Allocator's function is
to inventory available core and drum pages by maintaining two resident system tables: one for core, the
other for drum. Whenever drum pages are released
or obtained, the Allocator updates the page map in
the job's environment page. The Allocator processes
the SHARE (page), GETPAGE, FREEPAGE, and
ACTDEACT calls from EXEX and object programs.
SERVIS allows a program at run time to add data
pages or to overlay program segments from disc or
tape. In so doing, SERVIS makes use of the various
Allocator calls.

Simulating console commands

An important attribute of ADEPT time-sharing
is that nearly all the functions and services that can
be initiated at the user's console can also be called
forth within a user's program. A program designer
can, for example, build a system of programs, which
can operate in batch mode under the control of a program by issuing internal commands in much the same
manner as the user sitting at the console. With this
approach, the ADEPT batch monitor controls background tasks by simulating user terminal requests.
Batch requests can be enqueued by users from any
console and then processed in turn by this supervisor
function.

Armed interrupts and rescue function

The basic design of ADEPT conveniently provides
for processing object program "armed" interrupt
calls. This means that an object program is able to
conditionally start (wakeup) and stop (sleep) the
execution of its own programs, and others as well.
The conditions for employing wakeup calls include
too much elapsed time, or the occurrence of unpredictable but anticipated events, e.g., errors and other
program calls. In "arming" these "software-interrupt" conditions by object program calls, the program
entry point(s) for the various conditions are specified.
When such conditions occur, the operating system
transfers to the specified entry point and gives the
appropriate condition code. (Note that if we take this
call one step further, and permit one object program
to arm the software and hardware interrupts of another
object program, we have the basic control mechanism
necessary to permit the operation of "object systems,"
i.e., subexecutives, another level in the "onion skin"
of ADEPT control.)
User programs interface with the ADEPT system
primarily via the supervisor call (SVC) instruction;
a secondary interface is provided via the program
check interrupt that protects the program and system
after various error conditions. The executive design
allows user programs to trap all such interfaces with
the system via its rescue arming mechanism. This
means that one program can trap and get first-level
control of all occurrences of SVC's and program checks
within a single job. This mechanism also means, then,
that the responsibility and meaning for these interfaces can be redefined at the user program level.
As of this writing, this mechanism is being employed
to construct object systems for an improved batch
monitor, an interface for the proposed ARPA Network,14 and to experiment with automatic translators
for compatibility with other operating systems. Other
uses include improvements in program recovery in
a variety of user tools, e.g., compiler diagnostics.
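A minimal sketch of the arming mechanism, with hypothetical names and rendered in present-day Python for illustration:

```python
# Sketch (hypothetical names) of "armed" software interrupts: an
# object program registers entry points for anticipated conditions;
# when one occurs, the operating system transfers there and passes
# the appropriate condition code. Unarmed conditions are ignored.

class Program:
    def __init__(self):
        self.armed = {}        # condition -> armed entry point
        self.log = []
        self.running = True

    def arm(self, condition, entry_point):
        self.armed[condition] = entry_point

    def sleep(self):
        self.running = False   # conditionally stop execution

    def wakeup(self, condition):
        # System transfers to the armed entry point with the code.
        handler = self.armed.get(condition)
        if handler:
            self.running = True
            handler(condition)

p = Program()
p.arm("ELAPSED-TIME", lambda code: p.log.append(code))
p.sleep()
p.wakeup("ELAPSED-TIME")   # armed condition: program resumes at entry
p.wakeup("ERROR")          # not armed: no effect
```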

Resource allocation, access, and management
ADEPT system design, of course, includes a complete set of resource controls that monitor secondary
storage devices.


Fall Joint Computer Conference, 1969

The cataloger
The Cataloger, an EXEX component, is functionally
analogous to the core/drum Allocator, but is used
for devices accessible by user programs. It maintains
an inventory of all assignable storage devices, assigns
unused storage on the devices, maintains descriptions of the files placed on these devices, controls
access to these files, and, upon authorized request, deletes any file. Specifically, the Cataloger:
• Assigns storage on 2302, 2311 and 2314 discs.
• Assigns tape drives.
• Locates an inventoried file by its name and certain qualifiers that uniquely identify the file.
• Issues tape or disc pack mounting instructions
to the operator when necessary.
• Verifies the mounting of labeled volumes.
• Passes descriptive information to the user program opening a file.
• Allows the user of a file to request more storage
for the file.
• Denies unauthorized users access to files.
• Returns assigned storage to available storage
whenever a file is deleted.
• Maintains a table of contents on each disc volume.
As the largest single component of the ADEPT
Executive (65,000 bytes), the Cataloger was written
in a new, experimental programming language called
MOL-360 (Machine-Oriented Language for the 360).15
It is a "higher-level machine language" developed
under an ARPA-sponsored SDC research project on
metacompilers. It resolved the dilemma involving
our desire for higher-level source language and our
need to achieve flexibility with machine code. The
Cataloger design and checkout, enhanced by the use
of MOL-360, showed simultaneously the validity
of MOL compilers for difficult machine-dependent
programming.
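The Cataloger's inventory role can be sketched as follows; the structures and names are hypothetical, present-day Python standing in for the MOL-360 original.

```python
# Illustrative sketch of the Cataloger: it assigns unused storage on a
# volume, maintains a file description, checks access on open, and
# returns storage to the available pool when a file is deleted.

class Cataloger:
    def __init__(self, volume_capacity):
        self.free = volume_capacity     # unassigned storage units
        self.catalog = {}               # file name -> description

    def create(self, name, owner, size, authorized):
        if size > self.free:
            raise RuntimeError("volume full")
        self.free -= size
        self.catalog[name] = {"owner": owner, "size": size,
                              "authorized": set(authorized) | {owner}}

    def open(self, name, user):
        desc = self.catalog.get(name)
        if desc is None or user not in desc["authorized"]:
            raise PermissionError("access denied")  # unauthorized user
        return desc                                 # descriptive information

    def delete(self, name, user):
        desc = self.open(name, user)    # only authorized users may delete
        self.free += desc["size"]       # assigned storage returned
        del self.catalog[name]

cat = Cataloger(volume_capacity=100)
cat.create("DATA1", owner="alice", size=40, authorized=["bob"])
```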

The SPAM component

SPAM is a BASEX component that permits symbolic, user-oriented I/O. It can be viewed as a special-purpose compiler that compiles symbolic user program
I/O calls into 360 channel programs, and delivers them
to the Input/Output Supervisor (IOS) for execution
via the EXCP (execute channel program) call. The
results of EXCP for the call are "interpreted" by
SPAM and returned to the user program as status information. As such, SPAM represents a more symbolic
I/O capability than the EXCP level. It provides a
relatively simple method for executing the operations
of reading, writing, altering, searching for, and positioning records within ADEPT cataloged and controlled disc-based and tape-based file structures.

Resource management

As of this writing, the computer operator has a set
of commands at his disposal that allow him to control
the system resources. Various privileged on-line commands enable him to monitor the terminal activities
of system users and to control assignment and availability of storage devices. However, there is an increasing need for a "manager" to be given more
latitude in dynamically controlling the system resources and observing the status of system users,
particularly because ADEPT was designed to handle
sensitive information in classified government and
military facilities. To meet these objectives, a design
effort is under way that gives the computer operator
system-manager status, with the ability to observe
and control the actions of system users. The result
will be a program that encompasses some of the management techniques reported by Linde and Chaney16
tailored to present needs.

Swapping and scheduling user programs

Most of the programs that run under ADEPT
occupy all of the core memory that is not used by
the resident Basic Executive (46 pages on the 360/50H).
If the set of needed pages could be reduced,
a considerable reduction in swap overhead could be
expected. One way to achieve this is to mark for swap-out only those pages that were changed during program execution. The hardware needed to automatically
mark changed pages is unavailable for the 360/50;
however, through use of the store-protect feature on
the Model 50, ADEPT software can simulate the effect and produce noteworthy savings in swap time.

Page marking

Whenever a user program is swapped into core, its
pages are set in a read-only condition. As the program
executes, it periodically attempts to store data (write)
in its write-protected pages. The resulting interrupt
is fielded by the system. After satisfying itself that
the store is legal for the program, the executive marks
the target page as "written," turns off write-protect
for that page, and resumes the program's execution.
The situation repeats for each additional page written.
At the completion of the program's time slice, the
swapper has a map of all the program pages that
were changed (implied in the storage keys with no
write protection). Only the changed pages are swapped
out of core. Measurement of this scheme shows that
about 20 percent of the pages are changed; hence,
for every five pages swapped in, only one need be
swapped out, for a total swap of six pages, rather
than the full swap of ten pages (five in, five out). The
scheme makes the drum appear to be 40 percent faster.
The use of the storage protection keys is based on
the functional status of each page rather than on
some user identity. User programs always run with
a program status word key of one, and the bits in
the storage key associated with the programs start
out at zero. After a page has been initially changed,
its key is set to one also. The other bits in the key are
used to indicate: first, a page is transient, not yet
completely moved to or from swap storage; second,
a page is unavailable, i.e., it belongs to someone else;
third, a page is locked and cannot be swapped or
changed; and finally, a page is fetch-protected because
it may contain sensitive information.
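The savings of the scheme can be checked with a small simulation (illustrative Python, not ADEPT code): pages come in write-protected, the first store to a page marks it, and at the end of the time slice only marked pages are swapped out.

```python
# Sketch of ADEPT's changed-page marking: every page is swapped in
# write-protected; the first store to a page raises a (simulated)
# protection interrupt, the executive marks the page "written" and
# clears the protection, and only marked pages are swapped out.

class Page:
    def __init__(self):
        self.write_protected = True
        self.written = False

def store(page):
    if page.write_protected:          # simulated protection interrupt
        page.written = True           # executive marks page as changed
        page.write_protected = False  # further stores proceed normally

def swap_cost(pages):
    swapped_in = len(pages)                          # all pages come in
    swapped_out = sum(1 for p in pages if p.written) # only changed go out
    return swapped_in + swapped_out

pages = [Page() for _ in range(5)]
store(pages[2])   # the program writes one page in five (about 20 percent)
# A full swap would move 10 pages (five in, five out);
# marking cuts the total to 6.
```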

Scheduling algorithm
The scheduling algorithm provides for three levels
of scheduling. Jobs that are in a "terminal I/O complete" state get first preference in the schedule. Jobs
in the second level, or background queue, are run if
there are no level-one jobs to run. A job is placed in
level two when the two-second quantum clock alarm
terminates its operation two consecutive times. Compute and I/O-bound programs are treated alike. A
level-two job, when allowed to run, is given a quantum
interval equal to the basic quantum time multiplied
by the scheduling level (i.e., 2 sec X 2 = 4 sec).
However, a level-two background job may be preempted after two seconds for terminal I/O. Any operation a level-two job makes that terminates its quantum prematurely will return the job to a level-one
status. The batch monitor job is run when the first
two queues are empty. User programs may be written
to overlap execution and I/O activity. Our choice of
scheduling parameters for quantum size and number of service levels was selected empirically and as a
result of prior experience. 17
A command SKED, which is limited to the operator's terminal, has the effect of forcing top priority
for a job (the job stays at level one all the time). Only
one job may run in this privileged scheduling state
at a time.
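The three-level discipline can be sketched as follows (an illustrative model in present-day Python; the names are hypothetical):

```python
# Sketch of ADEPT's three scheduling levels: terminal-I/O-complete
# jobs first, then the background queue, then the batch monitor.
# Two consecutive quantum-clock terminations demote a job to level
# two, whose quantum is the basic quantum times the level.

BASIC_QUANTUM = 2  # seconds

class Scheduler:
    def __init__(self):
        self.queues = {1: [], 2: []}
        self.level = {}       # job -> scheduling level
        self.overruns = {}    # consecutive quantum-clock terminations

    def add(self, job):
        self.level[job] = 1
        self.overruns[job] = 0
        self.queues[1].append(job)

    def quantum_expired(self, job):
        # Two consecutive two-second alarms demote the job to level two.
        self.overruns[job] += 1
        if self.overruns[job] >= 2:
            self.level[job] = 2
        self.queues[self.level[job]].append(job)

    def terminal_io_complete(self, job):
        # Ending a quantum prematurely returns the job to level one.
        self.overruns[job] = 0
        self.level[job] = 1
        self.queues[1].append(job)

    def quantum(self, job):
        return BASIC_QUANTUM * self.level[job]   # 2 sec x 2 = 4 sec

    def next_job(self):
        for lvl in (1, 2):
            if self.queues[lvl]:
                return self.queues[lvl].pop(0)
        return "batch-monitor"   # runs when both queues are empty

s = Scheduler()
s.add("J1")
s.next_job()             # J1 runs
s.quantum_expired("J1")  # first two-second alarm
s.next_job()
s.quantum_expired("J1")  # second consecutive alarm: demoted to level two
```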
Pervasive security controls

Integrated throughout the ADEPT executive are
software controls for safeguarding security-sensitive
information. The conceptual framework is based
upon four "security objects": user, terminal, file,
and job. Each of these security objects is formally
identified in the system and is also described by a
security profile triplet: Authority (e.g., TOP SECRET, SECRET), Need-to-Know Franchise, and
Special Category (e.g., EYES ONLY, CRYPTO).
At system initialization time, user and terminal
security profiles are established by security officers
via the system component SYSLOG. SYSLOG also
permits the association of up to 64 passwords with
each user. At LOGIN time, a user identifies himself
by his unique name, up to 12 characters, and enters
his private password to authenticate his identity. The
LOGIN component of ADEPT validates the user
and dynamically derives the security profile for the
user's job as a complex function of the user and terminal security profiles. The job security profile is
used subsequently as a set of "keys," used when access
is made to ADEPT files. The file security profile is
the "lock" and is under control of the file subsystem.
File access Need-to-Know is permitted for Private,
Semi-Private, and Public use. With the CREATE
command, a list of authorized users and the extent of
their access authorization (i.e., read-only, write-only,
read and write) can be established easily for SemiPrivate files. Newly created files are automatically
classified with the job's "high water mark" security
triplet-a cumulative security profile history of the
security of files referenced by the job. Through judicious use of the CHANGE command, these properties may be altered by the owner of the file.
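A simplified rendering of the lock-and-key check and the high-water-mark rule follows; it is illustrative only, since the paper describes the actual job profile as a complex function of the user and terminal profiles.

```python
# Sketch of the lock-and-key check: the job's security triplet acts
# as the keys, the file's triplet as the lock, and the job accumulates
# a "high water mark" from the files it references. The dominance
# rule shown is an assumed simplification, not ADEPT's exact test.

LEVELS = ["UNCLASSIFIED", "CONFIDENTIAL", "SECRET", "TOP SECRET"]

def dominates(job, file):
    # The job's authority must be at least the file's, and the file's
    # special categories must all be held by the job.
    return (LEVELS.index(job["authority"]) >= LEVELS.index(file["authority"])
            and file["categories"] <= job["categories"])

def high_water_mark(job, file):
    # Accumulate the security history of files referenced by the job.
    job["authority"] = max(job["authority"], file["authority"],
                           key=LEVELS.index)
    job["categories"] |= file["categories"]

job = {"authority": "SECRET", "categories": {"CRYPTO"}}
f1 = {"authority": "CONFIDENTIAL", "categories": set()}
f2 = {"authority": "TOP SECRET", "categories": {"EYES ONLY"}}

ok1 = dominates(job, f1)   # access granted
ok2 = dominates(job, f2)   # access denied
high_water_mark(job, f1)   # job's cumulative profile now covers f1
```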
Security controls are also involved in the control
of classified memory residue. Software and hardware
memory protection is extensively used. Software
memory protection is achieved by interpretive legality checking of memory bounds for I/O buffer
transfers, legality checking of device addresses for
unauthorized hardware access, and checks of other
user program attempts to seduce the operating system
into violating security controls.
The hardware protection keys are used to fetch-protect all address space outside the user program and
data area. Also, newly allocated space to user programs
is zeroed out to avoid classified memory residue.

Typically, the complete system reaches "on the air"
status in less than a minute.
System instrumentation

Many of the parameters built into the scheduling
and swapping of early ADEPT versions were based
upon empirical knowledge. The latest versions of
the Basic and Extended Executives include routines
to record system performance, reliability, and security
locks.
Built into the BASEX is a routine to measure the
overall and the detailed system performance.20 Such
factors as the number of users, file usage, hardware
and software errors, and page transaction response
time are recorded on unused portions of the 2303
drum. These measurements provide a better understanding of the system under a variety of inputs and
give the designers insight into how the hardware and
software components of the system affect the performance of the human user.
An AUDIT program was made part of the EXEX
to record the security interaction of terminals, users,
and files. AUDIT records EXEX activity in the areas
of LOGIN, LOGOUT, and File Manipulation. This
routine strengthens the security safeguards of the
executive. Specific items that are recorded involve:
type of event, user identification, user account number, job security, device identification, time of event,
file identification, file security, and event success. In
addition, this routine provides accounting information and is used as a means of debugging the security
locks of new system releases.
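The shape of an AUDIT record can be sketched as follows; the field names are inferred from the list above, and the Python rendering is illustrative only.

```python
# Illustrative AUDIT record (field names hypothetical): each LOGIN,
# LOGOUT, or file-manipulation event is recorded with the items the
# text lists, supporting security auditing, accounting, and the
# debugging of security locks in new system releases.

def audit_record(event, user, account, job_security, device,
                 time, file_id=None, file_security=None, success=True):
    return {"event": event, "user": user, "account": account,
            "job_security": job_security, "device": device,
            "time": time, "file_id": file_id,
            "file_security": file_security, "success": success}

log = []
log.append(audit_record("LOGIN", "user-a", 4217, "SECRET",
                        "terminal-12", "10:05:00"))
log.append(audit_record("FILE-OPEN", "user-a", 4217, "SECRET",
                        "2311-disc", "10:06:30",
                        file_id="DATA1", file_security="CONFIDENTIAL"))
```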
In addition to the BASEX recording function,
several object programs have been written that simulate various modes of user activity and provide controlled job distributions. These programs, called
"benchmarks," run under controlled conditions and
enhance the means of improving system performance
and throughput, as described elsewhere by Karush.21
The programs are designed to gather performance
measures on the major routines of the executive and
have been of considerable help in system "tuning,"
because they reflect the effect of coding and design
changes to various system routines. The routines in
the executive that are of primary concern are the
swapper, the scheduer ,the terminal read/write pack..
age, and the interrupt handling processes. Attempts
are being made to design a set of benchmarks that
represent a typical job mix. However, we are primarily
interested in measuring the performance of our system
against various modifications of itself and in measuring
its behavior with respect to different job mixes.


SUMMARY
The ADEPT executive is a second-generation, general-purpose time-sharing system designed for IBM 360
computers. Unlike the monolithic systems of the past,1,2
it is structured in modular fashion, employing distributed executive design techniques that have permitted
evolutionary development. This design has not only
produced a flexible executive system but has given the
user the same facilities used by the executive for
controlling the behavior of his programs. ADEPT's
security aspects are unique in the industry, and the
testing and fabrication methods employ a number
of novel approaches to system checkout that contribute to its operational reliability.
It is important to note that this system deals particularly well with size limitation problems of very
large files and very large programs. The provisions
made for multiple programs per job, active/inactive
page status for programs larger than core size, page
sharing between programs, common file access across
programs within jobs, and the commitment of considerable space to active file environment tables (up
to four pages worth) contribute to this success. Nevertheless, all these capabilities are designed to handle
the smaller entities as well. We feel ADEPT-50 is
a significant contribution to the technology of general-purpose time-sharing.

ACKNOWLEDGMENTS
We would like to express our appreciation for the
dedicated efforts of some very adept individuals who
participated in the design and building of this timesharing system. Our thanks go to Mr. Salvador Aranda,
Mr. Peter Baker, Mrs. Martha Bleier, Mr. Arnold
Karush, Mrs. Patricia Kribs, Mr. Reginald Martin,
Mr. Alexander Tschekaloff and all the others who
have followed their lead.

REFERENCES
1 P CRISMAN editor
The compatible time-sharing system: A programmer's guide
MIT Press Cambridge Mass 1965
2 J SCHWARTZ et al
A general-purpose time-sharing system
Proc SJCC Vol 25 1964 397-411 Spartan Books Baltimore
3 E W FRANKS
A data management system for time-shared file-processing
using a cross-index file and self-defining entries
AFIPS Proc Vol 28 1966 79-86 Also available as SDC
document SP-2248 21 April 1966
4 R E BLEIER
Treating hierarchical data structures in the SDC time-shared
data management system (TDMS)
Proc 22nd Nat ACM Conf Thompson Book Co 1967 41-49
5 E W DIJKSTRA
The structure of T.H.E. multi-programming system
C A C M Vol 11 No 5 May 1968
6 F J CORBATO V A VYSSOTSKY
Introduction and overview of the Multics system
Proc FJCC Nov 30 1965 Las Vegas Nevada
7 B W LAMPSON
Time-sharing system reference manual
Working Doc Univ of Calif Doc No 30.1030
Sept 1965 Dec 1965
8 B W LAMPSON
A scheduling philosophy for multi-processing systems
C A C M Vol 11 No 5 May 1968
9 J H SALTZER
Traffic control in a multiplexed computer system
MAC-TR-30 thesis MIT Press July 1966
10 G H FINE et al
Dynamic program behavior under paging
Proc ACM 1966 223-228 Thompson Book Co Wash D C
11 E G COFFMAN L C VARIAN
Further experimental data on the behavior of programs in a
paging environment
C A C M Vol 11 No 7 July 1968 471-474
12 L A BELADY
A study of replacement algorithms for a virtual storage computer
IBM Systems Journal Vol 5 No 2 1966
13 R W O'NEIL
Experience using a time-shared multi-programming system
with dynamic address relocation hardware
Proc SJCC 1967 Vol 30 611-627 Thompson Book Co
Washington D C
14 L G ROBERTS
Multiple computer networks and intercomputer communication
ACM Symposium on Operating System Principles
Oct 1-4 1967 Gatlinburg Tenn
15 E BOOK D C SCHORRE S J SHERMAN
Users manual for MOL-360
SDC Doc TM-3086/003/01
16 R R LINDE P E CHANEY
Operational management of time-sharing systems
Proc ACM 1966 149-159
17 P V McISAAC
Job descriptions and scheduling in the SDC Q-32 time-sharing system
SDC Doc TM-2996 June 1966
18 C WEISSMAN
Security controls in the ADEPT-50 time-sharing system
AFIPS Proc FJCC Vol 35 1969
19 W A BERNSTEIN J T OWENS
Debugging in a time-sharing environment
AFIPS Proc FJCC Vol 33 1968 7-14
20 A D KARUSH
The computer system recording utility: application and theory
SDC Doc SP-3303 Feb 1969
21 A D KARUSH
Benchmark analysis of time-sharing systems
SDC Doc SP-3343 April 1969

APPENDIX A: Advanced development prototype system block diagram.

An operational memory share supervisor providing multi-task processing within a single partition

by J. E. BRAUN
Penna.-N.J.-Md. Interconnection
Philadelphia, Pa.

and

A. GARTENHAUS
Applied Programming Services, Inc.
Philadelphia, Pa.

INTRODUCTION
The real-time digital process control system, of which the Partition Share Supervisor is an operational feature, was designed and implemented to assist in the functions of monitoring, evaluating and controlling an interconnected system of electrical power utility companies. The main processing unit is located at the central control office with teleprocessing communications to remote lower level control centers.
The basic addressable unit within the main processor is the byte (8 data bits + 1 parity bit), with a word consisting of four bytes. There is a storage protect option which is implemented through assignment of storage "keys" to contiguous 2048 byte blocks of memory. A group of memory blocks with matching protect keys comprises a partition or task area. This protection feature permits non-destructive read-out across partition boundaries but will cause termination of any task which attempts to write in another task's memory area.
The arithmetic-logic unit maintains its current status in a program status word which contains such information as whether or not I/O is currently being permitted on each of the data channels, the protect key for the instruction presently being executed, present machine status, length of current instruction, the address of the next instruction to be fetched, etc. There are certain instructions within the instruction set which can only be executed when the machine is in the "supervisor" state, i.e., when the portion of the program status word which indicates machine status is correctly set. These instructions are classified as "privileged" instructions and perform such functions as disabling data channel interrupts, altering storage keys, resetting the program status word, etc.
The ability of the computer to disallow certain of its instructions when operating in the normal problem program state prevents problem programs from inadvertently destroying critical storage areas or causing catastrophic conditions which could lead to system shutdown.
This system utilizes the independent I/O channel concept which permits the main processor to continue execution of program instructions while the channel transfers data from I/O devices into main storage by cycle interleaving.
Fall Joint Computer Conference, 1969

The multi-tasking capability of the manufacturer-supplied software support system permits priority scheduling of several tasks, all utilizing the resources of one processing unit. The design of the real-time control system requires that it perform certain of its functions on a cyclic basis. Therefore, the internal storage has been divided into four task areas (partitions), with time dependent and critical programs placed in partitions with relatively higher priorities. The following task descriptions are listed in order of task priorities:

Task 1 (core requirement = 42K)

Task 1 is dedicated to the manufacturer-supplied operating system (O/S) which contains supervisory routines, data management routines, priority scheduler, etc.

Task 2 (core requirement = 72K)

Task 2 incorporates the process control family of programs. It also includes the remote typewriter/card reader communications programs, since they use little processing time and benefit from both the independence of input/output channel operations and the quick response time available to the task. During power system emergency situations, Task 2 additionally initiates routines which, due to their critical nature, retain system resources and dispatch emergency communications until the disturbance is relieved.

Task 3 (core requirement = 40K)

Task 3 contains special digital console message processing routines, text output generators for programs operational within Task 2, routines for processing card inputs from the telecommunications system, and routines which monitor and control inter-task communications.

Task 4 (core requirement = 6K)

Task 4 is the Partition Share Supervisor (PSS) which causes Tasks 5 and 6 to share the remaining available memory. The detailed description of this task is the subject of this paper.

Task 5 (core requirement = 96K)

Task 5 consists primarily of scientific application programs. These programs are run as required, either on special demand from real-time on-line tasks or periodically, with the length of the period depending on the nature of the program.

Task 6 (core requirement = 96K)

This task is the off-line* task and is dedicated to miscellaneous uses such as compiles, assemblies, accounting routines, etc.

Figure 1 is a functional diagram of the tasks just discussed and shows their relative locations in computer memory.

Figure 1-Initial memory configuration with task functional descriptions and relative locations shown (from high to low memory address: Task 2, 72K; Task 3, 40K; Task 4 PSS, 6K; shared Task 5/Task 6 partition, 96K; Task 1 nucleus, 42K)
General discussion

Task dispatching
Task dispatching is under the control of the operating system. From a conceptual standpoint, the operating system can be considered to be the only main program in storage and all other tasks within the computer as subroutines.

* The term off-line is used in this paper when referring to tasks
which do not directly operate within the real-time environment.
This use is similar to the term "background" which the reader may have previously encountered.

The dispatching function consists of allocating the resources of the processor to the highest priority task which is in the "ready" state. When no tasks are in the ready state, the processor is not working and is in a wait state. When any task reaches a point where it can no longer process until the completion of some event (such as an I/O operation), it relinquishes control of computer facilities to lower priority tasks via the scheduler. It will regain these facilities when the event it is awaiting is completed and there are no higher priority tasks in the ready state.
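The dispatching rule just described (allocate the processor to the highest priority ready task, or enter the wait state when none is ready) can be sketched as follows; the task names, priorities, and list representation are hypothetical, not taken from the system.

```python
# Illustrative sketch of the task dispatcher described in the text.
# A lower priority number means a higher priority; returning None
# models the processor entering the wait state. Names are hypothetical.

def dispatch(tasks):
    """tasks: list of (priority, name, state) tuples."""
    ready = [t for t in tasks if t[2] == "ready"]
    if not ready:
        return None                 # no ready task: wait state
    return min(ready)[1]            # highest priority ready task runs

tasks = [
    (1, "TASK2-realtime", "waiting"),   # blocked awaiting an I/O event
    (2, "TASK3-console", "ready"),
    (3, "TASK4-PSS", "waiting"),
    (4, "TASK5-LPOL", "ready"),
]
```

When the event a higher priority task awaits completes, its state returns to "ready" and it is given the processor on the next dispatch.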

Inter partition communication
The subject real-time system requires that operational tasks be able to communicate for the purpose
of exchanging information such as live data, requests
to run various subtask routines, etc. Tasks which
communicate with other tasks are equipped with intertask communication routines which are considered the
highest priority routines within the individual task. In
this fashion, when the task is dispatched, the internal
task priority scheme allows the communication routines
to be processed first. Furthermore, any task can be
interrupted to allow its communication routines to
operate. Thus tasks can communicate at any time
(asynchronously).

Partition sharing
The Partition Share Supervisor (PSS) is required to
be able to handle three basic functions:
1. Suspend processing of the off-line task when required.
2. Load and process the lowest priority on-line
task (LPOL).
3. Upon completion of (2) above, be able to restore
and restart the off-line task.
There are two conditions under which PSS suspends
off-line processing. One is when the previously set
real-time clock causes an interrupt. This interrupt is
recognized as indicating the LPOL is to be recycled
for a periodic run. The other is when a communication is received from another task indicating that one of the routines within the LPOL task is to be executed.
Figure 1 shows the computer configuration in the
normal mode. Normal mode is considered to be when
the shared partition is occupied by off-line programs.
Note that there are four problem program partitions
(excluding the nucleus).
Figure 2 shows the configuration when the off-line programs are "rolled out" and the LPOL programs are operational. There are now three problem program partitions, and the area dedicated to the PSS and LPOL tasks is one contiguous partition.

Figure 2-Showing memory configuration when low priority on-line (LPOL) task is active (the 6K PSS area and the 96K shared area form one combined single task area of 102K)
Detailed discussion

The following description details the operations involved in reconfiguring the system from that of Figure 1 to that of Figure 2 and returning to that of Figure 1.
As previously stated, the PSS task is initiated for
one of two reasons:
1. Timer interrupt indicating a need to run the
LPOL task for time dependent programs.
2. External interrupt triggered by communication
from another task indicating a need to process
a requested program.

Prior to either type of interrupt, the PSS task is
in a wait state (i.e., the task cannot be dispatched
until the completion of one of the above two events).

Upon being initiated, PSS takes the following steps:
1. Places its own task in the supervisor state in
order to allow execution of privileged instructions
required to modify system control blocks in the
nucleus, override the storage protection feature,
and disable system interrupts at critical times.
2. Allows all outstanding I/O to complete in the off-line partition (quiescing the partition).
3. Erases the boundary between the PSS task and off-line task.
4. Deletes reference to the now non-existent off-line task from operating system control blocks.
5. Writes a copy of the off-line partition, which is now an extension of the memory area of the PSS task, on a disc file.
6. Reads the LPOL task into the vacated area.
7. Executes the LPOL task.
At this point, we have gone from the configuration
shown in Figure 1 to that of Figure 2 and the LPOL
task is now able to process its requests. Upon completion by the LPOL task of all required processing,
the following steps are taken by PSS to return to the
off-line configuration:
8. Writes the LPOL task on a disc file.
9. Reads the off-line task into the vacated area.
10. Re-establishes task boundaries erased in 3.
11. Restores system reference to the off-line task.
12. Places the PSS task in a "wait state" awaiting an interrupt which will cause a recycle.

At this point, the off-line task is fully restored to the system and in a "ready state". It will then be redispatched by the task dispatching routines on a priority basis.
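Steps 5 through 9 above amount to swapping the shared area between core and a disc file. The sequence can be condensed into a sketch, with the shared partition, the disc file, and the saved LPOL image modeled as equal-length byte buffers; all names are hypothetical and the control-block bookkeeping of steps 1-4 and 10-12 is elided.

```python
# Sketch of the PSS roll-out/roll-in cycle (steps 5 through 9 in the
# text). Memory and the disc file are modeled as byte buffers; only
# the order of the transfers is shown. Names are hypothetical.

def run_lpol_cycle(shared_area, disc_file, lpol_image, run_lpol):
    disc_file[:] = shared_area      # 5. roll the off-line partition out
    shared_area[:] = lpol_image     # 6. read LPOL into the vacated area
    run_lpol(shared_area)           # 7. execute the LPOL task
    lpol_image[:] = shared_area     # 8. roll LPOL out, saving its state
    shared_area[:] = disc_file      # 9. roll the off-line task back in
```

After one cycle the off-line core image is back unchanged, while the saved LPOL image carries whatever the LPOL run left in memory, ready for the next cycle.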

System control blocks
Prior to a detailed discussion of PSS mechanics, we
will discuss relevant system control blocks utilized in
effecting partition sharing.
Task Control Block (TCB)
There is a TCB associated with each task. Contained
in the TCB are various boundaries, indicators, etc.,
used in performing task control. Figure 3 shows those
fields (with references labeled as used in this paper)
which are accessed or modified by PSS.
TCB List (TCBLIST)
The TCBLIST is located in the nucleus and is a list of TCB locations in order of task priority. There

Figure 3-Task control block (TCB):
TCBTABB: pointer to task TABB (boundary box; see Figure 5)
TCBPKE: contains storage protection key for the task
TCBIDF: task identification number
TCBTCB: pointer to next lower priority task TCB

is an entry in the list for each task in the system (see Figure 4).
Task Area Boundary Block (TABB)
There is a TABB associated with each task. The
TABB contains addresses defining the upper and lower
boundaries of the task region and also has a pointer
to the first free area label within the task. The format
of a TABB is shown in Figure 5.
Free Area Label (FAL)
There is an FAL which is an integral part of every available free storage area in memory. An FAL is effectively a label for each free storage area which defines the size of it and contains a linkage pointer to the next FAL. The format of an FAL is shown in Figure 6.

Input/Output Request Element (IORE)
There is a chain of IOREs for all outstanding or queued I/O operation requests from any partition. Each IORE contains information used by the system I/O interrupt handling routines as I/O operations are completed. Figure 7 shows the format of an IORE.

System Vector Table (SVT)
The SVT is resident in the nucleus and contains essential pointers required by the operating system. Included is a pointer to the start of the IORE chain. The location of the SVT is retrieved from a fixed memory location which is conditioned with the SVT address during system initialization.

As mentioned under General Discussion, the PSS task is required to run in supervisor state at times. Although the state of the PSS task changes from problem to supervisor and back throughout its execution, these changes of state will not be noted in this discussion. It should be understood that PSS operates in problem state at all times when it is not required to be executing privileged instructions, modifying storage in another partition or the nucleus, or disabling I/O interrupts.

Figure 4-TCB list (TCBLIST):
POINTER TO TCB OF HIGHEST PRIORITY TASK
POINTER TO TCB OF NEXT HIGHEST PRIORITY TASK
...
POINTER TO TCB OF LOWEST PRIORITY TASK

Figure 5-Task area boundary block (TABB):
FALPT: pointer to first free area label (FAL) within task area (see Figure 8)
LOADDR: the address of the low boundary of the task
HIADDR: the address of the high boundary of the task

Figure 6-Free area label (FAL):
FALNXT: pointer to the next FAL in the chain of FALs (if this field is all zeros, this is the last FAL in the chain)
FALCOUNT: amount of free memory available starting at the beginning of this FAL

Figure 7-I/O request element (IORE):
IORESTAT: status indicator for this IORE (the last IORE in the chain has an IORESTAT field with a value of 1)
IOREID: field set to the same ID number as that of the TCBIDF field of the task which initiated the I/O request (see Figure 3)

Quiescing a partition

Prior to rolling out the off-line partition, PSS must
be sure all I/O is quiesced in order to prevent the I/O
supervisor routines from accessing some storage area
which is in a transitory state.
There is an IORE for all outstanding and queued I/O requests. Within each IORE is an identification number field (IOREID; see Figure 7) which links it with the initiating task. When that task is involved in an I/O operation, the TCBIDF field of the TCB (Figure 3) has a task identification number that will match the IOREID field of some active IORE.
As I/O interruptions occur, the I/O Interrupt Handler services the interrupt, removes the appropriate IORE from the chain, and makes it inactive.
Partition quiescing is accomplished by initially disabling I/O interrupts, obtaining the TCBIDF field from the TCB of the task involved, locating the IORE chain by using the pointer in the SVT, and scanning the IOREs checking for IOREID fields which match the TCBIDF field of the TCB. If none are found, there are no IOREs for the task and it is already in a quiescent state. If any are found, then the task has a pending I/O interrupt or outstanding I/O requests. If this is the case, PSS enables interrupts allowing the I/O Supervisor to process, if necessary, and then immediately disables them. If the I/O in question has been completed, the IORE will have been removed from the chain during the time interrupts were enabled.
PSS restarts at the beginning of the chain and checks again, repeating the above steps until it comes to the end of the chain without having found any active elements for the task. When it reaches this point, there are no longer any IOREs associated with the task and it is in fact quiescent.
It should be noted that since the PSS task has a higher priority than the task to be quiesced, it does not allow any new I/O requests to be initiated by that task, since PSS retains the computer resources.
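The scan-and-retry procedure above can be sketched with the IORE chain as a linked list; the interrupt enable/disable machinery is reduced to a callback and all names are hypothetical stand-ins for the real control blocks.

```python
# Sketch of partition quiescing: scan the IORE chain for an element
# whose IOREID matches the task's TCBIDF; if one is found, let the
# I/O supervisor run (a callback standing in for the brief enable and
# re-disable of interrupts), then rescan from the head of the chain.

class IORE:
    def __init__(self, ioreid, nxt=None):
        self.ioreid = ioreid
        self.next = nxt

def quiesce(chain_head, tcbidf, let_io_supervisor_run):
    while True:
        elem = chain_head()          # chain located via the SVT pointer
        while elem is not None and elem.ioreid != tcbidf:
            elem = elem.next
        if elem is None:
            return                   # no matching IORE: task is quiescent
        let_io_supervisor_run()      # enable interrupts, service, disable
```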

Erasing of a partition boundary and
task deletion
There is control information which is received by

the communications routines within the PSS task
which must be accessible to the LPOL task for both reading and writing (such as indications of which LPOL routine is to be run, the replacement value for the next cycle time which is calculated by the LPOL task as a function of its current running time, entry point addresses of routines mutually shared by the PSS and LPOL tasks, etc.). Additionally, task management is greatly facilitated by extending the PSS task area to include the LPOL function while controlling via the PSS Task Control Block (TCB) rather than modifying the off-line task TCB or creating a new one.
In order to make the shared task area a memory extension of the PSS task, the memory areas must be linked. This is achieved by modifying the TABB (see Figure 5) of the PSS task so that the LOADDR field points to the low address of the shared task. Figures 8 and 8a show the pointer relationships before and after these TABB modifications.
The storage protection feature must now be satisfied to make the two storage areas completely contiguous. Since there is a mismatch in storage keys between the PSS and shared tasks, the keys associated with each protected block of memory within the shared task are reset to match those of the PSS task. At this point, the two task areas have become a contiguous block of memory assigned to the PSS task area.

Figure 8a-TABB pointers after modification
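Erasing the boundary thus reduces to two actions: repointing the PSS TABB's LOADDR and rewriting the per-block storage keys. A sketch follows, with the TABBs and the key table as hypothetical dictionaries rather than the real control blocks.

```python
# Sketch of absorbing the shared partition into the PSS task area:
# the PSS TABB's LOADDR is repointed to the shared task's low address,
# and the storage protect key of every 2048-byte block in the absorbed
# area is reset to the PSS key. Structures are hypothetical stand-ins.

BLOCK = 2048

def absorb_partition(pss_tabb, shared_tabb, keys, pss_key):
    """tabb arguments: dicts with 'loaddr'/'hiaddr'; keys: block table."""
    pss_tabb["loaddr"] = shared_tabb["loaddr"]   # extend boundary downward
    lo, hi = shared_tabb["loaddr"], shared_tabb["hiaddr"]
    for blk in range(lo // BLOCK, hi // BLOCK):
        keys[blk] = pss_key                      # make protect keys match
```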
Figure 9 shows how TCBs are linked together within the system. Note that each entry in the TCBLIST points to a TCB and each TCB points to the next lowest priority TCB in the chain. Figure 9a shows the arrangement of the TCBLIST and the TCBTCB field in the next-to-last TCB in the chain after modification to three partitions. This has been done by replacing the pointer to the last TCB in the TCBLIST with a pointer to the next-to-last TCB, and setting the TCBTCB field of the next-to-last TCB to zero. These modifications have additionally made the last task nonexistent to the operating system.

Figure 8-TABB pointers in PSS and off-line task prior to modification

Figure 9-Portion of nucleus showing TCBLIST and TCBTCB pointer relationship prior to modification

Figure 9a-TCBLIST and TCBTCB pointers after modification
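The two pointer changes that delete the task from the operating system's view can be sketched as list surgery on hypothetical stand-ins for the TCBLIST and the TCBTCB chain:

```python
# Sketch of making the lowest priority task nonexistent to the
# operating system: the last TCBLIST entry is replaced with a pointer
# to the next-to-last TCB, and the next-to-last TCB's TCBTCB field is
# zeroed to end the chain. Structures here are hypothetical stand-ins.

def delete_lowest_task(tcblist, tcbtcb):
    """tcblist: TCB ids in priority order; tcbtcb: id -> next lower id."""
    next_to_last = tcblist[-2]
    tcblist[-1] = next_to_last       # last TCBLIST entry repointed
    tcbtcb[next_to_last] = 0         # chain now ends at next-to-last TCB
```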

Rollout/Rollin
The process of rolling out the off-line task and rolling in the LPOL task is a straightforward write/read operation to a disc file. Since storage is divided into 2048 byte units for assignment of storage keys, the task area read or written is some multiple of 2048 bytes in length. Thus the records are read or written in 2048 byte blocks for purposes of simplicity and efficiency.

Free area modification
The PSS and LPOL tasks now occupy the same task area. It is necessary, therefore, to make certain modifications which will cause all requests for work storage to be satisfied from that portion of the task area wholly dedicated to the LPOL task. Although no task boundary exists between LPOL and PSS, if work storage were to be allocated from the PSS domain, it would not be subsequently saved and restored in future cycles, since the PSS area is not included in the dynamic area which is stored on the disc file.
Figures 10 and 10a show how these modifications are accomplished. Initially (Figure 10) the FALPT field of the PSS TABB is pointing to the free area within what was its own task area. This is the normal condition for this pointer when there is an operating off-line task. However, we have modified the configuration to three task areas and we now wish to make the only available free area exist in the LPOL area. Figure 10a shows that the FALPT field of the PSS TABB has been re-pointed to the first FAL within the LPOL task area.
At this point, the LPOL task is ready to process

Figure 10-FALPT relationship with FAL locations prior to modification

Figure 10a-FALPT fields after modification


whatever request caused it to be activated. We have now covered steps 1 through 7 under General Discussion. In returning from the three-partition to the four-partition environment, the steps are essentially the reverse of those detailed.
Upon restoring the off-line task, PSS enters a wait state and will be restarted as previously outlined. The task dispatcher portion of O/S will restart the off-line task as soon as there is available computer time and no higher priority tasks require the computer resources.

Initialization
The initialization process for PSS consists of:
1. Suspending off-line processing.
2. Reconfiguration from four to three partitions.
3. Rolling out the off-line task.
4. Making the off-line task area one contiguous free area.
5. Loading the LPOL task and allowing it to initialize itself.
6. Rolling out the LPOL task.
7. Rolling in and restarting the off-line task.
8. Entering the normal cycle at the wait point.

Step 4 above has not been previously covered in detail. In order to force the initial loading of LPOL into the desired location, the FALs for PSS are initially modified. Figures 10 and 10a show the PSS TABB before and after this is done. The FALPT field of the PSS TABB initially points to the first FAL within the PSS area. The FALPT field of the LPOL TABB points to the first FAL of its task area. By altering the FALPT of the PSS TABB to make it point to the first LPOL FAL, and by altering that FAL both to make it the last FAL in the chain and to indicate one large block of free memory, we have created a large free area available to PSS for loading the LPOL programs.
As the LPOL task acquires and releases memory blocks for work storage, the FALs within the area are modified by the operating system consistent with memory availability. PSS simply saves the pointer to the first LPOL FAL prior to each rollout and restores it after rollin and prior to reinitiating LPOL. Continuity of FAL linking is maintained in this fashion.
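The FAL manipulation described above (one large chain-ending FAL covering the whole LPOL area) can be sketched with FALs as hypothetical dictionaries keyed by the Figure 6 field names; addresses and sizes are illustrative only.

```python
# Sketch of the free area label (FAL) chain of Figure 6: FALNXT links
# to the next FAL (zero ends the chain) and FALCOUNT gives the free
# byte count. make_single_free_area builds the one large FAL used to
# force the initial LPOL load. Addresses and sizes are hypothetical.

def make_single_free_area(start_addr, size):
    return {"addr": start_addr, "falnxt": 0, "falcount": size}

def total_free(first_fal, fal_at):
    """Sum FALCOUNT over a chain; fal_at maps an address to its FAL."""
    total, fal = 0, first_fal
    while fal is not None:
        total += fal["falcount"]
        fal = fal_at(fal["falnxt"]) if fal["falnxt"] else None
    return total
```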
Special handling

There are occasions when the off-line partition cannot be quiesced. This could be caused by a card reader jam, a printer being out of paper, etc., causing an IORE associated with the I/O to remain linked in the chain beyond some reasonable amount of time (presently 10 seconds). These conditions are relatively infrequent; however, provision has been made for them by advising the operator, via the computer console typewriter and an attention bell, that the off-line task is non-quiescent and requires attention.
The memory area actually required by PSS is less than 6K. However, in order to initially load PSS into memory, a large enough partition must be available to furnish the operating system job scheduler routines their required amount of core. This requirement is on the order of 24K. Thus there is a pre-initialization phase during which PSS changes the initial configuration (Figure 11) of 50K and 52K to 6K and 96K for the PSS and off-line tasks, respectively (Figure 1). The technique for doing this will not be detailed; however, the essential steps are as follows:
1. Referring to Figure 12, the initial PSS task area is shown in three segments (B, C, D) and the initial off-line task area is shown in one segment (A). The PSS Pre-Initializer is loaded by the operating system into area B.

Figure 11-Initial task core allocations:
Task 2 (on line): 72K
Task 3 (on line): 40K
Task 4 (PSS): 50K
Task 5/6 (off-line/LPOL): 52K
Task 1 (operating system nucleus): 42K

2. In order to place the PSS main program in the area where it can control storage, it must be forced into area D. To achieve this, the task area boundary block is modified to make area D free and areas B and C unavailable.
3. The PSS main program is loaded into area D.
4. The off-line boundary block is modified to include areas B and C as free areas.
5. Control is passed to PSS main.
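Condensed to its effect on free storage, the pre-initialization amounts to edits of which segments each boundary block offers as free; the segment letters follow Figure 12, while the set representation is a hypothetical stand-in for the boundary blocks.

```python
# Sketch of pre-initialization: the loader places programs in the free
# areas a task's boundary block advertises, so PSS main is forced into
# segment D by leaving only D free; segments B and C are then handed
# over to the off-line task. Segment letters follow Figure 12.

def pre_initialize(pss_free, offline_free):
    """Each argument is the set of free segment names for one task."""
    pss_free.clear()
    pss_free.add("D")                  # only D free: PSS main lands there
    offline_free.update({"B", "C"})    # off-line task gains B and C
```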

The configuration is now that of Figure 1.

Figure 12-PSS and off-line task areas before and after pre-initialization (segments B, C, D comprise the initial 50K PSS task area, with the pre-initialization program in B and the final 6K PSS task in D; segment A is the initial 52K off-line task area; the nucleus occupies 42K)

CONCLUSION

Implementation of PSS has effectively added 96K of additional processor memory to the real-time system of which it is an integral part. This, coupled with the facility to process off-line tasks while having an available stand-by on-line task, has greatly enhanced the capability of the system. The application of PSS has effected a maximal utilization of computer resources by the system.
REFERENCES

1 IBM System/360 operating system control blocks
Form No C28-6628
2 IBM System/360 operating system input/output supervisor
Program Logic Manual Form No Y28-6616
3 IBM System/360 operating system control program with MFT
Program Logic Manual Form No Y27-7128
4 IBM System/360 operating system fixed task supervisor
Program Logic Manual Form No Y28-6612

Structured logic
by R. A. HENLE, I. T. HO, G. A. MALEY
and R. WAXMAN
IBM Components Division
Hopewell Junction, N.Y.

INTRODUCTION

Large-scale integration for computer applications has been predicted for several years, but close examination shows that the progress has been uneven. Memory designers continually demand higher levels of integration for larger and faster memory systems, and new memory concepts are being developed to further exploit the characteristics of large-scale integration. The one-thousand-circuit chip will become nothing more than a milestone.
But what of the logic area? Here, we struggle along hoping to find some high-volume applications for chips with a mere fifty circuits. When we design a medium-sized machine we find that so much unit logic is required that the average level of integration falls below ten. Orderly memory and random logic integrated circuit fabrication procedures are growing so different that thought is being given to building different types of manufacturing facilities. This represents a rather drastic approach and in the authors' opinions may prove unnecessary.
The success to date in memory is encouraging, for it gives direction to logic. Memory products should therefore be examined critically, for they may well hold the key to success for logic products. The salient features of a chip used in a memory product are:

• Well-Defined Function. The memory chip designer knows exactly how his chip fits into the entire memory system. He therefore can optimize on a high level. As examples, he uses special circuits for the latch functions and uses decoders redundantly to save pads.
• Volume. While the initial memory chip design is quite complex, the volume requirement makes the initial design cost nearly negligible. With this ground rule the chip can be highly engineered, and nearly an order of magnitude improvement can be expected and obtained.
• Regularity. Memory arrays are regular in components and wiring. The layout geometry is well defined and can be highly optimized for total chip utilization.
• Low Power. Memory systems are designed and partitioned so that all circuits on a chip do not dissipate maximum power at the same time.

Structured logic, or array logic as it is sometimes called, is an attempt to design logic with more of the characteristics of memory. Many unsuccessful starts have taken place, but we shall discuss some of the more successful efforts. We shall also add some thoughts of our own, but it should be pointed out that the problem is far from solved.

Logic arrays

The basis of all array logic is a matrix of elements with programmable interconnections. Diode structures have been proposed in the past, and a matrix of common collector transistors is of recent interest. The transistor array is programmed in the factory by connecting or not connecting the emitter of each transistor to a common line. (See Figure 1.) We shall use transistor arrays in our examples, for that is what we have been working with, but diode arrays should not be ruled out.
Figure 1-A transistor array

The ROS
The read-only store (ROS) array in its simplest
form uses two decoders to feed the array: one feeds
the horizontal lines and the other the vertical lines,
as shown in Figure 2. A particular grid position in the
array is selected by activating the appropriate horizontal and vertical decoder lines. The addressed cell of the array is located at the intersection of the two activated lines. If the emitter at this address is connected to the horizontal decoder line, then a 1 has been programmed into this particular cell in the array. If the emitter is unconnected, a 0 is said to be programmed into the array. The presence of the programmed 1 or 0 is sensed at the output when that particular cell is addressed. The horizontal output lines are dot-ORed together to produce one common output line, as shown in Figure 3.
Conceptually, the ROS is related directly to a
Karnaugh map, one bit position in the array for each
square in the appropriate Karnaugh map. Figure 4
depicts the four-variable K-map that relates to the
ROS of Figure 2. This relationship proves the universality of a ROS, for any Boolean function that
can be K-mapped can be implemented directly. Universality is the feature of the ROS chip most often
described as an asset, but in practice it is seldom useful except in code translators. The Boolean functions
used in the design of any computer are definitely not random and not evenly distributed among all possible functions of n variables. This fact is well documented in the many failures with other universal
logic blocks (ULB's). The real problem with the ROS
array is that it doubles in size each time an input
variable is added. This doubling in size is necessary
to maintain the dubious value of being universal.
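The size argument can be illustrated with a small behavioral model. The following Python sketch is mine, not the paper's: it treats a ROS as a programmed truth table with one bit per Karnaugh-map square, so an n-input ROS needs 2^n bits and doubles with each added variable.

```python
# Illustrative model, not from the paper: a ROS behaves as a truth table
# holding one bit per Karnaugh-map square, so an n-input ROS needs 2**n
# bits and doubles in size with each added input variable.

def make_ros(truth_table):
    """truth_table: list of 2**n output bits, indexed by the input word."""
    n = len(truth_table).bit_length() - 1
    assert len(truth_table) == 2 ** n, "ROS size must be a power of two"
    def ros(*inputs):
        # The two decoders jointly select one grid position; behaviorally
        # this is just indexing by the concatenated input bits.
        address = 0
        for bit in inputs:
            address = (address << 1) | bit
        return truth_table[address]
    return ros

# A two-variable ROS programmed (connected/unconnected emitters) as XOR.
xor_ros = make_ros([0, 1, 1, 0])
```

Extending this to a third variable requires an 8-entry table, to a fourth a 16-entry table, and so on; that exponential growth is the price of universality the text describes.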

The ROAM
The read-only associative memory (ROAM) is a

Figure 2-Read-only store

Figure 3-Read-only-store circuits

Structured Logic
Figure 4-Karnaugh map

matrix of common collector transistors that may be
programmed by connecting or not connecting the base
of each transistor to a common line in its own column
(Figure 5). The emitters of each row are commoned
and feed the emitter of an output transistor. Each
row of array transistors and the associated output
transistor form a current switch.
Through phase splitters, each input variable has
both true and complement lines available to the array.
Hence, each variable controls a true line and a complement line (column) in the array. This gives rise

Figure 5-Read-only associative memory


to the word "associative" in the name. By programming each row in the array to a particular pattern
of 1's and 0's, the input word pattern will "associate"
(compare) with the appropriate row in the array. If
there is no match, the outputs will remain logical zeros.
If at least one row has a pattern the same as the input
pattern, there will be a logical one output on that
horizontal line (row).
To program the array, each base is tied to a true
line (column), a complement line (column), or is
left floating. Thus, for a base tied to a true line, a 1
on that input line will yield a 1 at the emitter and a
1 at the output, since the row of emitters effectively
forms a DOT-OR (positive logic). Bases tied to a true
line are equivalent to a logical 1, since a 1 at that input causes a 1 at the output.
Conversely, a base tied to a complement line is
equivalent to a logical 0. A 0 at a particular input
raises the complement line of the phase splitter,
thereby raising to the 1 level all emitters of transistors
in that column that have their bases tied to the complement line (column).
If the base is left floating, that array grid position
is effectively a DON'T CARE. That is, the output
line will not be raised to 1 by either a 1 or 0 at that
transistor's column input.
Figure 6 illustrates the implementation of an adder
position with SUM and CARRY outputs using a
ROAM array. A black triangle connecting a vertical
line and a horizontal line indicates a base connection;
lack of a black triangle indicates a floating base. Note
that if a true line is connected, then the complement
line is not connected, and vice versa for each array
grid position. Thus, at most, only 50 percent of the horizontal and vertical intersections will ever be used.
To conceptually understand the ROAM and relate
it to the Karnaugh map it is convenient to think in
terms of negative logic. Thus, down levels are logical
1, the commoned emitters of each row form a DOT-AND (all emitters down results in a down level, any
emitter up results in an up level), and dotting the output
transistors results in a DOT-OR.
Each row of the ROAM represents a term of a
logical expression in the sum-of-products form. The
logical expression CARRY = B . C + A . B + A . C
is in sum-of-products form, and B . C, A . B, and
A . C are each terms of the expression. Each term
may be implemented on one row of the ROAM. For
example, Figure 6 illustrates the implementation of
the CARRY function. Note that the A true and B
true columns are both connected to a transistor base
in the second row of the ROAM array, yielding the
term A . B. The three rows B . C, A . B, and A . C


are DOT-ORed at the output to yield B . C + A .
B + A . C = CARRY. In forming the term A . B,
the variable C does not have its true or complement
column line connected to a base. CARRY is 1 if A is
1 and B is 1 regardless of the value of C.
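The programming rules above can be rendered as a small behavioral model. The Python sketch below is my own rendering, not the paper's circuit: each row records which columns are tied to true lines and which to complement lines, floating bases act as don't cares, and the rows are DOT-ORed.

```python
# Behavioral sketch of ROAM programming (the model and names are mine,
# not the paper's circuit).  Each row records which columns are tied to a
# true line and which to a complement line; floating bases are don't cares.

def roam_row(true_cols, comp_cols):
    """One product term: associates when every tied position matches."""
    def row(inputs):
        return (all(inputs[i] == 1 for i in true_cols) and
                all(inputs[i] == 0 for i in comp_cols))
    return row

def roam_function(rows):
    """DOT-OR of the rows: 1 if any row associates with the input word."""
    return lambda inputs: any(row(inputs) for row in rows)

# CARRY = B.C + A.B + A.C as three rows (columns A=0, B=1, C=2; in each
# row the third variable's columns are left floating, i.e., don't care).
carry = roam_function([roam_row({1, 2}, set()),
                       roam_row({0, 1}, set()),
                       roam_row({0, 2}, set())])
```

An input word like (1, 1, 0) associates with the A.B row, so CARRY is 1; a word like (1, 0, 0) matches no row and the output remains a logical zero, as the text describes.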
Each term of a logical expression in sum-of-products
form is an "implicant" on a Karnaugh map. An implicant is formed by looping the 1's in the Karnaugh
map and "reading" the loops from the map. Loops
can only contain adjacent 1's, and the number of ones
in a loop must be equal to 1, 2, 4, ..., a power of 2.
This results from the fact that adjacent squares on a
Karnaugh map always differ only by the value of
one variable. Two squares looped yields a term with
n-1 variables (n = number of variables), four squares
looped yields a term with n-2 variables, etc. Thus,
each implicant requires one row in a ROAM. The
bigger the loop of 1's, the fewer connections need be
made in that row. The complete expression is formed
by DOT-ORing the rows, which is the same as ORing
the implicants.
The example of Figure 6 uses three loops of two
1's each to form the CARRY. The SUM is formed
by four loops of one 1 each. In this case three con-

Figure 6-ROAM adder position

TABLE I-Bits required for n variables in ROS and ROAM arrays

                                  VARIABLES
                        2    3    4    5    6    7    8      n
ROS (always
universal), 2^n bits    4    8   16   32   64  128  256    2^n

ROAM, 2·I·n bits for I implicants:
  I = 1                 4    6    8   10   12   14   16    2·n
  I = 2                 8   12   16   20   24   28   32    4·n
  I = 3                12   18   24   30   36   42   48    6·n
  I = 4                16   24   32   40   48   56   64    8·n
  I = 5                20   30   40   50   60   70   80   10·n
  I = 6                24   36   48   60   72   84   96   12·n
  I = 7                28   42   56   70   84   98  112   14·n
  I = 8                32   48   64   80   96  112  128   16·n
  I = 9                36   54   72   90  108  126  144   18·n
  I = 16               64   96  128  160  192  224  256   32·n
  2^(n-1) rows
  (universal)           8   24   64  160  384  896 2048  n·2^n

nections must be made in each of the four required rows
to obtain

SUM = A' . B' . C + A' . B . C' + A . B' . C' + A . B . C

(where the prime denotes the complement). In contrast to the ROS, the ROAM can have universal capability with only one-half the number of
rows as the ROS needs bits for the same number of
variables. Moreover, the ROAM does not need to be
universal to be useful, thus allowing even further
reduction in size. Table I illustrates the difference
brought about by the ROS requiring one bit per K-map
position and the ROAM requiring one row per K-map
implicant.
Historically, computer functions are composed of
about four implicants or terms. The chart shows that
a four-implicant function is cheaper to implement
with a ROAM than with a ROS when the function
contains six variables or more. When the decoders required for the ROS are considered, even four-variable functions with four implicants are more economical in ROAM than in ROS.
Two useful formulas to compare ROS bits required
with ROAM bits required for a given function are:

ROS bits = 2^n
ROAM bits = 2·I·n,

where n = number of variables, I = number of implicants. Thus, it is more economical to build a function
with the ROAM when 2·I·n < 2^n. This does not
consider the cost of the ROS decoders, which add a
factor to the inequality.
If we assume that the decoders for n even take
2n·2^(n/2) bits, and for n odd take [(n+1)·2^((n+1)/2) + (n-1)·2^((n-1)/2)] bits,
then the cases for which ROAM should be used are:

1. n even
   2·I·n < 2^n + 2n·2^(n/2);
2. n odd
   2·I·n < 2^n + (n+1)·2^((n+1)/2) + (n-1)·2^((n-1)/2)

Thus, ROAM is more economical than ROS in most
practical problems.
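The comparison can be checked numerically. This Python sketch encodes the formulas from the text (the helper names are mine); it confirms the claim above that a four-implicant function becomes cheaper in ROAM, decoders ignored, once the function reaches six variables.

```python
# Bit-cost formulas from the text; the helper names are my own.

def ros_bits(n):
    """A ROS needs one bit per K-map square: 2**n bits."""
    return 2 ** n

def roam_bits(n, implicants):
    """A ROAM needs one 2n-bit row per implicant: 2*I*n bits."""
    return 2 * implicants * n

def ros_decoder_bits(n):
    """Decoder cost assumed in the text for even and odd n."""
    if n % 2 == 0:
        return 2 * n * 2 ** (n // 2)
    return (n + 1) * 2 ** ((n + 1) // 2) + (n - 1) * 2 ** ((n - 1) // 2)

# Smallest n at which a four-implicant function is cheaper in ROAM
# (ignoring decoders), matching Table I.
crossover = min(n for n in range(2, 17) if roam_bits(n, 4) < ros_bits(n))
```

At n = 5 the ROAM still costs 40 bits against the ROS's 32; at n = 6 it costs 48 against 64, so the crossover falls at six variables. Adding the decoder term only moves the balance further toward the ROAM.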
A realistic example of control logic for a small machine model has been implemented using the ROAM
array. Table II gives a comparison of the number of
bits required for a ROAM implementation versus the
number of bits required for a ROS implementation.
Note that the ROAM is significantly more economical.
A partitioning of functions could have been devised
for the ROS implementation. The ROAM would still

TABLE II-ROS vs. ROAM: a control logic example

TOTAL NUMBER OF VARIABLES ................................. 14
TOTAL NUMBER OF FUNCTIONS ................................. 6
TOTAL NUMBER OF IMPLICANTS ................................ 12
    One 7-implicant function of 13 variables
    Four 1-implicant functions of 7 variables
    One 1-implicant function of 11 variables

ROAM
    ARRAY SIZE: 28 X 12 ................................... 336 BITS

ROS 1
    ARRAY SIZE/FUNCTION: 2^14 ............................. 16,384 BITS
    6 ARRAYS FOR 6 FUNCTIONS: 6 X 16,384 .................. 98,304 BITS
    SHARED DECODER ........................................ 3,584 BITS
    TOTAL BITS ............................................ 101,888

ROS 2
    ARRAY SIZE FOR 13 VARIABLES: 2^13 ..................... 8,192 BITS
    ARRAY SIZE FOR 7 VARIABLES: 2^7 X 4 ................... 512 BITS
    ARRAY SIZE FOR 11 VARIABLES: 2^11 ..................... 2,048 BITS
    SHARED DECODER ........................................ 3,584 BITS
    TOTAL BITS ............................................ 14,336


be more economical than the ROS, however, especially
when one considers the additional wiring complication
of connecting several small ROS arrays and the additional design time required to effectively partition
the functions.
The optimum size for a ROAM has not been determined, but chips with at least 512 bits on them are
desirable. This capacity would provide between eight
8-variable, 4-implicant functions, and one 64-variable,
4-implicant function (an extreme case, needless to say)
on a chip. The practicality of building and using such
a chip is yet to be determined.

The SLT array
Arrays can be designed so that they may be used for
direct replacement of present logic. The SLT array
performs the function AND-OR-INVERT in negative logic or OR-AND-INVERT in positive logic
and can be used directly to replace SLT logic. While
direct replacement of random logic with array chips
may prove to be the wrong approach in the long run,
it may well be the only way to get array logic started.
The SLT array has the same advantages over ordinary logic that all arrays have: orderliness of design
and layout, and high density with relatively low cost.

In addition, this type of array has a higher bit usage
than other arrays, since it more closely resembles the
familiar random logic, functionally. The SLT array
does not have decoders or phase splitters on its input
lines, as do other types of arrays. This makes the array
less universal than even the ROAM array but more
effective for random logic. It is fair to say that arrays
of this type make poor code translators just as SLT
logic builds poor translators. It is difficult to believe
that any array will be effective in both random logic
and code translation problems.
As already stated, the ROAM array has specific
applications to decoders and associative memory
problems. The SLT array may very well be the element required to do general logic design. The reason
for this is the placement of the inverters as shown in
Figure 7. This movement of the inverters to the output lines may appear a minor modification, but it
should be remembered that there has never been a
useful logic block with inverters on the input lines. It
may pay to have both true and complemented outputs from a current switch logic block. Figure 8 shows
a full adder implementation in SLT logic and in an
SLT array.
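The AND-OR-INVERT behavior can be sketched behaviorally. The grouping below is my own reading of the full-adder position, not the actual SLT circuit: two AOI blocks suffice when the complemented CARRY is fed back as an operand, which also previews the feedback discussed in the next section.

```python
# Hedged behavioral sketch (the gate grouping is my reading, not the
# actual SLT circuit): the array computes AND-OR-INVERT terms, and a
# full-adder position uses the complemented CARRY as a fed-back operand.

def aoi(*groups):
    """AND each group of inputs, OR the group results, then invert."""
    return 0 if any(all(g) for g in groups) else 1

def full_adder(a, b, c):
    not_carry = aoi((a, b), (a, c), (b, c))        # CARRY' = (ab+ac+bc)'
    carry = 1 - not_carry
    # SUM = (a+b+c)·CARRY' + a·b·c, realized as a second AOI plus invert.
    sum_ = 1 - aoi((a, not_carry), (b, not_carry),
                   (c, not_carry), (a, b, c))
    return sum_, carry
```

Having both true and complemented levels available, as the text suggests, is what lets the second AOI block reuse CARRY' directly instead of spending another inversion.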

Array-driving arrays
The SLT array in Figure 8 demonstrates one necessary feature of an array that has yet to be discussed:
Any logic array must be able to drive any other array
in the same family, including itself. Note in Figure
8 the CARRY output fed back into the array. This
line probably will be an external wire. This technique
is required since it is in effect Boolean factoring, a
proven necessity. This type of feedback is also needed
to produce sequential circuits, giving memory to the
arrays.

Figure of merit

Figure 7-SLT array

It is less meaningful to compare array logic with
random logic in each individual term of power consumption, propagation delay time, and silicon area,
since one can usually be traded for the other, such as
power with delay. Instead a comparison is made of
their figures of merit, chosen to be the product of
power consumption P, delay time T, and silicon area
A, all with weight function of one (PTA). Since no
isolation wall is needed between collector transistors,
a ROS or ROAM cell including appropriate interconnections can be laid out on a silicon chip area equivalent to 20-25 percent of that occupied by a transistor
that needs isolation walls. As shown in Figures 5 and 7,


Figure 8-SLT full adder position

the delay time of an array is two levels of current
switch emitter follower (CSEF) independent of the
number of inputs. For sophisticated functions, such
as the one-bit adder shown in Figure 8, more than two
levels of logic may be required.
Some typical comparisons of array logic and random
logic include the sampling design of array logic chips
to perform the same function a random logic chip
would. This comparison helps to partially discover
the merit and the limitation of the array logic. In
comparison with random logic chips that perform
sophisticated functions or have two or more cascading
levels of CSEF's, array logic chips have superior
PTA figures.

CONCLUSIONS

Various array configurations described here suggest
that random logic may be implemented by use of an
array of programmable crosspoints. Comparisons of
array logic with conventional logic indicate that in
many cases the PTA figure of merit is superior for
arrays. The most significant problem with arrays appears to be the limited useful size of a single array,
and the difficulty in standardizing a particular array
configuration. As a minimum achievement at this
time, it appears that arrays will be useful in development of complex functions within a silicon chip.
Array logic will not eliminate the need for a circuit
designer in the future, since specialized designs will
be needed to optimize circuit and component technology. In some of these design cases, the importance of
array logic techniques will be obvious, but in others
it will not be.
At this point, array logic does not appear to strongly
affect the system designer's approach to machine design, and a knowledge of array logic may never be required.
In the future, however, to the extent that array
logic techniques influence the design and optimization
of highly efficient functions, the system designer's
work will be significantly influenced by progress made
in developing array logic techniques.

BIBLIOGRAPHY

1 R RICE
Computers of the future
IBM Research Report RC-151 April 20 1959
2 R RICE
Systematic procedures for digital system realization from logic design to production
Proc IEEE Vol 52 No 12 1691-1702 Dec 1964
3 R C MINNICK
Application of cellular logic to the design of monolithic digital systems
Microelectronics and Large Systems
Spartan Books Wash D C 1965 225-247
4 L C HOBBS
Effects of large arrays on machine organization and hardware-software tradeoffs
Proc FJCC 1966 Vol 29 89-96
5 R C MINNICK
Cutpoint cellular logic
IEEE Transactions on Electronic Computers Dec 1964
6 W E KING III A GUISTI
Can logic arrays be kept flexible?
AFCRL Report 65-547 Aug 1965
7 D C FORSLUND R WAXMAN
The universal logic block (ULB) and its application to logic design
IEEE Conference Record 1966 Seventh Annual Symposium on Switching and Automata Theory 236-250
8 S S YAU C K TANG
Universal logic circuits and their modular realization
Proc SJCC 1968
9 R C MINNICK
A survey of microcellular research
Jour ACM Vol 14 No 2 April 1967 203-241

Characters-Universal architecture for LSI
by F. D. ERWIN and J. F. McKEVITT
Hughes Aircraft Company
Fullerton, California

BACKGROUND

Since the advent of LSI technology, several schemes
have evolved for the utilization of large arrays to their
full potential. A common and straightforward approach
involves the designer restricting himself to the equipment being designed at the moment. Faced with only
a limited set of problems, it is not difficult to specify
a small number of LSI array types which will efficiently
complete the design. While the results are quite encouraging for specific cases,1 the drawbacks of any mass
adoption of these techniques are obvious. This, the
so-called "custom approach," would require the semiconductor manufacturer to be responsive to each customer with numerous low-output production runs of
highly specialized devices. The per-unit cost to the
user, for his own efforts as well as those of the manufacturer, would be quite high due to the inability to
spread initial costs over many devices. In addition,
the complexity of 100-gate-plus arrays is such that it
is difficult to substitute one for another (with efficient
results). This would severely limit the off-the-shelf
capabilities of both user and manufacturer.
An obvious solution to these problems is the introduction of a small set of standard LSI chips. Semiconductor suppliers, making tentative advances into
LSI product marketing, have already proposed such
devices as adders, counters, and shift registers. However, this does not represent the solution to the general
problem. A design heavily committed to the use of these
devices must fall back on MSI or standard IC for the
large remainder of the circuitry. The reason is that
adders, counters, registers and other orderly, well-defined areas represent the regions of the system with
the highest gate-to-pin ratios. After these portions are
lifted out of the system, the remainder is characterized
by very low gate-to-pin ratios (notably control and
data routing functions). Unable to satisfy the LSI
design criteria of high gate-to-pin ratios any longer,
the designer must look to more standard components.
Unfortunately, any proposed solution to the LSI
partitioning problem which lacks a total system approach tends to drift towards this pitfall.
Researchers striving towards partitioning for total
or near-total LSI implementation tend to diverge
along one of two conceptual paths: bit-slicing and
functional partitioning. To illustrate the difference,
consider the data portion of the computer. In functional
partitioning one may specify an adder as one LSI array, registers as another, a shift register as a third, and
so forth. On the other hand, in bit-slicing one would
design an LSI array consisting of a combined one- or
two-bit adder, registers, shift registers, etc., then build
up his system from this chip type according to the desired word length.
The bit-slice approach has resulted in some notable
advantages, particularly the ability to achieve very
high gate-to-pin ratios and implement systems using
a small number of different array types.1,2 However,
bit-sliced modules have the basic flaw of being system-dependent, a drawback described by Pariser in an
early paper.3 This means that behind such bit-slicing
approaches there lie systems, real or implied, for which
the resulting arrays are most efficient. An attempt to
apply the arrays to a significantly different system
results in a poor design. Considering the types of bit-

slice devices being proposed, inefficiencies would most
often be manifest in the design of a simple device in
which the majority of the gates of the array intended
to accomplish complex functions are wasted. Although
this may be acceptable in some situations, it is unlikely that it would satisfy the strict requirements of
size, weight, power, and reliability imposed by aerospace and military systems.
It is the contention of this paper that a judicious
partitioning of digital systems in general, divorced
from bias towards any particular system, results in a
set of LSI devices that can entirely implement many
different computer systems of varying functional complexities and word lengths.
The resulting group of arrays, referred to as a
"character set" and each one individually as a different
"character", is sufficiently small in number (10), with
each type having acceptable size and gate/pin ratio,
to be considered acceptable and desirable in view of its
wide range of applications. These building blocks are
referred to as characters because of the metaphor that
may be made between the building blocks and characters of the alphabet (letters). Letters form words
to express the language whereas building blocks form
units to build the machine. In both cases a closed set
(of characters) is used to produce the desired end.
Although the character set is neither rigidly functionally-partitioned nor bit-sliced, it is biased towards
functional partitioning to give it the versatility to
efficiently implement both complex and simple digital
devices. As an approach, functional partitioning has
a detailed and successful background.3,4 Bit-slicing
considerations give the character set its ability to
implement systems of varying word lengths.
In addition to providing the user with a standard
set of chips to implement many different digital machines, the completeness of the approach (the ability
of the characters to implement the whole machine)
relieves the user of the burden of logic design. These
tasks are reduced to the selection of character types
and word lengths.

Introduction to the character set

A universal conclusion among LSI researchers is
that control functions are more difficult to modularize
than functions related to data operations. Micromemory control technique was chosen as the solution
for LSI implementation for several reasons. A micromemory, meaning here a read-only solid-state memory
with its sequencer and instruction register, is easily
partitioned into the large modules necessary for LSI
implementation. Control functions in this form are
then amenable to reproduction in large quantities
of identical units. Also, design with control centered
in one level of micromemory is more orderly and
straightforward.
The micromemory has been provided with a relatively sophisticated microprogram instruction repertoire. This means that the microprogram contains the
essence of the machine's major mathematical functions, such as multiply and complex sequencing. This
is desirable since it represents an efficient use of hardware for these purposes and also reduces the number of
different array types necessary. Also, a versatile repertoire leaves the designer free to make units which
operate as simply or as complexly as desired. The
degree of flexibility which this repertoire gives the
character set is a major factor in its success. It should
be stressed that the "micro operations" of the character set are as important a factor as its logic design. This
fact, a critical one in all LSI solutions committed to
micromemory control, cannot be overemphasized.
Interest in designing a character set at Hughes was
concurrent with the development of an advanced computer system. The character set itself was developed
with the ultimate objective of implementing all future
Hughes digital data processing equipment with a common family of LSI circuits.
The outcome of that original effort revealed that
computer structures in general are frequently ordered,
or at least amenable to such ordering, as shown in
Figure 1.
The divisions of Figure 1 are functional. That is,
regardless of the hardware characteristics, the computer
philosophy is such that its functions may be identified,
separated, and diagrammed as shown in the figure.
From Figure 1 came the concept of the functional
character set. With the fundamentals of LSI design
in mind, logic was designed to accomplish each computer

(Figure 1 panels: computer control functions; Boolean logic functions, minor (transfer, shift, rotate, complement, increment, logical OR, etc.) and major (add, subtract, exclusive OR, etc.); input/output functions; fast access register storage; auxiliary devices (counters, clocks, scratchpad); and core memory.)
Figure 1-Computer functional organization

(Figure 2 shows the ten characters: MM micro-array, G1 register storage, L1 general logic, L2 arithmetic, L3 input/output, M1 micromemory counter, M2 micro-instruction register, P1 scratch pad memory, P2 up/down counter, and P3 switch, with core memory defined as an I/O-type device.)

Figure 2-Functional character set

function indicated by the picture. Each unique LSI
chip type which resulted was referred to as a different
character type and given an identifying name and
number. Figure 2 shows the character set which resulted from the logic design according to the concepts
outlined in Figure 1.
The character set and repertoire have been through
several improvement cycles and used in the test implementation of a NASA computer to be discussed
later. Current plans include test design of the H4400
(a new Hughes computer) with the improved character
set, implementation of the character set with high-speed MOS circuits, and construction of one computer
using the characters.
These ten LSI characters alone provide the entire
hardware complement for the logic of a broad range of
computers and digital equipment. No extra logic in
the form of either IC, MSI, or custom LSI need be
added to the characters to finish the job. An important
by-product of this is that the user need never consider
logic design. His tasks are reduced to selection of the
necessary characters and the writing of the appropriate
microprograms for them. In fact, it is possible for the
character set to fit into a realistic total design automation procedure as discussed later.

Description of the character set

This section describes each of the ten characters.
They are summarized below for reference.

G1  Register storage
L1  General logic
L2  Arithmetic logic
L3  Input/Output
M1  Micromemory counter
M2  Micro-instruction register
MM  Micro-array
P1  Scratch pad memory
P2  Up/Down counter
P3  Switch

Characters of the same letter are logically grouped
into a common unit as illustrated in Figure 3.

Figure 3-Typical functional character configuration

G1 character

The G1 character provides the bulk of storage for
operands of the microprogram. Each character contains four registers of eight bits each accompanied by
reading and writing selector gates. The storage element
is provided with simultaneous dual reading and
writing capability. The storage flip flop itself is designed
for minimum read-after-write delay.
Each of the two input busses is common to all
registers and carries to the G1 character eight lines
per bus, one line from each bus for each bit of the
register. Input data selection is accomplished at the
memory element by a coincidence of positive information on a particular input bus and register selection
for that bus by destination decoding logic within the
character. The destination decoding logic is duplicated
to provide for writing from the two input busses into
the same character under control of two different microcommands. As will be illustrated later, this is a key
factor for the machine expandability property of the
character set, as it allows G1 to form a data path link
between individual logic units under control of up to
two different micromemories. Different registers in
the character may be written into simultaneously.
Reading of the register is provided by dual source
decoding logic which gates data to independent dual
output busses. This duality provides for information
from any two registers to be simultaneously placed on
two output busses. The conceptual structure of the G1
character is shown in Figure 4.
Several G1 characters placed in parallel provide
registers of more than eight bits in length.
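The dual-bus behavior of the G1 character can be modeled behaviorally. The Python sketch below is my own (class and method names are assumptions, not the paper's): four 8-bit registers with two independent write ports, each with its own destination decode, and two independent read ports.

```python
# Behavioral model of a G1 character (structure from the description in
# the text; the class and method names are my own assumptions).

class G1Character:
    def __init__(self):
        self.registers = [0] * 4          # four registers of eight bits

    def write(self, port, dest, value):
        # Duplicated destination decoding: each of the two input busses
        # (port 0 and port 1) selects its own destination register, so
        # two different registers may be written simultaneously.
        assert port in (0, 1)
        self.registers[dest] = value & 0xFF

    def read(self, source_a, source_b):
        # Dual source decoding gates any two registers onto the two
        # independent output busses at once.
        return self.registers[source_a], self.registers[source_b]

g1 = G1Character()
g1.write(0, 0, 0x3C)   # bus 0 writes register 0
g1.write(1, 2, 0xA5)   # bus 1 writes register 2 in the same cycle
```

Placing several such characters side by side, as the text notes, simply widens each register by eight bits per character.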


L1 character
The L1 character provides the basic logic functions
selectable by microprogram. In addition, input bussing
is provided for nine channels (eight bits/channel).
One channel of the bus is required for each G1, L2,
or L3 character connected to the L1 character. The
logic functions provided consist of the rotates, shifts
(logical), no-operation, complement, and incrementation. Also associated with the L1 character is the decoding logic for these logic operations. The type of
microprogramming used with the functional character
system relies heavily upon the fast and efficient manipulation of bits within the various operands. To this
end, shifts and rotates have been provided which execute from 1 to 31 positions in a single step (as opposed to serial operation). Incrementation is accomplished with the use of a logic register which may also
be used as a simple holding register. The L1 character
is eight bits wide and contains the following logic:
1. Bussing gates
2. Decoding logic
3. Rotate, shift, and complement logic
4. Incrementer
5. L register
6. Gating to output bus

In Figure 5 is shown a block diagram of the L1
character. Several L1 characters may be connected
together to form logic operations on words longer than
Figure 4-G1 character block diagram


Figure 5-L1 character block diagram

one byte. A limit of four bytes exists in order to maintain consistency of definition in the rotates and shifts.
Information entering the L1 card from the various
sources is bussed to form the input bus. Then it is
operated upon and the resultant is bussed to the output bus, where it leaves the character or is optionally
stored in the L register (where it would thus be available
at the next micro-instruction time for use in the increment operation or as an "L" source).
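The single-step rotate and shift behavior can be sketched behaviorally. The following Python model is my own (the 32-bit width corresponds to the four-byte maximum stated above; function names are assumptions):

```python
# Behavioral sketch of L1 rotate/shift/increment paths (names are mine;
# width fixed at the four-byte maximum the text allows).

WIDTH = 32
MASK = (1 << WIDTH) - 1

def rotate_left(word, positions):
    """Rotate 1 to 31 positions in a single step, not serially."""
    assert 1 <= positions <= 31
    return ((word << positions) | (word >> (WIDTH - positions))) & MASK

def shift_left(word, positions):
    """Logical shift: vacated positions fill with zeros."""
    assert 1 <= positions <= 31
    return (word << positions) & MASK

def increment(word):
    """Incrementation via the L register path."""
    return (word + 1) & MASK
```

The point of the single-step behavior is that a 31-position rotate costs the same one micro-instruction time as a 1-position rotate, which is what makes the bit-manipulation-heavy microprogramming style described above efficient.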

L2 character
The L2 character provides the major arithmetic
functions used by the microprogram. The arithmetic
unit provides the 2's complement sum of the contents of the A and B registers. Addition is performed
with carry look-ahead, byte parallel. Control signals
may condition the adder to alternately provide either
of two special results: (a) a mod 2 addition instead
of full addition, or (b) an input carry to the lowest order
bit for full addition (this forced carry in conjunction
with a negated operand accomplishes a 2's complement operand for subtraction). The L2 character
consists of two holding registers for the operands of
the adder, the adder itself, decoding and error logic,
and bussing gates. Figure 6 diagrams, function-wise,
the L2 character.
A typical arithmetic operation using the L2 character might proceed as follows: (1) first operand transferred to B register (from output bus), (2) second
operand transferred to A register, (3) after appropriate
delay, access result and transfer out of L2 character via
the input bus. The error logic provides overflow and
carry-out information.
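The three adder modes can be sketched behaviorally. This 8-bit, single-character Python model is my own rendering (reading "mod 2 addition" as bitwise exclusive OR, i.e., sum without carries):

```python
# Behavioral sketch of the L2 modes (names and the single-byte width are
# my assumptions; "mod 2 addition" is taken as bitwise XOR).

WIDTH = 8
MASK = (1 << WIDTH) - 1

def l2_add(a, b, mod2=False, carry_in=0):
    if mod2:
        return a ^ b                     # mod-2 addition: sum, no carries
    return (a + b + carry_in) & MASK     # 2's complement full addition

def l2_subtract(a, b):
    # Negated operand plus the forced low-order carry: a + ~b + 1 = a - b.
    return l2_add(a, ~b & MASK, carry_in=1)
```

The subtraction path shows why the forced input carry exists: complementing the operand gives the 1's complement, and the extra carry completes the 2's complement.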


Figure 6-L2 character block diagram

L3 character
The L3 character provides input/output capability
for the microprogram machine. For purposes here
input/output includes not only the usual peripherals
but also main memory, scratch pads, real time clocks,
and P-characters, namely all elements of the computer
not directly controlled by the micromemory. The L3
character provides input gating for external devices:
four buffered and three non-buffered channels. The
buffered-input gating may be controlled either by the
microprogram or by the external I/O device itself. Four
I/O output channels are provided. Interrupt signal
storage and interrupt mask storage for four channels are
available. Parity generation and checking along with
odd/even control is provided for the four buffered channels.

Assumptions and definitions

Figure 1 illustrates the interconnection structure of a
Maitra cascade.3 Every cell in the cascade is a two-input,
one-output cell. It is assumed that the Boolean variables
applied to the cascade are numbered as illustrated on the
cascade shown in Figure 1. All testing of the cascade is
accomplished using only the input leads and the output
lead of each cascade (and of arrays). The ability to
measure the functional value produced by a cell by
means of probing a buss connecting two adjacent cells is
not assumed. To minimize the "uncertainties" (the
functional values between cells cannot be measured and
the location of the error is unknown; therefore, the
functional values between cells are uncertain) involved
in testing cascades, it is assumed that cell n is tested first
(see Figure 1), then cell n-1, etc. If an error occurs in
cell n-j, its propagation may be stopped by one of cells
n-1, n-2, ..., n-j+1. Once cell n is tested, it may be
set such that it transmits the output of cell n-1 to the
output terminal of the cascade. In this manner (under
certain error assumptions) the cells may be tested in the
following order until error location results: n, n-1, ..., 1.
The number of tests needed to test a cellular cascade is
O(n)*, where n is the number of cells in the cascade.

* See Definition 6.

It is assumed that only one error (faulty cell) may
appear in a cascade. Also, the interconnections between
cells do not fail; the error is time independent, i.e.,
the error type in cell m has not changed; and the input
and output leads of the cascade do not fail.

It is assumed that the 12 allowable cell functions for a
Maitra cascade are f1, f2, f4, f5, f6, f7, f8, f9, f10, f11, f13,
and f14. (See Definition 1 for an explanation of the notation
fi.) Seven allowable errors are assumed for each cell;
these are f15 (s-a-1; stuck-at-one), f0 (s-a-0; stuck-at-zero),
f15-p (complementation, where p is the cell
function), f12 (the input X), f3 (the complement of the
input X), f10 (the input Y), and f5 (the complement of
the input Y). These seven errors consist of the two
failure types (s-a-0 and s-a-1) usually assumed by
most fault diagnosticians, augmented by f15-p, f12, f3, f10,
and f5. [Note that f10 and f5 have different allowable
error sets; i.e., Ef10 = (f0, f15, f5, f12, f3) and Ef5 =
(f0, f15, f10, f3, f12).]

Definition 1. The cell functions are numbered as
follows:

Xi Yi-1 | f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
 0  0   |  0  1  0  1  0  1  0  1  0  1  0   1   0   1   0   1
 0  1   |  0  0  1  1  0  0  1  1  0  0  1   1   0   0   1   1
 1  0   |  0  0  0  0  1  1  1  1  0  0  0   0   1   1   1   1
 1  1   |  0  0  0  0  0  0  0  0  1  1  1   1   1   1   1   1

Definition 2. An error occurs in a cell whenever the
cell produces a function that is not the same as the
function specified for that cell.

Definition 3. G = (f1, f2, f4, f5, f6, f7, f8, f9, f10, f11, f13, f14).

Definition 4. Ip denotes (1, 2, 3, 4, ..., p).

Definition 5. The error function E is a mapping
from G x In to G, where E(fi, j) = A denotes that cell j
was theoretically to produce fi in G but instead it
produced A. Clearly, E(fj, j) = fj indicates that cell j
does not have an error occurring in it.

Definition 6. X* means either X or X', but not both.

Definition 7. O(n) means the same order of magnitude as n.
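The numbering of Definition 1 can be stated compactly: bit 2Xi + Yi-1 of the index k is the output of fk. A small sketch (ours, not the paper's) checking the identifications used throughout:

```python
# Sketch (ours): Definition 1 -- bit 2*Xi + Yi-1 of the index k gives the
# output of cell function f_k, so f12 = X, f10 = Y, f3 = X', f5 = Y',
# f0 = 0, f15 = 1, and f_(15-p) is the complement of f_p.
def f(k, x, y):
    return (k >> (2 * x + y)) & 1

pairs = [(x, y) for x in (0, 1) for y in (0, 1)]
assert all(f(12, x, y) == x for x, y in pairs)       # f12 is the input X
assert all(f(10, x, y) == y for x, y in pairs)       # f10 is the input Y
assert all(f(3, x, y) == 1 - x for x, y in pairs)    # f3 is X'
assert all(f(5, x, y) == 1 - y for x, y in pairs)    # f5 is Y'
assert all(f(15 - k, x, y) == 1 - f(k, x, y)         # f15-p complements fp
           for k in range(16) for x, y in pairs)
```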

A necessary and sufficient condition for fault
location in cascades

Location of a single fault in a cascade is considered in
this section. A necessary and sufficient condition for
location of a single fault in a cascade is proven. The


proof of Theorem 1 can be utilized to obtain an algorithm to locate faults in a cellular cascade or array.

Theorem 1. Given a cascade with n cells, then the error
can be located if and only if for every i in In - (1):

(1) E(f14, i) != f15, f12
(2) E(f11, i) != f3, f15
(3) E(f8, i) != f0, f12
(4) E(f2, i) != f0, f3
(5) E(f6, i) != f9, f12, f3
(6) E(f9, i) != f6, f12, f3
(7) E(f13, i) != f12, f15
(8) E(f7, i) != f3, f15
(9) E(f4, i) != f0, f12
(10) E(f1, i) != f0, f3
(11) E(f10, i) != f0, f15, f5
(12) E(f5, i) != f10, f0, f15

Proof:

The proof is an induction proof. Clearly,
the theorem is true for the case n = 1.
Assume that the theorem is true for a
positive integer k and consider a cascade
with k + 1 cells. Given the cell function
for cell k + 1, if it can be shown that the
error can be located in cell k + 1 if and
only if assumptions (1) through (12) are
valid for cell k + 1, then the proof is
complete.

Assume conditions (1) through (12).
This part of the proof is now completed in
Figures 3 through 14. Note that if C0,
C1, ..., Ci are used to set Yi = C at time
t1, then if Yi = C is wanted at time t2 and
C0, C1, ..., Ci are utilized again, Yi is the
same value as it was at t1; however, all that
can be said about Yi is that it is either C
or C', but not both. This fact is used in the
proof of this theorem. In the figures with
the circled function number it may be
necessary to add one more test to determine whether the cell is in error or is
receiving the complemented sequence.

Figure 3-Test decision map for f14
Figure 4-Test decision map for f11
Figure 5-Test decision map for f8
Figure 6-Test decision map for f2
Figure 7-Test decision map for f6
Figure 8-Test decision map for f9
Figure 9-Test decision map for f13
Figure 10-Test decision map for f7
Figure 11-Test decision map for f4
Figure 12-Test decision map for f1
Figure 13-Test decision map for f10
Figure 14-Test decision map for f5

The proof of the other half of the
theorem will be by contradiction. Assume
that the error can be located, but that the
restrictions (1) through (12) are not
needed. Then it can be verified that the
following pairs of conditions give the same
output at the cascade's terminal. Since the
two conditions give the same outputs, the
error cannot be located, which is a contradiction of the assumption; therefore,
the assumption that the restrictions are
not needed is incorrect and the proof is
completed. After (1) an abbreviated notation is used. Note: Using the Test
Decision Maps and the contradiction part
of this proof one can actually determine
the values of Yi-1.

(1) Yk = 1, 1, 1 and E(f14, k + 1) = f14
are equivalent to Yk = 0, 1, 0 and
E(f14, k + 1) = f15 at the cascade's
output terminal.
Yk = 0, 0, 0 and E(f14, k + 1) = f14
are equivalent to Yk = 0, 1, 0 and
E(f14, k + 1) = f12 at the cascade's
output terminal.

(2) Yk = 0, 0, 0 and E(f11, k + 1) = f11;
Yk = 0, 0, 1 and E(f11, k + 1) = f3.
Yk = 1, 1, 1 and E(f11, k + 1) = f11;
Yk = 0, 0, 1 and E(f11, k + 1) = f15.

(3) Yk = 1, 1, 1 and E(f8, k + 1) = f8;
Yk = 1, 0, 1 and E(f8, k + 1) = f12.
Yk = 0, 0, 0 and E(f8, k + 1) = f8;
Yk = 1, 0, 1 and E(f8, k + 1) = f0.

(4) Yk = 1, 1, 1 and E(f2, k + 1) = f2;
Yk = 0, 1, 1 and E(f2, k + 1) = f3.
Yk = 0, 0, 0 and E(f2, k + 1) = f2;
Yk = 0, 1, 1 and E(f2, k + 1) = f0.

(5) Yk = 1, 1, 1 and E(f6, k + 1) = f6;
Yk = 0, 1, 0 and E(f6, k + 1) = f3.
Yk = 0, 0, 0 and E(f6, k + 1) = f6;
Yk = 0, 1, 0 and E(f6, k + 1) = f12.
Yk = 1, 0, 1 and E(f6, k + 1) = f6;
Yk = 0, 1, 0 and E(f6, k + 1) = f9.

(6) Yk = 1, 1, 1 and E(f9, k + 1) = f9;
Yk = 0, 1, 0 and E(f9, k + 1) = f12.
Yk = 0, 0, 0 and E(f9, k + 1) = f9;
Yk = 0, 1, 0 and E(f9, k + 1) = f3.
Yk = 1, 0, 1 and E(f9, k + 1) = f9;
Yk = 0, 1, 0 and E(f9, k + 1) = f6.

(7) Yk = 1, 1, 1 and E(f13, k + 1) = f13;
Yk = 0, 1, 1 and E(f13, k + 1) = f12.
Yk = 0, 0, 0 and E(f13, k + 1) = f13;
Yk = 0, 1, 1 and E(f13, k + 1) = f15.

(8) Yk = 1, 1, 1 and E(f7, k + 1) = f7;
Yk = 1, 0, 1 and E(f7, k + 1) = f3.
Yk = 0, 0, 0 and E(f7, k + 1) = f7;
Yk = 1, 0, 1 and E(f7, k + 1) = f15.

(9) Yk = 1, 1, 1 and E(f4, k + 1) = f4;
Yk = 0, 0, 1 and E(f4, k + 1) = f0.
Yk = 0, 0, 0 and E(f4, k + 1) = f4;
Yk = 0, 0, 1 and E(f4, k + 1) = f12.

(10) Yk = 1, 1, 1 and E(f1, k + 1) = f1;
Yk = 0, 1, 0 and E(f1, k + 1) = f0.
Yk = 0, 0, 0 and E(f1, k + 1) = f1;
Yk = 0, 1, 0 and E(f1, k + 1) = f3.

(11) Yk = 1, 1, 1 and E(f10, k + 1) = f10;
Yk = 0, 1, 0 and E(f10, k + 1) = f15.
Yk = 0, 0, 0 and E(f10, k + 1) = f10;
Yk = 0, 1, 0 and E(f10, k + 1) = f0.
Yk = 1, 0, 1 and E(f10, k + 1) = f10;
Yk = 0, 1, 0 and E(f10, k + 1) = f5.

(12) Yk = 1, 1, 1 and E(f5, k + 1) = f5;
Yk = 0, 1, 0 and E(f5, k + 1) = f0.
Yk = 0, 0, 0 and E(f5, k + 1) = f5;
Yk = 0, 1, 0 and E(f5, k + 1) = f15.
Yk = 1, 0, 1 and E(f5, k + 1) = f5;
Yk = 0, 1, 0 and E(f5, k + 1) = f10.

If the cascade meets the assumptions of Theorem 1,
then Theorem 1 can be used to determine test schedules
for the location of an error in cascades. It should be
noted that when cell k is tested, one obtains information
about the cells k - 1, k - 2, ..., 1, and therefore a test
schedule with O(n) tests will test any cascade with n
cells under the allowable error set.6 Clearly, if the
conditions of Theorem 1 are relaxed, then fault detection
(and maybe isolation) can be accomplished in the same
number of tests; however, if one is only interested in
fault detection, Theorem 2 is the best technique to use.

If a more complex cascade than the cascades considered here is under consideration, then a good
understanding of the method used to derive the
theorems in this paper will allow one to extend the
theories presented. If the cell functions f0, f3, f12, and f15
are allowed, then the fault techniques may be easily
extended, since none of these functions depend on the Y
value; however, one must exercise care in the use of the
theory because it is based on the ability of the tester to
place theoretically both a 0 and a 1 on the Y interconnection, and examples (trivial) in which this cannot
be accomplished do exist.

Fault detection in Maitra cascades

In this section the detection of a single fault in a
cascade is considered. The theory for this section is
based on the observation that every n cell Maitra
cascade (as defined in this paper) produces a function
dependent on X0.
The purpose of this detection scheme is to utilize
exactly two tests to detect whether a cascade has a
faulty cell.

Theorem 2. Let the Maitra cascade have n cells. If
c1, c2, ..., cn are such that f(X0, c1, c2, ..., cn) = X0*, then

(1) f(1, c1, ..., cn) = f(0, c1, ..., cn)
implies that there exists a cell i such
that E(fp, i) = f0, f15, f12, or f3.

(2) f(1, c1, ..., cn) = (1*)' and f(0, c1, ..., cn) =
(0*)' imply that there
exists a cell i such that E(fp, i) =
f15-p or f5.

(3) f(1, c1, ..., cn) = 1* and f(0, c1, ..., cn) =
0* imply that there is
no error in the cascade or that there
exists a cell i such that E(fp, i) = f10
and p != 10.

Proof: In part (1) f does not depend on X0;
therefore, there must be a cell i such that
E(fp, i) = f0, f15, f12, or f3. In part (2) f
depends on (X0*)'; therefore, there is a
cell i such that E(fp, i) = f15-p or f5.
The proof of part (3) is now obvious.
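A small simulation (ours, not the paper's) of the two-test detection, using the Definition 1 function numbering and the cascade of Figure 15 with its constants c = (0, 0, 1, 0):

```python
# Sketch (ours): Theorem 2's two-test detection on a simulated Maitra
# cascade. Cell i computes y = f(k_i, Xi, y_prev), with f_k numbered as in
# Definition 1 (bit 2x + y of the index k is the cell output).
def f(k, x, y):
    return (k >> (2 * x + y)) & 1

def cascade(cells, x0, xs):
    y = x0
    for k, x in zip(cells, xs):
        y = f(k, x, y)
    return y

cells = [14, 14, 8, 6]          # the Figure 15 cascade: f14, f14, f8, f6
c = [0, 0, 1, 0]                # constants with f(X0, c) = X0
assert cascade(cells, 0, c) == 0 and cascade(cells, 1, c) == 1

# Inject a stuck-at-0 (f0) error in cell 4: the two tests now agree, so by
# part (1) of Theorem 2 some cell has an error of type f0, f15, f12, or f3.
faulty = [14, 14, 8, 0]
assert cascade(faulty, 0, c) == cascade(faulty, 1, c)
```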

X0 was chosen as the variable to be used in Theorem 2
because of the symmetry of the resulting theorem.
Since X1 can be made (by a suitable choice of constants)
to pass theoretically through every cell*, the theorem
could be rewritten in terms of X1. In terms of the
complexity of the detection scheme it is seen that
cascades could have a very simple detection test
schedule. It should be noted that Theorem 2 can very
easily be adapted to provide fault detection in cascades
if it is assumed that f10 is not an allowable error for any
of the 12 cell functions.

* Assuming the cell function for cell 1 is not f10 or f5.


Examples

This section consists of examples of the use of
Theorems 1 and 2. fA denotes the measured value of
f whereas fT denotes the theoretical value of f.

Example 1. Assume that there is no error in the
cascade shown in Figure 15.

Test  X0 X1 X2 X3 X4   fT fA   Conclusion
1     0  0  1  0  0    0  0
2     0  0  1  1  0    1  1
3     0  0  0  1  1    1  1    E(f6, 4) = f6
4     0  0  1  0  0    0  0    E(f8, 3) = f8
5     0  1  0  1  0    1  1    E(f14, 2) = f14
6     1  0  0  1  0    1  1    E(f14, 1) = f14

Example 2. Assume that E(f8, 3) = f15 in the cascade
shown in Figure 15.

Test  X0 X1 X2 X3 X4   fT fA   Conclusion
1     0  0  1  0  0    0  1
2     0  0  1  1  0    1  1
3     0  0  0  1  1    1  0    E(f6, 4) = f6
4     0  0  1  0  0    0  1    E(f8, 3) = f15

Example 3. Assume that E(f14, 2) = f3 in the cascade
shown in Figure 15.

Test  X0 X1 X2 X3 X4   fT fA   Conclusion
1     0  0  1  0  0    0  0
2     0  0  1  1  0    1  0    E(f6, 4) = f6
3     0  0  0  1  1    1  0    E(f8, 3) = f5, so an extra test is needed
4     0  0  1  0  0    0  0
5     0  0  0  0  0    0  0    E(f8, 3) != f5 and the complemented sequence Y2 is being received
6     0  1  0  1  0    1  1    E(f14, 2) = f3

Example 4. This example satisfies the hypothesis of
Theorem 2. Assume that E(f6, 4) = f0 for
the cascade shown in Figure 15.

[(X0 + X1 + X2)X3] XOR X4 = fT(X0, X1, X2, X3, X4)

fT(X0, 0, 0, 1, 0) = X0

fA(0, 0, 0, 1, 0) = fA(1, 0, 0, 1, 0) = 0 implies that there
is a cell i such that E(fp, i) = f0, f15, f12, or f3.


Figure 15-A cascade to be tested (cell functions, from cell 1 to cell 4: f14, f14, f8, f6, with X0 entering cell 1)

CONCLUSION

Techniques for fault location and detection in cellular
arrays with an allowable error set of f0, f15, f15-p, f3, f12,
f5, or f10 were described in this paper. It was shown that
the problem of testing an array could be reduced to the
problem of testing a cascade. The solutions presented
are particularly attractive because of their simplicity.
To locate an error, O(n) tests are needed for an n cell
cascade. Detection of an error requires only two tests
if the allowable error set is reduced by one error (f10).
A necessary and sufficient condition for single-error
location was given. If the restrictions of this condition
are relaxed, then an isolation theorem such as given by
Thurber6,7 can be derived; however, this isolation
condition will be more complex than the theorem given
by Thurber.6,7 A criterion that enables detection of a
single error in only two tests was derived.

Although the theories presented were derived for
regular arrays of logic, they have potentially wide areas
of application. A good understanding of the philosophies
presented here will allow the extension of the results to
cascades of m input n output cells. Also, some irregular
arrays may be tested using this theory if they can be
decomposed into sections composed of some form of a
cascaded structure (or sections composed of structures
closely resembling a cascaded structure).

ACKNOWLEDGMENT

The author wishes to thank R. C. Minnick for his help
in the preparation of this paper.

REFERENCES

1 W H KAUTZ
Testing for faults in combinational cellular logic arrays
1967 Switching and Automata Theory Symposium
2 W H KAUTZ
Diagnosis and testing of cellular arrays, properties of
cellular arrays for logic and storage
SRI Project 5876 Scientific Rpt No 3 July 1967 119-145
3 K K MAITRA
Cascaded switching networks of two-input flexible cells
IRE Trans on Electronic Computers Vol EC-11 April
1962 136-143
4 R C MINNICK
Cutpoint cellular logic
IEEE Trans on Electronic Computers Vol EC-13 Dec
1964 685-698
5 R C MINNICK
A survey of microcellular research
Journal Association for Computing Machinery Vol 14 April
1967 203-241
6 K J THURBER
Fault location in cellular arrays
PhD dissertation Montana State Univ June 1969
7 K J THURBER
Fault location in cellular cascades
Submitted to IEEE Trans on Computers
8 L M SPANDORFER  J V MURPHY
Synthesis of logic functions on an array of integrated circuits
Scientific Rpt No 1 for UNIVAC Project 4645 AFCRL-63-528
Contract AF 19(628)2907 Sperry Rand Corp
UNIVAC Engineering Center Oct 1963

Fast multiplication cellular arrays for
LSI implementation
by C. V. RAMAMOORTHY and
S. C. ECONOMIDES
The University of Texas at Austin
Austin, Texas

The methodology and retroactive design procedures
of the Multiplication Array are presented. Interconnection arrangements at the cell level, for the array
formation, as well as at the module level, by bringing all
module inputs and outputs to the terminals of the
"package" for the purpose of assembling larger multiplication units, are also shown.

Since in any LSI circuit testing imposes a complex
problem, some diagnostic schemes are suggested for
reconfiguration and operation under reduced capabilities, or even for automatically switching in a permanently connected spare module.

Other LSI considerations in terms of cell or module
fan-in/fan-out, total number of pins required per
package, chip sizes and densities, and rough cost estimates are also discussed.

INTRODUCTION

The inherent capabilities of Large Scale Integration
technology have recently shifted attention toward two
major concepts in the design of functional computer
subsystems: the concepts of Functional Modules and
Cellular Arrays.

The Functional Module concept emphasizes the
possible standardization of frequently used common
digital subsystem units such as registers, adders,
counters, etc. Because of the unique iterative properties also displayed by these units it is common to view
them as building blocks (functional modules), built
on a single substrate of material, the interconnection
of which can expand significantly their functional
capabilities. In addition to standardization, their
massive production may suggest low cost subsystems.

The Cellular Array concept allows the interconnection of several types of mutually independent logic
blocks, the cells, in various geometric configurations
to perform a desired operation.

This paper is an attempt to combine the above two
approaches in the realization of a Binary Cellular
Array multiplication unit easily adaptable to the
LSI realization techniques, and to speculate on the possibilities of the realization of other similar such functional
units, aiming to lower the cost per unit of computation and possibly increase the overall system reliability.
Multiplication was chosen in the study because it
forms the basis of division and square root operations
by iterative methods, as well as others indicated by the
design trend of present day computing systems.

Single bit multiplier

Figures 1 and 2 show the integral parts and the detailed cellular array structure of the multiplication
unit, in which each row of the array corresponds to
one bit of the multiplier. The array uses K-bit operands,
producing a 2K-bit product.

To achieve fast execution time the multiplication
is done by performing K-1 carry save additions (simple
EXCLUSIVE-OR operations) followed by a full
binary addition. Since the cells in the array operate
asynchronously, the unit as a whole can operate faster
without using a clock pulse.

We shall next explain the single-bit multiplication
unit in some detail.


Figure 1-The integral parts of the asynchronous multiplication array

Figure 2-The "single-bit" asynchronous multiplication cellular array

Let the multiplicand be represented by the binary
vector M = (m1, m2, ..., mk) and the multiplier by the
binary vector N = (n1, n2, ..., nk).
A k x (2k - 1) P matrix is now generated, starting from right
to left, whose elements pij in {0, 1} are computed with the
following conditions:

pij = 0 if ni = 0 and i - 1 < j < k + i, or if 1 <= j <= i - 1
or k + i <= j <= 2k - 1, for i = 1, 2, 3, ..., k

pij = m(j-i+1) if ni = 1 and i - 1 < j < k + i, for i = 1, 2, 3, ..., k

In terms of the array to be implemented, this condition
implies that for the range of "i," "j" where pij = 0 no cell
will be required to perform a logic function. Thus the
[P] matrix has the following form:

p1,2k-1 ... p1,k ... p13 p12 p11
p2,2k-1 ... p2,k ... p23 p22 p21
...
pk,2k-1 ... ... pk1

The following example will illustrate the above
matrix formation.

EXAMPLE

MULTIPLY M = (10101) and N = (11111); then the P matrix is

P =
0 0 0 0 1 0 1 0 1
0 0 0 1 0 1 0 1 0
0 0 1 0 1 0 1 0 0
0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 0 0 0

The above matrix can be realized by selective ANDing of components of M and N. This "Shifting Network" accomplishes the proper positioning of the
numbers to be added before their addition, just as in
the conventional multiplication. Arrays of Carry
Save Adders are used to perform the addition of these
binary numbers utilizing Wallace's algorithm.1

The first stage of the Carry Save Adder adds the
first two rows of the P matrix (the first two generated
partial products), thus generating two vectors, the
first partial sum and the first carry, having the form:

S = (s1,2k-1  s1,2k-2 ... s1,k ... s11)

The double subscript is used to identify the above
vectors with corresponding positions of the P matrix
that contribute to their generation.
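The P-matrix construction can be sketched as follows. This is our illustration; the little-endian bit lists are an implementation choice of the sketch, not the paper's notation.

```python
# Sketch (ours): the k x (2k-1) P matrix -- row i is the multiplicand M
# ANDed with multiplier bit n_i and shifted left i-1 places.
def p_matrix(m_bits, n_bits):
    k = len(m_bits)
    rows = []
    for i, n in enumerate(n_bits):
        row = [0] * (2 * k - 1)
        for j, m in enumerate(m_bits):
            row[i + j] = m & n          # p_ij = m_(j-i+1) * n_i
        rows.append(row)
    return rows

# The paper's example: M = 10101, N = 11111 (MSB-first as printed).
M = [1, 0, 1, 0, 1][::-1]
N = [1, 1, 1, 1, 1]
rows = p_matrix(M, N)
for row in rows:
    print(''.join(map(str, reversed(row))))   # prints the matrix MSB-first

# The rows sum to the product: 21 * 31 = 651.
assert sum(sum(b << j for j, b in enumerate(r)) for r in rows) == 651
```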

The logic functions yielding the elements s2j and c2j
are:

s2j = s1j XOR c1,j-1 XOR p3j
c2j = s1j c1,j-1 + s1j p3j + c1,j-1 p3j

where j = 2, 3, ..., 2k - 1. The composite cells are
shown in Figure 3a.

In the subsequent stages the Carry Save Adder will
add three vectors: the sum vector generated at the
previous stage, the carry vector generated at the
previous stage shifted once to the left, and the next row
vector of the P matrix.
The logic functions producing the new s and c vectors
are of the same form: the "S" cell produces

sij = s(i-1),j XOR c(i-1),j-1 XOR p(i+1),j

the "EXCLUSIVE-OR" function of three variables, and
the "C" cell produces

cij = s(i-1),j c(i-1),j-1 + s(i-1),j p(i+1),j + c(i-1),j-1 p(i+1),j

the MAJORITY function of three variables.
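A sketch (ours) of one carry-save stage built from these cells, checking the defining property that the sum and carry vectors together preserve the total:

```python
# Sketch (ours): one carry-save stage. The "S" cell is the three-input
# EXCLUSIVE-OR and the "C" cell the three-input MAJORITY function; the
# carry vector is shifted once to the left before the next stage uses it.
def csa(a, b, d):
    s = [x ^ y ^ z for x, y, z in zip(a, b, d)]
    c = [(x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, d)]
    return s, c

def to_int(bits):                      # little-endian bit list to integer
    return sum(bit << j for j, bit in enumerate(bits))

a, b, d = [1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 1, 1]   # 13, 11, 14
s, c = csa(a, b, d)
# No carries propagate inside the stage; the pair (s, c) carries the total:
assert to_int(s) + 2 * to_int(c) == 13 + 11 + 14
```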

L1 activates the single multiple of the multiplicand (the first "AND" gate row of each group of rows in the ESA). L2 activates the 2's complement of the multiplicand (the second "AND" gate row, directly under each row of inverters). L3 activates the double multiple of the multiplicand. Therefore, the typical cell of the ICC has B1, B2, and Co as inputs and L1, L2 and L3 as outputs. Its logic functions are shown below. B1 and B2 are any two consecutive bits and Co is the carry-out of the preceding ICC cell. The logic (L1 selects M, L2 selects the 2's complement of M, L3 selects 2M):

B1 B2 Co | L1 L2 L3
0  0  0  |  0  0  0
1  0  0  |  1  0  0
0  1  0  |  0  0  1
1  1  0  |  0  1  0
0  0  1  |  1  0  0
1  0  1  |  0  0  1
0  1  1  |  0  1  0
1  1  1  |  0  0  0

Note: The interpretation of B1B2 = 01 is not one times the multiplicand, as it would obviously appear, but instead two times the multiplicand, because of the way the multiplier is placed in the register, vertically with the least significant bit on the top. The B1, B2 = 10 combination is interpreted in a similar manner. The typical cell "K" of the ICC is shown in detail in Figure 5b.

The carry save adder, end around carry accumulator and full binary adder

A layout of the inputs to the CSA stages, the EACA and FBA is displayed below. The groups of binary numbers between the lines represent the actual inputs to a particular row of cells. The first three groups are CSA row inputs. The fourth group represents the EACA inputs and the final group, those of the FBA. All binary numbers representing partial products are of course P matrix row vectors activated by the ICC lines due to a particular multiplier bit pair combination.
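The ICC recoding can be sketched as follows. This is our reconstruction of the garbled table, assuming the pair value B1 + 2·B2 + Co selects among 0, +M, +2M, and -M with a carry of 4M passed to the next pair; the function names are ours.

```python
# Sketch (ours): ICC recoding of a multiplier bit pair (B1, B2) plus the
# incoming carry Co into select lines L1 -> +M, L2 -> -M (2's complement),
# L3 -> +2M; pair values 3 and 4 propagate a carry (worth 4M) onward.
def icc_cell(b1, b2, co):
    table = {0: (0, 0, 0, 0),   # (L1, L2, L3, carry_out)
             1: (1, 0, 0, 0),   # +M
             2: (0, 0, 1, 0),   # +2M
             3: (0, 1, 0, 1),   # -M, carry 4M onward
             4: (0, 0, 0, 1)}   # carry only
    return table[b1 + 2 * b2 + co]

def recoded_value(bits):        # bits little-endian, even length
    """Check: the selected multiples sum back to the multiplier value."""
    total, co = 0, 0
    for i in range(0, len(bits), 2):
        l1, l2, l3, co = icc_cell(bits[i], bits[i + 1], co)
        total += (l1 - l2 + 2 * l3) << i
    return total + (co << len(bits))

assert all(recoded_value([(n >> j) & 1 for j in range(6)]) == n
           for n in range(64))
```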
[Worked layout, groups top to bottom: the 1st and 2nd partial products; the 1st partial sum and 1st carry with the 3rd partial product; the 2nd partial sum and 2nd carry with the 4th partial product; the 3rd partial sum and 3rd carry with the End Around Carries; the 4th partial sum and 4th carry; and the Final Sum (Result).]

Figure 6-The binary multiplying cellular array

Figure 6 shows the array after superimposing the individual circuits. It can be easily noticed that there is a reduction by a factor of two in the total number of cell rows required for the array, and therefore in the total final propagation Tp, at the expense of some additional control logic, a number of inverters and an additional stage for the EACA. No further complexity in the cell structure results; thus the originally developed cells were used, with a minor modification for cell S as shown in Figure 7a. This cell may also be present in the single bit multiplication array. It must also be noticed that the overflow of bits resulting in the left-most significant part of the final product register may be advantageously utilized for sign and decimal point considerations.

Figure 7a-Cell "S'"-A form of Cell "S"

Figure 7b-Cell "R"-Reconfiguration cell

Diagnostics and reconfiguration

In order to incorporate diagnostics in the array and study the interconnection problem, a standard size module had to be assumed. It was felt that the implementation of a 64 X 64 bit multiplier would be a good choice for all practical purposes. An interconnecting scheme of standard dimension 64 X 8 bit modules to realize the 64 bit multiplier was then devised, aiming to minimize the number of pins per module necessary for the interconnection. As seen in Figure 8, the resulting 64 X 64 multiplication unit requires 2 Full Binary addition stages and 4 Carry Save addition stages per module, a total of 32 Carry Save additions and 15 Binary Additions (only one for the first module). However, there is a real time overlap between these various stages, and by utilizing a pipelining technique and a series of flip-flops after each FBA, a 100 percent utilization of the unit during computation is achieved, and the multiplication cycle is considerably faster. This is illustrated shortly in connection with Table III.

Figure 8-Example of an assembled 64 X 64-bit multiplication unit using the pipelining scheme

The basic module as displayed in Figure 6 has to be modified further for the interconnection. An extra FBA and additional gating for diagnostic purposes is introduced in every module between the output of its respective FBA and what is shown as a product register. The typical newly developed cell for the diagnostics and reconfiguration is shown in Figure 7b, while the above mentioned modifications are displayed in detail in Figure 9 for a typical module. As seen, three additional control lines are needed to perform the following functions:

a. To relay a Fault or No-Fault signal, indicating that a fault has or has not occurred in one particular module (NF/F) (e.g., if F = 0, NF = 1).
b. To relay a No Shift signal for the output of this module (NS = 1) if no fault has occurred in the preceding module.
c. To relay a shift, eight bits to the right (S = 1), for the output of this and all subsequent modules if a fault has been detected in the preceding module.

The detection of the fault could be accomplished by a software routine which may check the final product of the unit periodically and appropriately set the flip-flops of the control signals. By shifting the outputs of all modules subsequent to the malfunctioning one eight bit positions to the right, while forcing the output of the faulty module to be equal to zero at the same time, and simultaneously introducing the spare module which is permanently connected to the unit, one can still achieve 100 percent computational efficiency. If another module fails to function properly, by applying again the same reconfiguration scheme the unit will function with a reduced capability, since the eight least significant bits of the multiplier will be lost. No provision has been made at this point if two modules fail to function properly at the same time. At least one of them must be replaced to put the multiplication unit back in service.

Aiming to maximize the number of multiplications per unit time, as already mentioned, one can introduce storage elements at intermediate points. This allows the unit to accept a new set of operands without waiting for the total completion of the present computation. Consider an m X m bit multiplier module. If the intermediate computations are stored after the Carry Save adders, the first Binary adder and the second Binary adder, the rate of multiplications in the module per unit time will be

Rm = 1 / max [tcs, tb]

where

tcs = total time propagation through the CSA
tb = total time propagation through the FBA for the binary addition of two m-bit binary numbers.

Then the number of storage elements required per module is 2m + m + m = 4m.
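The shift-by-one-module reconfiguration can be sketched abstractly as follows. This is our illustration only; the paper implements it with the combinational gating of Figure 9.

```python
# Sketch (ours): shift reconfiguration. Each module yields an 8-bit slice
# of the product; on a detected fault the faulty slice is dropped (its
# output forced to zero) and all subsequent slices shift right eight bit
# positions, the permanently connected spare joining at the far end.
def reconfigure(slices, faulty, spare):
    """slices: per-module output slices, least significant slice first."""
    return slices[:faulty] + slices[faulty + 1:] + [spare]

slices = [0x11, 0x22, 0x33, 0x44]
assert reconfigure(slices, 1, 0x55) == [0x11, 0x33, 0x44, 0x55]
```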
If, however, storage elements are inserted at the outputs of the two Binary Adders only, as shown in Figure 8, the maximum rate of multiplications in each module per unit time will be

R'm = 1 / (tcs + tb)

while the total number of storage elements required will be decreased by half, that is, 2m.

Figure 9-The combinational logic gating for reconfiguration

The table below gives the sequence of events in the first four modules of the 64 X 64 composite multiplier unit of eight modules, based on the pipelining technique.

Table III

Time units:   1    2    3    4    5
Module 1:    B11  B21  B31  B41  B51
Module 2:         B12  B22  B32  B42
Module 3:              B13  B23  B33
Module 4:                   B14  B24

Each time unit in the above table corresponds to the factor tb + tcs, and Bij represents the jth binary addition of the ith multiplication.

Approximate number of GATES/CELL*

For cell "C" approximately seven gates are required
For cell "S" (and "S'") approximately three gates are required
For cell "R" approximately two gates are required
For cell "K" approximately nine gates are required

Figure 10-An alternate interconnecting scheme for the 8 modules of the 64 X 64 multiplication unit

Another interconnecting scheme, which has not been investigated yet in detail but seems to be equally as efficient, considerably faster and adaptable to the proposed reconfiguration technique, is the one shown in Figure 10, where each level of nodes represents FBA's performing in parallel, with a correspondingly shorter anticipated multiplication cycle.

LSI implementation

The implementation shown for the 64 X 8 module reveals a number of characteristics suitable for large scale integration. Among them are the repetitive interconnections of simple identical cells and the modularity suitable for expansion and reconfiguration. Below some of the approximate hardware requirements are pointed out.

Approximate number of PINS/MODULE

1. m + n + 2 needed for the multiplicand register
2. m + n + 2 needed as inputs to the second FBA
3. m + n + 2 needed for the product
4. n + 2 needed for the multiplier register
5. three control pins for reconfiguration

Approximate number of CELLS/MODULE

The cells are the kinds already discussed: C, S, S', R, K. All are present in a module.

1. m X n/2 cells needed for the CSA stages
2. m + n cells needed for the EACA stage
3. m + n reconfiguration cells
4. 2(m + n + 2) cells needed for the two FBA's
5. n/2 + 1 cells needed for the ICC

The above estimates point out the fact that testing at the individual cell or circuit level (an item yet to be examined) becomes a problem, especially when the complexity of the chip is increased, with a paralleled decrease in reliability and yield of non-defective chips. However, using the modular approach it is advisable to perform the testing externally on the module and discard the malfunctioning units. This would considerably decrease the amount of logic on a chip, which would otherwise have to be inserted for the testing of the individual circuits. This approach seems to be economically feasible, since it is estimated that by 1970 an LSI chip of 100 X 100 mils in size may contain 200 components, at five cents per component, while by 1975 an LSI chip of 300 X 300 mils in size may contain as many as 3,600 components at a cost of about one cent per component. Therefore, miniaturization of LSI chips will discourage testing at the individual circuit level, while the loss due to the discarding of modules after testing at the frame level will be negligible.
In view of the above considerations, and since present state-of-the-art high density MOS circuits are being driven at 10 MHz, implementation of multiplier modules such as the one presented by MOS circuits appears very desirable from a manufacturing viewpoint. A reasonable building block might be a 64 X 64 bit multiplication unit requiring approximately 5000 active elements (field effect transistors). One could also visualize the whole unit incorporated in one or two chips. Where speed is the primary requirement, the unit can be designed using fast bipolar transistors, with an expected five ns delay. Assuming then a 64 X 64 bit module implemented by bipolar transistors, the execution time could be in the neighborhood of 0.225 microseconds, and when pipelined, the maximum number of multiplications per second may be approximately 5 X 10^6. An MOS array of the same module will perform an order of magnitude slower than in the bipolar case.

* The above gates are mostly "AND" gates, with the "OR" gates not included in the count. There are also 2(m + n) additional gates needed for the reconfiguration scheme and m X n gates for shifting each array.

98 Fall Joint Computer Conference, 1969

The pin count also indicates that the current design is within the state of the art of MOS technology. The performance figures given above are educated guesses, since the circuit and intermodule delays are dependent on the circuit types, their interconnections, the chip topology, etc. In addition, the design examples described in the previous sections indicate the ease with which the array can be partitioned to fit reasonable unit or chip sizes. The design of self-diagnosable and repairable functional arrays appears quite feasible and worth considering.
The possibility of a composite multiplication, division, and square-rooting unit using the techniques presented in this paper could be very useful, particularly if the division and square-root algorithms are based on the availability of fast multiplication units such as those discussed here.

CONCLUSION

Since fast multiplication has become the basis of iterative division and square rooting in fast computers,6,7 there appears to be a need for cheap, array-type, LSI-realizable multiplication subsystems. This paper reports the design methodology and the detailed implementation of one such structure. Ease of diagnosis and capability of reconfiguration were used as twin requirements in the final design. When the unit is composed of a number of modules and a malfunction is detected in one of them, a method of switching in a spare module automatically was presented. An estimate of the logic circuitry in the hard core (that portion of the unit which must operate without any faults during testing) is found to be less than 14 percent for a 32 X 32 module, 9.7 percent for a 64 X 64 module, and 4 percent for a 128 X 128 module. Therefore, as the size of the multiplication module-unit increases, the relative size of the hard core decreases very rapidly. To conclude, the cellular-array implementation of an asynchronous multiplication unit using mostly non-carry-propagating Carry Save adders was accomplished. The final cell design and the control and reconfiguring circuitry are quite simple. A number of additional studies needs to be done in the future.

ACKNOWLEDGMENTS

The authors would like to thank Mr. Gary Wang of the NASA Electronics Research Center for sharing with them some of his thoughts on the subject, and Mr. W. R. Adrion, graduate student at the University of Texas at Austin, for his constructive suggestions.
REFERENCES

1 C S WALLACE A suggestion for a fast multiplier IEEE Trans on Electronic Computers Vol 13 No 1 Feb 1964
2 Methods for high-speed addition and multiplication NBS Circular No 591 1958
3 O L MACSORLEY High-speed arithmetic in binary computers Proc IRE Vol 49 No 1 Jan 1961
4 M LEHMAN Short-cut multiplication and division in automatic binary digital computers Proc Inst Elec Eng Paper No 2693M Vol 105B Sept 1958
5 I FLORES The logic of computer arithmetic Prentice-Hall Inc 1963
6 D FERRARI A division method using a parallel multiplier IEEE Trans on Electronic Computers Vol 16 No 2 April 1967
7 S F ANDERSON et al The IBM System/360 Model 91: Floating-point execution unit IBM Journal of Research and Development Jan 1967

The Pad Relocation technique for interconnecting LSI arrays of imperfect yield

by D. F. CALHOUN
Hughes Aircraft Company
Culver City, California

INTRODUCTION

The interconnection of circuits required in Large Scale Integration (LSI) using multi-level metalization above monolithic semiconductor arrays is taking basically two approaches. One is predicated on processing, with a reasonable yield, entire arrays without any semiconductor defects (i.e., 100 percent yield chips), which allows once-generated fixed wiring patterns to obtain the required interconnect. The second approach aims at much larger semiconductor arrays (i.e., full-slice LSI) for which defect-free processing cannot be expected. Thus, probe tests are made of the semiconductor circuits processed on each LSI slice (or wafer), and a record is made of the good and bad circuit positions. Unique interconnection masks are then generated to interconnect good circuits in each wafer's particular yield pattern, using certain "discretion" in avoiding the bad circuits.
As a result, the 100 percent yield approach emphasizes the need to use standard interconnect masks but is complexity-limited by the occurrence of defective circuits in larger arrays, whereas approaches capable of routing around the defective circuits have required a full set of unique signal interconnect masks for each wafer's particular yield pattern. The Pad Relocation approach, however, allows the interconnection of full-slice LSI arrays containing defective circuits to be accomplished with a minimal amount of unique interconnect per array. Only a portion of one of the typically three interconnect levels varies from array to array, thus allowing significant improvements in the cost, reliability, and testability of the finished arrays as well as less limitation on cell yields and array complexities.

Description of the Pad Relocation technique

Pad Relocation is a technique which allows a predetermined standard pattern of good circuits to be established on all LSI slices used to perform the same array function, regardless of the varying yield patterns determined by DC wafer probe tests. This is accomplished by relocating the pads of nearby good circuits to the positions where good circuits were specified by a prescribed master pattern but were not found during wafer probe tests. The pad positions above a bad circuit (or any unused circuit) are isolated from that circuit by a layer of dielectric. Where good circuits are found in expected good-circuit locations, those circuits are used without relocation. Thus, the Pad Relocation technique functionally establishes a specified pattern of good circuits as if there had actually been a 100 percent circuit yield in that pattern. A single wiring pattern can then be generated for all the LSI arrays of the same function to accomplish the much more complex signal interconnect between the master pattern circuits.
By determining standard cross-under areas within the Pad Relocation layer where relocation lines need never occur, it has been shown that large arrays can be interconnected with the same number of total interconnect layers as required by discretionary techniques. With each wafer's good circuits located in the predetermined master pattern, an optimal standard interconnect of the circuits can be made for each wafer. Since this signal routing and mask-making expense is incurred only once for each function, much more effort can be spent optimizing the signal routing. As a result, the total number of interconnect levels (including Pad Relocation) may actually be fewer (for very complex arrays) than with other techniques by which the interconnect is generated for each wafer's particular yield pattern. The Pad Relocation technique has been 100 percent successful for all integrated circuit and special LSI wafers considered so far.

The "master pattern" gives the prescribed locations of good circuits to which each LSI array's particular yield will be tailored. Statistically, if M is the percentage of wafer circuits in the master pattern and Y is the wafer circuit yield from probe tests, then only M(100 - Y)/100 percent of all wafer circuits need to be relocated. For example, if Y = 35 percent and M = 30 percent, then the relocation (as a statistical average) of 19.5 percent of the wafer circuits will establish a master pattern that uses 86 percent of all the good wafer circuits. This would allow 120 good circuits to be located in prescribed positions, leaving an average of only 20 good circuits unused.

An example

The methodology of the Pad Relocation technique is best described by example. Figure 1 shows the mapping of circuits on an LSI wafer. Each dot represents the position of a semiconductor cell such as a full adder, a quad two-input NAND gate cell, a flip-flop, etc.
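The relocation statistics quoted earlier (Y = 35 percent, M = 30 percent) follow directly from the definitions, as the minimal sketch below shows. The function names are illustrative, and M ≤ Y is assumed so that the master pattern can actually be filled.

```python
def percent_relocated(M, Y):
    """Expected percentage of all wafer circuits that must be relocated,
    with M = percent of circuits in the master pattern and Y = percent yield."""
    return M * (100 - Y) / 100

def percent_good_used(M, Y):
    """Percentage of the good wafer circuits the master pattern consumes
    (valid only when M <= Y)."""
    return 100 * M / Y

print(percent_relocated(30, 35))         # -> 19.5
print(round(percent_good_used(30, 35)))  # -> 86
```

With roughly 400 circuit positions per wafer, these percentages reproduce the 120-used / 20-unused good-circuit split cited in the text.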
Figure 1—Integrated circuit wafer

Figure 2—Wafer after test; slashes show good circuit positions

Figure 2 identifies with a slash (/) the location of all circuits determined to be good by DC wafer probe tests on a particular slice. The yield of wafer circuits varies from 10 percent to 90 percent depending on the circuit complexity, and the locations of the good circuits cannot be predicted from wafer to wafer. This makes it impossible to use standard interconnect patterns without first transforming the various wafer yield patterns to a single standard pattern. The circuit yield (the percent of total circuits which are good) for the wafer in Figure 2 is nearly 30 percent, and yet there is not a single area of 100 percent yield larger than three circuits by two circuits. Thus, 100 percent yield approaches could obtain units with only about 5 percent of the complexity allowed by full-slice interconnection techniques. The goal is to tailor, by some efficient means, the locations of the good circuits in Figure 2 to a standard pattern that may be used for all wafers with about the same circuit yield. For higher yield wafers, there are other standard patterns which use more good circuits. Figure 3 shows a master pattern (in heavy dots) which can be used for wafers having at least a 25 percent yield.
That pattern is characterized by a more dense usage of good circuits toward the center of the wafer, with good-circuit positions never adjoined on more than one side by another circuit in the master pattern. The latter characteristic facilitates the routing of standard signal interconnect as well as the relocation of circuits in at least three directions. The matching of the master pattern to the expected yield distribution, as a function of distance from the wafer center, optimizes the conflicting goals of a minimum number of relocations and a maximum probability of fulfilling the master pattern.

Figure 3—A master pattern of good circuits; all wafers will be matched to this pattern by the Pad Relocation technique

Figure 4 shows the Figure 3 master pattern superimposed on the particular wafer yield of Figure 2. The objective now is to route a nearby good circuit, shown by a slash, to each heavy dot (i.e., master pattern position) which initially is without a good circuit.
This specification can be completed manually, giving a coding-sheet description of necessary circuit relocations; or a simple computer routing program can output a punched tape or cards that can be used to make a mask automatically. The computer routine for Pad Relocation will use about two orders of magnitude less run time than a customized signal routing, primarily because no circuit placement or logic signal routing is required. Pad Relocation requires only that a good circuit be identified for relocation to each position in the master pattern which did not initially have a good circuit. A later paper will present work that is under way to automate the Pad Relocation selection and specification with the use of interactive graphics.

Figure 4—Master pattern superimposed on the particular yield of the Figure 2 wafer

Figure 5—Specification of a set of relocations necessary to completely implement the master pattern of Figure 3

Figure 5 shows a manually generated specification of possible relocations that completely satisfies the master pattern of Figure 3, using the good circuit positions of the wafer in Figure 2. The longest relocation line length is less than 0.45 inch. Figure 6 shows how the relocation in area A of Figure 5 can be accomplished without crossovers for a quad two-input gate cell. Each gate of the bad circuit at the lower left is functionally replaced with a good gate from the top right circuit.
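The selection step just described — identify a nearby good circuit for each unfilled master-pattern position — can be sketched as a tiny greedy routine. This is an illustrative reconstruction, not the authors' program: the grid coordinates, the Manhattan distance metric, and the greedy visiting order are all assumptions, and the sketch assumes enough good circuits exist to fill the pattern.

```python
def plan_relocations(master, good):
    """master, good: sets of (row, col) cell positions on the wafer grid.
    Returns {target: source} pairs for master positions lacking a good circuit."""
    free = set(good) - set(master)   # good circuits not already in a master position
    plan = {}
    for target in sorted(master):
        if target in good:
            continue                 # circuit in place is good: no relocation needed
        # pick the closest still-unused good circuit (Manhattan distance)
        source = min(free, key=lambda p: abs(p[0] - target[0]) + abs(p[1] - target[1]))
        free.remove(source)
        plan[target] = source
    return plan

master = {(0, 0), (0, 2), (2, 2)}
good = {(0, 2), (1, 0), (2, 1)}
print(plan_relocations(master, good))  # {(0, 0): (1, 0), (2, 2): (2, 1)}
```

Because no logic placement or signal routing is involved, even this naive search runs in time proportional to the number of unfilled positions times the number of free good circuits, which is consistent with the two-orders-of-magnitude speed advantage claimed over customized signal routing.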
It should be noted that the computer needs only subroutines for leaving (or entering) a cell from the top, bottom, left, and right, for moving parallel lines over some number of cells, and for making ninety-degree turns in order to do all the possible Pad Relocation routing patterns. Figure 7 shows the actual Pad Relocation of an SN5480 gated full adder above a silicon wafer using 0.002-inch aluminum lines on 0.0035-inch centers. Figure 8 shows how simple the Pad Relocation mask is if it is considered as a set of the above-mentioned subroutines.

Figure 6—A set of pad relocations necessary to replace functionally the quad two-input gate circuit in area A of Figure 5

Figure 7—Pad Relocation of an SN5480 gated full adder above a silicon wafer (using 0.002-inch aluminum lines on 0.0035-inch centers)

Intermediate step to full-wafer LSI

Figure 9 shows an intermediate step to full-wafer LSI using the Pad Relocation technique. Three 4-bit Modular Multiplier modules are to be fabricated from the three bordered half-inch-square areas (as was suggested in a 1968 FJCC paper by D. F. Calhoun). Within the three bordered areas, slashes again represent good circuits and circles show the master pattern locations. The lines terminating in arrowheads show how three, eight, and five good circuits can be relocated into the circled positions to establish the same pattern of good circuits for each module, thus allowing the use of one standard signal interconnect pattern for all subsequent modules tailored to that pattern. Figure 10 demonstrates the simplicity of a coding-sheet specification of such circuit relocations.
Figure 8—Mask pattern for the pad relocations specified in Figure 5

Figure 9—Pad Relocation routing for three 200-gate modules on a single 1½-inch wafer

Figure 11—Four relocation patterns for SN5480's for the three multipliers of Figure 9

Figure 11 shows the four possible Pad Relocation interconnect patterns which are necessary for the LSI multipliers. For these modules it seems appropriate to incorporate simple signal cross-under lines and power distribution in the Pad Relocation level so as to require only two additional levels of interconnect above the tested LSI chips.

A Pad Relocation LSI hardware program

An LSI hardware development program began in January 1969 (in which Hughes Aircraft Company contracted Texas Instruments to do the multi-level processing) and resulted in fully tested and packaged 207-gate arrays in May 1969. During this program, (1) TI fabricated and tested one type of their LSI wafers having a certain mix of gates and flip-flops, (2) TI supplied the yield information on each wafer to be processed for Hughes, (3) Hughes generated both the one standard signal interconnect mask for all wafers as well as an individual Pad Relocation mask for each wafer, and (4) using the mask specifications from Hughes, TI processed the two additional levels of interconnect and tested and packaged each of the finished units. Similar programs for higher complexity arrays have since been initiated.
The results of this program are described below.

The logic array to be built in LSI

Figure 10—Coding sheet specification

Investigations were made three years ago at Hughes Aircraft Company into the application of LSI arrays to techniques for doing the very high speed sum-of-products computations required in advanced digital filtering systems. A result of this study was the development of the high speed "Modular Carry Advance Multiplier," which was described in a 1968 Fall Joint Computer Conference paper by D. F. Calhoun. Among its characteristics is its modularity, which allows longer-wordlength multiplications to be efficiently accomplished (in terms of speed and parts) simply by paralleling more of the identical modules. A 5-bit sign-and-magnitude Modular Multiplier designed with four types of logic gates and a JK flip-flop was thus chosen as the vehicle for LSI development on this program. Such an array forms, and stores in a register, the 9-bit sign-and-magnitude product of two 5-bit operands. The 5-bit multiplier design uses 153 NAND gates and 9 flip-flops (each equivalent to six NAND gates) for a total of 207 interconnected gates per LSI wafer. The logical interconnection of 207 gates using less than one square inch of an LSI wafer represents well any state-of-the-art bipolar LSI approach. Two levels of interconnect (including the Pad Relocation) were used above the tested wafer, which already had a first level of metalization for component interconnect. In terms of cross-over complexity, signal line-lengths, and circuit fan-outs, the Modular Multiplier design can be considered typical of a 200-gate logic array.

Figure 12—Texas Instruments LSI type "K" slice (HAC Photo 4R07185)

Description of the chosen LSI slice
Basically, the K slice is a hiploar array of transistor-transistor logic (TTL) ga~es and flip-flops occupying an active area of about 11.1 square inches. A picture of this LSI wafer is shown in Figure 12. The array is subdivided into 298 cell!3 of dimension 0.084 inch by 0.044 inch. Of the 298 Basic wafer cells, 170 are split into two 42 by 44 mil halt-cells for gates while the 128 JK flip-flops on the wafkr occupy full 84 by 44 mil cells. The distribution of logic elements on the K slice is shown in Figure 13. Each cell labeled "3" has two independent three-input NAND gates while the adjacent cells labeled "5" have an independent five-input NAND gate and a on~-input NAND gate. In three of the rows of gates ~ single seven-input NAND gate designated by a "7" was processed instead of two three-input NAND gates. The rows of fullsized 84 by 44 mil cells contain the JK flip-flops, which are labeled "FF". In total there! are 642 logic gates (170 ones, 264 threes, 170 fives, 'and 38 sevens) and 128 JK flip-flops processed on the wafer. 
Figure 13—LSI array slice "K" (distribution of the "3", "5", "7", and "FF" cells over the 1188 by 1176 mil array)

Selection of the master pattern and pad relocation patterns

First, a master pattern of circuits was chosen to define the standard circuit positions on the K slice that would be interconnected to form the Modular Multiplier function. This master pattern (shown in Figure 14) was defined with respect to (1) maximizing the probability of successful fulfillment, Pr(M), of the master pattern, (2) facilitating the standard signal interconnect, and (3) using a minimum number of relocation patterns efficiently.
After the master pattern and the repertoire of relocation patterns to be used were determined, restricted areas in the Pad Relocation level were defined to allow signal cross-unders from the standard top-level signal interconnect. Sufficient cross-under capability for this design was found in the flip-flop cells alone, by using certain areas of these cells which are not required by any of the defined relocation patterns. Other cross-under areas can be defined for any more complex designs so as to still use only two metalization layers above the tested circuits. A set of Pad Relocation patterns was prepared to allow the efficient selection of the particular patterns and their positions necessary to fulfill each wafer's master pattern. The chosen set of K slice relocation patterns is shown in Figure 15. This semiautomated specification has facilitated a very fast turnaround and low-cost capability for the generation of Pad Relocation masks and for working with new routing requirements, wafer layouts, and logic designs.

Master pattern cell designation key: distinct symbols mark the 1-input gates, 3-input gates, 4-input gates, and JK flip-flops

LSI program results

The end results of the Hughes effort described in this section were the two metalization mask specifications used by TI to process each wafer. Only one of these is unique, since the use of Pad Relocation allows all signal interconnect to be obtained from a once-generated standard mask. Figure 14 shows the worksheet specification of how the yield of a typical LSI slice can be tailored to the chosen master pattern. The lines with arrowheads at the end specify relocation patterns from the set of patterns shown in Figure 15. The completion of the K slice master pattern was accomplished successfully on each of the 30 wafers attempted. A typical time for a man to complete and verify the specification shown in Figure 14 was two minutes manually.
From specifications like those in Figure 14, the necessary relocation patterns were selected from the standard set shown in Figure 15 and were added to the standard cross-under pattern to complete the Pad Relocation mask, such as the one shown in Figure 16. Only the particular circuit relocation patterns vary within this mask, which allows the least possible variation of interconnect and testing from one array to another. The more complex but standard mask is the one shown in Figure 17, which accomplishes all necessary signal interconnect (except the cross-unders to the Pad Relocation level) and the power distribution for the 5-bit multiplier design. The design for this mask can efficiently be done manually for arrays of this and larger size, since the master pattern is well distributed. In mask plotting time alone, the Pad Relocation mask required only about 20 percent of the time required to plot the signal interconnect metalization patterns. A photograph of the final 207-gate LSI multiplier is shown in Figure 18.

Figure 14—Pad Relocation worksheet with master pattern locations shown

Figure 15—Set of K slice relocation patterns

Statistics of Pad Relocation master patterns

The choice of a master pattern for Pad Relocation is important, since its definition affects the average number of relocated circuits (and thus the routing time and mask complexity) as well as the number and simplicity of the signal interconnect levels. Also, a good statistical match between the master pattern and the expected wafer yield distribution will result in a higher probability of successful relocation. As an example, consider a master pattern that is defined too densely about a wafer's periphery. Since peripheral wafer circuits show a much lower yield than the more central
ones, there will statistically be more relocations, longer relocation lengths, more difficulty in satisfying the master pattern, and a higher concentration of signal interconnect above the master pattern than if the master pattern had been chosen to match the "expected" yield distribution, as was done for the example shown in Figure 3.

Figure 16—Pad Relocation mask with standard cross-unders

Figure 17—5-bit Modular Multiplier standard interconnect mask

Figure 18—207-gate multiplier LSI array using Pad Relocation (HAC Photo 4R09152)

A first question that must be answered is: what is the "expected" yield distribution? Investigations thus far have pointed out only that there is a significant decrease in yield as a function of the distance from the wafer center, which can be attributed to boundary defects, and that when good or bad circuits occur, there is a more than random clustering effect. No ability to predict the locations of these clusters has been obtained. What must be done is to examine the yield of large samples of the wafer types that will be used, to determine the distribution that best describes their expected yield patterns. This distribution will be different for different ranges of yield as well as for different circuit complexities and wafer types. The master pattern for a specific range of yield, wafer type, and wafer size should be matched to the expected distribution so as to take advantage of any knowledge of where good circuits are more probable. By so doing, the probability of successfully fulfilling a master pattern is maximized while minimizing the expected length of the longest relocations.
Statistical techniques have been developed to determine and compare the efficiency of various master patterns in terms of maximizing both the utilization of good circuits and the probability of successfully fulfilling the master pattern. For example, let y be the fraction of the total circuits that were found to be good (i.e., the yield), m the percentage of total circuits that are in the master pattern, and r the number of unused circuits from which a relocation could be made to each master pattern circuit. Then the probability of successfully fulfilling each master pattern circuit independently is

P(1) = y + (1 - y)y + (1 - y)^2 y + ... + (1 - y)^r y = y SUM_{k=0}^{r} (1 - y)^k   (1)

where the first term is the probability that the master pattern circuit itself is good, and each succeeding term is the conditional probability of needing to examine another candidate for relocation times its probability of being good. Equation (1) can be simplified as follows. With u = 1 - y,

y SUM_{k=0}^{r} (1 - y)^k = -SUM_{k=0}^{r} (u - 1)u^k   (2)

and, since the sum telescopes,

-SUM_{k=0}^{r} (u - 1)u^k = -(u^{r+1} - 1) = 1 - (1 - y)^{r+1}   (3)

therefore,

P(1) = 1 - (1 - y)^{r+1}   (4)
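The closed form in equation (4), and its joint extension over M independent master-pattern positions given as equation (5) below, can be checked numerically. The sketch uses illustrative function names; the formulas themselves are those of the text.

```python
def p_one(y, r):
    """Probability of filling one master-pattern position, equation (4):
    the position is good itself, or one of up to r candidates can be relocated."""
    return 1.0 - (1.0 - y) ** (r + 1)

def p_master(y, r, M):
    """Joint probability over M independent master-pattern positions."""
    return p_one(y, r) ** M

# With y = 0.5 and r = 9, several hundred master-pattern circuits can be
# used while keeping the overall success probability near one half:
print(p_master(0.5, 9, 680))
```

Under these assumptions, the r = 9, y = 0.5 curve crosses P(M) = 0.5 near M of roughly 700, in line with the 680-circuit figure read from Figure 19 below.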
Instead, consider successively examining up to r circuit positions which are the closest to each particular master pattern position and, for which, there is still a free path in the Pad. Relocation level to the master pattern position. Then Equation (5) will give the probability of successfully relocating (if necessary) to each of the M required master pattern positions at least one of the r closest and free circuit positions. Equation (5) determines a family of curves. for P reM) versus M for various yields and values o~ r. Figure 19 shows the curves of PrOf) versus M With y = 0.5 for r = 4 and r = 9. It should be noted that each circuit of M may actually be many interconnected gates of logic and M = 100 would represent 1000 gates Fall Joint Computer Conference, 1969 108 1.00 y =0.5" CIRCUIT YIELD 0.90 allow the standard signal interconnect to be designed to require the minimum number of levels and the minimum area per level. Thus, chip areas can be less interconnect limited. 0.80 Improvement of testing and reliability of la:rge scale integrated systems 0.70 0.60 ~ ... 0.50 ~ M =220 FORP=O.S 0.40 0.30 0.20 0.10 0.0 20 50 100 soc 200 1000 M Figure 19--The probabilty Pr{M) of successfully fulfilling a ma.ster pattern of M cifcuits by relocating from one of up to r nearby circuits. Eeqh circuit is a tested unit which may have many gates 6f logic complexity if each circuit of M had 10 gates of equivalent logic complexity. If it is desired to 'successfully fulfill the master patterns of at least half the wafers considered, Figure 19 shows that 220 circuits (and thus probably 750 or more gates) can be used if r = 4, and 680 circuits can be used if r = 9. 
Of ¢ourse, any wafers for which the master pattern was hot easily fulfilled are not lost since they can be inv~ntoried and used for other master patterns, or for integrated circuits, or diced and bonded to substrates~ As a comparison the most complex current bipolar p.iscretionary unit has an equivalent Al of 169 while the 100 percent yield approach has reached an equivalent M of only 24. Advantage of Pad Relocation to iJSI signal interconnect The prime advantage of Pad ~ Relocation LSI which has been described above is th~t it places the pads of all used circuits in standard positions which both allows fixed-pattern signal' routing between these circuits as well as the utilization of more circuits than allowed by other LSI techniques. There are further advantages, however, to the rquting of the standard signal interconnect. For exaIl1ple, the positions to which circuit pads will always be brought can be modified and optimized to facilitate the necessary routing of signals as well as to minimize the lengths of the longest or the most critical signal paths. This will also Semiconductor device reliability, as well as propa~ gation delay, is highly dependent on proper maintenance of junction temperatures within certain bounds. From the maximum specified junction temperature, a maximum power dissipation per wafer area can be computed which is dependent on the heat conductive characteristics of the wafer and the cooling techniques used, as well as on the area and power dissipation of the particular circuits. Thus there will be a maximum number of circuits that should be powered up on the wafer. In addition, no region of t.he wafer should exceed a certain maximum power density in order to insure that the wafer will not have relative "hot spots" where too many powered circuits are located. 
Pad Relocation LSI can help insure that the wafer power dissipation density is not excessive by specifying the relocated circuits to be primarily those from areas of sparse circuit utilization, thus obtaining a more uniform power density across the entire wafer. By so doing, the system cooling requirements can be relaxed and/or more circuits can be used on the same wafer. This more uniform power dissipation could be quite difficult to insure with other routing techniques, since there is less choice in the used circuit positioning. A simple means by which a Pad Relocation

A_h(e) = max(A_h(e-1), A(f_e)), e > 0    (12)

Likewise, define the Category history, C_h, at the e-th event,

C_h(e) = C_h(e-1) ∪ C(f_e), e > 0    (14)

From equations (11) through (14) we see how the Authority and Category histories accumulate as a function of event e. These events are the specific times when files are accessed by a job. To maintain security integrity, these histories can never exceed (i.e., be greater than) the job security profile. This is specified as

A_h(∞) ≤ A_j    (20)

C_h(∞) ⊆ C_j    (21)

For e = 0, we see the properties initialized to their simplest form. However, as e gets large, the histories accumulate, but never exceed the upper limit set by the job. A_h(e) and C_h(e) are important new concepts, discussed in further detail later. We speak of them, affectionately, as the security "high-water mark," with analogy to the bath tub ring that marks the highest water level attained.

TABLE I-Security property determination matrix

Object           Authority A                     Category C                   Franchise F
User, u          Given constant                  Given constant               u
Terminal, t      Given constant                  Given constant               Given constant
Job, j           min(A_u, A_t)                   C_u ∩ C_t                    u_j
File, f
  Existing file  Given constant                  Given constant               Given constant
  New file       max(A_h(e-1), A(f_e)), e > 0    C_h(e-1) ∪ C(f_e), e > 0     u_j

The Franchise of a new file is always obtained from the Franchise of the job given by equation (6):

F_f = u_j    (19)

When μ = 0, the job is controlled by the single user u_j, who becomes the owner and creator of the file with the sole Franchise for the file.

Access control

Our model is now rich enough to express the equations of access control. We wish to control access by a user to the system, to a terminal, and to a file. Access is granted to the system if and only if

u ∈ U    (22)

where U is the set of all sanctioned users known to the system. Access is granted to a terminal if and only if

u ∈ F_t    (23)

If equations (22) and (23) hold, then by definition

u = u_t = u_j    (24)

Access is granted to a file if and only if

A_f ≤ A_j and C_f ⊆ C_j    (25)

for properties A and C according to equations (8) and (9), and

u_j ∈ F_f    (26)

If equations (25) and (26) hold, then access is granted, and A_h(e) and C_h(e) are calculated by equations (12) and (14).

Model interpretation

Three different dimensions for restricting access to sensitive information and information processes are possible with the security profile triplet. The generality of this technique has considerable application to public and military systems. For the system of interest, however, the Authority property corresponds to the Top Secret, Secret, etc., levels of government and military security; Category corresponds to the host of special control compartments used to restrict access by project and area, such as those of the Intelligence and Atomic Energy communities; and the Franchise property corresponds to access sanctioned on the basis of need-to-know. With this interpretation, the popular security terms "classification" and "clearance" can be defined by our model in the same dimensions, as a min/max test on the security profile triplet. Classification is attached to a security object to designate the minimum security profile required for access, whereas clearance grants to a security object the maximum security profile it has permission to exercise.
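The set-theoretic model lends itself to a direct sketch. The Python below (the class structure and names are illustrative assumptions, not ADEPT's actual code) models the job-clearance derivation of equations (15) and (16), the file-access test of equations (25) and (26), and the high-water-mark update of equations (12) and (14):

```python
# Illustrative sketch of the ADEPT-50 set-theoretic security model.
# The four Authority levels follow the paper; the class itself is an
# assumption for exposition.

UNCLASSIFIED, CONFIDENTIAL, SECRET, TOP_SECRET = range(4)

class Job:
    def __init__(self, user_id, user_auth, user_cats, term_auth, term_cats):
        self.user_id = user_id
        self.authority = min(user_auth, term_auth)   # equation (15)
        self.categories = user_cats & term_cats      # equation (16)
        self.auth_high = UNCLASSIFIED                # A_h(0), simplest form
        self.cat_high = frozenset()                  # C_h(0), simplest form

    def can_open(self, file_auth, file_cats, file_franchise):
        # Equation (25): A_f <= A_j and C_f subset of C_j;
        # equation (26): need-to-know membership.
        return (file_auth <= self.authority
                and file_cats <= self.categories
                and self.user_id in file_franchise)

    def record_access(self, file_auth, file_cats):
        # High-water mark update, equations (12) and (14).
        self.auth_high = max(self.auth_high, file_auth)
        self.cat_high = self.cat_high | file_cats

job = Job("smith", TOP_SECRET, {"CRYPTO", "SENSITIVE"}, SECRET, {"CRYPTO"})
assert job.authority == SECRET and job.categories == {"CRYPTO"}
assert job.can_open(SECRET, {"CRYPTO"}, {"smith"})
assert not job.can_open(TOP_SECRET, {"CRYPTO"}, {"smith"})  # A_f > A_j
```

Note how the histories only ever grow toward, and never past, the job clearance: every file that passes `can_open` satisfies equations (25) and (26), so the maxima and unions in `record_access` are bounded by A_j and C_j.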
Thus, legal access obtains if the clearance is greater than or equal to the classification, i.e., if equation (25) holds.

Another observation on the model is the "job umbrella" concept implied by equations (22) through (26); i.e., the derived clearance of the job (not the clearance of the user) is used as the security control triplet for file access. The job umbrella spreads a homogeneous clearance to normalize access to a heterogeneous assortment of program and data files. This simplifies the problem of control in a multi-level security system. Also note how the job umbrella's high-water mark (equations (11) through (14)) is used to automatically classify new files (equations (17) and (18)); this subject is discussed further below.

A final observation on the model is its application of need-to-know to terminal access, equation (23). This feature allows terminals to be restricted to special people and/or special groups for greater control of personnel interfaces, i.e., systems programmers, computer operators, etc.

Security control implementation

The selection of a set-theoretic model of security control was not fortuitous, but a deliberate choice biased toward computational efficiency and ease of implementation. It permits the clean separation and isolation of security control code from the security control data, which enables ADEPT's security mechanisms to be openly discussed and still remain safe, a point advocated by others.14,16 We achieve this safety by "arming" the system with security control data only once, at start-up time, by the SYSLOG procedure discussed later. Also, the model improves the credibility of the security system, enhancing its understanding and thereby promoting its certification.

Security objects: Identity and structure

Each security object has a unique identification (ID) within the system such that it can be managed individually. The form of the ID depends upon the security-object type; the syntax of each is given below.
User identification

For generality of definition, each user is uniquely identified by his user:id, which must be less than 13 characters with no embedded blanks. The user:id can be any meaningful encoding for the local installation. For example, it can be the individual's Social Security number, his military serial number, his last name (if unique and less than 13 characters), or some local installation man-number convention. The set of all user:ids constitutes the universal set, U.

Terminal identification

All peripheral devices in ADEPT are identified uniquely by their IBM 360 device addresses. Besides interactive terminals, this includes disc drives, tape drives, line printer, card reader-punch, drums, and 1052 keyboard. Therefore, terminal:id must be a two-digit hexadecimal number corresponding to the unit address of the device.

Job identification

ADEPT consists of two parts: the Basic Executive (BASEX), which handles the allocation and scheduling of hardware resources, and the Extended Executive (EXEX), which interfaces user programs with BASEX. ADEPT is designed to operate itself and user programs as a set of 4096-byte pages. BASEX is identified as certain pages that are fixed in main core, whereas EXEX and user programs are identified as sets of pages that move dynamically between main and swap memory. A set of user programs are identified as a job, with page sets for each program (the program map) described in the job's environment area, i.e., the job's "state tables." Every job in ADEPT has an environment area that is swapped with the job. It contains dynamic system bookkeeping information pertinent to the job, including the contents of the machine registers (saved when the job is swapped out), internal file and I/O control tables, a map of all the program's pages on drum, user:id, and the job security control parameters.
The environment page(s) are memory-protected against reading and writing by user programs, so they are really swappable extensions of the monitor's tables. The job:id is then a transitory internal parameter which changes with each user entrance and exit from the system. The job:id is a relative core memory address used by the executive as a major index into central system tables. It is mapped into an external two-digit number that is typed to the user in response to a successful LOGIN.

File identification

ADEPT's file system is quite rich in the variety of file types, file organization, and equipment permitted. There are two file types: temporary and permanent. Temporary files are transitory "scratch" disc files, which disappear from the system inventory when their parent job exits from the system. They are always placed on resident system volumes, and are private to the program that created them. Permanent files constitute the majority of files cataloged by the system. Their permanence derives from the fact that they remain inventoried, cataloged, and available even after the job that created or last referenced them is no longer present, and even if they are not being used. Permanent files may be placed by the user on resident system volumes or on demountable private volumes.

There are six file organizations from which a user may select to structure the records of his file: physical-sequential, S1; non-formatted, S2; index-sequential, S3; partitioned, S4; multiple volume fixed record, S5; and single volume fixed record, S9. Regardless of the organization of the records, ADEPT manages them as a collection, called a file. Thus, security control is at the file level only, unlike more definitive schemes of sub-element control.8,10-12

All the control information of a file that describes type, organization, physical storage location, date of creation, and security is distinct from the data records of the file, and is the catalog of the file. All cataloged ADEPT files are uniquely identified by a four-part name; each part has various options and defaults (system assumptions). This name, the file:id, has the following form:

file:id ::= name, form, user:id, volume:id

Name is a user-generated character string of up to eight characters with no embedded blanks. It must be unique on a private volume as well as for Public files (described below). Form is a descriptor of the internal coding of a file. Up to 256 encodings are possible, although only these seven are currently applicable:

1 = binary data
2 = relocatable program
3 = non-relocatable program
4 = card images
5 = catalog
6 = DLO (Delayed Output)
7 = line images

User:id corresponds to the owner of the file, i.e., the creator of the file. Volume:id is the unique file storage device (tape, disc, disc pack, etc.) on which the file resides. For various reasons, including reliability, ADEPT file inventories are distributed across the available storage media, rather than centralized on one particular volume. Thus, all files on a given disc volume are inventoried on that volume.

Security properties: Encoding and structure

Implementation of the security properties in ADEPT is not uniform across the security objects as suggested by our model, particularly the Franchise property. Lack of uniformity, brought about by real-world considerations, is not a liability of the system but a reflection of the simplicity of the model. Extensions to the model are developed here in accordance with that actually implemented in ADEPT.

Authority

Authority is fixed at four levels (ω = 3 for equation (1)) in ADEPT, specifically, UNCLASSIFIED, CONFIDENTIAL, SECRET, and TOP SECRET, in accordance with Department of Defense security regulations.
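The four-part file:id described earlier can be sketched as a small record type. The Python representation and parse helper below are illustrative assumptions; only the field names, length limits, and form codes come from the text:

```python
from dataclasses import dataclass

# Form codes as listed in the text: the internal coding of a file.
FORMS = {1: "binary data", 2: "relocatable program",
         3: "non-relocatable program", 4: "card images",
         5: "catalog", 6: "DLO (Delayed Output)", 7: "line images"}

@dataclass
class FileId:
    name: str       # up to 8 characters, no embedded blanks
    form: int       # one of the encodings in FORMS
    user_id: str    # owner/creator of the file, less than 13 characters
    volume_id: str  # storage device on which the file resides

def parse_file_id(text: str) -> FileId:
    """Split 'name, form, user:id, volume:id' into its four parts."""
    name, form, user_id, volume_id = (p.strip() for p in text.split(","))
    assert len(name) <= 8 and " " not in name
    assert int(form) in FORMS and len(user_id) < 13
    return FileId(name, int(form), user_id, volume_id)

fid = parse_file_id("PAYROLL, 1, smith, D101")
assert fid.name == "PAYROLL" and FORMS[fid.form] == "binary data"
```

Distributing the inventory by volume, as the text describes, then amounts to keying each volume's catalog by the (name, user:id) pairs it holds.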
The Authority set is encoded as a logical 4-bit item, where positional order is important. Magnitude tests are used extensively, such that the high-order bits imply high Authority in the sense of equation (8).

Category

Category is limited to a maximum of 16 compartments (ψ ≤ 15 for equation (2)), encoded as a logical 16-bit item. Boolean tests are used exclusively on this datum. The definition of (and bit position correspondence to) specific compartments is an installation option at ADEPT start-up time (see SYSLOG). Typical examples of compartments are EYES ONLY, CRYPTO, RESTRICTED, SENSITIVE, etc.

Franchise

Franchise corresponds to the military concept of need-to-know. Essentially, this corresponds to a set of user:ids; however, the ADEPT implementation of Franchise is different for each security object:

1. User: All users wishing ADEPT service must be known to the system. This knowledge is imparted by SYSLOG at start-up time and limited to approximately 500 user:ids (max(U) ≤ 500).

2. Terminal: Equation (5) specifies the Franchise of a given terminal, F_t, as a set of user:ids. In ADEPT, F_t does not exist. One may define all the users for a given terminal, i.e., F_t; or alternatively, all the terminals for a given user. Because SYSLOG orders its tables by user:id, the latter definition was found more convenient to implement.

3. Job: The Franchise of a job is the user:id of the creator of the job at the time of LOGIN to the system. Currently, only one user has access to (and control of) a job (μ = 0 for equation (6)).

4. File: Implementation of Franchise for a file, F_f, is more extensive than equation (7). In ADEPT, we wish to control not only who accesses a file, but also the quality of access granted.
We have defined a set of four exclusive qualities of access, such that a given quality, q, is defined if

q ∈ {READ, WRITE, READ-AND-WRITE, READ-AND-WRITE-WITH-LOCKOUT-OVERRIDE}    (27)

ADEPT permits simultaneous access to a file by many jobs if the quality of access is for READ only. However, only one job may access a file with WRITE, or READ-AND-WRITE quality. ADEPT automatically locks out access to a file being written, to avoid simultaneous reading and writing conflicts. A special access quality, however, does permit lockout override. Equation (7) can now be extended as a set of pairs,

F_f = {(u_0, q_0), (u_1, q_1), ..., (u_γ, q_γ)}    (28)

where the q_i are not necessarily distinct and are given by equation (27). The implementation of equation (28) is dependent upon γ, the number of franchised users. When γ = 0, we have the ADEPT Private file, exclusive to the owner, u_0; for γ = max(U), we have the Public file; values of γ between these extremes yield the Semi-Private file. γ is implicitly encoded as the ADEPT "privacy" item in the file's catalog control data, and takes the place of F_f for all cases except a Semi-Private file. For that case exclusively, equation (28) holds and an actual F_f list of user:id, quality pairs exists as a need-to-know list. The owner of a file specifies and controls the file's privacy, including the composition of the need-to-know list.

Security control initialization: SYSLOG

SYSLOG is a component of the ADEPT initialization package responsible for arming the security controls. It operates as one of a number of system start-up options prior to the time when terminals are enabled. SYSLOG sets up the security profile data for user:id and terminal:id, i.e., the "given constants" of Table I. SYSLOG creates or updates a highly sensitive system disc file, where each record corresponds to an authorized user.
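The property encodings and Franchise tests described above (a 4-bit Authority item compared by magnitude, a 16-bit Category item tested with Boolean operations, and the three privacy classes derived from γ) can be sketched as follows. The bit layouts follow the text; the helper names and the privacy strings are illustrative assumptions:

```python
# Sketch of ADEPT's security-property encodings and access tests.
# Bit layouts follow the text; helper names are assumptions.

# 4-bit Authority item: high-order bits imply higher Authority, so a
# plain magnitude comparison implements the sense of equation (8).
UNCLASSIFIED, CONFIDENTIAL, SECRET, TOP_SECRET = 0b0001, 0b0010, 0b0100, 0b1000

def authority_ok(job_auth: int, file_auth: int) -> bool:
    return file_auth <= job_auth          # magnitude test

# 16-bit Category item: one bit per compartment, Boolean tests only.
# Compartment-to-bit assignment is an installation option (see SYSLOG).
CRYPTO, EYES_ONLY, SENSITIVE = 1 << 0, 1 << 1, 1 << 2   # up to 16 bits

def category_ok(job_cats: int, file_cats: int) -> bool:
    # C_f must be a subset of C_j: equality test against (C_j AND C_f).
    return (job_cats & file_cats) == file_cats

# Franchise: gamma = 0 is a Private file, gamma = max(U) a Public file;
# in between, a Semi-Private file carries a (user:id, quality) list.
def franchise_ok(privacy: str, owner: str, need_to_know, user_id: str) -> bool:
    if privacy == "public":
        return True
    if privacy == "private":
        return user_id == owner
    return any(u == user_id for u, _q in need_to_know)   # table search of F_f

assert authority_ok(SECRET, CONFIDENTIAL) and not authority_ok(SECRET, TOP_SECRET)
assert category_ok(CRYPTO | SENSITIVE, CRYPTO) and not category_ok(CRYPTO, EYES_ONLY)
assert franchise_ok("semi-private", "smith", [("jones", "READ")], "jones")
```

Because both tests reduce to one integer comparison or one AND-and-compare, the per-access cost of the model is a handful of machine instructions, which is the computational-efficiency bias mentioned earlier.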
These records are constructed from a deck of cards consisting of separate data sets for compartment definitions, terminal:id classification, and user:id clearance. The dictionary of compartment definitions contains the less-than-9-character mnemonic for each member of the Category set. Data sets are formed from the card types shown in Table II. Use of passwords is described later in the LOGIN procedure. An IDT card must exist for each authorized user; the PWD, DEV, SEC, and CAT card types are optional. Other card types are possible, but not germane to security control, e.g., ACT for accounting purposes. More than one PWD, DEV, and CAT card is acceptable up to the current maximum data limits (i.e., 64 passwords, 48 terminal:ids, and 16 compartments). A variety of legality checks for proper data syntax, quantity, and order are provided. SYSLOG assumes the following default conditions when the corresponding card type is omitted from each data set:

PWD   No password required
DEV   All terminal:ids authorized
SEC   A = UNCLASSIFIED
CAT   C = null (all zero mask)

This gives the lowest user clearance as the default, while permitting convenient user access. Various options exist in SYSLOG to permit maintenance of the internal SYSLOG tables, including the replacement or deletion of existing data sets in total or in part.

The sensitivity of the information in the security control deck is obvious. Procedures have been developed at each installation that give the function of deck creation, control, and loading to specially cleared security personnel. The internal SYSLOG file itself is protected in a special manner described later.

TABLE II-SYSLOG control cards

Card Type                                 Purpose
DICT compartment1 ... compartment16       Identifies start of data set of compartment definitions. Defines up to 16 compartments.
TERMINAL                                  Identifies start of data sets of terminal definitions.
UNIT terminal:id                          Identifies start of a terminal data set.
IDT user:id                               Identifies start of a user data set.
PWD password1 ... password64              Defines legal passwords for user:id, up to 64.
DEV terminal:id1 ... terminal:id48        Defines legal terminals for user:id, up to 48.
SEC Authority                             Defines user:id Authority.
CAT compartment1 ... compartment16        Defines user:id Category set.

Access control

A fundamental security concern in multi-access systems is that many users with different clearances will be simultaneously using the system, thereby raising the possibility of security compromise. Since programs are the "active agents" of the user, the system must maintain the integrity of each and of itself from accidental and/or deliberate intrusion. A multifile system must permit concurrent access by one or more jobs to one or more on-line, independently classified files. ADEPT is all these things: a multiuser, multiprogram, and multifile system. Thus, this section deals with access control over users, programs, and files.

User access control: LOGIN

To gain admittance to the system, a user must first satisfy the ADEPT LOGIN decision procedure. This procedure attempts to authenticate the user in a fashion analogous to challenge-response practices. The syntax of the ADEPT LOGIN command, typed by a user on his terminal, is as follows:

/LOGIN user:id password accounting

Figure 1 pictorially displays the LOGIN decision procedure based upon the user-specified input parameters. User:id is the index into the SYSLOG file used to retrieve the user security profile. If no such record exists (i.e., equation (22) fails), the LOGIN is unsuccessful and system access is denied. If the security profile is found, LOGIN next retrieves the terminal:id for the keyboard in use from internal system tables, and searches for a match in the terminal:id list for which the user:id was franchised by SYSLOG. An unsuccessful search is an unsuccessful LOGIN. If the terminal is franchised, then the current password is retrieved from the SYSLOG file for this user:id and matched against the password entered as a keyboard parameter to LOGIN. An unsuccessful match is again an unsuccessful LOGIN. Furthermore, the terminal is ignored (will not honor input) for approximately 30 seconds to frustrate high-speed, computer-assisted penetration attempts. If, however, the match is successful, the current password in the SYSLOG file for this user:id is discarded, and LOGIN proceeds to create the job clearance.

Figure 1-LOGIN decision procedure

Passwords in ADEPT obey the same syntax conventions as user:id. (See the earlier description of User Identification.) Although easily increased, currently SYSLOG permits up to 64 passwords. Each successful LOGIN throws away the user password; 64 successful LOGINs are possible before a new set of passwords need be established. If other than random, once-only passwords are desired, the 64 passwords may be encoded in some algorithmic manner, or replicated some number of times. Once-only passwords are an easily implemented technique for user authentication, which has been advocated by others.2,7 It is a highly effective and secure technique because of the high permutability of 12-character passwords and their time and order interdependence, known only to the user.

Once the authentication process is completely satisfied, LOGIN creates the job security profile according to equations (15) and (16) of our model. That is, the lower Authority of the user and the terminal becomes A_j, and the intersection (logical AND) of the user and terminal Category sets becomes the Category of the job, C_j. For example, a user with TOP SECRET Authority and a Category set (1001 1001 0000 1101) operating from a SECRET-level terminal with a Category set (0000 0000 0000 0010) controls a job cleared to SECRET with an empty Category set.
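The once-only password discipline above is simple to sketch. The data structure and the stand-in for the 30-second terminal lockout below are illustrative assumptions; the discard-on-success behavior is as described in the text:

```python
import time

# Sketch of a once-only password list, as in ADEPT's LOGIN: each
# successful LOGIN discards the password just used, so a captured
# password is worthless afterward. Structures are assumptions.

class UserRecord:
    def __init__(self, passwords):
        self.passwords = list(passwords)   # up to 64, set by SYSLOG

    def login(self, attempt: str) -> bool:
        if not self.passwords or attempt != self.passwords[0]:
            # ADEPT ignores the terminal for ~30 seconds here to
            # frustrate high-speed, computer-assisted penetration.
            time.sleep(0.03)               # token stand-in for the delay
            return False
        self.passwords.pop(0)              # discard the used password
        return True

rec = UserRecord(["RED-BARON-07", "BLUE-LAGOON-9"])
assert rec.login("RED-BARON-07")
assert not rec.login("RED-BARON-07")       # already consumed
assert rec.login("BLUE-LAGOON-9")
```

The order dependence is the point: knowing any one password, or even the whole set out of order, does not let an intruder predict which entry is currently live.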
Program access control: LOAD

As noted earlier, the ADEPT Executive consists of two parts: BASEX, the resident part, and EXEX, the swapped part. EXEX is a body of reentrant code shared by all users; however, it is treated as a distinct program in each user's job. Up to four programs can exist concurrently in the job. Each operates with the job clearance, the job clearance umbrella. LOAD is the ADEPT component used to load the programs chosen by the user; it is part of EXEX and hence operates as part of the user's job with the job's clearance. Programs are cataloged files and as such may be classified with a given security profile. As is described in "File Access Control" below, LOAD can only load those programs for which the job clearance is sufficient. Once loaded, however, the new program operates with the job clearance. In this manner, we see the power of the job umbrella in providing smooth, flexible user operation concurrent with necessary security control. Program files may be classified with a variety of security profiles and then operate with yet another, i.e., the job clearance. By this technique security is assured, and programs of different classifications may be operated by a user as one job. It permits, for example, an unclassified program file (e.g., a file editor) to be loaded into a highly classified job to process sensitive classified data files.

File access control: OPEN

Before input/output can be performed on a file, a program must first acquire the file by an OPEN call to the Cataloger. Each program must OPEN a file for itself before it can manipulate the file, even if the file is already OPENed for another program.
A successful OPEN requires proper specification of the file's descriptors (some of which are in the OPEN call; others are picked up directly by the Cataloger from the job environment area, e.g., job clearance, user:id) and satisfactory job clearance and user:id need-to-know qualifications according to equations (25) and (26) of our model. Equation (25) is implemented as (8), a straightforward magnitude comparison between A_j and A_f, and as (9), an equality test between C_f and (C_j ∧ C_f). We use (C_j ∧ C_f) to ensure that C_f is a subset of the job categories, i.e., the job umbrella. Lastly, equation (26) is a NOP if the file is Public; a simple equality test between u_j and u_f if the file is Private; and a table search of F_f for u_j if the file is Semi-Private. These tests do increase processing time for file access; however, the tests are performed only once at OPEN time, where the cost is insignificant relative to the I/O processing subsequently performed on the file.

The quality of access granted by a successful OPEN, and subsequently enforced for all I/O transfers, is that requested, even if the user has a greater Franchise. For example, during program debugging, the owner of a file may OPEN it for READ access only, even though READ-AND-WRITE access quality is permitted. He thereby protects his file from possible uncontrolled modification by an erroneous WRITE call.

Considerable controversy surrounds the issue of automatic classification of new files formed by subset or merger of existing files. The heart of the issue is the poor accuracy of many such classification techniques17 and the fear of too many over-classified files (a fear of operations personnel) or of too many under-classified files (a fear of the security control officers). ADEPT finesses the problem with a clever heuristic: most new files are created from
existing files, hence classify the new file as a Private file with the composite Authority and Category high-water mark of the job.

Table of AUDIT events and the data recorded for each: LOGIN, LOGOUT, OPEN FILE, REOPEN FILE (1), CHANGE FILE, CLOSE FILE, DELETE FILE, RECLASS, REPLACE, DEVICE LIST (2), CATEGORY DICTIONARY (3), RESTART (4), and WRAPUP (5).

1 This is the "OPEN existing file" command.
2 A list of all the terminal devices and their assigned security and categories is recorded at each system load.
3 A list of the prose category names is recorded at each system load.
4 Whenever the system is restarted on the same day (and AUDIT had been turned on earlier that day) the time of the restart is recorded.
5 The time that the AUDOFF action was taken, or the time that the WRAPUP function called AUDIT, to terminate the AUDIT function.

fully demonstrated a security control mechanism that more than adequately supports heterogeneous levels and types of classification. Of note in this regard are the LOGIN decision procedure, access control tests, job umbrella, high-water mark, and audit trails recording. The approach can be improved in the direction of more compartments (on the order of 1000 or more), extension of the model to include system files, and the implementation of a single Franchise test for all security objects. The implementation needs redundant encoding and error detection of security profile data to increase confidence in the system, though we have not ourselves experienced difficulty here.
The increase in memory requirements to achieve these improvements may force numerical encoding of security data, particularly Category, as suggested by Peters.7

Second, SYSLOG has been highly successful in demonstrating the concept of "security arming" of the system at start-up time. Our greatest difficulty in this area has been with the human element, the computer operators, in preparing and handling the control deck. In opposition to Peters,7 we believe the operator should not be "designed out of the operation as much as possible," but rather his capabilities should be upgraded to meet the greater levels of sophistication and responsibility required to operate a time-sharing system.20 He should be considered part of line management. ADEPT is oriented in this direction, and work now in progress is aimed at building a real-time security surveillance and operations station (SOS).

Third, we missed the target in our attempt to isolate and limit the amount of critical coding. Though much of the control mechanism is restricted to a few components (LOGIN, SYSLOG, CATALOGER, AUDIT), enough is sprinkled around in other areas to make it impossible to restrict the omnipotent capabilities of the monitor, e.g., to run EXEX in Problem state. Some additional design forethought could have avoided some of this dispersal, particularly the wide distribution in memory of system data and programs that set and use these data. The effect of this shortcoming is the need for considerably greater checkout time, and the lowered confidence in the system's integrity.

Lastly, on the brighter side, we were surprisingly frugal in the cost of implementing this security control mechanism: It took approximately five percent of our effort to design, code, and checkout the ADEPT security control features. The code represents about ten percent of the 50,000 instructions in the system. Though the code is widely distributed, SYSLOG, security commands, LOGIN, AUDIT, and
the CATALOGER account for about 80 percent of it. The overhead cost of operating these controls is difficult to measure, but it is quite low, in the order of one or two percent of total CPU time for normal operation, excluding SYSLOG. (SYSLOG, of course, runs at card reader speed.) The most significant area of overhead is in the checking of I/O channel programs, where some 5 to 10 msec are expended per call (on the average). Since this time is overlapped with other I/O, only CPU-bound programs suffer degradation. AUDIT recording also contributes to service call overhead. In actuality, the net operating cost of our security controls may be zero or possibly negative, since AUDIT recordings showed us numerous trivial ways to measurably lower system overhead.

ACKNOWLEDGMENTS

I would like to acknowledge the considerable encouragement I received in the formative stages of the ADEPT security control design from Mr. Richard Cleaveland of the Defense Communications Agency (DCA). I would like to thank Mrs. Martha Bleier, Mr. Peter Baker, and Mr. Arnold Karush for their patient care in designing and implementing much of the work I've described. Also, I wish to thank Mr. Marvin Schaefer for assisting me in set theory notation. Finally, I would like to applaud the ADEPT system project personnel for designing and building a time-sharing system so amenable to the ideas discussed herein.
REFERENCES

1 A HARRISON
The problem of privacy in the computer age: An annotated bibliography
RAND Corp Dec 1967 RM-5495-PR/RC

2 L J HOFFMAN
Computers and privacy: A survey
Stanford Linear Accelerator Center Stanford Univ Aug 1968 SLAC-PUB-479

3 H E PETERSEN R TURN
System implications of information privacy
Proc SJCC Vol 30 1967 291-300

4 W H WARE
Security and privacy in computer systems
Proc SJCC Vol 30 1967 279-282

5 W H WARE
Security and privacy: Similarities and differences
Proc SJCC Vol 30 1967 287-290

6 R LINDE C WEISSMAN C FOX
The ADEPT-50 time-sharing system
Proc FJCC Vol 35 1969; also issued as SDC Doc SP-3344

7 B PETERS
Security considerations in a multi-programmed computer system
Proc SJCC Vol 30 1967 283-286

8 RYE CAPRI COINS OCTOPUS SADIE Systems
NOC Workshop National Security Agency Oct 1968

9 H W BINGHAM
Security techniques for EDP of multi-level classified information
Rome Air Development Center Dec 1965 RADC-TR-65-415

10 R M GRAHAM
Protection in an information processing utility
ACM Symposium on Operating Systems Principles Oct 1967 Gatlinburg Tenn

11 L J HOFFMAN
Formularies-Program controlled privacy in large data bases
Stanford Univ Working Paper Feb 1969

12 D K HSIAO
A file system for a problem solving facility
Dissertation in Electrical Engineering Univ of Pa 1968

13 J I SCHWARTZ C WEISSMAN
The SDC time-sharing system revisited
Proc ACM Conf 1967 263-271

14 P BARAN
On distributed communications: IX, Security, secrecy, and tamper-free considerations
RAND Corp Aug 1964 RM-3765-PR

15 C WEISSMAN
Programming protection: What do you want to pay?
SDC Mag Vol 10 No 8 Aug 1967
16 J P TITUS Washington commentary-Security and privacy CACM Vol 10 No 6 June 1967 379-380
17 I ENGER et al Automatic security classification study Rome Air Development Center Oct 1967 RADC-TR-67-472
18 A KARUSH The computer system recording utility: Application and theory System Development Corp March 1969 SP-3303
19 A KARUSH Benchmark analysis of time-sharing systems: Methodology and results System Development Corp April 1969 SP-3343
20 R R LINDE P E CHANEY Operational management of time-sharing systems Proc 21st Nat ACM Conf 1966 149-159

Management of confidential information

by EDWARD V. COMBER
System Dynamics, Inc.
Oakland, California

INTRODUCTION

For many years, informed persons have expended considerable time and energy attempting to evolve an acceptable philosophic assessment of the concept of "privacy." Studies made in the fields of anthropology, psychology, and sociology are in general agreement that both the mental and physical well-being of an individual requires freedom to experience some degree of personal anonymity within the environment. While the significance of "privacy" has been recognized, it has eluded the constraint of an acceptable definition. The search for a workable definition continues as man seeks a means for establishing practical bounds for inter-personal relations. Recently, the concern for "privacy" has become a rallying point for those who see the present growth and applications of data automation as a threat to the "rights of privacy" of the individual. These advocates lament that the individual is unaware of the threat to his "loss of privacy" as his attention is diverted by the glowing promises of anticipated benefits that may become available through data automation.
It is the writer's belief that through the proper and reasonable utilization of the tools of modern data technology man will have within his power a mechanism that has the potential of becoming his strongest ally in his search for means to preserve the values of "privacy." In reality, the critical element in this question of "privacy" should not address itself to the electromechanical capability of the computer or system telecommunications functions. The true focal point is the direct challenge to the discipline and conduct of man, who is the designer and user of the data system.6 Man must be willing to abide by the standards he derives from his own "privacy" criteria. He must staunchly forego any temptation to engage in system shortcuts, and he must hold to the position that he will not accept lightly any violations of the "confidentiality controls" established for system operation. Any breach in the integrity of the system must be viewed as a direct personal challenge to the integrity of each person associated with the undertaking.

SUMMARY

The following is a brief résumé of significant elements that have been identified with the question of "privacy." These comments are not offered as final, nor are they to be considered as embracing the entire area of concern. The summary is presented simply as a means of bringing together some key factors that could serve as a foundation for a basic "privacy" control system. The working standards will evolve as man gains more experience with this powerful ally and is able to resolve philosophical and ethical questions that are inherent in the concept of "privacy". As the environment and pace of modern life adjust to current needs, the nature of "privacy" will probably also reflect changes in priorities and the character of the social stresses.
Elements in the invasion of privacy

No definitive statement exists which provides a clear and acceptable statement of what is "private information," or what constitutes an "unwarranted invasion of privacy." Any criteria proposed to date to identify "private information," or describe an act that would constitute "unwarranted invasion of privacy," must take into account whether or not such disclosure of the specific data:

A. Would relate to an individual, a family or other small group in such manner as to facilitate the likelihood of the unwarranted identification of the individuals, or
B. Is not considered public information by provision of legal statute, or
C. Would cause or be the basis for unjust economic loss or social stigma or harassment to the individual, or
D. Would result in the unnecessary loss of a property right.

What is private vs. what is confidential?

When attempting to discuss "privacy," the term "confidentiality" inevitably will join the debate, but does not promote clarification. What sort of personal information do reasonable men interpret as "private?" The answer to this question depends upon many things; for example, any one or more of the following factors may apply:

A. The context within which the specific information is embedded,
B. The amount of information assembled and accessible,
C. The intrinsic nature of the information,
D. The sophistication of the social values held by the individuals concerned,
E. The character and scope of the sub-culture,
F. Significance of personal attributes such as: age, ancestry, social status, race, etc.

Recently, the California Intergovernmental Board on EDP was established by statute.1 It is charged with responsibility to provide for intergovernmental representation in the coordination of the many government-sponsored EDP programs and to take leadership in the establishment of intersystem standards.
The Intergovernmental Board appointed a select Technical Advisory Committee to assist in the preparation of a manual to serve as a guideline for all agencies in the development of local systems and facilitate adequate interface capability as required. The manual was completed and is under review by the Intergovernmental Board prior to general release to official agencies throughout the State of California. A sub-committee of the Technical Advisory Committee was specifically assigned to address the question of "privacy". The members of the Privacy Sub-committee concluded, after some study, that there are a number of personal information items that could be made accessible to an integrated data system without any threat to the individual's "privacy". It was also recognized that there are many other data items that for one reason or another should be restricted from wide access in the absence of an established right to know. Some examples of these data items are shown below:

A. Information that may not be relevant to personal privacy:
Name
Maiden Name
Address
Age or DOB
Race
Sex
Marital Status
Name of Spouse
Next of Kin

B. Information that would probably be relevant to personal privacy:
Occupation
Education
Income
Religious Preference
Political Preference
Family Size
Number of Children
Ages of Children
Taxes Paid
History of Residence
Attitudes Toward Social Issues
Property Ownership
Value of Real Property
Marital History
Drinking Practices
Hospitalization Record
Medical Record
Symptoms of Illness
Record of Arrest
Ancestry
Nationality
Names of Relatives
Response to Psychological or Medical Questions

Proliferation of data items throughout the culture

While some of the information items mentioned above may be found on records that are classified as confidential, many of the information items may also be found on records that are not subject to restriction
by law or policy. The current trend in social intercourse and information exchange reflects an ever-broadening depth of self-revealment by individuals. Private and governmental services are being extended into newer areas and thereby attracting the participation of an ever-growing segment of the citizenry. The integration of interagency information systems with data exchange introduces a new dimension associated with the creation of composite record images of persons known to the total system. These images are the product of independent and frequently unrelated inputs of data to serve other specific needs. Any integrated interagency information system with this potential capability must be administered by professionally qualified persons who remain sensitive to the need to verify both the identification of the subject of inquiry and the inquirer's "right to know". As more data systems are activated and interfaces are established, the individual who is the initial source of the data becomes more remote and isolated from the operational inquiry that relates to his record. It should be the constant aim of the system design, operational programming, and user discipline to assure that system integrity is not subverted.

Significance of developing standards for data verification

Attention should not be directed solely to the identification and classification of personal data items. What is equally important, standards must be developed and adopted to guide data acceptance and utilization with respect to the ability to verify the information. For example, confidence in the operating system will be increased and utilization encouraged if the user is assured that data items are subject to verification as to:

A. Accuracy
B. Bias
C. Completeness
D. Currency
E. Documentation
F. Satisfaction of Legal Requirements

A safety valve that will support a sound verification program is to initiate a practical data purge system.
The best data system in terms of cost/benefit analysis is one that has a high content of active data and one that is adequately updated. The effect of establishing a continuous and critical purge system is to provide an orderly review of file content and to remove inactive or low-value data.

One approach to a data classification plan

A number of studies have been undertaken in an attempt to identify and define data items that should be processed as classified or confidential. There have been perhaps as many solutions offered as there have been studies proposed. The Privacy Sub-committee mentioned above proposed a simple three-category data plan for consideration and approval of the California Intergovernmental Board on EDP.2 The concept is summarized below:

A. Confidential: This classification has the highest level of restriction, and should be limited to data which is prohibited from free and full disclosure by statutory regulation (law).

B. Restricted: This is data which:
1. Is not prohibited from full and free disclosure by statute (confidential), and
2. An unauthorized intrusion could constitute an unwarranted invasion of personal privacy, and
3. Has been administratively assigned a security classification-restricted.

C. Unclassified: All data maintained by a public agency not otherwise classified as confidential or restricted as defined above.

Sources of classification criteria

The criteria for the establishment of classification of data arise from a variety of sources. In many instances, the criteria are a result of the interaction of one or more of the following:

A. Public Policy: The living residue of tradition and social acceptance.
B. Statutory Law: The formalized and legal codification of social needs and standards of conduct.
C. Legal Interpretation: The implementation of judicial and administrative decisions that have been sanctioned through public acceptance.
D.
User Agency Specifications: Operational decisions that have been adopted and enunciated to promote agency goals in an atmosphere of public support.
E. Personal Needs of the Individual: Acceptance of the system integrity by the public who participate and furnish personal information to assist an agency function with respect to the needs of the individual (Federal Census, Social Security, etc.).

Each of the sources of criteria utilized is subject to its own characteristic variations, and will require continuous reevaluation. The scope of data items subject to the confidential classification is under constant adjustment and reassessment due to the dynamic character of the social conditions which give rise to the data.

Identification of areas sensitive to intrusion3

One of the main deterrents to the development of new ideas about privacy has been the lack of specificity as to where the threats to privacy may arise. Many agree that at some future date, a serious threat may develop. That a real danger exists today is not universally accepted. Let us consider the potential challenge to "privacy" that may originate from any of the following sources:

A. The accidental observance of data by an individual.
B. The accidental dumping of a volume of confidential data to general view.
C. The solitary snoop.
D. The snoop-for-pay (hired spy).
E. The file stealer.
F. Misuse of a confidential file by an administrator having access to the system.
G. Organized crime.
H. Totalitarian government.
I. Another possibility might be the intrusion of the private sector into government data files.

Establish policy on data classification

Before any acceptable automation program can be developed to process information that may be considered "private" or "confidential," certain policy decisions must be resolved.

A. The responsible administrators representing users of the system must reach agreement on the data content of the information system.
This agreement must include the identification of any data items or files that would be subject to restricted access or inquiry. If the restriction is pursuant to current policy, said policy should be specified:
1. General Public Policy
2. Agency Administrative Policy
3. Statutory Provision
4. Judicial Ruling

B. Specific criteria should be established based on the accepted policy statements, and serve as a guide to test the classification of all data introduced into the system. The continued validity of a classification should be based upon periodic challenge and justification.

C. A policy manual should be prepared and maintained as a ready reference to facilitate system operation.
1. Personnel participating in the system should be held individually accountable for full compliance with the "policy guidelines."
2. The policy manual should be subject to continuous review and update to remain current with system requirements, technology, and legal specifications.

D. Additional considerations in the development of an interagency information system to maintain privacy control. Decisions regarding the following elements of the system design and operation will prove significant:

1. Facility Security:
(a) Location of Hardware: Single vs. Multiple Facility
(b) Physical Adequacy: Equipment, Personnel
(c) Access to Facility: Normal, Emergency

2. Equipment:
(a) Selection
(b) Configuration
(c) Operating Characteristics: Multi-processing, Multi-programming, Remote Terminals

3. Program Control:
(a) Single Management Responsibility
(b) User Representation and Participation
(c) Operating System: Monitor of System Services and Access
(d) System Applications: Man-Machine Interface (Key Consideration)
(e) Modularization of System Applications: Does Modularization Weaken Privacy Control?
(f) Integration of Compatible Systems: Does Program Control Reside With the Core System?

4.
The Human Factor: This is the critical and perhaps most unpredictable element in the functioning process.
(a) Personnel Recruitment, Selection and Appointment
(b) Personnel Training and Supervision
(c) Maintenance of Operating Discipline
(d) Personnel Retention

Precautions to minimize potential for "privacy" violations

The same versatility and power that makes the computer valuable as a data manipulator can be employed to monitor system services and support human supervision procedures. The operating information system should provide (assuming an adequate system analysis and design):

A. A Sound Data Classification System
1. Specify data subject to restricted access and special protection.
2. Provide for isolated storage of restricted data if necessary.
3. Determine who has the right of access to confidential data and under what operating conditions.
4. User agency personnel should be certified for access by administration.

B. Physical Conditions: What levels of control should be imposed to promote system integrity and at the same time provide a functional environment that will encourage system utilization by the participants for which it was designed?

1. Equipment (system hardware):
(a) Location and physical security of equipment.
(1) Central Computer Installation
(2) Associated Peripheral Equipment
(3) Back-Up Facilities: Duplicate Files
(b) Remote terminal installations (I/O devices).
(c) Circuit Security

2. System Configuration:
(a) Central Data Bank vs. Dispersed Data Bases
(b) Central Data File vs. Central Index Concept
(c) Central System Control vs. Remote Terminal Activation
(1) Restricted Terminal Operation
(2) Multiple Function Remote Terminal

3. Software System Support: Programming must be developed with an awareness of the need for system integrity and data security. Provision must be made to provide control over basic software components, such as:
(a) Program Library Back-Up
(b) Documentation
(c) Diagnostic and Test Routines
(d) Continuous Coding of Update Schedules That Support the Identification Schemes Inherent to the Confidentiality Control Programs
(e) Transaction Monitor Logs Should Be Designed to Provide the Basis for Operational Supervision but Not Reveal the Location or Content of the Confidential Files Which Are Subject to Monitor Control

4. Personnel Requirements: If the system equipment and facilities justify particular planning to minimize the hazards to confidentiality, consideration must certainly be given to the personnel who will function in the system. The scope of attention should extend through both the employees who perform the technical services associated with EDP, and the operating personnel of the agency for which the information system was developed. Despite all that has been said heretofore, the "key" to security of information rests with the individuals who have access to the data system. Our personnel planning should encompass many specific areas. The following relate most directly to physical factors:
(a) Personal Safety
(1) Area Accessibility
(2) Emergency Provisions
(b) Personal Accountability
(1) Identification Control Plan
(a) Access to Installation
(b) Access to Specific Work Areas
(2) Is the Plan Practical? Is It Used?
(c) Conveniences and Necessities
(1) Are They Adequate?
(2) Are They Properly Located?
(d) What Special Precautions Are Warranted When Non-employee Personnel Are Permitted Access to the Installation Area?
(1) Equipment Maintenance
(2) Building Service Maintenance

C. System Design Considerations: Control provided through specific programming techniques.

1.
Limiting Terminal Access to the System: Programming
(a) Classification Schedule (Data Level Control)
(1) Terminal Identification
(2) Terminal Verification
(3) User Identification
(4) User Verification
(5) Call-Back Concept
(b) Restriction of Detail of Information in Response to Inquiry (Data Item Control)
(1) Refer to Index: Pointer to Source Data
(2) Status Indicator
(3) Advise Supervisory Station
(a) Secure Permission to Interrogate the Restricted File
(b) Receive Selected Response Through Monitor Agent
(4) Specific Limitation on Terminal Operation
(a) Data Input
(b) Data Manipulation
(c) Data Output
(d) Data Change or Update
(e) Data Purge

2. Establish a Monitor on All Terminal Action to intercept and identify unauthorized attempts to access the system.
(a) Identify Transmitting Terminal and Location
(b) Identify Terminal Operator (?)
(c) Identify Specific Nature of Restricted Access Attempt
(d) Provide for Supervisory-Level Notification of the Attempt to Support Maintenance of System Discipline
(e) Abort the Unauthorized Attempt to Secure Data

3. Maintain audit review of selected files to facilitate the orderly purge of files and to check levels of file activity.
(a) Establish, as necessary, periodic file review procedures to challenge the continued "confidential" status of individual data items to assure conformity with system policy and user need
(b) Maintain necessary statistical measures of activity in restricted files to document operational policy decisions
(c) Provide special test routines to challenge the confidentiality procedures and verify system functional integrity
(d) The Human Factor: The concern for confidentiality of data and file security eventually will focus on an assessment of problems that arise from the human element in the man-machine system.
Despite the sophistication exercised in system analysis, design and implementation, specific recognition must be given to the fact that people participate in system operations.

What about a future computer utility?4

With the rapid and diverse growth of computer services, and recognizing the intimate relation between hardware facilities, communication channels and the users of the systems, it is no accident that discussion should arise about the future establishment of a computer-communication utility. The need for such a service becomes more apparent as we see the introduction of time-sharing systems and the implementation of large integrated data services that support major regional and even statewide programs. The arguments pro and con the justification for a computer-communication utility are beyond the scope of this paper. However, the utility concept does provide the opportunity to propose several avenues of approach to improving the "privacy" control aspect in personal data systems. One of the recurring suggestions has been to establish a system of certification and licensing for persons directly involved with the design, installation, management, and operation of data systems containing sensitive personal information. A second device that could prove of value would be to effect control through regulation of the computer-communication utility service.

CONCLUSION

The challenge of privacy control

Violations of standards regarding confidentiality or privacy of information occur when particular items of personal data furnished to an information system for approved selective use are released to unauthorized persons or in a manner that jeopardizes expected system integrity.

A. The Predominance of the Human Factor

The integrity of any information system regarding confidentiality or invasion of privacy will eventually be resolved at the level of the human factor.
Machines, data sets, file cabinets, index cards, tape drives, disk files, memory modules, computers, report registers: each of these devices is an inanimate object devised by man to receive, transfer, or hold information items made available to the system through human intervention. Data stored in these devices are significant only insofar as the output is meaningful to man, and subject to change or exposure by the action of an individual. Data stored in an inactive or inaccessible device without human interaction will not reveal information that would provide the basis for a violation of privacy. The relationship between man and his information system can be described as consisting of the following basic elements:

(1) Man conceives the system.
(2) Man builds the elements necessary to provide the system.
(3) Man organizes the elements and establishes a scheme of operation.
(4) Man gathers the data that he introduces into the system.
(5) Man activates the system.
(6) Man commands the resources of the system.
(7) Man utilizes the results of the system in his external contacts in society.

The consistent factor in the above summation is the predominant relationship of man to the system. Man is responsible for creation of the system, the input of information, the manipulation of that information, and the final disposition of the data produced or revealed by the system.

B. Personnel Standards Are Necessary

Due to the prime significance of the human element in the integrity of any automated data system, the programs must address the following problems in a forthright manner:

(1) Personnel standards must be established for all participants.
(2) All accepted personnel must be indoctrinated on a continuing basis regarding the system objectives, functions, operational responsibility, etc.
(3) Specific training must be provided regarding system participation and terminal operation.
(4) Each installation should have competent supervision and a plan of routine inspection of operations.
(5) Each agency participating in a larger shared system must be accountable for the performance and integrity of its representatives. It must also be responsible for the release of any system information that is received from a classified file.
(6) All personnel who have access to the system should be required to sign a voluntary statement acknowledging their individual responsibility to protect the integrity of the system and respect the confidentiality of classified data. This statement could be a factor in the initial as well as continued employment.4

The operating system must prove convenient and satisfactory to the user. It must provide an effective service with assurance as to its accuracy and adequacy. Outputs should be tailored to meet the user need under the circumstances of the inquiry. The efficiency of the system should discourage any user development or maintenance of alternate or substitute systems. The man-machine interface should be maintained through the use of simple, direct devices with a minimum requirement for coding, progressive verification, etc. An automated data system should be so designed and supported that the user is free to direct his full attention to his prime functional responsibility. The information system must be a viable and practical tool. It should function at the convenience of the user, with intelligible outputs consistent in time and content to satisfy the service requirement. Where a system requires specific security restrictions, these must be furnished and function without imposing any awkward limitation on the legitimate user of the system.

C. Weak Policy and Discipline Result in an Inferior System

Recent critics have voiced objection to the development of major data banks and interagency information-sharing systems in government service.
Their objection has been based, in part, on certain practices associated with private credit bureau operations. The lament, properly uttered, pointed to a lack of data control and exercise of discretion by a number of these private agencies. While the economic and social value of credit rating bureaus is readily admitted, the loose policies regarding "privacy of data" cast a shadow regarding the ability to maintain integrity in a major information system. I believe it is an unfortunate and improper inference to conclude that public information systems cannot protect the "privacy" of information due to questionable practices among some business organizations established to collect and merchandise private information for profit.

D. Limitation of Data Access to Specific Authorization

Suggestions have been made that an individual should specify the extent of utilization of personal information and that the system then be required to conform to the intention expressed by the individual. This proposal sounds reasonable, but on further consideration presents subsequent problems in data management, modification of data use authorization, etc., that demand thorough study.

E. Individual Right of Inspection of Record and File Correction

Perhaps one of the most practical approaches toward satisfaction of the individual "right to privacy," while at the same time facilitating the availability of the maximum of information resources to solve social needs, is to make provision so that the individual can inspect the system files that contain his personal data. The individual should also have means to seek correction of any data item that is in error or subject to biased interpretation.

F. Develop Realistic Data Purge Policy

Attention should be given to the development of basic guidelines regarding the longevity of data resident in a file or information system. The current trend is to collect and classify more and more data on more and more people.
While hopefully most of the data will have social value, I am sure that a significant quantity will provide little benefit to the individual or the community. It is not too early to consider the need for sound purge criteria so that the data retained in an operating system will offer the highest potential return for the energy expended.

G. Adequate Training Programs Must Be Developed and Employed for the EDP Staff and Personnel of the User Agency Who Have Occasion to Engage the Data System

The content should include an introduction to system design concepts, the overall functions and data processing applications that are components of the system, and a thorough instruction in terminal man-machine dialog. In addition, some attention should be given to explaining the service philosophy, with particular attention to the rules regarding access to and utilization of any information from confidential or restricted files. The legal and moral issues must be clearly defined, and an understanding accepted by all who engage the system that a violation of the security code regarding restricted data may be sufficient grounds for removal from system participation or dismissal. The training program must be viewed as a continuing support function with periodic refresher classes, problem sessions, review of privacy criteria, etc. It is most important that the agency administrators and key supervisory personnel become involved in this program, and not leave the system discipline task to the technical staff, who are neither equipped nor responsible for this duty.

H. Despite much uncertainty and misgivings as to the effectiveness in terms of "privacy" control that will result from the imposition of a licensing scheme, such a potential mechanism will be the subject of more intense consideration with the passage of time.

REFERENCES

1 Intergovernmental Board on Electronic Data Processing created by statute passed by Legislature of the State of California S B No 1100.
This statute is established under sections 11710-11720 of the Government Code
2 File Security Procedures-Report by Sub-Committee on Privacy and Confidentiality of the Intergovernmental Board on Electronic Data Processing Oct 18 1969
3 Ibid
4 D E SCHWEINFURTH The coming computer utility-Laissez-faire, licensing or regulation? Computer Digest May 1968
5 A F WESTIN Privacy and freedom Atheneum New York 1967
6 Hearings Before a Sub-Committee of the Committee on Government Operations House of Representatives 89th Congress (Second Session) July 26, 27 and 28 1966
7 System Development Corp "SDC Magazine" Vol 10 Nos 7 and 8 July-Aug 1967 (This issue focussed on the question of computer privacy.)

Some syntactic methods for specifying extendible programming languages

by VICTOR SCHNEIDER
Purdue University
Lafayette, Indiana

Model of translator system

Our model of a programming-language translator system is represented schematically in the block diagram of Figure 1. This diagram divides the translator system into two components. The first component T is a translator program that reads in and translates the valid programs of some programming language L. The output of the translator is a subset T(L) of the intermediate language. The second component is a system M for executing the programs translated into the intermediate language. It will be seen that, in this intermediate language, the operators follow their operands in postfix (reverse Polish) form, and they are relatively machine-independent. In this paper, we will be mainly concerned with defining the operation of the translator component by specifying the input-output relationships of the translator for a particular programming language. These relationships will be described in a syntactic notation that is independent of the particular translation algorithm used for implementing the translator T.
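The two-component model can be made concrete with a small sketch. All names here are illustrative, not from the paper: a translator T maps source programs of L into the intermediate language T(L), and an executing system M runs the result, so the complete translator system is simply their composition.

```python
def make_system(translate, execute):
    """Compose a translator T and an executing system M into a complete
    translator system: source program in, execution result out."""
    def run(source_program):
        intermediate = translate(source_program)   # T : L -> T(L)
        return execute(intermediate)               # M : T(L) -> result
    return run

# Toy stand-ins for T and M, just to make the composition concrete:
toy_T = lambda src: src.split()     # "translation" into a list of commands
toy_M = lambda prog: len(prog)      # "execution" merely counts the commands

system = make_system(toy_T, toy_M)
assert system("variable x in") == 3
```

The point of the decomposition is that T can be specified purely by its input-output relationship, independently of how M (or T itself) is implemented.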
The language that was chosen as an example for this paper is Wirth and Weber's EULER.14 EULER is quite similar to ALGOL 60 in appearance and capabilities, and it has additional features found in the LISP list-processing language. The original EULER syntax was written to conform to the requirements of a precedence translation algorithm,14 and contains a number of syntactic rules whose purpose is to facilitate construction of a precedence translator from these rules. Because of the presence of these stylized rules, it was decided to rewrite the EULER grammar into a more compact and transparent form than the one in which it originally appeared. An Irons-style notation2,3 was used to specify the translation of this new EULER grammar.

[Figure 1—Simplified block diagram of a translator system; input programs in language L pass through the translator into the intermediate language.]

Reverse Polish translation of programming languages

To illustrate what we mean by a syntactic specification of a programming-language translator, let us consider as an example the following small portion of the EULER syntax and examine some of the basic devices used by our EULER system:

Fall Joint Computer Conference, 1969

Grammar 1. A Simplified Subset of EULER

Syntactic Rule                                 Rule of Translation
<expr> → <var> = <expr>                        <var> <expr> assign
       | <sum>                                 |
<sum> → <sum> + <term>                         <sum> <term> add
      | <term>                                 |
<term> → <term> * <factor>                     <term> <factor> multiply
       | <factor>                              |
<factor> → ( <sum> )                           <sum>
         | at <var>                            <var>
         | <var>                               <var> in
         | <var> . ( <expr-sequence> ) .       <expr-sequence> <var> in
<var> → <name>                                 variable <name>
<expr-sequence> → <expr>                       |
                | <expr-sequence> , <expr>     <expr-sequence> <expr>

Note that the rules of translation above refer to sequences of symbols on the right parts of syntactic rules. In this example, we see that the rules of translation specify how symbols and sequences of symbols in the source language are rearranged and rewritten in the translated language.
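A minimal sketch (not the paper's actual translator, whose names and token handling are invented here) of how Grammar 1's rules of translation can be realized: a recursive-descent routine that emits the reverse polish intermediate language for names, "at", "+", "*", and parentheses.

```python
# Sketch of Grammar 1's rules of translation as a recursive-descent
# translator emitting the postfix intermediate language.

def translate(tokens):
    pos = 0
    out = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def var():
        out.extend(["variable", eat()])   # <var> -> <name>      =>  variable <name>

    def factor():
        if peek() == "(":
            eat(); summ(); eat()          # <factor> -> ( <sum> )  =>  <sum>
        elif peek() == "at":
            eat(); var()                  # <factor> -> at <var>   =>  <var>
        else:
            var(); out.append("in")       # <factor> -> <var>      =>  <var> in

    def term():
        factor()
        while peek() == "*":
            eat(); factor(); out.append("multiply")   # <term> <factor> multiply

    def summ():
        term()
        while peek() == "+":
            eat(); term(); out.append("add")          # <sum> <term> add

    summ()
    return out

print(translate(["x", "+", "y", "*", "z"]))
# -> ['variable', 'x', 'in', 'variable', 'y', 'in', 'variable', 'z', 'in', 'multiply', 'add']
```

Note how the "at" alternative simply suppresses the "in" that would otherwise follow the translated variable name, exactly as the grammar specifies.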
Where no change at all is indicated in the translation of a particular rule, the symbol "|" appears as a translation rule. As an example of how sequences of symbols are rearranged for translation, the infix addition <sum> + <term> is translated into the reverse polish sequence of symbols consisting of a "<sum>" followed by a "<term>" followed by the intermediate-language command for adding together the values resulting from evaluation of the previous two subexpressions. As in good polish notation, parentheses are removed from around expressions, and this process is specified by associating the translation rule "<sum>" with the syntactic rule <factor> → ( <sum> ). The remaining rules with <factor> on the left-hand side are used for translating arithmetic operands into the intermediate language. For example, the syntactic rule <factor> → <var> indicates that operands in arithmetic expressions are variable names, and the translation of a <var> into the sequence <var> in indicates that the "in" command is used for fetching the value associated with <var> and for storing that value on top of the run-time operand stack of system M. The other syntactic rule <factor> → at <var> reflects the fact that EULER permits use of program variables that are pointers to data named by other program variables. Hence, the effect of the "at" command of the source language is to suppress the appearance of "in" in the translated program after the translated variable name. In this case, a pointer to the data stored in <var> is left on top of the operand stack in system M at run time. Finally, the rule <var> → <name> means that the names of program variables are translated into the sequence "variable <name>." Here, the effect of the "variable" command is to find a pointer to the data stored in the following <name> by system M and to place this pointer on top of the run-time operand stack. The sequence ".( <expr-sequence> )." on the right part of the remaining rule is the definition of an EULER function call. Function calls are translated with the parameters preceding the function name in the translated program.
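The run-time behavior of the "variable", "in", and arithmetic commands just described can be sketched as a small stack machine. This is a hypothetical illustration of how system M might execute the postfix stream; the dictionary-based memory and command names as Python strings are assumptions for the sketch, not the paper's implementation.

```python
# Hypothetical sketch of system M: execute the translated postfix stream
# on a run-time operand stack, with a dictionary standing in for memory.

def execute(code, memory):
    stack = []
    stream = iter(code)
    for cmd in stream:
        if cmd == "variable":
            stack.append(next(stream))            # push a pointer (here: the name)
        elif cmd == "in":
            stack.append(memory[stack.pop()])     # replace pointer by stored value
        elif cmd == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif cmd == "multiply":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif cmd == "assign":
            value = stack.pop()
            memory[stack.pop()] = value           # pointer was left without "in"
            stack.append(value)
    return stack

mem = {"x": 2, "y": 3, "z": 4}
print(execute(["variable", "x", "in", "variable", "y", "in",
               "variable", "z", "in", "multiply", "add"], mem))   # -> [14]
```

Note that an assignment target is pushed by "variable" alone, without "in", so that "assign" finds a pointer rather than a value beneath the expression result.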
In this way, the function call can be made to look like a reverse polish operator having n operands, with n the number of parameters. A parameterless function call is translated exactly the same way as a program variable. Thus, the sequence "variable <name> in" in a translated program serves both to fetch data and to initiate a call on a function, depending on the <name> involved. This calling sequence will be referred to in the following discussion of extendible language features. In the full translation grammar for EULER given in Appendix 2, it is possible to see how the methods presented in the preceding example are applied to the specification of a complete programming language. Note that this larger grammar uses, e.g., the symbol "+" in place of the "add" instruction of our small example, and, in general, translates as many source-language symbols as possible directly into commands of the intermediate language. The description of EULER programming given in Appendix 1 of this paper should clarify the meaning of the EULER operators used, and the following section in this paper will discuss the syntactic methods for optimizing and extending EULER as they are developed in the EULER grammar. A full description of the intermediate reverse-polish language specified by the EULER rules of translation can be found in Schneider.10

Syntactic methods of optimizing expressions

In the EULER grammar of Appendix 2, the rules of translation specify that a conditional statement or expression of the form

"IF <expr>1 THEN <expr>2 ELSE <expr>3"

is translated into its intermediate-language version in the form

"<expr>1 $IF <expr>2 $THEN <expr>3 $ELSE"

Note that each of the expressions here can themselves contain conditional expressions of any desired degree of nesting, and each of the subexpressions will be rearranged as shown above.
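The conditional form just shown is executed by interpretive scanning: a false operand at "$IF" causes a scan to the matching "$THEN", and reaching a "$THEN" causes a scan to its matching "$ELSE", the two markers nesting like balanced parentheses. A hypothetical Python sketch (boolean constants stand in for the code of the translated operands; the interpreter and its names are invented for illustration):

```python
# Sketch of the $IF/$THEN/$ELSE scanning discipline: $THEN and $ELSE
# behave like balanced parentheses around translated subexpressions.

def scan(code, i, target):
    # Find the marker matching the construct at hand, skipping nested ones.
    depth = 0
    while True:
        t = code[i]
        if t == "$IF":
            depth += 1
        elif t == target and depth == 0:
            return i
        elif t == "$ELSE":
            depth -= 1
        i += 1

def run(code):
    stack, i = [], 0
    while i < len(code):
        t = code[i]
        if t == "$TRUE":
            stack.append(True)
        elif t == "$FALSE":
            stack.append(False)
        elif t == "$IF":
            if not stack.pop():
                i = scan(code, i + 1, "$THEN")   # condition false: scan to $THEN
        elif t == "$THEN":
            i = scan(code, i + 1, "$ELSE")       # consequence done: skip alternative
        elif t != "$ELSE":                       # $ELSE is a bare placemarker
            stack.append(t)                      # translated operand value
        i += 1
    return stack

# "<disj> OR <conj>" translates to: <disj> $IF $TRUE $THEN <conj> $ELSE
print(run([False, "$IF", "$TRUE", "$THEN", True, "$ELSE"]))   # -> [True]
```

With a true first operand, execution pushes $TRUE and the $THEN scan skips the second operand entirely, which is exactly the partial optimization of logical expressions described below.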
In this intermediate language, the "$IF" command causes an interpretive scan to the matching "$THEN" label if <expr>1 is false. Otherwise execution continues until a "$THEN" is reached, at which point a scan occurs to the "$ELSE" label that matches this "$THEN". In this way, "$THEN" and "$ELSE" behave like balanced parentheses around expressions, and also serve as placemarkers to which control can be transferred in the translated program. This mechanism for executing translated conditional expressions is used also as the basis for translating logical expressions into a partially optimized form. To take an example, the EULER sequence corresponding to a disjunction is represented by "<disj> OR <conj>". Its translated form is "<disj> $IF $TRUE $THEN <conj> $ELSE". Here, if the first operand "<disj>" of the expression is true, the entire expression is true. Therefore, the second operand is evaluated only if the first operand is false. A similar mechanism is used for the sequence "<conj> AND <neg>". Here, if the first operand is false, the second operand need not be evaluated. Hence, the translated conjunction is of the form "<conj> $IF <neg> $THEN $FALSE $ELSE."

Some syntactic methods of extending EULER

After developing the appropriate techniques for translating conditional expressions and for optimizing logical expressions, the next order of business is to use these syntactic tricks to provide extended facilities in the EULER language. The introduction of full string-processing facilities into the EULER system is the first example to be considered. Without altering the EULER interpreter, and with a little reprogramming of the translator, we can effect the following improvement:

Syntactic Rule                           Rule of Translation
<prim> → <stringprim>                    |
<stringprim> → <stringhead> '            <stringhead> ).
<stringhead> → '                         .(
             | <stringhead> <symbol>     <stringhead> .* <symbol> ,
Here, a string of arbitrary length is translated into a list whose cells store the symbols in the string, one symbol to a cell, in sequence. With this arrangement, it is possible to manipulate strings using the list concatenation operator provided by EULER, and using EULER subroutines to perform tests for list equality and containment. The second example involves the addition of facilities for reading in data at run time within the framework of the EULER system. In this case, additional facilities must be provided in the EULER polish string interpreter. These facilities take the form of routines for converting numbers into their internal representation and for packing string data. The added syntax consists of the following set of rules:

Syntactic Rule                                   Rule of Translation
<program> → .ENTRY <block> .EXIT.                <block>
          | .ENTRY <data> ., <block> .EXIT.      <data> <block>
<data> → <datahead> END                          |
<datahead> → DATA <item>                         $DATA <item>
           | <datahead> ., <item>                |
<item> → <number> | <stringprim> | <datalist>    | | |
<datalist> → .( ).                               |
           | <datalisthead> <item> ).            |
<datalisthead> → .(                              |
               | <datalisthead> <item> ,         |

With this program structure, the data portion could be read in by a run-time subroutine that leaves the data in a pre-arranged location of memory. The interpreter routine could then be read in over the data routine, and the translated program would be executed. A statement of the form "READ <prim>" would then store an appropriate link to some segment of the read-in data on top of the run-time operand stack. The third example involves the use of a syntactic notation to expand the EULER language into a self-extendible programming language similar to MAD/I4 and ALGOL 68.11 By an extendible programming language, people currently mean the following two things.

a. A language in which the programmer can specify new data types and data structures composed of novel configurations of data elements.

b.
A language in which the programmer is able to reorder the priorities of expression operators and is able to specify arbitrary new operations at will.

In EULER, there already exists a general mechanism for allowing programmers to manipulate data structures, namely, the list mechanism. EULER lists can be constructed from arbitrary combinations of data elements. However, EULER only has eight data types, with no facilities for extending their ranges. Such range-extension facilities depend on the machine on which the language is implemented, and algorithms for specifying such data types as numbers of arbitrary precision must be written for the machine in question. Hence, our example will concentrate on the machine-independent problem of specifying new operators in programs. Any reasonable programming language must presuppose the existence of a standard set of expression operators before provision is made for allowing programs to expand this set of operators. With each standard operator will be associated a standard precedence level, and the operators to be introduced by the programmer must also have precedence levels. As the term is currently used, operator precedence (or priority) is a measure of how expression operators compare in binding power. For example, exponentiation is said to take precedence over addition, because exponentiation is performed before addition in arithmetic expressions. Thus, precedence imposes an ordering on the operations of a language. This ordering is reflected in the ordering of syntax rules in programming-language grammars. In the EULER grammar above, rules are ordered so that list concatenation is performed first, then exponentiation, and so on, until the operation of value assignment. From concatenation to assignment of value there are nine levels of precedence.
Our approach in providing for the programming of new operations is to assign these operations to one of nine classes of operators, reflecting the nine levels in the original grammar. This means that the translator must now treat operators as though they are procedure calls that can only be written into the translated program where their associated precedence level permits their operations to occur. In order to permit the programmer to tell the translator what precedence is associated with a newly defined operator, we require an additional operator declaration in our language. This declaration, together with the precedence syntax of expressions that follows, is sufficient to provide the expanded operator-definition facility.

Grammar 2. An Expression Grammar for Defining New Operators

Syntactic Rule                                 Rule of Translation
<expr> → <var> <opname> <expr>                 <var> <expr> $VARBL <opname> $IN
       | <disj>                                |
<disj> → <disj> <opname> <conj>                <disj> <conj> $VARBL <opname> $IN
       | <conj>                                |
<conj> → <conj> <opname> <neg>                 <conj> <neg> $VARBL <opname> $IN
       | <neg>                                 |
<catena> → <catena> <opname> <prim>            <catena> <prim> $VARBL <opname> $IN
         | <prim>                              |
<blockhead> → <blockhead> <operatordec> .,     <blockhead> <operatordec>
<operatordec> → OPERATOR <opname>              $NEW <opname>
              | <operatordec> , <opname>       <operatordec> $NEW <opname>
<expr> → <opname> = <opdef>                    <opname> <opdef> =
<opdef> → <defhead> <expr> $.                  |
<defhead> → <rankpart> <operandpart> .,        <rankpart> <operandpart>
<rankpart> → RANK OF <digit> .,                (Not Translated)
<operandpart> → OPERANDS <name>                $FORMA <name>
              | <operandpart> , <name>         <operandpart> $FORMA <name>
<opname> → <symbol>                            |
         | <opname> <symbol>                   |

In the expression syntax above, the <opname> in each rule is translated into a procedure call, with parameters consisting of the one or more operands associated with each <opname>. These procedure calls either refer to the "standard" operator associated with a particular precedence level or refer to the translated operator definition declared by the programmer.
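The rank scheme can be sketched as a parser that tries the nine operator classes in order, parsing lower-rank operands first and emitting each operator as a postfix procedure call on its <opname>. This is a hypothetical illustration; the RANKS table, its entries, and the function names are invented here, not taken from the paper's translator.

```python
# Sketch of rank-driven translation: each declared operator belongs to one
# of nine rank classes, and lower ranks bind more tightly (are parsed first).

RANKS = {"**": 2, "*": 3, "+": 4, "AND": 7, "OR": 8}   # illustrative ranks only

def translate_expr(tokens, max_rank=9):
    def parse(rank):
        if rank == 0:
            return [tokens.pop(0)]                # a primary
        out = parse(rank - 1)                     # operand of tighter rank
        while tokens and RANKS.get(tokens[0]) == rank:
            op = tokens.pop(0)
            # each operator becomes a procedure call on its <opname>
            out += parse(rank - 1) + ["$VARBL", op, "$IN"]
        return out
    return parse(max_rank)

print(translate_expr(["a", "*", "b", "+", "c"]))
# -> ['a', 'b', '$VARBL', '*', '$IN', 'c', '$VARBL', '+', '$IN']
```

Redeclaring an operator's rank is then just an update to the table the translator consults, which is why all assignment of precedence happens at translation time.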
It is assumed that the translator will automatically enclose each translated program with an extra outer block containing procedure definitions for the set of standard operators basic to the language. In this way, the standard operators can be redefined within a particular program, but will regain their usual meaning upon exit from the block in which the redefining statement occurred. A consequence of this method of allowing new operator definitions is that program subroutines may use operators global to their definitions, but may not have operators passed to them as parameters, since all assignment of precedence is performed at translation time. A certain amount of optimization is still possible within the framework of this extendible translator. As an example, suppose that we write the following procedure corresponding to the standard operator for logical conjunction:

AND = RANK OF 7., OPERANDS X, Y., IF Y THEN X ELSE FALSE $.

The actual parameters in the procedure call for logical AND above are expressions surrounded by ".$" and "$.". Thus, the effect of the conditional expression in the operator definition given above is to evaluate the Y parameter only once and not to evaluate the X parameter unless Y is true.

Programmer-defined syntactic augments to existing languages

As a next step in allowing programmers to decide on the nature of their own programming languages, we could conceive of a translator facility for allowing programmer-specified syntactic and semantic augments to existing programming languages. The idea behind this definitional facility is that the translator can be provided with facilities for accepting new syntactic rules and associating their right parts with rules of translation that are essentially calls on global procedures. The operands within the new syntactic augments are then translated as parameters supplied to the procedures for executing the augments.
The feasibility of such augments, provided they do not lead to problems of syntactic ambiguity, can be inferred from the algorithms presented in Schneider.9,10 As an example of what a programmer might be tempted to add to his language, and of the methods he could use, we consider the problem of adding ALGOL W-style iteration to the EULER language. In the following translation grammar, the global procedures used in translated programs are "$FOR" and "$WHILE", corresponding to the incremented-variable and logical iterations, respectively.

Grammar 3. Programmer-Defined Syntax of Iterative Statements

Syntactic Rule
(a) <expr> → WHILE <expr>1 DO <expr>2
(b) <expr> → FOR <var> FROM <expr>1 UNTIL <expr>2 BY <expr>3 DO <expr>4

Rule of Translation
(a) .$ <expr>1 $. .$ <expr>2 $. $VARBL $WHILE $IN
(b) <var> <expr>1 <expr>2 <expr>3 .$ <expr>4 $. $VARBL $FOR $IN

Note that the controlled statement in the syntax above is translated with procedure-definition brackets ".$" and "$.". In this way, whenever the corresponding formal parameter in the "$FOR" or "$WHILE" procedure definition is executed, the entire controlled statement is executed as a procedure. The procedure definitions of "$FOR" and "$WHILE" that follow are the "semantics" of Grammar 3:

$FOR = .$ FORMAL VAR, EXP1, EXP2, EXP3, STAT.,
  BEGIN LABEL TEST, CYCLE.,
    VAR = EXP1., GO TO TEST.,
    CYCLE.. VAR = VAR + EXP2.,
    TEST.. IF (VAR - EXP3) * SIGN(EXP2) GT 0 THEN UNDEFINED
      ELSE BEGIN STAT., GO TO CYCLE END
  END $.

$WHILE = .$ FORMAL LOGEXP, STAT.,
  BEGIN LABEL CYCLE.,
    CYCLE.. IF LOGEXP THEN BEGIN STAT., GO TO CYCLE END
      ELSE UNDEFINED
  END $.

The flowchart of Figure 2, showing the transitions to and from the box corresponding to <expr>, illustrates how the EULER translator was programmed.

[Figure 2—Flowchart of the transitions to and from the <expr> box of the EULER translator.]
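Because the controlled statement and the loop condition arrive as procedures, "$WHILE" can simply re-execute them each cycle. A rough Python analogue (a hypothetical sketch: zero-argument functions stand in for EULER's ".$ ... $." procedure brackets, and a dictionary stands in for the enclosing block's variables):

```python
# Rough analogue of the $WHILE semantics above: both parameters are
# unevaluated procedures, so each can be re-executed on every cycle.

def WHILE(logexp, stat):
    while logexp():   # re-evaluate the condition procedure each cycle
        stat()        # re-execute the controlled statement procedure

env = {"I": 0, "TOTAL": 0}

def body():
    env["I"] += 1
    env["TOTAL"] += env["I"]

WHILE(lambda: env["I"] < 4, body)
print(env["TOTAL"])   # -> 10
```

Had the condition been passed by value instead of as a procedure, the loop could never terminate; the ".$ ... $." brackets in the rule of translation are what make the iteration definable as an ordinary procedure.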
X1 → X2 <consequence>          X2 → <condition>

By letting X1 be THEN and X2 be IF in the translator, the coding is greatly simplified, and no ambiguities are introduced, since the Xi can be treated as "new and distinct" symbols of the normal-form grammar.

REFERENCES

1 R W FLOYD A descriptive language for symbol manipulation JACM Vol 8 1961 579-584
2 E T IRONS A syntax directed compiler for ALGOL 60 CACM Vol 4 1961 51-55
3 P M LEWIS R E STEARNS Syntax-directed transduction JACM Vol 15 1968 465-488
4 D L MILLS The syntactic structure of MAD/I DDC Rpt No AD-671-683 1968
5 P NAUR editor Revised report on the algorithmic language ALGOL 60 CACM Vol 6 1963 1-17
6 V B SCHNEIDER The design of processors for context-free languages NSF Memo Northwestern Univ 1965
7 V B SCHNEIDER Pushdown-store processors of context-free languages Dept of Industrial Engineering Northwestern Univ Evanston Ill 1966
8 V B SCHNEIDER Syntax-checking and parsing of context-free languages by pushdown-store automata Proc SJCC 1967 71-75
9 V B SCHNEIDER A system for designing fast programming language translators Proc SJCC 1969 777-792
10 V B SCHNEIDER A translator system for the EULER programming language Tech Rpt 68-76 Computer Science Center Univ of Md College Park 1969
11 A VAN WIJNGAARDEN editor Report on the algorithmic language ALGOL 68 Mathematisch Centrum 49 2e Boerhaavestraat Amsterdam The Netherlands 1969
12 J WEIZENBAUM A symmetric list processor CACM Vol 6 1963 524
13 N WIRTH A generalization of ALGOL CACM Vol 6 1963 547-554
14 N WIRTH H WEBER A generalization of ALGOL and its formal definition: Parts I and II CACM Vol 9 1966 13-25 89-99

Appendix 1

Features of the EULER language

EULER is a nested block-structure language, similar to ALGOL.
Thus, every block, consisting of a sequence of statements surrounded by BEGIN and END parentheses, can be treated as a single statement in ALGOL fashion. An EULER program consists of an EULER block preceded by .ENTRY and followed by .EXIT. In EULER, there are three declarations. One declaration is for data variables, one for program labels, and one for formal parameters of procedures. In the program

".ENTRY BEGIN NEW X, Y., LABEL Z., ... Z.. X + Y END .EXIT."

X and Y will store data, and Z will be a label preceding some statement. Assigning a data type to a declared variable is accomplished by writing an assignment statement with data of the appropriate type on the right-hand side of the assignment. Thus, typing of variables in EULER is dynamic, since any assignment statement can change the data type stored in a variable. And data typing is implicit, since there are no declarations like real, integer, etc., as appear in ALGOL. The following is a list of the eight EULER data types:

I. Number—In the EULER system, all numbers are assumed to be floating point numbers. The assignment statement "V = E.," with E a numerical expression or number, causes variable V to become a numerical variable.

II. Symbol—In this EULER implementation, an assignment statement such as "V = .*ALPHAN.," causes the six characters "ALPHAN" to be stored in the location named by variable V.

III. Logical—The logical constants are TRUE and FALSE, standing respectively for logical truth and falsehood. The assignment statement "V = L.," with L a logical constant or logical expression, causes variable V to become a logical variable.

IV. Label—EULER programs use two declarations. "NEW" is used to declare a data variable, and "LABEL" is used to declare the presence of a label in some block of a program.
Interestingly, if V is a variable in some EULER block, and V is not in a block global to the block of label L, then the assignment statement "V = L.," causes V henceforth to be of type label, and to be interchangeable with L in GO TO statements.

V. Reference—In EULER, if V1 is a variable not in a block global to the block of variable V2, then the assignment statement "V1 = AT V2.," makes V1 a pointer to the data stored in V2. After V1 is turned into such a pointer, the two statements

"V2 = V2 + 1.," and "V1 IN = V1 IN + 1.,"

will have exactly the same effect of manipulating whatever data is stored in V2.

VI. Procedure—An assignment statement of the form "V1 = .$ <expr> $..," causes V1 to become the name of a parameterless procedure call with body given by <expr>. As a programming example, we might consider the following EULER block:

"BEGIN NEW X, Y., X = 2., Y = .$ FORMAL Z., X + Z $.., OUT Y.(5). END"

When Y.(5). is operated on by the "OUT" operator, the value 7.0000 will be written out.

VII. List—In EULER, lists can be constructed in three distinct ways:

(a) On command: "V1 = LIST 300.," This statement creates a list of 300 undefined cells and makes V1 their name.

(b) By explicit notation: "V2 = .(1, .(2, 3)., 4).," This statement creates a list consisting of two numbers and a sublist and makes V2 the name of that list.

(c) By concatenation: "V1 = V1 CONCAT V2.," Using the CONCATenation operator, small lists can be joined into larger ones.

In addition, lists can be subscripted in the same way as ALGOL arrays, and each element of a list can be any EULER data type, including label, reference, and procedure. The following EULER block is a small example of the generality of the list notation:

"BEGIN NEW X, Y., LABEL Z.,
Y = .(2, .$ BEGIN X = X + 1., Y(X) END $., .$ OUT X $., Z).,
X = Y(1)., Y(X)., GO TO Y(4)., Z..
OUT .*FINISH END"

With this program segment, first 3.0000, then FINISH will be written out by the executed program.

VIII. Undefined—Every variable declared by "NEW" in an EULER program is initially of type "UNDEFINED." In addition, "UNDEFINED" is used as a data constant occasionally and as an empty option in conditional statements such as:

"V = IF L1 THEN .(1, 5). ELSE UNDEFINED.,"

For more details on EULER programming, the reader is referred to the Wirth and Weber EULER paper.14

Appendix 2

A new translation grammar for EULER

Syntactic Rule                                 Rule of Translation
1: <program> → .ENTRY <block> .EXIT.           <block>
2: <block> → <blockhead> <body> END            <blockhead> <body> $END
3: <blockhead> → BEGIN                         $BEGIN
     | <blockhead> <labeldec> .,               <blockhead> <labeldec>
     | <blockhead> <vardec> .,                 <blockhead> <vardec>
4: <vardec> → NEW <name>                       $NEW <name>
     | <vardec> , <name>                       <vardec> $NEW <name>
5: <labeldec> → LABEL <name>                   $LABEL <name>
     | <labeldec> , <name>                     <labeldec> $LABEL <name>
6: <body> → <body> ., <stat>                   |
     | <stat>                                  |
7: <stat> → <labdef> <stat>                    |
     | <expr>                                  |
8: <labdef> → <name> ..                        $LBDF <name>
9: <expr> → GO TO <expr>                       <expr> $GOTO
     | OUT <expr>                              <expr> $OUT
     | <var> = <expr>                          <var> <expr> =
     | <disj>                                  |
     | <condition> <consequence> <alternative> |
10: <condition> → IF <expr>                    <expr> $IF
11: <consequence> → THEN <expr>                <expr> $THEN
12: <alternative> → ELSE <expr>                <expr> $ELSE
13: <disj> → <conj>                            |
     | <disj> OR <conj>                        <disj> $IF $TRUE $THEN <conj> $ELSE
14: <conj> → <neg>                             |
     | <conj> AND <neg>                        <conj> $IF <neg> $THEN $FALSE $ELSE
15: <neg> → <relation>                         |
     | NOT <relation>                          <relation> $NOT
16: <relation> → <sum>                         |
     | <sum>1 <relop> <sum>2                   <sum>1 <sum>2 <relop>
17: <relop> → EQ|NEQ|GEQ|LEQ|GT|LT             $EQ|$NEQ|$GEQ|$LEQ|$GT|$LT
18: <sum> → <term>                             |
     | + <term>                                <term>
     | - <term>                                <term> $NEG
     | <sum> {+|-} <term>                      <sum> <term> {+|-}
19: <term> → <factor>                          |
     | <term> {*|/|./.|MODULO} <factor>        <term> <factor> {*|/|./.|$MODUL}
20: <factor> → <catena>                        |
     | <factor> ** <catena>                    <factor> <catena> **
21: <catena> → <prim>                          |
     | <catena> CONCAT <prim>                  <catena> <prim> $CONCA
22: <prim> → UNDEFINED                         $UNDEF
     | <var>                                   <var> $IN
     | <label>                                 <label> $IN
     | ( <expr> )                              <expr>
     | <block>                                 |
     | <procdef>                               |
     | <referenceprim>                         |
     | <listprim>                              |
     | <numberprim>                            |
     | <logicalprim>                           |
     | TAIL <prim>                             <prim> $TAIL
     | <var> . ( <expr-sequence> ) .           <expr-sequence> <var> $IN
     | <symbolprim>                            |
23: <label> → <name>                           $VARBL <name>
24: <var> → <name>                             $VARBL <name>
     | <var> IN                                <var> $IN
     | <var> ( <sum-sequence> )                <var> ( <sum-sequence> )
25: <expr-sequence> → <expr>                   |
     | <expr-sequence> , <expr>                <expr-sequence> <expr>
26: <sum-sequence> → <sum>                     |
     | <sum-sequence> , <sum>                  <sum-sequence> <sum>
27: <referenceprim> → AT <var>                 <var>
28: <listprim> → <list>                        |
     | LIST <sum>                              <sum> $LIST
29: <list> → .( ).                             |
     | <listhead> <expr> ).                    |
30: <listhead> → .(                            |
     | <listhead> <expr> ,                     |
31: <numberprim> → <number>                    $NUMBR <number>
     | REAL <disj>                             <disj> $REAL
     | LENGTH <catena>                         <catena> $LENGT
     | ABSOLUTE <sum>                          <sum> $ABSOL
     | INTEGER <sum>                           <sum> $INTEG
32: <logicalprim> → TRUE                       $TRUE
     | FALSE                                   $FALSE
     | LOGICAL <sum>                           <sum> $LOGIC
     | <typeinquiry> <var>                     <var> <typeinquiry>
33: <typeinquiry> → ISNU|ISLO|ISLA|ISLI|ISPR|ISRE|ISSY|ISUN
                                               $ISNU|$ISLO|$ISLA|$ISLI|$ISPR|$ISRE|$ISSY|$ISUN
34: <symbolprim> → .* <6-symbol string>        |
35: <procdef> → <prochead> <expr> $.           |
36: <prochead> → .$                            .$
     | <prochead> <formaldec> .,               <prochead> <formaldec>
37: <formaldec> → FORMAL <name>                $FORMA <name>
     | <formaldec> , <name>                    <formaldec> $FORMA <name>
38: <6-symbolstring> → { <letter> | <digit> | <blank> | , | . | $ | * | ? | = | + | -
(prochead) .$ I (prochead) (formaldec )., (formaldec) ~ FORMAL (name) (formaldec ), (name) (6-symbolstring) { (letter)1 (digit) (blank) I,I·I$I*I?I = 1+1- Rule of Translation (var) $IN (label) $IN (expr) I I I I I I (prim) $TAIL (expr-sequence) (val') $IN I $VARBL (name) $VARBL (name) (val') $IN (val' ) (sum-sequence») I (expr-sequence) (expr) I (sum-sequence») (sum) (val') I (sum) $LIST I I I $NUMBR (number) (disj ) $REAL (catena) $LENGT (sum) $ABSOL (sum) $INTEG $TRUE $FALSE (sum) $LOGIC (val' ) (typeinquiry ) $ISNU I$ISLO I$ISLA I$ISLI I$ISPR I$ISRE I$ISSYI$ISUN I I .$-(prochead) (formaldec ) $FORMA (name) $FORl\1A (name) (formal dec ) I i>1<}6 (i.e., a string of 6 characters.) 39: (name) ~ (letter) I 155 156 Fall Joint Computer Conference, 1969 ----------------------~-------------------------------------------------,-- Syntactic Rule I(name> (letter> I (name> (digit> 40: 41: 42: 43. Rule oj Translation I I (For the IBlYI 7094 and the UNIVAC 1108, only the first six characters of a (name> are translated.) (number) ---'? (integer> Converted to octal. I(integer). (integer) Converted to octal floating point. (integer> ---'? (digit> I (integer> (digit> (digit> ---'? 0111 ... 19 I (letter) ---'? AI ... IZ I -- SYMPLE-A general syntax directed ~acro preprocessor by JAMES E. VANDER MEY The Pennsylvania State University University Park, Pennsylvania ROBERT C. VARNEY The Pennsylvania State University McKeesport, Pennsylvania and ROBERT E. PATCHEN IBM Corporation Boston, Massachusetts INTRODUCTION The subject of this paper is a general syntax directed macro preprocessor system. One of the suggested potential uses of this system is that of evaluating new or extended programming languages by the technique of syntax directed macros. This led to the association of the acronym SYl\1PLE (SYntax Macro Preprocessor for Language Evaluations) with this system. A preprocessor is a processor intended to be used prior to another processing stage. 
In our case, it is assumed that the SYMPLE preprocessor system will generally be used in processing higher level language texts (ones which are user oriented), producing output text in the same or a similar higher level language. The term "macro" is used in a very general sense in this paper. As in other macro systems, the macro mechanism consists of the recognition of a macro "reference" in the source text being processed, and a macro "definition" defining a translation procedure invoked by some corresponding macro reference. A SYMPLE macro definition consists of two parts: the "macro semantic portion" or "macro body"; and the "macro templates." The macro semantic portion is the translation procedure and consists of the instructions to be executed when the macro is "invoked". A macro is invoked when a pattern described in one of its macro templates is recognized by the parser in the source input text. This macro reference pattern may have identifiable parts which are then considered as arguments for the semantic portion. A macro template defines a possible macro reference pattern for this macro and consists of two distinct parts: a specification of a general syntactic substructure of the source input text in which a given macro reference may occur (i.e., context); and any necessary further syntactic qualifications within that general syntactic substructure (e.g., a specific pattern). The actual pattern matching technique for macro reference is thus a two level syntax directed matching procedure. This syntax directed macro reference technique is the method by which SYMPLE achieves both simplicity and generality. The SYMPLE system as a macro system is not tied to any particular programming language. The base (source input) language and the object (output) language of the macro facility could in fact be entirely different languages.
The syntax of the languages to be processed and/or extended must be adequately described through the syntax description metalanguage of the SYMPLE system. This syntactic description is used for determining "context" for macro references, and thus the requirements for a minimally "adequate" syntactic description of a language are proportional to the degree of context required to isolate macro references. As a very simple example, assume all macro references must occur in only a single specific syntactic unit (syntactic substructure) of the base language (e.g., only labels of Fortran statements). Then to facilitate the recognition of macro references in the source language, the syntax of the base language need only be described via the metalanguage to the extent that it can isolate this syntactic unit type (i.e., Fortran labels). When recognized, this syntactic unit will then be considered as a candidate for containing a macro reference. After a candidate syntactic unit is isolated in the source input, a check can be made for the existence of specific macro references by testing for further qualifying patterns within that syntactic unit. For instance, a Fortran label of "three blanks followed by two numbers" might be a specific macro reference. A check would thus be made for this reference according to the syntactic pattern defining "three blanks followed by two numbers" whenever a Fortran label is recognized. This process of local syntax investigation is called "template matching" for a macro reference. It is also through the template matching facility that translation parameters in the source language (e.g., arguments, conditions, etc.) are recognized and passed to the actual macro facility. These translation parameters, which we shall call argument strings, can be manipulated by the instructions contained in the body of the macro (semantic portion).
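The two-level matching just described can be sketched in miniature. This is a hypothetical illustration only (the template table, pattern, and semantic action are invented): level one isolates the candidate syntactic unit, a Fortran label field in columns 1-5, and level two tests macro templates inside it, passing the matched parts on as argument strings.

```python
import re

# Level 1 isolates the syntactic unit (a Fortran label field); level 2
# tests macro templates within it and passes argument strings to the
# macro's semantic portion.

TEMPLATES = [
    # "three blanks followed by two numbers" is a specific macro reference
    (re.compile(r"^ {3}(\d{2})$"), lambda digits: "L" + digits + "  "),
]

def preprocess_line(line):
    label = line[:5]                        # level 1: the syntactic unit
    for pattern, semantics in TEMPLATES:    # level 2: template matching
        m = pattern.match(label)
        if m:
            # argument string m.group(1) goes to the semantic portion,
            # whose result is inserted in place of the reference
            return semantics(m.group(1)) + line[5:]
    return line

print(preprocess_line("   42 CONTINUE"))   # -> 'L42   CONTINUE'
```

Only label fields matching a template are rewritten; all other lines pass through untouched, which is the sense in which the translation is "in place."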
Since the primary function of the SYMPLE system is that of a preprocessor, the translation process is mainly that of a manipulation of argument strings and the insertion of modified and/or created strings back into the source input. Hence, the actual semantic portion of the macro is implemented in a language oriented to the manipulation of character strings. Thus translation due to macro references and related translation parameters generally results in the insertion of the translation code in the base language into the body of the code being processed. It will be shown that this "in place" translation in the SYMPLE system does not necessarily imply expansion in exactly the same place (i.e., at the lexicographical location of the macro reference).

[Figure 1—A general flow of the SYMPLE macro preprocessor system.]

An attempt will now be made to summarize and interrelate the functions of the SYMPLE system by outlining the system functional flow via a system flow diagram (Figure 1) and the following brief description. The preprocessor operates as follows:

1. The first items processed contain control information which includes such items as the device(s) from which subsequent information is to be read, the device(s) designated for system output, the names of special edit macros, specific listing options, etc. Control information may occur in the input stream at other logical stages of processing.

2. A description of the base language syntactic structure is read as input and processed to build a data base for the recognition portion. This data base will be used later by a parser.

3. Macros (templates and associated semantic translation routines) are read in, stored, and used to create necessary data bases for later processing.

4. A source deck is read in and parsing of the source input begins. (Probable entry point for most users.)

a.
As a syntactic unit is recognized, a check is made to see if any macros have templates to be matched in this syntactic unit. Templates of edit macros, if any, are tested last. When there are no templates left to be checked and if the end of the total parse has not been encountered, the parse is continued.
b. If a macro template match is successful, the argument strings are passed to its associated macro semantic portion. There may be any number of macro templates associated with a given macro semantic portion, and identical template patterns can be associated with different macro semantic portions.
c. The instructions in the current macro semantic portion are executed (actually interpreted) and the results of their operations are effected (e.g., storage manipulation, insertion of translation into input source, dynamic creation of new macro templates or semantics for this or other macros). Upon completion of execution, control is returned to 4a above.
5. When the source deck has been completely parsed and thus source time translations, including any necessary editing, have been completed, the file is then ready for output in a manner specified by the control information.
6. Processing is now completed, but by appropriate control information another cycle may be initiated on (a) new information or (b) a previous preprocessor output file. Thus, in the latter case, we have the possibility of a multipass preprocessor, if desired.

The remainder of this paper will be devoted in the main to the details of what the SYMPLE system can do and in general how one goes about using the SYMPLE system. The syntax description metalanguage is introduced first, followed by an introduction to the macro translation (semantic) and insertion capabilities of SYMPLE.

Syntax description metalanguage

The syntax description metalanguage is used to describe a parsing "grammar" of the base language in which macro references are to be embedded and thereby outline the manner in
which the source input is to be parsed. For example, suppose a label field is one syntactic structure to be parsed. The parser should then be told that a label field consists of, say, five characters which are either all digits, all blanks, or a string of blanks followed by a string of digits.

The grammatical metalanguage used to direct SYMPLE's parser is similar to the Backus-Naur Form4 (BNF) metalanguage. For example, similar grammatical productions are used to define syntactic structures; the nonterminals and terminals of BNF are also used, being renamed syntactic units and literal strings, respectively. There are, however, several features in SYMPLE's metalanguage which were incorporated to extend the power and simplicity of grammatical description over that of standard BNF. Actual productions in SYMPLE's metalanguage to define the parsing desired in the preceding example are

(LABEL-FIELD): 5&5(0$' ' 0$(DIGIT))
(DIGIT): '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'

The first production above is interpreted as: a label field is defined as not less than five nor more than five characters of a string of zero or more blanks immediately followed by zero or more digits.

Productions

The syntactic units of the base language are defined by productions in the metalanguage. These productions are of the form:

(LHS): right side

where (LHS) represents the syntactic unit being defined on the left side and the right side contains metalinguistic descriptions of other syntactic unit(s) and/or literal string(s) in the left to right order in which they comprise the structure of (LHS). The colon (:) separates the defined syntactic unit on the left side from the defining information on the right side. The first production of the base language grammar must be the definition of the syntactic unit representing the total syntactic structure of the base language (i.e., the initial or distinguished symbol of BNF). Other productions may be in any order.
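For comparison, the (LABEL-FIELD) production above can be approximated with a short Python check. This is an illustrative regex analogy only, not SYMPLE's actual mechanism, and the function name is invented here:

```python
import re

# Illustrative analogy for (LABEL-FIELD): 5&5(0$' ' 0$(DIGIT)) --
# exactly five characters consisting of zero or more blanks immediately
# followed by zero or more digits.
def is_label_field(field: str) -> bool:
    return len(field) == 5 and re.fullmatch(r" *[0-9]*", field) is not None
```

Thus "  100" and "     " are accepted, while "100  " (digits before blanks) and "1234" (only four characters) are rejected.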
(Named) Syntactic units

The metalinguistic representation of a syntactic unit in a production is a string of arbitrary length enclosed in parentheses. The string (called the name of the syntactic unit) may be composed of any characters with the exception of those used as special delimiters in the syntax description metalanguage (i.e., illegal characters are ( ) : ; ' | $ &).

Literal strings

A literal string is represented in the metalanguage by the desired string of characters enclosed in single quotation marks ('). Any character may be used within a literal string, except that a single quotation mark is represented by two adjacent single quotes for each occurrence in the literal string in order to differentiate it from the ending delimiter of the literal string.

Alternatives

If a syntactic unit in the base language may have alternative representations, these alternatives may be represented in the metalanguage as a single production with the alternatives of the syntactic unit each appearing on the right side and separated from each other by the conventional OR symbol (|). Example:

(DIGIT): '1'|'2'|'3'|(OTHER)

Complex substructures (Unnamed syntactic units)

If one does not wish to break down and label a syntax substructure in detail, but simply label an entire complex substructure as a syntactic unit, pairs of parentheses may be used as grouping indicators. Consider the following equivalent examples of a definition of the syntactic unit (NUM4).

Example:

(NUM): '2'|'3'|'4'
(NUM2): '3'|'4'|'5'
(NUM3): '5'|'6'|'7'
(NUM4): '1'(NUM)(NUM2) | '1'(NUM3)

Example:

(NUM4): '1'(('2'|'3'|'4')('3'|'4'|'5') | ('5'|'6'|'7'))

Grouping may occur to any depth desired and each quantity within the grouping parentheses must have the form of any legal right side of a production.

Quantity repetition and bounds

Often in the syntax of a base language a (named or unnamed) syntactic unit or literal string may be required to occur several times.
Or it may be desirable to specify that a syntactic structure be a function of the length of an input string in addition to other qualifications (e.g., a label field of exactly five characters and consisting of ...). To indicate either the repetition of a string (i.e., the input string defined by a syntactic structure) or the length bound on the number of characters in some string, an operator group must precede the respective quantity in the syntax. The operator group is of the form n$m or n&m for the string and character counters respectively, where n is an integer representing the lower bound and m an integer representing the upper bound. Consider the following example.

(A): 3$3(SUB-STRUCTURE)
(B): 3&3(SUB-STRUCTURE)
(C): 'C'
(SUB-STRUCTURE): 0$5(C) 1$3'AB'

The first production defines (A) as exactly three strings of (0$5(C) 1$3'AB'). Thus, acceptable strings for (A) might be ABABAB or ABCABCCCCCABAB or CCABABCABAB, etc. However, (B) is defined as exactly three characters which are otherwise defined as in (A). Thus, (B) can be only CAB; no other combinations will yield exactly three characters. Notice that the string counter differs from the character counter in that it is distributed over all inner strings whereas the character counter represents an absolute bound over a given substructure. When productions include quantities with repetition counts, the parser which utilizes these productions will attempt to find the largest number of those quantities in the input source consistent with the upper bound of repetitions. If the input contains more than the upper bound of these quantities, the input string corresponding to the upper bound count of quantities will be recognized and succeeding repetitions will be analyzed according to the syntax following. A lower bound count of zero is allowable and simply indicates the optional omission of the quantity. The absence of an explicit lower bound implies a lower bound of one.
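The distinction between the string counter ($) and the character counter (&) in the (A)/(B) example can be mimicked in Python. This is a hypothetical regex analogy; the names and patterns are invented here and are not SYMPLE's mechanism:

```python
import re

# (SUB-STRUCTURE): 0$5(C) 1$3'AB' -- zero to five C's then one to three AB's.
SUB = r"C{0,5}(?:AB){1,3}"

def matches_A(s: str) -> bool:
    # (A): 3$3(SUB-STRUCTURE) -- the string counter demands exactly three
    # repetitions of the substructure, whatever their total length.
    return re.fullmatch(f"(?:{SUB}){{3}}", s) is not None

def matches_B(s: str) -> bool:
    # (B): 3&3(SUB-STRUCTURE) -- the character counter demands that the
    # substructure as a whole span exactly three characters.
    return len(s) == 3 and re.fullmatch(SUB, s) is not None
```

As in the text, matches_A accepts strings such as ABABAB or CCABABCABAB, while matches_B accepts only CAB.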
The absence of an explicit upper bound implies an upper bound which is the maximum bound allowable in the system. In the present implementation it is 32767. It should be noted that 1$1(SYUN) and (SYUN) are equivalent, as are $(SYUN) and 1$32767(SYUN).

Complement look-ahead

The symbol ¬ preceding a literal string, syntactic unit or grouping indicates that at that point in the syntax the quantity indicated must not occur. This is called a complement look-ahead for the indicated quantity at SYMPLE parse time. If the quantity is found, the parse being attempted has failed. (Any syntactic units found on the look-ahead will not result in macro template match attempts.) If the quantity is not found, the parse continues as before the complement look-ahead. Example:

(LETTER): 'A'|'B'|'C'|'D'|'E'
(SPLTRSTRG): $(¬'C'(LETTER))

The strings recognized as (SPLTRSTRG) will be any string which consists of one or more of A, B, D or E, but not C.

Scan positioning

The production defining a syntactic unit can be made to include, without investigation as to structure, an arbitrary length of input, or it may require that a particular syntactic unit in the input conform to more than one syntactic structure. This is done by explicitly positioning the location at which the parser is "looking." This location, called the scan position, can be adjusted either relative to its present position or to the beginning reference points in the syntax of the parsed input.

a-X (Space) positioning

The occurrence of the symbol X immediately followed by an unsigned integer number and delimited by bracketing commas at any point in the right side of a production will cause the scan position to be adjusted rightward from its present location the integer number of positions specified.
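The complement look-ahead described above corresponds closely to the negative lookahead of modern regular expressions. A hypothetical Python rendering of the (SPLTRSTRG) example (the names here are invented, and the regex is only an analogy for the metalanguage construct):

```python
import re

# Illustrative analogy for (SPLTRSTRG): $(¬'C'(LETTER)) -- one or more
# letters from A-E, where each position is first checked, without consuming
# input, not to be a C (the (?!C) negative lookahead).
def is_spltrstrg(s: str) -> bool:
    return re.fullmatch(r"(?:(?!C)[A-E])+", s) is not None
```

So "ABDDE" is recognized, while "ABCDE" fails at the C and the empty string fails the one-or-more repetition.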
The symbol X and following number must be bracketed on both sides by commas except in the following cases: X is the first (last) symbol of a grouping level or the first (last) symbol of the right side of a production, in which case the left (right) comma is not required.

Example: Define an (END-CARD) to be an 80 character string. The first six characters must be blanks, the next 66 characters must have the word END somewhere with the rest blanks, and the last eight characters may be anything.

(END-CARD): 6&6' ' 66&66(0$' '('END')0$' '), X8

b-T (Tab) positioning

The format is similar to that of X positioning, except a T is used instead of an X. The T scan positioning results in the scan position being moved the specified number of places to the right of the beginning location at which the parse began at (1) this grouping level, if the T positioning appears within a grouping parenthesis pair, or (2) the right side of the production otherwise.

Example: A syntactic unit (EMPLOYEE-NO.) is defined to be an 80 character string with a syntactic unit (LAST-NAME) beginning in position one, followed by a single blank and then the syntactic unit (FIRST-NAME). Exactly 15 spaces after the beginning of (FIRST-NAME) is to appear the syntactic unit (CODE). Finally (NUMBER) will be 75 spaces from the beginning of (EMPLOYEE-NO.).

(EMPLOYEE-NO.): (LAST-NAME)' '((FIRST-NAME), T15, (CODE)), T75, (NUMBER)

Recursive grammars in the metalanguage

Recursive grammars (i.e., productions with the syntactic unit of the left side occurring as well on the right side, or being in the derivation of a syntactic unit of the right side) are allowed in the metalanguage subject to certain conditions. For instance, left recursive productions are not allowable, but other recursive productions are allowable. Further, the character (&) bound counts are cumulative
from the initial (top) occurrence in a recursive parse while the repetition bounds ($) are effective at each level of recursion.

Non-specific grammars in the metalanguage

Let a non-specific grammar be one in which the particular alternatives of structure for a syntactic unit may have structurally the same headings (i.e., leading components which are structurally the same). The metalanguage allows the specification of such grammars and at recognition time the parser always picks the first specified (or left most) alternative as its initial guess. Subsequent guesses continue with the next specified alternatives. The user must be aware of the possible consequences if the apparent ambiguity in a non-specific grammar causes the recognition of syntactic units to be rejected later as a result of an unsuccessful parse. Though the back-up to the next alternative is handled automatically by the parser, the syntactic units recognized may result in macro invocations, the results of which will not automatically be negated. Relevant user aids in this area are provided by the system.

The following example illustrates a parsing grammar for a language which is context sensitive and not context free and which utilizes recursive productions.

L = {0ⁿ1ⁿ0ⁿ : n ≥ 1}

(LANG): (LSTR) ¬'1', T1, $'0' (RSTR)
(LSTR): '0'(LSTR)'1' | '01'
(RSTR): '1'(RSTR)'0' | '10'

The parser first determines that the input string belongs to the context-free language 0ⁿ1ⁿx; checks to make sure x does not begin with a 1; repositions to the beginning of the parsed substring of 1's and then determines that the remaining substring of the input string belongs to the context-free language 1ⁿ0ⁿ. If the above conditions are true, then the input string belongs to the context-sensitive language 0ⁿ1ⁿ0ⁿ.

The SYMPLE macro facility

The macro facility of SYMPLE provides the actual translation mechanisms.
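Returning to the grammar for L above: the recognition strategy it encodes (match a 0ⁿ1ⁿ prefix, complement look-ahead for '1', reposition past the zeros, match 1ⁿ0ⁿ over the rest) can be sketched in Python. This is purely illustrative; SYMPLE expresses the same strategy with the productions themselves, and the explicit length comparisons here stand in for what the recursive (LSTR) and (RSTR) productions enforce structurally:

```python
import re

# Hypothetical sketch of the recognition strategy for L = {0^n 1^n 0^n : n >= 1}.
def in_lang(s: str) -> bool:
    m = re.match(r"(0+)(1+)", s)              # (LSTR): prefix in 0^n 1^n
    if m is None or len(m.group(1)) != len(m.group(2)):
        return False
    n = len(m.group(1))
    if s[m.end():].startswith("1"):           # complement look-ahead: not '1'
        return False
    tail = s[n:]                              # reposition past the zeros
    m2 = re.fullmatch(r"(1+)(0+)", tail)      # (RSTR): remainder in 1^n 0^n
    return m2 is not None and len(m2.group(1)) == n and len(m2.group(2)) == n
```

For example, "010" and "000111000" are accepted, while "0110" and "00110" are rejected.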
The macros themselves are read into the system following the base language grammar and prior to the user's source deck. The individual macro definitions are described in this section.

MACRO FORMAT

The overall format of an individual macro definition is as follows:

<macro name> (<syntactic unit>) = <template body> / (<syntactic unit>) =