1970 11_#37 11 #37

1970-11_#37 1970-11_%2337

User Manual: 1970-11_#37

Open the PDF directly: View PDF PDF.
Page Count: 683

Download1970-11_#37 1970-11 #37
Open PDF In BrowserView PDF
AFIPS
CONFERENCE
PROCEEDINGS
VOLUME 37

1970
FALL JOINT
COMPUTER
CONFERENCE

AFIPS
CONFERENCE
PROCEEDI NGS
VOLUME 37

1970
FALL JOINT
COMPUTER
CONFERENCE
November 17 -19, 1970
Houston, Texas

The ideas and opinions expressed herein are solely those of the authors and are not
necessarily representative of or endorsed by the 1970 Fall Joint Computer Conference Committee or the American .. Federation of Information Processing
Societies.

Library of Congress Catalog Card Number 55-44701
AFIPS PRESS
210 Summit Avenue
lVlontvale, New Jersey 07645

©1970 by the American Federation of Information Processing Societies, IVlontvale,
New Jersey 07645. All rights reserved. This book, or parts thereof, may not be
reproduced in any form without permission of the publisher.

Printed in the United States of America

CONTENTS
A SPECTRU1VI OF PROGRAM]\1ING LANGUAGES
The macro assembler, SWAP-A general purpose interpretive
processor .................................................. .
Definition mechanisms in extensible programming languages ....... .

1
9

1\1. E. Barton
S. A. Schuman
P. Jorrand

VULCAN-A string handling language with dynamic storage
control .................................................... .

21

E. F. Storm
R. H. Vaughan

On memory system design ..................................... .
Design of a very large storage system ........................... .

33
45

Design of a megabit semiconductor memory system ...........

.I... .

53

R. l\1. 1\1eade
S. J. Penny
R. Fink
1\1. Alston-Garnjost
D. Lund
C. A. Allen
S. R. Andersen
G.K.Tu

Optimum test patterns for parity networks ...................... .

63

A method of test generation for fault location in combinatorial logic ..

69

The application of parity checks to arithmetic control ............. .

79

MODERN l\1El\10RY SYSTEl\1S

DESIGN FOR RELIABILITY
D. C. Bossen
D. L. Ostapko
A. 1\1. Patel
Y. Yoga
C. Chen
K. Naemura
C. P. Disparte

OPERATING SYSTE1\1S AND SCHEDULES
Scheduling in a general purpose operating system ................. .

89

Scheduling TSS/360 for responsiveness .......................... .
Timesharing for OS ........................................... .

97
113

SPY-A program to monitor OS/360 ........................... .

119

V. A. Abell
S. Rosen
R. E. Wagner
W. J. Doherty
A. L. Scherr
D. C. Larkin
R. Sedgewick
R. Stone
J. W. l\1cDonald

AEROSPACE APPLICATIONS
An efficient algorithm for optimum trajectory computation ........ .
Hybrid computer solutions for optimal control of time varying
systems with parameter uncertainties ......................... .

129

K. S. Day

135

W. Trautwein
C. L. Conner

143
159

R. N. Freed
1\1. E. Stevens

COl\1PUTER PROCUREMENT REQUIREMENTS IN RESEARCH
AND DEVELOPMENT
The role of computer specialists in contracting for computers-An
interdisciplinary effort ...................................... .
Selected R&D requirements in the computer and information sciences

l\1ULTI-ACCESS OPERATING SYSTEl\1S
Development of the Logicon 2+2 system ........................ .
System ten-A new approach to multiprogramming ............... .

169
181

A. L. Dean, Jr.
R. V. Dickinson
W. K. Orr

ANALYSIS OF RETRIEVAL SYSTEl\1S
On automatic design of data organization ........................ .
Analysis of retrieval performance for selected file organization
techniques ... '.' ............................................ .

187

W. A. l\1cCuskey

201

A. J. Collmeyer
J. E. Shemer

Analysis of a complex data management access method by simulation
modeling ................... '" ., .......................... .

211

V.Y.Lum
H. Ling
l\1. E. Senko

Fast "infinite-key" privacy transformation for resource-sharing
systems ................................................... .

223

J. M. Carroll
P. M. l\1cLelland

231

J. S. Vierling
l\1. Shivaram

241
251

D. C. Martin
L. Siklossy

257

J. J. Allan
J. J. Lagowski
M. T. l\1uller

269

l\1. R. Irwin

275

G. E. Heyliger

281

N. Abramson

Computer-aided system design ................................. .

287

Integrated computer aided design systems ....................... .

297

Interactive graphic consoles-Environment and software .......... .

315

E. D. Crockett
D. H. Copp
J. W. Frandeen
C. A. Isberg
P. Bryant
W. E. Dickinson
lV1. R. Paige
R. C. Hurst
A. B. Rosenstein
R. L. Beckermeyer

COl\1PUTER ASSISTED UNDERGRADUATE INSTRUCTION
On line computer managed instruction-The first step ............ .
Development of analog/hybrid terminals for teaching system
dynamics .................................................. .
Computer tutors that know what they teach ..................... .
Planning for an undergraduate level computer-based science education system that will be responsive to society's needs in the
1970's .............................................. '....... .

COlVIPUTER CONIl\1UNICATION PART I
The telecommunication equipment market-Public policy and the
1970's ..................................................... .
Digital frequency modulation as a technique for improving telemetry
sampling bandwidth utilization ............................... .
THE ALOHA SYSTEl\1-Another alternative for computer communications ............................................... .
CO~lPUTER

AIDED DESIGN

INTERFACING COl\IPUTERS AND EDUCATION
l\lDS-A unique project in computer assisted mathematics ........ .

325

Teaching digital system design with a minicomputer .............. .
Computer jobs through training-A preliminary project report ..... .

333
345

R. H. Newton
P. W. Vonhof
W. C. Woodfill
l\1. G. l\1organ
l\1. R. l\1irabito
N. J. Down

COMPUTER COMMUNICATION PART II (A Panel Session)
(No papers in this volume)
SURVEY OF TIlVIE SHARING SYSTE1VIS (A Panel Session)
Technical and human engineering problems in connecting terminals
to a time-sharing system. . .................................. .

355

J. F. Ossanna
J. H. Saltzer

l\1:ultiprogramming in a medium-sized hybrid environment ......... .
The binary floating point digital differential analyzer ............. .

363
369

Time sharing of hybrid computers using electronic patching ........ .

377

W. R. Dodds
J. L.Elshoff
P. T. Hulina
R. l\1:. Howe
R. A. l\IIoran
T. D. Berge

HYBRID SYSTEMS

SIMULATION LANGUAGES AND SYSTEMS
J. D. l\1:ar kel
B. Carey
B. E. Tossman
C. E. Williams
N. K. Brown
R. L. Granger
G. S. Robinson
G. R. Trimble, Jr.
D. A. Bavly

Digital voice processing with a wave function representation of speech

387

SIl\1:CON-An advancement in the simulation of physical systems ...

399

COl\1:SL-A Communication System Simulation Language ........ .

407

Cyberlogic-A new system for computer controL ................. .

417

A model for traffic simulation and a simulation language for the
general transportation problem ............................... .

425

R. S. Walker
B. F. Womack
C. E. Lee

433
445
451

A. I. Wasserman
D. Van Tassel
R. Nlallary

461

J. E. Stuehler

471

L. A. O'Neill

477

D. Bjorner

493

D. E. Farmer

503

H. lVI. Gates
R. B. Blizard.

515

G. N. Pitts
P. B. Crawford

ART, VICE AND GAl\1:ES
Realization of a skillful bridge bidding program .................. .
Computer crime .............................................. .
Tran2-A computer graphics program to make sculpture .......... .
COl\1:PUTERS AND l\1:ANUFACTURING
l\1:anufacturing process control at IBl\tJ:. ......................... .
Extending computer-aided design into the manufacture of pulse
equalizers. . . .............................................. .
EFFECT OF GOVERNl\I[ENT CONTROLS IN THE
COlVIPUTING INDUSTRY (A Panel Session)
Finite state automation definition of data communication line control
procedures ................................................. .
A strategy for detecting faults in sequential machines not possessing
distinguishing sequences ..................................... .
Coding/ decoding for data compression and error control on data links
using digital computers ...................................... .

COl\I[PUTATIONAL EFFICIENCY AND PERFOR1\IANCE
l\1inimizing computer cost for the solution of certain scientific
problems .................................................. .

Analytical techniques for the statistical evaluation of program running
times .... , ................................................ .
Instrumenting computer systems and their programs .............. .

519
525

B. Beizer
B. Bussell
R. A. Koster

535
547
555

R.
R.
C.
R.

Integration of rapid access disk memories into real-time processors ..
1\1anagement problems unique to on-line real-time systems ......... .

563
569

ECAl\1-Extended Communications Access l\lethod .............. .
Programming in the medical real-time environment ............... .

581
589

R. G. Spencer
T. C. Malia
G. W. Dickson
G. J. Clancy, Jr.
N. A. Palley
D. H. Erbeck
J. A. Trotter, Jr.

Decision making with computer graphics in an inventory control
environment .........................................-...... .

599

Concurrent statistical evaluation during patient monitoring ........ .

609

NEW DIRECTIONS IN PROGRAlVIlVIING LANGUAGES
(A Panel Session) (No papers in this volume)
TEXT PROCESSING
SHOEBOX-A personal file handling system for textual data ...... .
HELP-A question answering system ........................... .
CyperText-An extensible composing and typesetting language .... .

S. Glantz
Roberts
G. Moore
P. 1\1ann

COIVIl\1UNICATION AND ON-LINE SYSTElVIS

J. S. Prokop
F. P. Brooks, Jr.
S. T. Sacks
N. A. Palley
H. Shubin
A.A.Afifi

SELECTED COlVIPUTER SYSTEl\1S ARCHITECTURES
Associative capabilities for mass storage through array organization ..
Interrupt processing with queued content-addressable memories .....

615
621

A language oriented computer design ........................... .

629

A. 1\1. Peskin
J. D. Erwin
E. D. Jensen
C. 1\,fcFarland

641

A. I. Rubin

653

1\1. Sakaguchi
N. Nishida

PROSPECTS FOR ANALOG/HYBRID COlVIPUTING
(A Panel Session)
Analog/hybrid-What it was, what it is, what it may be .......... .
TOPICAL PAPER
The hologram tablet-Anew graphic input device ................ .

The macro assembler, SWAP-A general
purpose interpretive processor

by M. E. BARTON
Bell Telephone Laboratories
Naperville, Illinois

INTRODUCTION

Inputs

A new macro assembler, the SWitching Assembly
Program (SWAP), provides a variety of new features
and avoids the restrictions which are generally found
in such programs. Most assemblers were not designed
to be either general enough or powerful enough to accomplish tasks other than produce object code. SWAP
may be used for a wide variety of other problems such
as interpretively processing a language quite foreign
to the assembler.
SWAP has been developed at Bell Telephone Laboratories, Incorporated, to assemble programs for three
very different telephone switching processors. (SWAP
is written in the IBM 360 assembly language and runs
on the 360 with at least 256K bytes of memory.) With
such varied object machines and the need to have all
source decks translatable from the previously used assembler languages to the SWAP language, it is no
wonder that the SWAP design includes many features
not found in other assemblers. The cumulative set of
features provides a powerful interpretive processor that
may be used for a wide variety of problems.

The SWAP assembler may receive its original input
from a card, disc, or tape data set. The SOURCE
pseudo-operation allows the programmer to change the
input source at any point within a program. It is also
capable of receiving input lines directly from another
program, normally a source editor.

Outputs
Because the input line format is free field, the assembly listing of the source lines may appear quite
unreadable. Therefore, the normal procedure is to have
the assembler align all the fields of the printed line.
The positions of the fields are, of course, a programmer
option. There are several classes of statements that
may be printed or suppressed at the programmer's
discretion. In keeping everything as general as possible,
it is natural that any operation, pseudo-operation, or
macro may be assigned to any combination of these
classes of statements.
In addition to producing the object program, which
varies with different applications, and the assembly
listing just described, SWAP has the facility to save
symbol, instruction, or macro definitions in the form of
libraries which may be loaded and used to assemble
other programs.
Macro expansions and the results of text substitution functions are another optional output. The programmer completely controls which lines are to be
generated and the format of these lines. These lines
may be printed separately from the object listing or
placed on card, disc, or tape storage. This optional output may be used to provide input to other assemblers,

DESCRIPTION
The source language is free field. Statement labels
begin in column one. Operation names and parameters
are delimited by a single comma or one or more blanks.
Comments are preceded by the sharp sign (#), and the
logical end of line is indicated by the semicolon (;) or
physical end of card. A method is provided for user interpretation of other than this standard syntax ; SWAP
is currently being used as a preliminary version of
several compilers.
1

2

Fall Joint Computer Conference, 1970

and in this way SWAP can become a pseudo-compiler
for almost any language. This output can also be used
to produce preliminary program documents from comments which were originally placed in ·the source· program deck.

Variables

There are numerous types of variable symbols, such
as integers, floating point numbers, truth value, and
character strings. The programmer may change or
assign the type of any symbol as he wishes. Fot this
purpose, the type of a symbol or operation is represented by a character. Each variable symbol may have
up to 250 user-defined attributes which are data associated with each symbol. In addition, each symbol
represents the top of a push-down list which allows the
programmer to make a local use of any symbol.
A string variable would be defined by· the TEXT
pseudo-operation:
VOWELS

TEXT

'AEIOU'

while a numeric value is assigned by SET:
LIMIT SET

10

The 'functional' notation is used extensively to
represent not only the value of a symbol attribute, but
also to represent array elements and predefined or
user-defined arithmetic functions. In the following
statement:
ALPHA (SA)

SET

BETA (SB) +10

the ALPHA attribute of symbol SA would be assigned
a value ten ·greater than the BETA attribute of symbol
SB.
An array of three dimensions would be declared by
the statement:

Expressions

The following is a hierarchical list of the operators
allowed in expressions:

1

**
*

or
and

/

unary-

and

unary-,

+

and

=, >, <, -, =
=>
=<
&

or
or
or
and

~

~

::;;
-,

and

}

exponentiation
multiplication and
division
negation and complement
addition and subtraction
the six relational operators
logical AND and
AND of complement
logical OR and EXeLUSIVE OR

( ), [ ], and { } may be used in the usual manner to
force evaluation in any order.
Four particular rules apply to the use of these
operations:
1. Combined relations ApBpC are evaluated the
same as the expression ApB&BpC where pis any
relational operator.
2. Character strings in comparisons are denot~d as
quoted strings.
3. The type of each operand is used to determine
the method of evaluation. (For example, the
complement of an integer is the 32-bit complement while the complement of a truth value is a
I-bit complement.)
4. If a TEXT symbol is encountered as an operand
in an expression, it is called an indirect symbol,
and its value is the result of evaluating the
string as an expression.

ARRAY CUBE ( -1:1,3,0:2)=4
In this example, the range of the first dimension runs
from -1 through + 1, while the second dimension is
from + 1 through +3, and the third is from 0 through
2. Each element would have the initial value 4, but
the following statement could be used to assign another
value to a particular element of the array:
CUBE ( -1,2,0)

SET

5

An attribute, array, or function reference may occur
anywhere that a symbol may be used in an arithmetic
expreSSlOn.

Predefined Functions

Several built-in or predefined functions are provided
to aid in evaluating some of the more common expressions. The following is a partial list of the available
functions:
E(EXP)
MAX (EXP1,

.•• ,

EXP n )

Results in 2 raised to the
EXP power.
Returns the maximum of
the expressions EXP1
through EXP n.

SWAP

STYP(EXP, C)

SET(SYMB, EXP)

Returns the value of EXP,
but the type of the result
is the character C as discussed in the Variables
section.
Returns the value of EXP
and assigns that same
value to the symbol
SYMB. This differs from
the SET pseudo-operation in that the symbol
is defined during the
evaluation of an expres-

Y is present, but the value of INC (3) is 4 since an
argument value for Y was omitted.
The other feature which allows an arbitrary number
of arguments is the ability to loop over a part of the
defining expression, using successive argument values
wherever the last dummy parameter name appears in
the range of the loop. This feature is invoked by the
appearance of an ellipsis ( ... ) in the defining expression. The range of the loop is from the operator immediately preceding the ellipsis backward to the first
occurrence of the same operator at the same level of
parentheses. As an example, consider the following
statement:

SIOn.

DFN

To allow the programmer to define any number of
new functions, the DFN pseudo-operation is provided.
The general form of a function definition is written:
••• ,

P n) =A1:B1, A 2 :B2,

••• ,

An:Bn

where F is the function name, the Ps are dummy
parameter names, and the As and Bs are any valid
expressions. These expressions may contain the Ps and
other variables as well as other function calls which may
be recursive.
To evaluate the function, the Bs are evaluated left
to right. The result is the value of the A corresponding
to the first B that has a value of true (or nonzero).
The colons may be read as the word "if." A simple
example would be the function:
DFN

POS(X)=1:X>0,0:X~0

which returns the value 1 if its argument is positive;
otherwise, the result is zero. If the expression Bn is
omitted, it is assumed to be true. Another example is
the following definition of Ackermann's function:

=A~X**(Y +C)'+- --

The range of the loop is from the
following the right
parenthesis backward to the + between the A and the
X. The call SUM (4, 1,2,3) would yield the same
result as the following expression:

A +4**(1 +C) +4**(2+C) +4**(3+C)
The loop may also extend over the expression between
two commas as the next example shows. A recursive
function to do the EXCLUSIVE OR of an indefinite
number of arguments could be defined by:
DFN

XOR(A, B, C) =A-,B I B-,A: -,C?,
XOR(XOR(A, B) ,IC,l ...

N =0, ACK(M -1, ACK(M, N-1))
Two features are provided to allow an arbitrary number of arguments in the call of a function. The first is
the ability to ask if an argument was implicitly omitted
from the call. This feature is invoked by a question
mark immediately following the dummy parameter
name. If the argument was present, the result of the
parameter-question mark is the value true; otherwise,
the value is false. For example, the function defined by:

INC(X, Y)=X+Y:Y?,X+1

would yield the value 7 when called by INC (2, 5) since

)

Sequencing control

The pseudo-operations that allow the normal sequence of processing to be modified provide the real
power of an assembler. In SWAP, the pseudo-operations
that provide that control are JUMP and DO. JUMP
forces the assembler to continue sequential processing
with the indicated line, ignoring any intervening lines.
The statement:
JUMP

DFN ACK(M, N) =N+l:M=0,ACK(M-1, 1):

DFN

SUM(X, Y)

+

Programmer-defined functions

DFN F(P1, P 2,

3

.LINE

will continue processing with the statement labeled:
.LINE. The symbol .LINE is called a sequence symbol
and is treated not as a normal symbol but only as the
destination of a JUMP or DO. Sequence symbols are
identified by the first character, which must be a period.
A normal symbol may also be used as the destination
of a JUMP or DO, if convenient. The destination of a
JUMP may be either before or after the JUMP statement.
The JUMP is taken conditionally when an expression is used following the sequence symbol:
JUMP

.XX, INC> 10

# IS

IT OVER LIMIT

4

Fall Joint Computer Conference, 1970

The JUMP to .XX will occur only if the value of the
symbol INC is greater than ten.
The DO pseudo-operation is used to control an assembly time loop and may be written in one of three
forms:
DO
DO
DO

.LOC, VAR= INIT, TEXP, INC
.LOC, VAR=INIT, LIMIT, INC
.LOC, VAR=(LIST)

(i)
(ii)
(iii)

Type (i) assigns the value of IN IT to the variable
symbol VAR. The truth value expression TEXP is
then evaluated and, if the result is true, all the lines
up to and including the line with .LOC in its location
field are assembled. The value of INC (if INC is
omitted, 1 is assumed) is then added to the value of
VAR and the test is repeated using the incremented
value of V AR.
Type (ii) is the same as type (i) except that the
value of V AR is compared to the value of LIMIT; the
loop is repeated if INC is positive and the value of VAR
is less then or equal to the value of LIMIT. If INC is
negative, the loop is repeated only while the value of
V AR is greater than or equal to the value of LIMIT.
Type (iii) assigns to V AR the value of the first item
in LIST. Succeeding values are used for each successive
time around the loop until LIST is exhausted.
The following are examples of the use of DO:
Type (i)
Type (ii)
Type (iii)

DO
DO
DO

.Y,M=I,M~10&A(M»0

.X, K=I, 100, K+1
.Z, N = (1, 3, 4, 7,11,13,17)

Control of optional output

Selected results of macro and text substitution facilities may be used as an optional output. This is accomplished by the use of the EDIT psuedo-operation
which may be used in a declarative, global, or range
mode.
The declarative mode does not cause any output to
be generated, but is used to declare the destination
(printer, punch, or file) of the output and the method
of handling long lines. It is also used to control the
exceptions to the global output mode. For example,
the statement:
PRINT EDIT OFF ('ALL') ,
ON ('REMARKS', NOTE, DOC),
CONT(72, 'X', '- - -')
would indicate that edited output is to be printed, and
that any line that exceeds 72 characters is to be split

into two print records with an X placed at the end of
the first 72 characters and the remainder appended to
the - - -. If EDIT ON, the global form, were to be
used with the above declarative, then only lines that
contain NOTE or DOC in the operation field as well
as all remark statements will be outputted.
The range form of ED IT allows a sequence of lines
to be outputted regardless of their syntax. Lines outputted in this mode are then ignored by the remainder
of the assembly processes.
Two examples of this form are EDIT .NEXT which
causes the next line to be outputted, and EDIT .LINE
which causes all lines up to, but not including, the line
with the sequence symbol .LINE in its label field. See
the Appendix for examples of the use of the EDIT
pseudo-operation.

Macros

The macro facilities incorporated in SWAP make it
one of the most flexible assemblers available. The
macro facilities presented here are by no means exhaustive but only representative of the more commonly used features.
The general form of a macro definition is:
MACRO
prototype statement
macro text lines
MEND
The prototype statement contains the name of the
macro definition as well as the dummy parameter
names which are used in the definition. The macro
text lines, a series of statements which make up the
definition of the macro, will be reproduced whenever
the macro is called.
Any operation, pseudo-operation, or macro may be
redefined as a macro. Also, there are no restrictions as
to which operatiorrs are used within a macro definition;
this means that it is legitimate for macro definitions to
be nested.

Macro operators and subarguments

Macro operators are provided to allow the programmer to obtain pertinent information about macro
arguments and some of their common parts. A macro
operator is indicated by its name character followed by
a period and the dummy parameter name of the
operand. For example, if a parameter named ARG has
the value (A, B, C), then the number operator,

SWAP

N.ARG, would be replaced by the number of subarguments of ARG; in this example, N.ARG is replaced
by 3.
Any subparameter of a macro argument may be accessed by sUbscripting the parameter name with the
number of the desired sub argument. Additional levels
of sub arguments are obtained with the use of multiple
indexes. As an example, let the parameter named ARG
assume the value P (Q, R (S, T)), then:
ARG(O)
ARG(I)
ARG(2)
ARG(2, 0)
ARG(2, 1)

is replaced by P
is replaced by Q
is replaced by R(S, T)
is replaced by R
is replaced by S

The macro operators may be used on the results of
each other as well as on subparameters; for example,
N.ARG (2) would be replaced by 2.
The following is an example of a simple macro to
define a list of symbols:
MACRO
DEFINE LIST
DO .LP, K=1, N .LIST
LIST(K,1)

SET LIST(K, 2)

.LP

NULL # MARK END OF DO LOOP
MEND

If the macro were called by the following line:
DEFINE «SYMB, 5), (MATRIX (2),7), (CC, 11))
it would expand to:

SYMB
MATRIX (2)
CC

SET 5
SET 7
SET 11

Some of the major macro functions are:
1. IS (expression, string) is replaced by string if
the value of expression is nonzero; otherwise,

the result is the null string.
2. IFNOT(string) is replaced by string if the
expression in the previously evaluated IS had a
value of zero; otherwise, the result is null.
3. STR(exPl, exp2, string) is replaced by exp2
characters starting with the expl character of
string.
4. MTXT (tsym ) is replaced by the character
string which is the value of the TEXT symbol
tsym.
5. MTYP (symb) is replaced by the character that
represents the type of the variable symbol
symb.
6. MSUB (string) is replaced by the result of doing
macro argument substitution on string a second
time.
7. SYSLST(exp) is replaced by the expth argument of the macro call.
8. MDO(DO parameters; string) is a horizontal
DO loop where string is the range of the loop.
Each time around, the loop produces the value
of string, which is normally dependent on the'
DO variable symbol.
Keyword argulllents

When the macro is called, keyword arguments are
indicated by the parameter name followed by an equal
sign and the argument string. An example would be
the following calls of a MOVE macro:
MOVE
MOVE

Macro functions

To provide more flexibility with the use of macros,
several system parameters and macro functions have
been made available. Macro functions are built-in
functions that are replaced by a string of characters.
This string, called the result, is determined by the
particular function and its arguments. The arguments
of a macro function may consist of macro parameters,
other macro function calls, literal character strings, or
symbolic variables. An example would be the DEC
macro function, which has one argument, either a
valid arithmetic or logical expression. The result is the
decimal number equal to the value of the expression;
the call DEC (7 +8) would be replaced by 15.

5

FROM=NEWDATA, TO=OLDDATA
or
TO=OLDDATA, FROM=NEWDATA

Both calls will yield the same expansions as the expansion of the MOVE macro using normal arguments:
MOVE NEWDATA,OLDDATA
Default arguments

Default strings are used whenever an argument is
omitted from a macro call. The default string is assigned on the macro prototype line by an equal sign
and the desired default string after the dummy parameter name. Although the notation is the same, default
arguments are completely independent of the use of
keyword arguments.

6

Fall Joint Computer Conference, 1970

Marco pseudo-operations

The ARGS pseudo-operation provides a method of
declaring an auxiliary parameter list which supplements the parameter list declared on the prototype
statement. These parameters may also be assigned
default values.
The parameters defined on an ARGS line may be
used anywhere a normal parameter may be used. The
parameter values may be reset by the use of keyword
arguments.
I t is also possible for the programmer to reset his
named macro argument values anywhere within a
macro by using the MSET pseudo-operation. For
example:
PARM MSET DEC(PARM)
would change the value of P ARM to its decimal value.
The following is an example of the use of the ARGS
pseudo-operation:
MACRO
FUN NUMBER
ARGS WORD = (ONE, TWO, THREE)
#
NUMBER=WORD (NUMBER) .
MEND
When the macro is called by FUN 1 + 1, the following
comment would be generated:

# l+l=TWO
but the call FUN 1+1, WORD = (EIN, ZWEI, DREI)
would generate:

# 1+1=ZWEI
Text manipulating facilities
Some of the more exotic features provided by SWAP
are the character string pseudo-operations and the
dollar macro call.

HUNT and SCAN pseudo-operations

The HUNT pseudo-operation allows the programmer
to scan a string of characters for any. break character
in a second string. It will then define two TEXT
symbols consisting of the portions of the string before
and after the break character. For example, the

statements:
BRKS TEXT

HUNT

'+-*/'

.LOC, TOKEN, REMAIN,
'LSIZE *ENTS', BRKS

will result in the symbols TOKEN and REMAIN
having the string values of 'LSIZE' and '*ENTS' respectively. If one of the characters inBRKS could not
be found in the scanned string, then a JUMP to the
statement labeled .LOC would occur.
The SCAN pseudo-operation provides the extensive
pattern matching facilities of SNOBOL3 I along with
success or failure transfer of control. This pseudooperation is written:

where TSYM is a previously defined string valued
variable. The SNOBOL3 notation is used to represent
the pattern elements PI through P n as well as the GO TO
field. See the references for a more detailed presentation
of these facilities.

Dollar functions

Dollar functions are very similar to macro functions
in that the result of a dollar function call is a string of
characters that replace the call. However, these functions may be used on input lines as well as in macros.
The dollar functions provide the ability to call a oneline macro anywhere on a line by preceding the macro
name with a dollar sign and following it with the argument list in parenthesis. For example, the macro:
MACRO
CHECK
A,B
IS(AO

MEND
MACRO
PRINT FMT
DO
K~=2,N.SYSLST
t. CHECR FOR ITERATIVE LISTS
IS (. STR (1, 1, SYSLST (K~) ) ' : ' (', ITEMI)IFNOT(ITMI:DEC(KI) TEXT)
'SYSLST(KI) ,
.X
NULL
MOO (K~=2, N. SYSLST; MTXT (ITMI :DEC (K~»
)
FMT
OU1'_

.x

MEND

8B

80

Fall Joint Computer Conference, 1970

MACRO

FMT

t.

OUT

KI SET 1;JI SET 0 ;JJI SET 0

.LP
EDIT
• NEXT
GENERATE A LINE OF PRINTOUT
MSUB(MTXT(FMT:_:DEC(KI»)
JUMP .LP,SET(KI,KI+1) SFMT:_L It HAS FORMAT BEEN EXHAUSTED
JUMP
.OUT,JI~N.SYSLSTIJISJJI
tt WHEN PRINT LIST
EXHAUSTED OR NOTHING BEING DONE
JJI

SEl'

JI

.RLP
EDIT
• NEXT
It BACK UP TO LAST LEFT PAREN
MSUB(STR(FMT:_K,SOO,MTXT(FMT:_:DEC(FMT:_R»»
JUMP .RLP SET (KI,FMT:_P.+ 1) >FMT:_L&JJI '

MEND

I

MACRO
FMT

.t SLASH

BRK_S
BRK_C

FMT:_:DEC{~LlNES)

AI

SET

0 ;ILINES
MEND

TEXT 'MDO(KI=1,AI;MTXT(ITMI:DEC(KI)}) ,
SET 'LINES.'

I
MACRO

BRK Q

•• QUOTED STRING

ITM%:DEC(SET(A%,A'.1» TEXT 'Q.MTXT(REM')·
REM % TEXT
'STR(K.Q.MTXT(REM')+2,99,MTXT(REMi»

•

MACRO

BRK_H

ITMI:DEC(SET(AI,AI+1»
REMI

,

t

MEND

TEXT

tt HOLERITH STRING
TEXT 'STR(2,TRMI,MTXT(REMI»'

'STR(TRMI+1,99,MTXT(REMI»'

MEND

t.

LN

MACRO
FTYP_I
INTEGER
MSET
STR(2,10,MTXT(TYPI»

OP

MSET

DEC (MAX (1, DUP~) )

ITMI:DEC(SET(Ai,AI+1»
I.DEC (JI»

TEXT

;I.DEC (I.SYSLST (SET

':I.MDO(IN=1,MIN(DP,I.N.I.SYSLST(J~,JI+1»

MEND

,LN,'

'»'

I

MACRO
FTYP X

ITMI:DEC(SET(AI,A'+1»
MEND

It BLANKS

TEXT 'MDO(NI=1,MAX(1,DUPI); )'

8D

8E

Fall Joint Computer Conference, 1970

MACRO

END
SYSPRINT EDIT OFF
MEND
FORT_PROG

I I TERMINATE SOURCE
tt. END OF SOURCE

LI~TING
PPOGRA~

It NOW EXECUTE SOURCE PFOGRAM
t t TERM! N1\ TE RUN

FORT_PROO

END 1
MEND
I
FORMAT

OPBITS

END

OPBITS ON(ACTIV~
OPBITS OFF (CONT)

END

ON(ACTIV~

t

ALLOW THESE OPS TO EXPAND
DURING MACRO DEFINIT!ON

t NO CONTINUATION

ALLOWED FOR END

MACRO
EDIT

OPBITS ON(ACTIVE)

•

EDIT

ON (FORMAT, END)

MACRO
t MAKE ENTIRE PROGRAM A MACRO DEFINITION
FORT_PROG
SYSPRINT EDIT • NEXT
•• EJECT TO NEW PAGE
1
PRINT
EDIT ON
It PRODOCE SOURCE LISTING

Definition mechanisms in extensible
programming languages
by STEPHEN A. SCHU1V[AN*
U niversite de Grenoble
Grenoble, France

and

PHILIPPE JORRAND
Centre Scientifique IBM-France
Grenoble, France

INTRODUCTION

1. A base language, encompassing a set of indispensable programming primitives, organized so
as to constitute, in themselves, a coherent
language.
2. A set of extension mechanisms, establishing a
systematic framework for defining new linguistic
constructions in terms of already existing ones.

The development of extensible programming languages
is currently an extremely active area of research, and
one which is considered very promising by a broad
segment of the computing community. This paper represents an attempt at unification and generalization of
these developments, reflecting a specific perspective on
their present. direction of evolution. The principal influences on this work are cited in the bibliography, and
the text itself is devoid of references. This is indicative
of the recurring difficulty of attributing the basic ideas
in this area to any single source; from the start, the
development effort has been characterized by an exceptional interchange of ideas.
One simple premise underlies the proposals for an
extensi hIe programming language: that a "user" should
be capable of modifying the definition of that language,
in order to define for himself the particular language
which corresponds tb his needs. While there is, for the
moment, a certain disagreement as to the degree of
"sophistication" which can reasonably be attributed to
such a user, there is also a growing realization that the
time is past when it is sufficient to confront him with
a complex and inflexible language on a "take it or
leave it" basis.
According to the current conception, an extensible
language is composed of two essential elements:

Within this frame of reference, an extended language is
that language which is defined by some specific set of
extensions to the given base language. In practice,
definitions can be pyramided, using a particular extended language as the new point of departure. Implicit
in this approach is the assumption that the processing
of any extended language program involves its systematic reduction into an equivalent program, expressed
entirely in terms of the base language.
Following a useful if arbitrary convention, the extension mechanisms are generally categorized as either
semantic or syntactic, depending on the capabilities that
they provide. These two types of extensibility are the
subjects of subsequent· sections, where models are developed for these mechanisms.
Motivations for extensible languages

The primary impetus behind the development of
extensible languages has been the need to resolve what
has become a classic conflict of goals in programming
language design. The problem can be formulated as

* Present address: Centre Scientifique IBM-France
9

10

Fall Joint Computer Conference, 1970

power oj expression versus economy oj concepts. Power
of expression encompasses both "how much can be
expressed" and "how easy it is to express". It is essenti ally a question of the effectiveness of the language,
as seen from the viewpoint of the user. Economy of
concepts refers to the idea that a language should
embody the "smallest possible number" of distinguishable concepts, each one existing at the "lowest possible
level". This point of view, which can be identified with
the implementer, is based on efficiency considerations,
and is supported by a simple economic fact: the costs
of producing and/or using a compiler can become prohibitive. Since it is wholly impractical to totally disregard either of these competitive claims, a language
designer is generally faced with the futile task of
reconciling two equally important but mutually exclusive objectives wit.hin a single language.
Extensible languages constitute an extremely pragmatic response to this problem, in the sense that they
represent a means of avoiding, rather than overcoming
this dilemma. In essence, this approach seeks to encourage rather than to suppress the proliferation of
programming languages; this reflects an increasing disillusionment with the "universal language" concept,
especially in light of the need to vastly expand the
domain of application for programming languages in
general. The purpose of extensible languages is to establish an orderly framework capable of accommodating
the development of numerous different, and possibly
quite distinctive dialects.
In an extensible language, the criteria concerning
ecohomy of concepts are imposed at the point of formulating the primitives which comprise the base language.
This remains, therefore, the responsibility of the implementer. JVIoreover, he is the one who determines the
nature of the extension mechanisms to be provided.
This acts to constrain the properties of the extended
languages subsequently defined, and to effectively control the consistency and efficiency of the corresponding
compilers.
The specific decisions affecting power of expression,
however, are left entirely to the discretion of the user,
subject only to the restrictions inherent in the extension
mechanisms at his disposal. This additional "degree of
freedom" seems appropriate, in that it is after all the
language user who is most immediately affected by
these decisions, and thus, most competent to make the
determination. The choices "rill, in general, depend on
both the particular application area as well as various
highly subjective criteria. What is important is that
the decision may be made independently for each individual extended language.
At the same time, the extensible language approach
overcomes what has heretofore been the principal ob-

stacIe to a diversity of programming languages: incom.;.
patibility among programs written in different languages. The solution follows automatically from the
fact that each dialect is translated into a common base
language, and· that this translation is effected by essentially the same processor.
Despite the intention of facilitating the definition of
diverse languages, the extensible language framework
is particularly appropriate for addressing such significant problems as machine-to-machine transferability,
language and compiler standardization, and object code
optimization. The problems remain within manageable
limits, independent of the number of different dialects;
they need only be resolved within the restricted scope
of the base language and the associated extension
mechanisms.

Evolution oj extensible languages

An extensible language, according to the original
conception, was a high level language whose compiler
permitted certain "perturbations" to be defined. Semantic extension was formulated as a more flexible set
of data and procedure declarations, while syntactic
extension was confined to integrating the entities which
could be declared into a pre-established style of expression. For the most part, the currently existing extensible
languages reflect this point of departure.
It is nonetheless true that the basic research underlying the development of extensible languages has taken
on the character of an "attempt to isolate and generalize
the various "component parts" of programming languages, with the objective of introducing the property
of "systematic variability". A consequence of this effort
has been the gradual emergence of a somewhat more
abstract view of extensible languages, wherein the base
language is construed as an inventory of essential
primitives, the syntax of which minimally organizes
these elements into a coherent language. Semantic extension is considered as a set of "constructors" serving
to generate neW, but completely compatible primitives;
syntactic extension permits the definition of the specific
structural combinations of these primitives which are
actually· meaningful. Thus, extensible languages have
progressively assumed the aspect of a language definition framework, one which has the unique property
that an operational compiler exists at each point in the
definitional process.
Accordingly, it is increasingly appropriate to regard
extensible languages as the basis for a practical language
definition system, irrespective of who has responsibility
for language development. Potentially, such an environment is applicable even to the definition of non-

Definition Mechanisms in Extensible Programming Languages

extensible languages. Heretofore, it has been implied
that any given extended language was itself fully
extensible, since its definition is simply the result of
successive levels of extension. In conjunction with the
progressive generalization of the extension capabilities,
however, one is naturally led to envision a complementary set of restriction mechanisms, which would
serve to selectively disable the corresponding extension
mechanisms.
The intended function of the restriction mechanisms
is to eliminate the inevitable overhead associated with
the ability to accommodate arbitrary extension. They
would be employed at the point where a particular
dialect is to be "frozen". In effect, such restriction
mechanisms represent a means of imposing constraints
on subsequent extensions to the defined language (even
to the extent of excluding them entirely), in exchange
for a proportional increase in the efficiency of the
translator. The advantage of this approach is obvious:
the end result of such a development process is both a
coherent definition of the language and an efficient,
operational compiler.
Within this expanded frame of reference, most of the
extensible languages covered by the current literature
might more properly be considered as· extended languages, even though they were not defined by means of
extension. This is not unexpected, since they represent
the results of the initial phase of development. The
remainder of this paper is devoted to a discussion of
the types of extension mechanisms appropriate to this
more evolved interpretation of extensible languages.
The subject of the next section is semantic extensibility,
while the final section is concerned with syntactic
extensibility. These two capabilities form a sort of twodimensional definition space, within which new programming languages may be created by means of
extension.
SEMANTIC EXTENSIBILITY
In order to discuss semantic extensibility, it is first
necessary to establish what is meant here by the
semantics of a programming language. A program remains an inert piece of text until such time as it is
submitted to some processor: in the current context, a
computer controlled by a compiler for the language in
which the program is expressed. The activity of the
processor can be broadly characterized by the following
steps:
1. Recognition of a unit of text.
2. Elaboration of a meaning for that unit.
3. Invocation of the actions implied by that
meaning.

11

According to the second of these steps, the notion of
meaning may be interpreted as the link between the
units of text and the corresponding actions. The set of
such links will be taken to represent the semantics of
the programming language.
As an example, the sequence of characters "3.14159"
is, in most languages, a legitimate unit of text. The
elaboration of its associated meaning might establish
the following set of assertions:
-this unit represents an object which is a value.
-that value has a type, which is real.
-the internal format of that value is floatingpoint.
-the object will reside in a table of constants.
This being established, the .actions causing the construction and allocation of the object maybe invoked.
The set of assertions forms the link between the text
and the actions; it represents the "meaning" of 3.14159.
With respect to the processor, the definition of the
semantics of a language may be considered to exist in
the form of a description of these links for each object
in the domain of the language. When a programming
language incorporates functions which permit explicit
modification of these descriptions; then that language
possesses the property of semantic extensibility. These
functions, referred to as semantic extension mechanisms,
serve to introduce new kinds of objects into the language, essentially by defining the set of linkages to be
attributed to the external representation of those objects.

Semantic extension in the domain of values: A model
The objects involved in the processing of a program
belong, in general, to a variety of categories, each of
which constitutes a potential domain for semantic
extension. The values, in the conventional sense, obviously form one such category. In order to illustrate
the overall concept of semantic extensibility, a model
for one specific mechanism of semantic extension will
be formulated here. It operates on a description of a
particular category of objects, which encompasses a
generalization of the usual notion of value. For example,
procedures, structures and pointers are also considered
as values, in addition to simple integers, booleans, etc.
These values are divided into classes, where each
class is characterized by a mode. The mode constitutes
the description of all of the values belonging to that
class. Thus the mode of· a value may be thought of as
playing a role analogous to that of a data-type. It is

12

Fall Joint Computer Conference, 1970

assumed that processing of a program is controlled by
syntactic analysis. Once a unit of text has been isolated,
the active set of modes is used by the compiler to
elaborate its meaning. Typically, modes are utilized
to make sure that a value is employed correctly, to
verify that expressions are consistent, to effect the
selection of operations and to decide where conversions
are required.
The principal component of the semantic extension
mechanism is a function which permits the definition
of new modes. Once a mode has been defined, the
values belonging to the class characterized by that
mode may be used in the same general ways as other
values. That is to say, those values can be stored into
variables, passed as parameters, returned as results of
functions, etc.
The mode definition function would be used like a
declaration in the base language. The following notation
will be taken as a model for the call on this function:
mode u is T with 7r;

either the name of an existing mode or a constructor
applied to some combination of previously defined
modes. There are assumed to be a finite number of
modes predefined within the base language. In the
scope of this paper, int, real, bool and char are taken
to be the symbols representing four of these basic
modes, standing for the modes of integer, real, boolean
and single character values, respectively. Thus, a valid
mode definition might be:
m ode integer is int;

The model presented here also includes a set of mode
constructors, which act to create new modes from existing
ones. The following list of constructors indicates the
kinds of combinations envisioned:
1. Pointers
A pointer is a value designating another value.
As any value, a pointer has a mode, which
indicates that:

The three components of the definition are:
1. the symbol clause "mode u",
2. the type clause "is T",
3. the property clause "with 7r".
The property c1ause may be omitted.
The symbol clause

-it is a pointer.
-it is able to point to values of a specified class.
The notation ptr M creates the mode characterizing pointers to values of mode M. For example,
mode ppr is ptr ptr real;

specifies that values of mode ppr are pointers to
pointers to reals.

In the symbol clause, a new symbol u is declared as
the name of the mode whose description is specified
by the other clauses. For example,
mode complex is ...

may be used to introduce the symbol complex. In addition, the mode to be defined may depend on formal
parameters, which are specified in the symbol clause as
follows:
mode matrix (int

ill,

int n) is ...

The actual parameters would presumably be supplied
when the symbol is used in a declarative context, such as
matrix (4, 5)A;
The type clause

In the type clause, T specifies the nature of the values
characterized by the mode being defined. Thus, T is

2. Procedures
A procedure is a value, implying that one can
actually have procedure variables, pass procedures as parameters and return them as the
results of other procedures. Being a value, a
procedure has a mode which indicates that:
-it is a procedure.
-it takes a fixed number (possibly zero) of
parameters, of specified modes.
-it returns a result of a given mode, or it does
not return any result.
The notation proc (MI, ... , Mn) M forms the
mode of a procedure taking n parameters, of
modes MI ... Mn respectively, and returning a
value of mode M. As an example, one could
declare
mode trigo is proc (real)real;

Definition Mechanisms in Extensible Programming Languages

for the class of trigonometric functions, and then
mode trigocompose is proc (trigo, trigo)trigo;

for the mode of functions taking two trigonometric functions as parameters, and delivering a
third one (which could be the composition of
the first two) as the result.
3. Aggregates
Two kinds of aggregates will be described:
a. the tuples, which have a constant number of
components, possibly of different modes;
b. the sequences, which have a possibly variable
number of components, but of identical
modes.
a. Tuples
The mode of a tuple having n components
is established by the notation [M 1s1, ... ,
Mnsn], where M 1 ... Mn are the modes of
the respective components, and Sl ... Sn
are the names of these components, which
serve as selectors. Thus, the definition of
the mode complex might be written.
mode complex is [real rp, real ip];

If Z is a complex value, one might write
Z.rp or Z.ip to access either the real part
or the imaginary part of Z.
b. Sequences
The mode of a sequence is constructed by
the notation seq (e)M, where e stands for an
expression producing an integer value, which
fixes the length of the sequence; the parenthesized expression may be omitted, in
which case the length is variable. The
components, each having mode M, are
indexed by integer values ranging from 1
to the current length, inclusively. The
mode for real square matrices could be
defined as follows:
mode rsqmatrix (int n)
is seq (n) seq (n) real;

If B is a real square matrix, then the notation B(i)(j) would provide access to an individual component.

4. Union
The union constructor introduces a mode characterizing values belonging to one of a specified

13

list of classes. The notation union (MI' ... , 1\1: n)
produces a mode for values having anyone of
the modes 1\1: 1... Mil' Thus, if one defines
mode procir is proc (union (int, real));

this mode describes procedures taking one parameter, which may be either an integer or a
real, and returning no result. A further example,
using the tuple, pointer~ sequence and union
constructors, shows the possibility of recursive
definition:
mode tree
is [char root,
seq ptr union (char, tree) branch];

The list of mode constructors given above is intended
to be indicative but not exhaustive. Moreover, it must
be emphasized that the constructors themselves are
essentially independent of the nature and number of
the basic modes. Consequently, one could readily admit
the use of such constructors with an entirely different
set of primitive modes (e.g., one which more closely
reflects the representations on an actual machine).
What is essential is that the new modes generated by
these constructors must be usable in the language in
the same ways as the original ones.

The property clause

The property clause "with 7r" when present, specifies
a list of properties possessed by the values of the mode
being defined. These properties identify a sub-class of
the values characterized by the mode in the type clause.
Two kinds of properties are introduced for the present
model: transforms and selectors.
1. Transforms
The transforms provide a means of specifying
the actions to be taken when a value of mode
M1 occurs in a context where a value of mode
M2 is required (1\1:1~M2). If M is the mode
being declared, then two notations may be used
to indicate a transform:

a. "from M1 by V.E1," which specifies that a
value of mode M may be produced from a
value of mode M1 (identified by the bound
variable V) by evaluating the expression E 1.

14

Fall Joint Computer Conference, 1970

b. "into M2 by V.E2'" which specifies that a
value of mode M2 may be produced from a
value of mode M (identified by the bound
variable V) by evaluating the expression E 2.
The following definitions provide an illustration:
mode complex
is [real rp, real ip]
with from real
by x. [x, 0.0],
into real
by y. (ify.ip=O
then y.rp
else error) ;
mode imaginary
is [real ip]
with from complex
by x. (if x.rp = 0
then [x.ip]
else error),
into complex
by y. [0.0, y.ip];

By the transforms in the above definitions, all of
the natural conversions among real, complex,
and imaginary values are provided. It must be
noted that the system of transformations
specified among the .modes may be. represented
by a directed graph, where the nodes correspond
to the modes, and the arcs are established by
the from and into transforms. Thus, the rule to
decide whether the transformation from Ml into
1\112 is known might be formulated as follows:
There must exist at least one path from MI
to M 2 •
n. If there are several paths, there must exist
one which is shorter than all of the others.
lll. That path represents the desired transformation.
1.

2. Selectors
The notation "take 1\1 1s as V.E" may appear in
the list of properties attached to the definition
of the mode M. It serves to establish the name
"s" as an additional selector which may be
applied to values of mode M to produce a value
of mode MI. Thus, if X is a value of mode M,
then the effect of writing "X.s" is to evaluate
the expression E (within which V acts as a
bound variable identifying the value X) and to

transform its result into a value of mode MI.
As au example, the definition of complex might
be augmented by attaching the following two
properties:
take real mag as Z. (sqrt (Z. rp

i

2+Z.ip

i

2»,

take radian ang as Z. (arctan (Z.ipjZ.rp»;

The mode radian is presumed to be defined elsewhere, and to properly characterize the class of
angular values.
As with the case of the constructors, the properties
presented here are intended to suggest the kinds of
facilities which are appropriate within the framework
established by the concept of mode.
In summary, it must be stressed that the model developed here is applicable only to one particular category of objects, namely the values on which a program
operates. Clearly, there exist other identifiable categories which enter into the processing of a program
(e.g., control structure, environment resources, etc.).
It is equally appropriate to regard these as potential
domains for semantic extensibility. This will no doubt
necessitate the introduction of additional extension
mechanisms, following the general approach established here. As the number of mechanisms is expanded,
the possibility for selective restriction of the extension
capabilities will become increasingly important. The
development of the corresponding semantic restriction
mechanisms is imperative, for they are essential to the
production of specialized compilers for languages
defined by means of extension.
SYNTACTIC EXTENSIBILITY
A language incorporating functions which permit a
user to introduce explicit modifications to the syntax
of that language is said to possess the property of
syntactic extensibility. The purpose of this section is to
establish the nature of such a facility. It is primarily
devoted to the development of a model which will serve
to characterize the mechanism of syntactic extension,
and permit exploration of its definitional range.
It must be made explicit that, when speaking of
modifications to the syntax of a language, one is in fact
talking about actual alterations to the grammar which
serves to define that syntax. For a conventional language, the grammar is essentially static. Thus, it is
conceivable that a programmer could be wholly unaware of its existence. The syntactic rules, which he is
nonetheless constrained to observe (whether he likes·
them or not), are the same each time he writes a pro-

Definition Mechanisms in Extensible Programming Languages

gram in that language, and no deviation is permitted
anywhere in the scope of the program. The situation is
significantly different for the case of a syntactically
extensible language. This capability is provided by
means of a set of functions, properly imbedded in the
language, which acts to change the grammar. Provided
that the user is cognizant of these functions and their
grammatical domain, he then has the option of effecting
perhaps quite substantial modifications to the syntax of
that language during the course of writing a program in
that language; this is in parallel with whatever semantic
extensions he might introduce. In effect, the grammar
itself becomes subject to dynamic variation, and the
actual syntax of the language becomes dependent on
the program being processed.
The syntactic macro mechanism: A model

The basis of most existing proposals for achieving
syntactic extensibility is what has come to be called
the syntactic macro mechanism. A model of this mechanism is introduced at this point in order to illustrate
the possibilities of syntactic extension. The model is
based on the assumption that the syntactic component
of the base language, and by induction any subsequent
extended language, can be effectively defined by a
context-free grammar (or the equivalent BNF representation). This relatively simple formalism is adopted
as the underlying definitional system despite an obvious
contradiction which is present: a grammar which is
subject to dynamic redefinition by constructs in the
language whose syntax it defines is certainly not
"context-free" in the strict sense. Therefore, it is only
the instantaneous syntactic definition of the language
which is considered within the context-free framework.
The most essential element of the syntactic macro
mechanism is the function which establishes the definition of a syntactic macro. It must be a legitimate linguistic construct of the base language proper, and its
format would likely resemble any other declaration in
that language. The following representation will be
used to model a call on this function:
macro cp where

7r

means p;

The respective components are:

cp, the production;
7r,

the predicate; and

p,

the replacement.

The macro clause would be written in the form
macro

C~'phrase'

15

where C is the name of a category (non-terminal)
symbol in the grammar, and the phrase is an ordered
sequence, Sl ... Sn, such that each constituent is the
name of a category or terminal symbol. Thus the production in a macro clause corresponds directly to a
context-free production. The where and means clauses
are optional components of the definition, and will be
discussed below.
A syntactic macro definition differs from an ordinary
declaration in the base language in the sense that it is a
function operating directly on the grammar, and takes
effect immediately. In essence, it incorporates the
specified production into the grammar. Subsequent to
the occurrence of such a definition in a program, syntactic configurations conforming to the structure of the
phrase are acceptable wherever the correspondingcategory is syntactically valid. This will apply until such
time as that definition is, in some way, disabled. As an
example, one might include a syntactic macro definition
starting with
macro

FACT~'PRIlVI

!'

for the purpose of introducing the factorial notation
into the syntax for arithmetic expressions. Within the
scope of that definition, the effect would be the same as
if the syntactic definition of the language (represented
in BNF) incorporated an additional alternative
(factor): : = . ..

I (primary)!

Thus, in principle, a statement of the form
c : = nl/((n-m)! *m!);
might become syntactically valid according to the
active set of definitions.
The production

The role of the production is to establish both the
context and the format in which "calls" on that macro
may be written. The category name on the left controls
where, within the syntactic framework, such calls are
permitted. One may potentially designate any category
which is referenced by the active set of productions.
The phrase indicates the exact syntactic configuration
which is to be interpreted as a call on that particular
macro. In general, one may specify any arbitrary sequence (possibly empty) of symbol names. The constituents may be existing symbols, terminals which
'were not previously present, or categories to be defined
in other macros. This is of course, subject to the constraint that the grammar as a whole must remain both

16

Fall Joint Computer Conference, 1970

well-formed and non-ambiguous, if it is to fulfill its
intended function.
In addition, the macro clause serves to declare a set
of formal parameters, which may be referenced elsewhere in the definition. Each separate call on that
macro can be thought of as establishing a local syntactic
context, defined 'with respect to the complete syntax tree
which structurally describes the program. This context
would be relative to the position of the node corresponding to the specified category, and would include
the immediate descendants of that node, corresponding
to the constituents of the phrase. Ata call, the symbol
names appearing in the production are associated with
the actual nodes occurring in that context. Thus, the
terminal names represent an individual instance of
that terminal, and the category names represent some
complete syntactic sub-tree belonging to that category.
In order to distinguish between different formal parameters having the same name, the convention of subscripting the names will be adopted here; this notation
could readily be replaced by declaration of unique
identifiers.
The replacement

The means clause specifies the syntactic structure
which constitutes the replacement for a call on that
particular macro. It is written in the form
means 'string'

where the string is an ordered sequence, composed of
either formal parameters or terminal symbol names.
An instance of this string is generated in place of every
call on that macro, within which the actual structure
represented by a formal parameter is substituted for
every occurrence of its name. If the complete syntactic
macro definition for the factorial operator had been
macro

panded. In effect, the meaning of every new construct
introduced into the language is defined by specifying
its systematic expansion into the base language. Accordingly, one might consider syntactic extension
merely as a means for permitting a set of "systematic
abbreviations" to be defined "on top of" the base
language.
An important consequence of the fact that a syntactic
macro definition is itself a valid element of the base
language is that such definitions may occur in the context of a replacement. This is illustrated by the following example, showing how a declaration for "pushdown stack" might be introduced:
macro DECLo~'TYPEI stack [EXPR1] IDEN1;'
means
'TYPE 1 array [1:EXPR 1] IDEN 1 ;
integer level_IDEN 1 initial 0;
macro PRIMo~'depth_IDEN l'
means 'res (EXPR1)';
macro PRIMl~'IDENl'
means '(if level_IDENI >0
then
(IDENI [level_IDEN 1],
level_IDEN1 : =
level_I DEN 1 -1;)
else
error ("overdraw
IDEN 1 "))';
macro REFRo~'IDEN l'
means '(if level_IDEN I <
depth_IDENI
then
(level_IDEN 1 : =
level_IDEN 1 1;
IDEN 1 [level_IDEN1])
else
error ("overflow
IDEN1"))';';

+

FACTo~'PRIMl!'

means 'factorial (PRIM 1) , ;

Thus a declaration of the form
integer stack [K] S;

then each call on this macro would simply be replaced
by a call on the procedure named "factorial", assumed
to be defined elsewhere.
When present, the means clause establishes the
semantic interpretation to be associated with the corresponding production; if absent, then presumably the
construct is only an intermediate form, whose interpretation is subsumed in some larger context. The
"meaning," however, is given as the expansion of that
construct into a "logically lower level language" .
While the replacement may be expressed in terms of
calls on other syntactic macros, these will also be ex-

would generate not only the necessary array for holding
the contents of the stack, but also several other declarations, including:
1. An integer variable, named level_S, corre-

sponding to the level counter of the stack. It is
initialized to zero on declaration.
2. A literal, written "depth_S," for representing
the depth of that stack. Its expansion is given
in terms of the operator res, which is taken to
mean the result of a previously evaluated ex-

Definition 1\{echanisms in Extensible Programming Languages

pression, and presumed to be defined accordingly.
3. A macro definition (PRIlVh) which establishes,
by means of a compound expression, the interpretation of the stack name in "fetch-context".
This allows one to write "N: = S;" for removing
the value from the top of the stack S and assigning it to the integer variable N.
4. A macro definition (REFRo) which establishes
the corresponding "store context" operation.
One can then write "S: = 5 ;" to push a new value
into the stack.

3.

17

Si~'phrase'

\vhere Si is a previously declared parameter
representing a category, and the phrase is written
analogously to that of the production in a macro
clause. It verifies whether the immediate substructure of the specified parameter corresponds
to the indicated configuration. The constituents
of the phrase are also declared as formal parameters. An interesting example is suggested by a
peculiarity in PL/l, wherein the relation
"7 <6<5" is found to be true. A possible
remedy might be formulated as follows:
macro RELo~'REL1STAT 1
A PROCl~'HEADl ... '
A HEADl~'IDENl:proc ... '

The ellipsis notation is introduced with the
framework of functions (3) and (4) to indicate
that the structure of the corresponding constituents is irrelevant [and indeed, it may not
even be knowable in the contexts that can be
established by functions (5) and (6)].
7. 3 Si~'string'
which is successful on· the condition that the
string (generated analogously to the replacement
string) is directly reducible to the category
specified by Si, which is also declared as a formal
parameter to represent the cOIhpleted sub-tree
reflecting the analysis.
8. 3 Si{='string'
which is analogous to function (7), but the condition is generalized to verify whether the string
is reducible (regardless of the depth of the
structure) to the specified category. The definition of the "maximum" function, which requires
two syntactic macros, provides an interesting
example:
macro PRIMo~'max (EXPRLIST 1 ) ,
where EXPRLIST l~'EXPRl'
means '(EXPR l)';
macro PRIMl~'max (EXPRLIST l)'
where EXPRLIST ]~'EXPRLIST 2,
EXPR 2 '
A 3 PRIM 2{='max (EXPRLIST 2)'
means '(if PRIJ\1: 2 > EXPR 2
then res PRIM 2
else res (EXPR 2))"

9. P (arguments)
where P is the name of a semantic predicate, and
the arguments may be either formal parameters
or terminal symbols. Such conditions constitute
a means of imposing non-syntactic constraints
on the definition of a syntactic macro. They are
especially applicable in those situations where
it is necessary to establish the mode of a particular entity. For example, one might rewrite the
factorial definition as follows:
macro FACTo~'PRIMl!'
where is_integer (PRIM l )
means 'factorial (PRIl\1: l)' ;

In this form the definition also has the effect
of allowing different meanings to be associated
with the factorial operator, dependent on the
mode of the primary.
10. 3 Si: F (arguments)
Where F is the name of a semantic function
which conditionally returns a syntactic result.
Si is also declared as a formal parameter to represent this result. The semantic functions and
predicates establish an interface whereby it is
possible to introduce syntactic and semantic
interdependencies. A likely application of semantic functions would be definitions involving
identifiers:
where 3 LABL l : newlabel (IDENl) ...

A particularly intriguing possibility is to provide a semantic function which evaluates an
arbitrary expression:
where 3 CONST 1: evaluate (EXPR 1)

.•.

Obviously, this concept could be expanded to
encompass the execution of entire programs, if
desired.
I t is evident that the role of the where clause in a
syntactic macro definition is to provide a framework
for specifying those properties which effectively cannot
be expressed within the context-free constraints. The
fashion in which they are isolated allows these aspects
to be incorporated without sacrificing all of the practical advantages which come from adopting a relatively
simple syntactic formalism as the point of departure.
With respect to the model presented here, however, it is
nonetheless clear that the definitional power of the
syntactic macro mechanism is determined by the power
of the predicates.
Operationally, the syntactic macro mechanism can
be characterized by three distinct phases, each of which
is briefly considered below.
Definition phase

The definition phase encompasses the different functions incorporated within the base language which act
to insert, delete or modify a syntactic macro definition.
Together, they constitute a facility for explicitly editing
the given grammar, and are employed to form what
might be called. the active syntactic definition. This consists of the productions of the currently active syntac-

Definition Mechanisms in Extensible Programming· Languages

tic macros (with their associated predicates and replacements), plus the original productions of the base
language. An interesting generalization would be to
provide a means of selectively eliminating base language productions from the active syntactic definition,
there by excluding those constructions from the source
language; they would still remain part of the base
language definition, however, and continue to be considered valid in the context of a replacement. In this
fashion, the syntax of an extended language could be
essentially independent of the base language syntax,
thus further enhancing the definitional power of the
syntactic macro mechanism.

Interpretation phase

The interpretation phase includes the processing of
syntactic macro calls. It consists of three separate
operations: (1), recognition of the production; (2),
verification of the predicate; and (3), generation of the
replacement. Obviously, these operations must proceed concurrently with the process of syntactic analysis,
since syntactic macro expansion is incontestably a
"compile-time facility". Given the present formulation
of the syntactic macro mechanism, some form of what
is called "syntax directed analysis" suggests itself
initially as the appropriate approach for the analyzer.
It must be observed that the character of the analysis
procedure is constrained to a certain extent by the
nature of the predicates contained within the active
syntactic definition. Furthermore, the presence of
semantic predicates and functions precludes realization
of the analyzer/generator as a pure preprocessor.
In general, there will be the inevitable trade-off to
be made between power of definition and efficiency of
operation. It is pointless to pretend that this trade-off
can be completely neglected in the process of formulating the syntactic definition of a particular extended
language. However, deliberate emphasis has been given
here to power of definition, with the intention of providing a very general language development framework,
one which furnishes an operational compiler at every
stage. It is argued that the problem of obtaining an efficient compiler properly belongs to a subsequent phase.

Restriction phase

The restriction phase, as construed here, would be a
separate operation, corresponding to the automatic
consolidation of some active syntactic definition in
order to provide a specialized syntactic analyzer for
that particular dialect. The degree to which this

19

analyzer can be optimized is determined both by the
syntactic complexity of the given extended language,
and by the specific constraints on further syntactic
extension which are imposed at that point. If subsequent extensions are to be permitted, they might be
confined within extremely narrow limits in order to
improve the performance of the analyzer; they may be
excluded entirely by suppressing the syntactic definition functions in the base langua'ge. In either case,
various well-defined sub-sets of context-free grammars,
for which explicit identification and efficient analysis
algorithms are known to exist, constitute a basis for
establishing the restrictions. This represents the greatest practical advantage of having formulated the syntactic definition essentially within the context-free
framework.
In conclusion, it is to be remarked that syntactic
extensibility is especially amenable to realization by
means of an extremely powerful extension mechanism
in conjunction with a proportionally powerful restrictions mechanism. This approach provides the essential
definitional flexibility, which is a prerequisite for an
effective language development tool, without sacrificing
the possibility of an efficient compiler. In the end,
however, the properties of a particular extended language dictate the efficiency of its processor, rather than
the converse. This is consistent with the broadened
interpretation of extensible languages discussed in this
paper.

BIBLIOGRAPHY
1 T E CHEATHAM JR
The introduction of definitional facilities into higher level
programming languages
Proceedings of AFIPS 1966 Fall Joint Computer Conference
Second edition Vol 29 pp 623-637 November 1966
2 T E CHEATHAM JR A FISCHER Ph JORRAND
On the basis for ELF-an extensible language facility
Proceedings of AFIPS 1968 Fall Joint Computer Conference
Vol 33 Part 2 pp 937-948 November 1968
3 C CHRISTENSEN C J SHAW Editors
Proceedings of the extensible languages symposium
Boston Massachusetts May 1969 SIGPLAN Notices
Vol 4 Number 8 August 1969
4 B A GALLER A J PERLIS
A proposal jor definitions in ALGOL
Communications of the AGM Vol 10 Number 4 pp
204-299 April 1967
5 J V GARWICK
GPL, a truly general purpose language
Communications of the ACM Vol 11 Number 9 pp
634-639 September 1968
6 E T IRONS
Experience with an extensible language
Communications of the ACM Vol 13 Number 1 pp 31-40
January 1970

20

Fall Joint Computer Conference, 1970

7 Ph JORRAND
Some aspects of BASEL, the base language for an extensible
language facility
Proceedings of the Extensible Languages Symposium
SIGPLAN Notices Vol 4 Number 8 pp 14-17 August 1969
8 B M LEAVENWORTH
Syntax macros and extended translation
Communications of the ACM Vol 9 Number 11 pp 790-793
November 1966
9 M D McILROY
M aero instruction extensions to compiler languages
Communications of the ACM Vol 3 Number 4 pp 214-220
April 1960
10 A J PERLIS
The synthesis of algorithmic systems
First ACM Turing Lecture Journal of the ACM Vol 14
pp 1-9 January 1967

11 T A STANDISH
Some features of PPL, a polymorphic programming language
Proceedings of the Extensible Languages Symposium
SIGPLAN Notices Vol 4 Number 8 pp 20-26 August 1969
12 T A STANDISH
Some compiler-compiler techniques for use in extensible
languages
Proceedings of the Extensible Languages Symposium
SIGPLAN Notices Vol 4 Number 8 pp 55-62 August 1969
13 A VAN WIJNGAARDEN B J MAILLOUX
J E L PECK C H A KOSTER
Report on the algorithmic language ALGOL 68
MR 101 Mathematisch Centrum Amsterdam October 1969
14 B WEGBREIT
A data type definition facility
Harvard University Division of Engineering and Applied
Physics unpublished 1969

Vulcan-A string handling language
with dynamic storage control*
by E. F. STORlVI
Syracuse University
Syracuse, New York
and

R. H. VAUGHAN
National Resource Analysis Center
Charlottesville, Virginia

INTRODUCTION

meric computation is provisional in the sense that one
ordinarily wants to transform a piece of data only if
that datum (or some other) has certain properties.
For example, a certain kind of English sentence with
a verb in the passive, may want to be transformed
into a corresponding sentence with an active verb.
Or, in a theorem proving context, two formal expressions may have joint structural properties which permit
a certain conclusion to be drawn. In practice, however,
it is the rule rather than the exception that a datum
will fail to have the required property, and in such a
case one wishes that certain assignments of values had
never taken place. In order to accommodate these very
common situations the semantics of Vulcan are defined
and implemented so that changes to the work space
are provisional. While this policy requires some complex machinery to maintain internal storage in the
presence of global/local distinctions and of formal/
actual usage, these maintenance features give Vulcan
much of its power and flexibility.
3. Suppression of Bookkeeping Detail: A programmer should never need to concern himself with storage
allocation matters. Nor should there be troublesome
side effects of the storage maintenance conventions.
Specifically it should be possible to call a set of parameters by name in invoking a procedure or subroutine
so that changes to the values of actual parameters may
easily be propagated back to the calling context. In
such a case no special action should be required from
the programmer. In addition the details of the scan of
a symbol string to locate an infix substring should
never intrude on the programmer's convenience. And
the use of local/global distinctions and formal/actual

The implementation of the man-machine interface
for question-answering systems, fact-retrieval systems
and others in the area of information management
frequently involves a concern with non-numeric programming techniques. In addition, theorem proving
algorithms and more sophisticated procedures for
processing natural language text require a capability
to manipulate representations of non-numeric data
with some ease, and to pose complex structural questions about such data.
This paper describes a symbol manipulation facility
which attempts to provide the principal capabilities
required by the applications mentioned above. In
order to reach this goal we have identified the following
important and desirable characteristics for a set of
non-numeric programming capabilities.
1. Conditional Expressions: Since the formal representations of non-numeric information are ordinarily
defined inductively, it is to be expected that algorithms
to operate on such representations will also be specified
inductively, by cases. A conditional language structure
seems appropriate for a "by-cases" style of programming.
2. Storage Maintenance: Assemblers and other higher-level langua,ges eliminate many of the troublesome
aspects of the storage allocation problem for the user.
Very little use has been made, however, of more sophisticated storage maintenance functions. N on-nu-

* This work was supported by the National Resource Analysis
Center in the Office of Emergency Preparedness.
21

22

Fall Joint Computer Conference, 1970

usage should require no special action in a recursive
situation.
4. Numeric Capabilities: It should be possible to
perform routine counting and indexing operations in
the same language contexts that are appropriate for
processing symbol strings. At the same time, more
complex numerical operations should be available, at
least by means of linkage to a conventional numerical
language.
5. Input/Output: Comprehensive file declaration
and item handling facilities should be included if the
non-numeric features are to be useful in realistic applications. Simple formatting conventions should be available to establish a correspondence between the fields
of a record and a suitable set of symbol strings.
6. Interpretive Execution: There is little penalty
associated with the interpretive execution of nonnumeric algorithms, since the bulk of the system's
resources are concerned with accommodating a sequential, fixed~field system to a recursive, variable-field
process. In addition, interpretive execution is easier
to modify on a trial basis, and permits some freedom
in the modification of source language syntax, provided
there is an intermediary program to convert from
source code to the internally stored form, suitable
for interpretive execution.
While there are other desirable features for a very
general programming language, these were accepted
as a minimum set for exploratory purposes. An overall
goal was to attain a reasonably efficient utilization
of machine resources. In this implementation study
it was felt desirable to achieve running speed at the
expense of storage utilization if a choice were required.
Since most non-numeric computing processes are
thought to be slow in execution, it was decided to emphasize speed whenever possible in the Vulcan system.
List processing certainly plays a central role in the
applications contemplated here. But the Vulcan language was initially intended to be experimental and
to provide an exploration tool, and the implementation was therefore restricted to string manipulation,
elementary arithmetic and file handling.

OVERVIEW
The Vulcan language has been successfully implemented on a Univac 1108 system running under EXEC8, and a comprehensive programmer's reference manual
is available. 1 The emphasis in the implementation of
V ul~an has been on providing a powerful storage maintenance structure in place of complex and general elementary operations. From experience with applications this has been a satisfactory compromise. Ex-

travagant elementary operations have not been so
commonly needed, and when needed they are easily
supplied as specially tailored Vulcan procedures.
Storage maintenance for a recursive situation, on the
other hand, would be much more difficult to supply
in terms of more conventional programming language
structures.
V ulcan is an imperative rather than a functional
language. Since every call on a Vulcan procedure may
be treated both as. an imperative assertion and as a
Boolean expression there are places in the language
design where the notion of truth value assignment
has a character not predictable from more conventional usage. The conventions adopted to cover these
cases may be seen to be justified by their use in applications.
Since Vulcan is a conditional language there are
no labels and no GOTO statements. In a word, the
logical structure of an algorithm must be expressed
in purely inductive terms.
For the numerical calculations associated with a
string manipulation algorithm there are rudimentary
arithmetic operations and conversions between alphanumeric and binary, and there is a comprehensive
range test. All of these operations are defined only for
binary integers. 1\10re complex numerical processing
may be invoked by coding a Fortran program with
parameters passed to it from Vulcan. While there are
restrictions on this facility it has been found to be
more than adequate for the situations encountered so
far.
A complete package of' file processing functions is
available as an integral part of the Vulcan system.
Individual records can be read or written, files opened
or closed, temporary or permanent, on tape or mass
storage. By specifying a format in terms of lengths of
constituent substrings, a record can be directly decomposed into its fields by a single call on the item
handling facility. Calls on the item handler are compatible with the Boolean character of a Vulcan procedure.
There is an initial syntax scanner which puts the
Vulcan constructs into a standard form suitable for
interpretive execution. There are several constructs
which are admitted by the syntax scanner for which
there are no interpretable internal codes, and the
scanner is used to supply equivalent interpretable
internal codes for these situations. The ability to deal
with quoted material in any context appropriate to
an identifier is a case in point.
The scanner has been implemented so that a Vulcan
program may be punched on cards in free-field style.
There are no restrictions on the position of Vulcan
constructs on the program cards except that a dollar

VULCAN

sign (signalling a comment card) may not appear in
column 1, and columns 73-80 are not read.
The more common run-time errors are recognized
by the interpreter and there are appropriate error
messages. As with any non-numeric facility, restraint
and judgment are required to avoid situations where
excessive time or storage can be used in accomplishing
very little.
The entire Vulcan scanner/interpreter occupies
approximately 3900 words of core. A small amount of
storage is initially allocated for symbol tables and
string storage. When this storage is exhausted additional .5000 word blocks of storage are obtained from
the executive. Routine data processing computations
seem to make modest demands on storage, while a
theorem-prover may consume as much storage as is
given it.
A system written in Vulcan consists of a set of Vulcan
procedures. A procedure is a sequence of statements,
and a statement is a sequence of clauses. A clause is
conditional in character and consists of a series of basic
symbol manipulation functions, Input/Output operations, a limited set of arithmetic facilities, and procedure calls. The language is recursive in execution
so that a call on a procedure is executed in a context
which depends on the data available at the time the
call is made. The distinctions between local and global
identifiers and between formal and actual parameters
that are common to Algol are explicitly utilized in
Vulcan.

23

A facility exists to assign a literal string to an identifier:
(1) X = 'ABC'
(2) Y = (assigns the empty string to Y)
Quoted strings may be associated together from left to
right. Suppose one wishes to assign the following literal
string:
RECORDS CONTAINING 'ABC' ARE LOST.
The following literal assignment will create and store
the above string:
X = 'RECORDS CONTAINING'" , 'ABC' " ,
'ARE LOST.'
Spaces outside quote marks are ignored by the
translator. Note that five sub-strings are quoted in
the above literal assignment:
RECORDS CONTAINING

ABC

ARE LOST.
LANGUAGE DEFINITION
Symbol strings
A string is a sequence of zero or more occurrences
of characters from the UNIVAC 1108 character set.
In particular, the empty string, with zero character
occurrences, is an acceptable string. A string is normally referenced with an identifier and an identifier
to which no string has been assigned is said to be improper. (One common run-time error results from an
attempt to use an improper identifier in an incorrect
way.) A symbol string may also be quoted directly in
the context of an instruction. Except for the left-hand
side of an infix or assignment operation, anywhere that
a string identifier may be used, a quoted literal string
may be used in place of that identifier. For example,
both
(1) WRITE ('ABC')
and
(2) X = 'ABC', WRITE (X),
cause the string 'ABC' to be printed.

The string value of an identifier is called the referent
of that identifier and it may be changed as a result
of an operation. Note that the quote mark itself is
always quoted in isolation.

Language structure
The instructions In Vulcan are embedded in expressions which, like instructions, come out true or
false. A clause is an expression which has an antecedent
and a consequent part, separated by a colon, and
bounded by parentheses. The instructions are coded
in the antecedent and consequent parts and are separated \\1.th commas. For example,

where the ~i and Pi are Vulcan instructions.
A clause comes out true if all the instructions in the
antecedent part, executed from left to right, come out
ture. In this case, all the operations in the consequent

24

Fall Joint Computer Conference, 1970

part are executed, from left to right. For example, the
clause
will come out true and PI will be executed just in case
instruction cPI comes out true and instruction cP2 comes
out false (its negation making it true).
A clause with no antecedent part always comes out
true:
The consequent part of a clause may also be empty:

A program is a set of procedures with a period (.)
terminating the last procedure. The initial procedure
is executed first and acts as a driver or main program
for the system. All other procedures are executed only
by means of calls to them. The completion of this
initial procedure terminates the run.
String manipulation operations
There are two basic string manipulation instructions,
the concatenate operation and the infix test.

(cPI, cP2:)
A clause with neither an antecedent nor a consequent
part comes out true and performs no computation.
(:)

A statement is a series of clauses enclosed in square
brackets:

The consequent part of at most one clause will
be executed within a statement. Starting with the
left-most clause, if a particular clause comes out true
(as the result of all the tests in its antecedent part
coming out true), then, as soon as execution of all the
operations in the clause is finished, the remaining
clauses are skipped and execution begins in the next
statement. If a particular clause comes out false (as
the result of some test in its antecedent part coming out
false), then, execution begins on the next clause. If
any clause comes out true in a statement, then, the
statement is considered to come out true. If all clauses
in a statement come out false, then, the statement is
considered to come out false.
A procedure body is a sequence of statements
bounded by angle brackets, ', and preserve
the original.
Procedure INITIAL sets the values for the identifiers W, X, and Y and then calls procedure REP,
passing in the actual parameters W, X, Y, and Z to
formal parameters A, B, C, and D. (Note that W, X,
and Yare proper and that Z is improper when the call is
made.) Procedure REP then replaces all occurrences of
string B in string A with string C and calls the new
string D. Notice that if no occurrence of B is found
in A, then D is simply set to the referent of A. Called
with the input given in procedure INITIAL, REP will
set the referent of Z to 'LIST ITEl\1S IF AGE> 24
AND WEIGHT> 150'.

(2) 'll.ll.9.000'

PROCEDURE INITIAL;
(3) 'll.$ll.19.24'

(4)

LOCAL W, X, Y, Z;

'+ -86'

If the string to be converted· is not well-formed, then
BINARY(X) comes out false. If it is well-formed, then
the command comes out true and the referent of X is
the converted binary integer string, six characters in
length. If X is improper, error termination occurs.
Arithmetic operations are listed below.
means

X

(2) SUB (X, Y, Z)

means

X = Y - Z

(3) MPY (X, Y, Z)

means

X = Y

(4) DIV (X, Y, Z)

means

X = Yj Z

(5) SETZRO (X)

means X = 0

=

Y

+Z

(1) ADD (X, Y, Z)

([(: W = 'LIST ITEl\1S IF AGE GREATER THAN
24 AND WEIGHT GREATER THAN 150',
X = 'GREATER THAN', Y = '>', REP(W,
X, Y,Z»])
PROCEDURE REP (A, B, C, D);
LOCAL Xl, X2;
([(AjX1.B.X2 : REP (X2, B, C, D), D = X1.C.D)
( :D = A)])

*Z

where the identifiers Y and Z must have referents that
are binary integers, six characters in length. Each
operation always comes out true .. The operation DIV
(X, Y, Z) yields the integral quotient in X and discards
the remainder.
There is one numeric test:
RANGE (X, Y, Z),
where the identifiers· X, .Y, and Z must be binary integers, six characters in length. RANGE(X, Y, Z) comes
out true just in case X ~ Y ~ Z and comes out false
otherwise.
The following Vulcan· program illustrates the basic
operations and the language structure presented thus
far.
In this example, as part of a fact retrieval query
scheme, the task is to simplify an English language
sentence by replacing all occurrences of the string

INPUT OUTPUT OPERATIONS
The Input-Output operations in Vulcan fall into
two categories: (1) card reading and line printing
operations, and (2) file handling operations (for tapes,
Fastrand files, etc.).

Card read and line print
There are standard operations to read a string from
a card and to write a string on the line printer. The
instructions are as follows:
(1) WRITE (Xl, ... , XN)

(2) PRINT (Xl, ... , XN)
(3) READ (Xl, ... ,XN)
WRITE causes the referents of the strings for each
identifier in the list to be printed on successive lines.
PRINT, for each identifier in the list, writes out the

28

Fall Joint Computer Conference, 1970

identifier, followed by an ' =' sign, followed by the
string. If a string is longer than the number of print
positions on the line, remaining characters of the string
are printed on subsequent lines.
For each identifier in the list, READ reads the next
card and assigns the string of characters on the card
to the next identifier. Trailing blanks on a card are
deleted before assigning the string to the identifier.
If a blank card is read, the empty string is assigned
to the identifier.
The WRITE and PRINT operations always come
out true. READ comes out false if any EXEC 8 control card is read, but comes out true otherwise.
There is a modified version of READ available for
use with remote terminal applications which avoids
unwanted line feeds and carriage returns.
File handling operations

The traditional concept of reading and writing items
(logical records) and blocks (physical records) is extended in Vulcan to provide for the handling of individual fields within items. An item in a file is thought
of as a single string which may be decoded into various
substrings, or fields. Alternatively, a set of substrings,
or fields, may be grouped together to form an item
which is then put into a file. These two functions are
accomplished by the ITIVIREAD and ITIVIWRITE
operations, respectively. Supplied on each ITIVIREAD
or ITlVIWRITE request is the name of the file to be
processed, a format which is a definition of the fields
within the item, and a list of identifiers. The specific
relation between the format and the list of identifiers
in each particular request is the subject of Part B of
this section. The general sequence of commands for
manipUlating data files in Vulcan is as follows. Prior
to executing the Vulcan program, buffer storage requirements must be supplied with the FILE statement.
Each file to be processed must be assigned, either externally or dynamically, through the CSF instruction
(described later). The file must be opened before reading
or writing and then closed after processing. A file may
be reopened after it is closed, and it need not be reassigned unless it has been freed. The Vulcan file handling
capability employs the Input-Output Interface for
FORTRAN V under EXEC 8 described in the National
Resource Analysis Center's Guidebook, REG-104.
The user is advised to read this manual before using
the Vulcan file handling commands. The instructions
for file handling and their calling sequences follow.
1. OPEN: opens a file for processing.
CALL: OPEN(FN, l\10DE, TYPE, LRS, PRS,

LAF, LAB), where
FN = Filename (1-12 characters)
}VIODE = l\10de of operation (1 :::;l\10DE:::;7)
TYPE = Type of blocks (1 ~TYPE:::; 5)
LRS
= Logical record size, for fixed size
records.
(l:::;LRS:::;PRS).
If
LRS = '0' then variable SIze
records are indicated.
PRS
= Physical record size (1 :::;PRS:::;N,
where N is buffer size stated on
the FILE declaration).
LAF
= Look-ahead factor (is ignored
if LAF= (empty»
LAB
= Label (is ignored if LAB =
(empty».

Only the first five arguments are necessary for
opening a file. The label field (LAB), or the label (LAB)
and look-ahead factor (LAF) fields may be omitted
in the call. The OPEN request comes out true if the
operation is completed normally and comes out false
otherwise. I/O status information may be retrieved
with the STATUS instruction, described later in this
section. For example, the Vulcan instruction
OPEN('TEACH', '2', '2', '28', '28')
"\\-ill open an output file named 'TEACH' (with fixed
size blocks with no block counts and no sum checks)
of 28-word records each and 28-word items (i.e., one
item per physical record).
2. CLOSE: closes a file that has been opened.
CALL: CLOSE (FN, TYPE), where
FN
= File name (1-12 characters)
TYPE = Close type (1- 2!56 I-

a:::
o

~

64-

...J
~

161-

'1. SIO I IYSTE••

~

f3

a::

>-

I-

SB

11111.

.1

t

4.11.0

2

lid,

,.11

I

III

4 1'10

4 11100

z

..

Z

"1000

NORMALIZED SYSTEM PERFORMANCE

Figure 6-Main memory requirements as a function of
processor power

formance cannot be improved by a hierarchy. Clearly,
as the needed capacity approaches the buffer size, use
of two leve]s is uneconomical.
In extending the memory hierarchy structure to
multiple levels, the statistics of Figure 4 continue to
apply. They must be corrected for the block size used,
however. At each successive level the capacity must
increase as the access time increases. The number of
references not found in an intermediate level will be
approximately the same as if that level were itself the
inner level of memory in the system.

64K BYTE
CAPACITY

>-

>-

37

..J

Algorithms

m
-

LLJ
N

--'
«

::E
2. first leve1 buffer costs twice as high ($.50)
3. main memory access longer (50 cycles)
4. miss ratio improved (lowered) by a factor of
two for each capacity.
Some qualitative rules for optimizing memory
system cost/performance are apparent from these
analyses:
1. as buffer memory is relatively more expensive
less should be used;
2. as main memory is relatively slower more buffer should be used;
3. as algorithms yield a lower miss rate less buffer
should be used.

The converses also apply.
In order to assess the utility of a three-level hierarchy one must first evaluate the two-level alternatives.
To find the most favorable three-level configuration
we must consider a range of capacities for each buffer
level. Figure 12 shows how cost-performance values
for the three-level alternatives can be displayed as a
function of first-level buffer capacity for comparison
with the two-level situation.
Conditions that are favorable to the use of a threelevel configuration include:
1. expensive first level technology
2. steep cost/performance curve for main memory
technology
3. relatively high miss ratios
4. large total memory requirements
An optimum three-level configuration will use less
first-level and more second-level buffer memory than
the equivalent two-level case. The two-level con-

~O----~----~----~----~--~----~
8
16
32
64
128
256

Z

FIRST LEVEL BUFFER CAPACITY
(K BYTES)

Figure 12-Cost/performance analyses of three-level hierarchy
examples

figuration is more generally applicable, until a lower
cost bulk memory is available.
DISCUSSION
A properly designed memory hierarchy magnifies
the apparent performance/cost ratio of the memory
system. For example, the first case assumed in Figure
11 shows a cost/performance advantage of five times
that of a plausible single-level memory system with a
three-cycle access costing $.15 per bit. The combination achieves the capacity of the outer level at a performance only slightly less than that of the inner level.
Because of the substantial difference in the capacities,
the total cost is not greatly more than that of the outer
level alone.
The early memory hierarchy designs attempted to
integrate the speed/cost characteristics of electronic
and electromechanical storage. Now the large performance loss could be predicted from the relatively
enormous access time of the rotating device. For example, degradation of more than 100 times over operation from entirely within two-microsecond memory
would occur with addresses generated every two microseconds, 64K byte buffer (core) capacity, 512 word
block size, and 8ms average drum access time. To compensate for such disparity in access time, the inner
memory must contain nearly complete programs.

42

Fall Joint Computer Conference, 1970

160K

Figure 13-Distribution of program size

Successful time-sharing systems essentially do this.
Figure 13 shows the results of several studies13 ,14,15 of
the distribution of their program size.
These time-sharing systems also indicate direction
toward the use of multiple level internal memory. In
particular, they show the need for low-cost mediumaccess bulk memory. They are caught between inadequate response time achieved paging from drum or
disk and prohibitive cost in providing enough of present
large capacity core memories (LCM). However, designers such as Freemanll and MacDougall12 have
stated that only by investment in such LCM can systems as powerful as the 360/75 have page accessibility
adequate to balance system cost/performance. Freeman's design associates the LCM with a backing disk,
as a pseudo-disk system.
Transparent hierarchy can make it easier to connect
systems into multiprocessing configurations, with
only the outer level common. This minimizes interference at the common memory, and delays due to
cable length and switching. It has no direct effect on
associated software problems.
To date, hierarchy has been used only in the main
(program) memory system. The concept is also powerful in control memories used for microprogram storage.
There it provides the advantages of writeable control

memory, while allowing the microprograms to reside in
inexpensive read-only or non-volatile read-write memory.
A primary business reason for using hierarchy is to
permit continued use of ferrite memories in large systems. With a buffer to improve system performance,
ferrites can be used in a lowest cost design. It is unnecessary to develop ferrite or other magnetic memories at costly, high performance levels.
The uSe of multiple levels also removes the need to
develop memories with delicately balanced cost/performance goals. Rather, independent efforts can aim
toward fast buffer memories and inexpensive large
capacity memories. This permits effective use of resources and implies higher probability of success.
Systems research in the near future should concentrate upon better characterization of existing systems
and programs. There is still little published data that
describes systems in terms of their significant statistical
characteristics. This is particularly true with respect
to the patterns of information scanning that are now
buried under the channel operations required to exchange internal and external data. Only from analysis
and publication of program statistics and accompanying
machine performance data will we gain the insight
needed to improve system structure significantly.
REFERENCES
1 C J CONTI
Concepts for buffer storage
IEEE Computer Group News Vol 2 No 8 March 1969
2 C J CONTI D H GIBSON S H PITKOWSKY
Structural aspects of the system/360-Model 85, I.-General
organization
IBM Systems Journal 7 11968
3 J S LIPTAY
Structural aspects of the system 360 Model 85, II-The
cache
IBM Systems Journal 711968
4 T KILBURN
Electronic Digital Computing Machine
Patent 3,248,702
5 T KILBURN D B G EDWARDS M J LANIGAN
F H SUMMER
One-level storage system
IRE Transactions on Electronic Computers
Vol 11 No 2 1962 pp 223-235
6 D W ANDERSON F J SPARACIO
R M TOMASULO
The IBM System/360 Model 91: Machine philosophy and
instruction handling
IBM Journal Vol 11 No 81967
7 L BLOOM M COHEN S PORTER
Considerations in the design of a computer with a high
logic-to-memory speed ratio
Proc of Sessions on Gigacycle Computing Systems AlEE
Winter General Meeting January 1962

On Memory System Design

8 D H GIBSON
Considerations in block oriented systems design
AFIPS Proceedings Vol 30 SJCC 1967 pp 75-80
9 S S SISSON M J FLYNN
Addressing patterns and memory handling algorithms
AFIPS Proceedings Vol 33 FJCC 1968 pp 957-967
10 B S BRAWN F G GUSTAVSEN
Program behavior in a paging environment
AFIPS Proceedings Vol 33 FJCC 1968 pp 1019-1032
11 D N FREEMAN
A storage hierarchy system for batch processing
AFIPS Proceedings Vol 32 SJCC 1968 p 229

12 M H MACDOUGALL
Simulation of an ECS-based operating system
AFIPS Proceedings Vol 30 SJCC 1967 P 735
13 A L SCHERR
Time-sharing measurement
Datamation Vol 12 No 4 April 1966 pp 22-26
14 I F FREIBERGS
The dynamic behavior of programs
AFIPS Proceedings Vol 33 FJCC 1968 pp 1163-1167
15 G E BRYANT
JOSS-A statistical summary
AFIPS Proceedings Vol 31 FJCC 1967 pp 769-777

43

Design of a very large storage system*
by SAMUEL J. PENNY, ROBERT FINK, and IVIARGARET ALSTON-GARNJOST
University of California
Berkeley, California

files-and single experiments produced tape libraries of
hundreds of reels each.
The problems of handling large tape libraries had
become well known to the experimenters. Tapes were
lost; they developed bad spots; the wrong tapes· were
used; keeping track of what data were on what tape
became a major effort. All these problems degraded the
quality of the data and made the experiments more
expensive. A definite need existed for a new approach.
The study of the problem began with establishment
of a set of criteria for a large-capacity on-line storage
device, and members of the LRL staff started investigating commerically available equipment. The basic
criteria used were:

INTRODUCTION
The Mass Storage System (MSS) is a data-management
system for the on-line storage and retrieval of very large
amounts of permanent data. The MSS uses an IBM
1360 photo-digital storage system (called the chipstore)
with an on-line capacity of 3 X 1011 bits as its data
storage and retrieval equipment. It also uses a CDC 854
disk pack for the storage of control tables and indices.
Both these devices are attached to a CDC 6600 digital
computer at the Lawrence Radiation LaboratoryBerkeley.
Plans for the NISS began in 1963 with a search for an
alternative to magnetic tape as data storage for analyses
in the field of high energy physics. A contract was
signed with IBM in 1965 for the chipstore, and it was
delivered in March of 1968. The associated software on
the 6600 was designed, produced, and tested by LRL
personnel, and the Mass Storage System was made
available as a production facility in July of 1969.
This paper is concerned with the design effort that
was made in developing the }\1:ass Storage System. The
important design decisions, and some of the reasons
behind those decisions, are discussed. Brief descriptions
of the hardware and software illustrate the final result
of this effort.

a. The storage device should be on-line to the
central computing facility.
b. It should have an on-line capacity of at least
2.5XIOll bits (equivalent to 2000 reels of tape).
c. Access time to data in the storage device should
be no more than a few seconds.
d. The data-reading transfer rate should be at least
as fast as magnetic tape.
e. The device should have random-access
capability.
f. The storage medium of the device should be of
archival quality, lasting 5 years at least.
g. The storage medium need not be rewritable.
h. The frequency of unrecoverable read errors
should be much lower than on magnetic tape.
1. Data should be easily movable between the
on-line storage device and shelf storage.
j. The device hardware should be reliable and not
subject to excessive failures and down time.
k. Finally, the storage device should be economically worthwhile and within our budget.

CHOICE OF THE HARDWARE
By 1963 the analysis of nuclear particle interactions
had become a very large application on the digital
computers at LRL-Berkeley. More than half the
available time on the IBM 7094 computer was being
used for this analysis, and the effort was expanding.
Much of the problem was purely data manipulationsorting, merging, scanning, and indexing large tape

Several devices were proposed to the Laboratory by
various vendors. After careful study, including computer
simulation of the hardware and scientific evaluations of

* Work done under auspices of the U.S. Atomic Energy Commission.
45

46

Fall Joint Computer Conference, 1970

Developer
fluids entry

-'----- Control path
- - . Dota path
~ Pneumatic box path

Unexposed
film entry

Monual .ntry
and .xlt for
boxes
.IL7a7 -5312

Figure 1-General MSS architecture

the technologies, the decision was made to enter into a
contract with IBM for delivery, in fiscal year 1968, of
the 1360 photo-digital storage system. This contract was
signed in June of 1965. The major application contemplated at that time is described in Ref. 1.
It was clear that one of the major problems in the
design of the associated software would be the storage
and maintenance of control tables and indices to the
data. Unless indexing was handled automatically by the
software, the storage system would quickly become
more of a problem than it was worth. Protection of the
indices was seen to be equally important, for the
system would be dependent on them to physically locate
the data. It was decided that a magnetic disk pack
drive, with its removable pack,was the most suitable
device for the storage of the MSS tables and indices.
A CDC 854 disk pack drive was purchased for this
purpose.

developer unit completes the processing of a chip within
2.5 min; it overlaps the developing of eight chips so that
its processing rate is comparable to that of the recorder.
Up to 32 chips are stored together in a plastic box.
Figure 2 shows a recorded film chip and the box in
which it is kept. These boxes are transported between
the recorder-developer, the box storage file, and the chip
reader station by means of an air blower system.
Transport times between modules on the Berkeley
system average around 3 sec.
Under the command of the 6600 computer the
chipstore transports a box from the storage file to the
reader, picks out a chip, and positions it for reading.
The chip is read with a spot of light generated by a
cathode-ray tube and detected by a photomultiplier
tube at an effective data rate of 2 million bits per
second. The error correction-detection codes are checked
for validity as the data are read, and if the data are
incorrect, an extensive reread and error-correction
scheme is used to try to reproduce the correct data. The
data are then sent to the 6600 across a high-speed data
channel. Chip pick and store times are less than 0.5 sec.
The box storage file on the Berkeley 1360 system has
a capacity of 2250 boxes. This represents an on-line data
capacity of 2750 full reels of magnetic tape (at 800
BPI); 1360 systems at other sites have additional file
modules, giving them an on-line capacity three or more
times as great as at Berkeley.
A manual entry station on the chipstore allows boxes
of chips to be taken out of the system or to be reinserted.
By keeping the currently unused data in off-line storage
and retaining only the active data in the file, the
potential size of the data base that can be built in the
MSS is equivalent to tens of thousands of magnetic
tapes.

DESCRIPTION OF THE HARDWARE
1360 Photo-digital storage system

The IBM 1360 chipstore is an input-output device
composed of a storage file containing 2250 boxes of"
silver halide film chips, a chip recorder-developer, and a
chip reader. Figure 1 shows the general arrangement of
the chipstore hardware and its relation to the CDC 6600
computer. References 2 through 5 describe the hardware
in detail. A brief summary is given below.
A chip is 35 by 70 mm in size and holds 4.7 million bits
of data as well as addressing and error-correction or
error-detection codes. Data from the 6600 computer are
recorded on the chip in a vacuum with an electron beam,
taking about 18 sec per chip. The automatic film

Figure 2-Recorded film chips and storage box

Design of Very Large Storage System

A process control computer is built into the chipstore
hardware. This small computer is responsible for
controlling all hardware actions as well as diagnosing
malfunctions. It also does the detailed scheduling of
events on the device. Communication between the
chipstore and the host computer goes through this
processor. This relieves the host of the responsibility of
commanding the hardware in detail, and offers a great
deal of flexibility.
854 Disk pack drive

The CDC 854 disk pack drive holds a removable
10-surface disk pack. The pack has a typical access time
of 90 msec, and a data transfer rate of about 1 million
bits per sec. Its storage capacity is 48 million bits.
MSS uses this pack for the storage of all its tables and
indices to the data that have been written into the 1360
chipstore. A disk pack was chosen for this function to
insure the integrity of the MSS tables. The 854 has a
proven record of hardware and data reliability. Also,
since the pack is removable, the drive can be repaired
and serviced without threat to the tables.
6600 Computer complex

The chipstore is connected to one of the CDC 6600
computers at LRL through a high-speed data channel.
The 6600 computer has 131072 words of 60-bit central
core memory (CM) , a central processor unit (CPU)
operating at a 100-nsec cycle rate, and 10 peripheral
processor units (PPU). Each PPU contains 4096 words
of 12-bit core memory and operates at a 1-}Lsec cycle
rate. The PPUs control the data channel connections to
the external input-output equipment and act as the
interface between jobs residing in C1\1 and the external
world.
The operating system on the 6600 is multiprogrammed
to allow several jobs to reside in CM at once and share
the use of the CPU. Two of the PPUs act as the system
monitor and operator interface for the system, and those
remaining are available to process task requests from
the monitor and execute jobs. The MSS, composed of
both CPU and PPU programs, has been built as a
subsystem to this operating system.
CHOICE OF THE MASS STORAGE SYSTEM
SOFTWARE
Design objectives

Having made the commitment on hardware, the
Laboratory was faced with designing and implementing

47

the associated software. The basic problem was to
produce a software system on the CDC 6600 computer
that, using the IBM 1360 chipstore, ,would lead to the
greatest increase in the productive capacity of scientists
at the Laboratory. In addition, it was necessary that the
system be one that the scientists would accept and use,
and to which they would be willing to entrust their data.
I t would be required to be of modular design and
"open-ended," allowing expansion and adjustment to
new techniques that the scientists might develop for
their data analysis.
Overall study of the problem yielded three primary
objectives. 1\1ost important was to increase the reliability of the data storage, both by reducing the number
of data-read errors and by protecting the data from
being lost or destroyed; much time and effort could be
saved if this objective were met. The second objective
was to increase the utilization of the whole computer
complex. The third was to provide facilities for new,
more efficient approaches to data analysis in the future.
The problem was divided into three technical design
areas: the interaction between the software and the
hardware, the interaction between the user and the
software, and the structure of the stored data.
In the area of software-hardware interaction, the
design objectives were to maximize protection of the
user data, interleave the actions for several jobs on the
hardware, reduce the need for operator intervention, and
realize maximum utilization of the hardware. This was
the approximate order of importance.
Objectives in the area of user interaction with the
MSS included making that interaction easy for the user,
offering him a flexible data-read capability, and supplying him with a protected environment for his data. Ease
of data manipulation was of high value, but not at the
expense of data protection. A flexible read mechanism
was necessary, since if the users could not read their data
from the 1\1SS, they would seek other devices. This
flexibility was to include reading data from the chips tore
at rates up to its hardware limit, having random access
to the data under user control, possibly intermixing data
from the chipstore, magnetic tapes, and system disk
files, and being able to read volumes of data ranging in
size from a single word to the equivalent of many reels
of tape.
The problem of data structures for the MSS was
primarily one of finding a framework into which existing
data could be formatted and which met the requirements of system and user interaction. This included the
ability to handle variable-length data records and files
and to access these data in a random fashion. It was
decided that a provision to let the user reference his data
by name and to let the system dynamically allocate
storage space was very important. It was also important

48

Fall Joint Computer Conference, 1970

to have flexible on-line-off-line data-transfer facility so
that inactive data could be moved out of the way.

Software design decisions

Several important design decisions were made that
have had a strong effect on the nature of the final
system. Some of these decisions are listed here.
Each box used for data storage is given a unique
identification number, and this number appears on a
label attached to the box. A film chip containing data is
given a unique home address, consisting of the identification number of the box in which it is to reside and the
slot in that box where it is to be kept. Control words
written at the beginning of the chip and at various places
throughout the data contain this address (along with
the location of the control word on the chip), and this
information can be checked by the system to guarantee
correct positioning for retrieval of the data. It is also
used to aid in recovery procedures for identifying boxes
and chips. This control information can be used to help
reconstruct the MSS tables if they are destroyed.
The control words are written in context with the data
to define the record and file structure of the data on the
chips. The user is allowed· to give the address of any
control word (such as the one at the beginning of a
record) to specify what data are to be read. This scheme
meets the design objective of allowing random access to
data in the chipstore.
Data to be written into the chipstore are effectively
staged. The user must have prepared the data he wishes
to be recorded in the record and file structure he desires
in some prior operation. He then initiates the execution
of a system function that puts the source data into chip
format, causes its recording on film chips, waits for the
chips to be developed, does a read check of the data, and
then updates the J\1SS tables.
Data read from the chipstore are normally sent
directly to the user's program, though system utility
functions are provided for copying data from the
chipstore to tape or disk. If the user desires, he may
include a system read subroutine with his object
program that will take data directly from the chipstore
and supply them to his executing program. This method
was chosen to meet the objectives of high data-transfer
rates and to provide the ability to read gigantic files
of data.
To aid the user in the access and management of his
data in the J\1SS, it was decided to create a datamanagement control language oriented to applications
on the chipstore. A user can label his data with names
of his own choosing and reference the data by those
names. A two-level hierarchy of identification is used,

that of data set and subset. The data set is a collection
of named subsets, in which each subset is some structure
of user data. The control language is not limited to
manipulating only data from the chipstore; it can also
be used to work with magnetic tape or system disk files.
Two more decisions have greatly simplified the
overall problem of data management in the MSS. The
first was to allocate most of the on-line storage space on
the chipstore in blocks to the scientists engaged in data
analysis· of current experiments, and give them the
responsibility of choosing which of their data are to
reside on-line within their block and which are to be
moved off-line. The second decision was to treat all as
permanent. Once successfully written, film chips are
never physically destroyed. At most, the user may
delete his reference to the data, and the chips are
moved off-line.

DESCRIPTION OF THE J\1SS SOFTWARE
The system in use on the 6600 computer for utilizing
the chips tore results both from design effort at the
beginning of the project and from experience gained
during the implementation and initial production
phases. Its essential features are listed below.
Indexing and control of the data stored in the
chipstore are handled through five tables kept on the
disk pack, as follows.
The box group allocation table controls the allocation
of on-line storage space to the various scientists or
experiments at the Laboratory. Any attempt by a user
to expand the amount of on-line space in use by his box
group above its allowable limit will cause his job to be
aborted.
The box identification table contains an entry for each
uniquely numbered box containing user data chips. An
entry tells which box group owns the box, where that
box is stored (on-line or off-line), which chip slots are
used in the box, and the date of its last use.
The file position table describes the current contents
of the 1360 file module, defines the use of each pocket in
the file, and gives the identification number of the box
stored in it.
The data set table contains an entry for each of the
named collections of data stored in the chipstore. Status
and accounting information is kept with each data-set
table entry. Each active entry also points to the list of
subsets collected under that data set.
The subset list table contains the lists of named subsets
belonging to the entries in the data set table. A subset
entry in a list gives the name of the subset, the address
of the data making up that subset, and status information about the subset.

Design of Very Large Storage System

These tables are accessed through a special PPU task
processor program called DPR. This processor reads or
writes the entries in the tables as directed. However,
if the tables are to be written, special checks and
procedures are used to aid in their protection. Twice
daily the entire contents of the MSS disk pack are
copied onto magnetic tape. This is backup in case the
data on the pack are lost.
All communication to the chipstore across the data
channel link is handled through another PPU task
processor program called 1CS; 1CS is multiprogrammed
so that it can be servicing more than one job at a time.
Part of its responsibility is to schedule the requests of
the various user jobs to make most effective use of the
system. For instance, jobs requiring a small amount
of data are allowed to interrupt long read jobs. Algorithms· for overlapping box moving, chip reading, and
chip writing are also used to make more effective use
of the hardware.
1CS and DPR act as task processors for jobs residing
in the central memory of the 6600. The jobs use the
MSSREAD subroutine (to read from the chips tore) or
the COPYMSS system utility to interface to these task
processors. These central memory codes are described
below.
The reading of data from the chipstore to a job in
central memory is handled by a system subroutine
called MSSREAD. The addresses of the data to be read
and how the data are to be transmitted are given to
MSSREAD in a data-definition file. This file is prepared
prior to the use of MSSREAD by the COPYMSS
program described later; MSSREAD handles the
reading of data from magnetic tape, from disk files, or
from the chipstore. If the data address is the name of a
tape or disk file, MSSREAD requests a PPU to perf.orm
. the input of the data from the device a record at a tIme.
If the address is for data recorded in the chipstore, it
connects to 1CS, and working with that PPU code, takes
data from the chipstore, decodes the in-context structure and supplies the data to the calling program.
A'system program called COPY1\,fSS is responsible
for supplying the user with four of the more common
functions in MSS. It processes the MSS data-manage-

TABLE I-Distribution of MSS Implementation Effort.
Operation
Procurement and Evaluation
System design
Software coding
Software checkout
Maintenance, documentation, etc.

Man-years

1.0
2.8
1.7

0.8
1.2

49

TABLE II-MSS Usage Per Week.
Number of read jobs
Number of write jobs
Chips read
Bits read
Unrecoverable read errors
Chips written
Percentage down time

250
100

11500
5.4X1010

15
1900
8.5

ment control language to construct the data-definition
file for MSSREAD. It performs simple operations of
copying data from the chipstore to tape or disk files. It
prepares reports for a user, listing the status of his data
sets and subsets. Finally, COPY1\,fSS is the program
that writes the data onto film chips in the chipstore.
To write data to the chipstore, the user must prepare
his data in the record and file structure he desires. He
then uses the MSS control language to tell COPYMSS
what the data set and subset names of the data are to be
and where the data can be found. COPY1\1SS inserts the
required control words as the data are sent through 1CS
to the chips tore to be recorded on film chips. After the
chips have been developed, 1CS rereads the data to
verify that each chip is good. If a chip is not recorded
properly, it is discarded and the same data are written
onto a new chip. When all data have been successfully
recorded and the chips are stored in the home positions,
COPYMSS uses DPR to update the disk pack tables,
noting the existence of the new data set-subset.
The remaining parts of the 1\1SS software include
accounting procedures, recovery programs, and programs to control the transfer of data between on-line
and off-line storage. These programs, used by the
computer operations group, are not available to the
general user.

RESULTS AND CONCLUSIONS

Effort
A total of about 7.5 man-years of work was invested
in the Mass Storage System at LRL-Berkeley. The
staff on the project was composed of the authors with
some help from other programmers in the 1\,fathemati~s
and Computing Department. The breakdown of thIS
effort is shown in Table I.

Operating experience
The Mass Storage System has been in production
status since June 1969. Initial reaction of most of the

50

Fall Joint Computer Conference, 1970

TABLE III-Comparison of Storage Devices at LRL-Berkeley

On-line capacity (bits/device)
Equivalent reels of tape
Cost of removable unit
Storage medium cost (¢/10 3 bits)
Average random access (sec)
Maximum transfer rate (kilobits/sec)
Effective transfer rate&
Approximate capital costs (thousands
of dollars)
Mean error-free burst length (bits)

MSS

CDC 607
tape drive

CDC 854
disk pack

IBM 2311
data cell

CDC 6603
system disk

3 .3X 1011
2750
$13jbox
0.008
3
2000
1100
1000

1.2XI08
1
$20/reel
0.017
(minutes)
720
500
100

4.8X107
0.4
$500/pack
1.0
0.075
1330

4.5X10 8
3.75

35

3.0X109
25
$500/cell
0.17
0.6
450
200
220

1.6XI09

2.5XI07

>1010

109

>1010

0.125
3750
400
220

&Based on usage at LRL-Berkeley; the rates given include device-positioning time.

users was guarded, and many potential users were slow
in converting to its use. As a result, usage was only about
2 hours a day for the first 3 months. Soon after, this level
started to increase, and at the end of one year of
production usage a typical week (in the month of June
1970) showed the usage given in Table II.
IVlost of the reading from the chips tore is of a serial
nature, though the use of the random-access capability
is increasing. Proportionally more random access
activity is expected in the future as users become more
aware of its possibilities.
A comparison of the 1\188 with other data-storage
systems at the Laboratory, shown in Table III, points
out the reasons for the increased usage. For large
volumes of data, the closest competitor is magnetic tape
(assumed here to be full 2400-foot reels, seven-track,
recorded at 800 BPI).
The values shown in Table III are based on the
following assumptions: on-line capacities are based on
having a single unit (e.g., a single tape drive); capital
costs are not included in the storage medium costs;
effective transfer rates are based on usage at LRL, and
are very low for the system disk because all jobs are
competing for its use; and all costs given are only
approximate.
The average data-transfer rate on long read jobs
(involving many chips and many boxes) is more than
one million bits per second. This is decidedly better than
magnetic tape. Short reads go much faster than from
tape once the 3-sec access time is complete.
The biggest selling point for the Mass8torage
System has been the extremely low data-error rate on
reads. This rate is less than 1/60 of the error rate on
magnetic tape. The second most important point has
been the potential size of the data files stored in the
chipstore. Several data bases of from 20 to 200 boxes

of data have been constructed. Users find that having
all their data on-line to the computer and not having to
rely on the operators to hang tapes is a great advantage.
Their jobs run faster and there is less chance that they
will not run correctly.
The cost of storing data on the chipstore has proven
to be competitive with magnetic tape, especially for
short files or for files that will be read a number of times.
Users are. beginning to find it profitable to store their
high-use temporary files on the chipstore.
The system has not been without its difficulties.
Hardware reliability has at times been an agonizing
problem, but as usage increases and the engineers gain
more experience on the hardware, the down time for the
system has decreased significantly. We now feel that
5 percent down time would be acceptable, though less
would be preferable. Fortunately, lack of hardware
reliability has not affected the data reliability.
CONCLUSIONS
Though intended primarily as a replacement for
magnetic tape in certain applications, the MSS has
shown other benefits and capabilities. Data reliability is
many times better than for magnetic tape. Some
applications requiring error-free storage of large
amounts of data simply are not practical with magnetic
tape, but they become practical on the chipstore. The
nominal read rate is faster than that of magnetic tape
for long serial files. In addition, any portion of a file is
randomly accessible in a time ranging from a few
milliseconds to 5 seconds.
The MSS is not without its limitations and problems.
The 1360 is a limited-production device: only five have
been built. It uses technologies within the state of the
art but not thoroughly tested by long experience.

Design of Very Large Storage System

Keeping the system down time below reasonable limits
is a continuing and exacting effort. Development of both
hardware and software has been expensive. The software
was a problem because the chips tore was a new device
and people had no experience with such large storage
systems.
The Mass Storage System has met its purpose of
increasing the productive capacity of scientists at the
Laboratory. It has also brought with it a new set of
problems, as well as a new set of possibilities. The
biggest problem is how to live with a system of such
large capacity, for as more and more data are entrusted
to the chipstore, the potential loss in case of total failure
increases rapidly. The MSS offers its users important
facilities not previously available to them. More
important, the age of the very large Mass Store has
been entered. In the future, the MSS will become an
important tool in the computing industry.

51

REFERENCES
1 M H ALSTON S J PENNY
The use of a large photodigital mass store for bubble chamber
analysis
IEEE Trans Nucl Sci Volume NS-12 4 pp 160-163 1965
2 J D KUEHLER H R KERBY
A photo-digital mass storage system
AFIPS Conference Proceedings of the Fall Joint Computer
Conference Volume 29 pp 735-742 1966
3 L B OLDHAM R T CHIEN D T TANG
Error detection and correction in a photo-digital storage
system
IBM J Res Develop Volume 126 pp 422-4301968
4 D P GUSTLIN D D PRENTICE
Dynamic recovery techniques guarantee system reliability
AFIPS Conference Proceedings of the Fall Joint Computer
Conference Part II Volume 33 pp 1389-1397 1968
5 R M FURMAN
IBM 1360 photo-digital storage system
IBM Technical Report TR 02.427 May 15 1968

Design of a megabit semiconductor
memory system
by D. LUND, C. A. ALLEN, S. R. ANDERSEN and G. K. TU
Cogar Corporation
Wappingers Falls, N ew York

INTRODUCTION

insertion and extraction of cards a mechanical assembly
is also included. The card connectors are mounted on a
printed circuit interconnection board. All necessary
system wiring is done on the outside surfaces of this

This paper describes a 32,768 word by 36 bit word
Read/Write Memory System with an access time of
250ns, and a cycle time of 400ns.
The memory system is based on IV[OS technology for
the storage array and bipolar technology for the
interface electronics. A functionally designed storage
array chip with internal decoding minimizes the number
of external connections, thereby maximizing overall
system reliability. The average power dissipation of the
overall system is maintained at about OAmw per bit
including all support circuitry dissipation. This is based
on a card configuration of 102 modules with a maximum
module dissipation of 600mw.
System status
At present test sites containing individual storage
array chip circuits and single bit cross sections have
been processed and are being evaluated. Although
initial test results are favorable sufficient data has not
been accumulated to verify all design criteria. Sourcedrain storage array chip production masks are in line
with other levels nearing completion. Layouts of the·
bipolar support chips are complete and ready for
generation of production masks.

1~

Lw~~

System description

CONNECTIONS

An isometric view of the complete 32,384 word by 36
bit memory system is shown in Figure 1. The total
volume occupied by the system is 0.6 cu. ft., resulting in
a packing density of approximately 2 million bits/cu. ft.
A mechanical housing is provided for the eight multilayer printed circuit cards that contain the memory
storage elements and peripheral circuits. To facilitate

'.... 'J,......
! /'.AIRt t
FLOW

I

I

MEMORY SYSTEM ASSEMBLY
(8 CA RD )

Figure 1-Memory system assembly

53

54

Fall Joint Computer Conference, 1970

board with voltage distribution accomplished by the
internal planes. Additional edge connectors are mounted
in this board to accommodate I/O signal cabling via
plug-in paddle cards. Power connections are provided
at the outermost edge of the board.
Since the purpose of this design was to provide a
large, fast, low-cost system for use as a computer main
frame memory the following design constraints were
observed:

8.80 IN.
,-

A one megabit capacity was chosen to be representative of the size of memory that is applicable to a fairly
large, high-speed processor. It was decided that the
system should be built from modular elements so that
memory size and organization could be easily varied.
An additional advantage of ease of servicing and
stocking accrued from this approach.

Speed

A balance between manufacturability and system
requirements was established in setting the performance
objectives. This tradeoff resulted in a goal of 250ns.
access time and 400ns cycle time.

Density

The density of memory cells should be maximized in
order to create minimum cost per cell. An objective of
1024 bits of information was chosen as a reasonable goal
using present LSI technology on a .125 in. X .125 in.
chip. In order to keep the I/O signal count within
reasonable bounds it was decided that address complementing and decoding should be included within the
chip. The chip was structured 1024 words by one bit.
Memory card

A drawing of the basic modular unit, the memory
card, is shown in Figure 2. The card is a multilayer
printed circuit unit with two external planes for signal
wiring and two internal planes for distribution of the
three required voltages and ground. Ninety-eight
double sided connecting tabs are situated along one
edge of the card on a .150 in. pitch. These tabs provide
for a mating connection with the edge co.nnectors
mounted on the interconnection board, and serve to
electrically connect all supply voltages and signal wiring

~I II-

-,

.96CM

~=~o~~~~~~~~;-

'~ """"c TYP

A A A A A A A A A P P
A A A A A A A A A P P

(39)

A A A A A A A A A P P
A A A A A A A A A P P

Ill£D
W
L

MEMORY
CARD

§

A A A A A A A A A P P B

~

A A A A A A A A A P P B
A A A A A A A A A P P CL ___ .-R (~rp

....

~

Capacity

22.35 CM

l
g\

f'i ...- _ D~LAY

A A A A A A A A A P P
B B L.t

S/L S/L S/L S/L S/L S/L S/L S/L S/L

t:~OIIIIIIlmmIIlIlIImlDl_IIIIIIHIIIIIIII.~

CONNE~TOR3

~

-11-.100 TYP

-l.

L ___________ J

TYP

(3)

..I:~ ~~

:=P
,BOARD
- -

::b
' • ,

=-r;;!;L
fi

Figure 2-Memory card

to the card. The modules mounted on the card contain
one or two chips each, solder reflow bonded to a wiring
pattern on a ceramic substrate. Each module occupies a
0.7 in. square area. The 72 modules marked "A" cont~in
the storage array with two chips of 1024 bits each
included in each module. The "B" modules provide the
primary stages of bipolar buffering while the "P"
modules contain the secondary bipolar buffering and
decoding. :l\{odules "CL" and "DEL" provide for timing
generation while the remaining "S/L" modules perform
the sense amplification and latching functions.

Logic design

Memory system logic design was based on the
modular card concept to provide easy upward and
downward variation of total memory capacity. This
card contains all necessary input buffering circuitry,
timing circuits, storage elements, sensing circuits, and
output registers. The card is structured so that smaller
organizations can be obtained by depopulating modules.
TTL compatible open collector outputs are provided to
allow "wired-or" expansion in multiple card systems
such as the 32K word by 36 bit system discussed here.
Unit TTL compatible input loads help alleviate the
problems of driving a multiple card system.

Card logic flow

A signal flow logic diagram for the 8192 word by 18
bit memory card is shown in Figure 3. Thirteen single
rail address lines are required to uniquely determine one

Design of Mega-Bit Semiconductor Memory System

55

- READ/+WRITE o--.!...i-----+-I

B
+A
+A
ADDRESS
INPUTS

+A
+A
+A
+A
+A
+A

(8)

(8)
(8)
(8)
(8)
(8)
(8)
(8)

B
B
B
B
B
B
B
B

(9)

WORD
DECODE
AND
DRIVE

(9)
(9)

ARRAY
32 X 32

(9)
(9)
(9)
(9)

(9)

z

~>0 { .
cU&...,

a:Ci!

§~..J

GND

I&.

Z
0

RESET

(,)

SET
CONFIGURATION {UPPER 112 - EVEN IITS
CONTROL
LOWER 112 - ODD BITI

2,4,_ ETC.
l,lI,5 ETC.

COST PERFORMANCE MEMORY CARD LOGIC
( 8192 WORDS BY 18 BITS )
( WITH MI AND M2 INPUTS GROUNDED AS SHOWN)

Figure 3-Cost performance memory card logic

of 8192 words. Four control lines are required as
follows:
Select-causes selection of entire card.
Read/Write-determine the mode of operation to
be performed.
Set-provides timing for the output data register.
Clock-generates timing for read and write operations as well as timing for cyclic data refreshing.
Thirty-six more lines are used for data-in and
data-out.

array chips. Ten address lines (0-9) drive all storage
array chips on the card in parallel, decoding to one of
the 1024 bits stored on each chip. The remaining address
lines (10-12) are decoded and combined with the timed
Select pulse to create two Row Select signals which
energize two of the sixteen rows of array chips on the
card (two rows of chips per row of modules). Srnce there
are nine array chips in each row, a total of eighteen bits
are read out in each operation. The eighteen bits are
transmitted to eighteen combination differential sense
amplifier and latch circuits which are, in turn, wired to
the card connector interface.

Read operation signal flow

All input lines are buffered immediately upon entering
the memory card. A second stage of address buffering is
included on the card to allow fan out to all 144 storage

Write operation signal flow

Cell selection is performed in the same fashion during
a write cycle as in a read cycle. However, instead of

56

Fall Joint Computer Conference, 1970

sensing the differential pairs associated with each bit
position as in a read operation, the lines are pulsed by
one of a pair of bit driver circuits. The magnitude of this
excursion is sufficient to force the selected cell to the
desired state as indicated by the condition of the
data-in line.

Storage array chip logic organization

The storage array chip is organized in a 32 by 32
matrix of storage cells. Five input address lines are
complemented upon entering the chip and then
selectively wired to the word decoder drivers to provide
a one-of-32 selection. These word drivers are also gated
by Row Select so that only storage cells on a selected
chip are energized. The remaining one-of-32 decoding
function is performed on the cell outputs using the
remaining five input address lines. The 32 outputs of
this final gating stage are wire-ored together to the
single differential pair of output bit lines.
Tillling structure

Because the array chip is operated in a dynamic
fashion, it is necessary to provide several timed lines
for periodic refreshing of data and for restoration of the
array chip selection circuits after a read or write
operation. To minimize the number of lines required at
the system interface, the timing generation circuits and
delay lines are included on each memory card. These
functions are implemented with nonsaturating current
switch circuits for minimum skew between timed pulses.
Tapped delay lines are used to chop and delay the input
clock pulse. A total of four timing pulses are generated
as described below:

(nl)

160
I

I

ADDRESS

.J

ENABLE

--.J

400
I

_ 1 _ _ _ _ _ _ _'

_____---....r
REFRESH

SELECT

SELECT I
REFRESH
RESTORE

Figure 4-Storage array chip input timing

the selection circuit node capacitances that were
discharged during the immediately preceding operation,
and for the node capacitances of the storage cells
themselves.
A diagram showing the relative timings of array chip
input lines is shown in Figure 4.
A timing chart for the memory system interface is
shown in Figure 5. It can be seen that two timed lines
are required at this interface. The first is the Clock line
from which all the aforementioned timings are derived.
The second is the Set line which latches array data into
the output register.

Systelll operation

A block diagram for the complete 32K word by 36 bit
memory system is shown in Figure 6. Eight memory

.

T

Row Select: This line is used to turn on the array
chip word and bit selection circuits during a read or
write operation.
Refresh: This line is timed to follow the Row Select
line and energizes all word selection circuits to refresh
the array data.
Enable: The address inverters on the array chip are
enabled by this line during a normal read or write
operation. During the refresh portion of the cycle the
absence of this pulse disables the address inverters so
that all word selection circuits are simultaneously
energized. This permits refreshing of data in all storage
cells.
Restore: This line gates on load devices in all array
chip selection circuits· during the refresh portion of the
cycle. These devices provide a recharging path for all

TIME

o

CLOCK

-

100

200

300

.

~OO

T

600

w

700

SELECT

Z

~

fUUUUL'

ADDRESS
MOOIFY

~

if/////M

v////u//.

READ I
WRITE

~~

SET

.

w

WRITE
READ

Ii:a..- ~

DATA OUT

V//A!

ACCESS TIME •

2~0

nl

DATA IN

1.----...

1. . - -

READ - - - -....

WRITE

----t.1

TIMING DIAGRAM FOR COST PERFORMANCE
READ-WRITE MEMORY SySTEMS

Figure 5-,-Timing diagram for cost performance read-write
memory systems

Design of Mega-Bit Semiconductor Memory System

SET

~

SELECT 0

~

DATA IN
(I - 18)

READ I WRITE

-~

SELECT 2

SELECT

~

~

~
8192 WORD
BY
18 BIT
MEMORY CARD

~

8192 WORD
BY
18 BIT
MEMORY CARD

~

8192 WOR!).
BY
18 BIT
MEMORY CARD

V

~~

DATA OUT

(I - 18)

'/

8192 WORD

~~

~ ~

DATA IN
(19 - 36)

r\
r\

CLOCK

SELECT 3

~

V

I

ADDRESS
(13 LINES)

8192 WORD
BY
18 BIT
MEMORY CARD

~

1\

-

~~
1-1-

l\~

~

1\

~~
tLz:

'''~'T

MEMORY CARD

t!>:l

8192 WORD
BY
18 BIT
MEMORY CARD

~

8192 WORD
BY
18 BIT
MEMORY CARD

~

8192 WORD
BY
18 BIT
MEMORY CARD

~~

DATA OUT
(19 - 36)

""

Figure 6-Memory system block diagram

cards, each containing 8192 words by eighteen bits are
interconnected as shown to form the total system. All
cards are addressed in parallel with four mutually
exclusive Select lines energizing one pair of memory
cards each cycle. Each card output is "wire-ored" with
three other card outputs to expand word depth from
8192 words to 32,768 words.
lV[aximum access time is 250ns as measured from the
+ 1.6 volt level of the input Clock leading edge transition. IVIinimum allowable cycle time is 400ns. and is
measured in a similar manner from one leading edge
Clock transition to the next. Since the Clock line
provides refreshing of data, it is also necessary that a
maximum Clock repetition time of 1.2~s be maintained
to avoid loss of information.

Circuit design
In the design of LSI memories the most important
costs to be minimized are as follows:
Unmounted chip cost per bit
Chip carrier cost per bit
Printed circuit card cost per bit
Support costs per bit

57

The chip cost per bit is largely a function of the area
of processed silicon required per bit of storage, the
process complexity as measured by the number of
masking or diffusion steps, and the chip yield. All of
these factors strongly favor a MOS-FET chip process
over bipolar process. For a given chip size the chip
carrier costs, the printed circuit cost and the support
costs are all inversely proportional to the number of bits
per chip, thus the advantage of high~density MOS-FET
array circuitry is overwhelming.
The chief drawback to MOS-FET circuits for semiconductor memories is their low gain-bandwidth
compared with bipolar circuits using equivalent geometric tolerances. This shortcoming can be minimized
by using bipolar circuits to provide the high-current
drives to the MOS-FET array circuits, and by using
bipolar amplifier circuits to detect the low MOS-FET
sense currents. If the circuits are partitioned so that all
the devices on a given chip are either bipolar or MOSFET, no additional processing complexity is added by
mixing the two device types within the same system.
The use of bipolar support circuits also allows easy
interfacing with standard bipolar logic signals, thus
the interface circuits can match exactly standard
interface driving and loading conditions.
Given an MOS-FET array chip, the two most
important remaining choices involve the polarity of the
MOS-FET device (n-channel or p-channel) and the gate
oxide thickness. It is well known that the transconductance of n-channel devices is approximately three
times that of equivalent p-channel device and thus the
current available to charge and discharge capacitance is
SUbstantially greater. Since the substrate is backbiased
by several volts in an n-channel device, the source-tosubstrate and drain-to-substrate capacitances are also
slightly lower, with the net result that n-channel
circuits are a factor of two to three faster than equivalent p-channel circuits. This speed difference is critically
important if address decoding and bit/sense line gating
are to be included on the MOS-FET chip. Because the
transconductance of a MOS-FET device, and consequently its ability to rapidly charge and discharge a
capacitance, is inversely proportional to the gate oxide
thickne~s, it is advisable to use the minimum thickness
that the state of- the art will allow; in this case 500
Angstroms was chosen as the minimum that would give
a good yield of pinhole free oxide with adequate
breakdown voltage. Other key device parameters are
tabulated below:
V t = 1.25V nominal with substrate bias
psub = 20cm P type
'Ym

pd

= 33.5~a/v nominal
= 70/square N type

58

Fall Joint Computer Conference, 1970

r - --- - WOiti::"TWOito
------------ARRAV:::---l
I
I NYERTER DECODER
RESTORE
IIV
52 UNITSJI
I R ~1t.NIA8LE II UNITS 52 UNITS:

J--

j

I __
1 - - - " -- ~R-i cs I"

~~~ty

II

IOZ4

SAR~l

~

'
I

~

~

IOV

~

I

__

L--+---+--+-':::"'--"'::::::~~~ FO'5"i-

~

FO'18

I fuaE I

I

I

~----..,-~-----,

INVERTER
II UNITS
10V

I

:

UNITS~

I

DECOD£R
52 UNITS

I

I
1

1:~I_$j
. . '. . i." -- --------~
I

1

I

L

~

-i

-_~

I

-i

~

~C"

1•
I

UN I TI

I

~~." ORO" I
8/S PAIRSJ

______ L _____________ _

cross-section circuit schematic of the array chip.
Included below are nominal chip parameters:
Address input capacitance ... (including gate protective device) 4pf
Enable input capacitance. . . (depending on address) 2.75 pf or 20pf
Restore input capacitance ... (including gate protective device) 57pf
Sense line input capacitance ... 5.5pf
Select input capacitance ... 8pf
Word line capacitance ... 7.5 pf
Bit line capacitance ... 2pf
Sense current ... 150JLa
l\1aximum gate protective device input 3400V

Figure7-Array chip cross-section

Storage cell
Chip partitioning

Since it was desired that the same chip set be used to
configure memory systems of different sizes, different
word lengths, and different system speeds, many of the
chip partitioning choices are obvious. The timing
circuits, which are used only once per system, are
contained on a separate chip. The sensing and bit/drive
circuits are combined on one chip to allow easy expandability in the bit dimension. The array drivers are
contained on a third chip type to allow easy expansion
in the memory size, while general buffering and gating
logic make up the fourth chip type. The most important
chip-partitioning choice involves the dividing line
between bipolar and MOS-FET circuits at the array
chip interface. By including the array word-line
decoding and the array bit/sense line gating on the
array chip, the number of connections to the array chip
can be greatly reduced, allowing the chip carrier wiring
to be less dense and the chip pad size and spacing to be
relaxed. The complexity of the bipolar support circuitry
was reduced still further by including the address
inverters on the array chip, with a small penalty in
delay. Ifa MOS-FET sense amplifier/bit driver were
included on the array chip, however, the increase in
delay would be excessive, owing to the poor response
time of MOS-FET high-gain amplifiers. In the design
shown here, the cell sense current is gated to a bipolar
sense amplifier for amplification and discrimination, and
the cell nodes are driven through the same l\10S-FET
gating circuits to the desired state during the write
operation. This arrangement requires that approximately 35 percent of the available array chip area be
used for decoding and gating circuits, with the remaining
65 percent used for storage cells. Figure 7 shows a

Typical MOS-FET storage cells are shown in
Figure 8. In ceIl8(a), Tl and T2 form the cross-coupled
pair, while T 3 and T 4 gate the external circuitry to the
cell nodes, either to sense the state of the cell by
detecting the imbalance in the current through T 3 and

+v

BI T I SENSE

BIT I SENSE

~---------+-.

WORD DR I VE

(0 )

+v

BIT I SENSE

BIT I SENSE

~---------+--. WORD DRIVE

(c)

Figure 8-Storage cell configurations

Design of Mega-Bit Semiconductor Memory System

THIN OXIDE

L

I...
I

W----e-I

l _ _ _ _ _ _ _ _ _ _ _ -'

Figure 9-W /L ratio

T 4 or to write into the cell by pulling one node to ground
while simultaneously driving the other cell node positive. The load devices, T5 and T 6 , replace the leakage
current from the more positive node during stand-by.
Since one of the load devices has full voltage across it at
all times, the standby power dissipation of the cell will
be quite high in comparison to the cell sense current
unless the W/L ratio, Figure 9, of the load device
(T 5, T 6) is made very small compared to the W /L ratio
of the cross-coupled device (T I , T2)' This, in turn,
requires that either the load devices or the active
devices or both occupy a large chip area. In addition,
the standby load current flowing through the on-biased
active device provides a voltage drop across that device,
tending to unlatch the cell. This effect can be compensated for by increasing the value of all device
thresholds, however, this will require a higher supply
voltage to maintain the same standby current thereby
increasing the power dissipation.
In cell 8(b), the standby power is reduced by pUlsing
the "restore" input at a clock rate sufficiently fast to
replace leakage current from the cell node capacitance,
while maintaining a low average power drain. The chief
drawback to this cell is the five connections must be
made to the cell, with a resulting increase in cell
complexity over (a) above.
Cell 8(c) shows the configuration chosen for this
memory. In this cell, both the word selection and the
restore functions are performed through the same
devices and array lines, by time sharing the word-select
and restore line. During read-out, the cell operation is
similar to 8(b) above. At the end of each memory cycle,
however, all word lines are raised to the "restore" level
for a period sufficient to recharge the cell node capacitances, then all word lines are dropped and the next
memory cycle can begin. Selection of the "restore" level
is dependent on the speed at which the cell node
capacitance is to be charged and the sense line voltage
support level required during restore. Too high a
"restore" level creates a large current flow thru the
restore devices lowering the sense line voltage used to
charge the cell; too low a voltage prevents the cell node

59

capacitance from reaching the required voltage for data
retention. This cell employs fewer devices and less
complex array wiring than either of the cells above, and
thus requires substantially less silicon area. The
disadvantage of this approach is that the restore
function must be synchronized with the normal read/
write function since they share the same circuitry. The
average power cannot be made as low as in (b) above,
since the restore current and the sense current are both
determined by a common device, and the restore
frequency is determined by the memory cycle time;
however, the average power can be made significantly
lower than with the static cell 8(a) above.
MOS-FET support circuits
The MOS-FET support circuits employed on the
array chip are shown in Figure 10. A better understanding of the circuit operation will be gained by first
considering the MOS-FET inverter circuit (Figure 10).
At the end of a read/write cycle, the input address level
is held down, the E level is down, and the R line is
pulsed positive, charging node A to approximately +7
volts. When the R pulse has terminated, node A
remains positive awaiting the next input. At the start
of the read/write cycle, the address input is applied to
T I ; if the address line is positive, node A quickly
discharges through T I , and when E is applied to T 3,

+v
SELECT I REFRESH

~~

ADDRESS

r
I

4

3

2

5

(b)

+V

E

R~ftr~l
TI

ADDRESS I ""ur ...:.--.-.j

I

T3

4 I-

..

-:-

:±= CL
(a)

Figure lO-Array chip inverter-decoder circuits

6()

Fall Joint Computer Conference, 1970

T 3 remains non-conducting and the address inverter
output remains at ground potential. If, however, the
address input line is a down level, then node A remains
charged to +7 volts, and both Tl and T2 are cut-off,
while T 3 is biased on. When a positive E pulse is
applied to T 3, current is capacitively coupled into
node A from both the E node and from the output node,
with the result that node A is driven more positive than
either; thus T 3 remains strongly biased on, charging the
output node capacitance to the level of E. When the
positive E pulse is terminated, the same action quickly
discharges the output to ground through the E line. At
the ead of the address pulse, a positive R pulse is again
applied to T 2, restoring node A to
7 volts. This
regenerative inverter has several advantages over a
conventional source follower circuit; (a) the output up
level is set by the level of the E input, and does not vary
with the device threshold voltage; (b) the output rise
time is nearly linear, since the gate-to-source bias on T3
remains well above the threshold voltage throughout
the transition, and (c) this same high conductance
output device can be used to both charge and discharge
the load capacitance. Since the leakage current from
node A during a cycle is negligible, the final potential
of node A, and thus the output drive current, is
determined by the capacitor-divider action of the
gate-to-source, gate-to-drain, and gate-to-substrate
capacitances associated with device T 3. Any of these
capacitances can be artificially increased to optimize the
circuit operation. The operation of the decoder circuit
(Figure lOb) is similar to the inverter just described,
with the bi-Ievel chip select/refresh line replacing the E
input discussed previously. Thus, a single word line is
selected to the higher (Select) level during the Read/
Write portion of the cycle, while all word lines are
selected to the lower (Refresh) level during the Restore
portion of the cycle. Thus the cell input/output devices
are biased to a low impedance to provide maximum
sense current during readout, and to a higher impedance
to reduce the power dissipation and maintain the
necessary sense line voltage during the restore operation.
Protective devices are used on all gate inputs to the
array chip to reduce the yield loss from static electricity
during processing, testing, and assembly. Because the
array chip uses a P-epitaxy grown on a P+ substrate it
was possible in this system to replace the usual RC
protective device with a more favorable zener type. This
device is an N diffusion diffused at the same time as
the source-drain diffusions and exhibits a low internal
impedance when its depletion region intersects the P+
substrate. The required reverse breakdown voltage is
obtained by controlling the depth of the N diffusion.
When driven with an impedance equivalent to a human
body, approximately 1000 ohms, gate protection is

+

+

+

1-----1

i

's VB.

i

h

NUMBER OF

I

IA~~~~~~~~sl
1 INFINITY I
1

1

I

I

I

1
1

V.

ATE

Figure 11-Gate protective device

provided for input levels up to 3400 volts. Figure 11 and
equation 1 represent the characteristics and operation
of this type protective device as presented during the
IEEE, International Electron Devices lVleeting, October
29 thru 31, 1969. 1 For analysis the device is arranged as
a series of distributed elements; each element containing sub-elements rs, ra, and V BR.
Vgate = V BR+ {(V in - V BR)(rsra)I/2 [cosh (rs'Y /ra) 1/2]-1 } /
R8+(rSra)1/2

(1)

In this design 'Y, the number of elements, was set at
nine with the following sub-element values:
rs

= 4.27 ohms

r

= 61.2 ohms

V BR = 30 volts

The maximum capacitance before breakdown
1. 25pf.

IS

Bipolar support circuits

Because of the critical timing relationships required
among the Select/Refresh, Enable, Address, and Restore pulses to the array chip, all timing pulses are
generated on each card by a special timing chip and three
tapped delay lines. This arrangement· allows each card
to be fully tested with the timing circuits that will drive
it, and minimizes any interaction between cards in a
multi-card system.
The TTL compatible buffer chip allows interfacing
conveniently with the TTL compatible logic of the using
system, and 'minimizes the loading which the memory
presents to the system.
A schematic cross-section of the Drive and Sense
circuits is shown in Figure 12. The Driver module, when
addressed, selects a row of nine Array Chips from a low
impedance TTL output. The ten address inputs to the

Design of Mega-Bit Semiconductor Memory System

TO 7 0'1'11111 AIIIIAY C:III.'

61

CROSS SEeTION
HIVE -SENSE SCHEMATle

DIAGRAM
FIG. 1-2

"'ST'"

IIiADLI

--- - -------- - -- - - -------- ------ - ---- ---'-i
~

~.:a

7

II_.ILICT
'1'1._111-,
o-+-~,.,.,."

:::.- :!

1I
1

-

I till"
·111

I

I

1

0-+-- - - - - '

I

AaO-'-----'

1

.. 0-+------'

'1

I
1
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ .J1

r- - - - - - - - - - - - - - - - - - - - - - - - - - Vet

DATAIII-

81,.

KIlO

ZlNO
DIIIVI
OUT

-FLOIII

- ------

--------------------------,

-

II.
UI

DI'I'

_VI

OUT

O-~:--~~----~-L==~====::~====~==~====~====t===~======~==~-4----~----~----~

r- ___________ J

I _________
111111 LATCII "DULl
L
.,

1
1
I

1

1

I

I
L-------"3I

LATCII ~:;;.::=:::-r--<>

I

I

1

I
ON.

L

___________________________________________
NT

IIIIIT

III

..

JI

Figure 12-Cross-section drive-sense schematic diagram

Array Chip serve to select one bit of the 1024 bits/chip.
The write pulse permits the data-in to be loaded
differentially into the single bit which has been addressed. The removal of the write pulse turns off both
the "one" and "zero" bit drives with the low impedance
active pull ups rapidly charging the capacitance of the
bit lines to a voltage level required for the re~d mode of
operation.
The sense amplifier requires a minimum differential
signal of 50 micro amps to detect a "one" or "zero"
stored in the addressed bit. This information is transferred to a set-reset latch which is included to increase
the "data-good" time to the using system.
During a portion of every cycle not used for read/
write operation the timing chip provides refresh and
restore timing pulses which turn on all the driver modules on the memory card to a lower voltage level, and
perform the refresh operation previously discussed.
All four of the bipolar support chips are packaged
one-chip per chip carrier, to allow flexibility in configuring various size memory systems. In all cases,
the power density is limited to 600 mw per chip carrier,

15-0

140

130

1\

...

~

110

!

1

I

CARD S

\

l~lP~~9)"H,8.8"W

I

~
~

~

90

200

250

I

CARD PITCH" 0.6"
MODULE PI.TCH • 0.7"
Tin· 50·C

~

100

80
150

I

VAR 1AT 1ON OF JUNC T 1ON TEMPERATURE
WITH
VELOC ITY AND INLET TEMPERATURE

300
-

350

~ t--.

400

VELOCITY

450

r--- r---500

550

600

ft I min

Figure 13-Variation of junction temperature with velocity and
inlet temperature

62

Fall Joint Computer Conference, 1970

a level which allows for convenient forced-air cooling.
Because the limiting heat factor is the junction temperature of the bipolar support circuits an cooling
considerations are in respect to this parameter. Figure
13 illustrates the junction temperature as a function
of air flow thru the system.
CONCLUSION
The memory system described here is but one of many
possible sizes and organizations that can be created
using the same modular approach. If desired, several
smaller organizations can be used within the same
system without significant cost penalties. The system
approach to memory design has created an optimum
condition wherein each individual component is

matched to the other components with which it must
interact. This approach also yields a memory with
a simple, effective, easily usable set of interface requirements. It is anticipated that increasing yields
will allow prices competitive with magnetic storage
for high-performance main memories. This low cost,
coupled with high performance and density, makes
a powerful combination for use in future system designs.

REFERENCE
1 M LENZLINGER
Gate protection of MIS devices
Presented at International Electron Devices Meeting
Washington D C 1969

Optimum test patterns for parity networks
by D. C. BOSSEN, D. L. OSTAPKO and A. M. PATEL
IBM Laboratories
Poughkeepsie, N ew York

input Exclusive-OR gate. Each gate, therefore, is
given a complete functional test so that single error
detection means that any error in "one Exclusive-OR
gate can be detected. The following is the definition
of a gate test.

INTRODUCTION
The logic related to the error detecting and/or correcting circuitry of digital computers often contains
portions which calculate the parity of a collection of
bits. A tree structure composed of Exclusive-OR gates
is used to perform this calculation. Similar to any other
circuitry, the operation of this parity tree is subject
to malfunctions. A procedure for testing malfunctions
in a parity tree is presented in this report.
Two important assumptions are maintained throughout the paper. First, it is assumed that the parity tree
is realized as an interconnection of Exclusive-OR gates
whose internal structure is unknown or may differ.
This requires that each gate in the network receive a
complete functional test. Second, it is assumed that
detection of single gate failures is desired.
Since each gate must be functionally tested, an minput Exclusive-OR gate must receive 2m input patterns. It will be shown that 2m test patterns are also
sufficient to test the network of any size, if m is the
maximum number of input lines to any Exclusive-OR
gate. Hence, the procedure yields the minimum number
of test patterns necessary to completely test the network for any single Exclusive-OR gate failure. It will
also be shown, by example, that the procedure is fast
and easy to apply, even for parity trees having a large
number of inputs.

Definition 1 :

A test for a k-input Exclusive-OR gate is the set of
2k distinct input patterns of length k. Figure 1 shows a

three input Exclusive-OR gate, the 23=8 input test
patterns, and the output sequence which must result
if a complete functional test is to be performed.
If the output sequence and the sequences applied to
each input are considered separately, each will be a
vector of length 2k. Thus;-the Exclusive-OR gate can
be considered to operate on input vectors while producing an output vector. Figure 2 shows a three input
Exclusive-OR gate when it is considered as a vector
processor. In terms of vectors, a test is defined as
follows.
Definition 2:

A test for a k-input Exclusive-OR gate is a set of k
vectors of length 2k which, when considered as k sequences of length 2k , presents a1l2k distinct test patterns
to the gate inputs.

GATE AND NETWORK TESTABILITY

Theorem 1:
If K is a test for a k-input Exclusive-OR gate, then
any set M, MCK, having m, 2~m~k-l, elements
forms 2k - m tests for an m-input Exclusive-OR gate.

Since the approach is to test the network by testing
every gate in the network, it is primarily necessary to
discuss what constitutes a test for an individual Exclusive-OR gate. Although it is assumed that the
parity trees are realized as a network of Exclusive-OR
gates, no internal realization is assumed for the Exclusive-OR gates. Hence, it will be presumed that all
2k input patterns are necessary to diagnose a single k-

Proof:

Consider the k vectors in K as sequences. Arrange
the sequences as a k by 2k matrix in which the last m
63

64

Fall Joint Computer Conference, 1970

00010111
00101011
01001101

=:8. .-..

Wo WI W2 W3 W4 W5 Ws

01110001

Figure I-Three input Exclusive-OR gate with test patterns

rows are the sequences in M. Code each column' as a
binary number with the highest order bit at the top.
Since the columns are an distinct according to definition
1, each of the numbers 0 through 2k-1 must appear
exactly once. Considering just the bottom m rows, it
follows that each of the binary numbers 0 through
2m-1 must appe:u exactly 2k - m times. Since each of
the possible sequences of m bits appears 2k - m times,
definition 1 implies that the set M forms 2k - m tests for
an m-input Exclusive-OR gate.

Network testability:

Two conditions are necessary for a network of Exclusive-OR gates to be completely tested. First, each
gate must receive a set of input vectors that forms a
test. Second, anyone gate error must be detectable at
the network output. For the first condition it is necessary that the set of vectors from which the tests are
taken be closed under the operation performed by the
k-input Exclusive-OR gates. The second condition
requires that any erroneous output vector produce an
erroneous network output vector. The structure of this
set of vectors and their generation will be discussed in
the following sections.

AN EXAMPLE

Wo

101 I 100

Wo 0

WI

010 I I 10

WI

W
2
W3

00101 I I

W2

W4

1100101

w5

I I I 00 10

W4
w5

Ws

01

Ws

100 101 I

1001

W5 W3 W2 Ws WI W4
0 Ws W W3 Wo W
4
2
0 Wo W5 W WI
4

~3

0

0

I, ~

= 01001101, .!!. = 01110001

Figure 2-Three input Exclusive-OR gate as a vector processor

w3
0

Figure 3-Test sequences and their addition table

then successively determining input sequences which
test each gate to produce the desired output.
Figure 3 presents the seven sequences and the associated addition table that will be used in the example. Figure 4 illustrates the gate labeling procedure
which will be used to determine the inputs when the
output is specified. Figure 5 shows the parity tree with
57 inputs and 30 Exclusive-OR gates of two and three
inputs arranged in a four level tree. The procedure
generates eight test patterns which will completely test
all 30 gates of the tree.
The procedure is initiated by assigning an arbitrary
sequence to the output of the tree. In the example,
Wo is selected as the final output sequence. Employing
the 3-input gate labeling procedures shown in Figure
4, the inputs are determined to be WI, W 2, and W 4 •
With these three sequences, the gate is completely
tested. These inputs are then traced back to the three
gates in the third level. Using the gate labeling procedure again, the inputs for the gates from left to right
are W 2, W a, Ws; W 3 , Wo; and W s, W 2 • The sequences
assigned to the inputs can be determined quickly and
easily by making use of tracing and labeling. Under
proper operation, each gate is completely tested and a
single gate failure will produce an incorrect sequence

2 -INPUT

= 00010111, ~ = 0010101

w2 Wo
0

The test pattern generation procedure is so simple
and easy to apply that it will be presented by way of
an example before the theoretical properties of the
desired sequences are discussed. The algorithm proceeds by selecting an arbitrary output sequence and

WHERE ~

WI Ws W5

NOTE: Wi

==

3-INPUT

Wi (MOD 7)

Figure 4-Gate labeling procedures

Optimum Test Patterns

at the output. Above each input the required sequence
is listed, and the correct output is the sequence Woo
The test patterns are obtained by reading across the
sequences and noting the correct output. The test is
completed by adding the all zero test pattern. This
should produce a zero output.

65

000000000000000000000000000000000000000000000000000000000
III 100
110 III
Oil 110
001011
101001
010 101
100 010

101 100
010 III
100 110
III 011
110001
011 101
001 010

010001 101 001 110 100
100 101 010 101 011 III
II I 010 100 010001 110
110 100 III 100 101011
011.11 I 110 II I 010001
001 110011 110 100 101
101 OJ 1001 Oil II 1010

010001
100 101
II 1010
110100
Oil '"
001 110
101011

011 110 100 101
001 01 I III 010
101 001 110 100
010101 011 III
100 010001 110
III 100 101 011
110 II I 010001

III III 001
110 110 101
011 Oil 010
001001 100
101 101 II I
010010110
100 100011

4SO 561 013561 602 124013 124 346561 602124 23534651 013 4SO 450 124

THEORETICAL PRELIMINARIES
Consider the set of vectors generated by taking all
mod-2 linear combinations of the k vectors of a given
test set K. This set is obviously closed under mod-2
vector addition. In a parity check tree network an
input of any subset of vectors from this set will produce vectors in the set at all input-output nodes of
the Exclusive-OR gates. Some further insight can be
gained by viewing the above set as a binary group
code. The generator matrix G of this code, whose rows
are k vectors from K, contains all possible k-tuples as
columns. If we delete the column of all O's in G, the
resulting code is known as a MacDonald l code in which
the vector length n is 2k -1 and the minimum distance
d is 2 k - l • The cyclic form of the MacDonald code is
the code generated by a maximum length shift register.2

o
I

o
I
I
I

o
o

Figure 5-Four level parity tree with test patterns

degree kin GF (2). Let g(X) = (Xn-1)jp(X) where
Theorem 2:
Any independent set of k vectors from the Maximum
Length Shift Register Code of length 2k -1 forms a test
set for a k-input Exclusive-OR gate, excepting the
pattern of all O's.
Proof:

n = 2k -1. Then the first vector Wo of the MLSRC is

the binary vector obtained by concatenating k-1
zeros to the sequence of the coefficients of g (X). The
vectors WI, W 3 ••• Wl- 2 are then obtained by shifting
WI cyclically to the right by one digit for 2k - 2 times.
The method is illustrated for k = 3. A primitive polynomial of degree 3 in GF (2) can be obtained from
tables, 2 e.g., X3+ X + 1 is primitive.
g(X) = (X'l-1)j(Xa+X+1) =X4+X2+X+1.

Any independent set of k-vectors from the code
forms a generator of the code. In the Maximum Length
Shift Register Code as well as in the MacDonald Code,
2d-n = 1. This implies*3 that any generator matrix
of the code contains one column of each non-zero type.
By definition 2, this forms the test for a k-input ExOR gate excepting the test pattern of all O's.
Corollary:

Then Wo is obtained from g(X) as
Wo=10 1 1 1 0 0
The sequences WI, W 2 ••• W6 are obtained by shifting
Wo cyclically as,
W1=0 1 0 1 1 1 0

W 2 =O 0 1 0 1 1 1
TV3 = 1 0 0 1 0 1 1

For an m-input gate, m~k, any set of m-vectors
from a MLSRC of length 2k -1 forms a sufficient test.
The proof follows from Theorems 1 and 2.
The maximum length shift register sequences can
be generated2 by using a primitive polynomial p(X) of

* In

Reference 3 it is shown that in a group code with 2d -n =

t > 0, there are t columns of each type.

W 4 =1 1 0 0 1 0 1
W5=1 1 1 0 0 1 0

W 6 =O 1 1 1 0 0 1
Note that when W 2k-2 is shifted cyclically to the right
by 1 digit, the resulting vector is Woo For the purpose
of uniformity of relationship among the vectors we

66

Fall Joint Computer Conference, 1970

introduce the notation: Wi== Wi (mod 2k_l). Now the
following theorem gives a method of selecting independent vectors from a MLSRC.

Theorem 3:

The vectors Wi, W i+l, ... , W i+k-l in a MLSRC of
length 2k -:-1 form an independent set.
Proof:

Suppose g(X) is given by g(X) = grXr+ gr_1Xr-I +
O, where r= (2k-l) -k. Then the set of
vector W o, WI, ... ,Wk - I are given by

... + glX+g

go
gr-l

0

0

0

go

0

0

go

0

o
o
o

(2) and (3) to determine the proper inputs to
the. corresponding gates.
4. The vectors at the input lines to the Ex-OR tree
are then the test input vectors with the correct
output as Wi.
5. An additional all 0 pattern as input to the· network with 0 as correct output completes the
test.

It is easy to see· that the test patterns generated by
the above algorithm provide a complete test for each
Ex-OR gate in the parity check tree. Furthermore, any
single gate failure will generate an erroneous word
which will propagate to the output. This is due to the
linearity of an Ex-OR gate. Suppose one of its inputs is
the sequence Wi with a corresponding correct output
sequence W j. If the input Wi is changed by an error
vector to Wi+e, then the corresponding output is
Wj+e. Clearly, the error will appear superimposed on
the observed network output.
TEST MECHANIZATION

o

o ..

go

Clearly they are linearly independent. Because of the
cyclic relationship, this implies that Wi, W i+I , ... ,
W i +k - 1 are independent.
Corollary:

The vectors Wi+I, Wi+2, ... ,Wi+m- I , and WiEBWi+lEB
... EBWi+m-I, (m~k), form an independent set. With
this as a test to an m-input Ex-OR gate, the correct
output vector is Wi.
As a direct consequence of the above theorems we
have the following algorithm for the test pattern generation for a given Exclusive-OR network.

We have shown that the necessary test patterns for
a parity tree can be determined by a simple procedure
using a set of k independent vectors or code words
W o, WI, ... , W k - l from a MLSRC as the input to
each gate of k inputs. The result of applying this procedure to a network is an input sequence Wi for each
network input and each network output. Testing is accomplished by applying the determined sequences
simultaneously to each input and then comparing the
expected network outputs with the observed network
outputs.
Let the gate having the greatest number of inputs in
the. network show k inputs. The entire test can be
mechanized using a single (2k-l)-stage feedback
shift register. To do this a unique property of the
MLSR codes is used. From this property it follows that
the entire set of non-zero code words is given by the

Algorithm for test pattern generation:

It is assumed that the Exclusive-OR network is constructed in the form of a tree by connecting m-input
Ex-OR gates where· m may be any number such that
m~k.

1. Select any vector Wi from a MLSRC of length
2k -1 as the output of the network.
2. Label the inputs to the last Ex-OR as Wi+I,
W i+2, ••• , W i+m-l, and Wi EB W i+l EB ... EB W i+m-l·
3. Trace each of the above inputs back to the
driving gate with the same vector. Repeat steps

~--Wo

---=,--... W,

L....--._ _ _

------------'----~0,--_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Figure 6-Shift register for generating test patterns

W2 "-3
W "-2
2

Optimum Test Patterns

67

2k - 2 cyclic shifts of any non-zero code word together

with the code word itself.
If a (2k-1)-stage shift register is loaded with a particular code word Wo as in Figure 6, then the sequence
of bits observed at position 1 during 2k -1 shifts of
the register is the code word Woo Similarly for every
other position i, a different code word W i - 1 is observed,
so that the entire set of 2k -1 sequences is available.
Since the correct output of the network is one of the
code words, it is also available at one of the stage outputs for comparison. The general test configuration is
given by Figure 7.
SELF-CHECKING PARITY TREE
Let us suppose that the test sequences and the shift
register connections for a parity network have been
determined as in Figure 7. A modification of this mechanization can be used to produce a self-testing parity
network under its normal operation. The key idea is to
monitor the normal randomly (assumed) occurring
inputs to the network and to compare them with the
present outputs of the shift register. When and only when
a match occurs, the comparison of the outputs of the
parity networks with the appropriate code words is
used to indicate either correct or incorrect operation,
and the shift register is shifted once. This brings a
new test pattern for comparison with the normal inputs. Every 2k -1 shifts of the register means that a
complete test for all single failures has been performed
on the network.

12 "-1

STAGE s.R.l

1

Wo

~~

W2"_3

W2"-2

···

PARITY
TREE

~~

ERROR

Figure 8-Self checking parity tree

The mechanization of the self-checking parity tree
is shown in Figure 8. The inputs to the AND gate
Awl are the set of input lines of the parity tree which
receive the test sequence Wi. The inputs to the AND
gates AWiO are the inverse of the input lines of the parity
tree which receive the test sequence Wi.
An alternate approach to self-checking is to use the
testing circuit of Figure 7 as a permanent part of the
parity tree. The testing is performed on a time-sharing
or periodic basis while the circuit is not used in its
normal mode. This is easily accomplished by having
the clock, which controls the shift register, gated by a
signal which indicates the parity tree is not being used.
This could be a major portion of the memory cycle
when the parity tree under consideration is used for
memory ECC.
CONCLUSION

....
.....,
j

OR

!

ERROR

Figure 7-General testing scheme

I

We have shown that a very low and predictable number
of test patterns are necessary and sufficient for the
complete testing of a parity tree under the single failure
assumption. The required tests are easily and rapidly
determined by an algorithm which is presented. (An
application of this technique is also given for a selfchecking parity tree.) Since the effect of the input test
patterns is a complete functional test of each gate, the
tests are independent of any particular failure mode.

68

Fall Joint Computer Conference, 1970

REFERENCES
1 J E MAcDONALD
Design methods for maximum minimum-distance I error
correcting codes
IBM J of R&D Vol 4 pp 43-471960

2 W W PETERSON
Error correcting codes
MIT Press Cambridge Massachusetts 1961
3 A M PATEL
Maximal group codes with specified minimum distance
IBM J of R&D Vol 14 pp 434-443 1970

A method of test generation for fault
location in combinationallogic*
.
by Y. KOGA and C. CHEN
University of Illinois
Urbana, Illinois

and
K. NAE1VIURA
Nippon Telegraph and Telephone Public Corporation
Musashino, Tokyo, Japan

exhaustively generated tests to isolate failures or near
minimal test generation methods for failure detection,
but their methods are impractical to generate tests
for actual digital machines. Actual test generation
using the method presented in this paper has been
done for the ILLIAC IV Processing Element control
logic, and is briefly discussed.

INTRODUCTION
The Path Generating Methodl is a simple procedure
to obtain, from a directed graph, an irredundant set of
paths that is sufficient to detect and isolate all distinguishable failures. It was developed as a tool for diagnostic generation at the system level, e.g., to test data
paths and register }oading and to test a sequence of
transfer instructions. But it has been found to be a
powerful tool for test generation for combinational
logic networks as well.
The combinational network to be diagnosed is represented as a set of complementary Boolean forms, where
complementation operators have been driven inward
to the independent variables using DeMorgan's Law.
A graph is then obtained from the equations by translating logical sum and logical products into parallel
and serial connections, respectively. A set of paths is
generated from the graph, which is irredundant and
sufficient for detection and isolation of single stucktype failures.
The advantage of this approach to test generation
lies in the irredundancy and isolation capability of the
generated tests as well as the simplicity of the algorithm.
Several test generation methods have been develveloped,2,3,4,5,6 but none attacks the problem of efficient
test generation for failure isolation. Some of these
papers presented methods to reduce redundancy of

PATH GENERATING METHOD
In this section, test generation by the PGM (Path
Generation Method) to a given directed graph will be
discussed briefly.
Let us consider a graph with a single input and a
single output such as that shown in Figure 1. If this
actual circuit has multiple inputs or outputs, we add a
dummy input or output node and connect them to the
actual inputs or outputs so that the graph has only one
input and one output node.
There exist thirteen possible paths from the input
node No to the output node N5 of the digraph in Figure
1, but not all of these are needed to cover every arc of
the graph. We arrive at a reduced number of test paths
in the following manner.
Starting at the input node, we list all the nodes which
are directly fed by the input node, i.e., have an incident arc which originated at the input node, and
draw lines corresponding to the arcs between them.
Level zero is assigned to the input node and level one
to the nodes adjacent to the input node. Nodes directly
connected to the level one nodes are then listed and
assigned to level two. This step is repeated until all

* This work was supported by the Advanced Research Projects
Agency as administered by the Rome Air Development Center,
under Contract No. US AF 30(602)4144.

69

70

Fall Joint Computer Conference, 1970

INPUT

~ ---=:':":::---11

)t---:;O~/1:""-__d

d=o·b·c

a

a

b
c

graph

OUTPUT

o

Figure 1-A directed graph

nodes are covered. If a node has already occurred on a
higher level or previously on the same level, we define
it as a pseudo-terminal node and cease to trace arcs
down from it.
Whenever a path from the input reaches a pseudoterminal node, we complete the path by arbitrarily
c

~ INPUT
No

•
denotes
a' pseudo
terminal node.

complement graph

and 1 denote a stuck-at-one and stuck-at-zero

failure, respectively, and
by the output failure.

*

denotes a masked failure

Figure 3-AND gate and its graphic representation

choosing any route (usually the shortest) which goes
from it to the output. Six paths are obtained from the
digraph in Figure 1 as shown in Figure 2, where shortest paths are selected after reaching a pseudo-terminal
node.
The main advantage of this test generation method
is that the set of paths generated by the PGM is an
irredundant set which is sufficient for detecting and
locating any distinguishable single failure within any
cycle-free graph. It should be noted that any arc in the
graph is assumed to be able to be actuated independently for a test path.

GRAPHIC REPRESENTATION OF
COMBINATIONAL LOGIC

Figure 2-Generated test paths

To apply this PGM to a combinational logic network,
a graphic representation of a combinational logic
which takes into account stuck-type failures must be
used.
An AND gate with three inputs and one output has
possible s-a-1 (stuck at one) and s-a-O (stuck at zero)
failures. A s-a-O failure at output d is indistinguishable

Method of Test Generation

from each s-a-O failure of the inputs a, band c, but there
exist five distinguishable failures, as shown in Figure 3.
Let· us consider the straightforward graphic representations of this AND gate and its complement expression. In this example, a, band c can denote simple
variables or sets of subgraphs representing parts of a
logic network. Note that if the four paths are assumed
to be paths to test the AND gate where these. paths
can be actuated independently, all distinguishable
faults can be detected and single faults can be located.
The graphic representation is slightly modified to
demonstrate this, as shown in Figure 4, where Fd=O
means no such fault that the output d is s-a-O.
It is obvious that anyone of five distinguishable
faults can be located by the four test paths, where only
one test path should be completed for each test. To
generate a set of test inputs, variable valu'es should be
assigned such that only the path to be tested is completed and the rest of the paths are cut off. The test
values for the variables (a, b, c) are determined to be
(1, 1, 1), (0, 1, 1), (1, 0, 1) and (1, 1, 0) for a three
input AND gate.
If one input variable is dependent on another then
normally distinguishable failures may become indistinguishable. For example, if variable a is dependent
upon variable b, then a s-a-l failure at input a and a
s-a-l failure at input b may become indistinguishable
or undetectable.
Whenever anyone of the variables a, b, and c is replaced by a subgraph which represents a part of a logic
network, the same discussion is extended to the complex

71

:~d
Original Logic gat~

(a)

d ~

ab

+ c

eao.

~e~,.
e
a.,

~F.:.',

Fe.,

(b)

=

d

CL"
'C=O

Possible gate failures

: °0°

~o

~_,

(c) OR gate test
generation graph

1;.,

F;
b-,

~-o
e

subgraph for e and

.lSr

(d)

e

denotes a new indistinguishable failure by connection.

r.raph for test generation

Figure 5-A logic network containing a negation

a

b
C

Fc"=1

Fd=o

Figure 4-Complex graph for test generation to take into
account failures

graph. Also, a similar argument can be applied to an
OR gate. If a NOT operation appears between two
successive gates, the input variables to the following
gate are replaced by the dual subgraph of the preceding
gate. Alternatively, the graph will be given directly
from equations modified such that negations are driven
inward to the primary input variables by applying
DeMorgan's Law to the given Boolean equation. For
example, the graph for test generation with the logic
network in Figure 5a is given as shown in Figure 5d.
The same graph is derived from the transformation
of the Boolean equation as

d=ab+c=a+b+c
and the graph for test generation is given directly by
the above equation. It is obvious that distinguishable
failures in the original logic network are still distinguishable in the complex graph representation for test
generation.

72

Fall Joint Computer Conference, 1970

eration and distinguishable failures in a combinational
network were not clearly established. The main advantage of the graphic representation of a combinational network (including the complement expression)
is that the graph contains failure information explicitly
as discontinuities of arcs or nodes instead of s-a-O and
s-a-l failures in the original combinational logic
network.

r-------..,I

~--------~I

Fvm2-COf :
!r---------~I

I
:
I

~

~ T.

!::r--L i
tH. .JJi

1.

TEST GENERATION FOR COMBINATIONAL
CONTROL LOGIC

L ___

~

r--------,
~

piI-wi6--1
PAW-W17-:.t •

JiUti

I

:
I

I

I
I

I
I
I

PY[8-ACLDl

Figure 6-An example of control logic of ILLIAC IV PE (Closed
dotted line denotes an IC package and half one denotes
a part of IC package)

From the previous discussion it will be noted that if
those input variables which correspond to the nodes
in a path test through the original graph of a logic function are activated, the combinational logic network will
give an output of a logic 1, whereas if the path goes
through the complement graph, the output will be a O.
For example, if we set a = 1, b = 1 and c = 1 in Figure 4,
the output of the network is a logical 1. If a, b or c
stucks at 0, the faulty network will produce output 0
instead of 1. This test can detect single failures a, b, c
or output d stuck at o.
In order to detect the s-a-l failure of input line a, b, c
and output line d, the path tests in the complement
graph are required. A s-a-O type failure of one node in
an original graph will become a s-a-l type failure in
the complement graph and s-a-l type failure of one
node in an original graph will become s-a-O type failure
in the complement graph. Now it is clear that the complement graph of the original graph is required for the
output stuck at ~1.
In test generation methods which have been presented in the past, the relationships between test gen-

The output of any portion of a computer control logic
is usually governed by many input conditions, but the
fan-in of a typical logical element is usually restricted
to only a few inputs. This causes the number of gate
levels necessary to generate a function to increase and
the structure of control logic becomes tree-like. The
network shown in Figure 6 is a typical control logic of
the ILLIAC IV PE. Since there are about 50 distinguishable failures in the network, about 50 iterations
of a path sensitizing would be required by conventional
technique, or more than 8000 terms would have to be
handled by Armstrong's method. 2 In both cases, neither
the irreducibility of tests nor the isolation capability
of distinguishable failures would be guaranteed.
The network of Figure 6 is translated into the graph
of Figure 7 and Figure 8, from which the PGM will
generate tests, and the irredundancy and isolation
capability of the generated tests are guaranteed as well
as the simplicity of the algorithm.
To make a test path in the graph, the variables on
the path under test should be actuated and the rest of
the paths should be cut off. If the original logic network does not have a complete tree structure, a few
conflicts may occur in assigning values to variables to

Figure 7-A graph representation of Figure 610gic diagram

Method of Test Generation

make a test path generated by the PGM. These may
easily be resolved, as will be shown later.

73

PYE8-ACLDI +-( «PMW-EI---O AND NOT FYEM32-LOT) OR
«FYEElA-HMT OR NOT PMW-EI---O) AND NOT

Transformation of boolean equations to arc descriptions

FYE8-ACLCl) OR
«NOT P-EX-UF-LH AND

The description of a combinational logic network is
assumed to be given by a set of Boolean equations
using the operators AND, OR and NOT.
For example, from Figure 6 of a part of the control
logic of the ILLIAC IV PE, the Boolean equation is

(FYEElA-HMT OR NOT PMW-EI---O»

AND NOT

FYEM648L-T) OR
(NOT FYE98ASLIT AND
«FYEElA-HMT OR NOT PMW-EI---O) AND NOT
P-CARRYH-L AND NOT PEXDI-L48L») OR
«P4W-WIO--IOR PAW-W09--1 OR
PBW-W09--1) AND
(NOT P-EX-UFlLH AND
(NOT PAW-W16--1 AND PAW-W17--1) AND
(NOT FYEM329LIT OR NOT FYEMULTL9T) AND

FYEM648L-T

P-EX-UF-LH

(FYEElA-HMT OR NOT PMW-EI---O») OR
« (FYFElA-HMT OR NOT PMW-EI---O) AND NOT
P-EX-UF-LH AND NOT FYEM649L-T AND
(NOT PAW-W16--1 AND PAW-W17--1»

FYE98ASLIT

(PAW-WOl--l OR PBW-WOl--l OR
P4W-W02--1»

FYEElA-HMT

AND

OR

«PGC--16--1 OR P--1-7I--O) AND
«(FYEZlA-HMT OR NOT PMW-EI---O) AND NOT
FYE98ABLIl) OR

PMW-El---O

(NOT FYE9BABLFI AND
PAW-WOl--l

(FYEElA-HMT OR NOT PMW-EI---O»»);

PBW-WOl--l

Figure 9-Squeezed equation of Figure 6

P4W-W02--1

FYE98ABLIl

FYEElA-HMT

PMW-EI---O

P--L-7I--O

Figure 8-A complementary graph of Figure 6 logic diagram

derived* and this equation was then 'squeezed' by a
program as shown in Figure 9, where logical constants
(used to disable unused gate inputs) are removed from
the functional Boolean expression, and NOT operators
are driven into the innermost individual variables of
the equation by use of DeMorgan's Law.
N ow we try to transform the Boolean equations into
the graph descriptions. AND operations are trans-

* In the case of the ILLIAC IV PE design, the Boolean equations
are automatically generated from wiring information. This same
equation set was also used for debugging the logical design.

74

z

Fall Joint Computer Conference, 1970

=

a AND b

z

= a OR b

Let zbe a Boolean function of the Boolean variables
Xl, X2, ••• ,Xn and expressed as Z=Z(XI, X2, ••• , xn).
Let P be one of the path tests generated from the arc
description of the Boolean function z, and, be defined
bya set of Boolean variables on the path as P=
{Xll1 Xh, ••• ,Xla } where Xl l1 Xl 2, ••• ,xlaE {Xl, X2, ••• ,
Xn,

~a)

AND operation

(b) OR operation

Figure 10-Transformation of the Boolean equations into graph.
AND operation in graphic form

formed into series connections and OR operations into
parallel connections as shown in Figure 10.
The graphic representation of a combinational logic
network is translated as arc description for the input
to the PGM program. The AND operation, a AND b,
is translated as b~a, where a is the source node and b
the destination node. The OR operation is translated
as a~dn1, dn2~a, b~dn1 and dn2~b, where dn represents dummy node.
In the arc description generation program which we
developed, redundant dummy nodes are removed insofar as possible. For example, dummy nodes can be
eliminated from the OR operation in the various ways
shown in Figure 11 depending on the original Boolean
equation.
For ILLJAC IV PE control logic we get 111 Boolean
equations. The 111 Boolean equations and their 111
complemented equations can be visualized as 222· subgraphs and all connected to an input node and output
node. The arc descriptions of this big graph are processed by a program (PGM algorithm) to produce a
set of 464 paths for diagnosis.

X2, ••. , X n }.

{Xll1 Xh, ••• , Xl a }·

The conflict in the path will cause some trouble in
the assignment of the variables. Most of the time, they
can be avoided and this will be discussed in the next
section.
Let P= {Xll1 Xl 2, ••• ,Xla } be one of the paths and
('YI, 'Y2, ••• ,'Yn) be one of the value assignments where
'Y i = 1 if i E {h, l2, .•. , la} and the other 'Y i values are
arbitrarily chosen. If there exists another path P' such
that P' = {Xhl1 Xh2' ••• ,Xht/} where Xh17 Xh2' •• · ' Xht/E
{Xl, X2, ••• ,xn } and Xhl = ... =XhfJ= 1 after the above
assignment, the path P' = {Xh17 Xh2' ••• , Xht/} is called
a sneak path.
The sneak path P' is actually a path with its variables
being assigned 1 in addition to the path P in which we
are interested. The test values assigned to the variables
of the path test P= {Xl17 Xh, ••• ,Xlal can detect stuck
type failures s-a-O or s-a-1 for each literal in the path.
For example, if one of the input signals is Xi ( EP) then
the test pattern derived from P can detect a s-a-O failure
at input Xi. If one of the input signals is Xi ( EP) and
its literal Xi appears in P, that is xiEP, then the test
pattern derived from P can detect a s-a-l failure at
input Xi. Note that this detect ability of the failures associated with the input Xi is under the assumption

Al\B

AA~

Conflict and sneak paths

In a graphic representation, every path on the graph
is assumed to ,be able to be actuated independently to
the other paths, but this assumption is not always
true in the case of combinational logic network
representations.
For example, if there is a variable on a path such
that the variable simultaneously completes one portion
of the path and opens another portion of the path,
that is, the variable x appears as both x and x in one
path, then no test path actually exists.
In the following theoretical discussion, theseproblems will be analyzed accurately.

Xl,

A path P= {Xll1 Xh, ••• ,Xlal is said to have a
conflict if there exists at least one Xi such that Xi E
{Xl, X2, ••• ,Xn }, Xli=Xi and Xlk=Xi for Xli' XlkE

i)

V

( ... ) AND (A,tlR B) AND ( •••

ii) C AND (A

pR

B) AND ( •••

nl

A

O
B

A

B

E

iii) ( . . pR ••

) AND

(A

PR

B) AND E

iv)

C AND (A

pR

B) AND E

Figure ll-OR operation in graphic form

Method of Test Generation

that there are no conflicts or sneak paths for any test
value assignment to the variables in the path. Apparently redundancy in the original logic network
causes sneak paths in the graph representation, and
these sneak paths reduce the detectability of failures
by the path tests.
This is discussed more precisely as follows: Let
P= {xzu XZ 2, .•• , xzal, where xzi(jE {I, ... , aD is a
literal in the path we are interested in and P' =
{Xhu Xh2' ••• Xh/3} (-~P) is a sneak path as defined previously. Let a subset P" be defined as P" = p n p' •
Then the test value assignment to the variables of
path P can at most detect stuck-type failures in the
input signals XZu XZ 2, .•• ,xziEP". The test pattern
cannot detect failures in XZi+!' .•. , XZa E P - P".
This is proved as follows:
If a path P is in the original graph, a sneak path P'
cannot be in the complement graph. Let P be in the
original graph corresponding to the logical network
function f and P' be in the complement graph corresponding to the complemented logical network function J. Then we can express f and! as follows:
f=XlI 'Xl2 .•• XZa+RI
!=Xhl. X h2' . . xh/3+R2

By sneak path definition Xlu Xl2, ••• Xl a , Xhl1 ••• Xh{J
are assigned 1, therefore f = 1 + RI = 1. But f = 1 contradicts ! = 1 + R2 = 1. So a path P being in the original
graph and the sneak path pI being in the complement
graph cannot exist. Similar arguments can be applied
to prove that a path P be in the complement graph and
a sneak path pI being in the original graph cannot
exist. So P and P' both must be in the original graph
corresponding to the Boolean function f or both in
the complement graph corresponding to the complemented function J.
First, assume that the path P and the sneak path
pI are in the graph, not including complement expression, corresponding to the original logic function f. If
all the variables in the path Pare ANDed together
the result is XlIXZ2Xl~ • • • Xla' This is a term of the Boolean
expression of the logic network function f after expansion but before simplification. Similarly for the
sneak path pI we get another term XhlXhz •• ,Xh{J for
the Boolean functionf. Letf=xlI .. . Xla+Xhl" ,xh{J+R.
Where R is the logic sum of the remaining terms of the
Boolean function f.
Since Xl l1 XZ 2, ••• ,xziEP" =pnp' , we can rearrange
the function f as follows:

According to the value assignment and sneak path

75

definitions, we assign 1 to Xl l1 XZ 2, • , Xla and Xhl1 • , Xh/3
for the variables corresponding to the path P. A test
with logic value assignment 1 to Xk can detect a s-a-O
failure at location Xk if the change of the logic value
from 1 to 0 will result in a change of the logic values
at the output. On the other hand a test with logic
value assignment 0 to Xk can detect a s-a-l failure at
location Xk if the change of the logic value from 0 to 1
will result in a change of the logic value at the output.
First consider the s-a-O failure for Xli where XliE P"
and XZi is positive. Under the value assignment scheme
xlr=l, xz 2 =1, . . . , xla=l, xhl=I, ••• and xh{J=I, also
R = O. If Xli stucks at 0 and R still remains at 0, this will
change the function value from 1 to O. This corresponds
to the change of the output of the combinational logic
network from 1 to O. If R contains such one term in the
form of sum of products, as XliXklXk2' • • Xky and Xkl =
Xk2= ••• =xky=l and Xli=O under the previous assignment, the stucking at 0 of XZi will change R from 0 to 1.
This keeps the output remain at 1 when the input Xli
stucks at O. Therefore, the test derived from the path
P cannot detect the s-a-O failure a Xli' This will not
occur when xZi is a one-sided variable. So the test can
detect the s-a-O failures for those positive one-sided
variables xljin P". For the variable,s Xli+l1 Xl i +2 , • • •
and Xla E P - P", the test cannot detect the failures.
Assume xjEP-P" is a positive variable and stucks at
O. The term XZ i + 1XZ i +2 • • • Xla becomes 0 but Xhi+lXhi+2' ••
Xh{J is still 1 under the same value assignment scheme.
Since all xi's E P" are assigned logical 1, the function
value still remains at 1 regardless of whether Xj E P - P"
is 0 or 1. So the test cannot detect the s-a-O failures for
any positive variable Xj in P - P".
Similar arguments can be applied for s-a-l failure
of Xi and its literal xiEP. Now we have only proven
those paths in the original graph which correspond
to the Boolean function f. Similar arguments can be
applied to those paths in the complement graph except
the function is J instead of f.
If P" = P n p' is an empty set, the test derived from
P cannot detect any failure. Thus this test is useless,
and such a path P is said to have a fatal sneak path P'.

Test generation
The PGM program generates a set of paths from
the arc descriptions of the combinational logic network.
These paths will be processed to produce a set of test
sequences to detect and locate the failures.
Let z be a Boolean function of Boolean variables
Xl, X2, ••• , Xn and expressed as z = z (Xl, X2, ••• , Xn) •
Without loss of generality, assume z is positive in
Xl, X2, ••• , Xi and negative in Xi+1, Xi+2, ••• , Xii that is,

76

Fall Joint Computer Conference, 1970

through Xi appear in uncomplemented form and
Xi+! through Xj appear in complemented form only. But
Z is both positive and negative in Xj+l, Xj+2, ••• , X n •
That is, both Xk and Xk ( j + 1 ~ k ~ n) appear in the irredundant disjunctive form of· z. For example, if
Z(Xl, X2, X3, X4) =XlX2+X2X3+X4 then Z is positive in Xl,
negative in X3 and X4 but either positive or negative in
X2. Let us define those variables Xl, X2, • • • , Xi and
Xi+l, ..• , Xj as one-sided variables and those variables
Xj+l, Xj+2, ••• , Xn as two-sided variables.
Suppose the PGM program produces paths PI,
P 2 , ••• , Pm from the arc description of the equation
Z = Z (Xl, X2, ••• , Xn). Consider only one path Pl. Let
PI be defined by a set of variables on the path as
PI = {Xll' Xh, ••• , Xlal,
where
Xl

Let Xl D Xlz, ••. ,Xla be defined as variables inside path
and other variables as variables outside path. For example, if we have Z=Z(Xl, X2, X3, X4) and PI = {Xl, X2},
then Xl and X2 are variables inside path and X3 and X4
are variables outside path.
If PI = {Xll1 Xl 2 , • • • , Xl a } is one of the paths produced
by PGM program from the arc descriptions of the
equation Z=Z(Xl, X2, ••• x n ), then one can get the test
from PI by the following procedure:

Application to ILLIAC IV PE control logic
The ILLIAC IV PE can be divided functionally into
three major portions, the data paths, the arithmetic
units such as the carry propagate adder, the barrel
switch, etc., and the control logic unit. Tests for the
data paths and arithmetic units have been generated
by other methods. l
To diagnose the ILLIAC IV PE completely, control
logic tests have been generated by an automatic test
generator system which uses the methods presented in
the previous sections.
The control logic test generator system consists of
the following subsystems:
1. Equation generation and simplification program
2. Transformation program to change Boolean
equations into arc descriptions
3. PGM program
4. Test generation program
a. Conflict checking
b. Value assignment to variables
c. Sneak path checking

They are combined into the system shown in Figure
12.

1. Set the positive variables inside path at 1 and

the negative variables inside path at O.
2. Check two-sided variables inside path. If Xi and
Xi appear in the path, conflict occurs. Stop. If
only positive form Xi of the two-sided variables
Xi appears in the path, set it at 1. Otherwise at

o.

3. Set the positive variables outside path at 0 and
negative variables outside path at 1.
4. Set the two-sided variables outside path at o.
5. Check for sneak paths.
6. If a sneak path exists, change one of the twosided variables. Go back to step 5. If the sneak
path still exists after checking all the combinations of the binary values of two-sided variables
outside path, check for the fatal sneak path.
7. If no fatal sneak path appears, the assignment
of the logic values is good. Therefore, a test is
determined.
When the PGM was applied to the ILLIAC IV PE
control logic, onlysix of 111 equations were discovered
to have path conflicts. Many of these conflicts may be
avoided by rearranging the input cards to the PGM
program, since the paths selected depend somewhat on
the ordering of the input equations.

(111 equations)

Pragram which drives
the "NOT" to the
innermost individual
variables

~-r--_...L...J (464 tests)

Transformation from
Boolean equations
into the Arc
descriptions

Test Generat ian
1. Conflict checking
2. Assignment of
the variables
3. Sneak path checking

Figure 12-Controllogic test generation system

Method of Test Generation

77

TABLE I-Value Assigned Tests for Combinational Logic Network of
Figure 6 Diagram
(THE OUTPUT SIGNAL IS PYE8-ACLDl)
SIGNAL NAMES

PATH NUMBERS
111111111122222222223333333333444
123456789012345678901234567890123456789012

FYE8-ACLCl
FYE98ABLFl
FYE98ABLIl
FYE98ASLIT
FYEEIA-HMT
FYEM329LIT
FYEM32-LOT
FYEM648L-T
FYEM649L-T
FYEMULTL9T
P4W-W02-1
P4W-WI0-l
PAW-WOl-l
P4W-W09-1
PAW-Wl6-1
PAW-W17-1
PBW-WOl-1
PBW-W09-1
PEXDI-L48L
PGC-I6-1
PMW-EI-0
P-CARRYH-L
P-FX-UFILH
P-EX-UF-LH
P-L-71-0

111111111101110111111111111111111111111011
111111001111111110111010100000000000000000
111110111111011111111100100000000000000000
111111110111111101111111111111111100011111
010110010000111010111001101111011101101011
101111111111111011100000000000010000000000
111111111110111111111111111111111111111101
111111111011101111111111111111111111100111
110001111111111111011110100001111111111111
011111111111111111111000000000010000000000
001000000000000000100110111111111111111111
100000000000000100001111111110111111111111
000010000000000000000110111111111111111111
000000000000000000010111111110111111111111
000001111111111011000000000100000100000000
111110000000000100111111110111110111111111
000100000000000000000110111111111111111111
010000000000000000000111111110111111111111
111111110111111101111000000000000001000000
000001010000100001000110100000000000000000
010110010001111010111111111111111111111101
111111110111111101111000000000000000100000
001111111111111011100111111110000011111111
110001111011101111011000000010000000001000
000000100000000000000110100000000000000000
THE FIRST 21 PATHS ARE FOR THE
OUTPUT "PYE8-ACLDl" WHICH CORRESPONDS TO THE ORIGINAL GRAPH.
THE REST OF THE PATHS ARE FOR'
THE COMPLEMENTARY OUTPUT "NOT
PYE8-ACLDl" WHICH CORRESPONDS
TO THE COMPLEMENTARY GRAPH.

Table I shows the variable assignment for the control
logic tests in Figure 6.
Test dictionaries for failure location can be generated
by a system similar to the test dictionary generator
system associated with the PGM program. The test
dictionary generation will be reported in a separate
paper.

CONCLUSION
The path generation method for test generation for
combinational logic has been discussed and an example
of the test generation system for ILLIAC IV PE control
logic has been presented.

Test generation by means of graph representation
of the Boolean functions of combinational logic networks has several advantages over other methods.
First, distinguishable faults are explicitly expressed as
nodes in the graph. A test which is derived from one
path in the graph can detect stuck-type failures, if no
sneak paths exist. The nodes in the graph correspond
to the failure locations and failure types (s-a-O or
s-a-l) in the combinational logic network.
Second, a complete set of tests for fault location can
easily be generated from the graph by the PGM program. If no conflicts or sneak paths exist in the set of
paths generated by the PGM, the corresponding set of
tests is sufficient for locating failures in the combinational logic network.

78

Fall Joint Computer Conference, 1970

This method is a powerful tool for testing tree structure logic networks. If the structure of a logic network
is not of the tree type, the conflicts may occur.
A method of checking for conflicts and sneak paths
has also been presented. This is used to determine the
validity of the tests for the combinational logic network.
Conflicts can easily be reduced by replacing tests or
rearranging of the PGM inputs after inspection of the
generated tests. It is noted tha t these conflicts are not a
result of our approach, but rather a property of the
network itself.
Generally, conflicts will be few in control logic networks because their structure is close to a pure tree
structure, and no sneak paths exist if there is no redundancy in a logical network.

ACKNOWLEDGMENT
The authors would like to thank Mr. L. Abel for his
enthusiastic discussion and our advisor,Professor D. L.
Slotnick.
This work was supported by the Advanced Research
Projects Agency as administered by the Rome Air
Development Center, under Contract No. US AF
30(602)4144.

REFERENCES
1 A B CARROLL M KATO Y KOGA
K NAEMURA
A method of diagnostic test generation
Proceedings of Spring Joint Computer Conference pp
221-228 1969
2 D B ARMSTRONG
On finding a nearly minimal set of fault detection tests for
combinational logic nets
IEEE Trans on Computers, Vol EC-15 No 1 pp 66-73
February 1966
3 J P ROTH W G BOURICIUS P R SCHNEIDER
Programmed algorithms· to compute tests to detect and distinguish between failures in logic circuits
IEEE Trans on Computers Vol EC-16 No 5 pp 567-580
October 1967
4 H Y CHANG
An algorithm for selecting an optimum set of diagnostic tests
IEEE Trans on Computers Vol EC-14 No 5 pp 706-711
October 1965
5 C V RAMAMOORTHY
A structural theory of machine diagnosis
Proceedings of Spring Joint Computer Conference pp
743-7561967
6 W H KAUTZ
Fault testing and diagnosis in combinational digital circuits
IEEE Trans on Computers Vol EC-17 pp 352-366 April
1968
7 D R SHERTZ
On the representation of digital faults
University of Illinois Coordinated Science Laboratory
Report R418 May 1969

The application of parity checks to an arithmetic control
by C. P. DISPARTE
Xerox Data Systems
EI Segundo, California

INTRODUCTION

Inactivity alarm.

As circuit costs go down and system complexity goes
up, the inclusion of more built-in error detection circuitry becomes attractive. l\1:ost of today's equipment
uses parity bits for detection of data transfer errors.
between units and within units. Error detection for
arithmetic data with product or residue type encoding
has been used to a limited extent. However, a particularly difficult area for error detection has been control
logic. When an error occurs in the control, the machine
is likely to assume a state where data is meaningless
and/ or recovery is impossible. Some presently known
methods of checking control logic are summarized below.

Checks the loss of timing or control signals.
Method of checking an arithmetic control

The application of parity checks for error detection
in an arithmetic control appears to have been first suggested in 1962 by D. J. Wheeler.2 He suggested the
application of a "parity check for the words of the store"
as an advantage of the fixed store control where parity
checks would be applied to each microinstruction word.
In a conventional type control, the method of applying
parity checks is similar provided that the parity bits are
introduced at the flow chart stage of the design. The
present method is applied to an Illiac II type arithmetic
control which is a conventional control rather than a
read only store control. The method gives single error
detection of the arithmetic control where errors are defined as stuck in "1" or "0".

Methods of checking control logic!
Sequential logic latch checking

A parity latch is added to a group of control latches
to insure proper parity. The state logic must be desjgned
such that there is no common hardware controlling the
change of different latches.

THE ILLIAC II

Checking with a sim.ple sequential circuit

The Illiac II which was built at the University of Illinois is composed of four major subsystems as shown in
Figure 1. The Executive Subsystem includes Advanced
Control, the Address Arithmetic Unit and a register
memory. The Arithmetic Subsystem contains Delayed
Control, the Link Mechanisms and the Arithmetic Unit.
The Core Memory Subsystem is the- main storage. The
Interplay Subsystem contains auxiliary storage, I/O
devices and the associated control logic.

A small auxiliary control is designed which serves as a
comparison model for the larger control being checked.
Using a special pattern detecting circuit

An auxiliary sequential machine is designed which
repeats a portion of the larger sequential machine's
states in parallel. This gives a check during part of the
cycle of the larger machine.

The Illiac I I arithmetic subsystem
Checking with an end code check

The arithmetic Subsystem of the Illiac II shown in
Figure 2 performs base 4 floating point arithmetic. The
input and output channels carry 52 bits in parallel.

A check on the control outputs is accumulated and
sampled at an end point.
79

80

Fall Joint Computer Conference, 1970

SPINDAC, a small delayed control
Core
Memory

Executive
Subs~m

4------ ---------_____ •

t
I
I
I

•
Arithmetic
Subsystem

- .........

- ................

-........

!,
I

........................

I

-....-

•

........

Interplay

CONTROL A6.TH - - - -

QA.TA A6.TH - - -

Figure l-ILLIAC II organization

The first 45 bits of the operand are interpreted as a fraction in the range -1:::; f < 1. The last 7 bits are interpreted as an integer base 4 exponent in the range:
-64:::;X<64. Both the fraction and the exponent have
a complement representation. The other input data
channel carries a six bit Delayed Control order which
specifies the operation performed by the Arithmetic
Subsystem;
The Arithmetic Subsystem is composed of three
principal units. The Arithmetic Unit (AU) contains the
computational logic and is divided into two maj or subunits as indicated. The Main Arithmetic Unit (MAU)
and the Exponent Arithmetic Unit (EAU) handle the
fractional and exponential calculations respectively.
The second principal unit of the subsystem contains the
Link Mechanism (LM) logic. This logic transmits commands from De]ayed Control to the Arithmetic Unit
(AU). It may further be divided into gate and selector
mechanisms and status memory elements. Delayed
Control is the third principal unit of the Arithme~ic
Subsystem. Delayed Control logic governs the data flow
in the AU via the LM.
The order being executed by the AU is held in the
Delayed Control register (DCR). A new order cannot be
transferred to DCR until the order presently held has
been decoded and initiated by Delayed Control. If the
order requires an initial operand, Advanced Control
(AC) determines whether Delayed Control has used the
operand presently held in FI(IN). If so, AC places the
new operand in IN; otherwise, it must wait. If the order
requires a terminal operand (i.e., a store order) AC
checks the contents of the OUT register before the store
order is completed.

Delayed Control is constructed with a kind of logic
known as "speed-independent". The theory of speed
independence holds that a speed independent logic array retains the same sequential properties regardless of
the relative operating speeds of its individual circuits. 3
The theory permits parallel operations while at the same
time precluding the occurrence of critical races .
A smaller version of Delayed Control called
SPINDAC (SPeed INDependent Arithmetic Control)
has been used as a model for the present study.
SPINDAC was designed by Swartwout4 to control a
subset of the Illiac II floating point arithmetic instructions. The relatively simple arithmetic unit which
SPINDAC controls performs thirteen arithmetic instructions including addition, multiplication, exponent
arithmetic, and four types of store orders. Iror the purposes of this study, SPINDAC has been divided into
eight subcontrols as shown in Figure 3. Each of the subcontrols has one or more states. The Add subcontrol,
for instance has five states Al through A5. In general,
there is one flip-flop in SPINDAC for each state. The
entire SPINDAC has 29 states.

The MAU, EAU, and LM
The essence of this description is due to Penhollow. 5
The Arithmetic Unit (AU) consists of the Main Arithmetic Unit (J\tIAU) and the Exponent Arithmetic Unit
(EAU). These two units operate concurrently, but are
physically and logically distinct. Both receive their
operands from the 52 bit IN register. The first 45
bits of this are interpreted as a fraction, -I:::;f < 1, and
is the l\1AU operand. The last 7 bits are interpreted as

~g~-+

________________________

5=2~its~

From

Advanced
Centro!

From

Advanced --Control

----..

Link
Mechanisms
4-

L ____ "- _______ ,

:

52 bits

F&~~T)--------------~--------------~

Figure 2-The arithmetic subsystem of the ILLIAC II

Application of Parity Checks

an exponent, - 64::; X < 64, and is the EAU operand.
The complete floating point operand contained by IN
may be expressed as p=f·4x • Floating point results
placed in OUT have the same form. Both f and x are
in complement representation.
The block diagram of the Illiac II MAU is shown in
Figure 4. Registers A, M, Q and R each have 46 bits,
while S has 48 bits. Since the two adders yield sums in
base 4 stored carry representation, A and S also contain
23 and 24 stored carry bits respectively.

81

To FO(OUT)
via RE5gFO

From
F1(iN)
F1gMEM

TheMAU
During the decode step of every Delayed Control
order, the gate FlgMEM transfers the first 45 bits of
IN to M even though the order does not use an initial
operand. The results of the previous operation are generally held in A and Q which represent the primary rank
of the double length accumulator. The Sand R registers
form the secondary rank of the double length accumulator which usually holds an intermediate result at the end
of an operation. During the store step of every store
order, the RESgRO gate transfers a modified copy of
R to the OUT register.
The two adders shown in Figure 4 are composed of
base 4 adder modules. The A adder has the contents of
the A register as one input and the output of the MsA
selector as the other. In either case, the selector output
in two's complement representation is added to the
stored carry representation held in A or S. A subtraction
is accomplished by causing 1\1 to appear at the selector
output and then adding an extra bit in the 44th position.
The selector mechanisms have memory. Once a particular selector setting has been chosen by Delayed Con-

Exponent

ADD
Arithmetic
Subcontrol SUBCONlROL
E1-E2

1

1

Clear.Ad:l
Subcontrol

mrmalize
ubcontrol

A1-A5

81-82

N1-N3

J
1

Isu~~~r~11

Su
~~01

!

I

51-56

M1-M8

ICorTeCt
Overflow
Detect Zero
Subcontrol

K1-K2

J-1
Decode II
Subcontrol
D1

I

Figure 3-SPINDAC (SPeed INDependent Arithmetic Control)

Figure 4-The ILLIAC II main arithmetic unit

trol it remains in effect until a new setting is made.
The settings shown in Figure 4 are easily interpreted,
provided the outputs of the A and S adders are used in
place of the register outputs.
The gate mechanisms do not have memory, so they
must be activated each time the contents of the associated registers are changed. If the gate is not activated, the register simply retains its old contents regardless of the bit configuration appearing at its inputs.

The EAU
The block diagram of the Illiac II EAU is shown in
Figure 5. The EA, ES, EM and E registers each contain
8 bits. The EAU does arithmetic modulo 256. An 8 bit
adder (D-adder) with an optional carry into the Oth
position provides the capability of doing exponent arithmetic and counting. It accepts the outputs of the EA
register and the sD selector as inputs, and yields a sum,
D, which can be placed in ES via gES or in E via DgE.
The selector sEA controls the input to EA via gEA.
The gate mechanism EMgE controls the input to E.
During the decode step the contents of Fl are transmitted to EM via FlgMEM. At the end of an operation the exponent of the result is left in E.
The EAU decoder is a large block of logic whose inputs are the outputs of the D-adder. Its purpose is to
detect specific values and ranges of the adder output.
Knowledge of these values is used in the execution of the
floating add instruction. Detection of whether the output is inside or outside the range -64::;x<64 is also
accomplished at this point. Since knowledge of the previous range or value of d must be remembered during
the time the inputs to the adder are changed, gES or
DgE will gate the outputs of the EAU decoder into a

82

Fall Joint Computer Conference, 1970

SPINDAC flow chart

A partial flow chart for the SPINDAC Add sequence
is shown in Figure 6. The actions in two of the five states
A3 and A4 are indicated. Inside each of the boxes are
the control outputs in the form of gating and selector
outputs. On the lines leading to each box are the conditional inputs in the form of decoder, status element and
selector outputs. They determine which of the control
outputs is to be energized. The signals in the boxes on
the center line preceded by ALL are always energized
when in that state.
The action effected by states A3 and A4 is the alignment of operands for floating point addition and subtraction. Rather than attempting to explain each of the
symbols in the flow chart, only the simple case for equal
exponents (i.e., no alignment required) will be explained.

to Delayed Control
MgE

sD
EMsD

EMsD
OsO

-2s0
2sD
-22sD

F1gMEM

Figure 5-The ILLIAC II exponent arithmetic unit

The Equal Exponent Case

register cal1ed ED. The memory elements of this register
are named according to the range of d they represent.
The Link Mechanisms (LM) include gates, selector
mechanisms, and control status memory elements. Delayed control directs the data flow and processing in the
Arithmetic Unit (AU) via the LM. Delayed Control
requests may determine the state of the LM directly or
indirectly through the outputs of decoders. The inputs
to these decoders may be the outputs of registers, adders,
or status memory elements. Selector mechanism and
status memory elements remember their present state.
The outputs of certain status memory elements influence the branching of Delayed Control. The setting
of one or more of these elements during the present
control step partial1y determines the sequence of future
control steps.

In Figure 7 the contents of A and Q are first right
shifted (74: AQsSR) then left shifted (4SRsAQ) so that
AQ remains with its original unshifted contents. The
exponent is gated into ES at A3 by gES in the ALL box.
The selector (with memory) El\1sD was initially set in
state Al (not shown) which sensitizes this path. In
state A4, the exponent is passed back to EA via gEA
and ESsEA. El\1gE gates the addend (subtrahend)
exponent to E. OMsS in A3 effects only a transfer.
Kl\1sA is always the final step before leaving the A3A4 loop. This is to assimilate carries which have been
in base 4 stored carry representation until the final pass
through the loop. XA controls the exit from the loop
(exit if XA= 1). The least significant carry into the
adder is cleared by C. All of these control signals are dependent on the various outputs of the exponent decoders such as d> 0, es = 2 etc. The actions 2sD and

es>O)
(d>O)

A3

to
A5

A4

'roml
A2 -__-1~A~LLL-__~~~~~y+~AL~L__~~~~~~~~to
• • A
A
A5
(clone)

hv(es>0)

(cr)vO)

Figure 6-SPINDAC add sequence partial flow chart

AOs

Figure 7-Partial add sequence for equal exponents

Application of Parity Checks

12 (setting status element 12) are meaningless for the case
of equal exponents.

EMsD,CR,gP3
(es>O)

Speed-independent hardware

r---------..
ALL

2sD,ByCR,gP2

--------------_1

(es> 2)
The logic design procedure used for Delayed Control
(and SPINDAC) employs Basic Logic Diagrams
(BLD's) developed by Swartwout and others at the
University of Illinois. 6, 7 A digest of this work as well as
the design for a new error checking BLD is in the Appendix.
In the logic design procedure, each state of the flow
chart such as A3 or A5 is associated with a control point
(CP). The CP in tum has some associated logic and a
flip-flop. Using this terminology, it can be said that the
Add sequence has five control points (five states) and
the entire SPINDAC control has 29 CP's. Using this
design procedure, the entire SPINDAC control can be
mechanized with 27 flip-flops and 346 gates.

83

-2sD, ByC R, gP1

gA,gQ,gEA,ESsEA,gPO

(done)
(done)

.n.(es>0)

Figure 9-Application of parity checks to simplified control
point A4

THE APPLICATION OF PARITY CHECKS
A general arithmetic control moAel is shown in Figure
8. Here a bit pattern at the output of the control represents a pattern of the gating and selector signals transmitted to the arithmetic unit. The pattern will be a function of: (1) the instruction presently being executed,
(2) the conditional inputs and (3) the current step of the
control. The control must be protected against two types
of errors: first, an erroneous bit pattern at the outputs,
and second, an incorrect sequence of the internal states
of the control. In a "speed-independent" control, the
internal states of the control change one memory element at a time. In most practical designs, this means
that the internal states of the control must be encoded
with a high degree of redundancy. One systematic way
of achieving a speed independent control, for instance,

Conditional
Inputs
~

Encoded {
Instruction
Being
Executed

Figure 8-An arithmetic control model

shifts the single active control point down a shift
register. If any two bits (control points) are true at the
same time, the control is known to be in an incorrect
state. A method is suggested here of applying one or
more parity check symbols to the outputs of the speedindependent control so that an erroneous output bit
pattern may be easily detected. If the control action
with faulty outputs can be detected before the effect
has been propagated, a replacement control may be
switched in or maintenance may be initiated.

Method

The method of applying single error detection parity
checks is explained with reference to simplified
SPINDAC control point A4 in Figure 9. In the flow
chart, some boxes are entered conditionally. These
conditional boxes are the ones which have gating or
selector outputs which are energized only if the appropriate conditions are true. The signals gPO, gP1, gP2,
and gP3 are gating parity checks which have been chosen
according to the following three rules:
1. If a conditional box has an even number of gating and/or selector signals, .add an odd parity
checking gate to each box (gP1, gP2 and gP3
in the example).
2. If a conditional box has an odd number of gating
and/ or selector signals, no parity checking gates
are added.

84

Fall Joint Computer Conference, 1970

RATIO OF
OfEO6
51 56 51 '>1
5H 51 58 5B
59 5~ ~9 59

~o 00
ao 00
~o 00
~o 00
~o 00
ao 00
~O 00
~O 00
dO OC

lB
18
18
18
18
18
1B
18
lB

BATCH

'lA 35 004C 04

4~

46 5A OOJO 00

~O

dO 00 SA SA

00
00
00
00
00
00
00
CO
00

~9

14 14 14
OA 01 OA
00 00 00
4~ Uti OH
30 08 liS
Jl 08 08
12 OY UH ~o
4M 08 08 dO

1~

0000
0000
0000
0000
00:)0
tlOOO
~6
UO?O
46 5~ 0000
46 59 0000

~9

1f'

OA 11 11 J4 l ' l ' 1F IF

OJ
OJ
OJ
OJ
OJ
03
OJ
OJ
OJ

52
5J
54
55
56
57

1P
1P
11
1P
1P
OA

HO
UO
00
00
80
HO
dO

24 0000
2~ Dono
26 0000
21 0000
2H 0000
24 0000
2A nooo
7B 0000
2C 0000
)0
J1
Jl
]]
jll

BATClI
BATCH

l'
1P

~o

]2
]2
J2
J2
12
12
12
12
32
J2
12
.12
12
J2

p9

~9

J4 O'j ,)8 oil) OA 1"1 11 J4 l'

32
12
32
J2
J2

004C
G04C
004C
004C
004C
n04e
n04C
D04e
0()4C
004C

~9

11
1F
1F
1F
l'
11
l'
1F
01

18 ~9 ~9 ~~ ~9 ~9
18 ~q 59 ~9 ~9 ~9
1B ~9 ~9 ~9 ~9 ~9
lH ~9 ~9 ~9 ~9 ~9
18 ~~ ~9 ~9 ~9 ~9
1B ~q ~9 ~9 ~9 ~9
1B ~9 ~9 ~9 ~9 ~9
1C 1J 1J 1 J 1 J 1J
lD 2~ 14 14 14 1 ..
1£ 14 1~ 1~ 1~ 1~

n4
04
04
04
04

01
07
01
01
07
17
01
07
01

1

P
R
J
4

T

rNTERACTIVE 10 )') 004C
INTERACTIVE J1 Of 004C
INTP.R,H:'!'[vF J2 05 !)04C
INT~RhCTrv~ 1] 0] n04e
INTERACTIVE IU OR 004C
40
41
42
41
44
INTER~crIVF 45
INTEPACTIVR 116
INTERACTIV~ 47
INTF.!HcrrvE 48
INTERJCTIVE 49

J

P
R
J
J

OJ 0000 01 00 00 00 00 00 01 16 16 21 00 00 00 00
01 UOOO 02 )4 34 OS 06 ~O O~ 17 11 J .. 1F IF l ' 11'

1F 00 nouc 04 23 '3 1F 0000 UU J4 t4 OH 08

WR rTE
LOGOFF
LOr-ON
BULK 1-0

SA Tr. ff
LEV~L

Q

5

CO fJ4
SYSOP~fi 0
TNTERACTIVE 01 04
T NTF-RACT IV P OJ. )4
INTERAC1'IV~ I) I 04
INTRRAC'l'IVE 04 all
INTFlnCTIV": 0') 011
IN T ER ACT IV ~ 06 04
INTERACTIVE 07 04
INTEUCTIVE 08 04
I NT ER ACT IV P. 09 04
SULK 1-0
OA 04
OB 1A
BATCH
OC 1A
BATCH
01) 1A
tUTCIi
OE 1A
SlTCK
OF 1A
BATCR
10 1A
aATCR
11 H
SATCK
llH
8ATCR
1 1 1.\
aATCR
LOGON
H v2
1, {I/~
LOGOFF

THIRD LEVEL BATCH
SECOND

P
R
I

T

F
V

~J

44

46 46
41 41
4H 48

51
52

~O

~1

~1

~1

52

~2

~3
~4

52
53

~4

~4

~~

~4

~5
~fi

5~

50

~J

~O

5]

~O

~O

IN'l'BLOCK

Figure 4-Schedule table T5-the best TSSj360 release 5.01 table

~A

1F
1F
1F
IF
1f

U' 1F· IF 11

49
49
2C

1F
1F
1P
1F
IF
1F
1F
1F
IF

1F
lF
1F
l'
l'
IF
1F
1F
1F

1F
1F
1F
1F
1F
1F
IF
1F
1P

1F
11
IF
IF
It
U'
1F
1F
1F

~o

~o

SO

~O

~o

4~

4'1

18

~1

~1

~1

~1

~1

1~

~2

~2

~2

~I.

~l

18 ~J 5] 5J
18 ~4 ~4 ~4
18 ~'> ~~ ~~
18 56 ,6 56
1B ~1 ~1 ~I
lB ~S 5M 5H
18 ~'I ~9 59
~o

53 ~3
54 ,4

"

~~

~6

5&

~1

~/

~8 58
59 59

'>0 ,0 50

~u

Scheduling TSS/360 for Responsiveness

SET

HOLDI'fG
INTERLOCK
SET

PliE1UDICE

POR
INT~RLOCK

SET

p

P

P

P

L
C
It

R

R

R

R

J
1

J

J

J

I.

J

1&
17
17
11

I.J 00 00 00
2 .. 11' 11' 11'
I. .. 11' 11' 11'
2 .. 11' 11' 11'
l .. 11' 11' 11'
1.4 11' 11' 11'
1 .. 11' 11' 11'
1.4 11' 11' 11'
211 11' 11' 11'
2 .. 11' 11' 11'
2l 01 OAOl
27 OB DB DB
21 OC DC DC
H OD 00 OD
U OJ! OJ! DB
21 or OF 01'
21 10 10 10
II 11 11 11
21 12 12 12
21 lJ 13 13
21 , .. 111 14
lO 15 15 15

00
11'
11'
11'
11'
11'
11'
11'
11'
11'
01
OB
DC
00
OJ!

I.J 00 00
2 .. 11' lP
I.~ IF IF
I.b lP 11'
22 lA 11
21 lB 18
28 lC lC
21 , .. , ..
lO 15 15

00
lP
IF
11'
11
lB
lC
, ..
15

00
11'
11'
lP
11
18
lC
, ..
15

1F 01 0013 01 2J F1' lP 0000 00 11' 11' O~ 08 00 OA 11 11 24 lP lP

lP

lV

15
14
01.
00
11'
11'
11'
OB
lJ

15
, ..
01
00
lP
11'
11'
08
lJ

3~T

I'iT~R'CTIVE

:,rAP'!'I!!G

20
10
10
10
10
10
10
10
10
10
20
10
10
10
10
10
10
10
10
10
20
20

PP
PF
1'1'
1'1'
Fl'
P'1'
1'1'
P'1'
1'1'
1'1'
P'1'
1'1'

20
10
20
10
20
Q1 10
01 10
~Ot3 01 20
0013 01 20

F1'
1'1'
FP
PF
FF
FF
PI'
FP
PP

LOG01'~

01
01
01
01
01
01
01
01
01
01
01
01
01
01
01
01
01
01
01
01
02
02

SYSOPEBO
INTER'CTJV!
TNTI!RICTIV!
IlITERACTIVE
SUL!{ 1-0
BITCR
BITC!f
LOGOIl
LOG01'1'

16
17
18
19
11
1B
lC
lD
lE

00
02
02
01
0)
04
04
01
03

0013
0013
0013
0013
0013
001]
001J

01
01
01
01
01

"RIT!

I!

0011
0013
0013
0013
0013
0011
00 13
0013
0013

01
01
01
01
01
01
01
01

20
20
20
20
10
20
10
10
)0

29 18 0026
21 03 0026
IB 18 0026
2C 11 002b
20 16 0011
2E 15 0013
2F 14 0016
'0 14 002~
it '" !I01"·
12 14 001J

n~

10
18
20
10
40

20
21
22
23
2 ..
IIITERACTIV~ 25
I NTERAC1' IV,. 26
27
81TCH
28
BITCH

06
06
06
06
06
06
06
06
06

00
01
02
01
04
05
06
07
OS
09
01
OB
rr OC
"1' 00
1'1' OE
1'1' or
1'1' 10
1'1' 11
1'1' 12
PI' 13
P'P 111
1'1' 15

0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0121'
0121'

012,.
0121'
012'
0121'
012'
012'
0121'
0000
0000

D T I!
ESP

T
II

I
II

L

R

I

I

TilE
I
D

I

I

""
pp 0

T

T

T

00
00
00
00
00
00
00
00
00
00
00
01
01
01
01
01
01
01
01
00
00
00

16 0121' 00
17 0121' 00
1~ 0121' 00
19 OllP 00
11 012P 00
18 0111' 00
lC 011P 00
lD 0121' 00
lE 0121' 00

20
21
22
21
2.
25
26
21

PF
PF
PP
F1'
PP
~O PF
10 PP
10 PP

29
21
2B
2C
20
2E
2F
11

00 00 00
117 51 3D
117 51 3D
.. 7 51 30
In 51 30
47 51 JD
.. 7 51 3D
.7 51 ]D
47 51 30
.. 7 51 3D
01 01 01
IIC 35 35
4C 35 35
IIC 3') 35
I&C 35 35
IIC 35 35
I&C 35 35
"C 35 35
IIC 35 35
I&C 35 35
, .. , .. 111
15 15 15

00
08
08
08
08
08
08
08
08
08
01
OB
OC

00
OE

OF
10
11
12
13
, ..
15

00 16 00 00
28 11 Q~ 00
l8 1~ 08 00
2C 19 08 00
01 11 01 00
"c 35 18 UB 00
35 ]& lC 13 00
, .. 14 lD , .. 00
15 15 lE 1~ 00

01 16 1&
01 11 17
01 18 18
01. 19 19
01 lilA
01. lB lB
01 lC lc
01 lD lD
Oil! lE

0000
08
0000
02
0000
02
0000
01
0000
01
ooal
02
0000
01
0000
01 110' f f 11 0000
!Il 110 PF 32 0000

00
13
00
00
00

l3
2J
l3
l3

21
41
21.
2P
JO
31
liB
.. I
""
49

28
19
2C
lD
2E
32
28
2C
I.D
lD

08
08
08
08
08
08
08
18

00
80
00
00
00
00
80
80
Oil IlO
08 80

01
01
01
01
01
01
01
01
01.
01

11
18
18
19
19
19
19
19
19
19

11
18
18
19
19
19
19
19
'"
19

JJ 0121'
1 .. 0121'
0121'
Oll'
012P
012F
J" OllP
11 812F
3B 0121'
lr 012'

00
lJ
00
00
00
00
H
2J
21
13

34
4C
J3
39
3l
3B
50
"P
liE
.. E

]5
JJ
3b
31
38
3C
35
J6

34
3J
39
11
3B
3C
35
J6
J7 11
31 18

J9
II
J8
lC
JI)
JI
J8
3C

'4 00
80
00
00
00
00
ao
00
80
80

01
01
0&
01
01
01
01
01
01
01

18
lB
1B
lC
lC
lC
lC
lC
lC
lC

18
18
1B
lC
1C
1C
lC
lC
lC
1C

H
21
l"I
28

0121'
012P
011P
0121'
012F
012F
0121'
012'
012P
0121'

00
13
00
00
00
00
23
23
2J
21

21
41
21
21'
30
31
29
2B
2C
2D

lB
28
2C
20
2E
32
2C
2D
2E
21!

3E
lD
"3
....
.. 5
lib
3P
"0
41
"2

08
08
08
08
08
08
08
08
08
08

01
01
01
0&
01
01
OA
0&
01
01

11
11
18
19
19
19
18
19
19
19

11
11
18
19
19
19
18
19
19
19

lq
211
25
26
2&

08 00 08 11
08 80 08 11
08 00 0& 19
O~ 00 0& 18
08 00 0& 18

2~

ao

3D
3D
lP
40
41
112
J1'
40
111
III

1'1'
pr
P1'

SHRINKIN~

.. 7
liB
119
III
48

01
01
17
16
15

0026
0026
0026
0026
0026

08
oa
10
20
18

1'1'
pF
F1'
FF
1'P

III
.. 8
4"
41
.. 8

0000
0000
0000
0000
0000

00
13
00
00
00

48
.. 1
CIA
48
III

29
29
2C
28
29

1D
3D
40
JF
lE

,)HRl!IKING
DELAYING
SIIRINKING
5HR INK IN:>
SHRI!lKING

4C
liD
4E
liP
50

13
11
13
,.
10

0026 10 08
0016 10 08
11 JC
0016 02 20
0026 09 18

F1'
1'1'
1'1'
FF
P1'

IIC 0121'
40011'
.. E 0121'
4P 01l1'
SO 012F

00
21
00
00
00

.. 0
QC
111'
SO
11=

J]
31
37
3&
35

"D .. 0
IIC .. 0
Jl !A
J9 J9
34 14

1)1)1~

11

lB
1B
lB
lB
lB
lB
lB
lB
18
10
lE

n

1"1
11
17
11
17
lA
lB
1B
lB
lB
lB
lB
lB
lB
lB
10
11

15
14
OA
00
08
08
08
DB
13

III
.. 4
45
46

41

11
H
17
11

15
lQ
01
00
3D
28 30
20 1P
j j 35 27
33 J~ 26

]0
3E
lP
QO
.. ,
42
IIJ
4 ..
45
116

DELATIN!,;
,.ROIIIlIG
GROIiING
GROIiING
GROiltNG
DELATING
DELATING
DELUING
JELA YIKG

n

15
, ..
OA
00
2B

FP
1'1'
PI'
rl'
'I'
pp

3D
JE
11'
40
41

lb
17
,.,
11
11

15
, ..
01
00
47
41
2B

10
10
20
10
.. 0
.. 0
20
30
.. 0
110

3d
1'1
JA
JIl
)C

OA
01
01
01
01
OA
01
01
01
01
01
01
01
01
01
01
01
01
0&
01
01
01

C

2J
21
23
01
2J
23
2J
23
23

OS 0026 01
DB 0026 01
1~ 0011 01
11 0010 01
16 000 .. 01
15 000 .. 01
14 OOll 01
111 0010 01
14 000" 01
14 000 .. 01

31

00
8D
80
80
60
tiO
80
80
80
80
00
80
80
80
80
80
60
80
80
80
00
80

00
.. 1
QB
"I
01

1'F
1'1'
FI'
FF
t'1'
F'
1'1'
PF
1'1'
Ft'

U
311
)'}
16

RP"
CR R

0000
0000
0000
0000
0000
0000
0000
0121'
012P

at

PI'
1'1'
1'1'
1'1'
1'1'
1'1'
PI'
P1'
PI'

E

lJ 002" Ud 10
11 0026 08 18
lC 002& 02 20
ta 0016 02 10
1& DOll 01 "0
1'1 ~Oll 01 "0
11. 0026 01 20
11 0026 01 30
1A 0011 01 40
11 0011 01 .. 0

SHR!!IK ING
LO)PI'IG
DELAYING
B1'EPACTIYE SHRI!lKING
~Rr
SHRINKING

~G

X

0026
0010
0010
0010
0010
0010
0010
0010
0010
0010
0026
0013
0013
0013
Don
0013
0013
0013
0013
0013
0026
0026

';ROIII~G

'H1'

1

E

05
lA
18
18
18
18
18
18
19
19
0 ..
10
1D
10
lD
1D
lD
lD
10
lD
03
0 ..

DELHING

A.lITT

5

D

00
01
02
03
0"
05
06
07
08
09
01
OB
DC
00
OE
01'
10
11
12
11
, ..
15

IITlRACT.IVE
t1tTERACl'IVE
IITERICTI'E
TRTF.RACTIYF.
IIITP.RACTIVE
INTERICTI'E
IITlRACTIf1!
INTERICT!'"
IITERlCTIYP.
BULl{ 1-0
B1TCH
BATCR
BlTCR
BATCR
BATCR
BlTCR
BITCR
BITCR
BITCR
LOGON

';!lOIII'iG
DELAYING
GROIiING
GROIiING
G.BOilING
GROIiING
DELAYING
DELATING
JELlYT NG
JELlYI N!,;

RATSH

1/

•

L

~!lOIIT!fG

SET

C

R

!>ELlYIN::
DELI Y'I !f,;
DELlYI MG

Lv)"T NG

11
T

L

GROIiING
Lo')::>I'IG
llELAYIIG
GROIII'IG
TNT-P\CTIVE GROWING
GBOIII'lG

:lsr

II

C
H
L

LOG01'P
LOGOI
BULK 1-0
SYSOPERO
IlIT!PlCTIV E

WAIT! NG

LOO!'!

L

L

P I
"
I I U II
X X L T

oU

SYSOP~RO

ST18TIIG

H

I

L
P T
EllS
V I
V
P. 0 I

10
10
01
02
08

1'1'

3,}
36
17
38

60
80
80
80
00

Oil!
01 1D
OA 1&
01 16
01 11
~O OA 18
~O 01 19
80 01 18
80 01 lC

til

00
80
00
00
00
00
80
80
80
80

00
80
00
00
dO

08
08
01
01
01

lB
lB
lC
1C
18

lE
lD
11
16
11
18
19
18
lc

1.0

l~

1.1 , ..
21. 01

23 00
24 lP
2~ 11'
26 lP
l7 08
28 13

15
14
OA
00
11'
lP
11'
08
13

or

10
11
12
13
, ..
15

l" 11' 11' lP lP
l~

11'
11'
11'
11'
11'
1P
lP
.!b 1F
2b 11'

11'
IF
11'
11'
11'
11'
IF
11'
11'

11'
lP
IF
11'
11'
11'
1P
11'
lP

11'
11'
IF
lP
11'
lP
11
11'
11'

JII
311
J9
31
JB
3C
J9
JA
JB
JC

J4
JII
39
J&
JB
JC
39
JI
J8
JC

J4
3 ..
J9
J&
3B
3C
J9
JI
JB
JC

J"
JII
39
38
3C

lP
11'
lP
11'
11'
11'

2&
26
26

IF
11'
IF
11'
11'
lP
IF
11'
lP
IF

11'
11'
lP

11'
lP'
lP
11
lP
11'
11'
11'
11'
11'

lP
11'
11'
lP
"
"
11'
11'
11'
11'

11
11
19
18
18

2"
2Q
2b
25
25

11'
11'
1F
11'
11'

11'
11'
11'
11'
lP

IF
11'
1P
11'
11'

11'
11'
11'
11'
11'

lB
18
1C
18
lB

21 .. D
21 .. D
l8 16E
21 .. ,
1.1 50

4D
.. D
.. E
4P
50

.. D
.. D
Il£
Qp
50

.. D
liD
II!
.. P
50

25
I.b
i6
16
lb
lb

l8

211
28
III
2~

I.Il

l&
:i5

11'

Jl

J9

J1
JB
3C

IIITERACl'I'I E !>1 11 0008 01 20 FF 51 0000 00 28 '>2 JP Oil 00 01 18 llf i5 tF ,,. 11' 11'
INT!RICTIV~ ~2 16 0006 11 30 F1' 52 OODU 00 2C 53 40 08 00 01 19 19 l6 11' 1F lP 11'
INTERACTIVE 53 15 0004 01 110 1'1' 51 0000 00 2C 2D 40 08 00 01 19 19 2& 1P 1P lP lP

Figure 5-Schedule table T47-The table in use when we first ran without LOS

105

106

Fall Joint Computer Conference, 1970

crossing the 20 mark and building rapidly. The
average delay during the RUNOFF between
522 lines of type-out was 3.4 seconds. This
included four unusually high delays of 71.6,
85.4, 87 and 117.1 seconds. It turns out that
these unusual delays occurred because of a high
level of simultaneous load. I was the chief perpetrator of this load and was thus hoisted on
my own petard. I was doing a virtual memory
sort over 2.5 million bytes of data during the
interval of time when the four large delays
occurred. This table still had its weak moments.
This, however, was the only cloud on a perfect afternoon. No other user complained.
6. One user was on with an 1130-2250 doing
algebraic symbol manipulation with LISP.
7. Three users were on all afternoon using a medical
application program.
8. Two users were on editing using the REDIT
command with a 2741 for input and a 2260
scope for output. One of these two was the high
success story of the day .. He was on from 1 :08
until 2 :55 p.m. During this time he executed
622 editing commands. This was an average
interaction time of 10 seconds. This includes
type-in time, processing time, think time and
display time. And he is not a touch typist!
He averaged 5.1 seconds to respond to TSSj
360 with each new command. TSSj360, in turn,
averaged 4.3 seconds to process his request,
change the display and then open his 2741
keyboard for the next request. This includes
a single high peak of 104 seconds during my
2.5 million byte sort, similar to the RUNOFF
delays.
9. The remainder of the users were in various
stages of editing, compiling, executing, debugging, abending, and other normal operations.
Most of the users were on the system for one
or more hours of consecutive use that day.
10. Ninety per cent of the compilations completed
in less than one minute that afternoon.
11. To compound the issue, we were running with
what we considered to be a critically low amount
of permanent 2314 disk space available on that
day. There were between 4,000 and 5,500 pages
of public storage available out of a possible
48,000 pages, of on-line public storage. Thus
I assume higher than normal delays due to disk
seeking existed during this day.
12. The most quantitative evidence I can offer
from the contribution of the balanced core time
and working set size concepts was obtained
from the Level Usage Counters.

a. Of all task dispatches, 89 per cent required
less than 32 pages.
b. Ten per cent required between 32 and 48
pages. This could be even lower if the assumption is made that this simply reflects a
working set change and not a working set
size change.
c. The remaining one per cent of all dispatches
were for up to 64 pages.
d. Breaking this down by sets of table levels,
there were:
Starting set
Looping set
Batch
Holding interlocks
Waiting for interlocks
AWAIT
BULKIO
SYSTElV[OPERATOR
LOGON
LOGOFF

28 percent
30 percent
4 percent
5 percent
2 percent
5 percent
10 percent
5 percent
6 percent
5 percent

13. Since the BCU has not been available SInce
December 20, 1969, there were no BCU measurements made that day.
14. The table in use on that day was T47 (Figure
5), which is very similar to T48 (Figure 6). T48
was created to reduce the impact on other users
of people doing in core sorts of several million
bytes on a computer which only allocates them
several thousand.
Changes in indicative programs

The programs discussed here exhibit to a marked
degree improvements which are present to a lesser
degree in all programs. They are:
1. A tutoring program, using a 2260 display with a
2741 for some input formerly had the following
property:
If you asked it a question whose answer was
unknown it displayed a message to that effect.
Then after a few seconds the screen would change
to let you choose what to do next.
Once the first new table was put in use, the
first message was usually not seen by most
people. This was because the old normal system
delay no longer existed. The program's author
had to put in a PAUSE after the first message
to allow it to be read.
2. When the first new table was put into daily use,
only one user was slowed down. He was using

Scheduling TSS/360 for Responsiveness

L
I
V
I
L

ST1RTJlIG
SET

HOLDING
INTERLO'CK
SET

P

T

R

S

0.U

"A

"I

I

V

1

X

X

o A

H

C

B

L

T

R

P

A

U

II

R

L T
S I

D

I

X

D T
I

S

II
P

T
W

L

E

R

1

T

H

I

I

D

1

II
1

I

I

T

T

SYSOPIBO
IHTER1CTIYE
IRTERlCTlYI
IHTER1CTIYI
IRTP.R1CTIYE
IHTER1CTIYI
fIITED1CTIYF.
IRTIRlCTIYE
IHTIR1CTI'E
IHTIUCfIYE
BULK 1-0
BATCH
BITCH
BITCH
B1TCR
BITCH
BATCH
BITCH
BITCH
BlTCR
LOGOH
LOGOFF

00
01
02
03
0 ..
OS
0&
0'7
09
09
0'1
OB
DC
DO
OE
or
10
11
12

OS 0026
18 0020
18 0020
18 0020
HI.0020
18 0020
18 0'020
18 0'020
18 00'20
18 0020
011 00'26
lD 0011
10 0'0'13
lD 0013
lD 0013
10 0013
10 DOll
lD 0'013
10 0013
13 lD 001J
14 03 0026
15 Olt 0026

01 20
01 10
01 10
01 10
01 10
01 10
01 10
01 10'
01 10
0'1 10'
0'1 20'
01 10
01 10
0'1 10
01 10
01 10
01 10
01 10
01 10
0110
02 20
02 20

PP 00 0000
FF 01 0000
FP 02 0000
pr 01 0000
pr 0 .. 0000
FF 05 0000
FF 06 0000
FP 01 0000
FP 08 DODO
FF 0'9 0'0'00
FF 0'1 O'O'!)O'
FF DB 012r
FF DC 012r
FF OD D12r
FF DE 012F
rF OF 012F
FF 10' 012F
FF 11 012F
FP 12 012F
PF 11 012F
FF , .. 00'00
FF 15 0000

00
00
00
00
00
00
00'
DO
00
00
DO
FE
FE
FE
FE
FI
FE
FI
00
00

00 00 00 00
21 51 10 OK
21 51 10 OK
2~ 51 lD 08
21 51 JD OK
21 51 10 08
21 51 10 011
21 51 10 011
21 51 lD 0'11
21 51 10 OK
01 0'1 01 01
"C 15 15 OB
4C 15 15 DC
4C 15.15 OD
..C l!» 35 DE
4C 15 15 DF
4C 15 J5 10
4C 15 J!» 11
4C 15 J5 12
lie 151513
111 lit 14 14
15 15 15 1!»

SYSOPERO
IlITIR1CTIYE
IN'l'ER1CTIVP.
INTERACTIVE
RULK 1-0
BATCH
BITCH
LOGO II
LOGOFF

16
11
18
19
11
18
lC
10
lE

02
02
02
02
02
02
02
02
02

FF
FP
FF
FF
FF
FF
FF
FF
FF

00'
00
00
00
00
00
00
00
00

00 00 1&
47 2B 17
4B 28 lK
ItA 2C 1'1
01 OA 11
ItC J5 18
35 3& lC
14 lit 10
15 1~ It

00
02
02
0.1
OJ
Olt
0"
01
03

0026
0026
0026
0026
0026
0026
0026
0026
0026

20
10
20
30
20
10
]0
20
20

16
17
lK
1'1
11
18
1C
10
1E

OllF
012P
012F
012F
012F
012F
012F
012F
012P

FE
1'1

RP"
CR R
"" Q

PP

H

L

II

P

P

P

P

L

C
H

L
C

R

R

II

II

J

K

J
].

J

L

J
1

1

..

C
K

T

00 01
80 01
80 01
KO 01
KO 01
80 01
ItO 01
80 01
KO 01
KO 0'1
00 01
KO OA
80 01
80 01
KO 01
80 01
110 01
80' 01
80 01
8001
DO 01
KO 0'1

DO DO 01
OK 00 01
O~ DO 01
OK 00 0'1
0'1 DO 0'1
DB DO OA
1J 00 0'1
14 00 0&
1!» 00 01

1& 1& 2J 00 00 00 00
17 11 2" IF IF 1F lP
17 17 lit 1P IF 1P 1P

17 17 2 .. IF lP IF 1P
17 17 lq 1F 1F 1F 1P
11 1F 1F 1P
U' IF IF 1P
1F lP lP lP
IF IF lP lP
1F lr IF 1P
11 11 :lI. 01 01 01 01
1B 1B 21 OB OB OB OB
1 B 1 B 2'1 OC OC OC OC
lB lB 21 00 00 00 00
lB 1B 21 01 DE OE DE
lB lB 21 OF DP OF OF
18 lB 21 10 10 10 10
lB 1B l7 11 11 11 11
lB lB 21 12 12 12 12
1B lB 27 1J 1J 1J 1J
10 lD 21 , .. , .. , .. , ..
11 lE 20' 1~ 1~ 1~ 1~
11 17 lq
11 17 lq
17 17 lq
17 11 2q
17 11 2q

1 b l j 00 DO 00' 00
11 17 2" IF 1F IF 1F
lK 18 2!» IF IF 1F IF
1'1 1'1 if> 1F 1F 1F lP
lllAl.l.ll111AlA
lB 18 n 1B 1B lB lB
1C lC 2H lC lC lC lC
10 lD 21 , .. 14 111 , ..
lE 1£ 20 1!» 1~ 1~ 1~

1b

WRITE

IF 01 0026 02 23 FP IF 0000 00 IF IF 08 08 00 0'1 17 11 lq IF IF 1F IF

LOGOFF
LOGOIl
BULK 1-0
SYSOPBRO'
IIITIRlCTIY I
IH TERlCTIYE
IIiTIRACfIYP.
BATCH
BITCH

20
21
22
23
21t
25
26
27
28

06
06
06
06
06
06
0&
06
06

02
02
02
02
02
02
02
02
02

20
20
20
20
10
20
]0
10
30

FF
FF
FF
FP
FF
FF
FF
FF
FP

20
21
22
2]
2 ..
25

000'0
0'000
0000
0000
0000
oooa
2~ 0000
27 012F
2a 012F

FE
FE
FE
FE
FE
FI
PE
PE
FE

1!»
, ..
01
00
47
47
2B
JJ
31

14
01
00
2B
2B
20
35
J5

15
, ..
01
00
JD
JD
3F
27
28

15 KO O~ lE
111 80 01 10
01 KO 01 11
00 80 OA 1&
OK 00 0'1 17
OK KO 01 lK
08 KO 01 19
DB KO 01 lB
lJ 80 01 lC

GROIIIIIG
DILATING
G801llllG
INTER1CTIVE GROIlING
GROIIIIlG
SET
r.ROIIIIIG
DILlfUIG
DiLl TIlIG
DILlYIIIG
DELI fING

29
21
2B
2C
20
2E
2F
10
11
]2

08 0026 08
08 001& O~
18 0016 02
17 ·0026 02
16 .0013 01
15 0011 01
lit 002& 02
14 0026 01
111 0026 01
, .. 0013 01

10
18
20
10
itO
40
20
30
qO
40

FF
FF
FF
I'P
FF
FF
FF
FF
FF
FF

2'1
21
2B
2C
20
2E
2F
3D
11
J2

0000
0000
0000
0000
0000
0000
0000
0000
0000
000'0'

00'
FE
00
UO
00
DO
FE
FE
FI
FE

21
q1
21
2F
JO
Jl
liB
itA
4'1
1t9

2B
29
2C
20
2E
J2
2B
2C
20
20

10
10
1F
40
.. ,
42
3F
qo
.. ,
42

0'8
OK
08
08
08
08
0'8
OK
08
08

GROIlING
DELlflNG
GROIIIIlG
GROIlIIIG
GROWIIIG
GROVIIIG
DJ!tUIIiG
DIL1TlNG
DILATIIiG
OILUIIG

])
JIt
)5
16
)1
)8
19
31
lB
lC

13 0026
110026
lC 002&
lB 0026
11 001J
19 001)
11 0026
11 0026
11 0013
11 DOll

08
08
02
02
01
01
02
01
01
01

10
18
20
10
40
40
20
30
itO
.. 0

FF
FF
FF
Fr
FF
FF
FF
FF
PF
FF

13
1 ..
J'>
16
37
18
19
31
1B
lC

01lF
OllF
012F
012F
012F
012F
012F

00
FE
00
00
00
00
FE
8111 FE
012F FE
012F FE

JIt
4C
JJ
39
31
3B
!»O
4F
4E
4£

J5
33
J6
37
18
3C

14
J.i
J9
11
38
)C
J~ 35
J& J6
37 17
17 18

JIt 00 0'1 lB lB 27 Jq
2-' Jq
00 01 lB lB 21 3'J
Jl 00 01 lC lC 28 Jl
JB 00 0'1 1C 1C 2K J8
3C 00 01 lC lC 28 3C
.i'l 110 01 1C lC 2K 39
Jl 00' 0'1 lC 1C III JA
1B 80 01 1C lC 28 JB
iC 80 01 lC lC 28 JC

GROWING
DIL1TIIIG
GROIIIIIG
IIITER1CTI'E GROIIING
GROIIIIIG
SET
GROIIING
DELATING
DILAYI NG
DELlTlNG
DILlYIIiG

10
lE
3F
40
Itl
42
.. 1
41t
itS
46

DB
OB
18
17
16
15
, ..
lq
, ..
lq

0026 01
O~26 01
0013 01
0010 01
0004 01
0001t 01
0013 01
0010 01
00011 01
lOOIt 01

10
10
20
10
itO
110
20
10
110
itO

FF
FF
FF
FF
'F
FF
FF
FF
FF
FF

JD
lE
3F
.. 0
Itl
.. 2
1t1
.. 4
It'>
4&

012F
012F
012'
012F
012F
01lP
0121
012'
012P
012F

DO
FE
00
DO
00
00
FE
FE
FE
FE

21
47
21
2P
10
11
29
2B
2C
20

2B
2B
2C
20
2E
12
2C
lD
2E
2E

3E
10
q3
44
q5
1t6
1F
itO
41
42

08
08
08
08
08
08
08
0'8
OK
JK

00
80
00'
00
00
00
80
80

0'1
01
01
01
01
01
OA
DA
ao 01
60 01

17
17
lK
19
19
19
111
19
1'1
19

17
17
18
19
19
19
18
19
19
19

SRRIllltitiG
LOOPIIiG
DILAYIIIG
INTER1CTIVE SHBIIKIIIG
SET
S8R tNK ING
SHRTIIKIRG

1t7
1t8
1t9
ijl
ItS

07
07
17
16
15

ry026
0026
0026
0026
0026

10 08 FF 47
10 08 FF 1t8
01 30 PF 49
~2 20 FF 4&
08 18 FF ItB

0000
0000
0000
0000
0030

00
FI
00
00
00

1t8
1t1
.. 1
4B
47

29
29
2C
lB
29

3D
10
qo
3F

08
08
OK
08
IF. 38

00
80
00
00
00

OK
08
OA
DA
DA

17
17
19
18
111

17 lq IF IF IF IF
17 2_ 1F IF IF lP
19 2& IF IF IF lP
18 2~ IF IF 1F 1F
1M 2~ IF IF IF IF

LOOPING
RITCH
SET

'>HtlIIIKIIiG
DIUTING
SHlIINKlllr.
<;H!lIIIKINr.
SHRINKIIIG

ItC
.. 0
.. B
4F

11
11
18
11
19

0016
0026
0026
00'26
0026

10
10
01
02
OS

.. C 012F
.. D012F
ItE 012F
ItF D12F
SO 012F

00
FE
00
00
00

.. D
ItC
rtF
SO
IIC

13 4D
H 4:
31 3A
J6 19
35 3q

00'
110
00
00
00

0'8
DK
01
OA
01

lB
lB
lC
lC
lB

lB
lB
lC
lB
1B

ST1RTlIIG
SET

IIlT£R1CTIVE ~1 17 0010 01 18 FF 51 0000 00 2F 52 3F OK 00 OA 111 111 2~ IF 1F IF IF
tllTER1CTIVE '>2 16 OOOC 01 20 FF 52 000'0 00 2F 53 itO OK 00 DA 19 19 26 1F IF IF IF
IIIT£R1CTIVF. ~j 15 ~OOq 01 JO FF 5] 0000 00 30 30 .. 0 08 DO 0'1 19 19 26 IF IF lP "

PREJUDICI
llAI'l'IIIG
FOR
IN.TERLOCK
SET

LOOPING

LOOPING
BITCH
SIT

lWAIT

~O

0026
0026
0026
0026
0026
0026
002&
0026
0026

08
08
10
20
18

FF
FF
FF
FF
FP

1~

lE
10
11
1&
17
lK
1'1
lB
lC

DO 01 17 17
80 0'1 lK lK
DO 01 18 lK
00· 01 1'1 19
DO 0'1 19 1'1
00 0'1 19 1'1
80 01 19 19
~o 0'1 1'1 19
80 DA 19 19
80 0'1 19 19

20' 15 1~ 1~ 1~
21 lq , .. lq lit
l2 0'1 0'1 0'1 0'1

23 00 00 00' DO'
lq IF IF 1F "
l~ IF IF 1F 1F

2& IF IF
l1 OB OB DB DB
2K lJ lJ H l J

"

l .. IF 1F IF IF
l~

l5
2f>
26
2&
26
26
26
l6

IF IF IF IF
IF IF " IF
1F IF IF "

IF
1F
IF
IF
IF
IF

IF
IF
1P
IF
IF
IF

IF
IF
IF
IF
lY
lY

jq

J/W)[aW(T)/ax]}

The usual procedure is to choose oa (t) so as to cause
the greatest diminution in 

0 (8) to dW =0= [aW(T)/ax], dx(T) +[aW(T)/aT] dT and the uniform sensitivity direction is oa(t) = -p'(t)G(t)/a(t) where a

/W) (aW lax) ]}'t=Tox(T) ="-k'ox(T) (6) where k is a vector constant (defined explicitly) for the nominal x (t), and It remains to determine a (t) such that the sensitivity of (1) to oa(t) is uniform over the entire trajectory. Assuming that a nominal control aCt) and trajectory x(t) on [to, T] are available, the time interval [to,·T] is partitioned into small increments of width fl.t. Any small control perturbation oaT at time r, to~r~ T, of the type shown in Figure 1, with amplitude A (r), duration fl.t, and fixed norm II oaT II is required to effect an equal change in

= a

(T, s)G(s)A(r) ds T where if> is the transition matrix of (5). The norm of oaT, from Figure 1 and (8) is In place of (6), the penalty function variation can be expressed as an integral d(T, s) G(s) ds A (r) = constant (11) [J Trajectory Computation 131 Eliminating A (r) from (10) and (11) gives [f X k' T T+~t cf> ( T, s) G (s) ds ]2 (12) I T The original steepest-descent gradient direction -p'(t)G(t) is modified by aCt) (see (9)) at r=t to produce uniform sensitivity in the penalty function. The optimum step size A, as before, minimizes =---O---C>--~-~----(l_-o -0.1 0 Kl 0.1 Figure 5-Simple analog circuit for real time simulation and fast time predictions using same integrators Minimization method In order to distinguish between local ffilmma and absolute minima the search in the X j parameter space is performed in two phases: 1. Systematic Grid Search All possible parameter combinations within a region of specified limits and a grid of specified fineness are evaluated for their performance J. Such a complete survey is feasible as long as the parameter space is of low dimension as in present applications (Figure 6). 2. Gradient Search The Fletcher-Powell-Davidon gradient method3 uses the grid search minimum as a starting point to precisely locate the minimum. Example: Booster load relief controller design The practical features of the optimization scheme are best illustrated using the following example. The peak bending loads at two critical stations, Xl and X2 (Figure 7) of a Saturn V launch vehicle shall be minimized during powered first stage flight while the vehicle is subjected to severe winds with a superimposed gust (Figure 8, top strip) and to an engineout failure at the time of maximum dynamic pressure. The major variables are: a vehicle angle of attack control engines gimbal angle Figure 6--Parameter optimization performed in two phases: (1) Systematic grid search (0) for complete survey of parameter space; Grid point of minimum J ( 0) serves as starting point for (2) gradient search which locates the minimum more precisely (D). From grid search contour plots (Lines of J = Const) can be displayed at CRT for better insight into J-topology. generalized displacement of bending mode at nose of vehicle cP vehicle pitch attitude error X commanded pitch attitude relative to vertical at time of launch z drift, normal to the reference trajectory ~ slosh mass displacement, normal to reference traj ectory Y (x), Y' (x) bending mode normalized amplitude and slope at station x 1/ Disturbance terms due to engine failure: cPF pitch acceleration due to engine failure ZF 1/F lateral acceleration due to engine failure bending mode acceleration due to engine failure Control law The attitude controller to be optimally adjusted has feedback of attitude error, error rate and lateral acceleration: (4) 138 Fall Joint Computer Conference, 1970 Rate Gyro Output: (Ji= cb+ Y' (XRG) 7j Accelerometer Output: Ti=Z+A l 4>-A2cJ>-A 31i+A 47J Gimbaled Engine Dynamics: Shaping Filters: Attitude Error Filter: cJ>s=-i/V+aw aw= tan-1(Vwcosx/V - Vw sinx) Figure 7-Minimax load relief performance criterion to minimize peak bending loads at two critical locations, Xl, X2: ~ =max/MBi(t) /M Bi / ~=1,2 +q tv+To jtv ()2RG(t) dt--+min (6) tv~ti = cJ>+ Y' (xPG)7J ; , / \ _ Peak Moment for ~ Constant Gain Case 50% higher than peaks I~ of optimized system 70 80 90 Figure 8-Typical Saturn V S-IC optimization results. Three gain schedules are optimally adjusted to minimize the peak bending loads among two stations (1541 and 3764 inches) for the disturbance and failure history of the top charts. Peak loads are substantially reduced compared with nominal Saturn V performance (without accelerometer feedback). Weighting factor q=O.05; floating optirilization time interval To=20 sec in performance criterion (6) Hybrid Computer Solutions 139 Engine Failure Switching Function: +1windward engine out o= { Wind Disturbance 0 no failure - lleeward engine out Failure Mode Only one bending mode and one slosh mode was included in this study. The bending moment at station Xi is M Bi = M' aia+ M' /lit3+ M';;~ 100~ aw Engine No. a o 3{~t o ~_ ~ -----,...Jr== Windward Engine Failure l: +-r ~~ I I Optimized Gain Schedules l a (sec) ~~t~~==~I~r=== I ~ t+--.,.....-~.-.,-- Complete numerical data for all coefficients used in the simulation are compiled in Reference 4. Pitch Angle Engine Deflection Selection of performance index (dtg ) fl (deg) r 50 [ ~ r[~--""""",,,,,,,,,,I~-+--= ,~ I I In earlier studies4 quadratic performance criteria such as Bending Moments at Station 2 MB2'5~, M B2 0 • Peak MOIllent for Constant Gain Case 22% Higher Than Peaks of Optimized System \ . Flight Time -sec 70 80 90 Peak Moment for Constant Gain Case 25% Higher Than Peaks of Optimized System I~ I At Station 1 (5) MBI M - : '5~ 0 -+- _.. __ .+-- BI 60 were-used. They allow a straightforward physical interpretation and at the same time can still be loosely related to linear optimal control theory. Neglecting external disturbances (aw ~ 0), Equation (5) can be rewritten in the more familiar form ~ Figure 9-8aturn V 8-1C optimization of Figure 8 repeated for assumed windward engine failure under otherwise identical flight conditions. Optimal gain schedules are strongly dependent upon assumed failure condition Results where a is an n-dimensional coefficient vector, q3, q4 are constants depending upon ql, q2, M a' and M / and superscript T denotes transpose. The results from optimizations where Equation (5) was minimized were disappointing insofar as peak bending loads were reduced by a few percent only, whereas the major reductions were in the RMS values of the bending loads. Since peak loads are of major concern to the designer, a more direct approach was made to reduce peak loads by using the minimax criterion (6) of Figure 7. During each run bending load peaks at both critical stations were sampled and compared. Only the greater of the two peaks was included in J. This peak amplitude normalized with respect to the structural limit M B was the major term in J. The only additional term, the mean square of measured error rate, was included to ensure smooth time histories and trajectory stability. This performance criterion reduced the number of weighting factors to be empirically adjusted to one, whereas n such factors must be selected in linear optimal control theory for an nth order system. A typical optimization result is shown in Figure 8. Drastic reductions in bending moment peaks result from the minimax criterion compared with the constant gain case. It should be noted, however, that perfect predictions 20 seconds ahead were used in the optimization including the anticipated failure. In Figure 9 the results of a similar case ar.e shown. All flight conditions are identical to the previous case except for the failure mode: a leeward engine fails in Figure 8,a windward engine in Figure 9. Again, peak bending loads are substantially reduced in magnitude compared with a case with nominal constant adjust~ ments of the controller. However, two of the three optimal gain schedules (ao(t) and g2(t)) differ drastically for the two failure modes. In view of the lack of any a priori knowledge about time and location of a possible engine failure no useful information can therefore be gained from the two optimizations concerning load relief controller design. This basic shortcoming of all strictly deterministic optimization schemes must be relieved before the method can be applied to practical engineering design problems characterized by parameter or failure uncertainties. 140 Fall Joint Computer Conference, 1970 A or B. The performance index evaluated for each failure may be of the form of Figure 10. Neither K optA nor K oPtB would be optimal in view of the uncertainty concerning the failure mode. Performance might be unacceptably poor at the level I Au if Failure A occurred and the control parameter were adjusted at the optimum for Failure B. The best tradeoff in view of the failure uncertainty is the minimum of the upper bound of J A and J B (solid line in Figure 10). In the example of Figure 10 this optimum which is the least sensitive to the type of failure lies at the lower intersection of the two curves. - J Upper Bound J of Performance for Failure A ~ B Extension of the optimum-seeking computing scheme B Failure A KoptA or B KoptB Figure lO-Optimum adjustment of scalar control parameter K considering possible occurrence of failure A or B The most direct way to locate the minimum of the upper performance bound is to simulate all possible failure modes for a given set of control parameters in order to determine the upper bound I. A gradient dependent minimization technique can again be applied to seek the minimum. One might expect convergence difficulties around the corners of these I-functions. However, only minor modifications were necessary to the basic Fletcher-Powell-Davidon gradient scheme and to the preceding grid search to locate comer-type minima. The changes included a relaxing of the gradient convergence test (I VJ I ~~, where }; is a small specified number). If all other convergence tests are passed, then ~ is doubled in subsequent iterations. In OPTIMAL REGULATOR DESIGN INSENSITIVE TO FAILURE UNCERTAINTIES Previous work to reduce parameter sensitivity has centered around inclusion of sensitivity terms aJo/aK in the performance index to be minimized, where J 0 denotes performance under nominal conditions and K is the uncertain parameter vector.5 Substantial additional computational load would arise if such an approach were implemented on the hybrid computer. Moreover, in the case of possible failures the uncertain parameters may vary discontinuously from one discrete value to another like the engine failure switching function in the preceding example: I for Failure A 0= 0 for Nominal Case { -1 for Failure B q1M B1 :T") ~j.B4\1J I Flight Condition A (Windward Engine Out) Flight Condition B (Leeward Engine Out) MAxlMBil MBi Figure ll-Generalized load relief performance criterion to minimize upper performance bound J for two possible operating conditions } = Another approach is therefore chosen: The hybrid optimization method is extended to account for such failure uncertainties even if no partial derivatives exist. Consider the case of two possible failure conditions, -:IJ max IMBi/MBd + t,,+To jt" 82RG(t) dt~min CaseA,CaseB i=l,2 t,,erand lengths, the LA and LB fields are also decimal quantities. The AC and BC bits indicate whether the fields addressed by A and B respectively, are located in the common partition or in the user partition. Thus, a maximum of 20K characters of memory are available to the System Ten programmer, 10K in common and 10K in his partition. The IA and IB fields of the instruction are used for index register selection. They are two bit binary fields; IA = 0 indicates that the A address is not indexed, IA = 1 indicates that the effective address is A plus the contents of the first index regis~er, and so on. The System Ten instruction set is given in Table 1. (A) denotes the contents of the field address by A. When a branch instruction is encountered, control passes to the instruction addressed by A if the condition specified by LA is met. If this condition is not met the condition specified by LB is checked and if met, control passes to the instruction address by B, otherwise the next instruction in sequence is executed. In addition to specifying conditions related to the outcome of arithmetic and input/output operations, the LA and LB fields may specify that a subreutine branch is to be taken or that a branch is to be taken when a device has a pending SERVICE REQUEST. In this latter case, the address of the device requesting service is stored at the location specified by B. One form of unconditional branch allows the programmer to give up a portion of his allotted processing time. This is the branch and switch instruction. When this instruction is encountered, a branch is taken and partition switching occurs. For example, if a program is waiting for a request for service from a terminal, it can be designed to relinquish processing time to other partitions until the -request occurs. In disc input/output instructions, the B field is the address of a six character disc address rather than a character count. No character count is required as System Ten disc records have a fixed length of one hundred characters. A disc file has fifty records per track. Record addresses are interleaved by five so that there are four sectors between sequentially addressed sectors. This permits the programmer to modify disc addresses and do a limited amount of housekeeping between sequentially addressed sectors. Thus, random access file organizations which employ some form of scrambling can be implemented very efficiently. There is, however, a penalty when purely sequential access methods are used. b1 b6 b5 0 o 0 0 0 0 0 0 0 0 I o 0 0 I I 0 I I 0 0 I 0 I I I 0 0 0 I I I 0 0 I 0 0 I 0 I I 0 I I 1 0 1 I 0 I I I I I t I 0 I 0 I 0 1 0 I 0 0 0 0 10 II VT 12 13 14 15 FF CR SO SI 8 9 0 I 0 NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF I 2 3 4 5 6 1 0 I 0 I:::S: 0 2 3 B LB ADDRESS 4 5 ADDRESS 6 7 8 9 CHARACTER System Ten demonstrates the feasibility of providing multiprogramming capabilities, without the need for a Row A o CONCLUSION b4 b3 jb2 bl 2 LA 186B I I I 0 I 0 I I I 0 I 0 I 2 3 4 5 6 DLE SP ! DCI DC2 DC3 -# DC4 S NAK % SYN a ETB ~ CAN ( ) EM SUB ESC + FS • GS RS I US 0 •A P Q R S T U • I . * . I 2 3 4 5 C 1 B C D E F G H •f v W 9 w h V I : J Z ; - K L M N ! 0 Figure 5-Character set b c u • [ i j k { ] I' m 1. , A- complex software executive. Adoption of a system design philosophy oriented toward application requirements rather than unlimited generality made implementation of the hardware executive a straightforward and inexpensive task. P q r •t X Y > a 1 d 8 9 < I 0 Figure 6--Instruction format J z ACKNOWLEDGMENTS We would like to thank the many individuals who contributed to the development of System Ten. Worthy of particular mention are D. Neilson, E. Poumakis, and H. Schaffer whose ideas provided the framework for the system as it exists today. : n ~ 0 DEL REFERENCES 1 T B STEELJR Multiprogramming--Promise, performance, and prospect Proceedings FJCC Vol 33 p 99 1968 On automatic design of data organization by WILLIAM A. McCUSKEY Burroughs Corporation Paoli, Pennsylvania INTRODUCTION of these relationships, of their representations in the IPS, of the representation of the data in the IPS storage, and of the logical access and storage assignment proc;esses which will operate on the data organization. The term process is used here to mean an operation or set of operations on data, whether that process is described by the problem definer or defined by the system designer. The system design procedure is itself a process and will be referred to as such. The procedure for organizing data for an IPS may be thought of ideally in terms of four operations. First, a problem definer interprets a problem in his environment and defines a set of requirements which are as complete and concise as possible and which any solution of the problem, manual or automatic, must satisfy. A problem definition is complete if, in order to solve the problem, a system designer needs no further information from the problem definer. The problem definer defines the information processing problem in terms of sets of data, membership relationships among· these sets of data, processes operating with the data, time and volume requirements on the processing, other constraints, and a measure of effectiveness for the solution. In order that the best possible design be produced, relative to the given measure of effectiveness, the problem definer should place as few restrictions as possible on the number of alternatives the system designer may consider. Second, the system designer develops a specification of logically ordered structure for the data and the logical access processes which may be used to find any element in the structure. This structure will be called the logical organization of the data. An example is a binary tree, array, or any directed graph. Third, the system designer specifies for these logically structured data the corresponding representatio1;ls in the storage and the strategies for storage assignment. The resulting structure will be called physical organization of the data. And fourth, the implementor of the system converts A number of research efforts have contributed to the beginning of a methodology for the automatic design of large-scale information processing systems (IPS). See for instance Nunamaker.1 One facet of study in these efforts is the design of data organization. Such a study was undertaJmn in the context of Project ISDOS,* now at the University of Michigan. The purpose of the study was to develop a model of the data organization design process and to create from this model a method for generating specifications of alternative data organizations. The first step of the study was to obtain a view of data organization uncomplicated by data usage. To this end the design of data organization (DOD) was divorced from the total IPS design process. A method for decision-making, which relates data organization to data usage and a measure of effectiveness, was to be a second phase of the study. The purpose of this paper is to outline some initial results and implications of a set-theoretic approach to DOD which was developed for ISDOS. The assumed framework of the DOD process is described briefly. Within this framework concepts of data are defined in terms of sets. The DOD process can then be described in terms of set-theoretic operations. Finally some implications of the approach are given. ORGANIZATION OF DATA-A FRAMEWORK The term data is used here to mean the IPS representation of objects which are used as a basis for decision or calculation. The term data organization is used here to mean the set of relationships among data established by the problem definer or created by the system designer, as well as the representations of these relationships in the IPS. A design of data organization is a specification * Information System Design and Optimization System 187 188 Fall Joint Computer Conference, 1970 the actual data from its present form to a form which meets the new specifications. Within this framework the approach was to view all concepts of data in terms of sets and then to define the design process, steps one through three above, in terms of set-theoretic operations on these sets. The settheoretic details may be found in McCuskey.2 The following attempts a more narrative description. CONCEPTS The concepts of data organization described below must be viewed in the context of an ideal automated design system such as ISDOS. The problem statement, written in a formal problem statement language, is input to a system design program. This program specifies how processes should be organized into programs, how data should be structured logically and physically, and how the programs and data should be managed as a complete system. The system design is then automatically implemented. The goal of this description of data concepts is to provide a framework within which to formulate a precise, simple algorithm. The algorithm must operate on a problem definition of data to produce a specification of IPS storage organization for the actual data. Because of this goal the sets of data which the problem definer describes are viewed here as settheoretic sets related by unordered cross-product relations. The algorithm must then establish what redundancies to keep, specify how the data should be ordered and then specify how this logical structure should be represented in storage. The goal requires that the distinction between logical and physical data organization be defined precisely. The logical structure discussed below is the structure which is directly represented in storage. It incorporates some features, like redundancy specification, which are generally considered in the realm of "storage structure". Problem description From the problem definer's point-of-view an IPS operates on symbolic representations of conceptual or physical characteristics such as name, age, address, etc. The elementary object used to build such IPS representations will be called a symbol. The problem definer must specify an alphabet, the set of all symbols which are valid for the problem he is defining. One such alphabet is the EBCDIC character set. Each occurrence of a characteristic, such as sex, amount, or title, may be thought of as an ordered pair of symbol sequences. The first component of the pair is the data name; the second component is the data value. The ordered pair will be called, generically, a data item. A data item will be denoted by its associated data name. An instance of a data item is a specific data name/data value pair. Thus (NAME, JONES)* is an instance of the data item NAME. Common usage abbreviates this statement to "JONES is an instance of NAME". A data item has sometimes been referred to as an attribute, data element, or datum. In common high-level programming language usage the data value is the "data" stored while the data name is "data about data" which appears in the source program and enters a symbol table during compilation. . From the problem definer's point-of-view the IPS at any point in time will contain representations of many different occurrences of a given characteristic, say warehouse number. Disregarding how warehouse numbers are associated with other data in the IPS, one can describe a set of all distinguishable instances of a data item, named WHNO, existing in the IPS at the given time and having the same data name. Instances are distinguished by data value. The set WHNO contains no repeated occurrences of warehouse number. Such a collection will be called a data set at level 0 (henceforth, data set/O). The data set is referenced, like a member data item, by the data name common to all its elements. Context determines whether a data name refers to a data item or a data set. Associated with a data set/O is a number, called the cardinality of the set, which specifies the anticipated number of elements (unique data item instances) in the data set. Among data sets/O exist cardinality relationships such as: "at any given time approximately three unique instances of ITNO and exactly one unique instance of CITY will be associated with a unique instance of WHNO". The anticipated cardinality and cardinality relationships among data sets, as defined here, are characteristics of the information processing problem and must be specified by the problem definer. The elements of a data set represent unique occurrences of an object, such as warehouse number, used in the problem as a basis for decision or calculation. What objects are used and how many unique occurrences of each must be represented in the IPS at anyone time depend on how the problem definer interprets the problem. These cardinality specifications eventually will help the system designer determine how much storage space * A pair of right-angle brackets, ( ), will be used to indicate an ordered n-tuple (here a 2-tuple). Automatic Design of Data Organization may be required for any data organization design which he considers. The concept of data set may be extended to higher levels. Data sets/O may be related by a named set membership association. The problem definer then describes processes in terms of operations on these associations as well as data items. For example, an updating process might be defined for INV (inventory) where INV is the data name associating the data items WHNO (warehouse number), ITNO (item number), and QTY(quantity). Nothing is said about ordering or logical structure on INV except the specification of set membership. In set-theoretic terms INV is a subset of the unordered cross-product of the three data sets/O. INV names the data set/l (data set at level one), the next level above its highest level component. Such set membership relationships may be visualized in a non-directed graph as a tree in which data set names are associated· with vertices and dotted arcs represent membership relationships. A graphic representation of the data set/l INV is given in Figure 1. A data set/n (data set at level n) (n;?: 1) may be thought of as a set of (distinguishable) ordered pairs. Each ordered pair is unique within the data set/no The first component of the pair is the data name of this data set/no The second component of the pair is an unordered m-tuple. Each component of the unordered m-tuple is an element (itself an ordered pair) of a data set/j (05::j5::n-l). At least one component of the unordered m-tuple is from a data set/(n-l). The term data set component refers to a component of this unordered m-tuple. A data set component is referenced by its data name. Data set element refers to a unique member element of the data set/no Component instance refers to the instance of a data set component in a given data set element. Figure 2 gives an instance of the data set •, :oN It' • , DBO , , , , , , , , , I , , " " " " " "• • l'1'lfO Figure I-Graph representation of data set INV " • qI'Y }> }> , (IftO,2>1> , }> Figure 2-Instance of data set INV INV. * The data set contains five data set elements. The data set components are WHNO, ITNO, and QTY. (WHNO, 3) is a componentinstance of WHNO in three data set elements. The concepts of cardinality and cardinality relationships, described above for data sets/O, are extended to data set/no As with data sets/O cardinality specifications for data sets/n must be given by the problem definer. According to the above definitions a data set/n element is unique within its data set. However, multiple instances of the same data set element may appear as component instances in a data set at a higher level. In Figure 2 (WHNO,3) is a unique data set element of WHNO but is a component instance in three data set elements of the data set INV. This multiplicity of occurrence of the same data set element is referred to here as redundancy. The amount of redundancy-the multiplicity of occurrence of the same data set element -in a data set/n is determined by cardinality relationships among the component data sets, by the cardinality of each component data set, and by the association of data sets defined by the problem definer. The design of logical data organization may be viewed as a specification of the amount of redundancy and ordering of data set elements and component instances. For the design process to consider as many alternative logical structures as possible, as little structure-redundancy reduction and ordering-should be implied by the problem definition. The above view of data sets admits as much redundancy and as little ordering as the problem definition can allow and still be complete and concise . Logical data organization The first problem for the system design process is to take a specification of these data sets and, by performing * A pair of braces { }, will denote an unordered m-tuple. 190 Fall Joint Computer Conference, 1970 ,., , , .. INV , • WHNO • ,., STOCK CITY • ITNO • QTY Figure 3-Graph representation of revised data set INV a sequence of operations, obtain a specification of logical data organization for the data set. Logical structure is provided for two reasons. First, the logical structure maintains in some form the membership associations established and referred to by the problem definer in his problem statement. Second, the logical structure provides a path or sets of paths to any element of the structure. Logical access processes, for example binary search, depend on such paths. The logical structure of data may be visualized as a directed graph and will be called a data structure. Each vertex of the graph represents either a data item or a data structure. A data item or data structure repre~ sented by a vertex will be called a data structure component. An arc of the graph then represents a logical ordering relationship between two data structure components. Such a directed arc is an ordered pair of data structure components and will be called a connection. The logical connection described here is the connection which will be represented directly in storage by a fixed or variable distance in the address space of the storage. A data structure can then be viewed as a set of connections-that is, a set of ordering relations among its data structure components. A series of contiguous connections, called a logical access path, may be formed between two data structure components. Logical access processes use these paths to access components in the structure. A specification of data structure is a pattern which when applied to an instance of a data set yields an instance of the given data structure. Consider the data set INV, revised and described by the non-directed graph given in Figure 3. INV has been redefined to be a data set/2. An instance of data set INV is given in Figure 4. To avoid the confusion of mUltiple brackets, the depiction of the data set instance in Figure 4 omits the bracket symbols of Figure 2 and factors the data names to the heading of the figure. Each circled single data value represents a data item instance. Data set membership relationships are represented by bounding lines in the heading. Each entire row represents a data set element of INV. Each column represents instances of the specified data item. While a horizontal ordering of data items has been introduced in the figure for ease of reading, it must be remembered that this ordering is only artificial: the data set components WHNO, CITY and STOCK actually form an unordered 3-tuple and ITNO and QTY form an unordered 2-tuple . In the development of a data structure from the data set INV the system designer might specify the connections (WHNO,CITY), (CITY,STOCK) and (WHNO,STOCK). Similarly the connections (ITNO, QTY) and .(QTY,ITNO ) might be specified within the . data structure developed from STOCK. The data structure· components of the data structure developed from INV are WHNO and CITY, which are data items, and STOCK which is itself a data structure. The structure indicated so far is depicted in Figure 5a. For convenience, INV and STOCK will temporarily be the names given to the data structures developed from the data sets INV and STOCK. Consider now the connection from WHNO to STOCK. This connection creates an ambiguous reference because there are t.wo data structure components in STOCK. If a logical access path is to be constructed from, say, WHNO to the data structure , , DV r - - S'l'OCIC~ lIDO Cl'l'Y I'l'110 Cll'Y G) ® ® ® CD @ @ G) ® Q) (j) (!) ® @ ® ® G) ® Q) (j) CD ® @ @ ® @ @ ® Figure 4-Instance of revised data set INV Automatic Design of Data Organization (a> ,. . ," .. • .. .. • • , I DV •• • STOCIC , (b) (0) (el) Figure 5-Development of a data structure for INV 191 192 Fall Joint Computer Conference, 1970 STOCK, then through STOCK to QTY, the questions can be raised: At what point or points, ITNO and/or QTY, can the path enter the data structure STOCK and at what point or points can the path exit STOCK? What is the precise path from WHNO to QTY and out to another component? It is important that this ambiguity be resolved. When the data structure is represented in storage, the programs which represent 'the logical access processes will operate on the storage representations of the logical access paths in order to access a given component representation. The ambiguity in the path from WHNO to QTY must be resolved if the program representing the logical access process is to reference the representation of QTY from the representation of WHNO. The ambiguity is resolved here by designating one or more of the data structure components as entry components and one or more components as exit components of the given structure. A data structure component may be an entry component, an exit component, both, or neither. The set of entry and exit components will be called the boundary of the data structure. Since a data item may be considered an elementary data structure, the boundary of the data item is the data item itself. A data item is its own entry and exit component. Thus, the connection to a data structure means a logical reference to its boundary; that is, to each of its entry components. A connection from a data structure means a logical reference from each of its exit components. This interpretation of. connections makes no assumptions about the storage representation of the connection or of the boundary. When the boundary consists of multiple entry and exit components the logical access process must contain the logic for deciding through which entry and exit component the logical access path should go. In the graph depiction of a data structure the boundary may be represented by broken arcs from the vertex representing the data structure to the vertices representing entry components; and by broken arcs from the vertices representing exit components to the vertex representing the data structure. The graph representation of the data structure then has a vertex denoted by the name of the data structure and a sub-graph representing the logical relationships among the data structure components. The arcs representing the boundary specify which subgraph is associated with which data structure vertex. In the data structure INV, Figure 5b, WHNO has been designated as the entry component of the data structure INV and STOCK as the exit component. ITNO has been designated as both an entry and an exit component of STOCK. QTY occurs also as an exit component of STOCK. The boundary of INV is the component set consisting of WHNO and STOCK. One piece is still missing from the picture of a data structure. An instance of a data set may contain multiple instances of its components. For example, for each WHNO there may be one CITY but many STOCKs. In the data set INV the same WHNO instance and CITY instance, for example (WHNO,3) and (CITY,A) in Figure 4, were associated redundantly with each of three different STOCK instances. The logical design of the data may specify that for each STOCK instance the corresponding WHNO and CITY instances will be repeated and maintained in the logical structure. In other words full redundancy of data will be maintained. If this design is implemented in storage, the same values of WHNO and CITY will be stored for each related instance of STOCK. On the other hand the logical design may specify that only one occurrence of the redundant WHNO and CITY will be maintained and with that occurrence of WHNO and CITY will be associated a new data structure each of whose components is one of the related instances of the data structure STOCK. The redundancy of WHNO and CITY has been reduced. This structure is depicted in Figure 5c. A structure of multiple instances of the same data structure is sometimes called a repeating group. Within this newly created structure the instances of STOCK for a given WHNO/CITY combination are given some ordering, e.g., ascending numerical order by ITNO value. In addition, a boundary is specified for this new data structure; for instance, the entry component is the numerically first STOCK (f(ITNO» and the exit component is the numerically last STOCK (l(ITNO». In the graph these ordering and boundary specifications can be attached to the arcs to and from the STOCK vertex. The system designer may give a name to this new structure, as nS(1) in Figure 5c. Assuming the given redundancy reduction, one can apply similar reasoning at the level of INV. According to the cardinality relationships given earlier, several instances of the data structure for INV will occur in an instance of the data structure, one for each instance of WHNO. Each instance of INV will have the logical structure described in Figure 5c. This new structure has three components: WHNO, CITY, and nS(1). In each instance of INV, WHNO and CITY appear once and are connected to a data structure whose components are instances of STOCK. The data structure, nS(O), combining all instances of INV structure will be an ordering of instances and will have a specified boundary. The complete specification of the data structure is given in Figure 5d. The design gives a specification of both ordering and redundancy and establishes the Automatic Design of Data Organization network by which a data structure component may logically be accessed. Note that the membership relationships given by the problem definer have been maintained. Associated with a data structure is one or more logical access processes which function to find or access the relative logical position of a component instance in an instance of the data structure. A logical access process uses contiguous connection instances to construct a path to the relative position of the desired component instance. For example, to find the data value of CITY for a given data value of WHNO, an access process for the above structure might create. a path which starts at the entry component instance of the data structure DS(O) and courses through each instance of INV until it finds the given WHNO value which connects to the required CITY value. In each instance of INV the path leads from WHNO to DS(l) and exits via the DS(l) vertex. The access path does not course logically through the instances of STOCK. From the point· of view of the system designer a logical access process is like a problem-defined process. The system design process must incorporate it with the problem-defined processes into the program structure. Any data which the logical access processes need in order to function properly are designer-defined data sets which themselves must be organized logically and physically. At this point the system designer becomes a problem definer. Physical organization Physical organization of data means here the IPS storage representation of the given data structure. Two degrees of physical organization should be recognized: relative organization and absolute organization. Relative organization involves the manner in which the data structure components, connections, and boundary of a data structure will be represented in IPS storage. Such a specification involves questions of numbers of cells of memory required, sequential or non-sequential storage of components, header cells, etc., but not the actual locations of the data in storage. Absolute organization involves the actual locations of the components, connections, and boundary representations in storage. Absolute organization is specified by a set of storage assignment processes and must maintain the specified relative storage organization. In the following discussion major consideration is given only to the relative organization. For the design of relative physical organization a relative storage area is defined here. This conceptual storage area is an ordered set of cells. Each cell is 36 y - Length Poai tion I +-- '0. , . . . . . 2. 5. 6. ,..... b1ta- 1, !: ......... J ~---~r"''''~", I. . . . . . . . ~ I:--__---!l ~ r----.,;.----:O ...- ... ... r'" ............ 1 ... Ai I ~-..........".. ... -... -...~L I ......... . . 7. 8.1~-"'-"'-.. --~I ......... ...-r-------..;;:... 9'1 . . . .J ...... "'1' I 10. :......... ... _ r[ . .~~ . . . . . . . . . . . )t------!l ~ 11. 193 I ......... > -. ..... - - 1 Bode A ..... .............. ... ... ... : : ..... ~'T-I...-..;.... ""'------.>I J Figure 6-Storage node A in relative storage uniquely identified by an order-relative position and a length. Cells are non-overlapping. The length of a cell is measured in common units, such as bits. Looked upon as elements of a set, and regardless of what they will represent, the cells in relative storage may be grouped together conceptually in many different ways. A storage node, or simply node, is defined to be a collection of elements each of which is a cell or a storage node. The cells in a storage node need not be contiguous relative to the ordering. In Figure 6 node A consists of three elements: two nodes and a cell. A single cell may be regarded as an elementary node. For convenience in referencing a node relative to another node a measure of separation between the two nodes is defined. A separation between two nodes will be defined as the number of units of length, bits or words for instance, which, according to the given order-relative positions of the cells in relative storage, separates a reference cell in one node from a reference cell in the other node. The ceJl from which reference is made in the given node to another node will be called an exit cell of the given node. The cell to which reference is made in the given node from another node will be called an entry cell. The reference cells of a node will be called the node boundary. An elementary node or single cell is its own entry and exit cell. Specification of entry and exit cells for a node is required for much the same reason that entry and exit components are specified for a data structure. If particular boundary cells were not specified, then reference to or from a multi-cell node would be ambiguous. In Figure 6 cell 0 has been designated the entry cell of node A (denoted by Ai). Cell 7 has been designated the exit cell (denoted by A 0). It should be noted that the choice of the boundary 194 Fall Joint Computer Conference, 1970 3. 1 4: Node B 8: B.~ 9: J Figure 7-Storage node B of a node in the conceptual relative storage is arbitrary. Multiple entry and exit cells may be designated in a node. Several different separations can occur between two nodes if one or both have multiple entry and exit cells. In Figures 6 and 7 only a single separation has been defined between nodes A and B. This separation is one cell-Ao to B i . Figure 6 and following assume that all cells have the same length and that separation is measured in number of cells. The system designer must specify first how to represent a data structure by a node in the conceptual relative storage and then specify a storage assignment process for mapping the node into absolute storage. The relative storage representation of the components, connections, and· boundary of a data structure will be called here a storage structure. A data structure component is represented in the relative storage by a node. If the data structure component is a data item, this node may be a set of cells which are contiguous relative to the ordering of relative storage. In Figure 8a the designer has decided to represent the data item WHNO by a two-cell node with the first cell being both the entry cell,WHNO i , and the exit cell, WHNO o• The system designer has decided that the first cell of the node will be the reference cell in any future separation specifications. The specific orderrelative position of this node in relative storage is unimportant. Only the separation between it and other element nodes of the given storage structure is important. Figure 8a also represents data items CITY, ITNO, and QTY. The number of cells required to represent the data item is determined from Hie problemdefined size of the data value. This representation assumes that only the data value is to be represented in the node. If the data· structure component is a data structure itself then the storage structure maybe defined recursively by representing the components, connections, and boundary of this component. A connection in a data structure may be represented in one of two ways: 1. by a fixed separation between an exit cell of the node representing the first component and an entry cell of the node representing the second component; 2. by a variable separation which will not be given actual value until storage assignment takes place. In either case the IPS will maintain a record of' the separation. In common practice the fixed· separation is recorded as a fixed address offset in program instructions. To' handle variable separation the system designer may define another data item, a pointer, to be associated with the data structure component from which the connection is defined. The system designer also defines maintenance processes to update the pointer and perhaps other associated data sets, such as headers and trailers,· to aid in maintenance.· In Figure 8b the connection (WHNO,CITY) has been represented by a variable separation in the form of a pointer. A fixed (a) 'lHN°l I'l'NOi [ I [ I WHNO ITNO ] I ] WDOo CITyil ITNOo qI'Yi I I C1TYo CITY I qI'Yo qI'Y (b) 'lHN°i OITYi [ 'lHNO ] ~ I CITY WHNOo I'l'NOi [ qI'Yi I I ITNO Ql'Y , CITYo (c) STOCK1 [ I I ITNO qI'Y J STOClCo r I, STOCK o Figure 8-Development of storage structure ] I'l'NOo I I Q'l'Yo Automatic Design of Data Organization separation of two cells has been specified to represent the connections (ITNO,QTY) and (QTY,ITNO). The data structure boundary is represented by a node boundary developed in the following way. The designer may specify that the boundary of the node representing the whole data structure consists of the entry cells of nodes representing the data structure entry components and the exit cells of nodes representing the data structure exit components. Alternatively, the designer may incorporate additional cells in the node and define them to be the entry and exit cells of the node. He then defines a fixed or variable separation between these cells and the respective boundary cells of nodes representing the data structure entry and exit components. The additional cells and the boundary cells of nodes representing the data structure entry and exit components together represent the data structure boundary. In terms of the graph representation of a data structure, for instance Figure 5d, the use of additional cells corresponds to treatment of the broken arc, say from DS(l) to STOCK, as a true connection; DS(l) is represented by the additional cells and the connection is represented by fixed or variable separations between these additional cells and the ENTRY and exit cells for the first and last instances of STOCK, respectively. If no additionaJ cells are used, the broken arc is not viewed as a true connection and is therefore not represented explicitly in relative storage. In Figure 8c the data structure boundary of STOCK has been represented by the entry. cell of the entry component ITNO and the exit cells of the exit components ITNO and QTY. Associated with a storage structure is one or more storage assignment processes. A storage assignment process will apply the relative storage structure design to an instance of the data structure and assign actual physical hardware memory locations. The storage assignment process is responsible for maintaining all "data about data" which is necessary for assignment of positions and all positional information which is necessary for use by the implemented logical access processes. The anticipated number of length units, e.g., cells, required by a node to represent a data structure instance may be developed from the size of data item nodes, the cardinality relationships given by the problem definer, the amount of redundancy defined by the system designer, and the requirements of pointers and related additions. See McCuskey3 for details. A storage assignment process, like logical access processes, must be incorporated with the problemdefined processes to create program structure, whether at the operating system level or elsewhere. Any "data about stored data" which the storage assignment process requires is, from the point of view of the system - r- -- - 195 --I DlISIGlf PROCBSS I I I I I I DA.TA. STRUCTURE I DESIGlIBR I I I I I DESIGNER I I I I I I STRUCTURE I DESIGNER I L ________ r - - - - - - - - - - - --.J I ~I----~ --l Figure 9-Data organization design process designer, just like problem-defined data-data sets which must be given logical and physical structure. DESIGN PROCESS The goal of the concept descriptions above is to provide a framework within which .to formulate an algorithm which, given a specification of problemdefined data, would specify how the actual data will be stored and accessed in the IPS. Figure 9 gives a very general flow chart of a design process for data organization. In the design process the data structure designer accepts a specification of data sets and generates a specification of data structure (components, connections, and boundaries) and of logical access processes. While generating a specification of data structure, the designer acts itself as a problem definer; the problem is logical access to components of the data structure. The 196 Fall Joint Computer Conference, 1970 decision-maker has been developed. How a decision should be made at each point depends on the relation between the designed structure, the processes operating on it, and the performance criterion. As yet this relation is not understood . Consider the specification of data structure for set INV (Figure 3). Suppose first that the given redundancy is to be retained. Then a general, recursive data structure design procedure might be: ' \ DS(O) f'(1IBRO) I I 1(1UIIO) \/ ./ ,; ,/ BY ........... "- "- '\. ,/ / "\ // t \ / \ Process D .-----...-.~.~ ~ 1IBRO / ~ / / erl'Y I / / / / /1 , '\ \ \ \ ~~\ ;.~ Figure 10-Result of process D definition of logical access processes must be input to the process structure design in order to be incorporated in the program specification. The structural information must be specified in terms of data sets and then input to the design algorithm. The storage structure designer accepts the specifications generated by the data structure designer and produces a specification of storage structure (relative storage representation of data structure components, connections, and boundaries) as well as the storage assignment processes which will map the storage structure into absolute storage. Like the data structure designer, the storage structure designer is a problem definer; the problem is storage assignment. The storage assignment processes and information required by those processes must be defined and run through the design algorithm. The process structure designer organizes the original problem-defined processes, the logical access processes, and the storage assignment processes and generates program specifications. How the logical access processes are represented in programs depends on how the storage structure and storage assignment processes have been designed. How the storage assignment processes are represented in programs depends on the characteristics of the absolute storage of the IPS. In the context of this general picture of the design process only the specification of data structure and storage structure is considered below. An initial attempt at a method of generating alternative designs is described. The purpose of this attempt was to gain an understanding of what decision points are involved. No 1. For each component of the given set, if the component is not a data item then apply Process D recursively. 2. Define connections among the components of the given set. 3. Define a boundary from among the given components. The process assumes all instances of a component are structured alike. A component may be a data set component or, in a repeating group, a data structure instance. The result of an application of Process D to INV, yielding a structure similar to that in Figure 5d, is given in Figure 10. Note that redundancy ofWHNO and CITY will be maintained here while in Figure 5d it is not retained. Suppose now that Process D has not been applied to INV. Suppose one wishes only to reduce redundancy. Reduction of redundancy may be accomplished in the following way: Process R 1. Partition the original set according to the data values of instances of one or more components. A partition element is a subset of the original set. In a partition element each criterion component maintains mUltiple instances of the same data value. 2. Replace the partition element by a new element in the following way: a. one instance of each criterion component replaces the multiple instances; b. the remainder of the original subset is grouped by itself as a separate element. The replacement operation will be called here truncation. The remainder will be called the truncated set. Figure 11 develops from Figure 4 a partition and truncation of INV according to the values of WHNO. The deleted component instances are blacked out. As in Figure 4 rows represent (unordered) data set elements and columns represent (unordered) data set components. Automatic Design of Data Organization In Process R step one establishes which redundancies may be reduced in step two. The partition in Figure 10 establishes the redundancy of WHNO by definition; redundancy of CITY is established because the problem definer specified only one CITY instance per WHNO instance. The truncation operation performs the actual redundancy reduction. Neither, one, or both of WHNO and CITY may have redundancy reduced. In Figure 11 both were chosen. These operations may be extended. A sequence of partitions leads to several levels of redundancy reduction. The sequence of partitions establishes a set of candidates for redundancy reduction at each level. The candidates are the criterion components established at that level or above and other components which are in one-to-one correspondence to criterion components. Starting at the deepest level and working upward, the design process can decide which candidates to choose for redundancy reduction. For a given candidate to have its redundancy reduced to a minimum its redundancy , must be reduced at each level from the deepest up to the level at which it originally entered the set of candidates. If its redundancy is not reduced at the deepest level then its full redundancy is maintained. Partial redundancy for a component is established by not selecting the component for reduction at some level in between. Once the component fails to be chosen it drops out of the set of candidates; its redundancy is fixed. This expanded redundancy reduction scheme at each level factors out selected components to the next higher level and leads to a hierarchical form of organization. The scheme may be combined with Process D above to form Process DS: Process DS 1. Define n-Ievel partition. 2. For level n, n-l, ... , o. a. Define a truncation at this level. b. In the truncated set. 1. apply Process D with data set components and, possibly, truncated sets as components. ll. apply Process D with truncated set elements as components. Operation 2.b.i specifies the structure of an element of a repeating group or data set. Operation 2. b.ii amounts to specifying the structure of that repeating group. Once a component or truncated set has been structured it is a data structure component. Figure 5d shows the pattern resulting from one application of Process DS to INV. Figure 12 shows the 197 , ~----------DN r--- S'1'OCJ( ~ WHNO parti tion I~ element " - I~_" 205 rO l til k=l '---v-----' '----v-" Retrieval of Retrieval of index record data record If the top level is not contained within a single block, the mean retrieval time will be a function of the number of blocks comprising said level, and the search strategy employed. Under these circumstances, the results of Equations 3 or 7 may be applied, together with Equation 9, to obtain an expression for Tr • .- 05=t- az=l - j , I ! I r~som3Hi-TvL CASE STUDY The results derived in the preceding section may be used to establish the relative merits of the several techniques. To illustrate this procedure, a hypothetical application is presented in the form of a brief case study. It is assumed that the file in question consists of m records. The index file (s), with the exception of the top level (Case III), are maintained on a drum while the main file is kept on disk. The block size is fixed (for all devices) at 1024 bytes. It is further assumed that the data records are 512 bytes in length; a retrieval thus involves a single disk access. Index records are 16 bytes in length (12 bytes for the key plus 4 bytes for the address). Therefore each block has a capacity of 64 index records (b = 64). The drum and disk are characterized by the following parameters: Drum Mean Latency Data Transfer OO&=!.----- 11 = 17 Xl msec .43 msec Figure 6-Number of index levels (Case III) or more, the advantage of the binary search considerable. IS Case II: Calculated index In this case, the index file is organized as a random file. Where linear probing is employed, the m records are distributed over n blocks. With block chaining, n' = n+ tl.n blocks are allocated. Figure 8 describes the variation in Tr as a function of m/nb. The utilization factors, U (Linear Probing) and U' (Block Chaining) are likewise plotted in Figure 8. For a utilization factor of .80, the mean time to retrieve a record, assuming Linear Probing, is 109.1 msec. With Block Chaining (and a utilization' of .80) the retrieval time is 115.2 msec. Note that in either case, the retrieval time is independent of the size of the file, dependent instead on the ratio m/nb where n is controllable via T. Disk Mean Seek Mean Latency Data Transfer 82 12 X2 msec =75 =12.5 msec = 4.16 msec Case I: Spatial index In this case, the index file is organized as a sequential file-a physically contiguous string of blocks containing the m index records, ordered by key. We assume for the sake of example that the mean occupancy of a block is 50 records. This corresponds to a utilization of roughly 80 percent (U ~ .80). Under these circumstances, the mean time required to retrieve a record is described in Figure 7. For files of 30,000 records Case III: Tabular indices In this case, K sequential index files are maintained. It is assumed here that K is selected so that the top level index is contained entirely within a single block. Assuming a utilization factor of .80, K is obtained as the solution to the following inequality: 50K - 1 < m ~ 50 K (10) We further assume that the top level index (a single block) is kept in core. The time spent searching the top level may therefore be neglected. Hence, the mean retrieval time is given by Tr= (K-1) (ll+Xl) + (82+12+ x 2) (11) 206 Fall Joint Computer Conference, 1970 v ,. J / ~ -" ~ ~ ~ ~ / r.r ~ ~ ~ ~SEMCH ~H b=84 U~.• ,e' ,t' ,t' m NUM8ERCJ= RECatOS Figure 7-Mean retrieval time (Case I) Figure 9 describes the variation in Tr as a function of m. For example, when 2,500 !!! a: I- w a: o CylinderN 100 I- w ~ ~::~: ~~--,<--'<;----; i= ~::~; ~~~.>,----,<---; Number of Records = 50,000 ~::: 1 p...,"'--r'-T~"""'--i"-T.>....j Cylinder Overflow Index No Master Index l--'-1--'--'--'L.LJ-.LJ U = random keys S = sorted keys IC = insertion (write-checks included) CC = change (write-checks included) 10T--r---------~-----,------------~---- 100 1000 NO. OF RECORDS TO BE RETRIEVED Master Index Figure 1 211 212 Fall Joint Computer Conference, 1970 50 I8 40 ~ ~ ........ .... --..... -- --- - - - _ _ U,I II: 8w II: § ............ (a) number of index levels, (b) device placement of index levels, (c) device placement of data, (d) block size of data, .(e) amount of overflow, (f) device placement of overflow, etc. Parameters which are fixed for a specific . file design include (a) actual method of access-direct or buffered sequential, (b) number of buffers, (c) type and number of transactions, (d) file size, (e) record size, etc. The paper is divided into three sections and an appendix. The sections are: a. The characteristics of direct access through the indexes; b. The characteristics of sequential search of the data; c. A comparison of the two methods. 30 w > w C t; II: e w ! 20 Number of Records .. 50,000 Cylinder Overflow No Master Index U .. random keys S .. sorted keys 0" retrieval I - i~ C - change --------- ~ __ - - - ---------U.Q _--- _---- ~ S.C S.Q The appendix of the paper presents comparisons of model runs with actual computer runs to illustrate the accuracy attainable with the model's approach. 10 o+-----~------r-----.------,------~--50 40 30 20 10 50 PERCENT OVERFLOW ............. Graph 2 .gives, for the first time, an indication of the complex behavior of an actual data management access program. FOREM I, which was used for the study, contains 300-500 FORTRAN statements dealing with the analytic evaluation of access time and storage layout for different parameter values. Each run consumes on the average about 10 seconds of machine time and a few minutes of designer set-up time. 40 I ~ ................ - .......- - _ --------- .......ndex Cylindar Overflow 30 II: 8w II: § Number of Records = 50,000 I -h:tsert C'"' c:hange 0- retriewl S .. sorted keys U" r.ndom keys III > W ...wC ...wo 20 --------- U,C _ - - - - ----------S,C U,Q II: The indexed sequential access method is one of the few fundamental, qualitatively different access methods (sequential, direct, and perhaps chained access being other possibilities). It is based on data and a hierarchy of indexes that are sequenced on a particular identifier. The method has been programmed by manufacturers and users in a number of specific implementations. In Figure 1, we present the physical storage layout of one specific implementation. Its variable parameters include _--- s,o :IE i= THE INDEXED SEQUENTIAL ACCESS METHOD U,I 10 o+------.------.------.------~----~--10 20 30 40 50 PERCENT OVERFLOW Graph 3 Analysis of Complex Data Management Access Method 213 DIRECT INDEXED ACCESS In direct indexed access, the data management routine is presented with the identifier of a particular record. The identifier is compared sequentially ~gainst the key entries in the highest level index. When the identifier is matched, equal or low, descent is made to a section of the next lower level index by means of an address pointer associated with the appropriate key. At the track level index, for every prime track* there are two index entries: one containing the highest key in the prime track and one containing the highest key of the overflow associated with the prime track. Search of the prime track involves sequential processing of blocked records on the track. Search of the overflow involves following chains through the records that have been pushed off the prime track by insertions. The critical 500 --- ..... ------- Cylinder Overflow Master Index Number of Records '" 1,000,000 U" random keys S '" sorted keys a '"retrieval I .. insert C = change w > w ii: ~ e 200 w ------~'£...------ ------s:-C-U,Q. :I; ..... . . - -- _---- s,a j:: 100 o+-----~----~------~----~------ 600 jg -_ ---- 10 20 30 40 __--__ 50 PERCENT OVERFLOW ................. .!!:.,I--- 500 --Graph 5 z 8w S!! ~ 400 ~ ~ ________ ~---- U..Jl. - - - - -------s,C 8 ~ I _________________~~~a~-------300-1- w > w Cylinder Overflow No Master Index Number of Records = 1,000,000 U = random keys S = sorted keys I = insert C = change a = retrieval ~ Gi a: 200 ~ w :i i= 100 Ol+-----.-----.------.-----r----~-------- 10 20 30 40 50 PERCENT OVERFLOW Graph 4 * A prime track is a track dedicated during the loading process to contain data records whose keys fall within a particular value range. When inserts are made and no space is available, records will be pushed off the track into an overflow area. These records are said to be overflow records. parameters which we studied were: 1. File size (two files: 50,000 and 1,000,000 records); 2. Number and placement of index levels (Master Index (MI = 1\1), no master index (None), and master index in core (MC»; 3. Overflow configurations: a. Overflow in the same cylinder as the prime track or cylinder overflow (IE); b. Overflow in the same pack (I B) ; c. Overflow on a separate pack (IS); 4. Percent of overflow (eleven values: 0-50 percent at 5 percent intervals); 5. Transaction types (query (Q), change (C or CC), and insert (I or IC». (The second C indicates that a write check is made.) 6. Input key stream (random SU = U or sorted SU = S); 7. Number of records retrieved. The records were 725 bytes long and were stored in unblocked form on an IBM 2314 disk storage device. The indexes were on separate packs from the data and 214 Fall Joint Computer Conference, 1970 Index structure 400 ............ ' ........ ............... ---- - - - . _ U,I en a z o (,J w ~ U) a 300 u: 8w u: ~w > w ~ I- ,-- 200 ~----­ U,Q_----- u: o w :E --- ..... Cylinder Overflow Master Index in Core Number of Records = 1,000,000 S = sorted keys U = random keys I = insert C = change Q = retrieval w I- ...... Index structure tradeoffs can be considered by consulting Graphs 2, 3, 4, 5 and 6. Graphs 2 and 3 indicate that a master index* is not useful for a small file while Graphs 4 and 5 indicate that the opposite is true for large files. This in itself is a relatively obvious conclusion, however, the location of the decision point between the two file sizes is of more interest. This decision point depends on whether index entries are placed in full track blocks (for performance) or in smaller blocks (to conserve core storage). Forfull track blocking with reasonable key sizes, a master index becomes useful only after the cylinder index exceeds four tracks in length (for an IBM 2314, this is equivalent to seven disk packs). At the other extreme, where each -------~ i= S,Q ", 70 Ol+------.------.-------r------r------~--50 40 30 20 10 ------- U,I _ . . . - " " 60 PERCENT OVERFLOW Graph 6 ,.,. . / .,/ ./ en 0 z 0 50 (,J w ~ U) 0 u: processing time for the qualified records was assumed to be negligible. Even though it was apparent that a large number of model runs were involved, it is also clear from the immediately previous .statements that all possible parameters were not varied. 0 (,J w 40 u: 8 w > w ~ Overflow in same Disk Pack No Master Index Number of Records = 50,000 U = random keys S = sorted keys I = insert C = change Q = retrieval 30 I- w u: 0 I- w :E N umher of records retrieved Graph 1 indicates the general behavior of various transaction types as the number of records retrieved in a transaction is varied. In the unsorted key case, the average time per record remains constant, independent of number of records; the· sorted key case diverges from this curve because the access .arm requires smaller and smaller steps to transverse the data disk pack as more records are retrieved. In these runs, index blocks were not buffered so the divergence is not as great as it would be if the access arm on the index pack could march across the index files .also. Insert requires more time than change because records must normally. be moved to make room for the inserted record. i= u,.,S...- 20 - ---- -- -- ----~.-_-sC ...-"'- ,... ... ,... ......... ..". 10 O+------.-------r------~----~------_r_ 10 20 30 40 50 PERCENT OVERFLOW Graph 7 * Master index is an index to the cylinder indexes (Figure 1). Analysis of Complex Data Management Access Method entry occupies a separate physical block, the decision point lies at two cylinder index tracks (corresponding to about one-half of a 2314 disk pack). For the large file, the differences between these choices can be quite significant: a. master index, full track index blocking 1'-'130 seconds; b. no master index, full track· index blocking 1'-'300 seconds; c. no master index, one index entry per block 1'-'1600 seconds. .,..,. / / 700 ------ U.I,,""" -'-'~ 600-r-------- 500 ii:i o z oo w The permanent storage of the master index in core provides an additional 20 percent improvement over case (a) (Graph 6). 215 ~ (I) o ~ ow 400 Overflow in same Disk Pack Master Index Number of Records = 1.000.000 U = random keys S = sorted keys I = insert C = change = retrieval a a: ~ w > w a: / / 50 / ,.,..,..,.", 300 / I- / w a: ~ w ::?! / i= U.I// 200 ---------~ --- ---- U C..,,""" .",.. '''./ __ -- ....,.,-" U.q.......".., __ _ ' - - ,.,. ..--" "" S.O -S.C 40 ii:i 100 0 z 0 0 w ~ (I) 0 a: 30 0 0 w a: 8 w > w a: ~ w a: 20 Overflow in different Disk Pack No Master Index Number of Records = 50.000 S = sorted keys U = random keys I = insert C = change a = retrieval ------ 10 /" 20 30 40 50 PERCENT OVERFLOW // / .",/ U.C,,/ ./ O~""" . / Graph 9 _ ..,,""G." ___----' 0 04-------~------r_------r_----~r_----~ ....,.,"" S.C ~ w ::?! Overflow configuration i= 10 O+-------~------r------,------~------_r_ 10 20 30 PERCENT OVERFLOW Graph 8 40 50 Graphs 7-10 supplement Graphs 2 and 5 (the most desirable index configuration for IE) to provide a picture of performance behavior by overflow configuration. In all overflow cases, the numbers of logical and physical records per track are significant parameters in predicting performance. All operations are affected·by the number of ·logical records per track; even small percentages of overflow result in long overflow chains when there are onehundred or more reCords per track. On the other hand, the number· of physical records per track primarily 216 Fall Joint Computer Conference, 1970 advantage is somewhat compromised by the sensitivity of this configuration to insertions that are not uniformly distributed over all the cylinders. Since enough space must be reserved in every cylinder to hold the maximum number of inserts per any cylinder, there can be extreme space wastage in those cylinders which have little insertion activity. This problem can, of course, be eliminated by combining IE with one of the other overflow configurations, on same pack (IB) or on separate pack (IS), to handle unusually dense insert activity. The differences between same pack and separate pack are less significant than their differences with respect to cylinder overflow. In general, performance will be worse than separate pack for small amounts of overflow, but 500 400 iii c z 0 u w ~ en C a: 0 u w 300 Overflow in different Disk Pack Master Index Number of Records = 1.000.000 U = random keys S = sorted keys I = insert C = change = retrieval a ", a: ~ U.~'" w > w a: ~ w a: 0 ./ 250 / 200 ~ w :E i= ---- " " ,,'" " ", " ..".," u.a....... '" ........ ..".,"S.C '" Number of Records = 50.000 ----- ----- ---- 200 iii c z ou 100 w !e en c a: o u w a: 10 30 20 40 50 8 w PERCENT OVERFLOW > ~ 150 ~ w a:: Graph 10 ~ w :E OVERFLOW IN SAME CYLINDER j:: infiuences insertion behavior. When room must be made for inserts on the prime tracks, the following records must be rewritten block by block until the last record is pushed into overflow. The penalty for rewriting large numbers of physical blocks on the prime track is so drastic that performance generally improves as overflow initially increases, because insertion into overflow is less costly. The surprising fact is that insertion performance will normally improve until the number of blocks in the overflow chain is twice the number of blocks on the prime track. As expected, cylinder overflow (IE) generally provides the best performance because no additional arm motion is required to access the overflow area. This performance 100 o L-____~~--__~------~------~----~ 10 20 30 PERCENT OVERFLOW Graph 11 40 50 Analysis of Complex Data Management Access Method 217 small. In the case of separate pack overflow, the initial and subsequent arm motions will average one-half the number of cylinders in the overflow. For amounts of overflow exceeding one-half pack in size, these longer subsequent motions will dominate performance. 1,000 iii o z 8w 100 ~ en Designing a file with overflow o a: 8w a: w > w a: ~ w a: ~ 10 Cylinder Overflow No Master Index Overflow =0% w :E i= Number of Records = 50,000 100 10,000 t,OOO 100,000 NO. OF RECORDS RETRIEVED It is generally believed that overflow hampers performance. In fact, since insertion performance often improves with increased overflow, optimum total performance may be obtained when there is a certain amount of overflow in the file. The optimum can be determined by weighting each of the individual curves for retrieval, update, and insertion by the percentage of transactions of that type. When the curves are added together, the minimum on the total curve will lie at the optimum overflow percentage. For example, in the case of the small file without a master index, we will assume that all transactions Graph 12 15~---- _____________________________ eventually "will be better for very large amounts. This is because the initial arm movements to overflow for same pack overflow will be across half the prime area and a portion of the overflow. Arm movements for chain following inside the overflow area will be relatively (2) en o z 8 w 1,000 ~ (4) 10 en o It: o U w It: ~ (8) « (16) en a: w iii 0 z w >' w 0 () w !!! 100 ~ 0 It: 0 o () w a: w > w ~ w 5 :I i= a: ~ w a: 0 ~ Cylinder Overflow No Master Index 10 Cylinder Overflow No Master Index Overflow = 25% w :E i= Number of Records =50,000 Number of Records = 50,000 100 1,000 10,000 NUMBER OF RECORDS RETRIEVED Graph 13 100,000 0+------,-------r------.------,------'25 o 5 10 15 20 PERCENT OVERFLOW Graph 14 218 Fall Joint Computer Conference, 1970 15(2) 30 15(4) 15(8) 15(18) IB (2) iii o z o ~. 20 IB (4) fh 18 (8) IE: IB (18) ~ o 8 w IE: ~ w > w a:Iw IE: ~ ~ 10 t= Number of Records .. 50.000 Number of BufferS in bnckets No MISter Index IS .. Overflow in different Disk PlICk IB .. Overflow in same Disk PlICk expensive if there are a large number of blocks on the prime track. Nonetheless, it is especially needed in insertion to protect the correctness of the rewritten data. THE BUFFERED SEQUENTIAL ACCESS PROCESS In this process,. the ·system is presented with an identifier and finds, by means of an index search, the location of the record having that identifier or the next highest identifier. At this point, it begins a buffered sequential search of the data, pausing at the end of each prime track overflow area to access the track index. For this study, we have assumed a particular implementation. That is, on the prime track, one-half the total number of buffers may be active in a chained read or write operation at anyone time. If the total number of buffers is equal to twice the number of physical blocks on a track, then a complete track can be read or written in one revolution. Overflow tracks, on the other hand, are accessed one physical block at a time. When there is contention for reading and writing services, the principle of first-in-first-out is applied. 15(2) o+-----~------.-----~------._----_.------~ o 10 20 30 40 300 50 15(4) PERCENT OVERFLOW 15(8) 15(18) Graph 15 involve 100 qualified records, and transactions are evenly distributed among updates, retrievals, and insertions with random as well as sorted keys. Graph 11 presents total performance curves for the three types of overflow allocation in the small file. Cylinder overflow (IE) performance is optimum with 25-45 percent overflow and separate pack overflow (IS) performs best at 10-15 percent overflow. The optimum for same pack overflow (IB) generally occurs at zero percent overflow. IB(2) iii o z 8w IB(4) 200 IB(B) ~ en o IB (16) IE: 8w a:: § .n w > w ii: Master Index Number of buffers in brackets Number of Records = 1,000,000 1S .. OVerflow in different Disk Pack IB .. Overflow in same Disk Pack I- w General e In all test cases, the indexes and data were on different disk packs; and record accesses driven by random key input strings took significantly longer than accesses driven by sorted key input strings. These differences would be marginal if the indexes and data were located in the same pack. While update-in-place characteristics with or without write-check are very similar to retrieval characteristics since they involve only one or two added disk rotations, the use of write-check in record insertion creates entirely different characteristics. It can be extremely w 2! a:: 100 t= O+-----~------~-----r----~r-----~ 10 20 30 PERCENT OVERFLOW Graph 16 40 50 Analysis of Complex Data Management Access Method 219 Number of records retrieved Graphs 12 and 13 indicate the general performance behavior of the access process for various numbers· of records retrieved. For a given number of buffers and large numbers of records retrieved, it is an unexceptional linear function. These curves will, however, become more horizontal for fewer numbers of records, because the initial index search will be a more important factor in average access time per record. For similar reasons, the device placement of the indexes is only significant when small numbers of records are accessed. While the effect of the number of buffers will be discussed later, it. is interesting to note that large numbers of buffers are most useful for small· amounts of overflow. 28 24 22 20 iii 0 z 18 ~ 18 8 en 0 c 8 '" '"> '" i: '"c 0 ~ ~ '"2 t= Overflow configuration and overflow percentage Graphs 12 and particularly 13 and 14 indicate that sequential performance is significantly affected by the amount of overflow present in the file. Arm motion to and in the overflow area is primarily responsible for the rapid change in performance characteristics. The slope of the cylin4er overflow (IE) curves is determined by the differences in access time between 14 c 12 10 No Master Index Overflow in same· Disk Pack Overflow = 25% Number of Records = 50,000 8. 8 4 2 0 100 200 300 400 500 NO. OF RECORDS READ (100 UPDATED) Graph 18 No Master Index Cylinder Overflow Overflow = 25% Number of. Records = 50,000 I 100 I I I I 200 300 400 500 NO. OF RECORDS READ (100 UPDATED) Graph 17 prime area records and overflow area records. This, in turn, is determined by the number of records that can be retrieved in one revolution from· the prime area because accessing in the overflow area is always at one record per disk revolution. The primary factors in this determination are prime area record blocking and buffering. The slight downward slope of the·· cylinder overflow (IE) curve for two buffers is due to the fact that larger numbers of overflow records reduce the necessity for reading index tracks. The knee in the pack overflow (IB) curves will occur at the overflow percentage where there is one overflow record per prime track. In these tests we have assumed that the overflow records are uniformly distributed over the. prime tracks; if we had not, then the knee in the curve would be less sharp. As can be seen for the present experimental configuration, pack overflow begins to outperform separate overflow· (IS) when each prime track .has about three overflow. records associated with it. Buffers and update performance In ,the case of retrieval discussed above, any increase in the number of buffers always causes the timing curves to shift downward, but parallel to their prior locations 220 Fall Joint Computer Conference, 1970 ·22 20 18 en 0 .z 0 16 0 w !e en 0 14 a: 0 0 w 12 a: w > w 10 w 8 a: Ia: 0 I- w :E No Master Index Overflow in different Disk Pack Overflow = 25% Number of Records = 50,000 6 t= 4 file, the designer may use either basic direct or buffered sequential search. We provide here an example situation. If the overflow for the small file is organized on a cylinder overflow (IE) basis and the input keys are sorted, the basic direct access method will require 10 seconds to access 100 records. (See Tables I and II.) The queued sequential access method, using 10 buffers, can retrieve about 1,000 records in the same time. In this case, if better than one record in 10 is pertinent to the query and processing time is insignificantly small, then sequential access will provide better performance. Generally speaking, if p is the number of records which must be read sequentially to find a qualified one, tq , the average time to read a record in buffered sequential mode and tb, the average time to read a record in basic random mode, then the queued mode is more efficient if tb > P etq • (This formula is most appropriate 2 0 100 200 300 400 500 Retrieval & Update Time (sec.) NO, OF RECORDS READ (100 RECORDS UPDATED) Table I Graph 19 (Graphs 12 and 13). When some fraction of the records are updated, and therefore rewritten, there need not be a regular increase in performance as the number of buffers is increased. In Graphs 17, 18 and 19, as the number of buffers is increased from 2 to 8, the time to read x records and update y ~ x of them decreases regularly. However, a further increase up to, but less than, 16 buffers reduces overall performance. The reason for this phenomenon lies in the interference of seeks for reading and writing of data. When the capacity of the buffers available is less than, or equal to, one-half of a track (in this case, 8 buffers or less), the access system can both write and read n/2 blocks in a single revolution (n is the number of buffers available). These two operations cannot be smoothly interspersed when ~ track < the capacity of the buffers < 1 track. In the above runs, the record processing time was not a significant factor. If processing time is significant, then instances will occur where the 2 buffer configuration will perform better than the 8 buffer one. A detailed analysis of these situations is quite involved and is best performed by simulation models. GENERAL CONSIDERATIONS overflow 0 5 10 25 0 5 10 25 s QISAM BISAM retrieve 500 records retrieve 100 records IE IS IB 4.2-15 4.2-15 4.2-15 IE IS IB 10(s) 10(s} 10(s} 15(u) 15(u} 15(u) lO(s) 10(s} l1(s) 4.7-15 5.7-16 8.4-18 15(u) I 15(u} 161u) 10(s) 10(s} l1(s) 5.1-15 7.3-17 13-23 15(u} 15(u} 16(u} 10(s) 12(s) 14(s} 6.4-15 12-21 15-23 '16(u) 17(u) 19(u) retrieve 500 records update 100 records update 100 records 13(s} 13(s} 4.7-15 4.7-15 4.7-15 13(s) 18(u) 18(u) 18(u) 13(s) 13(s} 13(s} 5.4-15 6.5-16 9.9-19 18(u) 18(u} 18(u) 13(s) 13(s} 14(s) 6-15 8.5-17 19-25 18(u) 18(u) 19(u) 13(s) l4(s) 16(s) 7.8-15 15-22 21-26 18(u) 19(u) 21(u) = sorted i ! keys u - unsorted keys No Master Index number of records = 50,000 IS - Overflow in different Disk Pack IB - Overflow in same Disk Pack IE - Overflow in same Cylinder Choice of access method In certain special cases, particularly when the records relevant to a search are confined to a small area of the Table I Analysis of Complex Data Management Access Method when there are many· records to be read because the initial reading of the index in buffered sequential mode can affect tq substantially.) To approximately determine tq , let b be the number blocks per data track and T the track revolution time. Assuming the minimum number of buffers, Retrieval. overflow ro.J4T+2Te (cylinder index and data on separate packs) (cylinder index with data) 40-146 5 44-146 64-190 85-189 10 49-148 92-190 : 138-237 25 61-149 160-242 I 155-237 0 45-146 45-146 45-146 5 51-146 95-190 !I 101-193 10 59-148 152-241 1199-256 25 76-149 191-264 1 Master Index Exists (cylinder index with data) s ,. sorted keys Variation of hardware parameters The results presented in this paper are for a particular device; it is, however, of interest to understand the effect of changes in hardware parameters, such as access arm speed, track size, rotational speed and processor speed. Of these parameters, access arm speed is the most independent of the others in its effect on performance. In the basic access method for typical configurations, a 100 percent increase in arm speed will result in about a 20 percent improvement in total performance. While increased arm speed will, significantly narrow the difference in performance between the direct indexed access processes for various overflow schemes, sequential performance will only be affected when large amounts of overflow exist in pack and separate overflow configurations. Track size, rotational speed, and central processor speed do, however, interact in a complex fashion with regard to the loss of revolutions. Increases in CPU speed generally will result in no performance deterioration and they may improve performance by saving previously IE 13O(s) 176(u) 13O(s) 176(u) 13O(s) 177(u) 133(s) 180(u) 222-267 I IS I 13O(s) 176(u) 132(5) : 180(u) 138(s) 183(u) 156(5) : 204(u) IB 13O(s) 176(u) 136(s) 180(u) 143(s) 190(u) . 169(s) 214(u) update 1000 records \ 154~s) 202(u) 154(5) 202(u) 154(5) 202(u) 158(s) 205(u) I 154~s? 202(u) 1 157(5) . 204(u) 161~s) 208(u) 181(s) 229(u) 154~s) 202(u) 161(s) 208(u) 168(s) 215(u) 194(.5? 240(u) number of records - 1.000.000 (cylinder index and data separate) where Tee is average cylinder search time in the cylinder index. For the small file, we have N e = 1. Thus, T~2X25+75+12.5~138. Reading 100 records in the basic direct mode requires approximately 14 seconds as confirmed by our measurement (Table I). Thus, if p~5 to 10, then the buffered mode and the basic direct mode provide similar performance. IB 40-146 retrieve 5000 records update 1000 records i t~4T+Te+Tee IS 40-146 ms. where Te is average cylinder search time of the file, and N e is the number of cylinder index tracks. BISAM retrieve 1000 records 0 If the file has no master index, tb can be estimated by ~2T+2Te+(Ne·T)/2 QISAM retrieve 5000 records IE The factor of 0.5 represents the cost of possible revolutions. Thus, in the case of the sample files, the time to read a record is t~2T+Te+(Ne·T)/2 Update Times (sec) Table II tq~(1.5XT)/b+T. tq~(1.5X25)/8+25""30 & 221 u ,. unsorted keys IS = Overflow in different Disk Pack IB = Overflow in same Disk Pack IE = Overflow in same Cylinder Table II lost track revolutions. Track size and rotational speed· will normally result in gradual improvements in performance, except in the cases where the CPU can no· longer complete processing in time to access the next .. record. These cases will result in major discontinuous deteriorations in performance through lost revolutions. Other parameter changes The size of the records in a file influences performance considerably. For smaller record sizes, the timing curves will have a larger slope at all points and the intersections with the time axis will be lower. If the record size is very small, a slight increase in overflow percentage will degrade performance tremendously. A larger record size shows exactly the opposite effect. Here the performance curves will intersect the axis at a higher point and they will have less slope. The number of records in a block or the blocking factor also affects performance. A large blocking factor will decrease storage space but it increases transmission time. Small blocking factors decrease transmission time but increase arm movement time. A thorough analysis is again needed to determine optimum blocking. 222 Fall Joint Computer Conference, 1970 CONCLUSION In this paper, we have presented a prototype parametric study of the type that is almost mandatory for knowledgeable design of a complex file organization. This study, which includes thousands of data points, would not have been possible without a fast, accurate simulation model such as FOREM I. The results are presented to give the reader an. indication of the intricate interdependence of the many parameters that he must consider if he wishes to produce an excellent file design. REFERENCES 1 F o'NrUJ,tted file organization techniques Final Report Air Force Contract AF 30(602)-4088 May 1967 2 M E SENKO V Y LUM P J OWENS A file organizatum evaluation model (FOREM) IFIP Congress 1968 APPENDIX This section presents comparisons of the model runs with actual computer runs to illustrate the accuracy Mode of Retrieval Overflow 2 Handling Percent* Overflow Hodel Result (secs.) Measured Result (secs.) Model Error (percent) File creation Sequential retrieval Sequential ret=rieval Sequential retrieval Sequential retrieval Sequential retrieval Sequential r"':rieval Sorted keyl retrieval Sorted key retrieval ind 0 cyl 0 10.9*** cyl 5. 16.6 16.1 3.11 cyl 16.1 30.0 27.9 7.52 ~:~~~:v~? 'j Sorted key retrieval Sorted key retrieval Sorted key retrieval Rand... retrieval Randoa retrieval Randoa retrieval Randoa retrieval Randoa retrieval Randoa retrieval Randoa IIDdate 186. ind 0 10.9*** ind 5. 45.5** ind 16.7 82.1** 159. 8.51 8.64 17.0 28.1 26.1 36 •.9 23.3 69.2 18.6 cyl 0 422. 414. cyl 5. 419. 414. 1.21 cyl 16.7 448. 451. 9.67 2.43 ind 0 422. 412. ind 5. 457. 464. ind 16.7 613.** 544. 1.93 1.51 12.7 cyl 0 790. 732. 7.92 cyl 5. 787. 744. 5.77 cyl 16.7 816. 773. 5.56 ind 0 781. 715. 9.2 ind' 5. 802. 752. 6.64 ind 16.7 922. 846. 8.98 915. 970. 6.01 cyl 0 1 - The keys of the records to be retrieved are sorted in ascending order and retrieval carried outiq the order of, this reference. 2 - cyl means cylinder overflow (overflow records in same cylinder as prime records). and ind means independent overflow (overflow records in different cylinders as prime records). Appendix insert 1 Mode of Retrieval Random update J"ll threats to privacy, presently available counter-measures, and the current operational environment. 3. Altering or destroying files. 4. Obtaining free use of system resources. The nature of deliberate infiltration will be discussed within the framework presented by Peterson and Turn, l who established the following categories:. A. Passive Infiltration 1. Electro-magnetic pickup (from CPU or peripheral devices). 2. Wiretapping (on communications lines or transfer buses). 3. Concealled transmitters (CPU, peripheral devices, transfer buses, communications lines). B. Active Infiltration 1. 2. 3. 4. 5. 6. 7. 8. THREATS TO PRIVACY The challenges to the privacy of information in a computer system may be accidental or deliberate; this discussion relates specifically to deliberate challenges, although the software developed may afford some protection against the undesired consequences of accidental compromise. Browsing. Masquerading. Exploitation of trap doors. "Between-lines" entry. "Piggy back" infiltration. Subversive entry by centre staff. Core dumping. Theft of removable media. Browsing is defined as the use of legitimate access to the system to obtain unauthorized information. 111asquerading consists of posing as a legitimate user after obtaining proper identification by ~ubversive means. Trap doors are hardware or software deficiencies that assist the infiltrator to obtain, information having once gained access to the system. Between-lines entry consists of penetrating the system * Support of the Defence Research Board (Canada) and the Canada Council, Social Sciences and Humanities Division is gratefully acknowledged. ** Now with the Univac Division, Sperry Rand of Canada, Ltd., Ottawa, Ontario, Canada. 223 224 Fall Joint Computer Conference, 1970 when a legitima.te user is on a communications channel, but his terminal is inactive. Piggy-back infiltration consists of selectively intercepting user-processor communications and returning false messages to,the user. Directed ! Threat Against ~ Passive LINES, Browsing SYSTEM .. Masquerade Between-Lines Methods to enhance privacy are roughly classified as follows: 1. 2. 3. 4. 5. Access control. Privacy transformations. Processing restr.ictions. Monitoring procedures. Integrity management. Coun ermeasure Privacy Process. Threat Integritl IControl Transform Restrict Monitor "anage. DEVICES EXISTING COUNTER1\tlEASURES , IAccess . i Nmo • • NONE . GOOD . . NONE NONE · · GOOD' GOOD GOOD ~NE : FAIR FAIlr FAIR GOOD IFAIR 'NONE GOOD' FAIR FAIR ~ONE 'NONE FAIR FAIR FAIR ~ONE ~ONE I . Trap-Doors CPU NONE NONE ~ONE FAIR DEVIC,ES FAIR GOOD IFAIR · Systems CPU NONE NONE INONE NONE Entry DEVICES FAIR .. FAIR Theft CPU IDEVIC,ES .. ~ONE iGOOD Piggy-Back Core Dump FAIR . .. ~D .. ' NoNE NONE ~ONE GOOD ~D ,NONE GOOD !NONE FAI,R FooD Figure 1-Threat-countermeasure matrix Access control consists of authorization, identification, and authentication and may function on the system or file level. Authorization to enter the system or files is generally established by possession of an account number or project number. The user may be identified by his name, terminal, or use of a password. The user may be required to perform a privacy transformation on the password to authenticate his identity. Peters2 recommends use of one-time passwords. Passwords may also include authority codes to define levels of processing access to files (e.g., read only, write, read-write, change protection). Privacy transformations include the class of operation which can be used to encode and decode information to conceal content. Associated with a transformation is a key which identifies and unlocks the transformation to the user and a work factor, which is a measure of the effort required of an infiltrator to discover the key by cryptanalysis. Processing restrictions include such functions as provisions to zero core before assigning it to a second user, mounting removable files on drives with disabled circuitry that must be authenticated before accessing, automatic cancellation of programmes attempting to access unauthorized information, and software which limits access privileges by terminal. lJIlonitoring procedures are concerned with making permanent records of attempted or actual penetrations of the system or files. Monitoring procedures usually will not prevent infiltration; their protection is ex post facto. They disclose that a compromise has taken place, and may help identify the perpetrator. Integrity management attempts to ensure the competence, loyalty, and integrity of centre personnel. In some cases, it may entail bonding of some staff. EFFECTIVENESS OF COUNTERMEASURES The paradigm given in Figure 1, grossly abridged from Peterson and Turn, characterizes the effectiveness of each countermeasure against each threat. We independently investigated each cell of the threat-countermeasure matrix in the real-time resourcesnaring environment afforded by the PDP-I0/50 at Western (30 teletypes, 3 remote batch terminals). Our experience 1eads to the following observations: Passive Infiltration: There is no adequate countermeasure except encipherment and 'even this is effective only if enciphered traffic flows on the bus or line attacked by the infiltrator. Competent, loyal personnel may deter planting wireless transmitters or electromagnetic pickups within the computer centre. Browsing: All countermeasures are effective; simple aCcess control is usually adequate. Masquerading: If the password js compromised, most existing countermeasures are rendered ineffective. Use of authentication, one- time passwords, frequent change of password, and loyalty of systems personnel help to preserve the integrity of passwords. Separate systems and file access procedures make infiltration more difficult, inasmuch as two or more passwords must be compromised before the infiltrator gains his objective. Monitoring procedures can provide ex post facto analySIS. Between-Lines Entry: Only encipherment of files, or passwords applied at the message level rather than for entire sessions, provide adequate safeguards. 1\10nitoring may provide ex postfacto analysis. Fast Infinite-Key Privacy Transformation Piggy-Back Techniques: Encipherment provides protection unless the password is compromised. Monitoring may provide ex post facto analysis. Trap-doors: There is no protection for information obtainable from core, although monitoring can· help in ex post facto analysis. Encipherment can protect information contained in auxiliary storage. Systems entry: Integrity management is the only effective countermeasure. There is no other protection for -information in core; even monitoring routines can be overridden. Encipherment protects information in virtual storage only to the extent that passwords are protected from compromise. Core dump: There is no effective protection except integrity management, although monitoring procedures can help in ex post facto analysis. Theft: Encipherment protects information stored in removable media. Our initial study persuaded us that privacy transformation coupled with password authentication would afford the best protection of information. Integrity management procedures were not within the scope of this research. PRIVACY ENVIRON1VIENT: MANUFACTURERS Our next task was to investigate the privacy environ-· ment of resource-sharing systems. Five manufacturers of equipment, doing business in Canada, participated in our study. Their contributions are summarized in the following points: 1. The problem of information security is of great 2. 3. 4. 5. 6. 7. concern to all manufacturers of resource-sharing equipment. lVlost manufacturers are conducting research on privacy; only a small minority believes that the hardware and software currently suppli.ed are adequate to ensure the privacy of customer information. The password is the most common vehicle of system access control; dedicated direct lines are recommended in some special situations. At least two manufacturers have implemented password authentication at the file level. There appears to· be no customer demand for implementation of hardware or software privacy transformations at this time. l\![ost manufacturers stress the need for integrity management. Two large manufacturers emphasize the need for thorough log-keeping and monitoring procedures. 225 RESOURCE-SHARING SYSTEMS Number of .Systems 4 j II A.utho.ri.zation Ident.ification A.uthority A.ccount t Name Password Project f 3 A.ccou,nt t 3 Account t Name Password - Project .f 2 Account t - - 1 Account t Name Password 1 Project t Name - - Password - - 1 1 Project f I I I Figure 2-Access control in 16 Canadian resource-sharing systems 8. Communication links are seen as a major security weakness. We next surveyed 25 organizations possessing hardware that appeared to be suitable for resource-sharing. Sixteen organizations participated in our study, representing about 75 percent by traffic volume of the Canadian time-sharing industry. From information furnished by them, we were able to obtain a "privacy profile" of the industry. The average resource-sharing installation utilizes IBM equipment (Univac is in second place). The typical system has over 512 thousand bytes of core storage and 175 million bytes of auxiliary storage. The system operates in both the remote-batch and interactive modes. It has 26 terminals communicating with the central processors over public (switched) telephone lines. In seven systems, authorization is established by name, account number, and project number. Five systems require only an account number. Nine systems require a password for authority to enter the system; the password is protected by either masking or printinhibit. Identification is established by some combination of name, account number, project number, or password; in no case is identification of the terminal considered. No use is made of one-time passwords, authentication, or privacy transformations. In no system is a password required at the file level; seven systems do not even require passwords. Access control prOViSIOns of 16 Canadian systems are summarized in Figure 2. 226 Fall Joint Computer Conference, 1970 Only two systems monitor unsuccessful attempts to gain entry. In nine systems, both centre staff and other users have the ability to read user's files at will. In six systems, centre staff has unrestricted access to user files. Only three organizations have implemented integrity management by bonding any members of ' staff. The state of privacy, in general, in the Canadian resource-shariIig industry, can be described as chaotic and, with few exceptions, the attitude of systems operators towards privacy as one of apathy. PRIVACY TRANSFORMATION: FUNCTIONAL SPECIFICATIONS It was decided, therefore, to begin development of a software system for privacy transformation that would be synchronized by an authenticated password, anticipating that sooner or later some users will demand a higher degree of security in resource-sharing systems than is currently available. Such an authentication·privacy transformation procedure would afford the following advantages: 1. Provide protection for the password on com- munications channels. 2. Implement access control at the file level. 3. Obviate the need for storing passwords as part of file headings. 4. Afford positive user identification since only authorized. users would be able to synchronize the keys of the privacy transformation. 5. Furnish "work factor" protection of files against browsing, "between-lines" entry, "piggy-back" infiltration, "trap doors" to auxiliary storage, entry of systems personnel to auxiliary storage, eavesdropping on transfer buses, and theft of removable media. The technique of privacy transformation that seemed most promising was a form_ of the Vern an cipher, discovered in 1914 by Gilbert S. Vernan, an AT&T engineer. He suggested punching a tape of key characters and electromagnetically adding its pulses to those of plain text characters, coded in binary form, to obtain the cipher text. The "exclusive-OR" addition is used because it is reversible. The attractive feature of the Vernan cipher for use in digital systems is the fact that the key string can readily be generated by random number techniques. For maximum security (high work factor) it is desirable that-the cipher key be as long as the plain text to be encrypted. However, if the flow of information is heavy, the production of keys may place extreme loads on the arithmetic units of processors-the rate of message processing may then be too slow to be feasible. Two solutions have been proposed. In the first, relatively short (e.g., 1,000 entries) keys are produced and permutations of them used until repetition is unavoidable. A second approach is to use an extremely efficient random number generator capable of producing strings that appear to be "infinite" in length, compared to the average length of message to be transformed. PRIOR WORK (SHORT-KEY METHOD) An algorithm for a short key method presented by Skatrud3 utilizes two key memories and an address memory. These are generated off-line by conventional random number techniques. Synchronization of an incoming message and the key string is' achieved by using the first information item received to address an address memory location. The contents of this memory location provides a pair of address pointers that are used to select key words from each of the key memories. The key words are both "excJusive-OR'ed" with the next data item, effectively providing double encryption of it. The address memory is then successively incremented each time another data item arrives. Each address location provides two more key address pointers and each key address furnishes two key words to be "exclusive-OR'ed" with the current data item. Key word pairs are provided on a one-forone basis with input data items until the entire message has been processed. For decoding, the procedure is completely reversible. PRESENT WORK ("INFINITE" I(EY l\tlETHOD) We decided to use the infinite key approach because it would: 1. Reduce storage requirements over those required by short key methods. This will tend to reduce cost where charges are assessed on the amount of core used; and, more importantly, will permit implementing the transformation on small computers (e.g., one having a 4096word memory) located 'V\1 f::; .. ;.;. ~1.1 47. " ~:..) ~ 3 36. -3 B 3 J;J ':ECrlSEL3E;; GE'" 59.::: 5') 3 4 6 :;t}. '5 7J I 4 :3 52.·~ 5'3 ~.•I:_{J rr:,.J • TE,·{. L. -) I:: Y IU ij":: I STJC-J Ij~~5 H~~ DlSmI'1lJfI )'J .I? ::;C.)I'E!:; FH P·)ST-fESf 6 I -7:3 71-'5"1 ) 11:>?]- 13:3 Figure 2-The output for section 1005 is shown to illustrate the output produced by Program XN1 when the weekly post-tests are scored On Line Computer Managed Instruction 80 REt-l TH I 5 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 720056, 720133, 721799, 722835, 723934, 724165, 724872, 725390, 725425, 725196, 726699, 726811, 721539, 727945, 728561, 728659, 728925, 117 73e921, QUALITY EDUCATIONAL DEVELOPMENT IS THE :XS F'ILE 3, 5, 1, 5, 5, 5, 5, 5, 5, 1, 5, 1, 5, 1, 1, 5, 5, 5, 43.4, 1,2,1,1,2,2,1,1,1,1 57.8, 2,2,1,1,1,2,2,2,1,1 47.3, 2,2,1,1,1,1,1,1,1,1 74.5, 2,2,1,2,2,2,1,2,1,2 28 , 1,1,1,1,1,1,1,1,1,1 39.2, 1,1,1,1,1,1,1,2,1,2 66.3, 1,2,1,2,2,2,1,1,2,2 64.8, 2,1,1,2,2,2,1,2,1,2 52.9, 2,1,1,1,2,2,1,1,1,1 91.9, 2,2,1,2,~,2,2,2,2,2 54.8, 1,2,2,1,1,1,2,1,1,2 63.8, 1,1,2,2,2,1,1,2,1,2 42. , 1,1,1,1,2,1,1,1,2,1 71.4, 2,1,2,2,2,2,1,2,1,1 59. , 2,1,1,2,2,2,1,1,1,2 53.6, 2,1,1,2,2,2,1,1,1,1 49.4, 2,1,1,2,2,1,1,1,1,1 39.8, 1,1,1,1,1,1,1,1,2,1 Figure 3-0ne of the section data files created by Program XNl is shown. These files contain each student's I.D. number, confidence score, and his response to each question on the test COMPUTER ANALYSIS OF STUDENT PERFORMANCE POST-TEST K STUDEfH I D NUMBER I I j 103 QUESTION 5, CORRECT QUESTION 7 HROflG Figure 4-A line from one of the data files (XI-X12) is annotated to indicate the items stored CURRENT AND RESISTANCE ANALYSIS OF SCORES BY MEDIA GROUP 0-60 61-70 SG 0 0 IB 0 1 AV 2 0 SO 0 2 TB 1 L 71-80 5 81-90 91-100 MEAN 2 13 91.1 6 12 89.7 3 6 8 87. 5 5 11 87.4 2 5 12 87.3 1 1 8 7 87.4 L/SG a 2 • 4 11 88.8 CNTR 1 1 2 7 17 90.2 1 3 2 6 82.2 0 1 7 6 88.8 CAI-I data files (Table II, data files XI-X12) for each section (Figure 3) which contained the student's identification number, group number, confidence score, and his response to each question in coded form (Figure 4). The identification or I.D. number is used to uniquely identify each student's responses. This number is stored along with the data and checked whenever programs manipulate these numbers to ensure that the responses of one student are not attributed to another student. These data files could then serve as input for any variety of analysis programs. 235 CAl-II 1 Figure 5-The results of the analysis of scores by media group is shown for Post-test K. This output is produced by Program XN2 Two of these programs will be described here. The first program (,fable I, Program XN2) accessed the data files to produce an analysis of performance by media group (Figure 5), and an analysis of performance by question (Figure 6). It should be recalled that the tests were scored by section since the section professors were responsible for student grades but the media groups were randomly assigned independent of the sections. It was thus necessary to perform analysis both by section and by group. The data in the section files (files Xl-X12), therefore, had to be analyzed or sorted to determine the effectiveness of the different media-mixes. The "analysis by group" sorted the student confidence scores by media-group. On a weekly basis, the difference in performance by group was generally small. Conclusions on the effectiveness of the media-mixes must await an analysis of variance of group performance during the semester. In addition, the response of each student to each question was sorted by computer· to determine the percent of the students in each media group who missed each test question. This "analysis by question" or item analysis often showed distinct differences in performance by the various media groups. As an example, consider the results of the twenty-question test given at midsemester (Figure 7). Question one which tested an 236 Fall Joint Computer Conference, 1970 QUALITY EDUCATIONAL DEVELOPMENT COMPUTER ANALYSIS OF STUDENT PERFORMANCE POST-TEST K CURRENT AND RESISTANCE ANALYSIS: PERCENT OF STUDENTS WHO MISSED A QUESTION QUES. 1 5G 15. IB 4.35 0 AV 10.5 10.5 50 o TB 4.35 L o L/5G 4.76 CNTR CAI-I 3 9 10 20. 15. 21.7 13. o 10. 25 15. 21.7 4.35 17.4 21.7 8.7 15.8 10.5 10.5 21.1 10.5 0 36.8 10.S 4.35 17.4 26.1 39.1 0 o 43.5 17.4 4.35 4.35 17.4 47.8 8.7 4.35 ~O.4 17.4 4.76 14.3 19. 42.9 4.76 0 33.3 14.3 0 14.3 9.52 0 57.1 0 o 33.3 9.52 25 3.57 14.3 3.57 14.3 7.14 7.14 3.57 0 35.7 21.4 28.6 21.4 28.6 7.14 21.4 21.4 0 CAl-II 13.3 0 13.3 6.67 20. 26.7 13.3 6.67 0 26.7 TOTAL 4.49 12.9 10.7 16.9 36. 9.55 1.69 26.4 22.5 4.35 0 system. Several other programs (see Tables I and II) must be executed to T-Score the exams and to produce the data files read by this program. The order of execution is XN9, XN10, and XN8. The output for each section (Figure 8) lists each student's name, his current weeks raw confidence score, the equivalent T-Score, the cumulative average of all T-Scores to date, and his percentile standing in the class. This program reads data from four different data files, three of which were written by computer programs. The M4 file (Figure 9) contained the master list of students in the course. This file contained the list of student names, identification numbers and group numbers and it was grouped by section. After the post-test and prior to executing Program XN8, the M 4 file was copied into the J file and absentees were indicated in the J file. In this manner, the master list (M4 file) remained unaltered. This file is created at the beginning of the semester and is altered only when a student 35.7 QUALITY EDUCATIONAL DEVELOPMENT 11.2 COMPUTER ANALYSIS OF STUDENT PERFORMANCE POST-TEST H Figure 6-Program XN2 also analyzes student responses to determine the percent of students missing each question as a function of media group. A sample output is shown ANALYSIS: PERCENT OF STUDENTS WHO MISSED A QUESTION QUES. objective on the manipulation of vectors was correctly answered by more than 95 percent of the students taking the exam. In contrast, question 19 was answered incorrectly by 28 percent of the students. If the breakdown by group is scanned for this question, the individual group percentages vary from 14 to 48 with a low of 3.7 percent for the CAl group. It should be noted that all of the course objectives were not treated by all of the media. 8 The behavioral objective tested by question 19 (electric field/superposition) was treated by the lecture, study guide, and CAl material of week G, and by a review study guide made available to all groups during the review week. If we can assume that all media were of equal content, it would appear that CAl best enables the achievement of this objective in general. Further analysis could well refine the conclusion in terms of individual student characteristics. The second program which will be discussed (Table I, Program XN8) provides the cumulative output of the weekly bookkeeping. This program illustrates the scope of data manipUlation possible using only simple BASIC statements and a file oriented time-shared REVIEW EXAM A B C D E F G CAl 1 o o 4.76 0 4.76 0 o o 1.13 2 20. 21.7 19. 27.3 28.6 28.6 45.5 29.6 27.7 40. 39.1 47.6 63.6 57.1 28.6 54.5 63. 49.7 o 4.35 0 4.55 0 o o o 1.13 5 40. 21.7 33.3 63.6 28.6 28.6 27.3 44.4 36.2 6 70. 43.5 76.2 63.6 61.9 57.1 59.1 77.8 63.8 7 5. 13. 4.76 4.55 0 19. 9.09 3.7 7.34 8 10. 13. 14.3 13.6 4.76 19. 4.55 11.1 11.3 9 10. 17.4 28.6 27.3 19. 19. 36.4 25.9 23.2 10 5. 8.7 4.76 4.55 4.76 4.76 18.2 11.1 7.91 11 15. 8.7 19. 0 9.52 9.52 13.6 11.1 10.7 12 75 69.6 71.4 72.7 66.7 66.7 77.3 81.5 72.9 13 20 21.7 33.3 4.55 47.6 19. 36.4 18.5 24.9 14 25 26.1 19. 40.9 28.6 23.8 36.4 40.7 30.5 15 75 82.6 95.2 95.5 85.7 81. 81.8 63. 81.9 TOTAL 16 20 8.7 23.8 13.6 19. 19. 22.7 22.2 18.6 17 10. 8.7 19. 22.7 4.76 9.52 18.2 7.41 12.4 18 50 30.4 71.4 68.2 52.4 57.1 54.5 74.1 57.6 19 25 43.5 14.3 36.4 47.6 19. 36.4 3.7 27.7 20 75 69.6 95.2 59.1 85.7 90.5 86.4 74.1 79.1 Figure 7-The analysis of the percent of students missing each question is given for the mid-semester review exam to illustrate the feedback this output provides On Line Computer Managed Instruction ON-SITE COMPUTER MANAGED DIRECTIVES CURRENT GRADES AND CUMULATIVE AVERAGE T-SCORES SECTION 803 POST-TEST M MAGNETIC FIELD I RAW SCORE (M) T-SCORE (M) CAS (A-M) PSM (A-M) BRILLA,R. 68.5 53 55 86 CASKEY,H. 63.8 49 43 8 DRAWNECK, R. 83.6 63 5}, 63 HINSON,L. 67.8 52 56 91 34 N~ HOPPER,W. 71.7 55 48 HORNE,B. 88.1 67 50 53 JOHNSON,G. 75.6 58 56 91 KENNEDY,T. 89 68 52 69 KRATOCHVIL 79.8 60 49 44 MARTIN,A. 64.1 49 51 63 MC DEVITT,R. 63.9 49 45 14 OSBORN,D. 63 48 50 53 SCHULER,T. ABSENT ABSENT 48 34 SHEARER,G. 67.7 52 48 34 SMITH,D. 82.6 61 56 91 SZOKA,M. 91.2 72 50 53 TOMLIN,E. 79 59 59 96 VAUGHN,D. 61.9 48 50 53 WOOD,C. 46 38 41 5 WILKENSON,J. 40.5 34 39 2 Figure 8-The output from Program XN8 for one section for week (M) is shown drops the course. All programs which access data files check the I.D. numbers to ensure that there is no possibility of using the wrong data for a student's record. The other files accessed by XN8 were the Kl, T3, and Ml files (Table II); the Kl and Ml files were grouped by section. The Kl file (Figure 10) contains each student's I.D. number, the sum of his T-Scores, and the number of exams involved. This file can be used to calculate the cumulative average T-Score since that is the sum of the T -Scores divided by the number of exams. The Kl file is updated weekly by computer (Table 1, Program XNI0) in the following manner: the program reads the cumulative information from the Kl file, reads the current T -Score from another file (file Ml), checks I.D. numbers to ensure a match, then adds the current T -Score to the sum, increments the number of exams by one, and outputs this information to the K2 file. Thus after executing XNI0, the Kl file contains the previous weeks cumulative 237 data and the K2 file contains the updated cumulative data. The Kl file is then deleted (a paper tape of the file can be saved) and the K2 file is renamed to Kl before executing XN8. This minimizes the amount of data stored in the computer. The XN8 program also reads data from the Ml file (Figure 11) which contains student I.D. numbers, the raw confidence score, and the equivalent T-Score for the current week. The fourth file read by program XN8 is the T3 file (Table II) which contains the data necessary to determine a student's percentile standing in the class. A listing of program XN8 (Figure 12) is given to show the simple BASIC statements which are involved. For example lines 400-480 check that the I.D. numbers for all data refer to the same student. Should a mismatch occur, the program is written to print (lines 420-430) the name of the student (A$) from the master file, his section (T), and the files involved. The program reads the student data one student at a time, and prints the results to avoid excessive use of subscripted variables since this would affect the permitted 100801.,1.,1 102 "BENHAM.,~1.".,720490.,5 104 "BLAKEY.,B."., 720651., 5 106 -BRUCKEH.,B.-.,721029.,1 108 -CROOK.,K.-.,721694.,1 110 -DIX.,S.-,,722079.,1 112 -ENGLUND.,R.-.,722380.,1 114 "GALLI.,W.".,722702,1 116 -GILLOOLy.,J.-.,722863.,5 118 -HEDRICK,M.-.,723584.,5 120 -KEITHLY,T.-.,724410,5 122 -KING.,M.-.,724571.,1 124 "MOFFATT.,W.".,725978.,5 126 "NEUPAVEH.,A.-.,726356.,1 128 "NIELSEN.,J.-,726412.,1 130 -PRINCE.,T.-,727014,1 132 "ROUND,H ... .,724399,5 134 -SCHUBB1T.,J.-,727700 1 1 136 .. SHOElVlAKER., J. -,727896., 5 138 .. SNYDER.,W ... .,728155.,5 140 -SPRINGYAN,R.-.,728232.,5 142 -VANMAELE.,J.-.,728932,5 144 .. VANOHSDEL.,R ... .,728939,1 150 0,0,0 Figure 9-The Master File of student names (M4 file) is listed for one section to show how the names, J.D. numbers and group numbers are stored 238 Fall Joint Computer Conference, 1970 lOa 101 102 103 104 105 106 101 108 109 110 111 112 113 114 11 5 116 11 7 11 811 9 120 121 122 1 , 720490 720651 721029 721694 722079 722380 122102 722863 72358 /1 724410 124571 725978 726356 726412 727Q 14 72L1399 727730 127896 728155 728232 728932 728939 , , , , , , , , , , , , , , , , , , , , , , 855 616 641 661 482 612 481 704 5Ll1 617 676 663 804 742 716 618 635 661.1 604 602 614 673 , , , , , , , , , , , , , , , , , , , , , , 14 12 13 13 11 14 10 14 13 12 14 14 13 14 13 14 14 13 12 13 i3 13 , , , , , , , , , , , , , , , , , , , , , , Figure IO-The data stored in the Kl tile are listed for one section. This file contains the cumulative sum of each student's T-Scores and the number of exams involved length of the BASIC program. The initial information on the heading for each section is read and printed (lines 210-360), while the data for each student in the section is read and printed in a loop (lines 390-600). Program XN8 will always run error-free since three input files (Kl, ]\1:1, T3) are created by computer and the fourth file (M4) is checked during the execution of Program XN9. CONCLUSIONS The CMI programs described were successful from two points of view: the dead time between the student's responses and the analysis of results was kept to a minimum and the flexibility of the system was maintained. Both of these advantages accrue from the fact that the system was operated on.:..line in a time-sharing mode. There is no doubt that the use of a batch-processing system increases the dead time between collection of data and output since turn-around time for even efficient data centers can range from 3-24 hours. In contrast, time-sharing provides almost im- mediate responses with a limiting factor being speed of input and output. The modular design of this system, with an emphasis on easy access to data and results, allowed a flexibility which is often lacking in large CMI programs where the sheer size and complexity of the program often precludes changes. In turn, this flexibility ensures user satisfaction since the system is responsive to individual requirements. Furthermore, the use of an easy conversational language (BASIC) allows direct access to the system by teachers, educators, and students. As noted above, a major area of consideration for on-line CMI is which remote terminal to use. In this experiment, a relatively large amount of time was required for input and output because the rate of transmission of the teletype is 10 characters/second. This time can be cut considerably simply by using one of the faster remote terminals available which transmit at rates of 30 characters/second and are compatible with most time-shared systems. Furthermore, if the course design utilizes multiple-choice exam questions, students can mark their answers on a card and a mark-sense card reader can be used. This eliminates 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 11 5 116 11 7 118 119 120 121 122 1 , 720490 , 720651 721029 , 721694 722079 , 722380 722702 722863 , 723584 , 724410 724571 725978 , 726356 726412 727014 , 724399 , 727700 727896 , 728155 , 728232 , 728932 , 728939 , , , , , , , , , , 70.2 , 5'! 73.3 , 56 56.8 , 45 84.2 , 63 65.2 , 50 26.8 , 22 0 , 0 , 53.6 , 43 52.1 41 84.6 , 65 55.2 , 44 43.3 , 36 85.4 , 65 75.3 , 57 69.5 , 54 0 , 0 , 67 , 51 , 57.1 45 76.3 , 58 69.2 , 53 31.4 28 49.8 , 40 , , , , , , , , , , , , , , , , , , , , , , Figure ll-The MIfile, which contains a student's raw confidence score and the corresponding T -Score for the current exam, is listed for one section On Line Computer Managed Instruction 100 200 210 220 230 240 245 250 255 256 260 270 280 290 310 320 322 324 326 330 340 350 360 370 380 390 395 400 410 420 430 435 440 450 460 470 480 490 SOO 510 520 530 540 550 560 570 580 590 600 610 620 630 640 6SO 660 700 REM .... PROGRAM XNS .... DIM C (l00) FILES T3. H5. J. I'll. K1 R~D Il.X1.N.C(N) IF,N=X1 THEN 245 60 TO 220 READ 12.VS.LS.KS READ 13,T,X,P R~D 14.Y READ 15,Z PRINTON-SITE COMPUTER MANAGED DIRECTIVESPRINT PRINT PRINTCURRENT GRADES AND CUMULATIVE AVERAGE T-SCORESPRINT PRINT TAB(26);-SECTION -;T PRINT PRINT PRINTPOST-TEST - ;VS,LS;KS PRINT PRINT PRIN'CNAME- ,-RAW SCORE- ,-T-SCOHE- ,-CAS- ,-PSMPRINT TAB (16);- (N)-; TAB (32 H- (N)-; TAB(44);- (A-N)-; TAB (59 >;- (A-N)PRINT PRINT READ 13,AS, 0, U IF 0=0 THEN 610 READ 14,U1,G,T1 IF Ul =D l'HEN 440 PRINT-l.D'S FOR -;AS;- OF SECTION -;T;- DO NOT MATCH 'PHINT-! IN THE J AND 1"1 FILESGO TO 660 READ IS.U2,S,N IF U2=U1 THEN 490 PRINT-I.D'S FOR -;AH- OF SECTION -;T;- DO NOT MATCHPHINT- IN THE J AND K1 FILESGO TO 660 IF 6>0 'CHEN 540 LET M=INT< C!)/N)+.5) LET W=CCCM)/C(XI »"100 PRINT AS,ABSENT-.M.INTHJ+.5) GO TO 590 LET S=S+T1 LET N=N+1 LET l'!l=INTCCS/N)+.S) LET W=CCCMI )/CCX1 »"100 PRINT AS.G.T1.Ml.INT(W+.S) PRINT GO TO 390 FOR 1= 1 TO 10 PRINT NEXT I IF X<13 THEN 250 PkINT- THIS COMPLETES THE OUTPUT FOR THIS PROGRAM!)IOP END Figure 12-A copy of Program XN8 is listed to show the simple BASIC statements which can be used to manipulate student data files the time and personnel required to punch student responses on paper tape. Nothing in the system design precludes each user from choosing the terminal best suited to his needs. In summary, the implementation of CM! is essential if teachers are going to effectively manage the learning process to provide individualized instruction. On-line CMI satisfies the needs of educators for a rapid, flexible, easily accessible system. On-line CMI can currently perform the tasks of diagnosis, testing, " 239 record keeping, and analysis. Such a system is also capable of elucidating, validating, and imp1ementing algorithms which provide individualized learning prescriptions. ACKNOWLEDGl\1ENTS The authors wish to acknow1edge the many helpful discussions and suggestions contributed by Dr. A. F. Vierling, Manager of the Honeywell Educational Instruction Network (EDINET), and Dr.A. T. Serlemitsos, Staff Physicist at Quality Educational Development, Incorporated. REFERENCES 1 H J BRUDNER Computer-managed instruction Science Volume 162 pp 970-976 1968 2 Revised listing of objectives Technical Reports Numbers 4.3a and 4.3b United States Office of Education Project Number 8-0446 November 3 1969 3 A F VIERLING CAl development in multimedia physics Technical Report Number 4.30 United States Office of Education Project Number 8-0446 November 3 1969 4 W A DETERLINE R K BRANSON Evaluation and validation design Technical Report Number 4.7 United States Office of Education Project Number 8-0446 November 3 1969 5 W A DETERLINE R K BRANSON Design for selection of strategies and media Technical Report Number 4.9 United States Office of Education Project Number 8-0446 November 3 1969 6 E H SHUFORD JR Confidence testing: A new tool for management Presented at the 11th Annual Conference of the Military Testing Association Governors Island New York 1969 7 W C GARDNER JR The use of confidence testing in the academic instructor course Presented at the 11th Annual Conference of the Military Testing Association Governors Island New York 1969 8 A F VIERLING A T SERLEMITSOS CAl in a multimedia physics course Presented at the Conference on Computers in Undergraduate Science Education Chicago Illinois 1970 Development of analog/hybrid terminals for teaching system dynamics by DONALD C. }\iARTIN North Carolina State University Raleigh, North Carolina enthusiasm. As pointed out by Professor Alan Rogers at a recent Joint SCI/ ACEUG Meeting at NCSU, this intimate student-professor relationship simply cannot be achieved in today's large classes unless the instructor learns to make effective use of the modern tools of communication, i~e., movies, television, and computers. In effect, we turn students off by our failure to recognize the potential of these tools, especially the classroom use of computers. The material which follows points out the need for interactive terminals, describes the capabilities of prototype models already constructed, and then outlines the classroom system which is presently being installed for use in the fall of 1970. INTRODUCTION A recent study completed by the School of Engineering at North Carolina State University brought to light a very serious weakness in our program to employ computers in the engineering curricula, i.e., the inherent limitation on student/computer interaction with our batch, multiprogrammed digital system. The primary digital system available to students and faculty is the IBM SYSTEM 360/Model 75 located at the Triangle Universities Computation Center in the Research Triangle area. This facility is shared with Duke University and the University of North Carolina at Chapel Hill. In addition, the Engineering School operates a small educational hybrid facility consisting of an IBM 1130 interfaced to an EAI TR48 analog computer. We use some conversational mode terminals on the digital system but it has been our experience that they are of limited value in the classroom and, of course, only accommodate on the order of two students per hour. It is our feeling that terminals based on an analog or hybrid computer would materially improve student/ computer interaction, especially aiding the comprehension of those dynamic systems described by ordinary differential equations. This paper has resulted from our attempts to outline and define the requirements for an analog computer terminal system which would effectively improve our teaching in the broad area of system dynamics. The need for some reasonably priced dynamic. classroom device becomes apparent when we consider the ineffectiveness of traditional lecture methods in such courses as the Introduction to Mechanics at North Carolina State University. This is an engineering common core course in mechanical system dynamics which has an enrollment of about 400 students per semester. This is a far cry from early Greek and Roman times when a few students gathered around a teacher, who made little or no attempt to teach facts but instead attempted to stimulate the students' imagination and THE NEED FOR INTERACTIVE TERMINALS Classroom demonstrations The computer has long held great promise both as a means for improving the content of scientific and .technical courses and as an aid for improving methods of teaching. While some of this promise has been realized in isolated cases, very little has been accomplished in either basic science courses or engineering science courses in universities where large numbers of students are involved. It is certainly true that significant improvement in course content can be achieved by using the computer to solve more realistic and meaningful problems. For example, the effects of changing parameters in the problem formulation can be studied. With centralized digital or analog computing facilities, this can be accomplished only in a very limited way, e.g., problems can be programmed by the instructor and used as a demonstration for the class. Some such demonstrations are desirable but it is impossible to get the student intimately involved, and at best they serve only as a supplement to a text book. At North Carolina State 241 242 Fall Joint Computer Conference, 1970 Figure 1 University we have developed a demonstration monitor for studies of dynamic systems which seems to be quite effective. We recently modified our analog computers so that the primary output display device is a video monitor which can be used on-line with the computer. Demonstrations have been conducted for other disciplines, for example, a Sturm Liunville quantum mechanics demonstration for a professor in the Chemistry·Department. This demonstration very graphically illustrates the concept of eigenfunctions and eigenvalues for boundary value problems of this type. A picture of the classroom television display is shown in Figure 1. The instructor's control panel makes several parameters, function switches and output display channels available for control of the pre-patched demonstration problem. Switches are also available to control the operation of the analog computer . The direction we are proceeding in the area of demonstration problems is to supply the instructor with a pre-patched problem, indicate how to set potentiometers, and how to start and end the computation. Since the display is retained on a storage screen oscilloscope with video output, he can plot multiple solutions which will be stored for up to an hour, or he can push a button to erase the screen at any time. Some demonstrations have been put on the small video tape recorder, shown in Figure 1, but we find the loss of interactive capability drastically decreases the effectiveness of the demonstration. Student programming In addition to classroom demonstrations, the student can be assigned problems and required to do the pro- gramming. While we believe this to be an excellent approach for advanced engineering courses, such as design, systems analysis, etc., it has proven less than satisfactory for the first and second year science and engineering courses. Even when the students have had a basic programming course, valuable classroom time must be spent in techniques for programming, numerical methods, and discussions of debugging. The students tend to become involved in the mechanics of programming at the sacrifice of a serious study of the problem and its interpretation. With inexperienced students, even when turnaround time on the computer is excellent, the elapsed time between problem assignment and a satisfactory solution is usually much too long. We have tried this student programming approach for the past four years in a sophomore chemical engineering course. While it has been of some value, we are now convinced that it is not a satisfactory method of improving teaching. The teaching of basic computer programming does, however, have a great deal of merit in that it forces students to logically organize their thoughts and problem solving techniques. Also it helps to provide an understanding of the way computers can be applied to meaningful engineering problems. Thus, we intend to continue the teaching of basic digital computer programming in one of the higher level languages at the freshman or sophomore level, and then make. use of stored library programs for appropriate class assignments. In addition, we will continue to assign one or more term problems in which the student writes his own program, especially in courses roughly categorized as engineering design courses. Digital computer terminals It is appropriate at this point to emphasize why we feel time-shared or conversational mode terminals are not the answer to our current problem. It has been our experience that the conversational te~minal is an outstanding device for teaching programming, basic capabilities of computers, and solving student problems when the volume of data is limited. However, if the relatively slow typing capability of students is considered, we have found that a class of 20 or 30 students can obtain much faster turnaround time on the batch system. To be sure, the student at the terminal has his instantaneous response, but the sixth student in the queue is still waiting two hours to run his program and use his few tenths of a second CPU time. One can certainly argue that this is an unfair judgment since the solution is to simply buy more terminals for about Development of Analog/Hybrid Terminals $2500 each. Unfortunately, these terminals are like cameras and their use involves a continuing expense often greater than the initial cost. Connect time and communication costs for a sufficient number of terminals have discouraged such terminal use on the campus at the present time. The experience of the North Carolina Computer Orientation Project, which essentially provided a free terminal for one year at have-not schools, has been similar in that it proved very difficult for these schools to utilize the terminal in any science course other than a course in programming. Present limitations on classroom use of computers It is reasonable to ask why, in a university that has had good computing facilities for some time, computer use in the classroom is so limited. We feel there are several reasons why the majority of instructors do not use this means of communication to improve instruction techniques. The first of these reasons must be classed as faculty apathy. There is no other explanation for the fact that less than 25 percent of our engineering faculty use the digital computer and less than 5 percent avail themselves of our analog facility. Admittedly, it is extremely difficult for a physicist, chemist or engineer who is not proficient in computing to program demonstration problems for his classes. Because such demonstrations, while a step in the right direction, do not really make use of the interactive capability of the computer to excite the students' imagination, there is often little motivation for the professor to learn the mechanics of programming. Fortunate indeed is the occasional instructor who has a graduate student assistant competent to set up and document such demonstrations. The second reason, closely coupled to the first, is that computer output equipment for student use in the classroom is either not available or just too expensive for large classes. Digital graphics terminals, for instance, sell for between 15 and 70 thousand dollars, depending on terminal capability, and an analog computer of any sophistication at all will cost 5 to 10 thousand d,ollars with the associated readout equipment. In our basic introductory mechanics course, with ten sections of forty students, a minimum of twen ty such terminals would be required if we assume two students per terminal as a realistic ratio. Even if such analog or digital computer terminals were available, we would then be faced with the problem of teaching the students (and faculty) a considerable amount of detailed programming at the expense of other material in the curriculum. We feel that analogi 243 hybrid computer terminals designed to accomplish a specific, well-defined task, will provide an economical interactive student display terminal for many engineering courses. Such a terminal is described in this paper. CLASSROOM TERMINAL SYSTEM We have recently received support from the National Science Foundation to study the effect of student interaction with the computer in courses which emphasize the dynamic modeling of physical systems. It is a well known fact that interaction with a computer improves productivity in a design and programming sense. The question to which we are seeking the answer is: Will computer interaction also improve the educational process effectively without leaving the student with the impression that we are using a "magic black box"? To accomplish this goal, we are installing sixteen analog/hybrid terminals in a classroom to serve thirtytwo students. The classroom in which these terminals will be placed is about 150 feet from the School's analog and digital computer facility. At this point, we should place some limits on terminal capability and function. If we accept the premise that the student need not learn actual patching to use the analog computer terminal and eliminate the traditional concept of submitting a hard copy of his computer output as an assignment, the desirable features of a terminal might be as follows: 1. Parameters: The student must have the capa- bility of varying at least five different parameters in a specific problem. Three significant digits should be sufficient precision for these parameters and their value should be either continuously displayed or displayed on command. 2. Functions: The student should have access to function switches to perform such operations as changing from positive to negative feedback to demonstrate stability, adding forcing functions to illustrate superposition, adding· higher harmonics to construct a Fourier approximation of a function, introducing transportation delay to a control system, etc. Three to five of these function switches should be sufficient. 3. Problems: The student should be able to select several different problems, say four, at any of the individual terminals. Depending on the size of the analog computer, the student could use the terminal to continue a study of a problem used in a prior class, compare linear and nonlinear models of a process, etc. 244 Fall Joint Computer Conference, 1970 4. Response Time: The response time for each of twenty terminals should be about one to three seconds, i.e., the maximum wait time to obtain one complete plot of his output would be something like one second plus operate time for the computer. Computer operate time has been selected as 20 milliseconds for our equipment although a new computer could operate at higher speeds if desired. 5. Display: The display device for each terminal must be a storage screen oscilloscope or refreshed display for x-y plots. Zero and scale factors must be provided so that positive and negative values and phase plane plots can be plotted. Scaling control must be presented in a manner which is easy for the student to use and understand. 6. Output selector: The student should be able to select from four to five output channels for display on the oscilloscope. 7. Instructor display: The instructor should have a terminal' with a large screen display which the entire class can observe. His control console should have all the features of the student terminals and should also have the capability for displaying anyone of the student problem solutions when desired. He should also have master mode controls to activate all display r--' ~-.--'- I ! ,~~'I.. i---.- ~~I ?-_~~; to _T!t_,mK_s- ' I ~ i~, -,"~.-:~..--.--, I ~ '.~;,_ ;?".--- ~ ~ U' ;t ::hl J=> ~--' I----,ril.... Given a terminal system with these features, we have then defined the primary objective as being a study of the use of this classroom in some specific courses. We are initiating this evaluation with one course in Engineering Mechanics, Introduction to Mechanics (EM 200) and two courses in Chemical Engineering, Process Analysis and Control (CHE 425), and Introduction to System Analysis (CHE 225). Thus, we start our evaluation with one sophomore engineering core course with three to four hundred students per semester and two courses with thirty to forty students per semester at the sophomore and senior level. In addition to these two courses, we are attempting to schedule as many demonstrations of the system as possible for other departments in the hope of stimulating their imaginative use of the terminals. A flow sheet for the classroom terminal system is given in Figure 2. The system includes a small digital mini computer which is to be used for control, storage, and scaling of terminal information. The system consists of the following components: a. Sixteen student terminals b. One instructor terminal c. Digital mini computer with I/O device for programming d. Control interface to analog computer t \~ L____~___) b ~I~t ~~ ~~~~.I) ,~~~----------- ;;,~, !, I CONI'ROI. ~.;-UNG AI) 1.:~ING terminals. It would be advantageous for the instructor to have a small screen display to monitor student progress without presenting the solutions to the class on the large screen. 9>!HL VIGIl'Pi GOMP'?j'ER FOR CONI'ROl. PlID SCALING O} '!'EHMlNALS Figure 2-Flow sheet for proposed analog/hybrid terminal system The inclusion of a small digital computer in this terminal system opens up some very interesting future possibilities such as using the terminals for digital simulation as well as analog. As will be seen in the next section, the digital computer provides scaliilg so that parameters can be entered in original problem units. It also acts as temporary storage for each terminal as it awaits service and controls the terminal system. The system is designed to operate in the following manner. The instructor informs the computer operator that he wishes to use problem CHE 2 during a certain class or laboratory period. The operator places the proper problem board on the TR-48 and then sets the servo potentiometers which are not controlled by the student terminals and static checks the problem with the hybrid computer. He also sets up the program CHE 2 on the mini computer just prior to class. From this point on the system is under the control of the instructor and students. Development of Analog/Hybrid Terminals 245 FUNCTIONAL DESCRIPTION OF TERMINAL I+\yisi·ioisiul General D!'~:rAl. Dl;':r'LA:t fHoiOl ~ The basic terminal configuration is shown in Figures 3 and 4. All display functions are located on the upper display panel and control or data input is provided on the inclined control panel. a .-- -- -- --- - --- ----------t Control I I The controi functions available to the student include power on-off, store, non-store, erase, and trace intensity. Th~se controls are located on either side of the oscilloscope display as shown in Figure 4. Indicator lights are also provided in this area for terminal identification, error and terminal ready status. The erase function is used quite often by the student and thus is also available on the keyboard. In addition to the basic operating controls, the student can request either a single solution or repeated solutions on his display unit. For the single solution mode, he displays one solution as soon as the sequencer reaches his terminal address. In the repeat solution mode, his oscilloscope is unblanked and displays a solution each time the sequencer reaches his address. DBDEB Display Scaling I I RESERVED P'OR fWURE ALPHAl4ERIC I I KlTOCAiW EXf'ANSlVt. I I I I I I I I I I I I I ~-------------------- I I J EJ El 0 0 III[!]!!) ffi@lill CD0[I1 EJ ~ B EI Ikj@]~ § GB~ ~ Figure 4-Terminal display and control panels The worst case response time for either mode would be on the order of one second, even if all other terminals had solution requests pending. Output display and scaling The primary output display device is a Tektronix type 601 storage screen oscilloscope. This oscilloscope is mounted directly behind the vertical front display panel as shown in Figures 3 and 4. These oscilloscopes are modified to provide for s~aling as shown below. The operating level circuit is modified to provide a switched store and non-store operation for the user. Output scaling is automatically selected when the student depresses anyone of the four push buttons on the left hand side of the display panel. The picture which indicates the normalized scaling is printed on the face of the lighted push button switch. The left hand switch scales the output signal to display positive X and Y values. The second switch . displays positive X and allows for both positive· and negative values of the Y variable. Phase plane plots can be displayed by selecting the right hand switch in this set. We have been using this type of output scaling for oscilloscopes in our laboratory for over a year with very satisfactory feedback from student users. The student must have the option of selecting from several X and Y analog output lines. This option is provided at the terminal by depressing either or Figure 3 III [!] then the appropriate number on the keyboard, and then the 1ENTER I key. For example, if 246 Fall Joint Computer Conference, 1970 the student is instructed to plot Xl versus Y4, he would actuat~ the following keys on the terminal keyboard as shown below:· This system allows the student to enter parameter values in the actual problem units, e.g., if the input temperature for a heat exchanger simulation is 150°F, he sets this value rather than some normalized fraction. IENTER I is depressed, the register can be cleared with the ICLR Ikey. The If an error is made before The digital software sets up the linkage between the requested output line and the control unit which switches variable outputs for specific problems to the two analog output lines leading to the classroom. All analog outputs are on the two X and Y lines, but each terminal Z line is energized only at the appropriate time, i.e., in answer toa solution request for that terminal. Parameter entry There are many ways in which parameters can be set frOIn an analog or hybrid computer terminal. In the first terminals we constructed, parameters were simply multiplexed potentiometers connected to the analog computer with long shielded cable. Thumbwheel switches can be effectively used to set digital to analog coefficient units or servo potentiometers .at the analog computer. Since this hybrid terminal system includes a small digital computer, parameters will be entered with a fifteen digit keyboard as indicated in Figure 4. The parameter function switch, I rpARI, keyboard, I and ENTER keys are used in the following sequence. Suppose the student wishes to set parameter number four at a value of 132. He would depress switches in the following sequence: IpARI EI EJ EJ o IENTER I or I 100000 PAR lENTERI If the parameter number three were less than unity, say 0.05, he would enter or I PARI EJElOEJ01ENTERI I PAR lEI 0 EJ E]IENTER I use of actual rather than normalized parameters requires additional registers in the terminal but is essential for beginning students. We must remember that they are studying the dynamic system, not analog computer scaling. It is also in keeping with the concept of using the terminal as input and output for the hybrid computer. If the parameters represent frequency, temperature, pressure, etc., they should be entered as they appear in the problem statement if at all possible. Since "parameter values are scaled by a program in the digital computer, scientific notation can also be used to permit both E and F format data entry from the hybrid terminal. The digital software interprets the input data to separate the mantissa and exponent portions of the number entered in E format. For example, the student might enter fENTERI and the digital computer would convert this number to +0.00015. Reading parameter values One significant advantage of the thumbwheel switch parameter entry as opposed to the keyboard is the ability to remember a specific parameter value at any time. If the student forgets the value, he needs to be able to display it at the terminal on request. This capability is provided by a six character plus sign display module located in the upper left corner of the terminal as shown in Figures 3 and 4. This display unit automatically shows the" student that his parameter value was correctly entered at the keyboard and accepted by the digital computer. The format of the output display is controlled by the digital computer software. In addition to displaying the correct value of the parameter when entered at the keyboard, the Development of Analog/Hybrid Terminals value of any parameter previously set can be indicated using the I DISPLAY I key. Suppose the student wishes to retrieve but not change the value of parameter three. He would press ~ I , thenmon the I keyboard, and then the DISPLAY key. The digital computer software then causes the proper value to be displayed and light keyboard button number II] to identify the requested parameter number. A separate digital display module could be used to indicate the requested parameter number, but lighting the keyboard number has some advantage when displaying the status of function switches as noted later. feature of the terminal. Thumbwheel switches and keyboard entry of parameters are fine but tend to be somewhat slow when the student is interested in observing the effect of a range of parameter variations on a simulation or fitting a model to experimental data points. A potentiometer on the terminal, as we have employed in the past, avoids this problem but involves transmission of analog signals over long lines. As a compromise, either a two or four position switch is used to increment the value of any selected .parameter at a rate determined by the digital software. Thus, the student increases or decreases a particular parameter value at either a fast or slow rate with this switch. The sequence of operations would be as follows: suppose the student wishes to vary parameter four through a range of values to observe its effect on the solution. He might choose a starting value of zero for this parameter which, for example, might represent the damping coefficient in a linear, second order system. He presses I I, PAR Analog/digital readout display. The student presses 0, followed by the appropriate number on the keyboard, and then the IDISPLAY J function followed by the numbers on the keyboard and then selects the One of the advantages which immediately becomes apparent when the terminal includes digital readout and a small digital computer is the capability of returning numbers which are the result of an iterative calculation. A'· first order boundary value problem where the unknown initial condition or time constant is sought would be one illustration. Another example which we have been using in a basic process instrumentation course is to demonstrate the operation of a voltage to frequency converter or time interval analog to digital converter. Any digital or analog number can thus be returned to a terminal by selecting one of the output lines for key. This display feature is also extremely valuable in presenting calculated results from the hybrid computer through the trunk lines as indicated in the overall system diagram, Figure 1. 247 He then selects the 0 and 0 IENTER I key. I REPEAT SOLUTION I mode to observe the solution each time the sequencer reaches his terminal number. If the student has selected the I STORE I mode, he can then plot a family of curves as he increases the damping factor from zero to unity by pushing the parameter slewing switch in the increase direction. A four position switch allows a choice of either fast or slow incremental changes in the parameter value. Another obvious application for this function is in curve fitting of experimental data with one or more parameters. Function switches Control of the electronic function switches is provided at each terminal. To set function switch one in I FUN I ,fol'o"k I on the key- theiONI position, the student presses lowed by the number board and then the One variable parameter [I] and I ENTER I key. If he wishes to know the present state of any function switch, he The availability of a control which can vary one parameter through some range of values is an important presses 'FUN I , the switch number, and then the 248 Fall Joint Computer Conference, 1970 I DISPLAY I key. The terminal response is to light the function switch number and either the Io"k I or I I OFF keys on the keyborad to indicate the pres- ent state of that particular switch. These function switches can be used in any way desired by the instructor, e.g., adding successive terms of a power or Fourier series to demonstrate the validity of these approximations, adding various controller modes in a process control simulation, etc. The instructor's terminal The instructor's terminal is designated as terminal number zero. This termi~l uses as its primary output device a Tektronix Type 4501 storage oscilloscope instead of the Type 601. Since this scanning oscilloscope has video output, the instructor can display his solution on the closed circuit television monitor for the class at any time. In addition, the instructor has the capability of un blanking any or all student terminals to let them have a "copy" of his solution to compare with· their own. He can also unblank his terminal and pick selected student solutions for display to the rest of the class. SOFTWARE Basic operating system The basic software to serve the analog terminals is written in assembly language for the PDP-8 control computer. This is a 4K machine with hardware multiply and divide, although this feature is not essential for terminal operation. The basic cycle time for the system is controlled by the analog clock which alternately places the analog computer in the initial condition and operate modes every twenty milliseconds. The first ten milliseconds of each initial condition period is to ensure adequate time for problem and function selection by the relay multiplexer. The second ten milliseconds is the normal initial condition time to charge the integrating capacitors as shown below. ::.t~rx NORMfJ. 3IIITCHI~l'} MlIX NOPA!,L, 301 IT CHING ANALOG -21- AN.tI.LOG PROSLEM SOLUT ION ....1 g ~ ....1 g ~ TIME IC TIME AN"; LOG INrrIAL TTI.fE ANALOG OPERATE CONDrrION MODE IC TIME ANALOG INITIAL MODE CONDrrrON MODE 8 Pl R ~ I-- 20 ms --~1Mo1.r--- ~ ~,--- 20 ms - -...... 20 ms --+ A terminal user can activate an action key at any time, e.g., t ENTER I, I DISPLAY I , ISINGLEI ISOLUTION I, or IREPEAT SOLUTIONI . This request for action, along with the necessary data and address is stored in a 32 bit shift register in each terminal. As each terminal is interrogated in sequence by the PDP-8 the action bit is tested. If the user wants service, his data is transferred to a specific core area. For instance, suppose he wishes to set parameter number 1 at a value of 0.32. He activates the following keys: IENTER I. ~ The []] rn 0 m[!] IENTER( key is the action code in this instance. When the sequencer reaches his terminal, this data is transferred to storage in the proper memory locations in the PDP-8. A similar action is taken to set function switches, and select problems or output channels. The basic operating system software controls all of these action operations. When a solution is requested, the parameters, functions, and outputs, along with an unblanking signal, is sent back to the terminal during the next analog computer operate cycle. The basic system software also converts the floating point parameter values supplied by the student to integer values used by the digital coefficient units or digital to analog converters. This feature of floating point data entry requires that the instructor provide the scaling information for each problem as described next in the application software section. A pplication software The application software is written in a special interactive language developed for the PDP-8. This language makes use of a cassette tape recorder in our system, but could be used from the teletype if necessary. The information required by the terminal operating system to convert floating point to integer parameters is their maximum and minimum values. When the instructor is setting up the terminal problem, the computer software solicits responses similar to the following: IDENTIFY YOUR PROBLEM NUMBER The instructor would then type, CHE4 Development of Analog/Hybrid Terminals The computer responds with PROVIDE MAXIMUM AND MINIMUM VALUES FOR THE PARAMETERS If the instructor wishes to give the student control over parameters one and two, he types PAR 1, MAX 50, MIN 25 PAR 2, MAX 0.5, MIN 0 END PAR LIST A similar conversational procedure is used to identify function switches, problems, and analog computer output channels. In our system, this scaling and switching data is stored on the magnetic tape cassette. A paper tape unit could also be used if desired. When the instructor wishes to use the terminal system at a later date, he places his cassette in the tape deck and the proper problem board on the analog computer. From this point on, the problem or problems can be controlled from the individual terminals. COSTS The cost of a system such as described in this paper is naturally dependent on the number of terminals involved. Since our system was developed jointly with Electronic Associates, it is difficult to evaluate the actual development and design costs. The individual terminals, including a type 601 Tektronix storage oscilloscope should be on the order of $3500 to $4000 each. Mini computers such as used in this system would range from $6000 to $10,000 and cassette tape systems 249 are available for about $3000. The major question mark in the estimation of system cost is the hybrid control interface to couple the analog and digital computers. If a special interface could be developed for about $10,000, the cost of a ten terminal system would be on the order of $60,000. This system could be coupled to any analog computer and, of course, provides basic hybrid capability as well as terminal operation. If a hybrid computer were already available, the terminals could be added for about $3500 to $4000 each plus wiring costs. CONCLUSION The key to student and instructor use of these terminals is the development of appropriate handout materials. Several of these handouts have been written in programmed instruction form and have resulted in very favorable feedback from students who used early models of the terminals. Although the complete classroom system will be used for the first time in the fall of 1970, we have been very gratified with student acceptance of the few terminals now in use. Laboratory reports now consist of answering specific questions concerning the dynamic system under study rather than computer diagrams and output. Also, the student can really proceed at his own pace, and return at any time to repeat a laboratory exercise simply by giving the computer operator the problem number . We are excited about the potential of this classroom terminal system and believe that we will see significant improvement in the students' understanding of dynamic systems as the system is used in additional curricula. Computer tutors that know what they teach by L. SIKLOSSY University of California Irvine, California* limited domain (multiple-choice question), or it must match exactly or partially (through key-words) some stored answers. The result of the diagnostic is submitted to a strategy program (K6). The strategy program may use additional data, past performance for instance, to determine the next frame of the course. INTRODUCTION Computer tutors hold the promise of providing truly individualized instruction. Lekanl lists 910 Computer Assisted Instruction (CAl) programs, and this large number demonstrates the ,\~de interest in the field of computer tutors. The computer is eminently suited for the bookkeeping tasks (averaging, record keeping, etc.) that are usually associated with teaching. In such non-tutorial tasks, the computer is greatly superior to a human teacher. On the other hand, in strictly tutorial tasks, the computer is usually handicapped. In particular, CAl programs seldom know the subject matter they teach, which can be seen by their inability to answer students' questions. We shall consider the structure of tutorial CAl programs and discuss some of their shortcomings. Some of the latter are overcome by generative teaching systems. Finally, we shall outline how we can construct computer tutors that know their subject matter sufficiently to be able at least to answer questions in that subj ect matter. KI: STUDENT ANSWER "IGNORANT" CO}V[PUTER TUTORS K5: DIAGNOSTIC: COMPARE STUDENT ANSWER TO A FINITE NUMBER OF STORED ITEMS Most CAl programs have a structure very close to that of mechanical scrambled textbooks. These textbooks and their immediate CAl descendants, which we shall call selective computer teaching machines, * consist of a number of frames. Figure 1 shows the structure of a frame of a selective computer teaching machine. The computer tutor may either start the frame with some statements (box labelled K2), or directly ask the student some question (K3). The student's answer (K4) is compared to a finite store of anticipated responses (K5). The answer itself may have been forced into a KG: STRATEGY p.: DETERMINE NEXT MOVE * Present address: University of Texas, Austin, Texas. * To use the terminology of Utta1. 5 p. DENOTES PROGRAM Figure I-Frame of a selective computer teaching machine 251 252 Fall Joint Computer Conference, 1970 Disregarding the bookkeeping tasks that the computer tutor can perform, we shall concentrate on· the structure of the tutor. The two major criticisms that have been levied at selective computer teaching machines are: a. Their rigidity;- Questions and expected answers to these questions have been prestored. b. Their lack of knowledge. They cannot answer a student's questions related to the subject matter that is being taught. GENERATIVE TEACHING MACHINES In an effort to overcome the rigidities of selective computer teaching machines, some researchers have developed generative teaching machines. Figure 2 describes a frame of a generative teaching machine. In this case, the computer tutor, instead of retrieving some question or problem, generates such a question or problem, a sample of some universe. The generation is accomplished by a program called the generator program (L2). LI! p. DENOTES PROGRAM The sample is presented to the student who tries to manipulate the sample: i.e., answer the question or solve the problem. Concurrently* another program, the performance program, manipulates the sample (L5). The performance program knows how to manipulate samples in the universe of the subject matter that is being taught. Before continuing with our description, we shall give some examples of generative teaching machines. Uhr has described a very elementary system to teach addition. A program by Wexler 3 teaches the four elementary arithmetic operations. Peplinski4 has a system to teach the solution of quadratic equations in one variable. Uttal et al. 5 describe a system to teach analytic geometry. When Wexler's system teaches addition, the generator program generates a sample, namely two random numbers that satisfy certain constraints. An example would be: four digits long, no digit greater than 5. (Note that the number of possible samples may be very large.) The performance program simply adds the two numbers. A diagnostic program (L7) analyzes the differences between the answers of the student and the performance program. In Wexler's system, the numbers given by the student and the system mayor may not be equal. If unequal, the diagnostic program may determine which digits of the answers are equal, which number is larger, etc. The findings of the diagnostic program, together with other information (such as past student performance), are given to a strategy program (L9). This program may decide to communicate some aspects of the diagnosis to the student-for example : "Your answer is too small." (L11); it may halt (L10); or transfer to a new or the same frame (L1). Transferring to the same frame is not an identical repetition of the previous frame, since usually the generator program ·will generate a different sample. In a generative computer tutor, questions are no longer prestored but are generated by a program. Since the questions are not usually predictable with exactitude, a performance program is needed to answer them. The performance program is at the heart of a computer tutor that knows what it teaches. PROGRAMS THAT KNOW WHAT THEY TEACH UI: ~~~~L- _________ _ COMMUNCATE ASPECTS OF DIAGNOSTIC p. TO STUDENT Figure 2-Frame of a generative teaching machine The performance program of a generative computer tutor can solve the problems in some universe; in other words, we may say that the program knows its subject * This is the meaning of the wavy lines in the flowchart. Computer Tutors That Know What They Teach 253 occurs faster when students generate their own examples. A COMPUTER TUTOR THAT KNOWS SOME SET THEORY Figure 3-Frame of a computer tutor that knows what it teaches matter. We can use the performance program in two additional ways beyond its use in generative tutors. The performance program may answer questions generated by the student. It can also explain how it answers some questions and thereby teach its own methods to the student. Figure 3 describes a knowledgeable computer tutor. The path of boxes L1, L2, L5, L6, L7, L9, L10 and L11 has been discussed above in the framework of a generative tutor. The function of box L12 is to explain to the student the problem-solving behavior of the performance program in those cases when the behavior of the performance program can be fruitfully imitated by a human being. In box L3, the student is allowed to generate samples. The previous path can then be followed: both student and computer· tutor can manipulate the samples with the tutor helping the student. The tutor can also manipulate the sample directly, thereby, in effect, answering a student's question (path L3, L5 /, L6 /). We can even let the tutor make mistakes (L4, L5 /), which gives the student an opportunity to catch the teacher in error (LS). It is important to let the student generate samples so that he can find out what happens in particular cases about which he feels unsure. It is impossible to preprogram a set of samples that would be satisfactory to all students. In addition, some experimental evidence (Hunt;6 Crothers and Suppes7) indicates that learning We shall illustrate the framework of a knowledgeable computer tutor by a program that teaches set union. The subject matter was selected for the ease with which appropriate performance and diagnostic programs could be written. The programming language is the LISP 1.6 version of LISP running on the PDP-10 computer. The program is not completely polished and the outputs have been slightly edited. A more definitive version of the program will be described elsewhere, but we hope to give here enough feeling for the power of such a computer tutor to be appreciated. Sets are represented by unordered lists. The sets (A(B C» and ((C B)A) are equal. Each contains two elements, one of which is itself a set. ( ) is the empty set. A dialogue between the computer tutor and a student follows. We first consider the'computer generative mode: Computer: What is the union of (HI 32) and ((DOG»? Student: (HI). C. : Your answer is not right. Do you see why? S.: No. C. : You missed some elements. The following also belong to the union: (DOG) 32. The union is ((DOG) HI 32) ... The output does not make it evident that the sets were generated using a random set generator. The set generator generates sets of a given number of elements and of a given maximum depth. (A set with no subsets as elements has depth 0.) The atomic elements of a set are randomly selected from an array which includes the atoms HI, 32 and DOG. The length and depth of the set are themselves random numbers in ranges that are input to the set generator routine. The program computes the union of the sets (HI 32) and ((DOG». The union is the set ((DOG) HI 32). This result is compared to the student's response. The computer tutor analyzes the differences between the desired and obtained answers and, in this case, exhibits the terms missing from the student's answer. The path we have followed in Figure 3 is L1, L2, L5, 254 Fall Joint Computer Conference, 1970 L6, L7, L9, and L1l. We shall follow a different path below and let the student generate sets. C. : Type two sets and I will tell you their union. S.: (AS AS) (B D). C. : (AS AS) is not a set. Do you understand why? S.: No. C. : AS are duplicate elements in your first set. Type two more sets. S.: (A V D) ( ). C.: The union of (A V D) and ( ) is (A V D) ... TABLE I-Diagnostic Program for Set Union and Some Possible Comments Cases Determined by Diagnostic Program 1. A=8 1 u8 2 (set equality) C. : Would you like to quiz me ? You type in two or more sets and I will find the union. Then you type whether I am right or wrong ... S.: (RST) (). C.: The union is (R S T). S.: Wrong. C. : No, you're wrong. I was right. Do you see why? S.: No. C. : Type what you think the union is. S.: (R S T). C.: S.: C.: S.: C. : But that's what I typed. Type some sets. (A B) (A). The union is (A). Wrong. Oh my goodness! I did make a mistake ... We expanded the previous path to include box L4. The tutor now may make mistakes, and the student has an opportunity to play teacher. If, when asked "Type what you think the union is," the student had typed "(T R S)," the tutor's response would have been: "The sets (R S T)and (T R S) are equal." Here we have followed the path L1, L3, L4, LS', L6 /, L8, L7, L9 and L1l. The tutor gives a final test to the student. In it, the following appears: C. : What is the union of the following sets: (BIRD ( ) S7) (S7 (DOG) 32 87)? S. : Impossible. C. : You caught me. The second list was not a set. There is nothing like giving impossible problems! Your answer is correct. Your answer is incorrect. 2-1. (81U82)-A~() 2-1-1. «8 1 U 8 2) -A) The inputs proposed by the student are checked for well-formedness. The program finds that (AS AS) is not an acceptable representation for a set since the element AS is repeated. An appropriate diagnostic message is given. The path in Figure 3 was L1, L3 and LS', with a return to Ll. Next the student wants to determine how union operates with the empty set. The path followed was again L1, L3, and LS'. We allow the tutor to make mistakes and give the student a chance to catch them. Possible Partial Comments to Student n 81~( ) You left out some element(s). You left out some element(s) of the first set. 2-1-2. «8 1 U 8 2) -A) n 82~( ) You left out some element(s) of the seconp. set. 2-2. A -(8 1 U 82)~( ) Some element(s) in your answer are neither in 8 1 nor in 8 2 • We have not yet coded the introspection program that would explain to the student how the performance program calculates set unions. Table I lists diagnoses that can be used in teaching set union. The two sets are S1 and S2: the student's answer is A. We assume that all sets have been checked for well-formedness. U, n and - denote set union, intersection and difference. The tutor can diagnose not only that (for instance in Table I, case 2-2) some elements in the answer should not have been there, but also tell the student which elements these are. Table II lists dIagnoses that can be used in teaching set intersection. The two tables show how algorithmic computations allow the computer tutor to pinpoint the student's errors. TABLE II-Diagnostic Program for Set Intersection and Some Possible Comments. Cases Determined by Diagnostic Program 1. A =8 1 n 8 2 (set equality) 2. A~81n82 2-1. (81n82)-A~() Possible Partial Comments to Student Your answer is correct. Your answer is incorrect. You left out some element(s) which belong to both 8 1 and 82. Some elements in your answer do not belong to both 8 1 and 8 2 • Some element(s) in your answer belong to 8 2 but not to 8 1 • Some element(s) in your answer belong to 8 1 but not to 8 2 • Some elements in your answer are neither in 8 1 nor in 8 2 • Computer Tutors That Know What They Teach The symbol-manipulating capabilities required of the computer tutor would be difficult to program using one of the computer languages that were designed specifically to write CAl programs. RECIPE FOR A KNOWLEDGEABLE COJ\1PUTER TUTOR The framework of Figure 3 shmvs explicitly how a knowledgeable computer tutor can be built. First, we need a performance program which can do what we want the student to learn. We have programs that appear to know areas of arithmetic, algebra, geometry, group theory, calculus, mathematical logic, programming languages, board games, induction problems, intelligence tests, etc. The computer models which have been developed in physics, chemistry, sociology, economics, etc., are other examples of performance programs. To complete the computer tutor, attach to the performance program appropriate generator, diagnostic, strategy and introspection programs. We used our recipe for a knowledgeable computer tutor to develop a tutor to teach elementary set theory and gave examples of the capabilities of this tutor. The manpower requirements for the development of a computer tutor are considerable and we have not applied the recipe to other areas. Our demonstration, therefore, remains limited but we hope that it was sufficiently convincing to encourage other researchers to develop more knowledgeable and powerful computer teaching systems. The major difficulty that we experienced was in the area of the topic of understanding of the diagnostic program. In particular, linguistic student responses could not be handled in general. Presently, we only accept very limited student answers expressed in natural language. The development of computer programs which better understand language* would lead to a much more natural interaction between student and tutor. CONCLUSION :\1ost CAl programs cannot answer student questions for the simple reason that these programs do not know * See Simmons8 for an effort in that direction. 255 the subject matter they teach. We have shown how programs that can perform certain tasks could be augmented into computer tutors that can at least solve problems or answer questions in the subject matter under consideration. We gave as an example a program to teach set theoretical union and showed the diagnostic capabilities of the tutor. These capabilities are based on programs and are not the result of clever prestored examples. The student-tutor interaction will become less constrained after enough progress has been made in computer understanding of natural language. ACKNOWLEDGIVIENTS J. Peterson and S. Slykhouscontributed significantly to this research effort. REFERENCES 1 H A LEKAN I ndex to computer assisted instruction Sterling Institute Boston Mass 1970 2 L UHR The automatic generation of teaching machine programs Unpublished report 1965 3 J D WEXLER A self-directing teaching program that generates simple arithmetic problems Computer Sciences Technical Report ~ 19 University of 'Wisconsin Madison Wisconsin 1968 4 C A PEPLINSKI A generating system for CAl teacMng of simple algebra problems Computer Sciences Technical Report ~ 24 University of Wisconsin Madison Wisconsin 1968 5 W R UTTAL T PASICH M ROGERS R HIERONYMUS Generative computer assisted instruction Communication ~ 243 Mental Health Research Institute University of Michigan Ann Arbor 1969 6 E B HUNT Selection and reception conditions in grammar and concept learning J Verbal Learn Verbal Behav Vol 4 pp 211-215 1965 7 E CROTHERS P SUPPES Experiments in second-language learning Academic Press New York New York Chapter 6 1967 8 R F SIMMONS Linguistic analysis of constructed student responses in CAl Report TNN-86 Computation Center The University of Texas at Austin 1968 \ Planning for an undergraduate level computer-based science education system that will be responsive to society's needs in the 1970's by JOHN J. ALLAN, J. J. LAGOWSKI and MARK T. MULLER The University of Texas at Austin Austin, Texas Digital computer systems have now been developed to the point where it is feasible to employ them with relatively large groups of students. As a result, defining the problems involved in the implementation of computer-based teaching techniques to supplement classical instructional methods for large classes has become a most important consideration. Whether the classes are large or small, colleges and universities are faced with presenting increasingly sophisticated concepts to continually-expanding numbers of students. Available teaching facilities, both human and technical, are increasing at a less rapid rate than the student population. Typically, the logistics of teaching science and engineering courses becomes a matter of meeting these growing demands by expanding the size of enrollments in lectures and laboratory sections. It is now apparent that we can no longer afford the luxury of employing teachers in non-teaching functions -whether on the permanent staff or as teaching assistants. Many chores such as grading and record keeping as well as' certain remedial or tutorial functions do not really require the active participation of a teacher, yet it is the person hired as a teacher who performs these tasks. Much of this has been said before in various contexts; however, it should be possible to solve some of these problems using computer techniques. In many subjects, there is a body of information that must be learned by the dtudent but which requires very little teaching by the instructor. Successful methods must be found to shift the onus for learning this type of material onto the student thereby premitting the instructor more time for teaching. Thus, computer techniques should be treated as resources to be drawn upon by the instructor as he deems necessary, much the same as he does with books. Basically, the application of computer techniques is supplemental to, rather than a supplantation oj, the INTRODUCTION The purpose of this paper is to discuss the planning of an undergraduate level computer-based educational system for the sciences and engineering that will be responsive to society's needs during the 1970's. Considerable curriculum development research is taking place in many institutions for the purpose of increasing the effectiveness of student learning. Despite the efforts under way, only limited amounts of course matter using computer-based techniques are available within the sciences and engineering. Planning for a frontal attack to achieve increased teaching effectiveness was undertaken by the faculty of The University of Texas at Austin. This paper presents the essence of these faculty efforts. An incisive analysis of the state of the art with regard to the impact of technology on the educational process is contained in the report "To Improve Learning" generated by the Commission on Instructional Technology and published by the U.S. Government Printing Office, March, 1970. 1 The focus is on the potential use of technology to improve learning from pre-school to graduate school. The goals stated in the above report are (1) to foster, plan, and coordinate vast improvements in the quality of education through the application of new techniques which are feasible in educational technology, and (2) to monitor and coordinate educational resources. AN OVERVIEW OF COMPUTER-BASED TEACHING SYSTEMS Until recently, interest in using machine-augmented instruction has been centered primarily on research in the learning processes and/or on the design of hardware and software. 257 258 Fall Joint Computer Conference, 1970 human teacher. The average instructor who has had little or no experience with computers may be awed by the way the subj ect matter of a good instructional module can motivate a student. If the program designer has been imaginative and has a mastery of both his subject and the vagaries of programming, the instructional module will reflect this. On the other 'hand, a pedantic approach to a subject and/or to programming of the subject will also unfortunately be faithfully reflected in the module. Thus, just as it is impossible to improve a textbook by changing the press used to print it, a computer-based instructional system will not generate quality in a curriculum module. Indeed, compared with other media, computer methods can amplify pedagogically poor techniques out of proportion. The application of computer techniques to assist in teaching or learning can be categorized as follows: 1. Computer Based Instruction (CBI)-this connotes interactive special purpose programs that either serve as the sole instructional means for a course or at least present a module of material. 2. Simulation a. of experiments 1. that allow "distributions" to be attributed to model parameters 2. that are hazardous 3. that are too time consuming 4. whose equipment is too expensive 5. whose principles are appropriate to the student's level of competence, but whose techniques are too complex b. for comparison of model results with measurements from the corresponding real equipment 3. Data Manipulation (can be interpreted as customarily conceived computation) for a. time compression/expansion-especially in the laboratory, as in data acquisition and reduction with immediately subsequent "trend" or "concept" display b. advanced analysis in which the computation is normally too complex and time consuming c. graphic presentation of concepts-possibly deflections under dynamic loading, molecular structures, amplitude ratio and phase lead or lag, ... 4. Computer-Based Examination and Administrative Chores to accompany self-paced instruction. SYSTEM DESIGN PHILOSOPHY Design concepts that foster a synergistic relationship between a student and a computer-based educational system are based upon the following tenets: 1. The role of the computer in education is solely that of a tool which can be used by the average instructor in a manner that (a) is easy to learn, (b) is easy to control, and' (c) supplements instructional capability to a degree of efficiency unattainable through traditional instructional methods. 2. Computer-based' education, although relatively new, has progressed past the stage of a research tool. Pilot and experimental programs have been developed to the point where formal instruction has been conducted in courses such as physics2 and chemistry.3.4 Despite this, the full potential of these new techniques has yet to be realized. Future systems, that are yet to be designed, must evolve which will provide sufficient capacity, speed and flexibility. These systems must be able to accommodate new teaching methods, techniques, innovations and materials. Programming languages, terminal devices and communications should be so conceived as to not inhibit pedagogical techniques which have been successfully developed over the years. The system should incorporate new requirements which have been dictated through progressive changes in education and adjust without trauma or academic turbulence. 3. Initial computer-based instructional systems should be capable of moderate growth. Their role will be that of a catalyst to expedite training of the faculty as well as a vehicle for early development of useful curriculum matter. Usage by faculty will grow in parallel with evolving plans for future systems based upon extensive test and evaluation of current course matter. The design of individual instructional modules to supplement laboratory instruction in the sciences and engineering will follow the general elements of the systems approach which has gained acceptance in curricular development. 5 This approach to curriculum design can generally be described as a method for analyzing the values, goals, or policies of hu'man endeavor. The method as applied to computer-assisted instruction has been described in detail by Bunderson. 6 Although there have been several different descriptions of the systems approach to the design of curricular materials, two major elements are always present: course analysis and profiling techniques. Course analysis consists of 1. a concise statement of the behavioral objectives Computer Based Science Education System of the course expressed in terms of the subject matter that is to be learned. 2. the standard that each student must reach in the course. 3. the constraints under which the student is expected to perform (which may involve an evaluation of his entering skills). The results of the course analysis lead to an implementation of the suggested design by incorporating a curriculum profile ("profile techniques"), which contains L function analysis, i.e., the use of analytical factors for measuring the parameters of a task function 2. task analysis,7 i.e., an analysis that identifies in depth various means or alternative courses of action available for achieving specific results stated in the function analysis 3. course synthesis, the iterative process for developing specific learning goals within specific modules of instructional material. A general flow diagram which shows the relationship between the elements in the systems approach to curriculum design appears in Figure L STATEMENT OF CONSULTANTS BEHAVIORIAL OBJECT IVES PRODUCTION OF MATERIALS 259 Goals and behavioral objectives Some of the longer range goals defined should be to demonstrate quantitatively the increased effectiveness gained by computer,:-based techniques and further, to develop skill in the faculty for "naturally" incorporating information processing into course design. Another long range goal should be to effect real innovation in the notions of "laboratory courses" and finally, to instill in graduating students the appreciation that computers are for far more than "technical algorithm execution." In more detail, some of the goals are to: L Plan, develop and produce computer-based course material for use in undergraduate science and engineering courses to supplement existing or newly proposed courses. Course matter developed will utilize a systems approach and consider entering behaviors, terminal objectives, test and evaluation schema, provisions for revision, and plans for validation in an institutionallearner environment. 2. Produce documentation, reports, programs and descriptive literature on course matter developed to include design and instructional strategies, flow charts and common algorithms. 3. Test course matter developed on selected classes or groups for the purpose of evaluation, revision and validation. This is to determine individual learner characteristics through use of techniques such as item analysis, and latency response using interactive languages. 4. Promote and implement an in-depth faculty involvement and competency as to the nature, application and use of computer-based instructional languages, course matter and techniques. 5. Compile, produce and make provisions for mass distribution of such computer-based materials as are tested, validated and accepted for use by other colleges or institutions participating in similar programs. In an academic environment the use of time-sharing computers for education is gradually being incorporated as a hard-core element, which in time will become so commonplace a resource for faculty and student use that it will serve as an "educational utility" on an interdisciplinary basis. To achieve a high degree of effectiveness of such systems, the faculty using computerbased teaching techniques as a supplement to laboratory type instruction must initially become involved. The pedagogical objectives of this whole approach are: Figure i-Systems approach to curriculum design 1. To correct a logistic imbalance: i.e., a condition marked by the lack of a qualified instructor and/ 260 Fall Joint Computer Conference, 1970 2. 3. 4. 5. 6. 7. or the proper equipment and facilities to perform a specific teaching task (or project) being at a specific place at a specific time. To provide more information per unit time so that, as time passes, the course will provide inincreasingly in-depth knowledge. To allow new ranges of material to be covered when one specifically considers things currently omitted from the curriculum because of student safety or curriculum facility costs. To increase academic cost effectiveness, because it is certainly expected that adroit information processing will free the faculty from many mundane tasks. To provide both parties with more personal time and flexibility, because it is anticipated that considerable amounts of time are to be made free for both student and faculty. To make a computer-based system the key to the individualization of mass instruction by utilizing dial-up (utility) techniques. To develop laboratory simulation so that it is no longer a constriction in the educational pipeline because of limited hardware and current laboratory logistics. Evaluation criteria In general, there are two distinct phases to the evaluation ot instructional modules. The first of these coincides with the actual authoring and coding of the materials, and the second takes place after the program has advanced to its penultimate stage. These two types of evaluation can be referred to as "developmental testing" and "validation testing," respectively. Developlllental testing This testing is informal and clinical in nature and involves both large and small segments of the final program. The specific objective of this phase of the development is to locate and correct inadequacies in the module. It is particularly desirable at this point of development to verify the suitability of the criteria which are provided at the various decision points in the program. It is also anticipated that the program's feedback to the students can be improved by the addition of a greater variety of responses. Finally, testing at this stage should help to uncover ambiguous or incomplete portions of the instructional materials. Relatively few students (about 25) will be required at this stage of evaluation; however, since this phase is evolutionary, the exact number will depend on the nature and extent of the changes that are to be, and have been, made. Materials which are rewritten to improve clarity, for example, must be retested on a small scale to establish whether an improvement has in fact been made. A final form of this program, incorporating the revisions based on this preliminary testing, will be prepared for use by a larger group of students. Validation testing The formal evaluation of the programs will occur in a section (or with a part of a section) of a regularly scheduled class. It is assumed that a selection of students can be obtained that is representative of the target population for which the materials are designed. One of the following two methods for obtaining experimental and control groups is suggested, depending upon circumstances existing at the time of the study (i.e., number of sections available, whether they are taught by the same instructor, the willingness of instructor to cooperate, section size, etc.). The preferred method is to arbitrarily designate one course section experimental and one course section control with the same instructor teaching both sections. An assumption here is that the registration procedure results in a random placement of students in the sections. The alternative method of selecting students follows. The instructional programs are explained to the total student group early in the semester, and a listing of those students who are willing to participate is obtained. Two samples of approximately equal size are randomly selected from this list. One sample of students is then assigned to work with the computerassisted instructional facilities and is designated as the experimental group. The criteria used to evaluate the programs are as follows: 1. The extent to which students engage in and/or attain the behavioral objectives stated for the program. For tutorial and remedial programs pre- and post-tests are the instruments for measuring attainment and will help answer the question: Does the computer-based supplement actually teach what it purports to teach? For experiment simulations, the successful completion of the program is prima facie evidence that the student has engaged in the desired behaviors. 2. Achievement as measured by the course final examination. Computer Based Science Education System TABLE I-Curriculum Development Research Tasks CONTROL EXP. SAME INSTRUCTOR STANDARD I ZED STANDARDIZED AREA EXAM AREA EXAM FORM 1 FORM 1 ATTITUDE ATTITUDE INVENTORY INVENTORY COURSE CONTENT + Task Title COURSE CONTENT PRE-TEST CAl ~ I I I I FINAL EXAM I I I .~ I ND I CATES OVERALL EFFECT AND EFFECT ON _-------1.------1 2; I.=> STANDARDIZED AREA EXAM FORM ATTITUDE INVENTORY I I I I I EFFECT RETENTION INDICATES' NET GAIN IN LEARNING AND CHANGES STATISTICAL ANALYSIS OF DATA: TECHN I QUES Department Dr. J.J.A. Assistant Professor Mech. Engr. Aerospace Structural D, IG, MM, PrS, Sim, Stat, Res Dr. E.H.B. Assistant Professor Aero. Engr. Theoretical Chemistry Sim, D, Rsim Dr. F.A.M. Professor Dr.R.W.K. Faculty Associate Chemistry Biophysical Analysis SIM, Rsim, Stat Dr. J.L.F. Assistant Professor Zoology IN ATTITUDE I THE EQUIVALENT OF AN ANALYSIS OF COVARIANCE USING REGRESSION Research Investigator D,IG,MM, PrS, Res __ J_______ t ___ ~ I I ~ INDICATES IMMEDIATE I I Type Application Machine Element Design SUPPLEMENT POST-TEST HOUR EXAM OVER COURSE CONTENT 261 I I I I L! ____________ ::.1 Application Code D-Drill and Practice; G-Graphics; IG-Interactive Graphics; MM-Mathematical Modeling, Gaming; PrS-Problem Solving; OLME-On Line Monitoring of Experiments; Sim-Simulation; Rsim-Real Experiments Plus Simulation; Stat-Statistical Data Processing ; Res-Research in Learning Processes. Figure 2-Test and evaluation schema for overall program 3. Net gain in achievement as determined by preand post-testing with the appropriate standardized area examination. 4. Changes in student attitudes as measured by pre- and post-attitude inventories. The statistical treatment used to evaluate the programs in light of the above criteria will be a multiple linear regression technique equivalent to analysis of covariance for criteria 2 and 3. The covariables used will include, (1) A placement exam score (if available); (2) the beginning course grade (for students in advanced courses); (3) sex; (4) SAT* scores, high school GPA**; and (5) other pertinent information available in the various courses. The key elements of the test and evaluation schema are shown schematically in Figure 2. IMPLEMENTATION Planning is one of the most important aspects of CBI and when done properly yields what we might call a "systems approach" to curriculum or course design. This means not only the setting up of a course * Scholastic Aptitude Test. ** Grade Point Average. of instruction by following a comprehensive, orderly set of steps, but also a plan for involving other faculty. No two curriculum designers will agree as to what constitutes the essentials in outlining a course in its entirety before beginning. However, there are certain sequential or logical steps that can be defined precisely. These steps have been designated below as the course planning "mold" and are based upon actual in-house experience in planning, developing, testing and evaluation of course matter. Examples of this planning as shown in the entries in Tables I and II are authentic TABLE II-Curriculum Development Research Task Cost Task Title Total Duration Dollar (Mo.) Value* Personnel Cost Computer Time Cost Elements of Design 27 $72,440. $31,120. $41,320. Aerospace Structures 31 $111,720. $45,920. $65,800. Theoretical Chemistry 24 $53,150. $36,900. $16,250. Biophysical Analysis 19 $43,250. $23,600. $19,650. * Not including hardware cost. 262 Fall Joint Computer Conference, 1970 and have been derived by requesting each new associate investigator to fit the approach to beginning his work into a "mold." Descriptive material for each prospective participating faculty member is shown in the following example, * and is organized as follows: 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. Title Introduction Objectives Project Description Method of Evaluation Impact on the Curriculum Pedagogical Strategy Current or Past Experience Related Research Plans for Information Interchange Detailed Budget * Time Phasing Diagram Application Cross-Reference Chart EXAMPLE OF A PROSPECTIVE ASSOCIATE INVESTIGATOR'S PROPOSAL: INTERACTIVE DESIGN OF MACHINE ELEMENT SYSTEMS machine elements. One of the primary goals is to allow students to consider machine element systems instead of the traditional approach of the machine elements. The object of this is to show the interplay of the deflections and stresses due to imposed forces on machine element systems. Another objective is to allow distributions to be attributed to parameters of systems normally considered as discrete. Latest in the series of engineering texts are those that consider probablistic approaches to design. Finally in the design process, by definition, a task performed at one instant in time does not give the observer of a designer any information about what the designer's next task might be. Hence, it' is most important to construct an environment that will allow the student to "meander" through the myriad of possibilities of. alterations of a nonfeasible solution of a design problem that might make it subsequently a feasible solution. Another objective of using an interactive system is so considerations not necessarily sequential in an algorithmic sense can be superposed for the student's consideration. That is, he can consider system reliability, system structural integrity, and system economics simultaneously and make design decisions based on anyone of those fundamental considerations. In troduction Project description Engineering design is a structured information/ creative process. The synthesis of physically feasible solutions generally follows many design decisions based on processed available information. The structure of design information has been defined in the literature. 8 Processing this design information in a scientific and logical method is important to the understanding of how one teaches engineering design, particularly the part that computer augmentation might play in the process. Essentially, the computer can relieve the engineering designer of those aspects of problem manipulation and repetitive computation that might otherwise distract him from the context of his problem. The fundamental principle here is to have the designer making decisions, not calculations. Objectives There are several objectives to being able to put a student in an interactive situation for the design of * It should be emphasized that the courses listed in Tables I and II represent only a portion of the curriculum matter which is being or needs to be developed. * Sample forms for items 11, 12, and 13 can be obtained from the authors. The form for item 11 in particular gives the details necessary to arrive at a project's estimated dollar cost. This project involves the integration into a conventional classroom of one or more reactive CRT displays. The addition of computer augmentation to the traditional teaching of engineering element systems design is proposed, in order to satisfy the needs of the undergraduate student who would like to be able to synthesize and analyze more realistic element systems than is now pedagogically possible. The following specific desired functions in the computer-based system are necessary to make it an easily accessible extension to the student designer's functional capabilities: 1. Information storage and retrieval-The system must be able to accept, store, recall, and present those kinds of information relevant to the designer. For example, material properties, algorithms for analyzing physical elements, conversion factors, etc. 2. Information Interchange-The system must be able to interact with the designer conveniently so that he can convey information with a minimum of excess information during the operation of the system. 8 3. Technical Analysis Capabilities-The system must be able to act on the information pre- Computer Based Science Education System sented, analyze it, and present concise results to the designer. 4. Teaching and Guidance Capabilities-The system must provide the student designer with information necessary to enable him to use the system and also to tutor him in the context of his problems. 5. Easy Availability-The system should be available to the student user and adaptable to his modifications. ' In essence this project can be described as the design and implementation of a computer-based reactive element design system which would be as integral a part of a teaching environment as the blackboard, doors and windows. Further, in this project the amount of additional material in terms of reliability, value analysis, noise control and other modern engineering concepts that can be integrated into the standard curriculum can be conveyed more effectively and to a deeper degree. Method of evaluation The effectiveness of this method will be evaluated by presenting the students with more comprehensive design problems at the end of the semester. Several observations would be made. First, they would be able to take into account more readily the interaction between systems elements in a mechanical network. They should not require as many approximations with respect to fatigue life, method of lubricating bearings, etc. Illlpact on the curricululll The effect on the curriculum will be as follows. The Mechanical Engineering Department at The U niversity of Texas has what is known as "block" options. In this way, a student in his upper-class years may select some particular concentrated area of study. Part of the material that he now receives in one of his block courses 366N could be moved back to his senior design course 366K. The effects on the curriculum should be to free, for additional material, one-half semester of a threehour course at the senior undergraduate level for those maj oring in design. Pedagogical strategy The strategy of the course at present is the solution of authentic design problems solicited from industry. 263 However, all analysis techniques are currently confined to paper and pencil with computer algorithms where practical. The new strategy will add the ability to analyze mechanical systems networks on an interactive screen and have an opportunity to manipulate many possible ideas (both changes in topology and changes in parameters) per unit time. Current or past experience The investigator has spent several years on the development of interactive graphic machine design systems both in industry and in academic institutions. Further, the investigator has been teaching the design of machine element systems since 1963 and prior to that practiced design of machine element systems since 1958. Related research Professor Frank Westervelt at the University of Michigan is currently conducting research in the area of interactive graphic design. His maj or emphasis is on the design of the executive systems behind interactive computer graphic systems. Dr. Milton Chace also at the University of Michigan uses some software generated by Dr. Westervelt's group. His concentration is on using Lagrange's method to describe dynamic systems and be able to manipulate the problem's hardware configurations on the screen. Professor Daniel Roos of the MIT Urban Systems Laboratory is now putting a graphic front end on his ICES system. Professor Gear at the University of Illinois has a system running on a 338/360-67 system and his primary interest there is to work with people who are manipulating electrical network topologies and doing network analysis. Professor Garrett at Purdue University is using graphic terminal access to his university's computer for performing a series of analyses on mechanical engineering components. The unique aspect of the work that is envisioned here, as opposed to all above mentioned relative works, is primarily that it will be an instructional tool. That is, the computer is recognized as an analysis tool, a teacher, and an information storage and retrieval device. With respect to each of the above, there are two connotations. With respect to analysis, the computer will perform analysis in its traditional sense of engineering computation and it will also analyze the topology of mechanical systems as drawn on the screen such that the excess information between the designer and the computer will be reduced. Concerning teaching, 264 Fall Joint Computer Conference, 1970 the computer will work with the student not only to teach him how to use the system, but also in another sense it will help him learn about his problem's context by techniques such as unit matching and unit conversion and checking topological continuity before allowing engineering computations. With respect to information storage and retrieval, traditional parameters such as material yield strengths and physical properties of fluids will be available. Another concept in interactive design will be developed when the system can work with the student and help him optimize apparently continuous systems made up of objects such as bearings that only come in discrete quantities, for instance VB inch increment bores, etc. Plans for inforInation interchange Information interchange, of course, has two meanings: (1) how to get information from other schools to the University of Texas, and (2) how to disseminate the results of my research to other interested parties. It is the responsibility of the principal investigator to insure that information from professional society meetings, personal visits to other campuses, research reports, published articles, personal communication, and "editorial columns" in both regional and national newsletters of societies such as ASME, IEEE, SID, SCI, ACM, and others are disseminated to his research associates. Further, it is the researcher's responsibility to generate research reports, published articles, take the responsibility for keeping his latest information in "news items," attend meetings, and write personal letters to other people in the field who are interested to help keep them abreast of his current research activity. EXPERIENCES TO DATE Completed projects During the academic year 1968-69 two pilot projects were conducted on the use of computer-based techniques in the instruction of undergraduate chemistry. Fifteen curriculum modules were written for the General Chemistry project and eight for Organic Chemistry. In each instance the modules covered subjects typical of the material found in the first semester of each course. Each module was "debugged" and polished by extensive use of student volunteers prior to use in the pilot studies. In each study, the modules were used to supplement one section of the respective courses under con- trolled conditions. Results from this two-semester study on computer-based applications in organic chemistry indicate a high degree of effectiveness. Not only was the study effective in terms of student score performance, but also in terms of outstandingly favorable attitudes towards computer-based instruction by students and professors. Concurrently, faculty members of the College of Engineering have also conducted various courses using computer-based techniques. Dr. C. H. Roth of the Electrical Engineering Department has developed a computer-assisted instructional system that simulates a physical system and provides computer-generated dynamic graphic displays of the system variables. 9 An SDS 930 computer with a display scope is used for this project. The principal means of presenting material to the students is via means of visual displays on the computer output scope. A modified tape recorder provides audio messages to the student while he watches the scope displays. A flexible instructional logic enables the presentation of visual and audio material to the student and allows him to perform simulated experiments, to answer questions (based on the outcome of these experiments) and provides for branching and help sequences as required. Instructional programming is done in FORTRAN to facilitate an interchange of programs with other schools. Dr. H. F. Rase, et al.,l0 of the Chemical Engineering Department have developed a visual interactive display for teaching process design using simulation techniques to calculate rates of reaction, temperature profiles and material balance in fixed bed catalytic reactors. Other computer-based instructional programs developed include similar modules in petroleum engineering and mechanical engineering. In each instance, the curriculum modules are used to supplement one or more sections of formal courses. In several instances it is possible to measure the course effectiveness against a "control" group. Course modules generally contain simulated experiments, analysis techniques using computer graphics, and some tutorial material. The general objectives for all of the aforementioned studies are as follows: 1. To give the students an opportunity to have simulated laboratory experiences which would otherwise be pedagogically impossible. 2. To provide the students with supplemental material in areas which experience has shown to be difficult for many students. 3. To provide a setting in which the student is allowed to do a job that professionals do; ·i.e., collect, manipulate and interpret data. 4. To give the student the feeling that a real, con- Computer Based Science Education System cerned personality has been "captured" in the computer to assist him. 5. To individualize the student-computer interactions as much as possible by allowing the student to pace himself. 6. To give the student a one-to-one situation in which he can receive study aid. Present state of computer-based learning systems Curriculum development in engineering and the sciences is being accomplished within the University of. Texas through the Computation Center CDC-6600 system using conversational FORTRAN within the RESPOND time-sharing system. Also, small satellite pre- post-processors such as a NOVA, a SIGMA 5 and anSDS-930 are linked to the CDC-6600. Of special significance is the Southwest Regional Computer Network (SRCN) which is now in the early stages of use and is built around the CDC-6600. Some eight intrastate colleges and universities are linked. Workshop and user orientation sessions are still under way concurrent with curriculum development. The integration of course matter for this network will be accomplished during the next two to three years. A discussion of the factors involved in curriculum software development and some evaluation aspects follows. Cost factors and management considerations When developing an accounting system for examining the cost of generating software for teaching, one readily realizes that there is the traditional coding and the subsequent assembly. and listing. However, because academic institutions frequently have this class of endeavor funded from agency sources, a next step is very frequently hardware installation (if the terminals or the central processor have not been "on board" prior to the initiation of the research). Once the hardware is available, however, on-line debugging can take place and the first course iteration can begin. The next step is the first course revisiQn in which the material is rewritten, possibly restructured, to take advantage of experiences on the first pass. Then, the second course iteration is performed and finally the second course revision. The estimates for coding, assembly, listing, and debug time, etc., are a function of 1. the computer operating system and its constraints. 2. the coding proficiency of the professionals involved. 265 3. the way in which the computer is supposed to manifest itself, that is: a. as a simulation device in parallel with the real experiment, b. as purely a simulation device, c. as a base for computer-augmented design, d. for data acquisition, e. for computer-based quizzes, f. for computer-based tutorial and drill, g. or finally, for computation. 4. the degree of reaction required, that is in the physiological sense, how the computer must interact for the user's latent response. 5. the extent of the data base required and the data structure required to allow one to manipulate that data base. The primary considerations are organization and control. In this work the organization consists of the following. At each college there is a project coordinator who is the line supervisor for the secretarial services, programming services, and technical implementation services, and who acts as the coordinator for consultants. At the University level there exists a review panel, composed of members of the representative colleges and several people trom other areas such as educational psychology and computing, which evaluates the works that have been conducted in any six-month period and also evaluates the request for continuation of funds to continue a project to its duration. The actual fiscal control of each project is with the project coordinator of that particular college. Also, the purchase of equipment is coordinated at the college level. The control as expressed above is basically a twostep control; a bi-annual project review, plus a budgetary analysis, is accomplished by an impartial review panel. Their function is to act as a check on the context of the material presented and to recommend continuance of particular research tasks. As a further step, this research is coordinated in all colleges by the Research Center for Curriculum Development in Sciences and Mathematics. SUMMARY Dissemination of information Dissemination of information is planned through (1) documentation of publications in the form of articles, reports or letters, (2) symposia to be held which cover procedures and fundamentals for developing curriculum in CBI and (3) furnishing of completed packages of 266 Fall Joint Computer Conference, 1970 course matter on computer tapes, cards or disks to other colleges or institutions desiring, and able to use, this material. Wide publicity will be given completed course matter through such agencies as ERIC, ENTELEK and EDUCOM. The provision for continuing periodic inputs to the above agencies will provide for current availability of curriculum materials. Future outlook The future outlook for time-sharing in computerbased educational systems is extremely bright with respect to hardware development. The advent of the marketing of newer mini-computer models selling for five to ten thousand dollars-some complete with a limited amount of software-is already changing the laboratory scene. The configuration of direct-connect terminals, with the inherent advantage of installation within one building, further enhances the use of this type of system by eliminating the high expense of data lines and telephones. Software in the form of a powerful CBI language for the sciences and engineering designed specifically for minicomputers is perhaps one of the most important needs. Rapid progress has been made in developing a low cost terminal using plasma display or photochromic technology, however, the promise of a low cost terminal has yet to be realized. A small college, high school or other institution may be able to afford a small time-sharing computer system with ten terminals that could meet its academic needs for less than $36,000; however, still lacking isOthe vast amount of adequate curriculum matter. Educators using CBI are in the same position as a librarian who has a beautiful building containing a vast array of shelves but few books to meet academic needs of students or faculty. The task of curriculum development parallels the above situation and must be undertaken in much the same manner as any other permanent educational resource. Educational benefits Benefits that result from implementing this type plan are a function of the synergistic interplay of (1) personnel with expertise in the application of computerbased techniques, (2) computer hardware including interactive terminal capability, (3) faculty-developed curriculum materials, and (4) the common information base into which the entire research program can be embedded. The program can provide students with a learning resource that serves many purposes. That is, the computer can be the base of a utility from which the user can readily extract only that which he needs, be it computation, data acquisition, laboratory simulation, remedial review, or examination administration. At all times the computer can simultaneously serve as an educator's data collection device to monitor student interaction. This modularized dial-up capability can give the students an extremely flexible access to many time-effective interfaces to knowledge. Administrative benefits When this type project has been completed, all users may have access to the results. This unified approach can yield modules of information on cost accounting which can be used as a yardstick by the University. Further, this type project insures that data-yielding performance is of a consistently high quality, and that regardless of the subject matter on any research task the depth of pedagogical quality is relatively uniform. The subject matter developed can serve as the basis for conducting experiments in teaching and continuing curricular reform. Indeed, the association of faculty in diverse disciplines can serve as a catalyst for curricular innovations in the course of preparing materials for computer-based teaching techniques. The instructional efforts described here can serve as the basis for displaying the unity of science and engineering to the undergraduate student in an indirect way. For example, if a student is taking a set of science courses in. a particular sequence because it is assumed each course contributes to an understanding of the next one, it is possible to use programs developed in one course for review of remedial work in the next higher course. Thus, an instructor in biology can assume a certain level of competence in chemistry without having to review the important points in the latter area. Should some students be deficient in those aspects of chemistry that are important to the biology course, the instructor can assign the suitable material that had been developed for practice and drill for the preceding chemistry course. With a little imagination, the instructional system can make a significant impact on the student by giving him a unified view of science or enginering. It is possible to develop both a vertical and horizontal structure for common courses which can be used on an interdisciplinary basis for integration in the basic core curriculum in the various departments where computer-based techniques are used. The revision problem of inserting new and deleting old material in such a system is considerably simplified for all concerned. Computer Based Science Education System There is a vast, largely unexplored area of applications. As time-sharing methods become more widespread, terminal hardware becomes less complex, and teleprocessing techniques are improved, the potential usefulness of computers in the educational process will increase. With the technology and hardware already in existence, it is. possible to build a computer network linking industries and universities. Such a network would (1) allow scientists and engineers in industry to explore advanced curriculum materials, (2) allow those who have become technically obsolete to have access to current, specifically prepared curriculum materials, training aids and diagnostic materials, (3) allow local curriculum development as well as the ability to utilize curriculum materials developed at the university, (4) allow engineers to continue their education, (5) provide administrators with an efficient and time-saving method of scheduling employees and handling other data processing chores such as inventories, attendance records, etc., and (6) provide industrial personnel with easily obtainable and current student records to aid in giving the student pertinent and helpful counseling and guidance. Although complaints have been voiced that computer-based techniques involve the dehumanization of teaching, we argue to the contrary-the judicious use of these methods will individualize instruction for the student, help provide the student with pertinent guidance based upon current diagnostic materials and other data, and allow instructors to be teachers. ACKNOWLEDGl\1ENTS Special thanks are due Dr. Sam J. Castleberry, Dr. George H. Culp, and Dr. L. O. Morgan (Director), of the Research Center for Curriculum Development in Science and Mathematics, The University of Texas at Austin. Also, appreciation is extended to numerous 267 colleagues who have contributed to our approaches with their ideas. REFERENCES 1 To improve learning US Government Printing Office March 1970 2 N HANSEN Learning outcomes of a computer-based multimedia introductory physics course Semiannual Progress Report Florida State University Tallahassee Florida 1967 p 95 3 S CASTLEBERRY J LAGOWSKI Individualized instruction in chemistry using computer techniques J Chem Ed 47 pp 91-97 February 1970 4 L RODEWALD et al The use of computers in the instruction of organic chemistry J Chem Ed 47 pp 134-136 February 1970 5 R E LAVE JR D W KYLE The application of systems analysis to educational planning Comparative Education Review 12 1 39 1968 6 C V BUNDERSON Computer-assisted instruction testing and guidance W H Holtzman Ed Harper & Row New York New York 1970 7 R GAGNE The analysis of instructional objectives for the design In Teaching Machines and Programmed Learning II The National Education Association Washington DC 1965 pp 21-65 8 J J ALLAN Man-computer synergism for decision making in the system design process CONCOMP Project Technical Report 1(.9 University of Michigan July 1968 9 E A REINHARD C H ROTH A computer-aided instructional system for transmission line simulation Technical Report No 51 Electronics Research Center The University of Texas Austin 1968 10 H FRASE T JUUL-DAM J D LAWSON L A MADDOX The use of visual interactive display in process design Journal Chem Eng Educ Fall 1970 The telecommunications equipment marketPublic policy and the 1970's by l\,IANLEY R. IRWIN University of New Hampshire Durham, New Hampshire render service in 85 percent of the geographical sector of the country and account for 15 percent of the remaining telephones in the U.S. The independents are substantially smaller than AT&T and in decreasing size include the General Telephone and Electronics System, United Utilities, Central Telephone Company and Continental Telephone respectively. Although General is by far the largest, the independents have experienced, over the past two decades, corporate merger and consolidation. Western Union Telegraph Company provides the nation's message telegraph service, the familiar telegram and Telex, a switched teletypewriter service. Recently, Western Union has negotiated with AT&T to purchase TWX thus placing a unified switched network under the single ownership of the telegraph company. 2 In addition to their switched services, the carriers provide leased services to subscribers on a dedicated or private basis. In this area, both Western Union and AT&T find themselves offering overlapping or competitive services. The carriers, franchised to render voice or non·· voice service to the general public, reside in an environment of regulation. Licenses of convenience and necessity are secured from either Federal or state regulatory bodies and carry with it a dual set of privileges and responsibilities. In terms of the former, the carriers are assigned exclusive territories in which to render telephone or telegraph services to the public at large-a grant tendered on the assumption that competition is inefficient, wasteful, and unworkable. In terms of the latter, the carriers must serve all users at non·-discriminatory rates, submitting expenses, costs and revenue requirements for public scrutiny and overview. In the United States, the insistence that utility firms be subject to public regulation rests on the premise that economic incentives and public control are neither incompatible nor mutually exclusive. INTRODUCTION The growing interdependence of computers and communications, generally identified with developments in digital transmission and remote data processing services, has not only broadened the market potential for telecommunication equipment but has posed several important public policy issues as well. It is the purpose of this paper to explore the relationship between the telecommunications equipment market and U.S. telecommunication policy. To this end we will first survey the traditional pattern of supplying hardware and equipment within the communications common carrier industry and second, identify recent challenges to that industry's conduct and practices. We will conclude that public policy holds a key variable in promoting competitive access to the telecommunication equipment market-access that will redound to the benefit of equipment suppliers, communication carriers, and ultimately the general public. THE COMMUNICATION COMMON CARRIER As a prelude to our discussion it is useful to identify the major communications carriers in the United States. The largest U.S. carrier, AT&T, provides upwards to 90 percent of all toll or long distance telephone service in the country. In addition to its long line division, AT&T includes some 24 telephone operating companies throughout the U.S.; the Bell Telephone Laboratory, the research arm of the system; and Western Electric, the manufacturing and supply agent of the system.! These entities make up what is commonly known as the Bell System. The non-Bell telephone' companies, include some 1800 firms scattered throughout the U.S. These firms 269 270 Fall Joint Computer Conference, 1970 THE POLICIES AND PRACTICES OF THE CARRIER Given the carriers and their environmental setting, three traits have tended to distinguish the communication industry in the past. These include, first, a practice of owning and leasing equipment to subscribers; second, the policy of holding ownership interest in equipment suppliers and manufacturers; and finally, a practice of taking equipment and hardware requirements from their supply affiliates. Each of these policies has conditioned the structure of telecommunication equipment for several decades and thus merits our examination. Tariffs In the past at least the carriers provided what they term a total communication service. That service embraced loops or paired wires, switching equipment, interoffice exchange trunks and terminal or station equipment. Prior to 1968, subscribers were prohibited from linking their own telephone equipment to the telephone dial network. This policy was subsumed under a tariff generally termed the foreign attachment tariff-the term "foreign" in reference to equipment not owned or leased by the telephone company. Users who persisted in attaching equipment not supplied by the carrier incurred the risk of service disconnection. Carrier non-interconnection policy extended to private communication systems as well as user equipment. In the former case, the denial of interconnection rested on the carriers apprehension that customers would tap profitable markets. Accordingly, customer interconnection would lead to profit cream skimming, loss of revenues to carriers, and ultimately require the telephone company to increase rates to the general public or postpone rate reductions. Interconnection was also said to pose problems of maintenance for the utility. With equipment owned partly by subscribers and owned partly by utility, who would assume the burden of service and who would coordinate the introduction of new equipment and new facilities? Whether these problems were real or imaginary, public policy sanctioned carrier ownership and control of related terminal equipment under the presumption that divided ownership would compromise the systemic integrity of a complex and highly sophisticated telephone network. Telephone subscribers had little choice but to adjust to the foreign attachment prohibition. This meant that of necessity equipment choices were restricted to hardware supplied and leased by the carrier. Competitive substitutes were by definition minimal or nonexistent. In fact the carrier's policy of scrapping old equipment removed a potential used market from possible competition with existing telephone equipment. Integration In addition to certain practices embodied as filed tariffs, the carriers owned manufacturing subsidiaries. The integration or common ownership of utility and supplier thus marked a second characteristic of the industry. Obviously Western Electric, the Bell System's affiliate dominated the industry, and over the years, accounted for some 84 to 90 percent of the market. General Telephone's supply affiliates, acquired since 1950, accounted for some 8 percent of the market. Together the two firms approached what economists term a duopoly market; that is, two firms supplying in excess of 90 percent of the market for telecommunication equipmen.t.3 Despite the persistence of integration, the efficacy of vertical integration experienced periodic review. In 1949, for example, the Justice Department filed an antitrust suit to divest Western Electric from the Bell System. 4 The suit premised on a restoration of competition to the hardware market, asserted that the equipment market could grow and evolve under conditions of market entry and market access. In 1956, a consent decree permitted AT&T to retain Western Electric as its wholly owned affiliate on the apparent assumption that divestiture served as an inappropriate means to achieve the goal of competition. 5 Instead, market access was to be achieved by making available Bell's patents on a royalty free basis. Still later the Justice Department embarked on another antitrust suit. This time the antitrust division challenged General Telephone's acquisition of a West Coast independent telephone company on grounds that the merger would tend to substantially lessen competition in the equipment market. 6 The General suit felt the weight of the Bell Consent Decree. So heavy in fact was the Bell precedent that the Department cited the 1956 Decree as grounds for dropping its opposition to General Telephone's merger. 7 Procurement A third practice inevitably followed the carrier's vertical relationship; namely the tendency to take the bulk of their equipment from their own manufacturing affiliates. Perhaps such buying practices were inevitable. Telecommunication Equipment Market Certainly, in the carriers' judgment, the price and quality of equipment manufactured in-house was clearly superior to hardware supplied by independent or nonintegrated suppliers. Indeed, the courts formalized the procedure of determining price reasonableness by insisting that the carrier rather than the regulatory agency assume the burden of proof in the matter.8 The result saw the arbitor of efficiency pass from the market to the dominant firm. Over the years the integrated supplier firm has accorded itself rave reviews. Consultant's under carrier contract repeated those reviews. But under the existing rules of the game, one would hav~ hardly expected the firm to act differently. However long standing, the triad of tariffs, integration and procurement has held obvious implications for independent suppliers of equipment and apparatus. First, non-integrated suppliers found it difficult to sell equipment to telephone subscribers given the enforcement of the foreign attachment tariff. Second, the non-integrated supplier was not particularly successful in selling directly to integrated operating companies. No policy insisted that arms-length buying be inserted between the utility and its hardware affiliate, and indeed the integrated affiliate assumed the role as purchasing agent for its member telephone carriers. Little. surprise then that the percentage of the market penetrated by independent firms has tended to remain almost constant for some forty years. Having said this it must be noted that the carriers insisted that the quality, price and delivery time of equipment from their in-house suppliers was clearly superior to alternative sources of supply. It was as if a competitive market was, by definition less, efficient in allocating goods and services. The private judgment of management was never put to an objective test. Indeed, the carriers resisted formal buying procedures as unduly cumbersome and unwieldy.9 That resistance has tended to carry the day. Third, vertical integration skirted the problem of potential abuse inherent in investment rate-base padding. Utilities, for example, operate on a cost plus basis, i.e., they are entitled to earn a reasonable return on their depreciated capital-a derivation of profits that stands as the antithesis of the role of profits under competition. Vertical integration compounded utility profit making by providing an incentive to transplant cost plus pricing to the equipment market. Certainly, the penalty for engaging in utility pricing was difficult to identify much less discipline. The affiliate occupied the best of all possible worlds. In all fairness one must note the institutional con_.straint erected to prevent corporate inefficiency on the equipment side. The argument ran that the regulatory 271 agency monitored the investment decision of the carrier. By scrutinizing the pricing practices of the integrated affiliate indirect regulation prevented exorbitant costs on the supply side from entering the utility rate base and passed forward to the subscriber. Indirect regulation allegedly protected both the public and the utility. As an abstract matter, indirect regulation held an element of appeal. Translating that theory into practice was obviously another matter. On the federal level, the FCC has never found an occasion to disallow prices billed by Western Electric to AT&T.lO Yet, 65 percent of AT&T's rate base consists of equipment purchased in-house and absent armslength bargaining. l1 Finally, vertical integration placed the independent equipment supplier in an awkward if not untenable position. As noted, the independent firm could secure equipment subcontracts from its integrated counter-· part. The dollar value of those subcontracts was not unimportant. The problem was that the non-integrated supplier was still dependent upon its potential competitor-a competitor who exercised the discretion to make or buy. Little wonder then, that the viability of independent equipment suppliers was controlled and circumscribed by the tariffs, structure, and procurement practices of telephone utilities. Patently, no market existed for the outside firms; and without a market, the base for technical research and development, much less the incentive for product innovation, was effectively constrained, not to say anesthetized. Access to the telecommunication equipment was, in short, limited to suppliers holding a captive market. The government asked the monopoly firm to judge itself; and after dispassion3lte inquiry the firm equated the public interest with preservation of its monopoly position. MARKET CHANGES AND PUBLIC POLICY ]JI[ arket changes All this has been common knowledge for decades. Whatever the pressures for reassessment and change, those pressures were sporadic, ad hoc and often in-· consistent. Today, however, the telecommunications industry is experiencing a set of pressures that marks a point of departure from the triad of policies described above. In a word, the pace of technology is challenging the status quo. More specifically, new customized services tend to be differentiated from services rendered by the common carriers. Time sharing and remote computer based services, for example, provide and impetus for a host of specialized information ser- 272 Fall Joint Computer Conference, 1970 vices. The rise of facsimile and hybrid EDP/ communication services is equally significant as a trend toward specialization and sub-market development. Subscribers themselves, driven by the imperatives of digital technology, seek flexibility and diversity in their communication facilities. Segmentation and specialization is gaining momentum. At the same time, the telecommunication hardware market is experiencing a proliferation of new products that pose as substitutes to carrier provided equipment. In the terminal market, for example, modems, multiplexors, concentrators, teleprinters, CRT display units are almost a daily occurrence~ Indeed, the computer itself now functions as a remote outstation terminal; and many go so far as to assert that the TV set holds the potential as the ultimate terminal in the home and the school. The carriers, of course, have not stood idly by. In the terminal market, touch-tone hand sets, picturephones and new teletypewriters signal an intent to participate in remote access input output devices as well. But the point remains-carrier hardware no longer stands alone. The proliferation of competitive substitutes and the potential rise of competitive firms is now a process rather than an event. ' The same technological phenomena is occurring in the transmission and switching area as well. Cable TV provides a broad link directly to the home, the school, or the business firm. Satellite transmission poses as a substitute for toll trunking facilities and the FCC has recently licensed MCI as a new customized microwave carrier. Furthermore, carrier switching technology is challenged by hardware manufactured and supplied by firms in the computer industry. All of this suggests that carrier affiliates no longer possess the sole expertise in the fundamental components that make up telecommunication network and services. It is perhaps inevitable then, that the growing tension between the existing and the possible has surfaced as questions of public policy. These questions turn once again on matters of tariff, procurement and integration. Public policy decisions Tariffs Undoubtedly, the FCC's 1968 Carterphone Decision marks one significant change in the telecommunication equipment industry.12 Here the Commission held that users could attach equipment to the toll telephone network. Indeed, the Commission insisted that carriers establish measures to insure that customer-owned equipment not interfere with the quality and integrity of the switched network. Subsequently, the Commission entrusted the National Academy of Science to evaluate the status of the network control signalling device. Although somewhat cautious, the Academy has suggested that certification of equipment may pose as one feasible alternative to noncarrier equipment. 13 The implications of Carterphone bear repetition. For one thing the decision broadens the users option in terms of equipment selection. The business subscriber no longer must lease from the telephone company, but may buy hardware from other manufacturers as well. For another, an important constraint has been softened with respect to suppliers of terminals, data modems and multiplexors as well as PBX or private branch exchanges. Indeed, some claim that the decision has established a two billion dollar market potential for private switching systems.H Ironically, the carriers themselves, to meet the demand for PBX's, have turned to independent or nonaffiliated firms supplying such equipment. Nevertheless, the Carterphone in softening previous restraints continues to pose an interesting set of questions. For example, what precisely is the reach of Carterphone? Will the decision be extended to the residential equipment market? Is the telephone residential market off limits to the independent suppliers of telecommunications equipment? These questions are crucial if for no other reason than terminal display units are already on stream and the carriers themselves are now introducing display phones on a commercial basis. Indeed, the chasm between Carterphone's reality and promise will bulk large if public policy decides that the residential subscriber cannot be entrusted with a freedom of choice comparable to the business subscriber. Vertical integration The vertical structure of the carriers has also been subject to reexamination. A relatively unknown antitrust suit involving ITT and General Telephone system is a case inpoint. 15 The suit erupted when General Telephone purchased controlling interest in the Hawaiian Telephone Company-a company that was formerly. a customer of ITT and other suppliers. ITT has now filed an antimerger suit on grounds that General forcloses ITT's equipment market. The ITT suit represents a frontal assault on General's equipment subsidiaries, for ITT is seeking a ban on all of General Tel's vertical acquisitions since 19.50. In a Telecommunication Equipment Market word, the suit seeks to remove General Telephone from the equipment manufacturing business. While the suit is pending, it is obviously difficult to reach any hard conclusions, but one can speculate that anything less than total victory for General Telephone will send reverberations throughout the telephone industry and the equipment market. Certainly, if General Telephone is required to give up its manufacturing affiliate, then the premise of the Western Electric-AT &T consent decree will take on renewed interest. Another development in the equipment market focuses on Phase II of the FCC's rate investigation of AT&T .16 This phase is devoted to examing the Western Electric-AT&T relationship. Presumably, the Commission will examine Bell's procurement policies as well as the validity of the utility-supplier relationship. What conclusions the Commission will reach are speculative at this time. In terms of market entry for the computer industry, the implications of Phase II are both real and immediate. Still another facet of integration is the relationship of communication to EDP and carrier CATV affiliates. In the former, the FCC has ruled that with the exception of AT&T, carriers may offer EDP on a commercial basis via a separate but wholly owned corporation. 17 Nothing apparently prohibits a carrier from offering an EDP service to another carrier-note the current experiments in remote meter reading. By contrast, the FCC has precluded carriers from offering CATV service in areas where they currently render telephone service. 18 Both moves, to repeat, hold important market implications for manufacturers of telecommunication equipment. ProcureInen t Finally, equipment procurement has surfaced once again as a policy issue. Consider Carterphone, domestic satellites and specialized common carriers as symptomatic of the procurement theme. The premise supporting Carterphone is that the user is entitled to free choice in his equipment selection. Once that principle has been established, and that may well be debatable, someone is bound to pose an additional question. Should suppliers be permitted to sell to the Carriers directly rather than through carrierowned supply affiliates? Perhaps that precedent has already been made. Bell System companies may buy computer hardware directly from computer suppliers, thus permitting the computer industry to bypass Western Electric's traditional procurement assignment. 19 The point may well be asked, does this policy merit generalizing across the board? 273 Access to the equipment market in the domestic satellite field poses as a second issue. A White House memorandum has advised the FCC that the problems of spatial parking slots and frequency bands do not bar the number of competitive satellite systenis. 20 And in return, the FCC has reopened its domestic satellite docket for reevaluation. If the Commission adopts only segments of the White House memo, domestic' satellites will presumably raise the issue of competitive bidding in one segment of the hardware market. As it stands now, all satellite equipment, whether secured be Com Sat or the international carriers, must be secured through competitive bids. 21 These rules apply not only to the prime contractor, but all sub-contracting tiers at a minimum threshold of $24,000. The equestion persists, if domestic satellites evolve within the continental U.S., will competitive bidding procedures attend such an introduction whether in satellite bird or in earth terminal stations. These issues will likely gain momentum as the carriers move into the production of ground station equipment. Finally, an FCC proposed rule made in the area of specialized carriers, bears directly on the equipment market. 22 The docket is traced to the FCC's MCI decision which authorized a specialized carrier to offer service between Chicago and St. Louis. 23 Since the MCI decision, the FCC has received over 1700 applications for microwave sites. In its recent docket, the Commission has solicited views in proposed rulemaking that would permit free access of any and all microwave applicants. As the Commission noted, "Competition in the specialized communications field would enlarge the equipment market for manufacturers other than Western Electric. . . .' '24 If this policy becomes implemented and the FCC can prevent the carriers from engaging in undue price discrimination, it is clear that the one constraint to the growth of specialized common carriers will be the output capacity of firms who manufacture such equipment. CONCLUSION In sum the premises supporting the tariffs, structure and practices of the carriers have been exposed to erosion and subject to revision. That change has in turn spilled into the policy arena. Firms in the telecommunication equipment industry-and this incJudes the computer industry-will find it increasingly difficult to avoid the policy issues ofa market whose potential exceeds $5 billion. One might argue that questions dealing with market entry are in one sense peripheral issues. That is, public policy should direct its attention to existing 274 Fall Joint Computer Conference, 1970 structures as well as potential entry. In this context competitive buying practices may well pose as a workable solution to the vertical integration problem. But that solution is obvjously of short term duration. The pace of technology is suggesting that something more fundamental must give. Over the next decade, the nation's supply of telecommunication equipment must expand by an order of magnitude and that goal stands in obvious conflict with monopoly control of telecommunication equipment suppliers. REFERENCES 1 W H MELODY I nterservice subsidy: Regulatory standards and applied economics Paper presented at a conference sponsored by Institute of Public Utilities Michigan State University 1969 2 New York Times July 291970 3 Final Report President's Task Force on Communications Policy December 7 1968 4 United States v Western Electric Co Civil No 17-49 DNJ filed Feb 14 1949 5 Consent Decree US v Western Electric Co Civil No 17-49 DNJ January 23 1956 6 United States v General Telephone and Electronics Corp Civil No 64-1912 SD NY file June 19 1964 7 In the US District Court District of Hawaii International Telephone and Telegraph Corporation v General Telephone and Electronics Corporation and Hawaiian Telephone Company Civil No 2754 G T&E's motion under Rule 19 Points and Authorities in Support Thereof April 21 1970 8 Smith v Illinois Bell Telephone 282 US 1930 9 Telephone investigation Special Investigation Docket No.1, Brief of Bell System Companies on Commissioner Walker's Proposed Report on the Telephone Investigation 1938 10 The domestic telecommunications carrier industry Part I Presidents Task Force on Communications Policy Clearinghouse pp 184-417 US Department of Commerce June 1969 11 Moody's public utility manual Moody's Investors Service Inc New York August 1969 12 Before the FCC in the Matter of Use of the Carterphone Device in Message Toll Telephone Service FCC No 16942 In the Matter of Thomas F Carter and Carter Electronics Corporation Dallas Texas Complainants v American Telegraph and Telephone Company Associated Bell System Companies Southwestern Bell Telephone Company and General Telephone of the Southwest FCC Docket No 17073 Decision June 26 1968 13 Report of a technical analysis of common carrier/user interconnections National Academy of Sciences Computer Science and Engineering Board June 10 1970 14 New York Times July 121970 15 In the US District Court for the District of Hawaii International Telephone and Telegraph Corp v General Telephone and Electronics Corp and Hawaiian Telephone Company Complaint for Injunctive Relief Civil Action No 2754 October 18 1967 Also Amended Complaint for Injunctive Relief December 14 1967 16 Before the FCC In the Matter of American Telephone and Telegraph Company and the Associated Bell System companies charges for Interstate and Foreign Communication Service 1966 Stanford Research Institute Policy Issues Presented by the Interdependence of Computer and Communications Services Docket No 19979 Contract RD-10056 SRI Project 7379B Clearinghouse for Federal Scientific and Technical Information US Department of Commerce February 1969 17 Before the FCC in the Matter of Regulatory and Policy Problems Presented by the Interdependence of Computer and Communication Service and Facilities Docket No 16979 Tentative Decision 1970 18 Before the Federal Communications Commission In the Matter of Applications of Telephone Companies for Section 214 Certificates for Channel Facilities Furnished to Affiliated Community Antenna Television Systems Docket No 18509 Final Report and Order January 28 1970 19 A systems approach to technological and economic imperatives of the telephone network Staff Paper 5 Part 2 June 1969 PB184-418 President's Task Force on Communications Policy 20 Memorandum White House to Honorable Sean Burch Chairman Federal Communications Commission January 231970 21 Before the FCC In the Matter of Amendment of Part 25 of the Commission's Rules and Regulations with Respect to the Procurement of Apparatus Equipment and Services Required for the Establishment and Operation of the Communication Satellite System, and the Satellite Terminal Stations Docket No 15123 Report and Order April 3 1964 22 Before the FCC In the Matter of Establishment of Policies and Procedures for Consideration of Applications to Provide Specialized Common Carrier Services in the Domestic Public Point-to-Point Microwave Radio Service and Proposed Amendments to Parts 2143 and 61 of the Commission's Rules Notice of Inquiry to Formulate Policy Notice of Proposed Rulemaking 1 and Order July 1970 (Cited as FCC Inquiry on Competitive Access) 23 Federal Communications Commission In re Application of Microwave Communications Inc for Construction Permits to Establish New Facilities in the Domestic Public Point to Point Microwave Radio Service at Chicago Illinois St Louis Missouri and Intermediate Points Docket No 16509 Decision August 14 1969 24 FCC Inquiry on Competitive Access op cit July 1970 p 22 Digital frequency modulation as a technique for improving telemetry sampling bandwidth utilization by G. E. HEYLIGER Martin Marietta Corporation Denver, Colorado INTRODUCTION strictly band-limited signal are required for complete recovery of that signal. Nevertheless, 5 to 10 samples per cycle are widely employed. There are reasons, practical and otherwise, for the resulting bandwidth extravagance: A hybrid of Time Division Multiplexing (TDM) and Frequency Division Multiplexing (FDM), both wellestablished in theory and practice is described herein. While related to TDM and FDM, the particular combinations of techniques and implementations are novel and, indeed, provide a third alternative for signal multiplexing applications. The essence of the idea is to perform all band translation and filtering via numerical or digital techniques. Signal multiplexing techniques are widely employed as a means of approaching the established theoretical limitations on communication channel capacity. In general, multiplexing techniques allow several signals to be combined in a way which takes better advantage of the channel bandwidth. FDM systems accomplish this by shifting the input signal basebands by means of modulation techniques, and summing the results. Judicious choice of modulation frequencies allows nonoverlapping shifted signal bands, and permits full use of the channel bandwidth. Refinements such as "guard bands" between adjacent signal bands and the use of single sidebands can further affect the system design, but, in general, the arithmetic sum of the individual signal bandwidths must be somewhat less than haH the composite channel bandwidth. TD1VI systems achieve full utilization of channel bandwidth in quite a different way. Several signals are periodically sampled, and these samples are interleaved so that the individual signal must be sampled at least twice per cycle for the highest signal frequency present in accordance with Nyquist's sampling theorem. In this case, also, the number of signals that can be combined depends upon the sum of individual signal bandwidths and the bandwidth of the channel itself. The sampling theorem states that only two samples per cycle of the highest frequency component of a 1. Many times it is difficult, if not impossible, to place a specific upper limit on "significant'. frequency components. Safe estimates are made. 2. Interpretation of real-time or quick-look plots is simpler and more satisfying if more samples per cycle are available. 3. Aliasing or folding of noise is more severe for relatively low sampling rates and inadequate prefiltering. This paper acknowledges the practice of oversampling but avoids the difficulties previously described. Full use is made of the sampling bandwidth by packing several signals into that bandwitdh utilizing a form of FDM. The novelty lies in the use of FDM and the way modulation is achieved for periodically sampled signals. SYSTEM DESCRIPTION Before describing the system, it is useful to briefly consider some theoretical background. The following discussion should clarify the basic ideas. Consider a source signal with the spectrum shown in Figure l(a). It is well known that sampling signals at a frequency is = l/T where T is the time between samples, results in a periodic replication of the original spectrum as shown in Figure l(b). Modulation of the original signal by frequency /0 produces the usual sum and difference frequencies, and sampling then results in the replicated pattern shown in Figure 1 (c). 275 276 Fall Joint Computer Conference, 1970 I I I . -3 21. -I • -I 21. rh S ; " e r e d by Low-Pass 0 I I I I 1.21. I. 321. ~ C'\OC'\ ~ do~ A C'\OC'\ (a) Original :. ! I, r71 -I. -T2~-~ Fllt.rin~ ·I s I 1 2 I. I 0 1 2 I. Is I ., 3 2 I. I I r71 1.21. I. (a) Combined Spectra of Three Oversamp1ed and Modulated Sources 1 1 (b) Sampled 1 , C"'\ f:,. C'\ c;J ~ f:,. ~ Q C'\ I) -t, 3l1... -12f~ I ~ 0 's 12f" 1"'""Jr; • 32f. (b) Demodulated by 1/2 f (c) Sampled and Modulated by fO I 3!f, Figure 1-Spectral effects of sampling and modulating ; t I I s rro G""\ rro C" fA) I~rro C\ rro C"' fA) ~ I -, .. -12/. I 0 12/ I 321 I S S • (c) Demodulated by 1/4 fs Now consider three source signals with the spectra shown in Figure 2(a), all with roughly the same bandwidth. Modulating the second and third signals with the frequencies f8/2 and fs/4, respectively, results in the shifted spectra shown in Figure 2(b). Summing yields the composite spectrum shown in Figure 2(c). This composite signal now makes full use of the sampling bandwidth. Figure 3 shows the inverse process of obtaining the original spectra. Demodulating by the same frequencies used for modulation successively brings each signal band to the origin where low pass filtering eliminates all but the original signal. Since few signals are strictly band-limited, it is evident that crosstalk noise will appear in the received signal. This noise can be controlled by the degree of pre- and postfiltering. For certain relatively inactive signals, the crosstalk may be no penalty at all. In general, however, crosstalk presents the same problems here as with any FDM system. The important point to be made is that tracking of the modi demod oscillators is not relevant since these operations are obtained directly by operating on successive samples, i.e., there are no local oscillators per se. In general, modulation is accomplished by multiplying the signal source by a single sinusoidal frequency or carrier. Sampled signals are modulated in t. Figure 3-Prefiltered separation of combined signals the same way, but the modulating frequency multiplier is required only at successive sample times. Modulation (i.e., multiplication) by integer fractions of the sampling frequency is particularly simple if appropriate sample times are chosen. For example, certain modulation frequency amplitudes are quite easily obtained as shown in Table I. The phase shift of 1'(/4 for 1/4 fs was chosen to avoid multiplication by zero yet retain natural symmetry. All the modulation factors may be easily obtained by modifying the sign of the signal magnitude and/or multiplying by a factor of 1/2. Furthermore, the majority of interesting cases are handled by these modulation frequencies, packing two, three, or four FDM channels within the sampling bandwidth. This degree of packing nicely accommodates practical oversampling systems encountered in practice. For particular applications, it may be useful to employ arbitrary modulation frequencies and the corresponding sequence of numerical multipliers (nonrepeating or repeating). A hybrid form of implementation is shown in Figures 4 and 5. Figure 4 is the modulator, and Figure 5 is the demodulator. Not explicitly shown, but implied, Table I Modulation Frequency t (a) Three Oversampled Source Signals 1 4 C"'\ fs f s Modulation Factor = 0,1,2,... General Expr.ession. k (t fs 21fT) = fs ) cos (k;;- 21fT = cos cos k1f l, -1, ... 1f cos k2 0, I, 0, -1, ... 1 (No t e Con stan l Amplitude) (b) Original Signals Modulated by O. 1/2 fs' and 1/4 fs' Respectively I' C"'\ 0 -I 2 I. 0 1 f 1 f b C'""\ t. 1 2 I. (c) Combined Spectra (Reduced to Sampling Bandwidth) Figure 2-Combinations of oversampled signals Periodic Sequence 3 s 5 cos f21fT ) (k 6' = cos k3 cos [s ) (k 321fT = 5 - 1, t, -t, 1, -t, -to ... 'If 2 cos k3 1f TABLE I-Modulation Factor -1, -to t.··· Digital Frequency Modulation 5, -- - - - - - - -- - - - -- - - -- - -- - - - - - -- - - - -, I I t 1 : I I To Conventional Time Division Multiplexing System Notes: 1) 1/2 ~f' . ,-1--1-....1...-, IS L ________________________________ ~ fa' is a periodic pulse stream. delayed with respect to f s. the sampling pulse sequence. (See text..) Figure 4-Sampled FD M modulator is the use of the combined signal output as a single sampled source for conventional TDM systems. The system diagram assumes the case of four signals of roughly equal bandwidth to be combined into a single signal. Subfunctions such as sampling, counting, digital decoding trees, and operational amplifiers can be implemented in a variety of ways utilizing conventional, commercially available functional blocks or components. Details of the subfunction implementations themselves are incidental to the concept but important to the particular application. Referring to Figure 4, the multiplexer modulator works as follows: Four independent signals (Sl, S2, S3, and S4) are accepted as inputs. One, shown as S1, goes directly to the summing amplifier, A. Each of the other signals is switched periodically under control of the approprlate binary counter which is synchronized and driven by the sampling frequency pulses. As shown, S2 is alternately switched from the first to the second of a two-stage cascade of operational amplifiers. The effect of this chain is to alternately multipiy S2 by the factors plus one and minus one, i.e., the modulation factor cos k7l"; = 0, 1, 2, ... in accordance with Table I and considering the modulation signal valid at the sample times only. Similarly, S3 is multiplied by the periodic sequence (1, -72, -72) again in accordance with the third line of Table I. The effect, considered at sample times only, is to modulate Sa by 1/3 fs. Fip.ally, S4 is modulated by 1/6 F s , by periodically switching this signal to one of six inputs of the operational amplifier chain with the gains (1, 72, -72, -1, -72, 72) in accordance with line four of Table I. All four outputs are summed by the operational amplifier A, and the summed signal sampled at the 277 output of A at the sampling frequency, fs. It should be noted that the switching counters can be changed at any time after a sample is taken from the output of A; therefore, the design of the system provides that the pulse driving the counters is delayed slightly more than the aperture time of the sampled output. This mechanization provides ample time for switching operations prior to the subsequent sampling. The sampled output signal, St*, can be used as an input to a conventional TDM system. The demodulator shown in Figure 5 is very similar to the modulator. In fact, within the dotted lines it is identical. Here, the appropriate output from a conventional TDM system, St*, is used as input to all four counter-controlled switches. A sample and hold operation is employed at the input in order to drastically reduce the time response requirements of the operational amplifiers. Again, sequentially switching the input effectively demodulates St by the frequencies 1/6 fs, 1/3 fs, and 1/2 fs. Since this modulation is effective only at the sampling instants, a sample and hold circuit is required at each output. The low-pass filter eliminates components of all but the demodulated signal. Note that for the demodulator, the signals f't should precede fs in phase by the aperture (or pulse width) of St*, to allow a maximum time for change in St* to be accommodated by the amplifier and switching chain. Since fs is derived from St* and is a periodic signal, any desired relative phasing is readily achieved. SYSTEM ADVANTAGES AND CAPABILITIES Several useful and interesting features are inherent in the system: 1. Numerical Modulation of Sampled SignalsBecause the modulation signal is required only .---------------------.- --- --------1 'T" " " '3 " 1/2 •"{"ehron'"} T c.in.tor L __ - - - - - - - - - - - - - - - - -------- ---- -- f. Figure 5-Sampled FDM demodulator 278 Fall Joint Computer Conference, 1970 2. 3. 4. 5. at the sampling instants, a periodic sequence of numerical multipliers substitutes for the local oscillator of conventional frequency modulation systems. Conventional oscillator accuracy and stability problems do not arise, and very low frequency modulation is readily achieved. Coincident Sampling of Several Signals-Conventional TDM systems may combine signals sampled at the same rate, but at different instants of time. This approach provides for combining signals sampled at the same rate and same times. Full use of conventional TDM techniques can be employed on the combined signal. Full Utilization of Sampling BandwidthThe sampling rate chosen defines the unaliased bandwidth in a sampled data system. Here, a way of combining several independent signals is employed so that the total sampling bandwidth can be utilized for transmission of information. Signal Independent Choice of Sampling RateAs a corollary to 3, this system permits, even promotes, oversampling of individual signals. Oversampling is attractive and widely practiced as previously noted. The system described here avoids the usual oversampling penalties by packing several independent signals within the sampling bandwidth. Noise Aliasing Avoidance-Some source signals must be heavily prefiltered or oversampled in order to avoid the noise signal folding effects of sampled data systems. Again, oversampling can be employed without the usual penalties. It should be noted that wideband noise will of course result in crosstalk among the combined channels. In summary, the system described gives a new dimension in the design of signal multiplexed systems. Combination of these techniques with the conventional TDM and FDM techniques allows the designer to tailor a sampled data system to the peculiarities of a specific set of source signals, while making full use of the available sampled bandwidth. following system functions can be identified: 1. Sampling, timing, and switching; 2. Analog/digital (A/D) conversion; 3. Sample modulation/demodulation. The modulator/transmitter also requires an adder for combining the signals, while the demodulator/ receiver requires a suitable lowpass filter for each output. Conversion to a digital representation of the signals can be performed at most any point in the system. Following conversion, the subsequent functions are performed via conventional digital arithmetic and logic operations. Exclusively digital implementation As an extreme example, consider an implementation that provides A/D conversion at the source (modulator/transmitter input). Modulation is accomplished by arithmetic multiplication of the source sequence values by the desired modulation sequence, cos kO o where 0 = sT. Note that in this case, the modulation sequence need not be a periodic sequence if a means is provided for generating the values cos k 00 for all integers, k. Independent signals are combined after modulation simply by arithmetic addition of corresponding modulated sequence values. The summed sequence is the output. The combined PCM samples are then handled as with a conventional TDM system. At the demodulator/receiver, the input is the digital sampled sequence as derived from a conventional PCMsystem. Demodulation is performed as before; arithmetic multiplication of the input sequence by the appropriate sequence of values, cos k 00 • Each resulting output must be filtered to eliminate the other signal components. Filtering can be ac·· complished numerically using either recursive or nonrecursive techniques. The outputs then are available as separate signals corresponding to those first transmitted. The digital output sequence may be used directly for listing, further processing, or as an input to an incremental plotter. Alternatively, D/ A, conversion and hold operations convert the signal to its analog equivalent. ALTERNATIVE IMPLEMENTATIONS Mixed analog/digital implementations The hybrid system described herein uses pulse amplitude modulation (PAM). However, pulse code modulation (PCM) can be employed as well, in one of several attractive alternative implementations. The Evidently, a number of obvious combinations of PAM and PCM are possible. Thus, operational amplifier (op-amp) modulation can be used in combination Digital Frequency Modulation with a time-shared AID converter and arithmetic summation with the result handled as a conventional PCM signal. Similarly, at the receiver, DI A conversion may take place at the output of the PCM arithmetic modulator, and the result passed through a conventionallow-pass analog filter for signal recovery. A nalog system simplifications Figure 4 presents the system in a way that aids description and understanding. Good design practice would permit combination of the modulation and summing functions in a single op-amp stage. Similarly, various combinations of cascades 2- and 3-way switches might be advantageous instead of the single stage 6-way switch shown in Figure 4. Modulation sequence considerations The op-amp modulator implementation requires that the modulation sequence, cos k eo, be a repeating or periodic sequence. From a practical point of view, only a small number of modulation values should be employed, since each requires additional switching and input to the op-amp. While the only theoretical limitation on the number of values is that eo be some rational fraction of 211"', the simple ratios of the examples shown should prove most useful in practice. Arithmetic implementation of the modulation and demodulation function imposes no constraint on the number of distinct modulation values, cos k eo. Successive values may be generated arithmetically using some equivalent of the following algorithm: sinkeo = cos(k - l)e o sinOo + sin(k - 1)0 0 cose o coske o = cos(k - 1)00 coseo - sin(k - l)e o sine o Only the initial values cos 00 and sin eo are required to start. If eo is some rational fraction of 211"', the sequence will be repeating; otherwise, not. In this case any desired modulation frequency (wo) may be realized. 279 Bandwidth packing variations While roughly equal bandwidths were assumed for the combined signals of the system described, the fundamental constraint is that the sum of, signal bandwidths plus guard bands must be less than the sampling frequency. As usual with FDM systems, both upper and lower sidebands for each signal must be included in this consideration. Choice of a suitable modulation frequency then depends upon the placement of each signal band within the sampling bandwidth. Clearly, many variations of center frequencies and bandwidth are feasible and useful. Variations in digital system A general purpose digital computer can perform all operations required for modulating, summing, demodulating, and filtering. Where such a computer is already employed in the data system for switching, comparison, calibration, and control, the additional functions described here become particularly attractive. Standard programming practices can be used to perform the essential functions described here. Alternatively, for the system example the arithmetjc operations required are quite simple. Multiplications of Y2 and -1 are readily realized by right shift and sign change operations, respectively. A special purpose digital computer with few storage registers and capability for "right shift," "add," "sign change," and conventional register transfers, will provide the required functions. CONCLUSION The digital frequency modulation technique described herein permits combination of several signals into a single signal having a sampled bandwidth equal to the sum of the original signal bandwidths. Utilization of this technique to reduce the penalties of oversampled telemetry channels appears particularly attractive. THE ALOHA SYSTEM-Another alternative for computer communications* by NORMAN ABRAMSON University of Hawaii Honolulu, Hawaii WIRE COMMUNICATIONS AND RADIO COMMUNICATIONS FOR COMPUTERS INTRODUCTION In September 1968 the Uhiversity of Hawaii began work on a research program to investigate the use of radio communications for computer-computer and console-computer links. In this report we describe a remote-access computer system-THE ALOHA SySTEM-under development as part of that research program! and discuss some advantages of radio communications over conventional wire communications for interactive users of a large computer system. Although THE ALOHA SYSTEM research program is composed of a large number of research projects, in this report we shall be concerned primarily with a novel form of random-access radio communications developed for use within THE ALOHA SYSTEM. The University of Hawaii is composed of a main campus in Manoa Valley near Honolulu, a four year college in Hilo, Hawaii and five two year community colleges on the islands of Oahu, Kauai, Maui and Hawaii. In addition, the University operates a number of research institutes with operating units distributed throughout the state within a radius of 200 miles from Honolulu. The computing center on the main campus operates an IBM 360/65 with a 750 K byte core memory and several of the other University units operate smaller machines. A time-sharing system UHTSS/2, written in XPL and developed as a joint project of the University Computer Center and THE ALOHA SYSTEM under the direction of W. W. Peterson is now operating. THE ALOHA SYSTEM plans to link interactive computer users and remote-access input-output devices away from the main campus to the central computer via UHF radio communication channels. At the present time conventional methods of remote access to a large information processing system are limited to wire communications-either leased lines or dial-up telephone connections. In some situations these alternatives provide adequate capabilities for the designer of a computer-communication system. In other situations however the limitations imposed by wire communications restrict the usefulness of remote access computing. 2 The goal of THE ALOHA SYSTEM is to provide another alternative for the system designer and to determine those situations where radio communications are preferable to conventional wire communications. The reasons for widespread use of wire communications in present day computer-communication systems are not hard to see. Where dial-up telephones and leased lines are available they can provide inexpensive and moderately reliable communications using an existing and well developed technology.3,4 For short distances the expense of wire communications for most applications is not great. Nevertheless there are a number of characteristics of wire communications which can serve as drawbacks in the transmission of binary data. The connect time for dial-up lines may be too long for some applications; data rates on such lines are fixed and limited. Leased lines may sometimes be obtained at a variety of data rates, but at a premium cost. For communication links over large distances (say 100 miles) the cost of communication for an interactive user on an alphanumeric console can easily exceed the cost of computation. 5 Finally we note that in many parts of the world a reliable high quality wire communication network is not available and the use of radio communications for data transmission is the only alternative. There are of course some fundamental differences * THE ALOHA SYSTEM is supported by the Office of Aerospace Research (SRMA) under Contract Number F44620-69-C0030, a Project THEMIS award. 281 282 Fall Joint Computer Conference, 1970 between the data transmitted in an interactive timeshared computer system and the voice signals for which the telephone system is designed. 6 First among these differences is the burst nature of the communication from a user console to the computer and back. The typical 110 baud console may be used at an average data rate of from 1 to 10 baud over a dial-up or leased line capable of transmitting at a rate of from 2400 to 9600 baud. Data transmitted in a time-shared computer system comes in a sequence of bursts with extremely long periods of silence between the bursts. If several interactive consoles can be placed in close proximity to each other, multiplexing and data concentration may alleviate this difficulty to some extent. When efficient data concentration is not feasible however the user of an alphanumeric console connected by a leased line may find his major costs arising from communication rather than computation, while the communication system used is operated at less than 1 percent of its capacity. Another fundamental difference between the requirements of data communications for time-shared systems and voice communications is the asymmetric nature of the communications required for the user of interactive alphanumeric consoles. Statistical analyses of existing systems indicate that the average amount of data transmitted from the central system to the user may be as much as an order of magnitude greater than the amount transmitted from the user to the central system. 6 For wire communications it is usua]]y not possible to arrange for different capacity channels in the two directions so that this asymmetry is a further factor in the inefficient use of the wire communication channel. The reliability requirements of data communications constitute another difference between data communication for computers and voice communication. In addition to errors in binary data caused by r~ndom and burst noise, the dial-up channel can produce connection problems-e.g., busy signals, wrong numbers and disconnects. Meaningful statistics on both of these problems are difficult to obtain and vary from location to location, but there is little doubt that in many locations the reliability of wire communications is well below that of the remainder of the computer-communication system. Furthermore, ~ince wire communications are usually obtained from the common carriers this portion of the overall computer-communication system is the only portion not under direct control of the system designer. THE ALOHA SYSTEM The central computer of THE ALOHA SYSTEM (an IBM 360/65) is linked to the radio communication ~ ~ TRANSMIT CENTRAL DATA COMPUTER IBM 360165 ~ ~ DATA MODEM Figure I-THE ALOHA SYSTEM channel via a small interface computer (Figure 1). Much of the design of this multiplexor is based on the design of the Interface Message Processors (IMP's) used in the ARPA computer net.4, 7 The result is a Hawaiian version of the IMP (taking into account the use of radio communications and other differences) which has been dubbed the MENEHUNE (a legendary Hawaiian elf). The HP 2115A computer has been selected for use as the MENEHUNE. It has a 16-bit word size, a cycle time of 2 microseconds and an 8Kword core storage capacity. Although THE ALOHA SYSTEM will also be linked to remote-access inputoutput devices and small satellite computers through the MENEHUNE, in· this paper we shall be concerned with a random access method of multiplexing a large number of low data rate consoles into the MENEHUNE through a single radio communication channel. THE ALOHA SYSTEM has been assigned two 100 KHZ channels at 407.350 MHZ and 413.475 MHZ. One of these channels has been assigned for data from the MENEHUNE to the remote consoles and the other for data from the consoles to the MENEHUNE. Each of these channels will operate at a rate of 24,000 baud. The communication channel from the MENEHUNE to the consoles provides no problems. Since the transmitter can be controlled and buffering performed by the MENEHUNE at the Computer Center, messages from the different consoles can be ordered in a queue according to any given priority scheme and transmitted sequentially. Messages from the remote consoles to the MENEHUNE however are not capable of being multiplexed in such a direct manner. If standard orthogonal multiplexing techniques (such as frequency or time multiplexing) are employed we must divide the channel from the consoles to the MENEHUNE into a large number of low speed channels and assign one to each console, whether it is active or not. Because of the fact that at any given time only a fraction of the total number of consoles in the system will be active and because of the burst nature of the data from the con- THE ALOHA SYSTEM soles such a scheme will lead to the same sort of inefficiencies found in a wire communication system. This problem may be partly alleviated by a system of central control and channel assignment (such as in a telephone switching net) or by a variety of polling techniques. Any of these methods will tend to make the communication equipment at the consoles more complex and will not solve the most important problem of the communication inefficiency caused by the burst nature of the data from an active console. Since we expect to have many remote consoles it is important to minimize the complexity of the communication equipment at each console. In the next section we describe a method of random access communications which allows each console in THE ALOHA SYSTEM to use a common high speed data channel without the necessity of central control or synchronization. Information to and from the MENEHUNE in THE ALOHA SYSTEM is transmitted in the form of "packets," where each packet corresponds to a single message in the system. 8 Packets will have a fixed length of 80 8-bit characters plus 32 identification and control bits and 32 parity bits; thus each packet will consist of 704 bits and will last for 29 milliseconds at a data rate of 24,000 baud. The parity bits in each packet will be used for a cyclic error detecting code. 9 Thus if we assume all error patterns are equally likely the probability that a given error pattern will not be detected by the code is10 2-32 =10- 9• Since error detection is a trivial operation to implement, 10 the use of such a code is consistent with the require- ' unr I rtlTIt- In n n ~ n · ...,. 2 I o fA 0 0 I •• user " sum ·: o Om 1000 orint It ~ ·:• ~ Interference repetitions ~ time --+ Figure 2-ALOHA communication multiplexing 283 ment for simple' communication equipment at the consoles. The possibility of using the same code for error correction at the MENEHUNE will be considered for a later version of THE ALOHA SYSTEM. The random access method employed by THE ALOHA SYSTEM is based on the use of this error detecting code. Each user at a console transmits packets to the MENEHUNE over the same high data rate channel in a completely unsynchronized (from one user to another) manner. If and only if a packet is received without error it is acknowledged by the MENEHUNE. After transmitting a packet the transmitting console waits a given amount of time for an acknowledgment; if none is received the packet is retransmitted. This process is repeated until a successful transmission and acknowledgment occurs or until the process is terminated by the user's console. A transmitted packet can be received incorrectly because of two different types of errors; (1) random noise errors and (2) errors caused by interference with a packet transmitted by another console. The first type of error is not expected to be a serious problem. The second type of error, that caused by interference, will be of importance only when a large number of users are trying to use the channel at the same time. Interference errors will ,limit the number of users and the amount of data which can be transmitted over this random access channel. In Figure 2 we indicate a sequence of packets as transmitted by k active consoles in the ALOHA random access communication system. We define T as the duration of a packet. In THE ALOHA SYSTEM T will be equal to about 34 milliseconds; of this total 29 milliseconds will be needed for transmission of the 704 bits and the remainder for receiver synchronization. Note the overlap of two packets from different consoles in Figure 2. For analysis purposes we make the pessimistic assumption that when an overlap occurs neither packet is received without error and both packets are therefore retransmitted. * Clearly as the number of active consoles increases the number of interferences and hence the number of retransmissions increases until the channel clogs up with repeated packets.l1 In the next section we compute the average number of active consoles which may be supported by the transmission scheme described above. Note how the random access communication scheme of THE ALOHA SYSTEM takes advantage of the nature of the radio communication channels as opposed to wire communications. Using 'the radio channel as we have described each user may access the same * In order that the retransmitted packets not continue to interfere with each other we must make sure the retransmission delays in the two consoles are different. 284 Fall Joint Computer Conference, 1970 channel even though the users are geographically dispersed. The random access communication method used in THE ALOHA SYSTEM may thus be thought of as a form of data concentration for use with geographically scattered users. RANDOM ACCESS RADIO COMMUNICATIONS We may define a random point process for each of the k active users by focusing our attention on the starting times of the packets sent by each user. We shall find it useful to make a distinction between those packets transmitting a given message from a console for the first time and those packets transmitted as repetitions of a message. We shall refer to packets of the first type as message packets and to the second type as repetitions. Let X be the average rate of occurrence of message packets from a single active user and assume this rate is identical from user to user. Then the random point process consisting of the starting times of message packets from all the active users has an average rate of occurrence of r=kX where r is the average number of message packets per unit time from the k active users. Let T be the duration of each packet. Then if we were able to pack the messages into the available channel space perfectly with absolutely no space between messages we would have rT=1. Accordingly we refer to rT as the channel utilization. Note that the channel utilization is proportional to k, the number of active users. Our objective in this section is to determine the maximum value of the channel utilization, and thus the maximum value of k, which this random access data communication channel can support. Define R as the average number of message packets plus retransmissions per unit time from the k active users. Then if there are any retransmissions we must have R>r. We define RT as the channel traffic since this quantity represents the average number of message packets plus retransmissions per uni ttime multiplied by the duration of each packet or retransmission. In this section we shall calculate RT as a function of the channel utilization, rT. Now assume the interarrival times of the point process defined by the start times of all the message packets plus retransmissions are independent and exponential. This assumption, of course, is only an approximation to the true arrival time distribution. Indeed, because of the retransmissions, it is strictly speaking not even mathematically consistent. If the retransmission delay is large compared to T, however, and the number of retransmissions is not too large this assumption will be reasonably close to the true distribution. Moreover, computer simulations of this channel indicate that the final results are not sensitive to this distribution. Under the exponential assumption the probability that there will be no events (starts of message packets or retransmissions) in a time interval T is exp( -RT). Using this assumption we can calculate the probability that a given message packet or retransmission will need to be retransmitted because of interference with another message packet or retransmission. The first packet will overlap with another packet if there exists at least one other start point T or less seconds before or T or less seconds after the start of the given packet. Hence the probability that a given message packet or retransmission will be repeated is [1- exp( -2RT)]. (1) Finally we use (1) to relate R, the average number of message packets plus retransmissions per unit time to r, the average number of message packets per unit time. Using (1) the average number of retransmissions per unit time is given by R[1- exp(-2RT)] so that we have R=r+R[1- exp(-2RT)] or (2) Equation (2) is the relationship we seek between the channel utilization rT and the channel traffic RT. In Figure 3 we plot RT versus rT. chama' trafflc RT .50 --------------- ----------- -------------- .40 .30 .20 .10 .10 ~'5 .186 channal utilization r T Figure 3-Channel utilization vs channel traffic THE ALOHA SYSTEM Note from Figure 3 that the channel utilization reaches a maximum value of 1/2e=0.186. For this value of rr the channel traffic is equal to 0.5. The traffic on the channel becomes unstable at rr = 1/2e and the average number of retransmissions becomes unbounded. Thus we may speak of this value of the channel utilization as the capacity of this random access data channel. Because of the random access feature the channel capacity is reduced to roughly one sixth of its value if we were able to fill the channel with a continuous stream of uninterrupted data. For THE ALOHA SYSTEM we may use this result to calculate the maximum number of interactive users the system can support. Setting . we solve for the maximum number of active users A conservative estimate of A would be 1/60 (seconds)-l, corresponding to each active user sending a message packet at an average rate of one every 60 seconds. With r equal to 34 milliseconds we get kmax = 324. (3) Note that this value includes only the number of active users who can use the communication channel simultaneously. In contrast to usual frequency or time multiplexing methods while a user is not active he consumes no channel capacity so that the total number of users of the system can be considerably greater than indicated by (3). The analysis of the operation ~f THE ALOHA SYSTEM random access scheme provided above has been checked by two separate simulations of the system. 12,13 Agreement with the analysis is excellent for values of the channel utilization less than 0.15. For larger values the system tends to become unstable as one would expect from Figure 3. 285 REFERENCES 1 N ABRAMSON et al 1969 annual report THE AWHA SYSTEM University of Hawaii Honolulu Hawaii January 1970 2 M M GOLD LL SELWYN Real time computer communications and the public interest Proceedings of the Fall Joint Computer Conference pp 1473-1478 AFIPS Press 1968 3 R M FANO The MAC system: The computer utility approach IEEE Spectrum Vol 2 No 1 January 1965 4 L G ROBERTS Multiple computer networks and computer communication ARPA report Washington DC June 1967 5 J G KEMENY T E KURTZ Dartmouth time-sharing Science Vol 162 No 3850 p 223 October 1968 6 P E JACKSON C D STUBBS A study of multiaccess computer communications Proceedings of the Spring Joint Computer Conference pp 491-504 AFIPS Press 1969 7 Initial design for interface message processors for the ARPA computer network Report No 1763 Bolt Beranek and Newman Inc January 1969 8 R BINDER Multiplexing in THE ALOHA SYSTEM: MENEHUNE-KEIKI design considerations ALOHA SYSTEM Technical Report B69-3 University of Hawaii Honolulu Hawaii November 1969 9 W W PETERSON E J WELDON JR Error-correcting codes-Second edition John Wiley & Sons New York New York 1970 10 D T BROWN W W PETERSON Cyclic codes for error detection Proceedings IRE Vol 49 pp 228-235 1961 11 H H J LIAO Random access discrete address multiplexing communications for THE ALOHA SYSTEM ALOHA SYSTEM Technical Note 69-8 University of Hawaii Honolulu Hawaii August 1969 12 W H BORTELS Simulation of interference of packets in THE ALOHA SYSTEM ALOHA SYSTEM Technical Report B70-2 University of Hawaii Honolulu Hawaii March 1970 13 P TRIPATHI Simulation of a random access discrete address communication system ALOHA SYSTEM Technical Note 70-1 University of Hawaii Honolulu Hawaii April 1970 Computer-aided system design* by E. DAYID CROCKETT, DAYID H. COPP, J. W. FRANDEEN, and CLIFFORD A. IS BERG Computer Synectics, Incorporated Santa Clara, California PETER BRYANT and W. E. DICKINSON IBM ASDD Laboratory Los Gatos, California and lVIICHAEL R. PAIGE University of Illinois Urbana, Illinois Gatos, California, which defined the proposed CASD system and looked into the problems of building the various component programs. Details of several prototype programs which were implemented are given elsewhere. 1 There are no present plans to continue work in this area. This paper is essentially a feasibility report, describing the overall system structure and the reasons for choosing it. It includes descriptions of the data forms in the system and of the component programs, discussions of the overall approach, and an example of a device described in the CASD design language. INTRODUCTION This paper describes the Computer-Aided System Design (CASD) system, a proposed collection of computer programs to aid in the design of computers and similar devices. CASD is a unified system for design, encompassing high-level description of digital devices, simulation of the device functions, automatic translation of the description to detailed hardware (or other) specifications, and complete record-keeping support. The entire system may be on-line, and most day-to-day use of the system would be in conversational mode. Typically, the design of digital devices requires a long effort by several groups of people working on different aspects of the problem. The CASD system would make a central collection of all the design information available through terminals to anyone working on the job. With conversational access to a central file, many alternative designs can be quickly evaluated, proven standard design modules can be selected, and the latest version of the design can be automatically documented. The designer works only with high-level descriptions, which reduce the number of trivial errors and ensure the use of standard design techniques. From October, 1968, through December, 1969, the authors participated in a study at the IBM Advanced Systems Development Laboratory in Los THE SYSTEM IN GENERAL The (proposed) Computer-Aided System Design (CASD) system is a collection of programs to aid the computer designer in his daily work, and to coordinate record-keeping and documentation. It offers the designer five major facilities: H igh-level description The designer describes his device in a high-level, functional language resembling PL/I, but tailored to his special needs. This is the only description he enters into the system, and the one to which all subsequent modifications, etc., refer. * This work was performed at the IBM Advanced Systems Development Laboratory, Los Gatos, California. 287 288 Fall Joint Computer Conference, 1970 High-level simulation An interpretive simulator allows the designer to check out his design at a functional level, before it is committed to hardware. The simulation is interactive, allowing the designer to "watch" his design work and evaluate precisely design alternatives. Translation to logic specifications The high-level design, after testing by simulation, is automatically translated to detailed logic specifications. These specifications may take a variety of forms, such as (1) input to conventional Design Automation (DA) systems, or (2) microcode for an existing machine. On-line, conversational updating The designer makes design changes and does his general day-to-day work at a terminal, in a conversational mode. Batch facilities are also available. design automation systems) is a natural by-product of the CASD organization. The CASD system can thus be viewed as an extension to higher levels of current systems for design, in roughly the same way that compilers are functional extensions of assemblers to higher levels. The general organization of the system is pictured in Figure 1. The designer describes his device in a source design language, which is translated by a compiler-like program called the encoder to an internal form. The internal form is the input both to the highlevel simulator (called the interpreter) and to a series of translators (two are shown in Figure 1) which convert it to the appropriate form of logic specifications. Different series of translators give different kinds of final output (e.g., one series for DA input, another series for microcode). The entire system is on-line, operating under control of the CASD monitor, which handles communication to and from the terminals. The user interface programs handle the direct "talking" to the user and invoke the proper functional programs. DATA FORMS IN THE CASDSYSTEM Complete file maintenance and documentation Source design description Extensive record-keeping is provided to keep track of different machines, different designs of machines, different versions of designs, results of simulation runs, and so forth. High-level documentation of designs (analogous to that produced at lower levels by today's The CASD design language the designer uses is a variant of PL/I, stripped of features not needed for computer design and enriched with a few specialized features for such work. PL/P and CASD's languageS are described more fully elsewhere. Procedures The basic building block in a CASD description is the procedure. A procedure consists of: (1) declarations of the entities involved in the procedure, and (2) statements of what is to be done to these entities. A procedure is written as a PROCEDURE statement, followed by the declarations and statements, followed by a matching END statement, in the usual PLII format: PROC1: PROCEDURE; declarations and statements ENDPROCl; KEY ..-.-.... = DATA FLOW +---+ = CONTROL FLOW Figure I-The CASD system defines a procedure whose name is PROC1. A procedure represents some logical module of the design, e.g., an adder. A complete design, in general, would have many such procedures, some nested within Computer-Aided System Design others. The adder procedure, for example, may contain a half-adder as a subprocedure. 289 2. The WAIT statement takes the form WAIT(expression) ; Data iteJns Each procedure operates on certain data items, such as registers or terminals. These items are defined by DECLARE statements, which have the general format: DECLARE name attribute, attribute, ... ; The name is used to refer to the item throughout the description. The attributes describe the item in more detail, and are of two types-logical and physical. Logical attributes describe the function of the item (it is bit storage, or a clock, say); physical attributes describe the form the item is to take in hardware (magnetic core, for example). Logical attributes influence the encoding, interpreting, and translating functions. Physical attributes, on the other hand, are ignored by the interpreter, giving a truly functional simulation. Like any block-structured language, the CASD language has rules about local and global variables, and scope of names. These have been taken directly from the corresponding rules for PL/I. Statements The basic unit for describing what is to be done to the data items is the expression, defined as in PL/I but with some added Boolean operators, such as exclusive or (jIJ), and some modifications to the bit string arithmetic. The basic statement types for describing actions on data items are the assignment, WAIT, CALL, GO TO, IF, DO, and RETURN statements. These are basically as they are in PL/I, except as described below. 1. The assignment statement is extended to allow concatenated items to appear on the left-hand side. Thus: XREG II YREG:=ZREG; where XREG and YREG are 16 bits each and ZREG is 32 bits, means to put the high 16 bits of ZREG into XREG and the low 16 bits into YREG. In combination with the SUBSTR built-in function,4 this assignment statement offers convenient ways to describe shifting and similar operations. The assignment symbol itself is the ALGOL" : = " rather than" = " as in PL/I. It thus differs from PL/I in that it allows one to specify a wait until an arbitrary expression is satisfied. This is useful for synchronizing tasks (see below). 3. The GO TO statement includes the facility of going to a label variable, and the label variable may be subscripted. This is useful for describing such operations as op-code decoding-for example: GO TO ROUTINE (OP). Sequencing The differences in motivation between CASD's language and PL/I are most evident in matters of sequence control and parallelism. PL/I, as a programming language, does not emphasize the use of parallelism. Programs are described and executed sequentially, which is not adequate for a design language. The basic unit of work in CASD is the node. A node is a collection of actions which can be performed at the same time. For example, XREG: = YREG; and P:=Q; can be performed together if all t4e items involved are distinct. On the other hand, XREG: = YREG; ZREG:=XREG; cannot be performed (as written) at the same time, since the result of the first written operation is needed to do the second. The basic CASD rules are: 1. Operations are written as sequential statements. 2. However these operations are performed (sequentially or in parallel), the end results will be the same as the results of performing them sequentially. 3. Sequential statements will be combined into a single node (considered as being done in parallel) whenever this does not violate rule 2. That is, CASt> assumes you mean parallel unless there's some "logical conflict."5 Of course, the designer may want to override rules 2 and 3. Another rule gives him one way to do this: 4. A label1ed statement always begins a new node. Another way is by specifying parallelism explicitly. If the DO statement is written as DO CONCURRENTLY, all statements within the DO will be executed in parallel. Finally, the TASK option of the CALL statement makes it possible to set several tasks operating at once. 290 Fall Joint Computer Conference, 1970 Preprocessor facilities Some of the PL/I preprocessor facilities have been retained. These include the iterative %DO, which is particularly useful in describing repetitive operations, and the preprocessor assignment statement, useful for specifying word lengths, etc. No defaults Unlike PL/I, the CASD language follows the principle that nothing should be hidden from the designer. In particular, it has no default attributes, and everything must be declared. Similarly, it does not allow subscripted subscripts, subscripted parameters passed to subroutines, or anything else that might force the encoder to generate temporary registers not specified by the designer. Such restrictions might be relaxed in a later version, but we feel that until we have more experience with such systems, we had better hide as little as possible. Internal form Before the source description can be conveniently manipulated by other programs, it must be translated to an internal form. This form is designed to be convenient for both the translator programs and the interpreter. Compromises are necessary, of coursea computer program might be the most convenient form for simulatjon, but would be of no use at all to the translator. The CASD internal form resembles the tabular structure used for intermediate results in compilers for programming languages. It consists of four kinds of tables: descriptors, expressions, statements and nodes. The descriptor table records the nature of each item (taken from its DECLARE statement). The entries are organized according to the block structure of the source description and the scope-of-names rules of the language. The expression table contains reverse Polish forms of all expressions in the source description, with names replaced by pointers to descriptors. Each expression appears only onee in the expression table, although it may appear often in the source description. In effect, the expression table lists the combinational logic the translator must generate. The statement table consists of one entry for each statement in the source description, with expressions replaced by pointers to entries in the expression table, and a coded format for the rest of the statement (statement type plus parameters). The node table tells which statements in .the statement table belong in the same node, and the order in which various nodes should be executed. The internal form has thus extracted three things from the source description-data items, actions to be taken on those items, and the timing of the actions-and recorded them in three separate tablesthe descriptor, the statement, and the node tables. The expression table is added for convenience. Simulation results The high-level simulation involves three forms of data: values of the variables, control information, and run statistics. Before a simulation run begins, the variables of the source design description (corresponding to registers, etc.) must be assigned initial values. One way to do this is with the INITIAL attribute in the DECLARE statement) which makes initialization of the variables at execution time a fundamental part of the description. Sometimes, though, the designer may want to test a special case, and simulate his design starting from some special set of jnitial values. CASD permits him to store one or more sets of initial values in his files; and for a given simulation run, to specify the set of initial values to be used. In this way, he can augment or override the INITIAL attribute. At the end of a simulation run, the final values of the variables may be saved and used for print-outs, statistics gathering, or as initial values for the next simulation run. That is, a simulation run may continue where the last one left off. The high-level, interpretive simulation in CASD is perhaps most useful because of its control options. As an interpreter, operating from a static, tabular description of the device, the CASD simulator can give the user unusually complete control over the running of the simulation. Through a terminal, he can at any time tell the system which variables to trace, how many nodes to interpret at a time, when to stop the simulation (e.g., stop if XREG ever gets bigger than 4 and display the results), and so forth. These control conditions may be saved just as the data values may be, and a simulation run may use either old or new control conditions. Permanent records of a simulation also include summaries of run statistics (the number of subprocedure calls, number of waits, etc.). Computer-Aided System Design Translator output Different translators produce different kinds of output. Assembly-language level listings of mircocode might be needed for some lower-level systems, the coded equivalent of ALD sheets for others. Typically, output would include error and warning messages. File structure In an on-line, conversational system, it is particularly important that the working data be easily ac·· cessible to the user and the control language seem natural to him. CASD attempts to facilitate user control in two ways: through the user interface programs, and the structure of the data files. The basic organizational unit in the CASD files is called the design. A design consists of all the data pertinent to the development of some given device. A design may have many versions, representing current alternatives or successive revisions. Each version has some or all of the basic forms of data associated with it: source description, internal form, simulation results, translator output, and so on. Two catalogs, one for designs and one for versions, are the basic access points to CASD data. A typical entry in the design catalog (a design record) contains a list of pointers to the version descriptors for each version of every design in the system. The version descriptor contains pointers to each of the various forms of data for that version (source description, ... ) plus control information telling which set of translators has been applied to the design in this version, and so on. These descriptors give the user interface programs efficient access to needed data. For example, if the user asks to translate a given design, the interface finds the version descriptor, and can then tell if the design has been encoded, and if not, inform the user and request the input parameters for encoding. PROGRAMS IN THE CASD SYSTEM CASD monitor and support programs All the CASD component programs are under control of a monitor program, which provides the basic services for communicating with terminals and allocates system resources. In the prototype version 6 the environment was OS/360 1VIVT, and it was convenient to set up the monitor as a single job, attaching one subtask for each CASD terminal. The CASD files were all in one large data set, and access to them was controlled by service routines in the monitor. The moni- 291 tor also controlled the allocation of CPU time to various CASD terminals within the overall CASD job. This approach makes it easier to manage the various interrelated data forms within the versions, and would probably work in environments other than OS/360 as well. Besides the monitor and the data access routines, the support programs include a text-editing routine to use in editing the source description. User interface programs CASD system control is not specified in some general language. Rather, each CASD function has its own interface program, which has the complete facilities of the system available to it. The design records and version descriptors give precisely the information needed by user interface programs. A typical user interface program might be one for encoding and simulating a source design description already in the CASD files. The version descriptor shows, for example, whether or not the source description has already been encoded. The interface may then give the user a message like "Last week you ran this design for 400 nodes. Should the results of that run be used as initial values for this run?" The point is that the conversation is natural to the task at hand. The tasks under consideration are well defined, and each natural combination of them has its own interface program. Encoder Since the CASD encoder is roughly the first haH of a compiler, it may be built along pretty standard lines. Care must be tak