AFIPS
CONFERENCE
PROCEEDINGS
VOLUME 41
PART II

1972
FALL JOINT
COMPUTER
CONFERENCE
December 5 - 7, 1972
Anaheim, California

The ideas and opinions expressed herein are solely those of the authors and are not necessarily representative of or
endorsed by the 1972 Fall Joint Computer Conference Committee or the American Federation of Information
Processing Societies, Inc.

Library of Congress Catalog Card Number 55-44701
AFIPS PRESS
210 Summit Avenue
Montvale, New Jersey 07645

©1972 by the American Federation of Information Processing Societies, Inc., Montvale, New Jersey 07645. All
rights reserved. This book, or parts thereof, may not be reproduced in any form without permission of the publisher.

Printed in the United States of America

CONTENTS

PART II

Cognitive and creative test generators ............................................ 649
    F. D. Vickers
A conversational item banking and test construction system ....................... 661
    F. B. Baker

MEASUREMENT OF COMPUTER SYSTEMS-EXECUTIVE VIEWPOINT
Measurement of computer systems-An introduction .................................. 669
    A. Goodman

ARCHITECTURE-TOPICS OF GENERAL INTEREST
A highly parallel computing system for information retrieval ..................... 681
    B. Parhami
The architecture of a context addressed segment-sequential storage ............... 691
    L. D. Healy, K. L. Doty, G. Lipovski
A cellular processor for task assignments in polymorphic multiprocessor computers  703
    J. A. Anderson
A register transfer module FFT processor for speech recognition .................. 709
    D. Casasent, W. Sterling
A systematic approach to the design of digital bussing structures ................ 719
    K. Thurber, E. Jensen

DISTRIBUTED COMPUTING AND NETWORKS
Improvement in the design and performance of the ARPA network .................... 741
    J. McQuillan, W. Crowther, B. Cosell, D. Walden, F. E. Heart
Cost effective priority assignment in network computers .......................... 755
    E. K. Bowdon, Sr., W. J. Barr
C.mmp-A multi-mini processor ..................................................... 765
    W. A. Wulf, C. G. Bell
C.ai-A computer architecture for multiprocessing in AI research .................. 779
    C. G. Bell, P. Freeman

NATURAL LANGUAGE PROCESSING
Syntactic formatting of science information ...................................... 791
    N. Sager
Dimensions of text processing .................................................... 801
    G. R. Martins
Social indicators from the analysis of communication content ..................... 811
    P. J. Stone

MEASUREMENT OF COMPUTER SYSTEMS-SOFTWARE VALIDATION AND RELIABILITY
The DOD COBOL compiler validation system ......................................... 819
    G. Baird
A prototype automatic program testing tool ....................................... 829
    L. G. Stucki
An approach to software reliability prediction and quality control ............... 837
    N. Schneidewind
The impact of problem statement languages in software evaluation ................. 849
    A. Merten, D. Teichroew

COMPUTER AIDED DESIGN
The solution of the minimum cost flow network problem using associative processing 859
    V. A. Orlando, P. B. Berra
Minicomputer models for non-linear dynamics systems .............................. 867
    J. Raamot
Fault insertion techniques and models for digital logic simulation ............... 875
    S. Szygenda, E. W. Thompson
A program for the analysis and design of general dynamic mechanical systems ...... 885
    D. A. Calahan, N. Orlandea

COMPUTER NETWORK MANAGEMENT
A wholesale retail concept for computer network management ....................... 889
    D. L. Grobstein, R. P. Uhlig
A functioning computer network for higher education in North Carolina ............ 899
    L. H. Williams

SYSTEMS FOR PROGRAMMING
Multiple evaluators in an extensible programming system .......................... 905
    B. Wegbreit
Automated programming-The programmer's assistant ................................. 917
    W. Teitelman
A programming language for real-time systems ..................................... 923
    A. Kossiakoff, T. P. Sleight
Systems for system implementors-Some experiences from BLISS ...................... 943
    W. A. Wulf

MEASUREMENT OF COMPUTER SYSTEMS-MONITORS AND THEIR APPLICATIONS
The CPM-X-A systems approach to performance measurement .......................... 949
    R. Ruud
System performance evaluation-Past, present, and future .......................... 959
    C. D. Warner
A philosophy of system measurement ............................................... 965
    H. Cureton

HISTORICAL PERSPECTIVES
Historical perspectives-Computer architecture .................................... 971
    M. V. Wilkes
Historical perspectives on computers-Components .................................. 977
    J. H. Pomerene
Mass storage-Past, present, future ............................................... 985
    A. S. Hoagland
Software-Historical perspectives and current trends .............................. 993
    W. F. Bauer, A. M. Rosenberg

INTERACTIVE PROCESSING-EXPERIENCES AND POSSIBILITIES
NASDAQ-A real time user driven quotation system .................................. 1009
    G. E. Beltz
The Weyerhaeuser information systems-A progress report ........................... 1017
    J. P. Fichten, M. J. Tobias
The future of remote information processing systems .............................. 1025
    G. M. Booth
Interactive processing-A user's experience ....................................... 1037
    H. F. Cronin

IMPACT OF NEW TECHNOLOGY ON ARCHITECTURE
The myth is dead-Long live the myth .............................................. 1045
    E. Glaser, F. Way III
Distributed intelligence for user-oriented computing ............................. 1049
    T. C. Chen
A design of a dynamic, fault-tolerant modular computer with dynamic redundancy ... 1057
    R. B. Conn, N. Alexandridis, A. Avizienis
MOS LSI minicomputer comes of age ................................................ 1069
    G. W. Schultz, R. M. Holt

ROBOTICS AND TELEOPERATORS
Control of the Rancho electric arm ............................................... 1081
    M. L. Moe, J. T. Schwartz
Computer aiding and motion trajectory control in remote manipulators ............. 1089
    A. Freedy, F. Hull, G. Weltman, J. Lyman
A robot conditioned reflex system modeled after the cerebellum ................... 1095
    J. S. Albus

DATA MANAGEMENT SYSTEMS
Data base design using IMS/360 ................................................... 1105
    R. M. Curtice
An information structure for data base and device independent report generation .. 1111
    C. Dana, L. Presser
SIMS-An integrated user-oriented information system .............................. 1117
    M. E. Ellis, W. Katke, J. R. Olson, S. Yang
A data dictionary/directory system within the context of an integrated corporate data base 1133
    B. K. Plagman, G. P. Altshuler

MEASUREMENT OF COMPUTER SYSTEMS-ANALYTICAL CONSIDERATIONS
Framework and initial phases for computer performance improvement ................ 1141
    T. Bell, B. Boehm, R. Watson
Core complement policies for memory migration and analysis ....................... 1155
    S. Kimbleton
Data modeling and analysis for users-A guide to the perplexed .................... 1163
    A. Goodman

TECHNOLOGY AND ARCHITECTURE
(Panel Discussion-No Papers in this Volume)

LANGUAGE FOR ARTIFICIAL INTELLIGENCE
Why conniving is better than planning ............................................ 1171
    G. J. Sussman, D. V. McDermott
The QA4 language applied to robot planning ....................................... 1181
    J. A. Derksen, J. F. Rulifson, R. J. Waldinger
Recent developments in SAIL-An ALGOL-based language for artificial intelligence .. 1193
    J. A. Feldman, J. R. Low, D. C. Swinehart, R. H. Taylor

USER REQUIREMENTS OF AN INFORMATION SYSTEM
A survey of languages for stating requirements for computer-based information systems 1203
    D. Teichroew

MEASUREMENT OF COMPUTER SYSTEMS-CASE STUDIES
A benchmark study ................................................................ 1225
    J. C. Strauss

SERVICE ASPECTS OF COMMUNICATIONS FOR REMOTE COMPUTING
Toward an inclusive information network .......................................... 1235
    R. R. Hench, D. F. Foster

TRAINING APPLICATIONS FOR VARIOUS GROUPS OF COMPUTER PERSONNEL
Computer jobs through training-A final project report ............................ 1243
    M. G. Morgan, N. J. Down, R. W. Sadler
Implementation of the systems approach to central EDP training in the Canadian government 1251
    G. H. Parrett
Evaluations of simulation effects in management training ......................... 1257
    H. A. Grace

ADVANCED TECHNICAL DEVICES
Conceptual design of an eight megabyte high performance charge-coupled storage device 1261
    B. Augusta, T. V. Harroun
Josephson tunneling devices for high performance computers ....................... 1269
    W. Anacker
Magnetic bubble general purpose computer ......................................... 1279
    P. Bailey, B. Sandfort, R. Minnick, W. Semon

ADVANCES IN NUMERICAL COMPUTATION
On the numerical solution of ill-posed problems using interactive graphics ....... 1299
    J. Varah
Iterative solutions of elliptic difference equations using direct methods ........ 1303
    P. Concus
Tabular data fitting by computer ................................................. 1309
    K. M. Brown
On the implementation of symmetric factorization for sparse positive-definite systems 1317
    J. A. George

Cognitive and creative test generators
by F. D. VICKERS
University of Florida
Gainesville, Florida

INTRODUCTION

No one in education would deny the desirability of being able to produce quizzes and tests by machine. If one is careful and mechanically inclined, a teacher can build up, over a period of time, a bank of questions which can be used in a computer aided test production system. Questions can be drawn from the question (or item) bank on various bases such as random selection, subject area, level of difficulty, type of question, behavioral objective, or other pertinent characteristics. However, such an item bank requires constant maintenance, and new questions should periodically be added.

It is the intention of this paper to demonstrate a more general approach, one that may require more initial effort but in the long run should almost eliminate the need to compose additional questions unless the subject material covered changes or the course objectives change. This approach involves the design and implementation of a computer program that generates a set of questions, or question elements, on a guided but random basis using a set of predetermined question models. Here the word generate is used in a different sense from that used in item banking systems. The approach described here involves a system that creates questions from an item bank which is, for all practical purposes, of infinite size yet does not require a great deal of storage space. Storage is primarily devoted to the program.

It appears at this stage of our research that this approach would only be applicable to subject material which obeys a set of laws involving quantifiable parameters. However, these quantities need not be purely numerical, as the following discussion will demonstrate. The subject area currently being partially tested with this approach is the Fortran language and its usage.

The following section of this paper presents a brief summary of a relatively simple concept which has yielded a useful generator for a particular type of test question. This presentation provides background material for the discussion of concepts which are not so simple and which are now under investigation. Finally, the last section provides some ideas for future development.

SYNTAX QUESTION GENERATION

A computer program has been in use at the University of Florida for over six years that generates a set of quizzes composed of questions concerning the syntax of Fortran language elements (see Figures 1 through 5). The student must discriminate between each syntactic type of element as well as invalid constructions. The program is capable of producing quizzes on four different sets of subject areas as well as any number of variations within each area. Thus a different variation of a quiz can be produced for each section of the course. Figure 2 contains such a variation of the quiz shown in Figure 1. The only change required in the computer program to obtain the variation is to provide a single different value on input, which becomes the seed of a pseudo random number generator. With a different seed a different sequence of random numbers is produced, thereby generating different variations of question elements.

For each question, the program first generates a random integer between 1 and 5 to determine the answer category in which to generate the element. As an example, consider Question 27 in Figure 1. The random integer in this case was 2, thus a Fortran integer variable name had to be created for this question. A call was made to a subroutine which proceeds to generate the required name. This subroutine first obtains a random integer between 1 and 6 which represents the length of the name. For Question 27, the result was a 2. The routine then enters a loop to generate each character in the name. Since for integer names the first character must be I, J, K, L, M or N,
the first random number in this loop would be limited to a value between 1 and 6. Subsequent random numbers produced in this loop would be between 1 and 37, corresponding to the 26 letters, 10 digits and the $ sign. Thus, for Question 27, the characters KS resulted. In similar fashion, the names for Questions 33, 34, and 43 were produced.

Figures 1 through 6 reproduce the computer-printed quiz pages and their answer keys:

Figure 1-Quiz 1 example
Figure 2-Quiz 1 variation
Figure 3-Quiz 2 example
Figure 4-Quiz 4 example
Figure 5-Quiz 5 example
Figure 6-Key example
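The name-generation step described above can be pictured with a short sketch. This is an illustration only, written in modern free-form Fortran rather than the Fortran IV of the actual package; the program and variable names here are invented.

    program integer_name_demo
      implicit none
      character(len=*), parameter :: first_set = 'IJKLMN'
      character(len=*), parameter :: full_set  = &
           'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789$'   ! 26 letters, 10 digits, and $
      character(len=6) :: name
      integer :: length, i, k
      real :: r

      call random_seed()               ! the real program seeds this from one input value
      call random_number(r)
      length = 1 + int(r*6.0)          ! name length: 1 to 6 characters
      name = ' '
      call random_number(r)
      k = 1 + int(r*6.0)               ! first character must be I, J, K, L, M or N
      name(1:1) = first_set(k:k)
      do i = 2, length
         call random_number(r)
         k = 1 + int(r*37.0)           ! later characters come from the 37-character set
         name(i:i) = full_set(k:k)
      end do
      print *, 'generated integer variable name: ', trim(name)
    end program integer_name_demo

An invalid element for the "none of the above" category would presumably be produced by deliberately violating one of these rules.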
As each category for each question is determined by the main program, the values between 1 and 5 are kept in a table to be used as the answer key. This table is listed for each quiz and section, as shown in Figure 6, for use in class after quiz administration is complete. A card is also punched containing the key for input to a computerized grading system which is used to grade tests and homework and maintain records for the course.

To illustrate the scope of this quiz generator in terms of programming effort, the following list gives the name and purpose of each subroutine in the total package. Each routine is written in Fortran IV:

    MAIN    General test formatting and key production
    SETUP   Prints a leader to help the operator set up the printer
    QUIZi   Calls routines for the categories in each quiz
    ALPNUM  Generates single alphanumeric characters
    SYMBOL  Generates a Fortran symbol
    CONSTA  Generates a constant, real or integer
    SPECHA  Generates a special character
    JCLCOM  Generates a job command
    NONEi   Generates "none of the above" entries for each quiz
    INTCON  Generates a Fortran integer constant
    INTEXP  Generates a Fortran integer expression
    REAEXP  Generates a Fortran real expression
    MIXEXP  Generates a Fortran mixed expression
    MIXILE  Generates an illegal expression
    UNIARY  Generates a unary operator expression
    PAREN   Generates an expression within parentheses
    BINARY  Generates a binary operator expression
    FUNCT   Generates a function call
    ARITH   Generates a Fortran arithmetic statement
    GOTON   Generates a GO TO statement
    IFSTAT  Generates an IF statement
    COGOTO  Generates a computed GO TO statement
    INOUT   Generates an I/O statement
    FIESPE  Generates a format field specification
    FORMAT  Generates a format statement
    DOSTAT  Generates a Fortran DO statement
    SIZCON  Generates a constant of a given size
    CONTRL  Generates a control statement
    SPESTA  Generates a specification statement
    INTVAR  Generates an integer variable
    REACON  Generates a real constant
    REAVAR  Generates a real variable
    STANUM  Generates a statement number
    SUBSCR  Generates a subscripted variable
    INTSUB  Generates an integer subscripted variable
    REASUB  Generates a real subscripted variable
    DIMENS  Generates a dimension statement

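The driver structure implied by this list can also be sketched: draw an answer category for each question, remember it as the key, and call the matching element generator. This is a hedged illustration in modern Fortran, not the actual MAIN or QUIZi code.

    program quiz_driver_sketch
      implicit none
      integer, parameter :: nq = 50
      integer :: key(nq), q
      real :: r

      call random_seed()           ! one input value seeds the whole quiz variation
      do q = 1, nq
         call random_number(r)
         key(q) = 1 + int(r*5.0)   ! answer category 1-5, kept as the key (Figure 6)
         ! here the real program calls the element generator matching the category
         ! for this quiz (SPECHA, CONSTA, INTVAR, ... from the list above)
      end do
      print '(25i3)', key          ! listed per quiz and section, and punched for the grading system
    end program quiz_driver_sketch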
The only major criticism that can be made of these quizzes is that they fail to test the student on his understanding of the behavior of the computer under the control of these various statements, either singly or in combination. This understanding of the semantics of Fortran is, of course, imperative if a programmer is to be successful. Thus a method is needed for generating questions which will test the student in this understanding. It is the solution of this problem that is now being sought. The following sections describe some of the major concepts discovered so far and possible methods of solution.

SEMANTIC QUESTION GENERATION

Work is now under way on designing a system to produce questions which require semantic understanding as well as syntactic recognition of various Fortran program segments. The major difficulties in such a process are the determination of the correct answer for the generation of a key and the computation of the most probable incorrect answers for the distractors of a question. Both of these determinations sometimes involve semantic meanings (i.e., evaluation of expressions or the execution of statements) which would be difficult to determine in the same program that generates the question element in the first place. As a good illustration, consider the following question model:

Given the following statement:
    IF (X + 2.0 - SQRT(A)) 5,27,13
    where X = 6.5
    and A = 22.7
Transfer is made to the statement whose number is
    (1) 5   (2) 27   (3) 13   (4) indeterminant
    (5) none of the above, as the statement is invalid

Here the generator would have created the expression X + 2.0 - SQRT(A), the three statement numbers 5, 27 and 13, and finally the two values of X and A. The order of the first four answer choices could also be determined randomly. In this particular question, determination of the distractors is no problem, but the determination of the correct answer involves an algorithm similar to the following:

       X = 6.5
       A = 22.7
       IF (X + 2.0 - SQRT(A)) 5,27,13
     5 KEY = 1
       GO TO 10
    27 KEY = 2
       GO TO 10
    13 KEY = 3
    10 CONTINUE

This problem can be solved by letting the main generator program generate a second program to compute the key as well as generate the question for the test. This second program would then be passed to further job steps which would compile and execute the program and determine the key for the question. Figure 7 illustrates this concept.

Figure 7-2nd stage involvement of key

As an illustration of a question involving more difficult determination of answer and distractors, the following question model is presented.

Given the statement:
    I = J/2 + X
    where J = 11
    and X = 6.5
the resulting value of I is
    (1) 11.5   (2) 11   (3) 12   (4) 6.5   (5) 6

The determination of the five answer choices would have to be determined by an algorithm such as the following:

    J = 11
    X = 6.5
    ANS1 = J/2 + X
    IANS2 = J/2 + X
    IANS3 = J/2. + X
    ANS4 = X
    IANS5 = X

In this problem not only does the determination of the key depend on further computation but also the distractors and the correct answer. Thus the second program generated by the first program must be involved in the production of the test as well as the key. Figure 8 illustrates this concept.

Figure 8-2nd stage involvement of key and distractors

Some questions are very simple to produce, as neither key nor answer choices depend on a generated algorithm. An example is:

Given the following statement:
    DO 35 J5 = 3, 28, 2
The DO loop would normally be iterated N times where N is
    (1) 13   (2) 12   (3) 14   (4) 28   (5) 35

Here the answer choices are determined from known algorithms independent of the random question elements. No additional program is therefore required for producing this test question and its key. Figure 9 illustrates this condition.

Figure 9-No 2nd stage involvement

It would then appear that a general semantic test generator would have to satisfy at least the conditions exhibited in Figures 7, 8 and 9.

Figure 10 illustrates results obtained from a working pilot program utilizing the method illustrated in Figure 8. This program is a very complicated one and was very difficult to write. To produce a Fortran program as output from a Fortran program involved a good deal of tedious work, such as writing Format statements within Format statements. It has become obvious that a more reasonable method of writing the source program is needed.

Figure 10-Semantic question examples (generated questions on DO statement behavior, computed GO TO transfers, and arithmetic IF transfers, with answer choices and distractors)
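A minimal sketch of the second-stage idea of Figures 7 and 8, and of why it proved tedious: the main generator has to emit the key-computing program as text, line by line, through output formats. The file name keyprog.f and the exact statements written below are assumptions for illustration and are not taken from the pilot program.

    program main_generator_sketch
      implicit none
      real :: x, a

      x = 6.5        ! operand values the generator would have drawn at random
      a = 22.7
      open(unit=10, file='keyprog.f', status='replace')   ! second-stage program, emitted as Fortran source
      write(10,'(a,f6.2)') '      X = ', x
      write(10,'(a,f6.2)') '      A = ', a
      write(10,'(a)') '      IF (X + 2.0 - SQRT(A)) 5, 27, 13'
      write(10,'(a)') '    5 KEY = 1'
      write(10,'(a)') '      GO TO 10'
      write(10,'(a)') '   27 KEY = 2'
      write(10,'(a)') '      GO TO 10'
      write(10,'(a)') '   13 KEY = 3'
      write(10,'(a)') '   10 CONTINUE'
      write(10,'(a)') '      PRINT *, KEY'
      write(10,'(a)') '      END'
      close(10)      ! later job steps would compile and run keyprog.f to obtain the key
    end program main_generator_sketch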

FUTURE INVESTIGATION

An attempt will be made to design a source language oriented toward test design which will then be translated by a new processor into a Fortran program. See Figure 11.

Figure 11-TOSL language environment (test oriented source language, TOSL processor written in SNOBOL, Fortran program)
This new language is visualized as being composed of a mixture of languages, including the possibility of passing simple English statements (for the textual part of a question) through the entire process to the test. Fortran statements could be written into the source language where such algorithms are required. Finally, statements to allow the specification of random question elements, and the linkage of these random elements to the algorithms mentioned above, will be necessary.
Several special source language operators can be
introduced to facilitate the writing of question models.
Certain special characters can be chosen to represent
particular requirements such as question number
control, random variable control, answer choice control, answer choice randomization, and key production.
It is anticipated that SNOBOL would make an excellent choice for the processor language as it will
allow for rapid recognition of the source language
elements and operations and in a natural way generate and maintain strings which will find their way
into the Fortran output program and finally into the
test and key. The possibilities of such a system look
very promising and hopefully, such a system can be
made applicable to other subject fields as well as the
current one.

A conversational item banking and test construction system
by FRANK B. BAKER
University of Wisconsin
Madison, Wisconsin

INTRODUCTION
Most conscientious college instructors maintain a pool
of items to facilitate the construction of course examinations. Typically, each item is typed on a 5" X 8" card
and coded by course, book chapter, concept and other
such keys. The back of the card usually contains data
about the item collected from one or more administrations of the item. To construct a test, the instructor
peruses this item bank looking for items that meet his
current needs. Items are selected on the basis of their content and further filtered by examining the item data on the card; overlapping items are eliminated, and the emphasis of the test is balanced. After having maintained such a system for a number of years, it became
obvious that there should be a better way. Consequently, the total process of maintaining an item bank
and creating a test was examined in detail. The result
of this study was the design and implementation of the
Test Construction and Analysis Program (TCAP).
The design goal was to provide an instructor with a
computer based item banking and test construction
system. Because the typical instructor maintains a
rather modest item bank, the design emphasis was upon
flexibility and capabilities rather than upon capacity.
In order to achieve the necessary flexibility TCAP was
implemented as a conversational system using an interactive terminal. Considerable care was taken to build a
system that had a very simple computer-user interface.
The purpose of the present paper is to describe the
TCAP system. The order of discussion proceeds from
the file structure to the software to the use of the system.
This particular order enables the reader to see the
underlying system logic without becoming enmeshed in
excessive interaction between components.

SYSTEM DESIGN

File structure

The three basic files of the TCAP system are the Item,
Statistics and Test files. A record in the Item file contains the actual item and is a direct analogy to the
5"X8" card of the manual scheme. A record in the
Statistics file contains item analysis results for up to ten
administrations of a given item. Test file records contain summary statistics for each test that has been administered. The general structure of all files is essentially
the same although they vary in internal detail. Each
file is preceded by a header (see Figure 1) that describes
the layout of the record in the file. Because changing
computers has been a way of life for the past ten years,
the header specifies the number of bits per character and
number of characters per word of the target computer.
These parameters are used to make the files word length
independent. In addition, it contains the number of
sections per record, the number of characters per record
section, characters per record and the number of
records in the file. The contents of the headers allow all
entries to data items within a record to be located via a
relative addressing scheme based upon character counts.
This character oriented header scheme enables one to
arbitrarily specify the record size and layout at run
time rather than compile time; thus, enabling several
different users of the system to employ their own record
layouts without affecting the TCAP software.
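As a rough illustration of this header-driven relative addressing (the section lengths below are invented, not TCAP's): once the per-section character counts are known from the header, the starting character of any section is a running sum, which is why record layouts can change at run time without touching the code.

    program header_offsets_sketch
      implicit none
      integer, parameter :: nsect = 7
      ! hypothetical characters-per-section, as elements 6-15 of a file header might supply them
      integer, parameter :: seclen(nsect) = (/ 24, 72, 432, 144, 8, 8, 24 /)
      integer :: start(nsect), k

      start(1) = 1
      do k = 2, nsect
         start(k) = start(k-1) + seclen(k-1)   ! first character of section k within the record
      end do
      print '(a,7i6)', 'section starting characters:', start
    end program header_offsets_sketch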
A record is divided into sections of arbitrary length,
each preceded by a unique two character flag and terminated by a double period. Sub sections within a section
are separated by double commas. These flags serve a
number of different functions during the file· creation
phase and facilitate the relative addressing scheme used
to search within a record. Figure 2 contains an item


File Header

    Element   Contents
    1         Name of file
    2         Number of bits per character in the target computer
    3         Characters per word in the target computer
    4         Characters per record in the file
    5         Number of sections in the record
    6-15      Number of characters in section i, where i = 1, 2, ..., 10

Figure 1-Typical file header

file record that represents a typical record layout. The
basic record layout scheme is the same in all files, but
they differ in the contents of the sections. A record in the item file consists of seven sections: Identification, Keyword, Item, Current item statistics, Date last used, Frequency of use, and Previous version identification.
The ID section contains a unique identification code for
the item that must begin with *$. The keyword section
contains free field keyword descriptors of the item
separated by commas. The item section contains the
actual item and was intended primarily for multiple
choice items. Since the item section is free field, other
item types could be stored, but it has not been tried to
date. The current item statistics section stores the item
analysis information from the most recent administration of the item. The first element of this section is the
identification code of the test from which the item
statistics were obtained. The internal layout of this
section is fixed so that the FORTAP item analysis program outputs can be used to update the information.
The item statistics section contains information such as
the number of persons selecting each item response, item
difficulty, and estimates of the item parameters. The
next section contains the date of the most recent administration of the item. The following section contains

a count of the total number of times the item has been
administered. These two pieces of information are used
in the test construction section to prevent over use of
an item. The final section of the item record contains
the unique identification code of a previous version of
the same item. This link enables one to follow the
development of a given item over a number of
modifications.
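A small sketch of how such a record can be taken apart with ordinary string searching; this is not the TCAP utility code, and the record image is abbreviated from Figure 2.

    program section_lookup_sketch
      implicit none
      character(len=*), parameter :: rec = &
           '*$ STAT 01 520170..ZZ EDPSY,STATISTICS,ESTIMATORS,MLE..QQ ...'
      character(len=2), parameter :: flag = 'ZZ'    ! two-character flag opening the keyword section
      integer :: i, j

      i = index(rec, flag)             ! locate the flag that opens the section
      if (i > 0) then
         j = index(rec(i:), '..')      ! the double period terminates the section
         if (j > 0) print *, 'keyword section:', rec(i+2:i+j-2)
      end if
    end program section_lookup_sketch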
A record in the Statistics file contains 11 sections, an
item identification section and 10 item statistics sections identical in format to the current item statistics
section of the item record. These 10 sections are maintained as a first in, last out push down stack with an
eleventh data set causing the first set to be pushed end
off. Records in the Test file are similar to those of the
Item file and have five sections: Identification, Keywords, Comments, Summary statistics of the test, and
a link to other administrations of the same test. The
comments section allows the instructor to store any
anecdotal information he desires in a free field format.
The link permits keeping track of multiple uses of the
same test such as occurs when a course has many sections.
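The push down behavior of the statistics record can be sketched as follows, with the statistics sections reduced to short strings purely for illustration; the newest administration enters at the top and the oldest entry falls off once an eleventh arrives.

    program pushdown_sketch
      implicit none
      integer, parameter :: depth = 10
      character(len=16) :: stats(depth)

      stats = ' '                          ! empty statistics sections
      call push(stats, 'TEST 01 220170')   ! each new administration is pushed on top
      call push(stats, 'TEST 02 230270')
      print '(a)', stats(1:2)

    contains

      subroutine push(stack, newest)
        character(len=*), intent(inout) :: stack(:)
        character(len=*), intent(in)    :: newest
        integer :: i
        do i = size(stack), 2, -1
           stack(i) = stack(i-1)           ! shift down; the bottom (oldest) entry falls off
        end do
        stack(1) = newest
      end subroutine push
    end program pushdown_sketch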
The record layouts were designed so that there was a
one to one correspondence between each 72 characters
in a section and the punched cards used to create the
file. Such a correspondence greatly facilitates the ease
with which an instructor can learn to use the system.
Once he has key punched his item pool, the record layouts within each file are quite familiar to him and the
operations upon these records are easily understood.
This approach also permitted integration of the FORTAP item analysis program into the TCAP system with
a minimum conversion effort.
It should be noted that the file design allows many
different instructors to keep their items in the same
basic files. Alternatively, each instructor can maintain

Item File Record
*$ STAT 01 520170 ..
ZZ EDPSY,STATISTICS,ESTIMATORS,MLE. .
QQ ONE OF THE CHARACTERISTICS OF MAXIMUM LIKELIHOOD ESTIMATORS IS THAT IF SUFFICIENT ESTIMATES EXIST, THEY WILL BE MAXIMUM LIKELIHOOD ESTIMATORS. ESTIMATES ARE CONSIDERED SUFFICIENT IF THEY, ,
(A) USE ALL OF THE DATA IN THE SAMPLE"
(B) DO NOT REQUIRE KNOWLEDGE OF THE POPULATION VALUE, ,
(C) APPROACH THE POPULATION VALUE AS SAMPLE SIZE INCREASES, ,
(D) ARE NORMALLY DISTRIBUTED.
VlW TEST 01 220170 ..
1 1 0 0014 .18 - .21 -01.36 -0.22"
1 2 1 0054 .69 + .53 -00.93
.63, ,
1 3 0 0010 .12 + .64 -01.77 -0.83 ..
VV 161271. .
yy 006 ..
$$ STAT 02 230270...

Figure 2-A record in the item file


his own unique set of basic files, yet use a common copy
of the TCAP program. The latter scheme is preferred
as it minimizes file search times.
Software design

The basic programming philosophy adopted was one
of cascaded drivers with several levels of utility routines. Such an approach enables the decision making at
each functional level to be controlled by the user interactively from a terminal. It also enables each level of
software to share lower level utility routines appropriate
to its tasks. Figure 3 presents a block diagram of the
major software components of the TCAP system. The
main TCAP driver is a small program that merely presents a list of operational modes to the user: Explore,
Construct, and File Maintenance. Selection of a particular mode releases control to the corresponding next
lower level driver. These second level drivers have access to four search routines that form a set of high level
utility routines. The Identification search routine
enables one to locate a record in a file by its unique
identification code. The Keyword search routine implements a search of either the item or test file for records
containing the combination of keywords specified by the
user. At present a simple conjunctive match is used, but
more complex logic can be added easily. The Parameter
search utility searches the item or statistics files for
items whose item parameter values fall within bounds
specified by the user. The Linked search routine all~ws
one to link from a record in one file to a correspondIng
record in another file. For example, from the item file
to the statistics file or from the item file to the test file.
Due to the extremely flexible manner in which the user
can interact with the three files it was necessary to access these four search routines through the Basic File
Handling routine. The BFH routine initializes the file

Figure 3-TCAP software structure


handlers from the parameters in the headers, coordinates
the file pointers, and handles certain error conditions.
Such centralization relieves both the mode implementation routines and the search routines of considerable
internal bookkeeping related to file usage. The four
search routines in turn have access to a lower level of
utility routines, not depicted in Figure 3. These lowest
level utilities are routines that read and write records,
pack and unpack character strings, convert numbers
from alphanumeric to integer or floating point, and
handle communication with the interactive terminal.
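The simple conjunctive match mentioned above amounts to requiring every user-supplied keyword to occur in the record's keyword section. A sketch of that test, not the TCAP routine itself:

    program conjunctive_match_sketch
      implicit none
      character(len=*), parameter :: section = 'EDPSY,STATISTICS,ESTIMATORS,MLE'
      character(len=12) :: wanted(3)
      logical :: hit
      integer :: k

      wanted(1) = 'STATISTICS'
      wanted(2) = 'MLE'
      wanted(3) = 'ESTIMATORS'
      hit = .true.
      do k = 1, size(wanted)
         if (index(section, trim(wanted(k))) == 0) hit = .false.   ! every keyword must be present
      end do
      print *, 'record matches request: ', hit
    end program conjunctive_match_sketch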
The purpose of the EXPLORE routine is to permit
the user to peruse the three basic files in a manner
analogous to thumbing through a card index. The EXPLORE routine presents the user with a display listing
seven functions related to accessing records within a
file. These functions are labeled: Identification, Keyword, Parameter, Linked, Restore, Mode and Continue. The first four of these correspond to the four
utility search routines. The Restore option merely reverses the linkage process and causes the predecessor
record to become the active record. The Mode option
causes an exit from the EXPLORE routine and a return to the Mode display of the TCAP driver. The Continue option allows one to continue a given search using
the present set of search specifications.
The Test Construction Routine is used to assemble an
educational test from the items in the item file. Test
construction is achieved by specifying a set of general
characteristics all items should have and then defining
sub sections of the test called areas. The areas within
the test are defined by user supplied keywords and the
number of items desired in an area. The Test Construction routine then employs the Keyword search routine,
via BFH, to locate items possessing the proper
keywords. This process is continued until the specified number of items for an area are retrieved or the end of the
item file is reached. Once the requirements of an area
are satisfied the user is free to define another area or
terminate this phase. Upon termination certain summary data, predicted test statistics, and the items are
printed.
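The construction step can be summarized in one sketch, with an invented four-item bank standing in for the Item file: an area is filled by scanning for records that carry all of the area's keywords and whose X50 and beta lie inside the user's bounds, stopping when the requested count is reached or the file ends. The item data and bounds below are made up for illustration.

    program area_fill_sketch
      implicit none
      type :: item
         character(len=40) :: keywords
         real :: x50, beta
      end type item
      type(item) :: bank(4)
      integer :: picked(4), nfound, wanted, k
      real :: x50min, x50max, betamin, betamax

      ! a tiny made-up item bank standing in for the Item file
      bank(1) = item('CHAPTER1,STATISTICS,THEORY,FISHER', 0.47, 0.45)
      bank(2) = item('CHAPTER1,STATISTICS,MEAN',          3.10, 0.90)   ! X50 out of bounds
      bank(3) = item('CHAPTER2,DISTRIBUTION,FREQUENCY',   0.10, 0.70)   ! wrong area
      bank(4) = item('CHAPTER1,STATISTICS,THEORY',       -0.80, 1.20)

      wanted = 2
      x50min = -2.5; x50max = 2.5; betamin = 0.20; betamax = 1.5
      nfound = 0
      do k = 1, size(bank)
         if (index(bank(k)%keywords, 'CHAPTER1')   == 0) cycle          ! conjunctive keyword test
         if (index(bank(k)%keywords, 'STATISTICS') == 0) cycle
         if (bank(k)%x50  < x50min  .or. bank(k)%x50  > x50max ) cycle  ! parameter filter
         if (bank(k)%beta < betamin .or. bank(k)%beta > betamax) cycle
         nfound = nfound + 1
         picked(nfound) = k
         if (nfound == wanted) exit                                     ! area requirement satisfied
      end do
      print *, 'items requested:', wanted, '  items found:', nfound
    end program area_fill_sketch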
The function display of the File Maintenance routine
presents the user with three options: Create, FORTAP
and Single. The Create option is a batch mode proc~ss
that uses the File Creation from Cards subroutIne
(FCC) to create any of the three basic files !rom a ca:d
deck. To use this option, it is necessary to SImulate, Via
cards, the interaction leading to this point. The FORTAP option is interactive, but it assumes th~t the
FORTAP item analysis routine has created a card Image
drum file containing the test and item analysis results.
The file contains the current item statistics section for
each item in the test accompanied by the appropriate


identification sections and test links. A test file record
for the test is also in this file. The File Maintenance
routine transfers the current item statistics section of the
item record of each item in the test to the corresponding
record in the statistics file. It then uses the FCC
subroutine to replace the current item statistics section
of the item records with the item statistics section from
the FORTAP generated file. If an item record does not
exist in the Item file a record is created containing only
the identification sections and the current item statistics. The test record is then stored in the Test file
and the header updated. The Single option is used to
perform line item updates on a single file. Under this
option the File Maintenance routine assumes that card
images are stored in an update file and that only parts
of a given record are to be changed.
OPERATION OF THE SYSTEM
The preceding sections have described the file structure and the software design. The present section describes some interactive sequences representing typical
uses of the TCAP system. The sequences contained in
Figure 4 have had the lengthy record printouts deleted.
The paragraphs below follow these scripts and are intended to provide the reader with a "feel" for the system operation.
Upon completion of the usual remote terminal sign
in procedures, the TCAP program is entered and the
mode selection message--TYPE IN TCAP MODE=
EXPLORE, CONSTRUCT, FILE MAINTENANCE
is printed at the terminal. The user selects the appropriate mode, say EXPLORE, by typing the name. The
computer replies by printing the function display message. In the EXPLORE mode, this message is the list of
possible search functions. The user responds by typing the name of the function he desires to perform, keyword in the example. The computer responds by asking
the user for the name of the file he wishes to search.
Next, the user is instructed to type in the keywords
separated by commas and terminated by a double
period. The user must be aware of the keywords employed to describe the items and tests in the files.
Hence, it is necessary to maintain a keyword dictionary
external to the system. This should cause little trouble
as the person who created the files is also the person using the system. Upon receipt of the keywords, the EXPLORE routine calls the Keyword Search routine to
find an item containing the particular set of keywords.
The contents of the item record located are then typed
at the terminal. At this point the system asks the user
for further instructions. It presents the message
FUNCTION DISPLAY NEEDED. A negative reply

causes a return to the Mode selection display of the
TCAP driver. A YES response causes the EXPLORE
function list to reappear. If one wishes to find the next
item in the file possessing the same keyword pattern,
CONTINUE, is typed and the search proceeds from
the last item found. In Figure 4 this option was not
selected. Returning to the Mode selection or reaching
the end of the file being searched causes the Basic File
Handler to restore the file pointers to the file origin.
The next sequence of interactions in Figure 4 links
from a record in the Item file to the corresponding record in the Statistics file. It is assumed that one of the
other search functions has been used to locate a record
prior to selection of the LINKED option, the last item
found via the Keyword search in the present example.
The computer then prompts the user by asking for the
name of the file from which the linking takes place,
item in the present example. It then asks for the name
of the file the user wishes to link to statistics in the example. There are several illegal linkages and the Linked
search routine checks for a legal link. The Linked search
routine extracts the identification section of the item
record and establishes the inputs to the Identification
Search routine. This routine then searches the Statistics file for a record having the same identification
section. It should be noted that a utility routine used a
utility routine at this point, but the cascaded control
was supervised by the EXPLORE routine. When the
proper Statistics record is found its contents are printed
at the terminal. Again, the system asks for directions
and the user is asked if he desires the function display.
In the example, the user obtained the function display
and selected the Restore option. This results in the prior
record, the item record, being returned to active record
status and the name of the active file being printed.
The system allows one to link and restore to a depth of
three records. Although not shown in the example sequences, the other options under the EXPLORE mode
operate in an analogous fashion.
The third sequence of interactions in Figure 4 shows
the construction of an examination via the TCAP system. Upon selection of the Construct mode, the computer instructs the user to supply the general item
specifications, namely the correct response weight and
the bounds for the item parameters X50 and β. These
minimum, maximum values are used to filter out items
having poor statistical properties. The remainder of the
test construction process consists of using keywords to
define areas within the test. The computer prints AREA
DEFINITION FOLLOWS: YES, NO. After receiving
a YES response the computer asks for the number of
items to be included in the area. The user can specify
any reasonable number, usually between 5 and 20.
The program then enters the normal keyword search


TYPE IN TCAP M0DE = EXPL0RE, C0NSTRUCTI0N, FILE MAINTENANCE
EXPL0RE
FUNCTI0N DISPLAY
TYPE KIND 0F SEARCH DESIRED
IDENT,KEYW0RD,PARAMETER,LINKED,REST0RE,C0NTINUE,M0DE
KEYWORD
TYPE IN FILE NAME
ITEM
TYPE IN KEYW0RDS SEPARATED BY C0MMAS
TERMINATE WITH ..
SKEWNESS,MEAN,MEDIAN..
*$AAAC 02 230270..
(THE ITEM REC0RD WILL BE PRINTED HERE)
FUNCTI0N DISPLAY NEEDED YES,N0
YES
FUNCTI0N DISPLAY
TYPE KIND 0F SEARCH DESIRED
IDENT,KEYW0RD,PARAMETER,LINKED,REST0RE,C0NTINUE,M0DE
LINKED
LINKED SEARCH REQUESTED
TYPE NAME 0F FILE FR0M
ITEM
TYPE NAME 0F FILE LINKED T0
STAT
*$AAAC 02 230270..
(THE STATISTICS REC0RD WILL BE PRINTED HERE)
FUNCTI0N DISPLAY NEEDED YES,N0
YES
FUNCTI0N DISPLAY
TYPE KIND 0F SEARCH DESIRED
IDENT,KEYW0RD,PARAMETER,LINKED,REST0RE,C0NTINUE,M0DE
REST0RE
ITEM REC0RD FILE REST0RED
FUNCTI0N DISPLAY NEEDED YES,N0
YES
FUNCTI0N DISPLAY
IDENT,KEYW0RD,PARAMETER,LINKED,REST0RE,C0NTINUE,M0DE
M0DE
TYPE IN TCAP M0DE = EXPL0RE,C0NSTRUCTI0N,FILE MAINTENANCE
C0NSTRUCT
TYPE IN WEIGHT ASSIGNED T0 ITEM RESP0NSE
1
TYPE IN MINIMUM VALUE 0F X50
-2.5
TYPE IN MAXIMUM VALUE 0F X50
+2.5
TYPE IN MINIMUM VALUE 0F BETA
.20
TYPE IN MAXIMUM VALUE 0F BETA
1.5
AREA DEFINITI0N F0LL0WS YES,N0
YES
TYPE IN NUMBER 0F ITEMS NEEDED F0R AREA
10
TYPE IN KEYW0RDS SEPARATED BY C0MMAS
TERMINATE WITH ..
CHAPTER1,STATISTICS,THE0RY,FISHER..
AREA DEFINITI0N F0LL0WS YES,N0
YES
TYPE IN NUMBER 0F ITEMS NEEDED F0R AREA
10

Figure 4-Operational sequences


TYPE IN KEYW0RDS SEPARATED BY C0MMAS
TERMINATE WITH ..
CHAPTER2,DISTRIBUTI0N,FREQUENCY,INTERVAL..
AREA DEFINITI0N F0LL0WS YES,N0
YES
TYPE IN NUMBER 0F ITEMS NEEDED F0R AREA
10
TYPE IN KEYW0RDS SEPARATED BY C0MMAS
TERMINATE WITH ..
CHAPTER3,BIN0MIAL,PARAMETER,C0MBINATI0N,PERMUTATI0N..
AREA DEFINITI0N F0LL0WS YES,N0
YES
TYPE IN NUMBER 0F ITEMS NEEDED F0R AREA
10
TYPE IN KEYW0RDS SEPARATED BY C0MMAS
TERMINATE WITH ..
CHAPTER4,HYP0THESES,LARGE SAMPLE,Z TEST..
AREA DEFINITI0N F0LL0WS YES,N0
N0
ITEMS REQUESTED PER AREA
10 10 10 10
ITEMS F0UND PER AREA
6 9 8 10
PREDICTED TEST STATISTICS
MEAN = 16.0758
STANDARD DEVIATI0N = 4.561111
RELIABILITY = .893706
D0 Y0U WANT ITEMS PRINTED YES,N0
N0
ITEM IDENTIFICATI0N          X50          BETA
*$AAAA 03 230270..           .470000      .450000
(THIS INF0RMATI0N WILL BE PRINTED F0R ALL ITEMS)
TYPE IN TCAP M0DE = EXPL0RE,C0NSTRUCTI0N,FILE MAINTENANCE
EXIT
THAT IS END 0F RUN,G00DBY

Figure 4-(Continued)

procedures and the user enters the keywords that define this area of the test. Upon receipt of the keywords
the item file is searched for items possessing the proper
descriptors and whose item parameters are within
bounds. Completion of the keyword search results in a
return to the area definition message. The area definition and search process can be repeated up to ten times.
A NO response to the area definition message results in
the printing of the table showing the number of items
requested per area and the number actually found per
area. The table is followed by the predicted values of
the test mean, standard deviation, and internal consistency reliability index. These values are computed
from the current values of the item parameters X50 and β of the retrieved items. These predicted values assist
the test constructor in determining if an appropriate
set of items has been selected by the system. The program then asks the user if he wants the selected items
printed. If not, only the identification section and the
values of the item parameters are printed. This information allows one to use the Identification search option
of the EXPLORE routine to retrieve the items at a
later date. A minor deficiency of the present test con-

struction procedures is that a reproducible copy of the
test is not produced. A secretary uses the hard copy to
prepare a stencil or similar master. With some minor
programming this final step could be accomplished.

Some enhancements
At the present time the full TCAP design has not
been implemented and a number of additional features
should be mentioned. Two sections of the item record,
date of use and frequency of use, can be employed to
prevent overuse of the same items. A step in the test
construction mode will enable the user to specify that
an item used since a certain date or more than a specified
number of times should not be retrieved. The software
for this additional filtering has been written but not debugged.
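As a concrete picture of this filtering step, the short sketch below drops an item that has been used since a cutoff date or more than an allowed number of times; the field names, cutoff values and sample records are invented for illustration and are not taken from the TCAP file formats.

# Illustrative sketch of the proposed usage filter (hypothetical fields).
from datetime import date

def usable(item, cutoff_date, max_uses):
    # exclude an item used since cutoff_date or used more than max_uses times
    used_recently = item["date_of_use"] is not None and item["date_of_use"] >= cutoff_date
    return not used_recently and item["frequency_of_use"] <= max_uses

retrieved_items = [
    {"ident": "$AAAC 02", "date_of_use": date(1971, 11, 5), "frequency_of_use": 4},
    {"ident": "$AAAD 01", "date_of_use": None, "frequency_of_use": 1},
]
candidates = [it for it in retrieved_items
              if usable(it, cutoff_date=date(1972, 1, 1), max_uses=3)]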
A significant enhancement is one that enables the
test constructor to manipulate the items constituting a
test. For example, an instructor may not be satisfied
with the items the computer has retrieved in certain
areas. He may wish to delete items from one area and

add items to another. This can be done interactively
and the predicted test statistics should be re-calculated
as each transaction occurs. At the present time, such
manipulations require a re-run of the total test construction process. An extension allowing considerable freedom
in manipulating items of the constructed examination
via the utility search routines has been designed but not
implemented.
The TCAP system was originally designed to be
operated from an alphanumeric display, hence the mode
display, function display terminology, but the present
implementation was accomplished using teletypes.
Alphanumeric displays have been acquired and many
user actions will be changed from typed in responses to
menu selections via a cursor. These displays will relieve
the user of the major portion of the typing load and
make the system a great deal easier to use.
Some observations

The TCAP design goals of flexibility, capability and
ease of use produced a conflicting set of software requirements. These requirements, combined with the fact that
the operating system of the computer forced one to
treat all drum files as if they were magnetic tapes, resulted in a challenging design problem. The requirement for providing the user with computer based
equivalents of present capabilities was solved through
the use of cascaded drivers and multiple levels of utility
routines. Such a scheme enables the drivers to be concerned with operational logic and the utility routines
with performing the functions. The use of multiple
levels of utility routines provided functional isolation
that simplified the structure of the programs. The final
TCAP program was highly modular, hierarchical in
structure and quite compact.
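The division of labor just described can be pictured roughly as follows: a driver holding only the operational logic delegates the work to two levels of utility routines. Every routine name and data field below is invented for illustration and does not correspond to TCAP's actual routines.

def keyword_search(item_file, keywords):
    # lowest-level utility: select records carrying any of the keywords
    return [rec for rec in item_file if keywords & rec["keywords"]]

def within_bounds(rec, x50_range, beta_range):
    # lowest-level utility: test the item parameters against the bounds
    return (x50_range[0] <= rec["x50"] <= x50_range[1]
            and beta_range[0] <= rec["beta"] <= beta_range[1])

def select_area(item_file, keywords, x50_range, beta_range):
    # middle-level utility: one area definition = keyword search plus parameter filter
    return [rec for rec in keyword_search(item_file, keywords)
            if within_bounds(rec, x50_range, beta_range)]

def construction_driver(item_file, areas, x50_range, beta_range):
    # driver: operational logic only; the utilities perform the functions
    return {name: select_area(item_file, set(words), x50_range, beta_range)
            for name, words in areas.items()}

Keeping the conversational logic in the driver and the work in small single-purpose utilities is what provides the functional isolation mentioned above.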
The use of relative addressing in conjunction with
the character oriented file records and a header scheme
proved to be advantageous. The approach makes transferring TCAP to other computers an easy task. Hopefully, the only conversion problem will be adjusting
the FORTRAN A formats to the target computer. A
significant feature of the approach is that record layouts within files are defined at run time rather than at
compile time. The practical effect is that each instructor
can tailor the number of sections within a record and
their size to suit his own needs. Thus, the item, statistics, and test files can be unique to a given user.
TCAP modifies its internal file manipulations to process

the record specifications it receives. Such flexibility is
important in the university setting where each instructor feels his instructional procedures are unique.
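A minimal sketch of this header-driven, relative-addressing idea follows. The header contents, section sizes and sample record are assumptions made for illustration; the actual layouts are defined by each user at run time.

def parse_record(chars, section_sizes):
    # section_sizes comes from the file header read at run time, not from a
    # layout fixed at compile time; each section is reached by a relative offset
    sections, offset = [], 0
    for size in section_sizes:
        sections.append(chars[offset:offset + size])
        offset += size
    return sections

# one instructor's (hypothetical) layout: identification, keyword and statistics sections
header = [12, 28, 16]
record = "$AAAC 02".ljust(12) + "CHAPTER1,STATISTICS,THEORY".ljust(28) + ".470000 .450000"
ident, keywords, stats = parse_record(record, header)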
One consequence of the high degree of operational
flexibility and the range of capabilities provided is that
housekeeping within TCAP is extensive. A good example of this housekeeping occurs when the File Maintenance routine updates the item files from the item
analysis results file generated by the FORTAP program. Because not all items in the test will have records
in the item file, the File Maintenance routine must keep
track of them, create records for them, add them to the
item file, and inform the user that the records have
been added. There are numerous other situations of
comparable complexity throughout the TCAP system.
Handling them smoothly and efficiently is a difficult
task. Because TCAP was implemented on a large computer, such situations were generally handled by creating supplementary drum files and providing working
arrays in core. The use of random access files would
have greatly simplified many of the internal housekeeping problems.
On the basis of the author's experience with the design and implementation of the TCAP system one
salient conclusion emerges. Such programs must be
designed as complete software systems. To attempt to
design them in a sequential fashion and implement
them piecemeal is folly. The total system needs to be
thought through very carefully and the possible interactions explored. If provision is to be made for future,
but undefined, extensions, the structure of the program
and the files must be kept simple to reduce the interaction effects of such enhancements. It appears to be a
characteristic of this area of computer programming
that complexity and chaos await your every decision.
This caveat is a reflection of the many design iterations
that were necessary to achieve the TCAP system. The
end product of this process is a system that provides
the instructor with an easy to use tool that can be of
considerable assistance. Being able to maintain an item
bank and assemble tests to meet arbitrary specifications
aids one in performing an unavoidable task. To do so
quickly and efficiently is worth the investment it takes
to convert one's item bank into machine readable form.
The TCAP system illustrates again that tasks performed by manual means can often be quite difficult to
implement by computer. In the present case a reasonable implementation was achieved by making the
system interactive and taking advantage of the capabilities of both man and machine.

Measurement of computer systems-An introduction
by ARNOLD F. GOODMAN
McDonnell Douglas Astronautics Company
Huntington Beach, California

NEED FOR MEASUREMENT

Computer systems have become indispensable to the
advancement of management, science and technology.
They are widely employed by academic, business and
governmental organizations. Their contribution to
today's world is significant in terms of both quantity
and quality.
This significant growth of computer utilization has
been accompanied by a similar growth in computer
technology. Faster computers with larger memories
and more flexible input and output have been introduced, one after another. Interactive, multiprocessing,
multiprogramming, realtime and timesharing have
been transformed from catchy slogans into costly
reality-or at least, partial reality.
In addition, computer science has come into being,
and has made great progress from an art toward a
science. Departments of computer science have appeared within many colleges and universities. A new
profession has been created and is attempting to
mature.
These three areas of phenomenal growth-computer
utilization, computer technology and computer
science-have produced the requirement for a new
field, measurement of computer systems. In an atmosphere of escalating computer cost and increasing
budget scrutiny, measurement provides a bridge
between design promises and operational performance.
This function of measurement is complemented by the
traditional need for measurement of any art in search
of a science.

ACTIVITY INVOLVING MEASUREMENT

A limited survey was conducted of the 1960-1970
literature on measurement of computer systems. This
survey included all Proceedings of Spring Joint Computer Conferences, Proceedings of Fall Joint Computer
Conferences, Journals of the Association for Computing
Machinery and Communications of the ACM, as well
as selected Proceedings of ACM National Conferences
and Proceedings of Conferences on Application of
Simulation. The resulting personal bibliography and
the unpublished bibliographies of Bell,1 Miller2 and
Robinson3-each with its own bias and deficiency-were utilized to obtain an initial indication of pioneer
activity involving measurement.
Measurement of computer systems was presaged by
Herbst, Metropolis and Wells4 in 1945, Shannon5 in
1948, Hamming6 in 1950 and Grosch7 in 1953. Bagley,8
Black,9 Codd,10 Fein,11 Flores,12 Maron13 and Nagler14
published articles concerning it during 1960. These
were followed in 1961 with the related contributions
of Barton,15 Flores,16 Gordon,17 Gurk and Minker,18
Hosier,19 and Jaffe and Berkowitz.20 During 1962, there
were pertinent papers by Adams,21 Baldwin, Gibson
and Poland,22 Dopping,23 Gosden and Sisson,24 Hibbard,25
Patrick,26 Sauder,27 Simonsen28 and Smith.29
Many of the concepts and techniques which were
developed for defense and space systems-whose focal
point was hardware rather than software-are also
applicable to computer systems. The system design,
development and testing sequence was perfected by
the late 1950's. Since the early 1960's, system verification, validation, and cost and effectiveness evaluation have been prevalent. The adaptation of these
concepts and techniques to measurement of computer
systems-especially software-is not as simple as
system specialists tend to believe, yet not as difficult
as software specialists tend to believe.
In the middle 1960's, such concepts and techniques
began to be applied to the selection and evaluation of
computer systems, and to software as well as hardware.
Ratynski,30 Searle and Neil,31 Liebowitz32 and Piligian
and Pokorney33 describe the Air Force and National
Aeronautics and Space Administration (NASA) adapta-


tion of their system acquisition procedures to software
acquisition. Attention then shifted to measurement of
computer system performance, with a corresponding
increase of activity. Sackman34 discusses computer
system development and testing, based upon the Air
Force and NASA experience. An important development of the period was the formation of a Hardware
Evaluation Committee within SHARE35 during early
1964, and its evolution into the SHARE Computer
Measurement and Evaluation Project36 during August
1970, which served as a focal point for significant
progress. 37
A preliminary but informative indication of activity
involving computer system effectiveness evaluation
prior to 1970 appears below. When a comprehensive
bibliography on measurement of computer systems is
compiled and annotated, the gross characterization
of activity given in this paper may be refined and
expanded-especially in the area of practical contributions and contributors to measurement. Raw material
for that bibliography and characterization may be
found in the unpublished bibliographies of Bell,1
Miller,2 Robinson3 and the author mentioned above-as well as a bibliography by Crooke and Minker,38 one
in preparation by Menck,39 and the selected papers in
Hall. 37
During a keynote address at Computer Science and
Statistics: Fourth Annual Symposium on the Interface
in September 1970, Hamming coined the name of
"compumetrics"-in the spirit of biometrics,econometrics and psychometrics-for measurement of computer systems. 40 It is fitting _that the naming of
compumetrics occurred at this symposium, since
measurement of computer systems is truly a part of
the interface-or area of interaction-of computer
science and statistics.41
Hamming phrased it well when he stated: 40
"The director of a computer center is responsible -for managing the utilization of large
amounts of money, people and resources.
Although he has a complex and important
statistical problem, his decisions are normally
based upon the simplest collection and analysis of data-since he usually knows little
statistics beyond such elementary concepts
as the mean and variance. His need for statistics involves both the operational performance of his hardware and software, and the
environment provided by his organization
and users."
"A new discipline that seeks to answer these
questions-and that might be called 'compu-

metrics'-is in the process of evolving. Karl
Pearson and R. A. Fisher established themselves by developing novel statistical solutions
to significant problems of their time. Compumetrics may well provide contemporary
statisticians with many such opportunities."
Workshop sessions on compumetrics followed
Hamming's remarks at the Fourth Symposium on the
Interface. During these sessions,40 "there developed a
feeling that this symposium marked a beginning which
must not be allowed to be an end"-that sessions on
compumetrics be scheduled at the Fifth Symposium
on the Interface, and that a local steering committee
be formed to promote interest in compumetrics.
It is not surprising, therefore, that a Special Interest
Committee on Measurement of Computer Systems-SICMETRICS-was initiated within the Los Angeles
Chapter of the Association for Computing Machinery
during April 1971. SICMETRICS is compiling a
bibliography on compumetrics. 39
There were sessions on computer system models and
analysis at the Fifth Annual Princeton Conference on
Information Sciences and Systems42 in March 1971.
In April 1971, the ACM Special Interest Group on
Operating Systems-SIGOPS-sponsored a Workshop
on System Performance Evaluation43-with sessions
on instrumentation, mathematical models, queuing
models, simulation models and performance evaluation. There were sessions on system evaluation and
diagnostics at the 1971 Spring Joint Computer Conference 44 during May 1971. This was followed in November 1971 by workshop sessions on compumetrics
at the Fifth Symposium on the Interface,45 by a session
on operating system models and measures at the 1971
Fall Joint Computer Conference,46 and by a Conference
on Statistical Methods for Evaluation of Computer
Systems Performance47-with sessions on general approaches, evaluation of current systems, input analysis,
software reliability, system management, design of
experiments and regression analysis. During November
1971, the ACM Special Interest Committee on Measurement and Evaluation-SICME-was also formed.
The ACM Special Interest Groups on Programming
Languages-SIGPLAN-and on Automata and Computability Theory-SIGACT-sponsored a Conference
on Proving Assertions about Programs48 in January
1972. A Symposium on Effective Versus Efficient
Computing49-with sessions on responsibility, getting
results, implementation, evaluation, education and
looking ahead-was held during March 1972, and so
was a session on computer system models at the Sixth
Annual Princeton Conference on Information Sciences
and Systems. 50 In May 1972, there was a session on

compumetrics at the 1972 Technical Symposium of
the Southern California Region of ACM, and there
were sessions on system performance measurement
and evaluation at the 1972 Spring Joint Computer Conference. 51 An ACM Special Interest Group on Programming Languages-SIGPLAN-Symposium on
Computer Program Test Methods followed during
June 1972. 52
The National Bureau of Standards and ACM are
jointly sponsoring a series of workshops and conferences on performance measurement. An informative
discussion of many practical aspects of compumetrics
is contained in Canning. 53 Finally, the 1972 Fall Joint
Computer Conference54 in December 1972, has coordinated sessions on measurement of computer systems-executive viewpoints, system performance, software validation and reliability, analysis considerations,
monitors and their application, and case studies.
Across the Atlantic, a Performance Measurement
Specialist Group was organized within the British
Computer Society in early 1971. A number of its working groups are functioning on specific projects, and it
sponsored a conference in September 1972.
This summary of activity involving measurement
of computer systems clearly outlines the growth and
increasing importance of compumetrics. Proposal of
a structure for compumetrics is, therefore, quite appropriate. The presentation below is general and suggestive, rather than detailed and complete-as is
appropriate for an introduction.

STRUCTURE FOR MEASUREMENT
A structure-or framework-is proposed for measurement of computer systems, to serve as a background
for both understanding and developing the subject.
It provides not only a common set of terms-which
may be familiar to some and new to others, but also a
guide to the current-as well as potential-extent and
content of compumetrics. Such a structure is critical
for subjects that have matured and crucial otherwise,
whether or not there is universal agreement on detailed
portions of it. The conceptual framework for Air Force
and NASA acquisition of computer systems30- 34 provides a context in which not only the structure for
measurement, but also the structure for effectiveness
evaluation, should be considered.
Compumetrics concerns measurement in-internal
to-or of-external to-computer systems. As for
biometrics, econometrics and psychometrics, this means
measurement of a general nature applied to computer
systems in a broad sense. A computer system is taken
to be a collection of properly related elements, including


a computer, which possesses a computing or data
handling objective. The structure for compumetrics is
described in terms of computer system evolution and
computer system operation. Computer system evolution
is divided into design, development and testing, and
computer system operation is divided into objective,
composition and management. A sequence of questions-including the if, why, what, where, when, how
much and how of measurement-should be developed
and then answered for each element of the structure.
The structure is presented from the viewpoint of a
statistician who is knowledgeable about computers, in
order to augment Hamming's viewpoint as a computer
scientist who is knowledgeable about statistics. In
addition, this structure is considerably more comprehensive and definitive than that which is implied by
Hamming's original discussion. 40 An outline version
of it appeared in Locks. 45
At present, measurement of computer systems might
be characterized as a growing collection of measurements on their way toward a science, and in need of
planning and analysis to help them get there. Bell,
Boehm and Watson 55 provide an adaptation of the
scientific method to performance measurement and
improvement of a computer system: from understanding the system and analyzing its operation,
through formulating performance improvement hypotheses and analyzing the probable cost-effectiveness
of the corresponding modifications, to testing specific
hypotheses and implementing the appropriate combinations of modifications-as well as testing the cost-effectiveness of these combinations. As a complement
to this approach, the author 56 presents a user's guide
to data modeling and analysis-including a perspective
for viewing and utilizing such a framework for the
collection and analysis of measurements. That paper56
discusses the sequence of steps which leads from a
problem through a solution to its assessment, some
aspects of solving problems which should be considered,
and an approach to the design and analysis of a complex system through utilization of both experimental
and computer simulation data.

Measurement and system evolution

Within this and the following sections, appropriate
terms appear in capital letters for emphasis. Such a
procedure produces not only clarity of exposition, but
also a lack of smoothness, in the resulting text. The
advantage of the former is sought, even at the disadvantage of the latter. In addition, words are employed
in their usual nontechnical sense.
Computer systems evolve from DESIGN through


DEVELOPMENT to TESTING. For illustrative
purposes, we present one partition-from among the
many which are possible-of this evolution into more
basic components. It is meaningful from both a manager's and a user's point of view. For a given computer
system, the accomplishment of more than one component may be occurring simultaneously, and the
accomplishment of all components may not be feasible.
The DESIGN of a computer system involves the
system what and how. A REQUIREMENTS ANALYSIS ascertains user needs and generates system objectives, and a FUNCTIONAL ANALYSIS translates
system objectives into a desired system framework.
Then SPECIFICATION SYNTHESIS transforms
the objectives and desired framework into desired
performance and its description. Finally, STRUCTURE
develops system framework from the desired framework, and SIZING infers system size from its framework.
System DEVELOPMENT is concerned with implementing the system what and how. It proceeds from
HARDWARE AND SOFTWARE SELECTION-which includes the decision to make or buy, through
HARDWARE AND SOFTWARE ACQUISITION-which involves either making or buying-and HARDWARE AND SOFTWARE COMBINATION-which implements the framework in terms of acquired
hardware and software, to SOFTWARE PROGRAMMING-which includes the programming of additional
software. How well the framework was implemented
is then determined by HARDWARE AND SOFTWARE VERIFICATION. Development is completed
by SYSTEM DOCUMENTATION to describe the
system what and how, and by PROCEDURE DOCUMENTATION to describe the how of system operation
and use.
TESTING of a computer system has the objective
of assessing how well the system performs. First,
system INTEGRATION-which could have been
included under development-assembles the hardware,
software and other elements into a system. This is
followed by system VALIDATION, for ascertaining
how well the specifications were implemented and for
contributing to quality assurance. COST EVALUATION determines how much the system costs in terms
of evolution and operation, and EFFECTIVENESS
EVALUATION determines how well the system performs in terms of operational time, quality and impact
upon the user. The final step in testing is, of course,
OPERATION-performance for the user.
McLean57 proposes a characterization for the "all-too-true life cycle of a typical EDP system: unwarranted
enthusiasm, uncritical acceptance, growing concern,
unmitigated disaster, search for the guilty, punishment

of the innocent, and promotion of the uninvolved." An
excellent discussion of computer system development
and testing-whose application should alter this cycle-is provided by Sackman.34 In addition, measurement
was apparently employed in many places within the
design, development and testing sequence for the information system of Winbrow. 58
Where is measurement currently utilized in the system evolution sequence? Measurement is inherently
involved in hardware specification synthesis, sizing
and cost evaluation. It is employed to a limited extent
during hardware requirements analysis and selection,
and it emerged in importance as a significant contributor to hardware validation and performance
monitoring-which is a portion of effectiveness evaluation. We are only beginning to consider serious and
systematic measurement as it concerns software verification, validation, and cost and effectiveness evaluation. In fact, we are beginning to use the same terminology for hardware and software that was used in
the early 1960's for defense and space systems-which
were predominately noncomputer hardware. "Requirements for AVAILABILITY of Computing System
Facilities"59 provides an excellent example, with its
use of reliability, maintainability, repairability and
recoverability.
Where should measurement be utilized in the evolution sequence? It probably has an appropriate use in
most, if not almost all, components of the sequence.
In particular, system verification, validation, and cost
and effectiveness evaluation-as well as reliability and
its fellow ilities 59-have no real meaning without
measurement.
Measurement and system operation

A computer system operation has COMPOSITION
and an OBJECTIVE, as well as being subject to
MANAGEMENT. As a guide to discussion and
thought, a useful-but not unique-division of system
operation into more ba&ic elements is now described.
A given computer system, however, may not involve
all of these elements.
COMPOSITION of a computer system concerns
what constitutes the system. The main component,
by tradition, has been computer HARDWARE-which
may involve input, memory, processing, output, communication or special purpose equipment. Since the
means for communicating with that equipment currently
costs from one to ten times as much as the hardware,
the main component really is SOFTWARE-which
may involve input, storage and retrieval, operating,
application, simulation, output or communication
program packages. The system may also contain


FIRMWARE, which is either soft hardware or hard
software-such as a microprogram, and PERSONNEL.
How to operate and use the system is covered by the
operating PROCEDURE. The system aspects include
all two way INTERFACES such as hardware-software,
all three way INTERFACES such as firmware-personnel-procedure, all four way INTERFACES such
as hardware-software-personnel-procedure, and the five
way INTERFACE of hardware-software-firmware-personnel-procedure.
What the computer system does primarily-although it may do many things concurrently or sequentially-is the system OBJECTIVE. DATA MANAGEMENT emphasizes storage and retrieval of data
by the system. Operating upon data by the system is
the focus of DATA PROCESSING. COMMAND
AND CONTROL stresses input and output of data
by the system, and decisions aided by the system.
As observed by Boehm, an alternative view is that
all three types of systems aid the making of decisions:
data management systems provide the least aid, data
processing systems provide more aid, and command
and control systems provide the most aid. The distinction among these also depends upon what the
environment is and who the user is-data management or command and control systems are frequently
called information systems. In addition, the same
system-or a portion of it-might frequently be utilized
for more than one objective.
Computer system MANAGEMENT involves system administration and supervision. PLANNING is
projecting the system's future. Getting operations
together and focused constitutes COORDINATION,
and keeping operations together and directed constitutes CONTROL. REVIEW provides an assessment
of the past and present, while TRAINING provides
system operators. Finally, USER INTERACTION
concerns system calibration and acceptance by the user.
Measurement has traditionally been employed on
computer hardware and personnel, has begun to be
employed on software and firmware, and may someday
be employed on procedure and interfaces. It has been
applied in data management and data processing, but
should also be applied in command and control. As
for management in general, measurement is only beginning to be utilized in computer system planning,
coordination, control, review, training and user interaction.
STRUCTURE FOR EFFECTIVENESS
EVALUATION
Consideration of the need for, activity involving,
and structure for measurement implies that an impor-


tant unsolved problem for the 1970's is the evaluation
of computer system effectiveness. That this is true for
library information systems is explicitly stated in a
recent report by the National Academy of Sciences
Computer Science and Engineering Board,60 and that
it is true for computer systems in general is implicitly
stated in a recent report by GUIDE International.59
As McLean observed,57 we are like Oscar Wilde's
cynic: "A man who knows the price of everything, and
the value of nothing."
Effectiveness evaluation determines how well the
system performs in terms of operational time, quality
and impact upon the user. It has both an internal or
inwardly oriented aspect-which determines how well
the system responds to any need, and is more efficiency
than effectiveness-and an external or outwardly
oriented aspect-which determines how well the system responds to the actual need, and is truly effectiveness. The point of view that is taken as to what
effectiveness is and how it should be evaluated is also
extremely important. Viewpoints of the user and his
management should be considered, as well as viewpoints of the system and its management. In terms
of both aspects and viewpoints, effectiveness evaluation
is much broader than mere performance measurement.
Evaluating the impact of the system upon a user is
essentially the reverse of system design or selection,
which evaluates the impact of the user upon a potential
or real system. In order to accomplish this, it is necessary to evaluate how well the promises of system
design or selection are fulfilled by system operation.
An informative, as well as interesting, exercise would
be the real impact evaluation of applications such as
those surveyed in 1965 by Rhodes,61 Ramo,62 Gerard,63
Maloney,64 McBrier,65 Merkin and Long,66 Gates and
Pickering,67 Ward,68 Baran69 and Schlager.70
Based upon Air Force and NASA experience,
Sackman34 provides a thorough treatment of computer
system development and testing. This treatment includes:

• A survey of system engineering, human factors,
software and operations research points of view
on testing and evaluation-all of which are implicitly oriented inwardly toward the system,
rather than outwardly toward the user.
• A description of test levels, objectives, phasing
within development and operation, approach and
chronology.
• A discussion of the analogy between scientific
method and system development-during which,
a sequence of increasingly specific hypotheses is
posed and tested, as the implicit promises of


Figure 1-Structure for evaluation of data management or command and control system effectiveness

design become explicit promises during development and explicit performance during operation.
• A summary of the philosophical roots of this
analogy and approach.
• A short bibliography.
It constitutes an excellent contribution to effectiveness
evaluation, as well as a firm foundation for the framework of Bell, Boehm and Watson, 55 but more is needed.
In addition, almost all library system effectiveness
evaluation has been centered around-if not actually
restricted to-variations of two simple ratios, called
relevance and recall. And Findings 1 and 2 in the
National Academy of Sciences report60 state that much
more is needed.
The complexity and importance of effectiveness
evaluation combine to require a significantly broader
and deeper, as well as more meaningful, structure. Most
of the significance and ultimate payoff associated with
computer systems involve the external environment
and aspects of the system, from various points of view.
Despite that fact, the preponderance of effectiveness
evaluation has not focused upon such aspects from the
appropriate points of view.34,71-79
A structure for computer system effectiveness evaluation is proposed, as both a step toward fulfilling that
need and an elaboration of the structure for compumetrics. Figure 1 contains a general version of the
structure for data management or command and

control systems, and Figure 2 contains a general version
of the structure for data processing systems. The
graphic presentations of the figures are complemented
by the corresponding verbal descriptions-which employ words in their usual nontechnical sense. Effectiveness evaluation of a computer system might require a
combination of the structures in Figures 1 and 2, since
the system might frequently be utilized for more than
one objective. In addition, the entire structure might
not be of interest for a given system.
An initial indication of activity involving computer
system effectiveness evaluation is then summarized.
Finally, selected papers that illustrate such activity
are briefly discussed. This summary and discussion
serve as a background against which to view the proposed structures.
Evaluation of data management or command and
control systems

In Figure 1, there are three main categories of
characteristics-FLOW,
EFFECTIVENESS
and
VIEWPOINTS-all of which reside within an ECONOMIC AND POLITICAL ENVIRONMENT.
FLOW characteristics (I-XI) involve the flow of data
and need for data, from a user and his task through the
system unit and center back to the user and his task.
Those characteristics (XII-XXIII) which describe
how well the flow of data satisfies the need for data-


both internal and external to the system-comprise
EFFECTIVENESS. VIEWPOINTS contain the
various points of view (XXIV-XXXV) regarding the
flow and its effectiveness. All of these characteristics
are embedded within an ECONOMIC AND POLITICAL ENVIRONMENT, whose influence is
sometimes explicit and sometimes implicit yet always
present.
A USER (I) of the system and a TASK (II) which
he is performing jointly generate a need for data, called
USER DATA NEED (III). To satisfy this need, the
user contacts either the appropriate outlet of the
system-SYSTEM UNIT (IV)-or other sources for
data-OTHER USER SOURCES (V). The unit essentially becomes a user now and contacts either the
SYSTEM CENTER (VII) or OTHER UNIT
SOURCES (VIII), in order to satisfy its UNIT DATA
NEED (VI). DATA (IX) is then output by the system
or other sources to the user for performance of his task.
Finally, there may also be USER DATA INPUT (X)-such as data generated by the user in his task or by user
management regarding an impending change in its
basic need-by the user to the unit, and UNIT DATA
INPUT (XI)-such as data generated by the unit or
by unit management regarding an impending change
in its basic need-by the unit to the system.
Operational characteristics of the unit and center
in terms of time-how quickly or how often-are

grouped under UNIT OUTPUT TIME (XII) and
CENTER OUTPUT TIME (XIII), those in terms
of quality-how well or how completely-are grouped
under UNIT OUTPUT QUALITY (XIV) and CENTER OUTPUT QUALITY (XV), and those in terms
of impact-how responsively or how significantlyare grouped under UNIT OUTPUT IMPACT (XVI)
and CENTER OUTPUT IMPACT (XVII). Time
characteristics emphasize the internal aspects of the
system and impact characteristics emphasize the external aspects of the system, while quality characteristics emphasize both the internal and external
aspects of the system. In addition, time is the easiest
to measure objectively as well as the least meaningful ...
quality is more difficult to measure objectively than
time and less difficult to measure objectively than
impact, as well as more meaningful than time and less
meaningful than impact ... impact is the most difficult to measure objectively as well as the most meaningful. Effectiveness may be viewed as the average,
over all users and tasks, of the effectiveness for specific
user and task combinations.
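The averaging view in the preceding sentence can be written out directly, as in the following sketch; the user-and-task pairs, the scores and the equal weighting are purely illustrative assumptions.

def overall_effectiveness(scores):
    # scores: effectiveness value for each specific (user, task) combination
    return sum(scores.values()) / len(scores)

scores = {
    ("user A", "inventory report"): 0.8,
    ("user A", "ad hoc query"): 0.5,
    ("user B", "inventory report"): 0.9,
}
print(overall_effectiveness(scores))   # mean over all user and task combinations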
There may also be USER INPUT TIME (XVIII)
and UNIT INPUT TIME (XIX)-to indicate how
quickly or how often the user inputs data to the unit
and the unit inputs data to the center, USER INPUT
QUALITY (XX) and UNIT INPUT QUALITY
(XXI)-to indicate how well or how completely these

Figure 2-Structure for evaluation of data processing system effectiveness
were accomplished, and USER INPUT IMPACT
(XXII) and UNIT INPUT IMPACT (XXIII)-to
indicate how responsively or how significantly these
were accomplished. In this case, the user is serving the
system and the above roles are reversed. Internal aspects of the user are focused upon by time and external aspects of the user are focused upon by impact,
while both internal and external aspects of the user are
focused upon by quality.
What we mean by effectiveness, as well as how we
evaluate it, will vary according to our point of view.
The task specific viewpoint of the user toward the unit
is USER AND TASK (XXIV), that of the unit toward
the user is UNIT, USER AND TASK (XXV), that
of the unit toward the center is UNIT, CENTER AND
TASK (XXVI), and that of the center toward the unit
is CENTER, UNIT AND TASK (XXVII). USER
GENERAL (XXVIII), UNIT AND USER GENERAL (XXIX), UNIT AND CENTER GENERAL
(XXX), and CENTER AND UNIT GENERAL
(XXXI) represent general viewpoints of the user for
the unit, the unit for the user, the unit for the center,
and the center for the unit. Finally, the viewpoint of
user management toward the unit constitutes USER
MANAGEMENT (XXXII), that of unit management
toward the user constitutes UNIT MANAGEMENT
AND USER (XXXIII), that of unit management
toward the center constitutes UNIT MANAGEMENT
AND CENTER (XXXIV), and that of center management toward the unit constitutes CENTER MANAGEMENT AND UNIT (XXXV). Internal aspects
of the system are stressed in center viewpoints and
external aspects of the system are stressed in user viewpoints, while both internal and external aspects of the
system are stressed in unit viewpoints. Task specific
viewpoints are the easiest to measure objectively,
general viewpoints are more difficult to measure objectively than task specific viewpoints and less difficult to measure objectively than management viewpoints, and management viewpoints are the most
difficult to measure objectively-the meaningfulness
of these depends, of course, upon point of view.

USER PROGRAMMING NEED (III) or USER
PROCESSING NEED (V). To satisfy this need, the
user contacts the SYSTEM PROGRAMMING UNIT
(IV) or SYSTEM PROCESSING CENTER (VI)-which is also contacted to satisfy USER AND UNIT
PROCESSING NEED (V). PROCESSED DATA
(VII) is then output to the user for performance of his
task. There may also be USER PROGRAMMING
INPUT (VIII) by the user to the unit, or USER AND
UNIT PROCESSING INPUT (IX) by the user and
unit to the center.
Operational characteristics of the unit and center are
grouped under UNIT OUTPUT TIME (X) and
CENTER OUTPUT TIME (XI), UNIT OUTPUT
QUALITY (XII) and CENTER OUTPUT QUALITY
(XIII), and UNIT OUTPUT IMPACT (XIV) and
CENTER OUTPUT IMPACT (XV). There may also
be USER INPUT TIME (XVI) and UNIT INPUT
TIME (XVII), USER INPUT QUALITY (XVIII)
and UNIT INPUT QUALITY (XIX), and USER
INPUT IMPACT (XX) and UNIT INPUT IMPACT (XXI).
Task specific viewpoints are those of USER AND
TASK (XXII), UNIT, USER AND TASK (XXIII),
UNIT, CENTER AND TASK (XXIV), and
CENTER, UNIT AND TASK (XXV). USER GENERAL (XXVI), UNIT AND USER GENERAL
(XXVII), UNIT AND CENTER GENERAL
(XXVIII), and CENTER AND UNIT GENERAL
(XXIX) represent general viewpoints. Finally, management viewpoints are given by USER MANAGEMENT (XXX), UNIT MANAGEMENT AND
USER (XXXI), UNIT MANAGEMENT AND
CENTER (XXXII), and CENTER MANAGEMENT AND UNIT (XXXIII).
Some modification and considerable refinement may
be required to employ one of these structures on an
actual computer system. The structures do, however,
indicate important considerations for evaluating the
effectiveness of a computer system. In addition, they
are considerably more comprehensive than current
structures, and provide a guide toward their own
modification and refinement.

Evaluation of data processing systems

Figure 2 contains the characteristics of FLOW
(I-IX), EFFECTIVENESS (X-XXI) and VIEWPOINTS (XXII-XXXIII)-all being surrounded
by an ECONOMIC AND POLITICAL ENVIRONMENT. Since it differs from Figure 1 only in
terms of the basic flow for data and need, a brief
description is now presented.
A USER (I) and his TASK (II) jointly generate

Activity involving evaluation

This introduction to compumetrics concludes with
an initial indication of activity involving computer
system effectiveness evaluation prior to 1970, and a
brief description of selected papers which illustrate the
activity. That indication and description provide a
context in which to consider the structures given above.
Utilizing the unpublished bibliographies of Bell, 1


Miller,2 and Robinson3 and the author, each possessing
its own bias and deficiency, a preliminary characterization of effectiveness evaluation activity before 1970
was obtained. Those pioneering papers that appeared
prior to 1963 and treated the general topic were included, but those papers that emphasized mathematical modeling or computer simulation-the majority of which were more concerned with mathematics
than with measurement-were not included.
There were 234 separate references remaining after
duplicate listings within these bibliographies were
eliminated. The number (and approximate percentage)
of documents by year were:

• 1945-1 (0%)
• 1948-1 (0%)
• 1950-1 (0%)
• 1953-1 (0%)
• 1960-7 (3%)
• 1961-6 (2%)
• 1962-9 (4%)
• 1963-7 (3%)
• 1964-14 (6%)
• 1965-8 (3%)
• 1966-13 (6%)
• 1967-23 (10%)
• 1968-31 (14%)
• 1969-62 (27%)
• 1970-50 (22%)

These numbers and percentages are, of course, affected by all pioneering papers having been counted
at the lower end and by some recent papers having
possibly been missed at the upper end. Nevertheless,
they do exhibit a general trend in the variation of
activity over the period. A serious characterization of
such activity awaits the compilation and annotation
of a comprehensive bibliography on measurement of
computer systems-by categories in the structures for
measurement and effectiveness evaluation, as well as
by year.
An elementary structure for evaluation of command
and control system effectiveness-in its external form as
well as its internal form-is provided by Edwards.71
Both Rosin72 and Bryan73 consider time and quality
characteristics of data processing system performance
for a large variety of users, the former on a batch-processing system and the latter on a timesharing
system. Five experiments for comparing the performance of a timesharing system with that of a batch-processing system-Gold,74 Sackman, Erikson and
Grant,75 Schatzoff, Tsao and Wiig,76 and Smith77-are summarized by Sackman:78

• All five employ computer time and some measure of man time.
• All five employ some measure of program quality.
• Gold employs three additional measures of quality, and Smith employs one additional measure of quality.
• Gold and Schatzoff, Tsao and Wiig employ a measure of cost.
• All five employ-in an implicit, rather than explicit, manner-both system and user viewpoints.

Finally, Shemer and Heying79 include both internal
and external aspects of effectiveness in the design model
for a system, which is to perform timesharing as well
as batch-processing-and then compare operational
system data with the design model.

ACKNOWLEDGMENTS

The critical review of this paper and constructive suggestions for its improvement by Thomas Bell, Barry
Boehm, Richard Hamming, Robert Patrick, Harold
Petersen and Louis Robinson are gratefully acknowledged.

REFERENCES

1 T E BELL
Computer system performance bibliography
Unpublished
2 E F MILLER JR
Bibliography on techniques of computer performance analysis
Unpublished
3 L ROBINSON
Bibliography on data processing performance evaluation
Unpublished
4 E H HERBST N METROPOLIS N B WELLS
Analysis of problem codes on the MANIAC
Mathematical Tables and Other Aids to Computation
Vol 9 No 49 1945 pp 14-20
5 C E SHANNON
A mathematical theory of communication
Bell System Technical Journal Vol 27 1948 p 379
6 R W HAMMING
Error detecting and error correcting codes
Bell System Technical Journal Vol 29 1950 p 147
7 H R J GROSCH
High speed arithmetic: The digital computer as a research
tool
Journal of the Optical Society of America Vol 43 No 4
1953 pp 306-310
8 P R BAGLEY
Item 2 of two think pieces: Establishing a measure of


capability of a data processing system
Communications of the ACM Vol 3 No 1 1960 p 1
9 A J BLACK
SAVDAT: A routine to save input data in simulator tape
format
Report FN-GS-151 System Development Corporation
1960
10 E F CODD
Multiprogram scheduling: Parts I-IV
Communications of the ACM Vol 3 Nos 6 and 7 1960
pp 347-350 and 413-418
11 L FEIN
A figure of merit for evaluating a control computer system
Automatic Control 1960
12 I FLORES
Computer time for address calculation sorting
Journal of the Association for Computing Machinery
Vol 7 No 4 1960 pp 389-409
13 M E MARON J L KUHNS
On relevance probabilistic indexing and information retrieval
Journal of the Association for Computing Machinery
Vol 7 No 3 1960 pp 389-409
14 H NAGLER
An estimation of the relative efficiency of two internal sorting
methods
Communications of the ACM Vol 3 No 11 1960 pp 618-620
15 R S BARTON
A new approach to the functional design of a digital
computer
Proceedings of 1961 Fall Joint Computer Conference
AFIPS Press 1961 pp 393-396
16 I FLORES
Analysis of internal computer sorting
Journal of the Association for Computing Machinery
Vol 8 No 1 1961 pp 41-80
17 G GORDON
A general purpose systems simulation program
Proceedings of 1961 Spring Joint Computer Conference
AFIPS Press 1961 pp 87-98
18 H M GURK J MINKER
The design and simulation of an information processing
system
Journal of the Association for Computing Machinery
Vol 8 No 2 1961 pp 260-271
19 W A HOSIER
Pitfalls and safeguards in real-time digital systems with
emphasis on programming
IRE Transactions on Engineering Management 1961
20 J JAFFE M I BERKOWITZ
The development and uses of a functional model in the
simulation of an information-processing system
Report SP-584 System Development Corporation 1961
21 C W ADAMS
Grosch's law repealed
Datamation Vol 8 No 7 1962 pp 38-39
22 F R BALDWIN W B GIBSON C B POLAND
A multiprocessing approach to a large computer system
IBM Systems Journal Vol 1 No 1 1962 pp 64-70
23 O DOPPING
Test problems used for evaluation of computers
BIT Vol 2 No 4 1962 pp 197-202
24 J A GOSDEN R C SISSON
Standardized comparisons of computer performance
Proceedings of 1962 IFIPS Congress 1962 pp 57-61

25 T N HIBBARD
Some combinatorial properties of certain trees with
applications to searching and sorting
Journal of the Association for Computing Machinery Vol
9 No 1 1962 pp 13-28
26 R L PATRICK
Let's measure our own performance
Datamation Vol 8 No 6 1962
27 R L SAUDER
A general test data generator for COBOL
Proceedings of 1962 Spring Joint Computer Conference
AFIPS Press 1962 pp 371-324
28 R H SIMONSEN
Simulation of a computer timing device
Communications of the ACM Vol 5 No 7 1962 p 383
29 E C SMITH
A directly coupled multiprocessing system
IBM Systems Journal Vol 2 No 3 1962 pp 218-229
30 M V RATYNSKI
The Air Force computer program acquisition concept
Proceedings of 1967 Spring Joint Computer Conference
AFIPS Press 1967 pp 33-44
31 L V SEARLE G NEIL
Configuration management of computer programs by the
Air Force: Principles and documentation
Proceedings of 1967 Spring Joint Computer Conference
AFIPS Press 1967 pp 45-49
32 B H LIEBOWITZ
The technical specification-Key to management control of
computer programming
Proceedings of 1967 Spring Joint Computer Conference
AFIPS Press 1967 pp 51-59
33 M S PILIGIAN J C POKORNEY
Air Force concepts for the technical control and design
verification of computer programs
Proceedings of 1967 Spring Joint Computer Conference
AFIPS Press 1967 pp 61-66
34 H SACKMAN
Computers system science and evolving society
John Wiley & Sons Inc 1967
35 Proceedings of SHARE XXIII
Share Inc 1969
36 Proceedings of SHARE XXXV
Share Inc 1970
37 G HALL Editor
Computer measurement and evaluation: Selected papers
from the SHARE project
SHARE Inc 1972
38 S CROOKE J MINKER
KWIC index and bibliography on computer systems
simulation and evaluation
Computer Science Center University of Maryland 1969
39 H R MENCK Editor
Bibliography on measurement of computer systems
ACM Los Angeles Chapter Special Interest Committee
on Measurement of Computer Systems Unpublished
40 A F GOODMAN Editor
Computer science and statistics: Fourth annual symposium
on the interface-An interpretative summary
Western Periodicals Company 1971
41 A F GOODMAN
The interface of computer science and statistics
Naval Research Logistics Quarterly Vol 18 No 2 1971
pp 215-229

l\1easurement of Computer Systems

42 M E VAN VALKENBURG et al Editors
Proceedings of fifth annual Princeton conference on
information sciences and systems
Princeton University 1971
43 U 0 GAGLIARDI Editor
Workshop on system performance evaluation
ACM Special Interest Group on Operating Systems 1971
44 Proceedings of 1971 Spring Joint Computer Conference
AFIPS Press 1971
45 M 0 LOCKS Editor
Proceedings of computer science and statistics: Fifth annual
symposium on the interface
Western Periodicals Company 1972
46 Proceedings of 1971 Fall Joint Computer Conference
AFIPS Press 1971
47 W F FREIBERGER Editor
Statistical computer performance evaluation
Academic Press 1972
48 J M ADAMS J B JOHNSON R H STARKS
Editors
Proceedings of an ACM conference on proving assertions
about programs
ACM Special Interest Groups on Programming Languages
and on Automata and Computability Theory 1972
49 F GRUENBERGER Editor
Effective versus efficient computing
Publisher to be selected
50 M E VAN VALKENBURG et al Editors
Proceedings of sixth annual Princeton conference on
information sciences and systems
Princeton University 1972
51 Proceedings of 1972 Spring Joint Computer Conference
AFIPS Press 1972
52 W C HETZEL Editor
Program testing methods
Prentice-Hall Inc 1972
53 R G CANNING Editor
Savings from performance monitoring
EDP Analyzer Vol 10 No 9 1972
54 Proceedings of 1972 Fall Joint Computer Conference
AFIPS Press 1972
55 T E BELL B W BOEHM R A WATSON
Framework and initial phases for computer performance
improvement
Proceedings of 1972 Fall Joint Computer Conference AFIPS
Press 1972
56 A F GOODMAN
Data modeling and analysis for users-A guide to the
perplexed
Proceedings of 1972 Fall Joint Computer Conference
AFIPS Press 1972
57 E R MACLEAN
Assessing returns from the data processing investment
Effective versus Efficient Computing Publisher to be
selected (see 49)
58 J H WINBROW
A large-scale interactive administrative system
IBM Systems Journal Vol 10 No 4 1971 pp 260-282
59 Requirements for AVAILABILITY of computing
facilities
User Strategy Evaluation Committee GUIDE
International Corporation 1970
60 Libraries and information technology


Information Systems Panel Computer Science and
Engineering Board National Academy of Sciences 1972
61 I RHODES
The mighty man-computer team
Proceedings of 1965 Fall Joint Computer Conference
Part 2 AFIPS Press 1965 pp 1-4
62 S RAMO
The computer and our changing society
Proceedings of 1965 Fall Joint Computer Conference
Part 2 AFIPS Press 1965 pp 5-10
63 R W GERARD
Computers and education
Proceedings of 1965 Fall Joint Computer Conference
Part 2 AFIPS Press 1965 pp 11-16
64 J V MALONEY JR
Computers: The physical sciences and medicine
Proceedings of 1965 Fall Joint Computer Conference
Part 2 AFIPS Press 1965 pp 17-19
65 C R McBRIER
Impact of computers on retailing
Proceedings of 1965 Fall Joint Computer Conference
Part 2 AFIPS Press 1965 pp 21-25
66 W I MERKIN R J LONG
The application of computers to domestic and international
trade
Proceedings of 1965 Fall Joint Computer Conference
Part 2 AFIPS Press 1965 pp 27-31
67 C R GATES W H PICKERING
The role of computers in space exploration
Proceedings of 1965 Fall Joint Computer Conference
Part 2 AFIPS Press 1965 pp 33-35
68 J A WARD
The impact of computers on the government
Proceedings of 1965 Fall Joint Computer Conference
Part 2 AFIPS Press 1965 pp 37-44
69 P BARAN
Communication computers and people
Proceedings of 1965 Fall Joint Computer Conference
Part 2 AFIPS Press 1965 pp 45-50
70 K J SCHLAGER
The impact of computers on urban transportation
Proceedings of 1965 Fall Joint Computer Conference
Part 2 AFIPS Press 1965 pp 51-55
71 N P EDWARDS
On the evaluation of the cost-effectiveness of command and
control systems
Proceedings of 1964 Spring Joint Computer Conference
AFIPS Press 1964 pp 211-218
72 R F ROSIN
Determining a computing center environment
Communications of the ACM Vol 8 No 7 1965 pp 463-468
73 G E BRYAN
JOSS: 20,000 hours at a console-A statistical evaluation
Proceedings of 1967 Fall Joint Computer Conference
AFIPS Press 1967 pp 769-777
74 M GOLD
Time-sharing and batch-processing: An experimental
comparison of their values in a problem-solving situation
Communications of the ACM Vol 12 No 5 1969 pp
249-259
75 H SACKMAN W J ERIKSON E E GRANT
Exploratory experimental studies comparing online and
offline programming performance
Communications of the ACM Vol 11 No 1 1968 pp 3-11


76 M SCHATZOFF R TSAO R WIIG
An experimental comparison of time sharing and batch
processing
Communications of the ACM Vol 10 No 5 1967 pp
261-265
77 L B SMITH
A comparison of batch processing and instant turnaround
Communications of the ACM Vol 10 No 8 1967
pp 495-500

78 H SACKMAN
Time-sharing versus batch-processing: The experimental
evidence
Proceedings of 1968 Spring Joint Computer Conference
AFIPS Press 1968 pp 1-10
79 J E SHEMER D W HEYING
Performance modeling and empirical measurements in a
system designed for batch and time-sharing users
Proceedings of 1969 Fall Joint Computer Conference
AFIPS Press 1969 pp 17-26

A highly parallel computing system
for information retrieval*
by BEHROOZ PARHAMI
University of California
Los Angeles, California


INTRODUCTION
The tremendous expansion in the volume of recorded
knowledge and the desirability of more sophisticated
retrieval techniques have resulted in a need for automated information retrieval systems. However, the high
cost, in programming and running time, implied by such
systems has prevented their widespread use. This high
cost stems from a mismatch between the problem to be
solved and the conventional architecture of digital
computers, optimized for performing serial operations on
fixed-size arrays of data.
It is evident that programming and processing costs
can be reduced substantially through the use of
special-purpose computers, with parallel-processing
capabilities, optimized for non-arithmetic computations.
This is true because the most common and time-consuming operations encountered in information retrieval
applications (e.g., searching and sorting) can make
efficient use of parallelism.
In this paper, a special-purpose highly parallel
system is proposed for information retrieval applications. The proposed system is called RAPID, Rotating
Associative Processor for Information Dissemination,
since it is similar in function to a conventional byte-serial associative processor and uses a rotating memory
device. RAPID consists of an array processor used in
conjunction with a head-per-track disk or drum memory
(or any other circulating memory). The array processor
consists of a large number of identical cells controlled by
a central unit and essentially acts as a filter between the
large circulating memory and a central computer. In
other words, the capabilities of the array processor are
used to search and mark the file. The relevant parts of
the file are then selectively processed by the central
computer.

PARALLELISM AND INFORMATION RETRIEVAL

Information retrieval may be defined as selective
recall of stored knowledge. Here, we do not consider
information retrieval systems in their full generality but
restrict ourselves to reference and document retrieval
systems. Reference (document) retrieval is defined as
the selection of a set of references (documents) from a
larger collection according to known criteria.
The processing functions required for information
retrieval are performed in three phases:
1. Translating the user query into a set of search
specifications described in machine language.
2. Searching a large data base and selecting records
that satisfy the search criteria.
3. Preparing the output; e.g., formatting the records,
extracting the required information, and so on.
Of these three phases, the second one is by far the most
difficult and time-consuming; the first one is straightforward and the third one is done only for a small set of
records.
The search phase is time-consuming mainly because
of the large volumes of information involved since the
processing functions performed are very simple. This
suggests that the search time may be reduced by using
array processors. Array processing is particularly
attractive since the search operations can be performed
as sequences of very simple primitive operations. Hence,
the structure of each processing cell can be made very
simple which in turn makes large arrays of cells
economically feasible.
Associative memories and processors constitute a
special class of array processors, with a large number of
small processing elements, which can perform simple
pattern matching operations. Because of these desirable
characteristics, several proposals have been made for

* This research was supported by the U.S. Office of Naval
Research, Mathematical and Information Sciences Division,
Contract No. N00014-69-A-0200-4027, NR 048-129.

using associative devices in information retrieval
applications.
Before proceeding to review several attempts in this
direction, it is appropriate to summarize some properties
of an ideal information retrieval 'system to provide a
basis for evaluating different proposals.
P1. Storage medium: Large-capacity storage is used which has modular growth and low cost per bit.
P2. Record format: Variable-length records are allowed for flexibility and storage efficiency.
P3. Search speed: Fast access to a record is possible. The whole data base can be searched in a short time.
P4. Search types: Equal-to, greater-than, less-than, and other common search modes are permitted.
P5. Logical search: Combination of search results is possible; e.g., Boolean and threshold functions of simple search results.
Some proposals1-3 consider using conventional associative memories with fixed word-lengths and, hence, do
not satisfy P2. While these proposals may be adequate
for small special-purpose systems, they provide no
acceptable solution for large information retrieval
systems. With the present technology, it is obviously not
practical to have a large enough associative memory
which can store all of the desired information1,2 without
violating P1. Using small associative memories in
conjunction with secondary storage3 results in considerable amounts of time spent for loading and unloading
the associative memory, violating P3.
Somewhat more flexible systems can be obtained by
using better data organizations. In the distributed-logic
memory,4,5 data is organized as a single string of symbols
divided into substrings of arbitrary lengths by delimiters. Each symbol and its associated control bits are
stored in, and processed by, a cell which can communicate with its two neighbors and with a central control
unit. In the association-storing processor,6 the basic
unit of data is a triple consisting of an ordered pair of
items (each of which may be an elementary item or a
triple) and a link which specifies the association between
the items. Very complex data structures can be represented conveniently with this method. Even though
these two systems provide flexible record formats, they
do not satisfy P1.
It is evident that with the present technology, an
information retrieval system which satisfies both P1 and
P3 is impractical. Hence, trading speed for cost through
the use of circulating memory devices seems to provide
the only acceptable solution. Delay-line associative
devices that have been proposed7,8 are not suitable for
large information retrieval systems because of their fixed

word-lengths and small capacities. The use of head-per-track disk or drum memories as the storage medium
appears to be very promising because such devices
provide a balanced compromise between P1 and P3. An
early proposal of this type is the associative file processor9 which is a highly specialized system. Slotnick10
points out, in more general terms, the usefulness of
logic-per-track devices. Parker11 specializes Slotnick's
ideas and proposes a logic-per-track system for information retrieval applications.
DESIGN PHILOSOPHY OF RAPID
The design of RAPID was motivated by the distributed-logic memory of Lee4,5 and the logic-per-track
device of Slotnick.10 RAPID provides certain basic
pattern matching capabilities which can be combined to
obtain more complicated ones. Strings, which are stored
on a rotating memory, are read into the cell storage one
symbol at a time, processed, and stored back (Figure 1).
Processing strings one symbol at a time allows efficient
handling of variable-length records and reduces the
required hardware for the cells.
Figure 2 shows the organization of data on the
rotating memory. Each record is a string of symbols
from an alphabet X, which will not be specified here. It
is assumed that members of X are represented by binary
vectors of length N. Obviously, each symbol must have
some control storage associated with it to store the
search results temporarily. One control bit has proven to
be sufficient for most applications even though some operations may be performed faster with a larger control field. Control information for a symbol will be called its state, q ∈ {0, 1}. A symbol x and its state q constitute a character, (q, x).

Figure 1-Overall organization of RAPID (head-per-track disk, the array of cells, and the control unit, with connections to and from other systems)

Figure 2-Storage of characters and records (records are of variable length; each character consists of a one-bit state and an N-bit symbol; a clock track supplies timing; an empty zone, of the order of 1 microsecond, allows time for preparing the next instruction)
One of the members of X is a don't-care symbol, δ, which satisfies any search criterion. As an example for the utility of δ, consider an author whose middle name is not known or who does not have one. Then, one can use δ as his middle initial in order to make the author field uniform for all records. We will use the encoding 11...1 for δ in our implementation. In practice, it will
become necessary to have other special symbols to
delimit records, fields, and so on. The choice of such
symbols does not affect the design and is left to the
user. It should be emphasized, at this point, that
RAPID by itself is only capable of simple pattern
matching operations. Appropriate record formats are
needed in order to make it useful for a particular
information retrieval application. One such format will
be given in this paper for general-purpose information
retrieval applications.
The idea of associating a state with each symbol is
taken from Lee's distributed-logic memory.4,5 In fact, RAPID is very similar to the distributed-logic memory in principle but differs from it in the following:
1. Only one-way communication exists between
neighboring characters in RAPID. This is
necessitated because of the use of a cyclic
memory but results in little loss in power or
flexibility.
2. The use of a cheaper and slower memory makes
RAPID more economical but increases the
search cycle from microseconds to milliseconds.
3. Besides match for equality, other types of
comparisons such as less-than and greater-than
are mechanized in RAPID.
4. Basic arithmetic capability is provided in
RAPID. It allows for threshold combinations of
search functions as well as conventional Boolean
combinations.
With the above data organization, the problem of
searching for particular sets of records will reduce to
that of locating substrings which satisfy certain criteria.
Search for successive symbols of a string is performed
one symbol per disk or drum revolution. There are at
least two reasons for this design choice:
1. At any time, all the cells will be performing
identical functions (looking for the same symbol).
This reduces the hardware complexity of each
cell since the amount of local control is minimized
and fewer input and output leads are required.
2. The alternative approach of processing a few
symbols at a time fails in the case of overlapping
strings. Suppose one tries to process k symbols at
a time (k > 1) by providing local control for each
cell in the form of a counter. Then, if the i-th
symbol in the input string is matched, the cell
proceeds to match the (i + 1)-st symbol. Hence,
if one is looking for the pattern ABCA in the
string ... DCABCABCADA ... , only one of the
two patterns will be found. Also, the pattern
BCAD will not be found in the above example.
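The effect of this one-symbol-per-revolution discipline can be seen in a short software sketch. The following Python fragment is only an illustrative model of the behavior just described (the function name and the list representation of a track are assumptions, not part of the paper): every position keeps a one-bit state, and on each simulated revolution all positions look for the same pattern symbol, so overlapping occurrences are handled correctly.

# Software model (illustrative only) of matching one pattern symbol per revolution.
def mark_pattern(track, pattern):
    """Return the positions just past each occurrence of pattern in track."""
    n = len(track)
    state = [False] * n                            # one state bit per stored character
    for j, p in enumerate(pattern):                # one disk revolution per pattern symbol
        new_state = [False] * n
        for i, ch in enumerate(track):             # every cell looks for the same symbol
            prev = True if j == 0 else state[i - 1]   # wraps at i = 0, like a circulating track
            if ch == p and prev:
                new_state[i] = True                # mark a symbol that extends a partial match
        state = new_state
    return {i + 1 for i, s in enumerate(state) if s}

print(mark_pattern("DCABCABCADA", "ABCA"))         # both overlapping occurrences are found
print(mark_pattern("DCABCABCADA", "BCAD"))         # BCAD is found as well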
THE CONTROL UNIT
Figure 3 shows a block diagram of RAPID which is a
synchronous system operating on the disk clock tracks.
The phase signal generator sequences the operations by
generating eight phase signals. PHA, PHB, PHC, and
PHZ are generated once every disk revolution while
PH1, PH2, PH3, and PH4 are generated once every bit time (Figure 4). During PHA, the cell control register (CCR), input symbol register (ISR), and address selection register (ASR) are cleared. During PHB and PHC, these registers are loaded. Then the execution of the instruction in CCR starts. During PH3, the output character register is reset. It is loaded during PH4 and is unloaded, through G4, after a certain delay.
Most parts of the control unit, namely the instruction sequencing section and the auxiliary registers which are used to load CCR, ISR, and ASR or unload OCR, are not shown in Figure 3. It should be noted, however, that these parts process instructions at the same time that the cells are performing their functions such that the next instruction and its associated data are ready before the next PHB signal. The system can also be controlled by a general-purpose computer which is interrupted during PHB to load the auxiliary registers with the next instruction and associated data.
The arrangement of records on disk is shown in Figure 2. The N+1 bits of a character are stored on parallel tracks while the characters of a record are stored serially. One or more clock tracks supply the timing pulses for the system. The empty zone is provided to allow sufficient time for loading the control registers for the next search cycle.

Figure 3-Block diagram of RAPID (head-per-track disk or drum, the cells, the multiple response resolver (MRR), the phase signal generator (PSG), the selected-address-is-zero (SAZ) signal, and the control unit)

Figure 4-Timing signals (PHA, PHB, PHC, and PHZ occur once per disk or drum revolution; PH1, PH2, PH3, and PH4 occur once every bit time)

Figure 5 shows the cell control register (CCR) which holds the instruction to be executed for one disk revolution. The function of various fields in this register will now be described.

Read field

This field consists of two bits, RST and RSY. RST commands the cells to read the state bit into the current state flip-flop, CSF. RSY commands the cells to read the symbol bits into the current symbol register, CSR.

Write field

This is similar to the read field and consists of WST and WSY. WST commands that the condition bit, CON (see description of condition field), replace the current state. WSY is a command to replace the current symbol by the contents of current symbol register, CSR, if CON = 1.

Address selection field

This field contains two bits, LAS and RAS. If the LAS bit of this field is set, the address selection register (ASR) is loaded from the multiple response resolver (MRR). MRR outputs the address of the first cell with its ASF on. If the RAS bit is set, the accumulated state flip-flop, ASF, in the cells will be reset. The function of ASF will be described with the cell design. The address selection field allows the sequential readout of the tracks which contain information pertinent to a search request.

Figure 5-The cell control register (CCR): read field (RST, RSY), write field (WST, WSY), address selection field (LAS, RAS), match field (MS1, MSZ, GRT, LET, EQT), and condition field (LOF, SCS, SAS, SCM, SPM)

Match field

This field consists of two subfields; the state match subfield, and the symbol match subfield. These subfields specify the conditions that the state and symbol of a character must meet. If both conditions are satisfied for a particular character, the current match flip-flop (CMF) of the corresponding cell is set. The state match subfield consists of MS1 and MSZ. The conditions for all combinations of these two bits are given in Table I. The symbol match subfield consists of three bits; GRT, LET, and EQT. All the symbols in the cells are simultaneously compared to the 1's complement of the contents of ISR. Table II gives the conditions for all combinations of the three signals. S is the symbol in a cell and Ȳ is the 1's complement of the contents of ISR.

TABLE I-The Match Condition for the State Part of a Character

MS1  MSZ  Match
 0    0   never
 0    1   if q = 0
 1    0   if q = 1
 1    1   always

TABLE II-The Match Condition for the Symbol Part of a Character

GRT  LET  EQT  Match
 0    0    0   never
 0    0    1   if S = Ȳ or S = δ
 0    1    0   if S < Ȳ or S = δ
 0    1    1   if S ≤ Ȳ or S = δ
 1    0    0   if S > Ȳ or S = δ
 1    0    1   if S ≥ Ȳ or S = δ
 1    1    0   if S ≠ Ȳ or S = δ
 1    1    1   always

Condition field

This field specifies how the condition bit, CON, is to be computed from the contents of the following four flip-flops in a cell: current state flip-flop, CSF; accumulated state flip-flop, ASF; current match flip-flop, CMF; and previous match flip-flop, PMF. LOF specifies the logical function to be performed (AND if LOF = 1, OR if LOF = 0). The other four bits in this field specify a subset W of the set of four control flip-flops on which the logical function is to be performed. For example, if SCS = 1, then CSF ∈ W.

Figure 6-Control section of a cell (current state flip-flop CSF, accumulated state flip-flop ASF, current match flip-flop CMF, previous match flip-flop PMF, state match signal STM, and symbol match signal SYM)

As will be seen later, the cell design is such that by
appropriate combinations of bits in CCR, other functions besides simple comparison can be performed.
THE CELL DESIGN
Each cell consists of two sections; the control section,
and the processing section. Roughly speaking, the
control section processes the state part of a character
while the processing section operates on the symbol part.
The control section (Figure 6) contains four flip-flops:
current state flip-flop, CSF; accumulated state flip-flop,
ASF; current match flip-flop, CMF; and previous match
flip-flop, PMF. CSF contains the state of the character
read most recently from the disk. ASF contains the
logical OR of the states of characters read since it was
reset. This flip-flop serves two purposes: finding out
which tracks contain at least one character with a set
state (reset by ADS during PHZ) and propagating the
state information until a specified character is encountered (reset by RAS during PHZ and by CMF
during PH4). CMF contains (after PH3) the result of
current match. It is set if both the state and symbol of
the current character meet the match specifications.

Finally, PMF contains the match result for the previous
character.
The condition signal, CON, is a logical function of the
contents of control flip-flops. The four signals SCS, SAS,
SCM, and SPM select a subset of these flip-flops and
the logical function signal, LOF, indicates whether the
contents of selected flip-flops should be ANDed
(LOF = 1) or ORed (LOF = 0) together to form CON.
The value of CON will replace the state of current
character if the write state signal, WST, is activated.
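As a concrete illustration of this selection mechanism, the short Python sketch below forms CON from the four flip-flops. It is only a model of the behavior described in the text; the function name and the treatment of an empty selection are assumptions.

# Illustrative model of forming the condition bit CON from the control flip-flops.
def condition_bit(csf, asf, cmf, pmf, scs, sas, scm, spm, lof):
    selected = []
    if scs: selected.append(csf)   # SCS selects the current state flip-flop
    if sas: selected.append(asf)   # SAS selects the accumulated state flip-flop
    if scm: selected.append(cmf)   # SCM selects the current match flip-flop
    if spm: selected.append(pmf)   # SPM selects the previous match flip-flop
    if not selected:               # behavior for an empty selection is not specified; 0 assumed
        return 0
    return int(all(selected)) if lof else int(any(selected))  # AND if LOF = 1, OR if LOF = 0

# Example: CON = CMF OR PMF
print(condition_bit(csf=0, asf=0, cmf=1, pmf=0, scs=0, sas=0, scm=1, spm=1, lof=0))  # prints 1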
The address selection signal, ADS, is activated by the
address selection decoder. This signal allows conventional read and write operations to be performed on
selected tracks of the disk. It is also possible, through
the multiple response resolver, to read out sequentially
the contents of tracks whose corresponding ASF's are
set.
The processing section, shown in Figure 7, contains an
N-bit adder with inputs from ISR and the current
symbol register, CSR. During PH1, a symbol is read
into CSR. During PH2, contents of CSR are added to
contents of ISR with the result stored back in CSR.
Overflow indication is stored in the overflow flip-flop,
OFF. Before the addition takes place, the don't-care

Parallel Computing System for Information Retrieval

flip-flop, DCF, is set if CSR contains the special don't-care symbol δ. From the results of addition, it is decided
whether the symbol satisfies the search specification
(SYM = 1 if it does, SYM = 0 if it does not).
The adder in each cell allows us to add the contents of ISR to the current symbol or to compare the symbol to the 1's complement of the contents of ISR. If we denote the current symbol by S, the contents of ISR by Y, and its 1's complement by Ȳ, then:

S = Ȳ iff S + Y + 1 = 2^N
S > Ȳ iff S + Y + 1 > 2^N
S < Ȳ iff S + Y + 1 < 2^N

If Z denotes the result of the addition (left in CSR) and OFF the overflow, these conditions are detected as:

S = Ȳ iff Z = 0 and OFF = 1
S > Ȳ iff Z ≠ 0 and OFF = 1
S < Ȳ iff OFF = 0


Note that the carry signal into the adder is activated
if any one of the signals GRT, LET, or EQT is active.
The above equations are used in the design of the
circuit which computes the symbol match result, SYM
(upper right corner of Figure 7). The result of symbol
match is ANDed with the result of state match (STM)
during PH3 to set the current match flip-flop.
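The effect of these relations can be checked with a few lines of code. The sketch below is only an illustrative model, not the hardware: it loads ISR with the 1's complement of a target symbol and classifies the stored symbol from the adder result and overflow, exactly as in the equations above (the symbol length N = 8 and the function name are assumptions).

# Illustrative model of the cell's comparison by 1's-complement addition (N = 8 assumed).
N = 8

def compare(S, target):
    Y = (~target) & ((1 << N) - 1)          # 1's complement of the target, as loaded into ISR
    total = S + Y + 1                       # adder with the carry-in forced to 1
    Z = total & ((1 << N) - 1)              # N-bit result left in CSR
    OFF = 1 if total >= (1 << N) else 0     # overflow flip-flop
    if OFF == 1 and Z == 0:
        return "equal"                      # S + Y + 1 = 2^N
    if OFF == 1 and Z != 0:
        return "greater"                    # S + Y + 1 > 2^N
    return "less"                           # OFF = 0

print(compare(0x41, 0x41), compare(0x42, 0x41), compare(0x40, 0x41))  # equal greater less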
Finally, during PH4, the contents of CSR can be
written onto the disk or put on the output bus. Since the
address selection line, ADS, is active for at most one
cell, no conflict on the output bus will arise.
EXAMPLES OF APPLICATIONS
We first give a set of 12 instructions for RAPID.
These instructions perform tasks that have been found
to be useful in information retrieval applications. Each
instruction, when executed by RAPID, will load CCR
with a sequence of patterns. These sequences of patterns
are also given. We restrict our attention to search instructions only. Input and output instructions must also be provided to complete the set.

Figure 7-Processing section of a cell (N-bit adder with inputs from ISR and the current symbol register CSR, overflow flip-flop OFF, don't-care flip-flop DCF, and the symbol match signal SYM)

1. search and set s: Find all occurrences of the symbol s and set their states.
2. search for s1s2...sn: Find all the occurrences of the string s1s2...sn and set the state of the symbols which immediately follow sn.
3. search for marked s1s2...sn: Same as the previous instruction except that for a string to qualify, the state of its first symbol must be set.
4. search for marked ψ s: Search for symbols whose states are set and have the relation ψ with s. Then, set the state of the following symbol. Possible relations are <, ≤, >, ≥, and ≠.
5. propagate to s: If the state of a symbol is set, reset it and set the state of the first s following it.
6. propagate i: If the state of a symbol is set, reset it and set the state of the i-th symbol to its right.
7. expand to s: If the state of a symbol is set, set the state of all symbols following it up to and including the first occurrence of s.
8. expand i: If the state of a symbol is set, set the state of the first i symbols following it.
9. contract i: If the state of a symbol is reset, reset the state of the first i symbols following it.
10. expand i or to s: If the state of a symbol is set, perform 7 if an s appears within the next i symbols; otherwise, perform 8.
11. add s: Add the numerical value of s to the numerical value of any symbol whose state is set.
12. replace by s: If the state of a symbol is set, replace the symbol by s.
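Before turning to the microprograms, a small software model may make the flavor of these instructions clearer. The Python sketch below is illustrative only (the list-of-cells representation and the function names are assumptions, not part of the paper); it mimics three of the primitives on a track of (state, symbol) cells.

# Illustrative models of three RAPID primitives on a track of [state, symbol] cells.
def search_and_set(track, s):
    # "search and set s": set the state of every occurrence of the symbol s.
    for cell in track:
        if cell[1] == s:
            cell[0] = 1

def propagate_to(track, s):
    # "propagate to s": if a state is set, reset it and set the state of the first s after it.
    pending = False
    for cell in track:
        if pending and cell[1] == s:
            cell[0], pending = 1, False
        elif cell[0]:
            cell[0], pending = 0, True

def expand_to(track, s):
    # "expand to s": if a state is set, set the states of all following symbols up to
    # and including the first occurrence of s.
    expanding = False
    for cell in track:
        if expanding:
            cell[0] = 1
            if cell[1] == s:
                expanding = False
        elif cell[0]:
            expanding = True

track = [[0, c] for c in "xTIyABCzRw"]       # a made-up stretch of symbols
search_and_set(track, "T")
propagate_to(track, "z")
print([cell for cell in track if cell[0]])   # only the 'z' cell is now marked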

The microprograms for these instructions are given in Table III. A blank entry in this table constitutes a don't-care condition. The entries in the repetition column specify the number of times the given patterns should be repeated. As can be seen from Table III, this set of instructions does not exploit all the capabilities of RAPID since some of the bits in the CCR assume only one value (0 or 1) for all the instructions.

TABLE III-Microprograms for RAPID Instructions (for each of the twelve instructions, the table lists the number of repetitions and the corresponding CCR contents: read field, write field, address selection field, match field, and condition field)

To illustrate the applications of RAPID, we first choose a format for the records (Figure 8). The record length field must have a fixed length in order to allow symbol by symbol comparison of the record length to a given number. The information fields can be of arbitrary lengths. The flag field contains three characters; two for holding the results of searches, and one which contains a record type flag. The Greek letters used on Figure 8 are reserved symbols and should not be used except for the purposes given in Table IV.

Figure 8-Data storage format (record length field followed by information fields, each made up of a field name subfield, a separator symbol, an information subfield, and a field end symbol; flag field)

TABLE IV-List of Reserved Symbols

λ  Indicates start of length field.
ρ  Indicates end of a record.
σ  Separates name and information subfields in a field.
φ  Indicates end of a field.
ε  Designates the end of an empty record.
ν  Designates the end of a non-empty record.
δ  Is the don't-care symbol.
τ  Is used as temporary substitute for other symbols.

As mentioned earlier, a special symbol, δ, is used as a don't-care symbol. It is also helpful to have a reserved symbol, τ, which can be used as temporary substitute for other symbols during a search operation. Let us now consider two simple examples to show the utility of the given instruction set.

Example 1. Assuming that the record length is specified by one symbol, the following program marks all the empty records whose lengths are not less than s. This is useful when entering a new record of length s to find which tracks contain empty records that are large enough.

search for λ
search for marked ≥ s
propagate to ρ
propagate 3
search for marked ε

If the record length is specified by two characters, we note that t1t2 ≥ s1s2 iff t1 > s1, or t1 = s1 and t2 ≥ s2. Hence, we write the following program:

search for λ
search for marked > s1
propagate 1
replace by τ
search for λ
search for marked s1
search for marked ≥ s2
replace by τ
search and set τ
replace by φ
propagate to ρ
propagate 3
search for marked ε

Example 2. The following program marks all non-empty records which contain in their title field, designated by TI, a word having "magnet" as its first six characters and having 3 to 10 non-blank characters after that. β designates the "blank" character.

search for φTIσ
expand to φ
search for marked magnet
expand 10 or to β
contract 3
propagate to ρ
propagate 3
search for marked ν

It is important to note that the record format given here serves only as illustration. Because of its generality and flexibility, this format is not very efficient in terms of storage overhead and processing speed. For any given application, one can probably design a format which is more efficient for the types of queries involved.

CONCLUSION

In this paper, we have described a special-purpose highly parallel system for information retrieval applica
tions. This system must be evaluated with respect to the
properties of an ideal information retrieval system
summarized earlier. It is apparent that RAPID satisfies
P2, P4 and P5. The extent to which P1 and P3 are
satisfied by RAPID is difficult to estimate at the
present.
With respect to P1, the storage medium used has a low
cost per bit. However, the cost for cells must also be
considered. Because of the large number of identical
cells required, economical implementation with LSI is
possible. Figures 6 and 7 show that each cell has one
N-bit adder, N+6 flip-flops, 6N+39 gates, and 4N+23
input and output pins. For a symbol length of N = 8
bits, each cell will require no more than 250 gates and 60
input and output pins. The number of input and output
pins can be reduced considerably at the expense of more
sophisticated gating circuits (i.e., sharing input and
output connections).
With respect to P3, the search speed depends on the
number of symbols matched. If we assume that on the
average 50 symbols are matched, the matching phase
will take about 70 disk revolutions (to allow for
overhead such as propagation of state information and
performance of logical operations on the search results).
Hence, the search time for marking the tracks which
contain relevant information is of the order of a few
seconds.
Some important considerations such as input and
output of data and fault-tolerance in RAPID have not
been explored in detail and constitute possible areas for
future research. The interested reader may consult
Reference 12 for some thoughts on these topics.

ACKNOWLEDGMENTS

The author gratefully acknowledges the guidance and
encouragement given by Dr. W. W. Chu in the course
of this study. Thanks are also due to Messrs. P. Chang,
D. Patterson, and R. Weeks for stimulating discussions.

REFERENCES
1 J GOLDBERG M W GREEN
Large files for information retrieval based on simultaneous
interrogation of all items
Large-capacity Memory Techniques for Computing Systems
New York Macmillan pp 63-67 1962
2 S S YAU C C YANG
A cryogenic associative memory system for information
retrieval
Proceedings of the National Electronics Conference pp
764-769 October 1966
3 J A DUGAN R S GREEN J MINKER
WE SHINDLE
A study of the utility of associative memory processors
Proceedings of the ACM National Conference pp 347-360
August 1966
4 C Y LEE
Intercommunicating cells, basis for a distributed-logic computer
Proceedings of the FJCC pp 130-136 1962
5 C Y LEE M C PAULL
A content-addressable distributed-logic memory with
applications to information retrieval
Proceedings of the IEEE Vol 51 pp 924-932 June 1963
6 D A SAVITT H H LOVE R E TROOP
ASP; a new concept in language and machine organization
Proceedings of the SJCC pp 87-102 1967
7 W A CROFUT M R SOTTILE
Design techniques of a delay line content-addressed memory
IEEE Transactions on Electronic Computers Vol EC-15
pp 529-534 August 1966
8 P T RUX
A glass delay line content-addressable memory system
IEEE Transactions on Computers Vol C-18 pp 512-520
June 1969
9 R H FULLER R M BIRD R M WORTHY
Study of associative processing techniques
Defense Documentation Center AD-621516 August 1965
10 D L SLOTNICK
Logic per track devices
Advances in Computers Vol 10 pp 291-296 New York
Academic Press 1970
11 J L PARKER
A logic-per-track retrieval system
Proceedings of the IFIPS Conference pp TA-4-146 to
TA-4-150 1971
12 B PARHAMI
RAPID; a rotating associative processor for information
dissemination
Technical Report UCLA-ENG-7213 University of California at Los Angeles February 1972

The architecture of a context addressed
segment-sequential storage
by LEONARD D. HEALY
U.S. Naval Training Equipment Center
Orlando, Florida

and
GERALD J. LIPOVSKI and KEITH L. DOTY
University of Florida
Gainesville, Florida

INTRODUCTION

This paper presents a new approach to the problem of searching large data bases. It describes an architecture in which a cellular structure is adapted to the use of sequential-access bulk storage. This organization combines most of the advantages of a distributed processor with that of inexpensive bulk storage.
Large data bases are required in information retrieval, artificial intelligence, management information systems, military and corporate logistics, medical diagnosis, government offices and software systems for monitoring and analyzing weather, ecological and social problems. In fact, most nonnumerical processing requires the manipulation of sizable data bases. An examination of memory costs indicates that at present the best way of storing such data bases, and the one most widely used in new computer systems, is disc storage. However, the disc is not used anywhere near its full potential.
Discs are presently used as random access storages. Each word has an address which is used to select the word. However, the association of each word with a fixed location, required in a random access storage, is a disadvantage. In a fixed-head disc, each word is read by means of a read head and can be over-written by a write head. Now, if we discard the capability to randomly address, associative addressing can be used as words are read, and automatic garbage collection can be performed as words are rewritten.
Perhaps the most important feature of this architecture is its associative (or context) addressing capability. Search instructions are used to mark words in storage that match the specified criteria. Context addressing is achieved by making the search criteria depend upon both the content of the word being searched and the result of previous searches. For example, consider the search of a telephone directory in which each entry consists of three separate, contiguously placed words: subscriber name, subscriber address, and telephone number. The search for all subscribers named John J. Smith is a content search-a search based upon the content of a single word. The search for all subscribers named Smith who live on Elm Street is a context search-the result of the search for one word affects the search for another.
Associative addressing, or more correctly, content addressing, has been attempted on discs1 in which each word in the memory is a completely separate entity in such an addressing scheme. This paper shows how context addressing can be done. Words nearby a word in the storage can be searched in context, such that a successful search for one word can be made dependent on a history of successful searches on the nearby words. Strings, sets, and trees can be stored and searched in context using such a context-addressed storage.2 More complex structures such as relational graphs can also be efficiently searched.
The context-addressed disc has the following advantage over a random-accessed disc in most non-numeric data processes. Large data bases can be searched, for instance, for a given string of characters. Once a string is found, data stored nearby the string on
the disc track can be returned to the central processor.
Only relevant data need be returned, because the irrelevant data can be screened out by context-addressed
searching on the disc itself to select the relevant data.
In contrast, a conventional disc will return considerable irrelevant data to the central processor to be
searched. Thus, the I/O channel requirements and primary storage requirements of the computer are reduced
because less data is transferred. In fact, there is a maximum number of random-accessed discs that can be
serviced by a central processor because it has to search
through all the irrelevant data returned by all the discs,
whereas an unlimited number of context-addressed
discs can be searched in parallel. Moreover, the instructions used to search the disc storage can be stored in the
disc storage itself. Thus, the central processor can transfer a search program to the disc system, then run independently until the disc has found the data. The computer would be interrupted when the data was found.
This will reduce the interrupt load on the computer.
In this paper we therefore study the implementation
of a context-addressed storage using a large number of

discs. The segment-sequential storage to be studied will
have the following characteristics (see Figure 1). The
entire storage will store a I-dimensional array of words,
called the file. From the software viewpoint, collections
of words related in a data structure format are stored
in a contiguous section of the file, called a record.
Records can be of mixed size. From the hardware viewpoint, the file will be broken into equal-length segments
and stored on fixed-head discs, one segment to a disc.
In the time taken to rotate one disc completely, all
discs can search simultaneously for a given word in the
context of a data structure as directed by the user's
query, marking all words satisfying the search. Words
selected by such context searches can be over-written
with new data in such a data structure, erased, read
out to the I/O channel, or selected as instructions to be
executed during the next disc rotation. Data in groups
of words can be copied and moved from one part of the
file to another part as the data structure is edited. In
the meantime, a hardware garbage collection algorithm
will collect erased words to the bottom of the file so that
large aggregates of words are available to receive large
records.
MOTIVATION

/
RECOROS / {

§}
•
•
•
•
•

•
•

SEGMENTS

•
•

..
SOFTWARE MAKEUP

HARDWARE PLACEMENT

Figure I-Storage of records as segments

The problem that leads to the system architecture
proposed here is the efficient use of storage devices
equivalent to large disc storages. Access to files stored
on such devices is currently based upon a sequential
search of the file area by reading blocks of data into the
main storage of the central processor and searching it
there or by use of a file index which somehow relates
the file content to its physical location. Many hierarchies
of searches have been devised-all efforts to solve
the basic problem that the storage device is addressed
by location but the data is addressed by its content.
The advantage of information retrieval based upon
content is well documented.3,4,5 However, the trend has
been toward application of associative-search hardware
within the central computer. Content-search storages
have been implemented as subsystems within a computer system;6,7 but even in these cases, the use of the
search subsystem has been closely associated with operations in the central processor. The devices fit into the
storage hierarchy between the central processor and the
main core storage. A typical application of a content-addressed storage is as a cross-reference to information
in main storage-the cache storage. An associative
storage subsystem specifically designed for the processing useful in software applications has been proposed,8
but even that is limited in size by the cost of the special
storage hardware.
Systems of the type mentioned are small, high-speed


units. They are limited to content search and are
restricted in size relative to bulk storage devices. Their
application to searching of large data bases is limited
to general improvement of central processor efficiency
or to searching the index for a large data base. What is
needed for true context search of a large data base is an
economic subsystem which can be connected to a computer and can perform context search and retrieval
operations on a large data base stored within that subsystem.
The approach described in this paper provides just
such a subsystem. It is a semi-autonomous external
device which has its own storage and control logic.
The design concept is specifically oriented toward use
of a large bulk storage medium instead of high-speed
core storage. In addition, the processing capability of
the subsystem has been expanded to include not only
list processing, but also special searches such as matching data strings against templates and operations on bit
strings to simulate networks of linear threshold elements useful in pattern recognition.
The basic building block of the proposed architecture
is a segmented sequential storage. The sequential storage was chosen because it provides an economically
feasible way to store a large data base. In order to
perform search operations on this data base, the storage
must be divided into segments which can be searched in
parallel. Each segment of the sequential storage must
have its own processing capability for conducting such a
search. This leads to a cellular organization in which
each cell consists of a sequential storage segment.
The segment-sequential storage has the following
property. Suppose n items are compared with each other
exhaustively. This requires n storage words. Thus, the
total size of the storage obviously grows linearly with n.
However, as the size grows, more discs are added on, but
the time for a search depends only on the size of the
largest disc and not on the number of discs. Thus, the
time to search for each item in a query is still the same.
The total time for the search grows linearly with the
number of words to be compared. As a first approximation to the cost of programming, the product of storage
size and search time grows as n^2. This compares with n^3
for a conventional computer. Thus, this storage is very
useful for those operations in which all the elements in
one set are exhaustively compared with each other or
with members of another set, especially when the set is
very large. Similarly, the cost of a comparison of one
element with a set of n elements grows as n 2 in a conventional processor, and as n in this architecture. The
rate of growth of the cost of programming for this
architecture is the same as for cellular associative
memories,9 primarily because it too is a parallel cellular
system.


Some algorithms demand exhaustive comparisons.
Some of these are not used because of their extreme cost.
Other algorithms abandon exhaustive comparison to be
usable in the Von Neumann computer at some increase
in programming complexity, loss of relevance or accuracy, or at the expense of structuring the data base
so that other types of searches cannot be efficiently
conducted. In view of the lower cost of an exhaustive
search, this storage might be useful for a number of
algorithms which are now used for information management in the Von Neumann computer and many others
which are not practical on that type of machine.
Discs appear to be slow, but their effective rate of
operation can be made very fast when they are used in
parallel. A typical disc rotates at sixty revolutions per
second. The segment-sequential storage will be able to
execute sixty instructions per second. (Faster rates
may eventually be possible with special discs, or on
processors built from magnetic bubble memories, semiconductor shift registers, or similar sequential memories.) However, if one hundred fixed-head discs storing
32k words per disc are simultaneously searched, nearly
two hundred million comparisons per second are performed. This is approximately the rate of the fastest
processor built. This large system of 100 discs would
cost about $5000 per disc for a total cost of $500,000.
This cost is small compared to that of a new large
computer. Thus, this architecture appears to be cost-effective.
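As a rough check of that figure (taking 32k to mean 32,768): 100 discs × 32,768 words per disc × 60 revolutions per second ≈ 1.97 × 10^8, i.e., nearly two hundred million word comparisons per second.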
This architecture is based on storage and retrieval
from a segmented sequential table data structure utilizing associative addressing. This results in the following
characteristics.
(1) The search time is independent of the file size.
The data content of each cell is searched in
parallel; the search time depends only upon the
cycle time of the individual storage segment and
the number of instructions in the query.
(2) The search technique is based largely upon context. No tables or cross references are required
to locate data. However, there are cases where
cross references can be used to advantage.
(3) New data may be inserted at any place in the
file. The moving of the data that follows the
place of insertion to make room for the new information is performed automatically by the
cells.
(4) Whenever information is deleted from the file,
later file entries will be moved to close the gap.
Thus, the locations in the bulk storage will always be "packed" to put available storage at
the end of the file area.
(5) The system is a programmable processor. Since

each instruction takes 1/60 second to be executed, as much processing should be done as
possible for each instruction. Further, because
the cell is large, the cost of the processing hardware will be amortized over many words in that
cell. Thus, a large variety of rather sophisticated
instructions will be used to search and edit the
data. Programming with these instructions will
be simpler than programming a conventional
computer in assembler language.

Lastly, since this architecture is basically cellular, where one disc and associated control hardware is a (large) cell, the following advantages can be obtained.
(1) The system is iterative. The design of one cell is repeated. The cost of design is therefore amortized over many cells.
(2) The system is upward expandable. An initial system can be built with a small number of cells. As storage demands increase, more cells can be added. The old system does not have to be discarded.
(3) The system is fail soft. If a cell is found to be faulty, it can be taken out of the system, which can still operate with reduced capability.
(4) The system is restructurable. If several small data bases are used, the larger system can be electrically partitioned so that each block of cells stores and searches one data base independently of the other blocks. Further, several systems attached to different computers, say in a computer network, can be tied together to make one larger system. Since the basic instruction rate is only sixty instructions per second, the time delays of data transmission through the network are generally insignificant. Thus, the larger system can operate as fast as any of its cells for most operations.
Based on these general observations, the segment-sequential storage has very promising capabilities. In the next sections, we will describe the machine organization and show some types of problems that are easily handled by this system.

Figure 2-System block diagram (central processor input/output channel, controller, and cells 1 through N on the broadcast/collector bus)

Figure 3-Controller block diagram (central processor input/output channel; control, T-register, K-register, and other operand registers; microprograms; word length; broadcast/collector bus)

SYSTEM ORGANIZATION

The system block diagram for the segment-sequential storage is shown in Figure 2. The system consists of a controller plus a number of identical cells. The controller provides the interface with an I/O channel of the central computer necessary to perform: (1) input and output
operations between the central computer's core storage
and the storage of the individual cells, and (2) search
operations commanded by the central computer. Each
individual cell communicates with the controller via the
broadcast/collector bus and with its left and right
adjacent neighbor by a direct connection. All cells are
identical in structure.
A more detailed diagram of the controller is shown in
Figure 3. The controller appears similar to a conventional disc controller to the central computer. It performs the functions necessary to execute orders transmitted from the central computer via its I/O channel.
The segment-sequential storage is thus able to perform
its specialized search operations under the command of
a program in the I/O channel. Intervention of the
central computer is required only for initiation of a
search and, perhaps, for servicing an interrupt when the
search is complete.
In its role in providing the interface between the
I/O channel and the cells, the controller is quite different from a conventional disc controller. Instead of
registers for track and head selection, this controller
provides the registers required to hold the information
needed by the cells in performing their specialized search
operations. These registers are:
(1) Instruction Register-I: This register holds the
instruction which tells what function the cells
should perform during the next cycle. The instruction is decoded by a read-only memory that
breaks it down into microinstructions.
(2) Comparand Register-C: This register holds the
bit configuration representing the character being
searched for. It has an extension field Q which is
used when writing data into the cell storage.
(3) Mask Register-K: This register holds a mask
which specifies which bits of the C Register are
to be considered in the search.
(4) Threshold Register-T: This register holds a
threshold value which allows use of search criteria other than exact match or arithmetic inequality.
(5) Bit-length Register-B: This register is used to
hold the number of bits in the data word. This
allows the word size of the storage segments to
be selected under control of the computer.
A block diagram of the cell is shown in Figure 4.
Each cell executes the commands broadcast by the
controller and indicates the results by transmission of
information to the broadcast/collector bus and also
through separate signal lines to its adjacent neighbors.
The C, K, T, and B Registers of the controller are
duplicated in each cell. These registers are used by the arithmetic unit in each cell in performing the commanded operation upon its segment of the storage. The status register is used to hold composite information about the flag bits associated with individual words in the storage segment. Control logic in the cell determines what signals are passed from the cell to the broadcast/collector bus and to adjacent cells. Each cell can transfer its entire storage contents to its neighbor.

Figure 4-Block diagram of cell (C-, K-, T-, and B-Registers, status register, arithmetic unit and logic, read head, write head, and sequential memory segment attached to the broadcast/collector bus)
DATA FORMAT
The storage structure of the segment-sequential
storage system consists of a number of cells, each of
which contains a fixed-length segment of the total sequential storage. Figure 5 depicts the arrangement of
data words within one such segment. The storage segment within the cell is a sequential storage device such
as a track on a drum or disc, a semiconductor shift
register, or a magnetic bubble storage device. Words
stored in the segment are stored sequentially, beginning
at some predefined origin point. Data read at the read
head is appropriately processed by the arithmetic unit
and written by the write head.
The information structure of the segment-sequential
storage system consists of fixed-length words arranged

in variable-length records. The words in a record are stored in consecutive storage locations (where the location following the last storage location in a segment is the first storage location in the following segment). Thus, a record may occupy only a part of one storage segment or occupy several adjacent segments. The start of a record is indicated by a flag bit attached to the first word in the record, and an end of a record is implied by the start of the next record. Figure 6 shows how a record may be spread over several adjacent segments.

Figure 5-Word arrangement in a storage segment (words circulate past the read head; the start of a record is indicated by a START bit)

Figure 6-Division of a file into fixed-length segments (a record of the sequential file may span adjacent segments; the start of each record is indicated by a START bit)
Figure 7 shows an expanded view of one word in
storage. The b data bits in the word are arranged
serially, least significant bit first, with four flag bits
terminating the word. The functions of the flag bits are:
(1) S: The START bit is used to indicate the beginning of a data set (record). The search of a record begins with a word containing a START bit.
(2) P: The PERMANENT bit is used for special
designations. Interpretation of this bit depends
upon the instruction being executed by the cell.
(3) M: The MATCH bit is used to mark words
which satisfy the search criteria at each step in
the context search operations.

(4) X: The X bit is used to mark deleted words.
Words so marked are ignored and are eventually
overlaid in an automatic storage compression
scheme.
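For concreteness, one stored word and its flag bits can be modelled as below. This is only an illustrative sketch; the class name and field layout are assumptions, not part of the paper.

# Illustrative model of one stored word: b data bits plus the four flag bits.
from dataclasses import dataclass

@dataclass
class Word:
    data: int          # b data bits, recorded least significant bit first on the segment
    S: bool = False    # START bit: marks the first word of a record
    P: bool = False    # PERMANENT bit: special designation, interpreted per instruction
    M: bool = False    # MATCH bit: set when the word satisfies the current search step
    X: bool = False    # X bit: marks deleted words awaiting automatic compression

# a record is a run of consecutive words whose first word carries the START bit
record = [Word(0x51, S=True), Word(0x43), Word(0x42)]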
OPERATIONAL CONCEPTS
The basic operation in context searching is a search
for records which satisfy a criterion dependent upon
both content and the result of previous searches. As an example to illustrate how the segment-sequential storage is able to search all cells simultaneously, consider the ordered search for the characters A, B, C. That is, determine which records contain A, B, and C in that order but not necessarily contiguous.

Figure 7-Word configuration (b data bits followed by the four flag bits)

The three searches required to mark all records that satisfy such a query are:
(1) Mark all words in storage which contain the character A.
(2) Mark all words in storage which contain the character B and follow (not necessarily immediately) a previously marked character in the same record. At the same time, reset the match indication from the previous search.
(3) Repeat the operation of step 2 for the character C.
The result of these steps is to leave marked only those records which match the ordered search specified.

Figure 8a-Flag and status bits before start of search

Figure 8 shows four segments of a system which will
be used to illustrate the processing of such a search. The
storage segments each contain four words (characters).
Only the START and MATCH flags are indicated.
The origin (beginning) of each segment is at the top
and the direction of search is clockwise (data bits rotate
counter-clockwise under the head). A record containing the string Q,C,B,P,A,B,N,L,K,R,C,T,C begins at
the origin of the left-most segment and continues over
all four segments. The right-most segment also contains
the start of the next record which consists of the string
beginning B,A,C.
The first command causes all words containing
the character A to be marked in the MATCH bit.
Thus, after one circulation of the storage, the words are
marked as shown in Figure 8b.
Figure 8b-Flag and status bits after search for A

In order to perform context-search operations in one storage cycle, status bits must be provided in each cell. These are used to propagate information about records which are spread over more than one cell. The status bits and their uses are:
(1) TR: The TRansparent status bit is set if no
word in the cell is marked with a START bit.
It is used to indicate that the status indication to
be transmitted to adjacent cells depends upon the
status of this cell and the status input from adjacent cells.
(2) LS: The Left Status bit is set if any word in a
cell between the origin and the first word marked
with a START bit is marked with a MATCH
bit. This bit indicates a match in a record which
begins to the left of the cell indicating the status.
(3) RS: The Right Status bit is set if any word in the
cell following the last word marked with a
START bit is marked with a MATCH bit. This
bit indicates a match condition which applies to
words stored in the cells to the right of this cell,
up to the next word marked by a START bit.
These status bits are updated at the end of each cycle
of the storage. The condition of the status bits after
each operation is performed is shown in Figure 8.
The second search command causes all previous
MATCH bits to be erased, after using them in marking
those words which contain a B and follow a previously
marked word in the same record. If the previously
marked bit and the word containing the B are in the same cell, the marking condition is completely determined by the logic in the cell. However, in most cases it is necessary to sense the status bits of previous cells in order to determine whether the ordered search condition is satisfied. Notice that the status bit conditions can be propagated down a chain of cells in the same manner as a high-speed carry is propagated in a conventional parallel arithmetic unit.

Figure 8c-Flag and status bits after search for B

Figure 8d-Flag and status bits after search for C
Figure 8c shows the flag-bit configurations for each
word in storage and the status bits for each cell after
the completion of the search for B. Figure 8d shows the
configurations after the C search. After three cycles of
the storage, all records in storage have been searched
and those containing the ordered set of characters A, B,
C have been marked. In general, a search requires one
storage cycle per character in the search specification
and is independent of the total storage size.
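The behavior of such an ordered search is easy to model in software. The Python sketch below is illustrative only (the list representation, the record-boundary handling, and the function name are assumptions); it reproduces the three-cycle A, B, C search on the two records of Figure 8, marking only the record that contains the characters in order.

# Illustrative model of the ordered search: each step marks words equal to the step's
# character that are preceded, anywhere earlier in the same record, by a word marked
# in the previous step.
def ordered_search(words, starts, query):
    match = [False] * len(words)
    for step, ch in enumerate(query):
        seen_earlier_match = False                 # plays the role of a mark earlier in the record
        new_match = [False] * len(words)
        for i, w in enumerate(words):
            if starts[i]:
                seen_earlier_match = False         # a START bit begins a new record
            if w == ch and (step == 0 or seen_earlier_match):
                new_match[i] = True
            if match[i]:
                seen_earlier_match = True          # a mark from the previous step in this record
        match = new_match
    return match                                   # marked words identify qualifying records

words  = list("QCBPABNLKRCTC" + "BAC")             # the two records of Figure 8
starts = [i in (0, 13) for i in range(len(words))]
result = ordered_search(words, starts, "ABC")
print([i for i, m in enumerate(result) if m])      # marks fall only in the first record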
BASIC OPERATIONS
In this section, the operations for performing context
searches are described in a more formal manner than in
the example above. The instructions are a subset of the
complete set which is described in a report.10 The use
of these instructions will be illustrated in the section
following this one.
Each instruction includes a basic operation type and,
in most cases, a function code which further specifies
what the instruction is to do. Figure 9 shows the instruction format and its variations. Instructions which perform search and mark operations use the function code
to specify the type of comparison to be used. Instructions which initiate input or output operations allow two specifications in the function field. The first designates the channel to be used in the data transfer. The second tells whether the start of each record should be marked, in preparation for another search operation.
The symbols used in describing the instructions are given below. The notation is that due to Iverson, modified for convenience in describing some of the special operations performed by the search logic.

B: The contents of the Bit-length Register is denoted B. The word length b = ⊥B.
C: The contents of the Comparand Register is denoted C. Individual bits are C1 (least significant bit) through Cb (most significant bit).
K: The contents of the Mask Register is denoted K. Individual bits are represented by the same scheme as that used for C.
W: The word of cell storage currently being considered is denoted W. Individual bits are represented by the same scheme as that used for C.
R: R denotes the contents of a flip-flop in each cell which is used to indicate the result of the comparison. R←1 for a "match" and R←0 for "no match". The match performed is the comparison between C and W in those positions where Ki = 1. In the examples considered in this paper, the comparisons are arithmetic (=, ≤, ≥).
M: The MATCH bit associated with each word (see Figure 7) is denoted M. M without superscript designates the MATCH bit in the word being compared, W. M with a numeric superscript indicates the MATCH bit before or after the one being compared; e.g., M⁻² represents the MATCH bit two words before the word on which the comparison is being made. Inequality signs used as superscripts indicate logic signals representing the union of all MATCH bits in the record before (M<) and after (M>) the word being compared.
P: The PERMANENT bit associated with each word (see Figure 7) is denoted P. The same superscript conventions apply to P as to M.
S: The START bit associated with each word (see Figure 7) is denoted S. The same superscript conventions apply to S as to M.

Figure 9a-Basic instruction format (instruction type, function, comparand)

Figure 9c-Input-output instruction format (instruction type, channel number, start function bit; comparand field not used)

TABLE I-Description of Instructions

String Search (SS C)
M ← (R ∧ M⁻¹) ∨ (M ∧ P)
Set the MATCH bit in any word where the masked comparison of the word and the comparand satisfies the comparison type specified in the function field of the instruction and the word is immediately preceded by a word in the same record which was left with its MATCH bit set by the previous instruction. Also, set the MATCH bit in any word which was left with its MATCH bit set by the previous instruction and has its PERMANENT bit set. Reset all other MATCH bits.

Ordered Search (OS C)
M ← (R ∧ M<) ∨ (M ∧ P)
Set the MATCH bit in any word where the masked comparison of the word and the comparand satisfies the comparison type specified in the function field of the instruction and the word is preceded (not necessarily immediately) by a word in the same record which was left with its MATCH bit set by the previous instruction. Also, set the MATCH bit in any word which was left with its MATCH bit set by the previous instruction and has its PERMANENT bit set. Reset all other MATCH bits.

Mark Start (MS -)
Wi ← S ∧ (M> ∨ M), where i = ⊥(Channel No.)
M ← S ∧ (Start Function)
If the channel number i specified in the instruction is between 1 and b, set Wi, the ith bit of the first word in any record which contains a word with its MATCH bit set. If the start function bit in the instruction is a one, set the MATCH bit in any word which has its START bit set. Reset all other MATCH bits.

TABLE II-Data Format for String XYABLMNCDPEFWZ

WORD    CONTENTS
 1      I/O Flags (S)
 2      X
 3      Y
 4      A
 5      B
 6      L
 7      M
 8      N
 9      C
10      D
11      P
12      E
13      F
14      W
15      Z

(S) indicates the START bit for this word is set.

TABLE III-Program to Find Match for $AB$CD$EF$

NO.  TYPE  FUNCTION  COMPARAND  REMARKS
 1   OS              A          mark all strings which begin A or $.
 2   SS              B          mark all strings which begin AB, $, or $B.
 3   OS              C          mark all strings which follow the strings above and begin C or $.
 4   SS              D          mark all strings which satisfy the AB search and contain a subsequent string which satisfies the CD search.
 5   OS              E          mark all strings which follow the strings above and begin E or $.
 6   SS              F          mark all strings which satisfy the template.
 7   MS    2,S                  flag channel #2 for input and mark the start of each record.

The instructions which are considered in the examples
in the next section are described in Table I.
SEARCH EXAMPLES
The following examples show the application of the
segment-sequential storage to matching strings with
templates.11 A template consists of characters separated
by parameter markers which are to be matched by
parameter strings. For example, $AB$CD$EF$ is a
template which matches any string formed by the concatenation of any arbitrary string, the string AB,
another arbitrary string, the string CD, another arbitrary string, the string EF, and another arbitrary string.



The arbitrary strings need not be the same, and any or
all may be the null string. The string XYABLMNCDPEFWZ is one example of a string which matches this
template.
In the following examples, it is assumed that the
first word in each record has had its MATCH bit set
by the last instruction of the previous search. The programs shown perform the specified search, initiate the
input of the selected records to the computer, and mark
the first word of each record in preparation for the next
search.

Find strings to fit a template

The case where a set of fixed strings is stored in the
segment-sequential storage is illustrated first. The data
format for a typical string is shown in Table II. The
first word is used to hold I/O flags. The characters in
the string are stored in sequential words following the
I/O word.
The program to search all strings in storage and mark
the ones that match the template $AB$CD$EF$ is
shown in Table III. A template search takes one instruction for each character in the template plus an instruction to set the I/O flag in those records which contain
the strings matching the template.
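The net effect of the program in Table III can be checked at a higher level: a string satisfies the template $AB$CD$EF$ exactly when the fixed substrings occur in order, with arbitrary (possibly empty) gaps between them. A small illustration of that equivalence in ordinary Python (this is not the cell-level mechanism):

    def matches_template(s, pieces=("AB", "CD", "EF")):
        # Scan left to right, locating each fixed piece after the previous one
        pos = 0
        for piece in pieces:
            found = s.find(piece, pos)
            if found < 0:
                return False
            pos = found + len(piece)
        return True

    print(matches_template("XYABLMNCDPEFWZ"))   # True, as in the text
    print(matches_template("XYABCDEF"))         # True (the gaps may be empty)
    print(matches_template("XYBACD"))           # False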
Find templates to fit a string

The case where a set of templates is stored in the
segment-sequential storage is considered next. The data
format for stored templates is shown in Table IV. The
parameter marker, $, is replaced in storage by use of
the PERMANENT bit in those words which contain a
character which is followed by a parameter marker.
A program to find templates to match the string XYABLMNCDPEFWZ is shown in Table V.
TABLE IV-Data Format for Template $AB$CD$EF$

WORD    CONTENTS
1       I/O Flags    (S),(P)
2       A
3       B            (P)
4       C
5       D            (P)
6       E
7       F            (P)

(S) indicates the START bit for this word is set.
(P) indicates the PERMANENT bit for this word is set.

TABLE V-Program to Find Templates for XYABLMNCDPEFWZ

NO.  TYPE  FUNCTION  COMPARAND  REMARKS
 1   SS              X          mark all strings which begin X or $.
 2   SS              Y          mark all strings which begin XY or $.
 3   SS              A
 4   SS              B
 5   SS              L
 6   SS              M
 7   SS              N
 8   SS              C
 9   SS              D
10   SS              P
11   SS              E
12   SS              F
13   SS              W
14   SS              Z
15   MS    1,S                  flag channel #1 for input and mark the start of each record.

The execution of this program illustrates how the PERMANENT bit is used. The X and Y searches do not find a match with the template shown in Table IV. However, since the PERMANENT bit in the first word in the record is set, the first word will remain marked by a
MATCH bit and therefore continue as a candidate for a
successful search.
The A and B searches cause the MATCH bit in the
word containing B to be set. Since this word also has its
PERMANENT bit set, the MATCH bit will remain set
during the searches for the remaining characters in the
input string (except for the last character). The search
continues in this fashion, with MATCH bits associated
with characters immediately followed by a parameter
marker being retained. This results in multiple string
searches within each record, corresponding to different
ways a given string may fit a template.
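To see how the PERMANENT bit keeps partial matches alive, the Table V program can be traced in software. The sketch below is a hypothetical Python rendering that again assumes simple character equality; it runs the fourteen SS searches against the template record of Table IV, and after the last search the word holding F is still marked because its PERMANENT bit is set, so the record qualifies.

    # Template $AB$CD$EF$ stored as in Table IV; "*" stands for the I/O flag word.
    chars = ["*", "A", "B", "C", "D", "E", "F"]
    perm  = [True, False, True, False, True, False, True]
    match = [True, False, False, False, False, False, False]  # set by the previous MS

    def ss(match, comparand):
        # One String Search step: M <- (R and M^-1) or (M and P)
        new = []
        for i, c in enumerate(chars):
            r = (c == comparand)
            m_prev = match[i - 1] if i > 0 else False
            new.append((r and m_prev) or (match[i] and perm[i]))
        return new

    for c in "XYABLMNCDPEFWZ":     # the program of Table V, one SS per character
        match = ss(match, c)

    print(match)        # the words holding B, D, and F remain marked
    print(any(match))   # True: the template fits the string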
The search process continues in this fashion up to the
last character in the input string. There are two ways
in which a template can satisfy this search: (1) the last
character in the template may match the last character
in the input string and the next-to-last character in the
template have its MATCH bit set, or (2) the last character in the template may have both its MATCH bit
and its PERMANENT bit already set. The last search
instruction in the program tests for both these conditions and at the same time resets the MATCH bits in
all characters which do not meet the conditions. The


last instruction in the program causes the records which
satisfy the search to be marked for input to the computer's core storage.
The examples above show that the segment-sequential
storage reduces the finding of matching templates to a
simple search. The time required to execute such a
search depends only upon the number of characters in
the query.
Examples of other possible applications of the segment-sequential storage are given in a report.10 One use
is retrieval of information necessary to display a portion
of a map. This is a typical problem encountered in
graphic displays, where a subset of the data base is to
be selected on the basis of x-y location. Another example is the use of the segment-sequential storage to
simulate networks of linear threshold devices.
CONCLUSIONS
This paper has presented a new architecture designed
to solve some of the problems in searching large data
bases. The examples given indicate its usefulness in
several practical applications. Since the system is built
around a relatively inexpensive storage medium, it is
feasible now. In the future, LSI techniques should make
its cellular organization even more attractive.
REFERENCES
1 P ARMSTRONG
Several patents


2 G J LIPOVSKI
On data structures in associative memories

Sigplan Notices Vol 6 No 2 pp 347-365 February 1971
3 G ESTRIN R H FULLER
Some applications for content-addressable memories

Proc FJCC 1963 pp 495-508
4 R G EWING P M DAVIES
An associative processor

Proc FJCC 1964 pp 147-158
5 G J LIPOVSKI
The architecture of a large associative processor

Proc SJCC 1970 pp 385-396
6 L HELLERMAN G E HOERNES
Control storage use in implementing an associative
memory for a time-shared processor

IEEE Trans on Computers Vol C-17 pp 1144-1151
December 1968
7 P T RUX
A glass delay line content-addressed memory system

IEEE Trans on Computers Vol C-18 pp 512-520
June 1969
8 I FLORES
A record lookup memory subsystem for software facilitation

Computer Design April 1969 pp 94-99
9 G J LIPOVSKI
The architecture of a large distributed logic associative memory

Coordinated Science Laboratory R-424 July 1969
10 L D HEALY G J LIPOVSKI K L DOTY
A context addressed segment-sequential storage

Center for Informatics Research University of Florida
TR 72-101 March 1972
11 P WEGNER
Programming languages, information structures, and
machine organization

McGraw-Hill 1968

A cellular processor for task assignments
in polymorphic, multiprocessor computers
by JUDITH A. ANDERSON
National Aeronautics & Space Administration
Kennedy Space Center, Florida

and
G. J. LIPOVSKI
University of Florida
Gainesville, Florida


INTRODUCTION
Polymorphic computer systems are comprised of a
large number of hardware devices such as memory
modules, processors, various input/ output devices,
etc., which can be combined or connected in a number
of ways by a controller to form one or several computers
to handle a variety of jobs or tasks. 1 Task assignment
and resource allocation in computer networks and
polymorphic computer systems are currently being
handled by software. It is the intent of this paper to
present a cellular processor which can be used for
scheduling and controlling a polymorphic computer
network, freeing some of the processor time for more
important functions. (See Figure 1.)
Work has been done in the area of using associative
memories and associative processors in scheduling and
allocation in multiprocessor systems. 2 ,3 Since the
scheduling process often involves a choice of hardware
resources which might do the job, a system able to detect "m out of n" conditions being met would be more
suited to the type of decision-making required. The
system to be discussed involves a threshold-associative
search; that is, all the associative searching performed
detects if at least m corresponding bits in both the associative cell and the comparand are one.
Scheduling and controlling can be divided into three
distinct phases. The first is task qualification, determining which tasks are possible with the available hardware.
The second phase is task assignment, deciding which of
the candidate tasks found to be qualified in the first
phase will be chosen to be performed next. The third
phase is the actual controlling or connection of the switch required to restructure the computer to perform the selected tasks.
This paper will be restricted to those functions performed by the cellular processor; in particular, the task qualification phase and the portions of the task assignment phase related to the cellular processor.

SCHEDULING
The method for ordering requests consists of storing
the queue of requests in a one-dimensional array of cells.
One request requires several contiguous cells for storage.
The topmost cells store the oldest request. New requests
are added to the bottom and are packed upward as in a
first-in, first-out stack. An associative search is performed over all the words stored to determine which requests qualify for assignment. The topmost request
which qualifies will be chosen for assignment. Using a
slightly more complex cell structure, a priority level
may be associated with each request, resulting in a
priority based, rather than chronological, method for
task assignment, providing for greater flexibility. The
priority-based system will not be discussed here, but
further detail relative to it may be found in a previous
report.4
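The scheduling discipline described above amounts to a first-in, first-out queue scanned from the top for the first entry whose resource needs can currently be met. A minimal sketch in Python follows (hypothetical; the qualification test is abstracted to a callable, and priorities are ignored, as in the text):

    from collections import deque

    class RequestQueue:
        def __init__(self):
            self.queue = deque()          # oldest request at the left (the "top")

        def submit(self, request):
            self.queue.append(request)    # new requests enter at the bottom

        def assign_next(self, qualifies):
            # Scan from the top; remove and return the first qualified request.
            for request in list(self.queue):
                if qualifies(request):
                    self.queue.remove(request)   # remaining entries "pack upward"
                    return request
            return None

    q = RequestQueue()
    q.submit({"id": 429, "needs": {"tape_drives": 3}})
    q.submit({"id": 430, "needs": {"tape_drives": 1}})
    available = {"tape_drives": 2}
    chosen = q.assign_next(lambda r: all(available.get(k, 0) >= v
                                         for k, v in r["needs"].items()))
    print(chosen)   # request 430, since the older request 429 cannot yet be satisfied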
METHOD OF OPERATION
The basic system consists of a minicomputer and a
cellular processor for task ordering. (See Figure 1.) Requests generally take the form of which processors are
required, how much memory is required, and which


peripheral devices and how many of each type are required to perform a particular task. These requests are
made to the minicomputer via a simple, low-volume
communication link, such as a shift register, data bus,
or belt. The minicomputer then formats the requests
into a request set which is explained below.

Figure 1-Polymorphic computer network controlled by cellular processor and minicomputer
The request set is given an identification word and is
input to the bottom of the task queue stored in the
cellular processor. This unit stores all the request sets
and determines which requests can be qualified for assignment based on current hardware availability. The
topmost request set in the cellular processor which
qualifies is chosen for assignment.
It is necessary for the processor to know which devices in the polymorphic computer system are not currently in use, and therefore are available for assignment.
To provide this information, each physical device in
the system has a bit associated with it in an Availability
Status Register. If a unit, such as a tape drive, is free,
its corresponding bit in the status register will be a
one. When the unit is in use, its corresponding bit will
be reset to a zero.
The requests are of the form indicating which type of
hardware devices are required, how many are required
and which, if any, particular physical units are required.
These requests can all be expressed as a Boolean AND
of threshold functions. Each request word will correspond to one threshold function, including the threshold value. The devices chosen from to meet that threshold value will be indicated with a one in its bit position. Let S be the status register and (Q)(T) be the request word, where Q is the binary vector representing a request and T is the binary number giving the threshold value. The output C of the threshold function may be expressed as

C ← (T ≤ Σ Q[I] ∧ S[I]), the sum taken over I = 1 to n.

TABLE I-Status Register Assignment
BIT      DEVICE
1, 2     Processors 1 and 2
3-6      Memory Units 1-4
7-12     Tape Drives 1-6
13       Line Printer
14       Disc
15       Card Reader
16       Card Punch
17, 18   CRT 1 and 2

A request set then consists of an identification word
and a word for each threshold function necessary to express the entire request.
Consider, for example, a system composed of the components or peripheral devices and the status register bit

Figure 2-Request set example

assignments shown in Table I. The status register in
this example would be 18 bits long.
A request would be of the sort that the required devices for Task Number 429 are Processor 1, CRT 1,
Tape Drive 1, and any two other tape drives, the Line
Printer, and any two memory units. This request set
would consist of four words, the ID word and three
request words, shown in Figure 2.
The threshold value of the ID word is set exactly
equal to the number of "1's" in the ID field. This is
for hardware considerations in order to do an associative
search on the ID words. All the units which are absolutely necessary (mandatory devices) can be compactly
represented by a single threshold request that implements the AND function. The first request word represents all such mandatory devices, whereas the second
and third request words represent "any two other tape
drives" and "any two memory units," respectively.


This request set, along with any other requests which
were made would be input to the queue. When all three
of the request words above could be satisfied with some
or all of the available hardware, an interrupt to the
minicomputer is generated. The minicomputer can then
read out the ID word of the topmost request set that
can be satisfied and is therefore qualified. If this request
set is the highest in the queue, it will be assigned. Whichever request set is read out will be removed from the
cellular processor and the requested resources allocated
for that task by the minicomputer.
HARDWARE DESCRIPTION
The hardware realization for this cellular processor
consists of a bilateral sequential iterative network. 5
That is, it is a one-dimensional array of cells, all cells


having the same structure and containing sequential
logic. Each cell receives external inputs as well as inputs
which are outputs from its adjacent cells as shown in
Figure 3.
Each cell stores either a request word or an ID word,
or it is empty. All cells receive hardware status information which is broadcast into them continuously for
comparison with their stored requests. When one or
more request set has qualified for assignment, an interrupt is generated to the minicomputer. A hardware
priority circuit chooses the topmost qualified request to
be assigned. The cellular processor outputs the ID
word for this request via a data channel which is set up
through all the cells above the cell containing the qualified request in the queue. When a request is chosen for
assignment, its ID word is broadcast to the cellular
processor for removal from the queue. A timer is associated with the uppermost cell in the array and is
used to indicate if requests are stagnating in the queue
so that action may be taken by the minicomputer.
Requests are always loaded into the bottom of the
queue. Removal is either from the top, when the timer
mentioned above exceeds some maximum value, or by
deletion after the request has been assigned. If a task
request is cancelled, it may be removed from the queue
by treating it as if it were assigned. When requests are
removed from the middle of the queue by assignment,
the other requests move upward to pack into the emptied cells.
Each cell is basically made up of an n-bit register, a
threshold comparator, two cell status flip flops, a data
channel, and combinational logic as shown in Figure 4.
The n-bit register is divided into two fields. The first k
bits, Q, store the binary vector representation of the
request or ID word. The last n-k bits represent the
threshold value, T, for the threshold comparator. The
threshold comparator, which will be discussed in more
detail later, outputs a one if and only if at least T
positions in both Q and the status input, S, are one's.
That is,
C ← (T ≤ Σ Q[I] ∧ S[I])

or,

C ← (0 ≤ (Σ Q[I] ∧ S[I]) - (Σ T[I] × 2^I)).

Figure 3-Cellular processor

The two cell status flip flops, TOP ff and D ff, indicate
whether a cell contains an ID word or not and whether
a cell contains data or is empty, respectively.
The data channel through each of the cells is used to
output information and for packing data to economize
on the number of pins per cell. The data flow in the
data channel is always upward, toward the top of the
queue. The data channel within a cell may be operated
in two ways. It may allow data coming into the cell on

706

Fall Joint Computer Conference, 1972

Figure 4-Basic iterative cell

the channel to pass through into the data channel of
the next cell and will be referred to as the bypass mode
of operation. Also, by means of an electronic switch, it
may place the contents of its register into the data
channel. This will be referred to as the transfer mode.
Through the use of the load enable of the register (the
clock input of the register flip flop), it is also possible to
load the register with the information which is in the
data channel. Operation of the data channel is controlled by the cell status, the control signals from the
minicomputer, and a compare rail, CT.
When a request is loaded into the cellular processor,
it enters via the data channel and is loaded adjacent to
the lowest cell containing data. This is determined by
the D ff output from the cells. Once a cell has data
loaded, its threshold comparator continuously compares the register contents, Q, against the status, S.
When a threshold compare has been achieved, that is,
T ≤ Σ S[I] ∧ Q[I],

a one is ANDed into the CT rail, which is propagating
upward, toward the top of the queue. When all the reque~t words in a set compare, the CT rail entering the
TOP cell of the request set is a logic one. This causes an
interrupt to be generated, indicating to the minicomputer that there is a qualified set. The interrupt, INT,
is placed into an OR tree external to the cell network to
speed the interrupt signal to the minicomputer to increase response time of the system. Upon receipt of the
interrupt, the minicomputer can interrogate the processor to determine which request set caused the INT
to be generated. The ID word of the topmost qualified
set is broadcast via the data channel, and stored in the
output register. The minicomputer can then remove the request set from the queue by placing the ID of that set on the status lines and commanding a set removal via the control lines. While a removal is being commanded, the set whose ID matches with the ID on the status lines resets its data flip flop, D ff, and passes a one along the R (reset) rail. This rail propagates in a downward direction and causes all cells to reset their D ff until a TOP cell is encountered. This removes the request set from the queue. There now is a group of empty cells in the middle of the stack of cells. When a cell containing data detects an empty cell above it, it places its data into the data channel and generates a pulse on the DR (data ready) rail. This pulse travels upward and enables the loading of data into the uppermost cell in the group of empty cells, that is, the first empty cell below a non-empty cell which it encounters. This is determined by D', the value of the D ff of the next higher cell. Each cell moves its data upward until all the empty cells are at the bottom of the queue.
The comparison operation is not stopped by the data being in the process of packing. The compare rail, CT, is passed through empty cells unless the DR rail is high, indicating data is actually in transit. An example of the switching of the data channel during the loading and shifting, or packing, process is shown in Figure 5.

Figure 5-Example of shifting and loading


Further details of the cell operation are given in an
earlier report.4 A method for implementing priority
handling was also discussed.


THRESHOLD COMPARATOR
Current literature on threshold logic discusses integrated circuit realizations of threshold gates with up to
25 inputs and with variable threshold values. 6, 7 The
threshold comparator mentioned earlier consists of a
threshold gate with variable threshold which is selected
by the contents of the threshold register. The inputs to
the threshold gate are the contents of the status register,
S, ANDed bit by bit with the contents of the cell request register, Q, as shown in Figure 6. All inputs are
weighted one.

Figure 6-Threshold comparator

If the number of inputs to the threshold gate is restricted to the 25 inputs indicated above, the hardware
realization discussed here must be modified to overcome
this restriction. In particular, the various types of resources can be divided into disjoint sets of similar or
identical devices such as memory units, processors,
I/O devices, etc. A request would not be made, for
instance, which would require either a tape drive or a
processor. Each set would then have a threshold value
associated with it and the compare outputs from all the
threshold gates would be ANDed to yield the cell compare output, as illustrated in Figure 7. For simplicity,
we will assume an ideal threshold element exists with an
unlimited number of gate inputs in our further discussion, which can be replaced as indicated above.
For large computer networks, the number of devices
will be large. Since the processor discussed here requires


more than 3n interconnections (pins) for each cell,
where n is the number of devices, a method of dividing
the cell into smaller modules which are feasible with
current technologies in LSI must be considered.

Figure 7-Modular threshold comparator
First, the cell must be split into modules of lower bit
sizes. This may be done as discussed previously by dividing the hardware devices into disjoint sets of similar
or identical devices. Each module or sub-cell will then
have a threshold associated with it and a threshold
comparator. One control sub-cell is also necessary which
will contain all the logic required for storing the cell
status, generating and propagating the rail signals, and
controlling the data channels in the other sub-cells in its
cell group. This is illustrated in Figure 8.
This modularity of cell design also allows the cellular
processor to be expandable. If the system requirements
demand a larger (more bit positions) cell, rather than
having to replace the entire cellular processor, an additional storage module may be added for each cell. This
also reduces the fabrication cost since only two cellular
modules would have to be designed regardless of the
number of devices in a system.

Figure 8-Modular cell structure (associative storage modules and control module)


CONCLUSION
The threshold associative cellular processor incorporates
a very simple comparison rule, masked threshold comparison. This rule was shown to be ideally suited to task
qualification in a polymorphic computer, or an integrated computer network like a polymorphic computer,
and was shown to be easily implemented in current
LSI technology.
The processor developed using this type of cell would
considerably enhance the cost effectiveness of polymorphic computers and integrated computer networks
by performing task requests and would reduce the software support otherwise required to poll the status of
devices in the polymorphic computer or an integrated
computer network. The scheme shown here will have
application to other task qualification problems as well,
such as a program sequencing scheme to order programs
or tasks based on a requirement for previous tasks to
have been performed. 4 This modular cellular processor
provides a system which can handle a wide range of
scheduling problems while retaining a flexibility for expansion and at the same time increasing speed by performing the parallel search rather than polling.

REFERENCES
1 H W GSCHWIND
Design of digital computers
Chapter 9 Springer Verlag 1967
2 D C GUNDERSON W L HEIMERDINGER
J P FRANCIS
Associative techniques for control functions in a multiprocessor,
final report
Contract AF 30(602)-3971 Honeywell Systems and
Research Division 1966
3 D C GUNDERSON W L HEIMERDINGER
J P FRANCIS
A multiprocessor with associative control
Prospects for Simulation and Simulator of Dynamic
Systems Spartan Books New York 1967
4 J A ANDERSON
A cellular processor for task assignments in a polymorphic
computer network
MS Thesis University of Florida 1971
5 F C HENNIE
Finite state models for logical machines
John Wiley & Sons New York 1968
6 J H BEINART et al
Threshold logic for LSI
NAECON Proceedings May 1969 pp 453-459
7 R 0 WINDER
Threshold logic will cut costs especially with boost from LSI
Electronics May 27 1968 pp 94-103

A register transfer module FFT processor for speech analysis
by DAVID CASASENT and WARREN STERLING
Carnegie-Mellon University

Pittsburgh, Pennsylvania


INTRODUCTION
On-line speech analysis systems are the subject of much
intensive research. Spectral analysis of the speech
pattern is an integral part of all such systems. To
facilitate this spectral analysis and the associated
preprocessing required, a special purpose fast Fourier
transform (FFT) processor to be described is being
designed and constructed. One unique feature of this
processor which facilitates both its design and implementation while providing an easily alterable machine
is its construction from standard logic modules which
will be referred to throughout as register transfer
modules or RTM's.l This design approach results in a
machine whose operation is easily understood due to
this modular construction.
Two of the prime advantages of such a processor are:

(1) The very low design, implementation, and debugging lead times which result from the RTM design at the higher register transfer logic level rather than at the conventional gate level.
(2) The RTM processor can be easily altered due to the pin-for-pin compatibility of all logic cards. Different hardwired versions of a given algorithm can be easily implemented by appropriate back plane rewiring.

Because of the stringent time constraints imposed by such a design effort, this processor can also serve as a feasibility model for the use of RTM's in other complex real-time systems. This is one area in which little work has been done.
When in operation, the processor will accept input data in the form of an analog speech signal and output the resultant spectral data to a PDP-11 computer for analysis.

FOURIER TRANSFORM APPLICATIONS TO SPEECH PROCESSING2

Let us briefly review Fourier transform techniques as used in speech processing.
In the discrete time domain, a segment of speech s(lT+nT) can be represented by

s(lT+nT) = p(lT+nT) * h(nT)     (1)

where * denotes discrete convolution and lT is the starting sample of a given segment of the speech waveform. p(lT+nT) is a quasiperiodic impulse train representing the pitch period and h(nT) represents the triple discrete convolution of the vocal-tract impulse response v(nT), with the glottal pulse g(nT) and radiation load impulse response r(nT),

h(nT) = v(nT) * r(nT) * g(nT)     (2)

The vocal tract impulse response is characterized by parameters called formant frequencies. These parameters vary with corresponding changes in the vocal tract as different sounds are produced; however, for short time spectrum analysis of speech waveforms, the formant frequencies can be considered constant.
Given the above speech model, speech analysis involves estimation of the pitch period and estimation of formant frequencies. These parameters are estimated using the cepstrum of a segment of a sampled speech waveform. For our purposes, the cepstrum is defined as the inverse discrete Fourier transform (IDFT) of the log magnitude spectrum of the speech waveform segment. The details of cepstral analysis are shown in Figure 1. The input speech segment to Figure 1, s(lT+nT), typically about 20 msec in duration, is weighted by a symmetric window function w(nT)

x(nT) = s(lT+nT)w(nT) = [p(lT+nT) * h(nT)]w(nT),  0 ≤ n < L
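As a point of reference, the cepstral analysis defined above is easy to express in software. The following is a rough numpy sketch, an illustration only and not the RTM implementation; the 256 point window and 10 kHz sampling rate follow the figures quoted later in the paper, and the input signal is a stand-in.

    import numpy as np

    fs = 10_000                      # sampling rate, Hz
    n = np.arange(256)               # about 25.6 msec of speech at 10 kHz
    # A stand-in "speech" segment: two decaying harmonics (purely illustrative)
    s = np.exp(-n / 200.0) * (np.sin(2 * np.pi * 200 * n / fs) +
                              0.5 * np.sin(2 * np.pi * 400 * n / fs))

    w = np.hamming(len(s))           # symmetric window w(nT)
    x = s * w                        # x(nT) = s(lT + nT) w(nT)

    X = np.fft.fft(x)                # discrete Fourier transform
    log_mag = np.log(np.abs(X) + 1e-12)
    cepstrum = np.fft.ifft(log_mag).real   # IDFT of the log magnitude spectrum

    print(cepstrum[:8])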
Figure 4-The real-valued input FFT algorithm for N = 16 (* denotes complex conjugate)


Figure 5-The complex calculation (* denotes complex conjugate; N = number of samples). The inputs are a + ib and c + id, the complex multiplier is Wm = cos(2πm/N) + i sin(2πm/N), and the outputs a' + ib' and c' + id' are

a' = a + (c cos(2πm/N) + d sin(2πm/N))
b' = b + (d cos(2πm/N) - c sin(2πm/N))
c' = a - (c cos(2πm/N) + d sin(2πm/N))
d' = -[b - (d cos(2πm/N) - c sin(2πm/N))]

Computer calculations using this algorithm yielded a
maximum error computed at critical values and extrema
which ranges between -0.00782 and 0.0094. The
coefficients of f(x) were chosen for ease of binary implementation.
FFT ALGORITHM FOR REAL-VALUED INPUT
Various FFT algorithms exist. One particularly
adaptable to RTM implementation will be briefly
reviewed. The complex discrete Fourier transform of a
sampled time series x(k) (k = 0, ..., N-1) can be written as

X(j) = (1/N) Σ(k=0 to N-1) x(k) e^(-i2πjk/N)     (13)

It has been shown4 that when the x(k) series is real, Re[X(j)] is symmetric about the folding frequency Ff and Im[X(j)] is antisymmetric about Ff. Figure 3 shows this pictorially.
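This symmetry is easy to confirm numerically: for a real series the coefficients above the folding frequency are complex conjugates of those below it, so only N/2 + 1 of them need be computed. A small numpy check, for illustration only:

    import numpy as np

    N = 16
    x = np.random.rand(N)            # a real-valued series
    X = np.fft.fft(x)

    for j in range(1, N // 2):
        # Re[X] symmetric, Im[X] antisymmetric about the folding frequency
        assert np.allclose(X[N - j], np.conj(X[j]))

    print("only", N // 2 + 1, "coefficients are independent")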
An algorithm5 which eliminates calculations that will lead to redundant results in the real-valued input case has previously been discussed. Figure 4 graphically illustrates this algorithm for N = 16. The algorithm can be represented by the expression

X(j) = Σ(k=0 to N-1) B0(k) W^(-jk)     (14)

where W = e^(2πi/N); B0(k) is real; j = 0, 1, ..., N/2; and N = 2^m where m is an integer.
The "complex calculation" shown in Figure 4 is a
slight modification of the butterfly multiply6 normally
used in FFT algorithms. Details of the calculation are
shown in Figure 5, from which the signal flow is apparent. Each complex calculation box, as shown, moves to
the right to operate on all operands within its group.
On the first level, this box performs eight computations,
on the second level each box performs 4 calculations,
etc.
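Written out, the complex calculation takes two stored pairs and one stored multiplier and produces two new pairs. A sketch in Python, assuming the equations reconstructed for Figure 5 above (the hardware works on fixed-point words, which is ignored here):

    import math

    def complex_calculation(a, b, c, d, m, N):
        """One 'complex calculation' box: operands (a, b) and (c, d), multiplier W**m."""
        cos_t = math.cos(2 * math.pi * m / N)
        sin_t = math.sin(2 * math.pi * m / N)
        u = c * cos_t + d * sin_t
        v = d * cos_t - c * sin_t
        return a + u, b + v, a - u, -(b - v)

    # Degenerate case W**0 = 1 reduces to a+c, b+d, a-c, d-b, as quoted later in the text
    print(complex_calculation(1.0, 2.0, 3.0, 4.0, m=0, N=16))   # (4.0, 6.0, -2.0, 2.0)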
Since the multiplications are ordered as above,
addressing for this multiplier is fairly straightforward.
For ease in accessing the complex multiplier Wm, its
complex values should be stored in the order in which
they occur. An algorithm for determining the sequence
of the exponent m has been documented, and a set of
recursive equations which specify the addresses of the
four operands for every complex calculation can be
formulated. s The address sequencing is easily implemented in a hardware unit for automatic generation of
the required addresses in the proper sequence.
It is apparent from Figure 4 that all complex calculations involving one complex multiplier Wm can be
completed before the next complex multiplier is used. 7
For example, all calculations involving W0 can be completed on all 3 levels, then all calculations involving
W2, etc. In the conventional method all calculations on
one level are completed before dropping to the next
level. If the complex multipliers are stored in their
accessed order, there is no need to explicitly store the
sequence of exponents. Furthermore, each complex
exponent in this addressing scheme need be accessed
only once.
As in the conventional FFT implementation, the
resultant Fourier coefficients must be re-ordered. With
the accessing order of the complex multipliers specified
by a linear array A, the exponent m for the ith W is
given by m=A (i). An inverse table look-up enables the
scrambled Fourier coefficients to be accessed from
memory in the order of ascending frequency. To implement this inverse table look-up, the location N of the
ith harmonic is found from the value m in the array A
and by using its position in the array as the value of N.
TABLE II-Formulas for Calculating the Number of Operations in FFT Algorithms

                 Real Multiplications    Real Additions
complex inputs   (2m - 7)N + 12          (3m - 3)N + 4
real inputs      (m - 3.5)N + 6          (1.5m - 2.5)N + 4
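For the 256 point transforms used here (N = 256, m = 8), the formulas of Table II, pairing the smaller counts with real inputs (consistent with the roughly one-half figure quoted in the text), give the following totals:

    N, m = 256, 8

    real_mults_complex_in = (2 * m - 7) * N + 12     # 2316
    real_adds_complex_in  = (3 * m - 3) * N + 4      # 5380
    real_mults_real_in    = (m - 3.5) * N + 6        # 1158
    real_adds_real_in     = (1.5 * m - 2.5) * N + 4  # 2436

    print(real_mults_complex_in, real_adds_complex_in)
    print(real_mults_real_in, real_adds_real_in)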


Figure 6-FFT processor data flow (analog speech signal; multiply 256 points by Hamming window; 256 point FFT; 256 point inverse FFT forming the cepstrum; output to PDP-11). Boxed area denotes future extension of the processor

In implementation, the sequence of locations is, for
convenience, stored separately.
Table II below compares the number of operations, and consequently, the speed, of the conventional Cooley-Tukey radix-2 FFT algorithm for complex inputs, and the FFT algorithm for real inputs.5 In the formulas N = 2^m, where N is the number of samples. These formulas assume special cases such as exp(i0) are calculated as simply as possible. About ½ the number of operations are required for real inputs as for complex inputs, owing to the elimination of redundant calculations. As explained previously, the algorithm can be streamlined further by sequencing through the complex multipliers rather than across each level. A software version of these techniques has been implemented7 and has achieved a real-time processing speed of 10,300 samples/sec. This is the equivalent of one 256 point FFT every 25 msec. The minimum speech processing speed required for this system is one 256 point FFT every 10 msec. It is evident that speeding up the algorithm requires hardwiring the complex calculation and address generation.

TABLE III-Description of RTM Modules

Module       Function
K.bus        controls asynchronous timing of sequential operations
T.a/d        analog to digital converter
DM.bool      boolean flags
DM.const     4 word read only memory
DM.gpa       general purpose arithmetic unit
DM.ii        general purpose input interface
DM.index     FFT address generator
DM.mult      multiply unit
DM.oi        general purpose output interface
DM.pdp-11    PDP-11 interface
DM.tr        temporary storage register
M.array      read/write memory; ~2 μsec access time
M.sp         read/write scratch pad memory; ~500 nsec access time

PROCESSOR DATA FLOW
Figure 6 shows the logical flow of data through the processor. The "Future Extension" section will not be implemented initially. Instead the log magnitude of the spectrum will be transferred to a PDP-11. At this point the spectral envelope can be extracted by digital recursive filtering techniques rather than by cepstral smoothing. This approach adequately demonstrates the feasibility of a real-time RTM processor.
The analog speech signal is sampled at 10 kHz and stored in a buffer. When 256 8-bit words have been accumulated, they are weighted by a Hamming window. A 256 point FFT is then performed on these weighted samples. This results in only 129 complex values since the FFT algorithm for real-valued inputs generates harmonics only through the folding frequency. The binary logarithm of the magnitude of each of these 129 complex values is then calculated and the result transferred to a PDP-11.
During processing, the buffer must continually store the input samples. After the third group of 128 samples has been stored, samples 128 thru 383 are weighted by the window and processed. Although a 256 point FFT is performed, the window is shifted by only 128 words each time, thus including each sample in 2 FFT calculations, each time with a different weighting factor.

Figure 7-Block diagram of FFT processor. Bus 1 samples and buffers speech signal. Bus 2 performs FFT. Bus 3 calculates binary logarithm and interfaces to a PDP-11


The first FFT thus operates on samples 0-255, the second FFT on samples 128-383, the third on 256-511, etc. In the actual machine a 384 word ring buffer memory is used to achieve the sequencing of the blocks of 128 samples.
The time constraints on the system are easily tabulated. In the 12.8 msec used to sample 128 words the following three operations must be performed:
(1) The Hamming window must be applied,
(2) The 256 point FFT performed, and
(3) The log magnitude of each harmonic calculated.

RTM LEVEL DESIGN
A block diagram of the processor structure is shown in Figure 7. It is a three bus system with each of the above operations performed on a separate bus. Figure 8 shows the specific RTM modules used; Table III describes the modules.

Figure 8-RTM structure of FFT processor. The modules are described in Table III

With the exception of DM.mult and DM.index, the data modules shown in Figure 8 are all standard RTM's. The functions of the two nonstandard modules are outlined below and illustrated in Figure 9.

Figure 9-(a) DM.mult-multiply unit; (b) DM.index-FFT address generator. The DM.index control lines are described in Table IV

TABLE IV-Description of Control Lines for Indexing Unit DM.index

control line    function
initialize      initialize indexing unit
increment       calculate next 4 operand addresses for complex calculation
bus ← A1        load 1st address on bus
bus ← A2        load 2nd address on bus
bus ← A3        load 3rd address on bus
bus ← A4        load 4th address on bus
done            signals end of calculations involving one complex multiplier
end             signals end of FFT

DM.mult

This module multiplies the two 16 bit positive
numbers in registers A and B. Any 16 bits of the 32 bit
result can be placed on the bus. The multiplier was
implemented using Fairchild 9344 2 X 4 bit multipliers.
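The effect of DM.mult can be mimicked with ordinary integer arithmetic: multiply two 16 bit positive numbers and place any 16 consecutive bits of the 32 bit product on the bus. A sketch (hypothetical Python; the bit field shown is only an example choice):

    def dm_mult(a, b, low_bit=15):
        """Multiply two 16-bit positive operands; return 16 bits of the 32-bit product,
        starting at bit position low_bit (bit 0 is the least significant)."""
        assert 0 <= a < 1 << 16 and 0 <= b < 1 << 16
        product = a * b                      # up to 32 bits
        return (product >> low_bit) & 0xFFFF

    print(hex(dm_mult(0x4000, 0x4000)))      # 0x2000: a scaled slice of 0x10000000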
DM.index

High speed hardware indexing units for FFT operand
address generation have been presented in the literature. 8
This module generates the addresses of the four operands
of every complex calculation during the FFT. It is a
hardware implementation of the recursive equations
for the FFT algorithm for real value inputs discussed
previously. It was designed to sequence through all
calculations involving one complex multiplier. Table IV
defines the control lines shown in Figure 9 (b) .
The four 8-bit registers, AI, A2, A3 and A4 hold the
addresses of the four operands. These registers do not
physically exist since the addresses are generated
combinatorily upon command; they are defined for
logical purposes only.
Figure 10 shows the timing diagram of the processor.
All arithmetic operations, register transfers, and
memory accesses involve use of the bus, which has a
settling time of 500 nsec. Therefore, the average speed
of any operation is 500 nsec. This value was used in
calculating the processing times shown in Figure 10.
For example, approximately 13,000 operations are
required to perform each 256 point FFT on bus 2. The
processing time, therefore, is 6.5 msec. Bus 1 is continually buffering data, however, only 1.5 msec of 1
processor cycle (12.8 msec) are spent windowing 256 samples and transferring them to bus 2. Bus 2 spends 1.5 msec simultaneously accepting data from bus 1, calculating the magnitude of the harmonic components and transferring the results to bus 3. 6.5 msec are spent calculating the FFT. This leaves 4.8 msec (12.8 - 1.5 - 6.5) of dead-time during each processor cycle; time when no processing occurs on bus 2. Bus 3 spends 1.5 msec accepting data from bus 2, and 5.3 msec simultaneously calculating the logarithm of 129 samples and transferring them to the PDP-11. This leaves 6 msec of dead-time on bus 3. It is clear that bus 2 carries the heaviest processing load; therefore, bus 2 dead-time determines that a speed margin of 4.8 msec exists; that is, the processor completes processing each set of 256 samples 4.8 msec faster than needed to maintain real-time operation.

Figure 10-Processor timing diagram

Accuracy

The question of accuracy always arises for a processor operated in fixed point mode. As noted previously,5
distribution of the 1/N normalization factor over the
entire transform constrains the magnitudes of the
operands at each level to prevent overflows. The only
overflow possibility occurs during the calculation of the
magnitude of the Fourier coefficients. When overflow
occurs (positive or negative), the largest (positive or
negative) number will be chosen.
Simulation runs to determine the effect of multiplier
size on accuracy were conducted. A 16 X 16 bit multiplier was used in conjunction with the fixed point FFT
described to process actual speech signal samples. For
audible speech, accuracy of 1 percent relative mean
square error was achieved when compared to floating
point results. The same simulation using a 12 X 12 bit
multiplier resulted in an error of 6 percent. For signals
of small magnitude (such as the signal generated by
silence) the error for the 16 X 16 bit multiplier rose to
25 percent; however, this is acceptable for processing
the silence signal. For comparison, previous published
accuracy results for a 16 X 16 bit multiplier and similar
FFT algorithm7 showed a maximum error of ±0.012
percent fullscale with a standard deviation of ±0.004
percent fullscale. On the basis of these results, the
12 X 12 bit multiplier was considered too inaccurate;
therefore, the 16 X 16 bit multiplier was chosen.
RTM control

RTM control logic is designed with 2 basic modules:
1. Ke: a module which initiates arithmetic operations, data transfers between registers, and
memory read/write cycles.


2. Kb: a module which chooses a control branch
based on the value of a boolean flag.
With these modules the control for executing an
algorithm can be specified in a manner quite similar to
programming the algorithm in a high level programming
language. This greatly simplifies the design of the
control, thus resulting in a significant reduction in
design time.
This concept can easily be illustrated by investigating
a section of bus 2 control. This particular section controls the complex calculation for the degenerate case of
W0, that is, when the complex multiplier is 1 + i0. For
this case the equations shown in Figure 5 reduce to

a' =a+c
b'=b+d
c'=a-c
d'=d-b
A and B are general purpose arithmetic unit registers;
INDEX is a storage register used for sequencing the
counter through the 64 complex multipliers; ONE is a
constant generator containing a "1"; and MA1 and
MB1 are memory address and buffer registers, respectively. The control for this series of complex calculations is then:
Ke (L←1; initialize)
Ke (INDEX←0)
Kb (done): if done, branch to the next control section; otherwise continue:
Ke (MA1←A1; read)
Ke (A←MB1)
Ke (MA1←A2; read)
Ke (B←MB1)
Ke (MB1←(A-B)/2; write)
Ke (MA1←A1)
Ke (MB1←(A+B)/2; write)
Ke (MA1←A3; read)
Ke (B←MB1)
Ke (MA1←A4; read)
Ke (A←MB1)
Ke (MB1←(A-B)/2; write)
Ke (MA1←A3)
Ke (MB1←(A+B)/2; write)
Ke (increment)

By dividing the results of each complex calculation by
2, the 1/N normalization factor can be distributed over
the entire calculation.
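In software the degenerate W0 step, with the divide-by-two scaling folded in, looks roughly like this (hypothetical Python mirroring the register transfers above; memory is a plain list and A1-A4 are indices):

    def w0_step(mem, a1, a2, a3, a4):
        """The W**0 complex calculation with the 1/2 normalization applied at each level."""
        a = mem[a1]; c = mem[a2]
        mem[a2] = (a - c) / 2        # c' = (a - c)/2
        mem[a1] = (a + c) / 2        # a' = (a + c)/2
        b = mem[a3]; d = mem[a4]
        mem[a4] = (d - b) / 2        # d' = (d - b)/2
        mem[a3] = (d + b) / 2        # b' = (b + d)/2
        return mem

    print(w0_step([1.0, 3.0, 2.0, 4.0], 0, 1, 2, 3))   # [2.0, -1.0, 3.0, 1.0]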
The control section for the remaining complex calculations is, of course, more complex requiring 46 Ke
and 7 Kb, but its design and implementation remain
straightforward. To accomplish control of all operations
on bus 2, including accepting data from bus 1, executing
the FFT, calculating the magnitudes of the Fourier
coefficients, and transferring data to bus 3, about 120 Ke
and 20 Kb were used.
FUTURE EXTENSIONS
The speech processing application for this processor
involves an initial Fourier transform, a second Fourier
transform to obtain the cepstrum and an inverse
Fourier transform. Figure 6 shows data flow for the
proposed final form of the pipeline processor.
The present system is memory limited because 14 bus
transfers in and out of memory are required for every
complex calculation. Approximately 500 nsec are
required for a bus transfer; 250 nsec to load data on the
bus and 250 nsec to read data from the bus. Faster
memory and bus systems can decrease this portion of
the processing time. The processor fulfills both the
overall goal of a modular FFT computer to meet the
minimum processing rate of 10K data samples/sec,
and attain accuracy of 1 percent relative mean square
error necessary for speech analysis. This was done using
existing RTM's with only 2 new modules required.
It should be emphasized that while the processor
performs a specialized function (calculating the FFT) ,
the RTM modules themselves, with the exception of
DM.index, are general and can be used to implement
any processor. In fact, since only the back plane wiring
determines the characteristics of the processor, one set
of RTM modules can be shared among many processors,
if the processors will not be used simultaneously. This
can result in substantial savings over the purchase or
construction of several complete processors.
Along these lines, it would be advantageous to
develop more complex but still general RTM modules.
Specifically, a generalized micro-programmed LSI RTM
module could be coded to implement the entire complex
calculation, the FFT address generator, or any other
algorithm on a single card. The complex calculation is
an area where the system's speed can be significantly
improved. At present, 46 bus transfers are required for
each complex calculation. This number could be reduced by a factor of 3 by constructing one card to
perform the entire complex calculation. The present


system's specifications did not require such improvements and the RTM design concepts were used to
investigate various system designs using existing
modules rather than constructing an entire system from
the start.

SUMMARY
This paper has reviewed the basic FFT algorithms and
presented a method by which a relatively sophisticated
piece of hardware such as an FFT processor could be
designed at the register transfer level in a much shorter
time than required in a conventional gate level design.
The simplicity of this modular construction has permitted a fairly in-depth view of the processor. The
resultant product and its method of implementation are
rather unique in that they combine the convenience of
a control logic that is similar in structure to software
algorithms with the processing speed of a completely
hard-wired algorithm.
ACKNOWLEDGMENTS
The authors wish to acknowledge the assistance of Lee
Bitterman, and Professors Gordon Bell and Raj Reddy


of CMU in the design and implementation of this
FFT processor.
REFERENCES
1 C G BELL et al
The description and use of register transfer modules (RTM's)
IEEE Transactions on Computers Vol C-21 1972
2 R W SCHAFER L R RABINER
System for automatic formant analysis of voiced speech
The Journal of the Acoustical Society of America Vol 47
No 2 1970
3 E L HALL et al
Generation of products and quotients using approximate binary
logarithms for digital filtering application
IEEE Transactions on Computers Vol C-19 1970
4 G D BERGLAND
A guided tour of the fast fourier transform
IEEE Spectrum Vol 6 1969
5 G D BERGLAND
A fast fourier transform algorithm for real valued series
Communications of the ACM Vol 11 1968
6 B GOLD et al
The FDP, a fast programmable signal processor
IEEE Transactions on Computers Vol C-20 No 1 1971
7 J W HARTWELL
A procedure for implementing the fast fourier transform on
small computers
IBM Journal of Research and Development Vol 15 1971
8 W W MOYER
A high-speed indexing unit for FFT algorithm implementation
Computer Design Vol 10 No 12 1971

A systematic approach to the design of digital
bussing structures *
by KENNETH J. THURBER, E. DOUGLAS JENSEN, and LARRY A. JACK
Honeywell, Inc.
St. Paul, Minnesota

and
LARRY L. KINNEY, PETER C. PATTON, and LYNN C. ANDERSON
University of Minnesota
Minneapolis, Minnesota

INTRODUCTION


Busses are vital elements of a digital system-they
interconnect registers, functional modules, subsystems,
and systems. As technological advances raise system
complexity and connectivity, busses are being recognized as primary architectural resources which can frequently be the limiting factor in performance, modularity, and reliability. The traditional view of bussing
as just an ad hoc way of hooking things together can no
longer be relied upon to produce even viable much less
cost-effective solutions to these increasingly sophisticated interconnect problems.
This paper formulates a more systematic approach
by abstracting those bus parameters which are common to all levels of the system hierarchy. Every bus,
whether it connects registers or processors, can be characterized by such factors as type and number, control
method, communication mechanism, data transfer conventions, width, etc. Evaluating these parameters in
terms of the preliminary functional requirements and
specifications of the system constitutes an efficient
procedure for the design of a cost-effective bus structure.

BUS STRUCTURE PARAMETERS
Each of these bus structure parameters involves a variety of interrelated tradeoffs, the most important of which are considered below.

Type and number of busses
Busses can be separated into two generic types: dedicated and nondedicated.

Dedicated busses

A dedicated bus is permanently assigned to either
one function or one physical pair of devices. For example, the Harvard class computer characterized by
Figure 1 has two busses, each of which is dedicated according to both halves of the definition. One bus supplies procedure to the processor, the other provides
data. If there were multiple procedure memory modules
on the procedure bus, that bus would be functionally
but not physically dedicated. The concept of "function" is hierarchical rather than atomic; in the sense
that the procedure bus of Figure 1 carries both addresses and operands, it could be viewed as physically
but not functionally dedicated. This dichotomy is reversed in Figure 2, which illustrates another form of
Harvard class machine. In this case, one bus is functionally dedicated to addresses and the other to operands. They are undedicated from the standpoint of
data/procedure separation, and physically undedicated
as well.
The principal advantage of a dedicated bus is high
throughput, because there is little, if any, bus contention (depending on the type and level of dedication).
As a result, the bus controller can be quite simple compared to that of a non-dedicated bus. Also, portions of
the communication mechanism which must be explicit
in undedicated busses may be integral parts of the


* This work was supported in part by the Naval Air Development
Center, Warminster, Pa., under Navy contract number N6226972-C-0051.


Figure 1-Harvard class computer with dedicated procedure and data busses

devices on a dedicated bus: addresses may be unnecessary, and the devices may automatically be in sync.
A system may include as many undedicated busses
as its logical structure and data rates require, to the extreme of one or more busses between every pair of devices (Figure 3).
A major disadvantage of dedicated busses is the cost
of the cables, connectors, drivers, etc., and of the
multiple bus interfaces (although the interfaces are
generally less complex than those for nondedicated
busses). If reliability is a concern, the busses must be
replicated to avoid potential single-point failures.
Dedicated busses do not often support system modularity, because to add a device frequently involves
adding new interfaces and cables.
Non-dedicated busses

Non-dedicated busses are shared by multiple functions and/or devices. As pointed out earlier, busses may
be functionally dedicated and physically non-dedicated,
or vice versa. The Princeton class computer of Figure 4
illustrates a commonly encountered type of single bus structure which is not dedicated on either a functional or a physical basis. The interesting case of multiple, system-wide, functionally and physically non-dedicated busses is seen in Figure 5. Here every device can communicate with every other device using any bus, so the failure of a bus interface to some device simply reduces the number of busses (but not devices) remaining available to that device.

Figure 3-Adding a device to a non-dedicated bus structure
The crossbar matrix is a form of non-dedicated bus
structure for connecting any element of one device
class (such as memories) to any element of another
(such as processors). It can be less efficiently used to
achieve complete connectivity between all system devices. The crossbar can be very complex to control, and
the number of switches increases as the square of the
number of devices, as shown in Figure 6. It also suffers
from the disability that failure of a crosspoint leaves no
alternative path between the corresponding devices.
By adding even more hardware, the crossbar switch
can be generalized to a code-activated network (analogous to the telephone system) in which devices seek
their own paths to each other.

Figure 2-Harvard class computer with dedicated address and operand busses

Figure 4-Princeton class computer with a single non-dedicated bus

Figure 5-Multiple, system-wide, non-dedicated busses

Another relatively unconventional non-dedicated bus structure is the permutation or sorting network which can connect N devices to N other devices. The sorting network may be implemented with memory or gating, but in either case if all N! permutations are allowed, the hardware is extensive for anything but very small N's.

Non-dedicated busses offer modularity as their main advantage, in that devices generally may be added to them more easily than to dedicated busses. Multiple busses such as those in Figure 5 not only increase bandwidth but also enhance reliability, rendering the system fail-soft. While non-dedicated busses avoid the proliferation of cables, connectors, drivers, etc., they do exact a toll in usage conflict. Bus allocation requires logic and time, and if this time cannot be masked by data transfers, the bus bandwidth and/or assignment algorithm may have to be compromised. Furthermore, the devices which desire but do not obtain the bus must wait for another opportunity to contend for it.

The communication technique is usually more complex for non-dedicated busses, because devices must be explicitly addressed and synchronized.

Bus control techniques

When a bus is shared by multiple devices, there must
be some method whereby a particular unit requests and
obtains control of the bus and is allowed to transmit
data over it. The major problem in this area is resolution
of bus request conflicts so that only one unit obtains
the bus at a given time. The different control schemes
can be roughly classified as being either centralized or
decentralized. If the hardware used for passing bus control from one device to another is largely concentrated
in one location, it is referred to as centralized control.
The location of the hardware could be within one of the
devices which is connected to the bus, or it could be a
separate hardware unit. On the other hand, if the bus
control logic is largely distributed throughout the different devices connected to the bus, it is called decentralized control.
The various bus control techniques will be described
here in terms of distinct control lines, but in most cases
the equivalent functions can be performed with coded
transfers on the bus data lines. The basic tradeoff is
allocation speed versus total number of bus lines.

Centralized bus control

With centralized control, a single hardware unit is
used to recognize and grant requests for the use of the
bus. At least three different schemes can be used, plus
various modifications or combinations of these:
• Daisy Chaining
• Polling
• Independent Requests.
Centralized Daisy Chaining is illustrated in Figure 7.

Figure 6-Adding devices to a crossbar bus

Figure 7-Centralized bus control: daisy chain


Figure 8a-Centralized bus control: polling with a global counter

Each device can generate a request via the common
Bus Request line. Whenever the Bus Controller receives a request on the Bus Request line, it returns a
signal on the Bus Available line. The Bus Available line
is daisy chained through each device. If a device receives the Bus Available signal and does not want control of the bus, it passes the Bus Available signal on to
the next device. If a device receives the Bus Available
signal and is requesting control of the bus, then the
Bus Available signal is not passed on to the next
device. The requesting device places a signal on the
Bus Busy line, drops its bus request, and begins its
data transmission. The Bus Busy line keeps the Bus
Available line up while the transmission takes place.
When the device drops the Bus Busy signal, the Bus
Available line is lowered. If the Bus Request line is
again up, the allocation procedure repeats.
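The positional priority that results can be sketched in a few lines of Python (an illustrative model only; the request pattern and device numbering are assumptions, not part of any particular design):

# Minimal sketch of centralized daisy-chain allocation: the grant ripples
# down the chain and is absorbed by the first requesting device, so devices
# nearer the Bus Controller always win ties.

def daisy_chain_grant(requests):
    """requests[i] is True if device i is asserting Bus Request."""
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            return position          # grant absorbed here, not passed on
    return None                      # no requester; Bus Available ripples off the end

print(daisy_chain_grant([False, True, False, True]))   # -> 1 (the "closer" device wins)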
The Bus Busy line can be eliminated, but this essentially converts the bus control to a decentralized Daisy
Chain (as discussed later).
The obvious advantage of such a scheme is its simplicity: very few control lines are required, and the number
of them is independent of the number of devices; hence,
additional devices can be added by simply connecting
them to the bus.
A disadvantage of the Daisy Chaining scheme is its
susceptibility to failure. If a failure occurs in the Bus
Available circuitry of a device, it could prevent succeeding devices from ever getting control of the bus or
it could allow more than one device to transmit over
the bus at the same time. However, the logic involved
is quite simple and could easily be made redundant to
increase its reliability. A power failure in a single device
or the necessity to take a device off-line can also be
problems with the Daisy Chain method of control.
Another disadvantage is the fixed priority structure
which results. The devices which are "closer" to the
Bus Controller always receive control of the bus in
preference to those which are "further away". If the

closer devices had a high demand for the bus, the further
devices could be locked out.
Since the Bus Available signal must sequentially
ripple through the devices, this bus assignment mechanism can also be quite slow.
Finally, it should be noted that with Daisy Chaining,
cable lengths are a function of system layout, so adding,
deleting, or moving devices is physically awkward.
Figure 8a illustrates a centralized Polling system. As
in the centralized Daisy Chaining method, each device
on the bus can place a signal on the Bus Request line.
When the Bus Controller receives a request, it begins
polling the devices to determine who is making the request. The polling is done by counting on the polling
lines. When the count corresponds to a requesting
device, that device raises the Bus Busy line. The controller then stops the polling until the device has completed its transmission and removed the busy signal.
If there is another bus request, the count may restart
from zero or may be continued from where it stopped.
Restarting from zero each time establishes the same
sort of device priority as proximity does in Daisy Chaining, while continuing from the stopping point is a roundrobin approach which gives equal opportunity to all
devices. The priorities need not be fixed because the
polling sequence is easily altered.
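The difference between the two restart policies can be made concrete with a small Python sketch (illustrative only; the poll function and request pattern are assumptions, not part of this design):

# Poll by counting device codes from 'start'; the first requesting code wins.
def poll(requests, start):
    n = len(requests)
    for i in range(n):
        code = (start + i) % n
        if requests[code]:
            return code, (code + 1) % n    # winner, and where a round-robin poll would resume
    return None, start

requests = [False, True, False, True]
# Restarting from zero each time gives a fixed priority by device code:
print(poll(requests, 0)[0])                # -> 1 every time
# Continuing from the stopping point gives round-robin service:
winner, resume = poll(requests, 0)         # -> device 1, resume at code 2
print(poll(requests, resume)[0])           # -> 3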
The Bus Request line can be eliminated by allowing
the polling counter to continuously cycle except while
it is stopped by a device using the bus. This alternative
impacts the restart (i.e., priority) philosophy, and the
average bus assignment time.
Polling does not suffer from the reliability or physical
placement problems of Daisy Chaining, but the number of devices in Figure 8a is limited by the number of
polling lines. Attempting to poll bit-serially involves
synchronous communication techniques (as described
later) and the attendant complications.
Figure 8b shows that centralized Polling may be
made independent of the number of devices by placing
a counter in each device. The Bus Controller then is
reduced to distributing clock pulses which are counted

Figure 8b-Centralized bus control: polling with local counters


by all devices. When the count reaches the code of a
device wanting the bus, the device raises the Busy line
which inhibits the clock. When the device completes
its transmission, it removes the Busy signal and the
counting continues. The devices can be serviced either
in a round-robin manner or on a priority basis. If the
counting always continues cyclically when the Busy
signal is removed, the allocation is round-robin, and if
the counters are all reset when the Busy signal is removed, the devices are prioritized by their codes. It is
also possible to make the priorities adaptive by altering
the codes assigned to the devices. The clock skew
problems tend to limit this technique to small slow
systems; it is also exceptionally susceptible to noise and
clock failure.
Polling and Daisy Chaining can be combined into
schemes where addresses or priorities are propagated
between devices instead of a Bus Available signal. This
adds some priority flexibility to Daisy Chaining at the
expense of more lines and logic.
The third method of centralized bus control, Independent Requests, is shown in Figure 9. In this case
each device has a separate pair of Bus Request and Bus
Granted lines, which it uses for communicating with
the Bus Controller. When a device requires use of the
bus, it sends its Bus Request to the controller. The controller selects the next device to receive service and
sends a Bus Granted to it. The selected device lowers
its request and raises Bus Assigned, indicating to all
other devices that the bus is busy. After the transmission is complete, the device lowers the Bus Assigned
line and the Bus Controller removes Bus Granted and
selects the next requesting device.
The overhead time required for allocating the bus can
be shorter than for Daisy Chaining or Polling since all
Bus Requests are presented simultaneously to the Bus
Controller. In addition, there is complete flexibility
available for selecting the next device for service. The
controller can use prespecified or adaptive priorities, a
round-robin scheme, or both. It is also possible to disable requests from a particular device which, for instance, is known or suspected to have failed.

Figure 9-Centralized bus control: independent requests

Figure 10a-Decentralized bus control: daisy chain 1
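Because every Bus Request line is visible to the controller at once, the selection rule is completely free; a Python sketch of such a controller (device ids, priority order, and the disabled set are illustrative assumptions) is:

# Centralized Independent Requests: all requests are presented in parallel,
# so the controller may apply any priority or round-robin rule, and may
# mask out a device that is known or suspected to have failed.

def grant(requests, priority_order, disabled=frozenset()):
    """requests: set of requesting device ids; priority_order: ids, best first."""
    for dev in priority_order:
        if dev in requests and dev not in disabled:
            return dev
    return None

print(grant({2, 5, 7}, priority_order=[7, 5, 2]))                 # -> 7
print(grant({2, 5, 7}, priority_order=[7, 5, 2], disabled={7}))   # -> 5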
The major disadvantage of Independent Requests is
the number of lines and connectors required for control.
Of course, the complexity of the allocation algorithm
will be reflected in the amount of Bus Controller hardware.
Decentralized bus control

In a decentrally controlled system, the control logic is
(primarily) distributed throughout the devices on the
bus. As in the centralized case, there are at least three
distinct schemes, plus combinations and modifications
of these:
• Daisy Chaining
• Polling
• Independent Requests.
A decentralized Daisy Chain can be constructed
from a centralized one by omitting the Bus Busy line
and connecting the common Bus Request to the "first"
Bus Available, as shown in Figure 10a. A device requests
service by raising its Bus Request line if the incoming
Bus Available line is low. When a Bus Available signal
is received, a device which is not requesting the bus
passes the signal on. The first device which is requesting
service does not propagate the Bus Available, and keeps
its Bus Request up until finished with the bus. Lowering
the Bus Request lowers Bus Available if no successive
devices also have Bus Request signals up, in which case
the "first" device wanting the bus gets it. On the other
hand, if some device "beyond" this one has a Bus Request, control propagates down to it. Thus, allocation is
always on a round-robin basis.
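A sketch of this round-robin behaviour follows (the device numbering and request pattern are assumptions made for illustration):

# Decentralized daisy chain: when the current user drops its Bus Request,
# Bus Available propagates "forward" from that point, so the next requester
# beyond the last user is always served first.

def next_user(requests, last_user):
    n = len(requests)
    for i in range(1, n + 1):
        dev = (last_user + i) % n            # walk forward around the chain
        if requests[dev]:
            return dev
    return None

print(next_user([True, False, True, False], last_user=2))   # -> 0 (wraps past device 3)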
A potential problem exists in that if a device in the
interior of the chain releases the bus and no other device is requesting it, the fall of Bus Request is propagating back toward the "first" device while the Bus Available signal propagates "forward." If devices on both

sides of the last user now raise Bus Request, the one to the "right" will obtain the bus momentarily until its Bus Available drops when the "left" device gets control. This dilemma can be avoided by postponing the bus assignment until such races have settled out, either asynchronously with one-shots in each device or with a synchronizing signal from elsewhere in the system.

Figure 10b-Decentralized bus control: daisy chain 2
A topologically simpler decentralized Daisy Chain is
illustrated in Figure 10b. Here, it is not possible to unambiguously specify the status of the bus by using a
static level on the Bus Available line. However, it is
possible to determine the bus status from transitions on
the Bus Available line. Whenever the Bus Available
coming into a device changes state and that device
needs to use the bus, it does not pass a signal transition
on to the next device; if the device does not need the
bus, it then changes the Bus Available signal to the next
device. When the bus is idle, the Bus Available signal
oscillates around the Daisy Chain. The first device
to request the bus and receive a Bus Available signal
change terminates the oscillation and takes control of
the bus. When the device is finished with the bus, it
causes a transition in Bus Available to the next device.
Dependence on signal edges rather than levels renders
this approach somewhat more susceptible to noise than

the previous one. This problem can be minimized by passing control with a request/acknowledge type of mechanism such as described later for communication, although this slows bus allocation. Both of these decentralized Daisy Chains have the same single-point failure mode and physical layout liabilities as the centralized version. Specific systems may prefer either the (centralized) priority or the (decentralized) round-robin algorithm, but they are equally inflexible (albeit simple).

Figure 11-Decentralized bus control: polling
Decentralized Polling can be performed as shown in
Figure 11. When a device is willing to relinquish control
of the bus, it puts a code (address or priority) on the
polling lines and raises Bus Available. If the code
matches that of another device which desires the bus,
that device responds with Bus Accept. The former
device drops the polling and Bus Available lines, and
the latter device lowers Bus Accept and begins using
the bus. If the polling device does not receive a Bus
Accept (a Bus Refused line could be added to distinguish between devices which do not desire the bus and those which are failed), it changes the code according to some allocation algorithm (round-robin or priority) and tries again. This approach requires that exactly one device be granted bus control when the system is initialized. Since every device must have the same allocation hardware as a centralized polling Bus Controller, the decentralized version utilizes substantially more hardware. This buys enhanced reliability in that failure of a single device does not necessarily affect operation of the bus.

Figure 12-Decentralized bus control: independent requests
Figure 12 illustrates the decentralized version of
Independent Requests. Any device desiring the bus
raises its Bus Request line, which corresponds to its
priority. When the current user releases the bus by
dropping Bus Assigned, all requesting devices examine
all active Bus Requests. The device which recognizes
itself as the highest priority requestor obtains control
of the bus by raising Bus Assigned. This causes all
other requesting devices to lower their Bus Requests

(and to store the priority of the successful device if a round-robin algorithm is to be accommodated).
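Each device can make this decision locally because it sees every Bus Request line; a small Python sketch (the priority table and device ids are hypothetical) illustrates the idea:

# Decentralized Independent Requests: when Bus Assigned drops, every
# requesting device looks at all active Bus Request lines and decides for
# itself whether it is the highest-priority requester.

def i_take_the_bus(my_id, active_requests, priority):
    """priority maps device id -> rank; a lower rank means higher priority."""
    winner = min(active_requests, key=lambda dev: priority[dev])
    return winner == my_id

priority = {0: 2, 1: 0, 2: 1}                # device 1 is highest priority
active = {0, 2}                              # devices currently requesting
print([dev for dev in active if i_take_the_bus(dev, active, priority)])   # -> [2]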
The priority logic in each device is simpler than that
in the centralized counterpart, but the number of lines
and connectors is higher. If the priorities are fixed
rather than dynamic, not all request lines go to all
devices, so the decentralized case uses fewer lines in
systems with up to about 10 devices. Again, the decentralized method offers some reliability advantages
over the centralized one.
The clock skew problems limit this process to small
dense systems, and it is exceptionally susceptible to
noise and clock failure.
Bus communication techniques

Once a device has obtained control of a bus, it must
establish contact with the desired destination. The information required to do this includes
• Source Address
• Destination Address
• Communication Class
• Action Class.

The source address is often implicit, and the destination address may be also, in the case of a dedicated bus.
Communication class refers to the type of information
to be transferred: e.g., data, command, status, interrupt,
etc. This too might be partially or wholly implicit, or
might be merged with the action class, which determines the function to be performed, such as input,
output, etc. After this initial coordination has been
accomplished, the actual communication can proceed.
Information may be transferred between devices synchronously, asynchronously, or semisynchronously.
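To make the roles of these fields concrete, here is a hypothetical encoding in Python; the field names and values are illustrative assumptions, not a format defined by this paper:

# One possible packaging of the coordination information that precedes a
# transfer; on a dedicated bus the source (and possibly the destination)
# would be implicit and could be omitted.

from dataclasses import dataclass

@dataclass
class BusTransaction:
    source: int          # often implicit
    destination: int     # may also be implicit on a dedicated bus
    comm_class: str      # e.g. "data", "command", "status", "interrupt"
    action_class: str    # e.g. "input", "output"

t = BusTransaction(source=3, destination=7, comm_class="data", action_class="output")
print(t)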
Synchronous bus communication

Synchronous transmission techniques are well understood and widely used in communication systems, primarily because they can efficiently operate over long
lengths of cable. A synchronous bus is characterized by
the existence of fixed, equal-width time slots which are
either generated or synchronized by a central timing
mechanism.
The bus timing can be generated globally or both
globally and locally. A globally timed bus contains a
central oscillator which broadcasts clock signals to all
units on the bus. Depending on the logical structure and
physical layout of the bus, clock skew may be a serious
problem. This can be somewhat alleviated by distributing a globally generated frame signal which synchro-


nizes a local clock in each device. The local clocks drive
counters which are decoded to identify the time slot assigned to each device. A sync pulse occurs every time
the count cycle (i.e., frame) restarts. The device clocks
must be within the initial frequency and temperature
coefficient tolerances determined by the bus timing
characteristics. Skew can still exist if a separate frame
sync line is used, but can be avoided by putting frame
sync in the data. The sync signal then must be separable
from the data, generally through amplitude, phase, or
coding characteristics. If the identifying characteristic
is amplitude, the line drivers and receivers are much
more complex analog circuits than those for simple binary data. If phase is used, the sync signal must be
longer than a time slot, which costs bus bandwidth and
again adds an analog dimension to the drivers and receivers. If the sync signal is coded as a special binary
sequence, it could be confused with normal data, and
can require complex decoders.
All of the global and global/local synchronization
techniques are quite subject to noise errors.
There are two basic approaches to synchronous
busses: the time slots may be assigned to devices on
either a dedicated or non-dedicated basis. A mix of both
dedicated and undedicated slots can also be used. If
time slots are dedicated, they are permanently allocated
to a device regardless of how frequently or infrequently
that device uses them. Each device on the bus is allowed
to communicate on a rotational (time division multiplex) basis. The only way that any priority can be established is by assigning more than one slot to a device
(sometimes called super-commutation). More than one
device may be assigned to a single time slot by submultiplexing (subcommutating) slower or mutually exclusive devices.
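A frame built along these lines might be sketched as follows (the device names, frame length, and slot assignments are invented for illustration):

# Dedicated time slots on a synchronous bus: a fast device owns two slots per
# frame (super-commutation), while two slow devices share one slot on
# alternate frames (sub-commutation).

FRAME = ["disk", "cpu", "disk", "slow_pair"]          # slot owners within one frame

def slot_owner(frame_number, slot_number):
    owner = FRAME[slot_number]
    if owner == "slow_pair":                          # sub-commutated slot
        return "sensor_a" if frame_number % 2 == 0 else "sensor_b"
    return owner

for f in range(2):
    print([slot_owner(f, s) for s in range(len(FRAME))])
# frame 0: ['disk', 'cpu', 'disk', 'sensor_a']
# frame 1: ['disk', 'cpu', 'disk', 'sensor_b']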
Generally, not all devices will wish to transmit at
once; system requirements may not even require or
permit it. If any expansion facilities for additional devices are provided, many of the devices may not even
be implemented on any given system. These two factors
tend to waste bus bandwidth, and lowering the bandwidth to an expected "average" load may risk unacceptable conflicts and delays in peak traffic periods.
Another difficulty that reduces throughput on a dedicated time slot bus is that devices frequently are not
all the same speed. This means that if a device operates
slower than the time slot rate, it cannot run at its full
speed. The time slot rate could be selected to match the
rate of the slowest device on the bus, but this slows
down all faster devices. Alternatively, the time slot
rate can be made as fast as the fastest device on the bus,
and buffers incorporated into the slower devices. Depending on the device rate mismatches and the length
of data blocks, these buffers could grow quite large. In


addition, the buffers must be capable of simultaneous
input and output (or one read and one write in a time
slot period), or else the whole transfer is delayed until
the buffer is filled. Another approach is to run the bus
slower than the fastest device and assign multiple time
slots to that device, which complicates the control and
wastes bus bandwidth if that device is not always transferring data. Special logic must also be included if
burst or block transfers are to be permitted, since a
device normally does not get adjacent time slots.
For reliability, it is generally desirable that the receiving device verify and acknowledge correct arrival of the
data. This is most effectively done on a word basis unless
the physical nature of the transmitting device precludes
retry on anything other than a complete block or message. If a synchronous time slot is wide enough to allow
a reply for every word, then data transmission will be
slower than with an asynchronous bus because the time
slots would have to be defined by the slowest device on
the bus. One solution is to establish a system convention
that verification is by default, and if an error does occur,
a signal will be returned to the source device N (say
two) time slots later. The destination has time to do the
validity test without slowing the transfer rate; however,
the source must retain all words which have been transmitted but not verified.
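A sketch of what the source must keep in order to honour such a convention (the buffer depth N and the word/slot representation are assumptions):

# Deferred verification: an error may be signalled up to N time slots after
# the offending word, so the source retains the last N unverified words and
# can retransmit the one the destination names.

from collections import deque

N = 2                                    # error signal arrives up to N slots late
unverified = deque(maxlen=N)             # words sent but not yet verified

def send(word, slot):
    unverified.append((slot, word))      # transmit and remember

def word_to_retry(error_slot):
    for slot, word in unverified:
        if slot == error_slot:
            return word
    raise KeyError("word no longer retained")

send("A", slot=10)
send("B", slot=11)
print(word_to_retry(10))                 # -> 'A'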
Non-dedicated time slots are provided to devices
only as needed, which improves bus utilization efficiency at the cost of slot allocation hardware. Block
transfers and priority assignment schemes are possible
if the bus assignment mechanism is fast enough. The
device speed and error checking limitations of the
dedicated case are also shared by non-dedicated systems.
Asynchronous bus communication

Asynchronous bus communication techniques fall
into two general categories: One-Way Command, and
Request/Acknowledge. A third case is where clocking
information is derived from the data itself at the desti-

nation (using phase modulation, etc.); this is not treated here because it is primarily suited to long-distance bit-serial communications applications and is well documented elsewhere.

Figure 13-Asynchronous, source-controlled, one-way command communication

Figure 14-Asynchronous, destination-controlled, one-way command communication
One-Way Command refers to the fact that the data
transfer mechanism is completely controlled by only
one of the two devices communicating-once the transfer is initiated, there is no interaction (except, perhaps,
for an error signal).
A One-Way Command (OWC) interface may be controlled by either the source or the destination device.
With a source-controlled OWC interface, the transmitting device places data on the bus, and signals Data
Ready to the receiving device, as seen in Figure 13.
Timing of Data Ready is highly dependent on implementation details, such as exactly how it is used by the
destination device. If Data Ready itself directly strobes
in the data, then it must be delayed long enough (t1)
for the data to have propagated down the bus and
settled at the receiving end before Data Ready arrives.
Instead of "pipelining" data and Data Ready, it is
safer to allow the data to reach the destination before
generating Data Ready, but this makes the transfer
rate a function of the physical distance between devices.
A better approach is to make Data Ready as wide as
the data (i.e., t1 = t3 = 0), and let the receiving device
internally delay before loading. t4 is the time required
either for the source device to reload its output data
register, or for control of the bus to be reassigned.
The principal advantages of the source-controlled
OWC interface are simplicity and speed. The major disadvantages are that there is no validity verification
from the destination, it is difficult and inefficient to
communicate between devices of different speeds, and
noise pulses on the Data Ready line might be mistaken
for valid signals. The noise problem can be minimized


by proper timing, but usually at the expense of transfer
rate.
The validity check problem can be avoided with a
destination-controlled OWC interface, such as shown
in Figure 14. The receiving device raises Data Request,
which causes the source to place data on the bus. The
destination now has the problem of deciding when to
look at the data lines, which is related to the physical
distance involved. If an error is detected in the word,
the receiving device sends a Data Error signal instead
of another Data Request, so the validity check time
may limit the transfer rate. The speed is also adversely
affected by higher initial overhead, and by twice the
number of bus propagation delays as used by the
source-controlled interface.
The Request/Acknowledge method of asynchronous
communication can be separated into three cases: Non-Interlocked, Half-Interlocked, and Fully-Interlocked.

Figure 15-Asynchronous, non-interlocked, request/acknowledge communication

Figure 15 illustrates the Non-Interlocked method.
The source puts data on the bus, and raises Data Ready;
the destination stores the data and responds with
Data Accept, which causes Data Ready to fall and new
data to be placed on the lines. If an error is found in the
data, the receiving device raises Data Error instead of
Data Accept. This signal interchange not only provides
error control, but also permits operation between devices of any speeds. The price is primarily speed, although some added logic is also required. As with the
One-Way Command interface, the exact timing is a
function of the implementation. There are now two
lines susceptible to noise, and twice as many bus delays
to consider. Improper ratios of bus propagation time
and communication signal pulse widths could allow
another Data Ready to come and go while Data Accept
is still high in response to a previous one, which would
hang up the entire bus.
This can be avoided by making Data Ready remain
up until Data Accept (or Data Error) is received by the

source, as seen in Figure 16. In this Half-Interlocked interface, if Data Ready comes up while Data Accept is still high, the transfer will only be delayed. Furthermore, the variable width of Data Ready tends to protect it from noise. There is no speed penalty and very little hardware cost associated with these improvements over the Non-Interlocked case.

Figure 16-Asynchronous, half-interlocked, request/acknowledge communication
One more potential timing error is possible if Data
Accept extends over the source buffer reload period and
masks the leading edge of the next Data Ready. Figure
17 shows how this is avoided with a Fully-Interlocked
interface where a new Data Ready does not occur until
the trailing edge of the old Data Accept (or Data Error).
Also, both communication signals are now comparatively noise-immune. The device logic is again slightly
more complex, but the major disadvantage is that the
bus delays have doubled over the Half-Interlocked case,
nearly halving the transfer rate upper limit.
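The four-edge sequence of the Fully-Interlocked exchange can be mimicked in software with two flags standing in for the Data Ready and Data Accept lines; the following Python sketch is an analogy under those assumptions, not a hardware description:

# Fully-interlocked request/acknowledge handshake between two threads.
# Each of the four signal edges (Ready up, Accept up, Ready down, Accept
# down) must be observed by the other side before the next word begins.

import threading, time

data_ready = threading.Event()
data_accept = threading.Event()
bus = {}

def source(words):
    for w in words:
        bus["data"] = w
        data_ready.set()                 # raise Data Ready with the word on the bus
        data_accept.wait()               # hold it until Data Accept rises
        data_ready.clear()               # then drop Data Ready ...
        while data_accept.is_set():      # ... and wait for Data Accept to fall
            time.sleep(0.001)            # only now may the next word be placed

def destination(count, received):
    for _ in range(count):
        data_ready.wait()                # wait for Data Ready
        received.append(bus["data"])     # take the word off the bus
        data_accept.set()                # raise Data Accept
        while data_ready.is_set():       # wait for Data Ready to fall
            time.sleep(0.001)
        data_accept.clear()              # complete the interlock

received = []
words = ["w0", "w1", "w2"]
sink = threading.Thread(target=destination, args=(len(words), received))
sink.start()
source(words)
sink.join()
print(received)                          # -> ['w0', 'w1', 'w2']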
Semisynchronous bus communication

Semisynchronous busses may be thought of as having time slots which are not necessarily fixed equal
width. On the other hand, they might also be viewed
as essentially asynchronous busses which behave
synchronously when not in use.

Figure 17-Asynchronous, fully-interlocked, request/acknowledge communication


Semisynchronous busses were devised to retain the
basic asynchronous advantage of communication between different speed devices, while overcoming the
asynchronous disadvantage of real-time error response
and the synchronous disadvantage of clock skew. Error
control in a synchronous system does not impede the
transfer rate because the error signal can be deferred as
many time slots as the validity test requires. This is not
possible on a conventional asynchronous bus since there
is no global timing signal available to all devices. Actually, this is true only when the bus is idle, because
while it is in use there are one or more communication
signals which may be observed by all devices. So an
asynchronous bus could defer the Data Error signal for
some N word-times as defined by whatever transfer
technique is employed. But when no device is using the
bus, these signals normally stop, so the one or more
pairs of devices which transferred the last N words have
no time reference for a deferred error response. The semisynchronous bus handles this problem by generating
extra communication signals which serve as pseudoclock
pulses for this purpose when the bus is idle. Only N
pulses are actually needed, but a continuous oscillation
may facilitate the restart of normal bus operation.
The location of this pseudoclock depends on the bus
control method. If the bus is centrally controlled, the
Bus Controller can detect the idle bus condition and
generate the pseudoclock signals. A decentrally controlled bus requires that this function be performed by
the last device to use the bus. The replication of logic adds cost, and if this last device should fail while generating the pseudoclocks, the entire bus will be down.

Like asynchronous busses, semisynchronous busses may be either One-Way Command or Request/Acknowledge.

Figure 18-Semisynchronous, source-controlled, one-way command communication

Figure 19-Semisynchronous, non-interlocked, request/acknowledge communication (Data Ready/Bus Available)

Figure 20-Semisynchronous, non-interlocked, request/acknowledge communication (Data Accept/Bus Available)
Figure 18 illustrates how the timing of a semisynchronous source-controlled bus resembles that of its
asynchronous counterpart (there is no corresponding
destination-controlled case). Instead of the source
device sending a Data Ready to signal the presence of
new data, it sends a Bus Available to define the end of
its time slot and the beginning of the next. During a
time slot, the bus assignment for the following slot is
made; Bus Available then causes the next device to
place its destination address and data on the bus. The
selected destination then waits for the data to settle,
loads it, and generates another Bus Available.
Combining the function of Data Ready with that of
Bus Available (a line generally required by an asynchronous bus) is a benefit which accrues to all semisynchronous busses. The semisynchronous One-Way
Command interface does avoid the real-time error re-
sponse, but it is still highly susceptible to noise, and
incompatible with devices of differing speeds.
Figure 21-Semisynchronous, half-interlocked, request/acknowledge communication (Data Ready/Bus Available)

Figure 22-Semisynchronous, half-interlocked, request/acknowledge communication (toggling Data Ready/Bus Available)

For semisynchronous as well as asynchronous busses,
there are Non-Interlocked, Half-Interlocked, and Fully-Interlocked Request/Acknowledge interfaces.
The Non-Interlocked interface shown in Figure 19 is
a direct extension of the One-Way Command case. It
handles devices of different speeds, but also is susceptible to noise and potential hangup.
However, a semisynchronous bus using Data Ready
as Bus Available for a Non-Interlocked interface picks
up one of the liabilities of synchronous busses. The
transmitting device will not generate Bus Available
until Data Accept has been received and its word-time
is finished, which wastes bus bandwidth if a slower
source is followed by a faster one in the next time slot.
This can only be alleviated with the same sort of bus
bandwidth and buffer size trade-offs that a synchronous
bus would use to match different device speeds.
Figure 20 illustrates a scheme which solves this difficulty by using Data Accept for Bus Available. This
optimizes bus bandwidth in the asynchronous sense
that the transfer rate is slaved to the speed of the receiving device. Of course, the noise and hangup problems are still present.
Since using Data Ready as Bus Available is unsuccessful for Non-Interlocked interfaces, it is not surpris-

ing that it doesn't work in the Half-Interlocked case either. As seen in Figure 21, t5 is wasted because only the leading edge of Data Ready is used as Bus Available. Also, one device would try to hold Bus Available up while another is pulling it down. The second device could wait for the first to release the line, but skew on the Data Accept line from the first destination to the first and second sources would cause the wait to be quite lengthy. Furthermore, if Data Accept must be used by both source devices, it may as well transfer control instead of Bus Available.

Figure 24-Semisynchronous, fully-interlocked, request/acknowledge communication (Data Ready/Bus Available)
To keep from wasting t5, it might be proposed that
Bus Available simply be toggled and both edges be
utilized as in Figure 22, but the same state conflict
exists here. Toggling a Bus Available flip-flop with
Data Accept makes no more sense than both source
devices employing Data Accept, and would add time.
Thus, Data Accept must be converted to Bus Available, as shown in Figure 23. Except for a deferred error
signal, the disabilities of a conventional Half-Interlocked asynchronous bus continue to apply.
The same reasoning causes the Fully-Interlocked
interface of Figure 24 to be rejected for that of Figure
25, where the trailing edge of Data Accept serves as
Bus Available.

Figure 23-Semisynchronous, half-interlocked, request/acknowledge communication (Data Accept/Bus Available)

Figure 25-Semisynchronous, fully-interlocked, request/acknowledge communication (Data Accept/Bus Available)

Data transfer philosophies

There are five basic data transfer philosophies that
can be considered for a bus:
• Single word transfers only
• Fixed length block transfers only
• Variable length block transfers only
• Single word or fixed length block transfers
• Single word or variable length block transfers.

(It should be noted that here the term "word" is used
functionally to denote the basic information unit on the
bus; bus width factors are covered later.)
The data transfer philosophy is directly involved with
three other major aspects of the system: the access
characteristics of the devices using the bus; the control
mechanism by which the bus is allocated (if it is nondedicated); and the bus communication techniques. Of
course, if the bus connects functional units of a computer such as processors and memories, the data transfer philosophy may severely impact programming,
memory allocation and utilization, etc.
Single words only

The choice of allowing only single words to be transferred has a number of important system ramifications.
First, it precludes any effective use of purely block-oriented devices, such as disks, drums, or BORAMs.
These devices have a high latency and their principal
value lies in amortizing this time across many words in
a block. To a lesser extent, this concern also applies to
other types of devices. There can be substantial initial
overhead in obtaining access to a device: bus acquisition, bus propagation, busy device delay, priority resolution, address mapping, intrinsic device access time,
etc. Prorating these against a block of words would reduce the effective access time.
The second factor in a single-word-only system is the
bus control method. Since a non-dedicated bus must be
reassigned to another device for each word, the allocation algorithm may have to be very fast to meet the bus
throughput specs. Even if bus assignment occurs in
parallel with data transfer, this could restrict the sophistication of the algorithm, the bus bandwidth, or
both. Judiciously selected parameters (speed, priorities,
etc.) conceivably could enable a bus controller to handle
blocks from a slow device on a word-by-word basis.
A single-word-only bus requires that the communication scheme operate at the word rate, whereas with
block transfers it might be possible for devices to effect
higher throughput by interchanging communication
signals only at the beginning and end of each block.

Fixed length blocks only

Bus bandwidth may be increased at the expense of
flexibility by transferring only fixed length blocks of
data. Problems arise when the bus block size does not
match that of a block-oriented device on the bus. If
the bus blocks are smaller, some improvement is
achieved over the single-word-only bus, but not as
much as would be possible. If the bus blocks are too
large, extraneous data is transferred, which wastes bus
bandwidth and buffer space, and unnecessarily ties up
both devices. However, there are applications such as
lookaside memories where locality of procedure and
data references make effective use of a purely fixed
length block transfer philosophy.
Since the bus is assigned for entire blocks, the control
can be slower and thus simpler. Likewise, the communication validity check can be restricted to blocks because
this is the smallest unit that could be retried in case of
an error. The Data Ready aspect of communication
would have to remain on a word basis unless a self-clocked modulation scheme is used.
Variable length blocks only

The use of dynamically variable length blocks is
significantly more flexible than the two previous approaches, because the block size can be matched to the
physical or logical requirements of the devices involved
in the transfer. This capability makes more efficient
use of bus bandwidth and device time when transferring
blocks. On the other hand, the overhead involved in
initiating a block transfer would also be expended for
single word transfers (blocks of length one). Thus, a
compromise between bandwidth and flexibility may
have to be arranged, based on the throughput requirements and expected average block size. An example of
such a compromise would be a system in which the sizes
of the data blocks depended on the source devices. This
avoids explicit block length specification, reducing the
overhead and improving throughput.
The facility for one-word blocks requires that the
control scheme be able to reallocate the bus rapidly
enough to minimize wasted bandwidth. Data error
response may also be required at the word rate.
Single words or fixed length blocks

In a system where there are high priority devices with
low data requirements, this might be a reasonable alternative. The single word option reduces the number
of cases where the over-size block would waste bandwidth, buffer space, and device availability, but it still


suffers from poor device and bus utilization efficiency
when more than one word but less than a block is
needed.
The expected mix of block and single word transfers
would be a primary influence on the selection of control
and communication mechanisms to achieve a proper
balance of cost and performance.
Single words or variable length blocks

As might be expected, the capability for both single
words and variable length blocks is the most flexible,
efficient, and expensive data transfer philosophy.
Single words can be handled without the overhead involved in initializing a block transfer. Data blocks can
be sized to suit the devices and applications, which
optimizes bus usage. The necessity for reassigning the
bus as often as every word time imposes a speed constraint on the control method which must be evaluated
in light of the expected bus traffic statistics. If data
validity response is desired below a message level, the
choice of a communication scheme will be affected.
Bus width
The width of a bus impacts many aspects of the system, including cost, reliability, and throughput. Basically, the objective is to achieve the smallest number of
lines consistent with the necessary types and rates of
communication.
Bus lines require drivers, receivers, cable, connectors,
and power, all of which tend to be costly compared to
logic. Connectors occupy a significant amount of physical space, and are also among the least reliable components in the system. Reliability is often diminished
even further as the number of lines increases due to the
additional signal switching noise.
Line combination, serial/parallel conversions, and
multilevel encoding are some of the fundamental
techniques for reducing bus width. Combination is a
method of reducing the number of lines based on function and direction of transmission. Complementary
pairs of simplex lines might be replaced with single half-duplex lines. Instead of dedicating individual lines to
separate functions, a smaller number of multiplexed
lines might be more cost effective, even if extra logic is
involved. This includes the performance of bus control
functions with coded words on the data lines.
Serial/parallel tradeoffs are frequently employed to
balance bus width against system cost and performance.
Transmitting fewer bits at a time saves lines, connectors,
drivers, and receivers, but adds conversion logic at each
end. It may also be necessary to use higher speed (and


thus more expensive) circuits to maintain effective
throughput. The serial/parallel converters at each end
of the bus can be augmented with buffers which absorb
traffic fluctuations and allow a lower bandwidth bus.
(Independent of bus width considerations, this concept
can minimize communication delays due to busy destination devices.) Bit-serial transmission generally is the
slowest, requires the most buffering and the least line
hardware, produces the smallest amount of noise, and
is the most applicable approach in cases with long lines.
Parallel transmission is faster, uses more line hardware,
generates greater noise, and is more cost-effective over
shorter distances.
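The throughput side of this tradeoff is easy to quantify; a back-of-the-envelope Python sketch follows (the line rate and word size are assumed numbers, not taken from any system discussed here):

# Serial/parallel tradeoff: transferring a 16-bit word over fewer lines takes
# proportionally more line transfers, which must be bought back with faster
# circuits or with buffering.

def words_per_second(line_rate_hz, word_bits, lines):
    transfers_per_word = -(-word_bits // lines)       # ceiling division
    return line_rate_hz / transfers_per_word

for lines in (1, 4, 16):
    print(lines, "line(s):", words_per_second(1_000_000, word_bits=16, lines=lines), "words/s")
# 1 line: 62500.0 words/s; 4 lines: 250000.0 words/s; 16 lines: 1000000.0 words/s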
Multilevel encoding is an approach which converts
digital data into analog signals on the bus. It is occasionally used to increase bandwidth by sending parallel
data over a single line, but there are numerous disadvantages such as complexity, line voltage drops, lack of
noise immunity, etc.
THE SYSTEMATIC APPROACH
A systematic approach to the design of digital bussing
structures is outlined in Figure 26. It assumes that pre-

liminary functional requirements and specifications have been established for the system. The tradeoffs for each bus parameter are interactive, so several iterations are generally necessary. Even the system requirements and specifications may be altered by this feedback in order to achieve an acceptable bus configuration within the technology constraints.

Figure 26-Outline of the systematic approach (system requirements and specifications; Step 1: type and number of busses; Step 2: control method; Step 3: communication techniques; Step 4: data transfer philosophies; Step 5: bus widths; technology constraints; detailed design)

Step 1: Type and number of busses
This is the first and most fundamental step, and involves the specification of dedicated and/or nondedicated busses. The factors to be considered are:
throughput; cost of cables, connectors, etc.; control
complexity; communication complexity; reliability;
modularity; and bus contention (i.e., availability).

Step 2: Bus control methods
The central choice is among three centralized and
three decentralized methods. The Step 1 decision regarding dedicated and non-dedicated busses has a major
influence here. The other considerations are: allocation
speed; cost of cables, connectors, etc.; control complexity (cost); reliability; modularity; bus contention; allocation flexibility; and device physical placement
restrictions.

Step 3: Communication techniques
Either synchronous, asynchronous, or semisynchronous communication techniques may be used, depending on: throughput; cost; reliability; mixed device
speeds; bus utilization efficiency; data transfer philosophy; and bus length.

Step 4: Data transfer philosophies
This step is strongly influenced by the need for any
block-oriented devices on the bus. In addition, the data
transfer philosophy is a function of: control speed; allocation flexibility; control cost; throughput; communication speed; communication technique; device utilization efficiency; and (perhaps) programming and memory allocation.

Step 5: Bus width
Bus width is almost always primarily dictated by
either bus length or throughput. Other aspects of this
problem are: cost, reliability; communication technique; and communication speed.

CONCLUSION
Historically, many digital bus structures have simply
"occurred" ad hoc without adequate consideration of
the design tradeoffs and their architectural impacts.
This is no longer a viable approach, because systems
are becoming more complex and consequently less
tolerant of busses which are designed by habit or added
as an afterthought. The progress in this area has been
hindered by a lack of published literature detailing all
the bus parameters and design alternatives. Some aspects of bussing have been touched on briefly as a
subsidiary topic in computer architecture papers, and
a few concepts have been treated at great length in
the substantially different context of communications.
In contrast with these foregoing efforts, the intent of
this paper is to serve as a step towards a more systematic
approach to the entire digital bus structure problem
per se.
ANNOTATED BIBLIOGRAPHY
Although many digital designers recognize the importance of bus structures, there have been no previous
papers devoted solely to this subject. When bus structures have been discussed in the literature, it has been
as a topic subsidiary to other aspects of computer
architecture. This section attempts to collect a comprehensive but not exhaustive selection of important
papers which deal with various considerations of bus
structure design. A guide to the bibliography is given
below so that particular facets of this material can be
explored. Additionally, each entry has been briefly annotated to provide information on its bus-related contents. The bibliography is grouped into nine categories:
Computer Architecture/System Organization, I/O,
Sorting Networks, Multiprocessors, Type and Number
of Busses, Control Methods, Communication Techniques, Data Transfer Philosophies, and Bus Width.

Computer architecture/system organization
(A2, B2, B4, D2, D3, D4, D7, D8, D9, H3, L1, L4, L7, M4, M5, R5, S7, T4, W1, W5)
Papers in this category basically deal with the architecture of computers and systems, and with how subsystems relate to each other. Alternative architectures
(D2, L4, W1) and specific architectures (B2, B4, D3,
D4, D7, D8, D9, W5) are discussed. Item A2 is tutorial.
The impacts of bus structures (D2, H3, L1) and LSI
(L7, M5, R5) on systems organization are described.
S7 pursues the effects of new technology on bus struc-

Systematic Approach to Design of Digital Bussing Structures

tures per se. Report T4 (on which this paper is based)
examines the entire bussing problem, and contains a
detailed bus design for a specific system.

733

different numbers of busses. B2 points out the hierarchical nature of bus structures. F2 is an example of a store
and forward bus structure with dedicated busses and
extensive routing control.

I/O
Control methods

(A2, C1, K4)
(A1, A2, B7, P1, P2, P3, Q1, S2, S6, S8, W2, W4, Y1)
Several papers deal with bus structures as a subcase
of I/O system design. K4 is a tutorial on I/O architecture with many implications on bus structure communication and control. A2 discusses the relationships
among the executive, the data bus, and the remainder
of the system. C1 considers the overall architecture of
an I/O system and its control.
Sorting networks

(B1, L6, T2, T3)
These papers deal with sorting or permuting bus
structures. B1 and L6 utilize very simple cells and
basically construct their systems from bitonic sorters.
T2 utilizes a different approach which is oriented
toward ease of implementation with shift registers. T3
employs group theory and a cellular array approach to
derive a unique network configuration.
Multiprocessors

(A1, C2, C8, C9, D1, G5)
These papers deal with the design of multiprocessor
computer systems. C9 covers the bus architecture of
multiprocessors through 1963. A1 describes a multiprocessor with dual non-dedicated busses controlled by
a decentralized daisy chain. C2 discusses the relationship between channel rates and memory requirements.
C8 and D1 are about multiprocessors using data exchanges. G5 describes a multiprocessor bus that uses
associative addressing techniques in its communication
portion.
Type and number of busses

(A1, A3, B2, B6, D6, D10, F2, G1, I1, K1, K3, L2, L3, L9, M3, S5, W8, Z2)

The majority of the control techniques are some form of either centralized independent requests (A2) or decentralized daisy chaining (A1). P1 uses polling, and
P2 deals with priority control of a system.
Communication techniques

(C6, C7, D5, F1, G3, G4, H2, H4, M1, R2, R3, R4, S1, S3, S4, S9, T1, W6, W7)
These papers tend to be concerned with communication techniques directly rather than as a subsidiary
topic. R2 discusses the information lines necessary to
communicate in a system. C6, C7, and M1 cover synchronous systems. H4 and S3 are good presentations of
the synchronous clock skew problem. S4 deals with the
design of a frame and slotted system. F1 describes the
use of phase-locked loops for synchronism, while W7
uses bit stuffing for synchronization. The synchronous
system in H2 uses a combination of global and local
timing. R3 deals with a synchronous system with nondedicated time slots. D5 contains a good summary of
asynchronous communication, and G3 furnishes further
examples. G4 points out the importance of communication in digital systems.
Data transfer philosophies

(A4, C1, C3, C4, C5, G2, H1, L5, L8, M2, W3)
Papers in this category are concerned with the
philosophies of data transfers. A4 is about transmission
error checking and serial-by-byte transmission. C3,
C4, and C5 cover buffering and block size from a statistical point of view in simple bus structures such as
"loops." G2 studies the choice of block sizes. L5 considers the buffering problem.
Bus width

The papers in this group describe a computer architecture and include some comments relating to the
type and number of busses. Z2 is an example of a dedicated bus, while A1 presents a non-dedicated bus. A1,
D10, L2, L3, and Z2 are cases of bus structures with

(B3, B5, C3, C4, C5, K2, R1, T5, Z1)
These papers address the problem of reducing the
number of lines in the bus. B3 deals with line drivers

734

Fall Joint Computer Conference, 1972

and receivers, and contains an extensive bibliography on
transmission line papers. B5 discusses balancing the
overall system configuration. C3, C4, and C5 are interested in the relationships of burst lengths, number of
lines, etc. K2 describes a transmission system utilizing
multilevel encoding. T5 is a comprehensive study of line
reduction, and includes all the tradeoffs on buffering,
multilevel codes, etc., in the design of an actual bus. A
machine with a single 200 line bus structure is the topic
of R1.

REFERENCES
A1 R L ALONSO et al
A multiprocessing structure
Proceedings IEEE Computer Conference September 1967
pp 56-59
This paper describes a multiprocessor system with non-dedicated instruction and data busses. The control method is a
simple decentralized daisy chain.
A2 S J ANDELMAN
Real-time I/O techniques to reduce system costs
Computer Design May 1966 pp 48-54
This article describes two real-time I/O applications and
how a computer is used in each. It also indicates the
relationships among the system executive, the CPU
computations, and the I/O data bus. It includes centralized
bus control.
A3 J P ANDERSON et al
D825-a multiple-computer system for command and control
Proceedings FJCC 1962 AFIPS Press pp 86-96
This paper functionally describes the switch interlock
system of the Burroughs D825 system. The switch is
essentially a crossbar which can handle up to 64 devices.
A priority-oriented bus allocation mechanism handles
conflicting allocation requests. Priorities are preemptive.
A4 A AVIZIENIS
Design of fault-tolerant computers
Proceedings FJCC 1967 AFIPS Press pp 733-743
This paper describes the internal structure of the JPL-STAR
computer. The bus structure consists of two busses and two
bus checkers. The busses transmit information in four-bit
bytes and the bus checkers check for transmission errors.
B1 K E BATCHER
Sorting networks and their application
Proceedings SJCC 1968 AFIPS Press pp 307-314
This paper describes various configurations of bitonic sorting
networks which can be utilized as routing networks or
permutation switches in multiprocessor systems.
B2 H R BEELITZ
System architecture for large-scale integration
Proceedings FJCC 1967 AFIPS Press pp 185-200
This paper describes the architecture of LIMAC. It notes
the hierarchical nature of bus structures, stating, "A local bus
structure interconnects the sub-partitions of a functional
module in the same sense that the machine bus interconnects
all functional modules."
B3 ROBERG et al
PEPE implementation study
Honeywell Report 12251-FR Prepared for System
Development Corporation under Subcontract SDC 71-61

This report contains an extensive bibliography of signal
transmission papers and a survey of line drivers and
receivers. It also describes the bus designs for the PEPE
multiprocessor system.
B4 N A BOEHMER et al
Advanced avionic digital computer-arithmetic and control
unit design
Hughes Aircraft Report P70-517 prepared under Navy
contract N62269-70-C-0534 December 1970
This report describes a main data bus design for the
Advanced Avionic Digital Computer, including the bus
communication and allocation mechanisms.
B5 F P BROOKS K E IVERSON
Automatic data processing
Wiley New York 1969 Section 5.4 Parameters of computer
organization pp 250-262
This section discusses speed/cost/balance tradeoffs in
computer architecture. Of specific interest is how bus width,
speed, and degree of parallelism affect computer performance. Examples of tradeoff results are given in terms of
the System/360.
B6 W BUCHHOLZ
Planning a computer system
McGraw-Hill New York 1962
Chapter 16 of this book describes the data exchange of the
STRETCH computer. The data exchange is a switched bus
which handles data flow among I/O and external storage
units and the primary store. It is independent of CPU
processes and able to function concurrently with the central
processor.
B7 H B BURNER et al
A programmable data concentrator for a large computing
system
IEEE Transactions on Computers November 1969 pp
1030-1038
This paper describes the internal structure of a data
concentrator to be used with an IBM 360/67. The concentrator utilizes an Interdata Model 4 computer. The details
of the bus structure, including timing and control signals,
are given. The system was built and utilized at Washington
State University, Pullman, Washington.
C1 G N CEDARQUIST
An input/output system for a multiprogrammed computer
Report No 223 April 1967 Department of Computer
Science University of Illinois
This report describes the architecture of I/O systems, and
deals with some parameters of bus structures through
discussion of data transfers. It is primarily concerned with
the implementation of centralized control and communication logic.
C2 Y C E CHEN D LEPLEY
Bounds on memory requirements of multiprocessing systems
Proceedings 6th Annual Allerton Conference on Circuit and
System Theory October 1968 pp 523-531
This paper presents a model of a multiprocessor with a
multilevel memory. Given a computation graph with
specified execution times and main memory requirements,
bounds on the required main memory and the inter-memory
channel rates are calculated. The trade-off between main
memory size and backing memory channel capacity is
discussed at some length.
C3 W W CHU
A study of asynchronous time division multiplexing for time
sharing computer systems
Proceedings FJCC 1969 AFIPS Press pp 669-678

This paper describes the use of an asynchronous time
division multiplexing system. A model is given which relates
buffer size and queuing delays to traffic, number of lines, and
burst lengths.
C4 W W CHU
Demultiplexing considerations for statistical multiplexers
IEEE Transactions on Computers June 1972 pp 603-609
This paper discusses tradeoffs and simulation results useful
in the design of buffers used in a computer communication
system. The tradeoffs between message lengths, buffer size,
traffic intensity, etc., are considered.
C5 W W CHU A G KONHEIM
On the analysis and modeling of a class of computer
communication systems
IEEE Transactions on Communications June 1972
pp 645-660
This paper derives models for a computer communication
environment, applied to star and loop bus structure
systems. The model provides a means of relating statistical
parameters for traffic intensities, message lengths, etc.
C6 N CLARK A C GANNET
Computer-to-computer communication at 2.5 megabit/sec
Proceedings of IFIP Congress 62 North Holland Publishing
Company September 1962 pp 347-353
This paper describes an experimental synchronous high
speed (2.5 megabit/second) communication system. It
indicates the relationships of all system parts necessary to
communicate in a party-line fashion among three computers.
C7 COLLINS RADIO CORPORATION
C-system overview 523-0561644-001736 Dallas Texas October 1
1969
This brochure describes the architecture of the Collins
C-System, especially the design and features of the Time
Division Exchange (TDX) loop. The TDX loop is a 32
million bit-per-second serial communication link. Communication between devices is at a 2 million word-per-second
rate. The system as initially implemented contained 16
channels, with expansion to a 512 million bit-per-second
capability envisioned.
C8 M E CONWAY
A multiprocessor system design
Proceedings FJCC 1963 AFIPS Press pp 139-146
This paper describes the design of a multiprocessor system
which uses a matrix switch (called a memory exchange) to
connect processors to memories. The unique feature of the
configuration is that an associative memory is placed
between each processor and the memory exchange for
addressing purposes.
C9 A J CRITCHLOW
Generalized multiprocessing and multiprogramming systems
Proceedings FJCC 1963 AFIPS Press pp 107-126
This paper describes the state of development of multiprocessor systems in 1963. There were essentially three bus
schemes in use: the crossbar switch (Burroughs D825), the
multiple bus (CDC-3600) and the time-shared bus (IBM
STRETCH). Functional descriptions of the bus concepts
are presented.
D1 R L DAVIS et al
A building block approach to multiprocessing
Proceedings FJCC 1972 AFIPS Press pp 685-703
This paper describes a bus structure (called a Switch
Interlock) for use in a multiprocessor. It discusses the
tradeoffs in choosing the structure, and looks at single bus,
multiple bus, multiport, and crossbar systems. The Switch
Interlock is a dedicated bus matrix switch which supports

both single word and block transfers. The switch is designed to be implemented for bus widths from
bit-serial to fully word-parallel.
D2 A J DEERFIELD
Architectural study of a distributed fetch computer
NAECON 1971 Record pp 214-217
This paper describes the distributed fetch computer in
which the fetch (procedure and data) portion of the machine
is distributed to the memory modules.
D3 A J DEERFIELD et al
Distributed fetch computer concept study
Air Force Contract No F-71-C-1417 February 1972
This report describes the design of a bus structure for use in
the distributed fetch computer. This machine repartitions
the fetch and execute portions of the processor in a multiprocessor system. The fetch units are associated with the
memories instead of being with the execute units, thus
decreasing bus traffic.
D4 A J DEERFIELD et al
Interim report for arithmetic and control logic design study
Navy Contract N62269-72-C-0023 May 1972
This report describes a proposed bus structure for the
Advanced Avionic Digital Computer and some of the
tradeoffs considered during the design.
D5 J B DENNIS S S PATIL
Computation structures
Chapter 4-Asynchronous Modular Systems
MIT Department of Electrical Engineering Cambridge
Massachusetts
This chapter describes the reasons for asynchronous
systems, and gives examples of asynchronous techniques
and their timing mechanisms. It is useful in understanding
asynchronous communications.
D6 E W DEVORE D H LANDER
Switching in a computer complex for I/O flexibility
1964 NEC pp 445-447
This paper describes the IBM 2816 Switching Unit, the bus
system utilized to interconnect CPU's and tape drives. It
discusses the modularity tradeoffs made in the 2816.
D7 DIGITAL EQUIPMENT CORPORATION
PDP-11 handbook
Chapter 8-Description of the UNIBUS pp 59-68 Maynard
Massachusetts 1969
This chapter of the PDP-11 user's manual describes the
UNIBUS functionally as a subsystem of the PDP-11. Data
transfer operations performed by the bus are described and
illustrated with examples, along with general concepts of bus
operation and control.
D8 DIGITAL EQUIPMENT CORPORATION
PDP-11 interface
Application Note Maynard Massachusetts
This document gives a brief description of the PDP-11
UNIBUS, a single undedicated bus with centralized
daisy-chain control and fully-interlocked request/acknowledge communication.
D9 DIGITAL EQUIPMENT CORPORATION
PDP-11 unibus interface manual
DEC-11-HIAB-D Maynard Massachusetts 1970
This manual gives a detailed description of the PDP-11
UNIBUS, its operation in the computer, and methods for
interfacing peripheral equipment to the bus.
D10 S B DINMAN
Direct function processor concept for system control
Computer Design March 1970 pp 55-60
This article describes the (patented) GRI-909 bus structure.

The machine consists of a series of functional modules strung
between two undedicated busses with a bus modifier unit
(which serves a function similar to the alpha code on the
Harvard MARK IV). The GRI-909 is quite similar to the
DEC PDP-11.
F1 K FERTIG B C DUNCAN
A new high-speed general purpose input/output mechanism
with real-time computing capability

Proceedings FJCC 1967 AFIPS Press pp 281-289
This paper describes techniques for I/O processing of
self-clocked data utilizing phase locked loops.
F2 H FRANK et al
Computer communication network design-experience with
theory and practice

SJCC 1972 AFIPS Press pp 255-270
This paper describes the ARPANET design from the
vantage point of two years experience with the message
switching system. ARPANET is a store and forward
message switching network in which a device interfaces into
the system by means of an interface message processor
(IMP). The IMP then routes the message through the
network topology. This paper provides insight into the
design and specification of dedicated "store-and-forward"
message switching systems.
G1 E C GANGL
Modular avionic computer

NAECON 1972 Record pp 248-251
This paper describes the architecture of a modular computer
including its internal bus structure. The bus consists of four
parallel segments: a data bus, a status bus, a microprogrammed command bus, and a power distribution bus.
G2 D H GIBSON
Considerations in block oriented systems design

Proceedings SJCC 1967 AFIPS Press pp 75-80
This paper describes the rationale and techniques for block
transfers between CPU and memory. The study is to
determine the effect of block size on CPU throughput.
G3 A I GROUDAN
The SKC-2000 advanced aerospace computer

NAECON 1972 Record pp 229-235
This paper describes the SKC-2000 computer and its
internal bus structure. The bus operates in a request/
acknowledge mode of communication and can handle
devices of different speeds from 1 microsecond to larger than
a millisecond with no design changes.
G4 H W GSCHWIND
Design of digital computers

Communications in Digital Computer Systems Chapter 8
Section 5 Springer-Verlag New York 1967 pp 347-367
This section describes computer I/O and access paths
(busses) in terms of their communication ramifications. It
points out that "even experts failed to look at computers
seriously from a communication point of view for a
surprisingly long time." It also details the communication
that occurs in some general computer configurations.
G5 D C GUNDERSON
Multi-processor computing apparatus

U S Patent 3521238 July 13 1967
This patent describes a method of bussing in a multiprocessor system based upon the use of an associative switch.
This bus scheme allows processors to access a centralized
system memory by either location or some property of the
data (content addressability). Each processor has its own
individual access to the system memory so the bus is very
reliable.

H1 M L HANSON
Input/output techniques for computer communication

Computer Design June 1969 pp 42-47
This article describes the I/O systems in several UNIVAC
machines, and considers the types of data transfers, status
words, number of lines, method of operation, etc., of these
bus structures.
H2 R H HARDIN
Self sequencing data bus technique for space shuttle

Proceedings Space Shuttle Integrated Electronic Conference
Vol 2 1971 pp 111-139
This presentation describes the design of SLAT (Slot
Assigned TDM), a data bus for space shuttle. SLAT is a
synchronous bus with global plus local synchronization. The
requirements, length, control method, clock skew, and
synchronization tradeoffs are discussed.
H3 H HELLERMAN
Digital computer system principles

Data Flow Circuits and Magnetic-Core Storage
McGraw-Hill New York 1967 Chapter 5 pp 207-235
This chapter contains a discussion of data flow or bus
circuits, with special emphasis on the trade-offs possible
between economy and speed. The author stresses the fact
that the bus organization of a computer is a major factor
determining its performance.
H4 G P HYATT
Digital data transmission

Computer Design Vol 6 No 11 November 1967 pp 26-30
This article deals primarily with the transmission of data in
a synchronous bus structure. It considers in detail the clock
skew problem, and describes propagation delay and
mechanization problems. It concludes that the clock pulse
should not be daisy-chained, but radially distributed, and
that the sum (worst case) of data propagation delays must
be less than the clock pulse period.
I1 F INOSE et al
A data highway system

Instrumentation Technology January 1971 pp 63-67
This article describes a data bus designed to interface many
digital devices together. The system is essentially a
nondedicated single bus with one wire for data and another
for addresses. The system is connected together in a "loop
configuration." It uses a "5-value pulse" for synchronization, etc. The system has an access time of 200 microseconds
and can handle 100 devices on a bus up to 1 kilometer in
length.
K1 J C KAISER J GIBBON
A simplified method of transmitting and controlling digital
data

Computer Design May 1970 pp 87-91
This article treats the tradeoffs between the number of
parallel lines in a bus and the complexity of gating at the bus
destinations. The authors develop a matrix switch concept
as a data exchange under program control. The programmed
instruction thus is able to dynamically interconnect system
elements by coded pulse coincidence control of the switching
matrix.
K2 H KANEKO A SAWAI
Multilevel PCM transmission over a cable using feedback
balanced codes

NEC 1967 pp 508-513
This paper describes a multilevel PCM code (Feedback
Balanced Code) suitable for transmission of data on a
coaxial transmission cable.

K3 L J KOCZELA
Distributed processor organization
Advances in Computers Vol 19 Chapter 7 Communication
Busses Academic Press New York 1968 pp 346-349
This author presents a functional description of a bussing
scheme for a distributed cellular computer. Each processor
can address its own private memory plus bulk storage.
Communication between cells takes place over the bus in
two modes: Local (between two cells) and Global (controller
call plus one or more controlled cells). The intercell bus is
used for both instructions and data; all transfers are set up
and directed by the controller cell by means of eight bus
control commands.
K4 G A KORN
Digital computer interface systems
Simulation December 1968 pp 285-298
This paper is a tutorial on digital computer interfaces. It
begins with the party line I/O bus, and covers how devices
are controlled, how interrupts are handled, and how data
channels operate. It discusses the overall subject of
interfaces (I/O and bussing system) from the systems point
of view, describing how the subsystems all relate to each
other.
L1 J R LAND
Data bus concepts for the space shuttle
Proceedings Space Shuttle Integrated Electronic Conference
Vol 3 1971 pp 710-785
This presents the space shuttle data management computer
architecture from a bus-oriented viewpoint. It discusses the
properties and design characteristics of the bus structures,
and summarizes the design and mechanization trade-offs.
L2 F J LANGLEY
A universal function unit for avionic and missile systems
NAECON Record 1971 pp 178-185
This paper discusses some trade-offs in computer architectures, and categorizes some architectures by their bus
structures, providing an example for each category. It
considers single time-shared bus systems, multiple bus
systems, crossbar systems, dual bus external ensemble
systems, multiple-bus integrated ensemble systems, etc.
L3 R LARKIN
A mini-computer multiprocessing system
Second Annual Computer Designers Conference Los Angeles
California February 1971 pp 231-235
The topology of communication between computer subsystems is discussed. Six basic topologies for communication
internal to a computer are described: (1) radial, (2) tree,
(3) bus, (4) matrix, (5) iterative, and (6) symmetric. Some
topological implications of bus structures are discussed
including the need to insure positive (one device) control of
the bus during its transmission phase. All six topologies can
be expressed in terms of dedicated and non-dedicated bus
structures.
L4 S E LASS
A fourth" generation computer organization
Proceedings SJCC 1968 AFIPS Press pp 435-441
This paper functionally describes the internal organization
of a "fourth-generation" computer including its data
channels and I/O bus structure.
L5 A L LEINER
Buffering between input/output and the computer
Proceedings FJCC 1962 pp 22-31
This paper describes the tradeoffs in synchronizing devices,
and considers solutions to the problem of buffering between
devices of different speeds.

L6 K N LEVITT
A study of data communication problems in a self-repairable
multiprocessor
Proceedings SJCC 1968 AFIPS Press pp 515-527
This paper presents a method of aerospace multiprocessor
reliability enhancement by dynamic reconfiguration using
busses which are data commutators. Two realizations of
such a bus technique are permutation switching networks
and crossbar switches.
L7 S Y LEVY
Systems utilization of large-scale integration
IEEE Transactions on Computers Vol EC-16 No 5 1967
pp 562-566
This paper describes a new approach to computer organization based on LSI technology, employing functional
partitioning of both the data path and control. Of particular
interest is the data bus structure of an RCA Laboratories
experimental machine using LSI technology.
L8 W A LEVY E W VEITCH
Design for computer communication systems
Computer Design January 1966 pp 36-41
This article relates memory size considerations to a user's
wait time for a line to the memory. It is applicable to bus
bandwidth design in the analysis of buffer sizes needed to
load up a bus structure.
L9 R C LUTZ
PCM using high speed memory system for switching
applications
Data and Communication Design May-June 1972 pp 26-28
This article details a method of replacing a crossbar switch
with a memory having an input and output commutation
system and some counting logic. Advantages of this
approach are low cost and linear growth.
M1 J S MAYO
An approach to digital system network
IEEE Transactions on Communication Technology April
1967 pp 307-310
This paper deals with synchronizing communication between devices with unlocked clocks. A system with frame
sync is postulated and the number of bits necessary for
efficient pulse stuffing is derived.
M2 J D MENG
A serial input/output scheme for small computers
Computer Design March 1970 pp 71-75
This article describes the trade-offs and results of designing
an I/O data bus structure for a minicomputer.
M3 J S MILLER et al
Multiprocessor computer system study
NASA Contract No 9-9763 March 1970
This report reviews the number and type of busses used in
several computing systems such as: CDC 6000, IBM DCS,
IBM 360 ASP series, IBM 4-Pi, Burroughs D825 and 5500,
etc. It goes on to suggest the design of a multiprocessor for
a space station. In particular the system has two busses,
one for I/O and one for internal transfers. Specifically
described are: message structure, access control, error
checking and required bandwidth. A 220 MHz bandwidth
requirement is deduced.
M4 W F MILLER R ASCHENBRENNER
The GUS multicomputer system
IEEE Transactions on Computers December 1963
pp 671-676
This paper describes an Argonne Lab experimental computer with several memory and processing subsystems. All
internal memory communication is handled by the Distributor, which functions as a data exchange and is
expandable. No detailed description of the Distributor operation is furnished.
M5 R C MINNICK et al
Cellular bulk transfer systems
Air Force Contract No F19628-67-C-0293 3 AD683744
October 1968
Part C of this report describes a bulk transfer system
composed of an input array, an output array, and a mapping
device. The mapping device moves data from the input to
the output array and may contain logic. Simple bulk
transfer systems are described which perform permutation
on the data during its mapping.
P1 P E PAYNE
A method of data transmission requiring maximum turnaround
time
Computer Design November 1968 p 82
This article describes a method of controlling data transmission between devices by polling.
P2 M PIRTLE
Intercommunication of processors and memory
Proceedings FJCC 1967 AFIPS Press pp 621-633
This paper discusses the throughput of several different bus
structures in a system configuration with the intent of
providing the appropriate amount of memory bandwidth.
It describes the allocation sequence of a typical bus, and
concludes that it can be very effective to assign "...
priorities to requests, rather than to processors and busses
... with memory systems which provide ample memory bus
bandwidth to the processors."
P3 W W PLUMMER
Asynchronous arbiters
Computation Structures Group Memo No 56 MIT Project
MAC February 1971
This memo describes logic for determining which of several
requesting CPU's gets access to a memory, and in what order.
It is potentially a portion of the control logic for a bus
structure, and describes several different algorithms for
granting access.
Q1 J T QUATSE et al
The external access network of a modular computer system
Proceedings SJCC 1972 AFIPS Press pp 783-790
This paper describes the External Access Network (EAN),
a switching network designed to interface processors to
processors, processors to facilities, and memory to facilities
in a modular time sharing system (PRIME) being built at
Berkeley. The EAN acts like a crossbar switch or data
exchange, and consists of processor, device, and switch
nodes. To communicate, a processor selects an available
switch node and connects the appropriate device node to it.
R1 R RICE W R SMITH
SYMBOL-a major departure from classic software dominated
Von Neumann computing systems
Proceedings SJCC 1971 AFIPS Press pp 575-587
This paper describes a functionally designed bus-oriented
system. The system bus consists of 200 interconnection lines
which run the length of the mainframe.
R2 R RINDER
The input/output architecture of minicomputers
Datamation May 1970 pp 119-124
This article surveys the architecture of minicomputer I/O
units. It describes a typical I/O bus and the lines of
information it would carry.
R3 M P RISTENBATT D R ROTHSCHILD
Asynchronous time multiplexing

IEEE Transactions on Communication Technology June
1968 pp 349-357
This paper describes the use of "asynchronous time
multiplexing" techniques on analog data. Basically, the
paper describes a synchronous system with non-dedicated
time slots.
R4 K ROEDL R STONER
Unique synchronizing technique increases digital transmission
rate
Electronics March 15 1963 pp 75-76
This note provides a method of synchronizing two devices
having local clocks of supposedly equal frequencies.
R5 K K ROY
Cellular bulk transfer system
PhD Thesis Montana State University Bozeman Montana
March 1970
Bulk transfer systems composed of input logic, output logic,
and a mapping device are studied. The influences of
mapping device, parallelism, etc., are considered.
S1 T SAITO H INOSE
Computer simulation of generalized mutually synchronized
systems
Symposium on Computer Processing in Communications
Polytechnic Institute of Brooklyn April 1969 pp 559-577
This paper describes ten ways to mutually synchronize
devices having separate clocks so that data can be accurately
delivered in the correct time slot of a synchronous system.
The results of the simulation relate to the stability of the
synchronizing methods.
S2 J SANTOS M I OTERO
On transferences and priorities in computer networks
Symposium on Computers and Automata Vol 21 1971
pp 265-275
The structure of bus (channel) controllers is considered
using the language of automata theory. The controller is
decomposed into two units: one receives requests and
availability signals, and generates corresponding requests to
the other unit which allocates the bus on a priority basis.
Both units are further decomposed into subunits.
S3 J W SCHWARTZ
Synchronization in communication satellite systems
NEC 1967 pp 526-527
This paper describes tradeoffs and potential solutions to the
clock skew problem in a widely dispersed system.
S4 C D SMITH
Optimization of design parameters for serial TDM
Computer Design January 1972 pp 51-54
This article derives analytical tools for the analysis and
optimization of a synchronous system with global plus local
timing.
S5 D J SPENCER
Data bus design techniques
NASA TM-X-52876 Vol VI pp 95-113
This paper discusses design alternatives for a multiplexed
data bus to reduce point-to-point wiring cost and complexity. The author investigates coupling, coding, and
control factors for both low and high signal-to-noise ratio
lines for handling a data rate less than five million bits per
second.
S6 D C STANGA
Univac 1108 multiprocessor system
Proceedings SJCC 1971 AFIPS Press pp 67-74
This paper describes how memory accesses are made from
the multiple processors to the multiple memory banks in the
1108 multiprocessor system. It gives a block diagram of the

system interconnectivity and describes how the multiple module access units operate to provide multiple
access paths to a memory module.
S7 D J STIGLIANI et al
Wavelength division multiplexing in light interface technology
AD-721085 March 1971
This report describes the fabrication of a five-channel
optical multiplexed communication line, and suggests some
alternatives for matching wavelength multiplexed light
transmission times to digital electrical circuits.
S8 J N STURMAN
An iteratively structured general purpose digital computer
IEEE Transactions on Computers January 1968 pp 2-9
This paper describes a bus and its use in an iterative
computer. The system is a dual dedicated bus structure with
centralized control.
S9 J N STURMAN
Asynchronous operation of an iteratively structured general
purpose digital computer
IEEE Transactions on Computers January 1968 pp 10-17
This paper describes the synchronization of an iterative
structure computer. The processing elements are connected
on a common complex symbol bus. To allow asynchronous
operation, a set of timing busses are added to the system
common complex symbol bus. The timing busses take
advantage of their transmission line properties to provide
synchronism of the processors.
T1 F W THOBURN
A transmission control unit for high speed computer-to-computer communication
IBM Journal of Research and Development November 1970
pp 614-619
This paper describes a multiplex bus system for connecting
a large number of computers together in a star organization.
Special emphasis is given to the transmission control unit,
a microprogrammed polling and interface unit which uses
synchronous two-frequency modulation and a serializer/
de-serializer unit.
T2 K J THURBER
Programmable indexing networks
Proceedings SJCC 1970 AFIPS Press pp 51-58
This paper describes data routing networks designed to
perform a generalized index on the data during the routing
process. The indexing networks map an input vector onto
an output vector. The mapping is arbitrary and programmable. Several different solutions are presented with varying
hardware, speed, and timing requirements. The networks
are described in terms of shift register implementations.
T3 K J THURBER
Permutation switching networks
Proceedings of the 1971 Computer Designer's Conference
Industrial and Scientific Conference Management Chicago
Illinois January 1971 pp 7-24
This paper describes several permutation networks designed
to provide a programmable system capable of interconnecting system elements. The networks are partitioned for LSI
implementation and can be utilized in a pipeline fashion.
Algorithms are given to determine a program to produce any
of the N! possible permutations of N input lines.
T4 K J THURBER et al
Master executive control for AADC
Navy Contract N62269-72-C-0051 June 18 1972
This report describes a systematic approach to the design of
digital bus structures and applies this tool to the design of a
bus structure for the Advanced Avionic Digital Computer.

The structure is designed with three major requirements: flexibility, modularity, and reliability.
T5 A TURCZYN
High speed data transmission scheme
Proceedings 3rd Univac DPD Research and Engineering
Symposium May 1968
The increasing complexity of multiprocessor computer
systems with a high degree of parallelism within the
computer system has created major internal communication
problems. If each processing unit should be able to communicate with many other subsystems, the author recommends either a data exchange, or switching center, or
parallel point-to-point wiring. The latter has the advantage
of fast transfer and minimal data registers, but in a
multiprocessor it results in a large number of cables. This
paper discusses the state-of-the-art of internal multiplexing
and multi-level coding schemes for reducing the number of
lines in the system.
W1 E G WAGNER
On connecting modules together uniformly to form a modular
computer
IEEE Transactions on Computers December 1966
pp 864-872
This paper provides mathematical group theoretic precision
to the idea of uniform bus structure in cellular computers.
W2 P W WARD
A scheme for dynamic priority control in demand actuated
multiplexing
IEEE Computer Society Conference Boston September
1971 pp 51-52
This paper describes a priority conflict resolution method
which is used in an I/O multiplexer system.
W3 R WATSON
Timesharing system design concepts
Chapter 3-Communications McGraw-Hill 1970 pp 78-110
This chapter provides a summary of "communication"
among memories, processors, IOP's, etc. The discussion is
oriented toward example configurations. Subjects discussed
are: (1) use of multiple memory modules, interleaving, and
buffering to increase memory bandwidth; (2) connection of
subsystems using direct connections, crossbar switches,
multiplexed busses, etc.; and (3) the transmission medium.
Items discussed under transmission medium are synchronous
and asynchronous transmission, line types (simplex, halfduplex, and full-duplex), modulation, etc.
W4 D R WELLER
A loop communication system for I/O to a small multi-user
computer
IEEE Computer Society Conference Boston September
1971 pp 49-50
This paper describes a single-line non-dedicated bus with
daisy-chained control for the DDP-516 computer. Message
format and speed of operation are detailed.
W5 G P WEST R J KOERNER
Communications within a polymorphic intellectronic system
Proceedings of Western Joint Computer Conference San
Francisco May 3-5 1960 pp 225-230
This paper describes a crosspoint data exchange used in the
RW-400 computer. The switch was mechanized using
transfluxor cores.
W6 L P WEST
Loop-transmission control structures
IEEE Transactions on Communications June 1972
pp 531-539
This paper considers the problem of transmitting data on a

740

Fall Joint Computer Conference, 1972

communication loop. It discusses time slots, frame pulses,
addressing techniques, and efficiency of utilization. It also
discusses a number of ways of assigning time slots for
utilization, and the impact of slot size on loop utilization
efficiency.
W7 M W WILLARD L J HORKAN
Maintaining bit integrity in time division transmission
NAECON 1971 Record pp 240-247
This paper describes the tradeoffs involved in synchronizing
high speed digital subsystems which are communicating
over large distances. It considers clocking and buffering
tradeoffs.
W8 D R WULFINGHOFF
Code activated switching-a solution to multiprocessing
problems
Computer Design April 1971 pp 67-71
The author points out that multiprocessor computer
configurations have a large number of interconnections
between elements causing considerable hardware and
software complexity. He describes a technique whereby
each program to be run is assigned a code, identifier, or
signature; then when this program is activated the system
resources it requires can be "lined-up" for use. He compares
this scheme to that employed for telephone switching. Code
activated switching is illustrated by two system block
diagrams: a special purpose control computer and a general
purpose time-shared computer.
Y1 B S YOLKEN
Data bus-method for data acquisition and distribution
within vehicles
NAECON 1971 Record pp 248-253
This paper discusses a time division multiplexed bus, and

considers bus control, bit synchronization, and technology
tradeoffs.
Z1 R E ZIMMERMAN
The structure and organization of communication processors
PhD Dissertation Electrical Engineering Department
University of Michigan September 1971
This dissertation describes a multi-bus computer used as a
terminal processor. It has a pair of instruction busses which
start and then signal completion of processes performed in
functional units or subsystems. The machine has three data
busses: a memory bus which serves as the primary system
communication bus, a flag address bus, and a flag data bus.
All busses are eight bits wide and the three data busses are
bidirectional.
Z2 R J ZINGG
Structure and organization of a pattern processor for
hand-printed character recognition
PhD Dissertation Iowa State University Ames Iowa 1968
This dissertation describes a bus-oriented special purpose
computer designed for research in character recognition.
The machine contains a control bus, a scratchpad memory
bus, and three data busses. Each register that can be
reached by a data bus has two control flip-flops associated
with it and these determine to which data bus it is to be
connected. These connections are controlled by a hardware
command. The contents of several registers can be placed on
one data bus to yield a bit-by-bit logical inclusive OR.
Also, the contents of one data bus can be transferred to
several registers and the contents of all three busses
transferred in parallel under program command. This
processor is a rather interesting example of a five bus
processor.

Improvements in the design and
performance of the ARPA network
by J. M. McQUILLAN, W. R. CROWTHER, B. P. COSELL, D. C. WALDEN, and
F. E. HEART
Bolt Beranek and Newman Inc.
Cambridge, Massachusetts

INTRODUCTION


In late 1968 the Advanced Research Projects Agency
of the Department of Defense (ARPA) embarked on
the implementation of a new type of computer network
which would interconnect, via common-carrier circuits,
a number of dissimilar computers at widely separated,
ARPA-sponsored research centers. The primary purpose
of this interconnection was resource sharing, whereby
persons and programs at one research center might
access data and interactively use programs that exist
and run in other computers of the network. The interconnection was to be realized using wideband leased
lines and the technique of message switching, wherein a
dedicated path is not set up between computers desiring
to communicate, but instead the communication takes
place through a sequence of messages each of which
carries an address. A message generally traverses
several network nodes in going from source to destination, and at each node a copy of the message is stored
until it is safely received at the following node.
The ARPA Network has been in operation for over
three years and has become a national facility. The
network has grown to over thirty sites spread across the
United States, and is steadily growing; over forty
independent computer systems of varying manufacture
are interconnected; provision has been made for terminal
access to the network from sites which do not enjoy the
ownership of an independent computer system; and
there is world-wide excitement and interest in this type
of network, with a number of derivative networks in
their formative stages. A schematic map of the ARPA
Network as of the fall of 1972 is shown in Figure 1.
As can be seen from the map, each site in the ARPA
Network consists of up to four independent computer
systems (called Hosts) and one communications processor called an Interface Message Processor, or IMP.
All of the Hosts at a site are directly connected to the IMP. Some IMPs also provide the ability to connect
terminals directly to the network; these are called Terminal Interface Message Processors, or TIPs. The
IMPs are connected together by wideband telephone lines and provide a subnet through which the Hosts
communicate. Each IMP may be connected to as many as five other IMPs using telephone lines with bandwidths
from 9.6 to 230.4 kilobits per second. The typical bandwidth is 50 kilobits.
During these three years of network growth, the actual user traffic has been light and network performance
under such light loads has been excellent. However, experimental traffic, as well as simulation studies,
uncovered logical flaws in the IMP software which degraded performance at heavy loads. The software was
therefore substantially modified in the spring of 1972. This paper is largely addressed to describing the new
approaches which were taken.
The first section of the paper considers some criteria of good network design and then presents our new
algorithms in the areas of source-to-destination sequence and flow control, as well as our new IMP-to-IMP
acknowledgment strategy. The second section addresses changes in program structure; the third section
re-evaluates the IMP's performance in light of these changes. The final section mentions some broader
issues.
The initial design of the ARPA Network and the IMP was described at the 1970 Spring Joint Computer
Conference,1 and the TIP development was described at the 1972 Spring Joint Computer Conference.2 These
papers are important background to a reading of the present paper.
Figure 1-ARPA network, logical map, August 1972

NEW ALGORITHMS
A balanced design for a communication system should
provide quick delivery of short interactive messages
and high bandwidth for long files of data. The IMP
program was designed to perform well under these
bimodal traffic conditions. The experience of the first
two and one half years of the ARPA Network's operation indicated that the performance goal of low delay
had been achieved. The lightly-loaded network delivered short messages over several hops in about
one-tenth of a second. Moreover, even under heavy
load, the delay was almost always less than one-half
second. The network also provided good throughput
rates for long messages at light and moderate traffic
levels. However, the throughput of the network degraded significantly under heavy loads, so that the goal
of high bandwidth had not been completely realized.
We isolated a problem in the initial network design
which led to degradation under heavy loads.3,4 This
problem involves messages arriving at a destination
IMP at a rate faster than they can be delivered to the
destination Host. We call this reassembly congestion.
Reassembly congestion leads to a condition we call
reassembly lockup in which the destination IMP is
incapable of passing any traffic to its Hosts. Our algorithm to prevent reassembly congestion and the
related sequence control algorithm are described in
the following subsections.
We also found that the IMP and line bandwidth requirements for handling IMP-to-IMP traffic could be
substantially reduced. Improvements in this area translate directly into increases in the maximum
throughput rate that an IMP can maintain. Our new algorithm in this area is also given below.

Source-to-destination flow control

For efficiency, it is necessary to provide, somewhere in the network, a certain amount of buffering between
the source and destination Hosts, preferably an amount equal to the bandwidth of the channel between the
Hosts multiplied by the round trip time over the channel. The problem of flow control is to prevent
messages from entering the network for which network buffering is not available and which could congest the
network and lead to reassembly lockup, as illustrated in Figure 2.

Figure 2-Reassembly lockup
In Figure 2, IMP 1 is sending multi-packet messages
to IMP 3; a lockup can occur when all the reassembly
buffers in IMP 3 are devoted to partially reassembled
messages A and B. Since IMP 3 has reserved all its
remaining space for awaited packets of these partially
reassembled messages, it can only take in those particular packets from IMP 2. These outstanding packets,
however, are two hops away in IMP 1. They cannot get
through because IMP 2 is filled with store-and-forward
packets of messages C, D, and E (destined for IMP 3)
which IMP 3 cannot yet accept. Thus, IMP 3 will never
be able to complete the reassembly of messages A
andB.
The original network design based source-to-destination sequence and flow control on the link mechanism
previously reported in References 1 and 5. Only a single
message on a given link was permitted in the subnetwork at one time, and sequence numbers were used to
detect duplicate messages on a given link.
We were always aware that Hosts could defeat our
flow control mechanism by "spraying" messages over an
inordinately large number of links, but we counted on
the nonmalicious behavior of the Hosts to keep the

number of links in use below the level at which problems
occur. However, simulations and experiments artificially
loading the network demonstrated that communication
between a pair of Hosts on even a modest number of
links could defeat our flow control mechanism; further,
it could be defeated by a number of Hosts communicating with a common site even though each Host used
only one link. Simulations3,4 showed that reassembly
lockup may eventually occur when over five links to
a particular Host are simultaneously in use. With ten
or more links in use with multipacket messages, reassembly lockup occurs almost instantly.
If the buffering is provided in the source IMP, one
can optimize for low delay transmissions. If the buffering is provided at the destination IMP, one can optimize
for high bandwidth transmissions. To be consistent with our view of a balanced communications system,
we have developed an approach to reassembly congestion which utilizes some buffer storage at both the
source and destination; our solution also utilizes a request mechanism from source IMP to destination
IMP.*
Specifically, no multipacket message is allowed to
enter the network until storage for the message has been
allocated at the destination IMP. As soon as the source IMP takes in the first packet of a multipacket message,
it sends a small control message to the destination IMP requesting that reassembly storage be reserved at the
destination for this message. It does not take in further packets from the Host until it receives an allocation
message in reply. The destination IMP queues the request and sends the allocation message to the source
IMP when enough reassembly storage is free; at this point the source IMP sends the message to the destination.
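
The source-IMP side of this handshake can be pictured, purely as an illustration, with the short C sketch below. The state names and routines are invented here; the actual IMP program is organized quite differently, and nothing about its internal structure should be read from this example.

/* Illustrative sketch (not the actual IMP code) of the source-IMP side of the
 * multipacket allocation handshake described above.                          */
#include <stdio.h>

enum state { IDLE, AWAITING_ALLOCATION, SENDING };

struct mp_message {
    enum state st;
    int packets_total;     /* packets in the whole message                    */
    int packets_sent;      /* packets already handed to the subnet            */
};

/* The Host offers a multipacket message: take packet 1, buffer it, and ask
 * the destination IMP to reserve reassembly storage.                         */
static void first_packet_from_host(struct mp_message *m, int total)
{
    m->st = AWAITING_ALLOCATION;
    m->packets_total = total;
    m->packets_sent = 0;
    printf("buffer packet 1; send REQUEST-FOR-ALLOCATION to destination IMP\n");
    /* no further packets are taken from the Host while we wait               */
}

/* The destination IMP has reserved storage and returned an allocation.       */
static void allocation_received(struct mp_message *m)
{
    m->st = SENDING;
    printf("send buffered packet 1\n");
    m->packets_sent = 1;
    while (m->packets_sent < m->packets_total) {
        m->packets_sent++;
        printf("take packet %d from the Host and send it\n", m->packets_sent);
    }
    m->st = IDLE;            /* message is on its way; now await the RFNM     */
}

int main(void)
{
    struct mp_message m = { IDLE, 0, 0 };
    first_packet_from_host(&m, 4);   /* a four-packet message                 */
    allocation_received(&m);         /* ALLOCATE arrives some time later      */
    return 0;
}
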
We maximize the effective bandwidth for sequences
of long messages by permitting all but the first message
to bypass the request mechanism. When the message
itself arrives at the destination, and the destination
IMP is about to return the Ready-For-Next-Message
(RFNM), the destination IMP waits until it has room
for an additional multipacket message. It then piggybacks a storage allocation on the RFNM. If the source
Host is prompt in answering the RFNM with its next
message, an allocation is ready and the message can be
transmitted at once. If the source Host delays too long, or
if the data transfer is complete, the source IMP returns
the unused allocation to the destination. With this
mechanism we have minimized the inter-message delay and the Hosts can obtain the full bandwidth of the
network.

* This mechanism is similar to that implemented at the level of Host-to-Host protocol,6,7,8 indicative of the
fact that the same sort of problems occur at every level in a communications system.
We minimize the delay for a short message by transmitting it to the destination immediately while keeping
a copy in the source IMP. If there is space at the
destination, it is accepted and passed on to a Host and
a RFNM is returned; the source IMP discards the
message when it receives the RFNM. If not, the
message is discarded, a request for allocation is queued
and, when space becomes available, the source IMP
is notified that the message may now be retransmitted.
Thus, no setup delay is incurred when storage is available at the destination.
The above mechanisms make the IMP network
much less sensitive to unresponsive Hosts, since the
source Host is effectively held to a transmission rate
equal to the reception rate of the destination Host.
Further, reassembly lockup is prevented because the
destination IMP will never have to turn away a multipacket message destined for one of its Hosts, since
reassembly storage has been allocated for each such
message in the network.
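
As a further illustration only, the destination-IMP bookkeeping implied by the preceding paragraphs can be sketched as a free-buffer count plus a queue of waiting requests. The buffer count, the one-buffer-per-message simplification, and all names below are our own assumptions, not details from the IMP program.

/* Toy sketch of destination-IMP allocation bookkeeping (assumed details).   */
#include <stdio.h>

#define REASSEMBLY_BUFFERS 8      /* assumed number of reassembly buffers    */
#define MAX_PENDING        32

static int free_buffers = REASSEMBLY_BUFFERS;
static int pending[MAX_PENDING];
static int head, tail;

/* A REQUEST-FOR-ALLOCATION arrives from a source IMP.                       */
static void allocation_request(int source_imp)
{
    if (free_buffers > 0) {
        free_buffers--;
        printf("send ALLOCATE to IMP %d\n", source_imp);
    } else {
        pending[tail++ % MAX_PENDING] = source_imp;   /* wait for storage    */
    }
}

/* A reassembled message has been passed to its Host and its storage freed;
 * the space either satisfies the oldest queued request or can be offered as
 * an allocation piggybacked on the outgoing RFNM.                           */
static void message_delivered(void)
{
    free_buffers++;
    if (head != tail) {
        free_buffers--;
        printf("send ALLOCATE to waiting IMP %d\n", pending[head++ % MAX_PENDING]);
    } else {
        printf("storage free: piggyback an allocation on the RFNM\n");
    }
}

int main(void)
{
    for (int i = 1; i <= 10; i++)
        allocation_request(i);        /* requests 9 and 10 must wait         */
    message_delivered();              /* frees one buffer; IMP 9 gets it     */
    return 0;
}
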

Source-to-destination sequence control
In addition to its primary function as a flow control
mechanism, the link mechanism also originally provided
the basis for source-to-destination sequence control.
Since only one message was permitted at a time on a
link, messages on each link were kept in order; duplicates
were detected by the sequence number maintained for
each link. In addition, the IMPs marked any message
less than 80 bits long as a priority message and gave it
special handling to speed it across the network, placing
it ahead of long messages on output queues.
The tables associated with the link mechanism in
each IMP were large and costly to access. Since the
link mechanism was no longer needed for flow control,
we felt that a less costly mechanism should be employed
for sequence control. We thus decided to eliminate the
link mechanism from the IMP subnetwork. RFNMs are
still returned to the source Host on a link basis, but
link numbers are used only to allow Hosts to identify
messages. To replace the per-link sequence control
mechanism, we decided upon a sequence control
mechanism based on a single logical "pipe" between
each source and destination IMP. Each IMP maintains
an independent message number sequence for each
pipe. A message number is assigned to each message at
the source IMP and this message number is checked at
the destination IMP. All Hosts at the source and
destination IMPs share this message space. Out of an
eight-bit message number space (large enough to
accommodate the settling time of the network), both
the source and destination keep a small window of
currently valid message numbers, which allows several
messages to be in the pipe simultaneously. Messages
arriving at a destination IMP with out-of-range message
numbers are duplicates to be discarded. The window is
presently four numbers wide, which seems about right
considering the response time required of the network.
The message number serves two purposes: it orders the
four messages that can be in the pipe, and it allows
detection of duplicates. The message number is internal
to the IMP subnetwork and is invisible to the Hosts.
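
One common way to implement such a window test, shown below only as an illustration of the arithmetic (the IMP's actual code is not given in the paper), is to take the difference of the eight-bit numbers modulo 256 and compare it with the window size.

/* Window test for eight-bit message numbers; the four-number window is from
 * the paper, the rest is an illustrative implementation choice.             */
#include <stdint.h>
#include <stdio.h>

#define WINDOW 4

/* Nonzero if message number n lies in the window that starts at 'expected';
 * the subtraction wraps modulo 256 because the numbers are eight bits wide. */
static int in_window(uint8_t expected, uint8_t n)
{
    return (uint8_t)(n - expected) < WINDOW;
}

int main(void)
{
    uint8_t expected = 254;               /* valid numbers: 254, 255, 0, 1   */
    for (unsigned n = 250; n < 260; n++)
        printf("msg %3u: %s\n", n & 0xFFu,
               in_window(expected, (uint8_t)n) ? "accept" : "duplicate/out of range");
    return 0;
}
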
A sequence control system based on a single source/
destination pipe, however, does not permit priority
traffic to go ahead of other traffic. We solved this
problem by permitting two pipes between each source
and destination, a priority (or low delay) pipe and a
nonpriority (or high bandwidth) pipe. To avoid having
each IMP maintain two eight-bit message number
sequences for every other IMP in the network, we
coupled the low delay and high bandwidth pipe so that
duplicate detection can be done in common, thus requiring only one eleven-bit message number sequence
for each IMP.
The eleven-bit number consists of a one-bit priority/
non-priority flag, two bits to order priority messages,
and eight bits to order all messages. For example, if we
use the letters A, B, C, and D to denote the two-bit
order numbers for priority messages and the absence of
a letter to indicate a nonpriority message, we can
describe a typical situation as follows: The source IMP
sends out nonpriority message 100, then priority
messages 101A and 102B, and then nonpriority message
103. Suppose the destination IMP receives these
messages in the order 102B, 101A, 103, 100. It passes
these messages to the Host in the order 101A, 102B,
100, 103. Message number 100 could have been sent to
the destination Host first if it had arrived at the
destination first, but the priority messages are allowed
to "leapfrog" ahead of message number 100 since it
was delayed in the network. The IMP holds 102B until
101A arrives, as the Host must receive priority message
A before it receives priority message B. Likewise,
message 100 must be passed to the Host before message
103.
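
One way the eleven bits could be packed is sketched below. The paper gives only the field widths (a priority flag, two bits ordering priority messages, eight bits ordering all messages); the bit positions chosen here are our assumption, purely for illustration.

/* Assumed packing of the eleven-bit number: bit 10 = priority flag,
 * bits 9..8 = order among priority messages (A=0 ... D=3),
 * bits 7..0 = order among all messages.                                     */
#include <stdint.h>
#include <stdio.h>

static uint16_t pack(int priority, unsigned prio_order, unsigned order)
{
    return (uint16_t)(((priority & 1) << 10) |
                      ((prio_order & 3) << 8) |
                      (order & 0xFFu));
}

int main(void)
{
    /* the paper's example sequence: 100, 101A, 102B, 103                    */
    uint16_t m100  = pack(0, 0, 100);
    uint16_t m101A = pack(1, 0, 101);
    uint16_t m102B = pack(1, 1, 102);
    uint16_t m103  = pack(0, 0, 103);
    printf("%03X %03X %03X %03X\n",
           (unsigned)m100, (unsigned)m101A, (unsigned)m102B, (unsigned)m103);
    return 0;
}
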

Hosts may, if they choose, have several messages
outstanding simultaneously to a given destination but,
since priority messages can "leapfrog" ahead, and the
last message in a sequence of long messages may be
short, priority can no longer be assigned strictly on the
basis of message length. Therefore, Hosts must explicitly indicate whether a message has priority or not.

With message numbers and reserved storage to be
accurately accounted for, cleaning up in the event of a
lost message must be done carefully. The source IMP
keeps track of all messages for which a RFNM has not
yet been received. When the RFNM is not received for
too long (presently about 30 seconds), the source IMP
sends a control message to the destination inquiring
about the possibility of an incomplete transmission.
The destination responds to this message by indicating
whether the message in question was previously received
or not. The source IMP continues inquiring until it
receives a response. This technique guarantees that the
source and destination IMPs keep their message
number sequences synchronized and that any allocated
space will be released in the rare case that a message is
lost in the subnetwork because of a machine failure.
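
The timeout-and-inquire loop just described can be summarized with the minimal sketch below; only the 30-second figure comes from the paper, and the structure and names are invented for illustration.

/* Minimal sketch of the source-IMP lost-message cleanup (invented names).   */
#include <stdio.h>

#define RFNM_TIMEOUT_SECONDS 30

struct outstanding {
    int message_number;
    int seconds_waiting;
    int rfnm_received;
};

/* Called once per second for each message whose RFNM has not yet arrived.   */
static void tick(struct outstanding *o)
{
    if (o->rfnm_received)
        return;
    if (++o->seconds_waiting >= RFNM_TIMEOUT_SECONDS) {
        printf("inquire at destination about message %d\n", o->message_number);
        o->seconds_waiting = 0;     /* keep inquiring until a response comes */
    }
}

int main(void)
{
    struct outstanding o = { 17, 0, 0 };
    for (int s = 0; s < 65; s++)
        tick(&o);                   /* two inquiries are sent in 65 seconds  */
    return 0;
}
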
IMP-to-IMP transmission control

We have adopted a new technique for IMP-to-IMP
transmission control which improves efficiency by
10-20 percent over the original separate acknowledge/timeout/retransmission approach described in
Reference 1. In the new scheme, which is also used for
the Very Distant Host,9 and which is similar to Reference 10, each physical network circuit is broken into a
number of logical "channels," currently eight in each
direction. Acknowledgments are returned "piggybacked" on normal network traffic in a set of acknowledgment bits, one bit per channel, contained in every
packet, thus requiring less bandwidth than our original
method of sending each acknowledge in its own packet.
The size of this saving is discussed later in the paper. In
addition, the period between retransmissions has been
made dependent upon the volume of new traffic. Under
light loads the network has minimal retransmission
delays, and the network automatically adjusts to
minimize the interference of retransmissions with new
traffic.
Each packet is assigned to an outgoing channel and
carries the "odd/even" bit for its channel (which is
used to detect duplicate packet transmissions), its
channel number, and eight acknowledge bits-one for
each channel in the reverse direction.
The transmitting IMP continually cycles through its
used channels (those with packets associated with
them), transmitting the packets along with the channel
number and the associated odd/even bit. At the receiving IMP, if the odd/even bit of the received packet
does not match the odd/even bit associated with the
appropriate receive channel, the packet is accepted and
the receive odd/even bit is complemented, otherwise
the packet is a duplicate and is discarded.

Every packet arriving over a line contains acknowledges for all eight channels. This is done by copying
the receive odd/even bits into the positions reserved for
the eight acknowledge bits in the control portion of
every packet transmitted. In the absence of other
traffic, the acknowledges are returned in "null packets"
in which only the acknowledge bits contain relevant
information (i.e., the channel number and odd/even bit
are meaningless; null packets are not acknowledged).
When an IMP receives a packet, it compares (bit by
bit) the acknowledge bits against the transmit odd/even
bits. For each match found, the corresponding channel
is marked unused, the corresponding packet is discarded, and the transmit odd/even bit is complemented.
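
The per-channel bookkeeping just described is essentially an alternating-bit scheme on each of the eight channels. The C sketch below illustrates only that bookkeeping; the data structures and names are ours, and the initialization shown (transmit bits starting opposite the receive bits so the first packet on each channel is accepted) is an assumption, since the paper does not describe initialization.

/* Illustrative per-line channel state for the scheme described above.       */
#include <stdint.h>
#include <stdio.h>

struct line {
    uint8_t tx_odd_even;   /* bit i = odd/even bit carried on outgoing channel i */
    uint8_t rx_odd_even;   /* bit i = odd/even bit of last packet accepted on i   */
    uint8_t chan_in_use;   /* outgoing channels holding an unacknowledged packet  */
};

/* Receive side: accept only if the packet's bit differs from our stored bit. */
static int accept_packet(struct line *l, unsigned chan, unsigned odd_even)
{
    unsigned mine = (l->rx_odd_even >> chan) & 1u;
    if (odd_even == mine)
        return 0;                                  /* duplicate: discard      */
    l->rx_odd_even ^= (uint8_t)(1u << chan);       /* flip and accept         */
    return 1;
}

/* The eight acknowledge bits of every outgoing packet are simply a copy of
 * the receive odd/even bits.                                                 */
static uint8_t ack_bits(const struct line *l) { return l->rx_odd_even; }

/* Transmit side: a channel is acknowledged when the incoming ack bit matches
 * the transmit odd/even bit; the channel is freed and its bit complemented.  */
static void process_acks(struct line *l, uint8_t acks)
{
    uint8_t matches = (uint8_t)(~(acks ^ l->tx_odd_even)) & l->chan_in_use;
    l->chan_in_use &= (uint8_t)~matches;           /* free channels, drop copies */
    l->tx_odd_even ^= matches;                     /* complement matched bits    */
}

int main(void)
{
    struct line a = { 0xFF, 0x00, 0x00 };          /* assumed starting state     */
    struct line b = { 0xFF, 0x00, 0x00 };
    a.chan_in_use |= 0x01;                         /* A has a packet on channel 0 */
    unsigned sent_bit = a.tx_odd_even & 1u;        /* bit carried by that packet  */
    printf("B accepts? %d\n", accept_packet(&b, 0, sent_bit));      /* 1 */
    process_acks(&a, ack_bits(&b));                /* ack piggybacked on B's traffic */
    printf("A channel 0 still in use? %d\n", a.chan_in_use & 1);    /* 0 */
    return 0;
}
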
In view of the large number of channels, and the
delay that is encountered on long lines, some packets
may have to wait an inordinately long time for transmission. We do not want a one-character packet to
wait for several thousand-bit packets to be transmitted, multiplying by 10 or more the effective delay
seen by the source. We have, therefore, instituted the
following transmission ordering scheme: priority packets
which have never been transmitted are sent first; next
sent are any regular packets which have never been
transmitted; finally, if there are no new packets to
send, previously transmitted packets which are unacknowledged are sent. Of course, unacknowledged
packets are periodically retransmitted even when there
is a continuous stream of new traffic.
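
The three-level ordering rule can be expressed directly, as in the short sketch below; the queue layout and names are our own illustration, not the IMP program's.

/* Sketch of the output ordering: new priority, then new regular, then
 * previously transmitted but unacknowledged packets.                        */
#include <stddef.h>
#include <stdio.h>

struct pkt { const char *name; int priority; int ever_transmitted; int acked; };

static struct pkt *next_to_send(struct pkt *q, size_t n)
{
    for (size_t i = 0; i < n; i++)                 /* pass 1: new priority    */
        if (q[i].priority && !q[i].ever_transmitted) return &q[i];
    for (size_t i = 0; i < n; i++)                 /* pass 2: new regular     */
        if (!q[i].ever_transmitted) return &q[i];
    for (size_t i = 0; i < n; i++)                 /* pass 3: retransmissions */
        if (!q[i].acked) return &q[i];
    return NULL;
}

int main(void)
{
    struct pkt q[] = {
        { "old unacked",  0, 1, 0 },
        { "new regular",  0, 0, 0 },
        { "new priority", 1, 0, 0 },
    };
    struct pkt *p = next_to_send(q, 3);
    printf("send: %s\n", p ? p->name : "(nothing)");   /* "new priority"     */
    return 0;
}
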
In implementing the new IMP-to-IMP acknowledgment system, we encountered a race problem. The
strategy of continuously retransmitting a packet in the
absence of other traffic introduced difficulties which
were not encountered in the original system, which
COMMON STORE
RELOAD I DIAGNOSTICS
INITIALIZATION ITABLES ~~;
~~
BACKGROUND
~
TASK STORE a FORWARD ~
~
TASK REASSEMBLY
0:~
TASK REPLY
f0:-0
~
MODEM TO IMP
~""-""-""-""-""-""-""-""-"
-0~
IMP TO MODEM
~~
-0
HOST TO IMP
-0
HOST TO IMP
-0
~~~-:-=:IM7.:P:--:T==O:-,;H~O:,=ST;-_ _ _-t~~ 24 PAGES
~
IMP TO HOST
~
SS1
TIMEOUT
~~
~

~

t-0

DEBUG
STATISTICS
STATISTICS

~ STAT. TABI.£S"

~

S::~
.~~

-"

~

MESSAGE TABLES, ALLOCATE TABLES ~
ROUTING TABLES
to:::~'"
~___V=E~RY~D=IS~T~AN~T~H=OS=T_____~, ___
-0

Figure 3-Map of core storage

I PAGE =512 WORDS
BUFFER STORAGE
PROTECTED PAGE

745

retransmitted only after a long timeout. If an acknowledgment arrives for a packet which is currently being
retransmitted, the output routine must prevent the
input routine from freeing the packet. Without these
precautions, the header and data in the packet could be
changed while the packet was being retransmitted, and
all kinds of "impossible" conditions result when this
"composite" packet is received at the other end of the
line. It took us a long time to find this bug!*
PROGRAM STRUCTURE
Implementation of the IMPs required the development of a sophisticated computer program. This program was previously described in Reference 1. As
stated then, the principal function of the IMP program
is the processing of packets, including the following:
segmentation of Host messages into packets; receiving,
routing, and transmitting of store-and-forward packets;
retransmitting unacknowledged packets; reassembling
packets into messages for transmission into a Host; and
generating RFNMs and other control messages. The
program also monitors network status, gathers statistics, and performs on-line testing. The program was
originally designed, constructed, and debugged over a
period of about one year by three programmers.
Recently, after about two and one-half years of
operation in up to twenty-five IMPs throughout the
network, the operational program was significantly
modified. The modification implemented the algorithms
described in the previous sections, thereby eliminating
causes of network lockup and improving the performance of the IMP. The modification also extended
the capabilities of the IMP so it can now interface to
Hosts over common carrier circuits (a Very Distant
Host9 ), efficiently manage buffers for lines with a wide
range of speeds, and perform better network diagnostics.
After prolonged study and preliminary design,3,4 this
program revision was implemented and debugged in
about nine man months.

* Interestingly, a similar problem exists on another level, that of
source-destination flow control. If an IMP sends a request for
allocation, either single- or multi-packet, to a neighboring IMP,
it will periodically retransmit it until it receives an acknowledgment. If it receives an allocation in return, it will immediately
begin to transmit the first packet of the message. The implementation in the IMP program sends the request from the same buffer
as the first packet, merely marking it with a request bit. If an
allocation arrives while the request is in the process of being
retransmitted, the program must wait until it has been completely
transmitted before it sends the same buffer again as the first
packet, since the request bit, the odd/even bit, the acknowledge
bits, and the message number (for a multipacket request) will be
changed. This was another difficult bug.


We shall emphasize in this section the structural
changes the program has recently undergone.
Data structures

Figure 3 shows the layout of core storage. As before,
the program is broken into functionally distinct pieces,
each of which occupies one or two pages of core. Notice
that code is generally centered within a page, and there
is code on every page of core. This is in contrast to our
previous practice of packing code toward the beginning
of pages and pages of code toward the beginning of
memory. Although the former method results in a large
contiguous buffer area near the end of memory, it has
breakage at every page boundary. On the other hand,
"centering" code in pages such that there are an integral
number of buffers between the last word of code on one
page and the first word of code on the next page
eliminates almost all breakage.
There are currently about forty buffers in the IMP,
and the IMP program uses the following set of rules to
allocate the available buffers to the various tasks requiring buffers:
• Each line must be able to get its share of buffers for
input and output. In particular, one buffer is
always allocated for output on each line, guaranteeing that output is always possible for each
line; and double buffering is provided for input on
each line, which permits all input traffic to be
examined by the program, so that acknowledgments
can always be processed, which frees buffers.
• An attempt is made to provide enough store-and-forward buffers so that all lines may operate at full
capacity. The number of buffers needed depends
directly on line distance and line speed. We currently limit each line to eight or fewer buffers,
and a pool is provided for all lines. Some numerical
results on line utilization are presented in a later
section. Currently, a maximum of twenty buffers is
available in the store-and-forward pool.
• Ten buffers are always allocated to reassembly
storage, allowing allocations for one multipacket
message and two single-packet messages. Additional buffers may be claimed for reassembly, up
to a maximum of twenty-six.
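A minimal sketch of this budget, using only the figures quoted above (about forty buffers, one output and two input buffers per line, a store-and-forward pool capped at twenty, and ten to twenty-six reassembly buffers); the function name and the arithmetic tying the numbers together are assumptions for illustration.

def buffer_budget(num_lines, total_buffers=40):
    """Rough accounting of IMP buffer allocation per the rules above (a sketch)."""
    output_reserved = num_lines          # one output buffer always held per line
    input_reserved = 2 * num_lines       # double buffering for input on each line
    reassembly_min, reassembly_max = 10, 26
    sf_pool_max = 20                     # store-and-forward pool ceiling

    remaining = total_buffers - output_reserved - input_reserved - reassembly_min
    sf_pool = min(sf_pool_max, max(remaining, 0))
    return {
        "output reserved": output_reserved,
        "input reserved": input_reserved,
        "reassembly (min, max)": (reassembly_min, reassembly_max),
        "store-and-forward pool": sf_pool,
    }

# Example: a five-line IMP
print(buffer_budget(5))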

Figure 4 summarizes the IMP table storage. All IMPs have identical tables. The IMP program has twelve words of tables for each of the sixty-four IMPs now possible in the network. The program has ninety-one words of tables for each of the eight Hosts (four real and four fake) that can be connected; additionally, twelve words of code are replicated for each real Host that can be connected. The program has fifty-five words of tables for each of the five lines that can be connected; additionally, thirty-seven words of code are replicated for each line that can be connected. The program also has tables for initialization, statistics, trace, and so forth.
The size of the initialization code and the associated
tables deserves mention. This was originally quite
small. However, as the network has grown and the
IMP's capabilities have been expanded, the amount of
memory dedicated to initialization has steadily grown.
This is mainly due to the fact that the IMPs are no
longer identical. An IMP may be required to handle a
Very Distant Host, or TIP hardware, or five lines and
two Hosts, or four Hosts and three lines, or a very high
speed line, or, in the near future, a satellite link. As the
physical permutations of the IMP have continued to
increase, we have clung to the idea that the program
should be identical in all IMPs, allowing an IMP to
reload its program from a neighboring IMP and providing other considerable advantages. However, maintaining only one version of the program means that the
program must rebuild itself during initialization to be
the proper program to handle the particular physical
configuration of the IMP. Furthermore, it must be able
to turn itself back into its nominal form when it is
reloaded into a neighbor. All of this takes tables and
code. Unfortunately, we did not foresee the proliferation
[Figure 4-Allocation of IMP table storage (0-1000 words): Hosts (8), IMPs (64), lines (5), initialization, statistics, trace, reassembly, allocate, header, background, and timeout tables]

[Figure 5-Packet flow and processing: schematic of packet and RFNM flow among the Host-to-IMP, Task, IMP-to-Modem, Modem-to-IMP, and IMP-to-Host routines and the background fake Hosts (Teletype, debug, trace, parameters, statistics, discard), showing store-and-forward, reassembly, transmit and receive allocate, and free-buffer queues]

of IMP configurations which has taken place; therefore,
we cannot conveniently compute the program differences from a simple configuration key. Instead, we must
explicitly table the configuration irregularities.
The packet processing routines

Figure 5 is a schematic drawing of packet flow and
packet processing. * We here briefly review the functions
of the various packet-processing routines and note
important new features.

Host-to-IMP (H → I)

This routine handles messages being transmitted
from Hosts at the local site. These Hosts may either be
real Hosts or fake Hosts (TTY, Debug, etc.). The
routine acquires a message number for each message
and passes the message through the transmit allocation
logic which requests a reassembly allocation from the
destination IMP. Once this allocation is received, the
message is broken into packets which are passed to the
Task routine via the Host Task queue.
* Cf. Figure 9 of Reference 1.

Task

This routine directs packets to their proper destination. Packets for a local Host are passed through the


reassembly logic. When reassembly is complete, the
reassembled message is passed to the IMP-to-Host
routine via the Host Out queue. Certain control
messages for the local IMP are passed to the transmit or
receive allocate logic. Packets to other destinations are
placed on a modem output queue as specified by the
routing table.
IMP-to-Modem (I → M)

This routine transmits successive packets from the
modem output queues and sends piggybacked acknowledgments for packets correctly received by the Modem-to-IMP routine and accepted by the Task routine.
Modem-to-IMP (M → I)

This routine handles inputs from modems and passes
correctly received packets to the Task routine via the
Modem Task queue. This routine also processes incoming piggybacked acknowledges and causes the
buffers for correctly acknowledged packets to be freed.
IMP-to-Host (I → H)

This routine passes messages to local Hosts and
informs the background routine when a RFNM should
be returned to the source Host.
Background

The function of this routine includes handling the
IMP's console Teletype, a debugging program, the
statistics programs, the trace program, and several
routines which generate control messages. The programs
which perform the first four functions run as fake
Hosts (as described in Reference 1). These routines
simulate the operation of the Host/IMP data channel
hardware so the Host-to-IMP and IMP-to-Host routines
are unaware they are communicating with anything
other than a real Host. This trick saved a large amount
of code and we have come to use it more and more. The
programs which send incomplete transmission messages,
send and return allocations, and send RFNMs also
reside in the background program. However, these
programs run in a slightly different manner than the
fake Hosts in that they do not simulate the Host/IMP
channel hardware. In fact, they do not go through the
Host/IMP code at all, but rather put their messages
directly on the task queue. Nonetheless, the principle
is the same.

Timeout

This routine, which is not shown in Figure 5, performs
a number of periodic functions. One of these functions
is garbage collection. Every table, most queues, and
many states of the program are timed out. Thus, if an
entry remains in a table abnormally long or if a routine
remains in a particular state for abnormally long, this
entry or state is garbage-collected and the table or
routine is returned to its initial or nominal state. In
this way, abnormal conditions are not allowed to hang
up the system indefinitely.
The method frequently used by the Timeout routine
to scan a table is interesting. Suppose, for example,
every entry in a sixty-four entry table must be looked
at every now and then. Timeout could wait the proper
interval and then look at every entry in the table on
one pass. However, this would cause a severe transient
in the timing of the IMP program as a whole. Instead,
one entry is looked at each time through the Timeout
routine. This takes a little more total time but is much
less disturbing to the program as a whole. In particular,
worst case timing problems (for instance, the processing
time between the end of one modem input and the
beginning of the next) are significantly reduced by
this technique. A particular example of the use of this
technique is with the transmission of routing information to the IMP's neighbors. In general, an IMP can
have five neighbors. Therefore, it sends routing information to one of its neighbors every 125 msec rather
than to all of its neighbors every 625 msec.
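A minimal sketch of this incremental scan, assuming a table of dictionary entries and a caller-supplied expiration action (both assumptions); only the one-entry-per-pass idea is taken from the text.

class Timeout:
    """Incremental garbage-collection scan: one table entry per invocation."""
    def __init__(self, table, expire_entry):
        self.table = table               # e.g. a sixty-four entry table
        self.expire_entry = expire_entry
        self.cursor = 0

    def tick(self):
        # Examine a single entry and advance; a full sweep in one pass would
        # cause a timing transient in the rest of the program.
        entry = self.table[self.cursor]
        if entry is not None and entry.get("stale", False):
            self.expire_entry(self.cursor)     # garbage-collect the entry
            self.table[self.cursor] = None     # return it to its nominal state
        self.cursor = (self.cursor + 1) % len(self.table)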
In addition to timing out various states of the program, the Timeout routine is used to awaken routines
which have put themselves to sleep for a specified
period. Typically these routines are waiting for some
resource to become available, and are written as coroutines with the Timeout routine. When they are
restarted by Timeout the test is made for the availability of the resource, followed by another delay if the
resource is not yet available.
PERFORMANCE EVALUATION
In view of the extensive modifications described in
the preceding sections, it was appropriate to recalculate
the IMP's performance capabilities. The following
section presents the results of the reevaluation of the
IMP's performance and comparisons with the performance reports of Reference 1.
Throughput vs. message length

In this section we recalculate two measures of IMP
performance previously calculated in Reference 1, the


maximum throughput and line traffic. Throughput is the number of Host data bits that traverse an IMP each second. Line traffic is the number of bits that an IMP transmits on its communication circuits per second and includes the overhead of RFNMs, packet headers, acknowledges, framing characters, and checksum characters.
To calculate the IMP's maximum line traffic and throughput, we first calculate the computational load placed on the IMP by the processing of one message. The computational load is the sum of the machine instruction cycles plus the input/output cycles required to process all the packets of a message and their acknowledgments, and the message's RFNM and its acknowledgment. For simplicity in computing the computational load, we ignore the processing required to send and receive the message from a Host since this is only seen by the source and destination IMPs.
A packet has D bits of data, S bits of software overhead, and H bits of hardware overhead. For the original and modified IMP systems, the values of D, S, and H are:

        Original                               Modified
D       0-1008 bits                            0-1008 bits
S       64 (packet) + 80 (ack) = 144 bits      80 bits (packet + ack)
H       72 (packet) + 72 (ack) = 144 bits      72 bits (packet + ack)

The input/output processing time for a packet is the time taken to transfer D + S bits from memory to the modem interface at one IMP plus the time to transfer D + S bits into memory at the other IMP. If R is the input/output transfer rate in bits per second,* then the input/output transfer time for a packet is 2(D + S)/R. Therefore, the total input/output time, Im, for P packets in a B bit message is 2(B + P × S)/R. The input/output transfer time, Ir, for a RFNM is 2S/R.
To each of these numbers we must add the program processing time, C; this is about the same for a packet of a message and a RFNM.

* In this calculation we will be making the distinction between the 516 IMP (used originally and reported on in Reference 1) and the 316 IMP (used for all new IMPs). The 516 has a memory cycle time of 0.96 μsec, and the 316 has a cycle of 1.6 μsec. The 316 provides a two-cycle data break, in comparison with the four-cycle data break on the 516. Thus, the input/output transfer rates are 16 bits per 3.84 μsec for the 516 and 16 bits per 3.2 μsec for the 316.


For the original IMP program, the program processing time per packet consisted of the following:

Modem Output    100 cycles    Send out packet
Modem Input     100 cycles    Receive packet at other IMP
Task            150 cycles    Process it (route onto an output line)
Modem Output    100 cycles    Send back an acknowledgment
Modem Input     100 cycles    Receive acknowledgment at first IMP
Task            150 cycles    Process acknowledgment
                700 cycles    Program processing time per packet
For the modified IMP program, the program processing time consists of:

Modem Output    150 cycles    Send out packet and piggyback acks
Modem Input     150 cycles    Receive packet and process acks
Task            250 cycles    Process packet
                550 cycles    Program processing time per packet

Finally, we add a percentage, V, for overhead for the various periodic processes in the IMP (primarily the routing computation) which take processor bandwidth. V is presently about 5 percent.
We are now in a position to calculate the computational load (in seconds), L, of one P packet message:

L = (1 + V)[(P × C + Im) + (C + Ir)] = (1 + V)[P × C + 2(B + P × S)/R + C + 2S/R]

where the first parenthesized term is the contribution of the message's packets and the second that of the RFNM.

The maximum throughput, T, is the number of data bits in a single message divided by the computational load of the message; that is, T = B/L.
The maximum line traffic (in bits per second), R, is the throughput plus the overhead bits for the packets of the message and the RFNM divided by the computational load of the message. That is,

R = T + (P + 1) × (S + H)/L = B/L + (P + 1) × (S + H)/L
The maximum throughput and line traffic are plotted
for various message lengths in Figure 6 for the original
and modified programs and for the 516 IMP and the
316 IMP.
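The calculation can be followed numerically with a short sketch. The constants (S, H, cycle counts, cycle times, transfer rates, V) are the ones given above; the function names, and the reading that V multiplies the whole load while C is charged once per packet and once for the RFNM, are our own and should be treated as assumptions.

def imp_load(B, P, machine="316", program="modified", V=0.05):
    """Computational load L (seconds) of one P-packet, B-data-bit message (sketch)."""
    S = 80 if program == "modified" else 144          # software overhead bits per packet
    C_cycles = 550 if program == "modified" else 700  # program cycles per packet
    cycle = 1.6e-6 if machine == "316" else 0.96e-6   # memory cycle time (seconds)
    R_io = 16 / 3.2e-6 if machine == "316" else 16 / 3.84e-6  # I/O transfer rate, bits/sec

    I_m = 2 * (B + P * S) / R_io     # I/O transfer time for the message's packets
    I_r = 2 * S / R_io               # I/O transfer time for the RFNM
    C = C_cycles * cycle             # program processing time per packet (or RFNM)
    return (1 + V) * (P * C + I_m + C + I_r)

def throughput_and_traffic(B, P, **kw):
    S = 80 if kw.get("program", "modified") == "modified" else 144
    H = 72 if kw.get("program", "modified") == "modified" else 144
    L = imp_load(B, P, **kw)
    T = B / L                              # maximum throughput, data bits per second
    R_line = T + (P + 1) * (S + H) / L     # maximum line traffic, bits per second
    return T, R_line

# e.g. an eight-packet message of 8 x 1008 data bits on a modified 316 IMP:
print(throughput_and_traffic(8 * 1008, 8))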


The changes to the IMP system can be summarized as follows:

• The program processing time for a store-and-forward packet has been decreased by 20 percent.
• The line throughput has been increased by 4 percent for a 516 IMP and by 7 percent for a 316 IMP.

As a result, the net throughput rate has been increased by 17 percent for a 516 IMP and by 21 percent for a 316 IMP. Thus, a 316 IMP can now process almost as much traffic as a 516 IMP could with the old program. A 516 IMP can now process approximately 850 Kbs.

• The line overhead on a full-length packet has been decreased from 29 percent to 16 percent.

As a result, the effective capacity of the telephone circuits has been increased from thirty-eight full packet messages per second on a 50 Kbs line to forty-three full packet messages per second.

Round trip delay vs. message length

In this section we compute the minimum round trip
delay encountered by a message. We define round trip
delay as in Reference 1; that is, the delay until the
message's RFNM arrives back at the destination IMP.
A message has P packets and travels over H hops. The
first packet encounters delay due to the packet processing time, C; the transmission delay, T p ; and the
propagation delay, L. Each successive packet of the
message follows C+Tp behind the previous packet.
Since the message's RFNM is a single packet message
with a transmission delay, T R , we can write the total
delay as

delay = H(C + Tp + L) + (P - 1)(C + Tp) + H(C + TR + L)

where the three terms are the contributions of the first packet, the successive packets, and the RFNM, respectively.
For single packet messages, this reduces to

delay = H(2C + Tp + TR + 2L)

[Figure 6-Line traffic and throughput vs. message length (panels for 50 Kb and 230.4 Kb lines at 100 and 1000 miles). The upper curves plot maximum line traffic, the lower curves plot maximum throughput]

The curves of Figure 7 show minimum round-trip
delay through the network for a range of message
lengths and hop numbers, and for two sets of line speeds
and line lengths. These curves agree with experimental
data.11,12


Line utilization
The number of buffers required to keep a communications circuit fully loaded is a function not only of line bandwidth and distance but also of packet length, IMP delay, and acknowledgment strategy. In order to compute the buffering needed to keep a line busy, we need to know the length of time the sending IMP must wait between sending out a packet and receiving an acknowledgment for it. If we assume no line errors, this time is the sum of: propagation delays for the packet and its acknowledgment, Pp and PA; transmission delays for the packet and its acknowledgment, Tp and TA; and the IMP processing delay before the acknowledgment is sent. Thus, the number of buffers needed to fully utilize a line is (Pp + Tp + L + PA + TA)/Tp.
Since Pp = PA, the expression for the number of buffers can be rewritten:

2P/Tp + 1 + (L + TA)/Tp

That is, the number of buffers needed to keep a line
full is proportional to the length of the line and its
speed, and inversely proportional to the packet size,
with the addition of a constant term.
To compute Tp , we must take into account the mix of
short and long packets. Thus, we write

Tp = (x × Ts + y × TL)/(x + y)

[Figure 7-Minimum round trip delay vs. message length. Curves show delay for 1-6 hops]

where x to y is the ratio of number of short packets to
number of long packets and Ts and TL are the transmission delays incurred by short and long packets,
respectively. The shortest packet permitted is 152 bits
long (entirely overhead); the longest packet is 1160
bits long. Computing Ts and TL for any given line bandwidth is a simple matter; they typically range from
106 μsec for Ts on a 1.4 Mbs line to 120.5 msec for TL
on a 9.6 Kbs line.
Assuming worst case IMP processing delay (that is,
the acknowledge becomes ready for transmission just
as the first bit of a maximum length packet is sent),
L = TL.
The acknowledge returns in the next outgoing packet
at the other IMP, which we assume is of "average"
size:*


Propagation delay, P, is essentially just "speed of

* Variations of this assumption have only second order effects on
the computation of the number of buffers required.



light" delay, and ranges from 50 }lsec for short lines,
through 20 msec for a cross country line, to 275 msec
for a satellite link.
We can now compute the number of buffers required
to fully utilize a line for any line speed, line length, and
traffic mix. Figure 8 gives the result for typical speeds,
lengths, and mixes. Note that the knee of the curves
occurs at progressively shorter distances with increasing
line speeds. The constant term dominates the 9.6 Kbs
case, and it is almost insignificant for the 1.4 Mbs case.
Note also that the separation between members of each
family of curves remains constant on the log scale,
indicating greatly increased variations with distance.
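The buffering rule can be exercised directly. The packet lengths and the worst-case assumption L = TL are from the text; treating the acknowledgment's carrier as an average-sized packet (TA = Tp) and the function name are assumptions.

import math

def buffers_needed(line_bps, prop_delay, short=0, long=1):
    """Buffers required to keep a line fully utilized (sketch of the rule above).

    prop_delay is the one-way propagation delay P in seconds; short:long is the
    traffic mix (counts of short and long packets).
    """
    TS = 152 / line_bps                              # shortest packet, entirely overhead
    TL = 1160 / line_bps                             # longest packet
    Tp = (short * TS + long * TL) / (short + long)   # average transmission delay
    L = TL                                           # worst-case wait before the ack is sent
    TA = Tp                                          # ack rides in an "average" packet (assumed)
    return math.ceil((2 * prop_delay + Tp + L + TA) / Tp)

# 50 Kbs line, ~1000 miles (10 msec propagation assumed), mix of 1 short : 1 long
print(buffers_needed(50_000, 10e-3, short=1, long=1))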
[Figure 8-Number of buffers for full line utilization. Traffic mixes are shown as the ratio of number of short packets (S) to number of long packets (L); curves are given for mixes 1S:0L, 8S:1L, 2S:1L, 1S:1L, and 0S:1L]

GENERAL COMMENTS

The ARPA Network has represented a fundamental
development in the intersection of computers and
communications. Many derivative activities are proceeding with considerable energy, and we list here some
of the important directions:

::::>
al

11

10

LL

0
0::
W
al

1000

::!!
;:)

:z
100

Figure 8-Number of buffers for full line utilization. Traffic mixes
are shown as the ratio of number of short packets (8) to number
of long packets (L)

• The present network is expanding, adding IMP
and TIP nodes at rates approaching two per
month. Other government agencies are initiating
efforts to use the network, and increasing rates of
growth are likely. As befits the growing operational character of the ARPA Network, ARPA is
making efforts to transfer the network from under
ARPA's research and development auspices to an
operational agency or a specialized carrier of some
sort.
• Technical improvements in the existing network
are continuing. Arrangements have now been made
to permit Host-IMP connections at distances
over 2000 feet by use of common-carrier circuits.
Arrangements are being made to allow the connection of remote-job-entry terminals to a TIP.
In the software area, the routing algorithms are
still inadequate at heavy load levels, and further
changes in these algorithms are in progress. A
major effort is under way to develop an IMP
which can cope with megabit/second circuits and
higher terminal throughput. This new "high speed
modular IMP" will be based on a minicomputer,
multiprocessor design; a prototype will be completed in 1973.
• The network is being expanded to include satellite
links to overseas nodes, and an entirely new approach is being investigated for the "multi-access"
use of satellite channels by message switched
digital communication systems. 13 This work could


lead to major changes in world-wide digital communications.
• Many similar networks are being designed by
other groups, both in the United States and in
other countries. These groups are reviewing the
myriad detailed design choices that must be made
in the design of message switched systems, and a
wide understanding of such networks is growing.
• The existence of the ARPA Network is encouraging
a serious review of approaches to obtaining new
computer resources. It is now possible to consider
investing in major resources, because a national, or
even international, network clientele is available
over which to amortize the cost of such major
resources.
• Perhaps most important, the network has catalyzed
important computer research into how programs
and operating systems should communicate with each other, and this research will hopefully lead
to improved use of all computers.
The ARPA Network has been an exciting development, and there is much yet left to learn.
ACKNOWLEDGMENTS
Dr. Lawrence G. Roberts and others in the ARPA
office have been a continuing source of encouragement
and support. The entire "IMP group" at Bolt Beranek
and Newman Inc. has participated in the development,
installation, test, and maintenance of the Il\/[P subnetwork. In addition, Dr. Robert E. Kahn of Bolt
Beranek and Newman Inc. was deeply involved in the
isolation of certain. network weaknesses and in the
formative stages of the corrective algorithms. Alex
McKenzie made many useful suggestions during the
writing of this paper. Linda Ebersole helped with the
production of the manuscript.


REFERENCES

1 F E HEART R E KAHN S M ORNSTEIN W R CROWTHER D C WALDEN
The interface message processor for the ARPA computer network
Proceedings of AFIPS 1970 Spring Joint Computer Conference Vol 36 pp 551-567
2 S M ORNSTEIN F E HEART W R CROWTHER H K RISING S B RUSSELL A MICHEL
The terminal IMP for the ARPA computer network
Proceedings of AFIPS 1972 Spring Joint Computer Conference Vol 40 pp 243-254
3 R E KAHN W R CROWTHER
A study of the ARPA network design and performance
Report No 2161 Bolt Beranek and Newman Inc August 1971
4 R E KAHN W R CROWTHER
Flow control in a resource sharing computer network
Proceedings of the Second ACM IEEE Symposium on Problems in the Optimization of Data Communications Systems Palo Alto California October 1971 pp 108-116
5 F HEART S M ORNSTEIN
Software and logic design interaction in computer networks
Infotech Computer State of the Art Report No 6 Computer
Networks 1971
6 S CARR S CROCKER V CERF
Host/host protocol in the ARPA network
Proceedings of AFIPS 1970 Spring Joint Computer
Conference Vol 36 pp 589-597
7 S CROCKER J HEAFNER R METCALFE
J POSTEL
Function-oriented protocols for the ARPA network
Proceedings of AFIPS 1972 Spring Joint Computer
Conference Vol 40 pp 271-280
8 A McKENZIE
Host/host protocol for the ARPA network
Available from the Network Information Center as NIC
8246 at Stanford Research Institute Menlo Park California
94025
9 Specifications for the interconnection of a host and an IMP
Bolt Beranek and Newman Inc Report No 1822 revised
April 1972
10 K BARTLETT R SCANTLEBURY
P WILKINSON
A note on reliable full-duplex transmission over half duplex
links
Communications of the ACM 125 May 1969 pp 260-261
11 G D COLE
Computer networks measurements techniques and experiments
UCLA-ENG-7165 Computer Science Department School of
Engineering and Applied Science University of California at
Los Angeles October 1971
12 G D COLE
Performance measurements on the ARPA computer network
Proceedings of the Second ACM IEEE Symposium on
Problems in the Optimization of Data Communications
Systems Palo Alto California October 1971 pp 39-45
13 N ABRAMSON
The ALOHA system-Another alternative for computer
communications
Proceedings of AFIPS 1970 Fall Joint Computer Conference
Vol 37 pp 281-285

SUPPLEMENTARY BIBLIOGRAPHY

(The following describe issues related to, but not directly
concerned with, those discussed in the text.)
H FRANK I T FRISCH W CHOU
Topological considerations in the design of the ARPA computer
network
Proceedings of AFIPS 1970 Spring Joint Computer Conference
Vol 36 pp 581-587
H FRANK R E KAHN L KLEINROCK
Computer communication network design-Experience with theory
and practice
Proceedings of AFIPS 1972 Spring Joint Computer Conference
Vol 40 pp 255-270


R E KAHN
Terminal access to the ARPA computer network
Courant Computer Symposium 3-Computer Networks
Courant Institute New York November 1970
L KLEINROCK
Analytic and simulation methods in computer network design
Proceedings of AFIPS 1970 Spring Joint Computer Conference
Vol 36 pp 569-579
A A McKENZIE B P COSELL J M McQUILLAN
M J THROPE
The network control center for the ARPA network
To be presented at the International Conference on Computer
Communications Washington D C October 1972

L G ROBERTS
Extension of packet communication technology to a hand-held
personal terminal
Proceedings of AFIPS 1972 Spring Joint Computer Conference
Vol 40 pp 295-298
L G ROBERTS B D WESSLER
Computer network development to achieve resource sharing
Proceedings of AFIPS 1970 Spring Joint Computer Conference
Vol 36 pp 543-549
R THOMAS D A HENDERSON
McROSS-A multi-computer programming system
Proceedings of AFIPS 1972 Spring Joint Computer Conference
Vol 40 pp 281-294

Cost effective priority assignment
in network computers
by E. K. BOWDON, SR.
University of Illinois
Urbana, Illinois

and

W. J. BARR
Bell Telephone Laboratories
Piscataway, New Jersey

INTRODUCTION
Previously, the study of network computers has been
focused on the analysis of communication costs, optimal
message routing, and the construction of a communications network connecting geographically distributed
computing centers. While these problems are far from
being completely solved, enough progress has been
made to allow the construction of reasonably efficient
network computers. One problem which has not been
solved, however, is making such networks economically
viable. The solution of this problem is the object of
our analysis.
With the technological problems virtually solved, it
is readily becoming apparent that no matter whose
point of view one takes, the only economically justifiable
motivation for building a network computer is resource
sharing. However, the businessmen, the users, the
people with money to spend, could not care less whose
resources they are using for their computer runs. They
care only that they receive the best possible service at
the lowest possible price. The computing center manager
who cannot fill this order will soon find himself out of
customers.
"The best possible service ... " is, in itself, a tall
order to fill. The computing center manager finds
himself in a position to offer basically two kinds of
computing services: contract services and demand
services. Contract services are those services which the
manager agrees to furnish, at a predetermined price,
within specified time periods. Examples of this type of
service include payroll runs, billings, and inventory
updates. Each of these is run periodically and the value
placed by the businessman on the timely completion of the task is large. Demand services are those services which, though defined in advance, may be required at any time and at a possibly unknown price. Frequently the only previous agreements made refer to the type of service to be delivered and limits on how much will be demanded. Examples of tasks which are run on a demand basis include research, information requests, and program debugging runs. University computing centers generally find that most of the services which they offer are of this type.
Every installation manager who offers either contract or demand services should have a solid and acceptable answer to the critical question "What do I do when my computer breaks down?" If he wishes to ensure that he can meet all commitments, the only answer is to transfer tasks to another processor. This is where network computers enter the picture. If the center is part of a network computer, tasks can easily and quickly be transferred to another center for processing. The concept of transferring tasks between centers through a broker has been widely discussed in the literature.1,2
Our basic assumption is that economic viability for network computers is predicated on efficient resource sharing. This was, in fact, a major reason for the construction of several networks-to create the capability of using someone else's special purpose machine or unique process without having to physically transport the work. This type of resource sharing is easily implemented and considerable work has been done toward this goal. There is, however, another aspect of resource sharing which has not been studied thoroughly: load-leveling. By load-leveling we mean the transfer of tasks between computing centers for the purpose of improving

the throughput of the network or other criteria. We
contend that the analysis and implementation of
user-oriented load-leveling is the key to developing
economically self-supporting network computers.
A SCENARIO OF COST EFFECTIVENESS
Until recently, efforts to measure computer efficiency
have centered on the measurement of resource (including processor) idle time. A major problem with
this philosophy is that it assumes that all tasks are of
roughly equal value to the user and hence to the operation
of the system.
As an alternative to the methods used in the past, we
propose a priority assignment technique designed to
represent the worth of tasks in the· system. We present
the hypothesis that tasks requiring equivalent use of
resources are not necessarily of equivalent worth to the
user with respect to time. We would allow the option
for the user to specify a "deadline" after which the
value of his task would decrease, at a rate which he can
specify, to a system determined minimum. With this in
mind, we have proposed a measure of cost effectiveness
with which we can evaluate the performance of a
network with an arbitrary number of interconnected
systems, as well as each system individually.3
We define our measure of cost effectiveness, γ, as follows:

where
Lq is the number of tasks in the queue,
M is the maximum queue length,
R is the number of priority classes,
a is a measure (system-determined) of the "dedicatedness" of the CPU to the processing of
tasks in the queue,

and
β(i) = (R - i) Σ_{j=1..n} [g(j)/f(j)]

where
g(j) is the reward for completing task j (a user specified function of time),
and
f(j) is the cost (system determined) to complete task j.
The term (Lq/M)a is a measure of the relevance of the queue to processing activities. Similarly, we can look at β(i) as a measure of resource utilization. Note that β(i) indicates a ratio of reward to cost for a given

priority class and is sensitive to the needs of the user
and the requirements imposed on the installation. It is
user sensitive because the user specifies the reward and
is installation sensitive because the cost for processing
a task is determined by the system. The measure of
CPU dedicatedness (a), on the other hand, is an
entirely installation sensitive parameter.
The first problem which becomes apparent is that
which arises if
Σ_{i=0..R-1} β(i) = 0.

This occurs only in the situation where there is exactly
one priority class (i.e., the non-priority case). We will
finesse away this problem by defining
β

for this case. Intuitively, this is obvious, since the
smaller this term gets, the more efficiently (in terms of
reward) a system is using its resources. Furthermore,
in the absence of priorities, the order in which tasks are
executed is fixed, so this term becomes irrelevant to our
measure of cost effectiveness. Thus, for the nonpriority case, we have
γ = (Lq/M)a

which is simply<: the measure of the relevance of the
queue to processing activities. This is precisely what we
want if we are going to consider only load-leveling in
non-priority systems. However, we are interested in the
more general case in which we can assign priorities.
An estimate of the cost to complete task j, f( j) is
readily determined from the user-supplied parameters
requesting resources. Frequently these estimated
parameters are used as upper limits in resource allocation and the operating system will not allow the program to exceed them. As a result, the estimates tend to
be high. On the other hand, lower priorities are usually
assigned to tasks requiring a large amount of resources.
So the net effect is that the user's parameters reflect his
best estimate and we may be reasonably confident that
they truly reflect his needs.
At the University of Illinois computing center, for
example, as of July 26, 1971, program charges are
estimated by the following formula:


cents = a(X + Y)(bZ + c) + d     (2)

where
X = CPU time in centiseconds,
Y = number of I/O requests,

Z = core size in kilobytes,
a, b, c are weighting factors currently having the values 0.04, 0.0045, and 0.5, respectively,
and
d is an extra charge factor including $1.00 cover charge plus any special resources used (tape/disk storage, card read, cards punched, plotter, etc.).

[Figure 1-Example of user's reward function: (a) ideal function; (b) approximate function. Reward g(j) vs. time, with deadlines at 3:30 pm and 8:00 am and rewards g1, g2]
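Equation (2) is straightforward to evaluate. The coefficients and parameter meanings are the ones defined above; the function name, the sample inputs, and expressing d in cents are assumptions for illustration.

def job_charge_cents(cpu_centiseconds, io_requests, core_kilobytes,
                     extra_charges_cents=100.0,
                     a=0.04, b=0.0045, c=0.5):
    """Program charge in cents per equation (2) above (a sketch).

    extra_charges_cents plays the role of d: the $1.00 cover charge plus any
    special resources (tape/disk storage, cards, plotter, etc.).
    """
    X, Y, Z = cpu_centiseconds, io_requests, core_kilobytes
    return a * (X + Y) * (b * Z + c) + extra_charges_cents

# e.g. 3000 centiseconds of CPU, 500 I/O requests, 200K of core:
print(job_charge_cents(3000, 500, 200))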

The main significance of the reward function g ( j)
specified by the user is that it allows us to determine a
deadline or deadlines for the task. Typically we might
expect g (j) to be a polynomial in t, where t is the time
in the system. For example, the following thoughts
might run through the user's head: "Let's see, it's
10:00 a.m. now and I don't really need immediate
results since I have other things to do. However, I do
need the output before the 4:00 p.m. meeting. Therefore, I will make 3:30 p.m. a primary deadline. If it
isn't done before the meeting, I can't use the results
before tomorrow morning, so I will make 8:00 a.m.
a secondary deadline. If it isn't done by then I can't
use the results, so after 8:00 a.m. I don't care."
The function g (j) this user is thinking about would
probably look something like Figure 1a. Now, this type
of function poses a problem in that it is difficult for the
user to specify accurately and would require an appreciable amount of overhead to remember and compute.
Notice, however, that even if turnaround time is
immediate, the profit oriented installation manager
would put the completed task on a shelf (presumably
an inexpensive storage device) and not give it to the
user until just before the deadline-thus collecting the
maximum reward. As a result, there is little reason for
specifying anything more than the deadlines, the
rewards associated with meeting the deadlines, and the
rate of decrease of the reward between deadlines, if
any. Applying this reasoning to Figure 1a we obtain
Figure 1b. Note that this function is completely
specified with only six parameters (deadlines t1, t2;
rewards g1, g2; and rates of decrease m1, m2).


In general, we may assume that g (j) is a monotonically non-increasing, piecewise linear, reward function
consisting of n distinct sets of deadlines, rewards, and
rates of decrease. Thus we can simply specify g (j) with
3n parameters where n is the number of deadlines
specified.
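The 3n-parameter representation can be captured directly. The (deadline, reward, rate of decrease) triples mirror Figure 1b; the function names, the clamping of each segment at the next deadline's reward, and the system-determined floor are assumptions for illustration.

def make_reward_function(segments, floor=0.0):
    """Build g(j) from n (deadline, reward, rate_of_decrease) triples (a sketch).

    Each triple (t_k, g_k, m_k) says: the reward is g_k if the task finishes by
    t_k and decreases at rate m_k after t_k until the next deadline; after the
    last deadline the reward falls to `floor`, an assumed system minimum.
    """
    segments = sorted(segments)              # order by deadline

    def g(t):
        for i, (t_k, g_k, m_k) in enumerate(segments):
            if t <= t_k:
                return g_k
            next_t = segments[i + 1][0] if i + 1 < len(segments) else None
            if next_t is None:
                return floor
            if t <= next_t:
                # Between this deadline and the next: linear decrease at rate m_k,
                # never falling below the next deadline's reward.
                return max(segments[i + 1][1], g_k - m_k * (t - t_k))
        return floor
    return g

# Illustrative values only: primary deadline 3:30 pm (hour 15.5, reward 10),
# secondary deadline 8:00 am next day (hour 32, reward 4).
g = make_reward_function([(15.5, 10.0, 2.0), (32.0, 4.0, 0.0)])
print(g(10.0), g(20.0), g(40.0))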
Note that, in effect, the user specifies an abort time
when the g (j) he specifies becomes less than f( j). If
the installation happens to provide a "lower cost"
service, l(j), and if g(j) > l(j), this task would be
processed, but only when all the tasks with higher
g (j) had been processed.
Now, what we are really interested in, is not so much
an absolute reward, but a ratio of reward to cost. Since
f( j) is, at best, only an estimate of cost, we cannot
reasonably require a user to specify an absolute reward.
A more equitable arrangement would be to specify the
rewards in terms of a ratio g( j) /f( j) associated with
each deadline. This ratio is more indicative of the
relative worth of a task, both to the system and to the
user, since it indicates the return on an investment.
PRIORITY ASSIGNMENT
Let us now turn our attention to the development of
a priority assignment scheme which utilizes the
reward/ cost ratios described in the previous section.
We begin by quantizing the continuum of reward/cost
ratios into R distinct intervals. Each of these intervals
is then assigned to one of R priority classes 0, 1, 2, ... ,
R -1 with priority 0 being reserved for tasks with
highest reward/cost ratios and priority R -1 for tasks
with reward/cost ratios of unity or less. A task entering
the system will be assigned a priority according to
its associated reward/cost ratio.
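One possible quantization is sketched below. The text only fixes that class R-1 holds ratios of unity or less and that class 0 holds the highest ratios; the geometric spacing of the intermediate intervals and the default R = 8 are assumptions.

def priority_class(reward_cost_ratio, R=8):
    """Map a reward/cost ratio to one of R priority classes 0..R-1 (a sketch)."""
    if reward_cost_ratio <= 1.0:
        return R - 1                 # unity or less: lowest priority class
    k = R - 2
    bound = 2.0                      # assumed: each class doubles the ratio of the one below
    while k > 0 and reward_cost_ratio > bound:
        bound *= 2.0
        k -= 1
    return k

for ratio in (0.5, 1.5, 3.0, 10.0, 1000.0):
    print(ratio, "->", priority_class(ratio))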
We want to guarantee, if possible, that all priority 0
tasks will meet their deadlines. Furthermore, if all
priority 0 tasks can meet their deadlines, we want to
guarantee, if possible, that all priority 1 tasks will meet
their deadlines and, in general, if all priority k tasks
can meet their deadlines, we want to guarantee that as
many priority class k+ 1 tasks as possible will meet
their deadlines. (Note that we are concerned with
guaranteeing deadlines rather than rationing critical
resources.)
To facilitate the priority assignment, we introduce the following notation: For priority k, let Ti denote the ith task. Then we assume for each Ti that we receive the following information vector:

(Ti, g/f, di, τi, si)

where
Ti is an identifier,


g/f is the reward/cost ratio associated with meeting the task's deadline,
di is the task's deadline associated with g/f,
τi is the maximum processing time for the task, and
si is the task's latest starting time.

4. However, if s_j + τ_j > s_{j+1} but F_j < s_j + τ_j - s_{j+1}, a deadline might be missed; so we proceed with Step 3.
3. Compacting scheme. Let f_j denote the float time between any two tasks T_{j-1} and T_j, where f_j is defined:

f_j = s_j - (s_{j-1} + τ_{j-1})

Then F_j, the total float time preceding T_j, is given by:

F_j = Σ_{k=1..j} f_k = s_j - t - Σ_{k=1..j-1} τ_k     (3)

where t is the current time. Now, starting with task T_j, if s_j + τ_j > s_{j+1} and F_j ≥ s_j + τ_j - s_{j+1}, we assign a new starting time to T_j given by:

s_j = s_{j+1} - τ_j

and we continue with T_{j-1}, T_{j-2}, etc., until we encounter a task T_k, k ≤ j, such that s_k ≤ s_{k+1} - τ_k. (Note that T_{j+1} and all its predecessors are guaranteed to meet their deadlines.)
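Float times and the compacting step can be sketched as follows. The float-time definition, equation (3), and the reassignment rule are the ones above; the dictionary layout, the function names, and the Figure 2 data used in the example are taken or adapted from the surrounding text and should be read as an illustration only.

def float_times(tasks, t):
    """f_j and F_j for tasks ordered by start time; each task has 's' and 'tau'."""
    f, F, total = [], [], 0.0
    prev_end = t                        # the current time precedes the first task
    for task in tasks:
        fj = task['s'] - prev_end       # f_j = s_j - (s_{j-1} + tau_{j-1})
        total += fj
        f.append(fj)
        F.append(total)                 # F_j = sum of f_1..f_j
        prev_end = task['s'] + task['tau']
    return f, F

def compact(tasks, t, j):
    """Try to open room before task j+1 by pulling earlier start times back."""
    _, F = float_times(tasks, t)
    overlap = tasks[j]['s'] + tasks[j]['tau'] - tasks[j + 1]['s']
    if overlap <= 0 or F[j] < overlap:
        return False                    # nothing to do, or compacting cannot help
    k = j
    while k >= 0 and tasks[k]['s'] + tasks[k]['tau'] > tasks[k + 1]['s']:
        tasks[k]['s'] = tasks[k + 1]['s'] - tasks[k]['tau']   # s_k = s_{k+1} - tau_k
        k -= 1
    return True

# Figure 2's class-k tasks at t = 0:
tasks = [{'s': 1, 'tau': 1}, {'s': 2, 'tau': 2}, {'s': 7, 'tau': 3},
         {'s': 12, 'tau': 2}, {'s': 14, 'tau': 1}]
print(float_times(tasks, 0))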

[Figure 2-State of priority class k at time t = 0: schedule of tasks T1-T5 with float times f_i and information vectors d1 = 2, τ1 = 1, s1 = 1; d2 = 4, τ2 = 2, s2 = 2; d3 = 10, τ3 = 3, s3 = 7; d4 = 14, τ4 = 2, s4 = 12; d5 = 15, τ5 = 1, s5 = 14]


Ti is (Ti, 9, 2, 7). We find that s3 + τ3 > s4 since 7 + 2 > 7 and a deadline could be missed. But F3 ≥ s3 + τ3 - s4 since 4 ≥ 7 + 2 - 7 = 2, so we assign a new start time to T3: s3 = s4 - τ3 = 7 - 2 = 5. Now s2 ≤ s3 - τ2 since 2 ≤ 5 - 2, so the priority assignment is complete and all tasks are guaranteed to meet their deadlines. The resulting state of priority class k is shown in Figure 4. (Note the effect of the last in first out rule for breaking ties on start times.)
(iii) Next suppose the information vector for Ti is (Ti, 9, 4, 5). We find that s3 + τ3 > s4 since 5 + 4 > 7, and a deadline could be missed. But F3 ≥ s3 + τ3 - s4 since 4 ≥ 4, so we assign a new

[Figure 3-Results of priority assignments for Example (i): schedule of tasks and information vectors before and after assignment]

t = 0, the state of priority class k is that shown in Figure 2. (Note that since all priority class k tasks have similar g/f, we need not show these ratios.) Notice that forming the float time column is analogous to forming a forward difference table. In each of the following examples we assume that Figure 2 is the initial state of priority class k.
(i) Suppose the information vector (with g/f omitted) for Ti is (Ti, 6, 1, 5). Beginning with T5, we observe that s2 + τ2 > s3 since 2 + 2 > 3, so we assign a new start time to T2: s2 = s3 - τ2 = 3 - 2 = 1. Now s1 + τ1 > s2 since 1 + 1 > 1, so we assign a new start time to T1: s1 = s2 - τ1 = 0. The priority assignment is complete and all tasks are guaranteed to meet their deadlines. The resulting state of priority class k is shown in Figure 5.
(iv) As a final example, suppose that the information vector for Ti is (Ti, 9, 5, 4). As before, we find that d2 < di ≤ d3 < d4 < d5, so we insert Ti between T2 and T3 and renumber the tasks accordingly. However, s3 + τ3 > s4 since 4 + 5 > 7, and a deadline could be missed. Furthermore, F3 < s3 + τ3 - s4 since 1 < 4 + 5 - 7 = 2, and the compacting scheme will not help us. Instead we leave the start times at their latest critical values and hope that sufficient float time is created later to enable the

[Figure 5-Results of priority assignments for Example (iii): schedule of tasks and information vectors before and after assignment]

[Figure 6-Results of priority assignments for Example (iv): schedule of tasks and information vectors before and after assignment]

tasks to meet their deadlines. The results of this
assignment are shown in Figure 6. Note that T4
is the task which is in danger of missing its
deadline.

The last example brings up the problem of what to do
with a task whose deadline is missed. We simply treat it
as though it had just entered the system using the next
specified deadline as the current deadline. If no further
deadlines are specified, the task is assigned priority
R -1 and will be processed accordingly.
When a processor finishes executing a task the following scheduling algorithm is used to determine which
task is to be processed next. Generally, the algorithm
takes the highest priority task in the queue that is
closest to its latest starting time.

Scheduling algorithm

Beginning with k = 0 and using l as an index,

1. We examine the float time, f1, for the first task in priority class k. Then for l = k + 1:
2. If f1 of priority class k ≥ τ1 for the first task of priority class l, we set k = l and continue with Step 1. Otherwise we continue with Step 3.
3. Set l = l + 1 and continue with Step 2 until all priority classes have been considered. Then continue with Step 4.
4. Assign the first task, T1, in priority k to the available processor.

The effect of this scheduling algorithm is quite simple.
It instructs the scheduler to schedule the important
tasks first and then, if there is sufficient time, schedule
those lower priority tasks in such a way that as many
deadlines as possible are met.
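A compact rendering of the four steps. The queue structure, the helper names, and the reading of Step 2's comparison (there is enough float in class k's first task to absorb the lower-priority task) are assumptions; only the control flow is taken from the algorithm above.

def select_next_task(queues, float_of, max_time_of):
    """Sketch of the scheduling algorithm above.

    queues[k] is the (possibly empty) list of priority-k tasks, ordered so that
    queues[k][0] is the task closest to its latest starting time. float_of(k)
    is the float time of that first task; max_time_of(task) is its maximum
    processing time tau.
    """
    R = len(queues)
    k = 0
    while k < R and not queues[k]:
        k += 1                          # skip empty priority classes
    if k == R:
        return None                     # nothing to run
    l = k + 1
    while l < R:
        # Step 2: if class k's first task has enough float to absorb the first
        # task of a lower class l, consider class l's work first.
        if queues[l] and float_of(k) >= max_time_of(queues[l][0]):
            k, l = l, l + 1             # back to Step 1 with the new class
        else:
            l += 1                      # Step 3: try the next class
    return queues[k].pop(0)             # Step 4: first task of the chosen class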
In the foregoing we have tacitly assumed that each
task enters the system sufficiently before its deadline to
allow processing. The two algorithms taken together
facilitate meeting the deadlines, where possible, of the
higher priority tasks. Those tasks which do not meet
their deadlines will tend to be uniformly late.
LOAD LEVELING IN A NETWORK OF
CENTERS
Thus far we have been concerned only with cost
effectiveness in a single center. Next, let us consider the
more general problem of load leveling within a network
of centers. Each center may contain a single computer
or a subnet of computers. The topological and physical
properties of such networks have been illustrated4- S and
will not be discussed here.
We wish to determine a strategy which optimizes the
value of work performed by the network computer.
That is, to guarantee that every task in each center will
be processed, if possible, before its deadline and only
those tasks that offer the least reward to the network
will miss their deadlines. Implicit in this discussion is
the simplifying assumption that any task can be
performed in any center. This assumption is not as
restrictive as it may sound since we can, for the purposes of load leveling, partition a nonhomogeneous
network into sets of homogeneous subnetworks which
can be considered independently. Thus, in the discussion which follows, we will assume that the network
computer is homogeneous.
We define the measure of cost effectiveness for a network of N centers, γN, as follows:

γN = Σ_{i=1..N} wi γi     (4)

where
the wi are weighting factors that reflect the relative contribution of the ith center to the overall computational capability of the network,
and γi is the measure of cost effectiveness for the ith center.
Note that if a center is a subnet of computers,
we could employ this definition to determine the measure of cost effectiveness for the subnet. We also let Cij denote the cost of communication between centers i and j; and tij the transmission time between centers i and j.
Ideally, we want the network computer to operate so
that all tasks within the network are processed before
their deadlines. If a task is in danger of missing its
deadline, we want to consider it as a candidate for
transmission to another center for processing. The
determination of which tasks should be transferred
follows the priority assignment (i.e., priority 0 tasks in
danger of missing deadlines should be the first to be
considered, priority 1 tasks next, etc.) .
We note that this scheme may not discover all tasks
that are in danger of missing their deadlines. In order to
discover all tasks that might be in danger of missing
their deadlines, we would require a look ahead scheme
to determine the available float time and to fit lower
priority tasks into this float time. The value of such a
scheme is questionable, however, since we assume some
float time is created during processing and additional
float time may be created by sending high priority
tasks to other centers. Also, the overhead associated
with executing the look ahead scheme would further
reduce the probable gain of such a scheme.
The determination of which center should be the
recipient of a transmitted task can be determined from
the measure of cost effectiveness of each center. Recall
that the measure indicates the worth of the work to be
processed within a center. Thus a center with a task in
danger of missing its deadline will generally have a
larger measure than a center with available float time.
Thus, by examining the measures for each center, we
can determine the likely recipient of tasks to be transmitted. These centers can, in turn, examine their own
queues and bid for additional work on the basis of their
available float times. This approach has a decided
economic advantage over broadcasting the availability
of work throughout the network and transmitting the
tasks to the first center to respond. The latter approach


has been investigated by Farber9 and discarded in
favor of bidding.
Once a recipient center has been determined, we
would transmit a given task only if the loss in reward
associated with not meeting its deadline is greater than
Cij, the cost of transmitting the task between centers
and transmitting back the results.
When a task is transmitted to a new center its
deadline is diminished by tij, the time to transmit back
the results, thus ensuring the task will reach its destination before its true deadline. Similarly, the reward
associated with meeting the task's deadline is diminished by Cij, since this represents a reduction in profit.
Then the task's g/f ratio is used to determine a new
priority and the task is treated like one originating in
that center.
This heuristic algorithm provides the desired results
that within each center all deadlines are met, if possible,
and if any task is in danger of missing its deadline, it is
considered for possible transmission to another center
which can meet the deadline.
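The transfer rule reduces to a short check. The deadline and reward adjustments follow the two sentences above; the candidate set (assumed to be pre-selected by comparing measures of cost effectiveness and bids), the cheapest-link tie-break, and the field names are assumptions.

def maybe_transfer(task, centers, c, t):
    """Decide whether, and to which center, a deadline-endangered task should go.

    task: dict with 'deadline', 'reward', 'cost', 'reward_loss' (the drop in g(j)
          from missing the deadline locally)
    centers: iterable of candidate center ids
    c[j], t[j]: communication cost and transmission time to center j
    """
    # Prefer the candidate with the cheapest communication cost (assumed tie-break).
    for j in sorted(centers, key=lambda j: c[j]):
        if task['reward_loss'] > c[j]:
            transferred = dict(task)
            transferred['deadline'] -= t[j]   # allow time to transmit back the results
            transferred['reward'] -= c[j]     # communication cost reduces the profit
            new_ratio = transferred['reward'] / transferred['cost']
            return j, transferred, new_ratio  # re-prioritized like a local task there
    return None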
SUMMARY
We have introduced a priority assignment technique
which, together with the scheduling algorithm, provides
a new approach to resource allocation. The most
important innovation in this approach is that it allows
a computing installation to maximize reward for the
use of resources while allowing the user to specify deadlines for his results. The demand by users upon the
resources of a computing installation is translated into
rewards for the center.
This approach offers advantages to the user and to
the computing installation. The user can exercise control over the processing of his task by specifying its
reward/cost ratio which, in turn, determines the importance the installation attaches to his requests. The
increased flexibility to the user in specifying rewards
for meeting deadlines yields increased reward to the
center. Thus the computing installation becomes cost
effective, since for a given interval of time, the installation can process those tasks which return the maximum
reward. A notable point here is that this system readily
lends itself to measurement.
The measure of cost effectiveness is designed to reflect
the status of a center using the priority assignment
technique. From its definition, the value of the measure
depends not only on the presence of tasks in the system
but upon the priority of these tasks. Thus the measure
reflects the worth of the tasks awaiting execution rather
than just the number of tasks. Therefore, the measure
can be used both statistically, to record the operation of

a center, and dynamically, to determine the probability
of available float time. This attribute enables us to
predict the worth of the work to be performed in any
center in the network and facilitates load-leveling
between centers.
We have spent a good deal of time discussing what
this system does and the problems it attempts to solve.
In the interest of fair play, we now consider the things
it does not do and the problems it does not solve.
One of the proposed benefits of a network computer is
that it is possible to provide, well in advance, a guarantee that, at a predetermined price, a given deadline
will be met. This guarantee is especially important for
scheduled production runs, such as payroll runs, which
must be processed within specified time periods. The
system as presented does not directly allow for such a
long range guarantee. However, to implement such an
option, we simply modify the reward to include the loss
of goodwill which would be incurred should such a deadline be missed. Perhaps the easiest way to implement
this would be to reserve priority class zero for tasks
whose deadlines were previously guaranteed. Under
this system we could assure the user, with confidence,
that the deadlines could be met at a predetermined
(and presumably more expensive) price.
A second problem with the system is that the algorithms do not optimize the mix of tasks which would
be processed concurrently in a multiprogramming
environment. A common strategy in obtaining a good
mix is to choose tasks in such a way that most of the
tasks being processed at one time are input/output
bound (this is especially common in large systems which
can support a large number, five or more, of tasks concurrently). Generally smaller tasks are the ones selected
to fill this bill. Under our system, the higher priority
classes will tend to contain the smaller and less expensive tasks since priority is assigned on the basis of a
cost multiplier which is user supplied. We assume a
user would be much more reluctant to double (give a
reward/cost ratio of 2) the cost of a $100 task than to
double the cost of a $5 task. This reluctance will tend
to keep a good mix present in a multiprogramming
environment.
The final problem we would like to consider is what
to do with a task if (horror of horrors) all of its deadlines are missed. There are basically two options, both
feasible for certain situations, which will be discussed.
The ultimate decision as to which is best rests with the
computer center managers. Therefore, we will present
the alternatives objectively without any intent to
influence that decision.
The first alternative made obvious by the presentation is that when a task misses all of its deadlines the results it would produce are of no further use. Continued attempts to process a task in this instance would
be analogous to slamming on the brakes after your car
hits a brick wall; simply a waste of resources. Thus, if
the deadlines are firm, a center manager could say that
a task which misses all of its deadlines should be considered lost to the system.
On the other hand, the results produced by a task
could be of value even after the last deadline is missed.
In this case the center manager could offer a "low cost"
service under which tasks are processed at a reduced
rate but at the system's leisure. The danger in this
approach is that if run without outside supervision, the
system could become saturated with low cost tasks to
the detriment of more immediately valuable work.
This actually happened at the University of Illinois
during early attempts to institute a low cost option.
The confusion and headaches which resulted from the
saturation were more than enough to justify instituting
protective measures. From the results of this experience,
it is safe to say that no installation manager will let it
happen more than once.
Even in the presence of a few limitations, our system
represents a definite positive step in the analysis of
network computers. Our approach treats a network
computer as the economic entity that it should be: a
market place in which vendors compete for customers
and in which users contend for scarce resources. The
development of this approach is a first step in the long
road to achieving economic viability in network computers.
ACKNOWLEDGMENTS
We are particularly grateful to Mr. Einar Stefferud of
Einar Stefferud and Associates, Santa Monica, California for his constructive criticism and constant


encouragement in this effort. We are also indebted to
Professor David J. Farber of the University of California at Irvine, California for many interesting
conversations about bidding in distributed networks.
Finally, we would like to thank the referees for their
careful reviews and suggestions to improve this paper.
This research was supported in part by the National
Science Foundation under Grant No. NSF GJ 28289.
REFERENCES
1 E STEFFERUD
Management's role in networking
Datamation Vol 18 No 4 1972
2 J T HOOTMAN
The computer network as a marketplace
Datamation Vol 18 No 4 1972
3 E K BOWDON SR W J BARR
Throughput optimization in network computers
Proceedings of the Fifth International Conference on
System Sciences Honolulu 1972
4 N ABRAMSON
The ALOHA system
University of Hawaii Technical Report January 1972
5 H FRANK I T FRISCH
Communication transmission and transportation networks
Addison-Wesley Reading Massachusetts 1971
6 L KLEINROCK
Communication nets stochastic flow and delay
McGraw-Hill New York New York 1964
7 R SYSKI
Introduction to congestion theory in telephone systems
Oliver and Boyd Edinburgh 1960
8 E BOWDON SR
Dispatching in network computers
Proceedings of the Symposium on Computer
Communications Networks and Teletraffic April 1972
9 D J FARBER K C LARSON
The structure of a distributed computing system-software
Proceedings of the Symposium on Computer
Communications Networks and Teletraffic April 1972

C.mmp-A multi-mini-processor*
by WILLIAM A. WULF and C. G. BELL
Carnegie-Mellon University
Pittsburgh, Pennsylvania

INTRODUCTION AND MOTIVATION


In the Summer of 1971 a project was initiated at CMU to design the hardware and software for a multiprocessor computer system using minicomputer processors (i.e., PDP-11's). This paper briefly describes an
overview (only) of the goals, design, and status of this
hardware/software complex, and indicates some of
the research problems raised and analytic problems
solved in the course of its construction.
Earlier in 1971 a study was performed to examine the feasibility of a very large multiprocessor computer for artificial intelligence research. This work, reported
in the proceedings paper by Bell and Freeman, had an
influence on the hardware structure. In some sense,
this work can be thought of as a feasibility study for
larger multiprocessor systems. Thus, the reader might
look at the Bell and Freeman paper for general overview and potential, while this paper has more specific
details regarding implementation since it occurs later
and is concerned with an active project. It is recommended that the two papers be read in sequence.
The following section contains requirements and
background information. The next section describes
the hardware structure. This section includes the
analysis of an important problem in the hardware design:
interference due to multiple processors accessing a
common memory. The operating system philosophy and its structure are given, together with a detailed analysis of one of the problems incurred in the design. One
problem is determining the optimum number of "locks"
which are in the scheduling primitives. The final section
discusses a few programming problems which may
arise because of the possibilities of parallel processing.

REQUIREMENTS

The CMU multiprocessor project is designed to satisfy two requirements:

1. particular computation requirements of existing
research projects; and
2. research interest in computer structures.
The design may be viewed as attempting to satisfy the computational needs with a system that is conservative enough to ensure successful construction within a two-year period. While first satisfying this constraint, the system is to be a research vehicle for multiprocessor
systems with the ability to support a wide range of
investigations in computer design and systems programming.
The range of computer science research at CMU (i.e., artificial intelligence, system programming, and
computer structures) constrains processing power, data
rates, and memory requirements, etc.
(1) The artificial intelligence research at CMU concerned with speech and vision imposes two
kinds of requirements. The first, common to
speech and vision, is that special high data rate,
real time interfaces are required to acquire data
from the external environment. The second, more stringent, requirement is real-time processing for
the speech-understanding system. The forms of
parallel computation and intercommunication
in a multiprocessor are a matter for intensive investigation, but seem to be a fruitful approach
to achieve the necessary processing capability.
(2) There is also a significant effort in research on
operating systems and on understanding how
software systems are to be constructed. Research in these areas has a strong empirical and experimental component, requiring the design and construction of many systems. The primary

* This work was supported by the Advanced Research Projects
Agency of the Office of the Secretary of Defense (F44620-70-0107)
and is monitored by the Air Force Office of Scientific Research.


requirement of these systems is isolation, so
they can be used in a completely idiosyncratic
way and be restructured in terms of software
from the basic machine. These systems also
require access by multiple users and varying
amounts of secondary memory.
(3) There is also research interest in using Register Transfer Modules (RTM's), developed here and at Digital Equipment Corporation (Bell, Grason, et al., 1972) and in production as the PDP-16, which are designed to assist in the fabrication of hardware/software systems. A dedicated facility is needed for the design and testing of experimental systems constructed of these modules.
TIMELINESS OF MULTIPROCESSOR
We believe that to assemble a multiprocessor system
today requires research on multiprocessors. Multiprocessor systems (other than dual processor structures) have not become current art. Possible reasons for this state of affairs are:
1. The absolutely high cost of processors and
primary memories. A complex multiprocessor
system was simply beyond the computational
realm of all but a few extraordinary users, independent of the advantage.
2. The relatively high cost of processors in the
total system. An additional processor did not
improve the performance/ cost ratio.
3. The unreliability and performance degradation of operating system software; providing a still more complex system structure would be futile.
4. The inability of technology to permit construction of the central switches required for such
structures due to low component density and
high cost.
5. The loss of performance in multiprocessors due
to memory access conflicts and switching delays.
6. The unknown problems of dividing tasks into
sub tasks to be executed in parallel.
7. The problems of constructing programs for
execution in a parallel environment. The possibility of parallel execution demands mechanisms
for controlling that parallelism and for handling
increased programming complexity.
In summary, the expense was prohibitive, even for
discovering what advantages of organization might
overcome any inherent decrements of performance.
However, we appear to have now entered a techno-

logical domain when many of the difficulties listed
above no longer hold so strongly:
1'. Provided we limit ourselves to multiprocessors of minicomputers, the total system cost of processors and primary memories is now within the price range of a research and user facility.
2'. The processor is a smaller part of the total
system cost.
3'. Software reliability is now somewhat improved,
primarily because a large number of operating
systems have been constructed.
4'. Current medium and large scale integrated
circuit technology enables the construction of
switches that do not have the large losses of the
older distributed decentralized switches (i.e., busses).
5'. Memory conflict is not high for the right balance
of processors, memories and switching system.
6'. There has been work on the problem of task
parallelism, centered around the ILLIAC IV
and the CDC STAR. Other work on modular
programming [Krutar, 1971; Wulf, 1971] suggests how subtasks can be executed in a pipeline.
7'. Mechanisms for controlling parallel execution,
fork-join (Conway, 1963), P and V (Dijkstra,
1968), have been extensively discussed in the
literature. Methodologies for constructing large
complex programs are emerging (Dijkstra, 1969,
Parnas, 1971).
In short, the price of experimentation appears reasonable, given that there are requirements that appear
to be satisfied in a sufficiently direct and obvious way
by a proposed multiprocessor structure. Moreover,
there is a reasonable research base for the use of such
structures.

RESEARCH AREAS
The above state of affairs does not settle many issues about multiprocessors, nor make their development routine.
The main areas of research are:
1. The multiprocessor hardware design which we
call the PMS structure (see Bell and Newell,
1971). Few multiprocessors have been built,
thus each one represents an important point in
design space.
2. The processor-memory interconnection (i.e.,
the switch design) especially with respect to
reliability.


3. The configuration of computations on the multiprocessor. There are many processing structures
and little is known about when they are appropriate and how to exploit them, especially
when not treated in the abstract but in the context of an actual processing system:
Parallel processing: a task is broken into a
number of subtasks and assigned to separate
processors.
Pipeline processing: various independent
stages of the task are executed in parallel
(e.g., as in a co-routine structure).
Network processing: the computers operate
quasi-independently with intercommunication
(with various data rates and delay times).
Functional specialization: the processors have
either special capabilities or access to special
devices; the tasks must be shunted to processors as in a job shop.
Multiprogramming: a task is only executed
by a single processor at a given time.
Independent processing: a configurational
separation is achieved for varying amounts
of time, such that interaction is not possible
and thus doesn't have to be processed.
4. The decomposition of tasks for appropriate
computation. Detailed analysis and restructuring
of the algorithm appear to be required. The
speech-understanding system is one major
example which will be studied. It is interesting
both from the multiprocessor and the speech
recognition viewpoints.
5. The operating system design and performance.
The basic operating system design must be
conservative, since it will run as a computation facility; however, it has substantial research interest.
6. The measurement and analysis of performance
of the total system.
7. The achievement of reliable computation by
organizational schemes at higher levels, such as
redundant computation.
THE HARDWARE STRUCTURE
This section will briefly describe the hardware design
without explicitly relating each part to the design constraints. The configuration is a conventional multiprocessor system. The structure is given in Figure 1.
There are two switches, Smp and Skp, each of which provides intercommunication between two sets of components. Smp allows each processor to communicate
with all primary memories (in this case core).

[Figure 1-Proposed CMU multiminiprocessor computer/C.mmp. Smp: m-to-p crosspoint between the primary memories and the processors; Skp: p-to-k switch (null/dual-duplex/crosspoint) between the processors and the K.configurations. Pc/central processor; Mp/primary memory; T/terminals; Ks/slow device control (e.g., for Teletype); Kf/fast device control (e.g., for disk); Kc/control for clock, timer, interprocessor communication. Both switches have static configuration control by manual and program control.]
Skp allows each processor (Pc) to communicate with the various controllers (K), which in turn manage the secondary memories (Ms) and the I/O device transducers (T). These switches are under both processor
and manual control.
Each processor system is actually a complete computer with its own local primary memory and controllers for secondary memories and devices. Each
processor has a Data operations component, Dmap,
for translating addresses at the processor into physical
memory addresses. The local memory serves both to
reduce the bandwidth requirements to the central
memory and to allow completely independent operation and off-line maintenance. Some of the specific
components shown in Figure 1 are:
K.clock: A central clock, K.clock, allows precise
time to be measured. A central time base is
broadcast to all processors for local interval
timing.
K.interrupt: Any processor is allowed to generate
an interrupt to any subset of the Pc configuration at any of several priority levels. Any processor may also cause any subset of the configuration to be stopped and/or restarted. The
ability of a processor to interrupt, stop, or
restart another is under both program and
manual control. Thus, the console loading function is carried out via this mechanism.
Smp: This switch handles information transfers
between primary memory, processors, and I/O
devices. The switch has ports (i.e., connections)
for m busses for primary memories and p busses
for processors. Up to min(m,p) simultaneous
conversations are possible via the cross-point arrangement. Smp can be set under programmed control or
via manual switches on an override basis to
provide different configurations. The control
of Smp can be by any of the processors, but one
processor is assigned the control.
Mp: The shared primary memory, Mp, consists
of (up to) 16 modules, each of (up to) 65k 16-bit
words. The initial memories being used have the
following relevant parameters: core technology;
each module is 8-way interleaved; access time is
250 nanoseconds; and cycle time is 650 nanoseconds. An analysis of the performance of these memories within the C.mmp configuration is given in more detail below.
Skp: Skp allows one or more of k Unibusses (the common bus for memory and i/o on an isolated PDP-11 system), which have several slow controllers, Ks (e.g., teletypes, card readers), or fast controllers, Kf (e.g., disk, magnetic tape), to be connected to one of p central processors. The k Unibusses for the controllers are connected to
the p processor Unibusses on a relatively long
term basis (e.g., fraction of a second to hours).
The main reason for only allowing a long term, but switchable, connection between the k Unibusses and the processors is to avoid the problem of having to decide dynamically which of the p processors manages a particular control.
Like Smp, Skp may be controlled either by
program or manually.
Pc: The processing elements, Pc, are slightly
modified versions of the DEC PDP-11. (Any of
the PDP-11 models may be intermixed.)
Dmap: The Dmap is a Data operations component
which takes the addresses generated in the
processor and converts them to addresses to use
on the Memory and Unibusses emanating from
the Dmap. There are four sets of eight registers
in Dmap, enabling each of eight 4,096 word
blocks to be relocated in the large physical
memory. The size of the physical Mp is 2^20 words (2^21 bytes). Two bits in the processor,
together with the address type are used to
specify which of the four sets of mapping registers is to be used.
Dmap

The structure of the address map is described below
and in Figure 2 together with its implications for two
kinds of programs: the user and the monitor programs.
For the user program, the conventional PDP-11 addressing structure is retained-except that a program
does not have access to the "i/o page," and hence the
full 16-bit address space refers to the shared primary
memory.
A PDP-11 program generates a 16-bit address, even
though the Unibus has 18-bit addressing capability.
In this scheme the additional two address bits are
obtained from two unused program status (PS) register
bits. (Note, this register is inaccessible to user programs.) These two additional bits provide four addressing modes:

00-mode, 01-mode, 10-mode: These addresses are always mapped, and always refer to the shared, large, primary memory.

11-mode: All but 8 kw (kilowords) of this address space is mapped as above. The 8 kw of this space which is not mapped refers to the private Unibus of each processor; 4 kw of this space is for private (local) memory and 4 kw is used to access i/o devices attached to the processor.

[Figure 2-Format of data in the relocation registers. The user's 16-bit address, together with the bank-selection bits (banks 00, 01, 10, 11), selects a relocation register; part of bank 11 is not relocated (local Unibus). Each register holds bits reserved for expansion of the physical page number, a reserved bit, NXM, write protect, 'written-into', and the physical page number, from which the 21-bit Unibus address is formed.]

For mapped references, the mapping consists of using the most significant five bits of the 18-bit address to select one of the 30 relocation registers, and replacing these five bits by the 8-bit physical page number held in that register, yielding an overall 21-bit address. Alternatively, consider that two bits of the PS select one of four banks of relocation registers and the leftmost three bits of the user's (16-bit) address select one of the eight registers in this bank (six in bank three). A program may (by appropriate monitor calls) alter the contents of the relocation registers within that bank and thus alter its "instantaneous virtual memory", that is, the set of directly addressable pages. The format of each of the 30 relocation registers is also shown in Figure 2, where:
1. The 'written-into' bit is set (to 1) by the hardware whenever a write operation is performed on
the specified page.
2. The 'write protect' bit, when set, will cause a
trap on (before) an attempted write operation
into the specified page.
3. The NXM, 'non-existent memory', when set,
will cause a trap on any attempted access to the
specified page. Note: this is not adequate for,
nor intended for, 'page fault' interruption.
4. The 8-bit 'physical page number' is the actual
relocation value.
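A short sketch may help fix the mapping just described. It translates a 16-bit processor address, extended by the two PS bits, into a 21-bit physical address; the register-table layout, field names, and trap handling are simplified assumptions for illustration, and the unmapped 8 kw of the 11-mode space is not modeled.

PAGE_BITS = 13     # 4,096 sixteen-bit words = 8,192 bytes per relocatable block

def dmap_translate(addr16, ps_bits, regs, write=False):
    # regs is a 4 x 8 table; each entry holds the Figure 2 fields as a dict:
    # {'page': physical page number, 'nxm': bool, 'wp': bool, 'written': bool}.
    addr18 = (ps_bits << 16) | addr16          # two PS bits extend the address to 18 bits
    bank = addr18 >> 16                        # top two bits select the register bank
    reg = (addr18 >> PAGE_BITS) & 0x7          # next three bits select a register in the bank
    r = regs[bank][reg]
    if r['nxm']:
        raise MemoryError('NXM trap')          # non-existent memory
    if write:
        if r['wp']:
            raise PermissionError('write-protect trap')
        r['written'] = True                    # 'written-into' bit is set by the hardware
    offset = addr18 & ((1 << PAGE_BITS) - 1)
    return (r['page'] << PAGE_BITS) | offset   # 8-bit page number + 13-bit offset = 21 bits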

THE MEMORY INTERFERENCE PROBLEM
One of the most crucial problems in the design of
this multiprocessor is that of the conflict of processor
requests for access to the shared memories.
Strecker (1970) gives closed form solutions for the
interference in terms of a defined quantity, the UER
(unit execution rate). The UER is, effectively, the rate of memory references and, for the PDP-11, is approximately twice the actual instruction execution rate.


(Although a single instruction may make from one to
five memory references, about two is the average.)
Neglecting i/o transfers*, assuming access requests to
memories at random, and using the following mean
parameters:
tp = the time between the completion of one memory request and the next request
ta, tc = the access time and cycle time of the memories to be used
tw = tc - ta = the rewrite time of the memory
Strecker gives the following relations:
UER = (m/tc) [1 - (1 - 1/m)^p]

UER = m [1 - (1 - 1/m)^p] / ( tp + tc [1 - (1 - 1/m)^p] )

UER = (m/tc) [1 - (1 - Pm/m)^p]

where Pm is the solution of

Pm + (m/p) ((tp - tw)/tc) [1 - (1 - Pm/m)^p] - 1 = 0

Various speed processors, various types of memories, and various switch delays, td, can be studied by means of these formulas. Switch delay effects are calculated by adding td to ta and tc, i.e., ta' = td + ta and tc' = td + tc. For example, the following cases are given in the attached graphs. The graphs show UER × 10^6 as a function of p for various parameters of the memories. The two values of td shown correspond to the estimated switch delay in two cable-length cases: 10' and 20'. The tc, ta values correspond to six memory systems which were considered. The value of tp is that for the PDP-11 model 20.
Given data of the form in Figures 3 and 4 it is possible to obtain the cost effectiveness of various processor-memory configurations. An example of this
information for a particular memory configuration
(16 memories, te = 400) and three different processors
(roughly corresponding to three models of the PDP-II
family) is plotted in Figure 5. Note that a small configuration of five Pc.I's has a performance of 4.5 X 106
accesses/second (UER). The cost of such a system is
approximately $375K, yielding a cost-effectiveness of
12. Replacing these five processors with the same
number of Pc.3's yields a UER of 15 X 106 for about
$625K, or a cost-effectiveness of about 24. Following
this strategy provides a very cost-effective system
once a reasonably large number of processors are used.
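The first of these relations is easy to evaluate, and the quoted cost-effectiveness figures follow directly from the quoted UER values and prices. The sketch below covers only the simplest case, in which the processor re-request time is neglected and the switch delay is folded into the cycle time; it is not a reconstruction of the fuller model behind Figures 3 through 5.

def uer_simple(p, m, tc_ns, td_ns=0.0):
    # UER = (m/tc') [1 - (1 - 1/m)^p], with the switch delay td added to the
    # memory cycle time (tc' = tc + td).  Processor think time tp is ignored.
    tc = (tc_ns + td_ns) * 1e-9                # seconds
    return (m / tc) * (1.0 - (1.0 - 1.0 / m) ** p)

print(uer_simple(p=5, m=16, tc_ns=400, td_ns=190))   # an upper bound for five processors

# Cost-effectiveness as used in Figure 5: accesses per second per dollar.
print(4.5e6 / 375e3)    # five Pc.1's: 12
print(15e6 / 625e3)     # five Pc.3's: 24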

* A simple argument indicates that i/o traffic is relatively
insignificant, and so has not been considered in these figures. For
example, transferring with four drums or 15 fixed head disks at
full rate is comparable to one Pc.


accurate method under consideration is to associate
a small memory with each crosspoint intersection.
This can be constructed efficiently by having a memory
array for each of the m rows, since control is on a row
(per memory) basis. When each request for a particular row is acknowledged, a 1 is added to the register corresponding to the processor which gets the request.
These data could then serve as input to algorithms of
the type described under (1). Such a scheme has the
drawback of adding hardware (cost) to the switch, and
possibly lowering reliability. Since the performance
measures given earlier are quite good, even for large
numbers of processors, this approach does not seem
justified at this time.

[Figure legend (Figures 3 and 4): UER × 10^6 as a function of p. Processor: tp = 700 ns (PDP-11 model 20); p = 1, 5, 10, ..., 35. Memory: number of memory modules = 8; (tc, ta) = (300, -), (400, 250), (650, 350), (900, 350), (1200, 500); td = 190, 270.]


A report by McCredie (McCredie, 1972) discusses
two analytic models which have been used to study
this problem; here we shall merely indicate the results.
Figure 7 illustrates the relationship predicted by one
of McCredie's models between the mean response time
to a scheduling request, the number of critical sections,
and the number of processors.
Mean response time increases with the number of
processors. For S constant, the increase in mean response time is approximately linear, with respect to
N, until the system becomes congested. As N increases
beyond this point, the slope grows with increasing N.
The addition of one more critical section significantly improves mean response, for higher values of
N, in both models. The additional locking overhead, L,
associated with each critical section degrades performance slightly for small values of N. At these low
values of N, the rate of requests is so low that the extra
locking overhead is not compensated for by the potential parallel utilization of critical sections.
The most interesting characteristic of these models
is the large performance improvement achieved by
the creation of a small number of additional critical
sections. The slight response time degradation for low
arrival rates indicates that an efficient design would
be the implementation of a few (S = 2, 3 or 4) critical
sections. This choice would create an effective safety
valve. Whenever the load would increase, parallel
access to the data would occur and the shared scheduling information would not become a bottleneck.
The overhead at low arrival rates is about 5 percent and the improvement at higher request rates is approximately 50 percent.
Given the dramatic performance ratios predicted by these models, the HYDRA scheduler was designed so that S lies in the range 2-7 (the exact value of S depends upon the path through the scheduler).
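The design point is easy to illustrate: the shared scheduling data are split into S independently locked partitions, so that under load, requests that fall in different partitions proceed in parallel instead of serializing on a single lock. The sketch below is only an illustration of that idea; the data layout, the placement rule, and the class name are assumptions, not the HYDRA scheduler.

import threading

class PartitionedScheduler:
    # Scheduling state split across S critical sections (S small, e.g. 2 to 4).
    def __init__(self, s=4):
        self.locks = [threading.Lock() for _ in range(s)]
        self.queues = [[] for _ in range(s)]

    def _partition(self, task_id):
        return task_id % len(self.locks)       # illustrative placement rule

    def enqueue(self, task_id, task):
        i = self._partition(task_id)
        with self.locks[i]:                    # only about 1/S of requests contend here
            self.queues[i].append(task)

    def dequeue(self, task_id):
        i = self._partition(task_id)
        with self.locks[i]:
            return self.queues[i].pop(0) if self.queues[i] else None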

PROGRAMMING ISSUES

Thus far both highly general and highly specific aspects of the hardware and operating system design of C.mmp have been described. These alone, however, do not provide a complete computing environment in which productive research can be performed. An environment of files, editors, compilers, loaders, debugging aids, etc., must be available. To some extent existing PDP-11 software can and will be used to supply these facilities. However, the special problems and potentials of a multiprocessor preclude this from being a totally appropriate set of facilities.
The potential of true parallel processing obviously requires the introduction of language and system
facilities for creating and synchronizing sub-tasks.
Various proposals for these mechanisms have existed
for some time, such as fork-join, "P" and "V", and
they are not especially difficult to add to most existing
languages, given the right basic hardware. Parallelism
has a more profound effect on the programming environment, however, than the perturbations due to a
few language constructs. The primary impact of
parallelism is in the increase in complexity of a system
due to the possible interactions between its components. The need is not merely for constructs to invoke
and control parallel programs, but for conceptual tools
dealing with the complexity of programs that can be
fabricated with these constructs.
In its role as a substrate for a number of research projects, C.mmp has spawned a project to investigate the conceptual tools necessary to deal with complex programs. The premise of this research is that the
approach to building large complex programs, and
especially those involving parallelism, is essentially
methodological in nature: the primitives, i.e., language
features, from which a program is built are not nearly
as important as the way in which it is built. Two particular methodologies, "top-down design" or "structured programming" (Dijkstra, 1969) and "modular decomposition" (Parnas, 1971), have been studied by others and form starting points for this research.
While the solution to building large systems may
be methodological, not linguistic, in nature, one can
conceive of a programming environment, including a
language, whose structure facilitates and encourages
the use of such a methodology. Thus the context of
the research has been to define such a system as a
vehicle for making the methodology explicit. Although
they are clearly not independent, the language and
system issues can be divided for discussion.

Language issues

Most language development has concerned itself
with "convenience"-providing mechanisms through
which a programmer may more conveniently express
computation. Language design has largely abdicated
responsibility for the programs which are synthesized
from the mechanisms it provides. Recently, however,
language designers have realized that a particular
construct, the general goto, can be (mis)used to easily
synthesize "faulty" programs and a body of literature
has developed around the theoretical and practical
implications of its removal from programming languages (Wulf, 1971a).


At the present stage of this research it is easier to
identify constructs which, in their full generality, can be (mis)used to create faulty programs than to identify forms for the essential features of these constructs which cannot be easily misused. Other such constructs are:
Algol-like scope rules

The intent of scope rules in a language is to provide
protection. Algol-like scope rules fail to do this in two
ways. First, and most obviously, these rules do not
distinguish kinds of access; for example, "read-only"
access is not distinguished from "read-write" access.
Second, there is no natural way to prevent access to a
variable at block levels "inside" the one at which it is
declared.
Encoding

A common programming practice is to encode information, such as age, address, and place of birth, in
the available data types of a language, e.g., integers.
This is necessary, but leads to programs which are
difficult to modify and debug if the manipulation of
these encodings is distributed throughout a large program.
Fixed representations

Most programming languages fix both syntactic
and run-time representations; they enforce distinctions between macros and procedures, data and program, etc., and they provide irrevocable representations of data structures, calling sequences, and storage
allocation. Fixed representations force programmers to
make decisions which might better be deferred and, occasionally, to circumvent the fixed representation
(e.g., with in-line code).
SYSTEMS ISSUES
Programming should be viewed as a process, not a
timeless act. A language alone is inadequate to support
this process. Instead, a total system that supports all
aspects of the process is sought. Specifically, some
attributes of this system must be:
(a) To retain the constructive path in final and
intermediate versions of a program and to make
this path serve as a guide to the design, construction, and understanding of the program.

For example, the source (possibly in its several
representations) corresponding to object code
should be recoverable for debugging purposes;
this must be true independent of the binding
time for that code.
(b) To support execution of incomplete programs.
A consequence of some of the linguistic issues
discussed above is that decisions (i.e., code to
implement them) will be deferred as long as
possible. This must not pre-

[TABLE I (continued)-Formatted Sentences 1-3; the column format of the table is not reproducible here. Sentence 3: INTEREST WAS INITIALLY FOCUSED ON CHANGES IN POTASSIUM; MORE RECENTLY, CHANGES IN CALCIUM HAVE BEEN RECOGNIZED TO BE OF GREAT IMPORTANCE.]

TABLE II-Formatted Sentences 4-6
Symbols as in Table 1
4. TOXIC DOSES OF DIGITALIS CONSISTENTLY REDUCE THE INTRACELLULAR CONCENTRATION OF POTASSIUM IN A WIDE VARIETY OF CELLS, INCLUDING CARDIAC MUSCLE CELLS.
5. THIS RESULTS FROM THE SLOWING OF THE INFLUX OF POTASSIUM INTO THE CELL.
6. CONCURRENTLY, INTRACELLULAR SODIUM AND WATER ARE INCREASED.
[The analysis rows 4.1-4.2, 5.1-5.2, and 6.1-6.2 of the table are not reproducible here.]


TABLE III-Formatted Sentence 7
Symbols as in Table 1

7. IT IS NOT CERTAIN WHETHER THESE LINKED CHANGES IN SODIUM AND POTASSIUM ARE PRODUCED BY A SINGLE EFFECT OR ARE SEPARATELY MEDIATED.
[The analysis rows 7.1-7.6 of the table are not reproducible here.]

In (1.): While changes in cells produced by digitalis is
ambiguous in English as a whole, it is not ambiguous in
the sublanguage since nouns in the class C (cells) do
not occur as the subject of Vss verbs (produce). In the
sublanguage the word changes only operates on quantity
words Q (e.g., amount, rate) or verbs which have an
implicit Q on them. In the formats, therefore, change
occupies the V q position. In the format for sentence 1
this places internal milieu of cells in the Se position,
suggesting that it contains an implicit Q and V. This
is supported by the fact that the paraphrase changes in
the amounts of X1, X2, ... in cells is an acceptable substitute for internal milieu of cells in all its textual
occurrences in this sublanguage.
In (2.): The first part of sentence 2 contains lost
repeated material (zeroing) which can be reconstructed
because of the strong grammatical requirements on the
superlative form: Most prominent have been changes
in ... is filled out to Most prominent of these changes
have been changes in . . .. These changes is a classifier
sequence replacing the full repetition of sentence 1,
which is then shown in the format as the first (zeroed)
unary sentence of 2.
In (2.3-2.5): The word which indicates that changes
(along with digitalis produces) has been zeroed is the
repeated in after and. In 2.2, the V in Se is (have)
concentration in (or: concentrate to some amount in),
which in the sublanguage requires an object noun from
the gross tissue-cell class T. Similarly in 2.3, the V
fluxes (with unspecified P) requires an object noun from
T. In the analyzed texts both of these Vs occurred almost
exclusively with the noun cell as their object. The
definitional connective that is between 2.2,3 and 2.4,5
supports substituting the word cell for T.
In (3.): The sublanguage requirements on the noun
class I (potassium, sodium) as the first noun in Se1
when Se is operated on by V q (changes), are that the
verb be of the type V IT or V II and the second noun be of
class T or I. The continuity of this sentence with its
surrounding sentences suggests that the verb is V IT and
the noun T (more specifically C: cell).
In (5.1): The pronoun this replaces the entire preceding sentence.
In (7.): These linked changes in sodium and potassium
transforms into These changes in sodium and potassium
which are linked. The portion up to which are linked is
a classifier of the two preceding unary sentences, 6.1
and 5.2, pinpointed by the repetition of the words
sodium and potassium in the classifier sequence. It is
these two conjoined unary sentences which are operated
on by a single effect produces in lines 7.1, 7.2, and again
by mediates separately with unknown N subject, in
lines 7.3, 7.4. The portion which are linked applies to
both occurrences of 6.1 and 5.2 in 7.1-4. The wh in
which is the connective and the ich part is a pronoun for
the two sentences, as indicated by { }. The fact that
the sentences were reconstructed by use of a classifier
is indicated by the < > inside the { } in 7.5-6. Although
this sentence seems empty, it is common in scientific
writing for a sentence to consist of references to previous
sentences with new operators and conjunctions operating
on the pronouned sentences. The linearity of language


makes it difficult to express complex interconnections
between the events (sentences) except with the aid of
such pronouned repetitions of the sentences.
The appearance of a word like effect in the column
usually filled by a pharmacological agent noun G may
herald the future occurrence of a new elementary
sentence or a new set of conjoined elementary sentences
(classified by the word effect) which will intervene
between G and the present Se. This appears to be one
of the ways that new knowledge entering the subfield
literature is reflected in the formats and the sublanguage grammar.
In fact, in the work described here, the first investigation, which covered digitalis articles up to about
1965, showed certain sets of words (including mechanism, pump and, differently, ATPase) appearing in the
No or Ds column as an operator on Se. In later articles,
which were investigated later, these nouns appeared
increasingly as subjects of new Se subtypes listed above
in the grammar, connected by conjunctions to the
previously known Se. The shift of these words from
occurring as operators to occurring in (or as classifiers
of) new Se subtypes is the sublanguage representation
of the advance of knowledge in the subfield.
ACKNOWLEDGMENTS
This work was supported by Research Grants R01 LM
00720-01, -02, from the National Library of Medicine,

National Institutes of Health, DHEW. Important
parts of the sublanguage grammar are the work of
James Munz, to whom many of the results and methods
are due.
REFERENCES
1 F W LANCASTER
Evaluation of the Medlars demand search
National Library of Medicine 1968
2 Proceedings of 1971 Annual Conference of the ACM pp
564-577
3 N SAGER
Syntactic analysis of natural language
Advances in Computers 8 F Alt and M Rubinoff eds
Academic Press New York 1967
4 N SAGER
The string parser for scientific literature
Courant Computer Symposium 8-Natural Language
Processing R Rustin Ed Prentice Hall Inc Englewood Cliffs
N J In press
5 String Program Reports Nos 1-5
Linguistic String Project New York University 1966-1969
6 D HIZ A K JOSHI
Transformational decomposition-A simple description of an
algorithm for transformational analysis of English sentences
2eme Conference sur le Traitement Automatique des
Langues Grenoble 1967
7 String Program Reports No 6
Linguistic String Project New York University 1970
8 A F LYON A C DEGRAFF
Reappraisal of digitalis, Part I, Digitalis action at the cellular
level
Am Heart J 72 4 pp 414-418 1961

Dimensions of text processing*
by GARY R. MARTINS
University of California
Los Angeles, California

INTRODUCTION
Numerical data processing has dominated the computing industry from its earliest days, when computing
might better have been called a craft than an industry.
In those early days it was not uncommon for a mixed
group of scientists and technicians to spend an entire
day persuading a roomful of vacuum tubes and mechanical relays to yield up a few thousand elementary
operations on numbers. The emphasis on numerical
applications was a wholly natural consequence of the
dominant interests of the men and women who designed,
built, and operated those early computing machines.
Within a single generation, things have changed
dramatically. Computing machines are vastly more
powerful and reliable, and easier to use thanks to the
efforts of the software industry. Perhaps of equal
importance, access to computers can now be taken for
granted in the more prestigious centers of education,
commerce, and government. And we may be approaching the day when computing services will be as widely
available as the telephone. But it is still true that
numerical data processing-"number crunching," in one
form or another-is the principal application for
computers.
That, too, is changing, however. Due principally, I
think, to the highly diversified needs and interests of the
greatly expanded community of computer users, the
processing of textual or lexicographic materials already
consumes a significant percentage of this country's
computing resources, and that share is rising steadily.
By the end of this decade, if not before, text processing
of various kinds may well become the main application
of computers in the United States.
This prediction, no doubt, carries the ring of authentic good news for those of us with strong interests in one or another of the many kinds of textual data processing. But we must face the fact now that there remains a serious and large-scale educational task that must be undertaken if the future growth of textual data processing is to fulfill the high hopes for it that we now entertain. Text processing tasks and systems are too often considered in isolation from one another, with the results that (1) much design and implementation work needlessly duplicates prior accomplishments, and (2) potentially useful generalizations and extensions of existing systems for new applications are overlooked.
This is a tutorial paper, then. My purpose is to take a broad view of the text processing field in such a way as to emphasize the relations among different systems and applications. The structure of these relationships will be embedded in an informal descriptive space of two dimensions. In the interests of focussing attention on the unifying character of this framework, however imperfect and incomplete it surely is, I shall avoid the discussion of the internal details of specific systems.

TEXTUAL DATA PROCESSING
By "textual data processing" I mean a computing
process whose input consists entirely or substantially of
character strings. For the most part, it will be convenient to assume that this textual input represents
natural language expressions, such as, for example,
sentences in English. All kinds of systems running today
fit this deliberately broad and usefully loose definition:
programs to automatically make concordances, compile
KWIC indexes, translate between languages, evaluate
personnel reports, drive linotype machines, abstract
documents, answer questions, perform content analysis,
route documents, search libraries, and edit manuscripts.
I am sure everyone here could add to this list. It will be
instructive to include programming language compilers
in our discussion, as well.

* Research reported herein was conducted under Contract
# F30602-70-C-0016 with the Advanced Research Projects
Agency, Department of Defense.


TWO DIMENSIONS OF TEXT PROCESSING
An important set of relationships among these highly
diverse activities can be clarified by locating them in an
informal space of two dimensions: scope and depth of
structure. The dimension of scope has to do with the
magnitude of the task to be performed: the size of the
problem domain, and the completeness of coverage of
that domain. To illustrate, an operating production-oriented Russian-to-English machine translation system
has a potentially vast input domain, namely, all the
sentences of Russian. But an experimental model for
such a system, containing only a tiny dictionary and a
few illustrative grammar rules-something concocted
for a demonstration, perhaps-has a highly restricted
domain. The scope of the two systems differs greatly,
with important consequences which we shall consider in
a moment.
The second of our dimensions measures the richness
of structure or "vertical integration" developed for the
text. This is essentially a linguistic dimension, reflecting
the fact that the text itself is made up of natural
language expressions of one kind or another. This
dimension does not take into account simple linear
physical divisions of text, such as the "lines" and
"pages" of text editing systems like TECO.l Rather, it
measures an essentially non-linear hierarchy of abstract
levels of structure which define the basic units of
interest in a given application.
Scope

Generally, the dimension of scope as applied to the
description of text processing systems does not differ in
any systematic way from the notion of scope as applied
to other systems. It enables us to express the relative
magnitude of the problem domain in which the system
can be of effective use. There are two key factors to be
considered in estimating the scope of a particular
system. The first of these has to do with the generality
of input data acceptable to the system. If the acceptable
input is heavily restricted, the scope of the system is
relatively small. Voice-actuated input terminals have
been rather vigorously promoted during the past few
years; their scope is small indeed, being limited to the
effective recognition of a very small set of spoken words
(names of digits and perhaps a few control words), and
often demanding a mutual "tuning" of the terminal and
its operators. Programming language compilers provide
another rather different example of systems with limited
scope in terms of acceptable input; both the vocabulary
and syntax of acceptable statements are rigidly and
narrowly defined. In contrast, text editing systems in
general have wide scope in terms of acceptable input.

The other factor which plays an important role in
determining the scope of text processing (or other)
systems has to do with the convenience and flexibility
of the interface between the system and its users.
Obviously, this factor will be of lesser importance in the
evaluation of systems operated as a service in a closed-shop batch-processing environment. It will be of major
importance in relation to systems with which users
become directly involved, interactively or otherwise.
Compilers are a good example of such systems, as are
such widely-used statistical packages as SPSS2 and
BMD,3 and interactive processors like BBN-LISP,4
BASIC5 and JOSS. 6 More to our present point are
text-editing systems such as TECO, QED,7 WYLBUR,8
HYPERTEXT9 and numerous others; in terms of
acceptable data input, these latter systems impose few
restrictions, but they may be said to differ significantly
in overall scope on the basis of differences in their
suitability for use by the data processing community at
large. It takes a sophisticated programmer with extensive training to make use of TECO's powerful editing
and filing capabilities, for example; this restricts the
system's scope. A most ambitious assault on this aspect
of the problem of scope in text editing systems is that
of Douglas Engelbart and his colleagues at Stanford
University;lO a review of their intensive and prolonged
efforts should convince anyone of the serious nature of
the difficulties involved in widening the general
accessibility of text editing systems.
I am sure we have all had experiences with text
processing systems of very restricted scope. A decade
ago, it was a practice of some research organizations to
arrange demonstrations of machine translation systems;
in some memorably embarrassing instances, the scope
of these systems was unequal to even the always quite
carefully hedged, and sometimes entirely pre-arranged,
test materials allowed as input. More commonly, we
may have written or used text processing programs of
one kind or another which were created in a deliberately
"quick and dirty" fashion to answer some special need,
highly localized in space and time. It is important to
note that the highly restricted scope of such "one-shot"
programs in no way diminishes their usefulness; given
the circumstances, it may indeed have involved an
extravagant waste of resources to needlessly expand
their scope in either of the two ways I have mentioned.
While it may be quite difficult to measure the
relative scope of different text processing systems, at
least the basic notions involved are simple: breadth of
acceptable input data and, where appropriate, the
breadth of the audience of users to which the system is
effectively addressed. Let us now review the depth of
structure dimension of text processing, where the basic
notions involved may be somewhat less familiar.


Depth of Structure

We may assume that the text to be processed first
appears in the system as a continuous stream of
characters. It will rarely be the case that our interest in
the data will be satisfied at this primitive level of
structure. We will most often be interested in words, or
phrases, or sentences, or meanings, or some other set of
constructs. Since these are not given explicitly in the
character stream, it will be necessary to operate on the
character stream to derive from it, or assign to it, the
kinds of structures that will answer our purposes. The
lowest level of structure, in this sense, consists of the
sequence of characters in the stream. The highest level
of structure might perhaps involve the derivation of the
full range of meanings contained in and implied by the
text. Between these extremes we may define a variety
of useful and attainable structures. The dimension along
which we measure these differences is that to which I
have given the somewhat clumsy name of depth of
structure.
The number of useful applications of text processing
at the lowest level of structural depth-the character
stream-is quite large. Most text editing systems
operate at this level. Other more specialized applications
include the development of character and string occurrence frequencies, of use principally to cryptanalysts and communications engineers. But, for applications which cannot be satisfied by simple mechanical pattern-seeking and matching operations, we must advance at least to the next level of structure, that of the
pseudo-word.
Pseudo-words

The ordinary conventions of orthography and punctuation enable us to segment the character stream into
word-like objects, and also into sentences, paragraphs,
etc. The word-like objects, or pseudo-words, may be
physically defined as character strings flanked by blank
characters and themselves containing no blank characters. This is still a fairly primitive level of structure,
and yet it suffices for many entirely respectable
applications. Concordances have often been made by
programs operating with this level of textual structure,
for example. But there are serious limitations on the
results that can be achieved. It is not possible, for
instance, for the computer to determine that "talk" and
"talked" and "talks" are simply variants of the same
basic word, and that they should therefore be treated
similarly for some purposes. The same difficulty appears
in a more refractory form with the items "go" and
"went." Thus, if our intended application will require


the recognition of such lexicographic variants as members of the same word family, it will be necessary to
approach the next more or less clearly defined level of
textual structure, that of true words.
Word recognition

There are two principal tools used for the recognition
of words in text processing: morphological analysis and
dictionaries. They come in many varieties, large and
small, expensive and cheap. Either may be used without
the other. The Stanford Inquirer content-analysis
systemll employs a very crud£ kind of morphological
analysis which consists in simply cutting certain
character strings from the ends of pseudo-words, and
treating the remainder as a true word. This procedure is
probably better than nothing at all, but it can produce
some bizarre confusions; for example, the letter "d" is
cut from words ending in "ed," on the assumption that
the original is the past tense or past participle form of a
verb. "Bed" is thus truncated to "be," and so on. The
widely used and highly successful KWIC12 indexing
system operates with a crude morphological analysis of
essentially this kind.
More sophisticated morphological analysis is attainable through the use of more flexible criteria by which
word-variants are to be recognized, at the cost of a
correspondingly more complex programming task. But
it is hard to imagine what sort of rules would be needed
to cope with the so-called strong verbs of English. The
best answer to this problem is the judicious use of a
dictionary together with the morphological analysis
procedures. 13 In systems employing both devices, a
pseudo-word is typically looked up in the dictionary
before any analysis procedures are applied to it. If the
word is found, then no further analysis is required, and
the basic form of the word can be read from the
dictionary, perhaps together with other kinds of information. Thus, "bed" would presumably be found in
the dictionary, preventing its procrustean transformation to "be." Likewise, "went" would be found as a
separate dictionary entry, along with an indication that
it is a variant of "go." On the other hand, "talked"
would presumably not be found in the dictionary, and
morphological analysis rules would be applied to it,
yielding "talk"; this latter would be found in the
dictionary, terminating the analysis.
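A minimal sketch of the lookup-then-analyze procedure just described, with a toy dictionary and two crude suffix rules; the entries and rules are purely illustrative, and systems of the kind discussed here used far larger dictionaries and richer morphological analysis.

# Toy dictionary mapping a surface form to its base form; entries are illustrative.
DICTIONARY = {"bed": "bed", "go": "go", "went": "go", "talk": "talk"}

SUFFIX_RULES = ["ed", "s"]                     # crude morphological rules

def base_form(pseudo_word):
    # Dictionary lookup comes first; morphological analysis only on failure.
    word = pseudo_word.lower()
    if word in DICTIONARY:                     # "bed" and "went" stop here
        return DICTIONARY[word]
    for suffix in SUFFIX_RULES:                # "talked" -> "talk", "talks" -> "talk"
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)]
            if candidate in DICTIONARY:
                return DICTIONARY[candidate]
    return word                                # unknown word: fall back to the pseudo-word

print([base_form(w) for w in "Bed went talked talks".split()])
# -> ['bed', 'go', 'talk', 'talk']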
Here I have tacitly assumed that our dictionary
contains all the words of English, or enough of them to
cover a very high percentage of the items encountered
in our input text. In fact, there are very few such
dictionaries in existence in machine-usable form. The
reason is twofold: on the one hand, they are very


expensive to create, and generally difficult to adapt,
requiring a level of skills and time available only to the
more heavily endowed projects; on the other hand,
while they provide an elegant and versatile solution to
the problems of word identification, most current text
processing applications simply do not require the degree
of versatility and power that large-scale dictionaries can
provide.
A number of attempts have been made in the past to
build automatic document indexing and dissemination
systems based upon the observed frequencies of words
in the texts. In these and similar systems, it was found
to be necessary to exclude a number of very frequent
"non-content-bearing" words of English from the
frequency tabulations-such words as "the," "is," "he,"
"in," which we might collectively describe as members
of closed syntactic classes: pronouns, demonstratives,
articles, prepositions, along with a few other high
frequency words of little interest for the given applications. The exclusion of these words is accomplished
through the use of a "stop list," a mini-dictionary of
irrelevant forms. Such small, highly specialized dictionaries are easily and cheaply constructed, and have
proven useful in a wide variety of applications.
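
A frequency tabulation controlled by a stop list is only a few lines of code; the stop list below is a fragment invented for the example.

    from collections import Counter

    # Mini-dictionary of "non-content-bearing" forms (fragmentary example).
    STOP_LIST = {"the", "is", "he", "in", "a", "of", "and", "to", "on"}

    def content_word_frequencies(text):
        """Tabulate word frequencies, excluding closed-class stop words."""
        return Counter(w for w in text.lower().split() if w not in STOP_LIST)

    sample = "the cat is in the hat and the hat is on the mat"
    print(content_word_frequencies(sample))   # Counter({'hat': 2, 'cat': 1, 'mat': 1})
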
Automatic typesetting systems provide another good
example of a useful text processing application operating
at the level of the word on the dimension of depth of
structure. The key problem for these systems is that of
correctly hyphenating words to make possible the right
and left justification of news columns. Morphological
analysis of a quite special kind is employed to determine
the points at which English words may be broken, and
this analysis is often supplemented with small dictionaries of exceptional forms. Another more sophisticated
but closely related application of rather narrow but
compelling interest is that of automatically translating
English text into Braille. Once again, specialized word
analysis, supplemented by relatively small dictionaries
of high frequency words and phrases, have been the
tools brought to bear on the problem.
Before moving on to consider text processing applications of higher rank on the scale of depth of structure,
I should like to pause for a moment to comment on
what I believe to be a wide-spread fallacy concerning
text processing in general. Somehow, a great many
people, in and out of the field of text processing, have
come to associate strong notions of prestigiousness
exclusively with systems ranking at the higher end of the
dimension of depth of structure. It is hard to know how
or why this attitude has developed, unless it is simply a
reflection of a more general fascination with the obscure
and exotic. But it would be most unfortunate if capable
and energetic people were for this reason diverted from
attending to the many still unrealized possibilities in

text processing on the levels we have been discussing.
We have only to compare the widespread usefulness of
text processing systems operating at the word level and
below with the generally meagre practical contributions
of systems located further along this dimension to dispel
the idea that there is greater intrinsic merit in the latter
systems. In now moving further along the dimension of
depth of structure, we leave behind a broad spectrum
of highly practical and useful systems that sort and edit
text, make indexes, classify and disseminate documents,
prepare concordances, set type, translate English into
Braille, perform content analysis, make elementary
psychiatric diagnoses from writing samples, assist in the
evaluation of personnel and medical records, and
routinely carry out many other valuable tasks. It may
be only moderately unjust to repeat here a colleague's
observation that, in contrast, the principal product of
the systems we are about to consider has been doctoral
dissertations.
Syntax

Syntax is, roughly speaking, the set of relations that
obtain among the words of sentences. For some
applications in the text processing field, syntactic
information is simply indispensable. The difference between "man bites dog" and "dog bites man" is a
syntactic difference; it is a difference of no account in
applications based upon word frequencies, for example,
but it becomes crucial when the functions or roles of the
words, in addition to the words themselves, must be
considered.
Syntactic analysis is most neatly accomplished when
the objects of analysis have a structure which is rigidly
determined in advance. The syntactic structure of valid
ALGOL programs conforms without exception to a set
of man-made rules of structure. The same is true of
other modern programming languages, and of artificial
languages generally. This fact, together with the tightly
circumscribed vocabularies of such languages, makes
possible the development of very efficient syntactic
analyzers for them.
Natural languages are very different, even though
some artificial languages, especially query and control
languages, go to great lengths to disguise the difference.
In processing ordinary natural language text we are
confronted with expressions of immense syntactic
complexity. And, while most artificial languages are
deliberately constructed to avoid ambiguity, ordinary
text is often highly ambiguous; indeed, ambiguity is a
vital and productive device in human communication.
The syntactic analysis of arbitrary natural language text
is therefore difficult, expensive, and uncertain. It will
come as no great surprise, then, that text processing
systems that require some measure of syntactic analysis
seldom carry the analysis further than is needed.
Further, the designers of such systems have defined
their requirements for syntactic analysis in a variety of
ways. The result is that existing natural language text
processing systems embody a great variety of analysis
techniques. To some extent, this situation has been
further complicated by debates among linguists as to
what constitutes the correct analysis of a sentence,
though the influence of these polemics has been minor.
Over the past decade, and especially over the past five
years or so, techniques for the automatic syntactic
analysis of natural language text have improved rather
dramatically, and are flexible enough today to accommodate a variety of linguistic hypotheses.
Earlier, in discussing the place of the dictionary in the
identification of words, I mentioned that such dictionaries might carry other information in addition to the
word's basic form. Often, this other information is
syntactic, an indication of the kinds of roles the word is
able to play in the formation of phrases and sentences.
These "grammar codes," as they are often called, are
analogous to the familiar "part of speech" categories we
were taught in elementary school, though in modern
computational grammars these distinct categories may
be numerous. A given word may be assigned one or more
grammar codes, depending upon whether or not it is
intrinsically ambiguous. A word like "lucrative" is
unambiguously an adjective. But "table" may be a
noun or a verb. A word like "saw" exhibits even greater
lexical ambiguity: it may be a noun or either of two
verbs in different tenses.
The process of syntactic analysis, or parsing, generally
begins by replacing the string of words-extracted from
the original character stream as described earlier-by a
corresponding string of sets of grammar codes. It then
processes these materials one sentence at a time. Parsing
in general does not cross sentence boundaries for the
simple, though dismaying, reason that we know very
little about the kinds of rule-determined connections
between sentences, if indeed there are any of substance.
On the other hand, the sentence is the smallest really
satisfactory unit of syntactic analysis since we can be
confident of our results for one part of a sentence only
to the degree that we have successfully accounted for
the rest of it, much as one is only sure of the solution to
a really difficult crossword puzzle when the whole of it
has been worked out.
If a sentence consists entirely of lexically unambiguous words-a rarity in English-then there is only a
single string of grammar codes for the parser to consider.
More commonly, the number of initially possible
grammar code sequences is much higher; it is, in fact,
equal to the product of the number of distinct grammar
codes assigned to each word. Whatever the number, the
parser must consider each of the possible sequences in
turn, first assembling short sequences of codes into
phrases-such as noun phrases or prepositional phrases
-and then assembling the phrases into a unified
sentential structure. At each step of the way, the parser
is engaged in matching the constructs before it (i.e.,
word or phrase codes) against a set of hypotheses
regarding valid assemblies of such constructs. The set of
hypotheses is, in fact, the grammar which drives the
parsing process. When a string of sub-assemblies corresponds to such a hypothesis (or grammar rule), it is
assembled into a unit of the specified form, and itself
becomes available for integration into a broader
structure.
To illustrate, consider just the three words "on the
table." The parser, first of all, sees not these words, but
rather the corresponding string of grammar code sets:
PREPOSITION ARTICLE NOUN/VERB.* Typically, it may first check to see whether it can combine
the first two items. A table of rules tells the parser, as
common sense tells us, that it cannot, since "on the" is
not a valid English phrase. So, it considers the next pair
of items; the ambiguity of the word "table" here requires
two separate tests, one for ARTICLE + NOUN and
the other for ARTICLE + VERB. The former is a valid
combination, yielding a kind of NOUN-PHRASE ("the
table"). The ARTICLE + VERB combination is
discarded as invalid. Now the parser has before it the
following: PREPOSITION NOUN-PHRASE. Checking its table of rules, it discovers that just this set of
elements can be combined to form a PREPOSITIONAL-PHRASE, and the process ends-successfully.
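
The combination process just traced can be sketched in a few lines; the grammar-code table and the two rules below are, of course, a drastically reduced stand-in for a real grammar, and the sketch also makes visible the earlier point that the number of candidate code sequences is the product of each word's ambiguity.

    from itertools import product

    # Grammar codes for the three words of the example (illustrative only).
    CODES = {"on": ["PREP"], "the": ["ART"], "table": ["NOUN", "VERB"]}

    # Hypotheses about valid assemblies: the "grammar" driving the parse.
    RULES = {("ART", "NOUN"): "NOUN-PHRASE",
             ("PREP", "NOUN-PHRASE"): "PREP-PHRASE"}

    def parse(words):
        # One candidate sequence per combination of the words' grammar codes.
        for sequence in product(*(CODES[w] for w in words)):
            items = list(sequence)
            changed = True
            while changed:                         # repeatedly combine adjacent items
                changed = False
                for i in range(len(items) - 1):
                    phrase = RULES.get((items[i], items[i + 1]))
                    if phrase:
                        items[i:i + 2] = [phrase]  # the new unit is available for further assembly
                        changed = True
                        break
            if len(items) == 1:                    # the whole string reduced to one structure
                return items[0]
        return None                                # no code sequence could be assembled

    print(parse(["on", "the", "table"]))           # PREP-PHRASE
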
This skeletal description of the parsing process is
considerably oversimplified, and it omits altogether
some important distinctive characteristics of parsing
techniques which operate by forming broader structural
hypotheses and thus play a more "aggressive" role in the
analysis. The end result, if all goes well, is the same: an
analysis of the input sentence, usually represented in the
form of a labelled tree structure, which assigns to each
word and to each phrase of the sentence a functional
role. Having this sort of information, we are able to
accurately describe the differences between "man bites
dog" and "dog bites man" in terms of the different roles
the words play in them. In this simple case, of course,
the SUBJECT and OBJECT roles are differentially
taken by the words "dog" and "man."

* The notation "X/Y" is used here to indicate an item that may
belong either to category X or to category Y.

I remarked earlier that few production-oriented
systems incorporate large-scale dictionaries. The same
is true of syntactic analysis programs; large-scale
sentence analyzers are still mainly experimental. The
parsing procedures of text processing systems that are
widely used outside the laboratory, with a very few
interesting exceptions, are designed to produce useful
partial results of a kind just adequate to the overall
system's requirements. Economies of design and implementation are usually advanced as the reasons for these
limited approaches to syntactic analysis.
A meritorious example of limited syntactic analysis is
provided by the latest version of the General Inquirer,
probably the best known and most widely used of
content-analysis systems. The General Inquirer-3,14 as
it is called, embodies routines which are capable of
accurately disambiguating a high percentage of multiple-meaning words encountered in input text. This
process is guided by disambiguation rules incorporated
in the system's large-scale dictionary, 15,16 the Harvard
Fourth Psychosociological Dictionary. These rules direct
a limited analysis of the context in which a word
appears, using lexical, syntactic, and semantically-derived cues to arrive at a decision on the intended sense
of the word. In this manner, nine senses are distinguished for the word "charge," eight for "close," seven
for "air," and so on.
There are language translating machines in daily use
by various government agencies in this country. These
machine translation systems are basically extensions of
techniques developed at Georgetown University in the
early 1960's. Their relatively primitive syntactic capabilities are principally aimed at the disambiguation of
individual words and phrases, a task which they
approach-in contrast with the General Inquirer-3-with an anachronistic lack of elegance, economy, or
speed. For most purposes, the output of these systems
passes through the hands of teams of highly-skilled
bilingual editors who have substantial competence in
the subject matter of the texts they repair. A most
valuable characteristic of the government's machine
translation systems is the set of very-large-scale
machine-readable dictionaries developed for them over
the course of the years. It is to be expected that major
portions of these will prove to be adaptable to the more
modern translation systems that will surely emerge in
the years ahead.
A working system employing full-blown syntactic
analysis is the Lunar Sciences Natural Language Information System.17 This system accepts information
requests from NASA scientists and engineers about such
matters as the chemical composition and physical
structure of the materials retrieved from the moon. The
queries are expressed in ordinary English, of which the
system handles a rich subset. Unrecognizable structures
result in the user's being asked to rephrase his query.

The requests are translated into an internal format
which controls a search of the system's extensive data
base of lunar information.

Semantics

The next milestone along the dimension of depth of
structure is the level of semantics, which has to do with
the meanings of expressions. Although semantic information of certain kinds has sometimes been used in support
of lexical and syntactic processes (as, for example, in the
disambiguation procedures of the General Inquirer-3),
the number of working systems, experimental or
otherwise, which process text systematically on the
semantic level is close to zero. Those which do so impose
strong restrictions on the scope of the materials which
they will accept. The aforementioned lunar information
system has perhaps the widest scope of any running
system that operates consistently on the semantic level
(with natural language input), and its semantics are
closely constrained in terms of the concepts and
relations it can process.
More flexible semantic processors have been constructed on paper, and a few of these have been
implemented in limited experimental versions.18,19,20,21
The more promising of these systems, such as RAND's
MIND system,22,23 are based upon the manipulation of
semantic networks in which generality is sought through
the representation of individuals, concepts, relations,
rules of inference, and constructs of these as nodes in a
great labelled, directed graph whose organization is at
bottom linguistic. It is an unsolved problem whether
such an approach can produce the needed flexibility and
power while avoiding classical logical problems of
representation.
The applications to which full-scale semantic processors might one day respond include language
translation and question-answering or fact retrieval. At
present, the often-encountered popular belief in superintelligent machines, capable of engaging in intelligent
discourse with men, is borne out only in the world of
science fiction.
Beyond semantics

Pushing further along the dimension of depth of
structure, beyond the rarified air of semantics, we
approach an even more exotic and sparsely populated
realm which I shall call pragmatics. There will be little
agreement about the manner in which this zone should
be delimited; I would suggest that systems operating at
this level are endowed with a measure of operational
self-awareness, taking an intelligent view of the tasks
confronting them and of their own operations.
A remarkable example of such a system is Winograd's
program24 which simulates an intelligent robot that
manipulates objects on a tabletop in response to
ordinary English commands. The tabletop, the objects
(some blocks and boxes), and the system's "hand" are
all drawn on the screen of a video console. The objects
are distinguished from one another by position, size,
shape, and color. In response to a command such as
"Put the large blue sphere in the red box at
the right."
the "hand" is seen going through the motions necessary
to accomplish this task, possibly including the preliminary removal of other objects from the indicated
box. What is more, the system has a limited but
impressive ability to discuss the reasons behind its
behavior:
Q: Why did you take the green cube out of the red
box?
A: So I could put the blue sphere in it.
Q: Why did you do that?
A: Because you told me to.
We should note that the scope of this system is in many
ways the most restricted of all the systems we have
mentioned, an observation which subtracts nothing
from its great ingenuity.
SOME OBSERVATIONS
Now let me share with you a number of more or less
unrelated observations concerning text processing systems which have been suggested by the two-dimensional
view of the field which I have just outlined.
Information and depth of structure

We process text in order to extract in useful form
(some of) the information contained in the text. In
general, the higher a text processing system ranks on the
dimension of depth of structure, the greater is the
amount of information extracted from a given input
text. This seems an intuitively acceptable notion, but I
believe it can be given a more or less rigorous restatement in the terms of information theory. In sketching
one approach to this end I shall attack the problem at
its most vulnerable points, leaving its more difficult
aspects as a challenge for others.
Consider a system which processes text strictly on a
word-by-word basis, like older versions of the General
Inquirer, for example. Such a system will produce
identical results for all word-level permutations of the
input text. But arbitrary permutations of text in general
result in a serious degradation of information. t We
conclude that text processing systems which operate
exclusively at this level of depth of structure are
intrinsically insensitive to a significant fraction of the
total information content of the original text.
Similarly, syntactic processors whose operations are
confined to single sentences (as is generally the case) are
obviously insensitive to information which depends upon
the relative order of sentences in the original text. And
so on.
These considerations are of interest primarily because
of their potential use in the development of a uniform
metric for the dimension of depth of structure.
The dimensions are continuous*

I want to suggest the proposition that neither of our
dimensions is discrete, in the sense that it presents only
a fixed number of disjoint positions across its range.
That the scope of systems, in our terms, varies over a
dense range is surely not a surprising idea. But the
notion is fairly widespread, I believe, that the depth
of structure of text processors can be described only in
terms of a fixed number of discrete categories or
"levels." Unfortunately, the use of the terms "syntactic
level," "semantic level," etc. is difficult to avoid in
discussing this subject matter, and it is perhaps the
promiscuous use of this terminology which has contributed most to misunderstanding on this point. **

t It may not be easy to usefully quantify this information loss.
The number of distinct word-level permutations of a text of n
words is given by n!/(f1! f2! ... fm!) where fj is the number of
occurrences of the j-th most frequent word in a text with m
distinct words. For word-level permutations, this denominator
expression might be generalized on the basis of the laws of lexical
distribution (assuming a natural language text), replacing the
factorials with gamma function expressions. After that, one faces
the thorny empirical questions: how many such permutations can
be interpreted, wholly or in part, at higher levels of structure, and
how "far" are these from the original?
* The term "continuous" is not meant to support its strict
mathematical interpretation here.
** In the possibly temporary absence of a more satisfactory
solution, a linear ordering for the dimension of depth of structure
is derived informally in the following way. First, the various
"levels" are mutually ordered in the traditional way (sublexical,
lexical, syntactic, semantic, pragmatic, ... ) on the empirical
grounds that substantial and systematic procedures on a given
"level" are always accompanied by (preparatory) processing on
the preceding "levels," but not vice-versa. Various theoretical
arguments why this should be so can also be offered. Then, within
a given "level" we may find various ways to rank systems with
respect to the thoroughness of the job they do there.
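
(A concrete check of the count given in the first footnote, added here for illustration only; the five-word sentence is invented.)

    from math import factorial
    from collections import Counter

    def word_level_permutations(text):
        """n! / (f1! f2! ... fm!) for a text of n words and m distinct words."""
        words = text.split()
        count = factorial(len(words))
        for f in Counter(words).values():
            count //= factorial(f)
        return count

    # "the cat saw the dog": n = 5 and "the" occurs twice, so 5!/2! = 60
    print(word_level_permutations("the cat saw the dog"))   # 60
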


As I have tried to indicate, there is in fact considerable
variation among existing text processing systems in the
degree to which they make use of information and
procedures appropriate to these various levels. There
are programs, for example, which tread very lightly into
the syntactic area, making only occasional use of very
narrowly circumscribed kinds of syntactic processes.
Such programs are surely not to be lumped together
with those which, like the MIND system's syntactic
component,25 carry out an exhaustive analysis of the
sentential structure, merely because they both do some
form of what we call syntax. The same is true of the
other levels along the dimension of depth of structure;
the great variety of actual implementations defies
meaningful description in terms of a small number of
utterly distinct categories. We use the terminology of
"levels" because, in passing along this dimension of
depth of structure we pass in turn a small number of
milestones with familiar names; but it is a mistake to
imagine that nothing exists between them.
Now I want to conjecture that it is precisely the
quasi-continuous nature of this dimension which has
helped to sabotage the efforts of researchers and system
designers over the years to bring into being a small
number of nicely finished text processing modules out
of which everyone might construct the working system
of his choice. The ambition to create such a set of
universal text processing building blocks is a noble one
and, like Esperanto, much can be said in its favor. But
those who have worked on the realization of this scheme
have not enjoyed success commensurate with the
loftiness of their aims.
Why this unfortunate state of affairs? I believe it can
be traced to the generally unfounded notion in the minds
of text processing system designers, and their sponsors,
that valuable economies of talent and time and money
can be achieved by creating systems which, in effect,
advance as little as possible along this dimension in order
to get by. In fact, as is evident, for example, in some
recent machine translation products, this attitude may
be productive of needlessly complex and inflexible
systems that are obsolescent upon delivery.
Of course, in many cases differences in hardware and
software have prevented system designers from making
use of existing programs. And in some cases the point
might be made that considerations of operating
efficiency dictated a "tight code" approach, ruling out
the incorporation of what appears to be the unnecessary
power and complexity of overdesigned components. But
many of the systems in this field are of an experimental
nature, where operating efficiency is relatively unimportant. And it often happens in the course of
systems development, that our initial estimate of depth
of structure requirements turns out to be wrong; that is

how "kluges" get built. In such instances, the aim at
local economies results in a global extravagance.
A partial remedy is for us to become familiar with the
spectrum of requirements that systems designers face
along this dimension of depth of structure, and to learn
(1) how to build adaptable processing modules, and (2)
how to tune these to the needs of individual systems.
I invite you to join me in the belief that this can and
should be done.
Lines of comparable power

Let us consider for a moment the four "corners" of
our hypothetical two-dimensional classification space.
Since we have no interpretation of negative scope or
negative depth of structure, we will locate all systems in
the positive quadrant of the two-dimensional plane. At
the minimum of both axes, near the origin, we might
locate, say, a character-counting program written as a
week-end project by a library science student in a
mandatory PL/1 course.
High in scope, and low in depth of structure are text
editing programs in general. Let us somewhat arbitrarily
locate Englebart's editing system in this corner, on the
basis of its strong user orientation.
Low in scope, but high in depth of structure: this
practically defines Winograd's tabletop robot simulator.
The domain of discourse of this system is deliberately
severely restricted, but it surpasses any other system I
know of in its structural capabilities.
High in both scope and depth of structure: in the real
world, no plausible candidate exists. We might imagine
this corner of our space filled by a system such as HAL,
from the movie "2001"; but nothing even remotely
resembling such a system has ever been seriously
proposed.
The manner in which real-world systems fit into our
descriptive space suggests that some kind of trade-off
exists between the two dimensions; perhaps it is no
accident that the system having the greatest depth-of-structure capabilities is so severely restricted in scope,
while the systems having the greatest scope operate at
a low level of structural depth. It is my contention that
this is indeed not an accident, but that it reflects some
important facts about our ability to process textual
information automatically. It would seem that, given
the current state of the art, we can, as system designers,
trade off breadth of scope against advances in structural
depth, and vice versa, but that to advance on both
fronts at once would require some kind of genuine
breakthrough.
This trading relationship between the dimensions can
be expressed in terms of lines of comparable power or
sophistication. Having the shape of hyperbolas asymptotic to the axes of our descriptive space, such lines
would connect systems whose intrinsic power or sophistication differs only by virtue of a different balance
between scope and depth of structure. The state of the
art in the field of text processing might then be characterized by the area under the line connecting the most
advanced existing systems.
Since genuine breakthroughs are probably not more
common in this field than in others, our analysis
supports the conclusion that run-of-the-mill system
design proposals which promise to significantly extend
our automatic text processing capabilities in both scope
and depth of structure are probably ill-conceived, or
perhaps worse. Yet proposals of this kind are not
uncommon, and a number of them attract funds from
various sources every year. I feel sure that a better
understanding of the dimensions of text processing on
the part of sponsoring agencies as well as system
designers might result in a healthier and more productive climate of research and development in this field.

Men and machines

Finally, I want to simply mention a set of techniques
which can be of inestimable value in breaking through
the state-of-the-art barrier in text processing, and to
indicate their relation to our two-dimensional descriptive space. I have in mind the set of techniques by
which effective man-machine cooperation may be
brought to bear in a particular application. It has for
some time been known that the human cognitive
apparatus possesses a number of powerful pattern-recognition capabilities which have not even been
approached by existing computing machinery. A number of projects have investigated the problems of
marrying these powers efficiently with the speed and
precision of computers to solve problems which neither
could manage alone.
In the field of textual data processing, the potential
payoff from such hybrid systems, if you will permit me
the phrase, increases greatly as we consider higher levels
along the dimension of depth of structure. We humans
take for granted in ourselves capabilities which astound
us in machinery; most 3-year-old children could easily
out-perform Winograd's robot simulator, for example.
Whereas at the lower levels of this dimension, in tasks
like sorting, counting, string replacement, and what not,
no man can begin to keep up with even simple machines.
I conclude from these elementary observations that
well-designed man-machine systems can greatly extend
the scope of systems at the higher end of the dimension
of depth of structure, or (to put it in another way) can
upgrade the structure-handling capacity of systems
having considerable scope. While the design of effective
hybrid systems for text processing involves many
considerable problems, this approach seems to offer a
means of bringing the unique power of computers to
bear on applications which now lie on the farther side of
the state-of-the-art barrier with respect to fully
automatic systems.

ACKNOWLEDGMENTS
The writing of this paper was generously supported by
the Center for Computer-Based Behavioral Studies at
the University of California at Los Angeles, and was
encouraged by the Center's Director, Gerald Shure. I am
indebted to Martin Kay and Ronald Kaplan of the
RAND Corporation and to J. L. Kuhns of Operating
Systems, Inc. for illuminating discussion of the subject
matter. The errors are all my own.

REFERENCES

1 Text editor and corrector reference manual (TECO)

Interactive Sciences Corporation Braintree Mass 1969
2 N H NIE D H BENT C H HULL
SPSS: Statistical package for the social sciences
McGraw-Hill New York 1970
3 W J DIXON editor
BMD: Biomedical computer programs
University of California Publications in Automatic
Computation Number 2 University of California Press
Los Angeles 1967
4 D G BOBROW D P MURPHY W TEITELMAN
The BBN-LISP system
Bolt Beranek & Newman BBN Report 1677 Cambridge
Massachusetts April 1968
5 PDP-10 BASIC conversational language manual
Digital Equipment Corporation DEC-10-KJZE-D Maynard
Massachusetts 1971
6 PDP-10 algebraic interpretive dialogue conversational language
manual
Digital Equipment Corporation DEC-10-AJCO-D Maynard
Massachusetts 1970. The AID language in this reference is
an adaptation of the RAND Corporation's JOSS language.
7 QED reference manual
Com-Share Reference 9004-4 Ann Arbor Michigan 1967
8 WYLBUR reference manual
Stanford Computation Center Stanford University Stanford
California revised 3rd edition 1970
9 W D ELLIOT W A POTAS A VAN DAM
Computer assisted tracing of text evolution
Proceedings of 1971 Fall Joint Computer Conference Vol 37
10 D C ENGLEBART W K ENGLISH
A research center for augmenting human intellect
Proceedings of 1968 Fall Joint Computer Conference Vol 33
11 O HOLSTI R A BRODY R C NORTH
Theory and measurement of interstate behavior: A research
application of automated content analysis
Stanford University May 1964
12 P L WHITE
KWIC/360
IBM Program Number 360D-06.7.(014/022) IBM
Corporation St Ann's House Parsonage Green Wilmslow
Cheshire England United Kingdom
13 M KAY G R MARTINS
The MIND system: The morphological-analysis program
The RAND Corporation RM-6265/2-PR April 1970
14 P J STONE D C DUNPHY M S SMITH
D M OGILVIE et al
The general inquirer: A computer approach to content analysis
MIT Press Cambridge 1966
15 E F KELLY
A dictionary-based approach to lexical disambiguation
Unpublished doctoral dissertation Department of Social
Sciences Harvard University 1970
16 P STONE M SMITH D DUNPHY E KELLY
K CHANG T SPEER
Improved quality of content analysis categories: Computerized
disambiguation rules for high frequency English words
In G Gerbner 0 Holsti K Krippendorf W Paisley P Stone
The Analysis of Communication Content: Developments in
Scientific Theories and Computer Techniques Wiley New
York 1969
17 W A WOODS R M KAPLAN
The lunar sciences natural language information system
Bolt Beranek and Newman Inc Report No 2265 Cambridge
Massachusetts September 1971
18 M R QUILLIAN
Semantic memory
In M Minsky editor Semantic Information Processing
MIT Press Cambridge Massachusetts 1968

19 B RAPHAEL
SIR: A computer program for semantic information retrieval
In E A Feigenbaum and J Feldman Computers and Thought
McGraw-Hill New York 1968
20 C H KELLOGG
A natural language compiler for on-line data management
AFIPS Conference Proceedings of the 1968 Fall Joint
Computer Conference Vol 33 Part 1 Thompson Book
Company Washington DC 1968
21 S C SHAPIRO G H WOODMANSEE
A net-structure based question-answerer: Description and
examples
In Proceedings of the International Joint Conference on
Artificial Intelligence The MITRE Corporation Bedford
Massachusetts 1969
22 S C SHAPIRO
The MIND system: A data structure for semantic information
processing
The RAND Corporation R-837-PR August 1971
23 M KAY S SU
The MIND system: The structure of the semantic file
The RAND Corporation RM-6265/3-PR June 1970
24 T WINOGRAD
Procedures as a representation for data in a computer program
for understanding natural language
MIT Artificial Intelligence Laboratory MAC TR-84
Massachusetts Institute of Technology Cambridge
Massachusetts February 1971
25 R M KAPLAN
The MIND system: A grammar-rule language
The RAND Corporation RM-6265/1-PR March 1970

Social indicators based on communication content
by PHILIP J. STONE
Harvard University
Cambridge, Massachusetts

INTRODUCTION
Early mechanical translation projects served to inject
some realism about the complexity of ordinary language processing. While text processing aspirations
have become more tempered, today's technology makes
possible cost effective applications well beyond the index or concordance. This paper outlines one new challenge that is mostly within current technology and may
be a major future consumer of computer resources.
As we are all aware, industry and government cooperate in maintaining an extensive profile of our economy
and its changes. As proposed by Bauer1 and others,
there is a need for indicators regarding the social, in
addition to the economic, fabric of our lives. Several
volumes, such as a recent one edited by Campbell and
Converse,2 review indexes that can be made on the
quality of life. Kenneth Land3 has documented the
growing interest in social indicators, reflected in volumes of reports and congressional testimony, as drawing on a wide basis of support. As the conflicts of the
late 1960's and early 1970's within our society made
evident the complexity of our own heterogeneous culture, interest in social indicators increased.
Most social indicator discussions focus on statistics
similar to economic indicators. A classic case is Durkheim's4 study on the analysis of suicide rates. Another
kind of social indicator, which we consider in this paper,
is based on changes in the content of mass media and
other public distributions of information, such as
speeches, sermons, pamphlets, and textbooks. Indeed,
in the same decade as Durkheim's study, Speed5 compared New York Sunday newspapers between 1891
and 1893, showing how new publication policies (price
was reduced from three cents to two cents) were associated with increased attention to gossip and scandal at
the expense of attention to literature, religion and
politics. Since then, hundreds of such studies, called
"content analyses," have been reported.

TYPES OF COMMUNICATION CONTENT INDICATORS
Many different content indicators can be proposed:
Which sectors of society have voiced most caution
about increasing Federalism? How has the authoritarianism of church sermons changed in different religions? How oriented are the community newspapers to
the elites of the community? Such studies, however, can
be divided into two major groups.
One group of studies is concerned with comparing
the content of different channels through which different sectors of society communicate with each other.
Such studies often monitor the spread of concepts and
attitudes from one node to another in the communication net. Writers such as Deutsch6 have discussed
feedback patterns within such nets.
Another group of studies is based on the realization
that a large segment of public media represents different
sectors of society communicating with themselves.
Social scientists have repeatedly found that people
tend to be exposed just to information congruent with
their own point of view. Thus, rather than focus on the
circulation of information between sectors of society,
these studies identify different subcultures and look at
the content of messages circulated within them.
In fact, any one individual belongs to a set of subcultures. On the job, he or she may be exposed to the
views of colleagues, while off the job the exposure may
be to those with similar leisure time interests, religious
preferences, or political leanings. Given the cost effectiveness of television for reaching the mass public, the
printed media has come to be used more for directed messages to different subcultures. Thus, while there has
been the demise of general circulation magazines such
as the Saturday Evening Post and Look, the number of
magazines concerned with particular trades, hobbies,
consumer orientations and levels of literary sophistication has greatly increased.

While the printed media recognizes many different
subcultures (and one only has to watch the sprouting
of new underground newspapers or trade journals to
realize how readily a market can be identified), there
has been a more general resistance to recognizing how
many subcultures there are and how diverse their views
tend to be. Given the enormous complexity of our culture, each sector tends to recognize its own diversity,
but assumes homogeneous stereotypes for other sectors.
After repeated blunders, both the press and the public
are coming to realize that there are many different subcultures within the black community, the student community, the agricultural community, just as we all know
there are many different subcultures in the computer
community. As sociologist Karl Mannheim7 identified
some years ago, the need to monitor our culture greatly
increases with such heterogeneity.
Gradually, awareness of a need is turning into action.
Since the Behavioral and Social Science (BASS) report8
released by the National Academy of Science in 1969
gave top priority to developing social indicators, government administration has been set up to coordinate
social indicator developments and several large grants
have been issued. Within coming years, we may expect
significant sums appropriated for social indicators.
COMPARISON WITH CONTENT ANALYSIS
RESEARCH
What language processing expertise do we have today
to help produce such social indicators? The Annual Review prepared by the American Documentation Institute or reports on such large text processing systems
as Salton's "SMART"9 offer a wide variety of possibly
relevant procedures. The discussion here focuses on
techniques developed explicitly for content analysis.
Content analysis procedures map a corpus of text
into an analytic framework supplied by the investigator. It is information reducing, in contrast to an expansion procedure like a concordance, in that it discards
those aspects of the text not relevant to the analytic
framework. As a social science research technique, content analysis is used to count occurrences of specific
symbols or themes.
The main difference between content analysis as a
social science research technique and mass media social
indicators concerns sampling. A researcher samples text
relevant to hypotheses being tested. Only as much text
need be processed as necessary to substantiate or disconfirm the hypothesis. Usually, this involves thousands or tens of thousands of words of text. A social
indicators project, on the other hand, involves monitoring many different text sources over what may be
long periods of time. The total text may run into millions of words.
A hypothetical example illustrates how a social indicators project can come to be such a large size. A social
indicators project may compare a number of different
subcultural sectors in each of several different geographic locations. Within each sector, the sample should
include several media, so it does not reflect the biases
of one originator. The monitoring might cover several
decades, with a new indicator made bimonthly. Imagine
then a 4 dimensional matrix representing 14 (subcultural sectors) X 5 (geographic regions) X 4 (originating
sources) X 150 (bimonthly periods). Each cell of this
matrix might contain a sample of 15,000 words of text.
The result is a text file of over a billion characters.
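
The arithmetic behind that estimate, spelled out below (the figure of roughly six characters per word, counting the space, is an assumption not stated above):

    cells = 14 * 5 * 4 * 150     # sectors x regions x sources x bimonthly periods = 42000
    words = cells * 15000        # 15,000-word sample per cell = 630 million words
    characters = words * 6       # at ~6 characters per word: about 3.8 billion characters
    print(cells, words, characters)
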
Social science content analysis, which has been computer aided for over a decade (see for example, Stone
et al., 10 Stone et al., 11 Gerbner et al. 12), has used manual
keypunching to provide the modest volumes of machine
readable text needed. If the content analysis task were
simple (such as in our first example below), human
coders were often less expensive than the cost of getting
the text to machine readable form. Computer aided
content analysis has tended to focus on those texts, such
as anthropologists' files of folktales, where the same
material may be intensively studied to test a variety
of hypotheses.
Social indicators of public communications, on the
other hand, will require high speed optical readers capable of handling text in a wide variety of printing fonts.
Optical readers for selected fonts have been around for
some time, but readers capable of adapting to many
new fonts are just coming into existence. The large
general purpose reader developed by Information International, which incorporates a PDP-10 as part of its
hardware, represents this kind of machine. It is able to
"learn" new fonts and then offer high speed reading
from microfilm with a low error rate, even of third
generation Xerox copy.
Both social science content analysis research and social indicators allow for some noise, just as economic
indicators tolerate an error factor. Some of the noise
stems from sampling procedures. Other noise comes
from measurement procedures. The "quality control"
of social science research or monitoring procedures involves keeping the noise to a tolerable minimum. Assurances are also needed that the noise is indeed random, rather than leading to specifiable biases. With
large sampling procedures, the tolerable random
noise level can be considerably higher than that allowed in
many kinds of text processing applications. For example,
a single omission in an automated document classification scheme might cause a very important document
to go unnoticed by the users of the information retrieval system, thus causing great loss.
LEVELS OF COMPLEXITY
Both content analysis procedures for testing hypotheses and procedures for creating social indicators
come at varying levels of complexity. Some pose little
difficulty for today's text processing capabilities while
others pose major challenges. If one accepts a growing
consensus among artificial intelligence experts that a
successful language translation machine must "understand," in a significant sense of that word, the subject
matter it is translating, then the most complicated social indicator tasks begin to approach this domain of
challenge. Let us start at the simpler levels, showing
how these needs have been met in content analysis, and
work up to these more difficult challenges.
The simplest measure is to identify mentions of an
individual, place, group, or race. For example, Johnson,
Sears and McConahay13 performed a manual content
analysis of what they call "black invisibility" since the
turn of the century in the Los Angeles press. They show
that the percent of newspaper space devoted to blacks
in the major papers is much less than their percent of
the Los Angeles population warrants, and, furthermore,
the ratio has been getting worse over time. A black person can read the Los Angeles press and obtain the impression that blacks do not exist there. Thus, point out
the authors, some blacks took a "We won!" attitude
toward the large amount of destruction in the Watts
riots. Why? As reported by Martin Luther King,14 they
said "We won because we made them pay attention to
us." There is indeed a hunger to have one's existence
recognized.
"Black invisibility" can be assessed by counting the
number of references to blacks compared to whites, or
the newspaper column inches given to each. A computer
content analysis would need a dictionary of names
referring to black persons or groups. The computer
should have an ability to automatically update this dictionary as processing continued, for race may be
identified only in early newspaper stories about the person
or group. Thus, few stories about Angela Davis today
identify her race. The computer challenge,
should optical readers have had the text ready for processing, would have been minimal.
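
In outline, the counting step really is minimal; a sketch follows, in which the name dictionaries are hypothetical stubs and the hard part noted above, automatically adding newly identified names to the dictionary, is left out.

    def mention_share(articles, names_of_interest, names_baseline):
        """Fraction of tracked mentions referring to the group of interest."""
        interest = baseline = 0
        for article in articles:
            for word in article.lower().split():
                if word in names_of_interest:
                    interest += 1
                elif word in names_baseline:
                    baseline += 1
        total = interest + baseline
        return interest / total if total else 0.0
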
Johnson, Sears, and McConahay carried their research another step, classifying the stories according
to whether they dealt with anti-social activities, black
entertainers, civil rights, racial violence and several
other categories. These additional measures make
"black invisibility" more evident, for what little coverage blacks receive is often unfavorable and unrepresentative of the community. These additional topic
identifications would again hold little difficulty for a
computer analysis. It is not difficult to recognize a baseball story, a violent crime story, or a society event.
The Johnson, Sears and McConahay "black invisibility" indexes were only made on two newspapers with
very limited samples in the earlier part of the time
period studied. Their techniques, however, could be
applied to obtain "black invisibility" indexes for both
elite and tabloid press in every major metropolitan area
of the country. It is an example of how a content analysis measure can have considerable potential as a future
social indicator.
The next level of complexity is represented by what
we call thematic analysis. For example, we might be
interested in social indicators measuring attitudes
toward increasing Federalism in our society. Separate
indicators might be developed to tap occurrences of
themes such as the following:
(1) The Federal government as an appropriate recipient of complaints about ...
(2) The Federal government as initiator of programs for ...
(3) The Federal government as restricting or controlling ...
(4) The Federal government as responsible for the
well being of . . .
Such themes are measured by first identifying synonyms. Rather than refer to the "Federal government,"
the text may refer to a particular agency. The verb
section may have alternative forms of expression. Finally, separate counts might be kept for each theme
relevant to different target clusters such as agriculture,
industry, minority groups, consumer goods, etc.
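
One way to represent such a thematic template, sketched very roughly below, is as a set of synonym classes that must co-occur; every word list in the sketch is invented for the illustration.

    # One template per theme: an agent class, a verb class, and target clusters.
    THEME_RESTRICTING = {
        "agent":  {"federal government", "washington", "the agency"},
        "verb":   {"restrict", "control", "regulate", "limit"},
        "target": {"agriculture": {"farm", "crop"},
                   "industry":    {"factory", "manufacturer"}},
    }

    def score_theme(sentence, theme):
        """Return the target clusters for which all components of the theme occur."""
        text = sentence.lower()
        if not any(a in text for a in theme["agent"]):
            return []
        if not any(v in text for v in theme["verb"]):
            return []
        return [cluster for cluster, cues in theme["target"].items()
                if any(c in text for c in cues)]

    print(score_theme("Washington moved to regulate crop prices", THEME_RESTRICTING))
    # ['agriculture']
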
Past work in content analysis has offered considerable
success in studies on a thematic level. Thus Ogilvie
(1966) found considerable accuracy in computer scoring of "need achievement" themes in stories made up
by subjects. The scoring involved thematic identifications similar in complexity to the Federalism measures
cited above. Ogilvie found that the correlation between
the computer and a human coder was about .85, or as
high as the correlation between two human coders.
A still higher complexity is represented by the packaging of thematic statements into an argument, plot,
or rationale. This has recently become prominent in
psychology, with Abelson16 writing about "opinion
molecules" while Kelly17 writes of "causal schemata."
The concern is with, if I may substitute computer terminology, various "subroutines" that we draw on to explain the world about us. Many such subroutines are
shared by the community at large, so that a passing
reference to any part of the subroutine can be expected
to cause the listener to invoke the whole subroutine.
To take a very simple example, such phrases as a "Communist inspired plot," "subversive action," and "Marxist goals" can all be taken as invoking a highly shared
molecule including something as follows:
Communists create (inspire) plots involving subversive actions against established ways in order
to force changes in society toward their Marxist
goals.
Matters are rather simple when dealing with such
weatherbeaten old molecules, but take the end run kinds
of debates between politicians about school bussing to
try and identify the variety of molecules surrounding
that topic. The inference of underlying molecules involves theoretical issues that can go well beyond text
processing problems.
Again, there is a relevant history in content analysis,
although computer aided procedures have only recently
had any successes. The classic manual study was by
Propp18 who showed that Russian folktales fell into
variants of a basic plot. Recently, anthropologists such
as Colby19 and Miranda20 have pushed further the use
of the computer to study folktale plots. Investigators
such as Shneidman21 have worked on detailed manual
techniques to identify the forms of "psycho-logic" we
use in everyday explanations. Social indicators at this
level should pose considerable difficulty for some time
to come.
NEW TEXT PROCESSING RESOURCES
Content analysis research may share with social indicators projects the priorities for new text processing
resources. These priorities may be quite different from
those in information retrieval or other aspects of text
processing. We here review these priorities as we see
them.
While automated linguistic analysis has been preoccupied with questions of syntactic analysis, content
analysis work has given priority to semantic accuracy.
Semantic errors affect even the simplest levels of measurement and were known to cause considerable noise
in many measurement procedures. Even the "black invisibility" study, for example, is going to have to be
able to distinguish between "black" the color and
"black" the race, as well as other usages of "black." A
Federalism study may expect verbs like "restrict" and
"control," but in fact the text may use such frightfully
ambiguous words as "run," "handle" or "order." A
first order of business has been to reduce such noise to
more manageable levels.
One might argue that procedures for such semantic
identifications should come after the text has received
a syntactic analysis. Certainly this would simplify the
task and increase accuracy. However, many simpler
social indicators and content analysis tasks do not
otherwise need syntactic analyses. For social indicator
projects, the large volumes of text discourage invoking
syntactic routines unless they are really needed. In
content analysis research, text is often transcripts of
conversational material having a highly degenerate
syntactical form. For these applications, a syntactically
dependent analysis of word senses might be less than
satisfactory. Thus, for both social indicators and content analysis research, it makes sense to attempt identification of word senses apart from syntactic analysis.
A project was undertaken some five years ago to develop computer routines that would be able to identify
major word senses for all words having a frequency of
over 40 per million. This criterion resulted in a list of
1815 entries covering about 90 percent of running text.
Identification of real, separate word senses is a thorny
problem we have discussed elsewhere; let it simply be
pointed out here that the number of word senses in a
dictionary tends to be directly proportional to the size
of the dictionary. Our goal was to cover the basic distinctions (such as "black" the race vs "black" the color)
rather than many fine-graded distinctions (such as
those of a word like "fine").
Of the 1815 candidates, some 1200 were identified as
having multiple meanings. Two thirds of these, or about
800 words, offered considerable challenge for word sense
identifications. Rules for identifying word senses were
developed for each of these multiple meaning words.
Each rule could test the word environment (specifying
its own boundary parameters) for the presence or absence of particular words or any of sixty different markers. Each rule, written in a form suitable for compilation by a weak precedence grammar, could either assign senses or transfer to other rules, depending on
the outcome. The series of rules used for testing any
one word thus formed a routine.
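
The flavour of such a routine can be suggested with a single invented rule set for "black"; the real rules consult some sixty markers and may transfer control among themselves, all of which is omitted in this sketch.

    # One hypothetical sense routine: inspect a window of the word's
    # environment and either assign a sense or fall through.
    RACE_CUES  = {"community", "voters", "leaders", "persons", "invisibility"}
    COLOR_CUES = {"ink", "paint", "dress", "hair", "dark"}

    def sense_of_black(words, position, window=4):
        lo, hi = max(0, position - window), position + window + 1
        environment = set(words[lo:position] + words[position + 1:hi])
        if environment & RACE_CUES:
            return "black/race"
        if environment & COLOR_CUES:
            return "black/color"
        return "black/unresolved"    # a later pass, or a default sense, decides

    words = "coverage of black community leaders remained sparse".split()
    print(sense_of_black(words, words.index("black")))   # black/race
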
The implementation of these rules on a computer emphasized efficiency. Since marker assignments often depended on word senses being identified, deadlocks could
occur with some rules testing for markers on neighboring words which could not yet be assigned until the
word in question was resolved. Strategies include the
computer looking into dictionary entries to see if the
marker category is among the possible outcomes. Despite such complicated options, occasionally resulting
in multiple passes, the word sense procedures are remarkably fast, to the point of being feasible for social
indicators work.
The accuracy of the word sense identification procedures was tested on a 185,000 word sample drawn
both from Kucera and Francis22 and our own text files.
A variety of tests were performed. For example, for a
sample of 671 particularly difficult homographs, covering 64,253 tokens in the text, correct assignment was
made 59,716 times or slightly over 92 percent of the
time. The procedures thus greatly reduce the noise in
word sense assignments.
The second priority for even some of the simplest social indicator projects should be pronoun identification. The importance of the problem depends on the
kinds of tabulations that are to be made. If the question is whether any mention is made in the article, then
pronouns are not such a crucial issue. If the question
involves counting how many references were made,
then references should be identified in both noun and
pronoun form.
We believe that more work should be encouraged on
pronoun identification so as to be better prepared for
future social indicators research. Because many pronouns involve references outside the sentence, the problem is beyond most current syntax studies. Winograd23
provides a heartening example of how well pronoun
identification can be made for local discourse on a specific topic area.
The third priority is syntactic analysis for thematic
identification purposes. This is not just a general syntactic analysis, but an analysis to determine if the text
matches one of the thematic templates relative to a social indicator. Large amounts of computer time can be
saved by only calling on the syntactic routine after it is
established that all the semantic components relevant
to the theme are indeed present. Syntactic analysis can
stop as soon as it is established that the particular word
order cannot be an example of that theme. In general,
we find that a case grammar is most akin to thematic
analysis needs.
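
The economy described above amounts to placing a cheap semantic filter in front of the expensive syntactic routine; a schematic sketch, with the case-grammar parser reduced to a stub supplied by the caller:

    def theme_components_present(sentence_words, components):
        """Cheap test: every semantic component of the theme occurs somewhere."""
        words = set(sentence_words)
        return all(words & synonym_set for synonym_set in components)

    def matches_theme(sentence_words, components, parse):
        # Invoke the costly syntactic analysis only after the cheap test passes;
        # 'parse' stands in for a full case-grammar routine and may stop early.
        if not theme_components_present(sentence_words, components):
            return False
        return parse(sentence_words) is not None

    components = [{"federal", "washington"}, {"restrict", "control", "regulate"}]
    print(theme_components_present("the federal rules restrict imports".split(), components))  # True
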
The transition network approach of Woods24 holds
considerable promise for such syntactic capabilities.
Gary Martins, who is with us on the panel, is already
exploring the application of such transition networks to
content analysis problems. This work should be of considerable utility in the development of social indicators
based on themes.
Finally, we come to the need for inference systems to
handle opinion molecules and the like. Such devices as
Hewitt's PLANNER25 may have considerable utility
for such social indicator projects. A PLANNER operation includes a data base and a set of theorems. Given
a text statement, PLANNER can attempt to tie it back
through the theorems until a match is made in the data
base. For any editorial, for example, the successful
theorem paths engendered could identify which molecules were being invoked and their domain of application. At present, this is but conjecture; much work
needs to be done.
These priorities, then, are explicitly guided by what
Weizenbaum26 calls a "performance mode," in this case
toward creating useful social indicators. They may
well conflict with text processing priorities in computational linguistics or artificial intelligence. Some social
indicators may only be produced in the distant future,
but meanwhile important results can be accomplished
using optical readers and current text processing sophistication.
DEVELOPING SOCIAL INDICATOR TEXT
ARCHIVES
Having considered text processing research priorities, let us examine what concurrent steps are needed
if relevant text files are to be created and put to effective use.
At present, our archiving of text material is haphazard and, for social indicator purposes, subject to major
omissions. The American public (as publics in most
advanced societies) spends more than four times as
many hours watching television compared to all forms
of reading put together (Szalai, et al.27). Yet, even with
the incredible salience of network evening newscasts or
documentary specials, the networks are not required to
place television scripts in a public archive. A content
analysis like Efron's The News Twisters28 had to be made
from homemade tape recordings of news broadcasts.
Similarly, if one is to study the content of communication channels between sectors of society, one needs
both original and intermediate sources such as press
releases and wire service transmissions. Past critics of
our news media such as Cirino29 have had to make extensive efforts to obtain the necessary primary information. Better central archiving is very much needed.
As discussed by Firestone,30 considerable attention
will have to be given to coordinating the sampling of
text with sampling used for other social indicators. For
example, it makes obvious sense to target the sampling
of union newsletters to correspond to union memberships selected for repeated survey interviews. In one of
our own studies, Stone and Brody31 compared a content
analysis of news stories on the Vietnam war with the
results of Gallup survey questions about the effectiveness of the president. This study would have been
greatly aided by (a) better text files of representative
news stories from across the nation and (b) survey information as to media exposure. With less adequate data,
the quality of the analysis suffers.
SAFEGUARDING THE PUBLIC
On the one hand, since the files are based on public
communications, investigators outside the government
should have access to the archives for testing different
models. In this sense, such files would be similar to the
computer economic data bases for testing econometric
models now made available by commercial organizations.
On the other hand, the same technology used to produce social indicators based on content can be used to
invade the content of private communications. This
author, for one, is worried about current military sponsored research that aims to make possible a computer
monitoring of voice grade telephone communication.
After all that has been written about privacy, a much
closer safeguard is needed. Further work on content
analysis techniques must be coordinated with such
safeguards.
SUMMARY
This paper has outlined how computer text processing
resources may be used to produce social indicators of
communication content. A new challenge of considerable scale is forecast. The relations of such indicators to
existing content analysis research techniques are identified. Priorities based on social indicator requirements
are offered for future text processing research. Because
of the scale of the operation and its distinct requirements, we suggest that social indicators based on communication content be considered separate from other
computer text processing applications. Immediate attention is needed for text archiving and safeguarding
the privacy of communications.
REFERENCES
1 R BAUER
Social indicators
MIT Press 1966
2 A CAMPBELL P CONVERSE (ed)
The human meaning of social change
Russell Sage Foundation 1972

3 K LAND
Social indicators
In R Smith Social Science Methods-A New Introduction
Vol 2 In Press
4 E DURKHEIM
Suicide-A study in sociology
1897 (Trans from the French 1951 Free Press)
5 J G SPEED
Do newspapers now give the news?
The Forum 1893 Vol 15 pp 705-711
6 K DEUTSCH
Nerves of government
Free Press 1963
7 K MANNHEIM
Ideology and utopia-An introduction to the sociology of
knowledge
1931 (English translation: Harcourt)
8 NATIONAL ACADEMY OF SCIENCE
Behavioral and social sciences-Outlook and needs
Prentice Hall 1969
9 G SALTON
The SMART retrieval system
Prentice Hall 1971
10 P STONE R BALES J Z NAMENWIRTH
D OGILVIE
The general inquirer
Behavioral Science 1962 Vol 7 pp 484-498
11 P J STONE D C DUNPHY M S SMITH
D M OGILVIE
The general inquirer-A computer approach to content analysis
MIT Press 1966
12 G GERBNER O HOLSTI K KRIPPENDORFF
W PAISLEY P J STONE
The analysis of communications content
Wiley Press 1969
13 P B JOHNSON D SEARS J McCONAHAY
Black invisibility, the press and the Los Angeles riot
Amer J Sociology 1971 Vol 76 pp 698-721
14 M L KING
Where do we go from here? Chaos or community?
Beacon Press 1967
15 D M OGILVIE
In P Stone et al op cit pp 191-206
16 R P ABELSON
Psychological implication
In R P Abelson E Aronson W McGuire T Newcomb
M Rosenberg and P Tannenbaum Theories of Cognitive
Consistency Rand McNally 1968
17 H KELLY
Causal schemata and the attribution process
General Learning Press 1972
18 V PROPP
Morphology of the folktale
1927 American Folklore Society (Trans 1958)
19 B N COLBY
Folk narrative
Current Trends in Linguistics Vol 12 1972
20 P MIRANDA
Structural strength, semantic depth, and validation procedures
in the analysis of myth
Proceedings Quatrieme Symposium sur les Structures
Narratives Konstanz Germany 1971 In Press
21 E S SHNEIDMAN
Logical content analysis: An explication of styles of "concludifying"
In Gerbner et al op cit
22 H KUCERA W FRANCES
Computational analysis of present-day American English
Brown University Press 1967
23 T WINOGRAD
Procedures as a representation for data in a computer program
for understanding natural language
Report MAC TR-84 MIT February 1971 (Selections
reprinted in Cognitive Psychology 1972 #1)
24 W WOODS
Transitional network grammars for natural language analysis
Comm ACM 1970 Vol 13 pp 591-602
25 C HEWITT
PLANNER-A language for proving theorems in robots
Proc of IJCAI 1969 pp 295-301
26 J WEIZENBAUM
On the impact of the computer on society
Science Vol 176 pp 609-614 1972
27 A SZALAI E SCHEUCH P CONVERSE
P STONE
The use of time
Mouton 1972
28 E EFRON
The news twisters
Nash 1971
29 R CIRINO
Don't blame the people
Diversity Press 1971
30 J M FIRESTONE
The development of social indicators from content analysis
of social documents
Policy Sciences In Press
31 P STONE R BRODY
Modeling opinion responsiveness to daily news-The public
and Lyndon Johnson 1965-1968
Social Science Information Vol 9 #1 pp 95-122

The DOD COBOL compiler validation
system
by GEORGE N. BAIRD
Department of the Navy
Washington, D. C.

INTRODUCTION

The ability to benchmark or validate software to ensure
that design specifications are satisfied is an extremely
difficult task. Test data, generally designed by the
creators of said software, is generally biased toward a
specific goal and tends not to cover many of the possibilities of combinations and interactions. The philosophy of suggesting that "a programmer will never
do . . ." or "this particular situation will never happen"
is altogether absurd. First, "never" is an extremely
long time and secondly, the Hagel theorem of programming states that "if it can be done, whether absurd
or not, one or more programmers will more than likely
try it."
Therefore, if a particular piece of software has been
thoroughly checked against all known extremes and a
majority of all syntactical forms, then the Hagel
theorem of programming will not affect the software
in question. The DOD CCVS attempts to do just that
by checking for the fringes of the specifications of
X3.23-19681 and known limits. It is assumed that if a
COBOL compiler performs satisfactorily for the audit
routines, then it is likely that the compiler supports the
entire language. However, if the compiler has trouble
handling the routines in the CCVS, it can be assumed
that there will indeed be other errors of a more serious
nature.
The following is a brief account of the history of the
DOD CCVS, the automation of the system and the
adaptability of the system to given compilers.

BACKGROUND

The first revision to the initial specification for
COBOL (designated as COBOL-19612) was approved
by the Executive Committee of the Conference on
Data Systems Languages* and published in May of
1961. Recognizing that the language would be subject
to additional development and change, an attempt
was made to create uniformity and predictability in
the various implementations of COBOL compilers.
The language elements were placed in one of two
categories: required and elective.

* The Conference on Data Systems Languages (CODASYL) is an
informal and voluntary organization of interested individuals
supported by their institutions who contribute their efforts and
expenses toward the ends of designing and developing techniques
and languages to assist in data systems analysis, design, and
implementation. CODASYL is responsible for the development
and maintenance of COBOL.

Required COBOL-1961 consisted of language elements (features and options) which must be implemented by any implementor claiming a COBOL-1961
compiler. This established a common minimum subset
of language elements for COBOL compilers and hopefully a high degree of transferability of source programs
between compilers if this subset was adhered to.
Elective COBOL-1961 consisted of language elements
whose implementation had been designated as optional. It was suggested that if an implementor chose
to include any of these features (either totally or
partially) he would be expected to implement these
in accordance with the specifications available in
COBOL-1961. This was to provide a logical growth
for the language and attempt to prevent a language
element from having contradictory meaning between
the language development specifications and implementor's definition.
As implementors began providing COBOL compilers
based on the 1961 specifications, unexpected problems
became somewhat obvious. The first problem was that
the specifications themselves suggested mandatory as
well as optional language elements for implementing
COBOL compilers. In addition the development docu-
ment produced by CODASYL was likely to change
periodically thus, providing multiple specifications to
implement from. Compilers could consist of what the
implementor chose to implement which would severely
handicap any chance of transferability of programs
among the different compilers, particularly since no two
implementors necessarily think alike. Philosophies vary
both in the selection of elements for a COBOL compiler
and in the techniques of implementing the compiler
itself. (As ridiculous as it may sound, some compilers
actually scan, syntax check and issue diagnostics for
COBOL words that might appear in comments both
in the REMARKS paragraph of the Identification
Division and in NOTE sentences in the Procedure
Division.) The need for a common base from which to
implement became obvious. If the language was to
provide a high degree of compatibility, then all implementations had to be based on the same specification.
The second problem was the reliability of the compiler itself. If the manual for the compiler indicated
that it supported the DIVIDE statement, the user
assumed this was true. If the compiler then accepted
the syntax of the DIVIDE statement, the user assumed that the object code necessary to perform the
operation was generated. When the program executed,
he expected the results to reflect the action represented
in his source code. It appears that in some cases perhaps
no code was generated for the DIVIDE statement
and the object program executed perfectly except for
the fact that no division took place. In another case,
when the object program encountered the DIVIDE
operation, it simply went into a loop or aborted. At
this point, the programmer could become decidedly
frustrated. The source code in his program indicated
that: (1) he requested that a divide take place, (2) there
was no error loop in his program, (3) the program
should not abort. This is the problem we are addressing: A programmer should concern himself with
producing a source program that is correct logically
and the necessary operating system control statements
to invoke the COBOL compiler. In doing so, he should
be able to depend on the compiler being capable of
contributing its talent in producing a correct object
program.
If the user was assured that either: (1) each instruction in the COBOL language had been implemented
correctly, or, (2) that each statement which was implemented did not give extraneous results, then the
above situation could not exist.
Thus, the need for a validation tool becomes apparent. Although all vendors exercise some form of
quality control on their software before it is released,

it is clear that some problems may not be detected.
(The initial release of the Navy COBOL audit routines
revealed over 50 bugs in one particular compiler which
had been released five years earlier.)
By providing the common base from which to implement and a mechanism for determining the accuracy
and correctness of a compiler relative to the specification, the problem of smorgasbord compilers (that may
or may not produce expected results) should become
extinct.
The standardization of COBOL began on 15 January
1963. This was the first meeting of the American Standards Association Committee, X3.4.4, * the Task Group
for Processor Documentation and COBOL. The program of work for X3.4.4 included ... "Write test
problems to test specific features and combinations of
features of COBOL. Checkout and run the test problems
on various COBOL compilers." A working group
(X3.4.4.2) was established for creating the "test
problems" to be used for determining feature availability.
The concept of a mechanism for measuring the
compliance of a COBOL compiler to the proposed
standard seemed reasonable in view of the fact that
other national standards did indeed lend themselves
to some form of verifications, i.e., 2X4's, typewriter
keyboards, screw threads.
IMPLEMENTING A VALIDATION SYSTEM
FOR COBOL
In order to implement a COBOL program on a given
system, regardless of whether the program is a validation routine or an application program, the following
must be accomplished:
1. The special characters used in COBOL (i.e.,
'(', ')', '*', '+', '<', etc.) must be converted for the
system being utilized.†
2. All references to implementor-names within each
of the source programs must be resolved.
3. Operating System Control Cards must be produced which will cause each of the source programs to
be compiled and executed. Additionally, the user must
have the ability to make changes to the source programs, i.e., delete statements, replace statements, and
add statements.
4. As the programs are compiled, any statements
that are not syntactically acceptable to the compiler
must be modified or "deleted" so that a clean compilation takes place and an executable object program is
produced.
5. The programs are then executed. All execution
time aborts must be resolved by determining what
caused the abort and after deleting or modifying that
particular test or COBOL element, repeating steps 3
and 4 until a normal end of job situation exists.

* The American Standards Association (ASA), a voluntary
national standards body, evolved to the United States of America
Standards Institute (USASI) and finally the American National
Standards Institute (ANSI). The committee X3.4.4 eventually
became X3J4 under a reorganization of the X3 structure. X3J4 is
currently in the process of producing a revision to X3.23-1968.
† For most computers the representations for the characters
A-Z, 0-9, and the space (blank character) are the same. However,
there is sometimes a difference in representation of the other
characters and therefore conversion of these characters from one
computer to another may be necessary.
Development of audit routines

In March 1963, X3.4.4.2 (the Compiler Feature Availability Working Group) began its effort to create the
COBOL programs which would be used to determine
the degree of conformance of a compiler to the proposed
standard. The intent of the committee was not to furnish a means for debugging compilers, but rather to
determine "feature availability." Feature availability
was understood to mean that the compiler accepted the
syntax and produced object code to produce the desired result. All combinations of features were not to
be tested; only a carefully selected sample of features
(singly and in combination) were to be tested to insure
that they were operational. The test programs themselves were to produce a printed report that would
reflect the test number and when possible whether the
test "Passed" or "Failed." See Figure 1.
When a failure was detected on the report, the user
could trace the failure to the source code and attempt
Source Statements

TEST-0001.
    MOVE 001 TO TEST-NO.
    MOVE ZERO TO ALPHA.
    ADD 1 TO ALPHA.
    IF ALPHA = 1 PERFORM PASS ELSE PERFORM FAIL.
TEST-0002.

Results

          TEST NO    P - F
ADD           1        P
ADD          21        F

Figure 1-Example of X3.4.4.2 test and printed results


to identify the problem. The supporting code (printing
routine, pass routine, fail routine, etc.) was to be written
using the most elementary statements in the low-level
of COBOL. The reason for this was twofold:
1. The programs would be able to perform on a
minimum COBOL compiler (Nucleus level 1,
Table Handling level 1, and Sequential Access
level 1).
2. The chances of the supporting code not being
acceptable to the compiler being tested were
lessened.
The programs, when ready, would be provided in
card deck form along with the necessary documentation for running them. (The basic philosophies of
design set forth by X3.4.4.2 were carried through all
subsequent attempts to create compiler validation
systems for COBOL.)
Assignments were made to the members of the committee and the work began. This type of effort at the
committee level, however, was not as productive as
the work of standardizing the language itself.
In April 1967, the Air Force issued a contract for a
system to be designed and implemented which could
be used in measuring a compiler against the standard.
The Air Force COBOL Compiler Validation System
was to create test programs and adapt them to a given
system automatically by means of fifty-two parameter
cards.
The Navy COBOL audit routines

In August of 1967, The Special Assistant to the
Secretary of the Navy created a task group to influence
the use of COBOL throughout the Navy. Being aware
of both the X3.4.4.2 and Air Force efforts, (as well
as the time involved for completion), a short term
project was established to determine the feasibility
of validating COBOL compilers. After examining the
information and test programs available at that time,
the first set of routines was produced. In addition to the
original X3.4.4.2 philosophy, the Navy added the
capability of providing the result created by the computer as well as the expected result when a test failed.
Also, instead of a test number, the actual procedure
name in the source program was reflected in the output.
See Figure 2.
The preliminary version of the Navy COBOL audit
routines was made up of 12 programs consisting of
about 5000 lines of source code. The tailoring of the
programs to a particular compiler was done by hand
(by physically changing cards in the deck or by using
the vendor's software for updating COBOL programs).
As tests were deleted or modified, it was difficult to
bring the programs back to their virgin state for subsequent runs against different compilers or for determining what changes had to be made in order that
the programs would execute.
This was a crude effort, but it established the necessary evidence that the project was feasible to continue
and defined techniques for developing auditing systems.
Because of the favorable comments received on this
initial work done by the Navy, it appeared in the best
interest of all to continue the effort.
After steady development and testing for a year,
Version 4 of the Navy COBOL Audit Routines was
released in December 1969. The routines consisted of
55 programs, comprising 18,000 card images, capable
of testing the full standard. The routines had also become one of the benchmarks for all systems procured
by the Department of the Navy in order to ensure that
the compiler delivered with the system supported the
required level of American National Standard COBOL. *
Also, Version 4 introduced the VP-Routine, a program that automated the audit routines. Based on
fifty parameter cards, all implementor-names could
be resolved and the test programs generated in a one-pass operation. See Figure 3.
In addition, by coding specific control cards in the
Working-Storage Section of the VP-Routine as constants, the output of the VP-Routine became a file
that very much resembled the input from a card reader,
i.e., control cards, programs, etc.
By specifying the required Department of Defense
COBOL subset of the audit routines to be used in a
validation, only the programs necessary for validating
Source Statements

ADD-TEST-1.
    MOVE 1 TO ALPHA.
    ADD 1 TO ALPHA.
    IF ALPHA = 2 PERFORM PASS ELSE PERFORM FAIL.

Results

FEATURE   PARAGRAPH    P/F    COMPUTED   EXPECTED
ADD       ADD-TEST-1   FAIL       1          2
ADD       ADD-TEST-2   PASS

Figure 2-Example of Navy test and printed results

* In 1968, the Department of Defense, realizing that several
thousand combinations of modules/levels were possible, established four subsets of American National Standard COBOL for
procurement purposes.

V-P Routine Input:

X-0   SOURCE-COMPUTER-NAME
X-1   OBJECT-COMPUTER-NAME
X-3
X-8   PRINTER
X-9   CARD-READER
X-10
X-50

Audit Routine File:

SOURCE-COMPUTER.
XXXXX0

SELECT PRINT-FILE ASSIGN TO
XXXXX8

The audit routine after processing would be:

SOURCE-COMPUTER.
SOURCE-COMPUTER-NAME.

SELECT PRINT-FILE ASSIGN TO
PRINTER.

Figure 3-Example of input to the support routine, Population
file where audit routines are stored and resolved audit routine
after processing

that subset of elements or modules would be selected,
i.e., SUBSET-A, B, C, or D. The capability also existed
to update the programs as the "card reader" file was
being created. The use of the VP-Routine was not
mandatory at this time, but merely to assist the person
validating the compiler in setting up the programs for
compilation. Once the VP-Routine was set up for a
given system, there was little trouble running the audit
routines. The user then had only to concern himself
with the validation itself and with achieving successful
results from execution of the audit routines. When an
updated set of routines was distributed, there was no
effort involved in replacing the old input tape to the
VP-Routine with the new tape.
The Air Force COBOL audit routines

The Air Force COBOL Compiler Validation System
(AFCCVS) was not a series of COBOL programs but
rather a test program generator. The user could select


Source statement in test library

4U     T 1N078A101NUC, 2NUC
400151 77 WRK-DS-18V00         PICTURE S9(18).
400461 77 A18ONES-DS-18V00     PICTURE S9(18)
400471                         VALUE 111111111111111111.
400881 77 A18ONES-CS-18V00     PICTURE S9(18) COMPUTATIONAL
400891                         VALUE 111111111111111111.
802925 TEST-1NUC-078.
802930     MOVE A18ONES-DS-18V00      TO WRK-DS-18V00.
802935     ADD A18ONES-CS-18V00       TO WRK-DS-18V00
802940     MOVE WRK-DS-18V00          TO SUP-WK-A.
802945     MOVE '222222222222222222'  TO SUP-WK-C.
802950     MOVE '1N078'               TO SUP-ID-WK-A
802955     PERFORM SUPPORT-RTN THRU SUP-TRN-C.

Test results
.1N078
.1N079.
.222222222222222222.09900.

Figure 4-Example of Air Force test and printed results

the specific tests or modules he was interested in and
the AFCCVS would create one or more programs from
a file of specific tests which were then compiled as audit
routines. Implementor-names were resolved as the
programs were generated based on parameter cards
stored on the test file or provided by the user.
The process required several passes, including the
sorting of all of the selected tests to force the Data
Division entries into the Data Division and place
the tests themselves in the Procedure Division where
they logically belonged. An additional pass was required to eliminate duplicate Data Division entries
(more than one test might use the same data-item and
therefore there would be more than one copy in the
Data Division). See Figure 4.
Still another program was used to make changes to
the source programs as the compiler was validated.
As in the Navy system, certain elements had to be
eliminated because: (1) they were not syntactically
acceptable to the compiler or, (2) they caused run time
aborts.
Department of Defense COBOL validation system

In December 1970, The Deputy Comptroller of ADP
in the Office of the Secretary of Defense asked the
Navy to create what is now the DOD Compiler Validation System for COBOL taking advantage of: (1) the
better features of both the Navy COBOL Audit Routines (Version 4) and the Air Force CCVS and (2) the
four years of in-house experience in designing and implementing audit routines on various systems as well as
the actual validation of compilers for procurement
purposes.

The Compiler Validation System (of which the support program was written in COBOL) had to be readily
adaptable to any computer system which supported a
COBOL compiler and which was likely to be bid on any
RFP issued by the Department of Defense or any of
its agencies. It also had to be able to communicate with
the operating system of the computer in order to provide an automated approach to validating the COBOL
compiler. The problem of interfacing with an operating
system may or may not be readily apparent depending
on whether an individual is more familiar with IBM's
Full Operating System (OS), which is probably the
most complex operating system insofar as establishing
communication between itself and the user is concerned, or with the Burroughs Master Control Program
(MCP), where the control language can be learned in a
fifteen or twenty minute discussion.
Since validating a compiler may not be necessary
very often, the amount of expertise necessary for communicating with the CVS should be kept to a minimum.
The output of the routines should be as clear as possible
in order not to confuse the reviewer of the results or to
suggest ambiguities.
The decision was made to adopt the Navy support
system and presentation format for several reasons.
(1) It would be easier to introduce the Air Force tests
into the Navy routines as additional tests because the
Navy routines were already in COBOL program format.
It would have been difficult to recode each of the Navy
tests into the format of specific tests on the Air Force
Population File because of the greater volume of tests.
(2) The Navy support program had become rather
versatile in handling control cards, even for IBM's
OS, whereas the Air Force system had only limited
control card generation capability.


The merging of the Air Force and Navy routines

The actual merging of the routines started in
February 1971 and continued until September 1971.
During the merging operation, it was noted that there
was very little overlap or redundancy in the functions
tested by the Air Force and Navy systems. In actuality,
the two sets of tests complemented each other. This
could only be attributed to the different philosophies
of the two organizations which originally created the
routines. For example in the tests for the ADD statement:
Air Force                          Navy
signed fields                      unsigned fields
most fields 18 digits long         most fields 1-10 digits long
more computational items           more display items

After examining the Add tests for the combined DOD
routines, it was noticed that a few areas had been
totally overlooked.
1. An ADD statement that forced the "temp"
used by the compiler to hold a number greater
than 18 digits in length:

i.e., ADD  +999999999999999999
           +999999999999999999
           +999999999999999999
           -999999999999999999
           -999999999999999999
           -99 TO ALPHA

. . . where the intermediate result would be
greater than 18 digits, but the final result would
be able to fit in the receiving field (see the short
arithmetic check following this list).
2. There were not more than eight operands in
any one ADD test.
3. A size error test using a COMPUTATIONAL
field when the actual value could be greater
than the described size of the field, i.e., ALPHA
PICTURE 9(4) COMP . . . specifies a data item
that could contain a maximum value of 9999
without an overflow condition; however, because
the field may be set up internally in binary, the
decimal value may be less than the maximum
binary value it could hold:
Maximum COBOL value = 9999
Maximum hardware value = 16383
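The arithmetic behind item 1 can be checked quickly (an illustrative Python sketch, not part of the CCVS itself):

operands = [+999999999999999999, +999999999999999999, +999999999999999999,
            -999999999999999999, -999999999999999999, -99]

total, widest = 0, 0
for x in operands:
    total += x
    widest = max(widest, len(str(abs(total))))

print(total)    # 999999999999999900 -- 18 digits, fits the receiving field
print(widest)   # 19 -- the intermediate "temp" must hold a 19-digit value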
Source statements

ADD-TEST-1.                                  Initialization if
    MOVE 1 TO ALPHA.                         necessary.
    ADD 1 TO ALPHA.                          The Test.
    IF ALPHA = 2                             Check the results of the
        PERFORM PASS                         test and handle the
    ELSE                                     accounting of that
        GO TO ADD-FAIL-1.                    test.
    GO TO ADD-WRITE-1.                       Normal exit path to the
                                             write paragraph.
ADD-DELETE-1.                                Abnormal path to the
    PERFORM DELETE.                          write statement if the
    GO TO ADD-WRITE-1.                       test is deleted via the
                                             NOTE statement.
ADD-FAIL-1.                                  Correct and computed
    MOVE ALPHA TO COMPUTED.                  results are formatted
    MOVE '2' TO CORRECT.                     for printing.
    PERFORM FAIL.
ADD-WRITE-1.                                 Results are printed.
    MOVE 'ADD-TEST-1' TO PARAGRAPH-NAME.
    PERFORM PRINT-RESULTS.
ADD-TEST-2.

Figure 5-Example of DOD test and supporting code

Therefore, from this point of view, the merging of
the routines disclosed the holes in the validation systems being used prior to the current DOD routines.
The general format of each test is made up of several
paragraphs: (1) the actual "test" paragraph; (2) a
"delete" paragraph which takes advantage of the
COBOL NOTE for deleting tests which the compiler
being validated cannot handle; (3) the "fail" paragraph
for putting out the computed and correct results when
a test fails; and (4) a "write" paragraph which places
the test name in the output line and causes it to be
written. See Figure 5.
The size of the DOD Audit Routines
was approaching 100,000 lines of source coding,
making up 130 programs. The number of environmental changes (resolution of implementor-names) was
in the neighborhood of 1,000 and the number of operating system control cards required to execute the
program would be from 1,300 to 5,000 depending on the
complexity of the operating system involved.
This was where the support program could save a
large amount of both work and mistakes. The Versatile
Program Management System (VPMS1) was designed
to handle all of these problems with a minimum of
effort.
Versatile program management system (VPMS1)

A good portion of the merging included additional
enhancements to the VPMS1 (support program)
which, by this time, through an evolutionary process
had learned to manage two new languages: FORTRAN
and JOVIAL. The program had been modified based
on the additional requirements of various operating
systems for handling particular COBOL problems;
the need for making the system easy for the user to
interface with, and the need to provide all interfaces
between the user, the audit routines, and the operating
system.
The introduction of implementor names through "X-cards"

The first problem was the resolution of implementor-names within the source COBOL programs making up
the audit routines. In the COBOL language, particularly
in the Environment Division, there are constructs which
must contain an implementor-defined word in order for
the statement to be syntactically complete. Figure 6
shows where the implementor-names must be provided.

ENVIRONMENT DIVISION.
SOURCE-COMPUTER.
    implementor-name-1.
OBJECT-COMPUTER.
    implementor-name-2.
SPECIAL-NAMES.
    implementor-name-3 IS MNEMONIC-NAME.
FILE-CONTROL.
    SELECT FILE-NAME ASSIGN TO implementor-name-4.

DATA DIVISION.
FD FILE-NAME
    VALUE OF implementor-name-5 IS implementor-defined.

Figure 6-Implementor defined names that would appear
in a COBOL program

To carry the problem a step further, some of the
names used by different implementors for the high
speed printer in the SELECT statement have been
PRINTER, SYSTEM-PRINTER, FORM-PRINTER,
SYSOUT, SYSOU1, P1 FOR LISTING, etc. It is
obvious to a programmer what the implementor has in
mind, but the compiler that expects SYSTEM-PRINTER
will certainly reject any of the other names. Therefore,
each occurrence of an implementor-name must be
converted to the correct name. The approach taken is
that each implementor-name is defined to VPMS1. For
example, the printer is known as XXXXX36 and the
audit routines using the printer would be set up in the
following way:

SELECT PRINT-FILE ASSIGN TO
    XXXXX36

And the user would provide the name to be used by the
computer being tested through an "X-CARD."

X-36 SYSTEM-PRINTER

VPMS1 would then replace all references of XXXXX36
with SYSTEM-PRINTER.

SELECT PRINT-FILE ASSIGN TO
    SYSTEM-PRINTER.

Ability to update programs

The next problem was to provide the user with a
method for making changes to the audit routines in
an orderly fashion and at the same time provide a maximum amount of documentation for each change made.
There are two reasons for the user to need to make
modifications to the actual audit routines:

a. If the compiler will not accept a form of syntax
it must be eliminated in order to create a syntactically correct program. There are two ways
to accomplish this. In the Procedure Division
the NOTE statement is used to force the "invalid" statements to become comments. The
results of this action would cause the test to be
deleted and this would be reflected in the output. See Figure 7.

ADD-TEST-1.
NOTE (Inserted by the user as an update to the program.)
    MOVE 1 TO ALPHA.
    GO TO ADD-WRITE-1.
ADD-DELETE-1.
    PERFORM DELETE.

Figure 7-Example of deleting a test in the DOD CCVS

The NOTE placed as the first word in the paragraph causes the entire paragraph to be treated as
comments. Instead of the "GO TO ADD-WRITE-1"
statement being executed, the logic of the program falls
into the delete paragraph which causes the output results to reflect the fact that the test was deleted.
If the syntax error is in the Data Division, then the
coding itself must be modified. VPMS1 shows, in its
own printed output, the old card image as well as the
new card image so that what has been altered is readily
apparent, i.e.,

012900  02 A PIC ZZ9 VALUE '1'.    NC108 5.2  OLD
012900  02 A PIC ZZ9 VALUE 1.      NC108*RE   NEW

b. If, while executing the object program of an audit
routine, an abnormal termination occurs, then a change
is required. The cause might be, for example, a data
exception or a program loop due to the incorrect implementation of a COBOL statement. In any case, the
test in question would have to be deleted. The NOTE
would be used as specified above.

In addition, VPMS1 provides a universal method
of updating source programs so that the individual who
validates more than one compiler is not constantly required to learn new implementor techniques for updating source programs.
Example of update cards through VPMS1:

012900 02 A PIC ZZ9 VALUE 1.    (If the sequence number is
013210 MOVE 1 TO A.              equal the card is replaced;
014310 NOTE                      if there is no match the
                                 card is inserted in the
                                 appropriate place in the
                                 program.)
014900*                         (Deletes card 014900)
029300*099000                   (Deletes the series from 029300
                                 through 099000)
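As an illustrative sketch only (Python here for brevity; the actual VPMS1 support program was written in COBOL, and the exact card columns and formats shown are assumptions), the two mechanisms just described, X-card name resolution and sequence-numbered update cards, might be organized as follows:

def resolve_x_cards(source_lines, x_cards):
    """Replace placeholders such as XXXXX36 with the implementor name
    supplied on the matching X-card (e.g. 'X-36 SYSTEM-PRINTER')."""
    names = {}
    for card in x_cards:
        number, name = card.split(None, 1)        # "X-36", "SYSTEM-PRINTER"
        names["XXXXX" + number[2:]] = name.strip()
    resolved = []
    for line in source_lines:
        for placeholder, name in names.items():
            line = line.replace(placeholder, name)
        resolved.append(line)
    return resolved

def apply_update_cards(program_cards, update_cards):
    """Assumed format: a six-digit sequence number in columns 1-6;
    'nnnnnn*' deletes that card and 'nnnnnn*mmmmmm' deletes a series;
    otherwise the card replaces a matching number or is inserted."""
    cards = {line[:6]: line for line in program_cards}
    for card in update_cards:
        seq, rest = card[:6], card[6:]
        if rest.startswith("*"):
            last = rest[1:7] if rest[1:7].strip() else seq
            for key in [k for k in cards if seq <= k <= last]:
                del cards[key]                    # delete one card or a series
        else:
            cards[seq] = card                     # replace or insert
    return [cards[k] for k in sorted(cards)]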
OPERATING SYSTEM CONTROL CARD
GENERATION
The third problem was the generation of operating
system control cards in the appropriate position relative
to the source programs in order for the programs to be
compiled, loaded and executed. This was the biggest
challenge for VPMS1; a COBOL program which had
to be structurally compatible with all COBOL compilers and which also had to be able to interface with
all operating systems with a negligible amount of
modification for each system.
The philosophy of the output of VPMS1 is a file
acceptable to a particular operating system as input.
For the most part this file closely resembles what would
normally be introduced to the operating system through
the system's input device or card reader, i.e., control
cards, source program, data, etc.
The generation of operating system control cards is
based on the specific placement of the statement and
the requirement or need for specific statements to accomplish additional functions. These control cards are
presented to VPMS1 in a form that will not be intercepted by the operating system and are annotated as
to their appropriate characteristics. The body of the
actual control card starts in position 8 of the input
record. Position one is reserved for a code that specifies
the type of control card. The following is allowed in
specifying control cards: Initial control cards are
generated once at the beginning of the file. Beginning
control cards are generated before each source program
with a provision for specifying control cards which
are generated at specific times, i.e., JOB type cards,
subroutine type cards, library control cards, etc. Ending control cards are generated after each source program with the same provision as beginning control
cards. Terminal control cards are generated prior to
the file being closed. Additional control cards are
generated for assigning hardware devices to the object
program, bracketing data and for assigning work areas
to be used by the COBOL Sort.
There are approximately 25 files used by the entire
set of validation routines for which control cards may
need to be prepared. In addition to the control cards
and information for the Environment Division, the
total number of control statements printed for VPMS1
could be in the neighborhood of 200 card images and
the possible number of generated control cards on the
output file could be as large as 5000. The saving in time
and JCL errors that could be prevented should be
obvious at this point.
This Environmental information need not be provided by the user because once a set of VPMS1 control
cards has been satisfactorily debugged on the system
in question, they can be placed in the library file that
contains the same program so that a single request
could extract the VPMS1 control cards for a given
system.
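The control-card expansion can be pictured with a short sketch (again illustrative Python, not the actual support program; the single-letter type codes and the column positions are assumptions based on the description above):

def build_job_stream(control_cards, programs):
    """Expand typed control cards around each audit routine to produce a
    file resembling normal card-reader input (control cards, source, data)."""
    typed = {"I": [], "B": [], "E": [], "T": []}          # initial, beginning, ending, terminal
    for card in control_cards:
        typed.setdefault(card[0], []).append(card[7:])    # type code in col 1, body from col 8

    stream = list(typed["I"])                 # initial cards, once at the start of the file
    for source in programs:
        stream.extend(typed["B"])             # beginning cards before each source program
        stream.extend(source)                 # the audit routine itself
        stream.extend(typed["E"])             # ending cards after each source program
    stream.extend(typed["T"])                 # terminal cards before the file is closed
    return stream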

CONCLUSION

It has been demonstrated that the validation of COBOL
compilers is possible and that the end result is beneficial to both compiler writers and the users of these
compilers. The ease with which the DOD CCVS can
be automatically adapted to a given computer system
has eliminated approximately 85 to 90 percent of the
work involved in validating a COBOL compiler.
Although most compilers are written from the same
basic specifications (i.e., the American National Standard COBOL, X3.23-1968, or the CODASYL COBOL
Journal of Development) the results are not always
the same. The DOD CCVS has exposed numerous
compiler bugs as well as misinterpretations of the
language. Due to this and similar efforts in the area of
compiler validation, the compatibility of today's
compilers has grown to a high degree.
We are now awaiting the next version of the American
National Standard COBOL. The new specifications
will provide an increased level of compatibility between
compilers because the specifications are more definitive
and contain fewer "implementor defined" areas. In
addition, numerous enhancements and several clarifications have been included in the new specification-

827

all contributing to better software, both at the compiler
and the application level.
REFERENCES
1 American National Standard COBOL X3.23-1968
American National Standards Institute Inc. New York 1968
2 COBOL-61 Conference on Data System Languages
U. S. Government Printing Office Washington D. C. 1961

A prototype automatic program testing tool
by LEON G. STUCKI
McDonnell Douglas Astronautics Company
Huntington Beach, California

" . . . as a slow-witted human being I have a very
small head and had better learn to live with it
and to respect my limitations and give them full
credit, rather than try to ignore them, for the
latter vain effort will be punished by failure."
-Edsger W. Dijkstra

The measurement process plays a vital role in the
quality assurance and testing of new hardware systems.
To insure the reliability of the final hardware system,
each stage of development incorporates performance
standards and testing procedures. The establishment of
software performance criteria has been very nebulous.
At first the desire to "just get it working" prevailed in
most software development efforts. With the increasing
complexity of new and evolving software systems,
improved measurement techniques are needed to
facilitate disciplined program testing beyond merely
debugging. The Program Testing Translator is an
automatic tool designed to aid in the measurement and
testing of software systems.
A great need exists for new methods of gaining insight into the structure and behavior of programs being
tested. Dijkstra alludes to this in a hardware analogy.
He points out that the number of different multiplications possible with a 27-bit fixed-point multiplier is
approximately 2^54. With a speed in the order of tens of
microseconds, what at first might seem reasonable to
verify would require 10,000 years of computation.1
With these possibilities for such a simple operation as
the multiplication of two data items, can it be expected
that a programmer anticipate completely the actions
of a large program?
Dijkstra, in relation to both hardware and software
"mechanisms," continues by stating:

"The straightforward conclusion is the following:
a convincing demonstration of correctness being
impossible as long as the mechanism is regarded
as a black box, our only hope lies in not regarding
the mechanism as a black box. I shall call this
'taking the structure of the mechanism into
account.' "1

SOFTWARE SYSTEMS MEASUREMENT
TECHNIQUES

As suggested by R. W. Bemer and A. L. Ellison in
their 1968 IFIP report, the examination of hardware
and software structures might incorporate similar test
procedures:

"Instrumentation should be applied to software
with the same frequency and unconscious habit
with which oscilloscopes are used by the hardware
engineer."2

Early attempts at the application of measurement
techniques to software dealt mainly with efforts to
measure the hardware utilization characteristics. In an
attempt to further improve hardware utilization,
several aids have been developed ranging from optimized
compilers to automated execution monitoring systems.3,4
The Program Testing Translator, designed to aid in
the testing of programs, goes further. In addition to
providing execution time statistics on the frequency of
execution for various program statements, the Program
Testing Translator performs a "standards" check to
insure programmers' compliance to an established
coding standard, gathers data on the extent to which
various branches of a program are executed, and
provides data range values on assignment statements
and DO-loop control variables.


As was pointed out by Heisenberg, in reference to the
measurement of physical systems, a degree of uncertainty is introduced into any system under observation. With Heisenberg's Uncertainty Principle in mind,
the Program Testing Translator is presented as a
"tool" to be used in the software measurement process.
Just as using a microscope to determine the position of
a free particle introduces a degree of uncertainty into
observations, 5 so must it be concluded that no program
measurement tool can guarantee the complete absence
of all possible side effects. In particular, potential
problems involving changes in time and space must be
considered. For example, the behavior of some real-time
applications may be affected by increased execution
times. To avoid the use and development of more
powerful program testing tools because of possible uncertainties, however, would be as great a folly as to
reject the use of the microscope.
DATA ANALYZED BY THE PROGRAM
TESTING TRANSLATOR

In a paper by Knuth a large sample of FORTRAN
programs was quantitatively analyzed in an attempt to
come up with a program profile. This profile was expressed in the form of a table of frequency counts
showing how often each type of source statement occurs
in a "typical" program. Knuth builds a strong case for
designing profile-keeping mechanisms into major computer systems.6 Internal organization of the Program
Testing Translator was designed with Knuth's table
of frequency counts in mind.
The Program Testing Translator gathers and analyzes
data in two general areas: (1) the syntactic profile of
the source program showing the number of executable,
nonexecutable,* and comment statements, the number
of CALL statements and total program branches,** and
the number of coding standard's violations, and (2)
actual program performance statistics corresponding
to various test data sets.
With all options enabled, the actual program performance statistics produced by the Program Testing
Translator include:

(1) The number and percentage of those executable
source statements actually executed.
(2) The number and percentage of those branches
and CALLs actually taken or executed.
(3) The following specific data associated with each
executable source statement.
(a) detailed execution counts
(b) detailed branch counts on all IF and GOTO
statements
(c) min/max data range values on assignment
statements and DO-loop control variables.

Several previous programs7,8,9 have provided interesting
source statement execution data. The additional data
range information provided by the Program Testing
Translator, however, proves useful in further analyzing
program behavior. Extended research investigating
possible techniques for automatic test data generation
will make use of these data range values. The long term
goal of this research is directed toward designing a
procedure for obtaining a minimal yet adequate set of
test cases for "testing" a program.

* Executable statements include assignment, control, and input/
output statements. Nonexecutable statements include specification and subprogram statements.
** A branch will denote each possible path of program flow in all
conditional and transfer statements (i.e., all IF and GOTO
statements in FORTRAN).

STANDARDS' CHECKING

Although general in design, the initial implementation
of the Program Testing Translator was restricted to
the CDC 6500. Scanning the input source code, the
Program Testing Translator flags as warnings all
dialect peculiar statements which pose possible machine
conversion problems. The standard is basically the
ASA FORTRAN IV Standard10 with some additional
restrictions local to McDonnell Douglas. The standard
can easily be altered to reflect the needs of a particular
installation, in contrast to previous compilers which
have incorporated a fixed standard's check (e.g.,
WATFOR).
DEVELOPMENT OF THE PROGRAM TESTING
TRANSLATOR
The Program Testing Translator serves as a prototype
automatic testing aid suggesting future development of
much more powerful software testing systems. The
basic components of the system are the FORTRAN-to-FORTRAN preprocessor and postprocessor module.
(See the section on Use of the Program Testing
Translator.)
A machine independent Meta Translator11 was used
to generate the Program Testing Translator. Conventionally, moving major software processors between
machines posed serious problems requiring either completely new coded versions, or the use of new metacompiler systems.12 This Meta Translator produces an
ASA Standard FORTRAN translator which represents
an easily movable translation package.
In general, for implementation on another machine,
FORTRAN-to-FORTRAN processors such as the
Program Testing Translator require only that the
syntactic definition input to the Meta Translator be
changed to reflect the syntax of the new machine's
FORTRAN dialect.
INSTRUMENTATION TECHNIQUES
The instrumentation technique used by the Program
Testing Translator is to insert appropriate high-level
language statements within the original source code
making as few changes as possible.13 Three memory
areas are added to each main program and subroutine.
One is used for various execution counts while the other
two are used for the storage of minimum and maximum
data range values for specified assignment statements
and DO-loop control variables. The size of these
respective memory areas depends upon the size of the
program being tested and the options chosen.
Simple counters of the form:
QINT(i) = QINT(i) + 1
are inserted at all points prefixed by statement numbers,
at entry points, after CALL statements, after logical IF
statements, after DO statements, and after the last
statement in a DO-loop.8,14 Additional counters are
used to maintain branch counts on all IF and GOTO
statements.
Minimum and maximum data range values are
calculated following each non-trivial assignment statement. These values of differing types are packed into
the two memory areas allocated for this purpose.
Minimum and maximum values may also be kept on all
variables used as DO-loop control parameters. These
values are calculated before entry into the DO-loop.
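The insertion step can be sketched as follows (an illustrative Python rendering, not the translator itself; the statement classification is greatly simplified and handles only a subset of the insertion points listed above):

import re

def instrument(lines):
    """Append a QINT counter after statements with statement numbers and
    after CALL, logical IF, and DO statements (a simplified subset of the
    insertion points described in the text)."""
    out, counters = [], 0
    for line in lines:
        out.append(line)
        label_field = line[:5].strip()               # statement number, columns 1-5
        body = line[6:].lstrip()                     # statement body, column 7 onward
        is_call = body.startswith("CALL ")
        is_if = re.match(r"IF\s*\(.*\)\s*\S", body) is not None
        is_do = re.match(r"DO\s+\d+", body) is not None
        if label_field.isdigit() or is_call or is_if or is_do:
            counters += 1
            out.append("      QINT(%d) = QINT(%d) + 1" % (counters, counters))
    return out, counters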
USE OF THE PROGRAM TESTING
TRANSLATOR
Overall program flow of the Program Testing Translator is diagrammed in Figure 1. Basically, the user's
FORTRAN source cards are input to the preprocessor.
. This preprocessor module outputs: (1) an instrumented source file to be compiled by the standard
FORTRAN compiler and (2) an intermediate data file
for postprocessing. This intermediate file contains a
copy of the original source code with a linkage field for
extracting the profile and execution-time data for the
program.

Figure 1-Program testing translation job flow

The object code produced by the FORTRAN compiler is linked with the postprocessor from an object
library. The resulting object module can now be saved
along with the intermediate Program Testing Translator
data file. Together they can then be executed with any
number of user test cases. Using the intermediate file
built by the preprocessor and data gathered while
duplicating the original program results, the postprocessor generates reports showing program behavior
for each specific test case. Analysis of these reports will
help eliminate redundant test cases and point to sections
of the user's program which have not yet been "tested."
Examination of these particular areas may lead to
either their elimination or the inclusion of modified
test cases to check out these program sections.
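The reporting step might be sketched like this (illustrative Python; the report layout is hypothetical, not the tool's actual format):

def coverage_report(source_with_ids, counts):
    """source_with_ids: (counter_id, text) pairs, counter_id None for
    nonexecutable lines; counts: counter_id -> observed executions."""
    executed = total = 0
    for counter_id, text in source_with_ids:
        if counter_id is None:
            print("          %s" % text)
            continue
        total += 1
        n = counts.get(counter_id, 0)
        executed += 1 if n else 0
        flag = " " if n else "*"                 # '*' marks statements never reached
        print("%8d %s %s" % (n, flag, text))
    if total:
        print("Executed %d of %d statements (%.1f percent)"
              % (executed, total, 100.0 * executed / total))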
Preliminary measurements indicate that the execution time of the instrumented object module, with all
options enabled, is approximately one and one half to
two times the normal CPU time. Increases in I/O time
are negligible in most cases.
ACTUAL TEST EXPERIENCE
Although the Program Testing Translator has only
been available for a short time, several interesting
results have come to light.
One of the first major subroutines, processed at
McDonnell Douglas, was an eigenvalue-eigenvector
subroutine believed to be the most efficient algorithm
currently available for symmetric matrices.15,16 Of the
613 source statements in the subroutine, it was immediately noted that the nested DO-loop shown here was
accounting for one quarter of all the execution counts for
the entire subroutine (see Appendix).

DO 640 I=1, N
DO 640 J=1, N
640 B(I) = B(I) + A(I, J)*V(J, IVEC)
Several immediate observations are worthy of mention. First, note that the complexity of the above
statement with its double subscripting and multiplication makes it a costly statement to execute. Second, it
can be seen that the subscripting of the variable B(I)
could be promoted out of the inner loop. A good
optimizing compiler should promote the subscripting
of the B(I)'s and produce good code for the double
subscripting and multiplication17 but it cannot logically
redesign the nested loop. A redesigned machine language subroutine replacing the original loop has now
cut total subroutine execution time by one third.
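The effect of the first two observations can be sketched in present-day terms (Python rather than FORTRAN, purely for illustration; the further redesign of the loop itself, as in the machine-language replacement, goes beyond such a mechanical rewrite):

def original(a, v, b, n):
    for i in range(n):
        for j in range(n):
            b[i] = b[i] + a[i][j] * v[j]     # B(I) subscripted on every inner pass

def promoted(a, v, b, n):
    for i in range(n):
        s = b[i]                             # subscripting of B(I) hoisted out of the loop
        for j in range(n):
            s += a[i][j] * v[j]
        b[i] = s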
Further analysis of the same program, in an attempt to
determine why several sections of code were not

executed, revealed a logic error making it impossible to
ever reach one particular section of code. This error,
which was subsequently corrected, can be seen on the
first page of the original run contained in the Appendix.
This was a subroutine experiencing a great deal of use
and thought to be thoroughly checked out.
Running the Program Testing Translator through
itself has resulted in savings of over 37 percent in CPU
execution times. The standard's checking performed by
the Program Testing Translator has verified the machine independence of the Meta Translator.
Table I contains a summary of the actual program
statistics observed on the first eight major programs
run through the Program Testing Translator. It is
interesting to note that only 45.9 percent of the possible
executable statements were actually executed. Of more
importance, however, is the fact that only 32.5 percent
of all possible branches were actually taken.
Table II compares the class profile data gathered at
McDonnell Douglas by the Program Testing Translator
with the Lockheed and Stanford findings cited by
Knuth.6 The syntactic profiles of the McDonnell Douglas
and Lockheed samples were remarkably similar.
Stanford's "student" job profile shows much less
emphasis on internal documentation (i.e., fewer comment statements) and also exemplifies a more straightforward approach to flow of control (as seen in Stanford's 0.32 branches per executable statement compared
to 0.48 and 0.54 branches per executable statement for
the two aerospace companies).

TABLE I-Actual Program Statistics with the Program Testing Translator

Program                            AB33    AD77    F999   JOYCE    META    MI01     PTT    UT03   TOTALS

Total Number of Statements        1,578  11,111   2,833   3,033   1,125     775     772   1,445   22,672

No. of Comment Statements           355   3,847     644     176      86     189      44      54    5,395
Percentage of Total                22.5    34.6    22.7     5.8     7.6    24.4     5.7     3.7     23.8

No. Other Nonexecutable
  Statements                        177     905     257     372     534      40     249     254    2,788
Percentage of Total                11.2     8.1     9.1    12.3    47.5     5.2    32.3    17.6     12.3

No. Standard's Violations             9      33      28      65       1       1      23      44      204
Percentage of Total                 0.6     0.3     1.0     2.1     0.1     0.1     3.0     3.0      1.0

No. Executable Statements         1,046   6,359   1,932   2,485     505     546     479   1,137   14,489
Percentage of Total                66.3    57.2    68.2    81.9    44.9    70.5    62.0    78.7     63.9

No. Actually Executed               678   2,213   1,155     846     419     392     364     584    6,651
Percentage Executed                64.8    34.8    59.8    34.0    83.0    71.8    76.0    51.4     45.9

No. of Branches                     357   2,635     859   1,718     355     189     333     510    6,956
Avg./Exec. Statements              0.34    0.41    0.44    0.69    0.70    0.35    0.70    0.45     0.48
No. Actually Executed               195     571     376     454     203     112     175     175    2,261
Percentage Executed                54.6    21.7    43.8    26.4    57.2    59.3    52.6    34.3     32.5

No. of CALL Statements               20     369      86     278      32       9      19      99      912
Avg./Exec. Statements              0.02    0.06    0.04    0.11    0.06    0.02    0.04    0.09     0.06
No. Actually Executed                18     119      26      67      21       3       5      76      335
Percentage Executed                90.0    32.2    30.2    24.1    65.6    33.3    26.3    76.8     36.7

Total Statement Exec.
  Counts (in thousands)          26,772   2,929     112   1,129   5,284   1,133   1,087      71   38,517

TABLE II-A Comparison of Syntactic Class Profiles

                                McDonnell
                                 Douglas    Lockheed*   Stanford*

Total No. Statements              22,672     245,000      10,700
Percentage Comments                 23.8        21.6        10.2
Percentage Other Nonexecutable      12.3        10.6**      12.3**
Percentage Executable               63.9        67.8        77.5
Avg. No. Branches/Executable
  Statement                         0.48        0.54        0.32
Avg. No. CALLs/Executable
  Statement                         0.06        0.04        0.09

* Note: These figures represent this author's best attempt at
extrapolating comparable measurements from Knuth's paper.6
Knuth's percentage figures had to be corrected by adding the
comment statements into the total number of statements. Calculations of the average number of branches per executable statement
require two assumptions: (1) 30 percent of the IF statements
had 3 possible branches while 70 percent had 2 branches, (2)
96 percent of the GOTO statements were unconditional (i.e.,
1 branch), while 4 percent were switched (i.e., 2 branches were
assumed).
** Includes the following: FORMAT, DATA, DIMENSION,
COMMON, END, SUBROUTINE, EQUIVALENCE, INTEGER,
ENTRY, LOGICAL, REAL, DOUBLE, OVERLAY, EXTERNAL,
IMPLICIT, COMPLEX, NAMELIST, BLOCKDATA.

EXTENSIONS OF THE PROGRAM TESTING
TRANSLATOR

As alluded to in earlier sections, much more powerful
testing systems can and should be built in the future.
Relatively simple changes to the postprocessor module
could enable the execution time data from multiple test
runs to be combined automatically into composite
test reports.
Changes to the translator module might provide the
options of first and last values on assignment statements
as well as range values.
Modeled after development of the FORTRAN-to-FORTRAN system, instrumentation systems for other
languages of heavy use such as COBOL or PL/1 might
well be developed.
The most important area now being investigated,
however, is the possible design of extensible automatic
testing aids to provide for the automatic generation of
test data. Evolvement of future testing tools along these
lines would greatly aid the quality assurance aspects of
large software systems.

ACKNOWLEDGMENT

The research described in this report was carried out by
the author under the direction of T. W. Miller, Jr. and
R. G. Koppang in the Advance Computer Systems
Department at the McDonnell Douglas Astronautics
Company in Huntington Beach, California.

REFERENCES
1 E W DIJKSTRA
Notes on structured programming
Technological University Eindhoven The Netherlands
Department of Mathematics April 1970 TH
Report 70-WSK-03
2 R W BEMER A L ELLISON
Software instrumentation systems for optimum performance
IFIP Congress 1968 Proceedings Software
Session 2 Booklet C p 39-42
3 System measurement software SMS/360 problem program
efficiency
Boole and Babbage Inc. Product Description Palo Alto
California May 1969 Document No. S-32 Rev-l
4 D N KELLY
Spy a computer program execution monitoring package
McDonnell Douglas Automation Company Huntington
Beach California MDC G2659 December 1971
5 W HEISENBERG
The uncertainty principle
Zeitschrift für Physik Vol 43 1927
6 D E KNUTH
An empirical study of FORTRAN programs
Stanford Artificial Intelligence Project Memo AIM-137
Computer Science Dept Report No. CS-186
7 1ST LT G W JOSEPH
The FORTRAN frequency analyzer as a data gathering aid for
computer system simulation.
Electronics Systems Division United States Air Force
L G Hanscom Field Bedford Massachusetts March 1972
8 D H H INGALLS
FETE A FORTRAN execution time estimator
Computer Science Department Stanford University
STAN-CS-71-204 February 1971
9 CDC 6500 FWW user's manual
TRW Systems Group Redondo Beach California
10 Proposed American standard X3-1,..3-FORTRAN
Inquiries addressed to X3 Secretary BEMA
235 E 42nd Street New York NY March 1965


11 Meta translator

Advanced Computer Sciences Department McDonnell
Douglas Astronautics Company Huntington Beach
California currently in preparation
12 A R TYRILL
The meta 7 translator writing system
Master's Thesis School of Engineering and Applied Science
University of California Los Angeles California
Report 71-22 September 1971
13 E C RUSSELL
Automatic program analysis
PhD Dissertation in Engineering School of Engineering and
Applied Science University of California Los Angeles
California 1969
14 V G CERF
Measurement of recursive programs
Master's Thesis School of Engineering and Applied Science

University of California Los Angeles California
70-43 May 1970
15 S J CLARK
Computation of eigenvalues and eigenvectors of a real
symmetric matrix using SYMQR1
Advanced Mathematics Department McDonnell Douglas
Astronautics Company Huntington Beach California
Internal Memorandum A3-830-BEGO-71-07 November
1971
16 S J CLARK
Further improvement of subroutine SYMQR1
Advanced Mathematics Department McDonnell Douglas
Astronautics Company Huntington Beach California
Internal Memorandum A3-830-BEGO-SJC-094 March 1972
17 F E ALLEN
Program optimization
Ann Rev in Automatic Programming 5(1969)
pp 237-307

APPENDIX

[Sample output of the Program Testing Translator: an annotated FORTRAN source listing of an eigenvalue/eigenvector routine, in which a leading * flags conversion warnings and an execution count is printed beside each statement. A companion "specific execution data" section reports, for each instrumented statement, branch execution counts, TRUE/FALSE counts for logical IF statements, and the minimum and maximum values assumed by assigned variables; unreached code is flagged (e.g., "IMPOSSIBLE TO REACH AS IS," "NEVER TRUE").]
may be the appropriate distribution for representing
time between failures. The log-normal and gamma
(with appropriate choice of parameters) are other
functions with an increasing hazard rate which may
also be appropriate for this phase.
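For reference, one common two-parameter form of the Weibull hazard rate (the scale and shape parameter names used here are ours, not the paper's) makes the role of the shape parameter explicit:

\[
z(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1},
\]

which is increasing in t for \beta > 1, constant (the exponential case) for \beta = 1, and decreasing for \beta < 1.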

Application of Reliability Theory to Software

When applied to software reliability, many of the basic concepts and definitions of reliability theory remain intact. Among these are the following:
• Definition of reliability R(t) as the probability of successful program operation for at least t hours
• Probability density function f(t) of time between software troubles, or, equivalently, the time rate of change of the probability of trouble
• Hazard rate z(t) as the instantaneous trouble rate, or, equivalently, the time rate of change of the conditional probability of trouble (time rate of change of probability of trouble, given that no trouble has occurred prior to time t)
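These three quantities are not independent; in standard reliability notation (a summary added here for reference, consistent with the definitions above rather than quoted from the paper):

\[
f(t) = -\frac{dR(t)}{dt}, \qquad
z(t) = \frac{f(t)}{R(t)}, \qquad
R(t) = \exp\!\left(-\int_0^t z(u)\,du\right),
\]

so, for example, a constant hazard rate z(t) = 1/T yields the exponential reliability function R(t) = e^(-t/T).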

There are also major differences between hardware and software reliability. These are listed below:
• Stresses and wear do not accumulate in software from one operating period to another as in the case of certain equipment; however, program quality may be different at the start of each run, for the reason given below.
• In the case of hardware, it is usually assumed that between the burn-in and wearout stages an exponential distribution (which means a constant hazard rate) applies and that the probability of failure in a time interval t is independent of equipment age. However, for software, there may be a difference in the initial "state of quality" between operating periods due to the correction of errors in a previous run or the introduction of new errors as the result of correcting other errors. Thus it is appropriate to employ a reliability growth model which would provide a reliability prediction at several points in the cumulative operating time domain of a program.
• For equipment, age is used as the variable for reliability prediction when the equipment has reached the wearout stage. Since with software the concern is with running a program repeatedly for short operating times, the time variable which is used for reliability purposes is the time between troubles. However, cumulative operating time is a variable of importance for predicting the timing and magnitude of shifts in the reliability function as a result of the continuing elimination of bugs or program modification.

Over long periods of calendar or test time, there will be shifts in the error occurrence process such that different hazard rate and probability density functions are applicable to different periods of time; or, the same hazard and probability density functions may apply but the parameter values of these functions have changed. This shift is depicted in Figure 3, where the reliability function, which is a decreasing function of operating time, is shown shifted upward at various points in cumulative test time, reflecting long-term reductions in the trouble rate and an increase in the time between troubles.

Figure 3-Reliability growth (reliability function vs. operating time, shown shifting upward at successive points in cumulative test time)

Approach

The steps which are involved in one approach to software reliability prediction are shown in Figure 4.

Figure 4-Steps in reliability prediction (test data; assemble data; identify statistical distribution; estimate reliability parameters, yielding point and confidence limit estimates; estimate reliability function, yielding reliability and confidence limit; make reliability prediction)

Step 1. Assemble Data

Data must first be assembled in the form of a time between troubles distribution as was indicated in Figure 2. At this point, troubles are also classified by type and severity.
Step 2. Identify Statistical Distribution

In order to identify the type of reliability function
which may be appropriate, both the empirical relative
frequency function of time between troubles (see
example in Figure 5) and the empirical hazard function
are examined. The shapes of these functions provide
qualitative clues as to the type of reliability function
which may be appropriate. For example:
• A monotonically decreasing f(t) and a constant z(t) suggest an exponential function.
• An f(t) which has a maximum at other than t = 0 and a z(t) which monotonically increases suggests:
  -Normal function or
  -Gamma function or
  -Weibull function with β > 1.
• A monotonically decreasing f(t) and a monotonically decreasing z(t) suggest a Weibull function with β < 1.
After some idea is obtained concerning the possible distributions which may apply, point estimates of the parameters of these distributions are obtained from the sample data. This step is necessary in order to perform goodness of fit tests and to provide parameter estimates for the reliability function.
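As a rough illustration of this step, the following sketch (illustrative sample data and 1-hour class intervals of our own choosing, in the spirit of Figure 5, not an excerpt from the study) tabulates the empirical relative frequency and the empirical hazard per interval from a sample of times between troubles:

program empirical_hazard
   ! Tabulate the empirical relative frequency and empirical hazard of the
   ! time between troubles, using 1-hour class intervals.
   ! The sample data below are illustrative only.
   implicit none
   real    :: t(10) = (/0.4, 0.7, 1.2, 1.6, 2.3, 2.9, 3.4, 4.1, 5.5, 7.2/)
   integer, parameter :: nbins = 8
   integer :: tally(nbins), i, k, remaining
   real    :: freq, hazard

   tally = 0
   do i = 1, size(t)
      k = int(t(i)) + 1                        ! class interval [k-1, k) hours
      if (k >= 1 .and. k <= nbins) tally(k) = tally(k) + 1
   end do

   remaining = size(t)                         ! troubles not yet observed at interval start
   do k = 1, nbins
      freq = real(tally(k)) / real(size(t))    ! empirical f(t) for this interval
      hazard = 0.0
      if (remaining > 0) hazard = real(tally(k)) / real(remaining)   ! empirical z(t) per hour
      print '(a,i1,a,i1,2(a,f5.2))', ' interval ', k-1, '-', k, ': rel. freq. =', freq, '  hazard =', hazard
      remaining = remaining - tally(k)
   end do
end program empirical_hazard

The shape of the resulting frequency and hazard columns can then be compared against the qualitative clues listed above.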

Figure 5-Probability density functions (time between troubles distributions for Program 1, Ship 1: empirical relative frequency vs. time between troubles in hours, in one-hour intervals from 0-.9 to 7.0-7.9)


Figure 6-Goodness of fit test (Kolmogorov-Smirnov exponential test, Program 1, many ships (Ships 1, 2, 3, 4, 5, 6, 7); program run time distribution function vs. program run time in hours, showing the theoretical exponential distribution function, the empirical data, and the upper and lower confidence limits; d = ±.139 confidence band from a table of d distribution values, α = .05 level of significance, N = 93 data points)

In order to make a goodness of fit test, it is necessary to provide an estimate of the theoretical function to be used in the test. This is obtained by making point estimates of the applicable parameters. In the case of the one-parameter exponential distribution, the point estimate would simply involve computing the mean time between troubles = total cumulative test time/number of troubles, which is the maximum likelihood estimator of the parameter T in the exponential probability density function f(t) = (1/T)e^(-t/T).
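That estimator follows from standard maximum likelihood algebra (sketched here for completeness; the paper states only the result): with n observed times between troubles t_1, ..., t_n,

\[
\ln L(T) = \sum_{i=1}^{n}\ln\!\left(\frac{1}{T}e^{-t_i/T}\right)
         = -n\ln T - \frac{1}{T}\sum_{i=1}^{n} t_i,
\qquad
\frac{d\,\ln L}{dT} = -\frac{n}{T} + \frac{1}{T^2}\sum_{i=1}^{n} t_i = 0
\;\Longrightarrow\;
\hat{T} = \frac{1}{n}\sum_{i=1}^{n} t_i,
\]

that is, the total cumulative test time divided by the number of troubles.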
In the case of a multiple-parameter distribution, the process is more involved. For the Weibull distribution, the following steps are required to obtain parameter point estimates:
- A logarithmic transformation of the hazard function is performed in order to obtain a linear function from which initial parameter values can be obtained (a worked form of this transformation is sketched after this list).
- The initial values are used in the maximum likelihood estimating equations in order to obtain parameter point estimates.
- The probability density, reliability and hazard functions are computed using the estimated parameter values.
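One common form of the logarithmic transformation in the first step (written with the same scale parameter \alpha and shape parameter \beta used in the Weibull hazard sketch earlier; the paper does not spell out the algebra):

\[
z(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1}
\;\Longrightarrow\;
\ln z(t) = \bigl(\ln\beta - \beta\ln\alpha\bigr) + (\beta - 1)\ln t,
\]

which is linear in \ln t; a straight-line fit of the empirical \ln z(t) against \ln t therefore gives an initial value of \beta from the slope and of \alpha from the intercept, with which the maximum likelihood iteration can be started.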

At this point, a goodness of fit test can be performed between the theoretical probability density and the empirical relative frequency function or between the theoretical and empirical reliability functions. The Kolmogorov-Smirnov (K-S) or Chi-Square tests can be employed for this purpose. An example of using the K-S test is shown graphically in Figure 6. This curve shows a test with respect to an exponential reliability function. Shown are the upper and lower confidence limits, the theoretical function and the empirical data points. Since the empirical points fall within the confidence band, it is concluded that the exponential is not an unreasonable function to employ in this case.
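A minimal sketch of the K-S computation against a fitted exponential (with illustrative run times of our own; the critical band, such as the ±.139 value in Figure 6, would still come from a table of d values):

program ks_exponential
   ! Compute the Kolmogorov-Smirnov statistic D for a sample of program
   ! run times against an exponential distribution fitted by its mean.
   implicit none
   real    :: t(8) = (/0.3, 0.9, 1.4, 2.2, 2.8, 3.7, 5.1, 6.6/)   ! sorted run times (hours)
   real    :: tbar, fexp, dev, d
   integer :: i, n

   n = size(t)
   tbar = sum(t) / real(n)                 ! point estimate of mean time between troubles
   d = 0.0
   do i = 1, n
      fexp = 1.0 - exp(-t(i) / tbar)       ! fitted exponential CDF at t(i)
      ! the empirical CDF jumps from (i-1)/n to i/n at t(i); test both sides of the step
      dev = max(abs(real(i)/real(n) - fexp), abs(real(i-1)/real(n) - fexp))
      d = max(d, dev)
   end do
   print '(a,f6.3)', ' K-S statistic D =', d   ! compare with the tabulated confidence band
end program ks_exponential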
Step 3. Estimate Reliability Parameters Confidence
Limits

The point estimate of a reliability parameter provides the best single estimate of the true population
parameter value. However, since this estimate will, in
general, differ from the population parameter value
due to sampling and observational errors, it is appropriate to provide an interval estimate within which
the population parameter will be presumed to lie. Since
only the lower confidence limit of the reliability func-

tion is of interest, one-sided confidence limits of the
parameters are computed. In Figure 7 is shown an
example of the results of the foregoing procedure,
wherein, for an exponential distribution, the point
estimate of mean time between troubles (MTBT) is
2.94 hours (hazard rate of .34 troubles per hour) and
the lower confidence limit estimate of MTBT is 2.27
hours (hazard rate of .44 troubles per hour). The lower
confidence limit of MTBT for an exponential distribution is computed from the expression T_L = 2nT/χ²(2n, 1-α), where T_L is the lower confidence limit of MTBT, n is the number of troubles, T is the MTBT (estimated from sample data), χ² is a Chi-Square value and α is the level of significance.
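As a purely hypothetical illustration of that expression (the numbers below are invented for the arithmetic only and are not the NTDS values): with n = 20 troubles, an estimated MTBT of \hat{T} = 3.0 hours, and \chi^2_{40,\,0.95} \approx 55.8 from a standard table,

\[
T_L = \frac{2n\hat{T}}{\chi^2_{2n,\,1-\alpha}} = \frac{2 \times 20 \times 3.0}{55.8} \approx 2.15 \text{ hours},
\]

a one-sided lower 95 percent confidence limit noticeably below the point estimate, just as in the 2.94-hour versus 2.27-hour example above.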
Step 4. Estimate Reliability Function

With point and confidence limit estimates of parameters available, the corresponding reliability functions can be estimated. The point and lower limit
parameter estimates provide the estimated reliability
functions R = e^(-.34t) and R = e^(-.44t), respectively, in
Figure 7. In this example, the predicted reliabilities
pertain to the occurrence of all categories of software
trouble, i.e., the probability of no software troubles
of any type occurring within the operating time of t
hours.
Step 5. Make Reliability Prediction

With estimates of the reliability function available,
the reliability for various operating time intervals can
be predicted. The predicted reliability is then compared with the specified reliability. In Figure 7, the
predicted reliability is less than the specified reliability
(reliability objective) throughout the operating time
range of the program. In this situation, testing must
be continued until a point estimate of MTBT of 58.8 hours (.017 troubles per hour hazard rate) and a lower confidence limit estimate of MTBT of 45.5 hours (.022 troubles per hour hazard rate) are obtained.
This result would shift the lower confidence limit of
the predicted reliability function above the reliability
objective.
Estimating test requirements

For the purpose of estimating test requirements in
terms of test time and allowable number of troubles,
curves such as those shown in Figure 8 are useful. This
set of curves, applicable only to the exponential reliability function, would be used to obtain pairs of (test


time, number of troubles) values. The satisfaction during testing of one pair of values is equivalent to satisfying the specified lower limit of reliability R_l for t hours of operation. For example, if a program reliability specification calls for a lower reliability confidence limit of .95 after 1 hour of operating time, this requirement would be satisfied by a cumulative test time of 100 hours and no more than 2 troubles; a cumulative test time of 200 hours and no more than 6 troubles; a cumulative test time of 300 hours and no more than 10 troubles, etc. The required test time is estimated from the relationship T = tχ²(2n, 1-α)/[2 ln(1/R_l)], where T is the required test time, t is the required operating time, χ² is a Chi-Square value, n is the number of troubles, R_l is the required lower limit of reliability and α is the level of significance.

Figure 7-Reliability prediction (reliability function and its confidence limit for Program 1, Ship 1, using the exponential reliability function, α = .05 level of significance; reliability vs. operating time in hours, showing the existing exponential reliability R = e^(-.34t) and its lower confidence limit R = e^(-.44t), the assumed reliability objective, and the reliability required to satisfy the objective, R = e^(-.017t), with lower confidence limit R = e^(-.022t))

Figure 8-Test requirements (amount of test time required to achieve an indicated lower limit of reliability, exponential reliability function; required test time in hours vs. number of troubles during test, with curves for R_l = .95, .90 and .85 at 1 hour of operating time, corresponding to mean times between troubles of 19.5, 9.48 and 6.15 hours respectively)
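The 300-hour figure quoted above can be checked directly from this relationship (reading \chi^2_{20,\,0.95} \approx 31.4 from a standard table):

\[
T = \frac{t\,\chi^2_{2n,\,1-\alpha}}{2\ln(1/R_l)}
  = \frac{1 \times 31.4}{2\ln(1/0.95)}
  \approx \frac{31.4}{0.1026}
  \approx 306 \text{ hours}
\]

for n = 10 troubles, in agreement with the roughly 300 hours and 10 troubles cited for the .95 requirement.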
PRELIMINARY RESULTS AND CONCLUSIONS
A Naval Electronics Laboratory Center (NELC)
sponsored study* was performed, employing the concepts and techniques described in this report, on
Naval Tactical Data System (NTDS) data. The data
utilized involved 19 programs, 12 ships and 325 software trouble reports. The major preliminary results
and conclusions follow:
1. On the basis of Analysis of Variance tests, it
was found that NTDS programs are heterogeneous with respect to reliability characteristics. There was greater variation of reliability
between programs than within programs. This
result suggests that program and programmer
characteristics (source of between program
*N. F. Schneidewind, "A Methodology for Software Reliability
Prediction and Quality Control," Naval Postgraduate School,
Report No. NPS55SS72032B, March 1972.

variation) are more important in determining program reliability than is the stage of program checkout or cumulative test time utilized (source
of within program variation). This result indicates a potential for obtaining a better understanding of the determinants of software reliability by statistically correlating program and
programmer characteristics with measures of
program reliability.
2. Goodness of fit tests indicated much variation
among programs in the type of reliability function which would be applicable for predicting
reliability. This result and the Analysis of
Variance results suggest that program reliability
should be predicted on an individual program
basis and that it is not appropriate to merge sets
of trouble report data from different programs
in order to increase sample size for reliability
prediction purposes.
3. Based on its application to NTDS data, the
approach for reliability prediction and quality
control which has been described appears
feasible. However, the methodology must be
validated against other test and operational
data. Several interactive programs, written in
the BASIC language, which utilize this approach, have been programmed at NELC*.
Another model by Jelinski and Moranda* has been
developed and validated against NTDS and NASA
data. Other approaches, such as reliability growth
models, multiple correlation and regression studies
and utilization of data smoothing techniques will be
undertaken as part of a continuing research program.
BIBLIOGRAPHY
1 R M BALZER
EXDAMS-extendable debugging and monitoring system
AFIPS Conference Proceedings Vol 34 Spring 1969
pp 567-580
2 W J CODY
Performance testing of function subroutines
AFIPS Conference Proceedings Vol 34 Spring 1969
pp 759-763
3 J C DICKSON et al
Quantitative analysis of software reliability
Proceedings-Annual Reliability and Maintainability
Symposium San Francisco California 25-27 January 1972
pp 148-157

* Programmed by Mr. Craig Becker of the Naval Electronics
Laboratory Center.
* Jelinski, Z. and Moranda, P. B., "Software Reliability Research," McDonnell Douglas Astronautics Company Paper WD1808, November 1971.


4 BERNARD ELSPAS et al
Software reliability
Computer January-February 1971 pp 21-27
5 ARNOLD F GOODMAN
The interface of computer science and statistics
Naval Research Logistics Quarterly Vol 18 No 2 1971
pp 215-229
6 K V HANFORD
Automatic generation of test cases
IBM Systems Journal Vol 9 No 4 1970 pp 242-256
7 Z JELINSKI P B MORANDA
Software reliability research
McDonnell Douglas Astronautics Company Paper
WD 1808 November 1971
8 JAMES C KING
Proving programs to be correct
IEEE Transactions on Computers Vol C-20 No 11
November 1971 pp 1331-1336


9 HENRY C LUCAS
Performance evaluation and monitoring
Computing Surveys Vol 3 No 3 September 1971
pp 79-91
10 R B MULOCK
Software reliability engineering
Proceedings-Annual Reliability and Maintainability Symposium San Francisco California 25-27 January 1972
pp 586-593
11 R J RUBEY R F HARTWICK
Quantitative measurement of program quality
Proceedings-1968 ACM National Conference pp 672-677
12 N F SCHNEIDEWIND
A methodology for software reliability prediction and quality
control
Naval Postgraduate School Report No NPS55SS72032B
March 1972

The impact of problem statement
languages on evaluating and
improving software performance
by ALAN MERTEN and DANIEL TEICHROEW
The University of Michigan
Ann Arbor, Michigan

INTRODUCTION

The need to improve the methods by which large software systems are constructed is becoming widely recognized. For example, in a recent study (Office of Management and Budget1) to improve the effectiveness of systems analysts and programmers, a project team stated that: "The most important way to improve the effectiveness of the government systems analysts and programmers is by reducing the TIME now spent on systems analysis, design, implementations, and maintenance while maintaining or improving the present level of ADP system quality."

As another example, a study group (U.S. Air Force2) concluded that to achieve the full potential of command and control systems in the 1980's would require research and development in the following aspects of system building: requirements analysis and design, software system certification, software timeliness and flexibility.

Software evaluation is one part, but only a part, of the total process of building information systems. The best way to make improvements is to examine the total process. This has been pointed out elsewhere (Teichroew and Sayani3) and by many others; Lehman,4 for example, states:

"When first introduced, computers did not make a significant impact on the commercial world. The real breakthrough came only in the 1950's when institutions stopped asking, 'Where can we use computers?' and started asking, 'How shall we conduct our business now that computers are available?'

In seeking to automate the programming process, the same error has been committed. The approach has been to seek possible applications of computers within the process as now practiced. ... Thus the problem of increasing programming effectiveness, through mechanization and tooling, is closely associated with the overall problem of methodology. Its solution calls for a review of the process itself, so that maximum benefit can be had from the use of computers."

This paper is concerned with one technique-problem statement languages-which is in accordance with the above philosophy as regards the system building system. They are an answer to the question: "How shall we conduct systems building now that computers are available?"

Problem statement languages are a class of languages designed to permit statement of requirements for information systems without stating the processing procedures that will eventually be used to achieve the desired results. Problem statement languages are used to formalize the definition of requirements, and problem statement analyzers can be used to aid in the verification of the requirement definition. These languages and analyzers have a potential for improving the total process of system building; this paper, however, will be limited to their role in software evaluation.

Software systems can be evaluated in three distinct ways. User organizations are primarily interested in whether the software produced performs the tasks for which it was intended. This first evaluation measure is referred to as the validity of the software. "Invalid" software is usually the result of inadequate communication between the user and the information system designers. Even in the presence of perfect communication, the software system often does not initially meet the specifications of the user. The second evaluation measure of software systems is their correctness. Software is correct if for a given input it produces the output implied by the user specifications. "Incorrect" software is usually the result of programming or coding errors.

A software system might be both valid and correct but still might be evaluated as being inefficient either by the user or the organization responsible for the computing facility.


This characteristic is termed performance
and is measured in terms of the amount of resources
required by a software package to produce a result. The
processing procedures and programs and files might be
poorly designed such that the system costs too much
to run and/or makes inefficient use of the computing
resources. Poor performance of software is usually the
result of inadequate attention to design or incorrect information on parameters that affect the amount of resources used.
Before discussing the value of problem statement
languages in software evaluation, the concept is described briefly. The impact on software validity is also
discussed. Once requirements in a problem statement
language are available, it becomes possible to provide
computer aids to the analysis and programming process
which reduce the possibility of introducing errors. This
impact of problem statement languages on software
correctness is discussed later in the paper. In general, a
given set of requirements can be implemented on a given
complement of hardware in more than one way and each
way may use a different amount of resources. The potential of problem statement languages to improve software performance is examined. Some preliminary conclusions from work to date on problem statement languages and analyzers related to software evaluation, and
the impact of software evaluation on the design and use
of problem statement languages and analyzers are discussed.
PROBLEM STATEMENT LANGUAGES AND
PROBLEM STATEMENT ANALYZERS
Problem statement languages were developed to permit the problem definer (i.e., the analyst or the user)
to express the requirements in a formal syntax. Examples of such languages are: a language developed by
Young and Kent;5 Information Algebra;6 TAG;7
ADS;8 and PSL. 9 All of these languages are designed to
allow the problem definer to document his needs at a
level above that appropriate to the programmer; i.e.,
the problem definer can concentrate on what he wants
without saying how these needs should be met.
A problem statement language is not a general-purpose programming language, nor, for that matter, is
it a programming language. A programming language is
one that can be used by a programmer to communicate
with a machine through an assembler or a compiler. A
problem statement language, on the other hand, is used
to communicate the need of the user to the analyst.
The problem statement language consequently must
be designed to express what is of interest to the user;
what outputs he wishes from the system, what data
elements they contain, and what formulas define their

values and what inputs are available. The user may
describe the computational procedures and/or decision
rules that must be used in determining the values of
certain intermediate or output values. In addition, the
user must be able to specify the parameters which determine the volume of inputs and outputs and the conditions (particularly those related to time) which
govern the production of output and the acceptance of
input.
These languages are designed to prevent the user
from specifying processing procedures; for example, the
user cannot use statements such as SORT (though he
is allowed to indicate the order in which outputs appear), and he cannot refer to physical files. In some
cases the languages are forms oriented. In these cases,
the analyst using the problem statement language communicates the requirements by filling out specific columns of the forms used for problem definition. Other
problem statement languages are free-form.
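As a purely schematic illustration of the difference (invented notation, not the actual syntax of PSL, ADS, or any of the other languages cited), a fragment of a requirement statement for a payroll output might read:

OUTPUT     PAYCHECK
  CONTAINS EMPLOYEE-NAME, NET-PAY
  DEFINED  NET-PAY = GROSS-PAY - TOTAL-DEDUCTIONS
  PRODUCED WEEKLY, BY FRIDAY
  VOLUME   ONE PER EMPLOYEE (PARAMETER NO-OF-EMPLOYEES)
INPUT      TIME-CARD
  CONTAINS EMPLOYEE-NAME, HOURS-WORKED
  RECEIVED WEEKLY
  VOLUME   ONE PER EMPLOYEE

The statement names the outputs, the data elements and formulas that define them, and the time and volume conditions, but it says nothing about files, sort order, or the programs that will eventually produce the paycheck.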
The difficulty of stating functional specifications in
many organizational systems has been well recognized.
(Vaughan10): "When a scientific problem is presented
to an analyst, the mathematical statement of the relationships that exist between the data elements of the
system is part and parcel of the problem statement.
This statement of element relationships is frequently
absent in the statement of a business problem. The
seeming lack of mathematical rigor has led us to assume
that the job of the business system designer is less complex than that of the scientific designer. Quite the contrary-the job of the business system designer is often
rendered impossible because the heart of the problem
statement is missing!
The fact that the relationships between some of the
data elements of a business problem cannot be stated
in conventional mathematical notation does not imply
that the relationships are any less important or rigorous
than the more familiar mathematical ones. These relationships form a cornerstone of any system analysis,
and the development of a problem statement notation
for business problems, similar to mathematical notation, could be of tremendous value."
Sometimes the lack of adequate functional specifications is taken fatalistically as a fact of life: An example
is the following taken from a recent IBM report:11
"In practice, however, the functional specifications
for a large project are seldom completely and consistently defined. Except for nicely posed mathematical requirements, the work of completely defining a functional
specification usually approximates the work of coding
the functions themselves. Therefore, such specifications
are often defined within a specific problem context, or
left for later detailed description. The detailed definition of functional specifications is usually unknown at


the outset of the project. Much of the final detail is to
be filled in by the coding process itself, based on a general idea of intentions. Hopefully, but not certainly,
the programmers will satisfy these intentions with the
code. Even so, a certain amount of rework can be expected through misunderstandings.
As a result of these logical deficiencies, a large programming project represents a significant management
problem, with many of the typical symptoms of having
to operate under conditions of incomplete and imperfect
information. The main content of a programming
project is logical, to be sure. But disaster awaits an approach that does not recognize illogical requirements
that are bound to arise."
Problem statement languages can reduce the existence of illogical requirements due to poor specification.
Despite the well-recognized need for formal methods of
stating requirements and the fact that a problem statement language was published by Young and Kent in
1958, such languages are not in wide use today. One
reason is that, until recently, there did not exist any
efficient means to analyze a problem definition given in
a problem statement language. Therefore, these languages were only used for the documentation of each
user's requirements. Under these conditions, it was difficult to justify the expense of stating the requirements
in this formal manner.
A problem statement language, therefore, is insufficient by itself. There must also be a formal procedure,
preferably a software package, that will manipulate a
problem statement for human use. Mathematics is a
language for humans: humans can, after some study,
learn to comprehend it and to use it. It is not at all
clear that the equivalent requirements language for
business will be manipulatable by humans, though obviously it must be understandable to them. Computer
manipulation is necessary because the problem is so
large, and a person can only look at one part at a time.
The number of parameters is too large and their interrelationship is too complex.
A problem statement language must have sufficient
structure to permit a problem statement to be analyzed
by a computer program, i.e., a problem statement
analyzer. The inputs and outputs of this program and
the data base that it maintains are intended to serve as
a central resource for all the various groups and individuals involved in the system building process.
Since usually more than one problem definer is required to develop requirements in an acceptable time
frame, there must be provision for someone who oversees the problem definition process to be able to identify
individual problem definitions and coordinate them;
this is done by the problem definition management.
One desirable feature of a system building process is to


identify system-wide requirements so as to eliminate
duplication of effort; this task is the responsibility of
the system definer. Also, since the problem definers
should use common data names there has to be some
standardization of their names and characteristics and
their definition (referred to here as "functions"). One
duty of the data administrator is to control this standardization. If statements made by the problem definer
are not in agreement as seen by the system definer or data administrator, he must receive feedback on his "errors" and be asked to correct these.
All of these capabilities must be incorporated in the
problem statement analyzer which accepts inputs in the
problem statement language and analyzes them for correct syntax and produces, among other reports, a comprehensive data dictionary and a function dictionary
which are helpful to the problem definer and the data
administrator. It performs static network analysis to
ensure the completeness of the derived relationships,
dynamic analysis to indicate the time-dependent relationships of the data, and an analysis of volume specifications. It provides the system definer with a structure
of the problem statement as a whole. All these analyses
are performed without regard to any computer implementation of the target information processing system.
When these analyses indicate a complete and error-free
statement of the problem, it is then available in two
forms for use in the succeeding phases. One, the problem statement itself becomes a permanent, machine-readable documentation of the requirements of the
target system as seen by the problem definer (not as seen
by the programmer). The second form is a coded statement for use in the physical systems design process to
produce the description of the target system as seen
by the programmer.
A survey of problem statement languages is given in
Teichroew. 12 A description of the Problem Statement
Languages (PSL) being developed at The University of
Michigan is given in Teichroew and Sibley.9 In this
paper the terms problem statement language and problem statement analyzer will refer to the general class
while PSL and PSA will be used to mean the specific
items being developed in the ISDOS Project at The
University of Michigan.
ROLE OF PROBLEM STATEMENT
LANGUAGES IN SOFTWARE VALIDITY
Definition of software validity

Often when software systems are completed, they do
not satisfy the requirement for which they were intended. There may be several reasons for this. The user


may claim that his requirements have not been satisfied. The systems analysts and programmers may
claim that the requirements were never stated precisely
or were changing throughout the development of the
system. If the requirements were not precisely and correctly stated, the analysts and programmer may produce
software which does not function correctly.
Software will be said to be valid if a correct, complete
and precise statement of requirements was communicated to the analysts and programmers. Software which
does not produce the correct result is said to be invalid
if the reason is an error or incompleteness in the specification.
Problem statement languages can increase software
validity by facilitating the elimination and detection
of logical errors by the user, by permitting the use of
the computer to detect logical errors of the clerical
type and by using the computer to carry out more
complex analysis than would be possible by manual
methods.
Elimination or detection of logical errors by the user

Problem statement languages and analyzers appear
to be one way to increase the communication between
the user organizations and the analysts and programmers. Usually organizations find it very difficult to distinguish between their requirements and various procedures for accomplishing those requirements. This difficulty is often the result of the fact that the user has had
some exposure to existing information processing procedures and attempts to use this knowledge of techniques to "aid" the analyst in the development of the new or modified information system.
The major purpose of problem statement languages
is to force the user to state only his requirements in a
manner which does not force a particular processing
procedure. Experience has shown that this requirement
of problem statement languages is initially often difficult to impose upon the user. Often, he is accustomed
to thinking of his data as stored in files and his processing requirements as defined in various programs written
in standard programming languages. He has begun to
think that his requirements are for programs and files
and not for outputs of the system.
There is an interesting parallel between the use of
problem statement languages and the use of data base
management systems. Many organizations have found
it difficult for users to no longer think in terms of
physical stored data, but to concentrate on the logical
structure of data and leave the physical storage to the
data base management system. In. the use of problem

statement languages the user must think only of the
logical data and the processing activities.
Initial attempts to encourage the use of problem
statement languages have indicated some reluctance on
the part of users to state only requirements, particularly if the "user" is accustomed to a programming
language. However, once the functional analysts (problem definers) become familiar with the problem statement technique and learn to use the output from the
problem statement analyzers, they find that they are
able to concentrate on the specification of input and
output requirements without having to be concerned
about the design and implementation aspects of the
physical systems. Similarly, output of problem statement analyzers can be used to aid the physical systems
designers in the selection of better processing procedures
and file organizations. The physical systems designer
has the opportunity to look at the processing requirements and data requirements of all the users of the
system and can select something that approaches global
optimality as opposed to a design which is good for only
one user.
One of the major problems in the design of software
systems is the inability of users to segment the problem
into small units to be attacked by different groups of
individuals. Even when the problem can be segmented,
the direct use of programming languages and/or data
management facilities requires a great amount of interaction between the user groups and the designers
throughout problem definition and design. Problem
statement languages allow the individual users to state
their requirements on a certain portion of the information system without having to be concerned with the
requirements definition of any other portion of the
information systems.
This requirement is slightly modified in organizations where there exists a data directory (i.e., a listing
of standard data names and their definitions). In this
case each of the users must define his requirements in
terms of standard names given in the data directory.
In this case it is the purpose of the problem statement
analyzer to check each problem statement to determine
if it is using the previously approved data names.
Besides the requirement to use standard data names,
the individual problem definers can proceed with their
problem definition in terms of their inputs, outputs, and
processing procedures without knowledge of related
data and processing activities. It is the purpose of the
problem statement analyzer to determine the logical
consistency of the various processing activities. It has
been found that the individual problem definer might
modify his statement of requirements upon receipt of
the output of the problem statement analyzer. At this


time he has an opportunity to see the relationship between his requirements and those of others.
Use of the problem statement analyzer to detect logical
errors of the clerical type

Given that requirements are stated precisely in a
problem statement language, it is possible to detect
many logical errors during the definition stage of system
development. Traditionally, these errors are not detected until much later, i.e., until actual programs and
files have been built and the first stages of testing have
been completed. Problem statement analyzers such as
the analyzers built for ADS and for PSL at The University of Michigan can detect errors such as computation and storage of a data item that is not required by
anyone, and the inputting of the same piece of data
from multiple sources. Extensions of these analyzers
will be able to detect more sophisticated errors. For
example, they might detect a requirement for an output requiring a particular item prior to the time at
which the item is input or can be computed from available input.
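A minimal sketch of one such check (our own illustration of the idea, not the ADS or PSL analyzer itself) flags any data item that some problem definer computes or stores but that no other statement ever requires:

program unused_item_check
   ! Cross-check two lists drawn from a problem statement: items that are
   ! computed or stored, and items that appear as inputs to some output
   ! or computation.  Illustrative names only.
   implicit none
   character(len=12) :: produced(4) = (/ 'GROSS-PAY   ', 'NET-PAY     ', &
                                         'DEDUCTIONS  ', 'YTD-TOTAL   ' /)
   character(len=12) :: used(3)     = (/ 'NET-PAY     ', 'GROSS-PAY   ', &
                                         'DEDUCTIONS  ' /)
   integer :: i, j
   logical :: found

   do i = 1, size(produced)
      found = .false.
      do j = 1, size(used)
         if (produced(i) == used(j)) found = .true.
      end do
      if (.not. found) print '(2a)', ' stored but never required: ', produced(i)
   end do
end program unused_item_check

The same pattern, run in the opposite direction, reports items that are required by some output but never supplied by any input or computation.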
The problem analyzer can be used to check for problem definition errors by presenting information to a
problem definer in a different order than it was initially
collected. A data directory enumerates all the places in
which a user-defined name appears. Another report
brings together all the places where a system parameter has been used to specify a volume. An analyst can,
by glancing through such lists, more readily detect a
suspicious usage.
Complex analysis of requirements

The use of the computer as described above is for
relatively routine clerical operations which could, in
theory, be done manually if sufficient time and patience
were available. The computer can also be used to carry
out analysis of more complicated types or at least provide an implementation of heuristic rules. Examples are
the ability to detect duplicate data names or to identify
synonyms. These capabilities require an analysis of
use and production of the various basic data items.
ROLE OF PROBLEM STATEMENT
LANGUAGES IN SOFTWARE
CORRECTNESS
Definition of software correctness

Software is said to be incorrect if it produces the wrong results even though the specification for which it was produced is valid and the hardware is working correctly. The process of producing correct programs can be divided into five major parts:
-designing an algorithm that will accomplish the requirements
-translating the algorithm into source language
-testing the resulting program to determine whether
it produces correct results
-debugging, i.e., locating the source of the errors
-making the necessary changes.
Current attempts to improve software correctness

Software incorrectness is a major cause of low programmer productivity. For example, Millsll states:
"There is a myth today that programming consists of
a little strategic thinking at the top (program design),
and a lot of coding at the bottom. But one small statistic is sufficient to explode that myth-the number of
debugged instructions coded per man-day in a programming project ranges from 5 to at most 25. The
coding time for these instructions cannot exceed more
than a few minutes of an eight-hour day. Then what
do programmers do with their remaining time? They
mostly debug. Programmers usually spend more time
debugging code than they do writing it. They are also
apt to spend even more time reworking code (and then
debugging that code) due to faulty logic or faulty communication with other programmers. In short, it is the
thinking errors, even more than the coding errors
which hold the productivity of programming to such
low levels."
It is therefore not surprising that considerable effort
has been expended to date to improve software correctness. Among the techniques being used or receiving attention are the following:
(1) Debugging aids. These include relatively simple

packages such as cross-reference listers and snapshot and trace routines described in EDP
Analyzer13 and more extensive systems such as
HELPER.17
(2) Testing aids. This category of aids includes
module testers (e.g., TESTMASTER) and test
data generators. For examples see EDP Analyzer13 and the survey by Modern Data.18
(3) Structured and modular programming. This
category consists of methodology standards and
software packages that will hopefully result in


fewer programming errors in the first two
phases-designing the algorithm and translating it to the source language code-and that they will be easier to find if they are made. (Armstrong,19 Baker,20 and Cheatham and Wegbreit.21)
(4) Automated programming. This category includes
methods for reducing the amount of programming that must be done manually by producing
software packages that automatically produce
source or object code. Examples are decision table processors and file maintenance systems.13,18
Role of problem statement languages and analyzers in software correctness

While the need for the aids mentioned above has been
recognized, there has been considerable resistance by
programmers to their use.13 Problem statement languages and problem statement analyzers can be of
considerable help in getting programmer acceptance of
such aids and in improving software correctness directly.
With the use of problem statement languages and
analyzers, the programmer gets specifications in a more
readable and understandable form and is, therefore, less
likely to misinterpret them. In addition, extensions to
existing analyzers could automatically produce source
language statements. These extensions would take
automatic programming methods to a natural limit
since a problem statement is at least theoretically all
the specification necessary to produce code. When the
specifications are expressed in a problem statement
language, the logical design of the system has effectively
been decoupled from the physical design. Consequently,
there is a much better opportunity to identify the
physical processing modules. Once identified, they have
to be programmed only once.
ROLE OF PROBLEM STATEMENT
LANGUAGES IN SOFTWARE
PERFORMANCE
Definition of software performance

Software which produces correct results in accordance
with valid specification may still be rejected by the
user(s) because it is too expensive or because it is not
competitive with other software or with non-computerized methods. The "performance" of software, however,
is a difficult concept to define.
One can give examples of performance measures in
particular cases. For example, compilers are often

evaluated in terms of the lines of source statements
that can be compiled per minute on a particular machine. File organizations and access software (e.g., indexed sequential and direct access) are evaluated with
respect to the rate at which data items can be retrieved.
Sometimes software is compared on the basis of how
computing time varies as a function of certain parameters. For example, matrix inversion routines are evaluated with respect to the relationship between process
time and size of the array. Similarly, SORT packages
are evaluated with respect to the time required to sort a
given number of records.
Current methods to improve software performance

Software performance is important because any given
piece of software is always in competition with the use
of some other method, whether computerized or not.
Considerable effort has been expended to develop
methods to improve performance.
Software packages have been developed to improve
the performance of programs stated in a general purpose
language, either by separate programs, e.g., STAGED
(OST) and CAPEX optimizer,13 or incorporated directly
into the compiler. Similar techniques are used in decision table processors.
A number of software packages developed can be
used to aid in the improvement of performance of existing or proposed software systems. The software packages include software simulators and software monitors.
Computer systems such as SCERT, SAM and
CASE14, 15, 16 can be used to measure the performance of
a software/hardware system on a set of user programs
and files. Another approach to improving performance
of software systems is to measure the performance of
the different components of an existing system either
through a software monitor or by inserting statements
in the program. The components which account for the
largest amount of time can then be reworked to improve performance.
Each of these software aids attempts to improve the
efficiency of a software system by modifying certain
local code or specific file organizations. What would be
more desirable is the ability to select the program and
file organizations that best support the processing requirements of the entire information system.
Role of problem statement languages in software performance

Problem statement languages and analyzers can be
used to improve the performance of software systems


even beyond that feasible with the aids outlined above.
Decisions involving the organization of files and the
design of the processing procedures can be made based
on the precise statement, and analysis, of the factors
which influence the performance of the computing system because the problem statement requires explicit
statement of time and volume information. The time
information specifies the time at which input is received
by the system or the time at which the specified output
is required from the system. For a scheduled or periodic
input, the time information specifies the time of the
day, month, or year, or relative to some other calendar.
For unscheduled or random input, the expected rate for
a fixed time interval is specified. The volume information consists of specification of the "size" of the input
or the output. For example, the number of time cards
or the number of paychecks. Volume information is
specified so that it is possible to determine the number
of characters of data that will be stored or processed
and moved between storage devices.
In order to design an efficient information system, the
analyst must consider the processing needs of the individual users in arriving at a structure for the data base.
Each of the data elements of the data base must be
considered, and an indication made of the processing
activities to be supported by the specific data. From this,
the system designers must determine the file organization, i.e., the physical mapping of the data onto secondary storage. Methods for maintaining that data must be
determined through the use of the time and volume information specified in the information system requirements.
Problem statement analyzers summarize, format and
display the time and volume information which is relevant to the file designers. Our experience with problem
statement languages seems to indicate that efficient
systems are designed in which a file organization is initially determined and then the program design is
undertaken. Problem statement languages are used to
state the processing procedures required to produce the
different output products or to process the various input
data. Problem statement analyzers must have the ability to aid the systems analyst to group various data and
also to group procedures in such a way as to minimize
the amount of transfer of data between primary memory and secondary memory. One of the outputs from
the problem statement analyzer such as the one for
PSL is an analysis of the processing procedures which
access the same or overlapping sets of data. These
processes can, in many cases, be grouped together to
form programs.
Currently software systems are often defined in
which a file organization is initially selected based on


some subset of the processing requirements. Following
this selection, additional processing requirements are
designed to "fit into" this initial file organization. As
problem statement analyzers become more powerful, it
will be possible to delay this decision concerning selection of a file organization until all the major processing
requirements have been specified.
REMARKS
To our knowledge there does not exist any definitive
study with a controlled experiment and collection of
data to answer questions such as "why does software
not produce the desired result; why does it not produce
the correct result; and why does it not use resources efficiently?" However, it is generally agreed that the
major causes include the following:
1. Errors in communication or misunderstandings
from those who determine whether final results
are valid, correct and produced efficiently to
those who design and build the system.
2. Difficulties in defining interfaces in the programming process. The serious errors are not in individual programs, but in ensuring that the output
of one program is consistent with the input to
another.
3. Inability to test for all conceivable errors.

Considerable effort has gone into developing methods
and packages to improve software; some of these have
been mentioned in this paper. The ISDOS Project at
The University of Michigan has been engaged in developing and testing several problem statement languages
and problem statement analyzers. This paper has been
concerned with the ways in which this use of problem
statement languages and problem statement analyzers
will lead to better software. Problem statement analyzers have been developed for ADS and for PSL at
The University of Michigan. The analyzers have been
used in conjunction with the development of certain
operational software systems as well as a teaching and
research tool in the University.
The ADS analyzer has been tested on both large and
small problems to determine its use in software evaluation and development. The analyzer is currently being
modified and extended to be used extensively by a
government organization. Initial research concerning
the installation of this analyzer within the organization
indicates that analyzers must be slightly modified to
interface with the procedures of the functional user.


Other portions of the analyzer appear to be able to be
used as they currently exist. Generally, it appears as if
analyzers will have a certain set of organizationally independent features and will have to be extended to
include a specific set of organizationally dependent
features for each organization in which they are used.


CONCLUSION


Many of the techniques being used or proposed to improve software performance are based on the current
methods of developing software. The problem statement language represents an attempt to change the
method of software development significantly by
specifying software requirements. This paper has attempted to demonstrate that the use of problem statement languages and analyzers could improve software
in terms of validity, correctness and performance.
The design of problem statement languages and the
design and construction of problem statement analyzers
are formidable research and development tasks. In some
sense the design task is similar to the design of standard
programming languages and the design and construction of compilers and other language processors. However, the task appears more formidable when one considers that these languages will be used by non-computer personnel and are producing output which must
be analyzed by these people.
The procedure by which these techniques are tested
and refined will probably be similar to the development
and acceptance of the "experimental" compilers and
operating systems of the 1950's and 1960's. These
techniques are directed at the specification of requirements for application software. This phase of the life
cycle of information systems has received the least
amount of attention to date. The development and use
of problem statement languages and analyzers can aid
this phase. As the languages are improved and extended,
their value as an aid to the entire process of software
development should be realized.

ACKNOWLEDGMENTS
This research was supported by U.S. Army contract
#010589 and U.S. Navy Contract # N00123-70-C2055.

The solution of the minimum cost flow and maximum flow
network problems using associative processing
by VINCENT A. ORLANDO and P. BRUCE BERRA*
Syracuse University
Syracuse, New York

INTRODUCTION

The minimum cost flow problem exists in many areas
of industry. The problem is defined as: given a network
composed of nodes and directed arcs with the arcs
having an upper capacity, lower capacity, and a cost
per unit of commodity transferred, find the maximum
flow at minimum cost between two specified nodes while
satisfying all relevant capacity constraints. The classical
maximum flow problem is a special case of the general
minimum cost flow problem in which all arc costs are
identical and the lower capacities of all arcs are zero.
The objective in this problem is also to find the maximum flow between two specific nodes. Algorithms exist
for the solution of these problems and are coded for
running on sequential computers. However, many parts
of both of these problems exhibit characteristics that
indicate it would be worthwhile to consider their solution by associative processors.
As used in this paper, an associative processor has
the minimum level capabilities of content addressability
and parallel arithmetic. The content addressability
property implies that all memory words are searched
in parallel and that retrieval is performed by content.
The parallel arithmetic property implies that arithmetic
operations are performed on all memory words
simultaneously.
In this paper, some background in associative
memories/processors and network flows is first provided. We then present our methodology for comparison
of sequential and associative algorithms through the
performance measures of storage requirements and
memory accesses. Finally we compare minimum cost
flow and maximum flow problems as they would be
solved on sequential computers and associative processors; and present our results.

ASSOCIATIVE MEMORIES/PROCESSORS1-5

The power of the associative memory lies in the
highly parallel manner in which it operates. Data are
stored in fixed length words as in conventional sequential processors, but are retrieved by content rather than
by hardware storage address. Content addressing can
take place by field within the storage word so, in effect,
each word represents an n-tuple or cluster of data and
the fields within each word are the elements, as illustrated in Figure 1. One of the ways in which accessing
can take place is in a word parallel, bit serial manner
in which all words in memory are read and simultaneously compared to the search criteria. This allows
the possibility of retrieving all words in which a specified
field satisfies a specified search criterion. These search
criteria include equality, inequality, maximum, minimum, greater than, greater than or equal to, less than,
less than or equal to, between limits, next higher and
next lower. Further, complex queries can be formed by
logically combining the above criteria. Boolean connectives include AND, inclusive OR, exclusive OR and
complement. Finally, any number of fields within the
word can be defined with no conceptual or practical
increase in complexity. That is, within the limitation of
the word length, any number of elements may be
defined within a cluster.
In addition to the capabilities already mentioned,
associative memories can be constructed to have the
ability of performing arithmetic operations simultaneously on a multiplicity of stored data words.
Devices of this type are generally called associative
processors. Basic operations that can be performed on
all words that satisfy some specified search criteria as
previously described include: add, subtract, multiply
or divide a constant relative to a given field; and add,
subtract, multiply or divide two fields and place the
result in a third field. This additional capability
extends the use of the associative processor to a large

* This research partially supported by RADC contract
AF 30602-70-C-0190 Large Scale Information Systems.


class of scientific problems in which similar operations
are repeated for a multiplicity of operands.
While various architectures exist for these devices
and they are often referred to by other names (parallel
processors, associative array processors), in this paper
we have adopted the term associative processor and
further assume that the minimum level capabilities of
the device include content addressing, parallel searching
and parallel arithmetic as described above.
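As a rough picture of these minimum capabilities, the following Python sketch is ours rather than any particular hardware or the emulator described later; it models a word-parallel content search over named fields and an arithmetic operation applied to every responding word (the field names and values are invented).

    # Minimal sketch of an associative (content-addressed) memory: each word is
    # a set of named fields; a Python loop stands in for the hardware's
    # simultaneous comparison of all words.
    words = [
        {"start": 1, "end": 2, "cost": 4, "flow": 0},
        {"start": 1, "end": 3, "cost": 2, "flow": 0},
        {"start": 2, "end": 3, "cost": 7, "flow": 0},
    ]

    def search(memory, field, predicate):
        # Content search: the responders are the words whose field satisfies
        # the search criterion.
        return [w for w in memory if predicate(w[field])]

    def add_constant(responders, field, constant):
        # Parallel arithmetic: add a constant to a field of every responder.
        for w in responders:
            w[field] += constant

    responders = search(words, "start", lambda v: v == 1)
    add_constant(responders, "flow", 5)
    print(len(responders), "responders", words)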


NETWORK DATA STRUCTURE
An example network is given in Figure 2, in which the
typical arc shown has associated with it a number of
attributes. The type and number of these attributes
depends upon the specific network problem being
solved but typically include the start node, end node,
length, capacity and cost per unit of commodity transferred. In solving problems, considerably more than
just the network definition attributes are required due
to additional arc related elements needed in the execution of the network algorithms. Included are items such
as node tags, dual variables, flow, bookkeeping bits,
etc. Thus, each arc represents an associative cluster of

Figure 2-Data structure for network problems (a typical arc carries the data cluster (A, B, C, D, E): A = start node, B = end node, C = length, D = capacity, E = cost)

data and hence can be stored within the associative
memory with a minimum of compatibility problems.
The above discussion applies to network problems in
general and served as the basis for research by the
authors6,7 into the use of associative processing in the
solution of the minimum path, assignment, transportation, minimum cost flow and maximum flow problems.
Results for all problems indicated that a significant
improvement in execution time can be achieved through
the use of the associative processor. The purpose of this
paper is to describe the details of the methodology used
and present the results obtained for the minimum cost
flow and maximum flow problems.

Figure 1-Associative memory layout (an array of words by bit fields; each word holds a data cluster (F1, F2, ..., Fn))

METHODOLOGY AND MEASURES OF PERFORMANCE

The general methodology followed in this research
was to first solve small problems on sequential computers in order to develop mathematical relationships
that could be used to extrapolate to large problems;
then to solve small problems on an associative processor
emulator to again generate data that could be used in
extrapolating to large problems and finally, to compare
the results. This methodology had the distinct advantage of obtaining meaningful data without having to


expend vast amounts of computer time in solving large
problems. Further, since we did not have a hardware
associative processor at our disposal, through the use
of the emulator, we were able to solve real problems in
the same way as with the hardware.
In order to compare the compiler level sequential
program to the emulated associative program, it was
first necessary to define some meaningful measures of
performance. It was considered desirable for these
measures to be implementation independent and yet
yield information on relative storage requirements and
execution times since these are the characteristics most
often considered in program evaluation. Measures
satisfying the above requirements which were used in
the performance comparisons of this research are storage
requirements and accesses.
The storage requirements measure is defined as the
number of storage words required to contain the
network problem data and any intermediate results. It
should be noted that the number of bits per storage
word would typically be greater for an associative
processor since word lengths of 200 to 300 bits are
typical of the hardware in existence or under consideration. However, word comparisons have the
advantage of being implementation independent while
providing a measure that is readily converted to the bit
level for specific comparisons in which the word lengths
of each machine are known. A determination of storage
requirements for the competing programs was accomplished by counting the size of the arrays for the
sequential programs and the number of emulator storage
words for the associative programs. In both cases we
assumed that enough memory was available to hold
the entire problem.
The storage accesses measure is defined as the number
of times that the problem data are accessed during
algorithm execution. Defined in this manner this
quantity is also implementation independent. However,
it should be noted that the ratio of sequential to
associative processor accesses is approximately equal to
the ratio of execution times that would be expected if
both algorithms were implemented on current examples
of their respective hardware processors. This is true
since the longer cycle time of the associative processor
is more than offset by the large number of machine
instructions represented by each of the sequential
accesses. Collection of data for this measure was
accomplished in the sequential programs by the addition of statements to count the number of times that
the array problem data were accessed. Only accesses to
original copies of the array data were included in this
count. That is, accessing of non-array constants that
temporarily contained problem data was not counted.
Data collection for the associative programs was


accomplished by a counter that incremented each time
the emulator was called.
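The counting itself is simple to picture; the fragment below is an illustrative Python stand-in for the counting statements that were added to the FORTRAN programs (the array contents are invented).

    # Illustrative sketch: count every access to a problem data array.
    class CountedArray:
        def __init__(self, data):
            self.data = list(data)
            self.accesses = 0
        def __getitem__(self, i):
            self.accesses += 1
            return self.data[i]
        def __setitem__(self, i, value):
            self.accesses += 1
            self.data[i] = value

    capacity = CountedArray([3, 5, 2, 8])
    total = sum(capacity[i] for i in range(4))   # four counted reads
    capacity[2] = 6                              # one counted write
    print(total, capacity.accesses)              # -> 18 5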
SEQUENTIAL ALGORITHM ANALYSIS
It was recognized that it would be highly desirable to
obtain the sequential algorithms from an impartial,
authoritative source, since this would tend to eliminate
the danger of inadvertently using poor algorithms and
thus obtaining results biased in favor of the associative
processor. A search of the literature indicated that these
requirements were perhaps best met by algorithms
published in the Collected Algorithms from the Association for Computing Machinery (CACM).8
While these algorithms may not be the "best" in
certain senses, they have the desirable property of being
readily available to members of the computer field.
Algorithm 336 is the only algorithm published in the CACM that solves the general minimum cost flow problem stated above.9 This algorithm is based on the Fulkerson out-of-kilter method,10 which is the only
approach available for a single phase solution to this
problem. That is, this method permits conversion of an
initial solution to a feasible solution (or indicates if
none exists) at the same time that optimization is
taking place. Other algorithms accomplish these tasks
in separate and distinct phases.
The single algorithm published by the CACM for the maximum flow problem is number 324,11 which is based on the Ford and Fulkerson method.12 This
method appears to be recognized as a reasonable
approach since it is consistently chosen for this problem
in textbooks on operations research.13,14
These sequential algorithms were implemented in FORTRAN IV and executed on the Syracuse University Computing Center IBM 360/50 to verify correctness of the implementations and to collect
performance data.
A detailed analysis of the logic for Algorithm 336
indicates that the access expressions for this program
are as follows
NAC_SEQ = 11 NARCS + SUM(i=1 to NB) NAI_SEQ(BR)i + SUM(j=1 to NN) NAI_SEQ(NON)j    (1)

where

NAI_SEQ(BR)i = N + 21 NLABi + 4 NONL1i + 13 NONL2i + 9 NAUGi - 30    (2)

NAI_SEQ(NON)j = 4 N + 19 NLABj + 4 NONL1j + 13 NONL2j + 4 NARCS + 9 NPLABj - 12    (3)


and

NAC = number of storage accesses required for problem solution
NAI(BR) = number of accesses in a flow augmenting iteration, called a breakthrough condition
NAI(NON) = number of accesses in an improved dual solution iteration, called a non-breakthrough condition
NB = number of breakthrough iterations in a problem
NN = number of non-breakthrough iterations in a problem
N = number of network nodes
NARCS = number of network arcs
NLAB = number of nodes labeled during an iteration
NONL1 = number of arcs examined during an iteration that have both nodes unlabeled
NONL2 = number of other arcs examined in an iteration that do not result in a labeled node
NPLAB = number of arcs with exactly one node labeled
NAUG = number of nodes on a flow augmenting path.

Note that the above expressions represent a nontypical best case for the sequential labeling process
since it is assumed that only one pass through the list
of nodes is required for the entire labeling process.
To simplify the above expressions, assume that all arcs processed which do not result in a labeled node are of type NONL1. This then makes NONL1 = NARCS - NLAB. Further assume that NPLAB takes on its average lower bound of NARCS/N. Both of these assumptions introduce further bias in favor of the sequential program. After making these substitutions, equations (2) and (3) become

NAI_SEQ(BR)i = N + 17 NLABi + 4 NARCS + 9 NAUGi - 30    (4)

NAI_SEQ(NON)j = 4 N + 15 NLABj + 8 NARCS + 9 NARCS/N - 12    (5)

In a similar manner, a best case access expression was developed for Algorithm 324 and is given as follows

NAI_SEQ = 3 N + 8 (NARCS/N)(NAUGi - 1) + 10 NAUGi + 4 NLABi - 16    (6)

where NAI = the number of accesses in an iteration.
The above access expressions were verified by comparing the predicted values with those obtained experimentally through actual execution of the programs.
ASSOCIATIVE ALGORITHM ANALYSIS
The out-of-kilter method described above was also
used as the basis for the associative processor algorithm
since it represents the only minimum cost flow method
available that is developed from a network rather than
a matrix orientation. The node tags which are used to
define the unsaturated path from source to sink are
patterned after the labeling method of Edmonds and
Karp as described in Hu.13 This selection was made to exploit the associative processor minimum search capability by finding the minimum excess capacity after the sink was reached, rather than performing a running comparison at each labeled node as in the original labeling method. For a discussion of the details of this development see Orlando.6
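A sketch of that minimum search, in hypothetical Python rather than the emulator's assembly-level statements, is given below; the arc values and field names are invented for illustration.

    # Sketch: one breakthrough step.  Each arc cluster carries capacity, flow and
    # a bookkeeping bit that marks whether the arc lies on the augmenting path
    # found by the labeling process.
    arcs = [
        {"start": "s", "end": "a", "cap": 9, "flow": 4, "on_path": True},
        {"start": "a", "end": "t", "cap": 6, "flow": 4, "on_path": True},
        {"start": "s", "end": "b", "cap": 5, "flow": 0, "on_path": False},
    ]

    # Associative minimum search over the responders (arcs marked on the path).
    excess = min(a["cap"] - a["flow"] for a in arcs if a["on_path"])

    # Parallel arithmetic: augment the flow on every path arc at once.
    for a in arcs:
        if a["on_path"]:
            a["flow"] += excess

    print("augmented by", excess, arcs)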
As indicated earlier, hardware implementation of the
developed algorithm was not possible since very few
associative processors are in existence and in particular
none was available for this research. To circumvent
this problem, as previously stated, a software interpretive associative processor emulator was developed
after extensive investigation of the programming
instruction format and search capabilities available on
the Rome Air Development Center (RADC) Associative Memory.15
Additional arithmetic capabilities expected to be
available on any associative processor were included in
the emulator. Thus, it had the basic properties of
content addressability, parallel search and parallel
arithmetic.
In operation, the associative network programs, composed primarily of calls to the emulator, are decoded and
executed one line at a time. Each execution, although
composed of many sequential operations, performs the
function of one associative processor statement written
at the assembly language level. The program for the
emulator was implemented in FORTRAN IV and
executed on the IBM 360/50. Complete details of the
capabilities and operation of the emulator and listings
of the associative emulator programs are contained
in Orlando. 6
The access expressions for the associative processor
program derived through a step by step analysis of the


logic are presented below. The terminology used is the
same as defined previously.
NAC_AP = 3 + SUM(i=1 to NB) NAI_AP(BR)i + SUM(j=1 to NN) NAI_AP(NON)j    (7)

where

NAI_AP(BR)i = 13 NLABi + 3 NAUGi + 20    (8)

NAI_AP(NON)j = 13 NLABj + 29    (9)

The above access expressions represent a worst case
for the algorithm logic since each step includes the
maximum amount of processing possible. That is, it
was assumed that the out-of-kilter arc detected always
belonged to the last case to be tested and that each
node used as a base point for labeling only resulted in
the labeling of one additional node.
The associative processor algorithm for the maximum
flow problem is based on the Ford & Fulkerson method12
with the modification of node labeling as described
above. A comparable worst case access expression for
this algorithm is
NAI_APi = 11 NLABi + 3 NAUGi    (10)

The above access expressions were also verified using
experimental data obtained from execution of the
emulated programs.

PERFORMANCE COMPARISON

The list orientation of the sequential program for the minimum cost flow problem imposes a requirement of 7 NARCS + N words for the storage of problem data. This is approximately seven times the NARCS + 1 storage words required by the associative processor program. However, since both programs store network data in the form of arc information, the above comparison is the same for all networks.
Access comparisons between the sequential and associative processor programs are made on an average per iteration basis. This eliminates the need to assume values for the number of breakthrough and non-breakthrough iterations needed for problem solution. This approach is valid in terms of total problem access requirements since both algorithms are based on the same method and would therefore require the same number of each type of iteration in the solution of the same problem. The main effect of this approach is to eliminate from the comparison the number of accesses required for problem initialization. From equation (1)
it is seen that 11 NARCS accesses are required by the
sequential program for this purpose while equation (7)
shows that the associative processor program requires
3 accesses for problem initialization regardless of network size. Thus, the comparison on an iteration basis
introduces an additional bias in favor of the sequential
program.
In order to avoid handling the breakthrough and nonbreakthrough cases separately, the comparison will be
made on the basis of an average of breakthrough and
non-breakthrough access requirements. That is, change
to mean values and define
NAI = [NAI(BR) + NAI(NON)] / 2    (11)

Experience with the algorithm indicates that in
general the majority of the problem iterations result
in non-breakthrough and therefore the average as
defined in equation (11) gives this case a smaller than
realistic weighting. A comparison of the iteration access
expressions, equations (4), (5), (8) and (9) indicate a
greater relative performance gain for the associative
processor in the breakthrough case. Therefore, the equal
weighting of the iteration types introduces additional
bias in favor of the sequential program.
Substitution of the access expressions in equation
(11) yields
NAI_SEQ = (1/2)(5 N + 32 NLAB + 12 NARCS + 9 NAUG + 9 NARCS/N - 42)    (12)

NAI_AP = (1/2)(26 NLAB + 3 NAUG + 49)    (13)


TABLE I-Minimum Cost Flow Access Performance Data

    N       NARCS     ASSOCIATIVE NAI   SEQUENTIAL NAI        R        D
   100         100          1,475             2,864         2.0      .01
             1,000                             8,304         5.6      .10
             6,000                            38,529        26.1      .61
            10,000                            62,706        42.5     1.00
   500         500          7,275            14,464         2.0      .002
             1,000                            17,468         2.4      .004
            10,000                            71,549         9.8      .04
           100,000                           612,359        84.2      .40
           150,000                           912,809       125.5      .60
           250,000                         1,513,709       208.1     1.00
 1,000       1,000         14,525            28,964         2.0      .001
            10,000                            83,004         5.7      .01
           100,000                           623,409        42.9      .10
           600,000                         3,625,659       249.7      .60
         1,000,000                         6,027,459       415.0     1.00

Figure 3-Minimum cost flow access requirements: accesses per iteration versus NARCS, the number of network arcs, for the sequential (SEQ) and associative processor (AP) programs at N = 100, 500 and 1000

Figure 4-Minimum cost flow access ratio: the ratio of sequential to associative accesses versus D, the network density

Figure 5-Maximum flow access requirements: accesses per iteration versus NARCS, the number of network arcs, for the sequential (SEQ) and associative processor (AP) programs

Figure 6-Maximum flow access ratio: the ratio of sequential to associative accesses versus D, the network density

Now, let NLAB = aN and NAUG = bN, which by definition imposes the constraint a, b ≤ 1. Making this substitution and forming the ratio of sequential to associative accesses yields

R = [NARCS(12 + 9/N) + N(5 + 32a + 9b) - 42] / [N(26a + 3b) + 49]    (14)

Since a, b ≤ 1, selecting a = b = 1 gives the most conservative assessment of the impact of the associative processor as applied to this problem. Recall that this implies that NLAB = NAUG = N. Substituting these values into equations (12), (13) and (14) yields

NAI_SEQ = NARCS(6 + 4.5/N) + 23 N - 21    (15)

NAI_AP = (1/2)(29 N + 49)    (16)

R = [NARCS(12 + 9/N) + 46 N - 42] / (29 N + 49)    (17)

The solution of these equations over a representative range of node and arc values results in the data of Table I, which are presented graphically in Figures 3 and 4. The associative processor access requirements are seen to remain constant with changes in the number of network arcs, reflecting the parallel manner in which the arc data are processed. As shown in Figure 4, the access ratio data of Table I are plotted against network density, which is defined as

D = NARCS / [N(N - 1)]    (18)
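As a quick check on these expressions, the short Python fragment below (ours, not part of the original study) evaluates equations (15) through (18) for a few of the node and arc counts that appear in Table I.

    # Evaluate the simplified access expressions for sample network sizes.
    def nai_seq(n, narcs):               # equation (15)
        return narcs * (6 + 4.5 / n) + 23 * n - 21

    def nai_ap(n):                       # equation (16)
        return 0.5 * (29 * n + 49)

    def access_ratio(n, narcs):          # equation (17)
        return (narcs * (12 + 9 / n) + 46 * n - 42) / (29 * n + 49)

    def density(n, narcs):               # equation (18)
        return narcs / (n * (n - 1))

    for n, narcs in [(100, 100), (100, 10000), (1000, 1000000)]:
        print(n, narcs, round(nai_seq(n, narcs)), round(nai_ap(n)),
              round(access_ratio(n, narcs), 1), round(density(n, narcs), 2))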

TABLE II-Maximum Flow Access Performance Data

    N       NARCS     ASSOCIATIVE NAI   SEQUENTIAL NAI        R        D
   100         100            625               926         1.5      .01
             1,000                             2,654         4.2      .10
             6,000                            12,254        19.6      .61
            10,000                            19,934        31.9     1.00
   500         500          3,125             4,726         1.5      .002
             1,000                             5,718         1.8      .004
            10,000                            23,574         7.5      .04
           100,000                           202,134        64.7      .40
           150,000                           301,334        96.4      .60
           250,000                           499,734       159.9     1.00
 1,000       1,000          6,250             9,476         1.5      .001
            10,000                            27,404         4.4      .01
           100,000                           206,684        33.1      .10
           600,000                         1,202,684       192.4      .60
         1,000,000                         1,999,484       319.9     1.00

Analysis of the preceding data indicates that the access ratio R lies in the range 2.0 ≤ R ≤ 0.4 N for N ≥ 100, depending upon the density of the network. Because of the approach used, this is an indication of a lower bound on the performance improvement afforded by the associative processor and values of R considerably greater than this bound would typically be expected.
An equivalent analysis6 for the maximum flow problem yields a sequential program storage requirement of 5(NARCS + 1) words against an associative requirement of (NARCS + 1) storage words.
Access expressions for this problem were determined to be

NAI_SEQ = NARCS(2 - 8/N) + 7.5 N - 16    (19)

NAI_AP = 6.25 N    (20)

Performance data resulting from these expressions, presented in Table II and Figures 5 and 6, indicate that 1.5 ≤ R ≤ 0.3 N for N ≥ 100.

SUMMARY
A comparison was made of the relative performance of
the associative processor to present sequential computers on the basis of storage requirements for problem
data and the number of times that these data were
accessed in the course of solving the minimum cost flow
and maximum flow problems. It was indicated that the
ratio of sequential to associative storage accesses gives
an approximate indication of the ratio of execution
times to be expected assuming typical hardware speeds
for each processor.
Sequential comparison data were obtained through
FORTRAN implementation of algorithms published
by the ACM as representing typical examples of
sequential solutions to these problems. Storage word
requirements were obtained directly from the program
declarations while access data were obtained by inserting counters to accumulate the number of times that
the problem data were accessed in the execution of the
sequential programs.
Flow diagrams for the associative processor solution
of these problems were developed based upon the
capabilities inherent in an associative processor. By
analyzing these diagrams it was possible to calculate
the number of memory words required for problem
data as well as the number of storage accesses required
in the execution of the algorithms. To test the correctness of the derived algorithms and verify the accuracy
of the access calculations, the algorithms were programmed in associative statements at the assembly


language level and executed on an interpretive emulator
program written in FORTRAN and run on the Syracuse
University Computing Center IBM 360/50. Emulation
was required since large scale examples of the associative
hardware are not yet available.
It was shown that the storage requirements for the
minimum cost flow and maximum flow problems were
7 NARCS + N and 5(NARCS + 1) words respectively,
where NARCS is the number of arcs and N is the
number of nodes in the network. The number of associative processor words was determined to be
NARCS + 1 in both cases. Considering the differences
in word lengths, both systems require approximately
the same amount of storage.
The access expressions for each of the competing
programs were simplified assuming a best case for the
sequential and a worst case for the associative processor. Under the stated assumption, the resulting
ratio ranges of 2.0 ≤ R ≤ 0.4 N and 1.5 ≤ R ≤ 0.3 N for N ≥ 100 represent a lower bound on the performance improvement to be expected through the application of the associative processor to the solution of the minimum cost flow and maximum flow problems respectively.
REFERENCES
1 A G HANLON
Content addressable and associative memory systems; a survey
IEEE Transactions on Electronic Computers August 1966
p509
2 J A RUDOLPH L C FULMER W C MEILANDER
The coming of age of the associative processor
Electronics February 1971

3 A WOLINSKY
Principles and applications of associative memories
Presented to the Third Annual Symposium on the Interface
of Computer Science and Statistics Los Angeles California
January 30 1969
4 J MINKER
Bibliography 25: An overview of associative or content-addressable memory systems and a KWIC index to the literature: 1956-1970
ACM Computing Reviews October 1971 p 453
5 W L MIRANKER
A survey of parallelism in numerical analysis
SIAM Review Vol 13 No 4 October 1971 p 524
6 V A ORLANDO
Associative processing in the solution of network problems
Unpublished doctoral dissertation Syracuse University
January 1972
7 V A ORLANDO P B BERRA
Associative processors in the solution of network problems
39th National ORSA Meeting May 1971
8 CACM
Collected algorithms from the communications of the association for computing machinery
ACM Looseleaf Service
9 T C BRAY C WITZGALL
Algorithm 336 netflow
Communications of the ACM September 1968 p 631
10 D R FULKERSON
An out-of-kilter method for the minimal cost flow problem
Journal of the SIAM March 1961 p 18
11 G BAYER
Algorithm 324 maxflow
Communications of the ACM February 1968 p 117
12 L R FORD D R FULKERSON
Flows in networks
Princeton University Press 1962
13 T C HU
Integer programming and network flows
Addison-Wesley 1969
14 H M WAGNER
Principles of operations research
Prentice-Hall Inc 1969
15 Manual GER 13738
Goodyear Aerospace Corporation

Minicomputer models for non-linear dynamic systems
by J. RAAMOT
Western Electric Company, Inc.
Princeton, New Jersey

INTRODUCTION

It is necessary to have some understanding of basic
integer arithmetic operations before the solution scheme
can be discussed. Therefore, the following sections
introduce the concepts of F-space and difference terms
which are used in integer arithmetic.

The computational methods of integer arithmetic have
been extended to a variety of applications since the
first publication.1,2 The most noteworthy application of
integer arithmetic is the calculation of numerical
solutions to initial value problems. This method is
introduced here with the example of the differential
equation:
dx/dt + x = 0    (1)

By substituting the variable y in place of the derivative, the equation becomes

y + x = 0    (2)

which represents a trajectory in the phase-plane. Given an initial solution point (x0, y0, t0), other solution points (x, y, t) are readily found by first solving the phase-plane equation and then computing the values of the variable t from rewriting the above equation as

t = -INTEGRAL dx/x    (3)

This example of finding solution points to an initial-value problem demonstrates the procedure which is used in integer arithmetic solutions. Other solution schemes avoid this procedure because in the general case the phase-plane trajectory cannot be expressed in the form f(x, y) = 0, and an incremental calculation of solution points (x, y) builds up errors.
The major contribution here is that with integer arithmetic techniques, the points (x, y) along the phase-plane trajectory can be calculated with no accumulation of error in incremental calculations, even though the trajectory cannot be expressed in a closed form. As a result, this method handles with equal ease all initial-value problems without making distinctions as to non-linearity, order of differential equation, and homogeneity.
F-SPACE SURFACES
A common problem in mathematics is to find the
roots of an expression: Given some f(x) the task is to
find the values of x which satisfy the equationf(x) =0.
These roots can be obtained by a method of trial and
error where successive values of x are chosen until the
equation is satisfied. A simpler method is to introduce
the additional variable y, and to find the points on
y=f(x) where the contour crosses the x-axis.
This technique of introducing one additional variable
is central to operations of integer arithmetic. In the
two-dimensional case, a contour f(x, y) =0 is the intersection of the surface F=f(x, y) with the xy;..plane.
Here F is the additional variable and is denoted by a
capital letter in order to develop a simple notation for
subsequent operations. This three-space is called
F -space. It can be created for any dimensionality as is
indicated in the table in Figure l.
Integer arithmetic is not concerned with an analytic
characterization of F -space surfaces, but with a set of
solution points (F, x, y) at integer points (x, y). The
integer points are established by scaling the variables
so that unity represents their smallest significant incre-'
ment over a finite range.
In mathematical calculations the use of integer cal~
culations is avoided because each multiplication may
double the number of digits which have to be retained,
and the resultant numbers tend to become impractically
large. This does not happen in integer arithmetic because the values of F are evaluated at adj acent integer
points (x, y) and are expressed as differences. Thereby
multiplication is avoided and addition of the differences
is the only mathematical operation that is used.



order to distinguish it from finite differences in difference equations, from slack variables in linear programming, and from partial derivative notation,
because F x has a relationship to all of these but differs
in its interpretation and use.
The notation for indicating the direction of incrementation is to have F x or F -x, In addition, the difference between successive difference terms is the second
difference term in x and is defined by the identity:

INTEGER ARITHMETIC

DIMENSION

SOLUTION TO
EQUATION

F-SPACE
EQUATION

f(x)=O

y=f(x)
CONTOUR

ROOTS

2

3

f(x,y) =0

F=f(x,y)

CONTOUR

SURFACE

flx,y)=O
y=O

Fl =f(x,y)
F2=y

ROOTS

CONTOUR

f(x,y,z)=O

F=f(x,y,z)

SURFACE

SURFACE

(7)

Figure I-Table of solutions to equations which are obtained by
operations in F -space

In an initial-value problem, the problem is to follow
the phase-plane trajectory from an initial starting point.
There are various integer arithmetic contour-following
algorithms which are based on the sign, magnitude, or a
bound on the F -value. To use anyone of these algorithms
it is necessary to specify the start point (F, x, y), the
direction, and the difference terms. To apply these
contour-following algorithms to a phase-plane trajectory, the difference terms have to be established. This
forms the major portion of the computation.
There are two methods for finding the difference
terms and both are illustrated here for the equation of
the circle
x2+y2=r2
(8)
I t forms the F-space surface

The F-space solutions are exact for polynomials
f(x, y) =0 at all integer points (x, y). Also the F-space
surface F = f(x, y) is single-valued in F. Therefore, there

F =r2- (X 2+y2)

(9)

which is a paraboloid of revolution and is illustrated in

is no error in calculating successive solution points
based on differences, and the solution at anyone point
does not depend on the path chosen to get there. These
properties do not hold for non-polynomial functions
(e.g., exponentials) , but there the F -space surface
points can be guaranteed to be accurate to within one
unit in x and y over a finite range.!

F

DIFFERENCE TERMS
Let the value of the variable F at a point (x, y, ... )
be F and be F Ix-tl at (x+l, y, ... ). Then the first
difference term in x is defined by the identity

15

(4)

10

I t is the change in the F-variable on x-incrementation.
This identity can be rewritten for the F -space surface

5

F=f(x, y)

(5)

o

-~---x

as
F =f(x±l
x

,

y

) -f(

x, Y

)

= L..J
~ (±l)n
8n]'(x, y)
,
8 n
n=!

n.

x

(6)

The notation F x is chosen for difference terms in

Figure 2-The F-space surface of the circle, F
a paraboloid of revolution

=

25 - (X2+y2) is

Minicomputer Models for Non-Linear Dynamic Systems

Figure 2. The intersection of this surface with the
xy-plane forms the circle of radius r.
Case 1: Given F = f(x, y)

In this case the difference terms are established from the identity:

F±x = F|x±1 - F = f(x±1, y) - f(x, y) = ∓(2x±1)    (10)

Likewise, it can be demonstrated that the first y difference term is

F±y = ∓(2y±1)    (11)

Both difference terms for the circle are illustrated in Figure 3.

Case 2: Given dy/dx

Based on the definition of the derivative, it can be shown that the derivative at a point (x, y) is bounded by the ratio of difference terms at adjacent integer points. Thus, at a point (x, y) the derivative is a good approximation of the ratio of difference terms, and vice versa:

dy/dx ≈ -Fx/Fy    (12)

The exact difference terms are found from this approximation by requiring the F-space surface to be single-valued in F. This requirement results in the identities

Fx = -F-(x+1)    (13)

and

Fy = -F-(y+1)    (14)

which state that the difference in F-values between two adjacent integer points does not depend on the direction.
In the example of the circle, the derivative is

dy/dx = -x/y    (15)

For this given derivative equations (13) and (14) are satisfied only by the introduction of additional terms, here constants c, such that

x + c = (x+1) - c    (16)

and

2c = 1    (17)

The resultant ratio of the difference terms is

Fx/Fy = (2x+1)/(2y+1)    (18)

and can be written as

F±x/F±y = ∓(2x±1) / ∓(2y±1)    (19)

This result is identical to the one in case 1.
To summarize, in case 2 the function f(x, y) = 0 was not given but its derivative was. This is sufficient to calculate the correct difference terms. It is easy enough to verify that incremental calculation of solutions (F, x, y) based on the difference terms F±x and F±y is exact, accumulates no errors, and represents integer solution points on a single-valued surface in F-space. The reader can also easily verify that the intersection of the F-space surface with the xy-plane is the given circle.

Figure 3-Integer arithmetic difference terms are the changes in the F variable between adjacent integer points in the xy-plane. Successive points are selected to be along the circle but not necessarily on the circle
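The exactness claimed for the circle can be checked in a few lines; the fragment below is an illustrative Python sketch (the original programs were PDP-15 assembly language) that updates F only through the difference terms of equations (10) and (11) and compares the result with direct evaluation of F = r^2 - (x^2 + y^2).

    # Incremental evaluation of the circle's F-space surface.
    r = 5
    x, y = 5, 0                               # a start point on the circle, F = 0
    F = r * r - (x * x + y * y)
    for step in ["-x"] * 3 + ["+y"] * 4:      # an arbitrary walk over integer points
        if step == "+x":
            F += -(2 * x + 1); x += 1
        elif step == "-x":
            F += (2 * x - 1); x -= 1
        elif step == "+y":
            F += -(2 * y + 1); y += 1
        else:                                 # "-y"
            F += (2 * y - 1); y -= 1
        assert F == r * r - (x * x + y * y)   # exact at every integer point
    print(x, y, F)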


SECOND ORDER DIFFERENTIAL EQUATIONS
A general second order non-linear differential equation is represented by the equations
dx/dt = y    (20)

dy/dt + g(x, y) = 0    (21)

The first step in finding numerical solutions to the equations is to calculate the difference terms Fx and Fy. Their ratio is approximately

Fx/Fy ≈ -dy/dx = -(dy/dt)(dt/dx) = g(x, y)/y    (22)

which can be written immediately from the above equations.
The exact values of the difference terms must satisfy the identities of equations (13) and (14). These identities are formed by adding appropriate additional terms, g'(x, y), to both sides. The resultant difference term then is

Fx = g(x, y) + g'(x, y)    (23)

In a similar fashion, the difference terms

Fy = y + c    (24)

and

-F-(y+1) = (y+1) - c    (25)

are identical if c = 1/2. It is not practical to reduce further the ratio of exact difference terms in the general case. Later, specific examples will illustrate this technique.
Given the exact difference terms and the initial values, then any one of the integer arithmetic contour-following algorithms can be applied to find adjacent integer points (x, y) along the phase-plane trajectory without accumulating errors, and the points (x, y) are guaranteed to be accurate to within unity (which is scaled as the least significant increment in the calculations).
Successive solution points are calculated by incrementing one variable in the phase-plane and adding the corresponding difference term to F. In general, the solution points (F, x, y) are on the F-space surface but are not contained in the phase-plane. The important result is that there is no accumulation of errors in the incremental calculation of solution points on the F-space surface.
Errors are introduced in relating the F-space surface points to the phase-plane trajectory, but the contour-following algorithms can always guarantee that these errors are less than unity.
One example of a contour-following algorithm based on the sign is the following: Given an initial direction vector, then the choice of the next increment in the phase-plane is the one which has the difference term sign opposite that from the F-value. If both difference terms and F have the same sign, then the direction is changed to an adjacent quadrant and an increment is taken along the direction axis traversed. Subsequently, the choice of either a positive or negative t-increment determines whether the new direction is acceptable or another change of direction has to be made.
Values of the variable t are obtained by any one of the two following integration steps. Either

t = INTEGRAL dx/y ≈ SUM 1/Fy    (26)

or

t = -INTEGRAL dy/g(x, y) ≈ SUM 1/Fx    (27)

will result in the same value of the variable t. Here the integration is approximated by an incremental summation over either x or y increments.
The error in t-increments becomes large whenever the difference term in the denominator of the summation becomes small. This problem is avoided by choosing the summation which contains the largest difference term.
In order for t to increase in the positive sense, the direction of incrementation in the phase-plane is chosen to make the product of the x-direction vector and y-value be positive. Otherwise, t increases in the negative sense. This result is derived from equations (20) through (22). Thereby, the solution method is complete.
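To make the contour-following idea concrete, the following Python sketch (ours, not the author's program) tracks one quadrant of the circle of equation (8) with a simple magnitude rule: of the two admissible increments it always takes the one that leaves the magnitude of F smallest, and F is updated only through the difference terms of equations (10) and (11).

    # Magnitude-based contour following of x^2 + y^2 = r^2 from (0, r) to (r, 0).
    # Only the difference terms F(+x) = -(2x+1) and F(-y) = (2y-1) are added,
    # never the full expression; the point stays within about one unit of the circle.
    r = 20
    x, y, F = 0, r, 0
    path = [(x, y)]
    while y > 0:
        f_step_x = F - (2 * x + 1)            # F after incrementing x
        f_step_y = F + (2 * y - 1)            # F after decrementing y
        if abs(f_step_x) <= abs(f_step_y):
            x, F = x + 1, f_step_x
        else:
            y, F = y - 1, f_step_y
        path.append((x, y))
        assert abs(x * x + y * y - r * r) <= 2 * r + 1   # bounded departure from the circle
    print(len(path), path[:3], path[-3:])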

EXAMPLE 1
The integer arithmetic method of finding numerical
solutions to differential equations is illustrated here by
the example of a second order linear differential equation. It has easily derivable parametric solutions in t
but its phase-plane trajectory cannot be expressed as
f(x, y) =0. This initial value problem is stated as
dx/dt = y    (28)

dy/dt + ky + w^2 x = 0    (29)

with initial values of

(x0, y0, t0) = (0, 20, 0)    (30)


Its parametric solution depends on the values chosen for the constants k and w^2. The values chosen here are k = .08 and w^2 = .04, which result in the parametric solution

x = 104.9 e^(-.04t) sin 0.196t    (31)

and

y = -4.19 e^(-.04t) sin 0.196t + 20.0 e^(-.04t) cos 0.196t    (32)

These calculations apply only for the analytic solution and not in the integer arithmetic solution scheme. There, the first step is to establish the difference terms from the given equations (28) and (29).
According to equation (12) the approximate ratio of the difference terms is

Fx/Fy ≈ (ky + w^2 x)/y    (33)

The requirement that the F-space surface is a single-valued surface, as stated in equations (13) and (14), is applied to obtain the exact difference terms

Fx = ky + w^2 (2x+1)/2    (34)

and

Fy = (2y+1)/2    (35)

Then the ratio of the terms is multiplied by 2n/2n where n is an appropriate integer to eliminate the fractions. The choice of direction in the phase-plane for positive t establishes that the x, y-direction vector is
Figure 4-The phase-plane trajectory of the linear differential equation discussed in example 1, for the initial values (x, y, t) = (0, 20, 0). The integer arithmetic solutions are displayed on a CRT

Figure 5-Top: the integer arithmetic solutions (x, t) as displayed on a CRT for the trajectory of Figure 4. Bottom: a CRT display of the integer arithmetic solutions x as calculated in real time. The time axis is 5 msec/cm

(1, -1), and the difference terms are:

F±x = ±n[2ky + w^2 (2x±1)]    (36)

F±y = ±n(2y±1)    (37)

The resultant phase-plane trajectory is illustrated in
Figure 4 and the numerical results (x, t) are compared
with the calculation of values of x in real time in Figure
5. The peak to peak values of x are equal to 114 increments which corresponds to a 1 percent accuracy. As
can be seen, the first cycle is calculated in 25 milliseconds. A comparison of the numerical with the
analytic solution confirms that all points (x, y) are
unit distant from the true trajectory.


The above example illustrates how the incremental calculations are set up for the second order differential equation. The numerical solutions (x, y, t) are obtained by application of the integer arithmetic calculation, even though there is no closed expression f(x, y) = 0 for the trajectory.

SCALING

In many problems it is necessary to scale the variables to obtain either an improved or a coarser resolution. Such scaling is best illustrated by the example of the circle given in equation (9).
First, an improved resolution in x only is obtained by taking increments of 1/n units where n is an integer and integer calculations are retained. Then,

Fnx = -[(x + 1/n)^2 - x^2] = -[n^-2 (nx+1)^2 - x^2] = -n^-2 (2nx + 1)    (38)

On multiplying F by n^2, the difference terms are

Fnx = -(2nx + 1)    (39)

and

Fy = -n^2 (2y + 1)    (40)

The other scaling example takes n increments in x at a time. Then,

Fx/n = -[(x + n)^2 - x^2] = -[n^2 (x/n + 1)^2 - x^2] = -n^2 (2x/n + 1)    (41)

and the y difference remains

Fy = -(2y + 1)    (42)

The last step in scaling is to substitute a new variable for nx or x/n respectively in the above examples, and to proceed with integer calculations.

EXAMPLE 2

The earlier example of the integer arithmetic solution scheme represented a linear differential equation with parametric solutions. For this example is chosen the van der Pol equation

d^2x/dt^2 + e(x^2 - 1) dx/dt + x = 0    (43)

The phase-plane trajectory of this equation has a stable limit cycle with a radius of 2 for the constant e > 0. Near the limit cycle, it is necessary to scale the problem to obtain an improved resolution of integer solution points. This is done by rewriting the van der Pol equation with the new variables x' and y'

x = x'    (44)

dx'/dt = y'    (45)

and by taking increments of 1/n units in x', 1/m units in y', and replacing e by e/k. Then the difference terms are computed, and the variable nx' is replaced by x and my' is replaced by y. The resultant difference terms are:

F±x = m[±e(6x^2 ± 6x + 2 - 6n^2) y + nmk(±6x + 3)]    (46)

and

(47)

For m = n = 10 and e = k = 1, the point (x, y, F) = (0, 21, -83840) is located in F-space on the limit cycle F-space surface. Based on the difference terms, an incremental calculation of solution points along the limit cycle returns to the same start point. This confirms that there is no accumulation of errors in the incremental calculations.
Likewise, for start points chosen both inside and outside the limit cycle, results agree with the expected trajectories in that all resultant trajectories terminate with the limit cycle. These results are in complete agreement with published data3 for values of the constant e = 0.1, 1.0, and 10.
The variable t is calculated from either summation:

t = SUM over x of K/Fy    (48)

or

t = SUM over y of K/Fx    (49)

It should be remembered that x, y, and F have been rescaled and correspondingly also the numerator in these summations is scaled to K. It is given by the equation

(50)

for both the x and y incrementation sum.
If the forcing function 5 sin 2.5t is applied to the van der Pol equation, then the difference term in x becomes

F±x = m[±e(6x^2 ± 6x + 2 - 6n^2) y + mnk(±6x + 3) ∓ 6mn^2 k (5 sin 2.5t)]    (51)

and the y difference term remains unchanged. In this instance, again the results are in complete agreement with published data.4
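The effect of the 1/n scaling can also be checked numerically; the fragment below is an illustrative Python sketch based on equations (38) through (40), with n and r chosen arbitrarily.

    # Scaling the circle F = r^2 - x^2 - y^2 to steps of 1/n in x:
    # with u = n*x and F' = n^2 * F, the difference terms stay integer valued.
    n, r = 10, 5
    u, y = 0, 0                              # u = n*x, so a step in u is a step of 1/n in x
    Fp = n * n * (r * r)                     # F' = n^2 * F at the origin
    for _ in range(12):                      # twelve fine steps in x
        Fp += -(2 * u + 1); u += 1           # equation (39)
    Fp += -n * n * (2 * y + 1); y += 1       # equation (40): one unit step in y
    assert Fp == n * n * r * r - u * u - n * n * y * y   # still exact
    print(u, y, Fp)                          # -> 12 1 2256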


HIGH-ORDER SYSTEMS
An initial value problem can be written as:
dx1/dt = f1(x1, x2, ..., xn, t)
dx2/dt = f2(x1, x2, ..., xn, t)
...
dxn/dt = fn(x1, x2, ..., xn, t)    (52)

The numerical solutions (x1, x2, ..., xn, t) are found by taking the set of equations in the ratios:

dx2/dx1, dx3/dx2, ...    (53)

Each ratio represents a phase-plane trajectory for which
the difference terms can be established.
An increment in x2 in the first phase-plane trajectory
also corresponds to an increment in x2 in the second
phase-plane trajectory, which in turn may result in x3
being incremented. Whether or not x3 is incremented
depends on the particular integer arithmetic contourfollowing algorithm which is used. For example, the
algorithm based on the sign forces an x3 increment


when the value of the difference term for that variable
has the opposite sign of the current F-value.
These coupled trajectories are illustrated in Figure 6
for the simple equation

d^4x/dt^4 - (1/8) x = 0    (54)

given the initial values (x1, x2, x3, x4, t) = (128, -64, 32, -16, 0). A general algorithm for coupled trajectories
is shown in the block diagram of Figure 7. As can be
seen, an increment in the first variable may immediately
ripple through to an incrementation in the n-th variable,
after which, starting from the end of the chain, each
traj ectory achieves a stable solution as determined by
the contour following algorithm.
The variable t can be calculated from any one phase-plane trajectory, each resulting in the same value but being consistent with the resolution of computation.

Figure 6-Coupled phase-plane trajectories for the equation d^4x/dt^4 - (1/8)x = 0. The variable t is obtained from the first trajectory and is scaled to T = 128 at x1 = 1 unit from termination

Figure 7-Flow chart of an integer arithmetic algorithm for tracking coupled trajectories, showing the calculations for the i-th variable. SUBROUTINE Ai calculates Fxi and Fx(i+1); SUBROUTINE Bi chooses the next increment or operation; SUBROUTINE Ci updates all variables for an xi increment
CONCLUSIONS
The integer arithmetic solution method has been applied
to a variety of initial-value problems, of which representative examples are illustrated above. Associated


with this method are a number of theorems. These
prove that the F -space surface is single-valued in F,
that the direction field is bounded by the ratio of
difference terms, that some trajectories have integer
F-space solutions at all integer points in the phaseplane, and that for other trajectories the F-space
surface is approximated, but the accuracy of results is
guaranteed over a finite domain. However, additional
theorems remain to be developed to insure that the
method is applicable to all initial-value problems, and
to determine the necessary conditions for stability.
The solution method is summarized as follows:
Successive solution points along a phase-plane trajectory are calculated by adding a difference term to the
F-value and incrementing the associated phase-plane
variable. These simple operations are offset by the more
complex contour-following algorithms which track the
trajectory by examining the state of calculations and
then selecting the next increment. Here the underlying
concept is that the trajectory is the contour formed by
the intersection of the F-space surface with the phaseplane.
There exists a duality between the integer arithmetic
technique and the standard Runge-Kutta or predictor-corrector solution methods. In integer arithmetic, the
phase-plane variables are the independent variables and
t is a dependent variable obtained as a result of integration. Just the reverse is true in the standard
methods; t is an independent variable and the phaseplane variables are obtained as a result of integration.
The integer arithmetic technique finds solution points
on the phase-plane trajectory even though there may
not exist an analytical expression of that trajectory.
Likewise, the standard method finds solution points of
integrals which cannot be expressed in analytic form.
An additional duality is that after initial scaling, the
integer arithmetic solutions have a guaranteed accuracy
whereas the standard methods require a subsequent
accuracy calculation.
The computations involved in the integer arithmetic

method are simpler than the ones in other methods:
The examples illustrated in this paper were programmed
in assembly language for the Digital Equipment
Corporation PDP-15 computer. It has only 4096 words
of store and does not have a hardware multiplier. The
entire program is contained in 300 words of store and is
executed in 50 microseconds per increment in x or y,
including the time calculation. It is difficult to execute
any other solution scheme within such limited facilities
or comparable speed.
Other examples have been programmed in FORTRAN
on a large PDP-10 computer. There the execution time
is 10 times slower, and is comparable to the standard
numerical integration methods. In these examples,
floating point calculations were used for the integer
arithmetic calculations.
ACKNOWLEDGMENTS
The development of this method resulted from the
application of integer arithmetic techniques at the
Western Electric Engineering Research Center in
Princeton. Also, there are substantial contributions by
J. E. Gorman in formulating the integer arithmetic
techniques.
REFERENCES
1 J E GORMAN J RAAMOT
Integer arithmetic technique for digital control computers
Computer Design Vol 9 No 7 pp 51-57 July 1970
2 A G GROSS et al
Computer systems for pattern generator control
The Bell System Technical Journal Vol 49 No 9
pp 2011-2029 November 1970

3 L BRAND
Differential and difference equations
Wiley 1966 New York
4 L LAPIDUS R LUUS
Optimal control of engineering processes
Blaisdell Waltham 1967

Fault insertion techniques and models
for digital logic simulation
by STEPHEN A. SZYGENDA and EDWARD W. THOMPSON
Southern Methodist University
Dallas, Texas

INTRODUCTION

During the past few years it has become increasingly apparent that in order to design and develop highly reliable and maintainable digital logic systems it is necessary to be able to accurately simulate those systems. Not only is it necessary to be able to simulate a logic net as it was intended to behave, but it is also necessary to be able to model or simulate the behavior of the logic net when it contains a physical defect. (The representation of a physical defect is known as a fault.) The behavioral simulation of a digital logic net which contains a physical defect, or fault, is known as digital fault simulation. 1-6

In the past, two methods have been used to determine the behavior of a faulty logic net. The first approach was manual fault simulation. 7 (For logic nets of even moderate size, this method is slow and often inaccurate.) The second method used is physical fault insertion. 7 In this method faults are physically placed in the fabricated logic, input stimuli are applied, and the behavior of the logic net is observed. Although physical fault insertion is more accurate than manual fault simulation, it is still a lengthy process and requires hardware fabrication. The most serious limitation, however, is that physical fault insertion is dependent on a fabrication technology which permits access to the input and output pins of logic elements such as AND gates and OR gates. With discrete logic this is possible; however, the use of MSI and LSI precludes the process of physical fault insertion. Since MSI and LSI will be used in the future for the majority of large digital systems, the importance of digital fault simulation can be readily observed.

The major objective of digital fault simulation is to provide a user tool by which the behavior of a given digital logic design can be observed when a set of stimuli is applied to the fabricated design and a physical defect exists in the circuit. This tool can then be used to validate fault detection or diagnostic tests, to create a fault dictionary, to aid in the automatic generation of diagnostic tests, or to help in the design of diagnosable logic.

The activities of a digital fault simulation system can be divided into two major areas. The first is the basic simulator, which simulates the fault free logic net. The activities of the second part are grouped under the heading of program fault insertion (for digital fault simulation, as opposed to physical fault insertion).

The merit of the fault insertion activities can be judged on five points. These are:

(1) Accuracy with which faults can be simulated.
(2) Different fault models that can be accommodated.
(3) Methods for enumerating faults to be inserted.
(4) Extraction of information to be used for fault detection or isolation.
(5) Efficiency and capability of handling large numbers of faults.
ACCURACY OF FAULT SIMULATION

In order to accurately predict the behavior of a logic net which contains a fault, the basic simulation used must be capable of race and hazard analysis. A simple example of this is shown in Figure 1. In this example an AND gate has three inputs, a minimum delay of 3, and a maximum delay of 4. At time T2 signal A starts changing from 1 to 0 and at the same time signal B starts changing from 0 to 1. The period of ambiguity for the signals is 3. For the fault free case, signal C remains constant at 0. Therefore, the output of the gate remains constant regardless of the activity on signals A and B. If signal C has a stuck-at-logical 1 fault, there is potential for a hazard on the output of the gate between time T5 and T9. This hazard will not
Figure 1-Fault induced potential error

be seen unless the fault insertion is done in conjunction
with simulation that is capable of detecting such a
hazard.
The fault insertion to be discussed here is done in conjunction with the TEGAS2 system. 8,9 TEGAS2 is a table-driven, assignable-delay simulator which has eight basic modes of operation, three of which are concerned with faults. In the first mode, each element type is assigned an average or nominal propagation delay time. This is the fastest mode of operation, but it performs no race or hazard analysis. Mode 2 is the same as mode 1, except that it carries a third value which indicates whether a signal is indeterminate. The third mode has a minimum and a maximum propagation delay time associated with each element type and performs race and hazard analysis. All three modes can use differing signal rise and fall times.

Fault insertion and parallel fault simulation are performed in all three of these modes. When fault insertion is done in mode 3, races or hazards that are induced by a fault will be detected. If fault insertion is done in mode 2, no fault will be declared detected unless the signal values involved in the detection are in a determinate state. Also, it can be determined whether a fault prevents a gate from being driven to a known state.

By using TEGAS2 as the basic simulator, faults can be simulated to whatever degree of accuracy the user desires.

FAULT MODELS

In order for a fault simulation system to be as flexible and as useful as possible, it should be able to model or insert various kinds of faults. Most fault simulation systems are capable of modeling only the class of single occurring stuck-at-logical 1 and stuck-at-logical 0 pin faults. Although it has been found that this class of faults covers a high percentage of all physical defects which occur (considering present technology), it is certainly not all-inclusive.

In an effort to remain as flexible and as efficient as possible and to be able to model different classes of faults, three different fault insertion techniques have been developed for TEGAS2. The first technique is used to insert signal faults or output pin faults. This type of fault is one in which an entire signal is stuck at a logical 1 or a logical 0. The distinguishing factor is that the fault affects an entire signal and not just an input pin of an element. Figure 2 illustrates a signal or output pin fault as opposed to an input pin fault.

Figure 2-Example of a signal fault and a pin fault

At the beginning of a fault simulation pass, the OUTFLT table (Figure 3), which contains all of the signal faults to be inserted, is constructed. There is one row in the table for every signal that is to be faulted. The information in each row is the signal number, MASK1, and MASK2. The signal number is a pointer into an array CV where the signal values are stored. MASK1 has a 1 in each bit position that is to be faulted. The rightmost bit of a word containing a signal value is never faulted, since that bit represents the good machine value. MASK2 contains a 1 in any bit position that is to have a SA1 inserted and a 0 where there is either a SA0 or no fault to be inserted. Parallel fault simulation is accomplished by having each bit position in a computer word containing a signal value represent a fault.

Figure 3-TEGAS2-Table structure for fault simulation (fault record of faults to be inserted, CDT index, MFNT fault machine number correspondence table, and fault tables; MASK 1 gives the position of the fault and MASK 2 the type of fault)

At the end of each time period during simulation for which any activity takes place, the signal faults are inserted. This method is very simple and requires little extra code. For example, let CV be the array containing the signal values and OUTFLT be the two dimensional table discussed above. Then, to insert one


or more faults on a signal, the following statement would be executed:

    CV(OUTFLT(i,1)) = (CV(OUTFLT(i,1)) .AND. (.NOT. OUTFLT(i,2))) .OR. OUTFLT(i,3)    [1]

i represents the row index in the signal fault table
OUTFLT. It is not necessary to insert signal faults
after a time period that has no activity, since none of
the signals will have changed value. By inserting signal
faults in this manner it is not necessary to check a flag
every time an element is evaluated to see if its output
must be faulted.
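A minimal modern-Fortran sketch of that insertion loop (assumed array shapes rather than the TEGAS2 declarations) makes the role of the two masks explicit:

      ! Sketch only: end-of-time-period signal fault insertion.
      ! OUTFLT(i,1) = signal number, OUTFLT(i,2) = MASK1 (bit positions to
      ! fault), OUTFLT(i,3) = MASK2 (positions forced to 1); bit 0 of each
      ! packed word is the good machine value and is never faulted.
      subroutine insert_signal_faults(cv, outflt, nflt)
        implicit none
        integer, intent(inout) :: cv(:)        ! packed signal values
        integer, intent(in)    :: outflt(:,:)  ! one row per faulted signal
        integer, intent(in)    :: nflt
        integer :: i, sig
        do i = 1, nflt
           sig = outflt(i,1)
           ! clear the faulted positions, then OR in the stuck-at-1 positions
           cv(sig) = ior(iand(cv(sig), not(outflt(i,2))), outflt(i,3))
        end do
      end subroutine insert_signal_faults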
The second method of fault insertion is used for input
pin faults. An input pin fault only affects an input
connection on an element. This is demonstrated in
Figure 2. In the table structure for the simulator, each
element has pointers to each of its fan-ins. The pointer
to any fan-in that is to be faulted is set negative prior
to a simulation pass. During simulation the input
signals for an element are accessed through a special
function. That is, the evaluation routines for the different element types are the same as when fault insertion is not performed except that the input values for
the element are acquired through the special function.
This function determines if a particular fan-in pointer
is negative. If a pointer is negative, the element being
evaluated and the signal being accessed are looked up
in a table containing all input pin faults to be inserted
for a simulation pass. The appropriate fault can then
be inserted on the input pin before the evaluation
routine uses it.
The input pin faulting procedure can be more clearly
illustrated by first examining the major tables used in
simulation. These are given in Figure 4. Each row in

the circuit description table (CDT) characterizes a
signal or a single element. The first entry in the CDT
table is a row index into the function description table
(FDT). The second entry, CDT(i,2), points to the first of the contiguous set of fan-in pointers (in the FI array) for element i. CDT(i,3) points to the first of a contiguous set of fan-out pointers (in the FO array) for element i, and CDT(i,4) specifies how many
fan-outs exist. The signal value (CV) table contains
signal values. The ith entry in the CV array contains
the value of the ith signal or element in the CDT table.
Each row in the FDT table contains information which is common to all elements of a given logical type. FDT(i,1) contains the number of the evaluation routine to be used for this element type; FDT(i,2), FDT(i,5), and FDT(i,6) contain the nominal, maximum, and minimum delay times, respectively. FDT(i,3) specifies the number of fan-ins for this element type. FDT(i,7) contains the number of outputs for the given element type. (This is used in the case of multiple output devices.) FDT(i,4) is used for busses.

The simplest evaluation routine for a variable input AND gate will now be given. This is the routine used when no race and hazard analysis is performed, nor is an indeterminate value used.
      N = FDT(CDT(I,1),3)
      IFIPT = CDT(I,2)
      ITEMP = ALLONE
      DO 10 NN = 1,N
      K = FI(IFIPT + NN - 1)
      ITEMP = ITEMP .AND. CV(K)
   10 CONTINUE

The integer variable ALLONE has all bits set to one.

Figure 4-TEGAS2-Simulation table structure (CDT, FDT, FI, and FO hold the interconnection data and element characteristics such as number of inputs, time delay, and ambiguity region; the signal value tables CV, CV2, and CV3 are dynamically allocated)
All that is required to change this routine so that input pin faults can be inserted is to replace CV(K) with FLTCV(K). FLTCV is a function call. It determines if the fan-in pointer K is negative and, if so, it uses the INFLT table to insert the appropriate fault. The INFLT table is the same as the OUTFLT table except that the relative input pin position to be faulted is given. The combination of the element number and the input pin position on that element identifies a particular pin to be faulted.
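A minimal sketch of such an access function (hypothetical argument list and table layout, not the TEGAS2 source) shows the idea; the negated fan-in pointer triggers a lookup in the input pin fault table:

      ! Sketch only: fetch a fan-in value, applying an input pin fault when
      ! the stored pointer K has been negated.  INFLT rows are assumed to
      ! hold (element, pin, MASK1, MASK2); a real system would index the
      ! table directly rather than search it.
      integer function fltcv(k, elem, pin, cv, inflt, nflt)
        implicit none
        integer, intent(in) :: k, elem, pin, nflt
        integer, intent(in) :: cv(:), inflt(:,:)
        integer :: j, v
        v = cv(abs(k))                    ! packed value of the driving signal
        if (k < 0) then                   ! this connection carries a pin fault
           do j = 1, nflt
              if (inflt(j,1) == elem .and. inflt(j,2) == pin) then
                 v = ior(iand(v, not(inflt(j,3))), inflt(j,4))
                 exit
              end if
           end do
        end if
        fltcv = v
      end function fltcv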
The third method of fault insertion is used for
complex faults. A complex or functional fault is a
fault used to model a physical defect which does not correspond to a single or multiple occurring stuck-at-logical 1 or stuck-at-logical 0 fault. An example of this is a NAND gate that becomes an AND gate in the presence
of some physical defect. For this approach an element
is first evaluated by its normal evaluation routine.
Then, if a complex fault is to be inserted on that
element, it is evaluated again using a routine which
models the complex fault. An example would be an
inverter gate which no longer inverts. In this case,
the normal inverter routine would be used first, then
an evaluation routine, which merely passes the input
signal along, would be used.
As with the other insertion techniques, a table
(FUNFLT, Figure 3) is constructed at the beginning
of a simulation pass and it contains all elements that
are to have complex faults for that pass. This table
also contains the routine number that will evaluate a prospective complex fault and, again, which bit position will represent the fault. Each entry in the FUNFLT table has one extra space, used for the input pin position when modeling input shorted diodes. It is the responsibility of the complex fault evaluation routines to merge their results with the results of the normal evaluation routine so that the proper bit represents the fault. This is accomplished by using MASK1 in the FUNFLT table. Assume that the variable SP temporarily contains the non-fault element evaluation results and the variable SPFT contains the results from the element representing the complex fault. As was stated before, MASK1 contains a 1 in the bit position that is to represent the fault; then the statement
    SP = (SP .AND. (.NOT. MASK1)) .OR. (MASK1 .AND. SPFT)

will insert the fault in SP.
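As a sketch of such a routine (hypothetical names, modern Fortran, not the TEGAS2 code), the inverter-that-no-longer-inverts example reduces to computing the buffer behavior and merging it into the bit selected by MASK1:

      ! Sketch only: complex-fault evaluation for an inverter acting as a
      ! buffer.  MASK1 selects the one bit position representing this fault.
      subroutine faulty_inverter(input, sp, mask1)
        implicit none
        integer, intent(in)    :: input   ! packed input signal values
        integer, intent(inout) :: sp      ! packed result of the normal routine
        integer, intent(in)    :: mask1
        integer :: spft
        spft = input                      ! defective element passes its input through
        sp = ior(iand(sp, not(mask1)), iand(mask1, spft))
      end subroutine faulty_inverter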
Other faults that can be modeled with the complex
fault insertion technique are shorted signals, shorted
diodes, NAND gates that operate as AND gates, edge
triggered D type flip-flops which are no longer edge
triggered, etc. Two signals which are shorted together
would be modeled as in Figure 5. A dummy gate is placed over signals A and B. In the faulted case, the dummy gate takes on the function of an AND gate or an OR gate, depending on technology. In this case, the input signals are ANDed or ORed together and the result passes on to both A* and B*.

Figure 5-Shorted signals
Another class of faults that can be modeled, to some extent, with the complex method is transient or intermittent faults. This is possible only because TEGAS2
is a time based simulator. As an example, let us model
the condition of a particular signal periodically going
to 0 independent of what its value is supposed to be.
Again we pass the signal through a dummy gate as in
Figure 6. The dummy gate also produces the fictitious
signal (F) which is ANDed with the normal signal. The
fictitious signal is normally 1, however the dummy
gate can use a random number generator to periodically
schedule the fictitious signal to go to O. It can also use
the random number generator or a given parameter
to determine how long the signal remains at O. From
this discussion, the flexibility and power of the complex
fault insertion method can be seen.
In addition to modeling the faults described above, any combination of faults can be modeled as if they existed simultaneously. A group of faults that exist at the same time is considered to be a multiple fault. With this capability, multiple occurring logical stuck at 1 and logical stuck at 0 faults can be modeled. Also, multiple complex faults can be inserted, or any combination of the above.
Modeling a group of multiple faults is accomplished
simply by letting a single bit position in the signal
values represent each of the faults that are to exist
together. That is, MASK1 in the fault tables would
have the same bit position set to one for each of the
faults in a group of multiple occurring faults.
This approach to the handling of multiple faults
has permitted us to develop a new technique for


simulating any number of faults, from one fault to all
faults, in one simulation pass. The added running time for this approach is slightly more than that needed to do a parallel simulation pass which considers a number of faults equal to the host machine word length minus one. Hence, the approach has the potential of being less time consuming than the one-pass simulators 10 and more flexible and efficient than the traditional parallel simulators. The technique is called Multiple Number of Faults/Pass (MNFP) and partitions the faults into classes that will be simulated as multiple faults. Therefore, each bit position represents a group
of faults. If the groups are structured such that blocking faults (such as a stuck at 1 and a stuck at 0 simultaneously on the same signal) are not included in the
same group, fault detection can be achieved. If fault
isolation is required, the fault groups which contain
detected faults will be repartitioned and the process
continued. For example, if we are simulating 35 groups
of faults and five groups indicate faults being detected,
the five groups will be repartitioned and simulated for
isolation. The efficiency of this approach is derived
from the fact that the other 30 groups need not be
simulated any further for these inputs. Assume that these 30 groups each contained 70 faults. For conventional parallel simulation, these 2,100 faults alone would require 60 passes at 35 faults per pass, that is, 59 additional simulation passes over the MNFP approach.
Another feature of this approach is that all faults
need not, and indeed, sometimes should not, be simulated in one pass. For example, assume the following
partition of 2,168 faults.
Number of groups          Number of faults/group
        10                          19
         5                          27
        20                          35
        15                          71
        10                           6
         9                           2
        69 (Total Groups)        2,168 (Total Number of Faults)

For this case, two passes of the simulator would be required, assuming a 36-bit host machine word (35 fault positions per pass for the 69 groups).
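A small modern-Fortran sketch of the bookkeeping (the word length and the partition above are the only assumptions) shows how the pass count follows from the number of groups:

      ! Sketch only: passes needed for the MNFP partition given above.
      ! One bit per word is reserved for the good machine, so a 36-bit
      ! word carries 35 fault groups per pass.
      program mnfp_passes
        implicit none
        integer, parameter :: slots = 36 - 1
        integer :: groups(6)    = (/ 10,  5, 20, 15, 10, 9 /)
        integer :: per_group(6) = (/ 19, 27, 35, 71,  6, 2 /)
        integer :: ngroups, nfaults, passes
        ngroups = sum(groups)
        nfaults = sum(groups * per_group)
        passes  = (ngroups + slots - 1) / slots          ! ceiling division
        print '(a,i4,a,i5,a,i2)', 'groups =', ngroups, &
              '   faults =', nfaults, '   passes =', passes
      end program mnfp_passes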
MNFP is also being used in conjunction with diagnostic test generation. For example, assume the existence of 3500 faults, and that our diagnostic test
generation heuristics have generated three potential
tests (T1, T2, and T3). If the faults are partitioned into
groups of 100 each, all faults could be simulated in one
pass. Hence, if each test is applied to the fault groups
using MNFP, it would require three passes to determine

(to some degree) the relative effectiveness of the tests.

Figure 6-Intermittent fault

If T1 and T2 detected faults in only one group, and T3 detected faults in 5 groups (with fault detection being the objective), the most likely candidate for further
analysis would be T3. Even if all of the faults in these 5 groups were then considered individually (the worst case), the entire process would require 18 simulation passes (three MNFP passes plus 15 passes for the 500 faults taken individually), as opposed to 100 passes using conventional parallel simulation. Further studies to determine the
parallel simulation. Further studies to determine the
most efficient utilization of the MNFP technique are
presently under way.
ENUMERATION OF FAULTS
If every fault that is to be inserted must be specified
manually, it could be a very laborious process. It is
certainly necessary to be able to specify faults manually
if desired, but it is also necessary to be able to generate
certain classes of faults automatically. TEGAS2 is set
up in a modular fashion such that control cards can
be used to invoke independent subroutines which will
generate different classes of faults. Additional faults
can be specified manually. New subroutines can be
easily added to generate new classes of faults as the
user desires. One class of faults that the system presently generates automatically is a collapsed set of single occurring stuck-at-logical 1 and stuck-at-logical 0 pin faults. This is the class of faults most often used.
A collapsed set of faults refers to the fact that many
faults are indistinguishable. For instance, a stuck at 0 on any of the input pins of an AND gate and a stuck at 0 on the output of the AND gate cannot be distinguished. If this group of faults is collapsed into one
representative fault, it is considered to be a simple
gate collapse. This is easy to perform and many existing
systems utilize this feature.

Figure 7-Fault collapse (odd numbers denote S-A-1 faults and even numbers S-A-0 faults; the 8 collapsed fault sets are (1), (3), (2,4,10,14), (9,13,17,15,11), (5), (7), (6,8,12,16), and (18))

A simple gate collapse is not, however, a complete
collapse. Figure 7 gives an example of a completely
collapsed set of faults. There are a total of 18 possible
single occurring S-A-1, S-A-0 faults. A simple gate collapse will result in 12 sets of faults. However, an extended collapse results in only 8 sets of distinguishable faults. This amounts to a reduction of 33 percent over the simple collapse. In large examples, the extended collapse has consistently shown a reduction of approximately 35 percent over the simple collapse.
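As a minimal sketch of the per-gate rule behind a simple collapse (hypothetical fault encoding, not the TEGAS2 routines): for an AND gate, a stuck-at-0 on any input pin is indistinguishable from a stuck-at-0 on the output, so all of them share one representative fault.

      ! Sketch only: simple gate collapse for one AND gate with NIN inputs.
      ! The fault on pin P is encoded as 2*P-1 for S-A-1 and 2*P for S-A-0,
      ! with the output counted as pin NIN+1; REP(F) is the representative
      ! fault of F's equivalence class.
      subroutine collapse_and_gate(nin, rep)
        implicit none
        integer, intent(in)  :: nin
        integer, intent(out) :: rep(2*(nin+1))
        integer :: f, p
        do f = 1, 2*(nin+1)
           rep(f) = f                  ! every fault starts in its own class
        end do
        do p = 1, nin
           rep(2*p) = 2*(nin+1)        ! input S-A-0 merges with output S-A-0
        end do
      end subroutine collapse_and_gate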
The information gained in collapsing a set of faults
has additional value. This information can be used
in determining optimal packaging of elements in MSI
or LSI arrays so as to gain fault resolution to a replaceable module. As in Figure 7, it can be readily seen that
these three elements might best be placed on the same
replaceable module. This is true because all three elements are involved in an indistinguishable set of faults.
FAULT DETECTION
The activity associated with determining when a
fault has caused an observable malfunction can be
termed fault detection. As with many other functions, TEGAS2 uses a dummy gate, designated as the detection gate, for this purpose. A detection gate is specified
for signals that are declared observable for the purpose
of fault detection. These signals are many times referred to as primary outputs or test points. An ordered
set of such signals is called the output vector. If any
fault causes the value of one or more points on the output vector to be different from when the fault is
not present, the fault is declared detected.
Whenever one of the signals, which is part of the
output vector, changes value, its corresponding detection gate determines if a fault is observable at that
point. This is easily accomplished since the fault free
value for any signal is always stored in the low order
bit of the host machine word. The values for that
signal, corresponding to each of the faults being simulated at that time, are represented by the other bits
in the word. Hence, all that is necessary is to compare
each succeeding bit in the word to the low order bit.
If the comparison is unequal, a fault has been detected.
For each simulation pass, the machine fault number table (MFNT) (Figure 3), which cross-references each bit position in the host machine word with the fault to which it corresponds, is maintained. Once a comparison is unequal, the table can be entered directly by bit position and the represented fault can be determined.
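A minimal sketch of that comparison (modern Fortran, with an assumed word layout rather than the TEGAS2 code): replicate the good machine bit across the word, exclusive-OR it with the packed value, and every remaining 1 bit marks a detected fault.

      ! Sketch only: report which fault bit positions of a packed signal
      ! value disagree with the good machine value held in bit 0.
      subroutine check_detection(value, nbits)
        implicit none
        integer, intent(in) :: value, nbits
        integer :: good, diff, b
        good = iand(value, 1)              ! good machine value (bit 0)
        if (good /= 0) good = not(0)       ! replicate it across the word
        diff = ieor(value, good)           ! 1 bits mark disagreements
        do b = 1, nbits - 1
           if (btest(diff, b)) then
              ! a real system would look bit b up in the MFNT table here
              print '(a,i3)', 'fault detected in bit position', b
           end if
        end do
      end subroutine check_detection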
When a fault is detected, the detection gate records
the identification number of the fault detected, the
good output vector, the output vector for the fault
just detected and the time at which the detection occurred. The input vector, at the time of detection, may
be optionally recorded. The important thing to note
is that since TEGAS2 simulates on the basis of very
accurate propagation delay times, the time of detection has significance. By using this additional information, it is possible to gain increased fault resolution.
An example of this is when two faults A and B result
in identical output vectors for all input vectors applied.
Without any additional information, these two faults
cannot be distinguished. However, if the malfunction
caused by fault A appears before the malfunction
caused by B, then they can be distinguished based on
time of detection.
The detection gate performs several other duties.
For example, in mode 2 a signal may be flagged indeterminate. In this case, the detection gate checks to
see that both the fault induced value and the fault
free value are determinate before a fault is declared
detected. In all modes of operation, a detection gate
may have a clock line as one of its inputs. This clock
line or strobe line, may be used to synchronize the
process of determining if a fault is detected. In this
manner, systems of testers which can examine for
faults only at certain time intervals can be accurately
simulated.
FAULT CONTROL
When dealing with logic nets of any size at all, such
as 500 elements and up, there are thousands of faults


to be considered. If such a magnitude of faults is to be
simulated efficiently, a good deal of attention must be
paid to the overall fault control. The overall control
should be such that it will handle an almost unlimited
number of faults and be as efficient as possible.
The first step in the process of fault simulation is the
specification of the faults to be inserted. As was stated
earlier, certain classes of faults may be generated
automatically and others specified manually. In either
case, the faults are placed sequentially on an external
storage device. After all faults have been enumerated,
an end-of-file mark is placed on the file and it is rewound. This is accomplished with a control card. The
number of faults that can be specified is limited only
by the external storage space available.
The maximum number of faults that can be simulated
during a single simulation pass is dependent on the
number of bits in the host machine word, unless MNFP
is used. As mentioned earlier, one bit is always used for
the good machine and the others are used to represent
fault values. Through the use of a control card, the
user may specify the number of bits to be used, up to
the maximum allowable.
Let N be the number of faults to be simulated in
parallel. The basic steps in fault simulation would then
be as follows (a code sketch of this outer loop appears after the list):
(1) Enumerate all faults to be simulated.
(2) Store on an external device all data necessary
to initialize a simulation pass.
(3) Read sequentially N faults from the external
fault file and set up the appropriate fault
tables.
(4) Negate the appropriate pointers based on the
fault tables.
(5) Pass control to the appropriate mode of simulation as determined by the user.
(6) If all faults have been simulated, stop.
(7) If there are more faults to be simulated, restore the data necessary to initialize simulation
and go to step 3.
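A minimal sketch of that outer loop (modern Fortran; the stub routine stands in for the steps listed above):

      ! Sketch only: column-method fault control.  Each pass takes the next
      ! group of up to N faults through the full input sequence.
      program fault_control
        implicit none
        integer, parameter :: n = 35       ! fault bits per pass (36-bit word)
        integer :: total, first, last
        total = 2168                       ! faults on the external fault file
        first = 1
        do while (first <= total)
           last = min(first + n - 1, total)
           call run_pass(first, last)      ! steps 3 through 5, then step 7
           first = last + 1
        end do
      contains
        subroutine run_pass(lo, hi)
          integer, intent(in) :: lo, hi
          ! a real pass would read faults lo..hi, build the fault tables,
          ! negate the faulted fan-in pointers, simulate in the selected
          ! mode, and restore the saved initial state
          print '(a,i5,a,i5)', 'simulating faults', lo, ' through', hi
        end subroutine run_pass
      end program fault_control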
What has not been explicitly stated, up to this point,
is that all input vectors or input stimuli are applied to
a group of faults before going to the next group of
faults. This will be called the column method. With
zero delay or sometimes unit-delay simulation, fault
control is not usually done in this manner. In these
cases, a single input vector is applied to all groups of
faults before going to the next input vector. This will
be referred to as the row method. Between applying
input vectors in the row method, all faults are examined
to determine which ones have been detected and these
are discarded. The faults remaining can then be regrouped so that fewer faults need be simulated
with the next input vector.
On the surface, the row method control seems more
efficient than the column method. However, there are
several things to be considered. First of all, when the
row method is used with sequential logic, the state
information for every fault must be saved at the end
of applying each input vector. This requires a great
deal of bit manipulation and storage space. The
amount of state information that must be stored is
dependent on the type of simulation used. If, as with
most zero delay simulators, the circuit is leveled and
feedback loops are isolated and broken, only the
values of feedback loops and flip-flop states need to
be stored. With a simulator such as TEGAS2, the
circuit is dynamically leveled and feedback loops are
never detected and broken, therefore, every signal
must be stored. This is one of the reasons that the
row method is not considered to be as practical with
a time based simulator such as TEGAS2.
A second consideration is the fact that with TEGAS2,
all input stimuli can be placed at appropriate places
in a time queue before simulation begins. Once simulation begins, it is one continuous process until all
stimuli have been applied. Because of this, a large
number of input stimuli can be processed very rapidly
and efficiently.
The ability to place all input stimuli in a time queue
would not be possible if a time based simulator were
not used. With a zero delay, or even a unit delay simulator, input stimuli cannot be specified to occur at a
particular time in reference to the activity of the rest
of the circuit. Therefore, one input vector is applied
and the entire circuit must be simulated until it has
been determined to be stable. Then the next input
vector can be applied, etc. In this manner, there is a
certain amount of activity partitioning between input
vectors, which lends itself to the row method.
The third factor to consider is that if a fault is no
longer simulated after it is once detected, a certain
amount of fault isolation information is lost. If the
column method is used, the cost of retaining a fault
until the desired fault isolation is obtained is considerably less than with the row method.
EXAMPLES
To demonstrate the fault simulation capabilities of
TEGAS2, as presented in this paper, consider the
network in Figure 8. This network is a particular gate
level representation of a J-K master slave flip-flop.
The nominal propagation delay time of each of the
NAND gates is four (4) time units and the delay of

the NOT gate is two (2) time units. The minimum delay of the NAND gates is three (3) time units and the maximum is five (5) time units. For the NOT gate, the minimum and maximum is one (1) and three (3) units, respectively.

Figure 8-JK master slave flip-flop example

The two valued assignable nominal delay mode of simulation is the fastest mode, but it performs no race and hazard analysis. In this mode of simulation, all signals are initially set to zero. Suppose that for the network in Figure 8, the signals J, K, CLEAR, and PRESET are set to 1. Now let the signal CLOCK continuously go up and down with an up time of five and a down time of fifteen. The outputs Q and Q̄ will oscillate because they are never driven to a known state. If the same input conditions are used in the three valued mode of simulation, the outputs will remain constant at X (unknown).

To demonstrate the power of the race and hazard mode of simulation, assume that the inputs J and K change values while the CLOCK is high and that the clock goes to zero two time units after the inputs change. Under these conditions, internal races will be created and a potential error flag will be set for both of the outputs Q and Q̄.

Performing the extended fault collapse on the network in Figure 8 resulted in a total of forty faults (Table I) that must be inserted. (A simple gate collapse would result in fifty-two faults to be inserted.)

Table II gives a set of inputs that were applied to the network in all three modes of fault simulation. The table gives all primary input signal changes. In the following analysis of fault detection, the signals Q and Q̄ are the only test points for the purpose of observing faults. In the first mode of simulation, two valued assignable nominal delay, thirty-three of the faults were detected. The seven faults not detected were 15, 16, 25, 27, 29, 31, and 33. In the three valued mode of simulation, there were thirty faults declared to be detected. Seven of the faults not detected were the same as in the first mode of simulation. The three other faults not detected were 1, 20, and 23. The reason these faults were not detected, in the three valued mode of simulation, is that they prevented the network from being driven to a known state. In the race and hazard analysis mode of simulation, twenty-six faults were declared to be detected. Out of the fourteen faults not detected, ten are the same as those not detected in the three valued mode. The other four faults are 5, 7, 8 and 11. Faults 7 and 8 were never detected because they never reached a stable state different from a stable state of the good machine's value. Many

TABLE I-Collapsed Set of Faults for Network in Figure 8
Fault
No.
1
2
3
4
5
6
7
8
9
10
11

12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40

Gate

Q
Q
F3
F3

F2
F2
F2
QB
QB
F4
F4
F1
F1
F1

F6
F6
F5
F5

Signal

Fault
Type

CLEAR
CLEAR
CLOCK
CLOCK
PRESET
PRESET
QB
PRESET
Q
F4
PRESET
F3
F7
F7
Q
PRESET
CLOCK
F2
Q
CLEAR
QB
F3
CLEAR
F4
QB
CLOCK
CLEAR
F1
K
K
J
J
F7
F4
F6
F6
F7
F3
F5
F5

SAl
SAO
SAl
SAO
SAl
SAO
SAl
SAl
SAO
SAl
SAl
SAO
SAl
SAO
SAl
SAl
SAl
SAO
SAl
SAl
SAO
SAl
SAl
SAO
SAl
SAl
SAl
SAO
SAl
SAO
SAl
SAO
SAl
SAl
SAl
SAO
SAl
SAl
SAl
SAO


Figure 9-Complex faults
times faults 7 and 8 caused the output signals Q and Q̄
to be in a state of transition while the good machine
value was stable. However, this is not sufficient for
detection. Faults 5 and 11 caused potential errors and
were therefore never declared to be absolutely detected.
Table III gives time vs. faults detected for each of
the three modes of simulation. Note that some of the
faults are detected at different times between the two
valued and three valued modes of simulation. This is
due to the fact that some signals were not driven to
known states in the three valued mode of simulation
until a later time. Faults were also detected at different times in the race and hazard analysis mode
since minimum and maximum delay times were used.
N ow the insertion of complex faults will be demonstrated. The flip-flop network is marked in Figure 9
with five complex faults. Three of the complex faults
require dummy elements. The first fault is an intermittent SAO on the PRESET signal. The dummy
TABLE II-Input Signal Changes

Signal      Value changed to      Time of change
J                  1                     0
K                  0                     0
CLOCK              0                     0
CLEAR              0                     0
PRESET             1                     0
CLEAR              1                    30
CLOCK              1                    70
CLOCK              0                   110
J                  0                   131
K                  1                   131
CLOCK              1                   131
CLOCK              0                   134
PRESET             0                   160

TABLE III-Time vs. Fault Detection (faults detected at each time, for Mode-1, Mode-2, and Mode-3)

4
5
8
10
12
16
18
20
46
50
80
83
86
90
120

Mode-2

Mode-1
9,21

Mode-3

21
21

39
1,6,20,38,40
3, 12, 14
24, 28

6,24,28,38,40

23
22,26

22,26

19

19

13,37

13,37

2,4,32

2, 3, 4, 9, 12, 14
32,39

6,24,28,38,40
22,26
19
13,37

2, 3, 4, 9, 12, 14
39,32

123
124
126
128
130
141
147
150
151
167
179

18,34,36
10

18, 34, 36
10

18, 34, 36
10
7
30, 35
17

7
30,35
17

5,8
11

5, 8
11

17, 30, 37

element DUM1 is used to insert this fault. The second fault is an input diode shorted on the connection of signal F4 to gate F6. To insert this fault, the signals F4 and F7 are passed through the dummy element DUM2. The element DUM3 is used to model a signal short between signals F5 and F6. A fourth complex fault is the case where element F3 operates as an AND gate instead of a NAND gate. The fifth complex fault is a multiple fault. This multiple fault consists of a SA0 on the input connection of signal CLEAR to gate Q, a SA1 on signal F1, and gate F2 operating as an AND gate instead of a NAND gate.
The same input signal changes as given in Table II
up through time 110 were applied to the network with
these faults present. The times of detection for these
faults in mode 1 are:
Time      Fault No.
 10           1
 12           4
 16           3
120           2
120           5


Hence, these faults were simulated simultaneously and
detected by the given input sequence for this mode of
simulation.
SUMMARY
The TEGAS2 system is capable of simulating faults at
three levels. The most accurate level performs race and
hazard analysis with minimum and maximum delay
times. The fault insertion methods developed for
TEGAS2 are capable of modeling not only the traditional set of single occurring stuck-at logical one and stuck-at logical zero faults, but also a wide range of complex faults such as intermittents, shorted signals, and shorted diodes. In addition, any multiple occurrence of the above faults can be modeled. These faults can be specified by the user, or an extended collapsed set of single occurring stuck-at
faults can be generated automatically. Due to accurate
time based simulation for faults, it is possible to extract
accurate time based fault diagnosis information. Finally,
with the introduction of the MNFP technique, a new
dimension has been added to digital fault simulation.
REFERENCES
1 E G ULRICH
Time-sequenced logical simulation based on circuit delay and
selective tracing of active network paths
Proceedings ACM 20th National Conference 1965

2 S A SZYGENDA D ROUSE E THOMPSON
A model and implementation of a universal time delay simulator for large digital nets
AFIPS Proceedings SJCC May 1970
3 M A BREUER
Functional partitioning and simulation of digital circuits
IEEE Transactions on Computers Vol C-19 pp 1038-1046
Nov 1970
4 S G CHAPPELL S S YAU
Simulation of large asynchronous logic circuits using an
ambiguous gate model
AFIPS Proceedings FJCC November 1971
5 R B WALFORD
The LAMP system
Proceedings of the Lehigh Workshop on Fault Detection
and Diagnostics in Digital Circuits and Systems
December 1971
6 R M McCLURE
Fault simulation of digital logic utilizing a small host machine
Proceedings of the 9th ACM-IEEE Design Automation
Workshop June 1972
7 E G MANNING H Y CHANG
A comparison of fault simulation methods for digital systems
Digest of the First Annual IEEE Computer Conference
1967
8 S A SZYGENDA
A simulator for digital design verification and diagnosis
Proceedings of the 1971 Lehigh Workshop on Reliability
and Maintainability December 1971
9 S A SZYGENDA
TEGAS2-Anatomy of a general purpose test generation and
simulation system for digital logic
Proceedings of the 9th ACM-IEEE Design Automation
Workshop June 1972
10 D B ARMSTRONG
A deductive method for simulating faults in logic circuits
IEEE Transactions on Computers May 1972

A program for the analysis and design of
general dynamic mechanical systems
by D. A. CALAHAN and N. ORLANDEA
The University of Michigan
Ann Arbor, Michigan

INTRODUCTION

The physical laws that govern motion of individual components of mechanical assemblages are well-known. Thus, on the face of it, the concept of a general computer-aided-design program for mechanical system design appears straightforward. However, both the equation formulation and the numerical solution of these equations pose challenging problems for dynamic systems: the former when three-dimensional effects are important, and the latter when the equations become "stiff" 1 or when different types of analyses are to be performed.

In this paper, a three-dimensional mechanical dynamic analysis and design program is described. This program will perform dynamic analysis of nonlinear systems; it will also perform linearized vibrational and modal analysis and automatic iterative design around any solution point in the nonlinear dynamic analysis.

FORMULATION

The equations of motion of a three-dimensional mechanical system can be written in the following form.

Free body equations:

    (1)
    (2)
    (3)        j = 4, 5, 6
    (4)        j = 1, 2, ..., 6

Constraint (connection) equations:

    \phi_i(q) = 0,        i = 1, 2, ..., m        (5)

where

    E is the kinetic energy of the system,
    q_j are generalized coordinates (three rotational and three translational),
    u_j are the coordinate velocities,
    \lambda_i are Lagrange multipliers, representing reaction forces in joints,
    p_j are generalized angular momentums,
    Q_j are generalized forces,
    \phi_i are constraint functions representing different types of connections at joints (see Figure 2).

Representing all subscripted variables in vector form (e.g., u = [u_1 u_2 ... u_6]^T), these equations become

    F(u, \dot{u}, q, p, \dot{p}, \lambda; t) = 0        (6)

    \Phi(q) = 0        (7)

By referencing the free body equations of (1-3) to the joints, we can view the above as a "nodal" type of formulation.

NUMERICAL SOLUTION

Static and transient analysis

To avoid the numerical instability associated with widely separated time constants, most general-purpose dynamic analysis programs employ implicit integration techniques. The corrector equation corresponding to (6) has the form

    (K_0/T\,\partial F/\partial\dot{u} + \partial F/\partial u)\,\Delta u + (\partial F/\partial q)\,\Delta q + (K_0/T\,\partial F/\partial\dot{p} + \partial F/\partial p)\,\Delta p + (\partial F/\partial\lambda)\,\Delta\lambda = -F        (8)

    (\partial\Phi/\partial q)\,\Delta q = -\Phi        (9)

where

    T is the integration step size,
    K_0 is a constant of integration.

The matrix of partial derivatives in (7-8) is solved repetitively using explicit machine code. The Gear formula is used for integration. 2

Figure 1-Outline of program capabilities (generation of three sparse matrix codes, for static, transient, and vibrational (modal) analysis)

Vibrational and modal analysis

Substitution of s for K_0/T in (8) can be viewed as resulting in the linearized system equations

    [s\,(\partial F/\partial\dot{u})^n + (\partial F/\partial u)^n]\,\delta u + (\partial F/\partial q)^n\,\delta q + [s\,(\partial F/\partial\dot{p})^n + (\partial F/\partial p)^n]\,\delta p + (\partial F/\partial\lambda)^n\,\delta\lambda = f(s)        (10)

    (\partial\Phi/\partial q)\,\delta q = 0        (11)

where

    ( )^n represents evaluation at the nth time step; this includes the static equilibrium case (n = 0),
    \delta represents a small variation around the nth time step,
    f(s) is a force or torque source vector.

The evaluation of the vibrational response now proceeds by setting s = jω = j2πf, and sweeping f over the frequency range of interest. This repeated evaluation is similar in spirit to the repeated solution of (8-9) at every corrector iteration. However, now an interpreter 4 is used for solution of the complex-valued simultaneous equations.
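As a generic illustration of that sweep (a single mass-spring-damper rather than the program's equations; all values here are assumptions), the linearized system is simply re-evaluated and solved at each s = j2πf:

      ! Sketch only: frequency sweep of a one-degree-of-freedom system,
      ! m*x'' + c*x' + k*x = f(t); magnitude of X/F at s = j*2*pi*freq.
      program sweep
        implicit none
        real, parameter :: pi = 3.141593
        real, parameter :: m = 1.0, c = 0.5, k = 40.0
        complex :: s, h
        real :: freq
        integer :: i
        do i = 1, 30
           freq = 0.1 * real(i)                 ! frequency in Hz
           s = cmplx(0.0, 2.0*pi*freq)          ! s = j*2*pi*f
           h = 1.0 / (m*s*s + c*s + k)          ! transfer function X(s)/F(s)
           print '(f6.2,e12.4)', freq, abs(h)
        end do
      end program sweep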
Modal analysis (i.e., determination of the natural frequencies) is relatively expensive if all modes must be found. However, the dominant mode can usually be found (from a "good" initial guess) in 5-7 evaluations of the system determinant using Muller's method. 5 This determinant is readily found from the interpreter, which performs an LU factorization to find the vibrational response.

Solution efficiency

For each corrector iteration involved in (8-9), the minimum set of variables that must be determined are those required to update the Jacobian and the right hand side vector F. The constraint equations represented by Φ = 0 can in general relate any q_j variables in a nonlinear manner; also, from (1) and (2), the λ's appear in the ∂F/∂q term of the Jacobian. Therefore, it seems convenient to solve for all the arguments of F of (6). We do not, then, attempt to reduce the number of equations to a "minimum set" (such as the number of degrees of freedom); since most variables must be updated anyway, we find no purpose in identifying such a minimum set for the purpose of transient analysis.

In contrast, for vibrational and modal analysis, a significant savings could be achieved by reducing the equations to the number of degrees of freedom of the system. We do not exploit this at present, since a single transient analysis easily dominates other types of analysis in cost.

AUTOMATIC DESIGN

Unlike electrical circuit design, it is not common to design mechanical systems to precisely match a frequency specification. It is far more common to adjust only the dominant mode to achieve an acceptable dynamic response.

One of the most direct approaches to automatic iterative adjustment of the natural modes is to apply Newton iteration to the problem of solving Δ(s) = 0,
Type of joint      No. of equations of constraint     Example of application
Spherical                        3                     Suspension of cars
Universal                        4                     Transmission for cars
Cylindrical                      4                     Machine tools
Translational                    5                     Machine tools
Revolute                         5                     Bearings
Screw                            5                     Screws

Figure 2-Constraint library

where Δ is the system determinant associated with (4). In particular, if s_i is a desired natural mode and ξ_j is a parameter, then we solve iteratively

Here, the term in brackets can be identified as a transfer function of the linearized system.

PROGRAM DESCRIPTION

The general features of the program are outlined in Figure 1. In general, the program is intended to permit analysis of assemblages of links, described by their masses and three inertial moments, compatibly connected by any of the joints of Figure 2. Other types of mechanical elements (gears, cams, springs, dashpots) will be added shortly. Most of these fit neatly into the nodal formulation, affecting only the constraint equations given in (2).

Figure 3-Example mechanism

Figure 5-Vibrational response

EXAMPLE

The system shown in Figure 3 was simulated over a duration of 42.5 seconds of physical time.

Figure 4-Transient response

Figure 6-Locus of natural modes

It was assumed that a motor with a linear torque vs. speed characteristic, τ2 vs. ω2, drove the system against a constant load torque, τ4. Figure 4 shows the transient
response; a vibrational response is shown in Figure 5
around the static equilibrium point. Figure 6 shows the
motion of the natural frequencies as the transient
response develops; it may be noted that as the natural
modes develop a larger imaginary component, an
oscillation of increasing frequency appears in the transient response.

SUMMARY

The nodal formulation of (1-5) offers a number of programming and numerical solution features.

(1) No topological preprocessing is necessary to establish a set of independent variables; equations can be developed directly from the connection data, component by component.
(2) Having a large number of solution variables assists in modeling common physical phenomena; for example, frictional effects in joints are routinely modeled and impact is easily handled.
(3) Sensitivities necessary for man-machine and iterative design are easily determined due to the explicit appearance of common parameters (e.g., masses, inertial terms, link dimensions) in the nodal formulation.
(4) The use of force and torque equations permits easy compatibility with current methods of continuum mechanics for internal stress analysis.

It must be mentioned that the transient solution of three-dimensional mechanical systems poses some interesting numerical problems not present in the related fields of circuit and structural analysis. First, the equations are highly nonlinear, requiring evaluations of tensor products at each corrector iteration; also, associated matrices are of irregular block structure. Second, the natural modes are not infrequently in the right half plane, representing a falling or a locking motion (see Figure 3). The integration of such (locally) unstable equations is not a well-understood process and can be expected to yield some numerical difficulties. Among these appear to be a high degree of oscillation in the reaction forces (the λ's), preventing any effective error control from being exerted on these variables.

ACKNOWLEDGMENTS

The authors gratefully acknowledge the interest and support of the U. S. Air Force Office of Scientific Research (Grant No. AFOSR-71-2027) and the National Science Foundation (Grant No. GK-31800).

REFERENCES

1 C W GEAR
DIFSUB for solution of ordinary differential equations
CACM
Vol 14 No 3 pp 185-190 March 1971
2 N ORLANDEA M A CHACE D A CALAHAN
Sparsity-oriented methods for simulation of mechanical
dynamic systems
Proc 1972 Princeton Conf on Information Sciences and
Systems March 1972
3 F G GUSTAVSON W M LINIGER
R A WILLOUGHBY
Symbolic generation of an optimal count algorithm for sparse
systems of linear equations
Sparse Matrix Proceedings (1969) IBM Thomas J Watson
Research Center Yorktown Heights New York Sept 1971
4 H LEE
An implementation of gaussian elimination for sparse systems
of linear equations
Sparse Matrix Proceedings (1969) IBM Thomas J Watson
Research Center Yorktown Heights New York Sept 1971
5 D E MULLER
A method for solving algebraic equations using an automatic
computer
Math Tables Aids Computer Vol 10 pp 208-215 1956
6 M A CHACE D A CALAHAN N ORLANDEA
D SMITH
Formulation and numerical methods in the computer evaluation
of mechanical dynamic systems
Proc Third World Congress for the Theory of Machines and
Mechanisms Kupari Yugoslavia pp 61-99 Sept 13-20 1971
7 R C DIX T J LEHMAN
Simulation of dynamic machinery
ASME Paper 71-Vibr-l11 Proc Toronto ASME Meeting
Sept 1971

A wholesale retail concept for
computer network management
by DAVID L. GROBSTEIN
Picatinny Arsenal
Dover, New Jersey

and
RONALD P. UHLIG
US Army Materiel Command
Washington, D.C.

In the past few years the technical feasibility of computer networks has been demonstrated. An examination of the existing networks, however, indicates that they are generally composed of homogeneous machines or are located essentially in one geographical area. The most notable exception to this is the ARPA Network, which is widely distributed geographically and which has a variety of computers. The state of the art now appears to be sufficiently far along to allow serious consideration of computer networks which are not experimental in origin and are not university based.

When a large governmental or industrial organization contemplates the establishment of a computer network, initial excitement focuses on the technical sophistication and capabilities which may be achieved. As the problem is examined more deeply it becomes progressively clearer that the management aspects represent the greater challenge. There are a number of sound reasons for an organization to establish a computer network, but fundamental to these is the intent to reduce the over-all computer resources required, by sharing them. The implications of this commitment to share are more far reaching than is immediately obvious when the idea is first put forth.

In both government and industry it is common to find computing facilities established to service the needs of a particular profit center or activity. That is, the computer resources necessary to support a mission organization are placed under its own control, as in Divisions 1, 2, and 4 in Figure 1.

THE MANAGEMENT PROBLEM

The commitment to share computer resources in a network implies substantial changes in the resource control topology of that organization. This is particularly true for organizations that have existing computing facilities which will be pooled to form the base of the network's resources. The crux of the matter is that sharing implies not only that you will let someone else utilize the unused capacity of your computer; it also implies that you may be told to forgo installing your own machine because there is unused capacity elsewhere in the resource pool. If your mission depends on the availability and suitability of computer services from someone else's machine, you suddenly become very interested in the management structure which governs the relationship between your organization and the one that has the computer.

The purpose of this paper is to examine some of the objectives and problems of an organization having existing independent computing centers, when it contemplates moving into a network environment.

COMPUTER NETWORK ADVANTAGES
Computer network advantages can be divided into
two categories, operational and management. The following advantages are classified as operational in that
they affect the day to day use of facilities in the network:
1. Provide access to large scale computers by users
who do not have on-site machines.


2. Provide access to different kinds of computers
that are available in the network.
3. Provide access to specialized programs or technology available at particular computing centers.
4. Reduce costs by sharing of proprietary programs
without paying for them at multiple sites.
5. Load level among computing centers.
6. Provide back-up capability in case of over-load,
temporary, or extended outage.

Figure 1-Decentralized computing facilities
A very fundamental advantage is the provision of a
full range of computing power to users, without having
to install a high capacity machine at each site. To
achieve this, it is necessary to provide access to the network through time sharing, batch processing and interactive graphics terminals. For each of these to be applied to that portion of a project for which it is best
suited, all must have access to a common data base.
Computer users frequently find programs that will
be valuable to them but which have been developed for
some other machine. Conversion to their own machine
can be time consuming and costly even if the programs are written in FORTRAN. Computer networks can offer access to different kinds of machines so that borrowed programs may be run without conversion. If the
program will serve without modification it need not be
borrowed at all but can be used through the network at
whichever installation has developed it. Thus a network
environment can be used to encourage specialized technology at each computing center so that implementation
and maintenance costs need not be repeated at every
user's site.
Computer networks can provide better service to
users by allowing load leveling among the centers in the
network so that no single machine becomes so overloaded that response and turn-around time degrade to
unacceptable levels. Furthermore, the availability of like
machines provides back up facilities to insure relatively
uninterrupted service, at least for high priority work.

From the standpoint of managing computer resources,
networks offer several advantages in helping to achieve
the goal of the best possible service for the least cost.
Among these advantages are:
1. Greater ease and precision in identifying aggregated computing workload requirements, by
providing a larger and more stable base from
which to make workload projections.
2. Ability to add capacity to the network as a
whole, rather than at each individual installation
by developing specifications for new main frames
based on total network requirements, with less
regard for specific geographic location.
3. Computing power can be added in increments
which more closely match requirements.

Experience at a number of installations indicates that
it is extremely difficult to project computer use on a
project by project basis with sufficient accuracy to use
the aggregated data as a basis for installation or augmentation of computer facilities. Project estimations
vary widely, particularly in scientific and engineering
areas. The need for computer support is strongly driven
by the week to week exigencies of the project. Because
of this variability, larger computing centers can often
project their future requirements better from past history and current use trends, than by adding up the requirements of each individual project. In a computing
network these trends can be more easily identified, and,
since the network as a whole serves a larger customer
base than any single installation, the projections can be
made more accurately. Simply stated, the law of large
numbers applies to the aggregate.
The second and third management advantages listed
above are interrelated but not really the same. In a
network, adding capacity to any node makes that capacity available to everyone else in the network. It is important to recognize that this applies to specialized
kinds of capacity as well as to general purpose computer cycles. Thus when specifications for new hardware
are developed, they can include requirements derived
from the total network. Finally computer capacity
tends to come in fixed size pieces. In the case of computers which can service relatively large and relatively
long running computer programs, the pieces are not only
large, they are very expensive. When these have to be
provided at each installation requiring computer services, there is frequently expensive unused capacity
when the equipment is first installed. In a network,
added computing power can be more easily matched to
overall requirements because the network capacity
increments are distributed over a larger base.


WHOLESALE VS RETAIL FUNCTIONS
Now let's examine the services obtained from a computing network.
At most large computing centers, personnel, financial,
and facilities resources are devoted to a combination of
functions which include acquisition and operation of
computing hardware, installation and maintenance of
operating systems, language processors, and other general purpose "systems" software, and design and development of applications programs. These functions
are integrated by the computing center manager to try
to provide the best overall service to his customers.
The Director of Computing Services at a location with
its own computer center, provides an organizational
interface with his local customers which may include
the Director of Laboratories, the Director of Research
and Engineering, Director of Product Assurance, and
other similar functions which require scientific and
engineering computer support.
But what structure is required if there is no computing
center at a particular location? How does the use of
computer network services, instead of organizationally
local hardware, affect the computer supported activities? Conversely, in a computer network environment, what is the effect of having no customers at the
actual local site of the computing center? What functional structure is required at such a lonely center and
what services should it offer?
In considering the answer to these and other questions involved in the establishment of a computer network it is useful to distinguish wholesale from retail
computing services. At its most fundamental level the
wholesale computing function might be defined as the
production of usable computer cycles. In order to
achieve this it is necessary to have not only computer
hardware, but also the operating systems software,
language processors, etc., which are needed to make the
hardware cycles accessible and usable. The wholesaler
produces his services in bulk. The production of wholesale computer cycles may be likened to the production
of coal, oil, or natural gas. Each of these products can
be used in support of a wide variety of applications from
the production of electricity to heating homes to broiling steaks on the back yard grill. The specific application is not the primary concern of the wholesaler. His
concern is to produce bulk quantities of his product at
the lowest possible cost. The Wholesale Computing
Facility (WCF), like the oil producer, has to offer a well-defined, stable product, in a sufficient number of grades
(classes of service) to satisfy his end users. To achieve
this he also must have a marketing function which
interacts with his retailers in order to maximize the


TABLE I-Resources and Services Offered by a Typical Wholesale Computing Facility (WCF)

RESOURCES
  Computers
  System Software
  General Purpose Application Software
  Systems Programmers
  Operators
  Communications Equipment

SERVICES
  Batch processing access
  Interactive terminal access
  Real time access
  Data File storage
  Data Base Management
  Contract programming
  Consulting Services
  Systems Software
  Hardware Interfaces
  Communications
  Documentation & Manuals
  Marketing/Marketing Support

value of the products he offers. The marketing function
includes technical representatives in the form of software and hardware consultants which can explain to
the retailer how to derive the maximum value from the
services offered and how to solve technical problems
which arise.
Table I is a non-exhaustive list of the resources needed
and services offered by a typical WCF.
Unlike the Wholesale Computing Facility which
strives for efficient and effective production of general
purpose computing power, the Retail Computing Facility (RCF) has the function of efficiently and effectively
delivering service directly to the user. The user's concern is with mission accomplishment. He has a project
to complete, and the computer provides an analytical
tool. He is not directly concerned with efficiency of
computer operation; he is concerned with maximizing
the value of computer services to his project. In this
respect fast turn-around time and specialized applications programs which ease his burden of communicating
with the computer may be more important than obtaining the largest number of computer cycles per dollar.
The retailer's function is to provide an interface between the WCF and the user. His primary concern is
to cater to the special needs, the taste and style of his
customers. He must provide a wide variety of services
which tailor the available computing power to each
specialized need.
To do this it is vital that the retailer understand
and relate to his user's needs and capabilities. For the
sophisticated user he may have to provide interactive
terminal access and a variety of high level languages
with which the user can develop his own specialized
applications programs. For others he must offer
analyst and programmer services to develop computer


TABLE II-Resources Needed By and Services Offered By a Typical Retail Computing Facility (RCF)

RESOURCES
  Wholesale/Retail Agreements
  Access to computers (terminals)
  Personnel
  General Purpose Applications Programs
  Marketing Support

SERVICES
  Usable Computer Time
  Special Purpose Applications Programming
  General Purpose Applications Programming
  Software Consultant Services
  Applications and Debugging Consultation
  User Training
  Administrative Services (Arrangement for Terminals, Users Guides, Manuals, Key Punching, Password Assignments)
  Marketing

applications to the customer's specifications. His primary orientation must be toward supporting his user's
missions.
The Retail Computing Facility also represents its
users to the Wholesale Computing Facilities. In doing
so, it helps the wholesaler to determine the kind of
products which must be offered. The retailer may need
to buy batch processing, interactive time sharing, and
computer graphics services. He may need access to
several different brands of computers, in order to process applications programs which his users have developed or acquired from others. He acquires commitments
for these services from wholesalers through wholesale/
retail agreements. Table II indicates resources needed
and services provided by RCFs.

UTILITY OF THE WHOLESALE RETAIL DISTINCTION

The notion of separate wholesale and retail computing facilities is useful for several reasons, particularly when a large company or government agency is attempting to integrate independent decentralized computing centers into a network. In the pre-network environment both the wholesale and retail facilities tend to be contained in the same organization and have responsibility for servicing only that organization. In a network environment it is important to identify the Wholesale Computing Facility in order to understand that it will be serving other organizations as well, and therefore must take a non-parochial point of view. The importance of this viewpoint is indicated by the fact that wholesale/retail agreements are regarded by the retailer as a
resource. For the retailer to depend on them, the agreements must be binding, and the retailer must be assured that he will receive the same treatment when he
is accessing a computer remotely through the network
as he would if he were geographically and organizationally a part of the wholesaler's installation. After
all, to achieve the benefits of sharing computer resources which a network offers, it is necessary to tell
some organizations that they cannot have their own
computers. Thus it is clear that binding agreements, as
surrogates for local computer centers, are fundamental
to successful network implementation.
Another reason to distinguish between wholesale and
retail facilities is to make it clear that you cannot serve
users merely by placing bare terminals where they can
be reached. Examination of the retail functions indicates
that they include a large number of the user oriented
services offered by existing computing centers. It is important to recognize that the decision to use only terminal access to the network at some locations, does not
result in saving all the resources that would be required
to set up an independent computing center at those locations. Quite the contrary, if computing services are
needed, it is a management obligation to provide the
required resources for a successful Retail Computing
Facility.
A third reason for identifying the two functions is
that in discussing organization and funding, lines of
responsibility and control are clearer to portray. This
third reason implies that the wholesale/retail distinction
is useful in understanding and planning for network
organization, whether or not the distinction becomes
visible in the implemented organization as separate segments. The wholesale and retail portions of a "typical" computing center are indicated in Figure 2.

Figure 2-Wholesale and retail computing facilities identified within a "typical" computing center
[Organization chart: a Director of Computing Services over Administrative Services, Computer Operations, and Systems Software, grouped as the Wholesale Computing Facility, and Scientific Applications Development, grouped as the Retail Computing Facility.]

APPLICATION OF THE WHOLESALE/RETAIL
MANAGEMENT CONCEPT
The concepts discussed to this point were developed
in a search for answers to some very real problems currently facing the authors' organization. We want to
make it clear that these theories and ideas are not
official policy of our organization; rather they are possible solutions to some of these problems. In discussing the approach described above with colleagues throughout our organization, we discovered that it is useful to
the approach described above with colleagues thr.oughout our organization, we discovered that it is useful to
consider possible applications of these ideas in concrete
rather than abstract terms. Our colleagues needed to
know where they fit into the plan in order to understand
it. Furthermore, mapping a general plan onto the structure of a specific organization is a prerequisite to acceptance.
THE AUTHORS' ORGANIZATION
The authors are in scientific and engineering data
processing management positions with the US Army
Materiel Command (AMC), a major command of the
US Army employing approximately 130,000 civilians,
and 13,000 military at the time this paper was written.
AMC has the mission of carrying out research, development, testing, procurement, supply and maintenance
of the hardware in the Army's inventory. The scope
of this mission is staggering. Some of the major organizational elements comprising the Army Materiel
Command include "Commodity Commands" with
responsibility for research, development, procurement
and supply for specific groups of commodities (hardware), depots for maintenance and supply, and independent laboratories for exploratory research.
Because of the nature of its mission, AMC might be
likened to a large corporation with many divisions. For
example, one of the "Commodity Commands"-Tank
Automotive Command in Detroit, Michigan-carries
out work similar to that carried out by a major automobile manufacturer in the United States. Another
"Commodity Command"-Electronics Comm,andcarries out work similar to that carried out by a major
electronics corporation. In a sense each of these "Commodity Commands" operates as a small corporation
within the larger parent corporation. Each Commodity
Command has laboratory facilities for carrying out
research in its areas of commodity responsibility. In
addition, independent laboratories carry out basic and


exploratory research. It may be helpful in the discussion
which follows to draw a comparison between industrial
situations and the Army Materiel Command. The
Commanding General of AMC occupies a position similar to that of the President of a large diversified corporation. The Commanding Generals of each of the
Commodity Commands and independent laboratories
might be compared to Senior Group Vice Presidents in
this large corporation, while the Commanding Officers
of various research activities within Commodity Commands carry out functions similar to those carried out
by Vice Presidents responsible for particular mission
areas within a corporation.
As in many large corporations, AMC has a number of
different types of computers in geographically dispersed
locations to provide computer support under many
different Commanding Officers.
Locations having major computing resources which
are candidates for sharing, and locations requiring
scientific and engineering computer support are shown
in Figure 3. The resources which are candidates for
sharing include 8 IBM 360 series computers (1 model 30,
1 model 40, 2 model 44s, 1 model 50, 3 model 65s),
three Control Data Corporation 6000 series computers
(1 CDC 6500, and 2 CDC 6600s), seven Univac 1100
series computers (6 Univac 1108s, 1 Univac 1106), one
Burroughs 5500 computer, two EMR 6135 computers,
and two additional major computers not yet selected.
These 21 computers are located and operated at 17 different locations among those shown in Figure 3. It is
not clear that every one of the locations requiring

Figure 3-AMC locations which have scientific computers or
require scientific computing services

services should ultimately receive them through a
computing network. The main purpose of this illustration is to show the magnitude of the problem.
In exploring the existing situation it came as somewhat of a surprise to discover that we already have most
of the management problems of computer networks,
despite the fact that not all of the seventeen computer
sites and thirty-one user sites are interconnected.
Computer support agreements now exist between many
different activities within Army Materiel Command, although not all of these provide for service via terminals.
DECENTRALIZED MANAGEMENT OF THE
NETWORK NODES
The wholesale/retail organizing rationale discussed
previously was developed as a vehicle for better understanding our present management structure, and as an
aid in identifying a viable structure for pooling computer resources across major organizational boundaries.
A number of proposals to centralize operational management of all of these computers were considered
and discarded. The computing centers which would
form the network exist today, and most have been
operational for a number of years. They are well managed and running smoothly and we would like to keep
it that way. Furthermore, the association of these
centers with the activities which they serve has been
mutually beneficial. ("Activity" is used here to refer to
an organizational entity having a defined mission and
distinct geographic location.) The centers receive resource support from the, activities and in turn provide
for the specialized needs of the research and development functions which they serve. Sharing of these
specialized technologies and services is a desirable objective of forming the network. For these reasons, the
authors believe decentralized computer management
would be necessary for a successful network.
To make our commitment to decentralized computer
management viable, we needed to face squarely the issue that each existing computer is used and controlled
by a local Commanding Officer to accomplish the assigned research and development mission of his activity. But network pooling of computer resources implies that some activities use the network in lieu of installing their own computer. For this approach to succeed, availability of time in the computer pool has to be
guaranteed to approximately the same degree as would
derive from local hardware. The offered guarantee in a
network environment would be an agreement between
the activity with the computer and the activity requiring computer support. To make the network suc-

ceed, corporate (or AMC) headquarters would have to
set policies insuring that agreements have sufficient force
to guarantee the using organization the resources specified.
In the following paragraphs we will discuss how these
agreements might be used in the Army Materiel Command type of environment. If we replace the words
"Commanding Officer" with the words "Vice President" it seems clear that the same concepts apply to
industry as well as to the military situation.
If agreements are to become sufficiently binding so
that they can be considered a resource it would be
necessary to expand the basic mission of the Commanding Officer who "owns" the computer. The only way to
make the computer into a command-wide (or corporate)
resource would be to assign the Commanding Officer
and his Director of Computing Services the additional
mission of providing computer support to all organizations authorized to make agreements with him, and to
identify the resources under his control which would be
given the task of providing computer services to "outside" users. These resources would now become a Wholesale Computing Facility serving both local and outside
organizations.
FUNCTIONS OF THE RETAILER
In a large corporation with many divisions each
division would require a "retailer" of computer services
to perform the applications oriented data processing.
Those divisions which operate computers would operate
them as wholesale functions to provide computer
service to all divisions within the corporation. Substituting the words "Commodity Command, Major Subordinate Command, or Laboratory" for the word
"Division" the same principle could apply to the Army
Materiel Command. Although a local commander would
give up some control over "his" computer, in that he
would guarantee some capacity to outside users, he
would gain access to capacity on every other computer
within the command, to support him in accomplishing
his primary mission.
Retail Computing Facility (RCF) describes that part
of the organization responsible for assuring that computer services are available to the customers and users
to accomplish the primary mission of the local activity.
Every scientist and engineer within an activity, e.g.,
laboratory, would look to his local RCF to provide the
type of service required. The RCF would turn to wholesalers throughout the entire corporation. This would
give the retailer the flexibility to fit to the job an available computer rather than having to force fit the job on to the local computer. These relationships are shown in Figure 4.

Figure 4-RCF uses wholesale service agreements with several WCFs to provide retail services to customers
[Diagram: Wholesale Computing Facilities 1 and 2 linked by wholesale/retail agreements through the Retail Computing Facility to customers/users 1 through n.]

In order to obtain the resources and provide the
services listed in Table II a considerable amount of
homework would have to be done by the retailer. The
retailer would estimate the types and amounts of
services required by his various users and arrange agreements with wholesalers to obtain these services. It
must be recognized that this is a difficult job and in
many instances cannot be done with great accuracy.
The retailer would act as a middleman between
customer/users and Wholesale Computer Facilities
within the network.
The retailer would be responsible for negotiating two
different types of agreements. He would have to negotiate long term commitments with various wholesalers
by guaranteeing to these wholesalers a certain minimum
dollar amount; in return the wholesalers would guarantee to the retailers a certain minimum amount of computer time.
The other type of agreement which retailers could
negotiate with wholesalers would be for time as required. This would take the form of a commitment to
spend dollars at a particular wholesale facility when the
demand occurred and if time were available from that
wholesaler. The retailer could then run jobs at that
WCF on a "first come, first served" basis, or according
to whatever queue discipline was agreed upon in advance. The range of agreements would be from "hard
scheduled computer runs" to "time as available."
It is imperative that a user not have to go through
lengthy negotiation each time he requests computer
service from a retailer. Submitting a job through the
local retailer to any computer in the corporate network
should be at least as simple as the current procedures for submitting a job to a local computer at a user's home installation.
FUNCTIONS OF THE WHOLESALER
Figure 5 graphically depicts the Wholesale Computer
Facility relationship to retailers. The WCF at installation m would provide resources and services listed in
Table I through the network to retailers at various
activities throughout the corporation. Normally, the
WCF at installation m would still provide most of the
service to the retailer at installation m; however, that
retailer would not have any formally privileged position over other retailers located elsewhere in the network. The primary functions of the wholesaler would be
to operate the computers and to provide the associated
services which have been negotiated by various retailers. The wholesaler might well have services which
were duplicated elsewhere in the network; however, he
might also have some which were unique to his facility.
It would be essential that every retailer in the network
be made aware of the services offered by each WCF,
and it would be the responsibility of the wholesaler to
ensure that all of his capabilities were made known.
In addition to operating the computer or computers at
his home installation, the WCF might also be responsible
for providing services to retailers through contracts
placed with facilities external to the corporation. For
example, a proprietary software package not available
from any computer in the corporate pool, but required
by one or more retailers, might be available from some
other computer which could be accessed by the corporate net. A contract to access those services could
then be placed through one of the wholesalers.
HEADQUARTERS FUNCTIONS IN
MANAGING THE CORPORATE NETWORK
Corporate Headquarters interaction with decentralized wholesale and retail computing facilities can be provided through the establishment of two groups, Computer Network Management (CNM), and the Computer Network Steering Committee (CNSC). Both CNM and CNSC should report to the Corporate Director of Computing Services as shown in Figure 6.

Figure 5-WCFs provide service to multiple RCFs
[Diagram: wholesale/retail agreements connect the Wholesale Computing Facility to Retail Computing Facilities 1 through n.]

Figure 6-Corporate headquarters organization with decentralized WCFs and RCFs
[Organization chart: Corporate Headquarters, with the Corporate Director of Computing Services over Computer Network Management and the Computer Network Steering Committee.]
Overall, the headquarters is responsible for insuring
that computer support requirements of scientists and
engineers throughout the corporation are effectively
met, and that they are provided in an efficient manner.
The first responsibility is to insure that proper computer support is available. The second responsibility is
to insure that the minimum amount of dollars are expended in providing that support.
Computer Network Management (CNM) has basic
headquarters staff responsibility to insure that the network is well coordinated and well run. It should:
1. Recommend policy and procedures for regulation
and operation of the network.
2. Resolve network problems not covered by corporate procedures.
3. Negotiate facilities management agreements
with the appropriate corporation divisions to
operate Wholesale Computing Facilities (see
Figure 6).
4. Work with WCFs, RCFs and the Computer Network Steering Committee to develop long
range plans concerning network facilities.
5. Serve as a network-wide information center on
facilities, services, rates, and procedures.
CNM need not be directly involved in the day to day
operations of the network. Wholesale/retail agreements
should be negotiated between WCFs and RCFs without requiring headquarters approval, so long as these
agreements are consistent with overall corporate policy.
Obviously, agreements not meeting this requirement
would require CNM involvement. However, headquarters should function, insofar as possible, on a management by exception basis.
A Computer Network Steering Committee (CNSC)
should be established to suggest policy for consideration
by the corporation. Members of the CNSC should be
drawn from the corporation's operating divisions which
have responsibility for decentralized management of the
wholesale and retail computing facilities. The Computer
Network Steering Committee can promote input to the
Corporate Headquarters of useful comments and ideas
on network policy and operation.
Under the general structure some specific functions in
which Computer Network Management would be involved can be discussed further.
Policies set by Computer Network Management
should govern the content of agreements between
wholesalers and retailers. The following is a list of
some of the items which would have to be covered in
such agreements:
1. The length of time for which an agreement
should run would have to be spelled out in each
case.
2. The wholesaler would have to guarantee a specific amount of service to the retailer in return
for a guarantee of a minimum number of dollars
from the retailer.
3. The kinds and levels of service to be provided
would have to be spelled out in detail.
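As a rough illustration only (the paper prescribes no record format; the field names and figures below are hypothetical), the three items above might be captured in a simple agreement record along the following lines, here sketched in Python:

# Hypothetical sketch of a wholesale/retail agreement record; every field
# name and value is illustrative, not taken from the paper.
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class WholesaleRetailAgreement:
    wholesaler: str                     # WCF committing the service
    retailer: str                       # RCF receiving the service
    term_months: int                    # item 1: how long the agreement runs
    guaranteed_service_hours: float     # item 2: service guaranteed by the wholesaler
    guaranteed_dollars: float           # item 2: minimum dollars guaranteed by the retailer
    service_levels: Dict[str, str] = field(default_factory=dict)   # item 3: kinds and levels of service

# Example: a one-year agreement trading a dollar floor for guaranteed time.
agreement = WholesaleRetailAgreement(
    wholesaler="WCF at installation m",
    retailer="RCF 1",
    term_months=12,
    guaranteed_service_hours=200.0,
    guaranteed_dollars=50_000.0,
    service_levels={"batch": "overnight turn around", "interactive": "prime shift"},
)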
Another major area in which Computer Network
Management could be involved is in setting rates and in
rationing services during periods of congestion. Policies
should be established which would promote as effective
and efficient support as possible during congested periods, without starving any single customer. Also, the
total amount of computer time which each wholesaler
can commit should be regulated to prevent over commitment of the network.
Computer Network Management should set up some


form of "currency" to be used when resources become
congested. The amount of "currency" in the network
would be regulated by Computer Network Management
with advice from the Computer Network Steering Committee. This "currency" based rationing scheme should
be put into effect ahead of time, rather than waiting
until resources become so congested that it has to be
created under emergency conditions. It is probable that
separate rations should be established for different
classes of service, such as interactive terminal service,
fast turn around batch processing, overnight turn
around, etc.
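Purely as an illustrative sketch (nothing below appears in the paper; the retailers, classes of service, and ration sizes are invented), such a rationing scheme might debit a per-class "currency" balance before accepting work during congested periods:

# Hedged sketch of the "currency" rationing idea described above; all names
# and numbers are hypothetical.
rations = {
    # retailer -> remaining currency units per class of service
    "RCF 1": {"interactive": 100, "fast batch": 300, "overnight": 1000},
    "RCF 2": {"interactive": 50, "fast batch": 200, "overnight": 800},
}

def accept_job(retailer, service_class, cost, congested):
    """During congestion, accept work only while the retailer's ration lasts."""
    if not congested:
        return True                         # normal operation: no rationing needed
    remaining = rations[retailer][service_class]
    if remaining < cost:
        return False                        # ration for this class of service is exhausted
    rations[retailer][service_class] = remaining - cost
    return True

# Example: a fast-turn-around batch job costing 25 units during congestion.
print(accept_job("RCF 1", "fast batch", 25, congested=True))    # True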
Under this organizational concept the corporate
headquarters would have to assume a greater responsibility for projecting requirements and procuring new
hardware and software to meet those requirements
throughout the corporation. Some requirements would
arise which would have to be met immediately. There
would not always be sufficient time to purchase new
hardware or software. In such cases computer network
management could arrange for external service contracts
to be let through one or more wholesalers. CNM would
have the responsibility for identifying peak workloads
anticipated for the entire network on the basis of feedback information received from wholesalers and retailers. When overall network services become congested, an open ended external service contract might be
placed to handle the excess. This provides time for a
corporate decision to be made as to whether or not additional computing capacity should be added to the
network.
The last major responsibility of Computer Network
Management would be to aggregate requirements being
received from wholesalers and retailers and to use these
to project when new hardware and software should be
procured for the S&E network community. The primary
responsibility for justifying this new hardware would
rest with CNM, drawing on all corporate resources for
support and coordination. Computer Network Management, with guidance from the Computer Network Steering Committee, would also be responsible for determining where new hardware should be placed in order to
run the network in the most effective and efficient
manner.
Computer Network Management would fulfill its
mission of insuring computer support to corporate
scientists and engineers by negotiating facility management agreements with specific divisions of the corporation to establish and operate Wholesale Computer
Facilities. These WCFs would offer the specified kinds
and levels of service to Retail Computer Facilities via
the network. RCFs would tailor and add to the services
to meet requirements of local customers.


SUMMARY
The notions of wholesale and retail computer facilities
are particularly useful in examining the problems which
must be faced when entering a computer network environment. The concept helps to clarify the functions
which must be performed within a network of shared
computer resources, and the management commitment
which must be made if the objectives and advantages of
such sharing are to be realized. Mapping of the wholesale/retail functions onto the corporate organization
which is forming the network can be valuable in identifying to members of that organization what their roles
would be in the network environment. Such clarification
is a prerequisite to securing the commitment necessary
to make a network successful.
Decisions as to whether or not operational management of the computer centers should be decentralized
will vary with circumstances, but if efficient, well managed decentralized computing facilities exist, they
should be retained. In any case, a central computer network management function is needed to set policy and
to take an overall corporate viewpoint. It should be
remembered, however, that the primary purpose of a
scientific and engineering computing network is to
provide services to research and development projects
at field activities. As such, the goal should be to contribute to the optimization of the costs and time involved in the research and development cycle, rather
than to optimize the production of computer cycles.
The establishment of a network steering committee
which includes representatives from field activities can
help to insure the achievement of this goal and to increase confidence in the network among the field personnel which it is to serve.
Finally it is important to realize that a corporation
begins to enter the network environment, from the
management standpoint, as soon as some of its major
activities begin to share computer resources, whether or
not it involves any computer to computer communications facilities. Recognition of this point and a careful
examination of corporate objectives and goals in computer sharing should lead to the establishment of a
computer network management function, so that the
corporation can manage itself into an orderly network
environment rather than drifting into a chaotic one.
ACKNOWLEDGMENT
The authors would like to gratefully acknowledge extensive discussions and interaction with a group of
people whose ideas and hard work contributed substantially to the content of this paper: Mr. Einar
Stefferud, Einar Stefferud & Associates; and the following members of organizations within the US Army
Materiel Command: Richard Butler, Harry Diamond
Labs; John Cianfione, US Army Materiel Command
Headquarters; James Collins, Missile Command;
Tom Dames, Electronics Command; Edward Goldstein, Test and Evaluation Command; Dr. James Hurt,
Weapons Command; Paul Lascala, Aviation Systems
Command; Sam P. McCutchen, Mobility Equipment
R&D Center; James Pascale, Watervliet Arsenal;
Michael Romanelli, Aberdeen Research & Development
Center; George Sumrall, Electronics Command.
REFERENCES
1 E STEFFERUD
A wholesale/retail structure for the AMC computer network
Unpublished Discussion Paper Number ES&A/AMC/CNC
DP-1 February 3 1972
2 J J PETERSON S A VEIT
Survey of computer networks
Mitre Corporation
MTP-357 September 1971
3 F P BROOKS J K FERRELL T M GALLIE
Organizational, financial, and political aspects of a three
university computing center
Proceedings of the IFIP Congress 1968 E49-52
4 M S DAVIS
Economics-point of view of designer and operator
Proceedings of Interdisciplinary Conference on Multiple
Access Computer Networks
University of Texas and Mitre Corporation 1970

5 J J HOOTMAN
The computer network as a marketplace
Datamation Vol 18 No 4 April 1972
6 C MOSMANN E STEFFERUD
Campus computing management
Datamation Vol 17 No 5 March 1971
7 E STEFFERUD
Computer management
College and University Business September 1970
8 L G ROBERTS B D WESSLER
Computer network development to achieve resource sharing
AFIPS Conference Proceedings May 1970
9 F E HEART et al
The interface message processor for the ARPA computer
network
AFIPS Conference Proceedings May 1970
10 C S CARR S D CROCKER V G CERF
HOST-HOST communication protocol in the ARPA network
AFIPS Conference Proceedings May 1970
11 E STEFFERUD
Management's role in networking
Datamation Vol 18 No 4 April 1972
12 E STEFFERUD
The environment of computer operating system scheduling:
Toward an understanding
Journal of the Association for Education Data Systems
March 1968
13 BLUE RIBBON DEFENSE PANEL
Report to the President and the Secretary of Defense on the
Department of Defense Appendix I: Staff report on automatic
data processing
July 1970
14 S D CROCKER et al
Function-oriented protocols for the ARPA computer network
AFIPS Conference Proceedings
May 1970

A functioning computer network
for higher education in North Carolina
by LELAND H. WILLIAMS
Triangle Universities Computation Center
Research Triangle Park, North Carolina

INTRODUCTION


Currently there is a great deal of talk concerning computer networks. There is so much talk that the
solid achievements in the area sometimes tend to be
overlooked. It should be clearly understood then, that
this paper deals primarily with achievements. Only the
last section, which is clearly labeled, deals with plans for
the future.
Adopting terminology from Peterson and Veit,1
TUCC is essentially a centralized, homogeneous network comprising a central service node (IBM 370/165),
three primary job source nodes (IBM 360/75, IBM 360/
40, IBM 360/40) and 23 secondary job source nodes
(leased line Data 100s, UCC 1200s, IBM 1130s, IBM
2780s, and leased and dial line IBM 2770s) and about
125 tertiary job source nodes (64 dial or leased lines for
Teletype 33 ASRs, IBM 1050s, IBM 2741s, UCC 1035s,
etc.) See Figures 1 and 2. All source node computers in
the network are homogeneous with the central service
node and, although they provide local computational
service in addition to teleprocessing service, none currently provides (non-local) network computational
service. However, the technology for providing network
computational service at the primary source nodes is
immediately available and some cautious plans for
using this technology are indicated in the last section
of this paper.

Figure 1-The TUCC network
[Network diagram: the TUCC central machine connected to primary terminal installations at Duke/Durham (360/40), NCSU/Raleigh (360/40), and UNC/Chapel Hill (360/75), and to 50 educational institutions-universities, colleges, community colleges, technical institutions, and secondary schools-through various medium and low speed terminals. Note: in addition to the primary terminal installations at Duke, UNC, and NCSU, each campus has an array of medium and low-speed terminals directly connected to TUCC.]

Figure 2-Network of institutions served by TUCC/NCECS
[Map: single-institution locations are marked with dots and locations with more than one institution are circled (Asheville, Charlotte, Durham, Elizabeth City, Greensboro, Raleigh, Winston-Salem, Wilmington); total network institutions: 53.]

Figure 3-TUCC hardware configuration
[Block diagram: the 3165 CPU (2 million bytes) with 2314 and 3330 disk facilities, 2803 tape control, 2540 card read punch, and 2701 data adapters serving the Duke M/40, NCSU M/40, and UNC M/75 links at 40.8K baud, 64 ports for low-speed typewriter terminals at 110 baud, and a total of 20 medium-speed lines at 2400 baud and 8 at 4800 baud.]

BACKGROUND

The Triangle Universities Computation Center was established in 1965 as a non-profit corporation by three major universities in North Carolina-Duke University at Durham, The University of North Carolina at Chapel Hill, and North Carolina State University at Raleigh. Duke is a privately endowed institution and the other two are state supported. Among them are two medical schools, two engineering schools, 30,000 undergraduate students, 10,000 graduate students, and 3,300 teaching faculty members.

The primary motivation was economic-to give each of the institutions access to more computing power at lower cost than they could provide individually. Initial grants were received from NSF and from the North Carolina Board of Science and Technology, in whose Research Triangle Park building TUCC was located. This location represents an important decision, both because of its geographic and political neutrality with
respect to all three campuses and because of the value
of the Research Triangle Park environment.
The Research Triangle Park is one of the nation's
most successful research parks. In a wooded tract of
5,200 acres located in the small geographic triangle
formed by the three universities, the Park in 1972 has
8,500 employees, a payroll of $100 million and an investment in buildings of $140 million. The Park contains 40
buildings that house the research and development
facilities of 19 separate national and international
corporations and government agencies and other
institutions.
TUCC pioneered massively shared computing; hence
there were many technological, political, and protocol
problems to overcome. Successive stages toward solution of these problems have been reported by Brooks,
Ferrell, and Gallie;2 by Freeman and Pearson;3 and by
Davis. 4 This paper will focus on present success.

PRESENT STATUS

TUCC supports educational, research, and (to a lesser, but growing extent) administrative computing requirements at the three universities, and also at 50 smaller institutions in the state and several research laboratories by means of multi-speed communications and computer terminal facilities. TUCC operates a 2-megabyte, telecommunications-oriented IBM 370/165 using OS/360-MVT/HASP and supporting a wide variety of terminals (see Figure 3). For high speed communications, there is a 360/75 at Chapel Hill and there are 360/40s at North Carolina State and Duke. The three campus computer centers are truly and completely autonomous. They view TUCC simply as a pipeline through which they get massive additional computing power to service their users.
The present budget of the center is about $1.5 million. The Model 165 became operational on September 1, 1971, replacing a saturated 360/75 which was running a peak load of 4200 jobs/day. The life of the Model 75 could have been extended somewhat by the replacement of 2 megabytes of IBM slow core with an equal amount of Ampex slow core. This would have increased the throughput by about 25 percent for a net cost increase of about 8 percent.
TUCC's minimum version of the Model 165 costs only about 8 percent more than the Model 75 and it is expected to do twice as much computing. So far it has processed 6100 jobs/day without saturation. This included about 3100 autobatch jobs, 2550 other batch jobs, and 450 interactive sessions. Of the autobatch jobs, 94 percent were processed with less than 30 minutes delay (probably 90 percent with less than 15 minutes delay), and 100 percent with less than 3 hours delay. Of all jobs, 77 percent were processed with less than 30 minutes delay, and 99 percent with less than 5 hours delay. At the present time about 8000 different individual users are being served directly. The growth of TUCC capability and user needs to this point is illustrated in Figure 4.
Services to the TUCC user community include both remote job entry (RJE) and interactive processing. Included in the interactive services are programming systems employing the BASIC, PL/1, and APL languages. Also TSO is running in experimental mode. Available through RJE is a large array of compilers including FORTRAN IV, PL/1, COBOL, ALGOL, PL/C, WATFIV and WATBOL. These language facilities coupled with an extensive library of application programs provide the TUCC user community with a dynamic information processing system supporting a wide variety of academic computing activities.

Figure 4-TUCC jobs per month, 1967-1972
[Chart: total jobs per month run at TUCC, in thousands (vertical scale 10 to 120), plotted for 1967 through 1972.]

ADVANTAGES

The financial advantage deserves further comment. As a part of the planning process leading to installation of the Model 165, one of the universities concluded that
it would cost them about $19,000 per month more in
hardware and personnel costs to provide all their computing services on campus than it would cost to continue participation in TUCC. This would represent a
40 percent increase over their present expense for terminal machine, communications, and their share of TUCC
expense.
There are other significant advantages. First, there
is the sharing of a wide variety of application programs.
Once a program is developed at one institution, it can be
used anywhere in the network with no difficulty. For
proprietary programs, usually only one fee need be paid.
A sophisticated TUCC documentation system sustains
this activity. Second, there has been a significant impact
on the ability of the universities to attract faculty
members who need large scale computing for their
research and teaching and several TUCC staff members
including the author have adjunct appointments with
the university computer science departments.
A third advantage has been the ability to provide
very highly competent systems programmers (and
management) for the center. In general, these personnel
could not have been attracted to work in the environment of the individual institutions because of salary
requirements and because of system sophistication
considerations.

NORTH CAROLINA EDUCATIONAL
COMPUTING SERVICE
The North Carolina Board of Higher Education has
established an organization known as the North
Carolina Educational Computing Service (NCECS).
This is the successor of the North Carolina Computer
Orientation Project 5 which began in 1966. NCECS
participates in TUCC and provides computer services
to public and private educational institutions in North
Carolina other than the three founding universities.
Presently 40 public and private universities, junior
colleges, and technical institutes plus one high school
system are served in this way. NCECS is located with
TUCC in the North Carolina Board of Science and
Technology building in the Research Triangle Park.
This, of course, facilitates communication between
TUCC and NCECS whose statewide users depend upon
the TUCC telecommunication system.
NCECS serves as a statewide campus computation
center for their users, providing technical assistance,
information services, etc. In addition, grant support
from NSF has made possible a number of curriculum
development activities. NCECS publishes a catalog of
available instructional materials; they provide curriculum development services; they offer workshops to
promote effective computer use; they visit campuses,
stimulating faculty to introduce computing into courses
in a variety of disciplines. Many of these programs have
stimulated interest in computing from institutions and
departments where there was no interest at all. One
major university chemistry department, for example,
ordered its first terminal in order to use an NCECS
infrared spectral information program in its courses.
The software for NCECS systems is derived from a
number of sources in addition to sharing in the community wide program development described above.
Some of it is developed by NCECS staff to meet a
specific and known need; some is developed by individual institutions and contributed to the common cause;
some of it is found elsewhere, and adapted to the system. NCECS is interested in sharing curriculum
oriented software in as broad a way as possible.
Serving small schools in this way is both a proper
service for TUCC to perform and is also to its own
political advantage. The state-supported founding
universities, UNC and NCSU, can show the legislature
how they are serving much broader educational goals
with their computing dollars.
ORGANIZATION
TUCC is successful not only because of its technical
capabilities, but also because of the careful attention
given to administrative protection of the interests of
the three founding universities and of the NCECS
schools. The mechanism for this protection can, perhaps, best be seen in terms of the wholesaler-retailer
concept. 6 TUCC is a wholesaler of computing service;
this service consists essentially of computing cycles, an
effective operating system, programming languages,
some application packages, a documentation service,
and management. The TUCC wholesale service specifically does not include typical user services-debugging,
contract programming, etc. Nor does it include user
level billing nor curriculum development. Rather these
services are provided for their constituents by the
Campus Computation Centers and NCECS, which are
the retailers for the TUCC network. See Figure 5.

Figure 5-TUCC wholesaler-retailer structure

The wholesaler-retailer concept can also be seen in
the financial and service relationships. Each biennium,
the founding universities negotiate with each other and
with TUCC to establish a minimum financial commitment from each, to the net budgeted TUCC costs. Then,
on an annual basis the founding universities and TUCC
negotiate to establish the TUCC machine configuration,
each university's computing resource share, and the
cost to each university. This negotiation, of course, includes adoption of an operating budget. Computing
resource shares are stated as percentages of the total
resource each day. These have always been equal for
the three founding universities, but this is not necessary.
Presently each founding university is allocated 25 percent, the remaining 25 percent being available for
NCECS, TUCC systems development, and other users.
This resource allocation is administered by a scheduling
algorithm which insures that each group of users has
access to its daily share of TUCC computing resources.
The algorithm provides an effective trade-off for each
category between computing time and turn-around
time; that is, at any given time the group with the least
use that day will have job selection preference.
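The selection rule just described can be pictured with a small sketch (this is not the actual TUCC scheduler, which ran under OS/360-MVT/HASP; the group names and numbers are only illustrative): among the groups with jobs waiting, pick the one that has so far consumed the least of its daily share.

# Illustrative sketch of share-based job selection; not TUCC's HASP code.
shares = {"Duke": 0.25, "UNC": 0.25, "NCSU": 0.25, "NCECS/other": 0.25}
used_today = {"Duke": 2.0, "UNC": 1.5, "NCSU": 0.0, "NCECS/other": 0.5}   # hours consumed so far

def next_group(queues):
    """Pick the group with the least use that day relative to its share,
    considering only groups that have jobs queued."""
    waiting = [g for g, jobs in queues.items() if jobs]
    if not waiting:
        return None
    return min(waiting, key=lambda g: used_today[g] / shares[g])

# Example: NCECS/other has used the least of its share, so it is served first.
queues = {"Duke": ["job1"], "UNC": ["job2"], "NCSU": [], "NCECS/other": ["job3"]}
print(next_group(queues))    # NCECS/other

Because only groups with work waiting are considered, any share left unused by one group is automatically consumed by the others, which is the defaulting behavior described in the next paragraph.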
The scheduling algorithm also allows each founding
university and NCECS to define and administer quite
flexible, independent priority schemes. Thus the
algorithm effectively defines independent sub-machines
for the retailers, providing them with the same kind of
assurance that they can take care of their users' needs
as would be the case with totally independent facilities.
In addition, the founding university retailers have a
bonus because the algorithm defaults unused resources
from other categories, including themselves, to one or
more of them according to demand. This is particularly
advantageous when their peak demands do not coincide.
This flexibility of resource use is a major advantage
which accrues to the retailers in a network like TUCC.
The recent installation of the old TUCC Model 75 at
UNC deserves some comment at this point because it
represents a good example of the TUCC organization in
action. UNC has renewed a biennial agreement, with
its partners, calling essentially for continued equal
sharing in the use of and payment for TUCC computing
resources. Such equality is possible in our network
precisely because each campus is free to supplement as
required at home. Furthermore, the UNC Model 75 is
a very modest version of the prior TUCC Model 75. It
has 256K of fast core and one megabyte of slow core


where TUCC had one and two megabytes respectively.
Rental accruals and state government purchase plans
combined to make the stripped Model 75 cost UNC less
than their previous Model 50. It provides only a 20
percent throughput improvement over the displaced
Model 50. The UNC Model 75 has become the biggest
computer terminal in the world!
There are several structural devices which serve to
protect the interests of both the wholesaler and the
retailers. At the policy making level this protection is
afforded by a Board of Directors appointed by the
Chancellors of the three founding universities. Typically
each university allocates its representatives to include
(1) its business interests, (2) its computer science
instructional interests, and (3) its other computer
user interests. The University Computation Center
Directors sit with the Board whether or not they are
members as do the Director of NCECS and the President of TUCC. A good example of the policy level
function of this Board is their determination, based on
TUCC management recommendations, of computing
service rates for NCECS and other TUCC users.
At the operational level there are two important
groups, both normally meeting each month. The Campus Computation Center Directors' meeting includes
the indicated people plus the Director of NCECS and
the President, the Systems Manager, and the Assistant
to the Director of TUCC. The Systems Programmers'
meeting includes representatives of the three universities, NCECS and TUCC.
In addition, of course, each
of the universities has the usual campus computing
committees.

PROSPECTS
TUCC continues to provide cost-effective general
computing service for its users. Some improvements
which can be foreseen include:
1. A wider variety of interactive services to be made available through TSO.
2. An increased service both for instructional and
administrative computing for the other institutions of higher education in North Carolina.
3. Additional economies for some of the three
founding universities through increasing TUCC
support of their administrative data processing
requirements.
4. Development of the network into a multiple
service node network by means of the symmetric
HASP-to-HASP software developed at TUCC.
5. Provision (using HASP) for medium speed terminals to function as message concentrators for


low speed terminals, thus minimizing communication costs.
6. Use of line multiplexors to reduce communication costs.
7. Extension of terminal service to a wider variety
of data rates.

Administrative data processing
Some further comment can be made on item 3. TUCC
has for some time been' handling the full range of
administrative data processing for two NCECS
universities and is beginning to do so for other NCECS
schools. The primary reason that this application lags
behind instructional applications in the NCECS schools
is simply that grant support, which stimulated development of the instructional applications, has been absent
for administrative applications. However, the success
of the two pioneers has already begun to spread among
the others.
With the three larger universities there is a greater
reluctance to shift their administrative data processing
to TUCC, although Duke has already accomplished
this for their student record processing. One problem
which must be overcome to complete this evolution and
allow these universities to spend administrative computing dollars on the more economic TUCC machine is
the administrator's reluctance to give up a machine on
which he can exercise direct priority pressure. The
present thinking is that this will be accomplished by
extending the sub-machine concept (job scheduling
algorithm) described in the previous section so that
each founding university may have both a research-instructional sub-machine and an administrative sub-machine with unused resources defaulting from either
one to the other before defaulting to another category.
Of course, the TUCC computing resource will probably
have to be increased to accommodate this; the annual negotiation among the founders and TUCC provides a
natural way to define any such necessary increase.
SUMMARY
Successful massively shared computing has been
demonstrated by the Triangle Universities Computation Center and its participating educational institutions in North Carolina. Some insight has been given
into the economic, technological, and political factors
involved in the success as well as some measures of the
size of the operation. The TUCC organizational
structure has been interpreted in terms of a wholesale-retail analogy. The importance of this structure and
the software division of the central machine into
essentially separate sub-machines for each retailer cannot be over-emphasized.
REFERENCES
1 J J PETERSON S A VEIT
Survey of computer networks
MITRE Corporation Report MTP-359 1971
2 F P BROOKS J K FERRELL T M GALLIE
Organizational, financial, and political aspects of a three
university computing center
Proceedings of the IFIP Congress 1968 E49-52
3 D N FREEMAN R R PEARSON
Efficiency vs responsiveness in a multiple-service computer
facility
Proceedings of the 1968 ACM Annual Conference
4 M S DAVIS
Economics-point of view of designer and operator
Proceedings of Interdisciplinary Conference on Multiple
Access Computer Networks
University of Texas and MITRE Corporation 1970
5 L T PARKER T M GALLIE F P BROOKS
J K FERRELL
Introducing computing to smaller colleges and universities-a
progress report
Comm ACM Vol 12 1969 319-323
6 D L GROBSTEIN R P UHLIG
A wholesale retail concept for computer network management
AFIPS Conference Proceedings Vol 41 1972 FJCC

Multiple evaluators in an extensible
programming system*
by BEN WEGBREIT
Harvard University
Cambridge, Massachusetts

INTRODUCTION

As advanced computer applications become more complex, the need for good programming tools becomes more acute. The most difficult programming projects require the best tools. It is our contention that an effective tool for programming should have the following characteristics:
(1) Be a complete programming system-a language, plus a comfortable environment for the programmer (including an editor, documentation aids, and the like).
(2) Be extensible, in its data, operations, control, and interfaces with the programmer.
(3) Include an interpreter for debugging and several compilers for various levels of compilation-all fully compatible and freely mixable during execution.
(4) Include a program verifier that validates stated input/output relations or finds counter-examples.
(5) Include facilities for program optimization and tuning-aids for program measurement and a subsystem for automatic high-level optimization by means of source program transformation.
We will assume, not defend, the validity of these contentions here. Defenses of these positions by us and others have appeared in the literature.1,2,3,4,5 The purpose of this paper is to discuss how these characteristics
are to be simultaneously realized and, in particular, how the evaluators, verifier, and optimizer are to fit together. Compiling an extensible language where compiled code is to be freely mixed with interpreted code presents several novel problems and therefore a few unique opportunities for optimization. Similarly, extensibility and multiple evaluators make program automation by means of source level transformation more complex, yet provide additional handles on the automation process.
This paper is divided into five sections. The second section deals with communication between compiled and interpreted code, i.e., the runtime information structures and interfaces. The third section discusses one critical optimization issue in extensible languages-the compilation of unit operations. The fourth section examines the relation between debugging problems, proving the correctness of programs, and use of program properties in compilation. Finally, the fifth section discusses the use of transformation sets as an adjunct to extension sets for application-oriented optimization.
Before treating the substantive issues, a remark on the implementation of the proposed solutions may be in order. Our acquaintance with these problems has arisen from our experience in the design, implementation, and use of the ECL programming system. ECL is an extensible programming system utilizing multiple evaluators; it has been operational on an experimental basis, running on a DEC PDP10, since August 1971. Some of the techniques discussed in this paper are functional, others are being implemented, still others are being designed. As the status of various points is continually changing, explicit discussion of their implementation state in ECL will be omitted.
For concreteness, however, we will use the ECL system and ECL's base language, EL1, as the foundation for discussion. An appendix treats those aspects of EL1 syntax needed for reading the examples in this paper.

* This work was supported in part by the U.S. Air Force,
Electronics Systems Division, under ContractF19628-71-C-0173
and by the Advanced Research Projects Agency under Contract
F19628-71-C-0174.

MIXING INTERPRETED AND COMPILED
CODE
The immediate problem in a multiple evaluator system is mixing code. A program is a set of procedures
which call each other; some are interpreted, others
compiled by various compilers which optimize to various levels. Calls and non-local gotos are allowed where
either side may be either compiled or interpreted. Additionally, it is useful to allow control flow by means
of RETFROM, that is, the forced return from a specified procedure call (designated by name) with a specified value, as if that procedure call had returned normally with the given value (cf. Reference 6).
Within each procedure, normal techniques apply.
Interpreted code carries the data type of each entity: for anonymous temporary results as well as parameters and locals. Since the set of data types is open-ended and augmentable during execution, data types
are implemented as pointers to (or indices in) the data
type table. Compiled code can usually dispense with
data types so that temporary results need not, in general, carry type information. In either interpreted or
compiled procedures, where data types are carried, the
type is associated not with the object but rather with a
descriptor consisting of a type code and a pointer to the
object. This results in significant economies whenever
objects are generated in the free storage region.
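To make the representation concrete, the following is a rough sketch in present-day Python (all names invented for exposition; this is not the ECL data layout): an open-ended type table plus descriptors that pair a type code with a pointer to the object.

# Hypothetical sketch of an open-ended type table and typed descriptors.
# The type table can grow during execution; a descriptor carries only a
# type index (the "type code") and a reference to the object itself.

class TypeTable:
    def __init__(self):
        self.types = []            # mode definitions, added as extensions appear
        self.index = {}            # name -> type code

    def define(self, name, definition=None):
        if name in self.index:
            return self.index[name]
        self.types.append((name, definition))
        self.index[name] = len(self.types) - 1
        return self.index[name]

class Descriptor:
    """A value as seen by the interpreter: type code plus pointer to the object."""
    def __init__(self, type_code, obj):
        self.type_code = type_code   # index into the type table
        self.obj = obj               # the object in free storage

types = TypeTable()
INT = types.define("INT")
STRING = types.define("STRING")

x = Descriptor(INT, 3)
s = Descriptor(STRING, "abc")
assert x.type_code == INT and s.type_code == STRING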
Significant issues arise in communication between
procedures. The interfaces must:
(1) Allow identification of free variables in one procedure with those of a lower access environment
and supply data type information where required.
(2) Handle a special, but important, subcase of #1: non-local gotos out of one procedure into a lower
access environment.
(3) Check that the arguments passed to compiled
procedure are compatible with the formal
parameter types.
(4) Check that the result passed back to a compiled
procedure (from a normal return of a called function or via a RETFROM) is compatible with
the expected data type.
These communication issues are somewhat complicated
by the need to keep the overhead of procedure interface as low as possible for common cases of two compiled procedures linking in desirable (i.e., well-programmed) ways.
The basic technique is to include in the binding (i.e.,
parameter block) for any new variable its name and its

mode (i.e., its data type) in addition to its value. Names
are implemented as pointers to (or indices in) the symbol table. (With reasonable restrictions on the number of
names and modes, both name and mode can be packed

into a 32-bit word.) Within a compiled procedure, all
variables are referenced as a pair (block level number,
variable number within that block). Translation from
name to such a reference pair is carried out for each
bound appearance of a variable during compilation; at
run time, access is made using a display (cf. Reference
7). However, a free appearance of a variable is represented and identified by symbolic name. Connection
between the free variable and some bound variable in
an enclosing access environment is made during execution, implemented using either shallow or deep bindings
(cf. Reference 8 for an explanation of the issues and a
discussion of the trade-offs for LISP). Once identification is made, the mode associated with the bound variable is checked against the expected mode of the free
variable, if the expected mode is known.
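A similar sketch of the binding and access machinery (hypothetical Python; the real system packs name and mode into a single word and uses a display of stack positions) might look like this:

# Bindings carry (name, mode, value).  Compiled code reaches bound variables
# by (block level, offset) through a display; free variables are looked up by
# name in enclosing bindings, and their mode is checked if one was expected.

class Binding:
    def __init__(self, name, mode, value):
        self.name, self.mode, self.value = name, mode, value

display = []          # one block of bindings per active BEGIN-END level

def enter_block(bindings):
    display.append(bindings)

def exit_block():
    display.pop()

def bound_ref(level, offset):
    """Access used by compiled code: position known at compile time."""
    return display[level][offset]

def free_ref(name, expected_mode=None):
    """Access used for free variables: search enclosing environments by name."""
    for block in reversed(display):
        for b in block:
            if b.name == name:
                if expected_mode is not None and b.mode != expected_mode:
                    raise TypeError("mode mismatch for free variable " + name)
                return b
    raise NameError(name)

enter_block([Binding("BETA", "STRING", "abc")])
enter_block([Binding("I", "INT", 1)])
assert bound_ref(1, 0).value == 1
assert free_ref("BETA", expected_mode="STRING").value == "abc"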
To illustrate the last point, we suppose that in some
procedure, P, it is useful to use the free variable BETA
with the knowledge that in all correctly functioning
programs the relevant bound BETA will always be a
character string. To permit partial type checking during compilation, a declaration may be made at the head
of the first BEGIN-END block of P.
DECL BETA:STRING SHARED BETA;
This creates a local variable BETA of mode STRING
which shares storage (i.e., binding by reference in
FORTRAN or PL/I9) with the free variable BETA.
All subsequent appearances of BETA in P are bound,
i.e., identified with the local variable named BETA.
Since the data type of the local BETA is known, normal compilation can be done for all internal appearances
of BETA. The real identity of BETA is fixed during
execution by identification with the free BETA of the
access environment at the point P is entered. When the
identification of bound and free BETA is made, mode
checking (e.g., half-word comparison of two type codes)
ensures that mode assumptions have not been violated.
In the worst case, parameter bindings entail the same
sort of type checking. The arguments passed to a procedure come with associated modes. When a procedure
is entered, the actual argument modes can be checked
against the expected parameter modes and, where appropriate, conversion performed. Then the names of the
formal parameters are added to the argument block,
forming a complete variable binding. Notice that this
works in all four cases of caller/callee pairs: compiled/


compiled, compiled/interpreted, interpreted/compiled
and interpreted/interpreted. Since type checking is
implemented by a simple (usually half-word) comparison, the overhead is small.
However, for the most common cases of compiled/
compiled pairs, mode checking is handled by a less
flexible but more efficient technique. The mode of the
called procedure may be declared in the caller. For example:
DECL G:PROC(INT,STRING;COMPLEX);
specifies that G is a procedure-valued variable which
takes two arguments, an integer and a character string,
and returns a complex number. For each call on G in
the range of this declaration, mode checking and insertion of conversion code can be done during compilation,
with the knowledge that G is constrained to take on
only certain procedure values. To guarantee this constraint, all assignments to (or bindings of) G are type
checked. Type checking is made relatively inexpensive
by giving G the mode PROC(INT,STRING;COMPLEX)-i.e., there is an entry in the data type table
for it-and comparing this with the mode of the procedure value being assigned. The single comparison
simultaneously verifies the validity of the result mode
and both argument modes.
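The effect of interning a procedure mode as a single type-table entry can be sketched as follows (illustrative Python; the names proc_mode, ProcValue, and assign_G are invented):

# Sketch: a declared procedure mode such as PROC(INT,STRING;COMPLEX) is
# interned as one entry in the type table, so checking an assignment to G
# reduces to a single comparison of type codes.

proc_modes = {}                       # (argument modes, result mode) -> type code

def proc_mode(arg_modes, result_mode):
    key = (tuple(arg_modes), result_mode)
    if key not in proc_modes:
        proc_modes[key] = len(proc_modes)
    return proc_modes[key]

class ProcValue:
    def __init__(self, mode_code, fn):
        self.mode_code = mode_code
        self.fn = fn

G_MODE = proc_mode(["INT", "STRING"], "COMPLEX")

def assign_G(proc_value):
    # one comparison stands in for checking both argument modes and the
    # result mode at once
    if proc_value.mode_code != G_MODE:
        raise TypeError("procedure value has wrong mode for G")
    return proc_value

ok = ProcValue(proc_mode(["INT", "STRING"], "COMPLEX"), lambda i, s: complex(i, len(s)))
assign_G(ok)                          # accepted: same interned mode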
Result types are treated similarly. For each procedure call, a uniform call block is constructed* which
includes the name of the procedure being called and the
expected mode of the result (e.g., for the above example,
the name field is G and the expected-result-mode field is
COMPLEX). This is ignored when compile-time checking of result type is possible and normal return occurs.
However, if interpreted code returns to compiled code
or if RETFROM causes a return to a procedure by a
non-direct callee, then the expected-result-mode field is
checked against the mode of the value returned.
Transfer of control to non-local labels falls out
naturally if labels are treated as named entities having
constant value. On entry to a BEGIN-END block (in
either interpreted or compiled code), a binding is made
for each label in that block. The label value is a triple (indicator of whether the block is interpreted or compiled, program address, stack position). A non-local goto to label L is executed by identifying the label value referenced by the free use of L, restoring the stack position from the third component of the triple, and either jumping to the program address in compiled code or to the statement executor of the interpreter.
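A toy model of label values and non-local gotos, under the assumptions just stated (illustrative Python, not the ECL runtime):

# Sketch of label values and a non-local goto.  A label binding is a triple
# (interpreted-or-compiled flag, program address, stack position); the goto
# restores the stack to the recorded position and resumes at the address.

stack = []                                 # activation records, innermost last

class LabelValue:
    def __init__(self, compiled, address, stack_pos):
        self.compiled = compiled           # True if the block was compiled
        self.address = address             # program address or statement index
        self.stack_pos = stack_pos         # stack height when the block was entered

def bind_label(compiled, address):
    return LabelValue(compiled, address, len(stack))

def nonlocal_goto(label):
    del stack[label.stack_pos:]            # pop back to the defining block
    if label.compiled:
        return ("jump_to_code", label.address)
    return ("resume_interpreter_at", label.address)

stack.extend(["outer frame", "inner frame"])
L = LabelValue(compiled=False, address=7, stack_pos=1)
assert nonlocal_goto(L) == ("resume_interpreter_at", 7) and len(stack) == 1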

* This can be included in the LINK information.7


UNIT COMPILATION
In most programs the bulk of the execution time is
spent performing the unit operations of the problem
domain. In some cases (e.g., scalar calculations on
reals), the hardware realizes the unit operations directly.
Suppose, however, that this is not the case. Optimizing
such programs requires recognizing instances of the
unit operations and special treatment-unit compilation-to optimize these units properly.
An extensible language makes recognition a tractable
problem, since the most natural style of programming
is to define data types for the unit entities, and procedures for the unit operations in each problem area.
(Operator extension and syntax extension allow the
invocation of these procedures by prefix and infix expressions and special statement types.) Hence, the unit
operations are reasonably well-modularized. Detecting
which procedures in the program are the critical unit
operations entails static analysis of the call and loop
structure, coupled with counts of call frequency during
execution of the program over benchmark data sets.
The critical unit operations generally have one or
more of the following characteristics:
(1) They have relatively short execution time; their
importance is due to the frequency of call, not
the time spent on each call.
(2) Their size is relatively small.
(3) They are terminal nodes of the call structure, or
nearly terminal nodes.
(4) They entail a repetition, performing the same
action over the lower-level elements which collectively comprise the unit object of the problem
level.
Unit compilation is a set of special heuristics for exploiting these characteristics.
Since execution time is relatively small, call/return
overhead is a significant fraction. Where the unit
operations are terminal, the overhead can be substantially reduced. The arguments are passed from compiled
code to a terminal unit operation with no associated
modes. (Caller and callee know what is being transmitted.) The arguments can usually be passed directly
in the registers. No bindings are made for the formal
parameters. (A terminal node of the call structure calls
no other; hence, there can be no free uses of these variables.) The result can usually be returned in a register
again, with no associated mode information.
Since the unit operations are important far out of
proportion to their size, they are subject to optimizing
techniques too expensive for normal application. Opti-


mal ordering of a computation sequence (e.g., to minimize memory references or the number of temporary
locations) can, in general,* be assured only by a search
over a large number of possible orderings. Further, the
use of identities (e.g., a*b+a*c → a*(b+c)) to minimize
the computational cost causes significant increase in the
space of possibilities to be considered. The use of arbitrary identities, of course, makes the problem of program equivalence (and, hence, of cost minimization)
undecidable. However, an effective procedure for obtaining equivalent computations can be had either by
restricting the sort of transformations admitted11 or by
putting a bound on the degree of program expansion
acceptable. Either approach results in an effective procedure delivering a very large set of equivalent computations. While computationally intractable if employed
over the whole program, a semi-exhaustive search of this
set for the one with minimal cost is entirely reasonable
to carry out on a small unit operator. Similarly, to take
full advantage of multiple hardware function units, it
is sometimes necessary to unwind a loop and rewind it
with a modified structure-e.g., to perform, on the ith
iteration of the new loop, certain computation which
was formerly performed on the (i-1)st, ith, and (i+1)st
iteration. Again, a search is required to find the optimal
rewinding.
In general, code generation which tries various combinations of code sequences and chooses among them
(by analysis or simulation) can be used in a reasonable
time scale if consideration is restricted to the few unit
operations where the pay-off is significant. Consider,
for example, a procedure which searches through an array of packed k-bit elements counting the number of
times a certain (parameter-specified) k-bit configuration
occurs. The table can either be searched in array order (all elements in the first word, then all elements in the next, etc.) or in position order (all elements in the first position of a word, all elements in the next position, etc.).
Which search strategy is optimal depends on k, the
hardware for accessing k-bit bytes from memory, the
speed of shifting vs. memory access, and the sort of
mask and comparison instructions for k-bit bytes. In many situations, the easiest way of choosing the better strategy is to generate code for each and compute the relative execution times as a function of array length.
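The two search orders can be sketched as follows (hypothetical Python; the word size and helper names are invented). A unit compiler could generate both and keep whichever the cost estimate or a trial run favors.

# Sketch of the two search orders for counting occurrences of a k-bit pattern
# in an array of packed elements.

WORD_BITS = 32

def count_array_order(words, k, pattern):
    per_word = WORD_BITS // k
    mask = (1 << k) - 1
    count = 0
    for w in words:                       # all elements of a word, then the next word
        for pos in range(per_word):
            if (w >> (pos * k)) & mask == pattern:
                count += 1
    return count

def count_position_order(words, k, pattern):
    per_word = WORD_BITS // k
    mask = (1 << k) - 1
    count = 0
    for pos in range(per_word):           # one position across all words, then the next position
        shift = pos * k
        for w in words:
            if (w >> shift) & mask == pattern:
                count += 1
    return count

words = [0x12341234, 0xABCD1234]
assert count_array_order(words, 4, 0x4) == count_position_order(words, 4, 0x4)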
A separate issue arises from non-obvious unit operations. Suppose analysis shows that procedures F and G are each key operations (i.e., are executed very frequently). It may well be that the appropriate candidates for unit compilation are F, G, and some particular combination of them, e.g., "F;G" or "G( ... F( ... ) ... )". That is, if a substantial number of calls on G are preceded by calls of F (in sequence or in an argument position), the new function defined by that composition should be unit compiled. For example, in dealing with complex arithmetic, +, -, *, /, and CONJ are surely unit operations. However, it may be that for some program, "u/v+v*CONJ(v)" is critical. Subjecting this combination to unit compilation saves four of the ten multiplications as well as a number of memory references.

* The only significant exception is for arithmetic expressions with no common subexpressions.10
ASSUMPTIONS AND ASSERTIONS
If an optimizing compiler is to generate really good
code, it must be supplied the same sort of additional
information that would be given to or deduced by a
careful human coder. Pragmatic remarks (e.g., suggestions that certain global optimizations are possible) as
well as explicit consent (e.g., the REORDER attribute
of PL/I) are required. Similarly, if programs are to be validated by a program verifier, assistance from the programmer in forming inductive assertions is needed.
Communication between the programmer and the
optimizer/verifier is by means of ASSUME and ASSERT forms.
An assumption is stated by the programmer and is
(by and large) believed true by the evaluator. A local
assumption
ASSUME(X ≥ 0);

is taken as true at the point it appears. A global assumption may be extended over some range by means of the
infix operator IN, e.g.,
ASSUME(X ≥ 0)

IN BEGIN ... END;

where the assumption is to hold over the BEGIN-END
block and over all ranges called by that block. The function of an assumption is to convey information which
the programmer knows is true but which cannot be
deduced from the program. Specifications of the well-formedness of input data are assumptions, as are statements about the behavior of external procedures
analyzed separately.
Assertions, on the other hand, are verifiable. From the
program text and the validity of the program's assumptions, it is possible-at least in principle-to validate
each assertion. For example,
ASSERT(FOR I FROM 1 TO N DO TRUEP(A[I] ≤ B[I])) IN BEGIN ... END


should be provably true over the entire BEGIN-END
block, given that all program assumptions are correct.
The interpreter, optimizer, and verifier each treat assumptions and assertions in different ways. Since the
interpreter is used primarily for debugging, it takes the
position that the programmer is not to be trusted.
Hence, it checks everything, treating assumptions and
assertions identically-as extended Boolean expressions
to be evaluated and checked for true (false causing an
ERROR and, in general, suspension of the program).
Local assertions and assumptions are evaluated in
analogy with the conditional expression
NOT (expression) => ERROR( ... )
(This is similar to the use of ASSERT in ALGOL W.12)
Assumptions and assertions over some range are checked
over the entire range. This can be done by checking the
validity at the start of the domain and setting up a
condition monitor (e.g., cf. Reference 13) which will
cause a software interrupt if the condition is ever violated during the range.
Hence, in interpreted execution, assumptions and assertions act as comments whose correctness is checked
by the evaluator, providing a rather nice debugging
tool. Not only are errors explicitly detected by a false
assertion, but when errors of other sorts occur (e.g.,
overflow, data type mismatch, etc.), the programmer
scanning through the program is guaranteed that certain
assertions were valid for that execution. Since debugging is often a matter of searching the execution path
for the least source of an error, certainty that portions
of the program are correct is as valuable as knowledge
of the contrary.
The compiler simply believes assertions and assumptions and uses their validity in code optimization. Consider, for example, the assignment
X ← B[I-J]-60

Normally, the code for this would include subscript
bounds checking. However, in
X ← (ASSERT(1 ≤ I-J ∧ I-J ≤ LENGTH(B)))

IN B[I-J]-60
the assertion guarantees that the subscript is in range
and no run-time check is necessary.
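The two treatments can be sketched together (illustrative Python; the names check, subscript_checked, and subscript_unchecked are invented): the interpreter evaluates the assertion and signals an error if it is false, while a compiler that believes it may emit the unchecked access.

# Sketch: in interpreted execution both ASSUME and ASSERT are evaluated as
# Boolean expressions and a false value raises an error; a compiler that
# believes the assertion may omit the subscript bounds check.

def check(condition, text):
    """Interpreter treatment: NOT(expression) => ERROR(...)."""
    if not condition:
        raise RuntimeError("assertion failed: " + text)
    return True

def subscript_checked(b, index):
    if not (1 <= index <= len(b)):        # run-time bounds check
        raise IndexError(index)
    return b[index - 1]                   # 1-origin indexing as in the paper

def subscript_unchecked(b, index):
    return b[index - 1]                   # emitted when the range is asserted

B = [10, 20, 30]
I, J = 5, 3
check(1 <= I - J and I - J <= len(B), "1 <= I-J <= LENGTH(B)")
# with the assertion established, the cheaper access can be used
X = subscript_unchecked(B, I - J) - 60
assert X == subscript_checked(B, I - J) - 60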
While assertions and assumptions are handled by the compiler in rather the same way, there are a few differences. Assumptions are the more powerful in that
they can be used to express knowledge of program behavior which could not be deduced by the compiler,
either because necessary information is not available
(e.g., facts about a procedure which will be input during


program execution) or because the effort of deduction is
prohibitive (e.g., the use of deep results of number
theory in a program acting on integers). Separate compilation makes the statement of such assumptions essential, e.g.,
ASSUME(SAFE(P)) IN BEGIN ... END
insures that the procedure P is free of side effects and
hence can be subject to common subexpression elimination.
Unlike assumptions, assertions can be generated by
the compiler as logical consequences of assumptions,
other assertions, and the program text. Consider, for
example, the following conditional block (cf. Appendix
for syntax), where L is a pointer to a list structure.
BEGIN L=NIL => ... ; ... CDR(L) ... END
Normally, the CDR operation would require a check
for the empty list as an argument. However, provided
that there are no intervening assignments to L, the
compiler may rewrite this as
BEGIN L=NIL => ... ;
ASSERT(L ≠ NIL)
IN BEGIN ... CDR(L) ... END END
in which case no checks are necessary. Assertions added
by the compiler and included in an augmented source
listing provide a means for the compiler to record its
deductions and explicitly transmit these to the programmer.
The program verifier treats assumptions and assertions entirely differently. Assumptions are believed. *
Assertions are to be proved or disproved14,15 on the
basis of the stated assumptions, the program text, the
semantics of the programming language, and specialized
knowledge about the subject data types. In the case of
integers, there has been demonstrable success: the assertion verifier of King has been applied successfully to
some definitely non-trivial algorithms. Specialized
theorem provers for other domains may be constructed.
Fortunately, the number of domains is small. In ALGOL
60, for example, knowledge of the reals, the integers, and Boolean expressions together with an understanding of arrays and array subscripting will handle most
program assertions.
In an extensible language, the situation is more complex, but not drastically so. The base language data
types are typically those of ALGOL 60 plus a few others,
e.g., characters; the set of formation rules for data aggregates consists of arrays, plus structures and pointers.

* One might, conceivably, check the internal consistency of a set of assumptions, i.e., test for possible contradictions.


Only the treatment of pointers presents any new issues-these because pointers allow data sharing and
hence access to a single entity under a multiplicity of names (i.e., access paths). This is analogous to the problem of subscript identification, but is compounded
since the access paths may be of arbitrary length.
However, recent work16 shows promise of providing
proof techniques for pointers and structures built of
linked nodes. Since all extension sets ultimately derive
their semantics from the base language, it suffices to
give a formal treatment to the primitive modes and the
built-in set of formation rules-assertions on all other
modes can be mapped into and verified on the base.**
One variation on the program verifier is the notifier.
Whereas the verifier uses formal proof techniques to
certify correctness, the notifier uses relatively unsophisticated means to provide counterexamples. One
can safely assume that most programs will not be initially correct; hence, substantial debugging assistance
can be provided by simply pointing out errors. This can
be done somewhat by trial and error-generating values
which satisfy the assumptions and running the program
to check the assertions. Since programming errors
typically occur at the extremes in the space of data
values, a few simple heuristics may serve to produce
critical counterexamples. If, as appears likely, the computation time for program verification is considerable,
the use of a simple, quicker means to find the majority
of bugs will be of assistance in online program production. While the notifier can never validate programs, it
may be helpful in creating them.
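A minimal sketch of such a notifier, assuming assumptions and assertions are available as predicates (illustrative Python; all names invented):

# Sketch of a "notifier": instead of proving an assertion, try a few extreme
# data values that satisfy the assumptions and report any counterexample.

def notify(assumption, assertion, candidates):
    """Return a counterexample, or None if no candidate refutes the assertion."""
    for value in candidates:
        if assumption(value) and not assertion(value):
            return value
    return None

# Example: the programmer assumes N >= 0 and asserts that a routine returns
# a positive value; zero (an extreme of the assumed range) refutes it.
def routine(n):
    return n * 2

counterexample = notify(
    assumption=lambda n: n >= 0,
    assertion=lambda n: routine(n) > 0,
    candidates=[0, 1, -1, 10**6],        # extremes of the data space first
)
assert counterexample == 0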
OPTIMIZATION, EXTENSION SETS, AND
TRANSFORMATION SETS
One of the advantages of an extensible language over
a special purpose language developed to handle a new
application arises from the economics of optimization.
In an extensible language system, each extended language Li is defined by an extension set Ei in terms of the
base language. Since there is only a single base, one can
afford to spend considerable effort in developing optimization techniques for it. Algorithms for register allocation, common subexpression detection, elimination of variables, removal of computation from loops, loop fusion, and the like need be developed and programmed only once. All extensions will take advantage of these.
In contrast, the compiler for each special purpose language must have these optimizations explicitly included. This is already a reasonably large programming project, so large that many special purpose languages go essentially unoptimized. As the set of known optimization techniques grows, the economic advantage of extensible language optimization will increase.

** This gives only a formal technique for verification, i.e., it specifies what must be axiomatized and gives a valid reduction technique. It may well turn out that such reduction is not a practical solution if the resulting computation costs are excessive. In such cases, one can use the underlying axiomatization as a basis for deriving rules of inference on an extension set. These may be introduced in a fashion similar to the specialized transformation sets discussed in the next section.
There is one flaw in the above argument, which we
now repair. There is the tacit assumption that all
optimization properties of an extended language Li can
be obtained from the semantics and pragmatics of the
base. While the logical dependency is strictly valid,
taking this as a complete technique is rather impractical.
While certain optimization properties-those concerned solely with control and data flow-can be well
optimized in terms of the base language, other properties depending on long chains of reasoning would tax
any optimizer that sought to derive them every time
they were required.
The point, and our solution, may best be exhibited
with an example. Consider
FOO(SUBSTRING(I, J, X CONCAT Y))
which calls procedure FOO with the substring consisting of the Ith to (I+J-1)th characters of the string obtained by concatenating the contents of string variable
X with string variable Y. In an extensible language,
SUBSTRING and CONCAT are defined procedures
which operate on STRINGs (defined to be ARRAYs of
CHARacters).
SUBSTRING ←
EXPR(I,J:INT, S:STRING; STRING)
BEGIN
DECL SS:STRING SIZE J;
FOR K TO J DO SS[K] ← S[I+K-1];
SS
END

CONCAT ←
EXPR(A,B:STRING; STRING)
BEGIN
DECL R:STRING SIZE LENGTH(A)+LENGTH(B);
FOR M TO LENGTH(A) DO R[M] ← A[M];
FOR M TO LENGTH(B) DO R[M+LENGTH(A)] ← B[M];
R
END

One could compile code for the above call on FOO by compiling three successive calls: on CONCAT,


SUBSTRING, and FOO. However, by taking advantage of the properties of CONCAT and SUBSTRING, one can do far better. Substituting the definition of CONCAT in SUBSTRING produces
SUBSTRING(I, J, A CONCAT B) =
BEGIN
DECL SS:STRING SIZE J;
DECL S:STRING BYVAL
BEGIN
DECL R:STRING SIZE LENGTH(A)+LENGTH(B);
FOR M TO LENGTH(A) DO R[M] ← A[M];
FOR M TO LENGTH(B) DO R[M+LENGTH(A)] ← B[M];
R
END;
FOR K TO J DO SS[K] ← S[I+K-1];
SS
END

The block which computes R may be opened up so
that its declarations and computation occur in the surrounding block. Then, since S is identical to R, S may
be systematically replaced by R and the declaration for
S deleted.
BEGIN
DECL SS:STRING SIZE J;
DECL R:STRING SIZE LENGTH(A)+LENGTH(B);
FOR M TO LENGTH(A) DO R[M] ← A[M];
FOR M TO LENGTH(B) DO R[M+LENGTH(A)] ← B[M];
FOR K TO J DO SS[K] ← R[I+K-1];
SS
END
This implies that R[M] is defined by the conditional
block
BEGIN
M ≤ LENGTH(A) => A[M];
B[M-LENGTH(A)]
END
Replacing M by I+K-1 and substituting, the assignment loop becomes

FOR K TO J DO
SS[K] ← BEGIN
K ≤ LENGTH(A)-I+1 => A[I+K-1];
B[I+K-LENGTH(A)-1]
END

Distributing the assignment to inside the block, this has the form

FOR x TO v0 DO BEGIN
x ≤ v1 => f1(x);
f2(x)
END

where the vi are loop-independent values and the fi are functions of x. A basic optimization on the base language transforms this into the equivalent form which avoids the test

FOR x TO MIN(v0, v1) DO f1(x);
FOR x FROM MIN(v0, v1)+1 TO v0 DO f2(x);
Hence, SUBSTRING(I, J, A CONCAT B) may be computed by a call on the procedure*

EXPR(I,J:INT, A,B:STRING; STRING)
BEGIN
DECL SS:STRING SIZE J;
FOR K TO MIN(J, LENGTH(A)-I+1) DO
SS[K] ← A[I+K-1];
FOR K FROM MIN(J, LENGTH(A)-I+1)+1
TO J DO SS[K] ← B[I+K-LENGTH(A)-1];
SS
END
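The gain can be illustrated with a small sketch (Python used purely for exposition; 1-origin indices as in the EL1 text): the fused form computes the same result without ever materializing A CONCAT B.

# Sketch comparing the naive evaluation of SUBSTRING(I, J, A CONCAT B) with
# the fused form derived above.

def substring(i, j, s):
    return s[i - 1:i - 1 + j]

def naive(i, j, a, b):
    return substring(i, j, a + b)          # builds the concatenation first

def fused(i, j, a, b):
    out = []
    split = min(j, len(a) - i + 1)         # MIN(J, LENGTH(A)-I+1)
    for k in range(1, split + 1):
        out.append(a[i + k - 2])           # A[I+K-1]
    for k in range(split + 1, j + 1):
        out.append(b[i + k - len(a) - 2])  # B[I+K-LENGTH(A)-1]
    return "".join(out)

for i in range(1, 5):
    for j in range(0, 5):
        assert naive(i, j, "abcd", "efg") == fused(i, j, "abcd", "efg")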
This could, in principle, be deduced by a compiler
from the definitions of SUBSTRING and CONCAT.
However, there is no way for the compiler to know a
priori that the substitution has substantial payoff. If
the expression SUBSTRING(I,J,A CONCAT B) were
a critical unit operation, the heuristic "try all possible
compilation techniques on key expressions" would discover it. However, the compiler cannot afford to try all
function pairs appearing in the program in the hope
that some will simplify-the computational cost is too
great. Instead, the programmer specifies to the compiler
the set of transformations (cf. Reference 17 for related
techniques) he knows will have payoff.
TRANSFORM(I,J:INT, X,Y:STRING;
SUBSTITUTE)
SUBSTRING(I, J, X CONCAT Y)
TO
SUBSTITUTE(Z:X CONCAT Y,
SUBSTRING(I,J,Z)) (I, J, X, Y)

In general, a transformation rule has the format

TRANSFORM(⟨pattern variables⟩; ⟨action variables⟩)
⟨pattern⟩
TO
⟨replacement⟩

* Normal common subexpression elimination will recognize that
LENGTH(A), I-1, and MIN(J, LENGTH(A)-I+1) need be
calculated only once.


All lexemes in the pattern and replacement are taken
literally except for the (pattern variables) and (action
variables). The former are dummy arguments, statement-matching variables, etc.; the latter denote values
used to derive the actual transformation from the input
transformation schemata. In the above case, the procedure SUBSTITUTE is called to expand CONCAT
within SUBSTRING as the third argument. The simplified result, CP, is applied to the dummy arguments. Hence, calls such as SUBSTRING(3, 2*N+C, AA CONCAT B7) are transformed into calls on CP(3, 2*N+C, AA, B7).
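A toy model of such a programmer-specified rewrite (hypothetical Python over nested tuples; the real TRANSFORM machinery matches EL1 forms and invokes SUBSTITUTE):

# Rewrite any call matching SUBSTRING(i, j, CONCAT(x, y)) into a call on a
# fused procedure CP(i, j, x, y).  Expressions are nested tuples ("op", arg, ...).

def rewrite(expr):
    if not isinstance(expr, tuple):
        return expr
    expr = tuple(rewrite(arg) for arg in expr)      # rewrite subexpressions first
    if (expr[0] == "SUBSTRING" and len(expr) == 4
            and isinstance(expr[3], tuple) and expr[3][0] == "CONCAT"):
        i, j, (_, x, y) = expr[1], expr[2], expr[3]
        return ("CP", i, j, x, y)                   # the unit-compiled composition
    return expr

call = ("FOO", ("SUBSTRING", 3, ("PLUS", ("TIMES", 2, "N"), "C"),
                ("CONCAT", "AA", "B7")))
assert rewrite(call) == ("FOO", ("CP", 3, ("PLUS", ("TIMES", 2, "N"), "C"), "AA", "B7"))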
When defining an extension set, the programmer defines the unit data types, unit operations, and additionally the significant transformations on the problem domain. These domain-dependent transformations are adjoined to the set of base transformations to produce the total transformation set. The program, as written, specifies the function to be computed; the transformation set provides an orthogonal statement of how the computation is to be optimized.
For example, in adding a string manipulation extension, one would first define the data type STRING (fixed length array of characters). Next, one defines the unit operations: LENGTH, CONCATenate, SUBSTRING, SEARCH (for a string x as part of a string y starting at position i and return the initial index or zero if not present). Finally, one defines the transformations on program units involving these operations.

TRANSFORM(X,Y:STRING) LENGTH(X CONCAT Y)
TO LENGTH(X)+LENGTH(Y)

TRANSFORM(A,X,Y,Z:STRING; SUBSTITUTE)
X CONCAT Y CONCAT Z
TO SUBSTITUTE(A: Y CONCAT Z;
X CONCAT A) (X,Y,Z)

So long as the transformations are entirely local, they act only as macro replacements. The significant transformations in an extension set are those which make global, far-reaching changes to program or data. Clearly, these changes will require knowledge, assumed or asserted, about that portion of the program affected by these changes.
Consider, for example, the issue of string variables in the proposed extension set. If a string variable is to have a fixed capacity, the type STRING is satisfactory. If variable capacity is desired but an upper bound can be established for each string variable, the type VARSTRING could be defined like string VARYING in PL/I. If completely variable capacity is required, a string variable would be implemented as a pointer to a simple STRING (i.e., PTR(STRING)) with the understanding that assignment of a string value to such a string variable causes a copy of the string to be made and the pointer set to address the copy.* With these three possible representations available, one would define the data type string variable to be

ONEOF(STRING, VARSTRING, PTR(STRING))

Each string variable is one of these three data types. To provide for the worst case, the programmer could specify each formal parameter string variable to be ONEOF(STRING, VARSTRING, PTR(STRING)) and specify each local string variable to be a PTR(STRING). A program so written would be correct, but its performance would, in general, suffer from unused generality. Each string variable whose length is fixed can be redeclared

TRANSFORM(D1,D2:DECLIST, S:STATLIST,
F:FORM, X; WHEN)
WHEN(CONSTANT(LENGTH(X))) IN
BEGIN D1; DECL X:PTR(STRING)
BYVAL F; D2; S END
TO
BEGIN D1; DECL X:STRING BYVAL F;
D2; S END

* This does not exhaust the list of possible representations for strings. To avoid copying in concatenation, insertion, and deletion, one could represent strings by linked lists of character nodes: each node consisting of a character and a pointer to the next node. A string variable could then be a pointer to such node lists. To minimize storage, one could employ hashing to insure that each distinct sequence of characters is represented by a unique string-table entry; a string variable could then be a pointer to such string-table entries. Hashing and implementing strings by linked lists could be combined to yield still another representation of strings. In the interest of brevity, we consider only three rather simple representations; however, the point we make is all the stronger when additional representations are considered.

The predicate WHEN appearing in a pattern is handled in somewhat the same fashion as are ASSERTions during program verification. It is proved as part of the pattern matching; the transformation is applicable only if the predicate is provably TRUE and the literal part of the pattern matches. Here, it must be proved that LENGTH(X) is a constant over the block B and all ranges called by B. If so, the variable can be of type STRING. Similarly, if there is a computable


maximum length less than a reasonable upper limit
LIM, then the data type VARSTRING can be used.
TRANSFORM(D1,D2:DECLIST, B:BLOCK,
F:FORM, K:INT, X; WHEN)
BEGIN D1; DECL X:PTR(STRING)
BYVAL F; D2; WHEN(LENGTH(X) ≤ K ∧ K ≤ LIM) IN B
END
TO
BEGIN D1; DECL X:VARSTRING SIZE
K BYVAL F; D2; B END

To prove an assertion for a variable X over some
range, it suffices to prove the assertion true of all expressions that are assignable to X in that range. An
assertion about LENGTH(X) is reasonable to validate
since it entails only theorem proving over the integers18-once the string manipulation routines are
reinterpreted as operations on string lengths. Fortunately, most of the interesting predicates are of this
order of difficulty. Typical WHEN conditions are: (1) a
variable (or certain fields of a data structure) is not
changed; (2) an object in the heap is referenced only
from a given pointer; (3) whenever control reaches a
given program point, a variable always has (or never
has) a given value (or set of values); (4) certain operations are never performed on certain elements of a
data structure. Such conditions are usually easier to
check than those concerned with correct program behavior, since only part of the action carried out by the
algorithm is relevant.
That is, the technique suggested above for simplifying proofs about string manipulation operators by considering only string lengths generalizes to many related cases. To verify a predicate concerned with certain
properties, one takes a valuation of the program on a
model chosen to abstract those properties.19 The program is run by a special interpreter which performs the
computation on the simpler data space tailored to the
property. To correct for the loss of information (e.g.,
the values of most program tests are not available),
the computation is conservative (e.g., the valuation of
a conditional takes the union of the valuations of the
possible arms). If the valuation in the model demonstrates the proposition, it is valid for the actual data
space. While this is a sufficient condition, not a necessary one, an appropriate model should seldom fail to
prove a true proposition.
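A small sketch of such a valuation (illustrative Python): strings are abstracted to sets of possible lengths, concatenation adds lengths, and a conditional takes the union of its arms, so any property established in the model holds for the real data.

# Valuation on a model: only string lengths are tracked, and the outcome of a
# program test is unknown, so a conditional yields the union of its arms.

def concat(a, b):
    return {x + y for x in a for y in b}           # set of possible lengths

def conditional(arm1, arm2):
    return arm1 | arm2                             # conservative: test outcome unknown

# X is known to be 3 characters long; Y is 1 or 2 characters long.
X, Y = {3}, {1, 2}
Z = conditional(concat(X, Y), X)                   # either X CONCAT Y or X

# In the model we can verify, e.g., LENGTH(Z) <= 5 for every possibility,
# so the corresponding WHEN or ASSERT predicate is established for the program.
assert all(n <= 5 for n in Z)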
CONCLUSION
An interpreter, a compiler, a source-level optimizer employing domain-specific transformations, and a program


verifier each compute a valuation over some model.
Fitting these valuators together so as to exploit the
complementarity of their models is a central task in
constructing a powerful programming tool.
ACKNOWLEDGMENT
The author would like to thank Glenn Holloway and
Richard Stallman for discussions concerning various
aspects of this paper.
REFERENCES
1 B WEGBREIT
The ECL programming system
Proc AFIPS 1971 FJCC Vol 39 AFIPS Press Montvale
New Jersey pp 253-262
2 A J PERLIS
The synthesis of algorithmic systems
JACM Vol 17 No 1 January 1967 pp 1-9
3 T E CHEATHAM et al
On the basis for ELF-an extensible language facility
Proc AFIPS FJCC 1968 Vol 33 pp 937-948
4 D G BOBROW
Requirements for advanced programming systems for list
processing
CACM Vol 15 No 7 July 1972
5 T E CHEATHAM B WEGBREIT
A laboratory for the study of automating programming
Proc AFIPS 1972 SJCC Vol 40
6 W TEITELMAN et al
BBN-LISP
Bolt Beranek and Newman Inc Cambridge Massachusetts
July 1971
7 E W DIJKSTRA
Recursive programming
Numerische Mathematik 2 (1960) pp 312-318. Also in
Programming Systems and Languages S Rosen (Ed)
McGraw-Hill New York 1967
8 J MOSES
The function of FUNCTION in LISP
SIGSAM Bulletin July 1970 pp 13-27
9 IBM SYSTEM/360
PL/I language reference manual
Form C28-8201-2 IBM 1969
10 R SETHI J D ULLMAN
The generation of optimal code for arithmetic expressions
JACM Vol 17 No 4 October 1970 pp 715-728
11 A V AHO J D ULLMAN
Transformations on straight line programs
Conf Rec Second Annual ACM Symposium on Theory of
Computing SIGACT May 1970 pp 136-148
12 R L SITES
Algol W reference manual
Technical Report CS-71-230 Computer Science Department
Stanford University August 1971
13 D G BOBROW B WEGBREIT
A model and stack implementation of multiple environments
Report No 2334 Bolt Beranek and Newman Cambridge
Massachusetts March 1972 submitted for publication


14 R W FLOYD
Assigning meanings to programs
Proc Symp Appl Math Vol 19 1967 pp 19-32
15 R W FLOYD
Toward interactive design of correct programs
Proc IFIP Congress 1971 Ljubljana pp 1-5
16 J POUPON B WEGBREIT
Verification techniques for data structures including pointers
Center for Research in Computing Technology Harvard
University in preparation
17 B A GALLER A J PERLIS
A proposal for definitions in Algol
CACM Vol 10 No 4 April 1967 pp 204-219
18 J C KING
A program verifier
PhD Thesis Department of Computer Science
Carnegie-Mellon University September 1969
19 M SINTZOFF
Calculating properties of programs by valuations on specific
models
SIGPLAN Notices Vol 7 No 1 and SIGACT News No 14
January 1972 pp 203-207
20 B WEGBREIT et al
ECL programmer's manual
Center for Research in Computing Technology Harvard
University Cambridge Massachusetts January 1972

APPENDIX: A BRIEF DESCRIPTION OF EL1 SYNTAX

To a first approximation, the syntax of EL1 is like that of ALGOL 60 or PL/I. Variables, subscripted variables, labels, arithmetic and Boolean expressions, assignments, gotos and procedure calls can all be written as in ALGOL 60 or PL/I. Further, EL1 is, like ALGOL 60 or PL/I, a block structured language. Executable statements in EL1 can be grouped together and delimited by BEGIN END brackets to form blocks. New variables can be created within a block by declaration; the scope of such variable names is the block in which they are declared.
The syntax of EL1 differs from that of ALGOL 60 or PL/I most notably in the form of conditionals, declarations, and data type specifiers. For the purposes of this paper, it will suffice to explain only these points of difference. (A more complete description can be found in Reference 20.)

A.1 Conditionals

Conditionals in EL1 are a special case of BEGIN END blocks. In general, each EL1 block has a value: the value of the last statement executed. Normally, this is the last statement in the block. Instead, a block can be conditionally exited with some other value E by a statement of the form
A => E;
If A is TRUE then the block is exited with the value of E; otherwise, the next statement of the block is executed. For example, the ALGOL 60 conditional
if A1 then E1 else if A2 then E2 else E3
is written in EL1 as
BEGIN A1 => E1; A2 => E2; E3 END
(Unconditional statements of an EL1 block are simply executed sequentially, unless a goto transfers control to a different labeled statement.)

A.2 Declarations

The initial statements of a block may be declarations having the format
DECL L: M S;
where L is a list of identifiers, M is the data type, and S specifies the initialization. For example,
DECL X, Y: REAL BYVAL A[I];
This creates two REAL variables named X and Y and initializes them to separate copies of the current value of A[I]. The specification S may assume one of three forms:
(1) empty, in which case a default initialization determined by the data type is used.
(2) BYVAL E, in which case the variables are initialized to copies of the value of E.
(3) SHARED E, in which case the variables are declared to be synonymous with the value of E.

A.3 Data types

Built-in data types of the language include: BOOL, CHAR, INT, and REAL. These may be used as data type specifiers to create scalar variables.
Array variables may be declared by using the built-in procedure ARRAY. For example,
DECL C: ARRAY(CHAR) BYVAL E;
creates a variable named C which is an ARRAY of CHARacters. The LENGTH (i.e., number of components) and initial value of C is determined by the value of E.
Procedure-valued variables may be defined by the built-in procedure PROC. For example,
DECL G:PROC(BOOL,ARRAY(INT); REAL);
declares G to be a variable which can be assigned only those procedures which take a BOOL argument and an ARRAY(INT) argument and deliver a REAL result.

A.4 Procedures

A procedure may be defined by assigning a procedure value to a procedure-valued variable. For example,
IPOWER ←
EXPR(X:REAL,N:INT; REAL)
BEGIN DECL R:REAL BYVAL 1; FOR I TO N DO R ← R*X; R END
assigns to IPOWER a procedure which takes two arguments, a REAL and an INT (assumed positive), and computes the exponential.

Automated programmering-The programmer's assistant
by WARREN TEITELMAN*
Bolt, Beranek, & Newman
Cambridge, Massachusetts

INTRODUCTION
This paper describes a research effort and programming system designed to facilitate the production of programs. Unlike automated programming, which focuses on developing systems that write programs, automated programmering involves developing systems which automate (or at least greatly facilitate) those tasks that a programmer performs other than writing programs: e.g., repairing syntactical errors to get programs to run in the first place, generating test cases, making tentative changes, retesting, undoing changes, reconfiguring, massive edits, et al., plus repairing and recovering from mistakes made during the above. When the system in which the programmer is operating is cooperative and helpful with respect to these activities, the programmer can devote more time and energy to the task of programming itself, i.e., to conceptualizing, designing and implementing. Consequently, he can be more ambitious, and more productive.

BBN-LISP
The system we will describe here is embedded in BBN-LISP. BBN-LISP, as a programming language, is an implementation of LISP, a language designed for list processing and symbolic manipulation.1 BBN-LISP, as a programming system, is the product of, and vehicle for, a research effort supported by ARPA for improving the programmer's environment.** The term "environment" is used to suggest such elusive and subjective considerations as ease and level of interaction, forgivingness of errors, human engineering, etc.
Much of BBN-LISP was designed specifically to enable construction of the type of system described in this paper. For example, BBN-LISP includes such features as complete compatibility of compiled and interpreted code, "visible" variable bindings and control information, programmable error recovery procedures, etc. Indeed, at this point the two systems, BBN-LISP and the programmer's assistant, have become so intertwined (and interdependent) that it is difficult, and somewhat artificial, to distinguish between them. We shall not attempt to do so in this paper, preferring instead to present them as one integrated system.
BBN-LISP contains many facilities for assisting the programmer in his non-programming activities. These include a sophisticated structure editor which can either be used interactively or as a subroutine; a debugging package for inserting conditional programmed interrupts around or inside of specified procedures; a "prettyprint" facility for producing structured symbolic output; a program analysis package which produces a tree structured representation of the flow of control between procedures, as well as a concordance listing indicating for each procedure the procedures that call it, the procedures that it calls, and the variables it references, sets, and binds; etc.
Most on-line programming systems contain similar features. However, the essential difference between the BBN-LISP system and other systems is embodied in the philosophy that the user addresses the system through an (active) intermediary agent, whose task it is to collect and save information about what the user and his programs are doing, and to utilize this information to assist the user and his programs. This intermediary is called the programmer's assistant (or p.a.).

* The author is currently at Xerox Palo Alto Research Center, 3180 Porter Drive, Palo Alto, California 94304.
** Earlier work in this area is reported in Reference 2.

THE PROGRAMMER'S ASSISTANT
For most interactions with the BBN-LISP system, the programmer's assistant is an invisible interface between the user and LISP: the user types a request, for example, specifying a function to be applied to a set of arguments. The indicated operation is then performed, and a resulting value is printed. The system is
then ready for the next request. However, in addition,
in BBN-LISP, each input typed by the user, and the
value of the corresponding operation, are automatically
stored by the p.a. on a global data structure called the
history list.
The history list contains information associated with
each of the individual "events" that have occurred in
the system, where an event corresponds to an individual
type-in operation. Associated with each event is the
input that initiated it, the value it yielded, plus other
information such as side effects, messages printed by the
system or by user programs, information about any
errors that may have occurred during the execution of
the event, etc. As each new event occurs, the existing
events on the history list are aged, with the oldest event
"forgotten". *
The user can reference an event on the history list by
a pattern which is used for searching the history list,
e.g., FLAG:~$ refers to the last event in which the
variable FLAG was changed by the user; by its relative
event number, e.g. -1 refers to the most recent event,
-2 the event before that, etc., or by an absolute event
number. For example, the user can retrieve an event in
order to REDO a test case after making some program
changes. Or, having typed a request that contains a
slight error, the user may elect to FIX it, rather than
retyping the request in its entirety. The USE command
provides a convenient way of specifying simultaneous
substitutions for lexical units and/or character strings,
e.g., USE X FOR Y AND + FOR *. This permits
after-the-fact parameterization of previous events.
The p.a. recognizes such requests as REDO, FIX,
and USE as being directed to it, not the LISP interpreter, and executes them directly. For example, when
given a REDO command, the p.a. retrieves the indicated event, obtains the input from that event, and
treats it exactly as though the user had typed it in
directly. Similarly, the USE command directs the p.a.
to perform the indicated substitutions and process the
result exactly as though it had been typed in.
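A minimal sketch of a history list with REDO and USE (illustrative Python; the evaluator, the event format, and the substitution rule are simplifications of what the paper describes):

# Each event records the input and its value; REDO re-executes an old input,
# and USE re-executes it after simultaneous textual substitution.

import re

class Assistant:
    def __init__(self, limit=100):
        self.history = []                      # oldest first; trimmed at `limit`
        self.limit = limit

    def execute(self, text):
        value = eval(text)                     # stand-in for the LISP evaluator
        self.history.append({"input": text, "value": value})
        del self.history[:-self.limit]         # oldest events are "forgotten"
        return value

    def event(self, n):
        return self.history[n]                 # n = -1 is the most recent event

    def redo(self, n=-1):
        return self.execute(self.event(n)["input"])

    def use(self, replacements, n=-1):
        text = self.event(n)["input"]
        pattern = re.compile("|".join(re.escape(old) for old in replacements))
        text = pattern.sub(lambda m: replacements[m.group(0)], text)
        return self.execute(text)

pa = Assistant()
pa.execute("len('ELTS')")
assert pa.redo() == 4                          # REDO the last event
assert pa.use({"ELTS": "ELEMENTS"}) == 8       # USE ELEMENTS FOR ELTS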
The p.a. currently recognizes about 15 different
commands (and includes a facility enabling the user to
define additional ones). The p.a. also enables the user
to treat several events as a single unit (e.g., REDO 47 THRU 51), and to name an event or group of events, e.g., NAME TEST -1 AND -2. All of these capabilities
allow, and in fact encourage, the user to construct
complex console operations out of simpler ones in much
the same fashion as programs are constructed, i.e.,
simpler operations are checked out first, and then
combined and rearranged into large ones. The important

* The storage used in its representation is then reusable.

point to note is that the user does not have to prepare in
advance for possible future (re-) usage of an event. He
can operate straightforwardly as in other systems, yet
the information saved by the p.a. enables him to
implement his "after-thoughts."
UNDOING
Perhaps the most important after-thought operation made possible by the p.a. is that of undoing the side effects of a particular event or events. In most systems,
if the user suspects that a disaster might result from a
particular operation, e.g., an untested program running
wild and chewing up a complex data structure, he would
prepare for this contingency by saving the state of part
or all of his environment before attempting the operation. If anything went wrong, he would then back up
and start over. However, saving/dumping operations
are usually expensive and time-consuming, especially
compared to a short computation, and are therefore not
performed that frequently. In addition, there is always
the case where disaster strikes as a result of a supposedly
debugged or innocuous operation. For example, suppose
the user types
FOR X IN ELTS REMOVE PROPERTY
'MORPH FROM X
which removes the property MORPH from every member of the list ELTS, and then realizes that he meant to
remove this property from the members of the list
ELEMENTS instead, and has thus destroyed some
valuable information.
Such "accidents" happen all too often in typical
console sessions, and result in the user's either having
to spend a great deal of effort in reconstructing the
inadvertently destroyed information, or alternatively
in returning to the point of his last back-up, and then
repeating all useful work performed in the interim.
(Instead, using the p.a., the user can recover by simply
typing UNDO, and then perform the correct operation
by typing USE ELEMENTS FOR ELTS.)
The existence of UNDO frees the user from worrying
about such oversights. He can be relaxed and confident
in his console operations, yet still work rapidly. He can
even experiment with various program and data con-
figurations, without necessarily thinking through all
the implications in advance. One might argue that this
would promote sloppy working habits. However, the
same argument can be, and has been, leveled against
interactive systems in general. In fact, freeing the user
from such details as having to anticipate all of the
consequences of an (experimental) change usually re-


sults in his being able to pay more attention to the
conceptual difficulties of the problem he is trying to
solve.
Another advantage of undoing as it is implemented
in the programmer's assistant is that it enables events
to be undone selectively. Thus, in the above example, if
the user had performed a number of useful modifications to his programs and data structures before noticing
his mistake, he would not have to return to the environment extant when he originally typed FOR X IN ELTS
REMOVE PROPERTY 'MORPH FROM X, in order
to UNDO that event, i.e., he could UNDO this event
without UNDOing the intervening events. * This means
that even if we eliminated efficiency considerations and
assumed the existence of a system where saving the
entire state of the user's environment required insignificant resources and was automatically performed
before every event, there would still be an advantage to
having an undo capability such as the one described
here.
Finally, since the operation of undoing an event itself
produces side effects, it too is undoable. The user can
often take advantage of this fact, and employ strategies
that use UNDO for desired operation reversals, not
simply as a means of recovery in case of trouble. For
example, suppose the user wishes to interrogate a
complex data structure in each of two states while
successively modifying his programs. He can interrogate
the data structure, change it, interrogate it again, then
undo the changes, modify his programs, and then repeat
the process using successive UNDOs to flip back and
forth between the two states of the data structure.
IMPLEMENTATION OF UNDO**
The UNDO capability of the programmer's assistant
is implemented by making each function that is to be
undoable save on the history list enough information to
enable reversal of its side effects. For example, when a
list node is about to be changed, it and its original
contents are saved; when a variable is reset, its binding
(i.e., position on the stack) and its current value are
saved. For each primitive operation that involves side
effects, there are two separate functions, one which
always saves this information, i.e., is always undoable,
and one which does not.
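The mechanism can be sketched as follows (illustrative Python; property-list operations and names are invented, and the detail that UNDO is itself undoable is omitted):

# The undoable setter records (location, old value) on the current event
# before changing anything; UNDO replays the saved pairs in reverse.

class UndoableStore:
    MISSING = object()

    def __init__(self):
        self.props = {}              # (atom, property) -> value
        self.events = []             # each event: list of (key, old value)

    def new_event(self):
        self.events.append([])

    def put(self, atom, prop, value):
        """Non-undoable primitive: no information is saved."""
        self.props[(atom, prop)] = value

    def put_undoable(self, atom, prop, value):
        key = (atom, prop)
        self.events[-1].append((key, self.props.get(key, self.MISSING)))
        self.props[key] = value

    def undo(self, n=-1):
        for key, old in reversed(self.events[n]):
            if old is self.MISSING:
                self.props.pop(key, None)
            else:
                self.props[key] = old
        self.events[n] = []

store = UndoableStore()
store.new_event()
for atom in ["A", "B"]:
    store.put_undoable(atom, "MORPH", 1)
store.new_event()
store.put_undoable("C", "COLOR", "red")
store.undo(-2)                                 # selectively undo the earlier event
assert ("A", "MORPH") not in store.props and store.props[("C", "COLOR")] == "red"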
Although the overhead for saving undo information
is small, the user may elect to make a particular operation not be undoable if the cumulative effect of saving

* Of course, he could UNDO all of the intervening events as
well, e.g., by typing UNDO THRU ELTS.
** See Reference 1, pp. 22.39--43, for a more complete description
of undoing.


the undo information seriously degrades the overall
performance of a program because the operation in
question is repeated so often. The user, by his choice of
function, specifies which operations are undoable. In
some sense, the user's choice of function acts as a
declaration about frequency of use versus need for
undoing. For those cases where the user does not want
certain functions undoable once his program becomes
operational, but does wish to be able to undo while
debugging, the p.a. provides a facility called TESTMODE. When in TESTMODE, the undoable version
of each function is executed, regardless of whether the
user's program specifically called that version or not.
Finally, all operations involving side effects that are
typed-in by the user are automatically made undoable by the p.a. by substituting the corresponding undoable
function name(s) in the expression before execution.
This procedure is feasible because operations that are
typed-in rarely involve iterations or lengthy computations directly, nor is efficiency usually important. However, as a precaution, if an event occurs during which
more than a user-specified number of pieces of undo
information are saved, the p.a. interrupts the operation
to ask the user if he wants to continue having undo
information saved.
AUTOMATIC ERROR CORRECTION-THE
DWIM FACILITY
The previous discussion has described ways in which
the programmer's assistant is explicitly invoked by the
user. The programmer's assistant is also automatically
invoked by the system when certain error conditions
are encountered. A surprisingly large percentage of
these errors, especially those occurring in type-in, are of
the type that can be corrected without any knowledge
about the purpose of the program or operation in
question, e.g., misspellings, certain kinds of syntax
errors, etc. The p.a. attempts to correct these errors,
using as a guide both the context at the time of the
error, and information gathered from monitoring the
user's requests. This form of implicit assistance provided
by the programmer's assistant is called the DWIM
(Do-What-I-Mean) capability.
For example, suppose the user defines a function for
computing N factorial by typing
DEFIN[ ( (FACT (N) IF N = 0 THEN 1 ELSE
NN*(FACT N -1)*].
When this input is executed, an error occurs because
DEFIN is not the name of a function. However, DWIM

* In BBN-LISP ] automatically supplies enough right parentheses to match back to the last [.


notes that DEFIN is very close to DEFINE, which is
a likely candidate in this context. Since the error occurred in type-in, DWIM proceeds on this assumption,
types = DEFINE to inform the user of its action, makes
the correction and carries out the request. Similarly if
the user then types FATC (3) to test out his function,
DWIM would correct FATC to FACT.
When the function FACT is called, the evaluation of
NN in NN*(FACT N -1) causes an error. Here,
DWIM is able to guess that NN probably means N by
using the contextual information that N is the name of
the argument to the function FACT in which the error
occurred. Since this correction involves a user program,
DWIM proceeds more cautiously than for corrections
to user type-in: it informs the user of the correction it is
about to make by typing NN (IN FACT) → N? and then
waits for approval. If the user types Y (for YES), or
simply does not respond within a (user) specified time
interval (for example, if the user has started the computation and left the room), DWIM makes the correction and continues the computation, exactly as though
the function had originally been correct, i.e., no information is lost as a result of the error.
If the user types N (for NO), the situation is the same
as when DWIM is not able to make a correction (that
it is reasonably confident of). In this case, an error
occurs, following which the system goes into a suspended state called a "break" from which the user can
repair the problem himself and continue the computation. Note that in neither case is any information or
partial results lost.
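The two policies can be summarized in a short sketch (a present-day illustration; the function and its arguments are invented, and the prompt format merely imitates the example above): a correction to type-in is made at once and simply reported, while a correction inside a user program is made only after approval, with silence past the timeout counting as approval.

# Sketch of the correction policy only; not the DWIM implementation.

def apply_correction(bad, good, in_type_in, ask_user=None):
    """Return the form that will actually be used.
    For type-in, correct immediately and report the action.
    For user programs, ask first; no reply before the timeout counts as yes."""
    if in_type_in:
        print(f"={good}")                  # e.g. the system prints =DEFINE
        return good
    reply = ask_user(f"{bad} (IN FACT) -> {good} ?") if ask_user else None
    if reply in (None, "Y"):               # None stands for "no reply in time"
        return good
    return bad                             # user said N: leave the error for a break

print(apply_correction("FATC", "FACT", in_type_in=True))   # FACT
print(apply_correction("NN", "N", in_type_in=False))       # N (timeout counts as yes)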
DWIM also fixes other mistakes besides misspellings,
e.g., typing eight for "(" or nine for ")" (because of
failure to hit the shift key). For example, if the user had
defined FACT as
(IF N=0 THEN 1 ELSE NN*8FACT N-1),
DWIM would have been able to infer the correct
definition.
DWIM is also used to correct other types of conditions not considered errors, but nevertheless obviously
not what the user meant. For example, if the user calls
the editor on a function that is not defined, rather than
generating an error, the editor invokes the spelling
corrector to try to find what function the user meant,
giving DWIM as possible candidates a list of user
defined functions. Similarly, the spelling corrector is
called to correct misspelled edit commands, p.a. commands, names of files, etc. The spelling corrector can
also be called by user programs.
As mentioned above, DWIM also uses information
gathered by monitoring user requests.

TABLE I-Statistics on Usage

Sessions    exec      edit        undo      p.a.        spelling
            inputs    commands    saves     commands    corrections
 1.          1422      1089        3418        87           17
 2.           454       791         782        44           28
 3.           360       650         680        33           28
 4.          1233      3149        2430       184           64
 5.           302        24         558         8            0
 6.           109        55         677         6            1
 7.          1371      2178        2138        95           32
 8.           400       311        1441        19           57
 9.           294       604         653         7           30
10.           102        44        1044         1            4
11.           378        52        1818         2            2

This is accomplished by having the p.a., for each user request,
"notice" the functions and variables being used, and
add them to appropriate spelling lists, which are then
used for comparison with (potentially) misspelled units.
This is how DWIM "knew" that FACT was the name
of a function, and was therefore able to correct FATC
to FACT.
As a result of knowing the names of user functions
and variables (as well as the names of the most frequently used system functions and variables), DWIM
seldom fails to correct a spelling error the user feels it
should have. And, since DWIM knows about common
typing errors, e.g., transpositions, doubled characters,
shift mistakes, etc.,* DWIM almost never mistakenly
corrects an error. However, if DWIM did make a mistake, the user could simply interrupt or abort the
computation, UNDO the correction (all DWIM corrections are undoable), and repair the problem himself.
Since an error had occurred, the user would have had to
intervene anyway, so that DWIM's unsuccessful
attempt at correction did not result in extra work for
him.
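A toy corrector along these lines might look as follows (an illustration in present-day notation, not the algorithm of Reference 1; the rules are simplified guesses): candidates whose length is obviously wrong are skipped, common slips such as transpositions, an extra or missing character, and shift mistakes are tried, and a successful match is moved to the front of its spelling list.

# Toy spelling corrector; the rules below are illustrative simplifications.

def close_enough(word, candidate):
    """True if candidate differs from word by one common typing slip:
    a shift mistake (e.g. 8 for "("), a transposition, or one extra
    or missing character (which covers doubled letters)."""
    shift_slips = {"8": "(", "9": ")"}
    w = "".join(shift_slips.get(c, c) for c in word)
    if w == candidate:
        return True
    for i in range(len(w) - 1):                      # single transposition
        if w[:i] + w[i + 1] + w[i] + w[i + 2:] == candidate:
            return True
    if len(w) == len(candidate) + 1:                 # word has one extra character
        return any(w[:i] + w[i + 1:] == candidate for i in range(len(w)))
    if len(w) + 1 == len(candidate):                 # word is missing one character
        return any(candidate[:i] + candidate[i + 1:] == w for i in range(len(candidate)))
    return False

def correct(word, spelling_list):
    """Compare word against the spelling list, skipping entries whose length is
    obviously wrong, and move a successful respelling to the front of the list."""
    for i, candidate in enumerate(spelling_list):
        if abs(len(candidate) - len(word)) > 1:
            continue                                 # e.g. PRETTYPRINT is never tried for DEFIN
        if close_enough(word, candidate):
            spelling_list.insert(0, spelling_list.pop(i))
            return candidate
    return None

print(correct("FATC", ["FACT", "PRETTYPRINT", "AND"]))   # FACT
print(correct("DEFIN", ["AND", "DEFINE"]))               # DEFINE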
STATISTICS OF USE
While monitoring user requests, the programmer's
assistant keeps statistics about utilization of its various
capabilities. Table I contains 5 statistics from 11
different sessions, where each corresponds to several
* The spelling corrector also can be instructed as to specific user
misspelling habits. For example, a fast typist is more apt to make
transposition errors than a hunt-and-peck typist, so that DWIM
is more conservative about transposition errors with the latter.
See Reference 1, pp. 17.20-22 for complete description of spelling
corrections.



TABLE II-Further Statistics

exec inputs                                     3445
undo saves                                     10394
changes undone                                   468
calls to editor                                  387
edit commands                                   3027
edit undo saves                                 1669
edit changes undone                              178
p.a. commands                                    360
spelling corrections                              74
calls to spelling corrector                     1108*
# of words compared                             5636**
time in spelling corrector (in seconds)         80.2
CPU time (hr:min:sec)                        1:49:59
console time                                21:36:48
time in editor                               5:23:53

* An "error" may result in several calls to the spelling corrector, e.g., the word might be a misspelling of a break command, of a p.a. command, or of a function name, each of which entails a separate call.
** This number is the actual number of words considered as possible respellings. Note that for each call to the spelling corrector, on the average only five words were considered, although the spelling lists are typically 20 to 50 words long. This number is so low because frequently misspelled words are moved to the front of the spelling list, and because words are not considered that are "obviously" too long or too short, e.g., neither AND nor PRETTYPRINT would be considered as possible respellings of DEFIN.
individual sessions at the console, following each of
which the user saved the state of his environment, and
then resumed at the next console session. These sessions are from eight different users at several ARPA
sites. It is important to note that with one exception
(the author) the users did not know that statistics on
their session would be seen by anyone, or, in most cases,
that the p.a. gathered such statistics at all.
The five statistics reported here are the number of:
1. requests to executive, i.e., in LISP terms, inputs to evalquote or to a break;
2. requests to editor, i.e., number of editing commands typed in by user;
3. units of undo information saved by the p.a., e.g.,
changing a list node (in LISP terms, a single
rplaca or rplacd) corresponds to one unit of undo
information;
4. p.a. commands, e.g., REDO, USE, UNDO, etc.;
5. spelling corrections.
After these statistics were gathered, more extensive
measurements were added to the p.a. These are shown
for an extended session with one user (the author) in
Table II below.

CONCLUSION

We see the current form of the programmer's assistant
as a first step in a sequence of progressively more
intelligent, and therefore more helpful, intermediary
agents. By attacking the problem of' representing the
intent behind a user request, and incorporating such
information in the p.a., we hope to enable the user to be
less specific, and the p.a. to draw inferences and take
more initiative.
However, even in its present relatively simplistic
form, in addition to making life a lot more pleasant for
users, the p.a. has had a surprising synergistic effect on
user productivity that seems to be related to the overhead that is involved when people have to switch tasks or
levels. For example, when a user types a request which
contains a misspelling, having to retype it is a minor
annoyance (depending, of course, on the amount of
typing required and the user's typing skill). However,
if the user has mentally already performed that task, and
is thinking ahead several steps to what he wants to do
next, then having to go back and retype the operation
represents a disruption of his thought processes, in
addition to being a clerical annoyance. The disruption
is even more severe when the user must also repair the
damage caused by a faulty operation (instead of being
able to simply UNDO it).
The p.a. acts to minimize these distractions and
diversions, and thereby, as Bobrow puts it, ". . . greatly
facilitates construction of complex programs because it
allows the user to remain thinking about his program
operation at a relatively high level without having to
descend into manipulation of details. "3 We feel that
similar capabilities should be built into low level
debugging packages such as DDT, the executive language of time sharing systems, etc., as well as other
"high-level" programming languages, for they provide
the user with a significant mental mechanical advantage
in attacking problems.

REFERENCES
1 W TEITELMAN D G BOBROW A K HARTLEY
D L MURPHY
BBN-LISP TENEX reference manual
BBN Report July 1971
2 W TEITELMAN
Toward a programming laboratory
Proceedings of First International Joint Conference on
Artificial Intelligence
Washington May 1969
3 D G BOBROW
Requirements for advanced programming systems for list
processing (to be published July 1972 CACM)

A programming language for real-time systems
by A. KOSSIAKOFF and T. P. SLEIGHT
The Johns Hopkins University
Silver Spring, Maryland

SUMMARY

This paper describes a different approach to facilitating the design of efficient and reliable large scale computer programs. The direction taken is toward less rather than more abstraction, and toward using the computer most efficiently as a data processing machine. This is done by expressing the program in the form of a two-dimensional network with maximum visibility to the designer, and then converting the network automatically into efficient code. The interactive graphics terminal is a most powerful aid in accomplishing this process. The principal objectives are as follows:

1. Provide a computer-independent representation of a process to be accomplished by a specified (target) computer, and automatically transform this representation into a complete program in the assembly language of the specified computer.
2. Design the representation so as to make highly visible the processing and flow of individual data, as well as that of control logic, in the form of a two-dimensional network, and make it understandable to engineers, scientists and computer programmers.
3. Design the representation so that it can be configured readily on an interactive computer-driven graphics terminal.
4. Design a simple but powerful set of computer-independent building blocks, called Data Circuit Elements, for representing the process to be accomplished by a computer, using distinct forms to represent each class of function.
5. Enable the user to simulate the execution of the Data Flow Circuits by inputting realistic data and observing the resultant logic and data flow.
6. Facilitate the design of an efficient complex data processing system by making visible the core usage and running time of each section of the process, thus avoiding the construction of a program which exceeds the capacity of the target computer, or which uses undue core capacity and time for low-priority operations.
7. Provide a representation of a computer program which is self-documenting, in a manner clearly understandable by either an engineer or programmer, making clearly visible the interfaces among subunits, the branch points and the successive steps of handling each information input.

INTRODUCTION

The development, "debugging," and maintenance of computer programs for complex data-processing systems is a difficult and increasingly expensive part of modern systems design, especially for those systems which involve high speed real-time processing. The problem is aggravated by the absence of a lucid representation of the operations performed by the program or of its internal and external interfaces. Thus, the successful use of modern digital computers in automating such systems has been severely impeded by the large expenditure of time and money in the design of complex computer programs. The development of software is increasingly regarded as the limiting factor in system development.

The individual operations of the central processing unit of a general purpose digital computer are very elementary, with the result that a relatively long sequence of instructions is required to accomplish most data-processing tasks. For this reason, programming languages have been developed which enable the programmer to write concise higher level instructions. A compiler then translates these high-level instructions into the machine code for a given computer. The programmer's task is greatly facilitated, since much of the detailed housekeeping is done by the compiler.

High level languages are very helpful in designing

programs for mathematical analysis and business applications. In contrast, they do not lend themselves to
the design of real-time programs for complex automated
systems. The high-level languages obscure the relation
between instructions and the time required for their
execution, and thus can produce a program which later
proves to require unacceptably long processing times.
Further, automated systems must often accommodate
large variations in data volume and "noise" content.
The use of existing high-level programming languages
inherently obscures the core requirements for storing
the code and data. This results in inefficient use of
memory and time, by a factor as high as three, and is
therefore a limiting factor in data handling capacity.
In such systems assembly language is often used to insure that the program meets all system requirements,
despite the increased labor involved in the detailed
coding. For these reasons the design of computer programs for real-time systems is much more difficult than
the preparation of programs for batch-type computational tasks.
An even more basic difficulty is a serious communication gap between the engineers and the programmers.
Engineers prepare the design specifications for the program to fit the characteristics of the data inputs and
the rate and accuracy requirements of the processed
outputs. In so doing they cannot estimate reliably the
complexity of the program that will result. The programmers have little discretion in altering the specifications to accommodate the limitations on computer
capacity and processing times. Consequently, the development of a computer program for an automated system often results in an oversized and unbalanced
product after an inordinate expenditure of effort and
time.
PRINCIPAL FEATURES
The principal features of the technique developed to solve these problems and to meet the objectives listed in the Summary are as follows:
Data flow circuit language*

The basis of the technique is the representation of a
computer program in" a "language" resembling circuit
networks, referred to as Data Flow Circuits. These

* The term "D.ata Flow" has been employed earlier but with quite
different objectives than those described. in. this work. (W. O.
Sutherland, "On-Line. Graphical Specification of Computer
Procedures," PhD thesis, Massachusetts Institute of Technology,
January 10, 1966).

represent the processing to be done in a form directly
analogous to diagrams used by engineers to lay out
electronic circuits. Data Flow circuits correspond to a
"universal language" having a form familiar to engineers and at the same time translatable directly into
computer code. This representation focuses attention
on the flow of identifiable data inputs through alternative paths or "branches" making up a data processing
network. The switching of data flow at the branch points
of the network is done by signals generated in accordance with required logic. These control signals usually
generate "jump" instructions in the computer program.
Data Flow circuits are constructed of building blocks,
which will be called Data Circuit Elements, each of
which represents an operation equivalent to the execution of a set of instructions in a general-purpose computer. These Data Circuit elements are configured by
the designer into a two-dimensional network, or Data
Flow circuit, which represents the desired data processing, as if he were laying out an electronic circuit using
equivalent hardware functional elements. Special circuit elements can also be assembled and defined by the
designer for his own use.
The direct correspondence between individual Data
Circuit elements and actual computer instructions
makes it possible to assess the approximate time for
executing each circuit path and the required core. This
permits the designer to balance, during the initial design
of the circuit, the requirements for accuracy and capacity against the program "costs" in terms of core and
running time. This capability can be of utmost importance in high-data-rate real-time systems, using
limited memory.
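The bookkeeping this implies can be sketched as follows (an illustrative model only; the element names follow the paper but the instruction counts and times are invented): each element type carries an approximate core and time cost for the target machine, and the cost of a circuit path is the sum over the elements on that path.

# Illustrative cost model; the numeric estimates are invented.

from dataclasses import dataclass

@dataclass
class ElementCost:
    instructions: int     # approximate words of core generated for the target computer
    microseconds: float   # approximate running time per activation

COSTS = {
    "READ FILE":         ElementCost(instructions=12, microseconds=30.0),
    "BRANCH ON COMPARE": ElementCost(instructions=6,  microseconds=10.0),
    "WRITE FILE":        ElementCost(instructions=10, microseconds=25.0),
}

def path_cost(path):
    """Total core and running time for one pass along a circuit path."""
    core = sum(COSTS[name].instructions for name in path)
    time = sum(COSTS[name].microseconds for name in path)
    return core, time

# One cycle of a loop in which a hit passes the threshold and is filed:
print(path_cost(["READ FILE", "BRANCH ON COMPARE", "WRITE FILE"]))   # (28, 65.0)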
The Data Flow circuit representation also serves as a
particularly lucid form of documenting the final derived
computer program. It can be configured into a form
especially suited for showing the order in which the program executes each function.
Application of computer graphics

The form of the Data Flow circuits and circuit elements is designed to be conveniently represented in a
computer-driven graphics terminal, so as to take advantage of its powerful interactive design capability.
In this instance, the Data Flow Circuit is designed on the
display by selecting, arranging and connecting elements
using a light pen, joystick, keyboard or other graphic
aid, in a manner similar to that used in computer design
of electronic circuits.
As the circuit is being designed, the computer display
program stores the circuit description in an "element
interconnection matrix" and a data "dictionary".


This description is checked by the program and any
inconsistencies in structure are immediately drawn to
the designer's attention.
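A minimal version of such a check might look like the following sketch (the data layout is an assumption made for illustration and is not the actual element interconnection matrix): every connection must name an existing element and a terminal number that the element actually has.

# Sketch of a structural consistency check; the tuple format is illustrative.

def check_circuit(elements, connections):
    """elements: {reference_number: number_of_terminals}.
    connections: (from_element, from_terminal, to_element, to_terminal) tuples.
    Returns a list of inconsistencies to draw to the designer's attention."""
    problems = []
    for src, src_t, dst, dst_t in connections:
        for ref, term in ((src, src_t), (dst, dst_t)):
            if ref not in elements:
                problems.append(f"element {ref} is not defined")
            elif not 1 <= term <= elements[ref]:
                problems.append(f"element {ref} has no terminal {term}")
    return problems

# Element 3 has 5 terminals and element 4 has 4; the second connection
# names a nonexistent element 9 and is therefore reported.
print(check_circuit({3: 5, 4: 4}, [(3, 3, 4, 1), (3, 4, 9, 1)]))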
Transformation into logical form

After the elements and interconnections have been
entered into the interactive computer by means of either
a graphic or alphanumeric terminal, the computer converts the Data Flow circuit automatically into an Operational Sequence by means of a Transformation program. This orders the operations performed by the
circuit elements in the same sequence as they would be
serially processed by the computer.
Code generation and simulation

In this step the computer converts the operational
sequence into instructions for the interactive computer.
The program logic is then checked out by using sample
inputs and examining the outputs. Errors or omissions
are immediately called to the attention of the designer
so that he can modify the faulty connections or input
conditions in the circuit on-line. The assembly language
instructions for the target computer are then generated.
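The effect of the simulation step can be suggested by a brief sketch (invented names and operations; not the CPS code that the system actually generates): each entry of the Execution Sequence is applied in turn to sample inputs so the designer can examine the outputs before target code is produced.

# Illustrative simulation of an Execution Sequence; the dispatch table is invented.

def simulate(execution_sequence, data):
    """Apply each entry of the Execution Sequence to the sample inputs in data
    and return the resulting values for inspection."""
    operations = {
        "ADD":            lambda d: d.update(SUM=d["X1"] + d["X2"]),
        "BRANCH ON ZERO": lambda d: d.update(TAKEN=(d["SUM"] == 0)),
    }
    for step in execution_sequence:
        operations[step](data)
    return data

print(simulate(["ADD", "BRANCH ON ZERO"], {"X1": 2, "X2": -2}))
# {'X1': 2, 'X2': -2, 'SUM': 0, 'TAKEN': True}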
Integration and testing

The derived program is assembled by the interactive
computer with other blocks of the total program and the
result is again checked for proper operation. Subsequent
modifications to the program are made by calling up
the circuit to be altered, and making the changes at the
display terminal.
The above steps provide the Graphical Automatic
Programming method for designing, documenting and
managing an entire complex computer program through
the use of Data Circuit language. The result is highly
efficient system software which is expected to be produced at a fraction of the time and cost achievable by
present methods.
Data circuit elements

In selecting the "building blocks" to be used as the
functional elements of Data Flow circuits, each Data
Circuit Element was designed to meet the following
criteria:
1. It must be sufficiently basic to have wide application in data processing systems.
2. It must be sufficiently powerful to save the designer from excessive detailing of secondary processes.
3. It must have a symbolic form which is simple to
represent and meaningful in terms of its characteristic function, but which will not be confused
with existing component notation.
The choice and definition of the basic GAP (Graphical Automatic Programming) Data Circuit Elements
has evolved as a result of applications to practical
problems. Seven classes of circuit elements have been
defined, as follows:
SENSE elements test a particular characteristic of a
data input and produce one of two outputs according
to whether the result of the test was true or false.
OPERATOR elements perform arithmetic or logical
operations on a pair of data inputs and produce a data
output.
COMPARISON elements test the relative magnitude
of two or three data inputs and produce two or three
outputs according to the result of the test.
TRANSFER elements bring data in and out of the
circuit from files in memory and from external devices.
INTEGRATING elements, which are in effect complex operator elements, collect the sum or product of
repeated operations on two variables.
SWITCHING elements set and read flags, index a
series of data words, branch a succession of data signals
to a series of alternate branches, and perform other
branching functions.
ROUTING elements combine, split, and gate the
flow of data and control signals, and provide the linkage
between the program block represented by a given
Data Flow Circuit and other program blocks (circuits)
constituting the overall program. Some routing elements do not themselves produce program instructions,
but rather modify those produced by the functional elements to which they are connected.
Table I lists the elements presently defined for initial
use in the Graphical Automatic Programming language
(GAP). These include four SENSE elements, eleven OPERATOR elements, six COMPARISON elements, six TRANSFER elements, fourteen ROUTING elements, three SWITCHING elements, and six INTEGRATING elements. Others found to meet the basic
criteria and be widely applicable will be added to the
basic vocabulary. Each designer also may define for
his own use special-purpose functions as auxiliary elements, so long as they maintain the basic characteristics, i.e., they accurately show data flow and are directly
convertible to machine instructions to permit precise
time and core equivalency. Most of these can be built
up from combinations of the basic elements.
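As a hedged illustration of such a combination (the "scale and clamp" element is invented for this sketch; Multiply and Minimum are basic OPERATOR elements from Table I), a designer-defined special element can simply be a fixed arrangement of basic elements packaged as one building block.

# Hypothetical special-purpose element built from basic elements; illustrative only.

def op_multiply(x1, x2):
    return x1 * x2          # basic OPERATOR element: Multiply

def op_minimum(x1, x2):
    return min(x1, x2)      # basic OPERATOR element: Minimum

def special_scale_and_clamp(x, gain, limit):
    """Invented special element: scale a datum by a gain, then clamp it to a limit."""
    return op_minimum(op_multiply(x, gain), limit)

print(special_scale_and_clamp(7, gain=3, limit=16))   # 16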


Figure 1 illustrates the symbolic representation of a typical circuit element of each of the seven classes. Solid lines are used for data signals and dashed lines for control signals. Data inputs are denoted by an X, data outputs by a Y, control inputs by a C and control outputs by a J. When the input or output may be either control or data the letters I or O are used. A U simply means unconnected.

In Figure 1 the sample elements are seen to have the following types and numbers of connections:
Element Type    Name                  Data Inputs   Control Inputs   Data Outputs   Control Outputs
ROUTING         DATA SPLIT            1             0                2              0
SENSE           BRANCH ON ZERO        1-2           1-0              0-2            2-0
OPERATOR        ADD                   2             1-0              1              0-1
COMPARISON      BRANCH ON COMPARE     2-3           1-0              0-3            3-0
TRANSFER        READ FILE             2             2                1              1
SWITCHING       SET BRANCH            0             1                3              0
INTEGRATING     SUM MULTIPLY          2             1                0              0-1


OPERATOR and COMPARISON elements are provided with an optional control input which serves to
delay the functioning of the element until the receipt of
the control signal from elsewhere in the circuit. The
READ FILE and other loop elements have a control
input which serves a different purpose, namely to initiate the next cycle of the loop.
At present, the maximum number of connections for
any element is eight and for SENSE and OPERATOR
elements it is four. Connections, or terminals, are numbered clockwise with 1 at 12 o'clock.

Data preparation

All of the elements described above have either more
than one input or more than one output. There are a
number of elementary operations which simply alter a
data word, thus having a single input and a single output. These operations include masking, shifting, complementing, incrementing and other simple unit processes
ordinarily involved in housekeeping manipulations, as
for example packing several variables into a single data
word or the reverse.

TABLE I-Data Flow Circuit Elements

SENSE: Branch on Zero, Branch on Plus, Branch on Minus, Branch on Constant

OPERATOR: Add, Average, Multiply, Subtract, Divide, Exponentiate, And, Inclusive Or, Exclusive Or, Minimum, Maximum

COMPARISON: Branch on Compare, Branch on Greater, Branch on Unequal, Correlate, Threshold, Range Gate

TRANSFER: Read Word, Write Word, Read File, Write File, Function Table, Input Data, Output Data

INTEGRATING: Sum Add, Sum Multiply, Sum Divide, Sum Exponentiate, Product Add, Product Exponentiate

SWITCHING: Set Branch, Read Branch, Index Data

ROUTING: Linkage Data, Passive Split, Data Split, Control Split, Linkage Exit, Passive Junction, Data Junction, Control Junction, Linkage Store, Data Gate, Data Pack, Linkage Entry, Data Loop, Control Loop

[Figure 1 presents the graphical symbol for one sample element of each class: DATA SPLIT (ROUTING), BRANCH ON ZERO (SENSE), ADD (OPERATOR), BRANCH ON COMPARE (COMPARISON), READ FILE (TRANSFER), SET BRANCH (SWITCHING), and SUM MULTIPLY (INTEGRATING); no SPECIAL element is defined.]

Figure 1-Data flow circuit elements graphical representation

In the Data Flow Circuit notation, such manipulation
is specified by a "prepare" operation preliminary to the
operation performed by each element. The manipulations involved in data preparation, which represents a
major portion of the "housekeeping" labor in programming, are thereafter accomplished automatically along
with the translation of the functional operations of the
elements in the Data Circuit. This type of operation is
designated graphically by closed arrowheads at input
terminals.
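For concreteness, the kind of single-input manipulation a prepare step performs can be sketched as ordinary masking and shifting (the 16-bit layout below, with the amplitude in the high 6 bits and the range in the low 10, is invented for the illustration).

# Generic mask-and-shift "prepare" manipulation; the word layout is illustrative.

AMPLITUDE_SHIFT = 10
AMPLITUDE_MASK = 0b111111 << AMPLITUDE_SHIFT     # high 6 bits of a 16-bit word
RANGE_MASK = 0b1111111111                        # low 10 bits

def pack_hit(amplitude, range_value):
    """Pack two variables into a single data word."""
    return ((amplitude << AMPLITUDE_SHIFT) & AMPLITUDE_MASK) | (range_value & RANGE_MASK)

def prepare_amplitude(hit_word):
    """Prepare step: isolate the amplitude field before a comparison element uses it."""
    return (hit_word & AMPLITUDE_MASK) >> AMPLITUDE_SHIFT

word = pack_hit(amplitude=37, range_value=513)
print(prepare_amplitude(word))   # 37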


SAMPLE DATA FLOW CIRCUIT

The particular system from which the following example has been drawn concerns real-time processing of radar signals or "hits." This function is normally associated with track-while-scan radar systems.

The logic of the example "Hit Sorting Program" illustrated in Figure 2 operates by indexing through a number of hits in the HIT file. Each hit whose amplitude is greater than or equal to a specific threshold (T) is placed in the track (TRK) file. When the HIT file is empty or the TRK file is full the program is exited. Three functional and seven nonfunctional elements accomplish this task:

1. Read File (RF), to extract each hit from the HIT file.
2. Branch on Compare (BC), to select hits whose amplitude equals or exceeds the threshold.
3. Write File (WF), to enter the selected hits into the TRK file for retention.
4. A Data Split (DS), a Data Gate (DG), and two Control Junctions (CJ) distribute the data and control to the correct element terminals. Data inputs to the circuit are provided by a Linkage Data element (LD), the control input by a Linkage Entry element (LE), and two control exits by a Linkage Exit element (LX).

[Figure 2 shows the Hit Sorting Program data flow circuit described in the text, with the HIT and TRK files, the Read File, Branch on Compare, Write File, Data Split, Data Gate and Control Junction elements, and the linkage entry (ENT) and exits (EXH, EXT).]

Figure 2-Hit sorting program

Description of a sample data flow circuit

In the Data Flow circuit in Figure 2, the numbers in parentheses are unique reference numbers for each element and are prefixed with an "R" in the following text. The reference numbers, R, and element labels in parentheses do not actually appear at the graphic terminals but are used in the explanations of the circuit that follow.

The circuit is activated by a control signal at Read File, element R3. This element reads out a hit word containing range and amplitude (A). The input at terminal 1 is the base address of the file (HIT) and at terminal 2 is the index (N) for the negative number of hits.

The Data Split (R6) distributes the hit word to the Branch on Compare (R4) and to the Data Gate (R7).

At the Branch on Compare element the amplitude (A) is extracted and used to compare with a threshold (T).

If the amplitude is greater than or equal to

the threshold, control is passed to the Data
Gate (R7). The original hit word from the Read
File enters the Write File element (R5). The
index at terminal 2 (J) is incremented. If the
index indicates that the file is full the output to
the circuit exit is selected. But normally the hit
is placed in the file by using the base at terminal
4 (TRK) and the index (J). Control is then
passed through the Control Junction (R8) to
the looping input (terminal 5) of the Read File.
If the amplitude is less than the threshold,
control is immediately passed to the looping input to the Read File.
The looping input to Read File (R3) causes
the index (N) to be incremented. If the index
indicates that no more entries or hits are present
the output to the circuit exit (Linkage Exit) is
selected. Otherwise, the next hit is read out and
processed through another cycle of the loop.
The quantities HIT, N, T, TRK, and J are
outputs from the Linkage Data element (RO).
This element does not appear explicitly in the
graphical representation, as in the case of the
other linkage elements.
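In conventional terms the circuit expresses a simple filtering loop; the following sketch restates that logic in present-day code (an illustration of the behavior described above, not output of the GAP translator).

# Conventional restatement of the Hit Sorting Program's logic; illustrative only.

def sort_hits(hit_file, threshold, track_capacity):
    """Copy every hit whose amplitude meets the threshold into the track file,
    stopping when the hit file is exhausted or the track file is full."""
    trk_file = []
    for amplitude, rng in hit_file:            # Read File: next hit each cycle
        if amplitude >= threshold:             # Branch on Compare against T
            trk_file.append((amplitude, rng))  # Write File into TRK
            if len(trk_file) == track_capacity:
                break                          # TRK full: take the circuit exit
    return trk_file                            # HIT empty or TRK full: exit

hits = [(12, 900), (3, 450), (25, 700)]
print(sort_hits(hits, threshold=10, track_capacity=16))   # [(12, 900), (25, 700)]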
The Hit Sorting Program is a simple example for the
purpose of explaining the techniques employed in
GAP. A circuit more representative in size is the Target
Coordinate Computation Circuit, shown in Figure A1, which is developed in an analogous way in the Appendix.

[Figure 3 outlines the processing of a GAP circuit as a sequence of steps: an element directory and index, a prepare list, a data pack list and a glossary; generation of the Operational Sequence (Transformation); generation of the Execution Sequence; and Translation.]

Figure 3-Processing of a GAP circuit

Sample processing of a data flow circuit

Once the particular function has been defined in the form of a GAP circuit on a scratch pad, several steps are taken to generate code for the target computer (Figure 3). The circuit is input and checked interactively through a graphics or alphanumeric terminal. The transformation process then converts the two-dimensional circuit representation into a sequential representation of the order in which code for the elements is to be written.

The first step in the transformation is a detailed trace through the circuit. This trace produces a tabulation, called the Operational Sequence. By removing all nonfunctional steps and other information not necessary for final coding, this is reduced to an ordered list of elements and connections called the Execution Sequence. Each entry in the Execution Sequence corresponds to a dynamic macro statement.

The writing of instructions, Translation, now takes place. The user can select actual target computer code or a simulation of the target computer code. Normally the simulation step is first selected, and later, when the circuit has been found to function properly, assembly code of the target computer is generated.

The computer configuration used in this example is an IBM 360/91 operating under MVT. The software is written in the Conversational Programming System (CPS) and is operational under any terminal in the system. The simulation step generates CPS code and the target computer is a Honeywell DDP-516 whose assembly language is called DAP.

Input

The first step in the process of Graphical Automatic Programming is the input of a Data Flow Circuit into an interactive computer terminal. Unless the circuit is very simple, it is usually first laid out roughly on a scratch pad in order to save terminal time during the initial conceptual stages of circuit design. When an alphanumeric terminal is employed, as in the example described in the succeeding sections, the circuit input consists of entering the element labels (e.g., RF for Read File), the number of terminals, and the Interconnection Matrix. The latter requires the operator to specify only output connections for each element. An interactive program completes the matrix. The output connections are given in terms of the element reference and terminal numbers and type of connection.
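That convention can be illustrated with a small sketch (the dictionary layout is an assumption made for this example, not the stored form of the matrix): the designer supplies only the output side of each connection, and the program generates the matching input-side entries.

# Sketch of completing an interconnection matrix from output connections only.

def complete_matrix(outputs):
    """outputs: {(element, terminal): (to_element, to_terminal, kind)} describing
    only the output side of each connection.  Returns a matrix that also holds the
    mirror-image input entries, as the interactive program would supply them."""
    matrix = dict(outputs)
    for (elem, term), (to_elem, to_term, kind) in outputs.items():
        matrix[(to_elem, to_term)] = (elem, term, kind)
    return matrix

# Example (terminal numbers invented): an output terminal of element 3 (Read File)
# feeds a terminal of element 6 (Data Split); the reverse entry is filled in automatically.
matrix = complete_matrix({(3, 3): (6, 1, "data")})
print(matrix[(6, 1)])   # (3, 3, 'data')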
Table II gives the Interconnection Matrix of the Hit
Sorting Program illustrated in Figure 2. Each row of the
matrix lists connections to each of 'the terminals of the
given element. The order of the rows is in accordance
with an arbitrary but unique reference number assigned


TABLE II-Interconnection Matrix

Terminal Number
1
C

0

1
2
3
4
5
6
7
8
9

0

2

3Yl
306
5 13
0 X3

C7

3 13
0

X2

U U

0 X4

7X3

0 XS

c:r:&l>
6. X2
5 C6
4 C3

7Yl
~ r.3

4 ~S
4 C4

I

3

4

5

6

7

3Y2

4Y2

5Y2

SY4

IJl

2Jl
9Jl
2J2
4Y6
SYl
3J5