AFIPS
CONFERENCE
PROCEEDINGS
VOLUME 32

1968
SPRING JOINT
COMPUTER
CONFERENCE
APRIL 30 - MAY 2
ATLANTIC CITY, NEW JERSEY

The ideas and opinions expressed herein are solely those of the authors and are
not necessarily representative of or endorsed by the 1968 Spring Joint Computer
Conference Committee or the American Federation of Information Processing
Societies.

Library of Congress Catalog Card Number 55-44701
THOMPSON BOOK COMPANY
National Press Building
Washington, D.C. 20004

© 1968 by the American Federation of Information Processing Societies, New
York, New York 10017. All rights reserved. This book, or parts thereof, may
not be reproduced in any form without permission of the publisher.

CONTENTS

COMMERCIAL TIME-SHARING - THE SECOND GENERATION
Time sharing versus batch processing: The experimental evidence .... H. Sackman .... 1
Computer scheduling methods and their countermeasures .... E. G. Coffman, Jr., L. Kleinrock .... 11
Some ways of providing communication facilities for time shared computing .... H. L. Steadman, G. R. Sugar .... 23
The Baylor medical school teleprocessing system .... W. Hobbs, J. McBride, A. Levy .... 31

COMPUTER AIDED DESIGN
Some techniques for shading machine renderings of solids .... A. Appel .... 37
A system for interactive graphical programming .... W. Newman .... 47
Automation in the design of asynchronous sequential circuits .... R. J. Smith, J. H. Tracey, W. L. Schoeffel, G. K. Maki .... 53

SCIENTIFIC APPLICATIONS OF GENERAL INTEREST
Interpretation of organic chemical formulas by computer .... A. N. DeMott .... 61
A simulation in plant ecology .... R. E. Boche .... 67
A major seismic use for the fast-multiply unit .... R. D. Fore~e~, T. J. Hollingsworth, J. D. Morgan .... 73
A generalized linear model for optimization of architectural planning .... R. Aguilar, J. E. Hand .... 81

COMPUTERS IN COMMUNICATIONS SYSTEMS
Standards for user procedures and data formats in automated information systems and networks .... J. L. Little, C. N. Mooers .... 89
Procedures and standards for inter-computer communications .... A. K. Bhushan, R. H. Stotz .... 95
An error-correcting data link between small and large computers .... S. W. Andreae .... 105
Graphical data processing .... E. J. Smura .... 111
The advancing communication technology and computer communication systems .... S. J. Kaplan .... 119

HYBRID COMPUTER SYSTEMS AND TECHNIQUES
Analog computer simulation of semiconductor circuits .... P. Balaban, J. Logan .... 135
Stable computing algorithms for partial differential equations .... R. Vichnevetsky .... 143
BASP - A Biomedical Analog Signal Processor .... W. J. Mueller, P. E. Buchthal .... 151
Electrically alterable digital differential analyzer .... G. P. Hyatt, G. Ohlberg .... 161

COMMERCIAL DATA PROCESSING
DATA FILE TWO .... R. J. Jones .... 171
GIPSY - A Generalized Information Processing System .... G. Del Bigio .... 183
The ISCOR real-time industrial data processing system .... W. M. Lambert, W. R. Ruffels .... 193
Martin Orlando reporting environment .... M. J. McLaurin, W. A. Traister .... 197
Simulation applications in computer center management .... T. F. McHugh, Jr., E. Scott .... 209

MULTIPROGRAMMING OPERATING SYSTEMS
Multiprogramming system performance measurement and analysis .... H. N. Cantrell, A. L. Ellison .... 213
Multiprogramming, swapping, and program residence priority in FACOM .... M. Tsujigado .... 223
A storage hierarchy system for batch processing .... D. N. Freeman .... 229
Burroughs B6500/B7500 stack mechanism .... E. A. Hauck, B. A. Dent .... 245

ADVANCES IN MAGNETIC MEMORY DESIGN
A compact, economical core memory with all monolithic electronics .... R. W. Reichard, W. F. Jordan, Jr. .... 253
A progress report on large capacity magnetic film memory development .... J. I. Raffel, A. H. Anderson, T. S. Crowther, T. O. Herndon, C. Woodward .... 259
A fast 2½D mass memory .... C. C. M. Schuur .... 267
A magnetic associative memory .... 275

SWITCHING THEORY
Selection and implementation of a ternary switching algebra .... R. L. Herrmann .... 283
Application of Karnaugh maps to Maitra cascades .... G. Fantauzzi .... 291
Universal logic circuits and their modular realizations .... S. S. Yau, C. K. Tang .... 297
Sorting networks and their applications .... K. E. Batcher .... 307

MAN-MACHINE INTERFACE
The Sylvania data tablet .... J. F. Teixeira, R. P. Sallen .... 315
Computer input of forms .... A. P. Feldman .... 323
Machine-to-man communication by speech, Part I .... F. Lee .... 333
Machine-to-man communication by speech, Part II .... J. Allen .... 339
A system of computer support for neurophysiological investigations .... F. Abraham, L. Betyar, R. Johnston .... 345
Graphical data management in a time-shared environment .... S. Bowman, R. A. Lickhalter .... 353

LANGUAGES: TODAY AND TOMORROW
On the formal definition of PL/I .... K. Bandat .... 363
LISP A: A LISP-like system for incremental computing .... E. J. Sandewall .... 375
TGT: Transformational grammar tester .... D. L. Londe, W. J. Schoen .... 385
DATAPLUS: A language for real time information retrieval for hierarchical data bases .... N. Sinowitz .... 395
A language design for concurrent processes .... L. G. Tesler .... 403
Control of sequence and parallelism in modular programs .... L. Constantine .... 409

GENERAL INTEREST
Anatomy of a real-time trial .... A. B. Kamman, D. R. Saxton .... 415
Fourth generation computer systems .... C. J. Walter, M. J. Bohl, A. B. Walter .... 423
Fourth generation computer organization .... S. E. Lass .... 435
Optimal control of satellite attitude by a random search algorithm on a hybrid computer .... W. P. Kavanaugh, E. C. Stewart, D. H. Brocker .... 443
Evaluation and development techniques for computer assisted instruction programs .... M. Tarter, T. S. Hauser, R. L. Holcomb .... 453
Computer capacity trends and order-delivery lags 1961-1967 .... M. H. Ballot, K. E. Knight .... 461

DIGITAL SIMULATION TECHNIQUES
Error estimate of a 4th order Runge-Kutta method with only one initial derivative evaluation .... A. S. Chai .... 467
Improved techniques for digital modeling and simulation of nonlinear systems .... J. S. Rosko .... 473
Extremal statistics in computer simulation of digital communication systems .... M. Schwartz, S. H. Richman .... 483
MUSE: A tool for testing a multi-terminal system in a multi-terminal environment .... E. W. Pullen, D. F. Shuttee .... 491

FAULT DIAGNOSIS
Diagnostic engineering requirements .... J. J. Dent .... 503
Self-repair techniques in digital systems .... F. B. Cole, W. V. Bell .... 509
A study of the data commutation problems in a self-repairable multiprocessor .... K. N. Levitt, M. W. Green, J. Goldberg .... 515
A distinguishability criterion for selecting efficient diagnostic tests .... H. Y. Chang .... 529

Time-sharing versus batch processing:
the experimental evidence
by H. SACKMAN
System Development Corporation
Santa Monica, California

INTRODUCTION
Time-sharing of computer facilities has been widely
acclaimed as the most significant evolutionary step
that has been taken in recent years toward the
development of generalized information utilities.
The basic techniques of interactive man-computer
time-sharing were developed in the 1950's in connection with realtime command and control computing
systems, initially in SAGE air defense. Time-sharing
was practiced in these pioneering systems in the sense
that many military operators at separate consoles -
consoles equipped with push-buttons, CRT displays
and light guns - were able to request and receive
information from the central computing system at
essentially the same time. These historical roots
reveal that time-sharing is an outgrowth of realtime
system development.
The emergence of time-sharing systems as general-purpose online computing facilities is primarily a
development of the 1960's. The users of such systems
are a more or less random and changing collection of
people at any point in time, typically but not necessarily working on unrelated tasks with different
computing programs, entering and leaving the system
independently of one another, and using it for varying
and largely unpredictable periods of time; such use
approaches that of a public utility, roughly analogous
to the quasi-random pattern of telephone traffic.
Experimental time-sharing systems were designed
and operated in the first half of this decade. The
Massachusetts Institute of Technology developed the
Compatible Time-Sharing System (CTSS) used for
Project MAC (Corbato, Merwin-Daggett, and Daley,
1962);1 the System Development Corporation
developed TSS, the Time-Sharing System for the
Advanced Research Projects Agency of the Department of Defense (Schwartz, Coffman and Weissman,
1964);2 and RAND developed JOSS, the Johnniac
Open-Shop System (Shaw, 1964).3 Commercial
applications have sprouted and are rapidly spreading,
with practically all computer manufacturers marketing or developing some version of time-sharing hardware, software, and support facilities.
In batch or offline processing - the operational
workhorse of most contemporary data processing and
the evolutionary predecessor of time-sharing - the user
typically has indirect contact with the computer.
Batch processing has been the rule for economical
operation, with stacked jobs done one at a time on a
waiting-line basis. Job scheduling is often mediated by
programed operating systems based on job priority
and estimated computer running time. Turnaround
time may take minutes, hours, days or even more
than a week before completed outputs are returned in
response to job requests. Proponents of stacked-job
systems argue that throughput time, useful computations per unit time, is at a maximum with minimum
waste of computer resources.
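The stacked-job rule just described (priority class first, then shortest
estimated running time) is easy to state as a priority queue. The following
minimal sketch in Python is illustrative only; the job names and figures are
invented and do not come from any system discussed in this paper:

    import heapq

    # One entry per stacked job: (priority, estimated_minutes, name).
    # The tuple ordering itself encodes the discipline: most urgent
    # priority class first and, within a class, shortest estimate first.
    waiting_line = [
        (1, 30.0, "payroll"),
        (2,  2.0, "debug shot"),
        (1,  5.0, "conference stats"),
    ]
    heapq.heapify(waiting_line)

    clock = 0.0
    while waiting_line:
        priority, est, name = heapq.heappop(waiting_line)
        clock += est  # jobs run one at a time, to completion
        print(f"{name!r} done at t={clock:.0f} min after submission")

Under such a discipline the turnaround time discussed above is simply each
job's completion instant, which grows with everything queued ahead of it.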
In contrast, time-sharing permits fast and direct
access to the computer when the user wants it
(provided that guaranteed access is available). For
many types of data-processing tasks, the user can get
what he wants in minutes rather than hours or days.
He may exert continual control over his program
and he is free to change his mind and do things
differently, at least within system capability, as he
interacts with the computer. Time-sharing typically
means expense-sharing among a large number of
subscribers, with reduced computing costs for many
kinds of applications. And perhaps most significant
of all, the online nature of time-sharing permits direct
man-computer communication in languages that are
beginning to approach natural language, at a pace
approaching normal human conversation, and in
some applications, at graded difficulty levels appropriate to the skill and experience of the user. Time-sharing systems, because of requirements for expanded hardware and more extensive software, are
generally more expensive to build and operate than
closed-shop systems using the same central computer.

Time-sharing advocates feel that such systems more
than pay for themselves in convenience to the user, in
more rapid program development, and in manpower
savings.
Time-sharing, however, has always had its critics.
Their arguments are often directed at the efficiency
of time-sharing, that is, at how much of the computational power of the machine is actually used for
productive data processing as opposed to how much
is devoted to relatively non-productive functions
(program swapping, idle time, etc.). These critics
claim that the cost-effectiveness of time-sharing
systems is questionable when compared to modern
closed-shop methods, particularly the most advanced
versions of fast-turnaround batch systems. Since
online systems are presumably more expensive
than offline systems, there is little justification for
their use except in those situations where online
access is mandatory for system operations (for
example, in realtime command and control systems).
Time-sharing advocates respond to these charges
by saying that, even if time-sharing is more costly
with regard to hardware and operating efficiency,
savings in programmer man-hours and in the time
required to produce working programs more than
offset such increased costs. The critics, however, do
not concede this point either. Many believe that
programmers grow lazy and adopt careless and inefficient work habits under time-sharing. Easy access
to the computer, they claim, tends to make users
more prone to casual and costly trial and error
computer runs with poorly prepared problems, in an
effort to trade off computer time against human time,
as compared to the batch environment in which
computer time is at a premium and programers do
more extensive desk checking. In fact, they claim that
instead of improving, user performance is likely to
deteriorate.
While the controversy continues to rage, many
computer installations, pursuing their own unique
evolutionary paths, are quietly assimilating the best
of both worlds. Time-shared systems are tending to
find it convenient to run short jobs to completion and
to interleave stacked production jobs into long
pauses in online operations as "background" tasks.
Conventional operating systems are becoming less
conventional by incorporating, in novel forms, many
features associated with time-sharing (e.g., direct
coupled and remote batch systems). Of special
interest are the high capacity, fast turnaround batch
systems such as those reported by Lynch (1967).4
With the continued growth of computer installations,
the evolutionary varieties of online and offline
facilities are diversifying into new forms and are also
converging into hybrid forms. It may well be that
many large computer complexes of the future will
offer a variety of services in a spectrum of optional
online, offline and mixed operational modes.
The above arguments are characteristic of the
speculative controversy that has attended the recent
rapid growth of time-sharing. For various and complex
reasons - which range beyond the purpose and scope
of this paper but which are treated elsewhere in
detail by the author (1967)5 - the growth of an applied
experimental tradition in man-computer communication has not been vigorously pursued in the computer
sciences. Over the last two years, however, this
subjective and predisciplinary tenor has finally,
and somewhat belatedly, taken a more objective and
scientific turn with the advent of experimental
comparisons of time-sharing and batch processing
in the literature. Five such studies are available
and together they comprise an instructive and valuable
body of knowledge on methodology and findings
(Erikson, 1966;6 Gold, 1967;7 Grant and Sackman,
1967;8 Schatzoff, Tsao and Wiig, 1967;9 and Smith,
196710). The objectives of this paper are to critically review and evaluate these studies, summarize
areas of agreement and disagreement, point up key
gaps in these initial experiments, and sketch the more
promising avenues for future research.

Comparative methodology of the experimental studies
Table I outlines and summarizes the main characteristics of the five experimental studies. Unfortunately,
an outline of this kind cannot do justice to the
extensive details of each study, and the interested
reader is referred to the original articles. The aim of
this section is to review comparative methodology
to help determine the technical scope and limitations
of these studies. Table I breaks the description of each
study down into five categories - subjects, problems,
computer system facilities, experimental procedure,
and performance measures. Each of these is discussed
in turn.
There are a total of 212 subjects in all five studies.
It probably comes as no surprise to anyone that
college students form the bulk of this population,
with only one sample showing a highly experienced
group of programmers (Grant and Sackman). It will be
noted later that the three studies with small samples
were organized around relatively efficient experimental designs to optimize the information yield
from the results.
A critical experimental control factor, not shown
in Table I, enters into the selection of subjects. This
factor is the nature of the computer-related experience
of the subjects and their bias, as a result of their
experience, toward time-shared or batch systems.

SUBJECTS
Sample size: Erikson 9; Gold 60; Grant and Sackman 12; Schatzoff, Tsao and
Wiig 4; Smith 127.
Type of subjects: Erikson, programmer trainees; Gold, undergraduate and
graduate students; Grant and Sackman, experienced programers from an R&D
setting; Schatzoff, Tsao and Wiig, undergraduate students with high
programing aptitude; Smith, undergraduate and graduate students in an
introductory programing course.
Experience level: Erikson, less than one year; Gold, 78% of subjects had
taken at least one programing course; Grant and Sackman, average of 7 years
experience; Schatzoff, Tsao and Wiig, "some" programing experience; Smith,
most subjects had less than a year experience.

PROBLEMS
Number and types of problems: Erikson, two problems, a sorting routine and a
cube puzzle; Gold, one problem, a simulation model of the construction
industry; Grant and Sackman, two problems, algebra and maze; Schatzoff, Tsao
and Wiig, four problems: Monte Carlo integration, algebraic sorting, Pig
Latin translator, text format conversion; Smith, two easy "warmup" problems
and four experimental problems: cosine infinite series, matrix sorting,
language translation, heuristic program.
Difficulty level: Erikson, conceptually simple; Gold, moderately difficult,
open-ended problem; Grant and Sackman, moderately difficult for highly
experienced programers; Schatzoff, Tsao and Wiig, moderately difficult for
skilled student subjects; Smith, moderately difficult for beginners.
Average completion time: Erikson, a few hours; Gold, 15 to 20 hours; Grant
and Sackman, approximately 60 hours to complete both problems; Schatzoff,
Tsao and Wiig, approximately 40 hours to complete all problems; Smith,
approximately 60 hours to complete all problems.

ONLINE/OFFLINE FACILITIES
Online facility: Erikson, SDC Q-32 Time-Sharing; Gold, MIT Time-Sharing,
IBM 7094; Grant and Sackman, SDC Q-32 Time-Sharing; Schatzoff, Tsao and
Wiig, MIT Time-Sharing, IBM 7094; Smith, Burroughs B-5500 batch system at
Stanford with "instant" turnaround.
Batch facility: Erikson, same facility, simulated offline conditions; Gold,
MIT batch facility, IBM 7094; Grant and Sackman, same facility, simulation
of offline conditions; Schatzoff, Tsao and Wiig, IBM 7094 scientific batch
facility; Smith, same facility, with normal turnaround.
Language used: Erikson, TINT, an interpretive higher-order language for
time-sharing; Gold, DYNAMO, a simulation language used in time-sharing and
batch modes; Grant and Sackman, JTS (higher-order language) and SCAMP
(machine language); Schatzoff, Tsao and Wiig, not mentioned; Smith,
Burroughs Extended ALGOL.
Batch turnaround time: Erikson, usually several minutes, variable; Gold, 6
hours (daytime), 10 hours (overnight), variable; Grant and Sackman, constant
at two hours; Schatzoff, Tsao and Wiig, not mentioned; Smith, variable,
usually hours.

EXPERIMENTAL PROCEDURE
Experimental design: Erikson, 2x2 Latin Square, two problems vs. on/off
comparison; Gold, two matched groups of subjects; Grant and Sackman, 2x2
Latin Square, two problems vs. on/off comparison; Schatzoff, Tsao and Wiig,
Graeco-Latin Square, 4 problems, 4 subjects, vs. on/off comparison; Smith,
matched groups of subjects, each subject taking two problems on "batch" and
two on "instant" turnaround.
Statistical tests: Erikson, analysis of variance, some nonparametric tests;
Gold, a variety of nonparametric statistics comparing the two groups; Grant
and Sackman, analysis of variance, factor analysis; Schatzoff, Tsao and
Wiig, analysis of variance, correlational analysis; Smith, descriptive
statistical comparisons, no tests of statistical significance.
Experimental controls: Erikson, counterbalanced order of problems and
experimental variables; Gold, questionnaire items, deadline for completed
problem; Grant and Sackman, counterbalanced experimental order; Schatzoff,
Tsao and Wiig, counterbalanced order of experimental design; Smith,
biographical items, counterbalanced order of "batch" and "instant" modes.
Motivational controls: Erikson, trainee class grades; Gold, class grades;
Grant and Sackman, job assignment; Schatzoff, Tsao and Wiig, not mentioned;
Smith, class grades.
Recording procedures: Erikson, computer records and personal logs; Gold,
computer records, student logs and questionnaires; Grant and Sackman,
computer recording, work logs, paper and pencil test; Schatzoff, Tsao and
Wiig, computer records, experimenter logs, paper and pencil test; Smith,
computer recording, student logs and questionnaire.

KEY PERFORMANCE MEASURES
Erikson: debug man-hours, computer time, program size, individual
differences, Basic Programing Knowledge Test scores.
Gold: problem-solving man-hours, computer time, task performance, ratings of
written reports, cost comparisons, questionnaire items.
Grant and Sackman: debug man-hours, coding man-hours, computer time, program
size, program running time, individual differences, Basic Programing
Knowledge Test scores.
Schatzoff, Tsao and Wiig: elapsed time, analysis time, programer's time,
computer time, number of compilations, total cost.
Smith: initial program preparation, keypunch time, time to prepare new run,
number of runs, computer runs per trip, elapsed time, computer time,
submission intervals, questionnaire items.

TABLE I - Comparative characteristics of five experimental studies
comparing time-sharing with batch processing

For example, Erikson's subjects were trained primarily in online programing, whereas Schatzoff, Tsao
and Wiig indicated that their subjects had most of
their previous experience in batch systems and used
batch-oriented procedures in the experimental time-sharing mode. The other three studies had subjects
with various degrees of mixed online and offline
experience. Obtaining equal familiarity and equal
skill in online and offline activities is a difficult kind
of experimental control. An antidote to this problem,
only partially encountered in these studies, is to
deliberately select subjects on the basis of equal
experience and to offer them extensive and equal
practice sessions in both modes up to some standard
level of proficiency.
The problems cover a fairly wide area of programming and problem solving. They include mathematical
problems, various puzzles, sorting procedures, and a
simulation model. While many of these are typical of
program tasks, they can hardly be put forth as representative. For example, there are no large data
base or statistical analysis problems - the kinds requiring large data storage and much computation,
which often lend themselves more efficiently to
batch processing. On the other hand, neither were
there any particularly long, exploratory programs,
such as those encountered in graphics and display-centered systems, that lend themselves more efficiently to
time-sharing. All problems were individual rather than
team-oriented tasks. Perhaps most basic of all,
there are no empirical norms available to determine
the representativeness of the various data processing
tasks.
The difficulty level of most studies varies from
"conceptually simple" to "moderately difficult."
There were no reported cases of subjects who were
unable to complete the experimental tasks even
though some studies indicated missing data. The
average time for subjects to complete their experimental tasks varied from a few hours up to 60 hours.
The longer problems give some idea of the manpower costs of conducting this kind of research and
underscore the general tendency to use students
or trainees.
The problem posed by Gold for his student subjects
differed from the other four studies in that it was not
a programming task. The experimental vehicle was a
computerized simulation model of the construction
industry and its market; the student's task was to
formulate and construct a set of decision rules to
maximize his profits as an independent, small-scale
builder in this simulated, cyclical market. The computerized simulation model provided criterion performance scores which constituted feedback for the
students by indicating their profit level in response
to decision rule inputs for this open-ended problem.
The online/offline facilities reveal key dilemmas
faced by the experimenters in attempting to construct
unbiased and equal conditions for an objective
comparison between time-sharing and batch processing. In the two SDC studies, time-sharing was real
and batch processing had to be simulated on the
Q-32 Time-Sharing System. In Smith's study, the
basic system was batch and time-sharing was simulated by providing "instant" turnaround time (several
minutes); there were no conversational or interactive
features in this simulated online condition. While
Smith's study is primarily a comparison between
conventional batch and fast-turnaround batch, it is
included here because of the useful information
it contributes to timing and feedback aspects of the
time-sharing/batch controversy. The two MIT studies
were the only ones offering ostensibly comparable
online and offline modes without resorting to some
form of simulation.

The computer language employed is another
difficult control variable. Gold and Smith were able
to have their subjects use the same language which,
they claimed, was equally applicable and useful for
both modes. Erikson used TINT, an interactive
language, for the noninteractive mode. In the Grant-Sackman study, most subjects used JTS, originally
a batch processing language, later adapted to time-sharing. Schatzoff, Tsao and Wiig do not mention
any languages at all; since they indicate that their
subjects used batch procedures in the time-sharing
mode, and that their subjects only had a brief indoctrination in time-sharing, one cannot help but wonder
whether their comparison provided reasonably
comparable starting conditions under both experimental modes. This same criticism applies, at least
in part, to the two SDC studies.
Experimental control problems are compounded
further with respect to turnaround time under the
batch mode. These turnaround times vary from
minutes, to hours, to next-day turnaround. Only
Grant and Sackman controlled this variable at a
constant value of two hours. While this procedure
provided rigorous experimental control over turnaround time, it was obviously unrealistic in not
providing variability in turnaround service. The
other investigators apparently left their subjects to
the vagaries of their particular operational batch
system without obtaining exact measures of turnaround time for each run. In addition, for all studies,
it is not clear whether subject waiting time during
batch turnaround was spent working on the problem,
or not working on the problem, and for some of the
studies, whether it was included or excluded in subject logs of man-hours spent on the experimental
task. Future studies in this area should incorporate
systematic variation and control of machine turnaround time, and careful recording of what the
subject does during this time. Lack of experimental
controls in this area unquestionably increases error
variance in performance measurement and decreases
the reliability of the final results.
The next category in Table I, experimental procedure, reveals a remarkable spectrum of experimental
designs for the five studies. The Graeco-Latin
Square configuration of the Schatzoff, Tsao and Wiig
study is the most sophisticated experimental design,
whereas the Smith study merely compared mean
scores of matched groups without any reported
measures of dispersion or any tests of statistical
significance. With a sample of four subjects, the
Schatzoff, Tsao and Wiig study had to have optimal
statistical efficiency to demonstrate reliable results,
whereas with Smith's sample of 127 subjects, observed mean differences are correspondingly more
reliable. Nevertheless, the absence of statistical
tests and neglect in reporting measures of dispersion
in the data from which statistical tests may be constructed, are to be deplored since these practices
reduce the cost-effective yield of an experiment,
leave quantitative results ambiguous, and deprive
the larger community of useful information on
individual differences.
The three experiments using Latin-Square designs
applied analysis of variance and correlational techniques to the findings, which not only provided
statistical tests for online/offline comparisons, but
also yielded valuable information on problem and
individual performance differences. The Grant-Sackman study was the only one which included an
exploratory factor analysis of subject performance.
Gold's tests were exclusively non-parametric, and
as in Smith's study, no quantitative findings on
individual differences were reported.
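As an aside for readers unfamiliar with such designs, a Graeco-Latin
arrangement superimposes two mutually orthogonal Latin squares so that every
treatment pairing occurs exactly once. The Python sketch below uses a
standard pair of orthogonal 4x4 squares; the row and column roles and the
labels P0-P3 (problems) and C0-C3 (conditions) are hypothetical and do not
reproduce the actual assignments of the Schatzoff, Tsao and Wiig experiment:

    from itertools import product

    # Two mutually orthogonal 4x4 Latin squares (a textbook XOR pair).
    # Read rows as subjects and columns as sessions: the first square
    # assigns problems, the second assigns conditions, so each subject
    # sees every problem once and every (problem, condition) pairing
    # occurs exactly once over the whole design.
    problems   = [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]
    conditions = [[0, 1, 2, 3], [2, 3, 0, 1], [3, 2, 1, 0], [1, 0, 3, 2]]

    pairs = {(problems[r][c], conditions[r][c])
             for r, c in product(range(4), repeat=2)}
    assert len(pairs) == 16  # orthogonality: all 16 pairings appear once

    for r in range(4):
        row = [f"P{problems[r][c]}/C{conditions[r][c]}" for c in range(4)]
        print(f"subject {r}:", "  ".join(row))

The statistical efficiency noted above comes from exactly this balance:
with only four subjects, row, column, and treatment comparisons are all
fully counterbalanced.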
The experimental controls included matching of
groups in the studies with the largest samples (Gold
and Smith) primarily on the basis of questionnaire
items. The remaining three studies, using Latin-Square designs, involved stratified samples of subjects (e.g., experienced programmers, high-performance students, trainees) with random assignments of
subjects to the various test conditions in accordance
with the experimental design. Motivational controls
essentially consisted of class grades for students
and fulfillment of job assignments for the experienced
SDC programmers. Individual competition probably
spurred most subjects to work hard at their assigned
tasks and to keep most of their problem strategy and
tactics to themselves, at least in the three small
sample experiments. These motivational constraints
were probably less effective in the two experiments
with the larger subject samples.
The recording procedures characteristically included computer recording for machine usage, subject logs for man-hours spent on experimental tasks,
questionnaires for selecting and matching subjects
and for collecting observations and ratings on self-performance. Gold collected the most comprehensive
questionnaire data on his subjects before, during,
and after the experiment. Items included biographical
data, problem-solving behavior, and comparative
attitudes toward time-sharing and batch processing.
Paper and pencil tests of programmer ability were used
in three studies. Schatzoff, Tsao and Wiig selected
students who received a grade of A on the IBM Data
Processing Aptitude Test; in the two SDC studies,
the Basic Programming Knowledge Test (developed
at the University of Southern California) was administered to the subjects. Of the various recording
procedures, the computer records were probably the
most objective and the subject logs were the ones
most open to intentional and unintentional errors.
In the three studies with small samples, it was easier
to keep the subjects under surveillance, to monitor
their manual reporting procedures, and to tactfully
resolve discrepancies as they arose. In the two larger
sample studies, experimenter monitoring of individuals had to be more indirect. Neither Smith nor
Gold discusses possible errors or bias in student reporting procedures in any detail.
The last category in Table I covers the experimental
payoff, performance measures. The two key performance measures running through all five studies are
man-hours and computer time required to complete
experimental tasks. The computer time measure is the
most straightforward. Man-hour measures appear in
various forms and are partitioned in different ways.
For example, the two SDC studies distinguish coding
time from debugging time; Gold uses a single measure
of problem-solving time; the other two studies incorporate an overall measure of elapsed time with different ways of slicing man-hours spent on experimental
tasks. Cross-comparisons are somewhat difficult because measures are defined differently for different
contexts.
The three studies utilizing Latin-Square designs devote some attention to the analysis of individual performance differences. Although individual differences
were not originally a key objective of these studies,
there was an unavoidable serendipitous fallout of human differences from the analysis of variance in each
investigation. The study of individual differences was
carried furthest in the Grant-Sackman experiment
through an exploratory factor analysis of performance
measures.
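As a rough illustration of what such an exploratory analysis does, the
sketch below extracts principal components from synthetic performance
scores; principal components serve here as a simple stand-in for a rotated
factor solution, and all of the data are made up:

    import numpy as np

    rng = np.random.default_rng(0)
    # Hypothetical per-subject measures: two "speed" columns and two
    # "economy" columns, each pair generated from a shared latent
    # variable -- a toy stand-in for coding hours, debug hours,
    # program size, and running time.
    speed = rng.normal(size=(12, 1)) @ np.ones((1, 2)) + 0.3 * rng.normal(size=(12, 2))
    economy = rng.normal(size=(12, 1)) @ np.ones((1, 2)) + 0.3 * rng.normal(size=(12, 2))
    scores = np.hstack([speed, economy])

    corr = np.corrcoef(scores, rowvar=False)  # 4x4 correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)   # spectral decomposition
    order = np.argsort(eigvals)[::-1]
    loadings = eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))

    # With this structure two eigenvalues dominate, and the first two
    # columns of `loadings` separate the speed pair from the economy
    # pair -- the analogue of the two factors reported later in this
    # paper for the Grant-Sackman data.
    print(np.round(eigvals[order], 2))
    print(np.round(loadings[:, :2], 2))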
Questionnaires bearing on subject preference between online and offline operations were used in the
Gold and Smith studies. This performance measure,
while subject to the problems that plague questionnaire reliability and validity, is of special interest in the
time-sharing/batch controversy in providing an index
of user attitudes and in testing for a bandwagon effect.
The SDC studies used final program size and running time as measures of performance. It is surprising that these objective, easily obtainable, and obvious
measures of programing efficiency were not reported
in the other two studies requiring completed programs.
It would be of value to test whether programs are
written more efficiently, as measured by these two
indices, in the online or the offline mode.
The performance measures in two cases (the Gold
study and the Schatzoff, Tsao and Wiig study) include estimates comparing online/offline costs which
incorporate man and machine factors. In both cases
these costs were derived from experimental measures
of human and machine time which were used as empirical parameters in simple cost models.
The Gold study had some unique measures of performance. The most notable is task effectiveness:
how well the subject performed his task, as measured
by his profit in the simulated construction industry
model. Whereas the other studies measured effectiveness in terms of how long it took the subject to complete a standard task, Gold was also able to obtain a
quantitative measure of how well the subject performed (profit). Gold's study was also unique in obtaining written reports from each subject to assess
their mastery and grasp of the experimental task from
an independent source of (verbal) data. He was also
the only experimenter who required his subjects to
give a standardized account of their computer runs on
a run-by-run basis. These various measures enabled
Gold to obtain more diversified data than any of the
other studies on problem-solving and decision-making
activities in the online and offline setting.
In an attempt to explore the relation between paper
and pencil tests and performance on experimental
tasks, the two SDC studies incorporated scores on
the Basic Programming Knowledge Test in their analyses of individual differences. Since sample sizes
were small, and since validity correlations of successful paper-and-pencil tests of job performance are traditionally moderate to low, these tests, at best, were tentative probes.
Summing up, what are the chief methodological
characteristics, strengths, and weaknesses of these
five studies in regard to subjects, problems, computer
facilities, experimental procedure, and performance
measures? The subjects were primarily students or
trainees - experienced data processing personnel
were used in only one study. While the experimental
problems ranged over a broad area, involving many
types of data processing tasks and procedures and
requiring many hours for successful solution, certain
types of tasks prominently occurring in batch processing and in time-sharing are not encountered, and it is
difficult to assess how representative these problems
are for data-processing in general and how well they
are balanced for an objective online/offline comparison. Some of the toughest problems were met in providing comparable time-sharing/batch facilities;
matched computers and equivalent languages posed
many problems, and the crucial variable of batch turnaround time was generally not systematically controlled. The experimental procedures show diverse
levels of experimental sophistication, with the most
critical problems occurring in the observation and
measurement of human performance. Even at this
early stage, the range of performance measures is
impressive, covering a variety of man-machine indices; on the other hand, the paucity of automatically
collected measures of human and program performance, particularly in the online setting, is somewhat
disappointing. More powerful online techniques, such
as regenerative recording of user performance - a
technique for capturing the complete real-time interaction between the user and the computer so that it
can be played back in its entirety for later analysis
(Sackman, 1967)11 - should be developed and applied
to the experimental investigation of a broad spectrum
of user tasks.
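A toy version of such a recorder, in Python, might simply timestamp every
user and computer event so that a session can be replayed at its original
pacing; the class name and the sample events are invented, and the actual
SDC instrumentation is not described at this level in the paper:

    import json
    import time

    class SessionRecorder:
        # Log every user/computer event with an offset from session
        # start so the whole interaction can be replayed verbatim and
        # analyzed offline.
        def __init__(self):
            self.t0 = time.monotonic()
            self.events = []

        def record(self, source, text):
            self.events.append({"t": time.monotonic() - self.t0,
                                "source": source, "text": text})

        def replay(self):
            last = 0.0
            for e in self.events:
                time.sleep(e["t"] - last)  # reproduce the original pacing
                last = e["t"]
                print(f'[{e["source"]}] {e["text"]}')

    rec = SessionRecorder()
    rec.record("user", "LOAD ALGEBRA")
    rec.record("computer", "READY")
    rec.replay()
    print(json.dumps(rec.events))  # persist the session for later analysis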
Results of experimental studies

In this section the key results of each of the five
experiments are successively summarized in tabular
form and briefly evaluated; a composite box-score
of the results of all five experiments is also presented.
The next section, Interpretation, provides a cross-comparison and an overall evaluation of method and
findings. The tabular format for the results of each
experiment essentially consists of a list of key performance variables, with observed scores in the online
and offline mode, and obtained statistical significance
for the observed difference (providing such tests were
conducted); additional notable findings follow this list,
and each table concludes with a box-score listing what
the author believes to be the most significant results
of the given study.

Performance Measure                          Time-Shared Mode   Batch Mode      Statistical Significance
Debug Man-Hours                              5.0                9.6             .06*
Computer Time (sec.)                         146                492             .04*
Number of TINT Statements                    51                 53
Range of Individual Differences
  in Debug Man-Hours                         8:1 and 3:1        7:1 and 6:1
Range of Individual Differences
  in Computer Time                           5:1 and 4:1        4:1 and 3:1

*Nonparametric tests of mean differences in adjusted scores.

1. Time-sharing requires fewer man-hours and much less computer time for
   debugging with programer trainees than a simulated noninteractive mode
   when an interpretive language developed for time-sharing is used in
   both modes.
2. Individual differences in performance are larger than online/offline
   system differences.

TABLE II - Main results of the Erikson study

Performance Measure                          Time-Shared Mode   Batch Mode      Statistical Significance
Problem-Solving Man-Hours                    15.5               19.3            .04
Task Performance (profit)                    $1404              $1215           .05
Computer Time (min.)                         7.13*              1.25
Understanding of Problem
  (rating of written report)                 Higher             Lower           .08
Overall Cost                                 No appreciable difference
Subject Preference                           More desirable     Less desirable
Range of Individual Differences
  in Problem-Solving Man-Hours               4:1                7:1

*There is an additional editing load under time-sharing with DYNAMO that is
not present under the batch mode. Comparable adjusted figures are not
available.

1. Time-sharing requires fewer man-hours than batch processing in a problem-solving
   task with a sample of 60 students.
2. Time-sharing is accompanied by a higher level of effectiveness than
   batch processing.
3. Batch processing requires much less computer time than time-sharing for
   the given problem.
4. Time-sharing is strongly preferred by student subjects over batch
   processing, and this preference grows with increasing exposure to both
   modes.

TABLE III - Main results of the Gold study

Performance Measure                          Time-Shared Mode   Batch Mode      Statistical Significance
Debug Man-Hours                              19.3               31.2            .05*
Computer Time (sec.)                         747                548
Program Size (machine words)                 2339               2534
Program Run Time (sec.)                      3.7                3.7
Range of Individual Differences
  in Debug Man-Hours                         14:1 and 13:1      6:1 and 9:1
Range of Individual Differences
  in Computer Time                           3:1 and 11:1       7:1 and 8:1
Factor Analysis of
  Performance Measures                       Two factors: programing speed and program economy

*Analysis of variance on transformed scores.

1. Time-sharing requires fewer man-hours to debug programs for highly
   experienced programers than a simulated batch system with a two-hour
   turnaround time.
2. Computer time, program size, and program running time were not
   significantly influenced by batch versus time-sharing modes under the
   conditions of this experiment.
3. Individual performance differences in a highly experienced group of
   programers are considerably larger than observed system differences
   between time-sharing and batch processing.
4. An exploratory factor analysis of the experimental data revealed two
   basic programing skills: programing speed and program economy.

TABLE IV - Main results of the Grant-Sackman study

Performance Measure                          Time-Shared Mode   Batch Mode      Statistical Significance
Elapsed Time (days)                          29.5               46              .002
Analysis Time (min.)                         3059               2295            .001
Programer Time (min.)                        5672               2737            .001
Computer Time (min.)                         92                 101             .08
Compilations                                 118                49              .02
Total Cost (dollars)                         1579               1075            .05
Range of Individual Differences
  for above variables                        3:1 to 4:1

1. Students experienced in batch techniques and who are inexperienced in
   time-sharing techniques, and who essentially use batch procedures under
   both modes, use less of their own time and incur lower man-machine costs
   to prepare, code and debug programs under the batch mode than in time-sharing.
2. Time-sharing, even with subjects unfamiliar with its use, requires less
   total elapsed time than batch processing to prepare, code and debug
   programs.
3. In a select group of students, individual differences in performance
   are much larger than system differences between time-sharing and batch
   processing.
4. While time-sharing required more compilations than batch processing
   under the conditions of this experiment, there was no significant
   difference in the expenditure of computer time under both modes.
5. The above conclusions are contingent upon the type of programing
   languages used in both modes and the extent and variability of batch
   turnaround times - both of which were not reported in the original
   article.

TABLE V - Main results of the Schatzoff, Tsao and Wiig study

Performance Measure                          "Instant"* Mode    Batch* Mode
Initial Program Preparation (min.)           440                405
Time to Keypunch Original Program (min.)     109                108
Time to Prepare New Run (min.)               311                293
Number of Runs per Student                   7.1                6.6
Computer Runs per Trip                       2.5                1.9
Elapsed Time (days)                          3.0                3.7
Computer Time (average min. per run)         .277               .186
Intervals Between Successive Computer
  Runs (min.)
    First Quartile                           40                 210
    Median                                   205                450 (est.)
    Third Quartile                           not reported       not reported
Student Preference                           70%                24%

*No tests of statistical significance were reported.

1. Instant turnaround batch results in less elapsed time than conventional
   batch to prepare, code and debug programs for a relatively large sample
   of student users.
2. Instant turnaround results in heavier computer time expenditure than
   conventional batch processing.
3. Instant turnaround is preferred by substantially more students than
   conventional batch processing.
4. Instant turnaround is associated with changes in programing working
   patterns that are characterized by shorter intervals between successive
   job runs and earlier completion of the experimental task.
5. The above conclusions are contingent upon the variability of the data
   and derived tests of statistical significance, which were not reported
   in the published study.

TABLE VI - Main results of the Smith study


Interpretation

What are the consistent patterns, the ambiguities,
and the gaps in the findings of the five studies? Six
types of performance measures in the data are reviewed: subject time, computer time, system costs,
user preference, individual differences and special
measures. Composite results for the first four measures are shown in Table VII.
Study                        Man-Hours*             Computer Time         Costs           User Preference
Erikson                      Time-Sharing 1.9:1     Time-Sharing 3.4:1    Time-Sharing    Time-Sharing
Gold                         Time-Sharing 1.2:1     Batch 5.7:1           Approx. Same    Time-Sharing
Grant and Sackman            Time-Sharing 1.6:1     Batch 1.4:1           Approx. Same    Time-Sharing
Schatzoff, Tsao and Wiig     Batch 2.1:1            Time-Sharing 1.1:1    Batch 1.5:1     Not Reported
Smith**                      Instant 1.2:1          Batch 1.5:1           Approx. Same    Instant
Median for All Studies       Time-Sharing 1.2:1     Batch 1.4:1           Approx. Same    Time-Sharing Preferred

*The mode showing a reported advantage appears in each box together
with its favorable ratio; e.g., the first entry shows less man-hours for
time-sharing at a 1.9:1 ratio.
**"Instant" batch is treated in this table as a simulated version of
time-sharing.

TABLE VII - Composite Experimental Box-Score: Time-Sharing
Versus Batch Processing

Four out of five studies show time-sharing (or its
simulated equivalent) to result in less human time in
producing programs or solving problems than batch
processing (or its simulated equivalent). Only the
Schatzoff, Tsao and Wiig study shows a reverse trend,
and these authors admit to their subjects' use of batch
techniques under time-sharing. Further, these authors
do, in fact, show less elapsed time for completion of
experimental tasks under time-sharing. On the other
hand, the Erikson study, which shows the greatest relative performance advantage for time-sharing (almost
2:1 in trainee man-hours), was based on the use of an
interactive interpretive language in both modes, which
created a favorable bias for time-sharing. With these
qualifiers at both extremes in mind, it appears that
time-sharing does tend to require less elapsed time
and fewer man-hours to produce programs and solve
problems. The magnitude of this performance advantage is not very large - the median improvement for all
five studies is roughly 25 percent less human time
under time-sharing than in batch processing. No
claims are made for the meaning or the stability of
this or the other medians, but they do give a crude rule
of thumb for the pooled results of these five studies.
The comparative results on computer time show no
clear-cut trend. They range from a 6:1 ratio in favor
of batch processing in Gold's study (an admittedly
inflated ratio since computer times in the two modes
are not strictly comparable), to middle-of-the-road
ratios varying from 1.5:1 to 1.4:1 in favor of batch in
the next two studies (Smith, and Grant and Sackman),
to 1.1:1 in favor of time-sharing in the Schatzoff,
Tsao and Wiig study, to a 3:1 ratio in the same direction in Erikson's study. The conservative conclusion
is that computer time is highly sensitive to the unique
conditions of each experiment and that no consistent advantage seems to accrue to either mode as far as
the pooled data of these studies are concerned. On the
other hand, the median ratio is 1.4:1 in favor of batch
computer time, and perhaps this might serve as a
"best" estimate for the pooled data.
The combined results for human time and computer
time, assuming that the reported trends are reliable,
reinforce the hypothesis that in time-sharing the user
trades off computer time for his own time. That is, to
state the extreme case, rather than check out his program as thoroughly as he can at his desk, the time-sharing user is more likely to take a less-polished version or only a partially checked program to the computer for a trial run than his batch counterpart. Time-sharing critics will assail this practice by claiming that
the user develops careless and lazy work habits
through excessive reliance on extra computer runs;
time-sharing advocates will assert that such behavior
allows more intelligent exploration and testing of alternative solutions at a natural pace for the user when
and as problems arise. While there is probably some
truth to both positions (which are not mutually exclusive), it is hoped that future experimental analyses of
problem-solving stages in both modes will lead to
improved hypotheses in the dynamics of man-computer communication that will supersede these rather
crude stereotypes of user behavior under time-sharing and batch processing.
The data on system costs also show no definite
trend. While only two studies reported cost estimates,
the overall results indicate that one study shows
definitely less expense for time-sharing (Erikson:
less computer time and fewer man-hours), three
studies show roughly equal costs for both modes
(computer time and man-hour results in opposite cost
directions), and one (Schatzoff, Tsao and Wiig) shows
a 50 percent cost advantage for batch processing.
Here again the results are contingent upon unique experimental conditions.
The comparative results on user attitudes show a decided preference for time-sharing in Gold's study and
a strong preference for "instant" over conventional
batch in Smith's study. In the two SOC studies, although a formal poll was not taken, most subjects
apparently preferred time-sharing over the simulated omine conditions. Schatzoff, Tsao and Wiig do
not report any opinion, data. The avaiiabie, evidence,
such as it is, indicates that time-shari,ng and "instant"
batch (minutes of turnaround time) are preferred over
conventional batch (hours of turnaround time). There
are no data to indicate how time-sharing would fare
against fast-turnaround batch. While it is not at all
surprising that the subjects liked easy access to computers and fast computer response, it is nevertheless
desirable to demonstrate this experimentally. User
preference for the interactive conversational features
of time-sharing over and above the fast response of
instant batch is still a moot point.
Individual differences were investigated in those
three studies using analysis of variance techniques. In
each case, performance differences between subjects
were larger and overshadowed system differences between time-sharing and batch processing. The observed ranges were sometimes at an order of magnitude between best and poorest performers - even with
relatively stratified subject samples. Although no
measures of the dispersion of subject performance
were reported in the Gold and Smith studies, it is
hoped that such analyses will be forthcoming since
these two studies have the largest user samples. Except for the Grant-Sackman exploratory factor analysis of individual performance differences, no systematic analysis of human differences was attempted. This
factor analysis resulted in two well-defined and essentially independent factors - one concerned with
programming speed (low coding and debugging time,
and low computer time) and the other with program
economy (smaller and faster running programs). While
the entire area of individual differences in man-computer communication, from economic, system performance and humanistic points of view, is probably
more important than operating system differences,
nevertheless, little has been done and virtually nothing
is known about such individual differences.
Gold's study is the only one that attempted to assess
how well the experimental task was done and how well
it was understood. He found that the time-sharing
group made a significantly larger profit in the simulated construction industry market and that they also
understood the problem better than the batch processing group, at least as determined by independent ratings of written reports from both groups. These findings, but just for this one study, support the contention that time-sharing leads to a higher-quality end
product than conventional batch.
The distribution of successive computer runs in
Smith's study shows interesting differences between
the instant and the conventional batch modes. Median
turnaround time for subjects to prepare their next run
is less than half as long in the instant mode. Problem-solving speed is apparently slower under conventional batch. As time-sharing adherents have often
pointed out, ready accessibility of computer services
lends itself to natural pacing in problem-solving tasks,
whereas the forced delays inherent in conventional
batch turnaround time tend to disrupt normal problem-solving patterns and inhibit spontaneous closure. Unfortunately, the intervals between successive computer runs under batch, and between successive console
sessions under time-sharing, were not reported in the
other studies; thus, the above hypothesis is still conjectural. Nor does this hypothesis bear upon the differences between instant batch versus interactive
time-sharing.
The two SDC studies, at least as far as program size
and running time are concerned, are neutral with respect to Gold's results in that no significant program
differences were found between time-sharing and simulated batch modes. It would be of interest if online/
offline experiments were conducted in which subjects
were instructed to write short and fast-running programs in addition to solving the experimental tasks.
Without such instructions subjects are likely to concentrate primarily on working solutions rather than on
operating costs of the finished product. Program size
and running time can be used to measure comparative
"quality" of final programs - a useful measure in
realtime computing systems, for example, where
space and running time are often at a premium.
What is the composite picture of experimental comparisons of time-sharing and batch systems, at least
as depicted by the available studies, and what are
the main gaps in this portrait? The rather blurred portrait that emerges seems to show that time-sharing
is more likely to get the job done faster, perhaps at
higher quality, at a working pace preferred by users.
Batch processing may, more often than not, require
less computer time, and perhaps at somewhat less
cost than time-sharing. Prior familiarity with batch
or time-sharing, and built-in individual or institutional
bias toward one or the other, especially if ,coupled to
computer system tools or languages built for one mode
rather than the other, could easily shift the balance in
the familiar direction. Overshadowing these system
differences are wide-ranging individual differences
which seem to account for most of the observed variance in performance.


Except for Gold's exploratory work on the quality
of the user's final product, virtually nothing has been
done on human creativity in the online/offline setting.
No studies have been performed on the distinctive
characteristics of conversational interaction in time-sharing and whether these characteristics offer any
advantage over fast batch systems. No work has been
done on a comparative error analysis of user performance between time-sharing and batch processing except for some preliminary tabulations listed by Smith
(1967).12 There are no detailed case histories on the
real time pattern of problem-solving - a kind of time-and-motion study of human decision making - that
occurs under online and offline conditions, away
from the computer as well as at the computer. Until
we understand the behavioral dynamics of man-computer communication we can hardly expect to understand the relative tradeoff between alternative modes
of data processing, including the comparison between
time-sharing and batch processing. It is not within
the scope of this paper to develop a systematic framework for comparative analyses of user performance;
this has been done elsewhere by the author.11 Suffice it to say that it is an encouraging sign of the times
that significant experimental attempts have been made
to obtain open scientific data on comparative man-computer systems, and that the application of computers to human affairs is becoming more a shared, applied science and less a secretive, crude, trial-and-error
technology.
REFERENCES
I F J CORBATO M MERWIN-DAGGETT R C DALEY
An experimental time-sharing system
Proceedings of the Spring Joint Computer Conference 1962
pp 335-355
2 J I SCHWARTZ E G COFFMAN C WEISSMAN
A general purpose time-sharing system

Proceedings of the Spring Joint Computer Conference 1964
vol 25 pp 397-311
3 C J SHAW
The JOSS system
Datamation vol 10 no II November 1964 pp 32-36
4WCLYNCH
Description of a high capacity, fast turnaround university
computer center
Proceedings of 22nd National Conference Association for
Computing Machinery Thompson Book Co 1967 Washington
D C pp 273-288
5 H SACKMAN
Experimental investigation of user performance in time-shared
computing systems: retrospect prospect and the public interest
SP-2846 System Development Corporation Santa Monica
California 5 May 1967
6 W J ERIKSON
A pilot study of interactive versus non interactive debugging
TM-3296 System Development Corporation Santa Monica
California 13 December 1966
7 M GOLD
Methodology for evaluating time-shared computer usage
Doctoral Dissertation Massachusetts Institute of Technology
Alfred P Sloan School of Management 1967
8 E E GRANT H SACKMAN
An exploratory investigation of programmer performance
under on-line and off-line conditions
SP-2581 System Development Corporation Santa Monica
California 2 September 1966
9 M SCHATZOFF R TSAO R WIIG
An experimental comparison of time sharing and batch processing
Communications of the ACM vol 10 no 5 May 1967 pp 261-265
10 LYLE B SMITH
A comparison of batch processing and instant turnaround
Communications of the ACM vol 10 no 8 August 1967 pp
495-500
11 H SACKMAN
Computers system science and evolving society
John Wiley and Sons Inc New York 1967
12 LYLE B SMITH
Part one: a comparison of batch processing and instant turnaround
Part two: a survey of most frequent syntax and execution-time
errors
Stanford Computation Center February 1967

Computer scheduling methods and their countermeasures
by EDWARD G. COFFMAN, JR.*
Princeton University
Princeton, New Jersey
and

LEONARD KLEINROCK**
University of California
Los Angeles, California

INTRODUCTION
The simultaneous demand for computer service by
members from a population of users generally results
in the formation of queues. These queues are controlled by some computer scheduling method which
chooses the order in which various users receive attention. The goal of this priority scheduling algorithm
is to provide the population of users with a high
grade of service (rapid response, resource availability, etc.), at the same time maintaining an acceptable throughput rate. The object of the present paper is to discuss most of the priority scheduling procedures that have been considered in the past few years, to discuss in a coherent way their effectiveness and weaknesses in terms of the performance measures mentioned above, to describe what the analysis of related queueing models has been able to provide in the way of design aids, and in this last respect, to point out certain unsolved problems. In addition we discuss the countermeasures which a customer might use in an attempt to defeat the scheduling algorithm by arranging his requests in such a way that he appears as a high priority user. To the extent that we can carry out such an undertaking, the single most important value of this consolidation of the results of analysis, experimentation, and experience will be in the potential reduction of the uncertainty connected with the design of a workable service discipline.
By a grade or class of service we mean the availability of certain resources (both software and hardware), a distribution of resource usage costs, and a
well-defined distribution of waiting or turn-around
times which applies to the customer's use of these resources. In multi-access, multiprogramming systems
throughput may conveniently be measured in terms of
computer operating efficiency defined roughly as the
percentage of time the computer spends in performing user or customer-directed tasks as compared with
the time spent in performing executive (overhead type)
tasks. We shall avoid trying to measure the programmers' or users' productivity in a multi-access environment as compared with productivity in the usually less
flexible but more efficient batch-processing environment. For discussions on this subject see References
1 and 2 and the bibliography of Reference 3.

*This research was supported in part by the Bell Telephone Laboratories, Murray Hill, New Jersey.
**This research was supported in part under Contract DAAB07-67-0540, United States Army Electronics Command, and also in part by the Advanced Research Projects Agency (SD-184).
With a somewhat different orientation some of these
topics have been covered elsewhere. In particular,
Coffman,4 Greenberger5 and in more detail Estrin
and Kleinrock6 have reviewed the many applications
of queueing theory to the analysis of multiprogramming systems. In addition, Estrin and Kleinrock have
surveyed simulation and empirical studies of such systems. Howeyer, in contrast to the purposes of the
present paper, the work cited above concentrates on
mathematical models and on service disciplines to
which mathematical modeling and analysis have been
to some extent successfully applied. We shall extend
this investigation to several priority disciplines not
yet analyzed and to others which more properly apply
to batch-processing environments. Furthermore, as
indicated earlier, the present treatment investigates on a qualitative basis the detailed interaction of the customer and the overall system with the service discipline.

Classification of priority disciplines
Before classifying priority disciplines consider the
following very general notion of a queueing system.
In Figure 1 we have shown a feedback queueing system consisting of a computer (service) facility, a queue or system of queues of unprocessed or incompletely processed jobs (or more generally, requests for service), a source of arrivals requesting service and a
feedback path from the computer to the system of
queues for partially processed jobs. The system will
be defined in any given instance by a description of
the arrival mechanism, the service required from the
computer, the nature of the computer facility, the service discipline according to which the selection of service requests from the system of users is determined,
and the conditions under which jobs are "fed back"
to the system of queues. In all of the service disciplines discussed in the next section we make the following assumptions: 1) the arrival mechanism is
such that if the arrival source is not empty it generates
new requests according to some probability distribution, 2) the service disciplines are such that the computer facility will never be idle if there exists a job
in the system ready to be executed.

Figure 1 - Feedback queueing system (arriving units enter the system of queues, receive service at the processor, and either depart or are fed back)

There are a variety of ways to classify priority service disciplines. Indeed, one point of view is expressed by saying that priorities may be bought (e.g., in disciplines where bribing is allowed1), earned (e.g., by a program demonstrating favorable characteristics in a time-sharing system), deserved (e.g., by a program exhibiting beforehand favorable characteristics), or a combination of the above. For our purposes we shall classify a given priority method according to the properties listed below.

A. Preemptive vs. non-preemptive disciplines

This characteristic generally determines how new arrivals are processed according to the given discipline. If a low priority unit is being serviced when a higher priority unit arrives (or comes into existence by virtue of a priority change of some unit already in the system) then a preemptive service discipline immediately interrupts the server, returns the lower priority unit to the queue (or simply ejects it from the system), and commences service on the higher priority unit. Note that non-preemptive disciplines involve preplanning in some sense. However, the extent of pre-planned "schedules" may vary widely.

B. Resume vs. restart
This characteristic determines how service is to
proceed on a previously interrupted (preempted) job
when it comes up for service again. With a resume
priority rule no service is lost due to interruption and
with a restart rule all service is lost. Assuming that
the costs of lost service are intolerable in the applications of concern to us we shall treat only resume rules
and systems in which such rules are feasible.
C. Source of priority information
Service disciplines may be classified according to
the information on which they base priority decisions. Such a list would be open-ended; however, the sources of the information may be considered to fall in one of three not necessarily disjoint environments: 1) the job environment whereby the information consists of the intrinsic properties of the jobs (e.g., running time, input/output requirements, storage requirements, etc.), 2) the (virtual) computer system environment (e.g., dynamic priorities may be based on the state of the system as embodied in the number of jobs or requests waiting, storage space available, location of jobs waiting, etc.) and 3) the programmers' or users' environment in which management may assign priorities according to urgency, importance of individual programmers, etc.

D. Time at which information becomes known

Classical service disciplines assume that the information on which priority decisions are based is
known beforehand. On the other hand, time-sharing
disciplines are a prime example of service disciplines
in which decisions are based on information (e.g., running time and paging behavior) which is obtained only
during the processing of service requests. Such information, of course, is used to establish priorities
based on the predicted service requirements of requests which at some time were interrupted and returned to the queue.
All of the priority scheduling methods to be discussed are applicable to the infinite input population
case (in which the number of possible customers is
unbounded) as well as to the finite population case
(in which a finite number of customers use the system) - see Reference 6.

Priorities based only on running times
The intent of systems using a so-called running-time priority discipline is that the shorter jobs should enjoy better service in terms of waiting times. The FCFS (first-come-first-served) system is commonly used as a standard of reference to evaluate the success of this intent. The first two algorithms below assume job running times are known at the time they arrive at the service point, while most of the remainder, which have arisen primarily in connection with
multiprogramming systems, assume that no indications of running times are known until after jobs have
been at least partially run.
A. The shortest-job-first (SJF) discipline
This is a non-preemptive priority rule whereby the
queue is inspected only after jobs are completely
processed (served) at which times that job in the queue
requiring the least running time is the next to receive
service to completion (and thus there are no cycled
arrivals). The SJF discipline has the obvious advantage of simplicity and the somewhat less obvious advantage that the mean customer waiting time in the
system is less than in any system (including the FCFS
system) not taking advantage of known running times.
However, it is clear that significantly long running
jobs suffer more in an SJF system than in an FCFS
system. Thus, the reduction in the first moment of the
waiting time comes at the cost of an increase in the
second moment (or variance). We discuss the SJF
discipline further in connection with the following discipline.
B. The preemptive shortest-job-first (PSJF) discipline
With this discipline the SJF priority rule is applied
whenever a job is completed as well as whenever there
is a new arrival. If a new arrival has, at its time of
arrival, a service requirement less than the remaining running time required by the job, if any, in service, then the latter job is cycled back to the single
queue and the computer given over to the new arrival.
The job returned to the queue is subsequently treated
as if its running time were that which remained when
it was interrupted; i.e., we have a preemptive, resume
discipline.
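The preemption test can be sketched in the same terms (again our own hypothetical code, reusing the SJFQueue above; the .remaining attribute stands for the service still required):

    def psjf_on_arrival(queue, in_service, new_job):
        """PSJF arrival step: preempt if the newcomer needs less service
        than what remains for the job currently being run."""
        if in_service is not None and new_job.remaining < in_service.remaining:
            # Cycle the interrupted job back, keyed on its remaining
            # running time only (preemptive-resume: no service is lost).
            queue.arrive(in_service, in_service.remaining)
            return new_job             # the new arrival takes the processor
        queue.arrive(new_job, new_job.remaining)
        return in_service              # no preemption occurs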
The PSJF discipline has the advantages over the
SJF and FCFS disciplines of further accentuating
the favoritism enjoyed by the short running jobs and
further reducing the average waiting time in the system. (Again, the variance will be increased.) Indeed,
it has been shown that the PSJF discipline is the optimum running-time priority discipline in these last two respects. This relationship between the SJF and PSJF disciplines is seen in part by observing that time-of-arrival receives some consideration in the SJF discipline but, because of the preemption property, none at all in the PSJF discipline.
The principal disadvantage of the PSJF discipline
in the computer application is the cost associated with
interrupting a job in progress, placing it into auxiliary (queue) storage, and loading the higher priority job
for execution. Although this swapping process with
auxiliary storage devices may not always be necessary, depending on the size of main storage, it may outweigh the advantage of PSJF scheduling over SJF
scheduling. Since it is usually difficult to expect advance knowledge of exact running times, it is encouraging to note that in a thorough study by Miller
and Schrage9 it is shown that even with partial indications of running times significant improvements in
mean flow time are possible at the expense of increases in the second moment.
It is obvious that the major effect of these disciplines on the programmer-users of such systems is
a salubrious one in that it causes them to produce faster, more efficient jobs. However, the reaction of a
user with a job ready to be submitted to an SJF or
PSJF system depends to some extent on what information regarding the state of the system is available
to the user. If the user can see only the queue length
(and this is usually available) then whether or not he
balks (refuses to join the queue) depends on how long
his job is, his knowledge of the distribution of the running times of jobs submitted to the system, and his assessment of his chances were he to decide to come back later. If indications of the running times of jobs
in the queue are known at arrival time then good estimates of waiting time are possible; thus, longer jobs
wanting fast service are more likely to balk. With
knowledge only of queue length, however, it would
appear that less balking would occur with the SJF
and PSJF systems than with the FCFS system. On
the other hand, reneging (leaving the queue after joining it and before being completely serviced) would be
more likely since long jobs are likely to progress rather slowly toward the service point.
The countermeasures available to the users of the
SJF system in attempts to defeat (or take advantage
of) the computer scheduling algorithm are rather obvious. Firstly, it is clearly advantageous to submit
as short a job as possible. The natural consequence
of this action suggests that a user partition his request
into a sequence of short independent requests. Secondly, unless special precautions and penalties are
provided, the users may purposefully lie about (i.e.,
underestimate) their required running time. Such
countermeasures lead to a situation in which the attempted discrimination among jobs becomes ineffective and preferred treatment is given to those users
displaying the use of clever and/or unethical tactics.
We will continue to observe this unfortunate result
in the other scheduling algorithms.
So far we have been discussing computer operating
disciplines as if there existed but one queue of jobs waiting in the central processor. This is not generally
true if we consider also the queue or queues of jobs
waiting for the use of input/output (I/O) devices. The
executive or supervisor system itself may be one of
the "jobs" in the I/O queues. In a general batch-processing or time-sharing system it is more realistic to
assume that jobs consist of phases or tasks whose requirements alternate between the use of the central
processor and the use of some I/O device. Then the
SJF policies may be defined just for the central processing time (as implied above) or by the sum of central processing and I/O time.
It is not our intention to discuss I/O scheduling
disciplines in any detail but it should be kept in mind
that a job completion in the central processor system
may simply mean that the given job has reached a
point where it requires an I/O process before continuing. Similarly, an arrival may mean a job returning
from the I/O system for more computing time. This
is not to say, however, that I/O and central processing scheduling are independent processes; indeed, it
may be necessary that one of the criteria for assigning priority (external or internal to the computer system) be which I/O devices are required and for how
much time. This is taken up again below.

C. The round-robin (RR) discipline

This well-known scheduling procedure was first
introduced in time-sharing systems as a means for
ensuring fast turn-around for short service requests
when it is assumed that running times are not known
in advance. As seen below the RR discipline falls
within a class of so-called quantum-controlled service disciplines in which the size (q) of the quantum
or basic time interval is to be considered as a design
parameter. In an RR system the service facility (computer) processes each job or service request for a maximum of q seconds; if the job's service is completed
during this quantum then it simply "leaves" the system (i.e., the waiting line and the central processor),
otherwise the job is cycled back to the end of the single queue to await another quantum of service. New
arrivals simply join the end of the queue.
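A minimal sketch of this cycling (ours, not the paper's; arrivals during execution and swap overhead are omitted for brevity):

    from collections import deque

    def round_robin(jobs, q):
        """jobs is a list of (name, service_time) pairs; q is the quantum.
        Returns the order in which jobs leave the system."""
        queue = deque(jobs)            # new arrivals would join the right end
        finished = []
        while queue:
            name, remaining = queue.popleft()
            if remaining <= q:         # service completed within the quantum
                finished.append(name)
            else:                      # cycled back to the end of the queue
                queue.append((name, remaining - q))
        return finished

    # round_robin([("long", 3), ("short", 1)], q=2) -> ["short", "long"]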
As can be seen, the use of running time as a means
of assigning priorities is implicit in the RR discipline, whereas it is explicit in the previous two disciplines. Running time priorities are assigned after
a job has been allocated a quantum of service - if the job requires additional service it suffers an immediate drop to the lowest (relative) priority and is sent to the
end of the queue. Furthermore, it is clear that the RR
policy uses both running time and time-of-arrival to
make (implicit) priority decisions. This latter dependence is seen by noting that all jobs arriving at any

time earlier than a given job will have been allocated
at least one more quantum of service when the given
job reaches the service point.
The extent to which the RR discipline maintains the
shortest-job-first policy (in an a posteriori fashion) depends on the quantum size. Clearly, if q is allowed to be infinite we have a FCFS system. On the other hand as q approaches zero we have in the limit a so-called processor-sharing system10 in which the part of the processor not devoted to executive or overhead functions is "divided up" equally among the jobs currently in the system. In short, we have a system with no waiting line wherein it is possible to execute all jobs simultaneously but at a rate reduction for an individual job which is proportional to the number in the system. (The computer systems with multiple program counters approximate to some extent the RR behavior with q very close to zero.) Despite the better
service for short jobs as q decreases, questions of efficiency generally dictate against quantum sizes too
small in conventional computer systems. We elaborate
on this shortly.
The extent of discrimination by the RR discipline
in favor of short jobs also depends on the distribution of job running times. In particular, the RR discipline clearly does not take advantage of any knowledge gained by the quantum-execution of a job beyond the fact that the job simply requires more. For example, the distribution of job running times may be such that any job requiring more than two quanta will with probability .95 require in excess of 10 quanta. In this
event the desire to favor shorter jobs would indicate
that all jobs having received two quanta should not
come to the service point for the third time until all
jobs in the system have received at least two quanta.
The distribution of job running times that is applicable to RR scheduling in this respect is the exponential
distribution, which also corresponds to the assumption that has been found analytically tractable in most
queueing theoretic studies of the RR discipline. This
arises from the so-called memoryless property of the
exponential distribution which means in our application that after executing a job for q seconds (with q
arbitrary) the distribution of the remaining time to
completion is always the same (and equal to the original distribution). Thus, it is the continued identical
uncertainty in job running times that constitutes the
primary job characteristic making the simple RR discipline desirable.
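To make the memoryless property concrete (our notation; the original states the property only in words): if the running time T is exponential with rate \mu, then for any completed service q and any further t,

    \Pr\{T > t + q \mid T > q\} = \frac{e^{-\mu(t+q)}}{e^{-\mu q}} = e^{-\mu t} = \Pr\{T > t\},

so after each quantum the remaining-time distribution is exactly the original one.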
It is immediately evident from the definition of the
RR discipline that the basic disadvantage consists of
the swapping (removing one job from and placing another job into service) necessary for jobs requiring
in excess of q seconds of service. Many approaches to the solution of this problem have been taken. For
example, increasing the size of main memory so that
many jobs may coexist there eliminates much of the
need for swapping. Of course, this is a limited and
possibly expensive solution. Also, overlapping the
swapping of one job with the execution of another in
systems with appropriate memory control and storage
capacity makes the swapping process a latent one so
that the suspension of the central processor for input/
output processes is reduced. Nevertheless. with mod-ern, large-scale mUltiprogramming systems, the swapping process remains a principal bottleneck to efficient operation with many users.
Several analytical studies of RR disciplines have
been carried out11,12,26 with the goal of determining, for a system defined by a given arrival process and job running time distribution, the interaction between system performance (efficiency, throughput, or waiting times) and the swap-time and quantum size parameters. As verified by experience, analysis has shown how performance deteriorates sharply when the quantum size for a given swap time and system loading is made lower than a certain minimum range of values (or alternatively when loading becomes too heavy for a given quantum size). The priority disciplines described in the next section illustrate techniques whereby this
excessive deterioration of service is avoided to some
extent.
As regards countermeasures, the mere reduction
of one's job length gains little. However, if a user were
to partition his job into many smaller jobs, then he
would achieve performance superior to that of a user with
an identical job which was left intact. Again, the clever
user wins. An interesting property of the q = 0 case
is worthy of note, namely, that all customers have
identical ratios of service time to mean time spent in
system!
D. The multiple-level feedback (FB) discipline

The FB discipline differs from the RR discipline in
a way which is analogous to the way in which the
PSJF rule differs from the SJF rule. In an FB system
a new arrival preempts (following the quantum, if
any, in progress) all jobs in the system and is allowed
to operate until it has received at least as many quanta
as that job(s) which has received the least number of
quanta up to the time of the new arrival. Alternatively, the FB system may be viewed as consisting of multiple queue-levels numbered 1, 2, 3, ... with new arrivals put in queue-level 1, jobs having received 1 quantum and requiring more in queue-level 2, etc.
After each quantum-service the next job to be operated will be the one at the service point of the lowest
numbered, non-empty queue-level.
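The queue-level bookkeeping can be sketched as follows (our hypothetical code; jobs carry a .remaining count of quanta still required):

    from collections import deque

    class FBQueues:
        """Multiple-level feedback: queue-level i+1 holds jobs that
        have received exactly i quanta of service so far."""
        def __init__(self):
            self.levels = [deque()]            # levels[0] is queue-level 1

        def arrive(self, job):
            self.levels[0].append(job)         # new arrivals enter level 1

        def next_quantum(self):
            # Serve one quantum to the job at the head of the lowest
            # numbered, non-empty queue-level.
            for i, level in enumerate(self.levels):
                if level:
                    job = level.popleft()
                    job.remaining -= 1
                    if job.remaining > 0:      # needs more: drop one level
                        if i + 1 == len(self.levels):
                            self.levels.append(deque())
                        self.levels[i + 1].append(job)
                    return job
            return None                        # system is empty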


Once again, shorter jobs receive better service at
the expense of the longer jobs, and large jobs are not
allowed to interfere or delay excessively the execution
of small jobs. However, the mean flow time is the
same in the RR and FB systems. (In this regard, we
note the existence of a conservation law13 which gives the constraint under which one may trade the speed of response among a population of users.) The choice between the RR and FB priority disciplines
is determined basically by how much one wants to
favor short jobs, for the basic algorithms involve the
same amount of swapping. It is true, however, that the FB discipline is somewhat more costly to implement
in the sense that indicators must be used to keep
track of the amount of service received by each job.
In the FB system, the user's countermeasure is
again to partition his work into many smaller jobs each
requiring a small number of quanta (one quantum each,
optimally).
Observe that the RR, FB, and FCFS disciplines
may be combined in a variety of useful ways. Two
combinations that have been used are described below.
E. The two-level FB or limited RR priority discipline
Figure 1 - Present system (Model 33 ASR Teletype: base cost $42 + $50; data set (also required); paper tape reader and punch; full/half duplex switch; upright/inverted switch; escape key; blanked zero)

This is broken down in Table II. Although this system
works moderately well, it has several significant drawbacks. The first is that it is somewhat expensive and
the telephone company expects to increase the rates
on some of the items. Second, it places a considerable
load on the in-house exchange which at the same time
must absorb increases in normal telephone use. Third,
it lacks flexibility with regard to connection to other
communication networks such as ARS or Western
Union Telex service.

TABLE II
Monthly Cost For Present System
(See Figure 1)

Equipment                 No.   Unit Cost   Total
Teletypes1                40    $100        $4,000
Data sets                 64    25          1,600
Central exchange lines    36    17.50       630
701 exchange lines        46    2.85        131
                                Total       $6,361

1This is an average price for a mixture of Model 33's and Model 35's with a variety of options.


Some other alternatives for obtaining facilities

For communication with our SDS-940 we require:
(1) terminals, (2) transmission facilities, and (3) a

"line-concentrator." The need for the first two is
clear. The need for the third is also clear when we
consider that there are more Teletype terminals than
entry ports to the computer. For each of the three
facilities there are several choices of equipment and
often several choices of suppliers or methods of supply
for each piece of equipment. We first consider facilities supplied completely by the Telephone Company
and then consider composite facilities supplied from
any combination of sources. This separation is a
logical one since the Company insists on supplying
either complete systems or just leased private lines.
Complete systems available from the
telephone company

The Telephone Company has proposed the system
shown in Figure 2. The costs for this system are
shown in Table III. Clearly this system offers little
savings over our current system and in fact, on the
basis of the cost and trouble for converting, is not a
desirable change. The actual cost will probably be
greater than shown in the table. This is a result of
the Telephone Company policy of requiring 5-year
leases on some switching equipment. The system
proposed has a $250-per-month termination penalty
for the unexpired part of a 5-year lease. In view of
the rapid developments in time-shared computing it
is unlikely that any system designed now will meet
our needs for the next 5 years. We would therefore
incur a considerable termination penalty and the actual
cost of the system in Figure 2 will probably be higher
than that for the current system.
TABLE III
Monthly Costs For Telephone Company Isolated Switching System
(See Figure 2)

Equipment           No.   Unit Cost   Total
Teletypes1          40    $100        $4,000
Data sets           64    25          1,600
Lines:
  Radio Bldg.       28    1.50        42
  Main Campus       10    4.00        40
  30th St. Bldgs.   2     6.00        12
Computer Ports2     24    15          360
                          Total       $6,054

1This is an average price for a mixture of Model 33's and Model 35's with a variety of options.
2The cost of the concentrator is included in this item. There is a termination charge of $250/mo. for the unexpired part of a five-year lease.

Figure 2 - Telephone Company isolated switching system

Another proposed arrangement would use only our internal telephone system (701). In studying Table II one quickly notices that lines connected through the central exchange are far more expensive than lines connected through the 701. We have the present split arrangement because the 701 does not have enough capacity to handle the whole time-sharing load. Naturally one question is whether the capability of the 701 can be increased and the cost of doing it. This can be done, and Figure 3 shows this arrangement. The costs are shown in Table IV. Although this system does not offer significant savings it is clearly preferable to the system of Figure 2. There is still a termination penalty for the extra switching equipment required. However, it is probable that this equipment would simply be kept and used to meet increased demands for normal telephone service and no penalty would be incurred.

Figure 3 - Expanded 701 system

Composite systems

Terminals and data sets - Terminals could in theory be any one of several devices. However from the viewpoint of cost, availability and compatibility with the current system, Model 33 or 35 Teletypes are the
best choice. These can be either leased or purchased.
In principle terminals can be leased from the Telephone Company or Western Union. However the Telephone Company will lease Teletype units only in conjunction with other equipment and Western Union will lease only Model 35 Teletypes. If terminals are purchased, we would favor the Model 33
over the Model 35 because of a 1:4 cost ratio. If terminals are purchased, we must also arrange for their maintenance. At present we know of four possibilities: maintenance by ESSA, GSA, Western Union, and RCA Service Company. Initial estimates are that maintenance would cost about $25 per month
per terminal.

TABLE IV
Monthly Cost For Expanded 701 System
(See Figure 3)

Equipment                No.   Unit Cost   Total
Teletypes1               40    $100        $4,000
Data sets                64    25          1,600
701 exchange lines       64    2.85        182
Increase 701 capacity2   35    3.95        138
                               Total       $5,920

1This is an average price for a mixture of Model 33's and Model 35's with a variety of options.
2There is a termination charge of 1/2 the cost for the remaining part of a 5-year lease.

We do not yet have final estimates of maintenance
costs or of life expectancy for the terminals. Based
on information obtained so far and assuming a useful
life of 2 years for Model 33 equipment, a purchased
Model 33 KSR Teletype would cost approximately
$20 per month plus maintenance or $45 per month.
The Telephone Company lease cost is $42 per month.
Western Union has not yet filed a tariff on Model
33 Teletypes. They have offered to lease Model 35
Teletypes to us at $70 to $95 per month including
maintenance.
Any of the terminals discussed above will require
some form of data set or modem. In examining the
various arrangements available from the Telephone
Company we find that they all include a substantial
number of data sets which must be leased at $25 per
month each. In many cases there is no technical need
for an elaborate data set and this is therefore an area
where substantial saving can be effected. For example, when private lines are used, suitable modems can
be purchased for less than $100 and Western Union
leases modems for $13.75 per month. The Telephone

27

Company charges $25 per month fof' a modem to connect Model 33 and 35 Teletypes to private lines.
This makes the rental of Teletypes from them more
costly than any of the other alternatives.
Another type of modem that is readily available
is the acoustic or magnetic telephone coupler. These
are available from a number of sources at purchase
prices as low as $250. We are already using a number
of them and find that they are adequate for most of
our needs. The use of these telephone couplers permits most telephones to be used as entry points to
the computer.
Transmission facilities - Off our main campus and perhaps outside of the building where the SDS-940 is installed there is no choice of lines other than from the Telephone Company. There are two types of lines available - type 1000 for DC and low frequency signalling and type 3000 for voice frequency signalling. Either type has an approximate monthly lease cost of $1.50 per half-duplex circuit within the same building and $1.00 per quarter mile per half-duplex circuit outside the building, with a $4.00 minimum per line. Such leased lines are the only answer for off-campus buildings. For terminals on the main campus we have considered the installation of our own wiring; however, we have not yet obtained any cost estimates for doing this. An additional possibility for remote locations where we expect to have a significant number of terminals is to place a line concentrator there to reduce the number of lines to the computer. Lines within the main building could then be provided either by us or the Telephone Company.
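The quoted rates fold into a small costing routine for comparing layouts (our sketch; the quarter-mile reading of the garbled tariff figure, and the assumption that the $4.00 minimum applies to circuits leaving the building, are ours):

    def monthly_line_cost(quarter_miles=0, same_building=False):
        """Approximate monthly lease for one half-duplex circuit,
        type 1000 or type 3000, from the rates quoted above."""
        if same_building:
            return 1.50                         # flat rate within one building
        return max(1.00 * quarter_miles, 4.00)  # $4.00 minimum per line

    # e.g., a circuit running out 6 quarter-miles: monthly_line_cost(6) -> 6.0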
A problem in using type 3000 lines is that they
transmit a.c. only (300-3000 cps); therefore tone
signalling must be used. Fortunately, as mentioned
earlier, suitable modems are readily available at low
cost. A problem in using type 1000 lines is that either
the SDS-940 system must be modified to accommodate them or a special interface must be provided
external to the computer. Therefore there is no opportunity to effect a major saving by using type 1000 lines
in place of type 3000 lines.
Switching facilities - The switching facility need
not be a general exchange. In particular only one-way
switching is required. Also no choice needs to be exercised by the originating terminals. Therefore a simple line concentrator is all that is required. (The Rand
Corporation has taken this approach for their JOSS
time-sharing system.) A suitable system can be obtained for approximately $6000. This would accommodate up to 200 terminals and 40 computer "ports."
Optimum systems
A comparison of the alternatives shows that a composite system will be less expensive than one obtained
completely from the Telephone Company. The two
best alternatives are a system u~ing telephone couplers
on the regular internal telephone system or a system
using private lines and a purchased line concentrator.
Figure 4 shows a system using telephone couplers.
It is similar to the system of Figure 3 but eliminates
the need for obtaining Teletypes and data sets from
the telephone company. Table V gives the projected
costs for this arrangement. We would save $1640
per month by converting to it.


Figure 4 - Telephone coupler - 701 system

Figure 5 - Purchased switching system
TABLE V
Equivalent Monthly Cost For Telephone Coupler - 701 System
(See Figure 4)

Equipment                No.   Unit Cost   Total
Teletypes1               40    $85         $3,400
Data sets                24    25          600
Couplers                 40    10          400
701 exchange lines       64    2.85        182
Increase 701 capacity2   35    3.95        138
                               Total       $4,720

1This is an average price for a mixture of Model 33's and Model 35's with a variety of options. We estimate that leasing from Western Union or purchase of Teletypes will average at least $15 per month less than Telephone Company rates.
2There is a termination charge of 1/2 the cost for the remaining part of a 5-year lease.

TABLE VI
Equivalent Monthly Cost For Purchased Switching System
(See Figure 5)

Equipment           No.   Unit Cost   Total
Teletypes1          40    $85         $3,400
Modems              64    3           192
Switching system2   1     250         250
Lines:
  Radio Bldg.       28    3           84
  Main Campus       10    8           80
  30th St. Bldgs.   2     12          24
                          Total       $4,030

1We estimate that leasing from Western Union or purchase of Teletypes will average at least $15.00 per month less than Telephone Company rates.
2Based on a two-year service life after installation.

Figure 5 shows the system using private lines and
a purchased line concentrator. The costs for this
system should not exceed those shown in Table VI
and will probably be somewhat lower. With it we
would expect to save $2330 per month over our present communication facilities. This arrangement has
the additional advantage that it is the one best able
to provide for connections to the time-sharing system
from ARS, Telex, the regular telephone system, etc.
Any system we get must be able to accommodate a
small number of such special inputs. One alternative is simply to dedicate some of the computer ports to these types of services. This is however an undesirable method since the number of ports is limited. With our own switcher (and probably with one leased from Western Union) we can connect these terminals to the exchange input lines as shown in Figure 4, thus providing access to outside and special transmission facilities without having to dedicate ports to particular kinds of inputs.

Tariffs

Tariffs are often mentioned by the common carriers
as a reason why this or that cannot be done. To a
considerable extent, tariffs are the carriers' catalogs
and price lists and simply reflect the business policies
of the carriers. They are not, although you may get
the impression they are, immutable laws writ in stone.
They are sometimes easily changed upon application
to the appropriate regulatory agency. Unfortunately,
the carriers' business policies are sometimes rather inflexible and may be difficult to get modified to accommodate new services properly. This has been
especially true in regard to connecting customer-owned equipment to the telephone system. However
the trend within the FCC is toward increased flexibility in regard to the use of these "foreign attachments," and this should make it much easier to use
composite systems than it has been in the past. Our
experience has been that it is useful to vigorously pursue our requests for new or unusual services even
though the initial reaction from a carrier is that it
cannot provide these services. In several such cases
service arrangements that we proposed were initially rejected by the carrier as illegal or unacceptable
for other reasons and were later found to be legal
and acceptable.
CONCLUSIONS
There are a number of alternatives for meeting the
communications needs of a time-shared computing
system. We have found that an amazing variety of
services can be obtained from the telephone company
and that there are often good alternatives to use of
telephone company equipment. In our case the investigation of these alternatives is leading toward substantial cost saving and improved service to our users.

The Baylor medical school teleprocessing
system-operational time-sharing on a
System/360 computer*
by WILLIAM F. HOBBS and ALLAN H. LEVY
Baylor University College of Medicine
Houston, Texas

and
JANE McBRIDE
IBM Corporation
Houston, Texas


INTRODUCTION
The Baylor Teleprocessing System (BTS) is designed
to operate as a time-sharing system. It accomplishes
the following functions:
1. It allows several jobs initiated from various
terminals to run concurrently with one batch job
stream.
2. It permits the use of high-level languages for
the construction of all programs, including those
designed for remote terminals.
3. It insulates the user program from changes in the
operating system by providing a set of macroinstructions and interface routines for input
and output over telecommunication lines.
4. It provides certain utility functions for the terminal user, including the ability to build, alter,
and retrieve data sets, and to communicate with
the machine operator and other terminal users.
5. It provides a means by which programs originally
written to run as batch jobs may be used from
a remote terminal.
6. It insulates user programs from hardware errors
originating during data transmission.
The system has been operational since July, 1967
on an IBM System/360 Model 50 with 256,000 positions of core storage. The terminals which the system
supports include a cathode ray tube-keyboard terminal (IBM 2260 Display Station), and 2 types of
typewriter terminals (IBM 2740 and IBM 1050).
*Supported in part by grant FR-259 from the National Institutes of Health, grant HM-509 from the Division of Hospital and Medical Facilities, United States Public Health Service, and grant RT-4
from the Bureau of Social and Rehabilitation Services, HEW.


Figure 1 - Operational configuration. The system serves both scientific users at Baylor University College of Medicine and the hospital data management system at the Texas Institute for Rehabilitation and Research (TIRR)

Both local CRT terminals (cable-connected to a control unit which is directly attached to the channel) and remote CRT terminals (connected by telephone lines) are supported by the software. See Figure 1 for the configuration of the Baylor machine.
The primary programming support under which BTS was developed is Operating System/360 - MFT (multiprogramming with a fixed number of tasks). OS/360-MFT, when it is loaded, divides core storage into a number of sections or partitions. These partitions are fixed in size until the Operating System is reloaded. OS/360-MFT must be utilized with at least two partitions if teleprocessing and batch jobs are to operate concurrently. Minor modifications are required to OS/360. These changes are incorporated into the standard systems generation procedures. Other programming support under OS/360 which is used by the teleprocessing system includes BTAM (Basic Telecommunications Access Method) and Basic Graphics Support.
If there is no batch job work to be done, one partition, for teleprocessing only, may be used. The teleprocessing monitor dynamically subdivides its partition as terminal jobs are requested until all available core is used. As soon as a terminal job ends, its core is freed and made available for another terminal user. Each job is storage protected (write protect only) so that it cannot alter or destroy any other job in the system. The system time-shares between the teleprocessing programs and the batch job stream if there is one. Time-slicing and control transfer during wait status are both utilized to accomplish time-sharing. Additional details are set forth in a later section.
The system provides extensive language interfacing

so that teleprocessing programs may be written in
Assembler Language, PL/1, FORTRAN or COBOL. Programs written in Assembler Language input and
output data to and from a terminal through macroinstructions. All high level language programs
access terminals through CALL statements.
Each terminal is assigned a unique 8-character symbolic name which identifies its location and terminal
type. This terminal name plus the time and date
comprise the message prefix which is added by the
system to every input message. A program can thus
explicitly identify the source of every input message
by inspection of the terminal name in the message
prefix.
Space for all data sets for all teleprocessing jobs
must be allocated at the time the teleprocessing system is loaded and initialized each day. Terminal users
cannot create additional data sets. They can, however,
read or alter any existing data sets that are defined for

their use at system initialization time. Definition of
files used in the system is handled by the Job Control
Language of OS/360.
It is advantageous, although not essential, to have
only checked-out programs running in the teleprocessing partition. BTS thus includes a teleprocessing simulator which allows any programmer to debug his teleprocessing program as a batch job. The simulator uses
the card reader and the printer to simulate terminal
input and output devices.

Terminal operation
The command language or set of control statements
is independent of the type of terminal in use. A control statement is recognized by its first two characters - "$$." The teleprocessing monitor responds
to the successful entry of any control statement with
an "OK."
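The recognition rule itself is trivial; in present-day terms it amounts to no more than the following (our sketch, of course, since BTS itself ran on the System/360):

    def is_control_statement(line):
        # Control statements are identified by their first two
        # characters; anything else is input for the running program.
        return line.startswith("$$")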

The command language consists of seven control
statements:
$$ACCOUNT
$$EXECUTE (or $$EXEC)
$$END
$$DDNAMES
$$EDIT
$$EOT
$$CONSOLE
A terminal user ordinarily would first wish to enter
a $$ACCOUNT statement as follows:
$$ACCOUNT (B123, B1234, JONES)
This statement records on disk the user's pertinent
accounting information: department number, project
number, name and time of sign on. This accounting
information must be recorded before a user is allowed
to execute a program. Accounting information, once
recorded, is not cleared until a user sends a $$EOT
control statement. At this time another accounting
record is written to clock off the user.
Once a user has established his accounting record,
he can then execute a program. Programs to be executed from terminals must be stored in one of several
teleprocessing libraries on disk. When a user wishes
to execute a program, he must tell the teleprocessing monitor the program name, the amount of core it
requires, and the program type. The program type
may be:
1) SINGLE-only one copy of the program may
be in core at any time. (Example - a file update
program)
2) COPY -the program handles only one terminal
but multiple copies may be in core and executing concurrently.
3) MULTIPLE-the program handles multiple
terminals so that only one copy ever need be in
core.

If no program type is specified, the monitor assumes type SINGLE. A user might call into execution the message switching program with the following control statement:
$$EXECUTE (SWITCH, 2000, MULTIPLE)
He is asking for the program named SWITCH which uses 2,000 bytes of core storage and handles multiple terminals. If 2,000 bytes of core are available, the program will be executed immediately and the user will receive a message from the program SWITCH with instructions on how to message-switch.
As another example, a user might enter the following $$EXECUTE statement:
$$EXEC (ELCOMP, 40000)
He is requesting the program named ELCOMP which uses 40,000 bytes of core and is type SINGLE. If 40,000 bytes of core are available, the program will be executed immediately; if not, the user will be informed that all core is in use, and he should try again later.
When a user wishes to end a program he is executing, he issues the following control statement:
$$END
This statement frees the core of the program he is
executing, but does not clear the accounting information of the user. Thus he is now free to execute any
program he wishes.
A terminal user may request the exclusive use of a
data set by means of the $$DDNAMES control statement as follows:
$$DDNAMES (DATAl, DATA2)
The teleprocessing monitor checks its list of data
set names previously entered with a $$DDNAMES
statement and notifies the user if DATA1 or DATA2
is already in use. If neither is already in use, both are
added to the list and the user receives his "OK"
response.
A terminal user can request special editing functions
by means of the $$EDIT control statement. Two
levels of editing are available. Level 1 editing merely
translates all character codes to internal EBCDIC
codes. Level 2 editing removes all typewriter control
and function key characters from the text of input messages. The system default is level 2 editing. A user
may also request that blanks be suppressed on incoming data and that small letters be translated to corresponding capital letters.
To end a session at the terminal, a user might enter
the following control statement:
$$EOT

If he is executing a program, the program will be terminated and its core will be freed. In any case, his accounting information will be "logged off" on disk. The monitor will respond, as usual, with an "OK" when the user is cleared.
A terminal user may communicate with the machine room operator by means of a $$CONSOLE statement. He might send the following statement:

$$CONSOLE (MY JOB WILL RUN 5 MIN
PAST SHUTDOWN TIME. MAY WE
DELAY SHUTDOWN?)
The entire message within the parentheses will be
printed on the console typewriter in the machine
room, together with a prefix which identifies the send-

34

Spring Joint Computer Conference, 1968

ing terminal. Thus the operator may send a reply back
to the terminal if necessary.
There are seven different control statements in the
command language of BTS. Frequently a user wishes
to specify special data set or editing information
and he must enter several statements. Therefore, the
use of catalogued control statements is as substantial an asset for the terminal user as OS/360 catalogued procedures are for the Job Control Language user. A disk data set (hereafter called the Control Statement Library) contains sets of control statements which may be called by the terminal user with
just one entry. For example, a set of control statements called SWITCH exists on the Control Statement Library. The set contains the following control
statements:
$$ACCOUNT (3999, X9999, ANYNAME)
$$EDIT (2,F)
$$EXEC (SWITCH, 2000, MULTIPLE)
A terminal user can invoke these control statements
(and, hence, execute the program named SWITCH)
by entering the following command:
$$SWITCH
The terminal user can also add, change, display, or
delete sets of control statements in the Control Statement Library. However, password protection is available (if wanted) so that a set of control statements may
be displayed and used but not altered.


Time-sharing the teleprocessing partition
One feature of BTS is its ability to handle multiple
teleprocessing jobs within one OS/360 partition. To
implement this capability requires two changes to
OS/360:
1) The task control block table in the nucleus must
be expanded to handle additional teleprocessing
task control blocks.
2) Two system termination (ABEND) modules
must be changed. Both of these changes can be
incorporated into the standard systems generation, using the source modules for the termination routines.
When OS/360 is loaded, the system is initialized
with at least one partition. Under OS/360-MFT one
task control block (TCB) is created for each partition
of core storage. The TCB controls the execution of
each successive job within the partition. See Figure 2
for a sample layout of core storage after OS/360-MFT has been initialized with two partitions.
BTS is the first job scheduled by the Operating System. The initialization routine builds a number, N, of teleprocessing task control blocks where N will
be the maximum number of teleprocessing jobs that

can execute concurrently. N is defined in the Teleprocessing System Tables and can be changed by reassembling that module. The teleprocessing task control blocks (TCB's) are added to the OS/360 TCB chain and given priorities below the teleprocessing monitor, while the priorities of the other lower Operating System partitions are shifted down. See Figure 3 for a sample layout of core storage after OS/360-MFT has been initialized with two partitions and BTS has been initialized for three concurrent teleprocessing jobs.

Figure 2 - Layout of core storage after initialization of OS/360-MFT for two partitions

Figure 3 - Layout of core storage after initialization of OS/360-MFT for two partitions and BTS for three concurrent teleprocessing jobs

The teleprocessing task control blocks wait in a
non-dispatchable mode until a terminal user requests a program. At that time, the monitor gets core for the requested program, completes other Operating System control blocks and puts the program into execution under the control of one of the teleprocessing
task control blocks. From this time until the teleprocessing program terminates, it is handled by OS/360
like any other program executing under a standard
task control block.
When the teleprocessing program terminates, BTS
frees all of the core used by the terminating program, and the task control block is made non-dispatchable, waiting to be re-used by another teleprocessing job.
The system time-shares using two methods:
1) Time-slicing with circular rotation of task control block priorities.
2) CPU control transfer when any task goes into
wait status.
Time-slicing is accomplished using an equal priority
algorithm. At the end of each specified time interval
(for example, 100 milliseconds) BTS takes the user
task with the highest priority and gives it the lowest
priority. The priorities of all other tasks on the chain
are incremented by one, thereby giving a new job the
highest user priority. When each time interval expires,
the circular chain of task priorities is re-adjusted. If
a batch job stream is in use, its priority is also rotated
on the circular chain.
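In outline, the end-of-interval re-adjustment looks like this (our illustrative code; BTS itself manipulated OS/360 task control block priorities in place):

    def rotate_priorities(tasks):
        """Equal-priority time-slicing step: the task holding the highest
        user priority drops to the lowest, and every other task on the
        circular chain moves up one notch."""
        tasks.sort(key=lambda t: t["priority"], reverse=True)
        values = [t["priority"] for t in tasks]   # priority values are reused
        rotated = tasks[1:] + tasks[:1]           # head of the chain goes last
        for task, value in zip(rotated, values):
            task["priority"] = value
        return rotated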
CPU control transfer when a task goes into wait
status is handled by the OS/360-MFT dispatcher.
Dispatching is attempted on a task priority basis,
searching down the chain until a task is found that
can utilize the CPU. If no task is found, the system
waits until an event (such as I/O) completes so that
some task can use the CPU.
The combination of the two methods described
above provides the advantages of concurrent execution of multiple jobs and attempted full CPU utilization.
Language interfacing to terminals

The large majority of teleprocessing programs
need to communicate directly with one or more terminals. In order to facilitate this communication, the
capability is provided to get a message from a terminal
(GETMSG), to put a message out to a terminal
(PUTMSG), and to break conversational mode with a
terminal (BREAK). Macro-instructions provide
these capabilities to the Assembler Language programmer. Interface routines allow the high level language programmer to make use of these functions via
a CALL statement.
For example, an Assembler Language program may
issue the following macro:
GETMSG DATAAREA, 96


The next message received from a terminal in conversational mode with this program will be placed in
DATAAREA. The maximum length of the message
will be 96 bytes which includes the message prefix of
15 bytes, 80 data bytes, and an end of block character (EOB). The program can then look at the 8-character symbolic terminal name in the message
prefix to find the source of the data.
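A receiving program's bookkeeping can be sketched as follows (ours; the placement of the time and date fields within the 15-byte prefix is an assumption):

    def parse_message(buffer):
        """Split a GETMSG buffer laid out as described above: a 15-byte
        prefix beginning with the 8-character symbolic terminal name,
        up to 80 data bytes, and a trailing end-of-block (EOB) byte."""
        prefix, body = buffer[:15], buffer[15:]
        terminal_name = prefix[:8].rstrip()   # identifies the source terminal
        data = body[:-1]                      # drop the trailing EOB character
        return terminal_name, data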
The same Assembler Language program may issue
the following macro:
PUTMSG ADDRESS=DATAAREA+15,
       LENGTH=80, DEST=BCOMPS60,
       PRIOR=0, LINE=12,
       START=1, ERASE=NO
The monitor will then send (with a priority of zero)
to the terminal named BCOMPS60 (Baylor Computing Science 2260) 80 characters of data beginning at
DATAAREA + 15. The 80 characters of data will
be written on Line 12 of the 2260 and a START MI
symbol will then be put in the first position of Line 1.
The 2260 screen will not be erased prior to the write.
If the terminal indicated by the parameter 'DEST'
were not a 2260, the last three parameters would be
ignored.
An Assembler Language program may also issue
the following macro:
BREAK BCOMPS60
The monitor will then break conversational mode
between the specified terminal and the program issuing the BREAK. If the specified terminal is the last
or only terminal in conversational mode with the program, the program will be terminated and its core
will be freed.
The same three teleprocessing capabilities (GETMSG, PUTMSG, and BREAK) are available to all
high level language programs written in PL/1, COBOL, or FORTRAN.
For example, a PL/1 program may request a message as follows:
CALL GETMSG (STRING, RETURN);
The next message received from a terminal in conversational mode with this program will be placed in
STRING (which is a character string variable of
varying length). The return code will be placed in
RETURN.
Likewise, a FORTRAN program may wish to send
a message:
CALL PUTMSG (ARRAY, LENGTH, DEST,
PRIOR, RETURN)
The data in ARRAY will be sent to the terminal specified by DEST, with the priority specified in PRIOR. The return code from the write will be placed in RETURN.
In the same manner, a COBOL program may break
conversational mode as follows:
CALL 'BREAK' USING TERM, RETURN.
Conversational mode will be broken between the program and the terminal specified by TERM. If the
specified terminal is the last or only terminal in conversational mode with the program, the program will
be terminated and its core will be freed.
With the language interfacing facilities of the Baylor
Teleprocessing System, no special training is required
for application programmers to make active use of the
system. They are free to use the language with which
they are familiar, and by doing so, they can access
any terminal on the system.
Terminal input/output

Two basic types of terminals are supported by BTS.
Local CRT terminals are connected to the central
processor by cables. All other terminals are connected
by means of telephone lines. Different programming
methods are used for these terminal types.
Because of the relatively high speed of local CRT
terminals, no queues of messages are maintained for
them. Read and write operations involving these
terminals are always carried out immediately when
they are requested by a problem program.
For all other types of terminals, two output queues
at different levels of priority are maintained for each
telecommunications line. Each PUTMSG to one of
these terminals causes a message to be placed in one
or the other queue according to its priority. These
queues are maintained both in core and on direct
access storage. A sufficient amount of text is kept in
core to insure that the communications lines are kept
active. Additional text is spilled to direct access
devices.

When a line becomes idle, the following method is used to determine what operation to carry out next: If any high priority messages are enqueued for transmission on the line, the next message (on a first-in, first-out basis) is dequeued and sent. If no high priority messages are awaiting transmission and some terminal on the line is using a problem program, the line is left idle until the program issues a GETMSG. At this time a read operation is begun on the line. If no terminal on the line is using a problem program, low priority messages (if any) are transmitted. When no low priority messages remain to be sent, a general polling operation is begun on the line. That is, each terminal in turn is interrogated to see if it wishes to send a message.
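The cascade can be stated compactly; a sketch in C, with a hypothetical line descriptor in place of BTS's actual control blocks:

enum line_action { SEND_HIGH, WAIT_FOR_GETMSG, READ_LINE,
                   SEND_LOW, GENERAL_POLL };

struct line {
    int high_queued;      /* high-priority messages awaiting output */
    int low_queued;       /* low-priority messages awaiting output */
    int in_conversation;  /* some terminal is using a problem program */
    int getmsg_pending;   /* that program has issued a GETMSG */
};

/* Choose the next operation for an idle line, in the order given above. */
enum line_action next_action(const struct line *ln)
{
    if (ln->high_queued)
        return SEND_HIGH;                 /* FIFO within the high queue */
    if (ln->in_conversation)
        return ln->getmsg_pending ? READ_LINE : WAIT_FOR_GETMSG;
    if (ln->low_queued)
        return SEND_LOW;
    return GENERAL_POLL;                  /* interrogate each terminal in turn */
}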
A dynamic buffering technique is used. This method
allows buffers to be assigned to a line from a common
pool of available buffers even while a channel input
operation is in progress. Thus, individual input messages may vary greatly in length without tying up
large amounts of storage to handle worst-case situations.
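A common-pool free list of this kind might be sketched as follows in C (the fixed buffer size and list discipline are assumptions; a real channel-driven system would also need interlocking against I/O interrupts):

#include <stdlib.h>

#define BUFSIZE 64              /* assumed fixed buffer length */

struct buffer {
    struct buffer *next;        /* free-list link */
    char data[BUFSIZE];
};

static struct buffer *free_list = NULL;

/* Grab a buffer for a line as input arrives; may be called while a
   channel input operation is still in progress. */
struct buffer *get_buffer(void)
{
    struct buffer *b = free_list;
    if (b)
        free_list = b->next;
    return b;                   /* NULL means the pool is exhausted */
}

/* Return a buffer to the common pool once its text has been consumed. */
void put_buffer(struct buffer *b)
{
    b->next = free_list;
    free_list = b;
}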
The Baylor Teleprocessing System has proven useful for the research environment in which it was developed. It is presently supporting several groups of scientific users, including those involved in mass spectroscopy, radiotherapy treatment planning, and gynecological cancer control. It is also the support system for a comprehensive computer-oriented hospital data management project at the Texas Institute for Rehabilitation and Research. With this operational load, it has been found that the size of core storage is the single most important limitation. We plan system support for low speed large core storage and for the IBM 2314 direct access device. It is felt that such additional machine capacity will remove present limitations and enhance the operational capabilities of the system.

Some techniques for shading machine renderings of solids
by ARTHUR APPEL
IBM Research Center
Yorktown Heights, N. Y.


INTRODUCTION
Some applications of computer graphics require a vivid illusion of reality. These include the spatial organization of machine parts, conceptual architectural design, simulation of mechanisms, and industrial design. There has been moderate success in the automatic generation of wire frame,1 cardboard model,2 polyhedra,3,4 and quadric surface5 line drawings. The capability of the machine to generate vivid stereographic pictures has been demonstrated.6
There are, however, considerable reasons for developing techniques by which line drawings of solids can be shaded, especially the enhancement of the sense of solidity and depth. Figures 1 and 2 illustrate the value of shading and shadow casting in spatial description. In the line drawing there is no clue as to the relative position of the flat plane and the sheet metal console. When shadows are rendered, it is clear that the plane is below and to the rear of the console, and the hollow nature of the sheet metal assembly is emphasized. Shading can specify the tone or color of a surface and the amount of light falling upon that surface from one or more light sources. Shadows, when sharply defined, tend to suggest another viewpoint and improve surface definition. When controlled, shading can also emphasize particular parts of the drawing. If techniques for the automatic determination of chiaroscuro with good resolution should prove to be competitive with line drawings, and this is a possibility, machine generated photographs might replace line drawings as the principal mode of graphical communication in engineering and architecture.
A picture strictly rendered in chiaroscuro defines the scene in a dark and light area pattern, colored or in tones of grey, and no lines are made. Rembrandt and Rubens were masters of chiaroscuro. In order to simulate the chiaroscuro of a photograph, many difficult problems need to be solved, such as the effect of illumination by direct and diffuse lighting, atmospheric diffusion, back reflection, the effect of surface texture, tonal specification, and the transparency of surfaces. At present, there is the additional problem of hardware for display of the calculated picture. Devices presently available use lines or points as the principal pictorial element and are not comparable to oil paint, or wash, or crayon in the ability to render the subtle changes in tone or color across an area. The best we can hope to do is to simulate the half-tone process of printing.
Figure 1 - A machine generated line drawing of an electrical console and an arbitrary plane in space

This paper presents some recent experimental
results in the automatic shading of line drawings.
The purpose of these experiments was to generate
pictures of objects consisting of flat surfaces on a
digital plotter and to evaluate the cost of generating such pictures and the resultant graphical
quality.

Figure 2 - A shaded line drawing of the scene in Figure 1

Previous work

Considerable work has been done in the digitizing of photographs.7,8 Especially successful are the pictures transmitted from spacecraft.9 The significance of this work is the demonstration of the quality of digitally generated pictures.
L. G. Roberts has accomplished the converse of the problem being discussed in this paper by developing techniques for the machine perception of solids which are assemblies of convex polyhedra modules.10
His work suggests the possibility that it may be
more useful to analyze the contents of a photograph
and to create a mathematical model of the scene.
This analysis can be used to generate any view of
the scene with greater graphical control.
G. Lasher has written a program which can be used to generate three dimensional graphs of mathematical functions which are unique for values of X and Y. This program, which was used to illustrate an article in theoretical physics,11 generates contour curves of the surface and constant coordinate curves, and renders only those curve segments that are visible in a perspective projection. A shading effect occurs in these pictures because the projections of the surface rulings tend to concentrate as the surface becomes tangent to the line of sight. This effect contributes significantly to the vividness of the renderings.

J. L. Pfaltz and A. Rosenfeld have applied their notions on encoding plane regions to shading two dimensional maps.12 Their notion of skeleton representation is that a region can be specified by a list of points on a plane and a radius; all parts of the plane within the radius of a point are within the region described. For shading, a set of parallel straight lines is generated and those portions of the lines which lie within the region are rendered. The angle and spacing of the set of parallel lines can be varied and other textures can be generated.
A vivid automatically shaded picture of a polyhedron was generated by a subroutine written by B. Archer for an article by A. J. Cole.13 The shading is accomplished by varying the spacing of parallel lines. The spacing of lines on a particular surface is proportional to the illumination. No attempt was made to determine the shadow cast by the polyhedron, and the methods described are inadequate for drawing more than one convex polyhedron at a time.
Recently, C. Wylie, G. Romney, D. Evans, and A. Erdahl14 published an algorithm for generation of half-tone pictures of objects described by assemblies of triangular bounded planes. Their results are toned pictures generated with calculation time competitive with line drawings. However, their scheme sets the source of illumination at the viewpoint, and since a point light source cannot see the shadows it casts, no shadows are rendered.
Previous work in the automatic determination of
chiaroscuro demonstrates how the computer can
improve the level of graphics designers can work with.
The primary limitation has been neglect of shadow
casting from arbitrarily located light sources. Also
no work has been done on the control of the toned
picture to take into account the surface tone or color
of an object. Any system for rendering in chiaroscuro
should solve economically at least these two problems.
It can be seen from previous results that toning an area by varying the spacing of parallel lines is not entirely satisfactory. This technique is economic but has several disadvantages. The lines when widely spaced do not fuse to form a continuous tone. The viewer does not then perceive the object but is distracted by the two dimensional pattern. Depth perception is reduced. These lines also tend to suggest a surface finish which may not exist. A good standard for evaluating toning mechanics can be the ben-day pattern used in printing. This pattern enables a great range of dark and light with good tonal fusion. The half-tone process uses many small dots arranged in a regular array. The size of the dots is varied to create a degree of grayness: small dots where white predominates are light, and as the dots increase in size such that they eventually blend together, the toned region becomes darker. In order for the dot pattern not to be distracting, the dot spacing should be at least seventy dots to the inch. The large dot density required for toning indicates that calculation schemes for toning should be as resolution independent as possible. For an algorithm to be resolution independent it must enable perfect resolution. It may not be possible for contemporary hardware to take advantage of such an algorithm, but this should be the goal.
Toning on a digital plotter
A great many experiments were conducted to evaluate the quality of various toning techniques that would be applicable to digital plotting. A simulation technique tested was to shoot random light rays from the light source at the scene and project a symbol from the piercing point on the first surface the light ray pierced. These symbols would concentrate in regions of high light intensity, and a negative print of the hard copy could be made which would approximate a photograph if enough light rays are generated. Even for about 1000 light rays results were splotchy. Generating light rays in regular patterns improved the graphic quality but did not allow economic tonal control. During these experiments, various symbols were evaluated for graphic quality and speed of plotter generation. The plus sign or a small square were found to give best results. Eventually the best technique from graphic and economic considerations for toning was found to be plus signs arranged in staggered rows with shadows outlined. This arrangement is most easily seen in Figure 2. The size of plus signs was rendered proportional to the darkness required.
Ignoring atmospheric diffusion, the intensity of light incident upon a unit plane area from a point light source is

I = S (Cosine L)/D²

where S is the intensity of the light source, L is the angle between the normal to the plane and the direction of light at the illuminated area, and D is the distance from the illuminated point to the light source. For experimental purposes it was assumed that the light source is so far from the objects being illuminated that variations in L and D are insignificant. L need be calculated only once for each surface. Also, since we are interested in simulating illumination only to the extent that comparative light and darkness of surfaces are displayed, and because the range of toning on the digital plotter is limited, we did not concern ourselves with the actual intensity of the single light source. The comparative intensity of illumination of a point on a plane then is proportional to Cosine L. The apparent illumination of a flat surface then will be constant over the surface. The digital plotter does not generate light as a cathode ray tube does, but makes a dark mark. The size of this mark should indicate an absence of light. So the degree of darkness at a point, or the size of the plus sign rendered, is

H* = 1 - Cosine L

For simplicity it can be assumed that if a point is in shadow the largest allowable mark will be rendered on that point. For point j, then, the size of symbol Hj is the maximum symbol Hs. Hs is proportional to the dot spacing. During early experiments the size of the symbol rendered on a point not in shadow was

Hj = 1 - Cosine L

However, it was found that results were confusing; it was difficult to detect the difference between surfaces in shadow and surfaces almost parallel to the direction of light. In actual viewing of a solid, surfaces almost parallel to the direction of light reflect a considerable amount of light due to surface roughness, but as soon as the surface faces away from the light source, no light can be reflected and the apparent illumination changes sharply. In order to simulate this effect a contrast factor, h, usually .8, was introduced. So then

Hj = h Hs (1 - Cosine L)   (1a)

if the surface point j faces the light source, and

Hj = Hs   (1b)

if point j is in shadow.
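Equation 1 reduces to a few lines of code; a sketch in C, where the names and the separately supplied shadow flag are illustrative assumptions:

#define H_MAX 1.0     /* Hs: largest allowable plus sign, set by dot spacing */
#define CONTRAST 0.8  /* h: contrast factor from the text */

/* Size of the plus sign at a surface point, per Equations 1a and 1b.
   cos_l is the cosine of the angle between surface normal and light
   direction; in_shadow comes from a separate shadow test. */
double symbol_size(double cos_l, int in_shadow)
{
    if (in_shadow || cos_l <= 0.0)       /* faces away: treat as shadow */
        return H_MAX;                    /* Equation 1b */
    return CONTRAST * H_MAX * (1.0 - cos_l);   /* Equation 1a */
}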

We are faced now with essentially at least four programming problems:
1. Given a viewpoint and a mathematically described scene what is the point in picture to point
in scene correspondence? This problem is to determine what visible point, if any, on the objects
being rendered project onto a particular point in
the picture plane.
2. Given one or more light sources what is the intensity of light falling on a point in the scene?
This problem includes the determination of
which regions are in shadow.
3. Given a light source what are the boundaries of
the shadow cast and how much of this cast
shadow can be seen? If the picture could be
rendered large and/or if symbol density could be
large outlining the shadows could be dispensed
with.
4. How can the tone or color of a surface be specified and how should this specification affect the
tones rendered?
Economic solutions to the first two problems are most critical. If an economic point to point correspondence technique could be found that would permit dense symbol packing, the problem of casting shadow outlines could be eliminated. The problem of determining how much light falls on a flat surface not in shadow is trivial, and even for curved surfaces this is not difficult, but economically determining exactly what regions of the scene are in shadow is a very difficult problem.

Figure 3 - An assembly of planes which make up a cardboard model
of a building

Figure 4-Another view of the building


Figure 5 - A higher angle view of the building. 7094 calculation time for this picture was about 30 minutes.

Point by point shading

Point by point shading techniques yield good graphic results but at large computational times. These techniques are docile, require the minimum of storage, and enable easily coded graphical experimentation.
Figures 3, 4, and 5 are examples of point by point shading. Referring to Figure 6, the technique in generating these pictures was as follows:
1. Determine the range of coordinates of the projection of the vertex points.
2. Within this range generate a raster of spots (PiP) in the picture plane, reproject these spots one at a time to the eye of the observer, and generate the equation of the line of sight to each spot.
3. Determine the first plane the line of sight to a particular spot pierces. Locate the piercing point (Pi) in space. Ignore the spots that do not correspond to points in the scene (Pnp).
4. Determine whether the piercing point is hidden from the light source by any other surface. If the point is hidden from the light source (for example P2), or if the surface the piercing point is on is being observed from its shadow side, mark on the raster spot the largest allowable plus sign Hs. If the point in space is visible to the light source (for example P1), draw a plus sign with dimension Hj as determined by Equation 1.
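Steps 2 through 4 amount to what would now be called ray casting. A minimal sketch in C, assuming hypothetical helper routines (ray_through_spot, intersect_scene, hidden_from_light, plot_plus) in place of the paper's geometric machinery, and reusing the symbol_size routine sketched above:

typedef struct { double x, y, z; } vec3;

/* Hypothetical helpers: build the line of sight through a raster spot,
   find the first surface it pierces (returning the cosine of the light
   angle there), and test occlusion toward the light source. */
extern vec3   ray_through_spot(vec3 eye, int ix, int iy);
extern int    intersect_scene(vec3 origin, vec3 dir, vec3 *hit, double *cos_l);
extern int    hidden_from_light(vec3 point, vec3 light);
extern double symbol_size(double cos_l, int in_shadow);
extern void   plot_plus(int ix, int iy, double size);

void shade_picture(vec3 eye, vec3 light, int nx, int ny)
{
    for (int iy = 0; iy < ny; iy++)
        for (int ix = 0; ix < nx; ix++) {
            vec3 dir = ray_through_spot(eye, ix, iy);
            vec3 hit;
            double cos_l;
            if (!intersect_scene(eye, dir, &hit, &cos_l))
                continue;               /* a spot like Pnp: no scene point */
            plot_plus(ix, iy,
                      symbol_size(cos_l, hidden_from_light(hit, light)));
        }
}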
This method is very time consuming, usually requiring for useful results several thousand times as much calculation time as a wire frame drawing. About one half of this time is devoted to determining the point to point correspondence of the projection and the scene. In order to minimize calculation time for point by point shading and maintain resolution, techniques were developed to determine the outline of cast shadows. Outlining shadows has the advantage that all regions of dissimilar tones on the picture plane are outlined. Even when projected shadows are delicate, and symbol spacing is large, the shadows are specified and the discontinuity in tone is emphasized. The strategy for point by point determination of shadow boundaries is as follows (referring to Figure 7):


Figure 6 - Point by point shading


Figure 7 - Segment by segment outlining of shadows

1. Classify all surface line boundaries into shadow casting and non-shadow casting. A shadow casting line is, from the viewpoint of an observer at the light source, a contour line. For assemblies of flat surfaces, a contour line is a line along which all surfaces associated with this line appear on only one side of the line.
2. Determine whether the observer is on the shadow side or lighted side of all surfaces.
3. Subdivide all shadow casting lines, one at a time, into small segments (K1, K2), usually .005 units, and determine the midpoint of each segment (KM).
4. Generate a light ray to the midpoint of the segment (KM). If any surface lies between KM and the light source, go on to the next segment. Determine the next surface behind KM that the light ray to KM pierces within its boundary. If no surface lies behind KM, go on to the next segment. A point can cast only one shadow. Project K1, KM, and K2 onto the surface to obtain K1S, KMS, and K2S, the shadows of K1, KM, and K2. If KMS lies on a surface which is seen from its shadow side, go on to the next segment. This particular shadow boundary is invisible. Also, a shadow cannot fall within a shadow.
5. Test KMS for visibility. If KMS is hidden from the observer, go on to the next segment.
6. If KMS is visible, project the line (K1S-K2S) onto the picture plane and draw the projection.
As can be expected, determining the outline of shadows by this described strategy is very time consuming, usually requiring as much time as a point by point line visibility determination.

Methods of quantitative invisibility

In a previous report, the notion of quantitative invisibility was discussed as the basis for rapidly determining the visibility of lines in the line rendering of polyhedra.4 P. Loutrel has implemented several tactical improvements for this application.15 Quantitative invisibility is the count of all surfaces, seen from their spatial side, which hide a line from an observer at a given point on the line.

The methods of quantitative invisibility are useful because techniques for detecting changes in quantitative invisibility along a line are more economical than measuring the visibility, absolute or quantitative, at a single point. These techniques are applicable only to material lines, which are lines that have specific end points and that do not pierce any bounded surface within its boundary. Objects that are manufactured contain only material lines. A contour line is a line along which the line of sight is tangent to the surface of the solid. For polyhedra, given a specific viewpoint, a contour line is a material line which is the intersection of two surfaces, one of which is invisible.

For a given viewpoint the quantitative invisibility of a material line can change only when it passes behind a contour line. Figure 8 illustrates how quantitative invisibility varies as a line passes behind a solid. Notice that only surfaces which are viewed from their spatial side affect the measurement of quantitative invisibility. In determining line visibility for line drawings, only those segments of the line for which quantitative invisibility is zero are drawn. For this application only the quantitative invisibility of vertex points is stored; changes in quantitative invisibility along a line are measured and discarded as soon as commands to the graphic device are generated. The methods of quantitative invisibility can be applied to shading a picture if the changes in quantitative invisibility of a line from the light sources and the observer are stored and compared.
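The bookkeeping along one line can be sketched as follows in C; the crossing list and helper routines are hypothetical stand-ins for the stored and correlated measurements the paper describes:

/* Walk a material line from its start vertex, whose quantitative
   invisibility qi_start is already known, toward its end.  Each time the
   line passes behind (or emerges from behind) a contour edge, qi steps
   by one; only stretches with qi == 0 are drawn. */
struct crossing { double t; int delta; };   /* parameter along line, +1/-1 */

extern int  find_contour_crossings(int line_id, struct crossing out[], int max);
extern void draw_segment(int line_id, double t0, double t1);

void render_line(int line_id, int qi_start)
{
    struct crossing c[64];
    int n = find_contour_crossings(line_id, c, 64);  /* sorted by t assumed */
    int qi = qi_start;
    double t_prev = 0.0;
    for (int i = 0; i <= n; i++) {
        double t_next = (i < n) ? c[i].t : 1.0;
        if (qi == 0)
            draw_segment(line_id, t_prev, t_next);   /* visible stretch */
        if (i < n)
            qi += c[i].delta;    /* crossed behind a contour line */
        t_prev = t_next;
    }
}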


Figure 8 - Changes in quantitative invisibility. Object A is in front of and does not touch object B

The method of cutting planes

In descriptive geometry, the intersection of simple quadric surfaces is determined by passing carefully chosen planes through the quadric surface to determine the intersection curve of the quadric surface and the plane; from these first and second degree surface intersections the intersection curve of one or more quadric surfaces can be deduced. This procedure is time consuming but does solve a problem difficult for most mathematicians. This technique of manual rendering is the inspiration for the method of cutting planes for shading machine renderings of solids. Point by point shading techniques are expensive because it is difficult with good resolution to correlate the shading of adjacent spots on the picture plane. With simple codings, the method of cutting planes enables such correlation in one direction; with more elaborate coding the correlation can be in all directions.

The basic concept of the method of cutting planes is that when the intersections of a plane that passes through the observation point and assemblies of planes which can enclose one or more polyhedra are projected onto the picture plane, these projected intersections are colinear. In detail, as illustrated in Figure 9, the strategy is:

1. Generate a cutting plane which passes through the observation point.
2. This cutting plane will intersect the picture plane along a specific line (ICP).
3. This cutting plane will cut or pass through the surfaces of the polyhedra and generate the intersection ICS, which is a string of three or more line segments. Each of these segments is a material line (ICJ).
4. All the ICJ of the polyhedral faces and a particular cutting plane will project colinearly onto the picture plane. This colinear projection is the line ICP.
5. These intersections ICJ for a particular cutting plane can then be measured for changes in quantitative invisibility by techniques previously reported.4,15 Those intersections ICJ which prove to be completely invisible can be quickly determined and need be analyzed no further. We have now determined a correspondence between a line on the picture plane and a series of lines in the scene to be rendered.
6. Those lines of intersection ICJ which are on surfaces which face away from the light source can be rendered with no further analysis. These lines are completely in shadow, and along their visible projected length plus signs of the maximum size (Hs) can be generated.
7. Lines of intersection ICJ which are on surfaces which face toward the light source can be analyzed to determine changes in quantitative invisibility from the viewpoint of the light source. Those projected portions of the lines which are hidden from the light source are rendered by a series of plus signs of size Hs. Those portions which are visible to the light source are rendered by a series of plus signs whose size is determined by Equation 1.

Figure 9 - The method of cutting planes
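A skeleton of the scan in C, purely illustrative: the helpers for intersecting the faces, testing their lighting, and pruning hidden intervals (step 5) stand in for the paper's routines:

/* One pass of the cutting-plane method: sweep a pencil of planes through
   the observation point; each plane meets the picture plane in a line ICP
   and meets the faces in segments ICJ whose projections all fall on ICP. */
struct segment { double t0, t1; int face; };   /* interval along ICP */

extern int  cut_faces(int plane_no, struct segment out[], int max);
extern int  face_lit(int face);                /* faces toward the light? */
extern void tone_run(struct segment s, int plane_no, int full_shadow);

void shade_by_cutting_planes(int nplanes)
{
    struct segment segs[256];
    for (int p = 0; p < nplanes; p++) {
        int n = cut_faces(p, segs, 256);       /* steps 1-4 */
        /* step 5: visibility along ICP would prune hidden intervals here */
        for (int i = 0; i < n; i++) {
            if (!face_lit(segs[i].face))
                tone_run(segs[i], p, 1);       /* step 6: full-shadow Hs */
            else
                tone_run(segs[i], p, 0);       /* step 7: Equation 1 sizes */
        }
    }
}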

Figure 10 - Two views of a machine part where the light source is
moved relative to the object


The resolution of shading by the method of cutting
planes is no longer limited by the spot to spot spacing
on the picture plane but by the spacing of the cutting
planes intersections with the picture plane. For comparable resolution the calculation time for shading by
cutting planes is slightly less than the square root of
the time for shading by point by point methods.

Figure 12 - Assembly of the two previously drawn machine parts.
7094 calculation time: about 50 seconds.


Figure 11 - Another machine part. 7094 calculation time for this
picture was about 30 seconds

The speed of calculation is very dependent on how effectively the measurements of quantitative invisibility of all the lines in the scene are stored and correlated. This list is the basis for determining visibility along cutting plane intersections. The first version of the Fortran IV program used to generate Figures 10 to 14 was experimental and is not as fast as theoretically expected. Pictures were plotted on an IBM 1627 (Calcomp). A faster version which can take into account more than one light source is under development for operation on a 360/67. The larger core, greater data storage capacity and time sharing capability of this machine will be utilized. The method of cutting planes, which enables rapid correspondence of projected points to real points in the scene, readily accommodates illumination of the object by more than one light source of differing intensities. The size of the plus signs drawn on a particular spot can then be the sum of the shadow intensities from all the light sources:

H = Σ Hi   (2)

where Hi is the shadow intensity from a particular light source; the more intense a light source, the more intense the shadow it causes when a point cannot see that source of light. With

I TOTAL = Σ Ii   (3)

where Ii is the intensity of light source i, the weight of each source is

HSi = Ii / I TOTAL   (4)

where HSi is the shadow intensity in the absence of light from light source i. Then

Hi = HSi   when a point is in shadow, and

Hi = HSi (1 - Cosine L) h   (5)

when a point is seen by light source i.
It is also possible to exercise tone control for emphasis while generating the half-tone picture. A list of comparative surface tones can be entered which will describe the basic tone of each surface. For example, if a scene consists of object A with four surfaces and object B with six surfaces, and object A is lighter than object B, the surface tone list would be (.5, .5, .5, .5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0). The size of the shading symbol can then be determined by
of the shading symbol can then be determined by
HTik = (Hi × FLIGHT + FTONE) TONEk   (6)

where FLIGHT and FTONE are influence factors of light and surface tone, and TONEk is the relative tone of surface k. FLIGHT and FTONE enable the control of highlighting. Master copy for the preparation of color process plates for letterpress printing has been generated using mathematical models similar to Equation 6. It is obvious that once the basic problems of determining how light falls on the object are solved, considerable artistic freedom is possible.

Figure 13 - Another machine part
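Equations 2 through 6 combine into a short routine. A sketch in C, with illustrative names and array shapes (these are not the Fortran IV program's):

#define NLIGHTS 2
#define CONTRAST 0.8   /* h */

/* Symbol size at one surface point under several light sources,
   following Equations 2-6.  intensity[i] is Ii; cos_l[i] is Cosine L for
   light i; in_shadow[i] says whether the point is hidden from light i;
   tone is TONEk for the surface; flight/ftone weight light vs. tone. */
double symbol_size_multi(const double intensity[], const double cos_l[],
                         const int in_shadow[], double tone,
                         double flight, double ftone)
{
    double i_total = 0.0;
    for (int i = 0; i < NLIGHTS; i++)
        i_total += intensity[i];                        /* Equation 3 */

    double h_sum = 0.0;
    for (int i = 0; i < NLIGHTS; i++) {
        double hs_i = intensity[i] / i_total;           /* Equation 4 */
        if (in_shadow[i] || cos_l[i] <= 0.0)
            h_sum += hs_i;                              /* full shadow term */
        else
            h_sum += hs_i * (1.0 - cos_l[i]) * CONTRAST; /* Equation 5 */
    }
    return (h_sum * flight + ftone) * tone;             /* Equations 2 and 6 */
}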

Figure 15 - Assembly of the three machine parts. 7094 calculation
time for this view was about two minutes

ACKNOWLEDGMENTS
The author is grateful to G. Folchi, Dr. G. Lasher, and Dr. P. Loutrel for some helpful discussions. R. L. Ennis, J. J. Gordineer, J. A. Mancini, Mr. & Mrs. E. P. McGilton, W. H. Murray, and G. J. Walsh, among others, were helpful with plotter output and other machine problems. The author is deeply indebted to J. P. Gilvey and F. L. Graner for their continuing support and encouragement of this project.
Figure 14 - Another view of the machine part shown in the previous figure. The light source has been moved relative to the object. Notice the light passing through the opening in the object

REFERENCES

1 T E JOHNSON
Sketchpad III: A computer program for drawing in three dimensions
Proc AFIPS 1963 Spring Joint Computer Conf Vol 23 pp 347-353
2 A APPEL
The visibility problem and machine rendering of solids
IBM Research Report RC 1618 May 20 1966

Techniques for Shading Machine Renderings of Solids
3 P LOUTREL
Determination of hidden edges in polyhedral figures: convex
case
Technical Report 400-145, Laboratory for Electroscience
Research NYU September 1966
4 A APPEL
The notion of quantitative invisibility and the machine rendering of solids
Proc ACM 1967 Conference pp 387-393
5 R A WEISS
BE VISION, a package of IBM 7090 Fortran programs to draw orthographic views of combinations of plane and quadric surfaces
JACM Vol 13 April 1966 pp 194-204
6 H R PUCKETT
Computer method for perspective drawing
Journal of Spacecraft and Rockets Vol 1 No 1 pp 44-48 1964
7 W S HOLMES H R LELAND G E RICHMOND
Design of a photo interpretation automaton
AFIPS Conf Proceedings 1962 Fall Joint Computer Conference Vol 22 pp 27-35
8 R W CONN
Digitized photographs for illustrated computer output
AFIPS Conference Proceedings 1967 Spring Joint Computer

45

Conference Vol 30 pp 103-106
9 Lunar Orbiter surveys the Moon
Sky and Telescope Vol 32 No 4 October 1966 pp 192-197
10 L G ROBERTS
Machine perception of three-dimensional solids
Technical Report No 315 Lincoln Laboratory MIT May 1963
11 G LASHER
Mixed state of type-I superconducting films in a perpendicular magnetic field
The Physical Review Vol 154 No 2 pp 345-348 Feb 10 1967
12 J L PFALTZ A ROSENFELD
Computer representation of planar regions by their skeletons
CACM Vol 10 No 2 February 1967 pp 119-125
13 A J COLE
Plane and stereographic projections of convex polyhedra from
minimal information
The Computer Journal
14 C WYLIE G ROMNEY D EVANS A ERDAHL
Half-tone perspective drawings by computer
AFIPS Conf Proceedings 1967 Fall Joint Computer Conference Vol 31
15 P LOUTREL
PhD Thesis NYU September 1967

A system for interactive graphical programming*
by WILLIAM M. NEWMAN**
Harvard University
Cambridge, Massachusetts

INTRODUCTION
A system is described in this paper for developing
graphical problem-oriented languages. This topic is of
great importance in computer-aided design, but has
hitherto received only sketchy documentation, with
few attempts at a comparative study. Meanwhile displays are beginning to be used for design, and the
results of such a study are badly needed. What has
held back experimentation with computer graphics
has been the difficulty of specifying new graphic
techniques using the available programming languages; the method described in this paper appears
to avoid this difficulty.
Defining a problem-oriented language
Notation

Any description of an interactive process must define the response of the system to each input. For
this reason it is convenient to describe graphical problem-oriented languages in terms of actions and reactions. An action is simply an input which may produce a response; the corresponding reaction defines
this response, and in addition any unmanifested effect
of the action on the state of the machine. The same
action may cause a different reaction on different
occasions: for example, movement of the light pen
may affect the display in a number of different ways.
It is therefore convenient to treat the system as a
finite-state automaton, and to say that the reaction is
determined by the state of the program as well as by
the action. In other words, the actions are inputs to
the automaton, which cause it to change state; reactions are the outputs.
Just how convenient this is for describing interactive processes is illustrated by the following example. A 'rubber-band' line1 can be created by means of
*The work described in this report was supported by a Science
Research Council Contract, No. B/SR/2071, "Computer Processing of Three-Dimensional Shapes."
**Formerly at the Centre for Computing and Automation, Imperial
College, London.


a light pen and one push-button, in a sequence of five
operations:
1) press button to start pen tracking;
2) track pen to starting point of line;
3) press button to fix starting point;
4) track pen to end point;
5) press button to fix end point and stop tracking.
The 'rubber-band' effect is created by displaying a
line joining the starting point to the pen position
throughout stage 4.
Figure 1 shows a state-diagram representing this
sequence. Each branch represents an action, and the
resultant reaction is specified in the "arrowhead."
Only valid actions are included; for example, pen movements are meaningless in state 1 and are therefore omitted. The inclusion or exclusion of an action may add semantic properties to the diagram. This is shown by the 'pen movement' branches on states 2 and 3, which imply pen tracking during those states and make explicit reference to tracking unnecessary.


Figure 1 - A state-diagram representing rubber-band line-drawing

The state-diagram has been used in this way as the basis of a method for defining problem-oriented languages. A particular advantage of this technique is the way an immediate reaction can be associated with each action in a sequence; this is of great importance in graphical programs. On the other hand the state-diagram offers no direct method of attaching semantic functions to groups of actions, and is therefore of little use for describing phrase-structured grammars. This is less of a drawback than it seems. An interactive problem-oriented language need not possess a complex structure to function efficiently, and benefit can often be gained from simplifying the language as much as possible. Roos, for example, has noted the difficulty experienced by some engineers in using the relatively simple languages of the ICES System.2
The basic function of the state-diagram, as illustrated in Figure 1, is to indicate the actions which may
validly occur during each state, and the reactions and
changes of state which they will cause. A number of
additions have been made to this basic notation. Normally, branching takes place when a user action
matches an action defined in the diagram. Branching
may however be over-ridden by the result of a test
routine included in the branch definition. Furthermore, branching may be initiated by the program itself
by means of system actions: thus the result of a procedure may determine to which of several states the
program will branch. A procedure of this kind is called a program block and is attached to a state rather than to a branch; it is executed every time the corresponding state is entered. Program blocks need not terminate in a system action, but may instead be used to
provide some sort of continuous background activity.
These are the essential additions to the notation.
One other has been included for convenience in
programming 'conversational' systems, in which
each input message produces a predetermined output
message. This message can be coded within the
reaction procedure or program block, but it is convenient to be able to state it separately as an output
string or response. States therefore possess a response
as well as a program block, and reactions are similarly
defined as two components, a response and a procedure. The procedure is represented as an instruction for execution or IEX.
The suggested form of these additions to the notation is shown in Figure 2; a somewhat similar notation has been used by Phillips3 to describe real-time control programs. Figure 2 illustrates how the first example could be extended to permit the removal of lines and initialization of the program. These two functions are controlled by the commands DELETE and RESTART; as explained below, commands may be typed at the console typewriter, or may be arranged to appear on the screen as light-buttons. After giving the command DELETE, the user points the light pen at each line to be removed. Deletion is carried out by a test routine DLAST, which also tests whether any other lines remain on the screen. When none remains, or the user gives the command DRAW, the program changes state. RESTART causes the program to enter the initial state 4, execute its program block PBGO and return to state 1 when initialization is complete.
The language
A Network Definition Language has been developed so that problem-oriented languages, defined in the form of state-diagrams, can be compiled into interactive programs. W. R. Sutherland4 has shown that programs can be described directly to the computer in graphical form, and this technique has obvious applications to the input of state-diagrams. However, much of the information in these diagrams is in character form, and would be difficult to describe in purely graphical terms. For this reason, and because it is more suited to off-line preparation, a character-based language was preferred. The following remarks and examples are intended to give a general impression of this language, which is described elsewhere in some detail.5
The state-diagram is described by defining each
state in turn; each such state definition is followed
by a list of the branches from that state and their
properties. The ordering of the state definitions,
and of the branch definitions within a list, is immaterial. State and branch definitions are constructed
from statements, each defining one property as in
the following examples:
RESP   PRESS BUTTON
PB     PB22
IEX    REPROG

These define a response "Press button," a program
block called PB22 and an instruction for execution
named REPROG, respectively.
Each state has three properties (name, response
and program block) and each branch has seven.
For convenience, however, the language permits certain statements to be omitted if the value of the property is null, meaningless or zero. The only statements which are syntactically necessary are those defining the names of states and the actions of branches. The remaining statements in a state or branch definition must follow these, but can be given in any order.


Figure 2 - An extended diagram including responses, and with
provision for drawing and deleting lines and for initialization

Some care has been taken to avoid explicit references to peripherals in the language. As a result, state-diagrams are largely device-independent and compile into similarly device-independent programs. For example, a 'pen movement' action may originate as movement of a light pen or tracker ball, as a pair of typed coordinates, or even as two numbers read off a tape. Any such compatible set of actions is called a category, and the definition of an action must include the category name in the first statement. Some actions, such as typed commands, require a further property to define the message content. A command "restart" would be defined thus:
ACT    0
MES    RESTART

The first statement indicates an action of category
0, which includes light-button hits as well as typed
commands; the second defines the message content.
Numerical names have been used for categories so
that extensions to the category list can be made
more easily.

Table I shows the state-diagram of Figure 2 coded
by means of the language into a network definition
and illustrates the use of the state entry to define
a change of state. If no state entry is mentioned
in a branch definition the state remains unaltered.

Compiling and executing an interactive program
Bilingual programming
The Network Definition Language contains no
facilities for coding the procedures named in state
diagrams. It is intended rather to be used in conjunction with a procedure-oriented language, each
language being used for the tasks to which it is most
suited. Some readers may disagree with this approach, which requires the programmer to be bilingual. The fact remains that procedure-oriented languages on to which powerful control facilities have been grafted rarely make for easy programming. It therefore seems reasonable that interactive programs should be written in two languages, one procedure-oriented and the other control-oriented.

STAT  1                            State definition, state 1
RESP  PRESS BUTTON TO TRACK        State 1 response, "Press button to track"
ACT   0                            Branch definition, action of category 0 (command)
MES   RESTART                      Message "restart"
SE    4                            State entry, i.e. branch leads to state 4
ACT   0                            Branch definition; command "delete" leads to state 5
MES   DELETE
SE    5
ACT   10                           Branch definition, category 10 (button)
SE    2                            Pressing button leads to state 2

STAT  2                            State 2 definition
RESP  PRESS BUTTON TO DRAW         State 2 response
ACT   7                            Branch definition, category 7 (pen movement)
ACT   10                           Branch definition; pressing button leads to state 3
IEX   STORPT                       STORPT stores pen position as starting point when button is pressed
SE    3

STAT  3                            State 3 definition
RESP  PRESS BUTTON WHEN COMPLETE
ACT   10                           Branch definition; pressing button leads to state 1
SE    1
ACT   7                            Branch definition, pen movement
IEX   DLINE                        DLINE computes and displays fresh line at every pen movement

STAT  4                            State 4 definition
INIT                               Initial state, program starts here
PB    PBGO                         Program block PBGO, executed on entering state 4
ACT   5                            Branch definition, category 5 (system)
SE    1                            Completion of PBGO leads to state 1

STAT  5                            State 5 definition
RESP  POINT AT LINE TO DELETE
ACT   0                            Branch definition; command "draw" leads to state 1
MES   DRAW
SE    1
ACT   6                            Branch definition; category 6 (pen hit)
TEST  DLAST                        Test routine DLAST deletes indicated line
SE    1                            If last line, branch to state 1
END

TABLE I: The example of Figure 2 coded into Network Definition Language
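Read as data, Table I is exactly what a table-driven dispatcher needs. A minimal sketch in C of the matching loop, using flat arrays in place of the ring structures described below (all names are illustrative, not Newman's):

#include <string.h>

struct branch {
    int state;            /* state the branch leaves from */
    int category;         /* action category: 0 command, 7 pen, 10 button... */
    const char *message;  /* required message text, or NULL for any */
    void (*iex)(void);    /* instruction for execution, may be NULL */
    int next_state;       /* state entry; -1 means no change of state */
};

/* A few rows of Table I as data. */
extern void storpt(void), dline(void);
static const struct branch table[] = {
    { 1,  0, "RESTART", 0,      4 },
    { 1,  0, "DELETE",  0,      5 },
    { 1, 10, 0,         0,      2 },
    { 2, 10, 0,         storpt, 3 },
    { 3, 10, 0,         0,      1 },
    { 3,  7, 0,         dline, -1 },
    { 4,  5, 0,         0,      1 },   /* system action: PBGO complete */
};

static int state = 4;     /* initial state, per the INIT entry */

/* Match one action against the branches leaving the current state. */
void react(int category, const char *message)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        const struct branch *b = &table[i];
        if (b->state != state || b->category != category)
            continue;
        if (b->message && (!message || strcmp(b->message, message) != 0))
            continue;
        if (b->iex) b->iex();
        if (b->next_state >= 0) state = b->next_state;
        return;
    }
    /* no match: the action is invalid in this state and is ignored */
}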

This bilingual approach permits an interactive program to be created as three separate components,
namely the control component, procedure component
and supervisor. The supervisor contains routines for handling interrupts and maintaining the display; at its nucleus is a program which analyses and interprets inputs. This program, the Reaction Handler, is basically a table-driven syntax analyser.5 The tables to which it refers are ring-structures and include a model of the state-diagram, created by compiling the network definition with a Network Compiler. These tables form the control component of the program. They contain references to the test routines, program blocks and instructions for execution, which constitute the procedure component and are compiled separately.
The decision to use ring-structures for the Reaction Handler tables was made for a number of reasons. It permitted null-valued entries to be omitted from the tables, and others such as message and response definitions to be of variable length. It meant that a package of routines would be available for building data structures for computer-aided design. It also allowed a much more flexible approach to the design of the Reaction Handler, since the tables could be easily extended or rearranged. The ring-processing package, which has been described in another paper,7 was based on the ASP language of J. C. Gray.8 It differs from the earlier ring languages of Roberts9 and others in the ease with which dynamic alterations can be made to the structure. In particular, elements can be attached to rings or removed from them without altering the element size, since the connections are made by ring starts and associators which need not be contiguous with the element. This flexibility has made it possible to write an on-line, incremental Network Compiler.

The network compiler
The essence of an incremental compiler, as described by Lock,10 is that each statement is independently compiled into executable form, and can later be modified without complete recompilation. Normally this means assigning a number to each statement so that it can be referenced. In the Network Compiler it was found sufficient to assign names to the states and branches; individual statements could then be referenced by the property name. Branch naming has since been discarded in order to save space, and it is therefore no longer possible to refer back to individual branch definition statements. Users have not found this to be a great disadvantage.
The first statement in a state or branch definition causes the Network Compiler to set up a corresponding ring element; the two classes of element are called state elements and branch elements. An extra word is provided in the state element to hold the state name. Each further statement in a definition has the effect of attaching to the element a definition ring defining the named property. The manner in which rings define properties is shown diagrammatically in Figure 3. The value of any property of an element can be found by selecting the definition ring with the appropriate attribute in the associator, and ascending it to the ring start; the element attached to this ring start contains the value.


Statements defining state entries are treated in the same fashion: the ring to which the branch element is attached leads to the named state element. Each state element has such a state entry ring, whose constituent elements define the set of branches leading to this state. A second ring, the permitted action ring, starts from each state element, and defines the set of branches leading from that state; this set includes branches which return to the same state and therefore possess no state entry. Figure 4 shows part of the ring-structure resulting from compiling the network definition of Table I.

Figure 3 - Defining properties by means of definition rings. This shows the ring structure defining a branch of category 7 with an IEX called DLINE. An associator is shown in detail

Figure 4 - Part of the ring-structure resulting from compiling the
network definition of Table I

As mentioned above, the Network Compiler is designed for on-line use, and may be operated from the teleprinter or from the display using light-buttons. The teleprinter has proved the more convenient for on-line compilation, but the light-buttons and displayed responses provide a valuable aid, particularly to the novice. The compiler will also accept paper tapes prepared off-line. It has error-checking facilities, and will halt at any erroneous command on the tape until the correct version is typed.
The reaction handler: modes
A program under the control of the Reaction Handler may be in one of three modes: these are interrupted mode, reaction handling (or RH) mode and waiting mode. Waiting mode does not imply inactivity, but that the program has reached one of the states in the state-diagram and is ready for an action to occur. It may therefore be engaged in computation (user waiting for computer) or looping on a dynamic stop (computer waiting for user).
During waiting mode any action by the user will
cause the program to switch to interrupted mode, and
a stimulus to be passed to the Reaction Handler. The
stimulus specifies the device involved and may also
refer to a block of data or message, such as a teleprinter character or a pair of light pen coordinates.
Device name and message address are stored in a
five-word element which forms the head of a queue of
stimuli.
The program then enters RH mode and the Reaction Handler processes the first stimulus in the queue.
Stimulus processing is a form of syntax analysis,
whose goal is to match the stimulus to an entry in the
appropriate table. If this goal cannot be reached, the
stimulus represents an ungrammatical action. If however the goal is reached, the table may specify a fresh
goal to be attained. Eventually the process stops,
either when it fails to reach a goal or when it arrives
at an ultimate goal where no further goal is specified.
The Reaction Handler then fetches the next stimulus
from the queue and processes it; when the queue is
empty the program returns to waiting mode.
At any time the Reaction Handler may be interrupted by a user action, and a further stimulus may be
added to the end of the queue. To prevent the queue
from growing uncontrollably, software flip-flops or
latches are used to govern the rate at which each device generates stimuli. The light pen latch, for example, is set when a pen stimulus enters the queue and
cleared when it has been processed; while it is set,
all fresh pen positions are ignored.
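A latch of this kind is a single flag per device; a minimal sketch in C (illustrative names):

/* One software latch per device, gating stimulus generation as in the
   light-pen example: set when a stimulus is queued, cleared when it has
   been processed; while set, fresh inputs from that device are dropped. */
struct device {
    int latch;
};

extern void enqueue_stimulus(int device_id, const void *msg);

void on_device_interrupt(struct device *d, int device_id, const void *msg)
{
    if (d->latch)
        return;              /* a stimulus is already pending: ignore */
    d->latch = 1;
    enqueue_stimulus(device_id, msg);
}

void on_stimulus_processed(struct device *d)
{
    d->latch = 0;            /* the device may generate stimuli again */
}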
Stimulus processing
The first task of the Reaction Handler on receiving
a fresh stimulus is to establish what category of action,
if any, the stimulus represents. This it does by referring to a category table. Once an action has been
recognised in this way, it is matched against descriptions in a branch table of the branches leading from
the current state. This second phase determines what
reaction and change of state should occur, and it is
convenient to describe this phase first.

The branch table is in fact the ring structure created
by the Network Compiler, as described above and
illustrated in Figure 4. The Reaction Handler's first
goal is to find a branch of the appropriate category
which belongs to the current state. This can be done
by searching in parallel the permitted action ring and
the category definition ring. If any branch elements
belong to both rings, the next goal is to find among
them an element whose message definition matches
the stimulus message. A series of message comparisons therefore takes place; before carrying out a comparison on a branch element, the test routine is executed.
If a matching element is found, the corresponding reaction takes place: the reaction response is displayed, and the IEX is executed. If no match can be achieved, the Reaction Handler will accept any branch element with a 'null' message. The final goal is the branch's state entry. If none exists, there is no change of state. If on the other hand the branch element is attached to a state entry ring, the new state's response is displayed, and the program block address is used as a transfer address when the program eventually returns to waiting mode. This address may be modified by a jump displacement included in the branch definition: this is a method of achieving multiple entries to a program block. If the state has no program block, the program transfers to a standard dynamic stop address.
The most time-consuming part of this process is the parallel ring search, which becomes particularly undesirable when repeated actions such as pen movements are taking place very frequently. The parallel search is therefore avoided by scanning through the permitted action ring at every change of state. At each branch element on the ring an activity bit is set which allows a simple search through the category definition ring to be carried out whenever an action of that category occurs.

Figure 5 - A category table ring-structure, capable of dealing with typed commands and light-button commands

The first phase of the reaction handling process, in
which an input stimulus may be recognized as a
particular category of action, is carried out in a similar
fashion to the second. During this phase the Reaction
Handler uses the category table, a fragment of which
is shown in Figure 5. When a stimulus is received, it is
first matched with a source element containing the
same device name as the stimulus. A search is then
carried out along the down-ring from this element,
in an identical fashion to the second-phase search
along the category ring. If a matching element is found
on this ring, its lEX is executed. The next goal is
to find an up-ring from this element, leading to a category element: this ring is treated in the same fashion as
the state entry ring in the second phase. The category
element forms a link between the category table and
the branch table, and the down-ring which starts at
this element is in fact the category definition ring used
in the second-phase search.
The first phase of reaction handling performs two important functions. It is capable of concatenating stimuli so that the combined input string can be treated as a single action by the second phase; it is also responsible for grouping together actions of the same category. Both functions are illustrated by Figure 5, which depicts the ring-structure for dealing with input commands. Typed commands originate as a string of characters, each of which matches with element E2 and is stored in a buffer. When the terminating character is typed, the IEX of element E1 exchanges the single-character stimulus message for the complete buffer contents, and the second phase of reaction handling commences from the "Category 0" element. Light-button hits produce stimuli containing the complete command as an input message. These match with element E3 and lead immediately to the second phase.
When the program changes state, the old activity bits must be cleared and the new ones set. The Reaction Handler must also carry out various set-up functions, defined as properties of the source elements. These functions include such operations as clearing buffers, starting pen-tracking and setting up the light-button 'menu.' Source elements are themselves held on a ring, whose members define the set of active devices. The display is treated as a number of different 'devices': light pen interrupts are separated into light-button hits, tracking interrupts and so forth, and passed to the Reaction Handler under different device names.
In its general layout, the category table closely resembles the branch table, and is set up by a very similar compiler. This Category Compiler is also incremental, and accepts table descriptions written in the Category Definition Language,5 which differs only slightly from the Network Definition Language. In general, a complete category table suits most programs, but it is convenient to be able to edit out unwanted categories with the aid of the Category Compiler in order to save space.
CONCLUSION
At the time of writing, the Reaction Handler system has been in use for only a few months. Nevertheless it has during this brief period demonstrated a number of valuable features. In particular, the Network Definition Language provides a very efficient means of writing graphical programs, and simple experiments with graphical techniques can now be carried out in a matter of hours instead of weeks. Both the language and the underlying state-diagram concept are extremely simple, and can be used by those with very little programming experience.
The adoption of a bilingual approach has undoubtedly helped to make this possible, and it is interesting to
compare other systems of a similar nature. The use of
a separate language to define a program's control
sequence has been proposed before, but it is rare to
find explicit reference to the need for two languages in
interactive programming. The ICES System employs
what amounts to a bilingual method, in which a Command Definition Language is used to define the control sequence. The language is designed around the
use of card-image input, however, and is not particularly suitable for interactive programming. Command
Flow Graphs are used in a similar fashion to state-diagrams, but the concept of program states is not
employed.
A much more powerful facility for treating problem-oriented languages of a very general nature is provided
in the AED System.11 Language syntax can be described by means of the AEDJR Command Language.12 The extreme generality which this system
permits is attractive, but is probably unnecessary in
graphical programs. The Command Language is
very complex, and its efficient use obviously requires considerable experience.
The processing of basic characters by the AED System is carried out by the RWORD System. This
system is particularly interesting, as it employs the
concept of representing programs as finite-state
automata. It possesses many of the features of the
Reaction Handler, but avoids the explicit definition of
program states, a feature which has been found valuable in practice. RWORD instead uses a very neat
regular-expression language for defining vocabulary
words, and avoids the use of tables in order to speed
up program execution. It is clearly capable of producing more efficient programs than is possible using the
Reaction Handler's ring-structured category network.
Nevertheless, the Reaction Handler has performed
quite satisfactorily as a real-time supervisor. It
provides a fast response to all types of user action,
including pen .movement where a good response is
essential. It does so at the expense of a high system
overhead, which may reach as much as 20% during
pen-tracking. In a display processor, which is idle
most of the time, this is quite acceptable.
Less acceptable is the space consumed by the supervisor. The system was developed on an 8K DEC
PDP-7 computer and Type 340 display, and in this
machine the supervisor occupies nearly 4K. Besides
the Reaction Handler, this includes the ring-processing package, a full set of interrupt-handling and output
routines, and a software character generator. Some
difficulty was experienced in coding the ring-processing routines as pure procedures, due to the lack of
index registers on the PDP-7. It seems likely that the
size of the supervisor could be greatly reduced by
using a machine equipped with index registers and a
hardware character generator.
ACKNOWLEDGMENTS
I wish to thank Mr. C. B. Jones for his extensive
assistance in programming the system. I am also
grateful to Mr. Alan Tritter for suggesting the inclusion of
the test routine, and to many members of staff of the
Centre for Computing and Automation, Imperial College, and of the Cambridge University Engineering
and Mathematical Laboratories, including Professor
W. S. Elliott and Messrs G. F. Coulouris, C. A. Lang
and R. J. Pankhurst, for their advice and encouragement.

REFERENCES
1 I E SUTHERLAND
Sketchpad: a man-machine graphical communication system
Proceedings of the 1963 Spring Joint Computer Conference
2 D ROOS
ICES system design
MIT Press Cambridge Massachusetts 1966 p 25
3 C S E PHILLIPS
Networks for real-time programming
Computer Journal Volume 10 May 1967 p 46
4 W R SUTHERLAND
On-line graphical specification of computer procedures
MIT Lincoln Laboratory Technical Report No 405
Lincoln Laboratory Lexington Massachusetts
5 W M NEWMAN
Definition languages for use with the reaction handler
Computer Technology Group Report 67/9 Imperial College
London October 1967
6 T E CHEATHAM K SATTLEY
Syntax-directed compiling
Proceedings of the 1964 Spring Joint Computer Conference
7 W M NEWMAN
The ASP-7 ring structure processor
Computer Technology Group Report 67/8 Imperial College
London October 1967
8 J C GRAY
Compound data structures for computer aided design:
a survey
Proceedings of the ACM 20th Anniversary Conference 1967
9 L G ROBERTS
Graphical communication and control languages
Information System Sciences Spartan Books 1964
10 K LOCK
Structuring programs for multiprogram time-sharing on-line
applications
Proceedings of the 1965 Fall Joint Computer Conference
11 D T ROSS
The AED approach to generalized computer-aided design
Proceedings of the ACM 20th Anniversary Conference 1967
12 D T ROSS
AEDJR: An experimental language processor
MIT Electronic Systems Laboratory Memorandum 211 1964

Automation in the design of asynchronous
sequential circuits*
by R. J. SMITH, II, J. H. TRACEY, W. L. SCHOEFFEL and G. K. MAKI
University of Missouri at Rolla
Rolla, Missouri

INTRODUCTION
Sequential switching circuits are commonly classified as being either synchronous or asynchronous.
Clock pulses synchronize the operations of the synchronous circuit. The operation of an asynchronous
circuit is usually assumed to be independent of such
clocks. The operating speed of an asynchronous circuit is thus limited only by basic device speed. One
disadvantage of asynchronous circuit design has been
the complexity of the synthesis procedures for large
circuits.
This paper describes a computer program1,2 which
automatically generates the complete set of design
equations for asynchronous sequential circuits. Many
of the algorithms employed are new and have been
shown to be much more practical than classical techniques for the synthesis of large circuits.
Minimum or near-minimum variable internal state
assignments are generated using two of the Tracey
algorithms.3 An evaluation procedure predicts which
of several codes generated will most likely yield the
least complex design equations. Next-state equations,
including don't-cares, are then produced without constructing transition tables. Output-state equations are
also generated. Finally, simplified normal form design
equations containing no static hazards are produced.
The program is capable of designing circuits much too
large to design manually.
The operation of a sequential circuit is often
described by means of a flow table.4 An example is
shown in Figure 1. The columns of a flow table represent input states, while the rows represent internal
states assumed by the circuit. Each flow table entry
specifies the next internal state and output which
result from the given input and internal states. When
the next-state equals the present state, that state is
said to be stable and is customarily circled. Unstable
*The research reported in this paper was supported in part by
the National Science Foundation through Grant GK-820.

states correspond to transitions within a flow table
column.

Figure 1 - Flow table

A sequential circuit is operating in fundamental
mode if the inputs are never changed unless the circuit is stable internally. If, in addition, each unstable
state leads directly to a stable state, the circuit is
said to be operating in normal fundamental mode. The
computer program described in this paper automatically generates design equations for asynchronous
sequential circuits operating in the normal fundamental mode. Circuit specifications are conveniently
input in flow table form. Input state binary codes are
specified by the program user. A summary flow chart
of the procedure followed by the synthesis algorithm
is shown in Figure 2.


The program used to implement the design procedure suggested above is written in PL/I. This language was chosen because of its bit-string data format, Boolean operations, and the controllable storage
feature. The program consists of about 2000 PL/I
statements divided into 8 subroutines. The system
was designed to run on an IBM 360/40 but could be
run on a somewhat smaller machine.


Figure 2 - The programmed synthesis algorithm

State assignment algorithm
The internal state assignment procedures employed
are a modification of those described by Tracey.3
Either completely or partially simplified flow tables
may be input to the program. Flow table simplification is not presently included in the synthesis procedure, but will be included in a later version of the program.
The Tracey state assignment algorithms are based
on the following theorem, which is reproduced without proof: "A row assignment to a flow table which
allots one internal state per row is satisfactory for the
realization of normal fundamental mode flow tables
without critical races if and only if for every transition
(Si, Sj): a) if (Sm, Sn) is another transition in the same
column, then at least one internal state variable partitions the pair {Si, Sj} and the pair {Sm, Sn} into separate blocks; b) if Sk is a stable state not involved in
any transitions in the column, then at least one internal
state variable partitions the pair {Si, Sj} and the state
Sk into separate blocks; and c) for i ≠ j, Si and Sj are
in separate blocks of at least one internal state variable partition."

Constraints generated as a result of applying the
above theorem may be listed in Boolean matrix form,
with each row corresponding to a partially specified
state variable. Consider, for example, the constraint
list generated by the flow table of Figure 1. The constraints shown below are generated on a per-column
basis in satisfying theorem parts a) and b):

    Constraint List        Boolean Matrix
                           1 2 3 4 5
    (14; 23)               0 1 1 0 -
    (15; 23)               0 1 1 - 0
    ( 3; 24)               - 1 0 1 -
    (24; 5)                - 0 - 0 1
    (25; 14)               1 0 - 1 0

The program forms all partitions associated with the
topmost stable state of column 1, then all irredundant
partitions due to the second state, and continues until
all stable states in the column have been examined.
The process is then repeated for the remaining columns.
Note that for the above example, none of the column-generated constraints partitioned flow table rows
1 and 4; the constraint (1; 4) was thus included in the
list to satisfy theorem part c). The program checks all
pairs of flow table rows, and generates additional partitions as required by c).
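As a rough illustration, constraint generation for one flow table column might be coded as follows (a Python sketch under assumed data structures, not the authors' PL/I; a column is taken as its list of transitions plus its uninvolved stable rows):

    from itertools import combinations

    def column_constraints(transitions, lone_stable_rows):
        """Yield partition constraints as pairs of row sets."""
        pairs = [frozenset(t) for t in transitions]
        # part a): separate every pair of transitions in the column
        for p, q in combinations(pairs, 2):
            yield (p, q)
        # part b): separate each transition from each uninvolved stable row
        for p in pairs:
            for s in lone_stable_rows:
                yield (p, frozenset([s]))

    # hypothetical column with transitions 4 -> 1 and 2 -> 3, stable row 5 idle:
    print(list(column_constraints([(1, 4), (2, 3)], [5])))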
The state assignment problem now becomes one of
finding a minimum number of internal state variables
satisfying all of the constraints just generated. This
problem has been shown to be analogous to the generation of maximal compatibles by the Paull-Unger algorithms for flow table simplification. A set of completely specified state variables, at least one of which
covers each constraint, corresponds to the maximal
compatibles. These completely specified state variables will be referred to here, therefore, as maximal
constraints. The selection of a minimum number of
maximal constraints, and hence minimum number of
internal state variables is similar to the covering problem in the Quine-McCluskey method for simplifying
Boolean equations. Details of these algorithms are
available in the literature and will not be given here.3,5,6
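The covering relation itself is simple to state in code. In the sketch below (illustrative Python), a completely specified state variable covers a constraint when it assigns one value to every row of the first block and the opposite value to every row of the second:

    def covers(y, constraint):
        """y: dict row -> 0/1; constraint: pair of row sets."""
        a, b = constraint
        va = {y[r] for r in a}
        vb = {y[r] for r in b}
        return len(va) == 1 and len(vb) == 1 and va != vb

    # y1 of Assignment #1 below (01100) covers the constraint (14; 23):
    y1 = {1: 0, 2: 1, 3: 1, 4: 0, 5: 0}
    assert covers(y1, ({1, 4}, {2, 3}))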
As in the Paull-Unger Method A,5 maximal constraint generation may begin with the assumption that
no constraints exist. Then each constraint is examined
for contradictions to this assumption and all implied
partitions are generated. After each constraint has
been so examined, one is left with the set of maximal
constraints. In another approach, Paull-Unger Method B, one begins with the list of constraints and seeks

to enlarge each through the complete specification of
the corresponding state variable until all enlargements are found that cover one or more of the original
constraints. Both procedures were programmed and
the second was found to be nearly a factor of two faster than the first in this application.
As stated previously, the selecting of a minimum
number of maximal constraints is similar to the covering problem in Boolean equation simplification. A
branching method has been used which is capable of
producing irredundant minimum-variable assignments. Operating speed of this algorithm is increased
by reorganization of the maximal constraint list, based
on the idea that those maximal constraints including
the largest number of the original constraints would
most likely be members of a minimal cover. Internal
representation of each maximal constraint is restructured in such a manner that the covering problem cannot be further simplified using column dominance
techniques.
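Stripped of the look-ahead and dominance machinery, the branching selection reduces to a search for the smallest covering subset, as in this illustrative sketch (exponential, so usable only for small tables):

    from itertools import combinations

    def minimum_cover(constraints, maximal_vars, covers):
        """Smallest subset of maximal_vars covering every constraint;
        covers(y, c) is a predicate such as the one sketched earlier."""
        for k in range(1, len(maximal_vars) + 1):
            for subset in combinations(maximal_vars, k):
                if all(any(covers(y, c) for y in subset) for c in constraints):
                    return list(subset)
        return None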
It is obvious that either of the two assignments below satisfies all of the constraints shown above:

    Assignment #1          Assignment #2
        1 2 3 4 5              1 2 3 4 5
    y1  0 1 1 0 0          y1  0 1 1 0 1
    y2  0 1 1 0 1          y2  0 1 1 1 0
    y3  0 1 0 1 0          y3  0 1 0 1 0

Furthermore, these are the only two significantly
different minimum assignments which successfully
code the Figure 1 flow table.
It has been found that even with certain look-ahead
provisions in the branching routine, generation of
minimum variable assignments becomes a time-consuming problem for typical flow tables of 12 rows or
more. A second and much faster algorithm has been
programmed. It is an approximate method, and generates near-minimum variable codes.
The fast algorithm reduces the Boolean matrix corresponding to the maximal constraints through the
use of an approximate reduction technique. A constraint is constructed which seems to include a large
number of matrix rows. The included matrix rows are
then removed. This process is then continued until
all rows of the original matrix are included in at least
one of the generated constraints. This reduced matrix
corresponds to a near-minimum variable state assignment.
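The approximate reduction can be mimicked by a greedy merge over the Boolean matrix rows, as in the sketch below (Python; the merge accepts a row or its complement, since a variable and its complement induce the same partition — the grouping heuristic of the actual program is more elaborate):

    def complement(row):
        return row.translate(str.maketrans('01', '10'))

    def merge(a, b):
        """Combine two partially specified rows if they are compatible."""
        for cand in (b, complement(b)):
            if all(x == y or x == '-' or y == '-' for x, y in zip(a, cand)):
                return ''.join(y if x == '-' else x for x, y in zip(a, cand))
        return None

    def greedy_reduce(rows):
        groups = []
        for row in rows:                     # fold each row into a group
            for i, g in enumerate(groups):
                m = merge(g, row)
                if m is not None:
                    groups[i] = m
                    break
            else:
                groups.append(row)           # no group fits: open a new one
        return groups

    rows = ['0110-', '011-0', '-101-', '-0-01', '10-10']
    print(greedy_reduce(rows))               # three variables suffice here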
The fast Boolean matrix reduction program usually
produces satisfactory assignments having less than
1-1/3 times the minimum number of variables. Assignment generation times for large flow tables may be reduced by two orders of magnitude using this approximate procedure. Near-minimal assignments have been
efficiently generated for flow tables having up to 75
specified next-state entries and 150 constraints with
approximately 15 minutes computer time on an IBM
360/50. Many satisfactory assignments are often generated. One of these may be selected by a test routine
or chosen by the designer. The test routine, to be
discussed below, chooses a "good" assignment for
reduced hardware realization.
Design equations

As the example above illustrates, the code generating algorithms frequently produce several satisfactory assignments. Generated codes may be evaluated
by a procedure due to Maki,7 which selects that assignment most likely to have simple next-state equations.
Consider, for example, the assignments and next-state
equations shown in Figure 3. Note that the next-state
equations for column I1 of Assignment #1 are much less
complex than those for Assignment #2.

Figure 3 - Partial flow table and assignments. Under input I1, Assignment #1 gives the next-state equations Y1 = y1 I1 + ..., Y2 = y2 I1 + ..., Y3 = I1 + ..., while Assignment #2 gives Y1 = (y1 + y2) I1 + ..., Y2 = y2 y3' I1 + ... and Y3 = (y2 + y3') I1 + ....

The test procedure searches for that assignment
with a maximum amount of reduced dependency in the
next-state equations. Two types of reduced dependency are easily detected from the assignment. First,
observe that Y3 is dependent only on the input in the
given example. This can be predicted by noting that
y3 has the same value, 1, for all stable states in the
column. A second observation is that Y1 is dependent
only on the input and the present state of y1. Similarly,
Y2 is a function only of y2 and the input. This can also
be predicted by simply noting that y1 and y2 are never
excited to change state for any transition under input
I1. In other words, one need simply observe that y1
and y2 have the same value for state pair (1, 4), state
pair (2, 3) and again for state pair (5, 6). Observe the
increased complexity of the next-state variables in
Assignment #2 of Figure 3 as a result of its failure to
insure reduced dependency. The programmed routine based on this method will evaluate each generated
state assignment for reduced dependency in just a few
seconds.
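A crude version of such a test is easy to express (an illustrative Python scoring function, not Maki's actual procedure; a column is assumed to carry its stable rows and its transitions):

    def reduced_dependency_score(variables, columns):
        """Count the reduced dependencies exhibited by an assignment."""
        score = 0
        for var in variables:                # var: dict row -> 0/1
            for col in columns:
                stable = {var[s] for s in col['stable']}
                if len(stable) == 1:
                    score += 1               # depends on the input alone
                elif all(var[u] == var[s] for u, s in col['transitions']):
                    score += 1               # never excited under this input
        return score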
Maki has also described a procedure for obtaining
next-state equations without construction of the traditional excitation matrix.7 An algorithm derived from
his method is presented here.
Each internal state transition may be associated
with a p-subcube of the n-cube defined by the input
and internal state variables. Furthermore, all of the
next-state entries of p-subcubes associated with a
single stable state will be identical, and equal to the
row code of the stable state. Consider, for example,
the application of Assignment #1 to column I1 of Figure 3, as shown in Figure 4.
In the transition between rows 2 and 3, all states
in the p-subcube y1y2 (y1 = 1, y2 = 1) must have the
same next-state entries, namely that of stable state 3,
110. A tabular form of p-subcube generation may be
illustrated as follows:

         y1 y2 y3
    (3)   1  1  0    Stable row code
     2    1  1  1    Unstable row code
          1  1  -    P-subcube resulting from transition
The transitions from rows 4 and 5 to stable state 1
define the remaining two p-subcubes listed in P_I1 of
Figure 4. Note that the Boolean sum of these
terms represents all next states requiring specified
entries under input I1.

Notice that if yi is 1 in the stable state row code
then next-state variable Yi = 1 for all states in p-subcubes associated with that stable state. For example,
since in row 3 y1 = 1, all states in the p-subcube y1y2
will have next-state variable Y1 = 1. In other words, all
p-subcubes associated with transitions to a stable
state will appear in the Boolean sum-of-products next-state equation Yi if digit i of the stable row code is one.
As the p-subcubes are generated by the computer
program, they are added to the appropriate next-state
1-set lists only if the corresponding next-state variable
is 1 in the subcube (see Figure 3). The final results are
(partially simplified) Boolean equations representing
the 1-cells of the next-state variables.
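That accumulation step might be sketched as follows (Python, with invented argument conventions):

    def next_state_one_sets(transitions, n_vars):
        """transitions: (stable_code, unstable_code) pairs for one input
        column; returns one list of subcubes per next-state variable."""
        one_sets = [[] for _ in range(n_vars)]
        for stable, unstable in transitions:
            cube = ''.join(s if s == u else '-' for s, u in zip(stable, unstable))
            for i, bit in enumerate(stable):
                if bit == '1':               # digit i of the stable row code
                    one_sets[i].append(cube)
        return one_sets

    print(next_state_one_sets([('110', '111'), ('000', '001'), ('000', '010')], 3))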
The synthesis program also generates the output
equations of the sequential circuits. The output corresponding to a given stable state is also associated with
all unstable states leading to the stable state. All p-subcubes generated previously are grouped according
to stable state. If an output variable is 1 for a particular stable state, the associated p-subcubes become a
partial list of 1-sets under the corresponding input.
The output 1-sets for column I1 in Figure 1 are shown
as sums in Figure 3.
To permit further simplification of the design equations generated above, it is desirable to compute the
unspecified entries for all equations. Fortunately,
unspecified p-subcubes are common to all the design
equations. A Boolean equation for don't-care entries
is generated by simply taking the complement of the
available equation for specified entries (see equation
P_I1 in Figure 3).
Complementation of a Boolean sum-of-products expression may be performed by complementing the
expression, multiplying out the result, then simplifying the resultant sum-of-products expression to obtain
the solution. The procedure used here is a modification of that method. Simplification illustrated by

    A · (A + B + C) = A    and    A · (A' + D + E) = A · (D + E)

is performed both before and during the multiplication of the product-of-sums expression. Redundant terms
are also deleted. A brief example will perhaps illustrate the method employed. Figure 5 shows complementation of the sum of p-subcubes shown as P_I1 in
Figure 3.

Figure 4 - Partial flow table, specified p-subcubes and 1-sets
A normal-form Boolean equation for each next-state
variable may be obtained by combining the don't-care
terms found above with the appropriate next-state 1-sets. Since the output associated with don't-care internal states may be assumed to be unspecified, the
output-state equations also include the same don't-care terms.


P_I1  = y1y2 + y1'y2' + y1'y3'
P_I1' = (y1' + y2') · (y1 + y2) · (y1 + y3)
      = (y1y2' + y1'y2) · (y1 + y3)
      = y1y2' + y1'y2y3

Figure 5 - P-subcube complementation and simplification
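The complement-and-multiply procedure, with the two absorption rules applied during the expansion, can be rendered compactly (an illustrative Python sketch of the technique, not the PL/I routine):

    def complement_sop(products):
        """products: list of dicts var -> 0/1. Returns the complement as a
        simplified list of products."""
        result = [{}]                                # the constant 1
        for prod in products:
            literals = [(v, 1 - b) for v, b in prod.items()]   # De Morgan
            new = []
            for partial in result:
                for var, val in literals:
                    if partial.get(var) == 1 - val:  # x * x' = 0: drop term
                        continue
                    term = dict(partial)
                    term[var] = val                  # x * x = x
                    new.append(term)
            uniq = list({frozenset(t.items()): t for t in new}.values())
            result = [t for t in uniq                # A + AB = A
                      if not any(frozenset(s.items()) < frozenset(t.items())
                                 for s in uniq)]
        return result

    # P_I1 = y1y2 + y1'y2' + y1'y3'  ->  y1y2' + y1'y2y3, as in Figure 5:
    print(complement_sop([{'y1': 1, 'y2': 1},
                          {'y1': 0, 'y2': 0},
                          {'y1': 0, 'y3': 0}]))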

The program then finds prime implicants of each
design equation produced above. A conventional consensus algorithm is used and will not be presented
here.
A covering algorithm is used to find simplified, but
not necessarily minimal, design equations. Instead of
covering the 1-cells of a design equation the program
covers the 1-sets originally generated from flow table
columns. (Recall that a 1-set is a subcube containing
one or more vertices, or 1-cells, for which the expression is 1.) The problem of generating and covering a
large number of 1-cells is thus avoided. More importantly, it can easily be shown that by covering the 1-sets, all static hazards associated with vertical flow
table transitions are eliminated from combinational
circuit outputs. A static hazard exists when there is a
transition between a pair of adjacent states having the
same output, during which it is possible for a momentary improper output level to occur. Using two-level
AND-OR synthesis, if each product (prime implicant)
covers only 1-sets, all transitions within that 1-set are
static-hazard-free; static hazards may only be caused
by input-state changes which correspond to horizontal
transitions on the flow table.
A procedure for eliminating the remaining "horizontal" hazards has been included. It is based on the
restriction that only one input-state variable at a time
changes. All pairwise combinations of a design equation's products (prime implicants) are examined for
horizontally adjacent 1-sets. If such an adjacency is
found, a static hazard exists. Since a horizontal transition may only originate at a stable state, the static
hazard cannot possibly cause a malfunction unless
one of the 1-sets includes a stable state.
Consider, for example, the illustration shown below,
which is the simplified design equation for Y2 of Figure 1, using Assignment #1:

    Y2 = y3'v'w + y2v + y1w'

(where the input variables are v and w). Note that the
horizontally adjacent 1-sets y3'v'w and y1w' appear
as the first and third terms. If any stable
state has a code in the subcube y1y3'v' then a static
hazard exists which may cause a malfunction. Note
that stable state 3 in columns I1 and I2 both satisfy
this condition.
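The pairwise adjacency test amounts to counting complementary positions between two cubes (an illustrative Python check over the assumed variable ordering y1, y2, y3, v, w):

    def adjacent(a, b):
        """Cubes over '0'/'1'/'-' are adjacent when exactly one position
        holds complementary specified values."""
        clashes = sum(1 for x, y in zip(a, b)
                      if x != '-' and y != '-' and x != y)
        return clashes == 1

    assert adjacent('--001', '1---0')        # y3'v'w and y1w' clash only in w
    assert not adjacent('-1-1-', '1---0')    # y2v and y1w' merely intersect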

Program performance
Execution times obtained using the program described here depend on hardware and software efficiencies, as well as the complexity of the input flow
table. The solution times stated here were obtained
using an IBM 360/50 computer and the IBM Release
13 PL/I Compiler.
Simplified design equations for flow tables of 6
rows by 4 columns (24 cells) have been produced in 45
seconds to 4 minutes, depending on problem complexity. Eight row by 4 column tables usually are solved in
1.5 to 8 minutes. Three assignments for a 12 x 4
(48 cell) flow table have been produced in about 8
minutes, with next-state equations (unsimplified)
generated in 3 minutes per assignment. Two satisfactory codes for an 18 x 4 (72 cell) flow table were found
in 15 minutes. Computation times for large problems
have been found to be extremely problem-dependent.
SUMMARY
A description of a programmed algorithm for the synthesis of normal fundamental mode sequential circuits
has been presented. The program permits the logic
designer to input his asynchronous sequential circuit
specifications in the form of a flow table and obtain all
next-state equations and output equations in the form
of simplified sum-of-products. Two internal state assignment algorithms are available to the designer. One
will generate a minimum-variable assignment but may
be lengthy to execute while the other will execute
much faster but guarantees only a near-minimum variable solution. A testing routine is then available to
aid the designer in deciding which of several satisfactory state assignments will tend to reduce the complexity of the design equations. A "good" assignment will
be selected and simplified next-state and output equations will be generated based on the selected assignment. The complete program has been written in PL/I
and is running on an IBM 360/50 computer at the University of Missouri at Rolla. Flow tables with up to 75
specified next-state entries have already been run
and much larger flow tables will soon be generated
for experimentation purposes.
REFERENCES
1 R J SMITH II
A programmed synthesis procedure for asynchronous sequential circuits


Masters Thesis University of Missouri at Rolla 1967
2 W L SCHOEFFEL
Programmed state assignment algorithms for asynchronous
sequential machines
Masters Thesis University of Missouri at Rolla 1967
3 J H TRACEY
Internal state assignments for asynchronous sequential
machines
IEEE Transactions on Electronic Computers Volume EC-15
pp 551-560 August 1966
4 D A HUFFMAN
The synthesis of sequential switching circuits
Journal of the Franklin Institute vol 257 pp 151-190 and

275-303 March and April 1954
5 M C PAULL S H UNGER
Minimizing the number of states in incompletely specified
sequential switching functions
IRE Transactions on Electronic Computers vol EC-8 pp 356-367 September 1959
6 E J MC CLUSKEY
Minimization of Boolean functions
Bell System Technical Journal pp 1417-1443 November 1956
7 G K MAKI
Minimization and generation of next-state expressions for
asynchronous sequential circuits
Masters Thesis University of Missouri at Rolla 1967

Interpretation of organic chemical formulas by computer *
by ALBERT N. DeMOTT
Computer Research
Rockville, Maryland

*This paper is contribution No. 330 from the Army Research
Program on Malaria.

INTRODUCTION
Over the last few years, a frequently discussed
problem in the area of chemical information systems
has been the need for some means by which chemists
could communicate with the system in terms of their
normal chemical language, the structural formula,
rather than requiring them to use special, machine-oriented notations. The Walter Reed Army Institute
of Research (WRAIR), as part of its Chemical
Structures Storage and Retrieval System,1 has
developed an economical and effective computer
program to analyze structural formulas as normally
written by chemists, producing as output a detailed
description, in machine-oriented format, of the atoms
in the molecule and their connections to each other.
In principle, any trained chemist can prepare compounds for entry in the system master file, or questions
for searching it, without any special training in the
WRAIR system. The program is now being used in
daily operations and is, we believe, the only operational program capable of performing this function
without major restrictions on the formulas which can
be accepted. As a special case of the general problem
of the man-machine interface, the program may well
be of interest outside the chemical field, particularly
since many of the techniques used have no essential
relation to chemistry.
The operational cost of this facility compares
favorably with the cost of preparing the connection
tables,2 fragment lists, systematic names,3 or other
special notations4 required by many chemical retrieval systems. Execution time on an IBM 7094
computer averages about five to seven minutes per
thousand compounds. In the environment in which
the program is currently operating, preparation of
input to the program requires one and a half to
two minutes of clerical time per compound, and about
five minutes of 7094 time per thousand compounds.
The program accepts well over 95% of the chemically
correct structures presented to it, and the accuracy of
interpretation of those accepted (excluding compounds for which warning messages are issued)
closely approaches 100%. The only limitations on the
freedom of the chemist in writing formulas are the
following: (1) Organic, rather than inorganic, chemical
conventions must be followed where the two systems
differ. (2) The structural formula must be given in
enough detail to resolve any ambiguities which
might normally be resolved by the context of discussion. (3) A few specialized types of compounds
(such as polymers, coordination compounds, and
stereoisomers) cannot be handled. (4) In a few cases
of variant usage, the chemist is restricted to one of
the options normally open to him; in general, however,
the program will handle all or most of the conventions
commonly used.
Background

The number of known organic compounds has
increased rapidly over the last ten or fifteen years,
creating a critical need for rapid means of retrieving
information about compounds related to the compound a chemist may be currently studying. For
example, an urgent problem in the medical field,
at present, is to find new anti-malarial drugs which
will be effective against the drug-resistant strains
of the malaria parasite which have appeared recently
in southeast Asia. When a research worker finds a
compound with some effectiveness in treating the
disease, his first need is to obtain information about
the biological activity of known related compounds,
as a guide to determining what modifications to the
molecule might offer promise of increasing the activity
of his potential drug. A manual search of files containing several hundred thousand compounds is impractical, no matter how well cross indexed they may be,
since the portion of a molecule which is relevant
for one search is likely to be irrelevant for nearly
all others. In the example just cited, in fact, the
question of which portion of the molecule is relevant

is precisely the question the search is intended to help
answer.
To solve this problem, WRAIR has developed a
computerized storage and retrieval system which
allows the user to specify any chemically valid
structure or portion of a structure. The system
will then retrieve all compounds in the file which
contain that structure as a part of their molecules.
Retrieval is on the basis of a successful mapping of
the structure of the question into the structure of
the compound on file. In order to permit such a
mapping the file entries must contain, and input must
provide, a specification of the characteristics of
each atom in the molecule and a specification of all
the pairs of atoms which are bonded directly to each
other (with the nature of the bond given in each
case). On the other hand, the structures of new
compounds for the file, whether obtained from
published catalogues, submitted by the chemists
who have synthesized them, or gotten from some
other source, will nearly always be specified originally by means of conventional chemical formulas.
The formulas can be, and in the past have been, converted to an atom-by-atom and bond-by-bond format
by manual methods, but this is a tedious task which
must be performed by trained chemists. Furthermore, the original formula is lost in this process and is
not available at retrieval time. The program which
is the subject of this paper was created to bridge
this gap. The preparation of input can be entrusted
to relatively unskilled clerical personnel, since
their only task is to copy the chemist's original
drawing. Furthermore, since the original formula
is provided to the system in a binary coded form,
it can be conveniently included in the master file
entry for the compound and printed on a line printer
at retrieval time. It can also be provided to the user
during initial processing in conjunction with rejection
messages and warnings of ambiguities and suspected errors.
In the current operational environment of the program, input is prepared by typing on a chemical typewriter.5 The paper tape output from the typewriter
is converted to magnetic tape and processed by computer into line-by-line order. Output from the program
consists of a fairly conventional connection list.
The details of this input and output are beyond the
scope of this paper, however, since they do not
affect the logic of the program. Relatively minor
coding changes would suffice, in fact, to provide
output in other formats (such as a connectivity
matrix) or to accept input prepared in other ways,
provided the input represents a line-by-line image
of the formula and preserves the original geometric
relations.

The problem
Organic chemical formulas are a language which
has grown up over the past 100 years with little
attempt at standardization. Its "rules of grammar"
have never been codified, and must be deduced
from the actual practice of chemists. In principle,
a formula is a conventionalized picture of a molecule
as projected on a plane, but the emphasis must be on
the word "conventionalized." Each atom is represented by an element symbol consisting of one
or two letters, and the connections between atoms
are indicated by straight lines (single, double, or triple
according to the nature of the bond) connecting two
element symbols. In practice, however, the name
"structural formula" is misleading. Only the major
outlines of structure are shown by means of bond
lines - the details must be inferred by the reader.
For example, the characters "SO2" in an organic
(but not in an inorganic) formula mean that two
oxygens are each double bonded to a sulfur atom
and that the sulfur atom in turn is single bonded to
each of two other atoms in the molecule. (The bond
lines for the latter bonding may or may not be written.) If you ask a chemist why "SO2" represents
this structure and no other, the answer will be, in
substance, "Because it does." Chemically it is perfectly possible for two oxygens to be single bonded to
a sulfur atom with each oxygen single bonded in
turn to some other atom, and such structures do in
fact occur. They are never, however, represented
by "SO2." Most "structural" formulas include
lengthy strings of element symbols, subscripts,
parentheses, and brackets. Their structure is obvious
to a chemist, but not at all apparent to a layman.
The problem of interpreting chemical formulas is
therefore twofold: First, the program must be able
to trace the chains and rings formed by bond lines,
occasionally in extensive patterns resembling chicken wire. Second, it must be able to determine the structures implied by strings of symbols which give no
explicit indication of the mutual relations of the
atoms represented.

Basic procedure of the program
Input to the program consists of structural formulas
whose individual characters have been arranged
in line-by-line order, including all blanks within
each line. One formula is read in, stored in a two-dimensional matrix, and a starting node is chosen
arbitrarily. For the purposes of the program, a node
may be a string of symbols, but for simplicity let
us assume that all nodes in the structure are single
atoms with or without attached hydrogens. The
atom and its characteristics (including the number
of attached hydrogens, if any) are recorded. A dot
at the corner of a ring structure is interpreted to
represent a carbon atom with enough attached
hydrogens to make up its full valence of four. Next,
all adjacent matrix cells are checked for bonds
pointing to the node. Each bond found is traced and
the matrix location of the atom at its far end is entered
in a table of unprocessed nodes. This entry also
records the nature of the bond and identifies the atom
at the node currently being processed. The location
of the node table entry is then stored in the matrix
cell at the far end of the bond.
When all bonds pointing to the node have been
traced, the valence of the atom at the node is checked
to make sure it agrees with the total of the bondings
shown. A new starting point is then chosen by taking
an entry from the node table, and the new node is
processed in the same manner. If a matrix cell
containing a bond pointing to the node is found to
have a node table reference in it, the information in
the entry is used to record the bonding between
the atoms at the two nodes. The table entry is then
erased. Processing of the molecule is complete
when no entries remain in the node table.
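Reduced to its bookkeeping, the tracing loop is a worklist algorithm. The sketch below (Python) assumes the formula has already been reduced to adjacency information, and replaces the node-table-reference-and-erase mechanism with a simple set, so each bond is still recorded exactly once:

    def trace_molecule(start, neighbours):
        """neighbours(node) yields (other_node, bond_kind) pairs."""
        bonds = set()
        node_table = [start]
        seen = {start}
        while node_table:                 # done when no entries remain
            node = node_table.pop()
            for other, kind in neighbours(node):
                bonds.add((frozenset((node, other)), kind))
                if other not in seen:
                    seen.add(other)
                    node_table.append(other)
        return bonds

    mol = {'C1': [('C2', 1)], 'C2': [('C1', 1), ('O', 1)], 'O': [('C2', 1)]}
    print(trace_molecule('C1', lambda n: mol[n]))   # the two single bonds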
Interpretation of strings of symbols

When a node consists of a string of element symbols, subscripts, brackets, and parentheses, the problem becomes vastly more complicated. The bulk of
the coding in the program is devoted to handling this
problem.
Two basic approaches to the problem of interpreting such strings were considered in designing
the program. The first approach would be to analyze
strings entirely by program logic, taking each element
symbol separately and inferring the relation of its
atoms to the other atoms in the string. In terms of
an analogy with natural languages, it represents
interpreting a sentence word by word, allowing for
all the changes in meaning of a given word which
can be produced by changes in context, and for the
variation in the relation between two words which
results from changes in their relative positions and
the presence or absence of other words in the sentence. In the case of chemical formulas, one of the
major difficulties in this approach is the fact that
most chemical elements can take on any one of
several different valences (i.e., have different numbers
of bonds).
A second approach to interpreting strings would
be, using the linguistic analogy, to analyze the sentence in terms of phrases instead of individual words.
Chemically, this would mean defining a set of glyphs
(i.e., groups of element symbols and auxiliary characters), each of which would represent one and only
one arrangement of atoms and bonds. Many such
glyphs can, in fact, be defined, and the approach
offers obvious advantages from the standpoint of
simplicity of programming. The approach was used
with considerable success by E. B. Gasser and
C. W. Gregory at Colgate-Palmolive Company in
designing and implementing for a small computer
an experimental predecessor to the present program.
The final decision, however, was in favor of using
program iogic exciusiveiy, and experience has confirmed that the decision was a good one. First, it
was found that a number of lengthy glyphs would need
to be defined, with many glyphs being subsets of
longer ones. This would require lengthy table searches
and repetitious processing of element symbols- as
overlapping fields were tested successively against
the glyph table. Program execution would be slow .
Second, the number of glyphs to be defined would
be large (probably on the order of 1,000), and few
would be used by chemists with absolute consistency.
To require a chemist to consult such a glyph list
to ensure that his structure would be interpreted
correctly would defeat the primary goal of the program, and would open wide opportunities for errors.
Third, and most vital, a study of a set of representative formulas led to the conclusion that it would in
fact be possible to abstract a set of rules simple
enough to be practical from a programming standpoint, and universal enough to insure reliable operation of the program. Furthermore, it appeared possible
to define criteria for identifying genuinely ambiguous
formulas and either rejecting them or issuing a warning to allow the chemist to check the program's
interpretation. The flexibility of the program logic
approach appeared to be more valuable than the
definiteness of the glyph approach.
The original set of rules turned out, not unexpectedly, to be thoroughly inadequate; but progressive
refinement as problems became apparent has produced, with no changes in the basic logic, a program
which comes very close to meeting the goals originally defined. At present, two typewritten pages
are sufficient to specify the conventions which
chemists must observe in writing formulas for input
to the system.
A complete description of the rules used to interpret strings of element symbols is beyond the scope
of this paper, but the basic principles are as follows:
1. Since Western languages are written from
left to right (and most chemists are Westerners)
strings are usually written, and can usually be analyzed, from left to right.
2. The bondings on each atom will exactly equal
one of its normal valences, unless another valence
has been specified in the formula. Except for oxygen
groups (see rule 5 below), the valence will be the
lowest compatible with the valences of surrounding
atoms.
3. A string, since it resembles a chain in appearance, will normally represent a chain structure
(with or without side branches) and, except in connection with oxygen groups, it will not contain
any rings. A straight chain should be preferred over
a chain with branches, where both are possible.
4. Each string represents a single molecule, or
portion of a molecule, and each atom in the string
must be bonded directly or indirectly to every other
atom in the string.
5. Oxygen, particularly subscripted oxygen, is
most likely to be bonded as a side atom, rather than
as part of the main chain, even if this requires assigning the atom to which it is bonded a valence higher
than its lowest normal valence.
6. Triple bonds are rare, and a pattern of one
double and one single bond is preferred over a
triple bond.
In processing a string, the general procedure is
to take each atom in turn (treating subscripted symbols other than oxygen as if an equivalent number
of symbols had been written side by side). First,
any written bonds approaching the atom vertically
or from the left are traced and their value is subtracted from the valence of the atom. (Bondings
are made or entries added to the bond table in the
same way as for structures shown in full detail.)
Next, the valences of the atom are used to satisfy
all unsatisfied valences remaining on previous atoms
in the string, unless all atoms in the string so far have
the same valence and no written bonds are present
on any of them. In the latter case, the atom will be
left unbonded. Last, any bonds approaching the atom
from the right will be used to satisfy valences on
whatever atom still has unsatisfied valences. This
will not necessarily be the atom being processed,
but may well be an earlier one. The program then
requires that unsatisfied valences be left on at least
one atom in the string, unless the end of the string
has been reached. In the latter case, all valences
must be satisfied. If at any stage of processing the
valences on the atoms are such that the above rules
cannot be followed, the valence of one of the atoms
involved is raised to its next higher value.
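A toy rendering of this left-to-right valence bookkeeping is given below (Python; written bonds, the oxygen rule, valence raising and the equal-valence exception are all omitted, and attached hydrogens are taken as given):

    VALENCE = {'C': 4, 'N': 3, 'O': 2, 'S': 2}

    def chain_bonds(atoms):
        """atoms: list of (element, attached_hydrogens), left to right.
        Returns bonds as (i, j, order) triples."""
        bonds = []
        dangling = []                       # (index, unsatisfied valence)
        for i, (sym, h) in enumerate(atoms):
            free = VALENCE[sym] - h
            while dangling and free:
                j, need = dangling.pop()
                used = min(free, need)      # 1 = single bond, 2 = double...
                bonds.append((j, i, used))
                free -= used
                if need > used:
                    dangling.append((j, need - used))
            if free:
                dangling.append((i, free))
        return bonds

    print(chain_bonds([('C', 3), ('C', 2), ('O', 1)]))  # CH3-CH2-OH
    print(chain_bonds([('O', 0), ('C', 0), ('O', 0)]))  # O=C=O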
In addition to the main processing routine described
above, three special routines are provided to deal
with (1) oxygen groups, (2) alkane chains of the form
CnHm, and (3) groups inverted from their natural
order because they occur at the left end of a string.

Parentheses and brackets
Although several special usages occur, and are
provided for by the program, brackets and parentheses are most often used in one of two ways:
(1) The parenthetic group may represent one or more
branches from the main chain of the string. This is
referred to as "fanwise bonding." If the parentheses
carry a subscript, all the groups represented will
be bonded to an atom or atoms in the main chain.
If an oxygen group precedes, each parenthetic group
will be bonded to a different atom within the oxygen
group. Otherwise, all groups must be bonded to the
same atom. (2) The group may represent a unit which
is repeated as part of the main chain. This is called
"chain bonding." The groups are bonded to each
other in a chain, with the first group bonded to a
preceding atom in the string and the last group
bonded to an atom which follows.
The two usages are distinguished by counting what
might be called "handles" on the group. If the group
has only one handle, it is bonded fanwise. If it has two
handles it is chained. Handles are counted by first
processing the group as if it were a string in itself,
reserving one valence on the first atom in the group
if the group is not at the beginning of the whole
string. A handle is then defined as (1) a reserved
valence, (2) unsatisfied valences on anyone atom in
the group, or (3) a written bond extending to some
atom outside the group.
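In code, the handle count is a simple sum (an illustrative Python fragment; how the program treats other handle counts is not described here, so they are shown as a rejection):

    def bonding_mode(reserved_valence, unsatisfied_per_atom, written_bonds_out):
        """reserved_valence: bool; unsatisfied_per_atom: leftover valences,
        one entry per atom; written_bonds_out: bonds leaving the group."""
        handles = (int(bool(reserved_valence))
                   + sum(1 for v in unsatisfied_per_atom if v > 0)
                   + written_bonds_out)
        if handles == 1:
            return 'fanwise'
        if handles == 2:
            return 'chain'
        return 'reject'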
Parentheses and small brackets are expanded
and processed when the end of the group is reached
in normal processing of the string. Large brackets
(which may enclose substructures rather than single
strings) are processed when interpretation of the
structures within them has been completed.
SUMMARY
The program described in this paper meets a need
which has been recognized for a number of years, by
allowing communication between chemists and
computers in terms familiar to all trained chemists.
Certain limitations still exist, but our experience
has been that when a formula must be rewritten to
meet these limitations the result is nearly always a
formula which chemists consider better chemically.
Since these same formulas are used as part of the
output for searches, this can be a distinct advantage.
The program is economical in operation, and some
two years of use have shown it to be reliable and
subject to progressive refinement.
ACKNOWLEDGMENTS
All work on this program has been supported by the
U. S. Army Medical Research and Development
Command.
The original design work was performed by the
author at the Service Bureau Corporation and was
initially implemented by him and by other employees
of the Corporation working under his supervision.
Some modifications were made by the author while
at Computer Applications Incorporated, and a
major revision was carried out by him at Computer
Research.
REFERENCES
1 D P JACOBUS D E DAVIDSON A P FELDMAN
J A SCHAFER
Experience with the mechanized chemical and biological
information retrieval system
Presented before the Division of Chemical Literature


American Chemical Society Chicago Illinois
September 1967
2 W S HOFFMAN
An integrated chemical structure storage and search system
operating at Du Pont
Presented before the Division of Chemical Literature
American Chemical Society Chicago Illinois
September 1967
3 G G VANDERSTOUW I NAZNITSKY J E RUSH
Procedures for converting systematic names of organic
compounds into atom-bond connection tables
Journal of Chemical Documentation vol 7 no 3 1967
4 E HYDE F W MATTHEWS L H THOMSON
Conversion of Wiswesser notation to a connectivity matrix
for organic compounds
Presented before the Division of Chemical Literature
American Chemical Society Miami Beach Florida
April 1967
5 A FELDMAN D B HOLLAND D P JACOBUS
Automatic encoding of chemical structures
Journal of Chemical Documentation vol 3 no 4 1963

A simulation in plant ecology
by RAYMOND E. BOCHE
Texas Technological College
Lubbock, Texas


INTRODUCTION
The purpose of this paper is to present some results
from a preliminary study investigating the application of the computer sciences to problems in plant
ecology. Results include a model which simulates the
growth of a forest in a particular time dependent environment and an implementation of that model using
a digital computer and assumed data. At present the
model is somewhat restricted in level of detail and
range of applicability. It is, however, believed to be a
pioneer in plant sciences, and further study will surely
suggest directions for refinement in level of detail. The
range of applicability is constrained by factors not
modeled to a young, growing forest of natural occurrence in a particular environment. Validity is anticipated, not for individual trees, but for the entire forest
presented as an ecological system.


Purpose and scope

The general purpose of the model described here is
to investigate the feasibility of computer simulation of
plant growth processes. The specific model developed
devotes unusual attention to adaptability and flexibility in order to provide ready means of incorporating improvements and modifications suggested by
plant physiologists or plant ecologists. This approach
recognizes the strongly qualitative and descriptive
aspects of these sciences and makes allowance for the
shortage of quantitative knowledge and relationships
required by the model.
The specific model, a young, growing forest of
natural occurrence in a particular environment, will
allow us to predict changes in composition of a forest
as influenced by the ecological interactions. The
model includes and accounts for three of the five principal limiting factors in plant ecology: light, temperature, and moisture. The exclusion from consideration
of the remaining two factors, soil fertility and soil
type, limits the application of the model to forests of
natural origin, thus assuring some degree of soil type
compatibility for species present. Excluding soil
fertility limits the scope of the model to relatively
short periods of time during which soil fertility remains relatively constant and uniform in its effect on
species present.
It should be emphasized that the model is for growth
in a natural environment; consequently, it does not at
present encompass such considerations as probability
of seedlings started, human intervention, or other
extraordinary influences such as fire.

Feasibility

Before embarking on this project, it is appropriate to
investigate its feasibility. Toward that end, it would
seem that three major areas must prove amenable to
computation if the model is to satisfy our general
purpose.
First, growth must be predictable from environment.
Given a plant of known heredity and of known
previous history, it is possible, by applying the
principles of plant physiology, to predict with considerable certainty the physiological reactions
which will be evoked in that plant upon its exposure to a given complex of environmental conditions.

Second, ecological interaction must be predictable given a particular flora subjected to a particular environment.
Ecological measurement has been sufficiently
perfected to give material aid in predicting the
hazards to be encountered in critical areas under
various types of land use and management.
Third, the interactions must be "modelable" and
"computable."
Important in plant ecology is the principle of
limiting factors, which says in effect that the least
favorable of conditions present will prove epistatic. In particular, photosynthesis, or plant metabolism, will be controlled by the least favorable of
soil fertility, soil type, light, temperature, and
moisture.

The three above definitive statements were abstracted from the Encyclopaedia Britannica and, together, give strong indications of feasibility for the
proposed study.

Basic model structure
1. General flow

A forest, for our purposes, will be defined as two or
more established trees that interact ecologically with
one another. The first step in the model is to initialize a
particular forest (through observation or assumed data)
and measure its initial composition.
The second step is to apply an amount of growth
resources (temperature, light, and moisture) determined as a result of a simulated period of climate.
At each time step of the model, the moisture added
is allocated to individual trees. The temperature,
light, and moisture that would otherwise be available
during the climate period are then modified by the
influence of neighboring trees. The actual moisture
availability determined is based upon moisture evaporation rates influenced by light and temperature, previous moisture present, and potential losses to the subsoil during the period.
Finally, growth takes place at a rate determined by
moisture, temperature, and light. Growth rates used
vary with individual species and give account of current season and present state of maturity. Moisture
used in growing is removed from the soil.
Composition is measured, as requested by input
data; the processing of successive climate periods
continues until completion.

2. Simplifications and analogs

The particular sample of a forest selected or assumed will be a straight line. In the case of observed
data, a strip of some appropriate, but as yet undetermined, width will be selected; all trees within that strip
will be assumed to occur on a single straight line.
This simplification is believed to be essential to the
computational feasibility of the project and causes no
significant detraction from realistic representation
of the physical world. To simplify calculations the
line is chosen in a north-south direction. Thus we will
be modeling a "two-dimensional" forest. It should be
recalled that we are interested in the ecological system, rather than individual trees. Such a simplification may cause loss of information concerning individual trees, but the loss of information to the overall
model should prove to be well within the bounds of
significance in the best climate sub-model we can hope
to construct, or in the best plant growth models we
can conceive.

An important model analogue used is the depicting of shading by an angle (determined by species,
season, and latitude) with shade occurring to the north
during the daylight hours within the area of the right
triangle specified by the height of a tree and its shading angle (Figure 1). The effects of shade on temperature and light (and hence on evaporation and
growth) are modeled by analogy. The single angle
selected will result in less area of shade than actually
exists during the early morning and late afternoon
hours; but in so doing, account is taken of the much
reduced heat and light intensities occurring at those
hours.

Figure 1 - Shade angle

Other simplifications and analogies are, in general,
less crucial to the model and are encountered primarily
in level of detail.
3. Level of detail

Every reasonable attempt has been made at every
point to select a level of detail commensurate with
the return of enlightening information and to provide
flexibility in model construction and implementation,
allowing modular adaptability over a broad range of
detail levels. We are unlikely to have achieved an
optimum level of detail in this early model. Also, since
a major purpose of the study was to investigate feasibility, it has been appropriate in some cases to quite
consciously avoid detail that would tend to improve
validity but have little bearing on feasibility. An example is that the present model considers all moisture
below the surface as a single number for each tree.
The number indicates a total amount available without regard to root zones or soil stratification.
The climate model is, at present, a yearly cycle of
temperature, moisture, and light means, each quantified as a single number occurring per time period.
The time period selected was one month with model
runs extending for ten years of simulated time.


Principal model features
1. Climate
During each period of simulated time the climate
determines the quantities of additional resources made
available for growth. The "model" of climate used
here is simply a table depicting typical precipitation, temperature, and light during each month.
Precipitation is moisture added in centimeters during the month. Light is the approximate number of
hours of daylight per day reduced slightly during
months normally experiencing considerable cloud
cover or fog. Temperature was entered as a single
number determined for each month as follows: Mean
daily high and low temperatures for the month were
assumed to occur at 1 p.m. plus 20% of time until
darkness and 1 a.m. plus 90% of time until daylight
respectively. It was assumed that temperature rises
and falls linearly from high to low during the day.
The positive portion of the above temperature function reduced by 50°F. was integrated between the
limits, daylight and darkness, to determine the single
temperature input for the month.
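The monthly temperature figure can be reproduced numerically as follows (a Python sketch; the midpoint-rule discretization is an assumption of this illustration, not part of the paper):

    def monthly_temperature_input(high, low, daylight, darkness, steps=1440):
        """high/low: mean daily temperatures (F); daylight/darkness: hours."""
        t_high = 13 + 0.2 * (darkness - 13)   # 1 p.m. + 20% of time to dark
        t_low = 1 + 0.9 * (daylight - 1)      # 1 a.m. + 90% of time to light

        def temp(t):                          # piecewise linear daily curve
            if t_low <= t <= t_high:
                return low + (t - t_low) / (t_high - t_low) * (high - low)
            span = (24 - t_high) + t_low      # falling branch wraps midnight
            return high - (high - low) * ((t - t_high) % 24) / span

        dt = (darkness - daylight) / steps
        return sum(max(temp(daylight + (i + 0.5) * dt) - 50, 0) * dt
                   for i in range(steps))

    print(monthly_temperature_input(high=85, low=60, daylight=6, darkness=20))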



2. Shade

During the daylight hours shade occurs to the
north of each tree. As noted above, an angle, θ, is used together with the tree height to determine a triangular area in which shade occurs. The shade angle, θ, will generally be smaller for evergreen than for deciduous trees except during those periods of dormancy in which the shade angle is reduced for the
leaf shedding deciduous species. If any other tree
exists wholly or partially within the shaded area,
its light and temperature environment, and hence, its
growth rate, is modified. The "shade factor," a number between 0 and 1 indicating the percent shaded, is
determined for each tree at each time step by the
following procedure.
After resetting all shade factors to zero, perform the following computation for each tree in the system, working from south to north (Figure 1).
1. For tree n, $d_1 = h_1 \tan\theta$.
2. Find $d_2$ for the next tree to the north.
3. If $d_1 \le d_2$, begin again at step 1 with tree n + 1.
4. $h_3 = (d_1 - d_2)/\tan\theta$.
5. $h_3 = \min(h_3, h_2)$.
6. Shade factor = $\max(h_3/h_2,\ \text{present shade factor})$.
7. Return to step 2 above to see if more than one tree to the north is shaded by the present tree.
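In code, the sweep reads as follows. This is a minimal sketch of the procedure under the stated geometry (our illustration, not the authors' program); trees are (position, height) pairs ordered from south to north.

```python
# Minimal sketch of the shade-factor sweep described above (not the
# authors' program). Trees are (position, height) pairs, south to north.
import math

def shade_factors(trees, theta_deg=45.0):
    tan_theta = math.tan(math.radians(theta_deg))
    factors = [0.0] * len(trees)                    # reset all shade factors to zero
    for n, (x_n, h_n) in enumerate(trees):          # work from south to north
        d1 = h_n * tan_theta                        # step 1: reach of tree n's shadow
        for m in range(n + 1, len(trees)):          # steps 2 and 7: trees to the north
            x_m, h_m = trees[m]
            d2 = x_m - x_n                          # distance to the next tree north
            if d1 <= d2:                            # step 3: shadow falls short
                break
            h3 = min((d1 - d2) / tan_theta, h_m)    # steps 4 and 5: shade-line height
            factors[m] = max(h3 / h_m, factors[m])  # step 6: keep the larger factor
    return factors
```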
During each time period, loss of moisture added
occurs due to evaporation from the soil at a rate dependent on whether or not the ground is shaded at the
center of the "moisture zone."
3. Moisture Zones

The extent of a tree's root system limits the range or distance in each direction in which soil moisture will be available to it. The model assumes that, in the absence of conflicts, all moisture which falls within a distance equal to a tree's present height in either direction becomes available to its roots directly or through capillary action. This distance, or "natural moisture zone," is illustrated in Figure 2 with 45° triangles. (Other angles could be used and varied by species and state of maturity.)
Figure 2 - Moisture allocation

When natural moisture zones overlap, as is generally the case, the trees must compete for the moisture that falls in the overlapping zones. The moisture
zone algorithm begins by preparing a table of the
starting and stopping co-ordinates of each tree's natural moisture zone. The entire forest is then divided
into moisture zones. Associated with each zone is a
list of trees competing for moisture in that zone,
(Table 1). First we search the table of natural moisture zones to find the tree with the smallest starting
co-ordinate. The co-ordinate and tree are entered as
the first line of Table 1. We continue by selecting, for each line, the next smallest co-ordinate from the natural moisture zone table and entering it in Table 1. The co-ordinate entered terminates the preceding zone and begins a new one. If a starting co-ordinate is selected, the list of trees competing in the preceding zone
is copied and the new tree added to the competition.
If a tree's stopping co-ordinate is selected, the list is
copied with that tree deleted. Zones of zero length
will be subsequently ignored.
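A sketch of this sweep in code (ours, not the authors' program) makes the copy-and-modify bookkeeping concrete; tree ids and the dict layout are assumptions for illustration.

```python
# Minimal sketch of the moisture-zone table construction described above.
def moisture_zones(trees):
    # trees: dict of tree id -> (position, height); natural zone is [x-h, x+h]
    events = []
    for tid, (x, h) in trees.items():
        events.append((x - h, 'start', tid))     # starting co-ordinate
        events.append((x + h, 'stop', tid))      # stopping co-ordinate
    events.sort()                                # smallest co-ordinate first

    zones, competing, prev = [], set(), None
    for coord, kind, tid in events:
        if prev is not None and coord > prev:    # zero-length zones are ignored
            zones.append((prev, coord, sorted(competing)))
        if kind == 'start':
            competing = competing | {tid}        # copy the list and add the new tree
        else:
            competing = competing - {tid}        # copy the list with the tree deleted
        prev = coord
    return zones
```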
4. Moisture allocation
The moisture that falls in each moisture zone will
be allocated to the trees competing in that zone by
some combination of the following rules. (A weighting factor has been provided as input.)


TABLE 1 - MOISTURE ALLOCATION

MOISTURE ZONES*

STARTING       STOPPING       TREES CONFLICTING
CO-ORDINATE    CO-ORDINATE    IN ZONE
 95            102            1
102            112            1, 5
112            123            1, 5, 6
123            126            1, 5, 6, 2
126            127            1, 5, 6, 2
127            130            1, 5, 6, 2, 3
130            131            1, 5, 6, 2, 4
131            131            1, 5
131            133            1, 5, 6
133            143            5, 6, 7
143            145            5, 6, 7
145            146            5, 6, 7, 8
146            162            5, 6, 7, 8
162            163            6, 7, 8

* At time zero

Rule A allocates moisture in direct proportion to
the heights of the competing trees.
Rule B projects with the 45° angle the heights of
all competing trees to the center of the zone and then
allocates moisture in proportion to the projected
heights.
Rule C allocates moisture in inverse proportion to the distances from the tree bases to the center of the zone.
Results presented later were obtained using Rule C.
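As a sketch, Rule C amounts to the following (our illustration, not the authors' program; the assumption that a zone's water is its rainfall times its length is ours).

```python
# Minimal sketch of Rule C: split a zone's moisture among competing trees
# in inverse proportion to the distance from each tree's base to the
# center of the zone.
def allocate_rule_c(start, stop, rainfall_cm, tree_positions):
    center = (start + stop) / 2.0
    eps = 1e-6                                   # guard for a tree base at the center
    weights = {t: 1.0 / (abs(x - center) + eps)
               for t, x in tree_positions.items()}
    total = sum(weights.values())
    zone_water = rainfall_cm * (stop - start)    # assumed: rainfall times zone length
    return {t: zone_water * w / total for t, w in weights.items()}
```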
5. Growth

Each species has associated with it an ideal growth
rate which is a function of height. The growth that
actually occurs during a climate period will be determined by reducing the ideal growth by some
amount for each unfavorable environmental condition encountered. Ideal conditions for each species, moisture usage rates, and sensitivities are input parameters.
A tree is selected, and its ideal growth for the period
is determined as a function of species and height. If
the tree is deciduous and dormant in the present
climate period, its projected growth is reduced by an
input factor.
The absolute difference in the temperature number
that occurred and the ideal temperature is divided by
the sensitivity. The sensitivity is the temperature difference that reduces growth by 10%. Thus, from the

quotient of the above division, we determine a new
growth rate.
The growth is multiplied by the hours of light per
day in the current period divided by twelve. If a tree is
entirely shaded (shade factor equals one) the growth
rate is reduced by the "zero light rate" input factor
for that species. If the tree is partially shaded, the
growth rate is reduced to a proportionate rate between the zero light rate and the last determined rate.
The moisture needed for growth is determined from a usage rate multiplied first by height and then by one plus the projected growth in meters. A subsoil loss is determined as a percentage of moisture
present. If the available moisture is greater than that
needed for growth and subsoil loss, the moisture is
removed and growth takes place. If adequate moisture is not present, the actual growth will be that
fraction of the projected growth for which moisture is available.
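Chained together, the period computation looks roughly like this (a sketch under the description above, not the authors' program; all argument names are our assumptions).

```python
# Minimal sketch of one tree's growth for one climate period.
def period_growth(ideal_growth, dormant, dormancy_factor,
                  temp, ideal_temp, sensitivity,
                  light_hours, shade_factor, zero_light_rate,
                  usage_rate, height, moisture, subsoil_loss_rate):
    g = ideal_growth                                   # function of species and height
    if dormant:
        g *= dormancy_factor                           # deciduous and dormant this period
    g *= 1.0 - 0.10 * abs(temp - ideal_temp) / sensitivity  # 10% per sensitivity unit
    g *= light_hours / 12.0                            # hours of light relative to twelve
    g *= 1.0 - shade_factor * (1.0 - zero_light_rate)  # interpolate toward zero-light rate
    needed = usage_rate * height * (1.0 + g)           # moisture needed for this growth
    loss = subsoil_loss_rate * moisture                # subsoil loss, % of moisture present
    if moisture >= needed + loss:
        return g, moisture - needed - loss             # full growth takes place
    fraction = max(moisture - loss, 0.0) / needed      # grow as far as moisture allows
    return g * fraction, 0.0
```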
Preliminary results

At the time of this writing no field data collection
has taken place. Therefore the results obtained so far
have been the outcome of qualitative model testing.
Primarily, this testing has been the operation of the
model under extreme conditions such as total absence of some necessary growth resource. Other
tests have included such things as operation with
and without ecological interference as controlled by
spacing of individual plants. A sensitivity analysis
has also been performed in order to insure "reasonable" responsiveness to major factors modeled.
Validity of basic model structure has now been assumed, since behavior under the tests indicated above
is of a form and direction anticipated by initial assumptions.
The output from a typical model run is included
below (Table 2 and Figure 3). The forest in this particular example was obtained by random shuffling of
data cards representing a deciduous and an evergreen species. (Similar model runs with more than 100
trees have been made.) In Table 2, co-ordinates and
heights are in meters. Shade factors and moisture
present are described above. The trees at each end
are generally in a superior competitive position, and
hence "less representative." The third and fourth
trees illustrate markedly the effect of changing environment with time. Others illustrate such effects
as declining growth rates caused by the increased
moisture requirements accompanying increased
heights.
This particular, somewhat abstract, forest has been run many times with many variations of model parameters. As a result of the adaptability and responsiveness demonstrated, we are able to conclude that the model will prove valuable in its present form and help to open the gates to a large number of computer studies and applications in the plant sciences.


Figure 3 - Results

TABLE 2 - RESULTS

TREE   INITIAL        SPECIES   HEIGHT (METERS)            SHADE     MOISTURE
NO.    CO-ORDINATE              INITIAL  5 YRS.   10 YRS.  FACTOR*   PRESENT*
 1     120            Evg.      25       26.96    28.93    0.00      1259.9
 2     127            Dec.       4        5.41     6.76    0.80        58.2
 3     128            Evg.      --        3.36     4.12    0.32         0.0
 4     129            Evg.      --        3.45     4.11    0.04         0.0
 5     132            Dec.      30       30.47    31.02    0.00        60.6
 6     142            Evg.      30       31.94    33.57    0.00        50.2
 7     148            Dec.      15       15.54    16.25    0.81        31.4
 8     168            Evg.      25       26.90    28.52    0.00        51.6
 9     169            Dec.      --        2.74     3.16    1.00         0.0
10     170            Evg.      --        4.05     5.45    1.00         0.1
11     173            Dec.      10       10.61    11.36    1.00        41.5
12     176            Evg.      30       31.15    32.51    0.03         0.0
13     182            Dec.      15       15.71    16.39    0.75        65.1
14     189            Dec.       4        5.46     6.56    0.00        70.4
15     190            Dec.      --        2.70     3.12    0.19         0.0
16     191            Evg.      --        3.81     5.15    0.00         0.2
17     --             Dec.      10       10.62    11.25    0.00       228.1
18     197            Evg.      30       32.04    34.08    0.00      1820.6

* At five years

The future

A step preliminary to future development will be
an extensive data collection effort. Data collected
over extended periods of time and taken from many
different geographic regions for many different species
and forest densities will be necessary to properly
adjust, or "fine tune," model parameters.
Additional levels of detail may be considered. A
stratified soil model seems to be a most promising
possibility and would allow for finer accounting of
moisture availability, soil type, and fertility in the root zone of individual trees.
A less difficult, but logical, next step will be the
ascribing of measures of economic importance to
the forest. Such a measure will allow the present
model to be used as a "laboratory" for the investigation of many forest ~anagement practices. For example, quantitative results concerning the value of
thinning to reduce shading could be determined by
simply applying the model repeatedly with various
thinning criteria applied to the same forest. Such a
study made in the physical world, by experimentation,
would not only take many years to complete, but
would depend on the critical assumption that each
forest or subforest under study was completely equivalent and subjected to the very same environmental
influences during the entire duration of the study
period.
The adaptation of the principles and algorithms of the present model to crop plants has already begun. The emphasis, here, must change from one of reaction to natural influences to that of reaction to soil management practices. Also, our attention must focus on the
fruit of the plant rather than the plant itself. However,
the many similarities make the present model an important first step in this direction.

A major seismic use for the fast-multiply unit
by ROBERT D. FORESTER*
Digital" Seismic Corporation
Houston, Texas

and
TIM J. HOLLINGSWORTH and JAMES D. MORGAN
Petty Geophysical Engineering
San Antonio, Texas

No matter how much speed or capacity you add to
your computer system, programmers will develop software which will tax it to its limits. Programmers at
Petty Geophysical Engineering are no exception.
Two years ago a fast-multiply unit which can multiply
and add 2,000,000 times per second was added to their
CDC 3200 installation. Soon afterwards they developed a sophisticated program for enhancing seismic
signals which depends heavily on the unit's great
speed. The program is called "APE", which is an
acronym for Automatic Phase Editing. It has proved
to be a valuable tool in the search for oil.
In order to see how the APE program fits into the
scheme of seismic exploration for oil, we will broadly
describe how seismic data are gathered in the field
and processed in the laboratory.
Figure 1 shows a diagrammatic cross section of
how seismic data are collected in the field. Dynamite
is loaded at the bottom of shotholes drilled through
the earth's weathered layer. Sound energy emitted by
shooting the dynamite spreads downward and is reflected upward by discontinuities in the velocity or
density of the earth's layering. Ray paths symbolize
the directions along which the energy travels. The returning echoes are sensed by a surface array of geophones and recorded on magnetic tape in analog or
digital form. The weathered layer tends to be irregular
and can cause reflected events to mismatch in time
from one geophone to the next.
Figure 2 shows a photographic plot of multichannel seismic data recorded on magnetic tape. Energy for a single shot was recorded by an array of 24 geophone groups, each geophone group representing a seismic trace. The horizontal sub-surface coverage of the geophone array was about ½ mile. In this

example, the time extent of the recording shown is
1.9 seconds. The maximum time length of the original
magnetic recordings is usually about 6 seconds. For
most digital processing purposes, each trace is sampled 3000 times; thus for a 24-trace record, the computer must handle 72,000 samples of data. Our center
processes tens of thousands of records like this each
month. In terms of digitized data samples, this
amounts to an input of over a billion pieces of information per month.

Figure 1 - Cross section of seismic reflection shooting

Figure 2 - A 24-trace seismogram

*Formerly Vice President-Service Development, Petty Geophysical Engineering, San Antonio, Texas.


The processed records are assembled into cross
sections like that shown in Figure 3. This section comprises 6 records representing a horizontal sub-surface
coverage of about 3 miles. The depth penetration is
about 10,000 feet. The reflection pattern extending
across the section depicts the configuration of strata
within the earth. This type of presentation is widely
used by geophysicists to detect structural and stratigraphic anomalies in the earth which may contain oil.
(To emphasize the reflection patterns, the positive
halves of the trace swings are filled in solid by an
electronic plotter.)

Figure 4 - 12-fold common-reflection-point setup

Figure 3 - A seismic cross section

It would be so simple if there were no noise, but
mother nature is not that kind. There are many varieties of noise, both random and organized, to plague
the seismic interpreter. In order to cancel noise, or
build up the signal-to-noise ratio, it's now a common
practice to shoot repeated coverage in an area and
then add or "stack" the data together. Such shooting
of repeated coverage has added greatly to the store
of data to be processed in the last five years. Many
computing centers have sprung up to keep up with
the increased load.
Figure 4 shows a diagram of a 12-fold repeated coverage field setup. The shotpoints and geophones
are placed so that the 12 ray paths converge symmetrically toward a common reflection point at each
reflector. Travel times along each slanted ray path
are corrected to what they would be, had the ray
path been vertical. The corrections involve a nonlinear time stretching which is nicely done by a computer. If the weathered layer is highly variable in

velocity and thickness, reflections may show severe
displacements from one trace to the next.
In Figure 5 there are 9 sets of 12-fold common-reflection-point¹ traces. Each set is to be reduced to
a single trace by addition. Owing to weathering irregularities and noise, the reflections are not aligned and
have a ragged appearance. Stacking events which are
badly out-of-phase would do little toward building up
the signal-to-noise ratio.
In the past, our geophysicists used to determine
time shifts which would line up reflections by visual
correlation of reflection character. To straighten up
the reflection bands shown in Figure 5 would be a
laborious task taking most of a working day. The
APE computer program will align reflections automatically. Figure 6 shows the effects of applying
APE to the data in Figure 5. The computer took only
a few minutes to generate this display of smoothly
aligned reflections.
We will now describe the basic principles which
are utilized in the total APE process, each step of
which makes use of the fast-multiply unit.
Figure 7 shows the correlation process in the time
domain. The correlation process is commonly used
to search for a correspondence between traces. In
this example we are searching for a replica of the
search signal in the field trace. As the search signal
is moved in incremental steps past the field trace,
the amplitude values are cross multiplied and summed.
The sums constitute the amplitudes of the output
trace at the bottom of the Figure. Three sums, corresponding respectively to positions A, B, and C of the

Figure 5 - 12-fold trace collections before APE

search signal, are marked on the output trace. In
position B the search signal is lined up exactly with
the pulse of identical shape on the field trace. The
output trace shows an amplitude peak corresponding
to position B. The arithmetic for position B, which
is noted on the right side of the Figure, could be done
by the fast-multiply unit in just 2.5 microseconds.
In the correlation process, the phases of the two
traces are subtracted from each other. Since the
search signal in this example is identical to that of
the field trace, the phases subtract out to zero for
all frequencies and a symmetric output waveform is
obtained.
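The multiply-and-sum is exactly what numpy's correlate computes; the following sketch (ours, not the paper's program) uses the sample values worked in Figure 7, where the products 36 + 4 + 16 + 9 + 1 at position B sum to 66.

```python
# Minimal sketch of time-domain cross-correlation (Figure 7 values).
import numpy as np

field  = np.array([0.0, -1.0, -3.0, -4.0, 2.0, 6.0, 0.0])  # field trace samples
search = np.array([-1.0, -3.0, -4.0, 2.0, 6.0])            # search signal (replica)

output = np.correlate(field, search, mode='full')  # one sum per lag position
print(output.max())                                # 66.0 at the lined-up position B
```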
At one time we tried to estimate the corrections
needed to bring traces into time alignment by simply
noting displacements among correlation peaks. Experience showed that these peaks were generally
too dull to be used for the precise determination of
such corrections. This simple approach was abandoned in favor of the APE process, which employs
cross-correlation only as a part of a more sophisticated scheme.

Figure 6 - 12-fold trace collections after APE

Convolution, which is another part of the APE
process, is depicted in Figure 8. Convolution is popularly thought of as the mathematical equivalent of
filtering. For digital filtering applications, the sliding
waveform represents the impulse response of the filter to be used. The numerical technique of summing
cross multiplications is identical to that used for the
correlation process. The difference between convolution and correlation is that the sliding waveform is
time-reversed in the convolution process.
The arithmetic corresponding to position B of the
sliding waveform is shown on the right. This arithmetic could also be done with the fast-multiply unit
in 2.5 microseconds.
The convolution process causes the phases of the
harmonic components of the two traces to be added
to each other. Since the shape of the field trace is
identical to that of the impulse response, the output
waveform can be thought of as an impulse response
produced by double filtering. Note that the output
waveform is not symmetric.
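In code the only change from correlation is the time reversal, which numpy's convolve performs internally; with the Figure 8 values the position-B sum is -6 - 6 + 16 - 6 - 6 = -8 (a sketch, ours rather than the paper's program).

```python
# Minimal sketch of time-domain convolution (Figure 8 values).
import numpy as np

field   = np.array([0.0, -1.0, -3.0, -4.0, 2.0, 6.0, 0.0])
impulse = np.array([-1.0, -3.0, -4.0, 2.0, 6.0])   # impulse response of the filter

filtered = np.convolve(field, impulse)             # sliding waveform time-reversed
print(filtered)                                    # note the asymmetric output
```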



Figure 8 - Convolution in the time domain

Figure 7 - Correlation in the time domain

Figure 9 shows how convolution is used in the
APE process. For simplification, operations involving
only a single field trace are shown. The top trace is the
field trace; the middle trace, the autocorrelation of
the field trace; the bottom trace, the convolution of
the field trace with its autocorrelation. The higher the
order of the cross multiplication involving a given
shaped signal, the more the final wave shape will
ring or oscillate. Note that the convolved autocorrelation is much more oscillatory than the original field
trace. Such induced ringing would be bad for seismic
interpretation because it can cause discrete reflection
wavelets to overlap each other. Thus the APE process carried only this far would resuit in a loss of
resolution.

To get rid of the ringing, we resort to a process called "deconvolution."² Figure 10 depicts the successive stages of the deconvolution process. The ringing trace is first autocorrelated. Using Z-transforms, an inverse filter is designed which when convolved with a ringing trace will shrink the reflections into spikes. In a spike trace, all frequencies contain equal amplitudes. Because the low and high frequencies brought up by the deconvolution process may contain excessive noise, it is generally necessary to apply to the deconvolved data a bandpass filter having a broad peak. To accomplish bandlimiting, the deconvolved trace is convolved with a symmetric impulse response to give the trace at bottom. The bandpass-filtered trace can be thought of as having been obtained by replacing each spike by a replica of the impulse response of the bandpass filter; the magnitude, polarity, and time position of each replica being determined by the spike it replaces.


Figure 9 - Convolution of an autocorrelation

Figure 10 - Deconvolution process

We are now ready to describe the total APE process (Figure 11). The goal is to line up traces 1 and 2
with respect to the reference trace. Manipulations involving trace 1 are shown as dashed lines; those for
trace 2, as solid lines. The reference trace is first
cross-correlated with traces 1 and 2, respectively.
Because phases are subtracted in the correlation process, time lead and lag relations become reversed.
The correlation function for trace 1 leads the reference trace by the amount trace 1 lags it; and the correlation function for trace 2 lags the reference trace by the amount trace 2 leads it.

Figure 11 - Steps in the Automatic Phase Edit process

Traces 1 and 2 are convolved with their respective
cross-correlations with the reference trace. The resultant convolutions line up in phase, but ring excessively.
The two functions are then autocorrelated and inverse
filters designed. The inverse filters are then convolved
with the ringing functions to restore resolution, as
signified by the lined-up spikes. These spike traces
are then convolved with a bandpass filter to yield
aligned traces which are seismically realistic in appearance.
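The chain for one trace can be sketched compactly (ours, not Petty's production program). The inverse filter here is designed by a least-squares Toeplitz solve on the autocorrelation, a stand-in for the paper's Z-transform design; the filter length and white-noise constant are assumptions, and the final bandpass convolution is omitted.

```python
# Minimal sketch of the APE steps: cross-correlate, convolve, deconvolve.
import numpy as np
from scipy.linalg import solve_toeplitz

def ape_align(trace, reference, inv_len=32, white_noise=0.01):
    xcorr = np.correlate(trace, reference, mode='full')  # phases subtract
    ringing = np.convolve(trace, xcorr)                  # lined up in phase, but rings

    acorr = np.correlate(ringing, ringing, mode='full')  # autocorrelate the ringing trace
    r = acorr[acorr.size // 2:][:inv_len].copy()         # one-sided lags 0 .. inv_len-1
    r[0] *= 1.0 + white_noise                            # stabilize the inversion
    spike = np.zeros(inv_len)
    spike[0] = 1.0                                       # desired output: a single spike
    inv_filter = solve_toeplitz(r, spike)                # least-squares inverse filter

    return np.convolve(ringing, inv_filter)              # spike trace; bandpass would follow
```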
In the time domain, APE can cause various frequencies to shift by different time amounts. In other
words, APE can produce non-linear phase shifts.
This is demonstrated in Figure 12 where APE is applied to synthetic pulses of two different frequencies.
The raw data in the top panel consists of identical
suites of low-frequency pulses on the left side and
high-frequency pulses on the right side. Not only are


Figure 13 - 12-fold stack before APE

Figure 12- Using APE to align synthetic pulses of two different
frequencies

the pulses statically shifted relative to one another, but the interval between them is not uniform. Two of the traces are inverted in polarity to symbolize accidental reversal of terminal connections in the field.
In the middle panel, the low-frequency pulses have
been lined up by manual editing. Note that the high-
frequency pulses are still misaligned because of the
non-uniform interval between pulses on each trace.
The bottom panel shows the results of APE processing. Both low- and high-frequency pulses have been
lined up, and the traces reversed in polarity have
been automatically inverted. (In this example, ringing is yet to be removed by deconvolution.)
APE's power of producing non-linear phase shifts
can compensate for distortion in wavelet character
occurring from one trace to the next. Areal variations
in the filtering properties of the weathered layer usually account for most of such distortion. Manual determinations of time-alignment corrections are limited

by the assumption that the filtering properties of the
weathering do not change much throughout an area.
Now let us again examine the effects of APE on
field data. We have noted that APE transformed the
ragged-appearing reflections in Figure 5 into the
smoothly aligned reflections shown in Figure 6. In
addition there is an improved signal-to-noise ratio
as indicated by the fact that the intervals between the
reflections appear cleaner and quieter. This happened
because the APE process builds up only those frequencies showing trace-to-trace coherence and cancels those that lack coherence.
The 12-fold trace collections in Figures 5 and 6
were stacked to yield the results in Figures 13 and 14,
respectively. The reflections on the stacked APE
section are cleaner, stronger, and sharper than those
on the stacked non-APE section. Because reflection
character has been stabilized from trace to trace, residual variations in character should be more reliably
diagnostic of changes in lithology. Because the phase
shifting has been done objectively by a machine,
structural anomalies will not be suspected as being
due to human bias, as is often the case in manual editing. All these improvements effected by APE can help the geophysicist make more reliable seismic interpretations in the search for oil.

SUBPROCESS                                    MULTIPLICATIONS AND ADDITIONS
1. CROSS-CORRELATION                                    7,200,000
2. CONVOLVED CROSS-CORRELATION                          7,200,000
3. DECONVOLUTION
   A. AUTOCORRELATION                                   7,200,000
   B. CONVOLUTION WITH INVERSE FILTER                   7,200,000
   C. CONVOLUTION WITH BAND-LIMITING FILTER             3,600,000
TOTAL                                                  32,400,000
Figure 15 - Tally of multiplications and additions required to
APE process a 24-trace seismogram

Without the fast-multiply unit, APE processing
would not be economically feasible. Figure 15 tallies
the number of multiplications and additions required
to APE process a single record of 24 traces, each
trace having been sampled 3000 times. The number of
multiplications and additions required to APE process one record amounts to a grand total of 32,400,000.
The fast-multiply unit speeds through the computations in just 16.2 seconds. For APE processing, a client currently buys 16,200 multiplications and additions for a penny. APE processing represents a big seismic use for the fast-multiply unit.

Figure 14 - 12-fold stack after APE

REFERENCES
1 W H MAYNE
Common reflection point horizontal data stacking techniques
Geophysics vol XXVII no 6 pt II December 1962 pp 927-938
2 The MIT geophysical analysis group reports
Special issue of Geophysics vol XXXII no 3 June 1967 pp 411-521

A generalized linear model for optimization
of architectural planning
by RODOLFO J. AGUILAR and JAMES E. HAND
Louisiana State University
Baton Rouge, Louisiana

INTRODUCTION


In the realm of architectural planning there exists a type of problem with which designers are frequently confronted where financial return is the most appropriate measure of the system's effectiveness. In this category can be included all rental and speculative housing (single and multiple family dwellings), office buildings, warehouses, stores, many industrial facilities, etc., and building complexes which combine some
or all of these to provide comprehensive services to
the tenant.
Traditionally, the problem of planning for capital
investment has been handled in a semi-empirical way,
where the data which serve as basis for decision making may be more or less reliable and up to date, but
where the processing of these data, to arrive at the
optimum allocation of space for maximum expectation
of financial return, is entirely intuitive and, consequently, unreliable.
The writers have studied the allocation problem in
detail and have concluded that once the pertinent data
are collected, the most profitable design configuration
can be ascertained through the optimization of a linear
model which incorporates the most salient features
of the real-life, planning situation.
The allocation of rental housing (single type facility)
has been modeled by Aguilar.¹,² However, a mix of various types of uses has not been investigated previously.
The model developed in the following section takes
into account multi-use facilities and considers realistic design constraints such as zoning regulations,
parking requirements, etc. An example problem is also
presented to illustrate application of the theory.
The model

Definition of terms

The design of buildings and other architectural facilities for financial return is constrained by many internal and external factors such as budget, market restrictions (construction costs, rentability, individual preferences), parking regulations, lot coverage, etc., in addition to environmental factors, many of which defy quantification.
The procedure described in this section attempts to consider as many as possible of the quantifiable factors in optimizing architectural planning within the framework of economics: return on invested capital.
With the information extracted from the model (a type of simulation model in many respects), rational decisions can be formulated based upon knowledge of the most recent tax-depreciation structure, maintenance costs, alternative investment opportunities, etc.
Let $x_{ij}^k$ be the floor area of facility type i, at level j, of architectural quality k.
The "type i facility" refers to the use made of the space considered; offices, apartments, stores, warehouses, laboratories, industrial facilities, classrooms, etc.
The "j level" denotes the location of the facility; first, second, ..., mth floor level.
The "k quality" refers to the degree of architectural
refinement of the space; types of finishes, environmental control, and many other considerations, the
effects of which are reflected in the cost of construction and in the rentability of the facility.
Similarly, let
$c_{ij}^k$ = cost per unit of area of facility i, at level j, of architectural quality k,
$r_{ij}^k$ = rent per unit of area, per unit of time, from facility i, at level j, and architectural quality k,
$p_{ij}^k$ = probability of renting or selling facility i, at level j, and architectural quality k,
and
$q_{ij}^k$ = probability of not renting or selling facility i, at

level j, and architectural quality k. Hence,

$$p_{ij}^k + q_{ij}^k = 1.$$

It will be assumed that the market is large, and that for the time period under consideration, it is in a steady state condition, such that the introduction of additional facilities for rent or sale will not affect significantly the parameters defined above.

Mathematical formulation
The total expected rent per unit of time can be expressed as follows:

$$E(R) = \sum_{k=1}^{L} \sum_{j=1}^{m} \sum_{i=1}^{n} p_{ij}^k\, r_{ij}^k\, x_{ij}^k \qquad (1)$$
This is the objective function to be maximized
subjected to constraints of the types described below.
1. Market
a. A survey may reveal that, within a specified
geographic region, there exist vacancy rates
for some or all of the proposed facilities.
These vacancy rates are in fact the $q_{ij}^k = 1 - p_{ij}^k$ which were already considered in the formulation of the objective function. Nevertheless, the $q_{ij}^k$ should be compared with the upper limits set on them by banks and other funding institutions. If these limits are exceeded, the project may be very difficult if not impossible to finance. Therefore,

$$q_{ij}^k \le (q_i^k)_{max}, \text{ for all } i \text{ and } k. \qquad (2)$$

b. Market preference data may indicate that the area of each facility type and quality must not exceed a certain fraction of the total building area. Let $f_i^k$ be upper bounds to the area ratios; then, the constraints can be written,*

$$\sum x_{ij}^k \le f_i^k \sum_{k=1}^{L} \sum_{j=1}^{m} \sum_{i=1}^{n} x_{ij}^k \qquad (3)$$

In general,

$$\sum_{k=1}^{L} \sum_{i=1}^{n} f_i^k > 1 \qquad (4)$$

because of the requirement that the $f_i^k$ be upper bounds to the area ratios.
The $e_{ij}^k$ are small positive constants introduced into the constraint equations and inequalities to avert degeneracy.

*The range of the sum on the left hand side of inequality (3) varies for each facility type and quality and for this reason it is not given explicitly. This convention will be used for the rest of this paper.

2. Zoning Regulations such as:
a. Maximum building coverage of total lot area, which can be mathematically expressed as follows,

$$\sum_{k} \sum_{i} x_{ij}^k \le A_j, \quad j = 1, 2, \ldots, m, \qquad (5)$$

where $A_j$ = maximum allowable building area of floor level j.
b. Height restrictions, which could be overall building restrictions that $j \le m$ or restrictions on each individual facility; merchandising space, for example, should be located on lowest floors, etc.
c. Off-street parking regulations, usually given as the number, $n_i$, of parking stalls per building area, $a_i$, of facility type i. In addition, the area, a, required for each car for parking, drives, etc., can be easily computed and the off-street, surface parking restriction expressed as,

$$a \sum_{k} \sum_{j} \sum_{i} \frac{n_i}{a_i}\, x_{ij}^k \le A_t - \sum_{k} \sum_{i} x_{i1}^k \qquad (6)$$

where $A_t$ = total buildable lot area (total lot area minus area required for landscaping, street rights of way, utility easements, set back restrictions, topographically unsuitable land, etc.).
Note that $\left(A_t - \sum_{k} \sum_{i} x_{i1}^k\right)$ is the site area available for parking ($A_t$ minus first floor area of building).

3. Design decisions (massing studies). These decisions are made by designers for aesthetic and
other reasons and are often arbitrary. Nevertheless, their effect upon the economic health of the
system can be measured by comparing all optimal solutions from models with different sets of
design constraints to one having no design constraint. The model without design constraints
yields a solution that in this paper will be called
the optimum optimal solution. The massing associated with the optimum optimal solution may
not be aesthetically and/or structurally acceptable and it is at this point that design constraints
must be introduced. The constrained model will,
in general, yield a lower value of the objective
function. Thus, if E(R)o is the optimum optimal
rent and E(R)c is the expected rent when the
problem is constrained by design considerations;


$$C_c = E(R)_o - E(R)_c \qquad (7)$$

where $C_c$ is the cost of the design decisions.
It should be realized that aesthetic values may affect the four parameters $c_{ij}^k$, $r_{ij}^k$, $p_{ij}^k$, and $q_{ij}^k$ as well as others. If their effect is known, the parameters should be modified accordingly and a new $E(R)_c$ computed.
The ability to ascertain the effect of design decisions upon the system's economic health is a most valuable characteristic of the model, for it measures indirectly the "cost" of beauty and of other aesthetic factors; it represents an attempt to quantify some of the intangibles of architecture and urban planning. Design constraints may take different mathematical forms depending upon the massing restrictions. If, for example, the planner decides that the building should take the form of a tower with a spread out, s story base, b times or more larger than each of the tower's floors (Lever House type of building), the constraints would be written, thus,

$$\sum_{k} \sum_{i} x_{ij}^k = \sum_{k} \sum_{i} x_{i(s+1)}^k + e_j, \quad j = s+2, s+3, \ldots, m. \qquad (10)$$

The $e_j$, $j = 2, \ldots, m$, are, again, small positive constants introduced to avert degeneracy. Many other types of design decisions can be similarly formulated.
4. Budget, generally expressed as a fixed amount of
money, B, which must not be exceeded by all
fees and construction costs, excluding the cost
of land.
Construction costs for each facility type and
quality are, in general, functions of the total
number of floors. Specifically, as the number of
floors in the building increases, the costs per
unit of area could decrease or increase. This
variability is not usually too sensitive and can
be conveniently expressed as a cost multiplier,
step function of the total number of floors, m.
Let {3ij (m) ~ 0 be such step function. Graphically, it could be typically mapped as in Figure 1.
The figure shows that {3~ (m) decreases for (a~)2 ::::; m
< (a~)3. It further decreases for (a~)3 ::::; m < (a~)4.
Then, it increases in the interval (ati)4::::; m < (ati)~, and
the increase is even greater for (ah)5 ::::; m::::; (a~)6.
These latter increases would reflect the higher costs
of foundation and vertical transportation, for example.

.."
o

c
o

U

Tolal

Number

of

Floors

COST MULTIPLIER STEP fUNCTION

Figure 1 - Cost multiplier step function

When $m > (a_{ij}^k)_6$, the cost multiplier becomes infinite, denoting that $(a_{ij}^k)_6$ is the upper bound to the number of floors due to zoning restrictions or other considerations. The variation shown in Figure 1 is, of course, only one of an infinite number of possibilities, but all of them will exhibit the same general shape.
In general, there will be $n \times m \times L$ construction cost multipliers with a proportionate number of $(a_{ij}^k)_q$ nodes where changes in the costs occur. Let the nodes be ordered sequentially for all $\beta_{ij}^k(m)$ and renumbered $\gamma_q$, $q = 1, 2, 3, \ldots, r$. Then, a typical cost multiplier, step function of m, would appear as given in Figure 2.

Figure 2 - Cost multiplier step function sequentially ordered


This relabeling is necessary to show, for each $\beta_{ij}^k(m)$ multiplier, all points at which construction costs change for any type, level and quality of facility; thus affording complete control upon the optimization process.
The cost multiplier functions, however, introduce a further complication in the model because the number of floors may not be known a priori. This is, in fact, a non-linear characteristic of the system, that, in this context, exhibits recycle loops. A method will be proposed to handle the non-linearity in a satisfactory, linear manner. Additional items of cost and cost coefficients must be defined, for example:
a. Cost of sub-surface exploration, soil investigation, foundation recommendations, and report. Usually given as a lump sum, $b_s$.
b. Architectural-Engineering fees, usually expressed as a percentage of total construction costs by a decimal coefficient, $c_a$.
c. Fund for contingencies, also expressed as a percentage of total construction costs. Let $c_c$ be the decimal equivalent of that percentage.
d. Cost of movable furniture, expressed as a percentage of construction cost for each facility type i and quality k. Let $c_i^k$ be the decimal equivalents of those percentages.
e. Supervision of construction costs, generally computed as a lump sum, $b_c$, which is the total salary or salaries of one or more supervisors for the anticipated duration of the construction phase of the building. $b_c$ is assumed constant for a given project.
f. Cost of parking facilities per unit of area, $c_p$.
Support space, such as halls, elevators, toilets, etc., is assumed to be included in the $x_{ij}^k$ and for this reason, its cost must be apportioned among tenants and/or buyers (as it always is).
From these considerations, the following constraint
inequality can be readily formulated:
$$(1 + c_a + c_c)\left\{\sum_{k}\sum_{i}(1 + c_i^k)\left[\sum_{j}\beta_{ij}^k\, c_{ij}^k\, x_{ij}^k\right] + c_p\, a\left[\sum_{k}\sum_{j}\sum_{i}\frac{n_i}{a_i}\, x_{ij}^k\right]\right\} + b_s + b_c \le B. \qquad (11)$$

The constraint given above presupposes that the total number of floors, m, is known in advance. On this basis, the $\beta_{ij}^k$ multipliers are obtained without difficulty. In performing an optimization study, however, design constraints would be relaxed at some point, and the total number of floors would become one of the unknown parameters. This, in turn, makes the $\beta_{ij}^k$ be undetermined and a non-linear, recycle loop could be generated. To avoid this complication, the following algorithm is proposed:
Step 1. Assume that the maximum number of floors $\gamma_r$ will be built (see Figure 2). Therefore, $m = \gamma_r$ and the corresponding $\beta_{ij}^k$ multipliers are used in inequality (11) and in the formulation of the mathematical model. A solution is then obtained using a linear programming algorithm.
Step 2. Suppose that the solution generated under these conditions yields $m = \gamma_{s1}$ and that $\gamma_r \ge \gamma_{s1} \ge \gamma_{r-1}$. Then, the $\beta_{ij}^k$ used were the correct ones and this is an optimal solution for the constraints imposed.
If, on the other hand, $\gamma_{s1} < \gamma_{r-1}$, the $\beta_{ij}^k$ used are incorrect and the solution is not consistent with its associated construction costs. Proceed to step 3.
Step 3. Now assume that the total number of floors $m = \gamma_{r-1} - 1$, i.e., the maximum value on the next lower interval. Use this value of m to select the $\beta_{ij}^k$, formulate the model, and solve using linear programming. Two possible results follow:
(1) The solution yields $m = \gamma_{s2}$ and $\gamma_{s2} < \gamma_{r-2}$. In this case, repeat step 3 for the next lower interval.
(2) The solution yields $m = \gamma_{s2}$ where $\gamma_{r-1} > \gamma_{s2} \ge \gamma_{r-2}$. Then, the optimal solution for the model has been obtained and $m = \gamma_{s2}$.
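A compact sketch of this iteration (ours, not the authors' code, and with the interval bookkeeping slightly collapsed): solve_lp stands for any linear-programming routine that, given an assumed m and hence a choice of multipliers, returns the optimal number of floors.

```python
# Minimal sketch of the floor-count algorithm in steps 1-3.
def optimize_floor_count(gammas, solve_lp):
    # gammas: ordered nodes gamma_1 < ... < gamma_r where the multipliers change
    q = len(gammas) - 1                      # step 1: assume the maximum, m = gamma_r
    while q >= 0:
        m_solved = solve_lp(gammas[q])       # solve with multipliers for this interval
        if q == 0 or m_solved >= gammas[q - 1]:
            return m_solved                  # step 2: consistent, hence optimal
        q -= 1                               # step 3: drop to the next lower interval
```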
5. Many other constraints can be imposed upon
the objective function to make the problem conform to reality as much as possible. In writing
the constraints, however, one must keep in
mind that they must be linear, if the problem
solution is to be obtained with a linear programming algorithm, and one must be careful not to write linearly dependent and/or redundant constraints, for they can be a source of error and inefficiency in carrying out the solution.

Optimization procedure

The model developed in" the" previous sectIon is a
linear one in that a linear objective function is subjected to linear constraint inequalities and equations.
It fits within the framework of problems which" can
be optimized with "Linear Programming" methods,3.4
for which computer programs are generally available.
I t can be" shown that, after the introduction of slack
and artificial slack variables, the matrix of the structural coefficients must have the same rank as the matrix formed by augmenting the previous matrix with
the column vector of stipulations, for the system to
have basic solutions. This is a good check on the constraint equations, and shouid be performed before embarking upon the task of solving the problem.


CONCLUSIONS
The architectural planning model presented in this paper allows the designer to make rational decisions based on the information provided by the optimal solutions generated. It must be emphasized that the approach proposed by the writers asserts the importance of the role played by the architect-planner in shaping the urban environment. Even with a systematic method such as the one described, he must be, at all times, deeply involved in the formulation of the model, especially at the "design constraint" level, for the linear model presented is a full-fledged "simulation machine," sensitive to the settings the designer gives it and responsive to the reactions of the system's economic health. A simple, but descriptive, functional example problem is given in the Appendix.
REFERENCES
1 R J AGUILAR
Decision making in building planning
Computers in Engineering Design Education College of Engineering The University of Michigan Ann Arbor Vol III pp 111-26 to
111-33 April 1 1966
2 R J AGUILAR
The mathematical formulation and optimization of architectural and planning functions
Division of Engineering Research Louisiana State University
Baton Rouge Bulletin No 93 1967
3 W W GARVIN
Introduction to linear programming
McGraw Hill Book Company Inc New York 1960
4 R W LLEWELLYN
Linear Programming
Holt Rinehart and Winston New York 1964

APPENDIX

A functional example

An individual wishes to develop a 250 ft. x 400 ft. commercial piece of property located at the intersection of 2 major traffic arteries.
A. Market studies indicate that for a development to be rentable, available office space must not exceed 2 floors, stores must not exceed 1 floor and living units, 3 floors for a "walk-up" facility.
B. The studies further indicate that within the sector of the city where the property is located, offices have a 20% vacancy rate, stores a 5% vacancy rate and apartment units a 10% vacancy rate (occupancies must not be less than 70%, to obtain financial support of a funding agency).
C. Data on market rental preferences reveal that facility areas of individual types and qualities should not exceed the following percentages of total building area:

TYPE            QUALITY    AREA, % OF TOTAL BUILDING
1. Offices      1          10
                2          40
                3          25
2. Stores       1          30
3. Apartments   1          40
                2          30

TABLE I - Rentability

D. Zoning regulations for off-street parking are:
a. Offices-l parking space/ea. 200 sf of bldg.
area.
b. Stores - 1 parking space/ea. 1,000 sf of bldg. area.
c. Apartments - 1 parking space/ea. 800 sf of
bldg. area.
Further - building coverage shall not exceed
30% of total lot area. Allow 400 sf/car for
parking space, drives, etc., and a total of
5,000 sf for landscaping.

E. Rents:

TYPE            LEVEL            QUALITY 1   QUALITY 2   QUALITY 3
1. Offices      1. Ground Fl'r   4.50        4.00        3.50
                2. Second Fl'r   4.00        3.50        3.00
2. Stores       1. Ground Fl'r   3.00        --          --
3. Apartments   1. Ground Fl'r   3.00        3.00        --
                2. Second Fl'r   2.50        2.50        --
                3. Third Fl'r    2.50        2.00        --

TABLE II - Rent ($/sq.ft./yr.)

F. The ($\beta_{ij}^k c_{ij}^k$) coefficients are given below in tabular form.

                           TOTAL NUMBER OF FLOORS IN BUILDING
                           1            2                    3
TYPE            QUALITY    Ground       Ground  Second       Ground  Second  Third
1. Offices      1          20.00        20.00   19.00        19.00   18.00   --
                2          18.00        17.00   16.00        17.00   15.00   --
                3          15.00        14.00   14.00        13.00   13.00   --
2. Stores       1          13.00        12.00   --           11.00   --      --
3. Apartments   1          20.00        19.00   18.00        18.00   17.00   17.00
                2          17.00        17.00   16.00        16.00   15.00   14.00

Cost of parking area - $0.75 per sq. ft.

TABLE III - Construction cost, including support space ($/sq.ft.)

G. Miscellaneous Costs:
a. Architectural-Engineering fees - 6% of total construction cost.
b. Contingencies - 1.5% of total construction cost.
c. Movable furniture:
1. Offices - 2% of total construction cost.
2. Stores - 1% of total construction cost.
3. Apartments - 4% of total construction cost.
d. Soil's report - $3,000.
e. Supervision of construction - $2,000.
H. Budget: Shall not exceed $800,000, excluding
cost of land.
Question: What types of units and how many
square feet of each shall the investor develop to
maximize the rent under present market conditions?

The model

Let:
i = 1 → Offices; i = 2 → Stores; i = 3 → Apartments
j = 1 → Ground floor; j = 2 → Second floor; j = 3 → Third floor
k = 1 → Quality 1; k = 2 → Quality 2; k = 3 → Quality 3

Then, from A and E, the admissible variables are $x_{1j}^k$, j = 1, 2, k = 1, 2, 3 (offices); $x_{21}^1$ (stores); and $x_{3j}^k$, j = 1, 2, 3, k = 1, 2 (apartments).

From B,
$q_{1j}^k = .20 < .30$, j = 1, 2, 3; k = 1, 2, 3. OK.
$q_{2j}^k = .05 < .30$, j = 1, 2, 3; k = 1, 2, 3. OK.
$q_{3j}^k = .10 < .30$, j = 1, 2, 3; k = 1, 2, 3. OK.

From B, E and Eq. (1),

OBJECTIVE FUNCTION:

$$\max E(R) = .80\,(4.50x_{11}^1 + 4.00x_{11}^2 + 3.50x_{11}^3 + 4.00x_{12}^1 + 3.50x_{12}^2 + 3.00x_{12}^3) + .95\,(3.00x_{21}^1) + .90\,(3.00x_{31}^1 + 2.50x_{32}^1 + 2.50x_{33}^1 + 3.00x_{31}^2 + 2.50x_{32}^2 + 2.00x_{33}^2). \qquad (12)$$

CONSTRAINTS:
I. The lowest construction cost is, from F, $11.00 per sq. ft. Therefore, an upper bound to the total building area is 800,000/11.00 ≈ 74,000 sq. ft. Therefore,

$$x_{11}^1 + x_{11}^2 + x_{11}^3 + x_{12}^1 + x_{12}^2 + x_{12}^3 + x_{21}^1 + x_{31}^1 + x_{31}^2 + x_{32}^1 + x_{32}^2 + x_{33}^1 + x_{33}^2 \le 74{,}000.$$

When slack variable $X_1$ is introduced, obtain,

$$x_{11}^1 + x_{11}^2 + x_{11}^3 + x_{12}^1 + x_{12}^2 + x_{12}^3 + x_{21}^1 + x_{31}^1 + x_{31}^2 + x_{32}^1 + x_{32}^2 + x_{33}^1 + x_{33}^2 + X_1 = 74{,}000. \qquad (13)$$

Notice that the total building area is $74{,}000 - X_1$.

II. Market preferences.
From C, and introducing slack variables $X_2$ through $X_7$, obtain,

$$x_{11}^1 + x_{12}^1 + .10X_1 + X_2 = 7{,}400 \qquad (14)$$
$$x_{11}^2 + x_{12}^2 + .40X_1 + X_3 = 29{,}600 \qquad (15)$$
$$x_{11}^3 + x_{12}^3 + .25X_1 + X_4 = 18{,}500 \qquad (16)$$
$$x_{21}^1 + .30X_1 + X_5 = 22{,}200 \qquad (17)$$
$$x_{31}^1 + x_{32}^1 + x_{33}^1 + .40X_1 + X_6 = 29{,}600 \qquad (18)$$
$$x_{31}^2 + x_{32}^2 + x_{33}^2 + .30X_1 + X_7 = 22{,}200 \qquad (19)$$

III. Site building area.
From D, in Eq. (5), and with slack variable $X_8$,

$$x_{11}^1 + x_{11}^2 + x_{11}^3 + x_{21}^1 + x_{31}^1 + x_{31}^2 + X_8 = 30{,}000. \qquad (20)$$

IV. Parking.
From D, in Eq. (6), and introducing slack variable $X_9$, form,

$$2.00\,(x_{11}^1 + x_{11}^2 + x_{11}^3 + x_{12}^1 + x_{12}^2 + x_{12}^3) + .40\,(x_{21}^1) + .50\,(x_{31}^1 + x_{31}^2 + x_{32}^1 + x_{32}^2 + x_{33}^1 + x_{33}^2) + x_{11}^1 + x_{11}^2 + x_{11}^3 + x_{21}^1 + x_{31}^1 + x_{31}^2 + X_9 = 95{,}000. \qquad (21)$$

V. Design Decisions (Massing Study).
(All floors must be approximately the same size.)
First floor area approximately equal to second floor area; first floor area approximately equal to third floor area. Let $e_1 = e_2 = 1000$ to avert degeneracy. This means the second and third floor areas will be within 1000 sq. ft. of the first floor area. One can write:

$$x_{11}^1 + x_{11}^2 + x_{11}^3 + x_{21}^1 + x_{31}^1 + x_{31}^2 = x_{12}^1 + x_{12}^2 + x_{12}^3 + x_{32}^1 + x_{32}^2 + e_1 \qquad (22)$$

and,

$$x_{11}^1 + x_{11}^2 + x_{11}^3 + x_{21}^1 + x_{31}^1 + x_{31}^2 = x_{33}^1 + x_{33}^2 + e_2. \qquad (23)$$

Artificial slack variables must be introduced into eqs. (22) and (23) before proceeding to optimize.

VI. Budget.
Under the assumption that the total number of floors in the building will be three (3), the budget constraint, from F, G, H, ineq. (11) and introducing slack variable $X_{10}$, can be written as follows:

$$1.075\,\{1.02\,(19x_{11}^1 + 17x_{11}^2 + 13x_{11}^3 + 18x_{12}^1 + 15x_{12}^2 + 13x_{12}^3) + 1.01\,(11x_{21}^1) + 1.04\,(18x_{31}^1 + 16x_{31}^2 + 17x_{32}^1 + 15x_{32}^2 + 17x_{33}^1 + 14x_{33}^2) + .75\,[2.00\,(x_{11}^1 + x_{11}^2 + x_{11}^3 + x_{12}^1 + x_{12}^2 + x_{12}^3) + .40\,(x_{21}^1) + .50\,(x_{31}^1 + x_{31}^2 + x_{32}^1 + x_{32}^2 + x_{33}^1 + x_{33}^2)]\} + X_{10} = 795{,}000. \qquad (24)$$

The model is complete and a linear programming maximization of objective function (12) subjected to constraint equations (13) through (24) can now be performed. Because there are only 12 constraint equations, any basic feasible solution, including the optimal one, cannot contain more than 12 non-zero variables. This limitation could be removed by introducing additional constraint conditions.

In accordance with the budget algorithm given in the paper, if design decision constraints are removed and the optimal solution shows that

$$x_{33}^1 + x_{33}^2 = 0, \qquad (25)$$

the total number of floors could not exceed two (2). Therefore, set $x_{33}^1 = x_{33}^2 = 0$ in the model and formulate the budget constraint equation as follows:

$$1.075\,\{1.02\,(20x_{11}^1 + 17x_{11}^2 + 14x_{11}^3 + 19x_{12}^1 + 16x_{12}^2 + 14x_{12}^3) + 1.01\,(12x_{21}^1) + 1.04\,(19x_{31}^1 + 17x_{31}^2 + 18x_{32}^1 + 16x_{32}^2) + .75\,[2.00\,(x_{11}^1 + x_{11}^2 + x_{11}^3 + x_{12}^1 + x_{12}^2 + x_{12}^3) + .40\,(x_{21}^1) + .50\,(x_{31}^1 + x_{32}^1 + x_{31}^2 + x_{32}^2)]\} + X_{10} = 795{,}000. \qquad (26)$$

In a similar manner, if the new optimal solution shows:

$$x_{12}^1 + x_{12}^2 + x_{12}^3 + x_{32}^1 + x_{32}^2 = 0, \qquad (27)$$

the building can have only one (1) floor. Hence, set $x_{12}^1 = x_{12}^2 = x_{12}^3 = x_{32}^1 = x_{32}^2 = 0$ in the model and the budget constraint becomes,

$$1.075\,\{1.02\,(20x_{11}^1 + 18x_{11}^2 + 15x_{11}^3) + 1.01\,(13x_{21}^1) + 1.04\,(20x_{31}^1 + 17x_{31}^2) + .75\,[2.00\,(x_{11}^1 + x_{11}^2 + x_{11}^3) + .40\,(x_{21}^1) + .50\,(x_{31}^1 + x_{31}^2)]\} + X_{10} = 795{,}000. \qquad (28)$$

The budget algorithm describes the procedure to follow in handling cost coefficient multipliers. When design constraints are relaxed, the optimum optimal solution is obtained.
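For a modern reader, the structure of the example drops straight into a present-day solver. The following deliberately reduced sketch (ours, not the 1968 program) keeps only four variables — the three office qualities and the stores, all on the ground floor — and three constraint rows; scipy's linprog minimizes, so the expected rent is negated.

```python
# Minimal sketch: a reduced version of the example as a linear program.
import numpy as np
from scipy.optimize import linprog

rent = np.array([.80 * 4.50, .80 * 4.00, .80 * 3.50, .95 * 3.00])  # p * r per sq. ft.

A_ub = [[1, 1, 1, 1],    # total building area <= 74,000 sq. ft.
        [1, 0, 0, 0],    # offices, quality 1 <= 10% of 74,000
        [0, 0, 0, 1]]    # stores <= 30% of 74,000
b_ub = [74_000, 7_400, 22_200]

res = linprog(-rent, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
print(res.x, -res.fun)   # optimal areas and the expected rent they earn
```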


SOLUTION
The first run, with no design constraints, yielded the
following data.
At a gross expected profit of $140,570 per annum,
build:
12,536 sq. ft. of Quality 3, ground floor
OFFICES
16,700 sq. ft. of Quality 2, second floor
OFFICES
15,043 sq. ft. of Quality I, ground floor
STORES
5,865 sq. ft. of Quality I, second floor
APARTMENTS
This 2-level solution was obtained using construction costs for a 3-level complex. It is therefore necessary to change the cost coefficients in the
budget constraint. The third level variables were assigned large, positive cost coefficients in order to drive
them out of solution.
The second run, with no design constraints,
yielded:
At a gross expected profit of $132,640 per annum,
build:
4,348 sq. ft. of Quality 2, ground floor
OFFICES
11,660 sq. ft. of Quality 3, ground floor
OFFICES
12,711 sq. ft. of Quality 2, second floor
OFFICES
13,992 sq. ft. of Quality I, ground floor
STORES

3,929 sq. ft. of Quality 1, second floor
APARTMENTS
This two-level configuration constitutes the optimum-optimal solution with which all other optima
(subject to all design constraints) must be compared.
It is important to note that the optimum-optimal
solution established a building area of 30,000 sq. ft.
on the ground floor and 16,640 sq. ft. on the second
floor.
The third run, for which a design constraint was
introduced into the model (that the ground floor area
must be approximately equal to that of the second
floor), yielded the following space allocation:
For a gross expected profit of $132,126 per annum,
build:
9,856 sq. ft. of Quality 3, ground floor
OFFICES
2,506 sq. ft. of Quality 1, second floor
OFFICES
18,712 sq. ft. of Quality 2, second floor
OFFICES
1,671 sq. ft. of Quality 3, second floor
OFFICES
14,035 sq. ft. of Quality I, ground floor
STORES
Due to one decision, the designer is forced to relinquish approximately $500/annum in profit and to
re-allocate spaces so that a local optimum may be
achieved subject to the constraint imposed upon the
building configuration.
Various other design decisions can be incorporated
with the same ease. This completes the presentation
of the example problem.

Standards for user procedures and data formats in
automated information systems and networks
by JOHN L. LITTLE
National Bureau of Standards
Washington, D. C.

and
CALVIN N. MOOERS
Rockford Research Institute
Cambridge, Massachusetts

INTRODUCTION
At the present time, a low-cost, suitably connected
teletypewriter, and a telephone call, is the "passport" that can permit a person to make direct contact
with any of more than two dozen computer-based information storage and processing capabilities within
academic and research establishments scattered
across the country. By the same method of access,
one can in addition make contact with at least as many
commercial services offering similar capabilities. By
any reasonable estimate, there are now (Spring 1968)
in operation more than two thousand such teletypewriter units which are being used by students, faculty,
scientists, engineers, secretaries, and administrators.
New accessible computer facilities, both academic and
commercial, are being announced with regularity. In
addition, large projects, both privately and governmentally sponsored, are under way with the purpose of
creating vast topical information stores with associated processors. An important part of some of these
plans is the linking of the stores into large networks,
both for the exchange of information among the stores,
as well as for presentation of the information to, and
service to, the directly-connected, ultimate user.
The rate of growth of such facilities is very high,
and it will continue to be high because of the exceedingly favorable user response to this mode of service.
As recently as 1961, essentially no facilities of this
kind were in operation. Yet, during just the past three
years, the number of accessible processors, and the
number of users, has increased by at least thirtyfold.
Such a rate of growth represents a tenfold increase
every two years. Because of the evident enthusiasm of
the users, the size of the untapped population, and the

potentials of this kind of service, we can probably expect this rate of increase to continue unabated for at
least the next five years, before the saturation effects
begin to take over. If these predictions are fulfilled
(and they may be exceeded), then by 1972 there will
be in the order of 15,000 accessible automated
storage and processor complexes, both large and
small. Also, there will be in use something in the order
of 300,000 on-line terminal devices of various kinds,
each permitting connections to automated remote systems over spans ranging from one room to the next,
up to long-haul transoceanic connections by radio
satellite.
These numerical predictions may be compared to
the more than 4 million electric office typewriters
now in use; about 80,000 commercial teletypewriter
instruments operating in business point-to-point
communications; and about 50,000 electronic computers, both large and small, now installed.
This happy picture of such widespread user acceptance and the growth of a new technology is seriously
flawed by the Babel of differing languages and control
methods in use. Once a telephone connection to a remote automated storage and processor unit has been
established, the user is absolutely helpless unless he is
thoroughly familiar with the particular keyboard
rituals and incantations required to elicit performance
from the specific remote machine. It is safe to estimate that knowledge of at least 30 to 40 different languages
and rituals would now be required for operation of all
the 50 or so presently accessible automated systems.
In each contact, the poor user must know which language and ritual to employ before he can establish
even the most minimal communication with the remote machine. For example, should he type "HELLO" or should he press the BREAK key to get the
computer's attention? Should he use the CARRIAGE
RETURN or the ALT MODE key to request the
computer to act? And so on, for something like a dozen basic control functions.
The present chaotic situation is rapidly growing
worse as more systems are brought into operation. For
each new system, the local computer programmers
take it as their prerogative to concoct yet another
unique language and method of control. This is a direct
result of lack of standards subscribed to by the users
to provide compatibility, as well as a lack of administrative guidance. If such proliferation continues, unchecked by any plan or guidance, it is predictable
that by 1972 there will be hundreds, or more likely
thousands, of differing control methods and languages.
If so, a substantial part of the great advantage of widespread communications and of direct access to the
automated information systems will have been needlessly destroyed.
This is not a unique situation. History is merely repeating itself. The North American railroads could
not merge into a true system until they had undone
a great proliferation of track gauges, couplers and
brakes. Likewise, the power companies could not
link together until they had abandoned a variety of
powerline frequencies. Telegraph and telephone followed a similar course. A nationwide highway system
depended upon undoing many local signalling and road
marking conventions.
In 1968 we are confronted with the ingredients of
information systems, and again, the existing confusion
must be undone to some extent, and a set of interface
and operating standards must be agreed upon, in order
to provide a viable system of information networks.
The purpose and orientation of this paper

The purpose of this paper is to attempt to find a way
out of this chaotic situation by the development of
standards for user control procedures and standards
for data formats to be used in automated information systems and networks. A goal is to make it possible
for a person sitting before the keyboard of his on-line
teletypewriter or display console to use a standard
method for establishing communication with, and
using, a remote automated information storage and
processing system. Another goal is to provide him
with a standard method for control of format and the
interpretation of the output of such systems.
The method of approach is to examine the complex
environment confronting the person making contact
with and using a remote automated information processing system, and from this examination to discover the fundamental functional characteristics of the situation. From the study of the logical properties of these
functional characteristics, it has been possible to formulate a proposed course of action that can lead to
workable standards for use in automated systems.
Two main areas are treated. The first area has to do
with the problems of elementary user control procedures and methods for gaining access to and control of
automated data store and computer systems. It is
primarily concerned with the possibility of standardizing a set of certain elementary keyboard actions. It
stops short of undertaking a detailed study of all user
command languages (e.g., BASIC, QED, or QUICKTRAN).
The second area is concerned with the format in
which data and other information are presented to the
user, or in which it may be interchanged between
cooperating automated information systems.
An important constraint has been to formulate a
course of action which would provide users with a
set of basic control methods and procedures, and
methods for dealing with data formats, yet which
would not destroy the investment in locally-established procedures or data stores. The goal of satisfying this constraint appears to have been achieved.
The results of the study in a nutshell

In brief, the results of the initial part of this standardization study are as follows. The fantastic variety
of different keyboard control actions presently being
used is so great that any effort of standardization by
seeking a consensus is out of the question. However,
it is found that there are only about twelve logically
distinct elemental control actions which are of crucial
importance to the user when he initially enters an
automated information system. These elemental logical actions can be standardized as to function and can
be given standard keyboard assignments. Then the
user, by invoking these standard actions, can have access to the full control resources of the rest of the system. These twelve control actions are shown in Appendix I, along with suggested keyboard assignments.
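To make the idea concrete, the twelve elemental actions can be pictured as a small lookup table consulted by a terminal handler. The sketch below is purely illustrative and is not part of the proposal: the dispatch structure, the function name, and the fall-through to ordinary text are our own assumptions; only the mnemonics and suggested assignments come from Appendix I.

    # Illustrative sketch only: the twelve elemental control actions of
    # Appendix I as a lookup table. The proposal fixes the logical actions
    # and suggested key assignments; this dispatch code is an assumption.

    ELEMENTAL_ACTIONS = {
        # mnemonic: (suggested keyboard assignment, description)
        "US": (";", "user signal"),
        "TS": ("\n", "typewriter signal (carriage return/line feed or new line)"),
        "ST": ("<break>", "stop"),
        "DC": ("#", "delete character"),
        "DL": ("&", "delete line"),
        "LP": (None, "literal prefix (assignment not reproduced here)"),
        "AS": (",", "argument separator"),
        "HE": ("?", "help"),
        "SM": ("SM", "standard mode"),
        "RV": ("RV", "revert one level"),
        "RS": ("RS", "restart (or begin)"),
        "EX": ("EX", "exit system"),
    }

    def elemental_action(key: str) -> str:
        """Map a keystroke to an elemental action mnemonic, or pass it through."""
        for mnemonic, (assignment, _) in ELEMENTAL_ACTIONS.items():
            if assignment == key:
                return mnemonic
        return "TEXT"  # not a control action: ordinary text for the system

    print(elemental_action("#"))   # -> DC, i.e., delete character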
The necessity for user group action

Because the technology of automated information
systems is still so very new, the number of people who
are now actually using on-line terminals and automated information systems is still very small. However, they already begin to provide a cross section of
the great number of users of the future. These present
users - since they can now realize what some of the
problems are - hold a great obligation to the users of
the future. It is their obligation to begin immediately
to take the steps that are necessary to protect the vital interests of themselves and future users, and to do so
before the present chaotic situation grows larger or
becomes permanent (as it could!).
Historically - that is, since computers first came
into being - any layman user of a computing machine
found himself cast in a role which was inferior to that
of the machine. The layman users have had to conform to the requirements set by those in charge of the
machines. When questions arose affecting the interests
of the users, it was the convenience of the machine
(speed and efficiency, or inherent limitations) which
was cited as providing the basis for the decisions
taken. The users had essentially no recourse, since
the machines were in fact rare, expensive, somewhat
limited in capability - and most important, under the
control of another group.
A change in orientation is now justified. Processing machines are now available in abundance. They
have speeds and power of a magnitude only hoped for
ten years ago. Auxiliary facilities (disk stores, displays, communication) have improved tremendously. As a result, we should now take a new
look at the ways these machines are applied. More important, we should reconsider the manner in which
we think about ourselves in respect to the machines.
In comparison to the machines, the rare and valuable commodity has now become the person doing a
job. The acknowledged purpose of the large scale automated information system is to provide service to
people - to provide them with information, and to arrange and rearrange the information that they bring to
the system.
The on-line terminal and automated systems

Our thinking about an automated information system should begin with the "user," the person sitting
down at the on-line terminal keyboard. When not involved with an active connection to some automated
system, the on-line terminal behaves much like an ordinary typewriter. One makes it go by touching keys,
and the result is line after line of printing on the page,
or display on the screen.
An on-line terminal can be thought of as a typewriter that permits one to "do things" at the keyboard - to
do things of a most general sort. One can store text,
edit it, arrange it, print it out, search for and retrieve
information, perform computations, transmit text to
another location, and do many other things - limited
only by the range of facilities accessible to the system.
Access to automated information systems will also
be performed by a wide variety of devices with keyboards, in addition to typewriter-like devices. Thus
consideration must also extend to such things as TV-type presentations or displays, to various photographic processes and printing methods, various high speed
methods of printing, and the like. However, it is probable that typewriter-like devices will be the most
widely used devices in the predictable early future.
For this reason, and also for the reason that most persons are familiar with the typewriter, the typewriter
will be used as the most useful point of departure in
considering standards for user procedures in automated information systems.
The user-system interface

The confrontation between the person and the automated system comes at the user's fingertips and at the
user's eyes. The user is concerned with what he must
do with those fingers to perform the actions that he
desires to accomplish. Similarly, he is concerned with
the interpretation to be given to the text which is presented on a page before his eyes.
The confrontation at the user's fingers encompasses
the area of "standardization of user procedures,"
since it is through such user procedures that the action
to be taken by the system is determined.
The confrontation at the user's eyes encompasses
the area of "standardization of data formats," since
the form and arrangement in which the text is presented on the page is most important for his understanding of what any particular passage of text means.
It is at this user-system interface that we are particularly concerned: first with the logical or functional
characteristics of the system, i.e., what the system can
do for the user; and second with the manner in which
the user tells the system to do something for him.
Throughout, we must view the behind-the-scenes
technology (whether electronic, computer, or communication) as being in a secondary service role to the
user, whoever and wherever he may be.
Focus upon details or upon principles?

Most standardization efforts, in the ordinary course
of affairs, are concerned with achieving and stating a
consensus with regard to a relatively large set of particular details, technical aspects, physical dimensions,
codes, numbers, material compositions, procedures,
and the like. This course cannot be followed here.
There is no basis for such a consensus in overt detail. Already there are several dozen data storage and computer projects, and they are characterized by vastly different details in their techniques for accomplishing even the simplest (which are the most important) things. Even the briefest conversation about these matters with a custodian of one of the presently operating automated systems will disclose how completely he realizes the impossibility (and undesirability) of achieving a useful standard based upon a consensus of details as practiced at the various automated systems.
This kind of impediment to standardization prevails in both parts of this standards study. In the "user
procedures" area, consensus is impossible because
of the great variety of control methods used for character delete, transfer of control, call for help, and
other actions. In the "data formats" area, the possible
number of usable formats for data is infinite. Thus
again, consensus based upon detail is not possible.
The only basis for achieving the consensus required for cooperation and standardization is through
the focus of attention to another level, a level at which
consensus is possible.
In user procedures, we must focus our attention upon the various elements of logical control which are
accomplished, and not upon the detailed key action,
button pushing, or other ritual used to initiate each
element of control action. Most of the elements of
logical control are present in all systems, in one form
or another. Therefore it is possible to discover, isolate, and identify these basic logical elements. Then,
with a list of such recognized logical control elements
available for discussion, it is reasonable to expect
that a consensus upon them can be reached, and that a
given set of them can be chosen as being required
and necessary for user control of automated systems.
After a consensus is available on such a common set
of logical control elements - then and only then - will
it be appropriate to consider choices for standard keyboard methods of accomplishing the different logical
control actions.
A surprising outcome of this study is that it appears
that only about twelve basic logical elements are sufficient when used in the proper environment.
A similar situation prevails in the area of data formats. There is no hope for any success through the
cataloging of formats to be used, and seeking a consensus upon them. Yet, it does appear that a consensus can be developed for a standard method of describing formats for data. Each specific use of data can then
have a standard and commonly understood description of its format. Through this description, the meaning, or purpose, of each segment of text or data can be
understood by the user. The same standard data format descriptions can also be the basis from which the
automated systems will be able to manipulate and rearrange the data and place it in more useful forms.
User procedures vs. programming languages

In this study of standardization, it is necessary to
have a clear distinction between "user procedures"
and "programming languages." By "user procedures"
we are concerned with such elemental control actions

as: how to correct typing errors while communicating
with the system; how to get the attention of the remote
automated system; how to get assistance in the use of
the automated system; how to take overriding control to stop undesired actions; and so on. We are also
concerned with the procedural environment in which
these commands are given.
In contrast, "programming languages" are concerned with formal languages systems, such as FORTRAN, ALGOL, or COBOL, which are used for
writing out a step-by-step description of the series of
actions to be taken by a computer in carrying out
some computation or manipulation. A complete prepared description of this kind is what is known as a
"program." Associated with programming languages
will often be means used by the programmers and
technicians for writing, correcting, filing, and maintaining such programs.
Programming languages were developed for an entirely different purpose than that of serving the person
working at an on-line terminal communicating with
the ordinary automated information system. Such
programming languages are mainly for the technicians. For this reason, the conventions and habits
that have been built up around these formal computer
programming languages do not necessarily have any
direct application to user control procedures. Indeed,
some of the conventions of these languages are inherently bad for our purposes. Therefore we should be
most cautious about importing conventions from these
languages (without careful examination) into standards for user procedures.
There is an intermediate class of languages which
deserves careful consideration as to how they might
fit into this program. These languages are in three
groups: (1) the on-line numerical computational languages specifically oriented to layman users, exemplified by BASIC, JOSS, and QUICKTRAN; (2) the
on-line editorial languages for text preparation and
correction, exemplified by QED, TYPESET and
RUNOFF, and ATS; and (3) the general on-line teletypewriter control languages, such as TRAC. Each of
these languages is of tremendous interest to the user.
For various reasons, it is undesirable to attempt to include them at this time within this standardization
study. Later, it may be very desirable and appropriate
to nominate certain of these languages for consideration for future standardization. Until then, we should
observe how the proposed standards for "user control" and "data formats" impact upon these specific
on-line languages. As a result of such impact, it may
be desirable to formulate requirements for minor modifications of the nominated languages in order to make
their control features more compatible with the standard user control procedures.

In further explanation of what is meant by "user
procedures," the following situation should be contemplated: A person (meaning a person not expert in
either programming or in computers) comes to an on-line terminal keyboard. This keyboard may be different from the one with which he is familiar. The
system into which he makes a connection may be different from the one on which he has had his training.
Frequently he will be connected into a remote automated system which he has never used before. Such
is the situation. In seeking standards for user procedures, we desire to provide a standard course of action which will allow such a person to use the new keyboard, to use the local facilities, and to use the remote automated facilities to perform the work he
needs to do. Attention must therefore be focused upon all of these aspects of the situation.
The concern must be with respect to what the user
will do, and how he can be guided in his actions in using the individual elements of the environment provided to him. We are overwhelmingly interested in the
logical requirements of the user's task (and not of the
logic of the machine). We are concerned with how the
user's actions may be delimited by his human physical
or mental capabilities and limitations. If there are discrepancies or shortcomings (as there may well be) in
the presently available hardware or in present software facilities, these discrepancies should be discovered and examined. Future hardware and software
must accommodate the user's logical requirements.
Biological evolution is simply too slow to make the
user accommodate!
Formats - external vs. internal
Data formats and file structures have long been a
very esoteric matter. Part of their complexity and
mystery arose from the historic development of computer hardware and storage media. Data formats and
file structures were based upon the artifacts of particular computers and upon the 80-column IBM tabulating card. Other aspects of the complexity can be explained by the early limitations, both in size and generality, of the available equipment. These limitations
led the programmers to develop many complex and ingenious techniques to accomplish their tasks at hand.
The limitations in hardware from these early days in computer history are now rapidly disappearing - particularly the limitations in storage space and
speed of operation of the processors. Unfortunately,
the programmers have not in general used these technological advances to make things simpler. Quite the
contrary. Contemporary programming and use of the
systems seems now to involve more complexity than
ever before. What is worse, some of these elements of complexity often obtrude into the user area, limiting
the user in what he can do, and requiring the user to
know unnecessary details about the internal structure
of the system. This requirement on the user extends
into the area of data formats and file structures.
In this study of standards, in its concern with data
formats, a thorough-going attitude is taken of separating the external representation of the data from all
matters of internal representation. The external representation is of paramount interest to the user. It is
the one which he sees upon the piece of paper, or he
sees displayed on the TV screen. The external representation is also the one that he must employ when
he is entering new data (text or numbers) into the system. It is also the representation that he must picture
in his mind when he operates on (edits, selects, or rearranges) the stored data by means of his keyboard
actions.
In contrast, the internal representation of data is
concerned with the manner in which particular computers and other hardware store and move the data
around inside the system. This is not the user's province, and there should be no imposition upon him as to
how it is done. For example, it should be of no concern
to him whether the hardware represents his data by
means of 6-bit, 8-bit, or 64-bit units within the computer. He should have no concern with "packing" and
"unpacking" of characters within the computer words.
He should not be troubled with physical file units.
These are all aspects of the supporting technology, i.e.,
of the hardware and the programs.
By "external representation" is meant the manner
in which the data is presented to the user - the arrangement of the data in lines or columns, or the like.
Thus the user is concerned with the various separators
(such as spaces, or new lines) that determine the format of presentation. When data is arranged in a known
format, it is easy to know what a data element means
(e.g., whether the name of a person or the name of a
street). In general, the format of presentation may include indentations, columnar arrangements, punctuation, special markers, and the like. Because files may
be extensive, the user must also be able to command
the display (or printout) of only a part of some file. In
this case, he must be able to give a command in terms
of the external representation of the data. His command may depend either upon format (e.g., "show the
next line") or upon the data element (e.g., "type the
address"). A technique for data formats must therefore be provided which will permit a_user to give such
commands for the handling of the data.
The techniques for external representation will have
a wider application than merely facilitating what the
local user sees and does. They can provide a standard format description and control method for the general
interchange of data between automated information
systems. In data interchange between automated
systems, the machine at the receiving end is faced
with some of the same problems of interpretation and
control that a user faces. Thus the solution of the problem for the human user - as a byproduct - provides a
machine-independent solution for the interchange of
data between systems.
In this sense, it is appropriate to consider that "external data representation" should deal with all matters of data representation for the user when the data
are written on any display medium or interchange medium. By "interchange medium" we must include wire
transmission, paper tape exchange, magnetic tape exchange, or any other form of machine readable record
interchange. For many of these interchange media,
standards of some sort already exist - but they are not
standards which apply to the format of the data as the
user sees it. Instead, the standards for the "internal"
representation" in general permit any user "text" to
be written, and the user is concerned only with the
manner in which such text can be structured and put
in format for his external use. In consequence, these
two areas - "internal" or machine representations
and formats, and "external" or user text representations and formats - can be considered quite separate.
In fact, it is important that such separation be preserved, and that separate standards be developed for
the two distinct domains. If this separation is maintained, then the work on external data formats will not
collide in any way with the requirements and standards for message transmission, for magnetic tape
records, or for other media representations.

Standard description for external data formats
Since data formats have a potentially infinite
variety, it is futile to consider the standardization of
the formats themselves. This is not attempted. The
effort is directed first to discovering what constitutes
the necessary logical structure behind formats in
general. To be specific, an external data format, in
the sense used here, consists of text segments separated by various spacers or syntactic markers. These
provide the clues to the meaning to be attached to the
different text segments. The markers do not act in
isolation: their meaning must be stated somewhere.
The formal statement of the manner of the use of the
syntactic markers, and how they are used to create the
total format and to specify the content of the text
segments is called a "data description."
Data descriptions can be standardized, and this is
one of our goals. A logical treatment of the elements
of format generation has been developed, and from this a suggested standard method for the description of external data formats is outlined in Part IV of the Reference Reports.
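To make the notion concrete, the following sketch invents a toy notation for a data description and applies it to one record. The notation, the field names, and the sample record are assumptions for illustration only; the actual proposed method is the one outlined in Part IV of the References.

    # A toy "data description": a formal statement of which syntactic
    # marker ends which named text segment. Everything here is invented
    # for illustration; Part IV of the References defines the real method.

    DESCRIPTION = [
        ("name", ";"),     # name of a person, ended by a semicolon
        ("street", ";"),   # name of a street, ended by a semicolon
        ("city", "\n"),    # city, ended by a new line
    ]

    def parse_record(record: str, description=DESCRIPTION) -> dict:
        """Split one record into named segments according to its description."""
        fields, rest = {}, record
        for field, marker in description:
            segment, _, rest = rest.partition(marker)
            fields[field] = segment.strip()
        return fields

    entry = parse_record("Smith, John; 12 Main Street; Anytown\n")
    print(entry["street"])   # a command like "type the address" can now be
                             # answered in terms of the external format alone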
ACKNOWLEDGMENT
This paper is based in part upon the series of referenced reports prepared under contract by C. N.
Mooers for the National Bureau of Standards in 1967.
That contract was inspired in part by the insight and
dynamic persuasiveness of Dr. Chalmers W. Sherwin.
REFERENCES
C N MOOERS
Standards for user procedures and data formats in automated information systems and networks
Zator Co Cambridge Mass
Part I - The need for standardization and the manner in which standardization can be accomplished
July 5 1967 45 pp
Part II - The standardizable elements of user control procedures and a unified system model
Aug 10 1967 39 pp
Part III - A suggested standard keyboard assignment for the elemental user control actions
Aug 1 1967 21 pp
Part IV - A standard method for the description of external data formats
Aug 28 1967 32 pp
NOTE: The above reports were prepared under Contract No. CST-382 to the National Bureau of Standards.
APPENDIX I - TABLE OF ASSIGNMENTS
ELEMENTAL USER CONTROL ACTIONS

Description                          Mnemonic   Suggested Assignment*

Dialog Signals
  User Signal                        US         ; (semicolon)
  Typewriter Signal                  TS         (carriage return) (line feed) or (new line)
  Stop                               ST         (break)

Single-Character String Commands
  Delete Character                   DC         #
  Delete Line                        DL         &
  Literal Prefix                     LP
  Argument Separator                 AS         , (comma)
  Help                               HE         ?

Commands to the System
  Standard Mode                      SM         SM or STD
  Revert One Level                   RV         RV
  Restart (or begin)                 RS         RS
  Exit System                        EX         EX

*These are merely suggestions for purposes of discussion. The assignments suggested in this chart are from Part III of the References.

Procedures and standards for inter-computer
communications
by ABHAY K. BHUSHAN and ROBERT H. STOTZ
Electronic Systems Laboratory
Massachusetts Institute of Technology
Cambridge, Massachusetts

INTRODUCTION
In recent years there has been considerable discussion on computer communication networks,1,2,3 information service systems,4,5 and the computer utility.6,7 Sophisticated displays maintained by small processors connected to remote multi-access computers are also being developed.8 Such applications involving interaction and exchange of information between computers are increasing in number and pointing to the widening need for reliable computer-to-computer communications. In order to allow communication between many arbitrary computers, a uniform agreed-upon manner for exchanging information is needed.
This requires the establishment of a standard message
format (i.e., character code structure and message
syntax) and common communications protocol procedures (i.e., the agreed-upon manner of exchanging
messages).
This paper is an attempt towards defining the needed format and protocol for inter-computer communications. A standard message format, based on
USASCII* which appears to be flexible enough to
cover the entire range of inter-computer communications is proposed. For the protocol procedures, only
guidelines are presented because a single standard
may not be feasible due to the wide differences in hardware, software, and capabilities of various machines,
and the differences in user requirements and nature of
operation. (It should be noted that the procedures described in this paper are recommended for general inter-computer communications and may not necessarily be the best for some particular non-gene~alized
applications.9)
Computer-to-computer communications
Computer-to-computer communication differs from
communication to teletypewriters and remote terminals
*USA Standard Code for Information Interchange.

attended by a human operator in several important
respects. These come about because of the absence
of the interactive human operator from the environment. When a teletypewriter is the recipient of a message, the human operator acts as a highly sophisticated interpreter and error detector/corrector. On the
other hand if a computer is to act on a messflge received, it must perform these functions. The computer must be told the beginning and end of messages
and it must do careful error checking, since its acting
on a bad message can have disasterous effects. Generally, this implies retransmission systems with
schemes for block error detection and message acknowledgment,10,11 which are straightforward techniques. for machines with data storage and· programmable processing capabilities.
Another difference between inter-computer communications and communication to terminals attended
by a human operator is that a man can seldom take
advantage of very high data rate lines. A computer by
contrast is very good at making decisions rapidly so
that it can usually interact effectively over highspeed communication lines. In fact, communication
at high speed is economical, efficient and often essential for an adequate interaction between machines.
The cost of high-speed data transmission between
computers is likely to be drastically reduced in the
future with the introduction of digital transmission into
the communications network. Transmission systems
carrying all types of communications in a digital pulse
stream are gradually being introduced into the Bell
System.12,13 Eventually they will be connected together in a digital hierarchy to form a nationwide digital communications network.
Inter-computer communication also differs from
communication between a computer and such devices
as tapes, discs, and printers. Computers have processing capabilities that data storage devices do not
possess. Thus communication need not be on a message-by-message basis. The rules for inter-computer
communication need to be and can be more elaborate,
and the communications can be made more efficient
and interactive.
The present-day public facilities for data communications include the Bell System and Western Union
wideband facilities.14,15 The Bell System 303 series
data sets provide communications at speeds up to
230.4 kbps on private leased lines. The data transmission facilities may be synchronous or asynchronous
and full duplex or half duplex. * Synchronous transmission is more efficient as it obviates the need for
start and stop elements with each character. For this
reason, most computer I/O terminals have only the
synchronous adapter option available for high-speed
links.
The message format and communications protocol
procedures described herein do not depend on the nature of the transmission network. They are recommended for full-duplex as well as half-duplex, and
synchronous as well as asynchronous, transmission at
speeds typically ranging from a few hundred bits per
second to well in the megabits per second range.

Standardization and flexibility
Our intent in the design of the message format and
communications procedures is to provide as much
consistency with standards as possible, without unduly losing flexibility in operation, overall efficiency,
and convenience of transmission. It is essential for
this objective that all communicators employ a common standard code and be able to use the existing facilities at little or no extra cost. The obvious choice for
this code was the USA Standard Code for Information Interchange16,17 (USASCII or ASCII).* In fact
most communication carriers and computer equipment manufacturers have accepted this standard proposed by the U.S.A. Standards Institute and are manufacturing equipment based on it. The general message format described in this paper follows closely
the draft proposed USA Standard Communication
Control Procedures,18,19 and is intended to be compatible with equipment offered by the communication
carriers and the computer industry.
*Synchronous (or character synchronous) transmission means characters (and bits) are transmitted at a fixed rate. Asynchronous transmission means the interval of time between characters can vary arbitrarily. Full duplex is the ability to transmit simultaneously in both directions, while half duplex is the ability to transmit in both directions, but not simultaneously.

*USASCII X3.4, 1967. The standard code is referred to both as the ASCII code and the USASCII code.

There have been some questions regarding the usefulness of USASCII in a computer network environment. Alternative solutions to message formatting (incompatible with USASCII) using fixed block lengths and rigid bit coding have also been proposed.3,9 It has
been argued that it is not desirable to conform to
USASCII, if more efficient alternatives can be found.
It is our feeling that besides being a proposed USA
standard, USASCII represents a very good way of
transmitting information. Though USASCII may not
be the best for one particular application, it appears to
be the most reasonable for a large variety of applications. Also, it can be used with intelligence to provide
a high degree of flexibility. An alternative coding scheme could not be very much more efficient than USASCII while still providing the degree of flexibility and reliability that is needed for inter-computer communications; it might be uneconomical as well as unnecessary.
In our recommendations we have adopted the basic
framework provided by USASCII, and have extended
it to suit the many requirements of inter-computer
communications. For certain applications (such as
very efficient binary text transmission) we have felt
USASCII to be somewhat inadequate. Alternatives
have been suggested wherever necessary and possible. It is our hope that the proposed USA standards
may be modified for the inter-computer communications applications, or at least not made so rigid as to
make suitable alternatives incompatible.

The message format
For clarity we shall first define some terms:
A transmission - a group of one or more messages which are transmitted continuously without interruption.

A message - a sequence of characters arranged for the purpose of conveying information from an originator to one or more destinations. It includes all text and the appropriate heading.

A heading - a sequence of characters which constitute the auxiliary information necessary to the communication of a text. Such auxiliary information may include, for example, characters representing routing, priority, security, message numbering, and associated separator characters.

A text - a block of data that is to be transmitted as an entity from the sender to the receiver(s).
[Figure 1 - Basic message format: SOH, heading, STX, text, ETX, BCC (block check character); the block check begins after the SOH and ends with the ETX]

[Figure 2 - Blocking of text into separate messages; each block except the last ends with ETB and a BCC (* used only when messages constitute separate transmission blocks)]

A typical message is illustrated in Fig. 1. Each character in the above message (with the exception of BCC) is a 7-bit ASCII character followed by one bit of odd parity (an odd rather than even parity is chosen to facilitate bit synchronization). The least significant
bit is transmitted first and the parity bit is transmitted
last.20 The characters SYN, SOH, STX, ETX, ETB and EOT are all standard ASCII communication control characters and have a fixed code value and meaning.
The message starts with an SOH (Start of Heading)
followed by the heading (messages that have no heading information may start with an STX but are of
questionable value in an inter-computer environment).
An STX (Start of Text) signals the end of heading and
the beginning of user's text. An ETX (End of Text)
indicates the end of user's text and of the message.
A block check character (BCC) immediately follows
the ETX. (Current USASI proposals recommend the
longitudinal block parity check.*)
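A short sketch of the sender's side of this framing, under the rules stated here and in the footnote below, may be helpful. The function names are ours, and the SYN preamble is omitted; this is an illustration of the stated rules, not the standard itself.

    # Sketch of the Fig. 1 framing and its block check: every character is
    # 7-bit ASCII plus one bit of odd parity; the BCC is a longitudinal
    # (even) parity over each of the 7 bit positions of every character
    # after the SOH through the ETX, and the BCC's own eighth bit is an odd
    # character parity on the BCC itself.

    SOH, STX, ETX = 0x01, 0x02, 0x03

    def odd_parity(ch7: int) -> int:
        """Attach an odd-parity eighth bit to a 7-bit character code."""
        return ch7 | (0x80 if bin(ch7).count("1") % 2 == 0 else 0x00)

    def frame(heading: str, text: str) -> bytes:
        """Build SOH <heading> STX <text> ETX BCC, parity on each character."""
        checked = [ord(c) & 0x7F for c in heading] \
                + [STX] + [ord(c) & 0x7F for c in text] + [ETX]
        bcc = 0
        for ch in checked:
            bcc ^= ch            # binary sum without carry = bitwise XOR
        return bytes(odd_parity(c) for c in [SOH] + checked + [bcc])

    # A receiver strips the parity bits and XORs every character after the
    # SOH through the BCC; a zero result means no bit column was disturbed.
    print(frame("A7", "PAYLOAD").hex())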
In many systems there are limitations on maximum
message (or transmission block) length due to such
factors as error rates or buffer capacities. If the size of
a text provided by the user exceeds the maximum
message (or transmission block) size, the text may be
blocked into several messages (or transmission blocks)
as shown in Fig. 2. An ETB (End of Transmission
Block) is used to indicate the end of all text blocks
except the last (which ends with an ETX), and is also
immediately followed by a block check character. The
error check includes the ETX or the ETB, which ends
the checked sequence, but does not include the SOH
(or STX) which starts it, as shown in Figs. 1 and 2.
For the purpose of message block identification and
other communication protocol needs, each of the messages (or transmission blocks) should be preceded by a
separate heading. Note that this is a modification of
USASI standards which propose that only the first of
these blocks should be preceded by a heading. Thus
*This is an 8-bit character generated by taking a binary sum, without carry, on each of the 7 individual bits of the transmitted code (resulting in an even longitudinal parity). The eighth bit of the BCC is a character parity on the BCC itself, and is in the same sense as the rest of the characters (odd in our case).

messages (or transmission blocks) which contain only
a part of the text should be regarded as separate messages insofar as communications is concerned.
On start of transmission, SYN (synchronous idle) characters (the number may vary with equipment needs) are supplied to provide the communications adapter with correct synchronization and indicate the start of transmission. These may also appear within the message (if required for synchronization), and may be stripped from the message on reception. An EOT (End of Transmission) signals the end of a transmission which may have contained one or more messages, as shown in Figs. 2 and 3. For half-duplex service the originator may go into the receive mode (temporarily relinquishing its right to transmit) after the transmission of an EOT. (Proposed USASI standards recommend this 'EOT' function on ETB and ETX's also. This would prohibit multiple message transmission. We recommend that only the EOT character should be used for this purpose.)

[Figure 3 - Multiple messages in a single transmission]

Multiple message transmissions of the type shown
in Fig. 3 are important towards making the communications more convenient and efficient. A message
need not be acknowledged before the next is sent,
and messages can be sent continuously without interruption. For mUltiple message transmission capability it is important that all messages containing text
and requiring an acknowledgment (for error control) contain a message identification in the heading. Messages with identifications may now be acknowledged
with their identification numbers. Non-text messages containing only heading information (such as acknowledgments) may have the format shown in Fig. 4.
These messages start with an SOH and end with an
ETB. A BCC follows the ETB so that the heading
message is error checked. For extra error protection,
other messages containing text may also have their
headings error checked by the use of ETB (see Fig. 4).

Figure 4 - Non-text messages and error checking of headings

Heading information
Information in the heading of the message is used
only to aid the communications program to transmit,
receive, sequence and route the messages efficiently
and securely. The heading information is never seen
by the user, and all information regarding the text
portion of the message (e.g., should the ASCII characters in the text be interpreted as ASCII or binary) is
contained in the text portion of the message.
Our scheme for coding information into the heading
is to separate the ASCII characters into three categories: controi characters, key characters and argument characters. As shown in Fig. 5, control characters are all those characters with octal values from
00 through 37, key characters are those with octal
values 40 through 77, and argument characters are all
those with octal values greater than 77.
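The three categories follow directly from the octal ranges just given. The classifier below is our own illustration; only the category names and ranges come from the text.

    # The three-way classification just described, by octal code value.

    def category(ch: str) -> str:
        code = ord(ch)
        if code <= 0o37:
            return "control"     # standard USASCII control characters
        if code <= 0o77:
            return "key"         # key characters for additional controls
        return "argument"        # includes DEL (octal 177), per Fig. 5

    assert category("\x06") == "control"   # ACK
    assert category("#") == "key"
    assert category("A") == "argument"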
The control characters are ASCII control characters assigned meaning only in accordance with proposed USASI standards. The key characters have a
predefined meaning and may be followed by a string
of characters from the third category called the argument of the key. The argument of a particular key is
interpreted by the computer in accordance with the
predefined meaning assigned to the particular key.
The key completely defines the interpretation of the
information contained in the argument. Additional
separator characters are not needed, as key characters themselves act as information separators.
The key characters together with some control
characters serve as heading control characters. When used for the purpose of heading control information,
the standard meaning of the ASCII control characters must be preserved, thus communication control
characters like ACK, NAK and ENQ are not used in
the heading because there exists computer/communications hardware that is sensitive to these. Note also
that proposed USASI standards disallow the use of
these characters within a message. Additional key
characters may be generated by using an escape
character (a specified key or control character). The
escape character followed by an argument assumes
the meaning of a new key character. If a particular key
character is not recognized by the computer software
(or device hardware), it is to be ignored (together with
the argument which may follow it) until the next
recognizable key or control character appears.
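Under these rules a heading can be interpreted in a single pass. The sketch below assumes two key meanings as reconstructed in Table I ("#" introducing a message identification, "+" an acknowledgment) and shows how an unrecognized key is skipped together with its argument; the code itself is our illustration, not part of the proposal.

    # One-pass heading interpretation: a key character opens an argument
    # string, key characters themselves act as separators, and an
    # unrecognized key is ignored together with its argument.

    KNOWN_KEYS = {"#": "message identification", "+": "acknowledge"}

    def parse_heading(heading: str) -> list:
        items, key, arg = [], None, ""
        def flush():
            if key in KNOWN_KEYS:
                items.append((KNOWN_KEYS[key], arg))
        for ch in heading:
            code = ord(ch)
            if 0o37 < code <= 0o77:       # a key character starts a new item
                flush()
                key, arg = ch, ""
            elif code > 0o77:             # an argument character
                arg += ch
        flush()
        return items

    print(parse_heading("#AB+AB%XY"))  # the unknown key '%' and its
                                       # argument 'XY' are skipped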
The concept of key character with arguments provides a large capacity for expansion and great flexibility in operation, together with simple implementation. The scheme, a simple extension of the proposed
USASI standards, represents an attractive alternative
to hardwiring particular bits in a heading. Its advantage over the 'Escape' convention outlined in
USASCII is the greater efficiency permitted by use of
single characters instead of character sequences.
For illustration, the heading control characters have
been divided into three classes, representing communication requirements for different applications. For
example, some small machines may require only a
limited class of heading control characters. Other
small machines (e.g., satellite computers for display
application) may require unbuffered transmission.
Table I illustrates a possible assignment of key characters and their meanings. This list may be modified
and extended as requirements are better understood.
In Table I, the character n stands for a binary number represented by the six least significant bits of the argument character. A message of size n contains no more than 2^n characters. Since messages can be of
variable length, n serves primarily as an upper bound
on the maximum size of the message.
The basic Class I heading control characters include
only the acknowledgment, identification, inquiry
(not USASCII control characters) and a means to
interrupt and quit. Messages containing identification
may be acknowledged with that identification. Also,
a limitation on the maximum size of the message can
be indicated. Such a bound may be necessary, as some
machines may have limited buffers.
Class II heading control characters are intended
primarily for unbuffered transmission between machines (i.e., machine A can send data to machine B
directly into its proper location in B). This is particularly important when a large time-sharing computer
communicates with a small satellite computer. In general, satellite computers will have a limited amount of memory, and it is desirable not to have to provide a large message buffer in this machine. On the other hand, if a small message buffer is provided, it will take multiple calls on the time-shared computer to get a large message across, which is also undesirable. Direct transmission is very attractive in this case. Note that the "where to" and "number of characters in this message" information must come before the text, i.e., in the heading.

[Figure 5 - Proposed designation of USASCII characters for message control in inter-computer communication. The first category (octal 00 through 37) keeps the standard USASCII control-character definitions, including the communication control characters; the second category (octal 40 through 77) is designated as key characters for additional controls; the third category (octal 100 through 177) is designated as argument characters and includes the USASCII control character DEL.]


TABLE I - Assignment of key characters

Class  Key character  Interpretation

I      # arg          argument is message identification
       + arg          acknowledge message #arg (message O.K.)
       ? arg          negative acknowledge message #arg (message in error)
         arg          repeat acknowledgment of message #arg
         n            send transmissions no longer than size n
                      interrupt (get processor's attention)
                      quit (stop transmission of messages)

II     I arg          sender's status
       & arg          sender's interpretation of receiver's status
         arg          number of characters in text of message
         arg          argument is core memory address of incoming text

III    < n            make standard buffer size n
       > n            "=0" request rejected, "=n" request accepted
         n            size n blocks of text only
         arg          routing information (sender)
         arg          routing information (receiver)
         arg          escape sequence to obtain additional key characters

Class III heading control characters are intended primarily to provide buffer allocation and message
routing information in a network environment. Thus
information such as standard buffer size, message
block size and identification of source and destination computers may be included in the heading by
using the appropriate key characters.
Alternative schemes suggested for coding heading
information have been on a bit-by-bit basis. We feel
that even though bit-coding the information may be
slightly more efficient (in that there is no overhead for
key characters and some information may be less than
6 bits), it will not have the flexibility of coding that
the key character scheme provides. As the size and
capacity of a computer network increases, or if additional functions are called for, the key character
scheme can easily accommodate these by use of longer arguments and/or additional key characters. A
rigid bit-coding scheme, on the other hand, will need a complete overhaul when a change has to be introduced.

In the long run, the key character scheme may even turn out to be more efficient than fixed bit coding for some applications, since it would not be necessary to code all of the information all the time. For example, the acknowledgments would not have a message id, priority may be assumed normal unless otherwise indicated, and acknowledgments, repeat acknowledgments, etc., may or may not be included within the heading of the message.

Text information
The manner of transmitting text information recommended herein conforms with USASCII and the proposed USASI standards. USASCII, however, is principally geared towards the transmission of alphanumeric symbols, not the binary data which is required in
general inter-computer communication. Also, it is often desirable to be able to mix binary data and ASCII
alphanumeric data within a single message.
The mechanism we have chosen to transmit binary
data is to preface it in the text stream with a Group
Separator (GS) character (ASCII octal 035), and to
escape (back to alphanumeric text) with a File Separator (FS) character (ASCII octal 034). The binary data
is encoded as the six least significant bits of ASCII characters with bit 7 always a 'ONE.' The binary message is then treated simply as a stream of ASCII characters. It should be noted that the 64 permissible "binary" characters are identical with the argument set
and none of them fall in the category of key or control
characters.
The positioning of GS and FS characters in a text
stream is completely independent of the message itself. Thus a single message may contain both binary
and USASCII information, or any combination of the
two as shown in Fig. 6. The start of text is always assumed alphanumeric unless indicated otherwise by
a GS.

Figure 6 - Transmission of binary and alphanumeric text within a
message

Characters with 'ZERO' in their most significant
bit may occur within binary mode of transmission and
be interpreted usefully. These may be control characters or key characters. The USASCII control characters are to be used in accordance with USASI recomendations. The key characters occurring within binary mode may be used to further define the binary
information that follows it and indicate what it is
about. The meaning assigned to the key characters in
text may be different from that assigned in the heading. Thus the text may be preceded by its own leader
information set by user conventions. Another example
of the possible use of key characters within binary text
may be to indicate the end of binary string if it is not
an even multiple of 6 bits. An ASCII numeral 'n'
(1 through 5) may follow the last binary character of
the string to indicate the number of meaningful information bits in that character (see Fig. 6).
Transparent text

Many people have expressed a need for a "Transparent Text" mode of operation which would disable
communications system sensitivity to control codes
and permit an unrestricted coding of data. If unrestricted transmission of full 7-bit binary information
(e.g., all 128 ASCII characters, "pure" 7-bit binary
data or encrypted information) is allowed, messages
containing the ASCII codes EOT, ETX, ETB, ACK,
NAK and similar control characters could be transmitted intact without affecting the transmission system. Similarly, if the parity bit is also ignored in the transparent mode, it would be possible to send full 8-bit binary data. As mentioned previously, the ability to
transmit binary data is extremely important in computer communications, and many computers use 8-bit
bytes.
The main difficulty with transparent text transmission is that there is no standard way to indicate the
end of this mode. If all 128 (256 if the parity bit is
also used for information) codes are allowed for data,
there can be no reserved "end transparent mode"
code. For transmitting 7-bit transparent text, USASI
has proposed a technique that makes use of the communication control character OLE (Data Link Escape)}7 The proposal is that the sequence OLE STX
initiates the transparent text mode and OLE ETX
terminates it. If a bit pattern equivalent to 0 LE appears within the transparent data, it is replaced by the
sequence OLE OLE to permit the transmission of
OLE as data. In addition, other control sequences using OLE (such as OLE ETB and OLE SYN) are
available to provide active control characters within
transparent text as required (see Fig. 7).

[Figure 7 - Proposed USASI technique for transparent text transmission (* used only when messages constitute separate transmission blocks)]
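For comparison, the USASI convention just described can be sketched in a few lines; the framing function below is our illustration of the stated rules, with the block check omitted.

    # Sketch of the proposed USASI transparent-text convention: DLE STX
    # opens the mode, DLE ETX closes it, and any DLE bit pattern inside
    # the data is doubled so it cannot start a control sequence.

    DLE, STX, ETX = 0x10, 0x02, 0x03

    def usasi_transparent(data: bytes) -> bytes:
        body = data.replace(bytes([DLE]), bytes([DLE, DLE]))  # DLE -> DLE DLE
        return bytes([DLE, STX]) + body + bytes([DLE, ETX])

    print(usasi_transparent(bytes([0x10, 0x03, 0x41])).hex())
    # -> '1002101003411003': DLE STX, stuffed DLE DLE, ETX as data, 'A', DLE ETX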

We see several difficulties with this technique proposed by USASI. First, the detection (and generation) of the special DLE control sequences must be done in hardware to be practical. This implies that the channel hardware of all the computers in a network must be built to conform to these rather complex rules for transparent text. Further, the technique will be ineffective (or disastrous!) if any of the equipment along the communication line is ignorant of these rules and is sensitive to ASCII communication control codes. Also, this method makes error recovery a more difficult task. For example, if the character that indicates the "end transparent mode" is in error there will be no simple means for escaping from the transparent mode. This is because standard ASCII communication control characters are ignored once in transparent mode. Finally, there is the price that has to be paid for inserting extra DLE's and extracting them (each character in the transmitted and received messages has to be examined as to whether or not it is a DLE).


Our approach for transmitting transparent or binary text is to provide a 6-bit binary mode within ordinary text by using the Group and File Separator characters, as suggested earlier. This has the advantage that it is uniform, USASCII conforming, easy to implement, allows mixing ASCII symbols with binary in a single message, and keeps aside the standard control character set for error control and recovery protocol. Further, it is independent of code-sensitive equipment, and does not require insertion and deletion of DLE's in the transmitted and received texts. The price paid is the reduced effective bandwidth, since only 6 bits out of each 8-bit character carry information.
Computer companies with machines based on
8-bit characters, have a compelling motivation for
providing an 8-bit binary "transparent" mode.
To accomplish this IBM, for example, has extended
the USASI approach by using the character parity
bit for information and thus allowing all possible
256 codes. 21 ,22 The longitudinal block parity check
recommended by USASI is inadequate for error
control in the absence of character parity, thus IBM
has resorted to the use of the more powerful cyclic
redundancy check23 CRC-16. This permits the checking of any code set using the checking polynomial
X16 + Xl.5 + X2 + 1. Although we recommend our 6-bit
transmission for sending binary data, we feel the
USASI standards should allow the cyclic redundancy
check as an alternative for adequate error control
in those cases where 8-bit transparent transmission
is necessary.
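For reference, a CRC-16 of the kind named above can be computed as follows. The bit ordering (high bit first) and the zero initial register value are conventions we have assumed for illustration; the cited documents fix such details.

    # Sketch of a CRC-16 using the checking polynomial
    # x^16 + x^15 + x^2 + 1 (bit mask 0x8005).

    def crc16(data: bytes, poly: int = 0x8005) -> int:
        crc = 0
        for byte in data:
            crc ^= byte << 8                  # fold the next byte into the top
            for _ in range(8):
                if crc & 0x8000:              # top bit set: shift and divide
                    crc = ((crc << 1) ^ poly) & 0xFFFF
                else:
                    crc = (crc << 1) & 0xFFFF
        return crc

    print(hex(crc16(b"123456789")))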
Since we feel that it is desirable to always avoid any
problems of transmission system sensitivity to communication control codes, such as exist in the above
two approaches, we have devised an alternative
scheme for 7- or 8-bit transparent text. In this scheme,
the ten ASCII communication control characters
(with correct parity) always retain their fixed meaning
and are set aside for error control and recovery
protocol. The transparent mode is initiated by
DLE STX. Whenever a communication control character code appears within the binary stream, the DLE character is inserted to precede it as in the previous schemes, but bit 7 of the control character is inverted (changed from 0 to 1) to change it to a non-control (and non-key) character code. Normal communication control characters retain their ASCII meaning, thus ETB and ETX signify the end of text and SYN is interpreted as synchronous idle. The price paid in this scheme is the slightly less than four per cent overhead necessary in transmitting the DLE's (assuming a random distribution of binary numbers) and the hardware/software required to implement it.
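The escaping step of this alternative scheme, on the sending side, can be sketched as follows. Parity handling is omitted, and the set of guarded codes below is our reading of the ten ASCII communication control characters; the function itself is an illustration, not the scheme's definition.

    # Sketch of the alternative scheme's escaping step: when one of the
    # ten communication control codes occurs in the binary stream, a DLE
    # is inserted before it and its bit 7 is inverted (0 -> 1), turning it
    # into a non-control, non-key code.

    DLE = 0x10
    COMM_CONTROLS = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x10, 0x15, 0x16, 0x17}
    #                SOH   STX   ETX   EOT   ENQ   ACK   DLE   NAK   SYN   ETB

    def escape_binary(data: bytes) -> bytes:
        out = bytearray()
        for byte in data:
            if byte in COMM_CONTROLS:
                out += bytes([DLE, byte | 0x40])   # invert bit 7: 0 becomes 1
            else:
                out.append(byte)
        return bytes(out)

    print(escape_binary(bytes([0x03, 0x41])).hex())   # -> '104341'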

There are several advantages to this scheme for
8-bit transparent text transmission. First, it retains
a single meaning for the ASCII communication control characters, which simplifies error-recovery
procedures. Secondly, the communications hardware does not get involved at all in the rules for DLE's, since the control codes retain their meaning.
The only place that this has to be done is at the
transmitting and receiving computers, and it can be
done either in software or in hardware. In a computer
network (of the store-and-forward type), each node in
the network can be oblivious of the fact that the
message it is handling is in transparent mode.

Communications protocol
Communications protocol here refers to the uniform
agreed-upon manner of exchanging messages between
computing machines. This includes data link control,24 acknowledgments and error recovery procedures. It may also include message buffering and
routing techniques required for communication of
messages. As pointed out earlier a single standard
protocol procedure may not be practical because of
varying needs of different systems and discrepancies
in their hardware/software characteristics. Operation
could be synchronous or asynchronous, full duplex
or half duplex, point-to-point or multi-point, centralized or decentralized, and over private line, switched
network, or a general network environment employing
store-and-forward routing techniques. In many of
the above cases, the protocol needs would differ,
and hence the procedures adopted would vary.
Even though the ASCII communication control characters (such as ACK, NAK and ENQ) and other communication control character sequences starting with DLE (such as DLE ?, meaning wait before transmit, and DLE EOT, meaning mandatory disconnect) may be sufficient for communications
disconnect) may be sufficient for communications
protocol needs, our recommendation for general
inter-computer communications is to provide these
functions in heading type messages by use of key
characters, as described earlier. There are several
important reasons for doing this. First of all much
greater error protection is provided since each of the
heading messages is followed by a BCC and is error
checked (undetected errors in control character sequences can have disastrous effects). Secondly,
multiple messages and message blocks can now be
reliably acknowledged with their particular identification number, and enquiries can be sent asking for
particular messages. (This feature when implemented
with control character sequences is limited and
clumsy.) A third reason is that key characters in
general will not cause any hardware action along
the communications line (communication control
characters on the other hand may cause line turn
around etc.). This should result in a great improvement
in operation, since it is easier and more efficient to
have error recovery in software rather than hardware
for the more sophisticated computers. Finally, in a
general network environment all the messages
(including acknowledgments and enquiries) will need
routing information. It will be difficult to provide
this routing information without the use of heading
messages.
To start transmission, a number of SYN characters
are first sent to establish the bit and character synchronization. The originating station may then send
an enquiry to request a response from the remote
station. This is a heading message included within
an SOH and ETB followed by block check character
and an EOT. The message may include (besides an
enquiry), station identification, station status and
other desirable information. The receiving station
will respond with its own heading message to establish
the link. Messages can now be exchanged between the
computing machines.
Full-duplex operation allows simultaneous communication of messages in both directions and thus
obviates the need for transmission reversals. It
also completely avoids the issue of line control, and
the possible confusion arising therefrom (i.e., both
communicators trying to transmit simultaneously
over a single line). In half-duplex operation, however,
suitable protocol procedures must be adopted to
assign line priority to the stations and avoid such confusion. To avoid "hung" conditions (confusion in
half-duplex operation) and accidental loss of messages or acknowledgments, the transmitter is always
responsible for getting the message through and
acknowledged. If the expected acknowledgment is
not received within a specified time (the exact time
may be fixed by each communicator depending on
the requirements), an enquiry is sent by the originating station to repeat the last acknowledgment.
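In outline, the transmitter's obligation can be sketched as follows; the link object, its methods and the timeout value are illustrative assumptions, not part of the recommendation itself.

```python
# Schematic sketch of the recovery rule above: the transmitter owns a
# message until it is acknowledged, and on a timeout it sends an
# enquiry asking the remote station to repeat its last acknowledgment.
# link.send, link.receive_ack and link.send_enquiry are hypothetical.
def transmit_with_recovery(link, message, max_tries=5):
    """Send one message; retry via enquiries until it is acknowledged."""
    link.send(message)
    for _ in range(max_tries):
        ack = link.receive_ack(timeout=2.0)   # block up to the chosen time
        if ack is not None and ack.msg_id == message.msg_id:
            return True                       # acknowledged: we are done
        link.send_enquiry()                   # timed out: ask the remote
                                              # station to repeat its last ack
    return False                              # fall back, e.g. status exchange
```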
Error recovery procedures

Error recovery procedures refer to the correction of
conditions such as "hung" (in half-duplex lines),
incorrect receipt of messages or acknowledgments
and loss of messages. These procedures are relatively simple if operation is in a message-by-message
manner. In the general network environment, it is
desirable to allow multiple message transmission.
This causes a few problems to arise. If a message is
received in error, we cannot assume anything about it.
It may have been an acknowledgment, a retransmission or a regular message. Our recommendation
is to transmit multiple messages continuously (for

103

reasons of greater efficiency) until an error occurs.
When an error condition (loss of message indicated
by irregularity in id numbering, acknowledgments
not received in specified time, or erroneous message
indicated by error checks) is detected, the continuous
multiple message process is to be interrupted and
error recovery procedures should go into effect.
These error recovery procedures will be simpler and more efficient on a message-by-message basis.
On detection of an error condition, the receiving
station can ask for retransmission of messages and
outstanding acknowledgments by suitable heading
messages. The transmitting station will send the
retransmissions desired with the highest priority.
When the error condition is corrected, the normal
multiple message transmission process may be
resumed.
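The detection of a lost message by an irregularity in the identification numbering reduces to a gap scan over the received ids; a sketch, with illustrative names:

```python
# Sketch of receiver-side loss detection as described above: a gap in
# the identification numbers of continuously transmitted messages
# yields the list of ids to name in a retransmission request. The
# numbering convention (consecutive integers) is an assumption.
def check_sequence(received_ids):
    """Return the ids that must be asked for by a retransmission request."""
    missing = []
    expected = received_ids[0]
    for msg_id in received_ids:
        while expected < msg_id:      # a gap: one or more messages lost
            missing.append(expected)
            expected += 1
        expected = msg_id + 1
    return missing

print(check_sequence([7, 8, 10, 11, 14]))   # -> [9, 12, 13]
```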
If for some reason the communicators are unable to
get through after repeated attempts, they should be
able to resort to additional error recovery mechanisms. One such back-up procedure may be a request to exchange line and station status information.
The status message may point out the future course
of action to the communicators, and thus make error
recovery simpler in exceptional circumstances.
CONCLUSION
In this paper we have attempted to define the message
format and protocol procedures required for intercomputer communications. Great emphasis has been
laid on flexibility, compatibility, efficiency and
convenience throughout. We have suggested the use
of the framework provided by USASI standards and
USASCII. Our scheme for coding heading information by use of key characters appears reasonably
efficient and provides flexibility and convenience.
Transmission of 6-bit binary within USASCII text
is recommended for binary messages (ideal for
machines whose word lengths are based on 6-bit
characters). Alternative schemes for transparent
text have been discussed and their relative merits
pointed out. For 8-bit transparent text, without
character parity, the longitudinal block parity check
seems inadequate for error control and the use of
the more powerful cyclic redundancy check is
recommended. Finally, we have discussed the
acknowledgment and error recovery procedures
required for a general network environment. Use
of heading messages rather than communication
control character sequences is recommended for
communications protocol. An approach we recommend is to operate on a message-by-message basis
for error recovery (the normal operation being
multiple message transmission). This would simplify
error recovery and make it more efficient. Further
experimentation in the form of real-time or simulation
studies is required to define the error recovery
procedure more completely.
We hope that the communications and the computer
industries, and various groups attempting intercomputer communications, will devote attention
to the problem of compatibility and standards.
Recognition and acceptance of such standards and
guidelines would be an important step forward in
the direction of compatible computer networks
and information utilities of the future.
ACKNOWLEDGMENTS
The authors are grateful to Messrs. J. E. Ward
and R. G. Mills for the many discussions that led
to the current paper. Appreciation is also due to
Mr. J. E. Ward for his editorial and technical suggestions.
Work reported herein was supported in part by
Project MAC, an M.I.T. research program sponsored
by the Advanced Research Projects Agency, Department of Defense, under Office of Naval Research
Contract Number Nonr-4102(01). Reproduction in
whole or in part is permitted for any purpose of the
United States Government.
REFERENCES

1 L G ROBERTS
Multiple computer networks and inter computer communications
Conference of ACM on Operating Principles, Gatlinburg, October 1967
2 A K BHUSHAN R H STOTZ J E WARD
Recommendations for an inter-computer communication network for MIT
Project MAC Memorandum MAC-M-355, July 1967
3 DAVIES BARTLETT et al
A digital communication network for computers giving rapid response at remote terminals
Conference of ACM on Operating Principles, Gatlinburg, October 1967
4 J B DENNIS
A position paper on computing and communications
ibid
5 R G MILLS
Communications implications of the Project MAC multiple-access computer system
1965 IEEE International Convention Record
6 D F PARKHILL
The challenge of the computer utility
Addison-Wesley, 1966
7 E E DAVID R M FANO
Some thoughts about the social implications of accessible computing
AFIPS Conference Proceedings, Vol 27, Part I, Spartan Books, 1965, pp 243-247
8 D T ROSS R H STOTZ et al
The design and programming of a display interface system integrating multi-access and satellite computers
ACM/SHARE 4th Annual Design Automation Workshop, Los Angeles, California, June 1967
9 P BARAN
On distributed communications
Rand Corporation Memoranda, August 1964, RM 3420 PR
10 R J BENICE A H FREY JR
An analysis of retransmission systems
IEEE Transactions on Communication Technology, December 1964, pp 135-146
11 F E FROEHLICH R R ANDERSON
Data transmission over a self-contained error detection and retransmission channel
BSTJ, January 1964
12 R A KELLY
An experimental high speed digital transmission system
Bell Labs Record, p 65, February 1967
13 D F HOTCH
Digital communications
Bell Labs Record, February 1967
14 R T JAMES
High speed information channels
IEEE Spectrum, April 1966
15 W E SIMONSON
Data communications: the boiling pot
Datamation, April 1967
16 Proposed revised American standard code for information interchange
Communications of the ACM, Vol 8, No 4, April 1965, pp 207-214
17 Code extension in ASCII (an ASA tutorial)
Communications of the ACM, Vol 9, No 10, October 1966, pp 758-762
18 Control procedures for data communications - an ASA progress report
Communications of the ACM, Vol 9, No 2, February 1966, pp 100-107
19 Ninth draft proposed USA standard communication control procedures for the USASCII
July 1967
20 Character structure and character parity sense for serial-by-bit data communication in the American standard code for information interchange
Communications of the ACM, Vol 8, No 9, September 1965, pp 553-556
21 IBM Systems Reference Library, Form A27-3004-0
General Information - Binary Synchronous Communication
22 IBM Technical Newsletter No N27-3011, re Form A22-6468-1, on SDA II
23 W W PETERSON D T BROWN
Cyclic codes for error detection
Proceedings IRE, Vol 49, pp 228-235, January 1961
24 H FICKES
Data link control procedures: what they are and what they mean to the user
Proceedings ACM National Meeting, August 1967, pp 521-525

An error-correcting data link between
small and large computers
by SYPKO W. ANDREAE and ROBERT W. LAFORE, JR.
Lawrence Radiation Laboratory
University of California
Berkeley, California

Operating environment

The need for a data link connecting small data-acquisition computers to a central computer with
great analysis power arose in a particular context at
the Lawrence Radiation Laboratory in Berkeley.
Both the type of high-energy physics experiments being performed and the operation of the available large
computer, a CDC 6600, posed unusual design problems.
Experimental requirements

A comparison of two important approaches to the
recording of data from high-energy physics experiments is instructive in providing a background for the
operation of the data link.
The bubble chamber approach uses a photographic
process to record tracks made by particles in a nuclear
event. Afterwards the photographs are examined by
elaborate man-machine scanning systems to yield
data in the appropriate digitized form. This data may
then be analyzed by computer. Bubble chamber pictures in general contain much more information than
can be abstracted from them during the first such
analysis. It is thus possible to scan the same pictures
many times to digitize the data relating to different
phenomena.
By contrast, the counting approach uses a technique
in which the data from the experiment is digitized
directly. In the past, electronic counters were the
principal devices used to record the data; today there
is a variety of such direct-digitizing devices, including
spark chambers and photomultiplier tubes. The success of a counting experiment depends largely on how
correct the physicist is initially in assuming which
phenomena are to be expected. The experiment is
specifically aimed at one or perhaps a few phenomena,
and if well-aimed, will provide the required data. But
if the physicist's assumptions were not accurate, there
is then no opportunity to re-examine the data for other

phenomena, as with bubble-chamber photographs.
The entire experiment must be rerun with the equipment set up to record a different phenomenon.
Computer availability
The physicist therefore requires rapid feedback in
the form of completely analyzed data to permit him
to alter his experiment's configuration during a run if
required. The capabilities of the small computers, such
as the PDP-8, commonly used for on-line data acquisition at the experiment, are too limited to allow this
type of analysis, although they can perform simple
checks to determine whether the experimental equipment is working normally and whether the data looks
reasonable in a general sense. The only way the
experimenter can use the large computer is to hand-carry magnetic tapes from the small to the large computer, a process resulting in a turnaround time of hours
or even days.
To provide feedback within a useful time the experimenter thus requires sufficient computing power to
provide him, on-line, with a detailed analysis of his
data in terms of physics. A suitable computer is
available at the Radiation Laboratory, a CDC 6600
located several thousand feet from the Bevatron and
184-inch cyclotron where high-energy physics experiments are conducted. Private telephone lines are also
available, running more or less along the desired
routes. The data link was conceived to provide a
reliable high-speed transfer medium between the small
and large computers.
The CDC 6600 provided several unusual problems
in the design of the data link because of its construction and the way it is normally used at the Radiation
Laboratory. The 6600 is designed to protect the central processing unit (CPU) as much as possible from
the interference of input/output (I/O) devices. The
CPU may be thought of as being surrounded by a
protective layer of central memory (see Fig. 1), which
is in turn surrounded by 10 peripheral processors (PP)
which communicate with I/O devices via 12 data
channels. The only way the CPU can communicate
with the outside world is via the 131K core memory.

[Figure 1 - CDC 6600 organization]
Nearly all the PP's are involved in I/O communications. Their tasks are assigned by a controlling PP
on the basis of availability, which makes for very
efficient use of the PP's. In this sense there is a kind
of "time sharing" within the system, an approach used
in many areas of 6600 design. This kind of time sharing results in a very fast computer, but not one which
is easily used for time sharing in the more usual sense
of devices outside the computer. For example, there
is no facility for interrupts, so that a PP is forced to
continually check a flag from a particular device to determine when service is required. This is satisfactory
for such devices as tape units. However, dedicating a
PP to watch a flag from the data link would result in a
prohibitively heavy demand on the pool of available
PP's, since many minutes might elapse between calls
for service from the experiment. The data link cannot therefore be treated as a normal I/O device.
Another possibility is to alter the operating system
to assign a PP to check at infrequent intervals if the
link is requesting service. However, any such modifications are difficult to make on the 6600 operating
systems and would result in degradation of the batch
processing throughput. It is therefore necessary to
include the 6600 operator in the interrupt chain: he is
in fact the only part of the 6600 system which can be
interrupted.

Design objectives

From the situation described above, the following
design objectives evolved for the data link:
1. No hardware modifications were to be made to
the 6600. Modifications to the software operating system were to be minimized to avoid degradation of the normal batch processing.
2. A reasonably high data rate was required, preferably exceeding that of the high-speed tape units
already used by the 6600.
3. Error-correction facilities were to be kept as
much as possible within the data link itself, to
avoid complicated software checking by both the
small computer and the 6600. Also, since the
data were to be carried by twisted-pair phone
lines, error-detection capability needed to be
quite powerful.
4. The link would need to be used only occasionally, no more often than every half hour or so, for
the transmission of 20 to 30,000 words of data.
5. Rapid response by the 6600 to the link was unnecessary: a lapse of several minutes from the
time the link requested service from the 6600
until the 6600 was able to respond was acceptable.
6. The link was to be kept as general-purpose as
possible to enable it to be used with other types
of computers if the need arose.

Design philosophy and evolution
The approach to the design of the data link was
governed by one important consideration: since a
relatively small amount of data handling would require use of the link, cost was to be minimized, and
existing equipment was to be used as much as possible. This eliminated immediately such alternate
approaches as using a medium-sized computer on-line
or substituting coaxial cable for the already existing
phone lines.

Synchronization
The use of twisted-pair lines, however, raises fairly
serious noise problems, since the route through which
they run contains many powerful noise-producing devices, such as spark chambers, magnets, and the rf
field of the cyclotron. Associated with the noise problem is one of synchronization: how to provide communication between two synchronous devices each
running independently on its own clock.
The simplest solution to the synchronization problem is to transmit an echo, or "received your last
word" signal, from the receiver to the transmitter, and
send the next word only when the echo is received. In
this way either the receiver or the transmitter can halt
the flow of data when the words cannot be obtained
from, or accepted by, the respective computer. The
alternative approach, that of sending all words without interruption at a rate slow enough for the slowest
computer to handle even during the worst case, would
actually have required a slower transmission rate.
Error correction
In order to match the format of both the PDP-8 and
the PP, data are transmitted in words of 12 parallel
bits, with a 13th bit indicating whether the word is
data or function/status (see below). Various error-detection schemes were considered. The most common, the addition of a parity bit in parallel with the
data word, has decreasing utility as the number of
bits of the transmitted word increases. In this case,
with 13 parallel bits and the possibility of powerful
noise wiping out entire words, parity was considered
too weak a system. Another common error-detection
method, the checksum, suffers from a similar difficulty
in that if the average error rate is high enough for
several errors to occur in each block of data, many
retransmissions of each block may be required before
the record is received with the correct checksum.
Also, either storage devices capable of holding an
entire data block must be provided at each end, which
is prohibitively expensive, or software in the two
computers must intervene in the case of an error to
cause the data block to be retransmitted.
A method which both solves the synchronization
problem and provides a powerful error-detection technique is to echo each word in its entirety back to the
transmitter for a complete bit-by-bit comparison before the next word is sent. This of course increases the
time to transmit each word, but the increase is not as
large as the propagation delay, since reading and
writing data between the computers and the link may
be overlapped with transmission. However, for the
line lengths in use, the propagation delay is the limiting factor: one typical line of 4000 ft has a round-trip
delay of about 12.5 μsec.
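In software terms the echo discipline amounts to the following loop; the link object and its methods are stand-ins for the hardware described.

```python
# Sketch of the echo scheme above: the receiver returns each 13-bit
# word in its entirety, the transmitter compares it bit for bit with
# what was sent, and only then releases the next word. link.put and
# link.get_echo are hypothetical stand-ins for the hardware.
def send_words(link, words):
    for word in words:
        while True:
            link.put(word)             # drive the 13 parallel lines
            echo = link.get_echo()     # full word returned by the receiver
            if echo == word:
                break                  # bit-by-bit comparison succeeded
            # on a mismatch the same word is simply retransmitted
```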
Line transmission and reception

Each individual bit of a word is transmitted over
its own line. Each line is a twisted pair, transformer-isolated from its associated transmitter and receiver.
A full duplex approach is used, and therefore 26
twisted pairs are dedicated to word transfer. An additional 8 lines are used for the control signals, making
for a total systems requirement of 34 twisted pairs.
These are privately owned telephone lines.

In order to keep the cost low, it was desirable to
design transmitters and receivers using simple circuitry but still capable of rejecting a significant amount
of noise. Each transmitter consists of two power-NAND
integrated circuit gates which drive two transistors
connected in a push-pull configuration. This produces
a bipolar pulse approximating a square wave in the
secondary of the output transformer. Two monostable multivibrators control the width of the positive and
negative areas of the bipolar pulse for all 13 data
transmitters in one synchronizer.
Both areas of the bipolar pulse are, in general, equal
in width. For reasons of convenience, the width is
chosen in this design to maintain an amplitude attenuation of a factor of 4; for instance, 1.3 μsec for a
4000-ft line.
Our initial receiver design actually proved unnecessarily complex. It required the positive portion of
the incoming pulse (now resembling a sine wave due
to the filtering effect of the lines) to pass through
a height window much like a single-channel analyzer.
In tests over the actual phone lines, however, we
found that, even with the window set quite wide, many
more errors were caused by failure to detect an existing bit than by triggering on unwanted noise. The final
receiver design therefore requires only that the positive portion of the incoming signal exceed a certain
threshold.
The maximum word-transfer repetition rate of the
link is 80,000 words per sec for a 4000-ft line (12.5-μsec round-trip delay). Obviously the lines of the link
are operating with a bandwidth far in excess of the
bandwidth of normal telephone lines. In contrast to
the latter, the link lines are limited to a length of a few
miles and do not run through switch circuits, line
amplifiers, etc.
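A quick arithmetic check of these figures, assuming the rate is limited by the round-trip echo delay alone:

```python
# Worked check of the quoted word rate, assuming the transfer rate is
# set entirely by the round-trip echo delay of the line.
round_trip = 12.5e-6                 # seconds, for a 4000-ft twisted pair
words_per_second = 1 / round_trip
print(words_per_second)              # 80000.0 words/sec
bits_per_second = 12 * words_per_second
print(bits_per_second)               # 960000.0 data bits/sec
```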
Implementation and operation
Levels of communication

Communication over the link system actually takes
place on several different levels, as indicated somewhat idealistically in Fig. 2. In a general sense the
link may be thought of as a medium of communication
between the experiment and the mathematical analysis of the experimental data. This data, and the results
of the analysis, are transformed into 12-bit data words
by the small-computer program and the FORTRAN
program in the CPU of the 6600.
The programs, however, are not restricted to sending
data. They can also communicate instructions to each
other. For example, the experimenter, via the small
computer, can request different analysis approaches or
output format, while the FORTRAN program may

108

Spring Joint Computer Conference, 1968

instruct the small computer to modify a display program depending on the results of analysis. Thus the
experimenter has available to him in a limited way an
extended on-line processing capability that includes
some control over the analyzing process.

[Figure 2 - Levels of communication over the link: the experiment and the mathematical interpretation of its results exchange data (12-bit words), function and status words, and control signals, passing through the small computer, the data link, and the 6600 PP and CPU]
i_j = q_b/τ = (q_f + q_r)/τ                                        (20)

where the total charge in the base is q_b = q_f + q_r.
The instantaneous base current becomes

i_b = i_j + τ di_j/dt + C_je dV_je/dt + C_jc dV_jc/dt              (21)

which is of the same form as Equation (5). Again, if
the time scaling is applied, Equation (20) becomes

i_b = i_j + a_t τ di_j/dt + a_t C_je dV_je/dt + a_t C_jc dV_jc/dt  (22)

where the effective lifetime is now a_t τ and the transition region capacitances are a_t C_je and a_t C_jc.

[Figure 5 - Simulated diode response]

The separation principle for transistors

Theory

The charge control equations for a transistor are3

Base current:
i_b = q_f/τ_bf + q_r/τ_br + d(q_f + q_r)/dt + C_je dV_je/dt + C_jc dV_jc/dt   (16)

Collector current:
i_c = q_f/τ_f - [q_r (1/τ_r + 1/τ_br) + dq_r/dt + C_jc dV_jc/dt]              (17)

Emitter current:
i_e = q_r/τ_r - [q_f (1/τ_f + 1/τ_bf) + dq_f/dt + C_je dV_je/dt]              (18)

The breadboard method for transistors

Figure 6 shows the arrangement for time scaling
transistors in a similar manner to that used for diodes.
The operational amplifiers provide the effect of a
differential amplifier and apply a voltage proportional
to the low frequency current i_j across the capacitance
C. The current through capacitance C is

i_s = C d(v - v_b)/dt = C A r di_j/dt                              (23)

[Figure 6 - Breadboard method of transistor simulation]

Assuming that the voltage drop across the sensing resistor r is small compared with V_je and V_jc, the current
into the two capacitors a_t C_je and a_t C_jc is

i_k = a_t C_je dV_je/dt + a_t C_jc dV_jc/dt                        (24)

If CAr = a_t τ, Equation (24) is identical to Equation
(21) and Figure 6 is the simulation for a transistor
time scaled by the factor

a_t = CAr/τ                                                        (26)

It is important to note the assumption that τ_bf = τ_br.
This restriction becomes important only when the
collector junction becomes forward biased, and it appears that the best solution then is to use the analog
method described next.
The capacitances in the simulated model are determined by the measured values6 and the desired time
scale. The magnitude of the sensing resistor r depends
on the base current level to be detected. Generally a
100 Ω resistor is adequate. In cases where this value
affects the transistor behavior, feedback can be used
to compensate completely for the voltage drop across
the sensing resistor.
The advantage of the breadboard approach is that
the real transistor is used as its own dc model. All dc
nonlinearities and parameter interdependencies within the transistor are therefore available without
the need for complicated function generators. The
programming and debugging are similar to breadboard
circuit building and checking. This generally results
in much shorter setup times than in conventional
analog programming.

The analog method for transistors

To simulate the transistor the extended Ebers-Moll
model of Figure 7 was used. This model is equivalent
to those used in digital analysis programs such as
CIRCUS4 or NET-1,5 etc.

[Figure 7 - Modified Ebers-Moll transistor model]

The emitter and collector currents i_e and i_c are given
in Equations (17) and (18). Designating the values

q_f/(a_t τ_f) = i_fr, which is the forward conduction current

q_r/(a_t τ_r) = i_rr, which is the reverse conduction current

and using the relations,3 Equations (17) and (18) can
be rewritten as

i_e = -(i_f - a_r i_r)
i_c = a_f i_f - i_r                                                (27)

where

i_f = i_fr + a_t τ_f di_fr/dt + C_je dV_je/dt                      (28)

These equations can be implemented on the analog
computer using the simulated diodes in Figure 3a or
Figure 3b. Either of those diode configurations can be
used at the discretion of the designer. One is probably
more useful in a loop analysis solution and the other
in a nodal analysis. Mixed methods can also be used.
As an example, the NPN transistor amplifier shown in
Figure 8 was simulated using the layout of Figure 9.
The simulation uses one diode of Figure 3a and one of
Figure 3b.
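The Ebers-Moll relations reconstructed as (27) above lend themselves to a quick numerical check. The sketch below is a minimal digital stand-in, not the authors' analog setup: it assumes an ideal-diode law for the conduction currents, and the gain values a_f = 0.985 and a_r = 0.2 are taken from the experimental parameter list later in this paper, while the saturation current and thermal voltage are illustrative assumptions.

```python
# Minimal numerical sketch of the modified Ebers-Moll terminal
# currents of equation (27) as reconstructed above. The ideal-diode
# conduction law and the values of I_S and VT are assumptions.
import math

ALPHA_F, ALPHA_R = 0.985, 0.2        # forward and reverse current gains
I_S, VT = 1e-14, 0.026               # saturation current (A), thermal voltage (V)

def conduction(v):
    """Ideal-diode conduction current for a junction voltage v."""
    return I_S * (math.exp(v / VT) - 1.0)

def terminal_currents(v_je, v_jc):
    i_f = conduction(v_je)           # forward (emitter junction) current
    i_r = conduction(v_jc)           # reverse (collector junction) current
    i_e = -(i_f - ALPHA_R * i_r)     # emitter current, per reconstructed (27)
    i_c = ALPHA_F * i_f - i_r        # collector current
    return i_e, i_c

print(terminal_currents(0.65, -2.0))  # active region: i_c ~ -ALPHA_F * i_e
```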

The resulting waveforms of the base, emitter, and
collector currents are shown in Figure 10. Qualitative
rather than quantitative tests were performed to observe the influence of the various transient terms on
the waveforms. Those waveforms show very close
correspondence to those observed in real transistors.
The delay time (D), rise time (R), storage time (S)
and fall time (F) are clearly seen in the I_c waveform
of Figure 10. The spike (B) in the I_c waveform is determined by the charging of the base-collector capacity C_jc; the small pulse (C) in the I_e curve is controlled by the charging of the base-emitter capacity
C_je. The spike (A) in the base current I_b is caused by
the charging of C_je and C_jc and the large negative pulse
SC is due to the removal of the stored charge from
the junction.

Figure 8 - Grounded emitter amplifier

One of the important advantages of this type of
simulation (compared with the breadboard method) is
that the lifetime for the emitter diode, τ_bf, and the
collector diode, τ_br, can be varied independently, which
avoids the assumptions in Equation (19).

[Figure 9 - Simulation of grounded emitter amplifier]

Experimental results

To illustrate the analog method the simulated
grounded emitter amplifier of Figure 8 was driven by
a square wave.
The transistor and circuit parameters used in the
simulation were: r'_b = 10 Ω; a_f = 0.985; a_r = 0.2;
τ_f = 1.13 × 10^-9 sec; τ_r = 20 × 10^-9 sec; C_je = 1 pF;
C_jc = 0.3 pF; E_o = 6 V; r_e = 500 Ω; V_in = 10^7 Hz square wave, 5 v peak.
The scale factors were chosen to be: a_t = 10^7;
a_v = 1; a_I = 1000, where a_t, a_v, a_I are the time, voltage and current scales respectively.

[Figure 10 - Simulated amplifier response]

Application of the breadboard method to logic circuit simulation has been described in the literature6
where very close agreement is shown to exist between
the simulated logic gate and the actual circuit.

CONCLUSION
A description has been given of methods for simulating diodes and transistors using semiconductor
devices as computing elements to generate the nonlinear dc characteristics. Two distinct approaches
exist: one, called the breadboard method, uses the
actual device and requires sensing resistors for current detection; the other, known as the analog method,
simulates the devices using traditional analog computing methods such that currents and voltages appear as
analog variables. A comparison of the breadboard and
analog methods is summarized below.
Validity of models

Both methods rely on the separation of the lowfrequency behavior of the device from its high-frequency response. The analog method replaces voltage
sensing resistors by operational amplifiers. This reduces the possibility of swamping any small bulk effect resistances with comparable sensing resistors.
For this reason, the simulation of diodes tends to be
more accurate in the analog method. In the simulation
of transistors, however, the breadboard method appears to have an advantage. It accounts correctly for
all effects simulated in the analog method except collector lifetime. In addition it also provides an accurate
representation of low-frequency effects such as base
widening and nonlinear current gain.
The model in the analog method corresponds exactly to charge control and Ebers-Moll models used in
digital programs such as CIRCUS and NET-1. Nonlinear functions, such as current gains a_f and a_r, or
transition region capacitances C_je and C_jc, can be readily modeled in the digital programs. These features can
also be incorporated in the analog method at the expense of additional function generators. The analog
method by its modular nature does allow for a simple
simulation of the collector lifetime - a parameter of
significance in applications where the transistor saturates. Finally, it allows for the simulation of arbitrary
base resistances and changes in the parameters of the
dc model. This control enables a designer to do speculative design with devices that are not yet fabricated
as well as characterize a device with arbitrary parameters.
Computing flexibility

The question of flexibility can be resolved into two
parts. The first question deals with the ease of programming - a measure of the difficulty encountered in
translating a circuit sketch to a working computer simulation.
For simulating diodes, the analog method is as
simple to use as the breadboard method. A 10-diode
logic gate was simulated and debugged in a few hours
of analog computing time.
In transistor simulations, the breadboard method,
which retains the topology of an experimental breadboard, has a distinct advantage in ease of programming
over the analog method. The analog method involves
traditional analog programming with little or no similarity to the experimental breadboard.
The second question of flexibility concerns the ability of adapting the circuit simulation to a form suitable
for optimization procedures and parameter tolerance
analysis. The analog method has a clear advantage for
these purposes. There are no external capacitors or
sensing resistors used in the method. The traditional
analog computer components that comprise the simulation are easily controlled by digital computer software in a hybrid computer arrangement. Optimization
procedures and sensitivity analysis become feasible
and it becomes possible to envision the circuit simulation as a digital subroutine.
Typical solution times for both methods are 10 to
100 msec per solution. This means a favorable factor
of approximately 10^4 in comparing solution times
with digital programs such as NET-1,5 CIRCUS,4 etc.
The implication of this speedup is the ability to do
sensitivity studies and apply optimization procedures
to the design of integrated circuits.
The separation method has most significant value
in problems where time scaling the experimental circuits is useful. Such applications include the analysis
and design of high-speed logic gates, microwave circuits, and high-frequency amplifiers.
ACKNOWLEDGMENT
The authors are indebted to C. F. Simone and J.
Chernak for active encouragement and contributions
in the course of this work; to H. Gummel, B. T. Murphy and E. J. Angelo, Jr. for invaluable discussions
on theoretical aspects of the charge control representation; to K. W. Sussman and H. C. Rorden for assistance in developing the analog programs; to J. V.
Wait of University of Arizona for many stimulating
discussions.
REFERENCES
1 H K GUMMEL B T MURPHY
Circuit analysis by quasi-analog computation
Proc IEEE (letters) vol 55 p 1758 October 1967
2 A W LO
I ntroduction to digital electronics
Addison-Wesley 1967 p 43
3 P E GRAY et al
Physical electronics and circuit models of transistors
SEEC vol 2 J Wiley and Sons 1964 p 206
4 L D MILLIMAN W A MASSENA R L DICKHAUT
CIRCUS, A digital computer program for transient analysis of
electronic circuits
Harry Diamond Laboratories 346-1 January 1967
5 H F MALMBERG F L CORNWALL F N HOFER
NET-1 network analysis program
Los Alamos Scientific Laboratory Report LA-3119 September 1964
6 E J ANGELO JR J LOGAN K W SUSSMAN
The separation technique - a method for simulating transistors
to aid integrated circuit design
IEEE Trans Electronic Computers vol EC-17 no 2 February
1968

A new stable computing method for
the serial hybrid computer integration
of partial differential equations
by ROBERT VICHNEVETSKY
Electronic Associates, Inc.
Princeton, New Jersey

INTRODUCTION


Partial differential equations involving one space
dimension and time can be solved by hybrid computers using the serial (or continuous space-discrete time)
method. In so doing, the continuous integration capability of the analog computer is used along the space
axis while integration along the time axis is performed
in a discrete fashion by making use of finite differences.
The continuous integration problem in the space direction is in many practical cases of a mixed boundary
nature, and is furthermore often unstable from the
error propagation standpoint.
The purpose of the present paper is to introduce a
decomposition method which, under quite broad applicability conditions, allows the spatial integration
problem to be separated into a finite number of subproblems, each of them computationally stable and
free of the iteration requirement present in the mixed
boundary original equations.
This decomposition method applied to the spatial
integration problem is to be contrasted with the
Green's function method.1,4,5 In the latter method,
Green's functions of the spatial differential operator
are pre-computed, and the solution to the differential problem is replaced by an integral relation
applied to the non-homogeneous terms. This method is, however, cumbersome in its computer implementation
in terms of computing time or hardware requirements.
Furthermore, the integral expression of the solution
presents unfavorable error propagation properties
which are not present in the decomposition method
presented here.
The method is first illustrated by the example of the
heat diffusion equation and then generalized to more
extended problems in a formal way.

Serial hybrid integration of the diffusion equation

We consider the problem of integration of the diffusion equation in one spatial dimension (the slab
problem - Fig. 1):

∂x/∂t = k ∂²x/∂u²,   u ∈ (0,1)                                 (2.1)

with the given boundary conditions

x(0,t) = X_0(t)
x(1,t) = X_1(t)                                                 (2.2)
x(u,0) = X_u(0)

[Figure 1 - The function x(u,t)]

The serial method of integration by hybrid computation consists in discretizing time and integrating continuously in space. If we denote by x_i(u) the value of
x(u,t) at time t_i = iΔt, then (2.1) can be approximated
by


(x_{i+1} - x_i)/Δt = k [θ d²x_{i+1}/du² + (1 - θ) d²x_i/du²]

or

x_{i+1} - θ k Δt d²x_{i+1}/du² = x_i + (1 - θ) k Δt d²x_i/du²

which can be rewritten:

x_{i+1} - θ k Δt d²x_{i+1}/du² = s_i                            (2.3)

where

s_i = x_i + (1 - θ) k Δt d²x_i/du²                              (2.4)

Equation (2.3) is an ordinary differential equation in
the independent variable u; the expression (2.4) represents stored past information at time t_{i+1} when (2.3)
is integrated to produce x_{i+1}. The new value s_{i+1} to be
stored is computed by a recursive expression, which
can be derived from (2.4):
s_{i+1} = x_{i+1} + (1 - θ) k Δt d²x_{i+1}/du²

or, taking (2.3) into account,

s_{i+1} = x_{i+1} + ((1 - θ)/θ) (x_{i+1} - s_i)                 (2.5)

An alternative expression is:

s_{i+1} = s_i + (1/θ) (x_{i+1} - s_i)                           (2.6)

[Figure 2 - Lines of integration and function storage in the space-time plane]

Since

k d²x_i/du² = ∂x/∂t at t = t_i

we can see that s_i, as defined by relation (2.4), is in fact
the approximated value of x(u) at the time t_i + (1 - θ)Δt.
Thus the serial method of integration described
above consists in integrating at time t = t_{i+1} the ordinary differential equation (2.3), which produces the
solution x_{i+1}(u) at that time, and storing s_{i+1} (given by
(2.5)) which represents both the solution at time
t_{i+2-θ} and the right hand side of equation (2.3) at the
next integration time (Fig. 2). A block diagram representing this sequence of computer operations is
shown in Fig. 3.

[Figure 3 - Simplified block diagram for the serial-hybrid computer solution of the heat diffusion equation]

The spatial integration

We are left with the problem of finding a computing
algorithm permitting the integration of Equation
(2.3) taking the appropriate boundary conditions into
account.
Let us rewrite (2.3) in the form:

d²x_{i+1}/du² - x_{i+1}/(k θ Δt) = -s_i/(k θ Δt)               (2.7)

The boundary conditions (2.2) yield the boundary
conditions of (2.7):

x_{i+1}(0) = X_0(t_{i+1})
and                                                             (2.8)
x_{i+1}(1) = X_1(t_{i+1})
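For readers who want to experiment with the recursion, the following is a minimal digital sketch of the serial method, assuming the forms of (2.3), (2.4) and (2.6) reconstructed above; a finite-difference tridiagonal solve stands in for the continuous analog integration, and the grid size, theta and boundary data are illustrative.

```python
# Digital sketch of the serial (CSDT) recursion: at each time step the
# two-point problem (2.3) is solved for x_{i+1}(u), then the stored
# function is updated by (2.6), s_{i+1} = s_i + (x_{i+1} - s_i)/theta.
import numpy as np

k, dt, theta, n = 1.0, 1e-3, 0.5, 50
u = np.linspace(0.0, 1.0, n + 1)
du = u[1] - u[0]
x = np.sin(np.pi * u)                          # initial condition x(u, 0)
# s_0 from (2.4), using d2x/du2 = -pi^2 x for the sine initial profile
s = x * (1.0 - (1.0 - theta) * k * dt * np.pi**2)

def step(s, x0_bc, x1_bc):
    """Solve theta*k*dt*x'' - x = -s for x_{i+1}, then update s by (2.6)."""
    a = theta * k * dt / du**2
    A = (np.diag(-(2 * a + 1) * np.ones(n - 1))
         + np.diag(a * np.ones(n - 2), 1)
         + np.diag(a * np.ones(n - 2), -1))
    rhs = -s[1:-1].copy()
    rhs[0] -= a * x0_bc                        # fold known boundary values
    rhs[-1] -= a * x1_bc                       # into the right hand side
    x_new = np.empty(n + 1)
    x_new[0], x_new[-1] = x0_bc, x1_bc
    x_new[1:-1] = np.linalg.solve(A, rhs)
    s_new = s + (x_new - s) / theta            # recursion (2.6)
    return x_new, s_new

for _ in range(10):                            # ten time steps of the slab
    x, s = step(s, 0.0, 0.0)
```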

If we try to perform the integration of (2.7) in either
the forward or the backward direction,

u ∈ (0 → 1) or u ∈ (1 → 0)

we find that this equation is computationally unstable
in both directions. It is true that, due to the fact that
the integration interval (0,1) is finite, a solution with an
acceptable accuracy may be expected; but, in addition,
this problem is of a mixed-boundary values nature,
requiring an iterative type solution. Since (2.7) is a
differential equation of the second order, both x and
dx/du have to be known in u = 0 to allow the integration to
be started. An iterative solution might consist in
assuming a value for dx/du at u = 0 and performing the integration.
As was previously remarked, changing u into -u by
inverting the direction of integration would simply
invert λ_1 and λ_2, leaving the same instability properties. However, we can write

L(x) = (d/du - λ_1)(d/du - λ_2) x

In view of this, any function x_1(u) solution of

dx_1/du - λ_1 x_1 = 0                                           (3.4)

is also a solution of L(x_1) = 0.
The integration of (3.4) is stable in the forward direction, and requires one boundary value only (in
u = 0).
Similarly, any solution of

dx_2/du - λ_2 x_2 = 0                                           (3.6)

is also a solution of

L(x) = (d²/du² - 1/(k θ Δt)) x = 0                              (3.7)

The integration of (3.6) is stable in the backward
direction, and requires one boundary value only (in
u = 1).
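A small numerical illustration of why the decomposition helps, assuming constant coefficients so that L factors with lam = sqrt(1/(k*theta*dt)); simple Euler marches stand in for the analog integrators, and each first-order factor is integrated only in the direction in which it decays.

```python
# Sketch of the decomposition idea for L = d^2/du^2 - m with
# m = 1/(k*theta*dt) > 0: each first-order factor is integrated in
# its stable direction, so neither march amplifies errors.
import numpy as np

m = 100.0
lam = np.sqrt(m)
n = 1000
du = 1.0 / n

# forward integration of dx1/du + lam*x1 = 0 (decays as u increases)
x1 = np.empty(n + 1); x1[0] = 1.0
for j in range(n):
    x1[j + 1] = x1[j] * (1.0 - lam * du)

# backward integration of dx2/du - lam*x2 = 0 (decays as u decreases)
x2 = np.empty(n + 1); x2[-1] = 1.0
for j in range(n, 0, -1):
    x2[j - 1] = x2[j] * (1.0 - lam * du)

# x1 ~ exp(-lam*u) and x2 ~ exp(-lam*(1-u)): two independent
# elementary solutions of L(x) = 0, each obtained without error growth.
```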
Finally, if y^{i+1}(u) is a solution of:

dy^{i+1}/du

Thus, assuming that n linearly independent solutions
of the homogeneous equation (6.5) are known, the
problem of finding a solution to the unhomogeneous,
mixed boundary problem expressed by (5.1) and (5.2)
can be reduced to the integration of the two differential equations (6.2) with any set of convenient
boundary conditions, plus the straightforward calculation of the n coefficients b_l by the application of
(6.9) (Fig. 5), where

φ_k(x_l) = β_{k,l},   l = 1, 2, ... n

A linear combination of the form

x(u) = x_{n+1}(u) + Σ_{l=1}^{n} a_l x_l(u)                      (6.12)

will satisfy the boundary conditions (5.2) of the problem provided the a_l satisfy the relations:

φ_1(x_{n+1}) + Σ_{l=1}^{n} a_l β_{1,l} = B_1

φ_2(x_{n+1}) + Σ_{l=1}^{n} a_l β_{2,l} = B_2

...

φ_s(x_{n+1}) + Σ_{l=1}^{n} a_l β_{s,l} = B_s                    (6.13)

φ_{s+1}(x_{n+1}) + Σ_{l=1}^{n} a_l β_{s+1,l} = B_{s+1}

...

φ_n(x_{n+1}) + Σ_{l=1}^{n} a_l β_{n,l} = B_n

[Figure 5 - Hybrid computer block diagram for the integration of L(x) = H(u) by the method of decomposition]

Equation (6.12) also satisfies (5.1) by virtue of (6.4) and (6.10).
However, the coefficients a_l will have to be computed by the solution of the linear set of algebraic
equations (6.13) for each value of t_i.
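On the digital side of the hybrid arrangement, the step (6.13) amounts to one small linear solve per time step. A sketch, with placeholder numbers standing in for the boundary functionals:

```python
# Sketch of the final algebraic step (6.13): the coefficients a_l of
# x = x_{n+1} + sum a_l x_l are found from one n-by-n linear solve per
# time step. beta[k, l] = phi_k(x_l) collects the boundary functionals
# applied to the elementary solutions; all numbers are placeholders.
import numpy as np

n = 3
beta = np.array([[1.0, 0.2, 0.1],           # phi_1 applied to x_1..x_n
                 [0.0, 1.0, 0.3],           # phi_2 applied to x_1..x_n
                 [0.4, 0.0, 1.0]])          # phi_n applied to x_1..x_n
B = np.array([1.0, 0.5, -0.2])              # prescribed boundary values B_k
phi_particular = np.array([0.1, 0.0, 0.2])  # phi_k(x_{n+1})

a = np.linalg.solve(beta, B - phi_particular)   # relations (6.13)
print(a)
```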
The computation of n elementary solutions

The method by which n elementary solutions
x_l(u), l = 1, 2, ... n are obtained is not necessarily of
great importance to the method described in the previous section, unless L( ) is time- (or i) dependent. We
have seen in the case of the diffusion equation integration by decomposition that the elementary solutions
could themselves be obtained previously to the space-time hybrid integration process by the use of the decomposed operators L_B and L_F themselves. This will
still be true in the general case if:
1. L_B and L_F are permutable operators, i.e.,
L_B L_F = L_F L_B
2. k (the order of L_F) is equal to s (the number of
specified boundary conditions in u = 0) and thus
n - k (the order of L_B) is equal to n - s (the number of
boundary conditions in u = 1)
3. L( ) is not dependent on time
If condition #1 above is not fulfilled, i.e.,

L_B L_F ≠ L_F L_B                                               (7.1)

then the k independent elementary solutions which
can be obtained by the (stable) forward integration of

L_F(x_l) = 0                                                    (7.2)

are still independent elementary solutions of L(x_l) = 0,
since

L(x_l) = L_B L_F(x_l) = 0

But the remaining (n-k) needed elementary solutions of (6.10) cannot be obtained by the integration
of L_B in the backward direction, since the relation

L_B(x) = 0                                                      (7.3)

does not imply that L(x) = 0. Indeed, this would imply
that L(x) = L_B L_F(x) = 0, which is only true if x is an
elementary solution of L_F(x) = 0. Thus, one may need
to use some other means to obtain the additional
(n-k) elementary solutions. This may be done by the
(forward or backward) integration of L(x_l) = 0, with
any set of boundary conditions, provided they are
linearly independent of those already used.
There is a variety of ways in which these elementary solutions can be obtained, which actually depend
on the peculiarities of the problem at hand. However,
it is of importance to note that the algorithm described
in Sections 6.1 and 6.2 permits this general method to
be applied with any set of n elementary solutions, and
that therefore iteration is never needed to obtain such
a set.
The set of elementary solutions (6.6) can be derived from the set (6.10) by an algebraic process performed prior to the time-space integration, similar to
(3.13) - (3.16). This results in relations (6.9) which
are somewhat simpler than (6.13), but this can only be
done if L( ) is not time dependent.
In problems where L( ) is time (or i) dependent, a
complete set of elementary solutions will have to be
obtained at each computation step.
REFERENCES
1 R M TERASAKI
Analog computation of Green's function for integrating two
point boundary value problems
IRE Transactions EC-11 57 1962
2 R VICHNEVETSKY
Error analysis in the computer simulation of dynamic systems:
variational aspects of the problem
IEEE Transactions on Electronic Computers vol EC-16 no 4
August 1967 pp 403-411
3 E E L MITCHELL
Hybrid techniques for solution of partial differential equations
Electronic Associates Inc SAG Report #19 October 1963
4 H S WITSENHAUSEN
On the hybrid solution of partial differential equations
Proceedings of IFIP Congress 1965 vol 2 pp 425-431 Spartan
Books
5 T SCHRODER
Neue Fehlerabschätzungen für verschiedene Iterationsverfahren
Z Angew Math und Mech 36 168 1956
6 W T REID
Generalized Green matrices for two point boundary value problems
SIAM J Appl Math vol 15 no 4 July 1967

BASP - A biomedical analog signal processor*
by WILLIAM J. MUELLER, PAUL E. BUCHTHAL, PHILIPPOS
LAMBRINIDIS, KARL E. SCHULTZ and LEO F. WALSH
State University of New York, Upstate Medical Center
Syracuse, New York

INTRODUCTION
With the advent of low-cost digital logic modules,
discussions on hardware-software trade-off have
become popular.1 Suggestions have been made to
construct an analog computer of digital modules, and
the availability of digital differential analyzer modules
is a step in this direction. Further, it has been suggested
that a re-evaluation of hardware and software organization be made,2 considering that there has not been
any great systems variation in the implementation of
computers.
In the past several years, a number of special purpose digital devices have become available for the
processing, generally in real time, of analog signals
obtained from patients in a medical center or from
animals in an investigator's laboratory. These devices
have a limited accuracy and data rate capability and,
although they may be faster than a general-purpose
digital computer, are too slow for real time analysis
of some of the more complex biological systems. If
more than one full channel is required or if several
types of analyses are necessary, the cost is prohibitive.
Three years ago we wired a large array of printed
circuit logic modules to programmable patch panels
and a system similar to the more recent macromodules
of Ornstein et al. 3 was connected to an analog computer (TR48) to carry out some real time analysis of
biomedical data. The system operated properly but,
because of the large number of connections, programming was horrendous. At the same time investigators
at our Medical Center began asking for faster and
more sophisticated data processing of biomedical analog signals. Several special-purpose high-speed real
time processors were constructed using the newly
available Motorola "MECL" microcircuit modules.
It became apparent that a medium-priced but programmable processor of analog data at a sample rate of
*The development of this processor was supported by the General
Medical Sciences branch of the National Institutes of Health
through Grant GM 11413.

one megahertz was needed. General purpose computers currently available cannot perform the necessary data manipulations in real time and an "unconventional system"2 was required. This paper concerns
itself with the philosophy and investigation of such
a system specifically designed for the real time processing of biomedical analog signals.
Processor requirements

Some of the slower biomedical signals, such as those from
the EEG (brain waves), the ECG (heart waves), and breathing or blood pressure transducers, are easily analyzed
on a general purpose computer alth

[Figure 1 - System functional diagram: program registers (S9-S12 register length), pickup cycle sequencer, Δ input logic, input/output logic, and the DDA integrators with their dy/dz increment buses]

ment. The program registers will be used to hold the
interconnection and scaling program for a particular
DDA function. In addition, these program registers
are time shared to load initial conditions into the
integrators and to transfer the readout data to the
GP computer.
The interconnection control for the DDA is accomplished in a two dimensional array through a space
domain and time domain selection. Programmable
digital bits S1 through S3 and S4 through S6 set up the space
domain selection and time domain selection, respectively. Each integrator contains four of these registers and associated selection logic to implement the
four incremental inputs, defined as the three dy inputs
and the dx input to each integrator. In addition, the
S7 programmable bit reverses the polarity of the incremental input to implement a sign change function.
The S8 programmable bit is used to disable the input,
to prevent incremental signals to that input from
affecting the integrator computation.
The pickup cycle sequencer is actually the bit
timer, which is used both to encode and to decode the
time domain multiplexed signals. The space domain
selection logic will select the appropriate interconnection bus. The time selection logic will generate
a pulse, the time position of which corresponds to the
time multiplexed position of the programmed input
increment. In addition, the Δ input logic will perform the polarity reversal and input disable functions
under program control. The incremental inputs to the
DDA integrators will be received from the Δ input
logic, under program control, to update the integrators.
Programmable integrator scaling is accomplished
by program register bits S9 through S12. These bits
will select the register length for the particular integrator, thereby scaling the parameter over a range of
approximately 65,000.
The four incremental inputs to each integrator
are programmed with bits S1 through S8, repeated for
each of the four incremental inputs. The register
length selection is accomplished with bits S9 through S12
for each integrator. Therefore, a 36 bit program register is required to completely define the input and scaling for each DDA integrator. The incremental output
of the integrator is multiplexed onto the dz bus under
the control of the pickup cycle sequencer. This time
multiplexed position, a pulse position identification, is
characteristic of each integrator and is not programmable. The time and space multiplexed incremental
outputs of the integrators are recirculated to the selection logic for fanout to four inputs for each of 64 integrators, yielding a total fanout of 256 incremental
inputs.
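As an illustration of the programming model just described, a 36-bit program register can be unpacked in software; the packing order of the fields in the sketch below is an assumption, since the paper specifies only which bits exist and what they control.

```python
# Sketch of decoding one 36-bit integrator program register: four
# 8-bit input groups (bits S1-S8, repeated for the three dy inputs and
# the dx input) plus the 4-bit register-length code S9-S12. The field
# packing order is an assumption for illustration.
def decode_input_group(g):
    return {
        "bus":      g & 0b111,           # S1-S3: space domain (bus select)
        "slot":     (g >> 3) & 0b111,    # S4-S6: time domain (bit-time select)
        "negate":   bool((g >> 6) & 1),  # S7: invert increment polarity
        "disabled": bool((g >> 7) & 1),  # S8: disable this input
    }

def decode_program_register(word36):
    groups = [(word36 >> (8 * i)) & 0xFF for i in range(4)]  # 3 dy + 1 dx
    length_code = (word36 >> 32) & 0xF                       # S9-S12 scaling
    return [decode_input_group(g) for g in groups], length_code
```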
The input/output logic controls the loading of
initial conditions during the Initialize Mode and the
transfer of computational parameters during the Readout Mode. There is provision for direct GP computer
control for the initializing and readout of the DDA
computer parameters.
Certain types of overhead equipment are required
for computer operation, including the clock pulse generator and power supplies. This overhead equipment
permits the DDA computer to be completely self-contained, thereby minimizing the interface requirements with the GP computer and peripheral equipment. This DDA computer can operate asynchronously relative to the GP computer and other control
equipment. Interface synchronization is obtained
with buffer logic and registers, thereby permitting
the DDA computer to operate independent of limitations that are characteristic of the external equipment.
The dynamic operation of the computer is controlled by the modes, cycles, bit times, and clock
pulses. The modes of operation are commanded by
the GP computer or other peripheral control equipment. The cycles and timing pulses are generated
within the DDA computer.
The modes of operation of the DDA computer are
defined as the Compute Mode, Initialize Mode, and
Readout Mode, described in Table I.

TABLE I - Operating Modes

MODE        CYCLE                   BIT TIMES                TIMES (μsec)     NOTES

COMPUTE     1 PICKUP                8                        0.570            Initiated at the start
            2 INCREMENT             3                        0.214            of the computation
            3 INTEGRATE             3                        0.214

INITIALIZE  LOAD INITIAL            64 words,                1280             On compute
            CONDITIONS              20 bits/word,            (assume 1 MC     command
                                    1280 bit times           clock)

READOUT     RECIRCULATE             64 words,                1280             As required
                                    20 bits/word,            (assume 1 MC
                                    1280 bit times           clock)

During the Compute Mode, the DDA integrators
automatically sequence through the computation
and generate solutions to the equations that they
mechanize. This mode of operation is composed of
repetitive DDA iterations. Each of these iterations
is composed of three cycles that require a total of
14 clock pulses to complete. These 14 clock pulses
constitute a full DDA compute iteration, illustrated
in Figure 2. The duration of each cycle is determined primarily by the propagation time requirements through the longest logical chain to perform
the required function. Cycles 2 and 3 are implemented as sequential computations to simplify the
logic required for the integrator arithmetic operations.
Cycle 1 is defined as the Pickup Cycle, performing
the encoding and decoding functions that interconnect
the incremental outputs, obtained during the previous
[Figure 2 - Compute mode sequencing: each iteration comprises a Pickup Cycle (8 clock pulses, 8 × 71.4 = 570 ns), an Increment Cycle (3 clock pulses, 214 ns) and an Integrate Cycle (3 clock pulses, 214 ns); clock frequency 14 MC, clock period 71.4 ns]


iteration, to the incremental inputs of the appropriate
integrators for use during the next iteration. These
incremental quantities are encoded both in space and
in time. The space encoding merely selects the appropriate incremental bus line. The time encoding requires a pulse position type of incremental encoding.
The Pickup Cycle is composed of 8 bit times that
are coincident with the sequential multiplexing of 8
integrator outputs on each interconnection bus. A
particular bit time in the Pickup Cycle will be selected,
under program control, resulting in one of 8 sequential
increments on each bus selected as the input to the
integrator. In this manner, the Pickup Cycle will permit the decoding of the increment for use during Cycle
2, the Increment Cycle. Three dy and one dx inputs
are permitted for each integrator, where the Pickup
Cycle will permit four input increments to be received.
The Increment Cycle, Cycle 2, permits the integrator to accept the incremental dy inputs, received
during the Pickup Cycle, and simultaneously add
these increments to the Y and R registers. The algebraic sum of the input dy increments is accumulated
in a two bit counter, to be algebraically added to the
Y register to update the Y parameter. This dy increment sum is also added to the R register, where the
relative weighting is one half of the dy increment
weighting. This R register arithmetic operation implements a trapezoidal type integration algorithm. The
incremental sum is added to the R and Y registers
simultaneously in parallel arithmetic fashion. Therefore, at the completion of the Increment Cycle the
value in the Y register is the updated Y value and the
value in the R register has the new trapezoidal correction.
The Integrate Cycle, Cycle 3, permits the new Y
parameter to be algebraically added to the R register
under control of the dx incremental input. At the
completion of the Integrate Cycle, the R register
will contain the new remainder and the dz output
will be available from the integrator to be picked up
during Cycle 1, the Pickup Cycle, of the next iteration.
It should be noted that the Y and R registers will be
updated with the dy increment input only for the
increments accumulated during the Pickup Cycle pertaining to that iteration. In addition, the Y register will
be added to the R register during the Integrate Cycle
only if a dx increment had been received during the
Pickup Cycle pertaining to that iteration. Therefore,
the operation in the Increment Cycle depends upon
the dy increments and the operation during the Integrate Cycle depends upon the dy and dx increments
received during the Pickup Cycle of the corresponding
iteration.
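A short sketch may make the three-cycle update concrete (Python is used purely for illustration; the register width, the overflow handling, and all names are assumptions for exposition, not the TEADDA hardware logic):

    # One compute iteration of a single DDA integrator (illustrative sketch).
    # Assumed: 20-bit registers; each increment dy and dx is -1, 0, or +1.
    REGISTER_BITS = 20
    MODULUS = 1 << REGISTER_BITS          # assumed R-register overflow threshold

    def iterate(y, r, dy_inputs, dx):
        dy_sum = sum(dy_inputs)           # Increment Cycle: two-bit counter sum
        y += dy_sum                       # update the Y parameter
        r += dy_sum * 0.5                 # half-weighted sum: trapezoidal correction
        if dx:                            # Integrate Cycle: add Y to R under dx control
            r += dx * y
        dz = 0                            # R overflow/underflow produces the dz output
        if r >= MODULUS:
            r -= MODULUS
            dz = 1
        elif r <= -MODULUS:
            r += MODULUS
            dz = -1
        return y, r, dz                   # dz is picked up during the next Pickup Cycle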
The sequencing logic will permit the entrance into
the Compute Mode to be accomplished only at the

start of the Pickup Cycle. Similarly, exiting from
the Compute Mode can only be accomplished at the
completion of the Integrate Cycle. This will assure
the proper sequential operation of the DDA computer,
independent of mode changes and interruptions.
Therefore, the GP computer that interfaces with
the DDA can command readouts at any time without
the loss of DDA information or synchronism. A readout command will, effectively, stop time for the DDA,
read out the required information, and then permit the
DDA to resume proper operation.
The Initialize Mode is initiated by the external GP
computer or other equipment that will load initial
conditions. This mode will clear the R register and
other selected flip-flops in the system. In addition,
it will switch the Y register in each integrator into
a serial shift mode of operation. This will permit initial
conditions to be loaded into a selected register in
serial form from the GP computer, the external tape
reader, or various other types of peripheral equipment. The loading of the Y register is accomplished
serially to significantly reduce interconnections and
logic. Loading 64 integrators with initial conditions
takes only several milliseconds, a duration not considered significant enough to warrant the extra hardware needed to load the
registers in parallel fashion.
The Readout Mode of operation will discontinue
the computation, but preserve the parameters contained in the DDA computer. The readout will be
accomplished by placing the Y register of each integrator into a shift register mode of operation with
a recirculation loop. These registers will continually
recirculate, thereby permitting the contents to be
available in serial word form to the external equipment, while preserving the Y register information
through the recirculation loop. A bit timer will keep
track of the word, permitting the shifting operation
to be concluded when the serial word is properly
assembled in the shift register. All of the Y register
outputs will be multiplexed into the input of the
GP computer. The particular register can be selected
with an address from the GP computer.
The program registers are used to hold the programmed interconnections and scaling information
during the Compute Mode of operation. During the Initialize and Readout Modes, the program registers are
time shared as interface buffer registers for the serial
transfer of information.
A bit timer is used to sequence through the three
cycles of the Compute Mode and synchronize the
shifting of the Y register in the Initialize and Readout
Modes. In addition, it will synchronize all basic timing
operations in the computer. A clock pulse generator
is used to generate the precision timing required for


synchronous operation of the complete DDA computer.
Software approach to DDA programming
The versatile concept of the TEADDA will permit
the implementation of associated software, resulting
in a significant increase in the utility of the system.
With the advent of the electrically alterable DDA, it
becomes feasible to describe a DDA compiler and
associated software aids. The DDA compiler is a software
package that is run on a GP computer to generate a
functional program for the DDA. This program is then
organized into the machine language format with the
DDA assembler program and printed out with the map
printer program.
The DDA compiler accepts inputs from the programmer in a programmer-oriented form. The inputs
are the equations to be mechanized, the initial condition parameters, and a definition of the auxiliary
functions such as the map printout. The compiler will
automatically generate a detailed program for assembly into machine language.
The input format for the compiler is defined with
distinction made between the various parameters and
operators, as listed in Table II. The general symbol
is defined as a capital letter, while the specific parameter or operator is defined with numbers and lower
case letters associated with the general symbol. The
programmer will generate an equation in analytic
form, shown in equation (1), then convert it to the
compiler input form, shown in equation (2), with reference to Table II.
TABLE II - Compiler Input Symbology

Input            Symbol    Example
Variable         V         Vcx
Constant         K         Kc1
Differentials    D         DVcx
Integrals        S         SKc1
Whole Numbers    H         HVcx

z = ∫ k1 x y dt                                        (1)

HVz = S(K1*Vx*Vy*DVt)                                  (2)

Independent initial conditions would be entered by
the programmer, but initial conditions that are implicit
in the other parameters would be derived by the compiler. The required accuracy of the solution would be
entered as a scaling aid and to define the accuracy-speed tradeoff for the solution.
The compiler output is a scaled program ready for
assembly, a list of input programmer errors for diagnosis, and a DDA interconnection map program for
documentation. The map program can be generated
in punched tape form to be used on an off-line map
printer.
The compiler will generate a DDA-oriented program by using special algorithms that are presently
used by DDA programmers to formulate the problems. The DDA programs are implemented with
implicit servos, which are used to generate complex
functions and solutions from basic operations.
These basic operations include multiplication, addition, subtraction and sine/cosine generation. The
DDA compiler will program relatively complex operations such as arctan, division, and exponential functions that are derived from the basic operations. These
derivations are illustrated in equations (3), (4), and (5)
respectively.
The arctan operation is derived as follows:

z = arctan (x/y)        (input form)                   (3a)
tan z = x/y                                            (3b)
sin z / cos z = x/y                                    (3c)
x cos z - y sin z = 0                                  (3d)
d(x cos z) - d(y sin z) = 0                            (3e)
x d(cos z) + cos z dx - y d(sin z) - sin z dy = 0
                        (implemented form)             (3f)

The division operation is derived as follows:

z = x/y                 (input form)                   (4a)
yz - x = 0                                             (4b)
d(yz) - d(x) = 0                                       (4c)
y dz + z dy - dx = 0    (implemented form)             (4d)

The exponential functions for most numbers are
derived as follows:

z = x^b                 (input form)                   (5a)
log z - b log x = 0                                    (5b)
dz/z - b dx/x = 0                                      (5c)
x dz - b z dx = 0       (implemented form)             (5d)
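To suggest how an implicit servo mechanizes such a derived operation, the following numerical sketch (illustrative Python only, not compiler output; the quantum size and names are assumptions) tracks z = x/y by stepping z in the direction that nulls the residual of equation (4d), assuming y > 0:

    # Illustrative implicit servo for z = x/y via y dz + z dy - dx = 0.
    def servo_divide(xs, ys, quantum=1e-3):
        z = xs[0] / ys[0]                    # consistent initial condition
        track = []
        for k in range(1, len(xs)):
            dx = xs[k] - xs[k - 1]           # incremental inputs, as on a DDA bus
            dy = ys[k] - ys[k - 1]
            if dx - z * dy > 0:              # residual of (4d) with dz = 0
                z += quantum                 # emit a positive dz increment
            else:
                z -= quantum                 # emit a negative dz increment
            track.append(z)
        return track

    # Example: x = t and y = 1 + t sampled in 0.001 steps; z tracks x/y.
    ts = [k * 0.001 for k in range(2000)]
    zs = servo_divide(ts, [1 + t for t in ts])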

The development of subprograms for the DDA
compiler is presently being conducted at Teledyne.
A sample compiler flow chart is illustrated in Figure 3.
The DDA assembler program will accept inputs
from a DDA compiler or programmer and assemble a
detailed DDA program in machine language. The
assembler inputs will consist of each computational
element identification number, incremental inputs,
word length, and element type. The assembler will
assign the DDA elements to the computational functions. The coding will be defined for interconnections
between elements to implement the computation and
for word length to implement scaling.

The output of the assembler for a hard-wired DDA
is a wire list from which the production personnel can
wire the elements together. For a patch-programmable
DDA such as the TRICE, the output is a wire list for
the patchboard, a marker bit or other selection criterion for the word length, and initial condition parameters. For the TEADDA, the output of the assembler is the program register loading, which defines and
codes the routing, register length, and polarity for
the computation. An ancillary output is the initial
condition loading of the Y registers and the start/stop
criteria of the program, along with intermediate printouts that may be required by the programmer.
The input format for the assembler is in symbolic
form, illustrated in Table III. The DDA elements
may be switches, integrators, servos or other types of
computational elements. The first letter of the symbol in Table III indicates the element type, while
the following three numbers form the identification
of that type of element.

TABLE III - Assembler Input Format

ELEMENT NUMBER
AND TYPE     dy 1    dy 2    dy 3    dx     Y     Y SCALE   LENGTH
I 125        S 012   S 015   I 035   -dt    +0

The nomenclature for the first row in Table III
defines that integrator I 125 shall accept dependent
variable incremental inputs from servo S 012, servo
S 015, and integrator I 035, with an independent variable incremental input (dt) from the computer clock.
The initial conditions, scaling, and register length
parameters are also defined.

[Figure 3 - DDA compiler flow chart; one recoverable step reads "Reformulate equations for DDA, generating new variables where applicable."]

[Figure 4 - Assembler flow chart]

A typical assembler program flow chart is illustrated
in Figure 4. This program is similar to the DDA assembler
that is presently used at Teledyne to program a hard-wired avionic DDA that is in production.
The map printer program is primarily used as a
checkout aid and for documentation of reports and
programs. The inputs are identical to those generated
for the assembler, with the added feature of assigning
an acronym to name the variables. Because of the lack of
industrial standards for DDA symbology and the
difficulty of interconnecting loops by a shortest-lead
concept, the map printer output nomenclature shall
be as defined in Figure 5 and illustrated in Figure 6.

[Figure 5 - Key to DDA map printer: defines the printed symbol for each element type (integrator, servo, variable multiplier, switch), showing for each element the dY and dX input names and sources, the element name, the register lengths, the Y scale, and the dZ output name.]

[Figure 6 - Map printer output: a sample printout block showing interconnected elements such as S 201, I 201, I 202, M 201, and I 102 with their incremental variables (dGRR, dARR, dMUP, dCEE, dBAR).]

Hardware description

The electronics industry has made tremendous progress in the development of advanced electronic components and techniques. In particular, the advent of
integrated circuits has virtually revolutionized the
electronics industry, especially the digital computer
area. The trend has been toward components that are
very small, fast, low in power consumption, and highly
reliable. In order to take advantage of the advanced

hardware, new and sophisticated handling and packaging techniques have been required. In handling
and packaging the miniature components, much
of the advantage of the small size has been lost. Teledyne has developed packaging techniques that make
maximum use of the characteristics of the advanced
components. In particular, the Micro-Electronic
Modular Assembly (MEMA) takes maximum advantage of the characteristics of integrated circuits.
These are:
a. Preserving the small size characteristic of the
integrated circuit chip by placing many chips in
a single package.
b. Preserving the high speed characteristics of the
integrated circuit chip by placing many chips in
very close proximity with extremely short interconnections.
c. Preserving the inherent reliability of the integrated circuit chip by eliminating multiple packaging levels.
In addition, greatly improved manufacturing and
maintenance concepts are achieved and manufacturing costs are significantly reduced. The basic MEMAs
are hermetically sealed flat packs with 24 or 36 leads
and dimensions of 1.0 x 0.75 x 0.06 inches. Each
MEMA can contain up to 32 digital integrated circuit
"bare chips" or 25 analog chips (integrated circuit,
resistor, capacitor, etc.). The digital MEMA is illustrated in Figure 7 and the analog MEMA is illustrated
in Figure 8. The "bare chips," 1 mil wire leads, and
substrate interconnections are clearly visible in the
photographs of the MEMAs with the covers removed.
The interconnection pattern is photoetched onto the
ceramic substrate, the chips are die-bonded to the substrate, then leads are bonded to connect the chip
signal pads to the substrate interconnection pattern.


The leads are composed of 1 mil (1/1000 inch)
aluminum wire that is ultrasonically bonded to the
chip and substrate.

[Figure 7 - Digital circuit MEMA]

[Figure 8 - Analog circuit MEMA: the labels identify the substrate interconnections and the integrated circuit, capacitor, resistor, and transistor chips.]

[Figure 9 - Computer build-up]

The MEMA contains the functional complexity
of a large printed circuit board, but in a highly miniaturized package, yielding improvements in performance,
reliability, cost, size, and weight.
Several levels of modularity are provided in the
computer design, as illustrated in Figure 9. Monolithic integrated circuits are assembled to form
MEMAs, MEMAs are assembled to form functional

submodules, submodules are assembled to form functional modules, and the modules are then assembled
to form the computer unit.
Due to the high frequency of the clock pulse generator, considerable design effort has been expended to
assure that propagation delays and the clock pulse
skew effect are minimized.3 Propagation delays
through logic and interconnections are made nearly
constant to all parts of the computer. The clock pulse
generator is located at the physical center of the computer, with the pulses fanned out to all modules in
a radial manner. It should be noted that the dimensions
of the computer are in inches rather than feet, reducing the propagation delay problem. The small dimensions are primarily the result of the MEMA packaging technique.
The TEADDA packaging technique implements
each integrator and the associated programmable
interconnections and scaling in twelve MEMAs.
Each integrator and its associated logic will have a
Mean Time Between Failure (MTBF) of almost one
million hours, based on the ten million hour MTBF of
a MEMA. Therefore, the MTBF for the TEADDA
containing 64 integrators is over 12 thousand hours,
or 500 days. Using a degradation factor of two for
higher levels of packaging and interconnection, the
MTBF would be greater than 250 days of continuous operation. The low failure rate and the ease of
malfunction isolation and correction yield an extremely low down time for the TEADDA. This is a
significant factor in the low "cost of ownership" that
is an important part of the TEADDA concept.
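The series-system arithmetic behind these reliability figures can be checked in a few lines (a sketch only; the ten-million-hour MEMA MTBF is the figure assumed above):

    # Reliability check for the quoted MTBF figures (illustrative Python).
    mema_mtbf_hours = 10_000_000        # assumed MTBF of a single MEMA
    memas_per_integrator = 12
    integrators = 64

    integrator_mtbf = mema_mtbf_hours / memas_per_integrator
    print(integrator_mtbf)              # ~833,000 hours: "almost one million"
    system_mtbf = integrator_mtbf / integrators
    print(system_mtbf)                  # ~13,000 hours: "over 12 thousand"
    print(system_mtbf / 24)             # ~540 days: on the order of 500 days
    print(system_mtbf / 24 / 2)         # degradation factor of two: > 250 days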
CONCLUSION
The electrically alterable DDA should prove to be an
extremely useful simulation and computing "tool"


to be used in conjunction with analog, digital, and
hybrid computers. The TEADDA fills a technological
gap associated with computer systems, where the
TEADDA will complement the other computing
"tools" to more efficiently cover the spectrum of
computer applications.
REFERENCES
1 E L BRAUN

Digital computer design
Academic Press New York 1963 Chapter 8 P 448
2 G A KORN T M KORN
Electronic analog computers
McGraw Hill New York 1956
3 G P HYATT
High speed digital computer implementation techniques
1967 International Electron Devices Meeting

DATA FILE TWO-A data storage and retrieval system
by REUBEN S. JONES*
General Electric Company
Phoenix, Arizona


INTRODUCTION
The Data File Two* was conceived to meet the
needs of a corporate headquarters in the $500 million
sales category. The file would contain monthly sales
data for the past two years and the customer master
file, current month orders and sales, railcar locations,
current year profit plans, personnel records, ledgers,
one-year profit plan data, and five-year profit plan
data. This file is assumed to be about 400,000 items.
The stimulus for trying to put "all" corporate data
in one file came from dealing with these data handling
problems:
• Coding and coding structure is difficult to discipline and change. Yet, the acquisition of new
companies and the launching of new products
require fast change and severe discipline.
• Relating profit planning data for the next one- and five-year periods to current performance,
corporate goals, and the capital plan is done
manually. The planning process is severely limited by the slow turn-around on analyses of these
relationships.
• Inquiry into sales history, personnel files and
planning files, and current order and sales files is
available only through large periodic reports.
• New data structures (such as proposed organization changes) cannot be easily proposed and
analyzed without creating new files and programs
or doing the analysis manually.
• Data external to the corporate operations cannot
be easily put into the current corporate data structure for comparisons and analysis.

The reader should find the following ideas acceptable after reading this paper:
1. A data file with 100% indexing is feasible.
2. A file like DF-2 can be used for large data
banks (400,000 items).
3. Synonyms and dictionary word hierarchies
can be had with little extra use of disc space or
processing time.
4. List processing (as used here) is the best technique for indexing large data files.
5. The notational system used in the Integrated
Data Store1 literature is powerful and makes
file descriptions easily understood.
6. The IDS/COBOL verbs are useful in describing data file procedures.

File description

A. The directory-dictionary approach to communication with a data storage file
A directory-dictionary is used to classify data in
the file, to store that data, and to retrieve the
data from the file.
The directory contains a list of all of the attributes
under which a data item may be classified. The
directory might also be considered a list of field names. In DF-2, the directory would contain the
following kinds of entries:
PRODUCT
CUSTOMER SOLD TO
DATE SHIPPED

EMPLOYEE NAME
RAILROAD CAR NUMBER

There is a dictionary for each directory entry. This
dictionary contains the specific attribute values for
the directory item under which the dictionary is
found. A dictionary under the directory entry PRODUCT might contain:


DIAMMONIUM PHOSPHATE FELDSPAR
MONOSODIUM GLUTAMATE BENTONITE

When storing and retrieving data, the directory item
and the dictionary item are used in pairs:

* This paper was written while Mr. Jones was employed at International Minerals and Chemical Corporation where he held positions as Operations Manager - Accent International Division,
Manager - Systems Engineering, and Operations Research Engineer.
* A name only. No relation to other file design names.

PRODUCT            DIAMMONIUM PHOSPHATE
CUSTOMER SOLD-TO   WESTINGHOUSE
DATE INVOICED      01/02/67
CAR NUMBER         ACL 40781


Synonyms may be declared for both directory
entries and dictionary words so that short spellings,
common abbreviations and presently used coding
may be used to store and retrieve data in the file.
Data retrieval is accomplished by issuing to the
system lines of data description. Each line of data
description consists first of the directory entry
and, second, the dictionary value. Any number of
descriptive lines can be used to describe a data item.
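The pairing of directory entries with dictionary values can be pictured with a small in-memory model (an illustrative Python sketch only; DF-2 itself is a disc-resident IDS chain structure, and the second item's figures are invented for the example):

    # Toy model of directory/dictionary classification and retrieval.
    items = [
        {"PRODUCT": "POTASH",   "UNITS": "TONS", "VALUE": 407},
        {"PRODUCT": "FELDSPAR", "UNITS": "TONS", "VALUE": 130},
    ]

    def retrieve(description):
        """description: a list of (directory entry, dictionary word) lines."""
        return [item for item in items
                if all(item.get(entry) == word for entry, word in description)]

    # Two lines of data description, as in the text:
    print(retrieve([("PRODUCT", "POTASH"), ("UNITS", "TONS")]))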
DF-2 is to be implemented using Integrated Data
Store. 1 Integrated Data Store notation and terminology will be used throughout this file description.
The reader may wish to familiarize himself with IDS
notation and language by referring to the publication
titled "Integrated Data Store," published by the
Information Systems Division of the General Electric
Company. Some IDS diagram conventions and other
notation conventions are given in the Appendix.


B. The elementary file structure
The records and description which follow pertain
to the elementary file. "Elementary" here means
the minimum file necessary to implement the directory and dictionary and the storage of data items.
In the actual implementation, the data storage file
will be somewhat more elaborate in order to provide
additional features such as synonyms, storage of
character strings, Cartesian intersections such as
month, day and year, and hierarchical relationships
such as Belgium and Luxembourg being part of the
Benelux group. These elaborations, however, can,
at this point in the description, serve only to confuse
the basic file structure description.
The elementary file contains the following records:
File Entry - one record only in the file. The point
at which the file is entered.
Directory
Records - contain the names of the classes into
which the dictionary is divided.
These records may contain words
such as PRODUCT, CUSTOMER,
SHIP FROM, SHIP TO, NAME,
DATE, or EMPLOYEE NAME.
They are spaced out across the file.
Dictionary
Records - contain the particular descriptors
such as DIAMMONIUM PHOSPHATE, WESTINGHOUSE,
NAHX 94872, a Duns Number, or
JANUARY. They are placed near
the directory records.
Value
Records - These records contain only numeric
quantities in the elementary file.
They are placed at regular intervals
across the file.
Coupler
Records - contain nothing but chain links.
They are placed near the Value
records.

[Figure 1 - The elementary file structure diagram. An annotation notes that there is one Coupler record in the Value-Coupler chain for each value that a given dictionary word is used to describe.]

The Elementary File Structure Diagram is shown in
Figure 1. There is one Value record for each data
item. The Value record contains the one numeric
piece of data. There is one Coupler record in the Value
Coupler chain for each dictionary word needed to
classify the "value." The Directory chain connects
all of the directory entry records. The Dictionary
chain connects all of the dictionary words to a directory entry. The Value chain connects all of the value
records. The Where-used chain connects, via the
Coupler records, all of the places that a particular
dictionary word is used to describe a data item.
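These chain relationships can likewise be sketched as linked objects (illustrative Python only; IDS keeps these as chained records on disc, and the class and function names are assumptions):

    # Toy rendering of the elementary-file chains.
    class Record:
        def __init__(self, **fields):
            self.__dict__.update(fields)

    def store(value, descriptors, dictionary):
        """Create a Value record and one Coupler record per descriptor."""
        value_rec = Record(value=value, couplers=[])
        for entry_and_word in descriptors:
            word_rec = dictionary.setdefault(entry_and_word,
                                             Record(where_used=[]))
            coupler = Record(value_rec=value_rec, word_rec=word_rec)
            value_rec.couplers.append(coupler)               # Value-Coupler chain
            word_rec.where_used.append(coupler)              # Where-used chain
        return value_rec

    dictionary = {}
    store(407, [("PRODUCT", "POTASH"), ("UNITS", "TONS")], dictionary)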
An illustration of how a data item would be stored in
the Elementary File is shown in Figure 2. The data
item is the numeric value 407 classified under PRODUCT POTASH, and UNITS TONS. The blank
rectangle at (2) is the Coupler record linking the
number 407 to the word TONS under the directory
entry UNITS. The blank rectangle at (3) is a Coupler
record linking the number 407 to the dictionary word
POTASH under the directory entry PRODUCT.
The dictionary is to contain all the words necessary
to describe the data items. Computational numeric
values are stored in the value record. In the elementary file, the value record contains only a numeric
value and two file chain links.
In the full-blown file, the value records are allowed
to contain character strings such as addresses, terms
phrases, or shipping instructions.

[Figure 2 - An illustration of how a data item would be stored in the elementary file. Data item: VALUE 407 at (1), classified PRODUCT POTASH via the Coupler record at (3) and UNITS TONS via the Coupler record at (2). IDS diagram conventions are not used.]

C. A more elaborate file structure
1. Synonyms
A means of declaring synonyms in the dictionary is provided. Synonyms are used to name
coding or short words which could be used to
store data and to make inquiries. The synonym
for AC'CENT 24.;4(1/2) OZ might be: 51307,
24-4 1/2 OZ, 4 1/2 OZ AC'CENT, and AC'CENT 4 1/2 OZ. The principal descriptor used
as file output is called the prime dictionary
word, and the other synonyms are called
dictionary synonyms.
Synonym declaration is also provided for the
directory entries. Synonyms for the directory
entry PRODUCT could be PROD., PROD,
PRO and PROD CODE.
Figure 3 shows the elementary file diagram
with synonym records added. The discussion
on synonyms applies to directory synonyms as
well as dictionary synonyms.

The synonyms use up file space only to the
extent of adding one dictionary record for
each synonym.
The dictionary chain is in alphanumeric sequence. The synonym records are in the same
dictionary chain so that prime words and
synonyms are intermingled in the list. Each
synonym record is also a detail record of a
chain of which the prime word is master. The
synonym records have a pointer to master
so that when a search hits on a synonym,
the prime word may be retrieved directly.

Figure 3 - The elementary file diagram with synonym records

2. Character String
Long credit terms, phrases, shipping instructions, spelled-out company names and street
addresses would probably not be dictionary
words. It is, therefore, necessary to provide a
means of storing longer character strings.
To provide storage and retrieval of character
strings, three new record types are provided.
The character string record is comparable to
the value record in the elementary file. The
first 40 characters of a string are stored in the
character string record, CHAR-STRING.
The remaining characters in the string are
stored in records named STRING-LINES.
These records hold 40 characters each and
are chained to the CHAR-STRING record as
master thus enabling the file to hold character
strings of any length.
Another type of Coupler record is needed
which is named STRING-COUPLER. The
elementary file with character string records
added is shown in Figure 4.
The Character-String records could be thought
of as a second file which uses the same dictionary as the first file.

[Figure 4 - The elementary file with character-string records: the STRING-LINES records are chained to the CHAR-STRING record by the EXTEND-STRING chain.]

[Figure 5 - The elementary file with the Cartesian record: the CARTESIAN record is master of the CART-CHAIN chain.]

3. The Cartesian Directory Record
When storing data, the question arises as to
whether to put the date, "01/02/67", in the dictionary or to store the date as three parts: month
1, day 2, and year 67. Usually, the decision
would be to store month, day, and year separately in order to make it easier to summarize
data into monthly and yearly periods.
The casual system user who is retrieving data
on, for example, shipments for May 2, 1968, is
not likely to remember that date is stored as
month, day, and year. A solution would be to
Store the date under DATE 01/02/67, and
MONTH 1, DAY 2, and YEAR 67. This would,
however, increase the number of Coupler
records needed in the file and thus would
increase the use of physical file space.
The splitting of date into month, day and year
is provided for in the directory without redundant data classification. This is done by using the
CARTESIAN directory record. The Elementary
file with the CARTESIAN record is shown in
Figure 5.
The CARTESIAN record is a detail record in
the Directory chain just as are other Directory
records.

The CARTESIAN record is master of the
chain CART-CHAIN. The detail records in the
CART-CHAIN chain are the Directory records
which make up the CARTESIAN directory
entry. Using the date example, the directory
word DATE would be stored in the CARTESIAN record and the directory entries DAY,
MONTH and YEAR would be stored in
and YEAR records would be detail records
of the CART-CHAIN chain which had the
DATE record as master. The CARTESIAN
record has no dictionary since the elements of
date are stored in the MONTH, DAY and
YEAR dictionaries. Synonyms for the CARTESIAN directory record are contained in the
CART-SYN record.
When retrieving data using DATE 1/2/67, the
retrieval program would, upon finding that
date was in a CARTESIAN record, fracture
the inquiry into 1, 2, and 67, assigning "1"
to the first directory item in the CART-CHAIN
chain, which would be MONTH, assigning
"2" to the next CART-CHAIN chain record,
DA Y, and assigning "67" to the next record
item YEAR. An inquiry which used
DATE
1/2/67
for a retrieval criteria would be translated into:
MONTH   1
DAY     2
YEAR    67
The same would occur when storing a data item.
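The fracture step amounts to a positional split along the CART-CHAIN, as the following sketch suggests (illustrative Python; the function name and table are assumptions):

    # Splitting a Cartesian inquiry into its component directory entries.
    cart_chain = {"DATE": ["MONTH", "DAY", "YEAR"]}   # detail order of CART-CHAIN

    def fracture(entry, value):
        parts = cart_chain.get(entry)
        if parts is None:
            return [(entry, value)]                   # not a Cartesian entry
        return list(zip(parts, value.split("/")))

    print(fracture("DATE", "1/2/67"))
    # [('MONTH', '1'), ('DAY', '2'), ('YEAR', '67')]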

4. The Benelux Hierarchy
Dictionary words have hierarchical relationships to other words that need to be defined in
the data file. An example might be sales to
Belgium, the Netherlands, and Luxembourg
which make up the Benelux group. It is desirable to have data for these three countries
summed when a call for sales to Benelux is given.
All data stored under Belgium, Netherlands, and
Luxembourg could also be connected to the
word Benelux. As in the consideration of
synonyms, this would require an additional
Coupler record for each data item. By use of
the BENELUX record and the LUXEMBOURG record in the Where-used chain, the
number of extra records to define the Benelux
problem described above is reduced to four
regardless of the number of data items. The
diagram of the Elementary file with BENELUX
and LUXEMBOURG records added is shown
in Figure 6.


The BENELUX and LUXEMBOURG records
are detail records in the Where-used chain, and
the LUXEMBOURG record is also a detail
record in the chain LUX-CHAIN.
The BENELUX record is master of the LUX-CHAIN. The BENELUX record is found in
the Where-used chain of the dictionary word
higher in the hierarchy (the word BENELUX
in our example) and the LUXEMBOURG
record is found in the Where-used chain of the
dictionary word lower in the hierarchy (Belgium,
Luxembourg, and the Netherlands, in our
example). The depth of the hierarchy is unlimited, and the number of different hierarchies
in which a word may be defined is unlimited
since there is no limit to the number of BENELUX and LUXEMBOURG records which
may be placed in the Where-used chain. The
linking together of these records for the Scandinavian countries is illustrated in Figure 7.
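The effect of the hierarchy records on retrieval can be suggested in miniature (an illustrative Python sketch; the sums are invented for the example):

    # Rolling up a group word to its member words via the hierarchy chains.
    lux_chains = {"BENELUX": ["BELGIUM", "NETHERLANDS", "LUXEMBOURG"]}

    def group_total(word, totals):
        members = lux_chains.get(word, [word])     # group word or ordinary word
        return sum(totals.get(m, 0) for m in members)

    print(group_total("BENELUX",
                      {"BELGIUM": 10, "NETHERLANDS": 7, "LUXEMBOURG": 2}))  # 19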

[Figure 6 - The elementary file with Benelux and Luxembourg records: the BENELUX record is master of the LUX-CHAIN chain, and the LUXEMBOURG record is a detail record of it.]

[Figure 7 - How Benelux records are used to tie the Scandinavian countries together: a SCANDINAVIA dictionary record is linked, through Benelux records in the Where-used chains, to the DENMARK, SWEDEN, and NORWAY dictionary records. IDS diagram conventions are not used.]

Data extraction scheme
A. Data extraction scheme for the elementary file

In order to explain the data extraction scheme,
the elementary file will be used along with a simplified
retrieval routine and inquiry. In the next section on
retrieval timing, other IDS features and record
fields not mentioned before will be added to improve
the retrieval times.

The user wishes to see a total of all potash sales.
The inquiry would be:

FOR
PRODUCT       POTASH
TRANSACTION   SALE
DISPLAY       TOTAL
Figure 8 shows a data extraction procedure for the
elementary file which would serve the above inquiry.

[Figure 8 - A data extraction procedure for the elementary file. In the flow chart, GO TO branches are drawn to the left and ELSE branches downward. The steps are:
1. Retrieve File Entry record.
2. Retrieve Directory record for PRODUCT.
3. Retrieve Dictionary record for POTASH.
Loop A:
4. Retrieve next record of the Where-used chain.
5. If record is chain master, the search is complete.
6. Save REFERENCE-CODE in WORK-AREA-B.
Loop B:
7. Retrieve next record on the Value-Coupler chain.
8. If record is chain master, return to Loop A.
9. If REFERENCE-CODE equals WORK-AREA-B, repeat Loop B.
10. Head the Where-used chain.
11. If Dictionary record is not SALE, repeat Loop B.
12. Head the Value-Coupler chain.
13. Add "value" to TOTAL bucket.
Print TOTAL bucket.]
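Reusing the toy chain structures from the earlier storage sketch, the Figure 8 procedure amounts to the following loop nest (illustrative Python only; the step numbers in the comments refer to the figure):

    # The Figure 8 search expressed over the toy chain model sketched above.
    def total(dictionary, first_word, second_word):
        first_rec = dictionary[first_word]             # steps 1-3
        second_rec = dictionary[second_word]
        bucket = 0
        for coupler in first_rec.where_used:           # Loop A (steps 4-6)
            for other in coupler.value_rec.couplers:   # Loop B (steps 7-8)
                if other is coupler:
                    continue                           # step 9: skip the entry coupler
                if other.word_rec is second_rec:       # steps 10-11: test for SALE
                    bucket += coupler.value_rec.value  # steps 12-13
                    break
        return bucket

    # e.g. total(dictionary, ("PRODUCT", "POTASH"), ("TRANSACTION", "SALE"))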

B. Modifications to the elementary data extraction
scheme for faster operations and a more elaborate
inquiry.
The following discussion concerns data extraction
speed and the file and procedure modifications
necessary for speed-up.
The following timing assumptions and definitions
will be made so as to simplify this discussion.
• Magnetic disc with movable arms will be the
file device.
• A large number of IDS pages will remain in core
memory (10 or more).
• The file device will have one data channel and
will allow no simultaneous arm seeks.
• In-core processing time will be either overlapped or otherwise insignificant.
• For retrievals of records along a chain which
uses the "Place-Near" IDS statement, there
will be one disc read for each five (5) retrieve
commands and one arm seek for each ten (10)
retrieve commands.
• For retrievals of records not along a "Place-Near" chain, there will be one seek and one read
for each retrieve command.
• An average of two synonyms will be assumed
for each prime dictionary word and directory
entry.
• V = the number of value records (data items).
• C = the average number of dictionary words
needed to describe one value. Also, the average
number of Coupler records in the Value-Coupler
chains.
• D = the number of dictionary prime words.
• E = the number of directory prime entries.
• VC = the average number of Coupler records in
the Where-used chains.
• t = average disc rotational delay.
• T = average arm seek time.
• N = number of search (item selection) conditions
given per inquiry.

The chart in Table I gives a brief statement of
how the file or procedure was changed for speed-up,
the resulting timing formula for each block in Figure 8,
and the file search time computed for a specific
example.
The example data base used in the timing and file
size illustrations is as follows:
Prime dictionary words                          100,000
Synonyms per word                               2
Prime directory entries                         100
Synonyms per entry                              2
Data items                                      400,000
Classifications per data item                   20
Number of item selection criteria per inquiry   2
The largest time loss in the Elementary procedure
is in Step 10. This can be reduced to in-core time by
defining a link to master of the Where-used chain in
the Coupler record and by using the reference code*
of SALE to compare to this link (reference code).
This change would cause the repetition of the processing in Steps 1 through 3 "N" times since the
reference code of SALE must be found for comparison.
*Reference code is an IDS term for the record's relative address on
the file.

The next largest time loss is in Step 7. This will
stay except for a small reduction. The number of
executions of Loop B will be reduced by putting a
count of Coupler records on the Where-used chain in
the prime Dictionary record. This will allow us to
select the search path (in the search example, the word
POTASH or the word SALE) which is the shortest.
This would reduce the number of Loop B executions
by a factor of about fN/N.

Steps 2 and 3 will be reduced by breaking the Dictionary and Directory chains with range master*
records. These range master records will be spaced
across the file allowing room for Dictionary records
to be clustered around them.
The conditional branch at Step 11 would be elaborated to include a check against all search conditions
with each condition being marked for hit or no-hit.
The "add Value to bucket" step at 13 would be replaced by whatever report or computational requirements are called for.
Step 12 time will probably be zero since the Value
coupler chain is a "Place-Near" chain, and the
record is likely to be in core.
Table I gives the arithmetic expressions for the search
time consumed by each procedure step.
The average search time into the example data base
after making the modifications noted above is:
1425 (T/10 + t/5) + 64 (T + t)

For disc units with the seek and rotational delay times
given below, the average search time is:

t (ms)    T (ms)    Average search time
25        200       50 seconds
10        90        22 seconds
25        0         9 seconds
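These times follow directly from the expression above; a quick check (illustrative Python, times in milliseconds):

    # Verifying the quoted search times from the timing formula.
    def search_ms(T, t):
        return 1425 * (T / 10 + t / 5) + 64 * (T + t)

    for t, T in [(25, 200), (10, 90), (25, 0)]:
        print(round(search_ms(T, t) / 1000), "seconds")   # prints 50, 22, 9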

File size for full-blown file with all features discussed

The full-blown file structure is shown in Figure 9.

[Figure 9 - The full-blown file structure diagram]

The size of an Integrated Data Store record is:

R = 5 + 4L + (number of data characters)

where:
R = record size in characters
L = number of chain links, counting links to master

The sizes of the DF-2 records are shown in Table
II. Also in Table II, the expressions for computing
the disc space used by each record type are shown. A
file size is computed for the example data base.

*Range masters break up a long detail chain. Where S = the number of
records in the chain, the search time is reduced to about 2√S/S of its
original value.
The data storage and retrieval language
A. Function
The function of the Data Storage and Retrieval
Language is to provide a means of storing data in the
file from common sources and making changes to that
data, to make changes to the descriptive words used
in the directory and dictionary, and to retrieve and
report data from the file.
More specifically, it is assumed that the following
operations with the file will be desirable:

1. To define dictionary words and their synonyms.
2. To define directory entries and their synonyms.
3. To load data into DF-2 from magnetic tape and
card files used as part of existing systems.
4. To make changes to dictionary words, directory
entries and their synonyms already stored in
the file.
5. To change data previously loaded into the file.
6. To select and display data from the file "online".
7. To extract and print data in more elaborate
report forms "off-line".

TABLE I - AVERAGE SEARCH TIME EXPRESSIONS

For each search procedure step in Figure 8, the table gives the file changes
made for speed-up and an explanation of the expressions, the average search
time expression, and its value for the example given in the text.

Step 1: File master is usually core resident.
  Average search time: 0.  Example: 0.

Step 2: Directory chain is a "place near" chain subdivided by range master
  records; two synonyms per prime directory entry.
  Average search time: N √(3E) (T/10 + t/5).  Example: 35 (T/10 + t/5).

Step 3: Dictionary chain is a "place near" chain subdivided by range master
  records; two synonyms per prime dictionary word.
  Average search time: N √(3D/E) (T/10 + t/5).  Example: 110 (T/10 + t/5).

Step 4: Always a random retrieval.
  Average search time: VC (T + t) fN/N.  Example: 64 (T + t).

Steps 5 & 6: In-core operations.
  Average search time: 0.  Example: 0.

Step 7: Value-Coupler chain is a "place near" chain; the step is executed C
  times for each Loop A execution.
  Average search time: C (VC)(T/10 + t/5) fN/N.  Example: 1280 (T/10 + t/5).

Steps 8 & 9: In-core operations.
  Average search time: 0.  Example: 0.

Steps 10 & 11: A link to master is placed in the Coupler record for the
  Where-used chain. The reference code in this link is used to compare to the
  reference code of the dictionary word SALE. In this way, the retrieval of
  the dictionary word is avoided at this procedure step.
  Average search time: 0.  Example: 0.

Step 12: Since Step 7 accessed all couplers for the Value record, and since
  the Value-Coupler chain is a "place near" chain, the page containing the
  Value record is almost certain to be in core.
  Average search time: 0.  Example: 0.

Step 13: In-core operation.
  Average search time: 0.  Example: 0.


TABLE II - DISC SPACE USAGE

                                     Expression for                  Estimated File Size
Record Type     Links  Characters    Number of        Number of      for Example (millions
                       per Record    Records          Records        of characters)
DIR-RANGE       2      13
DIRECTORY       4      39            E                100
CARTESIAN       3      35
DIR-SYN         2      31            2E               200
CART-SYN        2      31
DICT-RANGE      2      18
DICTIONARY      4*     44            D                100,000        4.40
SYNONYMX        2      31            2D               200,000        6.20
LUXEMBOURG      3*     25            1,000**          1,000          .03
BENELUX         3*     35            300**            300            .01
STD-COUPLER     3*     17            CV               8,000,000      136.00
STRING-COUPLER  3*     17
VALUE           2      21            V                400,000        8.40
CHAR-STRING     3      57            V/10**           40,000         2.28
STRING-LINES    1      49            V/10**           40,000         1.96
TOTAL                                                                159.28

* Links to master included.
** An estimating basis assumption.


All of these functions are to be defined by simple
statements which can be typed in at remote terminals
and which will be executed by interpretive programs
written in IDS/COBOL.
The data retrieval capability is to be used by persons
with little data processing experience and probably no
formal typing training. Except for logic of two or more
levels and "greater than"/"less than" conditionals, no
special characters will be needed. Descriptions with
imbedded single blanks may be used since two or
more blanks are used to separate words instead of
single blanks or commas.
B. Examples of language uses
A formal language description is not given because
it has not been completed. Some illustrations of the
use of the DF-2 language for retrieval of railcar tracing
data, planning data, and personnel data are shown below.
Example 1
Railcar locations might be stored and updated in
Data File Two. The first example is a request
for all railcars passing St. Louis carrying Triple
Super Phosphate which are loaded but unassigned
(Roller) or destined for the company's own plant
(Plant Food). The system responded, as shown,
with four hits.

FOR
STATUS         ROLLER
OR SOLD TO     PLANT FOOD
CAR LOCATION   ST. LOUIS
PRODUCT        TRIPLE SUPER
SIGHTING       NOT DESTINATION
UNITS          TONS
DISPLAY
CAR NUMBER  PRODUCT TYPE  REPORTING RR  VALUE
END

CAR NUMBER   PRODUCT TYPE   REPORTING RR   VALUE
ACL 086541   ROP            MONON          76.3
SAL 030655   GRANULAR       MOPAC          54.4
ACL 761453   COARSE         KCS            74.6
SAL 088304   ROP            MONON          79.3
END

Example 2
Company controlled leased cars would be
maintained on the file whether loaded or returning. Another type of inquiry would be to count
the hopper cars returning to three shipping
locations (Noralyn, Bonnie, Port Sutton).

FOR
STATUS      RETURNING
CAR TYPE    HOPPER
COUNT
RETURN TO   NORALYN
RETURN TO   BONNIE
RETURN TO   PORT SUTTON
END

RETURN TO   NORALYN        5
RETURN TO   BONNIE        10
RETURN TO   PORT SUTTON   16
END

Example 3
The profit planning data for the company might
also be maintained on the file. The next example of the language use shows the TOTAL verb
and the system response.

FOR
TRANSACTION     SHIPMENT
UNITS           TONS
FISCAL YEAR     68/69
GENERATION      FORECAST 1
PRODUCT GROUP   POTASH
TOTAL
MONTH           ALL
END

JULY        45260
AUGUST      43565
SEPTEMBER   42536
OCTOBER     32564
NOVEMBER    50632
DECEMBER    86734
JANUARY     361271
FEBRUARY    201000
MARCH       107271
APRIL       50000
MAY         501761
JUNE        250302
END

Example 4
The next example is an internal personnel
search for an engineer with business systems
experience.

FOR
DEGREE       MASTER
MAJOR        ENGINEERING
             MECHANICAL
             CHEMICAL
             MINING
AGE          LESS THAN 35
EXPERIENCE   BUSINESS SYSTEMS
DISPLAY
NAME   LOCATION   FIELD
END
(System response not shown)
APPENDIX

CONVENTIONS USED IN THE TEXT AND
IN EXHIBITS IN THIS PAPER WITH REGARD
TO RECORD NAMES, CHAIN NAMES,
WORDS APPEARING LITERALLY IN A
RECORD, AND SCHEMATIC DIAGRAMS OF
LIST STRUCTURES

Record, field and chain names are in italics and
are in all capital letters.
Examples:
FILE-ENTRY    a record name
WHERE-USED    a chain name
WORK-AREA-B   a field name

Words which are contents of record fields, such as
dictionary words in the data file, are in all capital
letters. Numeric values which are contents of Value
records are in italics.
Examples:
MONTH    JANUARY
YEAR     67
VALUE    245.678

Descriptive names of records are used in describing
the elementary file to avoid having the reader memorize shortened COBOL names. These names are in
beginning capitals only.

Integrated Data Store schematic diagram conventions are used in figures except where noted. A brief
review of IDS diagram conventions is given in Figure 10.

[Figure 10 - IDS diagram conventions: each box represents one record type, and each record type has a name. A chain is represented by an arrow, and chains have names. Each chain has one and only one master record type; the record type which is master of the chain is at the tail of the arrow, and record types at the head of arrows are detail records of the chain. A record may be master of one chain and detail of another. Any number of record types may be detail records of the same chain, and a record type may be detail of more than one chain.]
REFERENCES
1 INFORMATION SYSTEMS DIVISION GENERAL
ELECTRIC COMPANY
Integrated Data Store AS-CPB-483A Rev 7-67

GIPSY -A generalized information processing system
by GIAMPAOLO DEL BIGIO
International Atomic Energy Agency
Vienna, Austria

INTRODUCTION
The problem of mechanized documentation at the
International Atomic Energy Agency was first approached by using the IBM KWIC system, which
proved, however, to be insufficient for most of the
applications which were envisaged.
Consequently, it was decided that a new system,
GIPSY, should be developed which had the following basic characteristics:
- the input material must be entered in such a way
that it can be unambiguously recognized, i.e.,
retrieved, by the computer;
- the system must be able to store and retrieve
the information, and display it in a form which
is not fixed a priori, but which can vary according to the user's needs;
- the system must provide for such additional functions as sorting, duplication check and file maintenance.
From the consideration of these points it became
clear that the system had to comprehend two subsystems:
1) the input subsystem which permits consistent
cataloging and encoding of the material to be
processed; and
2) the machine subsystem which processes the
material by means of the computer.
The input subsystem has been discussed in detail
elsewhere,1 while the description of the machine subsystem is the purpose of this paper. However, since
the two subsystems cannot be described completely
independently, a short description of the first is given
in the next section.
Input organization

The input to the system consists of the bibliographic
descriptions of documents, or input "units." An input
"unit" consists of a set of bibliographical data, or
"data elements," e.g., author, title, publisher, etc.
Some data elements are composed of several subelements, e.g., multiple authors. These are identified
by special marks in the body of the data element.
Where needed a fixed structure, according to normal
documentation practice, has been defined, using
key-words such as VOL., NO., P., etc. and/or punctuation signs.
This permits the program to check the formal correctness of input and give specific error messages
where the defined input rules are not followed. It
also permits the enforcement of consistency in input
preparation, which ultimately means consistency in
the output products.
Each "data element" is given a 4-position code, the
B-code, as follows:
First position:
Second position:
Third position:
Fourth position:

Class and subclass code
Data type code
Data subtype code
Language code.

The concepts of class, subclass, type and subtype
are described below. We have identified, in GIPSY,
three types or "classes" of documents:
1) Original: the publication whose bibliographic
description is being entered into the system;
2) Source: the publication in which the "original"
document was documented, e.g. an abstract
journal;
3) Citation: a document cited by the "original"
document.
In addition, a document of any of the three classes
above may be independent, e.g., a technical report, or
it may be part of some larger work, e.g., a journal article, a book in a series, etc. We have therefore introduced subclasses to indicate a dependency relationship: the first subclass pertains to the main document, the others to the related document.
Consequently a bibliographic description of a document is the description of the document itself and
the description of any other related document which
may exist.
For example, the bibliographic description of a
journal article found in an abstract journal requires

the description of the following three documents:
1) the article itself (class original, subclass main
document);
2) the journal (class original, subclass related
document); and
3) the abstract journal (class source, subclass
main document).
A document of any class is described by giving
the pertinent bibliographical data in a predetermined
sequence.
The code for a data element is a two-level code, the
first level representing the general type of data (e.g.,
author, title, etc.), the second level a specific type
(e.g., personal author, corporate author, main title,
subtitle, etc.).
For some data elements an additional identifier
may be needed: the language in which the information
is entered.
A typical example of a B-code is the following:

A    class: original; subclass: main document
2    type: title
0    subtype: main title
E    language: English
Since data type and subtype apply to any class of
document, assuming the above code represents the title
of a journal article, the title of the parent journal
could have the following B-code:

C    class: original; subclass: related document (journal)
2    type: title
0    subtype: main title
F    language: French

The actual B-codes used in the GIPSY system are
given in Appendix 1.
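Decoding such a code is a simple positional lookup, as the following sketch suggests (illustrative Python only; the tables are abbreviated to the codes quoted above and are not the full GIPSY code set):

    # Hypothetical sketch of decoding a 4-position B-code.
    CLASS_SUBCLASS = {"A": ("original", "main document"),
                      "C": ("original", "related document")}
    DATA_TYPE = {"2": "title"}
    DATA_SUBTYPE = {"0": "main title"}
    LANGUAGE = {"E": "English", "F": "French"}

    def decode_bcode(bcode):
        cls, typ, sub, lang = bcode          # the four positions, in order
        klass, subclass = CLASS_SUBCLASS[cls]
        return {"class": klass, "subclass": subclass,
                "type": DATA_TYPE[typ], "subtype": DATA_SUBTYPE[sub],
                "language": LANGUAGE[lang]}

    print(decode_bcode("A20E"))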

Figure 1- Major programs: input checking, printing of bibliography,
generation of index files

Overall system description
System organization

The machine subsystem consists of a set of nine
special purpose programs, a monitor, a generalized
sort program, and a system maintenance program.
The approach taken in the design was to construct
a modular system in such a way that the features
invoked, which are important but might not all be
needed to produce a given output, may vary according to the
particular job.
The execution time of individual programs may also
vary greatly according to the features selected.
Special purpose programs
The nine special purpose programs may be subdivided into four groups as described below. (The
information flow throughout the system is shown
in Figures 1, 2, 3, 4, and 5.)

Figure 2-Major programs: printing of indexes

[Figure 3 - Utility programs]

[Figure 4 - File maintenance programs]

Major programs
These permit the production of:
a) A check-list of the input material for proofreading, which includes program-generated
error messages concerning formal input errors
such as sequence, punctuation, coding, missing
data elements, etc.
b) A printed bibliography, which may be classified
according to a user-supplied subject classification scheme. Provision for cross references is
also made.
c) Two types of indexes:
1) Index 1: composed of the index entry and
the document numbers of the relevant citations in the printed bibliography.
2) Index 2: composed of the index entry, a
portion of the input unit, and the document
number of the relevant citation in the printed
bibliography. A special case of Index 2 is a
KWIC (Key-Word-In-Context) index.
d) Catalog cards.

Figure 5 - Duplication check programs

Utility programs
These allow the user to change the order of the
units within a file, and/or of data elements within a
unit.
Since, for checking purposes, data elements within
an input unit must be entered in ascending order by
the respective B-code, a utility program is provided
to change the printing sequence of these data elements. This feature is particularly useful when producing catalog cards, where the first data element
is normally the one by which the cards are filed (e.g.
title, author, etc.).

Duplication check
These programs permit the automatic generation
of a duplication check code, for each input document, and the construction and updating of a dictionary file containing duplication check codes of all
documents processed by the system.
They provide for checking the duplication check
codes of the new material against the duplication
check dictionary to ensure that there are no duplicated
documents.
File maintenance
A generalized file maintenance program permits the
user to change the contents of a GIPSY file either by
replacement, deletion or insertion of items. The
possibility of extracting selective items from any one
file is also provided.
Monitor and system maintenance
The GIPSY system is tape-resident and therefore
a monitor is provided to call the desired program for
execution. Since each program returns control to the
monitor after completion of its functions it is possible
to "stack" several jobs in a single run. To reduce
operator intervention to a minimum the system permits the user to allocate tape units dynamically by
means of control cards.
The system maintenance program provides for updating the GIPSY system tape, by adding new programs, replacing entire programs or parts, or deleting
obsolete programs.
Generalized sort
The sort program utilized by GIPSY is the standard IBM SORT 7 for the 1401. Whenever needed
the relevant GIPSY program automatically generates SORT 7 control cards.
System control
Because of the large number of features provided
by the system, the operation of every GIPSY program is controlled by one or more control cards,
as explained below.
System control cards
These permit the selection of the desired program
from the system tape, allocate tape units and print
messages for the machine operator.
Standard GIPSY control card
This card is present for every program, has a fixed
format, and indicates items such as page size, record
length, presence or absence of an optional file,
spacing, etc.

Control statements
These cards specify options applicable to individual data elements, e.g., indentation, sorting, etc.
Control statements have a free format.
Description of individual programs

Major programs
GIPSY1
GIPSY1 is the input and editing program. It accepts
as input either a punched card or a tape file (CIT).
B-codes within a bibliographical unit are checked
for ascending sequence and validity. The ascending
sequence of document numbers can also be checked.
For those data types for which a structure is defined,
this is also checked.
Outputs from GIPSY1 are a check-list of the input
data with diagnostic messages, a statistical report
and a tape file called Standard Bibliographic File
(SBF), containing the material to be processed by the
system. GIPSY1 also provides for on-line error
recovery.
GIPSY3
The functions performed by GIPSY3 are the
following:
a) Produce a "ready-to-publish listing of the information stored on the SBF;
b) Add this information to a Master Bibliographic
File (MBF);
c) Generate an index file (FILEl)
The powerful report generator included in G IPSY3
provides for several features, the most important
of which are:
-variable page size;
- variable spacing between units;
- suppression of the printing of selected data
elements (by using this feature one can print,
for example, a title list);
- indentation of selected data elements;
-overprint (bold-face) of selected data elements;
- spacing before printing selected data elements.
In addition, provided that the bibliographic units
have been assigned a subject identification code, subject headings, subheadings and cross references can
be printed by using a special card file (header cards).
During or independently of the printing of the SBF
it is possible to generate another file containing selected data elements (FILE1). The purpose of
FILE1 is to obtain printed indexes such as author,
corporate author and report number indexes.
The ability to generate several indexes at one time
permits considerable saving of machine time and
set-up time; only one sort and one printing pass are
necessary.

When a report number index is requested, this is
generated in such a way that the report numbers will
be sorted in the correct alphanumeric sequence, e.g.,
TID-47 before TID-100. This is done automatically
by the program and does not require manual editing
of the report number when punching.
The information processed can be stored on a master file (MBF) which can be subsequently used to produce, for example, cumulative indexes.

GIPSY4
GIPSY4 prints the contents of a sorted index file (FILE1) generated by GIPSY3. The main features are the ability to print directly in two columns and to print several indexes in different formats. The report generator includes, where applicable, the same features as in GIPSY3. By using header cards a fixed title can be printed at the top of every page.
Since a new set of control cards is required for every index, the printing format can be changed for every index.
GIPSY8
GIPSY8 accepts as input an SBF and generates an index file (FILE2). Any data element may be selected as the index entry (author, corporate author, keywords, etc.), and any data element may be selected as additional information (e.g., an author index showing the title as additional information). GIPSY8 can also generate a subject index by extracting keywords from selected data elements, e.g., title, abstract. In this case either KWIC (Key Word In Context) or KWOC (Key Word Out of Context) indexes can be generated. When generating such a subject index there are certain words which should not be selected, e.g., prepositions or articles. These are referred to as stopwords. There are 15 stopwords which, according to our own and others' experience, constitute about 25% of the total number of words in a given file. These are the following: a, of, in, on, by, to, as, at, an, the, and, for, with, from, some. There are certainly other words which appear quite frequently and which could be included in the list. It was felt, however, that the user should decide on these words rather than the system designer, since in fact most of them depend on the specific application. The approach taken in GIPSY8 is to provide a built-in stopword table containing the words given above, but to let the user be free to provide
his own list, which can contain about 300 words maximum. This also permits the use of stopwords in different languages. The stopword lists currently used
at the International Atomic Energy Agency show that
about 50% of the words in a given file can be eliminated.
It should be noted, at this point, that the stopword list of GIPSY8 does not attempt to produce the final version of the index (this function is accomplished by the external stopwords in GIPSY9) but rather to reduce the sorting time of the index file as much as possible. Another important feature of GIPSY8 is that it lets the user decide which characters are to be taken as part of the selected word.

GIPSY9
GIPSY9 prints the contents of a sorted FILE2
generated by GIPSY8. The report generator in
GIPSY9 provides for a large number of printing
features.
In addition to the index file itself, GIPSY9 accepts
a stopword file, referred to as the external stopword
file. This is the counterpart of the stopword list of
GIPSY8.
GIPSY9 can be conditioned to print only those words which are on the external stopword file; in this case the file acts as a dictionary of meaningful words.
To improve the consistency and usefulness of the index, the external stopword file provides for the printing of cross references and for the possibility of printing certain words only for selected documents. This feature is particularly useful for handling those words which have the same spelling but different meanings.
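The two-stage stopword scheme can be illustrated with a short sketch (modern Python with simple whitespace tokenization and an invented entry format; the actual programs worked against tape files on the 1401):

    BUILT_IN_STOPWORDS = {    # the 15 built-in words listed above
        'a', 'of', 'in', 'on', 'by', 'to', 'as', 'at', 'an', 'the',
        'and', 'for', 'with', 'from', 'some',
    }

    def kwoc_entries(doc_number, title, user_stopwords=()):
        # GIPSY8's role in this sketch: drop high-frequency words before
        # the sort, purely to shrink the index file that must be sorted.
        stop = BUILT_IN_STOPWORDS | {w.lower() for w in user_stopwords}
        for word in title.split():
            key = word.strip('.,;:').lower()
            if key and key not in stop:
                yield (key.upper(), doc_number)  # one entry per keyword

GIPSY9's external stopword file would then be applied at print time, either to suppress further words or, in dictionary mode, to print only the words it contains.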
Utility programs

GIPSY2
The main feature of GIPSY2 is the automatic generation of sort fields. The SBF is a variable-length record file, and the length of different data elements is variable within the record itself. The existing sort programs, on the other hand, require that the sorting keys always be in a fixed position of the record. The automatic generation of sort fields permits sorting of the SBF by any desired data elements: this is done by extracting information from the selected data elements and inserting it in a fixed position of the output SBF.
When the contents of the SBF are written on the
master file (MBF) by GIPSY3 the generated sort
fields are deleted.
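A minimal sketch of the idea, assuming records are modeled as Python dictionaries of element codes to values (an illustration only; the SBF is a variable-length tape record, not a dictionary):

    def add_sort_fields(record, element_codes, widths):
        # Copy each selected element into a fixed-width slot and prepend
        # the result, so a conventional fixed-position sort can key on it.
        prefix = ''.join(record.get(code, '').ljust(width)[:width]
                         for code, width in zip(element_codes, widths))
        return prefix + record['raw']   # 'raw' is the original record image

    # e.g. add_sort_fields(rec, ['A10', 'A41'], [30, 6]) is a rough
    # analogue of a *SORT-FIELDS control statement.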
In the case of periodic announcement lists, the bibliographic citations are normally assigned a sequentially ascending document number which, if the bibliography is sorted by the computer, is not known at the time of input preparation. A feature of GIPSY2 allows the user to assign such a document number after the file has been sorted. (Other features of GIPSY2 are described in a later section.)


GIPSY7
This program permits the arrangement of data elements within a printed bibliographical unit in an order other than the ascending sequence of the B-codes, as required by GIPSY1. GIPSY7 is normally used before GIPSY3.
Duplication check

GIPSY2

Besides the features described earlier, GIPSY2 may also be used to generate a 13-character duplication check code as follows:
NNNNIIYYTTTTT
where: NNNN are the first 4 characters of the name of the first personal author (or editor);
II are the initials of the above; if neither an author nor an editor was given, then NNNNII are the first 6 characters of the corporate author;
YY is the year of publication;
TTTTT are the first characters of the first 5 words of the title.
If any item, or part of it, is missing, the corresponding positions of the dupcheck code contain
full-stops (.). The duplication check code, the corresponding document number, and some additional
information are written on the dupcheck file. The
additional information is selected automatically by
the program according to the particular type of document, e.g., report number(s) for technical reports,
subtitles for books, journal citation for journal articles,
etc. This need was felt after some experience with a
previous version of the system, in which the additional
information was not present.
In fact, in some cases (in particular for progress reports), the dupcheck code generated by the program was the same for different documents. This required visual scanning of the full citation in order to be sure that two documents with the same duplication check code were actually the same document. Since the additional information was put on the duplication check file, this scanning is practically no longer required.
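A sketch of the code generation follows (modern Python; the keyword arguments are an invented interface, and only the basic NNNNIIYYTTTTT layout described above is implemented):

    def dupcheck_code(author=None, initials=None, corporate=None,
                      year='', title=''):
        if author:                              # NNNN plus II
            head = (author[:4] + (initials or ''))[:6]
        elif corporate:                         # first 6 of corporate author
            head = corporate[:6]
        else:
            head = ''
        ttttt = ''.join(word[0] for word in title.split()[:5])
        # Missing items leave full-stops in the corresponding positions.
        return (head.ljust(6, '.') + str(year)[-2:].ljust(2, '.')
                + ttttt.ljust(5, '.')).upper()

    print(dupcheck_code(author='PETRUKHIN', initials='VJ', year=1966,
                        title='ON A POSSIBLE APPLICATION OF A NUCLEAR REACTION'))
    # -> PETRVJ66OAPAO, the code shown for input unit 51542 in Figure 10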

GIPSY6
GIPSY6 accepts as input a dupcheck file generated by GIPSY2 and a dupcheck dictionary which contains the dupcheck codes of all documents previously processed by the system.
GIPSY6 may be used in either of two modes: the check mode, in which the codes on the dupcheck file are compared with the codes in the dictionary and, each time they match, a message is printed showing the information on the file for those documents; or the update mode, in which the codes on the dupcheck file are added to the dictionary file. When operating in the update mode the program does not check for duplication, thus assuming that possible duplicate codes represent different documents.
A print-out of the updated dictionary can also be requested by the user.

Implementation

The GIPSY system has been implemented for an IBM 1401 with 12000 storage positions, high-low-equal compare and advanced programming features
(the space suppression feature, 4000 additional core
positions and the 1407 console typewriter may be used
if installed), 5 magnetic tape units, card reader and
punch unit.
It has been used since 1965 for the preparation of several nonperiodical and two periodical publications of the IAEA: the "List of References on Nuclear Energy," an announcement list with annual personal author, corporate author and report number indexes, and "Nuclear Medicine, A Guide to Recent Literature," also an announcement list, each issue of which includes an author index, an isotope index and a KWIC index. The total number of documents processed and sorted to date is about 50,000.
In addition to the above, GIPSY has been used with
success to produce other types of publications which
were not strictly documentation-oriented.


SUMMARY
GIPSY is a generalized system to process bibliographic information.
The input to the system describes the material
which is being documented. The bibliographic components are identified by means of a code (B-code),
permitting access to any of those components which
may have an independent documentary value (e.g.,
authors for author index, report numbers for report
number index, etc.).
The information can then be processed to obtain
printed reports: bibliographies, classified if desired,
with or without abstracts; and various indexes, including subject indexes (KWIC). A duplication check
procedure is also included in the system.
The aim of the system is to produce ready-to-publish copy, and therefore various editing and format features are provided. The information processed can be stored for subsequent use, e.g., production of cumulative indexes or retrieval.
CONCLUSION
After two years of practical experience with the
GIPSY system it can be said that the objectives outlined in the introduction of this paper have been fully
met and that in fact the system has proved to be general enough to handle applications which were not envisaged at the time it was designed.
Another important factor has been that through the
use of the system we have been forced to be much
more consistent in the application of descriptive cataloguing rules than we were during manual operation.
We also feel that the approach taken is a valid contribution to the creation of a generalized information
processing system in which the system adapts itself
to the user's needs, rather than the user to the system.
The need for this characteristic becomes more and
more evident with the information explosion of recent
years and with the increasing number of organizations
faced with the problem of mechanized documentation.

ACKNOWLEDGMENT
The author is indebted to the staff of the INIS Section of the IAEA and in particular to Mrs. J. Robinson and Mr. T. W. Scott for their suggestions during the development of the system. He is also grateful to Dr. F. Lang, IBM Austria, for the assistance given during the inception of the GIPSY system.

REFERENCES
1 T W SCOTT F LANG
Coding and structuring input data for the GIPSY system
Proceedings of the American Documentation Institute 1967
2 G DEL BIGIO
GIPSY program manual
International Atomic Energy Agency 1967

APPENDIX 1

GIPSY Coding Matrix
The matrix given in Figure 6 shows data types and subtypes, with their appropriate codes, running from top to bottom along the left side of the figure. All of the respective subclass codes form a sequential row running from left to right across the top of the figure. The sequence of data elements is that normally followed in documenting an entry for bibliographic use. In the GIPSY system, this sequence must be adhered to in entering data, but may be subsequently changed if desired. Similarly, the sequencing of classes and subclasses follows normal bibliographic practice in describing an entry.
Within the matrix, an x is shown where a given data element is normally entered for the particular subclass column. An (x) is shown for data elements which may be present or which are optional. In the fourth position of the B-code column, the language code, the word "yes" appears for those data elements which logically can have, and normally will have, a distinction according to the language in which they appear.

APPENDIX 2

An example of some features of GIPSY
Figures 7-15 show some printed outputs of the system. We have taken a sample bibliographic description and followed it throughout most of the GIPSY programs.

GIPSY1:
Figure 7 shows how the input unit is printed on the check list. Spacing is inserted by the program to improve legibility. The sample unit contains the following data elements:
A00: document identification. This card gives information such as the subject category code and the type of document code (R = report)
A10: personal authors
A16: corporate author
A20E: English title
A30: report number
A41: collation
960E: English keywords
Since the > sign, which is used in GIPSY to separate subelements within a data element and to terminate a data element, is a non-print character on the 1403 printer, it is replaced by the program by a record mark to simplify proofreading.
The input unit in Figure 7 contains some mistakes, indicated by GIPSY1 as follows:
PUNCTN: a full-stop, not a comma, must separate initials of names.
LAST UNFLAGGED SUBTYPE IS INCOMPLETE: the title (A20E) does not end with a > sign.
MISSNG: the report number (A30) is missing. Since the document is a report, the report number is a required data element.
DATA: B is not a valid element of the collation (A41). This should have been P.
Figure 8 shows the same input unit after the indicated errors have been corrected.

GIPSY7:
Figure 9 is an example of a catalog card. The filing entry in this case is the keyword (960E), which appears first. The order of data elements has been changed by GIPSY7. The card was printed by GIPSY3.

Figure 6-GIPSY coding matrix

Figure 7-GIPSY1 printout with error messages

Figure 8-GIPSY1 printout, corrected

Figure 9-GIPSY3: catalog cards

GIPSY2:
Figure 10 shows the document number reassignment and the dupcheck code generation features of GIPSY2. The input unit 51542 is assigned the document number 00044 and the dupcheck code PETRVJ660APAO.

Figure 10-GIPSY2: assignment of sequential reference numbers

GIPSY3:
Figure 11 shows the unit as it could be printed by GIPSY3 in a bibliography. Spacing, indentation and overprinting are controlled by the use of control cards. Note that the A00 data element has been prevented from printing by means of a control card.

Figure 11-GIPSY3: printing a reference, with indentation and overprinting

GIPSY4:
Figures 12 and 13 show the author and the keyword indexes, generated by GIPSY3 and printed by GIPSY4. Both indexes are generated and printed in one pass.

Figure 12-GIPSY4: author index

Figure 13-GIPSY4: keyword index

GIPSY9:
Figure 14 shows the same index as Figure 13 but with titles. Figure 15 is an example of a KWIC index from titles. Both these indexes were generated by GIPSY8 and printed by GIPSY9.

Figure 14-GIPSY9: KWOC index

Figure 15-GIPSY9: KWIC index

Figure 16 is a listing of the control cards used to produce the sample cases. This listing has been included so that the reader can get a "feeling" of how a job is described to GIPSY.

Figure 16-GIPSY control cards used to prepare Figures 7 through 15

The ISCO"R real-time industrial data
processing system
by W. M. LAMBERT
Control Data Corporation
VanderbijJpark. South Africa

and
W. R. RUFFELS
South African Iron an~ Steel Corporation (ISCOR)
VanderbijJpark,. South Africa

A real-time industrial data processing system
collects information, processes it, and responds to
the user in sufficient time to influence or control
the production processes, material distribution and
accounting records.
At ISCOR (South African Iron and Steel Corporation) a comprehensive real-time industrial data processing system is being installed. This system will utilize 'in-plant' communications terminals and automatic data logging units to gather and display data. It will respond to remote terminals and the mills process control computers to initiate and control production flow through the manufacturing process. Batch processing jobs will also be initiated through the real-time system to provide production reports and accounting records.
In view of the number of production centres
involved it was decided to divide the system into
two major application areas, namely customer order
control and production control. It was also decided
that the customer order control application would
be installed on a corporate wide basis, while the
production control application would be further
subdivided into works and mills areas.
This paper will discuss the approach to the development of the system and its installation at the ISCOR Works, Vanderbijlpark, South Africa.
The computer system to be employed consists of
two large Control Data 3300 Central Processors
located at the Vanderbijlpark Works. Each processor
will have its own operating system. One system will
be primarily dedicated to the real-time programs
while the other handles batch processing programs and also serves as a standby system for real-time programs in the event of failure of the real-time system. Furthermore, process control computers will also be installed to control the actual processing of material on the mills.

The real-time system will capture and process
data at the actual time the steel is passing through
the different phases of production while the batch
processing will be carried out according to a preset
schedule. All commercial applications and reports emanating from the data processed by the real-time system will be produced periodically by the batch processing system. The real-time processor
is also capable of processing batch jobs whenever its real-time load permits.
Since the real-time operations require a high
degree of processor, peripheral, and terminal reliability, each system will monitor the condition of
the other. Should the real-time processor not be
functioning properly the batch processor will abort
all its jobs and assume the real-time load. In order
to accomplish this interchangeability of work load, the system is configured so that each processor may
access all peripherals and terminals via alternate
data input/output routes.
The failure recovery procedures to be employed,
in the event of the 'in-plant' terminals or real-time
system peripherals not being in operation, have been
carefully laid out in order that the highest degree of
back up possible is obtained with a minimum of
manual operation or recording. For example, when a
message is transmitted to the central processor from
a mill process control computer or data logger, the
193

194

Spring Joint Computer Conference, 1968

central processor must acknowledge the receipt of the
message within five seconds. If the acknowledgment
is not received by the sending unit, the message is
transmitted again. Should the second message not be
acknowledged, the transmitting unit assumes there
has been a failure within the communications network
and will automatically dump the data into punched
paper tape for later entry via a tape reader in the
computer centre.
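The discipline reduces to a short acknowledge-retry-fallback loop (a Python sketch of the protocol as described; the transport and dump routines are placeholders, not part of the ISCOR software):

    ACK_TIMEOUT_SECONDS = 5
    MAX_ATTEMPTS = 2

    def deliver(message, send, wait_for_ack, dump_to_paper_tape):
        for attempt in range(MAX_ATTEMPTS):
            send(message)
            if wait_for_ack(timeout=ACK_TIMEOUT_SECONDS):
                return True          # central processor acknowledged
        # Two unacknowledged transmissions: assume a communications
        # failure and preserve the data for later entry via tape reader.
        dump_to_paper_tape(message)
        return False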
Remote 'in-plant' terminals are being designed and located in such a way that a terminal situated in close proximity can be used should a unit be out of operation. Maintenance of the 'in-plant' terminal units will generally follow the policy of replacing a unit with a spare rather than repairing it on site.
In order to create a smooth running system,
essential 'on-line' files, which cannot be re-established quickly and efficiently, are maintained in
duplicate on separate disc units each of which is
accessed through different channels and controllers.
'On-line' files are available to the real-time system
at all times, usually without requiring computer
operator action.
Overall the dual system approach is the one
followed, with at least two routes provided from the
processor for each type of input or output.
The order control system provides for the direct
entry of orders into the computer system from the
various sales offices via paper-tape transmission and
teleprinter units. The system receives the order and
accepts it commercially and industrially and confirms
the order status with the initiating sales office.
Acceptance of the order includes validating it
from the standpoint that all of the data required to
process the order are present and correct. Such items
as customer data and product specifications are also
vetted by the system. The order is then priced and
the customer credit arrangements are checked.
Assuming that the order is correct and complete and
that the product specifications are acceptable, the
order is forward loaded on a mathematical model of
the works and a forecast delivery period for the
material is computed. If the delivery period arrived
at is not within the limits specified by the customer, the sales organisation is notified and may adjust priorities if it so desires. Should the sales organisation change the priority, the order will be reloaded on the mathematical model. In the event of the order failing a specific test, a query is generated
and the appropriate section is notified automatically.
For example, if order data is missing or invalid,
the originating sales office is advised. When product
specifications are new or unusual, the works metal-

lurgist is notified. Should credit arrangements be
inadequate, the credit department is called. In order
that all computer-generated queries receive prompt
attention, and that no information is lost, each query
is followed up with a reminder if the response has not
been received within a set time limit.
Once the order has been accepted, and forward
loaded on the plant, it remains in the system until
its entire life cycle is complete. An order's life
begins with its entry by sales, matures when it has
been dispatched by the works, and dies when it has
been paid by the customer. After that, it becomes
another statistic for use in the operations research
or sales forecast system.
Periodically, blocks of forward loaded orders
are passed from the order control system to the
production control system. The production control
system then schedules these forward loaded orders
on the works production units, and reports their
progress through the units until the order is complete
and ready for dispatch.
Blocks of orders are transferred from the order
control system at intervals which will maintain a
sufficient pool of orders to provide an adequate
product mix for scheduling. The system schedules
the orders based on existing conditions within the
mill, and issues recommended schedules in advance
for approval by the production planner.
Preliminary schedules are prepared several days
in advance of their actual production date, but
the final scheduling is 'on-line' in the case where
the mill is controlled by a process control computer.
For example, an ingot to be rolled at the slab mill
is on a preliminary schedule several days in advance.
The actual detailed rolling instructions, however,
are issued only when the ingot is charged into a
soaking pit to be heated to rolling temperature,
approximately six hours in advance of the actual
rolling. The rolling instruction is in the form of
a punched card which is read by the mill control
computer when the mill operator is ready to roll the
ingot. The card is produced by the real-time system on
a remotely located card punch at the mill production
office.
The major real-time function of the computer
system is the collection and display of data from a
large number of 'in-plant' terminals. This enables
the system to do accurate production and management
reporting, and the scheduling of production.
To illustrate how the production control system
functions, let us consider the steel melting plant
and slab mill complexes. The steel melting plant
produces casts of steel. A cast is a ladle of about
two hundred tons of molten steel tapped from a

The ISCOR Real-Time Industrial Data Processing System 195
furnace. The molten steel is teemed from the ladle
into molds which form ingots. The molds are stripped
from the ingots after they have hardened, and the
ingots are rolled and sheared into slabs at the slab
mill. An ingot is a block of steel weighing between
ten and twenty-five tons, depending upon the mold
size. A slab is a smaller flat block of steel, usually
weighing around five tons. Its size and weight depend
upon the requirements of the rolling mill which will
use it.
Starting with the forward loaded orders taken from the order control system, the production control system computes the number and specifications of the slabs required to satisfy the orders over a production cycle. Once it has determined the slab requirement, it computes the number and specifications of the ingots from which the slabs will be rolled. It then computes the number and specifications of the casts of steel that must be made at the steel melting plant to produce the required ingots. As each computation is done, an 'on-line' file of slab, ingot and cast requirements is built up. When these computations are
complete, the computer prints preliminary schedules
for the slab mill and a steel order for the steel melting plant. Once approved, the schedules and steel
order are issued to the production units by the
production planner.
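The cascade can be caricatured as follows (a deliberately simplified Python sketch using single tonnage figures from the text; the real computation works from order specifications, not one number):

    import math

    SLAB_TONS = 5      # typical slab weight cited in the text
    INGOT_TONS = 20    # within the ten to twenty-five ton range
    CAST_TONS = 200    # one ladle of molten steel

    def requirements(order_tons):
        slabs = math.ceil(order_tons / SLAB_TONS)
        ingots = math.ceil(slabs * SLAB_TONS / INGOT_TONS)
        casts = math.ceil(ingots * INGOT_TONS / CAST_TONS)
        return slabs, ingots, casts

    print(requirements(730))   # -> (146, 37, 4)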
The steel order specifies the cast and ingots to
be produced, but does not stipulate the sequence
in which they are to be manufactured. Thus, the
actual slab mill schedule and rolling instruction
cannot be prepared until the ingots have actually
arrived at the mill.
The progress and specifications of the cast and
ingots are reported to the computer system by
personnel using 'in-plant' terminals at the steel
melting plant, the ingot weigh bridges and the soaking pits. The terminals are also used to display
information and instructions regarding the steel
flowing through the production facility.
Once the ingots reach the soaking pit, where they are heated to proper rolling temperature, their arrival is reported through an 'in-plant' terminal. The system then refers to its 'on-line' files to provide the necessary rolling instructions. The specifications of each ingot are checked, and instructions are generated which will use the ingot for the highest priority order having the same specifications.
When the ingot is drawn from the soaking pit
and enters the mill, the rolling instruction card,
which was prepared on an 'in-plant' terminal card
punch, is read into the mill control computer by
the mill operator. The mill control computer then
controls the process of rolling the ingot into a slab,
and the shearing of the rolled slab into smaller

slabs. At the same time, the mill control computer
logs the particulars about each slab rolled, and
automatically transmits the data to the real-time
computer system. After each slab has been rolled,
sheared, and moved to the slab stocking yard, information about its condition, treatment, and location is
reported to the real-time computer through 'in-plant' terminals. The system updates 'on-line' disc
files so that up-to-date records of material stocks,
production status, and order progress are available
at all times.
These 'on-line' disc files are used to satisfy inquiries from remote terminals, and as the basis for scheduling and production reporting.
Periodically, the 'on-line' disc files in the real-time computer are dumped onto magnetic tape. The
magnetic tape is transferred to the batch processor,
and production and management reports and accounting records are prepared daily, weekly, monthly,
or as required. The periodic dumping of the data
from disc to tape also provides an added measure
of back-up, should it become necessary to reconstruct
an 'on-line' file.
The design of the data collection and display network is an area which involves a tremendous amount
of preplanning. Since the average 'in-plant' terminal operator will not be highly trained or educated, it was necessary to develop techniques and methods
which would enable the system to capture the data
with a minimum of operator action.
To do this, communication methods are employed
that involve basically three types of remote terminals:
standard keyboard send-receive teleprinters; numerical data entry devices, which are standard keyboard send-receive teleprinters with the addition of a modified push button keyboard; and cathode ray tube display units.
To use the standard keyboard send-receive teleprinters, the operator contacts the computer by transmitting a call code. This action sets up the communications and program linkages, and starts a 'computer to operator' conversation that is controlled by the computer. The computer asks a series of questions, and the operator answers each question sequentially. The first question must be answered before the second is asked. Each response to a computer generated question is validated by the system to make certain the data received are reasonable before the conversation is continued. Normally, an 'in-plant' terminal will be utilizing only one communications program. However, the ability to conduct several simultaneous conversations with a terminal has been incorporated into the system.
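The question-at-a-time pattern amounts to a loop of ask, validate, and re-ask (a Python sketch; the prompts and validity checks are invented for illustration):

    def conversation(ask, tell, questions):
        # ask(prompt) returns the operator's reply; tell(text) writes to
        # the terminal. Each answer is validated before the next question.
        answers = []
        for prompt, is_valid in questions:
            reply = ask(prompt)
            while not is_valid(reply):
                tell('INVALID ENTRY - PLEASE RE-ENTER')
                reply = ask(prompt)
            answers.append(reply)
        return answers

    questions = [
        ('CAST NUMBER', lambda r: r.isdigit() and len(r) == 5),
        ('INGOT WEIGHT, TONS', lambda r: r.isdigit() and 10 <= int(r) <= 25),
    ]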
When using the numerical data entry devices, the
terminal operator is provided with a preset message


format. He sets up his entire transmission on the push button keyboard. The terminal operator then sends the message to the computer by depressing a 'transmit' button. This type of unit reduces the number of messages required to collect the data, and makes it possible for the terminal operator to check the data before he sends them.
The cathode ray tube display units are used for
both data collection and as display terminals. To collect data, the technique of displaying a form on the unit for the operator to fill in is most commonly used.
To obtain a display of data, the operator calls for a
specific display by sending a call code. The computer
automatically presents the latest data on the files
regarding the request.
Normally, all conversations between the computer and the 'in-plant' terminals are on an immediate response basis. On the other hand, provision has been made for terminal operators to request information which they will collect at a later time on the same terminal, or on a different 'in-plant' terminal. Provision has also been made for the computer to advise the terminal operator that there will be a delay in displaying the data requested. At all of the 'in-plant' terminals, the programs have been designed so that

the operator can converse with the computer in either
Afrikaans or English.
When an order has been produced and is ready to
be dispatched, the real-time processor is advised
through the data collection terminals, and the order
control system is brought back into action.
The order control system will produce, on remotely
located 'in-plant' terminals at the dispatch area, the
necessary documents for shipping the product to the
customer. Such items as consignment notes, gate
release notes, loading particulars, and special handling instructions are printed on remote units.
The order control system also prepares the invoices
and customer accounts on a batch processing basis.
In order to provide an adequate pool of data so
that queries about orders can be answered promptly, all of the data relative to an order are retained in the 'on-line' files for thirty days after the order has been
dispatched. Accounting and statistical data are retained in off-line files indefinitely.
The system outlined has been installed and tested.
All of the necessary software has been developed.
The order control system and the first portions of
production control system are now operational.

Martin Orlando reporting environment
by MICHAEL J. McLAURIN and WALTER A. TRAISTER
Martin Marietta Corporation
Orlando, Florida

Design objectives
The system was designed "to alleviate the problem
of "one time" or special request reports. A method
was needed to quickly produce reports on request
without having to write, compile and debug programs
in order to produce the reports. The original design
objectives were to permit any individual who understood how to prepare the input parameters the capability of producing any report within thirty minutes
preparation time. MORE eliminates the necessity
of having to write and maintain great numbers of
special purpose report programs. It eliminates the
necessity for special passes of master, sort parameter
and activity files in order to produce the desired
report. This is possible since the MORE system is
a series of external subroutines which may be called
by any existing program which is already accessing
or passing a specific file. This system affords the
user the advantage of using the built-in general
print program or providing his own format program
to the system. It has been determined that this system satisfies 85 to 90 percent of Martin Orlando
business report requirements.

System modules
The following system modules can be followed on Figure 1. The system consists of the following programs; a detailed explanation of the function of each can be found under MECHANICS.
1. Input Parameter Audit Program - This program's primary function is to sort the input parameter cards and audit every field of the parameter cards.
2. Communication Program - This may be any program which calls the data selection program, hereafter referred to as the "PICKER." The communication program may be any existing file maintenance, audit, or any program which passes files containing potential report data.
3. Data Selection or Picker Program - This is the program which does the actual data selection of the information to compose the report. Data is selected based upon satisfying the requirements of the input parameter cards.
4. Executive Monitor Program - This program sorts the selected or picked report data into report sequence and controls the programs required to format the report data.
5. General Print Program - This program formats the picked data records based on the print parameter cards which were input to the system. This program may be substituted for by any user written program if desired.
6. Systems Interface Program - This program will take any COBOL source program and add to it all linkages and coding necessary to utilize the MORE system. The output from this program is an updated COBOL program. The user also has the option of simultaneously compiling the program and placing the executable load module on the program library.

Mechanics
All references to programs in this section may be followed on Figure 1.
Parameter card audit program

This program is run before the communications
program which "calls" the Picker program. The
audit program sorts the parameter cards on request
number (report number) and sequence number within
the report. All fields of the parameter cards are edited
and all appropriate error messages are issued (see
Exhibit E for detailed error messages). The exception
report is produced on line so that the parameter cards
may be corrected and the job can be rerun immediately. The edited parameter cards are written onto a
temporary disk file along with a good or bad indicator.
During the pick phase, only those requests which are
error-free will be honored by the Picker. The Audit
program currently restricts the system to a maximum

of thirty-six reports and 600 pick and print parameter cards.

Figure 1 - Systems flow

Picker program
When the picker is called the first time, it reads the temporary disk file created by the audit program and builds entries into the picker table (see Figure 2 to follow the processing). Only those requests which were error-free will be built into the table. Only pick parameter information is built into the table. If print parameter cards are encountered, the picker immediately builds a format record and releases it to the unsorted stacked report tape. Once the table is built, the picker examines the first updated master record (if the calling program is a file maintenance program) sitting in the user's output area. The picker compares every record passed through the user's output area against each report request in the pick table. Any record which gets a hit against a report request is built according to the build field specifications on the pick parameter cards. Figure 2 shows how the communication program and the picker interface.

Figure 2 - Communication program and picker interface

The following coding is generated in the user's program by the Systems Interface Program; it accomplishes all linkage between the user's program and the picker:

    READ-MASTER.
        READ or MOVE master to be picked into PICK-AREA and when
        updating is completed GO TO CALL-1.
    CALL-1.
        MOVE 0223 TO ENTRY-COUNT OF PICKER-RECORD.
        ENTER LINKAGE.
        CALL "PICKERPK" USING REQUEST-CONTROL, REQUEST-NO-STORE,
            ELEMENT-NUMBER, ARGUMENT, ENTER-N-WAY, PICK-AREA-LENGTH,
            PICK-AREA, FORMAT-RECORD, PICKER-RECORD,
            FORMAT-HF-RECORD, FORMAT-SW.
        ENTER COBOL.
        IF ENTER-N-WAY IS EQUAL TO 4 WRITE PICKER-RECORD
            THEN GO TO CALL-1.
        IF ENTER-N-WAY IS EQUAL TO 5 AND FORMAT-SW IS EQUAL TO 1
            WRITE FORMAT-RECORD THEN GO TO CALL-1.
        IF ENTER-N-WAY IS EQUAL TO 5 WRITE FORMAT-HF-RECORD
            THEN GO TO CALL-1.
        IF ENTER-N-WAY IS GREATER THAN ZERO PERFORM FIND-ELEMENT
            THRU ANY-EXIT THEN GO TO CALL-1, ELSE GO TO READ-MASTER.

    FIND-ELEMENT.
        GO TO A1, A2, A3, etc. DEPENDING ON ELEMENT-NUMBER,
            ELSE GO TO ANY-EXIT.
    A1. MOVE F1 TO ARGUMENT THEN GO TO ANY-EXIT.
    A2. MOVE F2 TO ARGUMENT THEN GO TO ANY-EXIT.
    A3. MOVE F3 TO ARGUMENT THEN GO TO ANY-EXIT.

    02  F1 REDEFINES GREGORIAN-DATE.
        03  F2    PICTURE 99.
        03  F3    PICTURE 99.
        03  F4    PICTURE 99.
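In modern terms the pick step amounts to the following (a Python sketch of the logic only; the request structure is a simplified reading of the Exhibit D parameter-card definitions, not the actual table layout):

    def pick(record, requests):
        picked = []
        for req in requests:
            tests = [low <= record.get(field, '') <= high
                     for field, low, high in req['conditions']]
            hit = all(tests) if req['logic'] == 'AND' else any(tests)
            if hit:
                # With no build fields, the entire record is picked.
                fields = req['build'] or list(record)
                picked.append((req['number'],
                               {name: record[name] for name in fields}))
        return picked

    request = {'number': '01', 'logic': 'AND', 'build': ['PART-NO', 'QTY'],
               'conditions': [('DEPT', '0200', '0299')]}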

Sample data descriptions generated for the communication program:

        ...DY-SERIAL-MOVE   PICTURE X OCCURS 960 TIMES
                            DEPENDING ON ENTRY-COUNT OF STKMSTR-RECORD.
    01  PICKER-RECORD.
        02  FILLER          PICTURE X(36).
        02  ENTRY-COUNT     PICTURE S9999.
        02  BODY            PICTURE X OCCURS 960 TIMES
                            DEPENDING ON ENTRY-COUNT OF PICKER-RECORD.
    01  FORMAT-RECORD.
        02  FILLER          PICTURE X(403).
    01  FORMAT-HF-RECORD.
        02  FILLER          PICTURE X(493).

EXHIBIT C - 360/50 PICK-N-PRINT PARAMETERS (input transmittal form)

EXHIBIT D
TRANSMITTAL DEFINITIONS

A. GENERAL
1. cc1. CARD IDENTIFICATION. < denotes a monitor-tag card; * denotes a pick parameter card; $ denotes a print parameter card; > denotes a heading or footing print parameter card.
2. cc2-3. REQUEST NUMBER. A two digit sequential number for each request submitted.
3. cc4-6. SEQUENCE NUMBER. A three digit sequential number which is continuous throughout all requests.

B. MONITOR TAG CARD
1. cc7-9. FORM NUMBER. A three digit number assigned to the form for printing purposes.
2. cc10-17. PROGRAM NUMBER. An eight digit PROGRAM-ID that will process or format this data after sorting.
3. cc18-19. REQUEST NUMBER. A two digit sequential number for each request submitted.
4. cc20-29. ARD NUMBER. A six digit number which identifies the AWO identification and a three digit number which identifies the report (AWO ID - report id).
5. cc30-31. CARRIAGE TAPE NO. A two digit number which identifies the carriage tape to be used to print the formatted data which was output by the format program.
6. cc32-33. BLOCKING FACTOR. A two digit number indicating the number of logical records contained within each physical tape record written out on the formatted tape.
7. cc34. DECOLLATING CODE. A one digit code which indicates the decollating procedure and bursting procedure.
8. cc35. PRINT TRAIN DESIGNATOR. A one digit code which indicates the type of print train to be used.
9. cc36-39. FORM QUANTITY. A two digit number which indicates the number of boxes of cards or paper required by the punch or print program. (May be blank.)
10. cc40-42. SORT TAG LENGTH. The number of characters of the output record which will constitute the sort field. This field must be 223, which means that the minimum record length of all records passed to the executive monitor sort will be 263 bytes.
11. cc43. HALT CODE.
A non-blank character in this field indicates to the pick program that, if all requests of a pick were bad, the picker should 'ABEND' the user's program. It is only necessary to include this code in the first Monitor Tag card of a pick. This code must be included for non-file maintenance programs.

C. PICK PARAMETER CARD
1. cc1. CARD IDENTIFICATION. < denotes that this is a monitor-tag card; * denotes that this is a pick parameter card; $ denotes that this is a print parameter card; > denotes that this is a heading or footing print parameter card.
2. cc2-3. REQUEST NUMBER. A two digit sequential number for each request submitted.
3. cc4-6. SEQUENCE NUMBER. A three digit sequential number for each parameter card within a request.
4. cc7-21. PICK FIELD NAME. A fifteen digit field which is used by the customer to fill in the name of the field upon which he wants to pick.
5. cc22-24. PICK FIELD NO. A three digit field which is used by Management Information Systems to fill in the 'F' number (the corresponding field number in the program) which corresponds to the field name supplied by the customer.
6. cc25-39. FROM LIMIT. Indicates the lower limit against which the argument will be compared (left justified).
7. cc40-54. TO LIMIT. Indicates the upper limit against which the argument will be compared (left justified).
8. cc55. EXCLUSIVE PICK CODE. If cc55 contains a non-blank character, everything will be picked except the range represented in the FROM and TO LIMIT fields.
9. cc56. RANGE CONTROL. If the pick field number of the parameter card represents a range, the first parameter card of the range describes the upper limit and cc56 contains an X; the second parameter card of the range describes the lower limit and cc56 contains an X.
10. cc57-58. LOGIC CODE. If "AND" logic is to be applied, column 57 will contain an X or any non-blank character for this specific parameter card. If "OR" logic is to be applied, column 58 will contain an X or any non-blank character for this specific parameter card. Card columns 57 and 58 are mutually exclusive; for any specific pick parameter card only one of these columns may contain a non-blank character.
11. cc59-73. BUILD FIELD NAME. The name of the field which the customer specifies, indicating that this field is to be represented in the output record.
12. cc74-76. BUILD FIELD NUMBER. A three digit number representing the relative location of an element in the pick area from which the body of the output record is to be built. The absence of any build fields indicates that the entire record is to be picked.
13. cc77-80. Not used.

D. PRINT PARAMETER FORMAT CARD
a. DETAIL PRINT DATA: The following columns will be used to describe the individual fields of a detail print line:
1. cc53-54. DETAIL SLEW VALUE. A two digit numeric field which indicates the number of lines to be slewed before printing. This field must be specified for each field of the line. If group indicating, $-'CH' type control must show a positive slew and $-'CF' type control must show 00 slew.
2. cc59-73. DETAIL FIELD EDIT MASK. A maximum of a fifteen position edit mask which will describe the print format for the preceding detail field to be picked. This mask is right justified.
3. cc77-79. DETAIL FIELD PRINT POSITION. A three digit field which reflects the rightmost horizontal print position where the data are to be printed. If group indicating, $-'CF' type control lowest level must contain '999' and higher levels must contain '000'.
4. cc80. LEVEL INDICATOR.
A one position field which indicates that the build field number is to be used for control purposes. Levels 1 thru 5 indicate minor thru major respectively, with the FINAL level being the highest level specified. The maximum number of levels is six. This column will be blank if not applicable.

b. HEADING AND FOOTING LITERALS: The following columns will be used to describe literal information of the user's heading and footing lines. These fields are shown at the bottom of the input transmittal (Exhibit B).
1. cc7-8. TYPE CONTROL FIELD. A two digit field which may contain any one of the following types: CH Control Heading; OH Overflow Heading; OF Overflow Footing; CF Control Footing. (a) Control Heading and Control Footing literals are associated with their respective control level indicator (cc80) of the detail format card. (b) Overflow Heading and Overflow Footing literals are associated with their respective 'HDR-NO' (cc9), indicating at overflow time the relative sequence in which they are to be printed. There is a maximum of six Overflow Headings and Footings for each report.
2. cc9. HEADER NUMBER INDICATOR. A one digit numeric field which is used to associate fields to be 'floated' into the Overflow Heading and Footing lines from the detail format cards. This field is used only with types OH and OF. A maximum of six Overflow Headings and Overflow Footings may be specified. A 1 represents the first and a 6 represents the last.
3. cc10-11. HEADING OR FOOTING SLEW. A two digit numeric field which indicates the number of vertical lines to be slewed before this heading or footing line is printed. This value should be specified only in the first of each set of two cards.
4. cc12. Not used.
5. cc13. CARRIAGE CONTROL. A one digit field which indicates whether or not this heading or footing line is to be slewed to the top of the next page. The only acceptable values for this field are 1 or blank; a 1 indicates slew to top of page before printing. This field should be used only in the first of each set of two cards.
6. cc14-79. HEADING OR FOOTING PRINT IMAGE. A total of 132 characters which contain the heading or footing literal information to be printed. If group indicating and final totals are required, a literal line with 00 in the slew field must be specified.
7. cc80. CONTROL LEVEL INDICATOR. A one position field which associates a specific Control Heading or Footing with the respective control level. A maximum of six Control Headings and six Control Footings may be specified. A 1 represents the first and a 6 represents the last.

c. FLOATED DATA FOR HEADINGS AND FOOTINGS:
1. cc7-21. HEADING OR FOOTING MASK. A maximum of 15 digits which represents the edit mask for the data to be floated into the heading or footing literal line. This mask must be right justified.
2. cc22-24. FIELD NUMBER. A three digit field which serves one of the following functions: (a) specifies the field number to be totaled for Control Footings or inserted for Control Headings; (b) *** indicates that this is the mask for the page counter; (c) NN$ indicates that this is the mask for the date, where NN is the number of the date to be obtained from the GETDATE subroutine (any one of the eleven dates of GETDATE may be specified); (d) 'N' specifies the literal to be floated into a heading or the increment to be added to a counter for every record to be printed.
3. cc25. HEADER NUMBER. A one digit field which associates fields to be floated into headings and footings with their respective Overflow Heading or Overflow Footing line. This field may contain a number from 1 to 6. A 1 represents the highest level OH or OF and a 6 represents the lowest.
4. cc26-28. HEADING OR FOOTING PRINT POSITIONS.
A three digit field which indicates the right justified print positions of the heading or footing line where the floated data are to be printed. 5. cc29-30. TYPE CONTROL. A two digit field which indicates to which Heading or Footing type (i.e., OH or OF, CH or CF) the floated information applies. EXHIBITE PARAMETER CARD DIAGNOSTICS c. FLOATED DATA FOR HEADINGS AND FOOTINGS: 1. cc7-21. HEADING OR FOOTING MASK. A maximum of 15 digits which represents the edit mask for the data to be floated into the heading or footing literal line. This mask must be right justified. justified. 207 A. GENERAL MORE THAN 600 PARAMETER CARDS REMAINDER DELETED PARAMETER CARDS ARE OUT OF SEQUENCE 208 Spring Joint Computer Conference, 1968 MORE THAN 36 REQUEST -REMAINDER DELETED B. MONITOR TAG CARD ILLEGAL SUPPRESSION ZERO SUPPRESSION CANNOT BE SPLIT ZERO SUPPRESSION ILLEGAL FOLLOWING DECIMAL POINT MONITOR TAG CARD MUST BE ist OF REQUEST V, g, S, $, *, Z, ), ( ILLEGAL FOLLOWING DECIMAL POINT FORM NUMBER IS BLANK PROGRAM NUMBER IS BLANK INVALID REQUEST NUMBER ARD NUMBER IS BLANK BLOCKING FACTOR IS INVALID SORT LENGTH FIELD IS BLANK WARNING HALT CODE IS BLANK DECOLLATING-BURST CODE IS BLANK CARRiAGE TAPE NUMBER is BLANK UPPER-LOWER CASE PRINT CODE IS BLANK B FOLLOWING P IS ILLEGAL CAN HAVE ONLY 1 DECIMAL POINT INDICATOR PER PICTURE P's CANNOT BE SPLIT ONLY ONE SIGN INDICATOR LEGAL PER PICTURE (MAY NOT FOLLOW A SIGN $ CANNOT FOLLOW DECIMAL POINT $ CANNOT BE SPLIT P CANNOT IMMEDIATELY FOLLOW A $ * ILLEGAL FOLLOWING DECIMAL POINT *(s CANNOT BE SPLIT (ILLEGAL AFTER X OR A C. PICK PARAMETER CARD FROM LIMIT IS BLANK TO LIMIT IS BLANK LOGIC CODES ARE IN ERROR FROM LIMIT OF FIRST RANGE CARD IS BLANK TO LIMIT OF SECOND RANGE CARD IS BLANK SUPPRESSION XORA ILLEGAL FOLLOWING MUST HAVE 2 RANGE CARDS FOR EACH RANGE ILLEGAL CHARACTER IN PICTURE BLANKS WITHIN PARENS IS ILLEGAL ONLY NUMERIC IS LEGAL WITHIN PARENS FROM LIMIT IS GREATER THAN TO LIMIT NOTHING VALID SPECIFIED FOR PICTURE D. DETAIL PRINT PARAMETER CARD WHEN PRINTING ALTERNATE PICK AND PRINT PARAMETER CARDS WARNING-BOTH PRINT POS OF FORMAT CARD ARE BLANK WARNING-BOTH MASKS OF CARD ARE BLANK FORMAT MUST SPECIFY DE CC IN 1st FORMAT CARD OF REQUEST SUPPRESSION ILLEGAL AFTER P PARENS ILLEGAL FOLLOWING V E. HEADING AND FOOTING PRINT PARAMETER CARD BOTH HEADER NUMBER AND CTL LVL CANNOT BE BLANK NO SLEW VALUE SPECIFIED FOR THIS OH,OF,CH,CF NO TYPE CONTROL SPECIFIED FOR THIS OH,OF,CH,CF OH OR OF MUST HAVE HEADER NUMBER WARNING PRINT IMAGE IS BLANK Simulatio"n applications in computer center management by THOMAS F. McHUGH jR. International Business Machines Corporation Waltnam, Massachusetts and DR. ELLIS L. SCOTf University of Georgia Athens,Georgia Significant developments occurred during the 1960's in the use of computer-based simulation models for management analysis. Data processing personnel contributed extensively to these developments. However, the role characteristically attributed to them emphasized programming and machine operations. Relatively few models have been concerned with managePlent aspects of computer center operations. * A simulation study recently completed at the U niversity of Georgia (U Ga) Computer Center suggests that models for computer center management purposes are feasible and will have extensive applications. The UGa model was constructed in the interest of simplifying the complex decision making tasks involved in managing the operations of the computer center. 
Simulation applications in computer center management

by THOMAS F. McHUGH, JR.
International Business Machines Corporation
Waltham, Massachusetts
and
DR. ELLIS L. SCOTT
University of Georgia
Athens, Georgia

Significant developments occurred during the 1960's in the use of computer-based simulation models for management analysis. Data processing personnel contributed extensively to these developments. However, the role characteristically attributed to them emphasized programming and machine operations. Relatively few models have been concerned with management aspects of computer center operations.* A simulation study recently completed at the University of Georgia (UGa) Computer Center suggests that models for computer center management purposes are feasible and will have extensive applications. The UGa model was constructed in the interest of simplifying the complex decision making tasks involved in managing the operations of the computer center.

*For a recent example see "SCERT: A Computer Evaluation Tool," by Donald J. Herman, in Datamation, February 1967.

The major processing configuration includes four IBM CPU's: a 7094 operated in a mag tape I/O mode, two 1401's used both for I/O support to the 7094 and for central processing, and a recently installed 360/65. The diversity of input and the elaborate operating system necessary to support the hardware complement combine to present the management with many examples of hardware interface, routing, scheduling, and queue management problems typical of most large- and medium-scale facilities. This suggests that the approach taken in the UGa study may have value for the many DP managers faced with similar decision making and evaluation tasks. The objective of the UGa study was to develop a simulation model which could be applied on an ongoing basis by computer center management in decisions involving machine allocation, machine capability projections, system optimization and similar classes of management problems. In brief, the methodology involved the performance of a system analysis and the construction of a simulation model to reflect the essential behavioral characteristics of the system. Once the model was completed and its validity determined, the model could then be utilized to compare alternative configurations and operating procedures without direct experimentation.

The UGa operating system

At present, input is batched for all three types of machines.* The user may have access to the computer he desires by simply submitting his input through a formalized data collection procedure.

*The 360/65 is scheduled to be upgraded to a model 67 to provide time-sharing capability.

7094 mode

7094 input in punch card form must be routed to one of the 1401's for card-to-tape processing. Input tapes are then introduced into the 7094 job queue to be processed on the basis of a priority system. Priorities in this and all other sub-systems are based on two criteria: user convenience and production efficiency. That is, jobs are rated according to the need of the user and the effect of job execution on system performance at that point in time. Output tapes generated by the 7094 are routed back to one of the 1401's for tape-to-list processing.

1401 CPU mode

In addition to serving as I/O support devices for the 7094, the 1401's are used to a limited extent for central processing and utility jobs. One of the machines has a 16k character memory with four tape drives; the other has an 8k memory with two tape drives. Most central processing tasks are performed by the 16k machine, with the residual (i.e., tape-to-card, data-tape building, etc.) going to the 8k.

360/65 mode

All processing functions, including I/O, are performed on-line in the 360 system. Therefore, all input is routed directly to the computer upon entry into the system.

Development of the simulation model

Development of the simulation model occurred in three steps: collection of the data on the operations of the UGa system; construction of the model; and execution of the model using data from step one as input. A sample period of one week was selected from operations during the first six months of 1967 and designated for both data collection and simulation purposes.
Utilization records indicated that the week of May 7-13 met the selection criteria: freedom from excessive amounts of unusual kinds of input, and input figures for the sample period comparable with current operations. Data collection involved identification of the major variables of interest in the study and compilation of data on these variables.

Major variables

For the purposes of the study, average turnaround time was determined to be the most suitable measure of system performance. Average turnaround time was measured in minutes between entry of input (job-entry time) into the machine room and exit of the completed job from the machine room. The secondary variable, processing time, was defined in minutes for the execution of the job on the CPU, including I/O processing for the 7094 sub-system.

Data acquisition

The accounting routine records which are maintained on-line on both the 7094 and 360 CPU's were the source of process time data for those two machines. 1401 process time was computed from information contained in a utilization log which is maintained for accounting purposes. Data on other operations, such as card-to-tape and tape-to-list, were compiled by stop-watch observations over a sub-sample during the sample period.

Model Construction

The model was conceptualized as the set of mathematical, functional and logical relationships which express the essential behavioral characteristics of the UGa computer system. Following the definition of these parameters the model was constructed using General Purpose Simulation System/360 (GPSS) as the programming vehicle. GPSS is well suited for simulation applications of this type.* Its block coding format is akin to the standard flow-charting techniques familiar to most system analysts and data processing personnel; hence, the utilization of GPSS does not necessitate extensive machine programming knowledge. In addition, GPSS features several approximation devices, such as the random number generator and user-defined mathematical functions, which simplify simulation of complex systems.

*See "General Purpose Simulation System/360: Introductory Users Manual." White Plains: International Business Machines Corporation, 1967.

In the UGa model, a mathematical function was employed to enter the simulated jobs (transactions) into the model at the same rate and time as the real jobs entered the machine room during the sample period. The dependent variable in this function was the number of jobs which entered the system during each of the 168 hours in the sample week; the independent variable was the number of simulated hours which had elapsed since the start of the simulation run. Evaluation of the function at the close of each simulated hour caused the same number of transactions to be introduced into the model as jobs had entered the real system at that hour during the sample period.

Process time for each of the major system operations was also defined by means of a mathematical function. The independent variable in this function was based on a graph showing the cumulative percentage of jobs requiring various amounts of processing time. A cumulative distribution of processing time was constructed for each of the CPU's. Dependent variable values were determined by the random number generator; values thus generated were interpreted to be percentages. The distribution of transactions among the simulated CPU's followed the same pattern as the distribution of the inputs in the real system.
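The same two devices are easy to reproduce outside GPSS. The sketch below, in Python rather than GPSS, mimics the approach described above: an hourly arrival table drives job creation, and process times are drawn by feeding uniform random percentages through an empirical cumulative distribution. The arrival counts and distribution points are invented placeholders, not the UGa data.

    import random, bisect

    # Hourly arrival counts for the sample week would have 168 entries;
    # three hours of made-up data stand in for them here.
    arrivals_per_hour = [12, 7, 3]

    # Empirical cumulative distribution of CPU process time:
    # (cumulative percentage of jobs, process time in minutes).
    cum_pct = [10, 50, 80, 95, 100]
    minutes = [0.5, 2.0, 5.0, 12.0, 30.0]

    def sample_process_time():
        """Draw a process time by inverse lookup in the cumulative curve."""
        r = random.uniform(0, 100)           # GPSS-style random percentage
        i = bisect.bisect_left(cum_pct, r)   # first point at or above r
        return minutes[min(i, len(minutes) - 1)]

    for hour, n_jobs in enumerate(arrivals_per_hour):
        for _ in range(n_jobs):
            print(f"hour {hour}: job, process time {sample_process_time()} min")

GPSS would interpolate between the tabulated points of such a function; for brevity the sketch simply rounds up to the next tabulated value.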
Model validation

The simulation model was executed ten times (8,330 simulated jobs) and the results of each execution were averaged. This procedure was considered necessary to reduce the probability of random effects biasing the results. Results from the simulation runs were then compared with empirically derived data for the sample period of operations. Two methods of comparison were employed. The first method was comparison of the overall average turnaround time generated by the simulation runs with real system data obtained from accounting program outputs and from utilization records maintained by the operations staff in the machine room. Mean turnaround time for the sample period was 283 minutes; for the model it was 275 minutes, a difference of 8 minutes, or 3%. The second method of comparison was a chi-squared test of the individual process-time observations for both the model CPU's and the real system processors. No significant difference was established.
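A minimal version of that second check is sketched below: bin the process-time observations from the model and from the real system into the same intervals, then compare the counts with a chi-squared statistic. The counts are fabricated for illustration, and scipy's goodness-of-fit routine stands in for whatever exact test the authors ran, which the paper does not specify.

    from scipy.stats import chisquare

    # Process-time observations binned into common intervals (invented counts).
    model_counts = [120, 340, 260, 80, 33]   # simulated CPU
    real_counts  = [110, 355, 250, 85, 33]   # real processor

    # Scale the real counts so both samples have the same total, since
    # chisquare() expects expected frequencies summing to the observed.
    scale = sum(model_counts) / sum(real_counts)
    expected = [c * scale for c in real_counts]

    stat, p = chisquare(f_obs=model_counts, f_exp=expected)
    print(f"chi-squared = {stat:.2f}, p = {p:.3f}")  # large p: no significant difference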
Model applications

Following validation, the model was applied to three classes of management problems: (1) machine capability projections; (2) machine allocation; and (3) operating system optimization. Questions were formulated with the immediate interests of the UGa Computer Center management in mind.

Question One: At what level of 7094 utilization do the 1401 I/O facilities become saturated and incapable of meeting the support demands of the 7094? Transaction (simulated job) inputs into the 1401's were increased until the 7094 throughput stabilized and additional 1401 entries went into queue. Management had heretofore assumed that the 1401's would saturate before the 7094, requiring upgrading of support equipment. In the simulation runs, however, the 7094 showed an average utilization of 96%, while the average utilization for the 1401-16k was only 79%, and 77% for the 8k machine. An additional 150 7094 transactions were entered, and although 1401 utilization rose to 81% and 79%, the 7094 utilization average remained stable. In effect, this information eliminated the necessity for short term planning for the upgrade of 7094 support facilities.

Question Two: At the saturation level of operations, how much improvement in turnaround time for the average user could be expected if all job types currently classified high priority were run on the 360 rather than the 7094? Management assumed that as 7094 utilization increased to the saturation point it would be forced to reprogram and reallocate certain kinds of input. In the interest of establishing a priority hierarchy for reprogramming, and of obtaining maximum 7094 operating efficiency both during and after conversion, management wished to ascertain the effect of transferring 7094 jobs currently rated high priority to the 360 system. Given the saturation data in the previous run, the model was modified to re-route specially classified jobs. With the simulation of the 7094 sub-system in saturation, the re-routing to the 360 of the 7% of jobs classified as high priority resulted in a 20% reduction of the mean turnaround time of the 7094. Further manipulation of the job generation data showed that approximately 30% more transactions could be added before the 7094 returned to a state of saturation. In the new mode with special jobs excluded, turnaround time under saturation for the 7094 was 12% below that of the original model. Therefore, it was concluded that the proposed change in machine allocation would increase throughput on the 7094 and reduce turnaround time.

Question Three: What would be the effect of eliminating third shift operations at the current level of job input? Although anticipating increased utilization in the future, management was nevertheless interested in determining the effect on turnaround time of eliminating the third shift at the current level of utilization. The model was modified to simulate an inactive third shift. Turnaround time on the basis of a 16-hour processing day increased from 275 to 523 minutes. This was not surprising, since eliminating the third shift at the current level pushed utilization to near saturation. Furthermore, in the three-shift operation the third shift had a low input rate and provided slack time for backlog processing. On the basis of both service time and production efficiency criteria, elimination of the third shift would be detrimental to system performance: turnaround time would increase by nearly 100%, and relatively minor system perturbations, such as unscheduled downtime, would almost certainly result in serious backlogging.

CONCLUSIONS

The simulation approach represented by the UGa study demonstrates the general applicability of the concept of computer simulation in DP installations. Each facility has its own singular operating characteristics and hardware configuration, and must tailor the idea to its own requirements. Nevertheless, DP installations generally are in a favorable position to profit from simulation applications. The technology to facilitate these applications is at hand. Further, recent advances in simulation programming systems, such as GPSS, make the employment of simulation techniques available without elaborate machine programming efforts. In this evolving technological environment the expanded use of simulation for computer center management seems worthwhile.

Multiprogramming system performance measurement and analysis

by H. N. CANTRELL and A. L. ELLISON
General Electric Company
Phoenix, Arizona

Why

"... design without evaluation usually is inadequate."1

"Simulation ... is applicable wherever we have a certain degree of understanding of the process to be simulated."1

"The key to performance evaluation as well as to systems design is an understanding of what systems are and how they work."1

"The purpose of measurement is insight, not numbers."3

Why should we spend time and money analyzing the performance of computer systems or computer programs? These systems or programs have been debugged. They work. They were designed for optimum performance by competent people who are just as convinced that their performance is optimum as they are that the program or system is logically correct. Why then should we analyze performance? There are three main reasons:

1. There may be performance bugs in a program. Performance bugs are the result of errors in evaluation or judgment on performance optimization. We have no reason to suspect that performance bugs are any less frequent or less serious than logical bugs. Thus, if the performance of a program or a system is important, then it should be performance debugged by measurement and analysis.

2. If a new or better system or program is to be designed, then a good, quantitative understanding of the performance of previous systems is necessary to avoid performance bugs in the new design.
3. If an important program or system is intolerably slow, then the real reasons for its poor performance must be found by measurement and analysis. Otherwise time and money may be spent correcting many obvious but minor inefficiencies with no great effect on overall performance. Worse yet, the whole thing may be reimplemented with all key bugs preserved!

In all three of these reasons the objective of performance analysis is to understand the unknown. We're looking for performance bugs. If we knew what these bugs were and what they cost in performance, then we wouldn't have to look for them. But because we are looking for unknown performance limiters, we don't know in advance what performance gains can be made by finding and fixing these bugs. We don't even know how hard or how easy it will be to fix these bugs after we find them. Thus, there is no way of predicting the performance payoff from the time and money spent in performance analysis. This time and money must be spent at risk, essentially on the basis of faith that the payoffs from performance analysis will exceed the value of any alternative way of spending this time and money. This is nothing new. This "faith" concept is well understood and accepted in the scientific and engineering communities. One of the purposes of this paper is to demonstrate that analysis pays off in programming to at least the same degree as in engineering and science.

How

All good analysis consists of a combination of theoretical analysis and empirical measurement. This is the classical "scientific method." The theoretical analysis may be done through a mathematical or simulation model,2 or it may simply consist of an understanding of how the system or program is supposed to work. The empirical measurement is designed to test the theory. Neither theory nor measurement alone is sufficient. Theoretical analysis alone may solve a nonexistent or unimportant problem. Measurement alone often misses a few critical parameters needed to test the theory. Since neither can stand alone, analysis usually consists of successive applications of the classical theory/measurement/revised theory/revised measurement cycle. It is important to recognize this iterative nature of analysis. Successive cycles revise or obsolete previous concepts; therefore, time spent in polishing the first theory or the first measurement method will almost certainly be wasted.

In analyzing a computer system or program the quantity to be measured is time - the time required to perform different functions or the time spent waiting to perform those functions. Performance analysis consists simply of trying to answer the question, "Where does the time go?" and, given an answer, applying a subjective judgment as to whether the amount of time spent on a given function is reasonable. The application of these concepts to an analysis of the performance of a multiprogrammed computer system, and of the programs which operate in this system, will now be considered, with examples and illustrations taken from such an analysis of the General Electric 625/635 GECOS II operating system5 and the system software and user programs which operate in that system. Only software measurement techniques are discussed; Schulman,4 for example, discusses hardware measurement techniques.
A. System analysis

For accounting purposes a multiprogramming operating system usually has to keep track of how much processor time is applied to each program and how much I/O channel time, by channel and device, is used by each program during its execution. Thus, the GECOS II system keeps a running total, by program, of all processor, channel and device time used. These totals are updated for each period of processor use and for each I/O transaction. When a program terminates, all of its accumulated times are transmitted to an accounting file and the totals are zeroed for the next program. All of the major functions of the GECOS II operating system itself are executed in 64 different, functional, operating system "programs." The time spent on each of these programs is accounted for in substantially the same way as for user programs. Operating system times are not used or reported; they are accumulated by the system because it is faster to do it than to decide not to do it. Thus, in its normal operating mode the GECOS II system is very highly and very accurately instrumented. In addition, the system has a TRACE mode of operation, usually used to debug changes to the operating system. In TRACE mode a trace entry is made in a circular list for every major operating system event, such as dispatching the processor to a program or servicing an I/O channel terminate interrupt.

Given that a great deal of data is available, how do we go about analyzing the performance of this system? A few chronological steps in the actual analysis of this system will illustrate the theory/measurement cycle.

1. User Program Accounting Analysis

Given an understanding of how the multiprogrammed system was supposed to operate (the theory), normal user program accounting data was analyzed. This data showed the starting and terminating time-of-day for each program and the processor and I/O channel time used by that program. The question, "Where does the time go?" was asked but could not be answered from this data. Our understanding of how the system was supposed to operate could not fully explain how it was operating. A tentative theory, "The extra time goes into overhead," was advanced.

2. Overhead Analysis

A complete and accurate breakdown of all GECOS overhead processor time was available in core and could have been printed in a report. However, following the rule that initial polished reports are a waste of time, several octal memory dumps of this core area were analyzed. These dumps showed that overhead processor time was significant but not excessive. They also showed a significantly high percentage of processor idle time in a system that was supposed to be heavily loaded. The theory, "The extra time goes into overhead," was dropped and replaced by the question, "Why is the processor idle time percentage so high?" A polished overhead summary report was never implemented. The octal dump analysis had demonstrated that overhead was not the problem and that, whatever the problem was, it could not be exposed by analysis of overhead time summaries.

3. Trace Analysis

Since summary data were not adequate to test the theory of system operation, a more detailed approach was taken. The internal logical trace entries generated by the operating system in TRACE mode were captured by a trace collector program and written on magnetic tape with a high resolution time-of-day value for each trace entry.
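The trace machinery described here - a circular list of timestamped event entries, drained to tape by a collector - can be sketched in a few lines. The event names and buffer size below are invented; this is an illustration of the structure, not GECOS II code.

    import time
    from collections import deque

    # Circular trace list: when full, the oldest entry is overwritten,
    # which is the usual behavior of an in-core trace ring.
    TRACE_RING_SIZE = 4096
    trace_ring = deque(maxlen=TRACE_RING_SIZE)

    def trace(event: str, program: str):
        """Record one major operating-system event with a high
        resolution timestamp (the role of a TRACE-mode entry)."""
        trace_ring.append((time.perf_counter_ns(), event, program))

    # A collector program would drain entries to tape; here we print them.
    trace("DISPATCH", "slave-3")
    trace("IO_TERMINATE_INTERRUPT", "slave-1")
    for ts, event, program in trace_ring:
        print(ts, event, program)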
In the first successful 5 minute run of this program, 350,000 trace entries were captured on tape. This represented a complete timed record of literally everything significant that had occurred in 5 minutes of full speed operation. (The trace collector program itself was one of the user programs in core and accounted for about 1% of the load on the system.)

350,000 trace entries, if printed at one entry per line, would produce an essentially indigestible pile of paper 2 1/2 feet high. So only the first 10,000 trace entries were printed, to provide an understanding of the data so that a data reduction method could be devised. A succession of data reduction and reporting methods was then tried. At each stage the results were analyzed, comparing the then current theory of operation to the actual measurements. At each step both the theory of operation and the method of reporting the data were usually revised, and the cycle repeated. This was very hard, detailed analysis work, but it produced a steadily improving understanding of how the system really worked and steadily improving methods for measuring and displaying critical performance phenomena. Ultimately two complementary measurement methods and two reports evolved; these are described in a later section of this paper.

At the same time this analysis also yielded a succession of well defined, understood, and evaluated system performance bugs. Like logic bugs, some of these performance bugs were very obscure, involving many levels of complicated inter-relationships, but many were very obvious once they had been found and pinpointed in the light of the improved understanding of system operation. Again like logic bugs, some of the performance bugs were hard to fix, but many were corrected with only a few minutes work (after the required change had been defined). As performance bugs were found they were corrected in the prototype version of the next system software distribution. Thus: theory/measurement/improved theory/improved measurement/improved system. A typical time to go through this cycle was one week, although for each of the two examples described below a significant performance bug was found, evaluated, corrected in the system, and the improved system performance measured confirming the correction, all in one eight hour day. During the operating system improvement cycle, fourteen performance bugs were found and fixed, resulting in an average throughput improvement on customer sites of 30%, with a range of from 10% to 50% depending on the load mix. Two examples of the performance bugs found are:

1. Operating System Core Space

The GECOS II Operating System was designed to operate within 16K of core but would use more core if it was available. This is quite adequate for small systems, but for large systems with several system printers running, the operating system actually used an average of about 30K of core - while only 16K was reserved for it. Measurements of large systems in operation showed that when too many slave programs were packed into core, the operating system was squeezed down and performance was degraded by about 25% due to the operating system fighting to get needed overlays into core. Over a day's operation this degradation averaged about 5%. To correct this problem the core space reserved for the operating system was made dependent upon system size (about 25% of the total core space). The previously measured performance degradation immediately disappeared.
2. Dispatcher Interval

The GECOS II dispatcher assigned the processor to user programs on a round-robin basis, allowing 62.5 milliseconds of processing for each program. Many programs would become roadblocked, waiting for I/O, in much less than this 62.5 milliseconds; the next program would then be dispatched. During processing on one user program, I/O terminate interrupts would be serviced and any I/O activities waiting in the queue would be started, but only the program processing at that time could initiate new I/O activities. The I/O time to read or write a system standard block to drum or high speed tape is about 25 milliseconds. The dispatcher overhead time to switch from one program to another was measured at about 0.5 milliseconds.

This appears to be a fairly reasonable strategy, but actually it is very bad. If I/O bound programs are multiprogrammed with a compute bound program - the ideal situation - then the compute bound program will retain control of the processor for 62.5 milliseconds every time it is dispatched. The I/O activities started by the other programs will have terminated within the first 25 milliseconds, and those programs and their I/O resources will remain idle for the remaining 37.5 milliseconds. Thus, with one compute bound program in core, all operating system and user program drum and tape I/O operated at about 40% of normal speed.

To correct this problem the dispatcher interval was changed from 62.5 milliseconds to 15 milliseconds. The results were immediately apparent. In one benchmark of two tape sort programs multiprogrammed with a compute bound program, the sorts ran twice as fast.
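The arithmetic behind this bug is easy to reproduce. The toy model below is an illustration, not the GECOS II dispatcher: one compute-bound program shares the processor round-robin with I/O-bound programs whose transfers complete in 25 ms, and a new transfer can be started only while its program holds the processor.

    def io_duty_cycle(quantum_ms, n_io_jobs=2, io_time=25.0,
                      initiate=1.0, switch=0.5):
        # One round of the round-robin: the compute-bound job keeps the
        # processor for the full quantum; each I/O-bound job runs only
        # long enough to start its next transfer, then gives up the CPU.
        round_len = quantum_ms + n_io_jobs * (initiate + switch)
        # A transfer takes io_time ms but only one can be started per
        # round, so each device is busy io_time out of round_len.
        return min(io_time / round_len, 1.0)

    for q in (62.5, 15.0):
        print(f"quantum {q:5.1f} ms: tape/drum busy "
              f"{io_duty_cycle(q):.0%} of the time")

With the numbers from the text (25 ms transfers, 0.5 ms switch overhead, and an assumed 1 ms to initiate each I/O), the 62.5 ms quantum holds the devices to roughly 40% duty cycle, matching the measured behavior, while a 15 ms quantum lets them run continuously.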
B. Program analysis

An operating system and an installation's operating procedures provide the environment in which users' programs and manufacturer-supplied compilers, assemblers, sort programs, etc., operate. The overall performance of a computer system obviously depends both upon the efficiency of the environment and upon the efficiency of the programs which operate in that environment - or at least of the subset (usually small) of all programs which account for most of the load on the system. In the earlier section, System analysis, the performance analysis of the environment was discussed. This section is concerned with analyzing the performance of individual programs. Again the starting point is the question, "Where does the time go?"

1. Input/Output and Compute Time Profiles

I/O transactions, and their degree of concurrency with each other and with computation, are important characteristics of system performance. Thus, they must be accurately measured as a part of system performance analysis. By measuring system performance, as described above, with only one program operating in core, a complete I/O and compute time profile of that program is obtained. From this profile the programmer can apply value judgments and tradeoffs as to whether the time spent is appropriate to the function being performed and whether the optimum degree of concurrency has been attained. Such an analysis is normally a part of the initial design of any program, but surprises (performance bugs) are not unusual; optimal blocking, buffering, and other I/O strategies are not always achieved. An example is a COBOL program which repeatedly wrote a very small transient file on a tape and then read it back again a few moments later. The total amount of data written and read was very small, so this I/O time hardly entered the initial performance calculations. But the measured I/O-compute profile for this program showed that most of its total time in the machine was spent in repeatedly opening, closing, and label checking this small transient tape file. The time spent on providing transient storage was outrageous compared to any of several alternative ways of performing the same function. This is an example of functional value analysis.

2. Compute Time Analysis

System analysis methods and measurements provide adequate measures of the I/O-compute profile for an individual program. But system oriented measures do not cast any light on where the compute time goes within a program. This is a classical problem whose solution is an essential part of performance analysis. It had been under attack throughout the GE-625/635 performance study described above, but its solution was accidental. Actually, the data measuring the internal compute profile of a program had been taken, and existed on a magnetic tape, before it was recognized that this data represented a complete, accurate, and general solution to the problem. This is an example of one of the extremely valuable but completely unexpected discoveries which occasionally occur in the course of a serious effort made to understand the unknown.

The solution is quite simple and can be applied on any computer which has a program interrupt capability; it is particularly easy to do in a multiprogrammed system. The method used is a high density sampling method. If an executing program is frequently interrupted according to some random or periodic time schedule which is known to be statistically independent of any natural execution pattern in the program, then the frequency with which the interrupt location falls within a particular instruction sequence is proportional to the total time spent by the program in executing that instruction sequence. To obtain a compute time profile of a program, that program is loaded and run in its normal fashion, without any modifications and without any need for prior knowledge of any of its characteristics. By using the typical "timer run-out" feature of the multiprogramming operating system, the program is interrupted frequently during execution. (A one millisecond execution increment gives about 25 sample points per typical I/O transaction on a GE-635.) The address of the next instruction to be executed in the program at the time of the interrupt is recorded for each interrupt and written on magnetic tape. At the completion of the run this tape is sorted by interrupt location address and the resulting frequency distribution is printed. The relative amounts of compute time spent in various parts of the program are determined by comparing the distribution of interrupt locations to the program listing.

An analysis run with this method takes about one-third more GE-635 computer time than a normal run of the program being analyzed. At 1,000 sample points per second of compute time, the distribution of compute time within a program is resolved down to the individual instruction execution frequency for all but very infrequently executed routines, which are by nature uninteresting for this kind of analysis. All compute time concentrations within the program are very clearly defined and measured, without any need for prior knowledge of where they are.
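The same high-density sampling idea can be expressed compactly on a modern system. The sketch below is a minimal Python analogue, not the GE-635 mechanism: a profiling timer interrupts the process every millisecond of CPU time, the handler records the line about to execute (standing in for the interrupted instruction address), and the sorted frequency counts locate the hot spots. It relies on Unix signal timers.

    import signal
    from collections import Counter

    samples = Counter()

    def sampler(signum, frame):
        # Record where the program was about to execute when the timer
        # fired; the count at each location is proportional to the time
        # spent there (the sampling argument made in the text).
        samples[(frame.f_code.co_name, frame.f_lineno)] += 1

    signal.signal(signal.SIGPROF, sampler)
    signal.setitimer(signal.ITIMER_PROF, 0.001, 0.001)   # 1 ms CPU-time samples

    def busy_work():
        total = 0
        for i in range(5_000_000):
            total += i * i          # the expected hot spot
        return total

    busy_work()
    signal.setitimer(signal.ITIMER_PROF, 0)              # stop sampling
    for loc, n in samples.most_common(5):
        print(loc, n)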
The "accidental" application of this method occurred because GECOS II normally set the interval timer to interrupt a program after 62.5 milliseconds of compute time. The location of the interrupt was a part of the trace entry which was collected in the Trace Analysis method described in A-3 above. Thus, this data had already been taken, in crude 62.5 millisecond interrupt period form, before the value of the data was recognized.

To be statistically rigorous, the fixed one millisecond sampling interval should be replaced with a random sampling interval with a one millisecond mean, but a fixed interval appears to be satisfactory. Some programs might conceivably have an execution pattern which repeats in synchronism with the one millisecond interval. Such synchronization has never happened in practice, but if it did it would immediately be evident from indications of very high execution frequencies at the synchronization points and very low frequencies at nearby points in the same instruction sequences. The sampling interval could then be changed to some non-synchronous value for that program.

This method has been applied to a wide variety of programs and has been found to be a very valuable tool for tuning long running or frequently used programs. The method finds many types of compute time performance bugs if they are present, and pinpoints the areas in which tuning will be of greatest value. For example, the first application of this method to the FORTRAN compiler led to fixes which increased the speed of the compiler by 27%.

What

In the current implementation both the system and the individual program performance measurements are taken by a small (4K or 6K) program, called MAPPER, loaded as one of the user programs in the GE-625/635 multiprogrammed system. This program initially ties itself into the standard operating system using a special privileged interface and later unties itself upon termination, leaving the operating system in its original form. (Only one instruction in the operating system is changed while MAPPER is operating.) The program is tied into the trace facilities of the operating system and scans all trace entries. Otherwise it uses no processor time unless it is called upon to print a line or write a block of data on tape.

A. MAP

Every two seconds MAPPER samples most of the various accounting time cells in the operating system, subtracts the value from two seconds ago, and prints a single line showing the percentage of processor and channel time that has been applied to various programs and operating system functions over that two second period. Percentages are rounded to the nearest percent. Values less than .5% are printed as a period. If a user program is present in core but has not used any processor time over a two second interval, then an asterisk is printed for that program. An example of MAPPER output is shown in Figure 1. Whenever a user program (slave program) is started or terminated, the identification of that program is printed, along with a core map showing the locations of all user programs; each print position corresponds to 1K of core. Page averages and run averages are shown at the bottom of the page. The report can be printed in real time on an on-line printer, or sent to the standard system output facility (SYSOUT), or stored on tape for later printing. Normally it is printed on an on-line printer.

This is an analysis report. It is not intended to be a display of all of the interesting data that are available and could be printed.
Its objective is to display only that data which are essential to an understanding of what is going on in the system, and to display the data on one line so that the vitally important interactions between parallel functions can be seen and understood.

Briefly, a line on the report shows the percentage of processor or channel time applied to various functions over a two second period. Idle processor time is included in dispatcher (DISP) time; the actual dispatcher time ranges from 2% to 5%. Tape channel time can exceed 100% because there are several tape channels available. The only columns which do not show time percentages are:

1. Time-of-day. The left most column shows time-of-day in hours.
2. FREELK. The number of free links of space available on the drum.

3. FREEMT. The number of magnetic tapes which are available and have not been allocated to user programs.

4. WTGQUE. The number of user jobs waiting in the external queue.

The column headed SLV7 shows the time spent on the MAPPER program itself. The column headed IOTM shows the time spent in handling I/O terminate interrupts; this value is inflated by about two-to-one because of extra overhead from MAPPER. The columns headed CMP1-4 are the operating system functions driving on-line printers, while DSYT is the system output collector function. GEIN is the on-line card reader input function, which also builds the job queue.

[Figure 1 - Internal performance MAP of about one minute of GE-635 operation during a customer's normal work day; RUN TOTALS apply to about 2.3 hours of operation. The printed report, one line per two-second interval, is not reproduced here.]

The overall operation of a computer system is as dependent upon what goes on outside the machine, in the machine room, as it is upon what goes on inside the machine. Thus, the MAP report is normally printed on-line in real time, with the system analyst present and observing the operation of both the internal and external systems. MAP runs are usually made over periods of several hours of normal system operation, at a cost of one on-line printer, 4K of core, and about a 10% reduction in machine performance due to the increased overhead of operation in trace mode. From observation during such runs, and from later study of the data, the following types of system performance factors can be found and evaluated:

1. Hardware configuration bottlenecks - I/O channels, tapes, drum, disc, core, etc.

2. Indications of inefficient I/O strategy - blocking, buffering, etc. - in individual user programs.

3. The effectiveness or ineffectiveness of the actual multiprogramming mix.

4. The effects of machine room procedures upon system performance.

5. The relative efficiencies of various optional strategies of machine operation, with enough information to define why this way is better than that one.

6. Possible operating system performance bugs.
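The MAP mechanism itself - sample cumulative accounting counters, difference them against the previous sample, print one line of percentages - is worth seeing in miniature. The sketch below fakes the counters with random increments; the column names and the whole setup are illustrative, not MAPPER internals.

    import random

    INTERVAL = 2.0          # seconds between report lines, as in MAP
    counters = {"DISP": 0.0, "TAPE": 0.0, "SLV1": 0.0}

    def advance_counters():
        """Stand-in for the running OS accounting cells: each counter
        accumulates busy seconds during the interval."""
        counters["DISP"] += random.uniform(0.0, INTERVAL)
        counters["TAPE"] += random.uniform(0.0, 2 * INTERVAL)  # several channels
        counters["SLV1"] += random.uniform(0.0, INTERVAL)

    previous = dict(counters)
    for _ in range(5):                       # five two-second report lines
        advance_counters()
        fields = []
        for name, value in counters.items():
            pct = round(100 * (value - previous[name]) / INTERVAL)
            fields.append(f"{name}:{'.' if pct == 0 else pct}")  # '.' for < 0.5%
        previous = dict(counters)
        print("  ".join(fields))

Note that the TAPE column can legitimately exceed 100, just as tape channel time does on the real report.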
B. Major event report

In addition to producing the MAP as described earlier, the MAPPER program will optionally collect all operating system trace entries and write them on a tape. This is a very detailed mode of measurement and is normally done only over short periods of operation, 10 to 15 minutes. The tape of trace data collected by such a run is then processed to produce a major event report. This report displays events such as dispatching the processor to a program or taking the processor from a program. It also displays all I/O initiate and terminate events and a few other trace events of special significance. The major event report provides a microscopic view of what is going on in the system; one line on the MAP report expands to about 300 lines on the major event report. This microscopic view is used to measure and understand the detailed operation of the operating system.

C. Slave profile analysis

Another option in the MAPPER program is used to measure the distribution of compute time within user programs. With this option the MAPPER produces the MAP report and also sets the dispatcher interval to one millisecond and collects timer runout trace events for the program being analyzed. These are subsequently processed and used as described under Compute Time Analysis above. The MAP for this run provides the I/O-compute profile data for the program being analyzed.

Results

The results of system performance analysis work are both tangible and intangible. The tangible results are those which lead to significant improvements in system and program performance; these have been and are being achieved. The intangible results are the better understanding of the technology which is attained. Ultimately, these may be of the greatest value. This section is, therefore, devoted to some of these significant intangible results.

1. The critical path in a multiprogrammed system

The second by second performance of a multiprogrammed system is always limited by the speed of the processor or an I/O channel, or by a path through several of these resources used in series, as with an unbuffered program which reads a tape, then computes, and then writes a tape. Thus, for every line on a MAP report, if some single limiting resource is not saturated, there must be a performance limiting critical path through some series of resources whose total utilization adds up to 100%. Such a critical path is the only mechanism by which each of all the resources in the system can be partially idle over some short period of time during which there is a load on the system.

2. Resource Utilization Strategy

For all practical purposes it is never advantageous to arbitrarily defer processor or I/O channel use to some future time if that resource can be used now. The idle resource time generated by such an action can never be recovered. An example might be a program which could overlap processor and I/O time with double buffering but does not, assuming that some other program will use the free processor and channel time. This is like betting a dollar that you already have on a chance that you may win only that dollar back: you can only lose or at best break even.

3. Resource-Seconds Analysis

In a multiprogrammed system all programs share the same set of resources. Each program imposes a load of some number of resource-seconds upon each resource. Critical resource tradeoff decisions can be made on the basis of which approach uses the least critical resource-seconds. Two tradeoff examples follow.
(a) How many tapes should be used for a tape sort, given that the sort time is dependent upon the number of tapes used? The critical resource is the number of tape drive-seconds available on the system. A plot should be made of the number of tapes used times the sort time for that number of tapes. The sort that uses the fewest tape drive-seconds is the optimum in a shared resource system.

(b) How much core should be used for a program, given that the total running time for that program is dependent upon the amount of core used? The critical resource is the total number of core word-seconds available. The running time for each of various core size alternatives should be determined. The alternative which uses the least total core word-seconds is the optimum in a shared resource system.

Extreme cases, such as the program whose optimum core size is all of the available core, should be studied a little more carefully. In extreme cases several critical resources - core, tapes, etc. - may be involved in the problem. When this occurs, a resource-second weighting factor, such as the cost of the resource, should be assigned to each resource. Then the optimum is the combination which uses the least total weighted resource-seconds. For even more rigorous analysis, unused resources may be weighted with the probability that they will be used by some other program. Usually the tradeoff values are sufficiently well separated that this degree of accuracy is not needed.
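Tradeoff (a) reduces to a one-line computation once the timings are measured. The sketch below uses invented sort times; a weighted variant in the same spirit would simply multiply each resource's seconds by a cost factor, as the extreme-case discussion above suggests.

    # Hypothetical measurements: sort time (seconds) for each tape count.
    sort_time = {3: 900, 4: 620, 5: 510, 6: 450}

    # Resource-seconds on the shared critical resource: tape drive-seconds.
    drive_seconds = {n: n * t for n, t in sort_time.items()}

    best = min(drive_seconds, key=drive_seconds.get)
    for n, cost in sorted(drive_seconds.items()):
        marker = "  <- optimum" if n == best else ""
        print(f"{n} tapes: {cost} tape drive-seconds{marker}")

With these invented numbers the fastest sort (6 tapes) is not the optimum; the 4-tape sort loads the shared tape pool least, which is exactly the point of resource-seconds analysis.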
4. Core utilization

In the GE-625/635 system each program must be allocated to a block of contiguous core. Core is compacted as required by moving programs in core to make space for new programs. The significant point here is that this is not a problem: measured core utilization is quite high, and core compaction (program moving) time is insignificant, well under 0.5%.

5. Operating system overhead

A multiprogramming operating system must perform many functions which are not required in a uniprogramming operating system. But, as measurements of the GE-625/635 GECOS II operating system demonstrate, all of these functions can be achieved in a low overhead operating system. Total overhead averages about 15% of processor time.

6. Measurements by customers

Using normal accounting or billing statistics, the customer can measure the gross performance of a multiprogrammed computer system in substantially the same way as he would measure the performance of a uniprogrammed system. In a uniprogrammed system these individual job statistics measure the performance of the total system and the efficiency of each program as each individual job is executed; the gross measures are simply the sum of a number of independently valid individual job measures. But in a multiprogrammed system several programs are operating concurrently in core. The total time spent by any one program in core depends not only upon the work done by that program but also upon the degree of its competition for resources with the other programs in core. Thus, the individual job measurements usually measure neither system performance at a given time nor the efficiency of individual programs while they were in core. The customer can measure gross performance, but he cannot measure detailed system or program performance with these statistics. For example, the amounts of compute and I/O channel time used by a program may be measured, but there is no indication as to whether or not these resources were used concurrently. As another example, set-up time usually disappears as a measure but not as a problem.

Overall we find that when conventional, uniprogrammed type measures are applied to a multiprogrammed system, the customer has lost many of his previous measures of system, operator, and program performance. But he is actually operating a considerably more complex system and needs better, not fewer, measures. Thus, system and program performance measurement methods of the type discussed in this paper appear to be essential to the customer's effective operation of his multiprogrammed computer system.

7. Load mix - the top ten

If the total time per month used by each program in an installation were computed and the results sorted by the amount of time used, how much of the total load would be accounted for by the top ten programs? Typically, the top ten programs appear to account for from 50% to 80% of the load. (Actually the "top ten" may turn out to be the top five or the top twenty at any given installation, but the same principle applies.) The "top ten" concept implies that the efficiency or throughput of an installation is from 50% to 80% determined by the efficiency of a very few programs. These programs are worth tuning, particularly when it is recognized that performance tuning can and has reduced program running times by two or three to one. The top ten programs are worth the application of some expert talent and the use of program analysis methods such as those described in this paper. This is the easiest, fastest, and least expensive way to increase the capacity of a computer installation.

8. Turn around time

When a program arrives in the machine room it is almost sure to find that one of the top ten is there ahead of it; after all, one or another of the top ten is on the machine most of the time. In a uniprogrammed system almost everybody has to wait, almost every time, for one or more of the top ten programs to be finished. But in a multiprogrammed system, many short jobs can be loaded and finished in parallel with the execution of one of the top ten. They no longer have to wait, as long as all the available core is not filled with several of the top ten, all competing for the same resources. This suggests a good operating procedure for a multiprogrammed system. The long jobs should not be saved for the second or third shifts. The top ten jobs should be identified, and all processor saturating combinations of one or two of these top ten programs should be determined. Then the operators should load the machine so that one of these combinations - no more, no less - is in the machine at all times. The short jobs will still receive excellent turn around time, and there will normally be a new top ten combination waiting to replace the one that has just finished.
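The "top ten" share itself is a one-pass computation over the monthly accounting file. The program names and hours below are invented for illustration:

    # Hypothetical monthly accounting totals: program name -> hours used.
    monthly_hours = {"PAYROLL": 96, "SORTX": 74, "FCOMP": 60, "INVTRY": 41,
                     "BILLNG": 38, "STAT01": 29, "EDIT2": 27, "UTIL3": 24,
                     "TEST9": 22, "CARDS": 19, "TAPE4": 18, "MISC": 16}

    ranked = sorted(monthly_hours.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(monthly_hours.values())

    top_n = 5                      # this installation's "top ten" is a top five
    top_share = sum(h for _, h in ranked[:top_n]) / total
    print(f"top {top_n} programs carry {top_share:.0%} of the load")

With these made-up figures the top five programs carry about two-thirds of the load, inside the 50% to 80% band quoted above.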
CONCLUSION

Methods have been developed for analyzing the performance of a multiprogrammed computer system and the programs which operate within that system. Some of these methods, and many of the specific measures and reports developed, do not apply directly to other systems, but the general approach used does apply. The tangible and intangible results produced from the study reported here demonstrate the value of performance analysis. The methods and philosophy may be helpful as an example of how to do it.

REFERENCES

1 P CALINGAERT  System performance evaluation: survey and appraisal  Comm ACM 10 1 Jan 1967 pp 12-18
2 N R NIELSEN  The simulation of time sharing systems  Comm ACM 10 7 July 1967
3 R W HAMMING  (Paraphrased by the author)
4 F D SCHULMAN  Hardware measurement device for IBM System/360 time sharing evaluation  Proc 22nd ACM Conference 1967 pp 103-109
5 GE-625/635 Comprehensive Operating Supervisor (GECOS) Reference Manual  General Electric Information Systems Division CPB-1195

Multiprogramming, swapping and program residence priority in the FACOM 230-60

by MAKOTO TSUJIGADO
Fujitsu Limited
Tokyo, Japan

INTRODUCTION

FACOM 230-60 is a large size electronic digital computer developed by Fujitsu Limited. The system consists of 1) up to 2 processing units, 2) a 256k word (maximum) high speed core memory that operates at a 0.92 microsecond cycle time, or at an effective cycle time of 0.15 microseconds with 16 memory banks, and 3) a 768k word (maximum) low speed core memory that operates at a 6.0 microsecond cycle time, or at an effective cycle time of 1.0 microsecond with 6 memory banks. The average duration for execution of one instruction by the Gibson-mix method is 1.6 microseconds for each processing unit. Because each processing unit has 6 base registers, it is easy to write a location-independent program, which makes dynamic relocation of a program possible. The most commonly used large random-access storage devices are: magnetic drum with 544k words, a mean access time of 17 msec, and a transmission rate of 135k bytes/sec; magnetic disk with 20,000k words, a mean access time of 1}0 msec, and a transmission rate of 125k bytes/sec; and disk pack with 1,800k words, a mean access time of 87 msec, and a transmission rate of 156k bytes/sec.

For the FACOM 230-60 Operating System, system requirements include real-time job processing and conversational job processing, in addition to the conventional batch job processing. To control these three processing modes together under a single control program, two concepts have been adopted in the system design: multi-tasking operation, and three priorities, i.e., job priority, execution priority and program residence priority. This paper introduces the program residence priority concept.

FACOM 230-60 multiprogramming requirement

To show why the system requires the multiprogramming property, consider the abstract computer indicated in Fig. 1. Assume that this model is required to handle u programs simultaneously, and that each program is processing quantity Di of input and output data. The following symbols are used:

u: total number of jobs being executed in a system
Di: quantity of input and output data to be processed by each job (unit is 1 character)
D: average value of the Di above, i.e., D = (D1 + D2 + ... + Du) / u
P: data quantity to be processed by the processing unit (characters)
H: system overhead
g: average time for execution of one instruction by the Gibson-mix method (seconds); for one processing unit, g = 1.6 microseconds, and for two processing units, g = 0.9 microseconds
s: number of instruction steps needed to process one character of input or output data

[Figure 1 - System model: an abstract model of a system simultaneously executing u jobs, each processing quantity Di (i = 1, 2, ..., u) of input and output data (figure not reproduced).]

The relation between Di and P, with overhead H, is given in the following expression:

    sum(i = 1..u) Di = (1 - H) · P
When sum(i = 1..u) Di > (1 - H) · P, the processing unit causes the input and output devices to slow down, or loses a part of the input data, since the processor cannot complete the processing of all the input and output data given. When sum(i = 1..u) Di < (1 - H) · P, idling occurs in the processing unit.

Since u·D = sum(i = 1..u) Di, differentiating the equation with respect to t,

    u · dD/dt = (1 - H) · dP/dt        (1)

where dD/dt is the average quantity of the input and output data processed by one job in a unit time, and dP/dt is the data quantity processed by processing unit P in a unit time. On the other hand, from the dimension equation,

    P [ch] · s [step/ch] · g [sec/step] = t [sec]

the following equation can be obtained by differentiating with respect to t:

    dP/dt = 1 / (s · g)        (2)

From Equations (1) and (2),

    u = (1 - H) / (s · g · dD/dt)        (3)

In Equation (3), u, the number of jobs simultaneously executed by the system, is given as a function of dD/dt (average quantity of input and output data processed by each job in a unit time), s (number of steps needed to process one character of input or output) and H (overhead).

Further consequences of Equation (3):

1) In batch mode jobs

1.1) When a card reader and a line printer are used:

    Card reader speed = 800 [cards/min]:
        800 [cards/min] × 80 [bytes/card] / 60 [sec/min] = 1064 [bytes/sec]
    Line printer speed = 1500 [lines/min]:
        1500 [lines/min] × 136 [bytes/line] / 60 [sec/min] = 3450 [bytes/sec]
    dD/dt = 1064 + 3450, approximately 5000 [bytes/sec]

therefore

    u = (1 - H) · 1.25 × 10² / s

1.2) When a magnetic tape handler with an effective speed of 45k (bytes/sec) is added to the above case:

    dD/dt = 45,000 + 5,000 = 5 × 10⁴ [bytes/sec]
    u = (1 - H) · 12.5 / s

1.3) When ten magnetic tape handlers with an effective speed each of 62.5k (bytes/sec) are used in each job:

    dD/dt = 62.5 × 10³ × 10 = 6.25 × 10⁵ [bytes/sec]
    u = (1 - H) / s

2) In real time processing

When 1,000 transmission lines with a speed of 1,200 baud are processed by one job:

    dD/dt = 120 [bytes/sec] × 1,000 = 1.2 × 10⁵ [bytes/sec]
    u = 5.2 · (1 - H) / s

3) In conversational mode jobs

It is safe to estimate that dD/dt would be 2 (bytes/sec), since it would take about one second per character to type in the data and also to read the printed data.

    u = 3.1 × 10⁵ × (1 - H) / s

[Figure 2 - Job multiplicity: the relation between u (number of simultaneous users) and s (number of instruction steps needed to process one character of input or output data) at each fixed value of the parameter dD/dt (input and output quantity processed in the unit time), plotted from u = (1 - H) / (s · g · dD/dt) with H = 0 and g = 1.6 × 10⁻⁶ sec (figure not reproduced).]
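Equation (3) is easy to evaluate directly. The sketch below recomputes the job-multiplicity figures for the workloads above, with H = 0 and g = 1.6 microseconds as in Fig. 2; the workload table simply restates the paper's examples, and the choice s = 64 is an assumption within the 32-128 range that a compiler of the period might plausibly need.

    # u = (1 - H) / (s * g * dD/dt), Equation (3) in the text.
    G = 1.6e-6      # sec per instruction, one processing unit (Gibson mix)
    H = 0.0         # overhead assumed zero, as in Fig. 2
    S = 64          # assumed steps per character of input or output

    workloads = {                       # dD/dt in bytes/sec, from the text
        "card reader + printer": 5.0e3,
        "plus one 45 KB/s tape": 5.0e4,
        "ten 62.5 KB/s tapes":   6.25e5,
        "1000 lines at 120 B/s": 1.2e5,
        "conversational":        2.0,
    }

    for name, dd_dt in workloads.items():
        u = (1 - H) / (S * G * dd_dt)
        print(f"{name:24s} u = {u:10.1f}")

The conversational case comes out near 4,900 simultaneous jobs, consistent with the text's 3.1 × 10⁵ · (1 - H) / s.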
Simply assuming that each job requires an area of 32k words, a core memory of 8192k words would be necessary to execute 256 jobs. It is impractical to make such a large core memory, so the situation naturally calls for program swapping techniques. In this section, we will discuss the idle time of the processing unit (and of the system) caused by swapping. The symbols below are used in the following discussion.

u: number of on-line users who utilize the processing unit once in each response cycle
tr: response cycle, a period of time wherein each user utilizes the computer at least once (seconds)
f: average period of time assigned to a user's program in each response cycle, including the overhead
e: average period of time, excluding the overhead, assigned to a user's program in each response cycle; that is, e = f·(1 - H)
ts: average period of time for a swapping (one way)
ti: average period of time for a roll-in (= ts)
to: average period of time for a roll-out (= ts)
H: overhead; the ratio of the number of program steps, in a control program, either for services or for idling, to the total number of program steps
d: number of data channels available for the program swap
da: number of data channels actually provided for the program swap
n: number of jobs in the main memory at a certain time
L: average length of users' programs which correspond to jobs (K words)
V: transmission rate of one random access storage (K bytes/sec)
A: average access time of one random access storage (seconds)

As pointed out by J. I. Schwartz,2 in the worst case

    u = tr·(1 - H)/(2·ts + e)

If, however, a sufficiently large number of data channels and enough random access storages to swap programs smoothly are provided, and the system can handle the dynamic relocation of a program, u can be written as

    u = tr·(1 - H)/e    (4)

Analysis of swapping time

From the preceding section, it is concluded that the FACOM 230-60 System could simultaneously process 256 jobs in the conversational mode. In this case, in order to obtain the relation between the swapping time and the execution time of the processing unit, the time chart in Fig. 3 is used, since the channels transfer data one way at a time. From Fig. 3,

    2·ts = (n - 1)·f = (n - 1)·e/(1 - H)    (5)

[Figure 3 - Main memory operating diagrams: part of main memory shared by multiple jobs, each job area being rolled in, executed, and rolled out in turn; ti: period of time for roll-in (= ts); to: period of time for roll-out (= ts); te: period of time for execution, including overhead.]

Equation (5) is valid when the capacity, or power, of the data channels is equal to that of the processing unit; if the former is greater than the latter, ti + f + to < n·f, and no idle time will be available in the processing unit.

Further, the next relation holds between the average size of the programs each constituting one job, the data transfer rate and the average access time of a random access storage, and the program swapping time:

    L [K words] · 4 [bytes/word] = V [KB/s] · (ts [s] - A [s])    (6)

that is,

    ts = A + 4·L/V    (7)
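As a quick check on Equation (7), here is a small sketch (ours) of the one-way swap time for the storage devices quoted in this paper; the 32K-word program size is taken from the swapping discussion above:

```python
# Sketch: one-way swap time, Equation (7): ts = A + 4L/V,
# with L in K words (4 bytes/word) and V in K bytes/sec.

def swap_time(L_kwords, V_kbytes_per_s, A_seconds):
    return A_seconds + 4.0 * L_kwords / V_kbytes_per_s

# Magnetic drum from the introduction: V = 135 KB/s, A = 17 ms.
print(swap_time(32, 135, 0.017))   # ~0.97 s for a 32K-word program
# High speed drum assumed later in the text: V = 1000 KB/s, A = 17 ms.
print(swap_time(32, 1000, 0.017))  # ~0.15 s
```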
When the capacity of the data channels is less than that of the processing unit, ti + f + to > n·f, and idling will occur in the processing unit. In order to use the processing unit without idling, the conditions d ≥ 2 and n ≥ 2 are necessary when f < 2·ts. In the FACOM 230-60 Operating System operating in the conversational mode, the condition f < 2·ts will always hold even if a high speed drum (V = 1000 (KB/S), A = 0.017 (S)) is attached. Therefore we assume d ≥ 2 and n ≥ 2 in the following.

From Equations (4), (5) and (7), and since n = d,

    n = d = 1 + 2·A·u/tr + 8·u·L/(V·tr)    (8)

If A = 0.017 [S], V = 135 [KB/S] and tr = 16 [S] (since an on-line user carrying out a FORTRAN compilation with a typewriter normally takes 16 seconds on the average to type in one statement, the value 16 or so is sufficient for tr), the following expressions are obtained according to u, the number of on-line users.

1) In the case of 64 on-line users, n = d = 1.14 + 0.24 × L
2) In the case of 128 on-line users, n = d = 1.3 + 0.5 × L
3) In the case of 256 on-line users, n = d = 1.5 + 0.9 × L

The three cases are illustrated in Fig. 4. In addition, when A, V and u are assigned 0.017, 1000 and 256 respectively,

    n = d = 1.5 + 0.128 × L

This relation is also illustrated in Fig. 4.

[Figure 4 - Relation between data channels and program size.]

It was concluded from the above analysis that providing a sufficiently large number of data channels and random access storages for the program swap must be abandoned, and that the processing-unit time left idle by the reduced number of data channels should instead be applied to batch mode jobs. The idle-time rate in the processing unit increases as the number of data channels actually used decreases. From Fig. 3,

    rate of idle time = (d - da)/d × 100 [%]

[Figure 5 - Idle time ratio: the relation between the number of data channels actually provided for program swapping (da) and the rate of the idle time generated in the processing unit; L: average length of programs; d: desired number of data channels for program swapping.]

From the relation in Equation (8),

    L = (V/4)·(tr·(n - 1)/(2·u) - A)

For tr = 16 [S], u = 128, A = 0.017 [S] and V = 135 [KB/S],

    L = 34·((n - 1)/16 - 0.017)

Figure 5 shows the rate of idle time considering L·da = 256 [KW].

In addition, the number of on-line users simultaneously accessing the system can be obtained from Equation (8). Since d must always be greater than 1, as discussed between Equations (5) and (6), and, from Fig. 3, d = 3, 5, 7, ...,

    u = ((d - 1)/2)·tr/(A + 4·L/V)

Therefore, for any number of data channels actually provided,

    u = (da/d)·((d - 1)/2)·tr/(A + 4·L/V)    (9)

where d must be 2m + 1 when da = 2m or 2m + 1, for m = 1, 2, 3, ... In case of A = 0.017 and V = 135,

    u = (da/d)·((d - 1)/2)·tr/(0.017 + 0.03·L)

and in case of A = 0.017 and V = 1000,

    u = (da/d)·((d - 1)/2)·tr/(0.017 + 0.004·L)

The relation is shown in Fig. 6.

[Figure 6 - Relation between number of on-line users and program size; L: average program size; V: data transfer rate; A: average access time; da: number of data channels actually provided for the program swap; tr: response cycle.]
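Equations (8) and (9) are simple enough to tabulate directly. A minimal sketch (ours; the function names are invented) that reproduces the coefficients quoted above:

```python
# Sketch: Equations (8) and (9). Defaults are the values used in the
# text: tr = 16 s, A = 0.017 s, V = 135 KB/s.

def channels_needed(u, L, tr=16.0, A=0.017, V=135.0):
    """Equation (8): n = d = 1 + 2*A*u/tr + 8*u*L/(V*tr)."""
    return 1 + 2*A*u/tr + 8*u*L/(V*tr)

def users(d, da, L, tr=16.0, A=0.017, V=135.0):
    """Equation (9): u = (da/d) * (d-1)/2 * tr / (A + 4*L/V)."""
    return (da/d) * (d - 1)/2 * tr / (A + 4*L/V)

for u in (64, 128, 256):
    intercept = channels_needed(u, L=0.0)
    slope = channels_needed(u, L=1.0) - intercept
    print(f"u = {u:3d}: n = d = {intercept:.2f} + {slope:.2f} * L")
    # -> 1.14 + 0.24*L, 1.27 + 0.47*L, 1.54 + 0.95*L, as in the text

print(users(d=3, da=3, L=20.0))   # on-line users with 3 channels, 20KW programs
```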
In case of A = 0.017, V = 135 d- 1 tr u = d' -2-' 0.017 + 0.03L and ill case of A =0.017, V = 1000 ~----~-----~f------~-----4------~- 4 (9) V da 4 8 16 64[KWJ .berage program size V: data transfer rate A: average aece88 t1ae da: number of data channels actually ProVided tor program tr: respoo .. cycle Figure 6- Relation between number of on-line users and program size 228 SprIng Joint Computer C"onference, 1968 The division of main memory space between batch jobs and conversational jobs is prese"nted as" follows. The size. of main memory space' for conversational jobs· is qetermined by d a • L, and that for batch" jobs is determined by (the total main memory size -da ' L- the size of the area for the control programs). Tije batch job control program requires the joh-step initiator to allocate Jobs until the ~ain memory area reserved for batch jobs fills up, while" the conversational job control program requires the' job-step initiator to allocate jobs until the number :of active users reaches ~, which is calculated by Equations (9). Execution Priority Program Residence Priority H1gb High 'I I Real Time Job . - -. !i Conversat:lona.1. oJ OD medium i »atch Job I I Low Medium T ABLE 1- Normal Usage of Priorities trator, or at the syste~ generati?n time, and the rest of the memory is allocated to batch jobs. Con.versaintroduction ofprogram residence priority tional jobs are. always rolled out immediately" aftei they are put into the waiting status. However, for Figure 5 indicates that over 60 percent idle time is control program logic simplicity and in order to exegenerated in the processing unit with 4 data channels . cute . batch jobs. and conversational jobs under a for program swap when the average size of programs single control program, the program residence priority in the conversational mode is 20k worqs. Thus, the concept was chosen. Arbitrarily, about 10 percent of system should not be operated in the conversational the" control program would be a special control promode alone, but ~ogether with the batch mode. Furgram for batch jobs and about 15 percent would be thermore, program swapping, i.e., swapping of jobs,_ for conversational jobs. should not be used for batch mode processing. These the reasons "why program residence priority has ACKNOWLEDGMENT been considered in the design of the FACOM 230-60 , Dr. Toshio Ikeda of Fujitsu Limited pointed out that operating system. resources should first be allocated to both real time In the system, one load module consists of up to 127 mode jobs and batch mode jobs, with the remaining program blocks. This is the unit of swapping, to which resources allocated to the conversational mode jobs. the. program residence priority is attached. The proMr. Takeshi Maruyama of Fujitsu Limited introduced gram residence priority determines the priority for the program residence priority concept. Thanks are staying in the main storage between program blocks when competition occurs. For example, in conversa- due -to them and to Mr. Isao Saoda, Mr. Takuma Yamtional mode jobs, a higher execution priority and a amoto, Mr. Akira Okamoto and Mr. Haruo Kunizawa lower program residence priority than for the batch for their encouragement and assistance. .mode jobs are to be set respectivelv. With this priority system, the conversational mode task is executed REFERENCES before execution of the batch mode task, and the for1 N UKAI mer jobs are rolled out before the latter jobs are rolled Operation analysis of FA COM 230-60 by system simulator SOL out. 
ACKNOWLEDGMENT

Dr. Toshio Ikeda of Fujitsu Limited pointed out that resources should first be allocated to both real time mode jobs and batch mode jobs, with the remaining resources allocated to the conversational mode jobs. Mr. Takeshi Maruyama of Fujitsu Limited introduced the program residence priority concept. Thanks are due to them and to Mr. Isao Saoda, Mr. Takuma Yamamoto, Mr. Akira Okamoto and Mr. Haruo Kunizawa for their encouragement and assistance.

REFERENCES

1 N UKAI
Operation analysis of FACOM 230-60 by system simulator SOL
Proceedings of the 1967 Conference of the Four Electrical Associations in Japan

2 J I SCHWARTZ E G COFFMAN C WEISSMAN
A general-purpose time-sharing system
Proceedings of the 1964 AFIPS Spring Joint Computer Conference

A storage-hierarchy system for batch processing

by DAVID N. FREEMAN
Triangle Universities Computation Center
Research Triangle Park, North Carolina

Operating System/360 was designed to meet a severe core-memory constraint: a 14K-byte resident supervisor plus a repertoire of compilers, utility programs, sort programs, and application packages fitting into 18K bytes (approximately 4500 data words and executable instructions). Many supervisory functions included in the nucleus of pre-360 systems were re-packaged into 1000-byte overlays for OS/360 (e.g., logic to OPEN and CLOSE files - hereafter called data sets, following OS/360 nomenclature).5 Specification of device type, buffering technique, and data set identification - which was assembled, compiled, or link-edited into many pre-360 application programs - is deferrable in OS/360 until the data set is actually opened for processing, essentially "latest-possible binding of data-set attributes and processing mode" (cf. Part 3 of Reference 5 for a complete discussion). Likewise, the assemblers, compilers, and utility programs offer a wide range of language facilities for relatively small machines. In particular, the E-level compilers require approximately 18K bytes, to match a 14K resident supervisor in a 32K machine; the F-level compilers require approximately 44K bytes, to match a 20K supervisor in a 64K machine; the G-level Fortran compiler requires approximately 80K bytes in a 128K machine; and the H-level Fortran compiler requires approximately 210K bytes in a 256K machine.

The gross effects of this packaging of the supervisor, compilers, and other programs have produced three major performance problems in large-memory, fast-CPU systems: reliability, operator-intervention losses, and system I/O inefficiencies.

Fundamental performance problems

Reliability

Many core-resident subroutines of pre-360 systems were divided into multi-overlay structures in OS/360, initially riddled with logical errors and interface inconsistencies. Although these errors are continually being identified and corrected by IBM - with the co-operation of its customers - the aggregate reliability is barely satisfactory today. Our experience is similar to that of many OS/360 users: the system frequently encounters an uncorrectable software error and stops dead, whence it must be reloaded.

On small systems, job losses from these dead stops are typically of less consequence; the operator reloads the system and restarts the job that failed. On a large communications-oriented system, dead stops are intolerable; the TUCC system is continually sending/receiving jobs from 5-25 satellites simultaneously, and re-transmission of jobs is rarely completely successful, e.g., jobs are lost, partially processed, processed twice, etc. In an early version of the TUCC communications package - comprising both IBM- and locally-written subroutines - we found that 30% of all jobs were reruns, incurred by some failure of the collection/processing/distribution network (shown in Figure 1, "The TUCC Computer Network and One Typical Campus," and Figure 2, "Disk Queuing for Job Flow between TUCC and a Typical Satellite Computer").

[Figure 1 - The TUCC computer network and one typical campus: 5KB broadband lines link the central computer to Chapel Hill, NCSU-Raleigh, Asheville and Greensboro; a typical campus attaches typewriters, paper tape readers, and card readers.]
Fundamental performance problems ~FROM­ ;:f?0::~=~ 5KB BROAaIA/lD V d~ztt ~ ~ {W'tt~ ,~}2 ~ !!£ - Reliability CHAPEL HILL TYPEWRITERS, PAPER TAPE READERS. AND CARD READ£RS ftGUBE I" Many core-resident subroutines of pre-360 systems were divided into multi-overlay structures in OS/360, 229 NCSU - RALEIGH TbeIllCC_utertlet.,,:fc and One Tvpical CMrn'S . Figure 1- The TUCC computer network and one typical campus 230 Spring Joint Computer C"onference, 1968 work and One Typical Campus" and Figure 2, "Disk Queuing for Job Flow between TUCC and a Typical Satellite Computer"). LAJ«lE J(I!S SlPERYISOR '-, Figure 2 - Disk queuing for job flow between TUCC and a typical satellite computer Although this loss rate has dropped considerably during 1966-67, further improvements in system reliability are mandatory for better user throughput, i.e., as experienced by the remote customer rather than as measured only by gross efficiency of the central computer. In many cases, job losses are attributable to unusual hardware-relat-ed event-g, e.g., "unforeseen status ~ig­ nals from communications lines. For our collection/processing/distributIon" programs, we have therefore developed three guidelines: • Make minimal changes to IBM programs, relying on IBM's gradual rehabilitation of their programming support into error-free status; • 'Restrict modifications to those subroutines whose alteration can produce important performance improvements; • Evolve from each current system into the next production system (rather than install sweeping changes), to minimize dead stops and lost jobs in the network. Operator-intervention losses OS/360 requires significant operator intervention for the following situations: 1. The system must be re-Ioaded after a dead stop. 2. A tape reel or disk pack must be mounted, possibly displacing another reel or pack. 3. A job has reached some logical dilemma which can only be resolved by the operator (e.g. no tape drives remain for a reel-mounting need). 4. A job requests' information/acknowledgment . from the operator, (e.g., authorization to over- write an unexpired data set - one normally retained until a pre-specified date). F or situation (1), TU CC has added significantly to IBM-supplied software, to furnish a faster, more re1iable checkpoint/restart facility based on a checkpoint hierarchy. 1\1ultiple checkpoints is a well-established concept, offered by several operating systems to permit flexible roll-back and restart operations. The TUCC hierarchy offers the following facility: large capacity core storage (LCS)* contains at all times the essential job-status data. Restart procedures incur no worse than the following inefficiencies: .' Jobsl currently being transmitted to TUCC must be re-transmitted, but only the last job from each terminal; • Job output being transmitted from TU CC must be re-transmitted, but only the last block of print images per terminal; • Jobs currently in process at TUCC can be restarted or skipped. The advantage of LCS for checkpoints is obvious; they can be taken more frequently than with disks, drums, or tapes with far less processing overhead. Furthermore, LSC is '·'safe." During a three-month evaluation of dead stops (10 to 15 per day), LCS checkpoints were always intact. This peculiar "safety" is attributable to the TUCC modifications to OS/360; most of LCS is "concealed" from the OS/360 mainstorage supervisor, which furnishes core blocks to the supervisor, compilers, and application programs. 
Only certain TUCC-written routines can access the upper 1700K-bytes of LCS, wherein are stored checkpoint data (and other functions described below). Altogether, reloading from dead stops has been made faster and less cumbersome for human operators. Time losses from intervention situations (2) - (4) above can be alleviated only by ample advance notice; mounting/dismounting tapes or disks is necessary, but painfully slow on a high-performance system. No job tickets accompany source decks submitted from remote terminals - tickets usually essential for rapid set up of private reels and disk packs in a non-satellite job-shop facility. 6 Recognizing this problem, TUCC has written "jobhold" logic into its job-manager subroutine, such that volume~setup messages are issued 100-200 seconds before each setup job is due for processing. Only * The IBM LCS is available in one- or two-million byte modules, either, uninterleaved or two-way interleaved. The TUCC system has two million bytes uninterleaved. Access time is 3/Ls, full cycle time is 8/Ls, the rental cost is 0.6¢/byte/month. By contrast, the fast core storage on a Model 75 is normally four-way interleaved, with an access time of O.4l-'s. full cycle time of 0.7 5J,LS , and a rental cost of 3. 7./byte/month. A Storage-Hierarchy System for Batch Processing when the operator has performed this setup to the system's satisfaction - correct volumes in correct status (e.g., a private data reel with ring out) on correct drives - will the job be dispatched. . No core storage or other resources are committed to "held" jobs other than their source-program image area on disk. The TUCC system allocates 28 million bytes of disk space to queue jobs awaiting processing; with the text-compression algorithm of Reference 7, this becomes effectively 80 million bytes, i.e.. , 1,000,000 card images. Based on our measurements of average source-deck size - approximately 300 cards - this suffices for 2500 jobs at point IN Q7 5 in Figure 2, assuming 25% aggregate track wastage. (Each job is allocated an integral number of tracks to improve scheduling flexibility and reduce job losses.) Approximately two days' work can be queued on a single disk pack, which has proved to be a more-thanadequate reservoir. System I/O inefficiencies We noted above that OS/360 performance suffers from heavy I/O activity supporting: a. Supervisory services. OPEN, CLOSE, and various interruption handlers have been divided into 1000-byte overlays which flow through a small number of core buffers. Typically 70K bytes of overlays flow through 5K-I0K bytes of core, each subroutine being overlaid when its core is reclaimed by another subroutine. b. System data sets, such as macros for the job-control function (PROCLIB) and disk workspace for the control program (SYSJOBQE); c. Multiple overlays of the job scheduler, linkage editor, and compilers. The demand rate for these overlays is extremely high in a multiple jobstream environment, and typical OS/360 systems spend a significant fraction of clock time awaiting their retrieval from secondary storage. 
System I/O inefficiencies

We noted above that OS/360 performance suffers from heavy I/O activity supporting:

a. Supervisory services. OPEN, CLOSE, and various interruption handlers have been divided into 1000-byte overlays which flow through a small number of core buffers. Typically 70K bytes of overlays flow through 5K-10K bytes of core, each subroutine being overlaid when its core is reclaimed by another subroutine.
b. System data sets, such as macros for the job-control function (PROCLIB) and disk workspace for the control program (SYSJOBQE).
c. Multiple overlays of the job scheduler, linkage editor, and compilers. The demand rate for these overlays is extremely high in a multiple-jobstream environment, and typical OS/360 systems spend a significant fraction of clock time awaiting their retrieval from secondary storage.

In References 8-11, TUCC described this system-I/O problem and presented its three-element strategy for alleviating the problem:

(i) Pseudo-readers, pseudo-punches, and pseudo-printers, which simulate real I/O devices using a combination of program logic, LCS, and disk queue space;
(ii) Pseudo-disk, which simulates a single-drive 2314 using a different combination of logic, LCS, and real disk storage;
(iii) A monitor for small nonsetup jobs, which represent an especially significant load on a multi-university system.

Two fundamental guidelines - based on the performance of our system and other large 360s - are as follows:

1. Only LCS is a sufficiently fast source and sink for system I/O on a 360/75 system. The fastest conventional rotating device (for S/360, the 2301 drum) interlocks processing too long for satisfactory CPU utilization under normal TUCC operating conditions.
2. After re-assigning most system I/O from disk/drum to pseudo-devices (in LCS), sufficient I/O activity has remained to permit additional job processing using conventional multiprogramming. At TUCC, multiple job streams use distinct pseudo-devices concurrently, each yielding CPU control when it must await an I/O completion (for a real device, of course).

Multiprogramming concepts are required in most third-generation operating systems, where a large CPU performs concurrent peripheral operations as well as multiple, unrelated jobs. TUCC modifications to OS/360 focus first on improving the speed of system I/O to improve aggregate throughput; by contrast, IBM's Multiprogramming with a Variable number of Tasks (MVT) option in OS/360 focuses first on multiprogramming. Our justification may be peculiar to our job mix and operating characteristics, although we suspect they resemble those of other universities and scientific establishments.

Unmodified, our disk-oriented OS/360 system spent 85%-90% of clock time in CPU-Wait state (i.e., awaiting the completion of one or more I/O events; this figure was obtained by numerous readings of a home-made CPU meter, which integrates the CPU-waiting signal over a one-minute interval). Clearly, running two job streams of similar characteristics could not reduce CPU-Wait below 70%.* Considering channel contention and unit contention for one-of-a-kind data sets like the processor library (LINKLIB), it seems improbable that CPU-Wait could be reduced below 80%.

* Twice the inefficiency (15%) of a single job stream.

Although our job mix and equipment are significantly different from TSS running on the 360/67, CPU-Wait under OS/360 resembles the "page-wait" problem analyzed by Nielsen,12 Lauer,13 and Smith.14 In particular, Nielsen discovered by simulation that a zero-latency disk - essentially a specification of the TUCC hyperdisk - would significantly raise the aggregate throughput of TSS on the 360/67. Lauer used LCS as a paging device after demonstrating that a conventional drum could not deliver pages at a rate to keep the CPU satisfactorily utilized. If CPU-Wait could be reduced to 60%, and if channel and device contention could be averted, two job streams should reduce CPU-Wait nearly to 20%. Three job streams could reduce CPU-Wait still further, although IBM claims to have demonstrated - by analyses and simulations not yet publicly available - that more than 3-4 job streams cannot raise the gross throughput of an average 360/75-class system running a conventional job-shop mix.
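The 70% figure follows from a simple bound, sketched below (ours): even if a second stream's compute overlapped perfectly with the first stream's I/O, total CPU activity could at best double.

```python
# Sketch: lower bound on CPU-Wait for k similar job streams, given
# that a single stream keeps the CPU busy only `busy` of the time.

def min_cpu_wait(busy, streams):
    return max(0.0, 1.0 - busy * streams)

print(min_cpu_wait(0.15, 1))   # 0.85 - as measured, 85-90% wait
print(min_cpu_wait(0.15, 2))   # 0.70 - best case for two streams
print(min_cpu_wait(0.40, 2))   # 0.20 - the "reduced to 60% wait" scenario
```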
We are currently measuring CPU-Wait against core requirements for each job stream. If CPU-Wait remains high, we will consider adding fast core storage so that additional job streams can utilize otherwise-idle CPU time. In this respect, the storage-hierarchy system and MVT concur on how to convert CPU-Wait to productive operation.

Early-1968 readings of CPU-Wait range between 30-35% with a single job stream, less than half the figure with unmodified OS/360.* More important is the complementary figure: CPU activity has been raised fivefold. Some of this activity supports locally-written code and adds to the CPU overhead of the total system. Indeed, 95% of the I/O operations in the system are now simulated by locally-written code: the pseudo-devices cited above and described in the following section. Nonetheless, the gross number of jobs per hour has been raised significantly: this is the true measure of system throughput. The addition of minor CPU overhead has acted as a catalyst to relieve CPU-Wait due to system I/O; this overhead is evaluated in Appendix C.

* The double-job-stream system shown in Figure 2 keeps CPU-Wait under 10%.

Elements of the TUCC storage-hierarchy system

Environment

As enumerated in the above section, the principal elements of the TUCC storage hierarchy are the checkpoint system, pseudo-devices, and small-job monitor. The remainder of this paper addresses the last two elements. The discussion will focus on system I/O, then discuss the relative suitability of fast core, LCS, disk, and drum storage for (a) supervisory code, (b) supervisory tables, (c) I/O buffers, (d) application-program code, and (e) application-program work-space.

The 360/75 CPU has a mean instruction time of 1 μs; the TUCC machine has a fast core of 524K bytes and a large, slower core of 2097K bytes. The two 2860 selector channels are each rated at 1300KB. Each is attached to a 2314 disk system, containing eight drives mounted with removable disk packs. The packs contain over 28M bytes apiece, aggregating a total on-line disk storage of 466M bytes. Each disk system can perform one 312KB data-transfer operation from one drive at a given instant. Five magnetic tapes, the card/printer system, and communications control units all interface to a third (multiplexor) channel. Since they can all operate simultaneously with less than 5% utilization of fast memory, they will not be further discussed here (cf. References 8-11 for details). This study is therefore restricted to core and disk storage, since effective CPU utilization under OS/360 depends principally on judicious allocation of functions to these media.

Basic postulates of this paper are as follows:

• IBM-supplied compilers, sorts, utilities, and application packages for OS/360 will not soon reduce/modify their usage of system I/O.
• For reasons of compatibility, documentation and maintenance, the TUCC community insists on using IBM-supplied compilers and other software - excepting only the WATFOR compiler15 and TSAR* - and makes little use of other customer-written, high-performance, reduced-function software. Thus, facilities like PUFFT16 or CORC/CUPL17 would have to offer significant functional advantages to displace IBM-supplied compilers in the TUCC community even if their compile-execute performance were markedly superior. WATFOR meets this test: it is functionally equivalent to FORTRAN G and offers approximately a 30:1 throughput advantage (mean job times of 0.5 seconds and 15 seconds respectively, for small student jobs).
• IBM will not soon change the relative price or relative performance of its fast-core, LCS, drum, or disk products.

* TSAR is a statistical data-retrieval package developed at Duke University. It furnishes simplicities not currently available in IBM's Scientific Subroutine Package, as well as significant performance superiority.

Pseudo-readers, pseudo-punches, and pseudo-printers

Application programs, compilers, and the control program itself are each allocated one pseudo-card reader, one pseudo-punch, and one pseudo-printer, furnishing the standard SYSIN, SYSPUNCH, and SYSOUT functions. These are simulated by the TUCC system using a combination of locally-written subroutines in the OS/360 nucleus, two disk packs (one accepting input from several dozen remote job entry (RJE) sources, one accumulating output for the terminals), and buffer storage in LCS.
• IBM wiH not soon change the relative price or relative performance of its fast-core, LCS, drum, or disk products. Pseudo-readers, pseudo-punches, and pseudo-printers Application programs, compilers, and the control program itself are each allocated one pseudo-card reader, one pseudo-punch, and one pseudo-printer, furnishing the standard SYSIN, SYSPUNCH, and SYSOUT functions. These are simulated by the TUCC system using a combination of locallywritten subroutines in the OS/360 nucleus, two disk packs (one accepting input fran several dozen remote *TSAR is a statistical data-retrieval package developed at Duke University. It furnishes simplicities not currently available in IBM's Scientific Subroutine Package, as well as significant performance superiority. A Storage-Hierarchy System for Batch Processing job entry (RJE) sources, one accumulating output for the terminals), and ·buffer storag~ in LCS. Pseudoreade·rs furnish images to the two TU CC job streams at over 1200 card images per second, and pseudoprinters at over 1000 print images per second. SYSIN/ SY.SPUNCH/SYSOUT are entirely CPU-limited, even for program loops which "drive" the pseudodevices as fast as .possible. Details· of the TUCC implementation can be found in Appendix B. 1/0 --t:I~-- Slf'ER'v'ISOO 10CC /oDD! FI CATI ONS The current allocation of LCS to the hyperdisk is 1650K, equivalent to 220 tracks. Ari arbitrary number of real tracks - currently 2400 - comprise its other storage area. The hyperdisk is thus represented to the user as a 40%-unavailable pack; in OS/360 nomenclature, the aggregate number of allocated and unallocated tracks in the "volume table of contents" is exactly 2400, divided into two major areas by the simulator, as shown in Figures 3 ("Core Map of a Hyperdisk Simulation U sing Only LCS"), 4 ("Hyperdisk Structure"), and 5 ("Sample Track Control Table and Chained Lists"): 1. The read-only area comprises frequently-used compiler and subroutine libraries. The simulator knows exactly where the read-only area ends and the write-read area begins. As each track is addressed by the user, the simulator consults its track control table: if the track is in LCS, the simulation is performed at once; if the track is not in LCS, the simulator must first retrieve the track from its (fixed) position within the 2400track extent on real disk. The retrieved track has the sa~e information content as the original track -- -71 1 IttJ\IJ TRACK fil INTO @/ N'PLi CATia. PIOOWI @ I II1lZlZZlZIIl7711ll1ll1ll ~ I ....... Hyperdisk The I/O supervisor CALLs the disk simulator for all service to a single-drive disk addressed over (non-existent) channel 4 of the TUCC system. This hyperdisk is "transparent" to its users; it services all legal channel programs using a combination of LCS and real disk storage. The flow of track images between LCS and disk is totally controlled by the disk simulator and is decoupled from processing activities of partitions using the hyperdisk. In other words, many partitions can simultaneously have I/O requests queued on hyperdisk, although the simulator is actively servicing only one at a time. In addition to completing legal channel programs, the disk simulator appropriately terminates illegal channel programs, e.g., for an invalid SEEK address, illegal sequence, incorrect length, etc. The only statuses which it never returns are, of course, "data check" or "channel check," indicating hardware failure. 
233 lCS Ii BuFFERS FOO PSEtJOO-REArERS I ETC, j On£R SYSTEM Fll'lCTIONS HvPERDISK TRACK-CCNTRQ TMlLE F- I . l) TAACK STORAGE .. N9. J . .... 1111111111111111111111111/1/11111111111/1/1111111111 ~_ -.J ------------------------Figure 3 - Core map of a hyperdisk simulation using only LCS 2316 DISK PACK Track Nos. .11 ....1,... 18 20 v -~~~!~--­ o C08UB L u. -FOmrS-- M E wasted VTOC :mm::: L~~~~!~ __ __ _~~I~!~ __ _ _~~I~!~ / - -- 1060 240 SYSUTI S L1NKLl8: Job scheduler, Fortran compilers, PL/I compfl er. Assemlers. Linkage editor, Util ities. etc. Y 5 SYSUT2 J o ----------------------------Remaining scratch 8 Q E ~~;:v~c:~ron~~s <-- - - ~Read-only data sets--=--""' ______ ~> ~Wrlte/read data s e t s - - - ; ) . / / ~ YTOC for hyperdlsk - ATTRIBUTES DATA SET /WE VTOC TRACK NOS. Pack type, size, etc. 18 - 19 20· - 39 ~ ALGLlB BLKSIZE-3625,RECfMoU !1! COBUB BLKSIZE=3625,RECFH=U 40 - S9 -: FORTLI8 BLKSlZE-3625.RECfJ4=U 60-99 I';: -' --.. g-- \; '- (i L1NklIB - ~ - - ~ 359 - 1059 BlKSIZE-3625,RECF... U ------r-----------------+---------- ~oO,--SY-S~--E----r-------------------------------------+--- 2_~ -2_4O_0~ SYSUTI (As determi ned by each user) __ __ aa • bb bb + I - cc SYSUT2 Figure 4 - H yperdisk structure 234 Spring Joint Computer Conference, 1968 TRACK CONTROL TABLE f-8 CONTROL BITS---41 ~24-BIT LCS ADDRESS OR 16-BIT TRACK ADDRESS~ I rOO 0 0 0 0 0 0 I 11 0 0 0 0 0 0 0 I 00 00 00 00 o0 0 0 0 0 0 0 I (LCS ADDRESS OF TRACK 1) (REAL-DISK ADDRESS OF TRACK 2) I (LCSA""'" OF TRACK 3) I 0 (=~~_~~~ ____ TRACK STORAGE THE LUFO AREA ._ --,. _. ___. _ . y ' - _ '- I AND CHAIN LUFO CHAIN FORWARD BACKWARD 4 4 \ ,0 ; 1° \ I 0" DATA STORAGE 7312 - BYTES <; .... ---.-- f -~J ---~~-I II N----\ttlt\ T:~ ___ ~ ___ / I T- r °t. ~ I C(HoIENTS (1) T1 WAS ACCESSED MORE RECENTLY THAN T4 (2) T4 WAS ACCESSED MORE RECENTLY THAN T3 (3) T2 IS ON THE CHAIN OF AVAILABLE TRACKS, I.E. ALL TRACKS NOT ON THE and pseudo-printers, a seek-minimizing algorithm selects the nearest available track in the writeread area on real disk. Expart, the small-job monitor In OS/360, core memory may be divided into two or more contiguous blocks (partitions) which process independent streams of work. To reduce the overhead for initiating/terminating small jobs - typically FORTRAN and PL/I debugging jobs - TUCC has written a monitor which: • Operates in a lOOK "express" partition independent of the full-function batch-processing partition: • Processes non setup jobs with small-memory needs; and • Receives control only when the batch-processing partition is in CPU-Wait. The express partition absorbs CPU-Wait from all higher-priority partitions, since its input/output is exclusively to pseudo-devices. LUFO CHAIN Experimental results and conclusions Figure 5 - Sample track control table and chained lists - - of the disk dataset; however, it has been re-formatted - during a once-only run of the simulator - to match the internal representation used by the simulator, i.e., inter-~ecord gaps of real tracks are represented by control fields in LCS. A longest-unused-first-out algorithm (LV FO) controls retention/overlaying of track areas in LCS. When all areas have been' allocated and a new track must be created (or retrieved from real disk), an existing track must be spilled or overlaid. For read-only data sets, the choice is obviously the. latter. The LUFO algorithm is as follows: as each in-LCS track image is referenced, it is promoted to the "head" of the LUFO chain, as shown in Figure 5. 
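The dispatching rule for the express partition reduces to a priority scan, sketched here (ours; the data layout is invented):

```python
# Sketch: give the CPU to the highest-priority partition not waiting
# on I/O; the low-priority express partition runs only when all
# higher-priority partitions are in CPU-Wait.

partitions = [
    {"name": "batch",   "priority": 2, "in_cpu_wait": True},
    {"name": "express", "priority": 1, "in_cpu_wait": False},
]

def dispatch(partitions):
    for p in sorted(partitions, key=lambda p: -p["priority"]):
        if not p["in_cpu_wait"]:
            return p["name"]
    return None     # the whole system is awaiting I/O

print(dispatch(partitions))   # -> "express": it absorbs the batch CPU-Wait
```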
Inactive tracks logically "drift" to the bottom of the LUFO chain. When a new track ar,ea is required, the bottom track is released, i.e., overlaId if read-only. 2. Write-read tracks are those above track-address 1060 in Figure 4; they are always written at least once before they are read. Except for the disk workspace of the control program (SYSJOBQE), they are released at the end of each job. If a write-read track must be spilled - because it has drifted to the bottom of the LUFO chain-it is written to an arbitrary track within the write-read area of the real disk. The relative address of the real track is retained in the track control table (Figure 5, "Sample Track Control Table and Chained Lists"), just as for pseudo-readers Benchmark Jobs Since June, 1966, TUCC has run benchmark jobs against each major new system (i.e., when new hardware or a new release of OS/360 or a major TUCC project is installed). 1bese benchmark jobs, the "Golden Deck" described in Appendix A, fairly represent our job mix. We evaluate our instantaneous throughput rate based on this deck, although the substantial improvements seen in Appendix A are not directly reflected in jobs/day statistics. We process only 12001500 jobs/day for the foHo\ving reasons: • A large daily investment in preventive maintenance and systems development is necessary for a communications-oriented system with ~ 105 transistors; • !he system runs out of work during the midnight shift; • An increasing number of jobs have entered "production" status, whereas most jobs in 1966-67 were debug runs. Evaluation of the performance improvements may be found in Section I I I of Appendix A. The core map Early in this paper, OS/360 was characterized - for the TUCC environment-as having unsatisfactory design points for the supervisor, compilers, etc. By adding our storage-hierarchy system to OS/360, A Storage-Hierarchy System for Batch Processing CPU-Wait due to system I/O has been substantially reduced. TABLE I-Core MapofTUCC System DECIMAL ADDRESS ELEMENT 0-80K I/O supervisor, other necessarilyresident interrupt handlers, and device simulators. 80-330K Batch-processing partition. 330-440K EXP ART partition. ,440-524K Executable code for job collection/ dissemination partition. (Beginning of LCS) ----------------------------------------------- 524-830K Buffers and control blocks for job collection/dissemination partition. 830-865K Supervisory tables accessed by binary-lookup subroutines. 865-935K Low-usage supervisor subroutines: OPEN, ABEND, WTO, etc. 935-2585K Hyperdisk 2585-2621K Checkpoint and accounting data The system elements in fast core seem justifiable, on the whole; they include all executable code except low-usage supervisor subroutines. Executing instructions out of LCS is possible but generally undesirahle; I-cycle timings are inflated from an average of O.4p,s to 4.5p,s. Low-usage supervisor subroutines are an exception to this general guideline; by their nature, they have few instruction loops (which are particularly undesirable in LCS). Each 1000-byte subroutine executes a few instructions (probably no more than 200), then yields control to the next subroutine or returns control to the caller. We contend that such subroutines are appropriately executed out of LCS, that I-cycle overhead is less than the overhead to retrieve them from LCS (or disk or drum) in'to fast core. To retrieve each supervi~or subroutine from hyperdisk would require 3000p,s (cf. 
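The break-even point cited above can be re-derived in two lines (a sketch; the per-instruction penalty is our reading of the 0.4 μs vs. 4.5 μs cycle timings):

```python
# Sketch: break-even instruction count for executing a supervisor
# subroutine out of LCS versus fetching it into fast core first.

lcs_cycle, fast_cycle   = 4.5, 0.4     # us per instruction cycle
retrieve_from_hyperdisk = 3000.0       # us per subroutine fetch

break_even = retrieve_from_hyperdisk / (lcs_cycle - fast_cycle)
print(break_even)   # ~730 instructions (the text rounds to 750);
                    # a 200-instruction call is cheaper run from LCS
```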
Of course, these supervisor subroutines could be included in fast core as part of the system nucleus. This strategy would obviate either LCS-execution overhead or retrieval-from-LCS overhead. However, the 70K core requirement would have to either (a) be subtracted from the 250K batch partition or (b) displace the EXPART partition. In case (a), many user programs would require re-programming into overlay structures (e.g., FORTRAN programs with large matrices). Possibly, an optimizing option of FORTRAN H would have to be constrained (or deck sizes constrained, equivalently). In case (b), TUCC would lose the important price/performance advantage of multiprogramming in a 524K-byte system. We note that IBM prices are approximately in the following ratio (lease price per byte per month):

    2314 : 2301 : LCS : fast-core = 1 : 40 : 250 : 1600

Most executable code in the job-collection partition has higher aggregate activity than the low-usage supervisor subroutines. Thus, the TUCC system actually takes advantage of the multiple-overlay structure of the supervisor to segregate high-usage from low-usage code. Using the LUFO algorithm described in an earlier section, the hyperdisk retains high-usage tracks for most non-supervisory code in LCS, whereas low-usage tracks drift back to disk storage - and remain there until they again become active. OS/360 compilers constrained to small design points have already segregated high-usage code and tables into resident segments of their overlays, whereas lightly-used compilation elements - both instructions and data - are diverted to disk. The TUCC system capitalizes on this a priori structuring.

Preliminary comparisons with standard, drum-oriented versions of OS/360 have confirmed the performance superiority - and price-performance superiority - of the TUCC system for a remote-job-entry university environment with a broad mix of student jobs, major compute-limited jobs, and I/O-limited file processing.

ACKNOWLEDGMENTS

The storage-hierarchy system is the work of 15 systems programmers at TUCC, of whom a number are employed by the Raleigh branch office of IBM. Helpful suggestions have been offered by numerous IBM development groups and OS/360 customers. The HASP project of IBM-Houston includes approximately the same pseudo-readers and pseudo-printers as the TUCC system; it was developed at the same time as ours and implemented several months earlier.
The central control unit of the atlas computer Proceedings IFIP Congress 1962 4 D MORRIS F H SUMNER M TWYLD An appraisal of the atlas supervisor Proceedings of the 22nd Annual ACM National Conference 1967 5 G H MEALY B I WITT W A CLARK Thefunctional structure ofOS/360 IBM System Journal Vol 5 No 1 1966 6 W C LYNCH Description of a high capacity, fast turnaround, university computing center Communications of the Association for Computing Machinery Vol 9 No 2 February 1966 7 G BENDER D N FREEMAN J D SMITH Function and design of DOS/360 and TOS/360 IBM SystansJournal Vol 6 No 1 Page 191967 8 D N FREEMAN (editor) Systems developmefu at TUCC: 1966-1967 Paper distributed at SHARE XXVIII San Francisco February 1967 9 D N FREEMAN Structure of TUCC linkage to three campuses: Hardware and software Reprints of the 6th Annual Southeastern Regional Meeting of ACM, June, 1967 10 J F WALKER OS/360 throughput problems Reprints ofthe 6th Annual Southeastern Regional Meeting of ACM June 1967 11 J W STEPHENSON JR Use of bulk core to improve system performance Reprints of the 6th Annual Southeastern Regional Meeting of ACM June 1967 12 N J NEILSON The simulation of time sharing systems Communications of the Association for Computing Machinery Vol 10 No 7 July 1967 13 H F LAUER Bulk core in a 360/67 time-sharing system FJCC Proceedings 1967 14 J L SMITH Multiprogramming under a page on demand strategy Communications of the Association for Computing Machinery Vol 10 No 10 October 1967 15 P W SCHANTZ et al Watfor- The University of Waterloo Fortran IV compiler Communications· of the Association for Computing Machinery Vol 10 No I January 1967 16 S ROSEN R A SPURGEON J K DONNALLY PUFFT - The Purdue University fast Fortran translator Communications of the Association for Computing Machinery Vol 8 No 11 November 1965 17 R W CONWAY W L MAXWELL . CORC - The Cor.nell computing languages Communications of the Association for Computing Machinery Vol 6 No 6 June 1963 18 G A BLAAUW F P BROOKS JR The structure of System/360 Part I: Outline of the logical structure IBM Systems Journal Vol 3 No 2 1964 APPENDIX A GOLDEN DECK THviINGS I. DESCRIPTION OF RUNS A. RUNCO (130 source cards) COBOL (COBOL-E: Procedure: pile, link, go) Compile: source listing data division map procedure division map diagnostics Link-Edit: cross reference map 500 lines of print output Go: com- B. CTEST (2600 source cards) COBOLLNK (COBOL-E: Procedure: compile,_li~k) Compile: source listing data division map hex code listing diagnostics Link-Edit: cross reference map C. TIME (500 source cards) Procedure: Compile: Link-Edit: Go: ASIvtLNKGO (ASS EMBLER-F: assemble, link, go) source listing relocation dictionary cross reference table cross reference table 50 lines output 120 cards input D. PLI T 1 (200 source cards) Procedure: Compile: Link-Edit: Go: PLI (PL/I-F: compile, link, go) source listing diagnostics diagnostics only no output 8 cards input E. PLICP (1100 source cards) Procedure: Compile: PLICMP (PL/I-F: diagnostics compile) F. AFORT (1200 source cards) FORTCMP (FORTRAN-E: Procedure: compile) Compile: diagnostics oniy A Storage-Hierarchy System for Batch Processing G. BFORT (775 source cards) Procedure: FTLNKGO (FORTRAN-E: compile, link, go) source listing Compile: storage map diagnostics cross reference table Link-Edit: 220 lines of print output Go: 130 cards of input H. TSAR Go: 237 no input K. 
I. SPLI (null PL/I, 4 source statements)
Procedure: PL1 (PL/I-F: compile, link, go)
Compile: source listing, diagnostics
Link-Edit: diagnostics only
Go: no input

J. SAMB (null assembly, 3 source statements)
Procedure: ASMLNKGO (ASSEMBLER-F: assemble, link, go)
Compile: source listing, relocation dictionary, cross reference table
Link-Edit: cross reference table

K. SFORT (null FORTRAN, 2 source statements)
Procedure: FORTRAN (FORTRAN-E: compile, link, go)
Compile: source listing, storage map, external references, diagnostics
Link-Edit: diagnostics
Go: no input (no source)

II. TIMINGS

The test environment is indicated below and coded as follows: 1st position - S/360 model number; 2nd position - OS/360 release number; 3rd position - SYSIN device; 4th position - SYSOUT device; 5th position - locally written systems software. The following abbreviations are used: C - card reader (2540); P - 1403 printer (Model N1, universal character set); T - 60KB tape; D14 - 2314 disk; D14M - 2314 disk with SYSIN disk-to-disk data movement; H - HYPERDISK (pseudo-disk in LCS); D - directories in LCS; F - LCS FETCH. Times are given in seconds.

RUNCO, CTEST, TIME and PLIT1 were each run in the following environments:

40,2,C,P
40,3,C,P
75,6,C,P
75,6,T,P
75,6,T,P (print train)
75,6,T,T
75,6,T,T,D
75,6,T,T,DF
75,9,T,T,DF
75,11,D14M,D14,DH (2303 LINKLIB)
75,11,D14M,D14,DH (HDSK LINKLIB)

Timing figures for these four runs (COMPILE, LINK, GO and *JOB TIME columns):

84 102 73 74 70 60 83 30 19 10 09 50 29 29 25 24 21 14 694 697 556 558 463 166 140 126 157 152 74 74 66 851 849 630 632 529 61 52 201 178 97 96 07 06 104 102 1080 403 256 240 211 40 43 26 27 23 567 690 479 471 383 1680 1136 761 739 617 121 19 325 465 39 33 04 03 152 188 170 168 79 79 69 49 43 32 24 117 96 45 45 41 41 36 30 1540 1549 92 92 90 1827 1813 216 216 200 89 168 88 136 1 1 11 04 04 236 60 163 12 11 08 08 05 05 96 115 52 38 19 18 16 15 08 08 89 89 115 114
PLICP (COMPILE and *JOB TIME, in seconds)
40,2,C,P: 933 933
40,3,C,P: 353 353
75,6,C,P: 71 71
75,6,T,P: 59 59
75,6,T,P (print train): 53 53
75,6,T,T: 45 45
75,6,T,T,D: 40 40
75,6,T,T,DF: 25 25
75,9,T,T,DF: 27 27
75,11,D14M,D14,DH (2303 LINKLIB): 16 16
75,11,D14M,D14,DH (HDSK LINKLIB): 13 13

AFORT (COMPILE and *JOB TIME, in seconds)
40,2,C,P: 303 303
40,3,C,P: 185 185
75,6,C,P: 172 172
75,6,T,P: 72 72
75,6,T,P (print train): 69 69
75,6,T,T: 78 78
75,6,T,T,D: 65 65
75,6,T,T,DF: 34 34
75,9,T,T,DF: 28 28
75,11,D14M,D14,DH (2303 LINKLIB): 12 12
75,11,D14M,D14,DH (HDSK LINKLIB): 09 09

BFORT (same eleven environments; COMPILE column): 355 181 116 116 109 61 50 26 21 10 08
Remaining BFORT figures (LINK, GO, *JOB TIME): 64 75 37 37 31 539 542 52 52 48 958 798 205 205 188 27 20 18 05 05 26 22 22 19 19 108 68 61 34 32

TSAR (GO and *JOB TIME, in seconds)
75,6,C,P: 86 86
75,6,T,P: 78 78
75,6,T,P (print train): 86 86
75,6,T,T: 29 29
75,6,T,T,D: 27 27
75,6,T,T,DF: 21 21
75,9,T,T,DF: 16 16
75,11,D14M,D14,DH (2303 LINKLIB): 11 11
75,11,D14M,D14,DH (HDSK LINKLIB): 11 11

SPLI, SAMB and SFORT were each run in the nine environments from 75,6,C,P through 75,11,D14M,D14,DH (HDSK LINKLIB). Figures for these runs (COMPILE, LINK, GO and *JOB TIME columns):

35 35 31 28 28 24 70 70 61 58 50 29 22 13 08 05 03 21 14 12 05 05 07 07 06 05 05 02 02 01 01 09 35 35 32 25 25 21 04 04 03 64 64 56 26 17 03 46 11 07 06 08 04 03 01 01 01 20 12 10 24 11 26 26 23 22 18 12 07 03 ... 26 26 22 22 19 11 10 04 07 07 06 06 05 05 03 01 04 01 59 59 51 50 42 28 20 08 08

* Does not include termination time for each job. Prior to the installation of HYPERDISK this time was approximately 02 seconds/job. Termination time with HYPERDISK has been reduced to approximately 01 to 01.5 seconds/job.

III. DISCUSSION OF THE TIMINGS

RUNCO initially compiled only slightly faster on the 360/75 than on our original 360/40; assigning SYSIN/SYSOUT to tape reduced compilation from 73 to 60 seconds. Moving SVCLIB and LINKLIB into LCS halved this time; this move was our first attempt at a hierarchical system and has been documented elsewhere.11 The improved Release-9 compiler and control program halved the time again, and the pseudo-printer/hyperdisk additions reduced compile time to 8.4 seconds. Similarly, link-edit time has been reduced 5:1 from the tape-in-tape-out (TITO) time of 24 seconds for the (non-hierarchical) Release-6 system. Since the linkage editor requires no card/tape input and writes few SYSOUT lines, the gains are attributable primarily to the hyperdisk (and its precursor, which serviced only SVCLIB and LINKLIB). Aggregate job time has been reduced by 6:1 from a TITO system.

CTEST - This COBOL job requires voluminous SYSIN/SYSOUT during the compilation step; the link-edit step has been reduced by approximately 10:1 over the TITO system.

TIME - This job evaluates supervisory services offered by OS/360 such as OPEN, CLOSE, GET, PUT, etc.
The compilation time has been reduced 3: lover the TITO system. PLICP- This non-trivial PL/I compilation has been reduced 3: 1 from the TITO system; the compilation rate is over 5000 statements per minute. AFORT - This non-trivial, multiple-compilation FORTRAN-Ejob"has been reduced over 8: I from the TITO system; the compilation rate is 8000 statements per minute. BFORT, TSAR, etc. - The remaining benchmark jobs show comparable improvements. The Release-II runs were performed during concurrent job-collection/dissemination, whereas all previous timing runs were performed with communications equipment (and other I/O irrelevant to the experiments) turned off. Thus, the Release-II figures are somewhat inflated due to interrupt-servicing and buffer manipulations unrelated to the timing tests. APPENDIX B -IMPLEMENTATION DETAILS FOR THE PSEUDO-READERS, PSEUDO-PUNCHES, AND PSEUDO-PRINTERS Within the I/O supervisor of OS/360 (in resident nucleus), there are approximately ten incidences of the following privileged-instructions: 18 Start I/O, Test I/O, Halt I/O, and Test Channel. TUCC has replaced each instruction with a CALL to an I/O filter subroutine, which determines if the operation is to be attempted on a real device or a pseudodevice. In the former case, the I/O instruction is issued by the subroutine, which then returns control to the I/O supervisor. In the latter case, the channel command words (CCWs) are interpreted by a cardreader simulator, print simulator, or punch simulator, as distinguished by the device address (a parameter of the I/O filter subroutine). Each card-reader interpretation comprises the following events: 1. An input buffer pool in LCS is tested for availability of the next card image from this pseudoreader. 24! 2. If the next image is available, it is decompressed and moved from LCS to the target area in fast core specified by the READ CCW. "Decompression" is "restoration of blanks" as in Reference 7; when the card image was originally captured by the TU CC computer network - possibly at the central computer, possibly at an "intelligent" satellite - it was scanned for string~ of at least 2 blanks, each of which being replaced by a one-byte control field. The card image is retained in this compressed representation untii the instant it is "read" by the reader simulator. This technique conserves LCS and disk storage throughout the queuing network depicted in Figure 2. 3. After the next card image has been moved out of LCS, control information for the buffer pool is appropriately updated. If the current buffer* is emptied, the track-manager subroutine is notified to refill the buffer. 4. The reader-simulator returns control to the I/O supervisor, indicating that the READ was instantly and perfectly performed. 5. At step (2), if no card images are available - because the track manager has fallen behind the pseudo-reader or because the system is out of work - the job-processing partition requesting the card image is idled until a fresh buffer is retrieved. Anticipatory buffering keeps two or three tracks of card images in LCS, so that the pseudo-reader rarely goes idle; statistics are not yet available on this phenomenon. Pseudo-printers and pseudo-punches operate in an analogous fashion: 1. As each job-processing partition requests printing/punching of an image, the I/O supervisor issues a CALL to the I/O filter subroutine. 2. The image is compressed and inserted into an LCS buffer. 3. 
Buffer-pool control information is updated; if an output buffer is filled, the track manager is notified to write the track image to disk. On the current TUCC system, one pack is used to queue SYSIN and one pack to queue SYSOUT. The track manager totally controls the status of these 8,000 tracks, using occupancy tables in LCS to record the tracks for each job and the queue of jobs arriving from each satellite. To conserve access-mechanism motion, a simple seek-minimizing algorithm is used to allocate tracks, viz., WRITE into that available track nearest the current mechanism position. Visual observation of the SYSIN/SYSOUT packs shows low mechanism activity, even when several *One disk track can hold 7294 bytes" 242 Spring Joint Computer Conference, 1968 5KB communications lines and two job-processing partitions are contending for a single mechanism. This activity is kept low primarily (a) by the high information density per track, averaging 240 card images or 180 print lines, and (b) by the seek-minimizing algorithm. After processing, jobs can wait indefinitely in the output queue for each satellite; sizeable tables are required to service terminals which submit, say, 100 jobs on Friday and do not again establish contact with TUCC until Monday. However, this resulting convenience of operation is much esteemed by the satellite installations, and TU CC will remain the principal queue point of the complex for the foreseeable future. APPENDIX C DETAILED TIMINGS OF THE HYPERDISK A. SIMULATOR OVERHEAD To measure the overhead due to the I/O-filter and disk-simulator subroutines, the following experiments were performed: 1. 1000 "no-operation" (NOP) CCWs were issued to a real disk to determine EXCP/WAIT/interrupt overhead. Repeated measurements furnished a low-variance average of 1180p.s for an EXCP/WAIT sequence on a 360/75. This is used hereafter as a base figure. 2. 1000 NOP CCWs to the hyperdisk averaged 1550p.s, i.e., £!. simulator overhead of 370p.s per EXCP/WAIT sequence. 3. Three 1000-event experiments were performed to time READ and SEARCH overheads for the disk simulator. a. To read the first one-byte block on each track required an additionaI210p.s. To read a block of .TV bytes required ~n additional IV;;-s, since LCS operates at 1MB for block transfers. (If the disk-simulator were used with two-way-interleaved LCS - an extra-cost option - this incremental time would be reduced to O.5Np.s.) Note that the average time to read a block of 1\1 byt~s from real disk is (12500 + 3. 12N)p.s ; the average time from a 2301 drum is (8600 + 0.8N)p.s. Representative values are as follows: * b. To read the lQth one-byte record on each track required 280p.s more than to read the first record. Thus, the time to search the identification of one record unsuccessfullyIII OS/360 nomenclature, "SRCHID=" *The EXCP/WAIT overhead for servicing a drum in OS/360 is t 80J.ts Jess (on a 360/75) than for disk, since the "stand-alone SEEK" is unnecessary. The hyperdisk behaves like a drum in this respect. TABLE II-Average EXCP/WAITTime (ILX) on a 360/75 with 2314 Disk, 2301 Drum, and Hyperdisk ~, 2301 DRUM HYPERDIS~ I .... 80 13,930 9,344 1,840 1000 16,800 10,080 2,760 3000 23,040 11,680 4,760 followed by "TIC" - is approximately 30p.s. This is a negiigibie effect for hyperdisk performance, since there are rarely more than 40 blocks per disk track. The hyperdisk (arbitrariiy) begins each search with the first record on a track. 
APPENDIX C - DETAILED TIMINGS OF THE HYPERDISK

A. SIMULATOR OVERHEAD

To measure the overhead due to the I/O-filter and disk-simulator subroutines, the following experiments were performed:

1. 1000 "no-operation" (NOP) CCWs were issued to a real disk to determine EXCP/WAIT/interrupt overhead. Repeated measurements furnished a low-variance average of 1180 μs for an EXCP/WAIT sequence on a 360/75. This is used hereafter as a base figure.

2. 1000 NOP CCWs to the hyperdisk averaged 1550 μs, i.e., a simulator overhead of 370 μs per EXCP/WAIT sequence.

3. Three 1000-event experiments were performed to time READ and SEARCH overheads for the disk simulator.

a. To read the first one-byte block on each track required an additional 210 μs. To read a block of N bytes required an additional N μs, since LCS operates at 1 MB per second for block transfers. (If the disk simulator were used with two-way-interleaved LCS - an extra-cost option - this incremental time would be reduced to 0.5N μs.) Note that the average time to read a block of N bytes from real disk is (12,500 + 3.12N) μs; the average time from a 2301 drum is (8,600 + 0.8N) μs. Representative values are shown in Table II.*

TABLE II - Average EXCP/WAIT Time (μs) on a 360/75 with 2314 Disk, 2301 Drum, and Hyperdisk

N        2314 DISK    2301 DRUM    HYPERDISK
80       13,930       9,344        1,840
1000     16,800       10,080       2,760
3000     23,040       11,680       4,760

b. To read the 10th one-byte record on each track required 280 μs more than to read the first record. Thus, the time to search the identification of one record unsuccessfully - in OS/360 nomenclature, "SRCHID=" followed by "TIC" - is approximately 30 μs. This is a negligible effect for hyperdisk performance, since there are rarely more than 40 blocks per disk track. The hyperdisk (arbitrarily) begins each search with the first record on a track.

*The EXCP/WAIT overhead for servicing a drum in OS/360 is 500 μs less (on a 360/75) than for disk, since the "stand-alone SEEK" is unnecessary. The hyperdisk behaves like a drum in this respect.

The figures cited for the disk and drum in the above table assume half-track latency, on the average. The hyperdisk figures should therefore be somewhat increased to reflect simulated half-track searches: 2,440 instead of 1,840; 2,850 instead of 2,760; and 4,790 instead of 4,760.

B. REAL DISK I/O WITHIN THE HYPERDISK

To determine the average level of LCS activity vs. real-disk activity within the hyperdisk, we inserted counters at over 50 points within the simulator. The gross overhead due to counter activity is less than 0.01% of clock time. Four recent readings of the counters are essentially in agreement: with 220 track images in LCS, less than 0.8% of the channel programs directed to the hyperdisk necessitated reading of a real track. For example, during one 50-minute period, 162,000 SIOs were issued to the hyperdisk (of which half were for data-transfer operations); this produced only 612 full-track READs and 38 WRITEs within the hyperdisk. These WRITEs were, in fact, for the 40-track SYSJOBQE data set,* which is formatted in LCS when the system is initially loaded. The READs were, of course, directed to the read-only data sets depicted in Figure 4. Included in the jobs were assemblies, compilations in all languages, and link-edits.

*Since the original measurements were made, SYSJOBQE has been increased from 40 to 100 tracks.

C. CPU OVERHEAD DUE TO THE HYPERDISK

During the 50-minute experiment cited just above, approximately 81,000 non-trivial channel programs were directed to the hyperdisk. The aggregate number of interpreted CCWs was 940,000, and 69,000,000 bytes were moved by the simulator to/from LCS. Assuming 1760 μs per non-trivial channel program and 50 μs per interpreted CCW, plus 1 μs per byte moved from/to LCS, the aggregate overhead was as follows:

143 seconds for EXCP + WAIT + I/O filter
47 seconds for CCW interpretation
69 seconds for data movement
259 seconds of 3000 seconds clock time

Thus, 8.6% of clock time was spent servicing the hyperdisk, including all I/O-supervisor overhead (cf. Neilson's results12).
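The arithmetic behind these figures is easy to reproduce; the little program below (not part of the paper) simply multiplies out the per-event costs assumed in the text.

    #include <stdio.h>

    int main(void)
    {
        double excp  = 81000.0    * 1760e-6;  /* EXCP + WAIT + I/O filter */
        double ccw   = 940000.0   * 50e-6;    /* CCW interpretation       */
        double moved = 69000000.0 * 1e-6;     /* LCS data movement        */
        double total = excp + ccw + moved;
        printf("%.0f + %.0f + %.0f = %.0f s of 3000 s (%.1f%%)\n",
               excp, ccw, moved, total, 100.0 * total / 3000.0);
        return 0;   /* prints: 143 + 47 + 69 = 259 s of 3000 s (8.6%) */
    }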
Burroughs' B6500/B7500 stack mechanism

by E. A. HAUCK and B. A. DENT
Burroughs Corporation
Pasadena, California

INTRODUCTION

Burroughs' B6500/B7500 system structure and philosophy are an extension of the concepts employed in the development of the B5500 system. The unique features, common to both hardware systems, are that they have been designed to operate under the control of an executive program (MCP) and are to be programmed in only higher level languages (e.g., ALGOL, COBOL, and FORTRAN). Through a close integration of the software and hardware disciplines, a machine organization has been developed which permits the compilation of efficient machine code and which is addressed to the solution of problems associated with multiprogramming, multiprocessing and time sharing. Some of the important features provided by the B6500/B7500 system are dynamic storage allocation, re-entrant programming, recursive procedure facilities, a tree structured stack organization, memory protection and an efficient interrupt system. A comprehensive stack mechanism is the basic ingredient of the B6500/B7500 system for providing these features.

B6500/B7500 processor

The command structure of the B6500/B7500 Processor is Polish string, which allows for the separation of program code and data addresses. The basic machine instruction is called an operator syllable. This operator syllable is variable in length, from a minimum of 8 bits to a maximum of 96 bits. In the interest of code compactness, more frequently used operator syllables are encoded in the 8 bit form.

The Processor is provided with a hardware implemented stack in which to manipulate data and store dynamic program history. Also, data may be located in arrays outside the stack and may be brought to the stack temporarily for processing. Program parameters, local variables, references to program procedures and data arrays are normally stored within the stack.

The data word of the B6500/B7500 Processor is 51 bits long. Data are transferred between memory and within the Processor in 51-bit words. The first 3 bits of the word are used as tag bits, which serve to identify the various word types as illustrated in Fig. 1. The remaining 48 bits are data. Tag bits, in addition to identifying word type, provide the B6500/B7500 Processor with two unique features: (1) data may be referenced as an operand, with the processor worrying about whether the operand consists of one or two words, and (2) system integrity and memory protection are extended to the level of the basic machine data words. If a job attempts to execute data as program code, or to modify program code, the system is interrupted.

Figure 1 - B6500/B7500 word formats: data words (single and double precision exponent/mantissa forms), descriptor words (data and program descriptors with P, C, and R bits and 20-bit LENGTH and ADDRESS fields), and special control words (MSCW, PCW, RCW, and IRW/IRWS, carrying stack number, displacement, and address-couple fields)
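The tag check that backs this protection rule can be pictured in a few lines. Only the 3-tag-bits-plus-48-data-bits shape is taken from the text; the tag values below are placeholders, since the actual encodings appear only in Figure 1.

    #include <stdint.h>

    typedef uint64_t b6500_word;                 /* low 51 bits used */

    #define TAG(w)       ((unsigned)((w) >> 48) & 07u) /* top 3 of 51 bits */
    #define TAG_PROGRAM  03u                     /* placeholder code-word tag */

    /* Memory-protection rule from the text: executing data as program
     * code, or writing into program code, interrupts the system. */
    int fetch_for_execution_ok(b6500_word w)
    {
        return TAG(w) == TAG_PROGRAM;   /* 0 => interrupt the job */
    }

    int store_permitted(b6500_word old_contents)
    {
        return TAG(old_contents) != TAG_PROGRAM;
    }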
The stack

The stack consists of an area of memory assigned to a job. This stack area serves to provide storage for basic program and data references associated with the job. In addition, it provides a facility for the temporary storage of data and job history. When the job is activated, four high speed registers (A, X, B and Y) are linked to the job's stack area (Fig. 2). This linkage is established by the stack pointer register (S), which contains the memory address of the last word placed in the stack memory area. The four top-of-stack registers (A, X, B and Y) function to extend the job's stack into a quick access environment for data manipulation.

Figure 2 - Top of stack and stack bounds registers

Data are brought into the stack through the top-of-stack registers. The stack's operating characteristic is such that the last operand placed into the stack is the first to be extracted. The top-of-stack registers become saturated after having been filled with two operands. Loading a third operand into the top-of-stack registers causes an operand to be pushed from the top-of-stack registers into the stack memory area. The stack pointer register (S) is incremented by one as each additional word is placed into the stack memory area, and is, of course, decremented by one as a word is withdrawn from the stack memory area and placed in the top-of-stack registers. As a result, the S register continually points to the last word placed into the job's stack memory area.

A job's stack memory area is bound, for memory protection, by two registers, the Base of Stack (BOS) register and the Stack Limit (SL) register. The contents of the BOS register define the base of the stack area, and the SL register defines the upper limit of the stack area. The job is interrupted if the S register is set to the value contained in either SL or BOS.

The contents of the top-of-stack registers are maintained automatically by the processor hardware in accordance with the environmental demands of the current operator syllable. If the current operator syllable demands that data be brought into the stack, then the top-of-stack registers are adjusted to accommodate the incoming data, and the surplus contents of the top-of-stack registers, if any, are pushed into the job's stack memory area. Words are brought out of the job's stack memory area and pushed into the top-of-stack registers for operator syllables which require the presence of data in the top-of-stack registers, but do not explicitly move data into the stack.

Top-of-stack registers operate in an operand oriented fashion as opposed to being word oriented. Calling a double precision operand into the top-of-stack registers implies the loading of two memory words into the top-of-stack registers. The first word is always loaded into the A register, where its tag bits are checked. If the word has a double precision tag, a second word is loaded into X. The A and X registers are then concatenated to form a double precision operand image. The B and Y registers concatenate when a double precision operand is moved to the B register. The double precision operand splits back to single words as it is pushed from the B and Y registers into the stack memory area. The reverse process is repeated when the double precision operand is eventually popped up from the stack memory area back into the top-of-stack registers.
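The push-pop discipline just described can be modelled compactly. The sketch below keeps only the single-precision behavior - A and B acting as an operand cache over the stack memory area, with S pointing at the last word stored - and omits the X/Y concatenation, the tag checks, and the SL/BOS bounds interrupts; all sizes are arbitrary.

    #include <assert.h>

    #define STACKSZ 1024

    typedef struct {
        long mem[STACKSZ];     /* stack memory area                   */
        int  s;                /* S register: index of last word used */
        long a, b;             /* top-of-stack registers              */
        int  occupied;         /* 0, 1 or 2 operands held in A/B      */
    } stack_t;                 /* initialize with s = -1, occupied = 0 */

    void push(stack_t *st, long v)
    {
        if (st->occupied == 2) {           /* registers saturated:      */
            st->mem[++st->s] = st->b;      /* spill B into memory, S+=1 */
            st->occupied = 1;
        }
        st->b = st->a;                     /* old top moves down to B   */
        st->a = v;
        st->occupied++;
    }

    long pop(stack_t *st)
    {
        assert(st->occupied > 0);          /* underflow check           */
        long v = st->a;
        st->a = st->b;
        if (st->occupied == 2 && st->s >= 0)
            st->b = st->mem[st->s--];      /* refill B from memory, S-=1 */
        else
            st->occupied--;
        return v;
    }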
Data addressing

Three mechanisms exist within the B6500/B7500 Processor for addressing data or program code: (1) Data Descriptor (DD)/Segment Descriptor (SD), (2) Indirect Reference Word (IRW), and (3) Stuffed Indirect Reference Word (IRWS). The Data Descriptor (DD) and Segment Descriptor (SD) are B5500 carryovers and provide the basic mechanism for addressing data or program segments which are located outside of the job's stack area. The basic addressing component of the descriptor is an absolute machine address. The Indirect Reference Word (IRW) and the Stuffed Indirect Reference Word (IRWS) are B6500/B7500 mechanisms for addressing data located within the job's stack memory area. The addressing component of both the IRW and IRWS is a relative address. The IRW is used to address within the immediate environment of the job's stack, and addresses relative to a display register (described later in Non-local addressing). The IRWS is used to address beyond the immediate environment of the current procedure, and addresses relative to the base of the job's stack. Addressing across stacks is accomplished with an IRWS.

The descriptor

In general, the descriptor functions to describe and locate data or program code associated with a given job. The Data Descriptor (DD) is used to fetch data to the stack or store data from the stack into an array which resides outside the job's stack area. The format of Data and Segment Descriptors is illustrated in Fig. 1. The ADDRESS field of both descriptors is 20 bits in length and contains the absolute address of an array in either main system memory or in the backup disk store. The Presence bit (P) indicates whether the referenced data are present in main system memory or in the back-up disk store, and is set equal to ONE when the referenced data are present in main system memory. A Presence Bit Interrupt is incurred when the job makes reference to data via a descriptor which has a P bit equal to ZERO. The Presence Bit Interrupt stimulates the operating system (called the Master Control Program, or MCP) to move the data from disk to main memory. The data location on disk is contained in the ADDRESS field of the DD when the P bit is equal to ZERO. After transferring the data array into main memory, the operating system (MCP) marks the descriptor present by setting the P bit equal to ONE, and places the current memory address into the ADDRESS field of the descriptor. The interrupted job is then reactivated.

A Data Descriptor may describe either an entire array of data words, or a particular element within an array of data words. If the descriptor describes an entire array, the Indexed bit (I-bit) in the descriptor is ZERO, indicating that the descriptor has not yet been indexed. The LENGTH field of the descriptor defines the length of the data array. A particular element of an array may be described by indexing an array descriptor. Memory protection is insured during indexing operations by performing a comparison between the LENGTH field of the descriptor and the index being applied to it. An Invalid Index Interrupt is incurred if the index value exceeds the length of the memory area defined by the descriptor. If the value being used to index the descriptor is valid, the LENGTH field of the descriptor is replaced by the index value. At this time the I-bit in the descriptor is set to ONE to indicate that indexing has taken place. The ADDRESS and LENGTH fields are added together to generate an absolute machine address whenever a present, indexed Data Descriptor is used to fetch or store data.

The Double Precision bit (D) is used to identify the referenced data as being either single or double precision and, as a result, is also associated with the indexing operation. The D bit being equal to ONE signifies double precision and implies that the index value be multiplied by two before indexing. The Read-Only bit (R) specifies that the memory area described by the Data Descriptor is a read-only area. An interrupt is incurred upon referencing an area through a descriptor with the intention to write if the R bit is equal to ONE.

The Copy bit (C) identifies a descriptor as being a copy of a master descriptor and is related to the presence bit action. The intent of the copy action is to keep multiple copies of an absent descriptor linked back to one master descriptor. Copy action is incurred when a job attempts to pass by name an absent Data Descriptor. When this occurs, the hardware manufactures a copy of the master descriptor, forces the C bit equal to ONE, and inserts into the ADDRESS field the address of the master descriptor. Thus, multiple copies of absent descriptors are all linked back to the master descriptor.
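A minimal sketch of the indexing and fetch protocol just described, under the assumption that the descriptor fields can be modelled as a C struct; the real descriptor is a 48-bit word plus tag, laid out as in Figure 1.

    typedef struct {
        unsigned present  : 1;   /* P: data in core (1) or on disk (0) */
        unsigned indexed  : 1;   /* I: descriptor already indexed?     */
        unsigned dblprec  : 1;   /* D: double-precision operands       */
        unsigned readonly : 1;   /* R: read-only area                  */
        unsigned length   : 20;  /* array length; index once I = 1     */
        unsigned address  : 20;  /* absolute core (or disk) address    */
    } data_desc;

    enum { OK, INVALID_INDEX, PRESENCE_BIT };

    /* Index the descriptor: on success the LENGTH field is replaced by
     * the (possibly doubled) index and the I bit is set. */
    int index_descriptor(data_desc *dd, unsigned idx)
    {
        if (dd->dblprec)
            idx *= 2;                 /* D bit: two words per element */
        if (idx >= dd->length)
            return INVALID_INDEX;     /* Invalid Index Interrupt      */
        dd->length  = idx;
        dd->indexed = 1;
        return OK;
    }

    /* Effective address = ADDRESS + LENGTH for a present, indexed DD. */
    int effective_address(const data_desc *dd, unsigned *addr)
    {
        if (!dd->present)
            return PRESENCE_BIT;      /* MCP must bring data into core */
        *addr = dd->address + dd->length;
        return OK;
    }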
Non-local addressing

The most important single aspect of the B6500/B7500 stack is its facility for storing the dynamic history of a program under execution. Two lists of program information are saved in the B6500/B7500 stack, the stack history list and the addressing environment list. The stack history list is dynamic in nature, varying as the job is driven through different program paths with changing sets of data. Both lists are generated and maintained by the B6500/B7500 hardware system.

The stack history list is formed from a list of Mark Stack Control Words (MSCW) which are linked together by their DF fields (Fig. 3). A MSCW is inserted into the stack as a procedure is entered, and is extracted as that procedure is exited. Therefore, the stack history list grows and contracts in accordance with the procedural depth of the program. Mark Stack Control Words serve to identify the portion of the stack related to each procedure. When the procedure is entered, its parameters and local variables are entered in the stack following the MSCW. When executing the procedure, its parameters and local variables are referenced by addressing relative to the location of the related MSCW.

Figure 3 - Stack history and addressing environment list

Each MSCW is linked back to the prior MSCW through the contents of its DF field to identify the point in the stack where the prior procedure began. When a procedure is exited, its related portion of the stack is discarded. This action is achieved by setting the stack pointer register (S) to point to the memory cell preceding the most recent MSCW (Fig. 4). This top-most MSCW, pointed to by another register (F), is in effect deleted from the stack history list by causing F to point back at the prior MSCW, thereby placing it at the head of the stack history list.

Figure 4 - Stack cut-back operation on procedure exit

This concept is implemented in the Burroughs' B5500 system, and it provides a convenient means to handle subroutine entry and exit. But this mechanism alone also gives rise to one of the most serious limitations of the ALGOL implementation on the B5500. In the B5500 stack, local variables are addressed relative to the first Mark Stack Control Word (which corresponds to the outer-most block), or relative to the most recent Mark Stack Control Word (which corresponds to the current procedure). All intervening Mark Stack Control Words, however, are invisible to the current procedure. This means that the variables declared global to the current procedure, but local to some other procedure, cannot be addressed at all! This inability to reference variables declared non-local to the current procedure but local to some other procedure is termed the non-local addressing problem.

The manner in which these variables are addressed in the B6500/B7500 stack can best be understood by analyzing the structure of an ALGOL program. The addressing environment of an ALGOL procedure is established when the program is structured by the programmer, and is referred to as the lexicographical ordering of the procedural blocks (Fig. 6A). At compile time, this lexicographical ordering is used to form address couples.
An address couple consists of two items: 1) the lexicographical level (ll) of the variable, and 2) an index value (δ) used to locate the specific variable within a given lexicographical level. The lexicographical ordering of the program remains static as the program is executed, thereby allowing variables to be referenced via address couples as the program is executed.

Figure 6a - ALGOL program with lexicographical structure indicated (nested blocks at lexicographical levels 2 through 4, declaring variables V1 through V5 and procedures A through D, each tagged with its address couple)

Figure 6b - Addressing environment tree of ALGOL program in Figure 6a

The B6500/B7500 contains a network of Display Registers (D0 through D31) which are caused to point at the appropriate MSCW (Fig. 5). The local variables of all procedures global to the current procedure are addressed in the B6500/B7500 relative to the Display Registers. The address couple is converted into an absolute memory address when the variable is referenced. The lexicographical level portion of the address couple functions to select the Display Register which contains an absolute memory address pointing at the MSCW related to the procedural block (environment) where the referenced variable is located. The index value of the address couple is then added to the contents of the Display Register to generate an absolute memory address to locate the variable.

Figure 5 - Display registers indicating current addressing environment

It should be recognized that the address couples assigned to the variables in a program are not unique. This is true because of the ALGOL scope of definition rules, which imply that two variables may have identical address couples only if there is no procedure within which both of the variables can be addressed. So this addressing scheme works because, whereas two variables may have the same address couples, there is never any doubt as to which variable is being referenced within any particular procedure. What this does imply, however, is that there is a unique place (a MSCW) to which each Display Register must point during the execution of any particular procedure, and that the settings of the Display Registers might have to be changed, upon procedure entry or exit, to point to the correct MSCW. This list of MSCWs to which the Display Registers must point is called the addressing environment of the procedure.
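In outline, the conversion just described amounts to a table lookup plus an add. A sketch (array sizes and types are illustrative; the display registers are assumed already set for the current procedure):

    #define NDISPLAY 32           /* D0 through D31 */

    /* Resolve an address couple (ll, delta): the display register for
     * lexicographical level ll holds the absolute address of the MSCW
     * of the corresponding block, and delta indexes past it. */
    unsigned resolve_couple(const unsigned display[NDISPLAY],
                            unsigned ll, unsigned delta)
    {
        return display[ll] + delta;   /* absolute memory address */
    }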
The addressing environment of the program is maintained by the hardware. It is formed by linking the MSCWs together in accordance with the lexicographical structure of the program. This linkage information is contained within the Stack Number (Stack No.) and Displacement (DISP) fields of the MSCW, and is inserted into the MSCW whenever a procedure is entered. The contents of the DISP field indicate the environment in which the entered procedure was declared. Thus the addressing environment list is formed by linking each procedure entry Mark Stack Control Word back to the MSCW appearing immediately below the declaration for that procedure. This forms a tree structured list which indicates the legitimate addressing environment of each procedure under dynamic conditions (Figs. 5 and 6B). This list is searched by the hardware to update the Display Registers' contents whenever a procedure entry or exit occurs.

The entry and exit mechanism of the Processor hardware automatically maintains both stack lists to reflect the current status of the program. Therefore, the system is able to respond to, and return from, interrupts conveniently. Interrupt response is handled as a procedure entry. Upon recognition of an interrupt condition, the hardware causes the stack to be marked, inserts into the stack an indirect reference word (address couple) pointing to the interrupt handling procedure, inserts a literal constant to identify the interrupt condition, and then causes an entry into the operating system interrupt-handling procedure. The Display Registers will track with the entry into the interrupt-handling procedure to make all legitimate variables visible. Also, upon return, the Display Registers track back to the environment of the former procedure, making all of its variables visible again.

Multiple stacks and re-entrant code

The B6500/B7500 stack mechanism provides a facility to handle several active stacks. These stacks are organized into a single tree structure. The trunk of this tree structure is a stack which contains certain operating system global variables, and contains all of the Segment Descriptors describing the various procedures within the operating system.

Let us make a distinction between a program, which is a set of executable instructions, and a job, which is a single execution of a program for a particular set of data. As the operating system is requested to run a job, a level-1 branch of the basic stack is created. This level-1 branch is a stack which contains only the Segment Descriptors describing the executable code for the named program. Emerging from this level-1 branch is a level-2 branch, a stack to contain the variables and data for this job. Thus, starting from the job's stack and tracing downward through the tree structure, one would find first the stack containing the variables and data for the job (at level 2), the program code to be executed (at level 1), and finally the operating system's stack at the trunk (level 0).

A subsequent request to run another execution of an already-running program would require that only a level-2 branch be established. This level-2 stack branch would sprout from the level-1 stack that describes the already running program. Thus two jobs which are different executions of the same program will have a common node, at level 1, which describes the executable code. It is in this way that program code, which is not modifiable, is re-entrant and shared. It comes about simply from the proper tree-structured organization of the various stacks within the machine.
Thus all programs within the system are re-entrant, including all user programs as well as the compilers and the operating system itself.

The B6500/B7500 stack mechanism also provides the facility for a single job to split itself into two independent jobs. It is anticipated that the most common use of this facility will occur when there is a point in a job where two relatively large independent processes must be performed. This kind of splitting could be used to make full use of a multiprocessor configuration, or simply to reduce elapsed time by multiprogramming the independent processes. This kind of program splitting becomes almost literally "reproduction by budding" in the B6500/B7500 system. A split of this type is handled by establishing a new limb of the tree structured stack, with the two independent jobs sharing that part of the stack which was created before the budding was requested. The process is recursively defined, and can happen repeatedly at any level. An implementation restriction limits the total number of separate stacks to 1024. This tree-structured organization for handling multiple stacks is referred to as the Saguaro Stack System.

Linkage of stack branches is achieved through a single array of data descriptors, the stack vector array (Fig. 7). A data descriptor is entered into the array for every stack branch as it is set up by the operating system. This data descriptor, the stack descriptor, serves to describe the length of the memory area assigned to a stack branch, and its location in either main memory or on disk.

Figure 7 - Multiple linked stacks

A stack number is assigned to each stack branch to indicate the position of its stack descriptor within the stack vector array. The stack number is used as an index value to locate the related stack descriptor from the stack vector array for subsequent reference. The stack vector array's size and location in memory are described by the stack vector descriptor. This descriptor is located in a reserved position of the stack's trunk (Fig. 7). All references to stack branches are made through the stack vector descriptor, which is indexed by the value of the stack number to select the stack descriptor for the referenced stack. A Presence Bit Interrupt is incurred upon making reference to a stack which is not present in memory. This Presence Bit Interrupt facility provides the means to permit stack overlays and recalls under dynamic conditions. Idle or inactive stacks may be moved from main memory to disk as the need arises, and when subsequently referenced will cause a Presence Bit Interrupt, prompting the operating system to recall the non-present stack from disk.

Referencing a variable within the current addressing environment of an active procedure is accomplished through the use of the address couples contained in the IRW and the address couple field of the Program Control Word (PCW), as shown in Fig. 1. Both references are made relative to the Display Registers specified by the address couple. The address couple and Display Registers are usable only for addressing variables within the scope of the current addressing environment. Reference to variables beyond the scope of the current environment is accomplished by a stuffed IRWS. This causes the addressing to be accomplished relative to the base of the stack (BOS) in which the variable is located. The IRWS contains information specifying the stack number (Stack No.), the location (DISP) of the related MSCW, and the displacement (δ) of the parameter relative to the MSCW. The absolute memory location of the sought parameter is formed by adding the contents of DISP and δ to the base address of the referenced stack. The base address of the stack is determined by accessing the stack descriptor as described previously. The information content of the stuffed IRWS, with the exception of δ, is dynamic in nature and must therefore be accumulated as the program is executed. The contents of the stack number (Stack No.) and DISP fields are entered into the IRWS by a special hardware operator which is invoked when the program prepares to pass a parameter by name.
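A sketch of this cross-stack resolution, following the description above: locate the target stack's base through the stack vector, then add DISP (offset of the MSCW) and δ (offset of the parameter beyond the MSCW). The structures are illustrative, not the hardware formats; only the 1024-stack limit is taken from the text.

    typedef struct { unsigned base, length; } stack_desc;

    /* Resolve a stuffed IRWS (stack_no, disp, delta) to an absolute
     * address.  In the real machine an absent stack would raise a
     * Presence Bit Interrupt here; that path is omitted. */
    unsigned resolve_irws(const stack_desc stack_vector[1024],
                          unsigned stack_no, unsigned disp, unsigned delta)
    {
        unsigned base = stack_vector[stack_no].base;
        return base + disp + delta;
    }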
ACKNOWLEDGMENTS

Recognition for the stack concepts and operating philosophy of Burroughs' B6500/B7500 system must be extended to many system designers engaged in both the B5500 and B6500/B7500 programs. Among the contributors, special mention should be made of B. A. Creech, Burroughs Corporation, and R. S. Barton and W. M. McKeeman, consultants.

REFERENCES

1 Burroughs' B5500 information processing system reference manual, Burroughs Corporation, 1964
2 A narrative description of the Burroughs' B5500 disk file master control program, Burroughs Corporation, 1965
3 B RANDELL and L J RUSSELL, ALGOL 60 implementation, Academic Press, 111 Fifth Avenue, New York, 1964

A compact, economical core memory with all-monolithic electronics

by ROBERT W. REICHARD and WILLIAM F. JORDAN, JR.
Honeywell Computer Control Division
Framingham, Massachusetts

INTRODUCTION

The computer memory business has been plagued at various times during the past 8 years with a cliché that the art of core memories would be exhausted within 5 years. This paper describes the attainment of new standards of size and cost and new design features intended to further postpone this elusive demise. The new standards include greater reductions in cycle time, volume, and selling price than had been hoped for. These significant improvements involved the following factors:

Use of monolithic integrated circuits for all major electronic functions related to the core stack;
Integration of a core driver transistor with decoding/timing logic circuitry;
A novel packaging concept without wired backboard;
A high-performance power supply subsystem.

Over the past decade, the cost per bit and cycle time of state-of-the-art core memories have simultaneously been halved every 2½ years. In large measure, this is due to the continuing competitive market and to the appearance of new suppliers, some of whom have yet to become significant factors in the market. Some short-lived competitors grasp an occasional opportunity and depart, affecting the market by their having appeared. The core memory business may not have attained a semblance of maturity and stability. Opportunism has been amply demonstrated as an unsatisfactory business approach in the long run. A classic approach to design, product development, and release to manufacture has been recognized as desirable. There is a paradox here in that, as the market grows and larger production runs become necessary and possible, tooling requirements increase in scope and duration and the conception-to-production cycle tends to increase. Perhaps this is the reason that several of the major suppliers in the core memory business have been able to continue in business despite the fact that they are essentially specialty houses with platoons of project engineers.
However, there seems not to be general tacit admission that the economy will no longer support such frivolity; or more properly, that such approaches can no longer compete against that of a disciplined organization. The logical conclusion from this set of constraints is that design capabilities and requirements must be projected far into the future, and an organization must be willing to back long, expensive programs which generally admit to rather high risk. The success of one or more suppliers in doing this can mean only jeopardy to those attempting to operate in the old "job-shop" manner.

Product requirements

A planned, disciplined approach with stated objectives of main-frame capacities, fast cycle time, economy, fast delivery, modularity, and small size will be discussed in this presentation. An obvious, evolutionary tendency has been for memory capacities to become larger in order to meet the needs of ever more powerful hardware and software products. The system described provides a maximum capacity of approximately 300 kilobits, as 8192 words of 36 bits or 16,384 words of 18 bits. The limiting capacities are a result of cost-speed-capacity trade-offs as well as the limit of signal/noise ratio affected by sense-winding geometry and length. This system, with its power supply unit, is shown in Figure 1.

Figure 1 - 300-kilobit memory system and power supply

Detailed studies of the maximizing of the performance/cost index of core memory systems dictated that the mode of operation of this memory be that known currently as 2½D. The technique dates back more than a decade, but only recently has become economically attractive, due to the advent of integrated circuits. The technique is an elemental implementation of a coincident-current memory in which one axis of read-write circuitry is replicated for each bit; the result is repetitive assemblies, but of drive lines which are short, since they need not be continuous through the stack. This permits a new degree of freedom in optimizing the aspect ratio of bit areas for one or another parameter. A further gain is the obvious dispensing with inhibit windings, enabling smaller cores for a given size wire, and the saving of time formerly allocated to insuring time overlap on inhibit and write currents, plus time required for recovery of sensing circuitry from the inhibit transients.

Another goal stated at the initiation of this product development was the requirement for reasonably fast delivery, to order, of a variety of capacities, which is at odds with the concept of a hard-wired custom core stack. This last item can be expedited by the act of inventorying tested planes and stacking to order, or inventorying an array of stacks; but clearly a preferable route is to avoid stacking in its traditional sense. In the subject memory no hard-wired stacking ever is done; the implementation of an assembly of planes is achieved by pressure-contact connectors making a one-to-one connection between facing surfaces of adjacent boards. Sixty-four such one-to-one connections are made by each proprietary connector. The resultant connections not only achieve the linking of the long drive axis but also serve to distribute dc power and logic signals through the resultant assembly. The results of using stackable planes include not only the obvious ones of less restrictive inventory capability and/or less dependence upon the supplier but also lower cost, since stack test and tedious stack repair are eliminated. Maintenance and repair are also simplified since plane replacement is practical, and a true capability for incremental field expansion is made possible.
Maintenance _and repair are , also simplified since plane replacement is practical, and a true capability for incremental field expansion is made possible. Modularity of capacity was also a prestated goal at the initiation of this design. As with any manufactured product, a considerable degree of standardization is necessary to achieve economical manufacture. The modularity of the present equipment is fixed by the design at 32,768 bits, or the content of one plane. Thus, increments of capacity are 4 bits in 8K memories or 2 bits in 16K memories. Inasmuch as the 4K memory uses half-density planes, the modularity is also 4 bits at that capacity. Only five other major electrical circuit modules are required, with a total diversity of eight assemblies, to implement systems of 33K through 300K bits of storage. Standardization is further attested by the fact that only seven IC flat packs are used throughout. These factors result in a system which requires minimal inventory vis-a-vis a broad range of capacities that can be assembled in short periods of time. Electrical design In present-day state-of-the-art electronic equipment it is becoming more and more difficult to dissociate mechanical design from electrical design. In this design it was found possible to avoid artificial delineations and in fact to capitalize upon -the interdependence of spatial and electrical characteristics both in a miscroscopic (e.g., etched conductors) and macroscopic (e.g., organizational) sense. Thus, the following discussion treats those areas in which the electrical requirements predominate. Packaging organization of this memory system was conceived to emphasize an intimate, orderly relationship' between data logic and storage. All of the circuitry associated with the regeneration of data from a core plane is located on peripheral plug-in cards. The cards and cores are mounted in - a co~ planar relationship permitting an orderly stackup 'of complete regeneration channels. A 4-bit, 8192-word data plane is illustrated in Figure 2. The core plane contains conventional 20 mil o-d cores requiring nominally 800-mA fulldrive current, in a conventional 2 t D three-wire rectilinear array. The long lines are re-entrant such that when current linking coincidence occurs in a selected core, anti-coincidence occurs in that core corresponding to the - second intersection. Furthermore, the two positions of each re-entrant line link separate sense windings, two of which are interleaved within each bit area. This results in senseline noise of 64 delta-pairs on the long lines and only 8 pairs on the short lines. Printed-circuit sense wiring pairs connect to monolithic sense amplifiers mounted on the data cards. In addition to optimal differential terminators, a resistor is provided on Compact, Economical Core Memory with All-Monolithic Electronics each data card to optimize the common mode termination. High-level TTL gates are also mounted on the data cards for storing, gating, and output transmission line driving functions. AIR FLOwl CORE PLANE SIGNAL INPUT/ } OUTPUT SELECTION CARD DATA CARD Figure 2 - Organization and cQnstruction of data plane Adjacent to the data card is the selection card, which contains all of the decode-drive circuitry required for the short drive selection axis. An 8192word selection. card contains circuitry for "four 4X 4 matrices; a 16,384-word selection card contains _two 4,x 8 matrices. 
Storage for the associated address bits is provided by R-S TTL flip-flops mounted on each selection card. Although these address flip-flops are repeated on each plane, their utilization is not extravagant. The redundant storage actually serves as local modular buffering for the double-rail address signals. The organization of address storage is unique in this unit. Address registers respond to their inputs at all times except when the memory is busy. This is the so-called open-ended address register configuration. "Look ahead" gating is also used to shorten the address delays by a technique similar to that of feed-forward in servo systems.

The long-axis drive and control circuitry is packaged on the top and bottom planes to sandwich the core stack. The top plane contains integrated selection diodes for the long-axis 16 x 16 matrix. A control card and a long-axis selection card are both mounted on each end plane. The control card contains logical gates and an electrical delay line for the generation of memory timing signals. Each long-axis selection card contains 16 switch-sink pairs for one coordinate of the 16 x 16 matrix.

The key to this packing density is the very compact decode-drive circuitry, which is constructed of monolithic chips. One chip contains two 400-mA switch pairs with decoding for eight or fewer address bits. Since logical ground and the sink emitters are common, this function has exactly 14 pins for use in a standard flat pack. The smaller matrices required on the short axis of the 2½D memory organization also efficiently employ this function; unused address inputs are used for the modulation and zoning of data.

A line selection matrix organization has been conceived which relieves some of the integrated-circuit liabilities: voltage breakdown, pin limitation, and accidental destruction. This arrangement of decode-drive chips with a memory current-limiting resistor prevents catastrophic damage from inadvertent simultaneous activation of read and write. It avoids the requirement of saturating a switch to a memory drive voltage and therefore avoids the need of a higher-voltage supply. Any output may be shorted to ground without damage. The drive current resistor is time-shared for the read and write halves of the memory cycle. The single resistor (having full duty cycle dissipation capability) is clamped to limit the IC voltage requirement. A technique is employed with the long-axis address selection matrix for reducing the loading effect of drive line bus capacity; integrated diodes are used to segment the 16-line busses into four groups of four drive lines.

Circuit layouts were judiciously considered to satisfy the problems of signal transmission, noise, and thermal management. On the selection board, driver flat packs are mounted on a wide copper lamina with a thermal compound. The row of flat packs is perpendicular to the cooling air stream to avoid cumulative heating effects, since, in the worst case, each flat pack can dissipate some 600 milliwatts. The lamina is expanded, filling the unused area on the selection board to enlarge the cooling area. The complete regeneration path is organized with the cooling air stream passing over the heat-sensitive cores and sense amplifiers first, then the diode matrices and selection circuit, and finally the very-high-dissipation drive resistors. Care was taken to employ as few types of materials and assemblies as possible.
As part of this theme, a scheme for replacing odd and even core planes with just one assembly was conceived. Three types of card layouts and three types of board layouts satisfy all requirements. With no logical hook-up wiring, the optional system characteristics - signal inversion, data levels, and partitioning zones - are accomplished by the proper placement of jumpers on the circuit cards.

From an electrical point of view, it would have been tempting to use multilayered cards with ground planes. This expensive approach was avoided, however, by carefully laying out minimum signal lines and filling any unused area with ground or supply bus patterns. The data card has been especially considered to insure balanced sense lines and the absence of electromagnetic or electrostatic coupling. On the core planes the sense lines are surrounded with ground peninsulas for current-free electrostatic shielding. The sense boundary connector pins on the data card are connected on one end for the same reason. The creation of ground noise is minimized on the selection board by an alternate ac circuit return which is laid out for close coupling to the input drive current paths. The mutually cancelling effects of flux linkage from opposing fields leave only magnetic leakage inductance to impede the flow of current.

The degree to which a system is implemented by repeated subassemblies is an important diagnostic characteristic. High repeatability decreases the needed spare parts inventory. Repeatability of disconnectable subassemblies permits transposition of parts for localizing faults, and comparison for tracing. In this system, all of the signal paths (including drive currents) associated with a data bit are limited to a specific plane level for orderly modularity. All of the active circuitry is located on plug-in cards. Many of the selection diode flat packs (36) are on exposed surfaces of the assembly. The remaining (52 in an 8K x 24) can be reached by disassembling the stack. All of the flat packs and all of the plug-in cards are repeated.

Mechanical implementation

As was indicated previously, the memory module described herein provides 300 kilobits of storage in substantially less than 1 cubic foot in a package which includes cooling fans and filters. In actuality the package contains some 20% unused volume, so that the equipment represents very high functional packaging density for a high-performance commercial memory. One factor in the space economy is the relative lack of discrete components. Another very important one is the lack of a conventional backboard. Two proprietary connectors are instrumental in achieving this. One is the stacking connector mentioned previously, which provides 64 connections on 0.050-inch centers, and incorporates precise aligning pins. The contacts are precious metal, as are the pads on the mating PC boards. The other connector is an 88-pin edge connector with a double rank of precious metal-tipped contacts on 0.100-inch centers, and provision on the affixing end for mechanical and solder retention to a PC board. The application of this connector is to effect an edge-to-edge connection of numerous close-spaced contacts with a minimum of hardware. These contacts also mate with precious metal-plated pads on another PC board. This edge connector is in practice affixed to each plug-in board, and is therefore replaceable if ever necessary.
Thus, there is no backboard in the conventional sense, and most interboard connections have essentially a short, straight run with minimal connector capacity and minimal length, hence minimal inductance. Figure 3 indicates the internal construction of the system.

Figure 3 - View of internal construction of memory system

There is some conventional wiring present in this system, and that is, in addition to the obvious power wiring, the input and output signal wiring from the I/O connector to each board in the "core stack." These wires are all twisted pairs, affixed to a crimp-on, poke-home contact at the I/O connector end and at the other end to a crimp-on contact which slides onto a 20 x 30-mil rectangular pin affixed to the stack boards. There are no hand-soldered wires in the unit except in the ac fan power wiring. Note that the signal leads enter and exit via the stack boards even though all electronic functions are performed on the plug-in boards. This is done so that insertion and extraction of boards is kept simple and the number of connectors is kept to a minimum. That signal paths are lengthened by this procedure may be questioned, but any added length is over well-controlled and absolutely repeatable signal paths.

A single, dense I/O connector containing 200 pins is utilized. A central jackscrew and positive alignment pins insure against damage to pin or shell by mismating. The mating shell and loose pins furnished to a customer with each unit may be crimped or soldered, and no tools are necessary to insert the wired contacts.

High-performance card extenders were developed to permit operation of any plug-in board in position for maintenance checks without compromising system performance. This is achieved through a simplified multilayer board technique. The added ground planes reduce coupling effects between lines, and lower the line impedance to avoid discontinuities which have previously restricted the use of extenders.

Results achieved

The memory system described here is the Honeywell Computer Control Division ICM-500 μ-Store, a product first delivered in January, 1968 and now being produced in quantity. This system, with capacities as great as 8192 words of 36 bits, provides 600-ns cycle time and 300-ns access time. It is packaged in a sturdy but simple enclosure about 0.6 cubic foot in volume and 25 pounds in weight. This enclosure includes flushing fans and filters and may be mounted via either of two surfaces in any of several orientations.

Power consumption of the largest module is approximately 350 watts. More than half of the system power is dissipated in the set of drive resistors, all of which are located at the exhaust end of the cooling air stream. Four supply voltages are required, all referenced to ground. A proprietary power supply furnishes all of these voltages, with appropriate regulation, for one maximal module. It also provides the usual features of line-voltage sensing and low-line-voltage indication, with provision for enabling orderly and non-destructive shutdown as well as startup, overvoltage and overcurrent protection, remote sensing for temperature compensation of drive voltage, and thermal overload protection. This supply is operable from a nominal 115-volt line, 50 through 400 Hz. It weighs only 45 pounds and occupies about one-half cubic foot.
By the application of reliability forecasting techniques refined via similar predecessor products, an MTBF of some 25,000 hours is forecast for this system, and a unit is presently on life test to commence accumulation of supporting data. All logical and drive circuits are immune to failure due to accidental groundings, and the selection matrix design is such that excessive read and write currents cannot flow simultaneously and cause destruction to components or to stored data. In addition, the fault-localizing time is short since functional subunits are interchangeable. The availability of high-performance extender boards also enables functional boards to be extended for signal tracing without compromising operational speed. Test points, accessible without any disassembly or unplugging of cards, also facilitate the localizing of signal interface problems.

ACKNOWLEDGMENTS

The authors wish to acknowledge contributions to the development of this product by personnel of the Memory Products Department at Honeywell, Computer Control Division, particularly Mr. Dana W. Moore, and by members of related areas.

A progress report on large capacity magnetic film memory development

by JACK I. RAFFEL, ALLAN H. ANDERSON, THOMAS S. CROWTHER, TERRY O. HERNDON and CHARLES E. WOODWARD
Massachusetts Institute of Technology, Lincoln Laboratory*
Lexington, Massachusetts

*Operated with support from the U.S. Air Force.

INTRODUCTION

In 1964 we proposed an approach to magnetic film memory development aimed at providing large, high-speed, low-cost random-access memories.1 Almost without exception, all early attempts at film memory design emphasized speed with little consideration for the potential of batch-fabrication to reduce costs. Based on our earlier work in building the first film memory in 1959,2 and a 1,000 word, 400 nsec model for the TX-2 computer in 1962,3,4 we had reached some fundamental conclusions about the compatibility of high speed and low cost for destructive-readout film memories. It seemed clear to us that in order to provide significantly more storage capacity per dollar it was necessary to achieve very high bit densities, to eliminate as far as possible internal connections, to fabricate access wiring integrally with the storage medium if possible, and to provide very wide magnetic operating tolerances in order to achieve adequate yields for arrays of 10^5 bits or more. It was also clear that for small memories the cost of storage elements is essentially irrelevant and that the area of significant impact for batch-fabricated films should be in large memories, i.e., greater than one million bits.

There were at least five areas of major uncertainty when this proposal was made:

(1) The signal to be detected in the presence of both random noise and parasitic transients was lower than in any previous magnetic store.

(2) Techniques did not exist for forming precise, dense (2 mil width, 2 mil space) lines over large areas (10" x 1.5") cheaply and reliably.

(3) Routine evaporation of uniform magnetic films over large areas with elimination of defects greater than 0.5 mil in any dimension remained to be achieved.
1), which is to be installed in the TX-2 computer, has proviqed us with enough encouragement on each of these points to justify a high degree of confidence and to open up exciting possibilities for extending these techniques further than had originally been anticipated. The most interesting feature of the memory for system applications is the. parallel-access to 352 bits. This multiword access should be useful for a number of applications such as parallel processing of the Illiac IV variety, list processing, searching, and display buffering. Although for the present TX-2 will handle a single subword of 44 bits on each memory access, the memory bussing arrangement is such that access may be made to each of the eight subwords during the memory cycle of less than one microsecond. M emory design A. Structure and organization The basic structure of the memory is shown in Fig. 2. A single layer composite magnetic film is operated in a rotational destructive-read-out mode with two access wires. 2 Each bit is composed of two 2 x 6 mil intersections of the word and digit lines with a density of 12,500 bits/in 2. Magnetic film structures which provide flux closure in the hard, easy, or both 260 Spring Joint Computer Conference, 1968 rJ I UUUUU I I i I I I II II I II 352/ DIGITS I fI I I I iJ II UUUUU--L-! I: I II II II II I: I ~--------------~v~---------------J 3200 WORDS (PLUS 400 SPARES) 352 SENSE AMPLIFIERS 352 DIGIT .DRIVERS 100 WORD SWITCHES 64 WORD DRIVERS PROTOTYPE LARGE CAPACITY MEMORY 6 1. 1 x 10 BITS Figure I - One million bit magnetic film memory. The memory stack is in the center, under the dark plate, with the word line connections and diode matrix at the top and bottom. Word access circuitry is in the two card files. One digit card is plugged into digit lines at the upper right-hand corner. A fully populated memory will have digit cards in the sockets on all four corners, top and bottom. The digit lines shown have not yet been connected to sockets- ...--r --....____________...r-"-"'""L._ FIBERGlASS - O.OO4-INCH THICK DIGIT LINE - 0.0001)( 0.006 INCH O.Olo-INCH CENTERS ~ / WORD LINE 0.0002 )( 0.002 INCH ON 0.004-INCH CENTERS MAGNETIC LAYER - 11001)( 0.002 INCH -GLASS SUBSTRATE - 1/4iNCH THICK INSULATION WORD LINES Figure 2 - Detail of memory structure and arrangement of access lines. Two 2 x 6 mil intersections form one memory bit directions were considered but rejected when adequate margins were obtained with the single layer. Although the open structure has fabrication advantages, the closed structures remain of interest for future work. The organization of the one mi1lion bit memory is shown in Fig. 3. The ten inch long glass substrates each have 360 lines of 2 mil width which have been formed by etching vacuum evaporated magnetic and copper layers. Storage is in the magnetic layer while the copper forms the word line. Forty lines on each substrate are redundant and can replace other Figure 3 - Memory organization. Five substrates and four digit pieces comprise each half of the memory defective lines through simple wiring on the edge connectors. Advantage has been taken of the low back voltage of the storage element to minimize stack interconnections by addressing many bits, up to 384, on each word line. The digit-lines are formed from pairs of 6 mil wide copper lines on 10 mil centers which have been etched from copperclad fiberglass in a hairpin configuration as shown in Fig. 2. 
Two digit pairs are connected in a bridge connection to the digit driver and sense amplifier as shown in Fig. 4. Stack assembly is accomplished by pressing the glass substrates against the digit lines with no critical registration. The TX-2 memory, as shown in Fig. 3, uses ten substrates and 352 digit lines to form a 3200 x 352 stack, although to the computer it appears as a 25,600 x 44 memory. Since read-out is destructive, each digit-line requires a, sense amplifier and digit driver. It was recognized at the outset that the overall costs for the prototype would be dominated by the digit circuitry and that an economical module size would use longer digit lines and more substrates. DIGIT DRIVER Figure 4 - Digit coupling circuit Progress Report on Large Capacity Magnetic Film Memory Development B. Digit circuitry If one accepts the premise that low cost depends crucially on batch-fabrication whose effectiveness in tum depends on high density, the ultimate lower limit on bit size and signal energy becomes a determining factor in memory design. This limit is set by random noise considerations. It has been shown5 that the mean time between failures for an M digit memory with a cycle time T and peak-signal to rtn<1_nni<1p r~tin .1.1."' .. ..., ....... .I.1LA.II,..I.'-' ... ..I..& ..UJ A Irr i<1 0.1. ni"pn h,,· v ""'.1..1 J • .L 1 / .I.~ 'Y V 2T MTBF= (I-erf A/V2a-) M This relationship gives a required signal-to-noise ratio of 8.3: 1 for a MTBF of one year for a 1 /Lsec, 400 bit memory. In this design the bit size was reduced to the limits of our then existing fabrication technology with a resultant bit signal amplitude 9f 130 /Lvolts, 35 nsec wide, at the sense line terminals. This small signal in the presence of rather large drive currents imposed stringent requirements on the sense amplifier and the coupling circuitry between it and the digit lines. The essential features of the coupler are shown in Fig. 4. The two halves of the digit line pair are connected in series to the sense transformer after passing through common mode filters. The digit current is transformer coupled to the two digit halves so as to feed them in parallel, thus providing a cancellation of the two digit currents at the sense transformer. A small potentiometer compensates for mismatch in digit line resistances. The sense system bandwidth is 14 MHz, the noise figure is 6 dB, and the peak-signal to rms-noise ratio is 27:1. A synchronous clamp samples the sense amplifier output just before strobe time and establishes a relative base-line reference from which to measure signal excursion, thus eliminating the effects of low frequency components in the digit transient. However, this introduces a random noise component in the baseline which leads to a total augmentation of the noise by V2, reducing the signal-to-rms-noise ratio to 19:1. Two synchronous clamps are used as a' SPDT switch to direct the sense -amplifier output to the appropriate side of the strobe flip-flop, dependent on which half of the memory is being addressed. These strobe flip-flops are also the buffer storage and are arranged in a rectangular array of 44 bits by eight subwords. Anyone of the subwords can be selected for writing into or reading out of the memory by external access circuitry. The polarity switch for the digit pulse is a flip-flop which is transformer-coupled to the digit lines. 261 Power is applied from a pulser shared by four digit circuits. The flip-flop state, and so the output polarity, is determined by the strobe-buffer flip-flop. 
Transition times are 25 nsec for the operating digit current of 190 ma ± 40 ma. Four complete digit channels are packaged on one card which plugs directly into sockets on the digit lines as shown in Fig. 1.

C. Word circuitry

Word lines are connected in groups of eighteen lines (two of which are spares) on the substrate. The common end of a group is selected by a transistor switch and the lines driven through a diode matrix. Because of packaging considerations, the 3200 lines are driven by two 32 x 50 matrices. Word current amplitude is 500 ma with rise and fall times of approximately 35 nsec.

D. Nonrandom noise

Great care is necessary in the stack design and fabrication to minimize all digit-line difference mode voltages other than the signal.⁶ At signal time, noise may be generated by either inductive or differential capacitive coupling between word and digit lines, both of which may be caused by line defects and spacing nonuniformities. Maximum allowable noise of about one half signal amplitude is determined by random noise and strobe threshold uncertainty. Noise due to group-switch selection voltages and the digit transient largely determine the memory timing.

E. Timing

The access time of the memory from change of address to information output from the buffer flip-flops is about 450 nsec. The largest contribution to this delay is the transient on the sense line due to group-switch voltage transitions. The circuit-limited cycle time for read-rewrite or clear-write is 600 nsec. Recovery from the digit-pulse transient limits the total cycle time to 1 μsec with the digit transient overlapping the group-switch transient.

Production techniques

A. Film preparation

In order to provide the very high single-bit margins necessary to obtain good yields on 100,000 bit arrays, a wall coercive force specification was chosen which was well above the maximum digit writing field. The film specifications are Hc ≥ 15 Oe, intrinsic Hk ≤ 20 Oe, α90 < 5°, skew β < 2°, and thickness d = 1200 ± 150 Å. The best method found to obtain these characteristics was to deposit a two-layer film,¹ having the typical characteristics shown in Table I.

TABLE I - Characteristics of Composite Magnetic Film

                                       d        Hk       Hc       α90     β
First layer (50% Co, 47% Ni, 3% Fe)    800 Å    25 Oe    25 Oe    ±3°     ±0.5°
Second layer (83% Ni, 17% Fe)          400 Å    3 Oe     -        -       -
Composite                              1200 Å   15 Oe    20 Oe    ±3°     ±1.0°

The large skew of the second layer is due to the off-center position of its melt. The skew of the composite film, being largely determined by the thicker, higher Hk first layer, is within specification.

The substrate is soft glass, 10.76" x 1.6", optically polished on one surface. It is cleaned in detergents, spray-rinsed, then dip-rinsed in distilled water before air-drying in a filtered-air bench. No ultrasonic cleaning is used as this may cause pitting of the surface. A drum holds fourteen substrates, and evaporation takes place from rf induction heated melts. A 24" source-to-substrate distance reduces film thickness variation to ±2.5% over the center 8" of the substrate occupied by memory elements. A layer of Cr about 100 Å thick is first deposited on the substrate to improve adhesion. The magnetic layers are deposited at a substrate temperature of 335°C and then 5 μ of copper at 150°C. The rate of copper evaporation must be kept below 16 μ per hour to eliminate spattering tiny balls of molten copper onto the substrate.

B. Fabrication

Each substrate has 360 word lines in twenty groups of eighteen lines.
Each group is terminated in a common pad at one end and each line in a separate pad at the opposite end. Groups are interleaved so that 180 lines are connected to diodes at each end. The arrangement of connection pads is shown in Fig. 2. Either holes in the line or bumps on the edge of a line may cause word noise. A good word line must have no defect larger than 0.5 mil in its largest dimension. After experiencing considerable difficulty in obtaining the desired line quality with photoexposure of a resist through a mask, a mechanical scribing technique was developed for line definition. The substrate with its magnetic alloy and copper coating is coated with a thin layer of photoresist. The connection pads at the ends are photoexposed from a master pattern and the resist developed and cured. The word-line pattern of two mil lines on four mil centers is then scribed in the photoresist, without penetrating the copper, using a diamond tool. The automated scribing machine which controls the tool has been built on a coordinatograph (Fig. 5). After a setup time of ten minutes, the machine operates unattended for the 2½ hours required to scribe one substrate. The metal layers are etched using standard procedures.

The line edge smoothness is excellent. A typical substrate has perhaps three unacceptable nicks or opens and 25 unacceptable bumps or shorts. Scribing defects are nearly always attributable to defects in the copper layers which may cause the tool to jump or slide over some resist, thus accounting for the preponderance of bumps. These are easily repaired after etching by direct removal with a special scribing tool. The ends of the substrates are gold plated and the active area coated with a 0.4 mil layer of photoresist which insulates the word from the digit lines.

Figure 5 - Scribing machine with a substrate being scribed. The photo-exposed word-line pads can be seen.

Connection to the word and group pads is by a pressure connection so that connections to a substrate in a tester or the memory can be made very easily. The connector assembly uses spring-loaded pin connectors to which the diodes of the selection matrix are wired. Substitution of spare for defective lines is done on this connector.

The digit line patterns are eighteen inches long, two inches wide, and consist of 192 lines, six mils wide on ten mil centers (96 pairs). Half this length is active digit line, the other half being required for fan out for direct connection to the digit cards as is shown in Fig. 1. The memory uses eight such patterns. Digit conductors must be entirely free of shorts and opens and can have no holes in a line greater than one mil in diameter. The digit line is made by etching half-ounce (0.7 mil) copper laminated to five mil glass-epoxy. Exposure of the resist is by a projection printing technique in which the photo master is spaced 25 mils away from the resist layer. A traveling front-surface mirror is used to paint the master and substrate with collimated light.⁸ This eliminates all degradation of the photomaster, and by projecting the light left and right a few degrees off the normal, any remaining dust particles are undercut by the light and prevented from causing flaws in the resist coating. The photomasters are generated actual size by scribing emulsion-coated glass plates on the automatic scribing machine.

C. Assembly

The two halves of the memory shown in Fig.
3 are assembled on two sides of a ground plane as can be seen in Fig. 1. A resilient material spaces the digit lines from the ground plane while the substrates are pressed against the digit lines by a cover plate with air bags. The resilient backing and air bags are necessary to achieve close spacing between word and digit lines with loose tolerances on digit-line material and substrate thicknesses. The stack is assembled under clean conditions to eliminate dust particles which can cause spacing irregularities. Word connections are made on two sides of the stack and digit cards plug into digit-line sockets on the other two sides.

D. Testing procedure

The testing procedure is designed to eliminate as many defective substrates as possible before the more expensive processing steps of complete inspection, repair, and electrical testing. The composite film substrates have wide operating margins with uniform large-scale characteristics. Furthermore, the causes of errors at individual bit positions are well known and easily detected. These two factors make simple preliminary screening effective. After the substrate has been coated, it is tested in a B-H looper which measures average total flux, Hk, Hc, dispersion, and skew. The pinholes greater than one mil in diameter are counted and adhesion checked. After scribing and etching the film is again looped, line resistance checked, and line edge quality evaluated. The substrate transparency and regularity of the word lines make it possible to see small defects with the unaided eye. At this time curves of signal vs digit current of several bits may also be made as a check on the B-H loop data.

All individual bit errors are attributable to physical defects such as scratches, pits, or dirt on the glass or holes in the layer of magnetic material, all of which may decrease both signal amplitude and operating margins. Defects in the copper line edge larger than 0.5 mil may cause undesired inductive coupling of word current into the digit line. The effect of a defect is often quite critically dependent on its position with respect to the digit line; a translation of two mils may be the difference between a good and a bad bit. Word lines are continuous, and to simplify memory assembly no attempt is made to maintain any word to digit line registration. In final testing, therefore, it is assumed that any defect may occur at the worst position.

Fig. 6 is a photograph of the automatic pulse tester mechanism. The 360 lines of a substrate are accessed with a diode matrix. Only eight digit channels are provided and these digit lines are mechanically indexed automatically along the substrate to test the 384 possible digit positions. Several different worst-case tests are applied in sequence to each word at each digit position. The signal is sampled and tested at one threshold level, and on a failure the address and test number are printed. A test including 1,000 adjacent word disturbs at a ½ MHz rate, each test repeated sixteen times, requires about 45 minutes per substrate. To obtain worst position testing of all possible defects it is necessary to do this test several times at increments of several thousandths of an inch. Although the number of disturbs is lower than desirable, the test is made at a digit current level for disturbing which is 150% of the digit write current. It would be relatively straightforward to use higher clock speeds to decrease the total test time.

Figure 6 - Substrate holding and positioning mechanism of the electrical tester.
Connection is made to the substrate with spring-loaded pin connectors. The defect registration problem and difficulties with word noise in the tester have caused us to supplement the electrical test with optical inspection, although improvements to the tester are possible which would make it alone sufficient. Using vertical illumination and looking through the glass substrate, the operator views the magnetic layer at 100 power in a comparator as the substrate is moved along its length on a motor-driven table. All defects except copper holes can be seen and are recorded. Although it is fatiguing, the substrate can be completely scanned in two hours. With presently planned electro-optical detection this time will be reduced to about one hour with better accuracy and minimal operator intervention. Acceptance of a substrate could be done after optical inspection alone; however, at present margins are examined in the tester at each defect position and a complete scan is performed as a double check. Optical inspection would clearly be difficult with multilayer closed structures and impossible with nontransparent substrates.

E. Yields

In interpreting yield data it is important to recognize that processing parameters and specifications for the runs included have been barely stabilized. However, the yield question is so fundamental to the overall objective of reduced costs that it is essential to provide some hard data no matter how provisional or unrepresentative of ultimate possibilities. The yield data is taken from ten successive production runs of the evaporator starting with the first one in which composite films were evaporated to memory specifications. Of the total 131 substrates, 22 were rejected for thin copper and copper surface imperfections, 16 did not meet magnetic specifications, and one was damaged. These 39 were rejected before any processing; during processing 50 were rejected before final test for the following reasons:

(1) spatter bumps on copper surface - too high an evaporation rate: 16
(2) poor adhesion: 7
(3) damaged in processing: 9
(4) poor line edges due to a bad scribing tool: 16
(5) other: 2

Of the remaining 42, 23 were rejected in final testing, 21 for excessive defects and 2 for poor magnetic characteristics. The 19 acceptable represent 15% of the total and 45% of those reaching final test. It is quite certain that the large attrition due to poor copper, poor line edges, and damage can be significantly reduced. In one run in which there was no loss due to evaporation or processing defects, fourteen substrates were processed and nine were acceptable.

The importance of providing redundancy should be emphasized. In the acceptable substrates two to eight lines were defective. Some of the 21 substrates rejected for excessive defects would have been acceptable if line substitutions were permitted on a 32 instead of a 16 line-group basis, since spare lines are committed to a group by the bussing on the substrate.
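The yield bookkeeping above can be verified in a few lines (Python, with all counts taken from the text):

    total_substrates = 131
    rejected_before_processing = 22 + 16 + 1          # copper, magnetics, damage
    rejected_during_processing = 16 + 7 + 9 + 16 + 2  # reasons (1) through (5)

    reached_final_test = total_substrates - rejected_before_processing - rejected_during_processing
    accepted = reached_final_test - 23                # 21 excess defects + 2 magnetics

    print(reached_final_test)                          # 42
    print(accepted)                                    # 19
    print(round(100 * accepted / total_substrates))    # ~15% of all substrates
    print(round(100 * accepted / reached_final_test))  # ~45% of those tested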
Costs

Extrapolating costs from a prototype built using experimental techniques is always difficult. In particular, little effort has gone into minimizing the costs of associated electronics. On the other hand, it is possible to make some reasonable projections of material and direct labor costs for completely tested memory arrays based on present processing techniques. A single substrate cost under $100 appears to be a reasonable estimate (<0.1 cent per bit) assuming a potential yield of 50%. Present digit circuit costs are approximately $30 for parts and four man-hours per channel (3200 bits). Word circuit costs per word line are approximately $1.00 for parts and 0.1 man-hour (352 bits). It should be possible to make substantial reductions in the labor involved in all phases of the fabrication process; parts costs would also be substantially reduced for quantity purchases. The use of integrated semiconductors could radically reduce circuit costs. Work has already begun on a memory which will use the techniques described above with digit lines extended by a factor of five in length to provide a better economic balance between digit circuit and other stack and circuit costs.

Future developments

The processing machinery to accomplish the extended stack is sufficiently well developed that the main development effort can be applied to exploring possibilities for achieving perhaps another order of magnitude increase in word-line density. Preliminary scribing and etching have produced lines as narrow as 0.1 mil with good edge definition over large areas. At line widths corresponding to some five wavelengths of light, the potential for batch-fabrication of multimillion bit arrays is accompanied by formidable problems in signal detection and interconnection. The technology of merging integrated semiconductor access circuitry and high-density film arrays presents considerable challenge with enormous dividends. It becomes reasonable, for the first time, to seriously consider the possibility of providing random-access static storage which is economically competitive with rotating mass memories.

ACKNOWLEDGMENT

Acknowledgment is made of significant contributions to this program by Robert Berger, Gilbert Gagnon, Elis A. Guditz, Charles Hoover, and Mark Naiman.

REFERENCES

1 J I RAFFEL
Future developments in large magnetic film memories
J Appl Phys 35 No 3 part 2 p 748-753 March 1964
2 J I RAFFEL
Operating characteristics of a thin film memory
J Appl Phys Supplement 30 No 4 p 60S-61S April 1959
3 J W FORGIE
A time- and memory-sharing executive program for quick-response on-line applications
Proceedings of the 1965 Fall Joint Computer Conference
4 J I RAFFEL  A H ANDERSON  H BLATT  T S CROWTHER  T O HERNDON
The FX-1 magnetic film memory
MIT Lincoln Laboratory Technical Report No 278 August 1962
5 H BLATT
Random noise considerations in the design of magnetic film sense amplifiers
MIT Lincoln Laboratory Group Report 1964-6 August 1964
6 J I RAFFEL  A H ANDERSON  T S CROWTHER  T O HERNDON  C E WOODWARD
A million bit memory module using high-density batch-fabricated magnetic film arrays
To be published Proceedings of the 1968 Intermag Conference
7 T S CROWTHER
Specifications and yields of composite magnetic films for a high density memory
To be published Proceedings of the 1968 Intermag Conference
8 E A GUDITZ  T O HERNDON  W J LANDOCH  G P GAGNON
Large area high density precision etched wiring
To be published Proceedings of the 1968 Electronic Components Conference Washington DC May 1968

A fast 2½ D mass memory

by C. C. M. SCHUUR
Philips' Gloeilampenfabrieken N.V.
Eindhoven, the Netherlands

INTRODUCTION

The mass memory described in this paper is a randomly addressable magnetic core memory having a storage capacity of 0.5 Megabytes.
Because of the many inherent advantages to be gained, such as low core-stringing costs, high speed, and negligible heat dissipation within the core stack, a 2½ D selection organization is used. Since the principle of this organizational structure has been adequately covered in recent literature,¹,²,⁴,⁵ it is assumed known and will not be discussed further herein.

In order to obtain maximum flexibility, the memory has been designed for five modes of operation as follows:

(1) Read: cycle time 1.3 μs, access time 1.2 μs
(2) Write: cycle time 1.2 μs
(3) Read-Restore: cycle time 2.5 μs, access time 1.2 μs
(4) Read-Modify-Write: cycle time 2.5 μs, access time 1.2 μs
(5) Clear-Write: cycle time 2.5 μs

The values listed above are very conservative. With the prototype, which is a completely populated memory, a cycle time of 2 μs and an access time of 1 μs can be obtained under adverse conditions. These conditions are:

a) Tamb = 5°C to 45°C;
b) worst case voltage tolerances of ±10%;
c) full worst pattern along word lines and bit lines.

There are no limitations imposed on the sequence in which addresses are selected, or the number of times each individual address is selected sequentially in any of these five operating modes.

The design of the input and output circuits is such that the corresponding inputs or outputs of several memories can be linked by means of a single cable. By linking corresponding address bit inputs it is possible to increase the number of bytes per memory word. By linking corresponding data inputs and outputs the number of storage locations can be increased. All the memory inputs and outputs are provided with gates. The gates can be controlled by the computer so that it is possible to use several memories or banks of memories interleaved, which greatly increases the effective byte transfer rate. Alternate memories or banks can be addressed every 100 ns.

As is evident from the foregoing, although the storage capacity is limited, the memory has been developed for use with larger mass memory systems. Small mass memories whose design is such that several of them can be combined to form a larger entity offer some significant advantages. They are relatively easy to mass produce, particularly the core stacks, and enable large mass memory systems to be tailored to the needs of the customer. When the memory modules in such a system are self-contained it is possible to effect repairs on a particular unit without having to switch off the whole system. One drawback of small 2½ D mass memories is that they require more electronic components per bit; however, measures can be taken to overcome this.

Because of their organizational structure, 2½ D memories of a given size and speed are cheapest when the word lengths are kept as short as possible. By choosing a word length of one byte (nine bits) a small mass memory can be made without sacrificing the inherently low cost per bit of a larger 2½ D mass memory having longer words. Should a mass memory of only limited storage capacity be required, it will not be possible to combine several standard mass memories to obtain more practical word lengths. However, it will be shown later that by making the most of the separate sense wires and the arrangement of word-line selection switches, words of up to 144 bits (16 bytes) in length can be made with the standard core stack. Under these operating conditions the byte transfer rate per memory increases from 0.4 Megabytes/s to 2.6 Megabytes/s. It will be realized that a corresponding increase in the necessary electronics is thus required, an increase in power consumption also occurring.
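The two transfer-rate figures just quoted follow directly from the cycle times (one byte per 2.5 μs read-restore cycle in the standard organization; the 6.1 μs cycle for 144-bit words is derived later in the paper). A quick check in Python:

    def byte_rate_mbytes_per_s(bytes_per_word, cycle_us):
        # One word is transferred per cycle; bytes per microsecond = Megabytes/s.
        return bytes_per_word / cycle_us

    print(byte_rate_mbytes_per_s(1, 2.5))    # 0.4  Megabytes/s, standard 9-bit words
    print(byte_rate_mbytes_per_s(16, 6.1))   # ~2.6 Megabytes/s, 144-bit (16 byte) words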
A new interconnection technique has made it possible to keep the core stacks very compact. In the event of failure, which is very unlikely, the defective part of the stack can be replaced in a short time. The word-line selection diodes (8704) can also be replaced quickly and easily. They are mounted on small printed-circuit cards which can be plugged into sockets mounted at the rear of the stack container, rather than on the stack itself or on the matrix planes.

The dimensions of the core stack and its associated electronics are 98 cm (39 in) height by 45 cm (18 in) width by 50 cm (20 in) depth. If power supplies are added these dimensions become 147 cm (59 in) height by 45 cm (18 in) width by 50 cm (20 in) depth. Since the heat dissipation within the core stack is low, ambient temperatures of up to 40°C can be tolerated; the lower temperature limit is 10°C. The standard memory (520,000 words of 9 bits) will be sold for well under 2 ct (U.S.)/bit.

Figure 1 - Schematic representation of the core array (nine sub-matrices, each with two sense wire pairs, across 2048 word lines and 256 bit lines per word-bit)

Core stack

The nth bits of all words are arrayed schematically in a rectangle of 256 bit-lines by 2048 word-lines. Nine of these rectangles make up the complete matrix of 2304 bit x 2048 word lines as shown in Fig. 1. Because a matrix of this size with 30 mil cores at 30 mil centers would be unwieldy, it has been divided into 36 sub-matrices. The nine sub-matrices in each column are arranged so as to make four sub-stacks. Each sub-stack contains 130,000 words each of nine bits. The word-lines, split into nine sections by the division into sub-matrices, can easily be reconnected by dip-soldering. A number of sockets mounted on each sub-stack locate the cards carrying the word-line selection diodes. Conventional wiring is used between the socket pins and the word-line terminals on the sub-stack.

Reconnecting the bit-lines separated by the subdivision is more complicated. Both ends of each bit-line in each sub-matrix are connected to conductors printed on polyimide foils. The correlated bit-lines of corresponding matrices in two adjacent sub-stacks can be interconnected by simply pressing the foils together, as shown schematically in Fig. 2. The two outermost sub-stacks are connected by pressing their foils against glass epoxy strips, on which are printed the same patterns of conductors as on the polyimide foils. These conductors are then joined by conventional wiring to the pins of the sockets for the cards carrying the bit-line selection diodes, which are mounted on the main stack frame. The frame also carries the clamping devices for pressing the polyimide foils together and the rails along which the sub-stacks slide into position. A main stack frame containing four sub-stacks is shown in Figs. 3a and 3b, a sub-stack in Fig. 4.

Figure 2 - Arrangement and interconnection of submatrices

Typical drive conditions for the cores used in this memory are: full current 700 mA (at 25°C), rise time 0.1 μs, pulse width 0.4 μs. Under these conditions, the typical response values are: rV1 = 53 mV, wVz = 6 mV (disturb ratio = 0.5), peaking time 0.2 μs, switching time 0.4 μs.
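The organization just described can be tallied directly; the following sketch (Python) reproduces the capacity figures, the text's 520,000 and 130,000 being rounded values:

    bit_lines, word_lines, bits_per_word = 2304, 2048, 9

    cores = bit_lines * word_lines                     # 4,718,592 cores in the stack
    words = (bit_lines // bits_per_word) * word_lines  # 256 * 2048 = 524,288 one-byte words
    words_per_substack = words // 4                    # 131,072 ("130,000 words of nine bits")
    megabytes = words / 2**20                          # 0.5 Megabytes

    print(cores, words, words_per_substack, megabytes)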
Selection of bit lines and word lines

As mentioned earlier, diode-matrix selection is used in this memory. For the selection of nine bit-lines, one out of each of the nine groups of 256 bit-lines, nine diode matrices of 16 by 16 diodes are required. The rise time of the drive current through a bit-line is approximately 250 ns. To prevent deterioration of the "one" signal owing to slowly increasing word-line drive currents, the rise time of these currents should be less than 300 ns. The word lines are thus divided into eight groups of 256 lines, each having its own diode matrix consisting of 16 x 16 diodes.

Figure 3a - Front view of core stack
Figure 3b - Rear view of core stack
Figure 4 - Substack

A complete circuit for the selection of one line from a group of four is shown in Fig. 5. The circuit is quite conventional and straightforward. To reduce the possibility of amplitude differences occurring in the read and write currents, a single voltage source and one set of resistors are employed. The voltage source is floating and balanced with respect to earth by means of a voltage divider, so that the voltage variation on the selection lines is only 6 V, thus minimizing the noise picked up by the sense wire. Capacitors shunted across the current determining resistors improve the current waveform. Auxiliary switches are employed to ensure short capacitor discharge times.

Figure 5 - Selection switches and diode matrix

The most suitable coupling devices between switches of this type and their drives are transformers, since they are particularly useful as a base current supply source; moreover they permit the use of very simple drive circuits. For selecting one line from a total of 256 lines, 64 selection switches are required. These form a self-contained unit mounted on a single card. Nine such units are required for the bit-line, and eight for the word-line selection.

Drive circuits and address decoding

Fig. 6 shows a combination of selection switches, drive switches and address decoders in simplified form. Two such combinations are employed in the memory; one selects the bit-lines, the other the word-lines. Current phasing as a means of core selection, though not used here, could also be controlled by the read- and write-drive switches.

Saturation of a selection-switch drive transformer, which may occur after several successive switching operations, is prevented by diodes Ds and Dm which ensure rapid demagnetization of the particular transformer after the termination of a read- or write-drive pulse. With both the address-drive and read- or write-drive switches broken, the primary winding of the transformer previously selected is isolated. Point A tends to become negative and is clamped to earth by D, whilst point B swings positively and is clamped to +12 V by D, resulting in a constant voltage of some 15 V being applied to the primary. This accelerates the decay of the magnetizing current through the winding.

Figure 6 - Drive circuits of selection switches

In each 4 x 4 matrix of address drive switches, each switch is responsive to two sets of address decoders; thus two sets of read- and write-selection switches are pre-selected. This occurs in each line-selection unit connected to one of these driver matrices. The final decision whether or not the read- or write-selection switches are to be driven is left to the read- and/or write-drive switch in each selection unit. Three functions can thus be assigned to the read- and write-drive switches:

(1) selecting either the read- or write-selection switches in all selection units, governed by the command pulses from the timing unit;
(2) selecting one unit from eight word-line selection units on the basis of three address bit levels;
(3) digit switching in the bit-line selection units.
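A compact way to view the three-level word-line selection just described (eight units, each a 16 x 16 diode matrix, giving 2048 lines) is as an arithmetic split of the address. A hypothetical sketch (the actual address-bit assignment is not specified in the paper):

    def word_line_select(addr):
        """Hypothetical decode of a word-line address (0..2047): one of
        eight selection units, then one of 16 switches and one of 16
        drivers in that unit's diode matrix. Real bit assignment may differ."""
        assert 0 <= addr < 8 * 256
        unit, within = divmod(addr, 256)
        switch, driver = divmod(within, 16)
        return unit, switch, driver

    print(word_line_select(2047))  # (7, 15, 15)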
Sense wires and amplifiers

Unlike other 2½ D mass memories described in the literature, the memory under discussion employs a separate wire for sensing the interrogated bits. The path followed by the two pairs of sense wires is shown in Fig. 7a. For the sake of simplicity a 2 x (128 x 512) sub-matrix is reduced in the figure to 2 x (8 x 32). By arranging the cores so that their axes are mutually parallel, good balancing and easy threading of the sense winding are obtained. The manner in which several sense wires are combined to form one sense winding is shown in Figs. 7b and 7c. Because of the method used to terminate the sense wires in this memory, perfect balancing of the sense winding is required. The type of termination employed eliminates distortion of the output signal owing to phase differences at the inputs of the differential preamplifiers.

Figure 7a - Sense wire through core array
Figure 7b - Combination of sense wires and preamplifiers
Figure 7c - Connection of sense wires to one preamplifier

One pre-amplifier is capable of handling as many as 65,000 bits; thus only eight pre-amplifiers are required per word-bit. Despite the large number of cores per sensing circuit, only 128 core pairs on each selected bit-line and 32 core pairs on every selected word-line contribute to the noise level. In order to limit the level of noise, only those preamplifiers coupled to the selected cores at a given instant are operational. The signal-to-noise ratio is improved by initiating the bit-line currents during a given operation some 250 ns before the word-line current. In order to compensate for delay and attenuation of the signal on the sense wire, the threshold level and strobing time of the sense amplifier are required to vary as a function of the address.

Variation of the threshold level is accomplished by a circuit (Fig. 8) which is actually a digital-to-analogue converter. The variation constitutes a correction of the main adjustment, which is determined by the setting of the manually variable voltage Vthres. With a threshold circuit of this type it is relatively simple to feed back a fraction of the core driving voltage to the threshold circuit, so that a considerable improvement in operating tolerances is obtained. This could be done by a resistor network which includes a thermistor to render the threshold level insensitive to variations of the core driving voltage arising from temperature changes in the core stack.

Figure 8 - Circuit to vary sense amplifier threshold as a function of the address

The sense amplifier strobe time can be varied as a function of the address by means of the circuit shown in Fig. 9a. In the circuit, VA is the output voltage of a digital-to-analogue converter, whilst a sawtooth voltage VB is applied to the other input of the long-tailed pair.

Figure 9a,b - Circuit to vary sense amplifier strobing as a function of the address
The resulting output voltage is a positive-going pulse (Fig. 9b) of which t_on varies with the address of the selected word, and t_off is constant. Reference to the oscillograms of Fig. 10 shows the output voltages from the various addresses, together with the bit and word currents and strobing and data register outputs.

Figure 10 - Read command (oscillograms)

Experience gained during manufacture of the sub-matrices has completely vindicated the choice of an extra wire for sensing. The cost of the extra wire has proved to be less than the cost of the transformers that would otherwise be required.³ The external core-stack wiring is much simpler and much less critical with regard to noise pick-up.

Longer words

Two methods can be adopted in the memory to increase the number of bits per word. The length of the words is limited by the number of sense wire pairs (144).

One technique is illustrated in Fig. 11. Here, use is made of the fact that eight separate word-line selection units are employed. When nine bit-lines are driven concurrently with four word-lines rather than one, 36 bits can be interrogated at a time. This is possible if the bits are situated on 36 different sense wires and the output voltages of the sense wires are staticized by 36 different registers. A word of 36 bits has to be stored in four shifts of nine bits, each shift occurring in 1.2 μs; this operation is done independently by the memory. The memory is thus rearranged to produce 130,000 words each of 36 bits with an access time of 1.2 μs and a cycle time of 1.3 μs + 4.8 μs = 6.1 μs. The bit transfer rate increases from 3.6 x 10^6 bits/s to 5.9 x 10^6 bits/s. When for instance two word-lines are driven concurrently with nine bit-lines, 18 bits can be interrogated at once, the storage time being 2.4 μs. It is thus possible with this technique to multiply the number of bits per word without a proportional increase in heat dissipation occurring. A drawback is the multiplication of the cycle time; however, the short access time is maintained.

Figure 11 - Combination of bit-line and word-line selection units to interrogate 36 bits at a time in an essentially 9-bit organization

It is also possible to increase the number of bits per word by increasing the number of bit-line selection units. Referring to Fig. 12, part of a sense wire array of 128 x 512 addresses is shown. P and Q are selection units for the 64 bit-lines, and N is a word-line selection unit. A combination of two corresponding bit-lines, for example p1 and q1, with any word-line selects a pair of cores located on two different sense wires. The bit-line selection units P and Q are not, however, correlated solely to a particular word-bit R or S. By suitable addressing, P can select a core of either word-bit S or R; Q then selects a core of either word-bit R or S. The address has thus to be decoded, the output of the decoder being fed to the digit switch whose function is to direct the digit information into the correct channels.

Figure 12 - Combination of bit-line selection units and sense wires in a 36-bit organization (selection diodes of the decoder omitted for simplicity)

One item in the stack which has to be changed is the wiring between the bit selection diodes and the bit-lines; however, the standard sub-stacks can be used without any wiring or constructional changes. A memory is thus realized which has a capacity of 130,000 36-bit words, an access time of 1.2 μs and a cycle time of 2.5 μs.
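The timing arithmetic of the first technique can be summarized in a few lines (Python; the model simply follows the figures quoted above):

    def shifted_word_mode(word_lines_driven, bits_per_shift=9, read_us=1.3, shift_us=1.2):
        # n word lines give 9n bits per access; the word is stored back in n
        # shifts of 1.2 us each, on top of the basic 1.3 us read cycle.
        bits = bits_per_shift * word_lines_driven
        cycle_us = read_us + word_lines_driven * shift_us
        return bits, cycle_us, bits / (cycle_us * 1e-6)

    print(shifted_word_mode(1))   # (9, 2.5, 3.6e6 bits/s)   - standard organization
    print(shifted_word_mode(4))   # (36, 6.1, ~5.9e6 bits/s) - four word lines at once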
The two methods described in the foregoing can be combined to make a mass memory of 32,000 words each of 144 bits, with an access time of 1.2 μs and a cycle time of 6.1 μs. In such a system the bit transfer rate is 23.6 x 10^6 bits/s.

Power supplies and protections

The power supplies are designed to operate from a three-phase a.c. mains supply. All the d.c. outputs

Tolerances

Operating tolerances of a memory system can be represented by means of an n-dimensional Schmoo diagram, where n is almost limitless. Only a few of these variables are of practical interest. For this memory a two-dimensional Schmoo diagram has been made, having the sense amplifier threshold plotted along one axis and the voltages corresponding to the core drive currents along the other (Fig. 13). In plotting the graph all other voltages were set to their nominal values (-6 V, +12 V and 30 V). During the measurements the ambient temperature of the memory was kept at the upper limit of 45°C, at which level the noise is at a maximum.

Figure 13 - Schmoo diagram (sense amplifier threshold in volts versus drive current determining voltage). No feed-back from core drive voltage to sense amplifier threshold.

Mechanical design

All electronic components are mounted on double-faced, epoxy-glass printed circuit cards. The cards carrying the selection and drive switches, read amplifiers and timing circuits measure 32 x 25 cm (12½ x 10 in). The cards slide into a compartment which can be mounted in a standard (19 in) rack. Cable terminating resistors and cable amplifiers are mounted on cards of 32 x 13 cm (12½ x 5 in) which are housed in a second compartment also containing the cable connectors.

To take full advantage of the low heat dissipation within the stack container - 4.5 W in a volume of 84 litres (22 U.S. gallons) - the container is situated at the base of the cabinet. The compartments housing the memory electronics are located immediately above the stack container. The power supplies are accommodated in two such compartments which occupy the upper part of the cabinet, together with another unit housing the transformers and certain control circuits. Disposition of the various sub-assemblies in this manner renders forced cooling unnecessary. The space occupied by a complete memory (less cabinet) measures 147 cm (59 in) height by 45 cm (18 in) width by 50 cm (20 in) depth.

CONCLUSIONS

Well-designed and properly terminated sense wires enable the manufacture of a fast and relatively inexpensive mass memory to be realized. Although the application of sense wires usually implies a limitation to one particular word length for a given storage capacity, it transpires that by purely electronic means a large range of word lengths can be covered using standard sub-stacks. This is a very important factor in mass memory design, since the core stack is invariably the most difficult and most expensive part to develop and manufacture. Provided therefore that sufficient attention is given to the interface circuitry, small economic mass memories can be manufactured on a mass production basis to form part of larger memory systems.
ACKNOWLEDGMENT

Thanks are expressed to Mr. P. J. Baker of our Technical Publication Dept. who edited the manuscript.

REFERENCES

1 R J PETSCHAUER  G A ANDERSEN  W J NEUMANN
A large capacity low cost core memory
Presented at the IFIP Congress in New York N Y May 1965
2 T J GILLIGAN  P B PERSONS
High speed ferrite 2½ D memory
Proceedings of the 1965 Fall Joint Computer Conference
3 A M PATEL  J W SUMILAS
A 2½ D ferrite memory sense amplifier
IEEE Journal of Solid-State Circuits Vol SC-1 No 1 September 1966
4 J REESE BROWN
First- and second-order ferrite memory core characteristics and their relationship to system performance
IEEE Transactions on Electronic Computers Vol EC-15 No 4
5 2361 Core storage original equipment manufacturers information
IBM Publication SRL-28-822-6869-1

A magnetic associative memory*

by TSE-YUN FENG**
Syracuse University
Syracuse, New York

*The work reported here was supported by the United States Air Force Contract AF 30(602)-3546.
**Formerly with the University of Michigan, Ann Arbor, Michigan.

INTRODUCTION

Hundreds of technical papers and reports on associative systems have been published since 1956, when the first electronic associative memory was reported. So far the use of such a system is still very limited because of its cost. But the tremendous technological advances made in the batch fabrication of thin films and integrated circuits over the last few years would evidently reduce the absolute cost of an associative system and enhance its usefulness.

Much work has been done in the area of magnetic associative memories.¹⁻¹⁹ However, most of the memory schemes proposed have a low "per-bit mismatch-to-match signal ratio,"*** which is one of the principal factors in determining the ultimate size (or speed) of the memory. In these systems the delta noise due to the reversible switching of the magnetic material is compensatable; thus, with appropriate compensation techniques, it is possible to make the per-bit mismatch-to-match signal ratio independent of the signal-to-noise ratio of the magnetic device. As a result, the per-bit mismatch-to-match signal ratio is greatly increased. It can also be shown that an associative memory may have the ability of detecting the neighboring codes of a given code if a similar signal compensation technique is employed. In the following scheme the implementation can be achieved by many kinds of devices. However, a 4 x 4 memory model using multiaperture cores was constructed and tested. Experiments performed on the memory model confirmed the effectiveness of the compensation techniques.

***The "per-bit mismatch signal" is the signal produced at a word sense wire when only one bit of the word is interrogated and the information stored in the bit is different from that of the interrogated bit. Similarly, the "per-bit match signal" may be defined as the word signal produced when the information stored in the bit matches that of the interrogation bit. In most cases it is the signal-to-noise ratio of the associative cell.

Memory scheme

For a magnetic core, if the maximum induced voltage due to the irreversible switching of flux is normalized to 1 and that due to the reversible switching (delta noise) is δ, the maximum possible signal-to-noise ratio is

    (1 + δ)/δ > p,     (1)

where p is the largest integer to satisfy this relation. If parallel-by-bit interrogation is assumed, p is also the maximum possible number of bits per word that an associative memory may have before the noise masks the information-bearing signal. Equation 1 indicates that in order to increase p, δ must be small. Since δ is an inherent property of the magnetic material and cannot be eliminated, it must be compensated externally. Suppose that there is a noise generator that would produce a noise of δ when interrogated. Its sense wire is connected in series with a memory bit but opposite in polarity. The output during interrogation would then be (1 + δ) − δ = 1 when the memory bit changes state, and δ − δ = 0 when the memory bit does not switch. The per-bit mismatch-to-match signal ratio would for this case be infinite.
In reality, the total compensation of delta noise is very difficult (if not impossible). This is mainly due to the nonhomogeneity of the material and other factors involved in the fabrication of the memory devices. Suppose that the maximum noise that cannot be compensated by the noise generator is ε (ε < δ). Then the per-bit mismatch-to-match signal ratio would be, in the worst case:

    (1 − ε)/ε > p     (2)

Experimental results show that the per-bit mismatch-to-match signal ratio is greatly increased to 52/0.1 = 520 by the noise compensation technique (Fig. 1), as compared to 52/4 = 13 for the uncompensated case (Fig. 2).

Figure 1 - The per-bit mismatch and match signal voltages (1 μsec/div); (a) noise voltage

Figure 2 - The signal and noise voltages of a multiaperture core
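In terms of Equations 1 and 2, the measured ratios translate directly into maximum word lengths; a small Python illustration (the 52, 4 and 0.1 readings are taken from the figures quoted above, with units assumed to be millivolts):

    def max_word_bits(mismatch_mv, noise_mv):
        # Largest integer p with (mismatch/noise) > p, per Equations 1 and 2.
        ratio = mismatch_mv / noise_mv
        p = int(ratio) if ratio != int(ratio) else int(ratio) - 1
        return ratio, p

    print(max_word_bits(52, 4.0))   # ratio 13   -> p = 12 bits per word
    print(max_word_bits(52, 0.1))   # ratio ~520 -> p of several hundred bits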
Figure 3 shows the circuit configuration of the associative memory with noise compensation. The two states of an associative cell are defined by Fig. 4. Thus, each cell may consist of a flip-flop and a few gates, or two multiaperture cores (or their equivalent). The vertical lines of the 4 x 4 memory are the interrogation wires and the horizontal lines are the row sense wires. The memory interrogation pulses have three states as given by Table I. There is no 11 state since both interrogation wires are never energized simultaneously during the searches. The symbol ø is used here for the masked bits. The relations between input, store, and output may be expressed by Table II. The choice of 0 output for a matched condition simplifies the detection structure of the associative memory.

TABLE I - Interrogation pulse states

Input of i-th bit    Interrogation pulses (Ii1, Ii0)
1                    1  0
0                    0  1
ø (masked)           0  0

TABLE II - Relations between memory input, store, and output

Input    Store     Output
0        0         0
0        1         1
1        0         1
1        1         0
ø        0 or 1    0

Figure 3 - Circuit configuration of a noise-compensating and error-checking associative memory

Figure 4 - Two states of an associative cell ("1" bit and "0" bit). M: signal output when interrogated (1 + δ); m: noise output when interrogated (δ)

In Fig. 3 the memory has four 4-bit words with C11C12C13C14 = 1100, C21C22C23C24 = 0110, C31C32C33C34 = 1110, C41C42C43C44 = 0100, as defined by Fig. 4. The row noise generator uses the same types of cells as those in the memory, but for each cell only the portion that would generate delta noise is interrogated and sensed. Of course, this is not always necessary. For example, when transfluxors are used in the scheme, each memory cell requires two transfluxors, but each noise cell requires only one transfluxor, which is in the blocked state.

Equality search

As an illustration of equality search, suppose that a pattern of 1110 is stored in the association register (Fig. 3). Then, during the interrogation period the interrogation wires I11, I21, I31, I40 will be energized. The normalized induced voltages at the row sense wires are N1 = 1 + 4δ, N2 = 1 + 4δ, N3 = 4δ, N4 = 2 + 4δ. But these voltages are balanced by a voltage of (−4δ) induced at the sense wire Nr. Therefore, the net voltages relative to ground appearing at the N wires are 1 ± 4ε, 1 ± 4ε, ±4ε, 2 ± 4ε, where ε represents the maximum noise that cannot be compensated by the noise generator. Hence, for an exact match there will be no (or little) voltage at a sense wire, and there will be varying degrees of voltage differences in all other cases. If these voltages are normalized by a threshold device, then we have N1 = 1, N2 = 1, N3 = 0, N4 = 1, where the 0 indicates the equality match. Experimental results on equality search are shown in Fig. 5. Figure 5(a) is the 4-bit match signal without noise compensation. After noise compensation the 4-bit match signal is reduced to that shown in Fig. 5(b).

Figure 5 - Match signals from a 4-bit word (1 μsec/div): (a) without noise compensation; (b) with noise compensation

Inequality search

Suppose that 1110 is still stored in the association register and we search for inequality responses. The function of the memory is the same as before. In order to detect the inequality condition by the same circuit used for equality search, AND gates are added. Figure 6 shows one possible arrangement. For equality searches the upper AND gate of each word is activated so that both the N and N̄ wires have the same state. But during the inequality search period the negation line is activated so that the states of the N̄ wires are inverted from those of the N wires. Thus, we have N̄1 = 0, N̄2 = 0, N̄3 = 1, N̄4 = 0 instead of 1101 for the previous case. Here we have multiple responses for inequality match. Figure 7 shows the 1-bit mismatch signals from a 4-bit word.

Figure 6 - AND gates for simple searches

Figure 7 - 1-bit mismatch signals from a 4-bit word (1 μsec/div): (a) without noise compensation; (b) with noise compensation

Similarity search

Suppose that the same pattern 1110 is stored in the association register as before but the third bit of the search word is masked off, i.e., the word to be searched is 11ø0. (Such a search when applied to one field is known as the similarity search.) Then during the interrogation period the interrogation wires I11, I21, I40 will be energized. The new output voltages from the sense wires will now be 0 ± 3ε, 1 ± 3ε, 0 ± 3ε, 1 ± 3ε, and the states of the N wires are N1 = 0, N2 = 1, N3 = 0, N4 = 1. Again we have multiple responses to our search. The output voltages from the word sense wires are similar to those shown in Fig. 5(b) and Fig. 7(b).

Proximity search

The proximity search is defined here as a match such that the interrogated field of the memory word differs from that of the search word by a predetermined distance.* Such a match is very useful, particularly when redundancy techniques are applied
* Such a match is very useful, particularly when redundancy techniques are applied *The distance between two code points in an m-dimensional space is given by the number of coordinates (or bits) by which they differ. L~ to the associative memory to increase its reliability. Thus, the proximity match for distance 1 would detect single errors in the system. The proximity match for higher distances would detect higher-order errors provided that it is permitted by the per-bit mismatchto-match signal ratio and the redundant codes used. The proximity search is also useful in some pattern recognition schemes. Referring to Fig. 3 and the illustration on equality search we may summarize the results in Table III. Table III indicates that the memory words No.1 and No. 2 differ from the search word by a distance of 1, thus their signal outputs are 1 ± 4E. Similarly, signal outputs from words No.3 and No.4 indicate a match (distance 0) and a distance of 2 respectively. Thus, if we introduce a cell which would generate a signal of (-1) in the row noise generator to balance the signal outputs, the same detection circuit would be able to detect the proximity match for distance 1. This is done by connecting a cell Cp to the row noise generator shown in the lower right corner of Fig. 3. The proximity - match switch Sp will be closed to the C p sense wire which has an induced voltage of (-1)* during the interrogation period. The result of proximity match for distance 1 is shown in Table IV. The' term in the signal output comes from the switching signal compensation. The compensated signal voltage , was measured to be about 2 mv (Fig. 8) in our experiments. i'.,1agnetic Associative rv1emoiY 279 1 Il sec/div Fig. 8 Compensated signal vo~tage TABLE III - Output of equality search Signal Output Search Word Memory Word #1 #2 #3 #4 I ±4E I ±4E 0±4E 2±4E I 100 10 I 1 10 o 100 oI - 1 Il sec/div (a) Proximity match for distance 1 TABLE IV - Output of proximity search Search Word I I 10 \! Signal Outpu I! Memory Word #1 #2 #3 #4 1 100 o I 10 I 1 10 0100 I: I' II'I 0±4E±~ 0±4E±~ I -I ±4E±~ I ±4E±~ It Figure 9 shows the experimental results on proximity match for distances I and 4 from a 4-bit word. 1;.;. sec/div (b) Proximity match for distance 4 *Strictly speaking the induced voltage is (-I ± E). Fig. 9 Proximity match signals for distances 1 and 4. 280 Spring Joint Computer Confere~ce, 1968 CONCLUSION This paper demonstrates the effectiveness of the compensation scheme. Evidently the memory scheme can be used in large systems with high reliability. REFERENCES Word Block :# IU Row Noise Generator # I Word Block # I L ••.••...•.•••.••••..•.•.• I Word Block #2U Row Noise Generator #2 -: :t a. ::t o Word Block :# 2 L . . ..... ........ -..... !. ""' ., ~ ~ Word Block:#wU Row Noise Generator :# w Word Block #wL Figure 10 - A noise-compensating and error-checking associative memory The complete associative memory scheme In the previous sections an associative memory with noise and signal compensations have been described. The experimental results as shpwn by the oscillograms were· exceptionally good. However ,- due to the 'small number of words involved, the propagation delay and attenuation effects in this model were negligible. A complete associative memory scheme taking the propagation and attenuation effects into consideration is shown in Fig. 10. The configuration of Fig. ~ is only a portion of Fig. 10 which consists of an association register, a mask, the word biock No.1 U and the row noise generator No.1. 
1 A APICELLA J FRANK BILOC - A high speec NDRO one core per bit associative element Intermag 1965 2 J R BROWN JR A semipermanent magnetic associative memory Proceedings of the Confereqce on Nonlinear Magnetics 1961 3 W F CHOW L M SPANDORFER Plated wire bit steering for logic and storage Proceedings of the 1967 Spring Joint Computer Conference 4 W F CHOW Plated wire content-addressable memories with bit-steering technique IEEE Trans on Electronic Computers October 1967 5 R H FULLER J C TU R M BIRD A woven pl~ted-wire associative memory National Aerospace Electronics Conference 1965 6 J GOLDBERG M W GREEN Large files for information retrieval based on simultaneous interrogation of all items Proceedings of the Symposium on Large-Capacity Memory Techniques for Computing Systems 1961 7 R T HUNT D L SNIDER J SUPRISE H N BOYD Study of elastic switching for associative memory systems Contract AF 30(602)-3103 Report 1964 8 J R KISEDA H E PETERSEN W C SEELBACK M TEIG A magnetic associative memory IBM Journal April 1961 9 R R LUSSIER R P SCHNEIDER All magnetic content addressed memory Electronic Industries March 196~ 10 J E MCATEER J A CAPOBIANCO R L KOPPEL Associative memory system inmplementation and characteristics Proceedings of the 1964 Fall Joint Computer C'onference 11 W L MCDERMID H E PETERSEN A magnetic associative memory system IBM Journal January 1961 12 R C MINNICK Magnetic comparators and code converters Symposium on the Application of Switching Theory in Space Technology 1962 13 M NAIMAN Content-addressed memory using magnetoresistive readout of magnetic thin films • Intermag 1965 14 J I RAFFEL T S CROWTHER A proposalfor an associative memory using magnetic films IEEE Trans on Electronic Computers October 1964 15 A D ROBBI R RICCI Transfluxor content-addressable memory Intermag 1964 16 C A ROWLAND W 0 BERGE A 300 nanosecond search memory Proceedings of the 1963 Fall Joint Computer Conference A Niagnetic Associative Niemory 17 D 0 SMITH K J HARTE Content-addressed memory using magneto- or electro-optical interrogation IEEE Trans on Electronic Computers February 1966 and June 1967 18 G T TUTTLE '"'01 ",01 How to quiz a whole memory at once Electronics November 15 1963 19 E L YOUNKER C H HECKLER JR D P MASHER J M JARBOROUGH Design of an experimental instantaneous response file Proceedings of the 1964 Fall Joint Computer Conference Selection and implementation of a ternary switching algebra* by ROBERT L. HERRMANN Aerospace Division; Honeywell Incorporated Minneapolis, Minnesota m, " INTRODUCTION 0, 1, or 2, and,!he symbols a, 8, fn' and superscripted variables denote functions of one or more variables. The symbol = is used in its usual sense to denote equality of two expressions. An n-ary operation is a function of n variables that is one of the basic functions defining an algebra. Manipulating functions within an algebra is governed by a set of theorems that apply to the particular. albegra. As stated above, one of the desirable features of a practical switching algebra is the ability to manipulate expressions to reduce the amount of hardware required to implement a given function. Theorems that aid manipulation of expressions should therefore play an important role in the selection of a suitable ternary switching algebra. During the last few years there .has been a growing interest in non-binary switching algebras. A number of such algebras have been proposed,t-6 but so far they have had little practical application. 
Requirements for a practical switching algebra include economical circuits for the basic operations, the ability to implement an arbitrary function with a reasonable number of basic circuits, and a suitable set of rules to allow the logic designer to form and manipulate expressions without undue mathematical gymnastics. These are rather vague criteria for a "practical" switching algebra, but they do serve as guidelines for attempts to establish suitable algebras.

Among non-binary switching algebras, ternary switching algebras have created the most active interest. The possibility of using positive, zero, and negative voltages or currents for representing three logic values appears appealing. The information transmitted by each wire in a system may be increased without employing complicated level detectors and without sacrificing noise rejection. Complementary transistor configurations can be used to perform the logic functions.

Any single-valued function in an algebra with a finite number of discrete truth values, including the basic operations of the algebra, may be expressed in tabular form by a truth table. Hence, all possible basic operations for a ternary switching algebra may be expressed in this way. Furthermore, such a table completely defines the function. To establish a practical ternary switching algebra, a suitable set of operations must be selected from the set of all possible operations. Methods employed by the author to select a suitable set are discussed in this paper.

In the following discussion, the integers 0, 1, and 2 denote the three truth values (constants); capital letters A, B, ... denote variables that assume the values 0, 1, or 2; and the symbols α, δ, fn, and superscripted variables denote functions of one or more variables. The symbol = is used in its usual sense to denote equality of two expressions. An n-ary operation is a function of n variables that is one of the basic functions defining an algebra. Manipulating functions within an algebra is governed by a set of theorems that apply to the particular algebra. As stated above, one of the desirable features of a practical switching algebra is the ability to manipulate expressions to reduce the amount of hardware required to implement a given function. Theorems that aid manipulation of expressions should therefore play an important role in the selection of a suitable ternary switching algebra.

Algebraic rules

Before embarking on a discussion of algebraic rules, it is necessary to define the constituents of such an algebra. They are:

1) The set of constants {0, 1, 2}.
2) A set of symbols {A, B, ..., N} to represent variables.
3) A set of basic operations on one or more variables with appropriate symbols denoting mappings into the set of constants.
4) A set of rules governing the format of expressions in the algebra.

The approach taken in this paper to determine a suitable ternary switching algebra is to first establish a desirable set of manipulation rules, and then to search for a set of basic operations that best satisfies these rules. The truth table is the most straightforward and easily understood method of representing a function. Many algebraic rules may be introduced as constraints on the truth tables representing the basic operations.

The associative law for a single binary operation α,

    Aα(BαC) = (AαB)αC = AαBαC     (1)
The binary operations AND and OR of conventional switching algebra are generalized in this way, as evidenced by the existence of multi-input AND and OR gates. These two laws also play an important role in the manipulation of algebraic expressions. The idempotent law

    A α A = A   (3)

can also be expected to aid simplification of algebraic expressions by sometimes reducing a multiplicity of a variable to a single occurrence. With these three laws as constraints, the number of possible operations is reduced to a reasonable number.

The effect of these laws on the truth table of a general binary operation α will first be considered. The idempotent law requires that the diagonal of the truth-table matrix consist of the constants 0, 1, and 2, in that order. The commutative law requires that the matrix be symmetric. Table I illustrates these constraints on the truth table.

    TABLE I. Generalized commutative and idempotent binary operation

    α | 0  1  2
    --+--------
    0 | 0  a  b
    1 | a  1  c
    2 | b  c  2

The effect of the associative law on the truth table is not as obvious, and can best be determined by trial-and-error testing of the associative law on the 27 possible truth tables of the form shown in Table I. This was done by the author with the aid of a computer program.7 Nine of the 27 operations were found to be associative. The truth tables for these nine are shown in Table II. One or more of these nine binary operations, along with one or more suitable unary operations, must be selected to form a suitable set of basic operations.

Laws involving pairs of binary operations are considered next. These determine which combinations of binary operations work well together in terms of expression manipulation and simplification. The distributive and anti-distributive laws for a pair of binary operations α and β,

    (A α B) β (A α C) = A α (B β C)   (4)
    (A β B) α (A β C) = A β (B α C)   (5)
    (A β B) α (A β C) = A α (B β C)   (6)
    (A α B) β (A α C) = A β (B α C)   (7)

occur frequently in useful algebras, and may be useful in a switching algebra. They aid in manipulating and simplifying expressions by reducing the number of occurrences of a variable or subexpression. Idempotency is assumed in writing Equations (4) through (7). Equations (4) and (5) are the distributive laws; Equations (6) and (7) are the anti-distributive laws.

As with the associative law, Equations (4) through (7) may be tested for validity on each of the 36 distinct pairs of binary operations from the set of nine in Table II. Twelve of the 36 distinct pairs were found by the author7 to satisfy both Equations (4) and (5). Six additional pairs were found to obey either Equation (4) or (5), but not both, and none were found to satisfy Equations (6) or (7). Referring to Table II, the pairs that satisfy both Equations (4) and (5) are:

    P1  = {IIb, IIc}    P2  = {IId, IIh}    P3  = {IIe, IIg}
    P4  = {IIb, IIh}    P5  = {IIc, IIg}    P6  = {IId, IIe}
    P7  = {IIa, IId}    P8  = {IIa, IIe}    P9  = {IIf, IIh}
    P10 = {IIb, IIf}    P11 = {IIc, IIi}    P12 = {IIg, IIi}   (8)

These 12 pairs are not truly distinct; among the 12 pairs, isomorphisms exist. Two pairs are isomorphic if one pair can be converted into the other simply by interchanging the constants 0, 1, and 2, along with the appropriate permutation of rows and columns in the truth tables. The validity of an algebraic rule does not depend upon the choice of symbols used to denote the constants, as long as the appropriate symbols are used in the rule, so all pairs in an isomorphic set must obey the same total set of rules.
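The searches just described are small enough to repeat mechanically. The sketch below is a present-day Python reconstruction, not the author's original program of Ref. 7: it enumerates the 27 candidate tables of Table I, keeps the associative ones, and then tests the distributive laws (4) and (5) on all pairs from that set.

    from itertools import product

    # A candidate operation is a symmetric 3x3 table with diagonal 0, 1, 2
    # (commutativity and idempotency built in); a, b, c are the free
    # entries of Table I.
    def make_op(a, b, c):
        return [[0, a, b],
                [a, 1, c],
                [b, c, 2]]

    def associative(op):
        T = range(3)
        return all(op[op[x][y]][z] == op[x][op[y][z]]
                   for x in T for y in T for z in T)

    ops = [make_op(a, b, c) for a, b, c in product(range(3), repeat=3)]
    assoc = [op for op in ops if associative(op)]
    print(len(assoc))                  # 9 -- the operations of Table II

    # Distributive law (4): A alpha (B beta C) = (A alpha B) beta (A alpha C);
    # law (5) is the same test with the two operations interchanged.
    def distributes(f, g):
        T = range(3)
        return all(f[x][g[y][z]] == g[f[x][y]][f[x][z]]
                   for x in T for y in T for z in T)

    pairs = [(i, j) for i in range(9) for j in range(i + 1, 9)
             if distributes(assoc[i], assoc[j]) and distributes(assoc[j], assoc[i])]
    print(len(pairs))                  # 12 -- the pairs of Equation (8)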
The 12 pairs of binary operations found to obey both distributive laws may be partitioned into four isomorphic sets of pairs.7 Referring to Equation (8), these isomorphic sets are:

    G1 = {P1, P2, P3}
    G2 = {P4, P5, P6}
    G3 = {P7, P8, P9}
    G4 = {P10, P11, P12}   (9)

[TABLE II. Set of associative, commutative, and idempotent binary operations: the truth tables of the nine operations (a) through (i); in particular, operation (b) is min(A, B) and operation (h) is max(A, B)]

One pair in an isomorphic set may be preferable from an implementation standpoint if the symbols 2, 1, and 0 are assumed a priori to be represented by positive, zero, and negative voltages respectively.

So far only binary operations have been considered. Equally important is the selection of suitable unary operations. Using the truth-table approach, there are 3^3 = 27 possible unary operations. The identity mapping and the mappings into a single constant are obviously of no value in a switching algebra. Eighteen of the possible unary operations map into two of the three truth values, and five map onto the full set {0, 1, 2}.

Unary operations in the "truncating" class that map into two truth values have been proposed by several other authors.3,5,6 However, when one attempts to form a "sum-of-products" or "product-of-sums" expression for a function in an algebra using this class of unary operations, three unary operations are required and the "terms" in the expression are two-valued. As an example, consider the three unary operations J0, J1, and J2 defined in Table III.

    TABLE III. Truncating unary operations

    A | J0(A)  J1(A)  J2(A)
    0 |   2      0      0
    1 |   0      2      0
    2 |   0      0      2

These three unary operations may be used in defining an expansion theorem, to be discussed later, that results in a sum-of-products form of expression for all possible functions (see Equation 16). Each term in such an expression is a logical product of a constant and unary functions "J" of each of the independent variables. With the exception of the constant, the constituents of each term can assume only the values {0, 2}. Ternary information is regained by the ternary logical sum of some terms which can assume the values {0, 1} and some which can assume the values {0, 2}, as dictated by the constant in each elementary product. It should be noted in passing that DeMorgan-type laws exist for J0 and J2, but not for J1, using the Post binary operations (P4 in Equation (8)).

By using nontruncating unary operations, a similar approach to expressing functions may be taken using single applications of only two unary operations and retaining three-valued terms in a sum-of-products expression. The five remaining unary operations, those that map onto the full set {0, 1, 2}, are shown in Table IV.
    TABLE IV. Nontruncating unary operations

    A | A¹  A²  A³  A⁴  A⁵
    0 | 1   2   2   1   0
    1 | 2   0   1   0   2
    2 | 0   1   0   2   1

The superscripts denote application of the unary operation to the variable A. Adopting for the moment the symbol ′ to denote a general unary operation, the operations A¹ and A² in Table IV obey the law

    ((A′)′)′ = A   (10)

and the operations A³, A⁴, and A⁵ obey the law

    (A′)′ = A.   (11)

Assuming that one or more unary operations are necessary, it should be possible to manipulate their position within an expression in conjunction with the selected set of binary operations. The laws that allow manipulation of unary operations in conventional switching algebra are the DeMorgan laws:

    (A + B)′ = A′ · B′        (A · B)′ = A′ + B′   (12)

A set of laws of this general type should be sought in the selection of unary operations for a ternary switching algebra. A set of possible "DeMorgan-type laws" for the unary operation is shown in Equation (13):

    (A α B)′ = A′ α B′    (a)
    (A α B)′ = A′ β B′    (b)
    (A α B)′ = A″ α B″    (c)
    (A α B)′ = A″ β B″    (d)
    (A α B)′ = A′ α B″    (e)
    (A α B)′ = A″ β B′    (f)
    (A α B)′ = A α B′     (g)
    (A α B)′ = A β B′     (h)
    (A α B)′ = A α B″     (i)
    (A α B)′ = A β B″     (j)   (13)

Each of the above rules, both as stated and with α and β interchanged, was tested on each of the 12 pairs of binary operations in Equation (8) for each unary operation in Table IV. It was found that the isomorphic sets G1 and G2 in Equation (9) satisfied Equation (13b) and its inverse (with α and β interchanged) for one of the unary operations in Table IV; G3 and G4 satisfied only Equation (13b) or its inverse, but not both, and for only one of the unary operations in Table IV. This indicates that one of the pairs of binary operations in G1 or G2 should be selected as the binary operations.

Functional completeness

As yet, the required number of unary and binary operations has not been determined. Since each basic operation requires a fundamental circuit for implementation of the algebra, the total number should be kept small. An absolute minimum number may be undesirable if unnecessary complexity of expressions results; a compromise may be made if one or two additional operations significantly reduce expression complexity. Functional completeness, the ability of an algebra to express all possible functions of k variables for all k, dictates a minimal set of operations. A direct proof of functional completeness of a given set of operations is mathematically involved,1 but it may be proven indirectly by showing that every operation of a known functionally complete algebra, such as ternary Post algebra,8 may be produced by a finite expression in the algebra in question. The basic operations ⊕, ·, and A′ of Post algebra are shown in Table V.

    TABLE V. Ternary Post algebra

    A | A′      · | 0 1 2      ⊕ | 0 1 2
    0 | 1       0 | 0 0 0      0 | 0 1 2
    1 | 2       1 | 0 1 1      1 | 1 1 2
    2 | 0       2 | 0 1 2      2 | 2 2 2

They are identical to the operations shown in Tables II(b) and II(h) and the operation A¹ in Table IV. The two binary operations are also identical to one of the pairs found to be idempotent, associative, commutative, and distributive, and belong to isomorphic set G2.

Two binary operations seem to be a good choice, based on the algebras that have already been proposed.1-6 From the results of the distributive test listed in Equation (8), it is easily shown that no three of the operations in Table II are all distributive with each other. An algebra with a single binary operation, analogous to conventional NAND logic, could be established, but expression complexity would be increased. Based on the results of the DeMorgan-law tests, the choice is between the pairs in the isomorphic groups G1 and G2. It is not obvious from algebraic considerations which group contains the best pair of operations.
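The DeMorgan-law test is equally mechanical. The sketch below assumes the Post operations min and max for α and β and the unary tables as reconstructed in Table IV above; under these assumptions it singles out A³ as the only operation obeying (13b) together with its inverse.

    alpha = min                         # Post product, Table II(b)
    beta  = max                         # Post sum, Table II(h)
    unaries = {'A1': [1, 2, 0], 'A2': [2, 0, 1], 'A3': [2, 1, 0],
               'A4': [1, 0, 2], 'A5': [0, 2, 1]}   # Table IV reconstruction

    def law_13b(u, f, g):
        # (A f B)' == A' g B' for every A, B
        return all(u[f(a, b)] == g(u[a], u[b])
                   for a in range(3) for b in range(3))

    for name, u in unaries.items():
        if law_13b(u, alpha, beta) and law_13b(u, beta, alpha):
            print(name)                 # prints only A3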
Pair P1 in group G1 obeys the following additional rules:

    0 α A = 0        2 α A = A
    1 β A = A        0 β A = 0   (14)

Pair P4 in group G2 obeys a somewhat similar set of rules:

    0 α A = 0        2 α A = A
    0 β A = A        2 β A = 2   (15)

The Post binary operations, pair P4, are by far the easiest to implement, and thus are preferable. They may be considered the functions min(A, B) and max(A, B) in the usual arithmetic sense,5 and can be implemented with simple diode gates similar to those used in conventional logic. Post algebra, and variations based on its binary operations, has been suggested by several other authors3-6 for ternary switching algebras, so the idea certainly is not new. The essence of the author's approach is that the algebra is not chosen a priori, but rather by a suitable examination of all possible algebras.

Selection of a suitable set of unary operations remains. Of the five unary operations in Table IV, A³ obeys the DeMorgan law of Equation (13b) and its inverse with α and β interchanged. A³ and the two Post binary operations do not form a functionally complete algebra, so one additional operation must be included; either A⁴ or A⁵ may be used. Before further consideration is given to unary operations, the subject of how to form and manipulate expressions must be introduced.

Expansion theorems

A general method of generating expressions in a given switching algebra is an expansion theorem. A logic designer uses the Boolean expansion theorem,9 perhaps without realizing it, to generate expressions in conventional switching algebra. One commonly used ternary expansion theorem3 is

    f(A, B, ..., N) = Σ Σ ... Σ f(xA, xB, ..., xN) · JxA(A) · JxB(B) ... JxN(N),   (16)

where the sums are taken over xA, xB, ..., xN = 0, 1, 2,

    Σ Fx = F0 ⊕ F1 ⊕ ... ⊕ Fn   (the logical sum of all the Fx),   (17)

and the J's are unary operations that satisfy the expansion theorem. This theorem is a generalization of the Boolean expansion theorem that results in a "sum-of-products" expression in conventional switching algebra; the terms f(xA, xB, ..., xN) are the constants 0, 1, or 2 obtained from the truth table representing f(A, B, ..., N). Lee and Chen3 have proposed an algebra consisting of the Post binary operations ⊕ and ·, and the unary operations J0, J1, and J2 defined in Table III. The required "J" operations are, unfortunately, of the "truncating" type that map into the set {0, 2}. Another possible expansion theorem, Equation (18), is the dual of (16): it employs the logical product

    Π Fx = F0 · F1 ... Fn   (the logical product of all the Fx)   (19)

and corresponds to the "product-of-sums" form of conventional switching algebra. Only expansion theorem (16) will be considered in the following discussion; a discussion of both would be too lengthy.

Expansion theorem (16), when applied to a given truth table, results in an equation of the form

    f(A, B) = (0)·(Jj(A)Jk(B) ⊕ ...) ⊕ (1)·(Jm(A)Jn(B) ⊕ ...) ⊕ (2)·(Jr(A)Js(B) ⊕ ...).   (20)

From Equations (15), which apply to ⊕ and ·, (20) reduces to

    f(A, B) = (1)·(Jm(A)Jn(B) ⊕ ...) ⊕ (Jr(A)Js(B) ⊕ ...).   (21)

Denoting the two subexpressions in Equation (21) by "U" and "V",

    f = (1 · U) ⊕ V.   (22)

If the "J" functions in Equation (20) are simple variables, with or without a single application of a basic unary operation, a sum-of-products expression of the form commonly encountered in binary switching algebra results; otherwise it does not. This form is preferable, since currently known methods of simplification, such as the Karnaugh map and Quine's method, are applicable only to this form of expression.
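As a concrete check of expansion theorem (16), the sketch below (an illustrative reconstruction, with an arbitrarily chosen example function) builds the right-hand side of (16) with the Post operations and the J operations of Table III, and confirms that it reproduces the original function.

    from functools import reduce
    from itertools import product

    t_or, t_and = max, min               # Post logical sum and product

    def J(k, a):                         # truncating unary operations, Table III
        return 2 if a == k else 0

    def expand(f, n):
        # Right-hand side of expansion theorem (16) for an n-variable f.
        def g(*args):
            terms = []
            for x in product(range(3), repeat=n):
                lits = [J(xi, ai) for xi, ai in zip(x, args)]
                terms.append(reduce(t_and, [f(*x)] + lits))
            return reduce(t_or, terms)
        return g

    f = lambda a, b: (a * b + 1) % 3     # an arbitrary ternary function
    g = expand(f, 2)
    assert all(f(a, b) == g(a, b) for a in range(3) for b in range(3))

Each term of the sum is the constant f(x) truncated by the J literals, so exactly one term survives for any input; the outer logical sum then returns that constant.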
Fn x=o and where f(A, B, ... , N) L ... L n f(A, B, ... , N) ! 288 Spring Joint Computer Conference, 1968 For f = 0, both U and V must equal "0". For f = 1, either (1 .U) or V or both must equal "1" and neither equal "2". Either U = 1 or U = 2 will satisfy (1. U) = 1. For f = 2, V must equal "2" since (1. U) can never equal "2". These requirements can be met by sum-of-products forms of U and V oniy if U and V are allowed to assume only the values {O, 2}. This is true with the operations chosen because a basic product assumes the value 2 for one and only one condition but can be either "0" or "1" under other conditions over which there is no control. This problem may be avoided by truncating U and V at the individual variable level, such as is done when using the unary operations J o, J b and J2 detined in Table III, or by handling Equation (22) in such a manner that the problem does not occur. An equivaient form of Equation (22) may be wriuen (23) where U = J 2(X) and V = J 2 (Y) The sUbexpressions X and Y now may be three valued with truncation performed at the expression level in a prescribed manner independent of the function rather than at the individual variable level. Since Equation (23) is independent of the function, a new operation T(X, Y) may be defined by Equation (23). The truth table for T(X, Y) is shown below. y x T (X,y) .... 0 1 2 ... 0 0 0 2 1 0 0 2 2 1 1 2 XCA , B, • •• ,N) VCA, B, ••• : N) ~ _ _..., _ re)(,V) i--.f(A, Figure 1- Illustration ofT(X,Y) function that satisfy the requirements of X and Y. A 3 and A4 would be the proper choice for expansion theorem (18). The op~rations Efj, " A3 and either A4 or A5 form a functionally complete algebra since (24) but cannot in itself produce simple sums-of-products for all possible functions. By appending T(X, Y) as defined above to the set of operations, the variable expressions may be written in sum-of-products form. Simplification of expressions Simplification of the subexpressions X and Y in Equation (23) may be accomplished by modifications of techniques already established for conventional switching algebra. The term minterm may be adopted form conventional switching algebra to denote a basic product in a ternary sum-of-products expression. The appropriate minterms are included in X to make X = 2 when f = 1, and in Y to make Y = 2 when f = 2. The minterms corresponding to f = 2 may be included in X as "don't care" conditions. One method of forming and simplifying X and Y is a ternary version of the Karnaugh l\lap. This method directly applies the laws (25a) and (A, B) $ (A ·C) = A ·(B E9 C) TABLE VI. Definition ofT(X, Y) T(X, Y) may be electronically implemented as one of the basic "building blocks" for constructing a working digital system using ternary logic. It will occur once for each unique combinational function of any number of variables. It is not necessary at all in the few cases where the base algebra can express the functien directly. T(X, Y) might by symbolically represented py Figure 1. If A 3, which obeys the DeMorgan Laws, is chosen felr one of the unary operations, the addition of A5 all<~IWS writing simple Hsum-of-products" expressions B, ••• ,Nl I (25b) and indirectly depends on the associative, commutative, and idempotent properties of $ and '. As an example, consider the simple function f(A, B) defined by the truth table in Figure 2(a). 
As an example, consider the simple function f(A, B) defined by the truth table in Figure 2(a). The subexpressions X and Y may be defined mathematically as

    X(A, B, ..., N) = Σ J1(f(xA, ..., xN)) · PxA(A) ... PxN(N)
    Y(A, B, ..., N) = Σ J2(f(xA, ..., xN)) · PxA(A) ... PxN(N),   (26)

the sums being taken over all xA, ..., xN = 0, 1, 2, where J1 and J2 are as defined in Table III and

    P0(A) = A³     P1(A) = A⁵     P2(A) = A.   (27)

In words, Equations (26) simply state that all basic products, or minterms, that equal 2 when f = 1 are included in X, and all minterms that equal 2 when f = 2 are included in Y. In unsimplified form, X and Y for the example in Figure 2(a) are

    X(A, B) = A³B⁵ ⊕ A⁵B ⊕ A⁵B³ ⊕ A⁵B⁵
    Y(A, B) = A³B ⊕ AB³ ⊕ AB⁵ ⊕ AB.   (28)

For simplification purposes, any or all terms of Y(A, B) may be included in X(A, B) as don't-care conditions, since whenever Y(A, B) = 2 the value of X(A, B) is immaterial (see Table VI). In this example, if the term AB⁵ of Y(A, B) is included in X(A, B), X and Y may be simplified to

    X(A, B) = A⁵ ⊕ B⁵
    Y(A, B) = A ⊕ A³B
    f(A, B) = T(X, Y).   (29)

A Karnaugh map method may be applied to obtain the same result, as shown in Figures 2(b) and 2(c). These maps reflect X(A, B) = 2 for f(A, B) = 1 and Y(A, B) = 2 for f(A, B) = 2; all "2" entries in the truth table for Y(A, B) are entered in the table for X(A, B) as "don't care" conditions.

[Figure 2 - Example of Karnaugh simplification: (a) truth table of f(A, B), whose rows for A = 0, 1, 2 read 0 1 2, 1 1 1, and 2 2 2; (b) map of X(A, B), with don't-care entries; (c) map of Y(A, B)]

Simplification is done by grouping terms. The conditions for grouping differ slightly from those of conventional Karnaugh maps: groupings of 3^n terms instead of 2^n terms are used, and the conditions for "adjacency" are slightly more complex. For two variables the groupings are obvious, as shown in Figures 2(b) and 2(c). For three or more variables, two minterms are "adjacent" if they differ in only one element of the basic product, just as in the conventional case; they are not necessarily physically adjacent in the table, as in the conventional case with more than four variables. Furthermore, all three terms that differ in one element must be present to effect a grouping, as dictated by Equation (25a). Other simplification methods can be developed, based on methods used in conventional switching algebra, but they are beyond the scope of this presentation. The Karnaugh map method presented here is one way to create suitable expressions to direct the construction of a working digital system from ternary logic "building blocks."
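The example can be verified numerically. The sketch below, again assuming the Table IV reconstructions of A³ and A⁵, evaluates the simplified subexpressions of Equation (29) and confirms f = T(X, Y) at every point of the map.

    A3 = [2, 1, 0]                       # unary operations of Table IV
    A5 = [0, 2, 1]

    def J2(a):                           # J2 of Table III
        return 2 if a == 2 else 0

    def T(x, y):                         # Table VI / Equation (23)
        return max(min(1, J2(x)), J2(y))

    # Truth table of Figure 2(a): rows indexed by A, columns by B.
    f = [[0, 1, 2],
         [1, 1, 1],
         [2, 2, 2]]

    X = lambda a, b: max(A5[a], A5[b])            # Equation (29)
    Y = lambda a, b: max(a, min(A3[a], b))
    assert all(f[a][b] == T(X(a, b), Y(a, b))
               for a in range(3) for b in range(3))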
Implementation

Before a useful digital system can be created, sequential techniques and suitable storage elements must be developed. The author cannot hope to do more than merely scratch the surface of these areas; the discussion will be limited to consideration of a suitable storage element.

The obvious choice for a storage element is a ternary version of the flip-flop, if such a thing exists. First, the characteristics of such an element must be defined. It obviously must have three stable states and be capable of switching from one to another upon application of external stimuli. To be compatible with the algebra discussed above, it should have at least one tri-level output Q, and preferably three outputs Q, Q³, and Q⁵. The required input stimuli must be obtained from the tri-level voltage signals provided by the other basic circuits.

The law

    ((A′)′)′ = A   (30)

for the first unary operation in Table IV indicates that a tri-stable configuration can indeed be produced by cascading three circuits that implement the operation A¹ and connecting the last back to the first. This is analogous to cross-coupling two inverters in conventional logic to form a basic flip-flop. A basic ternary flip-flop is shown symbolically in Figure 3.

[Figure 3 - Representation of a ternary flip-flop: three A¹ circuits connected in a ring]

Still needed is a method of controlling the state of the basic ternary flip-flop by external stimuli. There can be as many variations, if not more, as there are currently with binary flip-flops. Ternary diode gates inserted between the A¹ circuits in Figure 3 are one method of control. Consolidation of the basic idea shown in Figure 3 into a single flip-flop package, with appropriate inputs and outputs, is left to the circuit designer. An interesting variety of "clocked" flip-flops could result from the tri-level nature of the input signals: positive and negative pulses about the center voltage level could perhaps be used to stimulate two entirely different actions within a ternary flip-flop, such as loading it and presetting it to a given state.

A set of basic circuits has been devised by the author to implement the operations A¹, A³, A⁴, A⁵, and T(X, Y) using complementary-symmetry transistor techniques.

CONCLUSION

In the preceding sections, a suitable ternary switching algebra was determined by applying a reasonable set of constraints to the class of all possible ternary algebras. Based on the ability to manipulate, simplify, and implement algebraic expressions, the Post binary operations ⊕ and ·, the unary operations A³ and A⁴ or A⁵, and the auxiliary binary operation T(X, Y) seem to be the best choice for a set of basic operations. The similarity between the theorems of this algebra and those of conventional switching algebra allows variations of conventional simplification techniques to be employed.

Sequential techniques, a necessary addition to the theory for the development of useful digital systems, were considered only briefly. Variations of conventional sequential techniques could perhaps be used to form the basis of a theory for ternary sequential systems. Primary emphasis was placed on the groundwork for the development of a suitable storage element, a ternary version of the flip-flop. Much work remains to be done concerning sequential topics, including a comprehensive sequential theory, further development and implementation of ternary flip-flops, and suitable ternary memory systems.

Although the basic "inverter" circuits discussed are more complex than their binary predecessors, circuit complexity is becoming less and less a primary consideration as integrated-circuit technology expands. Most current digital integrated circuits are limited not by circuit complexity but by the number of terminals available on the physical package. A given number of signal terminals could convey more information in and out of a package if ternary circuits were used. This could effect a savings in interconnection, especially where parallel transmission of large data words is used.

ACKNOWLEDGMENTS

The author is indebted to Professors R. R. Korfhage and S. Freeman of Purdue University for their guidance and suggestions while the author was performing the research leading to this paper.
REFERENCES

1 O LOWENSCHUSS Nonbinary switching theory 1958 IRE National Convention Record part 4 pp 305-317
2 R D BERLIN Synthesis of N-valued switching circuits IRE Transactions on Electronic Computers vol EC-7 pp 52-56 March 1958
3 C Y LEE W H CHEN Several-valued combinational switching circuits Transactions of the AIEE (Communication and Electronics) vol 75 pt I pp 278-283 July 1956
4 E MUEHLDORF Ternaere Schaltalgebra Archiv der Elektrischen Uebertragung vol XII pp 138-148 March 1958
5 M YOELI G ROSENFELD Logical design of ternary switching circuits IEEE Transactions on Electronic Computers vol EC-14 pp 19-29 February 1965
6 A MUKHOPADHYAY Symmetric ternary switching functions IEEE Transactions on Electronic Computers vol EC-15 pp 731-739 October 1966
7 R L HERRMANN Development of a three-valued switching algebra Master's thesis Purdue University Department of Electrical Engineering August 1964
8 E L POST Introduction to a general theory of elementary propositions American Journal of Mathematics vol 43 pp 163-185 1921
9 P C ROSENBLOOM The elements of mathematical logic New York Dover 1950

Application of Karnaugh maps to Maitra cascades

by GIUSEPPE FANTAUZZI
Fondazione U. Bordoni, Rome, Italy
and
Montana State University, Bozeman, Montana

INTRODUCTION

Statement of the problem

A Maitra cascade, as shown in Fig. 8, is a one-dimensional cellular array whose cells have only one output and two inputs. The function whose independent variables are introduced at the free inputs of the cascade cells is produced at the output of the last cell. There are two different kinds of Maitra cascades, according to whether or not a different binary variable is introduced at each independent input. In the first case the cascade is irredundant; in the second it is said to be redundant. The most important results about the synthesis of Maitra cascades are given in Refs. 1-5. In Ref. 6 it is proved that a sufficient condition for a cellular cascade to reach its optimal synthesis possibility is that every cell can perform the set of five functions shown in Figs. 1a, 1b, and 1c.

[Figure 1 - Functions to be performed by the cells in a Maitra cascade: (a), (b), (c) the cell functions for conditions a, b, c of Proposition 1; (d) the cascade used in the parity case]

Four different algorithms have been published for the synthesis of Boolean functions by means of Maitra cascades. Three of them concern irredundant Maitra cascades,2-4 while the fourth5 can be used for redundant ones too (see also Ref. 6). The methods proposed by the authors of the above papers are different, but the properties of the Boolean functions from which they are derived are quite similar. These properties can be summarized by the following two propositions.

Proposition 1: A completely specified binary function F(x1 ... xn) is realizable by a Maitra cascade if both of the following hold:

I - There exists a variable xi such that one of the following conditions is satisfied:
    a: F(x1 ... xn) = xi* + G(x1 ... xi-1 xi+1 ... xn)
    b: F(x1 ... xn) = xi* G(x1 ... xi-1 xi+1 ... xn)
    c: F(x1 ... xn) = xi* ⊕ G(x1 ... xi-1 xi+1 ... xn)
    d: F(xi* = 1) is a parity function

II - In cases a, b, c, the function G(x1 ... xi-1 xi+1 ... xn) is Maitra realizable; in case d, G = F*(xi = 1) ⊕ F(xi = 0) is Maitra realizable.

Proof: See Refs. 5 or 6.(*)

Proposition 2:

When xi satisfies condition a, b, or c of Proposition 1, F can be obtained at the output of a cell whose inputs are xi and G, as shown in Fig. 1a, 1b, or 1c respectively.
When xi satisfies condition d, F is obtained at the output of the cascade shown in Fig. 1d. In this cascade, the function produced by the cell whose input is xi is selected as shown in Fig. 2. The exclusive-or cascade following the xi cell must realize the odd parity function F*(xi = 1).(*)

(*) f* stands either for f or for its negative, f̄.
(*) Methods given in Refs. 3-5 are derived directly from the statements of Propositions 1 and 2. Maitra's method2 is developed in a different way, but can be restated in terms of the above propositions (see Section 6).

[Figure 2 - Statement of the properties of the cascade in case d of Proposition 1: the function performed by the cell fed by xi (xy, x̄y, or x + y) and the definition of G, selected according to the parity function and the values F(0) and F(1)]

Proof: See Refs. 3-5.

In order to test whether a given function can be synthesized by a Maitra cascade,3-5 it is necessary to look for a variable xi satisfying one of the conditions a, b, c, d; a function G with only n-1 variables is then obtained. The same considerations used for F must then be repeated for G. Thus, by applying the statements of Propositions 1 and 2 at most n-2 times, it is possible either to obtain the cascade realizing F or to establish that there is no cascade realizing F. If, in the application of the algorithm, condition d is never used, the resulting cascade is irredundant; otherwise it is redundant.

From the above considerations it is clear that, for testing the Maitra realizability of a given function F, one must be able (i) to look for a variable xi and then (ii) to derive the new function G with one variable less than F. By using the Karnaugh map it is possible to solve both problems in a simple way, for both redundant and irredundant cascades. To show this, the following sections give rules for testing on the Karnaugh map for the variable xi requested by Proposition 1 (Rules AI, BI, CI, DI), and four further rules for deriving the mapping of G from that of F (Rules AII, BII, CII, DII). These rules are given in four different sections, according to which of the conditions a, b, c, d of Proposition 1 is satisfied by xi.

There are additional problems in the statement of an algorithm for Maitra cascades, concerning the selection of variables when more than one xi satisfies Proposition 1. They are not considered here because they are completely solved elsewhere4,5 and do not involve Karnaugh maps.

Rules A {B}

If a variable xi satisfies condition a {b} of Proposition 1:

I - The Karnaugh mapping of F(x1 ... xn) has the (n-1)-cube [xi*] {[x̄i*]} completely labeled {unlabeled}.
II - The mapping of G is given in the subcube [x̄i*] {[xi*]}.(*)

Proof: Obvious.

Example: Consider the example shown in Fig. 3.1 {4.1}, whose Karnaugh mapping is given in Fig. 3.2 {4.2}. The 3-cube [x2] {[x̄2]} is completely full {empty}, and the mapping of the residual function G is given in the 3-cube [x̄2] {[x2]}. For the sake of clarity, the mapping of F over a picture of the 4-cube is given in Fig. 3.3 {4.3}.

[Figure 3 - An example of application of Rules A]
[Figure 4 - An example of application of Rules B]

(*) In the remainder of this paper, [xi*] denotes the (n-1)-cube in which the coordinate xi is equal in all of its vertices: if xi* = xi, the coordinate xi is 1; if xi* = x̄i, it is 0.
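Rules AI and BI mechanize directly. In the sketch below, an illustrative reconstruction, a function is represented as a mapping from binary vertex tuples to labels, and the test simply checks whether the appropriate (n-1)-cube is completely labeled or completely unlabeled.

    from itertools import product

    def subcube(F, i, v):
        # Restriction of F to the (n-1)-cube where coordinate i equals v.
        return {x[:i] + x[i+1:]: F[x] for x in F if x[i] == v}

    def rule_A(F, i, v):
        # Rule AI: the cube [xi*] (xi = v) is completely labeled.
        return all(subcube(F, i, v).values())

    def rule_B(F, i, v):
        # Rule BI: the cube [xi*-bar] (xi = 1 - v) is completely unlabeled.
        return not any(subcube(F, i, 1 - v).values())

    # Example: F = x1 + G(x2, x3) with G = x2 x3; Rule A holds for x1* = x1.
    F = {x: max(x[0], x[1] & x[2]) for x in product((0, 1), repeat=3)}
    print(rule_A(F, 0, 1))               # True
    print(subcube(F, 0, 0))              # the mapping of G on [x1-bar]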
Rules C

If variable xi satisfies condition c of Proposition 1:

I - The two (n-1)-cubes [xi] and [x̄i] are complementary; i.e., for any pair of vertices (g, f) such that g ∈ [xi], f ∈ [x̄i], and the Hamming distance between g and f is 1, one and only one of the two is labeled.(*)
II - G is mapped on the (n-1)-cube [x̄i].

Proof: If condition c of Proposition 1 holds, F satisfies the identity

    F = xi Ḡ + x̄i G.

Therefore, the function mapped on [xi] is the inversion of the function mapped on [x̄i], and a minterm belonging {not belonging} to G cannot belong {must belong} to Ḡ. This implies that for every labeled vertex of [x̄i], its adjacent vertex in [xi] cannot be labeled, and vice versa. End of proof.

Example: An example is shown in Fig. 5. All pairs of adjacent vertices lying in opposite 3-cubes are labeled in opposite ways.

[Figure 5 - An example of application of Rules C]

(*) This is the same as stating that g and f are adjacent but belong to opposite (n-1)-cubes.

Rules D

If variable xi satisfies condition d of Proposition 1, either an odd parity function or its inverse is mapped in the (n-1)-cube [xi*]. To test whether this happens, the following properties can be used:

α - If either an odd or an even parity function is mapped on [xi*], exactly 2^(n-2) of its vertices are labeled.
β - If the function mapped on [xi*] is a nondegenerate (i.e., depending on all the variables x1 ... xi-1 xi+1 ... xn) odd parity function or its inverse, no labeled {unlabeled} vertex in [xi*] has any of its adjacent vertices belonging to [xi*] labeled {unlabeled} (see Fig. 6).
γ - If an odd parity function is mapped on [xi*], the vertex 0 ... 0 xi* 0 ... 0 is unlabeled. If an even parity function is mapped, it is labeled.

[Figure 6 - Map of a parity function: no labeled cell has an adjacent labeled cell, and vice versa]

By making use of the above conditions, the following rule can be derived:

I - To test whether a parity function is mapped on [xi*], first count the number of labeled vertices in [xi*]. If they are 2^(n-2), reduce the function mapped on [xi*] until it depends on all the coordinates of the map, then test condition β. If this condition is fulfilled, use condition γ to test whether the function mapped on [xi*] is an odd or an even parity function.

There are two different rules for determining the mapping of G, according to whether F(xi* = 1) or F̄(xi* = 1) is an odd parity function:

IIα - If F(xi* = 1) is an odd parity function, G = F(xi = 0) ⊕ F(xi = 1) is obtained by performing the exclusive-or of all the pairs (g, f) of vertices whose elements are both adjacent and belonging to the opposite (n-1)-cubes [xi] and [x̄i].
IIβ - If F(xi* = 1) is an even parity function, the coincidence operation must be performed instead of the exclusive-or.

Example: Rules DI and DII are applied in the example shown in Fig. 7. The function F is mapped in Fig. 7.2. The function mapped on [x4] is x2 (an odd parity function of one variable), as is easy to see after reduction to irredundant form (Fig. 7.3). The cascade realizing F is shown in Fig. 7.5; its function G is mapped in Fig. 7.4, obtained from Fig. 7.2 by applying Rule DIIα.

[Figure 7 - An example of application of Rules D]
[Figure 8 - A general example]
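Conditions c and d can be tested in the same representation. The sketch below, an illustrative reconstruction, checks Rule CI (complementary half-cubes) and condition β of Rules D (no two adjacent vertices of the half-cube agree).

    from itertools import product

    def halves(F, i):
        # The two (n-1)-cubes with xi = 0 and xi = 1.
        lo = {x[:i] + x[i+1:]: F[x] for x in F if x[i] == 0}
        hi = {x[:i] + x[i+1:]: F[x] for x in F if x[i] == 1}
        return lo, hi

    def rule_C(F, i):
        # Rule CI: the two half-cubes are complementary (F = xi xor G).
        lo, hi = halves(F, i)
        return all(lo[y] != hi[y] for y in lo)

    def condition_beta(F, i, v):
        # Condition beta of Rules D on [xi*]: every pair of adjacent
        # vertices carries opposite labels.
        half = halves(F, i)[v]
        m = len(next(iter(half)))
        return all(half[y] != half[y[:j] + (1 - y[j],) + y[j+1:]]
                   for y in half for j in range(m))

    # Example: F = x1 xor (x2 x3); Rule C holds for x1.
    F = {x: x[0] ^ (x[1] & x[2]) for x in product((0, 1), repeat=3)}
    print(rule_C(F, 0))                  # True
    print(condition_beta(F, 0, 0))       # False: x2 x3 is not a parity function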
An example

The function to be synthesized is mapped in Fig. 9. Variable x4 satisfies condition d of Proposition 1, because the function mapped in the 5-cube [x4] is equal to x5 and so is the even parity function of one variable. The dependent input G of the cell fed by x4 is mapped in Fig. 10; this mapping has been obtained by applying Rule DIIβ.

Variable x6 satisfies condition c of Proposition 1 on the map of Fig. 10. The new function G is mapped on the 4-cube [x̄6] of Fig. 10 and is given in Fig. 11. Variable x5 satisfies condition d of Proposition 1: in fact, the 4 = 2^(3-1) labeled vertices of the 3-cube [x5], after reduction (see Fig. 14), give the function x3 ⊕ x2. The new function G is given in Fig. 12, obtained by applying Rule DIIβ. Variable x3 satisfies condition b of Proposition 1 on the mapping of Fig. 12. On the 2-cube [x3] of Fig. 12 is mapped the function x1 + x2, which can be realized by a single cell. By applying Proposition 1 five times and using Proposition 2 at each step, a Maitra cascade realizing F at its output has been obtained (see Fig. 8).

[Figure 9 - Map of the function to be synthesized]
[Figure 10 - First residual function]
[Figure 11 - Second residual function]
[Figure 12 - Third residual function]
[Figure 13 - Fourth residual function]
[Figure 14 - Simplification of the function mapped in the 3-cube [x5] of Fig. 11]

SUMMARY

In the studies on cellular logic, much work has been done with Maitra cascades. This is due to the fact that these cascades are useful as "elementary" modules in more sophisticated cellular arrays. R. C. Minnick1 has suggested the introduction of Karnaugh maps in the study of Maitra cascades; he has given a case-by-case procedure for determining the Maitra realizability of a Boolean function of four variables by using its Karnaugh mapping. In this paper the application of Karnaugh maps to test the Maitra realizability of any function of any number of variables is studied, and a set of necessary and sufficient conditions is given to test Maitra realizability using only Karnaugh maps. The practical usefulness of the rules given here is limited only by the difficulty of studying Karnaugh maps with more than 6-7 variables. Furthermore, the results given in this paper can be advantageously applied to the study of rectangular arrays.

CONCLUSION

The "realizability test" [2, p. 140] given by Maitra makes use of a geometrical representation of Boolean functions that is a slight modification of the Karnaugh map. The algorithm introduced by Maitra is given without any mention of Karnaugh maps, and its utility is limited by the prior need to establish an ordering of the variables; furthermore, it can be applied only to irredundant cascades. By making explicit reference to Karnaugh maps, some rules are obtained in this paper that can be applied to both redundant and irredundant cascades; they do not require any ordering of the variables.
These rules, in the particular case of irredundant cascades, are similar to those of Maitra, but they are derived in a different way; when they are used in applying the algorithms given in Refs. 3 and 4, the cascade synthesizing F can be obtained. The results obtained in this paper can also be applied to more general cellular arrays, both rectangular and one-dimensional, as is shown in Ref. 7.

ACKNOWLEDGMENT

The author wishes to thank Prof. Robert C. Minnick for his help in improving this paper and Kathleen Wright, who typed the manuscript.

REFERENCES

1 R C MINNICK et al Cellular arrays for logic and storage Final Report AFCRL-66-613 AD 643178 Stanford Research Institute Menlo Park California April 1966
2 K MAITRA Cascade switching networks of two-input flexible cells IRE Transactions on EC Vol EC-11 No 2 pp 136-143 April 1962
3 J SKLANSKY General synthesis of tributary switching networks IEEE Transactions on EC Vol EC-12 No 5 pp 464-469 October 1963
4 S LEVY R WINDER T MOTT JR A note on tributary switching networks IEEE Transactions on EC Vol EC-13 No 2 pp 148-151 April 1964
5 H STONE A KORENJAK Canonical form and synthesis of cellular cascades IEEE Transactions on EC Vol EC-14 No 6 pp 852-862 December 1965
6 G FANTAUZZI Catene di Maitra Internal Report Fondazione Ugo Bordoni Rome Italy December 1966
7 G FANTAUZZI Catene di Short Internal Report Fondazione Ugo Bordoni Rome Italy December 1967

Universal logic circuits and their modular realizations

by S. S. YAU and C. K. TANG
Northwestern University
Evanston, Illinois

INTRODUCTION

In order to achieve the great economic advantage of utilizing integrated circuits in computer circuitry, it is desirable to design a circuit which can realize any logic function of a fixed number of variables by simply varying its input terminal connections. Such a circuit is called a universal logic circuit (ULC). When the number of variables becomes large, a ULC may be too complex to be built in a single package economically. Hence, it is preferable to use ULC's of a small number of variables as the modules for building a ULC of a large number of variables. Such modules are called universal logic modules (ULM's).

In this paper, we shall first present a three-variable ULC which has a fan-in not exceeding four for each logic gate and consists of only 7 I/O pins. We shall then extend the ULC's to four or more variables. There are 12 I/O pins in a ULC of four variables, and several models with different fan-in limitations will be given. The logic gates in the ULC's may be all NAND or all NOR gates. A simple technique will then be established for designing a ULC of any large number of variables using ULC's of a small number of variables, say three, as the ULM's. It will be seen that the ULC obtained by this technique requires a small number of ULM's. Moreover, the fault-detection tests for ULM's and a diagnostic procedure for locating all the faulty ULM's in the modular realization of a ULC realizing a given logic function will be presented. Finally, a method for improving the reliability of a ULC using an error-correcting code will be demonstrated.

Universal logic circuits of three variables

The problem of designing a ULC was first treated by Forslund and Waxman,1 and later by Ellison et al.2 and Elspas et al.3 They employed the concept of equivalence classes to reduce the number of all possible logic functions of a given number of variables to the number of the equivalence classes.
An equivalence class is a set of logic functions that may be obtained from a particular network by manipulating only the application of variables to the input terminals of the network. One of the most common constraints on these manipulations is that only true variables are available, with permutation of the variables at the input terminals permitted. With this restriction, Hellerman4 partitioned the 2^(2^3) = 256 three-variable logic functions into 80 equivalence classes. In order to reduce the number of equivalence classes, Forslund and Waxman1 assumed that both true and complement variables are available at the input, and that true and complement logic functions are both available at the output (two output terminals). In addition, biasing (to a logical 1 or 0) and duplication of input variables at the input terminals are also permitted. The equivalence classes defined this way number only 10 for three-variable logic functions, instead of 80. In this paper, the same constraints are placed on the manipulations of the input terminals, except that only one output terminal is required and the biasing values "0" and "1" are not necessary.

We shall employ a different approach to obtain the ULC. It is noted that a logic function f(x, y, z) of three variables can always be expanded with respect to any two of the three variables, say x and y, as follows:

    f(x, y, z) = x̄ȳ f(0, 0, z) + x̄y f(0, 1, z) + xȳ f(1, 0, z) + xy f(1, 1, z),   (1)

where the functions f(0, 0, z), f(0, 1, z), f(1, 0, z) and f(1, 1, z) are functions of z only, and each of these functions assumes one of the four values z, z̄, 0, or 1. Hence, a circuit of the form shown in Fig. 1 can realize any arbitrary three-variable logic function f(x, y, z) if the side terminals C1 and C2 are connected to x and y respectively and the four front terminals A0, A1, A2, and A3 are connected to the appropriate values z, z̄, 0, and 1. Based on (1), we obtain the ULC of three variables consisting of AND, OR and INVERTER gates shown in Fig. 1. It is noted that the biasing values "0" and "1" are not necessary: in Fig. 1, for example, if f(0, 0, z) = 0, connecting terminal A0 to the bias "0" is the same as connecting A0 to the input variable y; and if f(0, 0, z) = 1, connecting terminal A0 to the bias "1" is the same as connecting A0 to the input variable ȳ. Similar arguments apply to terminals A1, A2, and A3.

[Figure 1 - A ULC of three variables consisting of AND, OR and NOT gates]

It is well known5 that a two-level AND-OR circuit can be replaced by a NAND-gate circuit of the same configuration. Thus, the circuit shown in Fig. 2 is also a ULC of three variables, employing NAND gates only. A ULC of three variables consisting of NOR gates can be obtained by using the dual relationship between NAND's and NOR's, and is given in Fig. 3. It is noted that the configurations of the ULC with NAND gates and the ULC with NOR gates are identical; the only difference between the two circuits is the permutation of the input values at the front terminals. Another realization, using OR, NAND and INVERTER gates, is shown in Fig. 4. This circuit is more desirable than those shown in Figs. 2 and 3, since it requires only 16 diodes and 3 transistors, while the circuits shown in Figs. 2 and 3 need 16 diodes and 7 transistors. Furthermore, the circuit shown in Fig. 4 is more reliable because fewer transistors are used. A similar circuit can be obtained using AND, NOR and INVERTER gates. In each of the above circuits, a total of 7 I/O pins is required.

[Figure 2 - A ULC of three variables consisting of NAND gates only]
[Figure 3 - A ULC of three variables consisting of NOR gates only]
[Figure 4 - A ULC of three variables consisting of OR, NAND and INVERTER gates]

To evaluate the ULC given above, a comparison with the results given by Forslund and Waxman1 is made as follows. Consider the circuit shown in Fig. 2 as a minimum-pin ULC. Since gates G6 and G7 are included in the ULC, the circuit has 7 pins, 7 gates, and 3 levels. The minimum-pin ULC of three variables given by Forslund and Waxman also has 7 pins, but it requires 10 gates and has 5 levels. The ULC given in Fig. 2 also has the advantage that only one complement input is required, whereas the minimum-pin ULC given by Forslund and Waxman requires two complement inputs.
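Viewed functionally, Equation (1) makes the ULC of Fig. 1 a four-way selector: x and y choose one of the four residue functions of z. The sketch below is an illustrative functional model, not a gate-level description of Figs. 1-4; it derives the front-terminal connections for an arbitrary f(x, y, z) and checks the realization.

    from itertools import product

    def ulc3(f):
        # Returns the four front-terminal connections A0..A3, each one of
        # '0', '1', 'z', 'zbar', plus a function realizing f.
        names = []
        for x, y in product((0, 1), repeat=2):
            residue = (f(x, y, 0), f(x, y, 1))
            names.append({(0, 0): '0', (1, 1): '1',
                          (0, 1): 'z', (1, 0): 'zbar'}[residue])
        value = {'0': lambda z: 0, '1': lambda z: 1,
                 'z': lambda z: z, 'zbar': lambda z: 1 - z}
        def g(x, y, z):
            return value[names[2 * x + y]](z)    # x, y select a front terminal
        return names, g

    f = lambda x, y, z: (x & y) ^ z              # an arbitrary function
    names, g = ulc3(f)
    print(names)                                 # ['z', 'z', 'z', 'zbar']
    assert all(f(*v) == g(*v) for v in product((0, 1), repeat=3))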
Figure 3-A ULC of three variables consisting of NOR gates only y y X I X C, Ao f(o,o,z) AI f{o.I,z) A2 f{ l.o,Z) A3 l(oo,z) Figure 4-A ULC of three variables consisting of OR, NAND and INVERTER gates f( I,I,Z) Figure 2-A ULC of three variables consisting of NAND gates only A ULC of three variables consisting of NOR gates can be obtained by using the dual relationship between NAND's and NOR's, and is given in Fig. 3. It is noted that the configurations of the ULC with NAND gates and the ULC with NOR gates are identical, and the only difference between these two circuits is the permutation of the input values for the front terminals. Another reaiization is shown in Fig. 4 To evaluate the ULC given above, a comparison with the results given by Forslund and Waxman} is made as follows: Consider the circuit shown in Fig. 2 as a minimum-pin ULC. Since gates 0 6 and 0 7 are included in the ULe, the circuit has 7 pins, 7 gates, and 3 levels. The minimum-pin U~C of three variables given by Forslund and Waxman also has 7 pins, but it requires 10 gates and has 5 ievels. The ULC given in Fig. 2 also has the advantage that only . one complement input is required, whereas the minimum-pin ULC given by Forslund and Waxman requires two complement inputs. If the circuit shown in Fig. 2 is considered as a minimum-gate ULC, gates Universal Logic Circuits and Their rv10dular Realizations G 6 and G 7 should be excluded from the ULC at the expense of adding two more pins. Consequently, it ends up with a minimum-gate ULC of 5 gates, 9 pins and 2 levels. The minimum-gate ULC of Forslund and Waxman has 6 gates, 9 pins and 4 levels, and can realize only the logic functions in nine out of the ten equivalence classes. It is seen that 5 is the absolute minimum number of gates required for any ULC of 3 variables, since the realization of the exclusive-or function of 3 variables alone requires a minimum of 5 NAN D gates. 6 izations with their input' terminal connections permuted. The rule of permutation on the residue functions of one variable for the NOR realization is to replace 1 by and by 1 in the residue functions for a NAND realization. For instance, f(O, I, 0, w) in Fig. 5 should be replaced by f(l, 0, 1, w) for the corresponding NOR realization. The ULC's of five or more variables can be derived in a similar way. It can be shown that a ULC of n variables obtained by this method has p input pins, where ° ° Universal logic circuits offour and more variables (3) The problem of designing a U LC of four or more variables was also treated by Forslund and Waxman l , using the same idea of equivalence classes as in the case of three-variable ULC. Due to the large amount of computations required, it is prohibitive to obtain such a ULC by that method. However, the approach used in the last section for obtaining the ULC of three variables can readily be extended to four or more variables. Since a logic function f(x, y, z, w) of four variables can be written in the form f(x, y, z, w) = x y z f(O, 0, 0, w) + X Y z f(O, + X Y z f(O, 1, 0, w) + X Y z f(O, + x y z f(l, 0, 0, w) + x Yz f(l, + x Y z f(l, 1, 0, w) + x Y z f( 1, With a fan-in limitation of four, this approach will yield a ULC of n variables, n ;;::: 2, which has q gates and [ levels, where 0, 1, w) 1, I, w) 0, I, w) I, I, w), (2) the ULC of four variables shown in Fig. 5 is obtained. It is noted that there is a NAN D gate with a fan-in of 8 in this realization. Two other NAND realizations with smaller fan-in limitations and more gates are given in Figs. 6 and 7. 
Similar to the case of three-variable ULC's, the corresponding NOR realizations of Figs. 5-7 can easily be shown that they have the same configurations of the original NAND realZ Figure 6-A ULC of four variables with 5 levels and a fan-in 4 z y x .X ~ A f(o.o.o.w) ·0 AI f(o.o,I,W) A2 f(o,l.o.w) A3 f(o.I.I,W) f(x.y. z ,W) f(x.y,z,W) A4 f( I.O.O.W) As f( 1,0,1 ,W) A6 f( I, I,O,W A7 f(1 1.1 W Figure 5 - A U LC of four variables with 3 levels and a fan-in 8 Figure 7 -A ULC of four variables with 4 levels and a fan-in 5 300 Spring Joint Computer Conference, 1968 q= i { 10· -"3(2 n- 2 - 1) + (n + 2) r (= The number p of input pins of a ULC and King's upper bound on p. (2n-l - 1) + (n - 1) when n is odd (4) when n is even I 2 3 4 5 6 p 3 6 11 20 37 King's upper bound 6 11 20 37 70 n I t when n is odd when n is even. n n+ 1 (5) For example, a ULC of five variables with a fan-in limitation of four is shown in Fig. 8. The numbers given by (4) and (5) can certainly be reduced if a larger fan-in is permitted. It is noted that for any n only one complementary input variable is required and all others can be true input variables in a ULC obtained by this method. u z y I T ABLE I - The number p of input pins of a ULC and King's upper boundonp. (n + 3)P- (~) (n + 2)P+ (~) (n + I)P- (~)nP +... + (-1)~ (~)3P. The number of possible distinct connections must not be smaller than the number of logic functions of n variables. Hence, the following inequality is obtained. x (n + 3)P - (V (n + 2)P + (~) (n + l)P - O~)nP n ivRr I II I I I + ... + I foo 10,V f O,I,I,V f(o I,O,O,v f(o 1,0 I,V) f(o,1 ,I ,o,v) f(o,I,I,I,V) f( I,O,O,O,v} f( I,O,O,I,V) f(x,y,z,u,V) (-1)n(~)3P ~ 22 . The minimum p satisfying the inequality (6) is a lower bound on p, which is listed in Table II. It is _noted that Elspas. et al. 3 have derived a ULC of 4 variables with a total of 9 I/O pins, and by decomposition a ULC of 5 variables with a total of 19 I/O pins was obtained. For n ~ 6, their result requires exactly the same number of input pins given by (3). They have also derived a lower bound for p, which is smaller than that listed in Table II, because all complementary inputs are allowed in their derivation. f(I,O,I.O,V) TABLE I I - The lower bound on p calculated according to (6). f(I,O.I,I,V) f(I,I,O,O,V) f(I,I,O,I,V) f(I.I,I,O,V) f(I,I,I,I,V) I · Lower bound I Figure 8-A ULC of five variables A comparison of the number of input pins required here and the upper bound of the number of input pins required for a ULC given by King 7 is shown in Table I. The upper bound given by King is for the ULC defined in a slightly different way, namely only true inputs are used, while in this paper one complementary input is allowed. It is seen that the values of pare considerably lower than King's upper bound. A lower bound on the number of required input pins is derived here with the restriction that only one complementary input is allowed. First we calculate the number of possible distinct connections of p input pins to the set of values {a, 1, Xh X2 ," ,Xn,X n } such that every Xj, 1 ::::; i ::::; n, is connected to at least one pin. It can be shown that this number is given by 0 • 2 3 4 5 6 3 5 7 12 21 ! Realization of a universal logic circuit using universal logic modules We have shown in the last section that a ULC of any large number of variables can be found. However, it follows from (3)-(5) that the complexity of the ULC increases rapidly as the number of variables increases. 
Realization of a universal logic circuit using universal logic modules

We have shown in the last section that a ULC of any large number of variables can be found. However, it follows from (3)-(5) that the complexity of the ULC increases rapidly as the number of variables increases. From both the economical and the maintenance point of view, it becomes prohibitive to build ULC's of various large numbers of variables in individual integrated-circuit packages. Hence, we present a technique for realizing a ULC of a large number of variables using identical ULC's of a small number of variables as modules, called universal logic modules (ULM's). There are two great advantages to this technique. First, we need only a large quantity of identical ULM's to build ULC's of various numbers of variables. Secondly, when there are faults in a ULC, we need only replace the faulty ULM's instead of the whole ULC.

To derive the modular realization of a ULC of n variables using ULC's of 3 variables as the ULM's (denoted by ULM-3's), let us first consider the case when n is odd. Any logic function f(x1, x2, ..., xn) of n variables, n ≥ 3, can be expanded in the form

    f(x1, x2, ..., xn) = x̄1x̄2 f(0, 0, x3, ..., xn) + x̄1x2 f(0, 1, x3, ..., xn)
                       + x1x̄2 f(1, 0, x3, ..., xn) + x1x2 f(1, 1, x3, ..., xn),   (7)

and hence can be realized by a ULM-3, provided that the side terminals C1 and C2 and the front terminals A0, A1, A2 and A3 shown in Fig. 1, 2, or 3 are connected to the input variables x1 and x2 and to the residue functions f(0, 0, x3, ..., xn), f(0, 1, x3, ..., xn), f(1, 0, x3, ..., xn) and f(1, 1, x3, ..., xn) respectively. This ULM-3 forms the first level of the modular realization of the ULC. Since we can repeat this process on each of the residue functions, the second level of the modular realization consists of four ULM-3's whose side terminals are connected to the input variables x3 and x4 and whose front terminals are connected to the appropriate residue functions of n - 4 variables. This process continues until the residue functions become functions of the variable xn alone. Because n is odd, and because each expansion reduces the number of variables of the residue functions by exactly 2, a total of (n-1)/2 expansions is required. This implies that f(x1, ..., xn) can be realized by ULM-3's in a tree structure consisting of (n-1)/2 levels, as shown in Fig. 9; there are 4^(j-1) ULM-3's in the j-th level of the tree structure.

Each front terminal of the ULM-3's in the last level is connected to one of the four values 0, 1, xn and x̄n defined by the corresponding residue function of the variable xn, which can be found as follows. Trace the path from the output terminal F to the front terminal in question in the last level of the tree structure, and write the two-bit binary representation of the subscript h of the front terminal Ah of the ULM-3 in each level. The concatenation of the (n-1)/2 two-bit groups, in the order of the path, forms the argument of the residue function for that front terminal. For instance, if the path from the output terminal to a front terminal in the last level of a modular ULC of 5 levels passes through the front terminals A1, A2, A0, A3, A1 of the ULM-3's in the 1st, 2nd, ..., 5th levels respectively, the residue function for this terminal is f(0,1,1,0,0,0,1,1,0,1,xn).

For convenience, we shall call a front terminal of a ULM-3 in the last level Pi if it is connected to the residue function whose binary argument has the decimal representation i. There are 2^(n-1) front terminals of the ULM-3's in the last level for a modular ULC of n variables. Fig. 10 shows the modular realization of a ULC of 7 variables using ULM-3's.

[Figure 9 - The modular realization of a ULC of n variables when n is odd]
[Figure 10 - The modular realization of a ULC of 7 variables]

When n is even and only ULM-3's can be used in the modular realization, only a slight modification of the first level is required. Instead of expanding the logic function according to (7) for the first level, we expand it as

    f(x1, x2, ..., xn) = x̄1 f(0, x2, ..., xn) + x1 f(1, x2, ..., xn).   (8)

Equation (8) can be realized by a ULM-3, provided that the side terminals C1 and C2 are both connected to the input variable x1, the front terminals A0 and A3 are connected to the residue functions f(0, x2, ..., xn) and f(1, x2, ..., xn) respectively, and the connections for A1 and A2 are don't-cares. Each of the residue functions in (8) is then a function of an odd number of variables, and hence can be realized by the previous method. The residue functions for the front terminals of the ULM-3's in the last level can be found in the same way as before, except that only the first bit in the binary argument of the residue function corresponds to the subscript of the front terminal of the ULM-3 in the first level: the first bit is 0 or 1 depending upon whether the front terminal of the ULM-3 in the first level in the path is A0 or A3 respectively.

[Figure 11 - A modular realization of a ULC of six variables using ULM-3's only]
For convenience, we shall call the front terminal F Figure 9 - The modular realization of a U LC of n variables when n is odd When n is even and when only ULM-3's can be used in the modular realization, only slight modification in the first level is required. Instead of expanding the logic function according to (7) for the first level, we only expand the logic function as follows: f(x l , X2, ... , xn) = Xl f(O, X2, ... , Xn) + Xl f( 1, X2, ... ,x n)· (8) It is easily seen that (8) can be realized by a ULM-3, provided that the side terminals C I and C 2 are both connected to the input variable Xl, the front terminals Ao and A3 connected to the residue functions f(O, X2, ... , xn) and f( 1, X2,"" xn) respectively, and the connections for Al and A2 are don't-care. Then, each of the residue functions in (8) is a function of an odd number of variables and hence can be realized by the previous method. The residue functions for the front terminals of the ULM-3's in the last level can be found in the same way as before except that only the first bit in the binary argument of the residue function corresponds to the subscript of the front terminal of the ULM-3 in the first level. The first bit is or I depending upon whether the front terminal ° 302 Spring Joi.nt Computer Conference, 1968 x.. Xs ~ X3 . ! I I~i . ~LM-3h , XI XI II I I II UUJ.-3J- 1I Ir--t I "}fl ~§ ULM-3 ~ II , II I I I I I ULM-3 ULM-31- ~~=====:l ULM-3..J f ULM-3 ULIoI-3~' ULM-3 ULM'"3} I I ULM-3i-J Figure 11 - A modular realization of a ULC of six variables using ULM-3's only ---,1 ULM-3J----, ULM-31----, ULM-3 3RO LEVEL !----"Ll ULM-3 L.....::::::f r------ti ULM-3 -::::::f ULM-41 f (XI. X. •...• x.) J ~ Figure to-The modular realization of a ULC of 7 variables ULM-3 ~l of the ULM-3 in the first level in the path is Ao or A3 respectively. Fig. 11 shows such a modular realization of a ULC of 6 variables. It is noted that in this case we have not used the full capacity of the ULM-3 in the first level. In fact, if we are not restricted to use ULM-3's only in the modular realization, the three ULM-3's in the first and second levels can be substituted by a ULM-4 as shown in Fig. 12. The terminal connections and the residue functions for the front terminals of the ULM-4 in the last level can be found by considering the ULM-4 shown in Fig. 5 or 6 and the expansion of f(x., xu ... , xl1l ) with respect to the variables Xl, X2, and x3. The above modular realization technique can easily be extended to using ULM's of any variables. If only ULM's of a fixed number of variables can be used, it is often necessary to have some don't-care terminal connections to some of the ULM's and hence some of the ULM's are not utilized to their full capacity. It can be shown that if only ULM-4's are used in the modular realization, there will be no don't-care terminal connections to any ULM-4 ULM-3~ ""'--:::1.L ULM-3J-Figure 12-A m~dular realization of a ULC of six variables using ULM-3's and a ULM-4 for a ULC of n variables, where n = 3a + 4 and a is a nonnegative integer. However, if both ULM-3 and ULM -4 are used in the modular realization, the don't-care terminal connections can always be avoided. Furthermore, it is noted t~at the tree structure of the modular realization of a ULC of n variables using ULM-k's always has 2n- 1 front terminals in the last level for any k. 
Fault-detection tests for ULM's and a fault-diagnostic procedure for modular ULC's

The problem of developing a practical fault-diagnostic procedure for a logic circuit of a large number of variables is still far from being solved.8-12 When integrated circuit packages are used for logical design, so far there is no practical method of deriving a set of minimal fault-diagnostic tests which can locate the faulty packages. At present, it is possible to find a set of tests to detect all faults and to locate most of the faults to within a reasonable number of packages.13 In this section, we shall present the fault-detection tests for ULM's and a diagnostic procedure locating all the faulty ULM's in the modular realization of a ULC realizing a given logic function.

Fault-detection tests for ULM's

For a ULM built in an integrated circuit package, a defective unit usually means that a set of gates and possibly several connecting wires are burnt or broken. If we make no assumptions about the types of the faults in ULM's - whether they are due to single- or multiple-component failures, open- or short-circuited wires or gates, etc. - it is obvious that the fault-detection tests for ULM's must exhaust all possible combinations of the input terminals. Hence, for a ULM with p input terminals, where p is given by (3), 2^p tests are required to detect all possible faults in a ULM. It is seen that a ULM-3 requires 64 tests* and a ULM-4 requires 2048 tests.

*If the faults are restricted to "stuck-at-1" and "stuck-at-0" types, and if we assume that only a single fault can occur at a time in a ULM,9,11 it can be shown that the set of minimum tests for a ULM-3 consists of only 8 tests.
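In simulation the exhaustive test is a one-liner. The sketch below is illustrative only; module and reference are assumed to be callables on bit-tuples, not the paper's notation.

    from itertools import product

    def test_ulm(module, p, reference):
        # Apply all 2**p input combinations to `module` and compare with a
        # known-good `reference`; returns True iff no test fails.
        return all(module(bits) == reference(bits)
                   for bits in product((0, 1), repeat=p))

    # A ULM-3 has p = 6 input terminals (2 side, 4 front): 64 tests;
    # a ULM-4 has p = 11 (3 side, 8 front): 2048 tests.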
A diagnostic procedure to locate all the faulty ULM's in a ULC

Consider a ULC of n variables made of ULM-k's. Because a set of tests corresponding to all possible combinations of the n input variables will definitely detect all the faults in a logic circuit realizing a specific function, we need at most 2^n tests for detecting faults in the ULC realizing a given logic function.† If there are no restrictions on the type of possible faults in the ULM-k's, the 2^n tests also form the minimum test set. Let a test applied to a ULC of n variables be represented by the n-tuple (b1, b2, ..., bn) of binary components, where bl is the value of the input variable xl, l = 1, 2, ..., n, employed in the test. Let Ti and Ti' be the tests (b1, b2, ..., bn-1, 0) and (b1, b2, ..., bn-1, 1) respectively, where (b1, b2, ..., bn-1) has the decimal representation i.

†It is noted that passing the 2^n tests only guarantees that the ULC will realize the logic function under consideration, but does not guarantee that the ULC will realize any logic function of n variables correctly. Similar to the fault-detection tests for a ULM, the set of tests required to ensure a ULC realizing any logic function of n variables correctly will have 2^q tests, where q is the number of input terminals of the ULC. However, if we assume that there is only a single faulty module in a ULC at a time, then the minimum number of tests required to ensure the ULC realizing any logic function of n variables correctly is 2^(n+3).

It follows from the tree structure of the modular realization of a ULC that if there are no faults in the ULC, the tests Ti and Ti' will make the output terminal F logically connected to the front terminal Pi in the last level of the tree structure, where Pi is connected to the residue function f(b1, b2, ..., bn-1, xn). Hence, the output terminal F and the terminal Pi should have the same value under the tests Ti and Ti'. When F and Pi have different values under test Ti or Ti' or both, the terminal Pi is said to have a faulty test. Furthermore, when we say that we apply tests to terminal Pi, we mean that we apply the tests Ti and Ti' to the ULC. Because of the tree structure of the ULC, there is one and only one path from the output terminal F to each terminal Pi, and the path contains one and only one ULM-k in each level of the ULC. If a ULM-k Mr (of any level) is on the paths terminating at the terminals Pi, Pi+1, ..., Pi+d, we shall say that Mr covers the terminals Pi, Pi+1, ..., Pi+d. Now, the diagnostic procedure to locate all the faulty ULM-k's in a ULC of n variables can be summarized as follows:

1) Apply tests to terminals Pi, i = 0, 1, ..., 2^(n-1) - 1. If there exist no terminals with faulty tests, go to Step 4); otherwise, go to the next step.

2) Start from a = 1. Let La be the set of all the ULM-k's in the a-th level in which each ULM-k covers at least one faulty terminal. Apply the fault-detection tests to each of the ULM-k's in La. If all the ULM-k's in La are good, go to the next step. Otherwise, replace each faulty ULM-k in La and apply tests to all terminals Pi covered by each replaced ULM-k. Record the terminals Pi with new faulty tests together with those terminals with previous faulty tests not covered by the replaced ULM-k's, and go to the next step.

3) Increase a by 1, and go to Step 2) when a is smaller than the number of levels of the tree structure. Otherwise, the ULM-k's (in the last level) covering at least one terminal with a faulty test are faulty and should be replaced; go to Step 4).

4) All the ULM-k's in the ULC are good for realizing the logic function under consideration.

To demonstrate this diagnostic procedure, let us consider the ULC shown in Fig. 10. We first apply tests to all Pi's. Assume that terminals P0, P4, P5, P6, P7, P31, P56, P57 and P63 are the terminals with faulty tests, and assume that we find M21 is good. Then we have to apply fault-detection tests to M17, M18 and M20, since each of these ULM-3's covers at least one terminal with a faulty test. Suppose we find that M17 is faulty and that M18 and M20 are good. Then we replace M17 and apply tests to terminals P0, P1, ..., P15. Let P1, P9, P10 and P11 be the terminals with faulty tests after replacing M17. Hence, we know that P1, P9, P10, P11, P31, P56, P57 and P63 are all the terminals with faulty tests. Since we have reached the last level, we know that M1, M3, M8, M15 and M16 are faulty and should be replaced. This terminates the procedure.
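The four steps can be compressed into an illustrative routine. In the sketch below, ulc, residue, covers, exhaustive_test and replace are stand-ins for operations on the hardware and are assumptions of this sketch; it also tests the last level like any other, a slight simplification of Step 3).

    def faulty_terminals(ulc, residue, n, terminals):
        # Step 1): terminal P_i fails if output F disagrees with its residue
        # function under test T_i (x_n = 0) or T_i' (x_n = 1).
        bad = set()
        for i in terminals:
            b = [(i >> k) & 1 for k in reversed(range(n - 1))]
            if any(ulc(b + [xn]) != residue(i, xn) for xn in (0, 1)):
                bad.add(i)
        return bad

    def diagnose(ulc, residue, n, levels, covers, exhaustive_test, replace):
        bad = faulty_terminals(ulc, residue, n, range(2 ** (n - 1)))
        for modules in levels:                       # Steps 2) and 3)
            if not bad:
                break
            suspects = [m for m in modules if covers(m) & bad]
            retest = set()
            for m in suspects:
                if not exhaustive_test(m):
                    replace(m)                       # swap in a good ULM-k
                    retest |= covers(m)
            bad = (bad - retest) | faulty_terminals(ulc, residue, n, retest)
        # Step 4): no terminal now has a faulty test, so every remaining
        # ULM-k is good for realizing the given function.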
Improving the reliability of the modular realization of a ULC by an error-correcting code

The reliability of the modular realization of a ULC can be improved by adding redundant ULM's using an error-correcting code. In this section, we shall demonstrate how to apply the Hamming single-error-correcting code to increase the reliability of the modular realization of a ULC of n variables. The circuit is shown in Fig. 13, and the following notation is employed:

f  = f(x1, x2, x3, ..., xn)
f0 = f(0, 0, x3, ..., xn)
f1 = f(0, 1, x3, ..., xn)
f2 = f(1, 0, x3, ..., xn)
f3 = f(1, 1, x3, ..., xn)   (9)

The four blocks B0, B1, B2 and B3 are the modular realizations of the ULC's of n-2 variables and have the outputs f0, f1, f2 and f3 respectively. The Hamming single-error-correcting code with 4 information symbols is used, and its parity-check matrix H and generator matrix G are

H = | 0 0 0 1 1 1 1 |
    | 0 1 1 0 0 1 1 |   (10)
    | 1 0 1 0 1 0 1 |

G = | 1 1 1 0 0 0 0 |
    | 1 0 0 1 1 0 0 |   (11)
    | 0 1 0 1 0 1 0 |
    | 1 1 0 1 0 0 1 |

Figure 13 - A modular realization of a ULC of n variables using the Hamming single-error-correcting code

The 4 information symbols to be encoded are f0, f1, f2 and f3, which are placed in the 3rd, 5th, 6th and 7th positions of the 7-bit code word respectively, while the remaining three positions hold the parity-check symbols p1, p2, p3, as shown in (10). It follows from (10) and (11) that the parity-check symbols p1, p2 and p3 can be expressed in terms of f0, f1, f2 and f3 as follows:

p1 = f0 ⊕ f1 ⊕ f3
p2 = f0 ⊕ f2 ⊕ f3
p3 = f1 ⊕ f2 ⊕ f3

Since f0, f1, f2 and f3 are functions of the n-2 variables x3, ..., xn, p1, p2 and p3 are also functions of these variables, and each of them can likewise be realized by a modular ULC of n-2 variables. Thus, we have to add 75% redundant ULC's of n-2 variables and one highly reliable decoder-ULM-3 package. It is noted that the method illustrated above can easily be extended to the use of a Hamming code with more than four information bits. Furthermore, the error-correcting code that can be used for increasing the reliability of the ULC is not restricted to the Hamming code, and the number of errors that can be corrected is not restricted to a single one.
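The encoding and correction implied by (10) and (11) can be sketched directly, with the code word in the position order (p1, p2, f0, p3, f1, f2, f3). This Python is an illustration of the arithmetic, not of the circuit itself.

    def encode(f0, f1, f2, f3):
        p1 = f0 ^ f1 ^ f3
        p2 = f0 ^ f2 ^ f3
        p3 = f1 ^ f2 ^ f3
        return [p1, p2, f0, p3, f1, f2, f3]

    def correct(word):
        # Syndrome = binary position of the single error (0 means no error),
        # since each column of H is the binary representation of its position.
        s = 0
        for pos in range(1, 8):
            if word[pos - 1]:
                s ^= pos
        if s:
            word[s - 1] ^= 1          # flip the erroneous symbol
        return word

    # With the information symbols corrected, a final ULM-3 selects the
    # output f = word[[3, 5, 6, 7][2 * x1 + x2] - 1].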
DISCUSSION

In this paper, we have presented simple design techniques for ULC's, which are especially suitable to the use of integrated circuit packages for implementation. Various effects, such as those of the number of pins, the number of logic gates and the number of logic levels in a package, on the design of ULC's have been considered. For the ULC's obtained by the modular realization method, a diagnostic procedure for locating all the faulty ULM's in a faulty ULC has been established. Furthermore, a method for improving the reliability of a ULC using error-correcting codes has been demonstrated.

It is noted that an important practical advantage of using a ULC to realize a given logic function is that we need not find the minimal sum or minimal product of the logic function, which, however, is required for conventional realization methods. The only simplification process necessary to be applied to the logic function is to detect whether it can be written in a form which involves fewer variables. This result is used to determine a ULC of the smallest number of variables for realizing the given logic function.

It should be pointed out that the ULC's considered in this paper are restricted to realizing any single logic function. A natural extension of this study is to consider the design of a multiple-output ULC for realizing any set of m logic functions. One way to obtain such a multiple-output ULC is simply to connect m ULC's, each of which realizes one of the m logic functions, in the form of sharing the common input-variable terminals. It is quite unlikely that a multiple-output ULC with fewer I/O terminals can be obtained, because in general there are no fixed relations among the m logic functions to be realized.

ACKNOWLEDGMENT

The work reported here was supported in part by the U.S. Office of Scientific Research under Grant No. AF-AFOSR-1292-67.

REFERENCES

1 D C FORSLUND R WAXMAN The universal logic block (ULB) and its application to logic design Conference Record of 1966 Seventh Annual Symposium on Switching and Automata Theory IEEE Publication 16C40 pp 236-250
2 J T ELLISON B KOLMAN A P SCHIAVO Universal function modules UNIVAC Tech Rept Contract No AF19(628)-6012 (DDC AD-655395) April 1967
3 B ELSPAS et al Properties of cellular arrays for logic and storage Stanford Research Institute Scientific Report 3 Contract No AF-19-628-5828 (DDC AD-658832) pp 59-83 June 1967
4 L HELLERMAN A catalogue of three-variable OR-INVERT and AND-INVERT logical circuits IEEE Transactions on Electronic Computers vol EC-12 pp 198-223 1963
5 R B HURLEY Transistor logic circuits New York Wiley 1961
6 R A SMITH Minimal three-variable NOR and NAND logic circuits IEEE Transactions on Electronic Computers vol EC-14 no 1 pp 79-81 February 1965
7 W FRANK KING III The synthesis of multipurpose logic devices Conference Record of 1966 Seventh Annual Symposium on Switching and Automata Theory IEEE Publication 16C40 pp 227-235
8 J D BRULE R A JOHNSON E KLETSKY Diagnosis of equipment failures IRE Transactions on Reliability and Quality Control vol RQC-9 pp 23-34 1960
9 J M GALEY R E NORBY J P ROTH Techniques for the diagnosis of switching circuit failures IEEE Transactions on Comm and Elect vol 83 no 74 pp 509-514 1964
10 H Y CHANG An algorithm for selecting an optimum set of diagnostic tests IEEE Transactions on Electronic Computers vol EC-14 no 5 pp 705-711 1965
11 D B ARMSTRONG On finding a nearly minimal set of fault detection tests for combinational logic nets IEEE Transactions on Electronic Computers vol EC-14 no 1 pp 66-73 1966
12 W H KAUTZ Fault diagnosis in combinational digital circuits First Annual IEEE Computer Conference Digest IEEE Publication 16C51 pp 2-5 1967
13 L M SPANDORFER (moderator) Logic partitioning in LSI panel discussion IEEE Computer Group News vol 1 no 6 p 16 May 1967

Sorting networks and their applications

by K. E. BATCHER
Goodyear Aerospace Corporation
Akron, Ohio

INTRODUCTION

To achieve high throughput rates today's computers perform several operations simultaneously. Not only are I/O operations performed concurrently with computing, but also, in multiprocessors, several computing operations are done concurrently. A major problem in the design of such a computing system is the connecting together of the various parts of the system (the I/O devices, memories, processing units, etc.) in such a way that all the required data transfers can be accommodated. One common scheme is a high-speed bus which is time-shared by the various parts; the speed of available hardware limits this scheme. Another scheme is a cross-bar switch or matrix; limiting factors here are the amount of hardware (an m by n matrix requires m times n cross-points) and the fan-in and fan-out of the hardware.

This paper describes networks that have a fast sorting or ordering capability (sorting networks or sorting memories). In (1/2)p(p + 1) steps 2^p words can be ordered. A sorting network can be used as a multiple-input, multiple-output switching network. It has the advantages over a normal crossbar of requiring less hardware (an n-input n-output switching network can be built with approximately (1/2)n(log2 n)^2 elements versus n^2 in a normal crossbar) and of having a constant fan-in and fan-out requirement on its elements. Thus, a sorting network should be useful as a flexible means of tying together the various parts of a large-scale computing system. Thousands of input and output lines can be accommodated with a reasonable amount of hardware. Other applications of sorting memories are as a switching network with buffering, a multi-access memory, a multi-access content-addressable memory and as a multiprocessor. Of course, the networks also may be used just for sorting and merging.

Comparison elements

The basic element of sorting networks is the comparison element (Figure 1). It receives two numbers over its inputs, A and B, and presents their minimum on its L output and their maximum on its H output.

Figure 1 - Symbol for a comparison element

If the numbers in and out of the element are transmitted serially, most-significant bit first, the element has the state diagram of Figure 2. A reset input places the element in the A = B state, and as long as the A and B bits agree it remains in this state with its outputs equal to its inputs. When the A and B bits disagree, the element goes to the A < B or the A > B state and remains there until the next reset input. In the A > B state the output H equals the input A and the output L equals the input B. In the A < B state the opposite situation occurs.

Figure 2 - State diagram for a serial comparison element (most-significant bit first)
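The state diagram translates directly into a small behavioral model. This is an illustrative simulation, not the circuit itself; the function name and representation are invented here.

    def comparison_element(a_bits, b_bits):
        # Stream two equal-length numbers MSB first; return (L, H) as bit
        # lists carrying min(A, B) and max(A, B).
        state = "A=B"                          # state after reset
        low, high = [], []
        for a, b in zip(a_bits, b_bits):
            if state == "A=B" and a != b:      # first disagreement decides
                state = "A>B" if a > b else "A<B"
            if state == "A>B":
                low.append(b); high.append(a)
            else:                              # A<B, or still equal: outputs
                low.append(a); high.append(b)  # follow the inputs
        return low, high

    # comparison_element([1,0,1], [0,1,1])  ->  ([0,1,1], [1,0,1]),
    # i.e. min(5, 3) = 3 on L and max(5, 3) = 5 on H.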
A serial comparison element can be implemented with 13 NORs and can be put on one integrated-circuit chip. When used in sorting networks, each H and L output will feed an A or B input of another element, so the fan-out is constant regardless of network size; this fact could be used to simplify the design of the chip. With several of the currently available logic families, speeds of 100 nanoseconds/bit with a propagation delay from inputs to outputs of 40 nanoseconds are easily achieved. Faster operation can be attained by treating several bits in parallel in each step with more complex comparison elements.

Some of the applications described below will require "bi-directional" comparison elements. Besides the A and B inputs and the H and L outputs, there are H' and L' inputs and A' and B' outputs (see Figure 1). If A > B then B' = L' and A' = H'; if A < B then B' = H' and A' = L'; otherwise A' and B' are left undefined. Information flows from left to right over the solid lines and from right to left over the dotted lines.

Odd-even merging networks

Merging is the process of arranging two ascendingly-ordered lists of numbers into one ascendingly-ordered list. Figure 3 shows a symbol for an "s by t" merging network, in which the s numbers of one ascendingly-ordered list, a1, a2, ..., as, are presented over s inputs simultaneously with the t numbers of another ascendingly-ordered list, b1, b2, ..., bt, over another t inputs. The s + t outputs of the merging network present the s + t numbers of the merged lists in ascending order, c1, c2, ..., cs+t.

Figure 3 - Symbol for an "s by t" merging network

A "1 by 1" merging network is simply one comparison element. Larger networks can be built by using the iterative rule shown in Figure 4. An "s by t" merging network can be built by presenting the odd-indexed numbers of the two input lists to one small merging network (the odd merge), presenting the even-indexed numbers to another small merging network (the even merge), and then comparing the outputs of these small merges with a row of comparison elements. The lowest output of the odd merge is left alone and becomes the lowest number of the final list. The i-th output of the even merge is compared with the (i+1)-th output of the odd merge to form the 2i-th and (2i+1)-th numbers of the final list for all applicable i's. This may or may not exhaust all the outputs of the odd and even merges; if an output remains in the odd or even merge, it is left alone and becomes the highest number in the final list. Appendix A sketches the proof of this iterative rule.

Figure 4 - Iterative rule for odd-even merging networks
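The iterative rule can also be read in executable form. The sketch below is a list-level rendering in which min/max plays the role of one comparison element; it is an illustration of the rule, not a description of the hardware.

    def oe_merge(a, b):
        # Merge two ascending lists by the odd-even rule of Figure 4.
        if not a or not b:
            return a + b
        if len(a) == 1 and len(b) == 1:
            return [min(a[0], b[0]), max(a[0], b[0])]   # one element
        d = oe_merge(a[0::2], b[0::2])                  # the odd merge
        e = oe_merge(a[1::2], b[1::2])                  # the even merge
        out = [d[0]]                                    # lowest output alone
        for i in range(len(e)):                         # row of comparisons
            if i + 1 < len(d):
                out += [min(e[i], d[i + 1]), max(e[i], d[i + 1])]
            else:
                out.append(e[i])                        # leftover, even merge
        if len(d) > len(e) + 1:
            out.append(d[-1])                           # leftover, odd merge
        return out

    # oe_merge([2, 4], [1, 3])  ->  [1, 2, 3, 4]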
Figure 5 shows a "2 by 2" and a "4 by 4" merging network constructed by this rule. A "2^p by 2^p" merging network constructed by this rule uses p·2^p + 1 comparison elements. The longest path goes through p + 1 comparison elements and the shortest path through one element. Doubling the size of a merge only increases the longest path by unity, so the merging time increases slowly with the size of the network.

Figure 5 - Construction of "2 by 2" and "4 by 4" odd-even merging networks

Bitonic sorters

Another way of constructing merging networks from comparison elements is presented here. While requiring somewhat more elements than the odd-even merging networks, these have the advantages of flexibility (one network can accommodate input lists of various lengths) and of modularity (a large network can be split up into several identical modules).2

We will call a sequence of numbers bitonic if it is the juxtaposition of two monotonic sequences, one ascending, the other descending. We also say it remains bitonic if it is split anywhere and the two parts interchanged. Since any two monotonic sequences can be put together to form a bitonic sequence, a network which rearranges a bitonic sequence into monotonic order (a bitonic sorter) can be used as a merging network. Appendix B shows that if a sequence of 2n numbers, a1, a2, ..., a2n, is bitonic and if we form the two n-number sequences

min(a1, an+1), min(a2, an+2), ..., min(an, a2n)   (1)

and

max(a1, an+1), max(a2, an+2), ..., max(an, a2n),   (2)

then each of these sequences is bitonic and no number of (1) is greater than any number of (2). This fact gives us the iterative rule illustrated in Figure 6. A bitonic sorter for 2n numbers can be constructed from n comparison elements and two bitonic sorters for n numbers. The comparison elements form the sequences (1) and (2), and since each is bitonic they are sorted by the two n-number bitonic sorters. Since no number of (1) is greater than any number of (2), the output of one bitonic sorter is the lower half of the sort and the output of the other is the upper half.

Figure 6 - Iterative rule for bitonic sorters

A bitonic sorter for 2 numbers is simply a comparison element, and using the iterative rule bitonic sorters for 2^p numbers can be constructed for any p. Figure 7 shows bitonic sorters for 4 numbers and 8 numbers.* A 2^p-number bitonic sorter requires p levels of 2^(p-1) elements each, for a total of p·2^(p-1) elements. It can act as a merging network for any two input lists whose total length equals 2^p.

Figure 7 - Construction of bitonic sorters for 4 numbers and for 8 numbers

Large bitonic sorters can be constructed from a number of smaller bitonic sorters; for instance, a 16-number bitonic sorter can be constructed from eight 4-number bitonic sorters, as shown in Fig. 8. This allows large networks to be built of standard modules of convenient size.

*Readers may recognize the similarity between the topologies of the bitonic sorter and the fast Fourier transform.

Figure 8 - A 16-number bitonic sorter constructed from eight 4-number bitonic sorters
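The rule of Figure 6 has an equally compact executable reading. The sketch assumes the input is bitonic with length a power of two, and is illustrative only.

    def bitonic_sort(seq):
        # Rearrange a bitonic sequence into ascending order.
        n = len(seq) // 2
        if n == 0:
            return list(seq)
        lo = [min(seq[i], seq[i + n]) for i in range(n)]   # sequence (1)
        hi = [max(seq[i], seq[i + n]) for i in range(n)]   # sequence (2)
        return bitonic_sort(lo) + bitonic_sort(hi)         # both again bitonic

    # Used as a merging network: concatenate one ascending and one
    # descending list, e.g. bitonic_sort([1, 4, 6, 9, 8, 5, 3, 2])
    # returns [1, 2, 3, 4, 5, 6, 8, 9].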
Figure 5 -Construction of"2 by 2" and "4 by 4" odd-evell" merging networks *-Readers may recognize the similarity between the topologies of the bitonic sorter and the fast-fourier-transform. a a a a 1 2 3 4 b 1 b 2 b 3 b 4 A L C 1 C 2 C 3 C 4 C 5 C s a 310 Spring Joint Computer Conference, 1968 ::=v.a:v3. - 0. .------. 3 / 'I A L I V In - ! TEM BITONIC SORTER Cn - 2 C n-1 C n a a n-ITEM BITONIC • SORTER ~ Cn+1 C C 1 C 2 C 3 C 4 1 a n+2 2 3 C_*~ " .J as a a C s s 6 C 7 C 7 C as a 1 , a , • • ., a IS 2 2n C :s; C :S; • • • :s; C 1 2 2n BI TON I C Figure 6 -Iterative rule for bitonic sorters e Figure 7 - Construction of bitonic sorters for 4 numbers and for 8 numbers a 1 as----l a a 9 13 S orting networks A sorter for arbitrary sequences can be constructed from odd-even merges or bitonic sorters using the well-known sorting-by-merging scheme: The numbers are combined two at a time to form ordered lists of length two; these lists are merged two at a time to form ordered lists of length four, etc. until all numbers are merged into one ordered list. To sort 2P numbers using odd-even merges requires 2P- 1 comparison elements followed by 2P- 2 "2-by-2" merging networks followed by 2P-~ "4-by-4" merging networks, etc., etc. The longest path will go through (~)p (p + 1) elements and the shortest path through p elements. The network requires (p2 - P + 4)2 P- 2- 1 comparison elements. To sort 2P numbers using bitonic sorters requires (~)p(p + 1) levels each with 2P - 1 elements for (p2 + p) 2P- 2 elements. Each path goes through (~)p(p + 1) levels. A sorter for 1024 numbers will have 55 levels and 24,063 elements with odd-even merges or 28,160 elements with bitonic sorters. With a 40 nanosecond 4-NUMBER B I TOrJ I C SORTER Figure 8 - A 16-number bitonic sorter constructed from eight 4-number bitonic sorters propagation delay per level the total delay is 2.2 microseconds. Serial transmission of the bits would require about this much time between successive bits of the numbers unless fe-clocking occurs within the Sorting Networks and Their Applications network. Parallel-input-parallel-output registers of 1024 bits each can be placed between certain levels to perform this task or the re-clocking may be incorporated within each canparison elGTlent with a pair of flip-flops on the outputs. The latter scheme does not add to the terminal comit of the comparison element so the cost of the added flip-flops on the comparison element chip is small. One can use any of the familiar techniques for driving shift registers such as the "A-B" technique where successive levels are clocked out-ofphase with each other. With present circuit and wiring techniques a bit rate of 10 megahertz may be possible with 50 nanosecond delay per level (2.75 microsecond delay from input to output of a 1024-word sorter). With re-clocking in the elements and odd-even merges extra elements are needed to balance the unequal-length paths. Bitonic sorters do not have this problem. a fixed output address and a control bit equal to O. At the right side of the m by n merge the m + n items are in one ordered list; each address-inserter item will be directly below any input items with the same address. The adjacent word transfer network, looking at the control bits, connects each address-inserter item to the input item directly above it if one exists (the input item with lowest priority number is picked in each case). The elements in the sort and the merge are bi-directional so two-way paths are formed from input to output. 
Applications

The fast sorting capability of these networks allows their use in solving other problems where large sets of data must be manipulated. Some of these applications are sketched below.

Switching network

A sorting network can connect its input lines to its output lines with any permutation. The connection is made by numbering the output lines in order and presenting the desired output address for each input line at the input. The sorting network sorts the addresses and in the process makes a connection from each input line to its desired output line for the transmission of data. Bi-directional paths will be obtained if bi-directional comparison elements are used. An alternative permuting network has been shown in the recent literature3 which has fewer elements [(p - 1)·2^p + 1 versus (p^2 - p + 4)·2^(p-2) - 1 for permuting 2^p items] but a more complex set-up algorithm.

Switching network with conflict resolution

The aforementioned switching network assumes each input wants a unique output line. In many applications conflicts between inputs occur and must be resolved by inhibiting conflicting inputs. Figure 9 sketches an m-input, n-output network that performs this task. Each input line inserts a word containing the output address desired (or zeroes if the line is inactive), a control bit equal to 1 and a priority number into an m-item sorting network with bi-directional elements. This orders the items so input items with the same output address are grouped together and ordered by their priority number. The ordered set of m input items is merged with a set of n items, each containing a fixed output address and a control bit equal to 0. At the right side of the m by n merge the m + n items are in one ordered list; each address-inserter item will be directly below any input items with the same address. The adjacent word transfer network, looking at the control bits, connects each address-inserter item to the input item directly above it if one exists (the input item with lowest priority number is picked in each case). The elements in the sort and the merge are bi-directional, so two-way paths are formed from input to output. The adjacent word transfer sends back signals over each path to signal each input and output line whether or not a connection has been established. Data can then be transmitted over each of the connected input lines.

Figure 9 - An m-input, n-output switching network with conflict resolution
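A behavioral sketch of Figure 9 follows. It is a software model only: the tuple fields are arranged so that Python's sort reproduces the ordering described (address-inserter items directly below same-address requests, with the lowest priority number adjacent to the inserter), and all names are invented here.

    def resolve(requests, n_outputs):
        # requests: one (desired_output, priority_no) pair per input line;
        # among conflicting inputs the lowest priority number wins.
        items = [(addr, 0, -prio, line)
                 for line, (addr, prio) in enumerate(requests)]
        inserters = [(addr, 1) for addr in range(n_outputs)]
        merged = sorted(items + inserters)        # the m by n merge
        connections = {}
        for k, item in enumerate(merged):
            if len(item) == 2 and k > 0:          # an address-inserter item
                above = merged[k - 1]             # adjacent word transfer
                if len(above) == 4 and above[0] == item[0]:
                    connections[item[0]] = above[3]
        return connections

    # resolve([(2, 5), (2, 1), (0, 3)], 4)  ->  {0: 2, 2: 1}: output 2 goes
    # to input line 1 (priority 1 beats 5); line 0 is inhibited.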
Selection of which words to output is accommodated by reserving the least-significant-bit (LSB) of each word; ""1" for normal words and "0" 312 Spring Joint Computer Conference, 1968 RECIRCULATION ~HiGHEND~ ( EMP TY WORDS' "i NORMAL WORDS AND OUTPUT REQUESTS OUTPUT WORDS OUTPUTS MSB 1\1----1 \1 LOW END LSB 11ADDREssi DATA 111~g~~AL O/ADDRESSI DATA 11 I ~3~~J~R 1IADDRESSlo--------------O 10 10UTPUT . REQUEST Figure 10- A multi-access memory for "output requests". Logic between adjacent words causes an output request to affect the word directly above it. During one recirculation cycle new words and output requests are entered into memory. During the next recirculation cycle all words are recirculated with no new entries. At the end of the cycle the LSB of each word will precede the MSB of the same word (no reordering occurs in the second cycle). Output requests are identified by a HO" in the LSB and for each request logic performs the following action: if the word above the request is a n9rmal word (" 1" in the LSB) change its MSB to a "0" and empty the request (change all its bits to "1" as they fly by), if the word above the request is another request change the MSB of the first request to "0". During the following recirculation cycle the selected words and unfulfilled requests flow to the low end of memory and are read by output lines. Because the request itself is outputted if no word is found, as many outputs as original requests occur. If the original requests were in order the outputs directly correspond to them (a second sorting network can put the original output requests in order). In use the more-significant part of each word is used as an address and the rest as data. To request a certain address an output request is sent in with that address and zeros fot data. The word returned will be at that address or a higher address if the requested address is empty. While a complete cycle may be long in this memory (50-bit words at 100 nanoseconds/bit = 5 microsecondsirecirculation = 10 microseconds/complete cycle) many inputs and outputs can be accommodated in each cycle. An effective rate of 100 nanoseconds/ word is achieved with 100 inputs and outputs. Such a memory could be useful as the "common memory" of multiprocessors. The self-sorting capability could be useful for keeping "task lists" up to date and performing other housekeeping tasks. Other uses may be as a message "store-and-forward" system and as a switching network with buffering capability. In these uses each output device is given a unique address which it continually interrogates; input devices send their data to these addresses. Multi-access content addressable memory By adding facilities for shifting the bits within the words in the aforementioned memory different fields of the words can be brought into the more-significant portions which govern the ordering of the words. Addressing can then take place on any part of the words. As long as the same field positions are being searched more than one search can be accommodated simultaneously. Multi-processor By adding processing logic to perform additions, subtractions, etc., on groups of adjacent words of a sorting memory one can implement a multi-processor. The sorting capability is used to transmit operands between processors. Merely by changing address fields the multiprocessor can be reconfigured quickly. Such a multi-processor can keep up with the "dynamic topology" of certain real-time problems. 
SUMMARY

Sorting networks capable of sorting thousands of items in the order of microseconds can be constructed with present-day hardware. Such fast sorting capability can be used to manipulate large sets of data quickly and solve some of the communications problems associated with large-scale computing systems. Standard modules of convenient sizes can be picked and used in any size network to lower the cost. Large-scale integration can be applied if the problem of laying out the rather complex topology of the network can be solved. Studies of this problem are being conducted at Goodyear Aerospace.

APPENDIX A - SKETCH OF PROOF OF ITERATIVE RULE FOR ODD-EVEN MERGING

Let a1, a2, a3, ... and b1, b2, b3, ... be the two ordered input sequences. Let c1, c2, c3, ... be their ordered merge, d1, d2, d3, ... the ordered merge of their odd-indexed terms, and e1, e2, e3, ... the ordered merge of their even-indexed terms. For a given i, let k of the i + 1 terms d1, d2, d3, ..., di+1 come from a1, a3, a5, ... and i + 1 - k come from b1, b3, b5, .... The term di+1 is greater than or equal to k terms of a1, a3, a5, ... and therefore greater than or equal to 2k - 1 terms of a1, a2, a3, .... Similarly, it is greater than or equal to 2i + 1 - 2k terms of b1, b2, b3, ..., and hence to 2i terms of c1, c2, c3, .... Therefore

di+1 >= c2i.   (A1)

Similarly, from consideration of the i terms e1, e2, e3, ..., ei, the inequality

ei >= c2i   (A2)

is obtained. Now consider the 2i + 1 terms c1, c2, c3, ..., c2i+1, and let k of them come from a1, a2, a3, ... and 2i + 1 - k from b1, b2, b3, .... If k is even, c2i+1 is greater than or equal to (1/2)k terms of a1, a3, a5, ..., to i + 1 - (1/2)k terms of b1, b3, b5, ..., and hence to i + 1 terms of d1, d2, d3, ...; the same count is obtained when k is odd. Therefore c2i+1 >= di+1, and a similar count over the even-indexed terms shows c2i+1 >= ei. Together with (A1) and (A2), this places both di+1 and ei between c2i and c2i+1; since the d's and e's together are exactly the c's, and d1 = c1, it follows that c2i and c2i+1 are di+1 and ei in some order. One comparison element thus suffices to put each such pair in order, which establishes the iterative rule.

APPENDIX B - SKETCH OF PROOF OF ITERATIVE RULE FOR BITONIC SORTERS

Let a1, a2, ..., a2n be bitonic, and let di = min(ai, an+i) and ei = max(ai, an+i) for i = 1, 2, ..., n. The proposition is that d1, ..., dn and e1, ..., en are each bitonic and that

max(d1, d2, ..., dn) <= min(e1, e2, ..., en).   (A7)

If a1, a2, ..., a2n is split into two parts and the parts interchanged, d1, d2, ..., dn and e1, e2, ..., en undergo a similar interchange. This affects neither the bitonic property nor (A7), so it is sufficient to prove the proposition for the case where

a1 <= a2 <= ... <= aj-1 <= aj >= aj+1 >= ... >= a2n   (A8)

is true for some j (1 <= j <= 2n). Suppose j > n. Then from aj-n <= aj we can find a k (j <= k <= 2n) such that ak-n <= ak and ak-n+1 > ak+1 (the sequence aj, aj+1, ..., a2n is decreasing while the sequence aj-n, aj+1-n, ..., an is increasing). Then

di = ai and ei = ai+n for 1 <= i <= k - n,
di = ai+n and ei = ai for k - n < i <= n,   (A9)

from which each of the two sequences is seen to be bitonic and to satisfy (A7). The case j <= n is handled symmetrically.

Computer Input of Forms
Figure 1 - New form, as laid out on a typewriter

Creating a new form

To create a new form, the user first of all assigns it a specific number. Then the form is laid out on the typewriter (Fig. 1). Subsequently, computer processing will generate a master form (Fig. 2), suitable for duplication, from which a supply of blank forms can be prepared. In the course of the processing, a number is assigned to each box on the form (but an override option is available). These numbers will be used to code the entries these boxes are destined to hold. The diagram in Fig. 3 shows these operations.

Figure 2 - High-speed printer output. This form corresponds to the one in Fig. 1, but the program has added a reference box (top) and a validation box (bottom)

Figure 3 - The diagrams show, from top to bottom, the creation of a new form, the updating of a file, and the querying of a file

Such a form differentiates itself from other forms by always providing boxes, not just spaces, for all the information to be entered. Being contained in a particular box defines an entry; and obviously, the contents of a box are not allowed to spill over. Each new form is provided by the program with two additional boxes (compare Fig. 1 with Fig. 2). In the first of these boxes, the typist repeats the form's identification number. During subsequent machine processing, this number will identify the particular form, and its location will serve as a reference point to the location of all the other information on the form. The second of these boxes, placed at the bottom of the form, is to contain a validation check. By not filling it in, the typist can discard an erroneously filled-in form. (It is not possible to otherwise discard the coded counterpart on tape.) By filling it in, the computer is enabled to compensate for any misalignment of the form in the typewriter.

New forms can be created at will, without restrictions except perhaps as to their size, and without requirements for programming. As soon as a form has been created, the system is ready to accept it, and its input will be compatible, in the sense discussed below, with the prior input, or with independently created files. Thus, the user is provided with a flexible input system, accommodating a variety of input, which he himself can tailor to suit his own needs.

Nevertheless, it will be obvious to the user that the design of a new form takes much thought. For instance, a box captioned "temperature" may collect numbers representing either degrees Fahrenheit or Centigrade. He must therefore make the caption specific. Similarly, if the system is to be asked subsequently to alphabetize names, separate boxes ought to be provided for the different parts of a name, so that the last name can be selected unequivocally. The means by which he controls the information to be stored, as well as its organization, are relatively simple - by furnishing adequate space to avoid constricting entries, by being liberal in providing boxes for separating different types of data, and by being explicit with his captions. But although this requires thought on his part, the thought here is directed properly to the prospective data and to its retrieval, and not to the machine and to its requirements.
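One way to picture the resulting records is sketched below; the representation (form number, box number, contents) triples and the box numbers used are my illustration, since the actual tape format is not described here.

    def encode_form(form_no, entries, validation_box):
        # entries: {box_no: typed contents}. A form whose validation box was
        # left blank is discarded; otherwise each box becomes one record
        # keyed by (form number, box number).
        if not entries.get(validation_box):
            return []                          # typist discarded the form
        return [(form_no, box, text) for box, text in sorted(entries.items())]

    # e.g. encode_form(3, {1: '3', 62: 'stable', 99: 'OK'}, validation_box=99)
    # -> [(3, 1, '3'), (3, 62, 'stable'), (3, 99, 'OK')]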
Should it be necessary to alter the design of a form, care must be taken that the same information is assigned the same box number as before (the number-override option is then exerted); but the shape or the relative location of the boxes on the redesigned form do not affect the retrieval. If the new form is to accept additional data, the boxes destined to receive this data must be assigned numbers not used on the replaced form. If the changes are more drastic than that, it is better to create a new form, having a different form number.

Accessibility of the stored data

Highly constrained information, such as input on fixed-field coded cards, results in highly consistent files. It might therefore be expected that forms, encoded under relatively lax rules, would result in files of commensurately low consistency. But this is not altogether the case. If the purpose of the constraints were solely to insure consistency, then one might expect such an analogy to hold. But although consistency may profit from the standardizations, abbreviations, etc., which constrain the input, the constraints are there, in good measure, not to insure consistency, but to maximize the utilization of scarce space. The IBM card has only 80 columns, and ever since the days of the first card collators, processing has been easier if only one card is used per item of input. So, in order to conform to this processing precept, the input information is chopped up and squeezed to the very limits of recognizability. In this game, each file fends for itself. Input rules, not to speak of additional space-saving tactics such as superimposed coding, are developed according to a strategy which takes account of the different contents of particular files. Consequently, the same information may be coded differently on different files. This lack of compatibility among files of this kind precludes their being merged into a combined data base, desirable as this may seem. For that matter, how often has not a file become obsolete because a change to a new card format, with the corollary need for updating the backlog, would have proved too costly? The very constraints thus often work to the detriment of compatibility.

This is not to say that an improved compatibility among files, obtained by reducing input constraints, will be without detriment to the consistency within each file involved. But in this case, a certain shift of the burden for data correlation from input to output, a shift from rigid rules developed a priori to an intelligent examination of virtually original input, is not without merits. This is especially so because it would appear that problems due to inconsistencies are experienced not so much in searching a single file as in attempting to correlate two independent files.

Different kinds of data are affected differently by reducing input constraints. Assessing the potential precision of retrieval of data entered on forms, the highest score will go to entries entered by means of a check mark. Numerical entries are next, absolute precision in their retrieval being affected only by outright input errors. (A referee wondered whether, using unconstrained input on forms, the similarity between the numbers 1,234.5, 1234.5, 01234.5, 1.2345 x 10^3, 1.2345E3, and 1234 1/2 would be perceived. In the current state of the program, the first three are accepted at their true value; the last three are rejected, with a message stating that they are not acceptable numbers.)
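A plausible rendering of that acceptance rule follows; the regular expression is a guess consistent with the referee's example, not the program's actual rule.

    import re

    PLAIN_NUMBER = re.compile(r'^\d{1,3}(,\d{3})*(\.\d+)?$|^\d+(\.\d+)?$')

    def accept(text):
        # Plain decimal forms, with optional commas and leading zeros, are
        # taken at face value; exponent and fraction notations are rejected.
        t = text.strip()
        if PLAIN_NUMBER.match(t):
            return float(t.replace(',', ''))
        raise ValueError(f'{text!r} is not an acceptable number')

    # accept('1,234.5') == accept('1234.5') == accept('01234.5') == 1234.5
    # accept('1.2345E3') raises ValueError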
Next down the line of diminishing precision of retrieval are entries consisting of single words. Here, synonymy can arise. If only a single file is searched, this problem can be solved at retrieval time without too much difficulty. If the searcher is looking for the color "red" in a particular box on a form, he must also, in this case, look for "scarlet", "crimson", and for any other words his dictionary might suggest. But suppose that the investigator is attempting to correlate two files. Suppose, for instance, that he wishes to determine whether the color of an insecticide detracts from its effectiveness. He has obtained two files; on one are recorded the sensitivities of various insects to various colors, and on the other the actual colors of various pesticides. He intends to match any color given on one file with the same color on the other. Because of synonymy, he will not obtain all the legitimate matches. Possibly, the problem may be solved by making two passes. In the first, the different terms used on each file to denote color are listed. The investigator can then declare which terms he considers synonymous. In the second pass, the desired correlation is then obtained. From the investigator's point of view, this procedure has the advantage that it deals in English, and not in perhaps difficultly reconcilable coding conventions. This area is presently being investigated, but experience is still lacking.
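The two passes can be sketched as follows. Records are modeled here as dictionaries keyed by box number, an assumption of this sketch rather than the system's representation.

    def pass_one(file_a, file_b, box_a, box_b):
        # List the distinct terms each file uses in the given box, for the
        # investigator to inspect and group.
        return ({rec[box_a] for rec in file_a},
                {rec[box_b] for rec in file_b})

    def pass_two(file_a, file_b, box_a, box_b, synonyms):
        # synonyms: {'red': {'red', 'scarlet', 'crimson'}, ...} as declared
        # by the investigator; match records whose terms share a class.
        canon = {term: c for c, terms in synonyms.items() for term in terms}
        return [(ra, rb) for ra in file_a for rb in file_b
                if canon.get(ra[box_a]) is not None
                and canon.get(ra[box_a]) == canon.get(rb[box_b])]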
Precision of retrieval diminishes further where entries consist of texts. The program presently used has no text-searching capabilities. But even if these were available, precision of retrieval would regress to the vanishing point where sundry texts are lumped under captions such as "remarks", never to be retrieved except as adjunct to a response to an independently specified search. Not that they are without value: a referee pointed out that the most useful information in a patient's record is usually the hand-scribbled notation that doesn't fit the confines of the prescribed box. Here, indeed, we arrive at the limits of the system. But then, there are those who maintain that the degradation of information begins the very moment one's thoughts are couched into words!

Interrogation of the data base

For the sake of the present discussion, the accumulated contents of identical forms, stored on tape, represent a file; and a combination of these files, originating from different forms, constitutes a data base. To query such a data base, the prospective user must ascertain what information it may contain, and devise a strategy for extracting the information of interest to him. To this end, he may consult the blank input forms. These forms convey a considerable amount of information about the data originally entered. Forms "talk" about the information they contain, and the printed captions, directives and explanations that are present on a form are as pertinent for retrieval as they are for input. Thus, although the system does not code concepts such as temperature, an examination of the blank forms used will reveal where, and in what contexts, temperatures are recorded. Original entries are retrieved by citing the corresponding form and box numbers. Blank forms perform this service even after they have become obsolete and are no longer used for input. The information a form carries about the information it contains remains valid.

Should there be too many blank forms to make such an examination convenient, the captions can be listed in an index. A sample of such a caption-index is shown in Fig. 4. Such an index is different from the conventional indexes accessing text. Because of the nature of the generation of index terms used with the latter, the presence of a given index term can be almost accidental. With the described caption-index, each single term will obviously produce all the material contained in the specified box.

Figure 4 - Sample of an index referencing boxes on forms (in a reference such as 3/17, the first number is the form number, the second is the box number)

Although the terms of a caption index can be used in combination, there is a point beyond which these terms are unable to achieve further discrimination. This point is reached when it becomes necessary to discriminate within the confines of a single box. As was discussed earlier, the methods and the precision of retrieval vary from this point on. Even so, however, the preliminary sequestration achieved by means of accessing boxes on forms is valuable in its own right, and is obtained with comparative ease.

We come now to the actual techniques for querying the files or the data base. This might be done by coding parameters on cards, by devising a special language, etc. In the following, however, the use of forms for querying forms is described.

Forms for querying and altering the data base

Describing the use of forms for accessing files (albeit punched-card files), Postley8 reported that "operators and management can use these ... directly, without submitting their problems to programmers and without becoming programmers themselves"; furthermore, he mentioned that problem definition time is decreased. In Postley's system, a few standard forms describe all retrieval operations. Varying the parameters entered on these forms allows one to vary the format and type of output to be obtained.
The system described here, however, allows for the use of any number of query forms (or Q-forms); there are some for querying the file, for specifying output formats, even for correcting items on file, or for applying plausibility checks at input time. These forms can be used in combinations, and new forms, with additional capabilities, can be added. Perusal of existing forms, or of an index thereto, thus indicates not only what data is available, but also what procedures are available to reach it.

The Q-form shown in Fig. 5 is used to obtain listings. Initially, a rough layout of the expected output is sketched on quadrillated paper. The form is then filled in: the desired items of data are specified by their input-box numbers, and the desired locations by their coordinates. To left-adjust data, the beginning x-coordinate is specified; to right-adjust data, the terminal x-coordinate is specified. Centering is obtained by specifying both. A capability exists for entering headings, and the like. The form can be used in conjunction with other forms, such as the Q-form specifying a search (cf. Fig. 8), in which case it will control the format of the result of the search. The output shown in Fig. 6 corresponds to the specifications given in Fig. 5. In Fig. 7, 3x5 cards are produced from the same file by entering different parameters on the same Q-form.

Figure 5 - Form for the specification of printout format. The specifications entered here produced the output shown in Fig. 6

Figure 6 - Output format generated by using the form shown in Fig. 5

Figure 7 - Another output format generated by using the form in Fig. 5, differently filled-in, however

By means of the Q-form shown in Fig. 8, retrieval may be specified. If a single file is searched, the criterion for retrieval (or one of its criteria) is specified in the first box of part B on the form. If a correlation is desired, the last two boxes in part B are filled in instead. This form, although working only with numerical data at present, illustrates the wide scope to which such forms are adaptable.
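Returning to the listing Q-form of Fig. 5, its coordinate rule can be sketched in a few lines; place() and its parameters are my invention, purely an illustration of left-adjusting, right-adjusting and centering.

    def place(line, text, x_begin=None, x_end=None):
        # Write `text` into the character list `line`: a beginning
        # x-coordinate left-adjusts, a terminal x-coordinate right-adjusts,
        # and giving both centers the item between them.
        if x_begin is not None and x_end is not None:
            x_begin += (x_end - x_begin + 1 - len(text)) // 2
        elif x_end is not None:
            x_begin = x_end - len(text) + 1
        line[x_begin:x_begin + len(text)] = list(text)

    line = [' '] * 40
    place(line, 'NAME', x_begin=5)       # left-adjusted at column 5
    place(line, 'M.WT', x_end=30)        # right-adjusted, ending at column 30
    print(''.join(line))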
This form is normally processed with the 330 Spring Joint Computer Conference, 1968 • , .. 13 I:I::I 4. IF lillil( I:I=t ~~------------------------------- ·_-1 1 '''''. THAT IS PA~E I~ ~F I~ Ci~OiiiiNi ~Hie" A. I~ B. IT MUST 80X I~ ~N FiRM 1-----1 IT MUST C0NTAIN A TRU! NUMBER. C~NTAIN AT LEAST I~ DJ~JTS IIlR CHARACTERS. 8. 1----------} o/l~ THIS V41.,WE IS T'1IS C. IT \I~I..UE ~. THC:r-. Co),1P~E. E ITS VAL.UE MUST 9! T~E 'L.S" T'i:' FC,LL.JI ... I;"G 10 1"'15 IS F,I\" 1=1 ~F A CV'Md'llATI(';" tF i~I FiD!(I'1S ,~~. l~ CAN HAVE A FRACTI0hAI. VALU!. ~ CANNraT VALUE l _REATER SMALL.Eft THAN THE C0NTENTS 0' SIX I~ ,. E. MULTIPL.IED BY I. THt: SPE::IFICAT1P':. Gl~=':"1 r1"-~t C.oI'''STn",TE ~, t~ EQUAL. .,t:1 T~AT I~. 0' B0X CHECI( Ti! VAL.l OA TE ITS VALUE MUST I~. PLUS 0R MINUS I~ X 'IIiR14 Figure 8 - Form for retrieval specification Figure 10- Form for specifying plausibility checks I' _.IX N"'. IIATA. ~~ 1=1 ~E~I"S ~" Fit"'" r-.wI. ~ITH THE 1=1 F~I.L~ftlN~ Ct .... 141N& T"IE F"':'I...,,,I .. iO CHAR.CTE~S. .---~ "Nil "'N IF. FJ~TP'lER"ZRE. THIs SAME OATA. ~~ F~RM 8E~I~S 9"'X :II~. C~NTAINS ~ITP'I THE ~lfH THE TP'lE F0LLZ~IN~ F~I.Lo/lwINiO CH4RACTE~S. --_J 1:=:1 IMAW~ I-I F~RI'1 ENII~E.LY C~NTENl S o/JF illllX ,~IIl. TP'lEN DEI.ETE THIS .oR -tEPL.4CE 1=--1 TH~ x F~R YES) F~I..I."'''IN~ I I:! ','c' To """TE F"" Figure 9 - Form for specifying correction of file input. The checks performed by this form might perhaps be more thorough9 ; it is shown here chiefly to illustrate the variety of tasks that may be performed simply by filling-in a form. Prospects A new generation of computers is emerging, which provicles greater computing capabilities and vastly expanded memories; these computers are therefore much less in need of limitations on the input. Consequently, there is increasing talk of generalized data bases, with capabilities that go beyond the processing of bank accounts or of airline reservations - data bases which could provide answers to requests both varied and unforeseen . For many t):'pes of input,' the old, tried and proven methods will, no doubt, continue to persist. But other types of input, or other modes of use, will virtually insist upon new approaches. Through the use of twodimensional encoding typewriters, which are now becoming available, the power of organization which forms can provide can be fully exploited. The versatility of this combination, the modified typewriters and the forms, will bring cioser the day of processing systems that are truly machine independent. 
REFERENCES

1 J J BARTOSIEWICZ The manifold business forms industry - an economic review BDSA Quarterly Industry Report 5:11 1964
2 A C PATTERSON The questionnaire as a means of educational research I: The extent and reliability of questionnaire investigation Scott Educ J 25:683 1942
3 A FELDMAN D B HOLLAND D P JACOBUS The automatic encoding of chemical structures J Chem Document 3:187 1963
4 The encoding process, as described here, is covered by U S Patent 3,358,804. Other patent(s) are pending
5 D HEARNE Tape operated writing machines Automatic Data Proc Feb 1962 pp 32-37
6 D B HOLLAND D P JACOBUS A FELDMAN B R MOBERLY The coordinate concept - an approach to tape punching Control Eng 11:60 1964
7 M KLERER J MAY Two-dimensional programming Proc FJCC 27:63 1965
8 J A POSTLEY File management application programs DPMA Quarterly 2:20 1966
9 R J FREUND H O HARTLEY A procedure for automatic data editing J Amer Statist Assoc 62:341 1967
10 I am indebted to Dr David P Jacobus and to Daniel J Minnick, of this Institute, and to Robert R Puttcamp, of the Harry Diamond Laboratory, for helpful discussions and suggestions.

Machine-to-man communication by speech
Part 1: Generation of segmental phonemes from text

by FRANCIS F. LEE
Research Laboratory of Electronics, Massachusetts Institute of Technology
Cambridge, Massachusetts

For many years man has been receiving messages from machines in printed form. Teletypes, computer console typewriters, high-speed printers and, more recently, character-display oscilloscopes have become familiar in the role that they play in machine-to-man communication. Since most computers are now capable of receiving instructions from remote locations through ordinary telephone lines, it is natural that we ask whether, with all of the sophistication that we have acquired in computer usage, we can communicate with the computer in normal speech. On the input side of the computer there is the automatic speech recognition problem, and at the output, the problem of speech synthesis from messages in text form.

The problem of automatic speech recognition is substantially more difficult than the speech synthesis problem. While an automatic speech recognizer capable of recognizing connected speech from many individual speakers with essentially no restriction on the vocabulary is many years away, the generation of connected speech from text with similar restrictions on vocabulary is now well within our reach. With the touch-tone telephone, people can call a computer and, after the initial connection has been made, use the calling buttons as an input keyboard to communicate unambiguously with the computer, thereby temporarily bypassing the very difficult problem of speech recognition.1 With computer-generated speech, we can foresee the use of touch-tone telephone sets as the only remote terminal device for large and complex information retrieval systems. In this paper and in the companion one by Jonathan Allen, we shall discuss the problems encountered in the computer generation of connected speech from a text source, and our solutions.

It should be clear that with the addition of a printed-character-recognition unit, the system is useful as a reading machine for the blind. It was originally with this purpose that our research into computer generation of connected speech from a text source began.2
We should like to point out that while it may be acceptable to call out stored audio signals corresponding to words or phrases in order to form short messages, such as with the automatic time information service over telephone or the currently available computer audio response systems, it is not possible to extend the method to handle even a vocabulary that might be encountered in reading a second-grade level book. A generative scheme, in which a reasonably small amount of stored data are used, is both desirable and necessary.

The elements of speech sound

We use only twenty-six letters of the Roman alphabet and a few additional symbols to express our language in text form. These letters and additional symbols are called "graphemes." A grapheme is the smallest unit of construction of this written language. The smallest segment of speech sound on the level of production and perception is called a "phoneme." The common element between the written words [cat] and [pack] is the grapheme [a], and in sound the common element between the spoken words [great] and [raise] is the phoneme /e/.*

*Phonemes are shown between slashes, and graphemes are shown between square brackets.

Speech synthesis by rule

In recent years, several workers3,4,5,6 have demonstrated that it is possible to synthesize continuous speech from phoneme specification. In order to regulate the intonation and rhythm, it is necessary to provide additional specifications beyond the segmental phoneme level. The generation of the segmental phonemes and the suprasegmental or prosodic features from printed text has been the main concern of researchers in the speech synthesis field.

A text-to-speech conversion system

[Figure 1 - A text-to-speech conversion system]

The approach which was taken here in converting text information to speech is shown in Fig. 1. The entire process is divided into 5 modules. The first module accepts graphemes as input, and translates each written word into its phonemic representation with the proper stress markers. As a result of the translation process, the obtained parts-of-speech information is further condensed, with some ambiguities removed, by the parts-of-speech pre-processor module. The resultant parts-of-speech information and the phoneme strings of the sentence are then used as input to the phrase-analyzer module. For multiple phonemic representations of the same word, this analyzer module makes a selection based on syntactical considerations. The phrase-analyzer module places phrase boundary markers and in the sentence environment re-evaluates the stresses obtained in the grapheme-to-phoneme translator module. The speech-synthesizer control signal generator module converts the phonemic specifications, stress information, and phrase markers into the speech-synthesizer control signals. The speech-synthesizer module is a hardware unit generating the actual speech output. Since our speech synthesizer and its control signal generation is similar to that reported by Holmes, Mattingly, and Shearme,5 we shall restrict our present report to the first three modules in Fig. 1, namely, the grapheme-to-phoneme translator, the parts-of-speech preprocessor, and the phrase-analyzer.
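To make the module boundaries concrete, the following sketch expresses the five-module dataflow in present-day Python. Every name and stub body is a hypothetical stand-in (the actual modules were programs and a hardware synthesizer); only the order and the interfaces of the stages follow the description above.

```python
# A minimal, illustrative sketch of the five-module pipeline; the stub
# bodies carry no real linguistics and exist only to show the dataflow.

def grapheme_to_phoneme(word):
    # Module 1 (stubbed): phonemic representation(s) with stress markers,
    # plus parts-of-speech information obtained during translation.
    return {"word": word, "phonemes": [], "stress": [], "pos": {"noun", "verb"}}

def pos_preprocess(words):
    # Module 2 (stubbed): would condense the parts-of-speech information,
    # removing some ambiguities; passed through unchanged here.
    return words

def phrase_analyze(words):
    # Module 3 (stubbed): would select among multiple phonemic
    # representations, place phrase boundary markers, and re-evaluate
    # stress in the sentence environment.
    return {"words": words, "boundaries": [len(words)]}

def control_signals(analysis):
    # Module 4 (stubbed): phonemes + stress + phrase markers are turned
    # into synthesizer control signals.
    return ["<control frames>"]

def synthesize(controls):
    # Module 5: the hardware terminal-analog synthesizer (not modeled).
    return "<speech>"

def text_to_speech(sentence):
    words = [grapheme_to_phoneme(w) for w in sentence.split()]
    return synthesize(control_signals(phrase_analyze(pos_preprocess(words))))

print(text_to_speech("they refuse to leave"))
```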
The rest of this paper deals with the problem of grapheme-to-phoneme translation.

The phonic method

The phonic approach to the grapheme-to-phoneme translation problem was taken first by Higginbottom,7 in 1962, and later by Bhivani, Dolby, and Resnikoff,8 Weir and Venezky,9 and Monroe.10 They maintained that it is possible to derive a letter-to-phoneme translation through the use of letter context. This phonic method has been used by First and Second Grade teachers in the United States in the teaching of English pronunciation. After a detailed and exhaustive study of the phonic method, it became clear to us that while the method works reasonably well with children, it is totally unacceptable for processing by machine. This should not be surprising, because a child, six or seven years old, can obtain cues from many sources, such as illustrations in a book or knowledge of the subject before his first encounter with the written words.

In analyzing words for phonic rules, the paradigm forms of the basic words, i.e., the regularly inflected forms, are not usually considered, since it is thought that they can be readily derived. To a machine, however, we must be more precise. For example, for the two words [invited] and [profited], we must prevent the machine from mistaking them as /invitid/ and /profaitid/. If we consider [ed] as the paradigm suffix, it is possible to look up in a list to see whether [invit] is present or not. If it is not, the mute [e] may be appended. Clearly, such a list would be very long. We must also have an exception list for words ending in [ed] which must not be separated, such as with the word [quadruped], which must not be rendered [quadrupe] + [ed]. Similar problems arise with other suffixes beginning with a vocalic sound.

The second problem with the phonic method is the difficulty in handling compound words. Compound words abound in English. When a mute [e] occurs in a nonfinal position, there is no simple way of ascertaining that it should be mute other than having a list of all words ending in mute [e] and each time performing an exhaustive matching operation. We would also need an exception list to handle those words ending in a nonmute [e], for example, words like [apostrophe, catastrophe, acme, recipe], etc.

With the phonic method there would be no simple way of determining the location of word stresses. While parts-of-speech information can be used to assist in the determination of stress location, the determination of parts-of-speech from spelling and context, as done by Klein and Simmons,11 involves the use of more exception lists. To resolve ambiguities such as [refuse] (verb) and [refuse] (noun), a syntactic analysis must be performed by using a larger context. Again, the parts-of-speech information is needed.

Translation through the use of a morph dictionary

While the smallest segment of speech on the level of production and perception is a phoneme, the smallest unit of speech that has meaning in a given language is called a "morpheme." A morpheme is composed of one or more phonemes. For example, the spoken word for "cat," represented phonemically as /kaet/, is a morpheme consisting of three phonemes /k/, /ae/, and /t/. Clearly, by themselves, the phonemes /k/, /ae/, or /t/ have no meaning, but /kaet/ does. While the morpheme /kaet/ can exist alone or be combined as in /kaets/, [cats], some morphemes can exist only as adjuncts to other morphemes. They are called "bound morphemes." Bound morphemes serve to impart some quality to the combined form, as the /s/ in /kaets/, which has the meaning of the plural form of the noun. When a morpheme is not a bound morpheme we call it a free morpheme.
The /kaet/ in [cats], for example, is a free morpheme. Since a word can exist in both printed and spoken form, to differentiate between the two forms of a given word we shall call the spoken form the "p-word" and the written form the "g-word."* We see that a p-word is either a free morpheme or a free morpheme combined with a bound morpheme. We are all familiar with a rather free occurrence of the compound formation of English p-words, such as those corresponding to the g-words [greenhouse, whitestone], etc. There are those p-words with a collection of prefixes and suffixes, such as those p-words for [prehistorically, loveliest], etc. A more general and recursive definition of a p-word is: "A p-word is either a free morpheme, or a free morpheme combined with a bound morpheme, or a free morpheme combined with a p-word, or a bound morpheme combined with a p-word." While this definition has to be supported by additional rules of combination in order to be useful as a generative rule, it is quite suitable for the purpose of identifying a p-word in terms of its constituent morphemes.

*p- stands for phonemic and g- stands for graphemic.

In a majority of cases, the formation of a p-word that contains more than one morpheme amounts to little more than a mere concatenation of the constituents, such as those p-words corresponding to [whitestone, greenhouse], etc. There are, however, interesting cases in which the p-words involve a transformation in the combined form. For example, /spEsafai/ + /ik/ gives /spisifik/, [specific], and /gaElaksi/ + /ik/ gives /gala'ektik/, [galactic]. The rules governing these changes are morphophonemic rules.

Morphophonemic rules may depend upon the parts-of-speech classification of the resultant p-word. They may also depend upon phonological considerations. The three representations of the bound morpheme /s/ corresponding to the plural forms of nouns are an example of phonological dependence. We have /kaet/ + /s/ → /kaets/, /dOg/ + /s/ → /dOgz/, and /bOks/ + /s/ → /bOksiz/. The morphophonemic rule in this case may be described as: /s/ is changed to /iz/ if the last phoneme in the preceding morpheme is among the set /s/, /S/, /dZ/, /z/, /tS/; it is changed into /z/ if the preceding morpheme ends in a voiced phoneme; and it remains as /s/ otherwise.
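Stated as code, the plural rule above becomes a three-way branch on the final phoneme of the preceding morpheme. The sketch below uses ASCII stand-ins for the phonemic symbols, and the set of voiced phonemes is an illustrative assumption rather than the system's actual table.

```python
# Morphophonemic rule for the plural morpheme /s/, as stated above:
# /s/ -> /iz/ after a sibilant, /z/ after any other voiced phoneme,
# /s/ otherwise.

SIBILANTS = {"s", "z", "S", "Z", "tS", "dZ"}        # /s z sh zh ch j/
VOICED = set("aeiou") | {"b", "d", "g", "v", "m",
                         "n", "l", "r", "w", "j"}   # illustrative set

def add_plural(phonemes):
    """Append the plural allomorph to a list of phoneme symbols."""
    last = phonemes[-1]
    if last in SIBILANTS:
        return phonemes + ["i", "z"]
    if last in VOICED:
        return phonemes + ["z"]
    return phonemes + ["s"]

print(add_plural(["k", "ae", "t"]))        # cat -> ... "s"
print(add_plural(["d", "o", "g"]))         # dog -> ... "z"
print(add_plural(["b", "o", "k", "s"]))    # box -> ... "i", "z"
```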
If we now turn our attention to the written form of words, we can find a parallelism in that a g-word, except when it is a root word, can be broken down into smaller elements. Let us call the smallest meaningful units in written form morphs. We can define free morph and bound morph similarly as we did for free and bound morphemes. A general and recursive definition of a g-word may be stated: "A g-word is either a free morph, or a free morph combined with a bound morph, or a free morph combined with a g-word, or a bound morph combined with a g-word." In English, the formation of a g-word containing more than one morph is often the simple concatenation of the individual morphs. There are several very commonly encountered variations requiring adjustment at the junctions. These adjustments can be stated in the form of general rules which we shall call morphographemic rules. We are all familiar with the morphographemic rule which doubles the final consonant letter after a stressed vowel in situations like [capping, equipped], etc.

The problem of grapheme-to-phoneme translation is the problem of g-word to p-word mapping. Since a direct individual grapheme-phoneme relationship cannot be established except in very simple cases, a general treatment requires us to seek the correspondence at a deeper level. While the relationship between a g-word and a p-word is very complex, the relationship between a morph and a morpheme is almost one-to-one, barring homographs and homophones. To map a g-word into a p-word, it is only necessary to decompose the g-word into its constituent morphs by using the morphographemic rules in reverse, mapping the morphs into the corresponding morphemes, and applying the proper morphophonemic rules to obtain the p-word. The procedure is illustrated in Fig. 2.

[Figure 2 - G-word to P-word mapping: the g-word is decomposed into morphs by the morphographemic rules applied in reverse; the dictionary maps each morph to a morpheme; the morphophonemic rules then combine the morphemes into the p-word]

We shall now proceed to identify the morphographemic rules of English, discuss the morph-to-morpheme dictionary, and the method for decomposing a g-word into its constituent morphs.

Identification of morphographemic rules of English

In order to describe the morphographemic rules of English, it is necessary to recognize certain properties of the morphs. We shall use the following subscripts to denote classes which possess certain properties:

a: A free morph that never combines with others but always stands alone, such as [me].
r: A free morph that may combine with others, such as [house].
v: A bound vocalic suffix morph that has a vocalic beginning, such as [able] in [readable].
c: A bound consonantal suffix morph that does not have a vocalic beginning, such as [ness] in [kindness].
t: A bound terminal suffix morph that does not permit any other suffix morphs to be added to its right. The bound morph [s] is both terminal as well as consonantal, and the bound morph [es] is both terminal as well as vocalic.
p: A prefix bound morph, such as [un] in [unknown].
d: A morph that must have its final letter doubled when followed by a morph in class "v," such as [hit].
o: A morph that may optionally have its final letter doubled when followed by a morph in class "v," such as [model].

Let α, β denote sequences of letters, αβ denote the concatenation of α and β, C denote a single consonantal letter that is preceded by a vowel letter, and T denote the final representation of the g-word. We can list the most often used morphographemic rules of English:

1. (α)x → T; x ∈ {a, r, d, o} (free-morph rule)
2. (α)x(β)y → (αβ)y; x, y ∈ {r, d, o} (compound word rule)
3. (α)p(β)x → (αβ)x; x ∈ {r, d, o} (prefix rule)
4. (α)r,o(β)v,c → (αβ)r (general suffix rule)
5. (αC)d,o(β)v → (αCCβ)r (doubling final consonant letter rule)
6. (α)x(β)t → (αβ)a (terminal-suffix rule)
7. (α'e')r(β)v → (αβ)r (mute final-e rule)
8. (α'y')r(β)x → (α'i'β)x; x ∈ {r, d, o, v, c} (final-y rule)

These rules are to be interpreted in the following manner:

1. Free Morph Rule: Any letter sequence in class "a," or "r," or "d," or "o" may appear as a final form in a text. Since subscript "a" occurs on the left-hand side of only this rule, a letter sequence in class "a" can only appear unchanged.
2. Compound Word Rule: This is the general rule for the formation of nonhyphenated compounds. For example, [snow] + [flake] → [snowflake]. The compound belongs to the same class as the second element, so [over] + [bid]d → [overbid]d; thus, rule 5 can be applied to give [overbid]d + [ing]v → [overbidding]r.
3. Prefix Rule: This rule constrains the prefix to be attached only to the left, but it permits multiple prefixes. For example, [un] + [do] → [undo], and [re] + [undo] → [reundo].
4. General Suffix Rule: This is the general case of adding a suffix to the right of a letter sequence. For example, [reason]r + [able]v → [reasonable]r, and [kind]r + [ness]c → [kindness]r.
5. Doubling Final Consonant Letter Rule: This rule gives us [bid]d + [ing]v → [bidding]r as well as [model]o + [ing]v → [modelling]r. Rule 4 permits the alternative construction in the second case as [model]o + [ing]v → [modeling]r.
6. Terminal Suffix Rule: This rule forbids the further appendage of suffixes after a terminal suffix has been used.
7. Mute Final-e Rule: This rule drops the final mute e of a morph when a vocalic suffix is appended. For example, [move]r + [ing]v → [moving]r. The construction [mile]r + [age]v → [mileage]r is permitted under rule 4.
8. Final-y Rule: This rule changes the final letter y of the leading constituent into "i" when a compound is formed or a suffix is appended. For example, [hand]r + [y] → [handy]r according to rule 4, [handy]r + [work]r → [handiwork]r according to rule 8, and [dry]r + [er]v → [drier]r according to rule 8. The alternative construction for [dryer]r is permitted under rule 4.

It must be emphasized here that while these rules are insufficient for the purpose of generating only legitimate g-words, well-formed English g-words encounter no difficulty in being decomposed into their constituent morphs. The decomposition of g-words into their constituent morphs is parsing on the word level. There are many more morphographemic rules in English, but their utilization factors are very low. In the interest of efficiency in processing, only those listed here are actually implemented. As we shall see, by restricting the implementation to only these listed rules there is no sacrifice in the quality of translation if we can be somewhat liberal in the choosing of morphs for the dictionary.

Since the morphographemic rules indicate that when two morphs combine, the changes in the spelling occur only to the left morph, we have decided to scan a printed word from right to left during the decomposition process. The morphs in the dictionary, for this reason, are listed in reverse spelling order. The actual search is performed with an indexed sequential technique. The decomposition procedure is coded as a call to a recursive subroutine, which repeatedly calls upon itself until the g-word is successfully decomposed or it has been determined that no decomposition is possible. Figure 3 illustrates the various stages of decomposition of the g-word [grasshoppers]. It should be pointed out that the collating sequence of the letters and morph terminating symbols is such that the morph [shop]d is encountered before [hop]d.

Given G-word: grasshoppers
First level match: s (class c) - Remainder: grasshopper
Second level match: er (class v) - Remainder: grasshopp - Modified to: grasshop
Third level match: shop (class d) - Remainder: gras
Fourth level match: as (class a) - rejected on basis of subscript a
No further fourth level match possible; return to third level: grasshop
Third level match: hop (class d) - Remainder: grass
Fourth level match: grass (class r) - Remainder: (null)
Complete decomposition: (grass)r + (hop)d + (er)v + (s)c

Figure 3 - Example of a G-word decomposition
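The decomposition procedure can be sketched as a recursive search that peels candidate morphs off the right end of the word, trying the left-remainder variants implied by running rules 5, 7, and 8 in reverse. The six-morph dictionary below is only enough to reproduce the [grasshoppers] trace of Fig. 3; the system's reverse-spelled, indexed-sequential dictionary search is replaced here by ordinary lookups, so everything except the rule logic is an assumption.

```python
MORPHS = {"grass": "r", "hop": "d", "shop": "d",   # morph -> subscript class
          "er": "v", "s": "c", "as": "a"}

def variants(stem, tail_class):
    """Left remainders implied by the morphographemic rules in reverse."""
    yield stem                                   # plain concatenation
    if tail_class == "v":
        if len(stem) > 1 and stem[-1] == stem[-2]:
            yield stem[:-1]                      # undo doubling (rule 5)
        yield stem + "e"                         # restore mute e (rule 7)
    if stem.endswith("i"):
        yield stem[:-1] + "y"                    # undo final-y change (rule 8)

def decompose(word, final=True):
    """Return a list of (morph, class) pairs, or None."""
    cls = MORPHS.get(word)
    if cls in ("r", "d", "o") or (cls == "a" and final):
        return [(word, cls)]                     # a complete free morph
    for k in range(1, len(word)):                # longest right match first
        tail, tcls = word[k:], MORPHS.get(word[k:])
        if tcls is None or tcls == "a":          # class "a" never combines
            continue
        if tcls == "t" and not final:            # terminal suffixes only at
            continue                             # the extreme right
        for stem in variants(word[:k], tcls):
            left = decompose(stem, final=False)
            if left:
                return left + [(tail, tcls)]
    return None

print(decompose("grasshoppers"))
# [('grass', 'r'), ('hop', 'd'), ('er', 'v'), ('s', 'c')]
```

Because the loop tries the longest tail first, [shop]d is matched (and its remainder [gras] rejected) before [hop]d succeeds, mirroring the backtracking shown in Fig. 3.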
The morph-to-morpheme dictionary

Webster's New Collegiate Dictionary (7th Edition) contains approximately 100,000 entry words. The inclusion of all paradigmatic forms not listed in the dictionary would probably double the count. Furthermore, with the freedom existing in the English language in forming compound words, it is difficult to put a theoretical upper bound on the total number of all possible words. By choosing to construct a dictionary on the basis of morph entries instead of g-word entries, the dictionary size can be brought down to a more manageable level. Our estimate is that with a dictionary containing 32,000 morphs, all entry g-words in Webster's Dictionary, plus all their paradigms and all reasonable compounds, can be adequately decomposed. The storage for this dictionary is read-only, and is estimated to be on the order of 4,000,000 bits. For the reading machine project, we have been operating with a 3000-morph dictionary corresponding roughly to a Fourth Grade level.

An entry in the dictionary consists of four parts: the morph, which acts as the search and compare key in the decomposition process; the subscript marker, which directs the decomposition process; the morpheme, which is the target translation; and the parts-of-speech information. For those cases in which a morph leads to multiple morphemes, such as [refuse] (verb) and [refuse] (noun), the morpheme field and parts-of-speech field are repeated for each different morpheme.

Selection of morphs for the dictionary

We have not established the criteria for the selection of morphs for storing in the dictionary. Let us start by proposing a set of simple criteria and examine them in some detail.

1. All base words are morphs. A base word is not a compound word, and contains no prefix or suffix of any kind. For example, [bake] is a morph.
2. All prefixes and suffixes that do not reflect changes in the pronunciation of the base words to which they may be attached are morphs. For example, [un, ing] are morphs.

From a linguistic point of view, all suffixes, regardless of whether they meet the second criterion, ought to be treated as morphs. From an engineering point of view, the given restriction makes it unnecessary to process complex morphophonemic rules. Between those suffixes that always change the pronunciation of the g-word remainder, such as the suffix [ity], and those that never do, such as the suffix [ing], there are many that affect it to intermediate degrees. For example, the suffix [able] does not usually affect the pronunciation of the base word, as in [attainable, removable, changeable], etc., but it does affect some base words, as in [inflammable, applicable], etc. In other cases the spelling of the base word itself is changed, as in [tolerable, navigable], etc. In our approach, we choose to include as morphs such partially mutating suffixes, as well as those changed g-words. For example, we list [able] as a morph to achieve economy and list words like [inflammable, applicable, tolerable, navigable] as morphs, too. Briefly, whenever exceptions to a general rule have to be made, we can avoid rule complication by the creation of new morph entries.
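A dictionary entry of this shape might be modeled as follows. The reverse-spelled key reflects the right-to-left scan, and the repeated morpheme and parts-of-speech fields handle homographic morphs such as [refuse]; the phonemic spellings shown are illustrative stand-ins, not the system's actual codes.

```python
from dataclasses import dataclass

@dataclass
class MorphEntry:
    key: str          # the morph, reverse-spelled for right-to-left search
    subscript: str    # combining class: a, r, v, c, t, p, d, or o
    senses: list      # one (morpheme, parts-of-speech) pair per homograph

refuse = MorphEntry(
    key="refuse"[::-1],                     # stored as "esufer"
    subscript="r",
    senses=[("riFYUWZ", {"verb"}),          # illustrative phonemic codes
            ("REfyuws", {"noun"})],
)

# Sorting entries on the reverse-spelled key lets an indexed sequential
# search advance letter by letter while the printed word is scanned from
# right to left.
print(refuse.key, refuse.senses[0][1])
```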
Performance of the translator and the resultant synthetic speech without phrase analysis

Although the morph dictionary currently in use is derived from the cumulative word list of a fourth grade reader,* the eight morphographemic rules have been tested with representative samples from Webster's 7th New Collegiate Dictionary. The vocabulary our system is capable of accepting is considerably larger than the reader from which the dictionary was derived. We have not yet made any provision to provide an approximate translation when the decomposition process fails. It seems that with a 32,000-morph dictionary, such occurrences should be very rare, except for proper names and misspellings, which may be handled in terms of letter-by-letter spelling.

*Roads to Everywhere, published by Ginn and Company, 1964.

After the grapheme-to-phoneme translator was implemented, we connected the output directly to the speech synthesizer control signal generator module, bypassing the phrase-analyzer. The parameters transmitted to the speech synthesizer control signal generator included only the segmental phonemes, primary and secondary stress markings on the morphemes, and an indication of whether the sentence ends in a period or a question mark. We were quite encouraged by what we heard at the speech synthesizer output. We realized that for improved comprehension over longer utterances, additional processing of the data is essential. This additional processing is represented by the work on the phrase-analyzer, which is presented in Part II of this paper.

As an added remark, it should be pointed out that the morph decomposition idea can be used to greatly reduce the size of the dictionary needed by mechanical translation of natural languages.

ACKNOWLEDGMENT

This work was supported principally by the National Institutes of Health (Grant 1 PO1 GM-14940-01), and in part by the Joint Services Electronics Program [Contract DA 28-043-AMC-02536(E)].

REFERENCES

1 G C PATTON
Touch-tone input and audio reply for on-line calculation
Bell Telephone Laboratories Internal Memorandum March 29 1967

2 D E TROXEL F F LEE S J MASON
A reading machine for the blind
Digest of the 7th International Conference on Medical and Biological Engineering 1967 Stockholm

3 J L KELLY C LOCHBAUM
Speech synthesis
Proceedings of the Speech Communication Seminar 1962 Stockholm

4 L J GERSTMAN J L KELLY
An artificial talker driven from a phonetic input
Journal of the Acoustical Society of America vol 33 1961 p 835

5 J N HOLMES I G MATTINGLY J N SHEARME
Speech synthesis by rule
Language and Speech vol 7 part 3 July-September 1964 pp 127-143

6 I G MATTINGLY
Synthesis by rule of prosodic features
Language and Speech vol 9 part 1 January-March 1966 pp 1-13

7 E M HIGGINBOTTOM
A study of the representation of English vowel phonemes in the orthography
Language and Speech April-June 1962 pp 67-117

8 B V BHIVANI J L DOLBY H L RESNIKOFF
Acoustic phonetic transcription of written English
Paper delivered at the 68th meeting of the Acoustical Society of America October 21 1964

9 R H WEIR R L VENEZKY
Formulation of grapheme-phoneme correspondence rules to aid in the teaching of reading
Cooperative Research Project No S-139 Report Stanford Univ 1964

10 G MONROE
Phonemic transcription of graphic post-base suffixes in English: a computer problem
PhD Thesis Brown Univ June 1965

11 S KLEIN R F SIMMONS
A computational approach to grammatical coding of English words
Journal of the Association for Computing Machinery vol 10 no 3 July 1963 pp 334-347

Machine-to-man communication by speech
Part II: Synthesis of prosodic features of speech by rule

by JONATHAN ALLEN
Research Laboratory of Electronics, Massachusetts Institute of Technology
Cambridge, Massachusetts

For several years, research has gone on in an attempt to develop a reading machine for the blind.
Such a machine must be able to scan letters on a normal printed page, then recognize the scanned letters and punctuation, and finally convert the resultant character strings into an encoded form that may be perceived by some nonvisual sensory modality. Within recent years, at the Massachusetts Institute of Technology, an opaque scanner has been developed,1 and an algorithm for recognizing scanned letters has been devised.2 The output display can take many forms, but the form that we feel is best suited for acceptably high reading speeds and intelligibility is synthesized speech.

Effort has recently been focused on the conversion of orthographic letter strings to synthesized speech. An algorithm for grapheme-to-phoneme conversion (letter representation to sound representation) has been invented by Lee,3 which is capable of specifying sufficient phonemic information to a terminal analog speech synthesizer for translation to synthesizer commands. The algorithm uses a dictionary to store the constituent morphs of English words, together with their phonemic representation. Hence each scanned word is transformed into a concatenated string of phonemic symbols that are then interpreted by the synthesizer. The resulting speech is usually intelligible, but not suitable for long-term use.

Several problems remain, apart from those concerned directly with speech synthesis by rule from phonemic specifications. First, many words can be nouns or verbs, depending on context [refuse, incline, survey], and proper stress cannot be specified until the intended syntactic form class is known. Second, punctuation and phrase boundaries may be used to specify pauses that help to make the complete sentence understandable. Third, more complicated stress contours over phrases can be specified which facilitate sentence perception. Finally, intonation contours, or "tunes," are important for designating statements, questions, exclamations, and continuing or terminal juncture. These features (stress, intonation, and pauses) comprise the main prosodic or suprasegmental features of speech.

Several experiments4,5,6 have shown that we tend to perceive sentences in chunks or phrasal units, and that the grammatical structure of these phrases is important for the correct perception of the sentence. In order to display this required structure to a listener, a speaker makes use of many redundant devices, among them the prosodic features, to convey the syntactic surface structure. When speech is being synthesized in an imperfect way at the phonemic level, the addition of these additional features can be used by listeners to compensate for the lack of other information. The listener may then use these cues to hypothesize the syntactic structure, and hence generate his own phonetic "shape" of the perceived sentence. There is little reason to believe that the perceived stress contour, for example, must represent some continuing physical property of the utterance, since the listener uses some form of internalized rules to "hear" the stress contour, whether or not it is physically present in a clear way. Hence, once the syntactic surface structure can be determined, the "stress" can be heard. Alternatively, prosodic features can be used in a limited fashion to help point out the surface structure, which is then used in the perception of the phonetic shape of the sentence.
The present paper describes a procedure for parsing sentences composed of words that are in turn derived from the morphs provided by the grapheme-to-phoneme decomposition, as well as a phonological procedure for specifying prosodic features over the revealed phrases. As we have indicated, only a limited amount of the sentence is parsed and provided with prosodics, since the listener will "hear" the entire sentence once the structure is clear. We consider first the required parts-of-speech preprocessor, then the parser, and finally the phonological algorithm.

Parts-of-speech preprocessor

After the grapheme-to-phoneme conversion is complete, many words will have been decomposed into their constituent morphs. For example, [grasshopper] → [grass] + [hop] + [er], and [browbeat] → [brow] + [beat]. Each of these morphs corresponds to a dictionary entry that contains, in addition to phonemic specifications, parts-of-speech information. In the case of morphs that can exist alone ([grass, hop, brow], etc.) this information consists in a set of parts of speech for that word, called the grammatical homographs of the word, and this set often has more than one homograph. For prefixes and suffixes ([re-, -s, -er, -ness], etc.), information is given indicating the resultant part of speech when the prefix or suffix is concatenated with a root morph. Thus [-ness] always forms a noun, as in [goodness] and [madness].

Other researchers7,8 have used a computational dictionary to compute parts of speech, relying on the prevalence of function words (determiners, prepositions, conjunctions, and auxiliaries), together with suffix rules of the type just described and their accompanying exception lists. This procedure, of course, keeps the lexicon small, but results in arbitrary parts-of-speech classification when the word is not a function word and does not have a recognizable suffix. Furthermore, ambiguous suffixes such as [-s] (implying plural noun or singular verb) carry over their ambiguity to the entire word, whereas if the root word has a unique part of speech, like [cat], our procedure gives a unique result: [cats] (plural noun). Hence the presence of the morph lexicon can often be used to advantage, especially in the prevalent noun/verb ambiguities.

The parts-of-speech algorithm considers each morph of the word and its relation with its left neighbor, starting from the right end of the word. If there are two or more suffixes [commendables, topicality], the suffixes are entered into a last-in first-out push-down stack. Then the top suffix is joined to the root morph, and the additional suffixes are concatenated until the stack is empty. Compounding is done next, and finally any prefixes are attached. Prefixes generally do not affect the part of speech of the root morph, but [em-, en-,] and [be-] all change the part of speech to verb.

Compounds can occur in English in any of three ways, and there appears to be no reliable method for distinguishing these classes. There can, of course, be two separate words, as in [bus stop], or two words hyphenated, as in [hand-cuff], or finally, two root words concatenated directly, as in [sandpaper]. The parts-of-speech algorithm treats the last two cases, leaving the two-word case for the parser to handle. The algorithm ignores the presence of a hyphen, except that it "remembers" that the hyphen occurred, and then processes hyphenated and one-word compounds as though they were both single words. The parts of speech of the two elements of the compound are considered as row and column entries to a matrix whose cells yield the resulting part of speech. Thus Adverb · Noun → Noun ([underworld]). In general, since each element may have several parts of speech, the matrix is entered for each possible combination, but the maximum number of resulting parts of speech is three. Combinations of suffixes with compounds ([handwriting]) can be accommodated, as well as one-word compounds containing more than two morphs. The algorithm has a special routine to handle troublesome suffixes such as [-er, -es, -s], in an attempt to reduce the resulting number of parts of speech to a minimum. In this way, the algorithm makes use of the parts-of-speech information of the individual morphs to compute the parts-of-speech set for the word formed by these morphs. These sets then serve as input to the parser, after having first been ordered to suit the principles of the parser.
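The suffix stack and the compound matrix can be sketched as below. The part-of-speech tables are tiny illustrative stand-ins for the system's tables; only the control flow (pop the innermost suffix first, then compound, then prefixes) follows the description above.

```python
SUFFIX_POS = {"able": "adjective", "ness": "noun",
              "ity": "noun", "er": "noun"}          # toy suffix table
COMPOUND = {("adverb", "noun"): "noun",             # Adverb.Noun -> Noun
            ("noun", "noun"): "noun"}               # e.g. [sandpaper]
VERB_PREFIXES = {"em", "en", "be"}                  # these force a verb

def apply_suffix(pos, suffix):
    if suffix == "s":           # [-s] stays ambiguous unless the stem is not
        return {"noun": "plural noun", "verb": "singular verb"}.get(
            pos, "noun|verb")
    return SUFFIX_POS[suffix]

def word_pos(root_pos, suffixes, prefixes=()):
    """suffixes in right-to-left scan order, e.g. ['s', 'able']."""
    stack = list(suffixes)               # a last-in first-out push-down stack
    pos = root_pos
    while stack:
        pos = apply_suffix(pos, stack.pop())   # top suffix joins the root
    if any(p in VERB_PREFIXES for p in prefixes):
        pos = "verb"                     # other prefixes leave pos unchanged
    return pos

print(word_pos("noun", ["s"]))               # [cat]+[s] -> plural noun
print(word_pos("verb", ["s", "able"]))       # [commendable]+[s] -> noun|verb
print(COMPOUND[("adverb", "noun")])          # [underworld] -> noun
```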
The parts of speech of the two elements ,of the compound are considered as row and column entries to a matrix whose cells yield the resulting part of speech. Thus Adverb·Noun ~ Noun ([underworld]). In general, since each element. may have , several parts of speech, the matrix is entered for e~ch possible combination, but the maximum number of resulting parts of speech is three. Combinations of suffixes with compounds ( [handwriting] ) can be accommodated, as well as one-word compounds containing more than two morphs. The algorithm has a special routine to handle troublesome suffixes such as [-er, -es, -s], in an attempt to reduce the reSUlting number of parts of speech to a minimum. In this way, the algorithm makes use of the parts of speech information of the individual morphs to compute the parts of speech set for the word formed by these morphs. These sets then serve as input to the parser, after having first been ordered to suit the principles of the parser. Parsing As we have remarked, if a listener is aware of the surface syntactic structure of a spoken sentence, then he may generate internally the accompanying prosodic features to the extent that they are determinable by linguistic rules forming part of his language competence. Hence we desire to make this structure evident to the listener by providing cues to the syntax in the prosodics of the synthesized speech. To do this, we must first determine the structure, and tlien implement prosodics corresponding to the structure. Since we are trying to provide only a limited number of such cues (enough to· allow the structure to be deduced), we have designed a limited parser that reveals the syntax of only a portion of the sentence. We have tried to find th~ simplest parser consistent with the'-:? phonological goals that would also use minimum core stor age and run fast enough (in the context of the over-all reading machine) to allow for a realistic speaking rate, say, 150-180 words per minute. Because the absence, or incorrect implementation of prosodies in a small lViachine-to-lVian Communication by Speech-Part H percentage of the output sentences, is not likely to be catastrophic, we can tolerate occ~sional mistakes by the parser, but we have tried to achieve 90 per cent accuracy. These requirements, for a limited, phraselevel parser operating in real-time at comfortable speaking rates within restricted core storage, are indeed severe, and many features found in other parsers are absent here. We do not use a large number of parts of speech classifications, nor do we exhaustively cycle through all the homographs of the words of a sentence to find all possible parsings. Inherent syntactic ambiguity ([They are washing machines]) is ignored, the resulting phrase structures being biased toward noun phrases and prepositional phrases. No deep-structure "trees" are obtained, since these are not needed in the phonological algorithm, and only noun phrases and prepositional phrases are detected, so that no sentencehood or clause-level tests are made. We do, however, compute a bracketed structure within each detected phrase, such as [the [old house]] and [in [[brightly lighted] windows]], since this structure is required by the phonological algorithm. The result is a context-sensitive parser that avoids time-consuming enumerative procedures, and consults alternative homographs only when some condition is detected (such as [to] used to introduce either an infinitive or a prepositional phrase) which requires such a search. 
The parser makes two passes (left -to-right) over a given input sentence. The first pass computes a tentative bracketing of noun phrases all:d prepositional phrases. Inasmuch as this initial bracketing makes no clause-level checks and does not directly examine the frequently occurring noun/verb ambiguities, it is followed by a special routine designed to resolve these ambiguities by means of local context and grammatical number agreement tests. These last tests are also designed to resolve noun/verb ambiguities that do not occur in bracketed phrases, as [refuse] in [They refuse to leave.]. As a result of these two passes, a limited phrase bracketing of the sentence is obtained, and some ambiguous words have been assigned a unique part of speech, yet several words remain as unbracketed constituents. The first pass is designed to quickly set up tentative noun phrase and prepositional phrase boundaries. This process may be thought of as operating in three parts. The program scans the sentence from left to right looking for potential phrase openers. For example, determiners, adjectives, participles, and nouns may introduce noun phrases, and prepositional phrases always start with a preposition. In the case of some introducers, such as present participles, words further along in the sentence are examined, as well as pre vi- 341 ous words, to determine the grammatical function of the participle, as in [Wiring circuits is fun.] Once a phrase opener has been found, very quick relational tests between neighboring words are made to determine whether the right phrase boundary has been reached. These checks are possible because English relies heavily on word order in its structure. Having found a tentative right phrase boundary, right context checks are made to determine whether or not this boundarv should he accented. After comnletion ---c - - -- - -- of -these checks, the phrase is closed and a new phrase introducer is looked for. This procedure continues until the end of the sentence is reached. When the bracketing is complete, further tests are made to check for errors in bracketing caused by frequent noun/verb ambiguities. For example, the sentence [That old man lives in the gray house.] would be initially bracketed. [That old man lives]NP [in the gray house)PREP p. Notice that sentence hood tests (although not performed by the parser) would immediately reveal that the sentence lacks a verb, and further routines could deduce that [lives], which can be a noun (plural) or verb (third person singular), "is functioning as a verb, although the bracketing routine, since it is biased toward noun homographs, made [lives] part of the noun phrase. We also note the importance of this error for the phonetic shape of the sentence, since [lives] changes its phonemic structure according to its grammatical function in the sentence. An agreement test, however, compares the rightmost "noun" with any determiners that may reflect grammatical number. In this case, [that] is a singular demonstrative pronoun, so we know that [lives] does not agree with it, and hence must be a verb. After the agreement test has been made for each noun phrase, local context checks are used in an attempt to remove noun/verb ambiguities that are important for the phonological implementation, and yet have not been bracketed into phrases containing more than one word. Thus in the sentence [They produce and develop many different machines.] 
, the algorithm would note that [produce] is immediately preceded by a personal pn:moun in the nominative case, and hence the word is functioning as a verb. Soch knowledge can then be used to put stress on the second syllable of the word in accordance with its function. At the conclusion of the parsing process described above, phrase boundaries for noun phrases and prepositional phrases have been marked, but the structure within the phrase is not known. In order to apply the rules that are used for computing stress patterns within the phrase, however, internal bracketing must be - - - - - • _ •. - _. - - - - - - - - - 6.,.- - - -- - - -- - - - - - 342 Spring Joint Computer Conference, 1968 provided. For this reason, determiner-adjective-noun sequences are given a "progressive" bracketing, as [the [long [red barn]]], whereas noun phrases beginning with adverbials are given "regressive" bracketings, as [[ [very brightly] projected] pictures], A preposition beginning a prepositional phrase always has a progressive relation to the remaining noun phrase, so that we have [in [the [long [red barn]]]] and [of [ [[ very brightly] projected] pictures] ] Furthermore, two nouns together, as in [the local bus stop], are marked as a compound for use by the phonological algorithm. The procedure described above is thus able to detect noun phrases and prepositional phrases and to compute the internal structure of these phrases. The grammar and parsing logic are intertwined in this procedure, .so that an explicit statement of the grammar is impossible. Nevertheless, the rules are easily modified, and additions can readily be made. If, for example, we decide to detect verbal constructions, this could easily be done. At present, however, we feel that recognition of noun phrases and preposi,tional phrases and the provision of prosodics for these phrases is sufficient to allow the listener to deduce the correct syntactic structure for large samples of representative text. Phonological algorithm The method for detecting and bracketing noun phrases and prepositional phrases has now been described. We assume that this surface structure is sufficient to allow the specification of stress and intonation within these phrasal units. The basis for this assumption is given in the work of Chomsky and Halle. 9 The phonological algorithm then uses the surface syntactic bracketing, plus punctuation and clause-marker words, to deduce the pattern for stress, pauses, and intonation related to the detected phrases. In the present implementation, only three acoustic parameters are varied to implement the prosodic features. These are fundamental frequency (fo)' vowel duration, and pauses. It is well known that juncture pauses have acoustic effects on the neighboring phonemes other than vowel lengthening and fo changes" but these effects are ignored in the present synthesis. We thus consider fo, vowel duration, and pauses to constitute an interacting parameter system that serves as a group of acoustic features used to implement the prosodics. The "sharing" of fo for use in marking both stress and intonation contours is another example of the interactive nature of these acoustic parameters. Stress is implemented within the detected phrases by iterative use of the stress cycle rules, described by Chomsky and Halle. 9 These rules operate on the two constituents within the innermost brackets to specify where main stress should be placed. All other stresses are then "pushed down" by one. (Here, "one" is the highest stress.) 
The innermost brackets are then "erased," and the rules applied to the next pair of constituents. This cycle is then continued until the phrase boundaries are reached. For compounds, the rules specify main stress on the leftmost element (compound rule), whereas for all other syntactic units (e.g., phrases) main stress goes on the rightmost unit (nuclear stress rule). For example, we have [the [long [red barn]]], where initially stress is 1 on all units except the article [the], and two cycles of the phrase rule are used, yielding the contour 2-3-1 on [long red barn]. The parser has, of course, provided the bracketing of the phrase. Also, [in [[[very brightly] lighted] rooms]] requires three applications of the rules, yielding 4-3-2-1 on [very brightly lighted rooms], and [the [new [bus stop]]], which contains a compound, requires two iterations, yielding 2-1-3 on [new bus stop]. It is clear that for long phrases requiring several iterations, say n, there will be n + 1 stress levels. Most linguists, however, recognize no more than four levels, so the algorithm clips off the lower levels. At present, three levels are being used, but this limit can be easily changed in the program.

In the examples it has been implicitly assumed that each content word started with main stress before the rules were applied. Each word does have a main stress initially, but in general each word has its own stress contour, as, for example, in the triple [nation, national, nationality]. (As Lee3 has pointed out, pairs such as [nation/national] can be handled by placing the two words directly in the morph dictionary, but we have tried to extend the stress algorithm to cover many of these cases. Clearly, there is a compromise between processing time and dictionary size to be determined by experience.) Thus the algorithm must compute the stress for individual words by applying rules for compounds and suffixes. The compound rule is the same as for two separate words that comprise a compound (e.g., [bus stop, browbeat]). Each morph in the lexicon is given lexical stress, so that an initial stress contour is provided. Each suffix is also provided with information about its effect on stress. Hence [-s, -ed] and [-ing] all leave the root morph stress unaltered, and have the lowest level stress for themselves. Another example is the suffix [-ion], which always places main stress on the immediately preceding vowel (e.g., [nationalization, distribution]). At present, such changes in stress of the root word are not computed by rule.
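The iterated stress cycle lends itself to a compact recursive sketch over the parser's binary bracketing. The treatment of function words here (no stress, and no cycle when one constituent is unstressed) is an assumption made so that the examples above come out with the quoted number of iterations; everything else follows the compound and nuclear stress rules as stated.

```python
FUNCTION_WORDS = {"the", "a", "in", "of"}     # carry no stress (assumption)

def stress_cycle(node):
    """Return ([(word, stress)], index of the word bearing stress 1)."""
    if isinstance(node, str):                 # a leaf word
        if node in FUNCTION_WORDS:
            return [(node, None)], None
        return [(node, 1)], 0                 # content words start at 1
    kind, left, right = node
    lw, lp = stress_cycle(left)
    rw, rp = stress_cycle(right)
    words = lw + rw
    if lp is None or rp is None:              # one side unstressed: no cycle
        return words, lp if rp is None else len(lw) + rp
    # compound rule: main stress stays on the left constituent;
    # nuclear stress rule: main stress goes to the right constituent
    keep = lp if kind == "compound" else len(lw) + rp
    pushed = [(w, s if i == keep or s is None else s + 1)
              for i, (w, s) in enumerate(words)]
    return pushed, keep

examples = [
    ("phrase", "the", ("phrase", "long", ("phrase", "red", "barn"))),
    ("phrase", "in", ("phrase",
     ("phrase", ("phrase", "very", "brightly"), "lighted"), "rooms")),
    ("phrase", "the", ("phrase", "new", ("compound", "bus", "stop"))),
]
for ex in examples:
    print(stress_cycle(ex)[0])
# [('the', None), ('long', 2), ('red', 3), ('barn', 1)]
# [('in', None), ('very', 4), ('brightly', 3), ('lighted', 2), ('rooms', 1)]
# [('the', None), ('new', 2), ('bus', 1), ('stop', 3)]
```

Clipping to the three stress levels actually in use would simply cap every computed value at 3.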
Short pauses of 100 msec are used where commas and semicolons appear, and pauses of 200 msec are inserted before clause-marker words such as [that, since, which] etc., which serve to break up the sentence into clausal units. Finally, terminal pauses of SOO msec are provided for colon, period, question mark, and exclamation point. Thus a hierarchy of pauses is used to help make the grammatical structure of the sentence clear. The provision of intonational fo contours by rule has been described by Mattingly,t° and our technique is similar to his. The slope of the fo contour is controlled by the specific phonemes encountered in the sentence, and by the nature of the pause at the end of the phrasal unit. Rising terminal contours are specified at the end of int~rrogative clauses just preceding the question mark, except when the clause starts with a [~h-] word, as [where is the station?]. In the absence of a question mark, the intonati~n fo contour'is falling with a slope determined by rule as is done by Mattingly. The starting point for fo at the beginning of a sentence is fixed at 110Hz. The jumps in fo for the various stress levels vary with the initial value of fo, but nominally they are 12, IS, and 30 Hz corresponding to the stress levels 3, 2, and 1 respectively. As noted before, 1 'corresponds to the highest stress in our system. Subjective experience The method of implementing prosodic features on 343 the limited basis described above has been used in connection with the TX-O computer at M.I.T., driving a terminal analog synthesizer. While the resulting speech is still unnatural in many respects, a substantial improvement in speech quality has been attained. It appears that by using limited phrase level parsing and implementation of prosodics mainly within these phrases, sufficient cues can be provided to the listener to enable him to detect the grammatical structure of the sentence and hence provide his own internal phonetic shape for the sentence. Since this system will become part of a complete computercontrolled reading machine operating in real time, it is encouraging to find that such a limited approach is able to improve the speech quality markedly. We anticipate that further work on both phonemic and prosodic synthesis rules will yield even greater intelligibility and naturalness in the output speech, with little additional computing load placed on the system. DISCUSSION The speech synthesis system described here has been developed for research purposes. Hence the implementation of our speech synthesis system has remained very flexible so that further improvements can be easily accommodated. Better rules for phonemic synthesis are being developed, and will be incorporated into the system., Much work remains to be done on the determination of the physiological mechanisms underlying stress, and the resultant observable phonetic patterns which arise from these articulations. Particular attention is being focused on the nature and interaction of fo and vowel duration as correlates of stress. There will also undoubtedly be further improvements in the parsing procedure as experience dictates. From the linguistic point of view, the lexicon for a language should contain only the idiosynchrasie's of a language, everything derivable by rule being computed as part of the language user's performance. Engineering considerations, however, clearly dictate a compromise with this view, and the cost of memory versus the cost of computing with an extensive set of rules must be examined further. 
It may, for example, become feasible to compute lexical stress by rule, but any advantages of this procedure must outweigh the cost in time and program storage for these rules. ACKNOWLEDGMENTS This work was supported principally by the National Institutes of Health (Grant 1 POI GM-1490-0l) and in part by the Joint Services Electronics Program (Contract DA2S-043-AMC-02536(E»; additional 344 Spring Joint Computer Conference, 1968 support was received through a fellowship from Bell Telephone Laboratories, Inc. REFERENCES 1 C L SEITZ An opaque scanner for reading machine research SM Thesis MIT 1967 2 J K CLEMENS Optical character recognition for reading machine applications Doctoral Thesis MIT 1965 3 F FLEE A study of grapheme to phoneme translation of English PhD Thesis MIT 1965 4 G A MILLER Decision units in the perception of speech IRE Transactions on Information Theory VoIIT-8 No 2 P 81 February 1962 5 G A MILLER G A HEISE W LICHTEN 6 7 8 9 lO The intelligibility of speech as a function of the contest of the test materials J Exptl Psychol41 p 329 1951 G A MILLER S ISARD Some perceptual consequences of linguistic rules J Verb Learn Verb Behav 2 p 2! 7 ! 963 S KLEIN R F SIMMONS A computation approach to the grammical coding of English words J Assoc Computing Machinery 10 334 1963 D C CLARKE R E WALL An economical program for the limited parsing of English AFIPS Conference Proceedings p 307 Fall Joint Comp Conf 1965 N CHOMSKY M HALLE Sound patterns of English (in press) I G MATTINGLY Synthesis by rule of prosodic features T llnOllllOP _.01;_ llnrl .~npprh I Qf:.f:. --"·.0--0-t"'--_&;&"Q •I o..".....,....., An on-line multiprocessing interactive computer system for neurophysiological investigations hy FREDERICK D. ABRAHAM Brain Research Institute, Neuropsychiatric Institute and Psychiatry, UCLA Los Angeles, California and LASZLO BETY AR and RICHARD JOHNSTON Data Processing Laboratory, Brain Research Institute, UCLA Los Angeles, California INTRODUCTION The principal dependencies of neurophysiologists UDon the computer are for data collection and analysis, experimental control, and the development of theoretical models. One possible system providing these functions is one that allows several investigators to on-line time-share a moderate sized digital computer capable of performing input, output, and computational functions in a simple interpretive language that is easy to understand and use in a fast decision experimental environment. A community of neurophysiologists in the UCLA Brain Research Institute share such a computer in its data processing laboratory (DPL) by means of remote console stations in the investigators' laboratories connected to the DPL by a direct cabling system. 5 ,15 A larger computer facility, available to a larger community of health scientists, is used for batch processing where problems do not need continuous interaction with the investigator for on-line control or analysis, or do need greater computational capability.9 The two facilities possess compatible I/O formats, thus making some problems soluable 'by the combination of both computers, and giving other problems the flexibility of either approach. Essentially the DPL is a mUltiprocessing system 12 ,18,21 ·wIth an emphasis on I/O functions and an interpretive system appropriate for neurophysiological investigation and with some unique solutions to resource allocation and system integrity in its temporalspatial (core) algorithm. 
The economic advantage of such a system is not argued, nor is CPU economy necessarily maximized with present use, though from the standpoint of I/O devices, which are so important for such research, a central facility may possess some advantages. Reliability and demands of time-critical users must be realistically estimated for neurophysiological users, for whom the on-line aspect may be with reference to the integrity of their experiments.

Computer facilities and functions

A. The basic DPL hardware system

The central processor unit (CPU) is a Scientific Data Systems 9300 general purpose digital computer, with a 32K, 24 bit word memory, three time-multiplexed communication channels (TMCC), and two direct access communication channels (DACC) (Fig. 1). Seven magnetic tape units are on one TMCC. Character devices including printer, card reader, plotters, typewriter, paper tape reader, and paper tape punch are on another TMCC. A 30-channel analog-to-digital (A/D) converter and a 16-channel A/D converter are available on another TMCC. The A/D's have a common multiplexer with a 100 kHz conversion rate and a precision of 10 bits plus sign. An 8 M character disc storage is on a DACC. A digital-to-analog (D/A) converter is on the other DACC. There is also a 24 bit parallel input of sense lines capable of detecting 24 channels of discrete laboratory events, and a 24 bit parallel output of relay drivers capable of operating 24 channels of discrete events in the investigators' laboratories. The system interface unit contains the A/D multiplexer, the D/A connector, the 24 relay drivers and their connections, the 24 sense line connections, and a 6 bit character buffer for input from the consoles in the investigators' laboratories.

[Figure 1 - DPL hardware configuration: SDS 9300 with 32K memory; POT (24 bits) and PIN (24 bits); 30 and 16 analog inputs; system interface unit; D/A output; console system with storage scopes (consoles 1-3); line printer; card reader; paper tape reader and punch. Dashed line indicates planned system expansion]

The remote consoles (Figs. 2 and 3) located in the investigators' laboratories include a 64 key lighted keyboard and a storage oscilloscope. The console also contains a few lights to indicate the status of a few computer functions. The console (differing from others7,8,13,16) represents two most important features of the system: the on-line time-sharing feature, and the function of a simple communication between computer and researcher. The consoles allow on-line multiprogramming and multiprocessing of programs already entered via the console or other routes. Its keys are mnemonically labelled and lighted. The mnemonic labels indicate the most basic functions available to the user in terms of events and computations familiar to his research. The lights indicate whether the keyboard is in an upper or lower case mode of operation. The storage oscilloscope is capable of both graphic and alphanumeric display.

B. The basic DPL software system

The executive system and the interpreter (Shared Laboratory Interpretive Processor, SLIP) for the users' multiprocessing are resident in core. This system performs the reading, interpreting, and processing of console keypresses, and the generation of scope displays. In addition, the computer operator may execute various background activities which take advantage of idle CPU time.
These background activities represent the normal production activity of the DPL and include A/D conversion, "off-line" plotting on two Calcomp digital plotters, "off-line" printing of assembler output, card-to-tape conversion, and routine tests of digital tapes. At present, there is no capability to time-share routine compilations and assemblies. These operations are performed at scheduled times under the standard SDS supplied monitor system.

[Figure 2 - Console]

[Figure 3 - Console keyboard]

Console activities may be controlled through a continuous sequence of keypresses, performed in a direct execution mode of available public users' programs. Additional private users' programs may also be written in this mode, and then used in a program execution mode. Any key press may have two types of meanings according to whether the keyboard is in upper or lower case. In upper case the keys have an operator (function) meaning; in lower case they usually have an operand (data) meaning. Each key may have up to 60 operator meanings, the first ten of which are written as machine language subroutines as part of the interpreter, representing a compiler in providing a language to the user in terms of I/O and computational functions required in this research (differing from other interactive interpretive languages7,8,11 and medical systems16). The remaining 50 operator definitions per key are optional sequences of other operator and operand meanings. The first ten meanings per button are public routines likely to be used by many investigators but modifiable by none (currently, only the first of the ten exists for each key; see Table I). The remaining 50 meanings are available for private multiprogramming usage. That is, each investigator, by his reference identification, may define 50 more programs per key that are usable and modifiable only by himself (resident on disc or magnetic tape). Similarly, the operand (data) meanings, locatable by a special indexing procedure, are private, with up to 217,088 pieces of data storable on each of 53 keys.

All operations are performed in the console "accumulator" (the principal working register of the computer), which may hold a vector of up to 250 real or 125 complex numbers. Operators may be unary, that is, operate on the contents of the accumulator, leaving the keyboard in operator (upper case) mode. Plotting (graphic scope display of accumulator contents) and Sine (replacing the contents of the register with their sine) are examples of unary operators. Other operators are binary, requiring additional data which must be obtained in the operand mode after execution of the binary operator. A terminator key press in operand mode returns the keyboard to operator mode after the data operand is defined by appropriate key presses. For example, a vector may be added to another, replacing the register with the vector sum. The data may be generated as needed by key presses representing numbers, or may be called from storage by key presses representing its location indexing.

Programs and data are placed on disc as they are defined and may remain there safely for short term needs such as a week or so. If they are desired to be kept longer, they may be dumped onto magnetic tape or punched cards and reloaded at a later time. Data may be collected from experiments in either a continuous or triggered fashion for spectral, transient response, or spike analysis. The data are disc stored with appropriate indexing for further console operations, or stored on magnetic tape in IBM/360 format for subsequent analysis at the other computer facility.

C. Uniqueness and I/O capabilities

An important difference between SLIP and other similar time-sharing remote console systems7,8 is the availability to the console user of specialized hardware for data collection and experimental control. These capabilities are supplied in addition to a comprehensive set of mathematical and I/O operations. Another major difference between SLIP and other time-sharing systems is in the space (core memory) and time algorithm. Space is allocated dynamically on an as-required basis for each user. Programs and data are saved on disc, and are automatically read in when referenced. More frequently used data and programs remain in core for longer periods of time. With respect to time allocation, each user may execute one function, and then the computer will automatically service others on a cyclic commutator basis. Time consuming operations such as convolution relinquish the CPU several times before completion. Time-critical (synchronous) activities such as A/D conversion are interrupt driven and are given the highest system priority. I/O channel access is on a first come, first served basis.
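The operator/operand interaction can be sketched as a small state machine around a vector accumulator; the two operators shown, and all names, are illustrative stand-ins for the keyboard's public routines.

```python
import math

class Console:
    MAX_REALS = 250            # accumulator holds up to 250 real numbers

    def __init__(self):
        self.acc = []          # the console "accumulator"
        self.mode = "operator" # upper case = operator, lower case = operand

    def op_sine(self):
        # unary operator: acts on the accumulator; the keyboard stays
        # in operator (upper case) mode
        self.acc = [math.sin(x) for x in self.acc]

    def op_add(self, operand):
        # binary operator: its operand is keyed in lower-case (operand)
        # mode; a terminator key press returns to operator mode
        self.mode = "operand"
        assert len(operand) == len(self.acc) <= self.MAX_REALS
        self.acc = [a + b for a, b in zip(self.acc, operand)]
        self.mode = "operator"   # the terminator key press

c = Console()
c.acc = [0.0, math.pi / 2]
c.op_sine()                    # acc -> [0.0, 1.0]
c.op_add([1.0, 2.0])           # acc -> [1.0, 3.0]
print(c.acc)
```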
The first ten meanings per button are public routines likely to be used by many investigators but modifiable by none (currently, only the first of the ten exists for each key, see Table I). The remaining 50 meanings are available for private mUltiprogramming usage. That is, each An important difference between SLIP and other similar time-sharing remote console systems 7•g is the availability to the console user of specialized hardware for data collection and experimental control. These capabilities are supplied in addition to a comprehensive set of mathematical and I/O operations. Other major differences between SLIP and other time-sharing systems is in the space (core memory) and time algorithm. Space is' allocated dynamically on an as-required basis for each user. Programs and data are saved on disc, and are automatically read in when referenced. More frequently used data and programs remain in core for longer periods of time. With respect to time allocation, each user may execute one function, and then the computer will automatically service others on a cyclic commutator basis. Time consuming operations such as convolution relinquish the CPU several 348 Spring Joint Computer Conference, 1968 times before completion. Time-critical (synchronous) activities such as AID conversion are interrupt driven and are given the highest system priority. I/O channel access is on a first come, first served basis. D. The Health Science Computer Facility (HSCF). 9 While the DPL is a moderately sized facility, it is specifically geared for the treatment of analog neurophysiological data as well as the on-line timeshared macro language used by the neurophysiological investigator. For many problems, data analysis need not necessarily be performed on-line or with the investigator's continual badgering. A larger, batch processing 360 system available to a larger community of scientists is used for such problems, taking input prepared by the D PL facility and returning output that is given its final form, e.g., plotted DI A data analysis displays, at the DPL. Separate programs are used at each facility on their respective disc storages. No direct link yet exists between the two facilities, and principal· data communication between them is via I/O IBM/360 format compatible magnetic tapes, with additional program control and parameter entry to the HSCF computer with punched cards. Examples showing the various options available with these computers for use with the principal types of neurophysiological research follow. Other medkal systems have been reviewed elsewhere. 10 ,16,17 Exemplary neurophysiological control and analysis A., EEG spectral analysis in behavioral experiments l ,3,4,6,14,20 Experimental control with data collection, analysis, and display for cats in chronic learning situations while measuring EEG activity from several brain sites provides an excellent example of the interaction between the investigator's laborator.y and both computer facilities (Fig. 4). The DPL console system is used to control the experimental situation with a relay-driving program, while simultaneously AID converting several channels of amplified EEG activity and time and stimulus codes with another program that places the data on a digital magnetic tape in IBM/360 Fortran compatible format. 
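That collection program is, in essence, a multiplexed sampling loop that tags each sweep with time and stimulus codes before writing it out. A minimal sketch of the idea in present-day Python (the record layout, channel counts, and the adc/tape device helpers are assumptions for illustration; the original was SDS 9300 machine code):

```python
import struct
import time

RECORD_CHANNELS = 8          # assumed: EEG channels digitized per sweep
SAMPLES_PER_CHANNEL = 256    # assumed: sweep length

def collect_sweep(adc, stimulus_code):
    """Digitize one multiplexed sweep and prepend time and stimulus codes,
    mimicking the DPL program that writes IBM/360-compatible tape records."""
    header = struct.pack(">Ii", int(time.time()), stimulus_code)
    samples = []
    for _ in range(SAMPLES_PER_CHANNEL):
        for ch in range(RECORD_CHANNELS):
            samples.append(adc.read(ch))   # 10 bits plus sign in the original
    body = struct.pack(">%dh" % len(samples), *samples)
    return header + body

def log_run(adc, tape, stimulus_codes):
    """Write one fixed-format record per stimulus presentation."""
    for code in stimulus_codes:
        tape.write(collect_sweep(adc, code))
```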
The investigator uses prepared programs, using the console to call them into use, to enter parameters concerning relay driving conditions and digitizing, to enter labels, and to initiate and terminate the relay driving and A/D routines. Spectral analysis of the data is performed at the HSCF by submitting the data tape together with control cards for both their monitor system and a user's spectral analysis program resident on disc. 4,9 Both listed and magnetic tape or disc outputs are obtained, and subsequent programs at the HSCF further prepare the data, with or without averaging, for various types of data display. The data displays are obtained at the DPL facility with background programs that can read the IBM tapes and control either the Calcomp plotters or oscilloscopic data displays. Computer operators at DPL usually perform this task, as the investigator often does not wish to view the data until it is complete. This mode of data treatment is quite suitable where a predetermined, exhaustive data analysis is required from a compulsively well designed experiment. In pilot experiments, one may wish to perform partial analyses in an attempt to determine which features are of greatest interest. Spectral analysis programs written for the DPL console system are in progress to do this. Essentially the trade-off is initially increased investigator time in exchange for savings of subsequent HSCF time by eliminating analysis of unnecessary aspects of the data. Development of new analyses and displays can also be quite efficient with such a system.

The types of data displays from spectral and related time-series analysis include three dimensional maps of spectral power, co-spectral power, coherence, and phase as a function of time (successive, selected samples) (Fig. 5A). Coherence and phase as a function of frequency for a given sample, and maps of brain locations depicting phase and coherence for a given EEG frequency band, are also very useful displays (Fig. 5B). Phase is usually converted to a time measure rather than an angular measure.

B. Evoked potentials in sensory experiments 2,6

An experiment on auditory electrophysiology provides another example of the various options available with such a computer system in efficiently yielding statistical parameters and sequential analysis when averaging transient neural responses. In this case, experimental control is usually laboratory rather than computer based, while data collection is via the console system, differing from the EEG case only in detecting pulse events and digitizing only a brief (usually 50-1000 msec) stretch of data after each pulse which includes the transient neural response to auditory stimuli. The data may be stored on disc with the console system indexing and analyzed as it is collected, as such data analyses may be performed much more rapidly than spectral analysis. Thus one could have such an average evoked response with several additional statistical properties displayed on the oscilloscope and photographed (with the computer relay drivers controlling photography if one wishes) within milliseconds after the presentation of the last stimulus in a series to an animal.
Such immediate results could be of use in making consequent experimental decisions or further data analysis decisions, or simply be convenient in eliminating the additional turnaround steps that use of the other computer facility would require.

Figure 4 - Experimental control and analysis flow

However, if the analysis needs were sufficiently predetermined, or required many brain locations to be analyzed, or if the efficiency of immediate solution were not required, then the data analysis could be treated as in the EEG example. That is, data are A/D converted in the triggered fashion onto a magnetic tape, analyzed at the HSCF, and the analyses graphically displayed via tape return to the DPL with console control. Some typical displays include averages of evoked potentials selected from a series, with confidence intervals for the entire average evoked potential (Fig. 5C), or histograms for particular temporal parts of the evoked potential. If one were interested in sequential changes during the course of time over which averaging occurred, amplitudes and latencies of various components of the evoked potentials may be displayed graphically as well (Fig. 5D).

Figure 5 - Exemplary data analysis displays

An example of the flexibility provided by having both computers available would be, say, in determining the durations of averaged evoked potentials of interest for each location measured in auditory nervous systems, or the number of responses or samples over which interesting changes were occurring, and then sending the data from the whole experiment to the other computer for complete analysis based on such determinations.

C. Behavioral experiments with relay-driving contingent upon behavior

Some learning experiments require food delivery or some other events to occur in an animal's environment as a function of statistical characteristics of some repeated response it may perform. The sense line and interrupt features of the DPL console system, along with its relay driving capability, allow great flexibility in such control, with parameter entry for such statistical decision making. The DPL computer can simultaneously digitize and analyze physiological data, and compute and display analyses of the behavioral data constantly throughout the experiment.

D. Neurophysiology as an independent variable

Relay driving could also be contingent upon on-line analysis of electrophysiological data as an independent variable, with either behavioral or other neurophysiological data collected as the dependent variable. This is a unique type of experimental design made possible by such an on-line neurophysiological computer system. Such experiments are currently planned but not yet operative.

Evaluation

It is, of course, platitudinous to point out the importance of availability and reliability to the investigator. 7,19
With respect to the integrity of his investigations, time-criticality may exist in the microsecond domain for very critical tasks such as A/D conversion, relay driving, sense line reading, and data analysis when used as independent variables. Time-important problems may also exist for I/O and system availability on an hourly basis for the "on-line" processing of his experiments, or in terms of minutes of waiting for his "off-line" data analyses.

The necessity of insulating the system from novice users prompted the decision to develop an interpretive processor. The success of similar systems gave support to this decision. The large amount of system overhead required for SLIP increases considerably the amount of execution time required for a given problem (sometimes by as much as a factor of 30 when compared to a Fortran version).

TABLE I
INTERPRETATION OF CONSOLE KEYS IN THE BASIC SYSTEM LEVEL (1)

KEY    FUNCTION OF THE OPERATOR
RETN   Program terminator
WAIT   Wait requested
CONT   Continue operation
BUG    Debug routine
LIST   Alphanumerical display of AC
TYPE   Print on scope
RUN    Starts or stops a previously defined background activity
SET    Parameter set-up
BLNK   Scope erase
PLOT   Vector display of AC on scope
FIND   Search a value in the AC
EXT    Extract element(s) from AC
PRGL   Load program to AC for correction
PRG->  Store program from AC
TRNC   Truncate the values in AC
HEDL   Examine the header of AC
TMOD   Modify the type of data in AC
MODE   Select scope at plot mode
CURS   Extract and extend a part of AC
INC    Increment or decrement continuation index or level
SHFT   Change the initial index of AC
ROT    Rotate AC to the left or right
FLIP   Invert the order of values in AC
MOD    Execute module n
MIN    Find the smallest element of AC
RAND   Generate a random vector in AC
ZERO   Count zero-crossings in AC
INTP   Linear interpolator
AVG    Compute the average of AC
SGMA   Compute the standard deviation of AC
SIN    Compute the sine of AC
COS    Compute the cosine of AC
ATAN   Compute the arctangent of AC
LOG    Compute the logarithm of AC
EXP    Each element of AC acts as an exponent of e
ABS    Replace AC with absolute values
HIST   Distribute values in AC with preset bin width
CONJ   Set the signs of values in AC
×      Multiply
/      Divide
<      Logical operator less than
≥      Logical operator greater than or equal
SUM    Sum the AC
Δ      Compute the delta between successive elements of AC
PROD   Compute the product of elements of AC
~      Filter the AC
*↑     Raise AC to an exponential
+      Add
-      Subtract
SPACE  No operator
LOAD   Load data into AC
PROG   Signify subsequent key-presses as a user generated program
END    Program or repeat loop end indicator
DATA   Defines indirect data (UC); terminator of a binary operand (LC)
RSET   Reset the console to wait input status (error correction)
ALTR   Editing operator
"      Comments mode definition
SKIP   Unconditional branch
REP    DO loop generator (repeat)
$      Label sign
[?]    Conditional branch
[?]    Store data from AC
LEV    Operator or data level change request

Note: UC = operator (upper case); LC = operand (lower case); KEY = character representation of the key; AC = accumulator.
The interpreter's increased execution time may be unacceptable to users requiring large amounts of computation (e.g., matrix inversion, sorting, etc.). The current answer to this problem is writing an efficient machine language program off-line and then merging it with the SLIP system. The program is then accessed by the SET function (Table I), using a four character name. The current daily limitation on system availability is being remedied by an assembly program development which will enable time-sharing by SLIP and assemblies. The elimination of this limitation will enable acute neurophysiological experiments, running continuously for 24 or more hours, to have constant access to the control and data processing of the system.

From the user's viewpoint, system reliability has been quite good. Most interruptions are momentary and have an adverse effect on the investigator only for time-critical on-line experimental processes such as A/D data collection, sense line reading, and relay driving. These seldom occur, but, of course, can be very frustrating (costly) to the investigator when they do.

ACKNOWLEDGMENTS

Much of the credit for the initial planning of the DPL system must be given to Mr. Dan Brown. 1,15 Lionel Rovner 15 was principally responsible for hardware developments, while Mrs. Howard, Mr. McGill, and Mr. Wyman assisted in software development. USPHS Grant NB 02511, NASA Grant NS 6505, and ONR Grant ONR 233(91) assisted DPL development, while Calif. DMH Grant 2-36 and USPHS Grant 13268 assisted the neurophysiological research. Mr. Brown and Dr. Walter 19,20 largely developed the HSCF data programs with the assistance of Mr. Parmallee and Mrs. Free. The HSCF is heavily supported by USPHS Grant FR 3.

REFERENCES

1 F ABRAHAM D BROWN M GARDINER Calibrations of EEG power spectra Communications in Behavioral Biology vol 1 1968
2 F D ABRAHAM J T MARSH Amplitude of evoked potentials as a function of slow presenting rates of repetitive auditory stimulation Experimental Neurology vol 14 1966
3 F D ABRAHAM N M WEINBERGER Possible mammillothalamic tract involvement in feeding behavior The Physiologist vol 10 1967
4 W R ADEY Spectral analysis and pattern recognition methods for electroencephalographic data Data Processing Conference Copenhagen 1966
5 L BETYAR A user-oriented time-shared on-line system CACM vol 10 1967
6 M B BRAZIER The application of computers to electroencephalography Loc. cit. ref 17
7 G E BRYAN JOSS: 20,000 hours at the console: A statistical summary AFIPS FJCC Proceedings vol 32 1967
8 G J CULLER A start in conversational programming for elementary mathematical problems IFIP Conference Proceedings vol 2 1965
9 W J DIXON Use of displays with packaged statistical programs AFIPS FJCC Proceedings vol 32 1967
10 ETTER Requirements for a data processing system for hospital laboratories AFIPS FJCC Proceedings vol 32 1967
11 S L FEINGOLD PLANIT - A flexible language designed for computer-human interaction AFIPS FJCC Proceedings vol 32 1967
12 M S FINEBERG O SERLIN Multiprogramming for hybrid computation AFIPS FJCC Proceedings vol 32 1967
13 C MACHOVER Graphic CRT terminals - characteristics of commercially available equipment AFIPS FJCC Proceedings vol 32 1967
14 W A ROSENBLITH Processing neuroelectric data MIT Press Cambridge 1962
15 L D ROVNER D BROWN R T KADO A time-shared computing system for on-line processing of physiological data Proceedings Symposium on Biomedical Engineering Milwaukee vol 1 1966
16 W J SANDERS G BREITBARD D CUMMINS R FLEXER K HOLTS J MILLER G WIEDERHOLD An advanced computer system for medical research AFIPS FJCC Proceedings vol 32 1967
17 R W STACY B WAXMAN Computers in biomedical research Academic Press New York 1965
18 A TONIK Development of executive routines, both hardware and software AFIPS FJCC Proceedings vol 32 1967
19 D O WALTER Rapid interaction with a digital computer - plusses and minuses The Physiologist vol 9 1966
20 D O WALTER R T KADO J M RHODES W R ADEY Electroencephalographic baselines in astronaut candidates estimated by computation and pattern recognition techniques Aerospace Medicine vol 38 1967
21 H WYLE G J BURNETT Management of periodic operations in a real-time computation system AFIPS FJCC Proceedings vol 32 1967

Graphical data management in a time-shared environment

by SALLY BOWMAN and RICHARD A. LICKHALTER
System Development Corporation
Santa Monica, California

INTRODUCTION

At System Development Corporation there is a conviction that one of the most plausible ways to make the cost of software decline is to build general-purpose software that is capable of solving a variety of problems. SDC's most successful effort in this field has been in the area of general-purpose data management. Our initial large-scale, time-shared data management system, TSS-LUCID, enabled the nonprogrammer to describe, load, query, and maintain a data base. In use for over two years, this system provided enough generality to solve such diverse problems as comparison of salary data in different segments of the aerospace industry, analysis of statistical data for a customer in the oil industry, and monitoring of publicly supported cancer research projects. Currently being implemented on the IBM 360 family of computers is an improved, more powerful version called TDMS (Time-Shared Data Management System).

The role of cathode ray tube displays as applied to data management systems is the basic concern of this paper. Traditionally, CRT displays have been used for tabular data display, geographical displays associated with command and control systems, and for engineering applications. Little use has been made in the area of graphical display of structured data files as an adjunct to an information retrieval system. In addition, little attempt has been made to use the scope to assist the user in forming his data retrieval request. With the widespread availability of time-sharing and data management systems, the need for an easy-to-use, yet powerful, graphical display system has become more critical. This paper describes the graphical display system currently available on SDC's time-shared Q-32 computer in Santa Monica. The system described (called DISPLAY) is the forerunner of the display component of TDMS.
Design goals

In January 1967 we began work on DISPLAY by limiting the design to that set of display problems which relate to analyzing data from a large data base. Thus, DISPLAY was to be concerned with scatter plots and with regression curves, but not with rotating three-dimensional figures. Our first design goal was to provide satisfactory response within a time-shared computer. We knew from experience that the nonprogrammer user will tolerate long delays while the teletype clatters along reassuringly, but that he will panic very quickly when confronted with an unchanging CRT. Constructing a display system that will deliver even tolerable response within a general-purpose time-sharing environment, however, requires careful breakdown of the input/output requirements to maintain interactiveness between system and user and thus avoid lengthy delays waiting for service when placed in a lower priority queue.

Our second goal was to produce a system easy for the nonprogrammer to use. Communication had to be so natural as to appear inevitable. Still, the system had to have enough power to accomplish useful work. Achieving these first two goals - we hoped - would help us achieve the third: that of gaining users for the system in order to obtain feedback to improve the system. Our ultimate goal was to use this experience in designing the display portion of the Time-Shared Data Management System, a general-purpose data management system being developed at SDC.

Design environment of DISPLAY

The DISPLAY system was designed and built for the IBM Q-32 computer. This is a powerful machine with 64,000 words of core memory, a 2-μsec cycle time, drum storage, and 4 million words of auxiliary disc storage. The Q-32 is operated in a time-sharing mode at all times. During "prime time," the number of users averages between 25 and 30. SDC's general-purpose time-sharing algorithm provides equal service to all users; thus DISPLAY receives no special advantage over any other program. To the best of the authors' knowledge, there are no other display systems that operate in a time-shared environment without receiving some special consideration from the executive.

Auxiliary to this computer are 6 CDC DD19 cathode-ray-tube scopes which are drum-refreshed automatically every 22 milliseconds. The 1024 x 1024 scope grid provides good resolution, and the refresh rate provides excellent insurance against flicker. Normal usage of the CRT provides 680 characters and/or vectors to each user, but DISPLAY can operate with up to 1,360 characters. Associated with each CRT is a light pen with a two-position switch, providing an aiming circle and a flicker when fired. Additionally, a teletype is provided for normal communication with the Time-Sharing System and (in some instances) with the DISPLAY system. In previous experiments we had learned that a welter of input equipment is distracting and awkward, if not downright frightening, to the user, so we deliberately kept to the minimum of scope, light pen, and teletype.

In addition to the hardware available, the designers had a startling richness of software at their disposal. This included an on-line, interactive JOVIAL compiler; elaborate debugging, editing, and other on-line programming tools; and TSS-LUCID, which provided all the machinery necessary to describe, load, maintain, and interrogate large data bases. Additionally, an on-line interpreter, TINT, with full ALGOL capabilities was available.
The importance of these software assets cannot be overemphasized. In fact, in building DISPLAY, the TSS-LUCID query program and the TINT interpreter were used almost "as is." In order to do this, of course, a sequencer program had to be built, which effectively time-shares core within the time-sharing system.

System description

DISPLAY provides an automatically generated graphical presentation of data stored in a TSS-LUCID data base. The user's entire attention is focused on the scope. All parameters which he may require are listed, and he supplies values only by means of his light pen. To make this possible, the program follows the user's light-pen selection and dynamically updates the scope to supply all the choices legal as a result of his last action. If the user changes his mind or makes a mistake, he erases by light-penning what he wants to delete. His previous selection is erased up to the point that he has indicated, and the scope returns to the set of legal inputs that are appropriate after considering his erasure. Dynamic scope updating achieves three purposes: it makes it easier for the user; it guarantees error-free inputting; and it allows the program to deliver rapid response within the time-sharing system. Once the necessary parameters for a graphical display are defined, the user executes the request and receives a standard graphic presentation of his data. The standard presentation implies two things: first, the user receives a data plot rapidly without being troubled with the minute detail associated with laying out a plot; second, extensive capability to override the standard presentation must be available.

Data base X-ray

At any time during parameter specification, the user may "browse" through his data base under light-pen control. A list of the data base elements is presented 26 to a display page, with up to five pages possible. Included with the element list are the element number, the element type, and the number of distinct values which each element has in the data base. The user may then access the value list associated with each element through light-pen control. Because the number of data values can be quite large for each element, a random-access scheme is applied to make possible rapid display of any value. When the user requests a value list display for a selected element, a sample of values is presented including the first value, the last value, and up to 24 equally spaced values. To obtain additional data values for the selected element, the user has light buttons to expand the displayed value list about any value. This is an iterative process providing the capability to pinpoint a specific value out of a list of up to 17,576 (26 cubed) values by only three light-pen choices. This has particular value in providing the user a rapid listing of the range of data values and their stored coding in both coarse and fine scale. Another feature of the value list display is the inclusion of a frequency occurrence count for each value. This is very useful in analyzing the data base contents and in error detection within the data base.
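The three-choice bound follows directly from the page size: each expansion shows 26 values, and 26 x 26 x 26 = 17,576. A small sketch of the sampling rule in Python (the paging arithmetic here is our reading of the description, not SDC's code):

```python
def sample_page(lo, hi, page=26):
    """Return up to `page` indices evenly spaced over [lo, hi], always
    including the first and last - like DISPLAY's value list sample."""
    span = hi - lo
    if span + 1 <= page:
        return list(range(lo, hi + 1))
    step = span / (page - 1)
    return [lo + round(i * step) for i in range(page)]

# Each light-pen pick narrows the range to the gap around the chosen
# value (about 1/25 of the previous span), so three picks resolve one
# value out of a list of up to 26**3 = 17,576.
first = sample_page(0, 17575)     # coarse scale
finer = sample_page(676, 1351)    # after one pick: the selected gap
```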
Data retrieval

The structure of the TSS-LUCID data base that is used with DISPLAY provides rapid data retrieval and broad user selectivity in determining the particular data subset of interest. The user pays no penalty for the retrieval of one data element over another, since all elements of a given data base are equally retrievable. Stored with every data base is a cross-indexed inverted file directory into the actual data. When a data retrieval request is defined, the directories are searched to determine the qualifying data records. These directories are then further used to determine disc addresses of the qualifying records; these are the only records that are examined for retrieving the required data.

To select a new data base while using DISPLAY, the user names the desired data base to be loaded using the teletype. The new data base is found and made part of the DISPLAY environment. There is currently no ability to retrieve from more than one data base simultaneously; that is, one may not retrieve data for the X-axis from data base 1 and data for the Y-axis from data base 2.

Standard graphic presentation

DISPLAY is designed to provide an automatically determined, standard graphic presentation in response to the user's light-pen inputs. Axis scaling, axis labeling, data scaling, and data plotting are all performed by the program. The data determine the X- and Y-axis scales. For numeric information, the scale consists of 10 graduation marks with a graduation increment based upon the most appropriate selection of 1, 2 or 5 times a power of 10, which encloses the range of data values and gives the largest graphic display available. An algorithm determines whether the data values should be plotted from zero, or whether it is more appropriate to display a minimum scale value which is somewhat below the minimum data value displayed. Hollerith data are displayed evenly spaced on the axis, with a spacing determined by the number of distinct values for the variable. On the X-axis, to accommodate a larger number of distinct labels, two lines are used (when necessary) to minimize label overlay. Titles for the X- and Y-axis are the data base element names specified in the display request. The overall title of the graph is the user's original data subsetting statement or, if there is none, the data base name. Where the user has specified a succession of curves to appear on the scope, each curve is numbered and a legend supplied to distinguish the multiple curves.
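The 1-2-5 rule is a classic "nice numbers" computation. One way to implement it is sketched below in Python; this is our reconstruction from the description, and the zero-baseline heuristic in particular is an assumption:

```python
import math

def axis_scale(lo, hi, marks=10):
    """Pick a graduation increment of 1, 2, or 5 times a power of 10 so
    that `marks` graduations enclose [lo, hi] as tightly as possible."""
    raw = (hi - lo) / marks if hi > lo else 1.0
    power = 10 ** math.floor(math.log10(raw))
    for m in (1, 2, 5, 10):
        step = m * power
        if step * marks >= hi - lo:
            break
    # Assumed heuristic: start from zero unless that wastes most of the plot.
    start = 0.0 if 0 <= lo < 0.4 * hi else step * math.floor(lo / step)
    return start, step

start, step = axis_scale(33.2, 35.1)   # e.g., a salinity axis: not from zero
```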
User picture interface

When the requested graphic display appears on the scope, the user has at his disposal a set of "touch-up" overrides for further analyzing the data and modifying the display. Accurate point readout values are available through light-pen selection. The DISPLAY program interprets the light-pen position and displays the numeric or Hollerith X and Y meaning for that point. The user may change the axis scale either to better analyze a given display area or to place the data in clearer perspective. In many cases, much of the data are bunched in a small area of the scope because of a few extremely large values. Changing the range of the display allows for an expansion of the bunched data for clearer insight into the data relationships. For appearance's sake, all titling can be altered; commentary may be added; data points or data sets may be deleted; Hollerith axis labeling can be changed. Columns or rows may be repositioned or eliminated. At any time during this process, if the user destroys the original display, he may by light-pen action return the scope to the standard graphic display. Back-up hard copy or auxiliary information from the data base is available through the teletype. While his picture remains on the scope, the user may request the TSS-LUCID query program to retrieve, format, and sort output to explain some quirk of the data revealed on the scope.

In addition, the user may save a particular graphic presentation and recall it at a later time. Very importantly, the user may use any of the touch-up options on any picture he recalls. The save and recall capability provides the flexibility to store on a single file graphical displays generated from several different data bases.

Application programming

Users who wish to manipulate data mathematically and logically, using more sophisticated operators than those provided by the standard DISPLAY program, can use TINT (a higher-order ALGOL-type interpreter) within DISPLAY. Although TINT requires that the user have some programming ability, it is designed to facilitate user interaction and incorporates many user-oriented features. The capabilities of TINT include full iteration control, arithmetic control, conditionals, indexing, parameter specifications, teletype print routines, code insertions, debugging aids, etc. TINT programs can be saved and recalled at a later time. To use TINT within DISPLAY, the user first specifies the data to be retrieved, and then indicates that he wants to operate TINT. DISPLAY initiates the data retrieval. When it is complete, DISPLAY turns over control to TINT. Then the user either specifies an already written TINT program, or writes a new TINT program on-line. The TINT program operates on the retrieved data and outputs a graphic data array, which in turn is fed into DISPLAY for scope presentation.

Use of DISPLAY

Several examples of use of DISPLAY are presented in this section, including actual scope photographs. The first example describes in some detail the interaction between the user and the system in the formulation of a display request. The remainder of the examples illustrate the different capabilities provided within DISPLAY and indicate the variety of applications to which the system has been applied.

Example 1

The user requests a scatter plot showing the amount of taxes paid versus the assessed valuation of property for cities in the State of California with a population of 50,000 or less. In Figure 1, the user has light-penned the parameter X-variable and is preparing to complete the specification. The right side of the scope contains the data base element names from which the user may select to define the X-variable. Shown with the element names are: (1) the element number, (2) the element type, and (3) the number of distinct values for each element in the data base. Hence for COUNTY, the element number is E2, the element type is H for Hollerith, and the number of different counties is 53. On the lower left hand side of the scope, available statistical operators are displayed which may be selected and applied to a numeric type data base element.

Figure 1 - Initial display showing the data base elements and the available parameters for the graphical display request

The user light-pens the element name TAXES, and the scope changes to that of Figure 2. The element name list has disappeared, since it no longer represents a legal input at this point in the input specification. Legal inputs shown are the parameter list and the relational EQ (equal), which appears in the lower left. Selection of EQ allows the user to specify particular data values for the selected variable. In this example, however, the user has satisfied the X-variable specification and is ready to select a different parameter.

Figure 2 - X-variable specification

The user selects Y-variable and light-pens ASSESSED VALUATION OF PROPERTY. In Figure 3, the user has completed specifying the Y-variable and is supplying the data subsetting clause using the WHERE parameter. Having light-penned ESTIMATED POPULATION, the user is
Legal inputs shown are the parameter list and the relational EQ (equal) which appears in the lower left. Selection of EQ allows the user to specify particular data values for the selected variable. In this example, however, the user has satisfied the X-variable Figure 3 - X-variable, Y -variable specification and start of data subsetting ciause Graphical Data Management in Time-Shared Environment 357 sented the set of legal relational~ in the lower left side of the scope. The user light-pens the relational LQ (less than or equal), and in Figure 4 receives a list of values associated with ESTIMATED POPULATION. This list gives him the first value, the last Figure 5 - Complete parameter specification for display request of example I Figure 4 - Display of value list for use in data subsetting clause value, and a selected sample of intermediate values. The user selects the value of 50,200 (closest value to 50,000) for use in the data sub setting clause. The user could have used the FINE light button to obtain additional data values to select from. After data value selection, the scope changes to Figure 5, where the user may continue specification of the data subset using the Booleans AND and OR, or, as in the example, light-pen the light button EXECUTE to begin generation of the display request. When the execute action is sensed, user inputs are converted into a data retrieval request, the search of the data base is performed, and an automatic display of selected data occurs (Figure 6). Each axis is automatically scaled and graduated to best reflect the data. Each axis is labeled with the appropriate data element plotted, and the title of the graph is the data sub setting clause provided in the WHERE parameter. To aid the user in data analysis, a set of touch-up options are displayed. In this example, AXIS SCALE is used to expand the area near the origin (Figure 7) to separate the bunching of data points. Light-penning the EXECUTE button results in a rescaling of the plot as shown in Figure 8. Also illustrated in Figure 8 is the use of READOUT. Any data point on the scope may be light-penned, and the actual data values stored in the data base for that point are displayed on the scope, together with a bright plus (+) superimposed on the selected data point. Figure 6 - Standard graphic presentation for example I 358 Spring Joint Computer Conference, 1968 Figure 7 - Use of axis scale Figure 8 - Rescaled graphic presentation showing point READOUT option Example 2 The user requests a scatter plot, showing the distribution of the total bonded indebtedness per county of cities with a populatio!1 of 20,000 people or less. Figure 9 shows the complete input specification for this display, and Figure 10 illustrates the resulting scatter plot. Scope capacity considerations occur when the Y-axis contains a large number of distinct Figure 9 - Parameter specification for example 2 Figure 10- Distribution by county of total bonded indebtedness for selected cities Hollerith values. The user is informed of the condition by the counter in the lower right hand corner of the display. In this example the counter reads 0032, meaning 32 possible data points are not shown. The . user can get access to these missin~ points by adroit handling of the DELETE option. For example, the user may delete the X-axis label and the title to cause the counter to go to zero, at which time the missing data points would be displayed. 
Example 3

A military data base on status of forces is illustrated next. The display request is for a line plot showing the total number of troops in training for each assigned readiness level, comparing the Army to the Air Force. Parameter specification for this is shown in Figure 11, resulting in the display shown in Figure 12. This display illustrates the use of multiple curve generation, which is provided by the ITERATE BY parameter. Each curve is numbered and a legend is provided to distinguish the several curves. READOUT also distinguishes the particular curve that a data point is on.

Figure 11 - Parameter specification for example 3 showing the use of the ITERATE BY parameter for multiple curve generation

Figure 12 - Line plot display showing total troop strength by readiness level

Example 4

The use of DISPLAY with scientific data is presented next. Oceanographic data measurements have been compiled into a data base containing the elements appearing in Figure 13. The illustrated scatter plot (Figure 14) shows a plot of salinity versus depth for a specific data subset. In this example, the automatic scaling algorithm is shown off to good advantage in that the X-axis is not plotted from zero but instead from a meaningful minimum value.

Figure 13 - Parameter specification for example 4 using the oceanographic data base

Figure 14 - Scatter plot showing the distribution of salinity by depth from the oceanographic data base

Example 5

The use of a statistical analysis program is represented in this example. A representative sample of oil credit card purchases was loaded in a data base, and DISPLAY was used to plot a frequency distribution of the individual invoice charges. The elements in the data base and the user's specification to achieve the display are shown in Figure 15. The resulting frequency plot (Figure 16) shows some interesting insights into customer buying, with large frequency peaks at the $2, $3, and $5 invoice charges.

Figure 15 - Parameter specification for example 5 showing the use of TINT

Figure 16 - Frequency distribution of invoice charges for an oil company's charge accounts

Example 6

The final example illustrates the use of DISPLAY with medical data received from the University of Southern California shock ward concerning the effects of the drug digitalis on patients. Some of the elements in the data base are illustrated in Figure 17, together with the input request. In Figure 18 the resulting display shows a regression plot of pulse pressure before the drug was used versus pulse pressure after the drug was administered. The discontinuity on the curve is a pictorial representation of the standard deviation of the data.

Figure 17 - Parameter specification for example 6 showing a partial list of data elements within the digitalis drug data base

Figure 18 - Data plot and regression curve analysis showing the effects of the use of digitalis on pulse pressure

CONCLUSION

In evaluating the success of this project, the designers find that the major goals have been met. One of these was rapid response. In most respects, response is well within the limits of user patience. The inputting of the initial parameters, the scope examination of the structure and content of the data base, the regeneration of the display in response to the user's most complex "touch-up" request - all appear nearly instantaneously.
Time for retrieval, however, is long. It varies from 15 seconds to five or six minutes elapsed time as a complex function of the size of the data base, the complexity of the retrieval statement, and the number and kind of other time-sharing users. The effect of the lengthy retrieval time, however, is much less disastrous than we had feared, for a user very quickly learns that retrieval takes time and resigns himself to this one wait much more tolerantly than we had anticipated.

Our second goal, that of easing the user's communication with the system, has been fully achieved. With the scope guiding the user at every step, users quickly arrive at the point of data analysis. Originally, we had hoped that a new user could learn to use DISPLAY on at least a basic level with no more than 30 minutes of training. Experience has shown this to be true. The ability to specify a complete graphic display through the use of the light pen, and to have the data base elements and data values automatically displayed when needed, facilitates user interaction and overcomes the normal anxieties experienced by new users not familiar with display systems. We feel that the communication channel which we have opened with DISPLAY may well be the most appropriate way for a nonprogrammer to communicate with many of the sophisticated computer programs presently being installed on third-generation machines.

Our third goal - that of attracting real live users from whose experience we could learn - has also been achieved. Since July, when it became usable, DISPLAY has proved to be sufficiently general-purpose in nature to solve a variety of problems. A large oil company used DISPLAY to study their distribution of credit card purchases with respect to dollar volume, gallons bought, and number of invoices per customer to better understand the customer population and buying habits. A research project within SDC used DISPLAY to analyze the results of treating shock patients with the drug digitalis. Another application in the medical field involved analysis of cancer research projects by state, by area, by university, and by topic. Still another example was the analysis of the type of sensing equipment used in satellite surveillance. It is important to note that these applications were conducted with only minimal training periods with DISPLAY, and in the main by nonprogrammers. These applications were accomplished by the people who had the data problem rather than by data processing personnel.

Currently, DISPLAY is available only for use with the SDC Q-32 Time-Sharing System in Santa Monica. Further research is planned using the man-machine interface techniques evolved from this project in a broader range of data management functions, including data base description, maintenance, report generation, and fact retrieval. Applying the work to small, tabular scopes without light pens is also being studied. The most immediate goal, however, is to redesign DISPLAY to operate under the SDC 360 Time-Sharing System in association with SDC's Time-Shared Data Management System.
On the formal definition of PL/I

by K. BANDAT
IBM Laboratory Vienna
Vienna, Austria

INTRODUCTION

This paper describes a formal definition of PL/I which has been produced by a group in the IBM Vienna Laboratory. The paper contains the outlines of the method rather than details of PL/I. The definition currently exists as a Technical Report, "Formal Definition of PL/I," 1 which contains the abstract syntax and the semantics for PL/I program text. A second version of this Technical Report is under preparation and will complete the description of PL/I in those areas where the first version omitted language features or showed major deviations from the current language. The language described in the Report is PL/I as specified in the PL/I Language Specifications of the IBM Systems Reference Library, 2 supplemented by additional information.

Needs for language descriptions

The new potential user of a programming language - somebody who is assumed to be familiar with high level programming languages in general - needs information on the language on several levels of precision and completeness. At the first contact with the language he would like to know the salient features of the language, the parts of the language which are similar to languages he is familiar with, and the new concepts in the language which mark the step forward in programming language development.
The user would like to see the potential areas of application for the new language. He should find all this information in an introductory document to the language, like a primer. This primer in an intuitive way explains the concrete representation of programs and data, the various data types and data structures of the language, the structuring of programs by blocks and procedures, the operations which can be used in expressions, etc. A primer need neither be a complete description of the language nor need it be precise in all details. For tutorial purposes simplifications and omissions can be appropriate.

A description of the language with an increased level of completeness and precision is required for the actual user who starts to write programs in the language. He needs to have the full information on the concepts of the language and a complete set of rules for writing programs which will be accepted by a compiler. He would also want to know what result the execution of his program will give. Traditionally these needs for most programming languages were served by language manuals which state fairly precisely how a program has to be written. For the meaning of a program, manuals frequently supply a natural language description which explains the semantics of the language in an informal way, leaving a certain number of questions open to the intuition of the reader, who has to generalize from examples in the manuals and from the interpretation of programs on existing compilers. These means will not satisfy the more advanced programmer who wants to know properties of the language or of a program in all details, say, e.g., in order to clarify whether two programs written to solve the same problem are in fact equivalent.

The highest level of precision in language description is needed for the implementer of a language. He requires a reference to the language which for every conceivable question can deliver a complete and correct answer. The implementer should neither be forced nor be allowed to answer questions on the language by his personal interpretation of a manual. The reference tool can also serve as the communication tool by which the language designer conveys the complete information on the language to the implementer. This makes it necessary that the method used in producing the reference document allows easy modification of the description for the incorporation of language changes and language extensions. We are convinced that the methods developed establish a tool for describing programming languages, and PL/I specifically, with a degree of precision which could not be achieved up to now by using informal methods.

Formal methods for language definition

If we accept the need for a rigorous, complete and unambiguous definition of a high level programming language, we have to find which methods and metalanguages can be used for the purpose. It is frequently claimed that it is appropriate and more convenient for the user to describe a programming language with the aid of a natural language as the describing metalanguage. This may be tolerable for a language manual which has to serve a tutorial purpose as well as to give a description of a language. However, for achieving a precise and unambiguous definition the use of natural language is inappropriate.
Natural languages are not well defined languages: they lack an exact syntax which allows the unambiguous parsing of sentences and clauses, and they employ words with multiple or vaguely defined meanings. Thus it can never be guaranteed that rules formulated in a natural language, unless rigorous restrictions are applied, will have a well defined and unambiguous meaning. Current experience with language manuals written in English has shown up this fact very extensively. Only a formal method used in defining a programming language will yield the required precision and unambiguity. Although formalization is a well established method in the foundations of mathematics and logic, it is only recently that attempts have been made to apply similar methods to programming languages.

In defining a language we have to distinguish between the syntax of the language, i.e., the set of formation rules defining all strings which are well-formed programs, and the semantics of a program, i.e., the meaning of a well-formed program. For the definition of the syntax of programming languages, Backus Normal Form or some of its equivalents are commonly accepted tools. For the definition of the semantics of a programming language two groups of methods are known: translation and interpretation.

The translational approach requires the existence of a completely defined language L. If a translator can be designed which translates any well-formed program, written in the language L' to be defined, into a program in the language L, then the definition of the language L together with the translator from L' to L completely defines the semantics of L'. In order to be precise and unambiguous, the rules translating the program have to be written in a metalanguage which itself is completely defined.

The methods of semantic definition by interpretation have in common that they specify a function or a process which for any given set of input data and any given program text yields the output data. This can be achieved by designing an abstract mechanism which serves the purpose of a machine for which the language to be defined is the machine language. The complete logical description of the programming language contains the description of the possible states of the abstract machine and the way these states are changed by interpreting pieces of program text.

Language definition by a compiler

Occasionally it is claimed that an implementation of a language is sufficient as the definition of the language, and that all questions about the language which cannot be answered by the manual can be answered by processing sample programs with the compiler. This of course can be claimed only for the time when a language has already been implemented, and not for the development phase of the language. A compiler, designed as a tool for converting a program written in a high level language into an object program in machine or assembly language which can be executed on a machine, has several drawbacks when used as the definition of the language. A compiler is defined and operable only in connection with its environment, i.e., the machine or assembly level target language and the actual machine. If a compiler should be used as the reference for a language, it has to be ensured that for the period of time while the compiler is used as the reference both this environment and the compiler itself remain unchanged. Currently this can hardly be guaranteed to its full extent for any implementation.
Furthermore, an implementation of a language defines many details in the semantics of a program which are not defined by the semantics of the programming language. Thus for PL/I the compiler contains a specific choice for the order of evaluation of operands of an expression, whereas the language leaves this order explicitly undefined. When a question concerning a language is settled by processing a characteristic program, it cannot be distinguished how far the result reflects the situation of the language and how far the result reflects specific implementation or hardware properties. It seems feasible to design a compiler, for reference purposes only, which avoids these problems. This compiler - besides requiring a fixed environment - would need information on all points where information in addition to the semantics of the language is needed for interpreting a program. In fact this would be an implementation of an interpretive formal definition.

The system constituting the formal definition of PL/I

In designing the system for defining PL/I, the required precision of definition, the presumptive user, and the impact on the language itself had to be considered. It was desirable to isolate the concepts and properties of PL/I from one another, avoiding unnecessary interrelations and cross references and allowing separate consideration and evaluation of the concepts. Specifically, a clear separation of all problems of program representation and notational conventions from the functional concepts of the language had to be achieved. Furthermore, it had to be shown in which areas the language requires a specific choice and additional definitions for a specific implementation.

The system designed for the definition of PL/I consists of several stages of processing and interpreting program text. The block diagram of Fig. 1 shows the elements of this system. The compile time facilities of PL/I are considered as a separate sublanguage, defining PL/I program text as the result of processing PL/I source text in a compile time preprocessor. A compile time concrete syntax defines well-formed source text. The concrete syntax of PL/I program text defines well-formed programs for semantic interpretation. Before the interpretation of program text, all semantically irrelevant representation properties are removed by converting concrete program text to abstract text, a tree representation of a program. The PL/I machine interpreting abstract text is so designed as to reflect the concepts of PL/I by the various constituents of its state.
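Read as a pipeline, Fig. 1 chains a preprocessor, two syntax checks, a translation to abstract text, and the interpreting machine. A schematic rendering in Python follows; the stage names track the figure, but the parameters standing in for the rule sets and processors are placeholders, not the Vienna definition itself:

```python
def formal_definition(source_text, input_data,
                      syntax, preprocess, translate, machine):
    """Stages of the PL/I definition system of Fig. 1 (schematic only).
    `syntax`, `preprocess`, `translate`, and `machine` are assumed stand-ins
    for the rule sets, string processor, and PL/I machine of the figure."""
    if not syntax.well_formed_source(source_text):     # compile time syntax
        raise ValueError("not well-formed source text")
    program_text = preprocess(source_text)             # compile time facilities
    if not syntax.well_formed_program(program_text):   # concrete syntax
        raise ValueError("not well-formed program text")
    abstract_text = translate(program_text)            # derivation tree -> abstract text
    state = machine.initial_state(abstract_text, input_data)
    while not machine.terminal(state):                 # the PL/I machine:
        state = machine.step(state)                    # state-transition interpretation
    return machine.output(state)                       # the meaning of the program
```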
Figure 1a - System for the formal definition of PL/I: definition of program representation. (The diagram traces any string of PL/I characters through the compile time concrete syntax separating well-formed source text from ill-formed character strings, the string processor generating PL/I program text, the concrete syntax of program text, the conversion of well-formed program text to a derivation tree according to the syntax rules, and the transformation of that derivation tree to abstract text conforming with the abstract syntax, rejecting non-transformable trees.)

Figure 1b - System for the formal definition of PL/I: definition of program semantics. (The diagram shows the generation of the initial state of the PL/I machine from the system-defined program environment, the abstract text and the input data; the prepass establishing the global parts of the program - allocation of static variables, linkage of external identifiers, updating of the program with the denotations of static and controlled variables and external entry names; and the semantic interpretation of the abstract text, leading to a terminal state which defines the meaning of a well-formed program. A program is valid if the abstract text could be interpreted up to its logical end, in which case the state contains the output data, and invalid if the interpretation became undefined and terminated before reaching the logical end.)

Trees and operations on trees

For the definition and handling of pieces of abstract text and parts of the state of the PL/I machine, a notation has been developed which, free from redundancy, reflects only the relevant properties of the considered objects. A class of abstract objects has been defined which can be represented by trees. For this class of objects, functions are given which enable the generation of trees from elementary objects and the modification of trees by deleting or changing subtrees. Properties of classes of abstract objects can be defined by an abstract syntax. The basic functions will be explained with the help of examples. In these examples the characters enclosed in < > are meta-names used for discussion only and denote an elementary or a composite object. Capital characters denote elementary objects. The abstract objects <x> and <x'> used as examples are shown in their tree representation in Fig. 2 and Fig. 3.

Selectors on trees

In trees - or, more precisely, in the abstract objects they represent - a subtree attached to a node is selected by a selector function.
Thus in the example of Fig. 2 the sub-objects of the composite object <x> are selected as follows:

<r> = sel-1(<x>)
<s> = sel-2(<x>)
<t> = sel-3(<x>)

Figure 2 - Composite object <x> represented as a tree

As <r> again is a composite object, its parts are selected by:

<u> = sel-1(<r>)
<v> = sel-2(<r>)

The object <u> can be obtained as a sub-object of <x> by a composite selector, where the dot serves for the functional composition:

<u> = sel-1 . sel-1(<x>)

The application of a selector to an object which has not been specified as selecting a proper object yields the null object Ω:

sel-4(<x>) = Ω

The function is-Ω allows checking whether a selector on an object yields a proper or a null object:

is-Ω(sel-3(<x>)) = FALSE   and   is-Ω(sel-4(<x>)) = TRUE

It is to be noted that selectors are functions that are local to the object for which they are defined; the same selector name sel-1 is used for the object <x> and for its sub-object <r>. Selectors on the same level do not imply an ordering. Thus a question "which is the first branch attached to the node" has no defined answer. Nodes of a tree do not, according to the definition, possess names. The meta-names in < > used here serve only for the explanation. If a node requires to be named, that name has to be attached as a separate branch. In the example of Fig. 3 the object <x'> possesses the name R, accessible by the selector s-name.

Figure 3 - Composite object <x'> represented as a tree

Generation of a tree

For the generation of an abstract object from its elementary objects, a generation function μ0 has been designed which establishes the arrangement of selectors and selected objects to form a new composite object. For the object <x> of Fig. 2 the generating function is written as:

<x> = μ0(<sel-1: μ0(<sel-1: <u>>, <sel-2: <v>>)>, <sel-2: <s>>, <sel-3: <t>>)

The pair <selector: selected object> is enclosed in pointed brackets, where the selected object may itself be a composite object, generated by a μ0-function, as is the case in the example given for the object selected by sel-1 on the first level.

Modification of a tree

For the modification of objects the modification function μ has been introduced. It contains as the first argument the object to be modified, while the other arguments are pairs of selectors and objects as in the generation function μ0. The function μ can be applied in several ways. If a selector of an argument has already been defined for the object to be modified, the object paired with the selector replaces the old selected sub-object. If the selector has not yet been defined for the object, the μ-function establishes a new branch for the object with the object from the pair as a new sub-object. The modification of the tree in Fig. 2 to form the tree in Fig. 3 would be written as:

<x'> = μ(<x>; …)

and the modification of the tree from Fig. 3 into the tree from Fig. 2 would be written as:

<x> = μ(<x'>; …)

For objects whose sub-objects are attached by selectors of the type elem(i), normal functions on lists such as length, head and tail have been defined.
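To make the behaviour of the selector, generation and modification functions concrete, here is a minimal sketch in modern notation (Python). The dict-based representation, the function names and the treatment of Ω as None are illustrative assumptions for this sketch, not part of the formal definition itself.

# Sketch only: abstract objects as Python dicts whose keys play the
# role of selectors; OMEGA stands for the null object.
OMEGA = None

def mu0(*pairs):
    # generation function: build a composite object from <selector: object> pairs
    return dict(pairs)

def mu(obj, *pairs):
    # modification function: replace, add or (with OMEGA) delete sub-objects
    new = dict(obj)
    for selector, sub in pairs:
        if sub is OMEGA:
            new.pop(selector, None)
        else:
            new[selector] = sub
    return new

def sel(name):
    # a selector as a function, so that composition models sel-1 . sel-1
    return lambda obj: obj.get(name, OMEGA) if isinstance(obj, dict) else OMEGA

def is_omega(obj):
    return obj is OMEGA

# The object <x> of Fig. 2:
x = mu0(("sel-1", mu0(("sel-1", "U"), ("sel-2", "V"))),
        ("sel-2", "S"),
        ("sel-3", "T"))

assert sel("sel-1")(sel("sel-1")(x)) == "U"   # the composite selector sel-1 . sel-1
assert is_omega(sel("sel-4")(x))              # an undefined selector yields the null object
x_prime = mu(x, ("s-name", "R"))              # toward the named object of Fig. 3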
The structure of the PL/I machine involves major parts belonging to a class of objects called directories. A directory is an object whose sub-objects are selected by unique names. A directory of type pred containing sub-objects of type pred-1 would be defined as:

is-pred = ({<n: is-pred-1> || is-unique-name(n)})

While in the definition of a list the condition following the vertical stroke defines the number of elements of the list, the condition following the two vertical strokes in the definition of a directory leaves the number of elements open but specifies the type of the selectors.

Abstract PL/I program text and syntax

For the definition of the meaning of a PL/I program an abstract form of this program is interpreted. The idea of representing a program in an abstract form has been shown by J. McCarthy. A well-formed concrete program has only one corresponding abstract form, whereas the abstract form of a program may have a multiplicity of concrete representations. The abstract form for a PL/I program - its abstract text - is the result of processing the derivation of PL/I program text - a PL/I program not containing compile time statements - on the translator as shown in Fig. 1. By a set of rewriting rules the translator eliminates all semantically irrelevant notations like delimiters, and explicitly establishes those declarations which in the concrete text are defined by default rules, factored attributes, and implicit and contextual declarations. The rewriting rules also establish expansions as defined for the LIKE attribute and for not fully qualified structure references, and insert system-defined initial condition enabling prefixes. The rewriting rules eliminate the information on the ordering of branches of the derivation tree of concrete text where the order is semantically irrelevant. The abstract text can be represented as a tree. For the abstract text a set of rules - the abstract syntax of PL/I - is given which defines the properties and structure of the tree representing the text. The rules define the branches which build the tree and the selectors which give access to these branches. As an example, the normal form of an assignment statement could be defined by the abstract syntax as:

is-assign-stmt = (<s-st: is-assign>, <s-pref: is-cond-name-set>, <s-left-part: is-reference-list>, …)

The storage part S is a directory of the form ({… || is-unique-name(a)}). The unique name a is an elementary address of storage. The value of a storage element is not directly accessible by the address, but via a second-level selector s-value. This property serves the purpose of separating, for a variable, the case of non-allocated storage from that of allocated but not initialized storage. For allocated but not initialized storage it holds that:

is-value-representation(s-value . a(S)) = FALSE

Block activation and block local state parts

One of the salient features of PL/I is the possibility of controlling the scope of names, the scope of condition enabling prefixes, and storage allocation by the procedure and begin-block structure of a program. The control part C, the dump D, and the block local state parts E, CS, and EI are those parts of the PL/I machine which serve specifically the purpose of interpreting the effects of block activation. An identifier is declared in a block by linking the identifier with a set of attributes in a declaration. The same identifier can be redeclared with a new set of attributes denoting a new entity in an inner block, and it denotes the new entity in the scope of this block. Thus an identifier may have various uses throughout a program. For resolving the problem of multiple use of an identifier the environment part E is used.
In the interpretation of a program on the PL/I machine, each identifier in its scope of declaration is associated with a unique name n from the unique name source. The pairs <identifier: n> for each block activation are collected in the environment part E. On establishing a block activation, E is generated from the environment part of a second block, updating it with the identifiers declared in the block being activated. If the activated block is a begin block, the second block is the dynamically preceding block activation. If the activated block is a procedure block, the second block is that block in which the procedure has been declared. In the united set, the association of a redeclared identifier with its new unique name replaces the old one.

On termination of a block all identifiers declared as variables of storage class AUTOMATIC lose their meaning. Storage which has been allocated for these variables has to be freed on block termination. A set of identifiers in EI serves the purpose of preserving, for use in the freeing of storage, all locally declared identifiers until block termination.

The condition status CS for a block activation contains all information on the condition enabling status as established by condition prefixes, and on the action to be performed when a condition is raised, as defined by the system action or by executed ON-statements.

The control C in any state of the machine contains the instructions to be executed next. Each block activation has its own level of control, which is established on block activation and deleted on block termination, and is transferred to the corresponding part of the dump if a new block is activated. The last instruction executed before the control for the current block becomes empty is the instruction performing the block termination.

The dump D of a state of the PL/I machine reflects the nested structure of block activations of the program being interpreted. Whenever a new block is activated, a new dump D is established, where the contents of the local directories, the dump and the control of the activating block are stacked on top of the old dump D. On normal termination of this block the old contents as stored in the dump D are re-established in the local directories and in the control, and the top level of the dump D is deleted.

Parallel task and event part PA

In the interpretation of a program containing parallel tasks and I/O events these parallel actions are sequentialized. All active tasks and events have entries in one of the global directories, the parallel task and event part PA, containing all information necessary for their interpretation. At appropriate points in the program interpretation a priority evaluation decides which task will execute the next instruction or instructions. Several directories of the PL/I machine shown in Fig. 6 are global for all tasks and I/O events. Others are local to a specific task or I/O event. Thus, e.g., the control part has to contain only instructions belonging to one task. These task-local state parts are established for the selected task after the priority evaluation, using the information kept for this task in PA. When the task loses control, the contents of the task-local directories are saved by transferring them back to PA.
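The following sketch, under the same illustrative dict conventions as before, shows how block activation might stack the local state parts on the dump D and rebuild the environment E. The class and method names are invented for the example; only E, C and D of the block-local parts are modeled.

class PLIMachine:
    def __init__(self):
        self.E = {}        # environment: identifier -> unique name
        self.C = []        # control of the current block activation
        self.D = []        # dump: stacked local parts of enclosing activations
        self.counter = 0   # stands in for the unique name source

    def new_unique_name(self):
        self.counter += 1
        return "n%d" % self.counter

    def activate(self, declared_identifiers, declaring_env=None):
        # stack the local parts of the activating block on the dump
        self.D.append((self.E, self.C))
        # a begin block starts from the dynamically preceding environment,
        # a procedure block from the environment of its declaration
        base = self.E if declaring_env is None else declaring_env
        self.E = dict(base)
        self.C = []
        for identifier in declared_identifiers:
            # a redeclared identifier gets a new unique name, replacing the old pair
            self.E[identifier] = self.new_unique_name()

    def terminate(self):
        # re-establish the old contents and delete the top level of the dump
        self.E, self.C = self.D.pop()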
Meaning of names and global directories

An identifier which has been declared as the name of an entity is associated with a complete set of attributes. In the abstract text - the normal form of the program to be interpreted in the PL/I machine - all declarations of one block are collected in the declaration part of this block. These declarations are interpreted in the prologue action for each activation of a given block. It is necessary to ensure that declarations can be retrieved during the execution of the program for various purposes, like matching of attributes in parameter passing, evaluation of attributes on allocation of controlled storage, or finding the block denoted by an entry name. This can be achieved either by searching through the abstract text whenever information on the text has to be retrieved, or by transferring information on the text to a specific part of the PL/I machine where it can be accessed without employing a text searching mechanism. The definition system for PL/I applies the second method.

Global directories contain the meaning of names, i.e., what a name denotes, which attributes are associated with it and, if relevant, which storage has been allocated to it. As already shown, in the environment for a block activation each identifier which can be used in a reference is associated with a unique name n. This unique name serves as a selector in the attribute directory AT and in the denotation directory DN, both of which are global directories of the PL/I machine.

The Attribute Directory AT associates the unique name of an identifier with the attributes declared with the identifier. The entry in AT is made as part of the prologue action which has to be performed in the interpretation of a block activation. The attributes declared with an identifier are transferred to the respective element in AT, denoted by the unique name, without performing any transformation or evaluation of the declaration. The declaration may contain expressions which for their later evaluation need the meaning of all names appearing in the expression. For this purpose the elements in AT contain an environment part in addition to an attribute part. The abstract syntax of AT is formally defined as:

is-at = ({<n: …> || is-unique-name(n)})

The attribute part, with the predicate is-attr, identical to an attribute part in abstract text, has the abstract syntax:

is-attr = (is-prop-variable ∨ is-data-param ∨ is-entry ∨ is-file ∨ is-based ∨ …)

A scalar proper variable of type arithmetic would have the following sub-structure of attributes:

is-prop-scal-variable = (<s-stg-cl: (is-static ∨ is-automatic ∨ is-ctrl)>, …)

The Denotation Directory DN associates the unique name of an identifier with the entity it denotes. An entry name denotes the body of a procedure and the environment of the declaration; a label denotes a statement list and a block activation identification; variable names denote generations, which basically are collections of storage addresses. The abstract syntax of DN is:

is-dn = ({<n: (… ∨ (…, <s-bl-idf: is-block-identification>) ∨ is-unique-name)> || is-unique-name(n)})

The Aggregate Directory AG has the form:

is-ag = ({<n: is-gen-list> || is-unique-name(n)})

An element of a generation-list, i.e., one generation, has the form:

is-gen = (<s-addr: …>, <s-da: is-da>)

Aggregates are denoted by unique names which are selectors to the aggregates in the aggregate directory. The generation serves the purpose of collecting the storage elements allocated for a variable and information on the data attributes. The respective parts of the generation are selected by s-addr and s-da. The notion of generation reflects a basic concept derived for the current formal definition of PL/I.
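In the dict-based sketch notation used earlier, the shapes of these directories might be checked as follows. The selector names written with ellipses above are represented only partially, the string encoding of unique names is an assumption, and a generation-list is modeled as a Python list.

def is_unique_name(n):
    # assumption: unique names rendered as n1, n2, ...
    return isinstance(n, str) and n[:1] == "n" and n[1:].isdigit()

def is_gen(obj):
    # is-gen = (<s-addr: ...>, <s-da: is-da>)
    return isinstance(obj, dict) and "s-addr" in obj and "s-da" in obj

def is_ag(directory):
    # is-ag = ({<n: generation-list> || is-unique-name(n)})
    return isinstance(directory, dict) and all(
        is_unique_name(n) and isinstance(gens, list) and all(map(is_gen, gens))
        for n, gens in directory.items())

def is_at(directory):
    # each AT element pairs an attribute part with an environment part;
    # the selectors "s-attr" and "s-e" are assumptions for this sketch
    return isinstance(directory, dict) and all(
        is_unique_name(n) and "s-attr" in elem and "s-e" in elem
        for n, elem in directory.items())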
In PL/I variables do not always possess private storage but may be based on already existent and allocated variables. This sharing pattern is reflected by the way in which two generations comprise the same elementary storage items a(S).

Example for the interpretation of a PL/I variable

A simple example will show how the various directories are involved in the interpretation of a PL/I variable. Let us assume a block B as:

BEGIN;
...
DECLARE X FLOAT (8) AUTOMATIC, Y, Z;
...
X = expression;
...
END;

The example of Fig. 7 shows the same block transformed into abstract text, where all declarations are collected and completed in a declaration part.

Figure 7 - Example for abstract text of a block B, containing the declaration of variable X

When the block B is activated during the interpretation of the abstract text, its declaration part is evaluated in the prologue action and parts are transferred to the local and global directories. All locally declared identifiers are linked to unique names. The new environment is made up by selecting the unique name of an identifier using the identifier itself as selector.

Figure 8 - Structure of an environment part containing unique names selected by identifiers

Continuing in the prologue action, an entry is made for each newly declared identifier in the attribute directory AT.

Figure 9 - Attributes and environment associated with nx in AT

Using the same unique names nx, ny, nz as selectors, the prologue action establishes entries in the denotation directory DN, taking unique names b from the set of aggregate names.

Figure 10 - Denotations for variables X, Y, Z in DN

When the denotation for the variable X has been established, a generation is created. The latter is established as an element of AG selected by bx. The generation-list of bx contains only one element, the variable X having been declared AUTOMATIC. Only in the case of the CONTROLLED storage class would the aggregate contain a list of generations. The generation contained in the aggregate for X denotes one storage item a(S) and the data attributes for a value which can be assigned to a(S).

Figure 11 - Generation for a variable X

In interpreting the statement-list of the block B there occurs a reference to X in an assignment statement. The value of the expression has to be assigned to the proper location in storage, which is found by the access chain

X → nx → bx → a

leading through E, DN, AG, and S.
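As a sketch of this access chain, again in the illustrative dict notation, with directory contents invented for block B:

E  = {"X": "nx"}                                      # identifier -> unique name
DN = {"nx": "bx"}                                     # unique name -> aggregate name
AG = {"bx": [{"s-addr": "a", "s-da": "FLOAT (8)"}]}   # aggregate -> generation list
S  = {"a": {}}                                        # allocated, not yet initialized

def assign(identifier, value):
    # follow X -> nx -> bx -> a through E, DN, AG and S
    unique_name = E[identifier]
    aggregate_name = DN[unique_name]
    generation = AG[aggregate_name][-1]            # AUTOMATIC: a one-element list
    S[generation["s-addr"]]["s-value"] = value     # via the second-level selector s-value

assign("X", 3.14)   # before this call, no s-value exists for the element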
Control part and state transitions

The control part C of the PL/I machine is significant for the changes of the state of the machine in the process of interpreting abstract text. Changes of any subpart of the PL/I machine - as in the example in the previous section - can only be performed by executing instructions which are elements of the control part.

The control part has the form of a tree where the nodes contain instructions. The formula representation equivalent to the control tree in Fig. 12 is used for representing a control tree in the formal definition:

instr-1; instr-2, instr-3; instr-4, instr-5

Figure 12 - Control tree and formula representation

The PL/I machine is a sequential machine, i.e., no two actions can be performed in parallel. Thus only language properties can be described which in their logical significance can be sequentialized. All terminals of the control tree designate instructions which are candidates for execution, but only one instruction at a time is executed. This choice of one of several instructions for execution reflects those situations in PL/I where the order of execution for some program parts is undefined, as, e.g., in operand evaluation.

Each instruction in the control tree can have arguments and a set of successor control trees. Only those instructions for which the set of successor control trees is empty are the proper instructions on the terminal nodes of the control tree which are candidates for execution. Thus instr-3 in Fig. 12 is no candidate for immediate execution. Each instruction is deleted from the control after it has been executed.

A form of the control tree using specific selectors is used to allow the evaluation of arguments of an instruction. In the control tree of Fig. 13 the arguments f1 and f2 are evaluated by the execution of the instructions instr-2 and instr-3:

instr-1(f1, f2); [f1: instr-2, f2: instr-3]

Figure 13 - Control tree, passing of argument values

A proper instruction in the Formal Definition of PL/I is defined in the following format:

instr-name(x1, …, xn) = RETURNS: e0
                        s-part1: e1
                        …
                        s-partn: en

Such an instruction on execution returns the value obtained in evaluating the expression e0, e.g., to an argument place if used as in Fig. 13, and changes the parts of the state of the PL/I machine selected by s-part1, …, s-partn. The arguments x1, …, xn may contain pieces of the abstract text to be interpreted. The expressions e0, …, en are expressions in the metalanguage of the formal definition, i.e., functions on abstract objects involving μ-functions and selectors. The arguments x1, …, xn can appear in these expressions.

As an example, the process of updating DN with a new denotation for a declared variable with the unique name nx, as shown in the example in the previous section, would require a control tree for execution:

update-dn(nx, bx); [bx: un-n]

The instructions would be defined as:

Def.: un-n = RETURNS: head(N)
             s-n: tail(N)

Def.: update-dn(t1, t2) = s-dn: μ(DN; <t1: t2>)
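A sketch of this control-tree mechanism in the same illustrative Python notation: nodes carry an instruction and successor subtrees keyed by argument slots, a node is a candidate only when its successor set is empty, and each executed instruction is deleted from the control. The class, the random choice among candidates and the calling conventions are assumptions for the sketch.

import random

class CNode:
    def __init__(self, fn, successors=None):
        self.fn = fn                        # the instruction proper
        self.args = {}                      # argument values filled in by successors
        self.successors = successors or {}  # slot name -> successor control tree

def step(node):
    # execute one terminal instruction; returns True while work remains
    if not node.successors:
        return False
    # the choice among candidates is deliberately left open, mirroring the
    # undefined order of, e.g., operand evaluation
    slot = random.choice(list(node.successors))
    child = node.successors[slot]
    if child.successors:
        return step(child)
    node.args[slot] = child.fn(**child.args)   # value passed to the argument place
    del node.successors[slot]                  # instruction deleted after execution
    return True

# Fig. 13: instr-1(f1, f2); [f1: instr-2, f2: instr-3]
tree = CNode(lambda f1, f2: ("instr-1", f1, f2),
             successors={"f1": CNode(lambda: "value of instr-2"),
                         "f2": CNode(lambda: "value of instr-3")})
while step(tree):
    pass
result = tree.fn(**tree.args)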
… ρ((x,y) g(x,y))(a,b) = g(a,b) follows:

*a ∧ *b ⊃ ρ((x,y) g(x,y))(a,b) = λ((x,y) g(x,y))(a,b)

We generalize this from two arguments a, b to k arguments A1, A2, …, Ak, and decide on the following method:

2.5 Evaluate all expressions (FL A1m1 A2m2 … Akmk), where FL is the original RHO-expression, except that the RHO has been changed into a LAMBDA, and each Aimi is a one-member subset of the argument Ai.

2.6 In step 2.5, we obtain one ambject as value for each possible combination of one-member subsets. By virtue of rule R2 above, each such ambject is a subset of the ambject created in (2.2). Mutual references are therefore put on the SUPERSET viz. SUBSET properties.

3.1 Checks whether the fact a ⊆ b is already known (i.e., whether the ambject b is already on the SUPERSET property* of a). If so, it returns; otherwise, it continues.

3.2 Adds each member b' of the SUPERSET property of b (including b itself) to the SUPERSET property of a. If we know *a (i.e., if a has the flag STAR), we also apply the operators of b' to a; see step 3.4.

3.3 Adds each member a' of the SUBSET property of a to the SUBSET property of b. If we know *a', we also apply the operators of b to a'; see step 3.4. Then return.

3.4 To apply the operators of b'' to a'', first retrieve the OPERS property of b''. It is a list of ambjects whose MEANING properties are forms (F1 … b'' …), where F1 is a RHO-expression. It would be possible and legal to re-evaluate these RHO-expressions as described in steps 2.4-2.6.** In step 2.5, where we form all combinations of one-member subsets, we would then obtain combinations where a'', the one-member subset of b'', is included, which is what we desire. However, we would also obtain combinations where other subsets than a'' are used, and these combinations have already been considered. To avoid this inefficiency, we first put the SUBSET property of b'' on the push-down list, and replace it with another property that only contains a''. After that, all forms on the OPERS property of b'' are evaluated, and finally the old SUBSET property is restored from the push-down list.

4. There is a function SETSTAR, which is similar to SETSUBSET. It puts the flag STAR on its sole argument, and triggers the necessary operators. The details are analogous to those for SETSUBSET.

*More precisely, … is a list of all ambjects that stand for subsets of the set that a stands for.

**Step 2.6 must be slightly modified; the ambject created in step 2.5 is now to be a subset of the ambject that carries the RHO-expression, i.e., the ambject on the OPERS property of b''.

The above algorithm can be considered as an encodement of the rules R1-R3 for RHO-expressions and the * predicate, as given at the beginning of this section. As each occurrence of a symbolic function is reported via SYMQUOTE and may trigger operators, axioms for symbolic functions and predicates may be encoded as RHO operators.

Logic with four truth-values

We agree that a symbolic predicate should be a special kind of symbolic function. It is desirable that the difference between predicates and other functions should be kept as small as possible. In particular, to make it possible to use predicates inside RHO-expressions, we should let the value of a symbolic predicate be a set of truth-values. Let t and f be the original truth-values, defined on relations between objects. Introduce the sets

T = {t}
F = {f}
S = {t, f}   (stands for "sometimes")

and the empty set { }. It easily follows that

*p ∧ p ⊆ T ⊃ p = T   (R4)

This rule is important and will be used below. We now extend the ordinary logical connectives to this four-valued logic. The connectives shall satisfy the general axiom

fn(a ∪ b) = fn(a) ∪ fn(b)

so we have e.g.

T ∨ T = {t} ∨ {t} = {t} = T
T ∨ S = {t} ∨ ({t} ∪ {f}) = ({t} ∨ {t}) ∪ ({t} ∨ {f}) = {t} ∪ {t} = T

Let us not here delve further into this logic.
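This four-valued logic can be rendered directly. In the sketch below, frozensets of the two Python truth values stand for T, F, S and the empty set, and lift encodes the axiom fn(a ∪ b) = fn(a) ∪ fn(b); the name NEVER for the empty set is an invention of the sketch, since the paper leaves that set unnamed here.

T = frozenset({True})
F = frozenset({False})
S = T | F               # "sometimes"
NEVER = frozenset()     # the empty set of truth values

def lift(fn):
    # extend a two-place connective pointwise to sets of truth values
    return lambda a, b: frozenset(fn(x, y) for x in a for y in b)

OR  = lift(lambda x, y: x or y)
AND = lift(lambda x, y: x and y)

assert OR(T, T) == T
assert OR(T, S) == T          # ({t} v {t}) U ({t} v {f}) = {t} U {t} = T
assert AND(T, S) == S
assert OR(NEVER, T) == NEVER  # the empty set absorbs, consistent with the axiom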
One of our early examples of the use of RHO-expressions was

ρ((x) settrue(admire(x, father(x)))) (boy ∩ discussed-obj)

If we assume the very reasonable axioms

*a ⊃ *father(a)   and   *b ∧ *c ⊃ *admire(b,c)

we can re-write the operator in the equivalent form

settrue( ρ((x) admire(x, father(x))) (boy ∩ discussed-obj) )

If we evaluate the operator in this latter version, the following will happen (although not necessarily in this order):

1. The ambject boy ∩ discussed-obj is evaluated and assigned some properties by operators that are triggered by the use of the function ∩.

2. The RHO-expression is applied to its argument. A new ambject, af, is introduced.

3. The function SETTRUE is applied to af, which is then set equal to the set T through a property.

4. Let r be an ambject which has obtained the flag STAR, and which has the ambjects BOY and DISCUSSED-OBJ on its SUPERSET property. When the last of these three conditions becomes satisfied, the above RHO-expression will be triggered. The system evaluates father(r) and admire(r, father(r)). These will of course be new ambjects. If the above axioms are properly encoded, the system will put the flag STAR on them. Moreover, by the specification of how to handle RHO-expressions, the system will evaluate setsubset(admire(r, father(r)), af), where af is the ambject introduced in step (2). By rule (R4) above, the ambject admire(r, father(r)) will be set EQUAL to af and therefore to T.

This was one example of how RHO-expressions can sometimes be evaluated for their value, rather than for their side-effect.

Differences from LISP 1.5

Through the introduction of new types of functions, the traditional means of handling functions in LISP become inconvenient. We found an alternative system of conventions, which may have some interest in itself.

The association-list in ordinary LISP assigns values to atoms. The value may be a FUNARG-expression (in which case the atom can be used as a function symbol) or an arbitrary S-expression (in which case the atom can only be used as the argument of some function). On the association-list, it does not matter whether an atom stands for a function or something else, but in other parts of the LISP system it does. On property-lists, functions are defined as EXPR-properties or FEXPR-properties; other values as APVAL-properties. If the atom G has the EXPR-property gg, then the two expressions (FUNCTION G) and (FUNCTION gg) are equivalent, but if the atom A has the APVAL-property (aa), the two expressions (QUOTE A) and (QUOTE aa) are not at all equivalent.

In LISP A, the distinction between functional and other values is abolished. This leads to the following consequences:

1. The pseudo-function FUNCTION is superfluous. We use QUOTE instead.

2. When LAMBDA-expressions are used directly in forms, they must be quoted. To avoid confusion, we introduce the symbol ETA to be used instead of LAMBDA. Thus the following would be a correct form:

((QUOTE (ETA (X Y) (........)))
 (..........)
 (............))

3. When function definitions (e.g., LAMBDA-expressions) are named by atoms, they are put as APVAL-properties (viz. as VALUE-properties in PDP-6-LISP-type implementations).

4. Different kinds of functions, which were previously distinguished through their attributes (EXPR, FEXPR, SUBR, etc.), must now be distinguished in some other way. We use the following transformations (sketched in code at the end of this section):

4.1 A previous EXPR of the form (LAMBDA a b) is re-written in the form (ETA a b).

4.2 A previous FEXPR of the form (LAMBDA (a1 a2) b) is re-written in the form (PHI a1 b).

4.3 A previous SUBR is re-written in the form (ECODE . a), where a is the starting address for the machine code routine.

4.4 A previous FSUBR is re-written in the form (FCODE . a), with a as for ECODE.

5. The general evaluation function for LISP A, evala, evaluates the leading function of a form just like any of the arguments. It therefore becomes possible to evaluate the expression for a function immediately before it is used.
For example, we can write ((DERIVATIVE SINE) X), where (DERIVATIVE SINE) evaluates into the value of COSINE,(*) which is then applied to the value of X.

The above changes have the obvious advantage of preparing the ground for the two types of functions in LISP A: symbolic functions and RHO-expressions.

To increase compatibility with LISP 1.5, it is possible to define the functions LAMBDA and LABEL as PHI-expressions. If we let the value of the atom LAMBDA be (PHI R (CONS (QUOTE ETA) R)), we can use LAMBDA-expressions as leading functions in forms, just like in LISP 1.5 (LAMBDA-expressions that were EXPR's or FEXPR's must of course still be transformed into ETA- or PHI-expressions). A similar definition of LABEL becomes a little bit more involved.

(*) The value of the atom SINE is most likely an expression (ECODE . a). The function DERIVATIVE takes this as argument and uses it in an expression of the type derivative(f) = if ... else if f = sine then cosine else ... The value of (DERIVATIVE SINE) is therefore an expression (ECODE . b), which also happens to be the value of the atom COSINE.
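The re-writing of consequence 4 is mechanical. Here is a sketch with S-expressions modeled as nested Python lists; the old attribute is supplied explicitly as an argument, since in LISP 1.5 it lived on the property list, and the function name convert is an invention of the sketch.

def convert(kind, defn):
    # re-write a LISP 1.5 definition by the transformations 4.1-4.4
    if kind == "EXPR":                 # (LAMBDA a b) -> (ETA a b)
        _, args, body = defn
        return ["ETA", args, body]
    if kind == "FEXPR":                # (LAMBDA (a1 a2) b) -> (PHI a1 b)
        _, (a1, _a2), body = defn
        return ["PHI", a1, body]
    if kind == "SUBR":                 # machine code at starting address defn
        return ("ECODE", defn)         # models the dotted pair (ECODE . a)
    if kind == "FSUBR":
        return ("FCODE", defn)
    raise ValueError(kind)

print(convert("EXPR", ["LAMBDA", ["X", "Y"], ["CONS", "X", "Y"]]))
# ['ETA', ['X', 'Y'], ['CONS', 'X', 'Y']]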
The first of them is: "The extent to which an expression is evaluated is controlled by the currently-available information context. The result of the evaluation is a ne~ expression,. open to accommodate new increments of pertinent information by simply evaluating again with a new information". In LISP A, we can write expressions which satisfy this by using RHO-expressions. (We can also avoid paying the cost for it by using LAMB.DA-expressions.) The "current information context" for forms with a RH O-expression as their leading function is information about subsets of the arguments in this form. We would say that the form has been completely evaluated when all one-member subsets of all arguments are explicitly known, and all combinations of them have been considered by the system. Usually, this is not the case, and evaluation is performed to an extent "controlled by the currently-available information context". As we have seen, forms with RHO-expressions are stored away in such a way that they are "open to accommodate new increments of pertinent information" about subsets. The LISP A system does not satisfy Lombardi ~nd Raphael's second requirement ("algorithms, data, and the operation of the computer (!) should be specified by a common language") or their third condition (this language should be understandable by untrained humans). But neither does their incremental LISP, nor any other system we have heard of. Our system is extremely inefficient in terms of computer time, and it can be assumed that theirs is less wasteful. On the other hand, LISP A seems to have the following two advantages over their system: (I) When an expression is evaluated incompletely for lack of information, the system remembers 383 this and resumes evaluation when further, pertinent information increments become available. (2) Through the introduction of ambjects and symbolic functions, our system comes closer to having "a large, continuous, on-going, evolutinary data base", which should be the characteristic environment of an incremental computer. The data base of Raphael's program is identical to that of LISP 1.5, i.e. it is restricted to property-lists of atoms. Lom.bardi has later published a more extensive treatise of incremental computers.8 He there concentrates on the basic representation of data in core, and introduces his own system with three references in each cell. These seem to be quite different problems from the ones tackled in this paper, and a comparison is therefore not attempted. Future Developments of LISP A. The following developments seem natural: A. Write a machine-coded version of evala. B. Introduce a notation through which pseudoparallell execution of several expressions can be performed. This is very natural, since e.g., the order of evaluation of the specializations of a RHO-expression is immaterial. C. Attack storage problems by making use of backing storage, drum or disk. Facilities for parallel execution of expressions then become very important, because they help us to use the time when we are waiting for information from backing storage. SUMMARY LISP A is a modification and extension of LISP 1.5. Besides minor modifications, two new types of functions have been added to the language. One type (symbolic functions) is used to create and extend the data base. If ISF ATHER is a symbolic function, evaluation of (lSFATHER JOHN DICK) will create a representation for the relation in the data base, without asserting its truth. 
This representation can then be used with conventional LISP functions, which set it true, ask whether it is true, etc. The other type (RHO-expressions) can be used to write a kind of rules of inference, which are automatically triggered in desired situations. The LISP A system is governed by such RHO-expression operators, which trigger each other. There is no coherent program, just a set of operators which communicate through the changes they cause in the data base. The paper gives a general description of the LISP A system. 384 Spring Joint Computer Conference, 1968 ACKNOWLEDGMENTS The author is indebted to professor John McCarthy at Stanford University for his kind guidance during the year 1966/67, when the work reported here was started. He would also like to thank fil.lic. Jacob Palme of the Research institute of Nationai Defense, Stockholm, Sweden, for many valuable discussions during the last year. REFERENCES 1 J McCARTHY Recursive functions of symbolic expressions and their evaluation by machine, part I Communications of the ACM 3 (April 1960) p 184 2 J McCARTHY et al Lisp 1.5 programmer's manual The M.I.T. Press 1962 3 CWEISSMAN LISP 1.5 primer Dickenson Publ Co Belmont Cal 1967 4 E C BERKELEY et al The programming language LISP: Its operation and applications The M.I.T. Press 1966 5 DGBOBROW N aturallanguage input for a computer problem solving system Doctoral Thesis M.I.T. 6 BRAPHAEL SIR: A computer program for semantic information retrieval Doctoral Thesis, M.I.T. 7 L A LOMBARDI B RAPHAEL LISP as the language for an incremental computer in Ref. 4 8 L A LOMBARDI Incremental computation In Frank L Alt ed Advances in Computers Vol 8 Academic Press New York 1967 TGT: Transformational grammar tester by DAVE L. LONDE and WILLIAM J. SCHOENE System Development Corporation Santa Monica, California INTRODUCTION Chomsky defines a generative grammar as one that "attempts to characterize in the most neutral possible terms the knowledge of the language that provides the basis for actual use of language by a speaker-hearer." 1 It is "a system of rules that in some explicit and welldefined way assigns structural descriptions to sentences."1 The syntactic component of such a grammar specifies the well-formed strings of formatives (minimal syntactically functioning elements) in the language and assigns structures to them. Transformational grammars are built on the concepf of the logical separation of two types of structure, th, deep and the surface structure. Accordingly, there are two systems of rules in the syntactic component of a transformational grammar: phrase structure rules generate deep structure, taking the form of a labeled 'bracketing or tree; transformational rules map trees to other trees and determine the ultimate surface structure of a sentence. The deep structure is operated on by the semantic canponent of the gran mar to determine meaning; the surface structure is interpreted by the phonological component. The emphasis on explicitness is a distinct advantage of generative grammars. However, it imposes a great burden on the linguist. Given a deep structure, the determination of the applicable rules in the derivation of the surface structure of a particular sentence is an extremely tedious and time-consuming task, difficult to perform with accuracy. For example, verifying by hand the correctness of the derivation of typical sentences in the IBM Core Grammar2 took us on the average two hours per sentence. 
And, as the grammar becomes large (as the linguist attempts to account for more phenomena in the language he is describing) it becomes more difficult to provide for all of the possible interrelations of the rules. *The work reported herein was supported in part by contract F I 962867COOO4, Information Processing Techniques, with the Electronic Systems Division, Air Force Systems Command, for the Advanced Research Projects Agency I nformation Processing Techniques Office. 385 TGT is a system of computer programs that can provide assistance to linguists in building and validating transformational grammars. The name TGT '"Transformational Grammar Tester," connotes ~ more ambi~ious project than was intended or undertaken. Not enough is known about some of the components (e.g., the semantic component)3 to warrant testing. And although there has been a considerable amount of work done on phonological rules, TGT has been addressed almost exclusively to the debugging and maintenance of the syntactic component. For more than a decade, computers have been used experimentally to process segments of natural language text according to syntactic rules. The experiments have included both synthetic syntactic processing (generating sentences, together with their structural descriptions according to some previously specified grammar) and analytic syntactic processing (recognizing for a given sequence of formatives considered to be a sentence, each structural description, if any, that the grammar can assign to the sequence). 4.5 Unlike TGT, projects engaged in syntactic processing of natural language have seldom had as their sole objective the refining and extending of the grammar initially specified; often this has not been the primary objective. Nevertheless, important contributions to grammar validation have been made by Petrick,6 although the technique used, that of recognition, is appropriate only for grammars that are relatively complete. The system for syntactic analysis at The MITRE Corporation, described in Zwicky, et al. [1965],1 and modified as described in Gross [1967],8 is currently being used to perform some of the same functions included in TGT. An off-line system for compiling, updating and testing the IBM Core Grammar was written in LISP 1.5 by Blair [1966]9 and is currently in use at the IBM Thomas J. Watson Research Center, Yorktown Heights, New York. Further work on an off-line computer system for grammar testing by synthesis and by constrained 386 Spring Joint Computer Conference, 1968 "random" generation is being currently performed by Dr. Joyce Friedman at the Department of Computer Sciences, Stanford University, with whom we have probitably exchanged information. TOT was designed with the.goai of being as generai as possible while still accommodating its im~ediate and primary user, the Air Force UCLA English Syntax Project (AFESP)* which is currently reviewing and attempting to integrate the· work that has been done on transformational analyses of English. While it was impossible for AFESP at the outset to make definite decisions regarding its needs, since principled decisions on matters such as rule conventions eouid oniy be made after the data were gathered and analyzed, it seemed apparent to us that its needs could better be served by a new system, that (among other things) would be user-oriented and interactive, would handle subcategorization and selectional features, and would facilitate extension and modification. 
(However, -for comments on factors limiting these goals, see the Discussion section at the end of this paper.) System overview In constructing a grammar with TGT, the linguist will be asking the same questions and employing many of the same procedures that he would have used were the tester not available. With the tester, many of those amenable to programming can now be done by the computeL The current version ofTGT is programmed in the JTS version of the JOVIAL language and occupies 38,800 words of 48 bit memory, operating under the time-sharing system for the AN/FS Q-32 computer. An ITT model 33 teletype is used as the standard input/output device. A CRT with a capability of displaying 680 characters on a 1024 x 1024 matrix is an optional device for output. CRT output is faster and more readable; but its use, because of hardware requirements, dictates proximity to the Q-32. Teletype output is not subject to such distance limitations and is frequently used, even by those who have a CRT, as a means of obtaining hard copy output. The most important tasks to be performed by TGT center around its ability to execute transformations. Usin~ TGT the linguist can determine the applicability of transformations, execute them -and -display their results. Many ancillary functions are provided for specification and manipulation of rules and test structures. Combinations of these functions are employed to aid the user in determining the implications of changes in the rule set. For example, the linguist can save entire derivations of sentences that he considers correct. He may then insert a new rule, or change or delete an existing one, and then have the computer apply the rule set to the base phrase markers of the saved sentences. Changes in the derivations can then be immediately determined. C apabil~tie s* A. Displaying and saving trees Most TGT operations apply to the most recent or "current" tree in memory. Trees may be created by the primitive DISPLAY (abbreviated as D). The tree in Figure 1 could have been initially input by typing: ,.0 (S(# PRE(Q) NP(DET(ART(WH DEF») N [THING <+N> <+PR0> <-HUMAN> <-SG>]) AUX(T(PAST»'VP(V[DISAPPEAR <+V>] (#») (0 The components of the tree (categories, features, and complex symbols) are autom'atically numbered. The "current" tree may be named and saved if desired by typing SAVE or S followed by an alphanumeric string. If there is already a tree in the system with that name, TGT informs the user, who is then given the option of changing the name of the new tree or replacing the old tree. The most recently used trees are saved in core memory. The others are stored on disc. Saved trees may be restored at any time by using the DISPLAY primitive with the name of the saved tree. This then qualifies as the most recently displayed or "current" tree. Trees are initially left-justified on the scope. If the entire tree cannot be displayed, a message is output indicating the numbers of the top-most nodes of the major subtrees that are not displayed. Normally a right-justify command (RJ) is sufficient to display the missing subtree(s). However, an additional command, CENTER, is also useful in these cases. Accompanied by a number, it causes the subtree with that number to be displayed exclusively. CENTER also enables the user to position the current tree anywhere on the scope by specifying the direction in which the tree is to be moved (left, right, up, or down) and the number of units it is to be moved. 
The tree in Figure 1 may be listed on the teletype by typing the command LIST, reSUlting in the TTY printout shown in Figure 2. It is seldom necessary to input an entire tree parenthetically as in (1) above. Most desired trees can be produced by altering a few basic trees with the following primitives: * The official title of the project is, Integration of Transformational Studies on English Syntax, Principal Investagator- Robert P. Stockwell. Co-Principal Investigators - Paul M. Schachter and Barbara Hall Partee. *A more detailed account of system capabilities can be found in "TGT User's Manual," by W. J. Schoene, System Development Corporation document (in press),lO Transformational Grammar Tester (2) (3) (4) (5) (6) (7) Input* ERASE E AD A ALS A ARS A SUB A EAD Nl (8) EALS (9) EARS (10) ESUB Nl Nl Nl Nl Nl Nl Nl Nl Nl N2 N2 N2 N2 387 Explanation Erases subtree whose number is N 1 (that is, N 1 and all that it dominates). Add A as a daughter to N 1. Add A as a left sister to N 1. Add A as a right sister to N 1. Substitute A for N 1. Erase N 1 from its original position and add it as a daughter to N2. Erase N 1 and add it as a left sister to N2. Erase N 1 and add it as a right sister to N2. Erase N 1 and substitute it for N2. *Where N 1 and N2 r!!present the numl?ers. of nodes of the current tree and A may be one of the following: the number of a node in the current tree, the name of a saved tree, or a structure (subtree, feature or complex symbol) input in the parenthetic notation of (1) above. B.Phrase structure rules Phrase structure rules are not essential to the operation of TG T, but their presence permits a legality check on the trees. The command VERIFY, input from the teletype, will return the message YES or NO indicating that the current tree could or could not have been produced by the phrase structure fules. In the present system, the complex symbols (e.g., those symbols enclosed in straight line brackets in Figure 1 and numbered 11 and 22), are ignored by the VERIFy subroutine. As in Chomsky [1965)1 phrase structure rules are expected to be context free. They may be entered into the system by typing the identifier PS followed by the symbol to be expanded, an arrow (minus sign and greater-than sign) and the string indicating the legal . expansion. PS VP- > AUX (MV (NP, NIL), Figure 1- CRT displ~y of a tree input from teletype and representing the structure of the sentence, "What things disappeared?" COP (NP, ADJ, NIL» . (11) Exainple (11) above indicated that VP may be expanded as anyone of the following strings: (S 1) (if 2) (FnE 3) (Q 4) (lIP 5) (D~ 6) (ART 1) (WH 8) (DEF 9) (N 10) (*cs* 11) [ (THING 12) «+N> 13) «+PRO> 14) «-00> 16)J (AUX 17) «-HmWl> 15) (T 18) (PAST 19) (vp 20) (v 21) (*cs* 22) [ ( DISAPPEAR 23) «+v> 24)] (iJ 25) Figure 2 - Teletype representation ofthe tree in Figure I AUX AUX AUX AUX AUX MV MV COP COP COP NP NP ADJ Parentheses indicate that of the strings within that are .separated by commas, just one is to be chosen. A NIL within the parentheses indicates that none need be chosen. PS S- > (S CONJ S (CONJ S, NIL)*, NP VP) ,(12) 388 Spring Joint Computer Conference, 1968 Example (12) indicates that S may be rewritten as NP S C' .:) S etc. VP CONJ CONJ CONJ S S CONJ S CONJ C" .:) S CONJ S Thus, an asterisk following parentheses indicates reapplicability of the set within. The expansion of a symbol can be changed merely be reinputting the symbol and its new expansion. TGT saves only the most recent. c. 
C. Transformations*

*A detailed description of the algorithm used to determine proper analyses and to perform structure changes can be found in "An Algorithm for Pattern Matching and Manipulation with Strings and Trees" by D. Londe and W. Schoene, System Development Corporation document (in press).11

Transformations are input via the teletype using the operator RULE. The canonical form is:

RULE rule-name structural-description => structure-change . conditions

where the conditions are optional.

RULE HYPOTHETI X A *1B *2(C, D E, *3F) X G *4B X => 1 = 2 + 1; 2 = 0; 3 = 0. 4 EQ 1.   (13)

In our format, only structures to which a structure change or a condition applies need be numbered. Any structure, including features and dominating and dominated symbols, may be numbered. (However, the variable X is subject to certain limitations.) A number immediately preceding a choice set applies to each of the left-most members of the set that are not already numbered. Thus, if a certain proper analysis** of the "current" tree is found, the change 1 = 2 + 1 will add the structure headed by C or D as a left sister to the structure headed by B and then erase the original C- or D-dominated structure. If, however, the proper analysis was found with *3F following *1B, the structure headed by F is erased. This rule applied to the tree in Figure 3 yields the tree in Figure 4.

**A formal description of proper analysis can be found in Petrick [1965].6

Figure 3 - Sample tree to which transformational rule HYPOTHETI is applied

Figure 4 - Results of applying rule HYPOTHETI to the tree in Figure 3

A special condition on this transformation, which must be satisfied before any structural changes can be performed, is that the structure headed by *4B be the same as the structure of *1B. TGT is designed to facilitate the addition of conditions on transformations. There is no limit to the number of conditions that may be imposed on a structure.

Rule H2 in (14) below introduces a notation for expressing proper analyses of subtrees within the current tree, and for specifying fixed-length variables and features.

RULE H2 X [B *1C D]A /<+F> <+G>/ E *2N *3P [X /<+H-I>/ Z X]J X => 1 = 1 + *4M; 4 <L 2; 4 <L 3.   (14)
Rule H2 specifies that the E following [B C D] A must immediately dominate the features < + F > and < + G >. Where the linguist is concerned that a symbol dominate a particular feature at some unspecified distance, he may make use of the notation indicated by [XI <+H - I> /Z X]J in rule H2, where Z is a variable of fixed length, one node. We thus preserve our .convention of specifying the immediate dominator of features. A very convenient innovation is illustrated in the structure change portion of this rule. Generally structure changes manipUlate structures that are present in the tree before execution of the rule. Thus, when we specify that some structure numbered 2 is to be added as a daughter to some structure 1, and that 1 is to be adjoined as a right sister to 3, we intend to adjoin 1 in its original state, i.e., before 2 was added. In TGT we have provided a notation that allows numbers to be assigned to structures that are created by the structural change portion of the transformation, whether these be newly introduced constants or structures that were already in the tree and moved to new positions. These new structures may subsequently be moved or have structures adjoined to them. In H2 we are adding a new symbol M as a right sister to *1C, and to this new symbol we are adding *3P and *2N as left daughters. Rule H2 applied to the tree in Figure 5 yields the tree in Figure 6. RULE is used only to define transformations and assign them names. (As with trees, TGT informs the user when the name is not unique.) Transformations Figure 5 - Sample tree to which transformational rule H2 is applied Figure 6 - Results of applying rule H2 to the tree in Figure 5 *Chomsky [1965] 1 mentions the possibility of intrinsically ordered rules. Although we could have accommodated this concept in TG T by arbitrarily assigning numbers to rules and using a random number generator, we deferred this until someone attempts to write a grammar where the rules are not extrinsically ordered. 390 Spring JointComputer Conference, 1968 TEST, EXECUTE and DERIVE actually apply. the t~anformations to the current tree. In general, TEST is used to determine whether the structural description of a transformation successfully applies to the current tree. The TEST command is accompanied by the name of a rule or the names of the first and last rule of a rule sequence, or the word ALL, in which last case every transformation in the cycle and post-cycle is matched against the current tree. The name of the rule and the message YES or NIL are output for each rule as it is tested. For each transformation that successfully applies, the user is given the option to execute the structural changes of the transformation and continue testing, to execute and wait for another command, to continue testing without changing the current tree, or to wait for a new command. EXECUTE may specify one rule or the first and last of a sequence of rules. No message, other than the rule names and YES or NIL is output. This command is used in preference to TEST when the user knows that he wants the structural changes of each successful transformation to be performed. Traditionally, transformations are applied to structures headed by the symbol S. * Where the "current" tree has more than one S, the user may wish to specify the number of the particular subtree to which he intends TEST or EXECUTE to apply. DERIVE applies each transformation in the cycle to the "lowest" S in the tree. 
It recomputes the "lowest" S after each cycle until all subtrees have been processed. It then applies all of the postcyclic rules. DERIVE records the S node number associated with each cycle and the transformations that were executed during the cycle.

It is not meaningful to talk of the execution times of transformations, because they will vary with the tree, the rule, the conventions of interpretation, and, of course, the computer. However, to give some rough idea of the speed of TGT on the Q-32, we input rules 1-13 of IBM Core Grammar I (except rule 4, which has complex special conditions on its application) and applied them to test trees 28 and 29 of that grammar. The average execution time per rule was .04 seconds.

D. Other capabilities

The command MATCH enables the linguist to determine easily how a successful transformation applies to a tree, by printing on the TTY those symbols of the structural description that were successfully matched by a node in the tree and by printing that node number.* After HYPOTHETI has applied to the tree in Figure 3, the command MATCH would output:

X 5   A 7   B 8   C 18   G 21   B 24   (15)

*Variables are not printed when they encompass no nodes.

Any successful transformation may be reapplied to determine whether it could have applied to the current tree in more than one way. The command AGAIN causes the search for a proper analysis to continue by successively backing up from the last node matched as if there had been no match. Thus, in (15) above, the rule interpreter would look below node 24 for another B; finding none, it retreats to G 21, ignores the original match, and eventually finds the analysis below:**

X 5   A 7   B 8   C 18   G 23   B 24   (16)

Applied once more, AGAIN would eventually look below 18 for a C, a D followed by an E, or an F, and it would find the analysis indicated by (17) below:

X 5   A 7   B 8   F 33   X 20   G 21   B 24   (17)

**The reference to the A over A convention under (17) is pertinent to this analysis.

The analysis that includes 33F and 23G can be attained by invoking a special condition, condition 6. This suspends the A over A convention,* allowing the interpreter to go below a symbol, A, to find another A. This condition is symbol specific in TGT and thus may be selectively applied among the symbols of a structure description. Where a successful match is made on an optional symbol immediately preceded or immediately followed by an X, there is always at least one alternate analysis, that where the NIL is chosen.

*A discussion of this convention and exceptions can be found in Chomsky [1962]12 and Chomsky [1964].13

For each transformation, the command INSERT allows the user to specify information concerning its operation in the rule set: its order of operation; whether it is cyclic or postcyclic; whether it is obligatory or optional; whether (if successful) it may be reapplied to the tree before the next sequential transformation; and whether the interpreter may look below an S in trying to find a proper analysis.

The command EDIT facilitates the correction of faulty transformation input by removing the necessity to retype the entire rule. This command is accompanied by two strings of symbols, X and Y. EDIT searches the erroneous rule for string X and replaces it with string Y.

A printout of the names of all the trees in the system at any time can be obtained with the command TREES.
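The alternate-analysis search performed by AGAIN can be pictured as an exhaustive backtracking enumeration. The Python sketch below is a simplification of our own making, not TGT's matcher (which also handles choice sets, dominance brackets, and the A over A convention; see Londe and Schoene11): here the variable X may cover any run of frontier symbols, and the analyses emerge in backing-up order.

    def analyses(symbols, pattern):
        """Enumerate every way `pattern` can span the whole symbol string.
        'X' is a variable matching any run of symbols; other pattern
        elements must match a single symbol exactly."""
        def match(si, pi, picks):
            if pi == len(pattern):
                if si == len(symbols):
                    yield list(picks)
                return
            if pattern[pi] == "X":                 # try every run length
                for j in range(si, len(symbols) + 1):
                    yield from match(j, pi + 1, picks)
            elif si < len(symbols) and symbols[si] == pattern[pi]:
                picks.append((pattern[pi], si))
                yield from match(si + 1, pi + 1, picks)
                picks.pop()
        yield from match(0, 0, [])

    # Two proper analyses of the same frontier, as AGAIN would find them:
    for a in analyses(list("ABGBGB"), ["X", "A", "B", "X", "G", "B", "X"]):
        print(a)
    # [('A', 0), ('B', 1), ('G', 2), ('B', 3)]
    # [('A', 0), ('B', 1), ('G', 4), ('B', 5)]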
Similarly, all rule names are printed in response to the command RULES. The primitives DISPLAY and LIST also accept rule names as input and will either print on the TTY or display on the CRT scope the appropriate transformation. Trees and rules are erased from the system's memory by means of the DELETE command accompanied by the rule or tree name.

Upon termination of a run, TGT presents the user with the opportunity of saving on magnetic tape the trees and rules he may have created during the run. Each initiation of TGT allows the user to input a file of trees and rules from tape, or to start up with neither rules nor trees in the system. Thus, many versions of a grammar may be extant and may be undergoing modification and testing simultaneously. Files of trees, rules, and derivations may thus be accumulated from run to run and may subsequently be used by others and merged with the files created by other linguists.

The program

The organization of the TGT program is straightforward, consisting of an executive routine and an indefinitely extensible set of service routines, which perform the basic system tasks such as tree creation and manipulation, rule testing, and generation of displays and teletype output. In TGT's basic interactive cycle, the executive routine interprets each teletype request to determine which service routine is needed, reads and legality-checks the parameters associated with the request, and transfers control to the appropriate service routine. At the completion of its task, the service routine returns control to the executive routine and the cycle is ready to begin again.

Of more particular interest are the functions associated with the interpretation of transformations. When the transformation is input, each symbol name is looked up in the dictionary and is replaced by its dictionary number. If the symbol is not present, it is added to the dictionary. The rule is then converted to a form convenient for interpretation and resides in a table whose capacity is, at present, 1100 entries. Each symbol in the transformation occupies an entry. Each entry contains the following information: dictionary entry number of this symbol; location in this table of the next sequential symbol; location of the previous sequential symbol; location of the next optional symbol (in the case of a symbol within choice parentheses); and location of the first special condition on this symbol.

X *1A ([C D]*2Z, F G) T => 1 = 0. Z NQ 1.   (18)

The transformation (18) above would be represented in the rule table as indicated below.

Entry   Symbol*   Next Sequential   Previous Sequential   Optional Next   Condition Pointer/Type
1       X         2                 0                     0               0
2       A         3                 1                     0               0
3       Z         10                2                     4               6
4       F         5                 2                     0               0
5       G         10                4                     0               0
6       --        9                 0                     2               4
7       C         8                 0                     0               0
8       D         0                 7                     0               0
9       --        0                 0                     7               1
10      T         0                 3                     0               0

(Entries 6 and 9 hold special conditions rather than symbols.)

*For convenience the actual symbol is used here.

Entry 6 above represents the special condition imposed on the symbol Z. The number 4 indicates which condition is imposed (nonequality), and the number 2 under column "Optional Next" of entry 6 is a parameter pointing to the structure headed by the symbol A in entry 2. We handle recursion as a special condition. Thus the symbol Z has two conditions, 4 and 1. Condition 1 is always the last condition to operate, since it essentially effects a re-entry of the procedure, at which point a proper analysis consisting of the symbols C and D is attempted for the substructure headed by this Z.

The numbers of the tree nodes corresponding to rule symbols are saved in this table so that the tree can be changed in case the pattern match is successful, and so that the pattern matching algorithm can try all possibilities.
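The entry layout just described maps naturally onto one record per table slot. The following Python rendering is a rough illustration only - the field names are ours, the values follow the reconstruction above, and the two condition entries are omitted - showing how the symbol chain of rule (18) might be stored and walked:

    from collections import namedtuple

    # One record per rule-table entry; 0 means "none"; entries are 1-based.
    Entry = namedtuple("Entry", "symbol next_seq prev_seq optional_next condition")

    rule_table = {
        1: Entry("X", 2, 0, 0, 0),
        2: Entry("A", 3, 1, 0, 0),
        3: Entry("Z", 10, 2, 4, 6),   # alternative at 4; condition chain at 6
        4: Entry("F", 5, 2, 0, 0),
        5: Entry("G", 10, 4, 0, 0),
        7: Entry("C", 8, 0, 0, 0),    # sub-analysis attempted below Z
        8: Entry("D", 0, 7, 0, 0),
        10: Entry("T", 0, 3, 0, 0),
    }

    def symbols_in_order(table, start=1):
        """Walk the next-sequential chain, as the interpreter does when
        seeking a proper analysis (choice sets and conditions omitted)."""
        i = start
        while i:
            yield table[i].symbol
            i = table[i].next_seq

    print(list(symbols_in_order(rule_table)))   # ['X', 'A', 'Z', 'T']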
If, for example, the node to the right of A in the tree does not dominate a C and D and is not the symbol F, the pattern matching algorithm will back up to the tree node recorded in entry 2 as matching rule symbol A. The variable X is extended to encompass this A, and the algorithm attempts to find another A. A flow diagram and more details of the program can be found elsewhere.10,11

DISCUSSION

We feel that TGT is a tool that is flexible in its capacity to perform operations and is convenient with regard to the manner in which commands may be expressed. We found that within the present framework we could often handle conditions attached to the structure change portion of a transformation, although we had not intentionally designed the system to do this. Other difficulties have been resolved by breaking complex rules into several simpler rules. We anticipate that many future problems will be solved by expanding the existing set of conditions, which can be expressed directly in the rule.

Some other linguistic capabilities that the user may need are less tractable. The advisability of using an entirely different rule schema for handling conjunction has been demonstrated in Schane [1966]14 and Schachter [1967].15 Plans to program the Schachter schema and integrate it into the current model of TGT are under way. Several different conventions regarding the movement of tree branches by structure changes have been provided internally, but have not yet been made easily accessible to the user. More programming would be necessary, however, were a set of tree-pruning conventions, such as suggested in Ross [1965],16 to be adopted.

A computer system such as TGT can make significant contributions in testing a lexicon and integrating it into a transformational grammar. We are at present familiarizing ourselves with various proposals regarding the form of the lexicon and conventions for lexical insertion.

The authors have no illusions regarding the comprehensiveness or generality of TGT. Transformational grammar is in a dynamic state of development. By the time a system of programs is written and checked out, it is in danger of being obsolete. However, a linguist who sets out to write a grammar faces the same problems. Once he has written a number of rules, he may not readily change rule conventions unless he is willing to play the Red Queen, who must run as fast as she can just to keep from falling behind.

ACKNOWLEDGMENTS

We are indebted to Dr. Barbara Hall Partee of AFESP for her criticisms and suggestions regarding the design and use of TGT, to John Olney of System Development Corporation for his contributions to the initial specifications, and to Dr. Joyce Friedman of Stanford University for suggesting the utility of a single node variable.

REFERENCES
1 N CHOMSKY, Aspects of the theory of syntax, The MIT Press, Cambridge, Massachusetts, 1965
2 P S ROSENBAUM, D LOCHAK, The IBM core grammar of English, in Specification and Utilization of Transformational Grammar, Scientific Report No 1, IBM Corporation, Yorktown Heights, New York, 1966
3 N CHOMSKY, The formal nature of language, Appendix A in E H Lenneberg, Biological Foundations of Language, John Wiley & Sons, New York, 1967
4 D BOBROW, Syntactic theories in computer implementations, in H Borko, Automated Language Processing: The State of the Art, John Wiley & Sons, New York
5 BUNKER-RAMO CORPORATION, A syntactic analyzer study, Final Report under Contract AF 30(602)-3506, Canoga Park, California, 1965
6 S R PETRICK, A recognition procedure for transformational grammars, unpublished doctoral dissertation, MIT, Cambridge, Massachusetts, 1965
7 A M ZWICKY, J FRIEDMAN, B C HALL, D E WALKER, The MITRE syntactic analysis procedure for transformational grammars, MTP-9, The MITRE Corporation, Bedford, Massachusetts, August 1965
8 L N GROSS, On-line programming system user's manual, MTP-59, The MITRE Corporation, Bedford, Massachusetts, March 1967
9 F BLAIR, Programming of the grammar tester, in Specification and Utilization of a Transformational Grammar, Scientific Report No 1, IBM Corporation, Yorktown Heights, New York, 1966
10 W J SCHOENE, TGT user's manual, System Development Corporation document, in press
11 D L LONDE, W J SCHOENE, An algorithm for pattern matching and manipulation with strings and trees, System Development Corporation document, in press
12 N CHOMSKY, The logical basis of linguistic theory, Ninth International Congress of Linguistics, Cambridge, Massachusetts, 1962
13 N CHOMSKY, Current issues in linguistic theory, in Fodor and Katz, The Structure of Language, Prentice-Hall, Englewood Cliffs, New Jersey, 1964
14 S A SCHANE, A schema for sentence coordination, MTP-10, The MITRE Corporation, Bedford, Massachusetts, April 1966
15 P SCHACHTER, A schema for derived constituent conjunction, unpublished working paper, UCLA Air Force English Syntax Conference, September 1967
16 J R ROSS, A proposed rule of tree pruning, unpublished paper presented to the Linguistic Society of America, December 1965

DATAPLUS - A language for real time information retrieval from hierarchical data bases

by NORMAN R. SINOWITZ
Bell Telephone Laboratories
Holmdel, New Jersey

INTRODUCTION

To the average person, a computer user is synonymous with a programmer. In fact, to the average programmer, a user is synonymous with a programmer. Information retrieval is an area in which the computer user is not - or, at least, should not be - required to be a programmer. This paper describes a real time information retrieval system which is at once highly accessible, relatively inexpensive, and simple enough to be used by nonprogrammers.

The system was implemented early in 1967 on a GE-265 computer operating under the time-sharing monitor developed by the Missile and Space Division of the General Electric Company (Valley Forge, Pennsylvania). Access by users is thus from remote teletypewriters communicating over standard telephone lines. By implementing a system on a commercially available time-shared computer service, the effort of developing the time-sharing software is eliminated, and the cost of the development is spread over the many users of the service. The retrieval system was field tested without any investment in expensive hardware.
By providing the system with a powerful information processing ability, highly efficient use can be made of the potential information content of the data, with a consequent high performance/cost ratio of the system.

What makes the system easy to use is its query language. Called DATAPLUS, the name derives from the system's ability to access data, plus the ability to process data. The DATAPLUS "compiler," which can be more accurately described as an incremental interpreter, was implemented in FORTRAN, and was designed for real time information retrieval from a hierarchical data base. At his remote terminal, the DATAPLUS user types his information request in the form of a message consisting of a continuous flow of statements resembling English. Erroneous or unintended statements can be altered by merely retyping them. Providing the user with these facilities has in most instances overcome the inertia people have against learning to "program." Furthermore, these facilities enable the user to spend more time thinking about his problem, because he is, to a large extent, thinking in the way he is accustomed.

Since a data item is often meaningful at a level of the hierarchical structure other than that at which it appears in the data base, flexibility is provided for addressing and manipulating the data at various levels. The language is also "open-ended" in the sense that the user is given the ability to create functions of items appearing in the data base, and to instruct the computer to operate on these functions in the same way that it operates on items already in the data base.

DATAPLUS was implemented in order to retrieve information useful to engineers at Bell Telephone Laboratories. So as not to burden the reader with telephone jargon, we shall exemplify our discussion by referring to a familiar (but hypothetical) data base. In addition to being intelligible to a wide audience, the hypothetical data base was chosen because it is structurally isomorphic to the data base for which the system was implemented.

The data base

Consider a department store chain which has branches in a number of the larger cities in the conterminous United States. The company operates throughout most of the country, with a corporate headquarters and separate divisions in the various states. The state organizations are further broken down by cities. Figure 1 shows the hierarchical structure for this data base. At the top is the corporate level. The second and third levels represent, respectively, the state and city divisions. The focal level is the fourth level, that is, the STORE level. Each store has one or more departments, and each DEPARTMENT sells at least one ITEM.

Figure 1 - Hierarchy in the department store data base

Data are collected and stored at the STORE, DEPARTMENT, and ITEM levels. Each datum - which we henceforth call a variable - has a name and a value. A partial list of representative variable names and their levels is shown in Table I.
TABLE I - Partial list of representative variable names and their levels

Variable Name          Level
STORES                 Store
STORE NAME             Store
ADDRESS                Store
COUNTY                 Store
EARNINGS               Store
DEPRECIATION           Store
DEPARTMENTS            Dept
DEPARTMENT NAME        Dept
SALES FORCE            Dept
DOLLAR SALES           Dept
ITEMS                  Item
ITEM NAME              Item
INVENTORY NUMBER       Item
IN STOCK               Item
ON ORDER               Item
PURCHASE PRICE         Item
SELLING PRICE          Item

A variable whose value is intrinsically numeric will be called a range variable; a variable whose value is intrinsically alphameric will be called a nonrange variable. Thus EARNINGS and IN STOCK are range variables, while STORE NAME and INVENTORY NUMBER are nonrange variables.

General description of the syntax

Like any language, natural or artificial, DATAPLUS has syntax, semantics, and vocabulary. Syntactically and semantically, DATAPLUS resembles English. The basic vocabulary consists of a few key words, such as "total," "distribute," "and," "or," "when," and a set of nouns representing items appearing in the data base. This section gives an overall picture of the syntax; vocabulary and semantics are considered only when necessary to the discussion. We begin by citing three examples of messages written in the DATAPLUS language.

Example 1 - Suppose we wish to find the total number of stores in Michigan and New Jersey which have any HI FI departments. The message could be ...

TOTAL STORES: IN MICHIGAN; NEW JERSEY: WHEN ANY DEPARTMENT NAME = HI FI: GO:

Example 2 - Suppose we wish to obtain a frequency distribution of stores versus dollar sales per store in Miami and Tampa. The message could be ...

DISTRIBUTE STORES: BY DOLLAR SALES PER STORE: BETWEEN 0 AND 2000000 IN STEPS OF 20000: IN FLORIDA, MIAMI, TAMPA: GO:

Example 3 - Suppose we want to extract the store name, the names of the departments, and the earnings/sales ratio for stores in Cook County, Illinois. The message could be ...

LET RATIO = EARNINGS/DOLLAR SALES PER STORE: EXTRACT STORE NAME, RATIO, DEPARTMENT NAME: FROM STORES: IN ILLINOIS: WHEN COUNTY = COOK: GO:

A message in DATAPLUS thus consists of a continuous flow of statements,* each beginning with a key word, such as "TOTAL," "WHEN," "IN," "LET," and ending with a colon. The final statement in the message must contain the single word "GO." Every message must have an "IN statement," i.e., a statement beginning with "IN." This informs the program in which states and cities the user is interested.

*A statement in DATAPLUS is either a complete sentence, a prepositional phrase, or an adverbial clause.

There are three basic operations in DATAPLUS as implemented to date: TOTAL, DISTRIBUTE, and EXTRACT. Every message must contain a statement that begins with one of the basic operation words TOTAL, DISTRIBUTE, or EXTRACT. Any range variable may be summed by using the TOTAL statement. DATAPLUS automatically supplies an average along with the total. Thus, "TOTAL EARNINGS:" will cause the system to furnish the average earnings/store.

If there is a DISTRIBUTE statement in the message, there must also be a BY statement and a BETWEEN statement. If we visualize a typical histogram, the DISTRIBUTE statement contains information pertaining to the Y axis variable; the BY statement contains information pertaining to the X axis variable; the BETWEEN statement contains information pertaining to the limits of the X axis and the class interval size.

A WHEN statement is optional in every message. Its purpose is to specify the search. In Example 2, the search is implicitly specified as being over every store in Miami and Tampa; in Example 3, the search is explicitly specified for stores in Cook County.
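The statement structure just described - colon-terminated statements, each opened by a key word, GO closing the message, and a retyped statement overriding its predecessor - can be sketched in a few lines of Python. This is purely illustrative; the actual system was written in FORTRAN on the GE-265, and a full parser would need special handling for LET, which may occur more than once in a message:

    def statements(message):
        """Split a DATAPLUS-style message into colon-terminated statements."""
        return [s.strip() for s in message.split(":") if s.strip()]

    def parse(message):
        """Group statements by leading key word; a later statement with the
        same key word overrides the earlier one (the retype-to-correct rule)."""
        parts = {}
        for stmt in statements(message):
            key = stmt.split()[0].upper()
            if key == "GO":
                break
            parts[key] = stmt
        return parts

    msg = "TOTAL STORES: IN MICHIGAN; NEW JERSEY: WHEN ANY DEPARTMENT NAME = HI FI: GO:"
    print(parse(msg))
    # {'TOTAL': 'TOTAL STORES',
    #  'IN': 'IN MICHIGAN; NEW JERSEY',
    #  'WHEN': 'WHEN ANY DEPARTMENT NAME = HI FI'}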
The LET statement can be used to define a created function of variables in the data base. A LET statement may appear anywhere in the message, provided that the variable that it creates has not been referred to earlier in the message.

Blanks are allowed as they are in English: a blank may not be placed in the middle of a word, but as many blanks as the user wishes may be placed between words. Any special symbol, such as + - * / , = : ( ), may be surrounded by as many blanks as desired.

The order of the statements - other than the LET and GO statements - is arbitrary. For example, the user may type the IN statement before the TOTAL statement, or vice versa. Any statement may be overridden by merely retyping it. Thus, the user can change his mind about the content of any portion of the message, provided he has not typed GO. If he has typed GO and there are errors in his message, appropriate diagnostic comments are printed. He need then only retype the erroneous statements in the message, followed by the GO statement.

Running the program

Once the program is loaded, it communicates its sole signal for user information requests by typing "?:=", after which the user starts typing. The user must be careful not to end a line in the middle of a word or number: unlike English, DATAPLUS does not allow hyphenation. (However, we have allowed the typesetters to hyphenate DATAPLUS messages in the printing of this paper.) If a message has syntactic or semantic errors, or if not enough user information is supplied, appropriate diagnostics are typed, and the system again returns with its perennial ?:= symbol. Once a valid message is given, the system searches the data, types out the processed results, again types ?:=, and the cycle begins anew. Thus, a request for a distribution may be followed with a request for extracting data - without reloading the program.

When the cycle begins anew, the program assumes that a new message will follow. This assumption can be suppressed by typing "EDIT:" followed by those portions of the message that the user wishes to edit. Thus, after the computer had printed its results for the message in Example 3, the user could continue with ...

EDIT: WHEN COUNTY = DU PAGE: GO:

and have the computer supply the same results for Du Page County.

The ability to shift from one operation to another combines with the EDIT feature to provide an extremely useful capability of DATAPLUS - the capability for browsing and "hunch pursuit." Upon examining the computer's reply to one message, the user is often stimulated to formulate another message. Experience with the system has demonstrated that a user frequently finds himself as part of this "feedback loop" - coming away from the system with information much more valuable to him than the information he originally intended to seek.

The vocabulary

The basic vocabulary has two portions - the key words and the nouns. We have already been introduced to most of the key words. This section is concerned with nouns, which include the state and city names and the names of variables such as those given in Table I. Although a variable appears on a given level in the data base, it can be used on a higher level in a message. An example of this "raising the level" of a variable is the noun DOLLAR SALES, which appears as a DEPARTMENT level variable in Table I and was used as a STORE level variable in Example 2. In that example the variable of interest was the DOLLAR SALES PER STORE, viz., the dollar sales summed over all departments in the store. In DATAPLUS, the level of any range variable may be raised.
This is accomplished by following the variable name (as it appears in Table I) by the word PER and then by the word DEPARTMENT or STORE. The variable DEPARTMENTS is a range variable which has the value unity for every department, and the variable ITEMS is a unity-valued range variable for every item in the data base. Since these are both range variables, their levels may be raised. Thus, to obtain a picture of the number of different types of items sold in stores in Dallas, one might say ...

DISTRIBUTE STORES: BY ITEMS PER STORE: BETWEEN 0 AND 700 IN STEPS OF 7: IN TEXAS, DALLAS: GO:

This will produce a distribution with stores as the Y axis variable, items per store on the X axis, with 100 class intervals each of size 7 items. In this example, the level was raised from item level to store level - i.e., two levels. If the user tries to "lower the level" of a variable, he will get a diagnostic message. Clearly, the level cannot be lowered, since this is a piece of information unobtainable from the data base. It is perfectly valid to follow the word PER by the name of the level on which the variable was collected. In fact, if the user is unsure about the level, he can always play safe by following the variable name with PER and then the level.

The vocabulary can be obtained from the system at run time. If the user types an invalid state name, the computer will type out a list of valid state names. If the user types a valid state name but an invalid city name, the computer will type a list of valid city names for that state. A listing of valid variable names may be obtained by putting the statement "CATALOG:" anywhere in the message. Furthermore, the user can obtain definitions of any of the variables by using the DEFINE statement. For example, the statement DEFINE IN STOCK, ON ORDER: may be placed anywhere in the message, and the computer will furnish concise definitions of the variables IN STOCK and ON ORDER.

Creating variables

DATAPLUS has provision for extending its vocabulary, that is, for adding to its nouns. This is accomplished by the LET statement, which we mentioned briefly in Section III. A created variable in DATAPLUS is always a range variable. In the previous section we saw how to create certain variables that are not in the data base. Those "raised level" variables (since they had to be summed) were range variables. In Part A of the present section we describe how to create algebraic functions of variables. Since we are performing algebraic operations, the operands, as well as the function, must be range variables. Part B of the present section shows how to create special variables, called qualified variables.

A. Algebraic functions

An algebraic function is formed by algebraic manipulations on range variables appearing in the data base, on raised level variables, or on variables created in earlier LET statements. Some examples are ...

(a) LET PERCENT = (SELLING PRICE - PURCHASE PRICE)/SELLING PRICE:
(b) LET RATIO = EARNINGS/DOLLAR SALES PER STORE:
(c) LET CASHFLOW = EARNINGS + DEPRECIATION:

The algebraic operators are +, -, *, /. Exponentiation is not allowed. The hierarchy of operations is the same as in FORTRAN. Redundant parentheses are ignored. (Unary operations are not allowed; thus it is not valid to say LET X = -EARNINGS:) The level of all variables appearing on the right hand side of the equal sign must be the same. (The construction would be semantic nonsense otherwise.) The created variable is assigned this level.
Once a variable is defined in a LET statement, it may be used in any subsequent statement where a range variable of that level is permitted, including another LET statement. Suppose we wish to obtain a distribution, for stores throughout the United States, of cash flow versus earnings. The message could be ...

LET CASHFLOW = EARNINGS + DEPRECIATION: DISTRIBUTE CASHFLOW: BY EARNINGS: BETWEEN 0 AND 500000 IN STEPS OF 50000: IN COMPANY: GO:

B. Qualified variables

Qualification is accomplished in DATAPLUS by enclosure in parentheses. (This construction is a familiar one in English.) Some examples are:

(a) LET X = IN STOCK(INVENTORY NUMBER = A0578):
(b) LET Y = DOLLAR SALES(DEPARTMENT NAME = HABERDASHERY):
(c) LET Z = ON ORDER(INVENTORY NUMBER = B6724): LET T = Z PER STORE:

The variable X is an item level variable. Its value is the number of units in stock that have an inventory number of A0578. The variable Y is a department level variable. Its value is the dollar sales for the department if the department is a haberdashery department, and zero otherwise. The variable Z is an item level variable. The variable T is a store level variable, and represents the number of units on order in the store which have inventory number B6724. In general, the value of the created variable is the value of the qualified variable if the equality enclosed in parentheses is true, and zero otherwise. The level of the created variable is assigned the level of the qualified variable. Once a variable is created by qualification, it may be used in any subsequent statement where a variable of that level is permitted, including another LET statement.

Suppose we wish to determine, for Chicago, all stores which have more than 30 TV sets in stock. The message could be ...

LET TVSUPPLY = IN STOCK(ITEM NAME = TV SET): EXTRACT STORE NAME, ADDRESS: FROM STORES: IN ILLINOIS, CHICAGO: WHEN TVSUPPLY PER STORE = 30 TO 9999: GO:

Boolean operations

Most information retrieval systems allow for Boolean operations on index terms. DATAPLUS allows for Boolean conjunction and (inclusive) disjunction by use of the words "AND" and "OR" in the WHEN statement. For example, suppose we wish to find the total number of furniture departments in stores in Denver that have a sales force of at most 14 people and dollar sales of more than $100,000. The message could be ...

TOTAL DEPARTMENTS: IN COLORADO, DENVER: WHEN DEPARTMENT NAME = FURNITURE, AND SALES FORCE = 1 TO 14, AND DOLLAR SALES = 100000 TO 9999999: GO:

If we are interested in finding the number of departments whose name is either "furniture" or "childrens furniture," the message could be ...

TOTAL DEPARTMENTS: IN COLORADO, DENVER: WHEN DEPARTMENT NAME = FURNITURE, OR DEPARTMENT NAME = CHILDRENS FURNITURE: GO:

Intersection and union may be used in the same WHEN statement, with the AND taking precedence over the OR. Thus the message ...

TOTAL DEPARTMENTS: IN COLORADO, DENVER: WHEN DEPARTMENT NAME = FURNITURE AND SALES FORCE = 1 TO 14, OR DEPARTMENT NAME = CHILDRENS FURNITURE AND SALES FORCE = 1 TO 14: GO:

will total those departments whose name is furniture and which have a sales force of at most 14 people, or whose name is childrens furniture and which have a sales force of at most 14 people.

It is also possible to use the qualified variable as a means for specifying conjunction. Thus, the messages ...

TOTAL STORES: IN TENNESSEE: WHEN COUNTY = KNOX, AND EARNINGS = 0 TO 70000: GO:

and

LET X = STORES(COUNTY = KNOX): TOTAL X: IN TENNESSEE: WHEN EARNINGS = 0 TO 70000: GO:

are equivalent. Both will determine the number of stores in Knox County, Tennessee, which earn less than $70,000.
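The semantics of qualification and of PER-raising can be mimicked directly. In the hypothetical Python sketch below (the record layout and function names are ours, not part of DATAPLUS), a qualified variable yields its value when the parenthesized equality holds and zero otherwise, and raising to store level is a sum over every item in every department:

    # Hypothetical miniature of the hierarchy: a store holds departments,
    # each of which holds items; variables are fields of these records.
    store = {
        "DEPARTMENTS": [
            {"DEPARTMENT NAME": "HABERDASHERY", "DOLLAR SALES": 52000.0,
             "ITEMS": [{"ITEM NAME": "TV SET", "IN STOCK": 31},
                       {"ITEM NAME": "RADIO",  "IN STOCK": 12}]},
        ],
    }

    def qualify(record, variable, test_var, test_value):
        """LET X = variable(test_var = test_value): the value of the
        variable if the equality holds, and zero otherwise."""
        return record[variable] if record.get(test_var) == test_value else 0

    def per_store(store, value_of):
        """Raise an item-level range variable to store level by summing
        it over every item in every department of the store."""
        return sum(value_of(item)
                   for dept in store["DEPARTMENTS"]
                   for item in dept["ITEMS"])

    # TVSUPPLY PER STORE, with TVSUPPLY = IN STOCK(ITEM NAME = TV SET):
    tv_supply = per_store(store,
                          lambda it: qualify(it, "IN STOCK", "ITEM NAME", "TV SET"))
    print(tv_supply)   # 31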
Two operations that prove extremely useful in retrieving from hierarchical data bases are the ANY and ALL operations. These can be used for variables mentioned in the WHEN statement which are on a lower level than the level specified by the TOTAL, DISTRIBUTE, or EXTRACT functions. (The level specified in the EXTRACT function is given in the FROM statement.) We have already seen a use of the ANY operation in the example ...

TOTAL STORES: IN MICHIGAN; NEW JERSEY: WHEN ANY DEPARTMENT NAME = HI FI: GO:

where the variable DEPARTMENT NAME is on a lower level than the variable STORES specified by the TOTAL statement. As an example of the ALL operation, suppose we wish to find those departments in New Jersey all of whose items sell for $8.00 or more. The message could be ...

EXTRACT STORE NAME, ADDRESS, DEPARTMENT NAME: FROM DEPARTMENTS: IN NEW JERSEY: WHEN ALL SELLING PRICE PER ITEM = 8.00 TO 999999: GO:

In this case, the variable SELLING PRICE is on a lower level than the variable DEPARTMENTS.

Although DATAPLUS does not presently support the Boolean negation operation - due primarily to core size limitations - it is often possible to perform negations by appropriate use of qualified variables. Suppose we wish to find the number of stores in Illinois that are not in Cook County; we could say ...

LET X = STORES(COUNTY = COOK): LET Y = STORES(X = 0): TOTAL Y: IN ILLINOIS: GO:

The variable X will have the value 1 if the store is in Cook County, and 0 otherwise. The variable Y will have the value 1 if and only if X = 0, that is, if the store is not in Cook County.

Application to other data bases

Although DATAPLUS was designed specifically to process data files useful to telephone engineers, the language is capable of wider application. The data file described in this paper can use a simple cognate of the language. Specifically, this means that only the nouns need to be changed, and this can be accomplished by a trivial modification of a few program tables. Furthermore, the hierarchical structure of the department store chain is representative of a tree structure common to many data bases. Suitably extended, the techniques employed in DATAPLUS can be applied to other tree-structured data bases, with the complexity of the extension depending on the number of levels in the hierarchy.

Although this paper has been primarily concerned with the DATAPLUS language, it should be clearly understood that the language is the "front end" of an information processing system, the "back end" of which performs such actions as fetching the data from disk to core, calculating distributions, and printing results. Therefore, if one were to use DATAPLUS as the interrogation language for an information system built upon another data file, it would be profitable to borrow as much of the back end of the present system as possible. A large measure of borrowing can in fact be accomplished because of the system's modular design. There is one subroutine (the Read routine) whose sole function is to fetch data from disk. The other programs are independent of the actual physical layout of data on the disk. Hence, application to other data bases could be performed by appropriately changing the Read routine, with minor revisions of the other subroutines.
The fact that the data layout is not an inherent part of the total system suggests three immediate advantages: (1) The data in any application can be structured on disk to take advantage of any idiosyncrasies of the particular data base. (2) The data can be structured to take advantage of the particular computer system's executive program and scheduling policy. (3) The relative efficiencies of different layouts can be experimentally tested by inserting different Read routines.

CONCLUSION

In the literature on query languages, comparatively little consideration has been given to the manipulation of data at the various levels of the multilevel file.1-8 The necessity of such hierarchical operations as the ANY, ALL, and PER operations in DATAPLUS has been recognized in a recent paper9 describing a proposed general purpose data management system. Such operations, combined with the DATAPLUS capabilities of totalling, distributing, and creating algebraic functions, make it possible to more fully utilize the information potential of the data file, and consequently increase the performance/cost ratio of the information system.

DATAPLUS was implemented on the GE-265 computer operating under the GE time-sharing monitor, which allows only 5000 words of user core for compiled FORTRAN code. (A number of program overlays were obviously required.) The system described in this paper has thus demonstrated the feasibility of implementing - on a small machine - a highly accessible, relatively inexpensive, and easy to use information retrieval system with substantial processing capability.

REFERENCES

1 C W BACHMAN, S B WILLIAMS, The integrated data store - a general purpose programming system for random access memories, AFIPS Conf Proc, 1964, pp 411-422
2 Introduction to integrated data store, General Electric Computer Department, CPB 1048, April 1965
3 J H BRYANT, P SEMPLE JR, GIS and file management, Proc of ACM, September 1966, pp 97-107
4 Generalized information system, IBM Manual E20-0179, Reference No A018773
5 W D CLIMENSON, File organization and search techniques, in Annual Review of Information Science and Technology, John Wiley & Sons, New York, 1966, Volume I, Chapter 5
6 J MINKER, J SABLE, File organization, maintenance and search of machine files, in Annual Review of Information Science and Technology, John Wiley & Sons, New York, 1967, Volume II, Chapter 5
7 C T MEADOW, The languages of information retrieval, in The Analysis of Information Systems, John Wiley & Sons, New York, 1967, Chapter 2
8 N S PRYWES, Man-computer problem solving with multilist, Proc of IEEE, Volume 54, No 12, December 1966
9 R E BLEIER, Treating hierarchical data structures in the SDC time-shared data management system (TDMS), Proc of ACM, August 1967, pp 41-49

A language design for concurrent processes

by L. G. TESLER
Computer Consultant
Palo Alto, California
and
H. J. ENEA
Stanford University
Stanford, California

INTRODUCTION

In conventional programming languages, the sequence of execution is specified by rules such as:

(1) The statement "GO TO L" is followed by the statement labelled "L" (Branching rule).
(2) The last statement in the range of an iteration is followed, under certain conditions, by the first statement in the range (Looping rule).
(3) The last statement of a subroutine is followed by the statement immediately after its CALL (Out-of-line code rule).
... (Other rules)
(n) In other cases, each statement is followed by the statement immediately after it (Order rule).
This paper will define a class of general-purpose languages which does not need these rules. The power of these languages is equivalent to that of Algol, PL-1, or LISP. Other languages which do not need these rules appear in the literature.1,2,3

Concurrent processing

The advent of multi-processing systems makes it possible for a computer to execute more than one instruction at a time from the same program without resorting to complicated look-ahead logic. There are many ways in which this capability can be utilized; one way is to find several statements that could be executed simultaneously without conflict, and to delegate their execution to different available processors. This technique is sometimes called "concurrent processing." To employ it there must be a means for determining, during compilation, which statements can be processed concurrently. Many proposals for programming languages have suggested that more rules like 1, 2, 3, ..., n should be added for explicit indication of concurrence.4,5,6,7 Examples of such rules are:

(n + 1) The statement "FORK M, N, ..." is followed simultaneously by the several statements labeled "M", "N", ...; the statement "JOIN M, N, ..." terminates the fork.
(n + 2) The range of statements following "DO SIMULTANEOUSLY" can be executed for all values of the iterated variable at once.

The programmer using these facilities must take care that the statements performed simultaneously do not conflict, e.g., do not assign different values to the same variables.5 Other proposals have suggested an analysis of potential conflicts, during compilation, to isolate all concurrence that does not depend on run-time data values ("program concurrence").8,9,10,11,12 This approach is adopted here because its automatic elimination of all possible conflicts guarantees determinacy.

Once it is possible to isolate all program concurrence using implicit information, it is tempting to examine the possibility of determining all program sequence (i.e., non-concurrence) using implicit information. If that were possible, then not only rules n + 1 and n + 2 but also rules 1, 2, 3, ..., n could be eliminated. In the following section, the class of "single assignment" languages will be defined such that all program sequence and program concurrence are determinable during compilation without explicit indication.

Single assignment languages

A program written in a single assignment language has the following essential characteristics: No variable is assigned values by more than one statement. The only effect of executing any statement is to assign values to certain variables named in that statement (no side effects). Every statement is an assignment statement.

The variables named in each statement belong to two exclusive groups: (1) Outputs: those which are assigned values by the statement. (2) Inputs: those whose values are used by the statement.

The output variables are said to be dependent on the input variables. The abbreviation "AdB" stands for "A is dependent on B". Dependence is: (a) transitive: AdB ∧ BdC ⊃ AdC; (b) antisymmetric: AdB ⊃ ¬BdA; (c) irreflexive: ¬AdA. Circular dependence is not allowed; for example, A can't be dependent on B if BdC and CdA. If neither AdB nor BdA, then A and B are independent.

All required program sequence can be determined during compilation by a straightforward tracing of dependence.
As the symbol table is built, two-way pointers are constructed between the input and the output variables of each statement. The final symbol table, which is both a data-flow and a program-flow diagram, elucidates the required sequence in the program. Statements that are not found to require sequential execution can be performed concurrently. The rules 1, ..., n + 2 are replaced by the single rule: The statement that outputs variable A must be executed before every statement that either inputs A or inputs some B such that BdA.

The order of appearance of statements is immaterial to this analysis; consequently, an incremental compiler can be employed which accepts statements typed in any order. This property is especially useful when adding a statement to a previously written program, because neither unforeseen side effects nor mislocation of the statement can occur.

Optimization

One way of optimizing a program is to reduce the amount of redundant computation by combining "common expressions". In a single assignment program there are no side effects; therefore, common expressions throughout the program can be combined during compilation. Another way of optimizing a program is to allocate storage efficiently. For example, in the program:

var: x+y; a: var-w; b: 2 × var; c: x-y; d: -var;

storage for the variable "var" need not be reserved until just before any of the statements "a", "b", or "d" demand the value of "var", and may be released as soon as all of "a", "b", and "d" no longer require the value of "var". These requirements can be detected easily during compilation. As a result, storage is never allocated for a variable except when necessary to guarantee the availability of its value for further computation.

Compel

Compel (Compute Parallel), a single assignment language, will be partially defined so that examples of single assignment programs can be given. This language has not been implemented. Compel programs process three types of quantities: numbers, lists, and functions. A number is a floating-point approximation to a complex number, e.g.,

2+i4.6    -17.3    5

A list is an ordered set of zero or more quantities, e.g.,

[factorial, [6.2,7], [ ]]
[1 by 4 to 9]
[1,5,9] = [1 by 4 for 3]
[1, while preceding < 9 use preceding + 4]
[while index < 3 use 4 × index - 3]

A function has one argument, which may be of any type and is frequently a list ("parameter list"), e.g.,

φ[x] (x ↑ 2)
φ[a,b] (a×x + b×y)

Blocks may be created for local naming - they have nothing to do with storage allocation or with program sequence. The statements within a block may appear in any order, and the blocks within a program may nest and appear in any order. Each block begins with the line:

input v1, v2, ..., vn;

where v1, ..., vn (n ≥ 0) are the names of those variables defined outside the block and used inside the block. Similarly, each block ends with the line:

output w1, w2, ..., wm;

where w1, ..., wm (m ≥ 1) are the names of those variables defined inside the block and used outside. Every statement is of the form:

VARIABLES : EXPRESSION;

For example, the statement:

a: [1,2,4];

assigns to "a" the single quantity "[1,2,4]". The statement:

a: each [1,2,4];

is analogous to the Algol statement:

for a := 1,2,4 do ...

where the range of the do includes all statements which are dependent on "a". On a parallel computer, all values of "a" can be produced concurrently; thus, a variable (in this case "a") can have several values at the same time.
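As an illustration of the single sequencing rule above (ours; Compel itself was never implemented), the following Python sketch traces dependence among single assignment statements and groups them into waves that may execute concurrently. It uses the storage-allocation example above (var, a, b, c, d), with x, y, and w as free inputs:

    def schedule(statements):
        """statements: mapping output variable -> set of input variables
        (one statement per output: the single assignment property).
        Returns a list of waves; statements within a wave are mutually
        independent and may run concurrently."""
        done, waves = set(), []
        pending = dict(statements)
        while pending:
            ready = [v for v, ins in pending.items()
                     if all(i in done or i not in statements for i in ins)]
            if not ready:
                raise ValueError("circular dependence")
            waves.append(ready)
            done.update(ready)
            for v in ready:
                del pending[v]
        return waves

    prog = {
        "var": {"x", "y"},    # var: x+y;
        "a":   {"var", "w"},  # a: var-w;
        "b":   {"var"},       # b: 2 × var;
        "c":   {"x", "y"},    # c: x-y;
        "d":   {"var"},       # d: -var;
    }
    print(schedule(prog))     # [['var', 'c'], ['a', 'b', 'd']]
    # Storage for var may be released once the second wave completes.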
The function each splits a list of values so that its elements can be operated on individually. After a list is split and its elements have been operated on individually, the results of the operations are collected back into a list. For example:

a: each [1,2,4];
b: a ↑ 2;
c: list of b;

Statement "a" splits the list "[1,2,4]" into three quantities; statement "b" squares each quantity; statement "c" collects the results into a single list, i.e., "[1,4,16]". Splitting and collecting are analogous to forking and joining, except that splitting operates on the data flow, but forking operates on the program flow.

Every statement that does not assign values to the output variables of its block can be eliminated from that block by systematic substitution. For example, the three statements in the last example can be reduced to:

c: list of (each [1,2,4]) ↑ 2;

This property is the converse of common expression combination. The two statements:

a: each [1,2,4];
b: a ↑ 2 + a ↑ 3;

where "a" is used in no statement but "b", can be reduced to one statement in another way by use of the following construction:

b: a ↑ 2 + a ↑ 3 with a each [1,2,4];

Its advantage over the two statements it replaces is that, by omission of the colon after "a", "a" becomes a name local to the statement. A local variable is distinct from all other variables of the same name throughout the program. In the examples that follow, subscripting is specified by a downward arrow, e.g., ai,j is written:

a ↓ [i,j]

Examples

Two simple Compel examples are given below. For each are given: (a) a program/data flow diagram displaying concurrence and storage release; (b) an Algol program using fork, join, and do simultaneously; (c) a Compel program. Figure 1 illustrates concurrent execution of statements and optimization of storage.

Figure 1 - Concurrent execution of statements and optimization of storage

Algol (extended):

begin real a,b,c,d,e;
comment storage for these variables is reserved immediately upon block entry, and is not released until block exit;
fork r1, r2, r3;
r1: a := 6; fork r4, r5;
r2: b := 7; fork r4, r5;
r3: c := 8; go to r5;
r4: join r1, r2; d := a-b; go to r6;
r5: join r1, r2, r3; e := a×b-c;
r6: join r4, r5; out := (a-e)/d;
end

Compel:

input;
out: (a-e)/d;
a: 6;
e: a×b-c;
d: a-b;
b: 7;
c: 8;
output out;

Figure 2 demonstrates simultaneous assignment of all the elements of a list.

Figure 2 - Simultaneous assignment of all the elements of a list

Algol (extended):

begin array t[1:m]; integer i;
inarray (t);
for i := 1 step 1 until m do simultaneously y[i] := t[i] ↑ 2;
outarray (y);
end

Compel:

input t;
y: list of (each t) ↑ 2;
output y;

Programming methods

When programming in Compel, some traditional techniques cannot be employed and new methods must be substituted. Some of these methods will be discussed below. To avoid circular dependence, the input variables of a statement must be different from the output variables; therefore, it is impossible to write the equivalent of Algol's:

i := i + 1;

However, the incrementation of "i" is never a major step in a program, but merely one small step in a larger process. In Compel, notation is provided to incorporate such a step into an algorithm.

Example 1: to compute the sum of the elements of list "m" one can write:

sum: +m;

where "+" is a function which returns the sum of the elements of its argument.

Example 2: in an Algol program, a variable may be assigned values in several statements, some of which increment the variable:

r := r0;
for i := 1 step 1 until n do begin a[i] := b×r; r := r+2; end;
r1 := r;

In Compel each variable is assigned values by only one statement:

r: each [r0 by 2 for n];
a: list of (b × r);
r1: last (list of r);

Conditional statements are not available in Compel; therefore, conditional expressions must be employed to achieve the effect of the Algol statements:

y := y0; b := b0; t := t0;
if a < 0 then begin b := c+1; y := a; end
else begin t := r-a; y := v; end;

A corresponding Compel program:

b: [b0, if a < 0 then c+1 else preceding] ↓ [2];
t: [t0, if a ≥ 0 then r-a else preceding] ↓ [2];
y: [y0, if a < 0 then a else v] ↓ [2];

The word preceding in the generation of a list denotes the immediately preceding element in the list. If it is necessary to refer to several preceding elements, this can be achieved as in the following example, which generates the first 1000 Fibonacci numbers:

fibonacci: list of (pair ↓ [2]);
pair: each [[0,1] while index < 1000 use [preceding ↓ [2], preceding ↓ [1] + preceding ↓ [2]]];
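The preceding construction is easy to mimic. The hypothetical Python sketch below (not part of Compel) builds a list in which each element is derived from the one before it, first for the loop variable of Example 2 and then for the Fibonacci pairs just defined:

    def compel_list(first, step, n):
        """[first, while index < n use step(preceding)]: each element is
        computed from the immediately preceding element."""
        out = [first]
        while len(out) < n:
            out.append(step(out[-1]))
        return out

    # r: each [r0 by 2 for n], with r0 = 10 and n = 5:
    print(compel_list(10, lambda p: p + 2, 5))   # [10, 12, 14, 16, 18]

    # pair: each [[0,1] while index < 1000 use [preceding↓[2],
    #                                           preceding↓[1] + preceding↓[2]]]
    pair = compel_list([0, 1], lambda p: [p[1], p[0] + p[1]], 1000)
    fibonacci = [p[1] for p in pair]             # fibonacci: list of (pair↓[2])
    print(fibonacci[:7])                         # [1, 1, 2, 3, 5, 8, 13]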
Example 2: in an Algol program, a variable may be assigned values in several statements, some of which increment the variable: r:=rO; for i:= 1 step 1 until n do begin a [i] :=bxr; r:=r+2; end; rl:=r; In Compel each variable is assigned values by only one statement: r:each [rO by 2 for n]; a: list of (bX r); rl:last (list ofr); Figure 2 - Simultaneous assignment of all the elements of a list Algol (extended): begin array t[ 1:mJ; integer i; inarray (t); for i := I step 1 until m do simultaneously y[i]:= t[i] i 2; outarray (y); end Compel: input t; y: list of (each t) j.2; output y; Programming methods When programming in Compel, som'e traditional techniques cannot be employed and new methods must be substituted. Some of these methods will be discussed below. To avoid circular dependence, the input variables of a statement must be different from the output variables; therefore, it is impossible to write the equivalent of Algol's: i:=i + 1; Conditional statements are not available in Compel; therefore, conditional expressions must be employed to achieve the effect of the Algol statements: y:=yO; b:=bO; t: =to; ifa < 0 then begin b:=c+ 1; y:=a; end else begin t:=r-a; y:=v; end; A corresponding Compel program: b: t: y: LbO, if a < 0 then c+ 1 else preceding] [to,. if a > 0 then r-a else preceding] [yO, if a < ;0 then a else v] t [2]; t t [2]; [2]; The word preceding in the generation of a list denotes the immediately preceding element in the list. If it is necessary to refer to several preceding elements, this can be achieved as in the following example, which generate the first 1000 Fibonacci numbers: A Language Design for Concurrent Process fibonacci:listof(pair ! [2]); pair:each [[O,IJ while index < 1000 use [preceding! [21, preceding ! [ 1J + preceding ! [2 J J ab: list of (each a) concatenate (each b) ; scale: list of list of (each abrow)/maxv (arow) with abrow in ab and arow in a; J; Problems which are solved in Fortran or in Algol by repeatedly changing the values of various matrix elements might seem to be difficult to solve in Compel. Therefore, a matrix reduction by Gaussian elimination will be given as an example of an iterative algorithm. The matrix is stored as a list of rows, and the rows are each lists of elements. The following block defines "gauss", a function of three parameters: "a", "b", and "eps", where "a" and "b" are matrices: a= [;::" "":.J b= nXn ~~l.'""" ~l~"J [b nl b nm ] nXm Gauss returns: The program generates n iterations. In the first, "iter" is the n by n+m matrix "scale"; in the i'th, "iter" is collected into a list of rows from the n - i + 1 rows "reduce" generated by the (i - 1)st iteration. If singularity is discovered, iteration ceases. iter: each [scale, while abs(pivot» eps /\ index \$ ,"W 00 "/ O DISJUNCTIVE INPUT: process becomes active when anyone of inputs is present CONJUNCTIVE INPIJT: process becomes active when aI inputs are present. o ~ Q $ ilISNNCnVE GUinii: one and oniy one output is produced on completion of the process. ~ •• ' CONAKTM OUTPUT: aI outputs are praced on completiol of the process. Figure I - The program graph, a model of computational processes, including parallel processes a program graph corresponds to a whole program module (e.g., a subroutine). Parallelism at this macrolevel is likely to be fairly easy for the programmer to deal with conceptually. Furthermore, the gains from parallelism relative to the cost of task switching, that is, the system overhead required to initiate and complete each task, are likely to be quite high. 
Forks and joins at the statement level are very likely to cost more in executive overhead than the savings possible from parallel operation.

Program hierarchy, that is, the control hierarchy of modules in a system, is an important issue in program design.7 Essentially, this is the hierarchy determined by normal subroutine calls. As an example, consider the problem of generating lists of "key words" from sentences in some input text. The basis for choice will not be discussed here, but at least three fundamentally different hierarchies are possible, all performing the same overall process. These are given in Figure 2.

Figure 2 - Alternate hierarchies for identical processing. KEY is a routine to generate key words from a list of words, LIST generates a list of words from a sentence as a text string, and INPUT inputs a sentence; that is, the three modules contain the processing for the functions as given

The hierarchy of control is, in this case, an artifact, like control sequence, and in concept could be eliminated. The co-routines of Conway16 are addressed precisely to this matter. In co-routines, the input-output relationships in fact determine control, avoiding some messy programming details that arise solely from the artifact of control hierarchy.

The idea has been extended in the program strings of Morenoff and McClean.17,18 The development of program strings was directly motivated by the desire to enable the simple, natural structuring of whole programs into larger units. A program string is illustrated in Figure 3. Each block in the program string is a generalization of the conjunctive node in a program graph. Some part of the block becomes active when some combination of inputs becomes available. Outputs are produced not all in one event, but at various times. Obviously, a block could be decomposed into an equivalent program graph, but to require this of the programmer in the design process is undesirable.

Figure 3 - A program string structure. Each output stream is buffered

It should be pointed out that the buffer files between blocks cause the determinacy problem, in a sense, to disappear. The buffer files completely isolate units in the system. Data, once outputted to a buffer, cannot be accessed or changed by the generating module. Thus the possibility of attempted simultaneous access to the same cell is eliminated. It is still possible to create indeterminate string structures if the behavior of a program is controlled by the sequence of appearance of data from different streams. However, a restriction to one active control stream per module in the string structure prevents this, provided that an attempted access on an input stream either inhibits the control stream from continuing if the buffer is empty or results in transmission of the data and continuation.*

It is inconvenient to restrict the control of sequence to those determined by simple input-output interrelationships alone. The fact is that some parts of processes are more naturally expressed in terms of statemental succession, others in a control hierarchy, and still others as co-routine or program string structures.
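The INPUT-LIST-KEY example of Figure 2, restructured as a string of stream-connected modules, can be sketched with Python generators (an anachronistic stand-in, of course; the module bodies and the key-word test below are invented for illustration). Each module is driven purely by the arrival of data on its input stream, and none is the master of the others:

    def input_stream(text):
        """INPUT: produce sentences one at a time from the raw text."""
        for sentence in text.split("."):
            if sentence.strip():
                yield sentence.strip()

    def list_stream(sentences):
        """LIST: turn each sentence into a list of its words."""
        for sentence in sentences:
            yield sentence.split()

    def key_stream(word_lists, keys):
        """KEY: keep only the key words from each word list."""
        for words in word_lists:
            yield [w for w in words if w.lower() in keys]

    text = "computers read text. text holds key words."
    for kw in key_stream(list_stream(input_stream(text)), {"text", "key", "words"}):
        print(kw)
    # ['text']
    # ['text', 'key', 'words']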
What we now do is to present a set of language features which imbed the concept of program strings into the language and extend its applicability downward to the subroutine level. The structures so defined are also analogous to the computation graphs of Karp and Miller19 by virtue of imposing a strict pairing of an output set with an input set through the use of the explicit to - explicit from pair.†

*The proof of determinacy depends on the inability of a module in the structure to make a decision on the presence or non-presence of data on any input stream. With complete isolation and a single control stream, each module is in itself determinate except for possible effects due to its inputs. Consider the first input attempt by a module. Execution of the next sequential instruction guarantees that the data are available and have been accessed independent of the timing of other modules. The control sequence is thus independent of the order of appearance of inputs from different streams. All modules must then be output-functional with respect to inputs from the outside; hence the order of appearance of data on any purely internal stream is fixed by outside input and initial state. Since each internal input stream is identified with one and only one internal output stream, that is, no merges are permitted, the order of appearance of data on these streams cannot depend on timing. Therefore the structure is output-functional with respect to inputs from within the structure as well.

†In review it was noted that the structures defined by the proposed mechanism appear to be isomorphic to the computation graphs of Karp and Miller.19 A proof of this would provide an independent proof of determinacy, as computation graphs are determinate.

The data control mechanism

The "normal" subroutine call is the prototypical point of departure, since it is the sole structuring and sequencing mechanism for modules in most languages. Consider the call in routine BETA to subroutine ALPHA with four parameters:

ALPHA (A, B, X, Y)

It is impossible to tell whether the parameters are merely arguments being transmitted to the module ALPHA or whether some are specifying return locations for the output of ALPHA. Let us assume the general case where some, say X and Y, are output parameters. Then conceptually the effect of such a call is to pass the input parameters A and B to ALPHA, to hand control to ALPHA, which retains it until the output values for X and Y are developed, and then to return the output values and control to the calling module. Control must leave BETA and not return until the output has been completely generated because, on its return, BETA assumes the processing of the subroutine to have been completed.

What we would like to do is separate the transmission of information to a module from the transmission from a module. We would also like to make the passing of control an optional matter which is governed solely by input-output constraints. Four language constructs are required for this. BETA must be able to transmit data explicitly to ALPHA and to receive it explicitly from ALPHA. ALPHA must be able to accept data from any module calling it, for example, implicitly from BETA, and return data implicitly to BETA as its caller. This separation of the input linkage function from the output linkage for subroutine calls is somewhat inimical to current languages. Some violence to syntax is thus required. The four proposed statements are shown in Table I. A "normal" call, that is, one which imposes full subordination, is the simple combination of the explicit to and the explicit from. The explicit from is usable in the evaluation stream (as an expression), since the value of a procedure is an output.

TABLE I - Proposed language extensions for data control of sequence and parallelism among program modules

Explicit to - ALGOL form: to f(x, ..., z). Data source: calling (this) module. Data target: entry queue of called module. Semantics: transmit the parameters to the called module.

Explicit from - ALGOL form: from f(x, ..., z).* Data source: return queue of calling (this) module. Data target: calling (this) module. Semantics: get the value of the called procedure and the output parameters.

Implicit to - ALGOL form: return val, (x, ..., z). Data source: called (this) module. Data target: return queue of calling module. Semantics: return value of procedure and output parameters as specified by an explicit from.

Implicit from - ALGOL form: receive (x, ..., z). Data source: entry queue of called (this) module. Data target: called (this) module. Semantics: set values for dummy parameters as obtained from an explicit to.

Full call - ALGOL form: f(a, ..., d).* Data source: calling (this) module. Data target: calling (this) module. Semantics: transmit parameters to called procedure, activate, return outputs to calling procedure, and restore control in calling procedure.

*Alternatively, to permit the natural use of explicit from in expressions, f(x, ..., z) could be used for that purpose and call f(a, ..., d) for the full call.

The execution of an explicit to statement assumes only that the parameters are transmitted to the called module, and following the explicit from statement it can be assumed that the corresponding parameters have been set by the called module. Control, either from
The execution of an explicit to statement assumes only that the parameters are transmitted to the called module, and following the explicit from statement it can be assumed that the corresponding parameters have been set by the called module. Control, either in the same stream as the calling module or from an independent (parallel) stream, is given to the called module some time between the execution of the explicit to and the explicit from. The decision of when and how to give control to the called module, within the bracketing mentioned, is that of a supervisory system. Within a called module, the implicit from indicates when a module expects its input parameters, and an implicit to returns output to the module which generated the inputs of a particular activation.

TABLE I - Proposed Language Extensions for Data Control of Sequence and Parallelism Among Program Modules

Explicit to. ALGOL form: to f(x, ..., z). Data source: calling (this) module. Data target: entry queue of called module. Semantics: transmit the parameters to the called module.

Explicit from. ALGOL form: from f(x, ..., z).* Data source: return queue of calling (this) module. Data target: calling (this) module. Semantics: get the value of the called procedure and output parameters.

Implicit to. ALGOL form: return val, (x, ..., z). Data source: called (this) module. Data target: return queue of calling module. Semantics: return value of procedure and output parameters as specified by an explicit from.

Implicit from. ALGOL form: receive (x, ..., z). Data source: entry queue of called (this) module. Data target: called (this) module. Semantics: set values for dummy parameters as obtained from an explicit to.

Full call. ALGOL form: f(a, ..., d).* Data source: calling (this) module. Data target: calling (this) module. Semantics: transmit parameters to called procedure, activate, return outputs to calling procedure, and restore control in calling procedure.

*Alternatively, to permit the natural use of explicit from in expressions, f(x, ..., z) could be used for that purpose and call f(a, ..., d) for the full call.

†In review it was noted that the structures defined by the proposed mechanism appear to be isomorphic to the computation graphs of Karp and Miller.19 A proof of this would provide an independent proof of determinacy, as computation graphs are determinate.

Now it is only necessary for a programmer to get into the habit of making to-calls at the earliest possible point, that is, as soon as all the parameters are available, and from-calls at the latest point, only when the output values are immediately needed. Thus the use of data control of module activation does not require added analysis for possibilities for parallel processing, only different habits. It should be observed that the use of fully parameterized calls, independence of modules except for explicit task relationships, and the control of intermodule sequencing by programmer-specified data constraints permit fairly large and complex systems to be structured as the secondary effect of simply fully analyzing and specifying each subpart of the system.

Asynchronism is permitted by associating buffer files with each data stream. A FIFO queue is associated with the entry interface (by which a module is called) and with each linkage return interface (to which a module returns). Note that the latter are required dynamically and will only need to exceed length one when more than one to-call on the same module from the same (or other) module are allowed to queue up. The modules specified according to the features as described so far are analogous to conjunctive input/conjunctive output nodes of a program graph.
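To make the intended sequencing concrete, the following sketch approximates the four constructs with present-day threads and FIFO queues. It is an illustration only, not the proposed ALGOL extension itself; the module bodies and parameter values are invented:

from queue import Queue
from threading import Thread

entry_q, return_q = Queue(), Queue()   # entry queue of ALPHA, return queue of BETA

def alpha():
    a, b = entry_q.get()               # implicit from: accept inputs when needed
    x, y = a + b, a * b                # develop the output values
    return_q.put((x, y))               # implicit to: return outputs to the caller

def beta():
    worker = Thread(target=alpha)      # an independent (parallel) control stream
    worker.start()
    entry_q.put((3, 4))                # explicit to: transmit parameters early
    # ... BETA may continue with unrelated work here, overlapped with ALPHA ...
    x, y = return_q.get()              # explicit from: block only when outputs needed
    print(x, y)                        # -> 7 12
    worker.join()

beta()

Issuing the to-call as early as possible and the from-call as late as possible maximizes the overlap between BETA and ALPHA without any explicit fork or join.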
This mechanism can be generalized by permitting some parameters to be omitted in any of the statements. Thus, a program might include

to F (,,p,q)
to F (r)
to F (,s)

which in total would behave like

to F (r,s,p,q)

but would permit additional overlap if F were properly written to take advantage of this. For example, p and q may be setup parameters to F, and F would include

receive (,,a,b)
receive (c,d)

The generalized mechanism is more complicated to implement and involves potentially more processing on each to or from statement. Input values transmitted by an explicit to must be identified by source (i.e., return linkage) to associate portions of the parameter sequence transmitted at various times and prevent mixing of parameters originating from different modules. Each addition to an entry queue must be checked to see if it satisfies a waiting implicit from. Once an implicit from becomes associated with part of one particular parameter string, all further implicit from calls must be satisfied from the same string until this activation is complete. The additional cost may be offset by increased possibilities for parallelism. However, it should be noted that a sophisticated algorithm may be required to select partial parameter strings in satisfaction of an initial implicit from. An implicit from may be satisfied by several entries in the queue, some of which may hang the module for some time in waiting for other needed parts of the string.

Several interesting possibilities for parallelism are nevertheless opened up by the generalized form. By never requesting parameters which are not used on a particular activation, it is possible that execution could begin earlier. In addition, parameters are frequently used merely to be passed on down in the task hierarchy. Accepting these parameters can be delayed until just prior to transmitting them downward.

Implementation

It should be re-emphasized that the proposed statements are at a very high level and hence the programmer need not be concerned with their implementation. Each statement can be associated with a fairly complex sequence of processing. The whole is assumed to be superimposed on a micro-level system with "conventional" parallel processing instructions. It is at the micro-level that the problems of implementation discussed earlier are to be attacked. They can be solved using present solutions or, potentially, in new ways which depend upon the constrained forms of parallelism possible under the data control mechanism.

CONCLUSIONS

What has been presented is a generalization of the normal task subordination mechanism to isolate the input and output portions of the call operation. This permits the simple and natural specification of data-presence constraints in such a way that sequence and parallelism are the by-product of a rather straightforward discipline. It is proposed that this generalization be used as a source language level method of specifying process parallelism. The data control method proposed is limited to intermodule parallelism. From the standpoint of overhead and supervisor functions, this is probably beneficial. However, nothing prevents data control from being combined effectively with source language specifiers at the statement level such as the and and parallel for.

ACKNOWLEDGMENTS

Appreciation is expressed to Professor David Martin, now at UCLA, for early criticism and suggestions and for encouraging the publication of this proposal.
REFERENCES

1 J P ANDERSON Program structures for parallel processing Communications of the ACM Volume 8 Number 12 December 1965 p 786
2 J B DENNIS E C VAN HORN Programming semantics for multiprogrammed computations Communications of the ACM Volume 9 Number 3 March 1966 p 143
3 J A GOSDEN Explicit parallel processing description and control in programs for multi- and uni-processor computers AFIPS Proceedings of the 1966 Fall Joint Computer Conference Volume 29 p 651
4 A OPLER Procedure oriented language statements to facilitate parallel processing Communications of the ACM Volume 8 Number 12 December 1965 p 786
5 N WIRTH A note on program structures for parallel processing Communications of the ACM Volume 9 Number 5 May 1966 p 320
6 L L CONSTANTINE Toward a theory of program design Data Processing Magazine Volume 7 Number 12 December 1965 p 18
7 L L CONSTANTINE Concepts in program design Information & Systems Press Cambridge Massachusetts Second Edition 1967
8 E W DIJKSTRA Solution of a problem in concurrent programming control Communications of the ACM Volume 8 Number 9 September 1965 p 569
9 D E KNUTH Additional comments on a problem in concurrent programming control Communications of the ACM Volume 9 Number 5 May 1966 p 321
10 J E RODRIGUEZ Analysis and transformation of computational processes Massachusetts Institute of Technology Project MAC Memorandum MAC M 301 March 7 1966
11 E C VAN HORN Computer design for asynchronously reproducible multiprocessing Massachusetts Institute of Technology Project MAC Technical Report MAC TR 34 November 1966
12 F L LUCONI On the equivalence of two asynchronous computation description schemes Massachusetts Institute of Technology Project MAC Computation Structures Group Memorandum Number 25
13 A L PUGH III Dynamo users manual MIT Press Cambridge Massachusetts 1961
14 S SCHLESINGER L SASHKIN POSE: A language for posing problems to a computer Communications of the ACM Volume 10 Number 5 May 1967 p 279
15 D MARTIN G ESTRIN Models of computations and systems - Evaluations of vertex probabilities in graphical models of computations Journal of the ACM Volume 14 Number 2 April 1967 p 281
16 M E CONWAY Design of a separable transition diagram compiler Communications of the ACM Volume 6 Number 7 July 1963 p 396
17 E MORENOFF J B MCLEAN Job linkages and program strings Rome Air Development Center Technical Report RADC TR 66 71
18 E MORENOFF J B MCLEAN Interprogram communications program string structures and buffer files AFIPS Conference Proceedings 1967 Spring Joint Computer Conference Volume 30 1967 p 175
19 R M KARP R E MILLER Properties of a model for parallel computations Determinacy termination queueing IBM Research Paper RC 1285 September 1964

Anatomy of a real-time trial - Bell Telephone's centralized records business office

by ALAN B. KAMMAN and DONALD R. SAXTON
Bell Telephone Company of Pennsylvania
Philadelphia, Pennsylvania

INTRODUCTION

In the spring of 1965, The Bell Telephone Company of Pennsylvania undertook a trial designed to eliminate most of the paper records used for negotiations in a business office. The objectives were to computerize these files, recall the records in real-time with video display devices and direct the customers' incoming calls with an Automatic Call Distributor. On August 28, 1967, a Service Representative successfully handled the first customer contact. Currently an average of 3,000 contacts weekly are handled at twenty display terminal positions.
This experiment encompasses the 88,000 accounts of residence customers in Upper Darby, Pennsylvania. All business accounts are excluded at this time. Prior to the Centralized Records Business Office (CRBO) trial, 24 girls handled the residence subscribers. They sat in pairs with a tub file connecting their desks. Each file contained bills, toll statements, pending orders, credit information and contact memoranda for approximately 10,000 subscribers. Figure 1 depicts a typical installation.

Figure 1 - Service representative using the paper records system

Customer calls to the Business Office were directed through an operator who had access to each Service Representative. During slack periods, calls could be directed to the proper file position after the operator asked the subscriber for his telephone number. During busy hours, calls rarely could be placed to the file location, and Service Representatives excused themselves to leave their position and search for the records in other tub files.

CRBO system overview

Under the CRBO concept, each Service Representative has a cathode-ray tube (CRT) device as illustrated in Figure 2, calls are directed to "open" positions by an Automatic Call Distributor, and the Service Representative types in the customer's telephone number to receive the account information on her screen. Today, the Service Representative is no longer limited by her paper file of 10,000 accounts. Now she can retrieve any record from the computer file. The computer file is located at the Conshohocken, Pennsylvania Accounting Center approximately fifteen miles from the Business Office in Upper Darby.

Figure 2 - Service representative at CRBO position

To serve Upper Darby alone, hardware consists of 28 Raytheon 401 Display Terminals divided between two Raytheon 425 control units. Attached to each unit is a printer-adapter and Bell Telephone's Model 35 teletype for reproducing CRT displays. Each unit is equipped to handle all terminals on either of two four-wire fully duplexed voice-grade circuits, transmitting data at a speed of 2400 bps. Twenty of the CRT's are used by Service Representatives, four by their Supervisors, one by the Public Office Unit and three are located in a training room.

At Conshohocken, IBM's 360/40 computer with 262,144 bytes of core storage handles the messages, using a 2701 high-speed adapter for each circuit. Programs and a specialized customer file are stored on four IBM 2311 Disc Drives, while the majority of the customer files are placed on the 400,000,000 byte-capacity IBM 2321 Data Cell. A 425 Control Unit and two 401 CRT's complete the installation at the Computer Center. Figure 3 illustrates the CRBO hardware configuration.

In case of a system failure, four defensive strategies have been devised: (a) transfer to a single resource such as one communication line or control unit; (b) substitution of a resource such as a Dataphone Subset or terminal device; (c) utilization of paper backup records as long as they are retained and (d) handling customer contacts without records, thus usually necessitating call-backs after the system is restored. The selection of backup alternatives is dependent upon a fault isolation and recovery procedure executed via a combination of machine and man diagnostics. This procedure assigns to IBM, Raytheon, Bell's Plant, Business Office and Accounting Departments specific responsibilities, tests and controls.
The task force organization

After Bell of Pennsylvania decided to investigate the possibility of real-time retrieval, a feasibility study was conducted by eight managers representing various Headquarters Staff departments. When the study and its projected costs were approved by the President, a Project Director was appointed. The Project Director then selected four direct subordinates, as indicated in Figure 4. One was in charge of System Design, the second in charge of Applications Programming, another in charge of Business Office Methods, Practices and Training, and the fourth in charge of Area Coordination. These, in turn, added the necessary personnel so the Task Force consisted of 45 people at its greatest point.

In addition, approximately 17 people in the Accounting Department's Standards group were dedicated to the trial although they continued to report within their own organization. These programmers developed the machine-oriented software necessary to mesh the 360/40 with its peripherals and its application programs. The Task Force also received direction from the Planning Department which had spearheaded the trial and was concerned with its integration into the overall mechanization plans of the Company.

System capabilities

The file organization of each account supports the majority of tasks performed by a Service Representative. A request for a billing display will furnish message unit usage, payment arrangements, balance due, local service and any special bill negotiations. To satisfy toll inquiries, the computer system is designed to provide a list of those calls, current negotiations on disputed tolls and, on request, a toll investigation status. The latter display will show, after searching three months' records, identical tolls and calls to numbers similar to the one in question. Additional displays are arranged to provide families of related information on the screen at one time, formatted consistent with different transaction codes.

A capability is provided for each Service Representative to "treat" customers delinquent in paying their bills. The Company lists a series of dates for each account, stating when treatment steps should be taken, ranging from an educational call to an interruption of service for non-payment. Upon entering the proper transaction code with a range of telephone numbers, a Service Representative will receive, for the accounts for which she is responsible, a summary of all pending treatment (collection) work for the day. From that point on, she need only press a function key labeled "NEXT" to bring up account after account in decreasing order of collection importance. Once she contacts the subscriber and makes payment arrangements, she has the ability to type notations into the file. This data automatically reschedules the account for display at a subsequent date if the outstanding balance is not reduced below the treatable limits.

"Page Ahead" and "Page Back" function keys are available to access displays with a great deal of data. The large 1040-character Raytheon CRT was selected to reduce the need for page-flipping. At present, the function need be used only ten percent of the time. The Service Representative can also type in the telephone number and function code "SRB" to access the master file for each subscriber. This display provides the listed name and address, billing name and address, record of all equipment, additional directory listings, cross-reference notations and permanent remarks notations as of the last bill date.
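The "NEXT" key behavior described above amounts to serving accounts from a priority queue keyed on collection importance, with a typed notation rescheduling the account for a later date. The following is a minimal sketch of that idea, not the Bell implementation; the telephone numbers, importance scores, and two-week redisplay interval are invented for illustration:

import heapq
from datetime import date, timedelta

# Pending treatment items, ordered by decreasing collection importance.
# (Importance is negated because heapq is a min-heap.)
pending = []
heapq.heappush(pending, (-90, "KI5-2368"))   # deny-for-non-payment candidate
heapq.heappush(pending, (-40, "KI5-1104"))   # educational call
heapq.heappush(pending, (-70, "KI5-0977"))   # final notice

def next_account():
    # The "NEXT" function key: bring up the most important remaining account.
    importance, phone = heapq.heappop(pending)
    return phone, -importance

def note_payment_arrangement(phone, balance, treatable_limit):
    # A typed notation reschedules the account for a later display
    # if the outstanding balance stays above the treatable limit.
    if balance > treatable_limit:
        return {"account": phone, "redisplay_on": date.today() + timedelta(days=14)}
    return None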
Figure 3 - CRBO hardware analysis

Figure 4 - Centralized records business office task force organization

File design

Customer records, with the exception of treatment and notational information, are stored in the Data Cell. The Data Cell Drive contains ten cells with twenty subcells each, and ten magnetic strips for every subcell. The strips have one hundred 2000-character capacity tracks. Three basic customer records are stored on each track, with the capability of overflowing to other tracks addressable from the main record. When the system calls for data, the basic customer record is read into the CPU and the strip returned to its subcell. Likewise, if information from overflow records and history is required, it is read from the Data Cell into core. The data requested by the Service Representative (i.e., Toll) is formatted page by page to completion. The application program is temporarily interrupted at the end of the third page to transmit the first page. When the application program relinquishes control at the conclusion of formatting, core storage areas are released. If another function is subsequently introduced, the Data Cell is reaccessed.

The Centralized Records Business Office Master File is a distinct entity, separated from the tape files used for the Company's billing application. Ultimately the Company is striving for a single file hierarchy, but for purposes of the trial, new CRBO files were created. The Centralized Records Business Office Master File is updated daily for cash payments, treatment referrals, number changes, permanent disconnections, final bills and credits. It is updated monthly for service and equipment changes.

The updating process consists of three phases: interface, merge and update. An interface program is necessary because the data produced from the five output runs used daily from the Billing operation are not compatible with the CRBO Computer. The tapes must be read through a data conversion program so that each record is reformatted to conform with the CRBO configuration. Five daily billing outputs are regrouped into two "Change Files," then introduced respectively to master and billing update runs. The billing updates occur several days prior to the date the new bill is received by the subscriber. During the time from the beginning of update to the actual release to the Post Office of the new customer bills, the computer manages the interim period. The Service Representative desiring information from the latest bill the customer has received will access the file for current records. She does not need to know that a new billing update has taken place.
Through internal controls, her request for current data will be routed to the previous month's data. After the mailing of the new customer bill, any requests from the Service Representative for current data will be routed to the latest file. The files are designed to store three months' data: current and the previous two billing periods.

Programming

All real-time application programs used by the Service Representatives were planned by the System Design group using the SAPTAD process. SAPTAD organizes design logic into six levels of detail, starting with System and proceeding through Application, Project, Transaction, Action and Detail. "System" represents the total universe; in this case, a Business Information System. "Application" is the functional, major subdivision such as CRBO. "Project" represents a breakdown of the application into logical components, e.g., "Account treatment." "Transaction" is a further breakdown of the component, such as "payment entry," while "Action" represents the action to be performed as a result of the transaction, e.g., "place data for display." Finally, "Detail" is the logical entity of work necessary to perform an action: in this case, "place 'deny-non-payment.'"

All logic conditions, actions and sequences were then placed on Decision Tables. This is a technique of portraying details required in large computer processes. The Tables eliminated machine logic flow diagramming, improved understanding of multiple decisions and provided the ability to prove the table through a simple "Yes - No" process. The tables were turned over to the Application Programming Group who coded them in COBOL E. Subsequently, these were translated into COBOL F to be compatible with certain features in IBM's full Operating System release eleven.

In CRBO's real-time system, application programs form one of two programming categories. The second is concerned with computer functions and consists of machine-oriented programs. These Utility Programs include the Supervisor, Communications Interface Program (CIP), Transaction Analyzer (TA), File Interface Program (FIP), and IBM's Operating System.

The Supervisor is the control center of the real-time system. This program receives communications from the various parts, controls and coordinates actions, and is responsible for general computer functioning. The Communication Interface Program (CIP) controls the visual display stations and teletypewriters. It is subordinate to, and under the control of, the Supervisor program element. Its major function is to receive and transmit information in the form of coded characters to and from the communication devices. The Transaction Analyzer (TA) is an application-based, machine-oriented program, which takes the incoming message and interrogates the function code associated with it. Then it develops the routing and selection of the application programs necessary for its processing.

The File Interface Program (FIP) is a common interface for linking the CRBO system files to application programs for reading and writing purposes. It insulates the CRBO programs from machine-oriented programs required for controlling the various direct access hardware devices containing the files. It procures the requested information from the peripheral files on a segment basis, expands the data and makes it available to the CRBO application programs. Since all information from a particular record is seldom required, only those portions or segments of the record which are needed are passed to the CRBO program. If other segments are found to be necessary, another call is made to the File Interface Program.
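The segment-at-a-time behavior of the FIP can be pictured with a small sketch. This is a hypothetical illustration, not the Bell implementation; the segment names and byte offsets are invented:

# Hypothetical segment map for one basic customer record on a Data Cell track.
SEGMENTS = {"billing": (0, 600), "toll": (600, 1400), "equipment": (1400, 2000)}

def fip_read(track: bytes, wanted: list) -> dict:
    # Pass only the requested segments to the application program; a later
    # request for another segment means another call to the FIP.
    return {name: track[lo:hi] for name, (lo, hi) in SEGMENTS.items()
            if name in wanted}

record = bytes(2000)                   # one 2000-character track image
page = fip_read(record, ["billing"])   # first call: billing segment only
more = fip_read(record, ["toll"])      # second call when toll data is needed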
The Operating System for the 360 Computer is a highly complex series of modules and options assembled from a library of routines which can be tailored to the particular needs of the user. One major element of the Operating System contains the processing programs, consisting of language translators, service programs and user-written processing programs. The second element, or control program, supervises the execution of the processing programs, controls the location, storage and retrieval of data and schedules jobs for continuous processing.

These Utility Programs were written in the 360 Basic Assembly Language for maximum flexibility and core conservation. Still, however, they occupy so much residency that only 32,768 bytes of core are allocated for the application programs. The Model H40 with its 262,144 byte CPU is, therefore, the minimal size machine in the 360 catalogue which could be used for CRBO.

Testing

Application program unit tests were accomplished in several stages. First, each program was tested in a controlled environment at the Transaction, Action and Detail level of the SAPTAD process. As each Detail coding passed its testing requirements, it was combined with an Action program and retested. This was then repeated at the higher Transaction level and then the program was released for single-thread testing.

Since the application programs were designed to be independent of files or communication devices, and since the CIP, FIP and other utility programs were not completed until late in the development period, simulator programs were designed. These delivered and received data and, in general, exhibited the characteristics expected of the system when all terminals and files were operative. In addition, the simulators replaced unavailable hardware, and gave absolute control over all conditions so that only one application program was under test. For example, there was a simulator program to read card test data, simulate an incoming contact and hence call up the proper application programs. The application program under test then called for data from the files. The data was supplied by another simulator reading from card inputs. Following that, the application program transmitted a display which was intercepted by a third simulator, which transferred this "display" to a teletype printer for review.

Single-thread testing took place when the application programs were brought together with the utility programs and most of the hardware. (An exception: Model 35 teletypewriters were substituted for the video terminals.) The tests were conducted using low-volume, controlled input consisting of one transaction at a time. The same transactions used in the unit test stage were supplied via teletype tape to the single-thread stage, then the resulting output was returned to a teletype printer for verification. When a program failed, it was returned to unit test for further development.

Then a program was designed to place a load test on the system giving the effect of multiple lines, random transactions and maximum arrival rates. This multi-thread program was written so transactions were fed to the system as fast as the C.P.U. and application programs could handle them.
Incoming data was identical to that used in the previous test stages so control of output could be maintained. The data was stacked directly into the communication input queue, and the resultant transaction received directly from the communications output queue, thus eliminating line transmission delay.

Finally, on-line live operation provided the overall systems test. Service Representatives and accounts were added using a controlled schedule, starting with one girl having access to 10,000 accounts for only two hours per day. Within ten weeks this increased to the intended objective of 20 girls accessing 88,000 accounts for the standard working day. In the interim, debugging took place using dual software packages. One package supported the system for a week, while all changes were made to the off-line software. Each Monday the packages were switched, and hence the system was improved weekly. This technique permitted an accelerated pace in identifying failures which occurred primarily during busy periods, and focused attention on critical areas requiring immediate work.

In concluding the descriptions of software and testing, it should be noted that the Project Implementation Schedule was underestimated. This was caused primarily by incomplete definitions in some cases and pioneering programming techniques in others. It became apparent, however, that the number of activities which could be performed concurrently was also underestimated. The balancing effect caused a total on-line date slippage of only twenty-five per cent of the original estimated elapse-time set two years earlier. A CRBO Event Chronology is listed in Table I.

TABLE I - Centralized records business office - event chronology (Item: Start Date to Complete Date)

Feasibility Study: 6-65 to 7-65
Management meetings, culminating in project approval: 7-65 to 8-65
Task Force appointments: 9-65 to 10-65
General specifications discussions: 9-65 to 11-65
Data collection; existing operation: 11-65 to 2-66
Design documentation techniques investigated: 11-65 to 2-66
Data analysis: 12-65 to 3-66
Completion of design activities: 12-65 to 10-66
Program activities: 1-66 to 11-67
Equipment selection: 4-66 to 7-66
Preparation of Business Office practices: 11-66 to 12-66
Training package development: 1-67 to 10-67
Testing of software: 1-67 to 11-67
Computerization of paper records: 8-67 to 10-67
On-line: August 28, 1967
Entire office converted: November 20, 1967

Consultants' advice

Early in the design stage of the trial, the human element was considered. The American Institutes for Research (AIR), based in Pittsburgh, Pennsylvania, made valuable contributions in three fields while working closely with Task Force members. First, they helped to design the floor plan. The new Garrett Road Business Office was given partitioned sections for each of six groups. Each area contains desks for six Service Representatives and their Supervisor. The partitions give the effect of a small office while taking advantage of large-office team-size efficiency. To combat claustrophobia and permit free circulation of air and light, the upper portion consists of open, pastel-colored lattice work. The lower three feet are enclosed, and contain the wiring necessary for the telephone equipment and Raytheon CRT's. A view of the office is shown in Figure 5. The floor tile is white with dark green striping to give a solid stability in contrast to the pastels, while the brown paneled columns in the office have gold-colored inserts to eliminate any massive effect. The ceiling is a complete series of white
light panels, each with internal polarized sheets to reduce glare on the display devices. It is a tribute to the Raytheon CRT's that the green characters on the dark screen are clearly visible under the excellent lighting conditions.

Figure 5 - Centralized records business office

Second, the American Institutes for Research helped to design the desks for Service Representatives. By performing numerous simulation tests on the Representatives who would actually use the positions, a desk with a 45-degree offset for the CRT was designed. The offset was lowered four inches so that the keyboard would be on the same level as the writing surface. The top is a non-glare, brown formica resting on a cream base with silvered legs.

Third, AIR helped the Task Force build training packages for the new applications based on a process designed for the Air Force. The technique, using a flow-chart approach, clarifies material, stresses interaction of the various tasks and clearly defines individual accountability.

The future of CRBO

A major evaluation effort is now under way. Its purpose is to glean information to enable the Bell System to produce an optimal design for the retrieval stage of its contemplated Business Information System. Twenty-four representatives, divided among personnel from the American Telephone and Telegraph Company, the Bell Telephone Laboratories and the Bell Telephone Company of Pennsylvania, have prepared an evaluation program consisting of eleven major areas. These areas range from testing the effect on subscriber service, to determining the effect on the Service Representative who must deal with a CRT screen all day; from the cost of incorporating CRBO's best features into a major system, to the cost of maintaining it on a stand-alone basis. Outside consultants will be used whenever the expertise is not readily available in the Bell System, and the entire evaluation should take approximately nine months to complete.

Part of the team's functions will be to isolate items which can help the manual Business Office operation. For example, after the first residence groups moved to their new location, they operated with an Automatic Call Distributor and centralized paper record files. Although they had to leave their position every time they needed a record, delays decreased. Citing another case, detailed flow charts and task definitions were prepared for the use of the Task Force System group to design a training package for the mechanized operation. One District now desires to use that preliminary information to review procedures with new supervisors, and to serve as a guide when analyzing contacts during informal training sessions with Service Representatives.

Several advantages are apparent in the mechanized operation. The "Records out of file" condition occasionally encountered in a manual office is non-existent in CRBO. The Service Representatives no longer need to leave their positions to check the centrally located computer printouts which list the latest payments received from subscribers. On-position filing is eliminated. The need to leave the desk to get records in other locations is greatly reduced, and with an average access time of approximately ten seconds, superior service is rendered to the customer.
The treatment functions, permitting the Service Representative to receive a summary of her daily collection obligations, followed by the accounts being displayed in descending order of importance, are of value to the Service Representative. Her typed notations concerning payment arrangements automatically reschedule the account for treatment, eliminating the need for written memos or calendar jottings. The system incorporates a combined training and live file, thus permitting maximum hands-on training for Service Representatives. It has substantially reduced training time by eliminating the need for blackboard and chart work, since test cases available for experimentation can be displayed on the screen by the students.

Since CRBO is specifically a retrieval trial, much paperwork still exists. Communications with the Service Order Typing Room, and treatment or credit notifications to be posted to the billing operation, are still done manually. Since the CRBO file parallels the billing file, it has all the problems associated with duplication and reconciliation. In addition, evening update runs take a considerable amount of time and core, thus precluding the use of the computer for other applications.

After the evaluation, the recommendations of the team will strongly influence the future of the Pennsylvania experiment. The true justification of the trial will come with the use of the evaluation group's data and results to design the optimum system to handle with accuracy and individuality Bell Telephone subscribers throughout the United States of America.

Fourth generation computer systems

by CLOY J. WALTER* and ARLINE BOHL WALTER
Autonetics Division
North American Rockwell Corporation
Anaheim, California
and
MARILYN JEAN BOHL
Honeywell Inc.
Waltham, Massachusetts

*Formerly with Honeywell Inc.

INTRODUCTION

This paper is presented as a discussion of fourth generation computer systems. To predict future developments in the computer industry is to speculate - to theorize on the basis of observable trends and anticipated needs. Numerous questions arise. We do not know the answers to all questions nor do we know how to obtain all the answers. The intent of this paper is to suggest reasonable approaches to developments and to offer a solution to a fundamental EDP problem. How can computers and applications be integrated within a communication and control system?

Computers of prior generations emphasized computation. Fourth generation computers, as envisioned in this paper, will emphasize a communication and control system. The characteristics of fourth generation systems are outlined in the first part of this paper and discussed in detail later. Prior to this discussion, the computer evolution, the software situation, the effects of large scale integration, and fourth generation programming systems are considered. While one cannot predict characteristics of fourth generation systems with certainty, one can confidently assume that many changes in computing will occur. This paper contains speculation concerning the possible changes. Opinions and suggestions within the paper represent a consensus among the authors but are not representative of the company by which the authors are employed.

Characteristics

We believe that a computer system which possesses
The system will be classified as a communicat.ion and control system and will be capable of widely diversified processor applications. 2. The system will be controlled primarily by data rather than by programs as were previous machines. 3. Use of hardware to govern communication and control procedures will be emphasized; extensive use of control programs will be substantially reduced or eliminated. 4. Most processing will be executed in real time; operations will be performed on input at a rate that permits output to be available within the response time required by recipients. 5. The system will be readily expandable. Hardware and "software will be modular in design. Computing power will be modified without redesign of the system. Hardware malfunctions will be corrected by immediate replacement of disabled modules. 6. The hardware design will permit component parts to be updated; systems need not become obsolete. 7. The system will be designed to operate efficiently, and this efficiency will not be significantly affected by distances between connected elements. 8. Most data will be collected at its source. Cards and attendant keypunching operations will be a secondary source of input. 9. Repetitive entry of input will be reduced or eliminated, and the generation of reports will be on an exception basis. 423 424 Spring Joint Computer Conference, 1968 10. The .system will have an efficient, low-cost program generator. 11. The design will emphasize reduction of total system cost. 1.., New' .soff'wa"e ul~l1 b· A s;rnplAr sirnnler ,1"•• i terms· of user convenience, rather than in term·s of function. 13. The system will be designed to function without device-specific software routines. 14. Hardware diagnostic routines will be compatible with I/O routines so that on-line diagnostics can be performed simultaneously· with normal system operations. k. 1 . L 1 l'Y 1.1.1 """ 11!..1 1.,"".1""'- .1..1..1. ... . , .I. Computer evolution First generation computer systems were deveioped primarily for computational purposes. The concept of storing a program to control the operations of a computer and the ability of a computer to cycle repetitively through a sequence of instructions on different data pointed toward use of the computer for computational purposes. Later, the fact was recognized that machines which could be programmed to perform electrical accounting machine (EAM) operations could be marketed. Thus, EDP was born. Overemphasis on programmed control of system elements led to development of and preoccupation with general purpose computers and a concomitant failure to understand the nature of the applications. How many of us learned to program with little understanding of the computer or of the applications to which computers can be applied? How many of us, after learning to program the 650 machine, really thought we understood data processing? To provide modularity and fiexibiiity in generai purpose machines, computer designers delegated obvious hardware functions to· software. A primary design objective was to provide a means by which the user could readily specialize the computer for his particular application. Sorts and other· software routines were developed to perform common functions but these elements were designed to make the general purpose computer fit the area or application. The same application had to be modified or reprogrammed as details within the area or application changed. 
The development of large data management systems and/or operating systems has resulted primarily from a lack of understanding of the nature of the applications to which computers can be applied. Operating systems have tried to blend hardware, applications, and software. It is the application area which is exploding and which will become dominant.

What has the user seen with regard to hardware? First generation hardware was characterized by vacuum tubes, second generation by transistors, and third generation by integrated circuits. Of much more importance to the user, however, was the reduction in cost of main memory (from approximately one dollar per bit to approximately five cents per bit) and the increase in reliability of the machine. Major advances were made in the reliability of both logic and memory when the change from tubes to transistors occurred. More memory available at reduced cost led to the development of more and better software and an attendant increase in the complexity of applications to which computers could be applied. As main memories became larger, more programs could be resident in memory. Throughput was increased by reducing time loss due to execution of program load and unload routines and relocation functions.

Today's user is interested in total system performance and in the total cost of the system rather than in only the cost of the central processing (CP) unit (currently 15-20% of the hardware cost). He is not as impressed by advanced hardware as he is by efficient operation and ease of programming. The user desires a variety of complex applications, but he wants to tell a computer what to do - not how to do it.

The software situation

Past and present programming can be reviewed briefly as follows. First generation software was characterized by machine language, subroutines, and assemblers. Second generation software added higher-level languages, monitors, and macro assemblers. Third generation software includes operating systems, on-line real-time systems, multiprogramming, and data management systems.

Today's computer user sees: sophisticated hardware, complex applications, increases in application programming costs, and user defensive programming, i.e., programming around instead of with the operating system.

No new debugging tools have been developed to complement the increased complexity of applications. The percentage of users who know how their operating systems function is decreasing. Processor time is consumed by the operating system for internal scheduling, accounting, and job handling rather than for job execution.

The following questions arise. Why have we neglected to define software in terms of interfaces, functions, and modules?
Solutions to many programming problems are intermingled with and inseparable from the design of hardware. Engineering personnel must cooperate with software artists to develop a theory of programming based upon an understanding of hardware operations, an understanding of data handling and data control (communication), and an appreciation of software techniques. Only in this manner can redundant programming (repetitive development of programming techniques) be significantly reduced. Parallel developments must be replaced by sequential advancements so that achievements of one individual or group can provide a basis for extended or subsequent advancements by others. Effects of LSI The effects of large scale integration (LSI) on fourth generation computer systems can be examined from the viewpoints of both the manufacturer and the user. Major effects of LSI are (1) computer manufacturers will be forced to fabricate LSI chips, and (2) integrated circuit (I C) manufacturers will enter the computer manufacturing field. Competition will increase. Hardware rentals will be reduced, and software will be easier to use. Effects on the manufacturer Computer manufacturers wW be forced "to fabricate LSI chips. Some of the reasons for this action follow. 1. Intimate knowledge of fabrication techniques and corresponding characteristics is essential to circuit, cell and/or chip design. 2. Purchase of LSI chips reveals significant proprietary information about new developments, particularly in the area of LSI design. Vendors who supply LSI chips to computer manufacturers will have access to complete computer designs. Present legal safeguards 425 of designs appear inadequate. Minor changes to chip fabrication without modification of the function of the chip can be introduced to circumvent legal restrictions. A manufacturer will be dependent upon a selected vendor's ability or willingness to continue to supply required components. Second sourcing will be difficult if determination of the internal manufacturing processes is primarily a function of the original supplier. 3. In-house facilities may be required to provide chips which are not available from external suppliers accordip.g to a schedule which conforms to the manufacturer's needs and priorities. 4. Computer manufacturers who desire to sell to the military will be expected to demonstrate LSI fabrication capability. Today, military customers demand that systems companies possess microelectronic capabilities even if circuit designs compatible with application requirements can be secured from component vendors. 5. Computer manufacturers commonly desire to lead the development of some aspect of hardware. Such leadership will be difficult if research and development of microelectronic circuitry is relinquished to suppliers of components. Manufacturing operations may be reduced to fabrication of interconnection boards and to simple assembly operations. All computer manufacturers must develop design automation capabilities to optimize tradeoffs of performance, function, reliability, cost, and size for integrated semiconductor circuits, thick/thin film circuits, LSI circuits. or..~ny combination of these. Capabilities will be developed-in "order to meet the following general objectives: 1. To develop and utilize optimum circuit fabrication techniques in order to meet requirements for the ma!lufacturer's computer family. 2. 
To build microelectronic - LSI circuits on a pilot basis and coordinate"" efforts" of circuit designers and fabricators during the shift to the use of LSI designs. 3. to fabricate LSI chiPs for which vendors cannot meet delivery, performance, reliability, or price criteria. 4. To develop techniques leading to computer control of design," deposition, m_~sk generation, and testing, and to computer-generated documentation of specifications. To formulate exact plans for LSI activities is impossible. Materials and fabrication techniques for LSI designs are in various phases of exploratory 426 Spring Joint Computer Conference, 1968 research, development, or pilot production. Plans must be flexible and selected equipment must be adaptable to new techniques. We suggest that LSI activities of computer manufacturers will include at least the following groups: I. Cost Relationship 2; Liaison 3. Component Test and Evaluation 4. Circuit Test and Analysis -5. Test Equipment and Instrumentation 6. System Organization The responsibilities within each group are described in the following paragraphs. to develop a basic allocation system program (Xo) plus the numbe~ ot' 'different types of chips (a) times the product obtained by multiplying the estimated cost of specializing the allocation for a particular type of chip (.6X) by the probability (p) that specialization of the aiiocation for a new chip type is necessary. p should be as small as possible. That is, (I) C=Xo+~X' p' a represents the costs for allocation aids, where X::<'sC::<'sX·a. Cost relationship This group will perform cost analyses and determine cost ratios. The cost of fabricating individual LSI cells will be minor when compared to the cost of allocating, partitioning, simulation, routing, and testing. The manufacturer who delegates allocation, partitioning, simulation, and routing functions to a vendor may be forced to forego much of the profit that otherwise might accrue from a computer sale. The number of basic chips to be used and the cost per pattern must be examined. Cost formulas and cost ratios must be developed. Absolute cost figures are not as important as cost comparisons or cost ratios. Fundamentals must be separated from details. If fundamental costs are identified and organized, details can be viewed in proper perspective. Relationships must be understood. What costs should be compared? Costs which should be calculated are: I. Silicon costs, 2. Design aid costs. 3. Engineering development costs, 4. Factory production costs, and 5. Actual cost per computer. . A general approach to cost calculation follows: Let: X = the cost in dollars to develop a generalized allocation program. Y = the cost in dollars for allocation runs. a = the number of different types of chips per computer. b = the average number of chips of each, type per computer. Z = the average cost in dollars for chip fabrication. n = the number of computers to be produced. To = cost of setting up to test a particular chip type. The following statements can be made. The costs for allocation aids (C) is equal to the cost Engineering costs can be considered as (2) Testing costs can be represented as (3) bTo + T ~ T·a·b·n or T=a [To+ ~T·b·n] (4) Factory production cost'is a·b·Z·m·T. Actual cost per computer is (5) Xo+~X·p·a+ Y'a a·b·Z·T+------- (6) n X will probably be large and Z will probably be small, but their values are significant only when used in comparisons; that is, the ratio between a.b.Zand Xo+~X·p·a+Y·a is very important. The n cost ratios' must be optimized. 
Additional formulas can be generated easily. If a·b·Z (which represents the silicon cost) is small when compared to (Xo + ΔX·p·a + Y·a)/n, major inferences can be drawn. Costs mentioned here can then be compared to total computer system cost (which includes programming, support, training, maintenance, and peripheral equipment).

Liaison

LSI liaison will be very similar to collateral effort between engineering and manufacturing. Liaison personnel will:
1. Assist the logic design engineer with the application of LSI.
2. Assist in optimizing gate per chip ratios.
3. Provide information regarding scheduling within the pilot line.
4. Assist in expediting miniature components or information needed for the fabrication process.

Component test and evaluation

Component test and evaluation will determine the characteristics of all fabricated component parts and assist with process control and evaluation. Life tests and environmental and mechanical evaluations of elements can be performed. Failure studies and process evaluations can be conducted to detect process variations affecting component part reliability and quality. To provide a well-organized program, the test functions should be coordinated with component applications and reliability engineering.

Circuit test and analysis

Circuit test and analysis will determine whether circuits can be fabricated in LSI form. Activities include construction of breadboards, preparation of specifications, selection of component parts and processes, and design of circuits and chips. Evaluation tests must be performed to determine if chips meet required performance and quality specifications. Other chips considered to be proprietary and extensively used in company equipment should be designed and fabricated on a speculative basis. This group can prepare an LSI design manual describing available components and fabrication techniques. Guidelines and rules for the preparation of chip layouts according to the various processes can be indicated. New designs can be evaluated and the design manual revised accordingly. The manual should contain a glossary of technical terms and a brief description of the design procedures followed in the development of each type of chip.

Test equipment and instrumentation

This group will conduct electrical inspections and performance evaluation of in-process circuits. They will monitor test equipment requirements and assist as necessary to provide in-process instrumentation. For test of completed circuits, a test console can be constructed which supports standard test equipment in a convenient manner. This test console should utilize a standard test breadboard adapter which contains any special test circuits unique to particular LSI chips. LSI chips should be mounted on some type of a standard test board so that the complete assembly can be inserted into the test breadboard adapter without damaging the leads. Computer-assisted testing is currently limited. Efforts to perform automatic testing should be applied to the design and construction of test probes and fixtures which will be initially operated manually and later integrated with computer-controlled adapters.
Interfaces arising during the application of new technology must be understood, characterized, and implemented. Creative developments should be stimulated and utilized. Leadership and coordination for defining the interfaces among various design groups implementing LS I should come from a group responsible for system organization. Tradeoffs can be analyzed in the following areas: 1. Performance, function, and reliability; , 2. Sizes of chip production runs; 3. LSI application in functions such as emulation, interpretation, compilation, and control; 4. LSI application in logic-in-memory arrays such as sorting, searching, and signal switching arrays for parallel processing; 5. LSI determination of paging, table lookups, and other processes in the operating system; and 6. The utilization of each type of chip per family member. Utilization' analy~is also includes developing methods of partitioning large areas of logic and designing logic structures which can be readily partitioned and interconnected. Resource allocation includes development of methods to provide L'SI techniques to replace resource allocation algorithms. The two major resources to be allocated are subsystems' (CP, memory, programs" buses, communication lines, and peripherals) and functional capabilities (logic, arithmetic, and control): User-oriented systems analysis must be conducted to evaluate tradeoffs resulting from maximum or optimum use of LSI from the manufacturer's point of view and from the user's point of view, to insure maximum benefits to the user. Many hardware and software concepts can be viewed simultaneously in an LSI analysis. Determining the number and variety of basic chips to be used will be an important consideration of sys- 428 Spring Joint Computer Conference, 1968 tems designers. Only fifteen or twenty different chips, and a total of only a few hundred chips, may be" required for a computer. Repetitive use of chip designs is mandatory. Determining the optimum attrihutes and sizes of various LSI memories, logic, and functional arrays wiJI be another responsibility of this group. We expect IC manufacturers to examine functions cur~ently Pt?Tformed by software and develop LSI designs to perform many of these functions. In fact, major advancements in computer software development may come from IC manufacturers who do not currently market computers. Effects on the user The effect of LSI on cost, size, and speed of fourth generation computer systems has been discussed in other articles and consequently is not emphasized in this paper. We can confidently predict that CP cost and size win decrease while speed will increase. The major advantage available to the user through LSI will be that many of the operating system functions which are currently performed by software can be performed by hardware. Operating systems cannot be eliminated, but operating efficiency can be significantly improved. For example, one task of a current operating system is to allocate resources in the computer system. The task of resource allocation can be simplified if some resources allocate themselves. Hardware advancements which can be achieved through LSI include self-allocating input/output channels and auxiliary memories which do not require main memory for control. Additional advantages which can be obtained through LSI include: 1. Microprogramming through use of LSI chips (thus allowing a computer to be reorganized according to its work and its workload), 2. 
2. Control memory structures which vary in the course of the operation of the machine,
3. Improved fault isolation and self-reconfiguration techniques,
4. Increased use of logic to maintain data integrity,
5. Reduced maintenance costs,
6. Less downtime, and
7. Graceful performance degradation through use of majority voting logic.

LSI will allow a single logical element to be replaced by several logical elements in a manner such that the several elements will be used to determine the state or condition of a situation. The state or condition of the situation indicated by a majority of the elements will be accepted as valid - hence, majority voting logic. In any event, it appears that, to the computer user, LSI means lower hardware costs and simpler programming languages.
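The majority-voting arrangement just described is simple enough to sketch. The fragment below is an editorial illustration in present-day Python rather than anything from the original paper; the replicated "elements" are stand-in values for identical circuit copies whose outputs would feed a hardware voter.

    # Illustrative sketch only: majority voting over replicated logical elements.
    def majority_vote(outputs):
        """Return the value reported by a majority of the replicated elements."""
        assert len(outputs) % 2 == 1, "an odd number of replicas avoids ties"
        return max(set(outputs), key=outputs.count)

    replica_outputs = [1, 0, 1]                  # the second replica has failed
    assert majority_vote(replica_outputs) == 1   # the faulty element is outvoted

Because the voter accepts whatever a majority reports, a single failed element degrades nothing; performance degrades gracefully only as additional replicas fail.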
Fourth generation programming systems

To predict languages, degrees of complexity, common techniques, primary considerations, etc., of fourth generation programming systems is both venturesome and difficult. Indeed, to predict in detail or with a high degree of accuracy may be impossible. The effects of LSI on software were discussed in the preceding sections of this paper. This section comprises discussions of design approach, the significance of programming, family planning, user communication with the computer system, a suggested organization for development of fourth generation programming systems, program generation, and charts which depict data and control flow.

First, second, and third generation hardware systems have been designed. Independently, in unrelated efforts, first, second, and third generation software systems have been developed. However, the significance of the total system concept has been disregarded; little, if any, consideration has been given to the formation of first, second, and third generation computing theories. The nature of computing must be re-evaluated, and efforts must be modified accordingly. Cannot creative ideas be applied to integrate software and hardware within effective, useable systems?

Through past generations, computer designers attempted to maximize hardware capabilities (primarily speed). Insufficient thought was given to the user's point of view. For example, he needed a machine which was easy to program, but, in fact, designers seldom, if ever, checked to determine whether their machines would be easy to program or whether programs could be written to maximize utilization of capabilities of system hardware. Major design objectives were to minimize hardware costs, to increase speed, and to plan for batch processing in order to maximize machine throughput. Clearly, insufficient consideration was given to maximizing effectiveness of programming effort.

Today, central processor costs are insignificant if compared to total system costs. Programming costs are often several times greater than hardware rental costs. Designing total systems which not only are based upon reliable, efficient hardware but also can be easily programmed is a practical manufacturing objective.

Designers of fourth generation computers should be familiar with hardware, software, and system constructs. Both software and system theories must be developed in cognizance of hardware practicalities. Indeed, fourth generation computing theories must be developed. Design disciplines, which have been significantly lacking (particularly in software efforts), must be established and followed. Hardware personnel must learn software techniques and contribute to total system design.

Before developing a new computer family, a manufacturer must answer the question, "What fundamental EDP problems do I wish to solve?" Efforts and resources can then be channeled accordingly. Certainly, one fundamental EDP problem is the difficulty of programming. The typical user has neither the desire nor the resources to secure knowledgeable personnel primarily to program his computerized applications. This fact can be a significant deterrent to initial installation or to subsequent upgrading of a computing facility. Fourth generation computer manufacturers must design systems with users in mind.

Proper family (system) planning by the manufacturer is extremely important to the user. To design a family of computers requires discipline; effort must be preceded by forethought. If family members differ only in execution speed and storage capacity, all members present the same logical appearance to the programmer. One instruction set is useable with all models. One specification describes the logical functions of all members of the family. Upward compatibility is easily achieved. Thus, processing under different family members is possible without reprogramming. The user can readily modify his system if he desires.

The difficulty of programming is alleviated in part by the current trend to provide application packages. This trend will continue because small users cannot afford to employ experienced systems analysts. Manufacturers of fourth generation systems will say to the small user, "Submit your data and leave the driving to us."

To the medium-scale user, the manufacturer of fourth generation systems can offer application packages and/or a program generator. If the latter option is selected, the user (commonly, someone not trained in programming or knowledgeable of computers - for example, an accountant) will specialize the system to produce the reports or information that he desires in a form which he specifies. The information flow is shown in Figure 1.

To initiate this information flow, the user operates a desk top input/output device (CRT, TTY, ETY, or other small terminal device) and selects a general program available through the industry-oriented non-resident program generator. He specifies or selects parameter values. Generalized subroutines are fetched from the library by the program generator, and program formation and specialization is completed in the language processor. Special test data supplied by the manufacturer are introduced to test the application program. Processing of the test data produces a representative sample of output which can be expected. If the user is not satisfied, he can enter different parameters and execute another test run or return to the language processor to modify the program which has been created (a schematic sketch of this dialogue follows below).

Figure 1 - Program generation (information flow)

A non-resident program generator that is designed to serve a particular industry will control major constructs of system organization for that industry. Specific requirements of each industry will be recognized; for example, a filing technique will be designed for each major industry.

Program execution with user-supplied data is depicted in Figure 2. Inputs and outputs are effected by means of communication lines. The number of lines is not restrictive.
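As promised above, the generation dialogue of Figure 1 can be made concrete with an editorial sketch in modern Python. The generator, library, and all names below are hypothetical, since the paper describes these blocks only at the diagram level; the sketch merely traces the same flow of selecting a general program, fetching generalized subroutines, and specializing them with user-supplied parameter values.

    # A sketch (ours) of the program-generation flow of Figure 1.
    GENERAL_PROGRAMS = {"payroll": ["read_timecards", "compute_pay", "print_checks"]}
    SUBROUTINE_LIBRARY = {name: (lambda record, n=name: f"{n}({record})")
                          for steps in GENERAL_PROGRAMS.values() for name in steps}

    def generate_program(kind, parameters):
        """Fetch generalized subroutines and specialize them with user parameters."""
        steps = [SUBROUTINE_LIBRARY[name] for name in GENERAL_PROGRAMS[kind]]
        return lambda record: [step(dict(record, **parameters)) for step in steps]

    program = generate_program("payroll", {"pay_period": "1968-04"})
    sample_output = program({"employee": 7})     # manufacturer-supplied test data
    # The user inspects sample_output; if unsatisfied, new parameter values are
    # entered and another test run is executed, exactly as the text describes.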
The reader is referred to the latter part of this paper for a more complete discussion of the execution.

Figure 2 - Data and control flow

The large-scale user will commonly design at least a portion of the programs which must be created to perform specialized functions. The language which he is required to use must be readily understood and easily applied. This language should be used not only to configure the system to perform selected applications but also for inputs and outputs. It must be conversational, thus permitting the user to interface and communicate readily with the system at his disposal. Furthermore, this one language should suffice for all applications - irrespective of the task to be performed.

Software within the system must be comprehensive. A suggested organization for software development and specific responsibilities of each area are shown in Figure 3.

Figure 3 - Software design responsibilities

The input to the software design group by the operating system design group is shown in the upper right portion of Figure 3. A suggested organization to provide input to the operating system design group is shown in Figure 4.

Figure 4 - Operating system design responsibilities

Fourth generation computer designers should consider memory levels and interfaces when planning a computer family. The major levels of memory and interfaces are shown in Figure 5. The memory levels depicted in this illustration differ to some extent from the memory levels common in previous generations.
Figure 5 - Major memory level interfaces

The disconnected lines in Figure 5 indicate major interfaces which shift as required for particular family members. The roles of associative memories and of LSI programmable logic arrays also vary. Read only memories and associative memories can be used extensively in medium and large family members - particularly in establishment of program generators. Programmable logic arrays can perform many of the executive processes currently performed by software and can be used to tailor the system to meet particular user needs. Programmable logic arrays and associative memories can replace operating system programs and be used to establish logical system organization. Associative memories can be used for compiling, job assignment, parallel processing, search operations, handling of priorities and interrupts, and recognition of I/O commands (a brief sketch of such recognition follows at the end of this subsection). Concurrent operation of high-speed peripheral devices will be facilitated.

Interfaces between software segments and equipment with regard to facility assignment, protection, release, accounting, relative priority, scheduling, and interrupt procedures should be consistent throughout a computer family. Register-level designers can correlate software modular designs and physical modular designs (functions, translations, data formats, instruction formats, etc.). One group of system/LSI/software designers working at the register level can establish characteristics of the total system - register size, instruction set, multiprogramming, multiprocessing, etc. This group must also answer questions such as whether register logic, counters, comparisons, and control logic can be optimally handled by LSI or by IC's.

Commendable system design utilizes common majorboards, common memory features, common read only memory units, and common software, thereby reducing the cost of design effort. One set of circuits operating at a uniform clock speed can be designed for the entire family. Systems can be carefully designed to maximize cost effectiveness for the manufacturer and, concomitantly, to maximize potential benefits for the user.
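As noted above, associative recognition of I/O commands can be suggested by a short editorial sketch. The Python below simulates serially what an associative memory would compare in parallel in a single cycle; the table fields and opcodes are hypothetical.

    # Illustrative only: a software simulation of associative recognition of
    # I/O commands.  A hardware associative memory compares every entry against
    # the key simultaneously; this loop merely imitates the result.
    ASSOCIATIVE_TABLE = [
        {"opcode": 0o01, "device": "typewriter", "action": "write"},
        {"opcode": 0o02, "device": "disc",       "action": "read record"},
        {"opcode": 0o03, "device": "multiplex",  "action": "poll"},
    ]

    def recognize(opcode):
        """Return every entry whose key field matches - the associative search."""
        return [entry for entry in ASSOCIATIVE_TABLE if entry["opcode"] == opcode]

    assert recognize(0o02)[0]["device"] == "disc"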
Discussion of the characteristics

It is appropriate to suggest methods or approaches by which the characteristics of fourth generation computer systems can be implemented. Some methods have been suggested in preceding sections of this paper. The implementation of other characteristics is discussed in this section. Implementation of the remaining characteristics and integration of characteristics to form a system are discussed in the next section of this paper.

We have stated that the major design criterion will be optimal use of available communication interfaces. Intrasystem and intersystem communication interfaces are required for both hardware and software. Computer professionals are acutely aware that communication capabilities are an important requirement of the next generation of computers.

Fourth generation systems will be controlled primarily by data rather than by programs as were previous machines; i.e., overall system control will be established primarily by input rather than by stored information. Development of this characteristic is dependent upon submission of information in real time. Feedback is a key consideration. Proper interaction between intersystem and intrasystem interfaces is vital. The interrelationships between data (communication bits) and programs (information bits) must be carefully defined.

Use of hardware to govern communication and control procedures will be emphasized; extensive use of control programs will be substantially reduced or eliminated. This characteristic is closely related to the preceding one. Focalizing system design by application of communication networks eliminates much of the need for software and facilitates system control. Again, consideration of both intersystem and intrasystem elements is important. System allocation of its resources and use of LSI for control have been discussed in previous sections. When such techniques are applied, control program requirements will be minimized.

To write that most processing will be executed in real time is to express an opinion. However, a definite trend within EDP toward more processing of data in real time is readily observable. Real time, as discussed in this paper, does not imply the interleaving of programs or the man-machine interaction of time sharing. It does imply that the system will accept inputs as they are made available and process those inputs within the constraints imposed by desired response times.

The system will be readily expandable in terms of both hardware and software. A variable instruction set is not implied. However, nested subsets of software will be available to complement nested subsets of hardware. In fact, this nesting of software is currently practiced. The user's software commonly includes both action macros and system macros. System macros commonly contain nested macros which perform communication functions for specific terminal devices. Such macros can be removed or specialized. Thus, system modularity results and impetus is given to applying the family concept in terminal design.

An example of functional modularity is a multiplex control device which consists of front-removable elements such as a channel unit, speed/code format decoder, data control unit, and power unit. Desired speed/code combinations in the format decoder can be implemented by replaceable majorboards. Character-rate regulation features for a variety of remote terminals can be established by means of plug-in majorboards. To construct special purpose computers by specialization or combination of generalized hardware and software modules should be possible. Tailoring hardware to the user's particular needs and/or applications appears straightforward.

Hardware modularity can also be applied to interconnected elements. Interconnect designs will include inter-junction, inter-flat pack, inter-majorboard, inter-unit, inter-shelf, and inter-backboard. Disciplined interfaces can be established between the unit interconnect system, the structural system, and the cooling system, each of which will be constructed as a separate and virtually independent element. A complete enclosure of the unit interconnect system can be designed. All modules can be constructed as entities which are front-removable. A significant objective will be to design systems such that all installation and normal service activities can be performed by means of front access. Hardware malfunctions will be corrected by immediate replacement of disabled modules.
Malfunctions in real time systems will be corrected by replacement of disabled modules within a time span of less than one minute. Functional modularity will not only help to alleviate interconnection problems within the module but also permit the interconnection of modules such as processors, I/O channel handlers, memory elements, and peripheral devices. Dynamic system reconfiguration will be possible.

Modular design of system hardware is a basic determinant of the degree to which a system can be updated and of the ease with which such updating can be performed. Functional plug-in elements permit the system to be updated. Advancements resulting from technical developments can be readily incorporated in systems currently in operation. However, modular design should not be regarded as a permanent deterrent to obsolescence of fourth generation equipment. The design of fourth generation systems to permit efficient operation regardless of distances between connected elements is discussed in the last section of this paper.

Collection of data at its source is a trend in the computer industry. On-line collection of data will be the standard rather than the exception in fourth generation systems. Translation of data from a medium understandable by the user to a medium understandable by the computer will be an accepted function of the computer.

Most of the data flowing into and out of computers today is unnecessary. Low-cost mass memory will provide a common data base and reduce or eliminate repetitive entry of data. The generation of reports on an exception basis is a technique of system design rather than a problem of hardware or software. The user must recognize that voluminous reports in themselves do not provide answers and that identification of key factors and organization of pertinent reports accordingly is a preferable approach. On-line submission of data or interrogation of the system from remote terminals will be another technique by which desired information can be entered or secured. An overall system approach is needed to determine answers to questions of storage media, I/O devices, types of input and output, frequency of output, etc.

The development of an efficient low-cost program generator has been previously discussed. Increased emphasis on reduction of total system cost is an obvious trend and needs no explanation. Software must be designed to facilitate user application. Several methods to ease programming difficulties have been discussed in earlier sections. Device-specific software routines will be eliminated because the required functions can be performed by a general software routine and interchangeable functional hardware modules. (See discussion of fifth characteristic.)

Hardware diagnostic routines will be performed during normal system operation. Indication of malfunction will be detected so that corrective procedures can be initiated, thus avoiding costly delays that would otherwise occur. For example, suppose that a diagnostic routine to check multiplex operation is run periodically. If the speed/code format decoder in multiplex unit #3 begins to fail, the operating system is instructed to power up multiplex unit #4 and to switch operations being performed by multiplex unit #3 to multiplex unit #4. A message is typed on the typewriter console that the speed/code format decoder on multiplex unit #3 has failed. Maintenance personnel can remove the defective speed/code format decoder and insert a new functional unit.
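The failover procedure just described can be sketched as follows. This is an editorial illustration in Python; the unit numbers follow the example in the text, while the diagnostic predicate and console interface are hypothetical stand-ins for operating system facilities.

    # A sketch (ours) of the periodic diagnostic and switch-over just described.
    def multiplex_diagnostic(units, decoder_ok, console):
        """On a failing speed/code format decoder, switch work to a standby unit."""
        for unit in list(units["active"]):
            if not decoder_ok(unit):
                spare = units["standby"].pop()        # power up the standby unit
                units["active"].remove(unit)
                units["active"].append(spare)         # operations switched to spare
                console(f"speed/code format decoder on multiplex unit #{unit} has failed")

    log = []
    units = {"active": [3], "standby": [4]}
    multiplex_diagnostic(units, decoder_ok=lambda u: u != 3, console=log.append)
    assert units["active"] == [4] and "unit #3" in log[0]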
Compatibility of diagnostic routines and I/O routines produces several beneficial results. It:

1. Minimizes system downtime due to malfunction of hardware elements.
2. Permits graceful degradation.
3. Eliminates the necessity to interrupt normal processing in order to detect and correct minor hardware malfunctions.

Fourth generation computer systems

What is the fundamental nature of computing? We believe that the basis for computing is data handling (data communication) and data control. Data must be communicated to the system, among system elements, and to external recipients. Data are accepted by the system, stored, and processed. Since the system requires I/O, storage, and processing capabilities, why not develop separate processors to perform these functions in an optimal manner? We suggest that multiprocessing systems similar to the configuration shown in Figure 6 will be widely used.

Figure 6 - Fourth generation computer systems

The three functional processors can be contained within the same hardware unit. The dots indicate that additional processors can be added to the system. The communication architecture of this system will enhance and encourage modularity by assigning to hardware many of the functions currently performed by software. If several of these small processors are in the system, the failure of one of them will decrease system performance only to the extent that the remaining processors of the same type cannot handle the workload.

The operand manipulator processor will perform the application programming function. All logical processes outside of the system control functions will be executed by this processor. A single communication control system program will reside in this processor.

The data storage processor will handle the requirements for associative memory, secondary storage, mass memory, and communications between processors. The logical structure of the system will be centered around this processor. The data storage element will be divided into zones based upon retention times of stored data. All logical communications between processors or processes will be handled by this processor. This arrangement will permit asynchronous communications.

The multiplex processor will include high-speed record channels with interrupt capability and a multiplex channel designed to service a large number of low-speed devices on a time division multiplex basis. These low-speed devices will include badge readers, teletypewriters, process control stations, bank teller window devices, on-line factory test devices, touch tones, keyboards, and CRT's. A full duplex multiplex channel which can send and receive serial data in either a time division multiplex mode or a record mode will be available in the multiplex processor. Automatic poll and call functions for devices requiring such services will be generated by hardware. Input lines will be scanned automatically by the multiplex channel unit, and data will be brought into main core storage where they can be easily accessed by the data storage processor. The multiplex processor will be capable of continuous operation. Its functions will include accepting data into the system, making queue entries to provide the proper data processing functions, receiving the results of processing, and distributing these results to the external system.
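The division of labor among the functional processors can be suggested schematically. The Python below is an editorial sketch: queue names and record formats are ours, and real processors would of course run concurrently rather than being called in sequence.

    # Ours, for illustration: the functional processors of Figure 6 reduced to
    # queue-connected stages communicating asynchronously.
    from collections import deque

    work_queue, results_queue = deque(), deque()

    def multiplex_in(line_data):              # multiplex processor: accept input
        work_queue.append(line_data)          # and make a queue entry for processing

    def operand_manipulator():                # application (logical) processing
        while work_queue:
            record = work_queue.popleft()
            results_queue.append({"result": record["value"] * 2})

    def multiplex_out(distribute):            # distribute results externally
        while results_queue:
            distribute(results_queue.popleft())

    delivered = []
    multiplex_in({"value": 21}); operand_manipulator(); multiplex_out(delivered.append)
    assert delivered == [{"result": 42}]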
Significant modifications to current mass storage units are needed. These modifications should be introduced to develop mass storage units which will be capable of storing up to a billion characters, have no moving parts, and operate at electronic switching speeds.

Current multiprocessing systems are frequently characterized by identical processors used symmetrically. Such an arrangement reduces a multiprocessor system to a multiprogramming system if interlocks and inter-processor communications are ignored. Techniques of multiprogramming are known and give some insight into multiprocessor systems. If processors are allotted for specific functions as proposed in this paper, hardware can control multiplexing, switching between programs, channel allocations, and several of the storage functions. Operating systems will be easier to design and simpler to understand.

Multiprocessing offers potential benefits of speed, because execution is in parallel instead of serial; flexibility, because processing modules can be added without redesign of the system; and increased reliability, because redundant processors allow the system to "degrade gracefully." Such systems are highly adaptable to potential processor applications.

Fourth generation multiprocessor systems will be characterized by record channel units for communicating among processors and a multiplex processor for communicating to external elements. Record channel units will be used to terminate devices which operate on a record-by-record basis and communicate asynchronously with the processor, e.g., drums, disks, and magnetic tapes. The multiplex channel unit will terminate character-oriented devices, i.e., a large number of independent low-speed devices, each operating on a character-by-character basis. The channel will be a serial time division multiplex loop divided into a number of time division slots. The time slots will be detected by adapters. Each adapter will be connected to at least one control unit which provides hardware interface logic between the loop and an addressed device. The multiplex channel will obtain data for the loop from tables in core and return data from devices to these tables on a data replacement basis (a brief sketch of one loop revolution follows below). Direct digital control loops can be attached to the multiplex channel.

Fourth generation computer designers will be cognizant of the importance of tradeoffs and of design interfaces and critical paths between physical modules, physical and software modules, and software modules. Designers of current software are more concerned with software-human interfaces than with intrasoftware interfaces structured to maximize applicability. Tradeoff and interface analyses must answer the questions "How will each change affect the user?" and "How much will each change affect the user?" System tradeoffs in fourth generation computers within communication and control systems will be expressed in terms of response times, communication channel bandwidths, equipment complexities, and numbers of channels.

Facets of interface design that are being established include the following elements:

1. Procedures and standards,
2. Combinations of procedures and standards which function as control elements,
3. Interfaces suitable for use with memory (associative, data only, control only, multiple segment read only, and multiple segment write only),
4. Interfaces between I/O devices, and
5. Interrupt, identification, and other real time and quick time intermodular control functions.
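As promised above, one revolution of the time division multiplex loop can be sketched as follows. This is an editorial Python illustration; the per-slot tables and device data are hypothetical, and a real loop would run continuously in hardware.

    # Illustrative sketch (ours) of the serial time division multiplex loop: each
    # time slot is owned by one adapter/device, and the channel exchanges the
    # slot's contents with a core-resident table on a data replacement basis.
    core_tables = {0: "poll?", 1: "idle", 2: "send:X"}   # core tables, per slot
    device_data = {0: "badge 1234", 1: "", 2: "ack"}     # what each device offers

    def scan_loop(n_slots):
        """One revolution: the core word goes out, the device word replaces it."""
        for slot in range(n_slots):
            outgoing = core_tables[slot]                 # data to the addressed device
            core_tables[slot] = device_data[slot]        # returned on replacement basis
            yield slot, outgoing

    list(scan_loop(3))
    assert core_tables[0] == "badge 1234"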
The address structure of a communication-oriented system will permit comprehensive element identification. In addition, use of truncated addresses within any given environment will assure efficient addressing capability. Proper design of communication modes of operation will remove the responsibility for timing considerations from the application program. In the multiprocessing system discussed in this paper, standardization of interfaces and specifications of standard response times and bandwidths will permit the relocation of application devices in the system.

When the basic nature of applications is considered from a communication and control point of view, the following functions can be identified: data acquisition and reduction, algorithm computation, monitoring, process optimization, and control. These functions can be structured as a horizontal unification of computing elements, I/O and communication elements, and user devices, or data-generating elements. Within all applications, there is also a vertical structuring determined by the specific assignment which the system is initialized to perform. Fourth generation systems must be flexible to permit easy and constant reconfiguration and reoptimization.

SUMMARY

The authors of this paper have attempted to show how computers and applications can be integrated to form a communication and control system. Computer capabilities, tradeoffs, role of LSI, new software systems, examples of design based on interfaces, and overall system configuration have been discussed. Since the primary element that users have in common is data, the development of techniques to achieve data communication and data control is, at the same time, the development of a sound basis for data processing.

One way to stimulate the development of technology is to identify situations in which the results of such development can or must be applied, i.e., to identify and present current or portending needs. The authors have attempted to point out such needs in the computing field. Suggestions and comments in this paper can be considered, studied, and discussed. This discussion can lead to the development of new technology before designs of fourth generation computer systems are finalized. Fourth generation computer systems will be characterized by many of the features advocated herein.

A fourth-generation computer organization

by STANLEY E. LASS

Scientific Data Systems, Inc.
Santa Monica, California

INTRODUCTION

A single processor's performance is limited by its organizational efficiency and the technology available. Paralleling of processors and/or improving the organizational efficiency are the ways of obtaining greater performance with a given technology. Much research has been done on multiple processors and single processors which perform operations on vectors in parallel.
However, significant portions of problems are sequential, and performance in the sequential portions is limited to that of a single processor. This paper describes a proposed new medium- to large-scale computer organization designed to improve single-processor organizational efficiency.

The basis of this approach is the separation of memory operations (fetching, storing) control from the arithmetic unit control. Each control unit executes its own programs. Memory operations programs fetch instructions, fetch operands, and store results for the arithmetic unit. Buffering allows a maximum of asynchronism between the arithmetic operations and the memory operations. To perform a given computation, each control unit executes fewer and less complex instructions than a third-generation computer control unit. The less complex instructions require less time to execute and, since fewer instructions per control unit are required, the computer can operate much faster.

Cost-performance of logic

A logic circuit delay of approximately 0.2 nsec has been achieved on an integrated circuit chip. High-speed logic circuit delay of 1.8 nsec has been achieved in the third generation. Low-cost bipolar logic with 250 gates on a chip at 5 cents/gate has been predicted for 1970. Per-gate costs are presently about 50 cents. Cost-performance of logic will thus be about two orders of magnitude better than in the third generation.

Arithmetic operation times

As a result of this cheaper and faster logic, it will be reasonable to minimize operation times by extensive use of combinational logic and separate functional units (e.g., an add/subtract/logical unit and a multiply/divide unit). The implications of this procedure can be emphasized by estimating the arithmetic operation speeds that will result. These estimates are based on extrapolations from published papers1,2,3 and include an allowance for the additional logic levels required. Also, the estimates assume a 1-nsec delay in the environment for one level of AND/OR logic along the critical path. The critical-path distance will be minimized by a combination of staying on the integrated circuit chip and keeping the path distance between chips short.

Estimated arithmetic operation speeds are:

Operation                                  Elapsed Time    Pipelined Time/Operation
32-bit fixed-point add/subtract            8 nsec          4 nsec
32-bit fixed-point multiply                16 nsec         8 nsec
32-bit fixed-point divide                  56 nsec         -
32-bit logic functions                     8 nsec          4 nsec
32-64 bit floating-point add/subtract      16 nsec         8 nsec
32-64 bit floating-point multiply          20 nsec         10 nsec
32-64 bit floating-point divide            70 nsec         -

With separate functional units, time can also be saved by using functional unit outputs directly as inputs without intervening storage. Pipelining can also be used to increase the throughput. For pipelined operation, the execution of a function is divided into two or more stages, and a set of inputs can be in execution in each stage. The time between successive inputs can be much less than the elapsed time for the execution of a function.

Cost-performance of memory

Memory costs will be roughly halved by batch-processed fabrication. Access times on the order of 100 nsec and cycle times on the order of 200 nsec will be achieved. This represents nearly an order of magnitude improvement in cost-performance over third-generation memories.

Implications of memory technology

Logic speed is increasing relatively faster than memory speed. Cheaper logic makes it reasonable to perform the arithmetic operations in fewer logic levels. As a result, the disparity between arithmetic operation times and memory access times will increase by a factor of roughly two to three. This implies greater instruction lookahead to efficiently utilize the arithmetic unit's capacity - and increased instruction lookahead is difficult to achieve.4 However, a partial solution to this disparity exists and is described in the sections that follow.

Associative buffer and block-organized main memory

A scratchpad memory buffers the processor and main memory. Blocks of words are transferred between the scratchpad memory and main memory. The scratchpad memory and the associative memory together comprise the associative buffer. The operation proceeds as follows: The virtual address of a requested word is associatively checked with the virtual addresses of the blocks in the scratchpad. If the word is in a block in the scratchpad memory, it is output to the processor. If not, the block containing the word is obtained from main memory and stored in the scratchpad memory, and the word is output to the processor. Similarly, when storing a word, the block must be in the scratchpad memory. This is similar to paging in third-generation time-sharing systems and it involves the same problems (e.g., which block to delete or store when room is needed for a new block). The net result is a substantial reduction in access time when the word is in the scratchpad memory.5,6
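The lookup just described can be simulated in a short editorial sketch. The Python below is ours, not the author's design: the 16-word block size follows the paper's example, while the FIFO deletion rule and the capacity are our assumptions, since the paper explicitly leaves the replacement question open.

    # A sketch (ours) of the associative buffer: virtual addresses are checked
    # against resident blocks; a miss transfers a whole block from main memory.
    BLOCK = 16                                # 16 consecutive 32-bit words per block
    scratchpad, age = {}, []                  # resident blocks and a FIFO age list
    main_memory = {}                          # simulated block-organized main memory

    def fetch(virtual_address, capacity=64):
        base, offset = divmod(virtual_address, BLOCK)
        if base not in scratchpad:            # the associative check misses
            if len(age) == capacity:          # room needed: delete the oldest block
                scratchpad.pop(age.pop(0))
            scratchpad[base] = main_memory.get(base, [0] * BLOCK)
            age.append(base)
        return scratchpad[base][offset]       # word output to the processor

    main_memory[2] = list(range(100, 116))
    assert fetch(2 * BLOCK + 5) == 105        # miss: block brought in, word returned
    assert fetch(2 * BLOCK + 6) == 106        # hit: short scratchpad access time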
To provide a basis for comparison, assume a block-organized main memory with each block consisting of 16 consecutive 32-bit words. Eight interleaved block-organized memories of 100-nsec access time and 200-nsec cycle time provide a combined memory bandwidth of over 2 × 10^10 bits/second. Access times from processor to memory are approximately 30 nsec for words in the scratchpad, and 150 nsec for words in main memory. Pipelining through the associative memory and parallel scratchpads is used to achieve a high associative buffer bandwidth.

Assume a fetch or store every 10 nsec, where six percent of these require accessing main memory. The six percent is based on data5 modified to reflect the differences in computer organization. This corresponds to an instruction rate of approximately 80 million per second. This also corresponds to six blocks per microsecond from main memory or 15 percent of bandwidth. With bandwidth usage this low, another processor could be added without severe degradation in performance due to interference. It also allows high input-output transfer rates with modest interference. It is desirable, with this design, to group operands and sequence the addressing to minimize the number of block transfers. This lowers the average access time and lowers the main memory bandwidth usage.

Programming implications

Most programming will be in higher-level languages. The computer cost will be a smaller and smaller portion of the total costs of solving a given problem. The main goal of the designer is to maximize the system throughput with programs written in higher-level languages. The user sees a system that executes programs written in higher-level languages.

The average job execution time does not decrease significantly when the computer speed is increased significantly. The explanation for this seems to be that the number of programmers and the number of jobs they submit each day do not change appreciably, but the jobs they do submit are longer in terms of number of instructions executed; e.g., they try more cases or parameterize in finer increments. The number of instructions executed per job by the operating system (including compilers) will probably not increase by more than a factor of five, even if increased optimization of compiled code and decreased efficiency due to use of table-driven compiler techniques (for lower software cost) are factored in. Operating systems, compiling, and input conversions (e.g., decimal-binary) are essentially input-output functions and their volume is proportional to the number of programmers and people preparing input and reading output.
If the computer speed is increased by a factor of twenty-five, then the operating system (including compilers) time will decrease by a factor of more than five; and the computer will be executing jobs more of the time. Similarly, the proportion of time devoted to byte manipulation, binary-to-decimal conversions, etc., will decrease. Byte, halfword, and shifting operations may not be included in the hardware for the above reasons. Shifting would be accomplished by multiplying by a power of two.

The equivalence of logic design and programming

Both the logic designer and the programmer implement algorithms. Each has to choose a representation of the data involved. Whereas the programmer uses instructions to implement algorithms, the logic designer uses combinations of logic elements (AND, OR, NOT, and storage). In addition to verifying that the logic is correct, the designer must observe the electrical limitations of the logic elements and their connections (i.e., circuit delay, fan-in, fan-out, and wire propagation delay) in order to execute the logic function correctly within the time allotted.

Hardware instruction lookahead is, in effect, a recoding of several instructions to obtain the instantaneous control actions. The hardware recoding and the resulting asynchronism depend on conditions within the computer (e.g., variations in instantaneous memory access time due to interference). Hardware recoding operates in real time at execution time and is strictly limited in complexity by time and economic considerations.

The recoding can also be performed by software at compile time if execution-time asynchronism is sacrificed. All concurrency is planned at compile time. If an instruction or operand were not available when needed (due to memory interference), the control would halt until it became available. The recoded program, containing control timing and sequencing information, would require several times as many bits as the unrecoded program. It would resemble a microprogram with groups of microinstructions to be executed in parallel. The computer time required for recoding at compile time is proportional to the length of the program, not the number of instruction executions required to complete the program. Also, software recoding is not limited by the real-time constraint. As a result, the software recoding can economically be much more complex and more effective.

In the recoded form, operand fetches are initiated several instruction cycles before they are used. For example, the recoded form of the inner loop of a matrix multiply would be several operand fetches, followed by concurrent operand fetches and arithmetic operations, and finally by the last arithmetic operations. The same result could be obtained by an independent operand fetch loop which starts several instruction cycles before the arithmetic operation loop is started. Two separate centers of control are implied (sketched schematically below). Fewer bits are required to represent the program by specifying the two loops separately, but the number of bits is still more than third-generation instructions require.

The proposed computer organization has a separate control unit for fetching and storing (the data channel control unit) and an arithmetic control unit. For comparison, note that the CDC 6600 and the LIMAC7 have separate instructions (but not separate programs) for arithmetic operations and memory operations.
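The two centers of control can be suggested for the matrix-multiply example. The Python below is an editorial sketch: the channel is modeled as a simple queue, and the end-of-record marker convention follows the description of the data channels in the next section.

    # Ours, schematically: an independent operand fetch loop and an arithmetic
    # loop cooperating through a channel buffer, as in the inner product of a
    # matrix multiply.
    from collections import deque

    def fetch_program(a_row, b_col, channel):     # memory-operations control
        for pair in zip(a_row, b_col):
            channel.append(pair)                  # operand fetches run ahead of use
        channel.append(None)                      # end-of-record marker follows data

    def arithmetic_program(channel):              # arithmetic control
        total = 0
        while (pair := channel.popleft()) is not None:   # marker terminates loop
            total += pair[0] * pair[1]
        return total

    channel = deque()
    fetch_program([1, 2, 3], [4, 5, 6], channel)
    assert arithmetic_program(channel) == 32      # the inner product 1*4 + 2*5 + 3*6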
Data channels and their control

Figure 1 shows the data channels which are the information-flow paths in the computer.

Figure 1 - Information flow diagram. Arrows indicate data paths in the computer. Instructions are transmitted to the arithmetic units over paths indicated by dashed arrows. Double arrows are the data paths for each set of eight data channels. Two data paths suffice for eight data channels, since two data items at most are transferred at a time.

Channel commands for multiple-word transfers consist of a virtual memory address, the channel number, a flag indicating load or store pushdown stack, address increment, and count. For loop control during arithmetic operations, an end-of-record marker follows the last operand. An attempt to read the end-of-record marker as data will terminate the loop. A channel can be cleared of previous contents by flagging the first command of a new channel program. All store commands of the old channel program for which data was stored in the channel are properly executed. A channel must be cleared or sufficient time must elapse to store the data before subsequent commands reference that data.

Another channel capability is the capability to load a variable number of words (limited by the buffer size) in a circular register. Its use is primarily for storing instructions and constants within loops. It can be entered by flagging the channel command which specifies the last word in the loop. The first word will then follow the last word until the channel is cleared. This usage of the channel will be later referred to as circular mode.

The input-output register of the channel has a data-presence bit to indicate data availability. The register functions in four ways:

1. Nondestructive read: The presence bit is left on and the register contents remain the same.
2. Destructive read: The presence bit is turned off, the register is filled with the next data word in the channel, and the presence bit turned back on.
3. Nondestructive store: The current contents of the register are pushed down one and the presence bit is left on.
4. Destructive store: The current contents of the register are replaced.

Provision is made for saving channel status, using the channel for another purpose, and later restoring the channel to its original status.

Figure 2 - Data channel buffer

The master control unit, fixed-point arithmetic unit, and input-output unit each have a data channel reserved for commands. Any data stored in these data channels are transmitted to the data channel control unit for immediate execution as a command. The second source of commands is the input-output registers of specified data channels. Commands present in the specified data channels (as indicated by the presence bit) are read destructively from the input-output register and executed. The commands are executed by small, fast, special-purpose computers in the data channel control unit.

As an example, the execution of a single command (received from the master control unit) loads a data channel with commands; the first command loads another data channel with commands for fetching instructions, the second command loads still another data channel with commands for storing data, and the remaining commands fetch operands. Channel command programs loading data can fetch ahead of arithmetic execution a number of words limited by the size of the data channel buffer. Channel command programs end by running out of commands.

Data channel buffering

Channel buffers would be implemented as circular buffers using integrated scratchpad memories. Channel action when used for loading operands is as follows: Initially, both input and output pointers are set to 0 (see Figure 2). The first input requested goes into word 0, and a data-presence bit is set when the input arrives. Each successive input requested goes to the next higher word (modulo 7). The fetch-ahead depth of our example is limited to 8. Output can only occur if the data-presence bit is set. If the instruction turns off the presence bit, the next output comes from the next higher word. While inputs are requested in the command order, they may arrive at the buffer out of sequence, but they will go to the correct word in the scratchpad.
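An editorial sketch of this buffering discipline follows. The Python is ours; the pointer and presence-bit behavior follow the description above, while the use of assertions in place of hardware interlocks is our simplification.

    # A sketch (ours) of the circular channel buffer of Figure 2: eight words,
    # a data-presence bit per word, and wrapping input/output pointers.
    WORDS = 8
    buffer, present = [None] * WORDS, [False] * WORDS
    in_ptr = out_ptr = 0

    def channel_input(word):                  # a requested word arriving from memory
        global in_ptr
        assert not present[in_ptr], "fetch-ahead depth is limited by the buffer"
        buffer[in_ptr], present[in_ptr] = word, True
        in_ptr = (in_ptr + 1) % WORDS

    def destructive_read():                   # output occurs only if the bit is set
        global out_ptr
        assert present[out_ptr], "the control would hang up until the word arrives"
        word, present[out_ptr] = buffer[out_ptr], False
        out_ptr = (out_ptr + 1) % WORDS
        return word

    channel_input("operand 0"); channel_input("operand 1")
    assert destructive_read() == "operand 0"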
The master control unit

Eight double-word data channels supply the master control unit with instructions. The control is selected to one of the eight data channels. Instructions present in the selected data channel (indicated by the presence bit) are read destructively from the input-output register of the data channel and executed. There are five types of instructions:

1. Arithmetic instructions are transmitted to the appropriate arithmetic unit.
2. Channel commands appearing in the instruction stream are transmitted to the channel control unit through a data channel.
3. An all-zero instruction is a no-operation.
4. An instruction is provided to conditionally switch between the instruction unit data channels.
5. An instruction is provided to conditionally skip a specified number of instructions.

Arithmetic unit control

An arithmetic instruction specifies the two inputs, destructive or nondestructive read, and the operation to be performed. The inputs are from data channels and functional unit outputs (see Figure 3). A store instruction transmits the data on an output bus or a data channel to a data channel. One data channel leads to the channel control unit for computed channel commands.

Figure 3 - Arithmetic unit organization. Arrows indicate data paths and direction.

The first stage of instruction execution is testing whether the specified inputs are present. If they are not, the control hangs up until they arrive. During the last stage, the inputs are latched in the functional unit while the operation proceeds. The output-presence bit is set when the operation is completed. The presence bit is turned off by a destructive read of the functional unit output (by an instruction), or by testing the condition code generated by the operation. For example, a compare is accomplished by testing the condition code of a subtract operation. To pipeline, two pairs of inputs must be latched in before the result of the first pair is read.

Trying to read an end-of-record marker results in switching control to the next data channel in the control unit. This is used mainly for terminating loops. To facilitate data exchange between the separate fixed- and floating-point arithmetic units, two data channels are common to both.

Channel commands for instruction and data sequencing

The channel commands for a set of instructions and their data are normally located together in memory. Sequential instructions are executed by transmitting a channel command to the data channel control unit specifying the instructions and their data. Small loops are the same as sequential instructions, except that circular mode is specified in the channel command. Arithmetic data may also use circular modes. Leaving the loop is accomplished by switching to another data channel (by conditional branches) or by trying to read data and getting an end-of-record marker. If this data channel is itself in circular mode, we have a loop within a loop.

Conditional branches are handled by anticipatory loading of a channel with the successful branch instruction stream and switching to the channel if the branch is successful. Unconditional branches are handled at the channel command level by loading the channel with the branched instruction stream. Subroutines are handled by loading a channel with the subroutine (or at least with the beginning of it) and then switching control to that channel. Returning is accomplished by switching back to the original channel.
Some subroutining can be specified at the channel command level by channel commands for an unconditional branch to the subroutine and an unconditional branch back. However, sooner or later all channels will be in use, in which case a channel status is saved, the channel is used, and later the channel status is restored. This is analogous to saving a program location counter, executing a subroutine, and then restoring the location counter.

Hardware design and packaging

The computer is naturally partitioned into nearly autonomous units. Repetition of parts is found in the multiple uses of data channels (which are mostly memory), the special-purpose computers in the data channel control unit, and the associative buffer (which is mostly associative and addressable memory). The hardware complexity of control needed to achieve a high level of concurrency is minimized by separate control of memory operations and of the arithmetic unit. Complex instruction-lookahead hardware is not required.

Input-output

Input-output would be controlled by a small computer with a scratchpad memory for input-output commands and buffering. The small computer is the interface between the peripherals and the data channels. It also generates commands to control input-output data transfers in the data channels.

Time-sharing and multiprogramming

Paging from a rotating memory is the currently popular solution to managing main memory in a time-sharing environment. If, in a typical third-generation system, a 50 × 10^6 instructions/sec processor is substituted and the page access time plus transfer time not changed significantly, the processor will be waiting on pages most of the time. Access time from rotating memories cannot be improved significantly, but transfer rates with head-per-track systems can be very high. This suggests an approach based on the ability to read complete programs from the disc into main memory quickly, process them rapidly, and return them to the disc quickly. This minimizes the prorated memory usage by a program and allows high throughput without an excessively large memory.

The scheduler maximizes processor utilization within the constraint of system response times. Programs normally reside on the disc. The scheduler selects the next program to be transferred to memory. One factor in the selection is the amount of time before program transfer would begin (instantaneous access time).
The program is transferred at 10^9 bits/second (e.g., a 10^6-bit program is transferred in 1 msec). The program is put on a queue of programs to be processed. Having the complete program in memory allows processing without paging to the next input or output by the program, and then writing the processed program into available disc space (generally the first available disc space) without regard to its previous disc location. The previous disc location is added to the available disc space. Scheduler considerations include the distribution of available space around the disc, distribution of ready-to-be-processed programs around the disc, and nearness of ready-to-be-processed programs to their response-time limit. Programs whose processing time exceeds a specified limit are not allowed to degrade the system response time of the other programs. Programs larger than the memory require partitioning into files or pages. Main memory (or optionally a slower, lower-cost random-access memory) and the disc buffer the input-output activity of the programs.

The system could be organized to place FORTRAN users in one group, JOSS users in another group, etc. Each user group would have its own compiler and supporting operating system. A portion of the operating system would be common to all of the groups. The computer would be a "dedicated" FORTRAN system for a fraction of a second, then a "dedicated" JOSS system for another fraction of a second, etc. As a result, operating systems would be simpler and a change could be made in one system without affecting the other systems.
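The swapping policy can be made concrete with an editorial sketch. The Python below is ours: the transfer-rate figure follows the text, while the disc model and the single selection rule shown compress several of the stated scheduler considerations.

    # Ours, for illustration: complete-program swapping from a head-per-track disc.
    TRANSFER_RATE = 10**9                     # bits/second, as stated in the text

    def select_next(ready_programs, disc_angle):
        """One stated factor: least instantaneous access time from here."""
        return min(ready_programs,
                   key=lambda p: (p["disc_position"] - disc_angle) % 1.0)

    def swap_and_process(program, process):
        transfer_seconds = program["bits"] / TRANSFER_RATE   # 10**6 bits in 1 msec
        process(program)                      # runs without paging to its next I/O
        return transfer_seconds

    job = {"bits": 10**6, "disc_position": 0.25}
    assert select_next([job], disc_angle=0.0) is job
    assert swap_and_process(job, process=lambda p: None) == 0.001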
CONCLUSIONS AND OBSERVATIONS

The system described here achieves concurrency of fetching, arithmetic operations, and storing without the need for complex instruction lookahead hardware. The complexity of control is in software. The bandwidth of the processor is over 100 million equivalent third-generation instructions per second. This rate will be achieved for some problems. However, delays due to waiting for operands or instructions in the data channels will lower the processing rate in many cases. Some types of problems seem to inherently have a great deal of delay - for example, table-lookup using computed addresses. Problems in which the flow of control and addressing are not data-dependent could run near the bandwidth of the system. (An additional requirement for this is that the addressing be such that the block transfer rate between main memory and the associative buffer is reasonable.) Optimizing the code consists of (1) minimizing the delays caused by instructions and operands not being available when needed, and (2) pipelining and overlapping the arithmetic operations.

To program for this processor in its machine language, a master control program (instructions) and channel programs (commands) are prepared. There are many chances to make an error and lose synchronism between instructions and commands. As a result of the difficulty in machine-language programming with this organization, even more programming would be in higher-level languages. Initially, the compiler for this computer could be relatively crude and unsophisticated. As time passes, the subtleties and characteristics of the design would be assimilated and experience gained by the compiler writers. As a result, midway through the fourth generation the computer should average 50-80 million equivalent third-generation instructions per second. A more powerful data channel command set than is described here may be desirable for non-numeric applications.8

The only significant way to reduce software cost by hardware is to build a faster computer (with a lower cost per computation), which will then allow the programmer to reduce total costs by using algorithms that are simpler to program but require more computer processing.

REFERENCES

1 C S WALLACE A suggestion for a fast multiplier IEEE Transactions on Electronic Computers Vol EC-13 February 1964
2 M LEHMAN N BURLA Skip techniques for high-speed carry propagation in binary arithmetic units IRE Transactions on Electronic Computers Vol EC-10 December 1961
3 S F ANDERSON J G EARLE R E GOLDSCHMIDT D M POWERS The IBM System/360 Model 91: Floating-point execution unit IBM Journal of Research and Development Vol 11 No 1 January 1967
4 D W ANDERSON F J SPARACIO R M TOMASULO The IBM System/360 Model 91: Machine philosophy and instruction handling IBM Journal of Research and Development Vol 11 No 1 January 1967
5 D H GIBSON Considerations in block-oriented system design Proceedings of the 1967 Spring Joint Computer Conference
6 G G SCARROTT The efficient use of multilevel storage Proceedings of the IFIP Congress Spartan Books 1965
7 H R BEELITZ S Y LEVY R I LINHARDT H S MILLER System architecture for large-scale integration Proceedings of the 1967 Fall Joint Computer Conference
8 B CHEYDLEUR Summary session, proceedings of the ACM programming languages and pragmatics conference Communications of the ACM Vol 9 No 3 March 1966

Optimal control of satellite attitude acquisition by a random search algorithm on a hybrid computer

by WILLIAM P. KAVANAUGH, ELWOOD C. STEWART and DAVID H. BROCKER

Ames Research Center, NASA
Moffett Field, California

INTRODUCTION

Computer implemented parameter search techniques for optimization problems have become useful engineering design tools over the past few years. Many, if not most of the techniques, are based on deterministic schemes which have inherent limitations when the system is nonlinear. Random search techniques have been suggested which propose to overcome some of the difficulties. References 1-3 give good general discussions of the merits of random techniques. Reference 4 develops an algorithm, based on random methods, to solve the difficult mixed two-point boundary value problem that results from an application of the Maximum Principle. The method was shown to be remarkably effective in solving a fairly complex fifth-order, nonlinear orbital-transfer problem.

The purpose of this paper is to discuss the application of the random search algorithm to a still more complex problem to demonstrate its feasibility. The example chosen was the three-dimensional, large-angle, single-axis attitude acquisition control problem in which it is desired to minimize fuel expenditure to accomplish the acquisition. The equations are highly nonlinear since small angle assumptions cannot be made; the control torques are assumed to be limited. This problem is more complex than the orbit-transfer problem in that the dimension of the state vector is greater and the number of degrees of freedom allowed the control action is greater. The same acquisition problem was discussed in Reference 5, but a proportional control law was assumed. A random parameter search was used in that paper to find the optimal set of feedback constants for the given control system structure so as to minimize system performance (fuel). System performances will be compared to indicate the striking improvement in performance with optimal nonlinear control.
In the following sections we will state the control problem for notational purposes, review the random search algorithm developed in detail in Reference 4, discuss the hardware and software necessary for implementing the algorithm, and last, present the results of applying the method to the satellite acquisition problem.

Problem formulation

The problems considered are restricted to those for which the Maximum Principle is applicable. Although familiarity with the principle is assumed, a few remarks are necessary to properly pose the problems we will be concerned with in this paper. The system to be controlled is defined by the vector equation

    ẋ = f(x,u,t)   (1)

where x = (x1, x2, ..., xn), u = (u1, ..., ur), and u ∈ U, where U is the allowable control region. Interest will center on fixed-time problems because of their convenience in computer operations. It will be desired to take the system from a given state x(0) to a final target set S so as to minimize the generalized cost function

    C = Σ (i=0 to n) ai xi(T)   (2)

where x0(t) is the auxiliary state associated with the quantity to be minimized. The target set S, for the example chosen here, will be defined as S = xf ∈ Rn, that is, a fixed point in n-dimensional space. It is well known that application of the Maximum Principle to the stated problem invariably requires the solution of the following set of equations:

    u = u(x,p,t)
    ẋ = f(x,u,t)   (3)
    ṗ = p(p,x,u,t)

where p = (p1, ..., pn). We can see that the Maximum Principle yields a good deal of information about the nature of the control; that is, we know the function u(x,p,t), and we know the equations for x and p. However, at no time do we know the specific values of both x and p. For example, at the initial time, x(0) is generally known from the problem specifications, whereas p(0) is not known. The remaining boundary conditions required will be known at the final time by some combination of components from the x and p vectors. Thus, there is difficulty in solving these equations even numerically, because the known boundary conditions are split between initial and final times.

Random search algorithm

In this paper we will use the algorithm developed in Reference 4 for solving the mixed boundary-value problem. Consequently, we will give only the intuitive account of this approach necessary for the later example application. A direct way of solving the mixed boundary-value problem is to convert it into an initial-value problem. From Equations (3) it is clear that for any arbitrary value of p(0) there will be sufficient information to determine a final state x(T). Since this state will generally be different from the desired state xf(T), we introduce a vector metric J (see Ref. 4 for the significance of using a vector metric rather than a scalar metric) to measure the distance between x(T) and xf(T). It is convenient conceptually to think of the components of the vector quantity J as hypersurfaces in an n-dimensional space of the components of the p vector. Then the boundary-value problem is equivalent to finding the simultaneous minima of all the hypersurfaces. It is important to note that in this case the minimum values are known, i.e., zero. Deterministic approaches for finding the minima of the hypersurfaces have a number of difficulties. For example, the gradient technique requires the calculation, or possibly experimental measurement, of the partial derivatives of the surfaces at each step of an iteration process.
If the surfaces are discontinuous, have many relatively rapid slope changes, or have regions of zero slope, gradient approaches will fail. The random search techniques overcome these difficulties. References 1 through 4 discuss the virtues of these methods in greater detail. In particular, it is demonstrated in Reference 4 that many of the hypersurface abnormalities mentioned above actually occur in even a moderately complex problem.

The random search approach to be used here was described in considerable detail in Reference 4. The approach is based on a direct search of the hypersurfaces by selecting the initial condition vector p(0) from a gaussian noise source, followed by an evaluation of the corresponding values of the hypersurfaces. It was shown that the pure random search is not practical for moderately high-order systems because of the slow convergence to the minimum. However, by making the search algorithm adaptive, the convergence properties were shown to be greatly improved. This was accomplished by: (a) varying the mean value of the gaussian distribution on any iteration so as to equal the initial condition of the adjoint vector on the last successful iteration, and (b) varying the variance of the distribution so that the search is localized when the iterations are successful but gradually expanded in a geometric progression when not successful. Thus, the mean provides a creeping and direction-seeking character to the search while the variance provides an expanding and contracting character.

Figure 1 - Typical boundary cost function surface (boundary cost function versus the adjoint variable p(0))

The behavior of the algorithm is illustrated in Figure 1 by a typical boundary function surface given in only two dimensions. Starting at point 1, the mean of the distribution is made equal to the corresponding value of p(0); the search starts with the small variance indicated and gradually expands in geometrical steps until a lower point on the surface is detected, such as at point 2. Then the mean of the distribution is made equal to the corresponding value of p(0) and the search repeats. The search continues as indicated by the typical numbered points until a value of zero, or some small value ε near zero, is reached. This type of algorithm has some desirable properties that enable it to find the minimum. For example, it has a local minimum-seeking property that is due to the small variance used on those iterations which are successful. The algorithm also has a global searching property that enables it to jump over peaks, which is due to the expansion of the variance when the iterations are not successful. Further, it will not matter whether the surface is discontinuous, or has many peaks, valleys or flat regions.

Implementation

The hybrid computer proved to be the most feasible way to implement the random search algorithm. A primary reason for this is that a relatively large number of iterations are required to find a solution. Reference 4 showed that approximately 8000 iterations were required on the average for a typical solution. Each iteration can be, from a computational point of view, divided into two steps: (1) integration of the equations of motion on the interval [0,T], and (2) execution of the algorithm logic. The analog computer is by far the faster machine in performing step one. Although the second step might be accomplished in approximately the same time with either machine, it is best done digitally.
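Stated in present-day terms, the adaptive logic just described amounts to the following sketch. This is our Python rendering, not the authors' code (which was divided between digital and analog machines); the quadratic test surface, the expansion factor, and the tolerances are illustrative assumptions, and a subroutine call stands in for the analog integration of Equations (3).

    # Minimal sketch of the adaptive random search of Reference 4: the mean
    # of the gaussian generator creeps to the last successful p(0), and the
    # standard deviation contracts on success but expands geometrically on
    # failure, giving the local- and global-seeking behavior described above.

    import random

    def adaptive_search(J, p0, sigma_min=0.01, grow=1.5, eps=1e-3, max_iter=8000):
        mean = list(p0)                      # current mean of the distribution
        sigma = sigma_min
        best = J(mean)                       # boundary cost at the starting point
        for k in range(max_iter):
            trial = [m + random.gauss(0.0, sigma) for m in mean]
            cost = J(trial)
            if cost < best:                  # successful iteration:
                best, mean = cost, trial     #   creep the mean, localize search
                sigma = sigma_min
            else:                            # unsuccessful: expand the search
                sigma *= grow
            if best < eps:
                break
        return mean, best, k + 1

    # Illustrative two-dimensional "hypersurface" with the known minimum of zero.
    J = lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2
    p, cost, iters = adaptive_search(J, [0.0, 0.0])
    print(p, cost, iters)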
Thus the conclusion is reached that a hybrid approach requires a great deal less computer time than a completely digital simulation. It is worth noting that an alternative approach with pseudo hybrid techniques was investigated, using an analog computer and something less than a digital computer. However, our experience shows that inaccuracies, limited storage and limited flexibility in logical operations seriously limit the feasibility of this approach.

In the hybrid implementation, the analog computer was delegated the task of solving the state, adjoint, and control equations as given in Equation (3). It also served as the point at which the operator exercised manual control over the hybrid system. The digital computer was required to calculate the metric, provide storage, implement the algorithm logic, randomly generate the initial conditions for the adjoint equations and, finally, oversee the sequencing of events of the iterate cycle. This division of computational effort is shown schematically in Figure 2, which is a block diagram of the search algorithm. The superscript k shown in this figure designates the generic kth iteration of a long sequence of iterations.

For convenience, the flow of information through the block diagram may be thought to start at p(0), which represents the adjoint initial conditions in analog form. They are applied to their respective initial condition circuits on the analog machine, and then that machine is commanded into an operate mode. The set of Equations (3) are solved on the time interval [0,T], and at time T those analog components we are interested in reading are commanded into a hold mode and the executive sequencing program instructs the A/D converters to read these variables. On the basis of this information the boundary cost function metric is computed digitally, and then tested by comparing it to the last smallest value discussed above. On the basis of this test information, the algorithm logic operates to control the mean value mk and variance σk. A new random vector pk = mk + gk is then generated, converted to an analog signal, and applied to the p(0) initial condition circuits. The whole process is repeated for the next iteration.

Figure 3 is a hardware diagram of the hybrid system used. Shown are the two basic elements of the simulation, the analog and digital computers, along with their coupling system and peripherals. The coupling system is comprised of two distinct parts: (a) the Linkage System and (b) the Control Interface System. A discussion of the hardware used in these subsystems is given in the four sections to follow. The next (fifth) section discusses the sequencing of events through the subsystems during one iteration cycle in order to better describe the functioning of the hybrid system as a whole. Discussed in the final section is the flow graph for the algorithm.

Figure 2 - Hybrid system block diagram for random search method

Figure 3 - Hybrid system hardware

Digital Computer

The digital computer used in the optimization program was an Electronic Associates, Inc. (EAI) model 8400.
The particular machine used has 16,000 words of core memory with 32 bits per word. Memory cycle time is 2 microseconds. The machine uses parallel operation for maximum speed. Floating point operations are hardware implemented. The optimization program was coded in MACRO ASSEMBLY in order to keep the execution time to a minimum. The instruction repertoire includes special commands by which discrete signals can be sent to or received from the external world. External interrupts are provided which can trap the computer to a specific cell in memory. In an example to be discussed later, the optimization program utilized about 8,000 words of storage. Of these, about 1,000 comprised the actual optimization executive program, the remaining 7,000 being used for subroutines, monitor, and on-line debugging and program modification routines.

Analog Computer

The analog hardware consisted of an Electronic Associates 231 R-V analog computer. Since the state equations, adjoint equations, and the control logic were programmed in standard fashion, analog schematics are not included. The analog computer serves as the point at which mode control of the hybrid computer is accomplished. By manual selection of switches, either of two modes can be commanded: (1) In the "search" mode the analog computer operates in a high-speed repetitive manner. Such operation is accomplished by controlling the mode of the individual integrators with an appropriate discrete signal. This signal is a two-level signal which is generated on the control interface in conjunction with the digital computer and, depending on the level, holds an integrator in either "operate" or "initial condition" mode. (2) In the "reset" mode, the integrators are placed in their initial-condition mode and held there.

For continuous type output, a display console was connected to the analog computer to provide visual readout of variables. The display contained a cathode ray tube (CRT) which could simultaneously display up to four channels, and enabled photographic records to be taken of the displayed quantities. The display was extremely helpful in determining if the algorithm was functioning properly.

Control Interface

The control interface between the analog and digital system is an Electronic Associates, Inc. DOS 350 (see Fig. 3). It is through this unit that the iteration process is controlled. An important task allocated to this subsystem is the operate-time control. This function is implemented through the use of a counter and is the key element in the control of all timing in the hybrid simulation. The counter is driven from a high-frequency source in the interface system, allowing for a very high degree of resolution in the simulated operate-time. Also, the interface allows the digital computer to use any conditions in the analog computer which can be represented by discrete variables (binary levels) and to send discrete signals to the analog system to be used as control levels or indicators. An example of the former would be the hybrid system mode control, which merely amounted to the operator depressing the "reset" or "search" switch on the analog computer. This action sets a binary level which is then sensed by the digital computer. An example of the latter situation is when the digital sends the operate command to the operate-time counter. The interface system allows patching of Boolean functions. Hence, some of the logic operations required for timing pulses, event signals, and other like operations were very effectively programmed on it.
Linkage System

The linkage system shown in Figure 3 houses the conversion equipment, the A/D and D/A converters. It is through here that all of the data pass between the analog and digital portions of the simulation. The linkage system is controlled by command from the digital computer. Input to the digital computer is through the A/D converter via a channel selection device, or multiplexer, that selects the analog channel to be converted. Conversions were sequential through the analog channels at a maximum rate of 80,000 samples per second from channel to channel. Output to the analog used the D/A converters, with each data channel having its own conversion unit. The maximum conversion rate of the D/A's used is 250,000 conversions per second.

Sequencing of events during one iteration

The sequencing of events during one iteration cycle is depicted in Figure 4. The instants of time t1, t2, ..., t5 shown in this figure are considered fixed relative to each other, and t1 is conveniently regarded to be the start of the iteration cycle. We will consider the cycle to begin at t1 with the analog integrators in an operate mode. As discussed previously, the elapsed time (t2 - t1) is controlled by a counter on the interface system. At t2 an interrupt pulse is generated on the control interface which is sent to the digital computer,

Figure 4 - Sequencing of events during one iteration (analog calculation period, then digital operation period: read states from analog, execution of algorithm, output data to analog, analog integrators placed in operate mode)

signaling it to commence its operations. Simultaneously, the pulse is sent to the analog to instruct the track-store units to hold their respective values which they had at time t2. During the interval (t3 - t2) the digital computer reads these analog variables with the A/D converter. At t3 the digital sends a pulse via the interface to the analog console which commands the integrators to an initial condition mode. At t4, when the data required by the algorithm have been generated, the D/A converters send these values to the appropriate points in the analog portion of the simulation. The digital machine allows enough time for the transients to settle in the initial condition circuits of the analog before sending a command at t5 that places the integrators in an operate mode and starts the counter. Since t5 and t1 are the same event, we merely repeat the above sequence for repetitive operation.

Some specific numerical values might be of interest. The total iterate time (t5 - t1) is primarily composed of two parts: (1) (t2 - t1), which in a later example problem was scaled in the simulation to 7.5 milliseconds, and (2) (t5 - t2), which was primarily determined by the speed of the digital machine in computing, converting, and generating random numbers; this latter period was on the order of 7.5 milliseconds. Thus, the total iterate time for the above situation is on the order of 15 milliseconds (or 66 iterations per second). This figure is dependent on the control problem chosen and the exact form of the algorithm implemented.

Algorithm flow graph

Figure 5 is a program flow graph showing the software requirements on the iteration process. This basically constitutes a majority of the steps involved in the algorithm and the iteration control sequences utilized by the hybrid system.
Note the inclusion of the event times t1, t2, ..., discussed earlier in connection with Figure 4. The program is continuously recycling in a high-speed repetitive fashion.

Figure 5 - Algorithm flow graph

There are three basic loops in Figure 5 corresponding to the three system modes in the optimization program: a reset loop, a search loop, and an end-state loop. The reset loop initializes the program. The search loop uses the algorithm to search for a solution to the problem. The end-state loop is entered by the digital program when a solution is found, and is used for generating graphic displays. The operator manually selects the search or reset mode as discussed in the section dealing with the analog computer. A more detailed description of these loops is given in Reference 4.

Application

In this section we will discuss the application of the random search algorithm to the single-axis attitude acquisition control problem. In the following we will first formulate the problem by giving a physical description of the problem and writing out the exact equations of motion. Second, we will outline the equations necessary for determining optimal nonlinear control as derived by means of the Maximum Principle; for comparative purposes we will also outline the optimal proportional control derived in Reference 5. Next, we will discuss the boundary conditions and the vector metric. In the final two subsections we will illustrate the computer results: in one we will give a variety of time history solutions and fuel performances, and in the other, some cross sections through the boundary cost-function hypersurfaces.

Formulation

Consider a vehicle V rotating freely about a point b0 in inertial space defined by the set of axes (s1, s2, s3) shown in the sketch. The dynamical equations of motion of a body of fixed inertia rotating about a point in inertial space and acted upon by external torques are given by the following set of equations:

    ω̇1 = {(1 - β)/α} ω2 ω3 + γ1/α
    ω̇2 = (β - α) ω3 ω1 + γ2          (4)
    ω̇3 = {(α - 1)/β} ω1 ω2 + γ3/β

where

    α = I1/I2    roll to pitch inertia ratio
    β = I3/I2    yaw to pitch inertia ratio
    γ1 = M1/I2, γ2 = M2/I2, γ3 = M3/I2    control accelerations, rad/sec², normalized to I2
    ωi = body rates, rad/sec, i = 1, 2, 3
    Mi = torque, lb-ft

(Sketch: inertial axes s1, s2, s3 and body axes b1, b2, b3 with common origin at b0.)

A fixed set of body axes, b1, b2, b3, is ascribed to the vehicle in the principal axes of inertia with origin at b0. The orientation of any body axis with respect to inertial space will be specified by direction cosines, e.g., a13, a23, a33, where a13 is the cosine of the plane angle between s1 and b3, and similarly for a23 and a33. The orientation of the vehicle is specified by a (3 x 3) direction cosine matrix. Since we are interested in single-axis orientation, we will only require knowing three direction cosines instead of nine. In the study we will orient b3 in inertial space.

The three kinematical variables (direction cosines) required to specify the orientation of b3 are designated a13, a23, a33. These variables are related to the dynamical variables by the following set of differential equations (see Ref. 5 for a discussion of this):

    ȧ13 = ω3 a23 - ω2 a33
    ȧ23 = ω1 a33 - ω3 a13          (5)
    ȧ33 = ω2 a13 - ω1 a23

The control torques required to orient the vehicle are produced by mass expulsion devices alined with each of the three axes.
It is assumed that the mass flow rates are used to vary the torques, and that they are bounded (except when examining proportional control). The vehicle is inertially unsymmetrical and is considered to be in a general tumbling motion at the initial time. The objective of control is to apply torques for a fixed period of time in a manner that will reduce total momentum to zero and orient the b3 axis of the vehicle from any initial orientation to any other prescribed orientation in inertial space. Furthermore, we must accomplish this task with the control program that uses the least amount of fuel. This verbal statement of the problem will now be formulated more explicitly.

For this study the specific values of the roll and yaw to pitch inertia ratios were taken to be α = 1.15, β = 0.48. Also, the maximum control torque acceleration permitted in the nonlinear controller situation was limited to approximately one-sixth the peak acceleration required for proportional control, or γ1max = γ2max = 2γ3max = 1.5 rad/sec².

To transfer (4) and (5) into state variable form, the following substitutions are made:

    a13 = x1;  ω1 = x4;  u1 = γ1/α
    a23 = x2;  ω2 = x5;  u2 = γ2
    a33 = x3;  ω3 = x6;  u3 = γ3/β

This gives us the following set of state variable equations:

    ẋ1 = x6 x2 - x5 x3
    ẋ2 = x4 x3 - x6 x1
    ẋ3 = x5 x1 - x4 x2
    ẋ4 = {(1 - β)/α} x5 x6 + u1          (6)
    ẋ5 = (β - α) x6 x4 + u2
    ẋ6 = {(α - 1)/β} x4 x5 + u3

The objective of the control is to take the state vector from an arbitrary initial value x(0) to an arbitrary final value xf(T) in a fixed interval of time [0,T], and use the least amount of fuel in so doing. A new coordinate, proportional to the total fuel used in all three axes, can be defined as follows:

    x0(t) = ∫ (0 to t) Σ |ui(τ)| dτ          (7)

and we can then interpret the objective as the minimization of the terminal value x0(T).

Control Laws

The nonlinear optimal control can be derived by an application of the Maximum Principle. This derivation will not be given here, but a summary of the equations necessary for computer implementation will be given. First are the adjoint equations, which can be shown to be:

    ṗ1 = p2 x6 - p3 x5
    ṗ2 = p3 x4 - p1 x6
    ṗ3 = p1 x5 - p2 x4
    ṗ4 = p3 x2 - p2 x3 - p5 x6 (β - α) - p6 x5 (α - 1)/β          (8)
    ṗ5 = p1 x3 - p3 x1 - p4 x6 (1 - β)/α - p6 x4 (α - 1)/β
    ṗ6 = p2 x1 - p1 x2 - p4 x5 (1 - β)/α - p5 x4 (β - α)

Second are the equations defining the optimal control vector at each instant of time:

    ui(t) = Ni sgn pi+3(t)   if |pi+3(t)| > 1          (9)
    ui(t) = 0                if |pi+3(t)| < 1

where i = 1, 2, 3 and Ni is the maximum torque acceleration allowed in the ith control axis. It is seen that the control torque is of the on-off character and that the torque direction is obtained by assigning the correct sign to the "on" signal according to Equation (9).
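The analog machine's share of one iterate, the integration of (6) and (8) under the switching law (9), can be indicated schematically as below. This is our illustrative Python rendering, not the analog program: a crude Euler integration with an assumed step size stands in for the analog integrators, and the bounds N_i are assumed values (the paper's normalization of the bounds may differ).

    # Sketch of one iterate's "analog" step: integrate the state equations (6)
    # and adjoint equations (8) under the on-off control law (9), then read
    # the terminal state for the boundary metric.

    ALPHA, BETA = 1.15, 0.48          # roll/pitch and yaw/pitch inertia ratios
    N = (1.5, 1.5, 0.75)              # assumed torque-acceleration bounds N_i

    def control(p):                   # equation (9): on-off with a fuel deadband
        return [Ni * (1 if pi > 0 else -1) if abs(pi) > 1 else 0.0
                for Ni, pi in zip(N, p[3:6])]

    def derivs(x, p):
        u = control(p)
        xdot = [x[5]*x[1] - x[4]*x[2],
                x[3]*x[2] - x[5]*x[0],
                x[4]*x[0] - x[3]*x[1],
                (1-BETA)/ALPHA * x[4]*x[5] + u[0],
                (BETA-ALPHA) * x[5]*x[3] + u[1],
                (ALPHA-1)/BETA * x[3]*x[4] + u[2]]
        pdot = [p[1]*x[5] - p[2]*x[4],
                p[2]*x[3] - p[0]*x[5],
                p[0]*x[4] - p[1]*x[3],
                p[2]*x[1] - p[1]*x[2] - p[4]*x[5]*(BETA-ALPHA) - p[5]*x[4]*(ALPHA-1)/BETA,
                p[0]*x[2] - p[2]*x[0] - p[3]*x[5]*(1-BETA)/ALPHA - p[5]*x[3]*(ALPHA-1)/BETA,
                p[1]*x[0] - p[0]*x[1] - p[3]*x[4]*(1-BETA)/ALPHA - p[4]*x[3]*(BETA-ALPHA)]
        return xdot, pdot

    def run(x, p, T=1.0, dt=1e-3):    # the fixed interval [0, T] of one iteration
        for _ in range(int(T/dt)):
            xd, pd = derivs(x, p)
            x = [xi + dt*d for xi, d in zip(x, xd)]
            p = [pi + dt*d for pi, d in zip(p, pd)]
        return x                      # terminal state, read for J_o and J_v

    # Example with illustrative initial conditions (angular rates in rad/sec)
    # and an arbitrary trial adjoint vector p(0) such as the search would draw.
    xT = run([0, 0, 1, 0.17, -0.17, 0.17], [0.5, -0.2, 0.1, 1.2, -1.1, 0.9])
    print(xT)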
An optimal proportional control law used in this paper for comparative purposes was taken from Reference 5 and is discussed briefly here for the sake of completeness. In Reference 5 the structure of the optimal control law was assumed to be of the fixed proportional form of equation (10), containing the parameters D, E, F, G and H. These parameters were left free, and by means of a random parameter search suitable values were found for which the stated objective of the problem (zero momentum and alinement of b3 to s3) was achieved. The search was repeated a number of times, each time observing the system performance given by equation (7). An optimum parameter vector was selected from the set of parameter vectors which satisfied the problem objective and minimized the performance criteria. This parameter vector, in conjunction with equation (10), defines optimal proportional control. This control may be difficult to achieve in practice, however, since no bounds have been imposed on the thrust. When there are bounds, the control law is then referred to as optimal saturating proportional control.

The vector metric

The desired boundary condition at the terminal time was chosen to be zero momentum and alinement of the body axis b3 with the inertial axis s3; this is expressed by xf(T) = (0, 0, 1, 0, 0, 0). To satisfy these boundary conditions, it is necessary in the random search approach to introduce the vector metric J as discussed previously. Its general form was specified in Reference 4 to be J = (Jo, Jv, Jp), where the subscripts on the components refer to displacement, velocity, and adjoint variables, respectively. However, in this application, since all terminal states xf(T) are fixed, p(T) is completely free, so that we may ignore the Jp component in the vector metric. For the present example, Jo and Jv are taken to be

    Jo = √(x1² + x2² + (x3 - 1)²)
    Jv = √(x4² + x5² + x6²)

It is clear that Jv = 0 implies x4, x5, x6 = 0, which represents zero momentum as desired. Also, Jo = 0 implies the desired final orientation x1 = x2 = 0 and x3 = 1. In actual practice we will only require Jo ≤ εo and Jv ≤ εv, where the ε values are chosen to meet the problem requirements. For the specific problem discussed below, the value of εv chosen reduced a 10°/sec initial velocity error in each axis to approximately 0.75°/sec in each axis at time T. The εo chosen required that the b3 body axis be oriented to within a few degrees of the s3 inertial axis, from any initial orientation.

Time history solutions
The desired final condition vector is taken to be xc(T) = (0, 0, 1, 0, 0, 0); thus, the initial momentum is reduced to zero and the body axis b3 is to be kept ali ned with the inertial axis S3' Also, a solution is given for the more general case where initial misalinement exists. . A. No Control- In Figures 6(a) and (b) are shown the time histories of the state variables describing the motion of the system when no control is used and the vehicle starts with the initial condition x(o) = (0, 0, 1, -) 0, 10, 10) This corresponds to initial alinement of axes but with the initial angular velocities indicated. I t can be noted that the angular positions vary over their entire range (-I, 1) during the time interval [O,T]. From these results it might be anticipated that with limited torque control, the wide range of an(a) Fuel ° .65 .28 .31 (0,0,1,-10,10,10) i 1 (0, \12'\12,-1 0, 10, 10) .43 derived in Reference 5 (as discussed above) are given in Figure 7 for the same initial condition as with no control (see Table I). As is well known, optimizing system performance under the assumption of proportional control often leads to impulsive-type control. . -- (a) ANCULAR POSITl0.5 " ~!i~ -- -- -. -~-~ ~~. .1 .............. 't 0- ------- ---- ....... t. -I-~-'" • -- . • __ --. -. ---- -nt;~~.~·· ~y,,-. • •2 DllimlONlESS UIIIT T * (b) AIGtl.AII :~ .. ~ . . : -1 f- 2.5see 0- .. t' _ ". ~~ ~l~L> DEL, then the Ith student is randomly assigned to either the treatment or control group and added to the pool. Naturally, if the pool size equals the number of students still untested at the Ith insertion then the [I + 1] st through Nth students are all paired. This is equivalent to setting DEL= 1 whenever the pool size equals N-1. Eva!~ation and Development for Computer Assisted instruction Programs 455 '" The most important part of the main program is naturally the computation of DEL. When only early scores are available, DEL should be small since there are at this stage many possible future scores;'Y[J], which might satisfy .the inequality IY[MIN] - Y[I] I ~. IY[J] - Y[I] I· Conversely when almost all students have been pretested, DEL should be large since there are fewer scores that can occur between Y[MIN and Y[I].. . The "probability of· mis-pairing," which will be symboiized by A, was used as a criterion for the determination of DEL. Two scores can be said to be mispaired at Stage 1 if one of the N-I "future" scores occurs' between Y[I] and the MINth pool member paired with Y[I]. Now after the Ith insertion, what is the probability that one of the N -I remaining scores will occur in the interval beginning with Y[I] and ending with Y[MIN] (or beginning with Y[MIN] and ending with VEl])? If the remaining N-I scores can be assumed to be independent and uniformly distributed on the interval [0,1] then the probability of mis-paring is exactly A= 1-(1-1 Y(I)- Y(MIN) I )N-I. By solving the above for IY(I) - Y(MIN)I one can determine that DEL is related to A by the function DEL = 1 - (1 - A)l/(N-I). If an allowance is made for a high probability of mispairing, large A, then DEL will be close to one. On the other hand, if one chooses a small probability A then DEL will be small. However, if A and consequently DEL are assigned small values then most students will remain unpaired until the pool size equals N -I, at which point DEL must suddenly become equal to - one. Therefore, the efficient implementation of the CAP main program depends upon some reasonable determination of A and hence DEL. 
To accomplish this, the probability A is considered as a parameter, related only to the sample size N and constant throughout the pairing process. Under this assumption, the pairing criterion DEL is related to I through the formula

    DEL = 1 - (1 - A)^(1/(N-I))

where now both N and A are considered as parameters. By use of the function given in the preceding paragraph, the problem of finding an efficient pairing criterion has been reduced to the problem of estimating the parameter A. An overall criterion of efficient pairing, therefore, must be introduced and the parameter A estimated on the basis of this criterion. The total pair separation was chosen as this criterion. Separation in the case of the score Y[I] and its eventual mate Y[MIN] is defined as simply the distance |Y[I] - Y[MIN]|. Clearly the best sequential pairing method is the one which yields the total pair separation closest to the pair separation possible for the ordinary pairing situation. In the ordinary pairing situation, complete information, in the form of knowledge of all N scores, is available before any student is to be paired. The most efficient sequential pairing algorithm would be the one which best used the limited information available at the Ith stage, i.e., that obtained from the I previously measured scores.

The estimation of the parameter A is made in the following way. For a given sample size N, the estimate A is chosen which minimizes the total pair separation. Estimates of A have been obtained for various sample sizes and used in the CAP program. A program was constructed which simulated the above pairing process and, therefore, could be used to estimate A. The total pair separation was measured for repeated samples of size 50, 100, 150, and 200, where A was chosen as .95*K, K = 0, ..., 18. For all samples the optimal value of A was found to be surprisingly large. Since the alternative method of sequential pairing, which was described earlier, is a special case of the CAP procedure where DEL = 0 for all I from 1 to N/2 and DEL = 1 for all I from N/2 + 1 to N, the observation that A, and hence DEL, should be large for small values of I tends to show that the CAP procedure greatly improves upon the alternative sequential pairing method. In fact, for all sample sizes, the CAP procedure tends to reduce the total pair separation by a factor of at least two.

In Table I the estimated optimal values for A are given, as well as two other estimates which are of interest. These are: first, the size of the pool NI when NI = N - I, i.e., at the time when DEL is set equal to one and all subsequent scores are paired; second, the number of times NI = 0, i.e., the pool is depleted, during the pairing process. Obviously, if the pool is depleted immediately before any stage I other than N, then the Ith score must be entered into the pool. Theoretical and simulation work has shown that the CAP main program provides a substantial improvement over the alternative simple sequential pairing method. Actual trials with real data are currently being conducted to check the implementation of the CAP technique.

TABLE I - Pairing with and without optimal A

    Sample  | Optimal | Expected Pool Size   | Expected No. of  | Total Separation     | Total Separation Using
    Size N  | A       | When Pool Must       | Times Pool Is    | Using CAP with       | Alternative Pairing
            |         | Be Emptied           | Emptied          | Optimal A            | Method (A = 0)
    50      | .55     | ...                  | .200             | 1.35                 | 2.64
    100     | .60     | 3.80                 | .250             | 1.54                 | 4.02
    150     | .65     | 2.95                 | .550             | 1.68                 | 5.01
    200     | .75     | 1.55                 | 1.00             | 1.92                 | 5.01
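The main-program logic, the pool, the criterion DEL, and the depletion rules, can be summarized in the following sketch. The code is ours, not the CAP program itself; it assumes scores already transformed to the uniform interval [0,1] (the subject of the next paragraphs), and the illustrative run below uses uniform random scores.

    # Sketch of the CAP main program: each arriving transformed score Y[I] is
    # paired with its nearest pool member when that member lies within
    # DEL = 1 - (1 - A)**(1/(N - I)); otherwise the student enters the pool.

    import random

    def cap_pairing(scores, A):
        """scores: pretest scores already transformed to [0,1]; A: the assumed
        probability of mis-pairing, chosen in advance for this sample size."""
        N = len(scores)
        pool, pairs = [], []
        for i, y in enumerate(scores, start=1):
            remaining = N - i
            if pool:
                if len(pool) >= remaining:        # DEL forced to 1: always pair
                    delta = 1.0
                else:
                    delta = 1.0 - (1.0 - A) ** (1.0 / remaining)
                j = min(range(len(pool)), key=lambda k: abs(pool[k] - y))
                if abs(pool[j] - y) <= delta:
                    pairs.append((pool.pop(j), y))  # mate's group was fixed at
                    continue                        # random; y gets the opposite
            pool.append(y)                          # enter pool, random group
        return pairs, pool

    random.seed(1)
    pairs, unpaired = cap_pairing([random.random() for _ in range(100)], A=0.60)
    print(len(pairs), "pairs formed;", len(unpaired), "left in pool")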
Since this is obviously a very restrictive assumption, a CAP subroutine is used to preprocess the Ith score and all scores within the pool as soon as the Ith pretest score is available. The remainder of this section will deal with the theory and implementation of this transform subroutine.

The transform subroutine

Let the sequence of N random variates, identically distributed with density function f(x), be represented by X[1], ..., X[N]. Let the function F(x) represent the cumulative distribution function associated with density f(x). The random variables F(X[1]), F(X[2]), ..., F(X[N]) are uniformly distributed on the interval [0,1]. Therefore, the problem of transforming the pretest scores to suit the requirements of the CAP main program is related to the problem of cumulative distribution function estimation. The sample cumulative or step function F*(x) would in ordinary circumstances be considered a good estimate of F(x). However, for the purposes of CAP preprocessing it is a poor estimate. The step function F*(x) equals

    F*(x) = (1/n) Σ (i=1 to n) [ I(xi,b](x) + (1/2) I[xi,xi](x) ]

where xi represents an arbitrary sample point. F*(x) is not a smooth or differentiable function. Also, since 2nF*(x) must be an integer for every value of x, F*(x) would distort the "local spacing" of the transformed values. Since in the CAP main program the spacing of consecutive points is particularly important, it is obvious that F*(x) is not a suitable transformation to uniformity. What is required is a smoother estimate of F(x) which can be updated easily as new pretest scores are obtained.

An estimate of F(x) which not only fulfills the above two requirements, but also is more efficient than F*(x), has been investigated by two of the authors (Refs. 1, 2), and a few of its properties will be reviewed in this paper. This estimate will be represented as F̂m(x), where

    F̂m(x) = (x - a)/(b - a) + Σ (k=1 to m) Ĉk [(b - a)/(kπ)] sin[kπ(x - a)/(b - a)]

with

    Ĉk = [2/((b - a)n)] Σ (i=1 to n) cos[kπ(Xi - a)/(b - a)] I[a,b](Xi)

and n represents the number of data points X1, ..., Xi, ..., Xn, and "a" and "b" are two predetermined constants, preferably such that for most Xi the inequality a ≤ Xi ≤ b will be satisfied. It is shown in Reference 2 that as m approaches infinity, F̂m(x) approaches F*(x). Also, for all densities with bounded variation, e.g., all continuous distributions commonly encountered in statistical research (the Normal, Cauchy, Laplace, Gamma and Logistic), F̂m(x) is a more efficient estimate than F*(x). Here efficiency is measured in terms of Mean Integrated Square Error J(F̂m), where

    J(F̂m) = E ∫ [F(x) - F̂m(x)]² dx

In Reference 2 it is also shown that the constant m associated with the most efficient estimator of the form F̂m(x) is usually less than 10. Consequently, F̂m(x) provides both an easily computed and a smooth estimate of F(x), and F̂m(x) is actually more efficient than F*(x). In References 1 and 2, a rule for determining the optimal value of m is given. This stopping rule is based on an unbiased estimator of the Mean Integrated Square Error J(F̂m). Also, a computation scheme is given which allows the constants Ĉk of F̂m(x) to be computed recursively.
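As an illustration of what such a recursive scheme might look like, consider the sketch below. It is ours, not the computation scheme of Reference 1: the interval [a,b] and the fixed m are assumed for the example, and the stopping rule for m is omitted.

    # Sketch of the Fourier-series estimate of F(x): the coefficient sums are
    # accumulated recursively as each new pretest score X_i arrives, so the
    # transform used by CAP can be refreshed cheaply.

    from math import cos, sin, pi

    class FourierCDF:
        def __init__(self, a, b, m=10):
            self.a, self.b, self.m = a, b, m
            self.n = 0
            self.s = [0.0] * (m + 1)          # running cosine sums, k = 1..m

        def add(self, x):                     # recursive update for one score
            self.n += 1
            if self.a <= x <= self.b:         # indicator I_[a,b](x)
                for k in range(1, self.m + 1):
                    self.s[k] += cos(k * pi * (x - self.a) / (self.b - self.a))

        def cdf(self, x):                     # F_m(x) as written above
            a, b, n = self.a, self.b, self.n
            total = (x - a) / (b - a)
            for k in range(1, self.m + 1):
                Ck = 2.0 * self.s[k] / ((b - a) * n)
                total += Ck * (b - a) / (k * pi) * sin(k * pi * (x - a) / (b - a))
            return min(1.0, max(0.0, total))  # clip to [0,1] for use by CAP

    est = FourierCDF(a=0.0, b=100.0)
    for score in (61.0, 47.5, 88.0, 52.0, 70.5):
        est.add(score)
    print(round(est.cdf(60.0), 3))            # transformed value Y for a score of 60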
Since the estimator of F(x) should be revised after each new pretest score becomes available, the recursive computation of the Ĉk, and hence of F̂m(x), represents a considerable saving in terms of computer time.

Implementation of CAP

The following is a brief outline of the implementation of CAP:

A. Since it is likely that no a priori information about the form of F(x) will be available, the first 20 students are added to the pool and assigned at random to the treatment or placebo group.
B. Using the previous 20 as well as the 21st pretest score, the transformation F̂m(x) is determined. The procedures used to compute the Ĉk and the stopping rule for determination of m are given in Reference 1.
C. The original 21 pretest scores X[1], X[2], ..., X[21] are transformed by means of F̂m(x) to Y[1], Y[2], ..., Y[21].
D. The score Y[MIN] is determined, where |Y[MIN] - Y[21]| ≤ |Y[J] - Y[21]| for all J from 1 to 20.
E. An estimate of A has been read into the program to suit the eventual sample size N of this particular CAI experiment. The pairing criterion DEL is computed: DEL = 1 - (1 - A)^(1/(N-21)).
F. If |Y[MIN] - Y[21]| ≤ DEL, the 21st and the MINth students are paired, and the 21st student is assigned to the treatment group if the MINth student was assigned to the placebo group, or vice versa.
G. If |Y[MIN] - Y[21]| > DEL, the pool size is increased to 21 and the 21st student is randomly assigned to either the treatment or placebo group.
H. This process is repeated as the 22nd through Nth students' pretest scores become available. However:
1. For each new score, the constants Ĉk are updated and a new value of the transform Y[I] is calculated for each student in the pool.
2. If at any time before the last pair is formed the pool is emptied, the next student is entered into the pool. (Equivalently, DEL is set equal to 0.)
3. When the pool size equals the number of students still to be tested, all subsequent students are paired with their closest counterparts within the pool. (Equivalently, DEL is set equal to 1.)

By following steps A through H, each student within every pair has been randomly assigned to the treatment or placebo group. Also each student, i.e., the Ith, is available for the treatment even though the pretest has yet to be administered to N - I students.

The construction of CAI programs

Up to this point a method for conducting a CAI experiment has been described, but no comment has been made concerning the source of data for such an experiment. In this section a brief description will be given of a particular CAI project, and the process of program development rather than evaluation will be emphasized. For the past two years the School of Public Health at The University of Michigan has been conducting an extensive experiment on the effect of Computer-Assisted Instruction within a section of a large university. This project has already generated ten programs of more than intermediate size, although much data has yet to be gathered before any final conclusions can be announced. Programs have been written in such diverse areas as Biostatistics, Epidemiology, Environmental Health, Public Health Dentistry, Public Health Education, and Industrial Toxicology. Several different procedures have been used to construct CAI programs and, therefore, our observations about these procedures may be of value to other workers in this field.
The construction of CAI programs is an expensive process at this early stage of hardware development, and our observations may suggest shortcuts and point out pitfalls. The best place to start when discussing CAI program construction procedures is with the personnel who actually participate in the construction process. Four categories can be listed:

A. The person who will actually be responsible for the implementation of the completed programs and who initiates the construction process. At the University this will usually be the professor who wishes to use the material in one of his classes.
B. Subject matter oriented staff working under the supervision of the person in category A. Here the term "subject matter oriented" is used to distinguish this type of person from CAI programmers.
C. CAI programmers - with experience and training in education or psychology, but with little background in the specific subject matter areas to be programmed.
D. Teaching assistants, research fellows, trainees, and others, who might be called the transient workforce.

A brief description of the process itself follows. The list given below is, of course, a highly idealized one. However, like any other form of computer programming there are definite clear-cut stages, e.g., flowcharting, coding, debugging, which must be carried out before a satisfactory working program is obtained.

1. An initial decision about the subject matter to be taught must be made.
2. The level of sophistication of the students who will take this program should be determined.
3. Should remedial sections be provided?
4. If yes to No. 3, how elementary should the material in the remedial sections be?
5. Should advanced sections be provided for the brighter students?
6. If yes to No. 5, how advanced should these sections be?
7. A list of the specific concepts that must be presented in a teaching program on this subject should be constructed.
8. The sequential order in which these concepts should be presented must be determined.
9. At least one question for each concept or fact that tests the understanding of the students must be written.
10. Information that must be presented to the student along with each concept that is presented must be determined.
11. A list of typical misconceptions by students should be made.
12. Constructive responses that would correct these misconceptions should be listed.
13. At least two general constructive comments to be presented to the students who respond with an answer that was not anticipated should be written.
14. Pictures or graphs that may be helpful to the students in the understanding of the concepts or facts presented should be obtained.
15. Appropriate use of slides, tapes and typeouts should be determined.
16. The general flow the program is going to follow should be decided upon.
17. The prepared materials must be programmed, i.e., coded, in the computer language used.
18. The program must be entered into the computer.
19. The coding should be debugged.
20. The program should be tested (using students) to determine if it actually teaches.
21. Appropriate content revisions indicated by early student testing should be made.
22. Observations should be made of the performance of at least ten students as they take the program, and their comments and questions should be noted.
23. Further changes as indicated from students' reactions to the program should be made.

A number of methods have been used in writing programs in the School of Public Health. All of the methods involved the 23 steps described above. The most successful of the alternatives we have tried involves the professor (A), a graduate student or staff member (B or D), and the programmer (C). The professor and his assistant (A and B, or D) complete steps 1-8 (the general outline of content materials to be used, with a specification of the desired upper and lower limits of the materials). The student carries through steps 9-15 and discusses step 16 (general flow of the program, with the possibilities of branching) with the programmer (C). The programmer handles steps 17-19. Then the responsibility is again assumed by the student and the professor for steps 20-23. Throughout this period, channels of communication have been established between the three people involved. When the work is distributed in this manner, all concerned seem to find the time and maintain interest in the program. The programmer is available for consultation in regard to possible uses of the computer, and the graduate student can solve the majority of the content problems on his own.

One problem we have found in implementing this method has been a lack of interest on the part of some graduate students. They have felt the typical pressures for research, as opposed to teaching, as a prerequisite for advancement within their own fields. The attitude of graduate students in general toward teaching was often a negative one. However, we have found that certain students who plan to teach in the future take this opportunity to develop teaching materials very seriously. This is also noted in Reference 3. They learn how important it is to break topics into small sections and sequence them in a logical order. They are often more strongly motivated after they observe students taking their programs. Frequently advanced graduate students (D) or professional staff (B) do not realize how a misplaced fact or lack of information can lead to misconceptions on the part of the learner who is not familiar with a subject.

Our experiences lead us to believe that CAI programs must be written with full cooperation and communication between the professor (who devotes as much time as possible) and the programmer. To save the professor's time, a student or staff assistant who is familiar with the subject matter, together with the programmer, carries through several of the 23 steps. Very few of the professors at the University of Michigan School of Public Health have had the time needed for optimal participation in a project of this type. Therefore, the professor-student (or staff assistant)-programmer combination is usually the most feasible. This method also enables the programmer to participate with a number of professor-student pairs. The quantity and quality of programs written in this way, in our opinion, represents a great improvement over other methods attempted. This statement, of course, is now being rigorously verified using CAP in conjunction with other evaluation procedures.

CONCLUSIONS

Computer-Assisted Pairing is a dynamic design for paired comparison evaluation of Computer-Assisted-Instruction sequences.
It appears to be of considerable practical value, since the pretest-treatment-posttest sequence can be made invisible to the subject. Simulation studies have shown CAP to be substantially superior to previously considered alternatives. The application of the technique is certainly not limited to Computer-Assisted Instruction. CAP can be applied fruitfully to almost any experimental situation where paired comparisons are needed. It is especially useful when task initiation and evaluation can be done by the computer.

ACKNOWLEDGMENT

The authors gratefully acknowledge the advice and support of Drs. Karl Zinn and Stanford Ericksen of the Center for Research on Learning and Teaching, University of Michigan, and Associate Dean John Romani of the University of Michigan School of Public Health. Also we would like to thank Miss Wendy Hiller for help in checking the final manuscript of this paper.

REFERENCES

1 M TARTER R J HOLCOMB R KRONMAL
A description of new computer methods for estimating the density of stored data
Proceedings 1967 ACM National Conference 511:9 1967
2 R KRONMAL M TARTER
The estimation of probability densities and cumulatives by Fourier series methods
Accepted for publication Jour Am Stat Assoc
3 M TARTER
Programmed instruction in statistics from the professor's point of view
The American Statistician 21:28-31 1967
4 V CLARK M TARTER
Preparation for basic statistics: automated text
McGraw-Hill 1967
5 J E COULSON
Programmed learning and computer-based instruction
Wiley 1962
6 W FEURZEIG
A conversational teaching machine
Datamation 10:6 1964
7 K L ZINN
Survey of materials prepared for instruction or research on instruction in on-line computer systems
Automated Education Handbook E Goodman (Ed) Detroit Automated Education Center 1965

Computer capacity trends and order-delivery lags, 1961-1967

by MICHAEL H. BALLOT and KENNETH E. KNIGHT
Stanford University
Stanford, California

INTRODUCTION

This paper examines the growth of computational facilities in the U.S. from the end of 1961 to September of 1967, exploring the dynamics of the growth process and attempting to link it to specific market events. The dynamics of supply and demand for general purpose computational capability points to many problems for both producer and user. This paper considers one of these, pertinent to the ordering policy and planning of the firm seeking to acquire EDP equipment for, say, systems conversion: the delivery lag, or average time between equipment orders and delivery. This lag is studied and estimated using two separate empirical models.

Growth in computers

The tremendous growth of computers in this country, and abroad, is a well-known and accepted fact of our times, often referred to as the "Computer Age." A measure of this growth in recent years, based on the Monthly Computer Census reported in Computers and Automation, shows a definite exponential trend. The curves, fitted to data compiled for the cumulative number of machines installed and on order, are shown for three groupings: IBM, the "Big 8"* (Burroughs, CDC, G.
E., Honeywell, IBM, NCR, RCA, and Univac), and All companies as reported in the Computers and Automation Monthly Census (the "Big 8," Autonetics, Bunker-Ramo, Data Machines, DEC, Electronic Associates, EMR Computer Division, Philco, Raytheon, SCC, SDS, and Systems Engineering Labs). [See Figures 1, 2, and 3.]

For cumulative machines installed, the following equations, in logarithmic form, were estimated:

(1) for the number of machines installed, IBM,

    ln N^in_t = 3.362 + .0761t* , R² = .959;
                        (.00351)**

(2) for the number of machines installed, the "Big 8,"

    ln N^in_t = 3.597 + .0781t , R² = .989; and
                        (.00188)

(3) for the number of machines installed, "All,"

    ln N^in_t = 4.233 + .0717t , R² = .996.
                        (.00107)

*t, the time in quarters, is measured from 9/44 (t = 0), the date of introduction of the first general purpose digital computer, the Harvard Mark I.
**This is the standard error of the regression coefficient, s_b.
*Inclusion in this group was determined by the number of machines installed and on order and the average value of these.

Figure 1 - Time series, machines, IBM

The curves fitted to cumulative machines on order, in logarithmic form, are as follows:

(1) for the number of machines on order, IBM,

    ln N^o/o_t = ... + .0704t , R² = .770;
                        (.00861)

(2) for the number of machines on order, the "Big 8,"

    ln N^o/o_t = 4.264 + .0631t , R² = .843; and
                        (.00609)

(3) for the number of machines on order, "All,"

    ln N^o/o_t = 4.365 + .0626t , R² = .874.
                        (.00532)

Figure 2 - Time series, machines, "Big 8"

Figure 3 - Time series, machines, "All"

The results are gratifying, with high statistical significance and high correlation. The three groups show quarterly growth in facilities of better than 7%, with the growth of those of the "Big 8" manufacturers as high as 7.8%. These general results apply also to the curves fitted to the number of machines on order, with quarterly growth percentages about a point lower. Tests were applied to the differences in these trends,* and only those of "All" vs the "Big 8," for machines installed, showed a statistically significant difference (at better than the 1% level). Tests were also applied to each group, comparing the trends for machines installed vs machines on order; again, only one comparison was statistically significant (at 5% or better), this time the installed and on order trends for the "Big 8." Thus, most conclusions that are drawn from inspection of the results, intergroup and between installations and orders, can only be looked on as indicative.

These indicate that new installations are growing faster than orders. This implies, given a constant machine mix in production, a small decrease in delivery lag. Further inspection of the statistics shows that the "Big 8" (1.5% difference) and all companies reported (.9% difference) seem to be working appreciably on their backlogs, while IBM lags in this sense. These results can be interpreted as indicating that order backlogs are growing faster for IBM than for the other groups. This interpretation would be compatible with the success of the System/360 in the market coupled with the lag of IBM's production schedules for this system.
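The trend estimates above are ordinary least-squares fits of ln N on t. The following sketch reproduces the computation in present-day Python; the six census counts used in the example are invented for illustration (the published estimates come from the full quarterly series).

    # Sketch of the log-linear trend fit ln N_t = a + b t used throughout this
    # section, with the standard error of the slope as quoted in parentheses.

    from math import log, sqrt

    def loglinear_fit(t, N):
        y = [log(v) for v in N]
        n = len(t)
        tbar, ybar = sum(t) / n, sum(y) / n
        Stt = sum((ti - tbar) ** 2 for ti in t)
        b = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y)) / Stt
        a = ybar - b * tbar
        resid = [yi - (a + b * ti) for ti, yi in zip(t, y)]
        s2 = sum(r * r for r in resid) / (n - 2)       # residual variance
        se_b = sqrt(s2 / Stt)                          # std. error of the slope
        r2 = 1.0 - sum(r * r for r in resid) / sum((yi - ybar) ** 2 for yi in y)
        return a, b, se_b, r2

    # Invented illustration: quarters t (measured from 9/44) and installed counts.
    t = [70, 74, 78, 82, 86, 90]
    N = [7500, 10300, 14000, 19500, 26000, 36000]
    a, b, se_b, r2 = loglinear_fit(t, N)
    print(f"ln N = {a:.3f} + {b:.4f} t  (s_b = {se_b:.5f}, R2 = {r2:.3f})")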
*One factor needing amplification is the seemingly lower growth of installations of "All" compared with the "Big 8"; this may be explained by the decreasing percentage of the market for machines manufactured by the small-volume producers.

The preceding observations, as a basis for analysis of computer growth and production lags, seem in general valid but incomplete. To further explore the growth in computational capability, quarterly time series for cumulative machine power** were constructed for IBM and the "Big 8," and exponential curves were fitted to these [see Figures 4 and 5]:

(1) for cumulative power installed, IBM,

    ln P^in_t = -6.88 + .140t , R² = .934;
                        (.00848)

(2) for cumulative power installed, "Big 8,"

    ln P^in_t = -8.95 + .172t , R² = .991;
                        (.00375)

(3) for cumulative power on order, IBM,

    ln P^o/o_t = -14.7 + .249t , R² = .835; and
                        (.0254)

(4) for cumulative power on order, "Big 8,"

    ln P^o/o_t = ... + ...t , R² = ... .
                        (.0206)

*See 1, p. IV-1.
*See 2, pp. 138 et passim.
**See 3, pp. 40-54, and see 4, pp. 31-35.

Figure 4 - Time series, computational power, IBM

Tests on trend differences were carried out on these results, and the difference between groups for the cumulative power installed, 3.2%, was statistically significant at better than the 1% level. We compared the power time series results for IBM and the "Big 8" with the machine time series for these two. The similarity between groups is evidenced in both, but the comparison of installations and orders shows opposite differences; that is, power orders seem to be advancing at a greater quarterly rate than installed power, while the opposite is true for the machine series. The backlog of computational power is growing at about an 11% differential per quarter for IBM and about 6% for the "Big 8" (probably mostly due to IBM's inclusion in this group). This trend can be credited to the large and powerful third generation computers, being ordered at an accelerating rate and being supplied with a considerable production lag.

The results seem to further accentuate the premise that IBM is (or was)* having production difficulties with its new series. The growing orders for IBM computational power (at a quarterly rate of nearly 25%, almost 11% ahead of deliveries) is a testimony to the advanced selling of System/360 and, as previously asserted, is a production headache on a grand scale. The differential growth of computational power demanded for now and the future vs power supplied now is also evident in the statistics for the "Big 8." It should be remembered, however, that there is a damping effect due to the inclusive nature of this group. Further breakdown of the data may yield interesting inter-company results, but this is not the main purpose of this paper. Rather, the preceding analysis is meant to lay the foundation for the study of the problem of delivery lags as they affect the planning of the firm engaged in expansion or replacement of its EDP facilities.

Figure 5 - Time series, computational power, "Big 8" (A: public announcement of System/360; B: projected first delivery; C: projected delivery of first 1000 of System/360)

*Residuals (ln P_obs - ln P_calc) of the regression (in logarithmic form) of P^in = ce^(at) for IBM show a steady increase at higher values of t. Observed values steadily diverge upward from the regression line for 12/66, 3/67 and 6/67, about one year after the first 1000 new System/360 computers were scheduled to be delivered. Graphically: see the footnote figure (analysis of residuals, ln P^in = a + bt, for IBM).
The similarity between groups is evident in both, but the comparison of installations and orders shows opposite differences; that is, power orders seem to be advancing at a greater quarterly rate than installed power, while the opposite is true for the machine series. The backlog of computational power is growing at about an 11% differential per quarter for IBM and about 6% for the "Big 8" (probably mostly due to IBM's inclusion in this group). This trend can be credited to the large and powerful third-generation computers, being ordered at an accelerating rate and being supplied with a considerable production lag. The results seem to further accentuate the premise that IBM is (or was)* having production difficulties with its new series. The growing orders for IBM computational power (at a quarterly rate of nearly 25%, almost 11% ahead of deliveries) are a testimony to the advance selling of System/360 and, as previously asserted, a production headache on a grand scale. The differential growth of computational power demanded now and for the future vs power supplied now is also evident in the statistics for the "Big 8." It should be remembered, however, that there is a damping effect due to the inclusive nature of this group. Further breakdown of the data may yield interesting inter-company results, but this is not the main purpose of this paper. Rather, the preceding analysis is meant to lay the foundation for the study of the problem of delivery lags as they affect the planning of the firm engaged in expansion or replacement of its EDP facilities.

*Residuals (ln P_obs - ln P_calc) of the regression (in logarithmic form) of P^in = ce^{at} for IBM show a steady increase at higher values of t. Observed values steadily diverge upward from the regression line for 12/66, 3/67, and 6/67, about one year after the first 1000 new System/360 computers were scheduled to be delivered. Graphically: see the footnote figure.
[Footnote figure - Analysis of residuals, ln P^in = a + bt, for IBM; annotations: A, public announcement of System/360; B, projected first delivery; C, projected delivery of first 1000 of System/360]

Delivery lags

It has been conjectured that there exists a substantial lag between the ordering and installation of new and expanded computational equipment.* This section explores this problem and tries to estimate the lag for machines and for computational power. Two models are used: one assumes lags of 2, 3, 4, 5, 6, 7, and 8 periods (quarters) and tries to determine the best set of cumulative orders placed up to points in time within these periods to explain the growth of installations at the end of the period; the second assumes lags of 1 to 8 periods and tries to determine the best set of orders placed in single time periods during these over-all periods to explain the growth in installations at the end of the period under study.

*See 5: January 1966, p. 17; May 1966, p. 17; June 1966, p. 135; and January 1967, p. 125.

A. Model 1

The first model used is of the form

    ΔX^in_t = f(X^{o/o}_{t-a})

where a is the postulated delivery lag (2 to 8 quarters) and X denotes the cumulative number of machine installations (N) or the sum of computational power (P), installed (in) and on order (o/o). Table I gives the best results for this linear formulation, for N and P, with the various postulated lags, and with the constant suppressed as well as included.

TABLE I - Results, Model 1
(the second line for each group is the constant-suppressed fit, with F₀ and R₀² in parentheses)

Group   Best fit*                                              F-ratio (or F₀)**   R² (or R₀²)**
IBM     ΔN^in_t = -1210 + .394 N^{o/o}_{t-5}  (s_b = .0372)      112.1                .911
        ΔN^in_t = .202 N^{o/o}_{t-4}          (s_b = .0119)     (286.8)              (.960)
Big 8   ΔN^in_t = -241 + .211 N^{o/o}_{t-3}   (s_b = .0213)       97.5                .890
        ΔN^in_t = .193 N^{o/o}_{t-3}          (s_b = .00781)    (610.3)              (.979)
All     ΔN^in_t = -255 + .205 N^{o/o}_{t-3}   (s_b = .0231)       78.4                .867
        ΔN^in_t = .187 N^{o/o}_{t-3}          (s_b = .00810)    (533.1)              (.976)
IBM     ΔP^in_t = -5.78 + .0894 P^{o/o}_{t-4} (s_b = .0164)       29.5                .695
        ΔP^in_t = .0826 P^{o/o}_{t-4}         (s_b = .0190)      (56.9)              (.803)
Big 8   ΔP^in_t = -1.69 + .117 P^{o/o}_{t-4}  (s_b = .0156)       56.5                .813
        ΔP^in_t = .116 P^{o/o}_{t-4}          (s_b = .0101)     (132.3)              (.904)

*Best fit is determined by a combination of highest F-ratio, high R², and highly significant regression coefficients. Note also that the sample size drops from 17 for a = 2 to 11 for a = 8, for all regressions of the models.
**With the constant suppressed, all variances and correlations are computed about the origin rather than the mean in the BMD stepwise regression routine (BMD-02R). See 6.

The regression results for machinery suggest that the best explanatory order variable for installations in period t is the machine orders accumulated up to and including period t-3, for All and the Big 8. For IBM, though, the lagged order variables are in the area of t-4 and t-5. This result is thus in line with earlier observations in Section I concerning IBM's faster-growing order backlog. With a set at 3, 4, 5, 6, 7, and 8 periods, almost identical results were obtained for the regressions involving machines; this was also true for the power regressions with a at 4, 5, 6, 7, and 8 periods. The best fits for machines, as defined in Table I, were selected from seven sets of regression runs, and these were further narrowed down to one representative equation for each group, as appears in the above tabulation. This selection was repeated for computational power, and these further results suggest that orders accumulated up to and including period t-4 are the best explanatory order variables for power installations in period t. This may well be due to the growing backlog of machine power for the groups under study. The production lag here, greater than that for machinery, is in line with prior thoughts and with the time series data and discussion of the previous section.

The obvious problems in this regression model are autocorrelation and multicollinearity. The first of these is most likely present in the model because of the influence of omitted variables going beyond the included periods, variables which are very probably serially correlated and so affect the assumed random disturbance term. The second problem is known to exist in the model from the correlation matrix given by the BMD program for all variables. But both autocorrelation, by biasing and rendering less efficient the estimated coefficients, and multicollinearity, by rendering them nearly useless as a measure of relative effect, are not serious problems in this analysis. The model is not used structurally, but rather as an indicator of a single variable's explanatory power.
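Model 1's lag search amounts to one bivariate regression per postulated lag a, keeping the fit with the best F-ratio and R². A minimal sketch of that search follows; the installation and order series here are hypothetical stand-ins for the census data:

    import numpy as np

    def best_lag(dN_in, N_oo, lags=range(2, 9)):
        """Regress installation growth dN_in[t] on cumulative orders
        N_oo[t-a] for each postulated delivery lag a; return the lag
        whose fit has the highest R^2 (its F-ratio is also reported)."""
        results = {}
        for a in lags:
            y = dN_in[a:]                     # delta N^in_t
            x = N_oo[:-a]                     # N^{o/o}_{t-a}
            X = np.column_stack([np.ones_like(x), x])
            coef, res, *_ = np.linalg.lstsq(X, y, rcond=None)
            ss_tot = np.sum((y - y.mean()) ** 2)
            r2 = 1.0 - res[0] / ss_tot if res.size else 1.0
            f = r2 / ((1.0 - r2) / (y.size - 2))
            results[a] = (r2, f, coef)
        return max(results.items(), key=lambda kv: kv[1][0])

    # Synthetic demonstration: installations track orders lagged 3 quarters.
    rng = np.random.default_rng(0)
    N_oo = np.cumsum(rng.uniform(50, 150, 24))
    dN_in = 0.2 * np.concatenate([np.zeros(3), N_oo[:-3]]) + rng.normal(0, 20, 24)
    lag, (r2, f, coef) = best_lag(dN_in, N_oo)
    print(lag, round(r2, 3))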
B. Model 2

The second model used is of the form

    ΔX^in_t = f(ΔX^{o/o}_{t-i}, i = 1, ..., a)

with a again being the postulated delivery lag, varying from 1 to 8 quarters. The use of absolute first differences helps somewhat with the problems of autocorrelation and multicollinearity, drastically reducing the latter. Also, the results can be more directly applied to the testing of the lag length, with support being derived from the lower limits suggested by the first model. The regressions were run for the three groups with machine numbers, and then for two of the groups with computational power. The first set of results is shown in Table II.**

TABLE II - Results, Model 2, machines

Group   Best fit*                                                                              F-ratio   R²
IBM     ΔN^in_t = 779 + .289 ΔN^{o/o}_{t-3} + .476 ΔN^{o/o}_{t-4} + .192 ΔN^{o/o}_{t-5}          34.9    .853
                 (s_b = .0601, .0621, .0588)
Big 8   ΔN^in_t = 1610 + .182 ΔN^{o/o}_{t-3} + .422 ΔN^{o/o}_{t-4}                               35.0    .913
                 (s_b = .0406, .0466)
All     ΔN^in_t = 1670 + .264 ΔN^{o/o}_{t-3} + .444 ΔN^{o/o}_{t-4}                               27.8    .835
                 (s_b = .0652, .0658)

*See footnotes, Table I.
**The small coefficient standard errors, though, might lead one to believe there are not high correlations amongst the explanatory variables, especially in light of the not very large sample size. But the bane of time series is here too present.

These results, for number of machines, suggest that the lag may be in the neighborhood of 3 to 4 quarters, perhaps higher. (This is suggested by the occasional appearance, in regressions not shown here, of orders placed in period t-6, by the t-5 term appearing above as a third explanatory variable of some significance, and by the results of Model 1.) And, as was noted for the previous regressions, similar results were obtained for a = 4 to 8. Next are presented the results of the computational power relationship for IBM and the "Big 8," in Table III.
TABLE III - Results, Model 2, power

Group   Best fit*                                                                                  F-ratio   R²
IBM     ΔP^in_t = 24.2 + .0940 ΔP^{o/o}_{t-4} + .0866 ΔP^{o/o}_{t-6}                                 1.53    .234
                 (s_b = .0612, .0625)
Big 8   ΔP^in_t = 44.7 + .111 ΔP^{o/o}_{t-4} + .0614 ΔP^{o/o}_{t-5} + .0913 ΔP^{o/o}_{t-6}           1.15    .278
                 (s_b = .0883, .0850, .0896)

*See footnote, Table I.

These results, showing extremely low correlation and no statistical significance, must be considered in the light of recent, relatively heavy queue switching, with orders moved from the backlog of one computer manufacturer to another having smaller delivery lags in any one time period. Given the discontinuous nature of the power mix between the manufacturers included in the analysis, large power-on-order discrepancies from one period to the next would be quite common; the power-on-order series lacks any semblance of uniformity or direction. Thus the results, though barely suggestive, do indicate that the delivery lags for power are 4 to 6 quarters. These longer lags, weighted by the larger machines, appear consistent with the results suggested by Model 1 and with the time series for power (see Section I).

CONCLUSION

Exponential curves were fitted to time series data depicting the growth of computational capability, as measured by the number of machines and the computing power installed and on order. For the first of these series, the growth of machines installed exceeded 7% per quarter, and orders, 6% per quarter. The second series, for power, showed trends in the area of 15% and 25% for installations and orders, respectively. These lags present a vexing problem for the firm planning for future computer capability. For one plan to be brought to culmination, with the computer going on line, requires a 9- to 18-month delivery lag as well as a several-month installation and on-line lag. Then the new facility becomes economically obsolete, due to the tremendous technological change in the computer manufacturing industry.* Clearly, further research on this problem, involving the trade-off between economic costs and conversion expenses, is needed.

*See 3, p. 54. Knight finds average advances in computing power, given equivalent capital cost, greater than 80% per year, 1950-62. See 4, p. 34. Knight's updating, 1963-66, reveals an average advance of about 140%.

REFERENCES

1 E A McCRACKEN J M CARR JR Statistical methods ESSO Research Laboratories Humble Oil and Refining Company undated
2 T A WISE The rocky road to the marketplace Fortune October 1966
3 K E KNIGHT Changes in computer performance Datamation September 1966
4 K E KNIGHT Evolving computer performance Datamation January 1968
5 Look ahead Datamation various issues
6 W J DIXON (editor) Biomedical computer programs Health Sciences Computing Facility UCLA 1965
7 J JOHNSTON Econometric methods McGraw-Hill Book Company Inc 1963

Error estimate of a fourth-order Runge-Kutta method with only one initial derivative evaluation

by A. S. CHAI
Hybrid Computer Laboratory, University of Wisconsin
Madison, Wisconsin

INTRODUCTION

In the numerical solution of differential equations it is desirable to have estimates of the local discretization (or truncation) errors of the solutions at each step. The estimate may be used not only to provide some idea of the errors, but also to indicate when to adjust the step size. If the magnitude of the estimate is greater than a preassigned upper bound, the step size is reduced to achieve smaller local errors. If the magnitude of the estimate is less than a preassigned lower bound, the step size is increased to save computing time.
The 4th-order Runge-Kutta method has the advantage that it provides an easy way to change the step size, but it does not provide as simple a way to get error estimates as does Milne's predictor-corrector method.¹ Several methods²,³,⁴,⁵ for achieving error estimates have been derived; briefly, they are as follows:

1) One-step method
The one-step method provides all the information for the error estimate in one step. The important one-step method is Sarafyan's pseudo-iterative formula,² which is a 5th-order Runge-Kutta formula imbedded in a 4th-order Runge-Kutta formula, as follows:

    k0 = hf(x_n, y_n)
    k1 = hf(x_n + h/2, y_n + k0/2)
    k2 = hf(x_n + h/2, y_n + (k0 + k1)/4)
    k3 = hf(x_n + h, y_n - k1 + 2k2)
    k4 = hf(x_n + 2h/3, y_n + (7k0 + 10k1 + k3)/27)
    k5 = hf(x_n + h/5, y_n + (28k0 - 125k1 + 546k2 + 54k3 - 378k4)/625)

The 4th-order formula is

    y_{n+1} = y_n + (k0 + 4k2 + k3)/6

and the 5th-order formula is

    ŷ_{n+1} = y_n + (14k0 + 35k3 + 162k4 + 125k5)/336

The estimate is

    Ê_{n+1} = ŷ_{n+1} - y_{n+1}

The work of Luther and Konen⁶ (Legendre-Gauss) and of Luther⁷ (Newton-Cotes, the second formula, and Lobatto) also yields suitable pseudo-iterative formulas. Really, pseudo-iterative formulas are 5th-order Runge-Kutta integration schemes used to estimate the error of the 4th-order Runge-Kutta integration. The pseudo-iterative formula can be used to estimate the error at the first step, which can be done by no other method. But it requires about 50% more computing time for the additional derivative evaluations. Merson's and Scraton's methods with five derivative evaluations belong to the one-step pseudo-iterative class, but they are only applicable in particular cases.⁴,⁸

2) Two-step method⁴,⁵
This method requires the computation of y_{n+1} and y_{n+2} with a step size h, and then the recomputation of y*_{n+2} with a doubled step size. The error estimate is

    Ê_{n+2} = (y*_{n+2} - y_{n+2})/30    (1.1)

Since the error is of the order h⁵, we can let

    y(x_{n+2}) = y_{n+2} + E_{n+1} + E_{n+2}

where the two error terms relate to the two steps, and

    y(x_{n+2}) = y*_{n+2} + 32E_{n+1} + O(h⁶)

Hence Formula (1.1) really provides an estimate of the error

    Ê_{n+29/30} = (31E_{n+1} - E_{n+2})/30 + O(h⁶)

which is close to the error E_{n+1} in y_{n+1}. If the estimate is taken for the error E_{n+2} in y_{n+2}, it requires that the errors E_{n+1} and E_{n+2} in y_{n+1} and y_{n+2} be approximately equal. This method can be used to estimate the error at each two steps, starting at y₂, and requires about 37.5% more computing time than fourth-order Runge-Kutta integration without estimates.

3) Multi-step method³,⁴
Ceschino and Kuntzmann (Ref. 3, pp. 305-310) collected several multi-step formulas for the error estimate, based on Refs. 9 and 10. One of them (also given in Ref. 4) is formula (1.2). Strictly speaking, (1.2) estimates the error

    Ê_{n+7/10} = (10E_n + 19E_{n+1} + E_{n+2})/30

This method can estimate the error at each step, but it cannot be used until the completion of the third step (i.e., y₃). The estimate is close to the error at the last step, because Ê_{n+7/10} ≈ E_{n+1}. If the estimate is taken for the error E_{n+1} at the last step, and if the estimate causes the step size to change, four additional derivative evaluations are needed. If the estimate is taken for the error E_{n+2} at the present step, it requires that the errors at three successive steps, E_n, E_{n+1}, and E_{n+2}, be approximately equal. This requirement may not be satisfied if the errors change rapidly. In general, the derivative evaluations take most of the computing time.
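The pseudo-iterative formula of method 1 can be coded directly; a sketch for scalar equations, with the coefficients as reconstructed above, follows:

    def sarafyan_step(f, x, y, h):
        """One step of Sarafyan's embedded pair: returns the 4th-order
        result y4 and the one-step error estimate E = y5 - y4."""
        k0 = h * f(x, y)
        k1 = h * f(x + h/2, y + k0/2)
        k2 = h * f(x + h/2, y + (k0 + k1)/4)
        k3 = h * f(x + h,   y - k1 + 2*k2)
        k4 = h * f(x + 2*h/3, y + (7*k0 + 10*k1 + k3)/27)
        k5 = h * f(x + h/5, y + (28*k0 - 125*k1 + 546*k2
                                 + 54*k3 - 378*k4)/625)
        y4 = y + (k0 + 4*k2 + k3)/6                      # 4th-order value
        y5 = y + (14*k0 + 35*k3 + 162*k4 + 125*k5)/336   # 5th-order value
        return y4, y5 - y4

    # e.g., one step of y' = y from y(0) = 1 with h = 0.1:
    y4, est = sarafyan_step(lambda x, y: y, 0.0, 1.0, 0.1)

The two extra stages k4 and k5 per step are the roughly 50% overhead in derivative evaluations noted above.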
The 4th-order Runge-Kutta method already requires more derivative evaluations than the other methods, e.g., Milne's method;¹ hence the extra time for additional derivative evaluations for the error estimate is expensive and should be avoided as much as possible.

The suggested method

Ceschino and Kuntzmann (Ref. 3, p. 308) give the formula

    Ê_{n+2} = (11/30)Δy_{n+1} + (19/30)Δy_n - h[(1/9)f_{n+2} + (19/30)f_{n+1} + (8/30)f_n - (1/90)f_{n-1}],  n > 0    (2.1)

for estimating the local discretization error in y_{n+2} at each step of a fourth-order Runge-Kutta integration. The author has found that (2.1) has advantages over the other methods of Section 1, as shown below. Also, (2.1) can be extended to n = 0, because we can form

    y_{-1} = 10y₂ + 9y₁ - 18y₀ - 3h(f₂ + 6f₁ + 3f₀) + O(h⁶)    (2.2)

which has an error of O(h⁶), and then evaluate

    f_{-1} = f(x_{-1}, y_{-1})    (2.3)

Then (2.1) may be used to compute the estimate Ê₂. Equations (2.2) and (2.3) can be employed when (2.1) is just started or when the step size changes. Equation (2.1) requires f_{n+2}, which has to be computed for k0 in the next step anyway if the step size does not change; hence no additional derivative evaluations are needed in each step, except for Ê₂, where one additional evaluation for f_{-1} by (2.3) is needed. This method requires about 12.5% more computing time for evaluating f_{-1} when f_{-1} is not available, but no additional time if n > 0. Hence this method has an advantage in computing time over the one-step and two-step methods. Equations (2.1)-(2.3) can estimate the error at each step after the first. Equation (2.1) estimates the error

    Ê_{n+41/30} = (11E_{n+2} + 19E_{n+1})/30

which is closer to the error E_{n+2} than the estimates of the two-step and multi-step methods; hence this method has another advantage over those methods. The departure of the estimate from the local discretization error in y_{n+2} is shown in the next section. The derivation of (2.1) was shown in Ref. 3 and, together with the derivation of (2.2), is briefly given in Appendix 1. Equation (2.1) can also be employed as a corrector to get errors in y of the order h⁶; the convergence theorem and experimental results are given in Appendix 2.
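A sketch of (2.1) in use follows. The classical fourth-order formula is used for the integration (the estimate applies to fourth-order Runge-Kutta formulas generally), the derivative values f_n are retained as the integration proceeds, and, for brevity, the start-up by (2.2)-(2.3) is omitted, so estimates begin at n = 1:

    def rk4_step(f, x, y, h):
        k1 = h * f(x, y)
        k2 = h * f(x + h/2, y + k1/2)
        k3 = h * f(x + h/2, y + k2/2)
        k4 = h * f(x + h, y + k3)
        return y + (k1 + 2*k2 + 2*k3 + k4)/6

    def integrate_with_estimates(f, x0, y0, h, steps):
        xs, ys = [x0], [y0]
        for _ in range(steps):
            ys.append(rk4_step(f, xs[-1], ys[-1], h))
            xs.append(xs[-1] + h)
        fs = [f(x, y) for x, y in zip(xs, ys)]   # reused, no extra cost
        est = {}
        for n in range(1, steps - 1):            # estimate for y_{n+2}
            dy1 = ys[n+2] - ys[n+1]              # delta y_{n+1}
            dy0 = ys[n+1] - ys[n]                # delta y_n
            est[n+2] = ((11/30)*dy1 + (19/30)*dy0
                        - h*((1/9)*fs[n+2] + (19/30)*fs[n+1]
                             + (8/30)*fs[n] - (1/90)*fs[n-1]))
        return xs, ys, est

    # For y' = y the local error at x_{n+2} is y_{n+2} - y_{n+1}*exp(h):
    xs, ys, est = integrate_with_estimates(lambda x, y: y, 0.0, 1.0, 0.1, 100)

For y' = y with h = 0.1 the relative error of the estimate should come out near .08, matching the figures reported in the next section.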
The departure of the estimate

The departure of the estimate from the local discretization error is (Ref. 3, pp. 306-308):

    Ê_{n+2} - E_{n+2} = -(1/2)h f_{y,n+1}E_{n+1} - (19/30)h E'_{n+1} - (11/5400)h⁶ f^{(vi)}_{n+1} + O(h⁷)    (3.1)

To give the reader some picture of the departure, let us consider the differential equation

    y' = ay,  y₀ = y(x₀)

where a is a constant not equal to zero. The formula for the local discretization error of the order h⁵ in y_{n+2} for several known 4th-order Runge-Kutta formulas can be found in Ref. 3, p. 81. Since y = y₀e^{ax}, this gives f^{(vi)}_{n+1} = y₀a⁶e^{ax_{n+1}} and f_y = a. Hence the departure of the estimate is

    Ê_{n+2} - E_{n+2} = -(1/135)h⁶a⁶y₀e^{ax_{n+1}}

The relative error is

    (Ê_{n+2} - E_{n+2})/E_{n+2} = (8/9)ha = .889ha

of which the magnitude is less than 10% if |ha| ≤ 0.1. To verify the theory, an experiment was run to solve y' = y, y₀ = 1, h = 0.1, and x = 0 to 10. The local discretization error at x_{n+2} is y_{n+2} - y_{n+1}e^h. The experimental relative errors of the estimates, for x = .2 to 10, were between .08211 and .08225. Then the equation was changed to y' = -y with the same parameters. The local discretization error at x_{n+2} is y_{n+2} - y_{n+1}e^{-h}. The experimental relative errors were between -.09620 and -.09637, slightly less than 10% in magnitude. Hence the experimental results agree with the theory.

If Equations (2.2) and (2.3) are employed for computing f_{-1}, the departure, which is derived in Appendix 3, is

    Ê₂ - E₂ = -(1/6)h f_{y1}E₁ - (19/30)h E'₁ - (11/5400)h⁶ f^{(vi)}₁ + O(h⁷)

As before, the relative error for Ê₂ in y' = ay, a ≠ 0, is approximately equal to (5/9)ha = .556ha, which is about 5% if |ha| = 0.1. The experimental results for Ê₂ in y' = y and in y' = -y are .0518 and -.0593, respectively, with h = 0.1.

Another experiment is to solve a system of nonlinear differential equations:

    y'(x) = z(x)
    z'(x) = (2y(x) - 1)z(x)
    y(0) = 0.5,  z(0) = y'(0) = -0.25    (3.2)

The local discretization error at x_{n+2} is

    E_{n+2} = y_{n+2} - [α(y_{n+1} - β) - β(y_{n+1} - α)e^{(α-β)h}] / [(y_{n+1} - β) - (y_{n+1} - α)e^{(α-β)h}]

where

    α, β = [1 ± sqrt(1 + 4(y²_{n+1} - y_{n+1} - z_{n+1}))]/2

The step size h was 0.1, and x ran from 0 to 5. The local discretization error E_{n+2} and the relative error of the estimate are shown in Table 3.1. The relative errors are about 0.1 in general, except where the curve of E_{n+2} approaches zero rapidly. Hence the estimate is in general suitable for practical purposes.

TABLE 3.1 - Local discretization error and relative error of the estimate at y, in y' = z and z' = (2y - 1)z

   x    Local error    Rel. error   |    x    Local error    Rel. error
   .2    .7558E-08    -.6193E-02    |   2.7    .1362E-08     .1254E+00
   .3    .7047E-08    -.4302E-02    |   2.8    .1477E-08     .1195E+00
   .4    .6301E-08    -.1157E-01    |   2.9    .1556E-08     .1157E+00
   .5    .5371E-08    -.2127E-01    |   3.0    .1605E-08     .1126E+00
   .6    .4320E-08    -.3440E-01    |   3.1    .1626E-08     .1103E+00
   .7    .3223E-08    -.5614E-01    |   3.2    .1624E-08     .1083E+00
   .8    .2145E-08    -.9597E-01    |   3.3    .1603E-08     .1068E+00
   .9    .1144E-08    -.1944E+00    |   3.4    .1565E-08     .1054E+00
  1.0    .2710E-09    -.8569E+00    |   3.5    .1516E-08     .1043E+00
  1.1   -.4447E-09     .5242E+00    |   3.6    .1457E-08     .1031E+00
  1.2   -.9877E-09     .2258E+00    |   3.7    .1391E-08     .1025E+00
  1.3   -.1352E-08     .1517E+00    |   3.8    .1320E-08     .1018E+00
  1.4   -.1544E-08     .1189E+00    |   3.9    .1247E-08     .1012E+00
  1.5   -.1586E-08     .9782E-01    |   4.0    .1172E-08     .1005E+00
  1.6   -.1502E-08     .8015E-01    |   4.1    .1087E-08     .1005E+00
  1.7   -.1317E-08     .6420E-01    |   4.2    .1024E-08     .9993E-01
  1.8   -.1059E-08     .4473E-01    |   4.3    .9525E-09     .9958E-01
  1.9   -.7522E-09     .1596E-01    |   4.4    .8836E-09     .9921E-01
  2.0   -.4229E-09    -.5341E-01    |   4.5    .8176E-09     .9892E-01
  2.1   -.9004E-10    -.6064E+00    |   4.6    .7548E-09     .9876E-01
  2.2    .2301E-09     .3657E+00    |   4.7    .6954E-09     .9867E-01
  2.3    .5275E-09     .2065E+00    |   4.8    .6395E-09     .9861E-01
  2.4    .7931E-09     .1637E+00    |   4.9    .5872E-09     .9860E-01
  2.5    .1021E-08     .1448E+00    |   5.0    .5385E-09     .9827E-01
  2.6    .1211E-08     .1329E+00    |

ACKNOWLEDGMENTS

The author is indebted to Professors C. A. Ranous and V. C. Rideout for aid in expression in writing this paper, and to Professor C. W. Cryer for discussion of convergence theory. The author particularly thanks his former advisor, Professor H. J. Wertz, who suggested this problem and has often given encouragement. The author is also indebted to the referees for their constructive comments. All the experimental results were obtained on an SDS 930 computer (38 bits = 11.4 digits in mantissa) in the Hybrid Computer Laboratory at the University of Wisconsin, Madison, Wisconsin.
Appendix 1
Derivation of (2.1) (Ref. 3, pp. 305-308) and (2.2)

It is easy to derive the identity

    y(x_{n+2}) = -(8/11)y(x_{n+1}) + (19/11)y(x_n) + h[(10/33)f_{n+2} + (19/11)f_{n+1} + (8/11)f_n - (1/33)f_{n-1}] - (1/180)h⁶f^{(v)}_{n+1} + O(h⁷)    (A1.1)

Substituting the computed values y_n, y_{n+1}, y_{n+2}, each carrying its local error, and simplifying, we get

    Ê_{n+41/30} = (11E_{n+2} + 19E_{n+1})/30 + O(h⁶)
                = (11/30)Δy(x_{n+1}) + (19/30)Δy(x_n) - h[(1/9)f_{n+2} + (19/30)f_{n+1} + (8/30)f_n - (1/90)f_{n-1}]

Therefore the estimate is (2.1). Equation (2.2) is derived in a similar manner.

Appendix 2
A convergence theorem for (2.1)

Equation (A1.1) can be rewritten as the method

    y_{n+2} = -(8/11)y_{n+1} + (19/11)y_n + Φ(x_n, y_n)    (A2.1)

where Φ(x_n, y_n) = h[(10/33)f_{n+2} + (19/11)f_{n+1} + (8/11)f_n - (1/33)f_{n-1}]. The following statement of a convergence theorem is similar to Byrne and Lambert.¹¹ Assume that (1) f(x,y) is continuous for x ∈ I and ||y|| < ∞, (2) f satisfies a Lipschitz condition, and (3) Φ(x_n, y_n) is defined for all h such that x + h and x - 2h ∈ I. (A2.1) is consistent because

    lim_{h→0} Φ(x,y)/h = f(x,y),  x ∈ I,  ||y|| < ∞

A convergence theorem is: let y₀ be the initial value, let

    ||y₁ - y(x₁)|| + ||y₂ - y(x₂)|| ≤ hL

where L is a non-negative constant, and let there exist three non-negative numbers L₁, L₂, and L₃ such that

    ||Φ(x_n, w'_n) - Φ(x_n, w_n)|| ≤ h(L₁||w'_n - w_n|| + L₂||w'_{n-1} - w_{n-1}|| + L₃||w'_{n-2} - w_{n-2}||)

holds for the vectors w'_n and w_n. Under these conditions, the hypotheses (1)-(3), and the consistency property, (A2.1) is convergent. The proof can be carried through as in Byrne and Lambert,¹¹ except that one more term,

    hL₃z_{n-2} = hL₃||y(x_{n-2}) - y_{n-2}||

is added on the right-hand side of the inequality

    z_{n+1} ≤ z_n(1 + hL₁) + hz_{n-1}L₂ + hg(h)

and thereafter. This proof does not include the point x_{-1}, so we have to assume that x_{-1} ∈ I if (2.2) is employed. Hence (A2.1) can be used as a corrector to get errors to O(h⁶) except at y₁; however, y₂ should be corrected by twice the estimate at x₂. Table A2.1 shows the accumulative errors of non-corrected and corrected solutions in the experiment solving the system of nonlinear equations (3.2). The theoretical solution is

    y(x) = 1/(1 + e^x)

The accumulative error was obtained by subtracting y(x_n) from y_n (non-corrected) or from ỹ_n (corrected by (2.1)). In Table A2.1 the errors of the non-corrected solutions are greater than those of the corrected.

TABLE A2.1 - Errors in non-corrected and corrected solutions of y in Equation (3.2)

   x     Non-corrected    Corrected
   .2    1.531 x 10^-8     2.929 x 10^-10
   .3    2.228 x 10^-8     5.402 x 10^-10
   .5    3.364 x 10^-8     1.759 x 10^-9
  1.0    4.327 x 10^-8     6.483 x 10^-9
  2.0    2.837 x 10^-8     6.314 x 10^-9
  3.0    2.555 x 10^-8    -5.898 x 10^-10
  4.0    2.104 x 10^-8    -2.966 x 10^-9
  5.0    1.352 x 10^-8    -2.749 x 10^-9

Appendix 3
The departure of (2.1) when (2.3) is employed

If (2.2) is employed for y_{-1}, then since y₀ is the initial value we can assume that y₀ has no error, so that y₁ = y(x₁) + E₁ and y₂ = y(x₂) + E₁ + E₂. Hence

    f_i = f(x_i, y(x_i)) + f_{y,i}(E₁ + ... + E_i),  i = 1, 2

Now Δy₀ = Δy(x₀) and Δy₁ = Δy(x₁) + h f_{y1}E₁, with

    y_{-1} = y(x_{-1}) + 19E₁ + 10E₂

and

    f_{-1} = f(x_{-1}, y(x_{-1})) + f_{y,-1}(19E₁ + 10E₂)

Then

    Ê₂ = (11/30)E₂ + (19/30)E₁ - (1/6)h f_{y1}E₁ - (11/5400)h⁶f^{(vi)}₁ + ...

Hence the departure of the estimate Ê₂ from E₂ is

    Ê₂ - E₂ = -(19/30)h E'₁ - (1/6)h f_{y1}E₁ - (11/5400)h⁶f^{(vi)}₁

NOMENCLATURE

y' = f(x,y)  - a system of differential equations, where x is the independent variable and y represents the dependent variables; y, y', and f are vectors.
h  - step size.
y_n  - solution of y at x_n = x₀ + nh with an error of O(h⁵), obtained by a fourth-order Runge-Kutta formula.
ỹ_n  - solution of y at x_n with an error of O(h⁶).
f_n = f(x_n, y_n);  f̃_n = f(x_n, ỹ_n).
y(x_n)  - theoretical solution of y at x_n.
Δy_n  - increment function, defined by y_{n+1} - y_n.
Δỹ_n  - similar to Δy_n, but with ỹ_n replacing y_n.
E_n  - local discretization (or truncation) error in y_n, defined as the difference of y_n from y(x_n) with y_{n-1} considered as the initial value; e.g., E_n = y(x_{n-1}) + Δy(x_{n-1}) - y(x_n).
Ê_n  - estimate of E_n.

REFERENCES

1 W E MILNE Numerical solution of differential equations Wiley New York 1953 p 66
2 D SARAFYAN Error estimation for Runge-Kutta methods through pseudo-iterative formulas Technical Report No 14 Louisiana State University New Orleans May 1966
3 F CESCHINO J KUNTZMANN Problèmes différentiels de conditions initiales Dunod Paris 1963 pp 81, 305-310; English translation by D Boyanovitch Prentice-Hall Englewood Cliffs N J 1966 pp 67, 247-251
4 R E SCRATON Estimation of the truncation error in Runge-Kutta and allied processes Comp J vol 7 pp 246-248 April 1964
5 L COLLATZ The numerical treatment of differential equations Third edition second printing English translation by P G Williams Springer-Verlag New York 1966 pp 51-52
6 H A LUTHER H P KONEN Some fifth-order classical Runge-Kutta formulas SIAM Review vol 7 pp 551-558 1965
7 H A LUTHER Further explicit fifth-order Runge-Kutta formulas SIAM Review vol 8 pp 374-380 1966
8 A R CURTIS Correspondence on "Estimation of the truncation error in Runge-Kutta and allied processes" Comp J vol 8 p 52 1965
9 H MOREL Évaluation de l'erreur sur un pas dans la méthode de Runge-Kutta Comptes rendus de l'Académie des Sciences Paris vol 243 pp 1999-2002 1956
10 J KUNTZMANN Évaluation de l'erreur sur un pas dans les méthodes à pas séparés Chiffres vol 2 pp 97-102 1959
11 G D BYRNE R J LAMBERT Pseudo-Runge-Kutta methods involving two points JACM vol 13 pp 114-123 1966

Improved techniques for digital modeling and simulation of nonlinear systems

by JOSEPH S. ROSKO
United Aircraft Research Laboratories
East Hartford, Connecticut

INTRODUCTION

The engineer or scientist concerned with the mathematical description of physical systems is continually faced with nonlinear formulations. The nonlinearity may be represented in the form of a differential equation describing process or system dynamics. On the other hand, nonlinear control-element characteristics such as hysteresis, saturation, backlash, or nonlinear damping, whose mathematical description is algebraic, may appear. In many instances the mathematical formulation for system description may become so unwieldy that an analytical solution is either impractical or impossible. It is particularly in situations like these that digital simulation has become an invaluable tool. Historically, Euler integration, Newton-Cotes quadrature formulas, predictor-corrector techniques, Runge-Kutta methods, and the other techniques belonging to the realm of numerical analysis were first used to obtain approximate solutions to differential equations.¹,² After it became commonplace to represent the linear components of feedback control systems by transfer functions, the Tustin³ and other substitutional techniques⁴,⁵,⁶,⁷ evolved. These methods represent each transfer function in an equivalent approximate discrete form, which permits the response of each block to be formulated as a single difference equation. Recently, space-vehicle simulation and man-in-the-loop studies have necessitated the procurement of high-precision and real-time simulations. The term real-time is used in this instance to denote the occurrence of events in a physical system and in its simulation with the same time base.
Although the simplicity of the algebraic response equations developed by the Tustin method makes it attractive, the solution interval necessary for accurate closed-loop simulations is prohibitive. A number of significant contributions have been made within the past four years toward the alleviation of these problems. Hurt⁸ and Fowler⁹ employ root-locus procedures, and Sage and Burt¹⁰,¹¹ have extended the quasilinearization technique to design more exact discrete system models. This paper will first review three digital simulation techniques whose usage is widespread. Emphasis will be on nonlinear systems whose form is shown in Figure 1.

[Figure 1 - A general nonlinear feedback system]

Then a new real-time simulation method, which is based upon purely algebraic procedures and does not require the use of ancillary computation during the design stage, is presented in detail. An adaptive filtering technique is also described, in which a time-varying compensatory device is employed in conjunction with the discrete system model to increase simulation accuracy. Finally, selected examples are used to provide a basis of comparison and to substantiate the results obtainable with the new methods.

Current simulation techniques

Tustin method
Tustin's mathematical formulation as applied to system simulation consists of making a discrete approximation to the operational integrating operator (1/s)^n and obtaining a digitized representation of each transfer function. Once the various blocks have been discretized, algebraic difference equations representing the response of each block may be formed. The linear components of a system are represented as transfer functions

    G(s) = [a_n s^n + a_{n-1} s^{n-1} + ... + a_1 s + a_0] / [b_k s^k + b_{k-1} s^{k-1} + ... + b_1 s + b_0],  k ≥ n    (1)

Begin by dividing the numerator and denominator of Equation 1 by s^k:

    G(s) = [a_0 (1/s)^k + a_1 (1/s)^{k-1} + ... + a_{n-1} (1/s)^{k-n+1} + a_n (1/s)^{k-n}] / [b_0 (1/s)^k + b_1 (1/s)^{k-1} + ... + b_{k-1} (1/s) + b_k]    (2)

Now a pulse transfer function may be formulated by substituting the Tustin discrete approximations for (1/s), ..., (1/s)^k in Equation 2. This operation results in

    G(z) = [a_0 θ_k + a_1 θ_{k-1} + ... + a_n θ_{k-n}] / [b_0 θ_k + b_1 θ_{k-1} + ... + b_k]    (3)

where θ_i represents the discrete approximation to the ith-order integrating operator (1/s)^i as a ratio of polynomials in z. Simplification yields

    G(z) = [A_0 z^k + A_1 z^{k-1} + ... + A_{k-1} z + A_k] / [B_0 z^k + B_1 z^{k-1} + ... + B_{k-1} z + B_k]    (4)

where the coefficients A_0, A_1, ..., A_k consist of combinations of T, the independent-variable increment, and the transfer-function coefficients a_0, ..., a_n, and B_0, B_1, ..., B_k consist of combinations of T and the transfer-function coefficients b_0, ..., b_k. Applying this procedure to the system shown in Figure 1 results in the discrete transfer functions G_1(z), G_2(z), and H(z). In addition, it is necessary to insert a single-period delay in the feedback loop to render a realizable simulation. The results of these digital modeling operations are shown in Figure 2.

[Figure 2 - Digital model of a general nonlinear system using the Tustin method]

To simulate a system such as this, all that remains is to obtain a set of recursion formulas using conventional techniques.
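The Tustin substitution s = (2/T)(z - 1)/(z + 1) underlying Equations 2-4 is mechanical, and a routine such as scipy's bilinear transform performs exactly this substitution block by block. A sketch, with an arbitrary first-order lag standing in for one of the blocks of Figure 1:

    from scipy.signal import bilinear

    T = 0.1                        # sampling interval
    # G(s) = 3/(s + 3), numerator and denominator in descending powers of s
    b, a = [3.0], [1.0, 3.0]
    bz, az = bilinear(b, a, fs=1.0 / T)   # pulse transfer function, Eq. 4 form

    # With az normalized so az[0] = 1, the block's recursion formula for
    # output c and input e is:
    #     c[N] = bz[0]*e[N] + bz[1]*e[N-1] - az[1]*c[N-1]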
IBM method
Hurt⁸ and Fowler,⁹ the developers of the IBM method, specifically propose to achieve a more exact simulation for a given sample period by matching the closed-loop eigenvalues and the static gains of the analog and digital systems. Their procedure is now presented in abridged form.

1. The initial discrete system shown in Figure 3 is obtained from the continuous system by forming the z-transform of each block and then inserting an incremental delay in the feedback loop to insure realizability.

[Figure 3 - Initial configuration; IBM digital model of a nonlinear system]

2. Next, each block is assumed to have a step input, and the static gains are matched between the analog and digital block transfer functions. This is accomplished by multiplying the discrete transfer function by a constant parameter, applying the final-value theorems to both transfer functions with the assumed step input, and equating the results to determine the constant parameter. As an example, the first block in the forward path is selected. For integrators, the approach is to multiply the z-transform by the sampling interval, T.

3. After replacing the nonlinearity by some nominal gain, G_3, the root locus of each system is matched by inserting a gain element (G_4) in the forward path of the discrete system. The results of these operations are illustrated in Figure 4.

[Figure 4 - Intermediate configuration; IBM digital model of a nonlinear system]

4. The last step involves matching the discrete system to the excitation. This is achieved by attaching an "input transfer function" I(z) to the discrete system. The matching is accomplished by forming the response, in operational form, of both the continuous and discrete systems, then obtaining the z-transform of the linearized continuous-system response and equating it to the discrete response to evaluate I(z). For the system under consideration,

    Z[R(s) G_1(s) G_3 G_2(s) / (1 + H(s) G_1(s) G_3 G_2(s))] = R(z) I(z) G_4 G_1 G_11(z) G_3 G_2 G_21(z) / [1 + z^{-1} H H_1(z) G_4 G_1 G_11(z) G_3 G_2 G_21(z)]    (6)

The final form of the digital model, which is equivalent to the general nonlinear system at sampling instants, is presented in Figure 5.

[Figure 5 - Final configuration; IBM digital model of a nonlinear system]

Sage-Burt method
The Sage-Burt approach¹⁰,¹¹ is the first attempt at discrete system modeling which considers the full impact of nonlinearities. However, this technique for modeling and simulation requires a considerable amount of tailoring and preliminary design computation. Their procedure is now presented in summarized form by considering the system in Figure 1.

1. Individual blocks are discretized by extracting the transfer function from the aggregate system and formulating an optimization problem. The first block in the forward loop of Figure 1 is selected to illustrate the procedure. In Figure 6, A(s) is the ideal transfer function, A(z) is the desired transfer function, and F(z) is the fixed portion of the system. The fixed portion of the system is selected to be unity except where a pure delay must be utilized to render a realizable simulation.

[Figure 6 - System for optimization problem]

The error and total squared error for this situation are given as

    E(z) = C(z) - R(z)A(z)F(z)    (7)

and

    Σ_{N=0}^{∞} e²(NT) = (1/2πj) ∮ E(z)E(z^{-1}) z^{-1} dz    (8)

where C(z) is the z-transform of the product R(s)A(s) and the contour of integration is the unit circle. A(z) is determined by applying the calculus of variations to Equation 8 such that it is minimized. The result is

    A(z) = { R(z^{-1})F(z^{-1})C(z) / [R(z)R(z^{-1})F(z)F(z^{-1})]_- }_{P.R.} / [R(z)R(z^{-1})F(z)F(z^{-1})]_+    (9)

where P.R. refers to the realizable portion of the term within { }, and the + and - subscripts refer to the spectrum-factorization operator denoting extraction of multiplicative terms containing poles and zeros inside (+) or outside (-) the unit circle. Generally either a step or a ramp is selected as a convenient excitation for this procedure.
2. The nature of the nonlinearity is considered by the inclusion and determination of a multiplicative constant parameter P_i associated with each block. The desired result is attained when the discrete system output X_A(t) approaches the analog output Y(t), where X_A(t) and Y(t) are m-vectors describing the system output state for a given input. The continuous state-vector output Y(t) and the excitation X(t) are assumed to be completely known. The constant parameters P, interpreted as a p-vector, are then adjusted to minimize the performance index (10), a quadratic measure (with weighting R) of the deviation between X(NT) and X_A(NT), subject to the constraints

    X_A[(N+1)T] = f[X_A(NT), P]    (11a)
    X_A(0) = X(0)    (11b)
    P[(N+1)T] = P(NT)    (11c)

By application of standard variational-calculus procedures, it is found that the optimum parameter vector P is determined by solution of Equations 10 and 11 together with the adjoint difference equations

    λ_y(NT) = ∇_y f[X_A(NT), P] λ_y[(N+1)T] + R[X(NT) - X_A(NT)]    (12a)
    λ_p(NT) = ∇_p f[X_A(NT), P] λ_y[(N+1)T] + λ_p[(N+1)T]    (12b)
    λ_y(KT) = λ_p(KT) = λ_p(0) = 0    (12c)

Sage and Burt consider the computational problems inherent in this discrete two-point nonlinear boundary-value problem and suggest quasilinearization methods for its solution. The interested reader is referred to Sage and Burt¹⁰,¹¹ or Bellman¹²,¹³ for particulars concerning this method.

New simulation method

Tustin's method provides excellent simulations of open-loop systems and has attained widespread acceptance and usage. However, when this technique is applied to the discretization of closed-loop systems, the necessity of inserting a delay in the feedback path to obtain a realizable simulation shifts the normal location of the closed-loop poles appreciably in many instances, with consequent performance degradation. Real-time simulation is generally ruled out, since a smaller-than-tolerable time increment must be utilized to achieve adequate response representations. The IBM method and the Sage-Burt method rectify this situation by more perfect modeling of the closed-loop system. The new simulation method presented in this section combines the simplicity of Tustin's design procedure with the more exact modeling philosophy of these more recent approaches to yield a comparable real-time simulation capability.

One initiates the design process by employing the Tustin method to discretize each transfer function. After the necessary unit delay is inserted in the feedback path, the resultant digital system model of a general nonlinear system is that illustrated in Figure 2. Next, a discrete filter with pulse transfer function D(z) is placed at the normal input to the digital model. Figure 7 depicts this situation, where the transfer function of the digital compensator has the form

    D(z) = [a_0 + a_1 z + ... + a_m z^m] / [b_0 + b_1 z + ... + b_n z^n]    (13)

with all coefficients being real.

[Figure 7 - New discrete general nonlinear system model]

The first step in the determination of the coefficients in D(z) is to form a stable discrete overall transfer function of the continuous system with the nonlinearity represented as a nominal gain, G_3:

    K_d(z) = G(s) with (1/s)^n replaced by ( )^n,  n = 0, 1, 2, ..., k    (14)

where G(s) = G_1(s) G_3 G_2(s) / [1 + H(s) G_1(s) G_3 G_2(s)].* Next, the overall pulse transfer function of the simulated system is formed by applying the Tustin element to each block in Figure 1, resulting in

    K_A(z) = D(z) G_1(z) G_3 G_2(z) / [1 + z^{-1} H(z) G_1(z) G_3 G_2(z)]    (15)

where: 1. D(z) has the form of Equation 13. 2. G_1(z), G_2(z), and H(z) denote the approximate discrete transfer functions for G_1(s), G_2(s), and H(s), respectively, each obtained utilizing Tustin's method. 3. G_3 represents a nominal gain of the nonlinear element.**

All that remains is the determination of the coefficients of the polynomials in D(z). This may be accomplished by equating Equations 14 and 15, expanding the pulse transfer functions, and then equating the coefficients of like powers of z:

    K_A(z) = K_d(z)    (16)

With all pulse transfer functions determined for the linear portion of the system, the corresponding recursive relationships for digital simulation can be obtained.

*( )^n represents the Tustin discrete approximation to the nth-order operational integrator (1/s)^n.
**This assumption is exactly the same as made by Hurt.⁸
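Equating Equations 14 and 15 is polynomial algebra: D(z) = K_d(z)[1 + z^{-1}H(z)G_1(z)G_3 G_2(z)] / [G_1(z)G_3 G_2(z)], and the coefficients of D(z) fall out of polynomial products. A sketch using polynomial coefficient arrays follows; the block coefficients here are placeholders, not those of any particular system:

    import numpy as np

    # Each rational function is a (numerator, denominator) pair of
    # coefficient arrays in descending powers of z.  Placeholder values:
    G1 = (np.array([1.0, 1.0]), np.array([3.0, 1.0]))
    G2 = (np.array([2.0, 2.0]), np.array([1.0, -1.0]))
    H  = (np.array([1.0]), np.array([1.0]))
    Kd = (np.array([0.5, 0.5]), np.array([1.0, -0.8]))
    G3 = 2.0                                  # nominal gain of the nonlinearity

    def rmul(p, q):                           # product of two rational functions
        return np.polymul(p[0], q[0]), np.polymul(p[1], q[1])

    L = rmul(G1, G2)
    L = (G3 * L[0], L[1])                     # loop gain G1*G3*G2
    # 1 + z^-1*H*L, with the z^-1 cleared by multiplying through by z:
    num_cl = np.polyadd(np.polymul(np.polymul(L[1], H[1]), np.array([1.0, 0.0])),
                        np.polymul(L[0], H[0]))
    den_cl = np.polymul(np.polymul(L[1], H[1]), np.array([1.0, 0.0]))
    # D(z) = Kd * (1 + z^-1*H*L) / L:
    D_num = np.polymul(Kd[0], np.polymul(num_cl, L[1]))
    D_den = np.polymul(Kd[1], np.polymul(den_cl, L[0]))

Common factors may of course be cancelled from D_num and D_den before the recursion formulas are written.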
, k, where G(s)= G 1(S)G 3G 2(S) / [1 + H(s)Gls)G 3G 2(S)]: * N ext the overall pulse transfer function of the simulated system is formed by applying the Tustin element to each block in Figure 1 resulting in KA(z) = D(z)G 1(z)G aG 2 (z) 1 + Z-l H(z)G l (Z)G 3 G 2 (Z), where: 1. D(z) has the form of Equation 13. 2. G1(z), G 2(z), H(z) denote the approximate/discrete transfer functions for Gl(s), G 2(s), and H(s) respectively; each obtained utilizing Tustin's method. 3. G 3 represents a nominal gain of the nonlinear element.** All that remains is the determination of the coefficients of the polynominals in D(z). This may be accomplished by equating Equations 14 and 15, expanding the pulse transfer function, and then equating the coefficients of like powers in z. KA(z) = Kct(z) D(z)Gl(Z)G 3G 2(Z) =G(s) 1 + z-lH(z)G l(Z)G 3 G 2(Z) = ( )m n = 0, 1, 2, ... , k. * Adaptive filtering Application of the new design technique results in a discrete system model consisting of numerous pulse transfer functions, system nonlinearities, and a digital compensator. Since the design procedure considers nonlinearities as fixed nominal amplification factors, the coefficients in D(z) are time invariant. For simulatio~s where greater accuracy is a requisite, an adaptive filtering scheme is proposed.Figure 8 is a suggested digital model where iG 1(Z), G 2(z), and H(z) retain their definitions from Figure 7. In this case, the form of the filter given as Equation 17 has time-varying coefficients. (17) The coefficients may be obtained in a manner analogous to Equation i 6 for the fixed filter case. **This assumption is exactly the same made by Hurt. s *( )n represents the Tustin discrete approximation to the nth order operational integrator (I ;s)n. (16) With all pulse transfer functions determined for the linear portion of the system, the corresponding recursive relationships for digital simulation can be obtained. D( ) _ ac(t) + al(t)z + ... + am(t)zm z bo(t) + b 1(t)z + ... + bn(t)zn Figure 7 - New discrete general nonlinear system model (i5) *See previous footnote. Improved Techniques for Digital Modeling 477 tions may be written and subsequently implemented on a digital computer. d[NT] = {(8 - 50AT2)d[(N -l)T] +02T-4-25AT2)d[(N -2)T] + (I 2T + 4)r[NT] +(25AT2-8)r[(N -1)T] + (50AT2-12T+4)r[(N -2)T] +{25AT2)r[(N -3)T]} Figure 8 - Discrete general nonlinear system model with adaptive filtering The filter coefficients will, in general, be explicit mathematical functions of the quantity A(t) representing the pseudo-gain of the nonlinear element. As shown in Figure 8, the Coefficient Estimator plays the dual role of identifying the gain of the nonlinearity and calculating the time-varying coefficients of D(z). The simplest determination of gain identification by direct measurement procedures is A [(N + l)T] = C1(t)·I' C2(t) t=NT (8) where it is noted from Figure 8 that the best possible measurement with this compensatory device is delayed by one sample period. * More refined identification procedures utilizing both direct measurement and extrapolation would introduce further accuracy improvements. A n example for methods comparison To illustrate the design procedure and to present a basis for comparison, the second order nonlinear system shown in Figure 9 will be used. This system was originally employed by Sage and Smith. l l CaCti _I S.I c,ltl , - -_ _..., .. 
J.~~LCa ~ • Z5 """'is Cl.OI I Figure 9 - Nonlinear system employed as an example By applying the Tustin approximation for O/s)n to Gt(s) and G 2(s) discrete transfer functions Gt(z) and G 2(z) may be obtained. Then by utilizing Equations 13-16 in succession with H(s) = H(z) = 1, one obtains an explicit form for the digital compensator D(z). With these preliminaties complete, a set of difference equa*In cases where the nonlinearity may be expressed as a simple explicit mathematical function, "direct measurement" may be conveniently replaced by "direct formulation." 4+ 12T+25AT2 1-3T c2[NT] = 1 + 3T c2[(N - l)T] 3T + 1 + 3T {d[NT] -c[(N -l)T] +d[(N -l)T] -c[(N -2)T]} (9) 3 C1[NT] = C2 [NT] + 0.01 C2 (NT] 25T c[NT] =c[(N -l)T] +12 {cl[NT] Figure 10 illustrates the result of applying a positive step input of magnitude 10 to the discretized models employing the Tustin, IBM, Sage-Burt, and new simulation methods. A similar comparison of the responses produced for a sinusoidal excitation applied to the IBM, Sage-Burt, and new simulation models is made in Figure I I. These results clearly indicate that the new method for discrete system modeling and realtime simulation is justified since a marked improvement over the Tustin simulation has been achieved through a minimal amount of design effort. Three schemes for the identification of the system nonlinearity were chosen to form the basis of comparing the enhancement afforded to simulation accuracy through the utilization of adaptive filtering. The direct measurement method was first selected to indicate that even "stale" or delayed parameter identification provides better results than a fixed coefficient filter. One-half penod and full period linear extrapolation of the measured pseudo-gain were employed to illustrate further improvements. These identification procedures may be easily implemented as a component of the system simulation with the recursive formulas in Equation 20. Figures 12 and 13 compare the step and DIRECT MEASUREMENT A[(N + l)T] = 1.0 + 0.0IC 22[NT] (20a) 478 Spring Joint Computer Conference, 1968 a - o o A o G G 9 EXACT TUSTIN IBM SAGE-BURT :r NEW METHOD r( t) f = 10 U=I (t) = 0.1 SEC I I 9 A (\ a EXACT IBM SAGE - BURT NEW METHOD r{t) = 10 sin 2 t T = 0.2 SEC 6 4 10 2 a a (,) a a a or (,) or ~; ~~----~----~-------3~!~~~4 t-SEC -2 6 -4 4 . -6 2 -a oo---------o~.-5--------1.~0--------1~.5--------~,0 t - SEC Figure 10- Response of the example system to a step excitation ONE-HALF PERIOD EXTRAPOLATION A[(N + 0.01 + l)T] = 1.0 {1.5c2[NT] - 0.5c2[(N "- l)T] P{20b) FULL PERIOD EXTRAPOLATtON A[(N + 0.01 + l)T] = 1.0 {2c2[NT] .- c2[(N - 1)T] P (20c) sinusoidal responses' of the example syste~ discretized utilizing the new modeling procedure for various adaptive filteri~g measurement schemes. In order to compare the various discrete modeling aJ;ld simulation techniques, simulations of the example system were performed for numerous sampling increments utilizing step and sinusoidal excitations. Then the mean-square error was calculated for each simulation technique for each sampling interval over the observation period 0-5 seconds. The error analysis of· systems simulated with an applied step excitation Fgure 11 - Response of the example system to a sinusoidal excitation tion of magnitude 10 shown in Figure 14 indicates that the new technique provides considerable improvement over the Tustin method and has accuracy comparable to the IBM method and the Sage-Burt quasilinearization method .. 
In order to compare the various discrete modeling and simulation techniques, simulations of the example system were performed for numerous sampling increments utilizing step and sinusoidal excitations. The mean-square error was then calculated for each simulation technique at each sampling interval over the observation period 0-5 seconds. The error analysis of systems simulated with an applied step excitation of magnitude 10, shown in Figure 14, indicates that the new technique provides considerable improvement over the Tustin method and has accuracy comparable to the IBM method and the Sage-Burt quasilinearization method. Also, note the accuracy improvement and increased region of stability when Sage and Burt utilize quasilinearization. The IBM, Sage-Burt with quasilinearization, and new simulation techniques provide comparably low sensitivity to error as the simulation time increment is altered. The resulting error analysis of systems simulated with an applied sinusoidal excitation is shown in Figure 15. A marked improvement over the Tustin method is exhibited by the system simulated using the new technique, along with error-sensitivity characteristics comparable to the IBM method. Figures 16 and 17 illustrate the effects of adaptive filtering when applied to the example system for step and sinusoidal excitations, respectively. Clearly, in all cases, reduced error and reduced error sensitivity to the sampling interval are shown for simulation time intervals less than 0.25 second. As would be expected, for small sampling intervals the error decreases as transitions are made from a fixed filter, to an adaptive filter with direct (but delayed) process identification, to an adaptive filter with one-half period prediction, and finally to an adaptive filter with a one-period prediction.

[Figure 12 - Step response of the example system for variations of the new simulation technique]
[Figure 13 - Sinusoidal response of the example system for variations of the new simulation technique]

To adequately evaluate various digital simulation techniques, it is imperative to consider actual computation time in addition to simulation accuracy and design effort. For the selected example, the number of additions and multiplications and the computation time for a single simulation interval are summarized in Table I.*

TABLE I - Computation time of a single simulation cycle for the selected example

Method                              Computation time   Additions   Multiplications
Tustin                                  187.2 μsec          6             7
IBM                                     257.6 μsec         10             6
Sage-Burt                               256.8 μsec          9             8
New, non-adaptive                       375.2 μsec         13            12
New, half-period extrapolation          532.8 μsec         18            18

These results are based upon an average of 9.6 μsec for the floating-point add instruction and 20 μsec for the floating-point multiply instruction on the PDP-6 digital computer.¹⁴ One may note from this table that, for this specific example, the new simulation method requires twice the computation time of the classical Tustin method. It is also possible to deduce from Figures 14 and 15 that, for all simulation time intervals greater than 0.02 second, the new simulation method exhibits less error than the Tustin method with one-half the simulation interval. A common failing of both the adaptive filtering method and the Sage-Burt method is their inability to cope with large sampling intervals and the possible consequent introduction of instabilities under such conditions. It should be evident that in the former case the difficulty is attributable to parameter-identification error, while in the latter the quasilinearization procedure fails to converge.

*In obtaining this tabulation, no recursion formulas are reduced; i.e., it is assumed the response of each block in Figure 9 is desired.
CONCLUSIONS

A new method for digital modeling and system simulation has been introduced. The simulations performed utilizing this technique characteristically exhibit a high degree of accuracy improvement over the Tustin method, and in many cases accuracy comparable to the IBM or Sage-Burt techniques. The modeling procedure encompasses the simplicity of the Tustin method, involves a minimal amount of tailoring, and does not necessitate the usage of ancillary computer programs for design purposes. Error analysis indicates that the IBM method possesses the desirable characteristics of low simulation error and reduced sensitivity to simulation error. The Sage-Burt method and the newly developed method also offer low sensitivity to error for the low sampling intervals which correspond to those generally used for adequate information representation. However, only the new technique offers an easily utilized modeling procedure. Although it must be emphasized that these results are for a selected example, the results of other work appear to be in complete concurrence. For increased simulation accuracy utilizing the basic philosophy of the new simulation method, an adaptive filter whose characteristics are time-varying is suggested. An example system simulated using an adaptive filter provides both reduced error and reduced error sensitivity to the simulation time interval. However, as may be inferred from the presented results, in certain cases the utilization of an adaptive filter may limit the real-time simulation capability.

[Figure 14 - Error analysis of the example system with a step excitation]
[Figure 15 - Error analysis of the example system with a sinusoidal excitation]
[Figure 16 - Error analysis of the example system with a step excitation showing the results of adaptive filtering]
[Figure 17 - Error analysis of the example system with a sinusoidal excitation showing the results of adaptive filtering]

ACKNOWLEDGMENTS

The author wishes to thank Mr. R. Belluardo for the encouragement he provided.
REFERENCES

1 J B SCARBOROUGH Numerical mathematical analysis The Johns Hopkins Press Baltimore Md 1958
2 R W HAMMING Numerical methods for scientists and engineers McGraw-Hill Book Co Inc New York 1962
3 A TUSTIN A method of analysing the behaviour of linear systems in terms of time series Journal IEE vol 94 pt II-A May 1947 pp 130-142
4 A MADWED Number series method of solving linear and non-linear differential equations Rept No 6445-T-26 Instrumentation Laboratory Massachusetts Institute of Technology Cambridge Mass April 1950
5 R BOXER S THALER A simplified method of solving linear and non-linear systems Proc IRE vol 44 Jan 1956 pp 89-101
6 C A HALIJAK Digital approximation of the solutions of differential equations using trapezoidal convolution Rept No ITM-64 Bendix Systems Div Ann Arbor Mich August 1960
7 J T TOU Digital and sampled-data control systems McGraw-Hill Book Co Inc New York 1959
8 J HURT New difference equation technique for solving nonlinear differential equations Proceedings of the 1964 Spring Joint Computer Conference pp 169-179
9 M FOWLER A new numerical method for simulation Simulation vol 4 May 1965 pp 324-330
10 A P SAGE R W BURT Optimum design and error analysis of digital integrators for discrete system simulation Proceedings of the 1965 Fall Joint Computer Conference pp 903-914
11 A P SAGE S L SMITH Real-time digital simulation for systems control Proc IEEE vol 54 Dec 1966 pp 1802-1812
12 R E BELLMAN R E KALABA R SRIDHAR Adaptive control via quasilinearization and differential approximation Rand Corporation Research Memorandum RM-3928-PR Nov 1963
13 R E BELLMAN R E KALABA Quasilinearization and nonlinear boundary-value problems Rand Corporation Research Report R-438-PR June 1965
14 Programmed Data Processor-6 handbook F-65 Digital Equipment Corporation Maynard Mass 1965

Extremal statistics in computer simulation of digital communication systems*

by MISCHA SCHWARTZ and STEVEN H. RICHMAN
Polytechnic Institute of Brooklyn
Brooklyn, New York

INTRODUCTION

With the advent of the digital computer, it is becoming more and more common to simulate the operation of rather sophisticated communication systems on the computer. The performance of systems under various types of operating conditions may be evaluated quite readily and economically prior to actual field usage. The average error rate serves as a very common measure of performance for digital communication systems, with a probability of error of less than 10^-5 a desirable goal in most system design. Such extremely low error rates pose a real measurement problem, however. Generally, with the Monte Carlo simulation techniques used, one would require data samples of the order of at least 10 times the reciprocal of the error probability to make valid performance estimates, leading to costly and time-consuming simulation runs. The question of more efficient estimation of low error probabilities in communication system simulation is thus an extremely important one. We report here on encouraging results indicating that the methods of extremal statistics may reduce the data requirements in many simulation experiments by at least an order of magnitude. Major applications of the field of extremal statistics¹ have heretofore been made primarily to such areas as flood control, structural design, meteorology, etc.

*The work reported in this paper was supported under NSF Grant GK-527.
It is only relatively recently that applications to communications have begun to be made, with primary emphasis thus far on the analysis of data obtained from existing systems.²,³ Thus, use has been made, in analyzing these data, of special plotting paper developed by Gumbel.¹ Our approach has differed in assuming from the beginning that all calculations were to be made by a high-speed computer, that time was of the essence, and that we were interested in applying the theory to the simulation of broad classes of systems.

Extremal statistics is, as the name implies, concerned with the statistics of the extrema (maxima or minima) of random variables. As such it deals with the occurrence of rare events, exactly the problem encountered in simulating low-error-rate communication systems. It is found¹ that, asymptotically (i.e., for very large sample numbers of the random variable under study), many of the most common probability distributions follow a simple exponential law when expanded about an arbitrary point on their tails. Thus, the probability of exceeding a specified value or threshold x_0 assumes asymptotically the form

    P_e = Prob(x > x_0) = (1/n) e^{-a_n(x_0 - u_n)}    (1)

The number n represents the number of samples used, with a_n and u_n constants depending on n and on the actual distribution of the random variable. In particular, the probability of exceeding u_n is 1/n, providing another definition of u_n. Figure 1, for an arbitrary probability density function f(x), shows Equation (1) graphically.

[Figure 1 - Exponential approximation to probability of error]

The gaussian (normal), Rayleigh, exponential, and Laplacian distributions are among the examples of the asymptotically exponential distributions. All of these distributions may be approximated by Equation (1) in the vicinity of u_n. How far from the vicinity of u_n one may move depends, of course, on the actual underlying distribution and the particular point u_n one expands about. As an example, Figure 2 compares the exponential approximation to the actual probability of exceedance of x, P_e, for a gaussian density function. Here n has been arbitrarily chosen as 100. The actual probability P_e and its exponential approximation are then matched at P_e = 10^-2. It is readily shown that the point u_n about which one expands is 2.32, and a_n = 2.68.

[Figure 2 - Exponential approximation to gaussian statistics; n = 100, a_n = 2.68, u_n = 2.32]

Comparing the probability of exceedance P_e for the actual gaussian and its exponential approximation, as plotted in Figure 2, it is apparent that the two are within 25% of one another at P_e = 10^-3 and differ by 50% at P_e = 10^-4. This then points up the significance of the extremal-statistics approach: if one is interested in estimating small probabilities of error, say of the order of 10^-3 or 10^-4, it may be possible instead to first estimate much higher probabilities, say 10^-2 in the example of Figure 2. If the exponential approximation is valid, one should then be able to extrapolate down to the desired probability. Instead of the usual number of samples required to estimate P_e, say 10/P_e, one can then work with a much smaller number n. There is, of course, one major problem, however.
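The constants quoted for Figure 2 follow from two facts: u_n is the point exceeded with probability 1/n, and a_n is the matched exponential rate at that point. Here a_n is taken as n times the density at u_n, the standard extremal-statistics relation (an assumption on our part, though it reproduces the quoted 2.68). A sketch of the comparison:

    import numpy as np
    from scipy.stats import norm

    n = 100
    u_n = norm.ppf(1 - 1/n)      # exceeded with probability 1/n -> 2.326
    a_n = n * norm.pdf(u_n)      # matched exponential rate      -> about 2.68

    for x0 in (3.09, 3.72):      # thresholds where true Pe is near 1e-3, 1e-4
        exact = norm.sf(x0)
        approx = (1/n) * np.exp(-a_n * (x0 - u_n))
        print(f"x0 = {x0}: exact = {exact:.2e}, extremal approx = {approx:.2e}")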
Since the underlying density function f(x) is in general unknown, or difficult to evaluate in the complex systems of interest to us, the two parameters a_n and u_n are unknown as well, and must be estimated. In the next section we discuss various ways of estimating a_n and u_n, and results of computer runs for two simple density functions, the gaussian and the exponential. The results are quite encouraging: even with the additional samples needed to estimate a_n and u_n, one can still save at least an order of magnitude in the total number of samples required to estimate a given probability of error the traditional way. In the final section, we discuss the computer simulation of two specific feedback communication systems for which probabilities of error have been estimated quite successfully using extremal statistics. (One of these systems is an example of one for which actual calculations of probabilities of error are quite difficult to make. In the example shown only bounds on the error have been obtained, and the simulation results check these quite closely.)

Estimation of extremal parameters

We discuss in this section the use of extremal statistics to estimate small probabilities of error in the case of two known distributions, the exponential and the gaussian. The problem here is twofold: first to estimate the extremal parameters a_n and u_n, then to determine, using these estimates, how well the actual probabilities at the tails are estimated. The exponential density function normalized to unit variance is given by

f(x) = e^{-x} u(x)    (2)

u(x) the unit step function, while the gaussian density function, again normalized to unit variance, is of course given by

f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}    (3)

Figure 2 - Exponential approximation to gaussian statistics

One would expect rather good estimates of the probability at the tail for the exponential density function, since it is already in the asymptotic form of equation (1). In the gaussian case, as pointed out in the previous section and as illustrated for one case in Figure 2, it is theoretically possible to extrapolate as much as two orders of magnitude away from the starting point u_n before the quadratic exponential behavior of the gaussian function takes over and produces significant deviations away from the linear exponential behavior of extremal statistics.

The actual experimental behavior of the exponential approximation depends critically on the estimation of the two parameters a_n and u_n. To determine these we use the fact, as demonstrated by Gumbel,1 that they are intimately connected to the asymptotic statistics of the extrema (maxima) of the random variable x. Specifically, if one generates n independent samples of x, the probability density function of the largest (maximum), x_m, of the n samples is asymptotically (n \to \infty) given by

p_n(x_m) = a_n \exp[-y - e^{-y}], \qquad y \equiv a_n (x_m - u_n)    (4)
Equation (4) is found to be valid for a wide class of density functions possessing exponential behavior at the tails, with the exponential, gaussian, and Rayleigh functions typical examples. From Equation (4) it is readily shown that u_n (n \to \infty) is a measure of the mode of p_n(x_m), while 1/a_n (n \to \infty) is a measure of the dispersion. Specifically, one finds, using Equation (4), that

\frac{1}{a_n} = \frac{\sqrt{6}}{\pi} \sigma_m    (5)

and

u_n = E(x_m) - \frac{\gamma}{a_n}    (6)

Here \sigma_m is the standard deviation of the maximum (extremal) values, x_m, of x, E(x_m) the expected value of these maxima, and \gamma = 0.5772 is just Euler's constant. It is thus apparent that to estimate a_n and u_n one must first ensure n >> 1 (this is why Equation (1) is applicable to the tails of density functions, where P_e << 1), and then generate sufficient samples of the random variable x under test to measure their statistical properties. If N samples of the largest value of x in a group of n are to be made available, this implies repeating the experiment nN times in all. It is the total number nN that is to be compared to the usual number 10/P_e.

From the form of Equations (5) and (6) one would expect that for n and N large enough, good approximations to a_n and u_n would be obtained by averaging appropriately over the N samples of the maxima available. As noted later this was in fact found to be the simplest and most accurate procedure in actual experimentation with the computer. This estimation procedure is portrayed in Figure 3. There, as an example, n = 10^3 and N = 10 are chosen. The total number of independent samples or repeats of the computer simulation involved would thus be nN = 10^4. The resultant output samples would be grouped into N = 10 groups of n = 10^3 samples each. The largest sample, x_i, in each group would then provide N = 10 samples with which to estimate a_n and u_n.

Figure 3 - Estimation of a_n and u_n

There is a tradeoff possible between n and N, given the fixed number of repetitions nN. Thus decreasing n decreases the range over which one would theoretically expect the asymptotic exponential approximation to hold (assuming perfect knowledge of a_n and u_n), but allows better estimation of a_n and u_n as N increases. Some analysis of the optimum choice of n and N has been carried out in a recently completed doctoral thesis.4 In the actual computer simulations carried out, n was taken as 500 and N = 20, so that a total of 10,000 actual repetitions of the different experiments tried were performed. Normally this would provide relatively accurate estimation of probabilities of error as low as 10^-3. We were interested in extending the estimation to 10^-4 and 10^-5.

As noted previously, an obvious initial estimate for a_n is to replace \sigma_m in Equation (5) by the sample standard deviation s, using the N = 20 samples available of the extrema. Similarly a first estimate for u_n is to replace E(x_m) in Equation (6) by the sample mean \bar{x}_m (again using the 20 extremum samples). (These are the procedures suggested in Figure 3.) Although the sample standard deviation is in general a rather poor estimator of the statistical standard deviation [var(s^2) = 2\sigma_m^4/N for gaussian statistics], the experimental results obtained were surprisingly good.
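The whole procedure of Figure 3 can be summarized in a few lines. The following sketch is a modern illustration, not the authors' program: it draws nN unit-gaussian samples, regroups them into N groups of n, estimates a_n and u_n from the group maxima via Equations (5) and (6), and then extrapolates the tail by Equation (1). The exact gaussian tail is printed for comparison; results vary with the random seed.

    import math, random

    n, N = 500, 20                      # group size and number of groups, as in the text
    random.seed(1)

    # nN independent unit-gaussian samples, one maximum per group of n
    maxima = [max(random.gauss(0.0, 1.0) for _ in range(n)) for _ in range(N)]

    # Estimate a_n and u_n from the N extrema via Equations (5) and (6):
    #   1/a_n = (sqrt(6)/pi) * sigma_m,   u_n = E(x_m) - gamma / a_n
    mean_m = sum(maxima) / N
    s_m = math.sqrt(sum((x - mean_m) ** 2 for x in maxima) / (N - 1))
    gamma = 0.5772                       # Euler's constant
    a_n = math.pi / (math.sqrt(6.0) * s_m)
    u_n = mean_m - gamma / a_n

    def tail_estimate(x0):
        """Extrapolated exceedance probability from Equation (1)."""
        return math.exp(-a_n * (x0 - u_n)) / n

    def gaussian_tail(x0):
        """Exact exceedance probability, for comparison."""
        return 0.5 * math.erfc(x0 / math.sqrt(2.0))

    for x0 in (3.1, 3.7, 4.3):           # true Pe of about 1e-3, 1e-4, 1e-5
        print(f"x0 = {x0}: extremal {tail_estimate(x0):.2e}, exact {gaussian_tail(x0):.2e}")

Note that only nN = 10,000 samples are consumed, whereas a direct Monte Carlo estimate of P_e = 10^-4 or 10^-5 would require of the order of 10/P_e, i.e., 10^5 to 10^6 samples.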
Figure 1 - Input
Figure 2 - File maintenance
Figure 3 - Scope processing
Figure 4 - Output

TIMING PROBLEMS: Many users require execution of the same section of code, use of the same buffer areas, and entry into the same queues simultaneously.

SATURATION PROBLEMS: Queues become very long or full, causing rejection of a request for a resource. Buffers are filled, requiring the system to allocate and link to additional buffers. For example, when an INPUT or a FILE command terminates, sorting buffers are in demand and the disk request queue grows. Another example is the LOGOUT-LOGIN operation, by which each user's file catalog must be transferred to and from the disk.

Four general classes of users can be simulated by the MUSE program:
A polite general purpose user who accepts the rhythm of the system and is interested in exercising its capabilities.
A stereotype user from a particular job environment who represents a specific set of needs such as file management, short FORTRAN compilations or heavy execution requirements.
The impatient user who will run at his speed, rather than the system's speed, and continues relentlessly to enter commands into the system.
The hostile user who is intent on breaking the system.

Single terminal operations often suffice for the polite user and for the initial encounter with the hostile user. Multi-terminal simulation is the only way to adequately gauge system performance for stereotype users, impatient users or a gang of hostile users. Simulator design was oriented toward emulating the impatient, hostile and stereotype users. Since, however, data decks define both the number and type of user, a change from any one group to another is easily accomplished. The polite user certainly cannot be overlooked. Here is the capability to quickly and thoroughly exercise all variations of a command. For example, the RESPOND user may define a format to organize his data input stream. There are 512 variations on the FORMAT command's structure alone.
The checkout of these variations is trivial when run through the simulator but extremely tedious when entered by hand more than once. Problems are easily repeatable when simulator testing is done.

Not all problems can be solved or even isolated by using the simulator alone. System malfunctions associated with misplaced files or records require some searching and guessing after an error is detected. This class of problems is accommodated by the design of the interface between MUSE and RESPOND, which allows simulator operation to be terminated and activity to be transferred to a Teletype terminal. This mode of operation was also vital in resolving simulator and RESPOND communication problems.

Simulator design features

The MUSE simulator consists of two basic parts. The major part is essentially a FORTRAN program with several small assembly language (COMPASS) routines incorporated for efficient use of central memory after loading. A second part consists of extensions to the RESPOND executive program which allow communication with the simulator as though it were the system multiplexer. This interface program provides automatic switching of activity between simulator and Teletype terminals.

The FORTRAN program resides at one of the control points in the multiprogramming environment of the SCOPE system. This program requires a maximum of 9600 words of core memory for the 16-terminal version and 18000 words for the 64-terminal version. A character conversion table relates Teletype codes to internal codes in the same manner as in the RESPOND system.

The commands and data input submitted from the simulator to RESPOND are loaded by the simulator as data strings separated by control cards. The command and input strings reside on disk as separate files for each terminal. A simulator input buffer is filled with characters for each terminal from these files. When the buffer is filled, RESPOND performs a parallel read operation, bringing in all characters for all terminals as though the simulator were a multiplexer. Output is transferred in a similar fashion from RESPOND to a simulator output buffer, then to a disk file for each specific terminal.

Each data card record in a data string represents one discrete command or one data input line. Data strings may be entered by cards or from magnetic tape. Commands can be up to 77 characters per card and data records up to 80 characters per card, with as many cards per input record as desired. The last two columns on a command card are used by the simulator to designate the number of times the command should be repeated under certain conditions.

Up to 36 diagnostics and other replies generated by RESPOND are entered into the simulator program's diagnostic table from data cards. When the simulator receives a message from RESPOND, it scans the diagnostic data cards loaded with the program. If a match is not found, the next line of input to RESPOND is issued. If a match is found, coded information entered with each command triggers a variety of actions. This allows a certain degree of recovery within the simulator from conditions within RESPOND when that system is heavily loaded.
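The reply-matching loop just described is easy to picture in modern terms. The sketch below is a loose Python rendering of that logic, not the MUSE code itself (which was FORTRAN and COMPASS); the reply texts and action codes are invented for illustration.

    # A loose rendering of the simulator's reply handling. The reply texts
    # and action codes below are invented for illustration; MUSE itself used
    # its own card formats for the diagnostic table.

    DIAG_TABLE = {
        "FILE BUSY": "RETRY",       # re-issue the same command
        "SYSTEM LOADED": "WAIT",    # idle this terminal for one cycle
    }

    def next_line(reply, script, cursor, log):
        """Choose the next input line for one simulated terminal.

        script is the terminal's list of command/data lines; cursor indexes
        the line most recently issued. Returns (line_or_None, new_cursor)."""
        action = DIAG_TABLE.get(reply.strip())
        if action == "RETRY":
            log.append((reply, action))
            return script[cursor], cursor        # repeat the last command
        if action == "WAIT":
            log.append((reply, action))
            return None, cursor                  # issue nothing this cycle
        cursor += 1                              # no match: advance normally
        return (script[cursor], cursor) if cursor < len(script) else (None, cursor)

    script = ["LOGIN USER1", "FILE ALPHA", "LOGOUT"]
    log = []
    line, cur = next_line("READY", script, -1, log)           # -> "LOGIN USER1"
    line, cur = next_line("SYSTEM LOADED", script, cur, log)  # -> None (wait)
    line, cur = next_line("READY", script, cur, log)          # -> "FILE ALPHA"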
The simulator data control cards provide the following capabilities:
• the TERMINAL ID card identifies a particular part of the data string with one or more terminals. All possible, or up to 35, terminals may be specified, individually or inclusively, to use the same data string.

For a non-self-repaired and non-redundant system, the system reliability is an exponential function of time, provided the component or subsystem reliabilities are also exponential functions of time. However, this is not true for a self-repaired or for a redundant system: the system reliability function is not an exponential, but a more complicated curve. Figure 6 shows the reliability, R(t), for a self-repairing module containing an accumulator register. The R(t) for the accumulator without self-repair is plotted on the same axes to illustrate the difference between the curves.

Figure 6 - Reliability versus time

The memory module is working if any of the following conditions holds:
(1) The Main and Auxiliary Units (memories) are working.
(2) The Main Unit and Main Failed Store are working and the Auxiliary Unit has failed.
(3) The Auxiliary Unit, Main Failed and Auxiliary Failed Stores are working and the Main Unit has failed.

From these conditions the reliability R of the memory module may be written as

R = \{R_M R_A + R_M R_{MF} (1 - R_A) + R_A R_{MF} R_{AF} (1 - R_M)\} R_S    (3)

Since R_M = R_A and R_{MF} = R_{AF}, this equation may be rewritten as

R = \{R_M^2 + R_M R_{MF} - R_M^2 R_{MF} + R_M R_{MF}^2 - R_M^2 R_{MF}^2\} R_S    (4)

The R's are all time dependent; that is, the reliability of a unit depends on how long the unit has been operating. The most commonly assumed form of time dependence is

R(t) = e^{-\lambda t}    (5)

where \lambda is the failure rate. Most data on component reliability are given in the form of \lambda, or failure rate. For an exponential reliability curve the mean time to failure (MTTF) is 1/\lambda. The MTTF is often used as a measure of system reliability. A parameter which can be used in comparing various systems, then, is the reliability of the system at some mission time. The reliabilities of the various parts may be calculated in terms of the mission time and the part failure rates. Failure rates of 0.05 per 10^6 hours for JK flip-flops and 0.02 per 10^6 hours for gates are assumed. Table II compares the reliabilities and component complexities for a hypothetical computer with and without self-repair. The computer is a simple 12-bit machine designed under the self-repair contract for the purpose of demonstrating self-repair techniques. The table shows both the increased complexity and the increased reliability due to self-repair.
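For concreteness, Equation (4) is easy to evaluate numerically. In the sketch below the failure rates are illustrative assumptions, not the paper's module-level values; it compares a single non-redundant unit with the self-repaired memory module at two mission times.

    import math

    def r(lam, t):
        """Exponential unit reliability, Equation (5): R(t) = exp(-lambda t)."""
        return math.exp(-lam * t)

    def module_reliability(t, lam_m=2e-5, lam_mf=5e-6, lam_s=1e-6):
        """Equation (4) for the self-repairing memory module.

        lam_m : failure rate of the Main (= Auxiliary) unit, per hour
        lam_mf: failure rate of the Main (= Auxiliary) Failed Store
        lam_s : failure rate of the switching hardware (R_S)
        These rates are assumptions chosen only for illustration."""
        rm, rmf, rs = r(lam_m, t), r(lam_mf, t), r(lam_s, t)
        return (rm * rm + rm * rmf - rm * rm * rmf
                + rm * rmf * rmf - rm * rm * rmf * rmf) * rs

    for t in (10_000, 50_000):
        simplex = r(2e-5, t)             # single non-redundant unit
        print(f"t = {t:6d} h: simplex {simplex:.4f}, "
              f"self-repaired module {module_reliability(t):.4f}")

With these rates the self-repaired module is noticeably more reliable than the simplex unit at both mission times, in line with the qualitative behavior of Figure 6.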
CONCLUSIONS

A technique of implementing self-repair, via duplication and spare switching, for digital systems has been described. It has been shown that the use of this technique is feasible for increasing the reliability of these systems. The primary disadvantages of the technique are increased hardware and cost, and increased execution times. However, in systems where reliability is of utmost importance, the reliability increase can offset the disadvantages.

TABLE II - Hypothetical Computer Reliability and Complexity Showing Benefits of Self-Repair
(For each module - I Register L, I Register R, N Register, Sequencer, Misc. Control, A Register L, A Register R, Q Register L, Q Register R, S Register, P Register, Memory, and the machine total - the table lists the relative complexity of the self-repaired model and the module reliability at 10,000 and 50,000 hours, with and without self-repair.)

A study of the data commutation problems in a self-repairable multiprocessor

by KARL N. LEVITT, MILTON W. GREEN and JACK GOLDBERG
Stanford Research Institute
Menlo Park, California

INTRODUCTION

In recent years significant effort has been devoted to the development of techniques for realizing computer systems for which a significant problem of design arises from the unreliability of components and assembly, as well as from special constraints on construction and operation.1 Examples of such systems are computers for aerospace missions and on-line process control. A significant effort has been directed in the past decade to attempts to solve various facets of the problem of realizing ultrareliable digital systems. This work, also summarized in detail in Ref. 1, has ranged from investigations of fault-masking techniques for various computer subblocks, such as arithmetic processor and memory units, to studies of reliability enhancement policies for large systems. It is felt that most of the problems pertaining to enhancing the reliability of isolated digital subblocks are now understood, at least from the standpoint of being able to make sound engineering judgments concerning the utilization of the various techniques.

Strictly passive redundancy techniques (e.g., replicated voting logic, error correcting coding methods) have in part been applied to the control and arithmetic processing sections of the Saturn IVB guidance computer, and it has been concluded2 that the application of such techniques exclusively cannot economically satisfy the computation and reliability requirements of future advanced computers. This conclusion has prompted many organizations to investigate dynamic error control mechanisms, in which the logical interconnections among the components of the computer may be altered.3,4,5 In the system schemes investigated, the reconfiguration is employed only at very high functional levels, but it is well known that there is potentially greater gain to be achieved by employing the reconfiguration at lower system levels.
It has also been recognized that there is considerable advantage to a system wherein the allocation of computation tasks is adjusted so as to be consistent with the available equipment. In such systems, which are colloquially said to embody "graceful degradation," most of the available equipment is performing useful computations. Among the many references on this approach, in Ref. 6 a single processor structure is assumed, and in Refs. 7 and 8 a multiprocessor structure is postulated. In these studies, as well as many others, several important items are not treated in depth, namely those relating to:

(1) Diagnostic and replacement policies
(2) Logical design techniques for memory, control, and processing units so that diagnosis and repair are facilitated (or indeed feasible)
(3) Reliable commutation (or data switching) required for the execution of subsystem replacement
(4) The specification of software for the control of diagnosis and repair.

When the new sources of failure that are introduced by the mechanization of the above four items are included in the reliability analysis, it is not clear that the systems will perform as promised. This is an especially critical problem when reconfigurability is extended to low system levels or when the capability for graceful degradation is provided. Hence, it is imperative that these four items be studied in detail, i.e., that design schemes be developed that minimize the new sources of failure, and that reliability analyses be carried out so as to accurately estimate overall system reliability. In this paper we develop a multiprocessor model - the structure which appears to be most appropriately matched to our computation requirements - and then study the data commutation problem within the framework of this multiprocessor organization.

A multiprocessor model

Many descriptions of multiprocessor systems have appeared in the literature,8,9,10 and several contemporary computer systems11 rely upon multiprocessing. Most of these previous descriptions have been concerned with (1) gross estimates of system reliability, assuming, for example, that diagnosis and switchover are always executed correctly; (2) scheduling analyses and simulations to facilitate the determination of system responses to various inputs; and (3) the specification of software that will enable the optimal utilization of the hardware. One of the most important functions within a multiprocessor is the switching of data and control signals among the component blocks. The components needed for such switching are themselves sources of system error, and so it is necessary to design switching or commutation networks to achieve the utmost reliability.

Figure 1 - Multiprocessor computer system block diagram (all data channels are bidirectional)

The model of a multiprocessor with which we will be concerned is depicted in Fig. 1. The system consists of a set of M high-speed working memories (WM); a set of N simple processor and control units (SP); a set of Q arithmetic logic units (ALU); a set of R back-up memories (BAM); an input/output device controller (I/O), for which we provide sets of spare registers, counters, buffers, and real-time clocks; two commutation networks (CN); a supervisory control unit (SCU); and two registers for the setup, i.e., the establishment of input/output links, of the commutation networks; it is convenient to view these registers as forming a component of the SCU. In Ref.
12 we describe in detail the functional requirements of each block type, and, in addition, we present a flow description of the system responses to input, interrupt, and error conditions. It is envisioned that each SP unit will have the capability of executing comparatively simple decision and arithmetic algorithms, the capability of controlling program flow, and the capability of controlling processor allocation and scheduling. An ALU will be used for the execution of complex algorithms requiring extensive processing hardware. The SCU will function as a "referee" in all error control processes, and, in essence, represents the system "hard core," although it can be superseded by an external command. The BAMs will store the task programs, diagnostic programs, and the setup programs for the commutation networks.

In addition to the interunit communication links provided by the commutation networks, a single data channel (probably serial), shown as bold-faced lines in Fig. 1, is provided, linking the SP and WM units with the supervisory control unit. It was noted by Alonso10 that a complete multiprocessing system could be designed containing only this single data channel communication link (although in practice it is doubtful that this link would be serial), but we have included the possibility of multiple simultaneous communication between SP and WM blocks because of the additional flexibility thus provided and because at this stage it seems appropriate to work with a general model.

One important feature of the system, not shown explicitly in the figure, is that each defined block of the system will have at least one distinct power supply associated with it. Furthermore, it is assumed that the power can be disconnected from a faulty block without resulting in the propagation of errors into connecting blocks due to excessive loading on the part of the disconnected unit. Hence, one mode of error control would trivially involve the disconnection of a major system block, i.e., WM, SP, or ALU, upon the detection of a hardware failure. However, it appears that the overall system reliability is enhanced if some capability for repair is incorporated within these block types. We have found that the repair operation is particularly easy to carry out if each major system block is realized as a one-dimensional cascade of identical elements. We call such a logical realization a byte-sliced realization, since it is natural to assign a byte (containing at present an undetermined number of bits) of each of the registers, adders, decoders, etc., to each element or slice in the cascade. The repair operation for byte-sliced realizations, assuming each major block contains several redundant slices, then requires the routing of external data to and from only the working, i.e., unfailed, slices of the block, and also the "shorting" of data internal to the block, e.g., arithmetic carry or control information, around faulty slices. In Refs. 1 and 12 logical design techniques are presented to demonstrate the feasibility of realizing, in a byte-sliced structure, the memory, arithmetic logic, and microprogram control functions. Thus, it is seen that the commutation networks will perform the function of establishing communication links between WM-SP units and between SP-ALU units, and also the function of repairing the units by routing data only to working byte slices.
Commutation requirements

The data switching functions described in the previous section give rise to a variety of specifications for commutation networks. In this section we will classify the major network types, and in the remaining sections we will present a number of detailed designs for the various types. In addition to meeting the particular switching requirements (e.g., number of active paths, number of configurations, etc.), several engineering constraints should be considered in the design of practical networks. Thus, for any commutation function it is desirable to synthesize networks for which

(1) The design is economical
(2) The network setup is not difficult
(3) The data transfer is rapid
(4) Failures in the commutation network do not disable either the commutation network or the modules served by the network (this tolerance to CN failures should be achieved with minimal increase of network complexity), and
(5) If the commutation network is "repairable," the diagnostic routines should be easy to specify and of minimal length.

We have classified two types of data commutation for the task of module assignment (i.e., the assignment of, for example, WMs to SPs), namely:

(1) Complete permutation - complete utilization
(2) Complete permutation - incomplete utilization,

and three types of commutation for the task of repair, namely:

(3) Incomplete permutation - order preserving
(4) Incomplete permutation - nonorder preserving
(5) "Shorting."

These five commutation functions are described schematically in Fig. 2, where the specific applications of each function are also listed.

Figure 2 - Classification of data commutation requirements (applications: complete permutation - complete utilization: communication between working memory and simple processor, full capacity of interconnection; complete permutation - incomplete utilization: communication between backup memory and simple processor, limited capacity of interconnection; incomplete permutation - order preserving: interconnection between byte slices of redundant units when the data is order sensitive; incomplete permutation - nonorder preserving: communication between spare registers and between ALU and simple processor; "shorting": routing of data around faulty byte slices)

The assignment associated with the complete permutation - complete utilization [CPCU(N)] function is obvious; the commutation network is to be capable of permuting in an arbitrary manner a set of N input data lines (all lines active) emerging, for example, from a set of memories, to a set of N output lines incident to, for example, a set of SP units. In the illustration a data transfer path may represent a parallel set of lines containing one computer word (24-56 bits). The assignments associated with the complete permutation - incomplete utilization [CPIU(N,m)] function differ from those associated with the CPCU function in that for the former, not all terminals are active at any one time, i.e., only a subset containing m inputs of the total of N inputs and outputs need to be interconnected at a given time.
For the incomplete permutation - order preserving [IPOP(r,m)] function, a subset containing m inputs of the r inputs, say for example associated with the working byte slices of a simple processor and control unit, is to be connected to a subset of the outputs, say associated with the working byte slices of an arithmetic logic unit, but with the restriction that the spatial ordering of the input signals is to be preserved at the output. The preservation of order is clearly required in the example, since the data to be commutated is a binary number. The assignments associated with the incomplete permutation - nonorder preserving [IPNOP(r,m)] function differ from those of the IPOP case in that for the former preservation of order is not a requirement. It may be noted that this function differs from the CPIU function in that the CPIU function requires arbitrary specification of terminal pairs. For the shorting function the outputs of a given byte slice are either to be connected to the succeeding stage (slice) or "shorted" around that succeeding slice.

Crossbar solutions to the commutation network design problem

The obvious solution to the commutation network design problem relies upon the use of a single-level crossbar switch, similar to the type commonly found in central telephone exchanges. In Fig. 3 we display a schematic representation of a crossbar switch serving a set of WMs and SPs. Here the crossbar, where each single-pole single-throw switch represents a crosspoint, performs both the CPCU(N) and IPOP(r,m) requirements. Clearly a crossbar with N^2 r^2 crosspoints would be sufficient, but it was shown in Ref. 1 that actually N^2 (r^2 - m^2) crosspoints are sufficient. In any event it is seen that 28 x 10^4 crosspoints are required to serve 25 (= N) processors and memories, 32 (= r) total bytes, and 24 (= m) bytes required for computation. Since a multiprocessor of this complexity is not unreasonable, there is considerable motivation to seek more economical commutation network designs.

Figure 3 - "Crossbar" realization of commutation function

The primitive building block of commutation networks

Most of the commutation networks to be described in succeeding sections will be composed of interconnections of the "cell" shown in Fig. 4. We have found that arrays of such cells provide a very attractive balance among the various engineering constraints listed in the previous section. In addition, the uniformity of cell types and the regularity of array structure make such arrays very well suited to advanced microelectronic technology (LSI). In essence, the cell is a double-pole, double-throw reversing switch controlled by a storage element (e.g., a flip-flop), with some means provided for setting the storage element to the desired state. Figure 4(a) shows a relay-contact version, analogous to circuits in the MOS technology, and Fig. 4(b) a NOR-gate realization of the cell in question. The two modes consist of a "crossing" [Fig. 4(c)] and a "bending" [Fig. 4(d)] of the pair of input leads to the pair of output leads.

Figure 4 - Basic cell
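In software terms the cell's behavior is trivial to state. The following sketch is illustrative only; which of the two modes passes the pair straight through is a convention assumed here, since either could be drawn that way.

    # Behavioral model of the basic cell: a double-pole, double-throw
    # reversing switch whose mode is held in one bit of storage. The
    # lead-numbering convention below is an assumption for illustration.

    class Cell:
        def __init__(self, bend=False):
            self.bend = bend          # False = "crossing" mode, True = "bending"

        def route(self, x1, x2):
            """Map the pair of input leads onto the pair of output leads."""
            return (x1, x2) if self.bend else (x2, x1)

    c = Cell()
    print(c.route("a", "b"))          # crossing mode: ('b', 'a')
    c.bend = True
    print(c.route("a", "b"))          # bending mode:  ('a', 'b')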
Figure 4(e) depicts a redundant flip-flop version of the cell, for which any single component failure will result in one of two possible failure conditions, namely (1) the cell can realize only one of its two possible modes, i.e., the bend or the cross, which we will call the "stuck-function" condition, or (2) one output lead contains a faulty signal, which we shall call the "bad-output" condition. Table I summarizes the failure conditions resulting from various component failures of the cell of Fig. 4(e).

TABLE I - Failure conditions for basic cell

Component Fault                                    Failure Condition
Faulty OR gate                                     Bad-Output
Faulty 2-input AND gate                            Bad-Output
Faulty 3-input AND gate                            Stuck-Function
Flip-flop stuck in a mode                          Stuck-Function
Same logic value on two outputs of a flip-flop     Bad-Output

(It is assumed here that if one of the flip-flops is stuck in a particular mode, the other flip-flop is permanently set to this mode, hence resulting in the stuck-function failure.)

Commutation networks for complete permutation - complete utilization

1. Nonredundant networks

The synthesis of economical CPCU(N) networks based upon two-state cells has been discussed elsewhere.13,14,15 The most efficient known procedure is based upon the construction indicated in Fig. 5(a), where the subnetworks P_A and P_B are themselves CPCU networks; for N even each subnetwork serves N/2 inputs, otherwise P_A serves (N+1)/2 inputs and P_B (N-1)/2 inputs. It is easily shown that the number N_1(N) of cells required is

N_1(N) = N \langle \log_2 N \rangle - 2^{\langle \log_2 N \rangle} + 1

where \langle x \rangle denotes the smallest integer \geq x. This is asymptotically close to the lower bound of \langle \log_2(N!) \rangle cells.

Figure 5 - Networks for complete permutation - complete utilization

It is of interest to investigate efficient techniques for setting up the network cells to realize the necessary modes for a particular permutation. Consider, for example, the network for N = 8, shown in Fig. 5(b), and assume that we require the setting of the cells as shown. Referring to Fig. 4 we see that each cell is in the "crossing" mode upon resetting the flip-flop. The cell is set to the "bending" mode by the coincidence of logic 0 values on the data inputs and a logic 1 value on the P input. Clearly then, by applying a 1 to the P input of all cells in a given level of the network - in Fig. 5(b) a set of values X_5 = X_8 = 0, P_1 = 1 will set the cell serving X_5 and X_8 - and appropriately setting the N input signals, the network can be set up in a time interval proportional to the number of levels in the network. There are 2\langle \log_2 N \rangle - 1 levels in the CPCU(N) networks presented above.
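The cell count of this recursive construction is easy to verify numerically. The sketch below is illustrative, not from the paper: it computes the count from the Fig. 5(a) recursion (two subnetworks on the halved input sets plus N - 1 peripheral cells, one peripheral cell being omitted) and checks it against the closed form.

    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def n1(n):
        """Cells in the CPCU(N) construction of Fig. 5(a).

        An N-input network uses subnetworks on ceil(N/2) and floor(N/2)
        inputs plus N - 1 peripheral cells."""
        if n <= 1:
            return 0
        if n == 2:
            return 1
        return n1((n + 1) // 2) + n1(n // 2) + (n - 1)

    def closed_form(n):
        k = math.ceil(math.log2(n))   # <log2 N>, smallest integer >= log2 N
        return n * k - 2 ** k + 1

    for n in range(2, 33):
        assert n1(n) == closed_form(n)
    print(n1(8), closed_form(8))      # 17 cells for N = 8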
2. Byte-sliced commutation networks

In this section we are concerned with the behavior of the CPCU networks under cell fault conditions and a simple technique of accommodating these faults. It is clear that for the case wherein the basic cells are double-pole, double-throw reversing switches, each "bad-output" cell failure results in an error only on a single output of the network, and similarly each "stuck-function" failure results in an error on a maximum of two outputs. (In the latter case it is sometimes possible to accommodate this failure type by appropriately setting the working cells of the network. This accommodation technique is discussed in detail in a later section.) Unfortunately, for the case of r-pole cells a single failure could result in the inability of the CPCU network to realize several of the WM-SP assignments. This state of affairs is significantly improved by byte-slicing the CPCU network as shown in Fig. 6. In this case, the bytes (where each byte is assumed to contain b bits) of the WMs are permuted in separate networks; that is, the first CPCU has as inputs the first byte of each WM, the second CPCU has as inputs the second byte of each WM, etc. The outputs of the first CPCU are ultimately directed to the first byte of each SP, etc. It is thus seen that for this byte-sliced realization, which of course requires the distribution of the cell memory flip-flops among each of the CPCUs, a cell failure disables a single byte. These commutation network byte failures can be accommodated in the identical manner proposed for other byte failures, i.e., by the use of the incomplete permutation - order preserving networks.

Figure 6 - Byte-sliced permutation network

3. CPCU networks insensitive to cell failures

a. The "stuck-function" fault

For the moment consider only the case in which the network is fault-free or has precisely one bad switch (cell). Figure 7(a) illustrates a straightforward solution to the single error correction problem. If P_1 and P_2 are both full permutation networks, then a fault occurring in one of them (such a fault being of the stuck-function type that does not disturb lead continuity) has no effect on the operations of the other network. Obviously this is a rather wasteful approach, since all of the remaining switches in the network containing the fault contribute nothing toward forming the desired permutation. Instead let P_2 of Fig. 7(a) represent a permutation (CPCU) network and P_1 be a network specifically designed to undo the damage caused by a fault in P_2. Then if a fault occurs in P_2 it can be "repaired" by P_1, while a fault in P_1 causes no trouble because P_2 is a full permuter. What is required of the network P_1? A single fault in P_2 causes a simple interchange of some particular pair of leads at some cell within the network. This can only show up as a spurious reversal of exactly two leads at the output. Of course, one might be lucky enough to have the switch fail in the correct position so that no trouble would occur. In any event it would be sufficient that the network P_1 be capable of effecting the interchange of an arbitrary pair of input leads without changing the relative assignments of the other input leads.

Figure 7 - (a) Two permutation networks in tandem; (b) a "double-tree" network; (c) stuck-function correcting networks
7(b) for the particular case of N = 8 in which we wish to interchange inputs X 2 and X5 • Here the switch settings to the right and left of the center-line are identical, and the center switch effects the desired interchange. Similar networks exist for all values of Study of Data Commutation Problems N and are obtained by "pruning" the corresponding tree networks for the next largest power of two greater than N. When one of these "double-tree" networks is placed in tandem with a full permutation network, we are then able to correct the effect of one switch failure wherever it may occur in the composite total network formed by P~ and P2. If P2 is one of the CPCU type commutation networks we have already considered, it will be found that the input peripheral switches of P 2 and the output peripheral switches of the 0-T network match up into tandem pairs of individual switches. Note for example the pairing of leads at the input of the network of Fig. 5(a) and the similar pairing of output leads in Fig. 7(b). Whenever such a pairing occurs, we may omit one of the two switches if we have provided for the possible failure of one or the other of them. We may therefore omit the entire column of peripheral output switches from any of the D-T networks whenever we adjoin them to one of the CPCU networks. The single error correcting capability is not affected by this "pruning" operation. Call the network that results from the removal of the output switches from the "double-tree" network the TDT(N) network (for truncated double-tree of N leads). Then single-error correction of any CPCU (N) network can be obtained by the tandem addition of one TDT(N) network. Furthermore, if one considers the possible effect at the output of the CPCU (N) of the multiple failure of switches, it turns out that the addition of p TDT(N) networks in tandem to any CPCU(N) network will suffice to correct up to p switch failures in the total network. The argument is much the same as the foregoing one, depending on the possibility of decomposing all such multiple failures into separate pairwise lead interchanges. To estimate the cost of error protection according to the foregoing scheme, we note that the TDT(N) networks contain approximately (3/2)N switches. To correct p errors then takes about 3/2 (pN) switches. Thus if N is very large, we can correct a "few" errors at a cost that is small compared with the total number of switches in the network (= Nlog2N). On the other hand, correction of mUltiple errors with TDT(N) networks does not furnish a recipe for creating arbitrarily reliable networks (in the Shannon sense) while still meeting the asymptotic cell count i.e., = Nlog2N. We have not yet discovered how to do this although it appears likely that it' is possible. If anything can be concluded from results obtained for small values of N, it seems that single fault correction should be obtainable at a cost of (lOg2N), extra switches in excess of the N 1(N). In particular we can exhibit specific networks that correct one fault ar.d have switch counts as indicated in Table II. 52 j TABLE II - Number of Cells in Single "Stuck-Function" Correcting CPCU Networks Leads Switches in Redundant Network 2 3 4 5 6 2 5 7 J 3 5 II 8 14 11 In each case the number of switches shown in Table II is exactly (lOg2N) larger than the correspond:ing value of N l(N). In the cases N = 2, 3 and 4 it is reasonably certain that these realizations are the minimal ones that exhibit the single fault correcting property. 
An example of a single fault correcting network for N = 4 is illustrated in Fig. 7(c).

As a result of extensive experimentation one feels impelled to make the following conjecture: the cost of protection against (correction of) p faults in a CPCU(N) network is no more than the difference in cost between a CPCU(N) network and a CPCU(N+p) network. That is, the cost of correcting each additional fault, say fault i, is smaller than \langle \log_2(N+i) \rangle. If this conjecture is indeed true, then there would exist permutation networks of arbitrarily high degree of reliability whose cell count would not exceed K \cdot N_1(N), where K is a constant related to the probability of a switch failure.

b. An alternative single stuck-function-correcting construction

A different and slightly more economical [than the TDT(N) device] method of providing single fault protection in CPCU(N) networks stems from the observation that the construction of Fig. 5(a) can tolerate one cell failure in any peripheral cell if the extra cell serving outputs Y_1 and Y_N is retained rather than deleted. The reason for deleting this cell in the first place was that exactly one peripheral cell is unnecessary in the irredundant case. Since all of the peripheral switches are exactly equivalent in function, it makes no difference which one we delete. Hence, by retaining all of them we can ignore exactly one failure in any of them. If each of the "internal" permutation networks that implement the construction of Fig. 5(a) is augmented in the same way, by replacing the deleted peripheral switch of the CPCU(N) configuration, then we obtain a network, call it S_1(N), that can tolerate one switch failure anywhere. An example of such a network, namely S_1(4), is shown in Fig. 8. We notice first that the number of cells is k = 8, so that the network is not minimal [see Fig. 7(c)]. However, on counting the redundant cells required when N = 2^r, we find that the extra cells are exactly N - 1 in number. This means that in the special case N = 2^r the networks S_1(N) are more economical than those produced by annexing a TDT(N) network to a CPCU(N) network. Recall that this required about 3N/2 extra switches.

to arrive at its specified output. By setting the switches of the ladder network in the obvious manner, the fault can be corrected, as illustrated in Fig. 9 for the case wherein input C does not arrive correctly at internal output 4.

and 2 \langle \log_2 \binom{r}{m} \rangle, although the upper value appears to be tighter. It is possible to realize the IPOP(r,m) function in a network composed of 2r two-state cells, where each cell contains m + 1 inputs. It is seen that this network approximately satisfies the lower bound.

Figure 15 - An incomplete permutation - order preserving network

A network composed of two-input cells which realizes the IPOP(r,m) function is displayed in Fig. 16 for the case r = 8, m = 4.

Figure 16 - Recursive approach to incomplete permutation - order preserving network

The number N_3(r,m) of cells required, for the case r = 2^k, m = 2^{k-1}, again using N_3(2,1) = 1, is shown to be

N_3(r, r/2) = r \log_2 r - \tfrac{3}{2} r + 2

The recursion technique will yield a similar expression for arbitrary parameters r, m.

To set up the network, let the m distinguished inputs be X_{i_1}, X_{i_2}, ..., X_{i_m}, where i_\alpha > i_\beta if \alpha > \beta, and the m outputs be Y_{j_1}, Y_{j_2}, ..., Y_{j_m}, where j_\alpha > j_\beta if \alpha > \beta. Consider each of the m inputs as residing in one of two disjoint groups: group A_I contains those inputs that do not "share" an input cell with another distinguished input, and group B_I contains those inputs which do share an input cell. We similarly define groups A_O and B_O for the distinguished outputs. The goal is to assign X_{i_1} and Y_{j_1} to the same subnetwork (S_1 or S_2), X_{i_2} and Y_{j_2} to the same subnetwork, etc., and the procedure is as follows. Assign X_{i_1} and Y_{j_1} to network S_1 by appropriately setting the pertinent input and output cells (except for the case where X_{i_1} = X_1 and/or Y_{j_1} = Y_1, in which case the assignment to S_1 is automatic). Then assign X_{i_2} and Y_{j_2} to network S_2; if X_{i_1} and X_{i_2} are in group B_I and/or Y_{j_1} and Y_{j_2} are in group B_O, the assignment to S_2 is automatic. Next assign X_{i_3} and Y_{j_3} to S_1, etc., until all of the m distinguished inputs and outputs have been set. This procedure is then applied to set the pertinent output and input cells of the networks S_1 and S_2, etc. It is clear that this assignment procedure can always be carried out. In Fig. 16 we show the setting of the input and output cells for the case X_{i_1} = X_2, X_{i_2} = X_4, X_{i_3} = X_5, X_{i_4} = X_7 and Y_{j_1} = Y_1, Y_{j_2} = Y_3, Y_{j_3} = Y_6, Y_{j_4} = Y_7.

The same network will realize the IPNOP function, although the setup is somewhat easier than for the IPOP case. The techniques for providing failure-tolerant IPOP networks are not discussed in this section, since they are quite similar to the techniques described previously. We note that a cell failure in the IPOP network (two-input cell type) can disable no more than two byte slices each for the input and output. Since it is assumed that redundant slices are provided, it is possible that a nonredundant network would be used, and when cell failures are detected the slices which could not be served by the network would be discarded.

Commutation networks for "shorting"

In Sec. VII we described networks which, for a redundant byte-sliced realization, can serve to route external data between the operating slices of distinct networks (e.g., between an SP and an ALU). It was noted that internal data (e.g., control and carry information) must be routed between the stages of the byte-sliced cascade. If a stage (or slice) has failed, then the internal data intended for that stage, which clearly comes from its immediate predecessor or successor, must be shorted around that failed stage.
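Functionally, the shorting chain simply routes an internal datum past whichever slices have failed. The following sketch is a behavioral illustration under assumed interfaces, not the cell-level network of Fig. 17; it shows the intended routing effect.

    # Behavioral sketch of the shorting chain: internal data flows left to
    # right through the byte slices, and the shorting cell following each
    # failed slice routes the signal around it.

    def route_through_slices(value, slices, working):
        """Propagate an internal datum through a cascade of byte slices.

        slices  : list of per-slice transfer functions (e.g., carry logic)
        working : list of booleans, False where a slice has failed and the
                  shorting cells bypass it."""
        for stage, ok in zip(slices, working):
            if ok:
                value = stage(value)   # pass through the working slice
            # else: the shorting cell carries the signal past the failed slice
        return value

    # Example: each slice bumps a carry count; slices 2, 5 and 6 have failed
    slices = [lambda v: v + 1] * 8
    working = [True, True, False, True, True, False, False, True]
    print(route_through_slices(0, slices, working))   # -> 5 (failed slices skipped)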
This shorting process must be accomplished reliably, or else the entire network would be disabled. The shorting function is quite naturally effected with the two-input basic cell, as illustrated in Fig. 17. For convenience we have only indicated a signal flow to the right, although it is clear that the network could be modified to handle bidirectional flow. We have shown the appropriate cell modes so that byte slice 2 and byte slices 5 and 6 are shorted out.

Figure 17 - "Shorting" networks

We note that the network could recover from a single component failure within a cell which results in either the stuck-function failure or the bad-output failure. However, a more severe cell failure which results in, for example, a permanent logical zero signal on both outputs of a cell would clearly disable the network, i.e., interrupt the signal flow. Such a failure, which could only result from two component failures within a cell, could be accommodated by the redundant shorting network of Fig. 18. We have indicated the appropriate modes for cells S_1, S_1', S_2, S_2' such that byte slice 1 is shorted out, i.e., the output from slice 0 is directed to the input terminal of slice 2. We have also shown the appropriate modes for cells S_3, S_3', S_4 such that the network continues to function although both outputs of S_3' are faulty. In this case byte slice 3 cannot be used, but the signal flow is not interrupted. Similarly we have shown how the network accommodates a double-output failure in S_5, in which case slice 5 is bypassed. This technique can clearly be extended to handle failures of greater multiplicity.

Figure 18 - Redundant "shorting" network

SUMMARY

In this paper we have studied in detail the logical design of networks which are well suited for realizing the various data switching or commutation functions required in a multiprocessor organization where the various modules are repairable. It is assumed that the memory, arithmetic logic, and possibly the simple processor and control modules are realized in a byte-sliced manner - a realization that has been demonstrated to be practical. These commutation networks might also be useful for certain logical functions within the various modules, for example, in the distribution of the outputs of a decoder among the control inputs of a set of registers. We feel that the designs we have presented, based upon the primitive two-input, two-output reversing cell, represent adequate engineering solutions to all of the commutation problems posed, although some theoretical minimization problems still remain. These problems relate to minimum cost designs for the complete and incomplete permutation functions, considering both the nonredundant realizations and the realizations which are tolerant to cell failures. In particular, our designs for the incomplete permutation functions require a number of cells significantly in excess of the lower bound. The delay in signal transmission encountered for the networks studied is significantly greater than the delay expected for a simple crossbar realization (approximately 2 log_2 N units compared with 1 unit); however, the fan-out and fan-in for the cell array is substantially less than in a crossbar, so that the overall delays may be comparable for certain dimensions and circuit parameters.
If less delay is desired, the networks could be synthesized from, for example, 4-input complete-permutation cells, which would result in one-half the delay at the expense of somewhat greater total gate cost; however, the failure of such a cell might disable more network outputs than encountered for the two-input cell realizations.

ACKNOWLEDGMENT

The research reported in this paper was supported by NASA Electronics Research Center, Cambridge, Massachusetts, under Contract NAS 12-33.

REFERENCES

1 J GOLDBERG K N LEVITT R A SHORT Techniques for the realization of ultra-reliable spaceborne computers Final Report - Phase I Contract NAS 12-33 Stanford Research Institute Menlo Park California September 1966
2 AES-EPO Staff AES-EPO study program Final Study Report volumes 1 and 2 IBM Electronics System Center Owego New York December 1965
3 A AVIZIENIS A set of algorithms for a diagnosable arithmetic unit Tech Report no 32-546 Jet Propulsion Laboratory Pasadena California 1964
4 A AVIZIENIS A design of fault-tolerant computers Proc Fall Joint Computer Conference (AFIPS) 1967
5 W G BOURICIUS et al Investigations in the design of an automatically repaired computer Digest of the First Annual IEEE Computer Conference IEEE Publication 16C51 September 1967
6 P W AGNEW et al An approach to self-repairing computers Digest of the First Annual IEEE Computer Conference IEEE Publication 16C51 September 1967
7 R P HASSETT E H MILLER Multithreading design of a reliable aerospace computer Presented at the 1966 Aerospace and Electronic Systems Convention 3-5 October 1966
8 L J KOCZELA Study of spaceborne multiprocessing 2nd Quarterly Report Volume II Contract NAS 12-108 Autonetics Division of North American Aviation Anaheim California October 1966
9 E C JOSEPH Self repair: fault detection and automatic reconfigurability Proceedings of the Spaceborne Multiprocessing Seminar NASA Electronics Research Center Boston pp 41-49 31 October 1966
10 R L ALONSO et al A multiprocessing structure Digest of the First Annual IEEE Computer Conference IEEE Publication 16C51 September 1967
11 J F KEELEY et al An application-oriented multiprocessing system IBM Systems Journal vol 6 no 2 (entire issue) 1967
12 J GOLDBERG M W GREEN K N LEVITT H S STONE A study of techniques and devices for the realization of ultra-reliable spaceborne computers Interim Scientific Report no 2 Contract NAS 12-33 Stanford Research Institute Menlo Park California November 1967
13 V E BENES Mathematical theory of connecting networks and telephone traffic Academic Press New York 1965
14 W H KAUTZ K N LEVITT A WAKSMAN Cellular interconnection networks Accepted for publication in IEEE Transactions on Electronic Computers
15 A WAKSMAN A permutation network Accepted for publication in the Journal of the ACM
16 J GOLDBERG Logical design techniques for error control WESCON paper 9/3 Session 9 September 1966

A distinguishability criterion for selecting efficient diagnostic tests

by HERBERT Y. CHANG
Bell Telephone Laboratories, Incorporated
Naperville, Illinois

INTRODUCTION

Fault diagnostic procedures are usually derived by means of simulation methods.1 One of the fault simulation methods is the digital technique, in which the diagnostic information, viz., the diagnostic tests and test results of a machine, is generated with the aid of a computer program.
The input to the program is a logical description of the machine and the output is a sequential testing procedure for the machine, along with the simulated diagnostic test results.2 A sequential testing procedure is one where the next test to be applied depends on the outcome of the previous test. The efficiency of a sequential testing procedure generated digitally depends largely on the method by which tests are chosen and ordered in the procedure. Some tests, when properly chosen and ordered, will yield a shorter testing procedure and better fault resolvability than others, and therefore give rise to an "optimum" testing procedure. Previous results indicate that it is impractical to attempt to find a globally optimum testing procedure for any moderate-size circuit;3,4 local optimization techniques are therefore used. At each point in the test generation process, several candidate tests are tried and evaluated, and the "best" one is chosen. Two criteria for evaluating the "goodness" of a test are available: the check-out or detection criterion and the information gain criterion.2,3,5,6 Both criteria, however, share the drawback that tests are evaluated based on their ability to detect or identify faulty components, rather than the smallest replaceable module(s) or circuit package(s). In this paper a new criterion, called the distinguishability criterion, for computing the figure-of-merit of tests to derive efficient testing procedures is introduced. The criterion is aimed at optimizing the diagnosability so as to identify failures only to the circuit package (or the smallest replaceable module) level. It appears that the distinguishability criterion is more practical, since the impact of integrated circuits and modern packaging methods makes distinguishability among faults associated with the same module less necessary. The testing procedure so generated tends to yield shorter test sequences and better resolvability. In Section 2 the sequential testing philosophy is reviewed and two of the existing criteria for selecting tests are briefly described. The distinguishability criterion is introduced in Section 3. Familiarity with References 2, 3, 5, and 6 is recommended.

A review of testing philosophy

The philosophy of sequential testing procedure has commonly been described as the "gedanken-experiment" or "black-box" philosophy.2,7 A machine or processor is considered to be a black box with input and output terminals. A failure or fault is looked upon as a transformation of the fault-free or "good" machine into a different machine. For example, machine Mi may denote the output of gate Q stuck at low; machine Mj may represent the second input terminal of gate R stuck at high; and so on. Thus, if there are N possible failures in a machine,* the objective of a diagnostic procedure would be to identify one of the N + 1 (including the good machine) possible machines or black boxes. The procedure for accomplishing this is essentially a multiple branching experiment in the sense of Moore.7

*The single fault assumption is implied here.

In deriving a sequential testing procedure, a "test" or an input configuration is first applied and the "test result" or output configuration of each of the N + 1 machines is computed by simulation. A test is useful if it partitions the collection of N + 1 machines into several equivalence classes, each of which contains only those machines having the same output configuration. Another test can now be applied to one of the equivalence classes. Again, by simulating the behavior of each of the machines in this class, the second test may "partition" the set into smaller equivalence classes. This process is repeated for every equivalence class of machines until each failure or machine is identified, or the remaining subset of machines appears to be indistinguishable.
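The partitioning step can be sketched directly (our illustration in Python, not from the paper; the simulate argument stands in for the fault-simulation program and is hypothetical):

    from collections import defaultdict

    def partition(machines, test, simulate):
        """Group machines into equivalence classes by simulated output.

        machines -- identifiers M1 ... M(N+1), with M1 the good machine
        test     -- an input configuration, e.g. "1011"
        simulate -- simulate(machine, test) -> output configuration string
        """
        classes = defaultdict(list)
        for m in machines:
            classes[simulate(m, test)].append(m)
        return list(classes.values())

    # Toy usage: three machines, two of which respond identically to the test.
    outputs = {"M1": "11", "M2": "11", "M3": "01"}
    print(partition(["M1", "M2", "M3"], "1011", lambda m, t: outputs[m]))
    # -> [['M1', 'M2'], ['M3']]

A sequential procedure simply reapplies this step, with new tests, to every class that still contains more than one machine.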
An example of such a procedure is shown in Figure 1.

Figure 1 - An example of sequential testing procedure

Machines, or failures, are denoted as M1, M2, ..., M20, with M1 representing the good machine. An input configuration or test 1011 is applied which partitions the set into three equivalence classes. Thus, if the output configuration is 11, one can conclude that either the machine is fault-free, or it contains one of the following failures: M2, M3, M4, M5, M7, M8, M9, M10, M11, M12, M15, M16, M17, M18, M20. Similarly, if the output is 01, the machine contains one of the following failures: M13, M14, M19. It is seen that if the output is 00, failure M6 is uniquely identified; no more testing is necessary. The entry (i, j) denotes the equivalence class at the ith level and jth partition in the testing procedure. By convention, the top branch (1, 1), (2, 1), (3, 1), ... will always denote the equivalence classes containing the good machine M1. A second test 1001 may be applied to the set (1, 1), further partitioning (1, 1) into (2, 1), (2, 2) and (2, 3). A different test 1010 may be applied to the set (1, 3) in order to partition the machines {M13, M14, M19}. A third and a fourth, ..., test may be chosen along each sequence to continue the testing as far as necessary. Each failure, or equivalence class of failures, such as {M14, M19} in the terminal equivalence class, would be identified by a test sequence, such as 1011, 01, 1010, 10.

It is evident that there are many input configurations or tests one may choose to partition the set(s) (i, j). The criterion for choosing a test depends on the purpose at hand. If the purpose is check-out or fault detection, the check-out criterion may be used. This criterion says that for any test Tk at the ith level in the testing procedure, the efficiency of Tk can be measured by

\alpha_k = \frac{n(i,1) - n_k(i+1,1)}{n(i,1)} \qquad (1)

where n(i, j) denotes the number of machines in equivalence class (i, j), and nk(i+1, j) denotes the number of machines in equivalence class (i+1, j) resulting from the partitioning by applying Tk. In other words, the test which yields the maximum alpha_k, and is therefore considered to be the best choice, is the one that eliminates the largest number of machines from the good-machine equivalence class (i, 1).2,5

If, on the other hand, the purpose is diagnosis, the information gain criterion may be used. For a test that generates an m-ary partition, its information gain is computed as follows. Let (i+1, j1), (i+1, j2), ..., (i+1, jm) be the m equivalence classes which result by applying test Tk to (i, j) (see Figure 2). Then the information gain for test Tk is

\beta_k = -\sum_{s=1}^{m} \frac{n_k(i+1, j_s)}{n(i,j)} \log_2 \frac{n_k(i+1, j_s)}{n(i,j)} \qquad (2)

where all faults in (i, j) are assumed equiprobable. The gain beta_k defined in equation (2) is due to Mandelbaum6 and is different from the one used by Seshu.5 The information gain defined by Seshu is suited for binary partition. The definition is then extended to the m-ary partition case by saying that an m-ary partition can be replaced by a string of m - 1 binary partitions. This is done by first computing the gain over the good-machine equivalence class (i+1, 1) with respect to each of the m - 1 faulty-machine equivalence classes and then taking the sum of the gains. The measure so computed tends to yield a good detection testing procedure.
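Both figures of merit lend themselves to a compact sketch (ours; the entropy form of beta_k follows the reconstruction of equation (2) above, with equiprobable faults):

    import math

    def alpha(n_good_parent, n_good_child):
        """Check-out criterion, equation (1): fraction of machines removed
        from the good-machine class (i, 1) by the test."""
        return (n_good_parent - n_good_child) / n_good_parent

    def beta(class_sizes):
        """Information gain, equation (2): entropy of the partition when
        all faults in (i, j) are equiprobable."""
        n = sum(class_sizes)
        return -sum((c / n) * math.log2(c / n) for c in class_sizes)

    # A test that leaves 10 of 16 machines with the good machine:
    print(alpha(16, 10))        # 0.375
    # Information gain of the corresponding 3-way partition:
    print(beta([10, 5, 1]))     # about 1.20 bits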
Distinguishability criterion for test selection

The essential idea of the criterion to be described here lies in the notion of "fault distinguishability." The criterion is somewhat analogous to the one proposed by the author for combinational testing procedures.8 Suppose there is a collection of machines M1, M2, ..., MN+1 to be identified. One would like to apply a test to distinguish, if possible, each faulty machine from the good machine M1, and from all other faulty machines. In other words, there are N(N+1)/2 pairs of machines to be distinguished. A test is most useful, and therefore should be chosen, if it distinguishes the largest number of pairs of machines.

In general, a test Tk partitions the set of machines in (i, j) into m equivalence classes (i+1, j1), (i+1, j2), ..., (i+1, jm) (see Figure 2).

Figure 2 - Partitioning of machines. Notation: (i, j) = equivalence class at ith level and jth partition; n(i, j) = number of machines in (i, j); nk(i+1, js) = number of machines in (i+1, js) resulting from applying Tk to (i, j); Tk = test or input configuration; z1, z2, ..., zm = test results or output configurations.

Each equivalence class (i+1, js) (s = 1, 2, ..., m) contains nk(i+1, js) machines that are indistinguishable by test Tk, where the sum of nk(i+1, js) over s = 1, ..., m is equal to n(i, j). Then the test Tk actually distinguishes every machine in (i+1, js) from every machine in (i+1, jt), for s != t. The total number of pairs of machines distinguished by Tk is therefore equal to:

\gamma_k = \sum_{s=1}^{m-1} n_k(i+1, j_s) \sum_{t=s+1}^{m} n_k(i+1, j_t) \qquad (3)

The value of gamma_k reaches a maximum of n(i, j)[n(i, j) - 1]/2 when m = n(i, j), meaning that the test Tk identifies uniquely each of the n(i, j) faults. It reaches a minimum of zero when m = 1, meaning that the test Tk cannot partition the set into smaller equivalence classes. This measure is compatible with the information gain criterion concept which has been discussed previously.5,6

As an example, consider the case where three different output configurations and the corresponding tests T1, T2 and T3 yield the following partitions of six faults to be identified: T1 partitions the faults into six classes of one fault each; T2 into classes of 2, 1 and 3 faults; and T3 into classes of 2, 2, 1 and 1 faults. From Equation (3), the distinguishability gamma_k for each test is:

gamma_1 = 1(1+1+1+1+1) + 1(1+1+1+1) + 1(1+1+1) + 1(1+1) + 1(1) = 15
gamma_2 = 2(1+3) + 1(3) = 11
gamma_3 = 2(2+1+1) + 2(1+1) + 1(1) = 13

The result indicates that T1 is the best since it identifies each machine uniquely. T3 is better than T2 in that it isolates faults more finely.
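Equation (3) depends only on the class sizes, so it can be evaluated in a few lines (our sketch; it reproduces the three values just worked out):

    def gamma(class_sizes):
        """Distinguishability, equation (3): the number of machine pairs
        that fall into different equivalence classes of the partition."""
        total = 0
        for s in range(len(class_sizes) - 1):
            total += class_sizes[s] * sum(class_sizes[s + 1:])
        return total

    print(gamma([1, 1, 1, 1, 1, 1]))   # T1 -> 15
    print(gamma([2, 1, 3]))            # T2 -> 11
    print(gamma([2, 2, 1, 1]))         # T3 -> 13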
The distinguishability criterion defined in equation (3) must be modified if one is interested in isolating faults only to the module level, rather than the faulty component level. For most practical purposes, whenever a fault is identified, it is the module(s), i.e., the smallest replaceable part(s) of a machine such as a plug-in circuit package or an integrated-circuit card, that will be replaced. Thus, distinguishability among faults associated with the same module would be of no interest. For example, suppose the six faults just considered are associated with three modules: faults f1, f2, f3 associated with one module P1; f4 and f6 associated with another module P2; and f5 with module P3. Test T2 partitions these faults into three equivalence classes {f1^1, f2^1}, {f4^2} and {f3^1, f6^2, f5^3}, where fi^j denotes a fault fi associated with module Pj. The distinguishability between faults f1 of (i+1, j1) and f3 of (i+1, j3), or f2 of (i+1, j1) and f3 of (i+1, j3), is no longer relevant, since whenever f1, f2 or f3 occurs, module P1 must be replaced. So the emphasis of the distinguishability criterion should be such that a test is considered most efficient if it distinguishes the largest number of pairs of faults {fi, fj} where fi and fj are each associated with a distinct module.

The measure of test efficiency can be redefined by removing from gamma_k the number of all those fault pairs that are distinguished by Tk but are associated with the same module. Let n(i, j)p denote the number of faults in (i, j) that are associated with module p. Then

n(i, j) = \sum_{p} n(i, j)_p

where the sum over p is defined to denote summation over, and only over, all modules with which the faults in (i, j) are associated. In the previous example, n(i+1, j3) can be expressed as the sum of n(i+1, j3)1, n(i+1, j3)2 and n(i+1, j3)3, which denote, respectively, the number of faults in (i+1, j3) associated with modules P1, P2 and P3. Thus, of the m partitions generated by test Tk, there are

\sum_{s=1}^{m-1} \sum_{p} n_k(i+1, j_s)_p \sum_{t=s+1}^{m} n_k(i+1, j_t)_p

fault pairs which need not be distinguished. The distinguishability measure gamma_k defined in Equation (3) should therefore be modified as follows:

\Lambda_k \equiv \gamma_k - \sum_{s=1}^{m-1} \sum_{p} n_k(i+1, j_s)_p \sum_{t=s+1}^{m} n_k(i+1, j_t)_p = \sum_{s=1}^{m-1} \sum_{p} n_k(i+1, j_s)_p \sum_{t=s+1}^{m} \left[ n_k(i+1, j_t) - n_k(i+1, j_t)_p \right] \qquad (4)

The application of this distinguishability criterion to derive an efficient testing procedure may be illustrated by the following example.
Example

Suppose in equivalence class (i, j) there are ten faults to be identified. The faulty components are associated with modules P1, P2 and P3 in the following way:

P1: f1, f2, f3, f4, f5, f6
P2: f7, f8
P3: f9, f10

Six tests Tk (k = 1, 2, 3, 4, 5, 6) are available, each generating a partition of equivalence classes as indicated in Table I.

TABLE I - Configurations of partitionings of (i, j)

To derive a testing procedure that isolates faults to component level, the first step is to compute the figure of merit of each test using either Equation (2) (the information gain criterion) or Equation (3) (the distinguishability criterion at component level). Both criteria indicate test T1 is the best choice to partition (i, j); T1 is thus selected as the first test. The faults in (i, j) are then partitioned into three equivalence classes. Similar computations can be made for selecting the next best test(s) to further partition each equivalence class. The resultant testing procedure, as shown in Figure 3, indicates that the average length of test sequence to isolate a fault is 2.7 tests.

Figure 3 - A testing procedure isolating faults to component level

However, if the distinguishability criterion Lambda_k is used to derive a testing procedure that isolates faults to replaceable-module level, one observes that test T2 would be the best choice to partition (i, j) (see Table I). In this case test T1 is less efficient, since it distinguishes fewer fault pairs that are associated with different modules. Test T2 partitions (i, j) into three equivalence classes: {f1^1, f2^1, f3^1, f4^1, f5^1, f6^1, f7^2}, {f8^2, f9^3} and {f10^3}. A sample computation using Equation (4) to compute Lambda_k is given:

\Lambda_2 = n_2(i+1, j_1)_1 \left[ n_2(i+1, j_2) + n_2(i+1, j_3) \right] + n_2(i+1, j_1)_2 \left[ \left( n_2(i+1, j_2) - n_2(i+1, j_2)_2 \right) + n_2(i+1, j_3) \right] + n_2(i+1, j_2)_2 \left[ n_2(i+1, j_3) \right] = (6)[(2) + (1)] + (1)[(2 - 1) + (1)] + (1)(1) = 21

Each equivalence class can be further partitioned by test(s) that yield the maximum figure of merit Lambda_k, computed by considering only those faults in each equivalence class. As an illustration, the Lambda_k's of the various tests partitioning the set {f1^1, f2^1, f3^1, f4^1, f5^1, f6^1, f7^2} are computed and tabulated in Table II. In this case test T6 is considered the best choice.

TABLE II - Configurations of partitioning of the equivalence class {f1^1, ..., f6^1, f7^2}

The process of test selection is continued until all faults are identified to a single module. The resultant testing procedure is shown in Figure 4. The average length of test sequence to isolate a fault is 1.9 tests, which is much shorter than the 2.7 tests required to isolate faults to component level.

Figure 4 - A testing procedure isolating faults to module level
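Equation (4) can be checked mechanically. The sketch below (ours) counts, over all pairs of classes, the fault pairs lying on distinct modules; the module assignment is the one recovered from the sample computation above:

    def lam(classes, module_of):
        """Modified distinguishability, equation (4): pairs of faults that
        are in different equivalence classes AND on different modules."""
        total = 0
        for s in range(len(classes) - 1):
            for t in range(s + 1, len(classes)):
                for fi in classes[s]:
                    for fj in classes[t]:
                        if module_of[fi] != module_of[fj]:
                            total += 1
        return total

    # Module assignment of the ten-fault example.
    module_of = {f: "P1" for f in ("f1", "f2", "f3", "f4", "f5", "f6")}
    module_of.update({"f7": "P2", "f8": "P2", "f9": "P3", "f10": "P3"})

    # The partition produced by test T2.
    t2 = [["f1", "f2", "f3", "f4", "f5", "f6", "f7"], ["f8", "f9"], ["f10"]]
    print(lam(t2, module_of))   # -> 21, agreeing with the sample computation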
Another point worth mentioning is that the program memory required to store the diagnostic testing procedure is often significantly reduced when diagnosability is aimed at the module level. As an example, consider the two diagnostic testing procedures of Figures 3 and 4. It can be shown that the size of the corresponding program statements required to realize the procedure of Figure 4, where faults are isolated to module level, is approximately half the size of the one required to realize the procedure of Figure 3, where faults are isolated to component level. For many real-time systems such as electronic telephone switching systems and airborne computers, the resulting saving in memory could represent a significant reduction in cost.

CONCLUSION

A distinguishability criterion for measuring the efficiency of diagnostic tests has been developed. The criterion is compatible with the concept of information gain described in the literature,3,5,6 if diagnosability is aimed at the component level. If, however, the diagnosability is aimed at the circuit package or module level, it is shown that the sequential testing procedure which is generated based on the distinguishability criterion tends to yield shorter testing sequences and better resolvability.

REFERENCES

1 E G MANNING H Y CHANG A comparison of methods for simulating faults of digital systems Digest of the First Annual IEEE Computer Conference 1967
2 S SESHU D N FREEMAN The diagnosis of asynchronous sequential switching systems IRE Transactions on Electronic Computers Vol EC-11 August 1962
3 J D BRULE R A JOHNSON E KLETSKY Diagnosis of equipment failures IRE Transactions on Reliability and Quality Control Vol RQC-9 April 1960
4 J F POAGE The derivation of optimum tests for logic circuits PhD Thesis Princeton University 1963
5 S SESHU On an improved diagnosis program IEEE Transactions on Electronic Computers Vol EC-14 February 1965
6 D MANDELBAUM A measure of efficiency of diagnostic tests upon sequential logic IEEE Transactions on Electronic Computers Vol EC-13 October 1964
7 E F MOORE Gedanken-experiments on sequential machines Automata Studies Princeton University Press 1956
8 H Y CHANG An algorithm for selecting an optimum set of diagnostic tests IEEE Transactions on Electronic Computers Vol EC-14 October 1965

1968 SPRING JOINT COMPUTER CONFERENCE COMMITTEE

General Chairman: A. S. Hoagland, IBM Corporation
Vice Chairman: S. M. Matsa, IBM Corporation
Secretary: W. R. Lonergan, RCA - EDP

Local Arrangements
H. L. Cooke, Chairman, RCA Laboratories
Bernard McGovern, Assistant, RCA - EDP
E. E. Andrews, RCA - EDP

Technical Program
T. R. Bashkow, Chairman, Columbia University
Richard Auerbach, Computer Usage Development Corporation
Glenn Bacon, IBM Corporation

Public Relations
J. M. Kinn, Chairman, IEEE
R. T. Miller, IBM Corporation

Special Events
Jess Chernak, Bell Telephone Laboratories
Borge Christensen, General Electric Company
Chester Y. Lee, Bell Telephone Laboratories
A. A. Currie, Chairman, Bell Telephone Laboratories
R. L. Basford, Bell Telephone Laboratories

Exhibits
Morton H. Lewin, RCA Laboratories
W. M. Carlson, Chairman, IBM Corporation
Sheldon Weinberg, Realtronics, Inc.
R. F. Welch, UNIVAC

Finance
C. A. Erdahl, Treasurer, Price Waterhouse & Company
Burt Totaro, UNIVAC

Registration
M. B. Basson, Price Waterhouse & Company
G. W. Jacob, Chairman, Sperry Gyroscope Co.
R. W. Liptak, Price Waterhouse & Company
Solomon Scherr, Hazeltine Corporation
Charles Griebell, Sperry Rand Corporation
Patrick J. Ferrara, Sperry Rand Corporation

Ladies
Cecilie Smolen, Chairman, First National City Bank
Frances Zederbaum, IBM Corporation
Gretchen Remick, Computer Usage Development Corporation
Ellen Schaefer, IBM Corporation

Printing and Mailing
R. W. Thayer, Chairman, Princeton Printing Company

Advisor
Harlan Anderson, Time, Inc.

AFIPS Society Representatives
IEEE: S. Levine, The Bunker-Ramo Corporation
ACM: J. M. Spring, Computer Methods, Inc.
SCI: R. J. Doelger, Electronic Associates, Inc.
ASIS: Irlene Stephens, CCNY
AMTCL: W. J. Plath, IBM Corporation

REVIEWERS, PANELISTS, AND SESSION CHAIRMEN

REVIEWERS
A. Adler, Wilhelm Anacker, James P. Anderson, Gary Bard, F. Bates, Frank Bevacqua, Ruth Block, Shelton Boilen, W. G. Bouricius, Robert C. Calfee, Harry N. Cantrell, Richard Caplan, S. H. Chasen, A. Ben Clymer, Steve Condon, Warren A. Cornell, Richard L. Crandall, James W. Daniel, H. A. Ernst, Monroe Fein, W. A. Fetter, Tudor R. Finch, R. Forbes, R. Stockton Gaines, Alonzo G. Grace, Jr., Charles Gulotta, M. J. Haims, C. Halstead, P. J. Hanratty, Tom Hastings, R. A. Henle, B. Herzog, D. Hodges, Thomas J. Hogg, Mu-Yue Hsiao, H. Johnson, B. J. Karafin, H. B. Keller, Robert King, Eldo C. Koenig,
Z. Kohavi, Mark Koschmann, J. Kurtzberg, Chester Lee, L. Lidofsky, Y. S. Lim, Arthur W. Lo, R. Mandell, Michael Marcotty, T. J. Matcovich, H. E. Meadows, M. J. Merritt, Gordon S. Mitchell, Thurber Moffett, Harrison Morse, Mervin E. Muller, Robert M. McClure, H. S. McDonald, I. D. Nehama, Robert L. Patrick, A. V. Pohm, W. J. Poppelbaum, Paul Reinhard, John R. Rice, V. C. Rideout, Lawrence G. Roberts, Robert F. Rosin, R. Roth, Wendell Sander, F. J. Samsom, C. L. Semmelman, S. Shapiro, R. L. Shuey, Warner V. Slack, Otto J. M. Smith, Robert Spinrad, A. Soudak, Thomas B. Steel, Jr., William Sutherland, A. J. Sutton, R. N. Thompson, William Timlake, C. J. Tunis, H. VanBrink, E. VanHorn, J. V. Wait, C. J. Walter, R. H. Wanders, Robert Ward, Roger C. Wood, J. J. Yostpille, Edward Yourdon, M. S. Zucker

PANELISTS
J. A. Archibald, Jr., Bruce W. Arden, Julius Aronofsky, J. D. Babcock, C. W. Churchman, M. E. Connelly, Philip A. Cramer, David C. Evans, M. M. Flood, R. M. Franklin, Ralph Gerard, Harry A. Gray, H. R. J. Grosch, Robert V. Head, William B. Helgeson, Linder C. Hobbs, Richard C. Jones, H. A. Kinslow, J. F. Lubin, Tom Marques, Andrew R. Molnar, Henry S. McDonald, Henry McDonald, J. D. McGonagle, T. William Olle, Ascher Opler, Budd J. Pine, R. C. Raymond, Walter Rosenblith, Thomas Rowen, Daniel W. Scott, J. E. Sherman, Warren G. Simmons, Robert Spinrad, Ithiel de Sola Pool, William R. Sutherland, C. J. Tunis, Robert Ward, Jerome B. Wiener, M. A. Woodbury, Kendall R. Wright

SESSION CHAIRMEN
C. Bachman, W. Beam, C. R. Deininger, P. Dorn, G. Feeney, R. Forest, J. Githens, L. Hittel, E. L. Jacks, W. Keister, G. A. Korn, C. Lecht, T. McFee, B. Neff, A. Opler, V. C. Rideout, H. Sassenfeld, C. Simone, H. Teager, J. Wiener, R. O. Winder

AMERICAN FEDERATION OF INFORMATION PROCESSING SOCIETIES (AFIPS)

OFFICERS and BOARD of DIRECTORS of AFIPS

President: Dr. BRUCE GILCHRIST, IBM Corporation, Data Processing Division, 112 East Post Road, White Plains, New York 10601
Vice President: Mr. PAUL ARMER*, The RAND Corporation, 1700 Main Street, Santa Monica, California 90406
Secretary: Mr. MAUGHAN S. MASON, IBM Hybrid Systems Center, 2670 Hanover Street, Palo Alto, California 94304
Treasurer: Mr. WILLIAM D. ROWE*, The Mitre Corporation, 5600 Columbia Pike, Bailey's Crossroads, Virginia 22041

ACM Directors:
Dr. ANTHONY G. OETTINGER, Computer Laboratory, Harvard University, Cambridge, Massachusetts 02138
Mr. J. D. MADDEN, ACM Headquarters, 211 East 43rd Street, New York, New York 10017
Dr. ROBERT W. RECTOR*, Informatics, Inc., 5430 Van Nuys Boulevard, Sherman Oaks, California 91401
Dr. WALTER HOFFMAN, Computing Center, Wayne State University, Detroit, Michigan 48202

IEEE Directors:
Mr. SAMUEL LEVINE, Bunker-Ramo Corporation, 445 Fairfield Avenue, Stamford, Connecticut 06902
Mr. L. C. HOBBS, Hobbs Associates, Inc., P.O. Box 686, Corona Del Mar, California 92625
Mr. KEITH W. UNCAPHER, The RAND Corporation, 1700 Main Street, Santa Monica, California 90406
Dr. R. I. TANAKA*, California Computer Products, Inc., 305 N. Muller Street, Anaheim, California 92803

Simulation Councils Director: Mr. JOHN E. SHERMAN*, Lockheed Missiles & Space Corp., D59-10, B-151, P.O. Box 504, Sunnyvale, California 94088
American Society for Information Science Director: Mr. HAROLD BORKO, Systems Development Corp., 2500 Colorado Avenue, Santa Monica, California 90406
Special Libraries Association - Observer: Mr. BURTON E. LAMKIN, Library & Information Retrieval Staff, Federal Aviation Agency, 800 Independence Avenue S.E., Washington, D.C. 20003

*Executive Committee
Association for Machine Translation and Computational Linguistics - Observer: Dr. DONALD WALKER, The Mitre Corporation, Bedford, Massachusetts 01730
Society for Information Display - Observer: Mr. WILLIAM BETHKE, 1806 N. James Street, Rome, New York 13440

AFIPS Committee Chairmen

Abstracting: Dr. DAVID G. HAYES, The RAND Corporation, 1700 Main Street, Santa Monica, California 90406
Government Advisory: Dr. HARRY HUSKEY, University of California, Division of Natural Sciences, Santa Cruz, California 95060
Admissions: Mr. WALTER L. ANDERSON, General Kinetics, Inc., 11425 Isaac Newton Square South, Reston, Virginia 22070
Harry Goode Memorial Award: Mr. ASCHER OPLER, T. J. Watson Research Center, P.O. Box 216, Yorktown Heights, New York 10598
Awards: Dr. ARNOLD A. COHEN, UNIVAC, 2276 Highcrest Drive, Roseville, Minnesota 55113
IFIP Congress 68: Dr. DONALD L. THOMSEN, JR., IBM Corporation, Old Orchard Road, Armonk, New York 10504
Conference: Dr. MORTON M. ASTRAHAN, IBM Corporation - ASDD, P.O. Box 66, Los Gatos, California 95030
International Relations: Dr. EDWIN L. HARDER, Westinghouse Electric Corp., 1204 Milton Avenue, Pittsburgh, Pennsylvania 15218
Constitution & By-Laws: Mr. MAUGHAN S. MASON, IBM Hybrid Systems Center, 2670 Hanover Street, Palo Alto, California 94304
Planning: Dr. JACK MOSHMAN, Leasco Systems & Research Corp., 4833 Rugby Avenue, Bethesda, Maryland 20014
Education: Dr. MELVIN A. SHADER, IBM Corporation - SDD, 1000 Westchester Avenue, White Plains, New York 10604
Public Relations: Mr. ISAAC SELIGSOHN, IBM Corporation, Old Orchard Road, Armonk, New York 10504
Finance: Mr. WALTER M. CARLSON, IBM Corporation, Old Orchard Road, Armonk, New York 10504
Publications: Mr. STANLEY ROGERS, P.O. Box R, Del Mar, California 92014
Social Implications of Information Processing Technology: Mr. STANLEY ROTHMAN, TRW Systems, 1 Space Park, Redondo Beach, California 90278
Information Dissemination: Mr. GERHARD L. HOLLANDER, Hollander Associates, P.O. Box 2276, Fullerton, California 92633
Technical Program: Mr. JACK ROSEMAN, Helidyne Corporation, 1401 Wilson Boulevard, Arlington, Virginia 22209
Consultant: Mr. HARLAN E. ANDERSON, Time, Inc., Time & Life Building, New York, New York 10020
Newsletter: Mr. DONALD B. HOUGHTON, 15-W, Westinghouse Electric Corporation, 3 Gateway Center, Box 2278, Pittsburgh, Pennsylvania 15230
U.S. Committee for IFIP ADP Group: Mr. ROBERT C. CHEEK, Westinghouse Electric Corporation, 3 Gateway Center, Pittsburgh, Pennsylvania 15230

JCC General Chairmen
1968 FJCC: Dr. WILLIAM H. DAVIDOW, Dymec Division, Hewlett-Packard Company, 395 Page Mill Road, Palo Alto, California 94306
1968 SJCC: Dr. A. S. HOAGLAND, IBM Research Center, P.O. Box 218, Yorktown Heights, New York 10598
1969 SJCC: Dr. HARRISON FULLER, Sanders Associates, Inc., 95 Canal Street, Nashua, New Hampshire 03060

AFIPS Executive Secretary: Mr. H. G. ASMUS, AFIPS Headquarters, 9th Floor, 345 East 47th Street, New York, New York 10017

1968 SJCC LIST OF EXHIBITORS

Academic Press, Inc.
Adage, Inc.
Addison-Wesley Publishing Company, Inc.
Addressograph Multigraph Corporation
Advanced Computer Techniques Corp.
AMP, Inc.
Ampex Corporation
American Telephone & Telegraph
Anderson Jacobson Inc.
Applied Data Research, Inc.
Applied Dynamics, Inc.
Applied Logic Corp.
Association for Computing Machinery
Audio Devices, Inc.
Auerbach Corporation
Auto-trol Corporation
Baldwin-Kongsberg Company
Bolt Beranek and Newman, Inc.
Brogan Associates, Inc.
Burroughs Corporation
Business Supplies Corp. of America
Caelus Memories Inc.
California Computer Products, Inc.
Certex Inc.
Collins Radio Company
Comcor Astrodata, Inc.
Communitype Corp.
Computer Applications, Inc.
Computer Communications, Inc.
Computer Design Publishing Corporation
Computer Industries
Computer Methods Corp.
Computer Sciences Corporation
Computer Test Corporation
Computer Sharing Inc./Mauchly Group
Computers and Automation
Computerworld
Com-Share
Concord Control, Inc.
Control Data Corporation
Cybetronics, Inc.
Data Disc Inc.
Datamark Inc.
Datamation
Data Processing Magazine
Data Products Corporation
Datascan
Data Systems News
Datel Corp.
Di/An Controls, Inc.
Digi-Data Corporation
Digital Development Corporation
Digital Devices, Inc.
Digital Equipment Corporation
Digitronics Corporation
Dura Business Machines
Dynamic Systems Electronics
Eastman Kodak Company
Educational Computer Products
Elbit Computer, Ltd.
Electro-Mechanical Research Inc.
Electronic Associates, Inc.
Electronic Design
Electronic Memories
Fabri-Tek
Ferroxcube Corporation
General Computers, Inc.
General Design
General Dynamics, Electronics Div.
General Electric Company
General Instrument Corp.
General Kinetics Inc.
Geo Space Corporation
Hewlett-Packard Company
Honeywell, Computer Control Div.
Houston Instrument, Div. Bausch & Lomb
IBM Corporation
IBM Industrial Products
Information Control Corporation
Information Displays, Inc.
Infotechnics, Inc.
Institute of Electrical and Electronics Engineers
Interdata
Kennedy Company
Keuffel & Esser Company
Kleinschmidt, Div. SCM Corporation
Laboratory for Electronics, Inc., Electronics Div.
Lancer Electronics Corporation
Litton/Datalog Division
Lockheed Electronics Company
Magne-Head, A Div. of General Instrument Corp.
Mandata Systems Inc.
Matrix Corp.
Memory Technology Inc.
Micro Switch, A Div. of Honeywell
Midwestern Instruments/Telex
Milgo Electronic Corporation
3M Company - Magnetic Products Div., Mincom Div., Dup. Products Div.
Modern Data Systems
Monitor Systems, Inc.
Motorola Instrumentation & Control Inc.
McGraw-Hill Book Company
The National Cash Register Company
The National Cash Register Company/Industrial Products
National Computer Systems
Nissei Sangyo, Ltd.
North Atlantic Industries, Inc.
Omnitec Corp., A Subsidiary of Nytronics, Inc.
Peripheral Systems, Div. of Memorex
Prentice-Hall, Inc.
Presto Seal Mfg. Corp.
Pro-Data, Computer Services Inc.
RCA EDP Div.
RCA Electronic Components & Devices
Raytheon Computer
Redcor Corporation
Rixon Electronics, Inc.
Sanders Associates, Inc.
Sangamo Information Systems
Scientific Control Corp.
Scientific Data Systems
Shepard Laboratories
Software Resources Corporation
Soroban Engineering, Inc.
Spatial Data Systems, Inc.
Sylvania Lighting Products
Systems Engineering Laboratories, Inc.
Tally Corporation
Tektronix, Inc.
Teletype Corporation
Transistor Electronics Corporation
Tymshare, Inc.
United Telecontrol Electronics, Inc.
UNIVAC Division, Sperry Rand Corp.
URS Corporation
U. S. Magnetic Tape
Varian Data Machines
Vermont Research Corporation
Wang Laboratories, Inc.
Wanlass Electric Co.
John Wiley & Sons, Inc.
Xerox Corporation

AUTHOR INDEX
Abraham, F., 345; Aguilar, R., 81; Allen, J., 339; Anderson, A. H., 259; Andreae, S. W., 105; Appel, A., 37; Balaban, P., 135; Ballot, M. H., 461; Bandat, K., 363; Batcher, K. E., 307; Bell, W. V., 509; Betyar, L., 345; Bhushan, A. K., 95; Boche, R. E., 67; Bohl, M. J., 423; Bowman, S., 353; Brocker, D. H., 443; Cantrell, H. N., 213; Chai, A. S., 467; Chang, H. Y., 529; Coffman, E. G., Jr., 11; Cole, F. B., 509; Constantine, L., 409; Crowther, T. S., 259; DeMott, A. N., 61; DelBigio, G., 183; Dent, B. A., 245; Dent, J. J., 503; Ellison, A. L., 213; Fantauzzi, G., 291; Feldman, A. P., 323; Feng, T. Y., 275; Forester, R. D., 73; Freeman, D. N., 229; Goldberg, J., 515; Green, M. W., 515; Hand, J. E., 81; Hauck, E. A., 245; Hauser, T. S., 453; Herndon, T. O., 259; Herrmann, R. L., 283; Hobbs, W., 31; Holcomb, R. L., 453; Hollingsworth, T. J., 73; Hyatt, G. P., 161; Johnston, R., 345; Jones, R. J., 171; Jordan, W. F., Jr., 253; Kamman, A. B., 415; Kaplan, S. J., 119; Kavanaugh, W. P., 443; Kleinrock, L., 11; Knight, K. E., 461; Lambert, W. M., 193; Lass, S. E., 435; Lee, F., 333; Levitt, K. N., 515; Levy, A., 31; Lickhalter, R. A., 353; Little, J. L., 89; Logan, J., 135; Londe, D. L., 385; Maki, G. K., 55; Mooers, C. N., 89; Morgan, J. D., 73; Mueller, W. J., 151; McBride, J., 31; McHugh, T. F., Jr., 209; McLaurin, M. J., 197; Newman, W., 47; Ohlberg, G., 161; Pullen, E. W., 491; Raffel, J. I., 259; Reichard, R. W., 253; Richman, S. H., 483; Rosko, J. S., 473; Ruffels, W. R., 193; Sackman, H., 1; Sallen, R. P., 315; Sandewall, E. J., 375; Saxton, D. R., 415; Schoeffel, W. L., 55; Schoen, W. J., 385; Schuur, C. C. M., 267; Schwartz, M., 483; Scott, E., 207; Shuttee, D. F., 491; Sinowitz, N., 395; Smith, R. J., 55; Smura, E. J., 111; Steadman, H. L., 23; Stewart, E. C., 443; Stotz, R. H., 95; Sugar, G. R., 23; Tang, C. K., 297; Tarter, M., 453; Teixera, J. F., 315; Tesler, L. G., 403; Tracey, J. H., 55; Traister, W. A., 197; Tsujigado, M., 223; Vichnevetsky, R., 143; Walter, A. B., 423; Walter, C. J., 423; Woodward, C., 200; Yau, S. S., 297


Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.3
Linearized                      : No
XMP Toolkit                     : Adobe XMP Core 4.2.1-c041 52.342996, 2008/05/07-21:37:19
Producer                        : Adobe Acrobat 9.0 Paper Capture Plug-in
Modify Date                     : 2008:11:17 16:49:32-08:00
Create Date                     : 2008:11:17 16:49:32-08:00
Metadata Date                   : 2008:11:17 16:49:32-08:00
Format                          : application/pdf
Document ID                     : uuid:bf3ce131-883b-4297-bec4-4f9ca30f6527
Instance ID                     : uuid:162baee0-8c7c-4a2c-a763-7789eb22edd8
Page Layout                     : SinglePage
Page Mode                       : UseOutlines
Page Count                      : 550
EXIF Metadata provided by EXIF.tools

Navigation menu