AFIPS
CONFERENCE
PROCEEDINGS
VOLUME 27
PART 1
1965
FALL JOINT
COMPUTER
CONFERENCE
The ideas and opinions expressed herein are solely those of the
authors and are not necessarily representative of or endorsed by the
1965 Fall Joint Computer Conference Committee or the American
Federation of Information Processing Societies.
Library of Congress Catalog Card Number 55-44701
Spartan Books, Div. of
Books, Inc.
1250 Connecticut Avenue, N. W.
Washington, D. C.
© 1965 by the American Federation of Information Processing Societies,
211 E. 43rd St., New York, N. Y. 10017. All rights reserved. This book,
or parts thereof, may not be reproduced in any form without permission of
the publishers.
Sole distributors in Great Britain, the British
Commonwealth, and the Continent of Europe:
Macmillan and Co., Ltd.
4 Little Essex Street
London W. C. 2
PREFACE
This volume records in part the technical material
presented at the 1965 Fall Joint Computer Conference. Contained in this publication are the formal
papers selected from a record number of contributions submitted to the Technical Program Committee. No attempt has been made to incorporate
the material presented at panels and tutorial sessions
of the Conference, nor have the invited papers
presented on the final day of the Conference been
included. The Conference Committee hopes that a
subsequent volume will emerge to catch the living
spirit of such deliberations.
Still, the size of this volume is large, just as the
scope of the Conference is broad. This is, in part,
deliberate, since the Conference attempted to provide
the opportunity for professional communication on
every level. Recognizing the increasing degree of
specialization in the hardware and software fields,
the Technical Program Committee added a third
information channel to the Conference to focus
attention on management and applications. These
sessions dealt with questions of marketing and economics as well as applications in the scientific and
humanistic fields. Thus to the orientation in hardware
and software was added the direction of applications
and management in the disciplines that are concerned
with information processing.
The most distinctive feature of this Conference,
however, must be the five "discuss-only" sessions
for which preprints were available before the Conference. Hopefully, new dimensions were added to
the papers through a searching examination of the
material on the floor of the Conference. We regret
that we cannot record the results and evaluate the
technique.
The real and permanent contribution of the 1965
Fall Joint Computer Conference is still the technical
material presented in this volume. The credit goes
to the authors with grateful appreciation of the role
of the Technical Program Committee and Session
Chairmen who engineered the structure. Behind
them are the contributions of many others who, as
members of the various committees, made the Conference possible.
ROBERT W. RECTOR, General Chairman
1965 Fall Joint Computer Conference
CONTENTS

Preface ... iii
SESSION 1: PROGRAMMING LANGUAGES
Universal Programming Languages and Processors: A Brief Survey and New Concepts
    WALTER H. BURKHARDT ... 1
Digital Simulation Languages: A Critique and A Guide
    JOHN J. CLANCY, MARK S. FINEBERG ... 23
Automatic Simplification of Mathematical Expressions: The Formac Algorithm
    R. G. TOBEY, R. J. BOBROW, S. N. ZILLES ... 37
The New Block Diagram Compiler for Simulation of Sampled-Data Systems
    B. J. KARAFIN ... 53
Two-Dimensional Programming
    MELVIN KLERER, JACK MAY ... 63
SESSION 2: ADVANCES IN COMPUTER ORGANIZATION
Microprogramming for Data Acquisition and Control
    W. C. MCGEE, H. E. PETERSEN ... 77
Picoprogramming: A New Approach to Internal Computer Control
    R. E. BRILEY ... 93
A Precession Pattern in a Delay Line Memory
    S. P. FRANKEL, J. HERNANDEZ ... 99
An Associative Parallel Processor with Application to Picture Processing
    R. M. BIRD, R. H. FULLER ... 105
Computer Organization for Array Processing
    D. N. SENZIG, R. V. SMITH ... 117
SESSION 3: EFFICIENCY AND MANAGEMENT OF COMPUTER INSTALLATIONS
The Multi-Discipline Approach: A Marketing Application
    G. A. GARRETT ... 129
Organizational Philosophy and the Computer Center
    B. G. MENDELSON, R. V. MONAGHAN ... 139
Management Problems of Aerospace Computer Centers
    M. H. GOTTERER, A. W. STALNAKER ... 145
Planning for Generalized Business Systems
    R. V. HEAD ... 153
Computer Systems Design and Analysis Through Simulation
    G. K. HUTCHINSON, J. N. MAGUIRE ... 161
Basic Concepts for Planning an Electronic Data Processing System
    A. F. MORAVEC ... 169
SESSION 6: A NEW REMOTE ACCESSED MAN-MACHINE SYSTEM
Introduction and Overview of the Multics System
    F. J. CORBATO, V. A. VYSSOTSKY ... 185
System Design of a Computer for Time-Sharing Applications
    E. L. GLASER, J. F. COULEUR, G. A. OLIVER ... 197
Structure of the Multics Supervisor
    V. A. VYSSOTSKY, F. J. CORBATO, R. M. GRAHAM ... 203
A General-Purpose File System for Secondary Storage
    R. C. DALEY, P. G. NEUMANN ... 213
Communications and Input-Output Switching in a Multiplex Computing System
    J. F. OSSANNA, L. MIKUS, S. D. DUNTEN ... 231
Some Thoughts About the Social Implications of Accessible Computing
    E. E. DAVID, JR., R. M. FANO ... 243
SESSION 7: APPLICATIONS OF SIMULATION
Structure and Dynamics of Military Simulations
    E. LEVINE ... 249
Analogue-Digital Data Processing of Respiratory Parameters
    T. W. MURPHY ... 253
Computer Simulation: A Solution Technique for Management Problems
    A. J. ROWE ... 259
The Role of the Computer in Humanistic Scholarship
    E. A. BOWLES ... 269
The Structure and Character of Useful Information-Processing Simulations
    L. FEIN ... 277
SESSION 8: NATURAL LANGUAGE PROCESSING
Catalogs: A Flexible Data Structure for Magnetic Tape
    MARTIN KAY, THEODORE ZIEHE ... 283
Information Search Optimization and Iterative Retrieval Techniques
    J. J. ROCCHIO, G. SALTON ... 293
An Economical Program for Limited Parsing of English
    D. C. CLARKE, R. E. WALL ... 307
The Mitre Syntactic Analysis Procedure for Transformational Grammars
    ARNOLD M. ZWICKY, JOYCE FRIEDMAN, BARBARA C. HALL, DONALD E. WALKER ... 317
SESSION 9: CELLULAR TECHNIQUES FOR LOGIC, MEMORY AND SYSTEMS
Cobweb Cellular Arrays
    R. C. MINNICK ... 327
Two-Dimensional Iterative Logic
    R. H. CANADAY ... 343
Two-Rail Cellular Cascades
    R. H. SHORT ... 355
Associative Memory Structure
    T. B. MCKEEVER ... 371
SESSION 11: THE REVOLUTION IN WRITTEN COMMUNICATION
Computer Editing, Typesetting and Image Generation
    M. V. MATHEWS, JOAN E. MILLER ... 389
The Left Hand of Scholarship: Computer Experiments with Recorded Text as a Communication Medium
    GLENN E. ROUDABUSH, CHARLES R. T. BACON, R. BRUCE BRIGGS, JAMES A. FIERST, DALE W. ISNER, HIROSHI A. NOGUNI ... 399
SESSION 12: ON-LINE INTERACTIVE SOFTWARE SYSTEMS
MATHLAB: A Program for On-Line Machine Assistance in Symbolic Computations
    C. ENGELMAN ... 413
An Integrated Computer System for Engineering Problem Solving
    D. ROOS ... 423
AESOP: A Prototype for On-Line User Control of Organizational Data Storage, Retrieval and Processing
    E. BENNETT, E. HAINES, J. SUMMERS ... 435
Structuring Programs for Multi-Program Time-Sharing On-Line Applications
    K. LOCK ... 457
Interactive Machine Language Programming
    B. W. LAMPSON ... 473
A Responsive Time-Sharing Computer in Business: Its Significance and Implications
    CHARLES W. ADAMS ... 483
SESSION 13: HIGH SPEED COMPUTER LOGIC CIRCUITS
Circuit Implementation of High Speed Pipeline Systems
    LEONARD W. COTTEN ... 489
High Speed Logic Circuit Considerations
    W. H. HOWE ... 505
Crosstalk and Reflections in High Speed Digital Systems
    A. FELLER, H. R. KAUPP, J. J. DIGIACOMO ... 511
SESSION 14: COMPUTERS IN THE BIOLOGICAL AND SOCIAL SCIENCES
Integrating Computers into Behavioral Science Research
    HAROLD BORKO ... 527
Data Analysis in the Social Sciences
    GEOFFREY H. BALL ... 533
Nonlinear Regression Models in Biology
    JOSEPH A. STEINBORN ... 561
Computer Correlation of Intracellular Neuronal Responses
    FREDERICK F. HILTZ ... 567
Information Processing of Cancer Chemotherapy Data
    ALICE R. HOLMES, ROBERT K. AUSMAN ... 583
SESSION 18: TIME-SHARED COMPUTER SYSTEMS: SOFTWARE/HARDWARE CONSIDERATIONS
A Facility for Experimentation in Man-Machine Interaction
    R. WAYNE LICHTENBERGER, MELVIN W. PIRTLE ... 589
A Time- and Memory-Sharing Executive Program for Quick-Response On-Line Applications
    JAMES W. FORGIE ... 599
Design for a Multiple User Multiprocessing System
    JAMES D. MCCULLOUGH, KERMITH H. SPEIERMAN, FRANK W. ZURCHER ... 611
A Computing System Design for User Service
    WEBB T. COMFORT ... 619
SESSION 19: SCRATCHPAD MEMORIES
Design Considerations for a 25-Nsec Tunnel Diode Memory
    D. J. CRAWFORD, R. L. MOORE, J. A. PARISI, J. K. PICCIANO, W. D. PRICER ... 627
SMID: A New Memory Element
    R. P. SHIVELY ... 637
An Experimental Sixty-Five Nanosecond Thin Film Scratchpad Memory System
    G. J. AMMON, C. NEITZERT ... 649
Impact of Scratchpads in Design; Multi-Functional Scratchpad Memories in the Burroughs B8500
    S. E. GLUCK ... 661
Scratchpad Oriented Designs in the RCA Spectra 70
    A. T. LING ... 667
Scratchpad Memories at Honeywell: Past, Present and Future
    N. NISENOFF ... 679
SESSION 20: ARITHMETIC TECHNIQUES AND SYSTEMS
A Bounded Carry Inspection Adder for Fast Parallel Arithmetic
    EMANUEL KATELL ... 689
A Fast Conditional Sum Adder Using Carry Bypass Logic
    JOSEPH F. KRUY ... 695
A Checking Arithmetic Unit
    RICHARD A. DAVIS ... 705
Serial Arithmetic Techniques
    M. LEHMAN, D. SENZIG, J. LEE ... 715
SESSION 23: SIMULATION OF HUMAN BEHAVIOR
Simulation Models for Psychometric Theories
    C. E. HELM ... 727
Human Decision Making Under Uncertainty and Risk: Computer-Based Experiments and a Heuristic Simulation Program
    N. V. FINDLER ... 737
Computer Experiments in Motor Learning
    G. R. BUSSEY ... 753
SESSION 24: HIGH SPEED READ ONLY MEMORIES
A Survey of Read Only Memories
    MORTON H. LEWIN ... 775
A High-Speed, Woven Read-Only Memory
    M. TAKASHIMA, H. MAEDA, A. J. KOLK, JR. ... 789
A Thin Magnetic Film Computer Memory Using a Resonant Absorption Non-Destructive Read-Out Technique
    M. MAY, J. L. ARMSTRONG, W. W. POWEL ... 801
Development of an E-Core Read-Only Memory
    P. S. SIDHU, B. BUSSELL ... 809
SESSION 25: INPUT/OUTPUT EQUIPMENT FOR CLOSER MAN-MACHINE INTERFACE
MAGIC: A Machine for Automatic Graphics Interface to a Computer
    D. E. RIPPY, D. E. HUMPHRIES, J. A. CUNNINGHAM ... 819
A Magnetic Device for Computer Graphic Input
    M. H. LEWIN ... 831
Graphic I - A Remote Graphical Display Console System
    W. H. NINKE ... 839
The Beam Pen: A Novel High Speed, Input/Output Device for Cathode-Ray-Tube Display Systems
    D. R. HARING ... 847
Voice Output from IBM System/360
    A. B. URQUHART ... 857
SESSION 26: INDUSTRIAL APPLICATIONS
Corrugator Plant Operating System
    WALTER J. KOCH ... 867
    WILLIAM G. DAVIDSON ... 871
Quality Evaluation of Test Operation via Electronic Data Processing
    A. A. DAUSH ... 879
The Introduction of Man-Computer Graphics into the Aerospace Industry
    S. H. CHASEN ... 883
Real Time Programming and Athena Support at White Sands Missile Range
    ARTHUR BURNS ... 893

SESSION 27: HYBRID COMPUTERS FOR FUTURE SYSTEMS
Optimum Design and Error Analysis of Digital Integrators for Discrete System Simulation
    ROGER W. BURT, ANDREW P. SAGE ... 903
Sequential Analog-Digital Computer (SADC)
    HERMAN SCHMID ... 915
Design of a High Speed DDA
    M. W. GOLDMAN ... 929
Hybrid Computation for Lunar Excursion Module Studies
SESSION 28: COMPUTER DIMENSIONS IN LEARNING
Engineering Mathematics via Computers
    JOHN STAUDHAMMER ... 951
The Computer: Tutor and Research Assistant
    ROBERT J. MEYER ... 959
WOSP: A Word-Organized Stored-Program Training Aid
    M. RASPANTI ... 965
CASE: A Program for Simulation of Concept Learning
    FRANK B. BAKER ... 979
SESSION 29: MEMORIES FOR FUTURE COMPUTERS
A 375 Nanosecond Main Memory System Utilizing 7 Mil Cores
    G. E. WERNER, R. M. WHALEN ... 985
Monolithic Ferrite Memories
    I. ABEYTA, M. KAUFMAN, P. LAWRENCE ... 995
High Speed Ferrite 2½ D Memory System
    T. J. GILLIGAN ... 1011
Design and Fabrication of a Magnetic Thin Film Integrated Circuit Memory
    T. J. MATCOVICH, W. FLANNERY ... 1023
Batch Fabricated Matrix Memories
    T. L. MCCORMACK, C. R. BRITTARD, H. W. FULLER ... 1035
Integrated Semi-Conductor Memory System
    HARLEY A. PERKINS, JACK D. SCHMIDT ... 1053
SESSION 30: COMPUTER-AIDED DESIGN & MAINTENANCE
Strobes: Shared Time Repair of Big Electronic Systems
    J. T. QUATSE ... 1065
A Self-Diagnosable Computer
    R. E. FORBES, D. H. RUTHERFORD, C. B. STIEGLITZ, L. H. TUNG ... 1073
An Automated Interconnect Design System
    W. E. PICKRELL ... 1087
Systematic Design of Automata
    J. P. ROTH ... 1093
UNIVERSAL PROGRAMMING LANGUAGES AND PROCESSORS:
A BRIEF SURVEY AND NEW CONCEPTS
Walter H. Burkhardt
Computer Control Company, Inc.
Framingham, Massachusetts
INTRODUCTION

Progress in any field depends on the materialization of new ideas. But before this is possible, these ideas have to be found, investigated, developed and adapted to the changing world.

In computing, i.e., the use of computers for the solution of problems, new ideas are available everywhere, although the implications behind them and the influence on the state-of-the-art are generally not very well understood. Therefore it is often difficult to separate the wheat from the chaff.

But even valuable ideas are not always useful and welcome. That is especially the case when the basis for them is not adequately prepared. To know which ideas are useful at present, it is necessary to evaluate the state-of-the-art to determine how developments in the field will proceed. There are other reasons. One might be to give the nonspecialist a fast orientation; another is to readjust the basis in a fast growing and changing field.

The last decade brought a tremendous gain in overall computer power, and per unit outlay as well. Therefore, it is not too surprising if many old values have to be changed and new ones appear.

The advent of computers gave a very useful tool for the solution of many tasks for which the statement of the problem was given in a fixed mathematical form. This is due to the special nature of computers, with the memories, the circuit logic, and electronic switching elements having easy adaptation to mathematical problems, and to a tremendous bulk of knowledge in the form of mathematical formalism.

There are now on the one side machines with more or less special features for the solution of particular problems, and on the other the problems, given sometimes in a self-contained formulation, sometimes in only a vague and inexact form, and ranging over the whole spectrum of life, science, and society. The medium to combine both is known as programming. This function consists of mapping a solution given to the problems on the machine, or, better defined, of dividing the problems into elementary task-components and translating them into terms of the machine.

In this paper, the interface between the problems and the machines will be discussed, with emphasis on the tools for the solutions: the programming languages and processors.

Statement of Problem
The application of computers for solving problems in technical, scientific, and commercial fields
has been very successful. But progress is hampered by the fact that the machines primarily accept only their own special language, which on digital computers is composed of number sequences. These sequences are mostly long chains of zeros and ones, which are rather unintelligible to humans and quite different from the languages in which the tasks are and can be easily described.
Possible Solutions

There are two possibilities for resolving the difficulties created by the gap between the languages of machines and the languages of problems. One solution would be to adapt the languages of the machines by developing machine languages that are more general and closer to the problems encountered (the high-level language computers); the other would be to adapt the problems to the machines. The latter is done presently with intermediate languages between machines and problems, which are easier for people to use than contemporary absolute machine languages.
High-Level Language Computers. This would mean developing a machine which could communicate in a higher language. Suggested rather early, and implemented to some extent (for example, in the SEAC machine1), this idea could give an elegant solution to the problem. It is therefore revived in newer designs,2,3 and it has even been suggested to use a language of the ALGOL type4 as machine language.* In addition to the drawbacks due to the insufficiencies of contemporary programming languages (and these are the only candidates at present for high-level machine languages), there are several factors opposed to such a development. The arguments of high cost for circuitry and restrictions to the applications area are mainly based on the economic feasibility of such designs. But with the advent of very cheap components and assembly methods, these restrictions could change in the future.

*A similar step in this direction is sometimes attempted in microprogrammed machines, with some higher-level language implemented in memory in a semifixed manner.

The arguments of altering bases must be taken more seriously. The development is fixed neither on the problem side nor on the machine side.

Development on the Problem Side. To illustrate this point, a simple example might be taken. In applications to commercial problems a basic function is certainly sorting, which is used over and over again. So it would seem natural to include a machine operation for sorting in the repertoire of such high-level commercial machines. But which technique of sorting5 should be implemented? The
best technique to select depends on the data formats and on the machine configurations, so that selecting only one technique is not very feasible. But the inclusion of several different techniques is highly unlikely. This example shows the difficulties for only one task function. The overall requirements complicate the situation so much that no reasonable solution is in sight.
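The sorting dilemma can be made concrete with a small sketch. The following Python fragment is purely illustrative (it is not from the paper, and the selection thresholds are invented); it shows how the "best" technique shifts with data and machine parameters, which is exactly why a single wired-in SORT operation would have to commit to one compromise choice.

```python
# Illustrative sketch: the preferred sorting technique depends on the data
# and the machine configuration. All thresholds below are invented for
# demonstration; they are not from the paper.

def choose_sort(records, key_bits, memory_words):
    """Pick a sorting technique from simple, hypothetical heuristics."""
    n = len(records)
    if n * 2 > memory_words:
        return "external merge sort"   # data exceeds fast storage
    if key_bits <= 16:
        return "radix sort"            # short fixed-size keys
    if n < 32:
        return "insertion sort"        # tiny inputs
    return "merge sort"                # general case

print(choose_sort(list(range(100000)), key_bits=32, memory_words=4096))
# -> external merge sort
```

A hardware SORT instruction would have to freeze one branch of this decision, which is the paper's point.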
Development on the Machine Side. Many opinions state the view that the development on the machine side is now fixed.6 But this belief seems prejudiced and premature. For example, in the near future memory hierarchies (say, a 128-word diode storage with 50 nanoseconds, 2,048 words of thin film or thin wire with 200 nanoseconds, and back-up storage of 32,768 words at 600-nanosecond cycle time; behind these might be bulk core storage, drums, disks and tapes) could give a user more computer power (according to the principle of limited resources) than the more conventional recent designs, as could mastery of parallel execution features, etc. Although this argument affects mainly the internal design of a possible high-level language machine, it complicates the picture and eliminates many suggestions for solutions. The potentialities for a standard machine (or assembly) language are impeded by this aspect too.
Solutions by Intermediate Languages. The solution by intermediate steps between problem and machine languages via programming was, at least in the past, the most successful one. It can easily be seen that the closer to the problem the steps are taken, the more powerful and rapid the solution will be. So the region between problems and machines contains software levels of differing machine and problem independence.
Efficiency of Machine Use. Whenever a programming language is different from the actual low-level machine language, questions concerning the efficient use of the hardware are apt to arise. These seem to be of greatest importance on slow and expensive machines. Linearly extrapolated, the emphasis on these questions is decreased to 2 percent when relating a machine with 0.5-microsecond cycle time in 1965 to one with 25-microsecond cycle time in 1951 at the same price. Interestingly, the highest requirements for run-time optimization with compilers are imposed on hardware which is inadequate for the intended problem solutions (e.g., optimization in FORTRAN II on the 704 and in ALPHA on the M-20,8 for the solution of difficult partial differential equations). With the need for faster computers9 and a decline in prices for hardware, as in the past decade, these efficiency questions are bound to diminish and perhaps to disappear altogether.
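The 2 percent figure is simple arithmetic under the stated linear assumption (emphasis proportional to cycle time at constant price). A quick check, as an illustrative sketch:

```python
# Sketch of the linear extrapolation in the text: if concern for hardware
# efficiency scales with the cost of a machine cycle, a 0.5-microsecond
# machine in 1965 warrants only a small fraction of the attention that a
# 25-microsecond machine warranted in 1951 at the same price.

cycle_1951 = 25.0   # microseconds
cycle_1965 = 0.5    # microseconds

emphasis = cycle_1965 / cycle_1951
print(f"{emphasis:.0%}")  # -> 2%
```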
Hierarchy of Programming Languages. Different hierarchies of programming languages have already been proposed,10 where the criterion is the machine configuration concerned. Of course, many other characteristics could be chosen for the classification of programming languages, but the one presented here, with respect to machine independence, seems to be the most interesting. A good measure for the level is the degree of declarative freedom for the user. On the lowest level, therefore, would be the absolute machine languages, with declarative possibilities gradually increasing up to the problem level of declarative languages, as follows:
Absolute machine languages (machine level): no declarative freedom.
Assembly languages: no specification of names and locations necessary.
Procedural languages: no detailed machine commands necessary.
Problem-oriented languages: language elements from the problem, but command structure procedural.
Specification languages, augmented by semantics: description of relations in the problem, freedom of procedure.
Declarative languages (problem level): description of the problem, freedom of procedure and solution.
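These levels can be illustrated with one small modern example (a Python sketch, not from the paper): the same computation written first with explicit machine-like control flow, then procedurally, then declaratively, with the declarative freedom increasing at each step.

```python
# Illustrative sketch: the sum of squares of a list, expressed with
# decreasing amounts of machine detail.

data = [3, 1, 4, 1, 5]

# Assembly-like level: explicit "registers" and explicit control flow.
acc = 0
i = 0
while i < len(data):
    acc = acc + data[i] * data[i]
    i = i + 1

# Procedural level: the loop remains, but names and locations are implicit.
total = 0
for x in data:
    total += x * x

# Declarative level: only the desired result is described; the procedure
# (iteration order, accumulation) is left to the language processor.
declared = sum(x * x for x in data)

assert acc == total == declared == 52
```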
The levels from absolute machine languages to procedural languages are very well known from the literature of recent years. (Sometimes in the past, procedural languages like FORTRAN, ALGOL and JOVIAL were incorrectly denoted as problem-oriented languages.) Examples of problem-oriented languages are found in APT, COGO, Simscript, etc.11 The block-notation languages12 for analog-hybrid simulation on digital computers are examples of augmented specification languages. Semantic content is there defined by the fixed meaning of the block names (in MIDAS13 they are the operations, and the operands are given by means of the successor specification). Recently an example of another use of a specification language in an applications program was published,14 where Backus-Naur-Form specification was adopted. As can be expected, the experience reported stresses the improved facilities (compared with conventional programming languages) in programming, check-out, and incorporating changes into the program. Perhaps the first example of declarative languages, although not on the level designated by the term today, was the dynamic analysis language DYANA.15 Some other approaches are described in a recent paper.16
Translation among Programming Languages. All programming languages above the actual machine language impose the necessity of translation to that language. This task is done by a translator, compiler or assembler, hereafter called a processor.

Two different aspects have to be distinguished concerning the translation of programs:

1. Translations horizontally, on one level
2. Translations vertically, to other levels

Obviously, all translations can be regarded as composed of these two possibilities to various degrees. The requirements for the practicability of translation are:

• The rules for the connections of elements in the languages (the grammars or syntaxes);
• The elements of the languages (the dictionaries);
• Their respective relations in both languages as well.
1. Translations Horizontally. Horizontal translations of programs among different programming languages of the same level are in general not possible. The reason is that the result of one operation (in an extended sense) in a program in the source language (the language of the input) may determine the kind of operation to be used next in the program, and that often the target equivalent of a source language item is not available. The criterion for translatability is that all operations in the source language can be translated separately into the target language (the language of the output) with respect to power and extent. Translatability from
one source language A to a target language B gives, however, no indication of translatability from B to A. Whenever some operations are not translatable, they may be simulated, e.g., interpreted at run time. Because of the huge number of redundant operation parts involved, interpreted programs normally run orders of magnitude slower than comparable translated ones on the same machine.
2. Translations Vertically. Vertical translation of programs is divided into (a) upward and (b) downward translation.

(a) Upward translations impose generally the same requirements as those detailed for horizontal translations. A special case governs the upward translation of previously downward-translated programs. Contrary to some opinion,17 no relevant information for the execution of a program is lost in the translation process, only the redundant. Therefore, if the translation algorithm is given, all necessary information can be retrieved from programs that had been translated before, to build a fully equivalent program on the former level.

(b) Downward translations are normally not difficult, because the languages on the higher levels are so designed as to give a specific and determined target equivalent on the lower level for each source language element.
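A modern illustration of such a determined downward mapping (a Python sketch, not from the paper): a toy translator maps each element of an expression tree to a fixed sequence of stack-machine instructions, and a small interpreter plays the role of the lower-level machine. Every source element has exactly one target equivalent, which is what makes the downward direction easy.

```python
# Illustrative sketch: downward translation from a toy expression language
# to stack-machine code. Each source element has a fixed target equivalent.

def compile_expr(e):
    """Translate a nested tuple expression to stack-machine instructions."""
    if isinstance(e, (int, float)):
        return [("PUSH", e)]
    op, left, right = e                      # e.g. ("+", 2, ("*", 3, 4))
    return compile_expr(left) + compile_expr(right) + [(op, None)]

def run(code):
    """Interpret the target program, simulating the lower-level machine."""
    stack = []
    ops = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
    for instr, arg in code:
        if instr == "PUSH":
            stack.append(arg)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(ops[instr](a, b))
    return stack[0]

program = compile_expr(("+", 2, ("*", 3, 4)))
print(run(program))  # -> 14
```

Because the mapping is total and fixed, the translation algorithm itself documents what was added in going down, which is the information the upward direction can discard as redundant.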
Now, by the mechanical transformation of the program (a description of a problem or its solution) into representations on other levels, with or without intermediate levels (e.g., DYANA → FORTRAN II, FORTRAN II → SAP 704, SAP 704 → 704), no more solutions of a problem are obtained, but only different representations of the program. Therefore, with regard to problem considerations, all different representations of one program (e.g., diagrams augmented by text, DYANA, FORTRAN II, SAP and 704), and all programs giving the same results for the same sets of data, are said to be equivalent. A similar relation is given among specification languages or notations.18 Continuing this thought, most efficiency questions, grammar and syntax peculiarities and details, though interesting and necessary for the development of the transformation processors, are definitely unimportant and sometimes even undesirable for the solution of a task in applications programming.
Experience with High-Level Programming Languages. The aspects of the historical development of high-level programming languages (with regard to machine independence) are described in detail elsewhere.11 It might be stressed that FORTRAN was not the first high-level language of the algebraic type, but had forerunners in Rutishauser's algebraic language on the ERMETH in Zurich and in Laning and Zierler's language on Whirlwind at MIT. Even MATH-MATIC for the Univac I, a commercially available machine, was earlier. But the small Univac I (a decimal and alphanumeric machine with only 45 instructions) did not really necessitate and justify a high-level algebraic language; this was required later with the more complex machines of von Neumann type, like the 704.
The advantages of high-level programming languages are the more apparent, the more the considered languages are independent of the machines. These advantages are:

1. Easier learning and use than lower-level languages, because they are less complicated;
2. Time savings in the programming of solutions for problems;
3. Time savings in debugging;
4. Correcting possibilities for slightly different problems;
5. Higher machine independence, for transition to other computers and otherwise for compatibility with hardware;
6. Better documentation (compatibility among programs and different programmers);
7. More powerful structuring in terms of the problem.

Points (1), (2), and (3) were stressed in the past and found most important.19 Nowadays (4) and (5) receive more attention, and in the future (5), (6), and (7) may become the dominant ones. It is interesting to note that points (1) through (4) have been similarly known to engineers for decades for the solution of problems in formal instead of numerical notation.
Most astonishing is the large number of programs still written in a low-level language.20 This can only be explained by a steep information gradient between the knowledge in the field and the application programmers, or better, their managers.
Development of New High-Level Programming
Languages
Introduction. The development of new high-level programming languages, at least in the past, has been more evolutionary than revolutionary. So the step from FORTRAN to ALGOL brought with it these advantages, in order of their estimated importance:

• Chained logical decision sequences
• Block structure of program parts
• Free notation format
• Lifting of various machine restrictions (i.e., number of subscripts in variables, modes of expressions, etc.)
Unfortunately, due perhaps to the ambiguities embedded in ALGOL and its definition, the gain from switching over from FORTRAN to ALGOL programming is considered marginal. Despite all the efforts in the past, less than 10 percent of all programs for scientific and engineering applications are coded in ALGOL,20 which is not a striking triumph for a successor to FORTRAN.21 Similarly, less than 5 percent of the programs in the same area are coded in FORTRAN IV, which can be cautiously described as a failure of the new facilities incorporated in FORTRAN IV over FORTRAN II. The use of a programming language by applications programmers has to be the measure of its success. If it is not sufficiently used, a programming language is certainly as dead and obsolete as Mayan or Babylonian, and perhaps of just academic interest.
Requirements for a New High-Level Programming Language. Several important design criteria, often violated even in recent designs, have to be stressed:

Close Relationship to the Problems in the Desired Area. This allows the user a concise and powerful description of the processes and concepts.

Uniqueness. Each item in a correct program has to have one unique and defined meaning. This is required for all compatibility reasons.

Simplicity and Clearness of Notation. The language has to be developed and designed with a programming notation for ease of learning, use and handling of the language in the intended problem area. (Of course, that does not exclude a formal and rigid definition of the language. But such a definition should hardly ever be imposed upon a user.) Requirements for readability and intelligibility are included here. This point of convenience has to be the main criterion for the standardization of programming languages. Admittedly, one proposed standard is generally better than another if it is more convenient for the user.
Completeness. A good programming language should be able to handle all occurring tasks within its designed scope, without need for using means from outside. Good counter-examples are the missing string-handling facilities in FORTRAN and the input/output part in ALGOL 60.

Segmentation. For the practical application of programming languages to large problems, efficient segmentation features are desirable, so that parts of problems can be handled independently.

Compatibility with Existing Programming Languages. In addition to compatibility in other respects, one aspect is important with regard to the already accumulated knowledge of problem solutions (the program libraries). These libraries consist of two parts: one created by the present user working with the language, and the other developed elsewhere or earlier with other languages. The first part requires elements in the language to build up and use older programs and program parts in new connotations; the second demands some means for translation or interpretation of old libraries.
Development Possibilities. There are three ways of developing a new programming language:

• Cleaning up and refining existing languages;
• Elaboration and combination of known useful features;
• Development from the basic requirements of a problem area.

All three methods were used in the past, either separately or combined.
Proliferation and Solutions. The application of computers with high-level languages to different problem areas causes a proliferation of programming languages according to the vernaculars of the application fields. There are two different possibilities:
1. If single programming languages are developed close to the vernaculars, then some incompatibility will exist between them.
2. On the other hand, if an intermediate language somewhere in the middle between problems and machines is accepted as the programming standard, then much more effort has to be spent on defining the problems to the computers.
The historical development of progress in the
computer field favors the first alternative, while
computer manufacturers and operations managers of
computer installations try to hold to the second one.
Possible solutions to the dilemma might be found in:
(a) Inclusion of language elements of neighboring problem areas into programming languages presently in use or being developed, or opening their borders to those areas, e.g., through an intermediate language with the scope of an UNCOL22 but on a higher level, or as a subset of a universal programming language.
(b) Development of universal programming languages.
(c) Development of universal processors.
Universality in this respect is meant to comprise at least the elements of two really different problem areas (not vertical combinations or notations).22
Several proposals for the first of these solutions (inclusion of language elements) have already been reported. Of these, BEEF24 and ALGOL-Genius25 are both designed to combine a programming language for algorithmic procedures with one for commercial procedures. More ambitious in this respect is the NPL-SHARE26,27 language, which combines in addition the elements of real-time and command languages.
It is most noticeable that software systems (languages and processors) developed upwards from the machines by combination of existing elements do not tend to please many users. Despite the desirability of larger extended systems, there are always users who do not need the new scope and are unwilling to pay for the clumsiness and complication due to inadequate design.
Other development possibilities starting from a fixed base are found in the features of open-ended systems. To some extent at present, the combining of the languages of two areas results in at least a partially universal programming language.
UNIVERSAL PROGRAMMING LANGUAGES
Definition
A universal programming language can be defined as a complete set of elements to describe the problems and solutions in all problem areas. If such a language can be developed, the design requirements will be the same as for a single high-level programming language (see the requirements listed above), but they will apply with much greater force.
Mathematical Definition and Development.
It is easy to define mathematically, in general terms, the design and development of a universal programming language.

The complete set Sk of all equivalent programs* Pikl for the solution of problem k in one area l is given by

Sk = ∪i Pikl

Then the operation δ selects from this set a program maximal in respect to power of problem description:

Dk = δ ∪i Pikl

Now all maximal programs of one problem area form a new complete set Sl:

Sl = ∪k δ ∪i Pikl

From this new set, the operation γ extracts the language elements and operations for the given area to form the language Gl for the problem area:

Gl = γ ∪k δ ∪i Pikl

For the generalized and universal programming language Au, the complete set S, generated by ∪l, of all languages Gl has to be considered, combined, and integrated by the operation λ to give

Au = λ ∪l γ ∪k δ ∪i Pikl

As may be recognized, the operations δ, γ, and λ are very complex and difficult, but the most serious drawback seems to be the large extent of the various sets required. Yet this is the only way for development, be it by methods of evolution via open-ended languages or by revolution via problem analysis and subsequent language determination (as shown by an example in reference 28).
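The chain of operations above can be restated compactly as a composition; the following is a sketch in modern set notation, with the operator roles as glossed in the text:

```latex
% Sketch of the derivation chain, using the operators defined above:
% \delta selects a maximal program, \gamma extracts the language elements,
% \lambda combines and integrates the area languages.
\[
  S_k = \bigcup_i P_{ikl}, \qquad
  D_k = \delta\, S_k, \qquad
  S_l = \bigcup_k D_k,
\]
\[
  G_l = \gamma\, S_l, \qquad
  A_u \;=\; \lambda \bigcup_l G_l
      \;=\; \lambda \bigcup_l \gamma \bigcup_k \delta \bigcup_i P_{ikl}.
\]
```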
Old Proposals
The problem of proliferation of programming languages was recognized rather early, especially in respect to its effects on processor production.22,29 So UNCOL, a universal computer-oriented language, was proposed as an interface between high-level programming languages and computers. Due to the open-endedness on both sides, of problems and of machines, such a scheme cannot easily be designed on the basis of a fixed language. On the other hand, examples of this scheme are known as notations, e.g., the prefix notation.30 But this design level seems to be inadequate for a satisfactory solution to the problem.

*See "Translation among Programming Languages" above.
A similar restriction is imposed on the well-known flow-chart notation when it is to be used as a universal notation, or even as a programming language. (Recent rumors suggest that flow charts may be used on oscilloscope terminals for on-line programming.)
Design Possibilities
As mentioned in the section on Solutions, there are two possibilities for the design of universal languages. One is the conventional approach with open-ended languages and processors, so that the users gradually develop the required high-level programming languages in the problem areas of interest. Then, from time to time, the achieved status of such a system should be re-evaluated and reshaped to avoid and eliminate inefficiencies and obsolescence. So, gradually, the best language for a problem area will mature. As soon as enough languages have been developed for different problem areas, the design of a universal one can be envisaged.

The more direct method, suggested by the mathematical definition, is to investigate the nature of the problems, depict the elements for description and solution, and combine these into a high-level programming language. This method was used to develop the new programming language BEST for commercial applications.28 The reported five years of labor seem relatively high, but the rewards justify the effort to eliminate all the inadequacies and inconsistencies which arrive at the fire line with programming languages designed by mutual consent at conference tables.
UNIVERSAL PROCESSORS

General Requirements and Notation

Definition and Feasibility. A universal processor can be defined as a program for transformation of all programs from all required problem areas into all required target languages. The extent of such a processor depends on the definition of the requirements of the problems and of the machines.

Processors which accept programs in a number of different programming languages are well known.31 But no successful experience (aside from the projects outlined below) could be found for easy combining of different target languages. This is certainly no accident, as will be stressed later. The target language area poses heavier and more stringent requirements on processors than the source language area, where it is possible to easily combine several compilers for different languages (but for the same machine) into one system and to invoke the one momentarily required by control cards (e.g., in IBSYS31). The difficulties for the target language arise mainly because of a third language parameter in a processor, its own language, i.e., the one in which the processor is written, or the language of the machine on which the processor has to run.

Design and Implementation. At the source language side of a processor, besides the simple IBSYS concept, a higher degree of integration could be obtained by (1) accepting mixtures of statements of different languages, perhaps with indicators as to which languages they belong to; and (2) accepting the elements of different languages intermixed. This requires that incompatibilities among the languages be removed. (For example, the definitions of identifiers in FORTRAN and COBOL are incompatible, blanks having no meaning in FORTRAN but being used as separators in COBOL.) This shows that a fairly universal programming language cannot be developed by simply combining useful features from different other languages.

If only a restricted universal processor can be developed, then by feeding a copy of it to itself a desired less-restricted one could be produced automatically.

General Notation. A processor can be defined as a function (f) for the transformation of a program given in one language into a program in another. The parameters of this function f are then:

α) Source language of programs to the processor;
β) Target language of programs from the processor;
γ) Own language of the processor;
δ) Variables for measuring the efficiencies;
ε) A variable for the method used in the processor;
etc.

So the processor can be designated by f(α,β,γ,δ,ε, ...).
Transformation of Programs by Processors. A source program is, for example, a given set ∨1i of (i) statements si in source language A for the solution of problem 1; similarly, a target program can be defined:

P1 = ∨1i si(A)   as source program
P'1 = ∨1k sk(B)   as target program in language B

The application of the transformation function gives the relations:

∨1k sk(B) = f ∨1i si(A)   for languages separately translatable only on the program level (this requires a different transformation algorithm)
= ∨1j f ∨jm sm(A)   for languages separately translatable on the block level (a block is defined as a set of statements)
= ∨1i f si(A)   for languages separately translatable on the statement level
= ∨1i si(f A)   for languages separately translatable on the language level

Simplified Notation. The most interesting and important questions with processors concern the function of changing the language representation of programs, especially by translating them to actual machine language. Therefore, if no regard is given to other than the language parameters, the function is reduced to

f(A,B,C).

Of course, the other parameters cannot be completely ignored, but they depend on other variables. (Measured efficiency of a processor depends on the methods used, while efficiency requirements are functions of hardware speeds and costs again, etc.; so the other parameters are omitted here.)

Now a new symbol for a processor is introduced. (Diagram: a T-shaped processor symbol carrying the three language parameters.) It designates a processor translating from source language A into target language B and is itself written in (its own) language C. Sometimes a label as a name for a processor will be used and inserted into the empty space at the left side of the symbol. Where one language parameter in the following is not specified or pertinent, the space for it is left empty.

Examples of the New Symbol. (The T-diagram figures are reproduced here only by their captions:)
• a FORTRAN compiler written in SAP, translating from FORTRAN to SAP;
• a SAP assembler given in 704 machine language, translating from SAP to 704 machine language;
• a NELIAC compiler translating from NELIAC to 1401 SPS and running on the 709;
• a precompiler translating into the source language (e.g., for error checking in programs) and running on a machine with language L3.

Mode of Processor Use. Basically, two different modes of processor use can be distinguished: translative and interpretive.

1. Interpretive Mode. The interpretive mode of processor use is characterized by the handling of data and source statements at the same time, according to the diagram: (Diagram: DATA and SOURCE PROGRAM enter the PROCESSOR together, which produces the RESULTS directly.)

2. Translative Mode. The translative mode is characterized by the processing of source program and data at different times, at compile time and at run time, respectively: (Diagram: at compile time the SOURCE PROGRAM is translated by the PROCESSOR into an OBJECT PROGRAM; at run time the OBJECT PROGRAM processes the DATA to produce the RESULTS.)
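The two modes can be contrasted in a small sketch (Python is chosen here purely as an illustration language; the toy statement forms and all names are hypothetical, not from the paper): an interpretive processor consumes source statements and data together, while a translative processor first produces an object program that later runs on the data.

```python
# Sketch: interpretive vs. translative processor use.
# The "language" here is a toy one: each statement is ("add", n) or ("mul", n),
# applied to a running value. All names are illustrative.

def interpret(statements, datum):
    """Interpretive mode: statements and data are handled at the same time."""
    value = datum
    for op, n in statements:
        value = value + n if op == "add" else value * n
    return value

def translate(statements):
    """Translative mode, compile time: emit an object program (a function)."""
    def object_program(datum):
        value = datum
        for op, n in statements:
            value = value + n if op == "add" else value * n
        return value
    return object_program

source = [("add", 3), ("mul", 2)]

# Interpretive: one pass over source and data together.
r1 = interpret(source, 5)            # (5 + 3) * 2 = 16

# Translative: compile once, then run on many data sets.
obj = translate(source)              # compile time
r2, r3 = obj(5), obj(10)             # run time: 16 and 26
```

The translative mode pays the translation cost once; the interpretive mode pays it on every datum, which is why the text treats the two as fundamentally different uses of the same processor function.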
In real-time concurrent processing, the schemes would look like:

3. For Real-Time Interpretive: (Diagram: SOURCE PROGRAMs 1 through N, each with its DATA, feed the PROCESSOR concurrently, which produces RESULTS 1 through N.)

4. For Real-Time Translative: (Diagram: SOURCE PROGRAMs 1 through N feed the PROCESSOR, producing OBJECT PROGRAMs 1 through N; each object program then processes its DATA to produce its RESULTS.)

It must be understood that the execution of the target program at run time is itself considered again as interpretation.
General Use of Processors. The general use of processors is given by feeding (designated by the simple arrow) a program into the processor to receive (designated by the double arrow) the program in another representation. (Diagram: the translation process of a program from source language L1 to target language L2 by a processor running on a machine with language L3.)

A more interesting case arises when the program fed to the processor is itself a processor. When it is written in its own source language, the scheme shows that a processor written in its source language can be translated to any other language for which a processor exists. From this property was derived the old requirement that each processor should be written in its source language. On the same process is based nowadays the production of assemblers for new machines. Details of that method will be explained later.
When the processor is written in its own target language, this gives the ancient method of processor construction: writing the processor in its target language, so that it is possible to build on already available processors. An example of this is the old FORTRAN compiler written in SAP and translating to SAP, which was then translated by the SAP assembler into 704 machine language; but it needed the SAP assembler in the translating process from FORTRAN to 704 code.
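The construction patterns just described obey a simple composition rule for the triple f(A, B, C): feeding a processor, given as a program in some language, through a second processor that translates that language yields the same translator re-expressed in a new language. A minimal sketch (the helper names are invented for illustration):

```python
from collections import namedtuple

# A processor is characterized by its three language parameters f(A, B, C):
# source, target, and own language.
Processor = namedtuple("Processor", ["source", "target", "own"])

def apply_processor(translator, program):
    """Run `translator` on `program` (itself a processor, given as a program
    in the translator's source language): the result is the same processor,
    re-expressed in the translator's target language."""
    assert program.own == translator.source, \
        "program must be written in the translator's source language"
    return Processor(program.source, program.target, translator.target)

# The FORTRAN example from the text: a FORTRAN-to-SAP compiler written in SAP
# is translated by the SAP assembler (SAP -> 704, itself in 704 code).
fortran_in_sap = Processor("FORTRAN", "SAP", "SAP")
sap_assembler = Processor("SAP", "704", "704")

fortran_on_704 = apply_processor(sap_assembler, fortran_in_sap)
# -> Processor(source='FORTRAN', target='SAP', own='704')
```

The same rule covers both cases in the text: a processor written in its source language is translated by any existing processor for that language, and one written in its target language is bootstrapped through the processors already available for that target.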
Restrictions on the Parameters. The variables in the transformation function f(A,B,C) of a processor are certainly not independent, even among themselves. The following functional relations among the language parameters are interesting. Previous mention has been made of the relation between the target and the own language of a processor. Another, but not a very stringent one, governs the relation between source language elements and their target language equivalents.

It will now be assumed that the relations can be defined and the variables separated. Several cases are then distinguished:

1. The source language parameter A is independent of the other ones, so that no functional relation is given there:
A ≠ h1(B);  A ≠ h2(C).
2. The target language variable B depends neither on the source nor on the own language:
B ≠ h3(A);  B ≠ h4(C).
3. Both source and target language are not related to the own language (but might depend on each other):
A ≠ h2(C);  B ≠ h4(C).
4. All language parameters are independent among themselves.
The design of universal processors will now be investigated according to these restrictions.

Universal Processors. Universal processors can be designed under the restrictions of the previous paragraph and will be treated in the same order.

1. A scheme for a universal processor limited by restriction (1) could be derived as follows. If the processor does not depend with its source language on either the target or the own language, then the source language part could be made exchangeable. As soon as one processor with this characteristic became available, processors for all different source languages could be constructed, running on and translating for the same machines. By transforming another processor with the same characteristic in the same way, processors could be written in all languages for which exchangeable definitions exist, and then translated to the designated machines. The task of writing 2m × n processors for n languages and m machines (there are only m × n processors if the possibility of translating programs on one machine for running on another machine is excluded) is now reduced to the writing of 2m processors (or m, respectively) for the m machines and of n language descriptions for the n source languages.
2. The case where the target language is considered independent of source and own language is even more interesting. Then target language descriptions for the machines could be developed and inserted into the processor to give a scheme for processors translating for all machines. Applying the same principle to the translation of processors could give a universal processor with any desired target and own language requirements. The requirements for a universal processor system would now be to write n processors for the n source languages and m target language definitions for the m machines. These n processors would preferably be written in a high-level language (N3) for which a processor with the same characteristic of exchangeable target equivalents has already been given.
3. The case that source and target language are independent of the own processor language (although they may depend on each other, case 1) would give a very powerful and general system. By application of the scheme to itself, any desired own language, and so a rather general universal processor scheme, could be obtained. The implementation requirements would now be to develop one processor with removable source and target language equivalent parts in two copies, and the definitions for each pair of source and target languages, giving m × n definitions if they depend on each other (case 1) or m + n definitions if they are independent (case 2).
4. When all language parameters are independent, we have the most general universal processor scheme. Of course, this brings no more solutions than could already be obtained in case 2. The requirements here would be to have one processor with the desired characteristics and m + n descriptions of source, target, and own languages.
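The processor counts quoted in these cases can be checked with a few lines of arithmetic. The following sketch encodes the text's figures for n source languages and m machines (the function names are invented for illustration):

```python
def processors_needed(n, m, cross_translation=True):
    """Naive scheme: one processor per (source language, machine) pair,
    doubled if programs may also be translated on one machine to run
    on another (the 2m x n figure from the text)."""
    return (2 * m * n) if cross_translation else (m * n)

def case1_effort(n, m, cross_translation=True):
    """Restriction 1 (exchangeable source part): 2m (or m) processors
    for the machines plus n source-language descriptions."""
    return ((2 * m) if cross_translation else m) + n

def case3_definitions(n, m, source_target_dependent=True):
    """Restriction 3: one processor with removable source and target parts,
    plus m x n pair definitions if source and target depend on each other,
    or m + n independent definitions otherwise."""
    return (m * n) if source_target_dependent else (m + n)

# Example with n = 10 languages and m = 5 machines:
print(processors_needed(10, 5))   # 100 processors in the naive scheme
print(case1_effort(10, 5))        # 20 under restriction 1
print(case3_definitions(10, 5, source_target_dependent=False))  # 15 definitions
```

The gap between the multiplicative m × n figures and the additive m + n figures is the whole economic argument for the separability assumptions examined above.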
Discussion. The schemes for universal processors described in the preceding section are outlined on the assumption that the language parameters of processors are independent of other variables and among themselves, at least to a certain degree. Some relationships among source, target, and own language are known. But up to now it has never been proved or disproved whether they could be separated, and if so, under what conditions. It can be seen, for example, that between source and target language only a simple connective relationship exists, but the requirements thereby imposed on the own language have not yet been evaluated.

The area of source languages is now fairly well understood, although the techniques are still not in the best conceivable state; much work is left to be done, and some is going on and progressing satisfactorily. But knowledge of the other parameters is very insufficient and incomplete.
Many investigations in the past were dedicated to the theory of automata. However, most results from these investigations are too general or of too low a level to be of great value for present-day computers with their variety of special hardware features. Only in the recent past has some work been performed on models of more contemporary machines.29

As long as actual computers are not well understood, there will not be much hope for very successful development of useful universal processors.

The following section describes the various reported projects for automated processor production and compares them to the described scheme of universal processors.
Projects for Universal Processors

General Scheme and Survey. All literature uncovered in recent years regarding projects for or proposals of universal processors fits into the same general scheme. There is always input I, consisting of a processor or its description, or of the description of the source language. Input II is sometimes missing (in some cases of a processor description for I), or it consists of specifications of the target language, and in interpretive cases of a source program in addition to that.

(Diagram: INPUT I and INPUT II feed the SUPER-PROCESSOR, which yields the resulting processor.)

The different elements for input and the obtained output are summarized in Table 1.
Table 1.
(Columns: Project; Input I; Input II; Resulting Processor; Special Features.)

High-level and special language use:
Input I: processor written in a high-level or special processor-writing language. Input II: (none). Result: processor in low-level language. Special features: high-level languages applied to processor construction.

UNCOL:
Input I: processor in UNCOL to translate to UNCOL. Input II: (none). Result: processor for UNCOL on the designated machine. Special features: reduction in the number of processors required.

CLIP-JOVIAL:
Input I: processor in high-level language. Input II: (none). Result: processor in low-level language for the original language. Special features: "bootstrapping".

NELIAC:
Input I: processor in high-level language. Input II: (none). Result: same as above. Special features: same as above.

XTRAN:
Input I: processor in high-level language (with connectors?). Input II: target machine macros. Result: processor in low-level language for the designated language. Special features: exchangeability of target language equivalents.

SLANG:
Input I: processor in SLANG-POLMI. Input II: target language description to generate the equivalents. Result: same as above. Special features: generation of target equivalents from a description.

TOOL:
Input I: processor in TOOL. Input II: library of macros. Result: same as above. Special features: translation for new machines.

Syntax method:
Input I: language specification in terms of M. Input II: source program in L. Result: target program in M. Special features: interpretive processor accepting the language L specification.

TGS:
Input I: 1. language specification; 2. generation strategy tables for selection. Input II: 1. macros for M; 2. source program. Result: processor in M. Special features: same as above.

Meta A:
Input I: system written in specification languages. Input II: description of language L in terms of M. Result: same as above. Special features: interpretive processor with extensive descriptions and specifications.

Meta B:
Input I: description of language L with connectors. Input II: list of target equivalents (macros). Result: same as above. Special features: system in specification language, separable for given source and target languages.

Applicative Expressions:
Input I: description of L in Applicative Expressions. Input II: machine definition in Applicative Expressions. Result: same as above. Special features: same as above, with Applicative Expressions as specification language.
Two different approaches can be distinguished: one starting with a processor or the description of a translation process, and the other starting with definitions of the source language. The processor-based projects are generally the older ones, thus reflecting the progress in the field.

Processor Based Projects

1. High-Level or Special High-Level Language Use. To gain the advantages of programming in high-level languages (see "Experience with High-Level Programming Languages" in the Introduction) for the construction of processors, projects based on this idea were tried rather early, and often abandoned immediately. The main reasons were the inadequacies of the high-level languages of those days (mainly FORTRAN and ALGOL) for processor descriptions, and unfamiliarity with the new technique. To alleviate the difficulties, special high-level languages were developed.33,34 However, the gains from these projects for the construction of universal processors can be considered marginal, because the original number of processors required is not reduced and, in addition, one processor for the high-level description language is required for each machine. This scheme is reported only for the sake of completeness and because it is used heavily in other projects.

2. UNCOL. In this project the first suggestion for a system of some sort of universal processor was given.22,29 It calls for an intermediate language (see "Old Proposals" above) together with the appropriate processors. The requirements are here reduced to m + n processors for n languages and m machines, instead of m × n (without translation of programs to run on other machines). For each source language a processor has to be written in UNCOL translating into UNCOL, and then for each machine one translating from UNCOL into machine language.

In the production process, the processor (written in UNCOL) for the source language is translated by a processor from UNCOL to machine language. All programs then written in source language N1 are translated by this new processor, running on the machine with language L2, into programs in UNCOL. These programs are finally translated to machine language L2 by the translator from UNCOL to machine language L2 (already required above).
3. CLIP-JOVIAL. Basically very similar to both the UNCOL and the high-level language projects is the CLIP-JOVIAL approach. Several different versions are reported, one without an intermediate language and another, more advanced, with it.35 The simpler version is practically the high-level language scheme, with the source language used for the description, where:

N1 = the CLIP language (a dialect of ALGOL 58 with additional features for table packing, string handling, storage overlapping, and local and global declarations);
N2 = assembly language;
L2 = 709 machine language.
The more advanced version uses an interesting "bootstrapping" method for adapting the processors to different machines. The parameters are given according to the table:

N1 = JOVIAL; N2 = INTERMEDIATE; N3 = CLIP;
L1 = 709-A; L2 = 2000-A; L3 = ANFSQ-A; L4 = MILITARY-A;
M1 = 709; M2 = 2000; M3 = ANFSQ; M4 = MILITARY;

with the indication -A after a computer name standing for the assembly language of that machine, and the computer name alone standing for its machine language. The parameters (Li) enclosed in parentheses indicate the insertion of the appropriate target language equivalents for the intermediate language N2 and the patching up for it.
A processor is written in CLIP to translate from the source to the intermediate language and is itself translated by the CLIP processor into the intermediate language. For each machine, a processor is then written for translation from the intermediate to the assembly language of that machine. With these processors, the former processor is translated into assembly language, and the target equivalents in intermediate language are exchanged for those in assembly language. At last, the resulting processors are translated by the assemblers for the appropriate machines.

A universal processor scheme requires:
• One processor for each source language, written in CLIP and translating to the intermediate language;
• One processor for each machine to translate from the intermediate language to assembly language, plus the target equivalents for the intermediate language for patching up in the insertion;
• One CLIP processor for the intermediate language (and the assemblers for the different machines).

The main difficulty here is to design an intermediate language in a fixed form for many source languages (cf. the UNCOL concept above).
4. NELIAC. In NELIAC, likewise, the high-level language is used for the programming of the processors.36 The most interesting feature here is the bootstrapping scheme used to obtain the processors for different machines.37 In the original version on the U460 (Countess), about 20 percent of the processor was handwritten in machine language and inserted into the processor (indicated by NC/460) after completion of writing the processor in its source language.

In the notation, symbols for the original names are retained as follows:
NC = NELIAC for the Countess;
N709 = NELIAC for the 709.
The NELIAC-Countess processor was produced with the patched-up processor. This version was used in the production of the processors for the B200, the CDC 1604, the IBM 704, and the 709 machines. For each machine, two different versions of the processor have to be written, one in the NELIAC language for the Countess and the other in the NELIAC language for the desired machine.

The procedure is to write a processor from a source language to a target language in a high-level language for which a compiler is running on some machine. This processor is first translated to run on that machine. Then the processor is written in its source language and translated to its proper machine by the one already obtained. This process is the direct equivalent of the old assembler production method, which was to write the assembler for a new machine first in an assembly language of a running machine and then translate it to run on this machine. The assembler was then written again, but in its source language, and translated by the already obtained assembler to run on, and translate for, its proper machine.
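The two-stage bootstrap just described can be traced symbolically. A sketch, reusing the (source, target, own) triple for a processor; the machine and language names H, N, OLD, NEW are placeholders, not from the paper:

```python
from collections import namedtuple

# A processor is a triple of its language parameters: (source, target, own).
Processor = namedtuple("Processor", ["source", "target", "own"])

def translate_with(translator, program):
    """Translate a processor-as-program from the translator's source
    language into its target language (the general T-diagram rule)."""
    assert program.own == translator.source
    return Processor(program.source, program.target, translator.target)

# Existing compiler for high-level language H, running on the old machine.
existing = Processor("H", "OLD", "OLD")

# Stage 1: write the new compiler (N -> NEW) in H and translate it,
# giving a cross compiler that runs on the old machine.
stage1 = translate_with(existing, Processor("N", "NEW", "H"))
assert stage1 == Processor("N", "NEW", "OLD")

# Stage 2: rewrite the compiler in its own source language N and feed it
# through the stage-1 cross compiler: the result runs on, and translates
# for, its proper machine.
stage2 = translate_with(stage1, Processor("N", "NEW", "N"))
assert stage2 == Processor("N", "NEW", "NEW")
```

This is exactly the old assembler-production method restated: one throwaway version in a running machine's language, then a self-described version compiled by it.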
5. XTRAN. To adapt the processors to different source languages, the XTRAN system21 accepts processors written in a simplified version of ALGOL 58 with string-handling facilities (the XTRAN language), together with the set of macros for the particular machine. Two different sets of macros are used: one is machine-independent, of the three-address type (as an intermediate language), and the other consists of the macros of an actual machine in assembly or machine language. The XTRAN system translates the processor for a source language into the machine-independent macros, accepts the definition of these macros in terms of the machine-dependent ones, and then replaces the former with the latter.
The requirements for a scheme of universal processors are:
• One processor for each source language, written in XTRAN (L1);
• One set of machine-dependent macros for each machine;
• The XTRAN system.

Unfortunately, no experience has been published yet for this system. This might be due to unsatisfactory performance of the macro setup.38
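XTRAN's two macro levels can be illustrated with a small sketch (the macro names, the intermediate form, and the one-address target machine are all invented for illustration): the processor is first expressed in machine-independent three-address macros, which are then replaced by their machine-dependent definitions.

```python
# Sketch of two-level macro expansion in the XTRAN style.

# Machine-independent three-address intermediate form: (op, dst, src1, src2).
intermediate = [
    ("ADD3", "t1", "a", "b"),     # t1 := a + b
    ("MUL3", "c", "t1", "two"),   # c  := t1 * two
]

# Machine-dependent macro definitions for a hypothetical one-address machine:
# each three-address macro expands into load/operate/store instructions.
def expand(op, dst, s1, s2):
    table = {
        "ADD3": ["LOAD {s1}", "ADD {s2}", "STORE {dst}"],
        "MUL3": ["LOAD {s1}", "MUL {s2}", "STORE {dst}"],
    }
    return [line.format(dst=dst, s1=s1, s2=s2) for line in table[op]]

# Replacing the machine-independent macros with the machine-dependent ones:
target = [line for macro in intermediate for line in expand(*macro)]
# target == ["LOAD a", "ADD b", "STORE t1", "LOAD t1", "MUL two", "STORE c"]
```

Retargeting the system then means supplying only a new expansion table, which is the exchangeability of target language equivalents claimed for XTRAN in Table 1.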
6. SLANG. The SLANG system39 is very similar to XTRAN, but it is more ambitious in that it generates the macros by itself from a definition of the target machine. The SLANG compiler accepts, in addition, a description of the processor in SLANG-POLMI, designated by S. Of course, building up the target equivalents from a machine description is usually a very difficult task, if it is possible at all in general. And the POLMI language might not be definable as a fixed set. Therefore, it is not surprising that no further experience with this system has yet been reported in the literature.

The requirements for a universal processor scheme using this system are:
• One processor written in SLANG-POLMI for each source language;
• One description for each target machine.

7. TOOL. A peculiar system was reported in TOOL.40 It translates processors written in the TOOL language for other machines. The target equivalents for a new machine are extracted from a library file. So the automated translation to different machines of processors given in TOOL, designated by T, is handled. In this case, the universal processor scheme would require:
• One processor written in TOOL for each source language;
• One library of target equivalents for each machine (presumably with appropriate connectors).

As can be seen, this scheme is very similar to the method of SLANG. Without further details, this scheme was reported to be working satisfactorily and to be running on the H800 and H400.
Description Based Projects

The Syntax-Directed Method. The reported projects use a syntactic description of the source language41,42 and compile interpretively:

{P(L), S(L,M)} → P(M)

The reason for the requirement of the interpretive mode here lies in the fact that the different language parameters of a processor are interwoven, though already to a much lower degree than in straightforward processors.

The requirements for a universal processor scheme are here that the syntax description be separable from the super-processor, and that one description be developed for each source-target language pair.
TGS - The Translator-Generator-System. An
ensuing development to the syntax-directed method
is given by the translator-generator-system TGS,43,44
using macro concepts like the XTRAN project (see
paragraph on XTRAN above).
This scheme accepts as input besides the program
to be translated:
1. A sort of Backus-Naur-Form definition of
the source language.
2. A table for macro description and code selection for the target language.
3. The generation strategy tables for the description of the linkage between source and
target definitions.
The super-compiler consists of five parts working successively on the source program. Most interesting among them are:
1. A syntactic analyzer for the source program to convert some piece of input string
into an internal representation (a tree form
is used);
2. A generator phase translating the internal
representation into an n-address instruction form, depending on syntactic context;
3. An optimizer phase for source and target program optimization, eliminating invariant computations from loops and common subexpressions (thus being source-language-dependent to a certain degree) and assigning special registers (thus being machine-dependent to a certain degree);
4. A code-selector phase driven by the code-selector table to produce symbolic machine code.
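The phase sequence can be illustrated with a deliberately tiny sketch. This is an invented miniature, not the TGS implementation: the grammar (fully parenthesized binary expressions), the three-address form, and the code-selector table are all assumptions made for illustration:

```python
# Invented miniature of the phase sequence: syntactic analysis into a
# tree, generation of n-address code with temporaries, and table-driven
# code selection.  Not the actual TGS tables or algorithms.

def parse(s):
    """Phase 1: tree for a fully parenthesized expression,
    e.g. '(a+(b*c))' -> ('+', 'a', ('*', 'b', 'c'))."""
    if "(" not in s:
        return s                      # a single variable
    s = s[1:-1]                       # strip outer parentheses
    depth = 0
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch in "+*" and depth == 0:
            return (ch, parse(s[:i]), parse(s[i + 1:]))

def generate(tree, code):
    """Phase 2: flatten the tree into 3-address tuples, returning the
    name that holds the result."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    a, b = generate(left, code), generate(right, code)
    t = "t%d" % len(code)             # a fresh temporary
    code.append((op, a, b, t))
    return t

# Phase 4: the code-selector table, one template list per operator.
SELECT = {"+": ["LDA {1}", "ADA {2}", "STA {3}"],
          "*": ["LDA {1}", "MUL {2}", "STA {3}"]}

def select(code):
    return [tmpl.format(op, a, b, t)
            for (op, a, b, t) in code
            for tmpl in SELECT[op]]

code = []
generate(parse("(a+(b*c))"), code)
print(select(code))
# ['LDA b', 'MUL c', 'STA t0', 'LDA a', 'ADA t0', 'STA t1']
```

The optimizer phase (omitted here) would rewrite the three-address list between generation and selection, which is what makes it partly source- and partly machine-dependent.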
The translation process looks like this:
{ P(L), S(L), G(L,M), B(M) } ⇒ P(M)
Most important is the endeavor to achieve object code optimized to a rather high degree, at the cost of great difficulty in the description of the code selection. In addition to that, the algorithms seem to be source- and target-language-dependent (with respect to algorithmic languages containing expressions and loops, and to machines possessing special registers, both in a given form).
For the production of the system, a bootstrap technique is used, starting from the algebraic language L0 (the language of the CL-I system45) in which the system was originally written:
{ D(L0 → 1604-A) }, { D(L0) }, { D(CXA → 1604) }
A universal processor scheme based on this project would require:
• One BNF definition for each source
language
• One table for macro description and
code selection for each target language
• One generation strategy table for each
source-target language pair
Metalanguage Compiler Direct. Likewise derived from the syntax-directed method,46 this project uses a metalanguage description of a source language in terms of the semantic target equivalents as basic elements of the language on the problem side.47 This description is then compiled by the metalanguage compiler into a processor written in the target equivalents of the metalanguage compiler:
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965
Requirements for a universal processor scheme are here:
• One metalinguistic description for each source language
• One set of target equivalents for each machine and set of basic semantic elements of each source language
• The metalanguage compiler
The metalanguage compiler was originally written in its augmented specification language and compiled by itself. Since no machine was available to accept the metalanguage specification, this translation was done initially by hand. The target equivalents for M1 and M2 are normally rather far different from operations and elements found on actual computers. Therefore, they have to be interpreted in terms of those for execution.
Metalanguage Compiler Indirect. Based on the previous project a method more machine-independent can be proposed, using the macro principle. Here the basic semantic elements of the source language would be separated from the source language description and referred to by connectors. These are then inserted or executed to obtain translative or interpretive mode. The production of a processor would be accomplished according to:
{ S(L,K), R(K,M1) }
A universal processor scheme requires here:
• One description for each source language
• One set of target equivalents for each machine, plus the target equivalents of new basic semantic elements in a new source language
• The metalanguage compiler
One special aspect of this method has to be stressed. By the design of a source language in terms of the basic semantic elements, these elements can be separated in a form in which they are required for the development of a universal programming language (see "Mathematical Definition and Development" and "Design Possibilities" above).
Applicative Expressions. Another proposal for a universal processor project could use Applicative Expressions48 and would be very similar to the method described above. Both source and target language would be described in Applicative Expressions, with connectors used for the correct interplay. Although this scheme might be more general, it seems to introduce many redundancies and to complicate the description, as the examples in reference 48 show. The process would be:
{ S(L,K), R(K,M) }
The requirements for universal processors in this instance are similar to those in the preceding paragraph.
Discussion
All reported projects try to gain some power for
the construction of processors in the direction of
universal processors. There are basically three different starting points:
1. The own language for the processor
2. The source language of the processor
3. The target language of the processor
The first point is stressed by all methods, which ease the specification of the processor to various degrees, from the use of a high-level language for writing the processor explicitly to the syntax-table specification in TGS.
The complete definition of a source language can
specify at the same time a recognizer for programs
written in that language. This characteristic is used
in the description-based projects.
Techniques for target language specification, allowing a processor to be used on different computers, were attempted rather early, as they were important to the development of software for the variety of different computer designs. But, as far as can be seen, the results obtained are still very far from a satisfactory solution to the problem, if it is possible to find even a fairly general solution. Unfortunately, no detailed experience with the XTRAN and TOOL projects is reported.
Several interesting methods are used for bootstrapping, i.e., the mechanical adaptation of a processor to a particular computer. They range from the old assembler construction method used in NELIAC to rather elaborate and sophisticated ones, as in TGS. Of course, the whole subject needs much more effort, either to develop the techniques for a fairly general universal processor scheme or to prove that the plan is not possible and to state the conditions. New insights into the problems will certainly bring much more progress than was achieved in the past. This paper is intended to serve as a basis for such a development.
Appendix
Explanation of Symbols
A: for designation of the source language of a processor
B: for designation of the target language of a processor
C: for designation of the own language of a processor
D(M): description of a target machine
D(L): description of the language L
E: designates Applicative Expressions
f(α,β;γ,δ,ε,…): processor function
f(A,B,C): processor function with regard to the language parameters
G(L,M): generation strategy table for source language L and target language M
K: connectors
Li: a language, with i as a number designating a language parameter
M: designation of a target machine
Ni: similar to Li
P(L): program in L
R: connective relation
S(L,M): language specification of syntax type for source language L and target language M
U: as language designator for UNCOL
u: as a set operator in respect to k
{ }: braces are used to combine some input other than a single processor or program
→: simple arrow, designating the feed-in to a processor
⇒: double arrow, designating the output from a processor
(graphical symbols, not reproducible in this listing): a processor; a processor with source language A, target language B, own language C and name D; macros; a list
REFERENCES
1. R. J. Slutz, "Engineering Experience with the SEAC," Proc. EJCC 1951, pp. 90-93.
2. A. C. D. Haley, "The KDF9 Computer System," Proc. FJCC 1962, pp. 108-120.
3. "Burroughs B5000," in Data Processing Encyclopedia, Detroit, 1961, pp. 50-55.
4. K. Samuelson, "Programming Languages and
their Processors," Proc. IFIP Congr. 1962, Munich,
pp. 487-492.
5. See for example M. H. Hall, "A Method of Comparing the Time Requirements of Sorting Methods," Comm. ACM, vol. 5, pp. 259-263 (May 1963).
6. See for example F. P. Brooks Jr., "The Future of Computer Architecture," Proc. IFIP Congr.
1965, New York, pp. 87-91.
7. R. S. Barton, "A Critical Review of the State of the Programming Art," Proc. SJCC 1963, pp. 169-177.
8. A. P. Yershov, "ALPHA, An Automatic Programming System of High Efficiency," IFIP Congr. 1965, New York.
9. J. A. Ward, "The Need for Faster Computers," Proc. Pacif. Comp. Conf. 1963, pp. 1-4.
10. R. F. Clippinger, "Programming Implications of Hardware Trends," Proc. IFIP Congr.
1965, New York, pp. 207-212.
11. S. Rosen, "Programming Systems and Languages," Proc. SJCC 1964, Washington, D.C., pp. 1-15.
12. R. D. Brennan and R. V. Linebarger, "A
Survey of Digital-Analog Simulator Programs,"
Simul. vol. 3, pp. 22-36 (Dec. 1964).
13. H. E. Peterson et al., "MIDAS: How It Works and How It's Worked," Proc. FJCC 1964, pp. 313-324.
14. H. Schorr, "Analytic Differentiation Using a Syntax Directed Compiler," Comp. J., vol. 7, pp. 290-298 (Jan. 1965).
15. T. J. Theodoroff and J. T. Olsztyn, "DYANA, Dynamic Analyzer-Programmer I & II," Proc. EJCC 1958, pp. 144-151.
16. J. W. Young Jr., "Non-Procedural Languages," 7th Ann. Tech. Symp., Southern Calif.
Chapter, ACM, Mar. 1965.
17. A. Opler et aI, "Automatic Translation of
Programs from One Computer to Another," Proc.
IFIP Congr. Munich 1962, pp. 245-247.
18. S. Gorn, "Specification Languages for Mechanical Languages and Their Processors, a Baker's Dozen," Comm. ACM, vol. 4, pp. 532-542 (Dec. 1961).
19. J. W. Backus et al., "The FORTRAN Automatic Coding System," Proc. WJCC 1957, pp. 188-198.
20. H. Bromberg, "Surveys of Computer Language Use," Data Proc. Mag. Apr. 1965, p. 37.
21. R. W. Bemer, "Survey of Modern Programming Techniques," Comp. Bull., Mar. 1961, pp. 127-135.
22. T. B. Steel, "A First Version of UNCOL," Proc. WJCC 1961, p. 371.
23. See for example K. Iverson, "Recent Applications of a Universal Programming Language," IFIP Congr. 1965, New York, and IBM Syst. Journ., vol. 2, pp. 117-128 (June 1963).
24. N. Moraff, "Business and Engineering Enriched FORTRAN (BEEF)," Proc. 19th ACM Conf. 1964, Phila., D1.4.
25. B. Langefors, "ALGOL-Genius, A Programming Language for General Data Processing," BIT,
vol. 4, no. 3, pp. 162-176 (1964).
26. NPL Technical Report, IBM Publications
No. 320-0908, Poughkeepsie, New York, Dec.
1964.
27. Computer Review, vol. 6, no. 2, ref. 7275, pp. 108-112 (Mar.-Apr. 1965).
28. J. R. Ziegler, "Computer-Generated Coding (BEST)," Datamat., Oct. 1964, pp. 59-61.
29. H. Bratman, "An Alternate Form of the UNCOL Diagram," Comm. ACM, vol. 4, p. 142 (Mar. 1961).
30. See for example C. L. Hamblin, "Translation to and from Polish Notation," Comp. J., vol. 5, pp. 210-213 (Oct. 1962).
31. A. S. Noble and R. B. Talmadge, "Design of an Integrated Programming and Operating System, I & II," IBM Syst. Journ., vol. 2, pp. 152-181 (June 1963).
32. See for example C. C. Elgot and A. Robinson, "Random Access Stored Program Machines," J. ACM, vol. 11, pp. 365-399 (Oct. 1964).
33. J. V. Garwick, "Gargoyle, a Language for
Compiler Writing," Comm. ACM, vol. 7, pp. 16-20
(Jan. 1964).
34. C. A. R. Hoare, "A Programming Language
for Processor Construction," IFIP Congr. 1965,
New York.
35. D. Englund and E. Clark, "The CLIP-translator," Comm. ACM, vol. 4, pp. 19-22 (Jan.
1961).
36. J. B. Watt and W. H. Wattenburg, "A
NELIAC-generated 7090-1401 Compiler," Comm.
ACM, vol. 5, pp. 101-102 (Feb. 1962).
37. M. H. Halstead, Machine Independent Computer Programming, Spartan Books, Washington,
D.C., 1962, p. 37 ff.
38. See for example G. Letellier, "A Dynamic Macro Generator for Optimum Use of Machine Facilities by a Translated Program," IFIP Congr. 1965, New York.
39. R. A. Sibley, "The SLANG-system," Comm.
ACM, vol. 4, pp. 75-84 (Jan. 1961).
40. A. Opler, "TOOL, A Processor Construction
Language," Proc. IFIP Congr. 1961, Munich, p.
513.
41. E. T. Irons, "A Syntax-Directed Compiler
for ALGOL 60," Comm. ACM, vol. 4, pp. 51-55
(Jan. 1961).
42. T. E. Cheatham and K. Sattley, "Syntax Directed Compiling," Proc. SJCC 1964, Washington,
D.C., pp. 31-57.
43. S. Warshall and R. M. Shapiro, "A General Table-Driven Compiler," Proc. SJCC 1964, Washington, D.C., pp. 59-65.
44. T. E. Cheatham, "The TGS-II Translator-Generator System," IFIP Congr. 1965, New York.
45. T. E. Cheatham et al., "CL-I, an Environment for a Compiler," Comm. ACM, vol. 4, pp. 23-27 (Jan. 1961).
46. A. Glennie, "On the Syntax Machine and the Construction of a Universal Compiler," Carnegie Tech. Rep. No. 2 (AD-240512), July 1960.
47. D. V. Schorre, "A Syntax Oriented Compiler Writing Language," Proc. 19th ACM Conf. 1964, Phila., D1.3.
48. W. H. Burge, "The Evaluation, Classification and Interpretation of Expressions," Proc. 19th ACM Conf. 1964, Phila., A1.4.
DIGITAL SIMULATION LANGUAGES: A CRITIQUE AND A GUIDE
John J. Clancy and Mark S. Fineberg
McDonnell Aircraft Corporation
St. Louis, Missouri
FOREWORD
The field of digital simulation languages, although barely ten years old, has shown a remarkable growth and vigor. The very number and diversity of languages suggests that the field suffers from a lack of perspective and direction.
While claiming no expertise in the writing of sophisticated compilers, the authors believe a relative unconcern with implementation details permits a wider objectivity in matters of format and structure. Competence to speak on these aspects is claimed on the basis of extensive analog, hybrid and simulation language experience.
In Locke's words, "everyone must not hope to be ... the incomparable Mr. Newton ..., it is ambition enough to be employed as an under-laborer in clearing the ground a little, and removing some of the rubbish that lies in the way to knowledge."1
INTRODUCTION
The appellation "digital simulation language" unfortunately has been appropriated by two quite distinct fields: simulation of discrete, serial processes, as typified by the GPSS and SIMSCRIPT languages; and simulation of parallel, more or less continuous systems, as typified by the MIDAS or DYSAC languages. The scope of this paper is limited to the latter.
This field of what might be called parallel languages has enjoyed a vigorous growth since its inception ten years ago. New languages are appearing at frequent intervals, but it appears the effort is scattered, and perhaps needs to be channelized. The authors have applied themselves to providing a measure of needed direction and perspective.
BRIEF SURVEY OF THE FIELD
History
Since Selfridge's article appeared in 1955,2 the field of analog-like simulation languages for digital computers has grown at a rapid rate. Brennan and Linebarger3,4 have provided an excellent review and analysis of the history of the field; their surveys are summarized and somewhat augmented below.
Lesh,5 apparently inspired by Selfridge's work, produced DEPI (Differential Equation Pseudo Code Interpreter) in 1958. Hurley6 modified this language for the IBM 704 (DEPI 4), and then in conjunction with Skiles7 wrote DYSAC (Digitally Simulated Analog Computer). This line of development has continued at the Universities of Wisconsin and Colorado under Professors Skiles and Rideout, resulting in the BLOC languages (MADBLOC, HYBLOC, FORBLOC and COBLOC).8,9
Stein and Rose, in addition to generating ASTRAL10 (Analog Schematic Translator to Algebraic Language) in 1958, have provided the theoretical and practical background needed to write a sorting routine, i.e., an algorithm to deduce the proper order of problem statement processing.11 This feature, although overlooked by many authors, is one of the keys to a useful language. This point is detailed below.
In 1963, Gaskill et al.12 wrote an excellent simulation language, DAS (Digital Analog Simulator). The program unfortunately suffered from a rudimentary integration algorithm and the lack of sorting.* MIDAS13 (Modified Integration DAS), written by Sansom et al. at Wright-Patterson Air Force Base, supplied the necessary improvements and met with unprecedented success. (Approximately 100 copies of the program have been distributed.) The success of MIDAS is explainable by two facts: the integration routine and sorting feature make it extremely easy to use, and MIDAS was written for the widely used IBM 7090-7094. The authors of MIDAS have now offered another entry, MIMIC,14 which is implemented by a compiler program and provides symbolic labelling, logical control capability, and freedom from block-oriented programming. MIMIC is not without faults, particularly in the areas of data entry, but seems destined for the same general acceptance as MIDAS, and deservedly so.
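The fixed-step integration routines that recur throughout this history can be illustrated briefly. The sketch below is a classical 4th-order Runge-Kutta step of the kind listed for several languages in Table 1; it is only an illustration of the idea, not code from MIDAS, which in fact used a 5th-order variable-step predictor-corrector:

```python
# Classical fixed-step 4th-order Runge-Kutta, as used (in various
# orders and step-size policies) by the languages surveyed here.

def rk4_step(f, t, y, h):
    """Advance dy/dt = f(t, y) by one step of size h."""
    k1 = f(t, y)
    k2 = f(t + h/2, y + h/2 * k1)
    k3 = f(t + h/2, y + h/2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h/6 * (k1 + 2*k2 + 2*k3 + k4)

# Integrate dy/dt = -y with y(0) = 1; the exact solution is exp(-t),
# so after t = 1 the result should be close to exp(-1) ≈ 0.3679.
t, y, h = 0.0, 1.0, 0.1
for _ in range(10):
    y = rk4_step(lambda t, y: -y, t, y, h)
    t += h
print(y)
```

In a simulation language the user never writes this loop; each integrator block simply invokes the routine once per step.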
One of the most significant developments has
been computer manufacturer interest in simulation
languages. Scientific Data Systems has led the field
in this respect, having proposed DES-1 (Differential Equation Solver) in 1962.15 DES-1 has been modified extensively in the succeeding years, and now offers one of the best formats and one of the most variegated operator lists;16 SDS has always promoted the language as part of a total computer system which provides analog type I/O and programming for a digital computer.
In late 1964, IBM entered the field in a small way with PACTOLUS, by R. D. Brennan.17 Apparently, PACTOLUS was intended to have only modest computational capabilities and to contribute
primarily as an experiment in man-machine communication. In this respect, the objectives of PACTOLUS are similar to those of DES-1, although the latter also attempts to be as powerful a computational tool as possible.
*It should be noted that Gaskill considers neither point of particular significance.
PACTOLUS was written for the IBM 1620, and brought simulation to a previously untapped audience of small installations. The popularity of PACTOLUS is rivaled only by that of MIDAS, if indeed it has a rival.
More recently, R. N. Linebarger of IBM has announced DSL/90 (Digital Simulation Language for the 7094 class computer).18 This language is a significant advance over PACTOLUS as a computational tool and offers many format improvements.
The above are only a few of the languages; many
others have appeared. Table 1 shows a list of simulation languages, along with some important characteristics of each. This table is the result of a rather
diligent search, but certainly is not comprehensive.
Much of the material has been taken from a survey given by Clymer.19
Trends
The present trend in the field is towards extension of the power and utility of the programs. More
efficient execution has been recognized as a goal,
and the compiler approach to implementation has
gained increased acceptance. The provision of more
complex operators and more advanced algebraic
capability represents an effort to increase the utility
of the programs for less analog-oriented applications. These trends are in the right direction, but
efforts have been scattered, and perhaps need to be
channelized.
Another major step has been recognition of the
importance of the man-machine interaction. As
was mentioned, this has been the primary message
carried by the PACTOLUS program, and has long been the concern of the DES-1 designers. The
SCADS20 program has been used on-line at Carnegie Institute of Technology and the Aerospace Corporation is using EASL21 through a terminal to an
IBM 7094. The authors agree wholeheartedly with
this stress on man-machine interaction, and are
aware of the real import this communication has
for the future of digital simulation.
Areas of Use
Simulation languages have been written for two
more or less diverse reasons: to provide analog
Table 1. History of digital simulation languages.

(Unnamed); 1955; R. G. Selfridge; USNOTS Inyokern; IBM 701; Simpson's Rule; no ancestor; sorting: No. The Adam of this genealogy.

DEPI (Diff. Eq. Pseudo Code Interpreter); 1957; F. Lesh; Jet Propulsion Lab; Burroughs 204; 4th-order Runge-Kutta; ancestor: Selfridge; sorting: No. Expanded and improved Selfridge's work.

DIDAS (DIgital Differential Analyzer Simulator); 1957; G. R. Slayton; Lockheed-Georgia; IBM 704; Euler; sorting: No. Simulates a DDA.

ASTRAL (Analog Schematic TRanslator to Algebraic Language); 1958; Stein, Rose and Parker; Convair Astronautics; IBM 704; Runge-Kutta; sorting: Yes. Isaiah, the voice that crieth in the wilderness. A precursor of the modern languages. Sorting and compiler implementation were original with ASTRAL, and remained advanced features until very recently.

DEPI-4 (DEPI for the IBM 704); 1959; J. R. Hurley; Allis-Chalmers; IBM 704; 4th-order Runge-Kutta; ancestor: DEPI; sorting: No. First language to use floating point hardware, and thus eliminate scaling problems.

DYANA (DYnamics ANAlyzer); 1959; T. J. Theodoroff; General Motors Research Lab; IBM 704; Euler. Mechanical system dynamic analyzer.

BLODI (BLOck DIagram compiler); 1961; Kelly, Lochbaum and Vyssotsky; Bell Labs; IBM 704 and 7090; integration: none; sorting: No. Block simulator for signal processing devices.

DYSAC (Digitally Simulated Analog Computer); 1961; J. J. Skiles and J. R. Hurley; Univ. of Wisconsin; CDC 1604; 4th-order Runge-Kutta; ancestor: DEPI-4; sorting: No. The prophet with honor only in his own country. Significant improvement of DEPI-4, particularly in format. The program was specific for a relatively little used computer, which probably caused its undeserved lack of wide acceptance and use outside the University of Wisconsin.

DYNASAR (DYNAmic Systems AnalyzeR); 1962; Lucke, Robertson and Jones; General Electric-Evendale; IBM 704 and 7090; Adams-Moulton 4-point predictor-corrector, variable integration step-size; sorting: Yes. Useful innovation was the variable step-size integration algorithm.

PARTNER (Proof of Analog Results Through Numerically Equivalent Routine); 1962; R. F. Stover; Honeywell Aero. Div.; IBM 650 and H-800/1800; Trapezoidal or Euler; sorting: No, but retains parallelism. Used extensively at Honeywell. Parallel nature retained by predicting variables around a feedback loop, rather than sorting.

DAS (Digital Analog Simulator); 1963; R. A. Gaskill; Martin-Orlando; IBM 7090; Euler; sorting: No. Major contributions to the format of block-oriented languages. Widely used in the Martin Company.

JANIS (?); 1963; R. G. Byrne; Bell Labs; IBM 7090; ancestor: DYSAC; sorting: No. FORTRAN flavor.

DES-1 (Differential Equation Solver); 1963; M. L. Pavlevsky and L. Levine; Scientific Data Systems; SDS 9300; choice of five integration routines; sorting: No. Excellent language. Part of a computer system that includes a special, analog type console.

DIAN (DIgital ANalog Simulator); 1963; Farris and Buckhart; Iowa State Univ.; IBM 7074; Euler; sorting: No. Chemical engineering simulations.

WIZ (?); 1963; J. E. Buchanan; U.S. Naval Avionics, Indianapolis; 4th-order Runge-Kutta-Gill; ancestor: ASTRAL; sorting: Yes. ASTRAL's only known direct descendant.

COBLOC (COdap Language BLock Oriented Compiler); 1964; Janoski and Skiles; Univ. of Wisconsin; CDC 1604; choice of three integration routines; ancestor: DYSAC; sorting: Yes (optionally No). Logical building blocks (gates, flip-flops, etc.) provided.

FORBLOC (FORTRAN-compiled BLOck Oriented Simulation Language); 1964; W. O. Vebber; Univ. of Wisconsin; any machine with a FORTRAN compiler; Trapezoidal; ancestor: DYSAC; sorting: No. Easily modified since FORTRAN is used. This approach could lead to machine independence.

HYBLOC (HYbrid computer BLOck Oriented Compiler); 1964; J. R. Hurley; Allis-Chalmers and Univ. of Wisconsin; IBM 709, 7090, 7094; 4th-order Runge-Kutta; ancestor: DYSAC; sorting: No. Simulates a hybrid computer.

MADBLOC (MAD Language BLOck Oriented Compiler); 1964; L. Tavernini; Univ. of Colorado; IBM 7090; Trapezoidal; ancestor: DYSAC; sorting: No. MAD (Michigan Algorithmic Decoder) statements.

MIDAS (Modified Integration DAS); 1964; Harnett, Sansom and Warshawsky; Wright-Patterson AFB; IBM 7090-7094; 5th-order variable-step predictor-corrector; ancestor: DAS; sorting: Yes. The Moses of the story, that led digital simulation to the verge of the Promised Land. Very widely distributed, modified and discussed.

SIMTRAN; 1964; W. J. Henry; Weapons Research Establishment, Australia; IBM 7090; sorting: Yes. Though not used on-line, the program is structured for such use in the future.

PACTOLUS (river in which King Midas washed off the golden touch); 1964; R. D. Brennan; IBM Research Lab; IBM 1620; 2nd-order Runge-Kutta; ancestor: MIDAS; sorting: Yes. Mainly an experiment in man-machine communication. Widely used as a simulation tool.

ENLARGED MIDAS; 1964; G. E. Blechman; NAA S&ID; IBM 7090; same integration as MIDAS; ancestor: MIDAS; sorting: Yes. Enlarged component set of MIDAS and added plotting routines.

PLIANT (Procedural Language Implementing Analog Techniques); 1964; R. L. Linebarger; IBM Develop. Lab.; IBM 7090; Trapezoidal; ancestor: JANIS; sorting: No. Build own FORTRAN blocks.

SCADS (Simulation of Combined Analog Digital Systems); 1964; J. C. Strauss and W. L. Gilbert; Carnegie Tech; CDC G-20; an algorithm unique to SCADS, a four-point method similar to Runge-Kutta; ancestor: PARTNER; sorting: No. Uses the same scheme for parallel operation as PARTNER, i.e., an extrapolation method. SCADS was used on-line at Carnegie Tech.

MIMIC (no meaning; ex post facto, Sansom claims, MIMIC is MIDAS InCognito); 1965; F. J. Sansom; Wright-Patterson AFB; IBM 7090-7094; 4th-order Runge-Kutta, variable step size; ancestor: MIDAS; sorting: Yes. Improvements over MIDAS include: compiler implementation, logical elements, improved algebraic capability, and logical control.

UNITRAC (UNIversal TRAjectory Compiler); 1965; W. C. Outten; Martin-Baltimore; IBM 7094; integration: ?; ancestor: FORTRAN; sorting: Yes. Stylized differential equation input format. Free format.

DSL/90 (Digital Simulation Language for the IBM 7090 class computers); 1965; Syn and Wyman; IBM Develop. Lab.; IBM 7090-94; choice of eight integration routines; ancestor: PLIANT; sorting: Yes (optionally No). Powerful, flexible simulation tool. Advanced format ideas include free format capability.

EASL (Engineering Analysis and Simulation Language); 1965; L. Sashkin and S. Schlesinger; Aerospace Corp.; IBM 7094; 4th-order Runge-Kutta, variable step size; ancestor: MIDAS; sorting: No. Used on-line through a terminal. FORTRAN statements are permitted in line.

SADSAC (Seiler Algol Digitally Simulated Analog Computer); 1965; J. E. Funk; U.S. Air Force Academy; Burroughs B-5000 and B-5500; 5th-order variable-step predictor-corrector; ancestor: MIDAS; sorting: Yes. Essentially MIDAS, but written in ALGOL. Significant feature is handling of discontinuities by interrupting integration routines when switching occurs.

SLASH (Seiler Laboratory Algol Simulated Hybrid); 1965; J. E. Funk; U.S. Air Force Academy; Burroughs B-5000 and B-5500; 5th-order variable-step predictor-corrector; ancestor: SADSAC; sorting: Yes. Gives an ALGOL program control of SADSAC for parametric studies, plotting, optimization, etc.
check cases, and to solve differential equations (in other words, replace an analog computer). MIDAS, ASTRAL and PARTNER22 were written primarily for the first reason; DAS, the DEPI family, DES-1, and PACTOLUS, apparently for the second.
Another, perhaps more important, use has not been
stressed by any of the authors, i.e., providing the
best digital computer language for a hybrid problem. Since an analog computer is a parallel device,
and no amount of mental calisthenics can make it
appear serial, it is clear that a parallel digital language is the only solution to the parallel-serial dichotomy in hybrid programming.
FUNDAMENTAL NATURE OF SIMULATION
LANGUAGES
Views Proposed
As noted above, simulation languages have proliferated at an amazing rate in the last few years.
Each new language comes equipped with its own
format and structural idiosyncrasies, which generally reflect the creator's reading of the essence of simulation languages. These analyses might be classed
as: analog computer simulator, block diagrammed
system simulator, and differential equation solver.
When the concepts are considered in detail, it is
evident that all these views miss the point to some
extent.
Analog Computer Simulator: Some simulation
languages, as noted previously, were written specifically to simulate an analog computer. The purpose
was to provide an independent check of the analog
setup. Most of the authors state, however, that the
resultant programs were used profitably for problem
solving-in other words, the analog was bypassed.
Simulating the analog computer generally results
in an operator (or statement) repertoire which reflects the fundamental physical limitations of analog
computer elements. Examples of this phenomenon
abound; one might mention the limitation on the
number of summer inputs, different elements for
variable and constant multiplication, and the lack
of memory.
In short, the logical culmination of this concept is a system neither fish nor fowl, with many disadvantages of both analog and digital programming.
is a system neither fish nor fowl, with many disadvantages of both analog and digital programming.
Block Diagrammed System Simulator: The overwhelming majority of authors state that their language (or program) was designed to simulate sys-
1965
tems that can be represented by block diagrams.
(Of course, an analog computer is such a system, so
the opinion outlined in the previous subsection can
be seen to be a subset of this view). By and large this opinion is justified, since many problems of
interest are easily expressible in block form. However, if the problem is given simply as a set of ordinary differential equations, reducing the equations
to a block diagram is a tedious, error-prone process. Even if a block-diagrammed system is considered, more often than not some of the blocks
contain differential equations; an example is the
airframe equations in a control loop.
Differential Equation Solver: An opinion sometimes expressed is that simulation languages should
be designed to solve ordinary differential equations.
From what has been said, this view has some merit.
Unfortunately, two problems arise. First, in control
system simulation, transfer functions and nonlinearities are not conveniently expressible in equation form. There is also a certain loss of familiarity
with the system when blocks are completely eliminated. Second, the concept overlooks other important problem areas, e.g., sampled data systems,
where the problem is stated in difference equation
form.
A More Correct Approach
It is seen, then, that all these views regarding the
fundamental nature of simulation languages are too
narrow and confining. Is there an "essence" (in the
metaphysical sense) which is common to all, yet
not so comprehensive as to be meaningless? Parallelism, the apparent parallel operation of a serial
digital computer, may be an all-inclusive, rational
statement of the essential nature.
All languages extant have taken their format and
structural cues from analog programming and analog operators. Assuming the action was rational, and
not an empty exercise in dialectical synthesis, this
fact provides a clue to a valuable overall view of
simulation languages. The analog computer is, of
course, a parallel device.
As it happens, this is the way most of the world
is structured. Representing physical phenomena
with a serial digital computer is an artifice; useful,
but nevertheless an artifice. The analog computer
has achieved such success and generated such attachment largely because of the close analogy existing between the computer and the physical world.
DIGITAL SIMULATION LANGUAGES: A CRITIQUE AND A GUIDE
29
Obviously, then, if physical systems must be represented with a serial digital computer, the machine
should be made to appear parallel. This is in fact
what has been done in simulation languages, and, of
course, the success is manifest. Difficulties have
arisen, though, because "pseudo-parallel" devices have in the past been modeled too closely on the analog
computer. If the notion of parallelism is correct,
the best parallel device must be sought.
The importance of this concept cannot be overstated. It is not merely a convenient catch-all to
include all previous efforts, but has real consequences for the future of simulation languages. A
programmer, if he is to "think parallel," must be freed from the chore of ordering problem statements. Such freedom is available if a sorting algorithm, as first proposed by Stein and Rose, is used.
Alternatively, an extrapolation scheme, as used in
PARTNER and SCADS, achieves the desired parallelism, but at a cost in storage and execution efficiency. The languages incorporating sorting or extrapolation are true parallel languages and provide
the designer with a parallel device to represent his
parallel physical system. It is always treacherous to
be dogmatic, but on this point it seems clear that a
language without sorting (or its equivalent) is simply another, perhaps slightly superior, method of
programming a digital computer, and is in no way a
parallel system simulator.
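In modern terms, the sorting idea amounts to a topological ordering of the block list: each block is scheduled only after every one of its inputs has been computed, while integrator outputs (the states) count as known at the start of a frame and so break the feedback loops. The following Python sketch illustrates the principle; the block set and names are invented for the example and are not taken from any of the languages discussed.

```python
# Each block names its inputs. Integrator outputs ("states") and external
# inputs are known at the start of a frame, so they never block the ordering.
blocks = {
    "E":  ["SET", "FB"],   # summer: error = setpoint - feedback
    "U":  ["E"],           # gain block driven by the error
    "X":  ["U"],           # integrator: its output is known at frame start,
                           #   but the block itself must run after U
    "FB": ["X"],           # feedback path
}

def sort_blocks(blocks, known_at_frame_start):
    """Order the block list so every input is computed before it is used."""
    known = set(known_at_frame_start)
    ordered, pending = [], dict(blocks)
    while pending:
        ready = sorted(b for b, ins in pending.items()
                       if all(i in known for i in ins))
        if not ready:
            raise ValueError("algebraic loop among: " + ", ".join(pending))
        for b in ready:
            ordered.append(b)
            known.add(b)
            del pending[b]
    return ordered

order = sort_blocks(blocks, known_at_frame_start={"SET", "X"})
```

A statement list emitted in this order executes correctly on a serial machine while letting the programmer write the blocks in any order at all, which is precisely the freedom argued for above.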
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965

FORMAT

Present Format Inadequacies

Perhaps the most important consideration in designing a simulation language is the utility of the input format. A good, flexible, natural appearing format would encourage wide usage, facilitate training, and reduce errors. All existing formats are much too arbitrary and generally reflect both the artificialities of digital computer modes of thought, and the physically determined inadequacies of analog elements.

Under digitally derived restrictions, one might mention the exaggerated importance given to column position in a statement, the need for commas and decimal points where unnecessary for clarity, and the requirement for a specific, arbitrary order of arguments within a statement.

Analog inadequacies have been mentioned; they appear as restrictions on the number of inputs to an element, poor logical and memory features, and rudimentary labelling capability. (This latter is a curious anachronism, since even the most primitive digital computer assemblers permit symbolic labelling.) Some formats, notably ASTRAL, were based directly on a specific analog computer and may be expected to have certain deficiencies. In others, e.g., MIDAS and PACTOLUS, no real attempt was made to simulate an analog computer, yet the implied hardware limitations are nonetheless present.

As a consequence of the poor format, operational difficulties are found to stem from trivial clerical errors, such as dropping commas or decimal points, or having input statements in the wrong order. These difficulties are increased by the multiplicity of primitive operators, and the consequent need for large complex networks to represent algebraic statements. The artificialities also tend to make the language more difficult to learn, or, having been learned, to retain all the esoteric details. (The retention of these details might seem a small matter to the "professional" programmer, but is a real concern to the occasional user.) Modern computers with character handling ability and high execution speed can free the programmer from this sort of detail, with very little penalty in increased compilation time.

General Format Rules

There is now a large fund of experience in the design and utilization of parallel languages, and some general, somewhat dogmatic, statements can now be made about format. On a very general level, these could be reduced to two rules: the format should be both "natural" and "non-arbitrary." These rules require some amplification.

"Naturalness": The input problem statement should be as close as possible to the normal, accepted method of problem statement. The question in the designer's mind should be: if I were preparing a problem for my own future reference and explanation to others, how would I state it? The answer to such a query would naturally vary with the type of problem.

If a parallel control system is to be analyzed, the problem would most naturally be stated in a block diagram, wherein each block represented a more or less complex operator such as gain, limit, hysteresis, transfer function, etc.

If, however, a set of differential equations is to be studied, it is a great deal of wasted effort to formulate the problem in block form. The process is conducive to error and really adds nothing to the understanding of the problem.

It is imperative that the input statement, the cards presented to the digital computer, should match as closely as possible the normal, natural statement of the problem. Of course, there are fundamental limitations, notably the fact that superscripts and subscripts are normally used while the card punch must work on one line. However, much can be done, and actually has been done, to "naturalize" format, especially in the APACHE* and UNITRAC23 programs.

To reiterate, the input statement should be flexible enough to match the natural form of diverse problems; block diagram and differential equation types have been mentioned. Many problems arise that are really combinations of these classes, e.g., an airframe in an autopilot loop. The format should naturally be capable of stating each problem area in its normal form.

"Non-Arbitrariness": Arbitrary formats, more than anything, have limited the utility of parallel languages. One commonly finds early enthusiasm for the idea of parallel languages, and then disenchantment when the "format gap" between the idea and its implementation is fully appreciated. In many facilities, where digital computer turn-around is measured in days, the trivial errors caused by the complex, arbitrary format extend a problem's check-out time to unacceptable periods.

However, this is not a fundamental problem; surely parallel languages can be written to eliminate most of these difficulties. It seems that few authors have given much thought to minimizing arbitrariness in their concern for other aspects of the language. This is seen clearly in the early work of Selfridge and Lesh, where the root idea of parallel languages was the main subject of study. Unfortunately, examples are still apparent: many outstanding contributors, in their understandable enthusiasm for man-machine interaction and efficient implementation, have been satisfied with adopting earlier format ideas and have neglected the ramifications of a good, clear, non-arbitrary input statement.

*The APACHE24 program is not really of the genre under consideration here. The program takes differential equations and generates an analog wiring diagram and static check. However, the format considerations are nearly identical to equation solving languages, and the APACHE authors have produced an input format worthy of study.
A few obvious requirements are discussed below:
(a) A "free format" statement capability, i.e.,
statements can appear anywhere on the
card. This is available in the UNITRAC
and DSL/90 programs, and should eliminate much of the frustration produced by
coding or key punch slips.
(b) The ability to enter numerical data in any
convenient form, e.g., 200.0 might be
written 200, 200.0, 200., 2E2, 2.0E02, etc.
(c) The ability to use either literals in a statement (Y = 4X) or the option to symbolically label constants (Y = KX). In the latter case, the constant would be specified in the normal fashion (K = 4 or K = 4., etc.).
(d) The ability to label quantities in a sensible,
problem related fashion, and use this label
to specify the variable without reference to
an arbitrary block number. The latter labeling method could be retained for meaningless intermediate quantities. It should be
noted that the need for problem related
labeling was one of the first lessons learned
by software designers and is now available
with virtually all assemblers.
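Requirement (b) above is easy to satisfy with a small amount of scanning code. A sketch in Python, where a modern regular expression stands in for the character-by-character scan a compiler of the period would use:

```python
import re

# Accept any reasonable spelling of a number: 200, 200., 200.0, 2E2, 2.0E02
NUMBER = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([Ee][+-]?\d+)?$")

def read_constant(text):
    """Parse a numeric constant written in any convenient form."""
    text = text.strip()
    if not NUMBER.match(text):
        raise ValueError("not a recognizable number: %r" % text)
    return float(text)

values = [read_constant(form) for form in
          ["200", "200.", "200.0", "2E2", "2.0E02"]]
```

All five spellings yield the same value, so the user never needs to remember which one the language "requires."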
Complexity of Format
An open question at this time is the allowable
degree of complexity in the input format statement.
The trend in new languages appears to be away
from the simple, analog type blocks to statements
reminiscent of FORTRAN.
Table 2 shows an "algebraic capability" scale,
with some of the languages distributed along it. At
the lower end, one finds a very primitive capability, which can represent any algebraic statement,
albeit in an extremely awkward form. Fortunately,
no one has been inspired to implement this sort of
language. (This is not to say such codes are useless;
a primitive language is generally the intermediate
representation in a compiler program.) Moving up
the table, the next stage is basic mathematical operators modeled by and large on analog computer
components. DAS and MIDAS are good examples
of this class. Here, there is some advance from a
"minimum vocabulary" and a great deal of flexibility is available, particularly for block oriented system representation. However, programming is awkward for algebraic type problems.

Table 2. Algebraic Capability Scale.
(listed in order of decreasing capability; each entry gives the description, a typical statement, and example languages)

Statements in any form understandable by the engineer
    Typical statement: dy/dt = y sin wt + x^2, y0 = 5, y0' = 0, w = 2π
    Examples: (none)

Nested sum of products and any functions or operators
    Typical statement: Z = (X*Y + K1*SIN(W*T)) * ERF((L*N) - INT((A + B)*C)) + K2
    Examples: (none)

Nested sum of products and operators
    Typical statement: Y: ADD(X, MPY(B, Z, SIN(U)))
    Examples: MIMIC

Nested sum of products and certain functions (a la FORTRAN)
    Typical statement: Y = X * Y * (SIN(A + B)) + K*M
    Examples: DSL/90, UNITRAC

Nested sum of products
    Typical statement: Y = X*(C1 + C2 - C3*(C4 - C5))
    Examples: DES-1

Single level sum of products
    Typical statement: Y = K1*X*X + K2*Z + K3*K4 - K5
    Examples: (none)

Coefficients on inputs to block operators
    Typical statement: N04 = P01*N01 + P02*N02
    Examples: ASTRAL, DEPI, DYSAC

Basic mathematical operators (a la analog)
    Typical statement: M1: S1, I2
                       S1: K1, I2, M3, K5
    Examples: DAS, MIDAS

Primitive operators, the minimum necessary, with the minimum inputs
    Typical statement: M1: S1, I2
                       S1: K1, I2
                       S2: S1, K3
                       I2: K1
    Examples: (none)
The next stage provides coefficient setting on all
inputs to blocks of the type discussed above.* As
on an analog computer, only constants can be multiplied by the input variable.
A natural extension of coefficient setting is, of
course, variable multiplication at element inputs,
the next step up the table. (Division is also assumed permissible here.) This stage has not been
implemented, probably because the transition to the
next stage is so evident.
In this stage, exemplified by DES-I, nesting of
sums of products (and quotients) is allowed, i.e.,
any level of parentheses is permissible. A flexible
statement is provided, although no functional relationship (sin, exp) can be imbedded in the sum of
products.
This lack is provided at the next stage, now
available in the DSL/90 and UNITRAC languages.
Here, an input statement very similar to FORTRAN is available; a limited number of functions
may be used in the statement.
It might seem odd that a MIMIC type language,
which allows only operators in the statement, should
be set above UNITRAC or DSL/90 which permit
functions. However, since function generators can
be considered as operators, this format is extendable downward, and further allows operations like
integration and limiting to be imbedded in the input statement.
*It is seen here that the evolutionary movement up the table is not in strict chronological order. ASTRAL and DEPI preceded DAS and MIDAS.

Moving now to the top, the next stage provides a
synthesis of the two below, allowing both operators
and functions to appear in the nested sum of products. The allowable function list would be open
ended and at the user's discretion. No published
work provides this capability. At the very top, and
really hardly in sight, is the capacity to accept any
reasonable problem statement understandable to the
engineer.
There certainly are objections to considering upward movement on the table as evolutionary, with
the implied value judgment concomitant to that
view. The authors would agree that a powerful algebraic capability, although very useful for some
tasks, must at the present state of the art carry with
it increased complexity and arbitrariness. The tool
may prove too powerful for many users and lead to
confusion and errors. Further, many users prefer the
block approach and with excellent reasons.
However, all fears can be allayed, since regardless of the available complexity, lower level statements are possible by simply limiting the size and
complexity of the more powerful statement. A close
examination of Table 2 will show that any of the
formats are easily derived by restricting the extent
of those above.
Diagnostics
In general, program error diagnostics should be
extensive and specific. Each card containing the
error, and only that card, should be printed, along
with a specific comment on the trouble. Alternatively, diagnostics could be printed immediately adjacent to the erroneous statement, as the source language is being printed. Closed algebraic loops
should naturally be printed separately from format
errors.
It is expected that improved format will reduce
this sort of error, but some are sure to appear and
rapid checkout requires good diagnostics.
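The card-by-card diagnostic style described above can be sketched as follows; the particular error checks and message texts are invented for illustration, not drawn from any existing language.

```python
def check_deck(cards):
    """Print each erroneous card, and only that card, with a specific comment."""
    errors = []
    for number, card in enumerate(cards, start=1):
        if card.count("(") != card.count(")"):
            errors.append((number, card, "unbalanced parentheses"))
        elif "=" not in card and ":" not in card:
            errors.append((number, card, "statement defines no output"))
    for number, card, comment in errors:
        print("CARD %4d  %-30s  *** %s" % (number, card, comment))
    return errors

deck = ["Y = K*(X + Z", "SIN W T", "Y = K*X"]
bad = check_deck(deck)
```

Because the report names the offending card and the specific trouble, the user can correct the deck in one pass rather than rediscovering each slip over several day-long turn-arounds.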
Diagnostics and corrections to the program must
be permitted in the source language to free the programmer from the details of debugging at the machine code level. For hybrid work it is essential
that such source level debugging and modification
be permitted on-line through a console typewriter
or similar unit. Extremely rapid compilation is of
course necessary to make this economical.
On-line debugging and modification has been
explored more fully in an earlier article by the
authors.25 The particular framework in point was a
huge multiprogrammed digital computer with analog type terminals (one might say a large scale multiprogrammed PACTOLUS, or a time shared DES-I). With such a system and a parallel language, the advantages of analog computers (parallelism, intimate man-machine communication) and digital computers (accuracy, repeatability) are
both apparent. The language employed with this
system must permit clear diagnostics and simple
modification at the source level to retain the analog
virtues of simple communication and rapid modification.
STRUCTURE
Integration with Software System
As is well known, all modern software systems
contain many languages; one might mention FORTRAN, assembly language, ALGOL and COBOL.
These languages are generally under the control of a
monitor or executive program, which calls programs
to compile (or assemble) various source programs
into machine code. Usually subprograms can be
written in any of the source languages, and a complete program linked at load time.
The parallel language should come under this organization, and be available along with FORTRAN
and the rest, as the optimum source code for a particular class of problems. In this way, and using the
subprogram feature, each problem area could be
written in the best language for a particular task;
say, parallel language for the differential equations,
FORTRAN for the arithmetic, and machine language for chores such as masking, character handling, and Boolean operations.
Extension Capability
In order to remain useful in the face of continuously expanding user requirements, any language
must be able to grow with needs. When a parallel
language is examined with the intent of increasing
capability without organizational disruption, it is
seen that expansion should take the route of adding
operators or functions. Expansion must, of course,
fit neatly into the total software system, i.e., the
other languages and the executive program, as outlined above. Since a common subroutine format is
already available with the other languages, any subroutine, written in any source code, could be used
by the parallel language programmer and be called
simply by using the subroutine name as an operator.
Augmenting the operator set must be made very
simple and obvious so the average user, unfamiliar
with normal digital techniques, can exploit the extension feature without recourse to a systems or
maintenance programmer.
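The extension mechanism described here reduces, in implementation, to a name-to-subroutine table that the translator consults when it meets an operator name. A minimal sketch in Python; the operator names and the dead-zone example are invented for illustration.

```python
import math

# Built-in operator set; a user extends it simply by registering a
# subroutine under the name he wishes to write in his statements.
operators = {
    "ADD": lambda *args: sum(args),
    "MPY": lambda x, y: x * y,
    "SIN": math.sin,
}

def register(name, subroutine):
    """Make any subroutine callable from the parallel language by name."""
    operators[name.upper()] = subroutine

def apply_op(name, *args):
    return operators[name.upper()](*args)

# The user adds a dead-zone operator without touching the translator itself.
register("DEADZONE", lambda x, width: 0.0 if abs(x) < width else x)
```

Since the table is consulted by name, the "average user" never sees the distinction between a built-in operator and one added yesterday, which is exactly the simplicity argued for above.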
User-Oriented Organization
The language should be designed to easily match
the capabilities of diverse programmer levels. Basic
subsets, such as primitive operators, algebraic statements, etc. should be made available to the less sophisticated programmers. These subsets should be
capable of integration and mixed use by the more
highly skilled user. Also, elements of a more complex nature (e.g., serial operators) should be
available to the expert, but not a matter of concern
for the novice. Thus, a structure is required that
will permit the novice to learn a minimum subset
and then advance, if he wishes, to the use of an extremely complex and powerful simulation language.
(Or looking at it yet another way, there should be
open and closed shop versions; the various open
versions upwardly compatible with the closed shop
version.)
In sum, there seems to be no need to restrict the
language's use to a particular programmer level, if
the initial design is done in a systematic manner.
IMPLEMENTATION
Regardless of format and structure, the language's
effectiveness will depend entirely on the quality of
the implementation. This aspect has recently been, a
major interest area, and the concepts are becoming
rather well developed.
In general, a program that produces machine
code is a necessity for efficient execution. MIMIC
and SCADS, compilers, directly achieve this, while
DSL/90 and ASTRAL generate a FORTRAN deck
which can then be compiled into an object program.
This latter approach (if a good FORTRAN compiler is used) can produce efficient code by exploiting the considerable efforts expended by FORTRAN designers. There are certain applications,
particularly those with small machines, where an
interpreter program makes more sense, but generally a compiler seems the best route. This is detailed
more fully below. First, though, consider the trade-offs involved in writing a compiler program for a parallel language.
Compiler for Different Applications
As was mentioned, there are three major usage
areas for parallel languages: analog check cases, differential equation solving, and the digital portion of
a hybrid problem. The relative weights given to
compiling and execution times vary with the particular application.
Analog Check Cases: This usage is generally on
a single job basis, i.e., the program is compiled and
run once and then discarded. Since the object program is never used again, only the sum total of
compiling and execution time for one run need be
minimized. In fact, this minimization is hardly a
point to stress, since analog check cases would
probably represent a small total of a digital facility's work load.
All Digital Simulation: If the language is to be used for this application on a "load-and-go" basis, minimization of the total time is of prime importance. On the other hand, if production programs are the expected rule, execution time is the
quantity to be minimized.
Hybrid: Here, the requirement for an efficient
object program is a vital consideration, and real
sacrifices can and must be made in compiling efficiency.
As a general rule, compiling time should never
be minimized at the expense of input format, and
only as a last resort should format be sacrificed for
decreased execution time. This latter seems a remote possibility, but it is easy to see compiling
time increased in the interests of simpler programming.
Different Computers
If this language is to achieve the general usage
typical of FORTRAN, some thought must be given
to implementation for diverse computers. It is pointless to design a system workable only for a CDC
6600-6800 or the top of the IBM 360 line. Similarly,
it is a waste of effort to aim at implementation solely
for a PDP-8, DDP 116 or SDS 92. The large machines obviously should have the full language, i.e.,
all subsets, and be provided compiled versions. For
the smaller machines, two approaches are possible.
The basic subsets could be compiler versions, thus
providing efficient programs although only for small
problems and at a modest language level. Alternatively, the complete system could be run in an interpreter mode, sacrificing time, but permitting the
use of a very powerful tool on a small machine.
REQUIRED FEATURES
Along with implementation, the operational features of the language (the programmer's bag of
tricks) have been a major concern of language designers. This section does not aim to be all-inclusive; probably some of the "required" features have
not been invented yet. The requirements can be
subdivided into two classes: structural or logical
features, and the types of elements or operators.
Sorting is not discussed, since it is assumed that all
modern parallel languages will be so equipped.
Structural Features
1. Logical Control: Logical control over program structure and execution is of paramount importance. DES-I, being sequential, easily incorporates this feature by the use of "IF" statements similar to FORTRAN. MIMIC, a true parallel language, still provides decision capability with the "logical control variable."

In substance, statements or operators are serviced or bypassed as a function of problem variables. So long as the by-passing is done in the true digital sense (non-execution), and not the analog sense (execution, but no use made of the outputs), a substantial time savings is realized.
Logical control is quite important. Without it a parallel language yields no more
than a hyper-accurate and hyper-repeatable
analog computer; some of the best features
of the digital computer, decision capability
and memory, are unused.
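The distinction between the two senses of bypassing can be made concrete with a short sketch; the block functions and the "overload" condition are invented for the example.

```python
import math

calls = []

def expensive_correction(x):
    """Stand-in for a costly block; records how often it actually runs."""
    calls.append(x)
    return -0.5 * x

def frame(t, overload):
    """One pass of the simulation loop with digital-sense bypassing."""
    results = {"drive": math.sin(t)}
    # Analog-sense bypassing would compute the correction and discard it;
    # digital-sense bypassing (shown here) never executes the block at all.
    if overload:                     # the "logical control variable"
        results["correction"] = expensive_correction(results["drive"])
    return results

r1 = frame(0.1, overload=False)      # block bypassed: no work done
r2 = frame(0.2, overload=True)       # block serviced this frame
```

The saving is real because the guarded statements consume no machine time at all on the frames where they are bypassed.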
2. Multiple Rates: This important provision,
available in DES-I, minimizes execution
time by servicing slowly changing variables
only as required. Generally speaking, multiple rates increase the effective bandwidth
of the simulation program. This has real
import for hybrid work.
The multiple rate option is clearly a part
of logical control capability. In this case,
sections of the program are not executed at
every pass, but unlike full logical control,
the bypassing is not under control of a program variable.
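In implementation, the multiple-rate idea reduces to servicing the slow sections only every Nth frame. A minimal sketch (the rate ratio and section contents are invented):

```python
def run(frames, slow_every=10):
    """Fixed-step loop: fast section every frame, slow section every Nth."""
    fast_runs = slow_runs = 0
    for frame in range(frames):
        fast_runs += 1               # e.g., the high-bandwidth dynamics
        if frame % slow_every == 0:
            slow_runs += 1           # e.g., slowly varying thermal states
    return fast_runs, slow_runs

fast, slow = run(100, slow_every=10)
```

With the slow section serviced one frame in ten, nearly all of its execution time is recovered for the fast dynamics, which is the bandwidth gain claimed above.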
3. Macro Capability: The macro capabilities
of modern assemblers should be available to the parallel language programmer.
Using this feature, prototypes of often used
problem sections could be coded with unspecified parameters, and then subsequently used as integrated units. Macros would
obviate repetitious coding of identical
loops or problem areas, e.g., parallel sections of control systems that are identical
in structure.
4. Subprograms in Other Source Code: In a
normal digital computer operation, there is
always a large library of routines available
to the programmer. These programs should
be easily incorporated within the parallel
language. If expansion is implemented as
suggested (see Extension Capability), not
only would the entire library be available,
but the programmer with digital training
could use whatever language desired for
particular problem areas. For example, logical or Boolean operations would be most
easily handled in machine language.
5. Repetitive Operation and Memory: It
should be possible to repeat runs, as on a
repetitive analog computer, with new parameters calculated from previous runs.
This implies two further requirements:
function storage, and algebraic calculations
between runs.
Elements
In this section, the normally found operators, e.g.
summers, multipliers, etc., are taken for granted.
No attempt has been made to be comprehensive;
however, those discussed are considered important
and/or relatively rare in present languages.
1. Integrator: An accurate, efficient integration method is the sine qua non of digital
simulation languages. Apparently no firm
conclusions have been reached as to the
best algorithm; the number of schemes
tried is almost as large as the number of
languages (See Table I). As an example of
the dynamics of this situation, note that
Sansom, having used an excellent method
(4 point variable step predictor-corrector)
in MIDAS, changed to another (modified
Runge-Kutta) in MIMIC.
DES-I, DSL/90 and COBLOC permit a
number of integration options, ranging
from simple Euler integration to complex Runge-Kutta and Adams-Bashforth algorithms. This variety does allow the user
to select a scheme which is adequate for
the required execution time, but presupposes considerable knowledge of numerical
techniques on the part of the programmer.
This presupposition defeats in large part
the basic idea, i.e., the simplicity and ease
of use, even for relatively untrained people.
In sum, it appears at this time that the
debate is hot on integration methods, and
more experience is still required. Parenthetically, it might be said that an objective,
thorough comparison of the various options would be a real service to the field of
digital simulation.
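The integration-option idea can be illustrated with a sketch offering Euler and the classical fourth-order Runge-Kutta step behind one interface. The selection mechanism is invented for the example; neither method is claimed to be what any particular language discussed here actually used.

```python
def euler_step(f, t, y, h):
    return y + h * f(t, y)

def rk4_step(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h/2, y + h/2 * k1)
    k3 = f(t + h/2, y + h/2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h/6 * (k1 + 2*k2 + 2*k3 + k4)

METHODS = {"EULER": euler_step, "RK4": rk4_step}

def integrate(f, y0, t_end, h, method="RK4"):
    """Integrate dy/dt = f(t, y) from t = 0 to t_end with the chosen method."""
    step, t, y = METHODS[method], 0.0, y0
    while t < t_end - 1e-12:
        y = step(f, t, y, h)
        t += h
    return y

# dy/dt = y, y(0) = 1: the exact answer at t = 1 is e = 2.71828...
approx = integrate(lambda t, y: y, 1.0, 1.0, 0.01, method="RK4")
```

The interface makes the method a one-word choice, but the accuracy consequences of that word are exactly the numerical-analysis burden the text warns the untrained user cannot be expected to carry.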
2. One Frame Delay: This element delays
its input by one integration interval. It
should be available for the sophisticated
user to selectively "desort" the program
list. COBLOC and DSL/90 presently have
the option of sort/no sort, but if the no
sort option is required in only one small
area, much care must be taken in the other
sections to insure proper operation. The
one frame delay is also quite useful for
representing sampled data systems or memory functions.
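A one frame delay is, in implementation, just a single remembered value that is updated after its output is read. A sketch:

```python
class UnitDelay:
    """Delays its input by one frame: output[n] = input[n-1]."""
    def __init__(self, initial=0.0):
        self.state = initial

    def step(self, x):
        out, self.state = self.state, x
        return out

d = UnitDelay()
outputs = [d.step(x) for x in [1.0, 2.0, 3.0]]
```

Placing such an element in a loop breaks the data dependence at exactly one chosen point, which is what "desorting" a small section of the program list amounts to.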
3. Hysteresis or Backlash: This element is
not easily constructed from standard analog
type elements, but represents a trivial task
for the digital computer.
4. Limited Integrator: Again, no easy chore
from the standard elements, and no real
effort for the digital computer. MIMIC
presently has a limited integrator element
which is used in conjunction with a standard integrator.
5. Transfer Functions: These operators are
used extensively in control system design
and the like, and are simply constructed
from analog type elements. However, the
very frequency of their use suggests they be
made available as integrated general-purpose units.
In addition to the programming time savings, execution time can be saved, since
the integration algorithm required for a
closed loop transfer function is much simpler than a comparably accurate routine for
open loop integration.
Far greater savings are possible by using a
difference equation algorithm. This method requires only one computational step of
the same size and complexity of a single
integration. Compare this with the n integrations required for a transfer function
with nth order denominator, when programmed by normal analog methods.
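The difference-equation point can be illustrated with a first-order lag, G(s) = 1/(Ts + 1). Discretized here by backward Euler (one of several reasonable choices, not necessarily the one any particular language used), the whole transfer function advances in a single computational step per frame instead of a separate open-loop integration:

```python
def first_order_lag(inputs, T, h, y0=0.0):
    """Advance y' = (x - y)/T by backward Euler, one step per frame:
    y[n] = (y[n-1] + (h/T) * x[n]) / (1 + h/T)
    """
    a = h / T
    y, out = y0, []
    for x in inputs:
        y = (y + a * x) / (1.0 + a)
        out.append(y)
    return out

# Step response: the output rises toward 1 with time constant T.
resp = first_order_lag([1.0] * 200, T=1.0, h=0.05)
```

An nth-order denominator handled this way still costs one recurrence per frame, against the n numerical integrations of the normal analog-style programming.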
6. Print Operators: Very often, in checkout
and operation, it would be helpful to force
a print (number or words) at an arbitrary
point in the program. The operator would
be similar to a "snapshot" print, but would
be under the control of problem variables.
As a trivial example, consider the printing
of "OVERLOAD" when an analog check
case variable exceeds 100.
7. Parallel Logic: The operators of interest
here are the normal digital units found on
most modern analog computers. (AND
gates, flip-flops, counters, shift registers,
etc.) These elements are presently available in the MADBLOC, COBLOC, and
MIMIC languages. Their inclusion is essential to provide a digital check for a
modern analog computer problem. When
solving differential equations, such units
are also useful for simple logic and storage.
For the higher level logic of the type normally associated with general-purpose
digital computers, machine language subprograms, as discussed above, are more
useful.
8. Linkage Elements: For hybrid programming, elements or labels for analog to digital converter (ADC) and digital to analog
converter (DAC) components must be
provided. ADC's could be handled simply
as another parameter input to the problem.
These could be realistically labeled and
then identified somewhere in the program
listing, e.g., ALPHA = ADC1. DAC's
could also be easily handled; for the digital
program they are merely elements with one
input and no output. Sorting these elements presents no difficulties: ADC's would
be treated exactly like constants; DAC's
would be sorted like any other element
which has an input.
9. Hybrid Integrators: Since variables transferred to the analog computer are usually
held for an entire frame, an effective half
interval time lag results. Thus, integrators
generating quantities destined for the analog must account for this lag phenomenon.
Those integrators in a purely digital loop
must, of course, neglect this extrapolation.
Therefore, different integrator types, easily
distinguishable and easily specified, must
be provided for the two requirements. It is
entirely possible that the sorting routine
could automatically make the necessary
distinctions by tracing back from a DAC
to integrators not isolated by another integrator. An alternate procedure is the addition of an extrapolation calculation to each
DAC element. However, this approach
costs both storage and execution time.
CONCLUSIONS
Digital simulation languages have made a real,
and probably permanent, impact on the fields of
both simulation and computer programming. As has
been pointed out, there are more or less serious
faults in all existing languages. The success of the
approach is, however, evidenced by the undeniable
acceptance, utilization and enthusiasm for simulation languages, regardless of the difficulties at the
present phase of development.
It is the authors' hope that the conclusions and
recommendations proposed herein will add significantly to the utility of simulation languages, and
the field will enjoy even further growth and acceptance.
REFERENCES
1. John Locke, An Essay Concerning Human Understanding, cf. The Age of Enlightenment, ed. Isaiah Berlin, Mentor, New York, 1956, p. 33.
2. R. G. Selfridge, "Coding a General Purpose
Digital Computer to Operate as a Differential Analyzer," Proceedings 1955 Western Joint Computer
Conference (IRE), 1955.
3. R. D. Brennan and R. N. Linebarger, "A Survey of Digital Simulation: Digital Analog Simulator Programs," Simulation, Vol. 3, No. 6 (Dec. 1964).
4. R. D. Brennan and R. N. Linebarger, "An Evaluation of Digital Analog Simulator Languages," I.F.I.P. 1965 Proceedings, Vol. 2 (1965).
5. F. Lesh, "Methods of Simulating a Differential Analyzer on a Digital Computer," ACM Journal, Vol. 5, No. 3 (1958).
6. J. R. Hurley, "DEPI 4," internal memorandum, Allis Chalmers Mfg. Co. (Jan. 1960).
7. J. R. Hurley and J. J. Skiles, "DYSAC,"
Spring 1963 Joint Computer Conference, Vol. 23,
Spartan Books, Inc., Washington, D.C., 1963.
8. V. C. Rideout and L. Tavernini, "MADBLOC," Simulation, Vol. 4, No. 1 (Jan. 1965).
9. J. J. Skiles, R. M. Janoski and R. L. Schaefer, "COBLOC," paper presented at Joint Meeting
of Midwestern and Central States Simulation Councils (May 1965).
10. M. L. Stein, J. Rose and D. B. Parker, "A
Compiler with an Analog Oriented Input Language
(ASTRAL)." Proc. 1959 Western Joint Computer
Conference, 1959.
11. M. L. Stein and J. Rose, "Changing from Analog to Digital Programming by Digital Techniques," ACM Journal, Vol. 7, No. 1 (Jan. 1960).
12. R. A. Gaskill, J. W. Harris and A. L. Mc-
1965
Knight, "DAS," Spring 1963 Joint Computer Conference, Vol. 23, Spartan Books, Inc., Washington,
D.C., 1963.
13. R. T. Harnett and F. J. Sansom, "MIDAS Programming Guide," Report No. SEG-TDR64-1, Wright-Patterson AFB, Ohio (Jan. 1964).
14. F. J. Sansom and H. E. Petersen, "MIMIC Digital Simulator Program," SESCA Internal Memo 65-12, Wright-Patterson Air Force Base, Ohio (May 1965).
15. M. Palevsky and J. V. Howell, "DES-I,"
Fall 1963 Joint Computer Conference, Vol. 24,
Spartan Books, Inc., Washington, D.C., 1963.
16. Anonymous, "SDS DES-I," Scientific Data Systems Descriptive Brochure, No. 64-42-01C (1964).
17. R. D. Brennan and H. Sano, "PACTOLUS,"
Fall 1964 Joint Computer Conference, Vol. 26,
Spartan Books, Inc., Washington, D.C., 1964.
18. R. N. Linebarger, "DSL/90," paper presented at Joint Meeting Midwestern and Central
States Simulation Councils, May 1965.
19. A. B. Clymer, "Report on Joint Midwestern-Eastern Simulation Council Meeting, June
1964," Simulation, Vol. 3, No.4 (Oct. 1964).
20. J. C. Strauss and W. L. Gilbert, "SCADS,"
2nd Edition, Carnegie Institute of Technology
(March 1964).
21. L. Sashkin and S. Schlesinger, "A Simulation
Language and its Use with Analyst-Oriented Consoles," Aerospace
Corp.
Report,
ATR-65
(59990)-5, San Bernardino, Calif. (April 1965).
22. R. F. Stover and H. A. Knudston, "PARTNER'" Doc. No. U-ED 15002, Aero Divn., Honeywell (1962).
23. W. C. Outten, "UNITRAC," paper presented
at J oint Meeting Midwestern and Central States
Simulation Councils (May 1965).
24. C. Green, H. D'Hoop, and A. Debroux,
"APACHE," IRE Transactions on Electronic Computers (Oct. 1962).
25. J. J. Clancy and M. S. Fineberg, "Hybrid
Computing-A User's View," Simulation, Vol. 5,
No.2 (Aug. 1965).
AUTOMATIC SIMPLIFICATION IN FORMAC
R. G. Tobey, R. J. Bobrow and S. N. Zilles
International Business Machines Corporation
Systems Development Division
Cambridge, Massachusetts
INTRODUCTION

Simplification is a central and basic operation in the manipulation of mathematical expressions. Indeed, much of the tedious algebra that plagues scientists and engineers involves the time-consuming application of simplifying transformations to unwieldy mathematical expressions. It seems obvious, conceptually, that some simplifying transformations can be applied "automatically" to arbitrary expressions. However, there are transformations that require special handling; they simplify some expressions and complicate others.

FORMAC, an acronym for FORmula MAnipulation Compiler, is an experimental programming system currently available as a Type II program from IBM. It is a tool for programming the IBM 7090/94 to perform tedious mathematical analysis on complicated mathematical expressions. The FORMAC language contains, as a subset, FORTRAN IV; hence, FORMAC provides the capacity for performing both nonnumeric and numeric calculations in the same program. The FORMAC language is described, in increasing amounts of detail, in references 1, 2, and 3. The details of FORMAC implementation are presented in reference 4.

The remainder of this paper is divided into four sections: Historical Background, The Role of Simplification in the FORMAC System, Simplification Transformations, and The FORMAC Simplification Algorithm.
HISTORICAL BACKGROUND
A "SIMPLIFY" routine was written as early as
1959 as part of the Dartmouth Mathematics Project. It is the most complex routine reported in reference 5. During the same academic year, Edwards 6
and Goldberg7 explored the possibilities of automatic simplification in the context of electrical circuit
analysis. A LISP coded simplification package was
central to Goldberg's work. Within the next two
years Maling8 discovered that simplification was
essential to the LISP differentiation effort and
Hart9 developed "SIMPLIFY," a LISP function for
simplification. In 1963 Wooldridge10 completed
another LISP simplify program, which was written
to be used on-line in a time-sharing environment.
In April of 1964 the experimental FORMAC system completed systems test and became operational.
It included a comprehensive simplification capability. In November 1964 the FORMAC system was
released as a Type III program.
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965
The FORMAC AUTomatic SIMplification routine (AUTSIM) presents several contrasts to previous efforts:
1. None of these efforts is part of a comprehensive mathematical-expression manipulation system, nor are they as ambitious as
AUTSIM.
2. The LISP coded efforts are recursive.
AUTSIM is essentially nonrecursive; although the basic scan is controlled by a
push-down store, the simplification transformations do not employ recursion.
3. With respect to his own effort, Wooldridge
(reference 10, page 31) observes, "There
is no doubt that a large proportion of the
time spent simplifying is devoted to repeating simplifications already done."
AUTSIM is designed to avoid redundant
simplifications. This is a fundamental aspect of the AUTSIM scan and the entire
FORMAC object time system.
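The redundancy avoidance described in point 3 can be sketched as follows. This is our own Python illustration, not IBM's 7090/94 code: each node carries a flag (cf. the AABITS of Fig. 1) that is set once the node has been simplified, so later scans pass over it without revisiting its subtree.

```python
# Sketch (ours, not FORMAC's implementation) of marking subexpressions as
# "already simplified" so that repeated calls do no redundant work.
class Node:
    def __init__(self, op, *args):
        self.op, self.args, self.simplified = op, list(args), False

def autsim(node, counter):
    """Walk an expression, skipping subtrees already marked as simplified.

    counter tallies how many nodes are actually visited."""
    if not isinstance(node, Node) or node.simplified:
        return node
    counter['visits'] += 1
    node.args = [autsim(a, counter) for a in node.args]
    # ... the simplification transformations would be applied here ...
    node.simplified = True
    return node
```

Calling `autsim` a second time on the same expression visits no nodes at all, which is the behavior the authors describe for repeated calls on already-simplified input.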
THE ROLE OF SIMPLIFICATION
IN THE FORMAC SYSTEM
The FORMAC programming system consists of
three parts-a programming language, a preprocessor, and a set of object time routines. In this section we discuss the relationship of automatic simplification to the FORMAC programming system.
The FORMAC programming language is a proper extension of FORTRAN IV and contains 4 declarative statements and 15 executable statements in
addition to the full FORTRAN .IV language. FORMAC also introduces additional symbolic mathematical operators from which symbolic expressions can
be composed. The language statements are summarized in Table 1, and the list of operators that may
be used to compose symbolic expressions is displayed in Table 2. These additions to the FORTRAN IV language permit the user to construct
and manipulate symbolic mathematical expressions
at object time. The statements listed under 2b in
Table 1 provide an interface with the FORTRAN
program logic and with FORTRAN numeric capabilities. The form of generated symbolic expressions
can control the logic of program execution. Symbolic expressions can be evaluated and the numeric results used in FORTRAN-coded, numeric calculations.
The FORMAC preprocessor scans through a
FORMAC source program and converts the FORMAC language elements into FORTRAN IV statements. Among these are many calls to FORMAC
object time routines. In addition, the preprocessor
creates a prototype for each symbolic expression
that occurs explicitly in a FORMAC command.
This prototype is used by the object time routines
Table 1. Summary of FORMAC Language Extensions to FORTRAN IV.

1. Four Declarative Statements

ATOMIC    declare basic variables, which name themselves.
DEPEND    declare implicit dependence relations.
PARAM     declare parametric pairs for SUBST and EVAL.
SYMARG    declare subroutine arguments as FORMAC variables; flag program beginning.

2. Fifteen Executable Statements

(a) statements yielding FORMAC variables

LET*      construct specified expressions.
SUBST*    replace variables with expressions.
EXPAND*   remove parentheses.
COEFF*    obtain coefficient of variable or of a variable raised to a power.
PART      partition expressions into terms, factors, exponents.
ORDER*    specify sequencing of variables within expressions.

(b) statements yielding FORTRAN variables

EVAL*     evaluate expression.
MATCH*    compare two expressions for equivalence or identity.
FIND*     determine dependence relations.
CENSUS    count words, terms, or factors.

(c) miscellaneous statements

BCDCON    convert to BCD form from internal form (prepare symbolic expressions for output with FORTRAN "WRITE" statement).
ALGCON*   convert to internal form from BCD form (facilitates input of symbolic expressions with FORTRAN "READ" statement).
AUTSIM    control arithmetic done during automatic simplification.
ERASE     eliminate expressions no longer needed.
FMCDMP    symbolic dump.

*These commands call AUTSIM.
Table 2. FORMAC Operator Set.

+
-         (unary - internally)
*
/         (external only)
**        (power, ↑)
FAC       (factorial)
DFAC      (double factorial)
COMB      (combinatorial)
EXP
LOG       (natural logarithm)
SIN
COS
ATAN      (arctangent)
TANH
]         (delimiter)
DIF       (differentiation)
to generate the required expression when the corresponding FORMAC command is executed. Consider the segment of a sample FORMAC program represented by statements 1 and 2, below.
1. LET U = (1 + Z)**N
2. LET X = (1 + Z)**M - U + Z**5

These two statements cause the prototype expressions 1p and 2p to be constructed by the preprocessor.

1p. (1 + Z)**N
2p. (1 + Z)**M - U + Z**5
When the call statement, generated by statement 1
during preprocessing, is executed at object time, the
symbolic variable U is defined as the name of the newly generated expression (1 + Z)**N. This is accomplished by scanning the prototype expression which the preprocessor constructed for (1 + Z)**N
and replacing the variables Z and N by their current
values. Let us suppose that Z is a symbolic variable,
i.e., it is either the name of a symbolic expression
or it may be an ATOMIC variable. In either case
its value is symbolic. N, on the other hand, is a
FORTRAN variable. Its value is numeric. Then the
generated expression for 1p will be like 1p, only Z
and N will have been replaced by their current values. If the value for Z is W-1 and for N is 2, then U
names the generated expression 1g. Similarly, if the
value for M is 3, then-after execution of statement
2-X names the expression 2g.
1g. (1 + W - 1)**2
2g. (1 + W - 1)**3 - (1 + W - 1)**2 + (W - 1)**5
Note that both these expressions require simplification. The extent to which they may be simplified is
unknown at compile time. Moreover, the degree to
which the FORMAC user understands (when writing the program) just which simplifications will be
applicable to these expressions depends on the complexity of the logic of the program in which they
are embedded. For example,
(1+Z)**M-U
will cancel if the current values of M and N are
equal; however, the values of M and N may be determined by quite complex program logic. We see
that in a mathematical formula manipulation system, organized as FORMAC is, an object time simplification capability is essential. (The expressions
being manipulated may be far more complex than
in the above example!) Moreover, that capability
must be, to as great an extent as possible, automatic.
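The prototype-substitution step just described can be sketched in Python (our representation, not FORMAC's internal form): variables in a nested-tuple prototype are replaced by their current values, and the result is exactly the unsimplified expression 1g.

```python
# Sketch of object-time generation from a prototype: every variable is
# replaced by its current value; nothing is simplified yet.
def generate(prototype, values):
    if isinstance(prototype, tuple):
        return tuple(generate(p, values) for p in prototype)
    return values.get(prototype, prototype)

# prototype 1p: (1 + Z)**N, with Z currently W - 1 and N currently 2
proto_1p = ('**', ('+', 1, 'Z'), 'N')
u = generate(proto_1p, {'Z': ('+', 'W', -1), 'N': 2})
# u is now the unsimplified expression 1g: (1 + W - 1)**2
```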
The FORMAC object time system is composed
of many subroutines. The command level routines
which correspond to the FORMAC executable
statements are the basic object time routines. These
in turn call a number of service routines. The automatic simplification routine, AUTSIM, is the most
important service routine. Each command which is
starred in Table 1 calls AUTSIM at least once.
The role played by AUTSIM in the FORMAC
object time system is not unlike that envisaged by
the Dartmouth Mathematics Project. 5 Their SIMPLIFY routine was designed to serve a threefold
purpose:
( 1) The answers produced by other programs may appear
in a far more complicated form than necessary, and then it
is desirable to simplify them.
(2) It may be desirable to simplify a given formula before
applying one of the other routines.
(3) It is a program that attempts to reduce formulas to a
canonical form. Such a form is particularly useful, for
example, when we try to find out whether two formulas
represent the same function or not.
Purpose (1) is relevant to FORMAC; the result
of an EXPAND (see Table 1) may require the collection of like terms in a sum. There is an additional aspect which is worth noting. The design of the
algorithms for the basic FORMAC commands was
simplified by the assumption that AUTSIM would
clean up expressions after manipulations had been
performed. It was not necessary to rule out a fast,
simple algorithm for a particular manipulation simply because it produced a more complicated expression than necessary. Moreover, the inclusion of
code in the algorithm to eliminate redundant expression elements and to clean up expressions, would
have led to a proliferation of redundant code
throughout the FORMAC object time system.
Several FORMAC command algorithms are
greatly simplified by the assumption that the
expression they receive is already simplified and in
p-canonical form (to be defined later). This is
close to the intent of purposes (2) and (3) above.
Note, however, that in purpose (2) the Dartmouth
investigators were also concerned about the success
of their algorithms on a given expression if it were
unnecessarily complicated. The concern was not as
significant in the design of FORMAC algorithms,
but then the FORMAC design goals did not include
symbolic mathematical manipulations for which
complete algorithms do not exist.
The AUTSIM algorithm itself makes use of the applicability of purpose (3). Collection of like terms in a sum or like factors in a product is dependent upon the assumption that such terms or factors will be in a nearly identical form.
Since AUTSIM is at the iterative heart of the
FORMAC object time system, it is important to
consider just how fast the algorithm is.
An example is provided in reference 11. The
FORMAC system was used to generate expressions
of interest to the astronomer. The computer (IBM
7094) required 18.67 minutes to generate the first
27 iterates. It is estimated that doing this work by hand with any reliability would require 60 years.
AUTSIM is called 2,975 times during execution of
this program. Note, this does not mean that every
time AUTSIM was called it changed the input
expression. If the expression was already simplified
no simplification was performed. This also holds
for subexpressions which have been simplified previously (see Fig. 1). No doubt, many of the calls in
the above example resulted in very little actual simplification. But this is a mark of a well-designed
simplification algorithm; it must not perform redundant simplifications.
We have seen that an automatic simplification
algorithm is essential to the successful operation of
the FORMAC system. Moreover, once the decision
has been made to include an automatic simplification algorithm in a formula manipulating system,
the design of the other object time algorithms is
simplified in two ways: one can make definite assumptions concerning the form of expressions
which are to be manipulated; and, the form of the
manipulated result need only be mathematically
correct (it may contain redundant or unnecessary
subexpressions which the automatic simplification
routines will remove).
SIMPLIFICATION TRANSFORMATIONS
As intuitively obvious as the need for it may be,
simplification is a difficult class of expression
transformations to define. Even to a human engaged in the manipUlation of complicated or
lengthy expressions, it is frequently not obvious
which transformations constitute actual simplifications of an expression. The confusion is compounded by differences of opinion. W00idridge10 acknowledges this aspect of the problem by referring to his
program as one "which performs 'obvious' (noncontroversial) simplifying transformations." Confusion may also arise from the failure to make a distinction between "simplified" form and "intelligible" form. Frequently, these are not equivalent.
Consider the expression

(1/(2AC)) √(DE(2DF + C))

It is in some sense simplified, yet for the engineer it may be more intelligible in the form:

(D/(2AC)) √(2E(F + C/(2D))) = (1/(2Aφ)) √(2E(F + φ/2))

with φ = C/D. An engineer may spend weeks
or months massaging a simplified mathematical
expression. His goal is to arrange the expression so
that the relationship between the crucial variables
becomes transparent or intelligible to the human
observer. The adequacy of the result is highly dependent upon human perception of mathematical
relationships. Yet the expression is already simplified. Like terms and factors have been collected in
sums and products; evaluation of strictly numerical
elements has been performed; various redundant or
extraneous elements have been removed. Although
we have no better definition to offer of intelligible form and simplified form, we maintain that the distinction is significant. It is well to note that even though the FORMAC simplification algorithm is central to the operation of the FORMAC system, code (written in the FORMAC language) designed to reduce expressions to a particular intelligible form is imbedded in many FORMAC user programs.

[Figure 1. Flow diagram for AUTSIM subroutine. The scan initializes pointers to the expression; for each wff it either updates the pointers so as to skip a wff that is already simplified, or performs the transformation (calling LEXICO if necessary), sets the AABITS, and updates the pointers so as to simplify the next wff.]
If the mathematical context within which simplification is to be performed is suitably uncomplicated,
there exists a canonical form for all permissible
expressions. For example, there is a canonical form
for polynomials in n variables. The existence of
such a form can greatly simplify the design of a
simplification algorithm. 5 In polynomial manipulation
systems,12,13 simplification is simply the reduction of
polynomials to canonical form. The problem of
designing a simplification algorithm becomes that of
reducing expressions to the canonical form. However, in FORMAC it is possible to generate expressions which when simplified are equivalent but not
identical. The fact that, in FORMAC, expansion of
expressions is performed only under user option
indicates one way in which this can occur. FORMAC
will not expand the expression (a + b)**2 to yield a**2 + 2*a*b + b**2, unless such expansion is specifically
requested by the programmer. A second example of equivalent expressions is provided by the equation

tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
The automatic application of expansion to all expressions, or the automatic replacement of tanh(x) by its equivalent, would make possible the reduction to a canonical form in these two cases. However, the intelligibility of the expressions being manipulated would be greatly impaired. Indeed, the automatic expansion of expressions would frequently produce the opposite result from that desired by the user. As a result, the FORMAC simplification transformations establish at best a pseudocanonical (p-canonical) form. As will become evident, this p-canonical form makes additional simplifications possible. It also establishes a structural context which can be assumed for all automatically simplified ("autsimmed") expressions; thus it reduces the complexity of the other FORMAC expression manipulation algorithms.

Expression transformations that are candidates for inclusion in a simplification algorithm can be categorized in many ways. There are transformations that contribute directly to the establishment of a p-canonical form. Several transformations embody basic mathematical laws such as the associative, commutative, and distributive laws. There are transformations which are "naturals" and transformations which should be placed under programmer option. Still others employ basic arithmetic or functional identities. AUTSIM performs transformations that fall into each of these categories. It should be noted, however, that no attempt was made to incorporate trigonometric identities in the FORMAC simplification process. Since the categories indicated above are not mutually exclusive, there is no need to discuss them all. Our discussion of simplification transformations is partitioned as follows: natural transformations, transformations which apply the distributive law, transformations which embody associativity and commutativity, and mathematically undefinable expressions.

In the paragraphs which follow, the mathematical transformations are defined in FORTRAN notation. Letters from the beginning of the alphabet represent arbitrary mathematical expressions. As such, they are often referred to as "well-formed (sub)formulas," or "wffs." The transformations which AUTSIM performs are labelled with letters of the alphabet.

Natural Transformations

Consider the following transformations:
(a) 0**A → 0 (A ≠ 0)
(b) 1**A → 1
(c) A**0 → 1 (A ≠ 0)
(d) A**1 → A
(e) (-A)**N → -A**N if N is an odd integer; A**N if N is an even integer
(f) -(-A) → A
(g) EXP(LOG(A)) → A
(h) LOG(EXP(A)) → A
(i) -(3*A*(-B)*C*(-D)) → (-3)*A*B*C*D
(j) Σ Aj (j = 1, ..., n) → Σ Aj (j = 1, ..., n; j ≠ k), where Ak = 0; and Π Bj (j = 1, ..., m) → Π Bj (j = 1, ..., m; j ≠ k), where Bk = 1
(k) Π Bj (j = 1, ..., m) → 0, where there exists k such that Bk = 0
These are "naturals." They are transformations
which one usually performs automatically when manipulating a mathematical expression. FORMAC
also performs these transformations automatically.
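A minimal Python sketch of these natural transformations, assuming a nested-tuple representation of expressions (('+', ...), ('*', ...), ('**', base, exp), ('-', a) for unary minus), can make the idea concrete. The representation and function name are ours for illustration, not part of FORMAC:

```python
# Sketch of natural transformations (a)-(d), (f), (j), (k), applied bottom-up.
# Numbers stand for numeric constants; strings for symbolic variables.
def natural(e):
    if not isinstance(e, tuple):
        return e
    op, *args = e
    args = [natural(a) for a in args]

    if op == '**':
        base, exp = args
        if base == 0 and exp != 0:          # (a) 0**A -> 0  (A != 0)
            return 0
        if base == 1:                        # (b) 1**A -> 1
            return 1
        if exp == 0 and base != 0:           # (c) A**0 -> 1  (A != 0)
            return 1
        if exp == 1:                         # (d) A**1 -> A
            return base
        return ('**', base, exp)             # note: 0**0 matches no guard, left intact

    if op == '-':                            # (f) -(-A) -> A
        (a,) = args
        if isinstance(a, tuple) and a[0] == '-':
            return a[1]
        return ('-', a)

    if op == '+':                            # (j) drop zero terms of a sum
        args = [a for a in args if a != 0]
        if not args:
            return 0
        return args[0] if len(args) == 1 else ('+', *args)

    if op == '*':
        if any(a == 0 for a in args):        # (k) a zero factor kills the product
            return 0
        args = [a for a in args if a != 1]   # (j) drop unit factors
        if not args:
            return 1
        return args[0] if len(args) == 1 else ('*', *args)

    return (op, *args)
```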
A less clear-cut but natural type of transformation is the evaluation of nonarithmetic operators with constant arguments. For example,

X + SIN(1.4) → X + 0.98545

or

Y/FAC(5) → Y/120
However, in some applications it is desirable to replace such expression elements with the proper
numeric value; in others, it is not. The FORMAC
solution to this quandary is to give the programmer
control over the automatic evaluation of these
expression elements. He has four options from
which to choose: (1) all functions are automatically evaluated, (2) only the integer-valued functions
(FAC, DFC, and COMB) are evaluated, (3) only
the transcendental functions (* *, EXP, LOG, SIN,
COS, ATAN, TANH) are evaluated, or (4) no
functions are evaluated. The first option is the default option.
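These four options amount to a constant-folding switch. The Python below is our illustration; the option names and dictionary layout are assumptions, not FORMAC syntax (DFAC and COMB are omitted for brevity):

```python
import math

# Hypothetical sketch of the four FORMAC evaluation options for functions
# with constant arguments.
INTEGER_FNS = {'FAC': math.factorial}
TRANSCENDENTAL_FNS = {'SIN': math.sin, 'COS': math.cos,
                      'ATAN': math.atan, 'TANH': math.tanh,
                      'EXP': math.exp, 'LOG': math.log}

def eval_constants(fn, arg, option='ALL'):
    """Fold fn(arg) to a number when the chosen option permits it.

    option: 'ALL', 'INTEGER', 'TRANSCENDENTAL', or 'NONE' ('ALL' is default,
    as in the text)."""
    enabled = {'ALL': {**INTEGER_FNS, **TRANSCENDENTAL_FNS},
               'INTEGER': INTEGER_FNS,
               'TRANSCENDENTAL': TRANSCENDENTAL_FNS,
               'NONE': {}}[option]
    if isinstance(arg, (int, float)) and fn in enabled:
        return enabled[fn](arg)
    return (fn, arg)   # left symbolic
```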
Transformations that Apply the Distributive Law
Two types of "simplification" utilize the distributive law; these are expansion of a product of
sums and primitive factoring. Some examples follow:
1. A*(B + C) → A*B + A*C
2. A*(B + C)*(B - C) + A*C**2 → A*B**2 - A*B*C + A*B*C - A*C**2 + A*C**2 → A*B**2
3. (B + C)**3 → B**3 + 3*B**2*C + 3*B*C**2 + C**3
4. A*X + B*X + C*E*X + D*E → (A + B + C*E)*X + D*E
(1) is a simple example of expansion. (2) is an
expression which requires expansion as an intermediate step toward complete simplification. (3)
illustrates multinomial expansion. (4) is an example of primitive factoring (the coefficient of a single variable, X, has been "factored" out of part of
the expression).
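Expansion of a product of sums, as in example (1), is a direct application of the distributive law. A brief Python sketch (ours: a sum is a list of terms, a term a tuple of factors) shows the mechanism:

```python
from itertools import product

# Distributive expansion: every choice of one term from each sum becomes
# one raw term of the expanded result.
def expand(sums):
    return [tuple(f for term in combo for f in term)
            for combo in product(*sums)]
```

For (B + C)**3, expanding three copies of the two-term sum yields 2**3 = 8 raw terms, which collection of like terms then reduces to the four terms of example (3).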
Neither expansion nor factoring should be applied
automatically by a programming system. As in example (2), expansion may, for a given expression,
provide the key for further simplification. However,
there are expressions for which it inhibits simplification. Consider

(A + B + SIN(X)) * (A - B) = A**2 - B**2 + A*SIN(X) - B*SIN(X).
If this expression is divided by A + B + SIN(X),
only in the expanded form will cancellation occur
automatically, since FORMAC cancels explicit factors but does not perform factorization to uncover
them. Since only the global context of the expression
manipulation will in general indicate if expansion or
coefficient gathering will lead to desirable results,
these transformations are not included among the
transformations performed by AUTSIM. The FORMAC commands, EXPAND and COEFF, provide
these transformations under programmer option.
Transformations that Embody Associativity and
Commutativity
The associative and commutative laws for + and *
contribute in a fundamental way to the behavior of
mathematical expressions. A basic design goal for
any simplification algorithm (for a mathematical
structure for which these laws hold) must be to incorporate these laws as naturally as possible. As
shall be obvious in a moment, these laws have implications for the internal representation of FORMAC expressions, the FORMAC mathematical operator set, and the p-canonical form for expressions
-not to mention the simplification transformations.
The associative and commutative laws are assumed to hold for any expression which is generated or manipulated in FORMAC. There are three
transformations included in the FORMAC system
that establish associativity and prepare the way for
the sorting of operands. Such sorting implements
the FORMAC assumption that all expressions are
commutative. These three transformations which
affect the structure of the internal FORMAC p-canonical form, are listed below under (1) and (2)
and in the next paragraph under (script l). (Note:
(1) and (2) are not performed by AUTSIM.)
1. A - B → A + (-B).
2. A/B → A*B**(-1).
Transformations (1) and (2) are analogous and
accomplish analogous results. The binary inverse
operators - and / are replaced in one instance by a unary - and in the other by a binary **. Under transformation (1), the expression A + B - C + D - E becomes A + B + (-C) + D + (-E); under (2), A*B/C*D/E becomes A*B*C**(-1)*D*E**(-1).
The net effect of both transformations is the same,
however. The scope of the main operator (+ or *)
is made explicit in the expressions and, hence, so is
the assumption of associativity. This is an important
characteristic of the FORMAC p-canonical form.
Moreover, commutativity of either operator can now be realized simply by rearranging the well-formed formulas (operands) of the operator. These
two transformations are performed in FORMAC by
the expression translator, which takes expressions
written in FORTRAN infix form and translates them
to the internal FORMAC form.
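Transformations (1) and (2) amount to a small rewrite pass, sketched here in Python over a nested-tuple form (ours, not the FORMAC translator's internal representation):

```python
# Translator rewrites (1) and (2): binary - and / are removed so that
# + and * become the only n-ary operators.
def remove_inverses(e):
    if not isinstance(e, tuple):
        return e
    op, *args = e
    args = [remove_inverses(a) for a in args]
    if op == '-' and len(args) == 2:          # (1) A - B -> A + (-B)
        return ('+', args[0], ('neg', args[1]))
    if op == '/':                             # (2) A/B -> A*B**(-1)
        return ('*', args[0], ('**', args[1], -1))
    return (op, *args)
```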
(l) Θ Am (m = 1, ..., p) → Θ Bk (k = 1, ..., q)

where Ai = Θ Cm (m = 1, ..., s), q = p + s - 1, B1 = A1, ..., Bi-1 = Ai-1, Bi = C1, ..., Bi+s-1 = Cs, Bi+s = Ai+1, ..., Bq = Ap, and Θ is either Σ or Π.

The transformation (l) must be included in the automatic simplification algorithm. It is not sufficient to include it only in the expression translator. The FORMAC substitution capability may substitute a sum for a single operand of a sum. Since both expressions are already in internal form, the transformation (l) must be applied by AUTSIM in order to maintain the p-canonical form.

An important consequence of associativity and commutativity is the fact that like terms in a sum and like factors in a product may be combined. This is accomplished in FORMAC by imposing a specific linear order on the operands of these commutative operators.

The linear order is designed so that operands which can combine are equal. Hence, collection of like operands is accomplished by a twofold process: the operands are lexicographically ordered, and those operands which will cancel are combined as they are sorted together.

That portion of AUTSIM which orders and combines operands is called LEXICO (lexicographic ordering). This routine is distinct from AUTSIM proper, because the scan required for sorting is noticeably different from that needed to perform the operator-operator transformations that make up the bulk of the AUTSIM routine. Lexicographic sorting accomplishes two things. In addition to the collection of like factors in a product and like terms in a sum, it contributes to the establishment and maintenance of the p-canonical form.

Examples of the type of combining which one might like to perform follow:

A - A → 0
5*A + (-5)*A → 0
5*A + (-2)*A → 3*A
(A + B)*C + (-A + B)*C → 2*B*C

Notice that all of these involve "factoring," the inverse of expansion. In general, they can be represented by A1*B + A2*B + ... + An*B → (A1 + ... + An)*B. While it would seem optimal to apply such a "factoring" operation in all cases, the following difficulties arise if no restrictions are placed on the Ai:

1. In general, factoring of an arbitrary sum (A1 + ... + An) would require recursive use of AUTSIM and LEXICO, since it would create a new sum of arbitrary characteristics which would have to be simplified and sorted. This is not a major difficulty, but we hoped to avoid such recursion.
2. Finding B might be difficult if both B and Ai
are products. Consider how to factor something like:
A1*B1*B2*A4 + B1*A2*B2*A4 + B1*A2*A3*B2 → (A1*A4 + A2*A4 + A2*A3)*B1*B2.
The coefficient of a term like BI *A2*B2* A4 is quite
ambiguous when it stands alone. For example, does
it represent B1 occurrences of A2*B2*A4, or B1*A2*A4 occurrences of B2? If one is told that the Ai are
parameters and the Bi are variables, then one can
say that the term represents A2*A4 occurrences of B1*B2. However, LEXICO does not have such information. It would have to determine what the symbolic coefficient of a term is by considering all the
terms in the sum. This could be done by obtaining
the greatest common divisor of all the terms. The
symbolic coefficient of a term would then be taken
to be the result obtained by dividing the term by the
greatest common divisor. Thus, LEXICO could perform the transformation illustrated above. However,
the resulting expression would often not be in the
form desired by the FORMAC user.
3. Consider the expression

(A + B)*(C + D) + (A - B)*(C + D) + (A + B)*(C - D),
in which combining of like terms could produce
either 2*A*(C + D) + (A + B) * (C - D) or
(A + B) *2*C + (A - B) * (C + D).
Clearly, the major difficulty of cancellation in
sums is the determination of which expressions are
"coefficients." The difficulties which can arise in
the case when "coefficient" is a partially ambiguous
concept are pointed out in (2), while (3) is an example of a case in which it is completely ambiguous as to which expressions are coefficients and
which are not. In order to eliminate such ambiguities while still retaining a useful capability, cancellation was limited to combining numeric coefficients of like symbolic expressions.
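The restricted rule, combining only numeric coefficients of like symbolic expressions, can be sketched as follows (Python, our representation: a sum is a list of (coefficient, term) pairs, a term a tuple of symbolic factors):

```python
from collections import Counter

# LEXICO-style collection limited, as in FORMAC, to numeric coefficients of
# like symbolic terms; sorting each term's factors realizes commutativity.
def collect(sum_terms):
    totals = Counter()
    for coeff, term in sum_terms:
        totals[tuple(sorted(term))] += coeff
    return sorted((term, c) for term, c in totals.items() if c != 0)
```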
In the case of cancellation in products, we wish
to combine exponents. Notation removes nearly all
of the ambiguity corresponding to the coefficient
difficulty in sums, since base and exponent are
readily distinguishable in most cases. However,
problems can still arise; for example,
(A**B)**C * A**D * A**(B*E) * (A**C)**B
= (A**B)**C * A**(D + B*E) * (A**C)**B
= (A**B)**(C + E) * A**D * (A**C)**B
= (A**B)**(2*C + E) * A**D
This ambiguity is resolved by the presence of a transformation in AUTSIM which establishes the p-canonical form for nested exponents, so that

(...((A1**A2)**A3) ... **An) → A1**(A2*A3* ... *An).

(m) (A**B)**C → A**(B*C)
Because transformation (m) is performed, LEXICO can combine all factors with identical bases,
giving a powerful cancellation ability. The expression above is reduced through a sequence of transformations to
A**(2*B*C + B*E + D).
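Once (m) has flattened nested exponents, combining factors with a common base reduces to adding exponent terms. A Python sketch (ours; exponent terms are kept as opaque strings for illustration):

```python
from collections import defaultdict

# After transformation (m), every factor is a (base, exponent-terms) pair;
# exponents of equal bases are simply pooled.
def combine_factors(factors):
    exps = defaultdict(list)
    for base, exp_terms in factors:
        exps[base].extend(exp_terms)
    return {base: sorted(terms) for base, terms in exps.items()}
```

Applied to the four factors of the example above, it pools exponent terms B*C, B*C, B*E, D for base A, i.e. A**(2*B*C + B*E + D).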
While performing cancellation in products by combining exponents, LEXICO may create sums which
had not previously existed, and which must be
simplified. This could have required recursive use
of AUTSIM and LEXICO. If the transformation (LOG(A1) + LOG(A2) + ... + LOG(An) → LOG(A1* ... *An)) were also performed by LEXICO,
these two transformations could lead to an arbitrary
depth of recursion. Hence, to avoid recursion, that
LOG transformation was omitted from AUTSIM
and steps were taken to make possible the simplification of the exponent sums by a simple, one-level
use of LEXICO and no use of AUTSIM. Some of
these steps will be described in the discussion of
general flow and techniques of LEXICO. As is often
the case, however, hindsight is much clearer than
foresight. It appears that a recursive LEXICO would
have resulted in both a faster and a more compact
simplification package than the tricks used to avoid
recursion in the experimental IBM 7090/94 FORMAC system.
In addition to transformation (m), there are several transformations included in the FORMAC simplification procedure for one reason: they contribute to the reduction of expressions to a p-canonical form which extends the simplification possible
via combination.
There are three transformations that contribute to the cancellation of like terms in a sum.

(n) -(A + B + C) → -A - B - C
(o) A + LOG(B1* ... *Bs) + C → A + LOG(B1) + ... + LOG(Bs) + C
(p) LOG(A**B) → B*LOG(A)
The last two transformations, (o) and (p), prepare for the collection of logarithmic terms in a sum. For example,

LOG(A) + LOG(B*A**(-2))
→ LOG(A) + LOG(B) + LOG(A**(-2))
→ LOG(A) + LOG(B) + (-2)*LOG(A)

Lexicographic ordering, accompanied by collapsing of like terms, can then produce the simplified result, -LOG(A) + LOG(B). The transformation (n) is a trivial application of the distributive law. It makes possible the reduction A + D - (A + B) → A + D - A - B → (A - A) - B + D → -B + D.
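Transformations (o) and (p) followed by collection can be sketched in Python (ours; each term is a numeric coefficient paired with a LOG argument written as a string, with ^ standing in for ** so the string splits easily):

```python
# (o): LOG(B1*...*Bs) -> LOG(B1) + ... + LOG(Bs)
# (p): LOG(A**B) -> B*LOG(A), for integer B
def split_logs(terms):
    out = []
    for coeff, arg in terms:           # each term is coeff*LOG(arg)
        for factor in arg.split('*'):  # (o) split products
            if '^' in factor:          # (p) pull an integer power out front
                base, power = factor.split('^')
                out.append((coeff * int(power), base))
            else:
                out.append((coeff, factor))
    return out

def collect_logs(terms):
    totals = {}
    for coeff, base in split_logs(terms):
        totals[base] = totals.get(base, 0) + coeff
    return {b: c for b, c in totals.items() if c != 0}
```

For LOG(A) + LOG(B*A**(-2)), the result is coefficient -1 for LOG(A) and 1 for LOG(B), matching the worked example above.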
Two transformations prepare the way for the collection of like factors in a product, transformations (m) and (q).

(q) (A1* ... *An)**C → A1**C* ... *An**C
The FORMAC p-canonical form is further reflected
by these transformations. Factors of a product are
maintained as distinct bases raised to powers; they
are not gathered together as products (of those bases
which have equal exponents) raised to that power.
The form (A1*A2)**B1 * (A2*A3*A4)**B2 * (A1*A4)**(-B1) becomes
A1**B1 * A2**B1 * A2**B2 * A3**B2 * A4**B2 * A1**(-B1) * A4**(-B1).
Then LEXICO can sort and collect like factors,
producing
A2**(B1 + B2) * A3**B2 * A4**(-B1 + B2).
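The two steps above, distributing the exponent and then collecting like factors, can be sketched as follows (an illustrative Python fragment, not the FORMAC implementation; the function names are invented, and numeric exponents stand in for the symbolic B1 and B2):

```python
from collections import OrderedDict

def distribute_exponent(factors, exponent):
    """Transformation (q): (A1*...*An)**C -> A1**C * ... * An**C,
    expressed here as a list of (base, exponent) pairs."""
    return [(base, exponent) for base in factors]

def collect_like_factors(powers):
    """LEXICO-style pass: sort the factors, add exponents of equal
    bases, and drop any factor whose exponents cancel to zero."""
    collected = OrderedDict()
    for base, exponent in sorted(powers):
        collected[base] = collected.get(base, 0) + exponent
    return [(b, e) for b, e in collected.items() if e != 0]

# (A1*A2)**B1 * (A2*A3*A4)**B2 * (A1*A4)**(-B1), with B1 = 1, B2 = 2:
powers = (distribute_exponent(["A1", "A2"], 1)
          + distribute_exponent(["A2", "A3", "A4"], 2)
          + distribute_exponent(["A1", "A4"], -1))
print(collect_like_factors(powers))  # A1 cancels; A2, A3, A4 remain
```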
Mathematically Undefined Expressions
There are three mathematically undefined expressions which may be introduced into FORMAC expressions. These are

1. 0**(-a) (a is a positive number),
2. LOG(0), and
3. 0**0.

PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965
In FORMAC, the first two expressions are evaluated as they would be by FORTRAN; i.e., each is
replaced by 0 and a suitable message written on the
output listing. The third expression is left intact so
that the programmer may substitute a variable or
expression of his own choice for it. These are
neither consistent nor aesthetically pleasing solutions. Alternative approaches to this problem will
be discussed in a later paper. It is the intent of the
authors to publish a subsequent paper that will contain flow diagrams with sufficient detail to simulate
the AUTSIM algorithm. In such a context, it will
be possible to consider adequately the ramifications
of various additions and changes to the AUTSIM
algorithm.
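The conventions just described can be sketched as numeric guards (hypothetical Python, not the FORMAC code; the tuple returned for 0**0 stands in for leaving the expression intact):

```python
import math
import warnings

def formac_power(base, exponent):
    """0**(-a) is undefined: replace by 0 and write a message, as
    FORTRAN would. 0**0 is left intact (returned symbolically) so the
    programmer may later substitute an expression of his own choice."""
    if base == 0 and exponent < 0:
        warnings.warn("0**(-a) is undefined; replaced by 0")
        return 0
    if base == 0 and exponent == 0:
        return ("**", 0, 0)   # left intact as an expression
    return base ** exponent

def formac_log(x):
    """LOG(0) is undefined: replace by 0 and write a message."""
    if x == 0:
        warnings.warn("LOG(0) is undefined; replaced by 0")
        return 0
    return math.log(x)
```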
THE FORMAC SIMPLIFICATION ALGORITHM

This section is an introductory description of the FORMAC automatic simplification algorithm. There are three subsections: FORMAC Internal Expression Representation, Details of the AUTSIM Scan, and The Organization of LEXICO. Note that, for the sake of clarity, the power operator ** will be represented by the symbol "↑" in delimited Polish expressions.

Details of Expression Representation

Any mathematical expression manipulation system must operate on expressions interpretively, since the form of an expression may change constantly as it is manipulated. An efficient internal coding for mathematical expressions is essential. There are two well-known notations for mathematical expressions: the commonly used infix, and classical Polish notation. The unwieldiness of infix notation and the difficulties and inefficiencies it presents for algorithm design are well known. The use of prefix Polish notation overcomes many of these problems, and "Cambridge" Polish, introduced in reference 14, provides space and algorithm economies not provided by classical binary Polish.

"Delimiter Polish" is the form used to encode mathematical expressions in FORMAC. This can be thought of as classical prefix notation, permitting unary, binary, and variary operators. Of the arithmetic operators, only "+" and "*" are variary. The scope or range of a variary operator is defined by a delimiter, "]". Hence, the Polish string + A B C D ] represents the expression A + B + C + D; the Polish string + A * B C ] D 5 ] represents the expression A + B * C + D + 5; and the Polish string + A * B C D 5 ] ] represents the expression A + B * C * D * 5.

The introduction of the variary operators makes meaningful the application of transformation (l) to a Polish string; hence, the implementation of both associativity and commutativity is simplified by this notation. Moreover, fewer symbols are required to represent a sum or product; less space is required for the internal representation of expressions.

The entire FORMAC operator set is displayed in Table 2. The differentiation operator requires a further word of explanation; it is also variary. In delimiter Polish, DIF f x 1 ] represents ∂f/∂x, and DIF f x 2 y 1 z 3 ] represents ∂⁶f/(∂x² ∂y ∂z³).

Details of the AUTSIM Scan

Governing Operators. The key to the basic scan of the AUTSIM routine is the decision process that determines whether a "simplifying" transformation is applicable to that part of the expression currently being scanned. This decision process is based on the concept of governing operators. Every operator acts upon or governs its operands, and each operator (with the exception of the outermost) is governed by a higher-level operator. In the example

* A + B C ] D ]

the outermost operator is the * which governs the + operator. Or, in other words, the + operator is governed by the *. The applicability of most simplification transformations can be determined from a simple transfer table based upon the juxtaposition of governing and governed operators.

Contextual Checking. For some of the transformations it is necessary to do extra contextual checking
AUTOMATIC SIMPLIFICATION IN FORMAC
to eliminate unnecessary or unwanted applications
of the transformation. There are essentially three
types of context which may be checked before
applying a transformation:
1. The first type of checking tests for a specific pattern of operands or operators. For
example, the simplification of ↑ -B K is
only done if K is an integer, but the possible applicability of the transformation is
recognized by the combination of the ↑
and the - sign.
2. A second type of context checking tests to
see if the governed subexpression has already been simplified.
3. The final type of checking tests the system
switches to determine if the transformation
should be performed. The evaluation of
nonarithmetic operators with constant arguments is an example of this type of
checking, since it is done only if the proper switches are set. In some cases the contextual information is discovered before it
is needed. This contextual information is
associated with certain operators such as t,
COMB, and * through flags which indicate
the status of their operands.
Certain transformations are not applied when first
recognized by the transfer table. These transformations are delayed until the governed subexpressions
are simplified. This is done for two reasons.' First the
transformation may disappear when the subexpression is simplified. For example, the transformation
(EXP(A))**B → EXP(A*B) is delayed because A
might be the expression LOG C, in which case
(EXP(LOG C))**B → C**B. This reduction could not
occur if the transformation were not delayed. The
second reason is that an unnecessary intermediate
growth in expression size results if certain size-increasing transformations are not delayed. An example of this type of transformation is (A1 * ... * An)**C → A1**C * ... * An**C. It
is easy to see that the size of the right-hand expression is greater than that of the left-hand one since the
exponent occurs n times. If the product is simplified
before applying the transformation, then the number
of replications of C is less than or equal to the number required for the transformation acting on an unsimplified product.
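The size argument can be made concrete by counting replications of the exponent (illustrative Python; the function names are invented):

```python
def expand_power_of_product(factors, exponent):
    """(A1*...*An)**C -> A1**C * ... * An**C: one copy of the
    exponent C is made for every factor in the product."""
    return [(factor, exponent) for factor in factors]

def distinct_bases(factors):
    """Pre-simplify the product: like factors combine, so only the
    distinct bases remain, each contributing one copy of C."""
    seen = []
    for factor in factors:
        if factor not in seen:
            seen.append(factor)
    return seen

unsimplified = ["A", "A", "B", "A", "B"]          # A*A*B*A*B
print(len(expand_power_of_product(unsimplified, "C")))                 # 5 copies of C
print(len(expand_power_of_product(distinct_bases(unsimplified), "C"))) # 2 copies of C
```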
The AA Bit. The fact that an operator and its operands have been simplified is indicated by the AA
bit (Already AUTSIMed bit). In the delimiter Polish notation the AA bit is indicated by a dot over
the operator. However, constants and delimiters do
not have an AA bit. For example, in the expression

+ A *̇ḂĊ5] ↑A -1]
the subexpression *̇ḂĊ5] is simplified as indicated
by the dot over the *. There are AA bits on B and
C, the operands of the *, since in some instances
the lead * can be removed by a transformation but
the operands are still simplified. If the expression were
LOG *̇ḂĊ5]

then this can be transformed to

+ LOG Ḃ LOG Ċ LOG 5]

and B and C would remain simplified and require
no further simplification.
The AA bit has an additional function. It indicates
that a subexpression has already been scanned for
transformations, so only transformations which involved the lead operator of the expression are still
applicable. The AA bit prevents the scan for transformations from oscillating indefinitely.
The FORMAC command subroutines do not reset the AA bit on expressions they manipulate, unless they actually alter the expression so that it is
no longer simplified. Then, the AA bit is reset on
the expression but not on those wffs of the expression which have not been changed. In this manner,
the entire FORMAC object-time system operates to
minimize redundant simplification.
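The flag's effect on redundant work can be sketched as a memo bit on each node (hypothetical Python; recursion is used here for brevity, whereas AUTSIM itself scans iteratively, as described below):

```python
class Node:
    """Expression node carrying an AA ("Already AUTSIMed") flag."""
    def __init__(self, op, *args):
        self.op, self.args, self.aa = op, list(args), False

SIMPLIFY_CALLS = 0

def autsim(node):
    """Simplify a node, skipping any subtree whose AA bit is set."""
    global SIMPLIFY_CALLS
    if not isinstance(node, Node) or node.aa:
        return node                    # constant, variable, or already done
    SIMPLIFY_CALLS += 1
    node.args = [autsim(arg) for arg in node.args]
    # ... simplifying transformations on node would be applied here ...
    node.aa = True                     # mark the subtree simplified
    return node

expr = Node("+", Node("*", "X", "Y"), 5)
autsim(expr)
first_pass = SIMPLIFY_CALLS
autsim(expr)                           # AA bits short-circuit the rescan
assert SIMPLIFY_CALLS == first_pass    # no redundant simplification
print(first_pass)                      # 2: only the + and * nodes
```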
The Scan Pointers. As the expression is scanned,
the location of the governing and governed operators constantly changes. The current governing and
governed operators are determined by the two scan
pointers P and C. The pointer C points to the symbol which is currently being scanned. The pointer P
points to the operator which governs the current
symbol. Consider an example. Suppose that the scan
is currently looking at the following expression
fragment.
. . . + *̇XY] * ↑Z2W] 5] . . .   (P at the +, C at the first *)
The current symbol is the first * operator. The
AUTSIM transfer table shows that no transformation is done for + * so the scan pointer must be
moved. The AA bit on the first * indicates that no
transformations remain in that operand so that the
scan moves to the next operand. The C pointer is
updated to point to the second operand under the
operator,
. . . + *̇XY] * ↑Z2W] 5] . . .   (P at the +, C at the second *)
Now the current symbol is the second * operator.
As before, no transformation can be applied so the
scan pointer is moved. However, in this case the
product is not simplified so it must be scanned for
possible transformations. Therefore, the scan pointer
moves to the next symbol, the ↑ operator. But the ↑
is governed by the * operator so the pointer P must
be moved to the new governing operator. After the
scan pointer is moved the expression looks like this:
. . . + *̇XY] * ↑Z2W] 5] . . .   (P at the second *, C at the ↑)
The scan pointers continue to move in this fashion.
After a transformation is performed the pointers
must be updated so that they will be properly positioned to continue the scan. At each step in the
scan the current operand of the current governing
operator is simplified before the next operand is
scanned. This method of scanning means that the
current symbol pointer, C, oscillates back and forth
along the delimiter Polish string as the expression
is simplified.
The Push-Down List. In the example above, the
pointer P was moved from the + to the second *
when the second product was being simplified. After that simplification is completed, the C pointer
will have returned to the * and the AA bits will
have been set.
. . . *̇XY] *̇ ↑Z2W] 5] . . .   (C at the second *)
But the P pointer must be reset to the operator
which governs the *, namely the + operator. Therefore, whenever the scan pointer is moved to scan an
operand, it is necessary to save a pointer to the old
governing operator and any flags associated with it.
This information can then be used to reset P to the
+ operator. Since each operand is in itself a well-formed expression, a natural method for saving the
current status of an operator is to make the AUTSIM routine recursive. However, for reasons of efficiency, the AUTSIM algorithm uses a push-down
list (PDL) to save the simplification and scan status. The top entry on the PDL always indicates the
current governing operator so that the P pointer is
just another name for this entry.
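The discipline above amounts to replacing recursion with an explicit stack. A compact sketch (hypothetical Python, not the FORMAC code; nested tuples stand in for the delimiter Polish string):

```python
def scan(expr):
    """Post-order scan of a nested-tuple expression using an explicit
    push-down list (PDL) instead of recursion. Each PDL entry saves the
    governing operator and the resume position; the top entry plays the
    role of the P pointer, while (node, i) plays the role of C."""
    order = []
    pdl = []                     # saved (governing node, next operand) pairs
    node, i = expr, 1
    while True:
        if isinstance(node, tuple) and i < len(node):
            operand = node[i]
            if isinstance(operand, tuple):
                pdl.append((node, i + 1))   # save the old governing operator
                node, i = operand, 1        # descend into the operand
            else:
                order.append(operand)       # atoms need no further work
                i += 1
        else:
            if isinstance(node, tuple):
                order.append(node[0])       # operator handled after operands
            if not pdl:
                return order
            node, i = pdl.pop()             # reset P to the old governor

# X*Y + Z**2 + 5, i.e., the delimiter Polish string + * X Y ] ↑ Z 2 5 ]
print(scan(("+", ("*", "X", "Y"), ("**", "Z", 2), 5)))
```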
LEXICO and Expand. Two major departures are
made from the AUTSIM scan in order to perform
large transformations. The first of these, LEXICO,
is described in detail in the next section. The second major departure is under control of the
EXPFLAG. If it is set on either a * or t operator,
then the expand algorithm is invoked to remove
parentheses.
We have seen that the AUTSIM scan is essentially oscillatory in nature, under the control of a push-down store, a current symbol pointer, and AA bits.
A systems-level flow diagram for AUTSIM is provided in Fig. 1.
The Organization of LEXICO
The general flow of the LEXICO subroutine is
displayed in Fig. 2. There are three major blocks in
LEXICO: the COMPARE routine, the COMBINE
routine, and the CLEANUP section. Although considerations of efficiency led to an intermixing of
the code, the routines are most easily understood as
separate entities. The remainder of this section will
treat each of these blocks. The requirements placed
on each unit, the motivation for imposing these requirements, and the manner in which these requirements are met are discussed.
The COMPARE Routine. Because the internal
FORMAC representation is not designed specifically for sorting, a special sorting list (BWFF) is constructed from the input sum or product which is
being sorted. Each subexpression under the + or *
is handled separately and the sorting list is built up
as each of the sub-wff's is sorted against the subwff's already on the sorting list (see Fig. 2). The
heart of the sort is the routine which compares the
sub-wff being sorted (AWFF) with the ith sub-wff
on the sort list (BWFF(I)). We shall discuss the
criteria for a sorting order and sketch the actual
sorting order employed by LEXICO.
(a) Criteria for a Sorting Order. The description of the comparison routine would be difficult to
understand without a description of the sorting order which it is supposed to implement. The particular sorting order used in LEXICO was chosen to meet a wide range of requirements.

Figure 2. LEXICO flow diagram.
The first requirement of a method of sorting is
that the sorted result be· unique. That is, the result
of the sort does not depend on the original order of
the subexpressions it sorts. In this case, the sort
yields a linear or total ordering. If the comparison
of any two expressions gives a definite ordering for
these two expressions and if this ordering is transitive, then the ordering is total. If this were the only
requirement, expressions, which are represented by
linear strings of symbols, could be sorted by the
left-to-right comparison used in dictionaries. All
that is needed for this technique is a linear ordering
of the possible symbols, similar to that given to the
letters of the alphabet ("alphabetical ordering").
However, LEXICO must "combine" similar
expressions; i.e., it must perform cancellation. This
imposes the requirement that expressions which can
be combined by cancellation should have almost
identical ordering properties, so that, during the
sort, the routine will compare them with one another
and realize that cancellation is possible.
Specifically, the following classes of expressions
must have nearly identical sorting characteristics:

1. explicit products that differ only by a constant numeric coefficient, e.g., * A B C K1 ] and * A B C K2 ], where K1 and K2 are numbers;
2. a nonproduct expression and a constant multiple of that expression, e.g., SIN X and * SIN X K1 ];
3. a negative expression and the product of that expression and -1, e.g., -SIN X and * SIN X -1 ], or * A B C -1 ] and - * A B C ];
4. exponentiated expressions that differ only in their exponents, e.g., ↑ X Y and ↑ X + W Z ], or ↑ + A B COS Y ] 2 and ↑ + A B COS Y ] Q, or EXP + X Y - D ] and EXP ATAN Q;
5. an expression and that expression raised to a power, e.g., SIN + X Y ] and ↑ SIN + X Y ] * A B C ].
Note that since we have no unary operator denoting
the multiplicative inverse, we avoid the problem
analogous to (3) for exponents.
Combining depends upon recognizing these special cases. They involve differences in top-level
structure; the only information required concerning
the lower-level structures is whether or not they are
identical. It would have been possible to design a
routine that checked for these special cases on the
top level, and used a simple dictionary ordering to
compare subexpressions of the terms or factors it
was sorting. For example, if the minus sign (-)
came after all variables in the linear ordering of
symbols, it would be possible to design a routine in
which A and -A sorted almost identically as terms
in a sum (and hence could be canceled) but in
which EXP(A) and EXP(B) sorted closer together
than EXP(A) and EXP(-A). However, it was decided that the sorting order would be based on levels and would be consistent on all levels. That is,
identical functions (sums, products, and exponentiated wffs are considered functions in this sense)
1965
would be compared on the basis of their arguments,
and the arguments would be compared in exactly
the same way as arbitrary terms in a sum or factors
in a product. Thus, to compare (COS (A) and COS
(B), the routine would note that the functions were
the same and then compare the arguments A and B
exactly as it would have if they were terms or factors to be sorted.
The above sorting specifications are clear cut. In
addition to these, however, there were two less
well-defined requirements placed on the sorting order. Since automatic simplification is performed on
all expressions in the system, and in particular on
all expressions that are to be written out, a criterion
of intelligibility was imposed. That is, within the
limits of the other requirements, we felt that the ordering induced by LEXICO should produce output
that is in some sense "understandable" to the programmer. It should produce expressions which are
similar in appearance to ones which the program
user might write, given the limitations of the character set and printing techniques available. To
achieve this goal without sacrificing the requirements of the internal p-canonical form, a double
approach was taken. The sorting order itself was
designed to be as "understandable" as possible, and
a way of modifying it prior to output (the ORDER
command) was included in the system.
The final requirement was that the ordering
should be as "simple" as possible. This served the
double purpose of making it understandable to the
user and as easy to implement as possible. AUTSIM is the most active major subroutine in the
FORMAC system. Every expression created is sent
through AUTSIM at least once, and often several
times. AUTSIM may call LEXICa many times in
the process of simplifying a single expression. Thus
efficiency is a prime consideration in the design of
LEXICO. Any increase in the speed of LEXICO
produces a noticeable increase in the system's performance. In this context, the suggestions made by
Martin (reference 15) may prove quite significant.
(b) The Sorting Order. With these requirements
in mind, we now consider the sorting order. Symbols other than +, *, ↑, and - are linearly ordered in
the following manner. Atomic variables come before all other symbols, and are ordered among
themselves by the numerical value of the core location of their symbol-table word. Since the symbol
table for each routine is placed in core with the
variables in alphabetic order and with arrays in
standard FORTRAN order, this results in variables defined in the same routine being sorted in
straightforward alphabetical order by name, with
array elements in their normal order. However,
variables defined in different routines sort according to the placement of their defining routines in
core, which means the actual order in which variables are sorted is dependent on the order in which
subroutines are loaded.
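The load-order dependence can be illustrated with simulated core addresses (hypothetical Python; alphabetical placement within each table mirrors the description above):

```python
# Hypothetical sketch: FORMAC orders atomic variables by the core
# address of their symbol-table word, so variables in the same routine
# sort alphabetically, while variables from different routines sort by
# the order in which their routines were loaded.

def load_symbol_tables(routines):
    """Assign pseudo core addresses: routines in load order, and each
    routine's variables alphabetically within its own table."""
    address, table = 0, {}
    for _, variables in routines:
        for var in sorted(variables):
            table[var] = address
            address += 1
    return table

# MAIN loaded before SUB1: MAIN's variables sort before SUB1's "AAA".
addresses = load_symbol_tables([("MAIN", ["BETA", "ALPHA"]),
                                ("SUB1", ["AAA"])])
print(sorted(["AAA", "ALPHA", "BETA"], key=addresses.get))
```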
Immediately following the atomic variables in the linear ordering of symbols are the operators, in the order SIN, COS, ATAN, TANH, EXP, LOG, FAC, DFAC, COMB, DIF. This ordering is arbitrary and is determined by the bit patterns assigned to each operator in the internal expression coding. Following the operators are the constants. Constants are ordered among themselves by their numerical value; the algebraically smaller of two constants precedes the larger one. Since all constants in expressions have been converted to the same mode (rational or floating point) by AUTSIM, there is no possibility of comparing constants of differing mode.

As may be noted, the symbols +, *, ↑, and - are not compared with other symbols. Instead they cause the scan routine to compare subexpressions of the expressions AWFF and BWFF(I) and use that result as the final comparison. Only if all indicated comparisons result in identical matches does the fact that a +, *, ↑, or - appeared directly influence the result of the sort. This is done in order to meet the constraint that items which can combine with one another sort together.

Once the sorting order has been specified and understood, the structure of the comparison routine becomes quite clear. The routine represented by the COMPARE diamond in Fig. 2 is basically two-leveled. The lower level is a routine that performs a straightforward comparison of the two expressions it receives as arguments. As output, it indicates the relative ordering of the two expressions. The upper-level routine sends parts of expressions to the lower level, and uses the information returned to determine if expressions can be canceled or combined. It then passes control as indicated in the flow chart.

The COMBINE Routine. The COMBINE routine consists of a basic cancellation routine and an arithmetic routine. Cancellation is accomplished by several small routines that perform the various tasks needed for cancellation in sums and products. Under sums the routine adds the coefficients of the two terms and produces a new term with the sum as coefficient. This is made somewhat more complex by the problem of determining the true coefficients. The routine must act as if explicit coefficients of 1 and -1 occurred instead of implicit occurrences; e.g., as if * A B C ] were * A B C 1 ], -A were * A -1 ], and - * A B C ] were * A B C -1 ].

Other difficulties can arise in finding the constant coefficients of an expression. The numerical coefficients of products appear as the final constant before the delimiter (since all products under sums have been put into canonical form). But, because the expressions are in Polish notation, it is possible that a constant appearing in that position may be the argument of a preceding function. This is difficult to check in the case of rational constants, which may appear in the form . . . K1 ↑ K2 -1 . . . under a product. In this case, given a product ending in the string . . . K1 ↑ K2 -1 ], it is possible that either K1 or both K1 and ↑ K2 -1 are arguments of preceding functions and not coefficients.

If the constants to be combined are rational-mode constants, arithmetic must be performed by special routines which add fractions and produce a reduced fraction. Finally, if the result is either 0, 1, or -1, LEXICO must duplicate the AUTSIM transformations involving such coefficients (e.g., deleting the entire product in the case of 0, or deleting the constant and prefixing the expression with a minus sign in the case of -1). This was done in LEXICO for efficiency, as LEXICO could easily check for such results at the time of the addition, while AUTSIM would have to make an entire rescan. This does, however, result in duplication of code in the two routines as they are currently written. This is an example of the old problem of space efficiency versus time efficiency.
Combination in products requires the addition of
exponents. Since exponents may be arbitrary
expressions, the resulting sums must in turn be sorted and simplified. This is done by permitting the
product cancellation routine to use the rest of LEXICO in a pseudorecursive fashion. Whenever it is
necessary to cancel exponents, the expressions involved are modified so that the first expression (the
one on the product sorting list) is no longer a wellformed formula in internal form. Instead, its expo-
nent is transformed into a sorting list of the type
used for sorting sums. The comparison and sum
cancellation routines are then used to sort the terms
of the second exponent onto the sorting list, performing cancellations where possible. If the exponent is not a sum, then only a single use of the sort
and combine routines is needed. However, if the
exponent is a sum, then the sort and combine section is used once for each wff under the plus, and
the plus is discarded because the sort list is already
a sum. Of course, since one section of the routine is
being called by another section, care must be taken
to preserve and restore the status of all internal
switches and registers that are modified. When all
factors in a product have been sorted and all exponent cancellations have been completed, it is up to
the cleanup section to take the modified expressions
and restore them to normal form. This may involve
further simplification, as the exponent which finally
remains may be either 0 or 1.
The arithmetic routines are called upon to do
almost all the arithmetic required in LEXICO. These
include the routines that are needed for computing
the constant term in a sum and the constant coefficient in a product and the routines for cancellation.
Basically, these routines perform addition and multiplication of floating-point and rational-mode constants. The floating-point package is straightforward,
with standard checks for overflow and underflow.
The decision to implement rational mode was made
after the system design had been frozen. In particular, it was impossible to add any new operators since
many routines had been coded making explicit use
of the originally defined operator set. Thus, it was
necessary to represent fractions in the form * K1 ↑
K2 -1 rather than by a binary rational constant operator, RC K1 K2. This introduced several complications for the LEXICO scan.
The rational-mode arithmetic package is more
complicated than the floating-point package. Multiplication requires two integer multiplications, and
addition requires three multiplications and one addition. The major difficulty arises in controlling
overflow. In the experimental 7090/94 FORMAC,
both the numerator and denominator are limited to
integers less than or equal in magnitude to 2**36 - 1.
Thus, with a moderately large denominator, it is
very easy to have a numerator overflow.
In particular, it is possible to create intermediate
results that require double-precision arithmetic, al-
though the final results are small enough to be represented by single-precision fractions. Therefore,
the rational arithmetic routine contains a double-precision, fixed-point add routine, in addition to a
routine that employs the Euclidean algorithm for
reducing fractions to lowest terms.
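The add-and-reduce step can be sketched as follows (illustrative Python; Python's unbounded integers stand in for the double-precision intermediates of the 7090/94 routine):

```python
def gcd(a, b):
    """Euclidean algorithm, used to reduce fractions to lowest terms."""
    while b:
        a, b = b, a % b
    return abs(a)

LIMIT = 2**36 - 1      # magnitude limit on numerator and denominator

def rat_add(n1, d1, n2, d2):
    """n1/d1 + n2/d2: three multiplications and one addition, then a
    reduction. The unreduced intermediate may overflow a single word
    even when the reduced result fits, hence the double-precision add."""
    num = n1 * d2 + n2 * d1        # may exceed single precision
    den = d1 * d2
    g = gcd(num, den) or 1
    num, den = num // g, den // g
    if abs(num) > LIMIT or abs(den) > LIMIT:
        raise OverflowError("rational result exceeds 2**36 - 1")
    return num, den

print(rat_add(1, 6, 1, 3))   # (1, 2)
```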
The CLEANUP Routine. The cleanup action of
LEXICO converts the sort list form of the expression (BWFF) back to normal FORMAC internal
form. Several transformations are performed in the
process of reconstructing the expression. The order
of the items on the BWFF list is inverted and a
simplified sum or product is created. If the result is
a product, the parity bit is tested to determine if
the product has a negative sign. If parity is negative, the minus sign is included in the constant if one
exists; the minus operator is inserted preceding the
product if there is no constant factor. If the sort
list is empty and there is no constant, then a sum is
replaced by 0 and a produce is replaced by + 1, or 1. If the product or sum has only one argument,
then the operator (* and +) and its delimiter are
not appended and the single wff is returned.
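The degenerate-case rules of the cleanup step can be sketched as (hypothetical Python; the negative_parity argument models the parity bit described above):

```python
def cleanup(op, operands, negative_parity=False):
    """Rebuild a sum ('+') or product ('*') from a sorted operand list.
    Degenerate cases: an empty sum becomes 0; an empty product becomes
    1, or -1 under negative parity; a single operand is returned
    without the operator and its delimiter."""
    if negative_parity and op == '*':
        consts = [x for x in operands if isinstance(x, (int, float))]
        if consts:
            # fold the minus sign into the constant factor
            rest = [x for x in operands if not isinstance(x, (int, float))]
            operands = rest + [-consts[0]]
        elif operands:
            # no constant factor: prefix the product with a minus operator
            return ('-', cleanup(op, operands))
        else:
            return -1
    if not operands:
        return 0 if op == '+' else 1
    if len(operands) == 1:
        return operands[0]
    return (op, *operands)

print(cleanup('*', ['A', 'B', 3], negative_parity=True))  # ('*', 'A', 'B', -3)
```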
SUMMARY
The central role of simplification in the FORMAC
programming system and the general approach
pursued in implementing the FORMAC automatic
simplification algorithm have been described. It
has been shown how the universal application
of associativity, commutativity, and properties
of the additive and multiplicative identity elements
(0 and 1) in conjunction with the establishment
of a p-canonical form can produce "simplified"
expressions. In addition, the need for placing
application of the distributive law under programmer option has been indicated. The basic principles
employed in the organization of the simplification
algorithms (AUTSIM) have been presented in detail. In particular, the role of governing relationships between operators, the need for additional
contextual information, the movement of scan
pointers, and the organization of the sorting routine
(LEXICO) have been indicated. The importance of
an already-simplified flag in completely eliminating
redundant simplification has been stressed.
This paper has introduced the FORMAC approach to the automatic simplification of mathematical expressions. A subsequent paper is planned in
which the simplification algorithm will be presented in complete detail.
REFERENCES
1. J. E. Sammet and E. R. Bond, "Introduction to FORMAC," IEEE Transactions on Electronic Computers, vol. EC-13, no. 4, p. 386 (Aug. 1964).
2. E. Bond et al., "FORMAC-An Experimental FORmula MAnipulation Compiler," Proceedings of the ACM National Conference, August
1964.
3. "FORMAC," SHARE General Program
Library, 7090 R2 IBM 0016, IBM Program Information Department, White Plains, N. Y.
4. E. Bond et al., "Implementation of FORMAC," IBM Technical Report 00.1260 (Mar. 16,
1965).
5. "Symbolic Work on High Speed Computers," Dartmouth Mathematics Project, Project Report No.4 (June 1959).
6. D. J. Edwards, "Symbolic Circuit Analysis
with the 704 Electronic Computer," MIT B.S. Thesis, 1959.
7. S. H. Goldberg, "Solution of an Electrical
Network Using a Digital Computer," MIT Master's
Thesis, 1959.
8. K. Maling, "The LISP Differentiation Demonstration Program," MIT Artificial Intelligence
Project, Memo 10.
9. T. Hart, "Simplify," MIT Artificial Intelligence Project, Memo 27 (1961).
10. D. Wooldridge, "An Algebraic Simplify Program in LISP," Stanford Artificial Intelligence
Project, Memo 11 (Dec. 1963).
11. P. Sconzo, A. R. LeSchack and R. Tobey,
"Symbolic Computation of f and g Series by Computer," Astronomical Journal, vol. 70 (May 1965).
12. W. S. Brown, "The ALPAK System for Nonnumerical Algebra on a Digital Computer," Bell
System Technical Journal, vol. XLII, no. 5 (Sept.
1963).
13. A. R. M. Rom, "Manipulation of Algebraic
Expressions," Communications of the ACM, vol. 4 (Sept. 1961).
14. J. McCarthy et al., "LISP Programmer's
Manual," MIT Press, Mar. 1960.
15. W. A. Martin, "Hash-Coding Functions of a
Complex Variable," MIT Artificial Intelligence
Project, Memo 70 (June 1964).
THE NEW BLOCK DIAGRAM COMPILER FOR SIMULATION
OF SAMPLED-DATA SYSTEMS
B. J. Karafin
Bell Telephone Laboratories, Incorporated
Murray Hill, New Jersey
INTRODUCTION

The block diagram compiler known as BL0DI was put into use at Bell Laboratories in 1959 and reported in the Bell System Technical Journal in 1961.1 The compiler has been completely rewritten to provide substantially increased flexibility. The new program is called BL0DIB.

BL0DI

BL0DI was written to aid in the simulation of sampled-data systems. It accepts as input a description of a sampled-data system block diagram written in the BL0DI language. It produces, as output, a machine-language program which will simulate the described system. The major asset of BL0DI is that the language corresponds very closely to an engineer's block diagram. It is easily learned and used even by people with the most superficial knowledge of computing techniques.

The systems with which the compiler can deal are combinations of sampled-data circuits, i.e., of blocks that accept pulses as input and yield pulses as output. These pulses vary in height, but they all occur at multiples of a fixed clock time. Thus the systems the compiler accepts are digital in the strictest sense of the word. The blocks comprising the system perform the only functions a digital computer can perform, namely, accept a number as an input, operate on it, and produce a number as an output.

Continuous systems that are sufficiently band-limited may also be simulated. However, a sampled-data system must be designed whose output pulses would correspond to the sample values of the desired system. This is easily done if there is a highest system frequency, and if it is practical to have a sampling rate higher than twice this highest frequency. For systems that do not meet these requirements, approximations must be made if the system is to be simulated using BL0DI. Techniques and computer programs2,3 are available at Bell Laboratories which produce efficient sampled-data approximations to a wide class of continuous transfer functions. These programs are even capable of producing punched cards suitable for BL0DI input. The important point here is that BL0DI makes no pretense of being a continuous system simulator. The transformations and possibly approximations necessary in simulating a continuous system with a discrete system are left to the engineer.

It should be clear that BL0DI is not a digital imitation of an analog computer. In its most normal
mode of operation, it permits the user to go directly
from a block diagram of a sampled-data system involving transfer functions, etc., to the object simulation program without explicitly considering the
underlying differential equations. A myriad of programs exist that do try to make the digital computer behave like an analog computer. 4 ,5 These block
diagram compilers are built around an integrator
block; the diagrams they accept are not so much
system block diagrams as they are block representations of analog computer programs. BL0DI is system oriented. In a paper describing BL0DIB
applications,6 simulation of a vocoder is discussed.
In that problem the interest is in system performance and optimization. The system is so large and
so complicated that asking for the differential
equations is an unrealistic question. The system is
simulated using digitized speech as input data. The
output is also digitized speech which is listened to
and subjectively judged.
The BL0DI language is used to describe the system block diagram. Each block of the block diagram is represented by one list, i.e., by one
punched card. (If the description of the block is
lengthy, it may spill over onto more than one card.)
The coder chooses block types from a dictionary of
about 40 types. This dictionary includes such
blocks as delay lines, amplifiers, transversal filters,
rectifiers, cosine generators, function generators,
etc. Table 1 shows the type dictionary. The list for
each block specifies its type, a name assigned by
the coder, parameters associated with it, and the
names of the blocks to which the signal from the
present block is connected. An example of a block-specifying list is
RAISE AMP 13.7, QUANT, SUM/3
The block named RAISE is an amplifier with a
gain of 13.7. It feeds a block called QUANT and
the third terminal of a multi-input-terminal block
called SUM. (The various inputs to multi-input-terminal blocks are specified by a slash followed by
the terminal number.) It should be noted that the
user need specify only one half the connection matrix, namely the outputs. The compiler internally
fills out the matrix and reports unusual circumstances by way of diagnostics.
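In modern notation, the process of filling out the connection matrix can be sketched as follows (the Python rendering, the simplified statement form, and the routine name are ours for illustration, not the compiler's):

```python
def complete_connections(lists):
    """Given block lists of the form (name, type, outputs), where an
    output is either 'NAME' or 'NAME/terminal', return a mapping from
    each block name to its inputs as (source, terminal) pairs."""
    inputs = {name: [] for name, _, _ in lists}
    for name, _, outputs in lists:
        for dest in outputs:
            target, _, term = dest.partition("/")
            terminal = int(term) if term else 1
            inputs[target].append((name, terminal))
    return inputs

# The amplifier RAISE feeds QUANT and terminal 3 of SUM.
diagram = [
    ("RAISE", "AMP", ["QUANT", "SUM/3"]),
    ("QUANT", "QNT", ["SUM/1"]),
    ("SUM",   "ADR", []),
]
print(complete_connections(diagram))
```

A block whose name never appears as a destination would show an empty input list, which is the kind of unusual circumstance a diagnostic could report.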
One of the most pleasant features of the language
is the fact that the order of appearance of the various lists in a circuit description is immaterial. The
blocks may be described in any order; the compiler
analyzes the connection matrix and internally orders
the blocks for processing.
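The reordering itself amounts to a topological sort over the stated connections; a minimal sketch (ours, ignoring the store-and-fetch optimization mentioned later and the breaking of feedback loops through delays):

```python
def processing_order(outputs_of):
    """Order blocks so that every block comes after the blocks that
    feed it, no matter what order the lists were written in.
    `outputs_of` maps a block name to the names it feeds."""
    indegree = {b: 0 for b in outputs_of}
    for b, outs in outputs_of.items():
        for o in outs:
            indegree[o] += 1
    ready = sorted(b for b, d in indegree.items() if d == 0)
    order = []
    while ready:
        b = ready.pop()
        order.append(b)
        for o in outputs_of[b]:      # a block becomes ready once all
            indegree[o] -= 1         # of its sources are processed
            if indegree[o] == 0:
                ready.append(o)
    return order

# The same circuit written in two different orders gives one result.
print(processing_order({"SUM": [], "RAISE": ["QUANT", "SUM"], "QUANT": ["SUM"]}))
```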
Figure 1 is a skeleton form of a PCM television
transmission scheme 7 involving quantization with
error feedback. Consider the BL0DI description of
this system which is given below.
[Figure 1: block diagram of the scheme. An analysis tape feeds a predistortion filter; the filter output passes through an adder to the quantizer; the quantizer noise, the difference between quantizer input and output, is delayed and fed back to the adder; the quantizer output feeds a reconstruction filter whose output y is recorded.]

Figure 1. Transmission scheme involving quantization with error feedback.
INPUT   INP     5,,501,1,,,12,H1
H1      FLT     2,1,-0.502,1.0,SUMM/1
SUMM    ADR     QUANT, N0ISE/2
QUANT   QNT     8,120,200,280, (etc.), H2, N0ISE/1
N0ISE   SUB     FBDEL
FBDEL   DEL     1, F
F       AMP     -0.96,,SUMM/2
H2      ACC     0.502,,0UTPT
0UTPT   0UT     5,,501,1,,,12
END
Each line of the above represents one block of the
system diagram of Fig. 1. For each block the left
hand column contains the name assigned the block,
the center column specifies the type of the block,
and the right column specifies the parameters and
output connections. Strictly speaking, INP and
OUT are not functional blocks. They specify points
in the system where samples are read from and
written onto designated tape units respectively.
END is also not a type; it specifies the end of a circuit description.
In the example we again point out that the process is one of system simulation. Both the input
and output are tapes of digital television picture
data.
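The sample-by-sample character of such a simulation can be suggested by a toy rendering of the error-feedback loop of Fig. 1 (the filters are omitted, the quantizer is a simple rounder, and the sign conventions and Python form are ours; only the feedback gain of 0.96 is taken from the example):

```python
def simulate_error_feedback(samples, step=1.0, gain=0.96):
    """One pass of the Fig. 1 loop without the filters: each input is
    summed with the amplified, delayed quantizing error, quantized,
    and the new error is saved for the next sampling instant."""
    out, error = [], 0.0
    for x in samples:
        v = x + gain * error        # adder: input plus fed-back error
        y = round(v / step) * step  # quantizer: uniform steps
        error = v - y               # subtractor, then unit delay
        out.append(y)
    return out

print(simulate_error_feedback([0.4] * 5))
```

Fed a constant 0.4 with unit steps, the loop alternates between the two nearest levels so that the error averages out, which is the point of the feedback.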
BL0DIB

Using the BL0DI language as a foundation, a new compiler, BL0DIB, has been written. The new compiler provides a language that is more flexible and a programming system that is more complete. Whereas the old compiler offered a fixed source language and produced object programs of one rigid structure (main programs all of whose arguments were explicitly numeric), BL0DIB offers the engineer some of the generalities available to users of general purpose programming systems.

With the new compiler, simulation programs (the result of BL0DIB compilations) are more like the programs produced by other more general compilers. For example, the user may now choose to have the simulation run in either the integer or floating point mode. (The old program ran only in the integer mode.) Furthermore, simulation programs may now have symbolic parameters, the values of which can be supplied at run time and varied to facilitate optimization routines and collections of families of data. Most important is that simulation routines may now communicate with other programs in the computational environment. These features open the way for many new applications, some of which are discussed below.

The source language also has new flexibility and offers new potentialities. The changes impart roughly the same flexibility to BL0DI as MACR0 FAP does to FAP. Along with providing features to make the coding of simulation problems more convenient and less tedious, the new program gives the user the ability to write higher-level, specialized, simulation languages using BL0DI statements as atomic blocks.

The compiler itself has a new structure. BL0DIB is actually a preprocessor coupled with a library of high-level macro definitions. This structure provides general flexibility, lends itself more readily to changes and additions, and is somewhat less dependent on particular monitor systems and machines.

For purposes of discussion, BL0DIB can be thought of as providing the capabilities for system simulation coding that is

• evolutionary,
• compatible, and
• modular.

Each of these categories will be discussed separately.

Evolutionary Programming (SUPERs)
It may happen that several interconnected BL0DI
blocks are required to realize a single functional
block that will be used many times, perhaps with
several different values for some group of parameters. What is called the SUPER facility of BL0DIB
essentially permits the programmer to draw a line
around such a group of blocks, to name it, and to
use it thereafter as if it were a basic block. Figure 2
shows the block diagram of what can be thought of
as a rectangular integrator. To define a SUPER to
realize this function, coding of the following form
is given to the compiler as part of the source description.
INT     MACR0   INC0N
INPUT   MIP     1, INTRV
INTRV   AMP     1.0E-5,,SL0W
SL0W    DEL     1, SUMM/1
VALUE   BAT     INC0N, DELAY, SUB/1
DELAY   DEL     1, SUB/2
SUB     SUB     SUMM/2
SUMM    ADR     RECT
RECT    ACC     1.0,,0UTPT
0UTPT   M0T
END
After defining INT in this way, it can be used as if
it were a basic block type with one parameter (the
initial value). (In the example the integration interval, which corresponds to the sampling period, is
10⁻⁵.)
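The recurrence realized by this SUPER can be written out directly (a sketch in modern notation, with T the integration interval and y0 the initial condition corresponding to INC0N):

```python
def rect_integrate(samples, T=1.0e-5, y0=0.0):
    """Rectangular integration: each output is the previous output
    plus the previous input scaled by the integration interval T,
    so the output at instant k reflects inputs up to instant k-1."""
    y, out = y0, []
    for x in samples:
        out.append(y)
        y = y + T * x
    return out

# Integrating a constant 1 gives the ramp 0, T, 2T, 3T, ...
print(rect_integrate([1.0] * 4, T=1.0e-5))
```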
We digress to point out that although BL0DIB is
not primarily an analog computer simulator, the
preceding definition of a rectangular integrator
shows that the compiler has at least as much power
to tackle analog computer problems as some compilers that are so oriented. An example of such a
compiler is DAS-4, which uses rectangular integration and a fixed time base.
Since a SUPER may be used as a basic block after it has been defined, it can be used in the description of a more complicated SUPER. A trivial
example is shown where the previously defined INT
block is used in the definition of a new super that
[Figure 2: block diagram of the rectangular integrator. The input is scaled and passed through a unit delay; a battery supplies the initial condition through a unit delay and a subtractor; the sum of the two paths is accumulated to form the output.]

Figure 2. A BL0DIB rectangular integrator.
rectangularly integrates the weighted sum of four
inputs.
SIN     MACR0   W1, W2, W3, W4, INC0N
IN1     MIP     1, WGHT1
IN2     MIP     2, WGHT2
IN3     MIP     3, WGHT3
IN4     MIP     4, WGHT4
WGHT1   AMP     W1,,SUM/1
WGHT2   AMP     W2,,SUM/2
WGHT3   AMP     W3,,SUM/3
WGHT4   AMP     W4,,SUM/4
SUM     ADR     INTEG
INTEG   INT     INC0N, 0UT
0UT     M0T
END
In BL0DIB, SUPERs may be nested to an almost arbitrary number of levels. Since each SUPER
is very nearly an independent circuit description, a
BL0DIB program can be thought of as an ordered
sequence of circuit descriptions. Each circuit possibly contains one or more of the previously defined
circuits one or more times. The last circuit is of
course the one to be simulated. Coding can thus be
thought of as an evolutionary process. The programmer builds up a "super-BL0DI" language featuring blocks that are commonly used in his application.
Compatible Programming (Subroutines)
BL0DIB offers the user the facility to write simulation programs that can coexist, communicate,
and interact with other programs in the computational environment. The user is provided with a
statement that closely resembles the F0RTRAN
SUBROUTINE statement. With this facility the
programmer has the ability to write a BL0DI program, i.e., a circuit description, with some parameters, such as amplifier gains or quantizer decision
levels expressed as variables, specifying them as
symbolic names rather than as numbers. At the beginning of the program, the programmer tells the
compiler that what follows is to be a subroutine
with the name he specifies and that the variables in
the list he writes must be supplied when the subroutine is called. A typical opening statement of a
BL0DIB subroutine source program might appear as
TEST    SUBR    (GAIN1, GAIN2, SHIFT),

where TEST is to be the name of the subroutine and
GAIN1, GAIN2, and SHIFT are variables whose
values shall be specified by the calling program, and
which are used somewhere in the circuit description
or standard BL0DI program that follows. Statements involving the variables might appear as
AMP22   AMP     GAIN1, SHIFT, 0UTP3
COUNT   ACC     GAIN2, 2, 0UTP4
The program that is produced is a subroutine structurally identical to subroutines produced by other
compilers such as F0RTRAN. Its parameters may
be varied. It can be loaded along with other programs. It can receive parameter values, and it can
transmit and receive data.
Furthermore, the ability to use floating point arithmetic in simulation programs enhances the compatibility of those programs with numerical analysis
routines, almost all of which are written in the
floating point mode. (It hardly seems necessary to
state that the use of floating point arithmetic almost
completely eliminates the scaling difficulties inherent in many simulation techniques.)
The simplest application of the subroutine feature is the case in which one wants to simulate a
system repeatedly for a range of values of a parameter(s). For this case one codes the simulation program as a subroutine with a variable parameter(s).
The subroutine is then used in conjunction with a
main program, written in some general purpose language, that calls the simulation subroutine with various values for the parameter(s) of interest. The
main program can obtain the parameter values from
some internal array, by reading cards, or perhaps
receiving them from a remote console at which the
engineer is stationed.
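Such a driver might be sketched as follows (the quadratic stand-in for the compiled simulation subroutine, and all of the names, are our illustration only):

```python
def simulation(gain):
    """Stand-in for a BL0DIB-produced subroutine: returns a figure of
    merit for the simulated system at the supplied parameter value."""
    return (gain - 1.3) ** 2  # pretend distortion, smallest near 1.3

# Main program: call the simulation subroutine over a range of gains,
# as the engineer's general purpose driver would, and keep the best.
gains = [1.0 + 0.1 * k for k in range(8)]
results = {g: simulation(g) for g in gains}
best = min(results, key=results.get)
print(best)
```

The same driver could just as well read its parameter values from cards or receive them from a remote console.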
Iterative system design schemes can be implemented using the subroutine facility. Situations arise
where one has to optimize a group of parameters
associated with a complicated system. The solution
of such a problem might involve the following steps:
1. Make an initial guess of values of the variables.
2. Simulate the system using those values.
3. Use the results of the simulation to calculate new values for the variables.
4. Return to step (2) unless the new values
are within a given neighborhood of the last
values.
To implement such a procedure one writes a simulation subroutine and loads it along with the necessary analysis and design programs. A main program will probably also be necessary to handle control.
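The four steps amount to a fixed-point iteration; a hedged sketch, with a toy stand-in for the simulation subroutine:

```python
def simulate(value):
    """Stand-in for the simulation subroutine: measures the system at
    the trial parameter and returns an improved value for it."""
    return 0.5 * (value + 2.0 / value)  # toy update; settles at sqrt(2)

value = 1.0                      # step 1: initial guess
while True:
    new = simulate(value)        # step 2: simulate with those values
    if abs(new - value) < 1e-12: # step 4: within the given neighborhood?
        break
    value = new                  # step 3: adopt the new values
print(value)
```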
Unfortunately, many problems of parameter optimization require human intervention commonly
known as eyeballing and knob-twiddling. This
points up another reason why it is so important for
the simulation program to have the ability to communicate with other routines. The era is almost
upon us in which men at remote stations will have
the capability to interact with the computer. Simulation seems supremely suited for man-machine interaction.
Another use of the subroutine feature is to allow
simulation programs to function as analysis programs. At Bell Telephone Laboratories, a program
used to design sampled-data filters presents, as
part of its output, a plot of the impulse response of
the designed filter. Instead of using residue calculations based on the continuous analog of the filter,
the response was obtained from a BL0DIB-produced
subroutine which simulates the actual sampled-data
filter. The circuit is excited by a single pulse and the
results of the simulation are plotted and presented
as part of the output.
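The technique is simply to drive the simulated filter with a single pulse and record the output; sketched here with a one-pole recursion standing in for the BL0DIB-produced subroutine:

```python
def one_pole(samples, a=0.5):
    """Stand-in for a simulated sampled-data filter:
    y_k = x_k + a * y_(k-1)."""
    y, out = 0.0, []
    for x in samples:
        y = x + a * y
        out.append(y)
    return out

# Excite the circuit with a single pulse; the output samples are the
# impulse response (for this filter, the geometric sequence a**k).
impulse = [1.0] + [0.0] * 7
response = one_pole(impulse)
print(response)
```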
To sum up this discussion, simulation programs
can now be written to be compatible with other
kinds of programs, opening the way for a great
many diverse functions.
Modular Programming
This feature allows the user to write simulation
subroutines in the BL0DI language that structurally
resemble simulation blocks. This facility allows
what can be thought of as modular programming. A
basic simulation program can be thought of as a
piece of hardware with sockets. Various subsystems
are coded, and the main system is run with a selected group of subsystems plugged into the sockets.
Among other things, modular programming allows the user to effect structural changes. Suppose
there is a very complicated system to be simulated,
and suppose there is a block whose function one
would like to change for each of three simulations.
Perhaps one would have this block a transversal filter for one simulation, a quantizer for a second, and
perhaps nothing, a wire, for a third. In the main
circuit description the block in question might have
a type called M0D. Three modules would then be
coded, each of type M0D, one a filter, one a quantizer and one a wire. One then runs three different
simulations, loading a different module each time.
An important point here is that the main system
need be compiled only once. This main piece of
software is never changed. New elements are merely
plugged into the sockets to test various design
schemes.
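The socket idea can be suggested by passing the module in as a callable (our sketch only; in BL0DIB the module is a separately compiled circuit of type M0D loaded along with the main program):

```python
def main_system(samples, module):
    """Main simulation, compiled once; `module` is whatever block has
    been plugged into the socket for a particular run."""
    return [module(2.0 * x) for x in samples]  # fixed part, then socket

transversal = lambda x: 0.25 * x        # a (trivial) filter module
quantizer   = lambda x: float(round(x)) # a quantizer module
wire        = lambda x: x               # "perhaps nothing, a wire"

data = [0.1, 0.7, 1.2]
print(main_system(data, wire))
print(main_system(data, quantizer))
```

Three runs with three modules test three design schemes without touching the main system.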
Modular programming makes it possible for one
to build a library of simulation modules in much
the same way as a library of numerical function subroutines is built. The feature aids in producing
complete simulation systems for major projects such
as, for example, vocoder research.
CONCLUSIONS
BL0DI has proven to be an efficient and easy to
use tool for the simulation of a class of systems
that:
1. involve a relatively smooth flow of data
(signal) into and out of the system, i.e.,
the system does not process samples in a
complicated order;
2. can be represented in terms of the pulse
circuitry that has been described;
3. and that, once depicted as digital system
block diagrams, contain blocks that usually
change state at each sampling instant.
For this class of problems the compiler produces
machine language programs that are as efficient as
those produced by professional programmers. (For
example, in the course of the algorithm for ordering
the blocks for processing, the compiler searches the
connection matrix, attempting to save store and
fetch operations.)
A new program, BL0DIB, has been written and
described, which is based on the BL0DI language.
BL0DIB is coded for the IBM 7094 II under the
control of the Bell System VII monitor and I/O
system. To use the system under another monitor
would involve only changes in the I/O and interaction with the macro assembler. Changes in the I/O
are identical to those that were necessary for the
original BL0DI, which were carried out successfully at many computer (7090) installations. This new
program offers many new features but suffers no
degradation in object program efficiency or ease of
usage. The new features change the nature of the
compiler from a language translator to a full programming system. The new program is particularly
suited for
1. building higher-level BL0DI languages by
defining block types from combinations of
blocks,
2. iterative simulation procedures,
3. coding of large, flexible, modular, simulation systems, and
4. the use of simulation programs as subroutines for more complex procedures.
ACKNOWLEDGMENT
We wish to thank R. H. Roth and D. E. Eastwood, with whom many valuable discussions concerning BL0DIB were held, G. J. Spivak, who
helped with the time-consuming task of writing the
1965
compiler, and M. Karnaugh who suggested and encouraged the entire project.
REFERENCES
1. J. L. Kelly, Jr., C. Lochbaum and V. A. Vyssotsky, "A Block Diagram Compiler," B.S.T.J., vol.
40, no. 3, pp. 669-76 (May 1961).
2. J. F. Kaiser, "Design Methods for Sampled-Data Filters," Proc. First Allerton Conference on
Circuit and System Theory, Nov. 1963, Monticello,
Ill.
3. R. M. Golden and J. F. Kaiser, "Design of
Wideband Sampled-Data Filters," B.S.T.J., vol. 43,
no. 4, part 2, pp. 1533-46 (July 1964).
4. R. A. Gaskill, "A Versatile Program-Oriented
Language for Engineers," IEEE Transactions on
Electronic Computers, vol. EC-13, no. 4, pp. 415-21 (Aug. 1964).
5. R. D. Brennan and R. N. Linebarger, "A Survey of Digital Simulation: Digital Analog Simulator Programs," Simulation, vol. 3, no. 6, pp. 23-36
(Dec. 1964).
6. R. M. Golden, "Digital Computer Simulation
of Communication Systems Using the Block Diagram Compiler: BL0DIB," Proc. Third Annual
Allerton Conference on Circuit and System Theory,
Oct. 1965, Monticello, Ill.
7. C. C. Cutler, "Transmission Systems Employing Quantization," U.S. Patent No. 2,927,962, Mar.
8, 1960 (filed Apr. 26, 1954).
Table 1. BL0DIB Type Dictionary.

Block Type   Function                  Output

DEL          Fixed Delay               y_k = x_(k-p1)
VLD          Variable Delay            y_k = x1_(k-x2_k)
ACC          Accumulation              y_k = x_k + p1 y_(k-1)
FLT          Transversal Filter        y_k = sum from l=0 to p1-2 of p_(p1-l) x_(k-l p2)
AMP          Amplifier                 y = p1 x
ADR          Adder                     y = sum from i=1 to 4 of x^i
SUB          Subtractor                y = x1 - x2
MAX          Maximum                   y = max over i=1,...,4 of x^i
MIN          Minimum                   y = min over i=1,...,4 of x^i
MPR          Multiplier                y = x1 x2
DIV          Divider                   y = x1/x2
CLP          Positive Clipper
CLN          Negative Clipper
SCL          Symmetric Clipper
FWR          Full Wave Rectifier
BAT          Battery                   y = x + p1
COS          Cosine Generator          y_k = p3 cos(2 pi k/p1 + p2)
GEN          Function Generator        y = p_i; i = 2, ..., p1; the sequence of parameters
                                       repeated cyclically from second to last.
WNG          Noise Generator           Pseudo-random noise from a Gaussian distribution
                                       with standard deviation p1.
QNT          Quantizer                 p1 is the number of representative levels. The
                                       representative levels are p2, p4, ..., p_(2p1). The
                                       decision levels are p3, p5, ..., p_(2p1-1).
LQT          Linear Quantizer          y = (x/p1, rounded) p1
SQT          Square Root               y = sqrt(x)
SMP          Sampler                   y = x for one sample out of p2, p3 otherwise;
                                       p1 specifies an initial phase.
HLD          Sample and Hold           y_k = x1_l, where x2_l was the last control sample
                                       to exceed the threshold p1.
CNT          Counter                   Output is an active level every nth time the input
                                       exceeds a threshold and a passive level the
                                       remainder of the time.
DTS          Double Throw Switch       y = x1 if x3 > p1; x2 otherwise.
FLF          Flip-flop                 Output has a low state until the input exceeds an
                                       upper threshold. An upper state output is then
                                       maintained until the input is below a lower
                                       threshold, etc.
PLS          Pulser                    If the input exceeds the trigger level, the output
                                       is the pulse level for the next d samples, where d
                                       is a parameter for pulse length. The output is a
                                       quiescent level if none of the last (d-1) samples
                                       exceeded the trigger level.
PRT          Printer                   Lists up to three input signals under control of a
                                       fourth signal and a threshold.
INP          Input                     Reads data from tape for input.
OUT          Output                    Writes data on tape for output.
BOF          Output Buffer             Writes output samples into a buffer area in the
                                       computer memory.
PIP          Integer to floating point conversion
P0P          Floating point to integer conversion
MIC          Microfilm Plotter         Plots up to three signals on a linear plot under
                                       control of a fourth input and a threshold.
MIP          Input to SUPER
M0T          Output from SUPER

In the above, y_k = y(kT), where T is the sampling period, and x^i_k = x^i(kT), where x^i(t) is the ith input to the block. The superscript is missing for blocks with only one input, while the subscripts are neglected for blocks with no dependence on the past. p_j is the jth parameter of the block.
TWO-DIMENSIONAL PROGRAMMING*
Melvin Klerer and Jack May
Columbia University
Hudson Laboratories
Dobbs Ferry, New York
A new user-oriented programming system for the purpose of facilitating the programming and analysis of well-formulated problems has been designed and implemented at Columbia University, Hudson Laboratories. This system consists of a standard Flexowriter modified to construct two-dimensional mathematical expressions and a new programming language.

The typing and language rules are quite flexible, unrestrictive, and easy to learn. Typing errors are easily corrected by backspacing and overtyping or by pressing a special "erase" key. Subscripted and superscripted arithmetic expressions can be typed conveniently. Arbitrary-sized summation, product, integral symbols, and other mathematical symbols can be constructed from elementary strokes or formed automatically by selecting the desired symbol from an accessory console keyboard.

Figure 1 is an example of what one research scientist brought to us for computation and illustrates what we mean by a well-formulated problem. A is a function of all the other variables and, except for x and y, all are input parameters. When this is coded, regardless of whether we use FORTRAN, ALGOL, or any other system, the programmer must be careful that the argument of the square root does not become negative, and that the denominator of the function to be integrated does not become so small as to cause overflow.†

Figure 2 shows the corresponding source language statement as typed on our input device in our language.

We attempted to meet the following criteria:

1. There should be less human effort: by this we mean fewer instructions (therefore fewer errors), less total time spent in coding, less debugging, and less high-level thinking necessary to solve the problem;
2. The system should be easy to learn (and therefore subject to universal use);
3. It should be adaptable to a wide range of problems and applications; and
4. It should produce a final product that is better than the "old-fashioned" product. In other words, not only should the object program be cheaper to produce but it should run faster than programs obtained using present compilers.

*This work was supported by the Office of Naval Research and the Advanced Research Projects Agency.
†A system which will automatically make such analyses and take appropriate actions is under development by the authors.
[Figure 1: the research scientist's formulation of A as a double integral dx dy.]

[Figure 2: the corresponding source-language statement as typed on the input device.]
Our Flexowriter has been modified so that the
platen may be revolved by keyboard control. One
presses the subscript key for the subscript position
and the paper moves up half a line; similarly, the
paper moves down half a line by pressing the superscript key. A typewriter device with this particular
facility is not particularly new. As indicated in the
references, announcement of the intention to construct such a machine was made as far back as July
1958 by two independent groups, one working at
Los Alamos,11 the other at Lincoln Laboratories,14
and pioneer work in this field has been done by Mark
Wells at Los Alamos.12,13
Our system permits the construction of symbols
of arbitrary size besides allowing the use of other
conventional mathematical forms such as implied
multiplication, subscripting without the use of artificial conventions, subscript notation to denote a
logarithmic base, and superscripting as in cos⁻¹ x
and cos² x. Arbitrary-sized integrals, summation symbols, or parentheses may be typed by combining basic
strokes-horizontal and vertical bars, diagonals in
both directions, and upper and lower semicircles as
shown in Fig. 19. These basic strokes have been
designed to interlock with each other. Symbols need
not be symmetric nor well composed. Figure 3
illustrates some of the poorly formed symbols that
are recognized correctly by our system. The strokes
may be typed in any order. For instance, one may
type part of a summation sign, then type part of the
argument and go back to type more of the sum or
part of the limits. Restrictions are of a minor nature;
for instance, there must be enough room above and
below the summation symbol to type the upper and
lower limits.
Figure 4 is a photograph of a page from Hildebrand's book7 on numerical analysis, illustrating his
prescription for solving linear equations. We chose
this as a good example of a well-formulated problem
for computation. We would not call this an algorithm
because it contains an inherent ambiguity. Note that
the last equation has to be computed for i taking on
the value of n first and then n - 1,n - 2, etc., down to
i = 1. When j = 1 in the first equation, the sum
becomes the null set.
Figure 5 is the corresponding program for the
solution of a set of n linear equations in n unknowns as written in our programming language.
The maximum value of n has been arbitrarily set at
20. It can be noted that the body of the typescript
is a fairly reasonable transcription of the text indicating double subscripts, etc. In the last formula our
loop is programmed backwards, but we did not
have to worry about the fact that the summation's
upper limit can be less than the lower limit; the system automatically takes care of this consideration.
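The backwards loop and the empty-sum convention mirror ordinary back-substitution; a sketch of the last equation (our Python form, 0-based, with the reduced coefficients and the intermediate column supplied as inputs):

```python
def back_substitute(a, g):
    """Compute x_i = g_i - sum of a[i][k] * x[k] for k = i+1, ..., n-1,
    taking i from n-1 down to 0. For the first i the summation range is
    empty and contributes zero, with no special handling needed."""
    n = len(g)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = g[i] - s
    return x

# Unit upper-triangular system: x2 = 3, x1 = 5 - 2*3, x0 = 4 - (-1) - 3.
print(back_substitute([[1, 1, 1], [0, 1, 2], [0, 0, 1]], [4, 5, 3]))
```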
Figure 6 shows a truncated continued fraction as
typed for computer input.
One question usually asked after we demonstrate
our system to a visitor is: The system seems fine
for the novice but will it be a useful tool for the experienced senior programmer? Our answer is yes.
The language allows the expert to write long complex statements that are not possible with most
presently operational compilers. Figure 7 illustrates
a program written by one of our senior programmers. Except for the initialization, it is a one-statement algorithm for computing prime numbers.

[Figure 3: crudely formed brackets, summation, product, and integral symbols, built from basic strokes, that are nevertheless recognized correctly by the system.]

[Figure 4: a page from Hildebrand's Numerical Analysis summarizing the triangularization equations (10.4.4) to (10.4.6) and the back-substitution equation (10.4.7), x_i = c'_i - sum from k=i+1 to n of a'_ik x_k, where i and j range from 1 to n when not otherwise restricted.]
Note that two variables are stepped; one variable is
incremented by 2 with no explicit upper limit while
the other variable is stepped by 1 (an increment of
1 is assumed if a BY clause is absent) until a
terminating condition is satisfied. In this case the
condition involves the use of the incremented variable, but, in general, the condition does not have to
depend upon the incremented variable and may be a function of parameters unknown at compile time.

[Figure 5: the program for solving a set of n linear equations in n unknowns as typed in the two-dimensional language: MAXIMUM n=20, READ statements for A_ij and C_i, the triangularization and back-substitution formulas with built-up summation signs and double subscripts, a backwards FROM loop, and PRINT i{2}, X_i FOR i=1, 2, ..., n.]

[Figure 6: a truncated continued fraction in z, typed for computer input, followed by PRINT z, x. and FINISH.]
Another feature of our "implied loop" is that the
index i is tampered with inside the loop. Also,
FROM clauses may be located anywhere in a statement as long as they make sense. The first FROM
clause is performed most often; only when its UNTIL condition is satisfied does the next FROM
variable become incremented. Apart from computer
memory size there is no restriction to the number
of FROM or FOR clauses allowed.
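The rule that the first FROM clause is performed most often can be pictured as nested loops with the first clause innermost (our emulation, with simple bounds standing in for the UNTIL conditions):

```python
def emulate_from_clauses(limit_i, limit_j):
    """Emulate two FROM clauses: the first clause's variable (i) is
    stepped most often; the second clause's variable (j) advances only
    when the first clause's terminating condition has been satisfied."""
    trace = []
    j = 1
    while j <= limit_j:          # second FROM clause, advanced rarely
        i = 1
        while i <= limit_i:      # first FROM clause, advanced most often
            trace.append((i, j))
            i += 1
        j += 1
    return trace

print(emulate_from_clauses(3, 2))
```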
The use of IF conditional clauses and PRINT
FORMAT statements is shown in Fig. 8. This
program computes moving averages of every 10 data
points and prints out a cumulative average every
20 points. We may see that multiple replacement
operators can be used within a statement. The UNTIL
condition on the FROM loop is satisfied by the reading of a particular data point from a punched card.
DIMENSION P=1000.
j=1.
PRINT j, Pj=3.
FROM i=1 UNTIL FRACTIONAL PART P/Pi = 0 IF Pi² > P THEN i=j, j=j+1 AND PRINT j, Pj=P FROM P=5 BY 2 TO INFINITY.
FINISH.

Figure 7.
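The algorithm transliterates readily (our Python rendering; like the typescript it seeds the table with 3 and tests only odd candidates):

```python
def primes_after_three(count):
    """Store 3, then step P through 5, 7, 9, ... For each P, divide by
    the stored primes in turn; if a stored prime squared exceeds P
    before any division comes out even, P is prime and is stored."""
    table = [3]
    P = 5
    while len(table) < count:
        i = 0
        while P % table[i] != 0:      # UNTIL FRACTIONAL PART P/Pi = 0
            if table[i] ** 2 > P:     # IF Pi^2 > P: no divisor can exist
                table.append(P)
                break
            i += 1
        P += 2                        # FROM P=5 BY 2 TO INFINITY
    return table

print(primes_after_three(8))
```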
A=B=i=j=k=0.
FROM γ=1 UNTIL X=99999 READ X, COMPUTE A=A+X, B=B+X, i=i+1, j=j+1,
(IF k=0 PRINT γ{3}, X{4.2} AND k=1 OTHERWISE k=0 AND PRINT γ{3}, X{4.2}),
IF i=10 PRINT FORMAT 1, γ, B/10 AND COMPUTE i=B=0, IF j=20 PRINT FORMAT 2, A/γ AND COMPUTE j=0.
FORMAT 1   AFTER A TOTAL OF xxx POINTS THE MEAN OF THE LAST TEN POINTS IS xxxx.xx.
FORMAT 2   AND THE AVERAGE OF THE WHOLE SET IS xxxx.xx.
FINISH.

Figure 8.
Punched input card format is free field with blanks
separating each datum. As many data points as desired are allowed on each card and the number may
vary from card to card without the need for any
defining information. The range of IF statements
may be delimited by parentheses. OTHERWISE or
ELSE clauses may be absent. Additional IF clauses
may be nested within other IF clauses and may be
put after THEN or ELSE clauses. The parentheses
around the IF k = 0... clause cause the program to
go to the IF i = 10... clause in either case; if the
parentheses were not present then the program would
test IF i = 10 only if k ≠ 0.
There are several ways to print answers on the
high-speed printer. A standard PRINT A statement
will cause the value of A to be printed in floating
point style. A may be an expression and may contain a replacement operator. The PRINT 'Y {3},
X {4.2} will cause 'Y to be printed as a 3-digit integer
and X to be printed with 4 places to the left of the
decimal point and 2 places to the right. Finally, one
may mix numerical results with literal messages by
using a PRINT FORMAT statement and a FORMAT image. The PRINT FORMAT statement lists
the expressions whose values are to be printed, while
the FORMAT image contains the text with the position and size of the results denoted by groups of
lower case x's. When the magnitude of a number is
unknown, one lower case y will cause the result to
be printed in floating point style. Figure 9 shows the
output of this program.
Whenever it is easily possible to determine the
size of an array by inspecting a program there
should be no need to specify its dimension explicitly. Inspection of Fig. 10 indicates that the X array
will require 500 locations and so the program automatically assigns 500 locations to X at compile time.
Currently there is no analysis of space requirements
at run time, so these decisions are made by the
compiler. Figure 10 illustrates a program to compute the mean and standard deviation of a group of
numbers. This also shows how comments are inserted, by putting them between braces.
Figure 11 shows semiautomatic dimensioning. In
this correlation program, the number of points will
be decided at run time but it is known that there
will never be more than 500. It is certainly easier
to specify a "maximum value" for one variable and
let the system do the clerical work than to specify
the same dimension for a number of arrays (X, Y,
and A). Finally, one may also dimension arrays
directly using a DIMENSION statement as shown
in Fig. 12. Here the size of the arrays is not obvious so a DIMENSION statement is needed.
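The idea of sizing an array at compile time by inspecting constant loop bounds can be sketched as follows (Python; the statement tuples are a made-up intermediate form, not the system's own representation):

```python
def infer_dimensions(statements):
    """Scan (array, loop_upper_bound) pairs and assign each array
    the largest constant bound seen; this mimics assigning X
    500 locations from 'READ Xi FROM i=1 TO 500'."""
    sizes = {}
    for array, bound in statements:
        if isinstance(bound, int):           # only constant bounds count
            sizes[array] = max(sizes.get(array, 0), bound)
    return sizes

program = [("X", 500), ("X", 10), ("Y", 250)]
dims = infer_dimensions(program)
```

When a bound is decided only at run time, this inspection fails, which is exactly when MAXIMUM or DIMENSION declarations become necessary.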
For systems of this type there is always the inherent possibility of ambiguity. For example, since we
allow both double subscripts and implied multiplication, A ij in Fig. 13 may mean either A as a function
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965

Figure 9. [Printer output of the program of Fig. 8: a numbered list of data points (2145.32, 128.73, 6308.75, 9012.45, ...) interleaved, after every ten points, with lines of the form "AFTER A TOTAL OF nn POINTS THE MEAN OF THE LAST TEN POINTS IS xxxx.xx" and, every twenty points, "AND THE AVERAGE OF THE WHOLE SET IS xxxx.xx".]
of two independent indices, i and j, or A as a function of one index, the expression, i times j. Also, since
the argument of a function does not have to be delimited by parentheses, the interpretation of "SIN A COS
B" is ambiguous. Is the argument of the "SIN" "A
COS B" or just "A"? If the expression is interpreted
as (SIN A) × (COS B), which seems reasonable, does this mean that if no parentheses are present to
delimit an argument, then the appearance of another
function name will delimit the argument? If so,
then "SEC TAN⁻¹ (A/2B)" would be "SEC of
nothing times TAN⁻¹ (A/2B)."
TWO-DIMENSIONAL PROGRAMMING
Figure 10. [Program computing the mean and standard deviation of a group of numbers, with comments in braces:
READ S {IDENTIFICATION}.
READ Xi FROM i=1 TO 500.
Y = Σ Xi / 500. ...
PRINT FORMAT 1, S, Y, σ.
FORMAT 1 THE MEAN OF GROUP xx IS xxxx.xx WITH A STANDARD DEVIATION OF y.
FINISH.]
Figure 11. [Correlation program with semiautomatic dimensioning:
MAXIMUM n=500.
READ n.
AT = Σ(i=1 to n−T) Xi Yi+T FOR T=0, 1, ..., n−1.
WRITE TAPE A, 2, 2, n.
FINISH.]
Figure 12. [Explicit dimensioning:
DIMENSION A=100, B=100.
READ K, ε.
FROM i=1 TO LOG2 K COMPUTE Ai=Bi K AND PRINT i, Ai, Bi, ε.
FINISH.]
Figure 13. [Statements as typed:
X = Aij.
Y = SIN A COS B.
Z = SEC TAN⁻¹ A/2B.  (the A/2B typed with displayed division)
A = CSCH⁻¹ A − L.
B = Σ Ai Bi + M.
C = A/2B.
(a statement with COS raised to the exponent 3n−2)
E = LOG₂ₖ₊₁ 4P² + π/2.]
For the next statement we might ask whether the
argument of CSCH-l is A or A - L. For the following
statement we would want to know whether or not
M is within the summation. The system's interpretation of the last three statements is also of interest. (1T and e are interpreted as 3.14159. . . and
2.71828 ... ) Immediately after the source program
is read, the system's interpretation is listed on the
high-speed printer in a linear, FORTRAN-like, intermediate language. The output resulting from Fig. 13
is shown in Fig. 14. This shows that Aij is interpreted as a two-dimensional array but another context
may yield a more appropriate interpretation. Y is interpreted as the product of SIN A and COS B. The
argument of the secant is interpreted as TAN-l
(A/2B). The variables Land M are interpreted to
be outside the argument of the CSCH-l and the sum,
respectively. The next statement is interpreted in the
FORTRAN manner and is not (A/2B) but(A/2)B.
The argument of the TAN⁻¹ in statement 3 is interpreted as A/(2B), because it was originally typed
using displayed division rather than a slash. Statement
7 shows that the exponent 3n − 2 has been moved
from its location adjacent to the COS to its proper
from its location adjacent to the COS to its proper
place for computation. Statement 8 shows how the
base of a log is treated. After the "LOG to the base
10" of the argument is taken, that quantity is divided
by the "LOG to the base 10" of the base. We emphasize that each particular interpretation is a function
of the local context in which it is actually used. In
our opinion, an immediate response to the user returning the system interpretation is of great utility in
resolving many ambiguous forms.
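The convention the system applies, that an undelimited function argument extends up to the next function name, can be sketched as a small scanner (Python; the token lists and function set are assumptions of ours, not the system's actual recognizer):

```python
FUNCTIONS = {"SIN", "COS", "SEC", "TAN"}

def group_arguments(tokens):
    """Split a token list such as ['SIN','A','COS','B'] into
    (function, argument-tokens) pairs: an argument runs until
    the next function name, so SIN A COS B reads as
    (SIN A) * (COS B)."""
    groups, i = [], 0
    while i < len(tokens):
        name = tokens[i]
        arg, i = [], i + 1
        while i < len(tokens) and tokens[i] not in FUNCTIONS:
            arg.append(tokens[i])
            i += 1
        groups.append((name, arg))
    return groups

parsed = group_arguments(["SIN", "A", "COS", "B"])
```

Note the cost of the convention visible in the sketch: "SEC TAN" yields a SEC with an empty argument list, the "SEC of nothing" case the text warns about.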
The single variables on our keyboard include the
entire uppercase alphabet, 12 lowercase letters and
16 Greek letters for a total of 54 variables. This may
be doubled because a letter typed in red is considered
different from the same letter typed in black. If 108
[Figure 14. The system's interpretation listing for the statements of Fig. 13, headed "THIS IS THE WAY WE INTERPRET YOUR STATEMENTS. IF ANY ARE INCORRECT PLEASE RETYPE THE STATEMENT CORRECTLY", with numbered lines such as X=A SUB [I,J], Z=SEC[ARCTAN[[A]/[2*B]]], A=ARCCSCH[A]-L, B=SUM WITHIN [100,I,1] OF (A SUB [I]*B SUB [I])+M, and E=LOG[4*P RAISED TO [2]]/LOG[2*K+1]+[[PI]/(2)].]

variables are not enough, one may define other variables by a SPECIAL VARIABLES statement as
shown in Fig. 15. However, if it is desired, extra
variables may be constructed without the need for
predefinition by appending a red superscript to a
variable, as in AMAX. This figure also illustrates the
trick used to show functional relationship while saving memory space. Comment braces are put around
the subscript of F so that the sums are not stored
in an array. The system's interpretation is shown in
Fig. 16.
Figure 15. [Program using SPECIAL VARIABLES and a red superscript:
SPECIAL VARIABLES TEMPERATURE, PRESSURE.
READ AMAX, TEMPERATURE, PRESSURE.
V = (C × TEMPERATURE + K × PRESSURE) + AMAX, contrasted with the reading V = C×T×E×M×P×E×R×A×T×U×R×E ... that would result without the declaration.
F{i} = Σ(j=1 to 10) Xij FOR i=1, 2, 3, ..., 50.
PRINT F.]
Figure 16. [The system's interpretation of the Fig. 15 statements, again headed "THIS IS THE WAY WE INTERPRET YOUR STATEMENTS. IF ANY ARE INCORRECT PLEASE RETYPE THE STATEMENT CORRECTLY", including DIMENSION TEMPERATURE, PRESSURE and READ AMAX,TEMPERATURE,PRESSURE.]

Figure 17. [Segment of a production program computing a correlation function Ci and its finite cosine series transformation, with the "book" formulas placed between comment braces and a choice of smoothing: {OPTION A HANNING OR OPTION B HAMMING} IF OPTIONAB=0 THEN E=.5 AND F=.25 OTHERWISE {OPTION B} E=.54 AND F=.23. U0=E(V0+V1). FROM i=1 TO M-1 COMPUTE Ui=EVi+F(Vi-1+Vi+1). UM=E(VM-1+VM).]

Another example of the use of comments is indicated in Fig. 17. This is a segment of an actual production program to calculate the power spectrum of a filtered signal. The part shown in the figure computes a correlation function and its cosine transformation. For clarity the actual "book" formula is printed
in red (the color is immaterial) and put between the
comment braces. An experienced programmer would
realize it would be highly inefficient to compute COS
ijπ/M a total of M² times. Since M, i and j are integers
and cosine is a periodic function, the same numbers
would constantly repeat themselves. Thus, the program, which is listed below the comment, gives the
correct answers, but the computation is done in a
much more efficient manner, and the comment serves
as documentation.
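The efficiency point can be made concrete: because cos(ijπ/M) depends only on the integer product ij modulo 2M, a table of 2M cosines suffices in place of M² evaluations. A sketch of this trick (Python; the table layout is our own, not the production program's):

```python
import math

def cosine_table(M):
    """Precompute cos(p*pi/M) for p = 0..2M-1; since i*j*pi/M is
    periodic with period 2M in the integer product i*j, these 2M
    values replace M*M cosine evaluations."""
    return [math.cos(p * math.pi / M) for p in range(2 * M)]

def cos_ij(table, i, j, M):
    # Table lookup in place of a trigonometric evaluation.
    return table[(i * j) % (2 * M)]

M = 8
table = cosine_table(M)
```

This is the same answers-first, efficiency-second arrangement the text describes: the "book" formula documents the computation while the lookup does the work.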
With regard to machine code efficiency, it is
always difficult to pick unbiased examples. However, our experience has led us to believe that in general our object programs are very efficient. Our
symbol recognizer, translator and compiler make for
a complex software system and there is no escaping
the fact that it would be a substantial task to code
our system for another machine. Because we were
interested in machine-efficient object programs
we made no attempt to make our coding techniques
machine-independent. However, it is possible to
simplify the task of recoding for another system.
One may just recode the symbol recognizer and
translator parts, i.e., up to the point where the
FORTRAN-like intermediate language is produced.
Thus these parts can be considered to be a preprocessor for an existing FORTRAN-like compiler.
We have become convinced that the voluminous
programming instruction and operating manuals
usually encountered are rarely necessary. Thus we
are trying to explore how concise one can make a
system reference manual without impairing its practical utility. Presently we are using a manual consisting of one sheet of stiff 8½ × 11-inch paper
printed on both sides, as shown in Figs. 18 and 19.
As yet, we do not have enough experience with it to
know whether we want to increase the size of this
one-sheet manual; yet it is hard to envisage its
ever growing to a size larger than two or three double-faced sheets.
ACKNOWLEDGMENTS
We acknowledge with thanks the programming
assistance of David Levine and Fred Grossman, the
engineering assistance of Charles Amann and Saverio Conforti, and the encouragement of Alan Berman, Robert A. Frosch, and Ivan E. Sutherland.
Fig. 4 is reproduced by permission of the McGraw-Hill Book Company.
REFERENCES
1. K. G. Balke and G. Carter, "The COLASL
Automatic Coding System," Dig. Tech. Papers,
ACM Natl. Conf., 1962, pp. 44-45.
2. A. J. T. Colin, "Note on Coding Reverse
Polish Expressions for Single-Address Computers
with One Accumulator," Comput. J., vol. 6, pp.
67-68 (1963).
3. H. J. Gawlik, "MIRFAC: A Compiler Based
on Standard Notation and Plain English," Comm.
ACM, vol. 6, no. 9, pp. 545-547 (1963).
4. A. A. Grau, "The Structure of an ALGOL
Translator," Oak Ridge Nat. Lab. Rep. 3054, Feb.
1961.
5. M. Grems and M. O. Post, "A Symbol Coder
for Automatic Documenting," Comput. News, vol.
147, pp. 9-18; and vol. 148, pp. 15-19 (1959).
6. C. L. Hamblin, "Translation to and from
Polish Notation," Comput. J., vol. 5, pp. 210-213
(1962).
7. F. B. Hildebrand, Introduction to Numerical
Analysis, McGraw-Hill Book Co., New York,
1956.
8. M. Klerer and J. May, "Algorithms for
Analysis and Translation of a Special Set of Computable Mathematical Forms," Tech. Rep. 113,
Columbia U., Hudson Labs. (Oct. 1963).
9. M. Klerer and J. May, "An Experiment in a
User-Oriented Computer System," Comm. ACM,
vol. 7, no. 5, pp. 290-294 (1964).
10. M. Klerer and J. May, "A User-Oriented
Programming Language," Comput. J., vol. 8 (July
1965).
11. Los Alamos Scientific Laboratory, "MANIAC II," Comm. ACM, vol. 1, no. 7, p. 26 (1958).
12. Mark B. Wells, "MADCAP: A Scientific
Compiler for a Displayed Formula Textbook Language," Comm. ACM, vol. 4, pp. 31-36 (1961).
13. Mark B. Wells, "Recent Improvements in
MADCAP," Comm. ACM, vol. 6, pp. 674-678
(1963).
14. A. Vanderburgh, "The Lincoln Keyboard-A
Typewriter Keyboard Designed for Computer Input Flexibility," Comm. ACM, vol. 1, no. 7, p. 4 (1958).
REFERENCE MANUAL

Subscripted variables need not be dimensioned when used in forms such as:
  (1) Aij=BiQi,j FOR j=0(2)20 AND i=1 TO 5
  (2) MAXIMUM n=10. K=15.
      Aij=BiQi,j FROM j=4,5,...,K WITHIN i=0 BY 3 UNTIL n.

Vocabulary List: ABS, ABSOLUTE, AND, ARCCOS, ARCCOSH, ARCCOT, ARCCOTH, ARCCSC, ARCCSCH, ARCSEC, ARCSECH, ARCSIN, ARCSINH, ARCTAN, ARCTANH, BY, CALL, CARD, CARDS, COMPUTE, CONTINUE, COS, COSECANT, COSH, COSINE, COT, COTANGENT, COTH, CSC, CSCH, CYCLE, DIMENSION, DIVIDED, DO, ELSE, END, EOF, EQUALS, EXP, FILE, FINISH, FOR, FORMULA, FROM, GO, HEADING, IF, INFINITY, LABEL, LINE, LINES, LN, LOG, LOOP, MAXIMUM, MESSAGE, MINUS, OF, OR, OTHERWISE, PAUSE, PERFORM, PLOT, PLUS, PRINT, PROCEDURE, PROGRAM, PUNCH, READ, RETURN, REWIND, ROUND, SEC, SECANT, SECH, SIN, SINE, SINH, SLEW, SPECIAL, SQRT, STATEMENT, STOP, SUBROUTINE, SWITCH, TAN, TANGENT, TANH, TAPE, THE, THEN, TIMES, TO, TOP, TRUNCATE, TYPE, UNTIL, UPPER, VARIABLE, VARIABLES, WITHIN, WRITE.

The letters E, F, G denote an arithmetic expression, e.g., E may denote the expression A + 2B + i; otherwise a single variable is meant. Braces { } denote a choice of forms. Square brackets [ ] denote those forms that are optional.

FROM i=E [BY F] {TO|UNTIL} G, e.g.:
  FROM i=E TO G (unit steps assumed)
  FROM i=N BY 2.34 UNTIL A+B
  FROM A=B+5 BY 2 UNTIL Q>20
  FROM i=E TO INFINITY
FOR i=1, 2, ..., 5    FOR j=5(10)55    FOR i=0, .5, ..., 7.5
Note: any number of dots is permissible but no extra spaces before the terminating comma. The difference between the first two numbers specifies the increment in the first FOR form.
FROM or FOR forms can be used either to begin or end a statement:
  Ai=iBi FROM i=1 TO 10.
  FROM i=1 TO 10 COMPUTE Ai=iBi.
DO [UNTIL] = LOOP [ ] = CYCLE [ ]
DO STATEMENT 5 FROM J=1 TO 10. indicates that all statements up to but not including 5 will be executed. (No two LOOP statements should terminate at the same statement number. Otherwise, any number of LOOP procedures within or external to other LOOP procedures is permitted.)
FROM = WITHIN = AND
FOR φ=0,5,...,90 WITHIN r=1 TO 10 AND θ=1 TO 5 LOOP TO FORMULA 6. The loop to be performed most often is the first one; the least often is the last.
Note: the horizontal extension of the lower limit equation and upper limit expression should not exceed the corresponding arms of the sum symbol. The operand of the sum should be outside the symbol.
A period denotes the end of a statement or the end of an implied loop.
Corrections can be made by overtyping or by pressing the control key ERASE when positioned over the error.
The initial value of all variables (including subscripts) is assumed to be 0 unless defined.
Each program must be terminated by the statement END OF PROGRAM. or FINISH.
More than one statement per typing line is acceptable. To continue a statement beyond the maximum typing length for one line, press the TAB button and at least one carriage return.
Names of variables with more than one character should be defined by a SPECIAL VARIABLES statement before use.
A comma or the word AND may be used to separate computable statements:
  FROM i=1 TO 10 COMPUTE Ai=Bi+Ci+1, Ci=Ai+1X AND Di=SIN θi.
READ = READ CARD = READ CARDS
  READ Ai FROM i=1 TO Aj≥15.
Card format is free field; the number of data points may vary from card to card and may be in either fixed or floating point form.
  READ X. (only one card is read, one datum per card)
  READ Ai, Bi+1 FROM i=E UNTIL Ai=93.643. (Only one set Ai, Bi+1 per card.)
  READ Ai FOR i=0(1)105. (Any number of Ai's per card.)
Data may be punched into cards in the following forms:
  2  -2  1.596  +3.213  -4.60  2.78T2 [=2.78x10^2]  2.78T-2 [=2.78x10^-2]  2.78E-3 [=2.78x10^-3]
Each datum should be separated by at least one blank space and should not exceed nine significant digits.
DIMENSION A=(N, M). indicates that A is an (N+1) by (M+1) array.
  DIMENSION B=40, Z=30, Q=(10, 50).
SPECIAL VARIABLE[S] = DIMENSION
  SPECIAL VARIABLES TEMPERATURE, HUMIDITY, PRESSURE, COUNT, LBJ=(14, 200), ay=(10, 15).
UPPER is used in the same manner as DIMENSION and SPECIAL VARIABLES except that the indicated variables are stored in upper memory.
  UPPER C, WEIGHT=56, K=(20, 30).
Three Alternate Formulations Of The Same Problem: [a worked example contrasting LOOP/GO TO, implied-loop, and FROM...COMPUTE formulations, ending in PRINT and END OF PROGRAM. or FINISH.]

Figure 18.
PRINT X, Y, Z.
PRINT Yi FOR i=1,2,...,N.
FROM i=1 TO N PRINT Ai.
PRINT Yi (A.B). A and B are integers. Yi will be printed in fixed point output, A significant figures to left of the decimal point, B significant figures to right of the decimal point. PRINT Yi(3.2), Yi(4.1), Yi(0.2).
PRINT Yi (A.B.C). (Printed as above except that the number is first divided by 10^C to change its range.)
PRINT A, B(4.2), C(0.1.1), D. (Maximum of 8 variables)
PRINT LABEL = LABEL = HEADING = PRINT HEADING
Only symbols available on the printer are to be used, maximum of 15 characters per label, maximum of 8 labels separated by commas.
  LABEL A, COUNT, 3X, Z1A, SIGMA(X), TEMP.
Messages on the printer or typewriter are printed using the following forms:
  PRINT MESSAGE ... or TYPE NEGATIVE SQUARE ROOT.
SLEW N (printer paper spaced N lines)
SLEW [TO] TOP (paper will advance to top of page)
IF F {=|≠|>|<|≥|≤} G THEN ... [OTHERWISE|ELSE] ..., e.g.:
  IF F = G THEN GO TO STATEMENT 1.
  IF F > G THEN B = C + E.
  IF F < G THEN READ ...
  IF F ≠ G THEN [CONTINUE] OTHERWISE GO TO ... or COMPUTE ...
Examples of multiple conditions include IF T=5 OR G>a THEN C=D ..., IF P=G AND H>ε/2 AND ..., and IF U=0 OR (G=r SIN ...). [A worked example combines READ, a nested IF ... OTHERWISE ..., COMPUTE, PRINT, and FROM/WITHIN/FOR loop clauses in a single statement.]
To define a procedure within a program:
  ... {SUBROUTINE|PROCEDURE} (Name). .......... RETURN .......... [END [(Name)]]
The name of a subroutine can be an alphanumeric string of any length but must begin with an alphabetic character and cannot be identical to any item in the vocabulary list. As many RETURN's as desired may be inserted to branch out of the subroutine back to the main program. The END statement is optional. It is preferable that all procedures be typed at the end of the program. If this is done, precede the subroutine by the statement STOP. However, if it is desired to define a procedure inside the main program, then in some manner the program should "jump over" the procedure.
To call a procedure: ... CALL (Name) [SUBROUTINE|PROCEDURE] ...
GO = GO TO. GO TO STATEMENT 20.
Comments (non-computable statements) are entered between { } symbols:
  FROM i=1 TO 10 READ Xi {READ VALUES}.
  Y{i,j}=i+12j.
Superscripts that are red are used to form new characters rather than being interpreted as exponents. A short program to determine the maximum absolute value of a set of positive numbers Xi (the memory cell used by XMAX is set to 0 if not previously defined):
  FROM i=1 TO 100 IF |Xi|>XMAX THEN XMAX=|Xi|.
In the following tape commands L is the number of elements in the array V, T is the tape number and P is the controller (plug) number.
  READ TAPE V, T, P, L. (The first L elements of the tape record are read into locations V0 to VL-1.)
  WRITE TAPE V2, T, P, 5. (Locations V2-V6 are written on tape.)
  REWIND T, P. = RWD T, P.
  WRITE END OF FILE T, P. = EOF T, P.
  IF END OF FILE P THEN ...   IF EOF P GO TO ...
PLOT Y, X, A, B. Y is the variable to be plotted, X is the "independent index" (i.e., Y=f(X)), A is the minimum value of X and B is the maximum value of X. PLOT Zi, i, 0, ...
Use of the next forms eliminates the necessity of using "DO" or "LOOP" statements. Computable sub-statements within an implied loop are separated by a comma or AND.
  FOR i=1(1)50 AND j=0 BY 2 UNTIL Y>2000 READ Xi,j, COMPUTE Y=2Xi,j AND PRINT Y.
  FROM i=1 TO INFINITY READ Xi. IF Xi≠0 COMPUTE Y=Y+Xi, n=n+2 OTHERWISE GO TO STATEMENT 1.
  ... AND PLOT Y, i, -1, 1 FROM i=1 UNTIL Y>1.
[Relative Positions of Special Characters: a keyboard layout diagram.]

Figure 19.
MICROPROGRAM CONTROL FOR THE EXPERIMENTAL SCIENCES
W. C. McGee and H. E. Petersen
IBM Systems Research and Development Center
Palo Alto, California
INTRODUCTION

In many areas of the experimental sciences, increasing use is being made of general-purpose computers to control experimental apparatus and to record data from experiments. In most such applications the problem exists of connecting the apparatus to the computer so that data and control information may flow between the two. The problem is usually solved by placing a controller between the computer and the external equipment (Fig. 1). In this position the controller serves two functions:

(a) It provides a suitable electrical and logical interface between the computer and the external equipment; and
(b) It provides detailed control of the external equipment, thus leaving the computer free for other work.

[Figure 1. General control system: the external equipment connected through a controller to the computer.]

The design of a controller for a particular set of external equipment and a particular computer presents no serious obstacles. Traditionally, controllers are implemented from flip-flop registers, logic elements (AND, OR, NOT, etc.), and occasionally small memories for data buffering. Timing diagrams are drawn showing the levels and pulses required at the controller's terminals, and the logic elements are then interconnected to give the controller its proper terminal characteristics. The design process is essentially no different from that conventionally used in designing computers, except of course that a controller is usually not as complicated as a computer.

Although the conventional design process is straightforward, it has the inherent disadvantage that it must be repeated for each new configuration of computer and external equipment. In the experimental sciences, the number of such configurations is increasing rapidly, and it is quite possible that progress in this area will be limited by the time and cost to develop the requisite controllers by traditional methods. The situation would be materially improved if there were a single design schema which was sufficiently general to accommodate a wide variety of computers and external equipment, and which could be quickly and easily particularized to meet the requirements for specific controllers. One design schema which appears to approach this goal is microprogram control.

In microprogram control, the functions of the controller are vested in a microprogram which is stored in a control memory. The microprogram is made up of microinstructions which are fetched in sequence from the memory and executed. The microinstructions control a very general type of hardware configuration, so that merely by changing the microprogram, the functions available in the controller can be made to range between wide limits. In addition, instead of the multiple, parallel logic elements found in conventional controllers, the microprogrammed controller requires only a single, central element to perform all logic and arithmetic. The microprogrammed controller thus has a potential cost advantage over the conventional controller.

The microprogrammed controller concept has been used to implement the IBM 2841 Storage Control Unit, by means of which random access storage devices may be connected to a System/360 central processor. Because of its microprogram implementation, the 2841 can accommodate an unusually wide variety of devices, including two kinds of disk storage drive, a data cell drive, and a drum. The 2841 thus provides an instance of the effectiveness of the microprogrammed controller concept in minimizing the effort that must go into controller design.

In this paper we will attempt to extend and generalize the microprogrammed controller concept, as embodied in the 2841 Storage Control Unit, to yield a more general controller design schema which would be suitable for use in the experimental sciences. We will first describe the basic concepts of the microprogrammed controller, and then describe how such a controller might be applied to a control problem typical of the experimental sciences, namely, the control of a CRT for scanning bubble chamber film.

THE MICROPROGRAMMED CONTROLLER CONCEPT

The functions of a microprogrammed controller are expressed in a microprogram which is stored in a control memory. The microprogram is composed of microinstructions which are read out of the control memory, one at a time, decoded, and executed. The microprogrammed controller is thus primarily a sequential device, in contrast to the conventional controller in which different operations usually proceed in parallel.

The microinstructions of a microprogrammed controller control a simple yet general hardware configuration. This hardware must be capable of storing small amounts of data, performing simple arithmetic and logic operations on these data, and accepting and transmitting data and control signals to the attached equipment. The general structure of one possible configuration meeting these requirements is shown in Fig. 2.

[Figure 2. General structure of microprogrammed controller: registers R1 through Rn feeding the A and B buses into an ALU whose output returns on the D bus.]

The controller is organized around a set of three data buses, A, B, and D;
an arithmetic and logic unit (ALU); and a set of
registers, Rl, R2, ... ,Rn. Two of the data buses
(A and B) provide input data to the ALU, and the
third (D) receives the output of the ALU. The
ALU is capable of performing simple arithmetic
and logic operations on the input data, such as add,
subtract, AND, OR, etc. The registers provide the
sources of data to be operated on and also serve as
destinations for the results. In general, any two registers specified by the microinstruction may provide
the input data, one of them being connected to the
A bus and the other to the B bus. The result of the
ALU operation may then be returned, via the D
bus, to any specified register (including one of the
source registers, if desired). The registers, buses,
and ALU are all the same width, which may be
chosen to match the requirements of the control application. For example, a bus width of 8 bits (+
parity) appears to be a good choice for a wide
class of applications.
Microinstructions are divided into fields, each of
which has a specific function. To control the data
flow in the configuration of Fig. 2, four fields
MICROPROGRAM CONTROL FOR THE EXPERIMENTAL SCIENCES
would be required: CA, CB, COP, and CD. Field
CA determines which register will be connected to
the A bus; field CB determines which register will
be connected to the B bus; field COP determines
what operation the ALU will perform on the A-bus
and B-bus data; and field CD determines which register the ALU results will be sent to. Each of these
fields can be made large enough to handle the maximum number of variations required. For example,
by using 4 bits for the CA field, one can specify up
to 16 different sources for the A bus.
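A sketch of this field decoding (Python; the 4-bit CA width comes from the text, but the field order and the 16-bit total word length are illustrative assumptions of ours):

```python
FIELD_WIDTHS = [("CA", 4), ("CB", 4), ("COP", 4), ("CD", 4)]

def decode(word):
    """Slice a 16-bit microinstruction into its CA/CB/COP/CD
    fields, most significant field first; 4 bits per field allows
    up to 16 sources, operations, or destinations each."""
    fields, shift = {}, sum(w for _, w in FIELD_WIDTHS)
    for name, width in FIELD_WIDTHS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields

# Hypothetical encoding: CA=1 (R1), CB=2 (R2), COP=0 (ADD), CD=3 (R3).
micro = (1 << 12) | (2 << 8) | (0 << 4) | 3
f = decode(micro)
```

Widening any one field in FIELD_WIDTHS is all it takes to double the number of sources or operations, which is the sizing flexibility the text describes.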
To illustrate, suppose it were desired to add the quantity in register R1 to the quantity in R2 and place the sum in register R3. This could be accomplished with the following microinstruction:*

    CA: R1    CB: R2    COP: ADD    CD: R3
The structure of Fig. 2 provides for the manipulation of data already in the system, but provides no
way of introducing data into the system. In general
there are two ways of accomplishing this. One is by
providing external connections to some of the registers, as will be described below; the other is by providing input directly from the microinstruction itself. In particular, one of the B-bus sources can be
defined to be the "constant" field CK of the microinstruction. Whenever this source is specified in
the CB field, the contents of the CK field in the
same microinstruction will be gated onto the B bus.
This technique is especially useful for introducing
increments to counts (e.g., + 1) or certain bit patterns to mask off portions of data bytes. For example, to increment the quantity in register R1 by 4,
the following microinstruction could be used:
    CA: R1    CB: CK    COP: ADD    CD: R1    CK: 4
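Both microinstruction examples can be played out on a toy datapath (Python; the register names and the CK pseudo-source follow the text, while everything else is our own sketch, not the 2841's behavior):

```python
def step(regs, ca, cb, cop, cd, ck=0):
    """Execute one microinstruction: gate source ca onto the A bus,
    cb (or the constant field CK) onto the B bus, run the ALU
    operation cop, and return the result via the D bus to cd."""
    a = regs[ca]
    b = ck if cb == "CK" else regs[cb]   # CK field as a B-bus source
    d = {"ADD": a + b, "SUB": a - b}[cop]
    regs[cd] = d
    return d

regs = {"R1": 10, "R2": 32, "R3": 0}
step(regs, "R1", "R2", "ADD", "R3")      # R3 = R1 + R2
step(regs, "R1", "CK", "ADD", "R1", 4)   # R1 = R1 + 4
```

Note that cd may name one of the source registers, matching the text's remark that a result can be returned to a source register.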
*The fields of the microinstruction are shown symbolically; in practice they would contain equivalent binary codes.

The registers of a microprogram controller fall naturally into three groups: a control group, an input group, and an output group. To illustrate this
grouping, a slightly more detailed schematic of a
microprogrammed controller is shown in Figure 3.
The control registers are those registers required
for general controller operation, i.e., without regard
to the particular device or devices being controlled.
For example, registers must in general be provided
to hold intermediate results of ALU processing.
These registers are designated TMP in Fig. 3.
Another set of registers (CHI and CHO in Fig. 3)
is provided for data to be transmitted to and from a
general-purpose computer. These registers could,
strictly speaking, be placed in the input and output
register groups, so that the computer would assume
the same status as any other device connected to the
controller. Communication with a general-purpose
computer is sufficiently stereotyped, however, that
the registers required to effect this communication
can be properly viewed as part of the control group.
A third type of register in the control group is
the status register, designated ST in Fig. 3. Each
bit of a status register indicates the (binary) status
of some portion of the controller or device being
controlled. A status bit may in general be set or reset from an external source (e.g., the computer
channel, to signify that the CHI register is ready to
be sampled); or from the microprogram itself. For
the latter purpose, a CS field is provided in the microinstruction whose decoded value designates a
particular status register bit and the value to which
the bit is to be set. This provides the controller
with the ability to "staticize" certain conditions existing at one time so they may be used to condition
operations at a later time. An important example is
the condition "D=O," i.e., whether or not the output of the ALU is zero. A certain value in the CS
field will cause a 1 or a 0 to be set into a certain
status register bit, according as D = 0 or D ≠ 0 on
that microinstruction step. In addition to conditioning later operations of the controller itself, the status registers may of course be used to condition the
operation of external equipment, and as such provide one of the sources for external equipment control.
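The "staticizing" of the D=0 condition via the CS field can be sketched as follows (Python; a hypothetical rendering of the description above, with the status register as a simple bit dictionary):

```python
def alu_with_status(a, b, cop, status, cs=None):
    """Compute the ALU result and, when the CS field names a
    status bit, staticize the D=0 condition into that bit for
    use on a later microinstruction step."""
    d = {"ADD": a + b, "SUB": a - b}[cop]
    if cs is not None:
        status[cs] = 1 if d == 0 else 0   # record D=0 versus D!=0
    return d

status = {}
alu_with_status(5, 5, "SUB", status, cs=1)   # D=0, so bit 1 is set
```

A later microinstruction can then branch on status[1] even though the zero result itself is long gone, which is the point of staticizing.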
To illustrate the function of the CS field, suppose
register R1 contains a count of the data bytes received from the external equipment. Each time a
byte is received, the count is to be decreased by one
[Figure 3. Basic microprogrammed controller: the control group (TMP, CHI, CHO, and ST registers), input group, and output group arranged around the ALU and its buses, together with the computer channel, input and output gates, control memory, control memory address, and microinstruction decoding.]
and the resulting value tested for zero. When the
value goes to zero, bit 1 of the status register is to
be set to 1. This can all be accomplished with the
following microinstruction:
R1    CK    SUB    R1    1     (D=0)→ST1
CA    CB    COP    CD    CK    CS
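The effect of this microinstruction can be modeled in a few lines of software. The sketch below is illustrative only, not the hardware itself; the register and status names R1 and ST1 follow the text, and the whole operation occupies a single microinstruction step in the controller:

```python
# Illustrative model of the microinstruction above: D = R1 - 1 (COP = SUB,
# with the constant 1 from the CK field on the B bus), the result returned
# to R1 (CD = R1), and the CS field "staticizing" the condition D = 0
# into status register bit ST1.
def step(regs, status):
    d = regs["R1"] - 1          # ALU: A bus (R1) minus B bus (constant 1)
    regs["R1"] = d              # D bus routed back into R1
    if d == 0:                  # CS field: (D=0) -> ST1
        status["ST1"] = 1
    return regs, status

regs, status = {"R1": 2}, {"ST1": 0}
step(regs, status)              # count 2 -> 1, condition not yet met
step(regs, status)              # count 1 -> 0, ST1 is set
```

The staticized bit ST1 can then condition a later microinstruction, long after the ALU output has changed.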
The second group of controller registers, the input group, provides a place to hold data coming
from the equipment being controlled. In many cases
these data will take the form of binary voltage levels which will be held by the external equipment
until sampled by the controller and perhaps later
caused to change. In such cases the registers need
not be flip-flop registers but simply terminals which
can be connected to the A or B bus. In other cases,
it may be necessary, because of timing or other
considerations, to buffer the input into flip-flop
registers. In these cases, gating control must generally be provided by the controller and/or external
equipment.
The third and final group, the output group, consists of flip-flop registers where the controller may
deposit data to be used by the external equipment.
The output of these registers may be taken directly
to the external equipment as binary levels, or may
be transferred, through suitable gating, to external
registers. Normally, the output registers need be
connected only to the D bus. Under certain conditions, however, it may be convenient to bring the
output register data back into the system, and so A-bus connections are in general provided for the output registers.
The setting of status register bits under microprogram control provides the basic mode of controlling the external equipment. In some cases it is
convenient to augment this facility with control bits
taken directly from the microinstruction. This is
the principal purpose of the CX field. When a microinstruction is read out of control memory, the bits of the CX field are not decoded as are the other fields, but are instead used directly in the external equipment. For example, the microinstruction might, by virtue of the 1 in the CX field, cause
the gating of a quantity into an input register.
-     -     -     -     -     -     1
CA    CB    COP   CD    CK    CS    CX
The CX field may also be used to extend the facility provided by the CK field for introducing arbitrary constants into the external equipment. Unlike constants from the CK field, any constants in
the CX field would not enter the bus structure, but
instead would go directly to the external equipment.
Since the microprogrammed controller is a sequential device, a key characteristic is the method
used to get from one microinstruction to the next.
One could, for example, have a conventional "program counter" which contains the memory address
of the instruction currently being executed, and
which is either incremented by one or is respecified
to an arbitrary value (in case of a branch) to obtain the address of the next instruction. For microprograms, a more efficient procedure is to specify
the address of the next instruction in every instruction, whether branching may occur or not. By this
means, branching does not take a separate step, but
may be performed on the same step as some other
operation. Further, successive instructions may be
located anywhere in control memory relative to one
another, providing greater flexibility in the sharing
of common sequences of microinstructions among
different control functions.
In the microprogram controller we are describing, the address of the next instruction is normally determined by the CN and CL fields. The CN
field contains the high-order n-l bits of the n-bit
address of the next microinstruction; and the CL
field determines which of a number of sources will
be used to supply the low-order bit of the address.
Two such sources, of course, are simply a "zero"
and a "one," so that the location of the next microinstruction may be arbitrarily specified. Thus,
the microinstruction
-     -     -     -     -     0     179    0     -     -
CA    CB    COP   CD    CK    CJ    CN     CL    CS    CX

specifies that the next microinstruction is to come from location 179 × 2 + 0 = 358. Other sources which can be specified in the CL field include single bits of the status register. The microinstruction

-     -     -     -     -     0     179    ST1   -     -
CA    CB    COP   CD    CK    CJ    CN     CL    CS    CX

specifies that the next microinstruction is to come from location 179 × 2 + 0 = 358 if ST1 = 0, or from location 179 × 2 + 1 = 359 if ST1 = 1, thus effecting a two-way branch on the value of bit 1 of the status register.
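The next-address rule can be summarized as a few lines of software. This is a sketch of the scheme as just described, not the paper's notation: the CL field is represented here by a `cl_source` string naming the low-order-bit source, and the register names are ours:

```python
def next_address(cn, cl_source, status):
    # CN holds the high-order n-1 bits of the n-bit next address;
    # CL selects the source of the low-order bit.
    if cl_source == "0":
        low = 0                     # constant zero
    elif cl_source == "1":
        low = 1                     # constant one
    else:
        low = status[cl_source]     # e.g. "ST1": branch on a status bit
    return cn * 2 + low

status = {"ST1": 1}
next_address(179, "0", status)      # unconditional: location 358
next_address(179, "ST1", status)    # two-way branch on bit ST1
```

Because every microinstruction names its successor this way, a conditional branch costs no extra step and successors may sit anywhere in control memory.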
By providing additional fields in the microinstruction to designate sources for other bits of the
next-instruction-address, a capability of performing
four-way, eight-way, etc., branching may be obtained. For simplicity, only two-way branching is
assumed. However, we do provide a more general
facility for using any input or calculated quantity as
the address of the next instruction. Specifically, if
the CJ field of the microinstruction is 1, the next
address will not be obtained from the CN and CL
fields as described in the preceding paragraph, but
instead will be taken from the A bus. Thus, whatever register is gated onto the A bus by the coding in
the CA field, that register's contents will be taken
as the address of the next instruction. This facility
provides a versatile many-way branch which is useful in command decoding and function generation.
For example, suppose a code has been operated on
arithmetically, and the result, which represents the
starting address of the microprogram corresponding
to this code, has been left in register R1. The following coding will perform the desired branching.
R1    -     -     -     -     1     -     -     -     -
CA    CB    COP   CD    CK    CJ    CN    CL    CS    CX
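In effect, CJ = 1 converts the next-address logic into an indirect jump through whatever register is gated onto the A bus. A minimal sketch of this many-way branch; the address 208 is an invented example of a command sequence's starting location:

```python
# With CJ = 1, the contents of the register on the A bus (here R1) become
# the next microinstruction address: a many-way branch in a single step.
def next_address(cj, a_bus_value, cn_cl_address):
    return a_bus_value if cj == 1 else cn_cl_address

regs = {"R1": 208}              # computed starting address of a microprogram
addr = next_address(1, regs["R1"], 0)
```

This is the mechanism the text later applies to command decoding: an arithmetic result left in R1 directly selects the microprogram to run next.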
In the examples considered above, the microinstructions are shown performing only a single function. In actual use, a single microinstruction will in
general perform a number of functions, within the
constraints imposed by the microinstruction format.
Thus, a single microinstruction will in general perform an arithmetic or logic operation (fields CA,
CB, COP, CD, and possibly CK); set a status register bit (CS field); set external lines (CX field);
and select its successor (CN and CL fields).
The performance of the system described above
is obviously very dependent on the speed of the
control memory in providing microinstruction sequences, as well as the speed of the circuits being
used. In general we can assume that the circuits are
fast enough that the memory will be the limiting
factor. Thus, for high performance, a very high-speed memory is required. However, it is not necessary that the control memory be an ordinary
read/write memory. What is desired is that the
memory be capable of being read very rapidly but
that writing may take place fairly slowly.
In many applications even manually changing the
contents of the memory might be suitable, since it
is possible to rapidly change from one element of a
program to another by branching without shifting
in large blocks of a new program. This means, then, that any fast-read, slow-write memory that is economical might be used. For several years, the literature in computer technology1-5 has described many
such memories in which data can be changed in a
period of minutes or at most hours, whereas the
data reading times may be as short as a few tenths
of a microsecond. In general, such "read-only"
memories have shown savings in cost over their
read/write counterparts in the same performance
range. Should this cost picture change, read/write
memory can be used though adequate consideration
must be given to the system and operational advantages that apply to each.
By using the control memory as described, we
very much limit the data storage capability of the
system as described thus far. At least two simple alternatives are available to provide such a capability. One is to attach a conventional random-access memory directly to the controller bus structure, using one or more of the controller's registers as the "memory address register" (i.e., to hold the address of the location to be read or written); and one or more of the controller's registers as the "memory
data register" (i.e., to hold the words read from or
written to memory). Reading and writing operations would then be effected by appropriate microprogram sequences, much as any other external device is controlled.
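Such a read or write would itself be a short microprogram sequence. Below is a minimal software model of the idea, under the assumption that one controller register serves as the memory address register (here called MAR) and another as the memory data register (MDR); both names are ours, taken from the quoted terms:

```python
# Illustrative microprogram sequence for an attached random-access memory:
# MAR holds the address, MDR holds the data word.
def read_step(regs, memory):
    regs["MDR"] = memory[regs["MAR"]]   # read cycle: MDR <- memory[MAR]
    return regs

def write_step(regs, memory):
    memory[regs["MAR"]] = regs["MDR"]   # write cycle: memory[MAR] <- MDR
    return memory

memory = [0] * 16
regs = {"MAR": 5, "MDR": 99}
write_step(regs, memory)        # store 99 at location 5
regs["MDR"] = 0
read_step(regs, memory)         # fetch it back into MDR
```

As the text notes, the memory is thereby controlled exactly like any other external device, by microprogram sequences rather than dedicated logic.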
The second alternative to providing storage capability is simply to use the memory of a general-purpose computer attached to the controller through
the computer interface which is provided in the
controller design. This method is especially appealing if the computer has input-output "channels"
which can operate independently of the main processor, since in this case the controller can communicate with computer memory without interfering with the main computer program. Given such a
facility, the controller we have described can perform many functions normally considered to be in
the computer's domain, such as limit testing, function generation, data assembly, etc.
APPLICATION TO PEPR
To illustrate the basic concepts of microprogramming, we would like in this section to describe
briefly a typical control application and the manner
in which a microprogrammed controller might be
configured and programmed to handle the job. The
application we have chosen for this purpose is the
PEPR6 film scanning application. PEPR is a computer-controlled CRT scanner used to automatically
measure bubble chamber tracks which have been
recorded on film. The PEPR cathode ray tube defocuses the electron beam into a short line segment
whose angular orientation and location can be independently controlled by the system. Thus, a short
line of light is controlled in angle and position on
the face of the CRT and swept for a short distance
under system control. When this line of light falls
on one of the film tracks, a photomultiplier tube
responds and the position of the beam is recorded
as the time of arrival of the photomultiplier response. This is accomplished by starting a counter
at the same time the line starts to move and remembering the count value when response occurs. These
count values and the associated angle are sent to
memory where subsequent processing will occur.
The angle of the line is changed and the scan repeated
until the entire range of angles specified has been
examined. A similar scan in another small area of
the film is then initiated until, after approximately
500 such cells have been examined, the entire picture has been scanned.
In controlling this scan, the system must specify
the coordinates of the cell center, the range of angles
to be covered, and a few other factors such as sweep
speed, line length, etc. The generation of the line by
the CRT requires a special focusing system and currents that are non-linear functions of the angle
of the line. These nonlinear focusing-current functions M( ) and N( ) must also be supplied by
the system.
A possible configuration for the PEPR system
is illustrated in Fig. 4. The principal elements in
this configuration are:
(a) Scan Table. The scan table contains the cathode ray tube and associated beam control circuits; the film transport equipment; the optics equipment, including the photomultiplier tube and circuits; and the data acquisition registers.
(b) Computer. The computer provides overall direction for the scanning, and performs the logic and arithmetic necessary to correlate isolated "hits" from the scanner into "tracks." Although not shown, the computer configuration would include certain standard peripheral items, including a magnetic tape unit for recording system output.
(c) Controller. The controller provides detailed control of the scan table, in response to commands from the computer.

[Figure 4: schematic showing the film, CRT, and photomultiplier in the scan table, the M-N function outputs, and the input and output paths linking the controller to the computer.]

Figure 4. PEPR system schematic.

Under program control, the computer may issue a command to the controller to initiate a scan of a designated cell on a bubble chamber photograph. The controller responds by issuing the proper signals to the scan table, receiving signals from the scan table, converting these signals into meaningful data, and relaying these data back to the computer.
Command parameters which are available to the computer include

(a) The coordinates of the scan cell center.
(b) The length of the flying line.
(c) The effective excursion of the flying line on a single sweep.
(d) The range of angles to which the flying line will be oriented on successive sweeps.

Data returned to the computer includes

(a) The angle at which one or more hits were detected.
(b) The interpolation count representing the location of each hit.
(c) The identification of which track element detector (TED) detected the hits.
For the sake of illustration, we will assume that
the controller responds to three different commands
from the computer, as follows:
1. Accept Parameters. When the computer
issues this command, it follows the command immediately with a single set of scan
cell parameters, i.e., coordinates of scan
cell center, line length, etc. For simplicity,
we assume that the entire set is transmitted
each time the command is given, and always in a fixed order. These parameters
will then be in effect until the next set is
transmitted.
2. Start Sweep. Using the parameters previously supplied, the controller initiates a
sequence of sweeps, each sweep being
made at an angle one degree greater than
the previous sweep. This continues until
the final angle is reached, or until a hit is
detected at some angle. In either case, the
controller then sends an appropriate interrupt signal back to the computer.
[Figure 5: block diagram of the microprogrammed controller for the PEPR scanner, showing the ALU and its condition lines, the TED ID inputs, the computer channel, the hit counters, the FOCUS and LENGTH DACs, the start-sweep line to the scan table, the control memory, and the control memory address.]

Figure 5. Microprogrammed controller for PEPR scanner.
3. Send Data. This command causes the controller to transmit to the computer the results of the last-detected hit, i.e., the hit
angle, interpolation counts, etc. Again for
simplicity, we assume the entire set of data
is transmitted each time the command is
given and always in the same order.
The computer uses these commands to control
the scanning operation. When data is received from
the controller via the "send data" command, the
computer must make any necessary coordinate conversion, consolidate redundant data and correlate
track data from different cells. The results of this processing and control are track coordinates which
are then recorded on magnetic tape. This data tape
is later processed by a general-purpose computer to
do the physics calculations.
A microprogrammed controller to perform the
above-described functions is shown in Fig. 5. Since
the majority of parameters and data values are . . .
bits or less, we assume a register-ALU-bus width of
. . . bits. The functions performed by the ALU include addition, subtraction and "no-operation," the
latter being used when it is desired to just move a
data byte from one register to another.
The control registers include a single "temporary" register; two channel registers for communication with the computer; and a single status register with bit assignments as follows:
ST0    Not used
ST1    End of sweep; set by scan table
ST2    Hits obtained on sweep; set by scan table
ST3    Controller request; set by controller
ST4    Channel request; set by computer
ST5    Final angle reached; set by controller
ST6    Channel acknowledge; set by computer
ST7    Controller acknowledge; set by controller
MICROPROGRAM CONTROL FOR THE EXPERIMENTAL SCIENCES
Bits ST2 and ST5 are also used directly to initiate interrupts in the computer.

The input group consists of 8 sets of input terminals of 8 bits each. Six of these (B1 through B6) are connected to interpolation counters, while the remaining two (T1 and T2) are connected to the TED indicator logic. The input terminals are connected through control gates to the A bus.

The output group contains 10 registers of 8 bits each. Eight of these registers hold parameters for a given scan cell and are directly connected to the appropriate external device:

Register            Parameter
XY1, XY2, XY3       Cell center coordinates X, Y (12 bits each)
F                   Focus correction
CO                  Interpolation Counter Open Gate
CC                  Interpolation Counter Close Gate
S                   Sweep Speed (Amplitude)
L                   Line Length

The remaining two registers, I and F, hold the initial and final angle, respectively. These registers are not connected to external equipment, but are included in the output group for convenience. All registers in the output group may be connected through control gates to the A bus, the B bus where specified, and to the D bus.

The control memory for this application requires a capacity of at least 256 words and a word length of at least 54 bits. The format of the microinstruction word, together with the various values which may be assumed by its fields, is shown in Fig. 6. Except for the CX field, the meaning of the various fields is exactly as described in the preceding section. For the present application, the CX field would be divided into four subfields:

(a) CX1 provides a one-bit signal to the scan table to start a sweep.
(b) CX2 provides a one-bit signal to gate the CXM and CXN fields (see below) of this microinstruction into a pair of external registers, which in turn drive DAC's in the scan table.
(c) CXM and CXN are each 9-bit fields, containing the values of M and N, respectively, corresponding to a given angle.

The intent of placing M and N into the CX field is to provide a very rapid means of supplying new M and N values on each sweep repetition, as described next.

The operation of the controller is depicted in the flow charts of Fig. 7. The corresponding coding is given in Fig. 8. In its quiescent state, the controller sits in an endless loop waiting for the computer to issue a command. The controller expects one of three commands, as follows:

Command             Code
Accept Parameters   00000001
Start Sweep         00000010
Send Data           00000011
When the presence of a command is detected, the
controller adds a constant to the command to obtain
the address of the next microinstruction. This microinstruction in turn branches the controller to the
sequence corresponding to a given command.
In the "accept parameters" sequence, the controller simply waits for the computer channel to
transmit successive 8-bit bytes. As each byte is
transmitted, the controller deposits it in the appropriate register in the output group.
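This byte-by-byte transfer can be sketched as a loop over the expected parameter registers. The model below is our reconstruction from the flow chart, not the published coding: the register order follows the parameter list, with the initial- and final-angle registers written phiI and phiF (Fig. 6's φI and φF), and a Python list of bytes standing in for successive contents of the CHI channel register:

```python
# Reconstructed "accept parameters" sequence: each arriving byte (the
# hardware would poll ST6, the channel-acknowledge bit, before each one)
# is deposited in the next output-group register, always in a fixed order.
PARAM_ORDER = ["XY1", "XY2", "XY3", "F", "CO", "CC", "S", "L", "phiI", "phiF"]

def accept_parameters(chi_bytes, regs):
    for name, byte in zip(PARAM_ORDER, chi_bytes):
        regs[name] = byte       # deposit CHI into the appropriate register
    return regs

regs = accept_parameters([10, 20, 30, 4, 5, 6, 7, 8, 0, 179], {})
```

Because the order is fixed, no addressing information need accompany the bytes; the microprogram's position in the sequence supplies it.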
The "send data" sequence is similar, except that the controller places the contents of successive input-group registers on the channel and each time waits for the computer channel to acknowledge.
In the "start sweep" sequence, the controller executes a microprogram loop, with each traversal of
the loop corresponding to a different sweep angle.
On each traversal the controller performs the following steps:
(a) Places the current angle (I) on the A
bus and branches to the corresponding
location. In locations 0 through 179 are
stored 180 microinstructions which are
identical except for the values in the CXM
and CXN fields. These values correspond
[Figure 6: microinstruction coding for the PEPR controller. Field widths in bits: CA (5), CB (2), COP (2), CD (4), CK (8), CJ (1), CN (7), CL (3), CS (2), CX1 (1), CX2 (1), CXM (9), CXN (9). The CA and CD fields select registers (XY1, XY2, XY3, F, CO, CC, S, L, φI, φF, TMP, CHI, CHO, B1 through B6, T1, T2); the COP values are NOP, ADD, and SUB; CJ selects between "use CN" and "use A"; the CS values include NOP, 1→ST3, (D=0)→ST5, and 1→ST7; the CL sources include the constants 0 and 1 and status bits ST1, ST2, ST4, ST5, and ST6.]

Figure 6. Microinstruction coding for PEPR controller.
[Figure 7: flow chart of the microprogram. In the idle loop the controller tests ST4 (channel) for a command; when ST4 = 1 it calculates the command pointer address and branches through the command pointer to the Accept Parameters, Send Data, or Start Sweep sequence. The Accept Parameters and Send Data sequences transfer successive bytes, testing ST6 or ST4 before each byte, then branch back to the idle loop. The Start Sweep sequence performs a 180-way branch on the angle, starts the sweep, tests ST1 for end of sweep and then tests for hits, looping until ST5 = 1 (final angle) or hits are found, and finally branches back to the idle loop.]

Figure 7. Flow chart of microprogram for PEPR controller.
to the associated angle, e.g., location 80
contains M(80) and N(80).
(b) Transmits the appropriate M and N to
external registers; subtracts F from I
and sets ST5 to 1 if the result is zero,
i.e., if I = F.
(c) Increments the current angle by one degree; tests ST5 and branches accordingly.
(d) Assuming ST5 = 0 (i.e., I ≠ F), transmits a start sweep pulse to the scan table.
(e) "Waits" for end of sweep.
(f) Tests for hits on the sweep and branches
accordingly.
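Steps (a) through (f) amount to a table-driven loop, which can be modeled in software as below. This is an illustrative sketch only: the M(a) and N(a) values are invented stand-ins for the 180 table entries in control-memory locations 0 through 179, and the hit test is simplified to a fixed set of angles:

```python
# Illustrative model of the "start sweep" loop.
MN_TABLE = [(2 * a, 3 * a) for a in range(180)]   # stand-in M(a), N(a) values

def start_sweep(i, f, hits):
    swept = []                      # (M, N) pairs actually swept
    while True:
        m, n = MN_TABLE[i]          # (a), (b): 180-way branch, emit M and N
        st5 = 1 if i == f else 0    # (b): I - F tested, (D=0) -> ST5
        i += 1                      # (c): advance the angle one degree
        if st5:
            return swept, "final angle"   # ST5 interrupts the computer
        swept.append((m, n))        # (d): start-sweep pulse at this angle
        if (i - 1) in hits:         # (e), (f): wait for end of sweep, test
            return swept, "hits"    # hits found: interrupt the computer

swept, reason = start_sweep(0, 3, hits=set())
```

Each traversal of the Python loop corresponds to one pass through the six microinstructions of the sweep loop.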
Assuming no hits were detected, step (f) would
branch back to step (a), and the loop would be repeated. If hits were detected, the controller would
[Figure 8: microprogram for the PEPR controller, listing for each control-memory location the CA, CB, COP, CD, CK, CJ, CN, CL, CS, CX1, CX2, CXM, and CXN field values, with remarks. Locations 200-205 hold the idle loop, which tests ST4 and branches to the "Accept" (location 208), "Start" (location 230), or "Send" (location 238) sequence. Locations 208-227 hold the Accept Parameters sequence, gating successive CHI bytes into XY1, XY2, XY3, F, CO, CC, S, L, φI, and φF, with 1→ST3 set for each byte. Locations 0 through 179 form the MN table, whose CXM and CXN fields hold M(0), N(0) through M(179), N(179).]
has been reached, the controller returns to its quiescent state to await further commands. The computer may at this point request data; or it may transmit new parameters; or, it may cause scanning to
resume at the angle one greater than that at which
hits were last obtained, simply by issuing a "start
sweep" command.
In the PEPR application, the cycle time of the
control memory would ideally be selected so that
the loop time of the controller exactly matched the
sweep time of the scan table, i.e., so that neither
device waited for the other. According to Fig. 7,
the controller executes six steps in the loop. Thus, a
sweep time of 10 microseconds would require a control memory cycle of no greater than 10/6 = 1.67
microseconds. Such cycle times are readily available
with current technology.
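The budget is simply the sweep time divided by the six loop steps, one microinstruction per step:

```python
# Cycle-time budget for the control memory in the PEPR application.
sweep_time_us = 10.0        # sweep time of the scan table, microseconds
steps_per_loop = 6          # microinstructions per sweep-loop traversal
max_cycle_us = sweep_time_us / steps_per_loop   # ~1.67 microseconds
```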
REMARKS
In this paper we have described a different approach to controller design, whose salient aspect is
the use of microprogram sequence in place of conventional wired logic. The implications of microprogrammed control can perhaps be better understood by considering briefly the similarities and
differences between two kinds of controllers for the
PEPR application: the microprogrammed controller
we have just described, and a controller implemented in the conventional manner.
From the standpoint of the hardware (registers,
gates, etc.) required to hold the data coming from
the scanner and computer, the two kinds of controller appear to be roughly equivalent. For exampI, the conventional controller would have to have
two registers for holding the current and final angles, a counter for incrementing the current angle,
and a comparator for comparing the current angle
against the final angle. In the microprogrammed
controller, the corresponding hardware is to be
found in two registers for holding the current and
final angle; and an ALU for incrementing and testing the angle. The specific manner in which this
"common" hardware is controlled, of course, is
much different in the two kinds of controllers.
Another significant difference is in the manner
in which the M and N functions are generated. In a
conventional controller this would be accomplished
with a nonlinear function generator, whereas in the
microprogrammed controller, the function is stored
directly in control memory. The latter is obviously
an advantage if it is ever necessary to modify the
function.
Another difference is found in the generality of
the ALU in the microprogrammed controller. A
conventional controller would not be built with
more arithmetic capability than absolutely required
by the application, whereas the microprogrammed
controller, through its very simple ALU and data
flow characteristics, has theoretically unlimited arithmetic capability. The possibility therefore exists
of extending the controller's functions (e.g., adding
or subtracting constants to data as they are transferred to or from the computer) without loss of
time or additional cost.
We have provided a detailed discussion of one
application, but several others are immediately apparent. Control of any CRT scanner would be similar to the PEPR scanner. In other data collection
systems, counters, pulse height analyzers, telemetry
converters, etc., may all serve as sources of input
data. The controller output registers may be used
for a variety of external control purposes as well as
data sources for display and printing devices. A
wide and dynamic range of control can be provided
by a single hardware complex by means of the "personality" provided by the control program. The
simultaneous control of different experiments, for
example, would be possible simply by incorporating
separate programs within the control memory.
While communication line switching and exchange
and multiple data channel control are potential applications, preliminary analysis has indicated that
expansion of some of the basic concepts presented
may be desirable.
The microprogrammed controller appears to offer
significant advantages in design simplicity and flexibility, with respect to both the functions performed
(as determined by the microprogram) and the particular equipment to be controlled (as determined
by the input/output register configuration). For
this reason, we feel that the approach is well suited
to the laboratory environment, where changing requirements must be accommodated with a minimum of confusion and cost.
ACKNOWLEDGMENT
The authors would like to acknowledge the initial suggestion made by Dr. Horace P. Flatt that we
explore the use of microprogram control for scanning systems and his continued encouragement of
this effort. We also appreciate the courtesy and help
given by Drs. Irwin Pless, Horace Taft and Arthur
Rosenfeld toward understanding the PEPR system.
MICROPROGRAM CONTROL FOR THE EXPERIMENTAL SCIENCES
REFERENCES
1. D. H. Looney, "A Twistor Matrix Memory
for Semi-Permanent Information," Proceedings of
the Western Joint Computer Conference (1959).
2. J. H. DeBuske, J. Janik and B. H. Simons,
"A Card Changeable Non-Destructive Readout
Twistor Store," ibid.
3. H. R. Foglia, W. L. McDermid and H. E.
Petersen, "Card Capacitor-A Semi-Permanent Read Only Memory," IBM Journal (Jan. 1961).
4. T. Ishidate, S. Yoshizawa and K. Nagamori,
"Eddycard Memory-A Semi-Permanent Storage,"
Proceedings of the Eastern Joint Computer Conference (Dec. 1961).
5. J. M. Donnelly, "Card-Changeable Memories," Computer Design (June 1964).
6. I. Pless et al., "A Precision Encoding and Pattern Recognition System (PEPR)," 1964 International Conference on High Energy Physics.
PICOPROGRAMMING: A NEW APPROACH TO INTERNAL COMPUTER CONTROL
B. E. Briley
Automatic Electric Research Laboratories
Northlake, Illinois
INTRODUCTION

The central processors of conventional computers may be roughly divided into two sections, an arithmetic section, which performs operations analogous to arithmetic upon representations of numbers, and a control section, which produces essentially a sequential group of gating pulses to accomplish the desired manipulation in the arithmetic section.
The arithmetic section lends itself admirably to modularization because of its repetitive structure. It is relatively easy to design and diagnose. The control section, however, has stoutly resisted similar treatment because it conventionally consists of an ensemble of special logic arrangements which differ, and, therefore, do not lend themselves to modularization on a logic level. This section is more difficult to design, and if the control malfunctions, an attempt at self-diagnosis by a typical machine may be roughly compared with asking an insane psychiatrist to analyze himself.
Described herein is a new approach to the design and realization of a control section which is modular by nature, simple to design and diagnose, and flexible to a unique degree.

CONVENTIONAL CONTROL SECTIONS

Wired In

A conventional control section decodes an instruction to ascertain which of a prewired set of control pulses must be dispatched to the arithmetic unit. In addition, it performs certain housekeeping duties, such as incrementing the instruction location counter.
Most of the housekeeping is performed for all instructions, and the housekeeping hardware is shared by them for economic reasons. Similarly, like portions of different instructions are often realized with the same piece of equipment. This design technique is very desirable from the point of view of economics, but it makes the machine more prone to total failure if a single element malfunctions.
The conventional control is totally wired in, rendering it quite inflexible. Any afterthought alteration of the instruction set requires a "soldering iron" approach.

Microprogrammed

The microprogramming approach is a definite step forward in increasing the flexibility of a machine. A microprogrammed control section utilizes a macroinstruction to address the first word of a series of microinstructions contained in an internal, comparatively fast, control memory. These microinstructions are then decoded much as normal instructions are in wired-in control machines, to initiate production of (in general) a sequential series of pulses to control the arithmetic section.1
The microinstructions are generally relatively weak, but the macroinstructions, calling upon subroutines of microinstructions, can be quite powerful.
Since these subroutines are alterable, the nature
of macroinstructions is very flexible, limited only
by the microinstruction capabilities which are,
however, fixed by wiring.
Unfortunately, the portion of the control which
handles the microinstructions suffers the same lack
of modularizability as the totally wired-in system.
PICOPROGRAMMED
Philosophy
Consider the control wires passing between the
control section and the arithmetic section; these
might number 100 in a machine of moderate size.
If the signals which appear on these lines are examined during the execution of a typical instruction, it
will be found that relatively few are activated. Further, for most instructions, the number of pulses
which are produced on any one line is quite small
(there is, of course, a small class of cyclic instructions typified by SHIFT N, which require long
trains of pulses; these will be discussed separately).
Understanding the picoprogramming technique
requires the recognition of a correspondence between the pulse-programming requirements of a
control section and the capabilities of a memory
element known as MYRA.2 A MYRA memory element is a MYRiAperture ferrite disk which, when
accessed, produces sequential trains of logic-level
pulses upon 64 or more otherwise isolated wires.
The relative width and timing positions of the
pulses on the various wires are trivially adjustable,
not only with respect to other pulses on the same
wire, but with respect to pulses on the other wires
associated with the same disk.
Thus, each memory element is capable of directly
providing the gating pulses necessary to execute an
instruction. A picoprogrammed system, then, consists essentially of an arithmetic section and a modified MYRA memory. A macroinstruction merely
addresses an element in the MYRA memory; when
the element is accessed, it produces the gating signals which cause the arithmetic unit to perform the
desired functions. In addition, it provides gating
pulses which fetch the operand (if needed), increment the control counter, and fetch the next instruction. Thus, the housekeeping is distributed upon the
disks, so that each instruction is essentially independent of the others.
No clock is needed because each disk, as it completes its switching, causes the next instruction to
be obeyed (i.e., the next (or the same) element to
be addressed). Thus, the machine is not synchronous. On the other hand, it does not have the generally accepted earmarks of an asynchronous machine.
Therefore, a new term has been coined to categorize this species of system: Autochronous (or self-timed).
As a consequence of autochronous operation, if
the driving mechanism for a particular disk should
fail, the machine will halt upon attempting to obey
the corresponding instruction, rather than continuing to perform incorrect calculations as might a
clocked machine when an instruction malfunctions.
It will be seen that the picoprogramming scheme
may be viewed as a logical extension of a microprogrammed system, but on a more basic level. The
required instantaneous levels on all the gate leads
may be considered as bit values of a picoinstruction
which has a word length equal to the number of
gate leads. Picoinstructions are stored at constant
radii upon a MYRA disk, in the proper order to
perform the desired task. The advantages of the
MYRA element are that the picoinstructions are
automatically accessed in sequence (without the necessity of a picoinstruction counter), and successive
ones or zeros in the same bit position are automatically "slurred" together, so that, for example, two successive ones produce a pulse of duration twice that of a single one preceded and succeeded by zeros. Thus, race conditions and static hazard difficulties are easily avoided.
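The radial layout of picoinstructions and the "slurring" of adjacent ones can be sketched in a few lines of code. This is an illustration, not taken from the paper: the representation of picoinstructions as bit tuples is an assumption, and the 250 ns unit is borrowed from the pulse widths quoted later for the MYRA element.

```python
# Sketch (hypothetical representation): deriving per-lead pulse trains
# from a column of picoinstructions, with successive ones "slurred"
# into one longer pulse, as the MYRA disk does automatically.

def pulse_trains(picoinstructions, unit_ns=250):
    """picoinstructions: list of equal-length bit tuples, one per radius
    (innermost first).  Returns, for each gate lead, a list of
    (start_ns, duration_ns) pulses."""
    n_leads = len(picoinstructions[0])
    trains = [[] for _ in range(n_leads)]
    for lead in range(n_leads):
        start = None
        for step, word in enumerate(picoinstructions):
            if word[lead] and start is None:
                start = step                      # a pulse begins
            elif not word[lead] and start is not None:
                trains[lead].append((start * unit_ns, (step - start) * unit_ns))
                start = None                      # a run of ones ends
        if start is not None:                     # pulse runs to the rim
            trains[lead].append((start * unit_ns,
                                 (len(picoinstructions) - start) * unit_ns))
    return trains

# Two successive ones on lead 0 merge into a single 500 ns pulse:
picos = [(1, 0), (1, 1), (0, 1), (0, 0)]
print(pulse_trains(picos))    # [[(0, 500)], [(250, 500)]]
```

The merging of adjacent ones is exactly the race-free "slurring" the text describes: no picoinstruction counter is needed, and a level that stays up across two radii never glitches low between them.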
Advantages
The advantages of a picoprogrammed system are
as follows:
1. Tailorability: Since the instructions are
in effect memory elements they can be
plugged in anywhere, and only their address (order code) changes; a wide range
of instructions (greater than the number
which the system can accommodate) can
therefore be offered. Thus, for example, a
customer could choose any 64 of perhaps
200 available instructions. This produces
about 1.6 x 10^53 combinations (in practice, since some software, e.g., an assembler, is usually desired, a standard nucleus of instructions might be provided, and the freedom of choice would be somewhat reduced; even so, it should be possible to offer a range of machines extending from highly bit-manipulative to highly computational with the same mainframe).
2. Post-Alterability: As a corollary to item 1, a machine in the field can be altered easily if the customer decides later that his original choice of instruction set is no longer optimum.
3. Graceful Failure: Because each instruction is independent, the failure of one will not affect others. Thus, catastrophic (nothing works) failure should become a rarity.
4. Diagnosable: Because of item 3, it should nearly always be possible to successfully use some diagnostic routine. In addition, the unique modularity of the control section makes localization of a control failure easy.
5. Easily Designed: Disk wiring follows directly from the required timing diagram.
6. Housekeeping Processes: May take place anytime during the instruction execution, allowing optimum sequencing.
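The combination count quoted under Tailorability is easy to check: choosing any 64 of 200 available instructions gives C(200, 64) selections. A quick exact computation puts the count near 1.7 x 10^53, in line with the text's rounded figure:

```python
# Checking the combination count quoted under "Tailorability":
# choosing any 64 of perhaps 200 available instructions.
from math import comb

ways = comb(200, 64)          # exact integer arithmetic
print(f"{ways:.2e}")          # about 1.7 x 10^53; the text rounds this
                              # to "about 1.6 x 10^53"
```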
THE MYRA ELEMENT

General

[Figure 1 shows a MYRA disk with one picoprogrammed winding. Labels: DRIVE LINE, OUTPUT, PROPAGATING FLUX CHANGE WAVE at t = t0; THICKNESS: 0.060"; aperture dimension 0.015"; WIRED CODE: 01100100 (ith bit of 8 picoinstructions).]

Figure 1. MYRA disk showing one picoprogrammed winding.

A brief description of the operation of the MYRA element is in order. Already applied as a semiconventional memory element,2 where its natural propensity to produce sequential trains of pulses is, to some degree, circumvented, this ability is instead capitalized upon in the picoprogramming approach.

When a step of voltage is applied across a drive winding passing through the central aperture of a square-loop apertured ferrite disk in the proper remanent flux state, a current ramp will result, and a propagating flux change wave will nucleate at the central aperture and propagate outward at a uniform velocity. As this wave traverses a portion of material singly looped by a conductor, it produces across the conductor an emf which, in a properly proportioned and constituted disk, is large enough to drive logic circuits directly. This voltage pulse is proportional in temporal length to the physical length of the looped portions of the disk. Its position in time relative to other looped portions of the disk is likewise related directly to the relative physical positions of the looped portions as shown in Fig. 1.2
When the disk is reset by a drive and current
in the opposite sense, a flux change wave in the opposite sense nucleates at the center (as before) and
propagates outward, producing an emf in the opposite sense at the outputs of each looping conductor.
If, as is usually the case, the logic elements driven
are sensitive to only one polarity of input voltage,
they will react only to the train of pulses produced
during say, set, and not to their mirror image about
ground produced during reset. However, other
windings can produce the proper polarity during
reset and be ignored during set. Thus, the reset time
is not wasted, for useful pulses differing from (or
identical with) those during set can be produced.
This is somewhat analogous to playing both sides of
a phonograph record.
Pulse widths as brief as 250 ns can be produced
directly, and 125 ns pulses are obtainable from a
disk with phased radii. The output impedance of
the disks is less than 10 ohms, so that many gates
may be driven without buffering, and even coaxial
cable can be driven directly.
System Considerations
Many instructions require a waiting period in the
midst of their execution (e.g., for an operand fetch
from memory). It is easily possible to accommodate such delays between the set and reset of a disk.
Instructions also differ in their total time of execution. While it is possible to force all instructions
to occupy the same time (dictated by the lengthiest), it is more efficient to allow differing execution times. This is effected by making use of disks
which either physically or operationally differ in
dimensions; the operationally small disks are of the
same physical dimensions, but are only partially
switched.
WORKING PROTOTYPE
General Description
Figure 2. ADD instruction card.
A feasibility model prototype dubbed PUPP
(Prototype Utilizing Picoprogramming) was constructed. Its instruction repertoire includes:
1. ADD - add contents of addressed memory location to contents of accumulator.
2. STR - store contents of accumulator in addressed memory location.
3. SFT - (a) Right: shift contents of accumulator one binary place to the right; (b) Left: shift contents of accumulator one binary place to the left.
4. SKP - skip next instruction on non-zero accumulator.
5. PCH - punch (on paper tape) contents of accumulator.
6. NOP - perform no operation, proceed to next instruction.
7. HLT - cease activity.
Only a four-instruction repertoire can be accommodated at once, one of them necessarily HLT.
Each instruction is implemented in a single pluggable circuit card as shown in Fig. 2 and the instruction cards are completely interchangeable; the same
instruction in a different location performs in exactly the same fashion, the only difference being
that its order code (i.e., its address) changes.
HLT is a special case: the address corresponding
to its order code is an empty location. When an attempt is made to access an empty location, all activity ceases, and the machine halts. The cessation
of activity is quite literal because of the absence of
a clock.
All flip-flops in the system are of the set-reset
variety. Thus, the double rank instruction location
counter requires four timed pulses for address incrementation; similarly, the accumulator is double
rank. Thus, the system would be considered four-phase if a clock were used.
Memory is provided by flip-flop registers. The
word length is a modest four bits, but the control
signals are essentially identical with those necessary
for a full-size machine.
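The PUPP repertoire is small enough that its architectural behavior can be sketched as a toy interpreter. The encoding below is illustrative only (mnemonic opcodes with an operand field, SFT split into SFTR/SFTL for simplicity, 4-bit accumulator wraparound); in the real machine an order code is simply a disk address, and HLT is an empty location.

```python
# Toy interpreter for PUPP's instruction repertoire (encodings and word
# layout are hypothetical, not taken from the hardware).

def run(program, memory, max_steps=1000):
    """program: list of (opcode, operand) tuples; memory: dict of
    4-bit words.  Returns (accumulator, punched output)."""
    acc, pc, punched = 0, 0, []
    for _ in range(max_steps):
        if pc >= len(program):            # empty location: machine halts
            break
        op, addr = program[pc]
        pc += 1
        if op == "ADD":
            acc = (acc + memory.get(addr, 0)) & 0xF   # 4-bit accumulator
        elif op == "STR":
            memory[addr] = acc
        elif op == "SFTR":
            acc >>= 1                     # shift right one binary place
        elif op == "SFTL":
            acc = (acc << 1) & 0xF        # shift left one binary place
        elif op == "SKP":
            if acc != 0:
                pc += 1                   # skip next instruction
        elif op == "PCH":
            punched.append(acc)
        elif op == "HLT":
            break                         # NOP falls through to the next
    return acc, punched

acc, out = run([("ADD", 0), ("SFTL", 0), ("PCH", 0), ("HLT", 0)], {0: 3})
print(acc, out)   # 6 [6]
```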
Instruction Implementation
The implementation of a typical instruction,
ADD, will be discussed.
The pulses which this and all instructions must
provide are those necessary for housekeeping, that
is, incrementing the instruction location counter,
fetching the next instruction, and providing a pulse
which causes the next instruction to be obeyed (addressed). (See Fig. 3.)
PICOPROGRAMMING: NEW APPROACH TO INTERNAL COMPUTER CONTROL
[Figure 3 shows the ADD timing diagram: pulse trains on control lines including SET, RESET, CLEAR AUX. COUNT REG., GATE UP COUNT, CLEAR INSTRUCTION REG., FETCH & GATE INSTRUCTION, SET COMPLETED, RESET COMPLETED, SUSTAIN SET, SUSTAIN RESET, CLEAR COUNT REG., GATE COUNT, CLEAR AUX. REG., GATE UP ACCUMULATOR, CLEAR ACCUMULATOR, GATE ADDER TO ACC., and GATE AUX. REG. TO ADDER.]

Figure 3. Pulse trains produced by one picoprogrammed MYRA disk to execute ADD (250 ns/division).
In addition to the above, there are some housekeeping pulses which are unique to the picoprogrammed implementation; among these are the sustainers, which are logically combined with a test
pulse to assure the health of the driving circuits independent of the remanent state of the accessed
disk. Normally, an accessed disk will always be in
the proper remanent state to provide a relatively
high impedance to the driving circuit. If, however,
a malfunction (e.g., a power failure) should upset
this arrangement, a driving circuit might attempt to
switch an already-switched disk, and endanger itself; to prevent this, a narrow test pulse is first applied, then, if the device is in the proper state, the
sustainer pulse sustains the drive; otherwise the
drive ceases with no damage done. SET COMPLETED and RESET COMPLETED pulses are also provided to inform the circuitry that the disk has finished setting or resetting, respectively.
Each instruction disk must, of course, provide
pulses (in addition to housekeeping) which are
unique to it; these are shown in Fig. 3 for ADD.
This timing diagram, which in a conventional
sequential circuit control design would mark the
beginning of the design problem, marks instead the
completion of the picoprogramming design problem. This is true because the timing diagram is essentially identical to a cross-section of a wired
disk, with a correspondence between time and radial position, and voltage and axial position of the
wire.
It is, of course, understood that a control wire
which performs some function such as Clear Accumulator will loop some portion (or portions) of all
those disks which require clearing of the accumulator. Since only one disk can ever be in the act of
switching, ORing is performed by this common wire;
this wire plays the same role as a sense wire in a
conventional core memory, except that it fails to
loop all memory locations (in general).
In Fig. 2, the windings for the ADD instruction
can be seen. The two multiple turn windings are the
drive windings, one used for setting, the other for
resetting. These are the only multiple turn windings
necessary. Note how few of the radial sets of apertures are populated with wires.
The instruction card is constructed with the
heavy current drivers upon it to minimize the areas
of high-current loops. This has the added advantage of making each instruction autonomous with
respect to failure because any component on the
card can fail without disturbing the remainder of
the machine.
The disks could be driven on a matrix basis
(though, of course, not by coincident current), but
the economic saving would not be significant, the
noise problem could become severe, and the localization of failure advantage would be lost.
Results
PUPP has logged over 3,000 hours of successful
operation running a Markov Chain program at
speeds not less than 200,000 instructions per second, and up to 300,000 instructions per second.
Complete interchangeability of instructions is
demonstrably realized and electrically as well as
physically smaller disks are successfully employed.
Self-induced noise is a non-existent problem;
instruction cards reside comfortably beside logic
cards.
Forced air cooling is used in the instruction card
area.
A later instruction card design is shown in Fig. 4.

Figure 4. Improved instruction card.

Cyclic Instructions

There is a relatively small class of instructions such as Multiply, Divide, Shift N, etc., which are cyclic in the sense that the same sets of pulses must be made available repetitively. These instructions can be handled most easily by a set of three disks per instruction. The first (Set Up) disk performs the set-up functions, indexing, fetching the operand, placing it in the proper register, and setting certain count flip-flops; it then addresses the second disk. The second disk performs one cycle of the operation, decrements an operation counter and tests for completion of the operation; this is the Cycler disk. This disk readdresses itself if the test indicates that the operation is incomplete; when the disk has cycled a sufficient number of times to complete the operation, it addresses the third disk. The third, or Clean Up, disk performs the remaining housekeeping operations: fetching the next instruction, etc. In addition, it performs any operations necessary for, and unique to, the completion of the instruction.

One disk could suffice for realization of a cyclic instruction, but it would require extensive inhibition logic to prevent the start-up and clean-up pulses from occurring during the cyclic portion of the operation, etc.

FUTURES

Picoprogramming should be applicable to a wide range of computer sizes because of its mix of advantages. It should make a stored program approach feasible for very small systems because of its economic advantages; it should substantially tilt the rule-of-thumb balance between the costs of a processing unit and its control in such systems. Its flexibility and diagnostic advantages should make it attractive in rather large systems as well, particularly in multiprocessor arrangements.

ACKNOWLEDGMENTS
I wish to acknowledge the encouragement of this
study by J. E. Fulenwider and E. L. Scheuerman,
the cooperation of Dr. M. E. Dempsey and R. J.
Nin, and the very able services of J. R. Holden.
REFERENCES
1. For example, D. Fagg et al., "IBM System/360 Engineering," Fall 1964 Joint Computer Conference, Vol. 26, Part I, Spartan Books, Inc., Washington, D.C., 1964.
2. B. E. Briley, "MYRA: A New Memory Element and System," Proc. 1965 Intermag Conference.
PRECESSION PATTERNS IN A DELAY LINE MEMORY
Stanley P. Frankel
Los Angeles, California
and
Jorge Hernandez
SCM Corporation
Oakland, California
INTRODUCTION

The SCM COGITO-240 is an electronic desk calculator which makes use of one sonic (magnetostrictive) delay line as its primary memory element. Some 480 bits of information are held in the delay line circulation pattern. These are represented in a Pulse-No Pulse code; the insertion of a pulse into, or its emergence from, the delay line at a particular moment indicates the value one for the corresponding information bit. The absence of that pulse represents the value zero. For a memory unit of this type it is convenient to recirculate information at a rate which is of the order of 10^6 bits per second. Thus a convenient value for the delay time of the line, and for the time of one complete recirculation of the stored information, is about one-half millisecond.

In the development of the COGITO design it proved preferable to make use of a rate of handling of information, as for example in the performance of an arithmetic operation, which is substantially smaller than the 10^6 bits per second recirculation rate. One reason for this preference is as follows:

The circulated information bits compose 120 "characters" of 4 bits each. (Most of these are decimal digits.) These form three "visible" registers called K (for "keyboard"), Q ("quotient"), and P (the double-length "product") register. Associated with each of these is a storage register of equal capacity, which is not displayed. Many of the operations of the calculator involve the transfer of the content of one register to another; from one visible register to another or from a visible to a storage register or conversely. These operations are facilitated by increasing the time of handling of each bit, hence the time in which it is conveniently available for such exchange processes, to cover the period in which all possible exchange partners pass through the delay line circuitry. The use of precession in the COGITO memory was, in part, motivated by this facilitation of the transfer operations.

It proved possible to provide the COGITO memory with a precession pattern which divorces the rate of information handling in arithmetic processes from the bit transmission rate of the delay line, and thereby to permit choosing each of these rates to fit the convenience of its associated circuitry. It is the purpose of this report to describe that precession pattern and also several simpler patterns which provide some, but not all, of the desired properties. Many details of the system chosen were motivated by unusual aspects of the COGITO design and are not likely to be of widespread interest. They are not discussed here.
TIMING CHAIN SYNCHRONIZATION
A series of flip-flops, the timing chain, serves to
count the successive one-microsecond-long time
intervals in which successive information bits are
delivered to and received from the delay line. The
timing chain is driven by a free-running oscillator, the "clock."
If the delay line were to be used merely to recirculate without change the 480 bits of stored information, then the task of the timing chain would be
merely that of subdividing one "memory cycle";
that is, the period of time required for one recirculation of the stored information. That is, in fact,
the task of an early part of the timing chain which
serves to distinguish one from another of approximately 480 "clock periods" into which a memory
cycle is divided. By reason of the precession system
a longer period of time, called a "machine cycle,"
becomes significant. The later part of the COGITO
timing chain serves to count the 60 memory cycles
in each COGITO machine cycle.
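The counting job of the timing chain can be expressed arithmetically, using the figures from the text: approximately 480 clock periods per memory cycle and 60 memory cycles per COGITO machine cycle. The function below is a sketch of that bookkeeping, not a model of the flip-flop chain itself:

```python
# Sketch of the timing chain's counting job (figures from the text:
# ~480 clock periods per memory cycle, 60 memory cycles per machine
# cycle), modeled as a pair of nested counters.

CLOCKS_PER_MEMORY_CYCLE = 480
MEMORY_CYCLES_PER_MACHINE_CYCLE = 60

def timing_chain(total_clocks):
    """Return (machine_cycle, memory_cycle, clock_period) reached
    after total_clocks clock pulses."""
    clock_period = total_clocks % CLOCKS_PER_MEMORY_CYCLE
    memory_cycles = total_clocks // CLOCKS_PER_MEMORY_CYCLE
    memory_cycle = memory_cycles % MEMORY_CYCLES_PER_MACHINE_CYCLE
    machine_cycle = memory_cycles // MEMORY_CYCLES_PER_MACHINE_CYCLE
    return machine_cycle, memory_cycle, clock_period

print(timing_chain(480 * 60))   # (1, 0, 0): one full machine cycle
```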
In a delay line memory of this kind there arises a
problem of synchronization; that is, of ensuring that
information-bearing pulses emerge from the delay
line in an accurately controlled phase relationship
with the clock oscillation. A straightforward way of
ensuring synchronization is to impose rigid control
on the frequency of the oscillator and on the delay
time of the line, and to adjust one or the other of
these parameters so as to bring about the desired
phase relationship. As a measure of the necessary
rigidity of control it may be noted that in a machine like COGITO a long-term drift in either
parameter of 0.1 percent would be intolerable.
Hindall1 has described a method of synchronization which obviates the need for rigid long-term stability of these parameters. He uses a delay time,
and therefore also a memory cycle length, which is
substantially longer than the time required for the
insertion of the entire body of stored information
into the delay line. In each memory cycle a "marker
pulse" which is distinguishable (for example, by
1965
greater magnitude) from the information pulses is
set into the line before the insertion of the stored
information. After the entire block of stored information has been received from, and reinserted into,
the delay line there occurs a "silent period" in
which no further information is received from the
line. During the silent period the clock oscillator is
disabled; that is, its oscillation is suppressed. The
emergence of the marker pulse from the delay line,
somewhat later, marks the end of the silent period
and brings about the release from inhibition of the
clock oscillator. The timing of the succeeding activities is controlled by the now-enabled clock.
These succeeding activities are: the insertion into
the line of a new marker pulse, the receipt from and
the reinsertion into the line of the pulses representing stored information, the disabling of the oscillator
for the following silent period, etc.
COGITO makes use of a method of synchronization which is distinguished from that of Hindall in
that the marker pulse does not differ from the information pulses in magnitude or the like. Rather it
is recognized as the marker pulse by reason of its
emergence from the delay line during a silent period. That is, the first pulse to emerge after the clock
has been disabled is accepted as the marker pulse
and terminates the period of inhibition of the clock
oscillator.
The silent period provides a convenient reference
point for the description of the memory cycle. In
the following the term "memory cycle" will be used
to refer to a period of time which begins in one,
and ends in the next succeeding, silent period.
With either the Hindall or the COGITO method
of synchronization the oscillator frequency and the
delay time of the line may, without harm, drift
gradually; the duration of the silent period will
change continuously to accommodate these drifts. (It
must not, of course, be allowed to shrink to zero.)
The possibility of continuous change in the length
of the silent period, consistent with the desired synchronization, arises from the suppression of oscillation of the clock. During the silent period all phase
relations from the previous memory cycle, in which
a marker pulse and the block of information pulses
were inserted into the line, are forgotten. After the
silent period the phase of oscillation of the clock is
determined by the time of emergence of the marker
pulse and is thus consonant with the times of emergence of the information pulses. Although these two
methods of synchronization provide tolerance of
gradual changes in the two parameters discussed, a
sudden change in either, that is, a substantial
change occurring within one memory cycle, would
still lead to malfunction. Fortunately, such sudden
changes are much more easily prevented than are
long-term drifts.
PRECESSION PATTERNS
In a simple recirculating memory using the COGITO synchronization the duration of the memory cycle is equal to the delay time of the line, together with its associated circuitry, since each pulse is
reinserted into the line simultaneously with its emergence. The word "simultaneously" must not be
interpreted very literally, since the time of traversal
of the associated circuitry is substantial. More precisely: the recognition of an emerging pulse permits
the introduction into the line of a pulse which is of
well-standardized magnitude, duration, and phase
with respect to the clock oscillator. Figure 1 shows
the simple recirculation without precession of a
marker pulse and a group of information pulses,
with the emergence and reinsertion shown as
"simultaneous" in this conventionalized sense.
[Figure 1 shows the clock oscillator waveform with its silent period, and the marker pulse and information bits s1, v1, s2, v2, ..., s240, v240 as they emerge from and are reinserted into the delay line.]

Figure 1. Recirculation without precession.
One-half of the 480 bits of memory held in COGITO, called "V-bits," represent the numbers
held in the 3 visible registers while the remaining
240 bits, called "S-bits," form the storage registers. One S-bit and the corresponding V-bit
form a "bit-pair." It proves convenient to handle
the two bits of a pair together, for the most part,
and to make them available for manipulation over
periods of time considerably longer than a few microseconds. A simple way in which that can be
done is illustrated in Fig. 2. The information bits
which emerge from the delay line in the first memory cycle are named
s1, v1, s2, v2, ..., s239, v239, s240, v240
in the order of their appearance following the
marker pulse. The first two bits, s1 and v1, form
one bit-pair; the following two another pair, etc.
As the first two bits emerge they are captured in
two flip-flops, S and V respectively, and are not
simultaneously reinserted into the delay line. The
following 478 bits are reinserted immediately upon
emergence. Following the insertion of bit v240, the bit (s1) held in S is inserted, and after it the bit (v1) is inserted into the line from flip-flop V. Only then is the clock oscillator inhibited in order to begin a silent period. The marker pulse which emerged immediately before s1 was not reinserted immediately but was inserted only at the time of emergence of the second information bit, v1. In
that way the introduction of a gap between the
marker pulse and the first information bit is avoided. In the second memory cycle the information bits
emerge in the sequence
s2, v2, s3, ..., v239, s240, v240, s1, v1
and the first two bits, s2 and v2, are captured in S
and V and are held for later reinsertion in the same
way as the first pair was earlier, etc. It will be seen
that in each memory cycle the sequence of 240
bit-pairs is cyclically permuted and that after 240
memory cycles the original sequence has been restored.
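The cyclic permutation just described is easy to verify in simulation. This is a sketch: bit-pairs are modeled as labeled tuples, and the capture into flip-flops S and V followed by late reinsertion is modeled as a list rotation.

```python
# Sketch of the one-group precession of Fig. 2: each memory cycle the
# first bit-pair is held out in flip-flops S and V and reinserted at
# the end, cyclically permuting the 240 pairs.

pairs = [(f"s{i}", f"v{i}") for i in range(1, 241)]   # 240 bit-pairs

def memory_cycle(seq):
    held = seq[0]               # captured in flip-flops S and V
    return seq[1:] + [held]     # reinserted after the last pair

state = pairs
for cycle in range(240):        # one machine cycle = 240 memory cycles
    state = memory_cycle(state)
print(state == pairs)           # True: original sequence restored
```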
[Figure 2 shows the first and second memory cycles: the clock oscillator with its silent periods, and the pulse trains emerging from and reinserted into the delay line, with the first bit-pair of each cycle held back in S and V and reinserted at the end.]

Figure 2. A one-group precession pattern.

Figure 2 has been drawn so as to emphasize another feature of this simple precession pattern: the duration of the memory cycle is greater by two clock periods than that shown in Fig. 1. It is also to be noted that the "early part" of the timing chain
must now distinguish 482 rather than only 480
clock periods following the appearance of the marker pulse. Atypical activities occur in the first two and in the last two of these.
In the system of Fig. 2 each V-bit is held in flip-flop V throughout one memory cycle (perhaps excepting the silent period) and is available there for leisurely manipulation, and one S-bit is similarly held in S. After one machine cycle, consisting of 240 memory cycles, all information bits have thus been held and the original bit-configuration has been restored. It did not prove convenient to use in COGITO a machine cycle quite so long as 240 memory cycles (about one-eighth of a second) and therefore a slightly more complex precession pattern was considered.
A further classification of the information-bits
held in the COGITO memory must now be described. Most of these bits represent the decimal
digits which constitute numbers held in the various
registers. Each decimal digit is represented by four
bits, called t1, t2, t3, t4 in order of increasing significance. (A simple 1,2,4,8 BCD representation is used.) It therefore proves convenient to organize all other information held (decimal point positions, plus or minus signs, etc.) in similar 4-bit characters. Thus the entire body of stored information may be divided into 4 equal parts: a group of 120 bits (60 bit-pairs) which are t1-bits, 120 t2-bits, etc. The
transfers of numbers from one register to another respect this separation into four groups; in such a transfer a t1-bit always remains a t1-bit, etc.
Thus it proves convenient to separate the body of
stored information into four parts, and to introduce a precession within each part separately. Such
a precession pattern is shown in Fig. 3.
To reflect the separation into four groups the 480
bits held in memory are renamed as follows. The
120 tl-bits are called
s¹1, v¹1, s¹2, v¹2, ..., s¹60, v¹60.
Similarly the group of t2-bits carries the superscript
2, etc. In the first memory cycle (of a machine cycle)
these 480 bits emerge from the delay line in the order
named: the 120 t1-bits follow immediately after the marker pulse, then the t2-bits, etc. The first information bit, s¹1, is copied into flip-flop S and is not immediately reinserted. Then the second bit, v¹1, is copied into flip-flop V and a marker pulse is inserted
into the line at this time. (It is the first pulse inserted
since the silent period.) In the following 118 clock
periods the remaining bits of the t1-group are reinserted immediately upon emergence. Then, however,
when the first bit of the second group, namely s²1,
emerges from the line it is exchanged with the content
of flip-flop S. That is, the emerging bit is set into
flip-flop S while the prior content of S is returned to
[Figure 3 shows, for the first memory cycle, the clock oscillator with its silent periods and the pulse trains emerging from and reinserted into the delay line, the first bit-pair of each of the four groups being held back in flip-flops S and V.]

Figure 3. A four-group precession pattern.
the delay line. In the next clock period the content of flip-flop V, namely v¹1, is set into the line and the bit v²1 is placed in flip-flop V. The remaining bits of
the second group are then reinserted as they emerge.
Similarly, the first two bits of the third group are
exchanged with the contents of flip-flops S and V
and the rest reinserted, and similarly during the emergence of the fourth group. After the emergence and
reinsertion of the last bit of the fourth group (v⁴60),
the contents of flip-flop S and V are inserted into the
delay line in two further clock periods in the same
way as has been described for the simpler precession
pattern of Fig. 2. The bits thus returned to the line
are s⁴1 and v⁴1 respectively.
As can be seen in Fig. 3, the operations just described result in a cyclic permutation of the 60
bit-pairs of each group separately, together with a
rightward displacement of the entire pattern in the
same way as in Fig. 2. After one machine cycle,
consisting of 60 memory cycles, the original configuration has been restored. During that machine cycle each V-bit has been held in flip-flop V, and
each S-bit in S, for one-fourth of one memory
cycle (with the neglect of the silent period).
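The four-group pattern can be checked the same way. Again a sketch: each group is modeled as a list rotated by one bit-pair per memory cycle, and the flattened names follow the text's convention (the string "s11" here stands for s¹1, with the first digit the group superscript).

```python
# Sketch of the four-group precession of Fig. 3: the 240 bit-pairs are
# split into four 60-pair groups (t1..t4), and each memory cycle
# permutes each group cyclically by one pair.

groups = [[(f"s{t}{i}", f"v{t}{i}") for i in range(1, 61)]
          for t in range(1, 5)]                    # four groups of 60

def memory_cycle(gs):
    # in each group, the first pair is held in S and V and reinserted
    # at the end of that same group
    return [g[1:] + [g[0]] for g in gs]

state = groups
for cycle in range(60):        # one machine cycle = 60 memory cycles
    state = memory_cycle(state)
print(state == groups)         # True: original configuration restored
```

Rotating within each group separately is what shortens the machine cycle from 240 memory cycles to 60 while preserving the t1/t2/t3/t4 separation that register transfers respect.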
In the discussion above attention has been directed to the circulation and precession of the bits of
information held in storage in a delay line. The
possibility that the value of an information-bit
may have changed by reason of an inter-register transfer, or an arithmetic operation, etc., has not been mentioned. Each of the symbols used, such as v¹1,
should, however, be understood to represent merely
the name of a variable which may change its value
from time to time by reason of activities not described.
The precession pattern illustrated by Fig. 3 fails
to provide one essential feature of COGITO. Each
V-bit (that is, each bit of a "working register")
upon being picked up into flip-flop V must be
provided with opportunity for leisurely interaction
with the fourth bit to precede or succeed it in occupancy of flip-flop V (that is, the corresponding bit in another register with which it may be involved in arithmetic manipulation). For this reason
a bit which has been held in V for a quarter of a
memory cycle (called a "bit period") is not, in
fact, returned to the precession pattern as has just
been described. Instead, it is set into a 4-flip-flop shift register in which it remains easily accessible for 4 additional bit periods, that is, for one additional memory cycle. A bit which is held in
storage in flip-flops in this way may be changed
in value in any one of these 5 bit periods. After
this holding period the (possibly modified) V-bit
is returned to circulation as illustrated in Fig. 4. By
reason of the general precession, the time for reinsertion of the bit which has been held out of circulation for an additional memory cycle is immediately before, rather than after, the clock period in which the bit held in S is reinserted. In the clock period following the reinsertion of the S-bit no pulse is set into the line; thus the bit pattern shown in Fig. 4 has one-clock-period-long gaps there.

PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965

Figure 4. COGITO precession pattern (shown for third memory cycle of a machine cycle). The information bits are shown as ambiguous, pulse present or absent.
The "bit value" shown with these gaps is zero. The
omission of four bits from the pattern shown in
Fig. 4, which is otherwise like that shown in Fig. 3,
can be understood as arising from the fact that at
each moment four bits are held out of circulation in
the flip-flop shift register.
CONCLUSION
These examples illustrate the considerable flexibility of delay line storage systems provided with
simple precession patterns. It seems likely that similar techniques will prove helpful in many situations
in which the desired rate of handling of data is
smaller than that convenient for a delay line memory.
REFERENCES
1. L. D. Hindall, "Self Synchronous Delay Line," United States Patent No. 2,783,455, issued Feb. 26, 1957.
AN ASSOCIATIVE PARALLEL PROCESSOR
WITH APPLICATION TO PICTURE PROCESSING*
R. H. Fuller and R. M. Bird
General Precision Inc.
Librascope Group
Glendale, California
INTRODUCTION
In recent years, a number of hardware associative
memories have been designed and experimentally
verified.1,2 These memories allow simultaneous
comparison of all stored data to external data. Data
may be read from, or written into, comparing
words. These memories, acting as peripheral devices to conventional computers, have been studied
for application to various tasks described in references 1 and 2. The concept of "associative processing," i.e., simultaneous transformation of many
stored data by associative means, has been described previously.3,4,5 This processing mode showed
promise in a variety of tasks, but was not efficient
when peripherally controlled by a conventional
machine. Novel machine organizations were required to fully exploit the potential of these techniques for solving poorly structured nonnumeric
problems, at which present-day machines are not
efficient.
This paper describes a novel Associative Parallel
Processor (APP), having an associative memory as
*The work reported here was supported by AF Rome Air
Development Center, Griffiss Air Force Base, N.Y., under
Contract AF 33 (602)-3371.
an integral part of the machine. Arithmetic algorithms are described which allow it to perform adaptive pattern recognition by evaluating threshold
logic functions. Novel algorithms allow simultaneous
processing of many operands in bit-parallel fashion.
The processor is a stored program device with a
powerful command set, and thus has general utility
in problems which allow a single set of commands
to be executed independently, and thus simultaneously over many data sets. These conditions frequently arise in nonnumeric data processing tasks
such as pattern recognition.
Parallel processing is accomplished within the
associative array of APP by the powerful technique
of "sequential-state-transformation," previously described by one of the authors.3 The parallel search
function of associative memories requires that comparison logic be provided at each memory word
cell. The APP, by moderate additions to this logic,
allows the contents of many cells, selected on the
basis of their initial content, to be modified
simultaneously through a "multiwrite" operation.
Content search and multiwrite are the primitive operations necessary to parallel processing by sequential-state-transformation.
To illustrate the concept of sequential-state-transformation, consider an associative memory which
stores two operands, Ai and Bi, in each word of memory. We desire to add operand Ai to operand Bi
simultaneously in some subset of these words. Processing is serial by bit and parallel by word, starting
at the least significant bit of each field. Each word
has an auxiliary storage bit, Ci, stored within the memory array. Bits within operand field Ai are designated
Aij (j = 1, 2, . . . , N), where N is the field length. Bits
in field Bi are similarly designated. The truth table
defining the addition is as follows:
State          Present State         Next State
Number        Aij   Bij   Ci         Bij   Ci
  1            0     0     0          0     0
  2            0     0     1          1     0
  3            0     1     0          1     0
  4            0     1     1          0     1
  5            1     0     0          1     0
  6            1     0     1          0     1
  7            1     1     0          0     1
  8            1     1     1          1     1
Note that variables Bij and Ci differ in their present
and next states only in states numbered 2, 4, 5, 7.
The search and multiwrite operations may be used
to perform the addition of all number pairs, starting
at the least significant bit, as follows:

1. Search words having Aij = 1, Bij = 1 and Ci = 0.
   For these words multiwrite Bij = 0, Ci = 1.
2. Search words having Aij = 0, Bij = 0 and Ci = 1.
   For these words multiwrite Bij = 1 and Ci = 0.
3. Search words having Aij = 0, Bij = 1 and Ci = 1.
   For these words multiwrite Bij = 0.
4. Search words having Aij = 1, Bij = 0 and Ci = 0.
   For these words multiwrite Bij = 1.
Steps (1) through (4) are repeated at each bit of
the operands. Within each bit time, processing is
sequential by state over present states which differ
from the next. state in one or more variables. All
words in a given present state are transformed
simultaneously to the desired next state.
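The search/multiwrite procedure above can be sketched in software. The sketch models each word as a small record; the names (`associative_add`, the field keys) are ours, and the scan over matching words stands in for what the hardware does in a single parallel multiwrite:

```python
def associative_add(words, nbits):
    # Add field A into field B across all words using only the two
    # primitives: content search, then multiwrite into matching words.
    # Each word is a dict with LSB-first bit lists 'A' and 'B'; the
    # carry tag C is assumed initially clear.
    for w in words:
        w['C'] = 0
    for j in range(nbits):  # least significant bit first
        # (present Aij, Bij, Ci) -> (next Bij, Ci); None = unchanged.
        # These are exactly states 7, 2, 4, 5 of the truth table.
        for (a, b, c), (nb, nc) in [((1, 1, 0), (0, 1)),
                                    ((0, 0, 1), (1, 0)),
                                    ((0, 1, 1), (0, None)),
                                    ((1, 0, 0), (1, None))]:
            matched = [w for w in words
                       if (w['A'][j], w['B'][j], w['C']) == (a, b, c)]
            for w in matched:  # "multiwrite": all matches change at once
                w['B'][j] = nb
                if nc is not None:
                    w['C'] = nc
```

Note that the ordering of the four searches matters: a word transformed by one step must not match any later step within the same bit time, which the ordering above guarantees.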
The sequential-state-transformation procedure used to perform the above word-parallel, bit-serial addition is evidently a very general mode of associative processing. It allows transformation of memory contents according to any Boolean function of stored
and external variables. It makes full use of comparison logic, implemented at the bit level within an
associative array, and thereby simplifies logic required at the word level. It compares favorably with
other associative processing methods in both speed
and processor complexity.
In the next section we describe the organization and command set for a processor using the sequential-state-transformation mode of associative processing. In the following section command routines are given for the word-parallel, bit-serial processing described above, and also for a novel mode of word-parallel, bit-parallel processing which yields significant speed improvement over bit-serial modes. Pattern-processing applications are discussed last.
ORGANIZATION AND COMMAND SET
The addition operation presented in the preceding section is a typical example of the associative
processing technique. From it, several conclusions
concerning the desired structure for an associative
processor can be formulated.
1. The single primitive step in associative
processing is identification of all words
storing some configuration of binary state
variables, followed by binary complementation of some state variables within identified words. This primitive forms the basis of an associative "micro instruction" which, repeatedly executed with varying parameters, can transform memory contents according to any Boolean function of stored and external binary variables.
2. Since processing is simultaneous over all
stored data, no explicit word address is
provided within an associative instruction.
Many words may be transformed in response to a given instruction. These words
are identified by search criteria contained
within the instruction.
3. Data is processed column-serially to minimize the number of state variables and
thus memory states which must be identified and transformed sequentially. Associative micro instructions thus address a small
number of bit columns in memory. Since
many consecutive bit columns are sequentially transformed, efficient means for column indexing are required.
4. Several temporary storage or "tag" bits
within each word are useful to identify
words as members of various sets currently
undergoing transformation. The carry storage bit, defined for the addition task of the
previous subsection, is a tag bit. The locations of tag columns are unchanged as successive data columns are processed.
5. Each word cell must have electronics, external to the memory array, which temporarily store the match status of the word, relative to the most recent search criteria, and
allow writing of selected bits in matching
words to either the one or zero state. Writing is simultaneous over all matching
words.
6. For generality, stored program control of
the associative processor is desired. Instructions are accessed sequentially from
control memory, with possible branching
as in conventional machines. Since no benefit derives from storing these instructions
in associative memory, the control memory
is a less costly location-addressed random-access memory. With separate instruction and data memories, the access times
for each may be overlapped.
Elements of the Librascope processor are shown in Fig. 1. This realization contains an associative array, partitioned into data and tag columns (fields), together with requisite word and digit electronics. Instructions are read from a random-access control memory.

Figure 1. Structure of the associative parallel processor (associative array with tag and data fields, word electronics and word control, A and B counters and limit registers, match indicator, central control, and random-access control memory; control and data paths shown).
the command as associative. The two adjacent bits define the initial state of match flip-flops in word logic units (i.e., the detector plane). Other bits define search and rewrite criteria for the A field, the B field, and for each of four tag bits. The rightmost bit controls rewrite into matching words or their next lower neighbors. Functions of these bits are described in Fig. 3.
To illustrate the utility of this command, consider the task of searching the associative memory for words matching the data register over a field having its upper limit stored in the A limit register and its lower limit stored in the A counter. Matching words are to be tagged in tag bit 1.

The following command accomplishes the desired task:
1 Es1   S L D W 1   S - - W -   S 0 W
        A Control   B Control   Tag 1
The following routine loads data into each word in the associative array. The word field to be written is again defined by contents of the A counter and the A limit register:

Figure 2. Word electronics (match status gate to the control unit; selection and transfer of DP to DR, common for all words; write neighbor control to the next highest and next lowest neighbors).
1. Set the match flip-flop for word 0 to "1."
2. 1 N   S L D W   S - - - -   S - W   N
3. 1 N   S L D W   S - - - -   S - W   L
         A Control B Control   Tag 1
4. If no match, exit; otherwise go to (3).
Instruction (2) writes into word 0; instruction
(3) writes sequentially into each remaining associative word.
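The effect of this load routine (write word 0, then let each write pass the match state to the next lower neighbor until no word matches) can be sketched as a plain loop; the names are ours:

```python
def load_all_words(words, value):
    # Step 1: set the match flip-flop for word 0.
    match = 0
    # Steps 2-4: write into the matching word; the 'L' neighbor control
    # moves the match state to the next lower neighbor; exit on no match.
    while match < len(words):
        words[match] = value
        match += 1
    return words
```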
Nonassociative commands are provided to load
the A and B counters and limit registers, to branch
from linear instruction sequencing either unconditionally or when specified conditions are met, and
to input or output data. Nonassociative commands
are specifically defined in the illustrative programs
presented in the next section.
ARITHMETIC ALGORITHMS
In previous parallel processors4,5,6 arithmetic algorithms were typically executed over many operands simultaneously, but in bit-serial fashion. All
bits of an operand are stored within a single word
cell as shown in Fig. 4. For operations requiring
two operands, operands may be paired by storing
them in the same cell or by restricted communication between word cells (e.g., communication between "nearest neighbors" only). This word-per-cell (W/C) organization is efficient when all or
most operands in memory are processed by each
associative command.
An alternate data organization stores bits of an
operand in separate contiguous word cells as shown
in Fig. 5. A similar organization, independently derived, was recently described.7 The bit-per-cell (B/C) organization allows many operands to be
processed simultaneously in bit-parallel fashion.
Command execution is thus appreciably faster than
for the W / C organization, but each command is
typically executed over fewer operand pairs. Any
operands stored in the same set of contiguous word
cells may be simply paired. The B / C organization is thus efficient for problems which do not allow simultaneous processing of all operands by a
single command, and for problems in which each
Figure 3. Format for associative command. The figure's legend defines the command-field codes:

  S - search;  S̄ - don't search
  0 - search for zeros;  1 - search for ones
  D - search equal to D.R.;  D̄ - search complement of D.R.
  W - write;  W̄ - don't write
  I - increment counter;  N - no change in counter
  S - single counter increment;  L - increment counter through limit
  Es0 - clear DP to zero;  Es1 - clear DP to one;  N - no change in DP
  0 - associative command;  1 - nonassociative command
  N - normal;  L - write lower neighbor

Figure 4. Word per cell data organization.
operand must be paired with many others at various computational steps.

The processor organization and command set described in the preceding section is equally applicable to W/C and B/C data organizations. Arithmetic algorithms, appropriate to each data organization, are presented below.
Consider first an algorithm for the W/C data organization (Fig. 4), which adds contents of all A fields to contents of respective B fields, leaving the resulting sum in the B fields. Tag 1 is used for carry storage and is assumed initially cleared. The routine is as follows:

Figure 5. Bit per cell data organization.

No.  Prefix   A Field      B Field      Tag 1
0    1 Es1    S S 1 W N    S S 1 W N    S 0 W
1    1 Es1    S S 0 W N    S S 0 W N    S 1 W
2    1 Es1    S S 0 W N    S S 1 W N    S 1 W
3    1 Es1    S S 1 W I    S S 0 W I    S 0 W
4    If A Count > A Limit, continue; otherwise jump to (0).

The routine first addresses the least significant bits of the A and B fields and tag 1. Of eight possible states for these variables, four must be transformed to new states (see Introduction). Commands 0-3 accomplish the four required state transformations. Command 3 increments the A and B column addresses. The routine is repeated at each column in the A (and thus B) field.

A routine which performs the equivalent operation for operands stored in the B/C organization (Fig. 5) is as follows:

No.  Prefix   A Field      B Field      Tag 1    Neighbor Control
0    1 Es1    S - 1 W N    S - 0 W N    - -      N
1    1 Es1    S - 1 W N    S - 1 W N    - -      N
2    1 N      S - - W -    S - - W -    S 0 W    L
3    1 Es1    S - - W -    S - 0 W N    S 1 W    N
4    1 Es1    S - - W -    S - 1 W N    S 1 W    N
5    1 N      S - - W -    S - - W -    S 0 W    L
6    0 N      If DP ≠ 0 jump to (3).
Instructions 0, 1, and 2 form the partial sum. Instructions 3, 4, and 5 ripple all carries to completion, as is detected by instruction 6. For a worst-case carry, N - 1 iterations of steps 3, 4, and 5 are required.
However, even when the number of parallel data
words is very large, the longest expected carry
string is significantly less than N -1 bits. Typically,
the foregoing algorithm is two to three times faster than the algorithm presented for the W/C configuration.
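The B/C addition can be sketched in the same spirit: the partial-sum and carry steps happen across all bit cells at once, and the loop count corresponds to the carry-ripple iterations detected by instruction 6. Function and variable names are ours:

```python
def bc_parallel_add(a_bits, b_bits):
    # Bits LSB first, one word cell per bit. Instructions 0-2: form
    # partial sum and carries in parallel across all cells.
    s = [a ^ b for a, b in zip(a_bits, b_bits)]
    c = [a & b for a, b in zip(a_bits, b_bits)]
    rounds = 0
    # Instructions 3-5: move each carry to the next higher cell and
    # add it in; instruction 6: repeat while the detector plane is
    # nonzero (carries remain). Carry out of the top cell is dropped.
    while any(c):
        rounds += 1
        moved = [0] + c[:-1]
        s, c = ([x ^ y for x, y in zip(s, moved)],
                [x & y for x, y in zip(s, moved)])
    return s, rounds
```

For random operands the longest carry string, and hence the number of rounds, grows only roughly logarithmically with field length, which is the basis of the speed advantage claimed above.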
ASSOCIATIVE PATTERN PROCESSING
A number of linear threshold-pattern-recognition devices have been built using analog techniques. 8,9,10,11 Such devices are relatively fast and
inexpensive when applied to simple pattern recognition tasks. They are limited in the allowed
number of input variables and in the dynamic range
of weights assigned to these variables. Wiring complexity increases rapidly with the number of threshold units used. Modification of the weights assigned input variables or of thresholds is expensive
and time-consuming. Higher-order restructuring is
even more difficult. Analog units are thus not suited for classification of complex problems for which
many properties are measured, and where suitable
properties may not be known a priori.
The parallel processing capability of an associative processor is well suited to the tasks of abstracting pattern properties and of pattern classification by linear threshold techniques. Threshold pattern recognition devices execute a given operation independently over many data sets, and thus allow the parallelism necessary for efficient associative processing. Associative processing affords the accuracy
of digital number representation, and is thus unlimited in fan-in and dynamic range of weights.
Weights are simply altered by changing memory
contents. Wiring and components are regular and
are thus amenable to low-cost, batch-fabrication
techniques. The set of measured pattern properties
is changeable by changing memory contents, rather
than by rewiring as for analog units. Adaptation is
thus possible in measured properties as well as in
classification.
The Pattern-Processing Model
In this subsection, the pattern-processing model
will be briefly described. Figure 6 represents the
model of the pattern recognition system. "N" binary valued sensor units are summed, with weights ±1, into some or all of "K" threshold logic units. A threshold level, tk, is established for each logic unit. If the sum of weighted inputs exceeds the threshold, the unit becomes active and the output, bk, is one; otherwise the output is zero. Each logic unit has a
weighted connection to some or all of N r response
units. Weights of active logic units are summed and
thresholded at each response unit. A pattern is
classified according to the set of activated response
units.
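The model just described can be sketched end to end. Where the text is silent we substitute illustrative assumptions: in particular the response-unit thresholds are taken as zero, and all names are ours:

```python
def classify(pattern, C, t, W, targets):
    # N sensors -> K logic units -> Nr response units.
    # C[k][j] in {-1, 0, +1}: connection of sensor j to logic unit k.
    # t[k]: logic-unit threshold.  W[k][n]: logic-to-response weight.
    K, Nr = len(C), len(W[0])
    b = [1 if sum(C[k][j] * pattern[j] for j in range(len(pattern))) > t[k]
         else 0 for k in range(K)]                  # logic outputs b_k
    r = [1 if sum(W[k][n] * b[k] for k in range(K)) > 0
         else 0 for n in range(Nr)]                 # response vector
    # The pattern is classified by the set of activated response units.
    return [m for m, tgt in enumerate(targets) if tgt == r]
```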
Figure 6. Analog model of pattern recognition system (sensor units feed logic units; decoded response units provide the classification).
Associative Realizations

The associative memory is organized into three sections containing, respectively, the connectivity vectors Ck, 1 ≤ k ≤ K; the system of weights Wkn, 1 ≤ k ≤ K, 1 ≤ n ≤ Nr; and the target vectors Tm, 1 ≤ m ≤ M. The general organization of the associative memory is shown in Fig. 7, which is interpreted as follows: In Phase (1), the set of logic units activated by the ith pattern is determined, using the input vector Ai and the stored connectivity vectors Ck. Logic unit outputs which yield the property vector Bi are formed in the detector plane. In Phase (2), the inputs to the response units are calculated, using the vector Bi and the weights stored in the appropriate portion of the associative memory. This yields the response vector Ri in the detector plane. In Phase (3), the response vector Ri is compared with the target vectors Tm stored in the associative memory, and the classification of the pattern associated with Ai is determined. The three processing phases are further described as follows:
Figure 7. Associative parallel processor realization of pattern recognition system (input from sensor units enters the data register; the associative memory array holds the connectivity Ck, weights Wkn, and target vector Tm sections, with results formed in the detector plane).
Phase (1). The set of sensor-logic unit connections with weights +1 and -1 may be represented by a matrix [C] which is stored in the connectivity sector of an associative memory in the format of Fig. 8. The matrix [C] may be written

[C] = [C+] + [C-]

where [C+(-)] is a matrix with binary values whose entries represent those connections which are positive (negative). That is, the element c+jk is 1 if there is a positive connection between sensor j and logic unit k, and is 0 otherwise. Similarly c-jk is 1 if there is a negative connection between sensor j and logic unit k. Thus, all connections from sensors to the kth logic unit are represented by the row vector Ck = C+k + C-k. Matrix rows C+k and C-k are stored in adjacent fields of an associative memory word.
Figure 8. Connectivity sector of associative parallel processor (each word stores rows C+k and C-k together with accumulation and threshold fields).
The input vector Ai is used as a key to interrogate matrix [C] in the associative memory in order
input vector Ai. Bit positions having aij = 1 are
interrogated for c+ jk or c- jk = 1. A binary count
of the number of bits in C- k which satisfy this condition is made in the accumulation fields Sc respectively of each word in Fig. 8. The procedure is again
repeated for C + k. Thresholds tk are prestored in
fields as indicated. After counting, the contents of
the T fields are subtracted from those of fields Sc.
Elements of Sc remaining positive under these operations correspond to activated logic units.
Following these Phase (1) operations, a single
search for positive counts Sc sets the detector plane
to the match state at each word corresponding to
an activated logic unit. Contents of this segment of
the detector plane are transferred to the data register for use as search criteria in Phase (2).
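In software terms, Phase (1) reduces to a per-word count-and-threshold. A sketch (names ours, with the C+ and C- counts folded into one signed accumulation):

```python
def phase1_activations(a, c_plus, c_minus, t):
    # For each word k: interrogate bit positions with a_j = 1, counting
    # matches against C+ into the accumulation field Sc and matches
    # against C- out of it, then subtract the threshold field t_k.
    # Words with Sc still positive mark activated logic units.
    detector = []
    for cp, cm, tk in zip(c_plus, c_minus, t):
        sc = sum(1 for j, aj in enumerate(a) if aj and cp[j])
        sc -= sum(1 for j, aj in enumerate(a) if aj and cm[j])
        detector.append(1 if sc - tk > 0 else 0)
    return detector
```

The resulting detector-plane contents are what Phase (2) picks up as search criteria.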
Phase (2). This phase generates the output vector (ri1, . . . , riNr) corresponding to the ith input pattern. The components are to be generated within the associative processor, using bik as an input from Phase (1) and the weights Wkn stored in the weight section of the associative memory, Fig. 7. Arithmetic operations, similar to those discussed in the second section of this paper, are then used to yield the response vector (ri1, . . . , riNr).

Phase (3). The target vectors stored in the associative memory (Fig. 7) are vectors with binary valued components (ri1, . . . , riNr) = Tm, where Tm represents the known classification of the ith pattern. It is assumed that the K × Nr matrix of weights [W] stored in the associative memory has been predetermined such that each of the M patterns of interest is correctly classified, i.e., the output of Phase (2) matches the appropriate target vector, component for component.

Components (r1, . . . , rNr) of target vectors Tm are stored in a portion of the associative array at locations derived from "names" dm (1 ≤ m ≤ M) of patterns associated with the target vectors (Fig. 7). A response vector generated during Phase (2) is used as a search key to interrogate the target vector portion of the associative array. The location of any responding word denotes the name of the pattern, as in the following table, showing the target vector portion of associative memory.

Target Vector            Name    Word Address
r1(1) . . . rN(1)        d1      f(d1)
r1(2) . . . rN(2)        d2      f(d2)
 .                        .        .
r1(M) . . . rN(M)        dM      f(dM)

A program was written for the described pattern-recognition model using the instruction set presented in the second section of this paper. The pattern recognition program has an adaptive or learning mode, requiring 120 instructions, in which weights are adjusted to properly classify a set of input patterns. The program for each mode is invariant to pattern parameters used for classification. Since 82 instructions are common to the two modes, only 131 instructions need be stored in program memory.

Processing Times

Figure 9. Time for associative classification of a single pattern as a function of the number of patterns and sensors for the W/C data organization. (T is the total solution time for a single pattern, in seconds or milliseconds as indicated; N is the number of sensor units, with curves shown for several values of N up to 64K.)

*This cycle time is based on a magnetic film realization of the associative array described in reference 12.
Based on the aforementioned recognition program, a word-per-cell data organization, and an associative command cycle of 0.8 microseconds, the
graph of timing efficiency shown in Fig. 9 was constructed. Note that "N" represents the number of
sensor units at the input and "M" the number of patterns distinguishable by the processor. The "MARK
I Perceptron" used 400 sensors in a 20 X 20 array
and 512 logic units.
It can be seen that the APP could solve this
problem in approximately 3 milliseconds using some
2000 words of associative storage. The APP realization offers significantly greater ease of alteration and somewhat lower cost at a moderate increase in processing speed relative to the Mark I.
CONCLUSION
The associative parallel processor, described in
this paper, achieves considerable generality with
simple word and bit logic through the use of the
sequential-state-transformation mode of associative
processing. Its range of applicability is increased by
novel arithmetic algorithms allowing simultaneous
processing of many operands in bit parallel fashion.
These algorithms allow efficient use of the processor for problems in which only a fraction of the
stored operands are processed by a given command.
Earlier processors4,5,6 were efficient only when nearly all operands were processed by each command.
An important feature of the parallel processor,
when used as a pattern recognition device, is the
ability to modify its functional structure, through
alteration of memory contents, without change in
its periodic physical structure. This adaptive feature
has importance in applications where patterns change
with time, or where the processor is used as a prototype of subsequent machines having fixed recognition capabilities. Further research is required to
fully exploit this adaptive capability.
Linear threshold pattern classifiers of the type
here presented are beginning to find many applications. To date, these types of pattern classifiers have been studied and/or implemented for character recognition, photointerpretation, weather forecasting by cloud pattern recognition, speech recognition, adaptive control systems and, more recently, for medical diagnosis from cardiographic data.
Other possible applications include terminal guidance for missiles and space vehicles and bomb damage assessment.
Currently the processor is being studied for
application to the tasks of job shop scheduling,
optimum commodity routing and processing electromagnetic intelligence (ELINT) data. In each
instance significant speed gains have been shown
possible over conventional sequential digital computers. It is interesting to note that the processor
described in the second section of this paper may be
applied to this variety of tasks without significant
changes in organization or command structure.
ACKNOWLEDGMENTS
The authors take pleasure in acknowledging contributions to this effort by Mr. J. Medick and Dr. J. C. Tu. We are most appreciative of support given this work by Messrs. M. Knapp and A. Barnum of the Rome Air Development Center.
REFERENCES
1. P. M. Davies, "A Superconductive Associative Memory," Proc. of the Spring Joint Computer
Conference, May 1962.
2. J. E. McAteer, J. A. Capobianco and R. L.
Koppel, "Associative Memory System Implementation and Characteristics," ibid., Nov. 1964.
3. G. Estrin and R. H. Fuller, "Algorithms for
Content-Addressable Memory Organizations," Proc.
Pacific Computer Conference, Mar. 1963, pp. 118-130.
4. G. Estrin and R. H. Fuller, "Some Applications for Content Addressable Memories," Proc. Fall Joint Computer Conference, Nov. 1963, pp. 495-508.
5. P. M. Davies, "Design of an Associative
Computer," Proc. Pacific Computer Conference,
Mar. 1963, pp. 109-117.
6. R. G. Ewing and P. M. Davies, "An Associative Processor," Proc. Fall Joint Computer Conference, 1964, pp. 147-158.
7. B. A. Hane and J. A. Githens, "Bulk Processing in Distributed Logic Memory," IEEE Trans. on Elect. Comp., vol. EC-14, no. 2, pp. 186-195 (Apr. 1965).
8. B. Widrow et al., "Practical Applications for Adaptive Data Processing Systems," 1963 WESCON (Aug. 1963).
9. J. C. Hay, F. C. Martin and C. W. Wight-
116
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
man, "The MARK I Perceptron-Design and Performance," IRE International Convention Record,
1960 (Part 2).
10. C. H. May, "Adaptive Threshold Logic,"
Rept. SEL 63-027 (TR 1557-1) Stanford Electronics Lab, Stanford, Calif. (Apr. 1963).
1965
11. G. Nagy, "System and Circuit Designs for
the Tobermory Perceptron," OTS, AD 604 459,
Sept. 1963.
12. R. H. Fuller, J. C. Tu and R. M. Bird, "A
Plated Wire Associative Memory," NAECON Conference, Dayton, Ohio, May 10-12, 1965.
COMPUTER ORGANIZATION FOR ARRAY PROCESSING*
D. N. Senzig and R. V. Smith
IBM Watson Research Center
Yorktown Heights, New York
INTRODUCTION

In spite of recent advances in computer speeds, there are still problems which make even greater demands on computer capabilities. One such problem is that of global weather prediction. Here a three-dimensional grid covering the entire world must be stepped along through relatively short periods of simulated time to produce a forecast in a reasonable amount of real time. This type of problem, with its demand for increased speed in processing large arrays of data, illustrates the applicability of a computer designed specifically for array processing.

When arrays of data are being handled, it is usual to have to do the same calculations on each piece of data. This kind of problem is suited to a machine with multiple arithmetic units (AU's), since each can be carrying on the same task on different parts of the array. We are fast approaching the physical limit in speed for computer AU's. On the other hand, a number of AU's can operate simultaneously to increase the amount of work done per unit time. The speed and number of these units can be selected to suit the economics of the case and the logical characteristics of the problem. When multiple AU's are all doing the same task, a single control unit suffices. For example, one load instruction can cause all AU's to load their separate accumulators, each from a different part of the array. Facility must be provided to inhibit some of the AU's when exceptional conditions are being handled by the others, or when the number of pieces of data to be processed is smaller than the total number of AU's available. A suitable paralleling of separate memory units must also be provided to yield data at the rate required by the AU's.

The cost and speed of an array processing computer depend on the speed of the memories and the circuitry used, and also on the number of AU's provided. Speed can be characterized by the maximum rate at which bits can be brought from the memories and processed. Studies to date have indicated that higher bit rates at proportionately lower costs are possible with given types of hardware by using the array processing approach rather than the conventional types of organization.
VAMP ORGANIZATIONAL CONCEPTS
*This work was supported by the Advanced Research Projects Agency of the Office of the Secretary of Defense (SD-146).
The VAMP (Vector Arithmetic Multi-Processor) computer will be described independent of
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
technology in order to stress organizational concepts apart from arithmetic execution times. However, the ideas described below were tested in a
simulator where details such as instruction format,
word length and number of AU's are fixed.
The VAMP computer consists of three major
units: the Mill, the Memory and the Control.
Figure 1. The Vamp Mill: vector accumulator and screen, with paths to/from memory.
, ~) of members
of X to zero set their logical result into u where
individual elements can be set or tested. An interchange of the contents of s and u allows such results to become the screen.
The effective use of structured operands (vectors and matrices) depends upon the ability to extract and reinsert certain elements or groups of elements. This restructuring depends on selection of elements, which is a binary operation (i.e., to select or not select), and is conveniently specified by the bit vector u. Two operations for restructuring data arrays are included: compress and expand. The following discussion of the operands is based on Iverson's description.1
The compress operation defined on an arbitrary vector a and a compatible (of equal dimension) bit vector u is denoted by c ← u/a and is defined as follows: the vector c is obtained from a by suppressing from a each component ai for which ui = 0. Clearly, the dimension of c is equal to the sum of the ones in u. For example, if u = (1, 0, 0, 0, 1, 1) and a = (1, 2, 3, 4, 5, 6), then u/a = (1, 5, 6). For the VAMP instruction, the elements of a are the numbers stored in the array X. Hence the dimension of the operand vectors is assumed to be n. The result vector, c, is stored in X. Denoting the sum of components (ones) in u by i, the first i registers in X contain the vector c; the remaining n - i are set to zero.
The expand operation is expressed as c ← u\a. The vectors c and u are of a common dimension. The expand operation is equivalent to choosing successive components of c from a or from zero according as the successive components ui are 1 or 0. For example, if u = (1, 0, 0, 0, 1, 1) and X contains the numbers (1, 2, 3, 4, 5, 6), then the result is X containing the numbers (1, 0, 0, 0, 2, 3). Denoting the sum of components of u by i, the first i elements originally in X will be preserved, and n - i elements of the result are necessarily zero.
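In modern terms the two operations can be stated in a few lines. The following Python fragment (ours, not the authors') reproduces the examples above, including VAMP's convention of padding the compressed result out to all n registers of X:

```python
def compress(u, a):
    """Iverson compress c <- u/a: keep a[i] where u[i] == 1, then pad
    with zeros so the result still fills all n registers of X."""
    kept = [ai for ui, ai in zip(u, a) if ui]
    return kept + [0] * (len(a) - len(kept))

def expand(u, a):
    """Iverson expand: successive elements of a where u[i] == 1,
    zeros where u[i] == 0."""
    it = iter(a)
    return [next(it) if ui else 0 for ui in u]

u = (1, 0, 0, 0, 1, 1)
print(compress(u, [1, 2, 3, 4, 5, 6]))  # [1, 5, 6, 0, 0, 0]
print(expand(u, [1, 2, 3, 4, 5, 6]))    # [1, 0, 0, 0, 2, 3]
```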
The use of compress and expand can be illustrated in a weather problem. The radiation calculation
depends principally on the type of cloud and
amount of cloud cover. For efficient handling, separate arrays for each of the various cloud characteristics should be created by compression. Then all
arithmetic units can be put to work on one such array after the other, all doing the same operation at
any one time. The separate arrays can then be reassembled into one by the expand and load under
screen control operations for use in the next procedure. Essentially these instructions partially recover
the advantage of making use of the average that a
conventional computer enjoys.
The vector accumulator w is a 2k-bit register. All members of X (subject to s) can be added or multiplied together and the sum or product placed in w (w ← w + Σi xi; w ← w × Πi xi). Search instructions to find the maximum (or minimum) value in w and the registers of X place this maximum (minimum) value into w.
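The reductions and the search, subject to the screen, amount to the following (a hypothetical Python rendering; the register names follow the text):

```python
from functools import reduce

def reduce_sum(X, s):
    """Sum of the screened-in registers of X (s[i] == 1)."""
    return sum(x for x, si in zip(X, s) if si)

def reduce_prod(X, s):
    """Product of the screened-in registers of X."""
    return reduce(lambda p, x: p * x, (x for x, si in zip(X, s) if si), 1)

def search_max(w, X, s):
    """Place into w the maximum of w and the screened-in registers of X."""
    return max([w] + [x for x, si in zip(X, s) if si])

X = [3, 7, 2, 9]
s = [1, 1, 0, 1]
print(reduce_sum(X, s))     # 19
print(search_max(0, X, s))  # 9
```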
Memory
The design of VAMP requires simultaneous access to n words in memory, one word for each AU (Fig. 3). Several memory organizations are possible and these are based on the physical arrangements chosen. The result will be a functional arrangement, that is, an arrangement as seen by the program.
One type of organization results when one physical word is used to feed more than one AU. One
reason for doing this is that the physical word
brought out at each memory access is large enough
to hold several functional words and therefore it is
more efficient to use all of them. If one physical
word does not hold enough functional words to supply all AU's, then several boxes can be accessed
simultaneously. The simplest approach is to group
boxes together and bring out the same physical
word from each in the group to feed all AU's. The
SOLOMON2 case is functionally similar to the one
above but, being serial, one bit of each physical
word in the group is sent to each AU and a succession of groups supply the succession of bits which
make up the functional word. In this case, the data
received by the AU's have a fixed spatial relationship to each other. Functionally it is as if each SOLOMON AU had its own memory and accessed the
same word from it at any one time as all the others
did from theirs. To ease this restriction, provision
may be made for units to access words in their
neighbor's memories, proceeding in step as before.
This results in the concept of a matrix of processing
elements (arithmetic units, each with a memory) as
in SOLOMON.
For an n arithmetic unit VAMP, 2^j separate memory boxes are required (2^j ≥ n). Assuming a total of 2^m words of memory, the addressing is set up so that the least significant j bits of the m-bit address select the memory box. The most significant m - j bits of the address then specify the particular word in the box.
VAMP addresses memory in two modes. In the vector direct mode, the instruction specifies an initial
address a0 and an increment d. The n addresses a0, a0 + d, a0 + 2d, ..., a0 + (n - 1)d are generated. Since the number of memory boxes is a power of 2 and at least as great as n, an odd value of d will result in all addresses being in different boxes. The most usual cases arise from proceeding along a vector or a column of a matrix, so that d = 1, or along a row of a matrix, so that d equals the length of a column. If the columns are of even length, an additional dummy member can be added to achieve an odd number. Should the program require the use of an even value of d, it will still be accepted, but with consequent loss of time. A test is made to determine if d = 0. If so, the same word is transferred to all AU's without attempting to access the memory more than once.
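The address generation and box selection just described can be sketched as follows; the function name and the parameters n and j are our own, and the asserts check the odd-versus-even increment behavior claimed in the text:

```python
def vector_direct_addresses(a0, d, n=16, j=4):
    """Generate the n addresses a0, a0 + d, ..., a0 + (n-1)d of the vector
    direct mode, and split each into (box, word): the least significant j
    bits select one of the 2**j memory boxes, the rest select the word."""
    addrs = [a0 + k * d for k in range(n)]
    boxes = [a % (1 << j) for a in addrs]
    words = [a >> j for a in addrs]
    return addrs, boxes, words

# An odd increment places the 16 addresses in 16 different boxes:
_, boxes, _ = vector_direct_addresses(a0=100, d=3)
assert sorted(boxes) == list(range(16))

# An even increment causes box conflicts (here only 8 distinct boxes):
_, boxes, _ = vector_direct_addresses(a0=100, d=2)
assert len(set(boxes)) == 8
```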
Addressing in the vector indirect mode uses a set
of n addresses in memory. These addresses are
fetched into Z and are then fed to the memory
address registers to bring out n words. Any set of
addresses may be used so that any desired set of
words, possibly including repeats, will be fetched
or stored. The vector indirect mode will generally
lead to multiple requests from the same memory
box. This will result in less than maximum speed;
however, it will be faster statistically than executing
n separate instructions, one for each word to be
fetched or stored.
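A sketch of the vector indirect mode, with a rough cycle count showing why repeated boxes cost time (the one-request-per-box-per-cycle assumption is ours):

```python
from collections import Counter

def gather(memory, addr_vector):
    """Vector indirect fetch: an address vector (fetched into Z) is fed to
    the memory address registers; any addresses, including repeats, work."""
    return [memory[a] for a in addr_vector]

def access_cycles(addr_vector, j=4):
    """Assuming each of the 2**j boxes serves one request per cycle, the
    whole fetch takes as many cycles as the busiest box receives requests."""
    load = Counter(a % (1 << j) for a in addr_vector)
    return max(load.values())

memory = {a: a * 10 for a in range(64)}
table = [5, 5, 12, 33, 7, 21, 12, 40]  # table look-up: arbitrary, with repeats
print(gather(memory, table))
print(access_cycles(table))            # 3: addresses 5, 5, 21 share box 5
```

Three cycles for eight operands is still far better than eight separate scalar fetches, which is the statistical argument made above.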
The SOLOMON scheme has certain physical advantages. It does not require as many memory boxes to avoid memory access conflicts (but to obtain
a suitable memory size, it may need just as many).
The number of bits needed to address the AU memory is less than it would be if the entire physical memory were addressable.
The VAMP scheme requires hardware and time
to produce the address vector. Functionally, the bus
between memory and AU's is an electronic cross
bar switch to gate the n addresses to the memory
units, and an electronic cross bar switch to gate the
n data words between the memories and the n
AU's.
Simulation showed that the memory bus hardware for VAMP can be significantly reduced by
time-sharing a small number of transfer lines (typically 2 or 4) in combination with a simple anticipatory control. This would allow decoding and
address generation to be overlapped with the execution of the previous instruction. Thus, when considering the entire hardware inventory of a machine of
this type, the increase due to the incorporation of
1965
the flexible memory of VAMP should be small.
Also, this is a one-time cost which will be offset by savings in programming, which is a continuing activity.
The VAMP memory organization results in a
common functional memory with no relation between a given AU and a section of memory. The
SOLOMON organization yields individual functional memories and fixed addressing. The former
has greater functional advantage than the latter
since AU's have access to all of memory, not just to
the part of it which is their own and their neighbors'.
The flexibility of its common functional memory
means that VAMP will adapt easily to automatic
programming. Computation can proceed in essentially the same way as has been conventionally
done, but n items will be handled at a time. With
SOLOMON-type addressing there will be severe
problems brought about by segmentation of the
memory into individual AU memories of limited
size and the consequent segmentation of the problem data. Boundary problems between the segments
and storage allocation, particularly dynamic storage
allocation, will be prominent difficulties.
Table look-up (where each AU may require a
different value from the table) is rather easily done
with VAMP through use of the vector indirect
mode. No scheme for doing this seems to be available with SOLOMON-type addressing and indeed,
indirect addressing appears to be incompatible with
this addressing scheme. At Livermore a problem
known as Coronet IV uses approximately 50 percent of the total running time of their STRETCH
computer, and 36 percent of the Coronet IV time is
used for table look-up and interpolation of table
values.
Control Unit
The control unit in VAMP includes the instruction fetch controls, instruction decoding, index unit and address unit. As in conventional computers, instructions are stored in the same memory as (and
and address unit. As in conventional computers, instructions are stored in the same memory as (and
are indistinguishable from) data. Instruction decoding and execution differs from the 7090, for example, only in that for the vector instructions, micro
operations are issued to n AU's rather than to 1,
and on a vector memory access an index register is
specified as containing an increment from which
the address unit has to generate n addresses.
FURTHER DISCUSSION OF THE VAMP AU's
One of the reasons for building faster computers is to solve bigger problems. Bigger problems usually mean longer chains of computation and, for a given word length, more round-off error and scaling difficulties. The uncertainty of magnitudes in long calculations has led to the almost universal use of floating point arithmetic in modern large-scale machines designed for scientific problems. The classical arguments for floating point in a conventional machine would seem to hold equally in array processors.
Figure 2. Vamp arithmetic unit functional data flow. (Data paths to/from memory and to the neighboring AU's xi-1 and xi+1; floating point shifter; least significant fraction; execute line from the screen.)
If the Mill were designed as it functionally appears, that is, as a collection of "classical" AU's of the type shown in Fig. 2, we would have a requirement that does not exist in conventional computers.
Namely, all arithmetic operations should run in step
and the time required should be data independent.
It is permissible, for example, for the execution of a
floating point add on the IBM type 7044 to vary
between 4 and 23 cycles with an average execution
time of 5.5 cycles. In VAMP this is not suitable.
Statistically, with many AU's the execution time for all to reach completion will be close to the maximum, and a data dependent execution time would
require extensive local control on each AU that we
prefer to do away with. Floating addition (and
floating subtraction), multiplication, and division
are the operations that normally have data dependent execution times.
In the case of multiplication and division, algorithms that give a data independent execution time
are well known. Multiplication by two bits per iteration and division by the nonrestoring algorithm
reduce the local control for multiply and divide to
merely picking the correct multiple (-1, 0,1,2) of
the multiplicand or divisor. The other control hardware, shift counter, etc., is common to all units.
In the case of floating point addition, a fast algorithm with execution time independent of data
can be developed by providing multiple shifts in the
alignment and normalization phases. In particular,
digit shifts of powers of two (1, 2, 4, 8 digits, etc.) are provided. For alignment, assume the exponent difference has been obtained and is represented in
binary positional notation, the bits having weight
1, 2, 4, 8, etc. Then by having central control issue the micro operation "shift," the local control
need merely be sufficiently sophisticated to gate the
fraction through the shifter or to bypass it, depending on whether the corresponding bit of the exponent
difference is one or zero.
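The alignment scheme reduces, in effect, to a cascade of fixed shift stages gated by the bits of the exponent difference, so the path length is the same for every operand. A Python sketch (the stage sizes and names are illustrative, not from the paper):

```python
def align_fraction(fraction_bits, exp_diff, stages=(1, 2, 4, 8)):
    """Data-independent alignment: the fraction passes through every shift
    stage; each stage either shifts by its fixed amount or is bypassed,
    according to the corresponding bit of the exponent difference."""
    f = fraction_bits
    for k, amount in enumerate(stages):
        take = (exp_diff >> k) & 1        # bit of weight 2**k
        f = (f >> amount) if take else f  # shift or bypass: same path length
    return f

# Shifting right by 5 means stages 1 and 4 are enabled (binary 0101):
assert align_fraction(0b11110000, 5) == 0b11110000 >> 5
```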
For normalization, each AU must be provided
with control logic to determine if a one would be
shifted out of the most significant end of the accumulator by the shift micro operation about to be
given. Again the local control need only be sufficiently sophisticated to execute or ignore the common shift micro operation, based on the value of
the most significant bit.
Unfortunately, the schemes for floating point add
get to be very expensive when applied to each of n
AU's. Some compromise between multiple shift
gates and completion controls would undoubtedly
Figure 3. Vamp memory organization. (The base address a0 and increment d specified in the instruction feed logic that computes n addresses, routed to the MAR's of memory boxes 0 through 2^k - 1; in the vector indirect mode the Z registers supply the addresses.)
be made. For instance, the central control could repeatedly issue micro operations for a single shift,
stopping when all AU's signal completion. Another
reasonable alternative is to use a larger radix. Quaternary, octal or hexadecimal systems are obvious
candidates.
Assuming the engineering problems are solved, one is still left with the situation where a scalar (one pair of operands) operation requires as much time as a vector operation (n pairs of operands). Thus when a problem is heavily dominated by scalar operations there is little gain in speed.
Rather than attempt to work with multiple AU's,
a register array and a much smaller number of very
high-speed Execution Units may be used. The basic
procedure is to load the register array and stream
operands and results to and from high-speed Execution Units. Figure 4 shows the resultant organization of the Mill.
Figure 4. Vamp mill (n = number of functional AU's; k = number of bits per word). (Register arrays of n words of k and 2k bits, a small number (~n/16) of very high speed Execution Units, and paths to/from memory make up the Mill.)
The algorithms we use to obtain high-speed arithmetic operation are almost completely combinatorial circuits. Floating add (and its variants) use the alignment and normalization combinatorial shifter described above.
The multiplier is based on the well-known carry save multiplier as recently extended by Wallace3 to process all partial products simultaneously. Wallace's proposal is to interpret the contents of the multiplier register as radix 4 digits and recode these digits into the digits -2, -1, 0, 1, 2. The now easy to generate partial products are grouped by threes and gated into a tree of carry save adders. The outputs of each level of carry save adders are again grouped by threes and gated into the next level. Each level of carry save adders reduces the number of summands by about a factor of 1.5. The two outputs of the tree are added to produce the double length product.
The divide algorithm is an iterative method based on that used in the Harvard Mark IV and discussed by Richards.4 The technique essentially involves generating the divisor inverse by a series of truncated multiplications.
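Wallace's recoding step can be illustrated directly. The following sketch is our formulation of modified Booth recoding, not taken from the paper; it produces radix-4 digits in {-2, -1, 0, 1, 2} and checks that they still represent the multiplier:

```python
def recode_radix4(m, pairs=4):
    """Modified Booth recoding of a (2*pairs)-bit multiplier into radix-4
    digits in {-2, -1, 0, 1, 2}: d_i = -2*b(2i+1) + b(2i) + b(2i-1)."""
    bit = lambda k: (m >> k) & 1 if k >= 0 else 0
    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
            for i in range(pairs + 1)]  # one extra digit for the carry-out

def rebuild(digits):
    """Sum d_i * 4**i: checks the recoded digits still represent m."""
    return sum(d * 4 ** i for i, d in enumerate(digits))

for m in range(256):
    ds = recode_radix4(m)
    assert all(-2 <= d <= 2 for d in ds)
    assert rebuild(ds) == m
```

With every partial product a multiple in {-2, ..., 2}, generating it is a shift and optional complement, which is what makes the carry save tree attractive.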
During a vector floating point multiply we could
have four sets of operands in motion at once. One
set is being accessed from the register array. One
set is being multiplied. A product is being normalized. A normalized product is being stored in the
register array.
In an effort to keep all operations times in the
stream to about the same duration the multiply tree
can be split such that some fraction of the multiplier
bits are processed simultaneously in each section.
While this may slightly increase the time to complete a single multiply, the cost will be slightly decreased and overall speed will be significantly
increased.
As with multiply, a snapshot of Vector Floating
Add would reveal many sets of operands in motion
at once. Here we have the following possibilities:
Fetch operands from registers, determine the exponent difference, alignment shift, addition or subtraction, normalization shift, and store result in register.
To a good first approximation, for floating point
fraction lengths of about 32 bits these techniques
give approximately 16 times the speed and cost 16
times as much as the classical "parallel" algorithms
that use a parallel adder but do serial shifting and
process multiplier and quotient digits serially.
Hence, for given speed and cost to implement the
vector operations we obtain improved speed for operations involving less than n operands at a time.
The discussion of n arithmetic units will continue
but it should be kept in mind that we propose to
use n registers and a much smaller number of Execution Units. The exact number and form of the
Execution Units would be determined by word
length and the particular circuit characteristics.
VAMP COMPUTER
This section describes VAMP with word length, number of arithmetic units, number of memory units, and instruction set fixed. A simulator to investigate the organizational concepts described above was written to run on the 7094 for the particular organization described in this section. Since our study was to investigate concepts, not circuit and memory speeds, we will not supply such things as multiply times and, hence, performance improvement factors over convenient targets.
The simulator was not complete in the sense that interrupts were not programmed and I/O must be done outside the simulator program. The simulator will accept a program written in VAMP symbolic assembly language, assemble, and execute the program.
The simulated VAMP assumes 16 functional arithmetic units and 16 memory boxes of 16,384 words each. The word length is 36 bits. The floating point number representation is the same as the IBM 7094: the floating point fraction is binary, sign and magnitude; the exponent is excess 128. (-1.0, 0, and 1.0 are represented in octal by 601 400 000 000, 000 000 000 000, and 201 400 000 000 respectively.) The data and instruction formats are given below.
Data Format: Floating Point Word (sign; exponent, bits 1-8; fraction, bits 9-35); Double Length Index; Single Length Index/Address; Logical Acc./Screen.
Instruction Formats:
Scalar Register-Register (OP, X0, X1, X2); Scalar Register-Memory (OP, X2, F, I1, ADDRESS); Vector (OP, I2, F, I1, ADDRESS); Transfer (OP, I2, I3, F, I1, ADDRESS).
Control Words:
Program Status Word (interrupt conditions, bits 0-15; condition code CC, bits 16-17; instruction counter, bits 18-35); Program Status Word Mask (condition mask, interrupt branch address).
VAMP has 15 index registers. The I0, I1, I2 and I3 fields of the instruction formats refer to one of these registers or, if the field is 0000, to an implicit register that contains an unmodifiable zero.
To keep all address calculations out of the Mill, the index unit contains a complete set of arithmetic (add, subtract, multiply, and divide) and Boolean (AND, OR, equivalence) operations. The number representation in the index unit is 2's complement. The word length is normally 18 bits, with multiply producing a double length product and divide producing an 18 bit quotient and 18 bit remainder.
Memory addresses are calculated from information specified in the F, I1, and Address fields. The bit combination in the field I1 selects the index register to be used in modifying the Address field. The instruction is then executed as if its address field contained the stated address plus the contents of the index register.
Address modification is extended to include base address indirect addressing. Base address indirect is specified by a one in bit 13 of the instruction (the right-most bit of the flag, F, field). An address is computed by adding the contents of the index register specified by I1 to the address part of the instruction to form a memory address. Bits 13-35 at this base indirect address replace bits 13-35 of the instruction register. The process then repeats: a new memory address is computed from I1 and the address field, and bit 13 is examined for another level of base indirect addressing. The address that comes out at the end of the chain of indirect addresses is called the effective base address.
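The chain can be sketched as follows; the bit positions assume a 36-bit word with bit 0 leftmost, as in the format description, and the function and register contents are our own illustration:

```python
def effective_base_address(memory, index, i1, address, flag13):
    """Follow a chain of base-indirect words (36-bit words, bit 0 leftmost):
    each step adds index register I1 to the 18-bit address field; while the
    flag (bit 13) is on, bits 13-35 of the word found there replace the
    flag, I1 and address fields and the process repeats."""
    while True:
        ea = (index[i1] + address) % (1 << 18)
        if not flag13:
            return ea               # effective base address
        word = memory[ea]
        flag13 = (word >> 22) & 1   # bit 13 of a 36-bit word
        i1 = (word >> 18) & 0xF     # I1 field, bits 14-17
        address = word & 0x3FFFF    # address field, bits 18-35

index = {0: 0, 1: 100}
memory = {120: 7}                   # word with flag off, I1 = 0, address = 7
print(effective_base_address(memory, index, i1=1, address=20, flag13=1))  # 7
```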
Vector instructions, i.e., those that do 16 operations simultaneously, use the effective base address as the address of the first operand. The address of the second operand is determined by adding the contents of the index register specified by field I2 to the effective base address. Letting a0 represent the effective base address and i2 the contents of the index register addressed by I2, the address vector, a, is of dimension 16 and the components are (a0, a0 + i2, ..., a0 + 15 i2). All values 0 ≤ i2 < 2^18 are valid.
There is another form of indirect addressing
known as vector indirect addressing. In this mode
the address vector is used, not to address the operands directly, but to address an address vector. This
mode is indicated by a 1 in bit 12 of the vector instruction format. Vector indirect addressing does not proceed beyond 1 level; i.e., the address vector
fetched from memory is used as the operand address
vector without further modification. (When modification of the address vector is required it can be
fetched into the accumulator X and treated as
data.)
In scalar floating point or scalar Boolean operations the 4-bit fields X0, X1, and X2 refer to one of the 16 double length accumulators. The contents of the registers X0 and X1 serve as operands and X2 specifies the register to receive the result. For scalar register-memory instructions the contents of the memory location specified by the effective base address and the contents of the X register specified by the X2 field serve as operands. The result is placed in the X2 register.
To facilitate programming of loops, where one is processing 16 elements at a time, three loop closing instructions, VTILE (vector transfer on index low or equal), VTIH (vector transfer on index high), and VTCR (vector transfer on counter), are provided. These instructions combine stepping an index, testing the index, and conditional branching. They are made more powerful by having them set to the "do not execute" state the screen bits of arithmetic units which will not participate on the last iteration when there are less than 16 items to process.
Looking at the instruction VTILE in detail, the first step is to add 16 times the contents of index I2 to index I3 and put the sum in index I3. (The multiply and add is done by simply shifting index I2 four bits left and adding Mod 2^18.) Then, the new value of index I3 is compared against the contents of index (I3 + 1) Mod 16.
If the contents of index (I3 + 1) Mod 16 are greater than the contents of I3, a branch is not taken; the instruction counter is advanced by 1. If the content of index (I3 + 1) Mod 16 is less than or equal to the content of index I3, the branch is taken, that is, the effective address is placed in the instruction counter. The results of the test (> or ≤) are also noted for use in the following part of the VTILE instruction.
The contents of index I2 are now added to I3 and the sum Mod 2^18 is compared against the contents of index (I3 + 1) Mod 16. If the result of the comparison is the same as that on which the branch decision was made (i.e., both greater than, or both less than or equal), the screen bit 2 is not modified. If the results of the two comparisons are different, the screen bit 2 is set to zero (do not execute state).
The contents of index I2 are now added to the above sum. If this sum Mod 2^18 bears the same relation to index (I3 + 1) Mod 16 as did the addition on which the branch decision was made, screen bit 3 is not modified. If the relation changes, the screen bit is set to zero. This process repeats until 15 additions have been done.
Note that screen bit 1 is never modified; screen bit 2 may be modified at the end of the first addition, screen bit 16 may be modified at the end of the 15th addition.
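A hypothetical rendering of VTILE's stepping, branch test, and screen clearing (the branch sense and the 0-based screen-bit indexing follow our reading of the description above):

```python
def vtile(index, i2, i3, screen):
    """Sketch of VTILE: add 16*I2 to I3 (shift left 4, Mod 2**18), branch
    if index (I3+1) Mod 16 is <= the new I3, then do 15 more additions of
    I2, clearing the (0-based) screen bit k where the k-th comparison
    disagrees with the branch decision.  Returns True if branch taken."""
    M = 1 << 18
    limit = index[(i3 + 1) % 16]
    index[i3] = (index[i3] + (index[i2] << 4)) % M  # multiply by 16 via shift
    branch = limit <= index[i3]
    partial = index[i3]
    for k in range(1, 16):
        partial = (partial + index[i2]) % M
        if (limit <= partial) != branch:
            screen[k] = 0                           # "do not execute" for that AU
    return branch

index = [0] * 16
index[2], index[5], index[6] = 1, 20, 50  # increment, running index, limit
screen = [1] * 16
print(vtile(index, 2, 5, screen))  # False: limit 50 > new index 36, no branch
print(screen)                      # last two screen bits cleared
```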
The VTIH instruction differs from VTILE only in that the branch decision is reversed. If, after the first add-shift operation, the contents of index (I3 + 1) Mod 16 are greater than index I3 we branch, i.e., put the effective address in the instruction
counter; if the contents of index (I3 + 1) Mod 16 are less than or equal to index I3 the instruction counter is advanced by 1. All other operations are the same in VTILE and VTIH. The instruction VTCR uses an implicit -1 as the increment and the contents of the register I2 as the initial value. The comparison is against an implicit zero. Other than this the instruction is identical to VTIH.
The 15 additions required by VTILE, VTIH and
VTCR are performed by the same unit and in the
same way as addresses for a vector instruction are
generated.
The instruction set for VAMP has been designed
for the processing of vectors in memory, including
rows and columns of matrices. These will normally
have considerably more components than the number of AU's. Many operations such as compress,
search for largest, and sum and product reduction
(sum or product of all components) must operate over the entire vector even though only 16 are handled at any one time. The instruction set is designed around this concept.
Figure 5. The simulated Vamp CPU. (18-, 36-, and 72-bit data paths among the Mill, Memory and Control.)
The simulated CPU is shown in Fig. 5. The w, s, u, X and Z arrays perform the functions described in the second section of this paper. The address unit A contains three 18-bit registers and two 18-bit adders. Like Z, A is not seen by the programmer. It is used by the control for index and address arithmetic.
The right-most 18 bits of the program status word, PSW, contain the instruction counter. Bits 16 and 17 of the PSW contain the condition code. The results of all index operations as well as scalar operations in the Mill are used to set the condition code to indicate whether the result of the last operation was zero, less than zero, greater than zero, or overflowed. An instruction to test the condition code and branch accordingly is provided.
Bits 0-15 of the PSW and PSWM were not defined in the simulation. They are reserved for interrupt indicators, interrupt masks, and the interrupt
branch address (the location where a new PSW and
PSWM are to be picked up from and where the current ones are to be placed).
The registers I-BUF and IRB in Fig. 5 are used by a relatively simple anticipatory control. Instructions are executed in three levels. At level 1 the instruction is fetched from memory and placed in register I-BUF (instruction buffer). In level 2 the instruction op code is scanned to determine if it is a vector arithmetic instruction or one of the vector transfer instructions: VTILE, VTIH or VTCR. If it is one of the vector instructions the necessary addresses are generated. If it is not a vector instruction, no operation is performed. In step 3 the instruction is executed. For most instructions steps 1
and 2 can easily be overlapped with the execution
of the previous instruction. Note however that in no
case are the contents of registers visible to the programmer modified until the previous instruction
has been completed. Thus if an interrupt occurs at
the end of the current instruction some unnecessary
work has been done but no procedure for recovery
of previous register contents need be included.
PROGRAMMING EXAMPLE
The following small FORTRAN problem is coded
in the VAMP symbolic assembly language:
      DO 1 I = 2,59
      L = A2(I)
    1 A1(I) = C*A2(I) + B(L) + A2(I-1) + A2(I+1)
A1 and A2 are one-dimensional arrays of 60 elements. B is a one-dimensional array of 40 elements, the elements of which enter by indirect addressing. C is a constant. The following instructions are used (definitions given above).
LDA A, I1, F, I2    Load Address
Places the effective address into the index register addressed by the field I2.
GPS A, I1, F, I2    Generate Prefix in s
Sets the first a bits of the screen to one, where a is the effective address.
VLDX A, I1, F, I2    Vector Load X
Loads the contents of the memory locations specified by the address vector into the most significant half of X (subject to s).
VAND A, I1, F, I2    Vector AND
ANDs each bit of X with the corresponding bit in the memory locations specified by the address vector (subject to s).
VLXI    Vector Load X Indirect
Replaces the most significant half of X by the contents of the memory locations specified by bits 18-35 of X. Note that in this version of indirect addressing the address vector is assumed to be in X as the result of a previous operation.
VSTX A, I1, F, I2    Vector Store X
Stores the contents of the 16 accumulators (subject to s) in the memory locations specified by the address vector.
VFAD A, I1, F, I2    Vector Floating Add
Algebraically adds the floating point numbers specified by the address vector to the floating point numbers contained in X (subject to s). The sums are normalized.
VUFA A, I1, F, I2    Vector Unnormalized Floating Add
Same as VFAD except the sums are not normalized.
VFMP A, I1, F, I2    Vector Floating Multiply
Multiplies the floating point numbers at the memory locations specified by the address vector by the floating point numbers stored in the most significant half of the X registers (subject to s). The double length product appears in X.
VTILE A, I1, F, I2, I3    Vector Transfer on Index Low or Equal
See description in section VAMP Computer.
BSS A    Block Started by Symbol
BSS is a pseudo-operation that reserves a block of A consecutive storage locations.
OCT    Octal Data
OCT is a pseudo-operation that introduces binary data expressed in octal form into the program.
The VAMP assembly language program is shown
in Fig. 6.
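For readers tracing the example, the following Python fragment (ours; the array contents are arbitrary test data) restates what the program computes, 16 elements per strip with a screen for the short final strip:

```python
# A1(I) = C*A2(I) + B(L) + A2(I-1) + A2(I+1), with L = A2(I), for I = 2..59,
# processed 16 elements per "vector instruction" (hypothetical restatement
# of the Fig. 6 program, using 1-based indexing as in the FORTRAN source).
C = 0.5
A2 = [float(i % 7 + 1) for i in range(61)]  # indices 1..60 used
B = [float(k) for k in range(41)]           # indices 1..40 used
A1 = [0.0] * 61

n = 16
for start in range(2, 60, n):
    screen = [1 if start + k <= 59 else 0 for k in range(n)]
    for k in range(n):
        if not screen[k]:
            continue                        # AU inhibited by the screen
        i = start + k
        L = int(A2[i])                      # truncated integer, as in L = A2(I)
        A1[i] = C * A2[i] + B[L] + A2[i - 1] + A2[i + 1]

# Spot check against a direct scalar evaluation:
i = 30
assert A1[i] == C * A2[i] + B[int(A2[i])] + A2[i - 1] + A2[i + 1]
```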
CONCLUSION
The concept of an array processing computer is
due to SOLOMON. By taking advantage of the features inherent in interleaved memories, very high-speed arithmetic units, multiple register CPU's and by adding a number of special instructions one obtains a machine that has the functional capabilities
of SOLOMON but which fits within the framework
of a more conventional computer organization. Further, the ideas presented here should result in a machine which is applicable to a much wider range of
problems than SOLOMON.
There certainly exists a large class of problems
for which neither VAMP nor SOLOMON would
show any appreciable advantage over a more conventional organization. Compiling is probably the
best example of these.
LOCATION   OP      ADDRESS, I1, F, I2, I3    COMMENTS

           LDA     1, , , 1                  SET IR 1 = 1
           LDA     1, , , 2                  SET IR 2 = 1
           LDA     58, , , 3                 SET IR 3 = 58
           GPS                               SET SCREEN TO 16
BEGIN      VLDX    A2, 2, , 1                LOAD A2(I), A2(I+1), ...
           VUFA    FXZR                      PLACE BINARY POINT
           VAND    MASK                      REMOVE EXPONENT; A2 CONVERTED TO TRUNCATED INTEGER
           VUFA    LOCB                      ADD LOC OF B(1)
           VLXI                              LOAD B(A2(I)), B(A2(I+1)), ...
           VFAD    A2+1, 2, , 1              ADD A2(I+1), A2(I+2), ...
           VFAD    A2-1, 2, , 1              ADD A2(I-1), A2(I), ...
           VSTX    TEMP, , , 1               STORE TEMP. RESULT
           VLDX    A2, 2, , 1                LOAD A2(I), ..., A2(I+15)
           VFMP    C                         MPY BY C
           VFAD    TEMP, , , 1               ADD TEMP. VECTOR
           VSTX    A1, 2, , 1                STORE A1(I), ..., A1(I+15)
           VTILE   BEGIN, , , 1, 2           STEP INDEX CNTR 16
           END                               END TEST PROGRAM

DATA STORAGE

C          BSS     1
A1         BSS     60
A2         BSS     60
B          BSS     40
TEMP       BSS     16
LOCB               B
FXZR       OCT     233000000000
MASK       OCT     400777777777
           END

Figure 6. A VAMP assembly language program.

REFERENCES

1. K. E. Iverson, A Programming Language, Wiley, 1962.
2. J. Gregory and R. McReynolds, "The SOLOMON Computer," PGEC, vol. EC-12, no. 5, pp. 774-781 (Dec. 1963).
3. C. S. Wallace, "A Suggestion for a Fast Multiplier," PGEC, vol. EC-13, no. 1, pp. 14-17 (Feb. 1964).
4. R. K. Richards, Arithmetic Operations in Digital Computers, Van Nostrand, 1955, pp. 279-282.
MANAGEMENT PROBLEMS OF AN AEROSPACE COMPUTER CENTER
G. A. Garrett
Lockheed Missiles and Space Company
Sunnyvale, California
At this session it seems to me that you might be
interested in several of the more-or-less technical facets of the direction of a large aerospace computer installation. Consequently I will avoid competing with our environment by discussing the ubiquitous problems of recruiting, of personnel motivation, of obtaining cooperation among the members of the various computing groups, or even the
basic problems inherent in convincing our computer
folks that the whole computer center does not exist
for them at all, but rather as a service for the other
parts of our company.
Instead, I want to tell you today about a few of
the figures we have on the actual costs of
"Change"; then go into a few aspects of the
"Turn-Around-Problem" from the management
point of view; and finish with a few remarks on
what a computer center such as ours may reasonably expect in the future.
While there are many fields in which constant
change is the order of the day, the operation of virtually any modern computer center is faced with
adjustment to changing computers, changing computer languages, and changing software systems
with a frequency which is quite notable. In a recent
attempt to analyze the economic effects of such
changes, several interesting relationships have been
noted.
In the past, the speed with which a newly installed computer has been "loaded" has often been
of interest, but few figures have appeared which
treat as a dynamic quantity the relationship between the program checkout load and the production load. Such relationships must be known, however, before one can evaluate the effects of change,
since the purely static "before and after" pictures
tend to conceal many of the significant points.
In analyzing the dynamics of the situation, some
simple relationships have been postulated, and their
predictions compared with those historical data
which were obtainable at the LMSC computation
center.
First, it was assumed that the loading on a computer could be divided between check-out and
production, and that the ratio of these two would
vary with time from installation. Since the proportion of computer programs which run for production without previous check-out is vanishingly
small, it would seem reasonable that the load on a
newly installed computer initially must consist solely of program development work. From such a
starting point it follows that the ratio of production
to development will show a continuous increase until it either levels off with time, or the pattern becomes confused by changes in language, operating
system, or type of work processed by the center.
Both the ratio of production to development at
steady state and the rate at which this ratio approaches steady state must be determined in order
to understand and to evaluate the economics involved.
Historical data obtained subsequent to the installation of two types of computers, the IBM 709's
which replaced Univac 1103's and the IBM 1410's
which were installed to handle administrative systems, shows the patterns given in Figs. 1 and 2 respectively.
It can be seen that the data in both cases have a
basic similarity, and that the experience in general
seems to follow the same type of curve. The
smoothed curves themselves were obtained by assuming that the development load would be reduced
half-way to the steady state load during each four
month period.
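The smoothing assumption just described amounts to an exponential approach to the steady state with a four-month half-life; as a sketch (the numeric values are illustrative, not taken from the figures):

```python
# Sketch of the smoothing assumption used for the curves of Figs. 1
# and 2: the development share of the machine load moves half-way to
# its steady-state value during each four-month period. The initial
# and steady-state percentages below are invented for illustration.

def development_share(months, initial=100.0, steady_state=20.0):
    """Percent of machine load devoted to program development."""
    return steady_state + (initial - steady_state) * 0.5 ** (months / 4.0)
```

At installation the load is all development; after four months the development share has dropped half-way toward its steady-state value, after eight months three-quarters of the way, and so on.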
Data on the introduction of new computer languages have been somewhat more difficult to obtain, and tend to be less definitive. Hardware
changes are necessarily abrupt. Software changes
need not be. However, records have been found
which show the ratio of production to development
following a fairly general shift from Fortran I to
Fortran II on the IBM 7090's which started in June
of 1963. Since the scatter of these data is considerably greater than it was for the introduction of a
computer itself, and since figures are not available
for rework as a separate item, smooth curves were
not derived from the data. Instead, the curves from
Fig. 1, the introduction of the 709's, were plotted
Figure 3. Analysis of UNIVAC 1107/1108 turnaround time as a function of workload (manual handling times exterior to hardware system excluded for LOMUSS I simulation). [Plot: average turnaround time in hours versus daily job input to system, 0 to 400 jobs; curves labeled 1108 Exp. A, 1108 Exp. B, and 1108 Exp. C; actual-data points dated 4-2-64 and 3-29-65.]
Notes: • = LOMUSS I model data, manual handling times excluded, adequate drum buffer capacity assumed. @ = actual data, manual handling times included. * = with 400 jobs input, 59 were remaining in system at end of 24 hr.
SYSTEMS ANALYSIS
The simulations were conducted primarily for
determining an 1108 system configuration for October 1965, but information obtained from the
1107 simulations in April was useful to the then
operating system.
With the workload characteristics remaining approximately constant, the model showed that a 3-shift operation would be required when the workload rate reached approximately 250 jobs per day.
This can be seen in Fig. 2 when a linear interpolation is made between the I/O queue plots of 200
and 300 jobs per day. That is, a queue of jobs will
still be on the drum after midnight, which is the
end of the second shift, if the workload rate exceeds
250 jobs per day.
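The interpolation argument above can be sketched as follows; the queue values used are invented placeholders, since the figure's actual numbers are not reproduced here — only the method of interpolating the midnight drum-queue size between the 200- and 300-job-per-day simulation runs follows the text.

```python
# Linear interpolation between two simulated workload points to find
# the workload at which jobs first remain on the drum at midnight
# (end of the second shift). A negative "queue" here denotes spare
# capacity; the sample values below are illustrative only.

def saturation_workload(x0, q0, x1, q1):
    """Workload (jobs/day) where the midnight queue crosses zero."""
    return x0 + (0.0 - q0) * (x1 - x0) / (q1 - q0)

# e.g. 50 jobs of spare capacity at 200 jobs/day and a 50-job queue
# at 300 jobs/day place the threshold at 250 jobs/day:
threshold = saturation_workload(200, -50.0, 300, 50.0)
```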
There were some manual procedures that were
immediately implemented to minimize the drum
saturation problem, namely, when the operator detected a saturation condition he could stop the input flow by stopping one or more card readers
and/or he could dump some of the drum output
queue onto tape for later processing during a low
workload period (which would increase the turnaround time). The model made a contribution by
pointing up the importance of the console operator
to system efficiency and by so doing, helped accelerate work on improved manual procedures and automatic system status indicators (for example, modification of the executive routine to give an online console printer message when the I/O drum
buffer became saturated).
Another pertinent piece of information was the
350-job-per-day 1107 system capacity predicted by
the model as illustrated in Fig. 2. This provided a
useful guideline to establishing system programmer
support levels and program conversion (from other
systems to the UNIVAC system) schedules prior to
installation of the faster 1108 CPU system.
The next step in designing and conducting the
simulation experiments was to incorporate some tentative system hardware plans into the model. These
plans are summarized as follows:
• Installation of a high-speed data link between
the central system and a high-use remote
station.
• Installation of an additional on-site UNIVAC
1004 print, card read and punch system.
• Replacing the 1107 CPU with an 1108 CPU.
Three experiments were conducted which recognized the above system changes combined with a
forecasted change in the characteristics and level of
the 1108 workload. The output from these experiments was the primary information used for determining the 1108 system configuration for October
1965.
Some output from experiment A is illustrated in
Fig. 4, where the original 1107 model configuration
was run with (1) a 350-job-per-day input which
was the maximum forecasted level of work through
1965 and (2) the faster 1108 CPU in place of the
1107. Three system performance measures were
significantly affected in experiment A:
• The average job turnaround time for 350 jobs
dropped from over 4 hours to about 1 hour.
• The peak I/O drum buffer requirement
dropped from almost 2 million to about
900,000 words.
• System throughput increased substantially as the 350 jobs were processed in only 2 shifts instead of 3.
• The mix of the I/O queue changed from primarily input to primarily output, and the peak I/O drum buffer requirement rose to almost 2 million words.

Figure 4. Experiment A: present 1107 system with 1108 CPU and a 350-job-per-day workload. [Plot: input- and output-queue sizes by hour of day over two shifts.] Notes: 1. Average turnaround time = 1.02 hours (manual handling times exterior to hardware system excluded). 2. 1108 CPU utilization = 49% (two-shift operation). 3. Maximum requirement for I/O drum buffer storage occurs at 5:10 PM and equals 1,860,000 words.

The output from experiment C is illustrated in Fig. 6, where experiment B was repeated except that the mean run time of the jobs was raised to 6 minutes.

Figure 6. Experiment C: same as experiment B but with an average CPU time of 6 minutes instead of 2.54 minutes (as measured
Notes: 1. Average turnaround time = 0.45 hours (manual handling times exterior to hardware system excluded). 2. Maximum requirement for I/O drum buffer storage occurs at 10:30 AM and equals 354,000 words.
V(t) > 0 represents inhalation.
V(t) < 0 represents exhalation.
F(t) represents the CO2 concentration in the gas at the mouth.
For V(t) > 0, F(t) = FI(t), i.e., the inhaled concentration.
For V(t) < 0, F(t) = FE(t), i.e., the exhaled concentration.
FA(t) represents the alveolar concentration of CO2.
We use a two-compartment model of the lung.

Compartment 1, the dead space. This is purely a region of gas transport; there is no exchange. Its volume = VD.

Compartment 2, the alveolar compartment. This is a region only of gas exchange; there is no transport. This compartment is assumed to have no concentration gradients.

When gas is exhaled, there is some mixing of gas from the two regions, but the first portion of the exhalate represents dead space gas, the last portion represents alveolar gas. Thus we have the following curve of concentration versus time for carbon dioxide in the exhaled gas:
[Figure: exhaled CO2 concentration versus time for a single exhalation, rising from the dead space CO2 concentration to the alveolar CO2 concentration plateau at the end of exhalation.]

Thus sup [FE(t)] = FA(t) and inf [FE(t)] = FI(t), since the dead space gas is simply the gas inhaled on the last breath, and left unchanged. (The inhaled gas is usually considered to be of constant composition during any given inhalation.)

The minute volume MV, the average rate of gas movement out of the lung, can be mathematically defined as

    MV = (1/T) ∫0T |V(t)| dt

The tidal volume VT is the volume of gas moved out of the lung on a given breath. The alveolar volume VA is the volume of gas moved out of the alveolar compartment on a given breath. Obviously, VT = VA + VD. Similarly, we can define the alveolar ventilation rate,

    AV = (Σ VA) / T,

the sum being taken over the breaths occurring in time T.

Let VE CO2(n) = volume of CO2 exhaled on the nth breath, and VI CO2(n) = volume of CO2 inhaled on the nth breath. Obviously,

    VE CO2(n) = ∫(nth breath) V(t) FE(t) dt
    VI CO2(n) = ∫(nth breath) V(t) FI(t) dt

Since the CO2 concentration in the dead space is FI and its volume is VD, the volume of CO2 exhaled from the dead space = VD · FI. And the volume of CO2 exhaled from the alveolar compartment = VA · FA. Thus

    VE CO2 = VA · FA + VD · FI.

If FI = 0, a common case, then

    VA = VE CO2 / FA.

If FI ≠ 0, then

    VE CO2 = VA · FA + (VT − VA) · FI.

But VT · FI = VI CO2, so

    VA = (VE CO2 − VI CO2) / (FA − FI).

These two formulae for VA are known as the Bohr formulae. We will now discuss the processing of the above data.
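The Bohr formulae lend themselves to direct computation. As an illustrative sketch (the function name and the sample values are ours, not the authors'):

```python
# Alveolar volume via the Bohr formulae derived above:
#   VA = (VE_CO2 - VI_CO2) / (FA - FI),
# which reduces to VA = VE_CO2 / FA when FI = 0 (the common case).
# Units are arbitrary but must be consistent.

def alveolar_volume(ve_co2, fa, vi_co2=0.0, fi=0.0):
    """Volume of gas moved out of the alveolar compartment per breath."""
    return (ve_co2 - vi_co2) / (fa - fi)

# Common case: 22 cc of CO2 exhaled at an alveolar fraction of 0.055
va = alveolar_volume(22.0, 0.055)   # 400 cc
# The dead space then follows from VT = VA + VD:
vd = 500.0 - va                     # a 500 cc tidal volume gives VD = 100 cc
```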
This paper discusses an improved version of two
systems previously reported. An attempt has been
made here to perform the various operations in the
appropriate (analog or digital) section of the equipment instead of doing them all in the "analog" section.
EQUIPMENT
The system used consists of two transducers, special-purpose analog computing equipment with digital readout, and digital computing facilities.
The transducers are (a) a pneumotachograph
(Fleisch), strain gage (Statham PM 15) and amplifier (Statham CA 9-10), and (b) an infra-red carbon dioxide analyser (Godart).
The computing equipment consists of 25 operational amplifiers, some with chopper stabilization (G. A. Philbrick Researches, K2PA and K2W), and a multiplier (GAP/R, K5M). Plug-in units for the amplifiers were fabricated by the author from modules (K3). Control circuitry was synthesized from digital modules by Tech-Serv (B.R.S.). Read-out equipment is by Hewlett-Packard, and the digital computer is a Control Data Corp. 160 A.
READ-OUT

Per breath:
  Arterial carbon dioxide tension
  Alveolar carbon dioxide concentration
  Inhaled carbon dioxide concentration
  Volume of carbon dioxide exhaled
  Tidal volume
  Alveolar ventilation (anatomic)
  Alveolar ventilation (physiologic)
  Anatomic dead space
  Physiologic dead space
  Alveolar dead space (difference of DS(Anat) and DS(Phys))

Per minute:
  Minute volume
  Rate of carbon dioxide excretion
  Anatomic alveolar ventilation rate
  Physiologic alveolar ventilation rate
  Alveolar-arterial difference of CO2 concentration
Computation (Fig. 1)
From the pneumotachograph, strain gage and
amplifier system a signal arises representing the instantaneous flow rate of the patient's exhalation or
inhalation. A small sample (approximately two liters per minute) is taken from this stream and
passed through the sampling head of the carbon
dioxide analyzer from which is obtained a signal
proportional to the carbon dioxide concentration in
the gas stream (which is lagged approximately 300
ms). To synchronize the two signals, the "flow"
voltage is delayed by an equal amount. This is performed using a modification (Fig. 2) of the Padé approximation devised by Dr. P. D. Hansen. The flow signal is rectified and integrated, thereby giving the volume exhaled for that breath. The flow signal and the carbon dioxide signal are multiplied and integrated, and this integral for each breath represents the volume of carbon dioxide exhaled per breath. The peak exhaled or end-tidal carbon dioxide tension is obtained using a peak-follower technique, and the inhaled carbon dioxide concentration from the inverted curve in the same way. These peak followers are reset after being read out on each breath. A constant voltage is integrated for the period of the breath to give a measure of the time taken for that breath. All five quantities are placed on memory circuits at the end of each breath. The integrators are reset and computation recommences.

Figure 2. Delay circuit: 150-ms time delay (suggested by P. D. Hansen).

Figure 3. Control circuitry: contact closure (from analog unit).

Control Circuits (Fig. 3)
These consist of a series of digital modules by
Tech Serv (B.R.S.). The input pulse to this system
is obtained from a voltage crossing detector and relay on the analog unit. This then initiates a pulse
train in the digital system which proceeds through a
series of one-shots of variable delay time, each one of which is triggered by the trailing edge of the pulse from the preceding one-shot. The time delays are adjusted to the appropriate values. The first one-shot operates the relay (A) which connects the integrators to the memory circuits. The second one-shot is a delay to allow for closure of this relay and a small dead time, followed by operation of the second (shorting) relay (B), which is operated from a third one-shot. Another one-shot is used to provide a suitable delay between readout of the first channel (FA CO2) and reset of this unit (relay C).
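The per-breath analog computation described above can be sketched digitally as follows. This is our illustration, not the authors' equipment: the function name and the uniform sampling step dt are assumptions, and exhalation is taken as negative flow, following the sign convention given earlier.

```python
# Per-breath processing: rectified flow is integrated for exhaled
# volume, flow x CO2 is integrated for the exhaled CO2 volume, a peak
# follower captures end-tidal CO2, and a constant-rate integrator
# measures breath duration. Exhalation samples have flow < 0.

def process_breath(flow, co2, dt):
    """Return (exhaled_volume, co2_volume, end_tidal_co2, breath_time)."""
    exhaled_volume = sum(-f for f in flow if f < 0) * dt
    co2_volume = sum(-f * c for f, c in zip(flow, co2) if f < 0) * dt
    end_tidal_co2 = max(co2)          # peak follower, reset each breath
    breath_time = len(flow) * dt      # constant-voltage integrator
    return exhaled_volume, co2_volume, end_tidal_co2, breath_time
```

These four quantities, together with the inhaled CO2 concentration, correspond to the five values placed on the memory circuits at the end of each breath.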
Calibration

Calibration of this equipment is rather complex due to the large number of functions performed, and every attempt is made to cross-check during calibration. The carbon dioxide analyzer is calibrated with gases of known chemical composition. Its response to these is linear to within plus or minus 1 mm pCO2 at 760 mm barometric pressure. The pneumotachograph, strain gage and amplifier system is calibrated by passing oxygen through a flowmeter which delivers a known amount of gas for any particular position of the rotameter. Stability and linearity of this system are excellent, the only aspect requiring frequent adjustment being the zero level, which is sensitive to positional changes of the transducer. The ability of the system to record accurately the volume passing through the pneumotachograph is tested by comparing the computer output with a volumeter and spirometer. Agreement here is excellent, ±3 percent. Stability and accuracy of the integrators are tested with a sine wave of known dimensions, and again here reproducibility and accuracy are better than 2 percent.

Finally the system is tested by the simulation of a dead space. Calculation of dead space is the most revealing calibration statistic of the machine, because the dead space represents the difference between two fairly large values, namely the tidal volume and the alveolar ventilation, which differ by only about 20 percent. Consequently errors in these quantities are reflected in an extreme fashion in the dead space calculations. Therefore a homogeneous carbon dioxide mixture is flushed through the pneumotachograph to simulate a zero dead space. Results of this typically indicate an average mean dead space determination of the order of 5 cc for a total volume of 500 cc passed through the pneumotachograph. Artificial dead spaces of 40 and 95 cc have also been constructed. The average of the mean of 10 determinations for the 40 cc dead space was 41.6 cc in one instance and 45 cc in another. The average of the mean of 10 determinations for the 95 cc dead space was 97 cc in one instance and 92 cc in another. These results lead us to have some confidence in the ability of the equipment. On the other hand, this confidence can only be maintained if calibration is conscientiously and frequently performed.

Readout
A multiplexing device connects the five memory
circuits to a digital voltmeter sequentially. The digitized values are then printed or punched out. The
first system, the printing system, is a slow speed
unit consisting of a multiplexing device (Dymec C 2900 A), a digital voltmeter (Hewlett-Packard 405 CR) and a printer (Hewlett-Packard 561 B). This system can read out five parameters in approximately 2.5 seconds. This speed is adequate as long as we
do not have to have an observation on every breath.
(If the readout sequence is not completed the integrators are merely reset and computation recommences. The integrators are not connected to the
memory units in this situation).
The other system is faster and consists of a similar stepping-switch type of multiplexing device (Dymec C 2901 A) which connects the memory circuits through a 5-place digital voltmeter with a 10-ms sampling time (Dymec 2401). The output of this is put on punched paper tape by a teletype unit (BRPE-11). This latter system is of course much faster and will read out 5 parameters within approximately 7/10 of a second. This format is also much more convenient as it can be read directly into the digital computer (CDC 160 A).
With the printing system data must be transferred
onto cards, which is rather tedious.
Programming
Several programs are then available to us. The
first program simply removes the scale factors used
in the analog equipment and punches in the conventional units. Several types of manipulation are performed upon the scaled data, of which a few examples are as follows.
ANALOG-DIGITAL DATA PROCESSING OF RESPIRATORY PARAMETERS
We might desire a plot of alveolar ventilation
rate against end-tidal carbon dioxide tension. This
type of plot is useful in studies of the sensitivity of
the respiratory center and the effect of drugs upon
it. It is usually necessary to smooth this plot. The
technique employed is to average the ventilatory
rates over five breaths. Similarly the end-tidal carbon dioxide tension is averaged over five breaths
(each tension is weighted by the time of the breath
to obtain a meaningful average). This type of plot
has been used by us extensively in assessment of the
depressant effects of narcotics.
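The five-breath smoothing just described might be rendered as follows (an illustrative sketch; the function name and the grouping into non-overlapping blocks of five are ours):

```python
# Smooth a ventilation-rate vs. end-tidal-tension plot over groups of
# n breaths. Rates are averaged directly; tensions are weighted by
# breath duration, as the text specifies, to give a meaningful average.

def smooth(rates, tensions, times, n=5):
    """Return a list of (mean rate, time-weighted mean tension) pairs."""
    out = []
    for i in range(0, len(rates) - n + 1, n):
        rate = sum(rates[i:i+n]) / n
        tension = (sum(p * w for p, w in zip(tensions[i:i+n], times[i:i+n]))
                   / sum(times[i:i+n]))
        out.append((rate, tension))
    return out
```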
Another typical problem is determination of the
relationship between tidal volume and alveolar ventilation. The latter is determined by the use of the
Bohr formula above. This is a fairly elementary
program and a plotting routine is incorporated here
also to avoid the tediousness of plotting the large
amount of data.
Using the formula for the case when the inhaled
concentration of C02 is not zero, we must compute
the net output of C02 for each breath, as the denominator for the previously derived formula:
VA = (VE CO2 − VI CO2) / (FA CO2 − FI CO2)
For this case the analog equipment is adjusted to compute the product of the concentration signal and the full flow signal, rather than the rectified signal.
By an obvious adaptation of the above program
we can plot the net rate of CO2 production against
time for any time interval of interest. Such a plot is
of interest because this parameter indicates the overall rate at which blood is returning from tissues in a
normal metabolic state. A sharp drop in CO2 output
would indicate that the rate of return of blood to the
heart was reduced or that there had been a severe
metabolic disturbance. Such information could be
useful for an anesthesiologist during a difficult procedure.
Comparisons between the partial pressure of CO 2
in the arterial blood and in the lung gases are of
interest inasmuch as any great differences reflect
inefficiencies in the lung as an exchange device. True
comparison is not usually made directly, but rather
the "physiological alveolar ventilation" is determined.
This somewhat empiric parameter is the result of
replacing the FA in the Bohr formula by Fa, i.e., the
fractional concentration corresponding to the partial
pressure of C02 in arterial blood. Comparing the
volume so obtained with the "alveolar ventilation volume" V A, defined in the introduction, allows us to
express the inefficiency in terms of a volume of the
lung (referred to as the alveolar dead space = VA(phys.) - VA) which receives an adequate blood supply but an inadequate gas supply.
It would be easy to extend the above techniques
to obtain many other parameters of respiration, of
interest to the respiratory physiologist and the clinician, such as the timed vital capacity, one-second
expiration, etc.
CONCLUSION
Techniques are outlined for rapid data processing
of respiratory parameters. It is suggested that these
techniques are much more efficient than the classical techniques of chemical analysis, etc. Much more
data is obtained and the maximum number of parameters can be calculated from an individual experiment. It is suggested that we can have a fruitful union of medicine and data-processing technology.
COMPUTER SIMULATION - A SOLUTION TECHNIQUE FOR MANAGEMENT PROBLEMS
Alan J. Rowe
Graduate School of Business
University of Southern California
Los Angeles, California
Although computers are currently being used primarily for rapid processing of data, there is little doubt that computer-processed information will be a requirement in providing management with timely and accurate data for evaluation, analysis, and as an aid in decision making. At the top management level, decisions are concerned with directing the organization and providing means of assuring its survival. To achieve maximum effectiveness at the operating level, plans and policies must be applied to the available resources, subject to specified constraints and risks. However, there is generally insufficient information for these decisions, and they often cannot be structured as a set of procedures. But, most important, policy decisions are based on a blend of intuition, experience and emotion.

Looking more specifically at the management control process, measurement, reporting, evaluation, decision rules, and feedback are susceptible to computer processing. However, the decision criteria, plans and objectives are still subject to human judgment. At the operating level, where the physical processes in a system are applied, there is the highest opportunity for computer application. Numerous examples exist of automated data processing, essentially in production and inventory control and accounting. Examples of decision rules applied to physical processes can be found in the Journal of Operations Research, Management Science, and the Journal of Industrial Engineering. The interesting fact, however, is that in all of these instances of automated data processing, supervisors are still required to deal with the workmen who actually operate the processes. Specific data processing activities can and have been automated, yet management still performs the basic decision functions.
THE USE OF COMPUTER SIMULATION
In view of the intricately complex nature of large
business systems, it is difficult to evaluate new
management concepts or system designs. Direct experimentation poses almost insurmountable problems due to disruptions, uncontrolled results, length
of time required, and possibility of costly mistakes.
Computer simulation, on the other hand, has been
shown to provide a suitable methodology to study
business system behavior under a variety of conditions, and provide a means for analysis of simultaneous interaction of the many system variables to
yield valuable insights. In view of its capability of
rapid interrogation of system performance, simulation is becoming an integral part of "real time systems."1
Computer simulation can be considered as an attempt to model the behavior of a system in order to
study its reaction to specific changes. The simulation model is seldom an exact analogue of an actual
system. Rather, it is an approximation of continuous time dependent activity. If the properties and
elements of the system are properly defined, then
the tracing through by the computer of the simultaneous interaction of a large number of variables
provides the basis for studying system behavior. A
model of the system indicates relationships which
are often otherwise not obvious and has the capability of predicting system behavior which results
from changes in system design or use of alternate
decision rules.
For many years engineers have used scaled models to simulate system response. The armed forces
have used exact duplicates of operating systems for
training. There have been laboratory studies which
can be considered similitude or an attempt to duplicate reality in a laboratory environment. This has
been extended to the use of management games
where people interact with the output of a computer
and make decisions on information received. Computer simulation has been directed toward the use
of models of the behavior of a system so that the
results correspond to the problem being studied.
Abstract mathematical models, on the other hand,
are used for problems which correspond with reality
to a sufficient degree to produce useful solutions.
Not only has simulation increased in use as a
means for studying and understanding new problem
areas, but it has a number of distinct advantages.
Once a simulation model is completed, the time for
experimentation is considerably reduced. The cost
of simulation models is now being reduced to the
point where for larger problems it is an extremely
economical tool. The fact that all the work is done
in a computer rather than a laboratory or actual operating environment provides better experimental
design and control. The ability to explain the simulation model in terms of a real problem is a far
more useful tool than some of the analytic techniques which cannot be described to management
or the potential user.
PROBLEMS IN SIMULATION
Although simulation has many advantages, one
should not overlook the difficulty involved in developing a model, programming it on a computer, and
utilizing the results. Although computer simulation
has been used for a number of years, there are still
many pitfalls that must be avoided. One of the
greatest difficulties is that of developing a suitable
model. Another is the use of computers in the simulation process which poses a number of problems,
including computer programming, search techniques, data storage and retrieval, function generators, etc. The computer programming problem has
in many instances proven to be a major stumbling
block.
In recent years there have been a number of approaches taken to minimize the programming problem. One is the development of models to study a
specific area, such as Job Shop Simulation.2 Using
this type model, the user is required to provide appropriate data and a description of the facility to be
studied, and the computer program needs little modification. A similar approach has been taken in the
Gordon General Purpose Simulator.3 A somewhat
more general approach to this problem has been
tackled by the use of the DYNAMO Compiler,4 in
which a set of equations is submitted to the computer, which in turn compiles these and generates a
computer program. Therefore, once the model is
completed, no further programming is required. As an
alternative to writing directly in machine language,
a simulation language has been developed called Simscript.5 Once the model is written, no further programming is required. The Simscript language has all the flexibility of a computing language but much of
the simplicity of a special purpose approach. Quickscript6 and programming by questionnaire7 are extensions of this approach to developing useful simulation languages. Thus, depending on the type of problem being undertaken, it is possible to use a variety
of approaches to obtain a computer program. Several of the computer manufacturers have developed
standard programs which are readily available and
require no further computer programming effort.8
A second problem is in the area of experimental
design. Considerable effort is often expended in an
attempt to obtain information, and it is often done in
an inefficient manner based upon poor input data.
It is therefore necessary to consider computer simulation as an equivalent to a laboratory experiment.
Before any simulation is undertaken, areas of payoff or urgency should be established and the feasibility of completion of the project with estimates and
COMPUTER SIMULATION -
A SOLUTION TECHNIQUE FOR MANAGEMENT PROBLEMS
budgets should be provided. When defining the problem, careful observations should be made, correct statements of what is being studied prepared, and discussions held with experienced personnel. Preliminary approaches or brainstorming should be undertaken in an attempt to define solutions to the problems being studied. Organization of the data, the use of sample versus exhaustive representation, and the use of statistically designed experiments should all be incorporated. This becomes particularly important when trying to compare one system design to another on a rigorous basis. Simply because a problem is run on a computer does not mean the result is either valid or statistically significant.
A number of fairly significant techniques have
been developed for analysis and evaluation of data.9
Some of these are referred to as Monte Carlo sampling or importance sampling. In these techniques
the data are handled in such a way as to minimize
the amount of data required and to maximize the
information that can be derived from the manipulation of the data. In many applications the use of
analysis of variance or regression analysis is very
important. It is necessary in evaluating the results
of a simulation to have the appropriate criteria and
measures of system performance. These, of course,
do not depend on the simulation but rather on the
user.
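The variance-reduction idea behind these sampling techniques can be sketched briefly. The sketch below is illustrative only (the distribution, threshold, and sample sizes are invented, not taken from the paper): it estimates a small tail probability first by direct Monte Carlo sampling and then by importance sampling from a heavier-tailed proposal, reweighting each draw by its likelihood ratio.

```python
import math
import random

random.seed(7)

# Goal: estimate the small tail probability p = P(X > a) for X ~ Exp(1).
# True value is exp(-a). All values here are invented for illustration.
a = 8.0
true_p = math.exp(-a)
N = 10_000

# Direct Monte Carlo: most draws never exceed the threshold, so almost
# every sample is wasted on the uninteresting region.
plain = [1.0 if random.expovariate(1.0) > a else 0.0 for _ in range(N)]
plain_est = sum(plain) / N

# Importance sampling: draw from a heavier-tailed proposal Exp(1/a) so the
# rare region is visited often, and reweight by the likelihood ratio
#   w(x) = f(x) / g(x) = a * exp(-x * (1 - 1/a)).
def weighted_draw() -> float:
    x = random.expovariate(1.0 / a)               # proposal sample
    if x <= a:
        return 0.0
    return a * math.exp(-x * (1.0 - 1.0 / a))     # likelihood-ratio weight

isamp = [weighted_draw() for _ in range(N)]
is_est = sum(isamp) / N

def sample_var(xs, mean):
    return sum((v - mean) ** 2 for v in xs) / (len(xs) - 1)

plain_var = sample_var(plain, plain_est)
is_var = sample_var(isamp, is_est)
```

For the same number of draws, the reweighted estimator concentrates its samples in the rare region and so has a far smaller variance than the direct estimator, which is precisely the sense in which such techniques maximize the information derived from the data.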
The problem of modeling is important since the
results of simulation are no better than the model
used. A model provides a formal statement of system behavior, in symbolic or mathematical form.
The model should be constructed so that the parameters, variables, and forcing functions correspond to
the actual system. The parameters should include
properties which are sufficient to define the behavior of the system; whereas the variables are the
quantities which describe the behavior for a given
set of parameters. The forcing function provides the
stimulus, external to the system, which causes the
system to react. For example, job orders which enter a production system cause men to work, machines to run, queues to form, etc. In this way, job orders become the forcing function for the system.
Whatever particular form is used, a model provides
the frame of reference within which the problem is
considered.
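The parameter/variable/forcing-function distinction can be made concrete with a small sketch. The arrival rate, service time, and horizon below are invented for illustration; this is not the paper's program. Job orders arriving at random from outside the system are the forcing function, the fixed service time is a parameter, and the machine state and waiting times are the variables that describe the resulting behavior.

```python
import random

random.seed(42)

# Forcing function: job orders arriving at random from outside the system.
# Parameter: a fixed service time. Variables: machine state and waiting
# time. All numbers are invented for illustration.
ARRIVAL_RATE = 0.2      # mean job orders per hour (the forcing function)
SERVICE_HOURS = 4.0     # parameter: processing time per job
HORIZON = 10_000.0      # simulated hours

clock = 0.0
machine_free_at = 0.0   # variable: when the machine next becomes idle
jobs_completed = 0
total_wait = 0.0

while True:
    clock += random.expovariate(ARRIVAL_RATE)   # next job order arrives
    if clock >= HORIZON:
        break
    start = max(clock, machine_free_at)         # queue forms if machine busy
    total_wait += start - clock
    machine_free_at = start + SERVICE_HOURS
    jobs_completed += 1

avg_wait = total_wait / jobs_completed          # variable summarizing behavior
```

Without the stream of job orders the machine never runs and no queue forms; the external stimulus alone sets the system in motion.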
A model need not duplicate actual conditions to
be useful. The model should be designed to predict
actual behavior resulting from changes in system
design or application of new decision rules. Prediction implies an understanding of the manner in
which the system reacts; that is, being able to specify the outputs for a given set of inputs.
Models are merely the basis for testing new ideas
and should not become ends in themselves. The
simpler the model, the more effective for simulation
purposes. Tests should be made prior to model
building to determine the sensitivity of the characteristics which are incorporated. Typically, certain
key characteristics contribute the majority of the
information to be derived from simulation. Other
characteristics, although more numerous, do not
contribute much to the final system design. In this
sense, simulation can be considered as sampling the reaction of a system to a new design. It is imperative, therefore, that a representative sample be taken, rather than an exhaustive one. Thus, the number and type of characteristics to be included should be carefully selected.10
The major task of simulation is reached at this
point. A logical model, which is merely descriptive,
is not suitable for computer simulation. The model
must be modified to suit the particular computer on
which it will be programmed. Factors such as kind
of memory, speed of computation, and errors due to
rounding must all be taken into account. Simplification is often necessary due to speed of computation
or limitations of computer memory. The methods of filing information and of representing time are also significant problems. Which data to accumulate, and
at what point in time, often are difficult to decide
beforehand. Thus, the program must be flexible and
easy to change.
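The problem of representing time mentioned above is commonly handled with a next-event list: rather than advancing the clock in fixed increments, the program jumps directly to the next scheduled occurrence. A minimal sketch, with invented event names and times:

```python
import heapq

# Next-event time advance: the clock jumps to the earliest scheduled
# occurrence instead of ticking in fixed increments. Event names and
# times are invented for illustration.
events: list[tuple[float, str]] = []

def schedule(t: float, what: str) -> None:
    heapq.heappush(events, (t, what))

schedule(4.0, "machine breakdown")
schedule(1.5, "job order arrives")
schedule(2.5, "job completed")

log = []
clock = 0.0
while events:
    clock, what = heapq.heappop(events)  # advance straight to the next event
    log.append((clock, what))
```

Because the heap always yields the earliest pending event, no computation is spent on intervals in which nothing happens, which eases the speed and memory limitations discussed above.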
As computer programming proceeds, there is
generally feedback which provides the basis for further modification of the model. At the outset, the
decision must be made whether to make the program general or special purpose. The type of programming changes radically, depending upon the
end use of simulation. Modular programming which
treats each section independently provides flexibility at a small cost in computation time and storage.
In view of the many logical relations which exist in
systems, computer programming represents an important aspect of the problem.
SUCCESSFUL APPLICATIONS OF
SIMULATION IN MANAGEMENT PROBLEMS
A considerable body of literature exists covering
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965
the use of simulation in various applications.11,12,13 As
shown in Fig. 1, simulation should be thought of as
a continuum, starting with exact models or replication of reality at one extreme and completely abstract mathematical models at the other. When viewed
in this manner, the breadth of simulation can be
appreciated.
Figure 1. Continuum of simulation, ranging from exactness (physical systems) to abstraction (non-physical systems).
The wide variety of simulation applications is
somewhat astounding. Not only has simulation been
extremely successful for purposes of studying physical systems but it has been used for such diverse
applications as the study of personality,14 election
results, gross economic behavior, etc. In order to
evaluate where simulation is most effective, it is
probably best to categorize problems as involving
physical and non-physical systems with high and
low risk decision alternatives.
As shown in Fig. 2, the area for greatest success is
physical problems having very low risks. The poorest applications are nonphysical problems having
high risk or little data. This situation may change
as the simulation technique is applied to a broader
class of problems.
A review of the literature indicates many successful applications of simulation in business.15,16,17,18,19 A
Figure 2. Probable success of simulation application.

                          High Risk, Poor Data    Low Risk, Good Data
  Physical systems        Fair results            Excellent results
  Non-physical systems    Bad results             Poor results
publication,20 "Simulation - Management's Laboratory," based on a fairly extensive survey of simulation applications, shows the applicability to a wide variety of management problems. Some of the applications include:
Company                          Management Decision Area Simulated
1. Large Paper Company           Complete order analysis
2. U. S. Army Signal Supply      Inventory decisions
3. Sugar Company                 Production, inventory, distribution
4. British Iron and Steel        Steelworks melting operation
5. General Electric              Job shop scheduling
6. Standard Oil of California    Complete refinery operation
7. Thompson Products             Inventory decisions
8. Eli Lilly & Co.               Production and inventory decisions
9. E. I. DuPont                  Distribution and warehouse
10. Bank of America              Delinquent loans
In another survey by Malcolm,21 the following applications are described:
Company                  Problem Simulated
1. Eastman Kodak         Equipment redesign, operating crews
2. General Electric      Production scheduling, inventory control
3. Imperial Oil          Distribution, inventory
4. United Airlines       Customer service, maintenance
5. Port of New York      Bus terminal design
6. Humble Oil            Tanker scheduling
7. U. S. Steel           Steel flow problems
8. I.B.M.                Marketing, inventory, scheduling
9. S.D.C.                SAGE Air Defense
10. Matson               Cargo transportation
These lists are not meant to be all-inclusive, but
rather indicate the variety and type of problems that
have been solved by simulation.
THE USE OF SIMULATION FOR
SCHEDULING JOB SHOPS
Scheduling of job shops has long been considered a critical problem. Extensive work using analytic techniques to find a suitable solution has been tried. However, in view of the large-scale combinatorial nature of this problem, no solution was found except for extremely small cases. Extensive models
have been developed over a period of years and
have evolved into what today is known as the Job
Shop Simulator. Although the computer program
cost a large sum of money and took almost two
years to develop, the Job Shop Simulator has been
used successfully in a large number of companies.
Notably, it has become an integral part of the
manufacturing function at General Electric and has
been used extensively in many other companies, including the Hughes Aircraft Company.22
The kind of decisions that can be aided by the
use of this type of simulation are the following:
1. Establishing required capacity in terms of
equipment, facilities, and manpower in order to meet unpredictable customer demand.
2. Examination of alternative types of demands and the capability of the system to
respond.
3. Examination of the inventory problem relating equipment utilization to cash requirements and customer demand. (It is
possible to meet customer demand by
maintaining large inventories.)
4. Development of appropriate scheduling decision rules to maintain a minimum inventory and meet specified delivery requirements.
5. Study the operation of a physical facility
through the appropriate use of forecasting
techniques, load level techniques, scheduling decision rules, and priority decision
rules.
In addition to specific decision areas, there is the
information generated from the simulation which
provides the basis for feedback on performance so
that management can make decisions on the number
of shifts to run, need for additional equipment or
capacity, or amount of cash to maintain for adequate inventory. The use of this particular program
has been extended to an operational system at the
Hughes Aircraft Company for real time manufacturing control. The Job Shop Simulator was first
used to examine alternative scheduling decision
rules. These rules, in turn, provided the basis for
developing a supplemental computer program which
is used to generate the factory job order status on a
daily basis. This computer program, by application
of priority decision rules, is used to generate new
priority lists each day, taking into account all occurrences for the given day. Thus, the system operates on essentially a daily cycle with all information
current and correct as of that point in time. This
type of real time application appears to offer considerable opportunity for the use of simulation in
industry.23,24
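The daily regeneration of priority lists can be illustrated with a short sketch. The job records and the critical-ratio rule below are hypothetical stand-ins; the paper does not give the actual Hughes decision rules.

```python
from dataclasses import dataclass

# Hypothetical job records and a critical-ratio rule; the paper does not
# give the actual decision rules, so these stand in for illustration.
@dataclass
class Job:
    order_id: str
    due_date: float        # days from now
    work_remaining: float  # days of processing still required

def priority_key(job: Job, now: float = 0.0) -> float:
    # Critical ratio = time remaining / work remaining.
    # Smaller is more urgent; below 1.0, the job is already at risk of
    # being late even with no queuing.
    return (job.due_date - now) / job.work_remaining

def daily_priority_list(jobs: list[Job], now: float = 0.0) -> list[str]:
    # Regenerated each day so the list reflects all occurrences
    # (completions, slippage, new orders) up to that point in time.
    return [j.order_id for j in sorted(jobs, key=lambda j: priority_key(j, now))]

jobs = [
    Job("A-101", due_date=10.0, work_remaining=2.0),  # ratio 5.0
    Job("B-202", due_date=4.0, work_remaining=4.0),   # ratio 1.0
    Job("C-303", due_date=3.0, work_remaining=6.0),   # ratio 0.5
]
today = daily_priority_list(jobs)
```

Rerunning the rule each day against the current order status is what keeps the list current and correct as of that point in time.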
STUDYING BUSINESS SYSTEM BEHAVIOR
Considerable effort has been expended to develop models of the total business system. Several efforts
along these lines have been undertaken at SDC,25
Stanford Research Institute,26 IBM Corporation,27
and M.I.T.28 A notable example of work being undertaken in this area is by the Industrial Dynamics
Group at M.I.T. which is concerned with studying
total system behavior. The basic premise of this latter simulation is that the dynamic interaction of
system variables and information feedback leads to
amplification, oscillation and delays in system performance. The behavior of a system, then, is the
result of desired objectives and the decision rules
employed to carry out these objectives. Thus, where
there is an attempt to make corrections and adjustments in flow rates, there is the possibility of delays or amplification or there may be conflicts between short and long-term objectives. Forrester in
his book on Industrial Dynamics describes a number of studies that have been undertaken and describes future studies of total management systems.
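The amplification and oscillation described here can be reproduced with a toy level/rate model in the spirit of the DYNAMO formulations cited earlier; every constant below is invented for illustration. A step increase in demand, combined with a supply delay and an inventory-correcting ordering rule, makes orders overshoot the new demand and then oscillate around it.

```python
# Toy level/rate model; all constants are invented for illustration.
DT = 1.0                  # one week per step
TARGET_INVENTORY = 100.0
ADJUST_TIME = 4.0         # weeks over which an inventory gap is corrected
SUPPLY_DELAY = 3          # weeks between placing and receiving an order

inventory = 100.0
pipeline = [20.0] * SUPPLY_DELAY   # orders already in transit
demand = 20.0
history = []                       # orders placed each week

for week in range(60):
    if week == 10:
        demand = 25.0              # step increase in external demand
    arrivals = pipeline.pop(0)
    inventory += (arrivals - demand) * DT
    # Decision rule: replace current demand plus a fraction of the gap.
    orders = demand + (TARGET_INVENTORY - inventory) / ADJUST_TIME
    pipeline.append(max(0.0, orders))
    history.append(orders)
```

Demand rises by only five units per week, yet orders peak well above the new level and then dip below it before settling: the delay and the correction rule, not the demand itself, produce the swing.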
In providing the means for experimenting with a
total business system, a quantitative formulation of
the various behavioral characteristics, component
interdependencies, system flows and stochastic
functions is required. The formulation is used to develop a simulation program which can trace the activities of the system as they change in time. In this
way, a large number of variables can be examined
simultaneously, without explicit knowledge of their
interdependencies.
A model of a business system is concerned with a
"decision network" rather than an explicit characterization of the decision maker per se. The difference stems from studying the information flow and
decision rules in the system, as contrasted with
studying the behavior of the individual. To the extent that a model faithfully characterizes the behavior of a given business and that suitable decision
criteria can be established, a computer model can
generate information useful in the study of business
problems. Further, the computer is capable of providing summaries of the information generated during the simulation to permit evaluation of experimental designs or to permit useful insights on system behavior.
The characteristic behavior of decision makers in
response to information provides the basis for
studies concerned with organizational aspects of information flow. The density of communication linkages among decision makers provides data which
could be used to establish critical decision points in
a system. Further, a communication network linking the managers with the operations provides the
means for establishing feedback control loops. In a
sense, management is linked with the resources of
the business via information flowing through a
communications network. Characteristics of the information flow among managers and between the
managers and the operators provides one of the basic measures of system behavior. In an actual business, much of the information generated is not pertinent to the direct operation of the business, and often informal communication channels provide useful information. In a sense, the decision maker has surveillance over a given number of decision points, which, when linked to other decision points, defines the underlying operational structure of the business.
MODELING THE ACTIVITIES AND
FUNCTIONS OF A BUSINESS SYSTEM
A schematic representation of the information
flows among various functions and activities in a
typical manufacturing (resource transforming) business system is shown in Fig. 3.
Central to the system is the decision-communication network, which is normally thought of as the management function. The decision network is
linked to the resource transformation or operations
subsystem through the various organizational levels.
In a computer program, the simulated information generated would be used to execute the decisions in a synthetic manner. Inputs to the decision network include the system constraints or policies, system functions, and environmental factors. These too can be represented by flows of information via the communication network.
Inputs to the operations subsystems include environmental factors, customer orders, capital resources, material, etc. In addition, transients or perturbations in the operations subsystem could be
used to introduce variability in performance. Outputs of the system enter distribution systems, warehouses or finished goods storage. Although there is
considerable detail associated with the operations
subsystem, a computer model need merely treat the information aspects as they interact with the decision network.29
The study of the behavior of a total system requires an explicit description of the time dependencies among the components. The primary concern
in the model discussed here is the conversion of resources into goods and services. By determining appropriate strategies in relation to risks, it is possible to control rates of change of production, work
force stabilization, growth rate, cost-pricing, response to demand, and the relation of income to
investment.
Figure 3. Activities and functions of a business system. The diagram shows information flows linking: REQUIREMENTS (policies, objectives, constraints); DECISION NETWORK (authority relations, decision points, control points); SYSTEM FUNCTIONS (marketing, engineering, planning, finance); MANAGEMENT CONTROLS (feedback, on-line control); ENVIRONMENTAL FACTORS (competition, vendors, government, customers); COMMUNICATION NETWORK (linkages); DECISION RULES (formal response, actions, outcomes, optimization); MEASUREMENT OF SYSTEM PERFORMANCE (quality, cost, time, queuing, flexibility); INFORMATION DISPLAYS; SYSTEM INPUTS (orders, bids, forecasts, information, resources, manpower, plant, equipment, money); RESOURCE TRANSFORMATION (behavioral characteristics, internal system flows: men, material, information, money); TRANSIENTS OR PERTURBATIONS (new products, demands, fluctuations, absenteeism, breakdowns, shortages, growth, trends); SYSTEM OUTPUTS (storage, finished goods, distribution, transportation).
Since computer simulation is used to trace the
change in the variables across a time domain, decision rules can be made functions of the state of the
variable rather than using expected values. In this
respect, simulation differs from dynamic programming or gaming strategies which depend on statistical estimates as the basis for optimization. Forcing
functions, which trigger the decision rules, must
also be specified and are related to information
flow in the system.
Since system optimization involves many variables, it is necessary to consider the many combinatorial effects. Simulation is a means for examining a large number of variables simultaneously. However, the solution is not unique, but provides an estimate of the distribution of expected system performance. Thus, although all the combinations could not possibly be enumerated, sampling results tend to form relatively stable and determinable distributions.
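The stability of sampled distributions can be checked directly by replication. In the sketch below (the demand distribution is invented for illustration), thirty independent runs of the same stochastic model yield outcomes that cluster tightly around a common center:

```python
import random
import statistics

random.seed(11)

# Thirty replications of the same stochastic model; the demand
# distribution (mean 100, standard deviation 30) is invented.
def one_replication(periods: int = 500) -> float:
    # Average weekly demand observed in a single simulated run.
    return statistics.fmean(max(0.0, random.gauss(100.0, 30.0))
                            for _ in range(periods))

outcomes = [one_replication() for _ in range(30)]
center = statistics.fmean(outcomes)   # stable across replications
spread = statistics.stdev(outcomes)   # small relative to the center
```

No single run is authoritative, but the distribution of run outcomes is determinable, which is what makes statistical statements about system performance possible.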
DEFINING SYSTEM FLOWS
Flows within the system can be separated into information flow (paperwork, reports, etc.), material flow, resource flow, and manpower flow. Each has its own characteristics and is therefore modeled differently.
Starting with information flow, the channels or
network determines the destination of the information, and the transmission media determine the
speed and message type. Information content is a
function of the data, format, and timeliness. Transformation, distortion, and errors should be included, as well as the queueing effects at the decision
points in the system.
Material flow has received the most attention in
simulation and operations research studies. Thus, a
considerable body of literature exists which could
provide the basis for modeling. Reorder rules, safety stocks, value analysis, collation studies, scheduling rules, and stocking policies have been well documented. There are a number of additional considerations which might also be of interest, such as
surge effects, queueing effects, interdependence of
component parts, work-in-process flow rates, and
resource utilization.
Resource and manpower flows are more difficult to model since factors external to a simulation model may have the major influence. Nonetheless, there are aspects of these factors which can be profitably studied. For example, what are the cash flow requirements in a marginally capitalized business? What is the relation between demand variation and capacity? How do capacity and capital requirements change with different products, numbers of shifts, and the skill and mobility of manpower? These and
similar questions are readily susceptible to study
via computer simulation.
The internal system, in addition to the basic flows, has a number of other characteristics. In particular, it is necessary to structure certain behavioral patterns such as:
• demand and shipping patterns
• value distribution among products
• variability in man-machine performance
• learning-curve effects
• various lead-time distributions
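One of the behavioral patterns listed above, the learning-curve effect, has a standard log-linear form that is easy to build into a model. The 80 percent rate and 100-hour base time below are invented for illustration:

```python
import math

# Log-linear learning curve: each doubling of cumulative output multiplies
# unit time by the learning rate. The 80 percent rate and 100-hour base
# time are invented for illustration.
LEARNING_RATE = 0.80
BASE_HOURS = 100.0

def unit_hours(n: int) -> float:
    # t(n) = t(1) * n ** log2(rate); the exponent is negative, so unit
    # time falls as cumulative output n grows.
    return BASE_HOURS * n ** math.log2(LEARNING_RATE)
```

Under this form the second unit takes 80 hours, the fourth 64, and so on, so simulated man-machine performance improves with cumulative output rather than remaining fixed.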
Rather than attempt an exhaustive description of
the physical characteristics of the internal system,
the considerations discussed are intended to provide
some measure of the complexity and difficulty in
modeling the business system.
There are a number of environmental considerations, as well as system inputs and outputs, which should also be taken into account in the modeling. The number and type of competitors, customer demands, vendor characteristics, and legal or civic factors all should be specified in developing a computer model of a business. Furthermore, forecasting of
demand, gaming strategies, competitive pricing, and
advertising policies are, in effect, the control of system response to variable demand. Thus, for example, maintaining standby capacity in anticipation of
orders, having a complete product line, or carrying
large safety stock in inventory, are all means of response which are really control of the system.
From a total system viewpoint, the availability of
cash affects the above considerations; that is, carrying large inventories may determine the plant capacity required or the advertising budget. In this
sense, cash flow permeates all aspects of system behavior and thus affects control. Similar considerations enter into the make or buy question, material
and tooling purchases, employment stabilization,
etc. It is an explicit treatment of the many interdependencies which provides the basis for total system control.
The modeling of a total business system which
incorporates the many considerations discussed
would undoubtedly involve numerous details. Thus,
where possible, transfer functions or aggregations
should be used rather than the precise flows or system characteristics. Not only does aggregation provide considerable savings in modeling, but, often
more significantly, it helps reduce the size and
complexity of a computer program. The modeling
is, after all, designed to answer given questions or
explore new areas and, therefore, should be governed by these considerations.
CONCLUSION
It is apparent from the many successful applications that simulation will continue to grow in importance and become a truly operational tool for
management decisions. There is still a vast number
of problems that can be tackled, ranging from the study of specific economic problems to total company system problems.30 Because of its many advantages and because of the need for improved techniques in management, simulation appears as one of
the most useful tools that has come on the horizon.
There is still much required in the way of improved
modeling, reduced cost of programming, improved
outputs, etc. However, none of these problems is
insurmountable and the evidence is quite clear that
there are continued improvements on all fronts.
Thus, we can expect to see the use of simulation as
a normal part of business operations in the not too
distant future.
REFERENCES
1. A. J. Rowe, "Real Time Control in Manufacturing," American Management Association Bulletin, No. 24 (1963).
2. A. J. Rowe, "Toward a Theory of Scheduling," Journal of Industrial Engineering, vol. XI,
no. 2 (March-April 1960).
COMPUTER SIMULATION -
A SOLUTION TECHNIQUE FOR MANAGEMENT PROBLEMS
3. G. Gordon, "A General Purpose Systems
Simulator," IBM Systems Journal, vol. I (September 1962).
4. J. W. Forrester, Industrial Dynamics, M.I.T.
Press, Cambridge, Mass. 1961.
5. H. Markowitz, B. Hausner and H. Karr,
Simscript: A Simulation Programming Language,
Prentice Hall, Inc., New York, 1963.
6. F. M. Tonge, P. Keller, A. Newell, "Quickscript-A Simscript-Like Language," Communications of the ACM, vol. 8, no. 6 (June 1965).
7. A. S. Ginsberg, H. M. Markowitz, and R. M. Oldfather, "Programming by Questionnaire," RM-4460-PR, The RAND Corporation (April 1965).
8. H. Markowitz, B. Hausner and H. Karr, "Inventory Management Simulation," I.B.M. Data Processing Information (April 1961).
9. S. Ehrenfeld and S. Ben Tuvia, "The Efficiency of Statistical Simulation Procedures," Technometrics (May 1962).
10. A. J. Rowe, "Modeling Considerations in
Computer Simulation of Management Control Systems," SP-156, Systems Development Corporation
(March 1960).
11. D. G. Malcolm, Editor, "Report of System Simulation Symposium," American Institute of Industrial Engineers (May 1957).
12. W. E. Alberts and D. G. Malcolm, "Report
of the Second System Simulation Symposium,"
American Institute of Industrial Engineers (Feb.
1959).
13. W. E. Alberts and D. G. Malcolm, Report
No. 55, "Simulation and Gaming: A Symposium,"
American Management Association (1961).
14. S. S. Tomkins and S. Messick, "Computer
Simulation of Personality," John Wiley & Sons,
Inc., New York, June 1962.
15. J. Moshman, "Random Sampling Simulation
as an Equipment Design Tool," CEIR (May
1960).
16. A. Rich and R. T. Henry, "A Method of
Cost Analysis and Control Through Simulation,"
Linde Company.
17. H. N. Shycon and R. B. Maffei, "Simulation - Tool for Better Distribution," Harvard Business Review (Dec. 1960).
18. D. G. Malcolm, "System Simulation - A Fundamental Tool for Industrial Engineering," Journal of Industrial Engineering (June 1958).
19. K. J. Cohen, "Simulation in Inventory Control," Chapter XV, Production Planning & Control,
R. H. Brock and W. K. Holsterin, Merrill Books,
Columbus, Ohio, 1963.
20. D. G. Malcolm, "Simulation-Management's
Laboratory," Simulation Associates, Groton, Conn.
(April 1959).
21. D. G. Malcolm, "The Use of Simulation in
Management Analysis-A Survey and Bibliography,"
SP-126, System Development Corporation (Nov.
1959).
22. E. LaGrande, "The Development of A Factory Simulation System Using Actual Operating
Data," Management Technology, vol. 3, no. 1 (May 1963).
23. A. J. Rowe, "Management 'Decision Making
and the Computer," Management International, vol.
2, no. 2 (1962).
24. M. Bulkin, J. L. Colley, H. W. Steinhoff, Jr.,
"Load Forecasting, Priority Sequencing and Simulation in A Job Shop Control System," unpublished
paper, Hughes Aircraft Co. (May 1965).
25. M. R. Lackner, "SIMPAC: Toward A General Simulation Capability," SP-367, System Development Corporation (Aug. 1961).
26. C. P. Bonini, "Simulation of Information
and Decisions in the Firm," Stanford University (April 1960).
27. D. F. Boyd and H. S. Krasnow, "Economic
Evaluation and Management Information Systems,"
I.B.M. System Journal, vol. 2 (March 1963).
28. E. B. Roberts, The Dynamics of Research
and Development, Harper & Row, New York, Jan.
1964.
29. A. J. Rowe, "Research Problems in Management Controls," Management Technology, no. 3,
December, 19
30. R. Bellman and P. Brock, "On the Concepts of a Problem and Problem-Solving," American Mathematical Monthly, vol. 67, no. 2 (Feb. 1960).
THE ROLE OF THE COMPUTER IN HUMANISTIC SCHOLARSHIP
Edmund A. Bowles
Department of Educational Affairs
IBM Corporation
Armonk, New York
Within the past dozen years or so, the computer
has made itself felt in every aspect of our society.
One hundred years ago, it was the Industrial Revolution which wrought profound changes in the economic and social fabric of the western world. Today there is an upheaval of comparable force and
significance in the so-called Computer Revolution.
Indeed, Isaac Auerbach has characterized the invention of the computer as being comparable to that of
the steam engine in its effects upon mankind. He
predicted that the computer and its application to
information processing "will have a far greater constructive impact on mankind during the remainder
of the 20th Century than any other technological
development of the past two decades."1
To cite but one example, the so-called information explosion has affected all areas of knowledge, scientific and humanistic alike, making analyses
both increasingly complex and time-consuming.
During much of the century, knowledge is estimated
to have doubled every ten years, and journals are
proliferating at the astounding rate of over three
per day. An overriding problem, or challenge, if
you will, is the integration of this new knowledge
into the existing world of scholarship as well as the
dissemination of these new ideas and concepts
within the intellectual community at large. Here
again, one turns inevitably to the computer. In fact,
we have now reached the point where even an anthropologist speaks about the heritage of a culture
being stored in physical objects such as books and
computer tapes! More important, however, is the
existence of data processing and information retrieval as a new and useful tool of tremendous potential; a fact that must be recognized and accepted by
the nonscientific community of scholars.
Let us consider for a moment the principal advantages of the computer to humanistic scholarship
in general. Its incredible speed allows the scholar to
accomplish in a short time what would otherwise
take him a whole lifetime of drudgery to accomplish. Its storage or memory constitutes an infinitely more reliable repository than the mind of the
proverbial absentminded professor. Its great accuracy is completely dependable even when untold
mountains of statistical data require handling. And
finally, its automatic operation is not subject to the vagaries of human fatigue, periods of interruption, or even of mood. To the humanist, I would suggest, the most important of these advantages is the immense saving of time gained by the use of computers. It is useful to remind ourselves that the Oxford
English Dictionary took some 80 years to complete
with several generations of editors. Jakob and Wilhelm Grimm's monumental Deutsches Wörterbuch
began to appear in 1854 and wasn't completed until
1960. Similarly, the manual indexing of the complete works of Thomas Aquinas (approximately 13
million words) would take 50 scholars 40 years to
accomplish, but thanks to the computer, the total
time required by a few scholars working mainly in
Italy was less than one year.2 A concordance to the Revised Standard Version of the Bible was produced on a high-speed computer within a period of several months, as compared to the King James concordance of the last century, which took 54 scholars 10 years to complete.3 The deciphering of the Mayan hieroglyphic script by Russian mathematicians, we are told, took only 40 hours of computer time, a task for which a human being would have needed thousands of years.4 All this
leads to the inevitable conclusion that there are a
number of scholarly tasks-call them the more tedious clerical chores, if you will-that in this age demand the use of the computer. Certainly, one can
no longer think of concordances, dictionaries, or
projects involving masses of statistical data and
numerous cross-correlations without bringing into
play the tools of data processing. Thus, the computer's power can be harnessed to relieve scholars in
the humanities of some of their most burdensome
activity while at the same time providing their research with the benefits of greater speed and accuracy. More important by far, however, are the more
creative uses of data processing as an aid in such
areas as stylistic analysis. More of this in a moment.
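The concordance work described above amounts, in programming terms, to building a keyword-in-context index. A minimal sketch (the sample text and the three-word context window are invented for illustration):

```python
from collections import defaultdict

# Keyword-in-context concordance; the sample text and the three-word
# context window are invented for illustration.
def concordance(text: str, width: int = 3) -> dict[str, list[str]]:
    words = text.lower().split()
    index = defaultdict(list)
    for i, word in enumerate(words):
        left = " ".join(words[max(0, i - width):i])
        right = " ".join(words[i + 1:i + 1 + width])
        index[word].append(f"{left} [{word}] {right}".strip())
    return index

sample = "In the beginning was the word and the word was with God"
entries = concordance(sample)
```

Every occurrence of every word is gathered with its surrounding context in a single pass over the text, which is exactly the tedious clerical labor that once occupied teams of scholars for decades.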
Unfortunately there is a great deal of suspicion,
fear, and ignorance on the part of the humanist
concerning the computer and its legitimate role in
scholarship. Some see the machine as eventually
making decisions that man himself should make.
Others find sinister implications in every technological advance, maintaining the attitude that the humanities and technology don't mix. Finally there
are those who, ignorant of mathematics, fear they
are totally and forever incapable of comprehending
the computer and therefore dismiss it. Although no one would suggest burning at the stake the maker of a computerized concordance, as was almost the fate of the first person to make a complete concordance of the English Bible in 1544, the computer-oriented humanist does face some formidable opposition.
Ironically, while the impact of the computer,
as stated at the beginning of this paper, may be
compared to the influence of the Industrial Revolution one hundred years ago, there is also an analogous reaction among many highly placed scholars to so-called computer-oriented humanistic research.
Not that any misguided intellectual will physically
attack "the dark Satanic mills," as did their Luddite
ancestors, but we do have their counterparts today
who, knowing little of the computer's advantages and
limitations, damn the machine as not only useless
but dangerous to the world of scholarship. To some
of the older, more conservative scholars, putting
lines of verse into a computer seems profane, like
putting neckties into a Waring Blender, as one professor remarked. More seriously, there are academicians who place no value on the scholar's time, like
the professor who, when told that data processing
would vastly speed up the production of an English-Old Iranian dictionary, went so far as to say that
what is not needed is a computer but rather enough
money for someone to be completely free for several years so he could sit down and do the necessary
work. A Scottish minister and mathematician sent
an article on the use of the computer in biblical
scholarship to a publisher. It was returned promptly
with a notation, "I do not understand this, but I am
quite sure that if I did understand it it would be of
no value." One scholar said a few years ago that,
"If you have to use a computer to answer a question, it is not a question which I would care to
put." Fortunately for the humanities, things are
changing rapidly.
Let us reveal the negative position for what it is
as we now examine some representative projects
within various humanistic disciplines which have
made extensive use of the computer as both an important and productive tool of scholarship.
In the field of archeology, for example, the computer is of use in studies of shards or fragments of
artifacts found in the diggings of ruins. In this connection, Jesse D. Jennings of the University of Utah
suggests constructing a matrix of coefficients of
similarity of one artifact to another, and thus to all
others within a given corpus of objects. The two
basic problems are classifying shards as to their cultural provenance and reconstructing whole artifacts
from broken fragments. Jennings has a body of
some 2,600 shards, each of which has 50 attributes.
Obviously, this represents an astronomical number
of comparisons to make by hand, and yet for a
computer it is a relatively simple task.5
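Jennings's matrix of similarity coefficients can be sketched in miniature as follows. The simple matching coefficient over presence/absence attributes and the toy data are illustrative assumptions; the text does not specify which coefficient he used.

```python
# Sketch of a similarity matrix for artifacts described by binary attributes.
# The simple matching coefficient here is an illustrative assumption;
# Jennings's actual coefficient is not specified in the text.

def matching_coefficient(a, b):
    """Fraction of attributes on which two artifacts agree."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)

def similarity_matrix(artifacts):
    """Pairwise coefficients of similarity for a corpus of artifacts."""
    n = len(artifacts)
    return [[matching_coefficient(artifacts[i], artifacts[j])
             for j in range(n)]
            for i in range(n)]

# Three toy shards, each with five presence/absence attributes
# (the real corpus had some 2,600 shards with 50 attributes each).
shards = [
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1],
]
m = similarity_matrix(shards)
```

With 2,600 shards the matrix has over three million pairs, which is exactly the hand-computation burden the paragraph describes.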
Mr. Dee F. Green, a research associate at the
University of Arkansas Museum, is using codes and
statistical techniques in correlation studies of burial
lots from Eastern and Southwestern Arkansas, and
in detailed analyses of ceramic, decorative, and
technical complexes and traditions exhibited in certain areas of the state. Following the maxim that
pottery is "the essential alphabet of archeology," a
code was developed for reducing the individual attributes of some 4000-odd pottery vessels to a numerical system for computer handling. Once the material is classified, the various attributes will be
sorted into discrete categories and then statistical
techniques applied to lump the attributes into statistically meaningful groups, or ceramic types.
Dr. Paul S. Martin and his associates at the Chicago Natural History Museum have been using the
IBM 7094 computer at the University of Chicago
to process archeological data from the southwestern
United States.6 By this means they have discovered
spatial clusters of both pottery types and pottery
design elements within a pueblo site. It was found
that the clusters themselves tended to be localized
in certain well-defined areas of the site. In addition
it was found that certain room-types contained specific clusters of artifact types. In each case, the
computer was given frequencies or percentages of
different artifact or shard types by provenance. The
variables were then correlated and submitted to factor analysis. This allowed comparison of room floors with one another to find out which rooms
were similar and which different. This information
was then interpreted in terms of room function, social groups, chronology, and so forth.
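The first step of the procedure just described, correlating artifact frequencies by provenance, can be sketched as a Pearson correlation between two room floors. The shard counts are invented for illustration; the published study went on to factor-analyse the full correlation matrix.

```python
# Sketch of the room-floor comparison described above: a Pearson
# correlation between artifact-type frequencies on two floors.
# The frequencies below are invented illustrative data.
from math import sqrt

def pearson(x, y):
    """Correlation coefficient between two frequency vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

room_a = [12, 3, 0, 7]   # counts of four shard types on floor A
room_b = [10, 4, 1, 6]   # the same four types on floor B
r = pearson(room_a, room_b)
```

A high coefficient marks two floors as similar; the matrix of all such coefficients is what the factor analysis then condenses.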
As can readily be seen, such projects as these involving many thousands of artifacts, each with
numerous attributes, as well as the dozens of correlations between them, really demand the use of
computers to handle the sheer mass of information
and to derive really meaningful results therefrom.
Historians have faced a new impetus for the application of social science research techniques to
the analysis of historical political data. The Inter-University Consortium for Political Research at
Ann Arbor is amassing a vast amount of raw data
transferred to tape storage on American political
history. Alongside the formation of a data repository committee, with close ties to the American Historical Association, comes the development of an
automated data retrieval system to make available
to historians and political scientists alike large bodies of information. The American Historical Association has set up an Ad Hoc Committee on the
Collection of Basic Quantitative Data of American
Political History under the chairmanship of Professor Lee Benson. Election statistics on presidential
campaigns from all counties in the United States
from 1824 to the present, roll-call votes during each
congressional session since 1789, data on federal
court cases, and census and ecological information
are all being computerized. Further material awaiting such attention exists in the fields of agriculture,
business statistics, industry, religion, economic, social, and cultural data, foreign trade, employment,
tax data, and housing. The amount of such unpublished information available is staggering. For but
one year in American history, the U.S. Census Bureau Catalog includes over 5000 computer tape
reels of data in the above-mentioned fields.
The Inter-University Consortium held a training
program during the past three summers consisting
variously of elementary courses such as "Introduction to Survey Methods," and "Cases in Survey Research," an eight-week Graduate Pro-Seminar in
Behavioral Research Methods and Quantitative Political Analysis, and advanced seminars conducted
by both the Department of Political Science and the
Survey Research Center of the University of Michigan. To the best of my knowledge, this is the first
and only attempt within a given field of humanistic
endeavor to provide computer training for its constituents.
One of the earliest historical studies involving
the use of data processing was the study of Massachusetts shipping during the early Colonial period
made by Professor Bernard Bailyn of Harvard
University.7 Faced with the problem of sketching a
realistic picture of the subject, he came upon a perfectly preserved shipping record for an 18-year period containing information not only about the vessels registered but about the owners as well; in
other words, material of early American social and
economic history. Bailyn not only summarized and
tabulated this data in comprehensive fashion but
used the opportunity to assess realistically the possibilities of applying machine techniques to historical material and to explore the problems of procedure.
The shipping register in question consisted of
1696 entries, each giving information about a ves-
sel and the people who held shares in it. A total of
4725 punched cards were produced which contained
all the information available in the register on the
ships and their owners. Codes were developed not
only for the purely quantitative data but for such
qualitative information as names of people and
places, vessel types, occupations, building sites, etc.
Although numerous problems were encountered
along the way, it is significant that Bailyn's book
closes with the statement that only with these tools
and techniques was the analysis of the register possible at all.
Two other computer-oriented historical research
projects involve content analysis. Professor Richard
Merritt of Yale University is studying the developing symbols of the American sense of community or
identity as reflected in five colonial newspapers.
Professors Robert North and Ole Holsti of Stanford
University are analyzing the origins of World War I
by means of computer techniques for scanning and
reporting the appearance of themes and relationships in a large body of historical material pertaining to decision-making during the 1914 crisis.8
"Communication is at the heart of civilization," but
since students of international relations are considerably more restricted in access to data than most
social scientists-direct access to foreign policy
leaders is severely restricted-one method is to gauge their attitudes, values, and assessments by
means of a computer content analysis of political
documents at times of crisis. North and Holsti constructed a dictionary of words, such as abolish, accept, or armaments. They are sought out in historical documents and changes in style are noted as the
historical crisis grows in severity in terms of verbal
effectiveness, strength or weakness, and activity or
passivity. By this means one can explore, for example, the relationship between the level of East-West antagonism and the degree of cohesion between the Soviet Union and China.
Professor William Aydelotte of the University of
Iowa has used the techniques of data processing to
aid in the study of the voting patterns in the British
House of Commons in the 1840's.9 During Sir Robert Peel's ministry, Commons debated and voted on
a number of substantial political issues. There were
divisions as well on various aspects of religious
questions, the army and fiscal reform. There exists
an unexploited source in the so-called division lists
giving information on all the men in Parliament so
far as they voted on the issues in question. The
complexity of this material, the richness thereof,
the fact that most members of Parliament did not
vote consistently "liberal" or "conservative" on all
issues made this entire decade of Parliamentary history ripe for computer analysis, a project which resulted in a total of 6441 four-fold tables, each of
which was punched on a separate IBM card.
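One of Aydelotte's four-fold tables can be sketched as a 2x2 cross-tabulation of how members voted on a pair of divisions. The vote records and the names of the divisions below are invented for illustration; the actual division lists and Aydelotte's tabulation procedure are not reproduced here.

```python
# Sketch of a four-fold (2x2) table: cross-tabulating two aye/nay
# division lists over the members who appear in both.
# The division names and votes below are invented illustrative data.
def fourfold(votes_a, votes_b):
    """Cross-tabulate two division lists sharing the same members."""
    table = {("aye", "aye"): 0, ("aye", "nay"): 0,
             ("nay", "aye"): 0, ("nay", "nay"): 0}
    for member in votes_a:
        if member in votes_b:
            table[(votes_a[member], votes_b[member])] += 1
    return table

division_1 = {"Smith": "aye", "Jones": "aye", "Brown": "nay"}
division_2 = {"Smith": "nay", "Jones": "aye", "Brown": "nay"}
t = fourfold(division_1, division_2)
```

Members who split their votes between the two divisions (here, Smith) are precisely the cases that frustrate a simple "liberal"/"conservative" labelling, which is why thousands of such tables were needed.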
Certainly the best-known use of computers
in the field of literature is the construction of verbal indices. Briefly, there are two forms: first, a
simple alphabetical list of text showing the frequency or location of the words or both; and second, a
textual concordance showing all the words of a given literary work not only alphabetically but in context as well. In this form, each word appears as
many times as there are words within the parameter
arbitrarily chosen for it, along with the relevant passages of which it is a part. Such a concordance is
not only of immense value to a literary scholar from
the point of view of time saved, but is useful also to
those in other disciplines employing literature as
source material. For example, concordances to the
poetry of American authors have been issued since
1959 by the Cornell University Press. In the case of
the works of Matthew Arnold, the lines of verse
were punched on IBM cards, one line per card, to
which was added the line number and page number
from the standard edition and variance from other
collations. A separate title card was punched and·
inserted before each poem. The entire deck of some
17,000 cards was printed out and transferred to
magnetic tape. A computer-generated concordance
program was then made and ultimately a tape prepared with all the significant words of Arnold's
texts arranged in alphabetical order along with their
locations. A final print-out formed the basis for the
published index.10
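The two index forms just described can be sketched in miniature. The tokenization rule and the choice to index every word (rather than only the "significant" words of the Arnold project) are simplifying assumptions.

```python
# Minimal sketch of the two verbal-index forms described above:
# (1) an alphabetical word list with frequencies and line locations, and
# (2) a concordance showing each word in its line of context.
import re
from collections import defaultdict

def word_index(lines):
    """Alphabetical list of words with frequency and line numbers."""
    index = defaultdict(list)
    for lineno, line in enumerate(lines, start=1):
        for word in re.findall(r"[a-z']+", line.lower()):
            index[word].append(lineno)
    return {w: (len(locs), locs) for w, locs in sorted(index.items())}

def concordance(lines, keyword):
    """All lines in which the keyword appears: the word in context."""
    return [(lineno, line) for lineno, line in enumerate(lines, start=1)
            if keyword in re.findall(r"[a-z']+", line.lower())]

verse = [
    "Ah, love, let us be true",
    "To one another! for the world, which seems",
    "To lie before us like a land of dreams",
]
idx = word_index(verse)
ctx = concordance(verse, "to")
```

The first function yields the frequency-and-location list; the second, the keyword-in-context form that made the Arnold concordance useful beyond literary study.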
In similar fashion, Professors Alan Markman and
Barnet Kottler of Pittsburgh and Purdue respectively have prepared a computer concordance to five
Middle English poems, perhaps the best known of
which is Sir Gawain and the Green Knight.11 Dr.
John Wells of Tufts University is making a computerized word-index to the Old High German glosses
cribbed between some 140,000 lines of medieval
Latin text.
Professor Alice Pollin, working with the Computer Center at New York University, produced a
guide, or critical index, to the 43 volumes of the
Revista de Filologia Espanola from 1914 to 1960,
cross-indexed by authors, subject matter, and book
reviews. A total of nearly 900 pages was the result,
the product of approximately 60,000 punched cards.
The printout sheets were reproduced photographically and issued as a bound volume.12 In this way a
methodology for the machine-indexing of periodicals was established.
However, it is in the area of textual analysis that
computer-oriented research in literature shows exceptional and exciting promise for the future. The
massive comparison of text where there are several
or even dozens of sources presents an almost insurmountable problem for the scholar. To compare in
complete detail as few as 40 manuscripts might take
the better part of a lifetime. It is this type of activity that cries for the use of data processing techniques.
Perhaps the first such effort was the study
by Professors Mosteller and Wallace at Harvard
University, continuing actually over a period of
years, to solve the authorship question of 12 disputed Federalist Papers.13 Briefly, the literary styles of
Madison and Hamilton were identified, then
matched with the style of each of the disputed papers. Having found such factors as sentence length,
vocabulary and spelling to be of no help (the two
authors were remarkably alike), it turned out that
differences in the use of so-called key function
words-particularly those of high frequency such as
from, to, by, upon, also, and because - served to
pin down authorship of the papers in question.
From a number of computations, Mosteller and
Wallace found that most of the disputed documents
were written by James Madison. But consider for a
moment the problem of Dr. John W. Ellison, a biblical scholar from Massachusetts who, for his doctoral dissertation, studied 309 manuscripts of the
Greek New Testament, then went on to prepare a
complete concordance of the Revised Standard Version of the Bible. Fortunately, he used a computer
to assist him in these gigantic tasks. However, there
are over 4600 known manuscripts of the whole or
part of the New Testament, with cross-fertilization
in the copying process that has been going on for a
thousand years or more. Here the use of the computer to determine the interrelationships of the
manuscripts of the text is mandatory.
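The function-word approach described above can be sketched as comparing rates of high-frequency marker words per thousand words. The marker list comes from the discussion; the toy texts and the nearest-profile decision rule are illustrative assumptions, not Mosteller and Wallace's actual Bayesian analysis.

```python
# Sketch of the function-word approach: compare rates of high-frequency
# "marker" words per 1,000 words of text. The decision rule below
# (nearest author profile by squared distance) is an illustrative
# simplification of Mosteller and Wallace's statistical inference.
MARKERS = ["from", "to", "by", "upon", "also", "because"]

def marker_rates(text):
    """Occurrences of each marker word per 1,000 words."""
    words = text.lower().split()
    return {m: 1000 * words.count(m) / len(words) for m in MARKERS}

def nearer_author(disputed, profile_a, profile_b):
    """Attribute the disputed rates to the closer author profile."""
    def dist(p, q):
        return sum((p[m] - q[m]) ** 2 for m in MARKERS)
    return "A" if dist(disputed, profile_a) <= dist(disputed, profile_b) else "B"
```

Upon is the classic discriminator: Hamilton used it freely, Madison almost never, so its rate alone separates most of the disputed papers.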
Scholars at other universities, while not faced
with such vast problems of correlation, are actively
engaged in similar work. For example, a definitive
edition of the works of John Dryden is being prepared under Professor Vinton Dearing at U.C.L.A.
with textual collation aided by the use of the
computer.14 This grew out of an existing corpus of
240,000 manually indexed cards. Variant texts of
the final section of Henry James's novel Daisy Miller are being collated by computer in a pilot project
at New York University under Dr. William Gibson.
This, too, is designed to aid the editor faced with a
number of varying manuscripts or printed editions
in making up an Urtext or variorum edition by
supplying him with a printout indicating the identities between the versions, sentence by sentence.
Dr. James T. McDonough of St. Joseph's College
has demonstrated with the help of a computer that
Homer's Iliad indeed exhibits the consistency of
one poet.15 His study was in part a response to the
persistent question of whether one poet wrote the
epic or if it consists of separate, short ballad-type
songs by separate authors from various times and
places all strung together. McDonough prepared in
systematic fashion a metrical index of the Iliad, not
by spellings but by the rhythmic function of all 112,000 words in their 157 metrical varieties. Once the
rhythm of each of the 15,693 lines was coded and
punched, an IBM 650 machine was able to isolate
the individual words, sort, count, and print out the
resulting wordlists in a matter of hours.
Mrs. Sally Sedelow of Saint Louis University has
described the use of the computer for a rigorous
description and analysis of pattern attributes of
text.16 She makes the observation that while colleagues in linguistics have been making major contributions to such fields as machine translation and
information retrieval-and in return, gaining important insights into the structure of language-those in literature have offered very little and
gained very little. The aim of such studies is to discover the differences between writers' styles and to
shed light on the changes of an individual author's
style over a period of time. Known as "computational stylistics," these techniques deal with the
parameters of literary style in terms of its constituent elements: rhythm, texture, and form.
Turning now to the field of musicology, Professor Harry B. Lincoln of Harpur College is using the
techniques of information retrieval to compile a catalog of musical incipits (that is, brief melodic
quotations of the first six to eight notes of a composition), of the entire body of 16th century Italian
frottole, a known body of some 600 polyphonic
compositions. Since the possible permutations and
combinations of the beginning notes of such pieces
are almost infinite, both as to pitch and rhythm,
such brief quotations represent unique identifications of the compositions from which they are taken. First the incipit is translated into alphanumerical form by means of a code which can activate a
photon printer. This is a device with a high-speed
rotating disk containing about 1400 characters (in
this case, musical), a light source, lens, and photographic film. This code consists of even numbers
for the spaces, odd numbers for the lines of the musical stave, an H for a half note, a W for a whole note, and so forth. One punched aperture card with
note, and so forth. One punched aperture card with
a 35mm photograph of the particular score set in
the right-hand side holds information such as the
composer, title, voice part, accession or serial number. The second card contains the proper sequence
of notes representing the incipit coded for the photon device. When computerized, this code causes
the printer to raise or lower its focus to the proper
line or space and, when the correct note or other
symbol is in place, shoot a beam of light through
the proper aperture in the disk exposing the film at
that time with the desired musical symbol. At the
same time, a computer program extracts from this
coded information the intervalic order or melodic
profile of the particular incipit. This, then, can be
compared to other musical sources for instances of
borrowings by one composer from another, or from
other works of the composer himself.
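The incipit code and the melodic profile extracted from it can be sketched as follows. The particular numbering and duration letters below are illustrative guesses at the scheme described, not Lincoln's actual code.

```python
# Sketch of the incipit code described above: staff positions numbered
# with odd integers for lines and even integers for spaces, each paired
# with a duration letter. The melodic profile is taken here to be the
# sequence of signed steps between successive staff positions; the exact
# numbering and letters are illustrative assumptions.

def encode_incipit(notes):
    """notes: list of (staff_position, duration_letter) pairs -> code string."""
    return " ".join(f"{pos}{dur}" for pos, dur in notes)

def melodic_profile(notes):
    """Intervalic order: signed steps between successive staff positions."""
    positions = [pos for pos, _ in notes]
    return [b - a for a, b in zip(positions, positions[1:])]

# A toy four-note incipit: line 3, space... etc., with duration letters.
incipit = [(3, "H"), (5, "W"), (4, "H"), (4, "H")]
code = encode_incipit(incipit)
profile = melodic_profile(incipit)
```

Because the profile depends only on the intervals, two incipits written at different pitches but with the same melodic contour compare as equal, which is what makes borrowings detectable.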
A second major area is the use of data processing
as an aid in the analysis of the structure of music.
For example, Professor Bertrand H. Bronson at Berkeley has used computer techniques in the study of
folk-songs.17 By means of punched cards, he coded
the important elements of folk tunes including
range, modal characteristics, prevailing time signature, number of phrases, the nature or pattern of
refrains, final cadences, and so forth. In this way,
an entire corpus of folk song material can be recorded both fully and accurately. The various elements can then be analyzed for statistical patterns,
comparisons, or indeed subjected to any other query
consistent with the data.
A computer can also serve to test hypotheses
through simulation and models. Dr. Allen Forte of
Yale University is applying machine analysis to
help provide insights regarding the structure of the
atonal music of Arnold Schonberg. The structure of
so-called pre-twelve-tone (or nontonal) music is
still somewhat of a mystery. Mr. Forte has found
the traditional nonmachine forms of analyses lack-
ing and has stated that a structural description of
this music would be virtually impossible without
the aid of a computer. If I understand the outline of
his program, he is formulating a basic working
theoretical hypothesis based upon linguistic and
mathematical models for musical structure, developing an analytical method to explore his ideas by
means of the computer; he then will test the result.
In quite another application of computer techniques in the analysis of musical style, a program
can be written to search for meaningful patterns
and relationships which, because of the number and
quality of variables, might remain obscured and undiscovered if left to the human brain. From these
very patterns, the researcher can then develop new
and significant hypotheses. An interesting example
of this is the proposal of Professor Jan La Rue of
New York University to evolve machine language
to describe stylistic phenomena in 18th century
symphonies, thereby permitting complex correlations and comparisons far beyond the reach of the
hand tabulation. Just as the literary scholar has
quantified style in terms of form, rhythm, and texture, the musicologist has developed a set of guidelines for the purpose of stylistic analysis, breaking
down the various musical elements into sound,
form, harmony, rhythm, and melody. This technique is admirably suited to computer procedures
when one would wish, for example, to determine
whether or not a symphony attributed to one composer was actually written by him. By this technique, the stylistic attributes of a questionable or
anonymous symphony could be compared for correspondences with the stylistic quantification of a
given composer's known symphonies stored within
the computer memory.
Finally, I must mention the Musical Information
Retrieval project at Princeton University being carried out under the direction of Professor Lewis
Lockwood. This involves a programming language
for an IBM 7094 computer by means of which musical data is stored in the computer for interrogation and manipulation.18 Information representative
of each note in its complete context and relationship within the score is coded manually and then
stored in anticipation of the questions to be asked.
The pilot project at Princeton involves a stylistic
investigation of the 22 masses and mass movements
of the Renaissance composer Josquin des Pres. In
its broader aspects this type of program enables the
scholar to search for and locate reliably all elements
within a given category, such as accidentals, or all
examples of a particular intervalic progression. In
addition the computer program can serve as a check
on discrepancies between the original manuscript
and later editions or transcriptions.
The projects I have just described to you represent but a few highlights from the growing roster of
scholars using the computer in humanistic research.
A list of such activities published last spring by the
American Council of Learned Societies reveals well
over a hundred individuals involved with the tools
of data processing.19 When compared to the state of
the sciences versus the computer some 10 years ago,
the prognosis for the future is good indeed.
In a lecture at M.I.T., entitled "The Computer
in the University," the prediction was made that in
a few years the computer may have settled immutably into our thought as an absolutely essential part
of any university program in the physical, psychological, and economic sciences. On the basis of what
I have said, I think the time has come to amend
that statement to include the humanities. Furthermore, within a short time, I believe a knowledge of
data processing will become part of the "common
baggage" of research tools and techniques required
of every graduate student in the liberal arts. I am
even tempted to go a step further and state that
with the increasing number of courses in programming being offered at our universities the time may
come when some students in the humanities may be
as fluent in programming as in writing English
composition. Certainly, the computer is fast becoming an important and indispensable research tool
for faculty and students alike.
The value of such an acquaintanceship can be
seen in the case of a professor of art history and
archeology at an eastern college. Describing himself
as probably the man on campus "least likely to benefit from a computer," he took a short summer
course at the college computer center. Later, in reporting on the instruction, he said, "The course profoundly affected the thinking of all of us. This is
the important thing-much more important than
the machine itself. Of course we know that it is the
brains behind the machine that make these miracles
possible. Nonetheless, it is a weapon of such power
that all intelligent men and women everywhere
should know the kind of things it can do. Once we
know that, we can devise ways to make use of it."
In this connection it is useful to bear in mind
Alfred North Whitehead's remarks 40 years ago
that "the reason why we are on a higher imaginative level than the 19th century is not because we
have finer imagination, but because we have better
instruments-which have put thought on to a new
level."
Let us, therefore, see the computer as a means of
liberation, freeing the humanist scholar from the
time-consuming operations of the past; a tool providing him in rapid fashion with a proliferating series of sources in the form of statistics, collations,
printouts, cross-references, frequency counts and
hypothetical models upon which he may build a research of new dimensions and complexity. Viewed
in this light, it is a device the potentialities and applications of which we cannot afford to ignore.
REFERENCES
1. Isaac Auerbach, "Information: A Prime Resource," in The Information Revolution, New York
Times, May 23, 1965, p. 4.
2. Paul Tasman, "Literary Data Processing,"
IBM Journal of Research and Development, vol. I,
pp. 249-56 (1960).
3. See the preface to Nelson's Complete Concordance to the Revised Standard Version Bible,
New York, 1957.
4. Felix Shirokov, "Computer Deciphers Maya
Hieroglyphs," UNESCO Courier, vol. XV, pp.
26-32, (1962).
5. Jesse D. Jennings, "Computers and Culture
History: A Glen Canyon Study," Paper delivered at
a meeting of the American Anthropological Association, San Francisco (Nov. 21, 1963).
6. See for example J. A. Brown and L. G. Freeman, Jr., "A UNIVAC Analysis of Shard Frequencies from the Carter Ranch Pueblo, Eastern Arizona," American Antiquity, vol. XXX, pp. 162-167
(1964); and Paul S. Martin, "Archeological Investigations in East Central Arizona," Science, vol.
CXXXVIII, pp. 825-27 (1962).
7. Bernard Bailyn, Massachusetts Shipping
1697-1714: A Statistical Study, Harvard University
Press, Cambridge, Mass., 1959, esp. pp. 137-141
("A Note on Procedure").
8. Ole R. Holsti, "Computer Content Analysis
as a Tool in International Relations Research," in
Proceedings of the Conference on Computers for
the Humanities, Yale University, New Haven,
Conn., 1965.
276
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
9. William O. Aydelotte, "Voting Patterns in
the British House of Commons in the 1840's,"
Comparative Studies in Society and History, vol. V,
pp. 134-163 (1963).
10. Stephen M. Parrish, A Concordance to the
Poems of Matthew Arnold, Cornell University
Press, Ithaca, N. Y., 1959; see also his "Problems
in the Making of Computer Concordances," Studies
in Bibliography, vol. XV, pp. 1-14 (1962).
11. See Alan Markman, "Litterae ex Machina:
Man and Machine in Literary Criticism," Journal
of Higher Education, vol. XXXVI, esp. pp. 70-72
(1965).
12. Alice M. Pollin and Raquel Kersten, Guia
para la Consulta de la Revista de Filologia Espanola, New York University Press, New York, 1964.
13. Frederick Mosteller and David L. Wallace,
"Inference in an Authorship Problem," Journal of
the American Statistical Association, vol. LVIII,
pp. 275-309 (1963).
14. On methodology, see Vinton A. Dearing,
Methods of Textual Editing, University of California Press, Los Angeles, Calif., 1962.
15. James T. McDonough, Jr., "Homer, the Humanities, and IBM," in Proceedings of the Literary
Data Processing Conference, IBM Corporation,
Yorktown Heights, N. Y., 1964, pp. 25-36.
16. Sally Y. Sedelow, "Some Parameters for
Computational Stylistics: Computer Aids to the
Use of Traditional Categories in Stylistic Analysis," ibid., pp. 211-229.
17. Bertrand H. Bronson, "Mechanical Help in
the Study of Folk Song," Journal of American
Folklore, vol. LXII, pp. 81-86 (1949).
18. See Michael Kassler, "A Simple Programming Language for Musical Information Retrieval,"
(Project 295D, Technical Report No. 3), Princeton
University, Princeton, N. J. (1964).
19. "Computerized Research in the Humanities:
A Survey," ACLS Newsletter, vol. XVI, no. 5, pp.
7-31 (May 1965).
THE STRUCTURE AND CHARACTER OF USEFUL
INFORMATION-PROCESSING SIMULATIONS
Louis Fein
Synnoetic Systems
Palo Alto, California
INTRODUCTION

I am curious about how the brain and nervous system work. So I became interested in the properties of useful models-especially of useful information-processing models of psychological and physiological processes.

In my professional pursuits as a sometimes consultant, writer, lecturer, teacher, and designer in the computer field, I have attentively and respectfully listened, observed, read, and conversed about models that were inspired by the desire of the modelers themselves to know how the brain works. I doggedly sought for a particular bit of knowledge of how the brain works that was gained by a researcher primarily because he used a putative model of the brain-as opposed to knowledge about the brain gained by direct observation and measurement not predicted or suggested by a model. For years, I have been singularly unsuccessful in finding any neural net models or digital computer program models (i.e., the information-processing models I am familiar with) whose use led to a particular experimentally verifiable piece of knowledge of how the brain works. I have found lots of information about the models themselves, but not about the brain.

I am neither a biologist nor a psychologist. I am merely an interested observer who has paid diligent, respectful, yet disciplined attention to this matter. For a while, I blamed my own ignorance and lack of understanding of psychological and physiological processes and of the terminology used to describe them, for my inability to find what professed brain modelers were learning about the brain itself. But I began to doubt that it was I who was at fault, as I became startlingly aware of the various defensive postures taken by professed modelers of psychological and physiological processes whom I pressed in correspondence and in conversation to tell me what they now knew, that they didn't know before, about a particular psychological or physiological process, now that they had a working model; or even what kind of knowledge they expected to gain with the use of a model that was not yet developed. The responses were largely evasive, irrelevant, defensive, offensive, insulting to my person and my ancestry; I was accused of wanting to deprive honest researchers of their livelihoods; I was ignored, or referred to the vast literature on the subject; I was admonished to be tolerant and to give this young discipline a chance to develop. Some didn't want to discuss it at all; others sent me reprints. My Freudian analysis of such responses was orthodox; when a
simple question to a patient elicits a noisy and
boisterous defense, then there is something wrong
with the patient, not the questioner. My conclusion
was that a great deal of psychological and physiological modeling was both aimless and fruitless. I
started to wonder why neural net models and digital
computer models were not as fruitful for gaining
verifiable knowledge about psychological and physiological phenomena, especially of the brain and
nervous system, as mathematical, computational,
and physical models have been for learning about
physical phenomena such as wave motion, aircraft
behavior, and circuit operation. Indeed, it is rare to
find an article or talk on psychological or physiological modeling without the author's observing that
since models have helped the engineer and physicist
to gain knowledge in his subject, then models should
be expected to help the author to gain knowledge in
his.
So I wondered in what ways the fruitful models
are different from the fruitless ones. I assumed that
the reason modelers in psychology and physiology
were having little luck was that their models didn't
have whatever characteristics models should have to
suit the modelers' purposes. I tentatively formulated
the question: what are the characteristics of an information-processing
model that would suit it to the purposes of a psychologist or physiologist
interested in some aspect of the brain and nervous system? I
reasoned that once I knew the character of useful
models, I would find that unsuccessful modelers
were using models that did not have certain of these
required characteristics. I want, therefore, to present what I think the structure and character of the
total modeling process must be in order for it to be
potentially useful in psychology and physiology.
Let me give a summary of my viewpoint before
systematically presenting it for your consideration.
Recall that we are focusing attention both on
what physiological or psychological knowledge, if
any, a model can lead to, and of what value such
knowledge is to a particular practitioner-a surgeon, psychiatrist, public health specialist, internist.
The utility of a model depends on how helpful it is
in gaining new knowledge; the value of new knowledge obtained with the help of a model depends on
how much more effective, efficient, and economical
it makes a practitioner in doing his job. In this
context, "A is a model of B" would be an incomplete statement. One must at least say that A is a
potentially useful model of B, if the use of A can
help the simulator predict or suggest new knowledge
in B (expressed in a suitable form and language)
that could be of some value to some practitioner.
But even these three elements (1) the model A,
(2) the modeled system of interest, B, and (3) the
new knowledge hypothesized by the simulator about
B, are not enough. Another element is necessary in
the modeling process if one is to measure its utility,
i.e., how helpful it is in gaining new knowledge. In
order to test these hypotheses-which are predictions or suggestions of answers to particular questions of the simulator, or solutions of his problems,
or resolutions of issues-the simulator must design
and carry out valid and feasible experiments on the
system of interest, B, with the use of appropriate
instruments and apparatus together with procedures
for interpreting the results of the experiment. Without such experiments the hypothesized new knowledge remains conjecture; the simulator cannot give
the practitioner confirmed answers to his questions,
problems, or issues; hence he can't say anything
about the utility of the modeling process.
The crucial issue of whether or not an experiment can be both designed and carried out to test
the hypotheses suggested by the model A about the
system of interest, B, usually depends on whether
the controllable parameters and variables in the
model A have correspondences in what is modeled,
B; whether the corresponding parameters and variables in B are accessible to the investigator so he
can control and vary their values; and whether the
investigator has instruments suitable for observation and measurement of B.
Without being able to show correspondences between controllable parameters and variables in the
model and in what is modeled, the model is not
even potentially useful as an aid to gaining new
knowledge; without instrumentation, even if one
showed the correspondences, an experiment could
perhaps be designed but not carried out to realize
the potential utility of a model.
The experiments may be gedanken experiments,
i.e., experiments that the simulator would design
and carry out if he knew the correspondences and if
he had the instruments. Thus, a potentially useful
and valuable modeling process has four necessary
elements: (1) the model, (2) the modeled system
of interest B, (3) hypothesized new knowledge
about B, and (4) the experiment.
STRUCTURE AND CHARACTER OF USEFUL INFORMATION-PROCESSING SIMULATIONS
THE STRUCTURE AND CHARACTER OF
A SIMULATION
I have used the word "model" to introduce these
notions because it is a familiar term. In other
fields, practitioners use terms such as: mapping,
analog, similitude, isomorph, homomorphism, and
emulation, often as synonyms of model. But as with
the term, model, it is often unclear what these terms
mean. Because the word model is so ambiguous, its
use will be limited in what follows. To avoid ambiguity and misunderstanding, I start with a statement
of what I am not here talking about; then a statement
of what I am interested in; then some definitions.
I am not here interested in the process whereby
an investigator is motivated to design, construct,
and use a system that behaves or is otherwise similar to another system-an investigator who, after
acknowledging his debt to it for its inspiration, has
no further interest in the other system. This species
of "model" is not what concerns us. I emphasize
this distinction because I find that it is often not
made and, if made, misunderstood. If I ask a man
who has announced that he is simulating, say, the
cognitive process, what he has learned thereby
about the cognitive process, and he responds with
what to me is a non sequitur, by telling me what a
wonderful new computer language he has invented
or what clever things his program can do, then this
is an instance of the kind of misunderstanding I am
trying to avoid by making clear and careful definitions of important and distinguishable notions and
objectives.
Nor am I here interested in models that serve as
pedagogic vehicles for educating an investigator and
presenting to him knowledge already in hand.
I am here interested in the process whereby an
investigator designs, constructs, and uses a certain
kind of instrument (heretofore variously called
model, or mapping, or analog ... ) with the aid of
which he forms hypotheses about knowledge of particular objects, phenomena, properties, functions,
events, or thoughts of interest in his field, and who
has designed feasible and valid experiments to test
these hypotheses (verify or find them false).
The process must have all four interdependent
constituents: (1) the investigator, who (or which)
I will call the simulator; (2) the instrument which
(or whom) I will call the simulate; (3) the object
or phenomenon, or properties, or events or thoughts
of interest to the simulator, which I will call the
simuland, and (4) the design of the feasible experiments, or the gedanken experiment, which I will
call the experiment. The process itself, I will call
the simulation or the simulation process.
Note that as knowledge is accumulated in developing fields, such as physiology or psychology, or
even information processing, the simuland is rarely
a complex system such as the ear, or the brain, or a
computer, or the eye. It is more often a part of the
ear, or of the brain, or of the computer, or of the
eye. One usually constructs a simulate in order to
use it as an aid in gaining new information, insight,
knowledge, or understanding, about a single property, or a single function, or a single part of a subsystem. Only occasionally, as is the case with "laws"
of physics, will a single simulate serve to answer
questions about many properties, many functions, a
complex of many parts. Thus, we speak of a simulate,
useful for finding or calculating or predicting
such things as signal transmission modes in nerve
fibers; or the logic design of the addressing system
of the core memory in a computer. The simuland is
here meant to be only that part of the simulator's
system of interest about which he seeks knowledge
with the aid of a particular simulate. I emphasize
this point because I believe that speaking of "brain
modeling," to pick only one instance, often confuses both the investigator and those interested in
his work. If an investigator specifies an ambiguous
objective for himself by saying that he is simulating
the brain when he means he is simulating a property of a part of the brain, then his colleagues may
misconstrue him, and they may expect results from
him that the investigator never intended to obtain.
To make these ideas explicit, I ask the reader to
imagine that we have the problem of writing a
manual on a simulation process as herein defined,
for the preparation of a college laboratory report; or
a proposal to a government agency; or the final report to a government agency; or a paper for a
professional journal; or a graduate dissertation. The
organization and format of such a document will
reflect what I consider to be the necessary structure
and character of potentially useful and valuable
simulation processes. It will at the same time systematically disclose to the reader the value, utility,
and progress of the simulation process reported on,
and the goals and intentions of the simulator. The
following is an abstract of such a manual.
THE PREPARATION OF A PAPER
ON THE SIMULATION PROCESS
This manual is intended to assist authors in the
preparation of reports, proposals, papers, articles,
and dissertations on simulation.
It should be understood that the specific organization and format suggested herein are arbitrary.
However, the literature on simulation is read by
many people with different orientations and varying
degrees of familiarity with the subject matter. For
these people to get the most out of documents on
simulation, they should expect the material in them
to be presented in a consistent organization, style,
and format. Not only does the reader benefit by the
application of these rules, the author benefits too;
they serve as a goad to logical, critical, explicit, and
disciplined thinking and writing.
A paper on a simulation process should always
mention the hypothesized knowledge that was or is
to be tested; to what purpose the knowledge obtained is being put or might be put; a description of
the simulate, a description of the simuland, and a
report of the design of the experiments. The emphasis each topic receives will, of course, depend on
the pattern of emphasis in the work itself. The
question arises: in what order should these topics
be covered in a paper, and what specific items
should be covered under each topic. The following
organization and format is suggested:
Outline of a Report on the Simulation Process
I. SIMULAND
A. Statement (in the language of the simulator's
field) of what knowledge is already verified
or to be assumed about the structure, functions, properties, and theorems of the system
of interest to the simulator - the simuland.
B. Statement (in the accepted language of the
simulator's field) of a specific question the
simulator wants an answer to; or problem
he wants solved; or an issue he wants resolved, with the aid of the simulate and
the experiment. This is the hypothesis to
be tested.
C. Acceptable form and language of the answer,
solution, or resolution.
II. SIMULATOR
A. Statement of the purpose and use to which
the new knowledge gained in the simulation
process is or might be put - the effect,
significance, and value of having such knowledge.
III. SIMULATE
A. Description in the language of the field of
the simulate, of the structure, functions,
properties, and theorems of the simulate.
B. Identification of corresponding controllable
and measurable or observable variables and
parameters in the simulate and the simuland.
C. The form of addressing the simulate with the
simulator's questions, problems, or issues,
in the language of the simulate.
D. The procedure for deriving from the simulate the answers, solutions, or resolutions.
E. The correspondence rules by which question/
answer or problem/solution, or issue/resolution in appropriate form and language of
the· simulate are transformed into what the
simulator wants, i.e., the corresponding
question/answer or problem/solution, or
issue/resolution in acceptable form and
language of the simuland.
IV. EXPERIMENT
A. The design of valid and feasible experiments
(which may be gedanken experiments)
whose sole object it is to test the hypotheses
of the simulate, i.e., to verify the hypothesized answers, solutions, or resolutions given
by the simulate, or to prove them false.
The form of the write-up of these experiments is typical of the form of the write-up
of laboratory experiments performed in
school.
(1) Object of experiment: to test the hypothetical answers given in the simulate
(2) Procedures
(3) Apparatus and instruments
(4) Results
(5) Conclusions
    (a) Verified hypotheses
    (b) Unverified hypotheses
(6) Unexpected results - fall-out
(7) Suggestions for other
    (a) Experiments
    (b) Questions, problems, issues
    (c) Instruments
    (d) Theory
THE UTILITY OF A SIMULATION
The utility of a simulation will clearly depend on
the extent to which the simulator learned what he
wanted to learn through the use of his simulate.
It may be noted that in any simulation, in order
for the simulator to learn what he wants to learn,
the simulate must bear a certain relation to its
simuland. Suppose a simulation system consists of,
say, a differential equation simulate from which the
simulator can learn about properties of standing
acoustic wave patterns in an enclosed cylinder
(simuland) and that he can perform experiments in
which he can observe or measure correspondences
between properties of standing waves in his simulate and simuland. The utility of this simulation can
be high. But if the investigator, actually interested
in, say, properties of standing acoustic wave patterns in an enclosed cylinder (simuland), first hypothesizes a simulate of this simuland, and then
builds a simulate of the hypothesized-simulate-of-this-simuland,
then he might thereby gain knowledge about the hypothesized but untested simulate,
but he cannot thereby learn anything about the
simuland. For instance, to use a digital computer to
solve a differential equation presumed to simulate
wave motion gives information about the differential equation but not about wave motion. Or to
write a digital computer program to simulate a hypothesized neural net model of a part of the eye will
tell you something about the hypothesized neural
network, not about the eye. (On the other hand, if
the differential equation has already been well established as a simulate of wave motion, the digital
computer can be said to give information about
wave motion.)
We will call a simulate of a simuland (or of a
well-established simulate-of-the-simuland) a first
order simulate; and the simulation process that
includes it, a first order simulation process. We will
call a simulate of a hypothesized-simulate-of-a-simuland, a second order simulate; and the simulation
process that includes it, a second order simulation
process, and we will say that only a first order
simulation process can have a nonzero utility.
We can now enumerate the major characteristics
that a simulation process must have in order for it
to have utility:
1. The simulate of the process must be a first
order simulate of the simuland. To use the
vernacular: you might learn something
about the ear from a model of the ear; but
you can't learn anything about the ear from
a study of a model of a hypothesized-model-of-the-ear.
2. There must be one-to-one correspondences
between the accessible, controllable, and
measurable or observable variables and
parameters of the simulate and simuland.
3. There must be a well-designed experiment
for testing the hypotheses.
I mentioned earlier that I sought the characteristics of an information processing model that would
suit it to the purposes of a physiologist and psychologist and that once I knew the character of the
useful models, I would find that unsuccessful modelers were using models that did not have certain
of these characteristics. I have run some gedanken
experiments of recasting into the above framework,
papers that purport to simulate-by digital computer programs or by neural networks-physiological
and psychological processes such as memory, perception, neurosis, and cognition. In papers using
both of these types of information-processing simulations, simulators do not show one-to-one correspondences between accessible controllable and
measurable or observable parameters and variables
in the digital computer program or network and the
organism's brain, nervous system, or other organs of
interest. Nor do they have the instrumentation either to gain access to or to aid in controlling, varying, measuring and observing these variables and
parameters in the organisms themselves. Thus,
without the correspondences and without experiments, such simulations are fruitless.
In other instances, digital computer solutions are
reported for sets of equations representing a network that was inspired by an organic process, and
presumed to be a simulate of it. Obviously such
third order simulations have no potential utility for
learning anything directly about that organic process.
Finally, I read some papers, presumed to be
about simulation, in which I could not locate a
statement of what hypothesis was to be tested, nor a
statement about what specific questions or problems
the simulator hoped to answer, if his simulation
was successful. These were aimless simulations.
I would suggest to laboratory instructors, dissertation advisers, sponsors of research who prepare
requests for proposals and who evaluate contractor's
or grantee's work, journal editors, teachers, and investigators interested in simulation as herein defined, that they would find it valuable in their
professional roles to insist that authors prepare
their written material in an organization and format
similar to the one presented in this paper.
THE CATALOG: A FLEXIBLE DATA STRUCTURE FOR MAGNETIC TAPE
Martin Kay and Theodore Ziehe
The RAND Corporation
Santa Monica, California
The files of data used in linguistic research differ from those found in
other research applications in at least three important ways: (1) they are
larger, (2) they have more structure, and (3) they have more different
kinds of information. These are, of course, all simplifications but not
gross ones. It is true that the files that must be maintained by a large
insurance company or by the patent office are so large as to pose very
special problems, but the uses to which the files are to be put are fairly
well understood and their format and organization is not usually subject
to drastic and unexpected change. It is also true that the data from a
bubble chamber is interesting only if collected in vast quantities, but
this is not the only respect in which a bubble chamber is a special kind
of tool. A typical linguistic job will bring together a number of files,
each very large by the standards of everyday computing: a body of text, a
dictionary and a grammar for example. The grammar, if it is anything but a
very simple one, will contain a large number of elementary items of
information of different kinds, each related to others in a number of
different ways. This is what it means to say that the file has a lot of
structure. The dictionary may also contain grammatical codes which may
consist of characters from one of the languages represented in the
dictionary or may be something altogether different. If the dictionary
contains alternatives to which probabilities are assigned, then these will
presumably be in the form of floating-point numbers. This is what it is
like for a file to contain different kinds of information.

The notion of a catalog* was developed principally with the needs of
linguistic computing in mind. It is oriented more to the storage of
information on a long-term medium, such as magnetic tape, than to its
representation in the high-speed store of a computer. The elementary items
of information in a catalog are called data. The structure imposed on the
data making up a catalog is that of a tree-a hierarchy of sets of
information. Let us consider as an example how a bibliography or the
acquisitions list of a library-catalogs in the conventional sense-might
be organized within this system. Each document or book has an entry in the
file containing various items of information about it. One of these can be
chosen as the key under which the others are filed. For example, the
acquisition number of an item can serve as the key for all information
related to the item. Under it there will be sections for author, title,
journal if relevant, publisher, and date. In an actual application, there
would doubtless be other sections but these will serve for the example.
The author section must be capable of handling cases of multiple
authorship without confusion and we may assume that there is provision for
giving the institution to which each author belongs. The journal section
will also have subsections for volume and issue number.

*The catalog system has been developed through the joint efforts of the
Centre d'Etudes pour la Traduction Automatique of the Centre National de
la Recherche Scientifique at the University of Grenoble, France, and the
Linguistics Research Project of The RAND Corporation.
A pair of entries might appear as in the diagram of Fig. 1. The
acquisition numbers of these two documents are 1746 and 1747 respectively.
The first
has one author belonging to a single institution; the
1965
second has two, each belonging to a pair of institutions. Furthermore, the two authors of the second
paper belong to one institution in common, but
since a catalog must, by definition, have the form
of a tree and not of an arbitrary graph, the institution must be mentioned separately for each author.
The first document is a journal article and there are
therefore two nodes below the one representing the
name of the journal; the first gives the volume and
the second the number. The second entry is for a
book and therefore these two nodes do not appear.
Figure 1. A portion of an acquisitions list.
In this example, the kind of information at each
node can be unambiguously determined from the
structure of the tree. The nodes on level 2 all represent acquisition numbers. The last three nodes under
a given acquisition number represent title, publication information, and date respectively. Any nodes
preceding these represent authors. Any nodes below
those for authors are for institutions. A journal article is distinguishable by the volume and number
nodes which are absent in the case of a book. However, it is not difficult to imagine cases where rules
of this kind would not work. Suppose, for example,
that the date of each edition of a book were put in,
or that when the date was unknown, we wished
simply to omit it. To cope with these situations, it
would be necessary either to redesign the catalog
structure or to label each datum explicitly to show
the kind of information it contains. The structural
redesign might be as follows. The names of authors
are dropped to level 4 and their institutions to level
5. A new kind of datum is introduced on level 3
dominating the author names. This node has no information of its own; it serves only to show where
the authors are to be found. A similar node could
be inserted above the date, except that in this case
the node would have information of its own to provide for the case where the date is missing.
The catalog system in fact requires that each datum be tagged with the name of the data class of
the information it contains. This is useful for other
reasons than the one we have suggested. Each data
class is, for example, associated with a particular
set of encoding conventions. Some contain textual
material, some floating point numbers, some integers and so on. So, by looking at the class of a datum, we can tell not only what its status is in the
catalog as a whole but also how to decode it.
It is convenient to be able to describe the overall
structure of a catalog, to the computer as well as to
other people, in terms of data classes and the relations among them. This we do by means of a map,
which, like the catalog itself, is a tree, but in which
the data are replaced by the names of the data
classes. The map of our hypothetical acquisitions
list is shown in Fig. 2. The name of each data class
appears exactly once in the map, a fact which is
bound up with an important restriction on the way
catalog structures may be designed. The members of
a given data class always appear on the same level
of the catalog-the level on which the name of the
class appears in the map-and they always come
immediately below members of the same other class
-the one whose name appears above theirs in the
map. If two classes are shown directly beneath the
same other class in the map, then their members
must appear in that order in the catalog itself.
Thus, in the example, a title may never come to the
left of the corresponding author. Cases can arise
where these restrictions may seem unduly severe,
but, as we shall shortly see, powerful means are
available for dealing with them.
Figure 2. Map of an acquisitions list.
A variant has been proposed for the catalog design we are using as an example in which new
nodes would be introduced to represent sets of authors and sets of dates. Since we have data-class
names, we do not need to adopt this variant. However, classes of this kind, whose members never
contain substantive information, are frequently useful. In other cases, no substantive information is
available for a particular datum, but only for the
nodes it dominates. In these cases, we speak of null
data. The catalog system has been implemented in
such a way that null data occupy no space whatever
on the tape. They can therefore be used freely to
lend perspicuity to the structure of a catalog and
without regard to economy. In order to see how this
is achieved, we must consider the format used for
writing catalogs on tape.
A catalog is reduced to linear order in the most
obvious way. The datum at each node is written on
tape before the data at the nodes beneath, and all
these before nodes to its right. Thus the node at the
root of the tree goes first, followed by those on its
leftmost branch. The first node on a branch is the
first to be written, followed by the nodes on its leftmost branch. When a branch is finished, the one
immediately to the right is taken next. This is the
order arrived at by regarding the nodes as operators
whose arguments are the nodes immediately beneath them, and writing the expression out in Polish parenthesis-free notation.
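The order just described is a preorder (leftmost-branch-first) traversal of the tree. A minimal sketch in a modern notation (Python; the dictionary representation and the sample entry are invented for illustration, following the acquisitions-list example):

```python
# Sketch of the tape order described above: the datum at each node is
# written before the data at the nodes beneath it, and a whole leftmost
# branch is finished before the branch to its right (Polish prefix order).
# The tree representation here is invented for illustration.

def flatten(node, level=1):
    """Return (datum, level) pairs in catalog tape order."""
    out = [(node["datum"], level)]
    for child in node.get("children", []):
        out.extend(flatten(child, level + 1))
    return out

# One hypothetical entry of the acquisitions list (cf. Fig. 1).
entry = {
    "datum": "1746",  # acquisition number, the key of the entry
    "children": [
        {"datum": "Author", "children": [{"datum": "Institution"}]},
        {"datum": "Title"},
        {"datum": "Journal", "children": [{"datum": "Volume"}, {"datum": "Number"}]},
        {"datum": "Date"},
    ],
}

# flatten(entry) begins ("1746", 1), ("Author", 2), ("Institution", 3), ...
```

Note that the level numbers need not be written explicitly at each node; as described below, they are carried implicitly in the datum control words.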
A set of programs is being written for moving
catalog data between high-speed core and tape. A
blocking scheme is used for the data on tape; the
block size is set by the user. Each datum is treated
as a single logical record whose size is not restricted to that of the physical block. Each logical
record is preceded and followed by a link word
which gives the lengths of the records on either side
of it. The length of the preceding as well as the following record is given in order to make it as easy
as possible to backspace. Link words also contain
certain other information of importance to the
housekeeping of the reading and writing programs,
but nothing of the essential structure of the catalog.
This being the case, it is possible for a program using the catalog input/output system to make use
only of its blocking facilities for certain purposes,
calling on the whole system only when required.
This plan of building the system as a sequence of
layers, each using the facilities provided by the one
beneath, has been followed wherever possible in the
design. The first two words of each block are an
IOBS control word and a FORTRAN control word.
These are used nowhere in the current system but
are included for reasons of compatibility.
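The link-word convention can be sketched as follows (Python; the tape is modeled simply as a list of words, and the pairing of the two length fields in one link word is an illustrative assumption, not the actual machine format):

```python
# Sketch of the link-word convention: each logical record is preceded and
# followed by a link word giving the lengths of the records on either side
# of it, so that backspacing is as easy as reading forward. A link word is
# modeled here as a (length-of-preceding, length-of-following) pair; this
# layout is illustrative only.

def write_records(records):
    """Lay out logical records separated (and bounded) by link words."""
    tape = []
    prev_len = 0
    for rec in records:
        tape.append((prev_len, len(rec)))  # link word before this record
        tape.extend(rec)
        prev_len = len(rec)
    tape.append((prev_len, 0))             # closing link word
    return tape

def backspace(tape, pos):
    """Given the index of a link word, return the index of the previous
    link word, stepping back over the record between them."""
    length_before, _ = tape[pos]
    return pos - length_before - 1
```

Reading forward, a program skips from one link word to the next by the following-length field; reading backward, by the preceding-length field. Recording both lengths is what makes the tape equally easy to traverse in either direction.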
The first word of a logical record which contains
a catalog datum is a datum control word. This gives
the data class of the datum and its preceding implicit level number (PIL). It is the PIL that enables
the system to identify the place of null data which,
as we have said, are not explicitly represented by a
logical record. The PIL is defined as follows:
1. For all data on level 1, it is 0.
2. The PIL of the first datum dominated by a
null datum is the PIL of that dominating
null datum.
3. The PIL of every other datum is the level
number of the datum that dominates it, i.e.
one less than its own level number.
Informally, we can say that the PIL gives the
highest point in the tree encountered on the path
from the previous non-null datum to the current
one. Consider two adjacent data on level i of some
catalog and suppose that the PIL of the second is j,
where j ≤ i - 1. Then i - j - 1 null data are to be
assumed between these two, at levels j + 1, . . . , i - 1.
Given the class of the current non-null datum, the
classes of these null data are uniquely determinable
from the map.
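Both halves of this convention can be sketched in Python (the list-of-pairs representation of a catalog, with None standing for a null datum, is an assumption made for illustration):

```python
# Sketch of null-datum suppression via the PIL (preceding implicit level).
# Writing: only non-null data reach the tape, each tagged with its PIL, the
# highest point on the path from the previous non-null datum. Reading: the
# gap between a datum's PIL and its own level is refilled with null data.
# The (level, value) list representation is invented for illustration.

def to_tape(data):
    """data: (level, value) pairs in catalog order, value None for nulls.
    Returns (level, pil, value) triples for the non-null data only."""
    tape = []
    inherited_pil = None              # PIL passed down through a run of nulls
    for level, value in data:
        pil = inherited_pil if inherited_pil is not None else level - 1
        if value is None:
            if inherited_pil is None:
                inherited_pil = level - 1   # rule 2: nulls hand their PIL down
        else:
            tape.append((level, pil, value))
            inherited_pil = None
    return tape

def from_tape(tape):
    """Re-insert the suppressed null data."""
    data = []
    for level, pil, value in tape:
        for null_level in range(pil + 1, level):   # levels j+1 .. i-1
            data.append((null_level, None))
        data.append((level, value))
    return data
```

As in the text, a datum on level 1 gets a PIL of 0, and a datum at level i whose PIL is j has i - j - 1 null data restored above it on reading, so the null data cost nothing on tape.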
It is desirable to be able to write general programs for performing standard operations on catalogs without requiring that the user supply complete
information on the structure of the catalog to be
treated. For this reason, the map of a catalog is
written on tape in the logical record immediately
preceding the first datum. The map is represented
as a simple list of data-class names paired with level numbers and taken in the order we have described for the catalog itself. Thus, a class whose
level is given as i is dominated by the class with
level i - 1 most recently preceding it in the list. With
the name of each data class is also given a code
showing what encoding conventions are used for the
data of that class.
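The domination rule implied by this list form can be sketched in Python (the class names are reconstructed from the acquisitions-list example and, like the omission of the encoding codes, are assumptions made for illustration):

```python
# Sketch of the map-as-list convention: each data-class name appears once,
# paired with its level, in the same order as the catalog itself; a class
# whose level is i is dominated by the most recent preceding class at
# level i - 1. Class names here are illustrative only.

def dominators(map_list):
    """map_list: (class name, level) pairs in list order.
    Returns {class name: dominating class name} (None for the root)."""
    result = {}
    last_at_level = {}
    for name, level in map_list:
        result[name] = last_at_level.get(level - 1)
        last_at_level[level] = name
    return result

acq_map = [
    ("Acquisitions List", 1),
    ("Acquisition Number", 2),
    ("Author", 3), ("Institution", 4),
    ("Title", 3),
    ("Journal", 3), ("Volume", 4), ("Number", 4),
    ("Date", 3),
]
```

Because each class name appears exactly once, this single pass recovers the whole tree of classes from the flat list written at the head of the tape.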
We have noted that the restrictions imposed on
the design of catalogs could become onerous in
some situations if means were not introduced for
overcoming them. We have been considering how a
library acquisitions list might be represented as a
catalog. But, of all the lists produced by a library,
surely this is the least interesting. Suppose instead
that we were to undertake to accommodate the subject catalog. Most subject classifications have the
structure of a tree to begin with, so that the job
should be easy. One possible strategy would be to
examine this tree to determine the length of its
longest branch, that is, how many categories dominate the most deeply nested one. We may then construct a map with this number of levels plus 5,
which is the number used for the acquisitions list.
This will make it possible to put a complete entry
of the kind considered in the simpler example beneath the node for the most deeply nested category
in the classification scheme. In general, the node
for a subject heading will have two kinds of nodes
directly beneath it, one kind for more particular
categories under that heading and one for documents which cover the whole field named by the
current heading. The structure already set up for the
acquisitions list is repeated once for each document
node in the map, but with different data-class names.
This scheme will indeed work, but it is clearly
unsatisfactory in a number of ways. For one thing,
subject headings will be in different data classes according to their level in the classification as a
whole. For another, the map is unduly large and
monotonous and liable to change when some minor
part of the classification changes. An alternative
strategy rests heavily on the claim made for catalogs
that any kind of data whatsoever can be accommodated in a datum. If this is so, then an entire catalog can be included as a single datum within another
one. From here, it is a short step to the notion of
catalogs with recursive structures.

THE CATALOG: A FLEXIBLE DATA STRUCTURE

Consider the example in Fig. 3. The main catalog has two data
classes of which the lower one has data that are
other catalogs. To emphasize this, we have shown
this node with a square rather than a circle. This
simple two-class map is written at the beginning of
the tape. When a subheading datum is encountered,
the first thing it is found to contain is the map of a
subsidiary catalog. In order that the data of this cat-
alog should be correctly processed, the tape format
must make special provision for them. In fact, subsidiary catalogs are represented not as single logical
records, but as sequences of logical records bounded
by special markers. However, the user of the system
need not concern himself with these details. As far
as he is concerned, the included and the including
catalogs can be treated in exactly the same manner.
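The idea that an entire catalog can sit inside a single datum, and that included and including catalogs are handled identically, can be illustrated with a small sketch. The dictionary representation below is invented for the example; it is not the tape format the paper describes.

```python
# Illustrative sketch (representation invented): a catalog is a map plus
# records, and a datum's value may itself be a whole catalog.  Included
# and including catalogs share one shape, so one traversal serves both.

def count_data(catalog):
    """Count every datum, descending into catalogs stored as datum values."""
    total = 0
    for _class_name, value in catalog["records"]:
        total += 1
        if isinstance(value, dict) and "records" in value:  # nested catalog
            total += count_data(value)
    return total

subsidiary = {"map": [(1, "Date")], "records": [("Date", "1964")]}
main = {
    "map": [(1, "Subject Catalog"), (2, "Subheadings")],
    "records": [("Subject Catalog", "Linguistics"),
                ("Subheadings", subsidiary)],
}
# count_data works unchanged on either the main or the subsidiary catalog
```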
Figure 3. Structure of a library subject catalog.
(a) Main catalog: a "Subject Catalog" node on level 1 dominating a class of "Subheadings," whose data are subsidiary catalogs.
(b) Subsidiary catalogs: document entries, including classes such as "Date."
The subsidiary catalog in Fig. 3 is similar to the
main catalog for the acquisitions list except for the
addition of a single new class to accommodate a
further level of subheadings. This again is a class of
catalogs and their structure is exactly the same as
that of the subsidiary catalog of which they are
members. Fig. 4 shows an excerpt from a catalog
built on this plan.
Any scheme intended to fill the role for which catalogs were devised must be measured against three
main requirements.
1. It must be easy to update.
2. It must provide for retrieval of information
in response to a wide variety of requests.
3. It must allow files to be organized on new
principles as research proceeds.
Now, the catalog system is not intended as a full-fledged information-retrieval system, but it does
contain something of what any such system would
have to provide. In particular, it provides powerful
and flexible facilities for addressing data and sets of
data. Furthermore, this addressing capability is precisely what is required for an updating algorithm
where the principal work consists in identifying the
items of information to be treated.
There is no obvious limit to the refinements that
could be introduced into a catalog addressing
scheme, and our ideas on the subject can be guaranteed to far outpace our ability to implement them.
Here, we must content ourselves with a survey of
some of the simpler notions.
It will be convenient to distinguish between the
location of a datum and its address. Each datum in
a catalog has a unique location which may be
thought of as its serial number on the tape, or as
anything else which preserves its uniqueness. But a
datum may have an indefinite number of addresses,
only some of which refer to it uniquely. The location of a datum will not normally be known to the
user of a large catalog, but this is of no consequence provided that there is a clear method of
specifying unique addresses. The notion of location
is useful only for understanding how the addressing
scheme works.

PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965

Figure 4. A portion of a library subject catalog. (The figure shows subject nodes such as "Translation of Languages," with entries from institutions including M.I.T. and Birkbeck College.)
Some parameters which are useful in identifying
a datum are its data class, its class ordinal, its level,
its level ordinal and its value. Data class and level
have already been explained. The ith datum of a
given class dominated by a single datum on the next
higher level has class ordinal i. The ith datum of
any class dominated by a single datum on the next
higher level has level ordinal i. We can now describe the form of an elementary address. This is a
triple consisting either of a level, a level ordinal,
and a value or a data class, a class ordinal, and a
value. In either case, any member of the triple may
be left unspecified. The following are some examples:
(1,,a)  -  The data on level 1 with value a.
(,,a)   -  All data with value a.
(A,,)   -  All data of data class A.
Informally, we are using the convention that, if the
first member of a triple is an integer, it is a level
number; if it contains alphabetic characters, it is a
data-class name.
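The triple convention above lends itself to a direct rendering in code. The sketch below is illustrative only; the flat datum representation (level, level ordinal, data class, class ordinal, value) is an assumption introduced for the example, not the system's actual format.

```python
# A minimal sketch of elementary-address matching.  A triple is
# (level-or-class, ordinal, value); any member may be unspecified (None).
# An integer first member is a level number; a string is a data-class name.

def matches(datum, triple):
    first, ordinal, value = triple
    if isinstance(first, int):                 # level, level ordinal
        if datum["level"] != first:
            return False
        if ordinal is not None and datum["level_ordinal"] != ordinal:
            return False
    elif first is not None:                    # data class, class ordinal
        if datum["data_class"] != first:
            return False
        if ordinal is not None and datum["class_ordinal"] != ordinal:
            return False
    if value is not None and datum["value"] != value:
        return False
    return True

d = {"level": 1, "level_ordinal": 1, "data_class": "Author",
     "class_ordinal": 2, "value": "a"}
# matches(d, (1, None, "a")), matches(d, (None, None, "a")),
# and matches(d, ("Author", None, None)) are all True
```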
An elementary address, like any other address,
can be regarded as a function whose value, if defined, is a location or list of locations. Two other
useful functions, descendant and ancestor, have location lists both as arguments and values. These
are:
Des [L] - All data dominated by the datum or data at L.
Anc [L] - All data dominating the datum or data at L.
A concatenation of addresses is itself an address
whose value is the intersection of the location lists
referred to by each of them. This machinery is already sufficient to call for data in a number of interesting ways. The following examples refer to the
catalog, part of which is shown in Fig. 1 and whose
map is given in Fig. 2.

Des [(2,,1747)] (3,2)
    The datum or data on level 3 which are second
    sons of data on level 2 with value 1747; the
    value in this case is H. A. Simon.

Des [(2,1747)] (Author)
    Here, "1747" is a level ordinal rather than a
    value, but we may assume that there is just one
    entry for each acquisition number. This therefore
    refers to the authors of the 1747th document in
    the file: E. A. Feigenbaum and H. A. Simon.

Des [(2,1747)] (Author,2)
    This is the same as the previous example except
    that it selects the second author, H. A. Simon.

Anc [(Institution,,R.C.A. Laboratories)] (Author)
    All authors from R.C.A. Laboratories. As far as
    we know from Fig. 1, this means only J. Nievergelt.

Des [Anc [(Journal,,JACM)] (2)] (Author)
    Everyone who has published in JACM.

Anc [(Date,,1964)] (Acquisition number)
    Acquisition numbers of everything published in
    1964.

It is clear that other functions could be added to
these two without difficulty.

Addressing catalogs some of whose data contain
catalogs poses special problems requiring some new
machinery for their solution. At least three new
functions, member, recursion, and catalog, are
required:

Mem [L,M] - where L is an address or location
    list and M is an address. The value is defined
    only if some of the data referred to by L
    contain catalogs. To these internal catalogs,
    the address M is applied to yield the final
    value of the function.

Rec [L,M] - where L is an address or location
    list and M is an address. This permits data to
    be located in general recursive catalogs. Its
    value is a list of locations arrived at by (1)
    applying M to the top-level catalog and (2)
    applying the whole function to the catalogs
    identified by L.

Cat [L] - where L is an address. The address L
    is applied within all catalogs contained in
    data of the current catalog. The location in
    the current catalog of any catalog in which a
    datum is found meeting that address becomes a
    member of the list which is the value of the
    function.

It will be easiest to see how these work by reference
to a specially constructed example. Fig. 5 shows
three maps which are used in the structure of some
recursive catalog. Map 1 gives the structure of the
main catalog. In this, data of classes B and C contain
catalogs with maps 1 and 2 respectively. The catalogs
with map 2 have one class containing catalogs,
namely Q; class U, of catalogs with map 3, also
contains catalogs.

Figure 5. A recursive catalog structure. (a) Map 1;
(b) Map 2; (c) Map 3. Classes of data containing
catalogs: B - Map 1; C - Map 2; Q - Map 3; U - Map 2.
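The elementary machinery of Des, Anc, and address concatenation can be sketched over bare locations. The tree representation below (each location recording its parent) is invented for illustration; only the behavior of the three operations follows the text.

```python
# Sketch of Des, Anc, and concatenation over locations.  Each datum's
# location records the location of its dominating datum (None at the root).

def des(tree, locs):
    """All locations dominated, directly or indirectly, by locations in locs."""
    result, frontier = set(), set(locs)
    while frontier:
        children = {loc for loc, node in tree.items()
                    if node["parent"] in frontier}
        children -= result
        result |= children
        frontier = children
    return result

def anc(tree, locs):
    """All locations dominating any location in locs."""
    result = set()
    for loc in locs:
        parent = tree[loc]["parent"]
        while parent is not None:
            result.add(parent)
            parent = tree[parent]["parent"]
    return result

def concat(*location_lists):
    """A concatenation of addresses denotes the intersection of their values."""
    out = set(location_lists[0])
    for lst in location_lists[1:]:
        out &= set(lst)
    return out

tree = {0: {"parent": None}, 1: {"parent": 0},
        2: {"parent": 1}, 3: {"parent": 1}}
# des(tree, {0}) == {1, 2, 3}; anc(tree, {2}) == {0, 1};
# concat(des(tree, {0}), anc(tree, {2})) == {1}
```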
Mem [(B),(D,,x)]
    Data of class D and value x in catalogs in data
    of class B in the main catalog.

Rec [(B),(D,,x)]
    Data of class D and value x anywhere in the
    recursive system.

Cat [(Q,,x)] Des [(P,,y)]
    Data in the main catalog, necessarily of class
    C, containing a catalog in which there is at
    least one datum of class P with value y dominating
    at least one datum of class Q with value x.

Mem [(C),Rec [Mem [(Q),(U)],(R)]]
    Data of class R anywhere in the recursive system.

Des [Cat [(D,,y)]] (E,,x)
    Members of class E with value x dominated by a
    datum (which, in this case, can only be of class
    B) which contains a catalog which has a datum of
    class D and value y.

Rec [(B),Des [Cat [(D,,y)]] (E,,x)]
    Same as above, but in the whole recursive system.
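The behavior of Rec, which applies an address at the top level and then descends into contained catalogs, can be rendered as a short recursive routine. Everything here is an illustrative assumption: addresses are modeled as plain functions from a catalog to a list of results, and the catalog representation is the invented records form used above.

```python
# Illustrative rendering of Rec over nested catalogs: apply the target
# address to the current catalog, then recurse into the catalogs
# identified by the container address.

def rec(catalog, container_address, target_address):
    """Results satisfying target_address anywhere in the recursive system."""
    found = list(target_address(catalog))
    for sub in container_address(catalog):      # data that contain catalogs
        found.extend(rec(sub, container_address, target_address))
    return found

def class_b(catalog):
    # address (B): the nested catalogs held by class-B data
    return [v for c, v in catalog["records"]
            if c == "B" and isinstance(v, dict)]

def class_d_x(catalog):
    # address (D,,x): values of class-D data equal to "x"
    return [v for c, v in catalog["records"] if c == "D" and v == "x"]

inner = {"records": [("D", "x")]}
outer = {"records": [("B", inner), ("D", "x")]}
# rec(outer, class_b, class_d_x) finds "x" at both levels
```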
Let us return briefly to the example of library
catalogs. Suppose that, given an acquisitions list of
the kind we have described, we wish to obtain an
author index. With a conventional file on, say,
punched cards, this would be a simple matter of
sorting. With a catalog, two operations are involved. First, a new map must be designed for the
index in which "author" occupies the level-one position, and the file must be rewritten with the new
structure. Second, it must be sorted, duplicate author entries removed and their dependent nodes
thrown together under a single author node. This is
one of many processes known collectively as transformations. In general, a transformation is any process which produces a catalog with a certain structure from one or more other catalogs with different
structures. The variety of transformations which
could be applied to a sufficiently complicated catalog is almost endless, and the search for algorithms
capable of carrying out any one, provided only that
it is specified in a terse but perspicuous notation,
leads to theoretical and practical problems which
would, and probably will, fill several more papers.
There is good reason to suppose that any transformation can be broken down into a series of
elementary transformations of which there will be
only three or four types. One of these will have the
function of decomposing a catalog into simple catalogs. A simple catalog is one which has exactly
one data class on each level. If two simple catalogs
have the same classes on the first k levels, then they
can be merged to form a new catalog with one data
class on each of the first k levels, and two on the
levels below. In the same way, it is clearly possible
to merge a pair of catalogs of whatever structures
provided only that they have a common class on
level one. If they have common classes on levels 2,
3, etc., then the blend will be the more complete.
Of course, before the merge can take place, it may
be necessary to perform a sort. Now, any catalog in
which there are n data classes which have no subordinate classes can be regarded as a compound of n
simple catalogs. Many transformations can be effected by decomposing the given catalog either partially or completely, changing the relative levels of
the data in some of the simple catalogs, and merging them together again in the same or a different
order.
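The decomposition idea above, and the later remark that a catalog can be viewed as a collection of chains, can be sketched as follows. The tree representation is invented for illustration; each root-to-leaf path plays the role of one record of a simple catalog.

```python
# Sketch of decomposition into chains: every path from the root to a
# lowest-level datum becomes one chain, which can then be sorted or
# merged by conventional record-at-a-time methods.

def chains(node, prefix=()):
    """Yield each root-to-leaf path of (class, value) pairs as a tuple."""
    path = prefix + ((node["class"], node["value"]),)
    if not node.get("children"):
        yield path
    else:
        for child in node["children"]:
            yield from chains(child, path)

tree = {"class": "Author", "value": "Simon", "children": [
    {"class": "Title", "value": "T1", "children": []},
    {"class": "Title", "value": "T2", "children": []},
]}
# each dominated datum appears once per chain, duplicating its ancestors
```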
Sorting and merging are common components of
transformations as well as other catalog procedures
and the overall system must clearly give them an
important place. Since it is part of the essence of
catalogs that they contain a rich variety of data
types encoded according to diverse conventions,
more flexibility is required than most sorting procedures provide. In particular, the encoding of the information in a given data class will, in general, not
be such that an algebraic sort on the resulting binary number will give the required results. Furthermore, there may be some classes in any catalog in
which the order of the data cannot be algorithmically determined; their order may be essentially arbitrary, each under its own dominating element on
the level above, or the requirement for the sort may
be that the order of the data in the class be preserved from the original input. To provide for contingencies of these kinds, the catalog merge and sort
routines allow the user to supply a comparison routine for each data class in any catalog to be treated.
This routine takes a pair of data of the specified
class and declares which of them should precede the
other in the output. The routine also knows what
the input order was and may use this in arriving at
a result.
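The per-class comparison-routine convention can be sketched briefly. The interface below is an assumption made for illustration: each datum is paired with its input position, so a comparator can consult the original order, and a class with no routine keeps the input order.

```python
# Sketch of class-specific comparison: the sort consults a user-supplied
# comparator for each data class; the comparator sees input positions and
# may use them in arriving at a result.

from functools import cmp_to_key

def sort_class(data, comparators, data_class):
    """Sort (input_position, value) pairs with the class's own comparator."""
    cmp = comparators.get(data_class)
    if cmp is None:                        # no routine: keep the input order
        return sorted(data, key=lambda pair: pair[0])
    return sorted(data, key=cmp_to_key(cmp))

def by_year_then_input(a, b):
    # An invented comparator for a "Date" class: compare years, and fall
    # back on the original input order when years are equal.
    (pos_a, year_a), (pos_b, year_b) = a, b
    if year_a != year_b:
        return -1 if year_a < year_b else 1
    return -1 if pos_a < pos_b else 1

data = [(0, 1964), (1, 1747), (2, 1964)]
# sort_class(data, {"Date": by_year_then_input}, "Date")
#   -> [(1, 1747), (0, 1964), (2, 1964)]
```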
Since the objects to be merged and sorted are
trees rather than files of independent records, the
algorithms must clearly be unconventional in other
ways as well. However, the differences are less than
might at first appear. If each node in a catalog were
duplicated once for each lowest-level datum it dominated, the catalog would take on the aspect of a
great number of chains with a lowest-level datum at
the foot of each. Each of these chains could then be
treated as a single record to be sorted in a conventional way. Whilst it is never necessary to actually
expand a catalog into this cumbersome form, the
computer can arrange to retain a datum in memory
until all its descendants are passed, so that its instantaneous view of the catalog at any moment is as of
a chain.
The catalog system could develop in many different ways and it is our intention that it should. For
something so pedestrian as a filing system, it is remarkable how it has captivated the stargazer and
the theoretician as well as the bookkeeper and the
librarian in everyone who has worked on it. And
these have been many. However, it is important for
the welfare of computational linguistics that catalog
systems or something designed to fill the same need
should be made available soon. We have therefore
resolved to be done with theorizing, for the present
at least. What catalogs need is action.
INFORMATION SEARCH OPTIMIZATION AND INTERACTIVE RETRIEVAL TECHNIQUES*
J. J. Rocchio and G. Salton
Computation Laboratory
Harvard University
Cambridge, Massachusetts
INTRODUCTION

Automatic information retrieval systems must be
designed to serve a multiplicity of users, each of
whom may have different needs and may consequently
require different kinds of service. Under these
circumstances, it appears reasonable that the system
should reflect this diversity of requirements by
providing a role for the user in determining the
search strategy. This is particularly important in
automatic systems, where presently used one-shot
(keyword) search procedures normally produce poor
results.

In an automatic retrieval environment in which the
user may be given access to the system (for example,
by means of special input/output consoles), this can
be achieved by two principal methods:

1. By providing automatic aids to the user in his
attempt to formulate effective search requests.

2. By using the results of previous searches to
determine strategies likely to prove effective
during a subsequent pass.

*This study was supported in part by the National Science
Foundation under research grant GN-360.

In either case, the user can be made to control the
retrieval process by asking him to furnish to the
system information which subsequently determines,
at least in part, the mode of operation for a later
search.

Several methods may be employed to aid the user in
formulating effective search requests. One of the
simplest methods consists in providing some kind of
automated dictionary which may be used to display
certain pertinent parts of the stored information.
Thus, the frequency of use in the collection of
certain terms in the vocabulary can be displayed to
allow the user to make a choice between the use of
frequently occurring terms, if "broad" retrieval is
desired, and that of rarer terms if "narrow"
retrieval is wanted. Alternatively, terms related in
various ways to those originally included in a search
request may be exhibited, and the user may be asked
to choose from among these related terms in
reformulating his request. The automated dictionary
is then used as an aid in a manual reformulation of
the request.

The iterative search process can also be mechanized
more completely by leaving the search request largely
unchanged, but altering instead the information
analysis process. In that case, the user furnishes
to the system information concerning the adequacy of
a preceding search operation, which is then used
automatically to adjust the retrieval process for
the next iteration.
In the present study, several alternative search
optimization procedures are examined. In each
case, the automatic SMART document retrieval
process, presently operating on the IBM 7094 computer in a batch-processing mode, is used to simulate the real-time iterative search process.1,2,3 The
automatic evaluation procedures incorporated into
SMART are utilized to measure the effectiveness of
each process, and data are obtained which reflect
the relative improvement of the iterative, user-controlled, process over and above the usual single-pass
search procedure.
THE AUTOMATIC DICTIONARY PROCESS
In a conventional, batch-processing retrieval environment, the user normally relies on his intuition
and experience, possibly aided by published references, in formulating an initial search request.
Once the general context has been established, the
request must be normalized to a form suitable for
use by the retrieval system. In a conventional coordinate indexing system, for example, this normalization would consist in a manual transformation of
the original search request into an appropriate set
of keywords. In certain automatic keyword search
systems, a machine indexing process would generate
the keywords, and stored synonym dictionaries
might be used for normalization. After the analysis
process, the normalized identifiers which specify
the search request are matched with the identifiers
attached to the documents, and correlation coefficients are obtained to measure the similarity between documents and search requests.
In the present section, a system is considered in
which a communications link enables the user to
influence the normalization process by making it
possible for him to choose certain terms to be added to and/or deleted from an original search formulation. Four main procedures appear to be of interest
for this purpose:
1. A stored synonym dictionary, or thesaurus,
may be used, given a set of thesaurus entries, to display all related entries appearing under the same concept category.
2. A hierarchical arrangement of terms or
concept classes may be available which,
given a set of initial terms, can provide
more general concepts by going "up" in the
hierarchy, or more specific ones by going
"down."1,2,3

3. A statistical term-term association matrix
may be computed which can be used, given
a set of terms, to find all those related
terms which exhibit a tendency to co-occur
in many documents of the collection with
the terms originally specified.4

4. Assuming the availability of a set of documents retrieved by an initial search operation, one may add to the terms originally
specified in a search request all those
terms which occur in several of the retrieved documents but do not occur in the
initial request.5
While it is potentially very useful to provide the
user with a set of terms which may have been overlooked in formulating the original search request, it
is probably even more important to furnish an indication of the usefulness in the retrieval process of
each of the query terms. The most obvious indicator of potential usefulness is the density (or absolute number) of doc:uments identified by each of
the given index terms. The assumption to be made
in this connection is that the usefulness of a term
varies inversely with the frequency with which it is
assigned to the documents of a collection.
Thus, in a coordinate indexing system, in which
the retrieval process is controlled by the number of
matches between terms assigned to documents and
terms assigned to the search requests, the indexing
density provides a straightforward estimate of the
number of documents likely to be retrieved in each
particular operation. If a correlation function is
used to compare keyword sets attached to documents and queries, the relation between number of
retrieved documents and the indexing density of
query keywords is less obvious. However, the general assumption that a query term with high indexing density will produce "broad" retrieval, whereas
one with low indexing density produces "narrow"
retrieval is still valid.
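The indexing-density heuristic above reduces to counting document frequency and flagging extremes. The sketch below is illustrative: the postings representation and the LOW/HIGH thresholds are assumptions chosen for the example, not values given in the paper.

```python
# Sketch of the indexing-density idea: a term's document frequency in the
# collection suggests "broad" (high-frequency) versus "narrow"
# (low-frequency) retrieval.  Thresholds and data are invented.

def document_frequency(postings, term):
    """Number of documents to which the term is assigned."""
    return len(postings.get(term, set()))

def density_label(postings, term, low=10, high=60):
    df = document_frequency(postings, term)
    if df >= high:
        return "HIGH"
    if df <= low:
        return "LOW"
    return ""

postings = {"morse": {1, 2}, "program": set(range(100))}
# density_label(postings, "morse")   -> "LOW"
# density_label(postings, "program") -> "HIGH"
```

A display such as Fig. 1b pairs each query term with exactly this kind of frequency figure, letting the user demote terms flagged HIGH.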
It seems reasonable, under the circumstances, to
require that each dictionary display provided to the
user consist not only of the corresponding terms or
concepts, but also of the frequency with which the
various terms are assigned to the documents of the
collection. The user can then utilize this information to refine the search request by promoting
terms deemed important and demoting others
which may be ambiguous or otherwise useless in
the retrieval process.
As an example of the use of the indexing density
of query terms, consider the retrieval process illustrated in Fig. 1. The original text of a request titled
"Morse Code" is shown in Fig. 1a. When this text
is looked-up in a typical synonym dictionary, * eight
distinct concept codes are obtained. The codes, together with the frequency of occurrence in the collection of the corresponding concept classes are
shown in Fig. 1b; the full thesaurus entries are similarly included in Fig. 1c.
"Can hand-sent Morse code be transcribed
automatically into English? What programs
exist to read Morse code?"

Term Used       Concept    Frequency of Concept
in Request      Number     (405 documents)
hand-sent       113        12
Morse           35         9 (LOW)
code            281        37
transcribed     570        25
automatically   119        70 (HIGH)
English         35         9
programs        608        104 (HIGH)
exist           234        5
read            569        25

Figure 1. Processing of request "Morse Code."
a) Original query for "Morse Code."
b) Terms included in original request.
The user who examines the output of Fig. 1b
may notice that concepts 119 (obtained from "automatically") and 608 (from "programs") appear
with excessively high frequency in the document
collection under study; furthermore, these concepts
do not appear to be essential to express the intent
of the query of Fig. 1a. These concepts might then
usefully be removed from the query statement. The
user may note further, from the display of Fig. 1c,
that "transcribe" can be replaced by "translate"
(concept 570), and "read" by "recognize" (concept
569) in order to render more appropriately the purpose of the request. Finally, the crucial concept 35
(Morse) may be reinforced by increasing its
weight. The two reformulations of Fig. 1d reflect
the corresponding additions, deletions, and substitutions.
The success of the request alterations may be
evaluated by examining the ranks of the two relevant documents (numbers 305 and 394) as shown
in Fig. 1e. It may be seen that retrieval results are
improved for both modifications 1 and 2 over the
original, but that the better result is obtained for
the first modification where the relevant documents
are ranked fourth and eighth, respectively. The correlation coefficients between the two relevant documents and the search requests are also seen to be
*The dictionary used in the example is the "Harris III"
thesaurus available with the SMART retrieval system.1,2,3
Concept Number   Corresponding Thesaurus Entries
113              hand-drawn, hand-keyed, hand-sent, hand, handsent, manual, non-automatic
35               English, French, Morse, Roman, Russian
281              code, pseudo-code
570              record, reproduce, translate, transcribe, transcript
119              artificial, automate, machine-made, mechan, semi-automatic
608              program
234              behavior, case, chance, event, exist, fall, instance, occur
569              read, recognize, sense

Modification 1:  "Can hand-sent Morse code be translated into English?
                 Recognition of manual Morse code."

Modification 2:  Use original query and add "Morse, Morse, Morse".
Type of Query      Ranks of Relevant   Document   Correlation
                   Documents           Number
Original query     7                   394        0.29
"Morse Code"       30                  305        0.13

Modification 1     4                   394        0.33
                   8                   305        0.26

Modification 2     4                   394        0.30
                   16                  305        0.13
c) Thesaurus entries for terms in original request.
d) Modified queries by deletion of common, high-frequency concepts and addition of important low-frequency concepts.
e) Comparison of search results using original and
modified queries.
much higher for the modified formulations than for
the original.
A second example, involving a different dictionary
feedback process, is illustrated in Fig. 2 for the request labeled "IR Indexing." In this case it is assumed that an initial retrieval operation has taken
place, and that the user would like to use information obtained from the retrieved documents in order
to reformulate his search request before a second
attempt is made. The original query text is shown
in Fig. 2a, and the corresponding concept classes
and concept frequencies appear in Fig. 2b. The retrieval results obtained by processing the initial
query are given in Fig. 2c.
Under the assumption that the user examines the
list of retrieved documents, and finds that the 5th
and 6th documents (numbers 79 and 80) are useful
to him, it is now possible to request that concepts
attached to these documents, but not included in the
original search request, be displayed. This is done
in Fig. 2d for concepts jointly included in the relevant documents Nos. 79 and 80.
It now becomes possible for the user to pick new
terms from the list of Fig. 2d (for example, terms
like "coordinate," "lookup," and "abstract") and
to use them to rephrase the search request as shown
"Automatic Information Retrieval and Machine
Indexing"

Original Term     Concept    Frequency of Concept
Used in Request   Number     (405 documents)
automatic         119        70
information       350        45
retrieval         26         6
machine           600        77
indexing          101        11

Document   Document   Correlation   Relevant
Rank       Number     Coefficient
1          167        0.44          no
2          166        0.38          no
3          129        0.33          no
4          314        0.33          no
5          79         0.33          yes
6          80         0.30          yes

Figure 2. Processing of request "IR Indexing."
a) Original query text for "IR Indexing."
b) Terms included in original request.
c) Retrieval results for original query (using version III of Harris thesaurus).
in Fig. 2e. The reformulated request also excludes
concepts 119 (automatic) and 600 (machine)
which are found to occur with excessive frequency
in the document collection. The ranks of the relevant documents obtained for both original and
modified queries are given in Fig. 2f.

Concepts from          Corresponding Thesaurus Entries
Documents 79 and 80
49                     co-ordinate, coordinate, intercept, ordinate, pole, rectangular-to-polar
108                    consult, look-up, look, lookup, scan, seek, search
114                    abstract, article, auto-abstracting, bibliography, catalog, copy, etc.
170                    noun, verb, sentence
497                    science

"Information Retrieval. Document Retrieval.
Coordinate Indexing. Dictionary Look-up
for Language Processing. Indexing and
Abstracting of Texts."

Retrieval Results Using Original Query
Ranks of Relevant   Document   Correlation
Documents           Number
5                   79         0.33
6                   80         0.30
9                   221        0.29
11                  126        0.28
12                  48         0.21
69                  3          0.10

Retrieval Results Using Modified Query
Ranks of Relevant   Document   Correlation
Documents           Number
1                   80         0.51
4                   79         0.41
6                   48         0.36
9                   126        0.23
11                  221        0.23
18                  3          0.19

d) Concepts common to relevant documents Nos. 79 and 80 and not included in original request.
e) Modified query using terms from relevant documents.
f) Comparison of search results using original and modified queries.

It may be seen that the relevant documents have much
lower rank, and correspondingly higher correlation
coefficients, for the modified search request than
for the original. The lowest relevant document, in
fact, places only 18th out of a total of 405
documents when the modified query is used, whereas
it originally ranks 69th.
Corresponding improvements can be obtained by
the judicious use and display of hierarchical subject
arrangements and statistical term associations.
REQUEST OPTIMIZATION USING RELEVANCE FEEDBACK

The vocabulary feedback process illustrated in the
preceding section appears to be both easy to
implement and effective in improving search results.
It does, however, put considerable demands upon the
user, who controls not only what is displayed by the
system, but also what is returned in the way of
modified information. A variety of search
optimization methods should, therefore, be considered
which place a much larger burden on the system and a
correspondingly smaller one on the user. One such
procedure is the relevance feedback method.

In essence, the process consists in effecting an
initial search, and in presenting to the user a
certain amount of retrieved information. The user
then examines some of the retrieved documents and
identifies each as being either relevant (R) or not
relevant (N) to his purpose. These relevance
judgments are then returned to the system, and are
used automatically to adjust the initial search
request in such a way that query terms or concepts
present in the relevant documents are promoted (by
increasing their weight), whereas terms occurring in
the documents designated as nonrelevant are similarly
demoted.
The amount of improvement to be obtained from
the feedback process depends critically on the manner in which the search request is altered as a function of the user's relevance judgment. The following
process which has been used experimentally with
the SMART system appears to be optimal in this
connection. Consider a retrieval system in which
the matching function between queries and documents (or between query and document identifiers)
induces a metric, or a monotonic function of a metric, on the space of query and document images
(e.g., on the space of keyword vectors).6 In such a
case, it is possible to produce an ordering of the
documents with respect to the input query in such a
way that increasing distance between document and
query images reflects increasing dissimilarity between them.
Let DR be the nonempty subset of relevant documents from the source collection D, relevance being
defined subjectively and outside the context of the
system. An optimal query can now be defined as
that query which maximizes the difference between
average distances from the query to the relevant document set, and from the query to the nonrelevant
set. In other words, the optimal query is the one which
provides the maximum discrimination of the subset
DR from the rest of the collection (D - DR). More
formally, let 8(q, d) be the distance function used in
the matching process between query q and document
d. The optimal query q0 may then be defined as that
query which maximizes the function
C
= 8
(q, d) - 8 (q, d)
deDR
deD R
identified by the user, from the remaining documents;
using this optimal query to modify the original search
request the resultant query can then be resubmitted,
and the process may be iterated, as more complete
sets of relevant documents become available through
subsequent retrieval operations. One may hope that
only a few iterations will suffice for the average user;
in any case, the rate of convergence will be reflected
in the stability of the retrieved set.
In the SMART automatic document retrieval system, the query-document matching function normally used is the cosine correlation of the query vector
with the set of document vectors, defined as
- - _ q d _
p (q, d) - ----==--=- - cos () - Iqlldl
q,d
q
where and d are the vector images of query q and
document d, respectively. Since the vector images
are limited to nonnegative components, the range for
the correlation is 0 ~ p ~ 1, corresponding to an
angular separation of 90 ~ () ~ O. Under these conditions, the correlation coefficient is a monotonic
function of the angular distance metric. Furthermore,
since the· correlation decreases with increasing distance, relation (1) may be rewritten as
C·= p
(q, d) -"- p (q, d)
(3)
deDRdeDR
where p is the average cosine function p, It can be
shown, 7 that in this case C is maximized for
(1)
where ,8 is the average distance function, and decreasing distance implies stronger query-document association.
Clearly, Eq. (1) is of no practical use, even under
the assumption that the optimal query qo can be determined as a function of D and DR, since knowledge
of the set DR (the relevant document subset) obviates
the need for retrieval. However, if instead of producing the optimal query qo, thr:- relation (1) is used
to produce a sequence of approximations to qo, starting with some initial query which identifies a part of
the set DR, then a method for automatically generating useful query modifications becomes available. The
system can, in fact, produce the optimal query to
differentiate the partial set of relevant documents,
(2)
qo =
~o 1: I~I - N~no 1: I~I
dieDR
(4)
dieDR
where no = n(D R ), the number of elements in the
set DR, and N = neD), the number of elements in
the collection.
The query modification algorithl1l employed may
now be written in the form
(5)
where qi is the ith query of a sequence, R = {r1, r2, ..., rn1} is the set of relevant document vectors retrieved in response to query qi, and S = {s1, s2,
300
PROCEEDINGS -
FALL JOINT COMPUTER CONFERANCE,
..., sn2} is the set of nonrelevant vectors retrieved in response to qi (the specification of the sets R and S constitutes the feedback from the user after the ith iteration of the process).
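In modern terms, one iteration of Eqs. (2) and (5) can be sketched as follows. The sample vectors are invented for illustration, and clipping negative components to zero is an assumed convention (motivated by the paper's restriction of vector images to nonnegative components), not a step stated in the text:

```python
# Sketch of the relevance-feedback query update of Eq. (5):
# q_{i+1} = q_i + sum of relevant vectors - sum of nonrelevant vectors.
# Document vectors here are illustrative, not from the SMART collection.

def cosine(u, v):
    """Cosine correlation of Eq. (2): (u . v) / (|u| |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return dot / norm if norm else 0.0

def feedback_update(query, relevant, nonrelevant):
    """One iteration of Eq. (5); negative components are clipped to zero
    (an assumed convention, since vector images are nonnegative)."""
    new_q = list(query)
    for d in relevant:
        new_q = [a + b for a, b in zip(new_q, d)]
    for d in nonrelevant:
        new_q = [a - b for a, b in zip(new_q, d)]
    return [max(a, 0.0) for a in new_q]

q0 = [1.0, 1.0]
relevant = [[0.9, 0.1], [0.8, 0.3]]     # roughly A1, A2 of Fig. 3
nonrelevant = [[0.1, 0.9], [0.0, 1.0]]  # roughly B1, B2
q1 = feedback_update(q0, relevant, nonrelevant)
# The modified query correlates more highly with the relevant vectors.
assert cosine(q1, relevant[0]) > cosine(q0, relevant[0])
```

As the assertion indicates, the updated query is pulled toward the relevant vectors and pushed away from the nonrelevant ones, which is the geometric effect illustrated in Fig. 3.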
A simple graphical illustration of the application of
Eq. (5) for two-dimensional document and query
vectors is given in Fig. 3. Figure 3a shows the original query vector (Q0) and the four document vectors A1 and A2 (relevant) and B1 and B2 (nonrelevant). Figure 3b illustrates the formation of the
optimal vector used to differentiate between A1 and
A2 on the one hand, and B1 and B2 on the other. The resulting normalized query (Q1) is shown in Fig. 3c. It may be noticed that whereas the original query is approximately equidistant from sets R and S, the modified query is much closer to R (indeed, it coincides with A1) than to S. This is reflected in the correlation coefficients of both the original and modified queries Q0 and Q1 with the four document vectors, as shown in Fig. 3d. It is evident that the modified query will be much more effective in providing retrieval of the relevant document set than the original.
Figure 3. Geometrical representation of relevance feedback.
a) Initial query (Q0), relevant docs. (A1, A2) and nonrelevant docs. (B1, B2).
b) Sum of relevant − sum of nonrelevant doc. vectors.
c) Resultant of Q0 + normalized sum of relevant − sum of nonrelevant.
d) Correlation of query vectors Q0 and Q1 with doc. vectors.
The query modification process of Fig. 3 was
tested by performing two iterations for a set of 24
search requests, previously used in connection with
the SMART system. Figure 4 shows the results for
a request on "Pattern Recognition." The original
retrieval results, using version 2 of the "Harris"
thesaurus (synonym dictionary), are given in Fig.
4a. The user identifies documents 351, 353, and
350 as relevant, and 208, 225, and 335 as nonrelevant. The query is then automatically modified, in
accordance with the expression of Eq. (5), and
retrieval performance is compared in Fig. 4b. It
may be seen that drastic improvements are obtained
both in the ranks of the relevant documents and in
the magnitude of the correlation coefficients. The
"recall" and "precision" measures, shown in Fig.
4b, are the normalized evaluation measures incorporated into the SMART system,8,9 which express
the ability of the system to retrieve relevant material
and to reject irrelevant matter.
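The normalized recall and precision measures themselves are defined in the cited reports; the forms usually attributed to them (assumed here, not quoted from ISR-8) depend only on the ranks at which the n relevant documents are retrieved from a collection of N documents:

```python
# Sketch of normalized recall and normalized precision in the forms
# commonly attributed to Rocchio (ISR-8); the exact SMART definitions
# should be checked against the cited report.
from math import log

def normalized_recall(relevant_ranks, collection_size):
    """1 - (sum of actual ranks - sum of ideal ranks) / (n * (N - n))."""
    n, N = len(relevant_ranks), collection_size
    ideal = sum(range(1, n + 1))
    return 1.0 - (sum(relevant_ranks) - ideal) / (n * (N - n))

def normalized_precision(relevant_ranks, collection_size):
    """Companion measure using logarithms of the ranks; the denominator
    is log of the binomial coefficient C(N, n)."""
    n, N = len(relevant_ranks), collection_size
    num = sum(log(r) for r in relevant_ranks) - sum(log(i) for i in range(1, n + 1))
    den = sum(log(i) for i in range(N - n + 1, N + 1)) - sum(log(i) for i in range(1, n + 1))
    return 1.0 - num / den

# Perfect ranking: the relevant documents occupy the top ranks.
assert normalized_recall([1, 2, 3], 10) == 1.0
```

Both measures equal 1 for a perfect ranking and fall toward 0 as the relevant documents drift down the ranked list, which is why the modified query's higher values in Fig. 4b indicate improved retrieval.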
INFORMATION SEARCH OPTIMIZATION
[Tables: Fig. 4a lists the top ten documents retrieved by the original query, with documents 351, 353, and 350 ranked 1-3 (correlations .65, .42, .41) and the user's relevance judgments; Fig. 4b lists the ranks and correlations of the relevant documents for the original and the feedback-modified queries, with normalized recall improving from .972 to .989 and normalized precision from .864 to .923.]
Figure 4. Query processing using relevance feedback.
a) Retrieval results using original query for "Pattern Recognition" (version 2 of Harris thesaurus).
b) Comparison of search results using original and modified queries.
Figure 5 is a typical recall-precision plot giving recall and precision figures averaged over 24 search requests for the original query formulations as well as for two iterations using relevance feedback.* It may be noted again that for a given recall value large improvements are gained in the average precision by using the relevance feedback
process. Additional improvements are obtained by
identifying further documents as either relevant or
nonrelevant during a second iteration.
*The method of construction of such recall-precision plots has previously been described in detail.9,10

AUTOMATIC MODIFICATION OF THE
ANALYSIS PROCESS

The last search optimization process to be described depends, like its predecessor, on feedback provided by the user, and results in selective changes in the document and request analysis process. However, instead of furnishing relevance judgments based on the output of a previous retrieval operation, the user makes a qualitative assessment of the effectiveness of an initial search operation. For example, he may find that the documents obtained from the system show that his request was interpreted too narrowly (since all retrieved documents belong to some small subfield of the larger area which he expected to cover), or too broadly, or too literally, or too freely.

Depending on the type of interpretation furnished by the user, the system now proceeds to initiate a new search operation under altered analysis procedures. If the user's verdict was "too narrow," a hierarchical subject arrangement similar to the one mentioned in the second section of this paper might be consulted, and each original query term could be replaced by a broader one; if, on the other hand, the initial search was "too broad," more specific terms might be obtained from the hierarchy. If the interpretation was too literal, the use of a synonym dictionary might provide more reasonable results; and so on.

Automatic retrieval systems are particularly attractive in such a situation, because these systems make it possible to provide, at relatively little extra cost, a variety of indexing procedures which may be called upon as needed. The SMART system, in par-
[Graph: precision vs. recall curves for the initial queries, the first iteration of relevance feedback, and the second iteration of relevance feedback.]
Figure 5. Precision versus recall for initial queries and queries modified by relevance feedback (averaged over
24 search requests).
ticular, provides a large variety of indexing methods including the following:

1. A null thesaurus procedure which uses the word stems originally included in documents and search requests for content identification.
2. A synonym dictionary ("Harris" thesaurus) which replaces original word stems by synonym classes or concept numbers.
3. A hierarchical arrangement of concept numbers which can be used, given a set of concepts, to obtain more general ones ("hierarchy up") or more specific ones ("hierarchy down").
4. A statistical phrase procedure which is used to replace pairs or triples of co-occurring (related) concepts by a single "phrase" concept (e.g., the concepts "program" and "language" might be combined into "programming language").
5. A syntactic phrase process which generates phrases only if the components in fact exhibit an appropriate grammatical relationship.
6. A variety of so-called merged methods,9 in which the system proceeds iteratively through two or three simple processes and combines the output.
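The "hierarchy up" and "hierarchy down" operations of item 3 can be sketched as follows. The concept numbers and the parent table are invented for illustration; SMART's actual hierarchy is a separate dictionary resource:

```python
# Toy concept hierarchy: child concept number -> parent concept number.
# The numbers and the tree shape are invented, not taken from SMART.
PARENT = {201: 105, 202: 105, 105: 17, 106: 17}

def hierarchy_up(concepts):
    """Replace each concept by its broader parent where one exists."""
    return [PARENT.get(c, c) for c in concepts]

def hierarchy_down(concepts):
    """Expand each concept into its narrower children where any exist."""
    children = {}
    for child, parent in PARENT.items():
        children.setdefault(parent, []).append(child)
    out = []
    for c in concepts:
        out.extend(sorted(children.get(c, [c])))
    return out

assert hierarchy_up([201, 106]) == [105, 17]
assert hierarchy_down([105]) == [201, 202]
```

A "too narrow" verdict would thus map each query concept upward to a broader one, while a "too broad" verdict would expand concepts downward into their more specific children.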
Obviously, the ability to generate a multiplicity
of distinct index images for each document does
not necessarily imply that each modification in the
analysis process results in large-scale improvements
in the search effectiveness. Experiments conducted
with the SMART system have, however, shown that
in many cases a considerable increase in retrieval
effectiveness is obtainable when changes in the
analysis are adapted to the aims of each particular
user.
Consider, in this connection, the evaluation output for a variety of analysis methods produced by
the SMART system, reproduced in Fig. 6 and 7.
Figure 6 contains output for the search request titled "Automata Phrases," with nine relevant documents. Six simple analysis methods are shown: null
thesaurus, "Harris Two" (version 2 of the regular
synonym dictionary), statistical phrases, syntax
phrases, hierarchy up, and hierarchy down. Thirteen
"merged" methods, each including two simple components, are also included in Fig. 6, as well as nine
[Printout: for each of the 28 analysis methods, the top 15 retrieved document numbers, the ranks of the relevant documents, and the evaluation measures (rank recall, log precision, normalized recall, normalized precision, overall, and normalized overall).]
Figure 6. Evaluation output for request "automata phrases" (28 different analysis methods).
triple merges.* For each method, the output is presented in two parts: the left part includes the document numbers of the first 15 documents retrieved by that method, whereas the right-hand side con-
*The ranked document output for the "merged" methods
is produced by taking the ranked lists for the individual component methods and merging these lists in such a way that
all documents with rank 1 precede all documents with rank
2, which in turn precede documents with rank 3, and so on.
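The merging rule of the footnote can be sketched as follows. How a document ranked by more than one component method is handled is an assumption here (it is kept at the first rank at which it appears), and the document numbers are merely illustrative:

```python
# Merge ranked lists per the footnote: all rank-1 documents from the
# component methods precede all rank-2 documents, and so on.

def merge_ranked(*ranked_lists):
    merged, seen = [], set()
    for rank in range(max(len(lst) for lst in ranked_lists)):
        for lst in ranked_lists:
            if rank < len(lst) and lst[rank] not in seen:
                seen.add(lst[rank])   # keep a duplicate at its best rank
                merged.append(lst[rank])
    return merged

stat_phrase = [316, 371, 129, 372]
hierarchy_up = [316, 264, 129, 313]
assert merge_ranked(stat_phrase, hierarchy_up) == [316, 371, 264, 129, 372, 313]
```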
[Printout: precision values at ten recall levels (0.1 through 1.0), together with rank recall, log precision, normalized recall, and normalized precision, for each of the 28 analysis methods, averaged over 17 search requests.]
Figure 7. Precision vs. recall plots for 28 analysis methods
(averages shown over 17 search requests).
sists of only the relevant document numbers and
their ranks in decreasing correlation order with the
request. Below the lists of document numbers, a va-
riety of recall and precision measures are provided
for each analysis procedure, to reflect the effectiveness of the corresponding process.
An examination of Fig. 6 reveals, for example,
that for the request on "Automata Phrases," improved retrieval is obtained by switching from the
word stem procedure to the synonym recognition
process using the regular thesaurus (labeled 1 in
Fig. 6). This is reflected both by the magnitude of
the evaluation coefficients, and by the ranks of the
last relevant document (104th out of 405 for the
word stem process (null thesaurus), and 74th for
"Harris Two"). An improvement is also obtained
by switching from "Harris" thesaurus to the phrase
procedures, and from statistical phrases to syntax
phrases (labeled 2). The third example from Fig. 6
shows that the merged procedure which combines
the statistical phrases with the hierarchy results in
an increase in performance over and above each of
the component methods. A further improvement is
obtained by adding the regular "Harris 2" thesaurus
process to the previously merged pair (example
four of Fig. 6).
Figure 7 shows evaluation output obtained for
the same 28 analysis methods previously shown in
Fig. 6, but averaged over 17 different search requests. The output of Fig. 7 is presented in the
form of precision vs. recall graphs, similar to that
shown in Fig. 5 (the actual graphs are not drawn
but tables are presented instead). The five examples
specifically indicated in Fig. 7 again confirm the
earlier results that improvements are obtainable
from method to method.
Each of the three search optimization procedures
described in this study appears to be useful as a
means for improving the retrieval effectiveness of
real-time, user-controlled search systems. Additional experimentation with larger document collections
and with an actual user population may be indicated before incorporating these procedures in an operational environment. Iterative, user-controlled
search procedures appear, however, to present an
interesting possibility, and a major hope, for the
eventual usefulness of large-scale automatic information retrieval systems.
REFERENCES
1. G. Salton, "A Document Retrieval System
for Man-machine Interaction," Proceedings of the
ACM 19th National Conference, Philadelphia, 1964.
2. G. Salton and M. E. Lesk, "The SMART
Automatic Retrieval System-An Illustration,"
Comm. of the ACM, vol. 8, no. 6 (June 1965).
3. G. Salton et al., "Information Storage and Retrieval," Reports No. ISR-7 and ISR-8 to the National Science Foundation, Computation Laboratory, Harvard University (June and Dec. 1964).
4. V. E. Giuliano and P. E. Jones, "Linear Associative Information Retrieval," Vistas in Information Handling, P. Howerton, ed., Spartan Books,
Washington, D.C., 1963.
5. R. M. Curtice and V. Rosenberg, "Optimizing Retrieval Results with Man-machine Interaction," Center for the Information Sciences, Lehigh University, Bethlehem, Pa., 1965.
6. J. F. Rial, "A Pseudo-metric for Document
Retrieval Systems," Working Paper W-4595,
MITRE Corp., Bedford, Mass. (1962).
7. J. J. Rocchio, "Relevance Feedback in Information Retrieval," Report No. ISR-9 to the National Science Foundation, Sect. 23, Computation
Laboratory, Harvard University (Sept. 1965).
8. J. J. Rocchio, "Performance Indices for Document Retrieval Systems," Report No. ISR-8 to
the National Science Foundation, Sect. 3, Computation Laboratory, Harvard University (Dec. 1964).
9. G. Salton, "The Evaluation of Automatic Retrieval Procedures-Selected Test Results Using the SMART System," American Documentation, vol. 16, no. 3 (July 1965).
10. C. W. Cleverdon, "The Testing of Index
Language Devices," ASLIB Proceedings, vol. 15,
no. 4 (Apr. 1963).
AN ECONOMICAL PROGRAM FOR LIMITED PARSING OF ENGLISH
D. C. Clarke and R. E. Wall*
IBM San Jose Research Laboratory
San Jose, California
Automatic syntactic analysis has often been proposed as a component of mechanized indexing systems.1,2 However, up to this time, frequency counting and statistical association techniques have been more favored, since these involve operations which can be performed with great speed on present-day computers. Syntactic analysis programs,3 especially the few which have relatively complete grammars, have suffered from the disadvantage of slow and expensive operation and consequently have seldom been applied beyond the field of mechanical translation. In this paper, we report the design and testing of a limited syntactic recognition program for English which shows promise of becoming accurate enough to aid in mechanized indexing, yet sufficiently inexpensive to make large-scale use practicable.

We originally developed this system as an extension of Baxendale's Title Analyzer Program,4 which used a small number of syntactic clues and a discard list to select "significant" words and phrases from titles of technical articles for use as index terms. However, the shallowness of an index produced only from titles seriously limited the applicability of the Baxendale program, so it seemed natural to apply similar techniques to abstracts of technical articles as well. Abstracts, like titles, are intended to be concise statements of the information in a document, but by virtue of their greater length should provide more potential index terms.

The syntactic recognition problem with abstracts, however, is much more difficult than with titles. The latter often consist only of noun and prepositional phrases and rarely contain post-modifying participles or relative clauses. Abstracts, on the other hand, potentially exhibit the full range of syntactic constructions of formal written English, except for interrogative and exclamatory sentences, which are excluded by precepts of style. For this reason, the fairly simple procedures of the Title Analyzer Program were inadequate for dealing with abstracts with any reasonable degree of accuracy. While our initial efforts were directed (as in the Title Analyzer) toward the identification of nouns and their modifiers, the result has been a program written in COMIT5 yielding a nearly complete parsing of the "surface" syntactic structure of each sentence. The overall sequence of operations in the program is shown in Fig. 1.

In the description of the program which follows we will emphasize those features of the design imposed by the necessity for economical performance in a projected mechanized indexing system.

The program accepts cards in a format which is

*Present address (R. E. Wall): Department of Linguistics, Indiana University, Bloomington, Ind.
[Flowchart: tentatively bracket and label phrase boundaries (pushdown store); revise phrase and clause markings to correct any ill-formedness.]
Figure 1. Sequence of syntactic analysis program.
N-type GaAs doped with Te to 3 × 10^17 cm^-3 and 9.6 × 10^17 cm^-3 have absorption coefficients of 20 cm^-1 and 10 cm^-1, respectively, at 1.475 eV and 77°K.

Printed input text

*N-TYPE *GA*AS DOPED WITH *TE TO *(*( **F *) AND *(*( **F *)
HAVE ABSORPTION COEFFICIENTS OF *(*( **F *) AND *(*( **F *)
RESPECTIVELY, AT 1.475 E*V AND 77 DEGREES *K .

Keypunched input text

Figure 2. Sample input sentence.
essentially a character-for-character transcription (Fig. 2). We have accepted the present necessity of keypunching and have therefore concentrated on making the input format convenient for our syntactic analysis program. Nonetheless, we feel that
this format will be reasonably compatible with automatically transcribed text whenever such becomes
generally available.
DICTIONARY
The function of a dictionary in a syntactic recognition program is to assign each word * of the input
text to one or more word classes (traditionally such
*In this discussion of the dictionary we are referring to
words as tokens or separate occurrences, not as types or the
total of different forms.
categories as noun, verb, adjective, etc.). The usual
approach is to use a "complete" dictionary, which
contains (ideally) every word form in the language.
Our program uses a "computational" dictionary of
the type described by Klein and Simmons6 which
makes word-class assignments on the basis of
orthographic features. The current dictionary consists of three lists which contain about 1,000 entries
in all:
1. Common function words: prepositions, articles, conjunctions, etc.
2. Word endings†: -ing, -tion, -ed, -ous, etc.
3. Exceptions to the word-ending rules: thing, feed, mention, etc.
One requirement of the program was that it should be suitable for use in a mechanized indexing system. A complete dictionary operating on technical text would require an addition each time a new word was encountered. Such additions are not compatible with economic automatic processing. We
thus chose to use a computational dictionary designed to encode correctly the relatively few types
which account for the large proportion of tokens in
running text.
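The three-list look-up can be sketched as follows. The entries shown are illustrative stand-ins for the program's roughly 1,000-entry lists, and the suffix-matching details are assumptions rather than the program's exact rules:

```python
# Sketch of a computational dictionary: sorted in-core lists searched by
# binary search, as the text describes.  All entries are invented samples.
import bisect

FUNCTION_WORDS = sorted([("and", "CONJ"), ("of", "PREP"), ("the", "ART")])
ENDINGS = [("-ed", "VERB/ADJ"), ("-ing", "VERB/NOUN"), ("-tion", "NOUN")]
EXCEPTIONS = sorted(["feed", "mention", "thing"])

def in_sorted(lst, key):
    """Membership test by binary search on a sorted list."""
    i = bisect.bisect_left(lst, key)
    return i < len(lst) and lst[i] == key

def classify(word):
    w = word.lower()
    # 1. Common function words.
    i = bisect.bisect_left(FUNCTION_WORDS, (w, ""))
    if i < len(FUNCTION_WORDS) and FUNCTION_WORDS[i][0] == w:
        return FUNCTION_WORDS[i][1]
    # 2-3. Word endings, unless the word is a listed exception.
    if not in_sorted(EXCEPTIONS, w):
        for suffix, cls in ENDINGS:
            if w.endswith(suffix[1:]):
                return cls
    # Unclassified words get the arbitrary noun/verb category.
    return "NOUN/VERB"

assert classify("the") == "ART"
assert classify("absorption") == "NOUN"
assert classify("thing") == "NOUN/VERB"   # exception blocks the -ing rule
```

Because every list fits in core, each text word costs only a few binary searches, which is the efficiency argument made below for avoiding a tape-resident full dictionary.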
Words which are not classified by the computational dictionary are arbitrarily assigned to the
noun/verb category. This choice is again influenced
by potential use of the program in a mechanized
indexing system. The hypothesis is that the importance of nominal constructions in selection of index
unit candidates places emphasis on the bracketing
of all noun phrases. The grammatical algorithm is
designed to deal with the noun/verb ambiguity.
One advantage of our computational dictionary
is that it is small enough to be contained in the core
storage of an IBM 7094 along with the grammar
rules and program instructions. This allows for a
binary search through the dictionary lists for every
word as it occurs in text order. A full dictionary, on
the other hand, would have to be stored on magnetic tape (in which case the input text would have to
be run in batches, alphabetized, looked up, and
re-sorted into text order) or else stored on drums
or disks, thereby sacrificing the advantage of a binary search. A further consequence . of .the small
computational dictionary contained in core storage
is that the processing of sentences can be open-
†A reverse-alphabetized word list7 was most helpful in
discovering word endings and exceptions.
ended. Since there is no need to handle the text in
batches, any number of sentences can be run without interruption once the program has been loaded.
These conveniences, however, are paid for in two
ways. One is the obvious limitation that many
words will receive the arbitrary noun/verb classification because they were not found in the dictionary, and any misclassification may lead to erroneous
phrase bracketings. The other disadvantage is in the
lack of refinement possible in certain word classes.
For example, although the suffix -tion nearly always serves to identify words as noun forms, such words cannot be further subclassified by this clue as animate-inanimate, abstract-concrete, or countable-uncountable. Thus, a computational dictionary
introduces error in syntactic recognition not only by
incorrect word-class assignments, but also by limiting the discrimination which can be made in the
grammar.
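The lookup scheme described above can be sketched in modern terms as a small sorted list searched by binary search, backed by an ending list and a noun/verb default for words not found. The entries, suffix rules, and class codes below are illustrative stand-ins, not the actual dictionary contents.

```python
import bisect

# A miniature computational dictionary: a small in-core list of
# (word, codes) pairs kept in sorted order so that each text word can be
# found by binary search.  Entries and codes are invented for illustration.
ENTRIES = sorted([
    ("a", ["ATBS"]),
    ("for", ["PRE"]),
    ("of", ["PRE"]),
    ("the", ["ATBS"]),
    ("this", ["ATES"]),
    ("which", ["RL3"]),
])
WORDS = [w for w, _ in ENTRIES]

# A few suffix rules of the kind found with a reverse-alphabetized list.
SUFFIXES = [("tion", ["NOUS"]), ("s", ["NOUP", "VZZS"])]

def lookup(word):
    word = word.lower()
    i = bisect.bisect_left(WORDS, word)        # binary search
    if i < len(WORDS) and WORDS[i] == word:
        return ENTRIES[i][1]
    for suffix, codes in SUFFIXES:             # ending list
        if word.endswith(suffix):
            return codes
    return ["NOUS", "VZZS"]                    # arbitrary noun/verb default

print(lookup("this"))        # article code
print(lookup("presents"))    # plural noun / singular verb homograph
print(lookup("computer"))    # not found: noun/verb default
```

Because the whole table lives in core, every word is looked up in text order with no batching, which is the property the passage above emphasizes.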
GRAMMAR
The grammar gives rules for the allowed combination of word classes into phrases and clauses.
Nine types of phrases (nominal, pronominal, adjectival, past participial, present participial, prepositional, verbal, infinitive, and adverbial) and eight
kinds of subordinate and relative clauses are recognized. The kind of clause is dependent on the clause
introducer and on the alternative structural patterns
predicted by that introducer. For example, if a
clause is introduced by which, the grammar expects
that a verb will be found but that a subject is optional. If a verb is not found, the algorithm will
search for a re-bracketing to fulfill the requirement for a verb. The output is a labeled bracketing
of these phrases and clauses together with the syntactic word class for each word. An example of the
output is shown in Fig. 3.
Each phrase is enclosed in parentheses and is
followed immediately by an identifying label (NOUP
= noun phrase, PREP = prepositional phrase, etc.).
A hyphen separates the phrase label from a list of
the word classes assigned to each word in the phrase.
These classes are based on those of Kuno and Oettinger8 with many modifications. Although a complete list of the word classes and their defining characteristics would be too long to include here, it would
perhaps be helpful to give a few of those appearing
in Fig. 3.
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965
THIS PAPER PRESENTS A GENERAL SYSTEM CONFIGURATION FOR AN ARITHMETIC UNIT
OF A COMPUTER WHICH IS USED TO SOLVE POLYNOMIAL PROBLEMS EFFICIENTLY
(THIS PAPER) NOUP-ATES NOUS
(PRESENTS) VERB-VZZS
(A GENERAL SYSTEM CONFIGURATION) NOUP-ATBS NOVS NOUS NOUS
(FOR AN ARITHMETIC UNIT) PREP-PRE ATBS NOVS NOUS
(OF A COMPUTER) PREP-PRE ATBS NOUS
***BC 1
WHICH-RL3
(IS USED) VERB-VBIS PZ1
(TO SOLVE) INFP-TOIS VZZP
(POLYNOMIAL PROBLEMS) NOUP-NOVS NOUP
(EFFICIENTLY) ADVP-AVI
***EC 1
Figure 3. Sample of output from syntactic analysis program.
NOUS   noun, singular number
PRE    preposition
VZZS   finite form verb, third person, singular number, transitivity unknown
NOVS   noun/adjective
ATBS   article modifying singular nominals
PZ1    past participle homographic with past tense form
AVI    regular adverb
The beginning and end of the relative clause are marked by the symbols ***BC 1 and ***EC 1. Such an analysis falls short of a "complete" phrase-structure parsing in two ways. First, the labelings of
fine structures within phrases are suppressed. For
example, in the sentence shown in Fig. 3, the noun
phrase an arithmetic unit is not overtly labeled as
such but is included in the prepositional phrase. Likewise, the finer structure of this noun phrase itself
(arithmetic unit = NP, etc.) is not given explicitly.
Using phrase-structure tree diagrams we might illustrate the difference as in Fig. 4. The tree diagram at the left represents the phrase marker as it might appear for this 4-word prepositional phrase
in a phrase-structure parsing; the tree at the right
shows the same phrase as it would be analyzed by
our grammar.
The second difference is the failure to mark dependencies between phrases and clauses. For example, in the output in Fig. 3, the point of attachment
of the prepositional phrase for an arithmetic unit is
not specified. The clues for joining such modifiers
to the proper head seem in many cases to be purely
semantic ones, and the problem is always troublesome in any parsing scheme. Jane Robinson cites
the example9 I saw the man with the telescope in
the park, which can have several different readings,
depending on the words which the prepositional
phrases are understood to modify. We do not yet
know whether the simple expedient of joining
post-modifiers to the nearest allowable preceding
structure can be improved upon with the aid of syntactic information alone, nor do we know how
much of this interphrase structure will be necessary
in order to do the job of indexing. In any case, delimiting the phrase boundaries as we have done is a
prerequisite to any attempt to specify these dependencies algorithmically.
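As a concrete illustration of the "nearest allowable preceding structure" expedient mentioned above, the following sketch attaches each prepositional phrase to the closest earlier phrase whose type may take such a modifier. The phrase labels and the attachment table are assumptions for illustration; on Robinson's example the heuristic yields exactly one of the several possible readings.

```python
# Which phrase types a post-modifier of a given type may attach to
# (an illustrative assumption, not a rule from the paper).
ALLOWED_HEADS = {"PREP": {"NOUP", "VERB", "PREP"}}

def attach(phrases):
    """phrases: list of (type, text) pairs in text order.
    Returns (modifier, chosen head) pairs."""
    links = []
    for i, (ptype, text) in enumerate(phrases):
        if ptype != "PREP":
            continue
        for j in range(i - 1, -1, -1):          # scan leftward
            if phrases[j][0] in ALLOWED_HEADS[ptype]:
                links.append((text, phrases[j][1]))
                break
    return links

sent = [("NOUP", "the man"), ("PREP", "with the telescope"),
        ("PREP", "in the park")]
print(attach(sent))
```

Note that the heuristic commits "in the park" to the nearest phrase, "with the telescope", which is only one of the readings a human might intend; this is exactly the limitation of purely syntactic attachment discussed above.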
The current implementation of our program does
not incorporate an explicit, separable grammar.
However, a formal description of the grammar in a context-sensitive phrase-structure notation has been written to provide documentation.
"-~
311
AN ECONOMICAL PROGRAM FOR LIMITED PARSING OF ENGLISH
/
PRE
~
ATB
NP
/~NP
ADJP
NOV
for
I
an
arithmetic
NOU
unit
for
an
arithmetic
unit
Figure 4. Comparison of phrase-structure tree diagrams.
ALGORITHM
The algorithm is a set of procedures for assigning
an allowable syntactic structure to an input sentence
according to the rules set forth in the grammar. A
serious problem in implementing parsing algorithms
has always been that the processing time tends to
increase exponentially with the number of words in
the sentence being analyzed (because as sentence
length increases, so in general does the number of
combinatorial alternatives which must be considered). Consequently, the practical upper limit on
sentence length for the best existing programs has
been about 40 words, and for most programs it has
been much below that. Longer sentences are not at
all rare, however, particularly in technical writing,
and any parsing system which is intended to be a
practical component of an indexing system should
be able to handle them. We therefore attempted to
design this parsing algorithm so that the total processing time would be directly proportional to sentence length.
The general sequence of operations is as follows.
After dictionary lookup is complete, all phrase boundaries are tentatively identified in one left-to-right pass through the sentence. On a second left-to-right pass, clause boundaries are established, and tests for well-formedness in each clause are performed. Nested clauses are handled by a pushdown storage mechanism.10 Whenever an ill-formed condition is recognized, the algorithm initiates an ordered search for alternatives pertinent to
that condition (different word classes or different
phrase boundaries) and will choose the first alternative which resolves that condition. This strategy,
together with the restriction that no set of alternatives can be tried more than once during the analysis of a sentence, avoids the repetitive tracing of
substructures which have already been recognized as
well-formed. Thus, even worst-case analysis
times will vary linearly (or nearly so) with sentence
length. A final series of passes through the sentence
serves to link phrases joined by coordinating conjunctions and performs a few minor revisions before output.
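The clause-nesting bookkeeping in this sequence can be sketched with an explicit pushdown list, in the spirit of the pushdown storage mechanism cited above: clause openers push a new clause, clause closers pop one. The token stream and clause markers below are simplified assumptions, not the program's actual data.

```python
# A sketch of clause bracketing with a pushdown store.  "BC" marks the
# beginning of a nested clause and "EC" its end; other tokens are phrase
# labels that belong to whichever clause is currently open.
def bracket_clauses(tokens):
    stack = [[]]                     # pushdown store of open clauses
    finished = []
    for tok in tokens:
        if tok == "BC":              # beginning of (sub)clause: push
            stack.append([])
        elif tok == "EC":            # end of clause: pop for testing
            finished.append(stack.pop())
        else:
            stack[-1].append(tok)
    finished.append(stack.pop())     # finally, the main clause
    return finished

print(bracket_clauses(["NOUP", "BC", "RL3", "VERB", "EC", "VERB"]))
```

Each popped clause is where the well-formedness tests (and, on failure, the ordered search for alternatives) would be applied.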
We can illustrate one of the parts of the algorithm, the clause well-formedness testing, by using the example of the sentence in Fig. 3. During dictionary lookup, both paper and presents have received the arbitrary noun/verb coding, but the latter word, because of its -s suffix, has been coded
as plural noun and third-person singular verb.
After the first pass, the phrases have been bracketed
as they appear in the output (Fig. 3), except for the
first three words.
[This    paper    presents]   NOUP
 ATES    NOUS     NOUP
         VZZP     VZZS
The noun homograph of presents has been tried
first and has led to the incorrect bracketing shown.
On the second pass the lack of a verb for the independent clause is noted (the relative clause is
well-formed), and the algorithm then examines
the first noun phrase, beginning at the right-most
word, for a verb homograph. Presents is found to
have such a homograph, so the subject is redefined
as the remaining noun phrase, This paper, and the
test for number agreement between subject and verb
is made. Since this alternative produces a well-formed clause, the final bracketing is:
[This paper]  NOUP    [presents]  VERB
 ATES  NOUS            VZZS
Had this alternative not been allowable, the other
words in the phrase would have been examined for
verb homographs. If none was found, the phrase
would have been restored to its original bracketing
and the next noun or prepositional phrase to the
right in the same clause would have been broken
apart and examined similarly. This procedure would
have continued until either a verb and subject had
been established or else all the relevant possibilities
had been exhausted. However, when all the clauses
of a sentence have been made well-formed, no
more alternatives are tried. Thus, the algorithm arrives at only one syntactic structure for each sentence, in contrast with the multiple parsings generated by a program such as the Harvard Syntactic
Analyzer. 8
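The verb-homograph search walked through above can be sketched as follows. The word-code pairs mirror the example of Fig. 3, but the helper itself is an illustrative reconstruction, not the COMIT routine, and it omits the subsequent number-agreement test between subject and verb.

```python
# When a clause lacks a verb, examine the words of a noun phrase from
# the right-most word leftward for a verb homograph; the first word that
# has one is split off as the verb, and the remainder is kept as the
# subject.  Word records are (word, codes) pairs; codes are illustrative.
def rebracket(noun_phrase):
    for k in range(len(noun_phrase) - 1, -1, -1):
        word, codes = noun_phrase[k]
        if any(c.startswith("VZZ") for c in codes):   # verb homograph?
            subject = noun_phrase[:k]
            verb = noun_phrase[k:k + 1]
            return subject, verb
    return noun_phrase, []            # no alternative in this phrase

np = [("This", ["ATES"]), ("paper", ["NOUS", "VZZP"]),
      ("presents", ["NOUP", "VZZS"])]
subject, verb = rebracket(np)
print([w for w, _ in subject], [w for w, _ in verb])
```

As in the text, presents is found first because the scan begins at the right-most word, so the subject is redefined as This paper; a fuller version would go on to test number agreement and, on failure, continue the search leftward and then rightward through the clause.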
While multiple analyses are clearly useful in
linguistic research for exposing syntactic ambiguities, in a practical application such as mechanized
indexing they are an embarrassment of riches. Having all structural descriptions for each sentence
would be of little use since at the present time there
is no method for deciding which analyses are also
semantically acceptable and, further, which is the
one "correct" reading intended by the author. Perhaps one could hope to select instead the "syntactically most probable" parsing11,12 if adequate statistics of English grammatical structures were available. Since they are not, we have ordered the search
for alternatives according to what seem to us, intuitively, to be the most frequently occurring structures.
A useful by-product of the algorithm arises
from the fact that all the phrases of a sentence are
tentatively recognized in the first pass. Should the
analysis of any sentence be terminated because of
excessive time, or if no grammatically acceptable
parsing for some clause can be found, many substrings of the sentence will still be correctly identified.
Thus, in cases of noncatastrophic failure, we are
able to get partial results and go on to the next sentence.
Provision is also made for handling parenthetic
expressions (the clause well-formedness tests are
omitted) and clauses separated by semicolons
(treated as separate sentences). Sentences up to 100
words in length can be analyzed. However, some
very long sentences, depending on the particular
structures they contain, will occasionally require
more COMIT "workspace" than is currently available. In such cases, the program writes out the results of the dictionary lookup routine on a tape
which can later be used as input to the syntactic
analysis portion.
PROGRAM TESTING
The program was coded in COMIT5 because of
the ease it provides in the design and updating of
experimental models. Initial debugging and testing
were carried out on a sample of 70 consecutive sentences taken from abstracts in the IBM Journal of
Research and Development. This text, which we
will refer to as IBM #1, was chosen because it
contained a fairly wide range of technical subject
matter. The results were used to make further refinements to the grammar, and then the program
was tested on 5 more texts, each containing 70 sentences, from randomly selected abstracts. One text
was taken from another issue of the IBM Journal
(IBM #2) and the others from the fields of chemistry, physics, acoustics, and, for comparison with
the technical material, literary criticism. The accuracy with which phrases were identified* is indicated in Table 1.

*The method of counting phrases is in accord with our earlier remarks about "complete" parsing. Thus, the object in a prepositional phrase does not count as an additional nominal or pronominal phrase.
Table 1. Accuracy of Identifying Phrases.

Text                 No. of   NP         PP          Pn P    Inf P   VP        PMA     PMR     PMP      Av P     Totals      Percent
                     Words
IBM #1               1593     154/170    209/222     6/6     10/11   89/95     10/15   10/11   17/17    11/15    516/562     92
IBM #2               1867     170/182    248/267     11/12   12/13   100/110   12/19   6/7     20/20    18/19    597/649     92
Physics              1805     153/163    240/264     11/11   14/14   112/118   21/27   11/13   11/11    19/20    592/641     92
Chemistry            1716     160/174    214/241     12/12   16/16   106/114   18/20   14/15   16/17    14/15    570/624     91
Acoustics            1705     174/187    233/249     5/5     18/21   117/123   10/11   8/11    22/23    14/15    601/645     93
Literary criticism   2192     193/216    249/283     54/54   27/27   133/161   24/35   4/6     10/14    37/44    731/840     87
Total                10878    1004/1092  1393/1526   99/100  97/102  657/721   95/127  53/63   96/102   113/128  3607/3961   91
                              92%        91%         99%     95%     91%       75%     84%     94%      88%      91%
Technical
material only        8686     811/876    1144/1243   45/46   70/75   524/560   71/92   49/57   86/88    76/84    2876/3121   92
                              93%        92%         98%     93%     94%       77%     86%     98%      90%      92%

NP-noun phrase; PP-prepositional phrase; Pn P-pronoun phrase; Inf P-infinitive phrase; VP-verb phrase; PMA-post-modifying adjective; PMR-post-modifying present participle; PMP-post-modifying past participle; Av P-adverbial phrase.

*Total number of words includes those contained in parenthetic expressions. This accounts for the discrepancies between the total and that given for the acoustics text in Table 1.
A further test using 1,000 sentences
taken from Nuclear Science Abstracts gave similar
results.
It was of interest to compare the performance of
our program with that of one of the most complete
parsing programs available, the Harvard Syntactic
Analyzer (HSA).8 We obtained this program from
SHARE, modified it to produce only one analysis
(i.e., the first found) for each sentence, and tested
it on our first text, IBM #1. After homograph
codes were supplied for words not found in the
HSA dictionary, the program produced an analysis
for 59 of the 70 sentences. Seven were rejected as
ungrammatical, and four had not been analyzed after at least 5 minutes of running time on each by
the SYNTAX subroutine. One sentence was allowed
to run for 17 minutes without success. Table 2
shows the comparison of our program with the
modified HSA in identifying phrases in the 59 sentences which were analyzed by both.
Table 2. Comparison with Modified Harvard Syntactic Analyzer (59 Sentences from IBM #1; 1244 Total Words).

              NP        PP        Pn P   Inf P   VP      PMA    PMR    PMP     Av P   Totals
Our program   115/126   157/169   3/3    9/10    73/79   7/9    7/8    12/12   7/9    390/425
              91%       93%       100%   90%     92%     78%    88%    100%    78%    92%
HSA           104/126   147/169   3/3    9/10    73/79   9/9    5/8    5/12    6/9    361/425
              83%       87%       100%   90%     92%     100%   63%    42%     67%    85%

PROCESSING TIME

The core clock was used to measure the time of each phase of our program for every sentence analyzed. A program kindly supplied by Mr. K. L. Deckert of the IBM Systems Development Laboratory, San Jose, plotted the times against sentence length on a CALCOMP plotter and calculated the slopes of the resulting lines by the method of least squares. The summarized results appear in Table 3.

Table 3. Average Processing Times for Each Phase of Program.

Phase                                                                Time/word, sec
1. Input and dictionary lookup                                       0.072
2. Bracketing of phrases                                             0.047
3. Testing clause well-formedness and trying alternatives            0.019
4. Rebracketing coordinated structures and other minor corrections   0.024
5. Output                                                            0.017
TOTAL                                                                0.179

All the data appeared to be well represented by straight lines except for the second pass (testing for clause well-formedness and trying alternatives), which as expected displayed considerable scattering. The times for sentences containing parenthetic
expressions, because they are treated in a special
way, were also found to deviate markedly from the
line. Nonetheless, we found no clear indication that
the total time might be increasing exponentially
with length for sentences of the order of 50 to 75
words.
The total time for our program to analyze the 59
sentences of IBM #1 (Table 2) was about 4.5
minutes plus 4.2 minutes for compilation (run on
the IBM 7094 but adjusted to the IBM 7090 time).
The time for the modified Harvard Syntactic Analyzer was about 30 minutes (not measured precisely), which does not include the time for dictionary
lookup and updating. We have recently been informed by Dr. Kuno that the running time for the
Harvard program has now been reduced substantially.13 However, this version is not yet available to
us for testing.
DICTIONARY CODING
Since the computational dictionary is a fundamental part of our program, we were concerned with
its ability to assign words to word classes compared
to the assignments made by a complete dictionary.
Klein and Simmons6 reported that their computational dictionary could correctly assign unique
word-class codes to about 90 percent of the words
in their sample texts. However, this figure measures
the results of two operations: first, assigning each
word all its possible word-class codes and, second,
eliminating ambiguous codes by means of the context. Our dictionary performs only the first of these
functions, while word-class ambiguities are resolved in the syntactic recognition routine. Therefore, we counted as "correct" those words which
were identified by the computational dictionary or
which were not found and were indeed noun/verb
ambiguities. Table 4 gives the results for the acoustics text. Words coded arbitrarily received only the
noun/verb classification; the fraction of these
marked "incorrect" should have been assigned to
either noun only or verb only, or else should have
had additional codes attached as well. Similar data
were collected for all six experimental texts with an
earlier version of the dictionary. The percentage of
correct coding was somewhat lower (86 percent),
but it was nearly the same for each of the texts.
In order to determine the extent to which parsing
errors arose from inadequacies in the computational
Table 4. Accuracy of Dictionary Coding for Acoustics Text.

                                             Words   Percent
Found among common function words
  and exceptions to ending list              1001    58
Found in ending list
  Correct                                    440     25
  Incorrect                                  13      1
Coded arbitrarily (noun/verb)
  Correct                                    137     8
  Incorrect                                  143     8
Total coded correctly*                       1578    91
Total coded incorrectly                      156     9
dictionary, we reran two of the texts with corrections to dictionary coding supplied by hand. The
overall accuracy in identifying phrases increased
from 91 to 93 percent for the chemistry text and
from 87 to 90 percent for literary criticism. Thus,
using a perfect dictionary with the present grammar
and algorithm seems to improve the accuracy by
about 2 to 3 percent.
DISCUSSION
The principal result of our work thus far has
been to show that the approach to parsing, which
we adopted for purely practical reasons, nonetheless
succeeds as well in identifying phrases as at least
one other more sophisticated routine. We were
frankly surprised at this result. Because our program was to operate with many handicaps (a minimal dictionary, a simple grammar, and severe time and space constraints on the whole program) we did
not suppose that it would be able to perform so
well.
We must emphasize that the comparison with the
HSA (1963 version) should be accepted with reservations. The HSA, as previously noted, provides a
more complete syntactic description of each sentence than does our program, and therefore the running times are not directly comparable. Also, the
sample for comparison, only 59 sentences, is rather
small. One might also argue that selecting only the
HSA's first analysis from each sentence may have
produced a bias, but there seems to be no reasonable alternative to this choice, and in fact there is
some reason to believe that the first analysis (rather
than the second, the last, etc.) has a greater probability of being the "correct" syntactic analysis than
does any other.11
We believe that two factors are chiefly responsible for the degree of success which our program has
so far achieved. First, we have made some fortunate
guesses about the probability of occurrence of syntactic constructions, at least for the kind of technical writing we have investigated. (Note that the accuracy for the literary criticism text was somewhat
lower than for the others despite the fact that the
dictionary coded about the same percentage of
words correctly in this text.) The second factor is
the strategy of searching out alternatives only to
remedy a particular syntactic ill-formation. This
technique allows most of the previously made "probable" choices to be left intact whenever an error is corrected.
Three kinds of errors were frequently made by our program: (1) incorrect bracketing of coordinated structures around and or or; (2) unresolved noun/verb ambiguity; and (3) incorrect assignment of words with suffix -ing, which may be adjectival (pre- and post-modifying), verbal, or gerundive. It is interesting to note that the pattern of errors made by
the HSA differs considerably. A frequent mistake
was also the noun/verb confusion, but it usually
arose from erroneously finding relative clauses beginning with an elliptical which or that. For example, a sentence beginning [The maximum signal] [has] ... was analyzed as if it had been [The maximum]
(which) [signal] [has] ... , with a plural noun occurring later in the sentence being called the predicate
verb for the subject maximum.
The current version of our program occupies
about 29,000 registers of an IBM 7094, thus allowing about 4,000 registers for COMIT "workspace"
during analysis. Therefore, no substantial additions
to the dictionary, grammar, or algorithm are possible while maintaining the program's present design
on the IBM 7094. Some slight improvements in
performance can undoubtedly be made at the expense of much more investigation into grammatical
refinements. This would undoubtedly lead to an increase in running time and storage space required.
We do not know what the limits of accuracy are for
our approach, but we estimate that less than half
the errors are in theory correctable (by an expanded
grammar); the remainder are genuine syntactic ambiguities which presumably can only be resolved by
extrasyntactic information. If this estimate holds, it
means that an accuracy of about 94 percent in
identifying phrases is theoretically attainable.
The point of balance between cost and accuracy
will depend on the particular application envisioned
for the program. Despite the encouraging results
thus far, we cannot claim that our program will
guarantee the feasibility of a mechanized indexing
system. It is clear that more will be required for automatic indexing than an identification of phrases
and clauses. For example, it may be necessary to
specify some interphrase dependencies, and a means
for the deletion of items deemed nonsignificant will
almost certainly be necessary. Also, some syntactic
transformations to convert the material into a format suitable for searching (whether by machine or
human) will probably be essential. Nonetheless, the
prospects for using at least a limited syntactic analysis program in automatic indexing on a large scale
now seem much more hopeful.
ACKNOWLEDGMENT
The authors wish to express their appreciation to
Mr. J. Bennett and Miss P. Baxendale for substantial contributions to this work throughout its development. Mr. Bennett assisted materially in the programming and made several modifications to the
COMIT system which greatly expedited the debugging.
REFERENCES
1. W. D. Climenson, H. H. Hardwick and S. N.
Jacobson, "Automatic Syntax Analysis in Machine
Indexing and Abstracting," American Documentation, vol. 12, pp. 178-183 (1961).
2. G. Salton (principal investigator), Report
No. ISR-7 to the National Science Foundation,
Harvard Computation Laboratory, Cambridge,
Mass. (1964).
3. D. G. Bobrow, "Syntactic Analysis of English by Computer-A Survey," Proceedings of the
Fall Joint Computer Conference, 1963, p. 365.
4. P. B. Baxendale, "An Empirical Model for
Machine Indexing," Machine Indexing Progress and
Problems, papers presented at the Third Institute on
Information Storage and Retrieval, American University, 1961, p. 207.
5. V. H. Yngve, "COMIT As an IR Language," Communications of the ACM, Jan. 1962, pp. 19-28.
6. S. Klein and R. F. Simmons, Journal of the
ACM, 1963, p. 334.
7. A. F. Brown (ed.), Normal and Reverse
English Word List, Univ. of Pennsylvania, Philadelphia, 1963.
8. A. G. Oettinger (principal investigator),
Report No. NSF-8 to the National Science Foundation, Harvard Computation Laboratory, Cambridge, Mass. (Jan. 1963).
9. J. J. Robinson, "Preliminary Codes and Rules for the Automatic Parsing of English," Memorandum RM-3339-PR, RAND Corp., Santa Monica, Calif. (Dec. 1962).
10. A. G. Oettinger, "Automatic Syntactic Analysis and the Pushdown Store," Proceedings of the
Twelfth Symposium in Applied Mathematics,
American Mathematical Society, 1961, p. 104.
11. R. E. Wall, "Probabilistic Approach to Ordering Multiple Analyses of Sentences," in Report
No. NSF-13, Harvard Computation Laboratory,
Cambridge, Mass. (Mar. 1964).
12. K. C. Knowlton, "Sentence Parsing with a
Self-Organizing Heuristic Program," Ph.D. thesis,
Massachusetts Institute of Technology, Sept. 1962.
13. S. Kuno, "The Predictive Analyzer and a
Path Elimination Technique," Communications of
the ACM, July 1965, pp. 453-462.
THE MITRE SYNTACTIC ANALYSIS PROCEDURE FOR
TRANSFORMATIONAL GRAMMARS*
Arnold M. Zwicky,† Joyce Friedman,†
Barbara C. Hall,† and Donald E. Walker
The MITRE Corporation
Bedford, Massachusetts
INTRODUCTION
The Problem of Syntactic Analysis
A solution to the analysis problem for a class of
grammars appropriate to the description of natural
languages is essential to any system which involves
the automatic processing of natural language inputs
for purposes of man-machine communication, translation, information retrieval, or data processing. The
analysis procedure for transformational· grammars
described in this paper was developed to explore the
feasibility of using ordinary English as a computer
control language.
Given a grammar* G which generates a language
L(G), we can define the recognition problem for G
as the problem of determining for an arbitrary string
x of symbols, whether or not x ∈ L(G). The more
difficult problem of syntactic analysis is to find, given
any string x, all the structures of x with respect to G.
The syntactic analysis problem varies with the class
of grammars considered, since both the formal properties of the languages generated and the definition of
structure depend on the form of the grammar.
A context-free (CF) phrase-structure grammar is a rewriting system in which all rules are of the form A → φ, where φ is a non-null string and A is a single symbol; in context-sensitive (CS) phrase-structure grammars all rules are of the form ψ₁ A ψ₂ → ψ₁ φ ψ₂, where A and φ are as before and ψ₁ and ψ₂ are strings (possibly null) of terminal and/or nonterminal symbols. A derivation in a phrase-structure grammar is
represented by a tree in which the terminal elements
constitute the derived string. In a transformational
grammar there is, in addition to a phrase-structure
*The research reported in this paper was sponsored by the
Electronic Systems Division, Air Force Systems Command,
under Contract AF19 (628) 2390. The work was begun in
the summer of 1964 by a group consisting of J. Bruce Fraser, Michael L. Geis, Hall, Stephen Isard, Jacqueline W.
Mintz, P. Stanley Peters, Jr., and Zwicky. The work has
been continued by Friedman, Hall, and Zwicky, with computer implementations by Friedman and Edward C. Haines.
Walker has directed the project throughout. The grammar
and procedure are described in full detail in reference 1. This
paper is also available as ESD-TR-65-127.
†Present addresses of Zwicky, Friedman and Hall are, respectively: Department of Linguistics, University of Illinois,
Urbana; Computer Science Department, Stanford University,
Stanford, Calif.; Department of Linguistics, University of
California, Los Angeles.
*The linguistic concepts on which this work is based are
due to Noam Chomsky; see, for example, references 2-6.
component, a set of transformational rules which operates upon trees from the phrase-structure component to produce trees representing sentences in their
final form.
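A toy CF rewriting system may make these definitions concrete. The grammar below is invented for illustration and is not drawn from the MITRE grammar; each step rewrites the leftmost nonterminal using the first applicable subrule, so the terminal elements of the finished derivation constitute the derived string.

```python
# Each rule A -> phi replaces a single nonterminal A with a non-null
# string phi (here only the first subrule of each rule is used, for a
# deterministic derivation).
RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"]],
    "N":  [["string"], ["grammar"]],
    "V":  [["generates"]],
}

def derive(symbols):
    for i, sym in enumerate(symbols):
        if sym in RULES:                        # leftmost nonterminal
            expansion = RULES[sym][0]           # apply first subrule
            return derive(symbols[:i] + expansion + symbols[i + 1:])
    return symbols                              # all terminal: done

print(" ".join(derive(["S"])))
```

A CS rule would differ only in that the rewriting of A could be made conditional on the strings ψ₁ and ψ₂ flanking it, and a transformational component would then operate on the tree of such a derivation rather than on the string.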
For both types of phrase-structure grammars the
syntactic analysis problem is known to be solvable.
In the case of CF grammars, a number of general
recognition and syntactic analysis procedures have
been developed and programmed. Several syntactic analysis algorithms for CS grammars are given by Griffiths (1964).8
The case of transformational grammars is complicated by the fact that they do not correspond exactly to any well-studied class of automata. In
fact, a number of decisions crucial to their formalization have yet to be made. This situation makes it
impossible to describe a general recognition procedure for transformational grammars without explicit
conventions about the form and operation of transformational rules. Since there is no widespread
agreement as to the most desirable conventions, it
is likely that different people working on the analysis problem for transformational grammars are actually working on quite different problems. A solution to the problem with one set of conventions will
not necessarily be a solution to the problem with a
different set of conventions. Furthermore, the solution in one case would not necessarily imply the
existence of a solution in another.
The area of formal properties of transformational
grammars needs more study; the results of this attempt to solve the syntactic analysis problem for a
particular one may help in determining the further
restrictions needed on the form of transformational
rules.
Phrase-structure Component
A CS phrase-structure rule ψ₁ A ψ₂ → ψ₁ φ ψ₂ is written in the form A → φ / ψ₁ - ψ₂. ψ₁ or ψ₂ or both may be null. The rules are ordered; consecutive rules expanding the same symbol in the same context are considered to be subrules of a single rule. A rule in this sense is thus an instruction to choose any one of the specified expansions.
The initial symbol of the grammar is SS, and the first phrase-structure rule is SS → # S #.† Further instances of SS and S are introduced by later rules.
These instances are expanded during a succeeding
pass through the phrase-structure rules during which
new instances may be introduced, etc. The result is
a tree that may contain sentence-within-sentence
structures of arbitrary depth. This version of the
phrase-structure component differs somewhat from
the more usual versions, but is similar to the version
presented in Chomsky.3
We shall use the following tree terminology: x is a daughter of y, x (not necessarily immediately) dominates y, x is the (immediate) right (left) sister of y, x is terminal, and the sequence x₁, x₂, . . . , xₙ is a (proper) analysis of x. We shall also refer to the (sub)tree headed by x. These terms are all either standard or self-explanatory.
Transformational Component
THE MITRE GRAMMAR
In order to develop an analysis procedure it was
necessary to fix on a particular set of conventions
for transformational grammar. Many of these conventions agree essentially with the more or less
standard conventions in the literature; points on
which general agreement has not been reached will
be noted.
The grammar contains two principal components: a CS phrase-structure component and a transformational component.* The rules in the phrase-structure component serve to generate a set of basic trees that are then operated upon by the rules of the transformational component to produce a set of surface trees.
*There is a third component, the lexicon, which will not
be discussed in detail here.
Form of the Rules. A transformational rule specifies a modification of a tree headed by the node SS
or S. Every such rule has two main parts, a description statement and an operation statement.
The description statement sets forth general conditions that must be satisfied by a given tree. If
these conditions are not met, then the rule cannot
be applied to the tree. The conditions embodied in
a description statement are conditions on analyses of
sentences (trees headed by SS, # S #, or S*); a description statement is to be interpreted as requiring that the given tree have at least one analysis out of a set of analyses specified by the description statement.

†The first phrase-structure rule in the MITRE grammar differs from this rule by allowing for the conjunction of any number of sentences. SS may then dominate a sequence of conjoined sentences. S, on the other hand, never immediately dominates such a sequence.
If a tree satisfies the conditions embodied in a
description statement, then the operations apply to
the subtrees headed by the nodes in the analysis. The
operation statement lists the changes to be made: the deletions, substitutions, and movements (adjunctions) of subtrees.
In addition to a description statement and an operation statement, a transformational rule may involve a number of restrictions. A restriction is an
extra condition on the subtrees. The extra condition
is either one of equality (one subtree must be identical to another) or of dominance (the subtree must
contain a certain node, or must have a certain analysis, or must be a terminal node). Boolean combinations of restrictions are permitted.
The form of a transformational rule can be illustrated by the following example:
TWH2
(#) (Q) ($NIL NG) (AUXA) ($SKIP NP AP $RES 19)
 1   2      3        4             5
(5) ADLES 4
ERASE 5
$RES 19: dom WH
The description statement of this rule (TWH2)
consists of five numbered and parenthesized description segments. Each segment specifies one part
of an analysis. When several grammatical symbols
(symbols not beginning with $) are mentioned in a
segment, the interpretation of the segment is that
the corresponding part of the analysis must be a
subtree headed by one of these symbols. When
$NIL is mentioned in a segment, the interpretation is that the corresponding part of the analysis is optional; that is, the corresponding part may be a null subtree. If, however, some analysis can be found in which the corresponding part is not null, that analysis must be chosen. The occurrence of $SKIP in a segment is equivalent to a variable between that segment and the preceding one.* $RES must be followed by the number of the restriction to which it refers. There is an implicit variable at the end (but not at the beginning) of every description statement.
*This distinction is not important for our discussion here. See the discussion in reference 1.

*$SKIP and $NIL may not both be used in a single segment.

In a more informal and traditional notation, the description statement of TWH2 would be written as
# + Q + (NG) - AUXA - X - {NP, AP} - Y + #
1   2    3      4              5
In our system there is no way of referring to a sequence of subtrees as a single part of an analysis, although there is in the more informal notation.
In outline, the routine that searches through a tree for an analysis that conforms to a given description statement searches from left to right through the tree, attempting (in the case of a segment containing $NIL) to find a real node before assuming that a segment is null, attempting always (in the case of a segment containing $SKIP) to "skip" the smallest possible number of nodes, and checking (in the case of a segment containing $RES n) to see if a restriction is satisfied as soon as a node to which the restriction applies is found.
In case one part of the search fails, either because
the required nodes cannot be found or because a
restriction is not satisfied, the routine backs up to
the most recent point at which there remains an alternative (e.g., the alternative of searching for NP
or for AP in the fifth segment of TWH2). As each
part of the analysis is found, the appropriate subtrees are marked with numbers corresponding to the
numbers on the description segments. The tree then
undergoes the modifications specified in the operation statement.
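The search just described can be pictured as a small backtracking matcher. The sketch below is illustrative only: it works on a flat sequence of node labels rather than a real tree, omits restrictions, and the function names are invented rather than taken from the paper.

```python
# Simplified sketch of the description-statement search: a $NIL segment is
# assumed null only when no real node matches, and a $SKIP segment skips the
# fewest nodes possible, backing up through alternatives on failure.

def match(segments, nodes, i=0, picked=None):
    """Return one analysis (list of (segment, node_index) pairs) or None."""
    if picked is None:
        picked = []
    if not segments:
        return picked                      # implicit variable at the end
    seg, rest = segments[0], segments[1:]
    labels = [s for s in seg if not s.startswith("$")]
    if "$SKIP" in seg:                     # skip the fewest nodes possible
        for j in range(i, len(nodes)):
            if nodes[j] in labels:
                result = match(rest, nodes, j + 1, picked + [(seg, j)])
                if result is not None:
                    return result
        return None
    if i < len(nodes) and nodes[i] in labels:   # prefer a real node ...
        result = match(rest, nodes, i + 1, picked + [(seg, i)])
        if result is not None:
            return result
    if "$NIL" in seg:                      # ... before assuming the segment null
        return match(rest, nodes, i, picked)
    return None                            # failure: caller backs up

analysis = match([["#"], ["Q"], ["$NIL", "NG"], ["AUXA"], ["$SKIP", "NP", "AP"]],
                 ["#", "Q", "AUXA", "VB", "NP", "#"])
```

Here the $NIL segment comes out null (no NG is present) and the $SKIP segment passes over VB to reach NP, exactly the preferences described above.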
The operation statement of TWH2 consists of an
(ordered) list of two instructions. There are three
types of instructions: the adjunction instructions, the
substitution instruction, and the erasure instruction.
The adjunction instructions are of the form (φ) AD n, where φ is a sequence containing numerals (referring to the marked subtrees) or particular grammatical symbols or both, where AD is one of the four adjunction operations: ADLES (add as left sister), ADRIS (add as right sister), ADFID (add as first daughter), or ADLAD (add as last daughter), and where n is a numeral referring to a marked
subtree. The instruction (5) ADLES 4 specifies the
adjunction of a copy of the subtree marked 5 as the
left sister of the node heading the subtree marked 4.
Substitution instructions are of the form (φ) SUB n, where φ and n are as before. When such an instruction is applied, copies of the elements of φ replace the
subtree marked n, and this subtree is automatically
erased. *
Erasure instructions are of the form ERASE n
(erase the subtree marked n and any chain of nonbranching nodes immediately above this subtree) or
ERASE 0 (erase the entire tree). The ERASE 0
instruction permits us to use the transformational
component as a "filter" that rejects structures from
which no acceptable sentence can be derived.
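The adjunction and erasure instructions can be illustrated on a toy tree type. The class and method names below are invented for illustration; only ADLES, ADLAD, and ERASE are shown, and, as in the text, a copy of the subtree is adjoined and erasure removes any chain of non-branching nodes above the erased node.

```python
import copy

class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []
        self.parent = None
        for c in self.children:
            c.parent = self

    def _attach(self, child, index):
        child = copy.deepcopy(child)          # a copy of the subtree is adjoined
        child.parent = self
        self.children.insert(index, child)

    def adles(self, new):                     # ADLES: add as left sister
        self.parent._attach(new, self.parent.children.index(self))

    def adlad(self, new):                     # ADLAD: add as last daughter
        self._attach(new, len(self.children))

    def erase(self):                          # ERASE: remove the subtree and
        node = self                           # any non-branching chain above it
        while node.parent and len(node.parent.children) == 1:
            node = node.parent
        node.parent.children.remove(node)

root = Node("S", [Node("NP"), Node("AUX", [Node("WH")])])
root.children[1].children[0].erase()          # erasing WH also removes AUX,
                                              # its non-branching parent
np = root.children[0]
np.adles(Node("DET"))                         # DET becomes NP's left sister
np.adlad(Node("N"))                           # N becomes NP's last daughter
```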
Derivations. The transformational rules are distinguished as being obligatory or optional, cyclical or noncyclical, and singulary or embedding. The obligatory/optional distinction requires no special comment here.
A rule is cyclical if it can be applied more than
once before the next rule is applied. A rule may be
marked as cyclical either (a) because it can be applicable in more than one position in a given sentence (say, in both the subject and object noun
phrases), or (b) because it can apply once to yield
an output structure and then apply again to this
output. Otherwise, the rule is marked as noncyclical. In the present grammar case (b) does not occur.
Singulary rules are distinguished from embedding
rules on the basis of the conditions placed upon the
tree search. In the case of a singulary rule the
search cannot continue "into a nested sentence"; that is, beneath an instance of SS or S within the
sentence being examined; the search may, of course,
pass over a nested sentence. In the case of an
embedding rule the search can continue into a nested sentence, but not into a sentence nested in a
nested sentence. Singulary rules operate "on one level," embedding rules "between one level and the
next level below."
The transformational rules of our grammar are
grouped into three sets: a set of initial singularies, a set of embeddings with related singularies, and a set of final singularies.† The rules are linearly ordered
within each set.
The initial singularies operate on the output of the phrase-structure component; they can be considered as applying, in order, to all subtrees simultaneously, since these rules do nothing to disturb the
sentence-within-sentence nesting in a tree. There
are numerous ways to order the application of these
rules with respect to the nesting structure of a tree,
and they are all equivalent in output.
The embeddings and related singularies operate
on the output of the initial singularies. These rules
require a rather elaborate ordering. Let us define a
lowest sentence as an instance of # S # in which S
does not dominate #, and a next-to-lowest sentence as an instance of # S # in which S dominates at least one lowest sentence and no instances of # S # that are
not lowest sentences. At the beginning of the first
pass through the embeddings and related singularies,
all lowest sentences are marked. The rules will be
applied, in order, to the marked subtrees. At the beginning of each subsequent pass, all next-to-Iowest
sentences will be marked, and the rules will again be
applied, in order, to all marked subtrees. Characteristically, the embedding rules, when applied during
these later passes, erase boundary symbols and thus
create new next-to-lowest sentences for the following
pass. However, only those subtrees marked at the
beginning of a pass can be operated upon during
the pass. The process continues until some pass
(after the first) in which no embedding rules have
been applied.
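The pass schedule just described (lowest sentences on the first pass, next-to-lowest on the second, and so on) can be sketched as follows, with sentences represented as nested lists. The representation and function name are invented for illustration, and the sketch ignores the creation of new sentences by boundary erasure during later passes.

```python
def mark_passes(tree, marks):
    """Return the pass on which `tree` is processed; fill `marks` for every
    sentence node inside it.  Non-sentence material counts as depth 0."""
    if not (isinstance(tree, list) and tree and tree[0] == "S"):
        return 0
    deepest = 0
    for child in tree[1:]:
        deepest = max(deepest, mark_passes(child, marks))
    marks[id(tree)] = deepest + 1        # one pass after everything inside it
    return marks[id(tree)]

lowest = ["S", "NP", "VP"]               # dominates no sentence: pass 1
middle = ["S", "NP", lowest]             # next-to-lowest: pass 2
top = ["S", middle, "VP"]                # pass 3
marks = {}
mark_passes(top, marks)
```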
The final singularies operate on the output of the
embeddings and related singularies. They can be
considered as applying, in order, to all subtrees
simultaneously.
A tree that results from the application of all applicable transformational rules is a surface tree.
Each surface tree is associated with one of the sentences generated by the grammar.
Dimensions
The MITRE Grammar generates sentences with a
wide variety of constructions-among them, passives,
negatives, comparatives, there-sentences, relative
clauses, yes-no questions, and WH-questions. The
dimensions of the grammar (excluding all rules concerned with conjunction) are as follows:
Phrase-Structure Component:
    75 rules
    approximately 275 subrules

Transformational Component:
    13 initial singularies
    26 embeddings and related singularies, including 9 embeddings
    15 final singularies
    54 rules

*If $NIL is chosen in the nth description segment, then (φ) AD n or (φ) SUB n is vacuous. Null terms in φ are ignored; if all of φ is null the instruction is vacuous.

†There is also a fourth set, conjunction rules. Because the treatment of conjunction in the "English Preprocessor Manual"1 is currently being revised, conjunction has been omitted from this presentation.
THE MITRE ANALYSIS PROCEDURE
The MITRE analysis procedure takes as input an
English sentence and yields as output the set of all
basic trees underlying that sentence in the MITRE
grammar. If the procedure yields no basic tree, the
input sentence is not one generated by the grammar.
If the procedure yields more than one basic tree,
the input sentence is structurally ambiguous with
respect to the grammar.
There are five parts to the procedure: lexical look-up, recognition by the surface grammar, reversal of transformational rules, checking of presumable basic trees, and checking by synthesis. These
parts are described in detail in the following sections.
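The five parts can be pictured as a pipeline in which each stage maps a set of candidate structures to a new set, and a candidate is rejected when a stage returns nothing for it. This is only a schematic sketch; the stage bodies below are placeholders, not the actual routines.

```python
def analyze(sentence, stages):
    """Run candidates through each stage in turn; a stage maps one candidate
    to a list of candidates, and an empty list rejects that candidate."""
    candidates = [sentence]
    for stage in stages:
        candidates = [out for c in candidates for out in stage(c)]
    return candidates

def stage(name):                  # placeholder stage: just tags the candidate
    return lambda c: [c + [name]]

stages = [stage("lexical look-up"),
          stage("surface-grammar recognition"),
          stage("transformational reversal"),
          stage("basic-tree check"),
          stage("checking by synthesis")]
result = analyze([], stages)
```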
Lexical Look-up
The first step of the process is the mapping of
the input string into a set of pre-trees, which are
strings of subtrees containing both lexical and
grammatical items. The pre-trees are obtained
from the input string by the substitution of lexical
entries for each word.
A lexical entry for a word may be identical to the word (in the case of grammatical items like A and THE). More often, a lexical entry for a word indicates a representation of the word in terms of more abstract elements (NEG ANY for NONE), a category assignment for the word (ADJ dominating GREEN, for GREEN), or a combination of abstract representation and category assignment (PRES SG VTR dominating OPEN, for OPENS). A word may have several lexical entries.
The number of pre-trees associated with an input
string is then the product of the numbers of lexical
entries for the words in the string. Thus, the string
# CAN THE AIRPLANE FLY # has 15 associated
pre-trees, which can be schematically represented as:
[Figure: the 15 pre-trees, pairing the lexical entries for CAN (modal, transitive verb, or noun) and FLY (intransitive verb, transitive verb, or noun) with the entries for THE and AIRPLANE.]
Of these 15 pre-trees, only
# (PRES SG M dominating CAN) THE (NCT SG dominating AIRPLANE) (VINT dominating FLY) #
is a correct assignment of lexical entries to the words
in the input string. *
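The count of pre-trees is simply the product of the numbers of lexical entries for the words. A sketch, with a guessed entry inventory (five entries for CAN and three for FLY is one way to obtain the product 15; the actual MITRE lexicon is not given in the paper):

```python
from itertools import product

# Hypothetical entry inventory, illustrative only.
lexicon = {
    "CAN": ["PRES SG M", "PRES PL M", "PRES SG VTR", "PRES PL VTR", "NCT SG"],
    "THE": ["THE"],
    "AIRPLANE": ["NCT SG"],
    "FLY": ["VINT", "VTR", "NCT SG"],
}
words = ["CAN", "THE", "AIRPLANE", "FLY"]
pre_trees = list(product(*(lexicon[w] for w in words)))   # 5 * 1 * 1 * 3 = 15
```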
Recognition by the Surface Grammar
The surface grammar is an ordered CF phrase-structure grammar containing every expansion
which can occur in a surface tree. Unavoidably, the
surface grammar generates some trees which are not
correct surface trees, even though the corresponding
terminal string may be a sentence obtainable by the
grammar with some other structure.
In the second step of the analysis procedure the
surface grammar is used to construct from each
pre-tree a set of presumable surface trees associated with the input string. Since the surface grammar
is context-free, and context-free parsing algorithms
are known to exist, no details will be given here for
this step of the analysis.
In the course of recognition by the surface grammar, some pre-trees may be rejected. For example,
9 of the 15 pre-trees in the previous section are
rejected in this way. From other pre-trees one or
more presumable surface trees will be constructed.
The remaining steps of the analysis procedure are
designed to determine, for each presumable surface
tree, whether or not the tree is in fact a surface tree
for the input sentence.
*Since the MITRE grammar generates neither imperatives
nor noun-noun compounds, the interpretation of CAN THE
AIRPLANE FLY as analogous to CORRAL THE SADDLE
HORSE is excluded.
Reversal of Transformational Rules
The next step in the analysis procedure reverses
the effect of all transformational rules that might
have been applied in the generation of the given
presumable surface tree.
The "undoing" of the forward rules is achieved
by rules that are very much like the forward rules in
their form and interpretation. The discussion under
Form of the Rules, above, applies to reversal rules
as well as to forward rules, with the following additions:
(a) There is a new adjunction instruction, ADRIA (add as right aunt; that is, add as right sister of the parent).

(b) Adjunction and substitution instructions have been generalized so that entire subtrees, as well as numerals and single grammatical symbols, may appear in the sequence φ, permitting instructions like (A dominating B C) ADRIS n and (1 B C) SUB n. Such instructions are used to restore entire subtrees deleted by forward rules.

(c) In the reversal of optional forward rules, a marker OPTN is added as a daughter of a specified node, which in every case is either terminal or else has only a lexical expansion. Some such device is required if the result of the final synthesis step is to correspond to the original input string. The constraint on the placement of OPTN insures that the marker will not interfere with the operation of other reversal rules.

As with forward rules, reversal rules are either cyclical or noncyclical, and either singulary or embedding. All reversal rules are obligatory.*

The reversal rules are grouped together in the same way as the forward rules, and the order of their application within each group is essentially the opposite of the order of the corresponding forward rules. In many cases, one reversal rule undoes one forward rule. There are three types of exceptions, however: (a) several reversal rules may be required to attain the effect of undoing a single forward rule; (b) for some rules, notably the rules with ERASE 0 instructions, no reversal is needed; (c) in some cases the reversing of several forward rules can be combined in whole or in part into a single reversal rule.

Reversed final singularies are first applied to all subtrees. Then reversed embeddings and related singularies are applied in several passes. The first pass deals with the highest sentences in the tree. Later passes move downward through the tree, one level at a time. New lower sentences are created when boundary symbols are inserted during the reversal of embedding transformations; in general, a sentence created on one pass is dealt with on the next. Finally, reversed initial singularies are applied everywhere.

The effect of transformational reversal is to map each presumable surface tree into a presumable basic tree.†

*Optional reversal rules are required when two distinct basic trees are mapped into identical surface trees by the application of forward rules. No such example occurs in the present MITRE grammar.

†Distinct presumable surface trees may be mapped into identical presumable basic trees; the resultants of distinct presumable surface trees will continue to be processed separately, however.

Checking of Presumable Basic Trees

In the next step of the analysis procedure, each presumable basic tree is checked against the phrase-structure component of the (forward) grammar. The check determines whether or not the presumable basic tree can in fact be generated by the phrase-structure component; if it cannot, it is discarded.

Checking by Synthesis. It is possible that transformational reversal and phrase-structure checking could map a presumable surface tree T1 into a basic tree T2 that is not the basic tree underlying T1. For example, the reversal rules map at least one presumable surface tree associated with THOSE PIG IS HUGE into a basic tree underlying THAT PIG IS HUGE. Even under the assumption that input sentences are grammatical, the possibility remains. For example, the reversal rules map at least one presumable surface tree associated with THE TRUCK HAS A SURFACE THAT WATER RUSTS into a basic tree underlying THE TRUCK HAS A SURFACE THAT RUSTS. Similarly, they map at least one presumable surface tree associated with THEY CAN FISH into a basic tree underlying THEY CAN A FISH.

Revision of the present reversal rules and the introduction of rejection rules into the transformational reversal step might make a synthesis step
unnecessary. However, the above examples demonstrate that with the present rules this step is essential.
In the synthesis step, the full set of forward transformational rules is applied to each basic tree that
survives the previous checking step. Each optional
rule becomes obligatory, with the presence of the
marker OPTN (in the appropriate position) as an
added condition on its applicability.
The synthesis step maps a basic tree T2, derived from a presumable surface tree T1, into a surface tree T3. If T1 and T3 are not identical, then T2 is discarded as a possible source for the input string. If T1 and T3 are identical, then T2 is a basic tree underlying T1 (and hence, underlying the input string).
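The synthesis check amounts to a round trip: reverse T1 to T2, re-derive T3 from T2, and keep T2 only when T3 matches T1. A toy sketch, with dictionaries standing in for the rule packages and a non-invertible reversal mimicking the THOSE PIG IS HUGE example; the basic-tree names B1 and B2 are invented:

```python
def check_by_synthesis(t1, reverse_rules, forward_rules):
    """Keep the basic tree T2 only if re-deriving from it reproduces T1."""
    t2 = reverse_rules(t1)
    t3 = forward_rules(t2)
    return t2 if t3 == t1 else None

# Toy stand-ins for the rule packages.
reverse = {"they can fish": "B1", "those pig is huge": "B2"}.get
forward = {"B1": "they can fish", "B2": "that pig is huge"}.get

kept = check_by_synthesis("they can fish", reverse, forward)      # B1 survives
rejected = check_by_synthesis("those pig is huge", reverse, forward)
```

The second call fails the comparison (the re-derived sentence is THAT PIG IS HUGE, not the input), so the spurious basic tree is discarded.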
Dimensions

The dimensions of the additional components of the analysis procedure are as follows:

Surface Grammar:
    49 rules
    approximately 550 subrules

Reversal Rules:
    30 final singularies
    92 embeddings and related singularies
    12 initial singularies
    134 rules

AREAS FOR FURTHER INVESTIGATION

We are investigating a number of problems both in the grammar and in the analysis procedure, with the objectives of making the grammar more adequate and the procedure more efficient.

Among the grammatical problems are the use of syntactic features (see Chomsky3) and the addition of further rejection rules in the transformational component. The treatment of conjunction is being revised. Other topics requiring investigation include adverbial clauses, superlatives, verbal complements, imperatives, and nominalizations.

We are examining a number of ways to improve the efficiency of the analysis procedure. If the input vocabulary is to be of an appreciable size, an efficient and sophisticated lexical look-up routine will be required. We are using computer experiments to determine the extent to which the use of a CS surface grammar, either as the basis of a CS parsing routine or as a check on the results of CF parsing, would improve the procedure by eliminating some incorrect surface trees at an early stage.

Some increase in the efficiency of the reversal step might be achieved by making use of a preprogrammed path through the reversal rules, or by using information that certain surface grammar rules signal the applicability or inapplicability of certain reversal rules. Similarly, the efficiency of the final synthesis step might be improved by making use of a preprogrammed path through the forward transformational rules, or by using information that certain reversal rules have been applied.

Analysis by Synthesis

The first analysis procedure proposed for transformational grammars was the "analysis by synthesis" model of Matthews.9 Basically this procedure involves generating sentences until one is found which matches the input sentence; the steps used in the generation provide the structural description. No attempt to program the analysis-by-synthesis procedure for transformational grammars has been reported in the literature. In its raw form this procedure would take an astronomically long time. One way to refine the procedure would be to use a "preliminary analysis" of some sort, which would have to be extensive to make any appreciable change in efficiency. As a result, there may be no sharp boundary between refined analysis-by-synthesis and direct analysis with a final checking-by-synthesis step. In the case of the MITRE procedure the final synthesis step plays a relatively minor role in the total procedure.
Petrick's Procedure
S. R. Petrick10 has proposed and programmed a
general solution to the analysis problem which is
similar in many respects to the MITRE procedure.
One of the main differences between his approach
and ours is that he alternates the use of reversal
rules and phrase-structure rules, while we use first
the phrase-structure rules of the surface grammar
and then the reversal rules. Furthermore, while Petrick's reversal rules are all optional, ours are all
obligatory. It follows that although we may have a larger number of structures to consider at the beginning of the reversal step, this number does not increase as it does at every step in Petrick's procedure.
At the present time the procedures differ in generality, for Petrick has shown that there are algorithms for the construction of his surface grammar
and reversal rules. In the case of the MITRE procedure, the question of the existence of comparable
algorithms has not yet been resolved.
Kuno's Procedure
Another approach to the analysis problem has
been proposed in Kuno.11 Kuno attempts to find
basic trees, without using reversal rules, by constructing a context-free surface grammar and associating with each of its rules information about the
form of the basic tree.
Kuno reported that an experimental program for
this system had been written and was being tested
on a small grammar. At that time it was not known
whether an algorithm for constructing the required
phrase-structure grammar existed.
COMPUTER TESTS
To test the grammar and the procedure a set of
FORTRAN subroutines (called SYNN), designed
to be combined in several different programs, has
been written. In one order, the subroutines carry
out the procedure from the stage at which presumable surface trees have been obtained, through the
base tree, to the final step of comparison of the derived surface tree with the given presumable surface
tree. In other orders they can, for example, convert
base trees to surface trees and back, or check surface trees against context-sensitive grammars.
We describe first the subroutines, in groups corresponding to the major components of the MITRE procedure, then some of the programs and the results of running the programs on a subset of the
grammar.
Subroutines
Because the primary operations are operations on
trees, the main subroutines of the SYNN package
analyze and manipulate trees. Three of the subroutines treat trees without reference to the grammar:
CONTRE reads in a tree and converts it to the internal format, TRCPY stores a copy of a tree for
later comparison, and TREQ compares two trees to
see if they are identical.
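TREQ's job can be illustrated on trees represented as (label, children) pairs; the paper does not give the FORTRAN internals of SYNN, so this recursive comparison is only an illustration.

```python
def treq(t1, t2):
    """True when two (label, children) trees are identical throughout."""
    label1, kids1 = t1
    label2, kids2 = t2
    return (label1 == label2 and len(kids1) == len(kids2)
            and all(treq(a, b) for a, b in zip(kids1, kids2)))

a = ("S", [("NP", []), ("VP", [("V", [])])])
b = ("S", [("NP", []), ("VP", [("V", [])])])
c = ("S", [("NP", []), ("VP", [])])      # differs below VP
```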
In the SYNN package there are four subroutines that deal with phrase-structure grammars. CONCSG and CONCFG read in context-sensitive and context-free grammars, respectively, and convert them to
internal format. CHQCS and CHQCF check the
current tree against the indicated grammar by a
regeneration procedure.
Most of the subroutines of SYNN are concerned
with the transformational components. Separate subroutines read in the transformational rules, control
the application cycle, mark levels of embedded subtrees, search for an analysis, check restrictions, and
perform the operations.
The application of the forward rules is controlled
by the subroutine APPFX, and the application of
the reversal rules by APPBX. The application cycles are as described in the section Reversal of
Transformational Rules, above, except that each
transformational rule has a keyword which is used
to bypass the search if the transformational keyword does not occur in the tree.
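The keyword bypass can be sketched as a pre-filter: collect the symbols occurring in the tree once, then skip any rule whose keyword is absent. The rule format and names here are invented for illustration.

```python
def symbols_in(tree):
    """Collect every label occurring in a (label, children) tree."""
    label, kids = tree
    out = {label}
    for k in kids:
        out |= symbols_in(k)
    return out

def applicable(rules, tree):
    present = symbols_in(tree)               # computed once per tree
    return [r for r in rules if r["keyword"] in present]

tree = ("S", [("Q", []), ("AUXA", [("CAN", [])])])
rules = [{"name": "TWH2", "keyword": "Q"},   # keyword present: search it
         {"name": "TREL", "keyword": "REL"}] # keyword absent: bypassed
survivors = applicable(rules, tree)
```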
There is also a generation subroutine GENSR
which is best described as a "constrained-random" generator. Within constraints specified by the
user the subroutine generates a pseudo-random
base tree to which other subroutines of SYNN can
be applied.
Programs
In initial tests of the grammar and procedure the
most useful combination of subroutines was in the
program SYN1, which goes from basic tree to surface tree and back to basic tree, checking at every
step. This first program is an iteration of the subroutines CONTRE, TRCPY, CHQCS, APPFX,
CHQCF, APPBX, CHQCS, TREQ. When all parts
are correct, the final result is the same as the input,
and this is indicated by the final comment of the
TREQ subroutine.
The program SYN2 carries out the steps of the
MITRE procedure without the first two steps, lexical look-up and context-free parsing. Its basic
cycle is CONTRE, TRCPY, APPBX, CHQCS,
APPFX, CHQCF, TREQ. After each of the subroutines an indicator is checked to see if the tree
should be rejected.
The program SYN3, which uses the generation
subroutine, is like SYN2 except that GENSR replaces CONTRE in the basic cycle. Inputs for
GENSR are easier to prepare than those of
CONTRE, so that SYN3 is being used extensively
in debugging the grammar.
The lexical look-up and context-free parsing
steps of the procedure have not been programmed.
Because algorithms for these steps are known to exist, it was decided that their programming could be
postponed and an existing program used.
Test Grammar
A subset of the grammar, familiarly known as the
JUNIOR grammar, was selected for initial tests of
the procedure. Its dimensions are:
(Forward) Grammar

Phrase-Structure Component:
    61 rules
    105 subrules

Transformational Component:
    11 initial singularies
    6 embeddings and related singularies, including two embeddings
    3 final singularies
    20 rules

Surface Grammar:
    32 rules
    306 subrules

Reversal Rules:
    6 final singularies
    15 embeddings and related singularies
    11 initial singularies
    32 rules
Twenty-six sentences (plus some variants) constitute a basic test sample for the JUNIOR grammar. This sample, which includes at least one test
for each transformational rule, contains (among others) the sentences:
1. The airplane has landed.
2. Amphibious airplanes can land in water.
3. Did the truck deliver fifty doughnuts at
nine hundred hours?
4. Were seven linguists trained by a young
programmer for three months?
5. The general that Johnson met in Washington had traveled eight thousand miles.
6. Are there unicorns?
7. John met the man that married Susan in
Vienna.
8. There were seven young linguists at
MITRE for three months.
9. Can all of the ambiguous sentences be analyzed by the program?
10. The linguist the ambiguous grammar was
written by is young.
SYN1 has been run on the full set of sample sentences. The total time for a run with 28 sentences
was 5.11 minutes on the 7030 computer.
SYN3 has likewise been run with the JUNIOR
grammar. As an example of running time, a typical
run generating 20 trees carried all of them through
the transformations and reversal rules in a total of 5
minutes. All but one of these trees contained
embedded sentences; half of them contained two
embeddings.
In another experiment, a CF parser was used
with SYN2 to simulate the full procedure. The results for sentences (1), (2), and (6) are:
                              Sentence
                          (1)    (2)    (6)
Pre-trees                  12     90     1
Presumable surface trees    8     15     1
Presumable base trees       3      4     1
Correct base trees          1      2     1
In the worst case encountered, sentence (5),
there are 48 presumable surface trees.
It is clear from even these few numbers that if
the procedure is to be practical, it will be necessary
to incorporate a highly efficient routine for obtaining surface trees and to work on the rapid elimination of spurious ones.
REFERENCES

1. "English Preprocessor Manual," SR-132, MITRE Corp., 1964, rev. 1965.

2. N. Chomsky, Syntactic Structures, Mouton, The Hague, 1957.

3. N. Chomsky, Aspects of the Theory of Syntax, M.I.T. Press, Cambridge, Mass., 1965.

4. N. Chomsky, "Formal Properties of Grammars," Handbook of Mathematical Psychology, R. D. Luce, R. R. Bush and E. Galanter, eds., Wiley, New York, 1963, vol. 2, pp. 323-418.

5. N. Chomsky and G. A. Miller, "Introduction to the Formal Analysis of Natural Languages," ibid., pp. 269-321.

6. N. Chomsky and G. A. Miller, "Finitary Models of Language Users," ibid., pp. 419-491.

7. T. V. Griffiths and S. R. Petrick, "On the Relative Efficiencies of Context-Free Grammar Recognizers," Comm. ACM, vol. 8, pp. 289-300 (1965).

8. T. V. Griffiths, "Turing Machine Recognizers for General Rewriting Systems," IEEE Symp. Switching Circuit Theory and Logical Design, Princeton, N. J., November 1965, pp. 47-56.

9. G. H. Matthews, "Analysis by Synthesis of Sentences of Natural Languages," 1961 International Conference on Machine Translation and Applied Language Analysis, HM Stationery Office, London, 1962, vol. 2, pp. 531-540.

10. S. R. Petrick, "A Recognition Procedure for Transformational Grammars," Ph.D. thesis, M.I.T., 1965.

11. S. Kuno, "A System for Transformational Analysis," paper presented at the 1965 International Conference on Computational Linguistics, New York City, May 20, 1965.
COBWEB CELLULAR ARRAYS*
Robert C. Minnick
Stanford Research Institute
Menlo Park, California
INTRODUCTION

The cobweb cellular arrays are embellishments of the cutpoint1-3 cellular array that are made by complicating the cell-interconnection structure. This new class of arrays will allow for more economical and efficient logical designs than are possible in cutpoint arrays. As a background to the new arrays, the properties of the cutpoint array2 will be reviewed.

CUTPOINT ARRAY

The cutpoint cellular array is a two-dimensional rectangular arrangement of cells. As shown in Fig. 1b, each cell has binary inputs from neighboring cells on the top and the left, and binary outputs to neighboring cells on the bottom and right. In addition to being used in the cell, the input to the left of each cell is bussed to the right output. The bottom output of each cell is set as one of six combinational switching functions of the two cell inputs, or as an R-S flip-flop, in either case by the use of four specification bits, or cutpoints, in each cell. By specifying these cutpoints independently for every cell of a cutpoint cellular array, the array is thereby particularized to a required logical property. The table in Fig. 1a includes the logical functions that can be produced at the bottom output of each cell, depending on the particular specification of its cutpoints. The symbol F in the index column indicates an R-S flip-flop.

A 3 × 4 array of cutpoint cells is shown in Fig. 1b; the specification bits are indicated as dots. A realization for one cutpoint cell in terms of diode-transistor circuits is shown in Fig. 2. The four cutpoints in this realization are depicted as switches; however, they could be photoresistors, flip-flops, or breaks or bridges in conductors. The DTL realization in Fig. 2 is one of many circuit possibilities for a cutpoint cell.

PROBLEMS WITH CUTPOINT ARRAYS

It has been shown in the previously cited references that arbitrary logical functions can be realized using appropriately specialized cutpoint arrays. However, certain of these realizations tend to be inefficient in terms of the number of required cells. For instance, the best-known realization for a three-bit parallel adder using no more than two cutpoint arrays is shown as Fig. 3. In this figure the two three-bit words (a3, a2, a1) and (b3, b2, b1) are added to form
*The research reported in this paper was sponsored by
the Air Force Cambridge Research Laboratories, Office
of Aerospace Research, Under Contract AF 19(628)-4233.
327
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965
INDEX   a b c d   z
  0     0 0 0 0   1
  1     0 0 0 1   y'
  2     0 0 1 0   x' + y'
  3     0 0 1 1   x'y'
  4     0 1 0 0   x + y
  5     0 1 0 1   xy'
  6     0 1 1 0   x ⊕ y
  7     0 1 1 1   0
  F     1 1 0 1   x = S, y = R

Figure 1. Cutpoint array.
the sum word (s3, s2, s1). The input carry to the low-order column is c0, while c3 is the overflow bit. An n-bit parallel adder can be formed in a manner similar to the one in Fig. 3; a total of (2n + 1)² cells is required in two adjoining arrays for such a realization.
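The cell functions tabulated in Fig. 1a, and the way vertical cascades of cells compose them, can be sketched in a few lines of Python (an illustrative reconstruction, not part of the paper; the flip-flop row F is omitted because it is sequential rather than combinational):

```python
# Combinational bottom-output functions of a cutpoint cell, indexed
# as in Fig. 1a. x is the left input, y the top input.
CELL_FUNCTIONS = {
    0: lambda x, y: 1,
    1: lambda x, y: 1 - y,             # complement of the top input
    2: lambda x, y: (1 - x) | (1 - y),
    3: lambda x, y: (1 - x) & (1 - y),
    4: lambda x, y: x | y,
    5: lambda x, y: x & (1 - y),
    6: lambda x, y: x ^ y,
    7: lambda x, y: 0,
}

def cascade(indices, x_inputs, y_top):
    """Evaluate a vertical cascade: cell i sees left input x_inputs[i],
    and its top input is the previous cell's bottom output."""
    y = y_top
    for idx, x in zip(indices, x_inputs):
        y = CELL_FUNCTIONS[idx](x, y)
    return y

# A cascade of index-1 cells complements at each stage, so an even
# number of them transmits the top input unchanged.
assert cascade([1, 1], [0, 0], 1) == 1
assert cascade([1, 1], [0, 0], 0) == 0
```

This makes concrete why runs of index-1 cells perform no logic: they merely move a signal down a column, complementing it an even or odd number of times.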
Reference back to Fig. 1a shows that cells with an index 1 form the complement of the top input.
This one-variable function is convenient to use when
transmittal of information vertically in an array is
desired. Hence, vertical cascades of Index 1 cells in
a cutpoint array indicate that in effect, no logic is
being performed, perhaps with the exception of one
cell in each such cascade. With this in mind, it is
now observed that in the upper 3 X 7 array of Fig.
3 only six cells, roughly along the diagonal from the
upper-right to lower-left corners, are used logically.
Similarly, in the lower 4 X 7 array in Fig. 3, only
cells in the upper-right triangular area are used logically.
A wastage of cells similar to that encountered in
Fig. 3 had been observed in several cutpoint-array
logical designs, particularly in designs which involve parallel operations. This inefficient use of cells
occurs in most of these situations because every
bit of one operand word must interact with every
bit of a second operand word. In a cutpoint array
the only convenient way this interaction can occur
is to introduce the bits of one word on the side
of the array and the bits of the other word along the
top. This orthogonal introduction of the two operand words into a cutpoint array seems necessary
because no facility is provided within the array to
change the direction of information flow from vertical to horizontal.
If it were possible to redirect the information
COBWEB CELLULAR ARRAYS
Figure 2. Circuit for one cutpoint cell. (Transistors: 2N706; diodes: 1N4009; 2.2K resistors; +5V and -5V supplies.)
Figure 3. Cutpoint realization for a three-bit parallel adder.
flow inside a cellular array, two n-bit operand words might be applied to the side of the array (or both to the top), possibly with a significant reduction in the number of cells required. Instead of requiring O(n²) cells, the resulting array might require O(n) cells. Thus this lack of control over the direction of information flow constitutes an important problem in the use of cutpoint arrays.
Another problem encountered in practical logical
designs using cutpoint arrays is the excessive requirement for jumper connections from edge-output
points to edge-input points of the same array. One
example of this problem is shown in Fig. 4, which
is a five-bit shift register driven by a four-phase
clock. For this example four such jumpers are used.
Jumpers of this type will be termed edge jumpers.
Figure 4. Cutpoint realization for a five-bit shift register.

A second example is shown in Fig. 5. In this figure, the following three combinational functions of
the three variables, x3, x2, x1, are realized in one cutpoint array:
E66 = Σ(1, 6)
E43 = Σ(0, 1, 3, 5)
E129 = Σ(0, 7).     (1)

Figure 5. Cutpoint realization for three functions of three variables.
A requirement for edge-jumper connections in a
cutpoint array often carries with it a wastage of
cells. In the bottom half of the array in Fig. 5, for
example, only three cells are used other than for
transmitting signals.
A third problem often encountered when practical logical designs are made in terms of cutpoint arrays is an insufficient number of edge connections to the array. A final problem is the desirability of having the cells isolated from one another during the early part of production so that it is possible to identify faulty cells by step-and-repeat testing.
The cobweb array is proposed as a means of meeting all of these problems: inefficiency for parallel operations, excessive edge jumping, insufficient edge connections, and lack of cell isolation.
COBWEB ARRAY
A 4 X 4 cobweb array is shown in Fig. 6. Within the array each cell has five possible inputs: two from a horizontal and a vertical bus, and three from nearby cells. Connections from edge cells
to the package terminals are shown on Fig. 6 by
peripheral dots. For terminal connections, each cell
on the left and bottom edges of the array has one
non-bus output connected to terminals, each cell on
the right and top edges of the array has one non-bus
input connected to terminals, and each horizontal
and vertical bus is connected to a terminal. For an
M X N-cell cobweb array it is easily seen that
3 (M + N) - 2 package terminals are needed. This
compares with M + 2N terminals for the cutpoint
array of the same size; for square arrays approximately twice the number of terminals are provided
by the cobweb array, while in general the number
of terminals in the cobweb array varies from one
and one-half to three times the number in cutpoint
arrays of the same dimensions.
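The terminal-count formulas above are easy to check numerically; the following Python sketch (illustrative only, with assumed function names) evaluates both counts and the ratios claimed in the text:

```python
def cobweb_terminals(m, n):
    """Package terminals for an M x N cobweb array: 3(M + N) - 2."""
    return 3 * (m + n) - 2

def cutpoint_terminals(m, n):
    """Terminals for a cutpoint array of the same size: M + 2N."""
    return m + 2 * n

# Square arrays: the cobweb array provides roughly twice as many terminals.
ratio_square = cobweb_terminals(10, 10) / cutpoint_terminals(10, 10)
assert 1.9 < ratio_square < 2.0          # 58 / 30 ≈ 1.93

# In general the ratio lies between one and one-half and three.
assert abs(cobweb_terminals(1000, 1) / cutpoint_terminals(1000, 1) - 3) < 0.02
assert abs(cobweb_terminals(1, 1000) / cutpoint_terminals(1, 1000) - 1.5) < 0.01
```

The two extreme cases (very wide or very tall arrays) show where the one-and-one-half and three bounds come from.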
Fig. 7a shows one internal cell of this cobweb array with its five inputs labelled u, v, w, x, and y. Fig. 7b shows how this cobweb cell can be fabricated from the previous cutpoint cell and fourteen additional cutpoints. Of course, as mentioned previously, technologies other than the diode-transistor method shown in Fig. 7b can be used. The added cutpoints are labelled e, f, ..., r. In order to have all single-throw cutpoints, the double-throw cutpoint a in Fig. 2 is replaced in Fig. 7b by two cutpoints, g and e, where a = g'e and a' = ge'.
It is anticipated that cobweb cellular arrays will
be made by one of the modern batch-fabrication
technologies, such as that of integrated circuits. In
making these arrays with integrated circuits, the
number of deposition steps is of related economic
interest. Returning to Fig. 6, and using the nomen-
Figure 6. Structure of the cobweb array.
Figure 7. Diode-transistor realization of the cobweb cell: (a) cell inputs u, v, w, x, y; (b) circuit. (Transistors: 2N706; diodes: 1N4009.)
clature of Fig. 7a, it is seen that if the w buses are moved to the right of center in each cell, all w, v, and y interconnections may be deposited simultaneously. After depositing an appropriate insulating layer, the u and x interconnections, together with connections for power and ground, may be formed as a second deposition layer. Hence the interconnection structure of the cobweb array in Fig. 6 is two-layered. Similar reasoning applied to Fig. 1b shows that the interconnection structure of the cutpoint array is single-layered.
In summary, the cobweb array consists of cells that have the same amount of electronics as cutpoint cells. Each cell in the new array has about four times the number of cutpoints as the cutpoint cell, one and one-half to three times the number of package terminals as a cutpoint array of the same size, and a two-layered rather than a one-layered interconnection structure. It will now be shown that the use of this more complicated cellular array at least partially alleviates the previously discussed problems of cutpoint cellular arrays.
LOGICAL DESIGN WITH COBWEB CELLULAR ARRAYS
In cutpoint arrays, switching functions are produced by forming one or more vertical cascades of
cells. In the cobweb cellular arrays, these cascades
of cells no longer are required to be vertical. Indeed, a cascade in a cobweb array may be any chain
of cells that follows the arrowheads in Fig. 6. This
property of cobweb arrays gives the logical designer
a considerable degree of flexibility in forming his
design. The need for an increased ratio of edge connections to cells is met in cobweb arrays. By introducing other assumptions on edge connections it is
possible further to increase this ratio if additional
logical design experience shows this to be desirable.
In the cobweb array it is possible to use some of
the cutpoints in a cell in lieu of edge jumpers. For
instance, if cutpoints h and k (Fig. 7b) are closed, the x bus and input u to that cell are connected together. Similarly, by closing cutpoints j and f, and by opening cutpoint r, the cell output can be jumpered to the w input bus; for this connection, the logical function produced by the cell is immaterial.* It is also possible to jumper as many as all five inputs and the output of a cell together. Indeed, for those cases where sneak paths are not introduced, it is possible to form one jumper path among the cutpoints f, h, i, j, k, l and a second one among m, n, o, p, q. Cells that are specialized in this way are called jumper cells. The jumper connections are designated by circling the inputs (and output) that are jumpered together and by inserting the symbol J inside the cell. If two isolated jumpers are used, triangles designate the inputs (and output) on the second jumper.
It should be observed that J cells in the cobweb
array are logically inactive. That is, jumper cells are
used only to make local connections in the array,
and not to perform logical operations. It should also
be noted that jumper-cell connections can be made
in such a way as to allow information flow in violation of the arrowheads in Fig. 6.
It is also possible to use the cutpoints h, i, ..., q to obtain an OR of two or more of the five inputs to a cell. For instance, if cutpoints k, l, m, n, and o are closed, then the horizontal input to the cell is x + y, while the vertical input is u + v + w. Care must be taken when using this property to avoid the introduction of sneak paths.
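This ORing behavior can be modeled with a small Python sketch (an illustration, not from the paper; the pairing of cutpoint letters h, ..., l and m, ..., q with the inputs u, v, w, x, y is inferred from the example in the text):

```python
# Cutpoints h, i, j, k, l gate inputs u, v, w, x, y onto the cell's
# horizontal input line; cutpoints m, n, o, p, q gate the same five
# inputs onto the vertical input line. (Assumed pairing, matching the
# text's example: k, l select x, y; m, n, o select u, v, w.)
H_GROUP = ["h", "i", "j", "k", "l"]
V_GROUP = ["m", "n", "o", "p", "q"]
INPUT_ORDER = ["u", "v", "w", "x", "y"]

def gated_or(closed, group, inputs):
    """OR together the inputs whose cutpoint in `group` is closed."""
    return int(any(inputs[INPUT_ORDER[i]]
                   for i, cp in enumerate(group) if cp in closed))

inputs = {"u": 1, "v": 1, "w": 1, "x": 1, "y": 0}
# With k and l closed, the horizontal input is x + y;
# with m, n, and o closed, the vertical input is u + v + w.
assert gated_or({"k", "l"}, H_GROUP, inputs) == 1
assert gated_or({"k", "l"}, H_GROUP, {**inputs, "x": 0}) == 0
assert gated_or({"m", "n", "o"}, V_GROUP, inputs) == 1
```

The sneak-path caveat corresponds to the fact that the physical lines are wired-OR connections, not isolated gates.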
Two decomposition methods will now be shown in order to illustrate the elimination or reduction of edge jumpers by the jumper-cell specialization. It is supposed that a switching function E = E(x1, x2, ..., xn) is not producible in one cascade of cells. This function can always be decomposed on one of its variables, xi, in several ways, including a form due to Shannon,

E = Gxi + Hxi'     (2)

and a form due to Reed,

E = Axi ⊕ B     (3)

where ⊕ is the EXCLUSIVE-OR operator, and each of G, H, A, and B is a switching function of no more than n - 1 variables and is independent of xi.

If it is assumed that G, H, A, and B are each producible in one cascade of cells, the cobweb arrays of Figs. 8a and 8b correspond to the two decompositions of Eqs. (2) and (3), respectively. If one or more of the four (n - 1)-variable functions is not producible in a single cascade, either of the above decompositions may be applied repeatedly until all subsidiary functions are realizable in one cascade.

The J cell in Fig. 8a with the y input and the z output circled means that the switches f and l (see Fig. 7) are closed and that switch r is open. This connects the y input to the cell output without the use of an external jumper. For cobweb cells that produce one of the functions listed in Fig. 1a, a function index is placed inside the cell, and the particular inputs (if any) that are connected through the cutpoints h, i, ..., l (Fig. 7) are designated by circles, while the particular inputs (if any) that are connected through the cutpoints m, n, ..., q are designated by triangles. For instance, the cell with index 5 in Fig. 8a means (b, c, d) = (1, 0, 1), and since a circle is on the x bus and a triangle is on the y input, then (h, i, j, k, l) = (0, 0, 0, 1, 0), (m, n, o, p, q) = (0, 0, 0, 0, 1), and finally (e, f, g, r) = (0, 0, 1, 1).

As shown in Fig. 8, no jumpers are needed in the cobweb-array realization of either decomposition. Furthermore, it is noted that only one row of cells is needed for each application of the Reed decomposition.

With none of the cutpoints h, i, ..., q closed, all cells in the cobweb array are isolated. Therefore, step-and-repeat testing is possible for cobweb arrays that are fabricated as monolithic integrated circuits, while it is not possible for the originally proposed cutpoint arrays.

The particular interconnection structure of the cobweb array was chosen for several reasons. As the number of potential inputs to each cell is increased, the number of interconnection possibilities also increases. But this increase is obtained at the cost of additional cutpoints in each cell. Hence, it is desirable to introduce only as much interconnection versatility as the typical logical designer would use.

*In principle, cutpoint r in Fig. 7b could be eliminated by setting cutpoints b, c, and d so that the output transistor is nonconducting. However, cutpoint r is necessary for the correction algorithms to be described.

*In order to simplify the artwork, the terminal conventions adopted in connection with the discussion of Fig. 6 will not be explicitly shown on this and on following cobweb-array designs.
Figure 8. Shannon and Reed decompositions using cobweb arrays: (a) Shannon, E = Gxi + Hxi'; (b) Reed, E = Axi ⊕ B.
Figure 9. Cobweb realization for a three-bit parallel adder.

Figure 10. Cobweb realization for a five-bit shift register.
Referring back to Fig. 7 a, the x and y inputs are
carried over directly from the previous cutpoint array. The u input allows the designer to build up a
carry propagation chain within a horizontal row of
register cells. The vertical bus allows one to jumper
a bottom-cell output of an array to a top-cell input.
Finally, the v input is a knight's move away so that
it is possible to build up a cutpoint cascade that
crosses other such cascades. The desirability of having crossings in cellular arrays has been observed
before. *
A number of obvious variations of the cobweb array are possible. For instance, input v, or both inputs v and w in Fig. 7a, may be omitted in each cell, with a corresponding saving in cutpoints. In the latter variation, a single-layered interconnection structure results. Similarly, it is possible to invent more complicated variations of the cobweb array.
Illustrations of logical designs using cobweb arrays are given as Figs. 9, 10 and 11. These figures should be compared directly with Figs. 3, 4, and 5, respectively. First comparing Figs. 3 and 9, it is seen that an n-bit parallel adder can be synthesized using 9n + 3 cobweb cells in a single array, while (2n + 1)² cells in two adjoining arrays are required
if cutpoint cellular logic is used. Thus, for example, a 50-bit parallel adder requires 453 cells in a cobweb array and 10,201 cells in two cutpoint arrays.

A comparison of Figs. 4 and 10 shows that all edge jumpers are eliminated in the cobweb realization of a shift register at the cost of one extra row of cells. Finally, comparing Figs. 5 and 11, it is seen that the three edge jumpers as well as half the total number of cells are eliminated when a cobweb array is substituted for a cutpoint array.

*By Marvin E. Brooking, private communication.

FAULT AVOIDANCE METHODS

In regard to the cutpoint cellular array, methods have been demonstrated for replacing faulty cells with spare cells.2 These methods are no longer feasible with the cobweb arrays; therefore, it is necessary to develop an alternative faulty-cell avoidance algorithm. It will be assumed that the faults are "electronic;" that is, a transistor has a low beta, or it has an emitter-collector short, or a diode is open-circuited, etc. All conductors and cutpoints will be considered perfect, and furthermore, the circuit design is assumed to be such that no failure condition will cause the shorting or opening of a conductor or the shorting of a power supply. It appears from a consideration of integrated-circuit technology that these assumptions are realistic.

Two clusters of cells called supercells are defined by Fig. 12. The shaded cell in each 2 X 2 cobweb array has five inputs (marked with the symbol I) that are geometrically equivalent to the cobweb cell of Fig. 7. The jumpers between points p and q in Fig. 12 are used for transmitting the knight's-move interconnections. The supercells of Fig. 12 are arranged in such a way that one may first perform a logical design in terms of a conventional cobweb array, and then replace each cell in the first and all odd-numbered rows with a type α supercell. The cells in the second and all even-numbered rows are replaced with a type β supercell.
Figure 11. Cobweb realization for three functions of three variables.

Figure 12. Cobweb supercells: (a) type α; (b) type β.

Figure 13. Exhaustive listing of the cobweb array fault-avoidance algorithm.

The effect so far has been to increase the number of cells in the cobweb array by a factor of four. In this supercell array it is possible under most conditions to make local perturbations of the logical design in order to avoid faulty cells. Assuming that
only one or two of the five inputs to a given cell are connected by means of cutpoints h, i, ..., q in Fig. 7, it is necessary to demonstrate a fault-avoidance algorithm for C(5,2) = 10 cases (the two-input cases cover the one-input cases). However, the logical cells in a supercell array appear in two geometrically different environments that correspond to the types α and β supercells; therefore, a total of 20 cases must be investigated. Proceeding by exhaustion, a fault-avoidance algorithm for each of these 20 cases is shown as Fig. 13. In this figure, a single shading indicates a faulty cell; arrowheads are attached to the two active inputs for that cell. A
cell with cross-shading is assumed to be a good cell, and it replaces the faulty cell. If the faulty cell has
no symbolism other than the arrowheads and the shading, it is assumed to have been disconnected by having all of cutpoints f, h, i, ..., r open; if it has a J symbol, it is used as a jumper cell with no connections made at the arrowheads. Cells with a dotted single shading are neighboring good logical cells. Cases where the faulty cell in Fig. 13 occurs on or near the top row correspond to faults in α supercells, while cases where the faulty cell occurs on or near the bottom row correspond to faults in β supercells.

From this development it should be clear that if a logical cell in a supercell array is bad, it can be logically replaced provided that another cell is good
Figure 14. Block diagram for a twelve-bit serial multiplier. (The system comprises four 12-bit registers: the MP, MC, HO product, and LO product registers; the figure also lists the set and reset control equations for the register flip-flops.)
Figure 15. Realization of the multiplier in terms of five cutpoint arrays.
at a distance one or two cells from it, according to Fig. 13.

Thus a fault-avoidance algorithm for cobweb arrays has been demonstrated. Many variations in the process are possible. For instance, if multiple faults prevent complete avoidance of faulty cells using the 2 X 2 supercells, one can replace some or perhaps all cells in the supercell array again with supercells until enough redundancy has been obtained that all faults can be avoided. Similarly, it may be possible to compress a supercell array if, for instance, no corrections are necessary in a particular row or column. Similar fault-avoidance algorithms can be deduced for the simplified cobweb arrays mentioned before.
LOGICAL DESIGN OF A MULTIPLIER
As a final illustration, a logical design is given
for a 12-bit serial multiplier in terms of a single
cobweb cellular array. For comparison purposes, the
same system has been chosen as was previously
reported.3 The block diagram for this four-register,
five-command multiplier is given as Fig. 14, while
a previously-reported design in terms of five interconnected cutpoint arrays is shown in Fig. 15.
In Fig. 16, this same system is realized in terms
of a single 27 X 16-cell cobweb array.
The statistics on these two realizations for the
multiplier are as follows:
Cutpoint Realization

There are 352 cells in five cellular arrays, and 100 connections at the edges of the five arrays.
28% of the cells are "1" cells used only for transmitting information,
71% of the cells are used logically, and
1% of the cells are not used.

Cobweb Realization

There are 432 cells in one cellular array, and 10 connections at the edges of the one array.
26% of the cells are jumper cells,
55% of the cells are used logically, and
19% of the cells are not used.
It is seen from the above data that while 26 percent more cells are required for the multiplier in the cobweb realization than in the cutpoint realization, only one cobweb array is used versus five cutpoint arrays; furthermore, the backplane wiring in the cobweb realization is reduced by an order of magnitude.

CONCLUSIONS

The essential difference between the previously reported cutpoint cellular array and the cobweb array is the more complicated and flexible interconnection structure of the latter array. This flexibility allows the logical designer much more geometric freedom in the embedding of cascade logical realizations. For certain types of digital operations, and in particular for parallel operations, the use of cobweb arrays results in a significant reduction in the required number of cells. Also, it is possible to eliminate jumper connections from one edge cell of an array to another edge cell on the same array when cobweb arrays are employed.

ACKNOWLEDGMENTS

The logical design of the parallel adder in Fig. 9 was provided by Mr. David W. Masters, the supercell concepts were the result of conversations with Mr. Milton W. Green, and some of the ideas on parallel operations in cellular arrays were derived from unpublished work of Mr. Jack Goldberg. Particular credit is due to Dr. Robert A. Short for his critical evaluation of the manuscript and for his many helpful suggestions and comments.
Figure 16. Realization of the multiplier in terms of one cobweb array.
REFERENCES

1. R. C. Minnick and Robert A. Short, "Investigation of Cellular Linear-Input Logic," Final Report, Contract No. AF 19(628)-498, SRI Report No. 4122, Stanford Research Institute, Menlo Park, Calif.; prepared for Data Sciences Laboratory, Air Force Cambridge Research Laboratories, Office of Aerospace Research, Bedford, Mass.; USAF AFCRL No. 64-6; DDC No. AD 433802 (Dec. 1963).

2. R. C. Minnick, "Cutpoint Cellular Logic," IEEE Transactions on Electronic Computers, Vol. EC-13, No. 6, Dec. 1964, pp. 685-698.

3. R. C. Minnick, "Application of Cellular Logic to the Design of Monolithic Digital Systems," presented at a Symposium on Microelectronics and Large Systems, co-sponsored by the Office of Naval Research and the Univac Division of Sperry Rand Corp., Washington, D.C., Nov. 17-18, 1964. (To be published by Spartan Books, Inc., in 1965.)
TWO-DIMENSIONAL ITERATIVE LOGIC*
Rudd H. Canaday
Bell Telephone Laboratories, Incorporated
Whippany, New Jersey
INTRODUCTION

It is well known that, given a suitable Boolean function, a large number of "gates" or "elements," each producing this function, can be interconnected in a regular structure, or "array," to realize any given Boolean function. Furthermore, the structure of the array can be invariant to the function being realized.

One of the simplest such structures is the two-dimensional array of three-input, one-output elements shown in Fig. 1. In this paper two methods are presented for using this structure in the synthesis of arbitrary Boolean functions. The following assumptions will be adhered to throughout this paper:

1. All elements in the array are identical.
2. The interconnections between elements in the array are fixed. They cannot be broken or changed in any way.
3. The array will be used as a single-output circuit. Only the output of the lower-right element of the array is accessible to the outside world.
4. Every element in the array realizes the "majority" function

f(A,B,C) = AB + AC + BC

of its three inputs.*

As a consequence of assumptions (1), (2), and (4), an array can be described completely in terms of its width w and height h. Such an array will be called a "MAjority Array," or "MAJA."

In the remainder of this paper it will be shown, first, how to synthesize an arbitrary "self-dual" function in a MAJA. Then this result will be extended to arbitrary functions, and some examples will be given. This is "intersection synthesis." Next a second synthesis technique, "factorization synthesis," will be described, first in a canonic form, through examples, and then in a more general form.

*The material presented in this report is based on a thesis submitted in partial fulfillment of the requirements for the Doctor of Philosophy degree in Electrical Engineering at the Massachusetts Institute of Technology, September 1964. The research reported was made possible through the support extended to the M.I.T. Electronic Systems Laboratory by the U.S. Air Force Avionics Laboratory, Navigation and Guidance Division, under Contract AF-33(657)-11311 and, in the earlier phases of this research, under Contract AF-33(657)-8932.

*It is easy to prove1 that all of the results given here extend directly to arrays of "minority" elements: f(A,B,C) = (AB + AC + BC)'. This paper is based on the author's Ph.D. thesis.1 In the present paper space limitations preclude statements of all theorems and proofs on which the synthesis methods are based. These do appear, together with extensions of the results presented here, in reference 1.
Figure 1. A 4 X 6 array of 3-input elements.
Both synthesis techniques lead to arrays of reasonable size, and embody new synthesis techniques which may prove to be applicable in other forms of synthesis also.

PRELIMINARY DISCUSSION

Before discussing array synthesis, it is necessary to define some terminology for arrays.

Each element in an array has three inputs, which will be denoted the "top," "center," and "left" inputs (signal flow in an array is always left-to-right and top-to-bottom).

The w inputs (for an array of width w), consisting of the top input to each element in the top row of the array, form the "top boundary" inputs to the array. Similarly, the h inputs (for an array of height h), consisting of the left input to each element in the leftmost column of the array, form the "left boundary" inputs to the array.

One particular type of array proves to be of particular interest. This array has, in effect, all top boundary inputs wired together, and all left boundary inputs similarly wired together.

Definition: An "XY Standard Boundary Condition majority array" (XY SBC MAJA) is a MAJA all of whose top boundary inputs carry the signal Y, where Y can be a variable or a constant, and all of whose left boundary inputs carry the signal X, where X can be a variable or a constant.

Fig. 2 is an example of an XY SBC MAJA. The two synthesis methods to be presented both apply to the SBC MAJA. Intersection synthesis is given first for self-dual functions, as defined below.
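Under these definitions an XY SBC MAJA can be simulated directly; the Python sketch below (illustrative, with assumed function names) propagates signals left-to-right and top-to-bottom and reads the output from the lower-right element, per assumption 3:

```python
def maj(a, b, c):
    """Majority of three binary inputs."""
    return int(a + b + c >= 2)

def eval_sbc_maja(centers, x, y):
    """Evaluate an XY SBC MAJA.

    `centers` is an h-row by w-column matrix of center-input values;
    every top-boundary input carries y and every left-boundary input
    carries x. The single output is the lower-right element's output.
    """
    h, w = len(centers), len(centers[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            top = y if i == 0 else out[i - 1][j]
            left = x if j == 0 else out[i][j - 1]
            out[i][j] = maj(top, centers[i][j], left)
    return out[h - 1][w - 1]

# A 1 x 1 array is simply one majority element of X, Y, and its center input.
assert eval_sbc_maja([[1]], x=0, y=1) == 1
assert eval_sbc_maja([[1]], x=0, y=0) == 0
```

Note that an element's left input is the output of the element to its left (or the X boundary), so each row is evaluated left to right before its outputs feed the row below.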
Definition: Given a Boolean function f(x1, ..., xn), the dual fd(x1, ..., xn) of the function f is defined as:

fd(x1, ..., xn) = [f(x1', x2', ..., xn')]'

By applying De Morgan's theorem one can easily see that if f is expressed using only the operations + (OR), · (AND), and ' (NOT), then fd is obtained by interchanging + and · throughout the expression for f.

Definition: A Boolean function f(x1, ..., xn) is self-dual if and only if

fd(x1, ..., xn) = f(x1, ..., xn)

Note that by this definition of dual and self-dual, a function which is a constant is not self-dual since, if f ≡ 1, then fd = f' ≡ 0.
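These two definitions translate directly into an exhaustive check (a Python illustration; the function names are assumptions of this sketch):

```python
from itertools import product

def dual(f):
    """Return the dual: fd(x1, ..., xn) = (f(x1', ..., xn'))'."""
    return lambda *xs: 1 - f(*(1 - x for x in xs))

def is_self_dual(f, n):
    """Check fd == f over all 2^n input combinations."""
    fd = dual(f)
    return all(f(*xs) == fd(*xs) for xs in product((0, 1), repeat=n))

maj = lambda a, b, c: int(a + b + c >= 2)
assert is_self_dual(maj, 3)                       # majority is self-dual
assert not is_self_dual(lambda a, b, c: 1, 3)     # a constant is not
```

The majority function passing this check is what makes it usable as the element function of a MAJA.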
Any n-variable Boolean function f(x1, ..., xn) can be factored as

f(x1, ..., xn) = Xfo + Yf1     (1)

with X and Y chosen from {x1, x1', ..., xn, xn'}. If f is a self-dual function, then the existence of the factorization (1) implies that f can be factored as

f = Xfo + Yfod + XY     (2)

where X, Y, and fo are the same as in Eq. (1). Equation (2) is basic to the synthesis algorithm, which is presented in the following two definitions and Theorem 1 below.
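Equation (2) can be verified exhaustively for a small self-dual function. The sketch below is illustrative only: it uses the Shannon-style choice X = x1, Y = x1', for which the majority function gives fo = x2 + x3 and fod = x2·x3, and checks all eight input combinations:

```python
from itertools import product

maj = lambda x1, x2, x3: int(x1 + x2 + x3 >= 2)

f0 = lambda x2, x3: x2 | x3     # fo  = f(1, x2, x3)
f0d = lambda x2, x3: x2 & x3    # fod = dual of fo

for x1, x2, x3 in product((0, 1), repeat=3):
    X, Y = x1, 1 - x1
    # Eq. (2): f = X*fo + Y*fod + X*Y
    rhs = (X & f0(x2, x3)) | (Y & f0d(x2, x3)) | (X & Y)
    assert rhs == maj(x1, x2, x3)
```

With Y = X' the term XY vanishes, so Eq. (2) reduces to the ordinary Shannon expansion; the XY term matters when X and Y are independent literals.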
Figure 2. An XY SBC MAJA.
INTERSECTION SYNTHESIS
Definition: Given two Boolean functions fa and fb, and given a sum-of-products expression for each: fa = r1 + r2 + ... + rk; fb = t1 + t2 + ... + tm, then an intersection matrix of fa × fb is a matrix with k rows and m columns, in which each entry eij is the intersection of the literals in ri with the literals in tj (i.e., eij contains a literal y if and only if y is in both ri and tj).
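The intersection-matrix construction is purely set-theoretic and easy to sketch in code. The representation below (terms as sets of literal strings) is ours, not the paper's:

```python
def intersection_matrix(fa_terms, fb_terms):
    """Entry e_ij is the set of literals common to term r_i of fa and term t_j of fb."""
    return [[ri & tj for tj in fb_terms] for ri in fa_terms]

# f0 = x2 + x3 (terms {x2}, {x3}); f0^d = x2 * x3 (single term {x2, x3}).
M = intersection_matrix([{"x2"}, {"x3"}], [{"x2", "x3"}])
```

For these expressions M has two rows and one column, with e11 = {x2} and e21 = {x3}; a different ordering of the terms would give a different (equally valid) matrix, matching the non-uniqueness noted below.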
Note that the intersection matrix for a given fa and fb is not unique. It is unique for given sum-of-products expressions (including the ordering of their terms) for both fa and fb. Now it is possible to define an SBC MAJA to realize any given self-dual function.
Definition: Given a self-dual function f^sd = XY + Xf0 + Yf0^d, and given a k × m intersection matrix f0 × f0^d, with rows corresponding to terms of f0, then an XY intersection MAJA for f^sd is a k × m XY SBC MAJA with the center input to the ij-th element chosen to be any one of the literals in entry eij of the intersection matrix, for all i, j: 1 ≤ i ≤ k, 1 ≤ j ≤ m.

Again note that one function f^sd may have many intersection MAJAs for each factorization (each choice of X and Y).

Theorem 1: Given any XY intersection MAJA for a self-dual function f^sd, then the output of the MAJA realizes the function f^sd.

Proof:* By construction the MAJA has no constant inputs. Therefore it is sufficient to prove that the MAJA produces all the ones of f^sd, since a MAJA without constant inputs must realize a self-dual function.1 It is easy to prove that if the term XY is one, the array output is one. Now let a term Xri in Xf0 be one. Then every literal in the term is one. Then every left boundary input, and the center input to every element in the ith row, is one. It is not difficult to prove that this condition suffices to insure that the array output is one. Thus the array output is one for every term in Xf0. Similarly, if a term Ytj in Yf0^d is one, then every center input in the jth column, as well as every top boundary input, is one. Again this suffices to insure that the array output is one. Thus every one of f^sd = XY + Xf0 + Yf0^d is realized at the output of an intersection MAJA for f^sd, and so the MAJA realizes f^sd.

The synthesis algorithm just presented allows one to synthesize any self-dual Boolean function. To extend the result to any arbitrary Boolean function, the "self-dual expression" for a function is defined.

Definition: Given any n-variable Boolean function f(x1, ..., xn), and a variable U independent of (x1, ..., xn), the Self-Dual Expression f^sd for f is defined as the (n + 1)-variable function:†

f^sd(U, x1, ..., xn) = U·f(x1, ..., xn) + Ū·f^d(x1, ..., xn)

*The proof given here is very sketchy. The detailed proof, which depends on a number of theorems not given here, is in reference 1.

†This is a reformulation of work done by S. B. Akers.2
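The self-dual expression defined above can be formed and checked mechanically. A sketch (ours, not the paper's; truth-table helpers as before):

```python
from itertools import product

def dual(f):
    return lambda *xs: 1 - f(*(1 - x for x in xs))

def self_dual_expression(f):
    """f^sd(U, x1..xn) = U*f(x) + U'*f^d(x), per the definition above."""
    fd = dual(f)
    return lambda U, *xs: (U & f(*xs)) | ((1 - U) & fd(*xs))

AND = lambda a, b: a & b
f_sd = self_dual_expression(AND)

# f^sd is self-dual, and setting U = 1 recovers f.
assert all(f_sd(u, a, b) == 1 - f_sd(1 - u, 1 - a, 1 - b)
           for u, a, b in product((0, 1), repeat=3))
assert all(f_sd(1, a, b) == AND(a, b) for a, b in product((0, 1), repeat=2))
```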
It is trivial to prove that the self-dual expression for any function is a self-dual function. Also, if f is a self-dual function, then

f^sd(U, x1, ..., xn) = f(x1, ..., xn).

Clearly this is true if and only if f is self-dual.

To synthesize an arbitrary n-variable function, proceed as follows:

1. Find the (n + 1)-variable self-dual expression, f^sd, for the function f.
2. Synthesize f^sd(U, x1, ..., xn).
3. Replace every input U to the array by the constant input 1 (one) and every input Ū by the constant 0 (zero).

The resulting array realizes f(x1, ..., xn) since f^sd(1, x1, ..., xn) = f(x1, ..., xn) by construction.

The examples to follow show arrays with the inputs U and Ū. Thus these arrays, as shown, realize the self-dual expression of the given function. Wherever the inputs U and Ū occur, they can be replaced by 1 and 0 as discussed above to obtain the array for the given function.

Note that the array for a self-dual function contains, by construction, no constant inputs. It can be shown that in any MAJA constant inputs are required if and only if the function being synthesized is non-self-dual.

In an intersection MAJA for a function, every term in the factored expression for the function corresponds to a single row or column in the MAJA. It can be shown1 that terms in the output function of an SBC MAJA can correspond not only to single rows and columns, but also to inputs (or elements) which do not form a single row or column. Thus it seems that the intersection matrix construction does not make maximum use of the MAJA. In other words, by realizing some terms in the function using a set of elements not from a single row or column, it is possible to realize many functions in an SBC MAJA considerably smaller than an intersection MAJA for the function. By extensions to this work, reduced non-SBC arrays can be derived also, but the methods become much messier and less algorithmic.

It is not possible in the space available here to discuss reduction techniques. However, the following examples show some arrays in reduced form, as well as the original intersection arrays.

While it is possible to construct an intersection array from any factorization of the form

f^sd = XY + Xf0 + Yf0^d

with f0 and f0^d each expressed as a sum of product terms, it is obvious that the smallest array results from choosing X and Y and the expressions for f0 and f0^d to minimize the number of terms in f0 and in f0^d. This is done in the following examples.
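The behavior of an SBC MAJA can be sketched in code. The inter-element wiring assumed below (each element takes the output of the element above it, the output of the element to its left, and its own center input, with the boundary signals feeding the first row and column, and the array output taken at the bottom-right element) is our reading of the construction; the figure defining the array is not reproduced in this excerpt, so treat this as a sketch under that assumption:

```python
from itertools import product

def maj3(a, b, c):
    return (a & b) | (a & c) | (b & c)

def sbc_maja(X, Y, centers):
    """Evaluate an XY SBC MAJA: all top boundary inputs carry Y, all left
    boundary inputs carry X; centers[i][j] is the center input of element ij.
    Assumed wiring: element ij takes the outputs of elements (i-1,j) and
    (i,j-1), or the boundary signal on the first row/column."""
    rows, cols = len(centers), len(centers[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            above = Y if i == 0 else out[i - 1][j]
            left = X if j == 0 else out[i][j - 1]
            out[i][j] = maj3(above, left, centers[i][j])
    return out[rows - 1][cols - 1]

# Intersection MAJA for f^sd = maj(a,b,c), factored on X = a, Y = a':
# f0 = b + c and f0^d = b*c give a 2 x 1 array with center inputs b, c.
for a, b, c in product((0, 1), repeat=3):
    assert sbc_maja(a, 1 - a, [[b], [c]]) == maj3(a, b, c)
```

Under this wiring the 2 × 1 intersection array does realize the majority function, consistent with Theorem 1.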
SYNTHESIS EXAMPLES
Before giving examples of synthesis by Theorem
1, it is useful to define a notation which will be
used in examples throughout the rest of this work.
In the many examples to follow in this and succeeding sections it is necessary to show arrays with
variables assigned to the inputs. Since the interelement connections in an array are fixed, an array
with inputs can be completely specified by giving
each boundary input and the center input to each
element of the array. These inputs are presented as
a matrix, with a line separating top and left boundary variables from the center input variables. Thus
the array of Fig. 3 is represented by
[The input matrix for the array of Fig. 3, with the top-boundary row and left-boundary column separated by a line from the center-input entries, is garbled in this scan.]
Clearly this representation is completely general;
it is not restricted to SBC arrays.
Example 1: f(A,B,C,D) = Σ0,1,4,6,7,8,11,12,13,14.*

A minimum Sum of Products (MSP) form of the Self-Dual Expression for this function (the complementation bars on individual literals are garbled in this scan) is:

f^sd = BCDU + ABCDU + ABCD + ABCD + ABC + ABCU + ABCU + ABCDU + BDU + CDU

This function can be factored on BB̄ or CC̄ without increasing the number of product terms (10) in the expression, since the term BDU can be written BCDU, or CDU can be written B̄CDU, without changing f^sd. Arbitrarily choose the BB̄ factorization.

*This notation, defined in Caldwell,3 defines the rows of the truth table for which the function is one.
Figure 3. SBC MAJA for Example 1. [The array, with output f(A,B,C,D) and top boundary inputs B, is not reproduced in this scan.]
The intersection matrix is

[The matrix of literal intersections is garbled in this scan.]
Example 2: f(A,B,C,D) = Σ1,4,5,6,7,9,11,12,13,15

The MSP Self-Dual Expression f^sd = BD + ACD + ABC + ABU + ADU (the complementation bars on individual literals are garbled in this scan) can be written as

f^sd = BD + B(AC + AU) + D(AC + AU)

One intersection MAJA is

[input matrix garbled in this scan]

The SBC MAJA is

[input matrix garbled in this scan]

This is the smallest SBC MAJA which can possibly realize this function, since six different literals must appear as inputs, and no smaller SBC array has six inputs.

By reduction techniques† described elsewhere,1 a 2 × 3 non-SBC array can be found to realize this function:

[input matrix garbled in this scan]
Example 3: f(A,B,C,D) = Σ1,2,5,7,11

The MSP Self-Dual Expression is f^sd = ACD + ABD + ABCD + ... + ABU + ACU (the full expression, including the complementation bars on individual literals, is garbled in this scan, as are the bars throughout this example). Factor on DD̄ for the minimum number of product terms in the factored expression:

f^sd = D̄(AC + ABCU + ABU + ACU) + D(AC + AB + ABC + BU + CU)

A corresponding SBC intersection MAJA is:

[input matrix garbled in this scan]

By reshuffling rows and columns, it becomes possible to remove the row corresponding to term ACDD and the column corresponding to term CDU:

[input matrix garbled in this scan]

The term CDU is realized by the center inputs to elements 31, 42, and 43, and the left boundary. The term ACBU is realized by the center inputs to elements 11, 21, 31, 42, and the top boundary. This is the smallest known SBC array for this function. However, there exists a 2 × 3 non-SBC majority array for the function:

[input matrix garbled in this scan]

A 2 × 3 array is known to be the smallest array capable of realizing this function, since no smaller array has enough terminals. The 2 × 3 array shown here has one element which performs no logical function because it has two identical (A) inputs. The 5-element network resulting from removal of element 21 has the absolute minimum number of 3-input elements for any network capable of realizing this function.

Arrays of size 2 × 3 or 3 × 3 appear to be typical for non-self-dual functions of four variables.1

†These techniques are heuristic, and results obtained depend to some extent on the experience of the person doing the reduction.
THE CANONIC ARRAY
The two major disadvantages of the synthesis
method presented above are:
1. The lack of reasonable bounds on the size
of array needed to realize an arbitrary
function.
2. The inability to apply reduction procedures to functions of more than five or six
variables. *
The development of a canonic form for arrays for
arbitrary functions of n variables as is done below
obviates these disadvantages. This canonic form has
the following properties:
1. The canonic array for n variables, for a given n, is an array of fixed size, with some inputs fixed and the rest of the inputs chosen for the specific function (typically, well over half of the inputs are fixed). This array will realize any given function of n variables if the nonfixed inputs are properly chosen. An algorithm for determining the inputs needed to realize any given function exists.

*Note, however, that the basic synthesis algorithm (Theorem 1) can be applied to arbitrarily large functions, though the resulting arrays generally are unreasonably large.
2. An algorithm exists for generating the canonic array for any given n.
3. The canonic array for n variables is the
smallest known array to realize the checkerboard (worst-case) function of n variables, for n even.
4. For most given functions the canonic array
is reducible (by methods given in reference
1) .
5. The canonic array embodies a technique
for embedding arrays within larger arrays,
which shows great promise for future work
in multiple output arrays and in nonmajority (nonminority) arrays.
The disadvantage of the canonic array is that the array required for a given function usually is larger than the array produced by intersection synthesis and reduction, assuming that the function is small enough to make that synthesis-reduction technique feasible.†

The size of the canonic array is shown in Table 1 as a function of n, the number of variables in the function to be synthesized. In addition a "connection count" is shown for each n. This is the number of connections to the array which are not invariant over all functions of n variables, plus one connection for each variable whose input connections are invariant. One can envision building the array with all invariant connections wired together at the time of manufacture. Then the "connection count" is just the number of input terminals to the array needed to allow it to realize any function of n variables.
Table 1.

n    Size       Connections
3    3 × 3      10
4    4 × 6      16
5    7 × 8      26
6    9 × 14     44
7    15 × 18    78
8    19 × 30    144
9    31 × 38    274
†Reduction is feasible on most functions of five variables and some functions of six variables.1
CONSTRUCTION OF CANONIC ARRAYS
In this section the canonic array for n variables
is presented through examples. In the next section
the process of embedding subarrays in an array is considered more generally.
Consider the MAJA

[Array 1, a U Ū SBC MAJA with center inputs go and g1, garbled in this scan]

in which go and g1 are input literals chosen from the set {B, B̄, U, Ū}. This array, as straightforward analysis will show, realizes the self-dual function

[Eq. (3), garbled in this scan]

which can be any self-dual function of the three variables (A, B, U). If U = 1, Eq. (3) becomes

f = Āgo + Ag1    (4)

which, if go and g1 are chosen from {B, B̄, 0, 1}, can be any function of the two variables (A, B). Thus Array 1 is the canonic factorization array for n = 2 variables. It is called a "factorization" array because Eq. (4) is a factorization of the function. It is called "canonic" because it is in a standard form, as will become clear later.

Now consider the MAJA

[Array 2, a U Ū SBC MAJA with center inputs goo, g01, g10, and g11, garbled in this scan]

If U = 1, this array realizes the function

f = ĀB̄goo + ĀBg01 + AB̄g10 + ABg11    (5)

If goo, g01, g10, and g11 are chosen from {C, C̄, 0, 1}, then Eq. (5) can be any function of the three variables (A, B, C). Array 2 is called the canonic factorization array for n = 3 variables even though its form differs slightly from the canonic construction to be defined.

It would be possible to continue thus to define arrays to realize any function of n variables for n = 4, 5, 6, and so on. However, if one wishes to define a canonic factorization array for an arbitrary number of variables, it is necessary to define a construction method which will result in the canonic array for any given n. Here the approach taken is inductive. Given the canonic factorization array for (n − 1) variables, it will be shown how to construct the array for n variables. This will be done by embedding two arrays for (n − 1) variables in the factorization array for n variables. Consider first the case of n = 4. The array for (n − 1) = 3 variables is known (Array 2). Take two of them:

[Array 3, with center inputs g000, g001, g010, and g011, garbled in this scan]

Array 3, with U = 1, realizes

go = B̄C̄g000 + B̄Cg001 + BC̄g010 + BCg011    (6)

[Array 4, with center inputs g100, g101, g110, and g111, garbled in this scan]

Array 4, with U = 1, realizes

g1 = B̄C̄g100 + B̄Cg101 + BC̄g110 + BCg111    (7)

Combine Array 3 and Array 4 in the canonic factorization array for n = 4:

[Array 5, a 4 × 6 U Ū SBC MAJA, garbled in this scan]

where dotted lines have been shown only to clarify the construction of the array. It should be emphasized that Array 5 is a 4 × 6 U Ū SBC MAJA, with no modification of its structure. The array contains subarrays only in the sense that the input pattern to portions of the array can be identified with the input patterns to Array 3 and Array 4.

Array 5, with U = 1, realizes the function

f = Āgo + Ag1    (8)
where go and g1 are the arbitrary 3-variable functions realized by Array 3 and Array 4. To see this, let U = 1 (and Ū = 0, of course). Then if A = 1, the subarray corresponding to Array 4 (elements 14, 15, 16, 24, 25, 26, 34, 35, and 36) has one on its top boundary (elements 14, 15, 16), and zero on its left boundary (elements 14, 24, 34). It is not difficult to see that with A = 1, Array 5 has the same output as Array 4. Similarly, if A = 0, then the subarray corresponding to Array 3 has zero on its left boundary (elements 21, 31, 41) and one on its top boundary (elements 21, 22, 23), and it can be shown that Array 5 has the same output as Array 3. Thus Eq. (8) is verified.

By substitution of Eqs. (6) and (7) into Eq. (8), one obtains

f = ĀB̄C̄g000 + ĀB̄Cg001 + ... + ABC̄g110 + ABCg111    (9)

If g000 through g111 are chosen from {D, D̄, 0, 1}, then Eq. (9) can be any function of the four variables (A, B, C, D).

By interchanging rows and columns in Array 5 and then interchanging U and Ū and changing the subscripts on the g inputs appropriately, one obtains the MAJA

[Array 6, garbled in this scan]

which realizes the same function, Eq. (9), as does Array 5. This is the flipped canonic factorization array for four variables, "flipped" because it corresponds to Array 5 flipped about its main diagonal (and with U, Ū interchanged and the g's renumbered).
To construct the canonic factorization array for five variables one embeds two 4-variable subarrays in a factorization array in exactly the same manner in which Array 5 was constructed. If one uses as subarrays two copies of Array 5, the resulting 5-variable array is 5 × 12, with 60 elements. If, however, one uses the flipped array, Array 6, the resulting 5-variable array is 7 × 8, with 56 elements:

[Array 7, a 7 × 8 U Ū SBC MAJA with center inputs g0000 through g1111, garbled in this scan]

Again, the dotted lines are included to clarify the construction. Array 7 realizes the function

f = ĀB̄C̄D̄g0000 + ĀB̄C̄Dg0001 + ... + ABCDg1111

It is interesting to note that the canonic factorization array is the smallest array known which realizes the "checkerboard" function f(A,B,C,D) = A ⊕ B ⊕ C ⊕ D.* The canonic factorization array for this function is

[input matrix garbled in this scan]
However, for five variables no function is known which cannot be realized in a reduced intersection MAJA smaller than Array 7. It is true in general that no function is known to be "worst case" for n odd, although the "checkerboard" function is always "worst case" for n even.
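The n-even versus n-odd asymmetry admits a quick sanity check: complementing all n inputs of the checkerboard (parity) function flips its value n times, so parity is self-dual exactly when n is odd. A small sketch (ours, in Python):

```python
from itertools import product

def parity(*xs):
    """The checkerboard function: 1 when an odd number of inputs are 1."""
    return sum(xs) % 2

def is_self_dual(f, n):
    return all(f(*xs) == 1 - f(*(1 - x for x in xs))
               for xs in product((0, 1), repeat=n))

assert is_self_dual(parity, 5)       # n odd: self-dual
assert not is_self_dual(parity, 4)   # n even: not self-dual
```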
The construction of canonic factorization arrays for higher values of n is carried out by successive embedding of (n − 1)-variable flipped arrays, as was just illustrated for n = 5. Let Hn denote the height of the canonic factorization array for n variables, and let its width be Wn. Then, by the construction of the canonic factorization array,

H3 = 3,  W3 = 3

and

Hn = W(n−1) + 1,  Wn = 2H(n−1)  for n > 3
These array sizes are tabulated in Table 1. Note that in the canonic factorization array for n variables {x1, x2, ..., xn}, 2^(n−1) of the inputs, the g inputs, depend on the function being realized, while all other inputs are fixed, independent of the particular function being realized and equal to one of the 2n literals {x1, x̄1, x2, x̄2, ..., x(n−1), x̄(n−1), U, Ū} (U and Ū being in fact constants, of course), so that if all identical fixed inputs are wired together at the time of manufacture, only 2n + 2^(n−1) external connections to the array need be provided. This "connection count" is also tabulated in Table 1.

*This function is termed "checkerboard" because its Karnaugh Map representation resembles a checkerboard.
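The recurrence above, together with the connection count 2n + 2^(n−1), reproduces Table 1 directly; a quick sketch (ours):

```python
def canonic_size(n):
    """Height and width of the canonic factorization array, n >= 3."""
    h, w = 3, 3                  # H3 = W3 = 3
    for _ in range(n - 3):       # Hn = W(n-1) + 1, Wn = 2*H(n-1)
        h, w = w + 1, 2 * h
    return h, w

def connection_count(n):
    """External connections: 2n fixed-literal terminals + 2^(n-1) g inputs."""
    return 2 * n + 2 ** (n - 1)

assert canonic_size(4) == (4, 6) and connection_count(4) == 16
assert canonic_size(9) == (31, 38) and connection_count(9) == 274
```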
As a final illustration, the canonic factorization array for n = 6, f(A,B,C,D,E,F), is shown below, with solid lines indicating the two 5-variable subarrays and dotted lines indicating the four 4-variable sub-subarrays.

[The 9 × 14 input matrix is garbled in this scan.]

where the 32 nonfixed inputs g00000 through g11111 are all denoted simply g.
The factorization array, unless it has prewired fixed inputs, can often be reduced for a given function. The factorization array factors a function into subfunctions, each of which is in turn factored until eventually each sub-subfunction consists of a single literal. Each subfunction is realized in a UŪ SBC subarray. Many of the subfunctions may have UŪ SBC MAJA realizations smaller than the one used in the factorization array. These can be substituted for the standard subarray with a corresponding decrease in array size.
EMBEDDED SUBARRAYS

In this section the process of embedding subarrays will be considered in general. If one has two UŪ SBC MAJA's realizing two self-dual functions go^sd and g1^sd, then an array for

f^sd = VXgo + VYg1 + WXg1^d + WYgo^d + VW + WXY

where X and Y are any single literals, can be constructed as follows. If the array for go^sd is denoted

[a UŪ SBC array whose center inputs are denoted θ; garbled in this scan]

where θ denotes the various inputs of the array for go^sd, and if the array for g1^sd is denoted, similarly,

[a UŪ SBC array whose center inputs are denoted ε; garbled in this scan]

then these two arrays can be embedded in a WV SBC MAJA (Array 8, garbled in this scan) to realize the self-dual function f^sd above.
In this illustration the arrays for go^sd and g1^sd have been assumed to be of equal height. This need not be true in general.

The formal definition of this construction follows. Given a self-dual function f^sd(x1, x2, ..., xn), express it in a VW factorization as f^sd = Vg + Wg^d + VW, where the function g may or may not be self-dual. This is always possible if V and W are properly chosen from {x1, x̄1, x2, ..., xn}. Then factor g as g = Xgo + Yg1. Again, this is always possible. Then

f^sd = VXgo + VYg1 + WXg1^d + WYgo^d + VW + WXY

with V, W, X, and Y chosen from {x1, x̄1, x2, ..., xn}.

Now construct an SBC UŪ MAJA to realize go^sd = Ugo + Ūgo^d. Call this array A0. Let its size be h0 × w0. Similarly construct the SBC UŪ MAJA A1, of size h1 × w1, to realize g1^sd = Ug1 + Ūg1^d. Note that there is no restriction on how A0 and A1 are constructed, or on their size. It will now be shown that the two arrays A0 and A1 can be embedded in an SBC WV MAJA, A, of size h × (w0 + w1), which realizes f^sd, where h equals the larger of h0 + 1 and h1 + 1.
Let the center input to element ij of A0 be called a⁰ij, defined for all i, j: 1 ≤ i ≤ h0, 1 ≤ j ≤ w0. Similarly, let the center input to element ij of A1 be called a¹ij, defined for all i, j: 1 ≤ i ≤ h1, 1 ≤ j ≤ w1. Then the inputs to the h × w array A are assigned as follows, where h = the larger of h0 + 1 and h1 + 1, w = w0 + w1, and aij denotes the center input to element ij of A:

For 1 ≤ i ≤ h − h0 and 1 ≤ j ≤ w0:  aij = X
For h − h0 < i ≤ h and 1 ≤ j ≤ w0:  aij = a⁰kj, with k = i − (h − h0)
For 1 ≤ i ≤ h1 and w0 < j ≤ w:  aij = a¹i(j−w0)
For h1 < i ≤ h and w0 < j ≤ w:  aij = Y

This specifies every input to A in terms of X and Y and the inputs to A0 and A1. Array 8 is an example. It can be proved1 that the array just defined has as output the function f^sd(x1, ..., xn).
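The four assignment rules can be transcribed directly into code. A sketch (ours; matrix representation and names are assumptions, not the paper's):

```python
def embed(a0, a1, X, Y):
    """Center inputs of the h x (w0+w1) array A embedding A0 and A1,
    following the four rules above. a0, a1: center-input matrices of
    A0 (h0 x w0) and A1 (h1 x w1); indices in the rules are 1-based."""
    h0, w0 = len(a0), len(a0[0])
    h1, w1 = len(a1), len(a1[0])
    h, w = max(h0 + 1, h1 + 1), w0 + w1
    A = [[None] * w for _ in range(h)]
    for i in range(1, h + 1):
        for j in range(1, w + 1):
            if j <= w0:
                # Left block: X filler above, then A0's inputs.
                A[i - 1][j - 1] = X if i <= h - h0 else a0[i - (h - h0) - 1][j - 1]
            else:
                # Right block: A1's inputs on top, then Y filler.
                A[i - 1][j - 1] = a1[i - 1][j - w0 - 1] if i <= h1 else Y
    return A

# A0 is 2 x 2 and A1 is 1 x 1, so A is 3 x 3 (h = max(3, 2), w = 3).
A = embed([["p", "q"], ["r", "s"]], [["t"]], "X", "Y")
assert A == [["X", "X", "t"],
             ["p", "q", "Y"],
             ["r", "s", "Y"]]
```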
It is very important to observe that the only restrictions on the arrays A0 and A1 are:

1. That they are SBC arrays with U and Ū as boundary variables.
2. That they realize go^sd = Ugo + Ūgo^d and g1^sd = Ug1 + Ūg1^d, respectively.

Condition (2) is not equivalent to the condition (2'): that when U = 1 and Ū = 0, A0 and A1 realize go and g1 respectively.

Since the subarrays A0 and A1 can be any UŪ SBC MAJA's realizing the functions go^sd and g1^sd respectively, it is possible to construct one or both of A0 and A1 themselves as factored arrays. In fact, the canonical factorization array for n variables is just a factored array with each subarray factored and each sub-subarray factored, and so on, until each sub-sub-...-subarray is a 3 × 3 canonical array which realizes a function of only three variables.
To illustrate the use of factored subarrays in a factored array in the general case, express go^sd and g1^sd as

go^sd = URogoo + USog01 + ŪRog01^d + ŪSogoo^d + ŪRoSo

and

g1^sd = UR1g10 + US1g11 + ŪR1g11^d + ŪS1g10^d + ŪR1S1

and realize each of them in factored UŪ arrays, which are used as subarrays in the array for f^sd. Figure 4 shows the construction of the resulting array. In this array the function f^sd has been factored as

f^sd = VXRogoo + VXSog01 + VYR1g10 + VYS1g11 + VYR1S1 + WXR1g11^d + WXS1g10^d + WYRog01^d + WYSogoo^d + WXY + VW

(For the sake of illustration it has been assumed in Fig. 4 that goo^sd can be realized in a 2 × 2 SBC UŪ MAJA, while g01^sd, g10^sd, and g11^sd each require a 3 × 3 SBC UŪ MAJA.)
A study has been made of two-dimensional arrays of three-input one-output gates, or elements, each element realizing the majority function of its three inputs (f(A,B,C) = AB + AC + BC). These arrays are functionally equivalent to arrays of minority elements (f(A,B,C) = ĀB̄ + ĀC̄ + B̄C̄).
Figure 4. Four subarrays embedded in an SBC factorization MAJA. [The figure is not reproduced in this scan.]
SUMMARY
Two methods are developed for synthesizing any
given Boolean function in an array. The first method
results in an array whose size depends on the particular function being realized. The second method
results in an array whose size depends only on the
number of variables in the function being realized.
Any 4-variable function, for example, can be realized
in an array of 24 elements or less.
The principal result of this work is a simple algorithmic synthesis procedure with the following properties:
1. It is based on building blocks (arrays) which are characterized solely by their width and height, and which contain only simple three-input, one-output elements of one type, with a maximum output load of two elements each.
2. It results in arrays obeying a known upper
bound on size that seems reasonably small.
3. It permits the synthesis of any Boolean function of n variables by specifying no more than 2^(n−1) inputs to the array.
4. It permits the logical decomposition of the
array into subarrays, corresponding to a
decomposition of the function into subfunctions, with no physical modification of
the array.
5. It results in circuits (arrays) with a longer delay, and hence lower speed, than conventional logic circuits.
6. It often requires more elements to realize a
given function than do methods less constrained in element type and interconnection.
REFERENCES
1. R. H. Canaday, "Two-Dimensional Iterative Logic," Report ESL-R-210, Electronic Systems Laboratory, Massachusetts Institute of Technology, Cambridge, Mass. (Sept. 1964). The same material appears in: R. H. Canaday, "Two-Dimensional Iterative Logic," M.I.T. Department of Electrical Engineering, Ph.D. Thesis, Sept. 1964.
2. S. B. Akers, Jr., "The Synthesis of Combinational Logic Using Three-Input Majority Gates," Third Annual Symposium on Switching Circuit Theory and Logical Design, Chicago, October 7-12, 1962.
3. S. H. Caldwell, Switching Circuits and Logical Design, Wiley and Sons, New York, 1958.
TWO-RAIL CELLULAR CASCADES*
Robert A. Short
Stanford Research Institute
Menlo Park, California
INTRODUCTION

The increasing importance of integrated circuit technologies has motivated research into the development of systematic and efficient procedures for the design of cellular arrays, that is, arrays of logical assemblies, or cells, that are interconnected in a regular fashion. A useful and analytically attractive approach to the design of two-dimensional, edge-fed cellular arrays for the realization of arbitrary switching functions is based upon the decomposition of the arbitrary function into a set of subfunctions, each of which is independently produced by one of the columns of the array. In this approach, each column of the array might realize, for example, an individual member of a minimum covering set of prime implicants; these subfunctions are then composed by "collecting" the column outputs in a special row of the array whose final output is a realization of the desired function, or alternatively by using edge jumpers in the same array to accomplish the collecting function.

The total number of cells in such arrays typically grows as Cn2^n, where the exponential factor reflects the growth of the width of the array, and where C is a constant determined by the particular design algorithm and the logical capabilities of the individual columns. For example, in a minterm composition C may approach the value of ~. The successful reduction of the constant C has been achieved by enlarging the class of functions that can be realized by a single column. In particular, for the simplest possible interconnection structure with which a column, or cascade, can be constructed (one using two-input single-output cells) more sophisticated design techniques have been developed that show that C need not exceed the order of Vs for large n.

In this paper an augmentation of this simplest interconnection structure will be examined; in particular, provision will be made for two connections between each of the cells of a column in an attempt to increase significantly the number of functions that are realizable within a single column. This simple structural augmentation that provides one more interconnecting lead between cells proves to be significant, for it results in a structure that is functionally complete; that is, any function of an arbitrary number of variables can be combinationally realized within the extent of a single column.

Before summarizing these results, as well as others directed toward reducing the number of cells required for such cascades, a brief review is given below of pertinent prior results.

*The research reported in this paper was sponsored by the Air Force Cambridge Research Laboratories, Office of Aerospace Research, under contract AF 19(628)-4233.
Figure 1. Array composition for simplest cell. [Panels (a), (b), and (c) are not reproduced in this scan.]
SINGLE-RAIL CASCADES

The simplest logic cell from which a single column, or cascade, can be constructed is the familiar two-input cell indicated in Fig. 1a. It is assumed that the cell is sufficiently complex that any of the two-variable functions of x and y can be specified for it, as noted by the index j on the cell itself. Thus, for the single cell output, f = fj(x, y), where j can range over any of the 16 possible 2-variable functions.

The cascade that results from the interconnection of such cells is indicated in Fig. 1b. The independent variables enter the cells directly, e.g., x_i2 provides an input to the second cell; for convenience, the topmost free input, y, will be taken as a binary constant. The output of the cascade is formed on the single free output at the bottom and is some function of the input variables, in particular

f = f_n[x_in, f_n−1(x_in−1, ...)]
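The nested composition above can be evaluated from the top of the column downward. A sketch (ours; the cell functions and inputs below are an illustrative choice, not a design from the paper):

```python
def cascade(cells, xs, y0=0):
    """Evaluate a single-rail cascade: cell j computes f_j(x_j, y), where
    y is the previous cell's output and y0 is the free topmost input."""
    y = y0
    for f, x in zip(cells, xs):
        y = f(x, y)
    return y

# Three XOR cells compute the parity x1 ^ x2 ^ x3, a classic example of a
# function realizable in such a cascade.
xor = lambda x, y: x ^ y
assert cascade([xor, xor, xor], [1, 1, 1]) == 1
assert cascade([xor, xor, xor], [1, 0, 1]) == 0
```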
One way of utilizing such cascades in two-dimensional edge-fed rectangular arrays is indicated in Fig. 1c, where the second output for each cell is formed by reproducing the horizontal input. This scheme has been utilized in the cutpoint arrays developed by Minnick.1 In these arrays, where the horizontal cell inputs are functions of the independent variables and are not functions of neighboring columns, the basic functional building block remains the column itself. The functional capabilities of cascades of the type shown in Fig. 1b have been extensively investigated.
The case where the external independent variables x1, ..., xm correspond one-for-one with the cell inputs x_i1, ..., x_in has been studied by Maitra,2 Sklansky,3 and Levy, Winder and Mott.4 Design algorithms for producing the limited class of functions are available. Obviously the same cascade will suffice for the realization of any function in the class of functions obtained by permuting the independent variables; the total number of such realizable equivalence classes has been determined by Stone.5 The
point of present interest, however, is that the number of such functions comprises an increasingly
insignificant proportion of the totality of possible
functions with increasing m. There are even some
three-variable functions that cannot be produced in
such elemental cascades, e.g., the majority function.
A refinement which increases the number of functions producible by this single-rail cascade ("single-rail" referring to the single connection between cells of the cascade) was demonstrated by Minnick,6 who showed that a one-to-many correspondence between the independent variables and the horizontal cell inputs results in a larger class of realizable functions. These cascades, in which m ≤ n, have been called redundant cascades, and result in the use of certain of the independent variables to form more than a single cell input. Design algorithms have also been developed for these redundant cascades,7 but again the pertinent point here is that the class of functions, while larger than the irredundant case for all m ≥ 3, still forms a vanishing proportion of all functions for sufficiently large m. Indeed the redundant cascades still do not quite suffice for the three-variable case.
There is no other way in which the elemental binary two-input single-output cell can be generalized
to augment further the number of producible functions. Although arbitrary functional complexity is
permitted within the cells, * the interconnection
complexity is simply not sufficient for complete
functional capability in a single column.
With this fundamental question disposed of, the
next-rank question becomes of interest: namely,
how does the functional capability of cascades grow
with increasing liberality in the allowable interconnection structure? Does any linear cascade exist that
does achieve complete functional capabilities?
There would appear to be only two ways in
which the interconnection structure of cascades can
be liberalized within the basic cascade structure,
which requires that a given cell receive inputs
(other than external) only from its immediate predecessor, and supplies outputs only to its immediate
successor. In one of these ways the number of external inputs to a cell can be augmented; in the other, the number of connections between cells can be increased. (A distinction should be made between this latter alternative and the rather intricate interconnection patterns exemplified by the "cobweb" arrays that have been proposed by Minnick.8 In the cobweb array, each cell has as available inputs not only horizontal and vertical buses, but also inputs from the cell above, the cell to the right, and one of the cells located a knight's move above. Each cell also has an augmented switching network (e.g., cutpoints) by which it selects any two of the five inputs to which it responds, and the particular output that it will activate. The resulting designs achieve a much more efficient utilization of the cells of the array, but the essential function-composition capabilities are still those that characterize the single-rail cascade.)
The first of these alternatives has been investigated by Lendaris and Stanley,9 and on the basis of their results can be quickly dismissed. The basic cascade with m external inputs per cell is indicated in Fig. 2, and it is assumed that each cell can produce any of the (m + 1)-variable functions on its output. In summary, although the class of functions realizable in such a cascade is increased (obviously all (m + 1)-variable functions are realizable, for example), it is still deficient* and becomes increasingly so, as noted for the previously mentioned cascades, as the number of variables increases.
The second alternative, an increase in the number of interconnections between cells, is indicated in Fig. 3 for the case of two such cell interconnections, i.e., the "two-rail" generalization of the elemental single-rail cascade previously discussed.10
Again it will be assumed that the cells are individually as complex as need be and, in this case, that
*Although arbitrary functional complexity in each cell has been assumed, it is interesting to note that such complexity is not necessary. In fact, Minnick1 has demonstrated that any one of 64 different sets of 6 functions each is sufficient for these purposes. This result serves only to reduce the individual cell complexity (i.e., to reduce the range on j in Fig. 1a), but not to reduce the number of cells required in a cascade.
*It should be noted that these cascades have not been
examined in the redundant situation; there is no reason to
expect any basic difference in the results, however.
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965
any three-variable function of the cell inputs can be
formed (again specified by the j-index). If necessary, variable redundancy will also be permitted,
and as before the topmost y-inputs will be taken as
binary constants. However, no greater generalization than the two-rail cascade will be considered;
for as will be shown below, the set of functions
realizable is not only greatly augmented, but indeed
includes the entire class of possible functions. That
is, the two-rail cascade is complete.
Figure 2. Basic multiple-external-input cascade.
TWO-RAIL CASCADES

Logical Completeness

The isolated basic two-rail cell is shown in Fig. 4. The cell has three inputs (one is externally supplied) and two outputs. The index j indicates which of the three-variable functions of the cell inputs are produced independently by the cell outputs.
The greater flexibility offered by such a cell (which of course must be measured against the greater cell complexity) is suggested by the cascade of Fig. 5, which will realize any four-variable switching function depending upon the specialization of the cells. In particular, if the index notation

j : f1(x, y1, y2), f2(x, y1, y2)

is used to specify the functional outputs of a given cell, then the assignments of

j1 : x, 0
j2 : y1, x
j3 : g1(x, y1, y2), g0(x, y1, y2)
j4 : arbitrary, xy1 + x'y2

in the cascade of Fig. 5 will realize any

f(x1, x2, x3, x4) = x4g1 + x4'g0

on the second output of the terminal cell. (It may further be noted that this realization follows the previous forms in that only one independent variable is accommodated per cell. If this requirement is relaxed the top two cells are seen to be redundant and only the remaining two cells are necessary.)
It further follows that edge-fed rectangular arrays can be constructed utilizing such cells (again simply bussing the x-input through to provide the intercolumn connections, and utilizing a Shannon-type composition on the last n - 4 variables) that would exhibit a growth rate of Cn2^n, where C is no greater than 1/16. Thus an improved growth rate compared with the two-input cell is readily exhibited, but it must be remembered that the individual cells are more complex.*

*It also immediately follows that array growth rates characterized by any arbitrarily small value of C can be constructed simply by augmenting the number of vertical rails to the required number. It also follows that in general the individual cell complexity would rapidly increase.
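The completeness of the Fig. 5 specialization can be traced numerically. The sketch below is an illustrative Python model (the cell wiring and argument ordering are assumptions made for the simulation, not the paper's notation); it applies the assignments j1 through j4 and checks that the terminal cell's second output reproduces f = x4g1 + x4'g0 when g1 and g0 are taken as the Shannon cofactors of an arbitrary sample function:

```python
from itertools import product

def two_rail_cascade(x, g1, g0):
    """Second output of the terminal cell under the j1..j4 assignments."""
    x1, x2, x3, x4 = x
    y1, y2 = x1, 0                            # j1 : x, 0
    y1, y2 = y1, x2                           # j2 : y1, x  -> rails carry (x1, x2)
    y1, y2 = g1(x3, y1, y2), g0(x3, y1, y2)   # j3 : g1, g0
    return (x4 & y1) | ((1 - x4) & y2)        # j4 second output: x y1 + x' y2

def f(x1, x2, x3, x4):
    """An arbitrary sample four-variable function."""
    return (x1 ^ x2) & (x3 | x4)

# g1 and g0 are the cofactors of f with respect to x4.
g1 = lambda x, y1, y2: f(y1, y2, x, 1)
g0 = lambda x, y1, y2: f(y1, y2, x, 0)
assert all(two_rail_cascade(xs, g1, g0) == f(*xs)
           for xs in product((0, 1), repeat=4))
```

Since any four-variable function decomposes into such cofactors, the cascade is complete over four variables, as claimed.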
Figure 3. Basic two-rail cascade.

Figure 4. The two-rail cell.
However, the cascade of Fig. 5 exhibits another property of two-rail cells that can be utilized to enhance even further the number of functions realizable in a single column. In particular, the last cell of the cascade has two independent functions of the preceding variables presented on its vertical inputs. Thus the action of the last cell can be selectively applied to portions of the previously generated results; in the example shown the last variable can act selectively on the functions g1 and g0 to obtain x4g1 and x4'g0 before composing these in turn for the final result.
final result. This capability suggests a different view
of the possible tasks of the two rails of a two-rail
cellular cascade. Specifically, it is possible that one
of the rails could be designated as an "accumulator" to which partially derived results can be added
(logically) after they are computed in the functional
line of the other rail.
For example, it is well known that arbitrary functions can be composed by simply adding the component minterms. In the two-rail cell it is possible to
do this directly. Thus, again let /1 and /2 refer to
the two cell outputs and x, Yl and Y2 to the inputs,
and assume that It can be specified as anyone of
360
PROCEEDINGS -
FALL JOINT COMPUTER CONFERANCE,
1965
Figure 5. Realization of arbitrary four-variable function.
the set of functions (XY1, X'Yl, 1) and that /2 can
be specified as anyone of the set (Y2, XYl + Y2,
.x'Yl + Y2). Then any arbitrary minterm (or prime
implicant) can be formed on the it leads and, when
completed, it can be added to the sum being accumulated on the 12 leads. At the same point, a new
min term can be commenced on the it leads. It is
apparent that the resulting cascades will be redundant in the sense that the variables are required at
more than one cell; it also follows immediately that
any function is realizable in a single cascade of such
cells.
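As a concrete model of the accumulator view, the following sketch (hypothetical Python, with the minterm ordering and cell choices assumed as described above) drives one two-rail column through a list of minterms, building each product on the first rail and ORing the finished product into the second rail at the minterm's last cell:

```python
from itertools import product

def minterm_column(n, minterms, xs):
    """One two-rail column: rail 1 builds each minterm of x1..xn in order;
    rail 2 (the 'accumulator') takes the completed minterm at its last cell."""
    y1, y2 = 1, 0                        # boundary constants at the top
    for m in minterms:                   # m is a tuple of n bits
        for j, (bit, x) in enumerate(zip(m, xs)):
            lit = x if bit else 1 - x    # literal x_j or x_j'
            if j < n - 1:
                y1, y2 = lit & y1, y2            # f1 = x y1 (or x'y1), f2 = y2
            else:
                y1, y2 = 1, (lit & y1) | y2      # f1 = 1 restarts, f2 = x y1 + y2
    return y2

def f(x1, x2, x3):
    """An arbitrary sample function to be realized from its minterms."""
    return (x1 & (1 - x2)) | (x2 & x3)

minterms = [m for m in product((0, 1), repeat=3) if f(*m)]
assert all(minterm_column(3, minterms, xs) == f(*xs)
           for xs in product((0, 1), repeat=3))
```

Each minterm consumes exactly n cells, consistent with the cell count derived next.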
If it is assumed, then, that a complete minterm expansion for a function of n variables is utilized in a single two-rail column, the total number of cells, N, required is

N ≤ n·2^n.

In terms of the number of cells required, this growth rate is no improvement on the most elemental rectangular array that also utilizes a minterm expansion. It does, however, achieve this rate within the desired single-column restriction.
Alternative realizations to the basic two-rail cell described above are possible and can be utilized to reduce the cell count somewhat for the functionally complete column.
In the first place, if the functions producible on the second rail are augmented to include the complements of those prescribed above, then the minterm expansion can be utilized to realize either f or f', whichever is simpler. If a final complementation is necessary to produce the specified function, it can be accomplished in the final cell of the column. Since the number of minterms in f and f' together is just 2^n, it follows that

N ≤ (1/2)·n·2^n = n·2^(n-1)

for such a cascade. The same end can be achieved by augmenting the cell functions so that a product-of-sums can accumulate in the second rail.
The same growth rate can also be achieved by a two-rail cell based upon the exclusive-OR expansion

f(x1, . . . , xn) = c0 + c1x1 + c2x2 + . . . + cnxn + c(n+1)x1x2 + . . .

In this case, if the cell outputs can be selected from (again referring to Fig. 4):

f1 : (x, xy1, 1)
f2 : (y2, y2 + xy1)

then, as in the minterm case, each product term can be formed in the first rail, to be added or not to the accumulating subfunction in the second rail, depending on the value of the particular binary constant ci. Since the constant c0 can be accommodated by one of the initial boundary constants at the top of the column, only one cell is required for each variable occurrence. Hence the total number of cells can be enumerated as

N = Σ (i=1 to n) i·C(n, i) = n·2^(n-1).
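The enumeration can be confirmed directly; one cell per variable occurrence gives i occurrences in each of the C(n, i) product terms, and the closing identity is a standard binomial fact:

```python
from math import comb

# Sum of i * C(n, i) over i = 1..n equals n * 2^(n-1).
for n in range(1, 12):
    assert sum(i * comb(n, i) for i in range(1, n + 1)) == n * 2 ** (n - 1)
```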
Clearly other two-rail cascades can also be developed based on other expansions. For the realization
of particular functions the use of prime implicants
instead of minterms will suffice, as will various
multilevel combinations (that is, greater than the
usual two-level AND-OR configuration) that reduce the number of variable occurrences for the
particular function to be realized. Since these realizations will result in nonstandard variable orderings
and hence are not appropriate for the composition
of horizontal-bus rectangular arrays, they will not
be further discussed here.
Efficient Two-Rail Compositions
Since the two-rail cascade is functionally complete, i.e., any function can be realized within the
extent of a single cascade, two minimization questions next become of significant interest. The first
seeks an algorithm for the minimum length realization of any given function; the second seeks a minimum form of cascade that will be appropriate for a
large class of functions, e.g., canonical cascades
which, when the cells are appropriately specialized,
will realize any member of the entire class of functions. The latter question will be briefly examined
here in terms of efficient cascade realizations for
arbitrary n-variable functions, where efficiency is
measured in terms of cascade length. The development of fixed-variable-order cascades suitable for
the class of all functions of n variables is particularly appropriate to the design of rectangular, horizontal-bus arrays, where all the input variables are applied to the columns of the array in the same order,
even though redundantly. In such arrays, designs
that have been minimized for each function realized
in the individual columns generally will need to be
augmented in order that each column can be driven
within the common variable ordering that feeds the
entire array.
Since the minimization criterion of interest is the
length of the cascade (i.e., the total cell count), it
follows that the complexity of the individual combinational cells is not at issue and is subject only to
the logical constraints implied by the two-rail structure. That is, each cell will initially be assumed capable of developing, independently on both of its
outputs, any of the three-variable functions of its
inputs. This point of view is entirely consistent
with the original assumptions made regarding the
single-rail cascades; steps directed toward reducing
this implied individual cell complexity will be discussed in the following section.
Since each cell is assumed to possess complete
three-variable capabilities, the determination of the
minimum-length three-variable cascade is trivial,
and is, of course, one. Almost as obviously, the minimum length four-variable cascade is two, although it is of special interest because it represents
the last instance in which the value is known to be
minimum. A particular instance of such a four-variable cascade is indicated in, Fig. 6 and is based
upon the general exclusive-OR expansion mentioned previously, that is
where gl, g2 are arbitrary functions of three-variables
and are realized on the outputs of the first cell in
the cascade. This particular expansion is selected
with an inductive construction in mind, for it suggests
a means for introducing a new variable to the scope
of a problem with the expenditure of only one cell
for that particular variable. Indeed any arbitrary
function of n variables can be expanded as
f(x1, . . . , xn) = g1(x1, . . . , xn-1) + xn·g2(x1, . . . , xn-1)
where gl and g2 are arbitrary functions of their arguments, suggesting the general form of realization of
an n-variable cascade as indicated in Fig. 7. In this
figure, the cascade A represents an efficient ( and
presumed known) realization of the (n - 1 )-variable function g2. (The symbol 1> on an output indicates that the particular function realized thereon
may be arbitrarily specified since it does not contribute logically to the result.) The pertinent output
of cascade A provides one input for the cell which
introduces the new variable, X n , which in turn provides an "accumulated" input for cascade B. The
function of cascade B, finally, is to develop the
function gl as efficiently as possible, and to accumulate it on the second rail in order to form the complete n-variable function. An economical realization
of the B cascade can also be obtained by utilizing
the exclusive-OR expansion, this time on only n - 3
of the variables; thus using n = 5 as an example,
g1(x1, x2, x3, x4) = h0(x1, x2) + x3h1(x1, x2) + x4h2(x1, x2) + x3x4h3(x1, x2).
A realization of this B cascade corresponding to
n = 5 is shown in Fig. 8. Since two cells are utilized
for each hi function, plus one more for each variable
occurrence of the expanded variables, it follows that
in general such a B cascade requires
NB = 2·2^(n-3) + Σ (i=1 to n-3) i·C(n-3, i) = (n + 1)·2^(n-4)
cells altogether. The number of cells in the complete n-variable cascade then is certainly bounded by the recursion relation

Nn ≤ NA + NB + 1 = Nn-1 + (n + 1)·2^(n-4) + 1.
It follows immediately that the number of cells
required for a canonical five-variable cascade, for
example, need not exceed fifteen cells. Further simplifications can be made, however, for all cases. Since
it is immaterial which of the n - 3 variables is used
in the expansion of gl, it can always be arranged
that the last variable used at the bottom of cascade
A can provide one input to the top of cascade B,
saving one cell. Furthermore, since the particular function h0(xi, xj) is not multiplied (logically) by any of the others, it follows that h0 and any one of the other two-variable functions can be realized in the same two cells of the cascade, thus saving two
more cells. (This simplification is indicated in Fig. 9
which can be substituted directly for the bottom five
cells in the B cascade of Fig. 8.) Thus for all n, the
upper bound on the number of additional cells
required can be reduced to
Nn ≤ Nn-1 + (n + 1)·2^(n-4) - 2.
This represents the best bound that has proved valid
for all n. Examination of Fig. 8, however, reveals
Figure 6. Arbitrary four-variable cascade.
Figure 7. Construction of n-variable function.
other φ outputs that suggest even further savings. For example, the φ output of the fourth cell in the cascade could be specified as x4 if that variable could be used logically by the fifth cell in the cascade, possibly resulting in the elimination of another cell. It turns out that such savings can easily be shown for n = 5, and result from a nonuniform expansion of the function g1, that is, an expansion such that the different hi's may be functions of different pairs of variables. For example, an alternative expansion for g1 for n = 5 is
g1(x1, x2, x3, x4) = h0(x1, x2) + x3h1(x1, x2) + x4h2(x2, x3) + x3x4h3(x2, x3)
                        (2)           (3)            (4)             (1)
and if this expansion is realized in the B cascade in
the order indicated by the numbering of the terms
above, then in each instance a cell can be saved in
the transition from one term to the next in the cascade. In Fig. 10 a five-variable cascade is specified
that utilizes this B cascade for a total of 10 cells for
the entire cascade, and achieves the minimum
known length possible for such a general cascade.
Whether such savings, which accrue from appropriate nonuniform expansions, can be achieved
for all n is not known. For all values of n through n
= 8 the existence of such expansions has been
shown, however, and if this result can be extended
to all n, then the upper bound can be reduced as
indicated by the recursion
Nn ≤ Nn-1 + (n - 1)·2^(n-4).
On the basis of these results it is conjectured that
Figure 8. A realization of the B cascade. The notation "⊕h" on an output arrow indicates that at that output the function h is added to the previously accumulated function on the second rail.
the above bound holds for all n.
As a summary, the best known values of Nn are shown in Table 1 for low values of n. For comparison, the direct minterm expansion values are also shown.

Figure 9. Simultaneous realization of h0 and h1.

Table 1. Upper Bounds for Two-Rail Cascades.

 n    Minterm n·2^(n-1)    Nn Bound
 3           12                1
 4           32                2
 5           80               10
 6          192               30
 7          448               78
 8         1024              190
 9         2304              446
10         5120             1022
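The bound column of Table 1 follows from the conjectured recursion together with the known minima N3 = 1 and N4 = 2; a quick computation reproduces it:

```python
# N_n <= N_{n-1} + (n - 1) * 2^(n-4), starting from the known minima.
bound = {3: 1, 4: 2}
for n in range(5, 11):
    bound[n] = bound[n - 1] + (n - 1) * 2 ** (n - 4)

# Direct minterm expansion values, n * 2^(n-1), for comparison.
minterm = {n: n * 2 ** (n - 1) for n in range(3, 11)}

print(bound)
# {3: 1, 4: 2, 5: 10, 6: 30, 7: 78, 8: 190, 9: 446, 10: 1022}
```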
Reduced-Cell Compositions
Although to this point the only cost criterion has
been the length of the two-rail cascade, it is apparent that the individual cell complexity is also of
great practical importance. Certainly the general
two-output, three-variable cell will be significantly
more complex and costly than has previously been described for the single-rail case.1
In Fig. 11, however, a summary is found of the six cell types that have been utilized in the efficient two-rail canonical realizations of the previous section. Of these six only the first type, in part (a) of the figure, requires the complete generality assumed for the two-rail cell. This type is used only in the first cell of any such cascade, however, and the remaining types of cells are required to realize only very simple three-variable outputs. For present purposes it will suffice, then, to demonstrate the avoidance of cells of the type in part (a), in particular by realizing the necessary functions in terms of the remaining five cell types, which can then be thought of as a sufficient, reduced cell set.
Although the general cell is used only as the first
cell of every cascade, it is convenient to replace the
first two cells in every cascade (as indicated by part
(a) of Fig. 12 which represents the first two cells
of the cascade of Fig. 10) with the five-cell cascade
of part (b) of Fig. 12. It can be verified that these
two cascades are functionally identical according to
the expansion
g2(x1, x2, x3, x4) = h0(x1, x3) + x2h1(x1, x3) + x4(h2(x1, x2) + x3h3(x1, x2)),
and for the price of three additional cells in part (b) of the figure, only reduced-cell types are utilized.
Figure 10. The best-known five-variable cascade.
Thus upper bounds on the number of reduced
type cells required for canonical two-rail cascades
can be obtained simply by adding three cells total
to the more general results previously obtained.
BOUNDS FOR RECTANGULAR ARRAYS
Since the single column of two-rail cells has been shown sufficient for the composition of arbitrary functions, it is of interest to ask whether the growth rate of rectangular arrays composed of such columns is thereby enhanced. As one way of structuring such an array, let us consider the conventional Shannon-type decomposition
f(x1, . . . , xn) = Σi gi(x1, . . . , xm)·(xm+1 . . . xn)i
where an arbitrary function of n variables is expressed canonically in minterms of n - m of the
variables, each with a coefficient that is an arbitrary
function of the remaining m variables. If the mvariable functions are realized in a single column
in a minterm two-rail realization, then the number
of cells in a column, including one for a collector
row, will not exceed
m·2^m + n - m.
Since there are no more than 2^(n-m) such columns necessary, the total number of cells in the entire array, as suggested in Fig. 13, will not exceed the product, that is

(m·2^m + n - m)·2^(n-m).
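A small computation illustrates the bound (a sketch; choosing the m that minimizes the product is not discussed in the text and is explored here only as an aside):

```python
def array_bound(n, m):
    """Cell bound for the array: (m*2^m + n - m) cells per column,
    and up to 2^(n-m) columns."""
    return (m * 2 ** m + n - m) * 2 ** (n - m)

# For n = 10, scan all column widths m and report the cheapest.
n = 10
best_m = min(range(1, n + 1), key=lambda m: array_bound(n, m))
print(best_m, array_bound(n, best_m))   # 3 3968
```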
Figure 11. Two-rail cell types.
Figure 12. Bit serial, word parallel add.
There are two contiguous words A and B, A above
B, and the sum is formed in B. There are two
marker bits. M1 holds the carry in B. M2 in B receives the bit to be added from A. The jth digit position is designated by dj, j referring to an implicitly defined counter in the Mask register running
from the least to most significant bit.
The truth table displays the cases of M1, M2, dj and the new values of M1, dj. The first four cases require no change in M1 or dj. The sixth case has the same result as the fifth and can be converted by exchanging M1 & M2. The eighth case has the same result as the seventh and can also be converted by exchanging M1 & M2. Note that the seventh case ends in the same state of M1 and dj that begins the fifth case, so the fifth case must be handled before the seventh.
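An arithmetic model may make the case analysis concrete. The sketch below is hypothetical Python (it models the carry marker M1 and the A-bit marker M2 as per-word state rather than through the machine's select-and-write primitives); every word pair advances one digit position per step, least significant bit first:

```python
def word_parallel_add(a_words, b_words, width):
    """Bit-serial, word-parallel add: the sum forms in B, M1 holds each
    pair's carry, and M2 receives the A bit at the current digit position."""
    m1 = [0] * len(a_words)                  # carry marker per word pair
    for j in range(width):                   # digit positions, LSB first
        for i in range(len(a_words)):
            m2 = (a_words[i] >> j) & 1       # M2: bit to be added from A
            dj = (b_words[i] >> j) & 1       # current digit of B
            s = m1[i] + m2 + dj              # resolves the eight truth-table cases
            b_words[i] = (b_words[i] & ~(1 << j)) | ((s & 1) << j)  # new dj
            m1[i] = s >> 1                   # new carry
    return b_words

print(word_parallel_add([5, 12, 255], [7, 3, 1], 9))   # [12, 15, 256]
```

All word pairs complete in the same `width` passes regardless of how many pairs there are, which is the point of the word-parallel form.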
This algorithm is a simple illustration of one
form of parallel sequential logic. The function is
broken up into cases which are passed against the
data. As an individual digit of A and B is examined, not all cases will select and so some operations will be dummies. However, all possible cases
are covered. Suppose there were several pairs of
Figure 13 shows the logic tables and algorithm for another example of parallel sequential logic but for the second form, sequential by item of data, parallel by function. This time the example is nonnumeric: format transformation. The input is an N-character packed number, calculated by some process, to be printed out on a check. The rightmost two characters specify cents, delimited by a period. The remaining characters are grouped right to left in fields of up to three characters, each field separated by commas. A dollar sign is placed to the left of the most significant character and the spaces to the left of the dollar sign are filled with asterisks. There is an exception: any amount less than one dollar is printed as dollar sign, zero dollars, point, and the appropriate cents.
In a conventional machine, this would probably
be programmed as a set of subroutines to perform
the justifying, deletion, shifting, superposition, insertion, testing for most significant digit, asterisk
fill and check for special case. In this example, the
operation will be viewed as two functions, FORMAT SPREAD 2 and FORMAT FILL 3, which
operate in parallel on the data. The two functions
have a common label so that different SPREAD or
FILL functions can be called when appropriate.
The use of distinct second and third label fields allows independent calling.
Both logic tables consist of doublets. There is a
bit of shorthand used in displaying the logic table.
There is an enlargement of the third doublet of the
first logic table to the side. The positions refer to
the characters, not individual bits. In the third doublet, the entry word (actually six words, one for
each bit of the character) copies the third character. The associated output word (actually six
Figure 13. Example of format transformation.
words) reproduces the third input character but in
the fourth character position. As the enlargement
indicates, the entry words for each bit of the character are interleaved with the output words. The
readout spreads the packed number out as required
by the format transformation assuming all characters are significant.
The second logic table also uses some shorthand
in display. The first four doublets find the most significant character. The fifth doublet checks for the
exception case. In the second doublet, the sixth
character position is the code for the number zero
which is all zeros as shown in the enlargement. The
fifth character position represents the logical condition of some code for a number other than zero.
This breaks up into four subcases (zone bits are
zero for numerics). If the fifth character is not
zero, then at least one of the four low order bits is
one and at least one of the correct output words
will be selected, giving the correct fill.
The structure of the logic tables and the algorithm have been chosen so that both logic tables respond in parallel to the data and format function
call. The time in normalized units is one and the
required storage is 24N-22 words, where N is the
number of 6-bit packed numeric characters.
A Process Control Problem
Figures 14, 15, 16 and 17 illustrate an example
of operations that might be appropriate to a process
control problem. It is assumed that there is a manufacturing process which fabricates cryotron plates of
a repetitive structure. The direct cost of fabrication is quite low, but the cost of testing plates could be quite large, especially if plates which are ultimately rejected are fully tested, since their test cost must be amortized over the plates that are finally accepted.
The test process is divided into two parts: a final
cold test which determines whether or not the plate
is used in a system and a room temperature test
which follows the fabrication and precedes the cold
test. The room temperature test is designed to reject
THE ASSOCIATIVE MEMORY STRUCTURE
at low cost a large percentage of the plates that
would ultimately be rejected. It is automated and
steps a set of contacts across the repeated cells of
the plate, performs some tests on parameters, computes the pass/fail components of a unit cell test
vector and transmits this vector to a computer.
There is a set of logic-storage modules, one for
each parameter of the test, which keep track of the
number of cells tested on the plate, the number of
failures and calculate for each parameter whether
the number of failures for the given number of tests
falls above a reject limit or below an accept limit of
a sequential test chart as shown in Fig. 14. If a parameter leads to rejection, the plate is rejected and a
Figure 14. Example of process control limit charts.
new one started in test. If all parameters lead to an
acceptance, the plate is passed on for cold test. If
no parameter has led to rejection, but not all parameters have reached a decision yet, the automatic
tester steps over to the next set of contacts and
performs a new test of the same plate. If an individual parameter rejects or accepts, its count is frozen
while the others go on.
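The per-parameter decision logic can be sketched as follows (hypothetical Python with simplified linear limits; the real module keeps separate test-count and failure-count words, and the parameter names ku0, dku, kl0, dkl, tu are inventions of the sketch):

```python
def sequential_test(results, ku0, dku, kl0, dkl, tu):
    """One parameter's sequential accept/reject test: every tu tests the
    reject limit grows by dku and the accept limit by dkl."""
    failures = 0
    reject_limit, accept_limit = ku0, kl0
    for tests, r in enumerate(results, start=1):
        failures += r                      # r is 1 for fail, 0 for pass
        if failures > reject_limit:
            return "reject"                # crossed the upper line of Fig. 14
        if failures < accept_limit:
            return "accept"                # fell below the lower line
        if tests % tu == 0:                # time to raise both limits
            reject_limit += dku
            accept_limit += dkl
    return "undecided"                     # step to the next cell and retest

print(sequential_test([1, 1, 1], ku0=2, dku=1, kl0=0, dkl=1, tu=5))   # reject
```

A frozen parameter simply stops being fed results, matching the M5 freeze bit described below.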
The format of the logic-storage module for a
parameter is shown in Fig. 15. It consists of four
contiguous words. The modules may be separated
arbitrary distances but the components of the modules must be together. Each word has a common
label, AUTO TEST, and a 1 in M0, signifying
parameter test. M1 and M2 distinguish the individual words.
The second word accumulates the number of failures in its count field and compares this to its limit
field to see if the parameter indicates rejection. The
first word accumulates the total number of tests in
its count field and compares this to its limit field to
determine when to trigger both itself and the second
word to add the increment fields to the respective
limit fields. The fourth word compares failures to
an accept limit. The third word checks number of
tests to determine when to adjust the limits of itself
and the fourth word.
The key field identifies the parameter being examined by the module. A failure of a parameter is
communicated by driving a one in the key field. A
zero is driven if the parameter passes. The second
and fourth words will respond only to failures. The
first and third words will respond to any test. The
flag field is used in readout to identify whether the
rejection (0) limit or the acceptance (1) limit was
crossed.
The initialize algorithm clears markers 3-8 and the limit field in all modules and then serially transfers the initial value field to the limit field.
The count algorithm drives the vector of parameter pass/fail values into the key field and marks
responses with a 1 in M6, the trigger. M5, the
freeze bit, must be zero which ensures that modules
measuring parameters which have crossed a limit
will not do any further counting. The second and third steps count up by one serially by bit but parallel by word.
The compare algorithm uses M3 & M4 to indicate in each module whether a count was greater
than (lOin M3,M4), equalto (00 in M3,M4) or less
than (01 in M3,M4) its associated limit. This algorithm proceeds from least to most significant
bits. At any stage where the count and limit bits are
equal, M3 & M4 are unchanged. So at the end of
the cycle the value of M3 & M 4 corresponds to the
inequality value of the highest order unequal bits of
the count and limit.
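In conventional terms the marker pair implements a most-significant-difference comparison; a sketch:

```python
def compare(count, limit, width):
    """M3/M4 comparison: scan bits least to most significant and overwrite
    the marker pair only where the bits differ, so the final value reflects
    the highest-order unequal bit."""
    m3, m4 = 0, 0                         # 00 means equal so far
    for j in range(width):                # LSB first
        c, l = (count >> j) & 1, (limit >> j) & 1
        if c > l:
            m3, m4 = 1, 0                 # count bit greater
        elif c < l:
            m3, m4 = 0, 1                 # count bit smaller
    return m3, m4                         # 10: count > limit, 01: <, 00: ==

assert compare(9, 5, 4) == (1, 0)
assert compare(2, 6, 4) == (0, 1)
assert compare(7, 7, 4) == (0, 0)
```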
The check-limits algorithm uses the results of
the compare to determine if any failure limits have
been crossed and if so, freezes those modules.
The adjust limits algorithm examines total test
count of unfrozen modules to see if these limits
have been crossed. If so, both the total test count
word and its associated failure count word are triggered to add the increment to the limit. Figure 16
shows the truth table of this addition in stages corresponding to steps 2, 3, and 4.
(Figure 15 shows the four-word logic-storage module: each word carries Count, Limit, Increment, and Initial Value fields, Flag and Key fields, a 1 in M0, the common label AUTOTEST, and markers M1M2 = 00, 01, 10, 11 to distinguish the four words.)

INITIALIZE
1. S(1/M0), SAR, W(0/Limit, 0/M3-M8)
2. S(1/M0, 1/Initial Value bit j), SAR, W(1/Limit bit j) - repeat till field exhausted.

COUNT
1. S(1/M0, 0/M5, test vector/Key), SAR, W(1/M6)
2. S(1/M0, 0/M5, 1/M6, 0/Count bit j), SAR, W(0/M6, 1/Count bit j)
3. S(1/M0, 0/M5, 1/M6), SAR, W(0/Count bit j)
Repeat 2 & 3 for all Count bits.

COMPARE
1. S(1/M0, 0/M5, 1/Count bit j, 0/Limit bit j), SAR, W(1/M3, 0/M4)
2. S(1/M0, 0/M5, 0/Count bit j, 1/Limit bit j), SAR, W(0/M3, 1/M4)
Repeat 1 & 2 for all bits of Limit field, least to most significant.

CHECK LIMITS
1. S(1/M0, 0/M1, 1/M2, 0/M5, 10/M3M4), S1(1/M0, 1/M1, 1/M2, 0/M5, 01/M3M4), SAR, W(11/M3M4, 1/M5)
2. SPW, SAR, W(1/M5)
3. S(0/M1, 1/M5), SAR, SNW, SAR, W(1/M5), SNW, SAR, W(1/M5)
4. S(1/M1, 1/M5), SAR, SPW, SAR, W(1/M5), SPW, SAR, W(1/M5)

ADJUST LIMITS
1. S(1/M0, 0/M5, 0/M2, 10/M3M4), SAR, W(1/M6), SNW, SAR, W(1/M6)
2. S(1/M0, 1/M6, 1/Increment bit j, 0/M7), S1(1/M0, 1/M6, 0/Increment bit j, 1/M7), SAR, W(1/M8, 0/M7)
3. S(1/M0, 1/M6, 1/M8, 1/Limit bit j), SAR, W(0/M8, 0/Limit bit j, 1/M7)
4. S(1/M0, 1/M6, 1/M8), SAR, W(0/M8, 1/Limit bit j)
Repeat 2, 3 & 4 for each bit of Increment field.
5. S(1/M0, 1/M6), SAR, W(0/M6)

Figure 15. Response to parameter test vector.
[Figure 16 truth table: for each of the eight combinations of the input bits a, b, and c it lists the required sum and carry (S, C) and the intermediate values of a, c, and the marker bit after steps 2, 3, and 4 of the addition; in the rows marked "no change," S = a and C = c already hold and the word is left untouched.]

Figure 16. Logic of simultaneous limit increment addition.
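Functionally, the stepwise addition Figure 16 tabulates is a bit-serial ripple add of the increment into the limit field. A Python sketch of the net effect, with an ordinary carry variable standing in for the marker bits (illustrative, not the stored-logic mechanization):

```python
def adjust_limit(limit, increment, width=8):
    """Add increment into limit one bit at a time, LSB first, as the
    adjust-limits steps 2-4 do; carry plays the role of a marker."""
    carry = 0
    for j in range(width):
        bit = 1 << j
        a = 1 if limit & bit else 0        # current limit bit
        c = 1 if increment & bit else 0    # current increment bit
        s = a ^ c ^ carry                  # sum bit
        carry = (a & c) | (a & carry) | (c & carry)
        limit = (limit | bit) if s else (limit & ~bit)
    return limit & ((1 << width) - 1)
```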
Figure 17 shows the stored logic table (one
word) and the algorithm to check the sample. Step 1
reads the key and flag of all reject/accept words
frozen because a limit was crossed. Initially, M7 is
zero and M8 is one. Step 2 will select the logic table if and only if all parameters have crossed the
accept limit. Step 3, searching over the key field
with the flag readout in the Contents register and
the key readout in the Mask, and for 0 in M7, will
select if and only if not all parameters have crossed
a limit and none that have crossed were rejections. The last step selects the logic table, reads M7
and M8, and initializes them. The automatic tester interprets a readout of 1 in M8 as a signal to stop
the test and go on to the next plate, and 0 in M8 as a
continue signal. If the test is stopped, the plate is
passed for 1 in M7 and rejected for 0 in M7.

[Figure 17 stored logic table (one word): Label = Sample Test, Key = 111, Flag = 111.]

CHECK SAMPLE
1. S(1/M0, 1/M2, 1/M5, 11/M3M4), SAR, R(1/Key, 1/Flag)
2. S(Sample Test/Label, Key readout/Key, Flag readout/Flag), SAR, W(1/M7)
3. S(Sample Test/Label, 0/M7, Flag readout/Key readout in Key Mask), SAR, W(0/M8)
4. S(Sample Test/Label), SAR, R(1/M7, M8), W(0/M7, 1/M8)

Figure 17. Accept/reject sample evaluation.
The logic of the example can be concatenated by
providing another set of modules with 0 in MO to
indicate process test. Each time a plate test is
stopped and the plate passed or failed, the final cumulative vector used in the check sample algorithm
can be driven into the process test modules to monitor the overall fabrication process to check if it is
doing well, so-so, or about to go out of control.
The key/flag criteria can involve combinations of
parameters, as well as individual parameters themselves, for more sophisticated measures of the performance of the fabrication process. The limit/increment logic of either the parameter or process test modules can be made more complex, with
additional fields and/or additional words in the
module (to adjust the increments of the limit fields,
for example) and additional steps, to allow more
complex sequential testing limit curves which might
be defined by second, third, etc., order difference
equations.
CONCLUSIONS
The Associative Memory Structure is a versatile
storage and processing building block for computers
and is not limited to the role of a search memory.
While a cellular approach may be taken to alleviate
interconnection problems of batch-fabricated devices, there is some advantage to departing from a
purely cellular system of identical elements in a uniform array. Hierarchy in communication and distributed control allow selective parallelism. Some
cells will respond to control information by determining whether or not a group of associated cells
will respond to a subsequent control sequence.
The small set of primitive information manipulation operations is quite powerful and much more
flexible than built-in macro commands. Computer
functions can be composed with the primitives using the parallel sequential logic approach. The
batch fabrication technology is not required to be
able to implement very complex macro commands
and fewer plate or chip types need to be designed
and stocked.
ACKNOWLEDGMENT
The author wishes to acknowledge the contribution of John W. Bremer and Dwight W. Doss, co-inventors of the cryotron associative memory structure.
REFERENCES
Companion Papers
B. T. McKeever, "The Associative Switching
Structure," IEEE Trans. on Electronic Computers,
vol. EC-15, to be published in 1966.
- - - , "The Associative Structure Computer,"
ibid.
NOTE: A more extensive set of references for all
three papers will appear in "The Associative Structure Computer."
Cryogenic Electronics Issue, Proc. IEEE, vol. 52,
Oct. 1964.
Integrated Electronics Issue, ibid., Dec. 1964.
Proc. Symposium on the Impact of Batch Fabrication on Future Computers, Los Angeles, Apr.
1965.
W. T. Comfort, "A Modified Holland Machine,"
Proc. FJCC, Nov. 1963, pp. 481-488.
J. Sklansky, "General Synthesis of Tributary
Switching Networks," IEEE Trans. on Electronic
Computers, vol. EC-12, pp. 464-469 (Oct.
1963).
R. C. Minnick, "Cutpoint Cellular Logic," ibid.,
vol. EC-13, pp. 685-698 (Dec. 1964).
D. L. Slotnick, W. C. Borck and R. C. McReynolds, "The SOLOMON Computer," Proc. FJCC,
Dec. 1962, pp. 97-107.
R. Fuller and G. Estrin, "Algorithms for Content-Addressable Memory Organizations," Proc.
Pacific Computer Conference, Pasadena, Mar.
1963, pp. 118-130.
R. G. Ewing and P. M. Davies, "An Associative
Processor," Proc. FJCC, Oct. 1964, pp. 147-158.
B. A. Crane and J. A. Githens, "Bulk Processing
in Distributed Logic Memory," IEEE Trans. on
Electronic Computers, vol. EC-14, pp. 186-196
(Apr. 1965).
W. Shooman, "Parallel Computing with Vertical
Data," Proc. EJCC, Dec. 1960, pp. 111-115.
J. Atkin and N. B. Marple, "Information Processing by Data Interrogation," IEEE Trans. on
Electronic Computers, vol. EC-11, pp. 181-187
(Apr. 1962).
COMPUTER EDITING, TYPESETTING AND IMAGE GENERATION
M. V. Mathews and Joan E. Miller
Bell Telephone Laboratories, Incorporated
Murray Hill, New Jersey
INTRODUCTION
The programs which we will describe were developed to provide a practical system for editing and
publishing text with a digital computer. The system
consists of an electric typewriter, a computer, a cathode ray tube output unit, and a camera. Text and
editorial instructions may be entered into the computer from the typewriter. The computer executes
the instructions and prepares a corrected, justified
text. The text may be written on the cathode ray
tube, photographed by the camera, and published by
standard photo-offset printing. Alternatively, it may
be written on the typewriter by the computer, or
printed on the computer printer.

There has been much recent interest in computer
editing programs. Among others, extensive work
has been done by M. P. Barnett at M.I.T., by P. F.
Santarelli at the IBM Systems Development Laboratory in Poughkeepsie and at Project MAC at
M.I.T., by R. P. Rich at the Johns Hopkins Applied Physics Laboratory, and by C. R. T. Bacon at
the University of Pittsburgh. In addition, some machines are being developed to present graphic arts
quality images on cathode ray tubes, such as the
Mergenthaler-CBS Laboratories Linotron machine.
Oscilloscopes of lower quality are available as output devices for numerous computers.

Human engineering of both the programs and the
equipment was a prime design consideration and
led to these objectives:

1. Providing a typewriter as good and as simple as an ordinary secretary's machine. The
typewriter can be located in the office of
the user.
2. Providing a good upper and lower case type
font with flexibility for adding letters and
figures.
3. Allowing the intermixing of text and editorial instructions. Thus corrections may apply to a text while it is being written.
4. Providing a comfortable vocabulary of editing instructions.
When the editing program was written, facilities
at Bell Telephone Laboratories did not provide instantaneous real-time interaction between a typewriter and the computer on which editing was done.
Consequently, the program was written so as not to
require interaction. This decision strongly affected
the way in which lines in the text are located for
correction purposes. When interaction is available,
parts of the program may be modified.
The following section of this paper describes the
editing program, and the succeeding section the
typesetting and generation of images on the cathode
ray tube.
THE EDITING PROGRAM
Description
The purpose of this program is twofold: (1) to
allow for input and correction of typewritten material; and (2) to store material for future editing
and/or processing.
The significance of such a system lies primarily
in point (2), for although typists and proofreaders
can prepare manuscripts and typesetters can produce aesthetically pleasing copy, no one relishes the
problem of altering previously prepared material.
Consequently, the ability to store text in a form
that can be easily updated represents a great convenience. And if the material is stored in a computer, then it is amenable to other helpful forms of
processing such as alphabetizing or sorting.
Therefore, the aim of this project has been to develop a system of editing in which the user can provide input to the computer by a device which is familiar and easy to use (i.e., a typewriter), can easily correct the errors he makes while inputting, and
can easily modify in the future material he has produced in the past. Furthermore, the design of the
system is such that corrections to currently typed
material can be made interchangeably with corrections to preexisting material. The corrections themselves constitute current input and do not require a
second pass operation for their execution. It is felt
that this particular feature adds great flexibility and
power to the technique of computer editing.
Structure of System
The editing program is diagrammed in Fig. 1.
The manuscript is typed at an IBM Selectric correspondence typewriter and transmitted to the IBM
7094 computer. The data is in the form of 6-bit
characters, one for each of the 44 typewriter keys
and one for each of several operations such as carriage return, backspace, upper case shift, lower case
shift, etc. This input data, consisting of both text
and instructions, is pre-processed by the pre-edit
pass of the program. Case shift characters are removed from the data stream and case information is
added to each character. Read requests for material
stored on the permanent disk file of the 7094 are
executed. All data is stored on scratch disk
(ITEMS) for examination by the main edit pass.

Figure 1. Block diagram of editing program.
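The pre-edit case folding can be sketched as follows; the 6-bit shift codes used here are made up for illustration, since the actual code assignments are not given in the paper:

```python
SHIFT_UP, SHIFT_DOWN = 0o74, 0o75   # hypothetical 6-bit shift codes

def pre_edit(codes):
    """Remove case-shift characters from the input stream and attach the
    case as an extra bit on each remaining 6-bit character."""
    out, upper = [], False
    for c in codes:
        if c == SHIFT_UP:
            upper = True
        elif c == SHIFT_DOWN:
            upper = False
        else:
            out.append((c, 1 if upper else 0))
    return out
```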
The sequencing of characters is indicated by a
list of pointers which specifies for each character
the location of the character which follows. The instructions which call for modifications to the text
are therefore executed by resetting the appropriate
pointers. Pass 2 executes all such instructions.
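A toy model of this pointer representation, in modern Python and purely illustrative: characters never move, an edit only re-threads the successor pointers, and pass 4 recovers the text by walking them.

```python
class PointerText:
    """Characters stored once, sequenced by successor pointers."""
    def __init__(self, s):
        self.chars = list(s)
        self.next = list(range(1, len(s))) + [-1]   # -1 marks the end
        self.head = 0 if s else -1

    def read(self):
        out, i = [], self.head
        while i != -1:                   # pass 4: follow the pointers
            out.append(self.chars[i])
            i = self.next[i]
        return ''.join(out)

    def insert_after(self, pos, s):
        """Insert text after the pos-th character by resetting pointers."""
        i = self.head
        for _ in range(pos):
            i = self.next[i]
        start = len(self.chars)
        self.chars.extend(s)
        self.next.extend(range(start + 1, start + len(s)))
        self.next.append(self.next[i])   # last new char points onward
        self.next[i] = start             # splice the new run in
```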
The post edit pass is logically equivalent to the
main edit but is included as a separate pass to provide space for routines of the users' own design. In
pass 4 the characters are sequenced in their proper
order by following the pointers. The corrected text
is written on scratch disk CTEXT and transmitted
to its appropriate place in the users' permanent disk
file.
The fifth pass deals with the printing of the text.
A galley proof is prepared on microfilm, unless
otherwise specified, in order that the user may
know the state of the material which has been generated and stored in his permanent disk file. Final
copy may be requested by instruction, and the format is dictated by instructions still present in the
body of the text.
Organization of Material
All typewritten material is subdivided into units
called items. This organization is determined by the
user with the restriction that an item be no larger
than 1,800 typewriter characters, which corresponds
approximately to one page of typing. Items are
identified by user-assigned decimal numbers of up
to six integral places and at most two decimal
places. These numbers are typed at the beginning of
the item.
The next larger organizational unit of material is
the standard text. There can be at most 2,000 items
in one standard text. A standard text is identified
by a (at most) 24-typewriter-character name of the
user's choice.
Standard texts (an arbitrary number of them)
constitute standard files, which are identified by an
up to 6-BCD-character name. These files correspond to files of disk storage in the computer. An
author preparing an opus of volumes might use a
file for a volume, a standard text for a chapter, and
each item for a paragraph. Most applications, however, will probably involve one file of one standard
text only.
General Conventions
A single run of the editing program produces
one standard text. The first line of typing indicates
the name and destination (file) of the standard text
being generated. If this standard text is not to be
stored on permanent disk, as may be the case when
only a printed copy is desired, then the file should
be specified as O.
Editing instructions must be distinguished from
text. The presence of every instruction or control is
announced to the program by means of the left
square bracket. The typewriter provides this character in both lower and upper case and thus simplifies
the typing of instructions by eliminating extra case
shifts. Each instruction has its own code of some
single character and this code character must immediately follow the left square bracket. Instructions are terminated in general by a slash.
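The convention can be illustrated with a small scanner. This is a simplification for illustration only: in the real program some instructions, such as [C and [U, end at a carriage return rather than a slash.

```python
def split_stream(stream):
    """Split typed input into ('text', ...) and ('instr', code, body)
    pieces: '[' announces an instruction, the next character is its
    code, and '/' terminates it."""
    parts, i = [], 0
    while i < len(stream):
        if stream[i] == '[':
            end = stream.find('/', i)
            if end == -1:
                end = len(stream)
            parts.append(('instr', stream[i + 1], stream[i + 2:end]))
            i = end + 1
        else:
            j = stream.find('[', i)
            if j == -1:
                j = len(stream)
            parts.append(('text', stream[i:j]))
            i = j
    return parts
```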
The text begins with the assignment of an item
number of up to six integral places and at most two
decimal places. Integers need not have decimal
points. The assignment of this number requires the
code of lower case i, and therefore [i1/ labels the
characters that follow as item number one.
The privilege of backspacing has been preempted
by this system in order to provide an effective
means of making erasures. A sequence of backspaces will eliminate the sequence of previously
typed characters of equal length. The usual customs
on underlining and overtyping must be abandoned
inasmuch as any backspacing will destroy the input.
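The erasure rule is easy to model in Python (illustrative; '\b' stands in for the typewriter's backspace code):

```python
def apply_backspaces(stream):
    """Each backspace erases the previously typed character, so a run of
    n backspaces eliminates the preceding n characters."""
    out = []
    for ch in stream:
        if ch == '\b':
            if out:
                out.pop()
        else:
            out.append(ch)
    return ''.join(out)
```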
Provisions for underlining are made by an instruction but, at present, overtyping is not allowed.
As will be seen in the discussion of the instructions which follows, there are special break characters such as the left square bracket and the slash
which have been preempted by the system. In order
to exempt these characters from their usual role
should they be desired in some other context, they
may be effectively placed in quotation marks or
"super quoted." An example will be discussed in the
illustration which follows.
Instructions
Those instructions which are used to edit the text
or control options in the program are referred to as
Class I instructions. They are identified by a code
which is an upper case character and are distinguished from Class II instructions, which specify
the output format. A list of instructions presently
available is given in Fig. 2. In the general form of
the instructions as shown in this list, www and xxx
are used to denote item numbers and yyy and zzz
are used to denote line numbers. These parameters
are considered to be right-adjusted in fields marked
by commas or the slash. The character # or 0 may
be typed in most contexts to denote the current item or
line. Furthermore, line numbers may be back-referenced by using minus signs, if and only if the instruction refers to the item in which it is contained.
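Resolving such a parameter might look like this (an illustrative sketch; the actual field-scanning code is not described in the paper):

```python
def resolve_line(token, current_line, instr_item, ref_item):
    """Resolve a line-number parameter: '#' or '0' denotes the current
    line; a negative value counts back from the instruction's own line
    and is legal only when the instruction refers to its own item."""
    if token in ('#', '0'):
        return current_line
    n = int(token)
    if n < 0:
        if ref_item != instr_item:
            raise ValueError("back-references allowed only within the item")
        return current_line + n
    return n
```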
Example of Editing
Figure 3 is an unrealistic though illustrative example of an original text prepared at the typewriter
to demonstrate Class I instructions. The requisite
format of the first line is shown.
The first item is numbered 10. It is desired that
the content of this item be printed in a fixed format
and, hence, the first instruction is [F. The word
Objectives is underlined by using the instruction
[U#,#/carriage return ... underlining .../carriage
return, typed on the same line. Note that the second
slash appears first but was, in fact, typed after the
underlining was complete. The normal mode of format is resumed after the instruction [J. It is intended
that the second line of item 10 be shortened to "II.
Description" by use of the instruction [O .... However, the line number is erroneously stated as 1.
(This difficulty will be remedied later.)
Item 12 contains several errors. A C-type instruction is used to eliminate the extra occurrence
CLASS I

[E xxx/ - Erase item xxx.
[E xxx,yyy/ - Erase line yyy of item xxx.
[E xxx,yyy,zzz/ - Erase lines yyy through zzz of item xxx.
[I xxx,yyy/ ... insertion ... / - Insert the text between the slashes after line yyy of item xxx.
[O xxx,yyy/ ... substitution ... / - Overwrite line yyy of item xxx with the text between the slashes.
[S xxx,yyy/ expression A (substitution) expression B/
[S xxx/ expression A (insertion)/
[C xxx,yyy/carriage return ... /carriage return - Correct line yyy of item xxx on a character by character basis.
[U xxx,yyy/carriage return ... /carriage return - Underline specified characters in line yyy of item xxx.
[R FILE, Text, N,www,xxx/ - Read request.
[G FILE, Text,www,xxx/ - Galley proof request.
[T FILE, Text,www,xxx/ - Type final copy.
[F - Fixed format.
[J - Normal mode (justify).
[P - Paragraph (used in normal mode).
[X www,xxx/ - Randomize the order of items www through xxx.
[M www,xxx,n/ - Multiprocess items www through xxx using routine PROC n.

CLASS II

[v n/ - Space vertically n times.
[h n/ - Space horizontally n times.
[m a, b/ - Margins: a spaces on left, b spaces on right.
[p - Go to new page.
[1 n/ - Type on every nth line.
[e (x,n) ... (y,m)/ - Equate character x to symbol number n, etc. (to extend character set).

Figure 2. Instructions for the editing program.
of the characters "type" and to change x to c, two
lines back from the instruction. The importance of
this type of instruction is that the modifications can
be made at the location of the error. After the first
slash and carriage return, the platen is rolled back
two lines, and the space bar used to find the proper
position. Then the erroneous characters are deleted
by typing minus signs, are ignored by spacing, or
modified by overtyping. The second slash and carriage return terminates this instruction and typing
resumes. The instruction [S#/and p(r)/ at the
end of the item inserts the missing r in the word
"poofreaders". Note that a line number is not mandatory in the S instruction, and when it is missing,
the search for the first occurrence of expression A
begins with the first line of the specified item.
Item 14 gets off to a false start and is erased by
[E#/. The erasing in this case goes only as far as
the slash of this instruction and does not wipe out
the material which follows in this item.
COMPUTER EDITING, TYPESETTING AND IMAGE GENERATION
(MEMO) (DESCRIPTION)
[i lO/[F
II.
Description of System
Objectivesl [U#,#/
The purpose of this system is three-fold:
(1) to allow for input and correction of typewritten text;
(2) store material for fu:hure editing and/or processing;
(3) to provide high quality typographical output.
[0#,1/11.
Description/
[il2/[JThe significance of such a system lies primarily in point
two for although typists and poofreaders can prepare manuscripts
and t-YJ:le-typesetters
xan produce aesthetically pleasing copy,
no one relishes the task of aleering previously prepared material.
/
[C 12, -2/
Consequently, the ability to store text in a form that can
be easily updated represents a great convenience.
Therefore,
the aim of this project has been to develop a system of
edi ting in which the user can
prov~de
input to a computer
by a device which is familiar and easy to use.
[S#/and p(r)/
[i 14/Furthermore, the [E#/The corrections themselves
constitute input.
[i9.5
[S12,2/point (@"(2)" )for/
EIO,8/[010,2/II.
Description/
ri 11.5 [S12,8/venience.([P
) There/
[i 20 [R MEMO, DESCRIPTION, -1, 6,8
[i 25 [RMEMO, DESCRIPTION, 100, 28.5/
[i 30 [R MEMO,MANUAL,<),O,O/
They specify the file name, the text name of the desired items, and up to three parameters, which specify how and what items are to be read. The first
parameter indicates how the desired items are to be
numbered. A negative value indicates that the numbering should start with number of the item in
which this read statement occurs, a value of zero
indicates that the item should be given the number
it already possesses, and a positive value indicates
that the new item should be given a number equal
to its original number incremented by the value of
this first parameter. This last technique of numbering is useful when merging items from several texts
into a new one. The second and third parameters
indicate which items are to be read. If, however,
the third one is omitted, only one item will be recovered, and if both of these parameters are zero,
all items in the text will be recovered. In the example of Fig. 3, item 20 calls for the items from text
DESCRIPTION whose numbers range through values 6 to 8. They will be numbered sequentially 20,
21, . . . under the assumption that it is known that
there are no more than five items since the next
read request is in item 25.
Figure 3. Example of input text for editing.
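The numbering rule for read requests can be sketched directly from the description above (a hypothetical function, for illustration):

```python
def renumber(original_numbers, param, requesting_item):
    """First read-request parameter: negative -> number the recovered
    items sequentially from the item containing the request; zero ->
    keep the original numbers; positive -> add it to each number."""
    if param < 0:
        return [requesting_item + k for k in range(len(original_numbers))]
    if param == 0:
        return list(original_numbers)
    return [n + param for n in original_numbers]
```

Item 20 of Fig. 3 thus recovers items 6 through 8 of DESCRIPTION as items 20, 21, 22, while item 25's request with parameter 100 would recover item 28.5 as item 128.5.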
Corrections to Instructions
The next instruction, which begins [S12,2/, gives
an example of super quoting. It is intended that
there be a substitution of (2) in place of the word
two in the second line of item 12. However, this
substitution involves parentheses which will interfere
with the interpretation of the instruction. Therefore,
the three characters (, 2, and) need to be super
quoted. For this purpose a character other than
(, 2, or), for example, the " is chosen to surround
the phrase and the surrounded phrase is preceded by
@. The @ and the two occurrences of " in the resulting string @"(2)" will not appear in the text, but
they will cause the PRE EDIT program to flag the
super quoted characters with an extra bit thus preventing the MAIN EDIT from making its usual
interpretation. The printing programs, however, will
ignore the extra bit and will treat these characters
properly.
It has been pointed out that an important aspect
of this editing system is that the material is stored
on the disk of the computer for future modification
and processing. The provision for requesting preexisting material, which is to be updated or quoted, is
made by means of a read instruction. These read
statements must be made as items in themselves.
Corrections to instructions are handled in several
ways. First of all, the backspace-retype method,
which is effective for all errors, is the one and only
means for correcting errors in the first line or read
statements. This restriction is due to the fact that
these statements are interpreted in the PRE EDIT
and do not enjoy the benefits of the editing facilities. Several instances of apparent overtyping may
be noted in Fig. 3. They are, in fact, cases of erasure and retyping. Errors can and will be made in
the typing of other instructions, and these mistakes
may not be discovered until some time later after
which the backspace-retype means of correction is
no longer practical. Corrections to Class I instructions (those designated by an upper case code)
must be made ahead of the erroneous instruction.
Facility for allowing instruction A to be executed
in advance of instruction B is provided by placing
instruction A in an item whose number is less than
that of the item containing instruction B. The data
is scanned in order of increasing item number, and
therefore, the instructions will be encountered in
their proper order. The decimal numbering system
of the items allows for the prenumbering of items.
In this way corrections are effectively made on the
future as well as on the past.
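The ordering rule amounts to a sort on decimal item numbers (a one-line illustration):

```python
def scan_order(items):
    """Items are scanned in increasing item-number order, so a correction
    filed as item 9.5 executes before the faulty instruction in item 10."""
    return [content for number, content in sorted(items)]
```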
The mistake made in the overwrite instruction of
item 10 (Fig. 3) is corrected by item 9.5. Here the
old instruction is erased and the proper form given.

Paragraphing

Paragraphing must be handled by instruction
when typing in the normal mode, since the program
is concerned about the associated blanks indicating
indentation. A desired paragraph is indicated by
the instruction [P, which may or may not be followed by a carriage return. In either case the blanks
which follow will be regarded as the indentation.
Should the user decide upon a paragraph as an afterthought, he may "wire in" the instruction. For
example, item 11.5 of Fig. 3 shows [P and five
blanks being inserted after the word convenience in
item 12.

Changes which control format of output, i.e.,
corrections to Class II instructions, offer no problems as to when they are inserted relative to their
proper position, since their execution occurs in the
print pass after the editing has been completed.
Hence, these instructions can be "wired in or out"
at any time in the editing program by use of the
correction instructions. The S type is particularly
useful for this purpose.

Figure 4 shows the galley of the edited text of
Fig. 3, and Fig. 5 is the final copy as produced by
the output program.
OSCILLOSCOPE WITH DIGITAL CONTROL
FOR TYPESETTING AND GRAPHIC ARTS
General Description
This section describes a system for the digital
control of a high-quality oscilloscope for the purpose of generating graphic arts quality images such
as are needed for printing text and line drawings. In
general the images will be photographed and the
resulting pictures reproduced by the standard methods of offset printing. The input information which
specifies the image will come from a digital magnetic tape or a computer. On the input, the image is
described entirely in numerical form in the manner
[Figure 4 reproduces the galley proof for file MEMO, standard text DESCRIPTION, dated 07/12/65: the items of Fig. 3 after editing, with the corrections applied and marked.]

Figure 4. Galley proof.
[Figure 5 reproduces the final copy of the edited text: the objectives list in fixed format followed by the two justified paragraphs.]

Figure 5. Justified text.
to be discussed below. The motion of the beam of
the scope is completely determined by the numerical data. The intensity and off-on times of the beam
are similarly controlled.
A block diagram of the system is shown in Fig.
6. The general mode of operation is as follows:
Data, in numerical form, which specifies the next
letter or diagram to be generated is read from the
digital magnetic tape or computer. The digital control equipment then determines the sequence of
movements of the scope beam to produce the image. Typically, many (10 to several hundred) beam
movements will be needed to produce a single character. Digital signals from the digital control unit
are passed through digital-to-analog converters to
obtain the voltages to be applied to the x and y
scope deflections and to the beam brightness and
off-on controls. The digital control also advances
the film in the camera at appropriate times. The
camera film is stationary and the shutter is left
open until a full scope face of material is exposed.
This amount will usually correspond to a page of
printed text. The film is then advanced to provide a
fresh film for the next page.

[Figure 6 block diagram: DIGITAL MAGNETIC TAPE OR COMPUTER feeds DIGITAL CONTROL EQUIPMENT, which drives DIGITAL-TO-ANALOGUE CONVERSION EQUIPMENT and, in turn, the SCOPE AND ANALOGUE DEFLECTION EQUIPMENT.]

Figure 6. Block diagram of image-generating equipment.

Preliminary Experiments

A set of preliminary experiments has been carried out using a Stromberg-Carlson 4020 Microfilm
Printer and an IBM 7094 computer. The SC 4020
contains a scope and camera of sufficient quality to
test the feasibility of producing printing and drawings. A sample alphabet produced on microfilm on
the SC 4020 and reproduced photographically is
shown in Fig. 7. Here each of the letters in the bottom 6 lines was produced by a number of short vectors, an average of about 16 vectors being required
for each letter. In this way a highly readable output
is obtained with both upper and lower case letters.
Furthermore, an unlimited number of new fonts can
be introduced simply by reading a new set of vectors which describe the letters into the IBM 7094.
Also graphs and line drawings can be treated just as
any other character.

[Figure 7 reproduces a sample of the type font (File: FONT, Standard Record: BASKERVILLE III): the characters in the font, including upper and lower case alphabets, digits, punctuation, and test lines such as "A quick brown fox jumps over the lazy dogs."]

Figure 7. Example of type font.
The quality of the letters produced with vectors
is not as good as the usual printing. Readability
probably compares with an ordinary typewriter.
However, the SC 4020 scope is far from the best
available, and better equipment should greatly improve the quality.
Estimates of Resolution Requirements
The SC 4020 can plot points, dots (slightly larger than points), or draw vectors. The points and
dots can be placed at any position on a 1024 x
1024 raster. The vectors can start at any raster
point and extend up to 64 grid spaces in either or
both x and y directions. Measurements on the image of the SC 4020 indicate resolutions of
Point .... 2.6 grid spaces diameter in 1024 raster
Dot ..... 2.8 grid spaces diameter in 1024 raster
Vector ... 2.3 grid spaces width in 1024 raster
The letters shown in Fig. 7 were produced on a matrix 16 raster points high and an average of 10 raster points wide. Consequently the letter resolution can be defined as 4.3 vector widths wide and 7 vector widths high. This resolution is sufficient for a quality whose readability is probably comparable to typewriting (although we have made no tests of this point). However, the quality is substantially less than that of good printing. We estimate that first-rate printing could be achieved by tripling the SC 4020 resolution and producing the letters on a 21 X 13 resolution unit grid. On the SC 4020 this would correspond to a 49 X 30 raster point grid using the vector mode or a 59 X 36 grid if the dot were used.
In addition to resolution compared to letter size, another factor is very important: quantizing, the ratio of raster space to dot diameter.
In the vector mode the ratio is

    1 raster space / 1 vector width = 1 / 2.3 = 0.43
This ratio is too large for some purposes. In particular, it would be nice to be able to change the size
of a letter simply by multiplying the lengths of all
the vectors involved. Attempts to change size in this way have failed, indicating that a quantizing ratio of 0.43 is much too large to allow such scaling; a smaller ratio is desirable. The exact ratio needed may also depend on the resolution raster used for the letter. We estimate that the ratio should be no larger than 0.25.
A High-Quality Scope
The SC 4020 scope has a basic resolution of at
best 2.3 parts in 1,024. The scope has a built-in
character mask and was not designed for ultimate
resolution. Better scopes are now available. The
best specifications quoted by scope manufacturers
indicate beam widths of 0.0005" to 0.001" are
available in 5" to 10" diameter scopes. Assuming
that a 5" usable deflection can be obtained with a
beam width of 0.001", then a basic resolution of
1 part in 5,000 can be achieved. This is 11 times
better than the SC 4020 and should produce excellent graphics if it is properly used.
Proposed Raster and Resolution for
Quality Printing
As a result of the preliminary experiments described above, and the specifications of currently
available scopes, specifications for a high-quality
graphics system can be set down. Scheme #1 consists of a raster of 32,768 points across a tube with
a resolution of 1 part in 5,000, and the letters being
generated on a 140 X 85 matrix of raster points.
This would give a 21 X 13 matrix of resolution circles for each letter. The quantizing ratio would be
5,000/32,768 = 0.15, which should be adequate for
smooth size multiplication.
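The arithmetic behind these figures is easy to check. The sketch below recomputes them in modern notation; the rounding conventions are ours, since the text quotes only the rounded values:

```python
# Checking the resolution arithmetic above. SC 4020: vector width 2.3
# raster spaces on a 1024 raster; proposed scope: 0.001" beam over a 5"
# usable deflection; Scheme #1: a 32,768-point raster at 1 part in 5,000.

sc4020_resolution = 1024 / 2.3          # distinguishable positions, ~445
scope_resolution = 5.0 / 0.001          # 1 part in 5,000
improvement = scope_resolution / sc4020_resolution   # ~11 times better

quantizing_ratio = 5000 / 32768         # raster space / beam width, ~0.15
letter_matrix = (140, 85)               # raster points per letter, Scheme #1
resolution_circles = tuple(round(n * quantizing_ratio) for n in letter_matrix)
# gives (21, 13) resolution circles per letter, as stated in the text
```

The same ratio computed for the SC 4020 vector mode, 1/2.3, reproduces the 0.43 figure quoted earlier.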
A simulation was carried out on the SC 4020 in
which letters were generated from points on a
64 X 38 raster giving a 25 X 15 matrix of resolution circles. The results shown on Fig. 8 indicate
quite acceptable printing quality.
An electronic system drew, then set in words, sentences, and justified columns the letters you are now reading.

Figure 8. Example of better quality type.
Image Formation From PATCH's
In the work previously described, which has
moderate resolution, images were formed from the
sum of short vectors or from the sum of dots. The
vector width or dot diameter is the resolution of the
scope. With high resolution scopes, the beam is so
small that it is no longer practical to use a single
beam image as the basic area unit. Instead, special
deflection equipment must be added to the scope to
cause the beam to sweep over a basic area unit or
PATCH (Parameterized Area To Construct Holograph). All images are constructed as the sum of a
number of PATCH's. In this way the number of digits required to describe the image is reduced to a
reasonable number, and the speed with which the
image is generated is increased.
PATCH's must meet certain requirements. They
must:
1. Fit together without leaving interior spaces. Circles would be unsuitable by this criterion.

2. Provide a good approximation to an image with a small number of areas.

3. Be describable with a reasonable number of numerical parameters.

4. Be generatable with reasonably simple analog equipment.

5. Be enlargeable or reducible to allow changes in font size.

An area which meets these requirements is shown in Fig. 9a. The area is bounded by straight lines at its top and bottom and second order curves at its left and right edges. Adjacent PATCH's can be stacked on their straight sides. Simple circuitry has been designed to generate the PATCH. If desirable, the PATCH can be rotated 90° to provide vertical straight sides. Rectangles and trapezoids are special cases of this area. Eight parameters are required to describe a PATCH: width a, height h, curvature and slope at the left and right bounds (C1, S1, C2, S2), and the coordinates of one corner (x0, y0).

A sketch of a generation for the lower case letter "r" is shown in Fig. 9b. Nine PATCH's are required. Some PATCH's have been rotated 90° as shown.

[Figure 9 residue: panel (a) shows a single PATCH with a zig-zag sweep to fill in the area; panel (b) shows the formation of the lower case "r" by PATCH's.]

Figure 9. Generation of images from PATCH's.

Preliminary experiments indicate an average of 10 PATCH's is required for each good quality letter. Thus a font of 100 letters would require 8,000 numbers for its description. This number is substantial, but not too large for currently available computer memories.

The letters on Fig. 8 were produced with PATCH's.

Use of Sub-Areas

In order to reduce computer memory requirements, it may be possible to take advantage of identical sub-areas possessed by several letters. Thus the loops on the "b," "p," "d," and "g" may be identical (after suitable rotations) in some fonts. If so, then these areas need be described only once in terms of PATCH's, and suitable equipment devised to call repeatedly for these standard areas. The concept is similar to the use of subroutines in computer programs.

CONCLUSIONS

The general philosophy behind the development of the editing program has been to provide a human-engineered facility for producing text in machine-readable form so that a computer can be used for editing, sorting, and printing. It is hoped that time will be saved in composing by eliminating much of the proofreading inherent in a system involving human copying. The ability to modify and republish existing material is probably the most valuable feature of the system.

It is planned to first use the program for texts which must be issued in many slightly different editions or which must be frequently modified. Certain instructional programs and literature describing computers are prime examples.

The image generation programs have not yet been tested with the best available scopes. However, the experiments with the SC 4020, the specifications of the best scopes, and the currently available digital equipment make it appear possible to build a high-quality, high-speed, graphic arts display device. The unit can produce on a single scope face high quality printing with as much as 200 lines of 350 letters each. Such a scope face would contain more than an entire page of newsprint. Letters could be produced at speeds of 500 to 5,000 letters/second. These speeds are 10 to 1,000 times faster than existing photographic or hot metal typesetting equipment. Furthermore, since the image is described digitally in complete generality, line drawings, mathematical equations, musical scores, and an unlimited number of type fonts can be produced by the same, completely standard, means.

Gutenberg invented printing with movable type in the 15th century, thus superseding hand-drawn letters. We are now ready to replace movable type with drawn letters. The pen is in the hand of a computer. Altogether, we believe the computer is now ready to provide great assistance to human written communication.
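In modern notation, the eight-parameter PATCH description and the storage estimate above can be sketched as follows. The field names and the form of the scaling rule are our assumptions, since the text gives only the parameter list:

```python
# Sketch of a PATCH record: width a, height h, curvature and slope at the
# left and right bounds, and one corner coordinate. Field names are
# assumptions; the paper lists the parameters but not a storage layout.

from dataclasses import dataclass

@dataclass
class Patch:
    a: float   # width
    h: float   # height
    c1: float  # curvature, left bound
    s1: float  # slope, left bound
    c2: float  # curvature, right bound
    s2: float  # slope, right bound
    x0: float  # corner coordinates
    y0: float

    def scaled(self, k):
        """Enlarge or reduce the PATCH for a change of font size.
        Lengths scale by k; curvature is assumed to scale as 1/k;
        slopes are dimensionless and unchanged."""
        return Patch(self.a * k, self.h * k, self.c1 / k, self.s1,
                     self.c2 / k, self.s2, self.x0 * k, self.y0 * k)

# The paper's estimate: 100 letters, 10 PATCH's per letter, 8 numbers each.
numbers_per_font = 100 * 10 * 8   # 8,000 numbers for the font description
```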
THE LEFT HAND O'F SCHOLARSHIP: COMPUTER EXPERIMENTS WITH
RECORDED TEXT AS A COMMUNICATION MEDIA
Glenn E. Roudabush,
Charles R. T. Bacon,
R. Bruce Briggs, James A. Fierst, and
Dale W. Isner
The University of Pittsburgh
Pittsburgh, Pennsylvania
and
Hiroshi A. Noguni
The RAND Corporation
Santa Monica, California
To paint a broad though much simplified picture, let us suppose at the outset that scholarship begins with the collection of facts. These facts are of two distinct kinds. The first are observations: they consist, for example, of the results of controlled experiments or observations from field work in the case of science or, perhaps, they are derived from the study of historical documents in the case of history, and so on. The second kind of facts are the reported observations, descriptions of phenomena or events, or the theories provided by contemporary scholars. In aggregate, let us refer to the first kind of facts as "data" and the second as "information." From the confluence of these two kinds of facts in the mind of the scholar, new descriptions and theories are born. When he makes these public, new information is generated.

Scholarship, strictly conceived, is this activity in the mind of the scholar. On its right hand are sources: data and information. On its left are publications: the products of this activity made public. But these two sides of scholarship are closely related. What to one scholar is a publication, to another is information. Every scholar stands both to the right and to the left of every other one.

In our text processing work at the University of Pittsburgh, we look upon our computers and our developing system of programs as a tool designed to extend the abilities of the scholar, on the one hand to collect, sort, and understand information, and on the other to disseminate to others the information that he generates. In other projects and for most of the users at our Center, our computers and systems of programs are seen as a tool to extend the ability to process and analyze data. These systems are, of course, well developed. In analyzing data, one's concern is to reduce, to simplify, and to summarize, preserving only the most significant aspects of the data. In processing information, by contrast, we wish to preserve every jot and tittle, allowing no characteristic of any significance to go unrecorded or untransmitted. Finally, in the research we ourselves do that utilizes natural language text, we come full circle and again use our systems as data processors and analyzers, treating the information we have collected as data.
Figure 1 shows schematically the overall design
of our text processing system. Four kinds of input
[Figure 1 residue: input blocks for printer's tapes to magnetic tape, cards to magnetic tape, and paper tape to magnetic tape feed a conversion to standard magnetic tape; an in-house analysis block lists IR research, auto-abstracting, content analysis, etc.]

Figure 1. Block diagram of the general text processing system.
are shown. The text on magnetic tape in any arbitrary format may be material obtained from other
centers or from any source that produces text on
tape. One day this source may include material read
by optical character recognition equipment. The
printer's control tapes are paper tapes obtained
from printers and publishers which were originally
used to control some kind of typesetting equipment.
We have locally constructed a paper tape reader that
will accept 5-, 6-, 8-, 15-, and 31-channel paper tape
and, through an IBM 1401 computer, write magnetic tape. This work was completed under a Department of Defense Advanced Research Projects Agency grant and has been reported elsewhere.1,2 The
text punched on cards or on Flexowriter-type paper
tape would normally represent material prepared at
our Center.
The block labeled "conversion to standard magnetic tape" represents the encoding of all forms of
natural language text into a particular format according to a schema devised by Martin Kay and
Ted Ziehe of the Rand Linguistics Research Group.
A relatively complete, but still preliminary, description of this format has been published as a Rand Memo.3 The use of magnetic tape for storage of text
and the use of this standard format are prominent
in our system and more will be said about this in a
moment.
Some source text in exceptionally good condition
may, after encoding in this standard form, be ready
for distribution to other centers requesting it or for
use in our own research. Characteristically, however, some additional processing will be required
and this is represented in the block labeled "utility." At the bottom of this figure, our use of text as
data is represented. Under "in-house analysis" we
have listed information retrieval research, auto-abstracting, and content analysis as examples of this
kind of work.
The series of blocks down the right side of Fig. 1
show the normal sequence of operations for photocomposition. Material to be photocomposed will, in
most cases, be specifically keyboarded for that purpose. This material will be under good control from
the beginning and can go directly into the typesetting system unless it will be used for other purposes
as well. Sorting, editing, and other processing will
generally not be required so that the conversion to
standard format can be bypassed. Both kinds of input to the typesetting system are allowed. An expanded block diagram of the typesetting system itself will be shown in a later figure.
Our system depends to a large extent on the efficient processing of large amounts of natural language text on magnetic tape and this aspect of our
system will be described in somewhat greater detail.
Magnetic tape is, of course, an economic storage
medium and is easily shipped between geographically separated centers. Encoding all text in one
standard format becomes important when many different kinds of text from many different sources
must be processed and shared. When standardized
input can be expected, a smaller number of general
programs can be written and a useful library can
begin to be accumulated. The standard adopted
must be flexible enough to handle any material one
may encounter. The Rand format seems to fill all of
our current and anticipated requirements and we
have adopted it for our system.
On seven-channel magnetic tape, the minimum
unit is a six-bit pattern plus a parity bit. In a one-to-one character representation, only 64 unique
characters can be defined. In order to extend the
number of different characters that can be represented on tape, either more than one six-bit pattern
can be assigned to each character to be represented
or else, as in the Rand standard format, some of the
available 64 patterns can be used to change the
meaning of the patterns that follow them on tape.
These mode change patterns or characters are of
two kinds: "flags" and "shifts." The flags change
the interpretation of succeeding patterns to a new
alphabet, while the shifts retain the same alphabet,
but mediate changes to, for example, upper case,
italics, larger type size, and so on.
Fifteen of the available 64 patterns are permanently assigned as alphabet flags in the Rand system. These 15 patterns along with the blank (octal
60) and a filler character (octal 77) are not a part
of any alphabet and their interpretation never
changes. There are, then, 47 patterns which can be
assigned meanings in each of the 15 alphabets. In
each of the 15 alphabets, some of the available 47
patterns will be assigned mode change functions as
shift characters. In the Roman alphabet, for example, nine patterns are used in this way. The remaining 38 patterns can accommodate the 26 letters, 10
diacritic marks, and the apostrophe with one pattern left unassigned. Notice that separate alphabets
must be used for punctuation, the numerals, and
other symbols occurring frequently in the English
text.
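The flag-and-shift scheme can be sketched in a few lines. The pattern values and alphabet tables below are illustrative stand-ins, not the actual Rand assignments, and the shift here affects only one character for simplicity:

```python
# Sketch of the Rand flag/shift decoding idea. Blank (octal 60) and the
# filler (octal 77) never change meaning; flags switch alphabets; a shift
# changes the rendering within the current alphabet.

BLANK, FILLER = 0o60, 0o77

# Hypothetical flag patterns: each switches the current alphabet.
FLAGS = {0o61: "roman", 0o62: "numeric"}

# Hypothetical per-alphabet tables; 0o37 acts as an upper-case shift
# within the roman alphabet in this sketch.
UPSHIFT = 0o37
ALPHABETS = {
    "roman": {i: chr(ord("a") + i) for i in range(26)},
    "numeric": {i: chr(ord("0") + i) for i in range(10)},
}

def decode(patterns):
    """Turn a stream of six-bit patterns into text."""
    alphabet, upper, out = "roman", False, []
    for p in patterns:
        if p == FILLER:
            continue
        elif p == BLANK:
            out.append(" ")
        elif p in FLAGS:                  # flag: switch alphabet
            alphabet, upper = FLAGS[p], False
        elif alphabet == "roman" and p == UPSHIFT:
            upper = True                  # shift: render next char upper case
        else:
            ch = ALPHABETS[alphabet].get(p, "?")
            out.append(ch.upper() if upper else ch)
            upper = False
    return "".join(out)
```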
This encoding system gives a flexible representation of the micro-characteristics of text. Larger
units of text, however, have a hierarchical organization which also requires representation. This is accomplished in the Rand system by the "catalog"
format. The fundamental unit in this system is the
datum, which can be thought of as a manipulable
unit of information. A datum may be a text entry
consisting of one physical line of text if from a previously printed source, or one sentence, or one word
if that is convenient, or it may be a title or a caption from an illustration, or an annotation or description of another datum added at a later time.
Each datum belongs to a particular class and at the
beginning of each reel of tape, following a label record, a map of the corpus is given describing the various classes of material contained in the file. Each
datum is coordinated with this map and its proper
identification assured by a system of control and
label words accompanying every datum. A representation of the Rand encoding system will be shown
later in our second typesetting example.
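A datum and the corpus map might be represented as follows. The field names are our assumptions for illustration, since the actual control and label words are defined in the Rand Memo:

```python
# Sketch of the "catalog" organization: a corpus map at the start of each
# reel declares the classes of material, and every datum carries labels
# coordinating it with that map. Field names are illustrative only.

from dataclasses import dataclass

@dataclass
class Datum:
    datum_class: str   # class declared in the corpus map at reel start
    label: str         # identification coordinating the datum with the map
    text: str          # one line, a sentence, a title, a caption, ...

corpus_map = {
    "line": "one physical line from a previously printed source",
    "caption": "a caption from an illustration",
}

d = Datum(datum_class="line", label="reel1/0001",
          text="It was the best of times,")
```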
We in Pittsburgh became interested in automatic
photocomposition when, in October of 1964, we
acquired a Photon S-560 photocomposition machine
from the National Institutes of Health. This machine had previously been used by Michael Barnett
at the Massachusetts Institute of Technology under
an NIH grant. The Photon is an electromechanical
device driven by punched paper tape. It consists essentially of a movable glass disk with 1400 characters etched on it and a lens system for projecting
these characters onto roll film. The disk can accommodate 16 different type fonts arranged in eight
concentric circles or levels around the disk. The paper tape is punched with double character codes, the
first giving the character position within disk level
and the second giving the escapement for that character. There are additional codes for advancing the
film, positioning the film carriage horizontally, affecting lens shifts for size control, and effecting
shifts to new disk levels for font changes.
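The double-character code can be sketched as a simple table lookup. The positions and escapements below are invented for illustration; the machine's actual tables are not given here:

```python
# Sketch of the Photon double-character paper tape code: each text
# character becomes a pair (position within the current disk level,
# escapement for that character). Values are hypothetical.

DISK_LEVEL_6 = {"a": 1, "b": 2, "c": 3}      # character -> position in level
ESCAPEMENT = {"a": 10, "b": 11, "c": 10}     # character -> width units

def to_photon_codes(text):
    """Return the (position, escapement) code pairs for a string."""
    return [(DISK_LEVEL_6[ch], ESCAPEMENT[ch]) for ch in text]
```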
When we received the Photon, we also acquired
the PC6 system of automatic photocomposition
programs developed under the direction of Barnett
while he was at M.I.T.4-s The PC6 system is typified by the TYPRINT program which requires text
containing fixed typesetting control codes as input.
These codes are set off from the text by square
brackets, which are reserved, and have fixed meanings as shown in the following examples:
[NP] New Paragraph
[DL6] Shift to Disk Level 6 (Highland type face)
[VL2] Leave 2 Blank Lines
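A scanner for such bracketed codes is straightforward; the sketch below is our simplified illustration, not the PC6 implementation:

```python
import re

# Sketch of TYPRINT-style fixed control codes set off by reserved square
# brackets, as in [NP], [DL6], [VL2]. The tokenization is illustrative.

CODE = re.compile(r"\[([A-Z]+)(\d*)\]")

def scan(text):
    """Split text into ('code', name, arg) and ('text', run) tokens."""
    tokens, pos = [], 0
    for m in CODE.finditer(text):
        if m.start() > pos:
            tokens.append(("text", text[pos:m.start()]))
        tokens.append(("code", m.group(1), m.group(2)))
        pos = m.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens
```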
In using this system, we soon found that the insertion of fixed codes can be laborious, that changes in
format require changes throughout the text, and
that many desirable formats are impossible to
achieve. We felt that a more flexible and more generally useful system of programs could be written.
We still believe, however, that the PC system was
a successful first step toward automatic photocomposition and, in general, the typesetting system we
have developed is an outgrowth of our experiences
with it.
The input to our system is either magnetic tape
in an arbitrary format produced from paper tape
punched specifically for typesetting or else magnetic
tape in the Rand standard format. The output is
again paper tape that will drive the Photon. A schematic diagram of this system is shown in Fig. 2. In
this figure, the two forms of magnetic tape input are shown at the top.

[Figure 2 residue: blocks for magnetic-tape-to-Photon-code translation, the page formatting program with a recycle path, optional editing by scope, an indexing program, and magnetic-tape-to-paper-tape conversion.]

Figure 2. Block diagram of the typesetting system.

The typesetting program is
shown as two separable functions. The first part,
which translates text into the double character Photon code, is relatively independent of the second
part, but is quite dependent on the particular photocomposition device being used, that is, on the Photon. This part would be largely rewritten if a new
piece of equipment were obtained. It is, however, a
rather simple and straightforward program. The
second part, labeled the "page formatting program,"
represents a real departure from the PC6 system and other typesetting systems we have seen. In this program, a full page of text is set before outputting is begun.
The page formatting program shows two forms of
output. The first is a magnetic tape which contains
Photon input that will be converted to paper tape.
The other form of output labeled the "history tape,"
is a magnetic tape containing the original text characters with their associated Photon codes, all of the
material added by the page formatting program,
page and line numbers, and sufficient parametric
information to reset the material exactly as it was
originally done. This tape can be recycled through
the page formatting program with corrections or
additions to the text or simply with changed parameters if the format is to be changed. Since page
numbers, tables, captions for figures, titles and subtitles, and so on are all in their proper place on this
tape, it can be used as input to a program that produces indices and tables of contents. Finally as
shown, this tape might simply be stored for a period of time and then recycled when a new edition is
to be set.
This history tape is an important by-product of
computerized typesetting and may well be a critical
factor in making the adoption of an automatic system economically feasible. This tape is essentially
an exact copy of the printed material, less illustrations which cannot be handled in our system, and is
a compact, machine-readable counterpart of the
standing type that occupies space in some print
shops and warehouses. Any material in this file can
be simply addressed by page and line number from
the corresponding printed document and changes
made. If a change is made that affects the remainder of the file, for example an insertion that affects
the pagination, all of the file will automatically be
corrected.
In designing this system, we came to the conclusion that typesetting control codes in the text to be
set are necessary if any format flexibility is to be
obtained. They, therefore, appear minimally in our
system. We have tried at the same time to ease the
burden of keyboarding these codes and of changing
their meaning in pre-prepared text by making them
entirely arbitrary. The text-dependent codes can be thought of simply as markers. The actions to be
taken when particular codes are encountered are
separately specified as parameters to the system.
These parameters can be inserted anywhere in the
text ahead of the markers to which they refer, or
they can be punched on parameter cards. If they are
keyboarded with the text, they are normally marked
off by dollar signs or some other specified reserved
symbol. The form of the printed output can be completely changed by changing these parameters with
no re-editing of the text itself.
In our system, we wished to include the ability to
control as much as possible the layout and final
form of the pages in the manuscript. We felt that
the deficiencies of other systems in this respect
stemmed from their line-by-line typesetting. The
attempt to visualize a page by as yet undefined
lines is difficult and usually leads to a number of
unnecessary trial runs. To ease this difficulty on the
programming level, we set full pages. On the conceptual level, we conceive of a page as a collection
of subpages or "boxes." A box is a string of fixed
text delimited by two markers. The material within
a box can be set independently of other material as
though it were itself a page and then the box of
fixed material placed in its proper position on the
page. The box system is recursive so that boxes
may be defined within boxes and for most functions, overlapping is allowed.
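The recursive box idea can be sketched as follows. The names and fields are our assumptions, since the text describes the concept but gives no data layout:

```python
# Illustrative sketch of the recursive "box" concept: a box of fixed text
# is set independently as though it were a page, then placed on its
# parent; boxes may contain boxes.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Box:
    text: str                 # fixed text delimited by two markers
    x: float = 0.0            # position on the parent page/box (inches)
    y: float = 0.0
    children: List["Box"] = field(default_factory=list)

    def set_page(self, indent=0):
        """Set this box, then place its children (recursively)."""
        lines = [" " * indent + self.text]
        for child in self.children:      # recursion: boxes within boxes
            lines.extend(child.set_page(indent + 2))
        return lines

page = Box("page", children=[Box("title box"),
                             Box("column box",
                                 children=[Box("caption box")])])
```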
The parameters used to control the system are of
three types: (1) general parameters, (2) text boundary parameters, and (3) box parameters. A list of
the general parameters is shown in Fig. 3.
Most of these parameters control the general appearance of the printed output. They include the
specification of page size, number of columns on
the page, type face, point size, and so on. The parameters specifying running page headers include a
provision for incorporating page numbers that are
automatically incremented. The last two parameters
are provided to make the keyboarding somewhat
simpler. The DLIM code allows the specification of
any character to mark off parameters when these
are included in the text in place of the preset dollar
sign. The DEL code allows any character to be
specified as a deletion code. It causes a character
over which it is typed to disappear from the input
string. Only those parameters that are to be different from their preset values need be specified.
The following list of general parameters:

$ PSIZ(8.5, 11), TFAC(SCOTCH), TSIZ(10), HEAD(Page /1/), COL(3.5, 1.5, 3.5) $

would specify 8 1/2 by 11 inch pages set in the Scotch face at 10 points, with a running header of the form "Page n" (the /1/ incorporating the automatically incremented page number) and two 3 1/2-inch columns separated by a 1 1/2-inch margin.

SYMBOL          MEANING                     NOTES

PSIZ(x,x)       Page SIZe                   Page size is width by height.
COL(x,x,x...)   COLumns                     Column widths and margins alternate.
JUSV(s)         JUStification-Vertical      Reserved words such as center, spread,
JUSH(s)         JUStification-Horizontal    etc. are used to indicate the action desired.
TFAC(n,i,b,B)   Type FACe                   n, i, b, B are names of type fonts.
FONT(f)         FONT                        Used to indicate italic or bold type.
TSIZ(p)         Type SIZe                   Type size is given in points.
BGND(p)         Back GrouND size            Background size is also in points.
TAB(x,x,x...)   TAB                         Tab settings measured from left margin.
MWS(p,p)        Minimum Word Spacing        Minimum distance between words.
XWS(p,p)        maXimum Word Spacing        Maximum distance between words.
HEAD(t)         HEADer                      Headers may be any string of text; it will be
LHEAD(t)        Left HEADer                 set on both pages. LHEAD and RHEAD are set
RHEAD(t)        Right HEADer                on respective pages only.
DLIM(c)         DeLIMiter                   Used to surround instructions in text.
DEL(c)          DeLetion character          Removes unwanted characters when backspacing.

x is a dimension expressed in inches.
p is a dimension expressed in points.
t is any string of text and may include any boundary or box markers.
c is any keyboardable character and should be one which is not normally used in the text.
n, i, b, B are the names of the type faces available; they determine which type face ...

Figure 3. The general parameters.
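A parameter list in this syntax is simple to parse. The sketch below follows the description given here and is not the system's actual parsing routine; the delimiter defaults to the preset dollar sign:

```python
import re

# Hypothetical parser for delimited parameter strings such as
# "$ PSIZ(8.5, 11), TFAC(SCOTCH), TSIZ(10) $"; the syntax follows the
# paper's description, while the parsing rules here are our own.

PARAM = re.compile(r"([A-Z]+)\(([^)]*)\)")

def parse_parameters(text, delimiter="$"):
    """Return {symbol: [args]} for every parameter between delimiters."""
    params = {}
    parts = text.split(delimiter)
    # Parameter lists occupy the odd-numbered fields between delimiters.
    for field in parts[1::2]:
        for name, args in PARAM.findall(field):
            params[name] = [a.strip() for a in args.split(",")]
    return params
```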
options (Fig. 7e).
To generate a string of characters for inclusion in
the notebook, a matrix of alphanumeric characters
and an accumulator are displayed simultaneously.
The user selects characters by pointing the light pencil at them in sequence. His choices are recorded in
the accumulator and errors can be canceled by a
suitable light pencil command. When the appropriate
character string has been generated, the command
PROCESS moves the display to the next option
(Figs. 7f and 8a).
played, and executed by the user, who need know
nothing more about the communication language
than is necessary to point out his desires. By using
the tree to review the catalog of currently available
file names, object names within files, and property
names within files, the user is able to browse
through the data structure as well as to familiarize
himself with the range of options and implications
available to him.
The sophisticated user is not restricted to this
mode of communication if his knowledge of the
AESOP: PROTOTYPE FOR ON-LINE USER CONTROL
443

Figure 7. Steps in the on-line construction of a file modification command, as described in detail in the text.
system exceeds that implied by the look, choose,
and point philosophy. Users who know the syntax
and vocabulary of the system may directly compose
on the on-line typewriter any message that can be
generated by means of the light pencil and communication tree. In fact, at any time in the evolution of the AESOP prototype, the typewriter user
has a more extensive language, portions of which, in
sequence, are transferred to the tree as this proves
useful. Currently, requests to print hard-copy, to
erase or to transfer portions of the notebook into
other portions are accomplished only by means of the
typewriter. System error messages are primarily
provided on the typewriter.
PROCEDURE GENERATION
The user can also process the data of the system,
either by executing or modifying established routines or by constructing new routines using his light
pencil and CRT display. In order to do this, he
calls up an on-line algorithm construction display
Figure 8. Figure 7 continued (panels a-f).
called OAK-TREET. The display appears as a
tree with one branch displaying commands which
can be light penciled to build and execute a program. A second branch displays some basic classes
of operators and operands or previously established
functions to be used as parts of this program. A
workspace within which to build a tree representation of the desired procedure exists as another
branch (Fig. 9a).
Each time a user light pencils a command, it appears in the upper right-hand corner of the display
(Fig. 9b). When he light pencils a class of operators, the specific operators within that class are displayed on a separate branch of the tree (Fig. 9c).
When he light pencils one of these specific operators, it also appears in the upper right-hand corner
of the display (Fig. 9d). Once a command and an
operator are so displayed, he can then light pencil
any portion of a tree in the workspace and the command will be executed, using the operator in question, at that specific location in the workspace
(Fig. 9e). In this general fashion, the user builds,
Figure 9. Steps in the on-line construction of a primitive procedure to multiply 4 by 5.
stores, retrieves, modifies, links, and in other ways
develops a catalog of logical and mathematical
expressions for use with his data base.
Numbers are generated by means of a special display of a tree of numerical characters which can be
added in an accumulator by light pencil selection.
These numerical strings are then transferred to the
upper right-hand accumulator in the OAK-TREET display for further use.
The OAK-TREET capability operates in two modes. The first displays the commands, operators, and workspace simultaneously. This mode may
crowd the workspace but the display logic drops
portions of the workspace tree off the display face,
leaving branch markers to indicate what parts are
gone whenever the workspace density exceeds a critical value. The second mode is used for inspection.
It turns the entire display face over to the workspace tree. In this mode, any point on the construction can be brought to the top of the display by
means of the light pencil in order to expose more of
the details of the tree below that point.
As an example of the use of the OAK-TREET feature, consider Figs. 10, 11 and 12. A user interested in a gross estimate of fuel consumption as a function of distance for a high performance aircraft in level flight might build the following planning procedure. Using the command REPLACE and a previously stored subroutine with the name JT1, he puts this routine into the workspace by pointing the light pencil at the node immediately below the word WORKSPACE (Fig. 10a). The previously defined routine JT1 is then displayed at that point in the workspace. In this example, the routine is a gross calculation of fuel consumption and is called POLLBS. It involves multiplying the fuel consumption rate by the ratio of the distance flown to the average speed of the aircraft. The user may substitute parameters into this procedure (see Fig. 10b) and then call for it to be executed by pointing the light pencil at the command EVALUATE. The result of the calculation will be printed by the on-line typewriter.
Figure 10. On-line retrieval of a prestored routine and insertion of parameters.
To set up a more complex expression for evaluation, the user now changes the distance flown to the variable C multiplied by 100. He does this by replacing 800 with the arithmetic operator X (multiply), then setting the branches to C and the number 100. Both are done with the command REPLACE (see Fig. 11a). The user now temporarily stores this modified routine under the arbitrary name GUS. He then uses the workspace to establish a new variable D set to the value of the routine stored under the name GUS. He does this by putting the logical operator = immediately under the word WORKSPACE (thereby erasing everything else) and then adding under it the variable D and the expression stored under the name GUS (see Fig. 11b).
This new expression is now stored temporarily
under the name JOE. A conditional expression is then
built in the workspace by putting the logical operator
Figure 11. On-line modification of a procedure, storage and
use as a subprocedure.
IF-THEN-ELSE immediately below the word WORKSPACE. The user then introduces the logical operator TYPE on both terminal branches of the conditional to indicate that both results of the conditional test should be typed by the on-line typewriter. He then adds the variable D to the true branch, and the same plus a marker of four dots to the false branch (see Fig. 12a).
The conditional test is set as a comparison to be
made by the operator LEQ meaning less than or equal
to. The comparison is to be made between the value
of D and the number 400 (see Fig. 12b). This portion of the routine is now stored under an arbitrary name ED and a next routine is constructed in order
to execute the previously defined routines.
This next routine involves the use of the operator
DO. It is used to first find the value of D and then
perform the conditional test. This is accomplished
by putting the routine previously stored under the
name JOE as a first branch for DO and then using
the command REPLACE, inserting the routine previously stored under name ED (see Fig. 12c). This
DO procedure is now stored under the name SAM
and a next higher order routine is built.
This next routine involves the operator FOR which
is used to run through a sequence of values for a
given variable. The variable in this case is the C
which was previously used as the variable for distance
in the fuel calculation. C will be set to values from 1
to 10. The previously defined DO expression, stored
under the name SAM, is then added to be evaluated
for these values of the variable C (see Fig. 12d).
The system can be shifted to the second or inspection mode and the complete routine can be expanded for viewing of any part of the tree (see Fig.
12e). The light pencil is used to bring any portion
of the tree to the top center (see Fig. 12f).
When the command EVALUATE is executed, the
typewriter prints the results of the fuel calculation
from 100 to 1000 miles in increments of 100 miles
followed by the indication of four dots when the fuel
requirement exceeds 400.
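The logic of the assembled routine can be sketched in modern terms. This is an illustrative Python rendering, not the original TREET code; the FUEL_RATE and AVG_SPEED values are assumptions, since the actual parameters appear only in the figures, which are not legible in this copy.

```python
# Sketch (in Python) of the routine assembled on-line in Figs. 10-12.
# FUEL_RATE and AVG_SPEED are hypothetical values chosen for illustration.
FUEL_RATE = 300.0   # assumed fuel consumption rate, lb per hour
AVG_SPEED = 500.0   # assumed average speed, miles per hour

def gus(c):
    """JT1 with the distance flown replaced by C x 100 (routine GUS)."""
    distance = c * 100
    return FUEL_RATE * (distance / AVG_SPEED)

def sam(c):
    """DO: first evaluate D (routine JOE), then the conditional test (ED)."""
    d = gus(c)              # JOE:  D = value of GUS
    if d <= 400:            # ED:   IF LEQ(D, 400)
        print(d)            #       THEN TYPE D
    else:
        print(d, "....")    #       ELSE TYPE D and the four-dot marker

for c in range(1, 11):      # FOR C = 1 through 10, DO SAM
    sam(c)
```

With these assumed values the four-dot marker first appears at C = 7, mirroring the behavior the text describes for distances whose fuel requirement exceeds 400.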
The total routine can now be stored as a permanent part of the data base and later used in its entirety. The routine can also be returned to the workspace for modification, or to detach subroutines for other purposes. As long as the total routine remains filed in the data base, it is available for inspection and application on-line.
As complex or extensive routines are constructed, it becomes increasingly undesirable to display the total structure to the operational user of the routine. In such cases only the name of the routine and the names of the insertable parameters need be displayed in the OAK-TREET workspace. The appropriate values for these parameters can then be inserted by the user and the named routine evaluated on call. (See Fig. 13 for an example.)
In this example a routine called STATUS requires
the insertion of a destination code (AREA), the
type of aircraft (AC-TYPE), and number of aircraft
(NO-AC) to be flown to that destination, and the
number of originating airfields to be checked for the
availability of these aircraft (NO-AF). The routine
Figure 12. Steps in the on-line construction of a complex procedure using previously on-line defined and stored subprocedures. Expansion for inspection in (e) and (f). See text for a detailed description of the procedure.
then locates up to this number of originating airfields
with sufficient aircraft available, in order of increasing distance from the destination. It lists the airfields, the number of aircraft available, the organizational designation of the aircraft, and the distance and time required to reach the destination.
The detailed structure of user routines such as
STATUS is examined and debugged in another display environment called DEBUG. (See Fig. 14.) In
the DEBUG function, the structure of the routine is
displayed as the second leftmost limb of a tree along
with a set of commands which permit the programmer to examine any selected portion of the routine
in question and then modify and redefine it.
A routine is brought to the top of the second
leftmost limb of the DEBUG tree by an on-line
typewriter command DEBUG (name). Any subtree
of that routine is then moved to the top of that
limb by pointing the light pencil at the node on top
of the substructure in question. The structure can
Figure 13. The use of OAK-TREET for parameter insertion
in fixed procedures.
also be displayed, in whole or in part, by itself, in either a multinode or single-node state.
By light pencil, using the command OAK, the substructure displayed on the second leftmost branch of the DEBUG tree is moved into the workspace of OAK-TREET. In this position it can be modified using the standard OAK-TREET capabilities. The modified structure is then moved back to the second leftmost limb of the DEBUG environment by light-penciling the OAK-TREET command KEYBOARD. The modified structure replaces the previous structure in the system by light-penciling the redefinition command DEFTR. The revised and redefined function can then be evaluated using the command TEST.
This procedure for on-line program development
and revision applies to all of the user and system
functions constructed in and interpreted by the
AESOP list processor, TREET (see the Appendix
for details). For example, it is possible to DEBUG
(FACTORIAL) as in Fig. 14(a), (b), (c) and (d)
or DEBUG (STATUS) as in (e) and (f). It is also
possible, for example, to DEBUG (OAK) or DEBUG (DEBUG).
In previous examples, data used in the execution
of a user's routine were inserted by means of the
user's light pencil. The results of executing the routine were printed by the on-line typewriter. However, it is also possible to have the routine call for
data directly out of the computer-based notebook
and then place the results of its execution in the
notebook. In effect, the TREET processor can be
considered one of the on-line users of the notebook-based file. The TREET processor retrieves
data from the notebook using any of the system retrieval capabilities available to the other on-line
users of the notebook. It stores data in the notebook using· the same range of storage capabilities
available to human users of the system. It uses the
same communication language.
For example, in Fig. 15, the file BLANK is organized as a scratchpad for keeping track of the input and output data derived from the evaluation of
the STATUS 1 routine. The routine STATUS 1 differs from STATUS (see Fig. 13) mainly in its use
of the notebook for its data base. Working headings
are established in BLANK for COL1 through
COL4 to fix the column locations for input to
STATUS 1. Column headings are established in
Figure 14. Steps in the generation or modification of an interpreted system or user procedure. See text for
details.
Figure 15. Steps in the use of the notebook for data input
to and data output from user procedures. See
text for details.
COL6 through COL10 for recording outputs from STATUS 1. STATUS 1 is constructed to include a data call on the file BLANK in order to list input data from COL1 through COL4, as shown in Fig. 15(a). It also includes a data change order to insert output data in COL6 through COL10 as in Fig. 15(b). Source and destination line numbers are supplied by the on-line user in OAK-TREET, as in Fig. 15(c). Input data are then either generated on-line or transferred from other portions of the notebook into BLANK, as in Fig. 15(d); the appropriate line numbers are inserted into STATUS 1, as in Fig. 15(e); and the execution of the STATUS 1 routine updates the notebook as in Fig. 15(f).
Using this type of on-line capability, in conjunction with capabilities for moving data from one location in the notebook to another, the human user of AESOP can execute routines, store resulting data in temporary notebook files, review these results, and then select those for transfer into permanent organization files and/or reports.
APPENDIX
Aesop Software Structure
The Aesop prototype is an evolutionary experimental system. As such it is incremental in growth.
To facilitate this growth an attempt has been made
to make the system modular as much as seemed
practical. There are certain elementary system functions such as data retrieval, data updating and data
display. At another level there are additional functions that are preprocessors or switching routines for
the elementary functions. As an example, there is a COPY action that will copy data from one system file into another. It does this by using the data retrieval and then the data updating routines in succession. Fig. 16 outlines the functional organization of the system.
Figure 16. Simplified schematic of the AESOP A/I software
interrelations.
The total system occupies about 35,000 computer
words. The remainder of the computer core memory is used as free storage space for list structures.
The two prime considerations in the construction of
the system were:
1. Ease of extension to permit the system to
grow as new capabilities are added to it.
2. Speed of operation to give the operator
fast response to all of his inputs.
Economy of storage has not been a prime consideration. All of the system data base is stored in the
disk unit.
TREET is a general-purpose list processing system* written for the IBM 7030 computer at the MITRE Corporation. All programs in TREET are
coded as functions. A function normally has a
unique value (which may be an arbitrarily complex
list structure), a unique name, and operates with
zero or more arguments. A function may or may
not have an effect on the system. Some functions
are used for their effect, some for their value, and
some for both. The OAK-TREET function as it
appears to the operator has commands, data classes
and data which can be used for procedure construction.
What follows is a simplified explanation of the
principal nodes of the limbs of the OAK-TREET
tree. OAK-TREET is constructed in TREET and
OAK-TREET expressions are evaluated by the
interpretive list processor.
OAK-TREET
COMMANDS
REPLACE - The effect of this command is to
place in the workspace, at the point indicated
by the light pencil, the expression, symbol, or
structure indicated by its argument.
EVALUATE - When this command is signalled by the light pencil the expression in the
workspace is evaluated.
KEYBOARD - When this command is signalled by the light pencil, the system will expect
the next command to come from the on-line
typewriter.
STORE - A copy of the expression which is indicated by the light pencil is maintained in the system under the name of the argument.
ADD-RT - The expression indicated by the first argument of this command will be added to the workspace with the same parent as the node indicated by the light pencil.
*See E. C. Haines, "The TREET List Processing Language," SR-133, The MITRE Corporation (April 1965).
ERASE - Removes the node (and all nodes
dependent upon it) indicated by the light pencil
from the workspace.
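The editing commands above operate on a tree addressed by light-pencil selection. A minimal model of the workspace in Python terms (the class, method names, and path-of-indices addressing are my own; the original selected nodes directly with the light pencil):

```python
# Minimal model of the OAK-TREET workspace and its editing commands.
# A "pencil position" is a path of child indices below the WORKSPACE root.
import copy

class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children if children is not None else []

class Workspace:
    def __init__(self):
        self.root = Node("WORKSPACE")
        self.stored = {}                       # routines kept by STORE

    def _at(self, path):
        node = self.root
        for i in path:
            node = node.children[i]
        return node

    def replace(self, path, subtree):          # REPLACE at the pencil point
        self._at(path[:-1]).children[path[-1]] = subtree

    def add_rt(self, path, subtree):           # ADD-RT: same parent as pencil node
        self._at(path[:-1]).children.append(subtree)

    def erase(self, path):                     # ERASE: node and all dependents
        del self._at(path[:-1]).children[path[-1]]

    def store(self, name, path):               # STORE: keep a named copy
        self.stored[name] = copy.deepcopy(self._at(path))

# The 4 x 5 example of Fig. 9, then an 800 -> C x 100 style edit as in Fig. 11a:
ws = Workspace()
ws.root.children = [Node("X", [Node("4"), Node("5")])]
ws.store("JT1", [0])
ws.replace([0, 1], Node("X", [Node("C"), Node("100")]))
```

Because STORE keeps a copy, later edits to the workspace leave the stored routine intact, matching the way GUS and JOE are filed and reused in the text.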
DATA
ARITH2 - Arithmetic Operators.
+ - Computes the sum of its arguments.
- - Computes the difference of its arguments.
X - Computes the product of its arguments.
DIV - Computes the quotient of its arguments.
SUMMATION - Sums an expression while
a variable goes from some number to another in increments of one.
FACTORIAL - Computes the factorial of
its argument.
EXPONENT - Raises its first argument to
the power indicated by its second argument.
EQUALS - A predicate which checks for
equality of its two arguments.
LEQ - A predicate which is true if its first
argument is less than or equal to its second argument.
LT - A predicate which is true if its first argument is less than its second argument.
LOGIC2 - Logical Operators.
TYPE - Types out the value of its arguments on the typewriter.
DO - A convenient way of grouping several
expressions under one node.
EVAL - Evaluates its argument which must
be an expression in Cambridge Polish
notation.
CONS - Computes the list of its second
argument augmented by its first argument.
MEM 1 - Computes the first member of its
argument which must be a list.
REM 1 - Computes the list of its argument
with its first member removed.
AND - Logical Intersection. Value is TRUE
if both arguments are not NIL.
OR - Logical union. Value is TRUE if
either (or both) argument is not NIL;
NIL otherwise.
NOT - Logical negation. Value is TRUE if
its argument is NIL.
ATOM - A predicate which asks whether its argument is atomic.
EQUALS - A predicate which checks for
equality of its two arguments.
PROG2 - Has as its value its second argument. It is useful for attaching a different
value to a computation.
WHILE - Evaluates its second argument
while its first argument is (evaluates to)
true.
454
PROCEEDINGS -FALL JOINT COMPUTER CONFERENCE,
IF-THEN - IF (a1 THEN a2) The value of this expression is the value of a2 if a1 evaluates to true; NIL otherwise.
IF-TH-ELSE - IF (a1 THEN a2 ELSE a3)
The value of this expression is the value
of a2 or a3 depending on whether a1 is
true or false.
FOR - Provides a convenient way to execute an expression (its fourth argument)
for a numerical range (between the values
of its second and third arguments) of a
variable (its first argument).
= - This is the assignment operator. It sets
its first argument (which must be a variable) to the value of its second argument.
Q - Quotes its argument.
ADL - The second argument of ADL must be a variable which evaluates to a list. ADL sets that variable to CONS(EVAL(arg1) arg2), thus effectively adding something to a list.
CHOP - The single argument of CHOP
must be a variable which evaluates to a
list. The value of CHOP is the first member of that list. The variable is set to the
remainder of the list.
FNA - The value of FNA is the value of
its argument considered as a function applied to no arguments. Its only purpose
is to represent a function of no arguments
in tree structure.
NAME - The value of NAME is the value of its first argument. The second argument is ignored. NAME is used to label an expression.
NAMES
This is a set of undefined symbols which may
be used as the name under which a routine is
stored for later use.
NUMBERS
This calls up a function which permits any
integer to be constructed by pointing the light
pencil to its digits in sequence.
VARIABLES
This is a set of variables which can be used
in an expression in the workspace.
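Since OAK-TREET expressions are trees evaluated by the TREET interpreter in Cambridge Polish form, the operator glossary above can be approximated by a small recursive evaluator. The following is an illustrative sketch, not the original implementation, and models only a handful of the operators:

```python
# Toy evaluator for a few of the operators above; expressions are nested
# lists in Cambridge Polish notation, e.g. ["X", 4, 5] or
# ["IF-TH-ELSE", test, then_part, else_part].

def evaluate(expr, env):
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):                  # a variable name
        return env[expr]
    op, *args = expr
    if op == "Q":                              # quote: argument unevaluated
        return args[0]
    if op == "=":                              # assignment operator
        env[args[0]] = evaluate(args[1], env)
        return env[args[0]]
    if op == "DO":                             # group expressions; value of last
        result = None
        for a in args:
            result = evaluate(a, env)
        return result
    if op == "IF-TH-ELSE":
        return (evaluate(args[1], env) if evaluate(args[0], env)
                else evaluate(args[2], env))
    if op == "LEQ":
        return evaluate(args[0], env) <= evaluate(args[1], env)
    if op == "FOR":                            # run a variable over a range
        var, lo, hi, body = args
        for v in range(evaluate(lo, env), evaluate(hi, env) + 1):
            env[var] = v
            evaluate(body, env)
        return None
    if op == "+":
        return evaluate(args[0], env) + evaluate(args[1], env)
    if op == "X":
        return evaluate(args[0], env) * evaluate(args[1], env)
    raise ValueError("unknown operator: %r" % op)

# FOR I = 1 to 5 DO S = S + I  -- sums 1..5 into S
env = {"S": 0}
evaluate(["FOR", "I", 1, 5, ["=", "S", ["+", "S", "I"]]], env)
```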
DEBUG
DEBUG is a function which allows other functions to be displayed and changed on-line. It works with any interpreted function but cannot display a machine language coded function. Most of the work of debugging a function is done within DEBUG using the light pencil. In the DEBUG display are five branches, from left to right: (1) COMMANDS, (2) name of the function to be examined, (3) type of function, (4) ARGS, and (5) PVARS.
VIEW ACTIONS
If any node other than one in the COMMANDS limb is light-penciled, then the multiple-rooted subtree
headed by that node replaces the tree structure in
the second leftmost branch to the right of the
COMMANDS branch. This feature allows one to
view all of a function that is otherwise too large to
fit on the display, to concentrate attention on a particular substructure of a function, or to select which
part of the tree will be taken into OAK-TREET
for modification.
COMMANDS
RESTORE - This restores the display to its original position, thereby cancelling all previous view actions.
BACKUP - This command cancels the last
previous view action (if any have been performed since the last RESTORE).
OAK - The tree on the second leftmost branch
of DEBUG, to the right of the COMMANDS
branch, is placed in the workspace of OAKTREET and the OAK-TREET function is entered. Changes to the tree may be made as
desired. DEBUG is then reentered with this
modified structure by light penciling the command KEYBOARD in OAK-TREET.
KEYBOARD - This command returns control to the on-line typewriter keyboard. If the control has been returned to the on-line typewriter, the DEBUG function may be reentered without starting over again by typing R().
Changes may also be made using keyboard tree
changing functions.
DEFTR - DEFTR redefines the function according to the present (modified) configuration
of the tree. (Changes made to the displayed
version are not reflected in the function itself
until this is done.)
TEST - This command initiates the processing
for execution of the function displayed. The
value of the function is printed by the on-line
typewriter.
BRANCH TWO
On the second leftmost branch appears the name
of the function being examined, and under it a list
of statements and location symbols. Location symbols are represented by a node containing that
symbol. Statements are represented by trees in the
same fashion as in OAK.
TYPE
The function type is normally regular, type R. Type F indicates that the arguments of the function should not be evaluated prior to evaluating the function itself. Type U functions allow an arbitrary number of arguments to be specified; the arguments are collected in a single list and given to the function as one argument. A type FU function is the combination of types F and U.
ARGS
The list of arguments specifies which symbols will be set as the values of the arguments of the function when it is called. The old values (if any) of the symbols are automatically saved and restored.
PVARS
The value of an argument in the program variables
list is automatically saved and restored by the
function.
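The ARGS and PVARS conventions amount to what is now called shallow binding: each symbol has one value cell, and the function saves the old contents on entry and restores them on return. A hedged sketch of the mechanism (the cell representation and function names are mine, not TREET's):

```python
# Sketch of the ARGS/PVARS save-and-restore discipline (shallow binding):
# one value cell per symbol; a call binds arguments and program variables,
# and the old values are automatically restored on return.

symbols = {}   # symbol -> current value

def call(body, arg_names, pvars, arg_values):
    saved = {s: symbols.get(s) for s in list(arg_names) + list(pvars)}
    try:
        for name, value in zip(arg_names, arg_values):
            symbols[name] = value          # bind ARGS to the actual arguments
        for p in pvars:
            symbols[p] = None              # fresh PVARS for this invocation
        return body()
    finally:
        symbols.update(saved)              # old values (if any) restored

symbols["N"] = 99                          # an outer binding of N
result = call(lambda: symbols["N"] * symbols["N"], ["N"], ["TMP"], [7])
```

After the call returns, N again holds its outer value, which is exactly the "automatically saved and restored" behavior described above.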
STRUCTURING PROGRAMS FOR MULTIPROGRAM
TIME-SHARING ON-LINE APPLICATIONS*
Kenneth Lock
California Institute of Technology
Pasadena, California
INTRODUCTION
The modern art of computation has developed from plugboard programming through the stored machine instruction programs controlled by the users on the consoles, then to problem-oriented symbolic programs computed in the batch mode, towards the on-line computing during which the users have a large amount of control over their programs. The lower cost per computation and flexibilities of a large capacity high-speed computer naturally lead us to consider the provision of on-line computing service to several users on a single high-performance machine in a time-sharing mode, rather than several smaller machines, one for each individual. To maximize the efficiency of a man-machine team working in an on-line computing mode, it is desirable to let the man choose the language, say English, for communication and to let the machine do the translation. This idealistic goal is not impossible, but is currently impractical. A good compromise is to select as the user language a formal language such as ALGOL, FORTRAN or LISP which has a set of explicit syntactical rules and a small set of basic vocabulary. The user may then extend the vocabulary by declarative statements and communicate with the machine in the extended vocabulary. Because of frequent message exchanges between the man and the machine during on-line computing, the machine representation of users' programs must be easy to modify at the source language level. The technological trend towards large random access memory suggests the retention of several users' programs in core simultaneously; hence mutual memory protection must be ensured.
This paper describes a scheme of structuring the users' ALGOL programs in accordance with the syntactical unit of a statement. The scheme enables the user to make modifications to his source language program at the statement level without recompiling the complete program. The same structure is used to provide the logic sequence of executing statements and to ensure memory protection among users. The next section describes the operating environment of on-line computing which justifies the scheme presented in this paper. The following section reviews the recursive definition of a statement in ALGOL as a syntactical unit which is used as the unit of communication from the user to the machine as well as the building block of the program structure in the machine. The next-to-last section describes the statement-oriented program structure, and the final section shows the role played by the program structure for multiprogram time-shared on-line computations.

*The study is partly supported by National Science Foundation Grant GP-4264.
THE ENVIRONMENT IN ON-LINE COMPUTATIONS
There are two modes in on-line computation:
constructing the program, and executing the program. Since the programmer is constructing the program on-line piece by piece, it is desirable to specify a minimum number of rules, either things the programmer is not allowed to do, or actions the programmer may take. In either case the programmer will not be burdened with remembering many rules. When executing the program on-line, the user must be able to exercise controls to start and stop the computation at will. The construction of a machine code program on an operator's console imposes a very simple rule on program modification, namely that any single instruction may be independently changed. The execution of a program on an operator's console provides complete control at the machine instruction level, namely that the program may be started or stopped at any specific instruction, or that the program may be stepped
through. However, the direct use of an operator's
console for on-line computing was discarded on
account of the weakness in using a machine code
language for program construction and the wastefulness of computer time due to human intervention.
The introduction of high-level programming languages and batch operation eliminated the above
shortcomings and at the same time ruled out the features of on-line computation.
From the above analysis, we can say that an acceptable on-line computing system must offer each user an input/output device. From this device, he may construct his program piece by piece in a high-level programming language, in which a statement is the building block. When executing the program, he may control the sequence by starting, stopping or stepping through his program at the statement level. For economy, the system must be time-shared among several users to minimize system idle time. From a user's point of view, he enjoys the advantages of an operator's console and a high-level programming language. In this environment, the following are taken as the design specifications:
1. The programming language must be easy to
learn and powerful in expressing algorithms. The
syntax of the language must allow easy extension to
cope with applications such as symbol manipulations.1 ALGOL 60 2 is considered a promising language.
2. The source language program should not be
completely compiled into a single machine code
program such that local changes in source program
only require local modifications to its machine representation. An incremental compiler is required.
3. The communication between the user and the
machine should be machine independent. For example, the user may ask for the values of variables by
specifying their symbolic names, rather than the actual locations in memory.
4. Several users on-line should time share the
processor, and all users' programs and data should
be retained in core whenever possible, to minimize
swapping.
5. A statement is taken as the basic unit of processing, such that the user may start or stop his program at specified statements or may execute his program one statement at a time in a "step" mode.
A STATEMENT IN ALGOL
Since the publication of the "Revised Report on
the Algorithmic Language ALGOL 60,"2 suggestions 3
have been made to generalize it. The following generalization of the definition of a statement is introduced here to give a simpler syntax and render it
more suitable for on-line computations.
<statement> ::= <unlabelled statement> | <label> : <statement>
<unlabelled statement> ::= <block> | <if statement> | <assignment statement> | <for statement> | <procedure statement> | empty | <declaration> | (<lower limit>, <upper limit>)
<block> ::= begin <list of statements> end
<list of statements> ::= <statement> | <list of statements> ; <statement>
<if statement> ::= if <boolean expression> then <statement> else <statement>
The inclusion of <declaration> as an unlabelled statement and of (<lower limit>, <upper limit>) as a statement generalizes the definition; (<lower limit>, <upper limit>) denotes a statement already compiled and assigned a unique line number for identification by the compiler.
Figure 2. The structure of the program in Fig. 1. (Legend: S = <statement>; B = <block>; D = <declaration>; FCL = <for-clause>; BEX = <boolean expression>; PR = program return; DR = declaration return; FR = for-statement return.)
Figure 3 shows an element in the program structure. It is a block of γ words. The various quantities in an element are listed as follows:
Figure 3. An element in the program structure. (Fields, top to bottom: TYPE, β, γ, f, h, P1, P2, line number, PM code, statement image.)
TYPE : a type indicator from block, declaration, procedure, if-statement, for-statement, others.
β : an integer specifying the location of the first word in the statement image field relative to the first word in the element.
γ : an integer that specifies the number of words in the element.
f : a pointer to the element representing the next statement.
h : a pointer to the element representing the block in which the given element is a statement.
P1, P2 : pointers depending on the TYPE according to the following table:

TYPE           P1 points to           P2 points to
block          list of declarations   list of statements
declaration    not used               not used
procedure      not used               procedure body
if-statement   statement after then   statement after else
for-statement  not used               statement after do
others         not used               not used

line number : a unique number for each element used for reference by the user.
PM code : pseudo-machine code for the compiled statement. The PM code differs from the absolute machine code in the following ways:
(1) All references to identifiers are indirectly addressed through non-relocatable entries in the user's symbol table.
(2) The last instruction in a PM code block always returns control to the execution monitor in the system, which selects the next element in the program structure for execution.
statement image : the source statement image is retained in its symbolic form.
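The element just described corresponds to a record type in modern notation. The field names below follow the text; the Python rendering itself, including the sample statements, is purely illustrative:

```python
# The element of Fig. 3 rendered as a modern record type.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Element:
    TYPE: str                        # block, declaration, procedure, if-, for-, other
    beta: int                        # offset of the statement image in the element
    gamma: int                       # number of words in the element
    f: Optional["Element"] = None    # pointer to the next statement
    h: Optional["Element"] = None    # pointer to the enclosing block
    P1: Optional["Element"] = None   # TYPE-dependent (e.g. statement after then)
    P2: Optional["Element"] = None   # TYPE-dependent (e.g. statement after else/do)
    line_number: str = ""            # unique number for user reference
    pm_code: List[str] = field(default_factory=list)   # pseudo-machine code
    statement_image: str = ""        # source text kept in symbolic form

# Two consecutive statements chained by f inside a block b:
b = Element("block", 10, 12, line_number="1.")
s1 = Element("others", 8, 14, h=b, line_number="2.", statement_image="x := 0;")
s2 = Element("others", 8, 14, h=b, line_number="3.", statement_image="y := x;")
s1.f = s2
```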
THE ROLE OF PROGRAM STRUCTURE
IN MULTIPROGRAM TIME-SHARED
ON-LINE COMPUTATIONS
In this section, we will describe how the statement-oriented program structure in the last section can be used in multiprogram, time-shared
on-line computations. The first part of this section
describes the list-structure-like operations on the
program structure during program statement input and modifications. The second part shows the
use of program structure during execution in keeping track of the next statement to be executed. In
the final part the dynamic nature of the program
structure is demonstrated to be extremely desirable
in applications that involve frequent man-machine
interactions and dynamic data structure.
Incremental Statement Compilation
and the Statement-Oriented Program Structure
Conventional compilers translate the source language programs into relocatable codes, and the
loader converts them into absolute code. This
scheme usually produces an efficient object code;
however, the complete process has to be repeated if
any changes, however small, are made in the source
language program. An incremental compiler is characterized by its ability to compile each statement
independently, so that any local change in a statement calls only for recompilation of the statement,
not the complete program. When the compiled program is structured as in the preceding section, statement insertions, deletions or modifications are handled by adding, removing or replacing some elements in the program structure with appropriate
changes in structure pointers. The dependencies between any two statements lie only in the common
set of identifiers that appear in them and their relative location within a program. The latter is encoded into the set of structure pointers in the program
structure.· The identifier dependency among statements is made indirect through reference entries in
the symbol table. Only one reference entry is used
for each distinct identifier such that all statements
can be independently compiled into PM codes. The
contents in the reference entries are set dynamically
during execution according to the declaration on
the identifiers. Figure 4 shows the indirect dependence among statements through the symbol table
and the program structure.
Figure 4. Indirect dependence among statements through symbol table and program structure.
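The indirection just described can be sketched as follows: each distinct identifier gets exactly one reference entry, and compiled statements touch storage only through that entry, so any statement can be recompiled without disturbing the others. The names and representation below are illustrative, not from the paper:

```python
# Sketch of identifier indirection through the symbol table: PM code never
# holds an absolute address, only a handle on the identifier's single
# reference entry, so statements compile independently of one another.

symbol_table = {}   # identifier -> its one shared reference entry

def ref_entry(name):
    """Return the single reference entry for a distinct identifier."""
    return symbol_table.setdefault(name, {"value": None})

def stmt1():        # independently compiled form of   x := 5
    ref_entry("x")["value"] = 5

def stmt2():        # independently compiled form of   y := x + 1
    ref_entry("y")["value"] = ref_entry("x")["value"] + 1

stmt1()
stmt2()
```

Replacing stmt1 with a recompiled version changes nothing in stmt2, because both reach x only through the shared reference entry.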
When the program is incrementally constructed
on-line, some building code must be specified.
With our statement-oriented program structure
and the definition of a statement given earlier, the
rule becomes very simple:
Any integral number of statements in the program structure, called "out-statements," can be replaced by any integral number of newly specified statements, called "in-statements."
Figure 5 shows several examples that represent
integral number of statements and also some examples that do not represent integral number of statements.
Replacing no out-statements amounts to inserting in-statements. Specifying no in-statements amounts to deleting the out-statements. Some method of specifying the out-statements in the program for replacement must be provided. One way is probably to display the source program on a CRT and let the user mark, by light pen, the limits that enclose the out-statements. Another method is to associate each element in the program structure with an identification line number when the statement for that element is compiled and connected into the structure. The user may subsequently refer to any element in the structure by its line number. The out-statements can be specified by a pair of line numbers (l1, l2) which represent the first and the last of the out-statements, or by an insertion point in the program structure when the out-statements are empty. For ease of cross referencing, successive statements are assigned line numbers in increasing order. All statements inserted between the statements numbered n. and n+1. are numbered into sub-levels n.1., n.2., etc. Syntactically the out-statements can be defined as follows:

<out-statements> ::= (<lower limit>, <upper limit>) | <insertion point>
<lower limit> ::= <line number>
<upper limit> ::= <line number>
<insertion point> ::= <line number> + | <line number> -
Examples:

(1., 2.4.)
1.2. +
2.3.4. -

Semantics:

(<lower limit>, <upper limit>) specifies the out-statements between <lower limit> and <upper limit> inclusively, e.g., (2., 3.) in Fig. 6.

<line number> + specifies the point in the program structure that follows the f pointer in the element identified by <line number>, e.g., 2. + and 3. + in Fig. 6.

<line number> - specifies the point in the program structure that precedes the element identified by <line number>, e.g., 3. -, 4. - and 5. - in Fig. 6.
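The dotted line numbers order naturally as integer tuples, so a small sketch (the tuple representation is an assumption, not the paper's) shows that sub-level numbers n.1., n.2., ... sort between n. and n+1.:

```python
# Sketch: a dotted line number like "2.3.4." parses to the tuple
# (2, 3, 4); Python tuple comparison then gives exactly the program
# order the text describes, with inserted statements at sub-levels.

def parse_line_number(s):
    # Each component ends with a period: "2.3.4." -> (2, 3, 4).
    return tuple(int(p) for p in s.split('.') if p)

nums = ["1.", "1.1.", "1.2.", "2.", "2.4.", "2.10."]
keys = [parse_line_number(n) for n in nums]
assert keys == sorted(keys)   # the listing is already in program order
```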
The in-statements that replace the out-statements can be syntactically defined as follows:

<in-statements> ::= <in-statement> | <in-statements> ; <in-statement>
<in-statement> ::= <statement> | <line number> | (<lower limit>, <upper limit>)
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965
Figure 5. Examples of specifying integral number of statements.
Example:

A := B + C; (1., 5.); begin C := 0 end; 7.1.

Semantics:

<statement> can be any ALGOL statement as defined above (A Statement in ALGOL).

<line number> denotes the statement already in the program structure identified by <line number>.

(<lower limit>, <upper limit>) denotes the statements already in the program structure between <lower limit> and <upper limit>.
When line numbers are used in forming an in-statement, they represent the statements already in
the program structure. Copies of these elements are
incorporated into new locations in the program structure; they are not automatically deleted from their old
locations.
A compile command that alters, builds, or manipulates the program structure takes the form:

<compile command> ::= compile <out-statements>, <in-statements> E0M

E0M is an action on the input device that will interrupt the machine and cause the monitor in the system to respond to the message.
Example:

Let Fig. 7a be some program structure; then the compile command

compile (1., 3.), A := B; B := 0 E0M

changes the program structure into the form in Fig. 7b.

compile (1., 2.), begin (1., 2.) end E0M

changes it into the form in Fig. 7c, which can be transformed to Fig. 7d by

compile 1. +, if BEX then 6. else C := 0 E0M
By using independent statement compilations and the program structure described in the preceding section, the user may manipulate his program quite freely, provided that a statement is taken to be the smallest unit of manipulation. Since the user has to be familiar with the definition of a statement in ALGOL before he can express his problem algorithms in the language, the program manipulation rules based on the concept of a statement should be natural and easy for him to apply. This is analogous to a user manipulating his machine code program at a console, in which case the smallest unit he may change in his program is a single machine code instruction.
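The effect of a compile command can be sketched with a flat Python list standing in for the paper's pointer structure (the function and data layout are illustrative assumptions): the out-statements are unlinked and the in-statements linked in, with no global recompilation.

```python
# Sketch of the compile command's effect: statements are (line number
# tuple, code) pairs kept in program order; replacing out-statements
# (l1, l2) is pure list surgery on that ordering.

def compile_cmd(program, out_lo, out_hi, in_statements):
    """Replace every statement with out_lo <= line <= out_hi.
    Empty out-statements (no line in range) makes this an insertion;
    empty in_statements makes it a deletion."""
    i = next((k for k, (ln, _) in enumerate(program) if ln >= out_lo),
             len(program))
    j = next((k for k, (ln, _) in enumerate(program) if ln > out_hi),
             len(program))
    return program[:i] + in_statements + program[j:]

prog = [((1,), "A := B"), ((2,), "B := 0"), ((3,), "C := A")]
# compile (1., 2.), X := 1 E0M  -- two out-statements, one in-statement
prog = compile_cmd(prog, (1,), (2,), [((1,), "X := 1")])
assert [c for _, c in prog] == ["X := 1", "C := A"]
```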
On-Line Control over Program Execution
After the source language program is converted into the statement-oriented program structure, interactions among statements are made indirectly
through the reference entries for identifiers in the
symbol table and the set of structure pointers in the
program structure.

Figure 7. Program structures (a) through (d) produced by the compile commands in the example above.

The last instruction in the pseudomachine code for a statement always returns control to the execution monitor which, from the
pointer to the element for the statement just executed, selects the next statement for execution. Because of the recursiveness of a statement in ALGOL, a push down list called ESL (for execution status list) is maintained for each user. The top element in ESL points to the current statement being executed; the element next below in ESL points to the statement of which the current statement is a component. For example, in Fig. 7d, when the statement A := B is being executed, the top element in ESL points to the element numbered 2. and the element next below in ESL points to the element numbered 1. Depending on the type of the element, the last instruction in the pseudomachine code returns control to different points in the execution monitor, which takes action depending on the user's operation mode.
The user's operation mode is set by an execution
command.
Syntax:

<execution command> ::= <start> | <step> | <stop>
<start> ::= execute <execution bounds> E0M
<execution bounds> ::= empty | (<starting point>, <stopping point>)
<step> ::= step <starting point> E0M
<stop> ::= E0M
<starting point> ::= empty | <line number>
<stopping point> ::= empty | <line number>

Examples:

execute E0M
execute (1., 3.5.) E0M
execute (3.1.,) E0M
execute (, 4.5.6.) E0M
step E0M
step 5.6. E0M
E0M

Semantics:
A user's program can be either in "execute mode" or "step mode." A <start> will set the user into the execute mode. If a nonempty <execution bounds> is specified, the program will start from the <starting point> and stop at the <stopping point>. An empty <starting point> implies the top element in ESL, and an empty <stopping point> implies an infinite line number. A <stop> sets the user's program into step mode. If the <execution bounds> is empty, the program will continue from the statement currently pointed to by the top element in ESL and will come to a halt only if <stop> is initiated from the input device or it comes to the previously specified <stopping point> or a program stop. In step mode, execution is halted after each statement. A <step> instructs one statement to be executed. If the <starting point> in the <step> is empty, the element in the program structure pointed to by the top element in ESL is executed. Otherwise, <starting point> is set to be the top element in ESL, with appropriate pop ups and push downs in ESL to maintain the proper block level being referenced; then the element pointed to in ESL is executed. In step mode, the execution of each statement provides the user certain trace information on the on-line output device such as the value
of an expression. At this point we can again see the analogy to the control a programmer can exercise on his machine code program from an operator's console. The <start>, <step> and <stop> commands are analogous to the start-, step- and stop-push buttons. The <starting point> is analogous to setting the instruction counter, which is, in our system, generalized into the push down list ESL. The control unit in a computer that maintains the correct execution order from one machine instruction to the next is conceptually extended in our system into the "execution monitor." There is the difference, however, that in our system the user communicates in a problem-oriented language.
The execution monitor's operation is described
below by using a set of ALGOL-like statements. The
following terminology is employed:
ESL [1]: the top element in the push down list
ESL.
t(ESL [1]): the t-pointer in the element representing a block pointed by ESL [1]. t points to
the structure that represents all the nondeclarative statements in the block.
f(ESL [1]): the f-pointer in the element representing a statement pointed by ESL [1]. f points
to the next statement, namely the statement that
follows the statement separator.
ts(ESL [1]): the ts-pointer in the element representing a conditional statement pointed by ESL [1]. ts points to the statement that follows then.

fs(ESL [1]): the fs-pointer in the element representing a conditional statement pointed by ESL [1]. fs points to the statement that follows else.
ds(ESL [1]): the ds-pointer in the element representing a for-statement pointed by ESL [1].
ds points to the statement that follows do.
push down A into B: all elements in the push down list B are pushed down one level and the quantity A becomes the top element in B, i.e., B[i] := B[i-1] for i > 1 and B[1] := A.

pop up B: all elements in the push down list B are popped up one level, i.e., B[i] := B[i+1] for i >= 1. The original top element in B is lost.
return control to the user: the user's program is
halted and the system is ready to receive a message from the input device.
return control to PM(ESL [1] ): go to execute the
pseudomachine code compiled for the statement
which is represented in the program structure as
an element pointed by ESL [1].
output trace information: when in the step mode,
the execution of each statement provides information on the execution result such as the value
of an evaluated expression, and displays the
next statement to be executed upon receiving
step E0M.
initiate block entry procedure: save all current machine addresses for the identifiers declared in this block and load their new local machine addresses into their reference entries. If the block is entered recursively, the saved addresses are kept in push down lists.
initiate block exit procedure: restore the machine
addresses for the identifiers declared in this
block to their values in the outer block which for
recursively entered blocks were the top elements
in their push down lists.
set up ESL in accordance with the designational expression: transfer out of a block is allowed, in which case the top elements in ESL are popped up until the pointer to the block in which the designated statement is a component appears as the top element in ESL; then the designated statement is pushed into ESL. Each time a pointer to a block is popped up from ESL, the block exit procedure is initiated.
set up actual parameters: save all current machine
addresses for the identifiers used as formal parameters in the procedure, into push down lists
when it is recursively called, and load the
machine addresses of words containing the actual parameters into the reference entries of
these formal parameters.
initiate procedure exit: restore the machine addresses for the identifiers used as formal parameters
in this procedure to their values in the block
that initiated this procedure call.
The following ALGOL-like statements describe the execution monitor, which offers the user extensive control over his program execution on line.
return from go-to statement:
if in step mode
then output trace information
else;
set up ESL in accordance with the designational
expression;
go to execute next statement;
return from block:
if in step mode
then output trace information
else;
initiate block entry procedure;
push down t(ESL [1]) into ESL;
go to execute next statement;
return from if-statement:
if in step mode
then output trace information
else;
if the Boolean expression is true
then push down ts (ESL [1]) into ESL
else push down fs(ESL [1]) into ESL;
go to execute next statement;
return from for-statement:
if in step mode
then output trace information
else;
if all elements in the for list are serviced
then ESL [1]: =f(ESL [1])
else push down ds(ESL [1]) into ESL;
go to execute next statement;
return from procedure call:
if in step mode
then output trace information
else;
set up the actual parameters;
push down pointer to the procedure body into
ESL;
go to execute next statement;
return from all other statements:
if in step mode
then output trace information
else;
ESL [1] : =f(ESL [1]);
go to execute next statement;
execute next statement:
if in execute mode and ESL [1] not equal to <stopping point>
then go to continue
else begin enter step mode;
return control to the user
end;
continue:
if ESL [1] is a program return PR
then begin enter step mode; return control to
the user
end
else
if ESL [1] is a block return BR
then begin initiate block exit procedure;
popup ESL;
ESL [1]: =f(ESL [1]);
go to execute next statement
end
else
if ESL [1] is an if return IR
then begin pop up ESL;
ESL [1]: =f(ESL [1]);
go to execute next statement
end
else
if ESL [1] is a for return FR
then begin pop up ESL;
return control to PM(ESL [1])
end
else
if ESL [1] is a procedure body return PBR
then begin initiate procedure exit;
pop up ESL;
return control to PM(ESL [1])
end
else
ESL [1] is a pointer to an element in the program structure:
return control to PM (ESL [1])
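The monitor's dispatch on ESL can be sketched for its two simplest cases, an ordinary statement advancing through its f-pointer and a block being entered and left. The Elem class, the run loop and the stubbed statement actions below are assumptions for illustration, not the paper's code:

```python
# Sketch of the execution monitor: ESL is a Python list used as the
# push down list; element kinds mirror the text's returns
# ('block', 'stmt', 'BR' for block return, 'PR' for program return).

class Elem:
    def __init__(self, kind, **pointers):
        self.kind = kind                 # 'block' | 'stmt' | 'BR' | 'PR'
        self.__dict__.update(pointers)   # t, f, action as applicable

def run(esl, out):
    while True:
        top = esl[-1]
        if top.kind == 'block':          # return from block:
            esl.append(top.t)            #   push down t(ESL[1]) into ESL
        elif top.kind == 'stmt':         # return from other statements:
            out.append(top.action)       #   (stands in for the PM code)
            esl[-1] = top.f              #   ESL[1] := f(ESL[1])
        elif top.kind == 'BR':           # block return:
            esl.pop()                    #   pop up ESL
            esl[-1] = esl[-1].f          #   ESL[1] := f(ESL[1])
        else:                            # 'PR': program return;
            return out                   #   halt, control to the user

# begin A := B; B := 0 end  as a two-statement block
pr = Elem('PR'); br = Elem('BR')
s2 = Elem('stmt', action='B := 0', f=br)
s1 = Elem('stmt', action='A := B', f=s2)
blk = Elem('block', t=s1, f=pr)
assert run([blk], []) == ['A := B', 'B := 0']
```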
The Statement-Oriented Program Structure Used in Time-Shared Multiprogramming and Its Compatibility with Dynamic Data Structures
The dynamic nature of on-line computing calls for a dynamic data structure as well as a dynamic program structure. Nonnumerical applications such as analytical expression manipulation on computers5 will increase in efficiency and effectiveness if they can be performed on-line. ALGOL can be easily extended to manipulate list-structure-like data.1 The use of a dynamic program structure is completely compatible with dynamic data structures. The same dynamic memory allocator will service all users' programs and data structures.
Figure 8 shows the configuration for the multiprogram time-shared system. Each user's activity in the system is represented by an I/O device, its program structure, symbol table, data structure and operation status, all properly linked under the user's pointer. Since storage allocation for program and data structures and the execution of the programs are all under the control of the multiprogram time-shared system, memory protection of users against each other is assured. The system consists of an incremental compiler, an execution monitor, an available storage block manager and a system monitor that coordinates the various phases of operation. Figure 9 shows the organization that incorporates the self-optimization technique of adapting a set of monitor system parameters in accordance with the operating environment. Such system parameters may, for instance, cause the monitor to operate in one of several possible modes. In multiprogram time-shared on-line computation, there is always the question of whether all users' programs and data should be retained in core, or whether only one should be in core with swapping between users. Our solution is to let the operating environment dictate the mode: if all users' programs and data can be comfortably accommodated in core, they will all remain in core; otherwise they will be divided into groups with swapping among groups. The actual rules used for adapting the monitor parameters are still subject to experimentation. Since the multiprogram time-shared system should be in core all the time, it should be constructed so that read-only memory can be used to store it.
CONCLUSION
A multiprogram time-shared system based on the concepts presented in this report has been under implementation as an experimental project at the California Institute of Technology. Invariably many of its details have been modified to suit the particular hardware, which consists of an IBM 7040 computer, a 7288 multiplexor and several Institute-developed typewriter consoles.
In conclusion, we believe that the use of incremental compilation, system-controlled execution, and dynamically structured programs and data can offer the users the power of programming in a high-level algorithmic language and the advantage of interacting with the machine by means of an on-line console. The time-sharing mode further makes such operation economically acceptable.
Figure 8. The multiprogram time-shared system configuration.
Figure 9. Organization of an adaptive system monitor for multiprogram time-shared on-line computations.
The differences of this system from other similar systems6 are the following:

1. Statements in our system are compiled incrementally into directly executable codes. System interpretation is called for only between statements.
2. Several users may be accommodated simultaneously in core memory.
3. It is easily extendable to cope with applications that call for dynamic data structures, such as algebraic expression manipulation.

The study reported in this paper also reflects a study of the computer organization for on-line time-sharing applications. Some applications are given below:
1. The incremental compilation achieved by
indirectly addressing all operands through
their reference entries suggests a small
very-high-speed memory, functioning much
like the index registers, to be used by all
identifiers' reference entries.
2. The dynamic nature of multiprogram on-line computation should have a strong influence on memory organization. Algorithms that try to maximize the utilization of computer memory without sacrificing computing speed and programming flexibility should be investigated for possible direct incorporation into hardware configurations. For the same reason that an arithmetic unit is used to perform arithmetic and data channels are used for input and output, special processors should be designed to allocate and relocate users' areas in memories, possibly in parallel with the main computation.
3. The central control unit in a computer used for multiprogramming should be responsible for scheduling various programs
to various processors. The organization of
the central control unit must also reflect
the nature of man-machine interactions
and the types of control statements in the
programming language.
4. The encoding of information, numeric or symbolic, into computer words should include type indication so that, for example, arithmetic operations performed on nonnumeric quantities can be detected as errors. This redundancy in information representation can be used to provide some error checking during execution as well as a simpler machine instruction set. For example, the same arithmetic instruction can be used for both floating point and fixed point numbers if the number representation indicates its type and that indication is decoded accordingly in the arithmetic unit.
REFERENCES
1. A. J. Perlis and R. Iturriaga, "An Extension to ALGOL for Manipulating Formulae," Communications of the ACM, vol. 7, pp. 127-130 (Feb. 1964).
2. P. Naur et al., "Revised Report on the Algorithmic Language ALGOL 60," Communications of the ACM, vol. 6, pp. 1-17 (Jan. 1963).
3. N. Wirth, "A Generalization of ALGOL," Communications of the ACM, vol. 6, pp. 547-554 (Sept. 1963).
4. J. McCarthy et al., LISP 1.5 Programmer's Manual, Massachusetts Institute of Technology Press, 1961.
5. J. E. Sammet and E. R. Bond, "Introduction to FORMAC," IEEE Transactions on Electronic Computers, vol. EC-13, pp. 386-394 (Aug. 1964).
6. "IBM 7040/7044 Remote Computing System," IBM System Reference Library file no. 7040-25, form C28-6800-0.
INTERACTIVE MACHINE LANGUAGE PROGRAMMING
Butler W. Lampson
University of California
Berkeley, California
*The work described in this paper was supported by the Advanced Research Projects Agency of the Department of Defense under contract SD-185.

INTRODUCTION

The problems of machine language programming, in the broad sense of coding in which it is possible to write each instruction out explicitly, have been curiously neglected in the literature.1,2 Granted that less than half of the binary instructions generated in the past year had their origin in assembly language, it remains likely that much more than half the instructions executed originated in this way. There are still many problems which must be coded in the hardware language of the computer on which they are to run, either because of stringent time and space requirements or because no suitable higher level language is available.

It is a sad fact, however, that a large number of these problems never run at all because of the inordinate amount of effort required to write and debug machine language programs. On those that are undertaken in spite of this obstacle, a great deal of time is wasted in struggles between programmer and computer which might be avoided if the proper systems were available. Some of the necessary components of these systems, both hardware and software, have been developed and intensively used at a few installations. To most programmers, however, they remain as unfamiliar as other tools, which are presented for the first time below.

In the former category fall the most important features of a good assembler: macro instructions implemented by character substitution, conditional assembly instructions, and reasonably free linking of independently assembled programs. The basic components of a debugging system are also known but are relatively unfamiliar.3 For these the essential prerequisite is an interactive environment, in which the power of the computer is available at a console for long periods of time. The batch processing mode in which large systems are operated today of course precludes interaction, but programs for small machines are normally debugged in this way, and as time-sharing becomes more widespread the interactive environment will become common.

It is clear that interactive debugging systems must have abilities very different from those of off-line systems. Large volumes of output are intolerable, so that dumps and traces are to be avoided at all costs. To take the place of dumps, selective examination and alteration of memory locations is provided. Traces give way to breakpoints, which cause control to return to the system at selected instructions. It is also essential to escape from the
switches-and-lights console debugging common on small machines without adequate software. To this end, type-in and type-out of information must be symbolic rather than octal where this is convenient. The goal, which can be very nearly achieved, is to make the symbolic representation of an instruction produced by the system identical to the original symbolic written by the user. The emphasis is on convenience to the user and rapidity of communication.
The combination of an assembler and a debugger
of this kind is a powerful one, which can reduce by a
factor of perhaps five the time required to write
and debug a machine language program. A full system for interactive machine language programming
(IMP), however, can do much more and, if properly designed, need not be more difficult to implement. The basic ideas behind this system are these:
1. Complete integration of the assembler and the debugging system, so that all input goes through the same processor. Much redundant coding is thus eliminated, together with one of two different languages serving the same purpose: to specify instructions in symbolic form. This concept requires that code be assembled directly into core (or into a core image on secondary storage). Relocatable output and relocatable loaders are thereby done away with.
A remark on terminology: It will be convenient in the sequel to speak of the
"assembler" and the "debugger" in the
IMP system. These terms should be understood in the light of the foregoing: different
parts of the same language are being referred to, rather than distinct languages.
2. Commands for editing the symbolic source program. The edit commands simultaneously modify the binary program in core and the symbolic on secondary storage. Corrections made during debugging are thus automatically incorporated into the symbolic, and the labor of keeping the latter current is almost eliminated.
3. A powerful string-handling capability in the assembler which makes it quite easy to write macros for compiling algebraic expressions, to take a popular example which can be handled in a few other systems, but rather clumsily. The point is not that one wants to write such macros, but that in particular applications one may want macros of a similar degree of complexity.
These matters are discussed in more detail in the
following. We consider the assembler first and then
the debugger, since the command language of the
latter makes heavy use of the assembler's features.
Before beginning the discussion, it may be well
to describe briefly the machine on which this system is implemented. It is a Scientific Data Systems
930, a 2-microsecond, single-address computer
with indirect addressing and one index register. Our
system includes a drum which is large enough to
hold for each user all the symbolic for a program
being debugged, together with the system, a core
image of the program and some tables. Backup storage of at least this size is essential for the editing
features of the IMP system. The rest of the system
could be implemented after a fashion with tapes.
THE ASSEMBLER
The input format of the IMP assembler is a rather unusual one. Originated on the TX-0 at MIT, it has been adopted by DEC for most of its machines, but is unknown or unpopular elsewhere in the industry. Although it looks strange at first, it has substantial advantages in terms of simplicity, both for the user and for the system. The latter is a non-negligible consideration, often equally ignored and overemphasized.
The basic idea is that the assembler processes
each line of input as an expression (unless it is a
directive or macro call). The expression is evaluated
and put into core at the word addressed by the location counter, and the location counter is advanced by
1. Expressions are made up of operands (which may be symbols, constants - numeric or alphanumeric - and parenthesized subexpressions) and operators. Available operators are: +, -, ×, /, ∧, ∨, ¬, with their usual meanings and precedence; =, <, and >, which are binary operators with precedence less than +, and which yield 0 or 1 depending on whether the indicated relation holds between the operands; and ≠, a unary operator with lowest precedence which causes its operand to be taken as a literal, i.e., it is assigned a storage location, which is the same as the location assigned to other literals with the same value, and the address of this location is the value of the literal. Blanks have the following significance: any string of blanks not at the beginning or end of an expression is taken as a single plus sign.
It is not immediately clear how instructions are conveniently written as expressions, and in fact the scheme used depends on the fact that the object machine is a single-address, word-oriented computer with a reasonable number of modifiers in a single instruction. It would work on the PDP-6, but not on Stretch.
The idea is simple: all operation code mnemonics are predefined symbols with values equal to the
octal encodings of the instructions. On the SDS
930, for instance, LDA (load A) is defined as
7600000 (all numbers are in octal). The expression
LDA + 200 then evaluates to 7600200. When the
convention about spaces is invoked, the expression
LDA 200
evaluates to the same thing, which is just the instruction we expect from this symbolic line in a
conventional assembler.
Modifiers are handled in the same spirit. In the
24-bit word of the 930 there is an index bit,
which is the second from the left, and an indirect
bit, which is the tenth. With the predefined symbols
I = 40000
X = 20000000
the expression LDA I 200 X evaluates to
27640200. In more conventional form it would
look like this:
LDA * 200,2
There is little to choose between them for brevity
or clarity. Note that the order of the terms in the
expression is arbitrary.
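The expression convention is easy to check mechanically. A sketch, using the octal values the text gives for the SDS 930 (the evaluator handles only the sum-of-terms case, not full expressions):

```python
# Sketch: mnemonics and modifier bits are predefined symbols, blanks
# act as plus signs, and an instruction is simply the value of the
# expression; term order is therefore arbitrary.

SYMBOLS = {'LDA': 0o7600000,    # load A (value given in the text)
           'I':   0o40000,      # index bit
           'X':   0o20000000}   # indirect bit

def assemble(line):
    # Each blank-separated token is a symbol or an octal constant;
    # the blanks between them are taken as plus signs.
    total = 0
    for tok in line.split():
        total += SYMBOLS[tok] if tok in SYMBOLS else int(tok, 8)
    return total

assert assemble('LDA 200') == 0o7600200
assert assemble('LDA I 200 X') == 0o27640200
assert assemble('X 200 I LDA') == assemble('LDA I 200 X')  # order-free
```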
The greatest advantages of the uniform use of
expressions accrue to the assembler, but the programmer gains a good deal of flexibility. Examples
will readily occur to the reader.
Using this convention the implementation of the
basic assembler is very simple. Essentially all that
is required is an expression analyzer and evaluator,
which will not run to more than three or four
hundred instructions on any machine. Because all
assembly is into core, there is no such thing as relocatability.
Two rather conventional methods are provided
for defining symbols. A symbol appearing at the
left edge of a line is defined as the current value of
the location counter. Such a symbol may not be redefined. In addition, a line such as
SYM ← 4600

defines SYM. Any earlier definition is simply overridden. The right side may of course be any expression which can be evaluated.

The special symbol . refers to the location counter. It may appear on the left of a ← sign. Thus, the line

A    . ← . 40

is equivalent to

A    BSS 40

in a conventional assembler.
There remains one point about the basic assembler which is crucially important to the implementation: the treatment of undefined symbols. When an expression is encountered during assembly, there is no guarantee that it can be evaluated, since all the symbols in it may not be defined. This is the reason why most assemblers are two pass; the first pass serves to define the symbols. Because the IMP assembler must accept typewriter input, it cannot be two pass and must therefore keep track of undefined expressions explicitly.
There is a general way of doing this, in which
the undefined expression, translated for convenience
into reverse Polish, is added to a list of such
expressions, together with the address of the word it
is to occupy. At suitable intervals this list is
scanned and all the newly defined expressions are
evaluated and inserted in the proper locations. For
complex expressions there is no avoiding some such
mechanism, and it has the advantage of simplicity.
It is, however, wasteful of storage and also of time, since an expression may be examined many times while it is on the list before it can be evaluated.
One important special case can be treated much
more efficiently, and this is the case of an instruction with an undefined address, which includes at
least 90 percent of the occurrences of undefined
expressions.
For example, when the assembler sees this code:

X    BRU A        branch unconditional
     LDA B
A    STA C
the instruction at X has an undefined address which becomes defined when the label A is encountered. This situation can be kept track of by putting in the symbol table entry for A the location of the first word containing A as an address. In the address field of this word we put the location of the second such word, and so build a list through all the words containing the undefined symbol A as an address. The list is terminated by filling the address field with ones. When the symbol is defined we simply run
down the chain and fill in the proper value. This
scheme will work as long as the address field contains only A, since there is then no other information which must be preserved. Note that no storage
is wasted and that when A is defined the correct
address can be filled in very quickly.
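The fixup chain can be sketched as follows. The 14-bit address field fits the 930's 24-bit word, but the field width and the BRU opcode value here are assumptions for illustration:

```python
# Sketch of the single-pass fixup chain: each word whose address field
# holds an undefined symbol stores, in that field, the location of the
# previous such word; defining the symbol walks and patches the chain.

ADDR_MASK = 0o37777    # assumed 14-bit address field
END = ADDR_MASK        # all ones terminates the chain

memory = {}            # location -> assembled word
chain_head = {}        # undefined symbol -> location of latest reference

def emit(loc, opcode, symbol):
    # Link this word into the symbol's chain via its address field.
    prev = chain_head.get(symbol, END)
    memory[loc] = opcode | prev
    chain_head[symbol] = loc

def define(symbol, value):
    # Run down the chain, filling in the now-known address.
    loc = chain_head.pop(symbol, END)
    while loc != END:
        nxt = memory[loc] & ADDR_MASK
        memory[loc] = (memory[loc] & ~ADDR_MASK) | value
        loc = nxt

BRU = 0o0100000        # hypothetical opcode value, for illustration only
emit(0o100, BRU, 'A')  # X  BRU A  -- A not yet defined
emit(0o101, BRU, 'A')  # another reference to A
define('A', 0o205)     # the label A is encountered
assert memory[0o100] & ADDR_MASK == 0o205
assert memory[0o101] & ADDR_MASK == 0o205
```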
The description of the basic assembler is now
complete, except for a few nonessential details, and
we turn to the macro facility. Macros are handled
in a standard manner, which the following example
should sufficiently illustrate:
STORE	MACRO ARG1,ARG2
	IRP TEMP=ARG2		indefinite repeat
	ST'ARG1 TEMP
	ENDR
	ENDM STORE
called with
STORE A,(S1,S2,S3)
becomes after argument substitution
	IRP TEMP=S1,S2,S3
	STA TEMP
	ENDR
That is, this string of characters is seen by the assembler as though it were in the symbolic input.
A macro may be defined with more arguments
than it is called with, in which case the extra arguments are made either null strings or generated
symbols. No more arguments are collected than are
called for by the definition. An argument is normally collected literally, character by character, but a
colon appearing before a macro name will cause it
to be expanded. To provide additional flexibility,
two directives called STACK and UNSTACK are
provided which respectively suspend the analysis of
the current expression and resume it.
Some unusual things may be done with this much
machinery. Consider the macro
LIT	MACRO ARG,GEN
	STACK
	TEMP←.
	.←LITERALS
GEN	ARG
	LITERALS←LITERALS+1
	.←TEMP
	UNSTACK
	GEN
	ENDM LIT
Called with
LDA LIT 20
it will assign a storage location, say LITERALS + 10,
put 20 in it, and assemble
LDA LITERALS + 10
There are many other ways of writing this macro
using the list features discussed below.
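The location-counter juggling inside LIT can be modeled directly. The sketch below is an illustrative reconstruction (the class and names are invented, not the IMP implementation): the location counter is saved, switched to the literal pool, the literal planted, and the counter restored, with the pool address returned to the caller.

```python
# Sketch of the literal-pool trick in the LIT macro.
class Assembler:
    def __init__(self, pool_base):
        self.dot = 0                  # "."      : the location counter
        self.pool = pool_base         # LITERALS : next free word in the pool
        self.memory = {}

    def word(self, value):
        """Assemble one word at the current location counter."""
        self.memory[self.dot] = value
        self.dot += 1

    def literal(self, value):
        saved = self.dot              # TEMP <- .
        self.dot = self.pool          # .    <- LITERALS
        addr = self.dot
        self.word(value)              # GEN  ARG : plant the literal in the pool
        self.pool = self.pool + 1     # LITERALS <- LITERALS+1
        self.dot = saved              # .    <- TEMP
        return addr                   # GEN : the pool address for the caller

a = Assembler(pool_base=0o1000)
addr = a.literal(20)                  # LDA LIT 20 ...
a.word(('LDA', addr))                 # ... assembles as LDA LITERALS+0
```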
The IRP operation used above is not new, but it
is not well known. It causes the lines in its range,
which is delimited by a matching ENDR, to be
processed repeatedly by the assembler. Each time
around the argument, TEMP in this case, is replaced by one of the subarguments, which are the
character strings following the = sign and separated
by commas. The entire process is rather similar to a
macro call, and subarguments are processed according to the rules for macro arguments, except that
parentheses are not removed. Thus the IRP generated by the expansion of the macro discussed above
in turn expands into
	STA S1		TEMP replaced by S1
	STA S2		TEMP replaced by S2
	STA S3		TEMP replaced by S3
Two extensions of this device:
	IRP A,B=A1,B1,A2,B2,A3,B3 $ C=C1,C2
	A,B,C
	ENDR
expands into
	A1,B1,C1
	A2,B2,C2
	A3,B3,C2
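A sketch of the IRP semantics just illustrated, assuming (as the example suggests) that a $-group whose subarguments run out simply repeats its last values. The function and names are invented for illustration.

```python
# Sketch of IRP: each $-separated group pairs a tuple of dummy names with a
# flat list of subarguments, consumed as many per iteration as there are
# dummies; a shorter group repeats its last values once exhausted.
def irp(groups, body_lines):
    # groups: list of (dummy_names, subargs); body_lines: lines of the range
    counts = [len(subs) // len(names) for names, subs in groups]
    out = []
    for i in range(max(counts)):
        subst = {}
        for (names, subs), n in zip(groups, counts):
            j = min(i, n - 1)                    # exhausted group repeats last values
            for k, name in enumerate(names):
                subst[name] = subs[j * len(names) + k]
        for line in body_lines:
            for name, val in subst.items():      # textual substitution, like a macro
                line = line.replace(name, val)
            out.append(line)
    return out

lines = irp([(['A', 'B'], ['A1', 'B1', 'A2', 'B2', 'A3', 'B3']),
             (['C'], ['C1', 'C2'])],
            ['A,B,C'])
# yields ['A1,B1,C1', 'A2,B2,C2', 'A3,B3,C2']
```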
We illustrate with another macro definition:
MOVE	MACRO ARG
	IRP TEMPA,TEMPB=ARG
	IRP TEMPC=:TEMPA $ TEMPD=:TEMPB
	LDA TEMPC
	STORE TEMPD
	ENDR
	ENDR
	ENDM
Called by
MOVE (A1,(B1,C1),A2,B2)
this expands into
	LDA A1
	STA B1
	STA C1
	LDA A2
	STA B2
Suppose that we have some two-word data structures which we wish to manipulate. We can define
each of them as a macro, using another macro to
do the definition and reserve storage:
TW	MACRO ARG,GENA,GENB
GENA	0
GENB	0
ARG	MACRO
	GENA,GENB
	ENDM ARG
	ENDM TW
Now, if we call TW:
TW A
TW B
we can then use the newly defined macros A and B
in the move macro. In fact
MOVE (A,B)
after character substitution both in the macro body
and in the first IRP body is
	IRP TEMPA,TEMPB=A,B
	IRP TEMPC=.G0001,.G0002 $ TEMPD=.G0003,.G0004
which expands to
	LDA .G0001
	STA .G0003
	LDA .G0002
	STA .G0004
There are two other repeat directives,
RPT expression
ENDR
which repeats its scope the number of times specified
by the expression, and
CRPT expression
ENDR
which repeats its scope, reevaluating the expression
each time until it is ≤ 0.
Finally, there is a conditional directive:
IF expression
ELSF expression
ENDF
which causes the lines following the first IF or ELSF
whose argument is > 0 to be assembled and everything else to be ignored.
The implementation of all this is quite straightforward, and very similar for macros and repeats.
The body of the macro definition or repeat is collected as a character string, with markers replacing
the arguments, and saved away. Each time it is
called, the routine which delivers characters to the
assembler, which we will call CHAR, is switched
from the input medium to the saved string. Arguments for a macro call or IRP are likewise saved as
477
strings. The characters coming from a definition are
monitored for the argument marker, and if it is
found CHAR is switched again, to the argument
string. Whenever any of these strings ends, CHAR
is switched back to the string it was working on before.
All this machinery is of course recursive. The
only restriction is that macro definitions and repeats must be properly nested. Note that because of
the implementation technique just described a macro definition may contain anything, including other definitions. The other definitions of course are
not made until the macro is called.
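The CHAR switching just described behaves like a stack of character sources. The sketch below is a hypothetical reconstruction (the marker convention and names are invented): macro bodies and argument strings are pushed as they are entered and popped as they are exhausted.

```python
# Sketch of the CHAR routine: a stack of active character sources.
# When a macro body or argument string is entered it is pushed; when a
# source ends, CHAR switches back to the one it was working on before.
MARKER = '\x01'                  # stands for the argument marker in a saved body

class CharStream:
    def __init__(self, text, args=None):
        self.sources = [(iter(text), args or [])]

    def char(self):
        while self.sources:
            it, args = self.sources[-1]
            c = next(it, None)
            if c is None:                 # source exhausted: switch back
                self.sources.pop()
            elif c == MARKER:             # argument marker: switch to the argument
                n = int(next(it))         # marker is followed by the argument number
                self.sources.append((iter(args[n]), []))
            else:
                return c
        return None                       # everything exhausted

# body of  STORE MACRO ARG1,ARG2  with the arguments replaced by markers:
body = 'ST' + MARKER + '0 ' + MARKER + '1'
s = CharStream(body, args=['A', 'C'])
out = ''
while (c := s.char()) is not None:
    out += c
```

Because the switching is a stack, the machinery is naturally recursive: a macro body may itself contain markers, definitions, and further calls.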
The most novel feature of this assembler is the
string or list-manipulating features available to the
programmer, which allow him to define macros
to perform functions normally regarded as the prerogative of a compiler. A list may be assigned to a
symbol as its value by
	SYM←[any string not containing an unbalanced right bracket]
The string is saved literally as the value of SYM,
with one exception: the character : causes the following symbol to be expanded if it is a macro or
list name, just as it does in macro arguments. The
structure :[string] is equivalent to the string alone. The
Once SYM has been equated to a list, any use of
it is exactly equivalent to writing the contents of
the string, including the brackets. Exception: If
SYM appears within brackets or as a macro argument and is not preceded by : it is transmitted literally. In most contexts a string enclosed in brackets has the same effect as one not so enclosed.
Thus the sequence
	SYMA←[A]
	SYMB←[B]
	SYMC←[:SYMA:SYMB',CD]
will leave SYMC with the value [AB,CD]. The '
has the same function as it does in a macro definition.
A symbol equated to a list (or, as always, the
explicit list itself) may be subscripted in two ways.
In the above example, SYMC [2] is equivalent to
B (i.e., a subscript in brackets selects a single character). More generally, SYMC [2,5] is equivalent to
B,CD.
The other form of subscripting selects a segment
of a list delimited by commas: SYMC(1) is the
same as AB.
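The two subscript forms can be sketched directly; the function names below are invented, and 1-origin indexing is assumed from the examples.

```python
# Sketch of the two subscript forms on a list value such as SYMC = [AB,CD]:
# square brackets select characters, parentheses select comma-delimited segments.
def char_subscript(s, i, j=None):
    """SYM[i] or SYM[i,j]: characters i..j of the string, 1-origin."""
    j = i if j is None else j
    return s[i - 1:j]

def segment_subscript(s, n):
    """SYM(n): the n-th comma-delimited segment, 1-origin."""
    return s.split(',')[n - 1]

symc = 'AB,CD'
assert char_subscript(symc, 2) == 'B'        # SYMC[2]
assert char_subscript(symc, 2, 5) == 'B,CD'  # SYMC[2,5]
assert segment_subscript(symc, 1) == 'AB'    # SYMC(1)
```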
To illustrate the use of these features we consider the following macro to compile an expression
with the operators + and -, single character variables and parenthesization:
ARITH	MACRO EXPR
	ARG←EXPR
	SB←0
	OP←0
	L←LENGTH(ARG)
	ARITH1
	ENDM ARITH

ARITH1	MACRO OPA,OPB,OP
	OP←0
	OPA←0
	CRPT SB<L-1 ∨ OP=-1
	SB←SB+1
	C←":ARG[SB]"
	IF OP=0			this branch if operator not yet found
	IF C="("
	ARITH1
	OPA←1
	ELSF C="-"
	OP←2
	ELSF C="+"
	OP←1
	ELSF C=")"
	OP←-1
	ELSF 1
	OP←-2
	OPB←[:ARG[SB]]
	ENDF
	ELSF C="("		this branch if operator found
	IF OPA=1		this branch if second operand is ( )
	TIDX←TIDX+1
	STA T'NUM(TIDX)
	OPB←[:T'NUM(TIDX)]
	ARITH1
	TIDX←TIDX-1
	ELSF 1
	ARITH1
	ENDF
	IF OP=2
	CNA			complement A register
	ENDF
	ADD OPB
	ELSF 1
	IF OPA=2
	LDA OPB
	ENDF
	IF OP=1
	ADD ARG[SB]
	ELSF OP=2
	SUB ARG[SB]
	ENDF
	OPA←1
	ENDF
	ENDR
	ENDM ARITH1

NOTES
LENGTH is a function which gives the length of
the list which is its argument.
NUM evaluates its argument and replaces itself
with the decimal encoding of the value. It is useful
for constructing a series of symbols over which one
wants considerable control.
Double quotes enclosing a string turn it into an
alphanumeric constant.
This macro, called by
ARITH [(A+B)-(C-D)]
would generate
	LDA A
	ADD B
	STA T1
	LDA C
	SUB D
	CNA
	ADD T1
Note that there are only six lines in the definition
which actually generate code.
With this example we conclude our discussion of
the assembler. The implementation of lists is quite
straightforward, though a certain amount of care
must be taken about the treatment of colons calling
for expansion. A few minor points have been
glossed over.
THE DEBUGGING SYSTEM
A good interactive debugging system may be
difficult for the beginner to master. Its emphasis
must be on completeness, convenience and conciseness, not on simplicity. The basic capabilities required are quite simple in the main, but the form is
all important because each command will be given
so many times.
One essential, completely symbolic input and
output, is half taken care of by the assembler. The
other half is easier than it might seem: given a
word to be printed in symbolic form, the symbol
table is scanned for an exact match on the opcode
bits. If no match is found, the word is printed as a
number. Otherwise the opcode mnemonic is printed, indirect and index bits are checked and the
proper symbols printed, and the table is scanned for
the largest symbol not greater than the remainder of
the word. This symbol is printed out, followed if
necessary by a + and a constant.
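The symbolic type-out procedure can be sketched as follows. This is a hypothetical model of the scan described above; the word layout, opcode table, and symbol values are invented for illustration.

```python
# Sketch of symbolic type-out: exact match on the opcode bits, then the
# largest symbol not greater than the remainder of the word.
# Hypothetical word layout: opcode field, indirect bit, 15-bit address field.
OPCODES = {0o01: 'BRU', 0o02: 'LDA', 0o03: 'STA'}   # mnemonic table
SYMBOLS = {'NUTS': 0o500, 'START': 0o100}           # symbol table: name -> value

def typeout(word):
    opcode = word >> 16
    if opcode not in OPCODES:
        return '%o' % word                          # no match: print as a number
    indirect = (word >> 15) & 1
    addr = word & 0o77777
    # largest symbol not greater than the remainder of the word
    best = max(((v, n) for n, v in SYMBOLS.items() if v <= addr),
               default=(0, '0'))
    text = OPCODES[opcode] + (' I' if indirect else '') + ' ' + best[1]
    if addr > best[0]:
        text += '+%o' % (addr - best[0])            # followed by a + and a constant
    return text

word = (0o02 << 16) | (1 << 15) | 0o502             # LDA indirect, address NUTS+2
# typeout(word) == 'LDA I NUTS+2'
```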
The most fundamental commands are single characters, possibly preceded by modifiers. Thus to examine a register the user types
/x1-3;	LDA I NUTS+2
where the system's response is printed in capitals.
This command may be preceded by any combination
of modifiers:
C	for printout in constant form
S	for printout in symbolic form
O	for octal radix
D	for decimal radix
R	for relative (symbolic) address
A	for absolute address
H	for printout as ASCII characters
I	for printout as signed integer
The modifiers hold until the user types a carriage
return.
For examining a sequence of registers, the commands ↑ and ↓ are available. The former examines
the preceding register, the latter the following register. In the absence of a carriage return the modifiers
of the last examination hold. The ← command
examines the register addressed by the one last
examined.
The contents of a register may be modified after
examination simply by typing the desired new contents. Note that the assembler is always part of the
command processor, and that debugging commands
are differentiated by their format from words to be
assembled. (This is not difficult, because the only
thing which may occur at the beginning of a line of
assembler code is a label.) Furthermore, debugging
commands may occur in macros, so that very elaborate operations can be constructed and then called
on with the two or three characters of a macro
name.
To increase the flexibility of debugging macros,
a unary contents-of operator is defined: applied to SYM+3, its value
is the contents of location SYM+3. With this operator macros may be defined to type out words depending on very complicated conditions. A simple
example is
TO	MACRO A,B
	TEMP←A
	STOP←1
	CRPT STOP
	IF TEMP>B
	STOP←0
	ELSF 1
	TEMP←TEMP+1
	ENDF
	ENDR
	/TEMP;
	ENDM TO
called with
TO 100,20
it will type out the first location after 100 with contents greater than 20.
Another important command causes an expression to be typed in a specified format. Thus if SYM
has the value 1253 then
=sym;
1253
would be the result of giving the = command. All
the modifiers are available but the normal mode of
type-out is constant rather than symbolic. If no
expression is given, the one most recently typed
is taken. Thus, after the above command, the user
might try
s=;	SYM
For convenience, ~ abbreviates S = .
It is often necessary to search storage for occurrences of a particular word. This may be done with
a macro, as indicated above, but long searches
would be quite slow. A faster search can be made
with
$expression;
which causes all the locations matching the specified expression to be typed out. The match may be
masked, and the bounds of the search are adjustable. This command takes all the type-out modifiers as well as
E
which searches for a specified effective address (including indexing and indirect addressing) and
N
which searches for all words which do not match.
For additional flexibility the user may specify a
macro which will be executed each time a matching
word is found.
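A masked search of the kind just described can be sketched as follows; this is an illustrative model with invented names, not the IMP code.

```python
# Sketch of the masked storage search: type out every location whose
# masked contents match the masked expression; the N modifier inverts
# the sense of the match.
def search(memory, lo, hi, value, mask=~0, invert=False):
    """memory: list of words; returns matching (location, word) pairs."""
    hits = []
    for loc in range(lo, hi + 1):                  # adjustable search bounds
        match = (memory[loc] & mask) == (value & mask)
        if match != invert:                        # invert=True models the N modifier
            hits.append((loc, memory[loc]))
    return hits

mem = [0, 0o3777, 0o2777, 0o3777, 0o123]
# match on the low 9 bits only
hits = search(mem, 0, 4, 0o777, mask=0o777)
```

A per-match macro, as the text mentions, would simply be a callable invoked for each pair appended to the result.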
In addition to being able to examine and modify
his program, the user also needs to be able to run
it. To this end he may start it at a specified location with
;G location
If he wishes to monitor its progress, he may insert
480
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
breakpoints at certain locations with the command
;B location
This causes execution of the program to be interrupted at the specified location. Control returns to
the system, which types some useful information
and awaits further commands. An alternate form of
this command is
;B location, macro name
which causes the specified macro to be executed at
each break, instead of returning control directly to
the typewriter. Very powerful conditional tracing
may be done in this way.
After a break has occurred, execution of the program may be resumed with the ;P command. The
breakpoint is not affected. To prevent another
break until the breakpoint has been passed n times
the form n;P may be used.
To trace execution instruction by instruction the
command ;N may be used instead of ;P. It allows
one instruction to be executed and then breaks
again. n;N allows n instructions to be executed before breaking. A fully automatic trace has been deliberately omitted, but presents no difficulties in
principle.
There remains one feature of great importance in
the IMP system, the symbolic editor. The debugger
provides facilities, which have already been described, for modifying the contents of core. These
modifications, however, are not recorded in the
symbolic version of the program. To permit this to
be done, so that reloading will result in a correctly
updated binary program, several commands are
available which act both on the assembler binary
and on the symbolic.
This operation is not as straightforward as it
might appear, since there is no one-to-one correspondence between lines of symbolic and words of
binary. Addresses given to the debugger of course
refer to core locations, but for editing it is more
convenient to address lines of symbolic. To permit
proper correlation of these line references with the
binary program, a copy of the symbolic file is made
during loading with the address of the first and last
assembled words explicitly appended to each line.
Since the program is not moved around during editing, these numbers do not change except locally.
When a debugging session is complete, the edited
symbolic is rewritten without this information.
We illustrate this with an example. Consider the
symbolic and resulting binary

S1	MOVE A,B	(200,201)
	ADD C		(202,202)
	STORE D,E	(203,204)
S2	BRU S1		(205,205)

S1	LDA A		200
	STA B		201
	ADD C		202
	STA D		203
	STA E		204
S2	BRU S1		205

and the editing command

	S2-1;I			insert before line S2-1
	SUB F

which gives rise to the following:

S1	MOVE A,B	(200,201)
	ADD C		(202,202)
	SUB F		(1513,1513)
	STORE D,E	(1514,204)
S2	BRU S1		(205,205)

S1	LDA A		200
	STA B		201
	BRU .END	202
	BRU .END+1	203
	STA E		204
S2	BRU S1		205
.END	ADD C		1512
	SUB F		1513
	STA D		1514
	BRU S1+4	1515
	BRU S1+5	1516
All the BRU (branch unconditional) instructions
are inserted to guarantee that the right thing happens if any of the instructions causes a skip. Multiple skips, or subroutine calls which pick up arguments from subsequent locations, are not handled
correctly. The alternative to this rather simpleminded scheme appears to be complete reassembly,
which has been rejected as too slow. The arrangement outlined will deal correctly with patches made
over other patches; although the binary may come
to look rather peculiar, the symbolic will always be
readable.
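The insertion case of the patching scheme can be modeled schematically. This sketch (invented names, instructions represented as strings) is not the IMP editor; it reproduces the pattern of the example above: the two words around the insertion point become branches into the patch area, and two branches at the end of the patch serve the normal return and the single-skip case.

```python
# Sketch of the patching scheme: to insert new instructions between words
# p and p+1, words p and p+1 are replaced by branches into a patch area
# holding the two displaced instructions around the insertion, plus
# branches back, so that a single skip still lands on the right instruction.
def insert_patch(memory, p, new_code, patch_base):
    """memory: dict location -> instruction; returns the patch-area dict."""
    patch = {}
    loc = patch_base
    for instr in [memory[p]] + new_code + [memory[p + 1]]:
        patch[loc] = instr                # displaced word, insertion, displaced word
        loc += 1
    patch[loc] = 'BRU %o' % (p + 2)       # normal return to the word after p+1
    patch[loc + 1] = 'BRU %o' % (p + 3)   # target of a skip of the last displaced word
    memory[p] = 'BRU %o' % patch_base
    memory[p + 1] = 'BRU %o' % (patch_base + 1)   # target of a skip of word p-1
    return patch

mem = {0o200: 'LDA A', 0o201: 'STA B', 0o202: 'ADD C',
       0o203: 'STA D', 0o204: 'STA E', 0o205: 'BRU 200'}
patch = insert_patch(mem, 0o202, ['SUB F'], 0o1512)
```

Tracing either the fall-through path or a skip from any word reproduces the word-for-word layout of the example.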
To give the user access to the readable symbolic
the command
;S symbolic line address [,symbolic line address]
(where the contents of the brackets is optionally
included) causes the specified block of lines to be
printed. Two other edit commands are available:
;D symbolic line address [,symbolic line address]
which deletes the specified block of lines, and
;C same arguments
which deletes and then inserts the text which follows. Deleting S1+1 would result in binary as follows:

S1	LDA A
	BRU .END
	BRU .END+1
	STA D
	STA E
S2	BRU S1
.END	STA B
	BRU S1+3
	BRU S1+4
The implementation of these commands is quite
straightforward. One entire edit command is collected and the new text, if any, is assembled. Then
the changed core addresses are computed and the
appropriate record of the symbolic file rewritten.
The scheme has two drawbacks: it does not work
properly for skips of more than one instruction or
for subroutine calls which pick up arguments from
following locations, and it leaves core in a rather
confusing state, especially after several patches have
been made at the same location. The first difficulty
can be avoided by changing large enough segments
of the symbolic. The second can be alleviated by
reassembly whenever things get too unreadable.
The only other published approach to the problem of patching binary programs automatically is
that of Evans,4 who keeps relocation information
and relocates the entire program after each change.
This procedure is not very fast, and in any event is
not practical for a system with no relocation.
EFFICIENCY

The IMP system depends for its viability on fast
assembly. The implementation techniques discussed
in this paper have permitted the first version of the
assembler to attain the unremarkable but satisfactory speed of 200 lines per second. Simple character-handling hardware will be installed shortly on our
930; it is expected to double assembly speed on
simple assemblies and to produce even greater improvement on programs with many macros and repeats.

Using the latter figures, we deduce that a program of 10,000 instructions, a large one by most
standards, will load in 25 seconds. This number indicates that the cost of the IMP approach is not at
all unreasonable; far more computer time, including
overhead, is likely to be spent in the debugging operations which follow this load. When only minor
changes are made, it is, of course, possible to save
the binary core image and thus avoid reloading.

In spite of the speed of the assembler, it is possible that a relocatable loader might be a desirable adjunct to the system. There are no basic reasons why
it should not be included.

As to the size of the system, the assembler is
about 2,500 instructions, the debugger and editor
about 2,000.

ACKNOWLEDGMENTS

The ideas in this paper owe a great deal to many
stimulating conversations between the author and
L. Peter Deutsch of the University of California.

REFERENCES

1. G. Mealy, "Anatomy of an Assembly System," RAND Corporation (Dec. 1962).
2. The MIDAS Assembly Program, MIT, Cambridge, Mass.
3. A. Kotok, "DEC Debugging Tape," Memo MIT-1 (rev.), MIT, Cambridge, Mass. (Dec. 11, 1961).
4. T. G. Evans and D. L. Darley, "DEBUG - An Extension to Current On-line Debugging Techniques," Comm. ACM, vol. 8, p. 321 (May 1965).
RESPONSIVE TIME-SHARED COMPUTING IN BUSINESS
ITS SIGNIFICANCE AND IMPLICATIONS
Charles W. Adams, President
Charles W. Adams Associates, Inc.
and KEYDATA Corporation,
Boston, Massachusetts
SIGNIFICANCE
Of the many thousands of businesses in the United States today, it is probable that no two use identical office and accounting procedures. Yet in general it would be safe to say that in any business incoming data are processed, with reference to a file,
using established procedures, to yield six broad
types of results:

1. Updated file records.
2. Operational documents, such as invoices, purchase orders, pay checks, and the like.
3. Exception notices, status reports, and responses to inquiries regarding the standing of such items as accounts receivable, inventory or personnel records.
4. Historical documentation required by custom, law, auditors, tax officials or boards of directors, whether in the form of printed reports, microfilm or magnetic tape.
5. Reports required by management in addition to, or preferably in place of, the historical documentation mentioned above, consisting primarily of a listing of situations which vary substantially from established norms (that is, exception reports intended for executive policy-making as opposed to those needed by operating personnel).
6. Analytical results, such as sales forecasts and answers (obtained through simulation) to the question "What would happen if ... ?" so frequently asked by management and important to effective policy decisions.
Each of these types of results involves different
volumes of information and requirements of frequency and currency. Yet all ensue from the processing of data according to established procedures,
and all require access to essentially the same file of
information on the history, status, and organizational objectives of the company with which they
are concerned.
The purpose of this paper is to discuss the significance of responsive time-shared use of electronic
data processing equipment for processing business
data to produce the results mentioned above. As
used here, responsive time-sharing refers to a
multiprogrammed system intended to provide commercial services to many tens or hundreds of remotely located users. In this system information is
introduced through keyboards or other manually
operated media. The input data are received and
collected into "messages" either by satellite computers or other message buffering devices, or by the
central processor or processors. The input is processed promptly upon receipt and the results appear
at the remote console quickly enough to influence
the subsequent actions of the operator. No paper
tape or punched cards are involved.
Such systems tend to entail two added expenses
when compared with more conventional, nonresponsive time-sharing on a minute-by-minute
or hour-by-hour basis: manual data entry over a
telephone line makes inefficient use of that line,
and a certain amount of "overhead" is inherent in
time-shared processing on a moment-by-moment basis. Both of these costs can be and are being
reduced, the first by data compression and traffic
concentration, and the second by improvements in
the design of hardware and software. But even with
existing techniques, responsive time-sharing can
demonstrably increase the operating effectiveness of
many types of businesses through improved currency, cost reduction, and control in the processing of
their data. Each of these is discussed below.
Currency
The urgent need for current or timely information was of course the spur which led to the
development of the early responsive systems for
airline reservation handling, stock market transaction processing, and the like. Currency on a daily,
weekly or monthly basis is also essential to top-level
management, but it is usually obtainable (though
not always obtained) by any data processing system, whether responsive, conventional, or even
manual. On the other hand, inventory management,
credit checking, production control, cost control
and the like are problems encountered to some degree by virtually every business. In these a responsive system usually has a substantial advantage even
when the requirement is not urgent enough in its
own right to warrant much added expense.
Cost Reduction
The potential for cost reduction from the use of
responsive computing stems from a variety of
sources. An obvious one is the fact that electronic
data processing, whether conventional or respon-
sive, offers to the larger organization inherent economies that have not been fully realized by smaller
companies because computers are, by their very
nature, mass-production devices, and despite significant strides in this direction they have not been
available in sizes small enough to meet the needs of
many businesses. While the smaller business has
had the alternative of time-sharing a larger computer system through conventional service bureaus,
the practical or psychological disadvantages of taking business documents (or cards punched from
them) to the local service bureau have prevented
widespread use of this medium.
Responsive time-sharing can give even the
quite small user the effect of having a full-fledged
computer of his own through a terminal device located on his premises and connected by a phone
line to a responsive data processing system. This
advantage of bringing to the user (and charging
him for) only the computing capacity he needs,
when and where he needs it, is just as basic in engineering and scientific computation as in business
data processing.
Another area in which responsive data processing
offers recognized economy and convenience to the
scientist or engineer is ease of learning, ease of use,
and ease of making corrections. What is not always
appreciated is that these same advantages, albeit on
a somewhat smaller scale, accrue to the business
user as well. Training is greatly facilitated by a
responsive system, not only because procedures are
reduced to minimal simplicity (partly due to freedom
from the constraints of the 80-column card) but
also because a properly designed system can function as a teaching machine, indicating mistakes or
misunderstandings as they occur. This benefits both
the experienced operator, by guiding her through
extraordinary or unusual procedures, and the neophyte, by allowing her to learn by doing, at her own
rate and without the cost or embarrassment of a
teacher standing by. Corrections are similarly simplified because mistakes are discovered when they
arise and while the source document is still in front
of the operator.
An incidental economy, often of more importance than might be thought, is the reduction in
space required since few computers are as small as
the terminal device of a responsive time-shared
system.
In many businesses operational documents such
as invoices must be prepared almost as soon as the
information is available. Without a responsive system it is necessary either to use relatively clumsy
by-product card or tape punching during the initial handling of the data, or later to re-key part or
all of it a second time. Hence, while the repeated
handling of data can be eliminated in other ways, a
responsive system achieves it almost effortlessly.
A final area of manpower economy is the reduction in cost of managing the data processing function, a by-product of the centralized systematization and control implicit in a time-shared service
as discussed later.
Time-sharing, moreover, lowers the effective
cost of data processing equipment itself through
more effective utilization of it. The user, as previously stated, is charged only for actual usage and
not for idle time. Economy of scale (getting more
computing per dollar the more one spends for a machine) is a basic factor for the very small user (although not as significant as previously for the larger
user since Grosch's Law* no longer applies to computers which rent for more than a few thousand dollars per month). A factor related to economy of
scale, and far more important to today's technology,
is the potential for effective use of storage hierarchies, keeping in high-priced core storage only the
data needed at the moment, with other data, programs and files kept in storage devices lower in
both cost and accessibility.
In addition to these, there are two other hardware economies worth considering. First, a properly
designed procedure-oriented language for describing the procedures to be followed for one company will permit many others to use the same basic
programs even though there will generally be differences in the details of file record layout, output formats, exception handling, and other elements of
processing. This permits savings not only in the
preparation of procedures but also in the storage of
them, a factor of particular significance for frequently used procedures which tend to become resident in core memory. Second, techniques for randomly addressing a disc or magnetic card file are
usually more efficient for large than for small files.
*Named after Dr. H. R. J. Grosch, who was the first either
to quantify the relationship or at least to state it forcefully
enough to clothe it with the mantle of law. It states that,
empirically, the ratio of the computing capacities of any two
electronic digital computers equals approximately the square
of the ratio of their costs. Thus a $10,000 a month computer
could in an earlier day have been expected to produce results
one hundred times as fast as another renting for $1,000 per
month.
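The footnote's statement of Grosch's law is a simple quadratic relation, sketched here for concreteness (the function name is invented):

```python
# Grosch's law as stated in the footnote: computing capacity goes as the
# square of cost, so the capacity ratio of two machines is approximately
# the square of their cost ratio.
def grosch_capacity_ratio(cost_a, cost_b):
    return (cost_a / cost_b) ** 2

# the footnote's example: a $10,000/month machine vs. a $1,000/month machine
assert grosch_capacity_ratio(10_000, 1_000) == 100.0
```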
485
By combining many files in one file-handling system, a substantial amount of statistical smoothing is
obtained and double lookup and/or excess capacity
requirements thereby reduced.
Control
Perhaps the greatest advantage of a responsive
data processing system to the small or medium-sized business is improved control, systematization
and management of the data processing function.
Conventional service bureaus offer the same advantage, of course, but to a lesser degree. It is also
standard policy for suppliers of bookkeeping machines to aid their customers in establishing wellcontrolled procedures. But one of the biggest sources
of difficulty in the business use of larger-scale electronic data processing equipment has been the
abdication by the suppliers of their responsibility for
providing customers with business systems rather
than business equipment alone. The supplier of
responsive data processing services, however, can
most readily and thoroughly aid the same business
manager by providing and maintaining an efficient
and well-controlled system.
While it would be possible to allow or even expect the business user to design and program his
own system, just as the engineering user of time-shared equipment currently does, the information
utilities will best serve their purpose by furnishing
to the user not merely a data processing capability
but a data processing system service. The cost of
systems design, programming, and data processing
management is and will remain high. It therefore
seems sensible to share these costs among many users just as the equipment is shared by many. And
certainly the time-shared responsive system not
only facilitates but virtually cries out for such sharing of the cost of system development.
Central time-shared facilities permit close and
continuous monitoring of the data processing operations of all its subscribers. There is no excuse,
then, for not offering the subscriber a well-designed system and a guarantee that it will continue
to be used as planned. This would differ from many
systems provided by conventional suppliers or consultants which would work well if used correctly
but, once left to the users' own devices, would tend
to degenerate through changes made by those performing the job to suit their own tastes without a
full understanding of the implications.
486
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
1965
This is not to say that a packaged service can or
should be developed with the intention of having
any substantial number of businesses using it without modification. It is essential that the general system design and its implementation permit "cutting
and pasting" of user programs to meet the requirements and desires of the individual subscriber. But
the responsive system supplier has a very real
responsibility to provide his subscribers not only
with what they want-and no more-but with what
they need as well. There is, for example, no aspect
of the responsive data processing services offered
by KEYDATA Corporation that appeals more to
prospective subscribers than the fact that they can
have the data when they want it and, in addition,
can rely on KEYDATA to see that nothing is overlooked with regard to the efficiency and control of
their data processing activities.
Finally, for effective management control the
convenience of having both computing capacity and
the corporate files accessible, essentially at one's
finger tips, for study, analysis and forecasting is of
tremendous significance. It means that managers
will be encouraged to consider facts and analyze
trends in a way which few if any of them, aside
perhaps from the planning staffs of very large corporations, can or will do today.
IMPLICATIONS
The responsive business data processing service which, within the next few years, seems so certain to blossom into a computer utility with almost universal appeal to the business world has many implications for the hardware and software required to provide it. Increasingly effective equipment and programming techniques must be found to improve terminal devices, to lower communication costs, and to make highly efficient use of storage hierarchies. The integrity of the user's file data must be unequivocally ensured; that is, he must be confident that data will not be accidentally or maliciously lost, altered, or revealed to his competitors. He must also be guaranteed continuity of service without the expense of duplicate equipment; thus it is essential that the system be operational a very high percentage of the time, that interruptions of service be short in duration, and that recovery from equipment malfunction be relatively painless and essentially foolproof. It is therefore clear that the system itself must be truly "clobber proof"; that is, no user may be capable of changing anyone else's program or tying up the system in any way that would disrupt normal service to other users. For maximum efficiency the programming techniques used to supply commercial users should capitalize on every degree of standardization that can be encouraged between and among different users. At the same time, since each user may want to do things a little differently than any other user, adaptability of the system is equally important.
A thorough discussion of all these elements would, of course, be more appropriate in a book than in a short technical paper. But like anyone else who provides or plans to provide responsive time-shared service on a commercial basis, the group the author represents has had to find workable solutions to all the problems implicit in the listing above. Some of these solutions are far from elegant; many will improve rapidly as more and more hardware and software designers direct increasing attention to the problems. Therefore, without attempting even to discuss, let alone resolve, all of these problems, a few remarks about some of the more interesting design decisions represented in the KEYDATA system may be in order.
Terminal Devices
Certainly the design of the user's terminal is important; but far too much emphasis seems to be placed on high style and novelties to the detriment of dependability and economy. A Model 28 Teletypewriter may seem ancient and unattractive; it may have fewer keys than a new model typewriter, lack lower-case letters, have only a few symbols, and be comparatively slow and somewhat noisy. But until something comes along that is as dependable and inexpensive,* has a higher speed, richer character set and preferably a bit more style, the Model 28 serves the purpose quite well. Full-duplex operation,† permitting easy correction of mistakes and the use of a third level, a control shift, on the keyboards in the KEYDATA system, has proved especially satisfactory.
*The total cost includes, of course, modulating-demodulating equipment if needed, as it is not in the case of Model 28 Teletypes leased from New England Telephone & Telegraph Company.
†The keyboard is connected to the computer and the computer to the printer; but there is no direct connection between the keyboard and the printer.
RESPONSIVE TIME-SHARING COMPUTER IN BUSINESS
Scheduling
In the matter of response time, the KEYDATA
system permits three different approaches: (1) normal full-duplex semiresponsive message processing, in which the user needs the output generated from the input as rapidly as possible, but the
only conversational effect is for verification; (2)
truly responsive conversational operation, in which
the user must wait for a response to one input before he can determine what the next input should
be; and (3) job-queued quick-turnaround operation, in which the objective is to give relatively
rapid service on jobs in which the amount of processing is so great that the user cannot realistically
expect to wait for it without going on to another
task in the meantime.
The emphasis of most present time-shared systems is on the conversational mode, which is perhaps the most important and at the same time the
most difficult to handle effectively. Since the
amount of computing required to generate a response in any conversational situation can sometimes be substantial and is often unpredictable, the
crux of the matter is a scheduling procedure that
will give each user a "fair shake." The most obvious method, an equal time slot for each user,
seems eminently fair but tends to achieve equality
by making every user's response equally bad. (Suppose, for example, 10 users request a 5-second job
at the same instant. Most "fair" systems would give
all these users a response after 50 seconds. Serving
each in turn to completion, however, would reduce
the average response time from 50 to 27.5 seconds
without penalizing anyone, though perhaps at the
expense of making user number 10 resentful of the
better service given the other 9.)
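The arithmetic in the parenthetical example can be checked with a short sketch (the function names are mine, and the fine-grained round robin is idealized):

```python
# Illustrative sketch (not from the paper): response times for ten
# simultaneous 5-second requests under the two scheduling policies discussed.

def round_robin_responses(n_jobs, job_time):
    # Fine-grained equal time slots: every job finishes at about the same
    # moment, when the whole n_jobs * job_time of work has been done.
    return [n_jobs * job_time] * n_jobs

def serve_in_turn_responses(n_jobs, job_time):
    # Run each job to completion in arrival order: job k finishes at k * job_time.
    return [k * job_time for k in range(1, n_jobs + 1)]

rr = round_robin_responses(10, 5)
fcfs = serve_in_turn_responses(10, 5)
print(sum(rr) / len(rr))      # 50.0 seconds for every user
print(sum(fcfs) / len(fcfs))  # 27.5 seconds on average; user 10 still waits 50
```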
The scheduling method used in the KEYDATA
system has many of these defects but it does overcome one serious problem. It allows conversational
users to reserve a level of service for minutes or
hours at a time and guarantees that they will receive at least the agreed level of service on a minute-by-minute basis. In keeping with the doctrine adopted by Vyssotsky at Bell Telephone Laboratories, KEYDATA prefers to deny service rather
than degrade it.
Job queuing on a minute-by-minute and
hour-by-hour basis is a desirable adjunct to any
time-shared system. It means that a user can set
up data and procedures, and try them out if necessary, using responsive or semiresponsive services.
Then he may, through the same console, request execution of the processing task he has set up, get an
estimate of when the job will be finished, keep
track of where he is in the queue (if this matters to
him), be told when his job is finished (or forego
this if he expects to be using his console for some
other purpose), examine the results of the completed run (in whole or in part) and, when substantial
output is involved, have it printed or recorded on
tape at the central computer facility for later delivery to him.
Semiresponsive full-duplex operation is particularly appropriate in commercial applications. Consider, for example, the preparation of invoices using
the KEYDATA system. The operator enters a customer number and the computer responds with a
full heading for the invoice, including the name and
address of the customer, invoice number, date, and
other pertinent information. The operator then enters the stock item number and quantity, and the
system responds with a full line containing not only
the stock number and quantity but also a description of the item, unit price, extension and possibly
discount or other information.
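The dialogue just described can be sketched in a few lines. The customer table, stock table, and field layout below are invented for illustration; the paper does not describe KEYDATA's actual record formats.

```python
# Hypothetical sketch of the invoice dialogue: a customer number yields a
# full heading; a stock number and quantity yield a full invoice line.

CUSTOMERS = {1042: ("Acme Supply Co.", "12 Main St., Boston, Mass.")}
STOCK = {507: ("widget, brass", 0.25)}  # item number -> (description, unit price)

def invoice_heading(customer_no, invoice_no, date):
    name, address = CUSTOMERS[customer_no]
    return f"INVOICE {invoice_no}  {date}\n{name}\n{address}"

def invoice_line(stock_no, quantity):
    description, unit_price = STOCK[stock_no]
    extension = round(quantity * unit_price, 2)  # quantity times unit price
    return (stock_no, quantity, description, unit_price, extension)

print(invoice_heading(1042, 881, "Nov. 30, 1965"))
print(invoice_line(507, 12))  # (507, 12, 'widget, brass', 0.25, 3.0)
```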
This kind of operation differs from the conversational requirement of the engineering user in three
important ways: the processing of each message
usually involves random access to a large user file
but seldom requires much actual computation; output volume (as in job-queued engineering use) is
several times that of input and must be attractive in
format and free of visible corrections or typeovers;
and the operator usually wishes to enter data as
rapidly as she can strike the proper keys and need
not see the output resulting from one input in order
to go on to the next. It is nonetheless desirable that
the output be produced soon after the input so that
a finished invoice is immediately ready to permit
the shipment of an order; inconsistencies in input
are called to the operator's attention by the system
the moment they occur so that the data can be corrected or verified while the source material is at
hand, and basic files are constantly kept up to the
minute in currency.
In the KEYDATA system, input data are processed within a very short time (usually a fraction
of a second) after being entered, but the output is
adequately buffered so that the operator need not
wait for the printer before proceeding with the next
entry. Since the amount of processing required by
each input message is small and relatively predictable, scheduling in such a system can ordinarily be
handled by the simplest possible method: first
come, first served.
Procedure Language
User programs (that is, the description of the
procedures to be followed in processing data for
any application by any user) are written for the
KEYDATA system in a more or less problem-oriented language, called KOP-3 (for KEYDATA
On-Line Processor number three, the first two
having been used in a smaller KEYDATA facility
which went into operation in July 1963). KOP-language programs can be stored compactly and
dealt with interpretively, thus saving storage and
facilitating adaptability and integrity. Because they
are written by KEYDATA's own staff, these procedures can be expected to make efficient use of the
language and equipment. (Many a sound design has
grounded on the rocks of inadequately informed or
incompetent use!) They can also be cut into quite
small segments by the programmer who, in the final
analysis, is much better prepared than any 1965-vintage executive routine to determine the natural
segments of his program.
KOP-language programs are, of course, pure
procedures: that is, the instructions cannot modify
themselves in storage and all variable status information is kept in small areas uniquely assigned to
each subscriber. Procedure segments (pages of 48
words) are kept on magnetic drum and called into
core as needed. Automatic core allocation and
scheduled drum accesses (omitting a drum read
whenever the desired segment is already in core)
result in balanced utilization of core and drum and
a high level of central processor utilization.
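The drum-to-core segment scheme might be sketched as follows. The class name, core capacity, and FIFO eviction rule are assumptions; the paper specifies only the 48-word pages and the omitted drum read when a segment is already in core.

```python
# A minimal sketch of paging procedure segments from drum into core,
# skipping the drum read whenever the page is already resident.

PAGE_WORDS = 48

class SegmentStore:
    def __init__(self, core_pages):
        self.core = {}              # page number -> page contents
        self.capacity = core_pages
        self.drum_reads = 0

    def fetch(self, page_no, drum):
        if page_no in self.core:    # already resident: omit the drum read
            return self.core[page_no]
        self.drum_reads += 1
        if len(self.core) >= self.capacity:
            self.core.pop(next(iter(self.core)))  # evict oldest (FIFO, for brevity)
        self.core[page_no] = drum[page_no]
        return self.core[page_no]

drum = {n: [0] * PAGE_WORDS for n in range(16)}
store = SegmentStore(core_pages=4)
for n in [1, 2, 1, 1, 3, 2]:
    store.fetch(n, drum)
print(store.drum_reads)  # 3: pages 1, 2, 3 read from drum once each
```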
CONCLUSION
Responsive time-sharing to serve the business
as well as the technical user is here to stay. It requires much more systems work to define and solve
the problems of applying it in new areas. It also
needs more collaboration between software and
hardware designers than has been evinced thus far.
Considerably greater attention should be focused on
questions of economy in deciding on the tradeoff
between hardware and software, in using storage
hierarchies, and in determining the kind of service
which the time-sharing user should be offered.
The next few years will indeed be fascinating ones.
CIRCUIT IMPLEMENTATION OF HIGH-SPEED PIPELINE SYSTEMS
Leonard W. Cotten
Fort George G. Meade, Maryland
INTRODUCTION
The implementation of high-speed pipeline systems as described in this paper arose as a direct consequence of a large scale Department of Defense developmental effort initiated in 1962. The objective of the effort was to develop and make available a complete capability for producing individual special-purpose systems on a fast reaction basis. From 1962 to the present time attention was focused on all aspects of circuit and packaging technology, automated or computerized design aids, feasibility vehicling, systems design studies, and advanced memory development. As a result of strong industry impetus in these directions it now appears that 1 to 2 nanosecond hybrid integrated or full integrated logic circuits, practical fabrication of transmission line interconnections, packaging densities of 5000 logic gates per cubic foot in the machine environment, 100 to 150 nanosecond cycle time DRO thin film main memories, and 23 to 40 nanosecond integrated scratchpad memories will be made available for systems being constructed over the next one to three year period.
Considering the high potential systems performance promised by these developments and the complexity of nanosecond logic realization, it would seem that the systems designer-builder or technology user is faced with a bilateral challenge. First, and perhaps foremost, the user must exploit to a high degree each favorable performance characteristic made available by technology developers. Emphasis will be placed on the traditional engineering compromises; however, increased sophistication and complexity will be necessary. For example, the propagation of a 1 to 2 nanosecond rise time in a logic net requires far more attention to detail and fundamental understanding than, say, a 10 to 20 nanosecond rise time. Because of the large number of special situations that arise in design, it is important that efficient generalities or ground rules be used. Many considerations suggest that it would be wise for the user to develop straightforward but efficient approaches to recurrent systems problems. As more systems are built, well-known approaches might be augmented such that the designer is able to minimize redundant design effort by not having to start from the beginning with each new logic design. This notion is particularly amenable to a fast reaction capability for one-of-a-kind machines if it is more desirable to expend effort on the problem at hand rather than on details of logic implementation.
The second challenge facing the user is that of tightening feedback loops with technology producers and other users. The time is not too distant when a sizable number of interconnections will be made at the integrated subsystems level. This has occurred at the circuits level and the trend will continue.
In light of the above factors it will be the purpose of this paper to address those areas relating to the implementation of high-speed pipeline systems at the circuits level. The ideas presented are considered to be valid in systems containing up to 50,000 basic gates. Since the circuit packaging and other details have been described in the literature1-4 only those circuit characteristics relating to pipeline implementation will be discussed. While the concept of pipelining, or possibly streaming, has existed for years* and has been applied in earlier systems, the term appears not to have been broadly established. For this reason examples will be given to permit a better understanding. Main attention will focus on exploiting advanced technology in the pipeline systems environment. Emphasis is given salient features of implementation that will ultimately relate to the pipeline system timing and control structure. The specific logic for a pipeline process may be quite general and is left to the designer of individual systems. For the benefit of advanced technology users it should be noted that many of the concepts presented herein relate to systems implementation in general and may be of interest in applications somewhat removed from pipeline systems.
THE RATIONALE FOR PIPELINE
OPERATION
With the present state-of-the-art in systems organization and technology it appears that pipelining is a powerful approach to a particular variety of large data processing problems. The implementations of systems to process such problems usually display one or more of the following characteristics.
1. Thruput or byte flow rate is important. The total time required to flow information through a stream of processing logic is less critical than the rate of flow. Since pipeline fill-up time is small compared to total processing time the productivity depends largely on rate. The byte acceptance and/or output rate is optimized and consequently is allowed to constrain the system.
2. Input/output (I/O) speed is a critical factor even though care is taken to minimize I/O and to keep data buildup and reduction internal to the processor logic. This consideration is normally resolved by developing a memory hierarchy which frequently calls for advancing the memory art significantly.6-8 The most critical rate exists at the pipeline I/O interface. In many cases this may be described by the following relationship.
Memory cycle time ≤ (Memory word size) / [(Pipeline byte size) × (Pipeline byte input rate)]
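The relationship can be read as a simple rate balance: the memory must cycle fast enough to deliver bytes at the pipeline's input rate. A sketch, with an invented function name and example figures:

```python
# Sketch of the pipeline I/O interface constraint: the longest permissible
# memory cycle time for a given word size, byte size, and byte input rate.

def max_memory_cycle_time(word_bits, byte_bits, byte_rate_hz):
    # seconds per cycle = (bytes per word) / (bytes per second)
    return (word_bits / byte_bits) / byte_rate_hz

# Example (illustrative numbers): 64-bit words, 8-bit pipeline bytes, and a
# byte input rate of 100 million per second permit at most an 80 ns cycle.
t = max_memory_cycle_time(64, 8, 100e6)
print(t)  # 8e-08
```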
3. High system productivity is achieved by a high degree of parallelism. Bookkeeping, for example, may be performed in a pipeline parallel to the main data processing stream. Serial control is often used to sequence and interleave pipelines. Concurrence is evidenced by the fact that almost all of a system is actively processing data at any given time.
4. Timing is usually performed in the synchronous mode. While the relative merits of asynchronous timing are well known,9 the advantages of synchronous operation normally outweigh the drawbacks in the systems under discussion. This may be attributed to several considerations. First, numerous internal process interactions must be predicted and controlled efficiently. An example is time and spatial injection of new information into a running pipeline process. Secondly, evidence exists to suggest that recent technology is relatively more predictable. In one published report the variance of raw circuit propagation delay was less than 0.5 nanoseconds while typical logic net (circuit, line and loading) delay was approximately 5 nanoseconds. The components of delay associated with transmission line propagation time and load driving are predictable, and typically may account for over half the delay in a net in advanced systems. Also,
well-known theorems in statistics10 may be employed in strings of gates to avoid anticipation of so-called worst-case delays. Finally, the maximum rate objectives of parallel processors usually rule out lengthy ripple structures of the type where asynchronism might be profitably employed. Instead, lookahead gating is incorporated into structures, for example.
*The term pipeline has been used for over 5 years by designers to describe maximal rate processing of the form discussed. However, the author has so far been unsuccessful in determining the origin of the term as used in this context. See reference 5.
The concept of pipeline operation may be illustrated by using an adder as a vehicle for discussion.
First, total time for performing addition could be
minimized by using parallel structures and carry
lookahead logic. Add time for 12-bit numbers
might be 4T. Secondly, the same adder could be
constructed with fewer gates by using a ripple carry
structure. Add time would be approximately 12T.
Thirdly, if the ripple carry structure of the second
adder is clocked, and if each successively higher order bit pair from the input numbers is delayed in
time by 1T, then the adder is capable of producing
a sum for each period T. The skewing of the input
numbers and the deskewing of the output sums
could be accomplished by delay lines, clocked registers, or by specially ordered memory storage patterns. The third adder is of course a form of pipeline operation, and might be advantageous if many
pairs of numbers are to be added at a maximal rate.
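The third adder's behavior can be simulated in a few lines. This is a behavioral sketch, not the author's circuit: each stage latches its carry for one clock, and the one-clock skew per bit position makes the carries line up so that one 12-bit sum emerges per period T.

```python
# Behavioral model of a clocked ripple-carry adder pipeline with inputs
# skewed by one clock per bit position (sums are modulo 2**12, as the
# carry out of the top bit is dropped).

WIDTH = 12

def pipeline_add(pairs):
    sums = [0] * len(pairs)
    carries = [0] * WIDTH                    # carry latched at each stage
    for t in range(len(pairs) + WIDTH - 1):  # fill-up plus drain time
        new_carries = [0] * WIDTH
        for j in range(WIDTH):               # stage j sees bit j of pair t - j
            k = t - j
            if 0 <= k < len(pairs):
                a = (pairs[k][0] >> j) & 1
                b = (pairs[k][1] >> j) & 1
                cin = carries[j]
                sums[k] |= (a ^ b ^ cin) << j
                if j + 1 < WIDTH:
                    new_carries[j + 1] = (a & b) | (cin & (a ^ b))
        carries = new_carries
    return sums

pairs = [(100, 23), (4095, 1), (2048, 2047)]
print(pipeline_add(pairs))  # [123, 0, 4095]  (4095 + 1 wraps modulo 2**12)
```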
A pipeline system would probably result if a special machine was designed to correlate long columns of numbers, Ai and Bi, as rapidly as possible. From the relationship
P12 = Σ (i = 1 to n − k) Ai Bi+k
it is apparent that many products must be formed and accumulated at the same rate. If points (Ai, Bi+k) were sampled in time it is seen that the process would be repeated for many values of k, or time lags. Finally it is observed that a careful balance of memory capability, parallel logic, and system organization would be necessary to achieve the maximum practical rate.
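The workload can be sketched as follows (function name and data are illustrative): for each lag k, products Ai·Bi+k must be formed and accumulated, and the whole computation repeats for every lag of interest.

```python
# Sketch of the lagged-correlation workload: accumulate the sum of
# products A[i] * B[i + k] for each time lag k.

def lagged_products(a, b, max_lag):
    n = len(a)
    return [sum(a[i] * b[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

A = [1, 2, 3, 4]
B = [1, 1, 1, 1]
print(lagged_products(A, B, 2))  # [10, 6, 3]
```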
Figure 1. Basic register and gating structure.
BASIC PIPELINE OPERATION
A basic pipeline structure is illustrated in Fig. 1. The pipeline is characterized by a succession of register and gate sections. The first section, (A), contains a register which is made up of elements A11, A21, . . . Ar1 and a gating structure composed of an (r) by (k−1) element array. Limits have not been placed on r and k as this is a somewhat generalized structure. In practice definite limits are established by logic and timing considerations, and not all elemental positions are necessarily populated. The elements which form registers are considered to be basic register cell structures. The elements which occupy positions in the gate arrays are basic circuits that perform logic and consume time in the process. Information flows through strings of these circuits in an unclocked fashion. For this reason gate strings between registers are kept within specified lengths. Special attention is given the gate-register and register-gate interfaces, especially as timing constraints become critical. The control delay (CD) elements are structured from standard logic circuits and are used to form timing chains, which sequence the pipelines. Timing chains usually account for 10 percent or more of the gates in typical pipeline systems.
Figure 2. Timing for one input BYTE.
An examination of data flow and the timing-control interaction of a basic pipeline will help to explain the operation. Figure 2 illustrates the basic timing for one input BYTE that is subsequently processed through the pipeline. At time tn+1 the
BYTE is clocked in parallel into the first register A.
The information appears at the register A output
and moves unclocked through logic gating A to the
input of the next register B. At some time after tn+1 and before tn+2 a pulse moves from the first control delay CDA to set the second control delay CDB such that a clock pulse from the time base is gated through to register B at time tn+2. The timing process is repeated through the remainder of the registers. Information arrives at the output end of the pipeline and is available to be clocked into the last register or output device at time tn+m. A total of m registrations or clock cycles is required to traverse the m-section pipeline. Four points might be
emphasized. First, after introduction of the BYTE
into the pipeline the timing chain automatically sequences the information through at the basic clock
rate. Second, Figure 2 suggests that the pulse paths
between adjacent control delays are noncritical over
time periods approaching the clock period. Third,
Fig. 3 shows the timing sequence for a burst of six
input BYTES. During a burst it is possible that all m
registers may be clocked simultaneously; moreover, this is the desired situation in normal processing.
Figure 3. Timing for six input BYTES.
One obtains maximum productivity by minimizing
time gaps between bursts. Lastly it is observed that
for p BYTES and m pipeline registers a total of
m + p - 1 clock periods is needed to clock out all
information. Now that concepts fundamental to pipeline operation have been set forth, it is meaningful
to consider implementation with newly developed
logic circuitry.
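The m + p − 1 relationship is easy to confirm with a toy shift-register model (a behavioral sketch; the names are mine): each clock shifts every resident BYTE one section forward, and a BYTE is counted out when it reaches the last register.

```python
# Behavioral model of burst timing through an m-register pipeline:
# p BYTES require m + p - 1 clock periods to clock out all information.

def clocks_to_drain(m_registers, p_bytes):
    return m_registers + p_bytes - 1

def simulate(m, bytes_in):
    regs = [None] * m
    arrived, clocks = [], 0
    feed = list(bytes_in)
    while feed or any(r is not None for r in regs[:-1]):
        regs = [feed.pop(0) if feed else None] + regs[:-1]  # shift one section
        clocks += 1
        if regs[-1] is not None:
            arrived.append(regs[-1])    # BYTE clocked into the last register
    return arrived, clocks

out, clocks = simulate(4, ["B1", "B2", "B3"])
print(out, clocks)  # ['B1', 'B2', 'B3'] 6  (m + p - 1 = 4 + 3 - 1)
```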
CIRCUITS AND LOGIC GATING
Basic Circuits
The circuits being used for implementation purposes are of the well-known current mode switching
or emitter coupled logic (ECL) variety. The circuits
and detail characteristics have been described in the
literature,11-13 therefore only those properties needed
to explain logic implementation will be examined.
Figure 4 shows the basic circuit, logic, and an electrical behavior chart. The circuits perform standard
NOR-OR(+) or NAND-AND(-) logic depending upon the logic polarity significance. The parenthetical symbols (+) and (-) are used to indicate which signal excursion is to be identified as a logical ONE. For logic polarity symbols* the open right triangle will be used to indicate negative logic significance and a straight line will indicate positive significance.
Figure 4. Circuit logic and electrical behavior.
*The IEEE logic symbol standard is the "American Standard Graphic Symbols for Logic Diagrams," approved by the American Standards Association on Sept. 26, 1962. See Computer Design, May 1965, pp. 6-7.
Since for purposes of this discussion the reference
voltage is zero volts or ground, it is seen that the
high swing is a positive voltage and the lower swing
is a negative voltage, nominally +0.4 and -0.4
volts. Because of the low leakage current associated
with the silicon transistors, unused inputs need not
be connected. A floating base effectively takes the
respective transistor out of the circuit. One other
point should be mentioned. The emitter followers
are not tied to resistors at the circuit level; however,
each logic net in the system will contain a current
source resistor and a transmission line terminator in
an R (resistor) pac. These provisions make possible
the very useful wired OR or DOT OR (+), provided one is able to sacrifice the identity of the
individual variables that are shorted together electrically. This connection also forms a DOT AND
(-) which proves quite useful. From a logical design
standpoint the connection permits two levels of logic
in the time slot normally required for one decision.
Additional constraints such as placement must be
considered, but many useful functions may be realized. For example, the EXCLUSIVE OR, f = AB̄ + ĀB, or the carry equation, Cn = AnBn + AnCn-1 + BnCn-1, may be realized in one gate delay for either positive or negative significance if both polarity input variables are provided.
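Both identities can be checked by truth table. This is a verification sketch only; the circuit realizes them in wired logic with dual-polarity inputs, not in software.

```python
# Truth-table check of the two one-gate-delay functions: the EXCLUSIVE OR
# f = A*B' + A'*B, and the carry Cn = An*Bn + An*Cn-1 + Bn*Cn-1 (a majority).

from itertools import product

for a, b, c in product((0, 1), repeat=3):
    xor = (a & (1 - b)) | ((1 - a) & b)   # A*B' + A'*B, with 1-x as complement
    assert xor == a ^ b
    carry = (a & b) | (a & c) | (b & c)   # majority of An, Bn, Cn-1
    assert carry == ((a + b + c) >= 2)
print("both identities hold")
```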
Circuit Response in the Systems Environment
Industry progress in improving circuit response
has been quite encouraging; however, the user still
must consider numerous design alternatives brought
on by bandwidth limitations and line propagation
velocity. This is illustrated by the fact that 1 to 2
nanosecond circuit propagation delays have to be added to 2 to 4 nanoseconds of average interconnection and loading delay in typical cases. Since
loading by its very nature produces both lumped
and distributive effects, reasonably accurate calculations are less than straightforward. Placement of
circuit modules and interconnection paths further
complicate the logic design in highly synchronous
systems that are optimized for speed. In order to
deal with these problems efficiently the designer
must be supplied with valid rules of thumb and
ground rules. Experience has taught that this is indeed possible, but that the user must be sold on the
increased complexity above that encountered in
slower speed logic.
The total delay of a logic net, consisting of the circuit, interconnection lines and loading, is determined by a number of contributing effects.1 For a
first order estimate it is necessary to consider only
four components. It should be mentioned that these
quantitative results are based on (a) effective base
input capacities in the 5 to 10 picofarad range, (b)
terminated transmission lines that are capacitively
loaded so as to present a load of approximately 50
ohms to the driving circuit, (c) logic nets of several
inches, and (d) a specific set of circuit design compromises.
1. The first component of interest is raw circuit delay. For the circuits being discussed
propagation delay is 1 to 2 nanoseconds.
Voltage transition times are defined as between -0.2 volts and 0.2 volts, and the propagation delay is measured from the start
of the input transition to the start of the
output transition. Since input transition
time is a parameter with respect to this
measurement, approximately 1 nanosecond
is assumed (best case). This is approximately the circuit output transition presented to a 50 ohm resistive load.
2. Second, a transmission line propagation
delay of 0.18 nanoseconds per inch must
be considered. This is established by the
glass epoxy dielectric used in the laminated
interconnection cards which accept the circuit modules and the interconnection
boards which accept the cards.
3. The third component of net delay arises
from attaching base loads to the transmission line and appears in the form of an increase in line propagation delay. This effect and the one below (4) have been graphically described by Flynn.1 In practice
the addition of a base to a moderately
loaded line of a few inches in length may
add as much as 0.25 nanoseconds of delay.
4. The last delay component to be considered
is an increase in circuit propagation delay
attributed to input transition times being
greater than the original one nanosecond in
the best case. Circuit propagation delay
increases roughly by one-half the increase
in the input transition time from the original one nanosecond. The actual input transition time is determined by base loading
on the driving net. In practical situations
the addition of one base onto the driving
net may cause the propagation delay of circuits being driven to increase 0.15 nanoseconds.
It should be noted that base loading causes delay in
two separate ways. These response considerations
give some insight into decisions that must be made
by a designer using high speed logic of this type.
Considerable freedom exists as to possible choices
for a given logic implementation, but appreciable
time may be spent in arriving at the best set of decisions. For these reasons it would seem desirable
to have some standard approaches to recurrent situations in pipeline systems particularly. At the present time it seems reasonable to strive to get total
delay per net down to approximately 4 nanoseconds
in pipeline applications.
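A first-order estimate combining the four components might look like this; the function and its interface are my own packaging of the figures above (1 to 2 ns raw circuit delay, 0.18 ns per inch of line, roughly 0.25 ns of added line delay and 0.15 ns of added circuit delay per attached base).

```python
# Rough first-order net delay estimate from the four components discussed:
# raw circuit delay, line propagation, load-added line delay, and the
# circuit propagation increase caused by slower input transitions.

def net_delay_ns(circuit_delay=1.5, line_inches=0.0, loads=0):
    line = 0.18 * line_inches     # 0.18 ns per inch of interconnection
    load_line = 0.25 * loads      # added line delay per attached base
    load_circuit = 0.15 * loads   # propagation increase from slower input edges
    return circuit_delay + line + load_line + load_circuit

# A 1.5 ns circuit driving 6 inches of line with 3 base loads:
print(round(net_delay_ns(1.5, 6, 3), 2))  # 3.78 ns, near the 4 ns per-net target
```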
Register Cell Implementation
Historically, the NOR implementation of a gated
set-clear flip-flop has been widely used. The NOR
implementation may be seen in Fig. 5 along with
two more recent developments in the evolution of
basic register cells. The NOR flip-flop is well
known; therefore, a review of operational characteristics will suffice.
Figure 5. Register cell structures.
Input consists of a negative gating pulse (-G) and normal (X) and complemented (X̄) data input levels. If the symbol T is
used to represent a nominal circuit propagation delay, it is seen that 2T best case and 3T worst case is required to obtain valid outputs from the time that the gating pulse is initiated.
A useful modification of the NOR flip-flop is the DOT OR feedback. It is seen in Fig. 5 that feedback lines 1 and 2 are wired to lines entering circuits E and I such that additional base inputs are not required. Propagation delay through this flip-flop differs from the first example in a basic respect. Since one of the output lines initially rises with the SET or CLEAR pulse the outputs are available earlier. In general, output information is available in 1T best case and 2T worst case.
The register cells just mentioned have disadvantages in high-speed pipelines and the latch cell, Fig.
5, has much to offer both in terms of speed and logic implementation. The latch as a functional logic
array has been realized as a high-speed module and
was mentioned by Flynn.1 It is the redundant feedback path through circuit I that contributes to
many outstanding cell characteristics.
1. Redundant data input information is not
required. This frequently saves time and
simplifies logic.
2. The +G and -G gating pulses may be skewed either way with respect to each other without impairing latchup. In practice +G and -G should coincide as closely as possible for maximum pipeline speed.
3. A close examination of the latch will reveal that the data outputs (X2+, X2-), lines 2 and 3, will always occur in 2T. The complemented output (X1-), line 1, is clearly available always in 1T because it is the only line seen to enter circuit M.
A curious and useful DON'T CARE function is
displayed by the latch under certain conditions. To
realize this condition the -G pulse is put into the
data input and the -G pulse input is permitted to
float.
(a) If the -G pulse arrives later than or with the +G pulse, the normal data outputs assume the logical ONE state.
(b) If the -G pulse fails to appear, then the +G will cause the normal data outputs to assume the logical ZERO state.
(c) If the +G pulse fails to arrive, then the -G pulse will cause the normal data outputs to assume the logical ONE state.
The apparent indifference as to the arrival of a + G
pulse when the logical one state is to be assumed
and the lack of dependence on the arrival of a -G
pulse when the logical zero state is to be assumed
are both useful in implementing the control delay
function that will be considered later.
Register trigger or gating pulse requirements
must be established prior to the design of the synchronization for high-speed pipelines. Of interest is
the determination of the minimum width gating
pulse that must be supplied to registers to insure
correct information storage. Each of the flip-flops
or register cells previously considered has an almost
identical closure path. An examination of all closure paths has resulted in some interesting observations; however, it might be more practical here to
consider a worst-case path for each register cell.
1. Consider the first NOR flip-flop in Fig. 5
with a ONE stored and a ZERO about to
be gated in and stored. The -G pulse causes
the C output on line 7 to rise. The D output on line 1 falls causing the A output on
line 2 to rise. Line 7 is then permitted to
fall. To simplify calculations the effects of
loading and line propagation delay are included in maximum and minimum circuit
delay. If the symbols G, C, D, and A are used to represent times associated with pulse width and circuit propagation delay, with a bar above a symbol denoting its maximum value and a bar below its minimum, then an expression for the minimum gating pulse width requirement, G, may be stated as:

G = C̄ - C̲ + D̄ + Ā   (2)
2. Consider next the DOT OR flip-flop with the same initial conditions as above. The closure path HIEI is analogous to CDAD even though the outputs of H and E are wired together.

G = H̄ - H̲ + Ī + Ē   (3)
3. The latch has a redundant feedback path; therefore, one path is temporarily negated. Consider the latch as containing a ZERO initially with a ONE on input data line 4. The +G pulse negates the feedback path LML. The closure path is KMIM. One may suspect from this example that something in the latch is wasted, but a consideration of all binary possibilities proves otherwise. The minimum width requirement placed on the -G pulse is:

G = K̄ - K̲ + M̄ + Ī   (4)

CIRCUIT IMPLEMENTATION OF HIGH-SPEED PIPELINE SYSTEMS
From these example cases it is seen that the minimum gating pulse width is determined by pulse
width reduction through one circuit and the maximum delay through two other circuits. Care must be
taken in measuring the possible pulse width shrinkage through a circuit because pulse width is a parameter in the limiting case. The interesting phenomenon in question is due to a finite recovery time inherent in all high-speed switches. There may even be an external recovery due to line reflections. In effect a circuit may turn on slowly only to turn off quite rapidly because the internal currents did not complete a full swing to the d-c rest point. If T̄ and T̲ are maximum and minimum circuit propagation delays in our domain of interest, then for registers of the type discussed the maximum pulse width requirement will be:
G = 3T̄ - T̲   (5)
As T̲ is made to approach T̄, Ḡ will approach G̲, and G may be approximated by saying:

G = 2T   (6)
An analogy may be drawn between (a) the time a circuit output is seen to switch and the time the circuit is internally stabilized and (b) the time information appears on the output of a register and the time the register is fully latched.
Logic Gating In Pipelines
Logic gating in pipelines is usually considered to
be gating between registers. For this reason the
constraints on maximum and minimum numbers of
gates in the critical path are determined respectively by maximum clock rate considerations and data
race conditions. Between these two limits the number of gates is usually noncritical. If T is used again to represent a nominal unit of circuit, line propagation, and loading delay, then in 50 megacycle pipelines 4T = 18 nanoseconds has been established as a reasonable design objective in most cases. This suggests that if each gate were treated separately, then 4.5 nanoseconds would be the maximum T; however, in practice the 18 nanoseconds may be applied
to the total string of gates, lines, and loading for
the maximum case. While 4 circuits are normally used, it should be said that 3 circuits under heavy
average loading and long paths or 5 circuits in short
and lightly loaded paths are not excluded. From the
previous analysis of registers it is seen that out of the 4T delay almost half the time may be spent in a register. For this reason special attention must be given to (a) optimal methods of gating in and out of registers and (b) ways to obtain as many logical
decisions as time permits between registers.
Examples of special register input and output
gating may be seen in Fig. 6.

Figure 6. Gating between registers.

The DOT OR flip-flop provides for a level of input gating with circuitry, or two levels of decision gating are possible if
the DOT OR is used judiciously. The time for this
gating is included in the 2T worst-case register
propagation time. In practice, to realize this gating,
one of three things must usually be done: (1) Completely separate SET and CLEAR equations may be
factored. (2) The register may be automatically
cleared with a simpler logic expression. (3) If
neither of the first two approaches is feasible then
it is usually possible to insert a redundant feedback
path of the form used in the latch. Gating out of a register in the shortest possible time may be done by using the NOT output (X1-), which is available from the latch in 1T. In Fig. 6 the output of Gate N actually is available as soon as X2 appears on the latch output. The power of this latch-N gate usage
is best demonstrated in the selection table example
which follows.
Suppose one wishes to store a 4-bit address in a
register and use this information to select one out
of 16 possible lines in the shortest possible time
slot. Assume further that special circuit packaging
is a consideration and a geometrical constraint exists on the maximum separation of the outputs in
the DOT OR ( + ) which is also the DOT AND (-).
Such a set of considerations is typical, and a solution is seen in Fig. 7. The DOT AND (-) was used; therefore, the selected line and only this line will go to a negative voltage level for the input conditions shown in the table. Since we are in a negative (-) logic system the low level voltage is interpreted as a logical ONE (-). A careful examination will show that each line is wired to four individual circuit outputs, and when compared to other implementations for this speed range the approach seems reasonably efficient. The relative merits of this solution can best be evaluated with respect to (1) timing, (2) packaging, and (3) further logic considerations.

Figure 7. Implementation of selection table.

1. The shortest time possible to clock information through the latch and use it to select a line is found to be 2T. Depending upon the spacing between the registers and the matrix, 2 to 3 nanoseconds must be budgeted to drive 8 external loads from the register. The time for the DOT decision is neglected.

2. It is observed that a module (a module is a 0.5-inch square multiple circuit package with 16 pins on 0.125-inch centers) made up of 4 1-input circuits could be most efficiently utilized. This is not essential, however, because a separate line goes to each input. Using either approach the geometrical constraint on wired (DOT ANDED) emitter spacing is noncritical here. The maximum distance between the two dots on the line to be selected is well within an inch, which in turn is well under the maximum permitted. If advantageous, the entire matrix could be partitioned vertically, as shown by the segmented line, and each half could be placed manually or by an automatic placement routine. The module count for this example, including 4 latches, could be as low as 12 modules.

3. From a logic standpoint other facts should be mentioned. In order to get correct outputs from the latches in a 1T time delay, complements were fed in. These complements could just as well have been normals. If this were the case, as may be seen from the Fig. 7 table, line 1 would become 16, line 2 would become 15, and so forth. Another alternative would be simply to interchange the normal and inverted outputs from the basic circuits. These considerations are somewhat trivial, but are cited to show that negative (-) or positive (+) input logic presents no problem in this case. For larger address decode problems it is suggested that one explore the possibility of using more than one input on the last level of circuits, granted this is not the only consideration. In passing it should be clear that if the 16 lines are each to be ANDED with a variable of interest by driving another circuit input, then the negative (-) logic significance is needed, and this is found to be true in most cases.
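Functionally, the matrix reduces to a 1-of-16 decode: each output line is the wired AND of four latch outputs, tapped true or complemented so that exactly one line is asserted for any stored address. The following is a minimal behavioral sketch in software terms; the DOT AND is modeled as a logical AND, and all names and data structures are illustrative rather than the original hardware.

```python
# Behavioral model of the Fig. 7 selection matrix: line n is the wired
# AND of four latch outputs, each tapped in true or complement form so
# that only the line matching the stored 4-bit address is selected.

def select_lines(address):
    """Return 16 booleans with exactly one True entry."""
    bits = [(address >> i) & 1 for i in range(4)]     # latch contents, LSB first
    lines = []
    for n in range(16):
        taps = [(n >> i) & 1 for i in range(4)]       # polarity tapped per dot
        lines.append(all(b == t for b, t in zip(bits, taps)))
    return lines

sel = select_lines(0b1010)
print(sel.index(True))   # 10 -- the one line at the ONE (-) level
print(sum(sel))          # 1
```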
Now that some optimal methods of gating in and
out of registers are known it is desirable to determine
the maximum number of consecutive decisions that
might be made between registers when a maximum
of two circuits may be inserted into the average critical path. Figure 6 shows a maximum critical path
from Gn to Gn+1, where Gn+1 is the next pulse in time (20 nanoseconds later, neglecting skew, for a 50 megacycle clock rate), which gates the resultant of all the path decisions into the DOT OR flip-flop.
A count of the Fig. 6 listing shows that storage plus
eight serial logic decisions are possible. The seventh
and eighth may be weakened by possible SET or
CLEAR requirements. If the last two are essential
decisions it might be necessary to incorporate redundant feedback into the last register to ease SET-CLEAR requirements. With the DOT OR (+) or
DOT AND (-) one does sacrifice identity of component variables, but if this can be tolerated then
the result is a decision in approximately zero additional time. Lastly, it is known that levels of NAND (-) NAND (-) are equivalent to OR (+) AND (+) or AND (-) OR (-), as could be shown by
successive applications of DeMorgan's theorem.
PIPELINE TIMING CONSIDERATIONS
Timing analysis indicates that the maximum rate
single phase pipeline operation is possible only after
( a) register pulse requirements have been met and
(b) the race paths have been eliminated. It will be
shown that condition (b) is analogous to saying
4T - We - Se ;;::: 0 where T is the minimum circuitline-load delay, We is the original clock pulse width,
and Se is the maximum possible clock skew. Actual
skew is contained within the interval 0 S skew ~
Se. Further considerations suggest that in order to
meet the basic conditions one might find it advantageous to alter T and T in specialized cases such
as in registers, and to carefully consider detailed
factors associated with each system.
Data Race and Latching - Theoretical Considerations
The insight provided by a theoretical treatment of the data race problem aids eventually in the application of a practical solution. Since two adjacent registers in a single phase pipeline may be thought of as receiving the same clock pulse, the hazard exists that data clocked into register 1 (point a in Fig. 8) may race ahead through a short, fast path and appear at the input of register 2 (point b) before the original clock pulse disappears. This is hazardous in two respects. First, the clock may correctly gate in both the original data and also the unwanted data from the malfunctioning critical race path. Second, the input data must remain at a level for the duration of the clock pulse to insure correct gating and original storage in the register. The undesirable consequence of data races may be avoided by designing enough delay into the shortest path to insure data arrival at the next register after the clock pulse has disappeared. In order to determine an adequate amount of additional delay one must first determine the latest possible time after both registers have been clocked that the clock may still be present. If for a pulse of width We the total effects of differences in path lengths, loading and distribution circuits are included in the skew term Se, then the latest possible pulse disappearance time is We + Se from the earliest appearance at either reference point a or b in Fig. 8. If the fastest logic circuit path is 4T̲ then an inequality to insure race free operation may be stated.

We + Se ≤ 4T̲  or  We ≤ 4T̲ - Se   (7)

Figure 8. Timing model.

From Eq. (5) it is seen that to insure correct data input clocking a pulse of 3T̄ - T̲ nsec width is required. If the term Se is a random variable controlled only within limits it follows that the minimum pulse input to a register is We - Se; therefore, an inequality to insure correct register pulsing may be stated.

3T̄ - T̲ ≤ We - Se  or  3T̄ - T̲ + Se ≤ We   (8)

Considering race hazard elimination and minimum pulse requirements it is possible to combine Eqs. (7) and (8) to establish an interval for We.

3T̄ - T̲ + Se ≤ We ≤ 4T̲ - Se   (9)

A solution for We in Eq. (9) exists if T̄/T̲ ≥ 1 for actual values of T̲ and Se. For 50 megacycle pipeline systems using the technology under discussion a maximum system clock skew of Se = 2 nanoseconds has been established as feasible. The maximum 4T̄ delay is then 20 - 2 or 18 nsec and T̄ = 4.5 nsec. Based on this information Eq. (9) may be solved for T̲, a maximum value of the ratio T̄/T̲, and pulse width We.

T̲ = (3T̄ + 2Se)/5 = (3(4.5) + 2(2))/5 = 3.5 nsec   (10)

Therefore T̄/T̲ = 4.5/3.5 = 1.28 ≥ 1 and is theoretically acceptable. And We = 12 nsec by substitution.
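The boundary arithmetic of Eqs. (9) and (10) can be reproduced in a few lines. This sketch assumes the reading of Eq. (10) as T̲ = (3T̄ + 2Se)/5, the point at which the two bounds of Eq. (9) meet; the numeric inputs (T̄ = 4.5 nsec, Se = 2 nsec) are the text's 50 megacycle figures, and the function name is illustrative.

```python
# Eq. (9): 3*t_max - t_min + skew <= w_clock <= 4*t_min - skew.
# The interval closes when the two bounds meet, which gives Eq. (10):
# t_min = (3*t_max + 2*skew) / 5.

def solve_boundary(t_max, skew):
    t_min = (3 * t_max + 2 * skew) / 5     # Eq. (10)
    lo = 3 * t_max - t_min + skew          # left bound of Eq. (9)
    hi = 4 * t_min - skew                  # right bound of Eq. (9)
    return t_min, lo, hi

t_min, lo, hi = solve_boundary(4.5, 2.0)
print(t_min)       # 3.5 nsec, so the usable ratio 4.5/3.5 is about 1.28
print(lo, hi)      # 12.0 12.0 -- hence We = 12 nsec
```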
Data Race and Latching - Applied Considerations
While the theoretical treatment in the preceding section produced a solution, it is not too surprising that in practice the considerations of T̲ and We are not straightforward. This follows from the fact that usually an objective of an advanced circuit developmental effort is to produce the fastest possible stable circuits. The actual value of T̲ is based on population sampling and more detailed considerations. For the circuits being discussed T̲ may be considered as 2 nsec when at least one base load is present along with a finite interconnection line and an average input transition time. The determination of We could probably be based on as many or more individual considerations as there are designers. One consideration that has some merit is that presented below.

Clock period = 1/f = 2(We + Se)   (11)
Relationship (11) has implications with respect to
clock systems implementation and systems usage.
For one thing it permits a two phase clock with zero overlap at the maximum frequency of interest, and thus would permit shifting between adjacent registers with no intermediate serial gating.* From Eq. (11) it is seen that We = 8 nsec for f = 50 mc and Se = 2 nsec. Some additional consideration should be given to establishing a reasonable maximum effective clock pulse shrinkage in the system. That We - Se is a reasonable minimum pulse width may be countered as follows.
(a) From Fig. 9 it is seen that if the pulse
starts at time t = 0 then the minimum
width pulse has to be We nominal or 8
nsec.
(b) If the pulse starts at time t = 0 + Se/2 or
midway in the skew interval it is seen that
the minimum width pulse is We - Se/2
or 7 nsec.
(c) For a hypothetical pulse width of We - Se the start surely has to occur at time t = 0 + Se. For a pulse to have been delayed originally by this amount the maximum excess line and load delay would have been encountered. Since line and load delay tends to slow both the leading and trailing edges of a pulse, it is suggested that any resulting pulse from t = 0 + Se is greater than We - Se by an appreciable amount.
Statements (a), (b), and (c) do not of course constitute a rigorous proof of the hypothesis that a pulse of width We - Se cannot occur. It is considered that sufficient grounds were established to invoke a minimum pulse restriction on the clock system of W = We - Se/2 without appreciably affecting clock system implementation.
Practical race and latch constraints may now be
stated for a realizable pipeline system with many
*This point was brought out in a private discussion with
J. Alton and F. Miller of Remington Rand Univac Corp.,
St. Paul, Minn., April 27, 1965.
factors considered. To eliminate the race hazard, relationship (7) is modified into the Eq. (12) race constraint

(T1 + T2 + T3 + T4)min ≥ We + Se   (12)

for a string of 4 gates in a race path. Equation (12) differs from (7) in that the minimum of the sum of 4 delays is used rather than a sum (4T̲) of 4 minimum delays. For We = 8 nsec and Se = 2 nsec, it follows that the effective average T, for 4 gates in a race path, must be equal to or greater than 2.5 nsec. Considering that T̲ = 2.0 nsec and taking into account the effects of line and load delay, the minimum average delay of 2.5 nsec is reasonable.
To insure correct register closure or data latching, relationship (8) is restated in Eq. (13), the data latching constraint

(T1 + T2 + T3)max - T̲ ≤ We - Se/2   (13)

for a closure path. It follows that the maximum average delay for 3 circuits in a register closure path is 3.0 nsec, based on previous assumptions. This average is easily achieved with a functional module.
If one is justified in assuming that the first circuit, the -G input gate, in the latch has negligible T̄ - T̲ for itself and immediate loading, then Eq. (6) can be modified to (14)

(T2 + T3)max ≤ We - Se/2   (14)

for a register closure path with negligible T̄ - T̲ for the input gate. Under Eq. (14) conditions the maximum average delay for the remaining two gates in the closure path is 3.5 nsec. This figure suggests the possibility of placing heavy logic loads close to the register. Also, the possibility of paralleling a latch with a load driving circuit tied to the wired node should be considered for load driving that in any way threatens reliable closure.
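The two applied constraints can be stated as simple predicates over measured path delays. The following is a minimal sketch using the We = 8 nsec and Se = 2 nsec figures from the text; the per-gate delay lists are hypothetical measurements, not values from the original.

```python
# Eq. (12): race safety -- the minimum total delay of the 4-gate race
# path must be at least We + Se.
# Eq. (13): latch safety -- the maximum total delay of the 3-circuit
# closure path, less the input-gate minimum, must fit in We - Se/2.

W_CLOCK, SKEW = 8.0, 2.0   # nsec

def race_safe(path_min_delays):
    return sum(path_min_delays) >= W_CLOCK + SKEW

def latch_safe(closure_max_delays, input_gate_min):
    return sum(closure_max_delays) - input_gate_min <= W_CLOCK - SKEW / 2

print(race_safe([2.5, 2.5, 2.5, 2.5]))    # True: the 2.5 nsec average suffices
print(race_safe([2.0, 2.0, 2.0, 2.0]))    # False: 8.0 < 10.0
print(latch_safe([3.0, 3.0, 3.0], 2.0))   # True: 9.0 - 2.0 <= 7.0
```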
Pipeline Clock Rate Equations*
Pipeline clock rate equations may be developed
directly from the timing diagram seen in Fig. 9. For
analytical purposes three regions exist. The region
widths W1, W2 and W3 are stated as follows.
Figure 9. Timing diagram.
Region I:

W1 = We + Se   (15)
Region II: This region is established as:

W2 = 4T̲ - W1 = 4T̲ - We - Se   (16)
Region III: This is the data arrival interval. That the width is not affected by inserted fixed delay may be seen if a dummy variable Z is used. The earliest possible data arrives at register 2, Fig. 8, after a delay of 4T̲ + Z. The longest arrival delay is Se + 4T̄ + Z.

W3 = (Se + 4T̄ + Z) - (4T̲ + Z) = 4(T̄ - T̲) + Se   (17)
To simplify matters the term Se in W3 will be added to W1 for frequency calculation purposes in:

f = 1/(W1 + W2 + W3) = 1/((We + 2Se) + (4T̲ - We - Se) + 4(T̄ - T̲))   (18)
A determination of operating frequency is of interest for three special cases.

Case 1. When 4T̲ - We - Se > 0 the terms in the denominator of Eq. (18) may be added directly. It is tacitly assumed that register closure requirements have been met by We and Se. From Eq. (18):

f = 1/(Se + 4T̄)   (19)
Case 2. When 4T̲ - We - Se = 0, it is seen that Eq. (18) becomes:

f = 1/((We + 2Se) + 4(T̄ - T̲))   (20)

At first glance Eq. (20) appears to be significantly different from (19). If 4T̲ - We - Se = 0 then T̲ = (We + Se)/4. If this T̲ is substituted back into Eq. (20):

f = 1/((We + 2Se) + 4T̄ - 4(We + Se)/4) = 1/(Se + 4T̄)

It is seen that T̲ here was a restatement of the data race consideration in Eq. (12).
Case 3. When 4T̲ - We - Se < 0 a race hazard exists and may be eliminated by adding a delay of some value k such that:

k ≥ We + Se - 4T̲   (21)
*The author acknowledges many fruitful discussions with
other investigators who were concerned with details of similar calculations. Notable among these were personnel at
RCA, Camden, N.J., and at IBM, Poughkeepsie, N.Y., and
Dr. H. H. Loomis of the University of California, Davis,
Calif.
It is recognized that adding a delay k increases W2 in Eq. (16). Since W2 = 4T̲ + k - W1 = 4T̲ + k - We - Se, as a result, the frequency becomes f = 1/(Se + 4T̄ + k). In order to recoup speed T̄ might be reduced by an amount k/4.
If T̄ = 4.5, T̲ = 2.5, We = 8, and Se = 2 nanoseconds in Eqs. (19) and (20), then f = 50 megacycles.
It is emphasized that the clock rate calculations herein presented are based exclusively on single phase operation. In reality the period 4T̄ + Se is valid for single phase clocking and We + 2Se + 4(T̄ - T̲) is theoretically realizable with some multiphase clock scheme. With the numbers given for 50 megacycle operation both yield the same frequency. If one could add sufficient delays to all fast nets such that the T̲ specification could be increased, and if the required phasing could be supplied, then Eq. (20) will predict the theoretical performance. At first glance multiphase operation from a speed standpoint appears to have merit; however, investigators* have ventured opinions that in the systems under discussion it may not be worth pursuing. Such opinions exist because: (1) precise minimum circuit delay above some natural set of limitations can be difficult to predict with confidence, (2) phase control requirements in the clock system may complicate the organization, and (3) design constraints become more restrictive in the inherently complicated nanosecond system. These
points will not be explored further, but the interested
reader is encouraged to pursue the matter.
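For single phase clocking the three cases collapse into one rule: pad the fast path by k (Eq. (21)) when needed, then take the period as Se + 4T̄ + k (Eq. (19)). The following is a minimal sketch with the text's numbers; the function name is illustrative.

```python
# Single phase pipeline clock frequency from Eqs. (19) and (21).
# When 4*t_min < We + Se, delay k = We + Se - 4*t_min must be added
# to the fast path, which lengthens the period by k.

def clock_freq_mc(t_max, t_min, w_clock, skew):
    k = max(0.0, w_clock + skew - 4 * t_min)   # Eq. (21) padding, if any
    period = skew + 4 * t_max + k              # nsec
    return 1000.0 / period                     # megacycles

print(clock_freq_mc(4.5, 2.5, 8.0, 2.0))   # 50.0 -- the text's design point
print(clock_freq_mc(4.5, 2.0, 8.0, 2.0))   # k = 2 slows the clock to about 45.5
```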
TIMING CHAIN IMPLEMENTATION
The role of timing chains in pipeline systems was
established in the discussion on pipeline operation.
Figures 1, 2, and 3 further establish the dependence
of pipeline operation upon timing chains in more
or less conventional systems. The following discussion will consider (a) the control delay cell implementation using current mode switches without
delay lines, (b) detailed aspects of time base implementation for highly synchronous systems, and
(c) some standard applications of control delays to
illustrate interesting capabilities.
The Control Delay Cell
For purposes of this discussion the control delay*
may be thought of as having three main functions:
*Private correspondence from Dr. H. H. Loomis of the
University of California, Davis, Calif.
(1) It receives a set pulse and at a predictable time
later produces output pulses of both polarities and
with fixed width and skew characteristics. (2) The
control delay cell furnishes fan-out to clock registers and also furnishes certain control outputs to
drive slaves for even greater fan-outs. (3) Finally,
the control delay performs all functions in such a
way that it may be gated by logic commands.
To enhance systems realizations it is advantageous
to construct a basic control delay cell (Fig. 10)
from standard logic circuits without the benefit of fixed delay lines.

Figure 10. Three control delays in a timing chain.

Control delays may be better understood by making reference to the timing in Fig. 11.
Operation consists of feeding the V circuit with pulses on line 1 from the time base.

Figure 11. Control delay timing.

If line 2 is floating
(unused) and line 3 is at a low voltage level then
positive and negative output pulses appear on lines
7 and 10. For output pulses to have occurred the
latch output, X2, had to be a logical ONE. The line
7 positive output is used to furnish a + G latch reset
such that no more output pulses are produced unless
a -G latch set pulse is received from a previous control delay. Note that no conflict need exist for simultaneous -G and +G pulses due to the DON'T CARE property discussed in the section on registers. Since the +G path is made short, the -G path is always longer, and it takes precedence over a possible +G.

*The term control delay in this context is believed to have been originated by Remington Rand Univac engineers, some time prior to 1960. See also reference 14.
Circuit U, the AND (-) gate, performs two functions: (1) It serves as a buffer delay such that a +G cannot race around and literally reduce its own output width by prematurely causing the V circuit to be inhibited during the negative portion of the
time base. Also, the delay is provided such that a
-G input cannot race ahead and gate out an earlier
portion of the time base than is desired. Note that
the line 3 waveform shows that both of these conditions were met; moreover, the output of U could
have been delayed 8 nanoseconds and yet have met
these conditions. This brings us to the second function of the U gate. (2) The input line 4 may be held
high to inhibit the control delay output indefinitely.
Some examples of this usage will be described below.
Also due to the 8 nanosecond possible delay of line
3, or the U gate output, up to approximately 40
inches of line could exist between adjacent control
delays with no serious consequences. Since the control
delay effectively gates out a known portion of the
time base, the negative pulse portion, the skew
throughout a system remains under control. Total
skew consists of that prevalent in the time base and
the V gate with the output line and load differences.
If the output loads per V circuit or slave are held
within 1 to 4 registers with no more than 4 inches
of effective line difference between any 2 then total
skew appears to be approximately 2 nanoseconds
when referenced at register inputs throughout a
large system. This was calculated for a wide range
of load placement conditions and is shown in Fig. 11,
lines 7 and 10 waveforms. It should be mentioned
that this representation of pulses should not be interpreted as suggesting very narrow possible output
pulses. Rather, what is shown are the pulse extremes
for the shortest lightest loaded path and the longest
heavily loaded path. Pulse shrinkage or stretching
would be held to no more than 1 nanosecond.
A wide range of useful functions are possible when
slaves are attached to control delays. By way of
explanation a slave is considered to be another V
circuit tied to line 3 and the time base. For one thing,
more slaves can be added for load driving without
affecting skew. Another interesting possibility arises
if one desires to inhibit the -G command otherwise
proceeding from a previous control delay. Simply
DOT-ANDING (-) to line 6 could interfere with
the load driving of the previous control delay. If a
slave drove line 6 then no such problem need exist.
Lastly, since a slave may itself be gated it is possible to inhibit control-delay slave outputs and yet furnish a +G to clear the control delay, without driving
registers in the process.
Time Base And Pulse Forming
In order to provide a basis for discussion the
conceptual time base distribution scheme of Fig. 12
will be considered.

Figure 12. Time base distribution.

Two problems in time base distribution are holding skew within acceptable limits
and providing sufficient drive capability at many
remote points in space. One solution is to drive a
number of cables of uniform electrical length from
a common active or passive, frequency insensitive,
power divider or power divider-multiplier. In one
case that was implemented it was possible to use a
commercial power amplifier, a resistive divider,
and tunnel diode clippers to furnish square waves
with less than 1 nanosecond rise-fall times into approximately 100 50-ohm cables with approximately
0.2 nanoseconds total skew. In the 50,000 gate system layout of Fig. 12 the cables would terminate in
amplifiers that would shape pulses and provide
drive over some local sector within acceptable skew
limits. If skew in each local sector was within limits of 0 and S nanoseconds then it can be seen that
any two points over the total 68" X 51" area would
differ by no more than S nanoseconds of time skew,
provided cables are of equal length and divider
skew is negligible.
Figure 13 shows two methods of producing fixed
width time-base pulses by use of delay lines. In
method (1) the time-base square wave is brought
into an amplifier that furnishes normal and inverted
outputs as shown. By delaying one path for d nanoseconds, the V circuit can be made to produce d
nanosecond pulses as shown in the waveforms, pin 10.

Figure 13. Pulse forming techniques.

The V circuit is actually part of the previously
discussed control delay cell and the delay d is a
fixed delay placed in the system. Method (2) effectively doubles the number of time-base input lines,
but permits delay d to become differences in these
incoming lines and thus requires no fixed delays in
the logic portion of the system. Both methods have
the desirable characteristic of independence between pulse width and frequency up to some maximum frequency and maximum pulse width. Below
these maximums changing one would not appreciably affect the other. A third method is to design
special pulse circuits and drivers which would probably operate at the sector level. In general a capability for adjustments in the suggested areas is necessitated by anticipated improvements in circuit performance, unpredictable factors in final circuit
loading and placement, and design policies that
could prove to be either overly optimistic or conservative.
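Method (1) amounts to ANDing the time-base square wave with an inverted copy of itself delayed by d nanoseconds: the product is high only for the first d nanoseconds of each positive half-cycle, so pulse width follows d independently of frequency. The following is a minimal discrete-time sketch at 1 nsec resolution; all names and the idle-high assumption for the delayed line are illustrative.

```python
# Method (1) of Fig. 13: fixed-width pulses formed by ANDing a square
# wave with its own inverted, d-nsec-delayed copy.

def square_wave(period, length):
    """1 for the first half of each period, else 0 (1-nsec steps)."""
    return [1 if (t % period) < period // 2 else 0 for t in range(length)]

def formed_pulses(wave, d):
    # inverted copy delayed by d steps; the line is assumed to idle high
    delayed_inv = [1] * d + [1 - v for v in wave[:-d]]
    return [a & b for a, b in zip(wave, delayed_inv)]

wave = square_wave(period=20, length=60)   # three cycles of a 50-mc time base
pulses = formed_pulses(wave, d=4)
print(sum(pulses))      # 12 -- three pulses, each 4 nsec wide
```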
Special Capabilities Realizable With Control Delays
Several useful capabilities in systems are made possible by the feedback and gating features offered by control delay cells. In the simple case a control delay output (Fig. 10, line 10) could feed back into the input (-G or line 6). Once such a control delay is set it will run at the clock rate and furnish output pulses until the feedback is interrupted. The last control delay output may be delayed indefinitely by holding line 4 positive, or, by clearing the latch, a subsequent output could be prevented. By appropriately selecting feedback points from successively farther down a control delay timing chain it is possible to produce pulse trains with apparent frequencies of 1, 1/2, 1/3, 1/4, . . . 1/n times the basic clock rate.
A wide range of phase relations are possible by selecting desired outputs. A more practical control
delay capability is seen in Fig. 14. The application of
502
PROCEEDINGS -
SINGLE
FALL JOINT COMPUTER CONFERENCE,
~SHOT - - - - - - '
PULSE S O U R C E - u - L f L I
CD,
C02
-----~u
-------.u
------~-.u
u
u
u
u
1965
Figure 14. Control delay feedback and gating.

a logical ONE (-), or negative voltage level, to either line 11, 12, or 13 and a subsequent pulse from a single-shot pulser (see Fig. 15) will cause the chain to operate respectively at 1, 1/2, or 1/3 the clock rate. The waveforms at the top of Fig. 14 would result from using input 13. The gating used in this example is AND-OR, and operation can best be explained by use of the timing diagram of Fig. 11. Assume that operation is in progress and one wishes to examine the means by which CD1 is enabled to pulse simultaneously with CD4. Assume from Fig. 11 that CD3 is furnishing output pulses from time-base pulse n. The CD3 negative pulse serves as the -G to CD4. Line 5, the latch output of CD4, assumes the negative level at the time shown. This negative level ANDs (-) with line 13 and propagates through to establish a ONE (-) on line 4 of CD1. Figure 10 shows that line 4 and the latch output (X2) of CD1 are ANDed (-) such that gate V will pass the next negative pulse furnished by the time base. This is pulse n + 1, and outputs from CD1 and CD4 result in time sync. From Fig. 11 the AND (-), line 3, output could have been delayed up to 8 nanoseconds from points m and p. It is assumed that the gating delay was within this interval. Had more than 8 nanoseconds been required, then the line 8 output (X1), which occurs 1T earlier, would have been used, as is illustrated in the next example.

Frequently it is necessary to design a large, synchronous, and parallel system such that in the process of high-speed operation it is possible to stop the system and to operate in a manual step mode. Figure 16 shows one method of incorporating the step mode into a large system.

Figure 15. Single shot implementation.

Figure 16. Step mode implementation (timing chain data flow).

The single-shot output sets a series of control delays that are in turn fanned out to other control delays throughout the system.
The last level of control delays functions normally
unless the line 4 inputs are held in the ZERO ( - )
state (positive voltage). The control delays receiving
set or -G signals from the single-shot furnish
ONES ( -) at the line 4 inputs for one pulse interval.
Since the control delays in the timing chains contain storage the system is enabled to remain dormant
for many time cycles and still function correctly
for one cycle on a single-shot command.
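The frequency-division behavior obtained by feeding a control-delay output back into the head of the chain can be sketched with a minimal discrete-time model. The stage count and one-stage-per-tick semantics below are modeling assumptions for illustration, not the actual circuit of Fig. 14:

```python
# Toy discrete-time model of a control-delay timing chain with feedback.
# One pulse circulates; each time-base tick advances it one stage, and the
# output of stage k feeds back into stage 1, so stage 1 fires at 1/k of
# the clock rate. (Illustrative model only, not the circuit of Fig. 14.)

def stage1_firings(k, n_ticks):
    """Return the tick numbers at which stage 1 of a k-stage ring fires."""
    firings = []
    stage = 1                                    # pulse starts at stage 1
    for tick in range(n_ticks):
        if stage == 1:
            firings.append(tick)
        stage = stage + 1 if stage < k else 1    # feedback from stage k
    return firings

# Feedback from stage 3 (the "input 13" case) gives a 1/3-rate pulse train:
print(stage1_firings(3, 10))    # -> [0, 3, 6, 9]
# Feedback from stage 1 runs at the full clock rate:
print(stage1_firings(1, 4))     # -> [0, 1, 2, 3]
```

Selecting a different feedback stage k reproduces the 1, 1/2, 1/3, . . . , 1/n apparent frequencies described in the text.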
The three levels of fan-out circuits in Fig. 16 may consume up to approximately 12 nanoseconds. Since the line 4 input to a control delay may be delayed 8 nanoseconds from the time the latch output (X2) occurs, and since the complemented latch output (X1) occurs 1T earlier, a total of approximately 12 nanoseconds of delay is possible if two provisions are made. First, X1 must go through a chain containing an odd number of inverters to obtain the logical ONE (-) needed at the line 4 input; and second, the -G signal must not have been appreciably delayed in arriving from the single-shot. If still more time is necessary, because of heavy loading or line length, then it would probably be necessary to gate the control-delay V circuit.
A better appreciation of step mode operation is
possible by considering the events of Fig. 3. If the pipeline had been stopped after time tn+m, then a step command from the single-shot would cause the events associated with time period tn+m+1 to occur, but at a much later time. After running for one time cycle the pipeline would again be stopped.
CIRCUIT IMPLEMENTATION OF HIGH-SPEED PIPELINE SYSTEMS
PIPELINES IN SYSTEMS
Pipelines in systems are tailored to the particular
data processing jobs at hand. Both system organization and subsequent implementation are oriented
toward doing useful processing at an optimal rate.
As information is sequenced through a pipeline
process, the register widths, or numbers of parallel
registers, may be increased or decreased according
to requirements. Information may be extracted from
and injected into the process at points along the
way. If an arithmetic operation requires more time
than gating between registers permits, then an attempt might be made to partition that operation
over more than one pipeline section. Also, section
averaging, where one section expands in time at the
expense of its neighbors, is not ruled out when appropriate delays are available to adjust the timing.
Pulse widths are altered by standard techniques as
needs arise. Many forms of feedback may be incorporated into the structure, but attempts will usually
be made to eliminate or minimize any slowdown in
the output rate. Lastly, the physical layout for high
speed (50 mc) systems will be done so as to obtain
as many packaging advantages as possible with
structures of the pipeline variety.
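The register-to-register sequencing described above can be sketched in a few lines. The three stage functions, the 8-bit byte width, and the input values below are invented for illustration; they stand in for whatever operations a real pipeline section would perform:

```python
# Minimal model of pipeline sequencing: on each clock, every section's
# output register captures a function of the preceding register's
# contents, so one result emerges per clock once the pipeline is full.

def clock_pipeline(registers, stage_fns, new_input):
    """Advance the pipeline one clock period."""
    inputs = [new_input] + registers[:-1]   # each stage sees its predecessor
    return [fn(x) for fn, x in zip(stage_fns, inputs)]

stages = [
    lambda b: ~b & 0xFF,          # section 1: complement
    lambda b: (b << 1) & 0xFF,    # section 2: shift
    lambda b: (b + 1) & 0xFF,     # section 3: increment
]
regs = [0, 0, 0]
for byte in [0x0F, 0x10, 0x11]:   # a new byte enters every clock
    regs = clock_pipeline(regs, stages, byte)

# After three clocks the first byte has traversed all three sections:
print(hex(regs[2]))               # -> 0xe1, i.e. ((~0x0F << 1) + 1) & 0xFF
```

Partitioning a slow operation over two sections corresponds to splitting one of these stage functions in half, exactly as suggested in the text.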
CONCLUSION

Pipeline systems, operation, and specialized implementation areas have been described. Several example solutions to special areas of implementation were presented, along with some design considerations to permit an understanding. It is hoped that solutions to common problems may be considered as design tools so that designers do not have to reinvent the wheel, so to speak, for each new system. A majority of the ideas contained herein are quite well established; however, new technology and the compressed time frame have required special treatment. It was no mere coincidence that delay lines were bypassed in favor of circuit solutions in many instances. Integrated circuits are approaching the speed requirements, and integrated subsystems are a distinct possibility. Numerical solutions to design considerations were oriented toward the 50-mc systems that will be realized over the next 1 to 3 years. As time and technology advance and more complex structures of economical, densely packaged logic are realizable, the possibility may exist that long strings of interacting pipeline logic structures have something to offer in more sophisticated systems of the future.

ACKNOWLEDGMENT

The author is grateful for the encouragement, help, and many valuable suggestions received from his colleagues in the Department of Defense.

APPENDIX: GLOSSARY OF SYMBOLS

BYTE: N-bit word used for processing.
CD: Control delay.
DOT: A wired-gate decision.
f: Logic sense-function. Timing sense-frequency.
-G: Negative gating pulse.
+G: Positive gating pulse.
k-1: Number of circuits between registers.
(-), (+): Negative voltage = ONE; positive voltage = ONE.
nsec: Nanosecond, 10^-9 sec.
pf: Picofarad, 10^-12 farad.
r: Number of bits in a register.
Right triangle (small symbol in logic drawings): Negative voltage = ONE.
P12: Measure of correlation.
SS: Single shot, or one shot.
Sc: Clock skew measured at register inputs.
Slave: See "The Control Delay Cell" above.
t: Time.
T: Nominal circuit, line, and loading delay.
T (overbar): Maximum circuit, line, and loading delay, unless otherwise specified.
T (underbar): Minimum circuit, line, and loading delay, unless otherwise specified.
Wc: Original clock pulse width. Also, nominal final clock pulse width.
X (overbar): Logic sense-complement of the logic variable X. Timing sense-maximum value of the time variable X.
X (underbar): Minimum value of the time variable X.
X1: Latch output available in 1T.
X2: Latch output available in 2T.
REFERENCES
1. M. J. Flynn, "Engineering Aspects of Large, High-Speed Computer Design; Part I - Hardware Aspects," presented at the Office of Naval Research Symposium on High-Speed Computer Hardware, Washington, D.C., Nov. 17-18, 1964.
2. G. M. Amdahl, "The Model 92 as a Member of the System/360 Family," AFIPS Conference Proceedings, vol. 26, part II, 1964 Fall Joint Computer Conference, pp. 69-72.
3. E. M. Davis, "The Case for Hybrid Circuits," 1965 International Solid State Circuits Conference (ISSCC) Digest of Technical Papers, pp. 32-33, Feb. 1965.
4. Harding and Schwartz, "An Approach to Low Cost, High Performance Microelectronics," presented at the Western Electronics Show and Convention, Aug. 22, 1963.
5. W. Buchholz, Planning a Computer System - Project Stretch, McGraw-Hill Book Co., New York, 1962, p. 204.
6. Pyle, Chavannes and MacIntyre, "A 10 Mc NDRO Biax Memory of 1024 Word, 48 Bit per Word Capacity," AFIPS Conference Proceedings, vol. 26, 1964 Fall Joint Computer Conference, pp. 69-80.
7. H. D. Toombs and R. F. Abraham, "A Large High Speed Magnetic Film Memory System," presented at the symposium Les Techniques des Memoires, Colloque International, Paris, Apr. 5-10, 1965.
8. Crawford et al., "Design Considerations for a 25 Nsec Tunnel Diode Memory," 1965 Fall Joint Computer Conference.
9. Maley and Earle, The Logic Design of Transistor Digital Computers, Prentice-Hall, Englewood Cliffs, N.J., 1963, chap. 11.
10. J. E. Freund, Mathematical Statistics, Prentice-Hall, Englewood Cliffs, N.J., 1964, p. 176.
11. J. R. Turnbull, "Some Aspects of Digital Circuit Design," ISSCC Digest of Technical Papers, pp. 58-59, Feb. 1963.
12. Narud, Seelbach and Miller, "Relative Merits of Current Mode Logic Implementation," ibid., pp. 104-105.
13. H. S. Yourke, "Millimicrosecond Transistor Current-Switching Circuits," IRE Transactions on Circuit Theory, Sept. 1957.
14. CADY CARDS (manual of CADY logic circuit family), Department of Defense, Mar. 1963, pp. 35-41.
HIGH-SPEED LOGIC CIRCUIT CONSIDERATIONS
w. H. Howe
General Electric Computer Department
Phoenix, Arizona
INTRODUCTION

This discussion is confined to circuits operating at switching speeds sufficiently fast to require the use of terminated transmission lines for all logic interconnections other than to an adjacent device. The discussion is further confined to significant factors affecting circuit decisions in a high-volume commercial/industrial environment. Laboratory curiosities operating at absolute maximum speeds are not considered, in view of the extremely distorted economics associated with experimental technologies. The factors under discussion are technology considerations, economic considerations, logic arrays, power dissipation, and packaging media constraints. The discussion is not intended to be a gross prediction of future practice, but rather a snapshot of today's design considerations imposed by present technology and Mother Nature's rather rigid philosophy concerning the speed of light.

Since the transmission time through the interconnecting media is significant when compared to the propagation delay time of the logic device, the physical size of the system has some bearing on the definition of high speed. This discussion is concerned with relatively large organizations, such that a propagation delay time of 2 to 5 nanoseconds may be considered high speed. More dramatic speed improvements may come with machine organizations which consume large amounts of circuits. These organizations are now becoming feasible due to increased reliability and the availability of low-cost devices through the semiconductor industry.
TECHNOLOGY
The selection of a technology for circuit fabrication is heavily dependent upon timing, anticipated volume, cost, and required performance. The following sequence of events usually occurs in the development of a circuit family.
Device Availability. Either as a result of a specific
development contract or in the normal course of
funded research and development programs, an improvement in mask technology, process sequencing,
etc., permits a higher speed device to be fabricated
on an experimental basis. Historically, the device
has been first implemented as a transistor.
Circuit Design. The new device is exploited by the
user, as well as the manufacturer, to produce a desirable logic circuit. These circuits are generally
different since the user is not aware of all the process constraints nor the economic tradeoffs required
for eventual success in the market place as a microcircuit.
User Selects Technology. If the user must exploit
his faster circuit as soon as possible, he may choose
immediate circuit implementation through the use
of one of the hybrid circuit technologies. The penalty for quick turn-around is higher cost in volume.
Meanwhile, the semiconductor manufacturer has
begun development of a silicon integrated circuit.
Manufacturer Announces Silicon Integrated Circuit.
The silicon circuit version may prove to be as fast,
or faster than the hybrid equivalent since he retains
control of all process optimization. The circuit development path is somewhat longer, but is being
reduced by the growing tendency on the part of
manufacturers not to disclose advanced technologies
until the silicon integrated circuit is well on its way
to market.
Let us, therefore, discuss technology in the light
of silicon integrated circuits. Silicon integrated circuit technology has been frequently described as a
cure-all for cost. Once the circuits are in use, volume increases, causing costs to go down, increasing
the volume, etc., until the cost extrapolation goes
through zero. Some of these effects may be observed
in today's market where circuit costs are near one
dollar even at modest volumes. At one of the recent
conferences, several authors lamented the fact that
the longed-for impact of integrated circuits on the
computer business simply hadn't happened. A more
meaningful statement may well be that the impact of the computer business on integrated circuits has not yet happened. The economic success of
these devices is highly volume-dependent; so much so,
in fact, that, to date, commercial computers are the
only logical market place for the latent high volume
all manufacturers may readily achieve. This relationship has caused a heavy emphasis on logic circuit and logic array development to reduce cost to
the fullest extent. Extensive research efforts have
been initiated to search for logic arrays which are
highly efficient, low in cost, and high in speed to
satisfy the needs of the computer industry. However, the tradeoffs in speed, cost, logic complexity,
and technology are inherent to the design of systems and are not separable in spite of the good intentions of the semiconductor manufacturers or the
abstract logicians. We would like to point out briefly some of the tradeoffs available. One of the most
interesting and significant paradoxes of the new
technology is the apparent reconciliation of a desire
to achieve high speed and low cost. The parameters
which yield high speed, i.e., low parasitics, small
device geometry, also yield lowest ultimate production cost in silicon integrated circuits. Past circuit
design practice has equated high speed with high
cost. The first step in the assessment of circuit constraints is a careful analysis of the technology used
to make circuits and the latent cost significance of
the variables.
The following example has been normalized to prevent easy identification of a given semiconductor
device. The analysis technique was developed by
Mr. W. D. Turner of General Electric and will be
explored in more detail in a forthcoming paper.
Two characteristic fabrication processes were analyzed and they may be generally described as diode
isolation and oxide isolation. Mask technology has
a critical effect on cost as will be demonstrated.
Table 1. Circuit Economics.

                                  A        B        C        D
Isolation                       Oxide    Diode    Diode    Diode
Tpd                             4 ns     3 ns     2 ns     5 ns
Circuits/chip                   1        1        2        2
Critical area ratio*            0.296    0.460    0.256    0.40
Relative yield                  3.38     2.17     3.90     2.50
Normalized yield                0.87     0.56     1.00     0.64
Circuits per wafer              160      218      766      316
  X normalized yield            139      122      766      202
Wafer cost                      1.00     1.00     1.00     1.00
Oxide isolation premium         0.25     -        -        -
Relative chip cost X 10^-3      9        8.2      1.3      5.0

*Critical area ratio is the result of obtaining the chip area of a given circuit and dividing by the area which is critical to yield. Areas which are critical are those where an oxide fault or misregistration may result in a circuit failure.

Assuming reasonable cost levels for the cost of a wafer and also assuming a reasonable value for absolute yield, one may compute the cost of a chip:

Wafer cost ~ $50
Yield ~ 0.5

Chip cost = (Wafer cost X relative chip cost) / Yield

Chip cost:    A $0.90    B $0.82    C $0.13    D $0.50
Assembly, test and package costs must be estimated to complete the comparison, but for the purpose of this paper, assume a constant $0.30 for the sum total of these factors.

Total cost per chip:    A $1.20    B $1.12    C $0.43    D $0.80

Cost per circuit (C and D contain 2 circuits per chip):

    A $1.20    B $1.12    C $0.22    D $0.40
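The arithmetic behind these figures can be reproduced in a short calculation. The $50 wafer, 0.5 absolute yield, and $0.30 assembly charge are the paper's stated assumptions; the function name and table encoding are ours, and small rounding differences against the published values (notably process D) are expected:

```python
# Reconstruction of the Table 1 cost arithmetic: chip cost is wafer cost
# times relative chip cost divided by absolute yield, where relative chip
# cost is (1 + oxide premium) / (circuits per wafer x normalized yield).

def chip_cost(circuits_per_wafer, norm_yield, premium=0.0,
              wafer_cost=50.0, abs_yield=0.5):
    relative = (1.0 + premium) / (circuits_per_wafer * norm_yield)
    return wafer_cost * relative / abs_yield

# process: (circuits per wafer, normalized yield, oxide premium, circuits/chip)
processes = {
    "A": (160, 0.87, 0.25, 1),
    "B": (218, 0.56, 0.00, 1),
    "C": (766, 1.00, 0.00, 2),
    "D": (316, 0.64, 0.00, 2),
}
for name, (cpw, ny, prem, per_chip) in processes.items():
    chip = chip_cost(cpw, ny, prem)
    total = chip + 0.30            # add assembly, test, and package
    print(f"{name}: chip ${chip:.2f}, total ${total:.2f}, "
          f"per circuit ${total / per_chip:.2f}")
```

Running this recovers the ranking in the text: the smallest, fastest device (process C) carries the least latent cost per circuit.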
These basic costs are marked up in accordance
with the profit motive to produce quoted selling
price. It is possible, of course, that a comparison of
selling price may result in an inversion or distortion
of the rank of the products. It is equally possible
that a distortion of the basic economics of the process as a result of the battle in the market place
may cause slipped schedules, price renegotiations
and poor quality parts. Selling price alone is not an
adequate parameter.
As a final comment, the smallest device, which
was also the fastest, had the least latent cost.
Now it must be recognized that a study of this nature has certain inaccuracies, but it is important
that these studies be made to ascertain the inherent
speed/cost relationship which may be entirely different from the quoted costs received from semiconductor marketing organizations.
LOGIC ARRAYS
One factor of growing significance, as circuit size
is reduced, is the increasing amount of surface area
consumed by areas devoted to interconnections and
pads for interconnections. There have been marginal improvements over the past few years, but no
startling improvements have been made in comparison to reductions in the basic device geometry. As
has been pointed out in many recent papers, the
consumption of real estate may be reduced by interconnecting the logic circuits with the narrow lines
allowed by the masking technology, thus reducing
to a minimum the area requirements for external
lead pads. At this point, the semiconductor manufacturer relaxes and says in effect to the computer
designer: Reduce your logic to a few standard con-
figurations, and I will reduce costs by a large factor. Hence, we have a search for magic standard
logic functions. Other approaches, such as varying the final step of the masking process to provide special logic connections over a matrix of logic circuits, have been proposed. This has resulted in difficult layout requirements and a challenging problem of computing optimum ratios for connecting leads, logic partitioning problems, and so forth. Other methods are variations on providing a circuit/logic matrix where bad elements are disrupted and the logic restructured through adjoining elements, usually at the expense of speed. All of these approaches seem to
neglect the basic overriding economic significance
of device geometry.
Figure 1 illustrates the normalized economy of
chip size vs array complexity. As can be seen, the
smaller the chip, the larger the array which may
economically be placed on the chip. With a given
mask technology, most economy is achieved by having a high number of chips per wafer which will set
definite limits on logic complexity per chip. Most
manufacturers are concerned with having a given
chip, good or bad, rather than going through complex "rescue" operations involving additional processing even though the metallic interconnection
step is relatively inexpensive. The point here is that
small size permits high speed with lower costs and
that logic arrays are apt to be most effective when
they are small, thus producing chip sizes amenable
to the economics of a given mask technology. The
largest stumbling block then is the logic configuration itself.
Unfortunately, we haven't achieved either a magic logic function or a magic insight into the solution of the problem of finding a relatively small set
of standard functions. However, we have carefully
analyzed a number of products and have classified
the logic groupings obtained in Table 2.
Table 2. Gate Efficiency.

Group                 Gates per package    Efficiency    Loss
1 (non-functional)    1-3                  100%          -
2 (functional)        4-8                  96            4%
3 (functional)        46-71                87            13
4 (functional)        106                  84            16
Comparing these efficiency figures with the cost economy ratios exhibited in Fig. 1 reveals that array technology is economically attractive and warranted.
Figure 1. Normalized economy of chip size vs. array complexity.
ic resulting in Vf1 being negative.
At a time T later, Vin has moved to X2, inducing corresponding pulses Vb2 and Vf2. Simultaneously, Vf1 has also propagated to X2. Therefore, Vf1 and Vf2 exist at X2 at the same instant of time and add directly. This is a continuing process (at an incremental level) until Vin arrives and is absorbed at terminal r. Therefore, the induced voltage seen at terminal f will be a direct function of the length of the line. Hence, Vf is not only proportional to the derivative of the driving function Vin, it is also proportional to the length of line over which the interaction exists.

Now the incremental backcross pulse will be examined. Referring again to Fig. 2, note that during the time interval T, when Vin moved from X1 to X2 in the direction of r, Vb1 traversed an identical distance toward b. At any instant of time Vb1 and Vb2 are 2T seconds apart. Consequently a continuing series of pulses, each one delayed by a differential time, will arrive at b. The last pulse will arrive 2Td seconds after the time of initial application of Vin. Therefore, the resultant backward pulse at b will start to decay at 2Td seconds. This then defines the width of the backward pulse. Produced by a positive-going wavefront, it is 2Td seconds wide, or twice the
CROSSTALK AND REFLECTIONS IN HIGH-SPEED DIGITAL SYSTEMS
(Figure 2: a driver line carrying Vin(t), terminated at r and b, is coupled over a region of delay Td to a sense or pickup line, on which the pulses Vb1, Vb2, Vf1, and Vf2 appear at positions X1 and X2.)
Figure 2. Forward and backward traveling waves.
propagation time of the coupled section of the lines. The resultant amplitude of Vb will be independent of time provided 2Td is greater than the rise time of Vin, T1. The backward pulse is an attenuated replica of Vin for t less than 2Td. This can be demonstrated, for the driving signal of Fig. 2, by summing several rectangular pulses T1 seconds wide that are incrementally displaced along the time axis.

It should be emphasized that the amplitude of Vb is independent of the slope of the input signal (for T1 < 2Td) and the length of the line, while Vf is a direct function of these parameters. Thus, for the case that T1 < 2Td the forward and backward induced pulses on the sense line have decidedly different characteristics and sensitivities. By altering the geometrical cross section and permittivity of the dielectric (i.e., its nonhomogeneous characteristics) it is possible to make Vf positive or negative. As was pointed out, in the homogeneous case, Vf will be zero. However Vb will always be of the same polarity as Vin. Again, this assumes the lines are terminated in their characteristic impedance.
Crosstalk Equations

The general expression for the instantaneous voltage induced anywhere on a line which is coupled to another line driven by a signal Vin(t) is given by Eq. (1):

V(x,t) = Kf x (d/dt)[Vin(t - Td x/l)]
         + Kb [Vin(t - Td x/l) - Vin(t - 2Td + Td x/l)]    (1)

where
Kf = forward crosstalk constant = -(1/2)(Lm/Zo - Cm Zo),
Kb = back crosstalk constant = (1/(4Td))(Lm/Zo + Cm Zo),
l = physical length of the coupled region, and
Td = time for a signal to propagate the distance l.

The complete derivation of Eq. (1) is given in the Appendix. As indicated in Fig. 3, V(0,t) is the back crosstalk voltage while V(l,t) is the forward crosstalk voltage. The two crosstalk constants are functions of the geometry and materials of the coupled lines. Each of these constants can be determined by a single measurement. With the determination of these constants the crosstalk voltage at any point on the coupled sense line can be evaluated by Eq. (1) for any line length and for any amplitude pulse with a linear rise time. The forward and back crosstalk will now be more closely examined.

Back Crosstalk

At x = 0, the beginning of the coupled line, the back crosstalk voltage is

V(0,t) = Kb [Vo(t) - Vo(t - 2Td)]    (2)
(2)
To better understand the relationship of the back
crosstalk pulse waveshape to the driving signal waveform let V 0( t) be the function shown in Fig. 4, that
is,
514
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
DRIVER LINE Ro
Volt )
-I
1.
V,
Ro
1965
Substituti~g Eq. (3) into Eq. (2), the back crosstalk voltage is
V(o, t)=Kb [/o(t)-/o(t-T1 )-/o(t-2Td)+/o(t-Tl-2Td)]
= V(/II
(4)
f
SENSE
Ro
Figure 3. Coupled transmission line.
Va ( t)
t ---+
Figure 4. Input driving function.
Vo(t} - /o(t-T 1}
where /o{t} =
(3)
Vk
-;y;
t.
/
fo(t)
Long Line Case. Assume initially that the rise time T1 is less than twice the propagation time of the region of interaction. This situation defines a long line. For this case the pulse waveform described by Eq. (4) is as indicated in Fig. 5. Note that the back crosstalk voltage is directly proportional to the input driving voltage Vo(t). In addition, the maximum amplitude, KbVk, is independent of the coupled length. As discussed in the qualitative analysis, the backcross pulse width is equal to twice the propagation delay of the line.
Short Line Case. If the rise time of the input
driving function is greater than twice the propagation
time of the coupled lines, the situation is defined as
the "short line." For this case the back crosstalk
pulse, still described by Eq. (4), is graphically shown
in Fig. 6. The maximum amplitude occurs at t =
2Td and from Eq. (4), the maximum back crosstalk
voltage is

V(0,t)max = 2Kb (Td/T1) Vk = 2Kb Td (d/dt)(Vo(t))    (5)

Here, as opposed to the long line, the back crosstalk voltage is proportional to the slope of the driving function and the electrical length of the coupled line. Note that the maximum amplitude is decreased from that of the long line by the factor 2Td/T1. The duration of the backward pulse for the short line is T1, as indicated by Fig. 6.

Figure 5. Back crosstalk pulse in long line case, T1 < 2Td.
Forward Crosstalk

Now consider the crosstalk at the other end of the coupled line, x = l in Fig. 1. This forward crosstalk, in response to the input driving function, Vo(t), of Fig. 4, is

V(l,t) = Kf l [(d/dt) fo(t - Td) - (d/dt) fo(t - Td - T1)]    (6)
Figure 6. Back crosstalk pulse in short line case, T1 > 2Td.

Equation (6) yields a rectangular waveform:

V(l,t) = 0            for 0 < t < Td
       = Kf Vk l/T1   for Td < t < Td + T1
       = 0            for t > Td + T1

The amplitude of the forward crosspulse, V(l,t), is proportional to the slope (Vk/T1) of the driving signal and the length of the coupled line. The pulse width is equal to the rise time of the driving signal.

The polarity of the forward crosstalk signal depends on the sign of the forward crosstalk constant Kf. If Lm > Cm Zo^2 the forward crosstalk pulse will have a polarity opposite to that of the driving signal. If Lm < Cm Zo^2 it will have the same polarity. Only in the case of a homogeneous medium of propagation does Lm = Cm Zo^2.

Figure 7. Cross section of microstrip transmission line. Lines and ground plane are approximately 0.003-inch thick.

Experimental Waveforms. An example of crosstalk in a nonhomogeneous medium can be demonstrated with printed microstrip transmission lines. Measurements were made on two sets of lines with the cross-sectional geometry shown in Fig. 7. The setup is as shown in Fig. 3, with the line length, l, equal to
16 inches. The waveforms are shown in Fig. 8 for the lines spaced 0.120 inch apart and in Fig. 9 for a 0.015-inch spacing.
On both Fig. 8 and Fig. 9 the forward crosstalk
pulse is of the opposite polarity to the driving signal.
This occurs because the ratio of self to mutual
capacitance is greater than in the comparable
homogeneous case. As previously described, if the
dielectric medium was homogeneous, the forward
crosspulse would be zero.
Figure 8. Crosstalk waveforms of Fig. 7: S = 0.12 inch, l = 16 inches (both traces 50 mV/div; time, 1 ns/div; Kb = 0.014, Kf = -0.029 ns/ft).
From the waveforms of Figs. 8 and 9 the crosstalk constants Kb and Kf can be determined by use of Eq. (1). In determining these constants it is better to use the output voltage rather than the input voltage of the driven line. This is true because the output of the drive line and the crosspulses are subject to line losses and distortion. Thus, to obtain Kb, take the ratio of Vo to Vb; to obtain Kf, take the ratio of the maximum value of Vf to the maximum slope of Vout for a given line length. For the loose spacing of Fig. 8 the back crosstalk constant, Kb, is 0.014, while the forward crosstalk constant, Kf, is 0.029 nanoseconds per foot. With the tightly spaced lines of Fig. 9, Kb is 0.16 and Kf is 0.09 nanoseconds per foot. These constants can now be used to predict the crosstalk waveforms for an arbitrary length of coupled line (of the given cross-sectional geometry) and any driving signal waveform.
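That predictive use of Eq. (1) can be checked numerically. The 1-volt, 1-ns ramp below is an assumed test signal, not one of the measured waveforms, and the Fig. 8 constants are used with Kf entered as negative to match the observed opposite-polarity forward pulse:

```python
# Evaluate Eq. (1) for a linear-ramp drive on the 16-inch coupled
# microstrip of Fig. 8 (Kb = 0.014, |Kf| = 0.029 ns/ft). Illustrative
# numerical sketch; the drive amplitude and rise time are assumptions.

KB, KF = 0.014, -0.029    # Kf negative: forward pulse opposes the drive
LENGTH = 16.0 / 12.0      # coupled length in feet
TD = LENGTH / 0.5         # propagation delay in ns, at 0.5 ft/ns

def vin(t, vk=1.0, t1=1.0):
    """Ramp of Fig. 4: rises linearly to vk volts in t1 ns."""
    return vk * min(max(t / t1, 0.0), 1.0)

def v_crosstalk(x, t, dt=1e-4):
    """Eq. (1): Kf*x*dVin/dt (forward) + Kb*[Vin(t-d) - Vin(t-2Td+d)]."""
    d = TD * x / LENGTH                     # delay to position x
    dvdt = (vin(t - d + dt) - vin(t - d - dt)) / (2.0 * dt)
    back = vin(t - d) - vin(t - 2.0 * TD + d)
    return KF * x * dvdt + KB * back

# Long-line case (T1 = 1 ns < 2*Td): the back pulse at x = 0 peaks at
# Kb*Vk, and the forward pulse at x = LENGTH has amplitude Kf*LENGTH*(Vk/T1).
back_peak = max(v_crosstalk(0.0, 0.01 * n) for n in range(800))
fwd_peak = v_crosstalk(LENGTH, TD + 0.5)
print(round(back_peak, 3), round(fwd_peak, 3))   # -> 0.014 -0.039
```

The computed peaks reproduce the closed-form long-line results of Eqs. (4) and (6): the back peak is KbVk, and the back term vanishes identically at x = l.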
Figure 9. Crosstalk waveforms of Fig. 7: S = 0.015 inch, l = 16 inches (1 V/div and 0.2 V/div; time, 1 ns/div; Kb = 0.16, Kf = -0.09 ns/ft).

REFLECTIONS

Definition-Long and Short Lines

When a pulse propagates along a transmission line that is not terminated in its characteristic impedance, a signal of amplitude no greater than the original will be reflected toward the point of original incidence. Of special interest in high-speed digital processors is the question of when it is necessary to treat an interconnection path as a transmission line or simply as a lumped element. Although the exact criterion is a function of various parameters, a useful criterion can be evolved using the following development.

In the discussions to follow, the high-speed logic gate loads are represented by lumped capacity. Many of the very high-speed logic circuits have input transistors which are operated in the linear mode, resulting in input characteristics which are essentially capacitive. In Fig. 10a, Td is the time required for the signal to propagate the length of the line. If T1, the rise time of the input signal Ein, is less than 2Td, the waveform at the sending end of the line will be as shown in Fig. 10b. The reflection signal, due to the capacitor termination, will arrive at the sending end after the input signal has completed its transition. Those signal interconnecting lines that exhibit this property will be defined as "long lines" and will be treated as transmission lines.

Figure 10. a. Transmission line terminated with capacitor and resistor. b. Signal at point A for a long line (T1 < 2Td). c. Signal at point A for a short line (T1 > 2Td).
Conversely, if T1 is greater than 2Td the effect of the reflection is to degrade the rise time of the input signal, as demonstrated in Fig. 10c. Signal lines that demonstrate this property are defined as short lines. Such lines can be treated as lumped capacitances.

As will be demonstrated shortly, a given transmission line, loaded with randomly spaced logic gates, can exhibit the properties of both long and short lines. In these cases, the short and long line theories can be separately applied to ascertain the overall effect.
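The long/short distinction reduces to a one-line test. The helper below makes the rule explicit; the function name is ours, and the default velocity is the microstrip figure used later in the text:

```python
# Rule of thumb from the text: treat an interconnection as a transmission
# line ("long line") when the signal rise time is less than twice the
# one-way propagation delay; otherwise treat it as lumped capacitance.

def line_type(length_ft, rise_time_ns, velocity_ft_per_ns=0.5):
    td = length_ft / velocity_ft_per_ns     # one-way propagation delay, ns
    return "long" if rise_time_ns < 2.0 * td else "short"

print(line_type(2.0, 1.0))   # Td = 4 ns, 1-ns rise -> "long"
print(line_type(0.1, 1.0))   # Td = 0.2 ns -> "short"
```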
REFLECTION FORMULA
Although there are rigorous methods, using computer programs, for analyzing the waveforms on a transmission line loaded with capacitors, these problems can also be analyzed by a novel application of distributed line theory. This technique is an approximate method which yields accurate results for the heavily loaded line, which for the most part represents the limiting case insofar as reflections are concerned. Requiring only a few minutes of the designer's time, this method provides him with further insight into and understanding of the nature of the reflection phenomenon.
This method of analysis is best illustrated by an
example. Consider Fig. 11, where a line having a characteristic impedance of 74 ohms connects two 100-ohm lines. Assume that each 100-ohm line pair has a propagation time Td and the 74-ohm line pair has a propagation time T2. If a pulse with rise time T1 is applied as shown in Fig. 11, the waveform at A will be as shown in Fig. 12. Thus, using the standard definition for the reflection coefficient Γ yields
Figure 11. 74-ohm line driven and terminated by 100-ohm line.

Γ = (ZL − Z0)/(ZL + Z0) = (74 − 100)/(74 + 100) ≈ −0.15
Consequently the mismatch between the 100-ohm
and the 74-ohm transmission lines produces a 15
percent reflection. Note that the reflection at point
B is negative while that produced at point C is
positive.
Now consider the 100-ohm line alone. Assume that the signal propagation velocity is 0.5 feet per nanosecond, which is approximately the case for a microstrip line. The distributed inductance and capacitance per unit length of the line can be derived from the well-known expressions
Vp = 1/(L0C0)^1/2   (7)

and

Z0 = (L0/C0)^1/2   (8)
where
V p = velocity of signal propagation,
Lo = distributed inductance per unit length,
Co = distributed capacity per unit length, and
Zo = characteristic impedance.
Substituting in Eqs. (7) and (8), the distributed parameters for this particular 100-ohm line are
Co = 20.3 picofarads per foot
and
Lo = 0.203 microhenrys per foot
Now assume that 17.5 picofarads per foot is uniformly added to line segment B-C, as indicated in Fig. 13. The distributed inductance remains the same; however, the distributed capacitance is increased from 20.3 picofarads per foot to 37.8 picofarads per foot. Then, from Eqs. (7) and (8), the impedance of this section is reduced to 75 ohms, and the propagation time is increased from 2 to 2.72 nanoseconds. The waveform seen at A in Fig. 13 would be that of Fig. 12. The amplitude of the reflection would be 0.15, and the width of the reflected signal would be twice the propagation delay of the line section B-C, or 5.44 nanoseconds.
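The loaded-segment numbers above can be checked directly from Eqs. (7) and (8). A short sketch (using the rounded value Vp = 0.5 ft/ns, which gives 20 pF/ft and 73 ohms rather than the text's 20.3 pF/ft and 75 ohms):

```python
import math

# Check the loaded-line example via Eqs. (7) and (8):
# Vp = 1/sqrt(L0*C0) and Z0 = sqrt(L0/C0).

Z0 = 100.0      # ohms, unloaded characteristic impedance
Vp = 0.5e9      # ft/s (0.5 feet per nanosecond)

C0 = 1.0 / (Z0 * Vp)        # distributed capacitance, farads per foot
L0 = Z0 / Vp                # distributed inductance, henrys per foot
print(C0 * 1e12, "pF/ft")   # about 20 pF/ft (text quotes 20.3)
print(L0 * 1e6, "uH/ft")    # about 0.2 uH/ft (text quotes 0.203)

# Uniformly add 17.5 pF/ft of load capacitance to segment B-C:
CT = C0 + 17.5e-12
Z1 = math.sqrt(L0 / CT)         # loaded-segment impedance
stretch = math.sqrt(CT / C0)    # delay increases as sqrt(CT/C0)
print(round(Z1, 1), "ohms")             # about 73 ohms (text quotes 75)
print(round(2.0 * stretch, 2), "ns")    # the 2-ns segment becomes ~2.74 ns
```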
Figure 12. Waveform seen at point A in Fig. 11.

Figure 13. Simulated distributed line.

Now, a discrete number of capacitors placed on this line segment can be approximated by a distributed line whose characteristic impedance is a function of the spacing between the capacitors. The closer the spacing, the lower the characteristic impedance.
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
The reflection on the loaded line segment in Fig. 13 can be determined as follows:

Let Z0 = characteristic impedance of the unloaded line,
C0 = capacitance per unit length of the unloaded line,
L0 = inductance per unit length of the unloaded line,
V0 = velocity of light in free space,
Vp = K V0 = velocity of the signal on the unloaded line,
CL = capacitance per unit load,
n = number of unit loads per unit length,
D = physical spacing between loads,
CT = C0 + nCL = total capacity of the loaded line per unit length,
Eref = maximum amplitude of the reflected signal,
Ein = incident voltage, and
Γp = Eref/Ein.

Γp denotes the ratio of the peak amplitude of the reflected signal to the incident voltage when the loads on the transmission line have capacitive components. When the loads are purely resistive, Γp = Γ.
The impedance of the loaded line segment is

Z1 = (L0/CT)^1/2   (9)

The percent voltage reflected from point B back toward the generator end of the line, A, is

Γ = (Z1 − Z0)/(Z1 + Z0)   (10)

Substituting Eqs. (7), (8), and (9) into Eq. (10), we get

Γ = [1 − (1 + nCL/C0)^1/2] / [1 + (1 + nCL/C0)^1/2]   (11)

Equation (11) gives the reflection coefficient in terms of the distributed capacity of the unloaded line and the added capacity (per unit length) due to the shunt loads. Note in Eq. (11) that the total added capacity per unit length appears as the product of n and CL. Thus, the number of loads and the capacitance of the unit load CL can be inversely altered without changing the magnitude of the reflected voltage.

Since, in general, the capacitance of the load is fixed by the logic circuit input characteristics, the number of loads per unit length n will determine the reflected signal for a given C0, or more generally a given Z0. A more useful form of Eq. (11) is

Γ = [1 − (1 + CL/DC0)^1/2] / [1 + (1 + CL/DC0)^1/2]   (12)

where D is the physical spacing between loads.

For a given transmission line, Eq. (12) relates the reflected signal to the load capacitance and the spacing between the loads. Conversely, for a specified load and a predetermined maximum tolerable reflection, the spacing between equally spaced loads can be determined. It is important to remember that Eq. (12) is most accurate for a large number of loads. The predicted reflections are pessimistic when only a few loads are considered. However, the system must be able to perform properly in the presence of the maximum reflections; thus, it is for many loads that accuracy is required.

Some indication of the relationship between signal reflections and load spacing can be seen in Fig. 14, where Γ is plotted as a function of D for C0 equal to CL.

Figure 14. Percent reflection vs spacing for C0 = CL.

As an illustration of the usefulness of these concepts, assume that a 100-ohm line with a velocity propagation factor of 1/2 is loaded with circuits whose input characteristics can be represented by a 5-picofarad capacitor. If the reflection is not to exceed 15 percent, the closest allowable spacing can be determined by solving Eq. (12) for D. That is,

−0.15 = [1 − (1 + 5/20.3D)^1/2] / [1 + (1 + 5/20.3D)^1/2]

where C0 = 20.3 pF/ft from Eqs. (7) and (8), and D is in feet. On solving the above equation, D is found to be 3.56 inches.

Short Line

In the discussion following Fig. 10, a line was defined as short when the driving signal transition time is greater than twice the propagation delay of the line (T1 > 2Td). This definition applies to the idealized trapezoidal waveform. In most applications, however, the input waveform will not have a linear rise time but an exponential one; thus the end of the transition period is difficult to define. Experimentally, it has been found that if T1 is taken as the time required for the source voltage to rise to about 85 percent of its ultimate value, the relation T1 > 2Td can still be used to define the short line.

For those interconnecting lines to which the "short line" criterion applies, the rise times, delays and waveforms are calculated directly by conventional lumped-component techniques. In this case the transmission line is represented by an equivalent lumped capacitance as determined by Eqs. (7) and (8). This equivalent capacitance is added to the capacitance presented by the loads to determine the total capacitance.

Single Loads or Widely Spaced Loads

For those cases where the line is loaded with a single load or with widely spaced loads, the reflection amplitude and the time delay can be calculated by application of Thevenin's theorem. Consider Fig. 15. If an input signal Ein = 2E is connected as shown, a signal of amplitude E volts will propagate along the line. Td is the time required for the signal to propagate the length of the line. If SW1 is open, thus open-circuiting the line, the amplitude of the incident signal will jump to 2E, which can be considered the equivalent Thevenin open-circuit voltage. The Thevenin impedance is obtained by short-circuiting the input voltage and determining the impedance looking into the B terminal toward the generator. This impedance is the characteristic impedance (except at low frequencies). Thus, the equivalent circuits of Fig. 15b give a complete and accurate description of the reflection along the line as well as of the waveform at the load. If Rg ≠ Z0, then multiple reflections will exist. In this case the equivalent circuits shown in Fig. 15b apply only over an interval equal to 2Td seconds; a new Thevenin voltage is determined for each succeeding 2Td interval, based upon the reflected voltage from the generator end of the line. This reflected voltage is determined by the same method, i.e., by applying Thevenin's theorem to the generator end of the line. For short lines (T1 > 2Td) the equivalent circuits of Fig. 15b still apply, provided that the capacity of the line is added to CL.

Using the above approach, the reflection voltage is given by

EREF(t) = −(E Z0 CL / 2T1)(1 − e^(−2t/Z0CL));   0 < t < T1   (13)

EREF(t) = −(E Z0 CL / 2T1)(1 − e^(−2T1/Z0CL)) e^(−2(t−T1)/Z0CL);   t > T1

Figure 15. Thevenin equivalent circuits of transmission lines (for Z0 = RL).
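The single capacitive load case of Eq. (13) peaks at t = T1, which gives the maximum reflection ratio used in the next section. A numerical sketch (the values Z0 = 100 ohms, CL = 5 pF, T1 = 1 ns are assumed for illustration):

```python
import math

# Peak reflection from a single shunt capacitor CL on a line of
# impedance Z0 driven by a linear ramp of rise time T1, evaluating the
# single-load result Eq. (13) at t = T1.

def gamma_p(Z0, CL, T1):
    """Peak reflection ratio: (Z0*CL/2T1) * (1 - exp(-2T1/(Z0*CL)))."""
    tau = Z0 * CL
    return (tau / (2.0 * T1)) * (1.0 - math.exp(-2.0 * T1 / tau))

Z0, CL, T1 = 100.0, 5e-12, 1e-9
print(gamma_p(Z0, CL, T1))   # about 0.245, i.e. ~24.5% peak reflection
print(Z0 * CL / 2.0)         # 2.5e-10 s = 0.25 ns of added delay
```

Note how a load that produced only a 12-15 percent reflection when distributed among many closely spaced loads produces a much larger reflection when it stands alone against a fast edge.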
The maximum percent reflection is given by

Γp = (Z0 CL / 2T1)(1 − e^(−2T1/Z0CL))   (14)

The time delay from A to B, as measured at the 50 percent points, is Td + Z0CL/2.

Reflections Due to Shunt Resistive Components

Because some logic circuits, such as DTL's, have nonlinear input characteristics and significant d-c input components, it is extremely difficult to define the input impedance in terms suitable for use in the reflection formula. Defining Rin of the gate as the ratio of the voltage change to the current change is inaccurate, since the total current variation usually occurs in some interval of the voltage swing. This assumes that the voltage swing reflects noise immunity properties, which is generally the case. However, since the reflection on a line is proportional to the shunt load current, it is possible to define the percent reflection directly in terms of the input current level change, which can be easily measured.

Consider Fig. 16, which shows a section of transmission line to which an incident current Iin is applied. The line is loaded at point P1 by ZL, through which a current IL flows.

Iin = Ein/Z0   (15)

Figure 16. Current distribution at tapped node.

Summing the currents at node P1 and using the definition Γp = EREF/Ein, we obtain

Γp = −(1/2)(IL/Iin)   (16)

Thus we have the reflection expressed in terms of the load current and the incident current, both readily determined parameters for DTL types of loads. Although Eq. (16) was derived for resistive loads, its application can be extended to capacitive loads by representing IL by C(dv/dt)max.

Experimental Results

Table 1 shows the comparison between the predicted percent reflections and the measured percent reflections for various capacitance loads and spacings. The table is incomplete simply because all combinations were not measured. As indicated earlier, the agreement between the predicted and observed reflections for the higher numbers of loads is within a few percent. As the number of loads decreases (for the closely spaced loads), the predicted value exceeds the observed value. For slower rise times the accuracy of Eq. (12) will maintain itself for the higher numbers of loads, whereas the difference between the predicted and observed reflections will increase for fewer loads. This follows from the fact that a lumped line more closely approximates a distributed line as the number of sections is increased.
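Eq. (12) can also be inverted in closed form for the minimum allowable spacing, which is how the 3.56-inch figure in the worked example arises. A sketch (Python, using the paper's C0 = 20.3 pF/ft line):

```python
import math

# Eq. (12) and its inverse: reflection vs. load spacing for a uniformly
# loaded line, and the closest spacing for a given reflection budget.

def reflection(CL_pf, D_in, C0_pf_per_ft=20.3):
    """Eq. (12): Γ = [1 - sqrt(1 + CL/(D*C0))] / [1 + sqrt(1 + CL/(D*C0))].
    CL in pF, D in inches, C0 in pF per foot."""
    C0_in = C0_pf_per_ft / 12.0          # pF per inch
    root = math.sqrt(1.0 + CL_pf / (D_in * C0_in))
    return (1.0 - root) / (1.0 + root)

def min_spacing(CL_pf, gamma_max, C0_pf_per_ft=20.3):
    """Invert Eq. (12) for the closest spacing giving |Γ| <= gamma_max."""
    root = (1.0 + gamma_max) / (1.0 - gamma_max)
    C0_in = C0_pf_per_ft / 12.0
    return CL_pf / ((root ** 2 - 1.0) * C0_in)

# The worked example: 5-pF loads, 15 percent reflection budget.
print(round(min_spacing(5.0, 0.15), 2))   # 3.56 inches, as in the text
```

The same `reflection` function, evaluated over the spacings of Table 1, shows the qualitative trend of the calculated column (larger loads and closer spacing give larger reflections), though the table's measured values for few loads fall below it, as the text notes.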
DESIGN CONSIDERATIONS

In general the designer will initially endeavor to specify the characteristics of the transmission line first. This happens because the signal transmission system, being an integral part of the packaging, has many physical, mechanical and cost considerations in addition to its electrical properties. Certain types of lines, like coaxial lines, will be eliminated from general use because of their cost and poor packaging density. Once a line has been chosen consistent with cost and packaging considerations, the natural tendency will be to select as high an impedance as is practical in order to reduce the current requirement of the driver circuitry. However, unless the dynamic and static input current requirements of the logic circuitry are exceedingly low, the reflections will be excessive. Excessive reflections in this case means reflection voltages in excess of the predetermined noise immunity of the circuitry. Even if the designer takes corrective steps by lowering the line impedance or increasing the noise immunity of the circuit (which generally increases the delay), the design must still satisfy the requirements of the logic designer, whose fan-out and fan-in requirements directly affect the reflections. Then still to be considered is the inherent conflict of increased package density versus higher reflections, since squeezing a fixed number of loads close together increases the reflections, as indicated in Fig. 14. Increased package density usually results in closer spacing between the signal transmission lines, increasing crosstalk between the signal lines. This affects the noise immunity. Thus, the packaging density required, the electrical and mechanical properties of the signal transmission system, the noise immunity, the current-speed capability and input impedance of the logic circuitry, the logical fan-in and fan-out required, and the spacing of the loads all affect and are affected by the reflections (and other noise-producing sources). Therefore, these various system parameters should be evolved concurrently, with the various disciplines cooperating.

Table 1. Comparison of Measured and Calculated Values of Γp

Capacitance   Spacing    Measured Γp for n loads (percent)          Calculated Γ
per Load (pF) (inches)   n=2    n=3    n=4    n=6    n=11   n=12    (percent)
5             1.5        —      —      —      19.0   26.0   —       26.8
5             3.0        —      —      —      17.5   —      18.5    17.3
5             4.5        12.0   —      12.4   13.4   —      —       12.9
5             6.0        10.5   —      10.5   10.5   —      —       10.1
10            1.5        —      —      29.0   33.0   37.5   —       38.0
10            3.0        —      —      22.0   26.0   —      26.0    26.8
10            4.5        16.6   16.0   18.4   20.3   —      —       20.8
10            6.0        17.5   15.0   17.5   17.5   —      —       17.3
15            1.5        —      —      38.5   41.5   43.5   —       45.0
15            3.0        —      —      27.5   32.0   —      32.0    33.2
15            4.5        25.8   —      —      29.0   —      —       26.8
15            6.0        21.5   —      —      22.5   —      —       22.6

Note: Rise time = 1 nanosecond.

CONCLUSIONS

The interaction of two coupled lines, when one is driven by a transient signal, will generally result in induced signals at either end of the undriven line, even when the lines are properly terminated. For the special case where the dielectric medium is homogeneous, the induced signal at the receiving end of the undriven line will be zero when the lines are properly terminated.

When a driving pulse having a linear rise time is applied to one of two coupled lines having a nonhomogeneous medium, a crosstalk pulse will be induced at each end of the coupled sense (pickup) line. The forward crosstalk at the receiving end of the sense line will be proportional to the slope of the driving pulse and to the length of the coupled region. The back crosstalk pulse at the sending end of the sense line has a waveform that is related to
the relationship between the rise time of the driving pulse T1 and twice the propagation time Td of the coupled region. For T1 < 2Td the amplitude of the back crosstalk pulse is independent of the rise time and of the length of the coupled region; the width of the pulse is approximately 2Td. For T1 > 2Td the maximum amplitude of the back crosstalk pulse is proportional to the time derivative of the driving pulse and to the electrical length of the coupled region.

The equations presented for the back and forward crosstalk apply to lines having either homogeneous or nonhomogeneous media. Following the determination of two constants by direct measurement, these equations can be used to predict the forward and backward crosstalk voltage for varying driving pulse rise times, driving pulse amplitudes and lengths of the coupled region.
In any given system the magnitude of the reflection can be a function of the signal transition time, the length and characteristic impedance of the transmission line, the input conductance and capacitance of the circuit loads, and the spacing and number of the loads.

A transmission line uniformly loaded with logic circuits whose loading characteristics can be represented by capacitors can itself be represented by a distributed line with a reduced characteristic impedance. This approximation is most accurate as the number of loads increases and/or the signal rise time becomes faster. The maximum reflections seen on these types of lines can be determined by the relationship
Γ = [1 − (1 + CL/DC0)^1/2] / [1 + (1 + CL/DC0)^1/2]

where CL = effective capacity of the load,
C0 = capacity per unit length of the transmission line, and
D = spacing of the loads on the line.

For those lines which are loaded with logic circuits whose loading characteristics are essentially resistive (including those which are nonlinear), the maximum reflections can be calculated by the relationship

Γ = −(1/2)(IL/Iin)

where IL = current level change of the logic circuit, and
Iin = incident current on the line.

The effect of the various system design parameters on each other has been pointed out. The need for cooperation and compromise among the requirements of the packaging, the signal transmission system, the noise immunity, the current-speed capability, input impedance and logical gain of the logic circuits, and the spatial distribution of the loads has been discussed. The importance of a unified design philosophy grows as machines become faster and smaller.

REFERENCES

1. B. M. Oliver, "Directional Electromagnetic Coupling," Proc. IRE, vol. 42, no. 11, p. 1686 (Nov. 1954).

2. R. C. Knechtli, "Further Analysis of Transmission Line Couplers," Proc. IRE, vol. 43, no. 7, p. 867 (July 1955).

3. W. L. Firestone, "Analysis of Transmission Line Directional Couplers," Proc. IRE, vol. 42, no. 10, p. 1529 (Oct. 1955).

4. F. C. Yao, "Signal Transmission Analysis of Ultra High Speed Transistorized Digital Computers," IEEE Trans. on Electronic Computers, pp. 372-382 (Aug. 1963).

5. H. Gray, Digital Computer Engineering.

Appendix

MATHEMATICAL ANALYSIS OF TRANSIENT SIGNALS ON COUPLED TRANSMISSION LINES

Transmission Line in a Zero Field

The equations describing V(x) and I(x) for the transmission line shown in Fig. 17 can be found in any standard text on transmission line theory such as reference 6. With V and I positive and x measured from the receiving end as shown, V(x) and I(x) are given by

V(x) = A e^(γx) + B e^(−γx)   (1)

I(x) = (A/Z0) e^(γx) − (B/Z0) e^(−γx)   (2)

where Z0 = (Z/Y)^1/2 = [(R + jωL)/(G + jωC)]^1/2. A and B are constants determined by the boundary conditions, and R, L, G and C are defined on a per-unit-length basis.

Figure 17. Transmission line.

The instantaneous voltage and current equations can be obtained from Eqs. (1) and (2) by multiplying them by e^(jωt), which gives

v(x, t) = A e^(jωt + γx) + B e^(jωt − γx)   (3)

and

i(x, t) = (A/Z0) e^(jωt + γx) − (B/Z0) e^(jωt − γx)   (4)

Those terms containing expressions of the form f1(ωt − γx) indicate a wave traveling in the +x direction, while those containing the form f1(ωt + γx) indicate waves traveling in the −x direction. Note that the +x direction is from the receiving end of the line back toward the generator end.

Evaluating the constants from the boundary conditions at x = 0, V(0) = VR and IR = VR/Z2, Eqs. (1) and (2) become, in hyperbolic form,

V(x) = VR [cosh γx + (Z0/Z2) sinh γx]   (5)

and

I(x) = IR [cosh γx + (Z2/Z0) sinh γx]   (6)
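Eqs. (5) and (6) can be spot-checked numerically: their ratio at x = l must reproduce the familiar input-impedance formula for a line terminated in Z2. A short sketch, with arbitrary assumed line constants:

```python
import cmath

# Numerical check of Eqs. (5)-(6): V(l)/I(l) should equal
# Zin = Z0*(Z2 + Z0*tanh(gl)) / (Z0 + Z2*tanh(gl)).
# Z0, Z2 and the complex gl = (gamma * l) are arbitrary assumptions.

Z0 = 100.0
Z2 = 50.0           # termination at the receiving end (x = 0)
gl = 0.3 + 2.0j     # propagation constant times line length

VR = 1.0            # receiving-end voltage, taken as reference
IR = VR / Z2
V = VR * (cmath.cosh(gl) + (Z0 / Z2) * cmath.sinh(gl))   # Eq. (5) at x = l
I = IR * (cmath.cosh(gl) + (Z2 / Z0) * cmath.sinh(gl))   # Eq. (6) at x = l

Zin = Z0 * (Z2 + Z0 * cmath.tanh(gl)) / (Z0 + Z2 * cmath.tanh(gl))
print(abs(V / I - Zin) < 1e-9)   # True: the two expressions agree
```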
Voltage and Current Distribution from an Induced Voltage. Fig. 18 shows a transmission line of characteristic impedance Z0, terminated at x = 0 in Z1 and at x = ℓ in Z2, in which a zero-impedance voltage source V̂ is induced at x = ξ.

Figure 18. Transmission line with an induced voltage at x = ξ.

On either side of ξ the general solutions for the current and voltage are

I1(x, ξ) = P1 cosh γx + Q1 sinh γx;   x < ξ   (9)

I2(x, ξ) = P2 cosh γ(ℓ−x) + Q2 sinh γ(ℓ−x);   x > ξ   (10)

V1(x, ξ) = −Z0 [Q1 cosh γx + P1 sinh γx];   x < ξ   (11)

V2(x, ξ) = Z0 [Q2 cosh γ(ℓ−x) + P2 sinh γ(ℓ−x)];   x > ξ   (12)

Applying the termination conditions at x = 0 and x = ℓ reduces the four constants to two, P and Q:

V1(x, ξ) = P [Z1 cosh γx + Z0 sinh γx];   x < ξ   (19)

I1(x, ξ) = −(P/Z0) [Z0 cosh γx + Z1 sinh γx];   x < ξ   (20)

V2(x, ξ) = Q [Z2 cosh γ(ℓ−x) + Z0 sinh γ(ℓ−x)];   x > ξ   (21)

I2(x, ξ) = (Q/Z0) [Z0 cosh γ(ℓ−x) + Z2 sinh γ(ℓ−x)];   x > ξ   (22)

Continuity of the current at x = ξ requires

P [Z0 cosh γξ + Z1 sinh γξ] = −Q [Z0 cosh γ(ℓ−ξ) + Z2 sinh γ(ℓ−ξ)]   (23)

while the induced voltage V̂ equals the difference of the two voltage solutions at x = ξ:

V̂ = Q [Z2 cosh γ(ℓ−ξ) + Z0 sinh γ(ℓ−ξ)] − P [Z1 cosh γξ + Z0 sinh γξ]   (24)

With the aid of hyperbolic identities, Eqs. (23) and (24) reduce to

P = −V̂ [Z0 cosh γ(ℓ−ξ) + Z2 sinh γ(ℓ−ξ)] / Δ   (25)

and

Q = V̂ [Z0 cosh γξ + Z1 sinh γξ] / Δ   (26)

where

Δ = (Z0² + Z1Z2) sinh γℓ + (Z0Z2 + Z0Z1) cosh γℓ

Substituting P and Q in Eqs. (19) through (22) gives

V1(x, ξ) = (−V̂/Δ) [Z0 cosh γ(ℓ−ξ) + Z2 sinh γ(ℓ−ξ)] [Z1 cosh γx + Z0 sinh γx];   x < ξ   (27)

I1(x, ξ) = (V̂/Z0Δ) [Z0 cosh γ(ℓ−ξ) + Z2 sinh γ(ℓ−ξ)] [Z0 cosh γx + Z1 sinh γx];   x < ξ   (28)

V2(x, ξ) = (V̂/Δ) [Z0 cosh γξ + Z1 sinh γξ] [Z2 cosh γ(ℓ−x) + Z0 sinh γ(ℓ−x)];   x > ξ   (29)

I2(x, ξ) = (V̂/Z0Δ) [Z0 cosh γξ + Z1 sinh γξ] [Z0 cosh γ(ℓ−x) + Z2 sinh γ(ℓ−x)];   x > ξ   (30)

Eqs. (27) to (30) give the voltage and current distribution on the line in response to a zero-impedance voltage source induced at x = ξ.

Voltage and Current Distribution from an Induced Current. Fig. 19 shows a transmission line of characteristic impedance Z0, terminated with Z1 and Z2, in which an infinite-impedance current source Î is induced at x = ξ. The direction of the induced current Î is as shown in the figure; the currents in the line will flow in opposite directions, as shown by the two opposing arrows.

Figure 19. Transmission line with an induced current at x = ξ.

A procedure identical to the one previously used for an induced voltage yields the voltage and current distribution in response to a current Î induced at x = ξ. The four boundary conditions required for the solution of the simultaneous equations are

V2(0, ξ) = −Z1 I2(0, ξ);   x = 0   (31)

V2(ℓ, ξ) = −Z2 I2(ℓ, ξ);   x = ℓ   (32)

V2(ξ−0, ξ) = V2(ξ+0, ξ);   x = ξ   (33)

I2(ξ−0, ξ) − I2(ξ+0, ξ) = Î;   x = ξ   (34)

As before, the current in the direction of +x is a positive current. Therefore I2(ξ−0, ξ) is a positive current while I2(ξ+0, ξ) is a negative one. The boundary conditions at x = 0 and x = ℓ both generate negative impedance values because the currents and voltages are in opposite directions. This is done to allow compatibility with the previously discussed problem with an induced voltage.

Thus, starting with Eqs. (9) through (12) and using Eqs. (31) through (34) in the same manner as before, we arrive at the voltage and current distribution at x due to an induced current Î at x = ξ. The solutions are

V1(x, ξ) = (−Î Z0/Δ) [Z1 cosh γx + Z0 sinh γx] [Z2 cosh γ(ℓ−ξ) + Z0 sinh γ(ℓ−ξ)];   x < ξ   (35)

I1(x, ξ) = (Î/Δ) [Z0 cosh γx + Z1 sinh γx] [Z2 cosh γ(ℓ−ξ) + Z0 sinh γ(ℓ−ξ)];   x < ξ   (36)

V2(x, ξ) = (−Î Z0/Δ) [Z1 cosh γξ + Z0 sinh γξ] [Z2 cosh γ(ℓ−x) + Z0 sinh γ(ℓ−x)];   x > ξ   (37)

I2(x, ξ) = (−Î/Δ) [Z1 cosh γξ + Z0 sinh γξ] [Z0 cosh γ(ℓ−x) + Z2 sinh γ(ℓ−x)];   x > ξ   (38)

The voltage and current distribution at any point x on the line due to an induced voltage and current at x = ξ can now be obtained via superposition. Also, because the main interest here is in the induced signals, let the line be matched; that is, let Z1 = Z2 = Z0.

Substituting Z0 = Z1 = Z2 and combining Eqs. (27) to (30) and (35) to (38), while using the relationships cosh x = (e^x + e^−x)/2, sinh x = (e^x − e^−x)/2 and cosh x + sinh x = e^x, the equations reduce to

V(x, ξ) = −(1/2)(V̂ + Î Z0) e^(−γ(ξ−x));   x < ξ   (39)

I(x, ξ) = (1/2)(V̂/Z0 + Î) e^(−γ(ξ−x));   x < ξ   (40)

V(x, ξ) = (1/2)(V̂ − Î Z0) e^(−γ(x−ξ));   x > ξ   (41)

I(x, ξ) = (1/2)(V̂/Z0 − Î) e^(−γ(x−ξ));   x > ξ   (42)

Equations (39) to (42) define the voltage and current distribution at any point x on a transmission line as a result of an induced differential voltage and current at x = ξ. The sign factors in these equations should be interpreted in terms of the assumed positive directions in Figs. 18 and 19. For example, the minus sign in Eq. (39) indicates that the direction for the positive voltage at x = 0 is opposite to that assumed. Again, this was expected from the qualitative analysis. In the expressions that hold for x < ξ, the back crosstalk, the terms are of the same polarity; for x > ξ, the forward crosstalk, the terms are of opposite polarity. These were the results expected in light of the qualitative analysis. If the relationship V̂/Î = Z0 is true, the forward crosstalk voltage reduces to zero. In this case, for crosstalk considerations, the sense line may be terminated in any impedance at the receiving end.

The total induced voltage and current resulting from a driving pulse propagating along the driver line are obtained by adding the differential voltages and currents expressed in Eqs. (39) to (42) over the region of interaction. Here we follow the method outlined in reference 5. If an applied current signal I0 on the drive line is propagating from left to right, the current and voltage at any point ξ on the drive line may be written as

Ia = I0 e^(−γξ)   (43)

Va = Z0 Ia   (44)

Defining Cm and Lm as the mutual capacitive and inductive coupling per unit length between the driving line and the sense line, V̂ and Î can be expressed as

V̂ = s Lm Ia   (45)

Î = s Cm Va   (46)

where s is the argument of the Laplace transform. Substituting Eqs. (43) to (46) in Eqs. (39) and (41) and integrating over the region of interaction, we get

V(x, s) = −(I0/2) ∫ from 0 to x of s(Lm − Cm Z0²) e^(−γξ) e^(−γ(x−ξ)) dξ + (I0/2) ∫ from x to ℓ of s(Lm + Cm Z0²) e^(−γξ) e^(−γ(ξ−x)) dξ   (47)

= (I0 s/2) [−(Lm − Cm Z0²) x e^(−γx) + (Lm + Cm Z0²)(e^(−γx) − e^(−γ(2ℓ−x)))/2γ]   (48)

For a lossless line the propagation constant is γ = s(L0C0)^1/2 = sTd/ℓ, where Td is the time required to propagate the length ℓ of the coupled region. Substituting in Eq. (48) and taking the inverse transform, we get

V(x, t) = Kf x (d/dt)V0(t − Td x/ℓ) − Kb [V0(t − Td x/ℓ) − V0(t − 2Td + Td x/ℓ)]

where V0 is the driving pulse and Kf and Kb are the forward and backward crosstalk constants determined by Lm, Cm and Z0 — the two constants referred to in the Conclusions as obtainable by direct measurement.
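The structure of the final result can be illustrated numerically: the backward term is a flat pulse of width about 2Td, while the forward term follows the derivative of the drive. A sketch, in which Kf, Kb, the ramp parameters and the coupled-region delay are all hypothetical values, not constants from the paper:

```python
# Sketch of V(x,t) = Kf*x*dV0/dt(t - Td*x/l)
#                  - Kb*[V0(t - Td*x/l) - V0(t - 2*Td + Td*x/l)]
# for a linear-ramp drive pulse. All numeric values are hypothetical.

def v0(t, rise=1e-9, amp=1.0):
    """Linear-ramp driving pulse V0(t)."""
    if t <= 0.0:
        return 0.0
    return amp if t >= rise else amp * t / rise

def dv0(t, rise=1e-9, amp=1.0):
    """Time derivative of the ramp."""
    return amp / rise if 0.0 < t < rise else 0.0

def crosstalk(t, x, l, Td, Kf, Kb):
    a = t - Td * x / l            # forward-delayed argument
    b = t - 2.0 * Td + Td * x / l   # round-trip-delayed argument
    return Kf * x * dv0(a) - Kb * (v0(a) - v0(b))

# Backward crosstalk at the sending end (x = 0) of a 2-ns coupled region:
Td, Kb, Kf, l = 2e-9, 0.1, 0.0, 1.0
print(crosstalk(1e-9, 0.0, l, Td, Kf, Kb))   # -0.1: the back-pulse plateau
```

For T1 < 2Td the bracketed difference saturates at the full pulse amplitude, reproducing the conclusion that the back crosstalk plateau is independent of rise time and coupled length.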
analyses to be performed on the digital computer, rather than manually.

HISTOGRAM GENERATION

A common form of preliminary statistical analysis associated with intracellular neuronal response data is to count the number of times a given variation occurs within a sample. These discrete probability distributions, or histograms, are divided into class increments of the variation. For neuronal data, two types of variation are of interest. In one type the variable is amplitude, thereby generating an amplitude histogram. For variations with time, there is the second type, a time interval histogram. The production of either or both of these two main types of histograms may be performed during the event recognition program, or from the reduced data contained on magnetic tape.

Figure 2. Sample of neural-electric data (bottom) and the results of the event recognition program. E± . . . denotes the multiplying power-of-ten exponent. Time is in seconds, amplitude in millivolts.

Amplitude Histogram

In examining the neuronal data, only two types of events are considered for inclusion in the amplitude histogram category: the EPSP's and the IPSP's. Spikes are not included in this form of histogram. In addition, only those events which have definitely attained their response peak prior to initiation of another event are included. The events which are interrupted by the onset of another event (those marked with an * in Fig. 2) are not included, since no accurate determination of their peak value may be made.

The procedure governing assignment of an event to a particular amplitude class interval, or amplitude 'box,' is depicted in the very simplified flow chart shown in Fig. 3. There are separate amplitude histograms for the EPSP's and the IPSP's.

In Fig. 3, the symbol ΔA represents the basic amplitude class interval. To avoid truncation errors, which may be quite misleading, ΔA (an input variable to the program, in millivolts) must be an integer multiple of the digitizing amplitude resolution. N is an integer representing the assigned class interval and ranges from 1 to 200.

Figure 3. Simplified flow diagram for time interval and amplitude histogram generation. (The decision boxes test whether the event was interrupted on the rising phase, then add a count to the Nth array element of the amplitude histogram and the Mth array element of the time interval histogram.)

Time Interval Histogram

There are three separate time interval histograms: one each for the EPSP's, the IPSP's and the spikes. The requirement on the EPSP's and IPSP's that they display a discernible peak (i.e., be uninterrupted on the rising phase), as is the case for the amplitude histograms, is absent here. All that is required in the form of an amplitude criterion is that their amplitude be at least as large as the minimum acceptable at the time of the peak, or of their interruption by a supervening event.

The class interval for these histograms is ΔT seconds, an input variable to the program. The measured time between contiguous like events is taken as the time difference between the initiation of one event and the initiation of the next like event. The first is termed the reference event, while the second is the cross-reference event. Their times are labeled T1 and T2 respectively. Similar to the amplitude class interval N, M is an integer denoting the assigned time class interval or 'box,' and its value ranges from a preset minimum value MMIN to MMIN + 200. Since the time intervals between adjacent like events may exceed 200ΔT, counts indicating that two events are separated by 200ΔT or more are all placed in the 200th box.
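The box-assignment rules just described can be sketched compactly. The function names, ΔA/ΔT values and event times below are hypothetical illustrations, not the program's actual identifiers:

```python
# Sketch of the class-interval ('box') assignment described above.
# dA_mv, dT_s, m_min and the sample values are hypothetical inputs.

def amplitude_box(amplitude_mv, dA_mv, n_boxes=200):
    """Assign a peak amplitude to amplitude class interval N (1..200)."""
    n = int(amplitude_mv / dA_mv) + 1
    return min(n, n_boxes)

def time_box(t_ref, t_cross, dT_s, m_min, n_boxes=200):
    """Assign the interval between like events to time box M; intervals
    of 200*dT or more all land in the 200th box, as in the text."""
    m = m_min + int((t_cross - t_ref) / dT_s)
    return min(m, m_min + n_boxes - 1)

print(amplitude_box(1.2, 0.25))          # 1.2 mV with dA = 0.25 mV -> box 5
print(time_box(0.0, 0.065, 0.01, 1))     # 65-ms gap with dT = 10 ms -> box 7
```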
Histogram Readout

As was mentioned previously, the histograms may be formed concurrently with the operation of the event recognition program, or in subsequent operations involving the reduced data on magnetic tape. In either case, selection of particular histograms is an input variable. That is, a choice may be made, prior to program operation, of which histograms are to be printed out. Another program input variable is the periodicity of histogram printout. This option will supply the investigator, if he so desires, with a time history of the histogram formation every ΔTJ seconds. The time history is supplied in two forms: (a) the total histogram at the ΔTJth time of printout and (b) the difference between the histograms at the times associated with ΔTJ and ΔTJ−1.

Printout of all histograms may be in either of two forms, tabular or graphical, or both. Collateral with the tabular printout is the total number of counts entered into each particular histogram at the time of printout. Figures 4 and 5 are representative samples of the graphical plots of an EPSP amplitude histogram and time interval histogram, respectively. The original data was obtained from a motoneuron in the spinal cord of a cat.*
Figure 4. Amplitude histogram for excitatory postsynaptic
potentials (EPSP's) obtained from a motoneuron in a cat
spinal cord. Only EPSP's which were uninterrupted during the rising phase are included. Minimum acceptance
amplitude was 250 microvolts.
TIME CORRELATION OF EVENTS
The program described here is employed primarily with the data reduced by the event recognition
program and stored on magnetic tape. However, it
may be used on other data in card form in the correct format. Due to storage limitations of the digital computer, it cannot be performed concurrently
with the event recognition program.
In essence, the time correlation of neuronal
events is an extension of the frequency distribution
*Obtained from experimental data supplied by Dr. T. G.
Smith, Spinal Cord Section, Laboratory of Neurophysiology,
NINDB, National Institutes of Health, Bethesda, Md.
Figure 5. Time interval histogram for the same data as in Fig. 4. Here, however, all recognized EPSP's are included as long as their measured amplitude difference from initiation-to-peak or initiation-to-interruption was at least 250 microvolts.
analysis just described. Whereas the time interval histograms were concerned with contiguous and like events, this form of analysis is not so limited. This program, referred to as the correlation program, measures the probability (or correlation) that, given a particular type of event (EPSP, IPSP or spike), there will be another event of a particular type within a given delay-time class interval. The two events, reference and cross-reference, need not be adjacent; nor need they be of the same type.3

This type of analysis is felt to be much more meaningful than the normal frequency distribution, or histogram, form. The observed EPSP's and IPSP's usually each arrive from multiple input pathways to the single neuron being monitored. Because the various signals on the multiple pathways may be asynchronous, indications of underlying signal processing or event relationships in time may be indiscernible in a histogram.
There are two main types of time interval correlations performed. One type, designated type-A, is
independent of the amplitude of either the reference
or the cross-reference event. All events included
in the time interval histograms are included in this
type. The second type of correlation, designated
type-B, is performed with a condition of amplitude imposed. Only subthreshold events (EPSP's
and IPSP's) which meet the requirements for inclusion in the amplitude histograms are eligible for
this type of correlation. All spikes may be included
in either type.
Type-A Correlation
The type-A time interval correlations are in
turn divided into six categories as a function of the
reference and cross-reference event types. These
six categories are referred to as the IPAIR's. Figure
6 lists the various A-types and their collateral ref-
ANALYSIS OF INTRACELLULAR NEURONAL RESPONSES
573
TYPE-A CORRELATIONS

IPAIR   REF. EVENT   CROSS-REF. EVENT
  1       EPSP           EPSP
  2       IPSP           IPSP
  3       SPIKE          SPIKE
  4       IPSP           EPSP
  5       EPSP           IPSP
  6       EPSP           SPIKE

TYPE-B CORRELATIONS

TYPE        AMPL. CONDITIONS (REF., CROSS-REF.)   MAX. NO. DELAY CLASS INTERVALS
B(MD,1,L)   (1 ≤ L ≤ 16)ΔA                        MD = 200
B(MC,N,L)   (1 ≤ L ≤ 16)ΔA                        MC = 50
Figure 6. Classification of time interval correlation types.
Type-A includes all events meeting the minimum amplitude
criterion. Type-B is performed for a particular type-A and, in
addition, has conditions of peak amplitude imposed.
erence and cross-reference event types, as well as
the conditions of amplitude included in the various
type-B correlations.
Associated with each type-A correlation are M
class intervals of time. The minimum (MMIN) and
maximum (MMAX) values of the class intervals are
input parameters to the program. The maximum
difference allowed between MMAX and MMIN for
any single processing run is 200. The values of M
must be positive and consecutive integers. The width,
ΔT, in milliseconds, associated with each delay time
class interval is also an input parameter to the program.
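The class-interval bookkeeping just described can be sketched as follows (a hypothetical illustration in modern Python, not the original Fortran; the names t_ref, t_cross, dt, mmin and mmax are assumptions):

```python
def delay_class_interval(t_ref, t_cross, dt):
    """Map the delay between a reference event and a later
    cross-reference event onto an integer delay-time class
    interval of width dt milliseconds."""
    return int((t_cross - t_ref) // dt)

def in_window(m, mmin, mmax):
    """A count is tallied only when the class interval lies
    between the input parameters MMIN and MMAX, whose
    difference may not exceed 200 for a single run."""
    assert mmax - mmin <= 200
    return mmin <= m <= mmax
```

For example, with dt = 1 ms a cross-reference event occurring 25 ms after its reference event falls in class interval 25.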
Figure 7 is a simplified program flow chart for the
type-A portion of the correlation program. Data from the
magnetic tape is read into a buffer storage. Contained in the storage is data designating the type of
event, its initiation time and its amplitude between
initiation and peak; or, if the event was interrupted,
an equivalent of the * (Fig. 2) indicating an interruption in place of amplitude.
The data is examined until an event initiation time
is found which corresponds to an input start time.
The time of the event is set equal to tj. This time is
checked against the input stop time and the print time
criterion. If tj is equal to, or greater than, either of
these, the PRINT routine is entered. This will be
discussed later.
The type of event (I) is determined (i.e., EPSP,
IPSP or spike) and a count is entered in the appropriate reference event counter {KE(I)} for that type
of event. Next, the amplitude is checked to see if it
was a completed event or not. If the event was completed (no *), its type is checked against the type
of reference event to be used for the type-B correlation; that is, an ICLASS input corresponding to one
of the IPAIRS. Assume that the reference event is
the type to be used for the type-B correlations.
The event amplitude class interval, LT, is determined. The value of LT is checked against 16 input
amplitude classes. If the event's value of LT corresponds to one of these classes, a count is entered into
the appropriate type-B reference event counter
{KEB(L)}.
After such preliminary processing of the reference
event, the computer advances to the next event in
Figure 7. Simplified program flow chart for the type-A
correlation mode.
storage. This event is the first cross-reference event.
A check is made (compute the M-class interval) to
ascertain if the difference in the initiation times of
the reference and cross-reference events lies between
(MMAX) × (ΔT) and (MMIN) × (ΔT). If the
time difference is in excess of (MMAX) × (ΔT)
and the reference event was interrupted in amplitude,
a new reference event is chosen. A time difference
less than (MMIN) × (ΔT) calls for a new cross-reference event. Assume that neither of these conditions was encountered.
At this point, a determination of the cross-reference event type is made (IPAIR determined), and
a count in the appropriate M-class interval for the
corresponding type-A correlation is made. The reference event is checked again at this point to see if
it is a completed or interrupted event. If it is an
interrupted event, no entry is made into the type-B
correlation routine, and the program then examines
the next data point as a cross-reference event, and
makes the previously mentioned time checks, etc.
If, in this portion of the loop, the determined class
interval falls outside of either the minimum or maximum values of class interval, entry into the type-B
correlation is made if the reference event is uninterrupted and is the proper amplitude class for a type-B
reference event (L ≠ 0).
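The scan loop of Fig. 7 might be sketched as below (a Python sketch rather than the original Fortran; the IPAIR numbering follows the reconstruction of Fig. 6, the handling of interrupted events and the type-B branch are omitted, and all names are assumptions):

```python
from collections import defaultdict

# IPAIR classification, (reference, cross-reference) -> IPAIR, per Fig. 6
IPAIRS = {("EPSP", "EPSP"): 1, ("IPSP", "IPSP"): 2, ("SPIKE", "SPIKE"): 3,
          ("IPSP", "EPSP"): 4, ("EPSP", "IPSP"): 5, ("EPSP", "SPIKE"): 6}

def type_a_correlation(events, mmin, mmax, dt):
    """events: list of (time_ms, type) pairs sorted by time.
    Returns counts[ipair][m]: how many cross-reference events fell
    in delay class interval m after a reference event of each IPAIR."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, (t_ref, ref_type) in enumerate(events):
        for t_cross, cross_type in events[i + 1:]:
            m = int((t_cross - t_ref) // dt)
            if m < mmin:     # too close: try the next cross-reference event
                continue
            if m > mmax:     # beyond the window: take a new reference event
                break
            ipair = IPAIRS.get((ref_type, cross_type))
            if ipair is not None:
                counts[ipair][m] += 1
    return counts
```

Because the events are sorted by time, the inner scan can stop at the first cross-reference event beyond (MMAX) × (ΔT).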
Type-B Correlations
Collaterally with the six type-A correlations, time
analysis on conditions of amplitude may be performed to a limited extent. Due to storage limitations in the digital computer, only a limited number
of combinatorial conditions of event type, conditional amplitude class interval and delay time class
interval may be imposed. For any one processing
run, the type-B correlations may correspond to
only one particular type-A IPAIR; e.g., EPSP's
correlated with EPSP's, where both the reference
and cross-reference events have an amplitude condition imposed. The type chosen is an input parameter labeled ICLASS, where the ICLASS equals one
of the six allowed IPAIRS. Referring to Fig. 8, the
processing procedure for the type-B is given in
the following (it will be assumed here that the reference event had a measurable amplitude so that
entry into this portion of the program was made).
The first check in this portion of the routine is
to determine if the combination of reference and cross-reference events is the chosen ICLASS.
Following this, a check is made upon the amplitude
of the cross-reference event to determine if it is an
uninterrupted event or not. Assume for discussion
purposes that the ICLASS = IPAIR and that the
cross-reference event had a measurable amplitude.
The amplitude class corresponding to the cross-reference event (NT) is then determined. This class
is adjusted to the class interval of the reference event
by forming N = NT - LT.
The allowable class intervals for the cross-reference events are designated by the symbol N, which
is also an input to the program. N may vary positively
or negatively, and is an integer. The maximum num-
Figure 8. Simplified program flow chart for the type-B
correlation mode.
ber of values for N for a particular processing is ten.
For each value of L, a conditional correlation is possible for each value of N. The function of N is to
adjust LΔA for the cross-reference events. For positive values of N, the maximum allowed cross-reference amplitude class for each value of L is L(ΔA) + (N − 1)ΔA. For negative values of N, the minimum
allowed class is L(ΔA) − |N|ΔA.
The actual allowed cross-reference amplitude
classes are determined by imposing the condition
that the determined value of N must equal one of
the L values. This condition is imposed in order to
obtain conveniently the number of cross-reference
events employed in any amplitude-conditional
correlation. The L inputs may number 16, and the
N inputs 10.
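This amplitude condition reduces to a small computation (a Python sketch; allowed_n stands for the up-to-ten input N values, and the function name is an assumption):

```python
def type_b_amplitude_class(nt, lt, allowed_n):
    """Relative amplitude class for a type-B correlation:
    N = NT - LT, the cross-reference event's class adjusted
    to the reference event's class.  The pair is counted only
    when N equals one of the input N values; otherwise None
    is returned and the pair is ignored."""
    n = nt - lt
    return n if n in allowed_n else None
```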
Assume that the value of N corresponds to one
of the input values, and that this value was N = 1.
A check is made (K = 0?) to determine if the time
difference already computed is equal to an allowable MD
class interval. If it is, a count is made into the appropriate array element for type-B (MD, 1, L), where
this array has the same 200 delay class intervals as
the type-A arrays.
The next step is to determine if the time difference
is one of the 50 class intervals of the type-B array,
where these 50 class intervals do not necessarily have
to correspond to any of the 200 class intervals of
the type-A's or the B(MD, 1, L)'s. The new class
interval is determined using the input delay class
width ΔT2, and a check is made to determine if this
class interval, MC, is allowed. If it is, a count is
made in the appropriate array element for the type-B
(MC, N, L). Following this, either a new reference
event or cross-reference event is obtained, and the
processing continues until a reference event's time
is found which equals or exceeds the input stop time.
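The second, 50-interval tally might be sketched as follows (hypothetical Python; dt2 corresponds to the input ΔT2, and the 0-based interval numbering is an assumption):

```python
def type_b_class_interval(t_ref, t_cross, dt2, mc_max=50):
    """Delay class interval for the type-B(MC, N, L) array,
    computed with its own class width dt2; these 50 intervals
    need not coincide with the 200 type-A intervals.  Returns
    None when the delay falls outside the array."""
    mc = int((t_cross - t_ref) // dt2)
    return mc if 0 <= mc < mc_max else None
```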
Printout
The results of the correlation processing may be
printed as often as desired. A ΔTk input to the program determines the frequency of print in real (experimental) time, not machine time. The time of each
reference event is examined. If its time is equal to or greater than the
start time plus ΔTk (k = 1, 2, 3, ...), the results
of the correlation as of the time of that event are
printed, contingent upon meeting certain print criteria.
Since the processing is slowed every time the digital computer has to print results, it is economical
to print only when it is felt the results may have
meaning. Hence, there are print criteria included in
the program. At present, there are three print options from which one may choose. Fig. 9 lists these
in the form of a simplified flow chart.
Upon entry into the print routine, certain values
are computed for each array. These values are the
number of counts, or correlations, expected if one
assumes that the events are Poisson-distributed in
time. The expected number of correlation counts
per class interval is found by forming the product
of the number of reference events for the particular
array being examined (N1) and the number of cross-reference events possible for that array (N2; N2 may
equal N1 for certain array types, as shown in Fig. 9).
This product in turn is multiplied by the individual
delay class interval width of the array being examined. This product is divided by the time difference between the time of the first reference event
employed in the correlation and the time of the last
reference event corresponding to the print time. The
value so determined, X, is the average number of
correlation counts expected if the process being examined has a Poisson distribution of event times. One
standard deviation is determined by calculating the
square root of X.
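The expected-count computation reduces to a one-line formula (a Python sketch; n1, n2, dt and duration name the quantities N1, N2, the delay class width and the reference-event time span described above):

```python
import math

def expected_counts(n1, n2, dt, duration):
    """Average number of correlation counts per delay class
    interval expected for Poisson-distributed event times,
    and one standard deviation (the square root of X)."""
    x = n1 * n2 * dt / duration
    return x, math.sqrt(x)
```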
In the first option, in order to determine if a particular correlation array is to be printed, the counts
in all of the class intervals of that array are
summed. The sum is divided by the square root of
the product of the number of reference events and
the number of cross-reference events. This quotient, Pc, is compared to an input criteria number.
If it is larger, the array is printed out.
The second option employs the previously computed value of X. X is multiplied by the number
of delay class intervals to form a value Y. Y is the
expected average number of counts for the entire
array, and √Y the standard deviation. If the sum
of the correlation counts in the array exceeds Pc =
Y ± z√Y, where z is an input parameter, then the
array is printed out.
The third option consecutively examines each element in the array. As soon as an array element
(where the count ≠ 0) is found which exceeds X ±
z√X, the array is printed. Again, z is an input parameter.
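The three options can be summarized in one sketch (hypothetical Python; reading option 2's criterion as the band Y ± z√Y and option 3's as X ± z√X is an interpretation of the text):

```python
import math

def should_print(counts, n1, n2, x, option, pc_crit=None, z=None):
    """Print criteria for one correlation array.  counts holds the
    per-class-interval correlation counts; x is the expected
    Poisson count per interval computed earlier."""
    total = sum(counts)
    if option == 1:   # sum / sqrt(N1*N2) against an input criterion
        return total / math.sqrt(n1 * n2) > pc_crit
    if option == 2:   # array sum outside Y +/- z*sqrt(Y)
        y = x * len(counts)
        return abs(total - y) > z * math.sqrt(y)
    if option == 3:   # any nonzero element outside X +/- z*sqrt(X)
        return any(c != 0 and abs(c - x) > z * math.sqrt(x)
                   for c in counts)
    raise ValueError("option must be 1, 2 or 3")
```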
In all options, if the print criteria are not met, the
array type, the value of X and √X for that array,
and the value of N1 (the number of reference events
for that array) are printed. In addition, for options
For TYPE-B(MD,1,L) and for TYPE-A, IPAIR = 1, 2, 3:
compute X = N1·N1·ΔT/T and √X (here N1 = N2)
For TYPE-A, IPAIR = 4, 5, 6:
compute X = N1·N2·ΔT/T and √X (N1 ≠ N2)
For TYPE-B(MC,N,L):
compute X = N1·N2·ΔT2/T and √X
Options 1 and 2: compute Pc = Σ/N1 for N1 = N2, Pc = Σ/√(N1·N2) for N1 ≠ N2
Option 3: print as soon as an array element Σ ≠ 0 is found such that Σ > X + z√X or Σ < X − z√X
N1 = number of reference events; N2 = number of cross-reference events
Figure 9. Simplified program flow chart for the Correlation Program's print routine.
1 and 2, the sum of the array counts and the value
of Pc are printed; for the third option, the peak count
of an array element is printed. When the print criteria are met, the following is printed: array type,
array, X, √X and N1. Following the type-A's, the
time of the last reference event is given.
Printout to date is in tabular form only. To obtain a graphical plot, an auxiliary program is employed where the count values (to a scale normalized to 1000 maximum) are consecutively entered
onto cards as a function of the delay class interval.
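The normalization step amounts to the following (a Python sketch; the function name is an assumption):

```python
def normalize_counts(counts, scale=1000):
    """Scale a set of correlation counts so that the largest
    value becomes 1000, as done before punching the count
    values onto cards for the auxiliary plotting program."""
    peak = max(counts)
    return [round(c * scale / peak) for c in counts]
```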
RESULTS
In addition to control test runs, experimental
data from several motoneurons in the spinal cord of
cats has been processed through the entire set of
programs herein described. The results have been
quite fruitful. * Complete histograms and type-A
and B correlations have been obtained for a total of
over 55,000 events. Computation time for event
recognition, histogram generation and the various
correlations (e.g., type-A's from 0 to 2000 ms in
1 ms delay class intervals) totaled less than 20
hours. The reduction in time compared to processing the same data by hand is fairly obvious.
Fig. 10 illustrates one particular result obtained
from the processing just mentioned. The plot shows correlation counts on the ordinate and delay class intervals along the abscissa. The reference and
cross-reference events were both EPSP's. The plot
indicates that there is a preference for EPSP's to
occur fairly close together. This was borne out by
recalling the EPSP time interval histogram previously seen (Fig. 5). The remaining portion of the
correlogram would indicate that the activity is more
or less a Poisson process. The next illustration
(Fig. 11) is a time interval histogram from the
same experiment but at a later time. The input ac-
*These results will be detailed and the possible anatomical and physiological mechanisms underlying them will be
discussed in a subsequent paper.
Figure 10. Time interval correlation (auto-) results for the
EPSP's used for the generation of Fig. 5. The time listed in
the legend is the duration of the experimental data analyzed.
X is the expected average number of correlation counts
assuming a Poisson distribution.
tivity to the monitored neuron had increased considerably. This time interval histogram shows the
same preference of EPSP's to occur close together.
In addition, there is a suggestion of a preferred
spacing in the vicinity of 40 to 50 ms, which might
be overlooked.
The next figure (Fig. 12) is a correlation plot
similar to that in Fig. 10. It encompasses the same
data as the immediately preceding histogram. The
preference of EPSP's to occur at spacings of 50 ms
is not obscured in this plot as it was in the histogram. In the histogram, the peak occurs around 40
ms; whereas in the correlogram, the peak is at 50
ms. The difference between these two peaks is accounted for by the high preference of closely spaced
EPSP pairs obscuring the real peak in the histogram, but not in the correlogram. Furthermore,
examination of the next illustration, Fig. 13, shows
a relationship between the EPSP's and IPSP's, which
occurred between the times encompassed by Fig. 10
and Fig. 12, and which would not be discernible
from examination of histograms.
The correlograms processed to date have uncovered activity relationships between neuronal responses which were not anticipated by monitoring
during the original experiment or by histograms.
They were not anticipated for two main reasons.
First, standard histograms are not very sensitive
to relationships among responses from numerous
sources mixed together, and prior to this, the majority of analyses of the time relationships between
neuronal events has been in the form of time interval histograms. Secondly, detailed analysis of several thousand events obtained in a single experiment
has been a long, time-consuming process which few
experimenters have been willing or able to undertake.
Figure 11. Time interval histogram of data from the same
experiment as the previous figures. However, this histogram
is for a longer duration. Note the suggestion of a periodicity
at 40 ms.
REFERENCES
1. F. F. Hiltz, "A Method for Computer Recognition of Intracellularly Recorded Neuronal
Events," IEEE Trans. on Bio-Medical Engineering, vol. BME-12, no. 2, pp. 63-72 (April
1965).
2. N. R. Lakey, "A Fortran Program for Intracellular Event Recognition," IEEE Trans. on Bio-
Medical Engineering, vol. BME-12, no. 2, pp.
73-87 (April 1965).
3. F. F. Hiltz and C. T. Pardoe, "A Correlator
of Time Intervals Between Pulses," IEEE Trans. on
Bio-Medical Engineering, vol. BME-12, no. 2, pp.
113-120 (April 1965).
Figure 12. Time interval correlation (auto-) results for the
EPSP's in the same experiment. The data was for a slightly
longer duration than that of Figure 11. Note the pronounced
periodicities, and at a different period than indicated by the
histogram in Fig. 11.
Figure 13. Time interval correlation (cross-) results for the
EPSP's and the IPSP's occurring during the same experimental time encompassed by Fig. 12. Separation of the correlation peaks is nominally 50 ms, with the first peak occurring at 45 ms, compared to the 50 ms peak in Fig. 12.
INFORMATION PROCESSING OF CANCER CHEMOTHERAPY DATA
Alice R. Holmes and Robert K. Ausman
Health Research, Incorporated
Roswell Park Memorial Institute
Buffalo, New York
INTRODUCTION
Roswell Park Memorial Institute, one of the largest cancer research and treatment hospitals in the
world, conducts an extensive program of cancer
chemotherapy, the use of chemical agents in the
treatment of cancer. Roswell Park, acting as the statistical center for participating hospitals throughout
the United States, is responsible for implementation, control, and follow-up of several chemotherapy studies. To facilitate the handling of the diverse and extensive amount of data made available
by the participants, the architects of these chemotherapy studies found it desirable and necessary to
automate as many facets as was feasible. This paper
describes the procedures employed in a system
which extensively treats medical data.
GENERAL PRINCIPLES
For each study the participating hospitals
throughout the United States follow the same procedures, thus insuring essentially identical treatment
of all patients. This procedure guarantees uniformity, which facilitates statistical analyses. Prior to the
initiation of any given study, guidelines are determined and a protocol to be followed is established,
stipulating qualifications a patient must meet prior
to entry in the study, such as age, medical history,
blood picture, and previous therapy. Also specified
is the conduct of the study: the method of drug administration, dosage regimen, frequency of blood
counts, and follow-up policies. Investigators are
expected to conform as closely as possible to the
protocol.
In conjunction with the writing of the protocol is
the design of the forms that enable the physician
to collect the information requested. The proper design of these forms is extremely important, for it
determines the quantity and order of the material to
be stored in the computer for later use in several
types of statistical analyses. To facilitate the handling of the data collected, most questions are
worded for an objective answer. An important aspect of the system is the return of these forms to
the statistical unit as soon as the pertinent patient
information is available to the physician. In this
manner there is continuing assurance that the protocol is being followed correctly. In addition, the
immediate response allows for current reports on
the progress of the study.
A patient is entered in the study by a referring
physician who telephones the statistical unit. At
this point, he receives information as to the specific
drug, dosage, and frequency of administration. Subsequently, the physician receives written confirmation of the entry and procedures of therapy.
During the exchange of information between Roswell Park and the participants there are several
checks on the completeness and accuracy of the protocol procedures. If the forms received are incomplete or inaccurate, or if they contain discrepancies, physicians from the statistical center staff note
these errors. The corrections are indicated in memos, prepared by a separate semiautomated system on the IBM Magnetic Tape
Selectric Typewriter, and sent to the participants. If
corrections have not been received at the close of a
30-day period, the names of delinquents are run
through the Magnetic Tape Selectric Typewriter,
which prepares a reminder form.
Results of the 3-month follow-up examinations, perhaps the most important phase of the
studies, are submitted to determine the effect, if
any, of the chemotherapy. In this manner it is possible to tell if there has been a recurrence of cancer,
the time lapse between therapy and recurrence, or
most significantly, if there has been no recurrence
at all.
The methods of handling the data received, previous to automation, were time-consuming, clumsy, and inaccurate. The information on each patient
chart was transferred manually onto three separate
code cards. Because of the limited available space
on the cards, only a summary of the data could be
coded. For example, the total amount of drug given
was recorded, but the individual doses and dates of
administration were eliminated. In addition, only
the lowest blood count was punched in the card,
eliminating the daily record. This system made it
impossible to analyze the effectiveness of the drugs
accurately and completely.
As the number of studies grew, the staff became
aware of the absolute necessity for a method which
would allow more intricate and sophisticated recording and analysis of patient information. A
means was devised by which a greater amount of
data could be recorded with less manual labor. With
the introduction of an automated system, initiated
in 1963 and utilizing an IBM 1401, the chemotherapy studies have produced more reliable information, stricter adherence to the protocol, and less misinterpretation of data.
1965
METHODS AND PROCEDURES
The individual patient charts were chosen as the
initial input documents since they afforded the
most logical and practical method of handling raw
data. Because of the format of the charts, it was determined that most of the data could be punched
directly from the charts with an absolute minimum
amount of coding.
The first stage of automation included the design
of the punched card layout. Since the cards are
punched directly from the patient forms, the card
layout follows these forms as closely as possible.
The number of columns to be allowed for an item
provides for the highest possible value of the particular data. Two major controls incorporated on
each punched card are the card type number and
patient study number. The card type number indicates a particular card format and the information
recorded on that card. As an example, the patient's
former medical history may be recorded on one
card, while daily observations may be found on
another. The unique number assigned to each card
reveals at a glance which category of information is
contained on the card.
The patient study number, different for each patient, is punched on all cards for that patient. In the
event that one card is separated from the others belonging to a patient, the information is not incorporated incorrectly with the data recorded for another
patient.
The assignment of codes to data which have
not previously been written in numerical form must
be established in conjunction with the design of the
card layout. Examples are the indication of sex, to
be coded with a 1 or 2, or any type of yes or no
question which can be coded with a 1 for yes and 2
for no. This translation can be accomplished easily
by keypunchers.
Magnetic tape is employed as the storage device
for the records. The tape format can be designed in
several different ways. Each tape record may be
made identical to the initial input card, resulting in
a card image on magnetic tape. In this case, card
type and study number are retained on each tape
record, insuring proper identity. The second method
of formatting the tape is to combine several input
cards into one tape record. Utilizing this approach,
type and study number appear only one time. A
third way is to combine all cards for a patient into
one record; unless there is a fixed number of records for every patient, this method is not advisable.
The present application creates a tape record for
every input card. The number of records per patient
may vary from 5 to as many as 100, depending on
the number of daily observations recorded for a patient. If there were one record per patient, the record size would range from 400 characters to 8000
which would be cumbersome to program and analyze.
SYSTEM ORGANIZATION
The first program in the system writes the data
cards on tape. A standard card-to-tape routine is
used providing there are no changes or modifications in the data at the time it is written on tape.
When the tape is not formatted in the same manner as the cards, it is converted to the proper layout. This conversion may include combining two
card records into one tape record, or possibly rearranging the data so that it can be manipulated more
effectively. What may be efficient for keypunching
may not be efficient when working with the massive file of patient records on tape.
Once the tape has been created and sorted, a
complete edit is performed by the computer in order to check all the data for accuracy, completeness,
and logic or validity. If the individual who reviewed the patient form overlooked a discrepancy,
or the data was keypunched erroneously, the program detects and prints the error.
There are several different types of checks. Most
data are screened for validity. For example, months
must be numbered 1-12, days 1-31. Some data can
be coded only with certain codes (male = 1, female =
2). A code of 3 for sex would indicate an error.
A portion of the data is recorded on several different records as they occur; the dates are cross-checked in each location to insure compatibility. A
case in point is the date of surgery which appears
on three types of records. These dates must agree
since there is only one possible date of surgery in
the study. Calculating the patient's age from date of
birth and comparing it to the given age is another
important check. For these studies age is a principal factor in determining the type of therapy the
patient is to receive, and an error may wrongly exclude
the patient from the study.
In some instances the given information determines the nature of other information that should
be present. If the patient died during the course of
the study, there must be a date of death, or in the
event that a patient had a disease recurrence, the
site and system of the recurrence must be recorded.
The amount of drug given is reviewed to insure
that the dosage was in accord with the protocol.
Extremely low or high blood counts are listed and
checked later. Checks on all dates that must be later
than the date of the surgery are made to detect any
errors. All parameters specified in the protocol are
examined to insure that the patient has met the
qualifications for the study.
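A few of these edit checks might be sketched as follows (hypothetical Python; the field names month, day, sex, surgery_dates, died and date_of_death are invented for illustration and do not reflect the actual card layout):

```python
def edit_checks(rec):
    """Screen one patient record for the kinds of errors the
    edit program detects and prints.  Returns a list of
    error messages (empty when the record passes)."""
    errors = []
    if not 1 <= rec["month"] <= 12:
        errors.append("month out of range 1-12")
    if not 1 <= rec["day"] <= 31:
        errors.append("day out of range 1-31")
    if rec["sex"] not in (1, 2):          # male = 1, female = 2
        errors.append("invalid sex code")
    # The single date of surgery appears on three record types
    # and must agree in every location.
    if len(set(rec["surgery_dates"])) != 1:
        errors.append("surgery dates disagree across records")
    # Given information determines what else must be present.
    if rec["died"] and rec.get("date_of_death") is None:
        errors.append("death recorded without a date of death")
    return errors
```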
When the tape has been edited, the list of errors
is sent back to the statistical unit for correction.
This function may involve writing a letter to the
investigator for clarification, or it may represent
re-coding and repunching. After editing, the tape
is merged with the master tape, thus creating a new
master.
The update program demands a variety of routines. First, it allows for the inclusion of new patients on the master tape in the proper sequence.
Second, it permits the insertion of additional records
for a patient and also the deletion of extraneous
records. Third, this program has the flexibility necessary if data on a particular record demand alteration.
After the master tape has been updated, it is
ready for statistical analysis. A duplicate of the
master tape, denoted as a "frozen tape," is created
for the statistical work. This tape is not updated as
frequently because statisticians run correlations,
frequency distributions, and analyses which do not
allow for the daily change of the data. On the other
hand, administrators require up-to-date information to prepare reports, or to provide the investigators with current data relative to their patients. Requests for information concerning certain facets of
the study are made at frequent and irregular intervals.
As soon as a patient is entered in the study, the
initial entry form is sent to be keypunched. This
information consists of the patient's study number,
name, type of therapy, and date of surgery. The
card is written on tape. The master tape is updated
and contains at least one record for every patient
entered to date. Reports must be prepared monthly,
giving a distribution of the number of patients entered in each different drug category by hospital.
Having an automated record of each patient entered
allows for the production of reports that are current. Prompt transit of the forms to the statistical
unit is of utmost importance. If forms are overdue,
a letter is sent to each investigator. As a form is
received, it is recorded and the patient's tape record
is updated daily. Each month a delinquency program is run against the master tape to determine
which patients have outstanding forms. A personal
letter with a list of the missing forms is written to
the investigator on the computer printer. These letters have been extremely effective in reminding the
study participants that they must send information
promptly. They continue to receive a letter each
month until all delinquent forms for a patient have
come into the office.
Another control is the determination of which
patients are overdue on the follow-up section of
the study. Since the patients must be examined every three months, forms summarizing the results of
the examination are submitted. A list of overdue
patients by hospital, as well as those patients who
must be seen during the current month, is prepared
by the computer. This list gives the patient's study
number, name and date that the patient should have
been seen.
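The overdue-follow-up determination described above can be sketched as follows. This is an illustrative model only: the record layout, the field and patient names, and the 91-day interval are assumptions, not details of the actual program, which ran against the master tape.

```python
from datetime import date, timedelta

# Patients must be examined every three months; taken here as 91 days.
FOLLOW_UP_INTERVAL = timedelta(days=91)

def overdue_report(patients, today):
    """Return (study number, name, due date) for patients whose
    three-month examination form has not arrived on time."""
    report = []
    for p in patients:
        due = p["last_exam"] + FOLLOW_UP_INTERVAL
        if due <= today:
            report.append((p["study_no"], p["name"], due.isoformat()))
    # the actual listing was grouped by hospital; sorted here for brevity
    return sorted(report)

patients = [
    {"study_no": 101, "name": "DOE J", "last_exam": date(1965, 5, 1)},
    {"study_no": 102, "name": "ROE R", "last_exam": date(1965, 8, 15)},
]
print(overdue_report(patients, today=date(1965, 9, 1)))  # one overdue patient
```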
These two systems are the most effective way of
reminding the study participants. They have provided more effective control and obtain requested information with more accuracy and tess time.
Specific and individual patient or drug information which may be of extreme importance to a
particular physician is easily available. He can receive a printout for each patient which might include daily drug dose, any toxicity, and other data.
In another case, a doctor may phone to inquire
about response and/or toxicity of a particular drug
to determine if he wishes to administer this drug to
a patient. If results show that patients having a tumor and receiving a certain drug experience extreme toxic reactions, a doctor may not wish to administer the agent or may wish to give a smaller
dose.
Frequently, requests are made to submit reports
or punched cards to the National Institutes of
Health, coordinators for these chemotherapy studies. Because all conceivable patient information is
on tape, it is possible to provide any requested data
in whatever format is desired. With the former
method or present methods at statistical centers
other than Roswell Park, much of this data would
have to be obtained from the patients' charts and
either coded, if punched cards were requested, or
typed, if a report were requested. This process is
long and tedious, especially if each patient must be
recorded separately.
The ability to punch certain information from
the master tape and make graphs with a plotter that
is connected to the IBM 1620 is important, as it
eliminates the expenditure of many man hours.
Graphs are plotted for individual patients to compare the response of their tumors (e.g., did one tumor decrease, while another increased during study,
or did the tumors decrease while the patient was
receiving drug, and then increase with suspension
of the drug). Applications of this kind are extremely useful when developing reports or presentations
concerning a particular study.
SUMMARY
As a result of this system, more patient information is recorded with less summarizing and coding
of the data prior to punching and writing on tape.
The data, which always is current, facilitates the
control of delinquent forms and produces reliable
and representative reports.
The ability to perform more intricate and sophisticated statistical analyses has been enhanced by the
system because of the improved condition and increased amount of data recorded.
The accurate and close scrutiny of the progress
of these chemotherapy studies offers physicians information which previously was available to them
only through the "trial and error" method.
INFORMATION PROCESSING OF CANCER CHEMOTHERAPY DATA
Figure 1. Chemotherapy studies - system flow chart.
Figure 2. Chemotherapy studies - system flow chart.
A FACILITY FOR EXPERIMENTATION IN MAN-MACHINE INTERACTION*

W. W. Lichtenberger
Computer Center and Department of Electrical Engineering
University of California, Berkeley

and

M. W. Pirtle
Computer Center
University of California, Berkeley

INTRODUCTION

The broad objective of the project of which the work reported below is a part is to explore and develop techniques in man-machine interaction. The situation involving a person interacting with a machine in the performance of a task generally requires that the person be on-line with the machine. The amount of machine time wasted while the person is carrying out his part of a task may require, even in an experimental situation, a time-sharing system.

The Role of Time-Sharing in the Project

The time-sharing facility described below was constructed

(a) to develop and test some ideas in time-sharing a digital computer and
(b) to develop a useful facility for a variety of experimenters in man-machine interactive areas.

It should be emphasized that the time-sharing system, although general in nature, is an experimental system intended to give great flexibility and fast response to a limited number of users. (In particular, it is not designed to serve a large number of users over a broad spectrum of problems as a utility approach1,2 to time-sharing.)

*The work reported in this paper was supported by the Advanced Research Projects Agency, Department of Defense, under Contract SD-185.

Design Philosophy of the Time-Sharing System

Because of the variety of tasks to be performed by the users of the system it was felt that each user should be given, in effect, a machine of his own with all the flexibility, but onerousness, inherent in a "bare" machine. It was also felt that additional features should be provided to enable the user to reduce the onerousness, perhaps at the cost of flexibility, to the extent desired. Thus each user is given a "copy" of a slightly modified SDS 930 with 16K of fast memory. This "copy" differs from the normal Scientific Data Systems (SDS) 930 only in (1) the obvious impairment of certain real-time capabilities which result from the necessity of running programs for short, nonregular intervals, (2) the substitution of a set of instructions which initiate system-controlled input/output for the standard I/O instructions, and (3) the addition of many new (software-interpreted) instructions along with various system routines and a number of large-scale subsystems.

In order to cut down response time it was felt desirable to have more than one program present in memory and to swap with auxiliary memory only
when necessary. In practice, swapping occurs relatively frequently.
For further economy of both active and auxiliary
storage, it was felt desirable to provide for common
programs - single copies of programs shared by
more than one user. Common programs are pure
procedures with a unique copy of temporary storage
assigned to each user. Many system routines are written as common programs, as are some large-scale
systems.
The major part of the system executive, for example, is a common program. Since this part of the
executive was written as if it were dealing with one user, it was simpler to write and to debug. Furthermore, the same part of the executive would require
no changes if more central processors were added to
the system. Simplicity, small size, and flexibility
were among the goals of the system executive, and
all of these goals have been to some degree
achieved.
The project objectives made it desirable to base
all input/output around the remote consoles as
much as possible and to minimize the role of more
standard I/O equipment (cards, line printers, magnetic tapes, etc.). The user is given mass storage in
the form of either word-addressable or sequential
files and a generalized file-handling capability.
Files are independent of any peripheral I/O device
or storage medium and are addressed homogeneously regardless of their current position in the storage
hierarchy. The file-handling facilities augmented
by comprehensive editing programs provide the users at remote consoles the ability to manipulate information conveniently within the system.
SYSTEM DESCRIPTION
Local Units
As shown in Fig. 1, the system is built around a
modified SDS 930 central processor3 and a main
memory consisting of two 16K modules of core
storage. The main memory is augmented by a large
capacity drum which is in turn augmented (it is anticipated) by a mass storage unit. Filling out the
list of local components are the teletype multiplexor, the I/O processor, a 45KC magnetic tape unit,
and a 200-cpm card reader.
A cursory description of the SDS 930 central
processor and its modifications for time-sharing is
the subject of several of the following sections of
this paper; therefore, the various memory devices will
be given first attention.
The main memory consists of two modules of 16,384 words. The words are 25 bits in length (including a parity bit), and the memory cycle time is approximately 1.9 microseconds. Each of the modules
is connected to three memory buses. These buses
have fixed priorities, with the drum I/O processor,
the general I/O processor, and the CPU connected
to buses having progressively lower priority. To accommodate the high data transfer rate of the drum (525 x 10^3 words per second), the timing for the main memory units and for the CPU is derived from a timing track on the drum, and the memory addresses are interleaved between the two modules. By having the drum I/O processor reference the modules alternately, the CPU can operate at approximately 65 percent capacity during drum transfer operations. This assures that interrupt processing capability is preserved and that a significant amount of processing can accompany the data transfer operations.
The next level of storage is in the form of a
magnetic drum* having a capacity of 1,376,256
words and a data transfer rate of approximately 525
x 10^3 words per second. The drum is word-addressable to facilitate handling files and has a storage
format commensurate with its function of swapping
programs, or parts thereof, between the drum and
the main store. This format provides 84 bands of
16K words, with each band divided into 8 segments
of 2K. Each of the segments is separated by a gap
of sufficient length to allow the drum I/O processor
to accept an instruction between segments. This
feature facilitates the scatter reading and writing
necessitated by the memory page technique employed (cf. Memory Relabeling and Protection, below).
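The drum figures quoted above are mutually consistent, as the following arithmetic check shows. The only assumption is the one the text implies: during drum I/O, one word is transferred per 1.9-microsecond memory cycle.

```python
# Drum geometry: 84 bands of 16K words, each band split into 2K segments.
words_per_band = 16 * 1024
capacity = 84 * words_per_band
assert capacity == 1_376_256          # matches the stated drum capacity

segments_per_band = words_per_band // 2048
assert segments_per_band == 8         # each band: 8 segments of 2K words

# 525 x 10^3 words/sec is one word every ~1.9 microseconds -- essentially
# the full bandwidth of a single memory module.  Interleaving addresses
# across the two modules therefore leaves roughly half of each module's
# cycles free, consistent with the CPU's reported ~65 percent capacity
# during drum transfers.
demand = 525e3 * 1.9e-6               # fraction of one module's cycles
assert abs(demand - 1.0) < 0.01
per_module = demand / 2               # ~0.5 of each module when interleaved
```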
The next level of storage will be in the form of a
mass storage unit having a capacity in excess of
10^8 words and an access time not greater than 0.5
second. This unit will have some type of interchangeable cartridge, thus providing yet another level
of storage having a still greater capacity and access
time.
The teletype multiplexor consists of 16 input and
16 output buffers along with control logic to notify
the computer of buffer conditions requiring service.
The general I/O processor is another device having
*The drum is being produced to local specifications4 by
Vermont Research, North Springfield, Vermont.
[Figure 1 (a diagram in the original) shows the configuration of equipment. Local units: the modified SDS 930 CPU; memory modules 1 and 2, 16K each, with 1.9-microsecond cycle time; the magnetic drum (capacity 1.3 x 10^6 words, transfer rate 525 x 10^3 words/sec) with its drum I/O processor; the general I/O processor; the teletype multiplexor and data set coupler serving 16 full-duplex lines; the card reader; and the mass store (capacity 10^8 words). Remote units: Model 33 or 35 teletypes; CRT displays with keyboards and character and vector generators; the PDP-5 with display buffer and controller and RAND tablet; and remote computer systems. Some units shown are not yet completely specified; it is anticipated that they will be operational by spring of 1966.]
Figure 1. Configuration of equipment.
very little complexity. It is made up of a central
control channel and several independent subchannels. The subchannels operate concurrently and may
retain their requisite information (word count and
current address) internally or in· main memory.
The final two local units, the 45KC magnetic
tape unit and the 200-cpm card reader are used by
the occasional user who desires to transmit data between this and some other system via one of the
media processed by these units. In addition, magnetic tape is being used temporarily as a secondary
storage medium. This function, of course, will be
assumed by the mass store upon its acquisition.
Remote Units

The remote units include 10 model 33 and model 35 teletypes, 2 different types of CRT display-keyboard units, and a PDP-5 with a CRT display and RAND tablet.

The teletypes are operated in the full-duplex mode with each character being individually processed by the CPU. That is, the teletype keyboard and printer are treated as independent units by the system I/O programs, and, for further flexibility, the input characters to the CPU are processed on a character-by-character basis rather than on a message basis. This procedure consumes processor time (approximately 300 microseconds to input and echo a character, and 200 microseconds to output a character), but experience indicates that the capabilities thus obtained justify the processing expenditure.*

In fact, the full-duplex, character-by-character I/O philosophy will probably be carried over to the alphanumeric CRT display-keyboard stations. These stations will consist of CRT display units driven by a central buffer and control unit located in the vicinity of the stations, and keyboards which will communicate with the CPU via the central control unit.

The second type of CRT display-keyboard stations will be similar to the Culler-Fried console.5 These consoles employ a storage tube display with a script generator, i.e., a generator which produces short vectors of length and angle specified by the input character.

For research efforts requiring a more capable CRT display system, a remote unit is provided which consists of (1) a PDP-5 processor with 4K words of memory, (2) a CRT-display unit6 with character, vector, and script generators,* and (3) a RAND tablet.7 These three major components are integrated into a unit which provides the researcher with a large amount of flexibility with regard to both the use of the present system and the addition of supplemental equipment to provide still greater capabilities.

In the present system, the PDP-5 functions as a buffer and controller for the CRT display and the RAND tablet and performs some elementary operations such as smoothing the data input from the tablet. All computations are performed in the central computer.
*It is calculated that 16 teletypes executing independent input and output simultaneously at full speed consume 8 percent of processor time.

*The display is being produced to local specifications by the Burroughs Corporation, Ann Arbor Laboratory, Ann Arbor, Mich.

FEATURES OF THE MODIFIED 930 USED FOR MULTIPROGRAMMING

Modes

The role of the system monitor is unique among programs residing in the machine. Reflecting this fact, the 930 has been modified to operate either in monitor or in user mode. Monitor mode permits the monitor the use of privileged instructions and unrestricted memory. The function of user mode is the subject of the following sections.

Protection of the System from User Action

It is necessary to protect the system and all other users from certain actions of any user. Such actions include the execution of instructions which: (a) affect peripheral equipment, (b) halt computation, (c) interfere with rapid response to interrupt requests, or (d) access unassigned memory locations.

User actions of type (a) and (b) are handled merely by trapping the offending instruction (the term "trap" means an interrupt of highest priority). Instructions which fall into this category are called privileged instructions. When the 930 is in user mode, an attempt by a user to execute a privileged instruction will result in the execution of a no-op followed by a transition to monitor mode and a transfer of control to a memory location unique to the illegal instruction trap.
Type (c) actions are treated by permitting interrupt requests to preempt execute and indirect address operations. If either of these operations is in process when an interrupt request occurs, the operation is aborted and the interrupt request acknowledged. Upon return to the interrupted program the aborted instruction is begun anew. In this way, infinite indirect address and execute loops in a user's program cannot halt the system.
The solution to user actions of type (d) is related to memory relabeling and is discussed in the following section.
Memory Relabeling and Protection
The memory relabeling or paging technique 8
adopted provides both for dynamic program relocation and memory protection with no increase in
memory access time. The technique was adopted
initially because it eliminates the need to move information around within the fast memory in order
to provide space for incoming programs and because it easily provides memory protection. Other
important uses (discussed later) became apparent
as the system progressed. The implementation consists of 8 relabeling registers of 6 bits each laid out
in 2 registers as shown in Fig. 2.
[Figure 2 (a diagram in the original) shows the physical arrangement of the relabeling registers: two 24-bit registers, the first holding the four 6-bit fields R0 through R3 and the second holding R4 through R7.]
Figure 2. Physical arrangement of relabeling registers.
For purposes of relabeling, the memory is divided into 16 pages or blocks. Cells are addressed by block number and location within a block as specified by subfields of the address.
The block or page size is fixed at 2K by the 11
bits of the least significant part of the address
(which may be thought of as an address within a
page or a page address). Because the address field
of the 930 contains 14 bits, only 16K or 8 pages
are permitted each user. The upper 3 bits of the address constitute the page number. Relabeling hardware replaces the user's page number i with an ac-
593
tual page number Ri which may be different from
time to time as the program is moved in memory
(cf. Fig. 3). Because of the spatial relationship of
the page number and page address, the user is not
conscious of page structure. Note that relabeling
permits user's storage to be located in noncontiguous blocks while appearing to the user and to the
machine to be connected.
Of the 6 bits in a relabeling register, the lower 5
are used for the actual page numbers. Addresses after relabeling are therefore of length 16 bits, permitting as much as 64K of fast memory in the system. The sixth bit in a relabeling register designates
a read-only block. The facility to have read-only
storage enables users to share subsystems directly
without interference and without the necessity of
calling the monitor constantly to change relabeling.
Absolute memory protection (i.e., protection
against any reference) is accomplished by using
Ri = 0 to mean that no memory is assigned to the
page i. Any reference to a cell whose relabeling register contains zero is trapped.
Figure 4 shows a 6K memory allotment distributed
in 2K blocks at 24000, 64000, and 14000. The block
at 14000 is read-only. It may be seen that references
to any location greater than 13777 will point to one
of the relabeling registers 3 through 7, causing an
out-of-bounds trap. The choice of the combination
Ri = 0 prevents absolute memory locations 00000
through 03777 from being used for user programs,
but this is of no consequence since that area is part
of the monitor. Note that the user may transfer control, for example, to his locations 10000 through
13777, but an attempt to store information there will
cause a trap.
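The relabeling and protection behavior just described can be modelled as follows. This is an illustrative sketch rather than the hardware implementation: the register values are the octal ones of the Fig. 4 example, and traps are modelled as Python exceptions.

```python
PAGE_SIZE = 2048                 # 2K words per page; 8 user pages of 2K
READ_ONLY = 0o40                 # sixth bit of a 6-bit relabeling register

def relabel(regs, user_addr, write=False):
    """Map a 14-bit user address to an actual memory address."""
    page, offset = divmod(user_addr, PAGE_SIZE)   # 3-bit page, 11-bit offset
    r = regs[page]
    if (r & 0o37) == 0:          # Ri = 0: no memory assigned to this page
        raise MemoryError("out-of-bounds trap")
    if write and (r & READ_ONLY):
        raise PermissionError("read-only trap")
    return (r & 0o37) * PAGE_SIZE + offset        # 5-bit actual page number

# Fig. 4 example: a 6K allotment -- blocks at 24000, 64000, and 14000
# (the last read-only); registers 3 through 7 have zero page fields, so
# any user address above 13777 causes an out-of-bounds trap.
regs = [0o05, 0o15, READ_ONLY | 0o03, 0, 0, 0, 0, 0]
assert relabel(regs, 0o00000) == 0o24000
assert relabel(regs, 0o04000) == 0o64000
assert relabel(regs, 0o10000) == 0o14000   # a read of the read-only block
```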
Relabeling is always performed in user mode. It
is also possible to invoke relabeling for individual
instructions in monitor mode. In accessing memory
to obtain the effective address of an instruction,
any word encountered with Bit 0 set causes relabeling to apply immediately and for the duration of the
instruction. Thus an instruction with Bit 0 set causes
relabeling of its address, while an instruction with
a chain of indirect addresses produces relabeling
at the first instance of Bit 0 set. In the latter case subsequent references come from relabeling memory.
Mode Transitions
During the design of the modifications to the 930
it was felt desirable to make the transitions between
modes as simple and as natural as possible.

[Figure 3 (a diagram in the original) shows the action of relabeling: the user's 3-bit page number (bits 10-12 of the user's address) selects one of the relabeling registers R0 through R7; its 5-bit actual page number is placed in bits 8-12 of the memory address register, while the 11-bit page address (bits 13-23) passes through unchanged.]
Figure 3. Action of relabeling.

In particular, it was felt that there should be provided
sufficient hardware capability to insure that the interrupt routines could be independent of the mode
of the machine at the time of the interrupt and that
the system routines explicitly called by the various
programs should not require software interpretation
of
• the source of the calling program (monitor
or user),
• the location of the call,
• the location of the arguments, and
• the specific action requested.
Readers who have some experience in implementing high-speed interrupt or I/O routines will perhaps appreciate the spirit of the above objectives.
Transitions from user to monitor mode occur
only upon (1) an interrupt or trap, or (2) execution of a system programmed operator (cf. the following section). The user has no interest in or direct
concern over category (1) and does not think explicitly of the instructions in category (2) as changes
of mode.
The system initiates the transition from monitor
to user mode by transferring control to a user program. Specifically, a control transfer calling for relabeling causes a transition to user mode.
In order to provide closure in the above scheme,
the previous mode of the machine is stored as a
single bit in the subroutine link of both interrupt
and system programmed operator routines. The bit
used is the same as that used to designate relabeling. Thus at the end of the routine, when the return instruction is executed, the mode will automatically revert. If arguments are accessed through the link (cf. the following section), relabeling is or is not applied, depending on the mode storage bit.

[Figure 4 (a diagram in the original) shows the 6K allocation of the example above: user pages 0, 1, and 2 relabeled to physical blocks at 24000, 64000, and 14000 (the last read-only), with relabeling registers 3 through 7 containing zero page fields, so that any user address above 13777 causes an out-of-bounds trap.]
Figure 4. Example of a 6K memory allocation.
Table 1 summarizes the primary functions of the
modes.
System Programmed Operators
Input/output instructions are among the privileged instructions not allowed in the user's machine. The system must do all I/O for the user, and
he must therefore be able to call the system for such
services. Also the system executive requires many
complex services, some of which are potentially
useful to a user. Such services should be provided
by system calls. The system programmed operator
(SYSPOP) is the device by which such calls are
accomplished.
The SYSPOP is an extension of a normal 930
feature-the programmed operator (POP). POP's
are invoked by setting a bit in the instruction word,
and they function as a special kind of subroutine
call. In the execution of a POP the op code bits are not decoded in the usual way. Instead, they are taken to be the relative address in a transfer vector beginning at 0100₈ to which control is transferred. At
the same time the contents of the program counter
and status of the overflow indicator are stored as a
subroutine link in location 00000. The indirect address bit of this link is set as well. Single arguments
or the location of a list of arguments can thus be transmitted to the body of the POP indirectly through the link in 00000. The format of a POP is the same as that of a normal routine instruction, hence the POP is a convenient way of simulating nonexistent machine instructions.

Table 1. Summary of 930 Modes and Their Effects.

User Mode:
• All 930 instructions except privileged instructions (I/O, halts, etc.) may be executed.
• Relabeling applies to all memory references.
• Transition to monitor mode occurs on all interrupts or executions of SYSPOP's. Bit 0 is set at the formation of the first subsequent subroutine link so that memory access to data will be relabeled and so that control will revert to user mode.

Monitor Mode:
• All 930 instructions may be executed.
• Normal addressing applies to references without Bit 0. References with Bit 0 set call for relabeling.
• Transition to user mode occurs upon executing a transfer whose effective address is relabeled (i.e., Bit 0 is set).
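The POP calling sequence described above can be modelled schematically. In this sketch, memory is a dictionary and the link is a small record rather than a packed 24-bit word; it is an illustration of the mechanism, not a 930 emulation.

```python
TRANSFER_VECTOR = 0o100   # POP transfer vector begins at 0100 (octal)

def execute_pop(memory, pc, opcode, user_mode, syspop=False):
    """Store the subroutine link at location 00000 (with its indirect
    bit set) and transfer control into the transfer vector."""
    memory[0o00000] = {"return_addr": pc, "indirect": True,
                       "prev_user_mode": user_mode}  # mode kept in the link
    still_user = user_mode and not syspop            # a SYSPOP from user
    return TRANSFER_VECTOR + opcode, still_user      # mode enters monitor

mem = {}
target, user = execute_pop(mem, pc=0o2345, opcode=0o23,
                           user_mode=True, syspop=True)
assert target == 0o123            # control goes to word 0o23 of the vector
assert user is False              # SYSPOP caused transition to monitor mode
assert mem[0]["return_addr"] == 0o2345
```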
The 930 was modified so that a POP executed in
user mode with Bit 0 set causes transition to monitor mode. The user thus has the facility to jump to
a standard transfer vector in the system. Note that
the user may still implement his own POP's. The
SYSPOP's, however, give the user 64 new "machine
instructions" which do not require his memory allocation or other attention.
The reader should note that by having the mode
stored in the relabeling bit of the link, all four objectives for system routines listed in the preceding
section are accomplished; modes are completely invisible to interrupt and system programmed operator routines. Most importantly, interrupt routines
take no more time and in fact are no different from
similar routines in a non-time-sharing system.
Furthermore, the overhead associated with calls to
the system (SYSPOP's) is only 4 memory cycles.
GENERAL DISCUSSION OF
SYSTEM FEATURES
The features of the system described above
came into being through a compromise between
that which is desirable and that which is feasible to
implement time-sharing on a machine basically
not designed for time-sharing. Some of the fea-
tures have been shown to be surprisingly compact
and effective, however. For example, the SYSPOP
provides a simple but versatile system call. Also,
relabeling is not only useful for dynamic storage
allocation but provides the basic means by which
common programs can be constructed.
Method of Writing Common (Re-Entrant) Routines
By its very nature, a common routine consists of
( 1) a pure procedure-a body of code which is not
self-modifying and in which there is no temporary
storage-and (2) one or more copies of all temporary storage associated with the routine. To implement a common routine, one allocates all temporary
storage-the data block-to a unique block or blocks
of memory different from those blocks of the procedure body. Because the procedure is pure the state
of a computation at any time is determined by the
contents of the data block and the active registers.
Thus to interrupt computation for one user and
continue computation for another is merely a matter of saving and restoring active registers and changing relabeling for the data block.
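The separation of pure procedure from per-user data block can be illustrated schematically. The routine and its state are invented for the sketch; "changing relabeling" is modelled simply by handing the one shared procedure a different data block.

```python
def accumulate(data_block, x):
    """Pure procedure: not self-modifying, with all temporary
    storage confined to the caller's data block."""
    data_block["total"] += x
    return data_block["total"]

# One unique copy of temporary storage per user; one copy of the code.
user_a = {"total": 0}
user_b = {"total": 100}

accumulate(user_a, 5)
accumulate(user_b, 7)          # interleaved calls cannot interfere
assert accumulate(user_a, 5) == 10
assert accumulate(user_b, 3) == 110
```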
The only programming conventions which must
be followed in writing the procedure body are those
of avoiding self-modification. Avoiding direct
self-modification is especially easy in a machine
like the 930 which permits combinations of indexing and indirect addressing and which has an execute instruction. The programmer simply avoids
storing information within the program. Assemblers
can check programs for such storage automatically.
In addition to the constraints mentioned above,
the programmer must avoid the use of the normal
SDS 930 subroutine transfer (BRM) since it stores
the subroutine link at the head of the subroutine
and thus within the procedure body. At the moment, a SYSPOP is provided to steer the link storage indirectly one level into the data block. This is
done by placing the address of the subroutine link
at the head of the subroutine, where the link itself
would normally reside. Thus the SYSPOP SBRM y
at location p first looks in y to find a link address
z. The value p is then stored in z, which is outside
of the pure procedure. Control is transferred to y
+ 1. Return is accomplished with the normal subroutine return instruction using indirect addressing
(BRR* y). The indirect address sequence causes
the machine to look first in y to find z and then in
z to find p. Control goes to p + 1. It is anticipated
that SBRM will be incorporated as a new 930 instruction.
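The SBRM linkage just described can be modelled as follows. Memory is a simple array and the addresses are small integers chosen for the sketch; the point is only that the link lands in the data block, outside the pure procedure.

```python
def sbrm(memory, p, y):
    """SBRM y at location p: cell y (the head of the subroutine)
    holds the ADDRESS z of the link cell, which lies in the data
    block; p is stored there and control passes to y + 1."""
    z = memory[y]
    memory[z] = p              # return point stored outside the procedure
    return y + 1

def brr_indirect(memory, y):
    """BRR* y: indirect through y to z, fetch p, return to p + 1."""
    z = memory[y]
    return memory[z] + 1

memory = [0] * 64
memory[10] = 40                # y = 10; link cell z = 40 (in the data block)
pc = sbrm(memory, p=25, y=10)
assert pc == 11                # subroutine body entered at y + 1
assert memory[40] == 25        # link stored at z, not at the head y
assert brr_indirect(memory, 10) == 26   # control returns to p + 1
```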
When POP's occur within a program, it is usually desirable to have the data block at least start within the user's block 0. The use of a
POP places a return link in the user's location 0,
and this, of course, must be in the data block.
It should be noted that the utility of relabeling in
implementing common routines was not fully realized at the outset. It should also be observed that
using relabeling for such purposes restricts the common routines virtually to subsystems (compilers,
debuggers, interpreters, etc.) since an entire page is
reserved for temporary storage. Routines of this
magnitude, however, do have the most need to be
single-copy.
In our system many functions which might otherwise force users to have copies of the same little
routines in their programs are taken over by SYSPOP's. Finally, for those routines that lie halfway
between (e.g., packages of mathematical subroutines-SIN, COS, etc.) the read-only facility allows
users to share procedure storage. The only abridgment of the users' freedom here is that such routines must be located absolutely in user memory so
that they may address themselves properly without
asking the system to change relabeling upon entry
and exit.
The Structure of SYSPOP's
The SYSPOP mechanism is basic to the overall
system and is used extensively by programs at all
levels. SYSPOP routines run in monitor mode. If
called by a user the execution of a SYSPOP is part
of his program, and this is the only instance of a
user-controlled program running in monitor mode.
The absence of normal protections during such intervals imposes constraints on the program structure
of SYSPOP's as follows:
1. SYSPOP's must be written so as not to
cause disaster if erroneously called. This
feature calls for a certain amount of software interpretation, but it is on a different
level from the interpretation spoken of in
the section on Mode Transitions.
2. SYSPOP's are normally small and of short
duration. Because they share the same link
it is difficult to make SYSPOP's re-entrant without time-consuming maneuvers.
The SYSPOP's are therefore not re-entrant and contain their own temporary
storage.
3. Since SYSPOP's are not re-entrant and
since they are shared by all users and by
most parts of the system itself, program
interruption is handled by allowing a SYSPOP in process to go to completion. This
is done by having all SYSPOP's return
control through a common (one-instruction) routine.
The reader should note the merit of the modechanging scheme discussed in the section on Mode
Transitions, as reflected in resulting simplicities in
SYSPOP's. Recall that the mode of the calling program is stored in Bit 0 of the link, and that the SYSPOP accesses the calling parameters indirectly
through the link. If a user calls a SYSPOP, relabeling will be applied to accesses of calling parameters, otherwise not. When returning control the SYSPOP executes a return instruction. If Bit 0 of the
link is set, relabeling is applied and the mode is set
back to user mode. Thus modes are completely invisible
to SYSPOP's.
SUMMARY
The project goals discussed at the beginning of
this paper have been met. The time-sharing system
involving memory relabeling, common routines, and
duplex teletype operation has been in operation
since April, 1965. The system is highly flexible and
can provide, for users who require it, a response
time of less than one second.
It should be noted that memory relabeling* is
accomplished with no increase in access time. The
number of processor modes is small (two), and mode transitions are done in such a way as to enable interrupt and user-called system routines to be
independent of mode.
The user machine is clean and well defined. Input/output is simpler, more foolproof, and device-independent. The user is given a variety of other
services ranging from generalized file-handling
capability to string processing to assemblers, compilers, debuggers, and editors.
ACKNOWLEDGMENTS
The authors would like to acknowledge the direction and valuable advice of Professors Harry D.
Huskey and David C. Evans, co-principal investigators. Mr. W. J. Sanders made many valuable contributions to the early system design and SDS 930
modifications. The time-sharing executive system⁹
and most of the subsystems were written by Mr.
Peter Deutsch and Mr. Butler Lampson, who suffered through seasons of balky, fidgety hardware
and primitive input/output to produce an excellent
result.
REFERENCES
1. R. M. Fano, "The MAC System: The Computer Utility Approach," IEEE Spectrum, vol. 2,
no. 1, pp. 56-64 (Jan. 1965).
*The technique of relabeling was developed by M. Pirtle
in April 1964 and was implemented in two weeks on the 930
upon delivery the following November.
2. J. I. Schwartz, "A General Purpose Time-Sharing System," Proc. AFIPS Conf., 1964, vol. 25,
pp. 397-411.
3. W. W. Lichtenberger, M. W. Pirtle and W. J.
Sanders, "Modifications to the SDS 930 Computer for the Implementation of Time-Sharing," Document no. 20.10.10, Project GENIE, University of
California, Berkeley (Jan. 1965).
4. M. C. Hurley, "Drum Storage System Preliminary Reference Manual," Document no. 20.70.20,
Project GENIE, University of California, Berkeley
(Mar. 1965).
5. G. J. Cullen et al., "TRW Two-Station On-Line Scientific Computer," vols. II and IV, TRW Space Technology Laboratory (July 1964).
6. G. D. Hornbuckle, "Display System Specifications," Document no. 20.60.10, Project GENIE,
University of California, Berkeley (Jan. 1965).
7. M. R. Davis and T. O. Ellis, "The RAND
Tablet: A Man-Machine Graphical Communication Device," Proc. AFIPS Conf., 1964, vol. 26,
part I, pp. 325-331.
8. J. B. Dennis, "Program Structure in a Multi-Access Computer," Technical Report MAC-TR-11, Project MAC, Massachusetts Institute of Technology (1964).
9. L. P. Deutsch and B. W. Lampson, "SDS 930
Time-sharing System Preliminary Reference Manual," Document no. 30.10.10, Project GENIE, University of California, Berkeley (Apr. 1965).
A TIME- AND MEMORY-SHARING EXECUTIVE PROGRAM
FOR QUICK-RESPONSE ON-LINE APPLICATIONS
James W. Forgie
Lincoln Laboratory*
Massachusetts Institute of Technology
Lexington, Massachusetts
*Operated with support from the U. S. Air Force.

INTRODUCTION

APEX is an experimental operation-oriented on-line data analysis system being developed for the TX-2 Computer at MIT Lincoln Laboratory. This paper describes the executive program which has been designed to satisfy the needs of that system as well as the other activities currently soaking up the computational energies of TX-2. These needs are developed into a set of requirements for a fast-response time-sharing system. The requirements, in turn, lead to a series of design decisions which involve both the hardware and software parts of the system. A memory- and program-sharing system, with the hardware to make such a system efficient, takes the form of a complex of apparent computers, one for each console, which share some common hardware (TX-2). The salient characteristics of these computers are described, as well as the executive program structure which gives them apparent reality.

BACKGROUND

The TX-2 Computer, an experimental facility at MIT Lincoln Laboratory, has been in operation since 1960. Never a service facility, the computer has been used principally in a number of long-term research projects which have taken advantage of the special input/output capabilities and direct accessibility of the machine. These projects have included such areas as graphics, waveform processing, and pattern recognition. Most of the work on the computer has involved real-time inputs, interaction with output displays, or both. The computer has always been used as an on-line facility, with the bulk of the computer time being used in sessions of several hours duration. Programming has been carried out in machine language, augmented in the past by a number of personal macro languages and recently by a more general macro language for list processing (CORAL). An on-line macro assembler, MK 4, has been used both as an assembly program and an on-line operating system by most users. In the fall of 1963 it was decided to realize on TX-2 an experimental operation-oriented on-line system in order to study man-machine interaction in problem solving. This system (APEX) is designed to allow the scientist or engineer to make use of the computer throughout his work on a data analysis problem without having to be concerned with many of the details ordinarily involved in programming a computer. The system is
based on a library of computational and display
routines which may be called directly by the user in
an appropriate problem-oriented language. For a
problem area in which library routines exist, it is
expected that individual library routines or short
combinations of routines will suffice for a high percentage of the total operations needed. In order to
handle the few remaining cases, it is desirable to
have within the system special and/or general compilers which the user can utilize to create the occasional pieces of program which he may need to
complete the solution to his problem. A person using a system of this sort will probably spend much more time looking at his displays and thinking about what to do next than he will
spend actually doing computations. Economics then
dictate that multiple consoles should be provided
for the computer, and the computer facilities shared
among these consoles. An executive program which
will handle such sharing of the computer facilities
and related problems of storage allocation and communication becomes desirable to complete the system.
In January of 1964 a firm commitment was made
to undertake the realization of such an operation-oriented on-line system on TX-2. Since experimentation with the new system would put pressure on the already full schedule of TX-2, a further requirement was placed on the design of the new system, namely, that its executive program should allow for the use of TX-2 in something approaching its accustomed style at the same time that the system was running. Thus the advantages of time-sharing could be made available to the existing users of the machine. This paper discusses the design
of the executive system which has grown out of
those decisions.
SYSTEM REQUIREMENTS

The design requirements for the APEX executive system may be briefly stated as follows:

1. Time-Sharing. The system should provide for time-sharing essential computing facilities among a small number of consoles (perhaps half a dozen), most of which would have oscilloscope displays as well as the usual keyboard and typewriter.

2. Fast Response. The on-going activities in graphics, waveform processing, and pattern recognition all involved the use of interactive displays. It appeared that response times in excess of one second would noticeably degrade the performance of already existing programs in these areas. In addition, the proposed experiments with the operation-oriented on-line system called for the ability to degrade response time in order to measure its effect on the user. Thus, all proposed applications of the system called for fast response under most circumstances.

3. Retention of Results. The executive should assume responsibility for the implicit retention of all program and data files whose destruction was not specifically ordered by the user or his program.

4. Subroutine Autonomy. The executive should allow any program, written as a closed subroutine and following certain conventions, to be run as an independent program making full use of core storage addresses and index registers. Routines to be run in this fashion should be precompiled and stored in absolute binary form. They should be completely independent of the routines which call them and may thus be called recursively. The executive should provide isolation and protection for such routines and facilitate the passing of parameters to them. This requirement for subroutine autonomy was intended to allow both fast operation of library routines, by eliminating compilation or relocation time, and relative simplicity of programming, by minimizing the number of conventions such routines must follow.

5. Flexible Input/Output Services. The executive should handle the details of all input/output operations. It should provide continuity for displays and keyboard-typewriter conversation. It should provide for sharing of common I/O devices such as printers and magnetic tape. Insofar as possible, it should leave formats and the interpretation of data to user programs.

DESIGN DECISIONS

Early in the design phase of the executive program a number of policy decisions were made which had considerable effect on the character of the final program. Among these were the following:

1. Memory-Sharing. It appeared from the outset that if the requirements for fast response were to be met, it would be necessary to keep some part of each user's program in core at all times. The size of the TX-2 memory (97K) made this feasible for a
small number of users. In order to facilitate memory-sharing it was decided to add relocation and
bounding hardware to the computer and to provide
some executive services which would make it easier
for programmers to break large program structures
into pieces of manageable size.
2. Program-Sharing. If memory-sharing was to operate efficiently, it was obvious that large public routines such as compilers should be written in re-entrant form, so that they could be shared by all
current users. The TX-2 order code allows this
kind of program to be written without any special
difficulty. It was decided to add certain features to
the executive to facilitate the operation of re-entrant programs and to add the necessary hardware
to protect them.
3. The executive should simulate an apparent
computer for each console. The requirements of the
operation-oriented on-line system could have
been met by a highly specialized executive program,
but such a design would not have satisfied the needs
of the on-going research projects already using
TX-2. Their needs would, perhaps, have best been
served by a time-sharing system which provided
the entire facilities of the computer for each user in
turn. The multiple sequence design of the TX-2
input/output system made the realization of the latter design appear unreasonably complex. The realization of a simulated computer similar, but not
identical, to TX-2 seemed a reasonable compromise between these two requirements.
4. There should be no direct communication
between the user and the executive. All user commands should be passed through the executive to
programs operating within the simulated computer
for that console. Communications involving the executive are then passed back from such a program
to the executive. It appeared both unnecessary and
undesirable to tie the system to any language conventions by building these into the executive.
5. Insofar as possible, software features would
be realized in programs operating in the simulated
computers. This decision allowed the executive to
be as simple as possible and permitted expansion of
the overall software structure without having to
modify the executive program.
6. Compatibility between former TX-2 programs and programs which would operate within
the simulated computer was not to be a requirement. The design of the simulated computer should
be made to correspond to TX-2 whenever possible
and reasonable. But it was expected that some
change would have to be made in all programs to
accommodate the different input/output characteristics of the executive and to take advantage of the
storage allocation facilities provided by the executive.
7. Hardware changes to the TX-2 system were
to be considered as legitimate variables in the design work. The computer engineering group was
prepared to make reasonable and compatible modifications to the computer when such changes appeared to be the desirable and economical solution
to the design problem. Throughout the development
period of the executive program there was considerable interaction between hardware and software designs and designers, and major changes have been
made in the TX-2 computer to facilitate the
APEX executive system. These include the addition
of a file memory (a UNIVAC Fastrand Drum),
hardware to trap the attempted execution of privileged instructions, and four memory-snatch channels to increase the efficiency of high speed I/O
operations. The most significant change was the addition of a hardware system called SPAT (an acronym for Symbolic Page Address Transformation).
SPAT, which has been in operation since January
1964, utilizes a 1024-word thin film memory and
high-speed transistor circuitry to realize a 3-level address transformation within a single TX-2
clock pulse time (0.4 microseconds). This transformation makes available to the executive the advantages of paging, segmentation, and complete memory protection. It greatly reduces the overhead involved in memory- and program-sharing.
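The address arithmetic behind SPAT can be checked with a short sketch. The field widths (4-bit book, 5-bit page, 8-bit register) follow from the figures given later in the paper (pages of 256 registers, books of up to 32 pages, 16 books in a 17-bit address space); the page-table contents and function names here are invented for illustration.

```python
# Sketch of a SPAT-style address split: a 17-bit address selects one of 16
# books (4 bits), one of up to 32 pages per book (5 bits), and one of 256
# registers per page (8 bits). The page table here is invented for
# illustration; real SPAT resolves this in hardware in one 0.4-us clock pulse.

PAGE_SIZE = 256          # registers per page
PAGES_PER_BOOK = 32      # so a book holds up to 8,192 registers
BOOKS = 16               # 16 books x 8,192 registers = 2**17 addresses

def split_address(addr):
    assert 0 <= addr < BOOKS * PAGES_PER_BOOK * PAGE_SIZE
    book = addr >> 13            # top 4 bits
    page = (addr >> 8) & 0x1F    # next 5 bits
    register = addr & 0xFF       # low 8 bits
    return book, page, register

def translate(addr, page_table):
    """Map an apparent address to a real core address via a per-(book, page)
    frame table; a missing entry models an incomplete or empty book."""
    book, page, register = split_address(addr)
    frame = page_table.get((book, page))
    if frame is None:
        raise MemoryError("boundary violation: page not in core")
    return frame * PAGE_SIZE + register
```

Since the apparent address space (2¹⁷ registers) exceeds real core, the missing-entry case is the normal way incomplete books are represented.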
CHARACTERISTICS OF THE APPARENT
COMPUTER
The APEX Executive Program simulates an apparent computer for each console. These apparent
computers may be viewed as being somewhat restricted replicas of TX-2 augmented by features
provided through the executive program. The core
storage for each apparent computer is bounded and
segmented and limited in total extent to approximately two-thirds of the TX-2 core capacity.
The order code for the apparent computer is that
obtained by eliminating input/output and multiple
sequencing instructions from the TX-2 order
code, and then adding some executive calls to handle input/output, file maintenance, and storage allocation in the apparent computer. The number of
index registers is reduced to 15, and some restrictions are placed on the choice of machine configurations available. The apparent computer is a single-sequence (program counter) computer in the
current version of the system, but the hardware allows for future expansion to three sequences. In
general, programs written for TX-2 will not operate in the apparent computer, and vice versa. However, programs which do not involve I/O operations may often be transferred with little or no
change.
The storage structure of the apparent computer
takes advantage of the SPAT address transformation hardware in TX-2. The SPAT hardware
breaks up core storage into pages of 256 registers.
These are organized into books (segments) of up to
32 pages (8,192 registers). The 17-bit addressing
capability of TX-2 allows 16 such books to be selected by the 4 highest order address bits. Since the
apparent addressing capability of the machine exceeds the real core (currently 97K), some of the
books must always be incomplete or empty. In the
apparent console computers realized by the APEX
executive program, the user's programs and data are
organized into files. A file is a contiguous group of
registers which must be some integral number of
pages in length. A file always has a name which is
known to the APEX file directory. Files may exceed one book in length, but they must begin at the
start of the book, and no more than one file may occupy any one book. Executive calls in the user's program control which files are to appear in core at any one time. A file may be set up in a book specified by the directory, as is usually the case for program files, or it may be set up in an arbitrary book
according to the requirements of a program which
is to process it. All files begin as working storage
files with ephemeral names. When a program has
placed information in such a file, an executive call
may be given to assign a permanent name to the
file. The naming call may also specify that the file
is to be a read-only file in future appearances in
core, that it may be operated as a program file, and
that it must be set up in a particular book when
called as a program. After the file has been given a
permanent name, it will remain in the file memory
until it has been discarded by a call from the user's
program. Thus, all data files which the user has had
occasion to name will be retained from one session
to the next. We refer to the complex of files set up
in the user's apparent memory at any one time as a MAP. A MAP may be thought of pictorially as we
see in Fig. 1 for a typical setup of matrix routines.
However, a MAP may be equally well described as
simply a list of names of files together with the
book numbers in which each is to appear in the
MAP.

[Figure 1. An APEX core MAP for the matrix operation A × B → C: books shown as dashed outlines (32 pages maximum each), with the SERVICE and CONNECTOR files occupying the highest-numbered books.]

In Fig. 1, the dashed lines indicate the potential capacity of each book, while the solid lines indicate the actual core occupied by the file named
within the block. More will be said about the contents of this MAP after a short discussion of the
way the APEX system handles library routines.
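In outline, a MAP is no more than a correspondence between file names and book numbers. The following sketch (an invented API, not the actual APEX calls) enforces the two rules stated above: a file begins at the start of a book, and no two files share a book.

```python
# A MAP, as described above, is just a list of file names with the book each
# occupies. This sketch (invented API) enforces the stated rules: a file
# starts at the beginning of a book and no two files share a book.

def build_map(entries, file_lengths):
    """entries: list of (file_name, starting_book); file_lengths: pages per file.
    Returns {book_number: file_name} for every book the MAP occupies."""
    PAGES_PER_BOOK = 32
    occupancy = {}
    for name, start_book in entries:
        pages = file_lengths[name]
        books_needed = -(-pages // PAGES_PER_BOOK)  # ceiling division
        for book in range(start_book, start_book + books_needed):
            if book in occupancy:
                raise ValueError("book %d already holds %s" % (book, occupancy[book]))
            occupancy[book] = name   # a file may span several whole books
    return occupancy
```

A MAP in this form is exactly the "list of names of files together with the book numbers" description given in the text.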
One of the principal design requirements for the
APEX system was to provide a means whereby library routines (or arbitrary user subroutines) could
be called into core and operated without any conflict between the core addressing requirements of
the routine and the program which called it. This
requirement is met by providing a fresh MAP for
the called routine. When a program wishes to call a
library routine to be operated in a new MAP, it
does so by issuing a GO UP call to the executive,
passing along the name of the library routine as a
parameter of the call. This process is called "going
up" because the new MAP is thought of as being
placed on top of the MAP which contained the
calling program. Since normal operation of the
APEX system involves multiple MAPS, the typical
situation at some instant of time will be a stack of
MAPS such as illustrated in Fig. 2.

[Figure 2. Typical MAP stack for a simple library routine operation: MAP 1 at the bottom, MAP 2 (a translator), and MAP 3 (the library routine) on top.]

This figure
shows the stack which will result in the simple case
of a user having logged-in and used a translator to
call a library routine. At the instant of time represented by Fig. 2, the library routine is presumed to
be in operation. When it finishes, the stack will return to two MAPS in depth as control returns to the
translator.
Let us look in a little more detail at the operation of going up to a new MAP. The GO UP call
will cause the executive to produce an entirely new
MAP containing the library routine, which it sets
up according to the directory specifications for the
routine. In addition to the library routine, the
new MAP will contain two other standard files.
One, called the CONNECTOR file, is common to
all MAPS and is used to pass parameter information and to provide small amounts of working storage for the library routines. The first register of the
CONNECTOR file is used to indicate the first free
register in the file. The contents of this register are
noted by the executive on going up and are restored
on "peeling back," as the operation of returning to
a lower MAP is called. The other standard file,
called the SERVICE file, contains a number of often-needed small routines such as those for floating-point arithmetic and the manipulation of calling
parameters. As shown in Fig. 1, these two standard files are located in the two highest-numbered
books in the MAP. If, as is usually the case, parameters are to be passed to the library routine,
they are placed in the CONNECTOR file in a
standard format by the calling program. The location of these parameters is passed to the executive
as a second parameter on the GO UP call. The executive then writes a PEEL BACK call into the
CONNECTOR file at the proper location and passes control to the library routine as if it had been
entered by a standard subroutine call instruction.
The library routine then finds its parameters in a
standard calling-sequence relationship to the index register contents which specify its return point.
It inspects the parameters and calls such other files
as it may need to carry out its mission. Upon completion of its operations, it makes a standard subroutine exit, which transfers control to the PEEL
BACK call which the executive had placed in the
CONNECTOR file. The PEEL BACK call causes
the executive to discard the MAP containing the
library routine and return control to the calling program. This procedure assumes that the output produced by the library routine has either been placed
in a file whose name was supplied in its calling parameters, or that its output was placed in the connector file at a location which was supplied in the
calling parameters.
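The going-up and peeling-back bookkeeping described above can be sketched as follows. The class and field names are invented; only the CONNECTOR free-pointer discipline from the text is modeled.

```python
# Sketch of the GO UP / PEEL BACK discipline (invented names beyond those in
# the text). The CONNECTOR file is common to all MAPS; its first register
# points at the first free register, noted on GO UP and restored on PEEL BACK.

class MapStack:
    def __init__(self):
        self.stack = []                       # MAPS, bottom first
        self.connector = [1] + [None] * 255   # register 0 = first free register

    def go_up(self, routine_name, params):
        free = self.connector[0]
        for i, p in enumerate(params):        # place parameters in standard format
            self.connector[free + i] = p
        self.stack.append({"routine": routine_name, "saved_free": free})
        self.connector[0] = free + len(params)
        return free                           # parameter location passed on the call

    def peel_back(self):
        top = self.stack.pop()                # discard the finished routine's MAP
        self.connector[0] = top["saved_free"] # restore the CONNECTOR free pointer
        return top["routine"]
```

Nesting works automatically: each GO UP notes the free pointer it found, so each PEEL BACK releases exactly the working storage that level consumed.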
The way in which the APEX system handles library routines has a number of advantages. First, the
library routine is written as an ordinary closed subroutine and is not in itself concerned with going
up or peeling back, unless it needs to call another
routine in the course of its operation. It may therefore be operated either by going up to a new MAP
or by being brought into core as a part of the MAP
containing the calling program and called as a subroutine. The latter mode of operation has speed
advantages but is limited to situations where core
assignments and index register usage are compatible. Checking for compatibility is left to the programmer in this case. A second advantage comes
about because the MAP changing facility is not
limited to library routines but is available to arbitrary user programs. The MAP stack then aids the
programmer in putting together large complicated
program structures which may exceed both the real
core and the core addressing capacities of the machine. However, he must keep in mind that changing MAPS involves a bookkeeping overhead for the
executive and may involve file memory swapping
(at disc speeds) if real core capacity is exceeded.
The APEX system also uses MAPS to handle interrupts. The user's program may define special
MAPS called GHOST MAPS which may be individually associated with all sources of interrupts.
When an interrupt occurs, the associated GHOST
MAP is placed on top of the MAP stack, and control is passed to the program there. For alarming
situations such as illegal instructions, boundary violations, and I/O troubles, a special HELP GHOST
MAP is provided, which automatically takes the
user to a fixed public routine which straightens out
his I/O problems of the moment, if any, and then
sets up a basic translator which allows him to call
debugging or other routines to his aid. Note that in
this situation, the HELP GHOST MAP has suspended the operation of his program and has given
him the full use of his apparent computer to work
on his trouble. In addition, the MAP stack has preserved all that was known about his program structure at the time this interrupt occurred. He may be
able to fix the trouble and continue, discarding only
the HELP structure at the top of his stack, or he
may elect to start again at the bottom, forgetting
everything about his old structure. GHOST MAPS
may be defined which will intercept all interrupts
according to arbitrary priorities, but the HELP
GHOST MAP may not be overridden and is always
there should some other GHOST MAP get into
trouble itself.
STORAGE AND RETRIEVAL FACILITIES
FOR THE APPARENT COMPUTER
One of the major tasks of the executive system is
to remember the user's data and program files as
well as other quantities which he may find useful in
maintaining his continuity of operation from one
session to another. A portion of the executive called
the Librarian maintains a private directory for each
user as well as a public directory which is shared by
all users. A number of calls are available to the
simulated console computer to allow a user's program to insert items into his private directory and
to inquire about these items and others in the public directory. Items remembered through a directory
are identified by names. A name is a string of up to
50 characters. The characters which may be used
are restricted to Roman capital letters, Arabic numerals, and period. The directory itself is a list structure arranged in the form of a tree to give a logarithmic search· for names. Once a name has been entered into the directory, a unique name block is
created within the list structure, and the pointer to
that name block is used as a compact and more efficient substitute for the original string of characters. Remembering an item in the directory involves
two calls. The first call asks the Librarian to accept
a string of characters and to return the related name
pointer. The second call uses that name pointer together with the necessary defining information to
request the Librarian to establish an association between the name and the item to be remembered.
The directory can keep track of the following kinds of items, either directly or by way of the file memory.
1. Files. A file is any contiguous group of memory registers. As discussed above in connection
with the storage structure of the apparent computer,
the directory contains information concerning the
protection status and origin (if any) for each file,
and knows whether or not the file contains program
or data. In the case of a data file the type and kind
of data which the file contains is known only by the
internal format of the file and not by information
in the directory.
2. Scalars. A scalar is a single-register quantity remembered directly within the directory. The
directory scalar is useful for allowing the user to remember, from one session to another, quantities
which are not part of some fixed program. It is also
useful in allowing re-entrant public routines to
remember certain parameters from one usage to the
next.
3. Entrances. An entrance is a number associated with a file. A program file may contain a number of related routines which perfor'm different
functions. Entrances can then be used to call these
different routines by entering the program file at
different locations. If a GO UP to a new MAP call
is given to the executive and the parameter on that
call specifies an entrance, the specified file will be
set up (if it is a program file), and control will be
transferred to the location specified by the entrance.
Entrances may also be used in connection with data
files. For example, an entrance may identify the
start of a particular ring in a list structure.
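Resolving a GO UP parameter that may name either a file or an entrance can be sketched in a few lines; the representation and function name are invented for illustration.

```python
# Sketch of entrances (invented representation): an entrance is a number
# associated with a file, used to enter a program file at a chosen location.

def go_up_to(directory, name):
    """Resolve a GO UP parameter that may name either a file or an entrance;
    returns (file_name, starting_location)."""
    kind, value = directory[name]
    if kind == "entrance":
        file_name, location = value
        return file_name, location      # enter the program file at the entrance
    if kind == "file":
        return name, 0                  # enter at the start of the file
    raise ValueError(name + " is not executable")
```

One program file can thus carry several related routines, each reached through its own entrance name.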
A TlME- AND MEMORY-SHARING EXECUTIVE PROGRAM
4. References to Public Names. A reference to a
public name is a device for allowing a user to use a
name of his own choosing for a public name which
is unsatisfactory to him.
5. File Groups. The file group, as the name implies, is merely a related collection of files. Its existence in the directory allows a related group of files
set up in memory by means of a single call. For example, consider the case of using a general translator to translate a particular problem-oriented language. In addition to the file containing the translator itself, a file of definitions for the particular language and a suitable working storage file must be
set up before any translating can begin. Treating
them as a file group allows the executive to get
them all into core and set up before any attempt is
made to run the program.
The directory not only maintains relationships
between names and things but it also maintains relationships between names. A synonym call is available which allows the user program to indicate that
a particular item in the user's directory is to have
a second synonymous name. A name may have any
number of synonyms. These are added one at a
time by the SYNONYM call and may be removed
one at a time by the UNDEFINE call. If all of the
names have been removed by the UNDEFINE call,
the original named entity will be forgotten by the directory and destroyed. A DROP call is available
which will cause all the names to be undefined and
the entity destroyed with a single call. Synonyms are
useful for abbreviation and parameter substitution.
They are handled by the executive rather than left
to particular translators because they are felt to be
language-independent relationships which should be
closely tied to the items remembered via the
directory.
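The SYNONYM, UNDEFINE, and DROP semantics just described can be modeled in a few lines; the names and data structures are invented stand-ins for the real directory.

```python
# Sketch of the SYNONYM / UNDEFINE / DROP semantics described above
# (dictionaries stand in for the real directory list structure).

class Directory:
    def __init__(self):
        self.names = {}      # name -> entity id
        self.entities = {}   # entity id -> set of names pointing at it

    def define(self, name, entity):
        self.names[name] = entity
        self.entities.setdefault(entity, set()).add(name)

    def synonym(self, existing, new_name):
        """SYNONYM: give an existing item a second, synonymous name."""
        self.define(new_name, self.names[existing])

    def undefine(self, name):
        """UNDEFINE: remove one name; the last name removed destroys the entity."""
        entity = self.names.pop(name)
        self.entities[entity].discard(name)
        if not self.entities[entity]:
            del self.entities[entity]    # forgotten and destroyed

    def drop(self, name):
        """DROP: undefine every name of the entity and destroy it in one call."""
        for n in list(self.entities[self.names[name]]):
            self.undefine(n)
```

The sketch makes the stated invariant explicit: an entity lives exactly as long as at least one name in the directory still points at it.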
INPUT/OUTPUT FACILITIES IN THE APPARENT COMPUTER
The input/output devices available to the console
computers may be split into two categories. The
first contains those which must be shared among
the consoles because there are not as many devices
as consoles, and the second is made up of those devices located at each console. The first class of devices may be more important in this time-sharing
system than in many others because the majority of
the consoles will be located in the computer room,
and the users at those consoles will have easy access
to the common shared devices. The shared devices
may be again split into two classes. The first of
these includes those devices which are assigned to
the individual consoles on a first-come first-served
basis. If the user on one console wishes to use such
a device and finds that it is assigned to another
console, he must go and negotiate with the user at
the other console for its release. Devices in this
class include magnetic tape, the photoelectric paper
tape reader and the analog-to-digital input. The
second class of shared device includes the xerographic high-speed printer and the paper tape punch.
For these devices the executive maintains a pseudo
input/output device which accepts user calls to build files of characters to be printed or punched at some opportune time. These buffer files are saved in the file memory until they can be processed. These devices appear to be always available to the console computers even though the actual physical output may appear at some later time. These shared input/output devices can produce a considerable load
on the executive program and their presence in the
system posed a number of detailed problems to the
designers of the executive, but the solutions to these
problems are too specialized to the nature of TX-2
to warrant further discussion of them here.
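The pseudo-device arrangement for the printer and punch amounts to a spool queue, which can be sketched as follows (the class and method names are invented).

```python
# Sketch of the pseudo output device for the printer and punch (invented
# names): user calls build buffer files, which are queued in the file memory
# and drained when the real shared device becomes free.

from collections import deque

class PseudoPrinter:
    def __init__(self):
        self.queue = deque()            # buffer files awaiting the real device

    def submit(self, buffer_file):
        self.queue.append(buffer_file)  # always succeeds: device seems available

    def drain(self, device_write):
        """Run when the shared printer is actually free; output appears later,
        in submission order."""
        while self.queue:
            device_write(self.queue.popleft())
```

This is why the device "appears always available" to every console computer: submission never waits on the physical printer.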
The input/output devices available at the individual consoles were of particular interest in this
system because they are the means by which the
user interacts with the system. There are, of course,
differences in the makeup of individual console
equipment. But the basic console is made up of a
keyboard, a typewriter, an oscilloscope display with
light pen, and a few push-button switches. A
RAND tablet will be available on one console and
some of the consoles will have a connector with 36
output and 36 input wires to which a user can connect special equipment of his own. The keyboard
and typewriter are basically the same equipment
that has been used for TX-2 on-line communication and paper tape preparation in the past. The
color-coded keyboard has a double set of keys,
eliminating the need for case shifting. The typewriter has a platen rotator which allows super- and
subscripting. The keyboard has the full Roman alphabet only in capitals, thus allowing more than the
usual number of special symbols. The character set
allows for the very nice typing of mathematical
expressions, but English text is singularly poor because of the lack of a full set of lower case Roman letters.

PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965

The keyboard has seven so-called "function keys" for which there is no direct typewriter
key. These keys are available for particular interpretation by programs operating within the console
computer. The keyboard sends its output signals
only to the computer, and the computer must send
signals to the typewriter to type back what the user
has keyed in. The executive handles this typing directly. When a key is struck, the executive places
the character code for that key directly into the keyboard buffer file for the appropriate console computer. At the same time the executive examines
each character to see if it is one of a set of terminating characters specified by the user's program. If
the character is a terminator, and if the user's program has gone into an inactive state waiting for
further typing to be complete, then the executive
will put the user's console program into an active
state. In order for the user's program to gain access
to the keyboard input, the keyboard buffer file must
be set up as a part of the user's MAP. Both the
user's program and the executive may write in this
file, and it is the responsibility of the user's program to write empty marks in those registers from
which information has been extracted. If a key is
struck and the executive finds that the next location
in the buffer file is not empty, it will interrupt the
user's program and switch to the HELP GHOST
MAP structure. This method of handling keyboard
inputs is well suited to the ordinary translator form
of user program which completes an operation before returning to look at the input for the next command. In the case where it is desired to use the keyboard to interrupt an on-going program, a
GHOST MAP type of interruption mode is available for the keyboard. In this situation, the occurrence of a terminator character will cause the prespecified keyboard Ghost Map to be put into operation.
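This keyboard discipline — a shared buffer file into which the executive writes character codes, empty marks written back by the user's program, a wakeup on terminator characters, and escape to the HELP GHOST on overflow — can be sketched as follows. All names (`KeyboardBuffer`, `key_struck`, and so on) are invented for illustration; the sketch models the behavior described, not the actual TX-2 data formats.

```python
EMPTY = None          # the "empty mark" the user program writes back

class KeyboardBuffer:
    def __init__(self, size, terminators):
        self.slots = [EMPTY] * size
        self.head = 0                 # next slot the executive fills
        self.tail = 0                 # next slot the user program reads
        self.size = size
        self.terminators = terminators
        self.program_active = False
        self.help_ghost_invoked = False

    def key_struck(self, code):
        """Executive side: called on every keystroke."""
        if self.slots[self.head] is not EMPTY:
            # The next slot was never emptied: interrupt the user's
            # program and switch to the HELP GHOST MAP structure.
            self.help_ghost_invoked = True
            return
        self.slots[self.head] = code
        self.head = (self.head + 1) % self.size
        if code in self.terminators:
            self.program_active = True   # wake the waiting program

    def read_char(self):
        """User-program side: extract a character, write an empty mark."""
        code = self.slots[self.tail]
        if code is EMPTY:
            return EMPTY
        self.slots[self.tail] = EMPTY
        self.tail = (self.tail + 1) % self.size
        return code
```

Filling every slot without emptying any reproduces the overflow case in which the HELP GHOST is invoked.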
The typewriter has three functions in this system.
It normally types back the keys struck by the user.
This mode may be disabled in situations where the
keyboard is to be used as an input to the oscilloscope. The typewriter will also type messages from
the user program as well as messages from the executive itself. The latter have the highest priority, with
messages from the user's program second, and typing back of the user's input as the lowest priority
task of the typewriter. In the case where the typing
back of input has been interrupted because of a system message or a user program message, the user's
keyboard is locked until the messages are complete.
If, in this locked keyboard situation, he wishes to
interrupt a message from his program, he may push
the HELP REQUEST button on the console and go
to the HELP GHOST MAP for corrective action.
He may not under any circumstance interrupt a
message from the executive.
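The three-level typewriter priority just described — executive messages first, user-program messages second, type-back of the user's own input last, with the keyboard locked while higher-priority messages are pending — can be sketched as a small scheduler. The class and queue names are invented for this sketch.

```python
from collections import deque

EXEC, PROGRAM, ECHO = 0, 1, 2        # smaller number = higher priority

class TypewriterScheduler:
    def __init__(self):
        self.queues = {EXEC: deque(), PROGRAM: deque(), ECHO: deque()}
        self.keyboard_locked = False

    def enqueue(self, level, text):
        self.queues[level].append(text)
        # While an executive or program message is pending, type-back
        # of input is interrupted and the keyboard is locked.
        if level in (EXEC, PROGRAM):
            self.keyboard_locked = True

    def next_message(self):
        """Return the highest-priority pending message, or None."""
        for level in (EXEC, PROGRAM, ECHO):
            if self.queues[level]:
                msg = self.queues[level].popleft()
                if not (self.queues[EXEC] or self.queues[PROGRAM]):
                    self.keyboard_locked = False   # messages complete
                return msg
        return None
```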
Cathode ray tube displays and associated software techniques are areas in which much work has
been done on TX-2 in the past. It is expected that
there will be considerable future development in
these areas. As a consequence, the organization of
the display portions of the executive program has
received considerable attention. The present design
of this part of the system appears to be a reasonable
compromise between the various requirements
placed upon it, but it may well have to be changed
in the not too distant future as new requirements and
techniques develop. The displays themselves are
slave-type units. Some have Charactron tubes,
and some have ordinary cathode ray tubes. They are
driven from a shared vector and curve generator
which gets its inputs directly from the computer
memory. Analog integrators are used to generate
lines, circles and parabolas from digital information
obtained from the computer. The analog deflection
signals are sent to all scopes, but only the scope for
which the display information is intended receives
an intensification signal. The display information is
stored in a ring-type list structure, which reflects
not only the order in which parts of the display are
to be produced, but also the associations which may
exist between parts of the picture. A display routine
within the executive threads its way through this
ring structure transmitting the data it finds in the
structure to the display generator. Characters are
stored in the display file in the form of packed keyboard character codes. If the particular console for
which the display is intended has a Charactron
tube, the display routine will recode the character
code appropriately and transmit that information to
the display scope itself. If the console does not have
a Charactron tube, the display routine will generate
from a stored list the appropriate vector and curve
segments necessary to make up the character. A single pass around the ring representing the display for
an individual console defines a frame for that display.
The display generator is time-shared on a frame-by-frame basis. Control bits in the list structure
allow selected portions of a display to be darkened,
either until further command from the user program, or on an alternate-frame basis so that a
flashing effect can be obtained. The display list
structure is built in CORAL language format by
executive calls given in the user's program. The
CORAL language builds a block-format list structure. Thus an entire drawing may appear as a single
block in the structure if the user so chooses. The
structure also allows for associations in a hierarchical sense between the blocks in the structure. Thus
the display file is not just a sequential list of lines,
points, curves and characters which make up the
picture which appears on the display, but it can also
represent some of the relationships which parts of
this picture may have to each other. The display routines maintain the display on the user's console
even though his program is inactive either because
of time-sharing or because it is waiting for an input.
A "push-to-see" button is used by some types of
display programs to keep down the load which display maintenance places on the system.
The light pen is the principal graphical input
device available in the APEX system. The light pen
may be used in two modes-pointing and tracking.
In both modes the executive maintains a complete
record of all light pen returns in a buffer within the
display file. A light pen return while in the pointing
mode causes the executive to place information in
the buffer from which the user's program can calculate both the pen position and the picture element
seen by the pen. In many light pen applications it is
necessary to associate the pushing of a button or the
striking of a key with the pointing operation. This
association is handled by the executive, and the associated character code is placed in the light pen buffer. In the tracking mode, the executive displays a
tracking cross every 10 milliseconds. If the light
pen sees any part of the cross, the tracking routine
moves the cross to center it in the field of the light
pen. The location of the center of the cross is
placed in the buffer. Smoothing and extrapolation
are done in the tracking program to achieve good
"writing" characteristics for the pen. The processing of light pen signals is a high priority task
for the executive since response time is a critical
parameter in graphical input operations.
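The tracking loop above can be sketched as follows. The exponential smoothing and linear extrapolation used here are assumptions of this sketch (the paper says only that smoothing and extrapolation are done, not which filter TX-2 uses), and all names are illustrative.

```python
class PenTracker:
    def __init__(self, x=0.0, y=0.0, alpha=0.5):
        self.x, self.y = x, y          # current center of the cross
        self.vx = self.vy = 0.0        # smoothed per-frame motion
        self.alpha = alpha             # smoothing weight (assumed)
        self.buffer = []               # pen-position record for the user

    def tick(self, pen_hit):
        """One 10-ms frame; pen_hit is the (x, y) the pen saw, or None."""
        if pen_hit is not None:
            dx, dy = pen_hit[0] - self.x, pen_hit[1] - self.y
            # Smooth the motion to get good "writing" characteristics.
            self.vx = self.alpha * dx + (1 - self.alpha) * self.vx
            self.vy = self.alpha * dy + (1 - self.alpha) * self.vy
            self.x, self.y = pen_hit   # re-center the cross on the pen
        else:
            # Pen missed the cross this frame: extrapolate its motion
            # so the cross keeps up with a fast-moving pen.
            self.x += self.vx
            self.y += self.vy
        self.buffer.append((self.x, self.y))
        return self.x, self.y
```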
ORGANIZATION OF THE EXECUTIVE PROGRAM
The SPAT address transformation is a three-level
transformation. The first level is unique to TX-2 and
comes about because TX-2 is a multiple-sequence
computer. TX-2 has 33 program counters. Most of
these are associated with I/O devices and must be
used to operate these devices. Some are used to handle interrupts, one is used to start programs from console controls, and the remainder are used for computational purposes. Most ordinary programs in
TX-2 use but a small subset of the available program counters at any one time. A priority relationship between these program counters determines
at any instant which of them will provide the address for the next machine instruction cycle. Figure 3 shows how the first level of the SPAT transformation treats this multiple sequence structure.
The SPAT hardware provides for a total of 64
books or segments. These are divided into 4 shelves
of 16 books each. Three of the shelves are tied to
single program counters. One of these is used for
user programs in the current system. The other two
are treated as user shelves by the hardware (i.e.,
privileged instructions are prohibited) but are
treated as unused spare shelves by the current executive program. The fourth shelf is shared by all the
other program counters except a special one which
is used for starting the computer and is not subject
to the SPAT transformation. This fourth or executive shelf contains all of the executive program. The
executive is thus itself transformed by the hardware
it controls. The application of SPAT to the executive makes the advantages of segmentation available
to the executive program as well as to the user programs. SPAT allows all private directories to appear in the same block of addresses in the executive
MAP. Switching from one to another requires only
two instructions. Similarly, I/O buffers can be
quickly switched, and drum and tape transfers can
be carried out while a file is being relocated. The
registers which control the SPAT transformation
appear as part of addressable memory and are
themselves transformed by SPAT. They may thus
be easily protected from user program interference
by allowing them to appear only in the executive
shelf. The problem of allowing the executive to examine or change a register in the user's shelf is
solved by setting up the appropriate file in a spare
book of the executive shelf. Switching between users' core MAPS is handled by changing the 16
SPAT registers which control the user shelf.
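The shelf-and-book structure — 64 books in 4 shelves of 16, with the shelf selected by the running program counter and a user switch done by rewriting the 16 user-shelf registers — can be sketched as below. The class name, register representation, and `MemoryError` signal are all inventions of this sketch, not TX-2's actual register format.

```python
BOOKS_PER_SHELF = 16

class Spat:
    def __init__(self):
        # One set of 16 book registers per shelf: the exec shelf,
        # the user shelf, and the two spare shelves.
        self.shelves = {"exec": [None] * BOOKS_PER_SHELF,
                        "user": [None] * BOOKS_PER_SHELF,
                        "spare1": [None] * BOOKS_PER_SHELF,
                        "spare2": [None] * BOOKS_PER_SHELF}

    def translate(self, shelf, book, word):
        """Map a (shelf, book, word) virtual address to physical."""
        base = self.shelves[shelf][book]
        if base is None:
            raise MemoryError("book not mapped")   # illustrative trap
        return base + word

    def switch_user(self, new_map):
        """Switch between users' core MAPs by rewriting the 16
        SPAT registers that control the user shelf."""
        assert len(new_map) == BOOKS_PER_SHELF
        self.shelves["user"] = list(new_map)
```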
The five major parts of the executive are as follows:
1. The Maestro is the part of the executive that
determines which user, interrupt, or alarm is to be handled next. It implements the time-sharing scheduling algorithm, which is a simple round-robin of all active users.

[Figure: the TX-2 program counters, in priority order (STARTOVER, which receives no SPAT transformation; alarms; I/O interrupts; traps; drum; the I/O devices, with some spares; typewriter; and the computational sequences), are mapped onto four shelves of 16 books each: the exec shelf, two spare shelves, and the user shelf.]

Figure 3. First level of the SPAT transformation in TX-2.

The requirement for fast response
eliminates most other potentially more efficient
schemes. Fortunately, the small number of users in
the system complements the fast-response requirement.
2. The Csar (Core Storage Allocation Routine)
handles the bookkeeping required to maintain the
users' MAP structures. The SPAT hardware with its
limited page address memory (corresponding to
twice the limit of addressable core) reduces but
does not eliminate the storage allocation problem.
Paging removes the need to move registers in core,
and segmentation reduces the number of consecutive registers that must be found in the page address
memory (PAM), but it is still occasionally necessary to move files in PAM. The Csar handles the
allocation of PAM space, maintains a supply of free
core pages, and calls for files to be transferred to
and from file memory or discarded. It is by far the
largest of the five parts of the executive.
3. The Librarian builds and maintains the public and private directories. It provides a source of
ephemeral names for temporary quantities and supervises their destruction when requested.
4. The Mover transfers information to and from
the file memory (a Univac FASTRAND drum). It
keeps track of free drum space and maintains
bounds on each user's share of drum space. Actual
transfers are made on a page-by-page basis, but
bookkeeping is done on a file basis. The pages
within a file are tied together on the drum by a
list structure. Free drum space is not tied together
on the drum but is found from a bit table in core.
Files are stored at random on the drum and users
are limited by a quota rather than a fixed drum area.
5. The Secretary handles all input/output communications and interrupts. It is made up of a central program and a number of routines which are
specific to individual I/O devices. These device
routines are optional parts of the executive and do
not require core space if the devices are not in use.
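The Mover's drum bookkeeping (item 4 above) — a bit table in core recording free drum pages, each file's pages tied together in a list on the drum, and a per-user quota rather than a fixed drum area — can be sketched as follows. The sizes and names here are illustrative only.

```python
class Mover:
    def __init__(self, drum_pages, quota):
        self.free = [True] * drum_pages   # bit table of free drum space
        self.next_link = {}               # page -> next page of its file
        self.quota = quota                # each user's share of the drum
        self.used = {}                    # user -> pages currently held

    def write_file(self, user, n_pages):
        """Allocate n_pages wherever they fall free, chained together."""
        if self.used.get(user, 0) + n_pages > self.quota:
            raise RuntimeError("user over drum quota")
        pages = []
        for p, is_free in enumerate(self.free):
            if is_free:
                pages.append(p)
                if len(pages) == n_pages:
                    break
        if len(pages) < n_pages:
            raise RuntimeError("drum full")
        for p in pages:
            self.free[p] = False
        for a, b in zip(pages, pages[1:]):
            self.next_link[a] = b          # tie the file's pages together
        self.next_link[pages[-1]] = None
        self.used[user] = self.used.get(user, 0) + n_pages
        return pages[0]                    # head of the file's page chain

    def release_file(self, user, head):
        """Walk the chain, marking each page free in the bit table."""
        p = head
        while p is not None:
            self.free[p] = True
            self.used[user] -= 1
            p = self.next_link.pop(p)
```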
COMMENTS AND A LOOK TO THE FUTURE
The APEX executive program was designed for
fast response. Its response, as must always be the case, is not as fast as some users would desire.
There are three major areas in which work is being
done to improve the response characteristics. One is
the area of bookkeeping overhead. The present program uses list structures built in CORAL language
format for all bookkeeping. CORAL was chosen to
simplify the programming of the system, but its use
exacts a price in storage and time which could be
reduced by the use of a more specialized list structure. However, the major reprogramming of an experimental system sucy as this is unlikely. The only
noticeable improvement in bookkeeping overhead
will probably come from the addition of a list-processing instruction to TX-2. A second area in
which changes can improve response time is I/O.
The contemplated addition of an I/O memory bus
would make a substantial increase in the number of
memory cycles available to the CPU during periods
of high I/O load. The third area involves the auxiliary memory facilities on TX-2. The access time
and transfer rate of the present drum system are
such as to cause a serious degradation in response
when memory allocation exceeds available core.
With a faster auxiliary memory, this degradation
could be substantially reduced. If such a memory
existed on TX-2, the SPAT hardware would be very
useful in implementing a page-turning scheme
which would allow an individual user to address
390K of virtual core without excessive overhead
costs.
A DESIGN FOR A MULTIPLE USER MULTIPROCESSING SYSTEM
James D. McCullough
Kermith H. Speierman
and
Frank W. Zurcher
Burroughs Corporation
Paoli, Pennsylvania
INTRODUCTION

The Burroughs B8500 is a modular processing system designed for a multiprocessing and multiprogramming mode of operation. In addition to the concept of multiple central processors, the B8500 also functions with multiple input/output processors which operate nearly independently of the central processors. A high-speed fast-access disk storage unit is provided as an extension to the main memory and is used concurrently by the input/output modules for storing input from external communications and retrieving required programs and data for the central processor.

The B8500 system is designed to deal with the following situation. A large number of active programs requiring various services are present in the system and their current status and required service are recorded. When some component of the system becomes available, e.g., processor, memory space, peripheral device, it is assigned to the active job of highest priority that requires this service. The important concept is that no component of the system belongs to any program but rather provides a service and then goes on to service another program. The main function of the executive scheduling program is to keep track of the services required by programs and to schedule the services when equipment becomes available.

This mode of operation requires that the system have the following properties:

1. The equipment should consist of independent modules that can function concurrently; e.g., processors, memories, I/O, etc.
2. A bulk memory system that is a logical extension of main memory.
3. Segmentation of data and programs to make more effective use of memory and permit a large number of active programs to be present in the system.
4. Dynamic allocation of memory.
5. Memory protection to prevent interference between programs.
6. An Executive Scheduling Program (ESP) that controls and schedules the entire system.
STRUCTURE OF PROGRAMS
Program segmentation is a basic requirement of a
multiple user system to provide effective use of
memory by permitting a large number of active
programs to be present in the system. A B8500 program may be considered as the output of one compilation consisting of program segments, data segments, an operand stack segment, a working storage
segment, and a program reference table (PRT).
The minimum structure that a program must retain
in main memory to remain active is one program
segment, an operand stack, a working storage segment, and a program reference table. A large program may also require additional program and data
segments at various periods of program execution.
The allocation of required segments is provided at
the time the segments are referenced through descriptors in the program reference table. The descriptors define the segments as they appear on disk
storage and when the segments are allocated in
main memory the descriptors provide communication between the separately allocated segments of
the program.
Tag Bits
A memory word contains 52 bits, 48 data bits
and 4 tag bits that may only be modified in a protected control mode. In addition to a parity bit,
three tag bits are provided to construct and control
memory words used as descriptors. One of these tag
bits is a presence bit that is used by ESP to define
the presence in main memory of the segment that is
represented by the descriptor. A reference through a
descriptor to a segment not yet allocated in main
memory causes a presence bit interrupt and invokes
ESP control of allocation of the required segment.
The two remaining tag bits are encoded as program
descriptor, data descriptor and indirect memory reference. A typical program is shown in Fig. 1.
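A descriptor word of this kind — 48 data bits plus tag bits, with a presence bit whose being clear traps to ESP — can be sketched as below. The bit positions and names are assumptions of this sketch (the parity bit is omitted), not the B8500's actual word layout.

```python
# Two assumed type-bit encodings for the tag field (illustrative).
PROGRAM_DESC, DATA_DESC, INDIRECT = 0b01, 0b10, 0b11

def make_descriptor(address, present, kind):
    """Pack a descriptor: 48-bit address field, presence bit, type bits."""
    assert 0 <= address < (1 << 48)
    word = address
    word |= (1 if present else 0) << 48        # presence bit
    word |= kind << 49                         # two remaining tag bits
    return word

def reference(word):
    """Follow a descriptor. A clear presence bit causes a presence-bit
    interrupt and invokes ESP to allocate the segment; that trap is
    modelled here as an exception."""
    if not (word >> 48) & 1:
        raise InterruptedError("presence-bit interrupt: invoke ESP")
    return word & ((1 << 48) - 1)              # segment's absolute address
```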
Program Segments
Program Segments contain instructions and constants and may have a maximum length of 4096
words. Programs are location-independent and all
internal addressing of constants and jump instructions is relative to the Base Program Register
(BPR) which contains the absolute address of the segment base. Jump instructions may reference any
syllable within a word. Syllables are six bits long
and instructions contain from one to four syllables.
The Program Counter Register (PCR) is a 15-bit
register relative to the BPR: 12 bits are required to
address the relative word in the segment and 3 bits
to address one of the 8 syllables contained in the
word. The PCR was designed to enhance dynamic
memory reallocation and allows the ESP to move a
program segment which has been partially executed,
simply by changing BPR to the new base location
of the segment.
Program segments can only be read and are protected from accidental modification since they are
allocated outside the area bounded by the memory
bounds registers. Since program segments are referenced by program descriptors which have the appropriate tag bit configuration, they may never be
accidentally read as data segments. Program segments may contain internal subroutines which are
referenced via a program descriptor in the PRT.
While individual program segments are restricted to
the 4096-word limit, a large program may contain
many program segments. Transfers between segments are directed by program descriptors in the
PRT.
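The relocation property described above — because the PCR and all internal addresses are relative to the BPR, ESP can move a partially executed segment simply by copying it and changing the BPR — can be sketched as follows. The class and function names are invented for illustration.

```python
class Processor:
    def __init__(self, memory, bpr=0):
        self.memory = memory
        self.bpr = bpr      # absolute base of the current program segment
        self.pcr = 0        # program counter, relative to BPR

    def fetch(self):
        """Instruction fetch: all addressing is BPR-relative."""
        return self.memory[self.bpr + self.pcr]

def move_segment(cpu, new_base, length):
    """ESP-side sketch: relocate the running segment in mid-execution.
    Only the copy and the new BPR value are needed; the PCR and all
    internal references are unchanged because they are relative."""
    old = cpu.bpr
    cpu.memory[new_base:new_base + length] = cpu.memory[old:old + length]
    cpu.bpr = new_base
```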
Program Reference Table
The program reference table is a read-only segment and contains descriptors for program communication with data segments and other program
segments. Descriptors are addressed relative to the
PRT base register and addressing beyond the limits
of the table is prohibited. A program is filed on the
disk with its PRT and program and data segments.
The filed PRT contains the information required
by ESP to construct the descriptors which must be
present in the PRT when the program is placed in
active status. For each descriptor, this information
includes the name of the object to which it refers,
the type of object (procedure, simple subroutine,
data, etc.), the mode of use (global, own, readonly, etc.) and the descriptor required for its input
from the disk. When the PRT is input to memory,
ESP decodes this information and sets the necessary
tag bit configuration required for processor recognition. One tag bit configuration is set for descriptors
which refer to program segments, procedures, and
subroutines and another configuration is set for those descriptors which reference data sets or memory space.

[Figure: "B8500 Uniform Program Structure" — a diagram of the working area (operand stack, common constants, index words, local variables, and working space for program segments and for a called procedure or subroutine with its return control word, all bounded by the memory bounds registers and addressed through the BPR, BDR, and BXR registers), and the program reference table, whose descriptors point to program segments, an internal subroutine, data segments (including a shared data segment), and a procedure with its own PRT linked back to the caller's PRT.]

Figure 1. Typical Program Structure.
Words zero and one relative to the PRT base
register are reserved for special use on a procedure
call. A procedure is a program which requires its
own PRT, and is provided to permit calls on programs which have been compiled separately from
the calling program. A procedure call is executed by
a reference to a program descriptor in the caller's
PRT. This program descriptor contains the address
of the procedure's PRT instead of the program segment of the procedure. Word zero relative to the
new PRT is used to save the contents of the caller's
PRT base register such that it may be restored to
its correct value when the procedure returns to the
caller. Word one relative to the new PRT contains
the BPR value of the procedure's initial program
segment.
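This procedure-call convention — the caller's program descriptor addresses the procedure's PRT; word zero of the new PRT saves the caller's PRT base, and word one supplies the BPR of the procedure's initial program segment — can be sketched as below. The class and field names are invented, and restoring the caller's BPR from the return control word is omitted for brevity.

```python
class Cpu:
    def __init__(self, memory, prt_base, bpr):
        self.memory = memory
        self.prt_base = prt_base   # base of the current PRT
        self.bpr = bpr             # base of the current program segment

    def call_procedure(self, new_prt_base):
        # Word zero relative to the new PRT saves the caller's PRT base.
        self.memory[new_prt_base + 0] = self.prt_base
        self.prt_base = new_prt_base
        # Word one holds the BPR of the procedure's initial segment.
        self.bpr = self.memory[new_prt_base + 1]

    def return_from_procedure(self):
        # Restore the caller's PRT base saved in word zero.
        self.prt_base = self.memory[self.prt_base + 0]
```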
Data Segments

Data segments are addressed relative to data descriptors which contain the absolute addresses of the segments. The tag bits of the descriptor determine the memory bounds. The next instruction which executes a memory fetch or store is compared with these memory bounds, providing both read and write memory protection. Any reference to a data descriptor relative to the PRT base register causes that descriptor to be placed in the processor's local high-speed memory such that subsequent references to that descriptor will not require a main memory fetch if it is among the last 16 descriptors referred to.

Working Memory Segment

The working memory area comprises two logical segments, common and the subroutine control stack, allocated in a contiguous memory block and bounded by the processor memory bounds registers. The common, or global data, area is addressed relative to the Base Data Register (BDR). It should be noted that we have not provided any direct method of setting or saving the BDR on subroutine jumps or procedure calls because of its intended use for common data.

The subroutine control stack provides dynamically allocated space for subroutines and procedures and is used to contain local variables and index words and for passing parameters between procedures and subroutines. Addressing in the control stack area is relative to the Base Index Register (BXR). When a subroutine or procedure is called, the BXR is incremented beyond the calling program's control stack area; parameters and return information are stored relative to the new BXR value; and the called program addresses relative to the new BXR. Word zero relative to BXR is used to save the relative BPR, PCR, BXR increment, and jump control bits of the calling program. The subroutine return instruction refers to this location for its information. This structure provides an automatic mechanism for subroutine nesting and recursion. Any of the 4096 directly addressable words relative to BXR may be used as index words; the most recently used are kept in the processor's local high-speed memory.

Operand Stack Segment

The operand stack segment is used by the processor to hold operands and results for the arithmetic and logical instructions. Programs for expression evaluation are Polish strings which are directly executed by the arithmetic unit using a push-down stack implemented in the processor's hardware. The stack segment discussed here is in main memory and is a logical extension of the processor's stack.
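The control-stack calling convention described above — advance the BXR past the caller's area, save the caller's state in word zero relative to the new BXR, and pop it back on return — supports nesting and recursion automatically, and can be sketched as below. The frame layout is an invention of this sketch; the saved state stands in for the relative BPR, PCR, BXR increment, and jump control bits.

```python
class ControlStack:
    def __init__(self, size=4096):
        self.words = [None] * size
        self.bxr = 0                 # Base Index Register

    def call(self, frame_size, caller_state):
        """Enter a subroutine or procedure: bump BXR beyond the
        caller's area and save the caller's state (and the increment
        needed to undo the bump) in word zero relative to the new BXR."""
        new_bxr = self.bxr + frame_size
        self.words[new_bxr] = (caller_state, frame_size)
        self.bxr = new_bxr
        return new_bxr

    def ret(self):
        """Subroutine return: word zero relative to BXR supplies the
        saved caller state, and BXR drops back by the saved increment."""
        caller_state, increment = self.words[self.bxr]
        self.bxr -= increment
        return caller_state
```

Two nested calls followed by two returns restore the original BXR, which is the automatic nesting/recursion mechanism the text describes.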
Memory Protection
Memory space is allocated by the ESP and given
to the user program by setting bounds registers in
the processor and descriptors in the PRT. These
registers and descriptors can only be set by ESP (in
the control mode) which prohibits the user from
having any control over the assignment of memory.
The working segment and operand stack segment
are read-write areas, and each is defined by memory bounds registers. The program segments are
read-only objects and are not contained within the
limits of bounds registers. The PRT, which is a directory of all program and data segments used by a
program is a read-only object and is contained
within the limits of bounds registers which prevent
using any descriptors that are not in this PRT. If a
user tries to change his PRT he will be interrupted
and ESP given control. Data segments are referred
to by data descriptors in the PRT. Each time such a
data reference is made the descriptor sets up bounds
around the data segment being referenced for the
duration of the data reference. Any attempt by the
user to read or write any other areas of memory
will cause an interrupt and entry to ESP.
It is possible to branch outside of a program segment without detection, but the program is still restricted to its own data and working storage segments, so it can't affect another user by accidentally
branching to his program. I/O operations are controlled by the ESP to prevent a user from interfering with another's space. This combination of ESP
and hardware conventions allows any number of
user programs to be executed together in a multiprocessing-multiprogramming mode without the danger of accidental interference.
THE EXECUTIVE SCHEDULING PROGRAM
The Executive Scheduling Program (ESP) schedules both hardware and software services for all
programming tasks or jobs that are present in the
system. Many of the services of the ESP are themselves programs that are structured as normal user
programs and therefore may also be scheduled in
the normal manner. The intent is an organization of
the ESP consisting of many subprograms which are
separately constructed and therefore may be executed concurrently. Each of these subprograms is extended system privileges according to the functions
it is to perform, e.g., initiate I/O, manipulate tag
bits, etc. The ultimate requirement of the ESP is to
efficiently schedule all services, both hardware and
software, to effectively establish maximum throughput of the system.
Scheduling
Jobs may be introduced to the system from various sources. Prestored production tasks are entered
by the ESP without any external request; requests
may be entered from external remote stations; input
streams from peripheral devices are interpreted for
batched requests; or a program or job may request
the execution of another job during execution. All
jobs presented to the system are retained on bulk
storage and descriptions of these candidates for
scheduling are kept in a Cold Job Table which is
also kept on bulk storage. A Cold Job entry remains
in the system from the time it is introduced for execution until its final outputs are delivered. Each
Cold Job entry contains information required by
the scheduling program to efficiently introduce
tasks to the system. Information required in each
Cold Job entry includes class and priority of the
task, estimated processor time, memory requirements, input files required, dependence upon other
jobs and accounting information.
Prior to a job's introduction to the scheduling
program, the availability of program and data files
must be established. This fundamental requirement
is established to insure that once a program is entered for execution, its completion will not be deterred by the unavailability of a program or data file
when required. Therefore, prior to successful scheduling, a collection program is invoked to accumulate the job's external files on bulk storage and
present to the system the required information concerning those files.
When a job is acceptable for execution, the
scheduling program generates a Hot Job Table entry
in main memory and requests the allocation and
readying of the program's initial requirements. Initial requirements for all programs are the program
reference table, the working storage area, the operand stack, and at least one initial program segment.
Other required program and data segments are allocated and readied when they are first referenced
through the descriptor which describes them. These
Hot Job Table entries establish a path of control
which the processor is to execute, and contain the
processor state information (hardware register values) recorded at the program's last suspension of
execution. The entries also contain the program status and are linked in priority order. The program
status may be ready to execute, awaiting I/O,
awaiting memory, being executed, awaiting time, or
being terminated, and is used by the internal scheduling program for determining the next useful task
to assign to the processor.
Classes and Priorities
Class is defined as a mode of operation; e.g., real
time, batch, conversational, etc. Priority is defined
as relative importance within a class. Classes possess a relative priority to each other. It is a desirable feature that the system be self-regulating to
prevent a group of users in one mode from monopolizing the system's resources. The philosophy employed is to get done what must be done in a timely
fashion, but always to maintain a lower limit of resources below which the sum of users in a class
cannot fall. A good example is the conversational
mode in which remote terminal users may experience a decrease in system response time but will not
experience a complete blackout due to higher priority requirements.
Memory Allocation
Memory allocation is controlled entirely by ESP
since no hardware directed technique is attempted.
The memory allocation program is responsible for
the maintenance of all main memory. Its basic
functions include obtaining a block of available
space to satisfy a request and to assume responsibility for space being relinquished by its prior owner.
Allocation performs its function through the mechanism of linked tables which include all blocks of
memory. All blocks of memory, whether available
or assigned, are linked by address in a memory
map.
All blocks of memory which are available are
linked by size in an available space map. An attempt
to allocate space for a caller is governed by the
priority and class of the caller and the amount of
space which has previously been committed to callers of that class.
The allocation routine will first try to allocate by
scanning the available space map to find the smallest block which is large enough. If a block of sufficient
size cannot be found, the overlay program is called.
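The best-fit scan just described can be sketched as follows. This is a minimal illustration, not ESP's actual code; the tuple representation of the available space map and the helper name are invented:

```python
import bisect

def allocate(available_map, request_size):
    """Scan the size-ordered available space map; return a block address or None.

    available_map: list of (size, address) tuples kept sorted by size, so the
    first block large enough is also the smallest adequate one (best fit).
    """
    for i, (size, address) in enumerate(available_map):
        if size >= request_size:
            del available_map[i]            # block is no longer available
            leftover = size - request_size
            if leftover > 0:                # return the unused tail to the map
                insert_by_size(available_map, (leftover, address + request_size))
            return address
    return None                             # caller must invoke the overlay program

def insert_by_size(available_map, block):
    # keep the map ordered by size so the scan finds the smallest fit first
    bisect.insort(available_map, block)
```

A request that no free block can satisfy returns None, corresponding to the point at which the overlay program is called.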
The function of the overlay program is to find a
currently committed block of memory which can be
reassigned to the caller of allocation. Using the
priority and class of the requestor, the overlay program will scan the Memory Map for a block of
memory to be reassigned. The considerations to be
applied at each block will be:
1. Does the block belong to a running program?
2. Is I/O going on in the block?
3. Size of block and surrounding blocks.
4. Is the block program or data?
5. Priority and class of block.
6. The number of disk operations required.
7. Number of users.
8. Size required.
If a program segment is overlaid, the appropriate program reference tables are updated to cause an interrupt on access by the callers of the segment. If
a data segment is overlaid, the data segment is
saved in bulk storage before the space is reassigned.
User PRT segments and stored register values are
updated appropriately.
In the event that a request for space cannot be
granted by these means, a deferred space request
will be put into an unallocated request chain. This
chain will be scanned periodically to allocate the
deferred request.
In the event that the system finds itself bound by having too many things to do, and not enough space to do them in, it will (based on priority, class and percent completion) choose a job which has been introduced into the system and terminate processing
on that job. The Cold Job Table will be restored to
a previous state so that the job can be rescheduled
at a later time.
The PRT contains the descriptors through which
programs address separately allocated segments
without knowledge or regard to the absolute memory allocation. Location independent addressing allows ESP to dynamically change the contents of
memory by releasing segments not currently in use
and replacing them with other required segments.
The presence bit of a descriptor is used to indicate
the presence of a segment in main memory and a
reference to a descriptor representing an absent segment causes a processor interrupt invoking ESP.
ESP must then ready the required segment referenced by that descriptor. The descriptor is interpreted for type and when a global type is required,
the memory map is scanned to attempt to locate the
desired segment in active memory.
If the segment is not in memory or a local copy
had been requested, the memory requirements and
disk address for the segment are available from the
PRT, and ESP places the requesting program in a
suspended status, initiates an input request for the
segment, and assigns the processor to some other
useful task. When the requested input is completed,
the descriptor(s) which address it are marked present, and those programs may then be scheduled for the processor.
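The presence-bit sequence of the last two paragraphs can be sketched roughly as below. All class and function names are invented for illustration; the real mechanism runs inside ESP on a descriptor interrupt:

```python
class Task:
    def __init__(self):
        self.status = "running"

class Descriptor:
    def __init__(self, disk_address, size):
        self.present = False        # presence bit: is the segment in main memory?
        self.address = None         # main-memory address when present
        self.disk_address = disk_address
        self.size = size

def reference(task, descriptor, read_segment, place_in_memory):
    """Return the main-memory address the descriptor points at,
    faulting the segment in from bulk storage if the presence bit is off."""
    if descriptor.present:
        return descriptor.address               # normal case, no ESP involvement
    task.status = "suspended"                   # program waits on the input request
    segment = read_segment(descriptor.disk_address, descriptor.size)
    descriptor.address = place_in_memory(segment)
    descriptor.present = True                   # mark present on I/O completion
    task.status = "ready"                       # may now be scheduled again
    return descriptor.address
```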
REFERENCES
1. "Burroughs B5500 Information Processing System Reference Manual," Burroughs Corp., 1964.
2. "A Narrative Description of the Burroughs B5500 Disk File Master Control Program," Burroughs Corp., 1965.
3. "D825 - A Multiple Computer System for Command and Control," Proceedings of the FJCC, 1962.
4. R. N. Thompson and J. A. Wilkinson, "The D825 Automatic Operating and Scheduling Program," Proceedings of the FJCC, 1963.
A COMPUTING SYSTEM DESIGN FOR USER SERVICE
Webb T. Comfort
IBM Corporation
Yorktown Heights, New York
INTRODUCTION

After a long period of study and experimentation with various forms of user/terminal/system interaction, IBM is developing a general purpose time-sharing system. This is the recently announced System/360 Model 67 and the associated programming support package.

The basic technical objective of such a system is to provide a user at a console with what appears to him to be immediate response; i.e., when he asks for something relatively simple to be done, it should be done within 1 to 2 seconds. When he asks for difficult and complicated things to be done, there should not be an unreasonable amount of delay before they are in fact done. (This response time concept is very closely related to the current concern in batch processing systems over turnaround time.) Superimposed upon the response time requirement is the necessity to provide a broad scope of selectable procedures which allow a user at a console to simply and conveniently create, debug and execute his programs. More properly, he needs the necessary facilities at his fingertips to solve his problems.

However, from a marketing point of view, this is not sufficient. For those customers who have a requirement for a major facility of that type, the system must be able to support large numbers of such users simultaneously without an unreasonable amount of system overhead. Indeed, there are some computing installations in the country today who are prepared to take just such a step. On the other hand, there are also a good number of installations - in fact, probably the majority as of today - which have a requirement for some immediate access of this type, but not at the expense of crippling their normal batch processing operation. Consequently, the system design objective has to be to provide a flexible system which can provide either type of service (immediate access or standard batch) as the demand fluctuates.

One other point, which has been made increasingly clear in most of the pioneering time-sharing systems across the country, is that in the "hands-on" type of system operation being discussed here, long and arbitrary periods of system down time are simply unacceptable. As a result, the system design must include procedures for automatically handling as many hardware error situations as possible and avoiding a total system shutdown as long as possible.

Now, it is not the intent of this paper to describe in detail the Model 67 hardware, the programming system problems and solutions, or the specific user interface. Rather, some of the basic system characteristics will be discussed, particularly as they relate to the objectives pointed out above. Then the major hardware and software characteristics will be described. In this way, the reader should be able to get a feel for the overall system operation without getting bogged down in a myriad of details.
Throughout the software discussions, the emphasis
will be on the control program area rather than the
language area, since it is the control program and
its associated routines which determine how the
system will operate.
It should also be noted that the system design has
been influenced in a multitude of ways by various
previous internal IBM efforts, including in particular the following:
TSM1
QUIKTRAN2,3
IBM's recently announced Administrative
Terminal System
Studies of One Level Store
Studies of Polymorphic Multiprocessing
It has been aided by a number of discussions with
various individuals from General Motors Corporation, Lincoln Laboratories, and particularly the
University of Michigan.
SYSTEM CHARACTERISTICS
There are several basic characteristics of system
operation which dictate how the hardware operates
and what the design approach to the programming
system must be.
The basic mode of operation is multiprogramming, wherein a multiplicity of tasks (represented
by programs) reside in core at the same time and
have access to common devices. However, the normal multiprogramming technique has been to run
one task until it had to wait for some reason, such
as for completion of some I/O. At that point, the
CPU could be switched over to another task, to return to the first one later when its I/O was complete. Time-sharing goes a step beyond that, in that
it is known that a certain (dynamic) number of
tasks must be serviced within a reasonable period
of time; namely, the response time mentioned in
the Introduction. To this end, an algorithm is used
to determine dynamically how much of the system's
resources a task ought to be allowed to consume. If
this threshold level is exceeded, the task is forced to
stop and wait while other tasks have a chance; in other words, forced multiprogramming of sorts.* (This has been generally termed time-slicing.) In this way, a multiplicity of tasks - and therefore a multiplicity of users at consoles - can be accommodated.
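A toy version of this forced multiprogramming is sketched below, with a fixed quantum standing in for the resource-consumption threshold (the paper does not specify the actual algorithm; the task representation and quantum value are invented):

```python
from collections import deque

QUANTUM_MS = 50   # hypothetical threshold: CPU time a task may consume per turn

def dispatch(ready, run_for):
    """ready: deque of (name, remaining_ms); run_for(name, ms) does the work.
    Returns the order in which tasks received service slices."""
    trace = []
    while ready:
        name, remaining = ready.popleft()
        slice_ms = min(QUANTUM_MS, remaining)
        run_for(name, slice_ms)
        trace.append(name)
        remaining -= slice_ms
        if remaining > 0:                 # threshold exceeded: forced to wait
            ready.append((name, remaining))   # other tasks get their chance first
    return trace
```

A long task is repeatedly interrupted and requeued, so short conversational requests interleave with it rather than waiting for it to finish.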
The system is designed to operate with multiple
CPU's and with multiple CPU-independent selection paths for I/O devices. This is necessary to provide the desired increase in system availability (as
distinguished from reliability). It is also necessary
in order to maintain orderly growth, particularly as
it will probably not be possible to specify optimal
system configurations until after the system has
been in operation for a while. In addition, this allows for partitioning within the system, to provide
smaller but independent subsystems, if desired.
(The same system will, of course, operate with only
one CPU.)
Dynamic relocation is implemented (as described
in the next section) and applied. This allows a task
to operate as if it had a full addressable memory
(called a virtual memory) of 2^24 (or, in the larger CPU, 2^32) bytes, almost independent of the amount
of real core provided in the system. Dynamic relocation can be used to simplify the traditional relocation techniques. This is particularly important in
time-sharing, where it often becomes necessary to
throw a task out of real core before it is finished
and bring it back later (the forced multiprogramming indicated above). Dynamic relocation completely eliminates the problem of having to return
the task to the identical area of core it occupied
during its previous period of residency.
There are two characteristics of program execution which have heretofore been unexploitable. The
first is that in general a program must claim an
amount of core large enough for its worst case operation, even though in many cases, either during debugging or based upon parameter values or program
structure, the actual core requirement for a given
execution is significantly less than that maximal
amount. The second is that in many cases program
activity is localized for significant periods of time,
i.e., during its execution, a certain set of routines will
run for a while, and then another set, while the first
*In many places in this paper, for purposes of avoiding
unwarranted complexities, general statements are made.
Many of them, if interpreted literally and assumed applicable
in all cases, would be clearly objectionable. It must be
emphasized that such generalities are for expository purposes only, but do represent the basic system action.
are not used. To capitalize on these situations, the
paging concept is used, wherein through the dynamic
relocation facility the control program can recognize on a dynamic basis which parts of a program
or its data are now required, and which are no
longer required. Virtual memory is then broken up
into blocks called pages, and only those pages actually referenced by the task will be brought into
real core. This will provide a better utilization of
real core storage, as well as allowing an equivalent
level of multiprogramming operation (when compared with always moving a complete package of a
program and its data) but within a significantly
smaller amount of real core.
In the past, it has generally been necessary to declare at least by load time all subroutines which
might possibly be needed during execution, and to
load them all into memory before execution can begin. Dynamic linkage (or, more properly, dynamic
loading) is a facility whereby a routine need not be
declared or loaded until, at execution time, it is actually called. (At call time, virtual memory can be
allocated for the called routine and external symbol
definitions resolved. Actual relocation modification
of addresses-or, in System/360 terms, address
constants-need not occur until paging time.) This
is particularly important for a user at a console, for
he may elect in the middle of an execution to
change his mind about what subroutine he wants
the program to call-and he ought not to be required to stop and reassemble or reload. This also
contributes to more effective utilization of real
core.
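The call-time binding described above might be sketched as follows, with a dictionary standing in for the system catalog; all names here are hypothetical, not part of the Model 67 package:

```python
def make_linker(library):
    """Return a call function that resolves external names lazily.

    library: dict mapping routine names to callables (stands in for the
    catalog of stored programs). Nothing is loaded until first call.
    """
    loaded = {}

    def call(name, *args):
        if name not in loaded:          # first call: resolve and "load" now
            routine = library[name]     # look the routine up in the catalog
            loaded[name] = routine      # record it (the allocation/load step)
        return loaded[name](*args)      # later calls go straight through

    return call, loaded
```

Because resolution happens at call time, a routine the user swaps mid-session is picked up on its next first use, without a stop-and-reload cycle.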
A general purpose terminal oriented system
would not be complete without a "warehouse" of
previously stored programs and data sets, maintained and cataloged by the system, and callable by
the user or his program.
The scope of user facilities is defined by the set
of languages at his command. The basic Model 67
program package will include a Command Language (with a set of Program Checkout techniques), a FORTRAN compiler, and a macro assembler. Later extensions will include COBOL and
PL/I.
HARDWARE CHARACTERISTICS
Several extensions and modifications have been
made to System/360 hardware in order to facilitate
the system characteristics summarized in the Systems Characteristics section. The most important of
these will be discussed briefly here. (It is assumed
that the reader is familiar with the basic
System/360.4) In those situations where the CPU is
altered, there is generally a switch of some sort
(physical or programmable) which will disable the
feature, thus assuring that Model 67 will still run
programs prepared for any other model of System/360.
Standard System/360 has a 32-bit word and 24-bit addressability; i.e., 2^24 - 1 is the largest memory address recognizable by the hardware. On the Model 67, an option is provided for full 32-bit addressing. Associated with this facility is a new instruction, Branch And Store, which is essentially a 32-bit version of the Branch And Link instruction.
In System/360, an effective address is formed in
the general case by forming the sum of the contents
of a base register, the contents of an index register
and the contents of the displacement field specified
in the instruction. The Model 67 is provided with a
program-controlled relocation mode, which imposes
a translation function between the time the logical
(effective) address is generated and the time the
address is sent to the appropriate memory box. The
total addressing space (virtual memory) is broken
up into pages of 4096 bytes apiece. Thus a logical
address can be considered to be a 12-bit page number (optionally 20 bits with 32-bit addressability)
concatenated with a 12-bit byte address within the
page. The basic mechanism is to provide a relocation table for a direct look-up of the logical page
number, and to fetch a new physical page number
from the table. Because of its potential size, this
relocation table is kept in main core storage rather
than in its own hardware implementation. However,
since the translation must occur on the fetch of every instruction and operand from core storage, a set
of associative registers are provided to reduce the
number of additional memory references to an acceptable level.
Figure 1 shows a simplified data flow for the relocation hardware. It will be noted that the page
number translation is a two-step process. That is,
pages are divided into groups of 256, called segments. The logical segment number is looked up in
a segment table, and determines the location of a
page table, which is then used to translate the logical page number into a physical page number.
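Under the stated field widths (4096-byte pages, 256 pages per segment), the two-step translation can be sketched as below. The dictionary-based associative store is illustrative only; the real hardware's 8-register capacity and its replacement behavior are omitted, and the table contents are invented:

```python
PAGE_SIZE = 4096          # 12-bit byte address within the page
PAGES_PER_SEGMENT = 256   # pages are grouped into segments of 256

def translate(logical, segment_table, tlb):
    """Translate a logical address to a physical one via the two-level tables."""
    page_number, byte = divmod(logical, PAGE_SIZE)
    segment, page = divmod(page_number, PAGES_PER_SEGMENT)
    if (segment, page) in tlb:                      # associative-register hit
        physical_page = tlb[(segment, page)]
    else:                                           # two extra memory references
        page_table = segment_table[segment]         # segment table lookup
        physical_page = page_table[page]            # page table lookup
        tlb[(segment, page)] = physical_page        # load associative register
    return physical_page * PAGE_SIZE + byte
```

On a hit the translation costs no extra core references, which is why the associative registers are essential when every instruction and operand fetch must be translated.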
[Figure: the logical address (segment, page, and byte fields) is matched against 8 associative registers; if an equal on compare, the physical address is formed directly; if no equal on compare, the segment table (located by the table register) and a page table supply the physical page number, and an associative register is loaded with the corresponding segment and page.]
Figure 1. Simplified data flow of dynamic relocation.
Availability bits are provided at both levels, and
are used to cause CPU interrupts when references
are made during execution to selected segments or
pages. (This is how the paging technique is implemented.) This two-level relocation technique provides the following:
1. A convenient way to allocate a data area
of unknown length, by allocating it at the
beginning of a segment, and allowing it to
fill the segment.
2. A way of compressing page tables, since
actual table entries are not required for unused pages at the end of a segment.
3. A way of reducing the amount of real core
required to contain page tables at execution time, since segments can be marked
unavailable, and the associated referencing
interrupt used as a signal to bring that particular page table into core.
4. A very convenient way of sharing programs or data sets among tasks, since one page table could be pointed to by segment table entries for several different tasks.
5. An effective mechanism for read/write
protection of areas of virtual memory. The
reason for a segment being marked unavailable is open to interpretation by the control program.
6. A simple and efficient overlay mechanism,
through substitution of segment table entries.
Associated with the relocation feature is a new
privileged instruction, Load Real Address, which
allows the control program to find the translated
form of any virtual memory (logical) address.
In System/360, basic CPU control is contained within a Program Status Word (PSW). In Model 67, this PSW control has been extended by a set of control registers, whose contents include the table register (which defines where the active segment table
is), extended masking facilities, and program-readable status indicators for various system switches.
Associated with this capability is a pair of new instructions, Load Multiple Control and Store Multiple Control, which allow for manipulation of the control register contents.
In order to allow for multiple-CPU operation,
multiple memory bus connections are provided.
Two-channel switches and channel controllers serve
to provide CPU-independent selection and control
of I/O. Extended Direct Control provides for inter-CPU communication. The instruction Test And Set
allows guaranteed interlocking where required.
Partitioning switches are included at critical
communication points within the hardware to allow
for dynamic manipulation of the hardware system
configuration. New devices can be added, others
removed for testing, or a full subsystem partitioned
out for independent operation. This facility must be
carefully controlled in actual operation to prevent
unintentional system operator slips from disrupting
the system. On the other hand, ultimate authority
must rest with a human being, to safeguard against
undebugged or recalcitrant programs.
SOFTWARE CHARACTERISTICS
The basic unit of control in the system is a task.
For a user at a console, one task is defined to provide services for his complete session at the terminal (i.e., from logging on to logging off). For a
non-conversational (batch) job, a task controls the
reading and interpretation of the job control cards
and requested services.
Each task operates within its own Virtual memory; i.e., there is a set of relocation tables for each
task. When multiprogramming among tasks, simply
changing the table register causes a new set of relocation tables to be brought into use.
When looked at from the point of view of one
task, the software can be thought of as having three
levels as pictured in Fig. 2. There is a basic system
Supervisor, which has the following characteristics:
[Figure: three software levels - Supervisor, privileged service routines, and non-privileged routines.]
Figure 2. The software world, as seen by one task.
Permanently resident in real core
Runs non-relocated
Not addressable in Virtual Memory
Runs in Supervisor State
Not time-sliced or paged
This is the Supervisor program which is common
to all tasks and all CPU's. It handles all the details
associated with interrupts, I/O, paging and scheduling. Figure 3 shows the major pieces of the Supervisor, as it is now defined. Without discussing
each one, a few comments will clarify the operation:
1. The Supervisor is basically an interrupt handler. System/360 hardware automatically sorts out five types of interrupts (called I/O, Program, SVC, External and Machine Check). The Supervisor must sort each of these types into finer categories and cause the appropriate actions. As a general rule of operation, interrupts are stacked by the Supervisor; i.e., if the Supervisor is processing one interrupt when a second one occurs, the second one will be put on a queue until processing of the first is complete. All such pending interrupts will be processed (if possible) before returning to one of the tasks in Virtual Memory.

[Figure: task execution (problem state) generating I/O interrupts (paging drum, other), external interrupts (timer, write direct), and machine check interrupts (malfunction alert) into the Supervisor (supervisor state), with system error recovery; portions run with interrupts disabled, others with interrupts enabled.]
Figure 3. Supervisor block diagram.
2. Unfortunately, there are many situations
within the Supervisor where requests are
made for services or facilities which are
busy on other things. Whenever this situation occurs, it is necessary to form a queue
of some sort. In order to handle queued
requests in a reasonably uniform way, all
such queues are controlled at a central
point. There is a queue provided for each
" type of hardware interrupt (which is how
interrupts are stacked), as well as a queue
for each facility (such as real core storage,
drum space, and I/O device use) which
might be busy when a request is made. A
generalized queue scanner is then provided
to see to it that whenever any CPU is in the
Supervisor, it will handle all outstanding
serviceable queued items.
3. The major portion of what is generally
called the scheduling algorithm is built
into the Dispatcher. Since everyone seems
to have his own idea about what is a good
scheduling algorithm, its most important
characteristic in this system is to make it
as modular as possible so as to be easily
changeable, or even replaceable. Rather
than outline a specific technique, it will
suffice here to list the conflicting objectives which such an algorithm must attempt to satisfy.
(a) Provide "reasonable" response at a terminal to a request for a "reasonable" amount of work.
(b) Provide good throughput on batch-type, non-conversational tasks.
(c) Provide some degree of balance between (a) and (b), based upon relative loads.
(d) Utilize the paging concept as efficiently as possible.
(e) Implement multiprogramming of time-shared tasks, as required by (d).
(f) Provide dynamic variation of (d) and (e) to match CPU speed with drum and disk rates.
(g) Provide some correlation with a user-oriented priority scheme.
(h) Be simple and fast.
4. When a CPU recognizes a hardware error
(i.e., a Machine Check Interrupt occurs),
that CPU goes into Wait State, and another CPU is alerted (via Malfunction Alert
Interrupt through Direct Control). The alerted CPU then invokes a recovery procedure.
5. Because the system is designed to operate
by paging, such paging I/O functions are
handled separately, to make them as efficient as possible.
6. Pathfinding is the general control mechanism for I/O selection and utilization.
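The centralized queue discipline of comment 2 above can be sketched as below; the queue names, handler signatures, and notion of "serviceable" are simplified inventions for illustration:

```python
from collections import deque

def scan_queues(queues, handlers):
    """Generalized queue scanner: drain every serviceable queued item.

    queues: {name: deque of pending items}, one queue per interrupt type
    or facility; handlers: {name: callable(item)} to service an item.
    Returns the number of items serviced."""
    serviced = 0
    progress = True
    while progress:               # handlers may enqueue further work, so rescan
        progress = False
        for name, queue in queues.items():
            while queue:
                handlers[name](queue.popleft())
                serviced += 1
                progress = True
    return serviced
```

Running the scanner whenever a CPU enters the Supervisor gives the property described in the text: no serviceable queued item is left outstanding on exit.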
Within a task's Virtual Memory are a set of Service Routines, which are a part of the programming
support, and which provide in effect a one-task supervisor to control the services and communications
for the user with whom this task is associated.
These routines will in general be reentrant, and
shared among all tasks. Some of these service routines are defined to operate with a privileged action
capability (which is not the same as the Supervisor
State recognized by the System/360 hardware).
Such routines are allowed to request special actions by the Supervisor (such as changing relocation tables), and are protected from the rest of the programs (including those of the user) within Virtual
Memory. Such privileged service routines include
the following:
1. The program which interprets terminal
commands.
2. Some of the subprograms which carry out
terminal commands.
3. All services relative to the catalog (searching, interrogating, changing, etc.)
4. Virtual Memory allocation.
5. Private device allocation.
6. Allocation of external (catalogable) space.
7. Access methods, which provide the task with GET/PUT capability to terminals, sequential, and direct access devices.
8. Dynamic loader, and associated tables.
9. Mechanism for handling task interrupts (as distinguished from hardware interrupts).
The nonprivileged area of Virtual Memory
(which is by far the bulk of it) is available for the
language processors (FORTRAN, PL/I, etc.) and
user-generated routines.
The basis behind the three levels depicted in Fig.
2 is protection. The Supervisor is protected from
accidental random damage by virtue of the fact that
it is not addressable in anyone's Virtual Memory.
(What is addressable in a task's Virtual Memory is
determined by what the Supervisor puts into the
page table for the task. Note that this also provides
complete protection between different tasks.) Communication with the Supervisor is limited to a specific set of SVC's (Supervisor Calls, a System/360
interrupt mechanism for that specific purpose).
However, the majority of such possible requests are
of a very sensitive nature, i.e., if misused, they
could seriously affect the operation of the whole
system, and this is highly undesirable. To help control this problem, the set of privileged service routines was defined, and execution of the sensitive
requests is limited to such service routines. In turn,
access to these privileged service routines is limited
to legitimate entry points, and they are protected
(via System/360 protection keys) from access by
non-privileged routines. In this way, it is hoped to
eliminate the possibility that an undebugged or irresponsible task could hurt anyone but itself.
SUMMARY
This report attempts to give an over-all picture
of the System/360 Model 67 Time Sharing System,
its system design, and major hardware and control
program characteristics. The unique combination of
hardware and software objectives makes a very
complex problem, for which a simple and efficient
solution is desired-a difficult task at best. It is
further complicated by the fact that there is no recognized consistent way to measure such a system,
either in how it performs or how well specific technical problems have been solved.
It is the author's opinion that one very significant concept made available in this system is the
large addressable Virtual Memory. It should force a
complete reevaluation of how programs should be
written, and has the potentiality of making obsolete
the traditional I/O techniques. However, it will require a good deal of experience and experimentation to know how best to exploit this new facility.
REFERENCES
1. H. A. Kinslow, "The Time-Sharing Monitor System," Proc. FJCC 1964, Spartan Books, Inc., Washington, D.C., 1964.
2. T. M. Dunn and J. H. Morrissey, "Remote Computing - An Experimental System. Part 1: External Specifications," Proc. SJCC 1964, Spartan Books, Inc., Washington, D.C., 1964.
3. J. M. Keller, E. C. Strum and G. H. Yang, "Remote Computing - An Experimental System. Part 2: Internal Design," Proc. SJCC 1964, Spartan Books, Inc., Washington, D.C., 1964.
4. "IBM System/360 Principles of Operation," IBM Document, Form A22-6821-1.
DESIGN CONSIDERATIONS FOR A 25-NANOSECOND TUNNEL DIODE MEMORY
D. J. Crawford, R. L. Moore, J. A. Parisi,
J. K. Picciano, and W. D. Pricer
International Business Machines Corporation
Systems Development Division
Poughkeepsie, New York
INTRODUCTION

About two years ago a tunnel diode memory system was described which employed substantially different techniques than those previously used.1 Although earlier systems had tended towards array arrangements that had the storage cells connected in parallel on one or more axes, the new system employed series connections along two axes. This new arrangement has several design and performance advantages compared to previous systems. The original paper described the basic approach and some of the earlier work, which included the design of array cross sections and the associated driving and sensing circuits. Since that time one version of the system has been operational in two IBM 7030 systems,2 and a 16-word, fully-populated, higher-speed laboratory model was built and reported.3 The present paper describes the engineering considerations used in the design of a larger and faster memory employing the basic techniques.

The new memory system contains 64 words of 48 bits each, and test results from a partially-populated cross-sectional model indicate a complete READ/RESTORE or a CLEAR/WRITE cycle time of less than 25 nanoseconds. A fully-populated complete memory system is in the final stages of construction and assembly.
CELL OPERATION
The basic storage cell is simple, as shown in the dashed-line box in Fig. 1. It consists of a tunnel diode shunted by a series-connected load resistor and the secondary winding of a transformer. A biasing current is introduced to the tunnel diode storage cell along the bit axis.

[Figure: word lines and bit lines threading the series-connected storage cells; a dashed-line box marks one cell.]
Figure 1. Array configuration.

During the writing
portion of a memory cycle, the biasing current can
be increased by the addition of a bit current. The
word driver introduces voltage excursions within
the storage cell loop by means of the transformer.
The normal bias current through the tunnel diode
cell is such that the tunnel diode can be either in a
high-voltage region or in a low-voltage region.
The cells are series-connected along two axes
(Fig. 1) in order to form the basic memory array
configuration. Figure 2 shows the load-line diagram
for the basic storage cell. The current bias and the
load resistor are such that the cell normally has two
stable states.

[Figure: tunnel diode characteristic with load lines for the bias current and for the write voltage with and without bit drive; V_ONE marks the ONE-state voltage swing.]
Figure 2. Tunnel diode load-line diagram.

During the READ operation, if the cell has been storing a ONE, the voltage induced into the secondary winding of the cell transformer shifts the load-line and clears the cell from the
high-voltage state to the low-voltage state. The
net voltage drop when the cell changes its operating
point from the high-voltage to the low-voltage
state is transmitted through the series connection of
the diodes on the bit line to the end of the bit line,
where it can be detected.
On the other hand, if the cell was storing a ZERO,
it initially would be in the low-voltage state. The
shifting of the load line in this case by the READ
voltage, Vr, produces only a very small voltage as
the ZERO response. At the conclusion of the
READ cycle, all the cells associated with the particular word will have had their information read
out and will be left in the low-voltage state.
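The bistability shown in the load-line diagram can be checked numerically. The piecewise-linear diode characteristic and the bias and resistance values below are invented for illustration (the paper gives no device parameters). Since the bias current divides between the diode and the resistive branch, operating points are the crossings of the diode curve with the load line i = I_BIAS - v/R:

```python
I_BIAS = 3.0    # mA, hypothetical bias current
R = 0.2         # kilohm, hypothetical load resistance

def i_diode(v):
    # invented N-shaped characteristic: peak 4 mA at 0.06 V, valley 0.5 mA at 0.30 V
    if v <= 0.06:
        return (4.0 / 0.06) * v
    if v <= 0.30:
        return 4.0 - (3.5 / 0.24) * (v - 0.06)
    return 0.5 + 30.0 * (v - 0.30)

def operating_points(step=0.0005):
    """Locate load-line crossings by scanning for sign changes."""
    f = lambda v: i_diode(v) - (I_BIAS - v / R)   # zero at an operating point
    points, v = [], 0.0
    while v < 0.5:
        if f(v) * f(v + step) < 0:                # sign change brackets a crossing
            points.append(round(v, 3))
        v += step
    return points
```

Three crossings result: the low- and high-voltage intersections are the stable ZERO and ONE states, while the middle intersection, in the negative-resistance region, is unstable.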
ARRAY DESIGN CONSIDERATIONS
One of the key items in the design of the system
is the cell transformer. The objectives are to
achieve a low impedance when looking into the
primary loops, and to have reasonable inductive
coupling but a minimum of capacitive coupling
from primary to the secondary loop. In the earliest
work, etched circuit construction was tried but the
state-of-the-art of fine-line etching and the
difficulty of making good miniature connections
made the approach impractical at that time. The
succeeding systems employed transformers with secondary windings made of wire. When the work on
this new system was started, it was decided to again
attempt to solve the various problems of the etched
circuit transformer for the inherent advantages of
reproducibility it offered.
The first step in the design process was to study
electric field patterns of likely configurations using
resistance paper analog techniques. From this work,
reasonably accurate predictions of the transformer
parameters such as mutual coupling and secondary
self-inductance were made. This procedure permitted a quick review and optimization of different
transformer patterns. A typical field plot used is
shown in Fig. 3. As a result of this work, preliminary decisions were made as to the desired shape
and dimensions of the transformer.
At this time different constructional methods and
arrangements of the storage cell into arrays were
considered from both a mechanical and electrical
viewpoint. It was finally decided that an approach
which had several cells on a module would simplify
manufacturing problems and improve serviceability.
Although it offered certain mechanical constraints,
the SLT (Solid Logic Technology) type of module
employed by IBM was chosen as a starting point for
the design.
An alumina ceramic wafer about one-half-inch
square serves as the main mechanical structure for
the module. The ceramic wafer has 16 swaged pins on 0.125-inch centers for external connections
plus 6 pins in the opposite direction to serve as
mounting points for welding the tunnel diodes. One
surface of the ceramic has 4 screened resistors and
a solder-coated circuit pattern. A view of this subassembly is shown in Fig. 4. The transformers are
contained in a separate multilayer etched copper
wafer assembly that is slipped over the 16 pins and
dip-soldered. The final assembly operation con-
DESIGN CONSIDERATIONS - 25 NSEC TUNNEL DIODE MEMORY
(In the figure, each equipotential line represents 10 percent of the flux.)
Figure 3. Magnetic field pattern around primary conductor.
sists of applying a ferrite powder coating in an organic binder on the primary side of the transformer
wafer to increase the mutual coupling from primary
to secondary. The assembled modules are shown in
Fig. 5.
Early in the project, it was recognized that a
large, accurately scaled mockup of the module assembly was needed for making electrical measurements. These measurements were needed to assist in
the transformer design, and were also vital to the
determination of the overall array parameters. The
normal-size modules were so small that it was virtually impossible to make electrical measurements with any degree of accuracy or precision. Because both capacitance and inductance scale in direct proportion to linear dimensions, it was decided to make a module assembly 20 times normal size. One
problem was to find a suitable dielectric material
substitute for the ceramic substrate, because large ceramic pieces were not available. The solution was found by loading an epoxy resin mixture with titanium dioxide powder in a ratio of approximately 1:1 by weight to achieve a dielectric constant of 9.4. Analog field plots showed that thick copper
wiring patterns could be simulated by using two
thinner patterns appropriately spaced and connected
in parallel. Figure 6 shows photographs of the large
module and some of the transformer patterns.
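Because capacitance and inductance both scale in direct proportion to linear dimension, readings taken on the 20-times model convert back to actual size by a single division. A hypothetical example (the measured number below is a made-up placeholder, not a figure from the paper):

```python
# Sketch: translating a measurement made on the 20-times-scale module
# model back to the actual module. L and C scale linearly with dimension,
# so full-scale values are model values divided by the scale factor.

SCALE = 20.0

def to_full_scale(model_value):
    """Convert an inductance or capacitance measured on the enlarged model."""
    return model_value / SCALE

# e.g. a hypothetical 6.2 pF measured on the model implies 0.31 pF actual
model_capacitance_pf = 6.2
print(to_full_scale(model_capacitance_pf))   # 0.31
```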
The final transformer patterns are shown in Fig.
7. The solid squares are lands used for connecting
points. To cancel capacitive coupling effects from
the primary to the secondary wiring, each transformer uses two primary wires which are driven
push-pull with respect to ground. The primary
wires are routed in a manner to maximize the mutual coupling to the secondary wires, and the secondary wires are arranged in a mirror image configuration to match the land pattern to obtain uniform
characteristics. Capacitance stub "fingers" are included to balance the capacitances between the primary lands and the secondary windings.
A cross section of a transformer wafer is shown in
Fig. 8. To minimize the capacitance, the primary
and the secondary wires are separated by Teflon®
(registered trademark of E. I. Du Pont de Nemours
and Co.). The outer layers are used as supports for
a land pattern for soldering to the pins, and also to
protect the transformer pattern. Through-hole
plating techniques are used to connect the inner
land patterns with the outer lands.
BIT-LINE CHARACTERISTICS
The bit line used in the tunnel diode memory
consists essentially of a number of tunnel diodes
connected in series. The inductance of the interconnections is the series inductance of the line,
whereas the transformer and module capacitance
is the major component of the shunt capacitance.
However, the tunnel diodes add a nonlinear series
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965
Figure 4. Array modules: (a) module with resistors and pins;
(b) top and bottom views of array modules.
Figure 5. Assembled memory array modules.
Figure 6. (a) Transformer wafer sheets, 20 times normal
size; (b) memory module, 20 times normal size.
resistance component that predominantly affects the
response of the line. An equivalent circuit for the
line is shown in Fig. 9 where each section represents
one bit of the line. Ls is the series inductance; Cd
is the diode junction capacitance; Rd is the nonlinear
diode resistance; and Cs is the total shunt capacitance. Because each of these parameters is very
small and the current rise times desired are very fast,
accurate measurements necessary to optimize the
line design could not be made.
A program was written to calculate the response
of the bit line that utilized this equivalent circuit and
took the nonlinearities of the tunnel diode characteristics into account. By using this program, the
Figure 7. Transformer pattern (showing the four secondaries and the two primaries).
Figure 8. Transformer wafer cross section (layers: secondary, adhesive, Teflon, epoxy, Teflon, epoxy, epoxy glass).
Figure 9. Bit-line equivalent circuit.
parameters of the line were varied to find their
effect on the response and to find an optimum termination. The range of parameters used was determined by measurements of the 20-times module
model. The results of the investigation led to the
following conclusions:
• Increasing the series inductance of the line
does not improve the response of the line;
that is, in spite of the fact that the line
becomes less lossy as the series inductance
is increased, the sum of the rise time plus
the delay time increases with increasing
inductance.
• The parameters that most strongly affect the
response of the line are the diode resistance,
the shunt capacitance, and the termination
resistance, in this order.
• The optimum termination resistance, chosen
with the criterion of minimum rise time with
negligible overshoot, is considerably below
the √(L/C) for the line. Reflections due to
this lower terminating resistance tend to
increase the bit drive current at the far end
of the array and thereby compensate for the
attenuation.
• Because of the nonlinearity of the diode
resistance, the fastest response times are
obtained when the bit drive is toward increased current in the diode.
Actual measurements verified these predictions but
could not provide the accurate resolution obtained
in calculations.
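The bit-line response program itself is not listed in the paper. The following is a present-day sketch of the same kind of calculation: the equivalent circuit of Fig. 9 is treated as a ladder of series inductance and nonlinear diode resistance with shunt capacitance, and stepped through time by forward Euler. Every element value here is an assumed placeholder, not a measured one.

```python
# Sketch of a bit-line transient calculation in the spirit of the program
# described above. N ladder sections; each has series L, a nonlinear
# diode resistance, and shunt C; the far end is resistively terminated.

N = 8            # bit positions per segment
LS = 2e-9        # series inductance per section (H), assumed
CS = 1.0e-12     # shunt capacitance per section (F), assumed
RT = 25.0        # termination resistance (ohms), assumed
DT = 1e-12       # integration time step (s)

def rd(i):
    """Nonlinear series diode resistance (ohms): falls as current rises."""
    return 3.0 + 10.0 / (1.0 + abs(i) / 1e-3)

def step_input(t):
    """Drive voltage: 0.5 V step with a 1 ns linear rise, assumed."""
    return 0.5 * min(t / 1e-9, 1.0)

v = [0.0] * N        # node voltages
i = [0.0] * N        # section currents (i[0] is the source branch)
t = 0.0
while t < 10e-9:
    vin = step_input(t)
    # series branches: L di/dt = upstream v - downstream v - i * Rd(i)
    i[0] += DT / LS * (vin - v[0] - i[0] * rd(i[0]))
    for k in range(1, N):
        i[k] += DT / LS * (v[k - 1] - v[k] - i[k] * rd(i[k]))
    # shunt nodes: C dv/dt = current in - current out
    for k in range(N - 1):
        v[k] += DT / CS * (i[k] - i[k + 1])
    v[N - 1] += DT / CS * (i[N - 1] - v[N - 1] / RT)
    t += DT

print(round(v[N - 1], 3))   # far-end voltage after 10 ns
```

Sweeping LS, CS, RT, and the rd() curve in such a loop is the kind of parameter study the text describes: the far-end waveform shows directly how termination below √(L/C) trades overshoot against rise time.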
Two types of crosstalk were investigated. One
was the inducement of a false sense signal on a bit
line because of a switching tunnel diode on an adjacent bit line. The other type was accidental writing
into a tunnel diode receiving full word current by
way of mutual inductive coupling between transformer secondaries. The primary source of a false
sense signal is the capacitance between cells on adjacent bit lines; these transformers are only 50 mils
apart. The value of the capacitance predicted by the
20-times module (0.31 picofarad) is slightly high
because no ground plane was used in the measurement. In order to find the worst case, it was assumed
that the capacitance of eight cells are all lumped together at a point and that the voltage rises in one
L/R time constant. Under these conditions, it is predicted that a maximum of 96 millivolts would be produced from the two adjacent bit lines. However, the
actual noise should be much smaller and the detector
rejection level is safely above 96 millivolts.
Calculations were made of the worst-case noise
current induced by mutual inductive coupling between adjacent transformer coils. They showed that
the peak induced current would be less than 2.5
percent of the diode peak current. This value is
small enough to have a negligible effect on operating tolerances.
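The worst-case capacitive-crosstalk estimate above can be reproduced schematically: lump the coupling capacitance of eight cells at one point and let the aggressor voltage rise in a single L/R time constant. The 0.31 pF per cell is the figure quoted in the text; the voltage swing, time constant, and load impedance below are assumed placeholders, so the result differs from the paper's 96-millivolt bound.

```python
# Sketch of the worst-case capacitive-crosstalk estimate described above.

C_PER_CELL = 0.31e-12   # farads per cell, from the 20-times model (text)
N_CELLS = 8
DELTA_V = 0.5           # aggressor voltage swing (V), assumed
TAU = 2e-9              # one L/R time constant (s), assumed
Z_LOAD = 50.0           # impedance the noise current works into (ohms), assumed

c_total = N_CELLS * C_PER_CELL
i_noise = c_total * DELTA_V / TAU          # i = C * dV/dt
v_noise_one_line = i_noise * Z_LOAD
v_noise_two_lines = 2 * v_noise_one_line   # a victim couples from both neighbors

print(round(v_noise_two_lines * 1e3, 1))   # millivolts
```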
WORD LINE STUDY
The ferrite material on the primary winding of
the transformer increases the characteristic impedance and delay of the line compared to a similar
line without the ferrite. In addition, the ferrite increases the high-frequency losses and results in
rise-time deterioration as the word pulse travels
down the line. To maintain a reasonable variation
in rise time, the word lines are split into 24-bit
segments, with 2 segments being driven by the 2
output stages of a word-driver circuit.
A number of methods of maintaining independence of the secondary output current from primary
current rise-time variations were investigated.
These involved shorting the word line or inserting a
rise-time pad in the output of the word-driver
circuit. An investigation of the response of the
transformer itself showed that secondary inductance
and load resistance caused it to act as a reasonable
rise-time pad to a fast word-drive pulse. Fig. 10
shows the method of drive presently employed.
The line is terminated in its characteristic impedance, and the pulse is clamped at the driver to
control the pulse amplitude. The transformer secondary circuit is used as a rise-time pad, with the
result that a 20 percent change in the primary current rise time result in only a 10 percent change in
the secondary output voltage amplitude.
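The rise-time-pad behavior can be sketched with a first-order model: the primary current ramps to I0 in rise time T, and the secondary (inductance loaded by a resistor, time constant tau) peaks at v = M·(I0/T)·(1 − exp(−T/tau)). When tau is long compared with T, the peak depends only weakly on T. The mutual inductance and time constant below are assumed; with them a 20 percent rise-time change moves the peak by only a few percent, the same qualitative behavior as the circuit's 10 percent figure.

```python
# Sketch of the transformer secondary acting as a rise-time pad.
# M and TAU are assumed placeholder values; I0 is the 88 mA drive level.

import math

M = 1e-9       # mutual inductance (H), assumed
I0 = 88e-3     # primary current step (A), from the word-driver spec
TAU = 10e-9    # secondary L/R time constant (s), assumed

def peak_output(rise_time):
    """Peak secondary voltage for a primary current ramp of given rise time."""
    return M * (I0 / rise_time) * (1.0 - math.exp(-rise_time / TAU))

v_nominal = peak_output(3e-9)        # nominal 3 ns primary rise time
v_slow = peak_output(3e-9 * 1.2)     # rise time degraded by 20 percent

change = (v_nominal - v_slow) / v_nominal
print(round(change * 100, 1))   # percent change in output amplitude
```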
Figure 10. Word drive system.
ARRAY CHARACTERISTICS
The array is formed by mounting the memory
cell modules on a pluggable card which contains the
interconnection pattern. The card has etched wiring
on the two outer surfaces and a ground plane inside the laminate. In addition to the memory cell modules, the array card also contains all the word-driver circuits, word line termination networks, bit
line termination networks, and miscellaneous power
supply decoupling networks associated with the modules. Each card contains 8 words, 48 bits per
word, of storage. An assembled array card is shown in Fig. 11.
One of the advantages of the series-type array
is that the driving requirements are readily met
with high-speed transistors. The array described
utilizes a unidirectional current pulse of 88 milliamps into the word line and presents a load of
136 ohms to the word driver. As previously explained, each word line is split into 24-bit segments, with a pair of segments driven by one word-driver circuit. The propagation delay from the word line input terminals to the far end of each line is about 3 nanoseconds. In contrast to the word line, the bit line is much more like a distributed RC-line than a conventional low-loss transmission line and, therefore, exhibits high phase and frequency distortion. To minimize delay, the bit line is broken into 8-bit segments, with a pair of segments handled by each information control (regeneration) circuit. The bit drive required is only 8.27 milliamps and is also a unidirectional pulse into a load impedance of about 50 ohms. Output signals for sensing are in the range of 300 to 400 millivolts, although the detector sensing level is usually set lower to save cycle time. The maximum quiescent power dissipated in a memory cell is 4.5 milliwatts, which leads to an overall maximum standby power dissipation in the 64 by 48 array (exclusive of associated circuits) of about 14 watts.
Figure 11. Array card layout (showing modules, word terminations, and bit terminations).
CIRCUIT DESIGN
Information Control Circuit
Figure 12 shows the information control circuit diagram. The sense amplifier, detector, information control logic, bit-line bias current, and bit driver were included in this single circuit to shorten regeneration time and to simplify the circuitry.
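The quoted standby-power figure (4.5 milliwatts maximum per cell over the 64-word by 48-bit array) checks out by simple arithmetic:

```python
# Quick check of the array standby-power figure quoted in the text.

P_CELL = 4.5e-3      # watts per memory cell (maximum quiescent)
WORDS = 64
BITS_PER_WORD = 48

p_total = P_CELL * WORDS * BITS_PER_WORD
print(round(p_total, 1))   # -> 13.8 W, i.e. "about 14 watts"
```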
Figure 12. Information control circuit.
The operation of the circuit is as follows: The
first transistor, Q1, connected as an emitter follower,
receives a sense signal at its base and drives a transmission line that is open-circuited at the far end.
This passes a pulse of a width of approximately 3 nanoseconds. If the strobe input is positive at this time, the tunnel diode switches on the first 200 millivolts of the signal. The detector diode remains in the high state as long as the strobe pulse is present. This drives the current switch (Q3 + Q4) that supplies the bit current pulse to the bit line. The bit-driver collector current of the output transistor rises in 2 nanoseconds or less, and the entire line assumes the full drive amplitude within 7 nanoseconds of bit-driver turn-on. No overshoot has been observed. The bit drive may also be actuated by pulsing the logic input terminal negatively (Q6). In addition to furnishing the bit pulse, the output transistor also supplies the DC biasing current for the
bit line. Throughput time of the ICC (Information Control Circuit) from sense-amplifier input to bit-driver turn-on is 2 to 3 nanoseconds. The total time
elapsed from the resetting of the tunnel diode to its
zero state until the restoration of that diode to its
ONE state is 10 to 11 nanoseconds. During the write
noise, the first transistor cuts off, thereby preventing
the write noise from influencing the state of the
tunnel diode detector.
WORD DRIVER
The word driver circuit is shown in Fig. 13. The
operation of the circuit is as follows: The current
switch formed by Q1, Q2, and Q3 performs the logical AND function and provides regenerative feedback through C1 to improve the rise time of the input
pulse; Q4 provides the necessary current gain to drive
Figure 13. Word driver circuit.
the output stages. In this circuit, both outputs from
the current switch are utilized to drive the word line
in two segments. The output stage consists of the
current switch formed by Q5 and Q6, which drives
two 2N2369A (Q7 + Q8) transistors for increased
voltage gain. The output levels are clamped by the
diodes Dl and D2 to provide a well-controlled output
level. The two wires of each word line segment are
driven push-pull with respect to ground by a balun
transformer. Because of the relatively light drive
power requirements, high-speed current switch techniques can be used. It should be noted that half the
drive is left ON all the time because only the changes
in drive line current can activate the tunnel diode
cells through the transformers.
The word driver produces an output pulse of 88
milliamps into a 136-ohm load. The current amplitude stays within a tolerance of -+-10 percent under
worst-case conditions. The rise and fall times of the
pulse are nominally 3 nanoseconds, 10 to 90 percent.
The worst-case tolerance on the write transition is within ±15 percent of nominal value and the read
transition is at least as fast. The waveforms of the
two outputs of the word driver are shown in Fig. 14.
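The drive levels stated above imply the word-line voltage swing directly by Ohm's law; a one-line check with the values from the text:

```python
# Sanity check of the word-driver drive level: an 88 mA pulse into the
# 136-ohm word-line load.

I_DRIVE = 88e-3   # amps, from the text
R_LOAD = 136.0    # ohms, from the text

v_swing = I_DRIVE * R_LOAD
print(round(v_swing, 2))   # volts across the word-line load, about 12 V
```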
Figure 14. Final word driver output waveforms: (a) in-phase output; (b) out-of-phase output. (Horizontal = 5 ns/div, vertical = 2 V/div, RL = 136 ohms.)
OVERALL SYSTEM DESCRIPTION
In addition to the memory array and the associated driving and sensing circuits, the complete system includes a binary address register with decoding circuits, data input gating circuits, a data output register, and a clock with timing pulse distribution circuits. An address counter, comparing circuits, and manual data input switches permit exercising the memory system without using any external connections. Either the internal clock or an external source of clock pulses may be used. The logic
circuits are constructed of an advanced form of IBM solid logic technology, referred to as ACPX in papers presented previously.4,5
The system interwiring and voltage distribution
is contained in 2 multiple-layer boards, each
about 10 by 14 inches, mounted side by side. These
boards have male pins protruding from them which
serve as connecting points for external wiring and
also receive the sockets mounted on the bottom of
each circuit card. The memory array cards are
about 18 inches long and plug into both board assemblies, bridging across the gap between them.
Except for one narrower card, all the other circuit
cards are about 3 inches wide and 5 inches high.
About two-thirds of the volume is used by the
memory array and the information control circuits;
the remainder is used by the various logic circuits.
A sketch of the layout is shown in Fig. 15. The
card and board assemblies, blower system, power
supplies, control panel, and I/O connection panel
are contained in a cabinet.
Tests on a cross-sectional model containing 2
populated word lines and 4 populated bit lines indicated good operating margins with a cycle time in
the range of 22 to 24 nanoseconds. Assembled array
cards, shown in Fig. 11, have also been tested, with
excellent operating margins verifying the validity of
the cross-section test data. Fig. 16 shows the
actual voltage waveforms from the cross-section
model. At the time this is being written, the complete system is under construction; a photograph of
the partially completed system is shown in Fig. 17.
Figure 17. Partially assembled memory system.
Figure 15. System layout sketch.
Figure 16. Cross-sectional model memory circuit timing waveforms (10 ns/cm). The traces from top to bottom are as follows: (1) Input to word driver. (2) Driven end of word line. (3) Transformer output of the cell at the far end of the word line. (4) ICC output.
This represents the first complete memory system using any type of technology reported in this size and speed range.
REFERENCES
1. D. J. Crawford, W. D. Pricer and J. J. Zasio, "An Improved Tunnel Diode Memory System," IBM Journal of Research and Development, vol. 7, pp. 199-206 (July 1963).
2. "Series-Coupled Tunnel Diode Memory," Computer Design, vol. 2, no. 10, p. 8 (Nov. 1963).
3. G. Feth, "Scratch-pad Tunnel Diode Memory," presented at the IEEE Workshop on Computer Memories, UCLA Conference Center, Lake Arrowhead, September 10-12, 1964.
4. G. M. Amdahl, T. C. Chen and C. J. Conti, "IBM System/360 Model 92," AFIPS Conference Proceedings, Fall Joint Computer Conference, 1964, Spartan Books, Inc., Washington, D.C., 1965, vol. 26, part II, pp. 69-96.
5. M. J. Flynn, "Engineering Aspects of Large High-Speed Computer Design," presented at the ONR Symposium on High-Speed Computer Hardware, Washington, D.C., November 1964.
A SILICON MONOLITHIC MEMORY UTILIZING A NEW STORAGE ELEMENT
Richard Shively
Litton Industries
Woodland Hills, California
INTRODUCTION
The advances that have been made recently in monolithic chip semiconductor logic circuits have significantly contributed toward the development of high-speed, low-power computers. These advances also emphasize the need for marked improvements in storage techniques if the operating speeds, the weight, and the electrical power of future computers, especially airborne or spaceborne computers, are not to be adversely affected by the computer memory.
A machine organization that utilizes several small "scratchpad" memories can be used to advantage in providing the required random access storage. This technique combines a number of small-capacity, high-speed memories with a large-capacity, slow-speed working store. A significant increase in overall machine speed can be realized by the proper application of this combined memory technique.
If the high-speed memory is to operate at a cycle time in the 100-nanosecond region, the class of storage elements that can be used is somewhat limited. Storage elements capable of switching speeds compatible with 100-nanosecond cycle times include (a) thin magnetic films of several types, (b) some forms of laminated ferrites, (c) tunnel diodes, and (d) semiconductor flip-flop type devices.
Most forms of thin film and laminated ferrite storage devices exhibit adequately fast switching times, but the drive current requirements are large and the readout signals small. The current amplitudes and driver power levels usually preclude the use of integrated circuit drivers for the thin film and laminated ferrite memory systems. Recent development work indicates that integrated circuit sense amplifiers suitable for sensing low-level signals at high speed will soon be available. However, it is felt that the cost of these amplifiers may adversely affect the cost per bit of a small memory system for some time to come.
Tunnel diode memory elements are capable of fast switching speeds, and the drive and sense circuit requirements are amenable to mechanization via the use of integrated circuit techniques. The main stumbling block in fabricating a small, low-power tunnel diode memory is found in the tunnel diode itself. Although the tunnel diode is a semiconductor device, the unique characteristics of the element have made it quite difficult to fabricate in monolithic form. Therefore, the prospects of a low-cost, physically small tunnel diode memory are not good.
The proper application of integrated circuit techniques to the fabrication of a bistable multivibrator
or flip-flop appears to be, at this time, the most
practical method for building a high-speed, low-power, physically small and potentially low-cost
scratchpad memory. Most standard integrated circuit flip-flops can be used as random access memory devices; however, the power per bit is high,
being in the order of 40 milliwatts and up, and the
number of bits per chip that can be obtained by using commercially available flip-flops even with
new metalization masks is felt to be no larger than
3 bits per chip.
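The combined-memory organization described in the introduction (small fast scratchpads in front of a large, slow working store) can be quantified with the usual effective-access-time estimate. The hit ratio and working-store speed below are illustrative assumptions; only the 100-nanosecond scratchpad target comes from the text.

```python
# Sketch: effective access time of a scratchpad-plus-working-store
# organization. Hit ratio and main-store cycle time are assumed.

T_SCRATCHPAD = 100e-9   # fast-store cycle time (s), the text's target
T_MAIN = 2e-6           # slow working-store cycle time (s), assumed
HIT_RATIO = 0.9         # fraction of references served by the scratchpad, assumed

t_effective = HIT_RATIO * T_SCRATCHPAD + (1.0 - HIT_RATIO) * T_MAIN
speedup = T_MAIN / t_effective

print(round(t_effective * 1e9), round(speedup, 1))   # ns, times faster
```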
Efforts have been made to tailor the flip-flop
topology to the needs of a random access memory
device. By incorporating this philosophy, improvements in power/bit and packing density have been
reported. A device power of 30 milliwatts per bit
and 8 bits per 14-lead flat package has been
reported.1
The storage device that is presented in this paper
represents a combination of advanced semiconductor
technology and a unique flip-flop type of topology
that has been optimized for random access memory
usage. Because the storage device is constructed using monolithic integrated circuit technology, the name Semiconductor Memory Integrated
Device, or SMID, has been used. It is felt that the
SMID represents a significant addition to existing
storage devices. The basic element can be used at
read cycle times of less than 100 nanoseconds over
the temperature range of -55°C to + 125°C, and
the power/bit is less than 0.55 milliwatt.
Nine bits of SMID storage have been fabricated on
a monolithic chip. Figure 1 shows a 9-bit memory
array. When the fact that the 9-bit array and all of
the memory drive electronics are batch fabricated is
coupled with other element characteristics, there is
the indication that the SMID will provide a vehicle
for the construction of fast, low-power and low-cost scratchpad memories.
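The per-bit power figures quoted in this section (roughly 40 milliwatts for a standard integrated flip-flop, 30 milliwatts for a memory-tailored flip-flop, under 0.55 milliwatt for the SMID) can be compared for a hypothetical small array; the array size below is an arbitrary example, not one from the paper.

```python
# Illustrative power comparison for a small scratchpad using the
# per-bit figures quoted in the text. Array size is an assumption.

BITS = 64 * 9          # e.g. sixty-four 9-bit chips, hypothetical

def array_power_watts(mw_per_bit):
    return mw_per_bit * 1e-3 * BITS

standard_ff = array_power_watts(40.0)    # commercial flip-flop storage
tailored_ff = array_power_watts(30.0)    # memory-tailored flip-flop
smid = array_power_watts(0.55)           # SMID element

print(round(standard_ff, 2), round(tailored_ff, 2), round(smid, 2))
```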
THE BASIC STORAGE ELEMENT
A fundamental requirement of any binary memory element is that the element exhibit two distinct
stable states. It is possible to form a bistable element by interconnecting a PNP and NPN transistor
in the proper manner. By utilizing this fact, the
semiconductor industry has been manufacturing, for
a number of years, bistable elements that are known by various names.2,3 In addition, several families
of monolithic integrated circuits, usually linear cir-
cuits by design, have exhibited, much to the dismay
of the manufacturer and the user, a four-layer
(PNPN) latching mode of operation. However, the
proper utilization of this four-layer latching mode
of operation has afforded a vehicle for the construction of the SMID element.
The basic SMID element is shown in Figure 2(a).
The conductance or nonconductance of the transistor
latching pair Q1 and Q2 constitutes the fundamental
storage mechanism. When the latching pair is in the
conducting state the SMID element is said to be
storing a "one."
The method by which the storage of data is accomplished can best be understood by considering the
static operation of transistors Q1 and Q2. Transistors Q1 and Q2 are interconnected to form a positive
feedback loop. The gain of this positive feedback
loop is approximately βQ1 × βQ2, the product of the
current gains of the two transistors. The stable states
of the transistor latching pair are represented in
Fig. 3 as point A, the "zero" state; point C, the "one" state; and point B. Point B is a stable state that will
exist only during a read "one" or write "one" operation. If a "one" is to be written into a storage
element, base current must flow into transistor Q3.
Q3 will conduct into saturation causing the latching
transistor pair to be switched to point B. The load
line shown in Fig. 3 is determined by resistor R2.
Once a SMID element has been switched to point B,
the external drive can be removed. The voltage across
the SMID element will now be + V H and the transistor latching pair will be operating at point C of
Fig. 3 due to the self-regenerative action in the
transistor latching pair.
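The self-regenerative action described above rests on the usual latching condition, implied though not stated explicitly here: the loop current gain βQ1 × βQ2 must exceed unity for the pair to hold itself on. As a one-line test (the beta values are illustrative assumptions):

```python
# Sketch of the regenerative condition: the PNP/NPN pair latches when
# the loop current gain beta1 * beta2 exceeds unity.

def latches(beta1, beta2):
    """True if the positive-feedback loop is self-sustaining."""
    return beta1 * beta2 > 1.0

assert latches(20.0, 15.0)        # healthy transistors latch easily
assert not latches(0.5, 1.2)      # loop gain below unity cannot hold
print("loop-gain check passed")
```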
If the current that conducts into the latching pair
is equal to or greater than the value given by Eq.
( 1 ), the memory element will remain in the low
impedance or "one" state. The maximum hold-on
current as a function of temperature has been
plotted in Fig. 4 from empirical data and agrees with
the hold-on current as calculated from Eq. (1):
I_H ≈ [β2 min (1 + β1 min) / (β2 min β1 min − 1)] × [V BEQ2 max + V CE sat WORD DRIVER − (−V max)] / R1 min   (1)
where
β2 min = the minimum beta of transistor Q2,
β1 min = the minimum beta of transistor Q1,
V BEQ2 max = the maximum base-to-emitter drop of transistor Q2,
V CE sat WORD DRIVER = the collector-to-emitter saturation voltage of the word driver output transistor,
−V max = the maximum negative value of −V volts (2.86 volts), and
R1 min = the minimum resistance of resistor R1.
Figure 1. Nine-bit memory array.
Figure 2. Basic SMID element and typical timing waveforms (points 3 and 4 as referenced in the text; −V = −2.6 volts).
The SMID element will remain in the "one" state
until the current flowing in the latching pair is
reduced to a value that is less than that given by
Eq. (1).
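As read here, Eq. (1) can be evaluated numerically. The transistor betas, junction drops, and R1 below are illustrative assumptions, not the paper's values; they are chosen only to show that the bound lands in the hundreds-of-microamps region plotted in Fig. 4.

```python
# Hedged numerical reading of Eq. (1): the minimum current that keeps
# the latching pair in the "one" state. All parameter values assumed
# except -V max, which the text gives as 2.86 volts.

BETA1_MIN = 20.0        # minimum beta of Q1, assumed
BETA2_MIN = 20.0        # minimum beta of Q2, assumed
V_BEQ2_MAX = 0.7        # maximum base-emitter drop of Q2 (V), assumed
V_CE_SAT_WD = 0.2       # word-driver saturation voltage (V), assumed
V_MAX = -2.86           # most negative excursion of -V (from the text)
R1_MIN = 15e3           # minimum resistance of R1 (ohms), assumed

gain_term = BETA2_MIN * (1.0 + BETA1_MIN) / (BETA2_MIN * BETA1_MIN - 1.0)
i_hold = gain_term * (V_BEQ2_MAX + V_CE_SAT_WD - V_MAX) / R1_MIN

print(round(i_hold * 1e6))   # microamps
```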
Figure 3. Characteristics of the transistor latching pair (A = "zero" state; B = state of the SMID element during a read or write operation; C = "one" state).
The following discussion is included so as to indicate the manner in which the SMID is used in a random access memory configuration. It will be assumed, for example's sake, that the SMID element of Fig. 2a
is initially in the high impedance or "zero" state.
A "one" is written into the element by the simultaneous switching of point 4 from ground to -2.6
volts and point 3 from -2.6 volts to ground. This
corresponds to time t1 of Fig. 2b. The simultaneous
switching of points 3 and 4 will cause base current
to flow into transistor Q3 thereby causing current
to flow into the transistor latching pair. The minimum
values of -Vas a function of temperature that will
guarantee that transistor Q3 will be switched into
saturation during a write "one" operation are shown
in Fig. 5.
Once the latching transistor pair has latched on,
point 3 of Fig. 2b is returned to -2.6 volts and
point 4 is returned to ground. Hold-on current will
now flow from + 2.4 volts through R2 and the transistor latching pair into ground. The output of a word
driver will always be at ground except when data
are being written into, read out of, or cleared from
the particular word of memory associated with that
word driver. A more detailed discussion of the operation of the word driver is included in the section
on Memory Electronics.
Reading out the data that are stored in an element
is accomplished by switching point 4 to -2.6 volts.
This corresponds to time t2 of Fig. 2b. If the element
is storing a "zero," the voltage level at point 2 of
Fig. 2b will not change. If the element is storing a
"one," and is therefore in the low-impedance state,
the voltage level at point 2 will tend to follow point
4. This means that the voltage level at point 2 will
drop from ground to approximately -2 volts. When
the input to the sense switch is switched to a negative
Figure 4. Maximum hold-on current versus temperature.
Figure 5. Maximum and minimum values of −V as a function of temperature.
potential, the output of the sense switch will go from ground to +5 volts. A more detailed discussion of the operation of the sense switch is included in the next section.
If the SMID element is storing a "one," hold-on
current will continue to flow through the element
after a read operation has been completed and the
word driver output is switched back to ground. If
the SMID element is storing a "zero," the read
operation will not affect the state of the transistor
latching pair in any way. Therefore, the read operation is nondestructive in nature.
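The write, read, and clear operations described so far can be condensed into a behavioral model. This is a reading of the text, not circuitry from the paper: the element is reduced to a single latched/unlatched flag and the voltage levels and timing are abstracted away.

```python
# Behavioral sketch of one SMID element as driven by the word and digit
# lines described in the text.

class SmidBit:
    def __init__(self):
        self.latched = False          # high-impedance "zero" initially

    def write(self, datum):
        # Word line (point 4) goes to -2.6 V; the digit driver raises
        # point 3 to ground only when a "one" is to be written.
        if datum == 1:
            self.latched = True       # Q3 saturates, pair latches on
        # For a "zero" the digit line stays at -2.6 V and nothing latches.

    def read(self):
        # Word line pulses negative; a latched element pulls the sense
        # line down, an unlatched one leaves it alone. Nondestructive.
        return 1 if self.latched else 0

    def clear(self):
        # Word driver swings to +5 V, above +VH: hold current stops.
        self.latched = False

bit = SmidBit()
bit.write(1)
assert bit.read() == 1 and bit.read() == 1   # read is nondestructive
bit.clear()
assert bit.read() == 0
bit.write(0)
assert bit.read() == 0
print("SMID element model behaves as described")
```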
The amount of time required to read out a "one,"
or access time, is effectively determined by the external electronics and not the SMID. This is because
the transistor latching pair is already in conduction
when storing a "one," and the time that is required
to switch on diode D1 of Fig. 2b is in the order of a
few nanoseconds. The cycle time of a read operation
may also be limited by external electronics and not
the SMID. This is especially true if the number of
bits on a given sense line is large. This can be seen
if one considers the fact that the anode of diode D1
is at a negative potential at the end of a read operation; the capacitance of this point must be charged
to a potential that is equal to the threshold of the
sense switch. Once the threshold of the sense switch
has been reached, the output of the sense switch will
go to ground, and a new read operation can begin.
As the size of memory increases, or the cycle time
requirements of the memory system decrease, the
number of bits that can be sensed by a common
sense switch must be decreased. This means that
several sense switches must then be ORed together
to form a total sense line output.
When data are to be cleared out of a given word
of memory, the output of the word driver is caused
to switch from ground to +5 volts. The clear operation corresponds to time t3 of Fig. 2b. Since +5 volts is greater than +V H, the flow of holding current is stopped and the base to emitter junctions of transistors Q1 and Q2 will be back biased. The
SMID element will be switched to point A of Fig. 3.
Resistor Rl of Fig. 2a has been sized so that the
base of Q2 will recover to -V volts approximately
50 nanoseconds after a clear operation has ended.
Figure 6 shows the time that is required to clear
data out of a SMID as a function of temperature.
As can be seen from Fig. 6, a faster clear operation is obtained if +VH is compensated as a function of temperature.
Figure 6. Clear time versus temperature (o = no compensation to +VH; Δ = compensated +VH).

The last waveform of Fig. 2b indicates a write "zero" operation. The output of the word driver is
switched from ground to -2.6 volts, but since the
datum to be written into the SMID element is a
"zero," the output of the digital driver remains at
-2.6 volts. The base to emitter junction of transistor Q3 of Fig. 2a will be back biased by a collector-to-emitter saturation voltage; therefore, base current cannot flow into transistor Q3. The transistor latching pair will remain in the high impedance or "zero" state.
Writing a "zero" into a SMID is a straightforward operation; however, the latching transistor pair is susceptible to a rate effect type of turn-on.4 The rate effect type of turn-on in a SMID element can best be understood by considering the capacitive voltage divider that is formed by the parallel combination of the base-to-collector capacities of Q1 and Q2. When the voltage swing at point 4 of Fig. 2a is large enough, the base to emitter junction of Q1 will become forward biased (due to capacitive coupling of the voltage pulse) and the transistor latching pair
will begin to conduct. Figure 5 shows the maximum
value that -V volts can have as a function of temperature in order to guarantee that a write pulse alone
(i.e., no digital pulse in coincidence with write) will
not cause the latching transistor pair to conduct.
From Fig. 5 it can be seen that -V volts (-2.6) can
be varied in excess of ±13 percent from -55°C to
+ 125°C.
Although Fig. 2a represents the basic SMID topology, it should be noted that it is possible to vary this topology slightly and yet obtain the same basic type of operation. Transistor Q3 can be eliminated and replaced by a diode. The cathode of this diode would be connected to the base of transistor Q2, and resistor R3 would be in series with point 3 and the anode of this new diode. The diode would be used to isolate the transistor latching pairs that are common to a digit line. Nine-bit SMID arrays have been fabricated by the utilization of both topologies.
Each type of SMID array operates satisfactorily.
MEMORY ELECTRONICS
The low-power levels of a SMID memory are
quite compatible with the power handling capability
of monolithic integrated circuits. Two special integrated circuit drivers have been designed and fabricated, and a commercially available T2L gate has
been remetalized. These three special integrated
circuits in conjunction with standard T2L gates have been used to construct a completely integrated circuit memory system.
The drive electronics that are used to write, read,
hold, or clear data in a given word of memory are
shown in Fig. 7. Pins 1 and 2 are used as the second
level of memory address decoding. The first level
of address decoding is performed in an X-Y type of matrix consisting of T2L logic gates.

Figure 7. Word driver.

Pins 3 and 4
are clock inputs. The input terms to a SMID word
driver are normally false (i.e., at ground). When
pin 1 or 2 or 3 is at ground level, the output transistor, transistor Q1 of Fig. 7, will receive base drive
and therefore be driven into saturation. This will
allow hold-on current to flow from the SMID elements of a given word and into ground. If pins 1,
2, and 3 are all true (i.e., + 2.5 volts or greater)
at the same time, the base drive to transistor Q1
will be zero. This will cause the voltage level at
pin 5 to be + 5 volts. This is the state of a word
driver during a clear operation. If pins 1, 2, and 4 are true at the same time, transistor Q2 as well as Q1
will conduct into saturation. The voltage level at pin
5 will be approximately -2.6 volts. This is the state
of a word driver during a read or write operation.
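The three word-driver states just described can be summarized in a small truth-table model. This is an illustrative sketch, not the circuit itself; in particular, treating pin 4 (the second clock input) as the term that separates the clear state from the read/write state is an assumption, since the text distinguishes the two cases only by context:

```python
def word_driver_output(pin1, pin2, pin3, pin4):
    """Model of the word-driver output voltage at pin 5.

    Inputs are booleans: True = "true" level (+2.5 V or greater),
    False = ground. Assumption (not explicit in the text): pin 4
    selects the read/write state rather than the clear state.
    """
    if not (pin1 and pin2 and pin3):
        return 0.0          # hold state: Q1 saturated, output at ground
    if pin4:
        return -2.6         # read/write: Q1 and Q2 both conduct
    return 5.0              # clear: base drive to Q1 removed

# The three states from the text:
assert word_driver_output(False, True, True, False) == 0.0   # hold
assert word_driver_output(True, True, True, False) == 5.0    # clear
assert word_driver_output(True, True, True, True) == -2.6    # read/write
```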
Four word driver circuits are included on a single
chip and are packaged in a standard 14-lead flat
package. With one flat package of word electronics
it is then possible to perform all necessary word
functions on four different words of memory. The word-drive electronics has been designed so that word lengths up to 40 bits can be accommodated.
The driver that is used for switching a SMID memory system's information line from -2.6 volts to approximately ground during the writing of a "one" into a given bit is shown in Fig. 8. The input terms to the digit driver are normally false. Under these conditions transistor Q2 of Fig. 8 will receive base drive and the output of the driver will be at approximately -2.6 volts. When both the clock and data input terms are true, the base drive to transistor Q2 is zero and transistor Q1 will receive base drive. The junction capacity of diodes D1 and D2 delays the turn-on of Q1, thereby guaranteeing that Q2 will be turned off before Q1 is turned on. The voltage level on the output terminal will rise from -2.6 volts toward +5 volts. Once the voltage level of the output terminal reaches ground, Q3 will conduct and thereby take base current away from Q1.

Figure 8. Digit driver.

The push-pull output of the digit driver makes it possible to drive long digit lines at high speeds with low standby power requirements. There are four circuits of the type shown in Fig. 8 on each digit driver chip. Each chip is packaged in a 14-lead flat package. Each digit driver flat package can therefore drive four SMID information lines.

The sense switch is shown in Fig. 9. A standard T2L logic gate was remetalized so that the isolation region could be tied to -V volts rather than to ground. This guarantees that the isolation regions of the sense chip will not become forward biased during a read or write operation.

Figure 9. Sense switch.

The output of a sense switch is normally at ground. If the input to the sense switch should go to ground or lower (as is the case during a write "one" or read "one" operation), the voltage level at the output of the sense switch will rise toward +5 volts. There are 4 sense switches in a 14-lead flat package.

Figure 10 shows the manner in which the SMID elements and the drive electronics are interconnected, and a block diagram of the complete 64-word-by-21-bit system is included.
RESULTS AND PROJECTIONS
A SMID memory system of 64 words, 21 bits per
word, has been constructed and is now undergoing
tests. All of the selection and drive electronics are
monolithic integrated circuits.
The total memory system, including all peripheral electronics, requires 194 14-lead flat packages.
The capability of operating at 100-nanosecond
read-only cycle times with access times of less
than 80 nanoseconds has been demonstrated. The
total system power has been measured to be less
than 3.0 watts. Pertinent waveforms of a sequential clear/write operation are shown in Fig. 11a. Waveforms of the read-only operating mode are shown in Fig. 11b.
Figure 10. Interconnections between SMID array and driver electronics. The block diagram shows the 64-word-by-21-bit system: 64 word drivers (16 flat packs), 21 digit lines, 21 sense lines with sense switches (6 flat packs), and an address decode matrix (12 T2L flat packs) fed by 12 address register input lines.
The cost of a small scratchpad memory system in which the SMID and its associated drive electronics are utilized has been projected to be under $2.00
per bit in 1966. It has also been projected that 100
64-word-by-21-bit SMID memories, including all
the necessary electronics, assembly, and checkout
charges, will cost less than $1.00 per bit in late
1967. A 16-bit SMID chip packaged in a standard
14-lead flat pack is now being investigated. A chip
such as this should reduce the price per bit and certainly will improve the memory system packing
density.
Several types of flip-chip interconnection
schemes are now being investigated. The proper
chip interconnect technique, in conjunction with a
package that has many more leads than 14, should
make it possible to package the 64-word-by-21-bit memory on a circuit board area of less than 3 square inches.

Figure 11. SMID memory operating waveforms: (a) sequential clear/write operation, 100 ns/div: master clock (5 V/cm), one bit of address register (5 V/cm), output of digit driver with alternate 1/0 pattern (5 V/cm), output of word driver (10 V/cm), output of sense switch (5 V/cm); (b) read-only mode, 50 ns/cm: master clock (5 V/cm), one bit of address register (5 V/cm), output of word driver (5 V/cm), output of sense switch (5 V/cm), output of data register (5 V/cm).
CONCLUSION

A transistor latching switch that has been optimized for random access memory usage has been fabricated as 9 bits of memory on a single integrated circuit chip.

When compared to other semiconductor memory devices, the element described in this document represents at least an order of magnitude reduction in power without significantly sacrificing switching speed. The low-power requirements of the storage element have made it possible to build a completely integrated circuit memory system.

ACKNOWLEDGMENTS

Thanks are due to A. Ronald Roth and James Payton for their advice and encouragement. The author would also like to acknowledge his indebtedness to Yosh Murakami for his help in obtaining test data, and for the work he has done on the analysis of the SMID element and driver circuits.

REFERENCES

1. R. H. Cole and P. Smitha, "An Integrated Circuit Memory," presented at a Los Angeles IEEE Section Symposium on Computer Technology, December 1964.

2. Silicon Controlled Rectifier Manual, General Electric Co., Rectifier Components Dept., Auburn, N.Y., 1964.

3. T. A. Longo et al., "Planar Epitaxial PNPN Switch with Gate Turn Off Gain," presented at the 1962 WESCON.

4. R. A. Stasior, "How to Suppress Rate Effect in PNPN Devices," Electronics, Jan. 10, 1964.
AN EXPERIMENTAL 65-NANOSECOND THIN FILM SCRATCHPAD MEMORY SYSTEM
G. J. Ammon and Carl Neitzert
Radio Corporation of America
Computer Advanced Product Research
Camden, New Jersey
INTRODUCTION

As computers become larger and more complex, the need for a high-speed scratchpad type memory becomes greater. Furthermore, the size and speed requirements increase. An early memory, suitable for use as a scratchpad, was described by H. Amemiya, R. L. Pryor, and T. R. Mayhew.1 This memory used two ferrite cores per bit and had a read/regenerate cycle time of 200 nanoseconds. More recently, a 64-word by 20-bit thin magnetic film memory was described by G. J. Ammon and C. Neitzert.2 This memory had a read/regenerate cycle time of 125 nanoseconds and its speed was limited primarily by the electronic circuitry. A number of other memories and memory designs, suitable for use as scratchpads, have been reported in the literature.3,4,5 Future need for a 256 to 1,024 word memory having a cycle time of 50 nanoseconds has been indicated. A project was therefore initiated to study the feasibility of such a system.

At their present state of development, ferrite cores do not appear capable of such speeds. Thin magnetic film memories, on the other hand, are believed to be capable of this speed using transistor circuitry, provided the best circuit and packaging techniques are used. The goal for the project was therefore set as a 256-word by 25-bit thin film memory having a read/write cycle time of 50 nanoseconds.

MEMORY ARRAY

The memory array consists of two polished aluminum plates coated on one side with a 1000-angstrom continuous film of 80-17-3 Ni-Fe-Co. These are
mounted back to back. A set of 128 word lines,
etched from ½-oz copper backed with Mylar, is placed over each film surface. The word lines are 10 mils wide, are on 20-mil centers, and are
grounded to a gold-plated portion of the film plate
at the far end by means of a pressure contact. They
have a characteristic impedance of 24 ohms and a
delay of approximately 1.5 nanoseconds per line.
A set of digit and sense lines, also etched from ½-oz copper backed with Mylar, is placed over and orthogonal to the word lines. They extend completely around the two film plates and are spaced from the word lines by a sheet of ½-mil Mylar.
The digit and sense lines use an interdigitated
arrangement in which the digit line consists of two
parallel conductors 6.5 mils wide and 11.5 mils
apart. The digit lines are driven at their center, from a 15-ohm line, and are terminated at their free ends with 27-ohm resistors. The resistors are trimmed
to give a specified differential digit noise of not
more than 2.5 millivolts, at the sense amplifier input,
when the digit rise and fall times are 5 nanoseconds.
The sense lines have a characteristic impedance of
54 ohms and are connected to their respective sense
amplifiers by a pair of 50-ohm terminated coaxial
cables. The ratio of signal to noise at read time is
specified to be greater than 5 to 1.
SYSTEM ORGANIZATION AND PACKAGING
A block diagram of the system organization is
shown in Fig. 1. The address is decoded in two
steps. The second step consists of a word driver matrix with a driver for each word line and a 2-input
AND gate at the input to each driver. The first step
of decoding is provided by AND gates at the inputs
of the matrix drivers.
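For the full 256-word plane, this two-step decode amounts to splitting the 8-bit word address into two halves, with matrix drivers on each axis selecting one crossing in the word-driver matrix. The even 4-bit/4-bit split below is an assumption for illustration, suggested by the 16-by-16 word-driver matrix discussed later in the paper:

```python
def decode(address):
    """Two-step decode of an 8-bit word address into matrix coordinates.

    Step 1 (AND gates at the matrix-driver inputs) picks one driver on
    each axis; step 2 (the 2-input AND gate at each word-line driver)
    fires the single word line at the crossing.
    """
    assert 0 <= address < 256
    row = address >> 4        # high 4 bits -> one of 16 row drivers
    col = address & 0x0F      # low 4 bits  -> one of 16 column drivers
    return row, col

# Every address selects exactly one crossing, and distinct
# addresses never collide.
crossings = {decode(a) for a in range(256)}
print(len(crossings))  # 256 distinct crossings
```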
Figure 5. Bit side of complete memory system less timing and connecting cables.

Matrix Drivers

The elements of the word matrix, which are individual word-line drivers, require input levels of +0.2 and +2.5 volts in order to give fast turn-on and turn-off. In order to supply these levels, a slight modification of the basic logic circuit, as shown in the schematic circuit diagram of Fig. 10, is required.

Word-Line Drivers

In order to attain a minimum sense voltage of 1.5 millivolts, it was predicted that a word current … 2td) is poorly satisfied. However, this condition was assumed in the driver design and the resulting schematic circuit diagram is shown in Fig. 11. Two drivers are packaged in one module.

The inductance of the word line and its connecting cable is approximately 65 nanohenrys and the supply voltage is limited to about 15 volts by the transistor breakdown voltage. If the transistor is an ideal switch and capacitor C is selected to give critical damping on the word current turn-on, the time required to rise 0.45 ampere is 3.6 nanoseconds. However, the turn-on time of the transistor increases this to more than 6 nanoseconds. As a result the capacitance must be increased beyond the critical value and the circuit becomes oscillatory. The computed value of C for critical damping is 18 picofarads. The value used was determined experimentally.
The fall time of the current is determined by the
line inductance and resistor R. If the diode has a zero threshold voltage and zero resistance, the computed time for the current to fall to 0.05 ampere is
6.7 nanoseconds. This fall time has an effect upon
domain wall creep in the magnetic film and will be
discussed later.
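Treating the decay as a simple L/R exponential makes the quoted numbers easy to check: falling from 0.45 to 0.05 ampere takes ln 9 ≈ 2.2 time constants, so the 6.7-nanosecond fall implies a time constant near 3 nanoseconds and, with the 65-nanohenry line, an effective resistance near 21 ohms. This is a back-of-envelope consistency check, not a resistance value stated in the paper:

```python
import math

L = 65e-9            # word-line plus cable inductance, from the text
i0, i1 = 0.45, 0.05  # initial and final currents in amperes, from the text
t_fall = 6.7e-9      # computed fall time in seconds, from the text

# i(t) = i0 * exp(-t / tau) with tau = L / R
# =>  tau = t_fall / ln(i0 / i1)  and  R = L / tau
tau = t_fall / math.log(i0 / i1)
R = L / tau
print(f"time constant ~{tau * 1e9:.2f} ns, implied resistance ~{R:.1f} ohms")
```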
The first two transistors of Fig. 11 serve as an
AND gate and the third drives the transformer. The
150-picofarad speed-up capacitor, in the emitter
circuit of the third transistor, gives a large initial
transformer voltage to turn the output transistor on
fast. The field discharge of the transformer gives a large reverse base voltage to turn it off. The transformer also provides a d-c level shift, making it convenient to ground the far end of the word line.
The RC circuit between the power supply and the
collector of the output transistor is made common
to all 8 drivers. This reduces the space required for
resistors and is permissible since only one driver is
used at a time.
Sense Amplifiers
Figure 12 shows the schematic circuit diagram of
a sense amplifier. Capacitance coupling is required
between stages to eliminate drift in the d-c levels
due to temperature changes. Each amplifier is packaged in two modules with the coupling capacitors
connected between modules. One side of the output
drives the sense input to the information register
and the other side provides an inverted output for
test purposes.
Figure 6. Word side of complete memory system.

This amplifier gives a single-ended output of 1 volt for a differential input of 2.4 millivolts. With a very fast rising input, the delay through the amplifier is 5.6 nanoseconds, and the output rises in 4 nanoseconds. When a common mode pulse having an amplitude of 250 millivolts, rise and fall times of 4 nanoseconds, and a base width of 10 nanoseconds is applied to the input, the common mode output is not measurable. However, the unbalance in the amplifier produces a single-ended output of up to 1.9 volts, and the time required for amplifier recovery is about 25 nanoseconds. Waveforms under operating conditions are given later.
Memory Register
Each bit of the information register consists of a
flip-flop made from one logic module as shown in
Fig. 13. The schematic circuit diagram is the same
as that shown in Fig. 7a except that a third AND
gate is used and a feedback connection is provided
to form a flip-flop. Strobing of the sense signal is
unusual in that it is performed at the register input
rather than in the sense amplifier.
Digit Drivers
A hybrid diagram of a complete digit driver is
shown in Fig. 14. It consists of a positive and a
negative driver in separate modules with their outputs paralleled at the digit-cable inputs. The logic
on the input is necessary because a ONE is written
with a positive digit current on the top plate and a
negative current on the bottom plate. A ZERO, of
course, requires the opposite polarities. Transformer
coupling to the output transistors permits the use of
the same type high-speed transistor for each current polarity. In order to properly drive the transformers, the logic circuits supply a high voltage
output similar to that shown in Fig. 10.
This driver delivers a 200-milliampere current pulse to the digit line cable with a rise time of 6 nanoseconds and a fall time of 4 nanoseconds. The delay between the leading edges of the input timing pulse and the output current pulse is 8 nanoseconds when measured between the 50 percent points.

Figure 7. Basic logic circuit: (a) schematic circuit diagram; (b) logic function performed by Fig. 7a; (c) other logic functions obtainable.

EXPERIMENTAL RESULTS

Figure 8. Logic diagram of the timing circuits.
Temperature Effects
The input and output of a sense amplifier were
monitored while the ambient temperature was varied
from -45°C to + 60°C. The variation in delay
through the amplifier was less than 0.5 nanosecond
and the variation in gain was less than 5 percent.
In another test two inverters, a flip-flop connected
as a one-shot, and a matrix driver were connected
in a cascade to drive all four inputs of two word
drivers simultaneously. The word driver outputs
were connected to short-circuited cables to simulate
word lines and their input voltage and output currents
were monitored while the ambient temperature was
varied from -45°C to +80°C. In the range from
O°C to 55°C the input pulse to the word driver
increased in width by 2.5 nanoseconds and the output
pulse width increased by 5.5 nanoseconds. No
DATA
----I
Figure 9. One bit of the address register.
+3.8V
•
+1.8V +3.8V
150
270
1/4 W
-4.7V
Figure 10. Schematic circuit diagram of a matrix driver.
SIXTY FIVE NANOSECOND SCRATCHPAD MEMORY SYSTEM
657
22,4W
TOP
111"
T.P.
BOT
"0"
T.P.
HIGH-VOLTAGE
OUTPUT
-12V SEE FIG.IO
TOP
"0"
T.P.
BOT
3.9K
Figure 11. Schematic circuit diagram of a word driver.
60
15-.!l DIGIT- 4T #36HF TP ON
LINE CABLE FERROXCUBE CORE
213 - T060 -104
"1"
T.P.
Figure 14. Digit driver diagram.
Figure 12. Schematic circuit diagram of a sense amplifier.
RESET
SET
~""'-"I"
DATA
~--"O"
STROBE
SENSE
Figure 13. Logic diagram of an information-register bit.
change in delay of the leading edg~ through the word
driver could be detected for the full temperature
range.
Cycle Time
Figure 15 shows selected waveforms taken while
reading and regenerating a ONE at a repetition rate
of 10 megacycles. In curve (d), the sense voltage
is too small to observe and the noise shown is
substantially all common mode. In curve (e), the
sense voltage is the first negative pulse and the remainder is digit noise plus the rewrite sense voltage. Curves (a) and (c) show 60 nanoseconds
from the beginning of the start pulse to the end of
the digit current. Curve (e) shows that the amplifier recovers from the digit noise 65 nanoseconds
after the beginning of the sense voltage. Finally,
curves (a) and (f) show an access time of 30
nanoseconds. Fig. 16 shows the amplifier output
with a higher repetition rate. In this case the circuit
is arranged to write a ONE after reading a ZERO
and to write a ZERO after reading a ONE. Two
sense ONE signals appear as negative pulses at 2
and 8 divisions from the left and a sense ZERO appears as a positive pulse midway between the
ONEs.
Increased Number of Words
Although the memory plane has a total of 256
words, electronics was built for only 8. However,
physical path lengths and loading, for a full set of
electronics, were closely approximated except for
the loads on the matrix drivers. In order to show
the effect of a 16-by-16 word-driver matrix, a
matrix bus was loaded with an additional capacitance of 200 picofarads in parallel with 300 ohms.
There was no appreciable change in either the leading edge of the word current, or the amplitude or
timing of the amplified sense voltage. However,
there was an appreciable increase in the fall time of
the word current. As will be shown later, this is not
objectionable.
Figure 15. Selected waveforms during regeneration of a ONE: (a) start; (b) word current, 0.2 A/div; (c) digit current, 0.2 A/div; (d) digit noise at plane, 0.2 V/div; (e) amplifier output, 2 V/div; (f) register output, 1 V/div.
Creep Tests
Adjacent-word disturb tests were made in all four corners of both plates and in the central portion of the top plate. The test consisted of writing alternate ONEs and ZEROs once in a word after having predisturbed the word by writing the opposite information 10 times. The word was then disturbed by writing the opposite information 25 million times in each adjacent word. Finally, the original word was read out and checked.
Initially, when every word was used with a
word-current fall time of 4 nanoseconds, information was lost in a large number of bits. Improvements were made by using only odd numbered
words and short circuiting unused word lines and
by increasing the word-current fall time to the
value shown in Fig. 15(b). Under these conditions a few bits failed in one corner of each plate.
A total of 9 bits, distributed among 3 words,
failed. * In every case the information lost was a
ZERO. Apparently, loss of information was due to
easy-axis skew in the magnetic film. The presence of skew is indicated by the difference in the amplitude of the ZERO and ONE signals in Fig. 16.
Figure 16. Amplifier output while reading ZEROs and ONEs
alternately with a 60-nanosecond cycle time.
*It later developed that a partial short between two
word lines caused part of the failures.
CONCLUSIONS
These experiments show the feasibility of a
256-word scratchpad memory with an access time
of 30 nanoseconds. By using faster transistors, now
available, this value should be reducible to 25 nanoseconds. The read/write cycle time, however, will
still be limited by the amplifier recovery so that
with the best transistors available it appears that 60
nanoseconds are required.
REFERENCES

1. H. Amemiya et al., "High Speed Ferrite Memories," 1962 Fall Joint Computer Conference Proceedings.

2. G. J. Ammon and C. Neitzert, "A 125-Nanosecond Thin Magnetic Film Scratchpad Memory System," Internal RCA Report, Sept. 1964.

3. M. M. Kaufman, L. Dillon, and G. J. Ammon, "Tunnel Diode Memory," 1964 IEEE International Convention Record.

4. A. M. Bates and F. P. D'Ambra, "Thin Film Memory Drive and Sense Techniques for Realizing a 167-Nsec Read/Write Cycle," 1964 Solid-State Circuits Conference Digest.

5. D. Seitzer, "Amplifier and Drive Circuits for Thin Film Memories with 15-Nsec Read Cycle Time," IEEE Transactions on Electronic Computers, Dec. 1964.
IMPACT OF SCRATCHPADS IN DESIGN:
MULTIFUNCTIONAL SCRATCHPAD MEMORIES
IN THE BURROUGHS B8500
Simon E. Gluck
Burroughs Corporation
Great Valley Laboratory
Paoli, Pennsylvania
The B8500 Modular Data Processing system is
the latest design in the rapidly growing family of
Burroughs Modular Computers. As in previous modular systems,1,2 the Burroughs Corporation has
found it expedient and efficient to utilize scratchpad
memories to enhance the performance of the computer and other modules. This paper will describe
in detail the application of multifunctional scratchpad memories in the computer module of the
B8500 system.
The overall system design of the Burroughs
B8500 is described in a separate paper presented at
this conference.3 It will suffice to review only the
basic characteristics of the system as background
for this paper.
The B8500 is an advanced design, high-performance modular data processing system. Three modules comprise the major building blocks of the system: the computer module, the input/output module, and the memory module.
COMPUTER MODULE

52-bit words: 48 data, 3 control, and 1 parity bit
Push-down (Polish) stack of operands to help implement arithmetic operations
Absolute, indirect, and five forms of relative addressing:
    Base Index Register
    Base Data Register
    Base Program Register
    Self Relative
    Program Reference Table
Unlimited Index Registers
Binary arithmetic: 200-nanosecond addition (single precision integer)
Multiple multifunctional scratchpad thin film memories
Associative memory
Full repertoire of arithmetic, logical, and character handling functions
Memory protection registers

INPUT/OUTPUT MODULE

512 independent simplex channels
Multiple 52-bit words of thin film buffering per channel
One 104-bit descriptor word per channel
Linked descriptors provide I/O processing with minimum computer intervention
Parallel byte transfers, variable from 1 bit to 51 bits
Maximum throughput of 590,000 bytes per second
Data transfer cycle time per byte (any size less than full word): 1.7 microseconds
On-demand servicing of input and output devices
MEMORY MODULE
Thin film, destructive readout
4096 words of 208 bits: each memory word contains 4 52-bit computer words.
Cycle time 500 nanoseconds: access time 300
nanoseconds for 208-bit word
16-module system capability:
    65,536 208-bit words
    262,144 52-bit words
Capability of executing lengthy memory-oriented
descriptors independently of, and concurrently with, computer module processing
Computer and I/O modules may total 16: at
least one of each type must be included.
Memory modules may be added up to 16 modules (262,144 52-bit words).
The remainder of this paper will be confined to a
discussion of the scratchpad memories that are part
of the B8500 computer modules. The buffer memory of the I/O module could also be considered
scratchpad, but due to the limitations of time and
space, it will not be discussed.
INFLUENCE OF MEMORY MODULE DESIGN
UPON SYSTEM
The B8500 memory module concept, with its efficient "four-fetch" organization, provides the rationale for much of the scratchpad utilization in the
computer module. One memory address descriptor,
transmitted to all memory modules along a common
communications bus, will select the addressed module, and within that module will select one of
4096 208-bit words. This composite memory
word, containing 4 computer words, data or instructions, is available for transmission to the computer module 300 nanoseconds after the arrival of
the address. Stored in the read-write register, the
4 52-bit words are sequentially transferred
(four-fetched) to the requesting module at the
rate of one full parallel 52-bit word every 100 nanoseconds. Including transmission time of the address descriptors and the returning information, 4
52-bit words can be requested from a memory
module and received at a computer module in a total of 1.0 microsecond, or an average of 250 nanoseconds per word. Similar speeds are available for
writing; "four-store" is also possible in the B8500
memory modules.
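The four-fetch arithmetic is self-consistent and worth tallying: 300 nanoseconds of access plus four 100-nanosecond word transfers accounts for 700 nanoseconds, leaving 300 nanoseconds of the quoted 1.0-microsecond total as the budget for transmitting the address descriptor and the returning information. A quick check:

```python
access_ns = 300      # access time for the 208-bit memory word, from the text
transfer_ns = 100    # per 52-bit word, four-fetched, from the text
words = 4
total_ns = 1000      # quoted round trip for 4 words, from the text

fetch_ns = access_ns + words * transfer_ns
transmission_ns = total_ns - fetch_ns   # inferred bus-transmission budget
avg_ns = total_ns / words
print(f"fetch {fetch_ns} ns, transmission budget {transmission_ns} ns, "
      f"average {avg_ns:.0f} ns per word")
```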
This four-fetch, four-store capability influences
to a great extent the use, size and organization of
scratchpad memories for the B8500. In a gross
sense, it may be stated that one of the uses of
scratchpad is to buffer the four-fetches until they
are needed, or, in the case of four-store, to buffer
words of data until they are assembled into fourword blocks for transmission to a memory module
for storage.
SCRATCHPAD CONCEPTS
At the risk of being pedantic, let us discuss for a
moment the definition of scratchpad memories.
Universally accepted definitions are difficult to
find, but the following statement will represent the
scratchpad concept to the majority of system designers:
Scratchpads are small uniform access memories, with access and cycle times matched to
the clock of the logic, and which are closely
coupled to the source and/or sink of the data.
This definition studiously avoids the inclusion of
the functions to which the scratchpads are applied.
It also does not mention the form of implementation. These variables are left to the ingenuity of the
system and circuit designers. In the latter case, implementations may vary from discrete component
flip-flop registers to thin film uniform access memories.
Historically, core memories were used as scratchpads in computers where the bulk memories were magnetic drums. In the case of the Burroughs D825
system, the scratchpad is thin film; the bulk memory is core. Implementation may eventually travel a
full circle; with the advent of inexpensive integrated
circuit flip-flop registers, the cost per bit of this
implementation may become comparable with highspeed thin film memories.
IMPACT OF SCRATCHPADS IN DESIGN
SCRATCHPAD APPLICATIONS
Functionally, scratchpads have been used for a
variety of purposes. A major application is the buffering of information flow among the main memory
of computers, computational elements, and input/output elements. This use also provides a speed
conversion between data source and sink. Look-ahead designs, in which block transfers of instructions and data minimize memory accesses and
transfers, have used scratchpad memories.
A major utilization has been the storage of intermediate arithmetic or logical results from a computational unit, minimizing the time and program
steps required to transfer information to main
memory and retrieve it when later needed. An outstanding example of this application is the use of a
thin film scratchpad for the "last-in, first-out"
stack in the Burroughs D825.
Finally, in many cases, economy dictates the use
of scratchpad memories. Many registers, formerly
implemented with flip-flops, are now stored as
words in scratchpad memories. Typically, such registers are utilized for index words, base registers,
real-time clocks and similar relatively infrequent
usages. While it is true that these registers could be
stored in main memory, the lower access time of
the scratchpad makes the information available
more quickly and doesn't tie up main memory address logic and communications lines.
USE OF SCRATCHPAD MEMORIES IN THE
B8500
Now to specifics: the use of scratchpad memories
in the Burroughs B8500 computer module. Figure 1
represents a block diagram of the computer module.
The major elements of the computer module are
self-evident from the drawing, but special attention
should be given two blocks: the Local Thin Film
Memories #1 and #2. These are two scratchpad
memories, multifunctional in application, identical
in physical construction and speed, different in
number of words and word length. Tied in with scratchpad
#2 is a small 28-word associative memory (19
bits per word) whose use enhances the utilization
of the scratchpad memory by providing content addressing as well as the conventional binary coded
word addressing capability.
It must be emphasized that each of the two
scratchpad memories contains its own independent
663
addressing logic, sense amplifiers, and read/write
registers. Each of the memories is available to the
two processing elements of the computer module:
the Arithmetic and Logic Unit, and the Address
Arithmetic Unit.
The rationale behind the inclusion of local
scratchpad memories in the B8500 computer module encompasses many of the reasons previously
stated. Foremost among them, however, is the need
for buffering of four-fetches of instructions and
data in advance of their use, i.e., look-ahead. Also
important are its uses as storage for intermediate
results, as an economical implementation for registers and counters, and for the extension of the
push-down stack.
The use of two scratchpad memories, rather than
one common unit, is necessitated by the concurrency of operations in the computer module. Areas of
each memory are assigned in a manner which permits simultaneous operations of the most frequently
used functions. Ideally, it would be desirable to
have a multiplicity of scratchpad memories: one for
each function. Cost, space and power considerations
prohibit such extravagance now. Future adaptations, to be discussed in later papers, may offer
some hope in this direction.
SCRATCHPAD PERFORMANCE
CHARACTERISTICS
The B8500 scratchpads are implemented by magnetic thin film techniques developed and organized
into linear-select memory arrays at the Burroughs
Defense and Space Group at Paoli. To realize the
high-speed access requirement of 45 nanoseconds,
the reading function is nondestructive, eliminating
the need for a restoring write cycle when data are to
be retained unchanged.
Insertion of new data into the local memories
(writing) can be accomplished within the 100-nanosecond clock period of the computer module. For
both read and write, the memory cycle time is 100
nanoseconds, permitting synchronous operation
with the logic of the computer module operating at
10 megacycles per second clock rate.
Addressing of all words of each of the two memories is performed through explicit binary coded
addresses, which may be generated by the subcommand control hardware or included in a field of
a data or instruction word. Twenty-eight of the 44
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965

[Figure 1. B8500 computer module block diagram. The drawing shows the communication unit to the memory modules; Local Thin Film Memory #1 (64 words @ 52 bits: local registers (24), temporary data (8), stack extension (16), local data buffer (16)); Local Thin Film Memory #2 (44 words @ 72 bits: instruction look-ahead (16), program reference table and index words (24), storage queue (4)); the 28-word @ 19-bit tunnel diode associative memory; the two stack registers; the instruction processor (ADVAQ, ADVAST, SYLD, DECODE, FINQ, FINST, PCR); the Arithmetic and Logic Unit (48 bit); and the Address Arithmetic Unit (21 bit), joined by intramodule data busses.]
words in scratchpad #2 can also be addressed by
discrete output lines from the associative memory.
SCRATCHPAD CONFIGURATIONS
Scratchpad #1
Local Scratchpad Memory #1 contains 64 words
of 52 bits each. This memory is utilized for four
major functions:
1. Locally Used Registers and Counters
2. Temporary Data Storage
3. Stack Extension
4. Local Data Buffer
Registers and Counters. In this 24-word portion of #1 memory are stored a multiplicity of registers and counters that do not require frequent access by either the program or the hardware, yet
which must be readily available within a clock cycle
when addressed. Economy is the rationale in this
case; cost, power and space savings over flip-flop
implementation are evident. Typical among the
words stored in this memory area are: Interrupt
Return Register, Base Interrupt Address Register,
and Interval Timer.
Temporary Storage. This area is used to store
literals and portions of multisyllable instructions.
The latter usage stores syllables for the period of
time from their detection and preliminary processing at the Advanced Station (ADVAST) until
they are required at the Final Station (FINST).
Eight 52-bit words are reserved for this function.
Stack Extension. Sixteen words in #1 scratchpad
memory provide additional depth for the Polish
stack above that available from the two hard stack
registers associated with the arithmetic unit. Additional stack depth is accomplished by automatic storing and fetching from the stack extension in #1 to an area in the memory modules;
the depth of the stack is limited only by the total
capacity of the memory and the permissible area
assigned to it by the Executive and Scheduling Program (ESP). Four-fetch and four-store are used
in stack transfers to minimize the traffic on the inter-module communication buses.
Local Data Buffer. This 16-word section of #1 is
unspecialized scratchpad. The area is not reserved
for a specific category of information but can be
utilized under program control for storage of any
data word or field. It is, however, the only scratchpad area that is capable of buffering four-fetch
and four-store of data. Specific instructions are
included in the machine repertoire to permit manipulation of data to and from the Local Data Buffer.
Scratchpad #2
Local thin film memory #2 possesses the same
performance characteristics as #1 but contains 44
words of 72 bits each (Fig. 2). The additional
word length is required so that it can be utilized
[Figure 2. Scratchpad #2 word format: a 52-bit computer word, the 18-bit absolute address of the computer word, and 2 control bits.]
with the associative memory; 52 bits hold a normal
computer word, while the remaining 20 bits contain
an absolute memory address to which it may eventually be sent for storage (two bits are control
bits).
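The 72-bit word layout can be sketched as a packing routine (Python; the field positions are an assumption for illustration, since the paper gives only the field widths):

```python
# Sketch of the scratchpad #2 word: 52-bit computer word,
# 18-bit absolute memory address, 2 control bits (72 bits total).
# The ordering of fields within the word is assumed.
WORD_BITS, ADDR_BITS, CTRL_BITS = 52, 18, 2

def pack(word, addr, ctrl):
    """Assemble a 72-bit scratchpad #2 cell from its three fields."""
    assert word < (1 << WORD_BITS)
    assert addr < (1 << ADDR_BITS)
    assert ctrl < (1 << CTRL_BITS)
    return (word << (ADDR_BITS + CTRL_BITS)) | (addr << CTRL_BITS) | ctrl

def unpack(cell):
    """Recover (computer word, absolute address, control bits)."""
    ctrl = cell & ((1 << CTRL_BITS) - 1)
    addr = (cell >> CTRL_BITS) & ((1 << ADDR_BITS) - 1)
    word = cell >> (ADDR_BITS + CTRL_BITS)
    return word, addr, ctrl
```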
Three functional areas are contained within this
memory:
1. Instruction Look-Ahead
2. Program Reference Table and Index Words
3. Storage Queue
Instruction Look-Ahead. Sixteen words of #2
scratchpad are used to store four-fetches of instruction words transmitted from the memory module in advance of their use in the instruction processing section of the computer module. This area,
called Instruction Look-Ahead (ILA), can hold
up to four such four-fetches of "packed" instructions. (Instruction words of 52 bits in the B8500
contain 1, 2, or 4 12-bit syllables.) Only 52 bits
of the 72-bit words available are utilized in ILA.
Program Reference Table and Index Words. The
24 words in #2 scratchpad devoted to the storage of
Program Reference Table (PRT) lines and index
words utilize all 72 bits; 52 for the normal word
and an additional 20 bits for the absolute memory
address and 2 control bits. A word stored in this
area can be addressed in either of two ways: by explicit addressing or by selection by an output of the
associative memory. The PRT entries (copies of
PRT lines in main memory) and index words are
stored interchangeably within this 24-word area.
Index words provide an increment to an address
accumulator to point to a specific memory location.
Index words stored in #2 contain an index value, an
increment and a limit field within the first 48 bits.
The PRT word may contain, within its first 48 bits,
the starting address and an upper limit for a data
area in main memory.
The PRT word may also point to the starting address or the entry point of a procedure. (These are
called Data Descriptors and Program Descriptors,
respectively, in the B5500 nomenclature.) Appended to each type of word is an 18-bit field representing the absolute address in a memory module
where it has been or will be stored.
When a word is inserted into this 24-word area,
a copy of its absolute address is placed in the 20-bit field described above, and also into one of
the 19-bit words of associative memory. As instruction steps are decoded in the instruction processor, those instructions calling for an index word
or a PRT reference send the calculated fetch address (absolute address) both to the communications unit and to the comparison register of the associative memory. If the associative memory contains the identical address, a word drive line from
the associative memory will read out the proper
word in the related 24-word portion of #2 memory. The selected word (and its 18-bit address)
will be read out nondestructively, and the request to
the communications unit is canceled. Such a sequence takes only 100 nanoseconds compared with
the 600 nanoseconds (minimum) required if the
words were to be fetched from a memory module,
or an even longer time if the 24 words were
searched and compared sequentially.
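This lookup sequence can be modeled in a few lines (a present-day Python sketch; the class and its structure are illustrative assumptions, not the tunnel diode hardware):

```python
# Hypothetical model of the described lookup: the fetch address goes
# both to the communications unit and to the associative memory; on a
# match, the word is read nondestructively from scratchpad #2 and the
# memory-module request is canceled.
class AssociativeScratchpad:
    def __init__(self):
        self.assoc = {}   # absolute address -> scratchpad #2 slot number
        self.words = {}   # slot number -> (word, absolute address)

    def insert(self, slot, word, addr):
        """Place a word and its absolute address into the 24-word area."""
        self.words[slot] = (word, addr)
        self.assoc[addr] = slot

    def fetch(self, addr):
        """Return (word, request_canceled) for an absolute fetch address."""
        slot = self.assoc.get(addr)
        if slot is not None:
            word, _ = self.words[slot]   # 100 ns nondestructive readout
            return word, True            # cancel the communications-unit request
        return None, False               # must come from a memory module (600 ns minimum)
```

A hit resolves in one clock; a miss falls through to the ordinary memory-module fetch path.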
The 18-bit address field of the 72-bit #2 memory is required when the words referenced by the associative memory must be returned to a memory
module. The associative memory used in the B8500
cannot have its contents read out like a conventional memory. When the logic requires storing of an
Index or a PRT word in main memory following its
access from #2, the 18-bit field is used to provide
the communications unit an absolute address for the
store function.
Storage Queue. The 4-word portion of #2 designated for use by the storage queue is similar to
the Index and PRT area. In both cases 72 bits are
used and reference is achieved either by explicit addressing or by selection via the associative memory.
The storage queue contains words destined for storage and their absolute address. Since the storage
function has the lowest priority in the communications unit, words are retained in this area until service time is available. In a manner identical to that
described for the Index and PRT area, data being
fetched are checked against the contents of the storage queue by the use of the associative memory.
This use is not included primarily to save time (although it does) but, more importantly, to ensure
that the "newest" data are fetched to the computer.
Fetching of data from a main memory location
about to be updated by a word awaiting service in
the storage queue would provide incorrect information to the program.
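The hazard check can be sketched as follows (Python; the dictionary-based structure is an illustrative assumption standing in for the associative comparison):

```python
# Illustrative sketch: fetches consult the four-word storage queue so a
# read of a location with a pending store returns the "newest" data
# rather than the stale main-memory copy.
class StorageQueue:
    CAPACITY = 4

    def __init__(self, main_memory):
        self.main = main_memory
        self.queue = {}   # absolute address -> word awaiting storage

    def enqueue_store(self, addr, word):
        """Hold a word destined for storage until service time is available."""
        assert len(self.queue) < self.CAPACITY
        self.queue[addr] = word

    def fetch(self, addr):
        # the associative check of the storage queue happens first
        if addr in self.queue:
            return self.queue[addr]   # newest data, not yet in main memory
        return self.main[addr]

    def drain_one(self):
        """Storage has lowest priority in the communications unit."""
        addr, word = self.queue.popitem()
        self.main[addr] = word
```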
SUMMARY
This paper has presented an example of the application of scratchpads to the computer module of
a large processing system. The utilization of multifunctional scratchpad memories in the Burroughs
B8500 system has enhanced the performance of the
system and has resulted in significant savings of
space, power and hardware.
ACKNOWLEDGMENTS
Recognition and thanks must be directed to the
many systems designers involved in the B8500 evolution. Paramount among these are Richard Bradley, George Barnes, Albert Sankin and Richard
Stokes.
REFERENCES
1. James P. Anderson et al., "The D825-A
Multiple-Computer System for Command and
Control," AFIPS Proceedings, Fall Joint Computer
Conference, 1962.
2. R. V. Bock, "An Interrupt Control for the
B5000 Data Processor System," AFIPS Proceedings, Fall Joint Computer Conference, 1963.
3. James D. McCullough, Kermith H. Speirman
and Frank W. Zurcher, "A Design for a Multiple
User Multiprocessing System," this volume.
SCRATCHPAD-ORIENTED DESIGNS IN THE RCA SPECTRA 70
A. T. Ling
Radio Corporation of America
Camden, New Jersey
INTRODUCTION

As data processing applications become more sophisticated, so do the computers that carry them
out. Real-time, time-sharing, data communications, mass random access, multiprocessing, and
other advanced computer concepts depend largely
on a new level of program and data control within
the processor.

One of the primary design objectives of the RCA
Spectra 70 family was to provide the many capabilities needed to handle these complex new applications while also simplifying the user programs
needed to keep track of data and the condition and
state of programs in the machine. The family concept itself added to processor complexity. To
achieve program compatibility within a family, a
total system specification, complete and complex in
its functional requirements, must be adopted.

To accomplish these objectives, the capabilities
of Spectra 70 processors and their software systems
were greatly expanded over second generation RCA
computers, especially in the areas of data, program,
and input/output control. The ultimate effect of
this increased complexity was that the total system
threatened to become a greater burden than could
be handled at current logic speeds.

During the system concept and basic design
phases of the Spectra 70 program, advanced technology provided solutions to many of the problems
encountered. Research and development efforts and
new manufacturing techniques greatly improved
magnetic memory cost, speed, packaging and reliability. These factors made it possible to use small,
extremely fast, scratchpad memories as integral
parts of computer organization.

General-purpose commercial processors with
scratchpad memories did not appear on the computer market until 1959. The reasons for this were
both technological and conceptual. Technologically,
the small, fast memories needed for effective
scratchpad applications were too costly and not reliable enough. Conceptually, the computer was not
complex enough to justify the large number of machine registers required. Memories used in computers, in addition to the main memories, were small
magnetic core stacks. The purpose of these auxiliary memories was to buffer data between peripheral
devices and the central processors to accomplish
speed compatibility and to translate code or format.
These moderate-speed, small-capacity memories were
equal to or slower than the central processor's main
memory.

System designers, however, proposed or implemented systems, on an experimental basis, using
small fast memory as auxiliary storage. The PILOT
system of the National Bureau of Standards¹ used a
diode-capacitance memory as a secondary storage.
An early development using the list processing technique had been proposed,² using a small memory
for list address scanning. The speed of such a memory device was to be 10 to 20 times faster than the
main memory.
Since 1959, a number of general-purpose commercial computers have been offered, featuring
small fast auxiliary memories. Honeywell 800, UNIVAC 1107, Burroughs D825, and RCA 3301 are
typical examples. Generally, the auxiliary memories
in these machines have been used for address registers, operand stack, index registers, index increments, and program interrupt storage. Although this
did extend machine capabilities, it had not fully exploited the scratchpad potential. Computer instructions normally do not operate directly on the
scratchpad registers without transferring the contents to the execution unit. Further, program manipulations of scratchpad contents require transfers
into the main memory through special move instructions.
GENERAL PROCESSOR ORGANIZATION
The general organization of the Spectra 70/45
and 70/55 processors (Fig. 1) is divided into three
major functional units-memory, input/output, and
central processing unit. The central processing unit
[Figure 1. General processor organization. The drawing shows the central processor (processing matrix, scratchpad memory, utility register, elapsed time clock, intermediate register, status control, memory protect, and direct control) connected to the main memory banks and to the input/output unit, whose multiplexor channel and selector channels attach peripheral devices through the standard input/output interfaces.]
can be connected to a maximum of four main memory banks. Each memory bank is a complete unit,
capable of operating independently. The input/output unit consists of a multiplexor channel
that can control up to 256 simultaneous input/output operations and a variable number of selector channels, each controlling one input/output
operation. These input/output channels-multiplexor
and selectors-interface the peripheral devices
through a standard input/output interface. This
standard input/output interface allows interchangeable connectivity of any peripheral device with the
input/output channels of all models within the
Spectra 70 family.
SCRATCHPAD-ORIENTED DESIGN
The design of the RCA Spectra 70/45 and 70/55
processing unit is based on a unique concept that
makes the scratchpad an integral part of the computer organization, not an adjunct. In essence, the
scratchpad design is a way of arranging the machine
registers with the logic structure built around it. As
shown in Fig. 2a, all the machine registers are arranged into arrays which interact with the rest of
the processor logic through an input and output bus.
The assembly is implemented by a small magnetic
core scratchpad memory as shown in Fig. 2b.
[Figure 2. Block diagrams showing various scratchpad implementations: (a) a flip-flop implemented register scratchpad, with the register arrays between the input and output data buses; (b) a magnetic core implemented scratchpad holding the program, instruction, status, and input/output registers, addressed from the address generator.]
By adding a processing matrix to the scratchpad
memory (Fig. 3), data from registers in the memory are circulated through the processing matrix before being regenerated back into the registers. Logic
functions such as data incrementing, extraction,
movement, table look-up, and status recognition
can be executed with this circulation path. In fact,
these single-operand logic functions are sufficient
for processing interrupt servicing of a multiplicity
of simultaneous input/output operations. This circulation path is passive and self-sufficient and
therefore does not disturb any other register content
in the machine.
The normal compute mode processing, however,
requires the complete repertoire of single-operand
and double-operand logical and arithmetic functions. These are achieved by adding two registers to
the data structure. The Utility Register (UR)
serves as a buffer for the second operand input to
the processing matrix, while the Intermediate Register (IR) acts as an output intermediate storage.
These two registers, in effect, supply a second and
parallel circulation path through the data bus in the
course of instruction execution. Of course, besides
the basic data structure, there are a number of miscellaneous control registers and counters necessary
to maintain an adequate performance level.
In the single-operand mode of operation, the
scratchpad memory operates typically in a
read/wait/regenerate cycle. The scratchpad memory
register location is addressed by the address register. The scratchpad cycle waits while the content of
the addressed register is read out onto the data register. The output of the data register is modified by
the processing matrix, independent of the contents
of the Utility Register. The processing matrix output passes through the regeneration matrix into the
regeneration input of the scratchpad memory to
start the regenerate cycle. The wait between the
scratchpad readout and regenerate cycle varies from
zero to some finite time, depending on the logic
function to be performed in the processing matrix.
For example, the wait time will be zero, if the data
register output is transferred to the output data bus
only.
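The single-operand read/wait/regenerate cycle can be rendered schematically (Python; the function and names are invented for illustration and are not RCA terminology):

```python
# Schematic rendering of the single-operand scratchpad cycle: the
# addressed register is read into the data register, modified by the
# processing matrix, and the result regenerated into the same location.
def single_operand_cycle(scratchpad, address, matrix_fn):
    data_register = scratchpad[address]   # readout onto the data register
    result = matrix_fn(data_register)     # wait varies with the matrix function
    scratchpad[address] = result          # regenerate cycle
    return result

regs = [0] * 128
single_operand_cycle(regs, 5, lambda w: w + 1)   # e.g., data incrementing
```

A double-operand cycle would differ only in that `matrix_fn` also takes the Utility Register's contents as a second input.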
In the case of double-operand operations, the
second operand is either previously set into the
Utility Register by a prior scratchpad memory register transfer cycle, or concurrently set into the
Utility Register from the input data bus, with the
data source from the main memory unit, the Intermediate Register, or other processor logic, while
the first operand is being read out from the scratchpad.
SCRATCHPAD MEMORY FUNCTIONS
Machine registers can be classified into four
groups according to their functional assignments. A
[Figure 3. Scratchpad-oriented processor data structure. The address register selects a scratchpad memory register; readout passes through the data register and the processing matrix (with the utility register supplying a second operand) and returns through the regeneration matrix to the scratchpad, while gates connect the input and output data buses to the main memory unit, the input/output unit, the intermediate register, and other processor logic.]
machine register is a 32-data-bit register used
either as an address register or data register, depending on the specific assignment.
1. Program Registers hold information regarding the complete and current status of
the program to which they are assigned. These registers are required for execution of a program, from instruction to instruction. One
set of program registers in the Spectra 70
family is called a program state. A full
program state consists of 16 General Registers, 3 program control registers, namely
the Program Counter (PC), Interrupt Status (IS), and Interrupt Mask (IM), and 4
double-length Floating Point General Registers. The General Registers are for assignment, within the current user program, as
indexes, base addresses, data accumulators,
or intermediate storage registers.
2. Instruction Registers are used in the execution algorithm of an instruction as defined by an operation register. They are
not held over or used by a subsequent instruction, but are treated as utility registers
without permanently assigned functions
and are used in various ways as needed in
the course of algorithm execution.
3. Status Registers hold the current status of
the processing unit. They are used by the
control structure to indicate the program
interrupt status and the current program
connection of the processing unit.
4. Input/Output Registers are used for the
simultaneous operation of peripheral devices. In the Spectra 70 system, each operation requires a subchannel consisting of a
set of three registers. A fourth register is
used for final standard interface status, reporting at the termination of an operation.
In addition, utility registers facilitate data
assembly and chaining operation execution. These registers are preset at the initiation of an input/output operation in the
normal processing mode and are sufficient
to execute a sequence of input/output operations with branching and chaining ability according to a control word list in the
main memory.
SPECTRA 70 SCRATCHPAD
These four classes of machine registers are arranged into a scratchpad memory array as shown in
Fig. 4. The scratchpad memory is large enough to
provide one set of input/output registers for the
multiplexor and up to six selector channels. In addition, a significant feature is that there are four program states, namely four sets of program registers,
[Figure 4. Spectra 70/45 and 70/55 processor general scratchpad layout. The array holds the instruction registers; the multiplexor channel registers; the selector channel 1 through 6 registers; the control registers, general registers, general register addressing fields, and interrupt flags for program states 1 through 4; and the Floating Point General Registers.]
provided in the scratchpad. The four program states, two executive program states and two problem program states, are not full program states as defined
earlier but are tailored to the software system assignment described below.
One set of the four program states is selected at a
time according to the setting of a Program State
Register to be operated on by the processing matrix. The addressing of the program registers by the
processing control logic is automatically modified
by the Program State Register at the scratchpad address generation to select the proper register according to the rule of the scratchpad layout. In this
way, the normal instruction execution algorithms
are concerned with only one program state, namely
the one determined by the setting of the Program
State Register. By switching the setting of the Program State Register between instructions, the processing logic automatically operates on a different
program state, and consequently a different program. However, the status of the previous program
state is preserved. Since normal instruction execution is confined to registers within a program state,
each program state is protected from others in the
computer. However, two privileged instructions,
called "Load Scratchpad" and "Store Scratchpad,"
are provided for privileged program access to any or
all of the registers in the scratchpad memory. These
instructions can only be executed when the program
is assigned a privileged operation status, as in the
case of the executive program.
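The address-generation rule just described can be sketched as follows (Python; the flat layout and register counts here are simplifying assumptions, since the actual scratchpad layout tailors each state):

```python
# Sketch of program-state address modification: the register number in
# an instruction is combined with the Program State Register setting,
# so the same execution algorithm reaches a different program state's
# registers after a state switch. The uniform 16-register-per-state
# layout below is an assumption for illustration.
REGISTERS_PER_STATE = 16

def scratchpad_address(program_state, register_number):
    """Map (Program State Register setting, general register number)
    to a scratchpad location."""
    assert 1 <= program_state <= 4
    assert 0 <= register_number < REGISTERS_PER_STATE
    return (program_state - 1) * REGISTERS_PER_STATE + register_number
```

Switching the Program State Register between instructions changes only the base of this mapping, which is why the previous program state's registers are preserved untouched.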
The Spectra 70 software system assigns the four
program states for the following program functions:
Program State 1 - Processing State
Program State 2 - Interrupt Response State
Program State 3 - Interrupt Control State
Program State 4 - Machine Condition State
Essentially program states 1 and 2 are alternate
object programming states for multiplexed programming. Program state 2 is for interrupt processing or
input/output programs and does not have a separate
set of Floating Point General Registers. When a
floating point instruction is issued in program state
2, it is executed with the Floating Point General
Registers of program state 1. In fact, floating point
instructions in either state operate with the same
set of Floating Point General Registers. This is also
true for program states 3 and 4.
Program states 3 and 4 are privileged executive
program states. Because of their functions, either
program state requires less than 16 General Registers. On the other hand, program state 3 must have
the ability to manipulate the three Program Control
Registers that are not normally directly addressable.
The scratchpad arrangement takes advantage of the
extra general register designation capability in the
instruction format to allow program state 3 to address the Program Control Registers of states 1 and
2 as if they were General Registers. This allows
the executive program to use the full power of
the instruction complement in manipulating these
registers.
Similarly, program state 4 can address the first
six instruction registers used for instruction execution algorithms and the two operand address registers directly. Thus, the program register capacities
of the four program states are:
                                  Program State
                                   1    2    3    4
General Register                  16   16    6    5
Program Count Register             1    1    1    1
Interrupt Status Register          1    1    1    1
Interrupt Mask Register            1    1    1    1
Floating Point General Register    8    -    -    -
Total                             27   19    9    8

TOTAL CAPACITY: 63 registers
Standard input/output operation control in the
Spectra 70/45 and 70/55 processors requires a set
of four registers-a Control Address Register, two
Channel Control Word Registers, and a Data/Status
Register. In the case of the multiplexor channel, the
first three registers for each of up to 256 subchannels
are stored in a hidden portion of the first main memory bank. A specific set is transferred into the multiplexor channel set of four input/output registers in
the scratchpad for servicing. These registers are
restored before servicing the next subchannel. In
addition, a duplicate set of four input/output registers
is used for termination reporting to the software
system. In the case of the selector channel, two additional registers are used for pre fetching of chain
controls or other utility functions. The Input/Output Register capacities are:
                       Channel Number
                   0    1    2    3    4    5    6
Basic Set          4    4    4    4    4    4    4
Termination Set    4    -    -    -    -    -    -
Utility Set        -    2    2    2    2    2    2
Total              8    6    6    6    6    6    6

TOTAL: 44 I/O Channel Registers

(Channel 0 is the multiplexor channel; channels 1 through 6 are the selector channels.)
The balance of the scratchpad registers are used
for instruction and status registers. Since the result
of every instruction is stored in program registers,
the instruction registers normally convey no additional information needed for subsequent instructions. The instruction registers are truly utility registers. Both the 70/45 and 70/55 processors are
free to use these registers for instruction algorithms
in the most efficient way.
Because of their performance requirements, most
status registers are in flip-flop registers outside
the scratchpad memory. But pertinent information
from these registers is also stored in the Interrupt
Status Registers in the program states. The Interrupt Flag Register, however, is in the scratchpad,
addressable by program state 3 as a General
Register.
PROGRAM STATE SWITCHING AND
THE INTERRUPT SYSTEM
The above-described processor design strives to
minimize the necessity of storing connective program information outside the scratchpad at the
completion of each instruction. The program state
can be switched by merely changing the Program
State Register setting. Program state switching can be
caused by an occurrence of one of the 32 program
interrupt requests, by a privileged instruction called
"Program Control," Qr as a result of an operation
on the Interrupt Mask Register (IM) or the Interrupt Flag Register (IF).
The interrupt system is connected to program
states 3 and 4. When one or more of the program
state switching reasons occur in the system, the processor normal mode automatically switches into the
program interrupt processing procedure at the termination of the current instruction. This procedure
involves the following hardware steps:
1. Program interrupt requests, if any, are entered into the Interrupt Flag Register (IF)
in the scratchpad memory for permanent
recording until they are individually processed.
2. The current program state's Interrupt Mask
Register (IM) determines the allowability
of the outstanding interrupt requests in the
Interrupt Flag Register (IF). This permits
complete programming control of the interrupt request processing. It is possible that
none of these requests will be allowed by
the current program state. When this happens, no program interrupt occurs but the
interrupt requests remain outstanding in
the Interrupt Flag Register.
3. The allowable interrupt requests are processed by priority positions. Then a priority weight is generated and stored in an assigned general register in program state 3
or program state 4, according to the interrupt request.
4. The corresponding interrupt request bit in
the Interrupt Flag Register (IF) is then
reset. Only one interrupt request, the highest priority of the allowed requests, is processed at one time.
5. According to the interrupt request, certain
additional preparatory functions are performed and the interrupt state registers of
the outgoing and the incoming program
states are updated.
6. The setting of the program state register is
changed to program state 4 for a system error interrupt request or to program state 3
for all others. Instruction execution proceeds next with the new program state register setting.
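Steps 1 through 4 amount to a mask-and-select operation on two registers. A minimal sketch follows; the register width and the lowest-bit-first priority order are illustrative assumptions, not taken from the text:

```python
def select_interrupt(flags, mask):
    """Pick the highest-priority allowed request, or None.

    flags: Interrupt Flag Register (IF) - a set bit is an outstanding request.
    mask:  Interrupt Mask Register (IM) of the current program state -
           a set bit means the request is allowed.
    Priority order (lowest bit number first) is an assumption for illustration.
    """
    allowed = flags & mask
    if allowed == 0:
        return None               # no interrupt occurs; requests stay pending
    return (allowed & -allowed).bit_length() - 1   # lowest set bit

def take_interrupt(flags, mask):
    """Steps 2-4: select one allowed request and reset its IF bit."""
    bit = select_interrupt(flags, mask)
    if bit is None:
        return flags, None
    return flags & ~(1 << bit), bit
```

Note that a disallowed request is left in the flag register, exactly as in step 2 of the text: it remains outstanding until some program state's mask admits it.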
THE SCRATCHPAD MEMORY UNIT
To implement the Spectra 70 scratchpad oriented
design, a single 128-word magnetic core memory
unit was selected for both the 70/45 and 70/55
processors. The memory unit uses magnetic cores of
30 mils outside diameter and 10 mils inside diameter, wired in a 2-core-per-bit fashion. A three-wire core threading format is used: two wires for access
currents and one for a common sense/digit signal.
The word oriented linear selection system is accomplished by a transistor-diode cross-bar matrix.
Each memory cycle accesses 32 bits in parallel
plus 2 parity bits. The access time is 120 nanoseconds with a complete read-regenerate cycle
time of 300 nanoseconds. The scratchpad memory
unit can operate in one of two modes-straight
read/regenerate mode or read/wait/regenerate
mode. The "wait" in the latter case is for completion of the processing matrix operation.
Figure 5 is a simplified block diagram showing
the addressing, the sense, and the regenerate systems. The addressing system uses two drivers per
word, each driving into a voltage switch circuit. A
discharge circuit helps to speed up the discharging
of the voltage switch line at the end of the regenerate cycle. The sense system consists of 34 difference
amplifiers, with strobing. Regeneration information
does not come automatically from the data register
but is supplied, at the regenerate inputs with two
digit drivers per bit, by a regenerate matrix in the
processor logic as shown in Fig. 3. The data register is under processing unit control, independent of
the memory cycle timing. The data register is used
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
1965
FROM ADDRESS GENERATOR
AND CONTROL
DISCHARGE
CIRCUIT
DIFFERENCE
SENSE
AMPLIFIER
VS
Voltage Switch
RCL
Read Clock Level
RCD
Read Clock Driver
WCL
Write Clock Level
WCD
Write Clock Driver
DD
Digit Driver
DWCL Digit Write Clock Level
Figure 5. Spectra 70/45 and 70/55 scratchpad memory block diagram.
for logic operations without regard to the scratchpad memory operation.
The actual scratchpad memory unit is packaged
in a standard Spectra 70 printed circuit logic platter
(Fig. 6). The unit itself occupies approximately
two-thirds of the 18 x 18-inch platter; the rest
of the platter contains processor logic.
DESIGN CONSIDERATIONS

The design considerations in selecting the method of scratchpad implementation are cost, packaging and size, accessibility, speed of operation, and reliability. The choice of implementation consists of an integrated semiconductor flip-flop register array or a magnetic core device. Integrated semiconductor register banks are not available for design consideration at this time.

Figure 6. Scratchpad memory in the 70/45 processor.

Cost

A simplified cost comparison between a flip-flop implemented scratchpad and a magnetic core implemented scratchpad is illustrated in Fig. 7a. The cost ordinate is in normalized cost units for comparison purposes. The number ordinate is in number of 32-
data-bit registers. Each of the two curves is really
a family of curves as a function of operation speed.
The exact relative positions would vary accordingly.
In this illustration, the magnetic core device has an
access time of 120 nanoseconds and. a cycle time of
300 nanoseconds. The flip-flop scratchpad is the
Spectra 70 integrated circuit packages with 3-pair access, 4½-pair cycle. Here the crossover point is
in the range of 8 to 16 registers. Thus, when a
computer cost and performance specification requires 16 or more registers, as in the case of Spectra 70, a magnetic core device becomes increasingly
advantageous. The minimum number of registers
for a single program state and 3 input/output channels for the 70/45 processor is greater than 40. By
using a magnetic core scratchpad, it is possible to
provide 128 registers for a small cost increase. The
extra capacity makes it economically possible to
provide four program states and a multiplicity of
simultaneous input/output controls. Furthermore,
extra utility registers with almost logic-circuit-speed access, overlapping with the comparatively slower
main memory, have permitted use of algorithms to
achieve higher instruction execution speed than
otherwise would be possible.
Speed
The scratchpad memory access and cycle speed
must be economically achievable by current state-of-the-art standards, that is, in the 100-nanosecond
access speed range. For the logic circuit speed of 25
Figure 4. Memory cycle time extrapolation based on an expanded MILDATA data base (memory cycle time in microseconds, logarithmic scale, versus date of first machine installation, 1950-1970).
The upper curve in Fig. 4 represents the least
squares fit to the total data base. Once plotted, all
points above that curve were eliminated from the
data base and a second curve fitted; finally the
points above the second curve were dropped and a
third curve was generated. The three curves thus
represent:
1. The average for the total data base.
2. The average of the leading or, in this case,
the lower half of the data base.
3. The average of the leading quarter of the
data base.
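The three-curve procedure above can be sketched as an iterated trimmed least-squares fit. The straight-line fit (in log cycle time versus year) and the data values are illustrative assumptions only; the MILDATA points themselves are not reproduced here:

```python
def lsq_line(pts):
    """Ordinary least-squares fit of y = a + b*x to a list of (x, y) points."""
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

def leading_edge_fits(pts, rounds=3):
    """Fit a curve, drop the points above it, and refit - repeated three
    times, as in the text: total average, leading half, leading quarter."""
    fits = []
    for _ in range(rounds):
        a, b = lsq_line(pts)
        fits.append((a, b))
        pts = [(x, y) for x, y in pts if y <= a + b * x]
        if len(pts) < 2:          # not enough points left to refit
            break
    return fits

# Illustrative data only: (year, log10 of memory cycle time in microseconds).
data = [(1955, 1.0), (1957, 0.8), (1959, 0.9), (1960, 0.6),
        (1961, 0.3), (1962, 0.0), (1963, 0.2), (1965, -0.2)]
fits = leading_edge_fits(data)
```

Each successive fit tracks only the machines below the previous curve, which is what lets the third curve stand for the leading quarter of the technology.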
If a specific datum is selected and the three extrapolations of Fig. 4 are plotted for it, as in
Fig. 5, a trend indicating the leading edge of the
technology for the particular time period should
emerge.
This hoped-for trend in memory cycle time
does not appear to be emerging. This implies, if the
extrapolation technique is valid, that a saturation or
slowing down of technological developments in this
area is occurring.
This type of extrapolation has been made by the
writer three times during the past six years. Table 6 summarizes the three results for the extrapolated memory cycle time obtained for the year 1970. The number of machines considered in each extrapolation is listed in the second column of Table 6.
Returning to the detailed specification of the
control memory, we can estimate a main memory
cycle time of approximately 300 nanoseconds for
1970. Based upon this figure and the speed ratio
Figure 5. Extrapolated memory cycle time for machines introduced in 1970, based upon an expanded MILDATA data base (extrapolated memory cycle time versus number of machines in the data base, 10 to 200).
Table 6. Comparison of Three Extrapolations Concerning Memory Speed in 1970.

  Data Base   Number of   Average Memory Cycle
  Date        Machines    Time in 1970 (microseconds)
  1959        30          0.25-0.50
  1963        154         0.30
  1965        208         0.330
developed above (control memory cycle time should
be 4 to 8 times as fast as the main memory cycle
time), the memory cycle for a control memory in
1970 should range from approximately 40 to 75
nanoseconds.
The capacity and the speed of the unit have been
established. The final question to be considered in
this paper-the selection of the control memory implementation technology-can now be examined.
The control memory could be implemented via one of several different physical implementations, for example, semiconductive flip-flop elements or magnetic thin films.
The general cost (in dollars) to produce (on a production basis) a control memory of a given capacity and speed can be defined as:

C = I + D · B    (1)

where
C = total product cost in dollars,
I = initial "investment" cost per machine for the particular technological approach (the sum of the costs for the drivers, sense amplifiers, address and data registers, etc.) in dollars,
D = cost per bit of storage (in dollars), and
B = number of bits.

For two distinctly different methods of implementation the crossover point occurs when the total costs of the control memory, in production for both, are equal. Therefore:

I_A + D_A · B = I_B + D_B · B    (2)

Solving for B, we obtain:

B = (I_A - I_B) / (D_B - D_A) = ΔI / ΔD    (3)

ΔI and ΔD represent the differentials in "investment" and cost per bit, respectively.

Figure 6 presents Eq. (3) in a convenient form. A single example will suffice to explain its use. Assume we are comparing a magnetic thin film memory with a set of flip-flop registers for the control memory in a machine. Here I_A will represent the peripheral cost to support a film memory for this purpose. In practical terms, I_A is the sum of the costs for the memory drivers, sense amplifiers, address and data registers, etc. Hypothetically I_B could be considered to be equal to zero. D_A and D_B would be the cost per bit for each approach. Summarizing:

I_A = $10^3, I_B = 0, D_A = $0.50, D_B = $2.50

Therefore:

ΔI = 10^3 and ΔD = 2.00    (4)

Using Fig. 6, we see for this example that the break-even point is at 500 bits. For larger capacities a core memory is more economical; for smaller capacities the active flip-flop implementation is more economical.
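The worked example can be checked in a few lines; the function below is simply Eq. (3):

```python
def break_even_bits(i_a, i_b, d_a, d_b):
    """Capacity B at which two implementations cost the same (Eq. 3).

    i_a, i_b: fixed "investment" costs in dollars.
    d_a, d_b: per-bit costs in dollars.
    """
    delta_i = i_a - i_b
    delta_d = d_b - d_a
    return delta_i / delta_d

# Film memory (A) versus flip-flop registers (B), with the example's numbers:
b = break_even_bits(i_a=1000.0, i_b=0.0, d_a=0.50, d_b=2.50)
# b = 500.0 bits: above this capacity the memory-array approach is cheaper.
```

At B = 500 both sides of Eq. (2) give $1250, confirming the crossover.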
SUMMARY AND CONCLUSIONS
Historically, Honeywell was the first to develop
and utilize the concept of scratchpad memories for
control purposes within its data processing systems.
The H-800 and the Series 200 systems each employed different embodiments of the basic concept.
Increasing utilization of the fundamental concepts
both by other computer manufacturers and Honeywell (for example the expanded and extended control memory system of the H-8200) substantiates
the growing recognition by systems designers of the
usefulness of such devices.
Extrapolations of current trends with an eye toward future developments in scratchpad memories
lead one to the definite conclusion that scratchpad
memories will be a significant factor in the design
of future systems. In substantiation, three facts can
be noted: (1) the rapid and significant decrease in
the cost of small high-speed memories; (2) the
definite trend toward the design and development
of more sophisticated and complex systems; and
(3) the increasing desire of the systems designer to
emancipate the programmer from the individual
machine characteristics to the greatest extent possible.
ACKNOWLEDGMENTS
The author wishes to express his appreciation to
the members of Honeywell Electronic Data Processing Division who made the submission and
presentation of this paper possible. In particular,
Mr. John Gilson, Miss Sonya Shapiro and Mr. Robert Zinn showed a great deal of patience in reviewing and assisting in clarifying my thoughts during
the preparation of this paper.
Figure 6. Differential cost analysis, comparing two distinctly different implementation techniques (capacity in bits at the break-even point versus increment investment in dollars, 100 to 100,000).
REFERENCES
1. Honeywell 800 Programmers' Reference Manual, Honeywell Inc., 1964.
2. S. D. Harper, "Automatic Parallel Processing," Proceedings of the Computer Data Processing Society of Canada, 1960, pp. 321-331.
3. N. Lourie et al., "Arithmetic and Control
Techniques in a Multi-Program Computer," Proceedings Eastern Joint Computer Conference, 1959.
4. H. W. Schrimpf, "Information Handling Apparatus for Distributing Data in a Storage Apparatus," U.S. Patent 3,142,043, July 21, 1964.
5. Series 200 Programmers' Reference Manual - Models 200/1200/2200, Honeywell Inc., 1965.
6. Series 200 Programmers' Reference Manual - Model 120, Honeywell Inc., 1965.
7. Series 200 Programmers' Reference Manual - Model 4200, Honeywell Inc., 1965.
8. Series 200 Summary Description-Model 8200,
Honeywell Inc., 1965.
9. N. Nisenoff, "MILDATA, An Optimizing
Study of a Modular Digital Computer System," Final Report, Contract DA 36-039-AMC-03275 (E),
vol. II, 1965 (AD 462 043).
10. L. C. Hobbs, private communications.
A BOUNDED CARRY INSPECTION ADDER FOR FAST PARALLEL ARITHMETIC
Emanuel Katell
Electronic Associates, Inc.
West Long Branch, New Jersey
INTRODUCTION

This paper suggests a new mechanism for parallel, high-speed arithmetic for digital computers. It is based on a bounded carry inspection adder (BCIA) that operates on ternary coded data words. The recoding circuitry is of the type currently in use in computers that perform high-speed multiplication by the modified short cut (MSC) technique of shifting over ones and zeros.1,2 The uniqueness of the BCIA lies in the application of this recoding to addition, and to an even greater speed-up of the multiplication technique that fostered it. In the process of multiplication, repeated additions/subtractions are required. The BCIA speeds up the process by providing an addition technique that yields the sum in parallel in one step through the elimination (bounding) of carry propagation.

Previous attempts at multiplication speed-up required separate adders for addition and multiplication arithmetic,2,3 or employed adders that were not truly parallel in the one-step sense,4,5 or retained difficulties in sign and overflow detection.6,7 In the method described, the same adder is used for addition and multiplication arithmetic. Further, the nature of the recoding technique provides for one-step summation: since there are never adjacent digits in the recoded summands, there can never be any carry propagation. Any question of sign or overflow is resolved by providing an N + 2 bit register for summation, where N is the conventional signed word length.

RECODED-BINARY ARITHMETIC
The increased prominence of scientific computation has focussed attention on the need for increased arithmetic speed since, in general, scientific
problems require a higher ratio of arithmetic-tohousekeeping operations. The search for increased
speed has centered on faster circuitry and faster algorithms. The latter approach has led to recodedbinary arithmetic.
High Speed Multiplication
The method currently most used for binary multiplication is based on the decimal short-cut multiplication method used on desk calculators. The
technique centers on the ability to replace a string of ones between positions a and m by the sum from i = a to m of 2^i = 2^(m+1) - 2^a. For example, the binary number 00111110 = 2^5 + 2^4 + 2^3 + 2^2 + 2^1 can be recoded as +2^6 - 2^1.
Thus, in multiplying, five additions and four shifts
can be replaced by one subtraction, a single shift
of five, and one addition. Lehman1 extended this
basic recoding technique and demonstrated that
isolated ones and zeros could be treated as part of
a longer string of digits. An isolated one calls for an
addition of the multiplicand, while an isolated zero
calls for a subtraction. For example, 00110110, which would ordinarily be treated as two sequences, can be recoded as +2^6 - 2^3 - 2^1. The justification for this treatment of isolated units stems from the two self-evident identities +2^N - 2^(N-1) = +2^(N-1) and -2^N + 2^(N-1) = -2^(N-1). The result is that there are
never two sequential addition/subtractions.
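The recoding described here — strings of ones collapsed, isolated ones and zeros absorbed, never two adjacent nonzero digits — can be sketched with the standard non-adjacent-form (NAF) recoding, which has the same no-adjacency property. The function below is an illustrative implementation, not the paper's circuit:

```python
def recode_naf(n):
    """Recode a nonnegative integer into signed digits (+1, 0, -1), LSB first,
    such that no two adjacent digits are nonzero."""
    digits = []
    while n:
        if n & 1:
            d = 2 - (n % 4)      # +1 if n = 1 (mod 4), -1 if n = 3 (mod 4)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits

def value(digits):
    """Evaluate a signed-digit word (LSB first)."""
    return sum(d * (1 << i) for i, d in enumerate(digits))

# 00111110 (= 62) recodes to +2^6 - 2^1, and 00110110 (= 54)
# recodes to +2^6 - 2^3 - 2^1, matching the examples in the text.
```

Both worked examples from the text fall out directly: 62 gets one nonzero digit at each end of its string, and in 54 the isolated zero is absorbed.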
The preceding techniques may be given a more formal representation by considering a ternary coding of binary numbers such that:

sum from t = 0 to m of B_t 2^t = sum from t = 0 to m+1 of (-1)^(S_t) C_t 2^t

where B_t, S_t, and C_t are binary variables and t is the positional indicator (t = 0, ..., m). An operation is performed in any tth position in which C_t = 1, with the sign of the operation controlled by S_t. In the above, S_t = 0 commands an addition. The entire recoding logic consists of an implementation such that (a prime denotes the complement):

C_t = (B_t ≠ B_{t-1}) C'_{t-1}    (1)
S_t = (C_t B_{t+1}) + (C'_t S_{t-1})    (2)

Note, from Eq. (1), that C_t C_{t-1} ≠ 1, and thus operations can never be required in two successive cycles.
Ternary Coded Addition for BCIA
The Bounded Carry Inspection Adder operates
by using the code commands directly as the representation of the summands, the only additional requirement being that both data words are recoded.
Addition takes place in parallel in exactly one clock
pulse, hence the use of the "I" in BCIA to indicate
addition by mere inspection of the summands. The
rules for addition are stated in the following equations, using the notation above (a prime denoting the complement) and with the subscripts a and b representing terms in the first and second summands, respectively:

(S'_t C_t)_a (S'_t C_t)_b = S'_{t+1} C_{t+1} C'_t    (3)
(S'_t C_t)_a (S_t C_t)_b = C'_t    (4)
(S'_t C_t)_a (d C'_t)_b = S'_t C_t    (5)
(S_t C_t)_a (S_t C_t)_b = S_{t+1} C_{t+1} C'_t    (6)
(S_t C_t)_a (d C'_t)_b = S_t C_t    (7)
In the preceding equations, the d's represent "don't
cares" and the a and b subscripts may be interchanged since the commutative laws must obviously
apply.
Since the mathematical representation may tend
to obscure rather than explain, a simple example is
shown in Fig. 1.
Exponent of 2t
t = 7 654 3 2 1 0
Augend
A
Addend
B
+
+
= 0 0 1 1 1 0 1 0 = 58
+
:::: 0' 0 1 1 0 1 1 0 = 54
112
Augend in BCIA
notation
Addend in BCIA
notation
A=01000010
B=01000000
Note: 0 is a non-value space zero. (See Appendix A.)
Figure 1. Addition example.
In the figure the plus and minus notations above the conventional-binary summands indicate that an operation is to be performed (C_t = 1) and give the sign of that operation (+ for S_t = 0, - for S_t = 1). The example illustrates the modified Boolean operations to be performed. The complete rules for addition (positive numbers) are summarized in Table 1.
Table 1. Addition Table for the BCIA (0̄ denotes the non-value space zero; a blank means no entry).

  Summands        Position Entry
  A       B       t + 1       t
  1       1       +1          0̄
  1       0                   0̄
  0       0̄                   0
  0       0       -1          0̄
  1       0̄                   1
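Because neither summand ever has two adjacent nonzero digits, every position of the sum can be formed independently in one step: equal signs in a column move one place left, opposite signs cancel, and a lone digit passes straight through. The sketch below uses a dictionary mapping bit position to signed digit as an illustrative representation (not the machine's register format):

```python
def bcia_add(a, b):
    """One-step addition of two recoded summands.

    a, b: dicts mapping bit position t to a signed digit (+1 add, -1 subtract),
    with no two adjacent positions occupied within either word - the property
    that bounds the carry to a single place.
    """
    out = {}
    for t in set(a) | set(b):
        s = a.get(t, 0) + b.get(t, 0)
        if s == 2:
            out[t + 1] = 1        # +2^t plus +2^t -> +2^(t+1)
        elif s == -2:
            out[t + 1] = -1       # -2^t plus -2^t -> -2^(t+1)
        elif s != 0:
            out[t] = s            # a lone digit passes straight through
        # s == 0 with both digits present: opposite signs cancel (space zero)
    return out

def value(word):
    return sum(d * (1 << t) for t, d in word.items())

# Fig. 1's example: 58 = +2^6 - 2^3 + 2^1 plus 54 = +2^6 - 2^3 - 2^1.
fifty_eight = {6: 1, 3: -1, 1: 1}
fifty_four = {6: 1, 3: -1, 1: -1}
# The sum is {7: +1, 4: -1}, i.e. +2^7 - 2^4 = 112, with no carry chain.
```

The carry position t + 1 is always free: an occupied t + 1 together with both summands occupied at t would mean adjacent nonzero digits within one summand, which the recoding forbids.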
The rules for addition with both of the summands negative are identical. It is necessary only
to keep track of the sign, as in conventional sign
and magnitude addition. Addition of oppositely
signed numbers is readily implemented. The numbers would be recoded normally, independent of
sign. Then the complementary output of the register
containing the recoded-negative summand would be
used for the addition. The sign of the sum would be
obtained from the sign decoder, as before.
Overflow detection is readily obtained in the
BCIA. Conventional notation for an N-bit computer usually implies an N - 1 bit number and a sign bit in the
Nth position. For addition, an N + 1 sum register
is used to provide for overflow. In BCIA addition,
an N + 2 addition register is employed. The sign
is in position N + 2, while position N is used for a
"non-overflow overflow," and position N + 1 used
for actual overflow. The "non-overflow" must be
provided for since the recoding of an N - 1 bit
word can produce a 1 in the Nth position. This
would normally be reabsorbed into the standard
word length when the sum is formed. If both summands have a 1 in this position, a conventional
overflow is formed in position N + 1, and is treated
in a normal manner.
Subtraction is performed in the manner previously indicated for opposite-sign addition. The complementary output of the recoded subtrahend register is always used in the process. The choice of the
true or complementary output of the recoded minuend depends on its sign, with the complement chosen for negative sign. Addition then proceeds as before. Note that there is no need to sense the overflow bit, as in conventional subtraction, to determine whether subtraction has proceeded in the right
direction (i.e. has a larger number been subtracted
from a smaller number, necessitating a correction
cycle?). This stems from the fact that the representation of the most significant bit in the answer is
the sign of the answer, while the magnitude of the
answer is already correct. In this sense, as well as in
the fact that borrow propagation is eliminated, the BCIA produces faster subtraction than previous
techniques.
Fast Multiplication
McSorley,2 in his description of Stretch arithmetic, pointed to the speed advantage in using special purpose carry-save adders in the multiplication
process, while using carry-look-ahead for ordinary
high-speed addition. With BCIA arithmetic, the
same adder is used for both multiplication and addition. Further, multiplication speed is considerably
increased.
In the carry-save technique, all intermediate additions can be effectively performed without carry
propagation, while the final addition requires conventional carry ripple. With BCIA representation,
one-step addition/subtraction is used for all intermediate operations, and for the final one as well. In
multiplication, both multiplier and multiplicand are
used in recoded form. It should be noted that the
full multiplication operation code is contained in
the format of the multiplier word. Thus: Multiplier
54 = 1 0̄ 0̄ 0 0̄ 0 0̄ (BCIA ternary coding) calls
for a shift, subtraction of the multiplicand, another
shift and subtraction, then a double shift and an addition of the multiplicand. It seems apparent, therefore, that a further saving in overall hardware is
obtained with BCIA arithmetic, since the
multiplication decoder can be considerably simplified. It is also seen that a further speed-up can be
obtained, since multiplication sequence decoding,
which normally proceeds serially by groups, is already "stored" in the multiplier word. And finally,
it should be noted that the elimination of carry due
to the BCIA permits formation of the most significant bits first, for both serial and parallel machines.
This is obtained by merely using left shifts instead
of right shifts and initiating the multiplication from
the left. The first significant answer bit becomes
available in one step, and the ith bit becomes
available in the ith step. It thus becomes possible
to implement a rounded-multiply operation which
terminates after the minimum commanded number
of digits has been formed. Such an operation would
find use in variable byte arithmetic as well as in
significant-digit analysis of floating point calculations.8
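The multiplier word acting as its own operation code can be sketched as follows; the recoder here is the illustrative non-adjacent-form recoding standing in for the paper's recoding circuitry, and the loop plays the role of the shift/add-subtract sequencer:

```python
def recode(n):
    """Signed-digit (non-adjacent) recoding, LSB first - illustrative."""
    digits = []
    while n:
        d = (2 - (n % 4)) if n & 1 else 0
        n = (n - d) >> 1
        digits.append(d)
    return digits

def multiply(multiplicand, multiplier):
    """Multiply by scanning the recoded multiplier: every nonzero digit
    commands one add or subtract of the shifted multiplicand; zero digits
    are shifts only. Returns (product, number of add/subtract operations)."""
    product = 0
    ops = 0
    for i, d in enumerate(recode(multiplier)):
        if d:
            product += d * (multiplicand << i)
            ops += 1
    return product, ops

# A multiplier of 54 recodes with three nonzero digits, so multiplication
# by 54 costs three add/subtract operations regardless of word length.
```

Scanning the recoded word from the other end gives the most-significant-first variant described in the text, since the digit commands themselves are order-independent.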
Division
An analysis of a ternary division algorithm using
BCIA notation is lengthy and will not be treated at
this time. It may be briefly stated, however, that
the method employs a shifting over space zeros.
That is similar to the method of shifting over ones
and zeros presently employed for high-speed division.2,12,13 The justification is similar since, in both
cases, the digits shifted over are merely position
indicators and not value digits. Preliminary analysis
of the shift average, a figure of merit proposed by
Robertson, provides a figure greater than three.
This is obtained without the need for generating and storing multiples of the divisor, typically 1/2, 1, and 3/2.14 A penalty is paid, however, in going to a two-step addition process, since intermediate partial
remainders will now sometimes have adjacent digits.
This will result in a transfer digit,5 which is absorbed
in the second step.
IMPLEMENTATION
The full description of implementation of the
concepts suggested by this paper can only be con-
692
PROCEEDINGS - - FALL JOINT COMPUTER CONFERENCE,
sidered by a lengthy treatise on the design of a digital computer in which the use of BCIA arithmetic
is extended to the entire machine. If Robertson's
definition of redundancy9 is extended to include the
BCIA form of ternary coding, then such a computer
is fully possible. There is a hierarchy of approaches
which could be implemented. Briefly stated, these
are:
1. Total. In this approach, at a cost of program assembly time, but with a saving in
output decoding, the entire machine, including instructions, data and addresses, is
structured for BCIA ternary representation.
2. Arithmetic only. Here, only data words
are converted, as inputted. Address modification would require local recoding.
3. Arithmetic, as needed. In this approach,
all words are in conventional binary form
and are converted as required.
Certainly the Total approach would be the most
desirable. The problem of a ternary representation
can be solved on a multiregister basis or can be implemented with three-state devices, such as the tunnel diode, or appropriately wound magnetic cores.
Thus, for example, the ability to implement 0, the
non-value space zero, presently exists. The multiregister technique represents a brute-force approach
that would provide two N-bit registers to distinguish among the three states. The way in which the
registers are filled and the use in BCIA addition are
shown in Appendix A. These are but two approaches:
other, more elegant, schemes can be developed.
The recoding of a binary word into ternary form
has been treated elsewhere. 2 ,12 For completeness,
Appendix B shows a possible logical implementation for a four-bit word. Extension to longer word
lengths is treated in the references. Essentially,
these reduce to logic diagram descriptions of
the equations presented earlier in this paper.
CONCLUSION
An extension of ternary coding to permit the
elimination of carry propagation in addition and
subtraction has been presented, and a new parallel
Bounded Carry Inspection Adder has been proposed. The speed-up of fast mUltiplication by virtue
of adder speed-up and the ability to use the same
adder for both multiplication and addition has also
been indicated. The technique described provides
for simple overflow detection and for round-off
techniques that suggest significant-digit-arithmetic
capability.
While the discussion has centered on a ternary
representation, and electronic means for such a representation have been· mentioned, it is noted that
normal binary hardware could be used for an equivalent representation. The possibility of redundant
representation in an entire computer has been
opened through the use of redundancy for both
summands in the adder, thus extending such representation to the entire arithmetic unit. It is felt this
can provide the means for faster parallel execution
in a digital computer.
APPENDIX A
In handling BCIA operations, a means of representation of the non-value space zero, 0, must be
provided. As is seen in Appendix B, the determination of the space-zeros and their location in the
recoded word is determined from the type of recoding used for multipliers. The ones of the BCIA
word represent the addition commands of the multiplier, the zeros represent the subtractions, and the
shift commands represent space-zeros. The length
of shift represents the number of sequentially registered 0̄'s. In this manner, the decimal number 54 is recoded as: shift one (0̄), subtract (0), shift one (0̄), subtract (0), shift two (0̄0̄), and add (1). If an eight-bit word is considered, an additional shift (0̄) must be entered. Written in BCIA form, this becomes

54 = 0̄ 1 0̄ 0̄ 0 0̄ 0 0̄
To represent this using the brute force approach
of two N-bit registers, the procedure would be as
follows:
1. Load register A with ones in the positions
where an add command had occurred. The
remainder of the register is filled with zeros.
2. Load register O with zeros in the positions
where a subtract command had occurred.
The remainder of the register is filled with
ones.
For the decimal number 54, the registers would appear as below:

  A Register: 0 1 0 0 0 0 0 0
  O Register: 1 1 1 1 0 1 0 1
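The two-register representation can be sketched directly; the bit strings are written MSB first for an assumed 8-bit word, and the recoder is again the illustrative non-adjacent-form recoding:

```python
def recode(n):
    """Signed-digit recoding, LSB first: +1 = add, -1 = subtract, 0 = space."""
    digits = []
    while n:
        d = (2 - (n % 4)) if n & 1 else 0
        n = (n - d) >> 1
        digits.append(d)
    return digits

def to_registers(n, width=8):
    """Build the A and O registers of Appendix A as MSB-first strings.

    A holds 1 where an add command occurs, 0 elsewhere;
    O holds 0 where a subtract command occurs, 1 elsewhere.
    Columns where A and O differ are the non-value space zeros.
    (The word is assumed wide enough to hold every recoded digit.)
    """
    d = recode(n) + [0] * width
    a = ''.join('1' if d[t] == 1 else '0' for t in reversed(range(width)))
    o = ''.join('0' if d[t] == -1 else '1' for t in reversed(range(width)))
    return a, o

# For 54 (= +2^6 - 2^3 - 2^1) this reproduces the registers in the text:
# A = 01000000, O = 11110101.
```

Decoding works the same way in reverse: where the two registers agree the common digit is an add (1) or subtract (0) command, and where they disagree the column is a space zero.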
The columns in which the same digit notation occurs represent the existence of that digit in the
693
A BOUNDED CARRY INSPECTION ADDER FOR FAST PARALLEL ARITHMETIC
BCIA number. The columns containing differing
entries in the two registers represent the spacezeros, O. This can be seen when it is remembered
that a 1 represents an add command and a 0 represents a subtract command. A space-zero then is
a simultaneous command to add and subtract, i.e.
a no-op statement, in effect.
Addition of two numbers in this format is given in the example below for 54 + 57 = 111 (57 recodes as +2^6 - 2^3 + 2^0):

  bit position t:   7 6 5 4 3 2 1 0

  54:  Aa =         0 1 0 0 0 0 0 0
       Oa =         1 1 1 1 0 1 0 1
  57:  Ab =         0 1 0 0 0 0 0 1
       Ob =         1 1 1 1 0 1 1 1

  SUM: As =         1 0 0 0 0 0 0 1
       Os =         1 1 1 0 1 1 0 1

The terms in the sum are +2^7 (t = 7), -2^4 (t = 4), -2^1 (t = 1), and +2^0 (t = 0), giving 128 - 16 - 2 + 1 = 111.
The entries in the sum registers are seen to be determined as follows (primes denote complements):

  (O_a O'_b + O'_a O_b) A_a A_b = K_t
  (A_a A'_b + A'_a A_b) O_a O_b = L_t

A magnetic core logic implementation using "leaster" and "moster" symmetric circuits can be employed to provide the sum.

The example given serves to demonstrate an additional requirement of a system employing BCIA arithmetic: while the addition of two BCIA coded words will always take place without carry (since there can be no adjacencies), the sum may not always be generated in the proper recoded form. This can, on occasion, hold for the results of other arithmetic operations as well. An approach to resolving this is to recode before the next operation, using look-ahead recoding equations of the form of Eqs. (1) and (2). Since the equations contain terms subscripted with t - 1, they are expandable to t - n, thus suggesting a one-step recoding that can be accomplished during the fetch portion of the next instruction.

APPENDIX B

Fast Multiply: Shifting Over Ones and Zeros, Modified

Determination of Length of Shift. A maximum shift length of 4 is analyzed, with:

  H = previous history; 1 = add, 0 = subtract.
  R = 1: perform an operation, then shift; R = 0: no operation.
  yzAB = the 4 least significant bits of the multiplier presently being examined.
  A dash (-) represents a "don't care."

The remarks column of the shift-length table gives the shift conditions:

  Shift 1 when R = 1 and A ≠ B.
  Shift 2 when z ≠ A and a shift of 1 is not called for.
  Shift 3 when z = A and y ≠ z and a shift of 1 is not called for.
  Shift 4 when y = z = A and a shift of 1 is not called for.

From the remarks, the shift equations follow (Sh = shift; primes denote complements):

  Sh1 = R (A'B + AB')
  Sh2 = (A'z + Az') Sh1'
  Sh3 = (y'zA + yz'A') Sh1'
  Sh4 = (yzA + y'z'A') Sh1'
From the section below, it is seen that the term for R is generated in the process of determining the operation (add/subtract) to be performed. In the Karnaugh maps and logic equations that follow, H is a history flip-flop that indicates whether the last operation was addition (1) or subtraction (0), B is the L.S.B. of a pair, A is the next digit, S means add, D means subtract, and R means shift right (with respect to B).
  A B H   S D R
  0 0 0   1 0 0
  0 0 1   0 0 1
  0 1 0   0 0 1
  0 1 1   1 0 0
  1 0 0   0 1 0
  1 0 1   0 0 1
  1 1 0   0 0 1
  1 1 1   0 1 0

  S = A'B'H' + A'BH
  D = A (B'H' + BH)
  R = (S + D)' = S'D' = (B'H + BH') = B ⊕ H

(The add map for S and the subtract map for D are the Karnaugh maps of these equations.)
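The add/subtract/shift decode of Appendix B can be checked for consistency in a few lines: for every combination of A, B, and the history bit H, exactly one of add (S), subtract (D), or shift-only (R) should hold. The Boolean forms below follow the S, D, and R equations; the Python rendering is an illustration, not the core-logic implementation:

```python
from itertools import product

def decode(a, b, h):
    """Add/subtract/shift decode for one multiplier bit pair.

    a: the next digit, b: the L.S.B. of the pair,
    h: the history flip-flop (1 = last operation was an add).
    Returns (S, D, R) - the add, subtract, and shift-right commands.
    """
    s = (a == 0) and (b == h)     # S = A'B'H' + A'BH
    d = (a == 1) and (b == h)     # D = A(B'H' + BH)
    r = (b != h)                  # R = B'H + BH'  (B xor H)
    return s, d, r

# Exactly one of the three commands holds for each of the eight input states.
checks = [decode(a, b, h) for a, b, h in product((0, 1), repeat=3)]
```

The one-hot property is what lets R double as the "no operation, shift only" signal, since R = (S + D)'.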
REFERENCES

1. M. Lehman, "High Speed Digital Multiplication," IRE Transactions on Electronic Computers, vol. EC-6, pp. 204-205 (1957).
2. O. L. McSorley, "High Speed Arithmetic in Binary Computers," Proceedings of the IRE, vol. 49, pp. 67-91 (1961).
3. G. Estrin, B. Gilchrist, and J. H. Pomerene, "A Note on High Speed Digital Multiplication," IRE Transactions on Electronic Computers, vol. EC-5, p. 140 (1956).
4. A. Avizienis, "Signed Digit Representations for Fast Parallel Arithmetic," IRE Transactions on Electronic Computers, vol. EC-10, pp. 389-399 (1961).
5. A. Avizienis, "Binary-Compatible Signed-Digit Arithmetic," AFIPS Proceedings, vol. 26, pp. 663-672 (1964).
6. H. L. Garner, "The Residue Number System," IRE Transactions on Electronic Computers, vol. EC-8, pp. 140-147 (1959).
7. R. D. Merrill, Jr., "Improving Digital Computer Performance Using Residue Number Theory," IEEE Transactions on Electronic Computers, vol. EC-13, pp. 93-101 (1964).
8. N. Metropolis and R. L. Ashenhurst, "Significant Digit Computer Arithmetic," IRE Transactions on Electronic Computers, vol. EC-7, pp. 265-267 (1958).
9. J. E. Robertson, "Introduction to Digital Computer Arithmetic," presented at the University of Michigan Engineering Summer Conference on Introduction to Digital Computer Engineering (June 1964).
10. F. Salter, "A Ternary Memory Element Using a Tunnel Diode," IEEE Transactions on Electronic Computers, vol. EC-13, pp. 155-156 (1964).
11. J. Santos, H. Arango, and M. Pascual, "A Ternary Storage Element Using a Conventional Ferrite Core," IEEE Transactions on Electronic Computers, vol. EC-14, no. 2 (1965).
12. Ivan Flores, The Logic of Computer Arithmetic, Prentice-Hall, Inc., Englewood Cliffs, N. J., 1963.
13. R. S. Ledley and J. B. Wilson, "An Algorithm for Rapid Binary Division," IRE Transactions on Electronic Computers, vol. EC-10, pp. 662-670 (1961).
14. C. V. Freiman, "Statistical Analysis of Certain Binary Division Algorithms," Proceedings of the IRE, vol. 49, pp. 91-103 (1961).
A FAST CONDITIONAL SUM ADDER USING CARRY BYPASS LOGIC
Joseph F. Kruy
Honeywell - EDP Division
Waltham, Massachusetts
INTRODUCTION

The higher speeds obtainable with present-day logic circuits of various integrated-circuit types increase the need for faster adders. The speed of addition can be increased primarily in two ways: (1) by more efficient logic organization, and (2) by using faster logical elements.

Among the different binary full adders, perhaps the best known is the iterated type, or ripple-carry adder.1,2 In this adder, the carry is propagated between adjacent ordered stages through relatively fast carry circuits, and the sum is generated after the carry propagation has been completed. It is recognized that in this type of adder the major portion of the required addition time is due to carry propagation time. There are several ways of speeding up carry propagation. Two of the most frequently used methods are the lookahead and the carry-skip techniques.3,4,5 In each of these techniques the sum generation is subsequent to the speeded-up carry propagation.

In the conditional sum adder, the generation of the sum is simultaneous with that of the carries. The conditional sum formation was first described by Sklansky in 1960,6 and later used by Bedrij in a parallel adder utilizing a parallel pyramidal type of logical organization.7 In the conditional sum adder described here, the carry signal is propagated serially in two logically different paths. One of these paths is for the generation of the two possible carries of each binary stage, and the other is to perform the carry bypass and selection logic. Because of the serial propagation of signals, fast circuitry is required.

The fastest available digital circuits today still use tunnel diodes and are fabricated by hybrid integrated-circuit methods. Speeds about an order of magnitude higher can be obtained with circuits incorporating tunnel diodes than with those using transistors only. The controversy about the future of the tunnel diode in digital computers stems from the fact that it does not lend itself to the most promising type of integrated-circuit technology, i.e., fabrication using monolithic techniques. In hybrid integrated circuits, however, the speed advantage of the tunnel diode can be efficiently exploited, owing to the decreased stray reactances of packaging and construction. Therefore, in this writer's opinion, the future use of tunnel diodes in digital computers will depend to a large extent on the acceptance of some sort of hybrid construction technology.

In the design of the adder, an approach useful for functional units appears to be the most suitable one and is the approach followed here.8 This design philosophy makes use of the fact that optimization for the maximum performance-per-cost ratio is an easier task if the logic circuits have to perform only a predetermined, limited number of logic functions. In this case a better coordination between logic and circuit design is also possible.

The logic organization presented here observes the capabilities and limitations of practical circuitry and topology. The circuit design conforms to the characteristics of the logic functions and takes into account the potential advantages of microminiaturization techniques.
LOGICAL ORGANIZATION OF ADDER

The principle of the conditional sum adder is based on the computation of the conditional sums and carries that result from all possible distributions of carries.6 Since for a binary full adder stage the carry can be either a one or a zero, only two conditional carries, and consequently two conditional sums, need be generated.

The illustration of the principle is shown in Fig. 1. Let us assume that two operands of N bits each are to be added. Both operands, and therefore the
[Figure 1. Principle of the conditional sum-carry bypass adder.]
adder, are divided into k groups of n bits each (n
does not necessarily have to be the same for every
group). In each group the carry and a portion of
the sum circuits are duplicated in order to generate
the two carries and two sums corresponding to a
possible one and zero carry input to the group.
Having two sums available, a decision is then made as to which sum is the correct one, and this in turn becomes the sum output.
The carry-bypass decision logic circuit determines the correct carries using the carry signals of the last stage of the group and the output of the previous bypass circuit. The decision logic is then propagated to higher-order groups. The total worst-case carry propagation time for two operands of N bits is

tp = n1 tc + k tg      (1)

where tp is the total carry propagation time, tc is the propagation time of the carry circuit within a group, n1 is the number of carry stages in the first group, k is
the number of groups, and tg is the intergroup propagation time. This relationship assumes that the output signal of the last stage of a group is available by
the time the decision signal reaches the corresponding
decision stage.
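Equation (1) trades in-group ripple delay against intergroup bypass delay, so the grouping itself is a design variable. The sketch below evaluates tp for a few equal-size groupings; the delay values and the assumption n1 = N/k (equal groups) are illustrative choices, not figures from the paper.

```python
# Worst-case carry propagation time of the grouped adder, per Eq. (1):
#   tp = n1*tc + k*tg
# n1: carry stages in the first group, k: number of groups,
# tc: in-group carry delay, tg: intergroup (bypass) delay.

def carry_time(n1, k, tc, tg):
    return n1 * tc + k * tg

# Compare a few equal-size groupings of an N-bit adder (assumes n1 = N // k).
N, tc, tg = 48, 1.0, 1.5
for k in (4, 6, 8, 12):
    print(k, N // k, carry_time(N // k, k, tc, tg))
```

With these illustrative delays the sweep shows the familiar minimum at an intermediate group count: very few groups ripple too long inside a group, very many groups pay too much intergroup delay.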
The derivation of the decision logic, either by the method of Karnaugh maps or by total induction, is straightforward; thus only the end results are given
here. The bypass decision function (CDi) of the ith group of an n-bit grouping is as follows:

CDi = CD(i-1) CYin + CNin      (2)

where CY and CN are the carry outputs of the group, as shown in Fig. 1.
Unfortunately, the logical expression of Eq. (2) cannot be readily realized with one tunnel-diode unit-delay. Also, the AND circuit would require tighter component tolerances than an OR circuit, thus contradicting the design goals. For these reasons, a different logic function was needed, one whose generation is practical with a tunnel-diode transistor circuit using the tunnel diode in the propagation path. This circuit should have operating margins at least as good as a "one out of three" analog
[Eq. (3), expressing the bypass decision in terms of CN and CY, is illegible in the source.]
This logic statement can be realized by an emitter-coupled transistor-tunnel-diode bypass circuit having the properties described above. More will be said about this circuit in the following section.
The sum is produced by two cascaded logical equivalence functions (A ≡ B = AB + A'B') for both SY and SN. The two sums are given in Eq. (4):

SY = (A ≡ B) ≡ CY
SN = (A ≡ B) ≡ CN      (4)
From this we can see that the first subsum can be combined for SY and SN, and the circuit producing A ≡ B does not have to be duplicated. Therefore, the expression for the sum of group i, stage j, including selection, is given in Eq. (5).
Sij = (Aij ≡ Bij) ≡ [CYi(j-1) · CD(i-1) + …]      (5)
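The scheme the paper describes — each group adds twice, once per assumed carry-in, and the bypass chain then selects the correct precomputed result — can be sketched as follows. This is an illustrative software model with made-up widths and group sizes, not the tunnel-diode circuit of the paper.

```python
# Minimal model of the conditional sum principle: each n-bit group computes
# its sum and carry-out for both carry-in 0 and carry-in 1; the bypass chain
# then selects the correct precomputed pair, group by group.

def conditional_sum_add(a, b, width=16, group=4):
    mask = (1 << group) - 1
    groups = []
    for i in range(0, width, group):
        ga, gb = (a >> i) & mask, (b >> i) & mask
        # conditional results (sum, carry) for carry-in 0 and carry-in 1
        groups.append(((ga + gb) & mask, (ga + gb) >> group,
                       (ga + gb + 1) & mask, (ga + gb + 1) >> group))
    carry, result = 0, 0
    for i, (s0, c0, s1, c1) in enumerate(groups):
        s, carry_next = (s1, c1) if carry else (s0, c0)  # bypass selection
        result |= s << (i * group)
        carry = carry_next
    return result, carry

print(conditional_sum_add(0xBEEF, 0x1234))
```

Note that all group additions can proceed in parallel; only the short selection chain at the end is serial, which is exactly the property Eq. (1) models.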
Suppose a factor m could be determined such that mb = 1; then Q = ma. The algorithm to be described defines an iterative procedure for determining a sequence mi such that

b ∏ mi → 1,   hence   a ∏ mi → Q,

both quadratically. Let b > 0 …
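The iteration just described can be sketched numerically. The particular choices below — scaling b into [0.5, 1) and taking mi = 2 - bi, which makes the error 1 - bi square on every step — are assumptions filled in for illustration; the source's own choice of multipliers is not recoverable here.

```python
# Illustrative sketch of division by iterated multiplication: pick factors
# m_i so that b * m_0 * m_1 * ... -> 1; applying the same factors to a then
# drives a * m_0 * m_1 * ... -> Q = a/b. With b in [0.5, 1) and m = 2 - b,
# the residual error 1 - b squares on every step (quadratic convergence).

def iterative_divide(a, b, steps=5):
    assert 0.5 <= b < 1.0, "b is assumed pre-scaled into [0.5, 1)"
    for _ in range(steps):
        m = 2.0 - b          # next factor
        a, b = a * m, b * m  # b -> 1, so a -> a/b
    return a

print(iterative_divide(0.3, 0.6))
```

Five steps suffice for double-precision accuracy here, since the error sequence runs 0.4, 0.16, 0.0256, … — squaring each time.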
[Figure 1. Data sample page.]
COMPUTER EXPERIMENTS IN MOTOR LEARNING

Table 5. Payoff Data by Means Combinations from PUPIL 25016500
(Columns: BLOCK NO., RCNO, PAYOFF, AVG.; the per-trial rows are summarized here by their section totals.)

Means 1:        324 cases, -4.36 avg.
Means 1* (selected cases in pen center area):  39 cases, -2.80 avg.
Means 2:         43 cases, -1.75 avg.
Means 3:         78 cases, -0.19 avg.
Means 5:         26 cases, -3.19 avg.
Means 6:        227 cases, -5.13 avg.
Means 7:         54 cases, -0.51 avg.
Means 1,4:       17 cases, +0.30 avg.
Means 1,8:       17 cases, -0.52 avg.
Means 2,3:        5 cases, +1.25 avg.
Means 3,4:       49 cases, +0.37 avg.
Means 4,5:       12 cases, +2.51 avg.
Means 4,6:       29 cases, -0.45 avg.
Means 5,8:       17 cases, -4.16 avg.
Means 6,8:       10 cases, -4.83 avg.
Means 7,8:       33 cases, -2.12 avg.
Means 1,2,4:      1 case,  -5.29 avg.
Means 2,3,4:      4 cases, -0.41 avg.
Means 2,4,5:      1 case,   0.00 avg.
Means 2,4,6:      1 case,  -5.29 avg.
A SURVEY OF READ-ONLY MEMORIES
Morton H. Lewin
RCA Laboratories
Radio Corporation of America
Princeton, New Jersey
INTRODUCTION

Consider the problem of the design of a combinational circuit with A inputs and B outputs, where each of the output variables is given as a Boolean function of the input variables. Such a circuit might be part of the control unit of a digital computer, where the A inputs are the operation code of an instruction, and the B outputs are the signals which directly control the opening and closing of gates throughout the machine to effect an execution of that instruction. The circuit might be a code converter, where the A inputs are an input code (for example, the machine code of an alphanumeric character) and the B outputs are an output code (for example, the pattern of signals required for a display of that character). The circuit might be a table look-up device, where, for example, the input variables are a code for the numeric value of a given argument, and the output variables are a code for the value of some function of that argument. Finally, the circuit might be considered as a memory with fixed information stored, where the A input bits are an address and the B output bits are the word stored at that address. It is called a "read-only" or "fixed" memory if the information stored is not alterable at electronic speeds.

For applications such as those given above, typical values of A and B are sufficiently large, and the given Boolean functions sufficiently complicated, that the circuit is normally constructed in two parts, as shown in Fig. 1. Thus, most read-only memories are word-organized or linear-select stores.
I
I
DECODER
A INPUTS
(ADDRESS)
2A DECODER
OUTPUTS
ENCODER
B OUTPUTS
(WORD STORED)
Figure 1. Usual read-only memory structure.
The input address causes only one of the 2A decoder
outputs to be energized, and the function of the encoder is to selectively couple this signal to the B
output lines in accordance with the stored information pattern. The information-bearing portion of
the memory, then, can be viewed as a selective signal-coupling device.
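In modern terms, the decode-then-encode structure of Fig. 1 can be sketched as a short simulation. The function names and the 4-word example contents below are hypothetical illustrations, not anything from the paper:

```python
# Word-organized read-only memory: a one-hot decoder followed by an
# encoder that couples the single energized word line to the bit lines.

def decode(address, a_bits):
    """Return the 2^A one-hot decoder outputs for an A-bit address."""
    return [1 if w == address else 0 for w in range(2 ** a_bits)]

def encode(word_lines, coupling):
    """OR each bit column over the energized word lines.

    coupling[w][b] == 1 means a coupling element (resistor, capacitor,
    diode, ...) is present where word line w crosses bit line b.
    """
    n_bits = len(coupling[0])
    return [
        int(any(wl and coupling[w][b] for w, wl in enumerate(word_lines)))
        for b in range(n_bits)
    ]

# A hypothetical 4-word by 3-bit store (A = 2, B = 3).
STORED = [
    [1, 0, 1],  # word 0
    [0, 1, 1],  # word 1
    [0, 0, 0],  # word 2
    [1, 1, 0],  # word 3
]

def read(address):
    return encode(decode(address, a_bits=2), STORED)
```

Reading address 3, for instance, energizes word line 3 alone, and the encoder reproduces the word stored there.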
Since the design of decoders is well known,1,2 this
paper deals primarily with the structure of the various encoders which have been proposed. One must
bear in mind, however, that the decoding system accounts for an appreciable part of the cost of a typical read-only store. The body of this paper contains qualitative descriptions of a number of memories. Some are early experimental attempts. Others are in developmental stages. Still others are already in use in operating digital systems. For the convenience of the reader, this kind of information, along with additional data, where applicable, is contained in the annotated references.

LINEAR ARRAYS

Resistive and capacitive arrays generally have the matrix form shown in Fig. 2. Rows are word lines and columns are bit lines. A given word line signal is coupled to a particular column wire if a coupling element is present at the appropriate intersection. Since a signal developed on a column may be coupled to unselected rows, columns are usually terminated in low input-impedance sense amplifiers. The array then acts as a severe attenuator and, for a reasonable capacity, the required word line signals are 30 to 100 volts, while sense signals are in the millivolt range.

Figure 2. Resistive or capacitive matrix (Z = R or Z = C).

Resistive Arrays

Resistive matrices were used as early as 1943 for storage of function tables3 and were also used in the Eniac machine.4 Until recently, little attention was paid to such memories. However, with the development of new techniques for the deposition of resistor arrays, interest in these stores has been renewed.5,6 One approach5 involves a memory consisting of a stack of paper or plastic cards (with conventional punched card dimensions), each of which contains, on one surface, an interconnected array of silk-screened resistors as shown in Fig. 3. Holes punched in each card by a conventional keypunch have one of two purposes. Some insert information by breaking appropriate printed connections (holes A and B in Fig. 3). Others (holes C through F), surrounded by conductive "gaskets", are used to interconnect all cards in a deck. When the cards are stacked, the open channels formed by these holes are filled with a low-temperature molten alloy which later solidifies. This results in the interconnection of all gaskets in any given position. Thus, the common conductors on the cards are the word lines and the alloy columns through the stack are the bit lines. While information is readily inserted into each card, the information stored in a finished stack is not easily changed.

Figure 3. Part of a resistor punched card.

Resistive arrays have an important advantage in that they are direct-coupled systems. On the other hand, for sufficiently large storage capacities and wide resistor tolerances, an appreciable amount of power can be dissipated in a memory stack at full-speed operation.

Capacitive Arrays

Capacitor read-only memories which have been discussed may be divided into two classes: those in which stored information can only be changed by a partial disassembly of the memory stack (involving the breaking and making of a relatively large number of contacts) and those which are designed to allow information change via insertable, low-cost
777
A SURVEY OF READ-ONLY MEMORIES
cards or strips (involving very little, if any, breaking and making of contacts) .
In the first case,7,8 arrays of parallel-plate capacitors may be constructed by appropriate conductive
patterns on either side of a thin insulating sheet.
Holes are punched, as explained above, to remove
certain capacitors from the network, and sheets are
interconnected in a stack either by using conventional connectors 8 or by allowing metallized eyelets
on the sheets (similar to the gaskets discussed
above) to be connected to each other under pressure applied to the stack. 7 The capacitor array can
also be constructed using vacuum evaporation techniques to permit thinner insulating layers and, consequently, larger capacities per unit area. The evaporation masks may be designed to already include
the required information pattern.
In the second case,9-12 the capacitor pattern is
modified by the presence of a removable card or
strip. One approach9-11 involves the use of two
plates facing each other, one containing word lines,
the other bit lines as shown in simplified form in
Fig. 4. With the plates in close proximity to each
other, coupling capacities exist at all intersections.
If a thin card, containing an insulated ground foil,
is inserted between the plates, all coupling capacities are reduced appreciably by the presence of the
shield. If the shield card has a pattern of holes in it,
those word and bit lines associated with intersections at which a hole is present will be capacitively
coupled. The shield card may be of conventional
punched card dimensions, punched by an ordinary
keypunch,9 or it may contain a very thin metal layer
with holes formed by a spark discharge.11 In another
approach,12 the prospective capacitor plates do not
face each other. They are etched in the same surface, as shown in Fig. 5. With this structure, very
little coupling capacity exists. If a card or strip,
containing a pattern of insulated metallized areas,
is placed directly over the capacitor plate pattern,
coupling capacity will exist only in those positions
at which a metal area is present. The coupling
capacitor is then composed of the series combination of two equal capacities. No connections need
be made to the information-bearing strip. Throughconnections (indicated in Fig. 5) are required if
word lines and bit lines are placed on opposite
sides of the same matrix sheet.
Figure 5. Part of a capacitive array using coded coupling strips (layers separated).
Balanced capacitor arrays have also received attention recently13 since they permit reflectionless
signal propagation in the encoder, when it is operated at very high speed.
In all of the capacitive systems thus far proposed (with the exception of the thin-film, vacuum-evaporated array), practical limits on the effective distance between capacitor plates (particularly in the card-changeable cases) and on the plate areas (if a good packing density is to be achieved) restrict coupling capacity values to below 5 picofarads. The memory stacks must be carefully designed to insure that stray coupling capacity is small compared to this.
Figure 4. Shielded capacitor memory structure (layers separated).

Inductive Arrays
In all linear inductive read-only memories, a
drive current pulse is passed through one of the
word lines, and this signal is inductively coupled to
the sense lines as a function of the stored information pattern. At the outset, these memories have the
important advantage of decoder simplicity, since a
word line is normally selected by the closing of two
"switches," one on either end. Thus, two decoders, each with 2^(A/2) outputs, are required in place of one with 2^A outputs. This represents a significant reduction in the cost of the selection circuits.
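The arithmetic behind this saving is easy to check; the A = 12 example below is arbitrary, not a figure from the paper:

```python
def decoder_outputs(a_bits):
    """Outputs needed: one full decoder vs. two end-of-line decoders.

    Assumes a_bits is even, so the address splits evenly in half.
    """
    single = 2 ** a_bits            # one decoder driving every word line
    split = 2 * 2 ** (a_bits // 2)  # a "switch" at each end of the line
    return single, split
```

With A = 12, a single decoder needs 4096 outputs, while the two-ended arrangement needs only 2 x 64 = 128 in total.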
Inductive fixed stores may be divided into three
classes-those involving only air coupling, those
with open magnetic flux paths, and those with
closed magnetic flux paths.
Air Coupling. Air-coupled inductive arrays which have been discussed are either relatively fixed (in the sense that stored information is changed only by the use of many-contact connectors) or "card-changeable" (with no contacts required to the removable card).
An example of the first type14 is illustrated in
Fig. 6. Word lines and sense loops are formed on
either side of a thin insulator sheet. At each intersection, the word current originally has two alternate paths. Information is inserted by the breaking
of one of these (using an appropriate cutting device, for example). The polarity of the induced signal in the sense loop depends on which current path
is taken.
Figure 6. Alternate current path inductive store.
Card-changeable systems have been proposed using cards which act as shields, preventing inductive
coupling, and using cards in which induced eddy
currents enhance inductive coupling. The first type15 is the inductive counterpart of the shielded-capacitor card-changeable approach, discussed earlier.
Word loops and sense loops are formed on two separate sheets, facing each other, as shown in Fig. 7.
An insulated metal shield card containing a pattern
of holes is inserted between them. Whenever a hole
exists at an intersection, the mutual inductance between word and bit lines is relatively high. Without
a hole it is low.
Figure 7. Shielded inductive memory principle.
Eddy-current memories of two kinds have been
discussed. In one case,16 insulated drive loops and
sense loops are constructed orthogonal to each other
as shown in Fig. 8. Inductive coupling is virtually
zero until a small, insulated coupling loop is added
at each intersection. When a current is passed
through the word loop, an eddy current induced
around the coupling loop will induce a signal in the
sense loop. Insertable cards are constructed to include an array of small coupling loops for all intersections. To destroy coupling at a given intersection, the coupling loop is broken with a punched
hole, preventing eddy current flow. A modification
of this approach17 uses an insertable card on which
each coupling "loop" has one of two geometries (depending on whether the stored bit is to be a one or a zero). For a given word current pulse, a sensed
output pulse may be positive or negative, depending
on the geometry of the eddy-current coupling "loop"
at the corresponding intersection.
Figure 8. Eddy-current memory principle.

Another eddy-current memory18 uses nonorthogonal word and sense lines with solid metallic rectangles (in which eddy currents are induced) on the
insertable cards. Holes punched in these cards partially or completely remove these rectangles. With
no punched hole, the inductive coupling is high.
Where a hole is punched, it is low.
Open Magnetic Paths. Two linear read-only systems
using open magnetic coupling paths will be discussed here. The first,19 used in the Atlas I computer, consists of a woven mesh of insulated word and
bit wires as shown in Fig. 9a. At each bit intersection, a small ferrite rod (to increase coupling) is
inserted if the stored bit is to be a one. Around
each bit intersection, a number of identical ferrite
rods ("keepers") are placed, unconditionally, to
localize the field and prevent it from returning
through other "information rods." To allow for
easier change of stored information, the information rods are enclosed in nylon tubes, each twice
the length of a ferrite rod, in a balanced winding
system shown in Fig. 9b. Each rod has two possible
positions in its tube, thus enhancing inductive coupling either in the upper layer or the lower one.
The second open magnetic linear system20,21 also uses ferrite rods (relatively long ones) as shown in Fig. 10a. Word lines, on cards placed in a stack over the rods, may pass any given rod on one of two sides, as a function of the required stored information. When a word line is pulsed, the polarity of the signal induced in the rod sense winding depends on whether the word line passed above or below the rod. The word line path may be determined by a set of punched holes as shown in Fig. 10b.
Figure 9. Ferrite rod inductive store.
Figure 10. Ferrite rod, stacked card store.
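The bipolar readout of the stacked-card rod store can be sketched as follows, under the (assumed) convention that a word line passing above a rod induces a positive sense signal and one passing below induces a negative one:

```python
def sense(card_routing):
    """Induced sense-signal polarity per rod: +1 above, -1 below."""
    return [+1 if side == "above" else -1 for side in card_routing]

def to_bits(signals):
    """Interpret a positive polarity as a stored one, negative as zero."""
    return [1 if s > 0 else 0 for s in signals]
```

Each bit is thus carried by the sign of a signal rather than by its presence or absence, which is what distinguishes this store from the hole/no-hole coupling schemes above.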
Closed Magnetic Paths. Transformer read-only stores of various types22-27 have been described in the
literature. Most operate in the following manner:
Consider a set of B large magnetic cores (B is
the number of bits per word), each with a sense
winding as indicated in Fig. 11. A given word line
threads only those cores in positions corresponding
to ones in its word. When a given word line is
pulsed, only those sense windings corresponding to
stored ones develop output signals. A balanced system28 using two-aperture cores, with a sense winding on the center leg between the apertures, permits positive or negative sense signals to be induced. The word line threads one aperture (stored one) or the other (stored zero).
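The threading rule lends itself to a direct sketch: a word line threads exactly the cores whose bit positions hold ones, so pulsing it reproduces its word on the sense windings. The stored contents below are hypothetical:

```python
def read_word(store, w):
    """Pulse word line w; signals appear only on the cores it threads."""
    n_bits = len(store[0])
    threaded = {b for b, bit in enumerate(store[w]) if bit}
    return [int(b in threaded) for b in range(n_bits)]

# Two hypothetical 4-bit words.
STORE = [
    [0, 1, 1, 0],  # word 0 threads cores 1 and 2
    [1, 0, 0, 1],  # word 1 threads cores 0 and 3
]
```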
Figure 11. Transformer read-only memory structure.
The closed magnetic path system allows for the
development of large output signals (typically 0.5
to 1.0 volt), often eliminating the need for sense
amplifiers. It has the disadvantage of not being easily changeable. Usually, one changes a word b~'
disconnecting the old word line (leaving it physically in place) and threading and connecting a new
word line. Information may be more easily inserted
through the use of thin, word line punched cards or
strips, similar to those discussed above for the ferrite rod system. However, the number of word lines
which will fit into a given size core aperture is reduced considerably.
Other. Many variations on the systems discussed above are possible. In particular, shielded L-C arrays29,30 can be used to improve sense signal 1/0 ratios over those achievable with simple shielded L or shielded C approaches.
NON-LINEAR ARRAYS
Having discussed linear R, L, and C read-only
stores, we can proceed to nonlinear systems of the
same types. These have the important advantage of eliminating many of the sneak signal paths present in linear arrays, improving sense signal 1/0 ratios and word drive requirements. Nonlinear resistive memories which have been proposed are primarily of the diode matrix type. Nonlinear inductive approaches all involve the switching of square-loop magnetic material.
Diode Arrays
Semiconductor diode matrices were discussed in the literature as early as 1949.31 While diode decoding networks are well known,1,2 diode matrices as encoding networks were not considered economical, compared to other realizations, until recently. Again, with the advent of batch fabrication techniques, a few diode read-only stores have received attention.
One approach32 uses the vacuum deposition of organic semiconductor material to form diode arrays on thin, punchable printed circuit cards. Punched holes break appropriate connections, and an encoding matrix as shown in Fig. 12 results. Each bit column is the output terminal of a k-input diode OR gate, where k is the number of words which have a one in that position. Since diodes may also be used in the decoder, it is possible to construct the entire memory (encoder and decoder) using the same fabrication technology, thereby drastically reducing the number of required interconnections. With this approach, however, the information stored is fixed and cannot easily be changed unless a system of connectors is used.
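Because each bit column is a k-input OR gate, the encoder's diode count follows directly from the stored pattern; the three-word contents below are a hypothetical illustration:

```python
# One coupling diode sits at each (word, bit) intersection holding a one,
# so column b of the matrix is a k-input OR gate, where k is the number
# of words with a one in position b.
def diode_counts(words):
    n_bits = len(words[0])
    per_column = [sum(w[b] for w in words) for b in range(n_bits)]
    return per_column, sum(per_column)

cols, total = diode_counts([[1, 0, 1], [1, 1, 0], [0, 1, 1]])
# Each column has k = 2 inputs here, for 6 encoder diodes in all.
```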
Diode matrices using conventional semiconductor
fabrication techniques have received attention
recently33,34 and can be expected to receive more
Figure 12. Diode encoding matrix.
attention in the future. Again, although stored information change is not easy, encoders and decoders can be constructed on the same substrate. As techniques for the encapsulation of these arrays are improved, semiconductor diode matrices (because of their ideal nonlinear properties) will compete much more favorably in the read-only memory field, particularly for applications where stored information need not be changed.
Magnetic Switching
Square-loop magnetic core arrays, permanent
magnet-twistor35 systems and magnetic film arrangements are treated here.
Magnetic Cores. The read-only store of the Edsac II36 utilizes a conventional core selection array, with one core per word and x, y and bias windings. Using coincident-current techniques, only the selected core is switched during a memory cycle. There are B sense windings (B is the number of bits per word), each threading only those cores corresponding to words requiring stored ones in that position. Bypassing a core is equivalent to a stored zero. When the selected core switches, only those bit lines threading it develop induced signals. Multiple turns are used to reduce drive current requirements and increase sense signal magnitudes (to 9 volts).
A serial read-only memory technique, utilizing multiaperture ferrite disks, was recently described.37 One use of the system is illustrated in Fig. 13. A ramp selection current, applied through the center aperture, causes flux switching around the center hole to proceed radially outward as a linear function of time. At a given distance from the center, when switching occurs, if a sense line threads the associated aperture, a signal is induced in it. Thus, information is stored in the threading of the small radial apertures. A number of these can be distributed around the center hole as indicated in Fig. 13. Also, a number of sense wires can thread along the same radius.
Figure 13. Ferrite multiaperture disk serial storage.

Permanent Magnet-Twistor Arrays. An important class of read-only memories may be placed under this general heading. The operation of the first versions of such memories38,39 is explained with reference to Fig. 14. Initially, all twistor segments along the lengths of the sense wires are in a reference state. When a given word is selected (by a
core selection switch), current induced in the word
strip or word "solenoid" is in such a direction as to
cause all twistor segments immediately inside it to
switch. If, however, a small bar permanent magnet
(appropriately poled) is in the immediate vicinity
of a twistor segment, its switching is prevented. If a
twistor segment does switch, a signal is induced in
its sense wire. The small permanent magnets are
contained on a thin, insertable card so that information can easily be changed.
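The readout rule, in which a twistor segment switches and induces a sense signal exactly where no permanent magnet holds it, can be sketched as follows; the magnet pattern is a hypothetical example:

```python
# Permanent magnet-twistor readout: the selected word solenoid tries to
# switch every twistor segment inside it; a magnetized spot on the card
# holds its segment in the reference state, so no sense signal appears.
def read_word(magnet_row):
    """magnet_row: True where the card holds a magnetized spot."""
    return [0 if magnet else 1 for magnet in magnet_row]
```

Changing the word is then just a matter of re-magnetizing spots on the card, which is why this class of store became so popular.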
Figure 14. Permanent magnet-twistor store (original structure).
Many modifications and improvements in this arrangement have since been discussed.40-48 The twistor wires and their return wires need not be inside the word solenoid, but may be outside it, thereby reducing the distance between the permanent magnet and the twistor segment.40 In this system,
eddy currents, induced in the conducting card
which holds the permanent magnets, enhance the
field produced by the word solenoid to cause twistor switching. Another variation involves making
the return wire another twistor wire, either reverse-wrapped to increase the sense signal41 or in a two-magnet-per-bit arrangement to achieve bipolar output signals. The insertable cards normally contain
magnets in every bit position, some magnetized, some not, depending on the desired information pattern. The pattern on a card may then be easily changed by passing the card under an appropriate set of magnetizing and demagnetizing heads. Another modification44,45 changes the information insertion method by eliminating the permanent magnets
from the changeable card. Instead, holes are
punched in the card, in positions where it is desired to prevent local eddy currents from enhancing
the switching field. Still another recent approach46
reverses the roles of the twistor wire and word solenoid, making the twistor wire the word line and the
solenoid the sense line. Finally, a recent paper47 described a system using electroplated continuous
magnetic film wire rather than twistor wire.
Many of the detailed variations between all of these approaches are beyond the scope of this paper.
The reader is referred to the references cited for
detailed discussions. It is clear that the activity generated around the original permanent magnet-twistor technique has been great. What is probably the
largest capacity, all-electronic, read-only operating
store has been constructed using this approach.48
Magnetic Flat Film Arrays. Systems using permanent magnets and flat magnetic films have also been described.49,50 In one arrangement, each circular film
(electroplated and uniaxially anisotropic), because
of its geometry, is normally demagnetized. When a
word current is passed through a drive line over the
film, it switches, by rotation, into the hard direction, inducing a signal in a sense line. When the
drive current disappears, the film is again demagnetized. If, however, a permanent magnet is in its vicinity, the film remains saturated in the hard direction and switching is prevented. Continuous
magnetic films may also be used.
Another flat magnetic film approach51 replaces
the biasing permanent magnet with another (high
coercivity) film, which is part of a companion
memory array in which information can be written
electrically, although relatively slowly. The state of
the bias film either prevents or allows switching of
the readout film. This type of memory may be considered as not in the read-only class, because information can be written electrically. It is mentioned
here because the access time to read the store is much shorter than the write time required to change the information stored.
OPTICAL SYSTEMS
The basic property of an optical read-only memory is that information is stored as a pattern of opaque or transparent areas on a normally flat surface, such as a card, plate or disk. Storage of information in photographic form52 permits bit packing densities approached only by the finest magnetic surface recording systems now in use.
Early optical fixed stores53,54 utilized mechanically selected punched cards, read by sensing holes with a light beam and a photoelectric cell. Systems involving the semiconductor phenomena of electroluminescence55 and photoconductivity have also been proposed. A matrix, having a series combination of a phototransistor and a diode at each intersection, has also been discussed.56 The phototransistors act as
switches, actuated by a light pattern. The circuit
which results is a diode encoding matrix, where
connections (stored information) can be changed
simply by changing the light pattern. Recent interest
in optical read-only stores, however, has been primarily directed at photographic storage systems,
because of the high storage densities which may be
achieved.
Flying-Spot Stores
An additional major advantage of some optical
systems is the simplification in the selection device
(decoder) afforded by using a cathode ray tube. A
given word can be selected simply by moving the
output light spot to a given position on the tube
face. The spot can be moved very quickly in going
from cycle to cycle-hence the term "flying-spot"
store.
A number of such approaches have been described in the literature.57,58,59 One system58 which has
received much attention is outlined in Fig. 15. The
CRT face is imaged, by a lens array, as a set of B
such areas on a photographic plate, where B is the
number of bits per word. Light emitted from a spot
on the CRT screen is thus focused into B corresponding spots on the photographic plate. As the
CRT spot moves over its allowable area (defined as
a square on the CRT face), the B spots on the photographic surface move correspondingly over their
allowable areas. Behind each of these areas is a
condensing lens to collect transmitted light through
that area and focus it onto a photomultiplier detector tube. Thus, the system consists of one CRT, B
imaging lenses, a photographic information plate, B
condensing lenses and B photomultiplier tubes. A
word is selected by appropriately positioning the
CRT spot. In parallel, B spots on the photographic
plate are selected. This plate contains information
in the form of opaque and transparent dots, so that,
within each of the B areas, the light may or may
not pass on to the photomultiplier. Thus, a word is
read out in parallel. Each of the B areas on the
photographic sheet contains W bits, where W is the
number of words. It contains the same bit position
from each word. The information stored can be
changed by changing the photographic plate, being
very careful, of course, about plate registration.
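The plate layout follows a simple indexing rule: area b holds bit b of every word, so bit b of word w is the dot at position w within area b. A sketch of the parallel readout, with hypothetical plate contents:

```python
# Flying-spot store: the plate holds B areas of W dots each; area b keeps
# bit b of every word. Positioning the CRT spot at word w illuminates
# dot w in all B areas at once, reading word w in parallel.
def read_word(plate, w):
    """plate: list of B areas, each a list of W transparency bits."""
    return [area[w] for area in plate]

# W = 4 words of B = 3 bits, stored column-wise.
PLATE = [
    [1, 0, 0, 1],  # bit 0 of words 0..3
    [0, 1, 0, 1],  # bit 1 of words 0..3
    [1, 1, 0, 0],  # bit 2 of words 0..3
]
```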
Figure 15. Flying spot store.

An optical system has recently been proposed60 which replaces the imaging and condensing lens arrays with fiber-optic systems. The light spot is thus distributed to the areas on the photographic information sheet and then collected from the various spots on a given area and transmitted to the photodetector by "light pipes." One may also replace the CRT light source by an x-y selected array of light-emitting diodes.

Photographic Disk Store

Another important development in the field of photographic read-only storage has been the development of a glass disk store,61 outlined in Fig. 16, for use as a natural language translation dictionary. The disk contains a number of tracks of photographic information in an outer annular ring. A spot from the CRT face is focused as a finer spot on the disk surface by a lens, as shown, with the photomultiplier acting as a detector. Access to the system is similar to that for a conventional magnetic disk store in the sense that, first, a track is selected, and second, information is read serially from it.

Figure 16. Spinning-disk photographic store.

COMPARISON CRITERIA
Read-only memories are generally compared by
using the conventional memory figures of merit
(e.g., storage capacity, cost per bit, speed) and by
evaluating the ease with which stored information
can be changed.
Storage Capacity
The largest capacity read-only stores constructed
to date have been the photographic systems. One
can expect that, in the future, these systems will
continue to be used for fixed, bulk storage. Of the
nonoptical systems, the largest capacity store presently in operation is the permanent magnet-twistor
system.
Cost Per Bit
While data on costs of various system approaches are difficult to obtain, one can make some qualitative observations. In considering the cost of a fixed store, the effect of the selection system and the sensing system (i.e., the system exclusive of the information encoder) is always of major importance.
store, the effect of the selection system and the sensing system (i.e., the system exclusive of the information encoder) is always of major importance.
For example, random-access optical systems, while
offering very high storage densities, are normally
burdened by complicated and expensive selection
784
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
and sensing optical networks. Linear R and C arrays, while simple in construction, normally require
large drive signals from a giant decoder and deliver
very small sense signals, each requiring a number of
amplification stages. This is especially true as the
impedance tolerances widen. Magnetic systems, on
the other hand, normally require much less decoding complexity. In particular, the transformer store, because it involves closed magnetic paths, delivers, in addition, large sense signals needing little, if any, amplification. At present it appears to be one of the best systems from the cost point of view. It
is conceivable that, with the advent of low-cost integrated sense amplifiers, other linear systems will
compete more favorably on a cost basis.
Speed
It is difficult, at present, to single out any one of
the approaches discussed as being the fastest. To
date, the highest speed systems have been the linear
arrays (excluding the transformer store) and some
of the thin magnetic film arrays.
The Question of Changeability
As the discussions in this paper point out, read-only memory systems vary from those in which information stored is more or less absolutely fixed
(or at least changed with great difficulty) to those
in which information can be changed at slow electronic speeds. An important class of these stores
consists of the card-changeable memories. Associated with the question of changeability, but not necessarily included in it, is the question of ease of information insertion. A number of encoder arrays
can be set up easily to begin with (using some
punching device, for example) but cannot be easily
changed subsequently. It has generally been true
that, as one adds more flexibility to the system,
such as by making it card-changeable, the cost of
the system goes up-particularly if high bit packing
densities (which require accurate card registration)
are involved. In some cases, the cost of the "card
holder" becomes comparable to that of a conventional magnetic core memory plane.
The requirements for information changeability
vary widely with the application. Fixed, well-defined tables, code conversions, and debugged subroutines, for example, may be stored in nonchangeable memories. However, it is probably safe to say that, for most present applications, the stored
information pattern is not completely permanent,
but will have to be changed after memory construction. If the amount of change is small (for example,
as certain program errors are found), a more permanent system which accepts a small number of
changes may still be usable. Clearly, however, for
many applications the degree of updating required
will be such that only card-changeable types will be
satisfactory.
ACKNOWLEDGMENTS
The author wishes to acknowledge the assistance
received from previous read-only memory surveys17,62
in the writing of this paper. In any paper such as
this, invariably some pertinent accomplishments are
inadvertently omitted and others, which are included,
are not given the emphasis they deserve. The author
apologizes, in advance, for these shortcomings.
REFERENCES
NOTE: Notations are added to the references only
where they are deemed appropriate. The following
abbreviations are used:
IEEETEC: IEEE Transactions on Electronic Computers
Proc. ICMT: Proceedings of International Colloquium on Memory Techniques, Paris, 1965 (to be published by Editions Chiron, Paris)
LCMTCS: Large Capacity Memory Techniques for Computing Systems, M. C. Yovits, ed., Proceedings of May 1961 Symposium, Macmillan, 1962
1. Y. Chu, Digital Computer Design Fundamentals, McGraw-Hill, 1962, chap. 9.
2. R. S. Ledley, Digital Computer and Control
Engineering, McGraw-Hill, 1960, chap. 17.
3. J. A. Rajchman, U.S. Patent 2,428,811, "Electronic Computing Device," filed Oct. 1943.
4. A. W. Burks, "Electronic Computing Circuits
of the ENIAC," Proc. IRE, vol. 35, p. 756 (1947).
5. M. H. Lewin et al, "Fixed Resistor-Card Memory," IEEETEC, June 1965, pp. 428-434. [Experimental stack of 120 cards, 60 bits each.]
6. C. David, "Fixed Memory with Resistive Coupling," Proc. ICMT.
A SURVEY OF READ-ONLY MEMORIES
7. L. I. Gutenmakher, Electronic Information-Logic Machine, English translation, Interscience, 1963, pp. 41-47. [Developmental. Up to 1024 capacitor sheets in a stack, 3 pf coupling capacity, 40 v drive pulse, 60 mv sense signal.]
8. D. H. Macpherson and R. K. York, "Semipermanent Storage by Capacitive Coupling," IRETEC, Sept. 1961, pp. 446-451. [Experimental. 576-bit sheets, 5 pf coupling capacity, 20 v 0.5 μsec drive pulse, 10 mv sense signal, 3 μsec cycle time. Planned: 1K words, 34 bits per word.]
9. H. R. Foglia et al, "Card Capacitor: A Semi-Permanent, Read Only Memory," IBM Journal, Jan. 1961, p. 67. [Experimental. 100 v drive pulse, 1 mv sense signal estimated for 4.8 × 10⁴ bit store.]
10. J. W. Haskell, "Printed Cards for the Card
Capacitor Memory," IBM Journal, Oct. 1962, p.
462. [Experimental. 3000 bit array operated at 1
mc.]
11. S. Takahashi and S. Watanabe, "Capacitance Type Fixed Memory," LCMTCS, pp. 53-62. [Developmental. 4K words, 50 bits each, 0.2 μsec cycle time.]
12. J. Van Goethem, "The Capacitance Semi-Permanent Information Store and Its Uses in Telephone Exchanges," Proc. IEE (Brit.), vol. 107B, supplement 20, p. 346 (1960). [4 pf coupling capacity, 2.5 v drive, 0.16 mv sense for 10K words.]
---, "Operating Principle, Manufacture and Application of a Semi-permanent Memory," Proc. ICMT. [5 × 10⁵ bits, 4 μsec cycle time.]
13. D. M. Taub, "Analysis of Sneak Paths and Sense-Line Distortion in an Improved Capacitor
Read-Only Memory," Proc. IEEE, vol. 51, no. 11,
p. 1554 (Nov. 1963).
14. Available under the name "Permacard" from
Fabri-tek, Inc., Amery, Wis., in 1920-bit sheets.
15. I. Endo and J. Yamato, "The Metal Card Memory: A New Semi-Permanent Store," LCMTCS, pp. 213-230. [Experimental 2 × 10⁴ bit store. 3 nh coupling mutual inductance. Developing an improved 14,700-word, 32-bit-per-word memory.]
J. Yamato and Y. Suzuki, "Forming Semi-Permanent Memories with Metal Card Storage," Electronics, Nov. 17, 1961.
16. T. Ishidate, S. Yoshizawa, and K. Nagamori, "Eddycard Memory: A Semi-Permanent Storage," Proc. EJCC, Dec. 1961. [Developmental. 4.6 × 10⁴ bit capacity, 200 ma, 0.05 μsec drive pulse, 1 mv sense signal, 100 nsec cycle time.]
17. D. M. Taub, "A Short Review of Read-Only Memories," Proc. IEE (Brit.), vol. 110, no. 1, pp. 157-166 (Jan. 1963). [Includes a discussion of an experimental "noughts and crosses" eddy-current system using 100 ma drive pulses, 12 mv sense signals, with a 200 nsec cycle time.]
18. A. M. Renard and W. I. Newman, "Unifluxor: A Permanent Memory Element," Proc. WJCC, 1960, p. 91. [Experimental 3000 bit store. 500 ma, 150 nsec drive pulse, 10 mv sense signal, 420 nsec cycle time.]
19. T. Kilburn and R. L. Grimsdale, "A Digital Computer Store with Very Short Read Time," Proc. IEE, vol. 107B, pp. 567-607 (1960). [Operating store with 4096 words, 52 bits each, involving 50 ma drive pulses, 3 mv sense signals, and 200 nsec cycle time.]
20. D. B. Brick and G. G. Pick, "Microsecond Word Recognition System," IEEETEC, Feb. 1964, pp. 27-35. [Developmental. 1.3 × 10⁵ bit system with 200 ma drive pulses, approximately 20 mv sense signals and 5.4 μsec cycle time.]
21. I. R. Butcher, "A Pre-Wired Storage Unit," IEEETEC, vol. EC-13, no. 2, pp. 106-111 (Apr. 1964). [Developmental 4K word, 40-bit-per-word store with 75 ma drive pulses and 1.5 μsec cycle time.]
22. T. L. Dimond, "No.5 Crossbar AMA Translator," Bell Lab. Record, vol. 29, p. 62 (1951).
23. E. G. Andrews, "The Bell Computer, Model 6," Proc. 2nd Symposium on Large Scale Digital Calculating Machinery, Harvard U., 1951, p. 20.
24. J. Goldberg and M. W. Green, "Large Files
for Information Retrieval Based on Simultaneous
Interrogation of All Items," LCMTCS, pp. 63-77.
25. E. L. Younker et al, "Design of an Experimental Multiple Instantaneous Response File," Proc. 1964 Spring Joint Computer Conference, pp. 515-528. [Experimental 1K word, 300-bit-per-word information retrieval store.]
26. P. Kuttner, "The Rope Memory: A Permanent Storage Device," Proc. 1963 Fall Joint Computer Conference, pp. 45-57. [Developmental 256 word, 64-bit-per-word store with 4 μsec cycle time.]
27. D. M. Taub and B. W. Kingston, "The Design
of Transformer (Dimond Ring) Read-Only Stores,"
IBM Journal, Sept. 1964, pp. 443-459. [Excellent
discussion of system operation and design calculations. Stores of the type described are used in some
current product line machines.]
28. A. D. Beard, "RCA Spectra 70, Basic Design and Philosophy of Operation," 1965 WESCON, San Francisco, Paper 12.1. [1024 word, 53-bit-per-word modules with 960 nsec cycle time. Up to 4 modules are operated in an interleaved fashion, to give an effective 480 nsec cycle time.]

PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965
29. J. Yamato and T. Shimizu, "Large Capacity Metal Card Memory," Proc. ICMT. [20K words, 50 bits each, 5 μsec cycle time, 3 mv sense signal.]
30. L. A. P. M. De Bot, "Linear Read-Only Memories," Proc. ICMT.
31. D. R. Brown and N. Rochester, "Rectifier
Networks for Multiposition Switching," Proc. IRE,
1949, pp. 139-147.
32. M. H. Lewin et al, "Fixed, Associative Memory Using Evaporated Organic Diode Arrays," Proc. 1963 Fall Joint Computer Conference, pp. 101-106. [Experimental. 100 bits operated at 10 μsec cycle time.]
33. "Thin Silicon Wafers Used in Fixed Digital
Memory," Electronic Design, Jan. 18, 1961.
34. "Diode Matrix in Integrated Circuit Form,"
Computer Design, June 1965, p. 12.
35. A. H. Bobeck, "A New Storage Element Suitable for Large-Sized Memory Arrays: The Twistor," Bell System Technical Journal, vol. 36, pp. 1319-1340 (1957).
36. M. V. Wilkes et al, "The Design of the Control Unit of an Electronic Digital Computer," Proc. IEE, vol. 105B, p. 121 (1958).
37. B. E. Briley, "MYRA: A New Memory Element and System," Proc. 1964 INTERMAG Conference, Paper no. 14-8. [Experimental. 64 words, 5 bits each, 2 v sense signals.]
38. D. H. Looney, "A Twistor Matrix Memory
for Semi-Permanent Information," Proceedings Western Joint Computer Conference, 1959, p. 36.
39. J. H. DeBuske, J. Janik and B. H. Simons, "A Card Changeable Non-Destructive Readout Twistor Store," Proceedings of the Western Joint Computer Conference, 1959. Companion article to ref. 38. [Developmental. 1.3 × 10⁴ bit store, 1.8 amp 2 μsec drive pulse, 8 mv sense signal, 5 μsec cycle time.]
40. W. A. Barrett et al, "A Card Changeable Permanent-Magnet-Twistor Memory of Large Capacity," IRE Transactions on Electronic Computers, September 1961. [Developmental. 3.6 × 10⁵ bit store with 5 μsec cycle.]
41. E. J. Alexander et al, "A Permanent Magnet Twistor Memory Element of Improved Characteristics," J. Appl. Physics (supplement), vol. 33, no. 3, p. 1075 (Mar. 1962).
42. U. F. Gianola et al, "Large Capacity Card Changeable Permanent Magnet Twistor Memory," LCMTCS, p. 177.
43. L. W. Stammerjohn, "An Evaluation of Design and Performance of the Permanent Magnet
Twistor Memory," Proc. 1964 INTERMAG Conference, Paper no. 8-4.
44. K. E. Krylow et al, "Semipermanent Memory: Latest Use for Twistors," Electronics, vol. 36, no. 11, p. 80 (Mar. 15, 1963). [7000 bits, 550 ma, 4 μsec drive pulse, 15 mv sense signal, 10 μsec cycle time.]
45. J. P. Shuba, "An Eddy Card Twistor Memory for Semi-Permanent Storage," Proc. 1965 INTERMAG Conference, Paper no. 14-2. [2K words, 66 bits per word, 1.2 A drive pulse, 4 mv sense signal, 3 μsec cycle time.]
46. F. J. Procyk and L. H. Young, "A High-Speed Card-Changeable Permanent Magnet Memory: The Inverted Twistor," Proc. 1964 INTERMAG Conference, Paper no. 8-6. [Experimental 512 word, 24-bit-per-word store. 400 ma drive pulse, 10 mv sense signal, 250 nsec cycle time.]
47. U. F. Gianola, "Analysis of Operating Modes
for a Read-Only Memory Using Cylindrical Film
Sensing of Permanent Magnet Arrays," Proc. ICMT.
[Less than 100 nsec cycle time.]
48. C. F. Ault et al, "No. 1 ESS Program Store," Bell System Tech. J., vol. 43, no. 5, part 1, p. 2097 (Sept. 1964). [Operating 65K word, 88-bit store. 2 A drive pulses, 2.5 mv sense signal, 5.5 μsec cycle time.]
49. R. E. Matick, "Thick Film Read-Only Memory Device," J. of Appl. Physics, vol. 34, no. 4, p.
1174 (Apr. 1963).
50. C. Sie et al, "A High-Speed Thick Film Read-Only Store," Proc. IFIPS Congress, New York City, May 1965. [1K words, 288 bits per word.]
51. R. J. Petschauer and R. D. Turnquist, "A
Nondestructive Readout Film Memory," Proc. 1961
Western Joint Computer Conference, pp. 411-425.
[Experimental 1024 word, 36 bit memory. Read (access) time: 1.5 μsec. Write time: 100 μsec.]
52. G. W. King et al, "Photographic Techniques
for Information Storage," Proc. IRE, vol. 41, p.
1421 (1953).
53. L. N. Hampton and J. B. Newsom, "The
Card Translator for Nation-Wide Dialing," Bell System Tech. J., vol. 32, p. 1037 (1953).
54. T. Kilburn and E. R. Laithwaite, "Servo Control of the Position and Size of an Optical Scanning System," Proc. IEE, vol. 101, part IV, p. 129 (1954).
55. G. R. Hoffman et al, "High-Speed Light Output Signals from Electroluminescent Storage Systems," Proc. IEE, vol. 107B, suppl. 20, p. 257 (1960).
--- and P. L. Jones, "An Electroluminescent Fixed Store for a Digital Computer," Proc. IEE, vol. 109B, p. 177 (1962).
56. H. Hagiwara et al, "The KT Pilot Computer: A Microprogrammed Computer with a Photo-Transistor Fixed Memory," Proc. IFIPS Congress, Munich, 1962, North Holland Publishing Co., 1963, p. 684. [256 word, 80-bit-per-word store.]
57. F. Roberts and J. Z. Young, "The Flying Spot Microscope," Proc. IEE, vol. 99, part IIIA, p. 747 (1952).
58. C. W. Hoover and G. Haugk, "The Flying-Spot Store," LCMTCS, pp. 79-98. [Operating 2.2 × 10⁶ bit store. 5 μsec cycle time.]
59. D. J. Parker et al, U.S. Patent 3,191,157; "Optical Memory," filed Jan. 21, 1960. [Utilization of an optical tunnel with a CRT selection system is discussed.]
H. E. Haynes, U.S. Patent 3,184,732; "Computer Circuit," filed Apr. 15, 1960. [An optical tunnel-fiber optic system is discussed.]
See also "Electro-Optics," brochure of 19 technical papers, Report no. DEP/SCN 009-64-5M, available from Manager, Marketing Services, RCA Defense Electronic Products, Building 2-6, Camden, N. J.
60. G. R. Hoffman and D. C. Jeffreys, "A Computer Fixed Store Using Light Pulses for Read-out,"
J. Brit. IRE, vol. 25, no. 2, p. 99 (Feb. 1963).
61. J. L. Craft et al, "A Table Look-Up Machine for Processing of Natural Languages," IBM Journal, July 1961, pp. 192-203.
62. J. M. Donnelly, "Card Changeable Memories," Computer Design, June 1964, pp. 20-30.
A HIGH-SPEED, WOVEN READ-ONLY MEMORY
H. Maeda
M. Takashima
TOKO, Inc.
Tokyo, Japan

and

A. J. Kolk, Jr.
General Precision Inc.
Librascope Group
Glendale, California
INTRODUCTION
The woven, plated-wire memory concept¹ has
been shown to provide an economical fabrication
technique for the construction of high-speed DRO
and NDRO memory arrays. The economies of the
woven memory arise from two factors: first, the
memory element consists of permalloy-plated copper alloy wire, which is made by an inexpensive, readily controllable, continuous plating process; and
readily controllable, continuous plating process; and
second, the weaving technique constitutes a highly
automated method for providing the array wiring.
Recently, the woven concept has been extended to the fabrication of very inexpensive, high-speed, read-only memories. This paper describes the fabrication techniques for the preparation of a woven
permanent memory matrix and the electrical properties of some of the possible organizations.
Other implementations of read-only memory include the core rope memory² and various capacitively or inductively coupled word-organized memories.³ The woven permanent memory is more similar to the word-organized capacitive or inductive read-only memories than to the rope memory. It
appears to have an appreciable speed advantage over
other word-organized, read-only memories because
of high sense signal output and very small sense line
delays due to high packing density.
The applications of read-only memories have been described in other papers²,³ and will not be reviewed here.
THE MEMORY ARRAY
The memory element in the woven array consists
of a resilient copper alloy wire, 8 mils in diameter,
upon which a uniaxially anisotropic, permalloy thin
film is plated with a circumferentially directed easy
magnetic axis. The magnetic coating is 81Ni-19Fe permalloy plated from an aqueous solution containing mainly NiSO₄·7H₂O, NiCl₂·6H₂O, FeSO₄·7H₂O and H₃BO₃.
The memory matrix is fabricated by weaving the
permalloy-plated wires as the woof, and 3-mil
diameter, insulated, conducting wires as the warp,
in the configuration shown in Fig. 1. Selected warp
wires are joined together at the ends to provide
drive coils of the appropriate number of turns and
spacing. The joining of the warp wires can be automated so that this step does not affect the economy
or reliability of the fabrication process.
A schematic of the parts of a loom, important to
the description of a woven read-only memory, is
presented in Fig. 2. In a conventional loom, the
warp lines are controlled by the heddles.

Figure 1. Structure of TOKO memory plane.

The heddles are connected to bars called harnesses, and
upon raising one or more of these harnesses, a
group of warp lines are lifted to permit insertion of
the woof. A Jacquard loom is a modification which
permits individual control of each heddle so that
any pattern of warp lines may be selected. The raising of the heddles in a Jacquard loom is activated
by punched card input. As a matter of historical
interest, this type of loom, which was developed at
the start of the 19th century, was the first machine
controlled by punched card input.
The weave pattern for a conventional DRO or
NDRO woven memory is similar to that shown in
Fig. 1. This pattern can be formed by controlling
the heddles with only two harnesses, since the
weave is constructed by alternately lifting adjacent
warp wires and inserting the magnetic woof wire.
Several warp wires are then connected in series to
form a multiturn word line drive.
One type of read-only memory weave is shown
in Fig. 3. This weave, which can only be conveniently formed by using the Jacquard loom, employs the usual woven coil where a "one" is to be read, and bypasses the permalloy-plated wire at a "zero" position by having all the warp wires either above or below the plated wire.
The simplicity of the production method is a key
factor in making this variation of the read-only
memory economically attractive. Other types of
batch-fabricated, word-organized read-only memories utilize some form of punched metallic conductor
or printed circuit board to provide a cheap capacitive
or inductive coupling mechanism to store the information patterns. The loom-woven memory array,
constructed by using punched cards generated by a
computer, can be fabricated at costs only slightly
higher than the capacitive or inductive type. In addition, this type of array has a relatively high signal
output, high packing density, low sense line delay
per bit and other features necessary for the construction of a memory with a cycle time in the 0.1-microsecond range.

Figure 2. Significant parts of a loom.
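The mapping from a computer-generated bit pattern to Jacquard punched-card control can be illustrated with a short sketch. This is hypothetical (not from the paper), and a one-turn coil of two warp wires per bit is assumed for simplicity:

```python
# Hypothetical sketch of deriving Jacquard heddle-lift patterns from a stored
# word. A "one" is woven as a coil (adjacent warp wires alternate over/under
# the plated woof wire); a "zero" bypasses the plated wire (all warp wires in
# that position pass on the same side). Two warp wires per bit are assumed.

def heddle_rows(word_bits):
    """Return one lift flag (True = heddle raised) per warp wire for a
    single insertion of the magnetic woof wire."""
    lifts = []
    for bit in word_bits:
        if bit:                       # "one": warp alternates to form a coil
            lifts.extend([True, False])
        else:                         # "zero": all warp above the plated wire
            lifts.extend([True, True])
    return lifts

print(heddle_rows([1, 0, 1]))  # -> [True, False, True, True, True, False]
```

In an actual plane the selected warp wires are afterward joined at the ends to form multiturn drive coils, so the real card patterns are correspondingly more elaborate.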
PRINCIPLES OF OPERATION
When used in a normal DRO or NDRO memory array, the permalloy-plated wire is the digit sense line, and the insulated copper wires form the word drive coils. The permalloy-plated wire has a cir-
cumferentially directed magnetic easy axis so that
the remanent magnetization lies in a closed path
around the wire.
The information is written by applying an axially
directed field with the word line and an easy direction field down the digit line. The word line field
must be removed prior to the digit field. In order to read, a field is applied along the woven word line and the information is sensed on the plated wire.
The read output voltage as a function of word write current I_w and digit write current I_D of a typical undisturbed memory bit is shown in Fig. 4.
The read-only memory could be operated in precisely the same manner as the DRO memory using a unipolar digit pulse. In most cases, the preferable mode of operation employs a DC bias to reset the bits, using a circuit configuration such as shown in Fig. 5.

Figure 3. Weave pattern for the read-only memory.

Figure 4. Output voltage vs digit write current for an undisturbed bit in a DRO memory.

The DC bias mode has the advantage of permitting very high-speed memory operation since (with the elimination of digit write current delay and the serious noise problem which accompanies the digit write operation) the major factors affecting speed are only the element switching
time, sense line delay and recovery, and amplifier
delay. Moreover, information is not re-written selectively, so the addressing of the next word can be at least partially accomplished during the time required for the element switching transient to decay.
A quantitative discussion of memory speed is contained in the section on memory cycle time.
The price paid for the DC bias organization is power, but fortunately the losses due to DC bias are only on the order of 10⁻⁵ watts per bit. In memories operated with submicrosecond access times, no power saving would be accomplished by using a pulsed digit current; in fact, the DC bias has a power advantage. However, in slow memories operated with bit transfer rates on the order of 10⁵ bits per second, an appreciable power saving can be obtained by using a pulsed digit current to reset the bits.
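This trade-off can be made concrete with a rough sketch. The numbers are illustrative only; the per-pulse reset energy is an assumed value, not a figure from the paper:

```python
# Compare average reset power per bit for DC-bias reset vs a pulsed digit
# current. The DC bias dissipates continuously; a pulsed reset spends a fixed
# energy once per bit transferred, so its average power scales with bit rate.

DC_POWER_PER_BIT = 1e-5        # watts, the order of magnitude quoted in the text
PULSE_ENERGY_PER_BIT = 1e-11   # joules per reset pulse (assumed value)

def reset_power_per_bit(bit_rate_hz, pulsed):
    """Average reset power per bit at a given bit transfer rate."""
    if pulsed:
        return PULSE_ENERGY_PER_BIT * bit_rate_hz
    return DC_POWER_PER_BIT

# At 1e5 bits/sec the pulsed reset dissipates less; at 1e7 bits/sec the
# DC bias has the advantage, as stated in the text.
for rate in (1e5, 1e7):
    print(rate, reset_power_per_bit(rate, True), reset_power_per_bit(rate, False))
```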
Figure 5. Circuit configuration for a DC-biased, read-only memory.

MEMORY ELEMENT CHARACTERISTICS
The economical construction and reliable operation of the woven read-only memory depend on the
sensitivity of the plated wire to changes in DC bias
level, to temperature, and to bending and twisting
strains. Another important characteristic is the peaking and switching time of the memory element. These
characteristics have been measured and the test
results are described below.
The memory element output as a function of drive
fields has been measured using the test configuration
shown in Fig. 6. The output signal voltage was
measured as a function of the DC bias current I_DC, the word current I_w, and the magnetic plating thickness. The effect of the variation of the DC bias level
is shown in Fig. 7.

Figure 6. Read-only memory test circuit.

The 0.7-micron-thick permalloy plate is seen to peak in output at a bias of about
60 milliamps and then drop. The initial increase in
peak voltage with bias current is due to incomplete
resetting of the magnetic film at low DC bias levels.
The drop in voltage at higher levels of bias occurs
because the word drive field no longer completely
rotates the magnetization into the hard direction. The
2-micron-thick film shows a much greater output
and a lower DC bias requirement than the 0.7 micron. A 5-micron-thick plate has also been tested but
in this case the output was no greater than that obtained with the 2-micron film because of the slower
switching speed of the thick film.
The read output voltage has been determined as
a function of temperature over the range -60°C to
+ 175°C. At a constant word read current of 500
milliamps and a reset current of 50 milliamps, the
temperature coefficient of signal output was only 60
microvolts per degree Centigrade. This is sufficiently
small so that word drive temperature compensation
is not necessary.
The change in coercive force from -60°C to
+ 175°C has also been measured. The temperature
coefficient of coercive force is -0.0027 oersteds per
degree Centigrade. This magnitude of the temperature coefficient would not require any change of
DC bias level with temperature.
An important factor in the packing density and
reliability of a woven plane is the susceptibility of
the plated wire to strains introduced by bending.
Special looms have been designed to minimize and
control the local bending stresses normally encountered in the weaving process. However, in order to
minimize cost by maximizing yield in the read-
Figure 7. Output voltage as a function of DC-bias current and word drive current (7000 Å permalloy, 2-turn word line; I_w = 300, 500, and 700 ma).
only memory planes, a relatively high tolerance to
strains introduced by the weaving process is desirable. The effect of bending on memory operation
was determined using the apparatus shown in Fig.
8. The results of the test are shown in Fig. 9. With-
in the range of expected bending angles, no serious
change of signal level is evident.
The effect of twist was also determined using the
test fixture shown in Fig. 10. A very great tolerance
to twist was obtained as shown by the data presented in Fig. 11.
The peaking time of the memory element is equal to or less than the rise time of the word read pulse, at least down to rise times of 40 nanoseconds. The peaking time in the read-only memory is more important than the switching time because the bit is completely reset in every case by the DC bias current during the fall of the word current.
Figure 8. Apparatus for determining the effect of bending on the output voltage.

MEMORY PLANE CHARACTERISTICS
The important characteristics of a memory plane
include packing density, sense line delay and attenuation, sense line characteristic impedance, word
line inductance and capacitance, and finally signal-to-noise ratio.

Figure 9. Effect of bending strains on the output voltage in a read-only memory (fixed I_DC = 50 ma, I_w = 500 ma constant; 7000 Å permalloy, two-turn word line).

Figure 10. Apparatus for measuring the effect of twisting strains.
Packing Density. A read-only memory array
has been assembled with a packing density of 800
bits per square inch. A photograph of this array is
shown in Fig. 12. This packing density was chosen
because of compatibility with existing production
equipment for DRO memories. It represents a density of 20 bits per inch along the word line and 40
bits per inch along the sense line.
The packing density along the sense line in a DRO plated-wire memory is limited by adjacent
bit interactions. There is no such limitation in the
permanent memory; the bits can be spaced as tightly as the spacing of the warp lines allows. In the
case of a 2-turn drive coil with 3-mil diameter
warp line wires, this spacing is 80 bits per inch.
In a DRO or NDRO plated-wire memory, the
signal output is also directly related to the bit density along the sense line. This is not true in the
read-only memory for the following reasons. If a
single turn coil is considered for illustration, it is
found that the rotation of the magnetization is
spread over an appreciable distance from the coil
because of the slow drop-off of the magnetic field
due to the demagnetizing fields in the film. In a
read-only memory, there is no need to place both
turns of a two-turn drive coil on adjacent warp
wires, even if every warp wire is used in the word
drive lines. The word drive coils can be interleaved
as shown in Fig. 13 to share magnetic material between adjacent bits, and thereby, provide a much
higher output signal than would be obtained with
adjacent coils.
Electrical Characteristics of the Sense Line. The delay per bit of the plane shown in Fig. 12 is quite long, 50 picoseconds per bit, because of the large spacing along the sense line and because a 5-micron-thick plating of permalloy was employed.
If every warp wire is used in forming the word lines so that no warp wires are left for spacing as in the memory plane shown, and if a 2-micron-thick plating of permalloy is employed, the delay per bit along the sense line would be 8 picoseconds per bit.
The woven memory plane has a ground plane beneath the woven mat. The capacitance to ground,
and consequently the characteristic impedance of
the line, depends upon the position chosen for the
Figure 11. Effect of twist on the output voltage (fixed I_DC = 50 ma; 7000 Å permalloy, two-turn word line).
ground plane. This can be varied at will, but a typical value is about 100 ohms.
Word Line Characteristics. The word line can
best be considered as an inductance and not as a
transmission line. The value of the inductance will
vary with the configuration chosen for the word
coil, but a typical value for a two-turn coil and a 2-micron-thick plate is on the order of 10 to 20 nanohenries per bit.
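This inductance figure can be checked against the back emf quoted later in the paper for a long word. A 64-bit word with a 500-ma drive and a 50-nanosecond rise time is used, and 16 nh per bit is assumed from the 10-20 nh range above:

```python
# Back-emf estimate for an all-"ones" word line: v = L * di/dt, with the
# per-bit inductance taken from the 10-20 nanohenry range quoted above.

BITS_PER_WORD = 64
L_PER_BIT = 16e-9        # henries per bit (assumed within the quoted range)
DRIVE_CURRENT = 0.5      # amperes (500 ma)
RISE_TIME = 50e-9        # seconds (50 nsec)

back_emf = BITS_PER_WORD * L_PER_BIT * DRIVE_CURRENT / RISE_TIME
print(round(back_emf, 2))  # -> 10.24 volts, consistent with the "10 volts"
                           #    figure quoted later for a 64-bit word
```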
Signal-to-Noise Ratio. Figure 15 presents the sense signal output of the experimental read-only memory array shown in Fig. 12. The measured one-to-zero ratio over the plane was better than 5 to 1.
The peaking time for this plane was relatively large
because a 5-micron-thick permalloy plate was
used.
TWO-INTERSECTION-PER-BIT ARRAY
An alternative organization for the woven read-only memory plane is shown in Fig. 14. This organization employs two intersections per bit and
yields a bipolar sense output for a "one" and a "zero." It is attractive from the point of view of presenting a constant word-line inductance to the driver. It also provides improved discrimination between a "one" and a "zero." In the one-intersection-per-bit organization, a word line could contain all "zeros" and another all "ones." The difference in the back emf of these two lines, using a 500-milliamp drive and a 50-nanosecond rise time, would be as high as 10 volts for a 64-bit word.

Figure 12. Photograph of the test read-only memory plane.

Figure 13. Interleaving configuration for the word drive lines.

MEMORY CYCLE TIME
The read-only memory, employing a DC bias,
represents a rather simple memory design problem
as compared with an electrically alterable memory.
The digit write operation, which is one of the principal sources of noise and delay in a memory system, is eliminated. A block diagram of a read-only memory system is shown in Fig. 16.
The factors limiting speed in this system are the
delays associated with address decoding, the delay
in generating the 500-milliamp word line drive,
the transmission delays in the word access circuits
and word lines, the peaking time of the element, the
transmission delay in the sense line and the delay in
the sense amplifier.
The address decoding can be accomplished with a small delay, on the order of 10 to 20 nanoseconds. The setting up of new addresses in the registers can be taking place during the latter part of the reading cycle.
The delay in generating the 500-ma pulse is a problem in view of the word-to-digit line capacitance. This capacitance must be charged in all the
paralleled word lines in front of the line diodes.
The problem is compounded by the fact that the
common mode signal generated by this charging occurs during read time and, unless compensated by
careful balancing, will generate an intolerable noise
pulse.
There are several ways to handle this problem in
a small memory; one is to open the "B" switches
(see Fig. 16) first and provide the required charging current.

Figure 14. Bipolar, two-intersection-per-bit memory array.

Figure 15. Output signal from a "one" and a "zero" in the test read-only memory plane. ("One" and "zero" output voltages, 20 mv per division; word current, 200 ma per division; sweep, 40 nsec per division.)

In a large memory, the problem can be
eliminated by providing transformer coupling to the
word line. The transformer coupling reduces the
charging current and can eliminate common mode
noise if proper balancing is employed.
The delay in the word access circuits for the
word lines is generally small. However, the word
line delay for a 2-turn weave with a 40-mil
spacing between bits on the word line is about 50
picoseconds per bit. If unusually long words are
employed, this could be a significant delay.
The peaking time of the element is significant in
this memory rather than the switching time, since
the element will always be reset by the DC bias
when the word current is removed. For a SOO-milliamp word current at a 50-nanosecond rise time,
a nominal value of the peaking time for a 2-micron-thick permalloy plate is 30 nanoseconds.
An outstanding feature of the memory is the
small value of the sense line delay. With a 2-turn
word drive coil, the sense line delay is only 8 picoseconds per bit. The delay, of course, increases linearly with the number of turns on the word drive
coil.
The remaining delay is that of the sense amplifier. A suitable sense amplifier would require a
gain of about 200. With presently available transistors, the delay in such an amplifier would be approximately 10 nanoseconds.
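The delay contributions listed in this section can be tallied in a short sketch. The word length and the word-drive allowance are assumed values, since the text gives no single figure for them:

```python
# Cycle-time budget for the DC-biased woven read-only memory, using the
# delays quoted in this section. A 64-bit word is assumed.

WORD_BITS = 64
delays_ns = {
    "address decoding":    15.0,              # 10 to 20 nsec, midpoint taken
    "word drive circuits": 20.0,              # assumed allowance for the 500-ma driver
    "word line":           0.05 * WORD_BITS,  # 50 psec per bit (2-turn weave)
    "element peaking":     30.0,              # 2-micron plate, 50-nsec rise time
    "sense line":          0.008 * WORD_BITS, # 8 psec per bit (2-turn drive coil)
    "sense amplifier":     10.0,              # gain ~200 transistor amplifier
}
cycle_ns = sum(delays_ns.values())
print(round(cycle_ns, 2))  # -> 78.71, i.e. a cycle time of roughly 0.1 microsecond
```

Note that the transmission-line terms are minor at this word length; the element peaking time and the driver dominate the budget.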
A THIN MAGNETIC FILM COMPUTER MEMORY USING A RESONANT ABSORPTION NONDESTRUCTIVE READOUT TECHNIQUE
M. May, W. W. Powell, and J. L. Armstrong
Hughes Aircraft Company
Culver City, California
very nearly in the plane of the film. This fact modifies the Larmor expression so that the resonant
ge
frequency is given by Wo = 2
( 41TM H K) 1f2 where
me
the subscript on w implies that no external field other
than the rf field is applied. If an external field Hap
is superimposed parallel to the external anisotropy
field this expression becomes
INTRODUCTION
H. D. Toombs and T. E. Hasty have described a
technique utilizing ferromagnetic resonance to obtain
nondestructive readout in thin permalloy film memories. 1 A study has been made by the authors to
determine how this technique could best be utilized
in medium and large sized computer memories. The
study culminated in the construction of a 32-word,
24-bit film plane tester utilizing absorption resonance
readout which served to provide data on the resonant behavior of various films and to lend practical
experience in the design of a resonance memory.
Resonance absorption may be demonstrated by
subjecting a ferromagnetic material sample to a
steady magnetic field which (viewed classically) sets
up an axis of precession for the electron spins responsible for the materials magnetic moment. If a R.F.
field is now applied perpendicular to this axis with
a frequency near the natural precession frequency
geH
given by the Larmor relation OJ = 2
(gaussian
me
units), the spins will begin to precess in sympathy,
absorbing energy from the R.F. source. In the present
application the steady field is provided by the internal anisotropy field H K of the material. Due to the
shape anisotropy of a thin film, the precession is
distorted so that the magnetization vector M remains
ge
~
w = -2- (41TM I HK
me
+
~
Hap
I ) 1f2
Thus the resonant frequency shifts above or below
Wo depending upon whether the magnetization vector M is parallel or antiparallel to Hap. It is this
phenomena that is used to nondestructively determine the magnetic state of film cell.
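The shifted-resonance expression above can be exercised numerically. The sketch below assumes a typical permalloy value of 4πM = 10,000 gauss, which is not stated in the text, and uses the 2.8 Mc/oersted gyromagnetic constant quoted in Fig. 2, so the resulting frequencies are only indicative.

```python
import math

# Resonant frequency from the expression above, in megacycles:
#   f = 2.8 * sqrt(4*pi*M * |HK + Hap|), fields in oersteds.
# 4*pi*M = 10,000 gauss is an assumed permalloy value, not from the text.

GAMMA_MC_PER_OERSTED = 2.8
FOUR_PI_M_GAUSS = 10_000.0  # assumption

def resonant_freq_mc(hk: float, hap: float = 0.0) -> float:
    """hap > 0: applied field parallel to M; hap < 0: antiparallel."""
    return GAMMA_MC_PER_OERSTED * math.sqrt(FOUR_PI_M_GAUSS * abs(hk + hap))

f0 = resonant_freq_mc(5.3)           # ~645 Mc for HK = 5.3 oersteds
f_up = resonant_freq_mc(5.3, +1.4)   # shifted above f0 (parallel)
f_dn = resonant_freq_mc(5.3, -1.4)   # shifted below f0 (antiparallel)
print(round(f0), round(f_up), round(f_dn))
```

Under these assumptions the zero-field resonance of an HK = 5.3 oersted film comes out near 645 Mc, consistent with operating the memory at 550 Mc, below resonance.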
Figure 1 shows schematically the configuration of a resonance memory. Power from a uhf oscillator is equally distributed by means of a power splitter over the digit lines. Film cells lie under the digit lines such that their anisotropy axis is perpendicular to the uhf field direction. A small fraction of the power is therefore absorbed by each cell, the remainder proceeding to the end of the line to bias a demodulator detector. The operating frequency is set somewhat below the resonant frequency of the film cells, as shown in Fig. 2. Interrogation of the memory is accomplished by pulsing a word line, thereby reducing or augmenting the absorption depending on the cell's magnetization. This change in absorption is demodulated at the detector so that an output pulse is produced which is a replica of the interrogate pulse with a polarity dependent on film state.

PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965

Figure 1. Configuration of resonance absorption memory.

Writing is done by the technique conventional for film memories. Note however that the film axis direction dictates that the roles of the word and digit lines be exchanged for the writing operation. This necessitates transposing the data to be stored. Other writing schemes involving separate write conductors to circumvent this difficulty were considered, but these were not developed in this study.

DESIGN CONSIDERATIONS

Selection of an Operating Frequency

From the design viewpoint a low absorption frequency is preferred because this favors oscillator efficiency, detector efficiency, and ease of matching. It was found by experiment that obtaining good control of resonant absorption with an external field requires the use of film with an HK of 5 oersteds or greater, and a frequency of 550 megacycles or greater. The frequency of 550 mc was satisfactory for the design of a solid state oscillator using a 2N3375 transistor having an efficiency above 40 percent, and also for the design of a transistor detector. A higher frequency and HK can be used, but no advantage was recognized. Operation below film resonance was much preferred to operation above resonance, both because of greater signal output and because circuit design problems significantly increased at the higher frequencies near 1,000 megacycles.

Thin film absorption characteristics in terms of signal output versus frequency are shown in Fig. 3. Figures 3a, 3b, and 3c show the need for selection of film having relatively low dispersion (under 2°) and a moderately high HK of 5 oersteds or greater. The R.F. line was 5.5 mils wide and of 50 ohms impedance.

Figure 2. Choice of operating frequency somewhat below the film resonant frequency fr.

Figure 3a shows the effect of HK = 3.5 oersteds and a dispersion (α90) of 10°. The signal output versus frequency for a stored 1 is shown. Too much creep occurred on this film to show a stored 0 when the applied read field was opposite to the magnetized state of the cell.

Figure 3a. Resonant absorption readout of 1000Å-thick permalloy, 25 x 50 mil size (HK = 3.5 oersteds, HC = 2.4 oersteds, α90 = 10°). Applied field to readout is in direction of stored 1; no stored-zero signal was observable.

Figure 3b shows some improvement for HK = 3.8 oersteds and a dispersion of 4°. A signal was obtained for a stored 1 and stored 0.

Figure 3c shows the 1 and 0 output versus frequency for films having HK = 5.3 oersteds. In this film α90 was less than 2°. The externally applied field was generated by passing 300 milliamps through a 60-mil-wide conductor adjacent to the cell in all cases.
Figure 3b. Resonant absorption readout of 1000Å-thick permalloy, 30 x 60 mil size (HK = 3.8 oersteds, HC = 2.5 oersteds, α90 = 4°). Applied field to generate signal is in direction of stored 1.

Figure 3c. Resonant absorption readout of 1000Å-thick permalloy, 30 x 60 mil size (HK = 5.3 oersteds, HC = 2.5 oersteds, α90 = 2°). Applied field to generate signal is in direction of stored 1.

The reason for the inflections in signal amplitude with increasing frequency for films with low HK and high α90 is not understood by the authors. The line and detector for these measurements had an SWR of less than 1.1 over the frequency range shown. A diode detector was used for these measurements.

CALCULATING THE SIGNAL MAGNITUDE

Neglecting copper and dielectric losses, the power attenuation down a digit line is given by dP/dn = -kP, or P = P0 e^(-kn), where P0 is the input power, n is the cell number, and k is the fraction of the power reaching a cell that is absorbed by that cell. If one of the cells is interrogated, its power absorption is changed by akP, where a is the fraction of the absorption that can be controlled by the word pulse. Considering the attenuation due to the remaining cells as given above, the power signal at the output is ΔP = akP0 e^(-kN). This signal is maximized with respect to k when kN = 1. Thus to obtain the best power signal the cell size must be adjusted to produce an attenuation down the line (less copper and dielectric losses) of 1/e, or 4.3 db.

Under this condition the signal power at the last cell, where the R.F. lines connect to the detector, is ± (aP0/e)(1/N), where N is the total number of cells along the digit line.

The fraction a was measured for a number of cells on a 50-ohm strip line. A power meter monitored the output and input. A large cube coil placed over the line applied a field of 0, +1.4 or -1.4 oersteds. This was the largest field that, together with the R.F., would not permanently change the stored magnetization of the memory cells. The R.F. level was 10 milliwatts. With the data adjusted for 1/e total attenuation, the power ratios measured were:
Frequency    Line Loss    Variation in Line Loss with Applied Field
450 mc       4.3 db       -0.81 db / +0.38 db
550 mc       4.3 db       -1.17 db / +0.58 db
640 mc       4.3 db       -1.53 db / +0.76 db
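The optimization above is easy to verify numerically. The sketch below is illustrative only; the absorbed fraction a and the input power are placeholder values, since only the shape of the product matters for locating the maximum.

```python
import math

# The output signal a*k*P0*exp(-k*N) from the derivation above is
# maximized when k*N = 1, i.e. a total line attenuation of 1/e.

def output_signal(k: float, n_cells: int, a: float = 0.05, p0_mw: float = 10.0) -> float:
    return a * k * p0_mw * math.exp(-k * n_cells)

N = 32
best_k = max((i / 10000.0 for i in range(1, 2000)),
             key=lambda k: output_signal(k, N))
print(round(best_k * N, 3))               # the grid optimum sits at k*N near 1
print(round(10 * math.log10(math.e), 2))  # 1/e expressed in db
```

The second printed value, 4.34 db, is the 1/e attenuation the text rounds to 4.3 db.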
It was found that 10 milliwatts peak power at 550 megacycles was near a practical maximum for the 5.5-mil-wide 50-ohm R.F. line used. Above this value the stored information might be lost due to creep on readout. This power provided a calculated peak R.F. field of close to 1 oersted. The limit on power is influenced by the R.F. line dimensions and the properties of the thin film.

The power at the output with 10-milliwatt input would be 0.010/e watts ± the signal power. If all the memory cells at 550 mc could provide a -1.17 db change, then each cell would provide a -1.17/n db change of the output power. For the 32 cells used in the memory film substrate tester described later, a -1.17/32 or 0.0365 db change could be expected, assuming the cells were adjusted so that 32 cells provided 4.3 db attenuation. In practice 30 x 60 mil cells 1,000 Å thick with HK = 5.3 oersteds will closely achieve this. The R.F. level change in a 50-ohm line using 10 milliwatts into the input would be 0.43 volts ± 1.55 millivolts. This modulation was confirmed using a General Radio detector on the above-mentioned film tester.

Figure 4 is a circuit diagram of the detector and signal amplifier used on the film tester. A 2N918 transistor was used in place of a diode for detection, since it was possible to obtain a signal voltage gain of 32 as compared to a diode. This gain avoided noise problems in the following amplifier and resulted in a better signal-to-noise ratio than a diode would have given.

Figure 4. Detector and signal amplifier used on the film tester (R.F. detector with 1/4-wavelength tuning stub, emitter followers, and output to voltage comparator).
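The carrier-level arithmetic above can be checked with a short calculation. This sketch converts power in a 50-ohm line to voltage with V = (PZ)^½; the simple db-to-voltage conversion at the end gives the right order of magnitude for the modulation, though not exactly the 1.55 millivolts quoted.

```python
import math

# Carrier level at the detector: 10 mW in, 1/e line attenuation, 50 ohms.
P_IN_W = 0.010
Z_OHMS = 50.0

p_out = P_IN_W / math.e                # power reaching the detector
v_carrier = math.sqrt(p_out * Z_OHMS)  # ~0.43 V, as quoted above

# Modulation from one interrogated cell: -1.17/32 db of the output power.
db_per_cell = 1.17 / 32
dv_mv = v_carrier * (10 ** (db_per_cell / 20.0) - 1.0) * 1000.0
print(round(v_carrier, 2), round(dv_mv, 2))
```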
Detector noise as observed on an oscilloscope was below 0.1 volt peak at the amplifier output. Referred to the input at the emitter, there is a voltage gain of 8,000; thus noise peaks at 12.5 microvolts for this particular detector. Turning the oscillator on and off makes almost no difference to the observed noise, so the solid state oscillator driving 10 milliwatts into the R.F. lines does not appear to be a principal source of noise. It is believed that a better noise figure could be obtained with an improved detector design.
The observed signal-to-noise ratio was 120 to 1 for a stored 1 and 70 to 1 for a stored 0. If the number of cells on the R.F. line were increased (by 8) to 256, the signal-to-noise ratios would be 15 to 1 and 8.7 to 1. This assumes that the attenuation and signal from each memory cell were reduced by 8 to 1 to maintain optimum line attenuation. Adjustment of cell width, thickness, and HK could achieve this. Since a considerable allowance must be made for variations in detectors, standing waves, and variations in memory cell performance, this represents a practical limit for the setup as tested. However, a better detector might be designed and better film might be made, so that extension of the number of memory cells per R.F. line is quite possible.
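The scaling argument above is simple enough to state as a one-liner: multiplying the cell count by 8 divides the per-cell signal, and hence the signal-to-noise ratio, by 8.

```python
# Signal-to-noise scaling from 32 to 256 cells per R.F. line,
# using the observed ratios quoted above.

snr_32 = {"stored 1": 120.0, "stored 0": 70.0}
scale = 256 / 32
snr_256 = {state: snr / scale for state, snr in snr_32.items()}
print(snr_256)  # 15.0 and 8.75, matching the quoted 15 to 1 and 8.7 to 1
```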
R.F. LINE IMPEDANCE
The initial choice of a 50-ohm R.F. line was
dictated by the convenience of using existing fittings, cables, and R.F. measuring equipment. However, given a required memory cell center-to-center spacing (80 mils was initially selected), the lower the line impedance the wider the strip line must be, assuming spacing to ground is fixed by the glass substrate thickness. Thus R.F. line-to-line coupling will be increased as the spacing between lines is reduced. There is therefore a lower limit of impedance which is reached for a given layout when R.F. line-to-line coupling becomes excessive.
The worst-case coupling that must be considered is that of all the other lines to one line. If all the bits being read out are 0's (or 1's) except one bit, then R.F. coupling will provide a competing signal of a polarity which will reduce the required signal.
It was determined experimentally that the absorption (and therefore signal output) from a given
memory cell increases almost linearly until the cell
is three times the width of the R.F. line. This is
probably true only for the narrow (5.5 mil) R.F.
lines used where fringing fields are excessive. Spacing to ground was 6 mils. It was also found that
writing into a memory cell by rotation of its magnetization required approximately the same word current whether the word line was 1/6 of the cell width or equal to the cell width. This was due in part to fringing fields, in part to the fact that demagnetizing fields prevent alignment of the edge of a thin film cell in the hard direction, and in part to substantial magnetic coupling from region to region within the thin film cell.

Figure 5. The R.F. lines and digit write lines of the matrix tester.
The disadvantage of a narrow line to obtain 50 ohms did not appear very substantial. The advantage of wider R.F. lines might be the use of greater R.F. power without causing magnetic creep during reading. The line current is given by I² = Power/Z. Approximately,

    Z = K × (spacing to ground)/(conductor width), K a constant,

but spacing to ground is determined by the glass substrate thickness, so

    Z = K1/Width and I² = (Power × Width)/K1.

The R.F. magnetic field on the surface of the flat strip line is approximately given by

    HRF = I × K2/Width = (P/Width)^½ K3,

so that if the peak R.F. magnetic field is the main cause of creep during reading, it can be reduced as the inverse square root of the R.F. line width, although in most practical layouts R.F. coupling will make lines of less than 50 ohms impractical, apart from the problem of making special fittings, cables, and measuring equipment.
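The width trade-off above can be sketched as follows. K3 is the unspecified layout constant from the text, set to 1 here as a placeholder; only the scaling with power and width is meaningful.

```python
import math

# HRF = K3 * sqrt(P / width): the peak R.F. field scales as the
# inverse square root of line width at fixed power.

def h_rf(power_mw: float, width_mils: float, k3: float = 1.0) -> float:
    return k3 * math.sqrt(power_mw / width_mils)

# Doubling the line width at fixed power lowers the field by sqrt(2),
# so twice the power in a doubled-width line gives the same peak field.
ratio = h_rf(10.0, 5.5) / h_rf(10.0, 11.0)
print(round(ratio, 3))
```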
The principal source of coupling between the R.F. lines was capacity coupling between the R.F. lines and the digit write lines (see Fig. 5). The layout in Fig. 5 has 768 small capacity coupling points at the matrix crossover points.
points at the matrix crossover points. When 23 R.F.
lines are driven with equal R.F., the worst-case
coupling on the 24th line was measured as 16 db
down on the driven lines. The 24th line is matched
at its input with 50 ohms for this measurement.
Removing the digit write lines reduced the R.F.
coupling for this test to 35 db below the driven lines.
It would not therefore be practical to extend this
layout much beyond 32 memory cells per R.F. line.
However, if a 1-mil metal sheet is placed over the R.F. lines, and it is suitably grounded at intervals, then almost no R.F. will be found outside the thin metal top plate. If the digit write lines are insulated and placed over this sheet they no longer form a coupling network. However, at a readout rate of the order of 1 megacycle the field from these lines will penetrate the thin metal shield and permit readout of words from the memory, and also permit information to be written into the memory.
THE MATRIX TESTER
A memory thin film tester was designed and constructed both for testing film for use in an R.F. absorption type memory and to determine the magnitude of the problems in reaching a practical design
for a useful memory. Figure 6 shows the completed
tester. Thin film substrates could be inserted under
a grid of R.F. lines (24) and digit lines (32) as
shown in Fig. 5. To obtain information on writing
in the memory, and reading the memory, it was arranged that the 768 switches shown would provide
unlimited variation of the pattern to be written, and
a check of the contents on a subsequent read cycle.
A complete read/write cycle was repeated at a 30-cycle rate so that a flicker-free display could be obtained (on the CRT shown) of the memory contents, error pattern, or switch pattern. Word current, digit 1 and digit 0 current, and read current
were continuously variable by adjusting the power
supplies shown so that the effect of any variable
could be seen on an error pattern. To determine
creep and stability, the tester can be switched to
read only and the error pattern observed. The low
clock frequency of 25 kilocycles made possible a
relatively simple layout using pulse rise times of
about 0.5 microseconds and allowed recovery time
for drive transformers when writing. Even so some
shielding of the 1.5-amp maximum word currents
was necessary.
Both 1 and 0 signal outputs when reading were
checked for signal level by voltage comparators
whose trigger point was continuously variable from
2 to 7 volts. This enabled a memory cell to be identified rapidly if its readout signal was below the set
margin. Display lights and stop on error provided
the address of a memory cell that had a weak output. Identification of a memory cell on the CRT
display was quite easy and this feature was normally used. The setup as shown allowed easy connection
of a signal generator to any R.F. line so that a faulty
memory cell could be tested over a range of R.F.
frequencies to determine whether its resonance
characteristics caused a read error. It was as a result
of such tests that 550 megacycles was selected to
make most of the film planes operate correctly.
Word currents were variable from zero to 1.5 amps through a 5.5-mil-wide R.F. strip line. Digit currents were variable from 0 to 300 milliamps through a 60-mil-wide strip line.

Figure 6. The completed tester.
Figure 5 shows the R.F. lines which are placed
next to the thin film. The substrates are loaded under the matrix with the magnetic film on the top
side of 6-mil glass. Four grounding screws must
be removed to change the glass substrate.
The digit lines were divided into three 12-mil lines spaced 12 mils apart to reduce capacity loading on
the R.F. lines. This precaution was just sufficient
to make the 32-word layout usable from the R.F.
coupling viewpoint. As mentioned earlier, the R.F.
coupling problem can be reduced by an order of
magnitude if a "Tri-plate" layout is used with a
thin ground plane separating the digit lines from
807
the R.F. lines. It appeared that this could be a solution to the R.F. coupling problem in larger memories.
The R.F. source was a series-tuned oscillator
using a 2N3375 transistor. This oscillator drove a
strip line power splitter etched on a 10-mil glass
epoxy copper laminate. A transformation from 50 to 2 ohms was made with a tapered line, and 25 lines connected to the 2-ohm point through 50-ohm resistors to 25 coax lines. The resistors were required to absorb the reflected waves, which had a strong effect on the signal amplitude.
The use of 50-ohm cables and connectors was
almost essential in the R.F. system to allow measurements of power loss, SWR, and coupling. These
problems must be well understood before making a
memory layout comprising etched card strip line
configurations which would not necessarily provide
accessibility for making measurements.
In testing thin film planes a rather severe "creep"
test is made when writing information because of a
peak R.F. field below resonance of near 1 oersted
acting in the hard direction for the film. Both R.F.
and d-c were passed down the R.F. lines when
writing. However, it did not prove necessary to turn off the R.F. when writing, since the same digit current was used for reading and writing, and creep would occur on the read cycle if it affected writing. Nickel-iron-cobalt films were used with an overlay of copper diffused into the film. This appreciably helped the creep problem. A number of copper-diffused substrates were tested and found to have reasonable write current margins. In addition they did not lose information on continuous read.
The use of a 5.5-mil word line for writing into
a 30-mil wide memory cell did not appear to affect the digit current margins appreciably as compared with the use of a wider word line. However
in testing 20-mil wide word lines (which required
a strip line transformer for impedance matching at
both ends) there was so much R.F. coupling because of the increased capacity coupling to the digit
write lines that this test was not continued.
The R.F. absorption memory has the peculiarity that the word lines (R.F. lines) when writing become the digit lines when reading. Thus the memory must be loaded with one digit of every word at a time, requiring that the information be prepared in this form before loading a memory. If it were required that one digit of each word be read out at a time (perhaps in searching for information), then rotating the information format through 90° would be an advantage.

DESCRIPTION OF A HYPOTHETICAL 4K WORD MEMORY

Figure 7 shows how an NDRO memory using resonant absorption phenomena for reading might be laid out to take advantage of the fact that R.F. is only required in the small block of the memory being interrogated. The block diagram shows a layout using 16 blocks with 16 separate oscillators and the present predicted limit of 256 memory cells per R.F. line. There is a transient settling time of 3-4 microseconds for the detector shown in Fig. 4 (except that C2 was reduced to 200 picofarads) after switching the oscillators. No detectable noise or output was observed at the detectors (that resulted from the read current) when an R.F. oscillator was turned off, because the detectors include a 1/4-wave 550-megacycle grounded stub which would provide a short circuit for transients of lower frequency. It should therefore be possible to parallel the detectors to a common sense amplifier as far as read noise is concerned. Switching the detectors on and off simultaneously with the oscillators is not costly in terms of components if this should prove necessary.

Figure 7. Block diagram of the hypothetical 4K word memory (16 oscillators, 32 word line switches, word current generator, memory planes, detectors, and sense amplifiers).

SUMMARY

Possible Advantages of Resonant Absorption Readout
1. The use of R.F. oscillators provides an extra
switching dimension which might provide economies over word selection usually used to read out
thin film memories. This is more likely to be relevant if the memory is very large than for the size of
memory considered.
2. In large memories it might be economical to use relatively long pulse rise times to read out thin film. The resonant absorption readout technique makes this possible without reducing signal amplitude, since the output signal is not proportional to the rate of change of the read current, but only dependent on the R.F. frequency and absorption characteristics.
3. In large memories the operating power could be quite low compared to other thin film types because very little power is required by a block of memory not being used.
4. Current margins for readout can be quite
wide. Temperature compensation should not be required.
5. In some applications, turning the information format through 90° could be an advantage, particularly when it is required to address bits within words rather than complete words.
Disadvantages
1. The present limit of 256 cells per R.F. line
and detector may not prove sufficiently attractive
economically. However it may be possible to make
considerable improvements with better magnetic
film and a better detector.
2. Turning the information format through 90° is likely to be a disadvantage for many applications. The use of an extra set of write conductors to correct this (in zigzag form) is a possible though not a proven practical solution.
REFERENCES
1. H. D. Toombs and T. E. Hasty, Proc. IRE,
June 1962.
DEVELOPMENT OF AN E-CORE READ-ONLY MEMORY
P. S. Sidhu
Ampex Corporation
Culver City, California
and
B. Bussell
University of California, Los Angeles
INTRODUCTION

Electronic memories in data processing equipment can be divided into two categories. In the first category are easily alterable memories such as ferrite core matrices, ultrasonic delay lines, and thin films. These are used for temporary storage of information, for example, the program of instructions for a particular calculation, initial data, and intermediate results. In the second category are fixed or "read-only" memories used to store information that is seldom, if ever, changed. Writing the information into these memories is part of the manufacturing process, and in order to change it one has either to replace a part of the memory or to alter its construction in some way.

The main advantages of a read-only memory over an erasable read/write memory are:

Nondestructive read
No rewrite electronics and power
Faster cycle time
High reliability
Low cost
Nonvolatility

Many types of read-only memories have been devised. The information in matrix memories is stored by forming a matrix of word and data lines and placing a coupling element at each intersection whenever a "1" is stored. The main disadvantages are the low output signal and low signal-to-noise ratio in large matrices.

A memory using inductive coupling but not built in the form of a matrix is the transformer memory or Diamond Ring Translator.1 There is one address for each word stored and one transformer core for each binary digit in the output word. Each address wire threads through or bypasses a particular core, depending on whether the corresponding digit in that word is respectively "1" or "0." To read a given word a current pulse is passed through the corresponding address wire. This induces a pulse, representing 1, in the sense windings of the transformers through which it is threaded. No output appears from the transformers which are bypassed.

From the point of view of speed, storage medium, and cost of components, the Diamond Ring Memory is the most suitable of all the memories described by Taub.2 However, in large memories, because of coupling between the word lines, signal-to-noise ratio suffers and the sense amplifier design becomes elaborate.
The problem of sensing can be solved by the use
of an E-core in the place of a regular transformer
core, making the polarity of 1 opposite to the polarity of 0. Two E-cores put together form two
windows, as shown in Fig. 1. When a current is
passed through window 1, it induces flux in one direction through the central leg; and if a current is
passed through window 0, the flux is induced in the
opposite direction. Since the sense winding can
have many turns, the output signal may be large. Additionally, no change of state is involved as in ferrite cores; therefore lower drive energy is required and the output is not delayed.
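The opposite-polarity sensing described above can be modeled in a few lines. This is a toy illustration, not the paper's circuitry: each stored bit is represented by which window its word line threads, and the read result is the polarity of the induced sense pulse.

```python
# Toy model of E-core readout: threading the 1-window induces flux of
# one polarity in the central leg, and the 0-window the opposite polarity,
# so a read returns a bipolar pulse per bit.

def read_word(threading):
    """threading[i] is 1 if the word line passes through the 1-window
    of core pair i, else 0; return the sense-output polarity per bit."""
    return [+1 if bit == 1 else -1 for bit in threading]

print(read_word([1, 0, 1, 1]))  # [1, -1, 1, 1]
```

Because a stored 0 produces a pulse of opposite sign rather than no pulse, the sense amplifier distinguishes 1 from 0 by polarity instead of by amplitude.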
Figure 1. E-cores.

MEMORY DESIGN

A memory of 1,024 words of 24 bits each was designed and built. For purposes to be described below, information was stored in two sets of E-core pairs. In each set 256 wires were passed through 48 pairs of E-cores. Thus, each of the 512 wires stored 2 words of information. Two groups of 256 wires each were used in order to cut down the capacitive coupling between the word lines and to reduce the drive current requirement. The electronics were designed to read one word at a time out of the memory. The electronics consisted of:

• An address register comprising 10 flip-flops which held the 10-bit address representing the addressed line or word.
• An address decoder comprising 6 1-by-4 decoders, to decode the address and to select a particular line driver.
• Sixteen voltage drivers and 32 current drivers to drive a current through the selected word line.
• An information register consisting of 24 flip-flops representing one full word length of information for parallel transfer.
• Forty-eight sense amplifiers to amplify the signal output of the E-core when a word was read.
• Timing control pulses to generate the required signals for the memory operation.

Figure 2. Block diagram of memory.

The logical design of the memory (Fig. 2) is considerably simpler than a regular read/write type of memory because only the read cycle is performed and, secondly, the read is nondestructive, that is, regeneration of the information is not required. The
read cycle operation can be broken down into the
following operations:
1. On receiving the "Request Information"
command the address register is strobed to
transfer the new address information.
2. The address is decoded to select a word
line in which the read current will flow.
3. When the current is established in the
word line, the sense amplifiers are strobed
to sense the output signal of the E-core.
This signal sets the information register.
4. As soon as the Information Register is set,
the "Information Available" signal is sent
to the computer.
The memory timing is represented by the timing
diagram in Fig. 3.
Figure 3. Timing diagram (Request Information, Address, Address Strobe, Address Clear, Read Current, Sense Amp Strobe, Information Clear, Information Available).

MAGNETICS

The magnetics of the memory was constructed on a wooden board. Two rows of 48 E-cores were placed in line on the board as shown in Fig. 4. Two wooden panels were mounted along the cores to hold them in position. One-half inch distance between the cores was allowed to permit 256 wires to pass from the 1 window to the 0 window of the next core, or vice versa. Forty-gauge enameled wire was used to wire the word lines through the cores. The ends of the word line were terminated in a printed circuit board mounted on the ends of the wooden board. Each line passed through the 1 or 0 window of the cores depending on the information word stored. It is proposed that after wiring the word lines through the E-cores, the magnetics be removed from the wooden board and mounted on the end cards as shown in Fig. 5. The two cards can then be secured to each other to form a plug-in module.

The minimum size of the E-core is determined basically by the number of words to be stored. The sense winding was wound in a spiral and placed around the central leg of the E-core. Several tests were made to determine an optimum number of turns. The result (Table 1) shows that the optimum number of sense turns should be 8 for a drive of 50 milliamps in the word line.

Table 1. Sense winding output in volts for various numbers of turns, word drive currents (ma), and sense termination resistances (330 to 4,700 ohms).

The magnetics was diode-decoded as shown in Fig. 6. Thirty-two current switches were decoded in two groups of 16 each. Address bits 0 through 3 decoded 1 out of 16 voltage switches; bits 4 through 7 decoded 2 current switches out of 32 (1 in each group); bit 8 selected 1 current switch out of the 2 sets by gating bits 6 and 7. Bit 9 gated the sense amplifier strobe, selecting 1 group out of the 2 sense outputs.

DRIVE SYSTEM

The drive system was designed as a single block and consisted of input circuits, address flip-flops, decoders, and drive switches. The circuit diagrams are given in Figs. 7 and 8. The drive system operation is briefly described below.
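The address-bit allocation described under MAGNETICS can be sketched as a small decoder. The field layout below (bits 0-3 voltage switch, bits 4-7 current switch within a group, bit 8 core set, bit 9 sense group) follows the text; treating them as contiguous binary fields is an assumption for illustration.

```python
# Decode a 10-bit address for the 1,024-word E-core memory:
# 16 voltage switches x 16 current switches x 2 core sets x 2 sense
# groups = 1,024 distinct words.

def decode(address: int) -> dict:
    assert 0 <= address < 1024
    return {
        "voltage_switch": address & 0xF,         # bits 0-3: 1 of 16
        "current_switch": (address >> 4) & 0xF,  # bits 4-7: 1 of 16 per group
        "core_set": (address >> 8) & 0x1,        # bit 8: 1 of 2 sets
        "sense_group": (address >> 9) & 0x1,     # bit 9: 1 of 2 sense outputs
    }

print(decode(0b1001011010))
```

The four selections multiply out to 16 x 16 x 2 x 2 = 1,024, matching the stated capacity.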
Figure 4. Magnetics assembly.

Figure 5. Magnetics packaging.

Figure 6. Magnetics decoding.

Figure 7. Drive system.

Referring to Fig. 7, normally strobe 2 is negative, which keeps both sides of the address flip-flops in the false state and thus all the decoder transistors
are reverse-biased, or all eight outputs are false.
All the drive switches, positive and negative, are
off. Both of the transistors of the input circuit are
reverse-biased by the positive level of the address
strobe.
When a "Request Information" pulse is received at the control, the address strobe goes negative. One of the decode outputs goes positive depending on the address information. Referring to Fig. 8, 1 out of 16 drive switches is selected by the coincidence of 2 true 1-out-of-4 decoder outputs.

Approximately 40 nanoseconds after the address strobe, strobe 2 goes positive (Fig. 7), which sets the address flip-flop. Near the end of the cycle, strobe 2 goes negative and both the transistors of the address flip-flop are reverse-biased, which in
Figure 8. Drive switches.
turn reverse-biases all the decoder transistors and
the drive switches.
TIMING

The timing circuit controls the operation of the memory through the following timing signals:
1. Address strobe
2. Strobe 2
3. Sense strobe
4. Data clear
These signals were generated on receiving the "Request Information" command, as shown in Fig. 9, by tapping a delay line.
BIT SYSTEM

The bit system consists of the sense amplifier, Information Register and transmitter circuits. The sense amplifier circuit was designed to amplify positive-going signals and reject negative-going signals. The circuit was strobed because of signal kickback: a negative-going 0 signal was followed by a positive-going overshoot which could falsely set the flip-flop. The output of the sense amplifier was connected to the true side of the information flip-flop, and also transmitted to the computer through the transmitter circuit (Fig. 10).

Figure 9. Schematic timing.

Figure 10. Schematic - bit system.

Operation

The memory operated at a cycle time of 250 nanoseconds, with the information available at 150 nanoseconds. The decoder and drive switch were selected in 50 nanoseconds and the current was established through the word line in 100 nanoseconds. The transmission delay through the word line was 15 nanoseconds for a drive current of 50 milliamps. The drive current was set at 50 milliamps to get the maximum sense output voltage and the fastest recovery, 100 nanoseconds. E-cores of 4A material from Ferroxcube were used for this model.

The optimum termination of the sense winding was 2,700 ohms (see Table 1). The pattern of information or of addressing the memory had no effect on the recovery and output voltage of the sense winding. Figures 11 and 12 show the sense line output and cycle times of the memory with two addresses selected.

Figure 11. Sense line output. Horizontal scale = 40 nanoseconds/division; vertical scale = 1.0 volt/division.

Figure 12. "1" output with respect to "Request Information." Horizontal scale = 40 nanoseconds/division; vertical scale = 2.0 volts/division.

Multiple Selection
The memory is organized in a linear select mode. That is, a single line is selected in order to read information. This type of selection technique is costly because the number of lines wired is directly proportional to the number of words stored, and a high level of decoding is required.

To overcome the problems of cost and size limitation of the memory in a single-wire-per-word scheme, the multiple select (coincident) technique was investigated. In this technique two wires instead of a single wire are selected at a time to read one word of information (Fig. 13). The output signals are then combined as given in Table 2.
Figure 13. Multiple selection organization of E-core Memory.
Table 2.

Word Lines   Mode      Sense Line      Logic
Selected     (flux)    Output, volts   Output   Comments
A1 + B1      +1 -1         0             0      Positive flux cancels negative flux.
A1 + C1      +1  0        +1             1      Positive signal is 1.
A1 + A2      +1 +1        +2             1      Positive signal is 1.
B1 + C1      -1  0        -1             0      Negative signal is 0.
B1 + B2      -1 -1        -2             0      Negative signal is 0.
C1 + C2       0  0         0             0      No flux coupling into the E-core.
It is assumed that the current flowing through
one word line induces 1 volt in the sense line. The
logical representation of sense line output is arbitrarily taken as positive flux for 1, no or negative
flux for 0; it can be changed to suit the type of information to be stored. Six different combinations
help to reduce the number of lines required to store
the information.
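The combining rule of Table 2 can be stated compactly. In this sketch, line types A, B and C couple positive, negative and zero flux respectively, and the threshold models what the strobed sense amplifier effectively computes:

```python
# Flux contribution of each word-line type, as in Table 2:
# A couples positive flux, B negative flux, C no flux.
FLUX = {"A": +1, "B": -1, "C": 0}

def sense_output(line1: str, line2: str):
    """Return (sense volts, logic bit) for two selected lines,
    assuming each line induces 1 volt per unit of flux."""
    volts = FLUX[line1] + FLUX[line2]
    logic = 1 if volts > 0 else 0   # the amplifier rejects negative signals
    return volts, logic
```

For example, selecting A1 + A2 gives (+2, 1), while A1 + B1 cancels to (0, 0), matching the table rows.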
The information stored at line 1 or line 2 does not represent the word of information; it is the logical combination of these two lines which gives the word of information. In order to store N words of information, N combinations need to be stored. As a minimum, only 2 x √N different words (lines) have to be wired in order to store N words. This technique of multiple selection can be extended from double to triple and quadruple selection, which further reduces the number of different states. In triple selection, only 3 x ∛N and in quadruple selection 4 x ∜N different states are needed for N words of storage. Table 3 shows a comparative organization of the memory under various selection techniques.
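The line counts above (for example, 16 + 16 = 32 lines for 256 words under double selection) follow from taking k-th roots; a quick check:

```python
def lines_needed(n_words: int, k: int) -> int:
    """Minimum word lines for n_words under k-fold selection:
    k groups of n_words**(1/k) lines each (k = 1 is linear select).
    Assumes n_words is a perfect k-th power, as in the paper's example."""
    root = round(n_words ** (1 / k))
    assert root ** k == n_words, "n_words must be a perfect k-th power"
    return k * root

print(lines_needed(256, 1))  # linear select: 256 lines
print(lines_needed(256, 2))  # double select: 2 * 16 = 32 lines
```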
The cost of a diode is taken as 1 unit and the
remainder of the entries are calculated based on
this unit. Fixed cost is estimated at 1,500 units
(without power supplies) which includes receiver,
flip-flop, timing and transmitter circuits, assembly,
packaging and testing. A 1 to 0 ratio is considered
(1) high when the sense output polarities are opposite, (2) good when the ratio is 5 to 1 or greater,
and (3) poor when the ratio is less than 2 to 1.
Table 3 shows that series parallel gives the optimum configuration for straight linear select mode.

Table 3. Comparison of Cost and Speed of E-Core Memory for Different Selection Techniques.

Selection             Word    Drive Switch  Cores and      Sense      Additional   Access    Cycle     Signal-to-   Comments
Technique             Lines   and Decoder   Sense Winding  Amplifier  Cost, Units  Time, ns  Time, ns  Noise Ratio
1.(a) Linear select   2048    470           144            48         2710         150       250       High         Costly
  (b) Series parallel 1024    192           288            375        1879         150       250       High         Optimum
  (c) 4-Series        576     192           576            280        1624         200       300       High         Long drive line
2. Double select      128     48            144            150        470          200       300       Good
3. Triple select      64      48            288            150        550          200       300       Poor
The double select technique is the most attractive from a cost point of view. Out of 1033 logarithmic and exponential constants (3), 256 words were selected to give 16 + 16 = 32 different states for storage by double selection. The model operated at a cycle time of 250 nanoseconds and an access time of 150 nanoseconds. The sense line output is shown in Fig. 14. An effort was made to program the information in an attempt to yield 2√N states; however, this optimum lower limit was never achieved. The addressing logic for multiple selection becomes complex and the memory cost increases beyond that of a linear select. This approach was therefore abandoned. Further work in programming the information is suggested.
Figure 14. Sense line output for a double select technique. Horizontal scale = nanoseconds/division; vertical scale = 2 volts/division.
AUTOMATIC WIRING
A computer program can generate the word line pattern for storage. The lines are printed on a double-sided strip. The strip is punched in the middle to pass the central leg of the E-core. The crossover from one bit position to the next is accomplished by passing 1-to-0 word lines on one layer and 0-to-1 lines on the other. The number of words on each strip is determined by the number of lines which can be printed in one half of the strip. The computer program guides a photopointer to produce a photoprint of these lines. After fabrication, these strips, each of which may contain up to 64 words, are placed in the E-cores.
INTEGRATED CIRCUITRY
Integrated circuits can be used for almost all the
circuits in this memory. The flip-flops are two
cross-coupled DTL gates. The decoders and drive
switches can be combined into one circuit like the
Fairchild 932 or 946 gate. For sense amplifier and
receiver circuit, Motorola MIC logic can be used.
The use of integrated circuits reduces the component count, thus cutting the cost considerably, and makes the memory more reliable.
The operation of the memory demonstrated the
following:
1. The speed of the memory depends upon:
   (a) the recovery of the sense line;
   (b) the time required to decode the address and to select a voltage and a current switch.
2. The access time depends upon the time required to decode the address and to select a voltage and a current switch, together with the time required to amplify the signal to set the data flip-flop and the transmitter.
3. For the E-core used, 50 milliamps of word-drive current gives the optimum signal of +1.2 volts. The number of sense turns for this output is eight.
4. Because of the high output voltage, the sense amplifier needs only one stage of amplification to set the data flip-flop.
The model clearly demonstrated that it is possible to use the E-core as a storage element. The speed obtained with the prototype was 250 nanoseconds. The recovery of the sense line depends upon the material of the E-core. Ferrite material with response up to 100 megacycles per second is available, as compared to the 10-megacycles-per-second material used in this model E-core. The faster the material, the faster the core recovery; hence the present recovery time of 150 nanoseconds can be substantially reduced simply by changing the core material.

For computer storage, a cycle time of 250 nanoseconds and an access time of 150 nanoseconds are considered very high speed compared with the present state of the art. The 0 signal is of opposite polarity to the 1, thus any problem of discrimination between 1 and 0 signals is avoided. The E-core wiring can be automated very easily, since the wires can be prebent or printed according to the information and then just dropped into the open windows of the E-core. Also, the number of E-cores used in the memory is small compared to any other type of memory. The reduced number of cores and the automatic wiring profitably reduce the cost of the magnetics of the memory.

The automatic wiring eliminates wiring errors to which the human hand is susceptible; the clear discrimination between 1 and 0 signals also contributes to highly reliable operation.
To summarize, an E-core memory is feasible and seems to possess the following unique advantages:
1. High speed: 250 nanoseconds.
2. High reliability:
   (a) opposite polarity of 1 and 0 signals;
   (b) automatic word wiring.
3. Low cost: at least one half that of any other memory.
4. No limitations on the size of the memory.
REFERENCES

1. T. L. Diamond, "No. 5 Crossbar AMA Translator," Bell Laboratories Record, vol. 29, pp. 62-71 (1951).

2. D. M. Taub, "A Short Review of Read-Only Memories," Journal IEE (England), vol. 19, pp. 29-31 (Jan. 1963).

3. D. Cantor, G. Estrin and R. Turn, "Logarithmic and Exponential Function Evaluation in a Variable Structure Digital Computer," IRE Transactions on Electronic Computers, vol. EC-11, pp. 155-164 (Apr. 1962).
MAGIC-A MACHINE FOR AUTOMATIC GRAPHICS INTERFACE TO A COMPUTER
D. E. Rippy and D. E. Humphries
National Bureau of Standards
Washington, D. C.
INTRODUCTION

The Computer Technology Section of the National Bureau of Standards is currently engaged in an extensive program to develop advanced techniques for improving user communication with large ADP systems. This program is an outgrowth of a number of projects in which the Section has been called upon to assist other agencies in the solution of a variety of data processing applications.

These projects have provided contact with many kinds of data processing problems which generally involve such functions as command and control, design and mapping, updating and utilization of active files, editing, and information retrieval. Development of data processing hardware for use as tools to aid the investigation of such functions is usually required.

This report describes a machine which has been developed within the Computer Technology Section as a research tool for the investigation of man-machine communication techniques involving CRT displays. This machine has been designated MAGIC (Machine for Automatic Graphics Interface to a Computer). It combines large-diameter cathode-ray displays with a specially designed programmable digital computer. It is designed as a remote display station and is intended to be connected to a large ADP system via voice-quality communication lines. Extensive design effort has been devoted to removing from the ADP system the time-consuming and repetitive tasks of display regeneration and manipulation, and to minimizing the limitations introduced by the communication lines. Particular emphasis has been placed on establishing the proper balance between hardware and software functions.

MAGIC was originated in August 1964 and completed in February 1965. It is currently being used to conduct experiments and perform demonstrations in order to better define the optimum characteristics for equipment of this type.
SYSTEM DESCRIPTION
Design specifications of MAGIC evolved from a
number of basic considerations such as memory
type and organization, display type and control,
word formats for the machine instruction and display data, operator controls, and system economics.
Underlying the over-all display data organization
within MAGIC and the digital manipulation of these
data is the realization that most CRT displays function in a point-to-point manner, implying that
the digital data required to drive the display device
must exist in a serial list form with respect to time.
Therefore, it would seem logical to generate, manipulate and store display data with a processor
having list processing abilities. Also, as will be
shown in detail later, this list processor organization results in significant savings in the software
required for display operation. This concept has
provided the basis for the hardware design of the
MAGIC system.
System Hardware Organization

There are two major hardware sections of the MAGIC system, as seen in Fig. 1: the display unit and the processor unit. The display unit consists of a primary and a secondary CRT display.
Drum Memory
The magnetic drum memory within MAGIC is
similarly subdivided. The control processor communicates with the portion of memory termed general memory. The portion of memory designated as
display memory is used by the subordinate list processors (including W) and the primary display. The
secondary display, however, receives its display
data from three channels of the general memory.
A rotation speed of 1800 rpm was chosen for the
drum since it provides a display refresh rate of 30
frames/sec, which is generally recognized as the
lower limit for flicker-free presentation of stationary information. When the limitations of operating speed of the display consoles and the available
logic hardware were taken into account, it was
found that 128 12-bit words per drum memory
channel could be accommodated; this results in a
system clock rate of slightly less than 54 kHz. General memory capacity is 90 channels.
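The quoted drum figures can be cross-checked; in the sketch below, the 14 bit-times per word slot is an assumption chosen to reconcile the 128-word channels with the stated clock rate (12 data bits alone would give about 46 kHz), and is not stated in the paper:

```python
RPM = 1800
rev_per_sec = RPM / 60           # 30 revolutions per second
refresh_rate = rev_per_sec       # one display frame per revolution: 30 frames/sec

WORDS_PER_CHANNEL = 128
BIT_TIMES_PER_WORD = 14          # assumption: 12 data bits plus ~2 gap/timing bit-times
clock_hz = rev_per_sec * WORDS_PER_CHANNEL * BIT_TIMES_PER_WORD

print(refresh_rate)  # 30.0 frames/sec
print(clock_hz)      # 53760.0 Hz -> "slightly less than 54 kHz"
```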
The operator uses the primary display and its associated controls to perform the majority of his communications with MAGIC. The secondary display is used as a passive display device only.

The processor unit is subdivided into one control processor and four identical subordinate list processors designated W, X, Y and Z. Subordinate list processors X, Y and Z operate directly on the portion of memory used by the primary display. Subordinate list processor W is considered part of the control processor and allows the control processor to perform list manipulations without disturbing the primary display.

Figure 1. Machine for automatic graphics interface to a computer.

Word Formats

The processor unit functions as a serial, single-address processor. The single operand address and the various operation codes are organized in a double-length instruction word format, as shown in Fig. 2: the channel address spans (000-700) octal, the operation code (00-77) octal, and the sector address (000-177) octal. Bits designated W, X, Y and Z in the instruction format address the respective subordinate list processors. Note that any combination of W, X, Y or Z may be specified simultaneously in any one instruction, allowing additional flexibility in manipulating display data.

Figure 2. Instruction word format (double length).
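A minimal model of the instruction format, using the octal field ranges from Fig. 2; representing the instruction as a validated record rather than packed bits is a deliberate simplification, since the bit-level layout of the two 12-bit words is not spelled out here:

```python
def make_instruction(op, channel, sector, processors=()):
    """Validate and assemble a MAGIC instruction as a record.

    Octal field ranges follow Fig. 2; the W/X/Y/Z entries select which
    subordinate list processors the instruction applies to (any
    combination may be named at once).
    """
    assert 0 <= op <= 0o77, "operation code is (00-77) octal"
    assert 0 <= channel <= 0o700, "channel address is (000-700) octal"
    assert 0 <= sector <= 0o177, "sector address is (000-177) octal"
    assert set(processors) <= {"W", "X", "Y", "Z"}
    return {"op": op, "channel": channel, "sector": sector,
            "processors": frozenset(processors)}

# e.g. an instruction addressing all three display-list processors:
instr = make_instruction(op=0o12, channel=0o300, sector=0o056,
                         processors=("X", "Y", "Z"))
```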
Display data within MAGIC may be considered to consist of three fields: the X coordinate field, the Y coordinate field and the Z field. These fields are the X, Y and Z lists in display memory and are operated on by subordinate list processors X, Y and Z respectively. The Z field in the list specifies the display characteristics for the associated X and Y coordinate words. The X and Y coordinate data consist of 10 bits centered in the 12-bit standard data word. This allows an increase or decrease in display size by a factor of two without loss of coordinate data bits by utilizing shifting techniques.

Graphic presentation on both displays represents the first quadrant of the cartesian coordinate X-Y plane. The 10-bit coordinate fields provide a resolution of 1024 x 1024 discrete positions. Since all numbers within the first quadrant are positive, all numerical data in MAGIC are assumed positive and are therefore unsigned. Also, since MAGIC is display-oriented and therefore not primarily intended for general purpose processing, arithmetic operations in MAGIC are of the simplest form, consisting primarily of straightforward binary addition. Further, there is no parity bit associated with MAGIC data.
The Z word format is shown in Fig. 3. The desired Z parameters may be processor-generated or may be specified manually via the Z switch register on the operator's control panel.

Figure 3. Z word format. (Fields: alphanumeric character code from keyboard; normal/alphanumeric character; character gain: high/low; intensity: off/low/med./high; plot mode: point/line (short, med., long); the latter parameters come from the Z controls on the display control panel.)

The instruction repertoire of MAGIC may be divided into two categories: list instructions and non-list instructions, as tabulated in Fig. 4. List instructions are considered as those affecting data in the four display memory channels. They may be performed on a single sector within display memory or on a list of specified length in display memory. Non-list instructions pertain primarily to the control processor. Since the execution of the instructions categorized in Fig. 4 is intimately tied to the hardware which performs them, no attempt will be made to discuss them as a separate subject. Rather, it would be better to discuss them as they individually become pertinent to the detailed discussion of the hardware of MAGIC which follows.

Figure 4. Control processor instructions.
List: 1. Block transfer; 2. Add; 3. Insert; 4. Delete; 5. Scan (masked); 6. Shift.
Non-list: 1. Jump; 2. Register fill; 3. Register empty; 4. Control functions: (a) interrupt; (b) display data transfer; (c) control processor register manipulations; (d) halt/breakpoint/sense light set.

Figure 5. MAGIC block diagram.

CONTROL PROCESSOR

As stated earlier, the processing unit is divided physically and logically into a control processor and four subordinate list processors (W, X, Y and Z), each communicating with its assigned portion of the drum memory. Figure 5 is a block diagram of the control processor. Also shown are blocks representing both displays and the four subordinate list processors. The control processor contains all registers necessary for executing programs within the control processor and for controlling the subordinate list processors. These registers are the instruction register, instruction address counter, memory
address register, alternate address register, mask
register, interrupt address register, loop counter,
and I/O register. All registers listed above, with the exception of the instruction register, instruction address counter, and memory address register, may communicate both ways with general memory as a result of the execution of a register fill or a register empty instruction. An additional exception to the above is the loop counter, which may be filled only.
All instructions are decoded for execution from
the instruction register contents. The instruction
fetch phase of control processor operation loads this
register. The contents of the instruction address
counter determine the memory address for obtaining the next instruction from memory. Normally,
this register advances by two following an instruction execution. However, the contents of this register may be forced to non-sequential addresses by
jump or interrupt operations.
All memory addressing is performed via the memory address register. The channel address selects the proper general memory read/write head, and the sector address is compared with the memory word counter (which is not shown in Fig. 5) to provide sector selection and timing for general memory or display memory, whichever is specified by the instruction.
The alternate address register is provided for
alternate addressing. When so specified in the instruction, the sector contents of the instruction address counter may be replaced by the contents of
the alternate address register before execution of
the instruction. The alternate address register may
be loaded via a register fill instruction, a scan operation or as a consequence of addressing a displayed
object with the light pen. The alternate address register may also be incremented by the execution of
an instruction for this purpose. Also, if the contents of this register correspond to sector 127 (octal), the last sector of any drum channel, the execution of an alternate address register sense-and-jump instruction will cause the program to jump to the address specified in the instruction. If the contents are not 127 (octal), no action is taken.
Masking data for scan operations are supplied by the contents of the mask register, which is shared with all subordinate list processors for masking purposes. Associated with the mask register is an instruction which allows the mask register contents to be shifted right by one bit each time the instruction is executed. This was implemented as a programming aid for code conversion subroutines.
The interrupt address register acts as an intermediate buffer for the interrupt sector address during the execution of an interrupt instruction. The interrupt mode of operation of MAGIC is further described in the section headed System Operation.

Four general-purpose sense flip-flops (designated W, X, Y and Z) are available as programming aids. Control of these flip-flops is provided by a set instruction and a sense-and-jump instruction. If such a jump is performed, the involved sense flip-flops are reset.
Another programming aid is provided by the
loop counter. When filled with the desired constant,
it permits reiterative program loops to be performed without additional software tallying which
would normally be part of the loop. Execution of
the sense and jump instruction for the loop counter
interrogates the loop counter for the following conditions: if the loop counter is not zero, it is decremented and the program jumps to the operand
address specified in the instruction (beginning of
the loop). If the loop counter is zero, no action
results and the instructions are taken in normal
sequence, allowing the program to break out of the
loop.
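The loop-counter test just described can be summarized in a few lines; the program-counter arithmetic here is a simplification (MAGIC instructions are double-length, hence the advance by two):

```python
def loop_counter_jump(counter: int, pc: int, loop_start: int):
    """Semantics of the loop-counter sense-and-jump described above:
    a nonzero counter is decremented and the program jumps to the
    loop start; a zero counter falls through to the next instruction.
    (Sequential pc + 2 stepping is a simplification of drum addressing.)"""
    if counter != 0:
        return counter - 1, loop_start      # repeat the loop body
    return counter, pc + 2                  # break out: take next instruction

counter, pc = 3, 0o120
while True:
    counter, pc = loop_counter_jump(counter, pc, loop_start=0o100)
    if pc != 0o100:
        break   # jump not taken: the program has left the loop
```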
Local input/output may be performed via the single-length I/O register. This register is additionally linked to a special general memory I/O channel with dual read/write capabilities. With this register and its associated dual controls and sector addressing, it is possible to communicate between the I/O channel and a peripheral device without interrupting operation of the control processor. This I/O channel is also utilized by the hardware used to interface MAGIC to the central ADP system. This will be detailed later in the paper.
SUBORDINATE LIST PROCESSORS
Digital data presented to the primary display are
manipulated by the three subordinate list processors
designated X, Y and Z. These processors (as well
as processor W) are identical in design. Figure 6
shows the logical arrangement of one of them. The
read head and write head shown in this figure are
separated by exactly one word (or sector) as related
to drum circumference, allowing the A register and
the associated drum channel to form a nonprecessing recirculating loop. Since it is not possible to place heads this close together on the same channel (one sector occupies approximately 1/8 inch of drum circumference), a pair of channels is used, with the write head on one and the read head on the other. A restoring pair of read/write heads is placed a convenient integral number of sectors around the circumference of the drum from the pair associated with the A register to close the loop. Consequently, the memory "channel" shown in Fig. 6 represents the two actual memory channels as described. However, this arrangement will continue to be considered as a single memory channel, since this is its logical function.

Figure 6. W, X, Y or Z subordinate list processor. (Write control positions: A, delete; B, shift right; C, shift left; D, normal write source; E, add; F, insert; G, block transfer.)
Normally, when the subordinate list processor is
inactive, data flow from the memory channel, via
the read head, through the A register and the write
control switch (which, when inactive, is in position
D), and back onto the memory channel via the
write head. During the time period for any specified sector, the data contents of this sector are
being written from the A register onto the drum.
To perform list manipulations on these data, the control processor switches the write control to the
proper switch position for the appropriate length of
time. At the end of the specified list manipulation,
the write control is returned to position D. This
concept of a recirculating loop with write control makes it possible to perform true list manipulations within the subordinate list processors. Reiterating, these list manipulations are: insert, delete, block transfer, shift (right or left), add and scan. The following describes these manipulations in detail.
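The net effect of the insert and delete manipulations on a channel can be modeled as operations on a fixed-length list (a behavioral sketch only; the hardware achieves this with the one-word register delays described below):

```python
def insert_word(channel, sector, word):
    """Insert `word` at `sector`; later contents move down one sector.
    The last word falls off the end of the fixed-length channel."""
    return channel[:sector] + [word] + channel[sector:-1]

def delete_word(channel, sector, fill=0):
    """Delete the word at `sector`; later contents move up one sector.
    Zero-filling the vacated last sector is a modeling assumption."""
    return channel[:sector] + channel[sector + 1:] + [fill]

ch = [0o0123, 0o4567, 0o7654, 0o3210]
print(insert_word(ch, 2, 0o7777))   # [0o0123, 0o4567, 0o7777, 0o7654]
print(delete_word(ch, 1))           # [0o0123, 0o7654, 0o3210, 0]
```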
Data are inserted into a display channel via the
associated B register as shown in Fig. 6. The B register may be filled by the control processor with
data from general memory or from the locator coordinate buffers and the Z parameter switches on the
operator's control panel. The word location or sector into which an insertion is to take place may be
defined by the zero detector, or the sector may be
defined in the instruction. Thus, the following types
of insertion may be accomplished: (a) at the beginning of a list, (b) within a list, and (c) at the
end of a list if the last word of the list is followed
by a word containing all zeros.
At the time coincident with the beginning of the desired sector, the write control is changed to position F, where it remains until the end of the channel. This places the B register in the recirculating memory loop, creating a total delay of two words.
Consequently, the contents of the B register are
written into the desired location in the list and the
previous contents of that sector and all following
sector contents are automatically moved down one
sector to accommodate the inserted word. This represents a true insertion within a list of data, with data
properly relocated by the completion of the operation.
The following table illustrates the insert operation. Here, it is assumed that the data word 7777 (octal) is to be inserted into sector 056 of a display memory channel.

Table 1. Illustration of an Insert Operation.

Data List Before Insertion    Data List After Insertion
Sector    Contents            Sector    Contents
054       0123                054       0123
055       4567                055       4567
056       7654                056       7777
057       3210                057       7654
060       0123                060       3210
Deletion is accomplished by switching the
write control to position A at the beginning of the
desired sector where it remains until the end of the
channel. The sector is defined in the same manner
as in insertion operations. This operation removes
the A register and the one word delay it represents
from the recirculating loop. Consequently, the contents of the specified sector are bypassed and the
data in the remainder of the channel are moved up
one sector position to fill the gap. As in the insert operation, this represents a true list delete. The types of deletion possible are the same as for insertion.
Table 2 illustrates the delete operation. Here it is assumed that the contents of sector 010 (octal) are to be deleted.

Table 2. Illustration of a Delete Operation.

Data List Before Delete       Data List After Delete
Sector    Contents            Sector    Contents
007       1111                007       1111
010       2222                010       3333
011       3333                011       4444
012       4444                012       5555
Memory to memory "block" transfers of data between display memory and the control processor
general memory can occur in both directions. Block
transferring from the display memory to general
memory requires no action by the subordinate list
processor and is under direct control of the control
processor via the system's memory bus (heavy line
shown in Fig. 5). Block transferring from general
memory through the memory bus to display memory requires the write control (Fig. 6) to be
switched to position G during the performance of
this operation. A block transfer may be specified for a particular sector or from a sector to the end of the channel. This block transfer capability is generally used for storage in, or retrieval from, general memory of pictures or sub-pictures for presentation on the primary display.
Increasing or decreasing the size of the display
by factors of two may be accomplished by shifting
the data in the display channels left or right respectively. By switching the write control to position C,
data from the A register are written one bit late, effectively shifting left or increasing the binary
weight of each data bit by a factor of two. By
switching the write control to position B, data bits
are written one bit early, effectively shifting right.
In either of the above cases, there is no overflow from word to word, and each execution of the operation constitutes a shift of one bit position only. Shift operations may be performed on a single specified sector or from a specified sector to the end of the channel.
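The left/right shifts described above act as a doubling or halving of every coordinate; a behavioral sketch (the real hardware shifts serially as the drum rotates, and truncation to the 12-bit word width is an assumption consistent with the stated word size):

```python
def scale_display(coords, direction):
    """Shift every coordinate word one bit left (double the display
    size) or right (halve it), with no overflow from word to word:
    each result is truncated to the 12-bit register width."""
    if direction == "left":
        return [(c << 1) & 0o7777 for c in coords]   # 12-bit words
    return [c >> 1 for c in coords]

xs = [0o0100, 0o0200, 0o1000]
print(scale_display(xs, "left"))    # doubled coordinates
print(scale_display(xs, "right"))   # halved coordinates
```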
An add operation within the subordinate list processor utilizes the serial full adder as shown in Fig. 6. During the execution of an add operation, the contents of the control processor addressable C register are serially added to the A register. Also during
the add operation, the write control is switched to
position E and the adder sum is written on the
drum as it is generated. If an overflow condition
exists after any single word has been added, the
overflow detector is set. A subsequent sense and
jump instruction may be used to test for an overflow condition. The overflow detector is reset only
at the beginning of any add operation. As implied
in Fig. 6, variations in the add instruction permit
a modulo-2 add or complementing of the C register
before addition. Addition may be performed on a
single sector or from a specified sector to the end of
the channel.
Addition of a constant to X or Y coordinate data
has the effect of moving the picture presented on
the primary display; e.g., adding a constant to the
X coordinate data list moves the picture to the
right or subtracting the constant moves the picture
left. Repetitive addition creates continuous movement of the picture. The speed of movement is determined by the size of the constant.
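The add operation and its use for movement can be sketched as below, again in Python as executable pseudocode. The 10-bit word, the two's-complement treatment of the complement-before-add variation, and the name `add_constant` are all assumptions made for illustration; the text confirms only the overflow-detector behaviour and the complement option.

```python
WORD_BITS = 10                      # assumed word length, for illustration only

def add_constant(words, c, start=0, end=None, complement=False):
    """Add the C-register constant to every word in a sector range.

    The overflow detector is reset at the start of the add and set if
    any single word overflows; a subsequent sense-and-jump could test
    it.  complement=True models the complement-before-add variation,
    used, e.g., to move a picture left or down.
    """
    if end is None:
        end = len(words) - 1
    mask = (1 << WORD_BITS) - 1
    if complement:
        c = (-c) & mask             # two's-complement of C before addition
    overflow = False                # detector reset at start of the add
    for i in range(start, end + 1):
        total = words[i] + c
        if total > mask:
            overflow = True
        words[i] = total & mask
    return words, overflow
```

Calling this repeatedly with a small constant on the X coordinate list would step the picture across the screen, the size of the constant setting the apparent speed.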
Scan operations perform table look-ups on data in
display memory. During the scan operation the contents of the A register are serially compared with the
contents of the C register by the comparator shown
in Fig. 6. This operation requires no switching of
the electronic write control. Scanned data are always
masked by the contents of the mask register. The
data in a specified display memory may be scanned
for A = C, A > C, A < C, or A ≠ C. A scan operation is initiated at the sector specified by the instruction and continues until terminated by one of the following methods. If the scan operation is successful,
the comparator output terminates the operation at the
end of the sector for which the scan was successful.
At this time the contents of this sector are placed in
the B register and the sector address is placed in the
alternate address register. If the scan operation is
not successful, the operation is terminated at the end
of the channel and the "no hit" detector is set. The
no hit detector may be tested by a sense and jump
instruction which will cause a jump to occur if the
no hit detector is set. The no hit detector is not reset
at this time but is reset at the beginning of any scan
operation.
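A behavioural sketch of the scan operation follows; the dictionary return value standing in for the B register, alternate address register and no-hit detector is an illustrative convenience. Whether the C register is also masked is not stated in the text, so here only the scanned data are masked, exactly as described.

```python
import operator

RELATIONS = {"==": operator.eq, ">": operator.gt,
             "<": operator.lt, "!=": operator.ne}

def scan(channel, c, mask, relation, start=0):
    """Table look-up over a display channel.

    Each scanned word is masked and compared with the C register; a
    hit ends the scan, yielding the word (B register) and its sector
    address (alternate address register).  Running off the end of the
    channel sets the no-hit detector instead, which a subsequent
    sense-and-jump instruction could test.
    """
    compare = RELATIONS[relation]
    for sector in range(start, len(channel)):
        if compare(channel[sector] & mask, c):
            return {"b": channel[sector], "address": sector, "no_hit": False}
    return {"b": None, "address": None, "no_hit": True}
```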
The salient feature of the list manipulations just
discussed is that each requires a single control processor instruction that executes in one drum revolution or less. Since, as previously described, one bit in the control processor instruction format is reserved uniquely for each of the four subordinate list processors, a specified list manipulation may be performed simultaneously in any desired combination of these processors.
PRIMARY DISPLAY
Dual magnetic deflection is employed by both
display consoles. Object position on the CRT
screen is provided by the main deflection system.
The secondary deflection system provides the relatively fast, limited deflection characteristics required for alpha-numeric character and locator
(primary display only) presentation. Figure 7 depicts the primary display and its associated control
logic.
Figure 7. Primary display.
Digital display data are parallel transferred from
the A registers of subordinate list processors X, Y
and Z to the primary display buffers (X, Y and Z).
The X and Y coordinate buffers drive digital to analog converters to provide object position. The contents of the Z buffer (see Fig. 3) are decoded to provide control information for use as indicated in
Fig. 7.
Included in the main deflection system is a pair
of preamplifiers which act as gated integrators for
vector generation and which are controlled by decoded plot mode information from the Z buffer.
When in vector mode (solid or dashed vector),
these preamplifiers act as integrators. When in any
other mode, they act as unity gain amplifiers.
As recalled from Fig. 3, six bits of the Z word
are used by the alpha-numeric character generator.
This character generator is commercially available
and utilizes a formatted stroke method of character
generation. To obtain two character sizes, the character gain bit of the Z buffer controls the gain of a
pair of preamplifiers in the secondary deflection
system.
The digital intensity modulator receives information from the locator generator, character generator
and Z buffer for generating the necessary control,
timing, and intensity level information required by
the various objects being displayed.
SYSTEM OPERATION
Manual Controls
The physical appearance of the MAGIC system
is shown in Fig. 8. The primary display is on the
right and the secondary display is on the left. To
the upper left of the primary CRT is the light pen.
Light Pen and Locator
The light pen in the MAGIC system is used in
conjunction with an electronically generated crosshair locator pattern for manually generating display
data. The locator may be turned on or off by a
manual switch on the control panel and appears on
the primary display. The locator control circuit allows the locator to be generated once each revolution of the drum during the time period allocated to
the last sector (sector 1278) of memory channel
timing. Consequently, this last sector cannot be
used for display data storage. Generation of each
segment of the locator is synchronized with the locator coordinate up/down counters in a manner
which allows the locator to servo to the light pen.
The speed at which the locator servos to the light
pen is proportional to the distance of the light pen
from the center of the locator. The locator position
is updated by transferring the contents of the locator up-down counters to the main deflection X
and Y buffers immediately after each locator generation.
Once the locator has been positioned and the object at that position has been described by the Z
parameter switch register on the control panel, it is
necessary for the data to be transferred to the B
registers of the X, Y and Z display processors for
subsequent insertion. This may be accomplished
manually by the depression of any keyboard key or
by the execution of the display data transfer instruction by the control processor.
Another use of the light pen is for the identification of data being displayed on the primary CRT.
Depression of a manual switch permits the sector
address synchronous with the light pen output pulse
to be transferred to the alternate address register.
Subsequent processing using alternate addressing
techniques permits the desired manipulation to be
performed on any object pointed to with the light
pen.
Figure 8. MAGIC - a machine for automatic graphics interface to a computer.
Panels from top to bottom of the center rack are:
(1) control processor monitoring indicators, (2)
control processor manual controls, and (3) display
control panel containing the Z switch register, locator and light pen controls, and keyboard interrupt
control. The keyboard, used primarily for alphanumeric data entry, is below this panel. The vertical
rows of buttons on the right side of the display control panel and the two vertical rows of illuminated
pushbuttons at top center of this panel are interrupt
pushbuttons. Additional interrupt pushbuttons are
on the small chassis shown at the lower right of the
primary CRT.
Interrupt Feature
Operation of MAGIC generally consists of using
the light pen and the interrupt pushbuttons to either
generate display data or manipulate data already
being displayed. Also, for most purposes, MAGIC
executes subroutines on an interrupt basis. This interrupt feature functions as follows. The receipt of
an interrupt instruction by the instruction register
causes the control processor to "hang up" in the instruction execution phase. This condition remains until one of the 63 interrupt pushbuttons available to the user is depressed, at which time the sector address assigned to this pushbutton is transferred to the interrupt address register. This
sector address is then transferred to the instruction
address counter and the operation is simultaneously
terminated. The next instruction is taken from this
new sector address and normally consists of an unconditional jump instruction which is one of a list
of links to display manipulation subroutines, each link corresponding to an interrupt button. Thus the
interrupt pushbuttons are programmable and their
meaning may be changed by modifying the list of
unconditional jump instructions associated with the
interrupt jump instruction.
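The programmable-pushbutton linkage amounts to an indirect jump through a table of unconditional jumps, one per button. A minimal sketch, with the sector number, handler bodies and function names all invented for illustration:

```python
# Each interrupt pushbutton owns a fixed sector whose unconditional
# jump instruction links to some subroutine; rewriting that jump
# changes what the button means.  Sector 64 is an invented example.
jump_table = {}

def bind_button(button_sector, subroutine):
    """Model rewriting the unconditional jump stored at a button's sector."""
    jump_table[button_sector] = subroutine

def interrupt_instruction(pressed_sector):
    """The control processor 'hangs up' until a button supplies its
    sector address, then execution resumes at that sector's jump."""
    return jump_table[pressed_sector]()

bind_button(64, lambda: "delete")
```

Switching to a different annotated overlay would then simply correspond to a different set of `bind_button` calls.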
It follows that any particular interrupt pushbutton may mean different display manipulations for
different users. Therefore, annotated overlays are
used for describing the functions of the pushbuttons. About half of the interrupt buttons are designed to operate continuously, i.e., the subroutine
accessed by one of these interrupt buttons is repeated
as long as the button is depressed.
Subroutine Examples
The following examples of simple display manipulations are accessible by interrupt as previously
described. Each subroutine, therefore, is linked
back to the interrupt instruction by an unconditional jump instruction. Also, the following examples
are representative only; a complete discussion of
MAGIC software would be beyond the scope of this
paper. However, detailed programming and operating procedures for MAGIC are described in National
Aeronautics and Space Administration document
number N65-25010: "MAGIC - A Machine for Automatic Graphics Interface to a Computer" (reference manual) by D. E. Rippy (29 Dec. 1964).
A typical insert subroutine generally consists of
two machine instructions: an insert instruction, and
a jump instruction for return to the interrupt. An
insert may be performed at the beginning of a display list by specifying the beginning sector of that
list in the insert instruction.
Insertion within the display list utilizes alternate addressing and is generally executed in the following manner. The user points with the light pen at
the displayed feature (line, point, etc.) within the
picture where it is desired that new graphic data be
inserted. Depression of the light pen address switch
loads the alternate address register with the sector
address coincident with that feature. The Z switch
register contents (assumed previously set for the
desired parameters), keyboard buffer contents, and
the locator coordinates (from the locator up-down
counters shown in Fig. 7), are then transferred to
the Bx, By and Bz registers. This is accomplished
either by depressing any keyboard key or by the
execution of a display data transfer instruction which
would be included in the subroutine. The choice is,
of course, dictated by user preference.
The interrupt button assigned to this subroutine
is then depressed. In this case, the insert instruction
will include the alternate addressing code, allowing
the alternate address register contents to govern the
sector location within the display list where insertion is to take place.
The A register zero detector (Fig. 6) is utilized
for insertion at the end of a display list. The end of
a display list is defined as the first sector of a display memory channel in which no data exist. In
this case, the execution of the insert instruction
monitors the zero detector as the display list flows
through the A register. If the display list in question is followed by at least one sector containing no
data, a signal from the zero detector initiates the
insert process at that sector.
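The zero-detector rule for end-of-list insertion can be sketched directly from the definition above: the end of a list is the first sector containing no data. This is a behavioural sketch only; the function name and the zero-word-means-empty convention are assumptions taken from that definition.

```python
def insert_at_end(channel, word):
    """Insert a word at the end of a display list.

    The end of the list is defined as the first sector of the channel
    in which no data exist; the zero detector fires there and the
    insert proceeds.  Returns the sector used, or None if the list
    fills the whole channel.
    """
    for sector, value in enumerate(channel):
        if value == 0:              # zero detector: empty sector found
            channel[sector] = word
            return sector
    return None
```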
A typical delete subroutine also consists of two
or three machine instructions, depending on whether
or not a display data transfer instruction is included. Deletion at the beginning of, within, or at
the end of a display list is performed as described
for the insertion subroutines above.
A subroutine to move a picture or subpicture
across the CRT face requires three machine instructions. The first instruction fills the Cx or Cy register with the desired constant to be added to the X or Y
coordinate data. The add instruction is performed
next. This may be a complement-before-add instruction (as previously described) for such purposes as creating movement left or down. Finally,
an unconditional jump instruction returns program
control to the interrupt instruction.
Movement at any angle may be accomplished by
the addition of another fill instruction which would allow the Cx and Cy registers to be filled with different constants. Simultaneous add instruction execution may then be performed by the X and Y subordinate list processors if both X and Y are specified in the add instruction (see Fig. 2).
To tie the last point (or line) in a display list to
any preceding point (or line) in that list with, generally, a line, requires three machine instructions.
The user identifies the feature he wishes to tie to
with the light pen and depresses the interrupt button used for this "Tie" subroutine. The X and Y
coordinate values in the sector location just identified are transferred to the Bx and By registers, respectively, by the first instruction, utilizing alternate addressing. An end-of-list insert instruction
is performed next, resulting in a new line being
drawn to the identified location. The remaining instruction is the usual unconditional jump to the interrupt instruction.
Calling up whole pictures or subpictures after
they have been generated, stored in general memory, and identified by interrupt buttons and associated linkage software requires a subroutine of four
machine instructions. These are three block transfer instructions (one each for the X, Y and Z lists)
and the usual unconditional jump back to the interrupt.
On a somewhat larger scale, a "visual assembler"
program has been devised, requiring (irrespective
of look-up tables and picture storage) 103 machine instructions. Upon entry into this program, a
picture is displayed on the primary display consisting of alpha-numeric mnemonics representing all
of the machine instructions and the four subordinate
list processors, a number table for composing channel and sector addresses, and mnemonics used for
control purposes. The processor then goes to a light
pen interrupt mode. The user then points to a displayed mnemonic with the light pen and depresses
the light pen address switch. By alternate addressing, the program is then directed to the proper subroutine for assembly of the binary equivalent of the mnemonic pointed to. This process is continued, assembling the operation and address fields in their proper locations to result in a machine instruction or, as desired, a numeric data word. Various control mnemonics allow such operations as
assembly of machine instructions or data words, error correction and transfer of the assembled information to its permanent general memory location.
More sophisticated programs of this nature will be
generated in the near future for compiling programs
for MAGIC and for the central computer.
MAGIC-CENTRAL PROCESSOR
COMMUNICATION
By November 1965 MAGIC will be interfaced to
a large ADP system which is particularly oriented
toward time-sharing communication with a large
number of peripheral devices. These devices will be
of varying complexity, ranging from simple teletypes to more complex devices such as MAGIC or
satellite computers. Such an arrangement will allow
further investigation of time sharing, interfacing
and on-line processing techniques. Where MAGIC
is particularly concerned, various experiments will
be performed involving applications of a display
device which require the large data base and the
faster, more sophisticated computing abilities provided by a large ADP system.
The central processing system to be used for this
purpose is a MOBIDIC B twin computer and the
NBS PILOT multicomputer, each computer having
the ability to communicate with the other. Four
magnetic tape units and two disc files provide a
large data base for program and data storage. Minor
modifications to the I/O Converters associated
with each processor and to the I/O interrupt logic
have greatly increased the time-sharing capabilities of the system.
The interface associated with MAGIC utilizes a
voice quality, half-duplex line. The data rate has
been set at 2.4 kHz. Although the data rate could
be considerably higher, the use of voice quality
lines is generally a necessity for many remote station applications as dictated by availability and
economic considerations.
Two forms of intercommunication between
MAGIC and MOBIDIC are used: one for communication of graphic data to and from display channels
X, Y and Z, and one for communication of programs and related data or control information to
and from the I/O channel in the control processor.
When used on line with MOBIDIC, MAGIC is assumed to be in the receive mode except when actually transmitting. In MAGIC, reception of data is
concurrent with control processor operation. MOBIDIC is similarly assumed to be always in the receive mode. However, unlike MAGIC, MOBIDIC
utilizes an interrupt for recognizing data receipt
when operating in the time sharing mode.
Transmission of blocks of data from either
MAGIC or MOBIDIC is program initiated. In
MAGIC, two machine instructions are used for initiating transmission: one for display data and one
for control information or programs as related
above. Once initiated, the transmission process will
continue until self-terminated without further interruption to MAGIC.
MAGIC-MOBIDIC interactive display data
processing will generally adhere to the following
philosophy. MAGIC will provide local storage and
basic manipulation of display data under direct user
control. MOBIDIC will provide a large data base
for storage of graphic data and graphic data processing routines for both MAGIC and MOBIDIC.
MOBIDIC graphic data processing will be relegated
mainly to relatively complex functions; i.e., curve
fitting, geometric calculations, generation of graphic solutions to mathematical equations, etc. Detailed descriptions of the programming required for
this interactive graphic data processing, the interfacing hardware required and the over-all system
configuration are beyond the scope of this paper, but
will be presented in forthcoming papers by various
members of the section.
CONCLUSIONS
MAGIC is an operating display system. The
preceding paragraphs have described the salient
hardware, software and operational features as of
May 1965. The remaining questions are: What has
been learned from the present system, and what of
the future?
First of all, it can be said that the concepts of list
processing have proven to be desirable in the manipulation of display data. Second, the hardware
and software design of MAGIC as described provides sufficient local processing abilities to significantly remove the burden of display data processing
from the C.P.U. to which it is to be interfaced.
Software has proven to be quite efficient, and operation of MAGIC using the programmed interrupts,
keyboard and light pen has proven to be straightforward and very easy to learn.
Flexibility of original design has provided considerable latitude in experimentation with the hardware and software of MAGIC. As was expected and
desired, this has brought to light a number of excesses and deficiencies in the hardware and software areas.
Three major excesses in hardware design have
been determined. It has been found that there is no
great need for a secondary display unit except in
special cases such as when MAGIC may be used as
a teaching machine. The serial adders in the subordinate list processors, elementary as they are, may
be further simplified by eliminating the complement-and-add feature. The use of this operation
has proven to be negligible. Subordinate list processor W may be essentially eliminated except for
block transfers, since it has proven to be of little
use in display data manipulation. Many of the features in processor W could be directly implemented
within the control processor.
Hardware deficiencies have been minor except in
the analog area. More work is required on the vector generators and consideration is being given to
the implementation of a circle generator. The major
problem in the analog area concerns boundary conditions involving portions of vectors which extend
beyond the display boundaries. Such a problem
arises in magnifying to full screen size small portions
of a displayed picture. Preliminary investigation
indicates that a combination of digital and analog
techniques will provide the necessary intensity
blanking and vector segmentation for objects extending beyond the visible boundaries automatically,
with little or no required software.
In the software area, certain basic deficiencies
have arisen. The instruction format, as it stands,
does not fully describe a list; it only specifies the
initial address of a list and the list is assumed to
terminate at the end of the channel. For the list manipulations (block transfers excepted) a minor addition to the hardware of the control processor will
eliminate this difficulty. However, the list manipulations as they stand have proven to be quite efficient in manipulating display data and the difficulty
described above does not detract from the concepts
of list processing incorporated in MAGIC.
As for the subordinate list processors, it has been
found that additional register-to-register transfers would be helpful in decreasing the software required to directly enter and retrieve data from the
display channels. It has also been found that the implementation of hardware and software for direct
addition of general memory data to data in the display channels would be of considerable help in reducing software required for magnification or division of display picture size, subpicture assembly
and manipulation, and the like. However, even
when these minor software deficiencies are taken
into account, the machine language programming of
MAGIC is straightforward and has proven to be
surprisingly minimal.
Future additions to MAGIC in respect to the
hardware and software deficiencies described above
will be performed on an experimental or "spare
time" basis. This is dictated by the following considerations. First, as it may be inferred from Fig. 8,
the actual hardware used in the construction of
MAGIC consists of surplus equipment which, by
today's standards, considerably lags the present
state-of-the-art. Consequently it has been decided that there is little value in attempting to increase the performance of this Model I MAGIC
which is already operating at maximum capability.
However, this has in no way affected the implementation of the concepts of display equipment design
as discussed in this paper.
Second, preliminary design work on a Model II
MAGIC is in progress and final design and construction will begin in the fall of 1965. Model II
will incorporate state of the art hardware components and will be designed to eliminate the previously described hardware deficiencies of Model I.
This will result in greatly improved operating
speed, display quantity and quality, and list manipulating capabilities. It will also be interfaced to
the same central computer as Model I. The final design
of the interface hardware for Model II will, of
course, be dictated from what is learned from the
implementation of the interface for Model I.
ACKNOWLEDGMENT
The development and programming of MAGIC
has been sponsored by the National Bureau of Standards and the National Aeronautics and Space Administration. Every member of the Computer Technology Section has contributed to the success of this
project, but particular appreciation is extended to
James A. Cunningham and Paul A. Meissner for
their valuable counsel concerning all aspects of the
project.
REFERENCES
The following represents a collection of general
subject material concerning man-machine communication using display devices.
1. J. C. R. Licklider and W. F. Clark, "On Line
Man-Machine Computer Communication," Proc.
S.J.C.C., vol. 21, p. 113 (1962).
2. H. H. Loomis, Jr., "Graphic Manipulation
Techniques Using the Lincoln TX-2 Computer,"
Report No. 516-0017 (U), Lincoln Laboratory,
M.I.T., Cambridge, Mass. (November 1960).
3. I. E. Sutherland, "Sketchpad: A Man-Machine Graphic Communication System," Proc.
S.J.C.C., vol. 23, p. 329 (1963).
4. T. R. Allen and J. E. Foote, "Input Output
Software Capability for a Man-Machine Communication and Image Processing System," Proc.
F.J.C.C., vol. 26, p. 387 (1964).
5. R. Stotz, "Man-Machine Console Facilities
for Computer Aided Design," Proc. S.J.C.C., vol.
23, p. 323 (1963).
6. B. Hargreaves, J. D. Joyce, G. L. Cole et al.,
"Image Processing Hardware for a Man-Machine
Graphical Communication System," Proc. F.J.C.C.,
vol. 26, p. 363 (1964).
7. P. B. Lazovick et al., "A Versatile Man-Machine Communication Console," Proc. E.J.C.C.,
vol. 20, p. 166 (1961).
8. T. E. Johnson, "Sketchpad III: A Computer
Program for Drawing in Three Dimensions," Proc.
S.J.C.C., vol. 23, p. 347 (1963).
9. D. T. Ross and J. E. Rodriguez, "Theoretical
Foundation for the Computer-Aided Design System," Proc. S.J.C.C., vol. 23, p. 305 (1963).
A MAGNETIC DEVICE FOR COMPUTER GRAPHIC INPUT
M. H. Lewin
RCA Laboratories, Radio Corporation of America
Princeton, New Jersey
INTRODUCTION

Recent work on systems to facilitate the input of graphical information to a computer has resulted in the development of the light pen¹ and the Rand tablet.² Both of these devices allow a user to "write" on a flat surface with a special, hand-held electronic pen. Periodically, the pen position is detected and converted into a machine-readable address. In this way, the pattern which is traced out by the pen is directly converted into binary code and stored in the machine. Devices such as these promote the easy input of graphical data such as curves, maps, diagrams, and other drawings. They should also be of interest to many researchers concerned with character and pattern recognition.

The light pen is normally used in conjunction with a cathode-ray tube as the writing surface. A light-sensitive element in the pen generates a signal when the flying spot on the tube face reaches the pen tip. The timing of this signal, relative to the timing of the scanning pattern, establishes the pen position. Appropriate digital and analog peripheral circuits are necessary to convert this signal into an equivalent binary address for storage. Clearly, the speed of movement of the pen is limited by the scanning frame rate of the CRT. Also, one cannot insert between the CRT face and the pen any material (such as a sheet of paper) which will prevent light transmission.

The Rand tablet consists of a thin Mylar sheet containing, on one side, an array of etched copper lines in the X direction and, on the other side, a similar array of fine lines in the Y direction. By means of capacitor encoding networks, also etched on the same sheet, a unique voltage pulse train is applied to each X and Y line from a common pulse pattern generator. The pen in this case is merely a metallic electrostatic pickup connected to a high input-impedance amplifier. The pulse train picked up by the pen depends on the X and Y lines nearest to its tip. This serial pulse pattern (in Gray code to eliminate errors) is converted into a parallel binary address with appropriate peripheral logic, which includes a shift register and a code converter. The system is entirely digital and the tablet is relatively inexpensive. In addition, thin paper sheets can be inserted between the tablet surface and the pen for tracing maps and curves.

Both of the approaches described above utilize the pen as the signal pickup device and the writing surface as the signal generator. While the Rand tablet system materially simplifies the writing surface used and reduces the complexity of the peripheral
electronics required for a given pen position resolution, the amount of circuitry needed for the generation of the appropriate pulse sequences and for the conversion of detected pulse sequences into parallel binary addresses is not negligible.*
The work on which this paper is based was initiated to develop a graphic input device which
would require a minimum of associated circuits
while maintaining simplicity in the construction of
the writing surface. The system to be described utilizes the pen as the signal generator and the writing
surface as the address detector. The pen contains in
its tip a small magnetic head which periodically
generates a localized magnetic field pulse. (Since
the coupling is magnetic, it is not shielded by most
materials placed between the pen and the tablet.)
The writing surface contains a number of thin winding layers in a laminated structure. Each winding
layer consists of a single, continuous wire pattern
designed to detect one of the pen address bits.
Thus, there are as many layers as there are address
bits, each developing a positive or negative induced
voltage as a function of the pen position. All layers
generate output pulses in parallel and these signals
are of sufficient magnitude to set a register directly.
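The layer-per-bit readout can be sketched as follows. Describing each layer by a single sector width, and taking bit k of the address from layer k, are simplifying assumptions made for illustration; the text states only that each layer contributes one address bit through the polarity of its induced pulse.

```python
def pen_address(x, layer_widths):
    """Assemble the pen's binary address from per-layer pulse polarities.

    Each layer divides the surface into alternating sectors labeled
    1, 2, ..., m; an even-labeled sector induces a positive pulse
    (binary one), an odd-labeled sector a negative pulse (binary zero).
    """
    address = 0
    for bit, width in enumerate(layer_widths):
        index = x // width                 # 0-based sector index
        is_even_sector = (index % 2 == 1)  # index 0 carries label 1 (odd)
        address |= (1 if is_even_sector else 0) << bit
    return address
```

With sector widths doubling layer by layer, the assembled address is simply the binary position of the pen, which is the behaviour the encoding is meant to achieve.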
WRITING SURFACE WINDING PATTERN
The magnetic head in the pen tip consists of a
small, linear ferrite core with an air gap and winding as indicated in Fig. 1. The coil is periodically
driven with a voltage pulse as shown. Any wire,
brought in the vicinity of the air gap and oriented
so as not to be perpendicular to the air slot, will
link some of the magnetic flux generated and will
thus develop an induced voltage pulse whose shape
is similar to that of the drive signal. The polarity of
the induced voltage is determined by the familiar
right-hand rule. Its magnitude is greatest when the
wire is parallel to the slot.
*The authors state that the system "contains some 400 transistors and about 220 diodes; however, little attempt has been made to minimize the number of components."

Figure 1. Magnetic head in pen tip.

Consider a wire winding pattern in a plane surface over which the pen is "writing." The pen tip is in close proximity to the winding layer. It is desired to arrange the winding pattern so that the surface is divided into two sets of areas or sectors which may be labeled "even" and "odd." When the pen tip is positioned over any one of the even sectors, a positive voltage pulse (binary "one") is induced across the two winding terminals. When the pen is over any of the odd sectors, a negative output signal (binary "zero") is obtained. A winding configuration which will satisfy these requirements is shown in Figs. 2 and 3.
Assume the plane surface is divided into m sectors (m an even number), half even and half odd. The odd and even sectors alternate and can be labeled 1, 2, . . . , m as shown in Fig. 2. Each sector consists of n winding "stripes" or wires, each of which is a segment of the total length of wire used in the winding. Let stripe ij be the jth stripe of the ith sector, where 1 ≤ i ≤ m and 1 ≤ j ≤ n.
The procedure for laying out the winding pattern
is shown in Fig. 3. In making a given winding
"pass" over the surface, from left to right, one
winds the wire vertically up in a specified stripe
position, then continues the winding horizontally to
the right to the next designated stripe position, then
winds the wire vertically down, then horizontally to
the right, then vertically up, ... etc., until the right
end of the plane is reached. The wire is then returned horizontally from right to left and another
pass is started from left to right. The procedure indicated in Fig. 3 is summarized in Table 1. As a simple example, the pattern for four sectors, each containing four stripes, is given in Fig. 4.

Figure 2. Partitioning of a winding plane.

Figure 3. Winding layer pattern.

An examination of this winding configuration reveals that
all stripes in the odd sectors have the same sense,
opposite to that of the stripes in the even sectors.
Figure 5 shows a side view of two adjacent sectors, with three possible head positions indicated.
The field pattern for position A causes a given polarity signal to be developed across the winding terminals. For position C, because the sense of the
windings is reversed, the opposite polarity signal
will be induced. If the pen tip is in the immediate
vicinity of the boundary between the sectors (position B), very little signal will be generated since
positive and negative components will cancel. Thus,
the output pulse polarity determines whether the
pen is over an odd or an even sector. The number of stripes (n) required in a sector depends on the desired output pulse magnitude. As n increases, for a given sector width, the induced signal increases.

The winding pattern shown in Fig. 3 is interesting because it contains no wire crossovers. It can therefore be photoetched on a thin, conductor-clad insulator sheet, such as copper-clad Mylar, or otherwise deposited via screening or evaporation techniques on a thin insulator substrate. Two patterns can be placed on either side of a given sheet.
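The pass-by-pass winding procedure described above (and summarized in Table 1) can be sketched in Python. The function below is an illustrative reconstruction from Fig. 3 and Table 1, not code from the paper:

```python
def winding_passes(m, n):
    """Generate the left-to-right winding passes of Table 1.

    m: number of sectors (even); n: stripes per sector.
    Each pass is a list of (sector, stripe, direction) tuples,
    alternating an "up" stripe in an odd sector with a "down"
    stripe in the following even sector.
    """
    assert m % 2 == 0, "m must be even"
    passes = []
    for p in range(1, n + 1):              # pass number 1 .. n
        pass_p = []
        for i in range(1, m, 2):           # odd sectors 1, 3, ..., m-1
            pass_p.append((i, p, "up"))                 # up in stripe i,p
            pass_p.append((i + 1, n - p + 1, "down"))   # down in stripe i+1,(n-p+1)
        passes.append(pass_p)
    return passes

# The m = 4, n = 4 example of Fig. 4:
for p, seq in enumerate(winding_passes(4, 4), start=1):
    print(f"pass {p}:", seq)
```

Pass 1 winds up in stripes 11 and 31 and down in 24 and 44, matching the first row of Table 1; all odd-sector stripes end up with one sense and all even-sector stripes with the opposite sense.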
[Figure 4. Configuration for m = 4, n = 4: the start and finish terminals and the complete winding path over four sectors of four stripes each, within the "active" plane width.]
[Figure 5. Side view of winding stripes with three possible head positions: A, over sector i; B, over the boundary between sectors; C, over sector i + 1. The field pattern of the head is indicated.]
MULTILAYER TABLET

The writing surface is constructed by stacking or laminating as many thin winding layers as there are pen address bits. Thus, for a tablet to resolve any one of 1024 × 1024 locations, ten double-sided sheets are required. Half of the winding layers are oriented in the X direction, half in the Y direction. The total thickness of the system can be kept small by using sheets only a few mils thick. Each X layer has an identical companion Y layer oriented orthogonal to it.
The layout of a winding pattern to detect a given address bit is, of course, a function of the position-to-address coding scheme used. By using a closed, cyclic code, such as Gray code, one is assured that no more than one address bit in a given coordinate direction can be undecided. That is, for any pen position, the head can be located over, at most, one boundary between sectors. For these reasons, it would appear that a conventional binary coding scheme should not be used, because the pen point may be positioned over more than one indecision boundary. However, the addition of a small amount of external logic, no more complicated than that required for a parallel Gray-to-binary conversion, may allow a conventional binary code to be used. (The indecision correction algorithm involved is described in the next section.) Assuming such a code, winding patterns can be laid out as illustrated by the simple 8 × 8 example shown in Fig. 6. The "most significant" X or Y layer has only two sectors, the next four, then eight, etc. The total number of X or Y winding layers (address bits) depends on the resolution required in the location of the pen tip. The "least significant" layer (the one with the largest number of sectors) may have only one stripe per sector (i.e., n = 1).
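The single-boundary property of the Gray code, and the multiple-transition problem of conventional binary, can be checked directly. At the boundary between adjacent positions, the undecided layers are exactly the bits that change; the helpers below are illustrative:

```python
def gray(x):
    # binary-reflected Gray code of a nonnegative integer
    return x ^ (x >> 1)

def changed_bits(a, b):
    # number of bit positions in which a and b differ
    return bin(a ^ b).count("1")

# Adjacent positions on an 8-position (3-bit) axis, as in Fig. 6:
for x in range(7):
    print(x, "->", x + 1,
          "binary bits changed:", changed_bits(x, x + 1),
          "Gray bits changed:", changed_bits(gray(x), gray(x + 1)))
```

In conventional binary the 3/4 boundary changes all three bits (011 to 100), so three layers could be undecided at once; in Gray code every boundary changes exactly one bit.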
[Figure 6. Six winding layers for a 64 (8 × 8) position array using conventional binary coding. Outputs X2, X1, X0 and Y2, Y1, Y0; the X2 and Y2 layers have two sectors, X1 and Y1 four sectors, and X0 and Y0 eight sectors.]
All of the winding layers must be close to the
pen point to allow the generation of sufficiently
large output signals. The output voltage induced in
a more significant layer (one with many stripes per
sector), when the pen is over a given sector, is the
sum of the voltages induced in all the stripes of that
sector (refer to Fig. 5). This integrating effect allows one to locate a more significant layer at a distance from the pen tip which is larger than that for
a less significant layer. Thus, the tablet is laminated
with the most significant winding layers at the bottom and the least significant layers nearest to the
surface.
Note that, in order to detect approximately equal magnitude signals at the outputs of the X and Y layers, the air gap in the magnetic head must be oriented at 45° to the X or Y orthogonal stripes. Thus, the pen must be marked or shaped appropriately to insure that it is held roughly in the correct orientation. Small variations from 45° will not change the induced signals appreciably.
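Assuming an idealized coupling model (an assumption for illustration, not stated in the paper) in which the induced signal varies as the cosine of the angle between the gap axis and a layer's stripes, the 45° choice equalizes the X and Y outputs, and small deviations change them only slowly:

```python
import math

def relative_signals(theta_deg):
    """Idealized relative signal amplitudes for the X and Y layers
    as a function of gap orientation theta (degrees from the X
    stripes), under the assumed cosine coupling model."""
    t = math.radians(theta_deg)
    return abs(math.cos(t)), abs(math.sin(t))  # (X layers, Y layers)

x, y = relative_signals(45.0)
print(round(x, 3), round(y, 3))   # equal at 45 degrees
```

Because the derivative of the cosine is smallest near its extremes and the two outputs cross at 45°, a few degrees of pen rotation leaves both signals within a few percent of equal, consistent with the paper's remark.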
Other more sophisticated head designs, which allow the system to operate independently of pen orientation, are possible. For example, one can use more than one air gap in the pen tip. Using two orthogonal gaps (pulsed at different times),* as the pen orientation changes, the magnitude of the induced signal from one gap increases while that from the other decreases. Orthogonal gaps can also be used to generate a rotating magnetic field. Another method† involves using two or more air gaps to
generate a number of discrete field orientations (say, three), one of which will always be acceptable for any pen orientation. Periodically, these orientations are sequentially tested and the acceptable one is chosen. This testing may involve the use of an additional test winding layer whose stripes are all oriented at 45° to the X and Y stripes. Each of these arrangements, however, increases the complexity not only of the magnetic head but also of the peripheral electronics. At this stage, the requirement of proper pen orientation, which allows the system to be very simple, does not appear to be a very severe user restriction. If necessary, some simple mechanical approach, such as housing the head at the end of a flexible shaft (similar to that used in speedometer cable), would permit the sleeve of the pen to rotate while the head orientation stays relatively fixed.

*Suggested by J. A. Rajchman.
†Due to J. Avins.

INDECISION CORRECTION ALGORITHM

For a system such as the one described above, there are actually three possible output signals (call them one, X, and zero) available from each layer. One and zero are acceptable signals (positive and negative pulses). X represents almost no output pulse: an undecided bit. An examination of the one-zero transitions, when counting in conventional binary code, will show that if one follows the following simple rules, errors at multiple-transition boundaries can be resolved and the conventional binary pattern can be used:

Detect the most significant bit which is undecided (i.e., the most significant X output). Arbitrarily decide this bit to be one or zero.

Force all less significant bits to be the complement of the bit chosen above.*

The addition of a very small amount of external logic will allow this procedure to be used. For example, in the circuit shown in Fig. 7, an undecided output is arbitrarily decided as a zero and all less significant outputs are forced to be one.

*This method will work provided that the winding patterns are designed such that, for any two adjacent address bits having a transition boundary in the same position, the "zone of indecision" for the more significant bit overlaps that of the less significant bit.
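The two rules can be sketched as follows. This is a minimal model in which each layer output is reduced to '1', '0', or undecided 'X', and the arbitrary decision is taken to be zero with the less significant bits forced to one, as in Fig. 7:

```python
def resolve(bits):
    """Apply the indecision correction rules to one coordinate.

    bits: list of '1', '0', or 'X' (undecided), most significant
    first. The most significant 'X' is decided as '0' and every
    less significant bit is forced to its complement, '1'.
    """
    out = list(bits)
    for k, b in enumerate(out):
        if b == 'X':
            out[k] = '0'                    # arbitrary decision
            for j in range(k + 1, len(out)):
                out[j] = '1'                # force the complement
            break
    return ''.join(out)

# Pen over the 011/100 boundary of a 3-bit binary axis:
# all three layers are undecided, and the rule picks 011.
print(resolve(['X', 'X', 'X']))
```

At any multiple-transition boundary of the conventional binary code the undecided bits form the pattern 0111... / 1000..., so forcing the complements always yields one of the two addresses adjacent to the boundary.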
[Figure 7. Mechanization of indecision correction algorithm for conventional binary address coding. Each tablet layer j (0 ≤ j ≤ k, where k + 1 is the number of address bits in one coordinate direction) feeds a positive and a negative pulse detector: Pj = 1 if a positive pulse is detected, Qj = 1 if a negative pulse is detected, Aj = 1 if no pulse is detected. Each Aj is fed to the less significant positions to force their outputs.]
EXPERIMENTAL MODEL

An initial experimental model consisting of a 32 × 32 array, using ten winding layers (five X and five Y) to resolve any one of 1024 pen positions, has been constructed and is operating as described above. A photograph of the pen and tablet is shown in Fig. 8. The winding layers for this model were wound by hand using conventional No. 33 insulated coil wire. Each of the layers follows the configuration given in Fig. 3. A conventional binary code was used. The windings were potted with an epoxy resin to allow the tablet to present a flat surface to the pen. The stripes are 1/8" apart, and the total thickness of the ten-layer system is approximately 0.1". Clearly, a much higher stripe density is achievable using present photoetching techniques. Also, one can easily laminate a ten double-sided sheet system, required for a 1024 × 1024 array, and obtain a total thickness less than 0.1".
Figure 8. Experimental 32 X 32 tablet with pen.
The pen tip contains a linear ferrite core, 3/16" O.D. and 1/8" I.D., wound with 30 turns, and driven from a conventional General Radio pulse generator. Approximately 100 volts is developed across the head winding during the pulse peak. The core has a 15 mil air gap. Little attempt was made to optimize the core drive circuit so as to obtain optimum output signals. Each of the winding layers is terminated in 100 ohms. This value was chosen to critically damp output ringing. A photograph of a typical output pulse is shown in Fig. 9. The reverse polarity signal has the same shape. The waveform is clean and has sufficient amplitude to set a flip-flop. It can no doubt be made larger with appropriate pen drive circuit design. The timing indicated shows that one need not be concerned with the speed of movement of the pen. The pen is marked to permit proper
orientation of the air gap with respect to the winding stripes.

A set of ten peripheral circuits, which includes the logic given in Fig. 7 and which also contains digital-to-analog converters, is used to demonstrate the operation of the tablet by permitting the position of the pen to be displayed as a spot on a CRT face.

Figure 9. Typical winding output pulse across 100 ohms. Vertical scale: 0.1 volt/div. Horizontal scale: 0.2 μsec/div.

CONCLUSIONS

By constructing the writing surface as the superposition of a number of patterns, each of which is designed to detect one of the pen address bits directly, the amount of associated circuitry is minimized. Although the initial artwork involves the laying out of as many patterns as there are address bits in one coordinate direction, subsequent fabrication of a number of tablets should be simple and inexpensive.

ACKNOWLEDGMENTS

The author wishes to express his appreciation to H. Schnitzler, who constructed the experimental devices and who assisted in many of the tests.

REFERENCES

1. B. M. Gurley and C. E. Woodward, "Light-Pen Links Computer to Operator," Electronics, pp. 85-87 (Nov. 20, 1959).
2. M. R. Davis and T. O. Ellis, "The Rand Tablet: A Man-Machine Graphical Communication Device," Proc. 1964 Fall Joint Computer Conference.
Table 1. Winding procedure.

                          "UP" IN STRIPE            "DOWN" IN STRIPE
                          POSITION NUMBER           POSITION NUMBER
L-R PASS NO. 1 (START)    11, 31, 51, ..., (m-1)1   2n, 4n, 6n, ..., mn
RETURN R-L
L-R PASS NO. 2            12, 32, ..., (m-1)2       2(n-1), 4(n-1), ..., m(n-1)
RETURN R-L
  (CONTINUE THIS PATTERN)
L-R PASS NO. n            1n, 3n, ..., (m-1)n       21, 41, ..., m1 (FINISH)
GRAPHIC 1 - A REMOTE GRAPHICAL DISPLAY CONSOLE SYSTEM

William H. Ninke
Bell Telephone Laboratories, Inc.
Murray Hill, New Jersey
INTRODUCTION

Graphical information exchange using direct-view display consoles is a rapidly growing means of communicating between a human and a computer. One reason for this growth is that results presented to a human in graphical form are concise and readily understandable. Another reason is that, through alternate viewing of graphical output displays and entering of new inputs, either graphically or by other means, based on these output displays, a user can monitor and guide a computer during the course of a complex problem solution. Such interaction generally produces more rapid and often better problem solutions. In many cases, interaction allows a problem to be solved which would not even be attempted using other techniques.1

Most work to date on providing man-computer graphical communication facilities has been based on consoles which require full-time occupancy of, or, at the least, immediate access to, a large digital computer.2,3,4 Display maintenance has been provided by the large computer or by a data channel off the large computer. The high transfer rates needed for such an organization have dictated that the consoles be very close, if not physically adjacent, to the supporting computer or channel.

Significant research work is now concentrating on providing computing power to a large user community through a sharing of the facilities of a central computer.* The user community is distributed over a wide geographical area and communicates with the central facility using remote consoles. The pace of such work definitely seems to be accelerating, the reason being that early experience has shown that a time-shared central computer facility can serve many people very well.

*Such work is going on at Bell Laboratories, M.I.T., G.E., Dartmouth, Carnegie Tech, U.C.L.A., I.B.M., S.D.C., RAND, and other places.

The console facilities used to date to communicate with time-shared computer centers have been teletypes or other typewriters. With such low-speed, character-oriented devices, slow access to the central computer (a few hundred milliseconds) is not readily apparent to a user. The real limiting factor in the use of such devices is the transmission delay, the time required to transmit messages between the computer and a user. For long messages at 10 character/second rates, this delay becomes both long and annoying, and strongly affects what can be done.

For a high-speed remote graphical display, the transmission rate can be increased so that the delay problem is reduced. Access time then becomes a
major problem. A few hundred milliseconds is certainly suitable access for certain aspects of a problem. To allow a user, however, to do real-time composing, editing, or other manipulation of graphical information with a light pen or other graphical input device, the sum of access time and transmission delay must be only a few milliseconds. For a truly remote console in a time-shared center, such timing will generally not be available. Therefore, to allow such light pen operations, a remote graphical console facility in a time-shared central computer environment should have local computing power.

Also, if such high-speed consoles are to be used remotely, local display maintenance is essential. Thus, a picture need be transmitted only once, maybe over low-cost, low-speed lines. The central facility is then relieved of any further immediate responsibility. The console itself provides the broad-band capabilities to maintain a picture at a flicker-free rate. Local computing power and display maintenance not only reduce the requirements on the central computer, but also should allow many interesting things to be done without even disturbing the central facility.

The GRAPHIC 1 console has been designed for use in exploring the problems associated with local computing power and display maintenance for a remote graphical console which eventually will be operating in a time-shared central computer environment. A photograph of the console is shown in Fig. 1.
Figure 1. The GRAPHIC 1 console (photo no. B65-3727- MH).
GRAPHIC 1 SYSTEM ORGANIZATION
The console system can be divided into two major units. The first unit consists of a control computer, which provides the local console computing
power, and the console input devices. The second
unit contains a display scope and an associated core
buffer memory for storing display material. These
last two elements and their connecting interface
make possible a locally maintained output display.
The two console units are connected by an interface
which allows control signals and other information
to be passed between them.
This interface also allows communication between the control computer and/or the display
memory and the central computer facility, an IBM
7094 in the Murray Hill Computation Center. A
block diagram of the described organization is
shown in Fig. 2.
The console system is currently connected to the
7094 by a parallel transfer data connector. However, a request for service by the console system can
[Figure 2. Organization of the GRAPHIC 1 console: the control computer (DEC PDP-5, 4K 12-bit words) with the input devices; the display scope (DEC Type 340) and the display memory (Ampex RVQ, 4K 36-bit words); a 36-bit buffer register connects the console to the central computing facility (IBM 7094, 32K 36-bit words).]
only be recognized at interjob points, i.e., between
jobs on the normal input batch tape to the 7094.
Response time is governed by the length of jobs on
the batch tape, an average of from two to six minutes during daytime working hours.
This access is far from ideal and certainly will be
improved when the Laboratories converts to a multiprogrammed central computer facility. The present waiting time does, however, simulate to some
extent the access time and transmission delay that
would take place if the console were connected to
the central facility over a serial voice-band transmission line. It has strongly reinforced the necessity of having local computing power at the console.
The control computer in the console, a Digital Equipment Corporation PDP-5, is a small fixed-point, core-memory digital machine.5 The 6-microsecond-cycle-time memory contains 4096 12-bit words. The instruction set allows addition, subtraction, indexing, direct and subroutine branching, accumulator manipulation, and accumulator testing to be done internally. A microprogrammed input-output instruction allows extensive external manipulation and control operations to be performed. This instruction set can be used to compose, edit, translate, group, or otherwise manipulate graphical information in real time. However, the limited word size and the absence of hardware multiply-divide instructions prohibit computations in the PDP-5 itself involving a wide range of numbers or computations for rotations.
The display scope, a modified DEC Type 340 Precision Incremental Display, allows the rapid conversion of digital data to graphical form through point plotting on a 1024 × 1024 raster.6 This plotting area occupies about a 9⅜-inch square on the scope face. Plotting is accomplished through use of six standard modes of operation. One of these is a control mode which allows the entering of parameter information such as scale factor, intensity, scope image reflection, and light pen enabling. The other five are display modes which allow the plotting of points, curves, vectors, axes, and characters. Internal vector and character generators produce the points needed to form lines and characters from the coded input words. A seventh, non-standard mode for use in display linkage has been added. This will be described shortly. Of the display modes, only one, the point plot mode, is absolute. By absolute is meant that fixed X and Y absolute coordinates are assigned to a point. The remaining display modes (vector, curve, axis, and character) are incremental in nature. Thus they essentially contain only differential or ΔX and ΔY movement information.

The plotting rate in one of the incremental modes is 1½ microseconds per point. For the X-Y or point plot mode, 35 microseconds are required per point. Since the characters average 20 points, average character plotting time is 30 microseconds. Considering these timing values and the scope grid size, about 20 scope diameters or 200 inches of line can be maintained on the scope at a flicker-free rate (30 frames/sec). This is equivalent to about 1000 characters.
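These capacity figures follow directly from the quoted timings. The short calculation below reproduces them, assuming the roughly 9⅜-inch plotting square for the conversion to inches:

```python
# Reproducing the flicker-free capacity estimates from the quoted timings.
frame_us = 1e6 / 30       # one refresh frame at 30 frames/sec, in microseconds
inc_point_us = 1.5        # incremental-mode plotting time per point
char_us = 30.0            # average character (about 20 points)

points_per_frame = frame_us / inc_point_us
diameters = points_per_frame / 1024   # 1024 raster points per scope diameter
inches = diameters * 9.375            # ~9-3/8 inch plotting square (assumed)

print(f"{points_per_frame:.0f} incremental points per frame")
print(f"about {diameters:.0f} scope diameters, about {inches:.0f} inches of line")
print(f"about {frame_us / char_us:.0f} characters")
```

The arithmetic gives roughly 22,000 points per frame, about 21 diameters (some 200 inches of line), and about 1100 characters, in line with the rounder figures quoted in the text.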
The display material memory is an Ampex RVQ core unit with 4096 36-bit words and a 5 microsecond cycle time. This storage capacity represents about eight large pictures, i.e., those which take about 1/30th of a second to display.
As was previously stated, the console system is
divided into two major units: the control computer
and input devices, and the scope and display memory. The former is the supervisory and input unit;
the latter, the output unit.
The function of the output unit is the conversion of words in the display memory to an output picture on the display scope. These words can be a mixture of those composed in the 7094 and transferred to the console and those composed at the console using the input devices and the control computer. Successive words are supplied to the scope from the memory through operation of a rather complex interface. Since the scope accepts 18-bit words, the interface accomplishes the gating necessary to load first the left half and then the right half of each 36-bit word from the memory into the scope. The
interface also operates in conjunction with the scope
and the display memory to execute display linkage
instructions. These instructions, a direct transfer or
jump and a subroutine jump, operate essentially as
will be described. There are slight additional complications which result from the left-right usage
by the scope of the halves of a word from the display memory.
For the direct jump instruction, the next word is
taken from the location specified by the jump instruction address. The first word of a subroutine is
used to store return information. Therefore, for a
subroutine jump, the next word is taken from the
location one beyond that specified by the jump address. Before control actually passes on to that location, however, the location where control is to return upon completion of the subroutine is planted
in the location specified by the subroutine jump.
Exit from the subroutine is then accomplished using the planted return information. This method of
display subroutine linkage allows multilevel display
part subroutining. However, it does not allow reentrant subroutines.
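The jump and subroutine-jump behavior just described can be modeled with a toy interpreter. The word format and opcode names below are illustrative, not the 340's actual coding:

```python
def run_display(memory, start, budget=20):
    """Toy model of the display linkage described above.

    memory maps address -> (op, arg). A subroutine jump ("jsr")
    plants a direct jump back to the caller in the subroutine's
    first word; the subroutine's last word jumps to that first word
    to exit. Because the return link lives in the subroutine itself,
    calls may nest to several levels but cannot be re-entrant.
    """
    plotted = []
    pc = start
    for _ in range(budget):
        op, arg = memory[pc]
        if op == "plot":
            plotted.append(arg)
            pc += 1
        elif op == "jump":                   # direct transfer
            pc = arg
        elif op == "jsr":                    # subroutine jump
            memory[arg] = ("jump", pc + 1)   # plant the return information
            pc = arg + 1                     # enter one beyond the first word
        elif op == "halt":
            break
    return plotted

# A main sequence that calls a display subroutine at address 100 twice:
mem = {
    0: ("jsr", 100), 1: ("plot", "B"), 2: ("jsr", 100), 3: ("halt", None),
    100: (None, None),        # first word: reserved for the return link
    101: ("plot", "sub"),
    102: ("jump", 100),       # exit through the planted link
}
print(run_display(mem, 0))
```

Running the sketch plots the subroutine's output, then "B", then the subroutine's output again; calling the same subroutine from two places at once would overwrite the single planted link, which is why re-entrant subroutines are not possible.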
Since the interface can provide successive words to the scope and the linkage words themselves can cause control to flow throughout the display memory, display maintenance is independent of control computer intervention. Thus, once started by the control computer, the display continually refreshes itself, with a direct jump word at the end of the display picture providing the link back to the start. The only exception to this independence is that the control computer must restart the scope after display stoppages produced by edge overflows.
While the output unit is displaying a picture, the
control computer can do other operations. Generally
these are only looking for flags or interrupts indicating that an input device requires attention or that
the central computer requests information. Once a
flag or interrupt is found, the control computer can
stop the output unit. The status of the scope can be
saved. During the stoppage, the desired display
composing, editing, or manipulation can then be
done or information in the display memory can be
read or written by the central computer. If desired,
the display status can then be restored and the display cycling resumed or a new display can be started.
Cooperative communication programs in the control and central computers supervise any interchanges between the console and the central facility.
The central computer can interrupt the console at
any time. However, as was mentioned, the console
can only be recognized between jobs on the central
computer batch input tape.
In addition to the display linkage features, other
additions made to the standard 340 Display involve
the inclusion of reflection properties which can be
enabled using the parameter setting mode, and addition of flagging bits to the parameter setting
mode, jump modes, and point plot mode. The reflection properties allow display material to be reflected about a horizontal axis, or a vertical axis, or
both. By changing the reflection setting before calling for a graphical subroutine, one block of data
can be used to produce a figure in different orientations. There are no hardware rotation matrix features.
The flagging bits have been added to allow programmable display stoppage and computer signalling when certain conditions are found. This has
been done since the material in the display memory
is a mixture of actual display words plus list and
buffer areas and aid is needed in tracing such material to make changes or additions. Use of programmable trapping means that interpretive programs do not have to operate upon the display
memory to find desired locations or conditions. The
scope itself does the tracing. These features are particularly valuable for use in conjunction with the
light pen.
Additions have also been made to the single interrupt line of the PDP-5. This permits six different signals to be passed on to the interrupt line with
the allowable interrupt signals at any time being
controlled by a programmable masking register.
INPUT DEVICES
The GRAPHIC 1 console is experimental, as are the problems attempted using the system. Therefore, a wide variety of input devices has been included, both to allow a user to pick which seems best for a particular function and to provide programming and human factors experience in the use of different devices.
The main input device for the console system is a
DEC Type 370 Light Pen. The pen, which is a
light sensing device, consists of a hand piece with
mechanical shutter at one end of a fiber optics bundle. The other end of the bundle provides input to a
photomultiplier system. The combined optic photomultiplier system is tuned to pick up the blue
flash from the P7 phosphor of the scope face that is
Figure 3. Package placement using GRAPHIC 1 (photo no. B65-4838-MH).
produced when a point is initially intensified. The
pen is not sensitive to the yellow persistence. The
field of view of the light pen is such that the tip
can be held right on the scope face or up to three
inches away without affecting proper operation.
The photomultiplier sets a flag and stops the scope.
Programmed operations in the control computer can
then read the coordinate or address information
needed for tracking or pointing before restarting the
scope again.
A Teletype Model 33 ASR provides standard
keyboard input-output functions. A Teletype
Model 33 Self-Contained Keyboard has been added mainly for use in entering text where the display
scope is used for feedback. The keyboard can be
placed immediately in front of the scope for such
occasions. For other occasions where the physical
presence of the keyboard on the front counter interferes with light pen manipulations, it can be moved
off to the side to the position shown in Fig. 1. Programs which use the self-contained keyboard echo
any typed characters on the regular Teletype machine so that, in addition to the character image on
the scope, a hard copy record is also available.
A track ball has been included for use in two-dimensional positioning problems. It consists of a 4-inch diameter nylon ball which drives two pickoff
wheels, one for vertical travel and one for horizontal. No pickoff has been provided for rotation. The
pickoff wheels turn potentiometers. Outputs from
the potentiometers go through a multiplexor into an
analog-to-digital converter in the PDP-5.
To reduce the directional polarization usually associated with track balls, the ball itself is not supported by the pickoff wheels. Instead, the ball rests on three ball-end casters, the pickoff wheels being held against the track ball by spring tension. Mechanical lockouts are provided so that a pickoff
wheel can be moved away from the ball if it is desired to leave one coordinate unchanged while the
other is moved or to protect a setting from changes
due to accidental bumping.
In addition to the two from the track ball, outputs
from six potentiometers have been connected
through multiplexor gating to the analog-to-digital converter in the PDP-5. Included are two ten-turn, two three-turn, and two one-turn potentiometers. Through appropriate programming, an
operator can use the knobs attached to the potentiometers to enter or change parameters in his problem.
Four switches have also been provided. Two are
alternate-action and two are momentary. The alternate-action switches and one momentary are
mounted on the corner panel for hand manipulation. The remaining momentary switch is located in
a foot pedal. Testing instructions in the PDP-5
can determine the status of these switches.
A function keyboard consisting of 32 buttons connected by a mechanical lock and interlock is also located on the corner panel. The keyboard is wired so that the binary number of the currently depressed button can be read by the PDP-5. There is an additional EXECUTE bar located below the other keys.

The function keyboard can be used for vectoring into various special routines in the PDP-5. This is done as a two-step process. First, the button corresponding to a desired function is depressed. When a user desires the function represented by the depressed button to be executed, the EXECUTE bar is pressed. This two-step method of operation was chosen for accuracy in operation and for ease in repeating functions. To allow the vectoring operation to be accomplished, the output flag from the EXECUTE bar is connected to the interrupt facility of the PDP-5. The trap handler program can read in the depressed button number and enter a transfer vector table which causes a jump to the proper routine.
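The two-step dispatch can be sketched as follows; the button assignments and routine names are hypothetical, chosen only to illustrate the transfer-vector idea:

```python
# Hypothetical routines a user might bind to function buttons.
def delete_item():
    return "delete"

def move_item():
    return "move"

# Transfer vector table: latched button number -> routine.
transfer_vector = {0: delete_item, 1: move_item}

depressed_button = None

def press_button(n):
    """Step 1: a function button latches mechanically; nothing runs yet."""
    global depressed_button
    depressed_button = n

def press_execute():
    """Step 2: the EXECUTE bar interrupts; the trap handler reads the
    latched button number and vectors through the table."""
    return transfer_vector[depressed_button]()

press_button(1)
print(press_execute())   # runs the routine bound to button 1
print(press_execute())   # pressing EXECUTE again repeats the function
```

Because the button stays latched, repeating a function needs only another press of EXECUTE, and rebinding a button is just an update to the table, matching the flexibility described in the text.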
Since different users might desire different routines to be performed using the function keyboard,
the changing of the functioning of a particular button can easily be accomplished by changing the
transfer vector linkage and adding the routine to be
linked. Changing of a label on a button can be accomplished by changing a labeling mask which fits
over the keys. Despite this flexibility, the function
keyboard has had little use. The reasons will be described in the next section.
A DEC Type 451A card reader allows the entering of information from cards into the console system. Reading speed is 200 cards/minute. A connector position for use in attaching any special or experimental devices has also been provided.
EXPERIENCE WITH THE LIGHT PEN
There seems to be considerable debate over what
is the best graphical input device - RAND tablet, 7
graphic pencil4 or light pen. There are advantages
and disadvantages associated with each device.
A RAND tablet is a stylus-tablet device which
generates 10-bit x and 10-bit Y coordinate information representing the stylus position on the tablet.
Thus two-dimensional information can be obtained on a surface not coincident with the display
device. A graphic pencil is a voltage-pickup position pencil. It operates in conjunction with an im-
pressed voltage on a transparent conductive coating over the display screen to give position information. The advantage of these devices is that coordinate information can be entered at any time or place, including on a blank area of a display or when a display is not even present. The disadvantage is that a positive reference between the stylus or pencil position and an object on the display scope does not exist. Thus a fairly complicated coordinate trace must be performed for demonstrative or pointing uses of these devices.
As was pointed out previously in the description
of the console input devices, the light pen is only a
sensing device. It must sense or see light from a
display on the scope before it can function, i.e., set
a flag and stop the display. Thus a disadvantage is
that a display must be present before it can be used.
Once the display is stopped and before it is restarted, however, considerable information can be extracted by the control computer. In addition to
coordinate information, a positive reference to what
is being plotted when the scope was stopped is
available. Thus a coordinate trace does not have to
be performed for demonstrative uses.
The light pen allows good feedback in pointing
since programming can brighten what unit is being
seen or display only the total unit being seen. The
importance of such feedback cannot be appreciated
until one works with a graphical system. Because of
its particular value in pointing operations, a light
pen has been chosen as the graphical input device
for GRAPHIC 1.
To make the light pen as effective in drawing applications as the other devices, a good tracking program must support it. For the GRAPHIC 1, the
tracking program uses a conventional four-arm
cross.8 Tracking is on a 60-cps interrupt basis. Thus
tracking performance is constant despite variations
in the amount of display material. First order prediction is incorporated. Special testing allows the
center of the cross to be moved right up to the edge
of the display area on the scope with the corresponding arm shortening. Backed by this programming support, the light pen is equally suitable for
both pointing and drawing.
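The tracking loop can be sketched as follows. This is a simplification under stated assumptions: each 60-cps interrupt is reduced to a measured pen position derived from the seen arm points, and first-order prediction to adding the previous displacement; the clamping stands in for the arm-shortening behavior at the scope edge:

```python
SCOPE = 1023  # display area is 0..1023 in each axis (10-bit coordinates)

def track_step(center, pen, last_move):
    """One tracking interrupt at 60 cps (illustrative model).

    center:    current cross center (x, y)
    pen:       pen position measured from the seen arm points (x, y)
    last_move: displacement applied on the previous step, used for
               first-order prediction of where the pen is heading.
    Returns (new_center, move); the center is clamped so the cross
    can be moved right up to the edge of the display area.
    """
    move = (pen[0] - center[0], pen[1] - center[1])
    predicted = (center[0] + move[0] + last_move[0],
                 center[1] + move[1] + last_move[1])
    clamped = (min(max(predicted[0], 0), SCOPE),
               min(max(predicted[1], 0), SCOPE))
    return clamped, move
```

Because the update runs once per interrupt regardless of how much display material is on the scope, tracking performance stays constant, as the paper notes.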
In fact, the ease of use of the light pen for pointing operations has almost eliminated the originally
envisioned extensive use of the function keyboard
for entering various control routines. Instead, controls have almost exclusively been placed on "light
buttons" which are displayed on the scope face. A
light button is a word or figure on the scope face
which has a transfer vector associated with it.
When the light button is touched by the light pen,
the transfer vector is used to pass control to the appropriate routine.
Placing control functions on the scope face has
two advantages. First, only those controls which
should be present at a particular stage of a problem
are displayed. If a light button labeled "MOVE" is
one of those present, a user knows that he can move
picture parts around. Similarly, if only light buttons
in the form of PNP and NPN transistor symbols are
displayed, a user knows he must select a particular
type of transistor at that time. Thus, in effect, a
user is steered through a problem. Second, during
most operations there is only one center of attention, the scope face, on which a user need concentrate. This allows faster and smoother work on a
problem.
Any shape, no matter how complicated, can be a
light button since the shape has no effect on detection by a light pen. This is extremely important
since the shapes can be problem-oriented. The complicated zone bounding necessary when performing
a similar function using a RAND tablet or graphic
pencil need not be done. Light buttons can easily be
moved around on the scope face. Also, the transfer
vector for one light button can easily be transferred
to another shape. This mobility and flexibility combine with the previously mentioned advantages to
make light pen detected light buttons very powerful
control elements.
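The association between a displayed shape and its transfer vector, including the ease of moving a vector from one shape to another, can be sketched like this (class and routine names are ours, purely illustrative):

```python
class LightButton:
    """A displayed shape with an associated transfer vector (routine).
    The shape itself is irrelevant to detection: the pen only reports
    which displayed item it saw when the display stopped."""
    def __init__(self, shape, routine=None):
        self.shape = shape
        self.routine = routine

def pen_touch(button):
    """Pass control through the button's transfer vector, if any."""
    return button.routine() if button.routine else None

move = LightButton("MOVE", lambda: "moving parts")
pnp = LightButton("PNP transistor symbol")

# The transfer vector of one light button is easily
# transferred to another shape:
pnp.routine, move.routine = move.routine, None
```

After the swap, touching the PNP symbol invokes the routine formerly bound to MOVE, while MOVE becomes inert; this is the mobility and flexibility the paragraph above describes.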
BASIC PROGRAMMING SUPPORT
Software support for the GRAPHIC 1 console
system consists of three basic units. First there is
an assembler and assembly postprocessor which
generates, merges, and links relocatable programs
for the PDP-5. This assembler has been written
using MACRO-FAP and runs on the 7094.
Therefore, all the features of MACRO-FAP are
available for use when assembling programs. This is
particularly useful when generating higher level languages. The combined assembler-postprocessor
aids considerably in dealing with the page boundary
problems of the PDP-5. The second basic unit is a
display material assembler and postprocessor. This
also is written in MACRO-FAP and runs on the
7094. Symbolic linkage between PDP-5 code and
display code can be accomplished. The third major
software unit consists of communications routines
to allow the easy passing of information between
the 7094 and the GRAPHIC 1 console.
Using these three basic units, several higher level
graphical languages are being developed. Furthest
along of these is the GRIN (GRaphical INput) language. GRIN is particularly suitable for use in
problems requiring the extensive real-time manipulation of graphical information at the console. It
takes full advantage of the incremental display
structure of the scope. (Recall that there is only one
absolute plotting mode. The remainder are incremental.) Thus, if a display part is composed only of
a sequence of incremental words, its position on the
display scope can easily be changed by changing
only the initial absolute entry point. Also if a part
is represented only incrementally, it can be called
up using the display part subroutine linkage at
many places on the scope face. The part has to exist
in storage only once, however.
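The economy of incremental parts can be illustrated: the part is stored once as a sequence of (dx, dy) increments, and each instance on the scope face needs only its own absolute entry point. The representation below is our own, not the DEC display's actual word formats:

```python
def plot_part(entry_point, increments):
    """Expand one instance of an incremental display part.
    entry_point: the single absolute (x, y) plotting word.
    increments:  the shared sequence of (dx, dy) incremental words.
    """
    x, y = entry_point
    points = [(x, y)]
    for dx, dy in increments:
        x, y = x + dx, y + dy
        points.append((x, y))
    return points

# One copy of the part exists in storage ...
square = [(10, 0), (0, 10), (-10, 0), (0, -10)]
# ... but it can be called up at many places on the scope face:
a = plot_part((100, 100), square)
b = plot_part((300, 200), square)
```

Moving an instance means changing only its entry point; the shared increment list is untouched, which is exactly why repositioning is cheap.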
No hard copy facility is available directly in the
console system. An SC 4020 microfilm recorder is
attached to the 7094. So, hard copy of the current
display picture is achieved by transferring an image
of the display buffer contents to the 7094 where
translation programs map the scope code into comparable SC 4020 code to produce a microfilm print.
Turnaround time to receive a print is from two to
four hours.
USAGE
The major use of the console system so far has
been for problems requiring at some stage "dynamic
scratchpad" capabilities. Included in such problems
have been printed circuit component and wiring
placement, schematic circuit design, block or flow
diagram design, text composing and editing, and
placement of cards on a chassis to achieve minimum connecting wire length. Initial input or configurations to be manipulated can be provided by
programs in the IBM 7094. For example, a card
placement program to minimize connecting wiring
length gives a good initial placement. However,
such placements are subject to local minima and
frequently neglect maintenance considerations. An
operator at the console, when provided with appropriate displays, can break out of local minima
for reruns in the 7094 and can make placements for
easy maintenance. The photograph in Fig. 3 shows
a user working on such a placement problem.
The result of any console manipulations can serve
as input to a 7094 program also. As an example, text
in the form of a computer program which has been
composed at the console can be entered, assembled,
and run. A system block diagram composed at the
console can be used as input to a special compiler,
the compiled program can be run, and the results
viewed at the console. As a further example, a
schematic circuit composed at the console can be
used as input to a circuit analysis program and the
results of the program returned to the console. The
number of interesting uses of the console continues
to grow as the existence of the console becomes
known to people in varied fields of work.
SUMMARY
GRAPHIC 1 provides very flexible man-computer graphical communication facilities in a
time-shared central computer environment. Local
computing power and display maintenance capabilities make possible extensive real-time graphical
manipulations at the console. Interaction between
the console and the central computer permits attacks on large complex problems.
ACKNOWLEDGMENTS
During the period that this system was designed
and constructed many people have contributed useful ideas. Chief among these contributors were H. S.
McDonald, A. D. Hause, C. Christensen, and A. G.
Faulkner. The author wishes to express his appreciation to all these people.
REFERENCES
1. G. J. Culler and R. W. Huff, "Solution of Non-Linear Integral Equations Using On-Line Computer Control," S.J.C.C., Vol. 21, 1962.
2. I. E. Sutherland, "Sketchpad, A Man-Machine Graphical Communication System," S.J.C.C., Vol. 23, Spartan Books, Inc., Washington, D.C., 1963.
3. T. E. Johnson, "Sketchpad III, A Computer Program for Drawing in Three Dimensions," S.J.C.C., Vol. 23, Spartan Books, Inc., Washington, D.C., 1963.
4. B. Hargreaves, J. D. Joyce, G. L. Cole, et al., "Image Processing Hardware for a Man-Machine Graphical Communication System," F.J.C.C., Vol. 26, Spartan Books, Inc., Washington, D.C., 1964.
5. "PDP-5 Handbook," Digital Equipment Corp., Maynard, Mass. (1963).
6. "Type 340 Precision Incremental CRT Display," Digital Equipment Corp., Maynard, Mass. (1965).
7. M. R. Davis and T. O. Ellis, "The RAND Tablet: A Man-Machine Graphical Communication Device," F.J.C.C., Vol. 26, Spartan Books, Inc., Washington, D.C., 1964.
8. R. Stotz, "Man-Machine Console Facilities for Computer-Aided Design," S.J.C.C., Vol. 23, Spartan Books, Inc., Washington, D.C., 1963.
THE BEAM PEN: A NOVEL HIGH-SPEED, INPUT/OUTPUT DEVICE
FOR CATHODE-RAY-TUBE DISPLAY SYSTEMS*
Donald R. Haring
Electronic Systems Laboratory
Massachusetts Institute of Technology
Cambridge, Massachusetts
*This work was made possible through the sponsorship extended the M.I.T. Electronic Systems Laboratory by Project MAC, an M.I.T. research program sponsored by the Advanced Research Projects Agency, Department of Defense, under Office of Naval Research Contract Number NOnr-4102(01).

INTRODUCTION
In recent years there has been considerable interest in providing rapid communications between a man and a computer. A most useful communication system in this regard is a cathode ray tube (CRT) computer display and a hand-held "light pen"1 to report to the computer whenever a lighted spot on the CRT display falls within its small field of view. The pen may be used to identify or select a particular displayed item which the man wishes to call to the attention of the computer. Also, a feedback loop may be programmed to keep a displayed pattern of spots centered in the pen's field of view. With such a loop the displayed pattern follows the moving pen, and the system becomes a very versatile graphical input device. This process is known as "pen tracking."
A typical application of the light pen and CRT as a graphical input/output system has been discussed by Ward and Ross2 for "Computer-Aided Design" in general, and by Sutherland3 for his "Sketchpad" system which is used to communicate between man and computer by sketches. In the Sketchpad system, the user sketches such things as machined parts and electronic circuit diagrams on a computer CRT display with the light pen. Specifically, the light pen is used to draw displayed lines on the CRT by using pen tracking, to position predefined parts of the sketch on the display, and to point to such parts in order to change them. A set of push buttons operates in conjunction with the pen to indicate changes such as erasing or moving. Because of this hardware-software package, no typed statements are necessary to the computer (except for legends), hence greatly increasing the speed of man-machine communication. For additional discussions of light-pen applications see references 1-5.
Light pens detect the light produced on the CRT screen by a pulse of electron-beam current. The electron beam is under computer control. The light-pen output for a given pulse is determined by the combined time responses of the light-producing mechanism in the phosphor, the light-detecting mechanism in the light pen, and the pen amplifier. Each of these responses is characterized by a delay in buildup and a delay in decay, with the result that
the pen output corresponding to a single narrow
square pulse of beam current is a delayed and much
broadened pulse with a long tail. The net pen output for a given intensification pulse is the sum of
the response for that pulse and any residual tails
from previous pulses. Thus, there is an integrating
effect and it is necessary that sufficient time be allowed between pulses for almost complete decay if a
reliable resolution among points is to be maintained. In other words, the system has a limited
bandwidth.
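The integrating effect can be modeled by summing exponential tails from successive pulses. The time constant and pulse spacings below are illustrative choices, not measured pen parameters:

```python
import math

def pen_output(t, pulse_times, tau):
    """Summed pen response at time t: each past pulse contributes a
    decaying exponential tail exp(-(t - t0) / tau)."""
    return sum(math.exp(-(t - t0) / tau)
               for t0 in pulse_times if t0 <= t)

tau = 8.0                             # decay constant, microseconds (illustrative)
fast = [i * 1.5 for i in range(10)]   # points displayed every 1.5 usec
slow = [i * 30.0 for i in range(10)]  # points displayed every 30 usec

# At the 10th pulse, the fast schedule has piled up residue from the
# earlier pulses, while the slow schedule has decayed almost completely
# between points.
residue_fast = pen_output(fast[-1], fast[:-1], tau)
residue_slow = pen_output(slow[-1], slow[:-1], tau)
```

The residue under the fast schedule swamps the response to any single new pulse, which is the bandwidth limit the text describes; at the older 20-30 microsecond spacing the tails are negligible.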
In the past, typical CRT display systems used
with light pens plotted points at intervals of about
20-30 microseconds, and the combined decay time
of the pen response was such that these points could
be resolved in time. However, because of the desire
to display more complicated pictures flicker-free
with modern computer display systems, it is not
unusual to display points every 1-2 microseconds,
and it is found that it is impossible to distinguish
among points displayed this closely in time, even
with the fastest available light detectors, photomultiplier tubes. The present limit appears to be
about 8 microseconds.
The major limiting factor in further improvement
in light-pen speed is in the phosphor screen itself,
which must be of a persistent type in order to reduce display flicker. Fortunately, all persistent
phosphors are of the two-layer type, with a relatively fast phosphor layer which responds directly to
the electron beam, and a slow phosphor layer which
does not respond to the electron beam but is excited
by the light from the fast layer. The widely used P7
phosphor, for example, has decay times of 40 microseconds and 100 milliseconds for the fast and
slow phosphors, respectively. Since the light output
of the fast layer is much greater than that of the
slow layer, the fast layer essentially determines the
pen response.
The development of a new phosphor with a sufficiently fast exciting (fast) phosphor to improve the
light-pen response speed would entail considerable
cost and does not appear to be the best long-term
solution. Instead, we have taken a new approach to
the problem. We have developed a system to detect
the electron beam causing the screen light rather
than the light itself. This system is called the beampen system. This new approach to increasing the
speed of the CRT display system for man-machine
communication is described in this paper.
BEAM-PEN SYSTEMS: BASIC THEORY
Reference to Fig. 1 provides a qualitative understanding of the operation of a beam-pen system.
The CRT circuitry selects the area of the CRT
screen to be illuminated and, in the typical computer display, turns the electron beam on for a prescribed period of time after the desired screen area
has been reached, i.e., the selected deflection signals
have stabilized. The beam pen is a conducting
probe that is hand-held in front of and very near
the CRT screen. The conducting probe· is connected
to a high-input-impedance "pen amplifier." When
the pen is placed in front of the CRT screen, the
probe is capacitively coupled to the beam such that
as the distance (defined in Fig. 1) is decreased, the
beam-to-probe capacitance, and thus the signal detected by the probe, increases. One can use either
the analog distance-varying output of the pen amplifier or add a threshold element to produce a digital "seen-not-seen" output. We see then that the beam-pen system is similar to a radar system.

Figure 1. Simplified beam-pen system. (The figure shows the deflection signals, the beam pen held at beam-to-pen distance x from the screen, and the resulting analog and digital outputs.)
Let us now take a closer look at the system to
obtain a quantitative understanding of its operation. Figure 2 shows a simplified circuit model of
the beam-to-pen coupling. The CRT screen can be
represented by a parallel RC circuit. In non-aluminized tubes this time constant is in the order of 0.1 seconds.6 Hence, since we are concerned with beam pulses in the order of microseconds, the screen acts as an integrator. Similarly, the pen amplifier input can be represented as a parallel RC circuit. This impedance is a parameter of the design. Finally, the coupling between the beam and pen is clearly primarily capacitive.

Figure 2. Simplified circuit model of beam-to-pen coupling. (The model comprises the electron-beam current source, the beam-to-pen capacitance C, the CRT screen impedance, and the pen amplifier input impedance, across which the pen voltage ep appears.)
The geometry of the CRT screen controlling the
beam-to-pen capacitance is illustrated in Fig. 3.
Here the electron beam is represented by a stream
of electrons impinging upon the CRT phosphor.
Upon striking the phosphor, the incident electrons create some secondary electrons which leave the struck area of the phosphor, but since the phosphor is operated at secondary emission ratios of less than one, the net charge on the screen at the struck area is negative. The pen is represented by a conducting probe surrounded by a shield and is shown displaced a distance x along the safety plastic from the center of the electron beam. Typically, the diameter of the electron beam (d) is in the order of 5 to 10 mils, the distance (ro) from where the beam strikes the phosphor to the surface of the safety plastic is in the order of 1/2 inch, and the probes used are approximately 1/10 inch in diameter. From
Figure 3. Cathode ray tube screen details. (Shown are the CRT face plate, the conducting probe, the incident electron beam, and the electrons deposited on the screen together with the positive charges caused by secondary electrons.)
these values it is clear that even the maximum value
of the beam-to-pen coupling capacitance C is extremely small. For example, using a parallel-plate
capacitor model shows that this capacitance is less
than 1/10 of a picofarad. Hence, the input impedance of the pen amplifier must be high to obtain a
reasonable signal from the beam.
From the circuit model in Fig. 2 and the above
observations, the pen voltage ep is approximately
equal to C(jω Zs Zp)ib. Furthermore, since Zs has
such a large time constant, i.e., Zs ≈ 1/jω, to a
close approximation, ep ≈ kC Zp ib, where k is a
constant. Hence as we observed previously, the pen
voltage is directly proportional to the coupling capacitance and the pen amplifier input impedance.
Because the distance r in Fig. 3 is so much larger
than the areas of the beam and pen, a parallel-plate
capacitor model for the beam-to-pen capacity is
poor. Unfortunately, a more exact capacitor model
results in very difficult field equations. Hence, we
have used experimental data to derive the following
polynomial approximation to the variation of capacitance with pen displacement along the safety plastic
when x ≤ 4ro, where x and ro are defined in Fig. 3:

C = Cmax [ 1.6/√(1 + a²) - 0.46/(1 + a²) - 0.14/(1 + a⁴) ]     (1)

where a = x/ro. We are not interested in distances x > 4ro. Clearly, the pen voltage is the same function of x. As a matter of fact, the data used to derive (1) was obtained by measuring the pen voltage in
a system with the following properties:
CRT = 16ADP7 (non-aluminized)
Beam diameter = 0.005 inch
ro = 0.5 inch
Probe diameter = 0.085 inch
Shield diameter = 0.5 inch
Variations in these parameters influence the coefficients in (1). For a discussion of these effects see
reference 7.
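Equation (1) is easy to check numerically: at a = 0 the bracket sums to 1.6 - 0.46 - 0.14 = 1.0, so C = Cmax at the beam center, and the response falls off monotonically with displacement. A sketch, with Cmax normalized to 1:

```python
def coupling(a):
    """Normalized beam-to-pen capacitance C/Cmax of Eq. (1),
    where a = x / ro; the fit is valid for x <= 4*ro."""
    return (1.6 / (1 + a * a) ** 0.5
            - 0.46 / (1 + a * a)
            - 0.14 / (1 + a ** 4))

# Response profile as the pen is displaced from the beam center:
profile = [round(coupling(k / 2), 3) for k in range(9)]  # a = 0, 0.5, ... 4
```

The monotone falloff is what matters for resolution: the steeper the curve near the threshold operating point, the smaller the pen's effective field of view.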
Figure 4 is a normalized plot of (1). The shape
of this response curve is of interest in resolution
considerations. In other words, one important property of a man-machine communication system is to
designate specific items in a CRT display using a
digital seen-not-seen signal. The higher the resolution (i.e., the smaller the field of view) of the
beam pen, the smaller the distance between two different items that can be independently designated.
The digital signal from a beam pen is derived from
a threshold detector attached to the analog output.
Because a threshold detector has a finite "dead
band" for which its decision is not predictable, to
insure a stable resolution the pen analog output
should change as rapidly as possible with a change
in displacement, and be as noise-free as possible at
the value of the output to which the threshold detector is adjusted.
Since ro has a marked effect on the shape of the
beam pen response as shown in Fig. 4, it is important
to operate the pen as close to the CRT screen as
possible. The minimum value of ro is determined by
geometry that, for the most part, is not under control of the designer. Hence, this distance is a major
limitation of the beam-pen system.
RESPONSE
x
BEAM CENTER
Figure 4. Normalized plot of Eq. (1 )-ro = Y2 inch.
There are several noise sources in the beam-pen
system. The primary sources are: the electron beam,
the CRT phosphor, the CRT high voltage supply,
the computer, the display system, the pen amplifier.
Since signals in the order of tens of microvolts have
been measured at the screen, the noise contributions
of a well-designed pen amplifier are small compared
to this signal. Because the noise in the CRT high
voltage supply, the computer and the display system
are principally centered about d.c., the contributions from these noise sources can be greatly reduced by modulating the unblanking pulse with a high-frequency signal and using an appropriate high-frequency band-pass pen amplifier followed by a
detector. That is, in Fig. 5 we show the usual
computer-generated unblanking pulse. This pulse is
modulated by a high-frequency sine wave, also
shown in Fig. 5, as is the resulting beam charge. Figure
5 also shows the corresponding phosphor charge,
viz, approximately the integral of the beam charge.
Figure 5. Charge variations. (Shown are the computer-generated unblanking pulse, the beam charge, and the phosphor charge.)
Using a modulation system, the remaining primary noise sources are the electron beam and the
CRT phosphor. There is reason to believe that both
of these sources produce both wanted signal and
noise. Hence both can be viewed as noise in a communication or radar channel. Thus, methods used in
communications and radar can be used to increase
the signal to noise ratio. As yet, little has been
done in this direction. One approach presently
being studied experimentally is to use a synchronous detector rather than an envelope detector.
The beam pen operates because of local charge
variations on the CRT phosphor. Any modification
to the phosphor that eliminates or modifies these
charge variations will render the beam pen inoperative. Hence, a beam pen will not operate with an
aluminized-screen CRT because the aluminization
greatly reduces the resistivity of the phosphor, thus
essentially eliminating local charge variations.
When a CRT screen is not aluminized, placing foreign objects at the outside surface of the screen can
modify the charge distribution. A way to eliminate
this effect is to cover the outside of the CRT or the
safety plastic with a static reducing agent, such as
Statnul, manufactured by Daystrom, Inc.
THE ESL BEAM-PEN SYSTEM
This section describes the beam-pen system built
at the Massachusetts Institute of Technology, Electronic Systems Laboratory (ESL) for use with a
display console on the M.I.T. multi-access computer
(Project MAC). This display is an incremental
type in which output pictures are composed of discrete dots displayed in the following manner: The
CRT beam is turned on for a duration of 0.5 microseconds to display a point, turned off, relocated,
and turned on for the next point. The beam never
moves while turned on. The 0.5 microsecond duration that the beam is turned on is hereafter referred
to as the display time of the unit. Points are displayed at a maximum rate of one every 1.5 microseconds. The display is manufactured by Digital
Equipment Corporation (DEC) and is similar to
their Type 330. A Type 16ADP7 non-aluminized
CRT is used.
A block diagram of the ESL beam-pen system is
shown in Fig. 6. A frequency of 10 megacycles was
chosen as the modulation frequency for this unit
because of the availability of a commercial amplifier
at this frequency. The lower frequency limit for
such a system is imposed by the system bandwidth
requirement which in turn is dictated by the display
time. The upper frequency limit for such a system
is imposed by stray capacitances and radiation.
Figure 6. The ESL beam-pen system. (Block diagram; the beam pen uses a Nuvistor preamplifier.)
As seen in Fig. 6, the ESL beam-pen system consists of the following units:
1. A pulse modulator to intensity modulate
(approximately 100 percent) the CRT beam
with a 10-megacycle signal.
2. The beam pen itself, a hand-held, high-input-impedance pickup consisting of a
shielded conductive probe and a preamplifier to pick up the electron-beam signal.
3. A broad band-pass 10-megacycle amplifier
to amplify the beam-pen output.
4. An envelope detector to remove the 10-megacycle carrier.
5. A pulse amplifier to drive the threshold
detector and produce signals compatible
with DEC signals.
6. A threshold detector to produce a digital
output.
Each unit is now briefly described. First, the pulse modulator. In the present system, this modulator is simply a high-Q parallel resonant circuit
having a normally turned-on switching transistor
connected in series with a d.c. power source. When
the computer generates an unblanking pulse, this
switching transistor is turned off, rapidly changing
the current through the high-Q circuit and resulting
in a 10-megacycle "ringing" for the duration of the
unblanking pulse. This signal is coupled to the CRT
first grid through an emitter follower and the amplitude is adjusted to cause about 100 percent modulation. (For the 16ADP7 this is approximately 15
volts peak to peak.) Hence there is a 10-megacycle
signal present only when the beam is turned on.
A photograph of the present beam pen is shown
in Fig. 7. To produce a stable high input impedance
a Nuvistor cathode follower is used. The probe
characteristics are those specified in the previous
section to derive Eq. (1). Other pertinent characteristics are: input capacitance = 7.25 picofarads,
input resistance = 2.0 megohms, and voltage gain
with 50-ohm load = 0.25.
The band-pass amplifier is a Type BV 1002 manufactured by RHG Electronic Laboratory, Inc. Pertinent characteristics are: center frequency = 10
megacycles, 3 db bandwidth = 2.1 megacycles, noise
figure = 2.1 db, maximum power gain = 97 db,
maximum voltage gain = 106 db, and maximum
voltage output = 40 volts. The amplifier contains
an envelope detector and a linear pulse amplifier.
The latter drives a simple external pulse amplifier
to make the signal levels compatible with DEC signal
levels and to provide a threshold adjustment. The
RHG amplifier also has a manual gain control for
the 10-megacycle section and an AGC for this section. The manual gain control is adjusted to prevent
overload at any time, and the AGC is disabled to
prevent integrating effects in the AGC circuit.
The threshold detector is simply a DEC Schmitt
trigger and it drives a flip-flop in the display system. This flip-flop is under program control. For
additional details of this system see reference 7.
PERFORMANCE OF A BEAM-PEN SYSTEM
Figure 8 shows the analog response of a DEC
Type 370 fiber-optic, photomultiplier-tube lightpen system to a sequence of displayed points occur-
Figure 7. The beam pen.
Figure 8. Light-pen response. (Unblanking pulses: width = 0.5 microsec., period = 3.5 microsec.)
ring at a display rate of 1 point every 3.5 microseconds. This light pen is presently the fastest commercial light pen available. The first point displayed
is on the left-hand side of the picture. As can be
seen, the pen response to this first point is greater
than that to the second point, which is in turn
greater than that to the third point, etc. Although
not shown in this figure, after a sufficient number
of points essentially no response is obtained. This
figure clearly demonstrates the deficiency of a light-pen system for high-speed operation: the inability
to discriminate between points occurring at high
display rates. For reference, the Type 370 light pen
can operate reliably, by proper signal processing, at
a display rate of one point every 8 microseconds or
more.
Figure 9 shows the analog response of the ESL
beam-pen system described in the previous section.
The display rate is identical for this photograph and for the one in Fig. 8. The superior performance of
the beam-pen system as regards the speed of response is clearly evident. For reference, this beampen system has been tested at a display rate of 1
point every 1.5 microseconds and it still was not
speed-limited. Of course, it operated equally well at
slower display rates.
On the other hand, the light pen has a better resolution than the beam pen. The particular ESL
beam-pen system described in the previous section
Figure 9. Beam-pen response. (Unblanking pulses: width = 0.5 microsec., period = 3.5 microsec.)
has a reliable field of view of 1.5 inches in diameter. This is approximately a factor of 10 larger
than that of the DEC 370 pen. The primary cause
of the relatively poorer resolution of the beam pen
is that the phenomenon by which it works is a
slower varying function of distance from the displayed point than is the light by which the light pen
works. This is evident from the plot of (1) in Fig.
4 and the realization that optical signal processing
can be used on the light pen. A secondary cause of
this poorer resolution is that the noise level in the
present system is higher than it should be. This latter factor is presently being studied.
The poorer resolution of the beam pen does not
markedly affect its performance in respect to the
light pen when "pen tracking" is being used. The
poorer resolution does show up of course when one
desires to point to one of several objects that are
close together.
Another factor to consider in the beam-pen system is the delay between the occurrence of the electron beam and the output of the system. This is
caused by the band-pass pen amplifier. From Fig. 9
we see that the delay is in the order of 1 microsecond. For many applications (e.g., the ESL Display Console) this delay is no problem. If the delay
cannot be tolerated, then a wider band-pass amplifier
would reduce this delay with some loss in signal-tonoise ratio.
For more details of the performance of the ESL
Beam-Pen System and other considerations, see reference 7.
CONCLUSIONS
From experimental evidence we conclude that it is possible to make a system to detect when and where the electron beam of a CRT strikes the screen, thus essentially eliminating the bandwidth-limiting effects of the CRT phosphor and making a high-speed man-machine communications system possible. The system that is described here is presently being used in an operating environment. It is superior to the light-pen system as regards speed, and is relatively simple, inexpensive, and insensitive to the background light of the operating room.
There are also several possible extensions of the system. For example, by modulating the electron beam with several different selectable frequencies, one could "color" or "classify" various displayed points on the CRT screen. The present disadvantage of poorer resolution than the light pen does not appear to be insurmountable and is presently being studied.
ACKNOWLEDGMENTS
Thanks are due to Mr. J. E. Ward, Assistant Director of the M.I.T. Electronic Systems Laboratory,
and Mr. R. H. Stotz, also from the Laboratory, for
their continued interest in this work, and to Mr. P.
G. Arnold, formerly from the Laboratory, for his
contributions to the development of the hardware.

*This work was made possible through the
sponsorship extended the M.I.T. Electronic Systems
Laboratory by Project MAC, an M.I.T. research
program sponsored by the Advanced Research Projects Agency, Department of Defense, under Office
of Naval Research Contract Number NOnr-4102(01).
REFERENCES
1. B. M. Gurley and C. E. Woodward, "Light-Pen Links Computer to Operator," Electronics,
Nov. 20, 1959, pp. 85-87.
2. J. E. Ward and D. T. Ross, "Investigations in
Computer-Aided Design," M.I.T. Electronic Systems Laboratory Report 8436-IR-1, chaps. 8 and 9 (May 30,
1960).
3. I. E. Sutherland, "Sketchpad: A Man-Machine
Graphical Communication System," M.I.T. Lincoln
Laboratory Technical Report 296 (Jan. 30, 1963).
4. T. E. Johnson, "Sketchpad III: Three-Dimensional Graphical Communication with a Digital
Computer," M.I.T. Electronic Systems Laboratory
Report ESL-TM-173 (May 1963).
5. J. C. R. Licklider and W. E. Clark, "On-Line
Man-Computer Communication," Proc. AFIPS
Spring Joint Computer Conference, vol. 21, pp.
113-135 (1962).
6. F. C. Williams and T. Kilburn, "A Storage System for
Use With Binary Digital Computers," Proc. IEE,
vol. 96, pt. 2, no. 81, pp. 183-200 (Mar. 1949).
7. D. R. Haring and P. G. Arnold, "The Beam
Pen: A New Approach to High-Speed Light Pens,"
M.I.T. ESL Report (in preparation).
VOICE OUTPUT FROM IBM SYSTEM/360
A. B. Urquhart
Systems Development Division
International Business Machines Corporation
Kingston, New York
INTRODUCTION

The IBM 7770 and 7772 and elements of the
New York Stock Exchange Market Data System
form a family of IBM devices providing voice output facilities. The devices function similarly in that
each gives a computer-generated voice response to a
dialed inquiry. The audio generation principle incorporated in the 7770 is a derivative of the original Voice Answer Back principle, that of adjusting
word length to fit machine time-slots. Then, by access to these words from many input lines, sentences are formed into a voice response. The 7772,
on the other hand, generates audio on the "vocoder" principle, that of energizing tone filters and
combining the output result first to form words and
then sentences in a manner similar to that of the
7770.

Input to the New York Stock Exchange Market
Data System was accomplished through the use of
the IBM 7750 programmed transmission control. In
contrast, the 7770 and 7772 are self-contained input/output devices designed for widely diversified
applications requiring various types and lengths of
inquiries and responses in such industries as banking, insurance, manufacturing, and retailing.

They are, therefore, modular in increments of
numbers of "lines" and "words." A "line" is
defined as a half-duplex communication channel to
which more than one telephone may be connected,
but where only one transaction takes place at one
time. In the case of the 7770 and 7772, a line is
used for transmission of digital information in the
input direction and voice in the output direction. A
"word" is a unit of vocabulary which, for speech-processing purposes, may be in either analog or
digital form.

There are differences in the number of lines and
words available on each device and in the method
of generating voice output. There are three models
of the 7770 and one of the 7772. The difference
among the three models of the 7770 is in the method
of attachment to host processors. Table 1 shows
how many lines and words are available on each
and the type of processors to which each device can
be attached.

Both units are available in languages other than
English, but because of the many variations in vocabulary and methods of attachment to communication equipment on a worldwide basis, this discussion is limited to the 7770 Model 3 and the
7772 as attachable to IBM System/360 within the
continental United States. The external appearance
of the two machines is similar because the same
Table 1.

          No. of Lines            No. of Words              Processor
7770-1    4 to 48                 32 to 128                 IBM 1401, 1440, 1460
          (in increments of 4)    (in increments of 16)
7770-2    4 to 48                 32 to 128                 IBM 1410, 7010
          (in increments of 4)    (in increments of 16)
7770-3    4 to 48                 32 to 128                 IBM System/360
          (in increments of 4)    (in increments of 16)     Mod. 30, 40, or 50
7772      2 to 8                  Any amount from the       IBM System/360
          (in increments of 2)    available vocabulary      Mod. 30, 40, or 50
                                  list; limited by
                                  available storage.
frame with similar means of connection to communication facilities and host processors is used for
both. (See Fig. 1.) In function, the 7770 and 7772
are the same; the difference between the two is in
the method of generating the voice output, as will be
described later in this article.
There are three basic sections to each device: input, output, and control. The input and output
sections connect to the common-carrier communications network. The control section connects the input and output sections to the host processor. A
transaction takes place in the following way.
Using the telephone, an inquirer first dials the
telephone number allotted to a 7770 or 7772. When
the ringing stops, a tone is heard, indicating
that the call has been answered. The inquirer now
dials his input message. The 7770 or 7772 forwards
this message, a character at a time, to the attached
computer, which processes the input data and returns a digital output message. The message is converted to audio and heard by the inquirer.
METHODS OF ATTACHING
COMMUNICATIONS EQUIPMENT
The design of the communications interface required consideration of the various uses of a telephone in a machine environment. The telephone, in
this case, was the prime input and output device,
involving dialed digits as input and audio as output.
Some of the first design problems affecting the
telephone as an inquiry terminal involved consideration of the human element. How does the inquirer
react to a telephone that he knows is connected to a
machine? If the inquirer receives no reply, does he
hang up? If so, right away? In 20 seconds? In 2
minutes?

Figure 1. An audio response unit in background.

It seemed that most human-factor problems fell in the category of "normal telephone practice" and that the real problem was not what the
inquirer would do but how the 7770 or 7772 would
react to questions like these:
• What does it do if the inquirer misdials?
• How will it recognize end-of-inquiry?
• What codes will be presented to it?
• How much error-checking can reasonably be done?
• Should it accept d-c dial pulses or tones?
• Should it accept serial and/or parallel data?
In addition, there were questions concerning audio
output over the communications interface.
• Could any data sets be modified to transmit audio?
• Could this be done on balanced lines? On unbalanced lines?
• What should be the level at which audio is transmitted?
• What happens if an inquirer dials in while audio is being transmitted?
The answer to most of these questions lay in designing the 7770/7772 "front end" to fit a parallel
communications data set interface, namely, that of
the AT&T 400-series data sets or their equivalent.
These data sets are serial-by-character, parallel-by-bit data transmitters and receivers which are capable of handling numeric and/or alphanumeric data.
In addition, one receiver type has since been
modified to allow audio transmission in the output
direction. One data set receiver (equivalent to the
Western Electric 403A or 40113) must be used per
7770 or 7772 line, as shown in Figs. 2 and 3, respectively.
Common to all of the connections in Fig. 2 is the
use of pushbuttons for inquiry. Entering digits in
this manner is faster and more reliable and is gaining
in popularity, but there are still two basic types of
telephones that can be used: the rotary dial telephone and the pushbutton telephone. If a rotary
dial telephone is used, a pushbutton attachment is
usually added. An inquirer would first dial the system number with the rotary dial telephone and enter the inquiry with the pushbutton attachment. If
the inquirer uses a pushbutton telephone, as shown
at Fig. 2D, he dials the number and enters the inquiry using the same telephone.
Figure 2. Telephone equipment appropriate for inquiry.
Other inquiry terminals shown in Fig. 2 are variations of either the basic rotary dial or the pushbutton type of telephone. These are the card or RAPIDIAL* types of telephones. In the card-dial telephone, a number that is frequently used can be
punched in a card, which is then used to enter the
inquiry.
Since inquiry from a telephone is restricted to
ten digits, there are some applications where it is
desirable to have alphabetic as well as numeric inquiry. In those cases a terminal such as the IBM
1001, shown in Fig. 3, can be used. Numeric and/or
alphabetic data can be entered from the 1001 using
a punched card. When an IBM 1001 inquiry terminal is used, a data set transmitter (401E3) per line
is required in addition to the data set receiver per
line for inquiry. In a method similar to the direct
telephone attachment shown in Fig. 2, the initial
number is dialed using either a card or a RAPIDIAL* telephone.
Figure 3. IBM 1001-type inquiry or equivalent.
AUDIO OUTPUT
On the 7770, vocabulary is stored on a drum,
similar in form to storage of words on a tape recorder. Words are stored around the circumference of
the drum surface on tracks. The drum has 128
tracks; it is 4 inches in diameter, 10 inches long,
and rotates at 120 revolutions per minute. Each
drum track has an associated read head and amplifier for retrieving the recorded word impulses.
The rotational speed of the drum dictates that the
information per track must fall within a 500-millisecond time period. A process has been developed
in IBM to compress words or segments of speech
into 500-millisecond time-slots. Words having a
time duration greater than 500 milliseconds are
placed in two or more time-slots or tracks.
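The drum geometry fixes the time-slot length. A quick check of the arithmetic, as a sketch using only the figures quoted above:

```python
# Drum timing for the 7770 vocabulary store, from the figures above.
RPM = 120      # drum speed, revolutions per minute
TRACKS = 128   # one read head and amplifier per track

slot_ms = 60_000 / RPM   # one revolution per slot: 500 ms per track

def slots_needed(word_ms, slot_ms=500):
    """Number of 500-ms time-slots (tracks) a compressed word occupies."""
    return -(-word_ms // slot_ms)   # ceiling division

# A word longer than 500 ms spills into a second time-slot/track,
# which is why speech is compressed to fit the slots where possible.
```

At 120 rpm the 128 tracks together hold about 64 seconds of audio, which bounds the per-application vocabulary discussed later.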
The vocabulary is first generated in the following
way. An elocutionist speaks the vocabulary words
onto a tape recorder. This tape is then digitized
through an analog-to-digital encoder, the output of
which is edited and processed by a computer program to fit the words into 500-millisecond time-slots; these digital time-slotted words are stored on
tape. When a specific vocabulary is required, words are converted to analog form and
placed on the drum at the specific track locations
required by the application. This is normally done
once for each application. Vocabulary modification
is accomplished by removal of the recorded drum
cylinder and its replacement with another cylinder
having a different vocabulary.

*Registered trademark of McGraw-Edison Co.
The way the vocabulary is accessed is depicted in
Fig. 4, which shows a functional diagram of the
drum and the associated analog circuits. For each
application, the processor has a table of addresses
corresponding to vocabulary words. Upon analyzing
an input message, the processor formulates the required output message, which is transmitted to the
7770. This digital output message consists of a series of drum track addresses preceded by a line address. Each track address conditions a specific word
analog gate, allowing a word to be gated onto the
Pulse Amplitude Modulation bus (PAM bus).
From this bus, the word is gated through any message analog gate conditioned by a specific line address. This allows each word to be transmitted to
any line and simply represents time-division multiplexing of the analog word signal. Since this leaves
the audio in a rather chopped-up state, the signal
passes through a reconstruction filter before being
transmitted to the output line. As long as a relatively high sampling frequency (e.g., 12 kc) is maintained, no appreciable degradation in audio quality
is noticed. It must be remembered that the audio
output is in the 200 to 4000 cycle-per-second range.
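The gating path just described (a track address opens a word analog gate onto the PAM bus; a line address opens a message analog gate off it) amounts to time-division multiplexing. A minimal sketch, with illustrative names and data shapes rather than the 7770's actual hardware paths:

```python
# Time-division multiplexing over a single shared PAM bus: each
# (line address, track address) pair in an output message gates one
# word's samples from the drum onto the bus and off to one line.

def pam_route(message, drum):
    """message: list of (line_addr, track_addr) pairs.
    drum: dict mapping track_addr -> list of word samples.
    Returns dict mapping line_addr -> samples received on that line."""
    lines = {}
    for line_addr, track_addr in message:
        for sample in drum[track_addr]:
            bus = sample                                  # word gate -> PAM bus
            lines.setdefault(line_addr, []).append(bus)   # message gate -> line
    return lines

# The 12-kc sampling rate comfortably exceeds the Nyquist rate for
# 200-4000 cps audio, which is why the reconstruction filter can
# smooth the chopped-up bus signal without audible degradation.
assert 12_000 >= 2 * 4_000
```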
For the 7772, the method of recording vocabulary is similar to that for the 7770; however, the
processing phase is different. The speech on tape is
converted to digital data through an analog-to-digital encoder. The processing of vocabulary is then
accomplished by band-compressing each word, thereby limiting the number of digits per word. The result is a stream of digital data which
is stored on cards or tape for later transfer to a disk
file or similar random-access storage device within
the system. This digital data, called Digitally Coded
Voice (DCV), consists of sequential aggregate and
excitation functions. An aggregate function is 45
bits in length and represents a portion of the analog
signal. The excitation function is 8 bits in length
and acts as a counter determining the length of time
an aggregate function should be used for a specific
segment of analog signal.

Figure 4. PAM bus and related gating in IBM 7770.

The sequential combination of these aggregate and excitation functions
makes the DCV representation of a particular word
of vocabulary.
Approximately 300 bytes of data (DCV) are required per second of audio. The word "balance,"
for example, would be stored on a disk file as approximately 75 bytes of digital data (600 bits).
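The storage figures can be checked directly. A sketch; the 300-bytes-per-second figure is the approximation given above, and 8-bit bytes are assumed:

```python
# DCV storage arithmetic: about 300 bytes of Digitally Coded Voice
# per second of audio.
DCV_BYTES_PER_SECOND = 300

def dcv_bytes(seconds_of_audio):
    """Approximate disk bytes needed for a stretch of audio."""
    return round(DCV_BYTES_PER_SECOND * seconds_of_audio)

# "balance" at ~75 bytes implies roughly a quarter second of
# band-compressed audio, i.e. 75 * 8 = 600 bits.
```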
Vocabulary in the 7772 is not stored on a drum
in analog form but is stored within the system in
bit form on a disk file. The CPU reads the required
vocabulary words into its own core storage from the
file. The information from the file is transferred
through the multiplexer channel to the 7772 control
unit which, in turn, transmits the data to a Voice
Code Translator (VCT); the VCT then converts the
data from digital to analog form. One Voice Code
Translator is shared by every two input lines.
The Voice Code Translator consists basically of
a set of 15 filters, each with suitable energizers, covering the voice frequency range of approximately 200 to 3,700 cycles per second. Each filter covers
a specific portion of the voice band. For example,
filter No. 1 covers the 200 to 300 cps range, whereas filter No. 15 covers the 3,150 to 3,700 cps
range. An audio output is obtained when a combination of these filters is, in effect, energized by a
pattern of input data.
A functional diagram of the VCT is shown in Fig. 5.

Figure 5. Voice code translator in the IBM 7772 audio response unit.

An aggregate function representing a particular portion of a word is placed in the aggregate
function register with its corresponding excitation
function in the excitation function register. The DCV
data (aggregate function) is converted to analog
form and gated into the band filters, thereby energizing each band filter to a level and frequency
determined by the format of the aggregate function.
The output from the band filters is integrated, reconstructed, amplified, and transmitted to a telephone
line as audio. The length of time that each aggregate
function is used is determined by the excitation
function, which in turn is dependent on the dynamic
range of the analog signal. For example, a constant
tone would be signified by using an aggregate function having a corresponding excitation function with
a large count.
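The aggregate/excitation pairing amounts to run-length expansion of filter settings. The text gives a 45-bit aggregate and 15 band filters; the even 3-bits-per-filter split below is an assumption made for illustration, not the documented format:

```python
# Hypothetical VCT decoding sketch: each 45-bit aggregate function is
# read as 15 three-bit filter levels (an assumed layout), and each
# 8-bit excitation function is a repeat count telling how long those
# levels are held.

def filter_levels(aggregate):
    """Split a 45-bit aggregate into 15 levels, one per band filter."""
    assert 0 <= aggregate < 2 ** 45
    return [(aggregate >> (3 * i)) & 0b111 for i in range(15)]

def expand_dcv(dcv):
    """dcv: list of (aggregate, excitation) pairs -> list of frames.
    A constant tone uses one aggregate with a large excitation count."""
    frames = []
    for aggregate, excitation in dcv:
        frames.extend([filter_levels(aggregate)] * excitation)
    return frames
```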
Vocabulary
Since the 7770 vocabulary is stored within the
device on a drum, the number of words per application is limited to the number of tracks on the drum.
(See Table 1.) The words are chosen from an available vocabulary list. Special words, proper names,
dialects, etc., can be obtained on a charge basis. A user may change his vocabulary by requesting
that the drum rotor in the machine be rerecorded
with new words of his choice.
There is no storage of vocabulary within the
7772 unit; instead, the words are stored on available random-access devices within the system to
which the 7772 is attached. Because the words are
in digital form, they require about 2400 bits of
storage for each second of speech. If an average
speaking rate of 180 words per minute is considered, this means approximately three words per
2400 bits of storage. There is, therefore, only the
system limitation of available storage restricting the
size of vocabulary per application for the 7772.
Changing the vocabulary is simple, since it requires
only the reading of new words from cards or tape
into storage. A list of words is also available for the
7772, with special words or dialects again offered on
a charge basis.
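The three-words-per-2400-bits figure follows directly from the quoted rates. A quick sizing sketch:

```python
# 7772 vocabulary sizing from the quoted figures: ~2400 bits of DCV
# per second of speech, at an average speaking rate of 180 words
# per minute (one-third of a second per word).
BITS_PER_SECOND = 2400
WORDS_PER_MINUTE = 180

def vocabulary_bits(n_words):
    """Approximate disk storage for n vocabulary words at the average rate."""
    seconds_per_word = 60 / WORDS_PER_MINUTE
    return round(n_words * seconds_per_word * BITS_PER_SECOND)
```

Three words come to 2400 bits, so vocabulary size is bounded only by disk storage, not by the device.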
DATA FLOW
Both the 7770 Model 3 and the 7772 connect to
the common-carrier data set receivers on one side
and, on the other, to the multiplexer channel of an
IBM System/360 processor. Input information is
entered via a telephone or similar terminal through
a common-carrier data set to the 7770 or 7772 input section. It is then converted to data interpretable by the processor and is forwarded, a character
at a time, to the processor. The multiplexer channel
has the capability of interleaving operations from
many low-speed input/output devices and thus provides a high degree of I/O efficiency and adaptability.
The multiplexer channel operates asynchronously with the central processing unit and contains several subchannels. One subchannel is used per I/O
line. Data is transmitted between the subchannel and the line in parallel 8-bit-byte form. Data
operations between the 7770 and the multiplexer
channel are controlled by a sequence of commands
which, in turn, are controlled by the Operating System/360 control program. The functional block diagrams of a basic 4-line 7770 Model 3 and a 2-line
7772 are shown in Figs. 6 and 7, respectively.
To understand the operation in more detail, it is
first necessary to define some basic commands used
in IBM System/360 for communication devices.
ENABLE - is used to condition a line for accepting or maintaining a call.
DISABLE - is used to condition a line for terminating a call.
READ - is used when information is being transmitted from the 7770 to CPU storage.
WRITE - is used when information is being transmitted from CPU storage to the 7770.
Figure 6. Functional block diagram of a 4-line IBM 7770-3
audio response system.
Figure 7. Functional block diagram of a 2-line IBM 7772
audio response system.
All of the above commands are under the control of
the IBM System/360 Operating System Control
Program. The flow chart (Fig. 8) describes the operation.
A transaction is initiated by removing the telephone receiver and dialing the machine number. If
the number is not busy and the line has been previously ENABLED, an inquirer hears a tone that is
2 to 5 seconds in length, indicating that the call has
been answered. The inquirer then dials an inquiry
character and the 7770 or 7772 requests service from
the multiplexer channel. If the caller does not hang
up and the character is accepted, it is transferred to
processor core storage. This operation continues
until the caller stops entering characters (as is indicated by a time-out of 5 seconds or by the caller's
depression of an "end of message" key). If this
time-out occurs, the device indicates "end of message" to the multiplexer channel which interrupts
the processor. The processor will analyze the completed inquiry and formulate the response message
in the form of a sequence of drum addresses
(7770) or "DCV" (7772). This sequence of drum
addresses or DCV is preceded by the address of the
line requiring the response. The multiplexer channel
uses this address to identify the line requiring the
response. Each time the multiplexer channel sends
an output message to the 7770 or 7772, an audio
word or portion thereof is gated onto the respective
line requiring it.
This is repeated until the channel control word
(CCW) count is zero, indicating the last word. If
conversation mode is not indicated, the line is then
disconnected by issuance of a DISABLE command
and then re-ENABLED for a new inquiry. If conversation mode is indicated, there will be either a
READ command, indicating more information from
input, or a WRITE command, indicating that an additional
response will be transmitted by the processor. Conversation mode is a means by which a caller can
effectively conduct a conversation with the computer under control of the operating program.
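The command sequence above can be sketched as a small state machine. The transitions paraphrase the text; the function and its log format are illustrative, not Operating System/360 interfaces:

```python
# Sketch of a line's command flow: ENABLE the line, READ an inquiry,
# WRITE the response, then either stay connected (conversation mode)
# for further READ/WRITE pairs or DISABLE and re-ENABLE for a new call.

def run_line(inquiries, conversation=False):
    """inquiries: list of dialed inquiry strings. Returns the command log."""
    log = ["ENABLE"]
    for i, inquiry in enumerate(inquiries):
        log.append(f"READ {inquiry}")   # characters forwarded to CPU storage
        log.append("WRITE response")    # drum addresses (7770) or DCV (7772)
        last = (i == len(inquiries) - 1)
        if conversation and not last:
            continue                    # stay connected: more READ/WRITE pairs
        log += ["DISABLE", "ENABLE"]    # drop the call, await a new inquiry
    return log
```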
APPLICATIONS
The choice of either a 7770 or a 7772 Audio Response Unit depends entirely on the application.
The number of calls per day, the length of the output message, and the size of and variations in vocabulary are all factors that must be considered. The
7770 has greater throughput, offering service to
more lines than the 7772, but has the limitation of
approximately 128 vocabulary words per application. The 7772, on the other hand, is limited in vocabulary only by available storage.

Figure 8. Flow chart of inquiry and response.

The 7772 has a lesser number of input/output lines and less
throughput, imposed by the higher data transfer rate
across the device/processor interface during vocabulary transfer from the processor.
Both Audio Response Units are suitable for use
by any business having a centralized file system in
which the telephone is useful as an inquiry terminal. Responsible individuals within an organization
can dial the computer directly for any information
about an account and receive a voice answer within
seconds. Only a small number of words in the vocabulary list take more than ½ second; hence, a 20-word message generally takes less than 10 seconds.
Because audio response is a recent addition to
data processing, the application of these devices is
left largely to the imagination of the system application engineer. A number of applications have
been defined but are too lengthy to describe within
the scope of this article. A brief description of a
banking application follows:
Banking
Account, loan, credit, or mortgage inquiries
can easily and quickly be handled by voice response. When a teller needs to determine the account status of a customer who desires to cash a
check, he simply picks up his telephone receiver,
dials the account number, and follows with
some predetermined code; for example, dial 6 for
account status information. The voice response received confirms the account number by repeating the
number the teller dialed, and then supplies the account balance and any other information the teller
requested. After the customer cashes his check on
the teller's approval, the teller, still holding the line,
dials the code to inform the processor to update the
customer's account by debiting the amount of the
check. In most cases, the transaction from initial
inquiry to final response will take the teller less
than one minute; there is no paperwork involved,
and the teller is using a terminal with which he is already familiar: the telephone.
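The teller transaction can be sketched as follows. The account table, the code digit, and the word assembly are illustrative, not an actual 7770/7772 application program:

```python
# Sketch of the banking inquiry above: the dialed digits carry an
# account number followed by a one-digit code (here, 6 = account
# status); the response echoes the account number and then speaks
# the balance, assembled from vocabulary words.

ACCOUNTS = {"4271": 250.00}   # illustrative account file

def respond(dialed):
    """dialed: account digits followed by one code digit."""
    account, code = dialed[:-1], dialed[-1]
    if code == "6":                       # account status inquiry
        words = list(account)             # repeat the dialed number
        words += ["balance", f"{ACCOUNTS[account]:.2f}"]
        return " ".join(words)
    raise ValueError("unknown inquiry code")
```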
Other applications are:
Finance . . . stock quotations; margin account balance
Insurance . . . policy status; premium information
Retail . . . credit inquiry; inventory inquiry
Manufacturing . . . inventory inquiry; job status; parts cost inquiry
The addition of Audio Response to the family of
IBM System/360 input/output devices now provides
a closer link between man and the equipment that
stores and processes the data with which he is concerned.
CORRUGATOR PLANT OPERATING SYSTEM
Walter J. Koch
IBM Corporation
San Francisco, California
INTRODUCTION

The corrugated container industry currently comprises approximately 850 plants, centered mostly in
larger industrial areas. The average container plant
employs 88 people and has an annual sales revenue
of two million dollars.1 In a working day, the plant
will produce upwards of 100,000 containers to
meet the shipping requirements of neighboring industry.

The process of converting rolls of kraft paper
into these containers begins with the production of
corrugated paper board. The corrugator joins outer
sections of the board to the inner section, which has
been corrugated for strength, and cuts the board
into rectangular blanks representing specific container orders. The blanks are then processed by a
variety of machines, such as printer-slotters and
folder-gluers, to produce the finished containers.

This manufacturing process is becoming steadily
more complex because of the tremendous variety of
containers required by American industry, and also
because of the trend in the container industry to
provide more specialized and marketable container
designs and services. This results in a continual
search for more efficient container production techniques. One technique which has recently been investigated, and is now being implemented, is the
extensive use of the digital computer for production
planning and monitoring in the container plant.

The most significant result of these recent studies
into using a digital computer profitably in corrugator plant operation has been the conclusion that top
plant operating efficiency can best be obtained
through a computer-oriented Plant Operating System. Such a system, with its computational, analytical, and data collection capabilities, would supply
plant management with the timely, pertinent operational information needed for the most effective operating decisions. Applying these capabilities to all aspects of plant operation would provide the necessary integration of these interrelated requirements:

• Efficient order scheduling and processing
• Maintaining the lowest reasonable plant inventory
• Providing the best customer service possible

SYSTEMS FUNCTIONS
A group of container orders is scheduled on the
corrugator using a series of patterns which contain
orders combined with each other. Combining these
orders to produce more efficient corrugator schedules is a natural area to begin using the computational capabilities of the system. It has been shown
that significant savings can result in this area of
plant operation alone. However, to realize as much
of these potential savings as possible, the computer
must have complete, accurate files showing the status of orders and plant inventory. Entrusting the
maintenance of these files to the system, with its
inherent advantages in accuracy, becomes a natural
consequence of corrugator scheduling on the computer.
The maintenance of complete inventory files
leads to the application of scientific inventory management techniques which permit plant management
to exercise more effective inventory control. These
techniques, many of which are being used successfully today, will be used to maintain the lowest possible number of stock widths for each of the various
paper grades, as well as the smallest number of rolls
of each width.
In addition to the files showing raw material inventory, files will be kept on orders being processed. The progress of these orders can then be
shown for management review and action. Timeliness and accuracy of these files will be obtained by
maintaining them automatically, using input from
data collection stations in the work centers on the
plant floor.
EXTERNAL SYSTEM FEATURES
The true effectiveness of the Plant Operating
System depends on how well it performs in the corrugator plant environment. The system should require as little manual data preparation and manipulation as possible. On the other hand, it should have
features which make it easy for plant management
to obtain the operating information which resides
in the system. The inherent flexibility of digital
computer logic and the variety of computer input/output devices make these system communication features readily available.
INTERNAL SYSTEM FEATURES
In addition to these required external features,
there are two basic internal features which are required:
• Large random access memory
• Comprehensive computer monitor program
A large random access memory auxiliary to the
main computer memory is essential to store the
many data files and computer programs which will
be incorporated in the completed system. A computer monitor program is one which controls and
supervises the execution of the various operating
procedures. This control is needed to allow plant
personnel to quickly and easily specify the procedures they wish performed.
CORRUGATOR PLANT OPERATING
SYSTEM DEMONSTRATION
This demonstration was prepared to show the advantages of the Plant Operating System concept by
putting a significant portion into operation. Processing box orders was chosen to highlight the value
of integrating operating requirements so that they
can be monitored on one computer system. The
demonstration performs the routine processing of
order information as well as optimum scheduling of
the orders on the corrugator and the costing and
operational analysis of the corrugator schedules.
Plant operating personnel need only enter orders
into the system as they are received at the plant.
The system then provides all information needed
for the operating decisions which result in complete, effective processing of these orders. The demonstration is designed to run on an IBM 1620,
equipped with an IBM 1311 Disk Storage Drive
and running under control of the 1620 Monitor.
These corrugator plant applications are included:
• Corrugated Container Cost Estimation
• Corrugator Order Entry
• Order Request
• Stock Width Request
• Corrugator Scheduling
• Corrugator Schedule Costing and Operational Analysis
• Order File Maintenance
These procedures are performed by simply entering the monitor control card specifying the execution of a particular procedure, together with the required data cards. All data cards, other than the
initial order cards, are supplied by the system.
The order processing cycle in a plant begins
when order cards are entered into the system by executing the Corrugator Order Entry Procedure.
These new orders are merged with the existing open
order file on the disk and the updated order file is
printed at the option of the planner.
Scheduling activity begins by executing the Order
Request Procedure which scans the open order file
on the disk, selects those orders which are for specified grade-flute codes and which fall within specified due dates. These orders, which are grouped by
grade-flute codes, are printed as well as punched,
allowing the production planner to select only those
cards for orders he desires to schedule.
Selected cards from the Order Request Procedure
are then fed into the Stock Widths Request Procedure .. The stock widths available for each gradeflute group are punched as well as typed for the
planner's information. Other information required
for scheduling is also punched, such as material
costs and operating costs. Schedule information
which the planner might wish to vary frequently,
such as minimum corrugator run and maximum
number of slitter knives, may be entered through
the typewriter and punched. The punched output
from this procedure represents a complete input
deck for the order scheduling procedure for the orders for each grade-flute group.
Input decks for each grade-flute group are fed
into the Order Scheduling Procedure to be scheduled on the corrugator. This procedure uses a special linear programming technique developed by the
IBM Research Laboratory, Yorktown Heights, N.Y.,
to determine optimum schedules for the grade-flute order group.2,3 In addition to printing the
schedules, this procedure punches them, making
schedule information available as input to the costing and order file maintenance procedures. Since
the schedule is in punched cards, the production
planner has the ability to change schedule information as he wishes before proceeding to these succeeding procedures.
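The Gilmore-Gomory technique cited above generates corrugator cutting patterns and selects among them by linear programming. As a rough present-day sketch (Python), the fragment below only enumerates feasible slitting patterns under a knife limit and picks the one with least trim; the roll width, order widths, and knife count are hypothetical, and the actual procedure of refs. 2 and 3 solves an LP over pattern counts by column generation rather than choosing a single pattern.

```python
from itertools import combinations_with_replacement

def feasible_patterns(roll_width, widths, max_knives):
    """Enumerate slitting patterns: multisets of order widths that fit
    across the roll without exceeding the slitter-knife limit."""
    patterns = set()
    for n in range(1, max_knives + 1):
        for combo in combinations_with_replacement(sorted(widths), n):
            if sum(combo) <= roll_width:
                patterns.add(combo)
    return sorted(patterns)

def least_trim_pattern(roll_width, widths, max_knives):
    """Pick the feasible pattern with the least trim loss -- a greedy
    stand-in for the full column-generation LP of refs. 2 and 3."""
    return min(feasible_patterns(roll_width, widths, max_knives),
               key=lambda p: roll_width - sum(p))

# Hypothetical 85-inch roll with 24-, 30- and 36-inch order widths:
pattern = least_trim_pattern(85, [24, 30, 36], 3)
print(pattern, "trim:", 85 - sum(pattern))
```

A real corrugator schedule must also balance run lengths against the minimum-run and knife-count limits the planner enters through the typewriter, which is what makes the LP formulation attractive.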
Schedule cards from the scheduling procedure
make up the input to the Corrugator Schedule Costing Procedure which determines material, trim, setup and operating costs for each corrugator run as
well as a schedule summary showing costs and operating statistics for the entire schedule. This procedure can be run for a projected corrugator schedule
or for a schedule that has already been run, giving
projected or actual operating costs.
Schedule cards from the Scheduling Procedure
also make up the input to the Order File Maintenance Procedure. After a schedule has been run on
the corrugator, this procedure updates the order file
by analyzing the amounts run for those orders
scheduled. Orders which have been completed are
deleted from the order file. Orders which have been
partially completed are reduced by the amount run.
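The file-maintenance rule just described (delete completed orders, reduce partial ones) can be sketched as a single pass over the open-order records; the dictionary layout here (order number mapped to open quantity) is an illustrative assumption, not the 1311 disk file format.

```python
def maintain_order_file(open_orders, amounts_run):
    """Update the open-order file after a corrugator run: orders fully
    run are deleted; partially completed orders are reduced by the
    amount run. (Hypothetical in-memory layout.)"""
    updated = {}
    for order_no, qty_open in open_orders.items():
        remaining = qty_open - amounts_run.get(order_no, 0)
        if remaining > 0:       # partially completed: reduce
            updated[order_no] = remaining
        # completed orders (remaining <= 0) are dropped from the file
    return updated

print(maintain_order_file({"A1": 500, "A2": 200}, {"A1": 500, "A2": 120}))
```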
The Corrugated Container Cost Estimation Procedure is used somewhat independently of the other procedures, although it would frequently precede the order entry procedure. This procedure
checks box orders for validity and determines the
corrugator blank size to be run from the dimensions
of the box order. Various material and production
costs are combined to determine the sales price for
the order. Finally, the shipping weight of the order
is determined.
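The blank-size and pricing steps above might be sketched as follows. The regular-slotted-container (RSC) blank relations used here are standard box-making practice, but the glue-lap allowance, markup, and cost rates are illustrative assumptions; the paper does not give its actual formulas.

```python
def rsc_blank_size(length, width, depth, glue_lap=1.25):
    """Corrugator blank for a regular slotted container: roughly
    2*(L+W) plus a glue lap one way, and W+depth the other.
    Allowances are illustrative, not the paper's figures."""
    blank_len = 2 * (length + width) + glue_lap
    blank_wid = width + depth
    return blank_len, blank_wid

def order_price(blank_len, blank_wid, qty, material_cost_per_sq_in, markup=1.30):
    """Combine material cost and a flat markup into a sales price
    (a stand-in for the procedure's material/production costing)."""
    area = blank_len * blank_wid
    return round(area * material_cost_per_sq_in * qty * markup, 2)

blank_l, blank_w = rsc_blank_size(20, 12, 10)   # 20 x 12 x 10 inch box
print(blank_l, blank_w)
print(order_price(blank_l, blank_w, 1000, 0.0004))
```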
Since this family of procedures can be executed
in a variety of sequences, the planner has the ability
to do different phases several times, varying such
things as order grouping, tolerances, and stock
width availability. In this way, he uses his experienced judgment in applying the procedures most
effectively. For example, having reviewed a particular schedule, he may decide to remove an order
from that group and try to obtain a new schedule
with a higher corrugator utilization. On the other
hand, he may decide to go back to the Order Request Procedure with different due dates, obtain
more orders to add to the original group, and then
obtain a more efficient schedule with the new larger
group.
CONCLUSIONS
The Plant Operating System Demonstration highlights the desirability of implementing a complete
Plant Operating System. The basic principles of integrating related plant applications to operate in a
computer monitored environment can be extended
to include plant production scheduling, inventory
management, production analysis and production
control through on-line production monitoring. The
Plant Operating System Demonstration represents
the beginning of the systems design for such a truly
comprehensive system.
REFERENCES
1. "1958 Census of Manufacturers," Summary Statistics, Bureau of the Census, vol. 1.
2. P. C. Gilmore and R. E. Gomory, "A Linear Programming Approach to the Cutting Stock Problem," Operations Research, vol. 9, pp. 849-859 (1961).
3. P. C. Gilmore and R. E. Gomory, "A Linear Programming Approach to the Cutting Stock Problem, Part II," Operations Research, vol. 11, pp. 863-888 (1963).

PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965
REAL-TIME PROGRAMMING AND ATHENA SUPPORT AT WHITE SANDS MISSILE RANGE
William G. Davidson
Computer Directorate
White Sands Missile Range, New Mexico
THE ATHENA SYSTEM AND GROUND
COMPUTER ASSIGNMENTS
The Athena is a four-stage missile launched from Green River, Utah toward White Sands Missile Range, New Mexico, covering a ground range of 450 miles. It is designed to deliver a payload into White Sands with prescribed atmospheric re-entry velocities and angles, so that re-entry phenomena may be studied, and radar performance against re-entry bodies may be evaluated.

A ground computer (presently an IBM 7044) operates in real time during an Athena flight, with seven assigned tasks:

1. Providing instantaneous impact prediction data during first and second stage burning, for range safety use. This function is especially important since the Athena is flying over populated areas and, in case of missile malfunction, must be destroyed by safety personnel so as to land only along non-populated sections of its flight path.
2. Computing and transmitting midcourse (between second stage burn-out and third stage ignition) guidance correction commands to bring the missile into the proper re-entry orientation.
3. Evaluating telemetered data from the missile following the transmission of the guidance commands, to determine if all system functions are operating within prescribed limits.
4. Transmitting the third stage ignition command to the missile at the proper time, once it has been established that the third stage can be safely fired.
5. Transmitting precise pointing data to several tracking and measurement sites throughout the flight, to assist them in acquiring and following the missile.
6. Providing ground support personnel with data displays and flight status information throughout the mission.
7. Logging all computer inputs and outputs in real time for postflight analysis.
The following sections, while relating to real
time philosophy in general, will at the same time
show the development of the above tasks as a multiprocessing job under the real time constraints of
input and output data demands.
MONITOR PHILOSOPHY
All real time programs within the Computer Directorate at White Sands operate under the control
of an Executive Control Monitor which directs the
flow of activity among various functional modules
to assure that all program functions are carried out
at the proper times and in the necessary contingency to related activities. The Monitor was originally
adapted for Athena use from the Mercury project
monitor, but has been revised to make it more applicable to specific support problems encountered at
White Sands.
A multi-processing job is a programming task
that can be subdivided into functional modules or
processors, each of which executes a specific function in the overall program, and can be in various
stages of completion throughout execution of the
entire job. Each functional processor should have
well defined inputs and outputs in relation to other
processors in the system, and in relation to the outside (non-computer) world. The Monitor might
well be referred to as a multi-processing monitor,
since in controlling the relative flow of activity between functional processors, it monitors the state of
completion and the dynamic requirements of each
processor in the job throughout execution of that
job.
A multi-processing job becomes a real time job
when execution of various program functions becomes dependent on some regular or random time
constraint, such as input or output data requests
from a device operating dynamically during program execution. Thus, for a real time job an additional, or perhaps we should say a master, processor
must be added to the other functional processors in
order to service the real time demands and pass any
pertinent control information on to the Monitor.
For the Athena system this processor is known as
the trap or interrupt processor, to which control is
given immediately whenever a data input or output
demand is recognized, interrupting whatever functional processor may be in process at that time. The
trap processor examines the real time request, reads
in or transmits the required data, and provides the
Monitor with information regarding the input or
output action just completed so that either execution of pertinent program functions may be initiated, or control will be returned to the interrupted
functional processor.
Each real time processor can be classified in one
of three time-oriented categories, and relative execution priorities can be assigned on this basis.
First are those activities that must be executed immediately upon recognition of some event or time
in the program. Examples in the Athena program
are the transmission of the third stage ignition
command within a critical time interval, or the
plotting of range safety display data without undue
delay after they have been computed. The second
category consists of functions that must be executed
within a given time interval after they are requested, such as the editing and processing of direct data
messages before the next sample arrives at a remote
terminal, or the calculation of regularly transmitted
computer outputs within a time dictated computation-transmission cycle for those data. The third
group of processors involves program functions
whose execution can be deferred until some later
point in the mission, when higher priority functions
have been completed and computing time is available. Included in this category would be such functions as selecting a radar for later use on the basis
of current data from several radars, or the calculation of guidance commands to be transmitted at
some later point along the trajectory. The first class
of functions described must, of course, have highest
priority, since initiation of these actions usually
cannot be delayed for any other purpose. Other processors will be assigned lower priorities as the importance of their roles and the judgment of the program organizer dictate. The trap processor, as mentioned before, will usually assume control whenever
a data demand appears at a remote terminal, but
even trap processing can be temporarily inhibited if
an extremely critical function is in process at the
time the demand occurs.
Figure 1 shows the relationship between functional processors, a trap processor, and the Executive Control Monitor. A simplified version of the
Monitor is shown since this paper intends only to
outline its control functions, omitting many of the
details and system tools available. When an input or
output data demand interrupts a functional processor, the contents of all machine registers are saved
as control is given to the trap processor. When control is eventually returned to the processor at the
point where it was interrupted, these machine registers can be restored and processing can resume.
Numerous control tables and indicators link the
functional processors to the Monitor. Some of these
Figure 1. Monitor - trap processor - functional processor linkage.
are set initially to define, among other things, the
entry points and job priorities assigned each processor, while others are maintained dynamically for
use in monitoring the status of each processor.
Among the indicators that change during job execution are three basic ones described here. First is
a "request" indicator for each processor, turned on
whenever execution of that processor is requested
by another segment of the program. For example,
the program might contain a processor which edits
or evaluates radar data, in which case this function
would probably be requested by the trap processor
each time a new set of radar data enters the computer. When control is returned to the Monitor by
the trap processor following the data interrupt, the
Monitor would see the request indicator on for the
radar processor, and would transfer control there as
soon as any higher priority activities had been completed. The second indicator is a "suppression" flag
for each processor, causing the Monitor temporarily
to suppress execution of that processor even though
its request indicator might be on. Any processor
can be suppressed by any other processor, and can
likewise be released by turning off the suppression
indicator. The third indicator is the "in process"
indicator, turned on whenever execution of the
functional processor associated with it has begun
but has not been completed, and turned off upon
normal exit from that processor.
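The three indicators just described can be sketched in present-day terms (Python). This is only an illustration of the Monitor's bookkeeping: the processor names, priority numbers, and method names below are hypothetical, and the real Monitor ran as 7044 machine code with control tables rather than objects.

```python
class Monitor:
    """Sketch of the Executive Control Monitor's per-processor
    request, suppression, and in-process indicators."""
    def __init__(self):
        self.procs = {}            # name -> (priority, function)
        self.requested = set()     # "request" indicators currently on
        self.suppressed = set()    # "suppression" flags currently on
        self.in_process = set()    # "in process" indicators currently on

    def register(self, name, priority, fn):
        self.procs[name] = (priority, fn)

    def request(self, name):
        """Turn on a request indicator (e.g. set by the trap processor)."""
        self.requested.add(name)

    def dispatch(self):
        """Run the highest-priority requested, unsuppressed processor."""
        runnable = [(self.procs[n][0], n) for n in self.requested
                    if n not in self.suppressed]
        if not runnable:
            return None
        _, name = min(runnable)            # lower number = higher priority
        self.requested.discard(name)
        self.in_process.add(name)
        self.procs[name][1]()
        self.in_process.discard(name)      # normal exit clears the flag
        return name

mon = Monitor()
log = []
mon.register("radar_edit", 6, lambda: log.append("radar"))
mon.register("ignite_stage3", 1, lambda: log.append("ignite"))
mon.request("radar_edit")
mon.request("ignite_stage3")
print(mon.dispatch(), mon.dispatch())
```

The third-stage ignition command (priority 1 in the Athena scheme) is dispatched ahead of radar editing even though it was requested later, which is exactly the behavior the priority assignment is meant to guarantee.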
PROCESSOR QUEUEING AND ROUTINE SHARING

Real time processes often encounter times of
peak activity where processor execution requests
occur more frequently than control can be given to
the processor. In order to avoid losing any of these
requests, a queueing facility is provided by the
Monitor, which allows the stacking of requests until
computing time is available for servicing them. A
basic assumption, of course, is that overall program
timing has been so established as to assure the execution of queued functions within some limiting
time frame and before the queue tables for that processor have been filled.
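A minimal version of such a queueing facility, with the bounded-table assumption made explicit, might look like this in Python; the capacity figure and error behavior are illustrative, since the paper does not describe the queue-table format.

```python
from collections import deque

class RequestQueue:
    """Bounded per-processor request queue: requests stack up during
    peak activity and are serviced when computing time is available."""
    def __init__(self, capacity):
        self.q = deque()
        self.capacity = capacity

    def enqueue(self, request):
        if len(self.q) >= self.capacity:
            # The program's overall timing budget was supposed to
            # guarantee this never happens.
            raise OverflowError("queue table filled")
        self.q.append(request)

    def service(self):
        """Return the oldest pending request, or None if idle."""
        return self.q.popleft() if self.q else None

rq = RequestQueue(capacity=3)
for sample in ("radar-1", "radar-2"):
    rq.enqueue(sample)
print(rq.service(), rq.service(), rq.service())
```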
Another problem occurs when two or more segments of a multi-processing job share the same subprocessor, resulting in a possible loss of indicators and temporary storage within that subprocessor. At least four solutions may be applied to this
problem, all of which have been used at the White
Sands facility with varying degrees of success. First,
and perhaps most obvious, is to avoid the problem
by loading duplicate subprocessors, though this is a
space consuming procedure. Second is a modified
queueing philosophy applied to subprocessors. This
is ideal in some instances, but may tend to subvert
the priority assignment given to processors unless
handled carefully at the subprocessor level. Third is
the disabling of data traps during execution of a
subprocessor, thus forcing the completion of that
subprocessor prior to its call by another processor.
This should be done only for short subprocessors,
and even then there is some danger of losing vital
segments of data while in the disabled mode.
Fourth is the saving of temporary cells and indicators for the subprocessor at the time a data interrupt occurs. This method respects the initial priority establishment, but is often wasteful of computer
storage and execution time.
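The fourth solution, saving a shared subprocessor's temporaries at interrupt time, can be illustrated as follows. The class, field names, and doubling arithmetic are purely hypothetical; the point is only that saving and restoring the temporary cells around an interrupt keeps the two callers from clobbering each other.

```python
class Subprocessor:
    """Sketch of solution 4: save/restore a shared subprocessor's
    temporary cells around a data interrupt (names hypothetical)."""
    def __init__(self):
        self.temp = None

    def start(self, value):
        self.temp = value * 2       # temporary working storage

    def finish(self):
        return self.temp + 1

    def save_context(self):
        return {"temp": self.temp}

    def restore_context(self, ctx):
        self.temp = ctx["temp"]

sub = Subprocessor()
sub.start(10)               # processor A begins using the subprocessor
ctx = sub.save_context()    # data interrupt: temporaries saved
sub.start(100)              # processor B reuses the same subprocessor
b_result = sub.finish()
sub.restore_context(ctx)    # A's temporaries restored
a_result = sub.finish()
print(a_result, b_result)
```

As the text notes, this respects the original priority assignment but spends storage and time on every save/restore, which is why the other three solutions remained in use at White Sands.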
INPUT-OUTPUT AND INTER-COMPUTER
COMMUNICATIONS
Inputs to the Athena real time program are radar
tracking data (polar coordinates, site identification
and tracking mode) from five radars; 15 channels
of telemetry data from the missile in both a data
and a calibrate mode format; signals such as lift off,
stage ignition and burn out; and timing from four
sources: a millisecond clock attached to a multiplexer control unit to provide timing pulses for radar
interrupts, an astrodata clock measuring range timing in millisecond increments, a range timing word
attached to each sample of telemetry data, and an
internal computer core clock with a 16 2/3 millisecond resolution, to be used if other timing sources
fail. Computer outputs include acquisition data
(polar or rectangular coordinates and site identification) to eleven tracking sites; commands to the
missile via a command transmitter station, including stage firing commands and midcourse pitch and
yaw maneuver instructions; impact prediction, present positions and other information sent to seven
plotter display boards; and numerous binary and
decimal displays showing the present trajectory status, the results of dynamically changing computer
decisions, and various data and error messages appearing on an on-line printer throughout the mission. Input/output formats and data rates may vary
throughout a mission.
Since 7044 core storage and computer speed prevent loading or executing the ideal large scale real
time program from core, any data handling or formatting that can be done outside the computer
simultaneous with computer processing is always
welcome. For this reason the personnel of the Real
Time Data Center at White Sands have designed a
Direct Data Buffering System to handle many of the
computer inputs and outputs without tying up valuable computer storage and time.
Radar data are transmitted from the radar sites
to the Real Time Data Center over telephone lines
via kineplexes, one sample requiring 15 8-bit
bytes. These bytes come from the kineplex receivers
every 3 1/3 milliseconds, so that one complete frame
of radar information is transmitted every 50 milliseconds with the data multiplexed for the five input radars, adding a time word from a master clock.
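The stated rates are self-consistent, as a quick check in exact arithmetic shows: 15 bytes at one byte per 3 1/3 ms gives exactly the 50-millisecond frame (20 samples per second per radar).

```python
# Radar frame timing as described in the text: 15 eight-bit bytes per
# sample, one byte arriving from the kineplex receivers every 3 1/3 ms.
from fractions import Fraction

byte_interval_ms = Fraction(10, 3)   # 3 1/3 ms, kept exact
bytes_per_sample = 15
frame_ms = byte_interval_ms * bytes_per_sample
print(frame_ms)                      # one complete frame every 50 ms
print(1000 / float(frame_ms), "samples per second")
```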
Telemetry data and calibration words are transmitted over 14 pairs of telephone lines, formatted and
buffered by the Direct Data Buffering System, and
similarly sent to the 7044 on an interrupt basis
about 11 times per second. The Direct Data Buffering System includes an output buffer to format and
transmit acquisition data to the tracking sites via the
kineplexes, and to transmit computer generated
missile commands and display data to the proper
device. Output data demands interrupt the computer
every 50 milliseconds. The System displays the computer-buffer inputs and outputs dynamically, and
records all real time inputs serially on analog tape as
they enter the Real Time Data Center. Communication between the direct data buffer and the 7044
main computer is of the demand-response type
through the direct data connection on Channel B of
the 7044. Communication from the 7044 to the
analog computer and plotting boards through the
digital-to-analog converter is via the direct data
connection on Channel C of the 7044.
A Direct Couple System (7044-7094) has recently been installed in the Computer Directorate,
and the present Athena program is being rewritten
for the new configuration. The new program will
rely largely on the 7044 side of the system to edit
and format data to and from input-output 7288
subchannels, replacing in large part the function of
the present Direct Data Buffering System.
DEVELOPMENT OF THE ATHENA
PROCESSORS
Figure 2 outlines the organizational structure of
the Athena real time program. The solid lines indicate a direct transfer of control from a processor to
a subprocessor, in the manner of standard deferred
time subroutines. The dashed lines represent a request for the execution of a processor by another
processor. The function of each program in the system is described below.
Figure 2. Athena program organization.
1. Priority Processor #1: transmits, at the
proper time, .a command to ignite the third
stage of the missile.
2. Priority Processor #2: transmits data to display plotters via digital-to-analog converter
and an analog computer.
3. Priority Processor #3: logs computer inputs,
outputs, and other pertinent data on tape
during a mission for post flight analysis.
4. Priority Processor #4: processes raw telemetry calibration messages.
5. Priority Processor #5: edits and processes
raw telemetry data messages, and transmits
midcourse guidance commands to the
missile.
6. Priority Processor #6: edits and processes
input radar data, monitors the tracking status of the radars, and transmits acquisition
and display data.
7. Priority Processor #7: formats acquisition
and display data, and controls the linkage
and synchronization between the processors
generating acquisition data, range safety
plots, and various digital displays.
8. Priority Processor #8: evaluates telemetry
data from the missile after midcourse guidance commands have been transmitted to
decide whether all parameters are within
prescribed limits.
9. Priority Processor #9: evaluates radar data
after second stage burn-out, and chooses a
radar from which input data for the guidance equations can best be obtained.
A. Subprocessor RADCOR: a subroutine of
Processors 7 and 9, RADCOR performs
coordinate transformations and generates
acquisition data.
B. Subprocessor PRED: a subroutine of Processor 7, PRED generates Kepler extrapolated impact prediction and other plotting
board data.
C. Subprocessor DIRSIT: a subroutine of Processors 7 and 9, and of PRED, DIRSIT generates smooth positions, velocities and accelerations from raw radar position data.
D. Subprocessor SUB86: a subroutine of Processor 8, SUB86 computes predicted payload
impact points.
E. Subprocessor EXTRAP: a subroutine of
Processor 9, EXTRAP performs trajectory
extrapolations for guidance equations inputs.
F. Subprocessor MATRIX: a subroutine of Processor 9, MATRIX solves the guidance equations, computes missile attitude commands for midcourse guidance, and determines third stage ignition time.
G. Subprocessor DFXWRS: a subroutine of PRED, DFXWRS selects a radar for instantaneous impact prediction plots on the basis
of digitally filtered position and velocity
data from all input radars.
H. Subprocessor PLOT: a subroutine of PRED,
PLOT scales and formats plotting board
data for transmission to the digital-to-analog converter.
I. Subprocessor CHECK: a subroutine of
PRED, CHECK is a limiting subroutine for
output data.
Note the high priority given to routines to transmit third stage ignition command and plotting
board data for range safety use. This indicates that
these functions must be executed within critical
time frames. The data editing and processing routines are generally given next highest priorities
since they must be completed before the next data
samples reach the remote terminals. Functions that
may span longer time intervals for completion, and
which may, to some extent, be completed as available time permits, are given lowest priority.
REAL TIME STORAGE LIMITATIONS
Most real time programs currently in use are of
the medium to large scale variety, as is the Athena
program at White Sands. An immediate problem
arises when the entire executable program is too
large to be accommodated by the immediately accessible core at one time. Three techniques are used
by the Athena program to alleviate this problem,
and may be worthy of mention here.
First, all data input and initialization programs
are loaded into core and executed, and are then
chained over or overlaid by the real time processors
and subprocessors, the two program links sharing
only the data and control storage common to each.
The same procedure is followed after execution of
the real time phase of the job, with post flight routines overlaying the real time processors, retaining
only those data and control words common to both.
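This chaining scheme, each link overlaying the last and sharing only common storage, can be caricatured in a few lines of Python. The phase names and the shared dictionary are illustrative stand-ins for the real program links and their common data/control words.

```python
def run_chained(phases, shared):
    """Run program phases in sequence; each phase overlays the last,
    and only the shared data/control storage (a dict here) survives."""
    for phase in phases:
        phase(shared)     # previous phase's code is gone; 'shared' persists
    return shared

def init_phase(s):                      # data input and initialization
    s["orders"] = [3, 1, 2]

def real_time(s):                       # real time processors
    s["log"] = sorted(s["orders"])

def post_flight(s):                     # post flight routines
    s["report"] = len(s["log"])

print(run_chained([init_phase, real_time, post_flight], {}))
```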
Second, all data which are generated during the
real time mission but are intended only for postflight use, are moved out of the computer immediately via a direct data connection or onto logging
tapes, to preserve storage within the main cores.
The 7044 currently in use at the Real Time Data
Center does not have a disc or drum storage unit,
hence cannot use these facilities for conservation of
core.
Third, much preprogramming, formatting, filtering, buffering, and functions of this nature are handled externally by the input-output Direct Data
Buffering· System and by the analog-to-digital
link between the 7044 and the plotting boards. This
eliminates large blocks of digital programming as
well as conserving computer time. Future real time
work at White Sands will probably move more into
this field of hybrid operation as systems become
larger and more complex.
CHECK-OUT PROCEDURES
Check-out of real time programs requires methods not usually necessary in deferred time programming. First of all, of course, each module or functional processor must be checked out individually,
as far as possible. If a multiprocessing job has been
properly formulated, with each processor assigned
well defined functions and communications with
other program parts, this check-out at the processor level will be greatly simplified.
Repeatability of input data is necessary in order
to reproduce the results of computation processes
during the check-out phase. Unfortunately, real time
processes dependent upon external timing and data
demands cannot usually be reproduced with any
degree of precision. However, the ability to record
actual input data during a real time process, in such
a manner that it can be played back into the program exactly as it was recorded, assists the programmer in his check-out diagnostics. A related
process is the establishment of a data simulator
which will provide the real time program with repeatable data demands at prescribed time intervals.
Both procedures have been used extensively in
checkout at White Sands.
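The record-and-playback idea can be sketched as a timestamped input log that is replayed in recorded order. The JSON-lines format below is a modern stand-in for the analog/logging tapes actually used at the Real Time Data Center.

```python
import io
import json

def record(events, stream):
    """Log real time inputs with their timestamps so a run can be
    reproduced exactly during check-out (format is illustrative)."""
    for t_ms, data in events:
        stream.write(json.dumps({"t_ms": t_ms, "data": data}) + "\n")

def playback(stream):
    """Feed recorded inputs back in recorded order, yielding
    (timestamp, data) pairs just as the live devices delivered them."""
    for line in stream:
        rec = json.loads(line)
        yield rec["t_ms"], rec["data"]

buf = io.StringIO()
record([(0, "liftoff"), (50, "radar frame 1")], buf)
buf.seek(0)
print(list(playback(buf)))
```

A data simulator, the related technique mentioned above, differs only in that the (timestamp, data) pairs are generated to a prescribed schedule rather than captured from a live mission.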
The complexity of interprocessor and monitor
interactions is frequently exceeded by a maze of intercomputer and data communications problems.
Errors in data formatting and transmission are difficult to trace during or following the completion of
a real time process, pointing up a need for thorough
diagnostic check programs for all external devices
and intercomputer links prior to the execution of a
real time job. Even with thorough premission
check-out, the possibility always exists that some
linkage device or external module may fail during a
real time mission. Such possibilities demand the use
of a limited amount of equipment monitoring by
the program during a real time process. Upon occasion facilities can be provided for the switching of
modules during the run in case errors are detected,
thus endowing real time jobs with an additional
feature not usually found elsewhere, that of dynamic
check-out.
DOCUMENTATION
The preparation for and writing of a real time
program requires the close cooperation of a number
of persons, among them the program administrators, coders, equipment specialists, and those establishing the functional requirements and job specifications. This, together with the extreme complexity
of most existing real time systems, compels a formalization and documentation of requirements and
procedures not usually needed for a standard programming job.
Once the functional requirements for a job have
been set down, the program management group
should incorporate the system functions into an arrangement of processors and subprocessors which
will form the backbone of the programming assignments. This initial work must be done with great
care and imagination, since the establishment of
time dependent linkages and relative priorities
among processors is basic and can be altered only
with great difficulty once the programming work
has begun.
After the overall system documentation and flow
charting has been completed (ideally before any
other work has begun), program specifications and
detailed flow charts can be drawn up for each functional processor. Specifications must be controlled
by someone familiar with all parts of the program
so that logical program linkages may be defined.
Detailed flow charts should be primarily the responsibility of the individual programmer, and just as in
deferred time programs, can be compiled before,
during, or after the program has been written.
Standards for program management and documentation need to be set down early in the game,
and followed rigidly even when the pressures of
time begin to close in on the programming team.
Not only will these standards aid the program manager and his real time staff during the program
writing stage, but also will establish a basis for future changes and program module rewriting when
that becomes necessary. The competent real time
programmer will be called upon to use all the ingenuity and creativity he possesses to solve the
problems he faces, and poor standardization and
documentation will make his task more difficult
and lead to a substandard quality of work.
Most of the difficulties encountered in the Athena
and other White Sands real time systems have been
due to breakdowns in communication and documentation. The establishment, check-out and
maintenance of a sound real time system is sufficient in itself to tax the patience of the most erudite team, and the use of such a conceptually simple yet basic tool as documentation cannot be overemphasized.
REFERENCES
1. F. N. Berry, "Re-entry Vehicle Testing at
White Sands Missile Range," (not for distribution
outside White Sands Missile Range) (April 1965).
2. R. V. Head, "Real-Time Programming Specifications," Communications of the ACM (July 1963).
3. T. A. Holdiman, "Management Techniques for Real Time Computer Programming," Journal of the ACM (July 1962).
4. E. H. Schuetze, "The Direct Data Buffering
System at WSMR," (not for distribution outside
White Sands Missile Range) (June 1964).
QUALITY EVALUATION OF TEST OPERATIONS VIA ELECTRONIC DATA PROCESSING
A. A. Daush
Hughes Aircraft Company
Space Systems Division
El Segundo, California
INTRODUCTION

The test operations and data analysis engineers have taken giant steps in utilizing computers and data processing methodology in the various scientific fields. This is especially true in electronic weapons systems and space vehicle and related systems being developed and proposed. Procedures, command sequences, and data collection and reduction have reached high density and are being treated via EDP. However, to what extent have related disciplines such as quality engineering utilized this same approach? How have they integrated or been integrated into this complex? What steps appear to be in the offing to insure that the attention of the electronic data processing community is focused upon this problem and can present a case suitable to secure management backing where capital expenditures, personnel, and space are considerations? More questions may be raised than answers given. However, if some actions are stimulated the basic objective will be fulfilled.

TEST OPERATIONS USAGE OF EDP

The testing of complex electronic and related systems has had a series of requirements placed upon the operations team and has created a real man-computer methodology that can be the slave or master of the program. The design of the testing system normally follows a very hybrid and interwoven sequence involving many decision paths and management constraints. Some of the rationale might include:

Technician operation vs automatic command
Quantity of command and control elements per timing sequence
Real or near real time status displays and decision capability
Types and quantities of data secured
  1. Analog
  2. Digital
  3. Tape, paper or magnetic
  4. Visual vs recorded and confirmed
Near real time analysis and decisions on validity of results
Fluidity and control of change of test procedures and system test requirements
Quantity of peripherals to support the program, i.e., card sorts, printouts, key punch, etc.
Program change control to assure machine match with equipment being tested
Level of data verification and trouble shooting routines, i.e., self-test and failure isolation
Machine costs - capital vs project
Reliability

We have seen in the test field numerous interfaces with instrumentation, controls, high-speed analog and digital inputs, outputs, and computations involving test operators, all or many of which may be external to the actual computer complex.

For a fully automatic test or checkout system the programmer must understand the above and have researched the procedures and must understand the requirements. He must understand the equipment being evaluated almost as well as the design and system engineer. He must then define the complete test program to accomplish the task. Obviously this must all be accomplished (1) to a schedule, (2) within cost, (3) to meet performance, and (4) it must be reliable. How can such a demanding set of diverging requirements accommodate yet another delay like Q.C. evaluation overlays and requirements? If the checkout system requires that the computing complex control and execute a series of complex test sequences and during these monitor and analyze a detailed number of variables for incorrect performance, how does the requirement for quality approval and for acceptance of this operation become specified, funded, and certified as true by the in-house Quality Control and the customer?

THE QUALITY DILEMMA

Consider the major segments of a typical computer-controlled test operation including automatic real time performance evaluation and test certification. They are shown in Fig. 1.

Figure 1. Major segments of a typical computer-controlled test operation. (Segments shown: computer & storage & programs; input/output buffers; crossbar switching; signal sources & conversions; power sources; manual overrides; other peripherals; environmental equipment; controls & outputs; error & problem routing; test status/acceptance displays; system under test.)

How are the quality requirements integrated into this total data acquisition picture? They for the most part are not! Several organizations are noted exceptions. The normal procedure is for Engineering (a) to define the parameters to be measured, (b) define the allowable limits and then, (c) proceed to design the testing such that the test/test complex rationale leaves no unexplainable discrepancies. A data review then occurs at a later date with only some "quick look" information to verify moving on to the next phase. What, therefore, are the "Quality EDP" challenges?

The number of Q.A. and Q.C. managers and engineers available that can plan in depth in terms of EDP for quality acceptance are limited in number.

Upper management understanding and backing can be lacking in furthering Quality's role.
The ability of Q.A./Q.C. to understand and to
make significant technical contributions to the
data mix. Too often this is only a paper mill
increase at best and does not improve product, insure test results, or customer satisfaction.
This lack of understanding results in lack of
trust in the data evaluation and long laborious
manual data reviews are required to accomplish that which a computer can accomplish
is several seconds.
What must Q.A. managers do to become capable
in these fields?
They must develop competency in planning,
analyzing, understanding, and contributing to
test requirements, procedures, plans, and customer reviews.
They must move out to provide systems test
planning with inputs to insure that Q.A. will
be an accomplished factor in EDP evaluation
programs, both analytically and factually.
They must provide a complete technical service to systems test, not just a paper service,
or rubber stamp on results not understood.
Q.A. must become machine conscious in terms
of EDP utilization for data summaries, quality control history records, calibration requests, customer mandatory product control,
and customer acceptance review. This implies
that machine techniques, instrumentation,
error analysis, circuit design, workmanship,
calibration, and checkout procedures have
all been reviewed in depth and approved and
Q.A. has only to review the results of a computer status report to make a "buy" decision
and obtain customer concurrence.
Here is where the technical and management decision lies. Do the Quality organizations existing
today in most hardware producing establishments
have the opportunity and management backing to
gear up technically to meet and utilize the EDP that
computer engineering is refining? Can they then
adapt these to quality systems and detail hardware
evaluation? We must each answer this based upon
our own experiences and known organization. I suspect that the answer will be "NO" more often than
not; however, there are certain exceptions that are
pointing the way.
POTENTIAL SOLUTIONS
The following represent some thoughts that lead
to an increase in communications and technical understanding of the changing evaluation methodology, i.e., real time computer data analysis and quality acceptance of product.
The buy-off criteria for most electronic systems performance and configuration must be designed from inception to accommodate real-time electronic data processing, analysis, and acceptance.
In-house quality organizations should have a
strong but technically sound initial input into
final system sell-off criteria. This requires a
broad-based systems-oriented engineering capability with strong test and EDP background
plus strong management backing.
Customer buy-off criteria must be adaptable to EDP, and the customer must be completely knowledgeable and must accept such output as evidence of satisfactory performance. In some fields where enormous quantities of data are produced and analyzed this is mandatory to economical operation.
Public acceptance of such techniques will also need to be cultivated, particularly where safety or welfare are concerned. Many agencies are already engaged in EDP techniques, such as banks, ticket information, billing, automatic machine tool operation, etc., and extending this to other disciplines is only a matter of time and economics, which can be hastened through extensive missionary activity shaped to provide increased service, lower cost, and decreased schedule span for Q.A. operations.
CONCLUSIONS
If the above potential solutions are to be considered seriously, these actions should be actively pursued to insure a continued growth and increased
Quality Assurance role in the utilization of EDP.
This can be done if you:
1. Learn EDP. Think EDP.
2. Talk EDP.
3. Teach EDP.
4. Encourage engineering and manufacturing
counterparts to do likewise.
5. Include EDP as part of your quotes and
budgets but be really ready to technically
and economically defend them by convincing yourself that the application is correct.
Wrongly applied, your case can be pulled
down around your ears.
6. Update your management as required. This must be skillfully done; such a seed needs careful planting, tender loving care, and much hard work in upward communications. Here, among others, films, items about potential competition usage via related fields, i.e., case histories, visiting experts, and participation in EDP meetings, are in order.
7. Consort with the enemy. Find out what engineering, design, manufacturing, and the customer are doing and planning in EDP. Take the initiative and help design better techniques and applications that will then be completely acceptable to you, "quality."
8. Call upon the services of EDP organizations to assist with (a) meetings, (b) arranging speakers for programs, and (c) as a source for technically qualified assistance. Services are available for defining, reviewing, and planning implementation for extending the use of Electronic Data Processing disciplines deeper into the product-acceptance sphere for which Quality has final authority.
Since I have been operating in the role of the enemy for a number of years, i.e., electronic systems test operations and evaluation, the above is offered in the hope that it will provide incentive in many organizations to investigate and accept the tremendous technological advantage that the proper application and utilization of Electronic Data Processing will make in almost any major industrial effort involving equivalent systems sell-off and many parameters and technical judgments. The return in
• Performance
• Cost savings
• Schedules
• Reliability
• Technical satisfaction
will be your reward as well as your management's, and will provide an important contribution to advancing and cross-pollinating the Quality and EDP mixture to the betterment of each.
THE INTRODUCTION OF MAN-COMPUTER GRAPHICS
INTO THE AEROSPACE INDUSTRY
S. H. Chasen
Research Laboratory, Lockheed-Georgia Company
Marietta, Georgia
INTRODUCTION

With the exponential growth of computer facilities, a great deal of attention has been given to the analytical description of man's decision-making processes. Yet little has been accomplished of general value to automation. We can think of many examples where the human mind can assimilate information and quickly reach a decision where we would be hard pressed to computerize the thought process. Since it will be many years before man's general decision-making powers can be channeled into computers, he must be given an optimum remedial problem-solving capability. This means that he must be given the facility to communicate or interact directly with the computer and he must be given adequate tools to accomplish this interaction. In an idealized man-computer system, facilities will exist to yield a homogeneous mix of man's decisions with routine computation. With the addition of fast response, it will be possible to shorten span time and to increase the learning and the retention of significant results. To this end the concept of real-time on-line computer graphics will play a major role.

An optimum real-time capability is one in which the man receives a response to an inquiry to the computer or a requested action by the computer as fast as the requested response or action can be assimilated. Our present space program, for example, would not exist, as we know it, without this real-time capability. The closer we get to a true and general real-time environment, the closer we will be to maximum problem-solving capability.

Only with the most recent advances in computer speed and scope performance has a real-time on-line computer graphics system become practicable. With a visual display and the ability to interact with the computer through the geometric representations, it is possible to perceive and absorb significant information such as shape, area, proximity, density, and intersection to a degree that may obviate the requirements for special-purpose, complex, and cumbersome computations.

The graphic medium of communication is but one of many media by which man is attempting to maximize computer utilization. It is, however, a very important medium to which considerable research and development is being applied with the promise of rewarding results. The contribution of graphics in the real-time on-line system is manifested in all problems where a visual display and the facility to work with the display are desired. This can be perceived in a broad spectrum of applications. For example, the designer wishes to
create a small part or perhaps a large section of an
aircraft. In either case, the ability to view, to evaluate, and to change the design while maintaining the
mathematical definition on the computer will be an
invaluable contribution. Then there is the program
manager who would like to have an up-to-date
review of his program and who would like to consider the effects of his proposed changes. The display of a PERT network and the facility to make
direct alterations to observe the effect on the critical path will accomplish this function.
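The critical-path calculation behind such a PERT display is a longest-path computation over the activity network. A minimal sketch in modern Python (anachronistic to the period described; the activity data is invented purely for illustration):

```python
# Illustrative PERT data: name -> (duration, list of predecessor names).
activities = {
    "design":    (5, []),
    "tooling":   (4, ["design"]),
    "fabricate": (6, ["tooling"]),
    "docs":      (2, ["design"]),
    "test":      (3, ["fabricate", "docs"]),
}

def critical_path(acts):
    """Return (critical path as a list of names, total project duration)."""
    finish = {}  # earliest finish time per activity

    def ef(name):
        if name not in finish:
            dur, preds = acts[name]
            finish[name] = dur + max((ef(p) for p in preds), default=0)
        return finish[name]

    for name in acts:
        ef(name)
    # Walk backward from the latest-finishing activity along tight predecessors.
    path = [max(finish, key=finish.get)]
    while True:
        dur, preds = acts[path[-1]]
        tight = [p for p in preds if finish[p] == finish[path[-1]] - dur]
        if not tight:
            break
        path.append(tight[0])
    return list(reversed(path)), max(finish.values())
```

Altering a duration and recomputing shows immediately how the critical path shifts, which is exactly the interaction the display is meant to support.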
Thus the graphical capability added to the real-time on-line system will significantly increase the efficiency of solutions of many problems
and will open the horizons to solutions of a new
class of problems which have not been tractable in
the past.
Research in the area of computer graphics
reached a significant milestone when Ivan Sutherland completed his initial "Sketchpad" system in
1961. Using a cathode ray display interfaced to the
Lincoln Laboratory TX2 computer, Sutherland
showed the feasibility of supplying graphical or geometric information to the computer via the display. Cathode ray output has been around for a long
time in the computer age, but two-way geometric
communication was a revolutionary concept. Certainly, Sutherland is not the only one to have considered the significance of graphical, on-line I/O.
General Motors has had a program in computer
graphics which was initiated in 1959 and was, for
competitive reasons, veiled in secrecy until the Fall
Joint Computer Conference in October 1964.
Another program which has considerable bearing
on computer graphics is Project MAC (Machine
Aided Cognition) under the direction of Professor
R. M. Fano at M.I.T. Project MAC is a broad-based program delving into all aspects of man-computer systems. Significant achievements are also
being realized under programs directed by Douglas
T. Ross and Steven Coons at M.I.T. Mr. Ross is
developing Automated Engineering Design (AED)
compilers which will aid the user of man-computer systems in formulating and solving his problem
with increased facility and versatility. Professor
Coons is responsible for the development of programs associated with computer-graphic applications. His work is well known as Computer Aided
Design.
The dramatic innovation of man-computer graphics adds a new dimension to computer technology. It is a new link in the chain which leads toward
more complete automation, as information communicated by pictures or displays is often many
times more effective than the written word.
LOCKHEED-GEORGIA'S COMPUTER GRAPHICS
In its belief in the strong future role of mancomputer graphics, Lockheed-Georgia was a pioneer in the aerospace industry when it acquired a
system with emphasis on two-way graphic communication as defined above. In December 1964
the UNIVAC 418 Computer connected to the Digital Equipment Corporation's 340 Scope became operational in our Research Laboratory. The system, whose graphical section is exhibited in Fig. 1, is dedicated to a research program in the application of man-computer graphics. Our research team consists of experienced personnel in a variety of disciplines including systems programming, mathematical analysis, electronics, information retrieval, design, numerical control and engineering loft.

Figure 1. Design of an aft fuselage in 3-D.

Under the Man-Computer Systems Program of Systems Sciences, there are about 20 team members in all. In addition, regular consultation is carried on in other specialty areas.
Since the UNIVAC 418 had never before been
used for graphical operations, development of the
systems programming was undertaken by our specialists. The development of "software" is quite involved for computer graphics. Many technical problems are associated with the creation of a drawing
or sketch on a cathode ray tube by input from a
light sensitive pen or from other sources. In order
to perform analyses on or make changes to a display on a scope, it is necessary to program the computer such that the displayed configuration is, at all
times, stored in mathematical form in computer
memory. Provisions must be made for efficient
storage and retrieval of displayed data and the protection of computer memory against careless destruction of vital data. In addition to the light pen
input, data such as coordinates may be transferred
to the display from input on a standard computer
keyboard. Also, certain subroutine functions such
as deriving the area, drawing a line, rotating,
changing scale, and deleting an entity must be accomplished. These special subroutines are "called up" by the button box, a panel of 28 buttons designed by our engineers and integrated with the
computer system. These buttons are under program
control; that is, they may have different meanings
for different applications programs.
Through the combination of the various input
media and the general-purpose software system, it
will be possible to develop special-purpose programs to solve particular problems. There is no intent that the presently developing software and general programs are to be the final system. Only those
functions of obvious general utility are being incorporated. Then, as specific solutions are sought
through the computer-graphic system, need for
further extensions to the general package will become apparent. A feedback relationship between
solutions to specific problems and extensions to the
general program package seems the most reasonable
doctrine in a research environment.
The capabilities that exist at the end of August
1965 on the graphic system include the following
basic features:
Three-Dimensional Capability
• Four views: three principal projections and a perspective. Drawings are created by working in any combination of the three principal views
• Conversion to single view and return to four views upon request
• Isometric: drawing is created by locating X and Y first, then Z (available as a separate experimental program)
• Definition of points
• Definition of lines
• Definition of circular arcs (elliptic arcs in projection)
• Change of scale
• Rotation about any designated axis perpendicular to any view
• Translation
• Deletion of any element
• Multiple figures can be displayed simultaneously. Proper definition of figures permits alteration of parts of the complete drawing while other parts are held fixed
• Multiple rotation axes: this permits the study of relative motion. Axes may be parallel or orthogonal to each other. Angular rotation rates may differ
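Rotation about an axis perpendicular to a view, as listed above, amounts to a planar rotation of the two in-view coordinates while the third is held fixed. A minimal sketch of the idea in modern Python (not the original UNIVAC 418 code; the function name and data layout are illustrative):

```python
import math

def rotate_about_view_axis(points, center, angle_deg):
    """Rotate 3-D points about an axis perpendicular to the XY view
    (i.e., parallel to Z) passing through `center`; Z is unchanged."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    cx, cy = center
    out = []
    for x, y, z in points:
        dx, dy = x - cx, y - cy
        # Standard 2-D rotation applied to the in-view coordinates.
        out.append((cx + dx * c - dy * s, cy + dx * s + dy * c, z))
    return out
```

Rotation in another principal view permutes which coordinate plays the role of Z, but the arithmetic is identical.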
Two-Dimensional Capability
The two-dimensional capability includes the
definition of points, lines, circular arcs and standard manipulation features using both keyboard and
light pen inputs. In addition, a host of special features are planned, or exist, for the numerical control milling application which is explained later.
Examples are:
• Construct a circle tangent to a displayed circle
and a displayed line with its radius input by
keyboard
• Construct a circle tangent to two circles. The
radius of the required circle is input by keyboard
• Construct a circle through three displayed
points
• Move a point while maintaining connectivity
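Constructions such as the circle through three displayed points reduce to small linear systems: the center lies at the intersection of two perpendicular bisectors. A modern Python sketch (illustrative only, not the original implementation):

```python
def circle_through(p1, p2, p3):
    """Center and radius of the circle through three non-collinear points.
    Equating squared distances from the center to p1, p2 and to p1, p3 gives
    two linear equations: 2(x2-x1)cx + 2(y2-y1)cy = |p2|^2 - |p1|^2, etc."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = x2**2 + y2**2 - x1**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = x3**2 + y3**2 - x1**2 - y1**2
    d = a1 * b2 - a2 * b1
    if abs(d) < 1e-12:
        raise ValueError("points are collinear")
    cx = (c1 * b2 - c2 * b1) / d          # Cramer's rule
    cy = (a1 * c2 - a2 * c1) / d
    return (cx, cy), ((cx - x1)**2 + (cy - y1)**2) ** 0.5
```

The tangency constructions in the list above are similar: the tangency conditions contribute distance equations, and the keyed-in radius fixes the remaining degree of freedom.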
Features in Other Programs
• Shape or mold a stored geometric shape to
fit a preconceived concept or specification
• Freehand sketching
• Alphanumeric display
• Alphanumeric printout of designated text
• Translation of displayed data to hard copy
output on an X-Y plotter
A small tracking cross is used as the medium of
communication between scope and computer. Its
position as directed by the "light pen," shown in
Fig. 1, and the activation of appropriate subroutines by the use of the button panel direct the creation or functioning of the indicated graphical features.
Some of the above features are illustrated in the
following figures:
• Figure 1 shows the 4-view representation of
the design of an aft fuselage
• Figure 2 illustrates drawing in isometric
• Figure 3 is a series of photographs to illustrate multiple axes of rotation
• Figure 4 shows the cross section of a wheel
pod. It has been shaped from a starting circle
• Figure 5 shows how connectivity may be
maintained when points are moved
Figure 2. Design in isometric.

Figure 3. Multi-axis rotation.

PHILOSOPHY OF COMPUTER GRAPHICS RESEARCH
Before discussing our applications activity, I
would like to discuss some aspects of the general
national interest in man-computer graphics.
Figure 4. Wheel pod - shaped on the scope starting with a
circle.
At the present time, various degrees of activity
are springing up around the country. Some computer manufacturers, in recognition of the future role
of computer graphics, have strong programs of their
own. They hope to offer a complete system that will
give newcomers to the use of computer graphics an
up-to-date capability. Actually, it is unlikely
that the manufacturers will be able to anticipate all
of the problems and demands that will be forthcoming from this new area. Users of the packaged systems may find that their special applications will
require additional computer programs that will be
difficult to acquire. Though Lockheed-Georgia
may use some of the manufacturer's software features when they become available, we believe that
the creation of our own program system for our
own applications offers the greatest flexibility and,
therefore, the greatest success in long term operations.
Because of the tender age of the man-computer
graphic concept, we believe that a modest initial
effort with freedom to grow with experience is the
most feasible course. There are many problems
both of a technological and of a human engineering
nature which must be dealt with in due time, and which will influence the growth pattern.

Figure 5. Connectivity may be maintained when end points are moved.

Considerable experience and strong familiarization with the MCG concept and its ramifications will be necessary to make accurate judgments to distinguish the primary problems of concern from the secondary problems. In that regard there are many problems that
require classification according to priority because
any team of modest size can cope with only so many
problems in a short time span. Because of this, we
tend to prefer that the manufacturer deal with complex hardware problems while we furnish specifications that must be met to allow the system to function properly and to grow. Such problems as time
sharing and remoting scopes from the C.P.U. must
be given priority attention while the optimum tilt of
the drawing surface, the nature of the drawing surfaces, and the number of buttons on the button panel
are the type problems that can await attention without jeopardizing the main purposes and payoff of
the man-computer graphic system.
APPLICATIONS
Many long-range applications are contemplated
for computer graphics, with current emphasis being
placed on the design process. Structural analysis,
management systems, information processing, electrical circuit layout, process control, and command
and control are other areas that are receiving or will
receive early attention. The first applications will
be fairly specific and of limited technical complexity. In this connection, a "Near-Term" Group has
been formed to seek technical contributions to
Lockheed-Georgia which can be completed in
1965. This is in addition to the more long-range
goals of the computer-graphics team. The
"Near-Term" Group is investigating two areas:
complete mathematical definition of all surfaces of
an aircraft envelope and two-dimensional* numerical control milling procedures with completion
dates of September and December 1965 respectively. The problems of mathematical definition of surfaces and of 2-D numerical control are described
below.
Mathematical Definition of Surfaces
In regard to the mathematical definition of surfaces, it should be noted that a large percentage of
present designs are defined in mathematical terms.
However, on some aircraft, there are regions where
mathematical definition has not been practical. For
example, on the C-141 the wing and fuselage surfaces are mathematically defined but the fillets between these surfaces were created more or less empirically. When complete mathematical definition is
attained, the need for "master models" will be
greatly reduced. Furthermore, the various activities
that must utilize the shape will find that a central
source of data in mathematical form will expedite
their analyses and will eliminate cumulative errors
which accrue when data is used, converted, and
passed on to other activities in a serial fashion. Although this "Near-Term" task is not likely to utilize computer graphics per se, follow-on work will
see surfaces displayed on the scope to study esthetics of design, to permit design alterations, and to perform functional evaluations. Various methods of surface definition have been investigated in depth in preparation for the follow-on work.

*In the subsequent discussion on 2-D N/C, it should be borne in mind that we are actually working in a 3-D environment, but the 2-D nature of the basic problem should be clear from the context.
By defining surfaces on the display, it will be
possible to vary unconstrained parameters and ascertain both the geometric and the analytical effects
on the created surfaces. Computations such as surface area and volume will be performed. Parameters
may be varied within allowable degrees of freedom
to achieve "optimum" results with respect to design
specifications.
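Computations such as surface area over a parametric definition can be approximated by summing small parallelogram areas over the parameter square. A modern Python sketch of that idea (illustrative only, not the Lockheed-Georgia code):

```python
import math

def patch_area(f, n=200):
    """Estimate the area of a parametric surface patch f(u, v) -> (x, y, z)
    over the unit square by summing |f_u x f_v| du dv, with the partial
    derivatives taken as forward differences on an n-by-n grid."""
    h = 1.0 / n
    area = 0.0
    for i in range(n):
        for j in range(n):
            u, v = i * h, j * h
            p = f(u, v)
            du = [a - b for a, b in zip(f(u + h, v), p)]
            dv = [a - b for a, b in zip(f(u, v + h), p)]
            # Cross product gives the area of the small parallelogram.
            cx = du[1] * dv[2] - du[2] * dv[1]
            cy = du[2] * dv[0] - du[0] * dv[2]
            cz = du[0] * dv[1] - du[1] * dv[0]
            area += math.sqrt(cx * cx + cy * cy + cz * cz)
    return area
```

Volume under a patch can be accumulated the same way, which is why varying a surface parameter on the display can immediately report its analytical effects.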
Two-Dimensional Numerical Control
In our manufacturing process, many items are
milled automatically by numerical tape-controlled
milling machines. The creation of this tape is a laborious task. To produce the numerical control tape
for a part or tool, an accurate drawing of the item
must first be produced. Then a part programmer
must painstakingly go through the drawing and define the points, each distinct line, curve, and other
significant features. A series of computer instructions is then written to represent the path that the
cutting tool must follow to mill the item according
to the design specifications. The language and computer program for writing, for compiling, and for
interpreting the instructions is called APT (Automatically Programmed Tools). The resultant output
is a magnetic tape. With additional post processing, a paper tape for directing a particular milling machine is created. The APT system has been developing for many years, and the task of producing it has been a formidable one.
Still, the various steps leading to the production
of a numerical control tape require many man
hours. It was recognized that the application of new
techniques in automation could significantly reduce
the manual effort. It is estimated that about 80 percent of the items produced by means of numerical
control at Lockheed-Georgia are of a two-dimensional line and circle geometry. Therefore, the
desirability of using the early computer-graphic
capability to assist the 2-D N/C problem became
apparent. First the item would be defined directly
on the scope using the various input media-light
pen, buttons, keyboard, etc. The geometry of the
item would be stored in computer memory on a
permanent file. This geometry, which would be represented mathematically in the computer, would
serve as the basis for the description of the cutter
path.
Figure 6 shows a typical part that has been created on the scope. The two views show the circular
top of the cutting tool at the beginning and near the
finish of its path around the periphery of the part.
Each line or circular arc of the figure is a surface
since the part is viewed from the top.
To create the part, the part programmer must describe a path, composed of linear elements, for the
cutting tool. In general, there is no way to determine the optimum path. The part programmer only
knows that he must consider each of the detailed
elements and their dimensions. The computer-graphics program to assist him is called PATH, and its
operation is summarized in the sequel.
Figure 6. Top view of a typical part with cutter path description.
When the part of concern has been displayed as
in Fig. 6, the mathematical description will exist in
the computer. Then the process of defining the cutter path can begin.
The initial position of the cutter is, in general,
not on the part. The starting position in the X-Y
plane is input as is the depth coordinate Z. The cutting tool will cut to the indicated depth and the Z
value will be automatically associated with each (X,
Y) of the succeeding path until a change in Z is
requested and keyed in. The cutter radius must also
be input at the beginning of a sequence of steps to
enable the computer to automatically create appropriate offsets. Now, a point on a line (surface)
of the part is designated as the first (X,Y) to
which the cutting tool will move. Before the movement, however, it must be ascertained to which side
of the line the cutting tool will be tangent or if the
center of the cutter is to be positioned directly on
the line. This is accomplished by moving the tracking cross to one of three approximate positions
with respect to the line. That is, the cross is moved
by the "light pen" distinctly to one side, "near" to,
or distinctly on the other side of the line of interest.
When the appropriate button is pushed, the circle
representing the diameter of the cutter will be automatically positioned on the display and the center
coordinates will be automatically recorded. The
next step in the process of cutting out a part profile
is to indicate the second surface of interest. Tangency or direct centering must be established as before. The cutter center will then move on the display and in the computer to the (X,Y) location of
the intersection of the two surfaces. Where tangency
is sought, appropriate offsets will be automatically
allowed. Cutter path definition for succeeding surfaces is derived similarly.
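The tangency offset described above can be modeled as intersecting the two part lines after displacing each by the cutter radius toward the chosen side. A minimal sketch in modern Python (not the actual PATH program; the line representation and function name are assumptions made for illustration):

```python
def offset_intersection(l1, l2, radius, side1, side2):
    """Cutter-center position where two part surfaces (lines) meet.
    Each line is (a, b, c) for ax + by = c with (a, b) a unit normal;
    side = +1 or -1 selects which side of each line the cutter sits on,
    and side = 0 places the cutter center directly on the line."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    # Displace each line by `radius` along its normal toward the chosen side.
    c1 += side1 * radius
    c2 += side2 * radius
    d = a1 * b2 - a2 * b1
    if abs(d) < 1e-12:
        raise ValueError("surfaces are parallel")
    return ((c1 * b2 - c2 * b1) / d, (a1 * c2 - a2 * c1) / d)
```

For example, milling into the corner of the quadrant bounded by x = 0 and y = 0 with a cutter of radius 0.5, tangent on the negative side of both lines, places the cutter center at (-0.5, -0.5).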
In many cases, an automatic mode for defining
many steps of the cutter path may be employed.
With respect to the geometry, the path will be continuously defined in this mode until the operator
sees an error and stops the process or until the cutter arrives at a position where the next move is insufficiently defined. In the former case, the console
operator may delete the erroneous portion of the
path and correct it by stepping through the questionable region as previously described. In the latter
case, the operator may resolve the ambiguity which
the computer had indicated and restart the automatic process or continue step by step. In all cases the
cutter path for several preceding steps is displayed
for the perusal of the console operator.
When the outer profile has been described, the
operator may choose to change Z and move to
another section of the part for profiling or for swathing a flat top. When the cutting tool is placed
tangent to a circle and when it is necessary to circumnavigate the circle, it is required that the operator key in the allowable error tolerance. The successive coordinates are then defined automatically
by the computer such that the cutter may move
linearly from point to point around the circle without
exceeding the indicated tolerances.
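The tolerance computation just described follows from the sagitta of a chord: for radius r and keyed-in tolerance t, each straight chord may subtend at most an angle of 2*acos(1 - t/r) before it deviates from the arc by more than t. A modern Python illustration (not the original program; names are invented):

```python
import math

def chord_points(cx, cy, r, tol):
    """Points around a circle such that each straight chord between
    successive points deviates from the arc by at most `tol` (the
    sagitta r*(1 - cos(step/2)) of the chord)."""
    step = 2 * math.acos(1 - tol / r)           # max angle per chord
    n = max(3, math.ceil(2 * math.pi / step))   # whole number of chords
    return [(cx + r * math.cos(2 * math.pi * i / n),
             cy + r * math.sin(2 * math.pi * i / n)) for i in range(n)]
```

Tightening the tolerance increases the point count, trading tape length and milling time against surface accuracy.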
When the path has been derived, it can be redisplayed step by step or continuously for operator approval. From an operator's point of view, the system is reasonably simple because there are relatively few operations of buttons, keyboard, and tracking
cross which need to be learned.
With successive coordinates thus described, an
output tape can be generated that describes the necessary cutter motion and completely bypasses the
APT system requirement.
It is estimated that the average part programming
time for approximately 1500 parts amenable to this
graphic representation was 60 hours per part for the
C-141. The estimate for the same requirement using the computer-graphic PATH program is 10
hours. Considerable savings will also be manifested
in tooling and template manufacture. In addition to
the great reduction in manhours, the decrease in
span time will be even greater because the graphical
system should inherently reduce program errors and
because batch processing is replaced by real-time
operation.
As of this writing, an evaluation of competitive
bids for a three-scope system is being completed.
The system will be dedicated to the single but important numerical control application. Delivery is
expected early in 1966. The system will be separate
from the continuing computer-graphic Research
facility.
MAN-COMPUTER GRAPHICS
IMPLICATIONS
The specific areas of application which have
been alluded to are relatively simple in many respects. Until computer graphics reaches a somewhat
more advanced level of sophistication, even the
seemingly simple problems offer a considerable
challenge. It is one thing to solve specific problems
but quite another thing to solve large classes of
problems and to integrate newly developed problem
solutions into systems of somewhat greater breadth.
An optimum solution to a particular problem may
not be optimum when a solution to a more all-encompassing problem is sought. Thus the successful
introduction of computer graphics into our technological "bag of tricks" will necessitate our investigation of new methods and techniques for computer graphics. For example, current design practices require a sequence of relatively autonomous
operations. The layout is created. Drawings are distributed to the various specialty areas. Information
is extracted and forwarded to the central computer
facility for the analyses that characterize each specialty area. Results. of the computer runs are interpreted and, perhaps, design changes are requested.
Among the many conflicting design requirements,
compromises must be made and the cycle from designer to specialists to computer and back again is
repeated.
With computer-aided design, the team concept
may be altered considerably. Specialists may actually
take part in the early design process. Some of their
evaluations might be accomplished directly as the
design develops and this may cause almost immediate modifications before extensive time is lost by
the creation of unacceptable designs. As another
approach, the various specialty areas may have their
own display systems which are linked to a common
central computer. Design specifications and alterations may be perceived at each station when information is called up. This capability will compress
the time expended for the entire design process.
The types of future computer-display systems that will best suit Lockheed-Georgia or any other company will depend on many factors, including the number of departments which will utilize the system, the interdependence of the encompassed activities and, of considerable importance, the direction of growth in computer facilities which will be made available by computer manufacturers.

SELECTION OF AN "OPTIMUM" SYSTEM

The evaluation of equipment for our follow-on man-computer research facility is now in process. This evaluation involves many subjective decisions which are characteristic of the selection of any major computer system. Timing is most important because a good system for today may lack flexibility to grow into tomorrow's requirements. Once we are committed to a computer system and the development of the associated software, it takes considerable time to justify and implement a change. It is therefore incumbent upon us to plan for the solution of tomorrow's problems though they are not totally defined today.

The computer-graphics research facility at Lockheed-Georgia has provided a rare opportunity for gaining insight and experience in the vast area of man-computer systems. Familiarization with the problems and the general capability of computer graphics will equip our personnel with the background and training to adjust quickly to new and uncharted areas. Indeed, analogous to the portion of an iceberg that lies beneath the water's surface, there are many more as yet undiscerned applications which lie just beneath the surface of current comprehension.

ACKNOWLEDGEMENTS

I would like to express my appreciation for the encouragement and support that the Man-Computer Graphics Program has received from many people in the Engineering, Computing, and Manufacturing Branches of the Lockheed-Georgia Company. The basic team has been staffed with personnel from each of these branches. Their enthusiasm and competence have been a major factor in our continued progress. In particular, I would like to acknowledge the important contributions of O. V. Hefner, the lead programmer for the software development.

A special note of thanks is extended to Mr. M. D. Prince, Associate Director of Research-Systems Sciences, who initiated the program and who has maintained an active role in charting its course.

REFERENCES

1. R. W. Mann and S. A. Coons, "Computer-Aided Design," McGraw-Hill Yearbook of Science and Technology, 1965, pp. 1-9.
2. Lockheed Georgia Quarterly, vol. 2, no. 2, Lockheed Georgia Company, Summer 1965.
3. S. H. Chasen, "APT-less Contouring Tapes?" American Machinist, pp. 69-70 (July 5, 1965).
4. E. L. Jacks, "A Laboratory for the Study of Graphical Man-Machine Communication," Proceedings - Fall Joint Computer Conference, Spartan Books, Inc., Washington, D.C., 1964, pp. 343-350.
5. I. E. Sutherland, "Sketchpad: A Man-Machine Graphical Communication System," Report No. 296, Lincoln Laboratory, M.I.T. (30 January 1963).
6. J. C. R. Licklider and W. E. Clark, "On-Line Man-Computer Communication," Proceedings - Spring Joint Computer Conference, 1962, Spartan Books, Inc., Washington, D.C., pp. 113-128.
HYBRID COMPUTING FOR LUNAR EXCURSION MODULE STUDIES
Arthur Burns
Analog Computing Section
Grumman Aircraft Engineering Corporation
Bethpage, New York
INTRODUCTION

Hybrid computation plays an important role in our man-in-space effort. Large scale combined analog-digital studies in support of NASA's Project Apollo are now being performed. The computers utilized are an IBM 7094-II digital computer and three consoles of Reeves 500 analog computing equipment. These are linked by an Adage 770 Computer Link that provides an analog-to-digital and digital-to-analog capacity of 48 and 55 channels, respectively.

The effort described herein concerns one of the latest applications of a computing technique that has become a "semicontroversial" topic in recent years. Reluctance to use both computing systems concurrently in the solution of one problem has stemmed from their inherent incompatibility: the continuous versus the discrete domain. As a result, most analog and digital computer installations have been physically and philosophically isolated from one another. Professional computer organizations also have been slanted toward one or the other type of machine.

The arrival of the space age has changed this picture considerably. Recently, as evidenced by the large number of hybrid facilities being developed around the country, there appears to be an awakening to the fact that modern day problems require the advantages of both computers. Efforts have been made to organize computing teams for hybrid projects. It is apparent that a softening of the old barriers has occurred. In addition, an important byproduct has been produced in the form of "cross education" between the analog and digital programmers. By necessity, each is beginning to appreciate the advantages and disadvantages of both machines.

At Grumman Aircraft Engineering Corporation, preliminary investigations into the field of combined analog-digital computation began late in 1957. The results of these studies led to the acquisition of a small, flexible linkage system to interconnect an IBM 704 digital computer with a large analog computer installation. The initial application of this system was a verification of the theoretical studies that had revealed that, for certain types of problems, a hybrid technique had distinct advantages over all-analog and all-digital simulations.1 When the IBM 704 was replaced by the IBM 7090, the linkage system was modified accordingly in order to be compatible with and to utilize the best features of this new digital computer.

The next few years (through 1963) saw no major changes in the linkage system as the IBM 7090 evolved into the IBM 7094-II. During this period,
relatively small-scale problems were solved using
the hybrid technique. These included some simulation work on the stabilization and control system of
the Orbiting Astronomical Observatory,2 a system
identification study,3 and a missile homing dynamics problem. 4
Active work on the development of the LEM
(Lunar Excursion Module) commenced at Grumman in January 1963. Although the initial computer studies were of the all-analog and all-digital variety, it was immediately clear that for some of the
more complicated programs, then in the planning
stage, an elaborate hybrid computing complex
would be required. Accordingly, a decision was
made to expand the existing linkage equipment to
accommodate these large problems. This expansion
ultimately proved unfeasible and a completely new,
large scale system was ordered and delivered early
in 1964.
This system is now being used in the LEM hybrid studies. The following discussion describes the
reasons for the hybrid approach, the equipment
utilized, the problem that is solved, and the implementation of the hybrid computing technique.
DISCUSSION
Why the Hybrid Approach?
The choice of the type of computer complex to
be used in simulation studies is becoming increasingly difficult. This is due largely to:
• The continually increasing scope and complexity of simulation programs,
• The rising popularity of digital, real-time
simulation techniques, and
• The reluctance of many engineers to depart
from conventional all-analog and ·all-digital
methods.
As part of the LEM project some computer
studies are planned that combine many subsystems
into an integrated simulation. The result will be a
considerably detailed representation of the LEM
vehicle. Add to this already ambitious undertaking
the facility for including a pilot, external visual displays, and actual flight hardware, and the result is
a problem that exceeds the state of the art of allanalog and all-digital techniques.
1965
Even the high speed digital computers in use today are not fast enough for an all-digital real time
solution. Sampled data studies indicate a maximum
allowable computation interval that, when exceeded,
produces instability and prohibitive inaccuracies.
For the large scale studies, the interval required by
an IBM 7094-II to solve the appropriate equations
exceeds this limit. The only alternative for an alldigital solution is thus a relaxing of the complexity
of the mathematical models. This can be a painful
process. Rather than compromise the aims of the
program, a more desirable solution is to abandon
the all-digital idea and investigate other computing
approaches.
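The stability limit on the computation interval can be illustrated with a toy first-order system (invented for this sketch, not the LEM equations): forward-Euler integration of dy/dt = -a*y multiplies the state by (1 - a*T) each step, so the solution decays only while T < 2/a and diverges once the interval exceeds that limit.

```python
def euler_run(a, T, steps, y0=1.0):
    """Forward-Euler integration of dy/dt = -a*y at interval T.
    Each step multiplies y by (1 - a*T): the iteration is stable
    only while |1 - a*T| < 1, i.e. while T < 2/a."""
    y = y0
    for _ in range(steps):
        y += T * (-a * y)
    return y

a = 10.0                                     # 0.1-second time constant
stable = euler_run(a, T=0.05, steps=200)     # T < 2/a = 0.2 s: decays to zero
unstable = euler_run(a, T=0.25, steps=200)   # T > 2/a: grows without bound
```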
The problems involved in an all-analog solution
are of staggering proportions. Equipment requirements quickly exceed a feasible amount. Resolution
requirements on some trajectory parameters are on
the order of 0.001 percent. This is impossible to
obtain without resorting to multiple amplitude scaling techniques. For real-time operation, these scale
changes would have to be made automatically, resulting in an even larger equipment load. Finally, a
huge amount of logic and memory is necessitated
(particularly by the various guidance laws). It is
generally well known that these operations are very
unwieldy on the analog computer.
Having eliminated these approaches, the logical
choice is to "go hybrid." Although the utilization
of the hybrid technique does present some unique
problems, these are mostly of a logistical nature
(computer scheduling, coordination, etc.). The
present state of the art of computer linkage hardware is such that analog and digital equipment may
be confidently interconnected. The resulting computing system is one that, unlike the all-digital and
all-analog techniques, can result in a large scale
simulation study that meets the required objectives.
The LEM Mission
The Lunar Excursion Module is the vehicle in
which, as part of Project Apollo, two astronauts
will land on the moon. The mission starts with the
LEM detaching from the orbiting Command Service Module (CSM) and inserting itself into a
coasting elliptical orbit (Fig. 1a). At a point in the
vicinity of the pericynthion, the powered descent
portion begins. This is accomplished by firing the
throttleable descent engine. Execution of the proper
attitude profile by the LEM causes sufficient deceleration for the landing maneuver.

Figure 1a. LEM mission (descent).
After an exploration period, and with the CSM
in the proper orbit position, the ascent engine is
ignited, using the descent stage as a launching pad.
The LEM is launched from the lunar surface and
then performs the rendezvous and docking maneuvers with the CSM (Fig. 1b).
At any time during the descent the mission may
be aborted (Fig. lc) and the LEM will perform an
"early rendezvous" with the CSM.
A reaction jet control system, including jet select
logic and pulse ratio modulators, is used for rotational and translational control. Additional attitude
control is provided during powered descent by descent engine gimballing. Guidance is implemented
via on-board computer, manual inputs, or combinations thereof.
Simulation Facilities
The simulation complex shown in Fig. 2 can be
thought of as organized into three general categories:
1. The hardware necessary to insert a man in
the loop including an instrumented LEM
cockpit, external visual displays that present appropriate visual cues to the pilot,
and control consoles for monitoring the
runs, failure insertions, etc.
2. The computer facility consisting of an
IBM 7094-II digital computer, a Reeves
500 analog computer, and special purpose
computing hardware, and
3. An Adage 770 analog-digital computer linkage system.
Although the features of general-purpose analog
and digital computers are well known, computer
linkage systems are still fairly unique. The Adage
770 5-7 is composed of two ADC's (analog-to-digital
converters), two 24-channel multiplexers, 35 sample and hold amplifiers, and 55 DAC's (digital-to-analog converters). Three channels of fixed discrete
data information (switch positions), each consisting of a 14-bit digital word, may also be transmitted to the digital computer.

Figure 1b. LEM mission (ascent).
Utilization of the mode control feature enables
the digital computer to control the OPERATE,
RESET, BALANCE CHECK, and HOLD modes
of the analog computer. However, the analog consoles, recorders, control console, and the digital
computer are "mode slaved" in such a way that the
lowest commanded mode always dominates. Regardless of what mode it has commanded, the
digital computer can sense the actual mode of the
analog.
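The mode-slaving rule, under which the lowest commanded mode always dominates, amounts to taking a minimum over a mode ordering. A minimal sketch follows; the numeric ranking is assumed here, since the paper names the modes but not their order:

```python
# Assumed ordering from "lowest" to "highest" operating mode.
MODE_RANK = {"BALANCE CHECK": 0, "RESET": 1, "HOLD": 2, "OPERATE": 3}

def effective_mode(commanded_modes):
    """Mode-slaved resolution: whichever station commands the lowest
    mode determines the mode of the whole complex."""
    return min(commanded_modes, key=MODE_RANK.__getitem__)

effective_mode(["OPERATE", "HOLD", "OPERATE"])   # -> "HOLD"
```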
Because the digital computer can write initial
conditions into the analog integrators and then
throw them into HOLD, it may now use the same
DAC for any other problem input during the OPERATE portion of the run. Thus, the output of a
"time-shared" DAC may be patched to the initial
condition input of an integrator as well as any other
input (except another integrator initial condition).
Digital input/ output sequences are initiated by
appropriate READ and WRITE commands from
the IBM 7094-II. The A-to-D, D-to-A, and discrete
data channels may be selected at random.
The outputs of four flip-flops whose states are
controlled by the digital computer are made available at the analog patch panel. These are called
Function Outputs and can be used to drive relays
on the analog computer.
A prominent feature of the Adage 770 is the
control panel shown in Fig. 3. This unit contains
the switches and indicators necessary to operate and
test the linkage equipment. In the manual mode,
with the aid of the control panel, all instructions to
the Link may be entered and all data and control
registers examined. This enables a major portion of
checkout and trouble-shooting procedures to be accomplished without using the digital computer.
Figure 1c. LEM mission (typical abort).

Assignment of Computer Tasks

Allocating portions of the problem between the
analog and digital computer should not consist
merely of having the analog relieve the digital of
some of its computational burden. The primary
consideration, rather, should be the utilization of
the best features of each. Efficient computer usage
is also an important factor. With as powerful (and
expensive) a machine as the IBM 7094-II being
utilized, it is imperative for computer efficiency
that its computation interval be as close as possible
to (but less than) the maximum allowable for real
time. Once. this is achieved, the requirement for a
minimum amount of analog equipment should naturally follow.
The usual starting point is to assign the high frequency dynamic equations to the analog computer
and those involving large dynamic ranges and considerable logic to the digital. For example, the
LEM reaction control system modulators provide
rapid pulses of thrust to the vehicle, resulting in relatively high-frequency attitude accelerations. Thus
the control system, calculation of the body forces
and moments, and resulting rotational dynamics are
placed on the analog. In addition, descent engine
gimballing, and reaction control system fuel computations are also placed on this computer. Translational and trajectory equations, calculation of variable mass and inertias, descent and ascent engine
thrusts, and all axis transformations are placed on
the IBM 7094-II. Appropriately, the on-board guidance computer is also simulated on this digital computer.
Reaction control system modulators, jet select
logic, thruster shaping circuits, and gimballed engine logic are taken care of by special purpose computing hardware.
The resultant computer configuration is shown in
Fig. 4.
Operation of Problem
Figure 2. Simulation facilities.

Real Time: A typical automatic mission run begins with the analog computer in BALANCE CHECK (all amplifier inputs grounded). The digital computer then writes initial conditions into the
appropriate DAC's and sends the analog into RESET. The analog integrators now have the appropriate initial condition output voltages. The analog computer is then sent into HOLD. If time
sharing of DAC's between initial condition and
variable quantities is used, the values of the latter
at time = 0 are then transferred to the particular
DAC's involved. The digital computer then causes
the analog to go into OPERATE. The precision
interval clock is simultaneously turned on, and the
basic computation interval has started.
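The start-up sequence above can be sketched as follows. The link object, its methods, and the channel name are hypothetical stand-ins for illustration, not the Adage 770 interface:

```python
class AnalogLink:
    """Toy stand-in for the digital computer's view of the analog side."""
    def __init__(self):
        self.mode = "BALANCE CHECK"
        self.dac = {}               # DAC channel -> last value written
        self.clock_on = False

    def write_dac(self, channel, value):
        self.dac[channel] = value

    def set_mode(self, mode):
        self.mode = mode

def start_run(link, initial_conditions, t0_inputs):
    link.set_mode("BALANCE CHECK")   # all amplifier inputs grounded
    for ch, v in initial_conditions.items():
        link.write_dac(ch, v)        # ICs written to the appropriate DACs
    link.set_mode("RESET")           # integrators take on the IC voltages
    link.set_mode("HOLD")            # ICs held; time-shared DACs now free
    for ch, v in t0_inputs.items():
        link.write_dac(ch, v)        # time = 0 values of shared variables
    link.set_mode("OPERATE")         # run begins and the precision
    link.clock_on = True             # interval clock is turned on

link = AnalogLink()
start_run(link, initial_conditions={"q": 0.5}, t0_inputs={"q": 0.0})
```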
Body attitude rates generated on the analog are
sampled by the digital and integrated to produce
vehicle attitude. By applying the appropriate guid-
ance law, attitude errors are generated. These are
sent back via DAC's to the control system that produces corrective moments to the LEM. The guidance laws also determine when translational commands are required. These are also transmitted to
the analog where the translational forces are generated. Because these forces contain high frequencies
and cannot be sampled fast enough, they are integrated first and then sampled. The digital computer
then takes the derivative using successive values.
From the calculated average force, the resultant
translational motion is calculated.
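Differencing successive samples of the analog-computed integral recovers the exact average force over each computation interval, even when the force itself pulses far faster than the sampling rate. An illustrative sketch, with a pulse train and numbers invented for the example:

```python
def average_forces(integral_samples, T):
    """F_avg[n] = (I[n] - I[n-1]) / T: the mean of F(t) over each
    computation interval, from samples of its running integral I(t)."""
    return [(b - a) / T for a, b in zip(integral_samples, integral_samples[1:])]

# Stand-in for the analog side: finely integrate a pulse train of
# 1000 N for 1 ms out of every 10 ms -- far too fast to sample at T.
dt, T = 1.0e-5, 0.05
steps_per_sample = round(T / dt)
I, samples = 0.0, [0.0]
for i in range(50_000):                      # 0.5 s of "analog" time
    I += (1000.0 if i % 1000 < 100 else 0.0) * dt
    if (i + 1) % steps_per_sample == 0:
        samples.append(I)                    # the digital computer's samples

f_avg = average_forces(samples, T)           # each interval averages 100 N
```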
Descent or ascent engine thrust is calculated and
sent to the analog for computing moments due to
engine misalignments and center of gravity offsets.
Figure 3. Adage 770 control panel and manual control instructions.

Manual Controls
POWER ON/OFF - Power to 770.
REMOTE/MANUAL - Permits either manual or digital computer control of read/write operations in the 770.
READ SCAN UPPER LIMIT - Sets upper limit of sequential read scan.
WRITE SCAN UPPER LIMIT - Sets upper limit of sequential write scan.
REFERENCE VOLTAGE INTERNAL/EXTERNAL - Selects either internal or analog computer reference voltage.
MANUAL INPUT REGISTER - For entering manual data and instructions into the interface write buffer.
MANUAL WRITE - Enters the contents of the manual input register into the 770.
MANUAL INTERRUPT COMMAND - Generates a manual interrupt signal.
MANUAL READ - Manually duplicates a read request signal from the digital computer.

Indicators
READ ADDRESS - Displays the read channel address.

Figure 4. Allocation of computer tasks. Digital computer: attitude errors, guidance laws, engine thrusts, translational equations, inertias, c.g. positions, axis transformations, altitude, engine thrusts and control (for "quick look"). Analog computer: control system, body forces, body moments, rotational equations, CSM position, control system fuel, inertias, attitude rates, gimbal angles, integral of body forces, engine gimballing, mass, total fuel. Special purpose computing hardware: jet select logic, gimbal logic, pulse ratio modulators, jet thrust shaping. All are interconnected through the A/D Link.

Updated values of moments of inertia and center of
gravity positions from consumed values of reaction
control system and engine fuels are transmitted to
the analog at each computation interval. Altitude
above the lunar surface is also sent to the analog for
general monitoring purposes.
During the course of a run, the digital computer
also selects control system deadband values and
feedback gains. This is accomplished by the 7094-II
setting the proper Function Output flip-flops. Function relays on the analog computer are thus energized
and the appropriate analog circuitry switched in.
Fast Time: Although the studies are primarily
run in real time, certain coasting phases of the mission may be accomplished in fast time. For this
mode the analog computer is sent into RESET. The
LEM is then assumed to remain at a constant atti-
tude, and the digital computer goes through its
computations as fast as it can using larger iteration
intervals. The problem is returned to real time by
new initial conditions being written by the digital
computer and the analog sent once more into OPERATE.
Future Studies
At the time of this writing, the LEM hybrid
study has been made assuming fully automatic guidance. The capability for both manual and automatic
control should be available soon. The simulation
will then contain, in addition to the hybrid computer complex, a fixed base, instrumented LEM cockpit, a control and monitor console, and external
visual displays. The displays will present a view of
the lunar surface, CSM, and stars as would be observed by the crew members during a major portion
of the mission.
With the introduction of this additional simulation hardware, computer interface requirements become much more severe. Not only are more DAC's
and ADC's required but some digital-to-digital type
converters as well. These are necessary since the
digital computer must now read both analog (throttle and controller) signals and digital (switch positions) information as well as drive both analog and
digital displays in the cockpit.
CONCLUSIONS
The preceding has described an application of a
hybrid computing technique that is now making
contributions to the Apollo project. It has been
shown that, for the large scale LEM computer
studies, this is the only feasible method to use. The
stringent speed and resolution requirements, and the
complexity of the problem eliminate the all-analog
and all-digital approaches.
The combined analog-digital system, on the other
hand, provides the engineer with a computer complex that:
",
• contains a considerable amount of logic and
memory,
HYBRID COMPUTING FOR LUNAR EXCURSION MODULE STUDIES
• provides high resolution where needed, and
• is capable of real-time and fast-time operation.
Discussions of combined analog-digital techniques usually end with a debate on the probability
of the digital computer replacing the analog. Many
believe that hybrid computation is only a part of a
transitional period and that due to the rapidly increasing speed and input/output flexibility of digital
machines, a decade from now they will be performing most if not all simulation-type problems.
To those companies that in the past few years
have expanded their analog and digital facilities
and find both being used to full capacity, these prophecies are only of academic interest at present.
This also is true for the personnel involved in landing men on the moon in the beginning of the next
decade. They know that many times before 1970,
with the aid of computers, they must simulate this
feat. They are concerned with the 1965, not 1975,
state of the art.
The hybrid technique described here is being
used today and similar techniques will be used in
the next several years. For large scale simulation
studies such as are being made for the LEM-for
problems of such high order of complexity-the hybrid approach is the only computing method by
which the required objectives may be attained.
ACKNOWLEDGMENTS
Regarding the implementation of the hybrid
computing technique described here, the author has
merely reported the work of many people. He would
especially like to acknowledge the significant contributions made by R. Phagan and H. Ahders of
LEM Dynamic Analysis, and J. Sachleben, G. Connelly, R. Alleva, A. Mackenzie, and J. Casey of
Computing Sciences.
REFERENCES
1. A. J. Burns and R. E. Kopp, "Combined Analog-Digital Simulation," presented at the 1961
Eastern Joint Computer Conference, Washington,
D. C. (Dec. 12-14, 1961).
2. G. Zetkov and R. Fleisig, "Dynamic Analyses of OAO Spacecraft Motion by Analog-Digital
Simulation," presented at the Space Electronics Session of the 1962 IRE International Convention,
New York City, (March 29, 1962).
3. R. E. Kopp and R. J. Orford, "Linear
Regression Applied to System Identification for
Adaptive Control Systems," presented at the 17th
Annual Meeting and Space Flight Exposition, Pan
Pacific Auditorium, Los Angeles, California (Nov.
13-18,1962).
4. W. Valckenaere, R. Helm and H. Ahders,
"Dynamics of Homing Guidance," Grumman Aircraft Engineering Corporation Report ADR 06-0563.1 (March 1963).
5. Adage, Incorporated, Reference Manual for the Adage 770 Hybrid Computer Linkage System (Jan. 1964).
6. J. H. Sachleben, "Why Hybrid Computing?,"
Grumman Aircraft Engineering Corporation Research Department Computing Report CR 65-2
(Feb. 1965).
7. G. Connelly and F. Romani, "A Report on
the ADAGE 770 Computer Link and Operating
Procedures Applicable to Analog Computation,"
Grumman Aircraft Engineering Corporation Research Department Computing Report CR 65-3
(March 1965).
OPTIMUM DESIGN AND ERROR ANALYSIS OF DIGITAL INTEGRATORS
FOR DISCRETE SYSTEM SIMULATION
Andrew P. Sage
University of Florida
Gainesville, Florida
and
Roger W. Burt
Motorola, Inc.
Phoenix, Arizona
INTRODUCTION

In digital differential analyzers and digital computers, simulation is carried out by some form of numerical integration or of replacing a difference differential equation by a difference equation. This paper is concerned with the development of optimum numerical integration and digital simulation techniques and a discussion of the accuracy of these methods when compared with ideal integration.

CLASSICAL APPROACH TO DEVELOPING INTEGRATION RULES

A given function may be approximated by some polynomial over a short interval and the polynomial integrated rather than the original function. Newton's formula for representation of the function over t_0 <= t <= t_0 + nT is the polynomial1,8

P(t) = x_0 + u\,\Delta x_0 + \frac{u(u-1)}{2!}\,\Delta^2 x_0 + \frac{u(u-1)(u-2)}{3!}\,\Delta^3 x_0 + \frac{u(u-1)(u-2)(u-3)}{4!}\,\Delta^4 x_0 + \cdots \qquad (1)

where x(t) is the function, P(t) the polynomial, T is the sampling interval, and

t = t_0 + Tu, \qquad u = \frac{t - t_0}{T}

\Delta x_0 = x_1 - x_0, \qquad \Delta^2 x_0 = \Delta x_1 - \Delta x_0 = x_2 - 2x_1 + x_0

and where x_0, x_1, x_2, ... x_n are the values of x(t) at t_0, t_0 + T, t_0 + 2T, ... t_0 + nT. The value of the integral

y(t) = \int_{t_0}^{t_0 + nT} x(t)\,dt \qquad (2)

is then approximately

\int_{t_0}^{t_0 + nT} x(t)\,dt \approx \int_{t_0}^{t_0 + nT} P(t)\,dt \qquad (3)
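Equation (1) can be checked numerically. The short routine below (an illustration added here, not part of the paper) builds the forward differences and evaluates P(t); at the sample points the polynomial reproduces x(t) exactly, and between them it interpolates.

```python
import math

def newton_forward(x, T, t0, t):
    """Evaluate the Newton forward-difference polynomial of Eq. (1)
    through the samples x[0..n] taken at t0, t0 + T, ..., t0 + n*T."""
    diffs, col = [x[0]], list(x)
    for _ in range(len(x) - 1):
        col = [b - a for a, b in zip(col, col[1:])]
        diffs.append(col[0])               # forward difference delta^k x_0
    u = (t - t0) / T
    total, term = 0.0, 1.0
    for k, d in enumerate(diffs):
        if k > 0:
            term *= (u - (k - 1)) / k      # u(u-1)...(u-k+1) / k!
        total += term * d
    return total

T, t0 = 0.1, 0.0
x = [math.sin(t0 + k * T) for k in range(4)]              # four samples
p = [newton_forward(x, T, t0, t0 + k * T) for k in range(4)]
```

Here p agrees with x at the grid points, and evaluating between samples closely tracks sin t.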
If Eq. (1) is substituted and integrated term by term the result is

\int_{t_0}^{t_0 + nT} x(t)\,dt \approx T\left[\, n x_0 + \frac{n^2}{2}\,\Delta x_0 + \left(\frac{n^3}{3} - \frac{n^2}{2}\right)\frac{\Delta^2 x_0}{2!} + \left(\frac{n^4}{4} - n^3 + n^2\right)\frac{\Delta^3 x_0}{3!} + \cdots \right] \qquad (4)

Various classical integration rules are obtained from Eq. (4). The trapezoidal rule results when n = 1 and Simpson's rule is obtained when n = 2.

As n is increased in Eq. (4), the approximation to the true value of the integral usually becomes better. The error involved in this approximation is known as truncation error and will be the only type of error considered in detail in this paper. Another source of error in digital integrators is known as round-off or quantization error, which has an approximate root mean square value of q/\sqrt{12}, where q is the quantization level.

A complete discussion of DDA theory, operations, mechanization, and programming is the presentation by Gschwind.3 Truncation errors are discussed in references 3-6 while round-off errors are discussed in references 2, 3, 5, and 6. The material to be presented here applies equally to digital differential analyzers of the serial or parallel type as well as conventional digital computers.

DEVELOPMENT OF OPTIMUM NUMERICAL INTEGRATION AND SIMULATION RULES

The theoretical design problem for optimum numerical integration and transfer function simulation rules may be visualized by referring to Fig. 1.

Figure 1. System for optimization problem.

In Fig. 1, r(t) is the input, J(s) is the ideal operation, H(z) is the desired transfer function, and F(z) is the fixed portion of the system. The error sequence of the system is

e(nT) = c_i(nT) - c_a(nT)

where c_i(nT) is the ideal output after sampling, and c_a(nT) is the actual sampled output. After substitution the error sequence in z-transform notation is

E(z) = [R(s)J(s)]^* - R(z)F(z)H(z) \qquad (5)

where [R(s)J(s)]^* is the z transform of R(s)J(s). The sum of error squared may be written

\sum_{n=0}^{\infty} e(nT)^2 = \frac{1}{2\pi j}\oint E(z)\,E(z^{-1})\,z^{-1}\,dz \qquad (6)

where the contour of integration is the unit circle.1 Substitution of Eq. (5) into this expression yields

\sum_{n=0}^{\infty} e(nT)^2 = \frac{1}{2\pi j}\oint \left[A(z) - R(z)F(z)H(z)\right]\left[A(z^{-1}) - R(z^{-1})F(z^{-1})H(z^{-1})\right] z^{-1}\,dz \qquad (7)

where A(z) = [R(s)J(s)]^*.

H(z) is to be determined to reduce the sum of error squared given by Eq. (7) to a minimum. This problem may be solved by applying the calculus of variations,7,9 which yields the result

H_0(z) = \frac{1}{\left[R(z)R(z^{-1})F(z)F(z^{-1})\right]_+}\left\{\frac{R(z^{-1})F(z^{-1})A(z)}{\left[R(z)R(z^{-1})F(z)F(z^{-1})\right]_-}\right\}_{P.R.} \qquad (8)

where the symbol { }_{P.R.} refers to the physically realizable portion of the term within { }, and the + and - subscripts refer to the conventional spectrum factorization operator denoting extraction of the multiplicative term containing poles and zeros inside (+) or outside (-) the unit circle.

DISCUSSION OF THE FIXED PORTION OF THE SYSTEM

If a solution of the equation

\frac{dy}{dt} = g\left[x(t) - y\right] \qquad (9)
is attempted by a numerical integration process such
as the standard trapezoidal rule.
y(nT) =y(n-l )T+
T
2
[y' (n-l )T+y' (nT)]
(10)
it is seen that the simulation of the system described
by Eq. (9) and illustrated in block diagram form in
Fig. 2a is unrealizable since y' (nT) cannot be obtained until y(nT) is known. In order to implement
905
OPTIMUM DESIGN AND ERROR ANALYSIS OF DIGITAL INTEGRATORS
x( t)
-1
+
y(t)
+ z
- z-1
g( e)
x(t) +
z
a)
-1
y(t)
+ z
- z-I
T
2
g( e)
-1
UNREALIZABLE SIMULATION
b)
REALIZABLE SIMULATION
Figure 2. Digital simulation of dy / dt = g(x - y).
the simulation, a delay may be introduced10,11 as shown in Fig. 2b. In open loop problems this delay is not necessary. However, for closed loop problems such as those occurring in the digital simulation of control systems, the introduction of the delay may cause major errors as compared with using an optimum realizable discrete transfer function. If the integral is estimated by using only previous values of the dependent variable, the need for the delay is eliminated. It will be seen that if

F(z) = e^-snT = z^-n,  n = 1, 2, . . .    (11)

the need for the delay is eliminated. In letting F(z) = z^-1, the discrete transfer functions developed are forced to give the least sum of error squared when the present value of the dependent variable is not available. Discrete transfer functions with delay will be referred to as open loop realizable and those without delay as closed loop realizable. In any single loop only one closed loop realizable discrete transfer function would be required. It will be shown that the block diagram of Fig. 3, which uses an optimum realizable integrator, will give less error for a given sample period than the simulation of Fig. 2b.

Figure 3. Optimum digital simulation of dy/dt = g(x - y).

INTEGRATORS DETERMINED FOR STEP, RAMP AND PARABOLA INPUTS

Substitution into Eq. (8) with J(s) = 1/s, F(z) = z^-n and a given R(s) results in Table 1. An example of the procedure is the solution for R(s) = 1/s^2 and n = 1. When it is noted that

R(z)R(z^-1)F(z)F(z^-1) = [T/(1 - z^-1)^2][T/(1 - z)^2]

Eq. (8) becomes

H0(z) = { [Tz/(1 - z)^2] z (T^2/2) z^-1 (1 + z^-1)/(1 - z^-1)^3 / [T/(1 - z)^2] }P.R. / [T/(1 - z^-1)^2]
      = { (T^2/2) z (1 + z^-1)/(1 - z^-1)^3 }P.R. / [T/(1 - z^-1)^2]    (12)

The physically realizable portion of the term within the brackets must now be found. Expansion of this term (apart from the factor T^2/2) yields

(z + 1)/(1 - 3z^-1 + 3z^-2 - z^-3)    (13)

If z is subtracted from this it becomes

(4 - 3z^-1 + z^-2)/(1 - z^-1)^3    (14)

which is physically realizable. The optimum integrator transfer function is therefore

z^-1 H0(z) = (T/2) z^-1 (4 - 3z^-1 + z^-2)/(1 - z^-1)    (15)

SINE LOOP ERROR ANALYSIS

Probably the most demanding function required of digital integrators is the generation of sine waves (far less demanding would be, for instance, a decaying exponential generator). Thus it is desirable to develop means by which truncation errors involved in sine wave generation may be compared.
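The claim that the closed loop realizable three-point integrator of Fig. 3 gives less error than the delayed simulation of Fig. 2b can be checked numerically. The sketch below is an illustration (not the authors' program): it assumes a linear g(e) = e, a unit-step x(t), and a flat startup history, so the exact solution is y(t) = 1 - e^-t.

```python
import math

def simulate(T, steps, weights):
    # dy/dt = g(x - y) with g(e) = e, unit-step x, y(0) = 0.
    # weights give the integration rule applied to past derivative samples
    # d[n-1], d[n-2], d[n-3], where d = x - y.
    y = 0.0
    d_hist = [1.0, 1.0, 1.0]   # flat startup: d[-2] = d[-1] = d[0] = 1
    ys = [y]
    for _ in range(steps):
        y += T * sum(w * d for w, d in zip(weights, d_hist))
        d_hist = [1.0 - y] + d_hist[:2]
        ys.append(y)
    return ys

T, steps = 0.2, 60
# (T/2)(4 d[n-1] - 3 d[n-2] + d[n-3]): closed loop realizable three-point rule
three_point = simulate(T, steps, [2.0, -1.5, 0.5])
# (T/2)(d[n-1] + d[n-2]): trapezoidal rule with the Fig. 2b delay
delayed_trap = simulate(T, steps, [0.5, 0.5, 0.0])

def max_err(ys):
    return max(abs(y - (1.0 - math.exp(-n * T))) for n, y in enumerate(ys))
```

With T = 0.2 the three-point simulation stays within about 0.02 of the exact response while the delayed trapezoidal simulation lags by roughly 0.08, consistent with the claim above.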
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965

Figures 4a and 4b show the sine loop for the digital and ideal integrators. If impulses are assumed for inputs, the outputs are
C(z) = [z^-n H0(z)]^2 / {1 + [z^-n H0(z)]^2}    (16)

and

C(z) = z^-1 sin T / (1 - 2z^-1 cos T + z^-2)    (17)
Now, if the expressions for z-nHo(z) from Table 1
are substituted into Eq. (16), results are obtained
which are listed in Table 2. The expressions for C(z)
in Table 2 may be divided out for any desired value
of T in order to determine the output sequences. This
has been accomplished for each of the integrators
Figure 4. Sine loops for the digital and ideal integrators.
Table 1. Expressions for Optimum Digital Integrators.

     R(s)    n    z^-n H0(z)                                          Rule
a    1/s     0    T z^-1 / (1 - z^-1)                                 Rectangular rule
b    1/s     1    T z^-1 / (1 - z^-1)                                 Rectangular rule
c    1/s^2   0    (T/2)(1 + z^-1) / (1 - z^-1)                        Trapezoidal rule
d    1/s^2   1    (T/2) z^-1 (4 - 3z^-1 + z^-2) / (1 - z^-1)          Three-point extrapolation
e    1/s^3   0    (T/3)(1 + 4z^-1 + z^-2) / (1 - z^-2)                Simpson's rule
f    1/s^3   1    (T/3) z^-1 (8 - 5z^-1 + 4z^-2 - z^-3) / (1 - z^-2)  Four-point rule
with T = π/8, and results (except for the four-point rule) are shown in Figs. 5 through 10.

From Eq. (16), it is seen that the characteristic equation of the sine loop is

1 + [z^-n H0(z)]^2 = 0    (18)

After substitution from Table 2, solution of this equation for several values of T gives corresponding pole locations on the root locus. Figs. 11 through 17 show root locus plots for the sine loops for the various integrators considered.

Although root loci may be found in the s-plane from s = (1/T) ln z, the different modes of transient or steady-state behavior are characterized by the root loci in the z plane. Complex conjugate poles inside the unit circle correspond to a damped oscillatory output sequence, while complex conjugate poles outside the unit circle indicate an exponentially rising or unstable output sequence. An ideal sine wave output sequence is indicated by complex conjugate poles on the unit circle. If poles fall near the origin so that their magnitude is much less than one, they may be neglected in favor of complex conjugate pairs near the unit circle. Poles near the origin represent fast decaying transients.

In all but one of the root locus plots, it is seen that for small values of T one complex conjugate pair lies near the unit circle while the remainder of the poles are near the origin. Thus, in these cases, it is a very good approximation to deal only with what may be called the control poles near the unit circle, especially after the first few samples.
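The pole locations just described can be reproduced with a few lines of numerical root finding. A sketch (numpy assumed available) for the characteristic equations of the trapezoidal and Simpson's 1/3 sine loops:

```python
import numpy as np

def sine_loop_poles(num, den, k):
    # characteristic equation 1 + [H(z)]^2 = 0 with H(z)^2 = k * num(z)^2 / den(z)^2,
    # i.e. den(z)^2 + k * num(z)^2 = 0, written in positive powers of z
    char = np.polymul(den, den) + k * np.polymul(num, num)
    return np.roots(char)

T = np.pi / 8
# two unrealizable trapezoidal integrators: H(z) = (T/2)(z + 1)/(z - 1)
trap = sine_loop_poles([1.0, 1.0], [1.0, -1.0], (T / 2) ** 2)
# two Simpson's 1/3 integrators: H(z) = (T/3)(z^2 + 4z + 1)/(z^2 - 1)
simp = sine_loop_poles([1.0, 4.0, 1.0], [1.0, 0.0, -1.0], (T / 3) ** 2)
```

Both loops place every pole on the unit circle, but the Simpson's loop contributes a second conjugate pair near z = -1: the undamped high-frequency spurious mode discussed below.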
From Figs. 5 and 10, it appears that the rectangular
Table 2. Sine Loop Output Expressions.

a. Ideal:
   C(z) = z^-1 sin T / (1 - 2z^-1 cos T + z^-2)

b. Rectangular rule (two integrators, [T z^-1/(1 - z^-1)]^2):
   C(z) = T^2 z^-2 / [1 - 2z^-1 + (1 + T^2) z^-2]

c. Trapezoidal rule (two unrealizable integrators, [(T/2)(1 + z^-1)/(1 - z^-1)]^2):
   C(z) = T^2 (1 + 2z^-1 + z^-2) / {(T^2 + 4)[1 + 2((T^2 - 4)/(T^2 + 4)) z^-1 + z^-2]}

d. Realizable trapezoidal rule ((T/2)[(1 + z^-1)/(1 - z^-1)] · (T/2)[z^-1 (1 + z^-1)/(1 - z^-1)]):
   C(z) = (T^2/4) z^-1 (1 + z^-1)^2 / [(1 - z^-1)^2 + (T^2/4) z^-1 (1 + z^-1)^2]

e. Three-point rule (two integrators, [(T/2) z^-1 (4 - 3z^-1 + z^-2)/(1 - z^-1)]^2):
   C(z) = (T^2/4)(16z^-2 - 24z^-3 + 17z^-4 - 6z^-5 + z^-6) / [1 - 2z^-1 + z^-2 + (T^2/4)(16z^-2 - 24z^-3 + 17z^-4 - 6z^-5 + z^-6)]

f. Three-point rule with trapezoidal rule ((T/2)[(1 + z^-1)/(1 - z^-1)] · (T/2)[z^-1 (4 - 3z^-1 + z^-2)/(1 - z^-1)]):
   C(z) = (T^2/4) z^-1 (4 + z^-1 - 2z^-2 + z^-3) / [1 - 2z^-1 + z^-2 + (T^2/4) z^-1 (4 + z^-1 - 2z^-2 + z^-3)]

g. Simpson's 1/3 rule (two integrators, [(T/3)(1 + 4z^-1 + z^-2)/(1 - z^-2)]^2):
   C(z) = (T^2/9)(1 + 4z^-1 + z^-2)^2 / [(1 - z^-2)^2 + (T^2/9)(1 + 4z^-1 + z^-2)^2]
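Dividing out C(z), as described in the text, amounts to running the difference equation each entry implies. A small sketch for rows a and c (using the coefficients reconstructed above):

```python
import math

def impulse_response(num, den, n):
    # num, den hold coefficients of z^0, z^-1, ... with den[0] = 1;
    # long division of num/den: c[k] = num[k] - sum over j of den[j] * c[k-j]
    out = []
    for k in range(n):
        acc = num[k] if k < len(num) else 0.0
        acc -= sum(den[j] * out[k - j] for j in range(1, min(k, len(den) - 1) + 1))
        out.append(acc)
    return out

T = math.pi / 8
# row a: ideal sine loop, output is exactly sin(nT)
ideal = impulse_response([0.0, math.sin(T)], [1.0, -2.0 * math.cos(T), 1.0], 200)

# row c: two unrealizable trapezoidal integrators
s = T * T / (T * T + 4.0)
alpha = (T * T - 4.0) / (T * T + 4.0)
trap = impulse_response([s, 2.0 * s, s], [1.0, 2.0 * alpha, 1.0], 200)
```

The trapezoidal sequence oscillates with constant amplitude of roughly 0.38 (its poles sit on the unit circle) at a slightly lower frequency than sin nT, which matches the scale and behavior of Fig. 6.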
Figure 5. Output sequence for rectangular rule (Table 2b).
Figure 6. Output sequence for two unrealizable trapezoidal
rule integrators (Table 2c).
Figure 7. Output sequence for realizable trapezoidal rule (Table 2d).

Figure 10. Output sequence for two Simpson's 1/3 rule unrealizable integrators (Table 2g).
Figure 8. Output sequence for two three-point rule integrators (Table 2e).

Figure 11. Root locus for rectangular rule sine loop (Table 2b).
Figure 9. Output sequence for three-point rule-trapezoidal rule combination (Table 2f).

Figure 12. Root locus for unrealizable trapezoidal rule sine loop (Table 2c).
Figure 13. Root locus for realizable trapezoidal rule sine loop.

Figure 15. Root locus for trapezoidal integrator-three-point rule integrator (Table 2f).
In general, it may be seen that as the integration rule becomes more complicated, more poles are added to the discrete transfer function, with a corresponding increase in the chances of instability and spurious modes of response. The three-point rule adds four extra poles. Simpson's 1/3 rule adds only two extra, but these are very damaging. The four-point rule, which is the closed loop counterpart of Simpson's rule, has six extra poles, of which two are very damaging.

The best choice is now seen to be the combination of the open loop trapezoidal rule and the closed loop counterpart of the open loop trapezoidal rule, the three-point rule. Almost as desirable is the cascade combination of two closed loop three-point rules. The cascade combination is somewhat more desirable in terms of systematization of computer programming.
Figure 14. Root locus for two three-point rule integrators (Table 2e).
rule, although very simple, compares unfavorably with the trapezoidal and three-point rules. Therefore, it will not be discussed in the remainder of this paper. Simpson's 1/3 rule has two pairs of complex conjugate poles on the unit circle. The effect of the extra pair may be seen as a high frequency oscillation of the sample points in Fig. 10. A similar but worse effect may be seen with the four-point rule. Simpson's 1/3 rule and the four-point rule are thus unsatisfactory when sine wave generation is used as a criterion, and so are also not discussed further.
AMPLITUDE AND FREQUENCY ERRORS

From the locations of the control poles it is possible to find "a" and "b" in the expression

c(nT) = e^-anT sin bnT    (19)

since the z transform of this expression is

C(z) = e^-aT sin bT z^-1 / (1 - 2e^-aT cos bT z^-1 + e^-2aT z^-2)    (20)

Assuming that all but the control poles may be neglected, Eq. (20) may be used to find the values for "a" and "b" in relation (19). The results are
Figure 16. Root locus for Simpson's 1/3 rule sine loop (Table 2g).
presented in Figs. 18 and 19 for the unrealizable trapezoidal integrators, the realizable three-point integrators, and the combination trapezoidal and three-point rule. As might be expected, the unrealizable trapezoidal rule has no amplitude error. However, both the realizable three-point integrators and the combination three-point integrator-trapezoidal integrator give less frequency (or time) error than the (unrealizable) trapezoidal integrators.
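Extracting "a" and "b" from a control pole z = e^((-a + jb)T) is a two-line computation. A sketch using the trapezoidal loop's control pole, which from the characteristic equation of Table 2c satisfies (z - 1)/(z + 1) = jT/2:

```python
import cmath
import math

def damping_and_frequency(pole, T):
    # match the control pole to c(nT) = e^{-anT} sin(bnT): pole z = e^{(-a + jb)T}
    a = -math.log(abs(pole)) / T
    b = cmath.phase(pole) / T
    return a, b

T = 0.5
pole = (1 + 1j * T / 2) / (1 - 1j * T / 2)  # control pole, trapezoidal sine loop
a, b = damping_and_frequency(pole, T)
```

Here a = 0 exactly (the pole magnitude is one, so no amplitude error), while b = (2/T) arctan(T/2) ≈ 0.98 < 1: the frequency error plotted in Fig. 19.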
OTHER APPLICATIONS OF THE METHOD
The authors have had considerable success in applying this method of discrete simulation of control systems to a number of examples. Systematic application of the technique is obtained by a phase variable description of the system and repeated application of the three-point formula for realizable closed loop integration. It is not necessary that this be done; it is convenient with respect to computer programming, however.

The following example illustrates an alternate method of applying this optimum discrete simulation technique. A similar example was first treated by Hurt10 in his discussion of the IBM simulation method.
Consider the nonlinear system represented in block diagram form by Fig. 20. The three-point integration rule could be used to simulate the 5/(s + 2) term. An alternate approach is to find an optimum discrete
Figure 17. Ideal pole locations (Table 2a).

Figure 18. Plots of "a" vs T.
approximation of the entire transfer function 5/(s + 2) for a suitable input such as a ramp. The approximation need only be open loop realizable. From Eq. (8), the discrete transfer function under the stated conditions becomes

G(z) = [5/(4T)] [(2T - 1 + e^-2T) + z^-1 (1 - e^-2T - 2Te^-2T)] / (1 - e^-2T z^-1)

By using the closed loop realizable three-point rule for integration, the block diagram for the discrete simulation becomes that shown in Fig. 21 (P1 = P2 = 1).

Figure 19. Plots of "b" vs T.

Figure 20. Feedback system with limiter.

Figure 21. Discrete simulation of Fig. 20.
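As a check on the discrete transfer function above, its unit step response can be generated directly from the difference equation it implies. A sketch (not the authors' program):

```python
import math

def step_response(T, n):
    # y[k] = e^{-2T} y[k-1] + b0 u[k] + b1 u[k-1], from
    # G(z) = (5/(4T)) [(2T - 1 + e^{-2T}) + z^{-1}(1 - e^{-2T} - 2T e^{-2T})]
    #        / (1 - e^{-2T} z^{-1}), with unit-step input u
    E = math.exp(-2.0 * T)
    b0 = 5.0 * (2.0 * T - 1.0 + E) / (4.0 * T)
    b1 = 5.0 * (1.0 - E - 2.0 * T * E) / (4.0 * T)
    y, out, u_prev = 0.0, [], 0.0
    for _ in range(n):
        y = E * y + b0 * 1.0 + b1 * u_prev
        u_prev = 1.0
        out.append(y)
    return out

resp = step_response(0.1, 200)
```

The d-c gain G(1) = 5/2 matches 5/(s + 2) at s = 0, so the discrete step response rises monotonically and settles at 2.5 like the continuous element.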
Figure 22 shows the unit step response of the system for three sample periods, T = 0.1, 0.2, and 0.5 seconds. While the step response for T = 0.5 is not as accurate as the IBM method for T = 0.5, it is considerably better than Tustin's method of simulation.10 Since this method does not require a detailed root locus redesign of the loop gains or computation of the "input transfer function" as does the IBM method, and since it provides considerable reduction in error over the Tustin and other approaches,11 it is felt that its use in discrete system simulation is warranted.
If the system is decidedly nonlinear, a more useful result is obtained if the nonlinear characteristics are taken into account. For the example treated here, as the amplitude of the input step is increased, the approximation error for a given sampling period increases. This is due to the fact that the "optimum" discrete approximation has been derived on a linear basis and the system is highly nonlinear for large inputs. The approximation error may be reduced by decreasing the sampling period or by making use of the fact that the system is actually nonlinear in determining the "optimum" discrete approximation.

The IBM method10 essentially consists of adjusting gain P1 or P2 such that the closed loop eigenvalues for the discrete approximation are the same as for a linearized version of the continuous system. Here a somewhat different method is proposed which takes into account the actual nonlinear characteristic of the system. It is desired that the digital system output Yd(t) approach the analog output Ya(t), where Ya(t)
Figure 22. Step response of the system of Fig. 20 (continuous system and T = 0.1, 0.2, and 0.5 seconds).
and Yd(t) are m vectors describing the system output state for a given input. It is assumed that the analog state vector output Ya(t) is completely known, as is the input to the system. The form of the digital system has been determined by the previously presented method and is known except for a certain number of constant parameters P, which are to be determined. P will be interpreted as a p vector and will be adjusted to minimize the cost function

J = Σ_{k=0}^{N} || Ya(kT) - Yd(kT) ||^2_R    (21)

subject to the constraint that

Yd((n+1)T) = f[Yd(nT), P],   Yd(0) = Ya(0)    (22)

Application of standard variational calculus procedures demonstrates that the optimum parameter vector P is determined by solution of difference Eqs. (22) and (23) together with the adjoint vector difference equations

λy(nT) = -∇Yd f[Yd(nT), P] λy((n+1)T) + R[Ya(nT) - Yd(nT)]    (23)

λp(nT) = λp((n+1)T) - ∇P f[Yd(nT), P] λy((n+1)T)    (24)

λy(NT) = 0    (25)

λp(NT) = λp(0) = 0    (26)

where ∇ represents the gradient operator. Equations (22) through (26) represent a two-point nonlinear boundary value problem which may be solved by the method of quasilinearization,12 a technique due to Bellman.13 A new 2(m + p) vector

x'(nT) = [Yd'(nT), P', λy'(nT), λp'(nT)]    (27)

will be defined such that Eqs. (22) through (26) may be described by the difference equation

x((n+1)T) = g[x(nT)]    (28)

with the boundary conditions

<Ci(jT), x(jT)> = bi(jT),   j = 0, N;  i = 1, 2, . . . , (m + p)    (29)

where the Ci and x are 2(m + p)-dimensional vectors and < , > denotes the inner product. If x0(nT) is the initial guess to the solution of Eq. (28), the (q + 1)th approximation is obtained from the qth by

x^(q+1)((n+1)T) = g[x^q(nT)] + J{g[x^q(nT)]}{x^(q+1)(nT) - x^q(nT)}    (30)

where J is the Jacobian matrix whose ijth element is the partial derivative ∂gi/∂xj. If Φ^(q+1)(nT) is the fundamental matrix of

Φ^(q+1)((n+1)T) = J{g[x^q(nT)]} Φ^(q+1)(nT),   Φ^(q+1)(0) = I = identity matrix    (31)

and p^(q+1)(nT) is the particular vector solution of

p^(q+1)((n+1)T) = g[x^q(nT)] + J{g[x^q(nT)]}{p^(q+1)(nT) - x^q(nT)},   p^(q+1)(0) = 0    (32)

the solution of Eq. (30) is

x^(q+1)(nT) = Φ^(q+1)(nT) v^(q+1) + p^(q+1)(nT)    (33)

where v^(q+1) is a constant vector which is determined by solving

<Ci(jT), Φ^(q+1)(jT) v^(q+1) + p^(q+1)(jT)> = bi(jT),   j = 0, N;  i = 1, 2, . . . , (m + p)    (34)
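The boundary value machinery above is general; for a single parameter the same idea reduces to a sensitivity recursion plus a Newton update. The sketch below is a deliberately simplified scalar illustration (a Gauss-Newton fit, not the full quasilinearization of Eqs. (28)-(34)), using hypothetical dynamics yd' = P(u - yd) and an "analog" record generated from y' = 2(u - y):

```python
import math

def fit_parameter(T=0.1, N=50, P=2.0, iterations=20):
    # analog record: y' = 2(u - y), u = 1, y(0) = 0  ->  ya(nT) = 1 - e^{-2nT}
    ya = [1.0 - math.exp(-2.0 * n * T) for n in range(N + 1)]
    for _ in range(iterations):
        yd, s = 0.0, 0.0          # digital state and sensitivity d(yd)/dP
        num = den = 0.0
        for n in range(N):
            yd_next = yd + T * P * (1.0 - yd)        # Euler model of the digital system
            s = (1.0 - T * P) * s + T * (1.0 - yd)   # sensitivity recursion
            yd = yd_next
            r = ya[n + 1] - yd                       # residual against the analog output
            num += r * s
            den += s * s
        P += num / den                               # Gauss-Newton update on P
    return P

P_opt = fit_parameter()
```

With T = 0.1 the fitted parameter settles near 1.813 rather than the continuous gain 2: the parameter absorbs the discretization error, which is the same effect exploited in Fig. 23.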
Application of this technique to the specific problem of Fig. 21 with T = 0.1, P2 = 1, N = 100, and

J = Σ_{k=0}^{100} [Ya(kT) - Yd(kT)]^2

yields a parameter P1 which decreases with increasing input step size, as illustrated in Fig. 23. Without the change in P1 indicated by this figure, the discrete simulation becomes unstable for a given sampling period as the step size increases. The use of the optimization technique allows determination of P1 as a function of the input such that very large (compared to conventional methods) sampling periods can be used. The price paid for this "tailoring" is of course an increase in the time required to determine the discrete system parameters. However, the quasilinearization method is readily programmed on a digital or hybrid computer, and computation time is short if rapid convergence is assured by a good initial guess x0(nT), as is easily the case here.

Figure 23. Optimum gain parameter setting for nonlinear system (P1 versus amplitude of input step).

CONCLUSIONS

A new method for the discrete simulation of systems has been presented. The technique consists of determining best least squares approximations for the linear elements of a continuous system and adjusting loop gains via the quasilinearization approach such that an accurate approximation to any decidedly nonlinear elements is obtained. In many cases tailoring of the complete closed loop system is not necessary. If tailoring is necessary, actual treatment of the nonlinearities involved via quasilinearization allows resolution of any stability problems concerned with the discrete approximation.

ACKNOWLEDGMENTS

The first part of this research was initiated by the authors while at the University of Arizona. The principal author is indebted to his students at the University of Florida, in particular B. Eisenberg and S. Goldberg, for assisting in the digital computer studies necessary to evaluate the theoretical results in the latter portions of the paper.

REFERENCES
1. J. T. Tou, Digital and Sampled-Data Control Systems, McGraw-Hill, 1959.
2. B. Widrow, "Statistical Analysis of Amplitude-Quantized Sampled-Data Systems," Applications and Industry, vol. 52, pp. 555-567 (Jan. 1961).
3. H. W. Gschwind, "Digital Differential Analyzers," Electronic Computers, P. Von Handel, ed., Prentice-Hall, 1961, p. 139.
4. D. J. Nelson, A Foundation for the Analysis of Analog-Oriented Combined Computer Systems, Doctoral Thesis, Stanford University, Stanford, Calif., Apr. 1962.
5. F. B. Hills, A Study of Incremental Computation by Difference Equations, Masters Thesis, Massachusetts Institute of Technology, 1958.
6. M. Palevsky, Computer Handbook, Korn and Huskey, eds., McGraw-Hill, 1961, pp. 19-14 to 19-74.
7. J. T. Tou, "Statistical Design of Digital Control Systems," IRE Transactions on Automatic Control, vol. AC-5, pp. 290-296 (Sept. 1960).
8. F. B. Hildebrand, Introduction to Numerical Analysis, McGraw-Hill, 1956.
9. R. W. Burt, Optimum Design and Error Analysis of Digital Integrators, Thesis, University of Arizona, Tucson, 1963.
10. J. M. Hurt, "New Difference Equation Technique for Solving Non-Linear Differential Equations," AFIPS Conference Proceedings, vol. 24, 1964.
11. M. E. Fowler, "A New Numerical Method for Simulation," Simulation, May 1965.
12. A. P. Sage, "Suboptimal Control via Frequency Domain Techniques, Quasilinearization, and Differential Approximation," Proceedings, Third Allerton Conference on Circuit and System Theory, University of Illinois, Oct. 1965.
13. R. Bellman, R. Kalaba and R. Sridhar, "Adaptive Control via Quasilinearization and Differential Approximation," Rand Corporation Research Memorandum RM-3928-PR (Nov. 1963).
SEQUENTIAL ANALOG-DIGITAL COMPUTER
Hermann Schmid
General Electric Company
Light Military Electronics Department
Johnson City, New York
INTRODUCTION

The applications of digital computers to analog control systems, where the inputs to the computer are the outputs from analog sensors and the outputs from the computer must drive analog controls, increase steadily, even in cases where analog computer accuracy (1 percent) would be sufficient. The reasons for this are:

• Digital computers, built with integrated circuits, are small, reliable, use little power and are insensitive to changes in environment.
• Conventional analog computers1,2 in comparison are large, unreliable, vary considerably with change in environment, require precision components, stable power supplies and many adjustments. In addition, conventional analog computers do not lend themselves easily to sequential operation.

Although the statements above are correct, a comparison of this type is worthless and often misleading because the interface equipment required with the digital computer is not included. In control applications with many signals but few computations, the size, weight and cost of the interface circuits may equal, or even exceed, those of the computer. Besides, the analog-to-digital and digital-to-analog conversion circuits are subject to the same shortcomings and limitations as the analog computer circuits.

This paper describes a Sequential Analog-Digital Computer (SADC) which overcomes many of the limitations of the conventional techniques described above by combining an analog arithmetic unit with a digital memory and a digital control unit. The computer thus exploits the advantages of the analog technique (no interface circuits required, ease of summing and scaling, high resolution) with the advantages of the digital technique (drift-free storage, logic decisions, ease in signal switching). Except for a few precision components, SADC can be built entirely with integrated circuits.

There are only a few sequential analog computers described in the literature3,4 and only one known technique which is similar to SADC.5 E. V. Bohn describes a pulse-time computer which uses vacuum tube integrators, vacuum tube current switches and a magnetic drum for storage.
COMPUTER ORGANIZATION
As shown in Fig. 1, the organization of SADC is very similar to that of a sequential digital computer.6,7 SADC combines an analog arithmetic unit with a digital memory and a digital control unit in a unique way. The inputs to SADC may be a-c or d-c voltage or pulse-time signals, whereas the outputs are either in pulse-time or d-c voltage form.

Figure 1. Basic building blocks of SADC.

Analog computing elements in the arithmetic unit are connected in various ways to perform different arithmetic operations under control of the program. The control unit connects signals to and from the arithmetic unit in appropriate time intervals according to the stored program.

The arithmetic unit performs each arithmetic operation, no matter how complex, in one basic timing interval and without the use of the memory. Often, the results of one arithmetic operation are used as the initial conditions for the next operation, thus eliminating the need for storage and associated transfers.

Pulse-time signals are used for transferring information from the arithmetic unit to the memory, and vice versa. The outputs of SADC are not provided continuously (sampled data), and one buffer element is required for each output signal, just as in a sequential digital computer.

ARITHMETIC UNIT

The arithmetic unit in Fig. 2 consists of integrators, inverters and comparators which may be used separately or combined. One integrator and one comparator perform, e.g., multiplication or division.

The design of the analog integrator is conventional,1,2 with resistor R in the input and capacitor C in the feedback path of a high-gain operational amplifier, the output of which is

Vo = -(1/RC) ∫ Vi dt

The quality of this integrator depends on the precision and the stability of R, C, and on the zero offsets of the amplifier.

The design of the analog summer-inverter is also conventional,1,2 with resistors R31 to R3n in the input and R4 in the feedback path of a high-gain operational amplifier, the output of which is

Vo = -R4 [V1/R31 + V2/R32 + . . . + Vn/R3n]

The quality of the inverter depends on the precision of R3, R4, and on the zero offsets of the amplifier.

The comparator uses a differential amplifier and logic circuits to indicate on two wires the result of the comparison Vo - Vc. When (Vo - Vc) is larger than +2 mv, the amplifier output Vp is +VB, and when (Vo - Vc) is smaller than -2 mv, Vp is zero. The transition from +VB to zero requires 50 nanoseconds. Vp, which indicates the polarity of (Vo - Vc), is stored in a flip-flop. NOR-gating provides the pulse-time output Pt1 if (Vo - Vc) > 0, and its complement if (Vo - Vc) < 0.

The number of integrators, comparators and inverters used in an arithmetic unit is a function of the problem complexity and the required computation speed. The larger the number of computing elements, the more arithmetic operations can be performed in parallel.

All amplifiers, resistors and capacitors are susceptible to changes in environment. The arithmetic unit and the power supply regulators, therefore, are put into a small oven, operating at 70 ± 2°C, to minimize computation errors due to changes in temperature.

Figure 2. Typical arithmetic unit.

MEMORY

In the sequential analog computer, relatively few variable signals need be stored because:

• The arithmetic unit does not require storage when executing an arithmetic operation.
• Inputs are continuously available.
• The arithmetic unit itself has limited storage capability.
• Often, outputs of one operation remain as inputs for the next operation.
Therefore, the requirements on the memory elements for the SADC are entirely different from those of conventional digital computer memory elements. For a pulse-time memory element, it is important that the circuitry can be packaged into integrated-circuit packages and that it can be driven from and read by integrated circuits. The method of storing pulse-time signals with continuously operating counters8 fulfills these conditions.

In the pulse-time memory shown in Fig. 3, the output square waves from the most significant stage of the master and the slave counter, both driven by the same frequency fc, are gated to detect the phase shift between them. This phase shift, which is proportional to the difference in count between these two counters, produces the pulse-width output signal tx.
The value of tx varies when the number of pulses
Figure 3. Memory element.
reaching the slave counter is different from the number of pulses reaching the master counter. When pulses are added to the clock pulses fc, tx increases; when pulses are subtracted, tx decreases. When no pulses are added or subtracted, tx stays constant.

The memory can be read out nondestructively. To read in new information, the slave counter must be reset when the master counter is zero.

Only unidirectional counters are necessary. These may be available shortly in a single integrated-circuit package. One master counter, with appropriate buffers at its output, can supply many slave counters with the reference square-wave signal.
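The counter-pair storage scheme can be sketched behaviorally (names invented for illustration; this is a model, not the hardware):

```python
class PulseTimeMemory:
    # Two free-running counters on the same clock; the stored value is the
    # count difference, read out as the phase shift t_x between the square
    # waves from their most significant stages.
    def __init__(self, bits=10):
        self.modulus = 1 << bits
        self.master = 0
        self.slave = 0

    def clock(self, extra_slave_pulses=0):
        # one clock period; pulses inserted into the slave input add to t_x
        self.master = (self.master + 1) % self.modulus
        self.slave = (self.slave + 1 + extra_slave_pulses) % self.modulus

    def stored(self):
        # count difference, proportional to the pulse-width output t_x
        return (self.slave - self.master) % self.modulus

mem = PulseTimeMemory()
for _ in range(5):
    mem.clock(extra_slave_pulses=1)   # add 5 pulses: t_x grows by 5 counts
for _ in range(1000):
    mem.clock()                       # no pulses added: the value holds
```

The read-out is nondestructive because neither counter is disturbed by observing the phase shift.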
CONTROL UNIT
The control unit in Fig. 4, which consists of the
timing generator, the program generator and the
input/output switches, regulates the flow of information to and from the arithmetic unit and the
memory by energizing voltage switches and digital
logic circuits.
The timing generator shown produces, in sequence, 24 timing signals of equal length by gating
the outputs of a 4-bit high-speed ring counter
with outputs of a 6-bit low-speed ring counter in
24 buffer NOR circuits.
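The 24-interval sequencing can be sketched as the cross product of the two ring counters (an illustrative model of the gating, not the circuit):

```python
def timing_signals(cycles=1):
    # A 4-stage high-speed ring counter gated with a 6-stage low-speed ring
    # counter yields 24 non-overlapping timing signals T0..T23 in sequence;
    # the slow ring advances once per full fast-ring cycle.
    seq = []
    for _ in range(cycles):
        for slow in range(6):
            for fast in range(4):
                seq.append(slow * 4 + fast)
    return seq

sig = timing_signals()
```

Each of the 24 gate outputs is asserted exactly once per full cycle, in order.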
Variable program storage, as in general-purpose machines,6,7 provides flexibility but requires write-read memories (cores, drum, etc.). Special-purpose machines, using fixed or wired program storage, require simple read-only memories (rope cores, diode matrices, etc.).
In the fixed-storage program generator of SADC, the program memory and the digital switching are merged into one logic network which combines timing control signals with pulse-time input signals and generates n analog switch control lines and m pulse-time output signals. The logic circuitry of the program generator can be defined precisely for any particular application with a specific set of Boolean equations.
The switching of analog voltages is still a problem at present when size and cost must be considered, since low-impedance integrated metallic oxide semiconductor (MOS) switches9 are not yet available, integrated photo-electric switches10 are too expensive ($100), and transformer-coupled transistor switches11 are too bulky. The best compromise is the direct-coupled transistor switch in Fig. 4, capable of switching signals with ±5 V excursions but requiring a base drive from -6 V to +12 V, a low source impedance and a high load impedance. The voltage across the saturated transistor, VCE, is dependent on the signal voltage, the base current and the load current. In the proposed circuit, VCE is maintained within ±2 mv. Special driver amplifiers, presently built with one transistor, are required to provide the large-swing base drive signal. Zener diodes are used to shift the level from zero to -6 V.
ARITHMETIC OPERATIONS
A change in the interconnection of the computing elements permits the execution of various arithmetic operations. The program specifies in each operation interval Ti what arithmetic operations are to
Figure 4. A control unit for 24 sequential operations.
be performed and how the various computing elements must be interconnected by energizing the appropriate analog and digital switches. The arithmetic unit in Fig. 2 can perform two multiplications, two divisions, two additions, two SET or two READ operations simultaneously.
In all arithmetic operations, a voltage or voltage-time function is integrated for a controlled period of time. The integrator output voltage Vo at the end of the integration period, or the time required to make Vo equal to some specific potential, is the desired result.
The SET operation establishes the initial conditions prior to some arithmetic operations by integrating a certain analog voltage during Ti or by integrating a reference voltage VR during a certain pulse time.
The READ operation converts the output voltage
Vo into a pulse time after certain arithmetic operations by integrating a reference voltage with appropriate sign until Vo is zero.
Often, however, the voltage outputs of one arithmetic operation are the inputs or the initial conditions for the next operation. Most SET and READ
operations are thus eliminated.
Information is transferred from one computing
element to another by having pulse-time signals,
generated by a READ operation on the transmitting
element, control a SET operation on the receiving
element. When the TRANSFER operation is applied twice between two elements, errors due to
variations in the time constants and the reference
voltage cancel.
Each arithmetic or auxiliary operation is completed in one operation interval Ti. All intervals are of equal length and only one occurs at any one time.
Conversion of Signals
The arithmetic unit may be used to convert analog signals from one form into another form. Any
conversion can be performed before or in between
the arithmetic operations of a computer program.
The conversion from pulse time to d-c voltage
requires only a SET operation.
The conversion from d-c voltage to pulse time, which is usually referred to as pulse-width modulation,12 requires a SET operation followed by a READ operation.
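The SET-then-READ conversion can be modeled with a discrete-time integrator. The sketch below assumes a positive input, unity integrator gain, and an idealized comparator (all assumptions for illustration):

```python
def dc_to_pulse_time(v_in, v_ref, t_set, dt=1e-6):
    # SET: integrate v_in for a fixed interval t_set (integrator gain 1 assumed)
    v_o, t = 0.0, 0.0
    while t < t_set:
        v_o += v_in * dt
        t += dt
    # READ: integrate the reference of opposite sign until v_o returns to zero;
    # the elapsed time is the pulse-time output, proportional to v_in
    t_pulse = 0.0
    while v_o > 0.0:
        v_o -= v_ref * dt
        t_pulse += dt
    return t_pulse

tx = dc_to_pulse_time(v_in=2.0, v_ref=5.0, t_set=0.01)
```

The resulting pulse time is approximately v_in · t_set / v_ref, so the output scales linearly with the d-c input as pulse-width modulation requires.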
Conversions with a-c voltage inputs require that the computer timing be synchronized to the a-c signal.

The conversion from a-c voltage to d-c voltage is performed by integrating one-half cycle of an a-c signal. The integrator output voltage at the end of a half-cycle is proportional to the amplitude of the a-c voltage.

The conversion from a-c voltage to pulse time is performed similarly to a d-c to pulse-time conversion.

The conversion from three-wire synchro signals to pulse time13 is accomplished by converting into two-wire signals with a Scott T transformer and by performing a-c to d-c conversion on each of these two signals. At the end of the integration interval, the integrator voltages are

V01 = kE0 sin A
V02 = kE0 cos A

and present the components of a vector R. A coordinate transformation operation is performed in the next half-cycle by rotating R until V02 is zero. The time required for this rotation is proportional to the angle presented by the synchro signals, but independent of the reference voltage amplitude E0 and the synchro transformation ratio k.

Addition-Subtraction

Any addition can be performed serially or in parallel; subtraction is performed by negative addition.

Parallel addition, as shown in Fig. 5a, is carried out by summing currents,1,2 proportional to the variables in the sum, at the input of the integrator or the inverter. Since this requires additional precision resistors, parallel addition is used only when sequential addition is too time consuming.
I
1965
R
t
G;l~.
I~
Vn~~
Vo
VO =Vl+V 2 +---+V n
IT.
I
Figure 5a.
Parallel addition with inverter or integrator.
Sequential addition, as shown in Fig. 5b, is carried out by integrating each of the voltages representing a variable for one operation interval, separately and sequentially in time, and without resetting the integrator. Sequential addition can also be performed with a memory element.

Figure 5b. Sequential addition with integrator.
[Output relation: Vo = -(1/RC) [∫(0 to T1) V1 dt + ∫(T1 to T2) V2 dt + ... + ∫(Tn-1 to Tn) Vn dt] = -(T/RC) [V1 + V2 + ... + Vn].]

921    SEQUENTIAL ANALOG-DIGITAL COMPUTER

Integration

Integration, as illustrated in Fig. 6, is approximated by sequential addition in a modified memory element. The pulse-time memory element has a capability of summing and storing information and,
thus, has a capability of approximating integration.
Rectangular or trapezoidal integration7 is performed by summing properly scaled fractions or
multiples of the integrand in the memory element
once or several times during each computing interval.
To provide adequate input and output resolution, a 10-bit delta counter must be added to the memory element; the clock is connected to it while tx (representing an integrand X) is present. The delta counter
fills up and overflows at a rate Rx which is proportional to tx. Each time the delta counter overflows
a fixed number of pulses NT is added to the clock
pulses for the slave counter. The time constant of
the integrator is inversely proportional to NT.
The computing elements in the arithmetic unit perform the required scaling or multiplying and the conversion of the integrand into a pulse-time signal.
Integrations in the SADC are subjected to all
limitations of numerical integration, but have no
other intrinsic limitations.
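The rectangular and trapezoidal rules that this summation approximates can be sketched directly; the test function and step count below are arbitrary choices, not values from the SADC.

```python
import math

def rectangular(f, t0, t1, n):
    # add a scaled sample of the integrand once per computing interval
    h = (t1 - t0) / n
    return h * sum(f(t0 + i * h) for i in range(n))

def trapezoidal(f, t0, t1, n):
    # average the endpoint samples of each interval before adding
    h = (t1 - t0) / n
    return h * (0.5 * (f(t0) + f(t1)) + sum(f(t0 + i * h) for i in range(1, n)))

exact = 1.0 - math.cos(1.0)            # integral of sin t over [0, 1]
r = rectangular(math.sin, 0.0, 1.0, 1000)
t = trapezoidal(math.sin, 0.0, 1.0, 1000)
print(abs(r - exact), abs(t - exact))  # trapezoidal is markedly closer
```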
Figure 6. Digital pulse-time integrator.
[The 10-bit delta counter, gated by tx, overflows at a rate Rx proportional to tx and adds pulses to the clock input of the 20-bit slave counter in the memory element.]
Multiplication
Multiplication, as shown in Fig. 7, is performed by integrating a voltage Vx for a time ty. The voltage across the integrator at the end of ty is proportional to the product XY, when Vx is constant during ty, since

Vo = (1/RC) ∫(0 to ty) Vx dt = Vx ty / RC
One variable must be a d-c voltage, the other a pulse-time signal. If both are d-c voltages, one must be converted into pulse-time form; if both are pulse-time signals, one must be converted into a d-c voltage.

The pulse-time signal ty operates the switch which connects the voltage Vx to the integrator. Both signals may be bipolar.
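The gated integration can be sketched numerically. The time constant, scaling, and step count below are assumptions chosen only to make the relation Vo = Vx·ty/RC visible.

```python
RC = 1e-3   # integrator time constant, s (assumed)

def multiply(v_x, t_y, t_total=1e-3, steps=100000):
    # Euler model of the gated integrator: v_x is connected only while
    # the pulse-time signal t_y is high; afterwards the output holds.
    dt = t_total / steps
    v_o = 0.0
    for i in range(steps):
        if i * dt < t_y:           # analog switch closed during t_y
            v_o += v_x * dt / RC
    return v_o

v = multiply(4.0, 0.5e-3)          # X = 4 V, Y = half the interval
print(v)                           # close to 4.0 * 0.5e-3 / RC = 2.0
```

Because the model is linear, a negative Vx or a complemented ty would give the bipolar behavior the text mentions.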
Figure 7. Multiplication waveforms.
[The pulse-width input ty closes the switch for part of the interval T; the integrator output Vo ramps during ty to a final value proportional to X·Y.]
Division

Division, as illustrated in Fig. 8, is performed by integrating Vu until the output of the integrator reduces from Vw to zero.14

The voltage Vu represents the divisor; Vw represents the dividend, which must be set into the integrator before the division operation starts. The integrator output voltage decreases with a constant slope as:

Vo(t) = Vw - (1/RC) Vu t

when Vo(t) is zero,

tQ = Vw / (k Vu), where k = 1/RC.

Vu is connected to the integrator by an analog switch which is controlled by the pulse-time signal tQ, which starts at the beginning of the operation interval and ends when the integrator output voltage reaches zero. Vw can be either a positive or a negative potential. Vu must have the opposite polarity of Vw.

Figure 8. Division waveforms.
[The integrator output Vo falls from Vw to zero; the width of the output pulse tQ is proportional to W/U.]

Arbitrary Function Generation

In the SADC, arbitrary functions are approximated with linear segments.15 A staircase waveform approximating f'(x), which is the derivative of the desired function, is generated by connecting the reference voltage sequentially to the set of scaled resistors at the input of the inverter. This is illustrated in Fig. 9.
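The division timing described earlier can be sketched the same way, with an assumed RC and an ideal comparator.

```python
RC = 1e-3   # integrator time constant, s (assumed)

def divide(v_w, v_u, dt=1e-8):
    # v_w (dividend) is set into the integrator; v_u (divisor, with
    # opposite polarity) is integrated until the output crosses zero.
    v_o, t_q = v_w, 0.0
    while v_o > 0.0:
        v_o -= v_u * dt / RC
        t_q += dt
    return t_q                      # ideally RC * v_w / v_u

t_q = divide(3.0, 6.0)
print(t_q)                          # about 0.5 ms, i.e. W/U at a 1 ms scale
```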
The number of segments n required depends on the accuracy desired and on how fast f(x) is changing. The length of each segment ti is Ti/n. Preferably, ti should be made a binary fraction of Ti, for ease of generating these timing intervals. The segment timing intervals ti are generated like the operation intervals in Fig. 4, with ring counters and gates.
The staircase waveform is integrated to produce the linear-segment curve f(t). The time of integration is determined by the pulse-time signal tx. At the end of tx, the integrator output voltage V01(tx) is proportional to f(tx) - f(0).
One set of precision resistors and one inverter
are needed for each function to be generated. The
accuracy of the function generator depends on the
number of segments used, the function to be generated, and the precision of the components.
Figure 9. Linear-segment function generator and waveforms.
[Analog switches connect the reference to the scaled resistors R31 through R3n at the inverter input during the segment intervals t1 ... tn of Ti; the integrated staircase yields V01(tx) = f(tx) - f(0).]
The generation of inverse functions can be accomplished by setting the integrator to Vx and then integrating a voltage function g'(t) until V01 is zero. The time tx required to reduce V01 from Vx to zero is the desired inverse function, since:

V01(tx) = Vx - ∫(0 to tx) g'(t) dt = Vx - g(tx) + g(0)

or, if g(0) = 0 and V01(tx) = 0, then g(tx) = Vx.

Coordinate Rotations/Transformations and Trigonometric Function Generation

The electronic analog resolver13 rotates and transforms coordinates and generates trigonometric functions by controlling the initial conditions and the operating time of a harmonic oscillator (Fig. 10).

Two integrators and one inverter are connected into a loop as a harmonic oscillator to solve the differential equation Ẍ = -kX. The outputs of the integrators, V01(t) and V02(t), are the solutions to the differential equation and represent the components of the imaginary vector R.

Figure 10. The controlled harmonic oscillator and waveforms.
[V01 = C cos ωt, V02 = C sin ωt; (a) the vector is rotated from -30° to +60°; (b) the vector angle is found to be 135°.]
The vector R "rotates" with constant velocity when the harmonic oscillator loop is closed and the integrator outputs change in a sinusoidal fashion. The time during which the loop is closed is directly proportional to the angle through which R is rotated, since A = kωt.

Coordinate rotation can be performed by rotating R from its initial components Vx, Vy for a time tA which is proportional to the desired angle of rotation. The integrator voltages at the end of tA represent the desired outputs.

Coordinate transformation can be performed by rotating R from its initial components Vx, Vy until V01(t) becomes zero. The time required for V01(t) to decrease from Vy to zero represents the angle of the vector, and the value of V02 at the time t = tA' represents its magnitude.

The initial components Vx, Vy are SET into the integrators prior to the rotation or transformation operation.
Figure 11a. Flow diagram with arithmetic unit in Fig. 2.

Figure 11b. Flow diagram with arithmetic unit having three integrator-comparator combinations.
[Flow diagram legend: one symbol denotes an integrator and comparator, the other a memory element; the operation intervals T1 through T9 operate on DN, DE and DD to produce Dx, Dy and Dz.]

Modification of the basic rotation and the transformation equations permits the generation of sine, cosine, arcsine, arccosine, and other trigonometric
functions. Similarly, the solutions to the differential
equations Ẍ = +kX and Ẍ = -kX can be exploited
to generate exponential, logarithmic and hyperbolic
equations.
APPLICATION
The application of the SADC to the coordinate
conversion problem demonstrates the elegance of
this method of computation and emphasizes most of
the advantages.
Conversion from earth coordinates to ship coordinates, as frequently used in navigation, ground
control of missiles, fire control, etc., is defined by
the matrix equation:
DX]
D =
[ Dz
y
[1
0] [coso
0 cosfj}0 sin(/)
0 01-Sin(/)]
0
O-sinfj} cos(/)
sinO 0 cosO
which can also be written as:
Dx = (DN cosA
+ DE sinA) cosO -
Dy = (DE cosA - DN sinA) cos(/)
[(DN cosA + DE sinA) sinO
DD sinO
+
+ DD cosO] sin(/)
Dz = (DN sinA - DE cosA) sin(/) +
[(DN cosA + DE sinA) sinO + DD cosO] cos0
where D x, D y, Dz are ship coordinates; DN, DE, DD
are earth coordinates; and A, 0, (/) are azimuth, pitch
and roll angle, respectively.
The programming of the SADC is best illustrated
with the flow diagrams in Fig. 11, in which each column of squares represents the arithmetic unit and the
required memory elements in one operation interval.
Inputs to and outputs from a computing element and
its function are explicitly indicated: S = SET initial condition, O = READ OUT, R = ROTATION, H = HOLD. A resolver operation is depicted by
interconnecting two squares.
With the coordinate inputs DN, DE and DD in a-c or d-c voltage form and the angles A, θ and Φ in pulse-time form, the coordinate conversion problem
can be solved in two ways.
Solution 1: With the arithmetic unit in Fig. 2 and one memory element, the problem can be solved in nine operation intervals Ti.
Solution 2: With an arithmetic unit consisting
of three integrator-comparator combinations,
the problem can be solved in five operation
intervals Ti, and without the use of any memory element.
Coordinate Converter Circuit
A SADC solving the coordinate conversion problem according to Solution 2 consists, as shown in Fig. 12, of the arithmetic unit with three integrator-comparator combinations, the 15 analog voltage switches, the timing generator, and the control logic.
It is assumed that the coordinate inputs DN, DE
and DD are d-c voltages and the angle inputs are in
pulse-time form. The output signals D x, D y and Dz
appear in pulse-time form.
The required number of analog voltage switches
and digital logic circuits can be derived most easily
from a list of input signals which must be connected to the integrators and the inverter in the arithmetic unit during each of the five time periods Ti:
Timing Period   Integrator I   Integrator II   Integrator III   Inverter
T1              DN             DE              DD               -
T2              V04            V01             V01              -
T3              V04            V04             V02              V02
T4              VR             VR              VR               V03
T5              -              -               -                V03
The circuit in Fig. 12 can be built with approximately 25 digital integrated circuits (flip-flops
and NORs) and 25 linear integrated circuits (amplifiers and analog voltage switches). It would require less than ten watts of power, less than 100
cubic inches in volume, and weigh less than five
pounds.
To perform the same problem with a conventional analog or a conventional digital computer would require a circuit complexity at least one order of magnitude higher.
Figure 12. Earth-to-ship coordinate converter.
[Fifteen analog voltage switches, S1 through S15, connect the d-c inputs DN, DE, DD, the reference VR, and the outputs V01-V04 to the three integrator-comparator combinations and the inverter; the control unit sequences the switches through the timing periods T1-T5 and delivers the pulse-time output signals Pt1, Pt2 and Pt3 from the arithmetic unit.]

PERFORMANCE

The performance of any analog computing element is always a function of the quality of the components used and a compromise between static accuracy and speed of operation.
Accuracy and speed of the SADC are solely determined by the quality of the analog computing elements, whereas resolution and dynamic range are largely functions of the number of stages in the memory counters and the length of the operation interval.
Accuracy is also dependent on the variations in
temperature and power supplies. With higher quality components, these effects can be made smaller.
However, for the few, small components involved,
it is cheaper to control the environment of the arithmetic unit.
The performance data given below refer to an
arithmetic unit with the following components, timing and environment:
Amplifiers:          Fairchild μA702, adjusted for zero offset
Analog Switches:     Direct-coupled 2N2432
Resistors:           ±0.05% of nominal value
Capacitors:          Trimmed to ±0.05% of nominal value
Temperature:         25°C ± 2°C
Operation Interval:  1 ms
Power Supplies:      -6V ± 0.1%, +5V ± 10%, +12V ± 1%
Static accuracy for addition, multiplication, division,* integration, SET, READ and TRANSFER operations is ±0.1% of full scale, and ±0.2% for coordinate rotation, coordinate transformation,* sine-cosine generation, etc.
With a 1-ms operation interval, 1,000 sequential operations can be performed in 1 second.
CONCLUSION
The ability to compute in analog form and to store in digital form provides the SADC with the unique capability of operating as a sequential computer which accepts analog signals as inputs, provides analog signals as outputs, and operates with analog computer accuracy and sequential digital computer speed.
Time-sharing the simple arithmetic unit and
requiring no interface circuits make the SADC the
most economical method of computation in its
speed-accuracy domain.
Due to the small number of components used and
due to the fact that almost all of these components
are integrated circuits, SADC has an inherently
high reliability. In addition, redundancy techniques
can be applied to SADC just as to a digital computer.
The performance figures given have been obtained with relatively low-performance integrated
analog circuits. With better components, such as
chopper-stabilized amplifiers, higher precision will
be possible. However, in achieving and maintaining this higher accuracy, there may be a problem in finding accurate and stable capacitors.
To date, only parts of the SADC have been built and tested, and considerably more work is needed, both at the circuit and system levels.

*Only when inputs are larger than 50 percent of full-scale.
With its inherently high reliability, minute size,
low power consumption and minimum cost, the
SADC should be well suited for all those military
and industrial control applications where the computer inputs and outputs must be in analog form.
REFERENCES
1. G. A. Korn, Electronic Analog Computers,
McGraw Hill, New York, 1956.
2. S. Fifer, Analog Computation, Volumes 1 to
4, McGraw Hill, 1961.
3. R. Lee and F. Cox, "A High-Speed Analog-Digital Computer for Simulation," IRE Transactions on Electronic Computers, June 1959, pp.
186-196.
4. A. Herzog, "Pulsed Analog Computer for
Simulation of Aircraft," Proc. IRE, May 1959,
pp. 847-851.
5. E. V. Bohn, "A Pulse Position Modulation Analog Computer," IRE Transactions on Electronic Computers, June 1960, pp. 256-261.
6. M. Phister, Logical Design of Digital Computers, John Wiley and Sons, 1959.
7. R. S. Ledley, Digital Computer and Control
Engineering, McGraw Hill, 1960.
8. W. R. Seegmiller and E. C. Underkoffler,
"Static Sync Drive Development," General Electric
Company TIS Report R61APS47 (Dec. 1961).
9. H. Ruegg, "An Integrated FET Analog
Switch," Solid-State Conference, Philadelphia, Pa.
(Feb. 1964).
10. No author, "Molecular Opto-Electronic Multiplex/Chopper," Texas Instruments, Product News
Release (March 23, 1964).
11. H. Schmid, "Four-Quadrant All-Electronic Pulse-Time Multiplier," General Electric Company TIS Report 62APJ43, pp. 11-13.
12. H. Schmid and B. Grindle, "Inexpensive Pulse-Width Modulation," Electronics, pp. 29-31, Oct. 11, 1963.
13. H. Schmid, "Electronic Analog Resolver," to
be published in Electronics.
14. H. Schmid, "Repetitive Analog Computing
Technique," Aerospace Conference, Phoenix, Arizona, April 1964.
15. H. Schmid, "Linear Segment Function Generator," IRE Transactions on Electronic Computers, Dec. 1962, pp. 780-788.
DESIGN OF A HIGH SPEED DDA
Mark W. Goldman
Guidance and Control Department
Martin Company
Baltimore, Maryland
INTRODUCTION

The objective of the company-funded task which supported this work was to develop techniques for high-speed solutions to differential equations, particularly those which are common in aerospace problems. For example, the solution requirements for reentry guidance are very time-limited and must be processed at the highest priority level. To solve this type of problem, depending on the accuracy required, the number of iterations can get unreasonably large and require an inordinate amount of computer "power." Therefore, the solution time requirements led us to investigate means other than the general purpose computer to solve these time-critical differential equations. The nature of the problem and the aerospace requirements of long term drift stability and accuracy led us to choose the digital differential analyzer (DDA) as one of the candidates for investigation. This paper, then, is concerned with the new techniques in DDA design which were developed in order to meet the solution time objectives.

Our work on DDA's centered about the Chapman predictive reentry guidance equation. This equation was chosen as a vehicle for study, first, because it is representative of a class of aerospace second order nonlinear differential equations and secondly, because the solution time of predictive reentry equations, particularly near impact, is very critical. One of our earlier DDA designs required approximately 200 seconds to solve the entire Chapman equation from entry to terminal; the new design can solve the same problem in 0.56 seconds, using less hardware and at the same accuracy. These new techniques are:

1. Shared integrators. A method of combining several (5 in one case) integrators which contain the same variable, and sharing the Y register and adder network with a multiple bank of R registers.

2. Automatic rescaling. Instead of scaling the DDA to handle the worst case range of variables, it is scaled to handle a nominal range. When a variable exceeds the range, certain critical integrators are rescaled to modify the increment size of the variable. Since the iteration time of a DDA is dependent on the variable size, this technique also allows the DDA to run at the optimum speed in each region of its solution.

3. Asynchronous timing. No fixed clock is used in the design; rather, as each iteration is completed the next one is started. For
example, in the Chapman design the longest critical string of integrators is nine, but on the average only four will overflow in any one iteration cycle. In the asynchronous mode the next iteration cycle is started when the longest critical path has been completed.
In addition, advanced logic techniques are used throughout. For example, carry look-ahead adders are used: each iteration cycle time is dependent on how many integrators overflow, which in turn is dependent on the time that it takes the integrator adder to detect this overflow and complete the addition. The carry look-ahead adder can add two operands in as little as five gate delays, completely independent of carry propagation. The gate delays for the microelectronics we are using result in an add time of
50 to 90 nsec.
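The carry look-ahead idea can be sketched in software (a behavioral model, not the paper's circuit): form generate and propagate signals, evaluate the carry recurrence c[i+1] = g[i] OR (p[i] AND c[i]), then sum with a single XOR layer. In hardware the recurrence is flattened into two-level logic, so no carry ripples through the sum stages.

```python
def cla_add(a, b, width=16):
    # carry look-ahead sketch: g = generate, p = propagate
    mask = (1 << width) - 1
    g, p = a & b, a ^ b
    carries, c = 0, 0
    for i in range(width):
        # c[i+1] = g[i] | (p[i] & c[i]); unrolled here, parallel in hardware
        c = (g >> i & 1) | ((p >> i & 1) & c)
        carries |= c << (i + 1)
    return (p ^ carries) & mask      # sum bit = p[i] XOR carry-in[i]

print(hex(cla_add(0x1234, 0x0FCD)))   # 0x2201
```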
In order to demonstrate the feasibility of the new
features it was not necessary to build the complete
DDA to solve the Chapman equation. To do so
would have risked hiding the principles of the new
features in the complexity of the Chapman equation. Instead, each of the new features is amply
demonstrated in a DDA designed to solve the simple equation xy″ + ~y′ + y = 0. This simpler
equation also has the advantage of having a closed
form solution for ease of demonstration. This computer is capable of approximately two million iterations per second.
Therefore, the design goal of the project was to
study the systems and logical organization of
DDA's and to develop a machine with a significant
speed increase over contemporary designs. Since
speed was the target, economics of the design was
deemphasized. That is, where cost and speed were
in conflict, the faster but costlier technique won. In
some cases, the speed return may not have been
worth the added cost, but until a specific application is to be designed, no yardstick is available to
measure the economic worth of one design against
another. When weight and size become important,
as they often do in aerospace equipment, then the
yardstick is available to tailor down the design to
meet the constraints.
NEW TECHNIQUES

During the system's study, several improvements were made over contemporary DDA's. Two of these, shared integrators and automatic rescaling, are an extension to the theory, while the third, asynchronous timing, is an improvement in the construction.

Shared Integrators

When two or more integrators store the same variable, they can be combined into one integrator with a corresponding saving in hardware. An example of the shared technique is shown in the following sketches. Sketch (a) shows the conventional DDA hookup to obtain reciprocals, in this case 1/u. Both integrators store the variable 1/u (in the Y register). Combining the two integrators, sketch (b) provides the shared integrator which also yields 1/u. The equivalence between sketches (a) and (b) is illustrated by the flow of pulses through each figure. A du pulse travels along path A. The du pulse generates a du/u output pulse which travels along path B. In turn, the du/u pulse generates a d(1/u) output pulse which travels along path C.

[Sketch (a): the conventional two-integrator hookup; sketch (b): the shared integrator with the d(1/u) output.]
A more detailed comparison is illustrated below
by a block diagram of each integrator. The Y registers in Integrators 1 and 2 contain the same numbers, but not the R registers. Therefore, the shared
integrator in sketch (b) has two R registers. Register
R1 is identical with the R register in Integrator 1, and R2 is identical with the R register in Integrator 2.
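The reciprocal hookup can be simulated with a minimal software model of a DDA integrator. The continuous-valued registers and the increment size q are simplifications of the binary hardware; both Y registers receive identical updates, which is exactly why they can be shared.

```python
class Integrator:
    # minimal DDA integrator: the Y register accumulates dy inputs; on
    # each dx input the R register accumulates Y and emits an overflow pulse
    def __init__(self, y0, q):
        self.y, self.r, self.q = y0, 0.0, q

    def step(self, dx):
        self.r += self.y * dx * self.q
        if self.r >= self.q:
            self.r -= self.q
            return 1
        if self.r <= -self.q:
            self.r += self.q
            return -1
        return 0

q = 1e-4
u = 1.0
i1 = Integrator(1.0, q)     # Y register holds 1/u
i2 = Integrator(1.0, q)     # Y register holds 1/u (shared in hardware)
for _ in range(10000):      # drive u from 1.0 to 2.0 in du steps
    u += q
    p = i1.step(1)          # du pulses -> du/u pulses (path B)
    d_recip = i2.step(p)    # du/u pulses -> d(1/u) pulses (path C)
    i1.y -= d_recip * q     # feed d(1/u) back into both Y registers
    i2.y -= d_recip * q

print(i1.y, 1.0 / u)        # both near 0.5
```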
[Block diagrams: Integrator 1 and Integrator 2 of sketch (a), and the shared integrator of sketch (b) with R registers R1 and R2.]

The shared integrator above requires 1 adder, 1 Y register and 2 R registers, where the conventional DDA hookup requires 2 adders, 2 Y registers and 2 R registers.

Automatic Rescaling

A major disadvantage of a conventional DDA is that it uses fixed point arithmetic. This is exemplified by the fact that the scaling is based upon the maximum value that each variable can assume. If some of the variables vary over a large range, then an extremely small independent increment may be necessary to maintain accuracy. As the size of the independent increment decreases, the number of iterations increases. A GP computer could solve the same problem with the same accuracy using a larger independent increment if it employs floating point arithmetic. If the GP computer were restricted to fixed point arithmetic, then it would have fundamentally the same scaling problems encountered by the DDA. To overcome the scaling problems of fixed point arithmetic, we have developed several techniques.

The first technique, referred to as multiple scales, is a compromise between fixed point and floating point arithmetic. The procedure is to divide the complete range of the variables and scale each subdivision separately. The individual scales are combined into one computer. The DDA computes with the scale corresponding to the subdivision which the variable size happens to be in at the moment. When the variable size changes to a different subdivision, the DDA automatically switches to the corresponding scale. Each scale uses fixed point arithmetic, but the switching from scale to scale as the variable size changes simulates a floating point method.

The following is a decimal example demonstrating how the DDA switches from one scale to another. The problem is to compute the reciprocal of u where the desired range of u is

0.02 ≤ u ≤ 1,000.

Therefore, the complete range of 1/u is

0.001 ≤ 1/u ≤ 50.

Consider dividing the range of 1/u into two parts:

0.001 ≤ 1/u < 1
1 ≤ 1/u ≤ 50.
When 1/u is in the range from 0.001 to 1 we use Scale A. When 1/u is in the range from 1 to 50 we use Scale B. Suppose we decide that we want to carry 5 decimal places; then, from DDA scaling theory, in Scale A each du, du/u and d(1/u) pulse equals 10⁻⁵, as shown below.

Then, from the normal DDA scaling techniques,3 both Y registers are 5 digits, and the 10⁰ inside each triangle means the decimal points are at the extreme left, as shown below.

[Scale A Y register: digit weights 10⁻¹ 10⁻² 10⁻³ 10⁻⁴ 10⁻⁵.]

For Scale B, since we wish to maintain the same number of decimal places as before, each du pulse must equal 10⁻⁷, du/u becomes 10⁻⁵, and d(1/u) becomes 10⁻³, as shown below. As in Scale A, both Y registers are 5 digits long. The 10² inside each triangle means the decimal points are two digits to the right.

[Scale B Y register: digit weights 10¹ 10⁰ 10⁻¹ 10⁻² 10⁻³.]

Incidentally, the values of the scaling constants of Scale B are those that would be used for the complete range in a conventional DDA.

An example will show how the DDA automatically switches from one scale to the other. If 1/u = 50 at the start of the computation, then Scale B is used and the Y register reads

[5 0 . 0 0 0]

As the computation proceeds, the number in the Y registers decreases. The number will eventually reach the value

[0 0 . 9 9 9]

At this point 1/u < 1 and hence in Scale A. The change to Scale A is sensed by detecting the zeros in the two leftmost positions. The scale switching is accomplished by shifting the number two places to the left (and the imaginary decimal point also), so it reads

[. 9 9 9 0 0]

and the R register is reset at 0.5 (its median value) and the DDA is started again.

If the computer happens to be in Scale A and the number in the Y register is increasing, then the Y register will eventually read

[. 9 9 9 9 9]

The next increase (by 1 x 10⁻⁵) will cause the Y register to overflow. The new value of Y would be 1, which is in Scale B. The overflow triggers the scale switch from A to B, which sets the Y register at

[0 1 . 0 0 0]

and resets the R register at 0.5. The number of iterations required to compute the reciprocal of u from u = 0.02 to u = 1,000 using Scale B for the complete range is

10⁷ (1000 - 0.02) ≈ 1.0 x 10¹⁰.
Using multiple Scales A and B, the same accuracy
is obtained and the number of iterations is
10⁷ (1 - 0.02) + 10⁵ (1000 - 1) ≈ 1.1 x 10⁸.
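The savings can be confirmed with a line of arithmetic; the numbers are those of the decimal example in the text.

```python
# Iteration counts for computing 1/u over 0.02 <= u <= 1,000
# with the indicated pulse weights.
single_scale = 1e7 * (1000 - 0.02)                 # one scale throughout
multi_scale = 1e7 * (1 - 0.02) + 1e5 * (1000 - 1)  # two scales
print(single_scale, multi_scale, single_scale / multi_scale)
```

The multiple-scale arrangement needs roughly ninety times fewer iterations at the same resolution.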
Another technique being considered is to actually
use floating point arithmetic. We are presently exploring several methods for mechanizing a DDA to
operate with floating point numbers. Some of the
ideas are as follows. The decimal (or binary) point
of the Y register is no longer predetermined, and
special registers are needed to specify the position
of the decimal point. Special control logic will solve
the scaling equations for each iteration and perform
the scaling as the DDA is computing. This technique shows a great amount of promise.
Asynchronous Timing
Asynchronous timing is a method of decreasing
the iteration time. In conventional DDA's the iteration time is constant because a fixed clock is used
(synchronous timing). In contrast, asynchronous
timing means the next iteration is started as soon as
the last one is finished. The DDA hookup for obtaining the reciprocal of u will again be used to explain the principle. Suppose the time required to
process any input pulse is T. At the start of each
iteration a du pulse triggers Integrator 1 (see following sketch). If the R register does not overflow,
then no output pulse occurs and the iteration is finished in time T. If an output pulse does happen to
occur then Integrator 2 is triggered. If the R register of Integrator 2 does not overflow, then no output pulse from it occurs and the iteration is finished in 2T. If an output pulse from Integrator 2
does occur, then the lower inputs to both integrators are triggered, and the iteration requires 3T.
[Sketch: the reciprocal hookup, with the du input driving Integrator 1 and its output pulses driving Integrator 2.]
A summary of the three possibilities and the time
necessary to perform the calculations in each case is
tabulated here.
Integrator 1       Integrator 2       Iteration Time
Output Pulse?      Output Pulse?      Required
No                 No                 1T
Yes                No                 2T
Yes                Yes                3T
For a fixed clock system the iteration time is set for
the longest pulse path which in this case is 3T. With
the asynchronous timing technique each integrator
has a control section which indicates whether the
integrator is calculating. The master control examines
each integrator, and it will generate the next du pulse each time both integrators become inactive. In this way the iteration time will be 1, 2 or 3T, depending on whether none, one, or both R registers overflowed. A good estimate of the probability that a particular R register will overflow is 1/2. Based on this estimate, the probability of no overflows is 1/2, of one overflow is 1/4, and of two overflows is 1/4. Therefore the average iteration time using asynchronous timing would be 1/2 x 1T + 1/4 x 2T + 1/4 x 3T = (1¾)T, which is an improvement over the 3T required for synchronous timing.
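The (1¾)T average follows directly from the probabilities, and a quick Monte Carlo run confirms it; the 1/2 overflow estimate is the text's assumption, and T = 1 is an arbitrary unit.

```python
import random

# Each R register overflows with probability 1/2; Integrator 2 is only
# exercised when Integrator 1 overflows.
random.seed(1)
T, n = 1.0, 100_000
total = 0.0
for _ in range(n):
    t = T                           # Integrator 1 always processes du
    if random.random() < 0.5:       # first R register overflows
        t += T                      # Integrator 2 processes the pulse
        if random.random() < 0.5:   # second overflow feeds back
            t += T
    total += t
print(total / n)                    # close to 1.75 T
```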
DDA DESIGN FOR CHAPMAN'S EQUATION
In the Introduction we stated our objective was
to design a DDA as part of a guidance computer
for a lifting re-entry vehicle. This work was based
on references 1, 2 and a Martin report, "An Automatic Predictive Guidance Method for Unmanned
Lifting Vehicles Re-entering with Circular Velocity" by V. Blaes. The report discusses the equations
to be solved by the guidance computer. The computations divide into two parts: subroutine FUTURE
and subroutine PREDICT. Figure 1 shows the entire control system which includes a block diagram
of both subroutines and their interconnection with
the vehicle dynamics. The advantages were in favor
of a GP (General Purpose) computer to solve subroutine FUTURE and a DDA to solve subroutine
PREDICT. Subroutine PREDICT requires an accurate, high-speed solution of a differential equation, a task best suited to a DDA.
The purpose of subroutine PREDICT is to calculate the footprint size based on the spacecraft
position. By varying the lift-drag ratio L/D and bank angle φ of the spacecraft, the landing spot can be
controlled. The footprint specifies the region of the
earth where the spacecraft could possibly land. The
idea is to control the spacecraft so the target is in
the middle of the footprint. The footprint size is
defined by three ranges: (Rx)max, (Rx) min, and
(Ry)max which are calculated from Eqs. (1-4).
Equation (1) is Chapman's equation - a second
order, nonlinear differential equation.
uZ″ = Z′ − (1 − u²)/(uZ) − √(βr)(L/D) cos φ    (1)

Equations (2), (3), and (4) are used to calculate the X and Y components of the range and the heading angle Ψ:

Rx = 30 ∫ from u_f to u_i (cos Ψ / Z) du, ft    (2)

Ry = 30 ∫ from u_f to u_i (sin Ψ / Z) du, ft    (3)

dΨ/du = (1/u)(L/D) sin φ    (4)

[Figure 1. Automatic predictive guidance method.]
DESIGN OF A HIGH SPEED DDA
It should be emphasized that Z_i, Z′_i, u_i are the values of Z, Z′, u corresponding to the spacecraft's position during the prediction. Suppose the spacecraft's L/D and φ are known for the entire reentry. If L/D and φ are substituted into Chapman's equation and Z_i, Z′_i, u_i are the initial conditions, then the solution of the range equations, together with Chapman's equation, predicts where the spacecraft should land. The maximum downrange (Rx)max is the value of Rx that would be achieved if the spacecraft maintained L/D = (L/D)max and φ = 0 during reentry. Similarly, the minimum downrange (Rx)min is the value of Rx for L/D = (L/D)min and φ = 0. The maximum crossrange (Ry)max is the value of Ry for L/D = (L/D)max and φ = 45°. As often as possible during the reentry a new prediction is made of the footprint size. Each prediction requires that Chapman's equation and the range equations be solved three times in order to obtain (Rx)max, (Rx)min, and (Ry)max.

The DDA design used to solve Chapman's equation and the range equations (subroutine PREDICT) is given in Fig. 2. The square boxes with du inside represent the independent variable increment which starts each iteration cycle. The triangles represent the integrators, where the inputs, outputs, and the variables stored in the Y registers are labeled. The arrows indicate the flow of pulses. The pentagons and divided circles are used to represent multiplexed inputs to the shared integrators. Triangles 1, 6, 9 and 10 are shared integrators. The function performed by shared Integrator 1 would normally require four integrators, one for each pentagon. Similarly, shared Integrator 6 is equivalent to four integrators, and Integrators 9 and 10 are equivalent to two integrators each. Consequently, by using shared integrators we have reduced the total number of integrators from 20 to 12.

[Figure 2. DDA for Chapman's equation.]

Integrators 1 through 6 are used to solve Chapman's equation. The outputs from Integrators 2, 3, 4, and 6 are summed together; the resulting sum, which composes input 3 of Integrator 1, is equal to uZ″ du:

uZ″ du = (Z′ − (1 − u²)/(uZ) − √(βr)(L/D) cos φ) du

Factoring out du results in the right side of Chapman's equation, Eq. (1).
The change in the heading angle dΨ is derived by Integrators 3, 1, and 8 in that order according to Eq. (4). The y component of the range is obtained by Integrators 6, 7, 9, and 11, where the solution Ry is in Integrator 11. The x component of the range is obtained by Integrators 6, 7, 10, and 12, where the solution Rx is in Integrator 12.
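The rectangular-rule integrator that all of these triangles represent can be sketched in a few lines. The register names Y and R follow the text; everything else (capacity, scaling) is an illustrative assumption, not the machine's actual logic:

```python
# Minimal sketch of a rectangular-rule DDA integrator: dy pulses update
# the Y register, each du pulse adds Y into the R register, and each
# overflow of R is one output pulse dz (i.e., dz ~ y du).
class Integrator:
    def __init__(self, capacity, y=0):
        self.capacity = capacity   # R register capacity (2^bits)
        self.y = y                 # Y register (scaled integrand)
        self.r = 0                 # R register (remainder)

    def step(self, dy=0):
        """Apply one du pulse (plus an optional dy pulse); return the
        number of output pulses, i.e., R register overflows."""
        self.y += dy
        self.r += self.y
        pulses, self.r = divmod(self.r, self.capacity)
        return pulses

# Integrate y = u over u in [0, 1): expect u^2/2 = 0.5 at u = 1.
N = 1024                           # pulses per unit of u
integ = Integrator(capacity=N)
z_pulses = 0
for k in range(N):
    z_pulses += integ.step(dy=1)   # Y tracks k, i.e. y ~ u*N
z = z_pulses / N                   # each output pulse has weight 1/N
print(z)                           # 0.5
```

The remainder carried in R is what makes the pulse stream integrate exactly in the long run: no fraction of Y is ever discarded, only deferred.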
The DDA for solving Chapman's equation could not be scaled to achieve the accuracy and solution time requirements without introducing the technique of automatic rescaling. For a skipping trajectory the value of Z may vary from 0.0001 to 10. In turn, the value of 1/Z in Integrator 6 will vary from 0.1 to 10,000. Because the DDA uses fixed point arithmetic, the large range of 1/Z requires an extremely small independent increment Δu. To obtain an accuracy of three decimal places, an increment size of 10⁻⁸ was necessary. Since u varies from 0 to 1, Δu = 10⁻⁸ is equivalent to 10⁸ iterations. The number of iterations is directly proportional to the solution time.

The desired accuracy and solution time were obtained by applying the technique of automatic rescaling. Instead of using one scale for the complete range of 1/Z, we divided it into four scales: A, B, C, and D. The table below shows the range of 1/Z, value of Δu, and number of iterations for each scale. The accuracy for each scale is approximately the same. The following sketch shows a typical plot of Z versus u for a skipping trajectory. From the sketch, Scales A and B are used during most of the integration, and Scales C and D are used less than 10 percent of the time. Therefore, the average Δu is about 10⁻⁶. The use of multiple scales has reduced the number of iterations, and consequently the solution time, by a factor of 100.
[Sketch: typical plot of Z versus u for a skipping trajectory.]
A test of the accuracy for different Δu sizes was made using a simulation program. Figure 3 shows the results of solving Chapman's equation with the same initial conditions but with different sizes of Δu. The three solid curves are the solutions for Δu = 10⁻⁴, Δu = 10⁻⁵, and Δu = 10⁻⁶. A Martin Company computer program for solving differential equations, entitled Unitrac, was used to compute an accurate solution, represented by the dotted curve, for comparison. The curve for Δu = 10⁻⁶ follows the dotted curve more closely than the curve for Δu = 10⁻⁵, demonstrating that accuracy increases as the independent increment size decreases.
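The trend in Fig. 3 is easy to reproduce in miniature. The sketch below uses a fixed-increment (Euler) integration of y″ = −y, a simple stand-in for Chapman's equation, and shows the maximum error against the exact solution shrinking as the increment does:

```python
import math

def max_error(dx):
    # Fixed-increment rectangular integration of y'' = -y with
    # y(0) = 1, y'(0) = 0, compared against the exact solution cos(x).
    y, v, x = 1.0, 0.0, 0.0
    err = 0.0
    while x < 2 * math.pi:
        y, v = y + v * dx, v - y * dx   # simultaneous (Euler) update
        x += dx
        err = max(err, abs(y - math.cos(x)))
    return err

errs = [max_error(dx) for dx in (1e-2, 1e-3, 1e-4)]
print(errs)   # each error smaller than the last
```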
Scale    Range of 1/Z            Δu       Number of Iterations
A        0.1 ≤ 1/Z < 10          10⁻⁵     10⁵
B        10 ≤ 1/Z < 100          10⁻⁶     10⁶
C        100 ≤ 1/Z < 1000        10⁻⁷     10⁷
D        1000 ≤ 1/Z < 10,000     10⁻⁸     10⁸

DDA DESIGN FOR DEMONSTRATION PROBLEM, xy″ + ½y′ + y = 0

To demonstrate the principles of the new techniques we have designed and built a special purpose DDA to solve x(d²y/dx²) + ½(dy/dx) + y = 0. Consequently, we did the detailed logic design needed to incorporate automatic rescaling, asynchronous timing and shared integrators into a DDA.

The general solution to x(d²y/dx²) + ½(dy/dx) + y = 0 is

y = A cos(2√x) + B sin(2√x)

where A and B are arbitrary constants. If the initial conditions are y = 100 and dy/dx = 0 at x = 0, then the particular solution is

y = 100 cos(2√x)

for which a graph is shown in Fig. 4. We confined our interest to computing the solution y = 100 cos(2√x) from x = 0 to x = (9π/4)², although other solutions could be computed by changing the initial conditions.
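That y = 100 cos(2√x) actually satisfies the demonstration equation can be checked numerically; the sketch below evaluates the residual x·y″ + ½·y′ + y with central differences (the step size h is an arbitrary choice):

```python
import math

def y(x):
    # The paper's particular solution.
    return 100.0 * math.cos(2.0 * math.sqrt(x))

def residual(x, h=1e-4):
    # Central-difference estimates of y' and y'', plugged into
    # x*y'' + (1/2)*y' + y, which should be ~0 for a true solution.
    y1 = (y(x + h) - y(x - h)) / (2 * h)
    y2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return x * y2 + 0.5 * y1 + y(x)

print([residual(x) for x in (0.5, 2.0, 10.0)])   # all three ~0
```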
The DDA hookup used to solve x(d²y/dx²) + ½(dy/dx) + y = 0 is shown in Fig. 5. The box with dx inside generates the independent increment which starts each iteration. The figures labeled 1T, 2T, and 3T are counters for the purpose of scaling and will be explained later. The triangles labeled 1I, 2I, and 3I are the integrators storing 1/x, dy/dx and y, respectively. The lower input is the differential of the variable being stored. The output is the product of the upper input and the variable being stored.

Integrator 1I is a shared integrator having one Y register and three R registers labeled R1, R2, and R3. Input 1 to Integrator 1I, dx, causes Y to be accumulated in register R1. The overflow from R1, dx/x, becomes input 2. Input 2 causes Y to be accumulated in register R2. The sign of the overflow from R2 is switched; the result, −dx/x², is equal to d(1/x) and hence becomes the lower input to 1I. Input 3, which comes from Counter 2T, is equal to −½(dy/dx)·dx, and input 4, which comes from Integrator 3I, is equal to −y dx. Both inputs 3 and 4 result in Y being accumulated in register R3. So it is possible for Y to be added to R3 twice in the
same iteration if both 3I and 2T overflow. Multiplying the equation x(d²y/dx²) + ½(dy/dx) + y = 0 by dx, it is clear that the sum of inputs 3 and 4 is equal to x(d²y/dx²) dx. Therefore the overflow from R3 is (d²y/dx²) dx, which equals d(dy/dx) by identity.
The next phase is the scaling, which determines the increment size corresponding to each input and output pulse and the number of bits in each register. To demonstrate the technique of automatic rescaling, we used two scales. Scale A is for the range from x = 0 to x = 16, and Scale B is for the remaining
[Figure 3. Increase in accuracy as the size of the independent variable increment is decreased (or as the number of iterations is increased) for a DDA solving Chapman's equation. Dashed curve: exact solution; solid curves: DDA solutions.]

[Figure 4. Solution of xy″ + ½y′ + y = 0.]
range from x = 16 to x = (9π/4)². The absolute maximums of 1/x, dy/dx, and y, plus the scaling coefficients for Scale A and Scale B, are given in Fig. 6. For example, a 5 on a line would indicate that the weight of the variable increment at that point equals 1/2⁵, or it takes 2⁵ pulses to equal one unit of the variable. The position of the binary point in any register can be determined by the number inside the triangle. The following sketch shows the sign convention. The scaling coefficients were purposely chosen so that the length of each register was 7 bits for Scale A and Scale B. The counters, 1T, 2T, and 3T, in Scales A and B are used to decrease the scaling coefficients so that they satisfy the scaling equations.

[Figure 5. DDA to solve x(d²y/dx²) + ½(dy/dx) + y = 0.]
We also employed asynchronous timing in the
computer. No further explanation of asynchronous
timing is needed here.
Another problem not mentioned so far is sequencing, which is related to the timing. While Integrator 1I is processing input 2, it is possible for inputs 3 and 4 to arrive. The integrator must process each input one at a time in series. To solve this problem, the integrator has been designed to store any input pulse which arrives while the integrator is integrating. When the integrator is finished, it will act upon the stored input pulse.

[Figure 6. Scale A and B for demonstration DDA.]

[Figure 7. ΔX generator sequencing chart.]

The sequencing problem is best illustrated by examining the pulse paths. Referring to the Sequencing Chart (Fig. 7), there are three basic time zones after a ΔX generator starts the iteration cycle. In zone A, Integrators 1I, 2I and 3I do a ΔX cycle (triangles labeled 1, 4 and 8) and there is no possibility of an integrator time conflict. In zone B, however, there are three pulse paths where Integrator 1I can be actuated in a ΔX cycle; they are pulse path 1 (triangle 2), pulse path 2 (triangle 5) and pulse path 4 (triangle 9). In the worst case, if all integrators in time zone A overflow, then all three requests for 1I in time zone B must be honored. This is done by sequencing the 1I ΔX requests on a first come, first served basis. There is also a time conflict in time zone C with Integrator 2I, pulse path 2 (triangle 6) and pulse path 4 (triangle 10). The worst possible series path of integrators is shown below:
[Diagram: worst-case series path of integrators.]
The worst path is four ΔX cycles and one ΔY cycle. They are Integrators 1, 2, 5, 9 and 3. The other integrators, Nos. 4, 7, 6, 10 and 8, are processed in parallel and take no added time. For example, while Integrator 1 is processing, Integrators 4 and 8 are also processing. In this description of the pulse paths and parallel processing we have ignored the influence of the 1T, 2T and 3T counters. The most they can do, however, is decrease the occurrence of the worst case series path. At the completion of all the pulse paths, the ΔX generator starts the next cycle. Since the longest path is five integrators (which occurs infrequently) and the shortest is one (which occurs frequently), this technique of asynchronous timing results on the average in a great time saving.

Test data obtained from the demonstration computer indicates the number of iterations per second to be in excess of 2 million.
LOGIC DETAILS OF DEMONSTRATION DDA

The logic of the demonstration DDA to solve xy″ + ½y′ + y = 0 can be divided into three sections: Integrators, Counters and Master Timing Control.

Integrators

There are two types of integrators, shared (1I) and normal (2I and 3I), both of which use the normal rectangular integrator technique. To ease the description, we will describe the less complicated normal integrator and note the differences of the shared integrator where appropriate.

The integrators are divided into two sections: (a) Register and Adder, and (b) Timing and Control.
[Figure 8. Registers and adder interconnection.]
Registers and adder. The functions of the Y and R registers are as normally used in a DDA (see reference 7 for a full description of DDA integrators). There is also a Q register which is used as temporary storage from the adder. The block diagram of the registers and adder is shown in Fig. 8. The possible register-to-adder transfers appear in Table 1. (In the case of a shared integrator, Q transfers to R1, R2, or R3 depending on which ΔX is in operation.) Both the R and Q are 7 bit unsigned registers, and the Y register is 7 bits plus sign. Subtraction is performed by a modified 1's complement addition plus some sign control and zero detection. The Y and Q registers can transfer out their true value (Y_T or Q_T) or their 1's complement value (Y_C or Q_C).
The total algorithm for the Y, R and Q registers to the adder is given in Table 1. The symbols used in the table are:

Y_M = Y register magnitude
Y_S = Y register sign
Y_T = Y register true value
Y_C = Y register 1's complement value
C = overflow carry-out of adder
C̄ = no overflow carry-out of adder
→ = transfers to
DN = do nothing
= yields
Let's look at an example (line 4): Suppose the integrator is in the condition Y_M ≠ 0 (the contents of the Y register are not equal to zero), Y_M ≠ MAX (the contents of the Y register are not equal to all ones, the maximum value) and Y_S = − (the sign of the Y register is negative). If the input to the integrator is a ΔX+ cycle (column 1), then from the table it can be seen that the 1's complement value of Y is added to the value of R and the result stored in Q (Y_C + R → Q). Then the true value of Q is transferred back to R (Q_T → R). If an overflow carry from adder position 7 did not exist, then the integrator signals a minus (ΔX−) output (C̄ − OUTPUT) to the next integrator.
In addition to the normal functions of the registers and adder, they are also used to accomplish the rescale function. The rescale is really only a shift left of the contents of the Y register. Rather than put in the shift hardware and set up a special timing sequence, it was found to be cheaper and faster to use the adder. This is done by adding a 0 to the contents of Y (which is only a trick to get the output of the adder to contain Y) and transferring the adder output to the shifted rather than the normal position of Q. For example, in Integrator 2I, the output of S0 (the zero stage of the adder) is sent to Q4, S1 to Q5 and S2 to Q6. All lower stages of Q, that is Q0, Q1, Q2 and Q3, are filled with 0's. This results in a shift left of 4 of the Y register.
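The add-zero-and-reroute trick can be sketched directly; the routing below (S0 to Q4, S1 to Q5, S2 to Q6) follows the Integrator 2I example in the text, while the 7-bit width is the register size stated earlier:

```python
# Rescale without shift hardware: add 0 to Y so the adder's sum is
# just Y, then capture the low sum bits into shifted positions of Q.
WIDTH = 7

def rescale_shift_left_4(y):
    s = (y + 0) & (2**WIDTH - 1)        # adder output = Y + 0
    q = 0
    for i in range(3):                  # route S0->Q4, S1->Q5, S2->Q6
        q |= ((s >> i) & 1) << (i + 4)
    return q                            # Q0..Q3 are filled with 0's

print(rescale_shift_left_4(0b0000101))  # 80, i.e. 0b1010000 = 5 << 4
```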
The adder network is a fully parallel, 6-delay, carry look-ahead adder. (For more details on adders, see references 4, 5 and 6.) In brief, this adder is made ripple free because each stage "looks ahead" to the inputs of all previous stages to determine whether a carry-in will exist. This technique, although extravagant in hardware, is the fastest method for binary addition. In passing, it should be noted that this adder could have been made 1 delay faster (from 6 delays maximum to 5 delays maximum), but the amount of hardware would have been approximately doubled. The gate delays in the microelectronics we are using result in an add time of 50 to 90 nanoseconds.
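The look-ahead idea can be modeled in software: each stage's carry-in is computed as a boolean function of the generate and propagate signals of all lower stages at once, rather than waiting for a carry to ripple. This is a generic carry look-ahead sketch for the 7-bit width used here, not the machine's gate-level design:

```python
# 7-bit carry look-ahead adder model: carry into stage i+1 is
# c[i+1] = g_i OR p_i*g_{i-1} OR p_i*p_{i-1}*g_{i-2} OR ... ,
# so every carry is a direct function of the inputs.
WIDTH = 7

def cla_add(a, b):
    g = [(a >> i) & (b >> i) & 1 for i in range(WIDTH)]    # generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(WIDTH)]  # propagate
    c = [0] * (WIDTH + 1)
    for i in range(WIDTH):
        carry = g[i]
        for j in range(i):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            carry |= term
        c[i + 1] = carry
    s = 0
    for i in range(WIDTH):
        s |= (p[i] ^ c[i]) << i
    return s, c[WIDTH]     # 7-bit sum and the overflow carry-out C

print(cla_add(0b0111111, 0b0000001))   # (64, 0): 63 + 1 = 64, no C
```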
Timing and Control

Each integrator is required to perform two types of operations: a ΔX cycle, where Y is accumulated to R, and a ΔY cycle, where Y is incremented or decremented by 1. Control flip-flops in each integrator store the present status of the integrator as well as the input commands (ΔX or ΔY). Another series of control flip-flops store the action to be taken by the integrator, based on the decoding of the input commands and the present status.

See Table 1 for the combinations of input commands, present condition of the integrator and decoded commands. The input commands are also used to initiate the timing chain. No clocks exist in the timing chain, or anywhere in the machine. Rather, the timing and control is designed so that each integrator operates only on command and will process commands according to a priority sequence as shown in the following tabulation.
TABLE 1
Algorithm for Registers and Adder

[Table 1 tabulates, for each condition of the integrator (the combinations of Y_M = 0, Y_M ≠ 0 and Y_M = MAX with Y_S = + or −) and each input (ΔX+, ΔX−, ΔY+, ΔY−), the resulting register transfers and outputs, e.g., Y_T + R → Q, Y_C + R → Q, Q_T → R, Q_C → Y, Y_C + 1 → Q, C → + OUTPUT, C̄ → − OUTPUT, DN, or OVERFLOW.]
1. If a ΔX command occurs, and a ΔY is not active, process ΔX.
2. If a ΔX command occurs and ΔY is active, then store the ΔX command.
3. If a ΔX command is in storage and ΔY goes inactive, then process ΔX and reset ΔX storage.
4. If a ΔX and ΔY command occur at the same time, process ΔX and store ΔY.
5. If a ΔY command is in storage and ΔX goes inactive, then process ΔY and reset ΔY storage.
Notice that in line 4, preference is given to ΔX commands, since only they can cause an output from the integrator. In this way, if there is to be an output, the next integrator starts processing sooner than if the ΔY (which never has an output) were processed first.

Each of the ΔX and ΔY command and storage flip-flops from each integrator is also decoded in the Master Timing and Control Section. When all flip-flops are zero, which indicates that no integrator is processing or has anything to be processed, the next ΔX generator cycle, or next iteration cycle, is started, which is the asynchronous timing.
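The priority rules above can be exercised with a toy model. The encoding is hypothetical (the real machine uses flip-flops and asynchronous handshakes, not method calls), but the ordering it produces matches rules 1 through 3:

```python
class IntegratorControl:
    """Toy model of the command priority: a ΔX arriving while a ΔY
    cycle is active is stored (rule 2) and processed when the ΔY
    finishes (rule 3); otherwise it is processed at once (rule 1)."""
    def __init__(self):
        self.busy_dy = False         # a ΔY cycle in progress
        self.stored_dx = False
        self.log = []

    def dx_command(self):
        if self.busy_dy:
            self.stored_dx = True    # rule 2: store ΔX while ΔY active
        else:
            self.log.append("dX")    # rule 1: process ΔX immediately

    def dy_start(self):
        self.busy_dy = True
        self.log.append("dY-start")

    def dy_done(self):
        self.busy_dy = False
        if self.stored_dx:           # rule 3: process the stored ΔX
            self.log.append("dX")
            self.stored_dx = False

ctl = IntegratorControl()
ctl.dy_start()       # ΔY cycle begins
ctl.dx_command()     # ΔX arrives while ΔY is active -> stored
ctl.dy_done()        # ΔY finishes -> stored ΔX is processed
print(ctl.log)       # ['dY-start', 'dX']
```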
Counters

The DDA contains three counters, 1T, 2T and 3T. Counters 1T and 3T are simple decoding counters of 2 bits and 1 bit, respectively, and 2T is a 4-bit double-rank forward and reverse shift counter.

In each of the counters, since we cannot afford to wait for ripple time, a simple expedient is used. When a counter is one count below its output value, the next count command will trigger the output even though the counter never really increments (or decrements). For example, suppose the counter was designed to overflow or output at a value of 16; when it reaches 15, the next count command triggers the output and resets the counter. In this way, the computer does not have to wait for the counter to increment, but only for the time it takes the simple counter control to trigger.
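A sketch of the early-trigger behavior, with the modulus-16 example from the text (the class itself is an illustration, not the shift-counter hardware):

```python
class EarlyCounter:
    """Counter with the early output trigger: when the count stands
    one below the output value, the next count command fires the
    output immediately and resets, so nothing waits on an increment."""
    def __init__(self, modulus=16):
        self.modulus = modulus
        self.count = 0

    def pulse(self):
        if self.count == self.modulus - 1:   # e.g. at 15 for modulus 16
            self.count = 0
            return True                      # output triggered
        self.count += 1
        return False

c = EarlyCounter(16)
outputs = sum(c.pulse() for _ in range(32))
print(outputs)   # 2 -- exactly one output per 16 count commands
```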
It might appear that, since the counter ripple time has been bypassed as a problem, we could have used the simplest ripple counter, but this does not work out. It is true that the counter time is important only insofar as it finishes its counting before the next count command is received. Under certain conditions in the DDA, however, the count commands can be very close together, so the faster shift counter was required.

In order to avoid the annoying typical counter problems of two types of zero (plus and minus) and sign control, a simple expedient was used of making the counter one stage too big and resetting to the middle value. For example, if we want to count from 0 to +7 and 0 to −7, we make the counter 4 bits (instead of 3) and count from 8 up to 15 and 8 down to 1. The hardware difference between the two techniques is so small as to be unimportant.
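The offset-counter idea reduces to arithmetic: reset to the middle value and read the signed count as an offset from it. A minimal sketch, assuming the 4-bit example from the text:

```python
class OffsetCounter:
    """Count -7..+7 with no negative zero or sign logic: use one
    extra bit, reset to the middle value (8 for 4 bits), and read
    the signed value as (count - middle)."""
    def __init__(self, bits=4):
        self.mid = 1 << (bits - 1)     # 8 for a 4-bit counter
        self.count = self.mid          # reset to the middle value

    def up(self):
        self.count += 1                # hardware counts 8 up to 15

    def down(self):
        self.count -= 1                # or 8 down to 1

    @property
    def value(self):
        return self.count - self.mid   # the signed count

c = OffsetCounter()
for _ in range(3):
    c.down()
print(c.value)   # -3, with a single zero representation
```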
Master Timing and Control (MT&C)

The Master Timing and Control is made up of a series of housekeeping and control functions which belong to the whole DDA rather than any one integrator. They are: (1) ΔX Generator, (2) Fault detection, (3) Rescale, and (4) Push button starter.

ΔX Generator (ΔX Gen). This flip-flop receives the output and status of all integrators to sense when they have completed their last operation and are waiting for another cycle. When this "nothing else to do" condition exists, the ΔX Generator initiates the start of the next iteration cycle. This scheme takes the place of a clock, which would run the DDA at fixed clock periods and would have to be timed for the worst case pulse path, or slowest possible cycle. By the very nature of a DDA not every iteration cycle will have overflows from some integrators; in fact, many will not.
Fault detection. Certain conditions of the Y decode logic of each integrator are considered as a fault. These conditions, when they occur, are stored in flip-flops in the MT&C, and the machine will stop or ignore these faults depending on a switch setting on the operator's console.

The fault conditions are:

• 1I Y register = 0.
• 2I Y register > 2⁷ − 1 = overflow.
• 3I Y register > 2⁷ − 1 = overflow.
Rescale. Two flip-flops are used to store the rescale conditions of 1I and 2I. When both conditions occur, the next ΔX Generator cycle is replaced by a rescale cycle, which resets the initial conditions into the integrators and counters and at the same time shifts left or right the contents of the Y register in 1I and 2I. This shifting is done by using the adder network. At the completion of the rescale cycle, the next ΔX Generator cycle is initiated and the DDA resumes its normal operation.

Push button starter. This is simply a means of filtering out the "Single-Cycle" push button noise when the DDA is operated in the "Step" mode. In the "Step" mode, the push button initiates the ΔX Gen cycles so that the machine iterates on command.
HARDWARE DESCRIPTION OF
DEMONSTRATION DDA
In the development phase of the DDA program it was unnecessary, and in fact burdensome, to package the computer for an aerospace environment. On the other hand, a "breadboard" package often turns out to mean "slapped together." Neither of these extremes was desirable, and the result of this program is what we call a laboratory model.

For the logic hardware, it was advantageous to use aerospace approved circuits. In this way, the laboratory model would have speed and noise characteristics approximating those of a final packaged version. The logic chosen was the Fairchild commercial integrated circuits.
The front and back views of the resultant laboratory model are shown in Figs. 9 and 10. The chassis measures 23" × 20" × 5" (exclusive of the operator's console) and can accommodate 100 logic cards of the type shown in Fig. 11. The connectors are 86-pin AMP tab-wired and require no soldering. The recessed tabs preclude wire shorting and also allow easy removal. Connector power (+4 v and ground) is tapped off the bus bar, with all grounds isolated from the chassis and brought back to a central point.

The logic cards measure 4½" × 2" and each contains twelve 8-pin, TO-5 header microelectronic logic elements. The only interconnections made on the printed circuit cards are the power bussing. The remaining logic interconnections are made on the wiring backpan (this is often referred to as "backpan logic"). This technique allows greater flexibility and ease of modification, and the logic partitioning is much easier since it can be done at a much later date in the design. There is also a significant cost saving in only requiring one printed circuit design.
[Figure 9. Front view of DDA.]

[Figure 10. Back view of DDA.]

[Figure 11.]

The photographs in Figs. 9 and 10 were taken during the computer checkout, which accounts for the chassis sitting vertically with the operator's console temporarily mounted on top.

In Fig. 11, the lower card is a normal 12-element logic card. The upper card is also a logic card, but with the addition of capacitors and trimpots for single shots. The middle card shows the reverse side of all cards.
The demonstration DDA to solve xy″ + ½y′ + y = 0 contains 94 cards, each with 12 elements, for a total of 1200 TO-5 cans (approximately 1900 logic elements). Redesigned for aerospace, and using the conservative estimate of 17,000 flat-packs per cubic foot, the machine could be packaged in a volume of 0.07 ft³ and weigh approximately 3.6 pounds.
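The quoted volume follows from the stated figures; a one-line check:

```python
# Packaging estimate: 1200 flat-packs at the conservative density of
# 17,000 flat-packs per cubic foot quoted in the text.
packs = 1200
density = 17_000                # flat-packs per cubic foot
volume = packs / density        # ~0.0706 ft^3, quoted as 0.07 ft^3
print(round(volume, 2))         # 0.07
```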
CONCLUDING REMARKS

The use of digital differential analyzers (DDA's), because of their great speed in solving differential equations, appears to offer a promising future in aerospace applications over the pure general purpose (GP) approach. In brief, two very attractive applications for the high speed DDA are evident: (a) as part of a GP-DDA hybrid, which would alleviate the loading of the GP computer for aerospace applications, and (b) as the real time computer for a strap-down inertial guidance system. A further detailed comparison is necessary to justify the merits and economics of this approach, but we do feel that the vastly disproportionate design time given to the GP machines requires at least some attempt at equal design time for the DDA before the final comparison can be made. Potentially, the DDA can iterate a differential equation faster, since the GP wastes some time doing housekeeping and memory transfer instructions, e.g., transfers to and from memory, and indexing. In contemporary DDA's this potential has generally not been realized for two reasons: (a) most of the DDA's built have been serial machines, and (b) the DDA uses a fixed independent variable increment, while most of the more sophisticated GP programs for solving differential equations use a variable increment.
ACKNOWLEDGMENT

The author wishes to express his appreciation to Mr. M. F. Hutton of the Martin Company, who contributed significantly to the details of this program.
REFERENCES

1. D. R. Chapman, "An Approximate Analytical Method for Studying Entry Into Planetary Atmospheres," NACA TN 4276, 1958.
2. R. C. Wingrove and R. E. Coate, "Piloted Simulator Tests of a Guidance System Which Can Continuously Predict Landing Point of a Low L/D Vehicle During Atmosphere Re-entry," NASA TN D-787.
3. A. Gill, "Systematic Scaling for Digital Differential Analyzers," IRE PGEC, pp. 486-489 (Dec. 1959).
4. I. Flores, The Logic of Computer Arithmetic, Prentice-Hall, Inc., 1963.
5. R. S. Ledley, Digital Computer and Control Engineering, McGraw-Hill Book Co., p. 257.
6. R. K. Richards, Arithmetic Operations in Digital Computers, D. Van Nostrand Co., p. 303.
7. Huskey and Korn, Computer Handbook, Sect. 19 on DDA's, McGraw-Hill Book Co. (1962).
BIBLIOGRAPHY
1. "An Air-To-Surface Missile Guidance High-Speed Digital Differential Analyzer," Final Report, Contract No. AF33(600)-31315, WADC TR 59-651, IBM Fed. Sys. Division (Jan. 1960).
2. C. G. Blanyer and H. Mori, "Analog, Digital and Combined Analog-Digital Computers for Real-Time Simulation," Proc. Eastern Joint Comp. Conf., Washington, D. C., Dec. 9 to 13, 1957, pp. 104-110.
3. R. E. Bradley and J. F. Genna, "Design of a
One-Megacycle Iteration Rate DDA," Prog. SJCC
AFIPS, 1962.
4. E. L. Braun, "Brief Introduction to the
DDA Computer," Computers in Control, AlEE
Control Compo Session, 1960 to 1961, pp. 80 to 86.
5. E. L. Braun, "A Comparison of Integral and
Incremental Digital Computers for Process Control
Applications," Control Engineering, pp. 113-118
(Jan. 1960).
6. E. L. Braun, "Design Features of Current
I
947
DDA's," IRE Convention Record, Part 4, N. Y.,
pp. 87-97, 1954.
7. E. L. Braun and G. Post, "Systems Considerations for Computers in Process Control," IRE
National Conv. Record, Part 4, pp. 168 to 181,
1958.
8. V. Bush, "The Differential Analyzer," Journal of Franklin Institute, Vol. 212 (1931).
9. "Computers in Control," AlEE Pub. S-132
(Sept. 1961).
10. J. M. Crank, The Differential Analyzer,
Green and Company, Ltd., London, 1947.
11. F. G. Curl, "A Comparison of Computers,"
Computers in Control, 1960 to 1961, AlEE Control
Computer Sessions, pp 87-96.
12. "DDA," published by G. Forbes.
13. M. M. Dickinson, "A Comparison of DDA
& GP Equipment in Guidance Systems," Computers
in Control, 1960 to 1961, AlEE Control Computer
Sessions, pp. 208-210.
14. "The Digitac Airborne Control System
Trends in Computers: Automatic Control and Data
Processing S-59," Proceeding Western Joint Compo
Conf., Los Angeles, Feb. 11 and 12, 1954, pp. 38-44.
15. J. F. Donan, "The Serial Memory DDA,"
Math. Tables and Other Aids to Comp., Vol. 6, No.
38, April 1952, pp. 102-112.
16. C. F. Edge, _"Digital Differential Analyzers
Versus General Purpose Digital Computers for
Schuler-Tuned Inertial Navigation Systems," IEEE
Trans. on Military Elec., Vol. MIL-7, pp. 23-29,
Jan. 1963.
17. E. E. Grabbe, Handbook of Automation,
Computation and Control, Vol. 2, Computers and
Data Processing, Wiley, 1959.
18. D. R. Hartree, Calculating Instruments and
Machines, University of Illinois Press, Urbana, Ill.,
1949
19. F. B. Hills, "A Study of Incremental Computation By Difference Equations," MIT Servomechanisms Lab., Rept. No. 7849-R-1 (May 1958).
20. E. G. Homer and W. Palmer, "Comparison of Computational Speeds of Digital Differential Analyzers and General Purpose Computers," IEEE PGEC, June 1964, p. 307 (correspondence).
21. H. K. Knudsen, "The Scaling of Digital Differential Analyzers," IEEE Trans. on Electronic
Computers, Vol. EC-14, pp. 583-589, Aug. 1965.
22. R. D. Lamson, "A Division Algorithm for a
948
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
Digital Differential Analyzer," IEEE PGEC, Feb.
1964, pp. 54-55.
23. F. Lesh, "Methods of Simulating a Differential Analyzer on a Digital Computer," J. Assoc. of
Compo Machinery, Vol. 5, July 1958, pp. 281-288.
24. "Maddida-DDA," Brochure No. 38, Northrop Aircraft, Inc. (Dec. 1950).
25. H. E. Maurer, "An Approximate Analysis of Error Propagation in a DDA," MIT Inst. Lab., Martin Company, MR-6140-A44 (March 1958).
26. H. E. Maurer, "Error Analysis of a DDA,"
MIT Inst. Lab., Martin Company, MR-6140-A-17
(May 1957).
27. M. J. Mendelson, "The Decimal Digital Differential Analyzer," Aeronautical Engineering Review, Vol. 13, pp. 42-54, (Feb. 1954).
28. K. Millington, "An Experimental Incremental Computer," J. Brit. IRE, Vol. 25, pp. 461-473
(May 1963).
29. L. M. Milne-Thompson, Calculus of Finite
Differences, Oxford Press.
30. L. P. Mussner, "Real-Time DDA (DART)
Trends in Computers: Automatic Control and Data
Processing S-59," Proc. Western Joint Compo Conf.,
Los Angeles, Feb. 11 and 12, 1954, pp. 134-139.
31. D. J. Nelson, "DDA Error Analysis Using
Sampled Data Techniques," AFIPS SJCC, 1962,
pp. 365-376.
32. P. L. Owen, M. F. Partridge and T. R. H. Sizer, "Cosair, A Digital Differential Analyzer," Royal Aircraft Establishment, England, TN IAP 1123; "The Use of Direct Coupled Logic in the Design of an Arithmetic Unit (for a DDA)," Electronic Engrg., Vol. 34, pp. 540-545, Aug.; pp. 619-623, Sept. 1962; "A Transistor Digital Differential Analyzer," J. Brit. IRE, Vol. 22, pp. 83-96 (Aug. 1961).
33. M. Palevsky, "The Design of the Bendix DDA," Proc. IRE 41, Oct. 1953, pp. 1352-1356.
34. M. Palevsky and J. V. Howell, "Digital Differential Equation Solver," Instr. and Control Sys., Vol. 36, pp. 118-121 (April 1963).
35. Z. Pawlak, "The Application of a Negative
Base Number System to a Digital Differential Analyzer," Bull. L'Acad. Polonaise Sci., Ser. Sci. Tech.,
Vol. 8, pp. 149-150, (Feb. 1960).
36. R. Rutishauser, "Litton-20 DDA," Litton
Industries, 1955.
37. M. I. Schneider, "Logical Design of Integrators for Digital Differential Analyzers," MIT Instrumentation Lab., Rept. No. T-154 (May 1958).
38. R. G. Selfridge, "Coding a GP Digital Computer to Operate As a Differential Analyzer," Proc.
Western Joint Comp. Conf., Los Angeles, March 1
to 3, 1955, pp. 82-84.
39. G. T. Sendzuk, "Results of Simulation Comparison of Control Computers," Computers in Control, AIEE Control Computer Sessions, 1960 to
1961, pp. 97-103.
40. G. T. Sendzuk, "A Variable Increment Computer," Computers in Control, AIEE Control Computer Sessions, 1960 to 1961, pp. 112-120.
41. M. F. Sentovich, "Mechanization of SST Inertial Navigation Computations in All-DDA Computer," 20th Annual National Meeting, Navigation,
Autumn 1964, Vol. II, NVM 3, pp. 284-298.
42. S. M. Shackell and J. G. Tryon, "The Relative Merits of Incremental and Conventional Digital
Computers in Air-Borne Real-Time Control," Computers in Control, AIEE Control Computer Sessions, 1960 to 1961, pp. 200-207.
43. C. E. Shannon, "Mathematical Theory of
Differential Analyzers," J. of Mathematical Physics,
Vol. 4 (Dec. 1941).
44. D. E. Skabelund, "The Numerical Process of
a Binary Differential Analyzer," University of Utah
Report (Aug. 1953).
45. R. E. Sprague, "Fundamental Concepts of
the DDA Method of Computation," Math Tables
and Other Aids to Comp. (6), Jan. 1952, pp. 41-49.
46. R. E. Sprague, "CRC-105 Computer," Aero
Digest (67) pp. 48-55, (Aug. 1953).
47. R. H. Stotz, "Specialized Computer Equipment for Generation and Display of Three
Dimensional Curvilinear Figures," Contract
AD33 (600)42859, Prof. DSR 8753, 154 pp.,
March 1963; U. S. Gov. Res. Rept., Vol. 38, p. 6
(A), September 20, 1963. AD 406 608 (OTS
$12.00).
48. J. Tou, "Digital and Sampled-Data Control
Systems," McGraw-Hill Book Co., 1959.
49. "The Trice-A High Speed Incremental
Computer," IRE Nat. Conv. Record, 1958, Part 4.
50. O. C. Turtle, "Incremental Computer Error
Analysis," IEEE Trans. on Communication and Electronics, Vol. 82, pp. 492-495, Sept. 1963.
DESIGN OF A HIGH SPEED DDA

51. R. W. Waller and F. E. Brinckehoff, "A Comparison of Whole Value and Incremental Digital Techniques by the Use of Patch Panel Logic," Computers in Control, AIEE Control Computer Sessions, 1960 to 1961, pp. 104-111.
52. C. J. Wayman, "The Airborne Digital Computer 'Dexan'," Interavia, Vol. 16, pp. 1705-1706,
(Dec. 1961).
54. E. Weiss, "Applications of CRC-105 Decimal DDA," IRE PGEC EC-1, Dec. 1952, pp. 19-24.
55. H. A. Whitted, "A High-Speed Rate Multiplier for Data Display Systems," Navy Electronics Laboratory, Report 1174, p. 70 (July 1963).
56. D. J. Winslow, "Incremental Computers in
Simulation," Meeting of Southeast Simulation
Council, Huntsville, Ala., Oct. 1958.
ENGINEERING MATHEMATICS VIA COMPUTERS
John Staudhammer
Arizona State University
Tempe, Arizona
THE NEED

In the last two decades the use of mathematics in engineering has increased considerably. Today, mathematical models make up the heart of many engineering disciplines. The rapid increase in scientific knowledge has brought about a staggering proliferation of technological applications. In order to equip students for this exploding technical environment, engineering colleges have turned to emphasizing fundamentals. It is physically impossible to include even a representative sampling-in-depth of today's technological developments without further crowding the already overburdened curricula. This crowding is due not so much to the new ideas the newer scientific applications require, but rather to the inordinate number of details and refinements they engender.

It is hardly a radical idea to suggest that today's curricula, developed in a more static scientific world, are not keeping pace with the accelerating pulse of the mainstream of scientific thought. But more importantly, the vast majority of curricula fail to prepare students in the philosophy and processes of engineering; rather, they stress a wide variety of special tricks and special procedures. Often the only explanation is that these tricks "solve problems." This is especially true of the construction of simple mathematical models, where the extent of abstraction may be dictated solely by the availability of simple solutions, as for instance, solutions of constant-coefficient differential equations.

To ease some of the more routine calculations, computers are being used in some courses in various colleges. However, extensive use of modern computers by all instructors of engineering and engineering mathematics can eliminate many time-consuming rote calculations from the class presentations. Such an extensive use of computers requires a careful reevaluation of the structure of engineering education. Many new and unique computer programs need to be developed for each segment of the restructured engineering curriculum. The faculty and students need to understand how to use computers and what to use them for, and they need to be able to access the computer programs during normal classroom study. Computer use during classroom discussion can allow for the exploration of a larger number of problems of a greater degree of sophistication than is possible otherwise.1

A broader base of applied problems will afford an opportunity for students to gain a deeper understanding of the scope of modern engineering. Many more significant problems become vivid demonstrations of principles discussed; ramifications of changes in problem formulation become easily demonstrable,
and a wider variety of solutions can be supplied to
the student upon which heuristic arguments may be
based. It is even possible to explore intuitive approaches to the solution of given problems.
THE APPROACH
One of the most subtle changes of the last decade and a half has been in the meaning of the word "solution." It is widely recognized that a mathematical closed form, while unquestionably pleasing aesthetically, may not necessarily be interpreted easily.
Curves and graphs may constitute a more desirable
solution format, but most often a solution merely
consists of a procedure whereby an answer can be
obtained. Categorically, today's good solution is
one that can be calculated without undue difficulties. We do not shrink from solving 50 or 100 coupled simultaneous linear equations, or that many
differential equations. Every major computer center
has, or should have, an extensive subroutine library,
often stored on an on-line library tape, for effecting
extensive numerical calculation and/or simulation.
For statistical work, UCLA's BMD programs
have become widely accepted industry standards;
programs like RAND's ROCKET are used extensively in work in astrodynamics. Yet very few engineering instructors have been exposed to these powerful tools; fewer yet have introduced their students
to the concepts, the frame of mind necessary for the
effective use of these procedures. Many engineering
educators are still preoccupied with methods for the
solution of specific problems; they neither emphasize nor demand a mathematical environment that
stresses concepts of problem formulation and problem evaluation. In such an environment, effort
would be concentrated on stating the problem correctly and adequately and estimating the expected
answer. The mechanics of obtaining an answer
would be left to a computer (or a computer center); the computer would then, in effect, become an educated answer-generating device. Such a device may
still be fallible: when an unexpected answer appears, the user of the device must know enough
about the problem to be able to account for the discrepancy. Often it lies in the method used in arriving at the estimate, but sometimes it may be due to
bad input data, or (rarely) to a true machine error.
AN EXAMPLE: MATRIX APPLICATIONS
The main purpose of this paper is to discuss the
articulation necessary for a set of computer programs, none of which is very interesting by itself, to
be used for the teaching of a senior-graduate engineering course in the application of matrices. This
course is conceived as a model for other engineering study areas, which must begin to employ more
computer assistance. *
*For an outline of this course, see Fig. 1.
Matrix Methods in Electrical Engineering
1. Introduction: Examples of Matrix Formulations
2. Determinants and the Matrix Inverse
3. Linear Equations
4. Direct Numerical Procedures
   4.1 Computational Error Considerations
   4.2 Evaluation of Determinants
   4.3 Simultaneous Equations
   4.4 Inversion
   4.5 Programs for Very Large Matrices (Partitioning)
   4.6 Library Routines (Dimension-independent)
   4.7 Partial Double Precision
5. Iterative Numerical Procedures
   5.1 Simultaneous Equations
   5.2 Inversion
6. Non-Numeric Matrices (FORMAC)
7. Characteristic Value Problems
   7.1 Eigenvalue Calculation
   7.2 Eigenvector Calculation
   7.3 Repeated Eigenvalues
8. Diagonalization
   8.1 Non-repeated Eigenvalues
   8.2 Jordan Normal Forms
9. Functions of Matrices
10. Linear Differential Equations
11. Network Topology
12. Four Terminal Networks
13. Small Vibrations Problems
14. Continuous Systems
15. Operations Research Examples

Figure 1. Outline of course on matrix applications.
Overall Criteria
A set of computer programs was written conforming to the following criteria:
1. There is a variety of programs for each
task to be performed.
2. The programs constitute a sequence of
interrelated programs.
3. Successive programs are increasingly more
economical, and/or powerful, and/or have
a wider applicability.
4. Final programs are favorably comparable
with good computer programs.
5. Selective querying is possible.
6. Uniformity of input formats.
7. Relatively uniform output formats.
8. Limited restart feature.
9. Machine independence.
10. Compatibility with current program systems.
In the design of the computer programs each of
the above criteria must be considered concurrently.
Thus the programming language used is FORTRAN II, with a FORTRAN IV version also developed. It should be pointed out that the matrix
routines generated constitute a limited version of
matrix interpretive systems, such as the ones in use
at the Aerospace Corporation or in Lockheed's
FAMAS system.
Programs for Basic Operations
After a relatively conventional introduction of
matrix notation and of linearity, determinant theory
is discussed. This is the first instance of computer
work in the course.
The applications aspects of determinant theory
are covered in approximately five hours, including
the method of Gauss for the evaluation of numerical determinants. This method is then programmed,
a main routine written for it, and an acceptable input data format is agreed upon. This format must
conform to standard fixed format input of eight ten-digit numbers per card (8E10.0) preceded by two
cards, the first specifying a title, the second the order of the determinant that is to be evaluated. Serial
numbers and comments can also be placed on these
cards. The input data is in row order.
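The course's routines were written in FORTRAN II; purely as an illustration, the card format just described (a title card, an order card, then the entries in row order, eight ten-column fields per card) can be sketched in Python as follows. The function name and the sample deck are invented for the example:

```python
def read_matrix_deck(cards):
    """Parse the deck format described in the text: a title card,
    a card giving the order of the determinant, then the entries
    in row order, eight ten-column fields per card (8E10.0)."""
    title = cards[0].rstrip()
    order = int(cards[1][:10])
    values = []
    for card in cards[2:]:
        for i in range(0, 80, 10):           # eight 10-column fields
            field = card[i:i + 10].strip()
            if field:                        # blank fields are skipped
                values.append(float(field))  # plain or E-notation both parse
    return title, [values[r * order:(r + 1) * order] for r in range(order)]

# An invented 3-by-3 test deck:
deck = [
    "TEST CASE 1",
    f"{3:10d}",
    "".join(f"{v:10.1f}" for v in [1.0, 2.0, 0.0, 0.0, -1.0, 3.0]),
    "".join(f"{v:10.1f}" for v in [2.0, 2.0, 1.0]),
]
title, matrix = read_matrix_deck(deck)
```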
By means of examples the program is shown to
be faulty in some cases. The addition of a set of
PRINT statements allows monitoring of this program, thus resulting in the second determinant
evaluator. It is shown by the examples that it is the
main-diagonal zero divisors that lead to the breakdown of the original routine.
At this point a cursory discussion of chopping
errors is undertaken and the need for the pivot procedure demonstrated. 2 Implementing this pivot exchange leads to a fully acceptable routine, efficient
and relatively accurate as demonstrated by a variety
of numerical examples. The internal operation of
this routine is checked by means of selective printing of intermediate results accomplished by setting
control switches on the computer console.
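The pivot-exchange logic itself is compact enough to sketch; here it is in Python rather than the course's FORTRAN II, as an illustration only. Bringing the largest remaining column element to the diagonal removes the main-diagonal zero divisors that broke the first evaluator, and each row exchange flips the sign of the determinant:

```python
def determinant(a):
    """Gauss elimination with partial pivoting (row interchange)."""
    a = [row[:] for row in a]                # work on a copy
    n = len(a)
    det = 1.0
    for k in range(n):
        # bring the largest remaining element in column k to the diagonal
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[p][k] == 0.0:
            return 0.0                       # singular matrix
        if p != k:
            a[k], a[p] = a[p], a[k]
            det = -det                       # row exchange flips the sign
        det *= a[k][k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= f * a[k][c]
    return det

# A matrix the no-pivot routine would fail on (zero on the main diagonal):
d = determinant([[0.0, 2.0], [3.0, 4.0]])    # -6.0
```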
For inclusion in the permanent library of the installation the last program is made dimension-independent for compilations in FORTRAN II; a
version for complex matrices is also prepared. Extensive testing of these programs completes the discussion of numerical determinants.
Next the discussion of simultaneous equations is
undertaken. The basic theory underlying linear vector spaces is covered in about three hours, culminating in the derivation of the Gauss method of
elimination. Programming this method along the
lines of the first determinant evaluator gives the
first equation-solving routine. Although this program will also fail when zero main-diagonal elements are generated, the introduction of pivoting is
delayed until after a discussion of the Gauss-Jordan
method.
Pivoting used with the Gauss-Jordan method is
then made dimension-independent for inclusion in
the system library tape. Extensive numerical examples are used throughout.
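As a modern sketch of the same idea (Python standing in for the course's FORTRAN II), a Gauss-Jordan solver with partial pivoting reduces the augmented matrix [A | b] until A becomes the identity, leaving the solution in the last column:

```python
def gauss_jordan_solve(a, b):
    """Solve A x = b by Gauss-Jordan reduction with partial pivoting."""
    n = len(a)
    m = [row[:] + [bv] for row, bv in zip(a, b)]   # augmented matrix [A | b]
    for k in range(n):
        # partial pivoting: largest remaining element in column k
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[p] = m[p], m[k]
        piv = m[k][k]
        m[k] = [v / piv for v in m[k]]             # normalize the pivot row
        for r in range(n):
            if r != k and m[r][k] != 0.0:
                f = m[r][k]
                m[r] = [rv - f * kv for rv, kv in zip(m[r], m[k])]
    return [row[-1] for row in m]

x = gauss_jordan_solve([[2.0, 1.0], [1.0, 3.0]], [5.0, 10.0])   # [1.0, 3.0]
```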
The next topic handled is matrix inversion. Five
different programs are discussed, paralleling the programs for the solution of simultaneous equations, and
culminating in an efficient in-place inversion routine
(using pivoting and dimension-independence) for
use in FORTRAN II and IV systems.
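The storage-saving idea behind the in-place routine can be shown in a few lines. This Python sketch is illustrative only and omits the pivoting and the row-and-column unscrambling the text goes on to describe, keeping just the compacting trick: the inverse accumulates on top of the original matrix, so the augmented unit matrix is never stored:

```python
def invert_in_place(a):
    """Compact Gauss-Jordan inversion: the inverse overwrites the input
    matrix, so no augmented unit matrix (and no extra storage) is needed.
    Pivoting is omitted here for clarity."""
    n = len(a)
    for k in range(n):
        piv = a[k][k]
        a[k][k] = 1.0                  # the diagonal receives 1/pivot
        a[k] = [v / piv for v in a[k]]
        for r in range(n):
            if r != k:
                f = a[r][k]
                a[r][k] = 0.0          # this column receives -f/pivot
                a[r] = [rv - f * kv for rv, kv in zip(a[r], a[k])]
    return a

inv = invert_in_place([[2.0, 1.0], [1.0, 1.0]])   # [[1.0, -1.0], [-1.0, 2.0]]
```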
The derivation of this row-and-column-shuffling
procedure is rather interesting. Instead of deriving
the various algorithms needed for compacting the
Gauss-Jordan inversion procedure, 3 the original matrix is reduced and its augmented unit matrix printed for several test cases until it becomes obvious
how to eliminate the augmented part (thereby saving
50 per cent storage space). Similarly the algorithms
necessary for unscrambling the pivot-condensed,
row-and-column-shuffled matrix are first obtained
heuristically from a series of examples; only then are they derived mathematically. This procedure gives life to otherwise dry and seemingly uninteresting derivations.

The discussion of the solution of linear equations is concluded with programs suitable for complex matrices and procedures used with partitioned matrices. A few tape-shuffling routines are demonstrated for solving a set of simultaneous equations too large to fit into the machine at one time.

The discussion of round-off errors and direct numerical procedures is "naturally" extended to iterative techniques such as the Gauss-Seidel procedure. Throughout the discussions, a set of examples is used that demonstrates the limitations as well as the uses of the various methods. Without computer assistance most students would not be able to gain any insight into the why's and wherefore's of these various procedures.

Eigenvalue Problems

The next major topic discussed is the general eigenvalue problem. After conventional derivation of the Cayley-Hamilton theorem, the theorem is used to derive a computer-adaptable procedure for finding the characteristic equation. This process uses subprograms developed in earlier discussions; hence a pyramiding of subroutines starts at this point.

The solution of the characteristic equation is accomplished by a standard library routine, which is simply taken as given.

Properties of eigenvalues of special matrices are discussed with the aid of the above computer programs. The program is extended to complex matrices and properties of these matrices are studied. Applications to network analysis and vibration problems seem natural at this point.4

Application of the simultaneous equation routines discussed earlier leads to eigenvector problems; difficulties with repeated eigenvalues are easily demonstrated, and the idea of a minimum polynomial becomes obvious. Orthogonality of eigenvectors is demonstrated and their use in vibration problems illustrated by means of carefully contrived examples.

The complete eigenvalue problem constitutes the last problem discussed under this topic. Construction of the Jordan normal form leads directly to procedures (all programmed) for obtaining generalized eigenvectors.5 Numerical examples showing these methods complete this part of the class presentation.
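The paper does not name the computer-adaptable procedure programmed for finding the characteristic equation; one classical Cayley-Hamilton-based candidate is the Leverrier-Faddeev recursion, which needs only the matrix-multiply and trace subprograms built earlier, exactly the kind of pyramiding described above. A Python sketch, for illustration only (the course's routines were FORTRAN II):

```python
def char_poly(a):
    """Leverrier-Faddeev recursion: coefficients of det(lambda*I - A),
    highest power first, from repeated matrix products and traces."""
    n = len(a)

    def matmul(x, y):
        return [[sum(x[i][k] * y[k][j] for k in range(n))
                 for j in range(n)] for i in range(n)]

    coeffs = [1.0]
    m = [[0.0] * n for _ in range(n)]
    for k in range(1, n + 1):
        for i in range(n):                 # M_k = A * (M_{k-1} + c_{k-1} I)
            m[i][i] += coeffs[-1]
        m = matmul(a, m)
        coeffs.append(-sum(m[i][i] for i in range(n)) / k)
    return coeffs

# lambda**2 - 4*lambda + 3 for the matrix [[2, 1], [1, 2]]:
c = char_poly([[2.0, 1.0], [1.0, 2.0]])    # [1.0, -4.0, 3.0]
```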
Matrix Functions
During the class discussion, the Cayley-Hamilton theorem is applied (repeatedly) to the construction of matrix functions. Trigonometric and exponential functions having square matrices for arguments occur naturally enough in the solution of certain sets of linear differential equations. Several subprograms are constructed to carry out several methods for calculation of these matrices.

Various shortcuts in the above procedures lead to the surprising conclusion that the most time-consuming part of matrix function calculations is the raising of the original matrix to successively higher powers up to N-1, where N is the order of the matrix. A search for shorter procedures leads to the use of Jordan normal forms. Even though the logic is considerably more complex, the resultant program is approximately N times faster.

Figure 2. Calculation of e^M. (Printout showing the input matrix, the coefficients of the characteristic equation in decreasing powers of lambda, the characteristic roots, the simultaneous equations for the coefficients of the matrix function, the matrix-function expansion coefficients, and the function of the matrix.)

General Linear Equations Package

The program developed in the above section is one of two more extensive omnibus programs developed in the course. The other is a long routine for the solution of M linear equations in N unknowns. A check is made of the consistency of the set; consistent sets are further examined for unique solutions and parametric solutions. The program uses the Gauss method without pivoting to obtain a first approximation to the solution; by appropriate switch action the above process may be repeated using pivotal condensation. Next, unique solutions may be iterated using either the Gauss or the Gauss-Seidel iteration technique. After each complete pass the program may be restarted for a possibly different intermediate printing. Input data and intermediate results are stored on tape, thus enabling the handling of 80 by 80 coefficient matrices on a 16,000-word machine.

Once the programs are discussed they are checked out on short problems, suitably chosen to give easily checked results. (See Fig. 2.)
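The Fig. 2 computation can be outlined in code: evaluate the function at each characteristic root, solve the resulting simultaneous (Vandermonde) equations for the expansion coefficients, and sum c_k M^k. The Python sketch below is illustrative only; it handles just the distinct-eigenvalue case (repeated roots, as in Fig. 2 itself, require derivative conditions instead), and the eigenvalues are supplied by the caller in place of the course's library root-finder:

```python
import math

def solve(a, b):
    # minimal Gauss-Jordan elimination with partial pivoting
    n = len(a)
    m = [row[:] + [bv] for row, bv in zip(a, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[p] = m[p], m[k]
        for r in range(n):
            if r != k:
                f = m[r][k] / m[k][k]
                m[r] = [rv - f * kv for rv, kv in zip(m[r], m[k])]
    return [m[i][n] / m[i][i] for i in range(n)]

def matrix_function(a, eigvals, f):
    """f(A) = c0*I + c1*A + ... + c_{n-1}*A^(n-1), with the c_k found by
    solving f(lam) = sum_k c_k * lam**k at each (distinct) eigenvalue."""
    n = len(a)
    vand = [[lam ** k for k in range(n)] for lam in eigvals]
    c = solve(vand, [f(lam) for lam in eigvals])
    power = [[float(i == j) for j in range(n)] for i in range(n)]   # A^0 = I
    result = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                result[i][j] += c[k] * power[i][j]
        power = [[sum(power[i][t] * a[t][j] for t in range(n))
                  for j in range(n)] for i in range(n)]             # next power of A
    return result

# e^M for M = [[2, 1], [1, 2]], whose eigenvalues are 1 and 3:
em = matrix_function([[2.0, 1.0], [1.0, 2.0]], [1.0, 3.0], math.exp)
```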
The ideas presented so far are then extended to
the solution of simultaneous differential equations.
Physical problems are formulated; the complexity of
the mathematical model may be essentially unrestricted by considerations of computational involvement.
Applications of Matrix Programs
The reason for developing these computer programs is their wide applicability to various engineering fields. Since this course is taught within the electrical engineering curriculum, the lion's share of the examples are drawn from network topology, network analysis, and control systems. The programs typically facilitate analysis of various engineering problems rather than provide synthesis procedures.

Electrical network equations are now input in completely general form into the linear equations package; the computer results indicate the independent node voltages and the independent currents. Supplying separate element values to the programs gives the canonical equations for the network, which are then solved as a matrix differential equation. Throughout, many intermediate results may be printed, for class demonstration as well as possible checking, by appropriate console switch actions. (See Figs. 3 and 4.)

Four-terminal networks are treated in an analogous way; routines for the handling of various interconnections are prepared, and the mathematical network model is constructed on the computer, which then generates numerical answers to transfer function analysis. Several shortcut procedures are also illustrated.6

The matrix function routines are also used extensively in the state vector calculations of control system analysis. Particularly eigenvalues and Jordan normal forms are found very applicable in this work. Matrices of an order up to 80 by 80 are routinely handled in the course.

Not all applications are drawn from electrical engineering; rather, a good number of examples are taken from the subjects to be described next.

Figure 3. Example for finding independent currents. (Printout of the network interconnection matrix, with entries 0 and ±1, and the order of elimination of the variables.)
NANOSECOND MAIN MEMORY SYSTEM
Figure 2. Constant voltage curves. (Plot; vertical axis: write current, ma.)
ARRAY ORGANIZATION
The array is best described by referring to Fig. 3.
Each core is threaded by two wires, a word wire and a bit-sense wire, and assembled on 15-mil centers. To facilitate assembly, 9,216 cores are assembled into a plane frame (see Fig. 4).
                             Bit Wires        Word Wires
  Length                     21 inches        9 inches
  Resistance (20°C)          3.8 ohms         1.8 ohms
  Characteristic impedance   94.5-102 ohms    117 ohms
  Inductance                 510-595 nh       166 nh
  Transmission delay         5.4-5.8 nsec     1.4 nsec
  Mutual inductance
    to adj. wire             60 nh            25 nh

Figure 3. 8K array assembly.
Figure 4. Plane frame.
Impedance, inductance and delay are a function
of the flux state of the cores on a specific line. The
low values shown in the above table for the bit
wires are for a condition of all zeros. The high values are for a condition of all ones. An additional
variation in these characteristics can be expected
due to geometrically caused variations in wire capacitance and inductance of the lines. In the case of
impedance this was found to be about 2 ohms.
The cross-wire capacitance is on the order of
0.025 picofarads. Capacitive coupled noise which is
due primarily to the time rate of change of read
current has been determined to be 3 millivolts and
is negligibly small. For slow time rates of change
in word line voltage, such as that due to the cores
switching, the amplitude of the capacitively coupled
noise is not only lower but is mostly canceled as
common-mode at the input to the sense amplifier
because of the relatively short transmission delay on
the bit-sense line.
The bit-to-bit coupling is limited to about 10
percent of the self inductance because of the close
spacing of the bit-sense wires to the ground plane.
This results in from 1.5 to 2 millivolts of induced
noise on the sense line at read time when a core is
switched on an adjacent line. Bit current on an adjacent line induces current pulses of about 5 milliamps on the line.
WORD DRIVE SYSTEM
There are 2048 word lines (1024 on each side of
the ground plane) in the array, one of which must
be selected and driven with a bipolar current pulse
during each cycle. Each line contains 288 cores and
is electrically short enough to permit the line to be
grounded at one end and driven as an inductive
load. In order to achieve high performance and still
maintain some degree of economy with a matrix
selection system, the lines are divided into 8 groups
of 256 lines each. Each group is driven by 256
linear transformers. A transformer is used for each
word line primarily to permit common gating for
read and write drives. Another advantage is that it
reduces some of the noise voltages coupled into the
memory array from the word drive system.
The primaries of the 256 transformers are connected in a diode matrix and driven by 16 gates, 16
read drivers, and 16 write drivers as shown in Fig.
5. The packaging is arranged such that the circuit
boards butt up against the array ground plane (see
Fig. 10) , and the interconnections between the
transformer outputs and the word lines are made
989
with a short printed-wire strap. The ground plane
in the circuit board is connected to the array
ground plane in a similar manner.
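The selection arithmetic implied by these numbers (2,048 lines as 8 groups of 256; within a group, one of 16 gate buses and one of 16 read/write driver pairs selects a diode-matrix intersection) can be sketched as follows. The bit assignment is invented for illustration; the paper does not specify one:

```python
def decode_word_address(addr):
    """Split an 11-bit word-line address (0..2047) into the selection
    levels described in the text: one of 8 groups of 256 lines, then,
    within the group, one of 16 gate buses and one of 16 read/write
    driver pairs.  The bit assignment here is illustrative only."""
    assert 0 <= addr < 2048
    group = addr >> 8          # 8 groups of 256 word lines
    within = addr & 0xFF
    gate = within >> 4         # 16 gate buses
    driver = within & 0xF      # 16 read/write driver pairs
    return group, gate, driver
```

Every address maps to a unique (group, gate, driver) triple, which is what lets 8 + 16 + 16 + 16 circuits select any of the 2,048 lines.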
In the quiescent state, the gate buses are maintained at a potential of +45 volts with the read
and write buses at a slightly negative potential, so
that the diodes are back-biased and nonconducting. When a word is selected, the appropriate gate
bus is switched to ground potential leaving the
diodes in a slightly reverse-biased condition. When
a pair of read and write buses are selected
and successively driven positive, the diodes of the
transformer located at the intersection of the active
word drive and gate buses will be forward-biased
and conduct current. The amplitude and regulation
of the word currents is controlled by a biased nonlinear transformer located in each drive circuit which
assumes a high-impedance state when the current
pulse reaches the proper amplitude. The backto-back diodes, inserted in series with the secondary, reduces to a negligible level the d-c shift that
would result from the difference between the read
and write currents. The waveform in the top trace
of Fig. 6 is an oscilloscope display of the read and
write currents as they appear in the secondary of
the transformer. The read current, which is about
50 percent loaded, is beginning to show the effect
of core back voltage.
BIT-SENSE SYSTEM
A common bit-sense system is used in the
memory to minimize the number of windings of the
core plane. Figure 7 illustrates one bit position of
this system. Corresponding bit wires on either side
of the ground plane are connected together at one
end and driven as a pair. This results in a requirement for 288-bit drivers for an 8K system. Each
bit driver must supply a maximum of 500 milliamps. The bit lines are short enough to be driven
unterminated at the far end and still achieve the desired transition times. The sense signal is detected
across the pair at the far end. Diodes shunt the bit
currents past the sense amplifier during a write operation, but are essentially out of the circuit during
sensing. A balancing transformer in series with the
diodes forces a balance of the currents in the two
lines. The terminating network at the output of the
bit driver terminates the positive common mode
signal generated by the bit driver turnoff. The sense
amplifier input impedance terminates low-level
990
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
GATE
CIRCUIT
#2
GATE
CIRCUIT
#1
GATE
CIRCUIT
#3
1965
~
16
GATE
CIRCUITS
GATE
BUSS
READ
DRIVER
#1
WRITE
DRIVER
#1
READ
DRIVER
#2
WRITE
DRIVER
#2
II~
II~
II~
II~
READ
DRIVER
#3
II~
WRITE
DRIVER
#3
16 PAIRS
OF DRIVERS
Figure 5. Word drive schematic.
common- and difference-mode noise on the sense
line following the recovery of the shunt diodes and
helps recover the line for sense time. The recovery
time of the line under worst-case unbalance is
about 90 nanoseconds.
The waveform in the bottom trace of Fig. 6 is an
oscilloscope display of the bit current prior to splitting in the bit line pair. The step in the leading and
trailing edge shows the reflection that occurs in the
unterminated line.
The waveforms photograph (Fig. 8) shows the
sense signal as observed at the input terminals to
the sense amplifier. Ones and zeros are superimposed in this picture in order to depict an average
signal-to-noise ratio.
The bit drivers and sense amplifiers are each
connected to the array through a length of twisted
pair and a "transition" board. The transition
boards, which are butted to the array at either end,
carry printed lines and provide a means for fanning
out the bit-sense lines to socket pins into which
the twisted pair is plugged (see Fig. 10). Twisted
pair connects the transition board to the bit and
sense circuits. The transition board lines and the
array wires are interconnected through a short
printed-wire strap. The circuit-to-array wiring
increases the length of the sense loop by 44 inches
and the bit line by 28 inches. An attempt was made
to match the characteristic impedances between array and circuits. Some mismatches were introduced
Figure 6. Drive current waveforms. (Top trace: vertical 400 ma/cm; bottom trace: vertical 200 ma/cm; horizontal 100 nsec/cm.)
Figure 7. Bit-sense schematic. (Labels: regeneration path, strobe, data in, timing, ground plane.)
Figure 8. Sense signal waveforms. (Vertical 50 mv/cm; horizontal 100 nsec/cm.)
but they are electrically so short they have negligible effect.

MEMORY ORGANIZATION

The 8K memory unit is self-contained in that it has its own clock, address registers, decoding, data register, data-in and data-out controls, and byte control. These are well known and will not be described. The computer need only provide address information and a select memory pulse to initiate a cycle. The internal timing of the memory can be traced from the timing chart in Fig. 9. With SLT circuits, an access time of 200 nanoseconds was achieved. In order to achieve a 375-nanosecond interval between accesses, an overlap technique is used. Because of preread delays, it is possible to begin a new cycle before the preceding one is completed. However, it is not possible to take full advantage of overlap in this memory because it is limited by the sense line recovery.

Figure 9. Ferrite memory timing cycle - 375 nanoseconds. (Timing chart, 0 to 500 nsec: select, decode, gate driver output, word current, sense amp. output, data reg., bit current, data on bus.)

Figure 10 shows a gate which contains an 8K memory unit. Two such gates can be housed in a frame to make a 16K memory unit. One array temperature control unit is shared by two arrays and is also housed in the frame. All power supplies are external.

CONCLUSION

The construction and operation of this memory demonstrated the feasibility of fabricating 7.5-mil cores and assembling them into a compact array. The array assembly eliminated many of the problems which previously limited ferrite speed. Most of the speed-limiting problems now seem to lie with the storage core, the circuits, and packaging external to the array.

It is clear at this time that several changes can be made which will significantly improve the memory performance. Separate tests conducted on the core
showed that if the write pulse is shortened by 60
nanoseconds and the read pulse by 20 nanoseconds,
the one signal under worst-case pulse sequence
dropped to about 35 millivolts and the percent flux
switched to about 40 percent. This is approaching
the minimum acceptable limit for the sense amplifiers, but still appears acceptable. If this were done, it would reduce the memory cycle by 80 nanoseconds. To go beyond this would require some zero-cancellation scheme such as 2-core-per-bit or a better core.5 Faster logic circuits are now available,
which were not available when the memory was designed. If these were used, the memory decode time
could be reduced by 30 nanoseconds. The gate drive
circuit now accounts for 60 nanoseconds of the access time and 105 nanoseconds of the cycle time due to turn-on delay and transition time. A new version of the circuit reduced the 105 nanoseconds by 55. The transition board and twisted-pair cable increased the length of the bit-sense line considerably. If these were repackaged, the sense loop could be reduced by 16 inches and the bit line by 9 inches. It is estimated this would reduce sense line recovery by 30 nanoseconds. If all of these improvements were made, it would reduce the access time to
135 nanoseconds and the interval between accesses to 250 nanoseconds.

Figure 10. Memory unit - 8K words. (The unit consists of the array flanked by bit-sense boards, transition boards, and word drive sections.)
ACKNOWLEDGMENTS
In a project of this magnitude and complexity, it
is impractical to mention all the persons who contributed to its success. However, particular acknowledgment is given to the several engineers and
technicians who worked directly on the project and
were responsible for the design, construction, and
testing of the model. In addition, grateful acknowledgment is given to the many supporting groups
who contributed specialized talents so vital to the
success of the program. In this category, the efforts
of the circuits and device groups in the Poughkeepsie Memory Development organization and the core
plane and array assembly groups of the IBM Kingston Manufacturing organization are acknowledged.
REFERENCES
1. J. A. Rajchman, "Computer Memories - Possible Future Developments," RCA Review (June 1962).
2. C. J. Quartly, "A High Speed Ferrite Storage System," Electronic Engineering, vol. 31, pp. 756-758 (Dec. 1959).
3. W. H. Rhodes et al., "A 0.7 Microsecond Ferrite Core Memory," IBM Journal of Research and Development (July 1961).
4. V. L. Newhouse, "The Utilization of Domain Wall Viscosity in Data Handling Devices," Proc. of the IRE, pp. 1484-1492 (Nov. 1957).
5. C. S. Holzinger, "Technique for Determining the Speed Capabilities of 2D Ferrite Core Memories," Proc. Intermag Conf., April 1965, pp. 14.3-1 to 14.3-6.
MONOLITHIC FERRITE MEMORIES
I. Abeyta, M. M. Kaufman, and P. Lawrence
Radio Corporation of America
Camden, New Jersey
INTRODUCTION
Monolithic arrays of ferrite memory elements are being used to produce low-cost, high-speed memory stacks. These elements are made by the simple batch fabrication technique of laminating ferrite sheets with embedded conductors. This process, evolved at the RCA Research Center, Princeton, New Jersey, was selected for material development by the Electronic Components and Devices Division and for system development by DEP Applied Research.

This paper describes the construction, characteristics, and system tests for a basic monolithic memory stack. The system operates in the word select mode and employs several types of selection matrices. Storage diodes and conventional diode systems will be described, together with the employment of integrated circuits for a large number of system components.
FABRICATION OF THE MEMORY STACK

Ferrite Wafer

Construction. The ferrite wafer is constructed by sandwiching two groups of conductors between very thin sheets of ferrite to form closed-flux-path storage elements. The wafer is just over 1 inch square and less than 6 mils thick. Each group of conductors consists of 64 straight parallel lines in a planar array, with center-to-center separations of 15 mils, as shown in Fig. 1. The conductors of one group are placed at right angles to those of the other. Their vertical separation is less than one mil.

Figure 1. Monolithic wafer. (The drawing shows 63 equal spaces at 0.015 inch, 0.945 inch overall, along each axis, and section A-A with sheet thicknesses of 0.0050 to 0.0055 inch.)

The ferrite sheets are prepared by a technique known as "doctor-blading." In this process, a mixture of ferrite powder (an Fe-Mg-Mn-Zn composition), vinyl plastic and plasticizer dispersed in methyl-ethyl-ketone (MEK) is prepared in a ball mill. This slurry is poured onto a glass substrate and drawn to the appropriate thickness by passing a metal blade, called a "doctor blade," over the mixture. When the MEK evaporates, a sheet remains in which the ferrite particles are suspended and bound. The density of the sheet is related to the extent of dispersion of the ferrite powder in the slurry, and the binder system plays a very important role in stabilizing the dispersions. The doctor-blading technique enables us to form ferrite sheets of approximately 50 percent of the maximum density, with thicknesses ranging from 0.1 mil to 20 mils.

The laminated ferrite plate is made using three ferrite sheets. Two of the ferrite sheets have a pattern of 64 lines of metallic powder on one surface. Palladium, the metallic conductor, is squeegeed (in paste form) through a metal mask onto a glass substrate. The metal mask is removed and the ferrite is then doctor-bladed over the line patterns. The conductor lines obtained are 6 to 7 mils wide, 1 mil thick and 1.2 inches long. On the green doctor-bladed sheet, the resistance of this line is 225 ohms ± 10 percent. This measurement is used as a quality control check on the metallic paste batches.

The three ferrite sheets form a wafer 1.2 inches square in the unfired state. The sheets are laminated at an elevated temperature and a pressure of about 10,000 pounds per square inch to form a monolithic body. The firing of these ferrite bodies is divided into two cycles: (1) a binder burnoff stage and (2) a sinter or densification stage. The binder burnoff cycle is extremely important, since it is here that such mechanical problems as cracking and warpage occur. The heating rate must be very slow and the proper atmosphere must be maintained to facilitate binder removal. After completion of this cycle, the laminates can be brought directly to sintering temperature and fired to yield the desired magnetic properties. The proper atmosphere must be maintained during sintering, not only for magnetic considerations, but also to keep the electrical resistivity of the ferrite high so that the ferrite layer between the palladium conductors can function effectively as an insulator.

After sintering, the ends of the palladium conductors are exposed for electrical connection. This is accomplished by using an airbrasive unit to erode away the ferrite above the conductors. The electrical resistance of the embedded palladium conductors is now 2.5 ohms.

Figure 2. Memory operation. (Vector diagram of the total write, read, digit, and word write flux components at the crossing of a word winding and a bit winding.)

Figure 3. Laminate winding arrangement. (Word windings and the paired bit windings of the laminate.)
Operation. Integrated ferrite wafers have been developed for linear select operation. Each wafer has
64 word windings and 64 bit windings. Each bit is
composed of the crossovers between the word winding and two adjacent digit windings, so that a wafer
has 32 bits per word. By connecting bit lines of
several wafers in series, a memory with some multiple of 64 words is formed. Similarly word lengths
in multiples of 32 bits are made by adding wafers
on the other axis.
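The wafer-to-stack arithmetic above can be checked with a short sketch (the helper name is ours; the per-wafer capacities are from the text, and the module and stack sizes appear later in the paper):

```python
# Each wafer has 64 word windings and 64 bit windings; with two
# crossovers per bit, one wafer holds 64 words by 32 bits.
WORDS_PER_WAFER, BITS_PER_WAFER = 64, 32

def wafers_needed(words, bits):
    # Wafers tile the stack: word counts grow in multiples of 64 along
    # one axis, word lengths in multiples of 32 bits along the other.
    assert words % WORDS_PER_WAFER == 0 and bits % BITS_PER_WAFER == 0
    return (words // WORDS_PER_WAFER) * (bits // BITS_PER_WAFER)

# The 1024-word, 64-bit stack described later: 32 wafers,
# i.e. 16 of the two-wafer modules.
assert wafers_needed(1024, 64) == 32
# The 256-word, 64-bit storage-diode stack: 8 wafers, i.e. 4 modules.
assert wafers_needed(256, 64) == 8
```

These counts agree with the 16-module and 4-module stacks described in the Stack Construction section.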
Figure 2 illustrates the nature of magnetic flux
switching at one crossover point with the application of word and digit pulses. As long as pulses are
applied to the word winding only, there is no flux
change around the bit windings, so no signal is coupled magnetically from the word to the bit windings.
The vector diagram in Fig. 2 shows the addition of
word write and digit driving fields. The respective
components are drawn parallel to the driving currents, thus normal to the planes of their respective
driving fields. With coincidence of word and digit
currents, flux is switched to the plane to which the
vector sum is normal. Upon application of a word
read pulse, which is opposite in polarity to the word
write pulse, all of the flux is switched to the planes
normal to the word winding, with a direction consistent with the vector marked READ. Elimination
of the component of flux which had linked the bit
winding causes a magnetically coupled signal to appear on the bit winding. Analysis will show that the
polarity of this signal depends only upon the polarity
of the digit current. For the situation shown, the
upper end of the bit winding has a positive voltage
with respect to the lower end at read time.
Figure 3 shows the 2-crossover-per-bit storage
technique. Each pair of bit windings has its own
set of digit drivers and a sense amplifier. For
those bits of the addressed word which are to store
ones, a positive digit current is applied to winding
"A" and a negative current to winding "B." At the
addressed word, flux switches in a fashion conforming to the explanation of Fig. 2. For those bits which are to store zeros, a positive digit current is applied to winding "B" and a negative current to winding "A." When that same word is next addressed with a word read pulse (opposite in polarity to a word write pulse), the explanation for Fig. 2 shows that the "A" lines of bits storing ones and the "B" lines of bits storing zeros have negative output voltages; conversely, the "B" lines of bits storing ones and the "A" lines of bits storing zeros have positive output voltages. Hence, if the difference sense amplifiers yield A - B, the outputs for stored ones are negative and those for stored zeros are positive. Note that the total signal output of the sense amplifier is proportional to the sum of the absolute values of the signals magnetically coupled at the contributing crossover points.
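The polarity bookkeeping above can be captured in a few lines (a toy model with unit amplitudes, not from the paper):

```python
def digit_currents(bit):
    # Write time: to store a one, positive digit current on winding A
    # and negative on winding B; the polarities reverse for a zero.
    return (+1, -1) if bit == 1 else (-1, +1)

def read_signals(bit):
    # Read time: each line's coupled voltage has the opposite polarity
    # of the digit current that wrote it, so "A" lines of stored ones
    # read back negative and their "B" lines read back positive.
    a, b = digit_currents(bit)
    return (-a, -b)

def sense(bit):
    # The difference amplifier forms A - B, so the two crossovers add:
    # negative output for a one, positive for a zero.
    a, b = read_signals(bit)
    return a - b

assert sense(1) == -2 and sense(0) == +2
```

The magnitude 2 reflects the stated property that the total output is proportional to the sum of the absolute values of the two crossover signals.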
Figure 4. Disturb test pattern. (Word-winding and bit-winding pulse trains: 7 pre-disturbed zeros and ones, write zero, write one, post-disturbs, and digit disturbs.)
Testing. Extensive testing of ferrite wafers has
been conducted under worst-case disturb conditions. The test pulse pattern applied to the ferrite
wafers is shown in Fig. 4. Seven pre-disturb
pulses and 8 post-disturb pulses are applied to the
wafer. Tests made with a greater number of disturb
pulses indicate that 7 pre-disturbs and 8 post-disturbs produce approximately the maximum disturb condition. From this test, a set of curves was plotted (Figs. 5 and 6) showing differentially sensed
Figure 5. Signal output vs digit current (read current = 400 milliamps). (Differentially sensed peak output voltage in mv vs digit current pulse amplitude in ma, with write current amplitude as a parameter at 100, 150, and 200 ma; curves shown with no disturbs and with 7 pre-disturbs and 8 post-disturbs. Read current width 60 ns; write current width 200 ns; digit current width 200 ns.)
Figure 6. Signal output vs digit current (read current = 550 milliamps). (Differentially sensed peak output voltage vs digit current pulse amplitude, with 7 pre-disturbs and 8 post-disturbs. Read current width 60 ns; write current width 60 ns; digit current width 200 ns.)
peak output voltage as a function of digit current. A desirable range of operating current values has been chosen on the basis of these curves. Disturbed signal output versus digit current, with write current magnitude as a parameter and read current magnitude fixed at 400 milliamps, is shown in Fig. 5. The same curves are plotted in Fig. 6, except the read current magnitude is fixed at 550 milliamps. The typical disturbed output of the ferrite wafer is 45 millivolts. Figures 5 and 6 indicate that increasing the magnitudes of the 60-nanosecond-wide read and write current pulses above 400 milliamps for read and 100 for write will not increase the disturbed output signals.
The digit current indicated in Figs. 5 and 6 has
an optimum amplitude of between 30 and 40 milliamps. It is apparent that the optimum word write
and read pulse amplitudes are a function of the
pulse widths chosen. A number of operating pulse
values are shown in Table 1. The bit back-voltage
is also a function of the pulse values and is listed in
the table.
Table 1. Operating Pulse Values.

         Read Current               Write Current        Digit              Typical
     ma   Td(50%)   Tr/Tf       ma   Td(50%)         ma   Td(50%)      Outputs   BBV*
          nsec      nsec             nsec                 nsec         mv        mv
    400    110       45        100    120            30    200           45      250
    400     60       30        150     30            30    100           25      320

*Bit back-voltage, or the word read voltage divided by the number of bits in the word.
As an indication of the spread of signal values on a ferrite wafer, a map of disturbed and undisturbed signal values across the wafer is given in Fig. 7. All locations of the wafer have been checked and the map in Fig. 7 indicates worst-case extremes.

Extensive testing of the transmission line properties of the wafers using nanosecond pulse techniques has been conducted also. A summary of the
Figure 7. Map of signals for a typical wafer. (Disturbed/undisturbed signal out, in mv, at digit pairs 1 through 15 for each word; this map shows typical and extreme signals. Note: read current = 400 ma, 70 nsec; write current = 120 ma, 70 nsec; digits = 30 ma, 80 nsec.)
data obtained is shown in Table 2. The characteristic impedance of the transmission line is a function of frequency and coupling to the ground plane; however, the best terminating impedance is given. The attenuation shown in the table for pulses wider than the rise time is essentially d-c attenuation, and is associated with the 2.5-ohm line resistance through each wafer.
Table 2. Digit Transmission Line Properties.

Characteristic impedance (best termination):                 150 ohms
Pulse delay, midpoint to midpoint, for 1,024 words:          35 nsec
Pulse rise time, for 256 words:                              15 nsec
Attenuation for pulses wider than rise time, 512 words:      1.1 dB
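The attenuation entry is consistent with a simple model we assume here (the paper states only that the loss comes from the 2.5-ohm per-wafer line resistance): the series resistance of the wafers forms a divider into the 150-ohm termination, with one wafer per 64 words along the line.

```python
import math

Z_TERM = 150.0    # ohms, best terminating impedance from Table 2
R_WAFER = 2.5     # ohms, embedded line resistance through one wafer

def dc_attenuation_db(words):
    # Assumed organization: the digit line threads one wafer per 64 words.
    r_line = (words // 64) * R_WAFER
    return 20 * math.log10((Z_TERM + r_line) / Z_TERM)

print(round(dc_attenuation_db(512), 1))   # ~1.1 dB, matching the table
```

For 512 words this gives 8 wafers, 20 ohms of series resistance, and about 1.1 dB, which matches the tabulated value.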
Stack Construction
Monolithic ferrite stacks are assemblies of modular building blocks. The basic modules are assem-
blies of two ferrite wafers with diodes and bussing
for word selection. There are two kinds of selection
diodes available: conventional high-speed, and
storage types. The module using conventional
diodes requires two diodes per word, while the module using storage diodes requires one diode per
word. The fabrication of a 1024-word 64-bit
stack using 16 modules with conventional diodes
will be described. A 256-word 64-bit stack was
also assembled using 4 modules with storage diodes.
Operation and testing of both stacks will also be
described.
Figure 8 is an illustration of an integrated ferrite
module with conventional diodes, and Fig. 9 is a schematic of this module. For convenience, a 16-word array is shown. The modules actually have 64.
One end of each word winding is connected to the
anode of one diode and the cathode of another. The
cathodes of the diodes whose anodes are connected
to word windings are common in eight groups of
eight diodes each. The anodes of the diodes whose
cathodes are connected to word windings are grouped identically.

Figure 8. Construction of the basic integrated ferrite module.

Figure 9. Schematic of the basic integrated ferrite module. (The read/write driver feeds the word diodes; only half the diodes of a basic module are shown. The conductors in the ferrite run to terminating resistors and to the switches.)
The diode chips have 2 rows of 4 diodes, each on
30-mil centers. Notice that the number of connec-
tions required on the diode end of the words has
been reduced from 64 to 16. The ends of the word
windings remote fromtne diodes are connecteQ. directly together in eight groups of eight ~ords each.
Each group has one word from each of the diode
groups. The riumber of connections to this end of
the module is reduced from 64 to 8.
Expansion of the diode selection matrix to arrays
of more· than 64 words merely requires expanding
the sizes and numbers of the groups by connecting
corresponding points of different modules together.
For example, a 1024-word matrix would have 32
groups of 32 words each, as viewed from either
end.
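The bussing economy described above scales as the square root of the word count; a small sketch (the helper name is ours, not the paper's):

```python
import math

def group_size(words):
    # With n words bussed into sqrt(n) groups of sqrt(n) at each end,
    # the far end of the word windings needs sqrt(n) connections and the
    # diode end needs 2*sqrt(n) (one bus per group per diode polarity).
    g = math.isqrt(words)
    assert g * g == words, "assumes a square number of words"
    return g

assert group_size(64) == 8        # 8 groups of 8 on one 64-word module
assert 2 * group_size(64) == 16   # diode-end connections quoted in the text
assert group_size(1024) == 32     # 32 groups of 32, viewed from either end
```

Expanding to 1024 words therefore needs only 32 switch buses and 32 driver buses rather than one connection per word.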
Line connections on the modules are "fingers" of
etched combs leading to the winding ends on the
four sides of a wafer. This assembly is soldered to a
printed circuit pattern on the module board. After
connection, the solid edge of the comb is sheared
off. Tabs are left at the edges of the module for
connection to the bit windings of another module.
The diode array is assembled as a separate unit
and cemented onto the module tabs of previously
mounted combs. Connection to the switch buses
(see lower end of Fig. 8) is accomplished with
combs appropriately etched from one-ounce beryllium copper. They are raised above the module surfaces as they cross lines which they must not contact. The wafer interconnection combs are etched
from one-ounce copper sheets, subsequently plated
with an electroless tin coating to facilitate soldering. The module substrate is constructed from
1/16-inch G-10 laminated glass epoxy board.
The diode assembly comprises a baseboard with a
printed circuit pattern and a spacer board of G-10
material with the common connections etched.
Word drive connections are made with printed circuit plugs having contacts on 50-mil centers.
Figure 10. Typical arrangement of monolithic memory
stack.
The conventional diode stack consisting of 1024
words of 64 bits is formed from 2 planes, each containing 512 words (Fig. 10). Overall dimensions
for the unit, including the sense digit connector
boards with terminating resistors, are 14 x 4.5
inches with 0.5-inch spacing between the planes.
The sense digit connections are fanned out from the
15-mil centers to suitable plug connections on
50-mil centers. Module boards are cemented onto
a backboard and interconnections between the modules are made by soldering. Bussing between
planes is accomplished by soldering #30 wire between corresponding bit windings of the two planes.
MEMORY SYSTEM USING
CONVENTIONAL DIODES
Selection Matrix and Digit Drive
This section describes the system design and operational details of the 1024-word, 64-bit memory. The word driving scheme is illustrated by the 4 x 4 diode matrix of Fig. 11. The physical dimensions and geometry of the matrix diodes have already been given.
These diodes have the following typical characteristics:
Co (junction capacity at zero bias) = 4 picofarads
V (at I = 400 ma) = 2 volts
VB (at IR = 10 μa) = 60 volts
Tr (reverse recovery time) = 15 nanoseconds
Under quiescent conditions, the matrix diodes
are back-biased by a positive voltage applied at
the write driver, a negative voltage at the read driver, and ground at the read/write switch. The
read/write selection sequence is executed in the following manner:
1. A read command pulse turns on the selected read switch, driving the 32 words selected by the switch to a positive voltage,
thus removing the back bias from 32 read
diodes.
2. Sometime later, the read driver is turned
on, forward-biasing one read diode on
the selected word, while the drive line
moves toward ground potential from a current source to complete the read operation.
3. Similarly, during the write part of the cycle
a write command pulse turns on the write
switch, driving the selected set of words
negative.
4. The write driver is then turned on, driving
the write diode on the selected word to-
ward ground from a current source and
thus enabling write current to flow.
All the drivers and switches of the word system are
compatible at their inputs with the logic levels of
the current steering integrated logic gates used
throughout the system.
The parallel capacitance of the 32 word lines
connected to each switch is about 1600 picofarads.
The read chanp.el of the switch supplies 1 ampere of
current so that it can charge this capacitance to
+ 25 volts in about 40 nanoseconds. The write
channel of the switch supplies about 500 milliamps
to complete the transition from + 25 volts for read
to -10 volts for write in about 100 nanoseconds.
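The charging times quoted above follow from t = C ΔV / I for constant-current charging; a quick arithmetic check using the values in the text:

```python
# 32 word lines per switch present about 1600 pF in parallel.
C = 1600e-12     # farads

t_read = C * 25 / 1.0     # read channel: 1 A charging from 0 to +25 V
t_write = C * 35 / 0.5    # write channel: 500 mA swinging +25 V to -10 V

print(round(t_read * 1e9))    # 40 ns, as stated
print(round(t_write * 1e9))   # 112 ns, i.e. "about 100 nanoseconds"
```

The 35-volt swing for write is the full transition from the +25-volt read level to the -10-volt write level.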
As seen in Fig. 11, there are two clamp diodes at
the output of each switch, one to + 25 volts and the
other to -10 volts. Physically located at the stack,
their purpose is to provide a sink for the switch
current and thus to maintain a low impedance at
the memory stack during the read and write pulses
despite the cables needed to interconnect the switch
lines in the stack with the circuit boards. This action helps to clean up waveforms and to reduce
noise.
Figure 11. Schematic of the driving and matrix scheme for the two-diode-per-word laminated ferrite memory. (Clamp diodes to +25 volts and -10 volts are shown at the output of each bipolar switch.)

The digit drivers are bipolar. They store a one by delivering a positive pulse into one line of a digit pair and a negative pulse into the other. A zero is stored by reversing these two pulse polarities. The
digit driver outputs are voltage sources during the
digit pulse, the amplitude of the digit current being
controlled by the voltage and the termination resistors. The driver inputs have logic gates which are
primed by the information register and activated by
the digit timing pulse.
The sense amplifiers are constructed by cascading
two current-steering logic gates and A-C coupling at the output of the second gate to the memory register. Strobing is accomplished at the input to
the second gate. Power supply levels for the gates
are raised to prevent gate saturation during the digit pulse. A positive pulse can appear at the sense
output only during the strobe, and only if a one is
sensed.
Operation
The test vehicle used to check the performance of
the conventional diode memory consists of:
1. The timing and control generator
2. A word system
3. A digit system
The timing and control generator supplies all timing and control pulses. The word system supplies
the proper switch voltages and read-write currents
at the command of the timing generator. The digit
sense system performs the dual function of sensing
stored information and writing back into the memory.
The test vehicle has four different types of logic
components, providing address scan, disturb patterns and error checks. These are:
1. Integrated current-steering gates of the
emitter-coupled current-steered logic
(ECCSL) type
2. Integrated. current-steering flip-flops
3. One-shots with variable delay, made by
adding a few external components to integrated current-steering gates
4. A free running multivibrator with variable
frequency, also made from an integrated
current-steering gate
These four devices have proven to be very stable
and reliable, and have made the test unit highly
flexible, coupled with high packing density.
The test system can be set to pre-disturb a word
up to 35 times when running at 2 megacycles per
1965
second. The pre-disturbs end when the timing
generator causes the digit drivers to change write
information for one write time. The timing generator then causes up to 70 digit disturbs to be generated. At the end of the digit disturb period, the last
written information is read out. Then this entire
pattern is complemented. Finally, the address is
changed to a new word. The disturb pattern is then
repeated at the new word, etc. Information from
one bit to the next along a word can be complemented by a mechanical switch in the information
register. The timing diagram for the test vehicle is
shown in Fig. 12.
Figure 12. Timing diagram for conventional diode test vehicle. (Traces for clock, memory register reset, read switch, write switch, read driver, write driver current, digit current for the two drivers, signal (one or zero), and error check pulse; marked intervals include 510, 340, 150, 120, and 100 nanoseconds.)
The read switch is turned on approximately 150
nanoseconds ahead of the read driver to allow the
noise transient coupled into the digit lines to decay
to a level much less than the signal.
The noise is coupled into the stack through all of
the words common to the selected switch. The magnitude of this noise depends on the number of
words in the stack. For a switch driving 32 lines,
the common-mode noise injected into the digit lines is approximately 1 volt. However, as has already been pointed out, sensing is performed differentially; therefore, large common-mode noise will not be a problem provided the digit pair balance in the stack is sufficient. This balance minimizes conversion of common-mode noise to difference mode. The conversion to difference mode for this stack was approximately 1.5 percent, or 15 millivolts. With ideally
terminated digit lines, a read operation could immediately follow the end of the digit pulse. However,
it has been found that a waiting period 6 to 9 times the delay on the digit line is required.
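As a quick arithmetic check on the conversion figure, 1.5 percent of the roughly 1-volt common-mode level:

```python
common_mode = 1.0        # volts of switch noise injected on the digit lines
conversion = 0.015       # 1.5 percent common- to difference-mode conversion

difference_noise = common_mode * conversion
print(difference_noise)  # 0.015 V, i.e. 15 millivolts of difference-mode noise
```

This difference-mode residue is well below the typical 45-millivolt disturbed signal.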
In addition to the switch and digit noise, which are not time-coincident with the signal, there are sources of noise which are coincident with the signal. One of these sources is associated with the fact
that the impedance between the stack and the matrix switch ground cannot be made negligibly small.
The problem is minimized by adding clamps at
each switch line. Reference to Fig. 11 shows the
location of these diodes on the stack. The second
source of time-coincident noise is associated with
the removal of back bias on the matrix diodes connected to the selected driver. Other things being
equal, this noise can only be eliminated by the inherent common-mode rejection of the stack.
Test Results

Waveforms of the tests on this 1024-word 64-bit stack operating at a 500-nanosecond cycle time are shown in Figs. 13 and 14.

Figure 13 (top) shows fully disturbed stack output signals with digit transients. Each of the three traces represents the outputs of a bit from 64 words superimposed. The first transient at the left is caused by read switch turn-on, which lasts 150 nanoseconds. Following this are the signal outputs. The positive signals are ones and the negative signals are zeros. These last for about 70 nanoseconds. The remaining time in the cycle is occupied by the digit transient.

Figure 13 (bottom) shows the stack output with the drive conditions the same as those above except for removal of word write current. The low noise level at signal time indicates good sense pair balance.

In Fig. 14 (top), typical read and write currents are shown at a point between the diode matrix and the drivers. In addition, typical signals are shown in coincidence with the read current. Again, these are disturbed signals. In Fig. 14 (bottom), read and write currents as well as switch voltages are shown.

Figure 13. Comparison of the signal and total noise present at read time. Top: 64 signals superimposed at each of 3 digit locations. Bottom: Same as above, except that write drivers are disconnected to show the noise present at read time. (Horizontal: 100 ns/div; vertical: 1 volt/div.)

Figure 14. Typical time relationships. Top: Time relationships between read/write current and signal. Bottom: Time relationships between read/write driver circuits and read/write switch voltages. (Vertical: 200 mA/div for the currents, 1 volt/div for the signal, and 20 volts/div for the switch waveforms; horizontal: 100 ns/div. Current pulse polarities are reversed to clarify the picture.)

MEMORY SYSTEM USING STORAGE DIODES

Selection Matrix
The storage-diode selection matrix uses one
diode per word. The storage diode, because of its
resistivity profile, has long-term minority carrier
storage and with one diode allows construction of a
selection switch that delivers bipolar currents. Storage diodes also exhibit the desirable property that
in the absence of minority carrier storage, a high-resistance (megohms), low-capacitance (a few picofarads) characteristic is maintained in the reverse direction. This ensures that below the half-select matrix voltage, the properties of a storage diode will be identical to those of a high-speed diode.
A schematic diagram of a 4-word portion of
the 256-word matrix that was built and tested is
shown in Fig. 15. Half-select operation of the matrix is identical to the half-select operation of a
conventional diode matrix, all diodes appearing as
approximately 2-picofarad capacitors. Activation
of a driver alone or a switch alone does not forward-bias any diode in the matrix.
Activation of a switch followed by a driver
drives the selected diode sufficiently beyond its voltage gap to permit the flow of the required read
current. The storage diodes in the matrix have minority carrier lifetimes of approximately 170 nanoseconds. Storage-diode lifetime is the parameter
that defines the diode's stored-charge recombination rate2 and, therefore, is directly proportional to
the maximum charge the diode can store. The
170-nanosecond lifetime allows the diode to return 20 percent of the charge flowing into it during
the read pulse as a write pulse. The write pulse is
generated when the read channel of both the switch
and the driver are deactivated and the write channel
is activated. The write channel of the switch is activated during the fall time of the read clock pulse.
The previously selected storage diode is then forced
by the voltage that eventually reverse-biases the
diode to give up its stored minority carriers as
write current. The write current terminates when
the charge in the diode is depleted and the fall time
of the write current is approximately equal to the
snap-off time of the diode.
The diodes used in this matrix are fabricated in
integrated strips of eight diodes each. They have
the following typical characteristics:
Co = 2 picofarads
V+ (at I = 400 ma) = 1.2 volts
IR (at V = 100 v) = 1 microamp
BV (breakdown, IR = 10 μa) = 150 volts
Transition time (snap-off time) = 4 nanoseconds
Lifetime = 170 nanoseconds

Figure 15. Storage-diode selection matrix. (Schematic of a 4-word portion, showing the driver, switches, the +35 volt and -10 volt supplies, and 68-ohm resistors.)
The read/write current and the word back voltage generated in the matrix for the 170-nanosecond lifetime diode are shown in Fig. 16. The read current peak magnitude is 600 milliamps with a 300-nanosecond base width; the write current peak magnitude is 200 milliamps with a 100-nanosecond base width. The word current has a front porch caused by the voltage associated with switching the ferrite. Comparison of the back voltage waveform across the word when all the ferrite is switched with the back voltage under the above condition indicates that 80 percent of the ferrite is
Figure 16. Read/write current and word back-voltage generated for 170-nanosecond lifetime diode. (Voltage at 10 V/div; current at 200 mA/div; 100 ns/div.)
switching. The percentage of the ferrite that switches is a function of the write current pulse width as well as amplitude; therefore the duration of the read
current front porch is also a function of the write
current pulse width. The front porch limits the
amount of read charge going into the diode and reduces the percentage change of write current pulse
width as a function of storage diode lifetime. A 30
percent change in storage diode lifetime produces
only a 20 percent change in write current pulse
width.
Operation
A block diagram of the 256-word 64-bit test
vehicle is shown in Fig. 17. The memory was addressed by scanning the 256 words sequentially.
There were 16 switches and 16 drivers in the system and all 256 words were addressed. However,
there were only 4 sense amplifiers and 4 digit drivers in the system, operating only 4 out of 64 digit
locations at one time. The circuits were built on
plug-ins and placed in a rack, as indicated in Fig. 18, a photo of the test system.

Figure 17. Block diagram of test vehicle for storage-diode matrix.

Figure 18. Construction of test vehicle for storage-diode matrix.
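The sequential scanning described above implies a simple decomposition of each of the 256 word addresses into one of 16 switches and one of 16 drivers; a minimal sketch (the row/column convention is an assumption, not from the paper):

```python
def select_word(address: int) -> tuple[int, int]:
    """Map a word address (0-255) to a (switch, driver) pair in a
    16 x 16 selection matrix, as in the 256-word test vehicle.
    The assignment of switches to rows is an assumed convention."""
    if not 0 <= address < 256:
        raise ValueError("address out of range")
    return address // 16, address % 16

print(select_word(0))    # first word: switch 0, driver 0
print(select_word(255))  # last word: switch 15, driver 15
```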
A timing diagram is shown in Fig. 19. The
switch is initiated 100 nanoseconds before the driver to isolate the switch noise coupled into the matrix when the half-selected word lines are driven
to 25 volts. The switch rise time is 50 nanoseconds
for driving the 256-word stack. When driving a
stack of 4096 words (2000-picofarad line capacitance), the rise time is 100 nanoseconds. The word
driver rise time is only 25 nanoseconds. The maximum selection line capacitance it drives, considering a 4096-word stack, is 128 picofarads. The digit current is initiated at the end of the read current
and has a 30-milliamp peak magnitude. It is 200
nanoseconds wide at the base. The digit lines are
driven in a push-pull mode, as shown in Fig. 20.
The lines are terminated in their characteristic impedances of approximately 120 ohms.
Figure 19. Timing diagram for storage-diode test vehicle.

Test Results

The common-mode noise on the digit sense lines, generated during the switch rise time, is primarily electrostatic coupling from the half-selected word lines, and is approximately 150 millivolts. Approximately 1.5 percent of the common-mode signal is converted to a difference mode, yielding approximately 2 millivolts. The common-mode signal generated during the rise time of the word driver is approximately 20 millivolts, and the signal converted to difference mode cannot be detected.

Figure 20. Digit-sense system for storage-diode memory.

A group of one and zero outputs from the stack, as amplified by a linear difference amplifier, is shown in Fig. 21. The lowest disturb output observed is 5 microvolts. The lack of symmetry in the one and zero outputs is due primarily to the noise generated by the strobe circuit in the sense-amplifier plug-in.

Figure 21. A group of one and zero signals from the storage-diode selected stack (sense-amplifier output, 7 mV/div at input; word current, 400 mA/div; 100 ns/div).

CONCLUSIONS

Monolithic ferrite elements have now been sufficiently developed to be competitive with other basic memory elements. They offer the following advantages:

1. These elements can be combined to make basic blocks for high-speed memories.
2. The ferrites can be combined with integrated circuits and diodes to make an economical and compact package.
3. Transients in the stack are readily controlled, and noise levels are well below signal levels during read time.
4. The use of special storage diodes reduces the number of decoding diodes required and simplifies the electronics to permit the use of single-polarity drivers for all word selection functions.
5. The use of storage diodes and integrated circuits makes the linear select monolithic ferrite system cost-competitive with coincident current core systems, while offering considerably higher speed capability.
REFERENCES
1. R. Shahbender et al., "Laminated Ferrite Memory," Proc. Fall Joint Computer Conf., 1963, p. 86.
2. G. A. Dodson and J. A. Ruff, "Charge Storage Diode for Magnetic Memory Application," Proc. ISSCC, vol. 7, p. 104, 1964.
3. J. L. Moll, S. Krakauer and R. Shen, "P-N Junction Charge Storage Diodes," Proc. IRE, pp. 43-53, Jan. 1962.
4. H. Amemiya, T. R. Mayhew and R. L. Pryor, "A 10⁵-Bit High-Speed Ferrite Memory System - Design and Operation," Proc. Fall Joint Computer Conf., 1964, pp. 123-144.
HIGH SPEED FERRITE 2½D MEMORY
Thomas J. Gilligan, Perry B. Persons
Electronic Memories Incorporated
Hawthorne, California
INTRODUCTION
Main core storage in digital computers has been
getting both larger and faster with each generation.
In view of this a design was undertaken which
would be inherently faster, and inherently less expensive in the large sizes. Other significant inputs
to the design approach chosen were that electronics, that is, semiconductors, were becoming less expensive; also, to operate at the higher speeds, smaller cores must be used. Since it is progressively more
difficult to put additional wires through these
smaller cores, system approaches using fewer wires
through the core were studied.
The system chosen has basically a two-dimensional magnetics array similar to linear select; however, it has the advantage of a level of decoding in the array. Therefore a significant saving in electronics over linear select may be effected, while maintaining the inexpensive magnetics array. Figure 1 shows how the relatively inefficient aspect ratio of linear select arrays may be improved: dividing the word dimension by some number while multiplying the bit dimension by the same number gives a squarer array.
Figure 1. Comparison of linear select and 2½D planar arrays.

The chosen system has a coincident current read cycle and a linear select write cycle; that is, the digit current is additive rather than subtractive. Since this system is not as efficient in decoding as a cubic three-dimensional coincident current system, but is far more efficient than a planar two-dimensional linear select system, it became known as a 2½-dimensional system. The inherent power of the 2½D organization may be appreciated by noting that the system reported here is capable of a 900-nanosecond cycle time using a 30-mil
core. In order to achieve a comparable cycle time, a
standard coincident current organization must use a
20-mil core.
MEMORY ORGANIZATION
In the "pure" memory organizations, coincident
current and linear select, the dimensions of the system may be clearly identified as ~eing either ad-
1965
dress or data dimensions. In the 2~D organization, one dimension serves a dual function as a data
and address dimension. This requires that data be
inserted on the address lines in one dimension during the write cycle, requiring an independent drive
system in one dimension for every bit. We will call
this the bit dimension and the other will be the
word dimension.
Figure 2. 2½D memory organization: 16,384 words, 3 bits.

In Fig. 2, a functional diagram of a 2½D, 16K-word by 3-bit system is shown. It should be noted that a
complete 32-way diode decode matrix is used for
every bit. To select one core of 16,384, one bit line
of 32 is selected and one word line of 256 is also selected. In addition, the phasing of the word current is chosen to be either positive or negative, and therefore effects coincidence in one of two cores on
a common intersection, and anticoincidence in
the other. Using the phase of the word current as
part of the selection allows the word electronics to
be effectively divided by two. It should also be noted in Fig. 2 that there is great redundancy in the
bit selection circuitry. That is, the same relative
voltage switch, and the same relative current switch,
is selected in each bit position. The only difference
between bits is that the data will gate current to the
matrix or not, depending on whether it is desired to
write a one into that bit. Cost will be minimized as
the aspect ratio of the system approaches a square.
This forces the aspect ratio of a particular bit to be
quite "unsquare." The aspect ratio per bit is 512 in
the word dimension times 32 in the bit dimension.
The aspect ratio of the 4096 sense mat is 256 by
16. A word is read out of the memory by selecting
one line in each bit matrix, and then turning on
current in the appropriate word line.
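The selection just described (one bit line of 32, one word line of 256, and a word-current phase that picks one of the two cores sharing the intersection) can be sketched as an address decomposition; the field ordering is an assumed convention, not the paper's wiring:

```python
def select_core(word_address: int) -> tuple[int, int, int]:
    """Decompose a 2.5D word address (0-16383) into (bit_line,
    word_line, phase): one bit line of 32, one word line of 256, and
    the word-current phase (+1 or -1). 32 * 256 * 2 = 16,384 words."""
    if not 0 <= word_address < 32 * 256 * 2:
        raise ValueError("address out of range")
    phase = 1 if word_address % 2 == 0 else -1
    rest = word_address // 2
    return rest // 256, rest % 256, phase  # (bit_line, word_line, phase)

print(select_core(0))      # first word
print(select_core(16383))  # last word
```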
A block diagram of the 2½D system is shown in
Fig. 3. Included in each block is the number of
components in terms of memory capacity C, and the
number of bits per word B. Also included is a
weighting factor, Kn, which when multiplied by
the number of elements in that box, and then
summed over all boxes will give the total cost of the
system as a function of memory capacity and number of bits per word. From this equation it can be
determined that the cost per bit is strongly a decreasing function of the number of words.
Figure 3. Three-wire 2½D cost breakdown. (Each block lists its component count in terms of the memory capacity C and the bits per word B, together with a weighting factor Kn; the maximum drive-sense intersection is 256 cores.)
SYSTEM CONSIDERATIONS

As pointed out previously, the 2½D system has many advantages of the 3D and the 2D systems, as well as many advantages in its own right. The most important of these advantages is that there is no inhibit recovery problem, since there is no inhibit current. As a matter of fact, in a typical 16K-word system the maximum drive-sense intersection is 256 cores; the typical drive-sense intersection in a coincident current system is 2048 cores. The effect of this is that there is no recovery problem from the previous memory cycle. Another big advantage of the 2½D system is that no drive line is longer than 900 cores, and therefore all system resonances will be significantly higher than the frequency spectrum of a switching core, thereby simplifying the sensing problem. The short drive lines also allow fast rise times on the current pulses with relatively low drive voltages, which in turn allows the use of less expensive semiconductors. Since digit current adds to write current to write a one, no time has to be allowed between read and write currents to insure overlap of the write current by the digit current.

The bit drive system accounts for the largest part of the power used in a memory system. The power required to establish and then hold current in a line may be expressed as P = (1/Tc)(2LI² + I²RTs), where the assumed driver is a saturated switch, Tc is cycle time, L is line inductance, R is line resistance, and Ts is the switching time of the core.
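A quick numerical check of the power expression P = (1/Tc)(2LI² + I²RTs); the component values below are illustrative, not taken from the paper:

```python
def drive_line_power(I, L, R, Ts, Tc):
    """Average power to establish and then hold current I in a drive
    line fed by a saturated switch: P = (1/Tc)(2*L*I^2 + I^2*R*Ts),
    with L the line inductance, R the line resistance, Ts the core
    switching time, and Tc the cycle time."""
    return (2 * L * I**2 + I**2 * R * Ts) / Tc

# Illustrative values (assumed): 380 mA drive current, 0.7 uH line,
# 1 ohm, 330 ns switching time, 900 ns cycle time.
p = drive_line_power(0.38, 0.7e-6, 1.0, 330e-9, 900e-9)
print(f"{p:.3f} W")  # about 0.28 W for these values
```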
The 2½D system will have a lower minimum power requirement, since the bit line is shorter by a factor of four or eight than in a linear select or a coincident current memory. It must be remembered, however, that current must be driven in the bit line twice during a 2½D cycle, and therefore the gain isn't as great as
expected.
In practice, the power used in a commercial
2½D system will be about the same as in a coincident current system; however, the lower voltage
required will allow the drive currents to be derived
from logic voltages, i.e., +12 volts.
The core chosen for this system has the following
characteristics: 30-mil O.D., 22-mil I.D., switching time 330 nanoseconds, half-select current 380 milliamperes.
Since almost the full cycle time in this system is
occupied by core switching time, the most desirable core for a faster system of this type would be a high
drive (500 milliamps half select current), low flux
22-mil O.D. core. The high drive core may be
used effectively since the increased drive requirements will not slow cycle time significantly as the
rise and fall times are but a small percentage of the
cycle time.
TWO DIODE PER LINE DIODE DECODE MATRIX

A two-diode-per-line decoding scheme was chosen for the 2½D system, since it is the least expensive of any high-speed access system. The important points of this two-diode-per-line system are shown in Fig. 4. This decode system has not been popular in high-speed systems since, when a voltage switch is selected, the selected group of lines tends to resonate, either as a quarter-wavelength line with zero impedance at the voltage-switch end and open-circuited at the far end, or as a capacitor with the parasitic inductance in series with the voltage switch. In the 2½D system being described, both these effects were minimized by mounting the voltage switches in close proximity to the magnetics array. Since the drive lines are so short, the frequency of resonance is high, and therefore the lines do not have to be critically damped. Note that the basic philosophy here is different from that previously reported,1 since no attempt was made to minimize the voltage-switch-to-sense-line coupling; instead, the resonant frequency and damping were controlled.

Figure 4. General two-diode-per-line decoding; voltage-switch resonance. (L1: parasitic inductance of the voltage-switch circuit; RS: series resistance of the voltage-switch circuit; RSH: shunt damping and d.c. restoring resistor; Z: impedance looking into the magnetics array; λ: electrical length of the drive line; L2: drive-line inductance; C: drive-line capacitance; Z0: drive-line characteristic impedance.)

WORD DRIVE SYSTEM

The schematic of the word drive system is shown in Fig. 5, and photographs of its operation are shown in Fig. 6. It is a simple 256-way diode decode with the following characteristics: sink capacitance, 700 picofarads; drive-line inductance, 1.3 microhenries; propagation time, 9 nanoseconds. Floating switches are used, and the current is determined by a fixed resistor providing an L/R time constant of 30 nanoseconds. The phasing information is present in the logic input to the decode matrix.

Figure 5. Word drive circuitry: (a) word switch selection; (b) diode decode.
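The 30-nanosecond L/R time constant implies the familiar exponential current rise; a quick sketch (the supply voltage and resistor value are illustrative, chosen only so that L/R = 30 ns):

```python
import math

def drive_current(t_s: float, V: float, R: float, L: float) -> float:
    """Current in a drive line where a fixed series resistor sets the
    amplitude: i(t) = (V/R) * (1 - exp(-t / (L/R))). The current is
    within a fraction of a percent of V/R after several time
    constants."""
    tau = L / R
    return (V / R) * (1.0 - math.exp(-t_s / tau))

# Illustrative: 12 V supply, 40 ohm resistor, L chosen so L/R = 30 ns.
i = drive_current(300e-9, 12.0, 40.0, 1.2e-6)  # ten time constants
print(f"{i * 1000:.0f} mA")  # essentially the final value, 12/40 = 300 mA
```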
BIT DRIVE SYSTEM
Each bit in this system has an electrically independent drive system. Because of the redundancy
the same relative switch in each matrix is selected
at one time. This is accomplished by placing all the
transformer primaries in series. The 32-way output is obtained from 8 bipolar voltage switches and
4 bipolar current switches. The read current is always present when the appropriate read switches are selected. The write current, on the other hand, is conditional and will be present only when a one is to be written into memory. The schematic of the bit drive system is shown in Fig. 7, and photographs of the bit drive system operating are shown in Fig. 8. Pertinent characteristics are: sink capacitance, 125 picofarads; bit-line inductance, 0.70 microhenry; propagation time, 5 nanoseconds.

Figure 6. Word drive system waveforms, 200 nsec/cm. Top: voltage switch, 5 V/cm. Center: voltage across drive line, 5 V/cm. Bottom: drive current, 100 mA/cm.

Figure 7. Bit drive circuitry: (a) bit switch selection; (b) diode decode.
SENSING SYSTEM
The 16,384 cores per bit are broken up into four
4096 sense lines. The aspect ratio of the sense line
was chosen to be 256 X 16 for ease of fabrication.
The penalty paid was that twice the delta noise2,3 must be handled. Fortunately the system resonances are high, and a wide-band sense amplifier may be used effectively to achieve time discrimination. The large output from the core and the absence of large inhibit noise (see Fig. 10) allow a less expensive sense amplifier to be designed. The 30-mil cores
may be strung on 25-mil centers since a parallel
core pattern is used. The sense winding is
rectangular,4 and therefore the cores are on 25-mil
centers on the sense winding. This compares to a
core spacing of 42 mils on the normal diagonal
sense winding with a box pattern. Since the cores
are on such close centers on the sense line, the impedance of the sense line is somewhat higher, approximately 170 ohms. The wiring pattern of the sense
matrix is shown in Fig. 9. Worst-case outputs
from the array are shown in Fig. 10.
CONSTRUCTION
The magnetics module is constructed by mounting two magnetics arrays back to back and interconnecting them. Each array is 256 X 448 cores.
Around the periphery are mounted the drive diodes.
This complete assembly is then sandwiched between
the bit drive and the bit sink circuitry. The drive
and sink circuitry are connected to the magnetics
module with flexible wiring. This allows the boards
to be swung open for repair.

Figure 8. Bit drive system waveforms, 200 nsec/cm. Top: voltage switch, 5 V/cm. Center: voltage across drive line, 5 V/cm. Bottom: drive current, 100 mA/cm.

Figure 9. Sense matrix wiring.

The drive boards are hard-wired to the array to eliminate a large number
of connectors.
The sense amplifiers are of cordwood construction to get a high component density. This is done
since it allows the sense amplifiers to be mounted
with the drive and sink circuitry near the magnetics
array. The package as described to this point is effectively a complete 14-bit memory system, having logic inputs and logic outputs. A system having
a larger number of bits may be constructed by joining up to 5 of these smaller 14-bit
arrays in parallel. Photographs of this system are
shown in Figs. 12, 13, and 14.
OPERATING MARGINS
Figure 10. Sensing waveforms, 200 nsec/cm, 40 mV/cm. Top: ones and zeros. Bottom: worst pattern.

Figure 11. EMI "Nanomemory": 16,384 words, 56 bits.

In Fig. 15 a 3-dimensional Schmoo diagram is shown, similar to those reported by Womack.5 The commonly encountered Schmoo diagrams appear as planes in this 3-dimensional representation. It must be appreciated that in the 2½D system, since it is a planar array, the Schmoo diagram will of necessity be a 2-dimensional figure. In the 2½D system being reported, all the currents were derived from plus and minus 12 volts. This makes it difficult, in the 2½-dimensional approach, to vary read and write currents independently. As an alternative, another method is suggested. It is a relatively simple matter to drive the word and bit drivers from separate plus and minus 12-volt power supplies, and to vary these independently to check system margins. That this is a meaningful check on system margins can be understood in that the word and the bit currents are orthogonal; that is, one of these currents may be increased without significantly increasing the other, and the knee of the core can therefore be checked. It is readily apparent from the 3-dimensional Schmoo diagram shown in Fig. 15 that the 2½D I-read versus I-write Schmoo is the largest plane there. This says that the 2½D system has inherently broader operating margins. These operating margins may be compromised somewhat by increasing the write current. This in turn will cause a faster cycle time, since the core may be switched faster and we may be operating on a more favorable portion of the loop.
Figure 12. Memory drawer; voltage switch side.
Figure 13. Memory drawer, current switch side.
The operating margins of the system are shown
in Fig. 16. We would expect it to be symmetrical in
word and bit currents; however, it is seen that we
are far more sensitive to the bit currents.
The reason for this is the nonsquare aspect ratio
of the sense line, having far more cores on the selected bit line.
Since the Schmoo is elliptical, the operating margins of the system may be measured by varying the
word and bit currents together. This would give a
one-dimensional Schmoo, which would quite adequately define system tolerance.

Figure 14. Magnetics array, decode diodes on periphery.

Figure 15. Coincident current three-dimensional Schmoo.

Figure 16. Operating margins, I word versus I bit, with worst pattern (V word versus V bit, in volts; range determined by varying the ±12-volt supplies; minimum operating range indicated).
All drive currents are derived from the ±12-volt supplies, which are also used throughout the system. In margining the logic voltage, therefore, the drive current would also be margined by the same percentage (±5 percent).
REFERENCES
1. H. Amemiya, T. R. Mayhew and R. L. Pryor, "A 10⁵-Bit High-Speed Ferrite Memory System - Design and Operation," Proceedings, Fall Joint Computer Conference, Spartan Books, Washington, D.C., 1964, pp. 123-145.
2. A. J. Meyerhoff, Digital Applications of Magnetic Devices, John Wiley & Sons, New York, 1960.
3. J. R. Freeman, "Pulse Responses of Ferrite Memory Cores," IRE Wescon Convention Record, 1954, pp. 50-61.
4. P. Cooke and D. C. Dillistone, "The Measurement and Reduction of Noise in Coincident Current Core Memories," Proceedings of the Institution of Electrical Engineers, Sept. 1962, pp. 383-389.
5. C. P. Womack, "Schmoo Plot Analysis of Coincident Current Memory Systems," IEEE Transactions on Electronic Computers, Feb. 1965, pp. 36-44.
DESIGN AND FABRICATION OF A MAGNETIC, THIN FILM,
INTEGRATED CIRCUIT MEMORY*
T. J. Matcovich and W. Flannery
Sperry Rand Corporation
UNIVAC Division
Blue Bell, Pennsylvania
*Supported in part by Air Force Materials Laboratory, Research and Technology Division, Air Force Systems Command, United States Air Force.

INTRODUCTION

Techniques have recently been developed for using uncased integrated circuits in electronic systems. The use of uncased integrated circuits in computers will lead to the development of more reliable, more economical, and physically smaller computers than can be fabricated from discrete components or packaged integrated circuits. Before these advantages can be realized, practical techniques must be developed in the following areas:

1. Memory design
2. Logic design
3. Chip testing
4. Chip bonding
5. Chip passivation

All these areas are under study, and the initial results are promising. UNIVAC has been concentrating on developing techniques for achieving practical memory design, logic design, chip testing, and chip bonding. Several integrated circuit manufacturers are studying techniques for chip passivation, and economical processes should be developed within a few years. The design and fabrication of an integrated circuit, thin magnetic film memory system is described in this paper.

In addition to using uncased integrated circuits, the memory to be described also makes use of evaporated wiring and insulating layers. When fast rise times are required, the usual requirement of hundreds of milliamperes of drive current for operating magnetic memory elements is difficult to achieve with integrated circuits. The current requirements can be reduced by decreasing the wire size to obtain a larger magnetic field per ampere or by using a storage element which operates with smaller magnetic fields. Both of these approaches are used in the memory to be described. The wire size is reduced to the point where evaporated conductors and insulators must be used to obtain tight coupling between storage element and wires. The use of evaporated wiring has other advantages, which include compatibility with uncased integrated circuits and excellent reproducibility.

The memory system design is described in the next section. Emphasis is placed on the electrical characteristics of the system. The advanced fabrication techniques are described in the following section. Production vacuum evaporation equipment and the chip testing and bonding process are also described. The physical layout of the memory and fabrication steps are given in the next-to-last section, and the results are discussed and conclusions presented in the final section.
MEMORY SYSTEM DESIGN
The memory element configuration is shown in
Fig. 1. The element is a thin film of electroplated
magnetic material and is wired in a conventional manner with bit, dummy bit, and sense and cancellation lines. The word lines are 0.005 inch wide and are located less than 0.0005 inch from the magnetic element; drive fields of about 100 oersteds per ampere are produced by the word drive currents. The memory operates with word drive fields of about 5 oersteds and, consequently, requires only 50 milliamperes of word current. Bit and sense lines are 0.004 inch wide, and the memory requires only 20 milliamperes of bit current. The magnetic film is 800 angstroms thick and has an anisotropy field (Hk) of 1.5 oersteds and a coercivity (Hc) of 2.0 oersteds.

Figure 1. Memory element configuration.
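The quoted 100 oersteds per ampere is consistent with a rough strip-line estimate, H ≈ I/w for a narrow conductor closely coupled to its return path; this back-of-envelope model is ours, not the paper's derivation:

```python
def strip_field_oersteds(I_amps: float, width_m: float) -> float:
    """Rough field under a narrow strip conductor tightly coupled to a
    return plane: H = I / w in A/m, converted to oersteds
    (1 Oe = 1000 / (4*pi) A/m). A back-of-envelope estimate only."""
    h_am = I_amps / width_m
    return h_am * 4 * 3.141592653589793 / 1000.0

w = 0.005 * 0.0254  # a 0.005-inch-wide word line, in metres
print(round(strip_field_oersteds(1.0, w)))   # roughly 99 Oe per ampere
print(round(strip_field_oersteds(0.05, w), 1))  # 50 mA gives about 5 Oe
```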
The memory system consists of a 64-word,
24-bit-per-word rectangular array of storage
elements, octal decoders, address registers, a selection matrix, word drivers, sense amplifiers, data
registers, and bit drivers. All of these units are
mounted on a common substrate. The circuits are
all integrated and mounted (unpackaged) facedown onto the evaporated aluminum wiring on the
substrate.
The memory operates in a linear-select, destructive-readout mode with a complete read-write cycle time of 250 nanoseconds. The memory element output is about 1 millivolt in amplitude and 5 nanoseconds in width at the base. Output
from the data register is at standard logic levels,
and power dissipation is about 4 watts.
The memory system circuitry is contained on
178 integrated circuit chips. There are 10 different
chip types of which 2 are of standard and 8 of custom design.
The word address and driver circuitry is shown
in Fig. 2. The address register consists of 6 Motorola MC 302 flip-flops, and octal decoder A consists of 8 Motorola MC 306 3-input gates. The
remaining circuitry is custom designed and is divided into the three chip types illustrated in Fig. 3.
All of the PNP devices are grouped on a single chip
so that the best characteristics of both NPN and
PNP devices can be realized. Octal decoder B consists of 8 of the PNP chips, and the 64 matrix transistors are contained on 16-word switch chips.
Word current pulses with a 50-milliampere amplitude and a rise time of less than 3 nanoseconds are
generated by this circuitry. The amplitude of sneak
currents in unselected lines is less than one milliampere.
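The 6-bit word address therefore splits into two octal fields, one per decoder, selecting 1 of 64 word lines through the 8 x 8 selection matrix; a minimal sketch (which 3 bits feed which decoder is an assumed convention):

```python
def decode_word_address(addr: int) -> tuple[int, int]:
    """Split a 6-bit word address into two octal fields, one for
    octal decoder A and one for octal decoder B, together selecting
    1 of 64 word lines in the 8 x 8 selection matrix."""
    if not 0 <= addr < 64:
        raise ValueError("address out of range")
    return addr >> 3, addr & 0b111  # (decoder A line, decoder B line)

print(decode_word_address(0))   # (0, 0)
print(decode_word_address(63))  # (7, 7)
```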
A block diagram of the recirculation loop is
shown in Fig. 4. The differential amplifier is designed for low noise and low power operation. The
amplifier pulse gain is 12, its rise time is 3 nanoseconds and its common mode rejection ratio is 320.
The amplifier stages are capacitor-coupled, and the
coupling capacitors are included on the singleended amplifier chips. The design shown in Fig. 4
makes it possible to use identical chips for the two
single-ended stages. The pulse gain of the 2 cascaded single-ended stages is adjustable from 100
to 200, and the rise time is 3.5 nanoseconds. A
strobe circuit which is used to gate the amplifier off
except during the read cycle is on a separate chip.
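The amplifier chain described above (a differential stage of gain 12 and 3 ns rise time, followed by two single-ended stages of combined gain 100 to 200 and 3.5 ns rise time) can be summarized numerically; the root-sum-of-squares rise-time rule is a standard approximation we assume here, not a figure from the paper:

```python
def cascade(stages):
    """Combine amplifier stages given as (gain, rise_time_ns) pairs:
    overall pulse gain is the product of the stage gains, and overall
    rise time is estimated by the usual root-sum-of-squares rule."""
    gain, tr_sq = 1.0, 0.0
    for g, tr in stages:
        gain *= g
        tr_sq += tr * tr
    return gain, tr_sq ** 0.5

# Differential stage, then the two single-ended stages taken together
# at a mid-range gain of 150:
g, tr = cascade([(12, 3.0), (150, 3.5)])
print(g, round(tr, 2))  # overall gain 1800, rise time about 4.61 ns
```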
The data register is a standard Motorola MC 302
flip-flop. The bit driver circuitry requires both
NPN and PNP devices. The integrated bit driver is
fabricated on two chips, one containing all NPN
devices and the other all PNP devices. The chips
are designed so that one NPN chip and one PNP
chip connected together form two bit drivers.
A typical chip is shown in Fig. 5. This is the
single-ended amplifier chip, and the coupling
capacitors can be clearly seen.

Figure 2. Word address and driver circuitry.

Figure 3. Word driver and selection matrix.

Figure 4. Recirculation loop.

All chips are 0.050-by-0.050-by-0.006 inch in size and are glass passivated for environmental protection. Critical circuits contain custom designed transistors. Use
is made of evaporated nichrome for close tolerance
resistors.
ADVANCED MANUFACTURING
TECHNIQUES
The most serious problem resulting from the
elimination of the integrated circuit package is
that of providing environmental protection for the
chip. The commonly employed passivation technique is to grow a silicon dioxide layer on the surface; this passivation coating has proved inadequate
in environmental and life tests. Other techniques,
which include the use of thin layers of low-temperature glass, thick layers of high-temperature
glass, epoxy encapsulation, and silicone rubber encapsulation, have been tried with encouraging degrees of success. A satisfactory passivation technique will probably become available within a few
years.
Other problems encountered with uncased chips
are handling, testing, and bonding. A number of
micromanipulators with vacuum pick-up devices
are commercially available and provide adequate
handling facilities. The problem of testing the chips
is under study at UNIVAC. Chips are usually tested
before wafers are diced or after they are packaged.
Since the uncased chips are never packaged, only
the wafer probing techniques are applicable to
chips. Wafers are usually probed with long, needle-point, metal probes attached to small manipulators. These probes damage the probed area and
are awkward to manipulate. More seriously, the size
and shape of the probe make high-frequency testing of the chips impossible due to the inductance of
the probe leads.
A high-frequency chip testing device has been
developed at UNIVAC. A schematic drawing of
this test equipment is shown in Fig. 6. The chip to
be tested is held by a vacuum pick-up and positioned on a test card. The test card has a set of
pedestals which correspond to the pad locations on
the chip. The test circuitry is located adjacent to
the pedestals on the test card, and no long, high-inductance leads are required. Consequently, high-frequency testing of the chip, as well as low-frequency testing, is possible. A photograph of the
chip test equipment is shown in Fig. 7.
The problem of bonding the chips to the circuit
assembly has been solved by the development of an
ultrasonic bonding technique. Pedestals are evaporated on the substrate at locations corresponding to
the pad locations on the chip. Interconnect wires are evaporated onto the substrate to interconnect pedestals, memory elements, and external connection pads.

MAGNETIC THIN FILM INTEGRATED CIRCUIT MEMORY

Figure 5. Single-ended amplifier chip layout.

The chip is located over the pedestals,
and ultrasonic energy is applied. As many as 14
bonds are made simultaneously on a single chip.
The apparatus used for performing the bonding operation is shown in Fig. 8, and a photograph of a
bonded chip (viewed through the bottom of the
transparent substrate) is shown in Fig. 9.
The problem of wiring substrates by evaporation
appears formidable; however, practical production
equipment has been developed. A typical 64-by-24 wired array of elements is shown in Fig. 10.
This array has 3 layers of conductors and 3 layers
of insulators and contains over 5,000 conductor
crossovers. Units of this type have been produced in
conventional vacuum system bell jars during a single pump down. Materials are evaporated through
masks; masks, sources, and substrates are manipu-
lated externally while the system is pumped down.
Registration between masks is held to 0.0002 inch
over a 1-by-2-inch substrate area. Although only a
single pumpdown is required for completing the
wiring of a memory system, production is limited to
one or two per 8-hour day.
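The bell-jar throughput quoted here can be compared with the production-system rate of 3 planes per hour quoted later in this section; a sketch, assuming one 8-hour shift per day:

```python
# Back-of-envelope comparison of the two wiring-deposition setups
# described in the text: a conventional bell jar completes 1 to 2 planes
# per 8-hour day, while the production system is quoted at 3 per hour.
bell_jar_per_day = 2                  # best case quoted
production_per_hour = 3
shift_hours = 8                       # assumed single shift

production_per_day = production_per_hour * shift_hours
speedup = production_per_day / bell_jar_per_day
print(production_per_day, speedup)    # 24 planes/day, 12x the bell jar
```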
A production vacuum system has been developed
for evaporating the conductors and insulators. An
artist's sketch of the system is shown in Fig. 11,
and a photograph of the prototype system is shown
in Fig. 12. The input and output boxes contain substrate holders, each of which has a 13-unit capacity; these boxes are shown in Fig. 13. A trolley carries a substrate from the input chamber to the main
chamber where all of the wiring is deposited in a
sequence of evaporations through appropriate
masks. The -mask changer and operating mechanism
1028
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
1965
VACUUM LINE
MICROMANIPULATOR
X-Y-Z
MOTION
360 0
ROTATION
CONTROL
TIP
PEDESTAL
ARRANGEMENT
ON GLASS
PRINTED CIRCUIT
BOARD CONTAINING
TEST CIRCUITRY
/
TEST-CARD
SUPPORT
X-Y MOTION
328-t
BOTTOM VIEWrNG
Figure 6. Schematic diagram of chip test equipment.
are shown in Fig. 14. When the evaporations are
completed, the trolley carries the substrate to the
output box and delivers a new substrate from the
input chamber to the main chamber. When the substrate supply is exhausted, the input and output
boxes are isolated from the system, new substrates
are added, and completed substrates are removed.
These boxes can be pumped down to operating pressure in 10 minutes. Consequently, the evaporation
processes are not delayed for lack of substrates. The
prototype system shown contains a single main
chamber. This system can be easily expanded to include several main chambers. The following are
some of the advantages multichamber systems provide:
1. Increased production by parallel operation.
2. Continuous production by sequentially isolating single chambers from the system for
routine maintenance.
3. Provisions for performing low-vacuum and
high-vacuum deposition techniques in different chambers.
Figure 7. Photograph of chip test equipment.

Figure 10. Photograph of 64-by-24 matrix.
The system shown can be used to produce 3 fully wired 6-by-3-inch memory planes per hour.

Figure 8. Ultrasonic face-down-bonding equipment.

Figure 9. Photograph of face-down-bonded chip.

MEMORY SYSTEM FABRICATION
A plane view of the memory system is shown in
Fig. 15. Word lines are on 0.020-inch centers,
and bit and sense line pairs are on 0.060-inch
centers. All lines are made of evaporated aluminum
20,000 angstroms thick; word lines are 0.005 inch
wide, and bit lines are 0.004 inch wide. A cross
section view through the plane is shown in Fig. 16.
The recirculation loops are equally divided on
the left and right sides of the memory matrix. The
entire word address circuitry and the word drivers
are located at the top of the matrix. Two spring-type connectors provide all voltage and signal leads
for the plane.
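From the line spacings given above, the footprint of the bare 64-by-24 storage array works out as follows; the assignment of the 64 direction to word lines is an assumption for the sketch.

```python
# Rough footprint of the 64-word-by-24-bit storage array, assuming
# 64 word lines on 0.020-inch centers and 24 bit/sense-line pairs on
# 0.060-inch centers, as the text's spacings suggest.
words, word_pitch_in = 64, 0.020
bits, bit_pitch_in = 24, 0.060

array_in = (words * word_pitch_in, bits * bit_pitch_in)
print(round(array_in[0], 2), round(array_in[1], 2))   # 1.28 1.44 (inches)
```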
The fabrication steps for manufacturing the
memory system are shown in Fig. 17. A glass substrate is covered with a copper ground plane formed
by a combination of evaporation and electroplating
steps. The copper serves as the conductor for electroplating the film of magnetic alloy, which is applied next. Following this step, the 64-by-24 array of storage elements is photoetched, and the
magnetic memory properties of the resulting elements are measured in a test setup which simulates
memory operation. Planes with defective bits are
rejected. After testing, the plane is installed in the
production vacuum system where all the pedestals, wires, and insulating layers are sequentially deposited through appropriate masks. All wiring is completed in a single cycle through the vacuum system, and no etching steps are required.

Figure 11. Artist's sketch of production vacuum system.

Figure 12. Production line vacuum system.

Figure 13. Substrate holder.

Figure 14. Mask changer and operating mechanism.

The wiring on
the planes is next tested for continuity, shorts, and
resistance. The pretested integrated chips are attached to complete the system. The memory is then
given an operational test, encapsulated, and finally
retested.
DISCUSSION AND CONCLUSIONS
The memory described does not represent the
best that can be produced by the new techniques
developed, but it does establish the applicability of
the techniques to the fabrication of thin film memory systems.
The system could be made denser. The bit spacing and chip spacing were set by energy dissipation
considerations. The bit spacing in the array shown
in Fig. 10 was 0.030 inch; the 0.060-inch spacing
was used in the final model so that the recirculation
loop circuit chips could be spread over a larger area
without fanning out. If the chips are spaced as
shown in Fig. 15, temperature rise during operation
is limited to a practical value. When the energy dissipation problem is solved, the chips can be placed
as close as 0.010 inch apart by using the fabrication
techniques already developed.
The system capacity is adequate only for scratchpad applications. Larger systems can be made, but
there is a limit to the length of line that can be
vacuum evaporated through masks. A practical upper limit is presently considered to be 6 by 6 inches. When the energy dissipation problem is solved,
138,000 bits, including circuitry, could be placed
on the substrate. This is still not adequate for large
systems; however, larger systems could be fabricated by stacking planes.
The cost per bit of the memory system could be
reduced by making higher-capacity planes. However, the most substantial decrease in cost will result from reduced integrated circuit prices. The cost of integrated circuits has decreased substantially in the last year, and further reductions are anticipated. The use of uncased integrated chips should result in additional substantial reductions in the cost per circuit. The cost should reach such a low level that minimizing circuitry will not be a significant system design criterion. A parallel situation existed when integrated circuits were introduced, with the result that transistors became less expensive than resistors. The estimated cost of 4,096-word memories of the type described is $0.02 per bit, based on a $2.00-per-chip price. Since chip prices are expected eventually to fall below $1.00 per chip, memory systems costing less than $0.01 per bit are anticipated.

The reduced chip price and the fabrication techniques described may lead to a new concept in memories. Memories are presently formed by interconnecting components, such as storage arrays, word drivers, and sense amplifiers. Rugged, compact, and inexpensive memory modules containing all the system circuitry may be made available in standard sizes for use as computer system components. This advance, which is potentially as significant as the introduction of integrated circuits (which changed the component unit from the resistor, capacitor, and transistor to the entire circuit), could substantially alter the present concept of computer design.

Figure 15. Plane view of memory system.

Figure 16. Cross section through memory plane.
ACKNOWLEDGMENTS
The design and fabrication of the memory system
described could only be accomplished through the
close cooperation of specialists in several fields.
These people constitute the Molecular Systems
Group at the Blue Bell Laboratories of UNIVAC,
and the authors wish to acknowledge their indebtedness to all members of the group for their contributions in the development of the memory system.
Figure 17. Production flow chart (evaporate copper on glass substrate; electroplate copper ground plane; electroplate and etch magnetic film; test magnetic film bits; etch copper ground plane; evaporate all wiring; test wiring; test circuit chips from manufacturer; assemble chips to arrays; test memory; encapsulate memory; retest).

Figure 16. Memory block diagram.
Considering all sources of delay, a memory cycle
of 100 to 150 nanoseconds is easily obtainable in a
memory consisting of a few thousand 32-bit
words.
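The implied word-transfer rate is easy to bound from these figures:

```python
# Implied read rate for the woven array: a 32-bit word delivered every
# 100 to 150 nanoseconds, per the cycle times quoted in the text.
word_bits = 32
cycle_s = (100e-9, 150e-9)

rates_mbit_s = [word_bits / t / 1e6 for t in cycle_s]
print([round(r) for r in rates_mbit_s])   # roughly [320, 213] Mbit/s
```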
CONCLUSIONS

The experimental results have shown that the woven read-only memory array has the required properties for achieving very short memory cycle times. Another attractive feature of the memory is the high packing density which can be achieved; the results quoted on area packing density indicate that over 50,000 bits per cubic inch are obtainable. The description of the fabrication method for the memory has shown that stacks can be assembled at very low cost per bit due to the highly automated nature of the memory production process. If the costs become as low as anticipated, it may even be possible to consider the woven memory array as a very high-speed semipermanent memory by using "throw-away" planes.

Finally, the results show that the memory element is insensitive to mechanical stresses and temperature changes. The woven array will therefore be applicable in those areas where highly reliable operation is required under extreme environmental conditions.

ACKNOWLEDGMENTS

The authors would like to acknowledge the help and encouragement of A. Matsushita of TOKO, Inc., and Dr. R. H. Fuller of General Precision Inc.
REFERENCES
1. H. Maeda and A. Matsushita, "Woven Thin
Film Memories," IEEE Trans. on Magnetics, vol.
1, p. 13 (Mar. 1965).
2. P. Kuttner, "The Rope Memory: A Permanent Storage Device," AFIPS Conference Proceedings, p. 45, 1963.
3. D. M. Taube, "A Short Review of Read-Only Memories," Proc. IEE, vol. 110, no. 1, p. 157 (Jan. 1963).
BATCH FABRICATED MATRIX MEMORIES*
Thomas L. McCormack, Claude P. Battarel and
Harrison W. Fuller
LFE Electronics
Boston, Massachusetts
INTRODUCTION
Present-day matrix memory fabrication techniques are relatively expensive since discrete binary
memory elements are individually made and then
assembled into a matrix array by means of manual
or semimanual wiring. The assembly of individual
matrix planes is usually followed by another expensive step wherein the planes' of a memory stack are
interconnected. The key to low cost matrix memories lies in integrated or batch fabrication of the
memory elements and wiring structure of a plane,
and also batch forming the interconnections between planes in a memory stack. Additional economy results from making the bit capacity of a plane
as large as possible. The need for batch fabricated
memory planes appears to be generally recognized,
considering the number of suggestions that have
been made for achieving the goal. Thin magnetic
*This program is receiving support from the Air Force
Materials Laboratory, Research and Technology Division,
Air Force Systems Command, United States Air Force,
under contract no. AF 33(615)-3018; and Rome Air Development Center, Research and Technology Division, Air
Force Systems Command, Griffiss Air Force Base, N. Y.,
under contract no. AF 30(602)-3826, as well as LFE Electronics, a division of Laboratory For Electronics, Inc., Boston, Mass.
1035
film memory elements were initially of interest because of their high switching speed, which made very
fast memories possible, but today thin magnetic
film memories are of interest principally because
they offer one approach to batch fabrication, since
in the meantime ways have been found for achieving high speed with ferrites.
Although low cost is the principal motivation for
developing batch fabricated memory planes, the result is that such manufacturing methods also make
small physical size and low power practically achievable. The small physical size of memory elements
makes large-capacity planes possible, and this, together with the low-power requirements, works to
reduce the cost of selection, drive and sense electronics. This reduction in electronics cost, possible with batch fabricated memory planes, is very important to the objective of low-cost memory systems, since the cost of electronics in present-day memory systems is a substantial and sometimes dominating fraction of the total.
A new approach to batch fabrication of memory
planes based on the use of permalloy-sheet toroids
as memory elements is employed. The etched permalloy-sheet toroid approach is attractive because
the fabrication techniques are applicable to a variety of memory systems.
BATCH FABRICATED MEMORY PLANES
The batch fabricated memory plane consists of
flat toroids etched from sheet permalloy with the
necessary control wiring formed by etching and
plating copper. The wiring pattern required for
planes to operate in the coincident current mode is
shown in Fig. 1. A slightly modified version is used for the linear select mode.

Figure 1. Wiring pattern for 4 X 4 memory matrix.

The dark circles represent the etched toroids; wiring, indicated by solid lines, is in the plane above the toroids, while the
wiring indicated by dotted lines is in the plane below the toroids. Connections between the two wiring layers are made through the interior of the toroids. The topology is such that wires never cross
on the same plane, thus allowing the wiring pattern
to be formed by two layers of etched copper insulat-
ed from each other and the toroids but connected
by means of plated regions through the interior of
the toroids. Larger arrays are made by repeating
this basic pattern. Figure 2 shows an early test
model with toroids on 25-mil centers prepared by
bonding sheet permalloy to a plastic sheet, coating
it with photo resist and exposing it to the negative of the toroid pattern. The model was then developed and etched, and holes were burned in the plastic in the middle of the toroids. It was then hand-wired.

Figure 2. Hand-wired 4 X 4 matrix model. 23-mil O.D. on 25-mil centers.
In Fig. 3 are shown the photo masters which are
required to batch fabricate the memory array shown
in the previous figures. The two lower patterns are
the top and bottom halves of the wiring pattern.

Figure 3. Partially reduced photo-etched masters for a 4 X 4 matrix model.

At
upper left is shown the toroid pattern, while to its
right is shown the mesa pattern which produces the
connection between the two wiring layers through
the interior of the toroid.
The steps involved in batch fabricating a memory
plane of the design shown will be described, briefly,
with reference to Fig. 4(a-e), which shows cross
sections of a plane in various stages of fabrication.
In Fig. 4a, a 1.3-mil-thick copper sheet has been
bonded to a temporary substrate via a thermoplastic
adhesive. The copper has subsequently been coated
with a photo resist, the mesa pattern exposed, and
the resist developed. This layer has been baked and
a subsequent coating of photo resist applied and
exposed to the mesa pattern as before, then developed. This layer of resist serves as the adhesive which bonds the permalloy. In Fig. 4b the permalloy has been bonded, coated with photo resist, exposed to the toroid pattern and developed, and the toroids have been etched. In 4c the specimen has been recoated with photo resist, and the mesa pattern exposed and developed once more (this completely surrounds the permalloy toroid with an insulating layer of photo resist); then the specimen is placed in an electroplating bath and the mesas plated up level with the top layer of photo resist. In 4d a coating of copper has been evaporated onto the surface and electroplated up to a thickness of 1.3 mils; this surface is then coated with photo resist and exposed to one wiring pattern, developed, and etched. In Fig. 4e the specimen has been transferred to the permanent substrate, the temporary substrate removed, the newly exposed copper surface cleaned, coated with photo resist, exposed to the final wiring pattern, developed and etched.

Figure 4. Steps in batch fabricating technique of memory plane manufacture.

Figure 5. Fabrication stages of 64 X 64 models. Upper left: toroids with mesas plated up. Upper right: top wiring pattern completed. Lower left: top and bottom wiring completed. Lower right: lead-in from edges and top-to-bottom wiring feed-through technique for edge connectors.
The basic 4 X 4 pattern has been developed into larger models; Fig. 5 shows 64 X 64 models at various stages of fabrication: upper left, toroids with mesas plated; upper right, top wiring pattern completed; lower left, top and bottom wiring completed; lower right, details of lead-ins. Figure 6 shows a completed 64 X 64 memory plane. This plane is 1.6 inches square and has 4,096 toroids on 25-mil centers.
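These figures are mutually consistent, as a quick check shows:

```python
# Consistency check on the completed plane: 64 x 64 toroids on 25-mil
# (0.025-inch) centers give 4,096 toroids on a 1.6-inch-square plane.
rows = cols = 64
pitch_in = 0.025

toroids = rows * cols
edge_in = rows * pitch_in
print(toroids, round(edge_in, 3))    # 4096 1.6, matching the text
```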
The completed planes may be stacked; the manner of accomplishing this is indicated in Fig. 7. These
connections are formed by stacking and potting the
planes in register, lapping off the surfaces to be interconnected, coating the surface with copper, photo
resist, exposing through a mask and finally etching.
In this way the plane-to-plane interconnections are
formed in the same manner as the memory plane
wiring. Figure 8 shows a section of a 64 X 64 stack
of 5 planes where the edge of the copper lines is
clearly visible. A stack of five 16 X 16 planes is shown
in Fig. 9.
Some essential features of the technique described are: (a) no new materials need to be developed; (b) the magnetic material, thin permalloy
sheet, is inexpensive, is available from several
sources, and rigid quality control has been in hand
for many years; (c) closed flux structure memory
elements allow close coupling to wiring and close
spacing of elements without interaction and with
low sensitivity to external fields; (d) the chemical,
1041
photochemical and electrochemical materials and
processes are individually inexpensive, well-known and well-established ones; (e) the processing methods are relatively simple and allow
large planes to be fabricated with simple apparatus;
(f) the tolerance and resolution required in individual steps of the fabrication process are well
within the present state of the art; (g) the
batch fabrication processes used to make interconnection wiring between planes of a memory stack
are identical to those used to manufacture memory
planes. These properties of the fabrication techniques give good assurance that a high yield of
memory planes at low cost can be dependably and
reproducibly expected, once all the steps in the
manufacturing sequence are brought under control.
By reason of the nature of the batch fabrication
techniques, the same expectations follow for connection of individual planes to automatic test
equipment and interconnection of planes in a stack.
BATCH FABRICATED PLANE RESULTS
Early models consisted of an array of 16 X 16 toroids on 62.5-mil centers. These were used primarily as a demonstration of the feasibility of the process. Subsequently, arrays of 16 X 16 toroids on 25-mil centers have been made, and the development of a satisfactory fabrication process has been a part of the program. Planes are generally fabricated in lots of 10, and as many as 9 out of 10 planes are being carried successfully through fabrication, with an average of 6 out of 10 for the last 8 lots fabricated. In addition, the yield of testable planes has improved, so that if a plane was processed completely through fabrication it was also a testable plane. Further, the yield of planes has steadily improved, despite the fact that planes are actually being made on an experimental basis, where it is understandable that the yield may temporarily drop from the introduction of changes in the process which are expected ultimately to improve planes. In the last 3 lots of 16 X 16 planes fabricated, one lot had 3 out of 10 planes perfect, the next 3 out of 7 perfect, and in the last 5 out of 10 were perfect. Fabrication is now in progress on 64 X 64-bit planes. Of the 90 processed to date, 50 have been completed, and of these 37 were testable. The first 7 lots have been tested, with 6 planes having perfect wiring (no shorts or opens). Spot switching tests have shown S curves similar to those for the 16 X 16 planes.
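The 64 X 64 yield figures quoted above reduce to the following percentages:

```python
# Yield figures for the 64 x 64-bit planes as quoted in the text:
# 90 planes processed, 50 completed, 37 of the completed planes testable.
processed, completed, testable = 90, 50, 37

completion_yield = completed / processed      # about 56 percent
testable_yield = testable / completed         # 74 percent
overall = testable / processed                # about 41 percent
print(round(completion_yield, 2), round(testable_yield, 2), round(overall, 2))
```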
Figure 6. Completed 64 X 64 bit memory plane.
LOW-POWER LINEAR SELECT OPERATION
The memory planes presently being fabricated are suitable for use either in the linear select mode or the coincident current mode. When operated in the linear select mode, reasonably short cycle times can be realized since the switching constant is less than 0.5 oersted-microseconds. Figure 10 is a switching curve taken using one of the planes, and it may be seen that a 120-milliamp, 0.5-microsecond pulse should switch the toroid fully. In Fig. 11d it may be seen that a 2-microsecond read-write cycle time may be readily achieved with read word pulses of 0.5 microsecond at 150 milliamps and a write word pulse of 1.0 microsecond at 67 milliamps, with bit current at ±33 milliamps. These currents are less than one-third the values generally required with ferrites at the same cycle time.
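An order-of-magnitude check on the 120-mA figure can be sketched from the classic switching relation 1/tau = (H - H0)/Sw. The toroid mean-path radius assumed below (about 9 mils, consistent with the 23-mil O.D. quoted for Fig. 2) is an illustrative assumption, not a figure from the paper.

```python
import math

# Does a 120-mA, 0.5-us pulse plausibly switch a toroid whose switching
# constant Sw is at most 0.5 oersted-microseconds?  The relation
# 1/tau = (H - H0)/Sw says a 0.5-us switch needs an overdrive H - H0
# of about Sw/tau.
Sw_oe_us = 0.5
tau_us = 0.5
overdrive_oe = Sw_oe_us / tau_us             # 1.0 Oe needed above threshold

# Field produced by 120 mA through the toroid, assuming a mean path
# radius of about 9 mils (0.23 mm); 1 oersted = 79.577 A/m.
i_a = 0.120
r_m = 9 * 25.4e-6                            # 9 mils in meters
h_oe = i_a / (2 * math.pi * r_m) / 79.577

print(round(overdrive_oe, 2), round(h_oe, 2))  # both close to 1 Oe
```

Under these assumptions the drive field and the required overdrive are both near one oersted, consistent with the switching-curve claim.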
At present a system design using these memory
planes in a linear select memory of 8,000 30-bit
words is under study. Present estimates are (for 2-microsecond cycle time): power - less than 10
watts; volume - approximately a 4-inch cube; cost
- 7 to 14 cents per bit depending upon the number
of units. Figure 12 shows an artist's sketch of this
memory.
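At the quoted cost range, the proposed system's totals work out as follows:

```python
# The proposed linear-select system: 8,000 words of 30 bits each at an
# estimated 7 to 14 cents per bit, per the figures quoted in the text.
words, bits_per_word = 8000, 30
total_bits = words * bits_per_word            # 240,000 bits

cost_low = total_bits * 0.07                  # dollars at 7 cents/bit
cost_high = total_bits * 0.14                 # dollars at 14 cents/bit
print(total_bits, round(cost_low), round(cost_high))  # 240000 16800 33600
```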
Figure 7. Wiring diagram of completed stack.
MASS MEMORY
The low cost per bit resulting from the method of
batch fabricating memory described, typically 0.01
to 0.02 cents per bit, combined with key system
techniques make it possible to build very large memories with a capacity in the order of 108 bits at a
cost in the order of 0.1 cents per bit including electronics.
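Taking the 16-module, 6.5 x 10^6-bit organization described later for Fig. 14, the capacity and cost targets reduce to:

```python
# Mass-memory targets quoted in the text: about 10**8 bits at about
# 0.1 cents per bit including electronics, built from 16 modules of
# 6.5 x 10**6 bits each.
modules, bits_per_module = 16, 6.5e6
capacity_bits = modules * bits_per_module     # 1.04e8, i.e. order 10**8

system_cost_dollars = capacity_bits * 0.001   # 0.1 cents = $0.001 per bit
print(capacity_bits, round(system_cost_dollars))  # about 1e8 bits, ~$104,000
```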
The present mass memory system approach is
based on the use of coincident-current memory organization, since for a large memory the drive and
selection electronic circuits are many fewer for a coincident-current memory compared with those in a linear-select memory.

Figure 8. Magnified views of end cross sections at 20X (top) and 100X (bottom).

Figure 9. Views of five 16 X 16 planes stacked with batch fabricated plane-to-plane interconnections.

It is next considered important in a low-cost memory that the matrix plane size
be as large as possible, e.g., at least 256 X 256, since
drive, selection and sense costs are again reduced as
a result. Matrix planes in coincident-current memories are normally restricted to much smaller sizes,
e.g., 64 X 64, and even then the planes may require
partitioning of the sense line, and the use of additional sense amplifiers, to reduce delta noise. To eliminate delta noise and make large planes possible, an
unconventional two-frequency c.w. selection scheme1 was adopted for the reading operation, while conventional coincident-current writing is employed.
Small toroidal storage elements are used to reduce
the length and losses of drive and sense lines. Closed
flux storage memory elements allow close coupling
to wiring and result in low sensitivity to external
fields. Sense signals are relatively small, but the sinusoidal nature of sense signals provides system versatility by permitting narrow-band filtering prior to
sensing for acceptable signal-to-noise ratio. The reading method is nondestructive, which enhances reliability in large memories, and which reduces access
time in the frequent case where read accesses significantly outnumber write accesses. The reading signals
are sinusoidal while the write signals are d-c pulses;
this difference provides a degree of electrical isolation between reading and writing operations. In Fig.
13 are shown the characteristic S-curves (set pulse
amplitude as the variable) traced from all 256 toroids
on one 16 X 16 plane read in this manner. Outputs
from 64 X 64 planes are similar. The majority of experimental fabrication work has been done using
permalloy toroids that are 0.025 inch on centers.
Small toroid size and low coercive force (0.2 oersteds
for some permalloy) make possible a low, 30-milliampere, half-select write current for 0.025-inch toroids; read drive currents are approximately the same
peak magnitude. Memories using these toroids are
potentially low-power ones, and the memory fabrication technique can be compatible, therefore, with present capabilities of integrated semiconductor microcircuits.

Figure 10. Switching curve for mo-permalloy (plane 5H, line Y13, 1/4-mil mo-permalloy).
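The electronics saving of coincident-current selection noted earlier can be sketched for a square plane; the function below is purely illustrative.

```python
import math

# Why coincident-current selection needs far fewer drive circuits than
# linear (word) select for a large plane: for an N-word plane arranged
# as a square matrix, word select needs one driver per word, while
# coincident current needs one per row plus one per column.
def drivers(n_words):
    side = int(math.isqrt(n_words))
    assert side * side == n_words, "illustration assumes a square plane"
    return {"linear_select": n_words, "coincident_current": 2 * side}

print(drivers(65536))   # a 256 x 256 plane: 65536 versus 512 drivers
```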
Figure 14 shows a simplified block diagram for a 10^8-bit mass memory. The system consists of 16 modules of 6.5 x 10^6 bits sharing common electronics organized as in the CCM. The basic system
design has been described previously.2 A model of
this memory is shown in Fig. 15. The model has a
volume of 4.85 cubic feet.
FABRICATION FACILITY

It is apparent that yield is a critical factor in any batch fabrication process, and this is no exception. Present results in a normal laboratory environment are quite good for the 64 X 64 planes now being fabricated, but the next step is to prepare 256 X 256-bit (65 x 10^3 bits total) planes. In order to assure good yield, the total fabrication process has been developed carefully, including the provision for a clean production area. A series of 10 clean work stations have been designed and are now being installed to implement the process on a pilot plant production basis. Figure 16 shows two of the clean stations, which are designed for class 100 conditions. The fabrication facility includes in contiguous areas the pilot plant, photo master preparation and electrical test areas.

ACKNOWLEDGMENTS

The authors would like to acknowledge the assistance of the staff of the Solid State Electronics Laboratory at Laboratory For Electronics, Inc., for the work reported here.
Figure 11. Test of write operation of 1/2-mil mo-permalloy, plane 5H, line Y-13. Vertical voltage scales are 10 mv/cm for (b), (c) and (d), and 5 mv/cm for (a). (a): Readout from 1/3-selected elements (0); Iset = 33 mA peak @ 1 µsec, Ireset = 150 mA peak @ 0.5 µsec. (b): Readout from 2/3-selected elements; Iset = 67 mA peak @ 1 µsec, Ireset = 150 mA peak @ 0.5 µsec. (c): Readout from selected elements (1); Iset = 100 mA peak @ 1 µsec, Ireset = 150 mA peak @ 0.5 µsec. (d): Combined 1-0 read-write signals; Iset = 100 - 33 mA peak @ 1 µsec, Ireset = 150 mA peak @ 0.5 µsec. Note: Iset corresponds to the write signal, while Ireset corresponds to the read signal.
REFERENCES
1. B. A. Widrow, "A Radio-Frequency Nondestructive Readout for Magnetic Core Memories,"
Trans. IRE, PGEC, vol. EC-3, p. 12 (Dec.
1954).
2. H. W. Fuller, T. L. McCormack and C. P.
Battarel, Paper 5.5, Proc. of the 1964 Intermag.
Conf., IEEE Inc., New York.
Figure 12. Proposed low-power memory packaging layout, 8,010 words, 30 bits per word (microdiode circuit cards, wire-wrap terminals, connector to interface and power supply, and a memory stack of 90 planes on 40-mil centers).
Figure 13. S curves for all 256 elements on plane 2N. Reset pulse -80 ma peak, 5 µsec wide; set pulse varied 0 to +80 ma peak, 5 µsec wide. Read currents 40 ma pp. Vertical scale 10 µV/major div. Horizontal scale 10 ma/major div. Each photo shows 16 S curves associated with one Y line. Material is No. 43, 1/2 mil.
Figure 14. Simplified block diagram for the 10^8-bit mass memory.

Figure 15. Model of 10^8-bit memory. Outside dimensions are 17 X 17 X 29 inches.
Figure 16. Clean work stations used for batch fabricated memory plane manufacture.
AN INTEGRATED SEMICONDUCTOR MEMORY SYSTEM
H. A. Perkins and J. D. Schmidt
Fairchild Semiconductor Research and Development Laboratory
Fairchild Camera and Instrument Corporation
Palo Alto, California
INTRODUCTION

The concept of active circuit data storage (flip-flop) is as old as electronic data processing systems. The attributes of highest access speed, steady-state nondestructive readout, and flexibility of application have been partially offset by higher costs and higher standby power per storage bit. As a result, flip-flop storage has until recently been feasible only for registers.

As integrated circuits have been perfected to provide a multiplicity of gates or flip-flops on each monolithic die at lower costs than traditional discrete component circuits, the always interesting possibility of an integrated all-semiconductor memory of reasonable capacity becomes exciting. Simply integrating the circuits is not enough. A maximum number of storage positions in a given device package should also require a minimum of leads. Since a memory system is the objective, the need for the usual logic-level-compatible interface is waived. The storage device is optimized for minimum complexity per bit of storage and greatest electrical tolerance allowance at its terminals. Interface (peripheral) circuits are required, such as word and bit drivers and sense amplifiers, much as for magnetic core or film memory systems. However, the performance required from the peripheral circuits is much less stringent for the semiconductor memory described.

The following sections will describe the circuits, devices, packaging and system design for a random access 256-word memory of 72 bits per word.
SYSTEM DESCRIPTION
System Design Goals
The main goal to be achieved is a memory system competitive in cost and superior in performance for a certain range of applications. To achieve the goal, the storage device's complexity must not push processing technology toward low yields; the device must use fairly standard, easily installed packages and require a minimum number of packages for the system complement. To meet these needs, a 16-lead dual in-line package (similar to the Fairchild CTμL logic family) containing 4 words of 9 bits per word (36 bits) was selected. An 8 x 10-inch double-sided printed circuit card readily holds 160 packages. Of this number, 128 are arranged in a 16 x 8 array to provide 64 words of 72 bits each, with the balance of the packages containing word drivers
at 2 circuits per package. Thus the 256-word memory requires 4 such storage cards. In addition, 2 data circuit cards at 36 bits per card (bit driver and sense amplifier) and a single address register and control card are needed to complete the card complement. A power density of approximately 5 milliwatts per bit was chosen with a view to providing high speed (150-nanosecond cycle time) but moderate system power (130 watts). The basic storage cell has wide application flexibility, and the specifications which follow are one compromise to several conflicting needs. Minor modifications to the design permit optimization for higher speeds, larger capacity for one set of peripheral circuits, or reduced power. The capacity of the storage device itself (36-bit package) is a function of present device and package technology and should be regarded as a first expedient.
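The package and card arithmetic above can be restated in a few lines; the identifier names below are ours, and the figures are the ones quoted in the text:

```python
# Arithmetic behind the card complement: 36-bit storage packages
# (4 words x 9 bits), 128 of them per storage card in a 16 x 8 array
# giving 64 words of 72 bits, and four such cards for 256 words.

BITS_PER_PACKAGE = 4 * 9            # 36-bit storage package
packages_per_card = 16 * 8          # storage-array portion of one card
words_per_card = 64
bits_per_word = 72

# The array on one card holds exactly 64 words of 72 bits.
assert packages_per_card * BITS_PER_PACKAGE == words_per_card * bits_per_word

storage_cards = 256 // words_per_card
print(storage_cards, 256 * bits_per_word)  # 4 18432
```

The 18,432-bit total reappears as the capacity entry of Table 1 below.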
Table 1.
Capacity: 256 words, 72 bits/word; 18,432 bits total
Repetitive access or write cycle time: 150 nanoseconds
Read mode: nondestructive
Write mode: jam set to one or zero (at storage device)
Read access time: 120 nanoseconds
Data flow rate: 480 bits/microsecond
Interface signal levels: +2 volts = one, -0.5 volt = zero (compatible with Fairchild CTμL family)
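Two of the Table 1 entries follow directly from the others; a quick consistency check (variable names are ours):

```python
# Table 1 cross-check: total capacity follows from the word count and
# word length, and the data flow rate follows from delivering one
# 72-bit word every 150-nanosecond repetitive cycle.

words, bits_per_word = 256, 72
cycle_time_us = 0.150  # 150 nanoseconds in microseconds

total_bits = words * bits_per_word
flow_rate = bits_per_word / cycle_time_us  # bits per microsecond
print(total_bits, round(flow_rate))  # 18432 480
```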
System Organization
The functional components of the semiconductor memory are similar to those of most random access memories, as shown in the block diagram of Fig. 1. A nondestructive steady-state output is obtained from the storage cell. Strictly speaking, buffer registers are unnecessary. In the memory described, registers
Figure 1. Block diagram (memory address register; storage array, 256 words x 72 bits; data in; gate; data out).
have been included primarily as a convenience for writing in data, so that the computer need not be tied up any longer than necessary. Instead of having an output level directly compatible with logic level signals, a relatively low 30-millivolt signal is sensed as a one. Although a sense amplifier is required, the advantage of a relatively low-impedance sense/bit line (50 to 150 ohms) justifies its use. Such an impedance level is compatible with interconnection techniques such as printed wiring, twisted pair and coaxial cable. Yet only about 400 microamperes are required from the storage cell, tending to minimize standby power needs for high-speed capability.
Logic elements for the memory are standard complementary transistor micrologic units. The memory address register is formed from Dual Rank Flip-Flop circuits (Fairchild CTμL-957). The first-level decoder uses dual 4-input positive "and" gates (Fairchild CTμL-954) followed by inverters (Fairchild CTμL-952) to provide the correct signals for the second-level decoders and Word Drivers. On the first memory cycle after the memory has been idle, the address with a start command is loaded into both ranks and decoding begins immediately. After the correct word has been selected and delay through the sense amplifier completed, a gating signal transfers the output to the computer system. If an input command (write-in) is presented to the memory, data is gated into the Data Register (a Latch circuit, Fairchild CTμL-968), Bit Drivers corresponding to zeros are energized, and after decode delay time a Write Control signal causes the Word Driver to change the state of the storage cells to correspond with Bit Driver outputs.
For repetitive memory cycles, the earlier address is retained in the second rank of the memory address register until the first cycle is complete. The new address, if a start command is also present, is loaded in the first rank only, pending completion of the preceding cycle. Near the end of the first cycle, the new address is transferred to the second rank and, soon after the new cycle begins, the first rank is cleared. By storing the start command also in a dual rank register, an asynchronous input can keep the memory cycling at maximum speed.
Special integrated circuits are the 36-Bit Storage Cell, the Decoder-Word Driver, the Bit Driver, and the Sense Amplifier. (The Sense Amplifier is a Fairchild μA710 Comparator.)
Typical voltage and current levels at the storage
array interface are given in Table 2.
Table 2.
Function             | V       | Imax
Word drive, read     | 1.5-2 v | 25 ma
Word drive, write    | >3.0 v  | 55 ma
Bit drive            | -0.7 v  | 10 ma
Sense output         | 50 mv   | 0.5 ma
A timing diagram is shown in Fig. 2; times correspond to the functional blocks in Fig. 1. Response of the storage bit after word drive is applied is about 40 nanoseconds. The time required to store information in a storage bit is about 50 nanoseconds. A 150-nanosecond repetitive cycle time is obtained by allowing overlap into the subsequent cycle.
Figure 2. Timing diagram (traces for start, output delay, addressing, word drive, storage bit output, data out, start input, data in and bit drive; horizontal scale 0 to 180 nanoseconds; first- and second-cycle starts marked).
The
sense amplifier is not limited to 256 bits on a sense line. Using a multiplexed sense amplifier (4 segments of 1024 words), a 4096-word memory of 72 bits per word should easily achieve a repetitive cycle time of 500 nanoseconds. A cycle time of 100 nanoseconds is obtained at 64 words of 72 bits. These timing estimates are projected on the basis of 5 milliwatts per bit for the particular partitioning scheme described. A higher redundancy of peripheral circuits obviously tends toward maximum speed.
The other variable is the power density in the storage package. This may be modified readily by
changing the applied power supply voltages or by
changing the internal resistances within the device
either to optimize for speed, cost or power consumption. The most significant point is that basically a single device design in the storage package
will serve a large spectrum of application requirements. The goal of competitive costs is achieved
mainly by producing a very large volume of essentially a single device with the attendant cost minimization realizable with microcircuitry.
CIRCUIT DESIGN
Basic Memory Cell
The particular circuit configuration that one
chooses for an integrated semiconductor memory is
dependent on a number of considerations such as chip complexity and size, number of leads, power dissipation, speed, logical flexibility, fabrication capability, etc. After analyzing the complete memory system costs and considering the type of packaging to be used, we concluded that the most important criterion for selection of the circuit configuration is to minimize the number of leads required to the memory cell. This approach permits achieving the highest functional complexity possible for any given package chosen. Consistent with the above statements is the belief that the memory cell circuit should be as simple as possible, even at the expense of more complex peripheral circuits if required.
After investigating both linear selection and
coincident selection circuits as well as separate data
in and data out buses, we decided on a linear selection cell with a common data in-data out bus as
shown in Fig. 3. The circuit is a conventional emitter-coupled binary except that the right-hand
Figure 3. Basic memory cell (write and read pulses applied to the word select line).
load resistor has been omitted and this node tied to ground. This forces the left-hand collector output node (point A) to vary from -0.7 volt (defined as a zero) to +0.7 volt (one state) under quiescent conditions. These voltages are determined only by the transistor parameters and are as follows:
V(one) = +Vbe sat - Vce sat and
V(zero) = -Vbe sat + Vce sat
Transistor T 3 acts as an input/output switch and
is driven by a bi-amplitude positive pulse as shown
on the word select line. Consider first the read operation. A lower amplitude positive pulse forward biases the T3 base-to-emitter junction, turning the transistor on. If a one is stored (+0.7 volt at A), T3 acts as a normal transistor and a positive output appears across RL. Its amplitude is determined by the ratio of RL to R1 + RE and the magnitude of +Vcc - Vce sat. The purpose of RE is to prevent point A from going closer to ground than +200 mv, even with Do shorted to ground and minimum Vce sat on T3. If a zero is stored, T3 acts as an inverse transistor and a slight current (dependent on inverse beta) flows into node A, tending to produce a negative output signal. Thus, a zero or a negative level at Do corresponds to a zero and a positive level corresponds to a one. When reading a zero, the base-collector junction of T3 is forward biased, forcing base current into node A. In order not to write a one into the cell, this current must be limited, and this in turn is the reason for the bi-amplitude word select pulse. To write a one, transistor T4 is left off and the higher amplitude word select pulse is applied. This forces sufficient current into node A to cause T1 to turn off and T2 to turn on, thus writing in a one. To write a zero, T4 is saturated, pulling the data in-data out line negative and thus pulling node A negative when T3 is saturated.
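The read and write behavior just described can be caricatured as a small behavioral model. This is an editorial sketch only: the class and method names and the string-valued levels are invented for illustration, standing in for the actual voltages (+0.7 volt at node A for a stored one, the bi-amplitude word select pulse, and T4 pulling the data line negative).

```python
# Toy behavioral model of the one-bit cell described above. The word
# pulse has two amplitudes: the lower "read" pulse only samples the
# stored state; the higher "write" pulse jam-sets the cell from the
# data in-data out line. No circuit voltages are simulated.

class MemoryCell:
    def __init__(self):
        self.state = 0  # flip-flop state (T1/T2 pair)

    def pulse(self, amplitude, data_line=None):
        if amplitude == "write":   # higher-amplitude word select pulse
            # T4 saturated pulls the data line negative, writing a zero;
            # T4 off leaves the line near ground and a one is written.
            self.state = 0 if data_line == "negative" else 1
            return None
        elif amplitude == "read":  # lower-amplitude word select pulse
            # A stored one yields a positive output across RL; a stored
            # zero yields zero or a slight negative output.
            return "positive" if self.state else "negative"

cell = MemoryCell()
cell.pulse("write")              # write a one (data line at ground)
print(cell.pulse("read"))        # positive
cell.pulse("write", "negative")  # write a zero (T4 saturated)
print(cell.pulse("read"))        # negative
```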
In designing the circuit, three primary constraints
must be met to satisfy the conditions that the cell
performs correctly at d-c or low frequency. These
design constraints are:
1. Minimum write current is sufficient to write a one.
2. Maximum read current does not write a one if the flip-flop is in the zero state.
3. Minimum read current is sufficient to saturate the gate transistor (T3).
In considering the design problem, it is soon realized that there are many more degrees of freedom
in the design than there are constraints on design
conditions which must be met. Thus one must use
many secondary design considerations and some
judgment in order to arrive at a unique solution.
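The worst-case procedure used throughout this design (every extreme combination of toleranced parameters checked against each constraint) can be sketched as a corner enumeration. The resistor and supply values and the single divider constraint below are hypothetical placeholders, not values from the paper; only the tolerance figures (plus or minus 30 percent resistors, plus or minus 5 percent supplies, quoted later in the text) are taken from it.

```python
# Worst-case corner sweep: evaluate a design constraint at every
# extreme combination of toleranced parameters. The parameter set and
# the constraint are illustrative placeholders.
from itertools import product

nominal = {"R1": 1000.0, "R2": 2000.0, "Vcc": 3.0}  # hypothetical values
tolerance = {"R1": 0.30, "R2": 0.30, "Vcc": 0.05}   # +/-30% R, +/-5% supply

def corners(params, tols):
    """Yield every min/max corner of the toleranced parameter space."""
    names = sorted(params)
    for signs in product((-1, +1), repeat=len(names)):
        yield {n: params[n] * (1 + s * tols[n]) for n, s in zip(names, signs)}

# Example constraint: a divider output must stay above 0.9 V at every corner.
worst = min(p["Vcc"] * p["R2"] / (p["R1"] + p["R2"])
            for p in corners(nominal, tolerance))
print(round(worst, 3))  # 1.478, so the placeholder constraint holds
```

The paper's own version of this check was done analytically with the three constraint equations and then verified by breadboarding the worst configurations.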
Our approach was to consider first the word driver
circuit and determine roughly what read and write
voltage levels and spreads could be realized. From
this preliminary work, all terminal voltages were
specified for the cell (i.e., word line and data line
voltages for both reading and writing). The next step
is to define the power supply voltages and the approximate power level of the cell. From the standpoint
of wide tolerances in the cell, it is desirable to have +Vcc and -VE both large in magnitude compared to VBE, and R1 and R2 large, so that node A and the emitter node are fed from constant current sources. However, this increases both the power dissipation and the size of the circuit, in addition to yielding a somewhat slower circuit due to the RC time constant at node A. Consideration of these tradeoffs led to fixing +Vcc, -VE and R2.
The three worst-case equations for the three primary constraints are:
1. (Use high temperature values for transistor parameters.)
2. VA = [R1R3/(R1 + R3)] [(VR - Veb3)/R3 - (VE + Vbe sat1)/R2 - Vcc/R1]
(Use low temperature values for transistor parameters.)
3. VA = (RE + RL) [(VR - Vbe sat3)/R3 + (VE - Vbe sat1)/R2 - Vcc/R1] + Vce sat3
where a bar above a parameter indicates the maximum value and a bar below indicates the minimum value. VR is the "read" voltage on the word select line and VW is the "write" voltage at the same point. All other notations are defined in Fig. 3.
The following values were used in the design equations:
Resistor tolerance: +/-30%
Power supply tolerance: +/-5%
"Read" word select voltage tolerance: +/-0.25 volt
Minimum difference between "read" and "write" word select voltages: 1 volt
VA (write 1) = 0 volt
VA (read 0) = -200 mv
VA (read 1) = +200 mv
The voltage levels at node A were chosen so that there would be no change in the state of the cell during reading. By using the above values in Eqs. (1) and (2), R1 and R3 can be uniquely determined in terms of R2, which was previously chosen. The values of R1 and R3, along with RL, were then used in Eq. (3) and RE was determined. The analytical results were verified by breadboarding all worst configurations using kit-integrated parts and determining the points at which failures occurred.
The decision was made to package the memory circuits in Fairchild's new dual in-line configuration having 14 or 16 leads available. Since the memory cell requires three power supply contacts (+Vcc, -VE, and ground), 13 signal leads are available with the 16-pin package. These 13 pins can be used in a near optimum fashion by organizing the memory chip as a 4-word array of 9 bits per word. This gives 36 bits of storage with 4 word select lines and 9 data in-data out lines required. The chip size is 60 by 80 mils and utilizes two layers of metal to solve the crossover problem. Fig. 4 is a photomicrograph of the 36-bit memory array.
Peripheral Circuits
Final Decoder and Word Driver. In order to minimize the number of gates required for word decoding, the decoding has been broken down into two levels, called the first-level decoder and the final decoder. The final decoder is a two-input "and" gate whose output ties to the word driver circuit. Thus, for a memory of 2^n words, 2^n of these circuits are required. The first-level decoding is performed by 2 x 2^(n/2) "and" gates with n/2 inputs each. For n = 8, there are 256 final decoder and driver circuits and 32 four-input "and" gates for the first-level decoder. The fan-out required of the first-level decoder is simply 2^(n/2) with this decoding scheme.
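The decoder bookkeeping above lends itself to a quick arithmetic check; the function name in this editorial sketch is ours, not the paper's:

```python
# Circuit counts for the two-level word-decoding scheme described above:
# a memory of 2**n words needs 2**n two-input final decoder/word-driver
# circuits, plus 2 * 2**(n/2) first-level "and" gates of n/2 inputs
# each; each first-level gate must fan out to 2**(n/2) final decoders.

def decoder_counts(n):
    """Return (final_decoders, first_level_gates, inputs_per_gate, fan_out)."""
    assert n % 2 == 0, "the scheme as described splits the address evenly"
    half = n // 2
    return 2 ** n, 2 * 2 ** half, half, 2 ** half

print(decoder_counts(8))  # (256, 32, 4, 16), matching the n = 8 case above
```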
The final decoder and word driver circuit is shown in Fig. 5. It consists of a current-mode gate providing a "negative and" function, with the non-inverting output directly coupled to a common emitter inverter stage which in turn drives an emitter follower output stage. To make the input levels compatible with the first decoder output, the reference transistor in the current-mode stage (T3) is tied one diode drop above ground. This also gives some temperature compensation to the circuit. The input resistors are used to suppress any tendency of the circuit to oscillate. The bi-amplitude output is achieved by controlling the voltage on the write control bus (WCB). If this line is tied to ground, a voltage divider is created and the lower amplitude
output (read) results. If the bus is opened, the higher amplitude output (write) results.
Figure 4. 36-bit memory circuit (60 x 80 mils).
Figure 5. Final decoder and word driver (output to word select line).
Again, the purpose of diode D2 is to provide temperature compensation. The output stage is designed to drive a 72-bit
or shorter word, which requires about 50 ma of current for writing.
Again the circuit was worst-case designed using
the same tolerances as previously listed for the basic memory cell. In addition, the output transistor
was designed to withstand momentary shorting to
the minus supply voltage. The circuit was designed
to operate within the correct levels at a junction
temperature of 110°C and was actually tested at
this ambient temperature with no special heat sinks
on the transistors.
In order to keep the integrated circuit relatively simple, only two such circuits were put on one chip and a single layer of metal was used. The chip is 45 mils on a side, has a nominal power dissipation of 250 mw, and is shown in Fig. 6. The chip is packaged in the 14-lead dual in-line header, and two such packages are required to service the four words contained in the memory package.
Data Circuits. The sensing problem with this memory cell is strictly a compromise solution based on system size, speed and complexity. Basically, the output of the memory cell during reading is a given current (about 0.5 ma), and this current may be driven into any impedance from zero on up as long as it is referenced to ground. Thus an ideal method of sensing would be to use a common base stage, which would terminate the line in a low impedance and provide voltage gain with good bandwidth. Unfortunately, the polarity is such that a PNP transistor is required, and good PNP common base stages are not easily integrated. For this reason, and also to limit the number of special integrated circuits that have to be designed, we decided to build the prototype system using a conventional integrated amplifier, the μA710. All 256 words in the memory system are tied to a common data line. This can be done since the leakage current of the gating transistor in the memory cell and its emitter capacitance
are both very low.
Figure 6. Dual final decoder and word driver (45 mils square).
The data lines are then terminated in a 150-ohm resistor, and the 0.5-ma sensing current produces about 75 mv of output signal nominally.
This signal is amplified to the logic level by the μA710, and a CTμL-956 buffer is used as an output gate and line driver as shown in Fig. 7.
Figure 7. Data circuits.
A CTμL dual latch element is used as a data input register. This circuit is simply a flip-flop with single rail input and output and a gate on both the input and output. Its output cannot pull negative enough on the data line to write in a zero, and thus a single common emitter transistor tied to -0.8 volt is required on the output of the latch. Saturating this transistor when a "write" pulse is present on the word select line will write in a zero and, if the transistor is turned off, the data line will be referenced to ground and a one will be written in.
In the preliminary design, data line recovery time was recognized as an important system parameter. The capacitance associated with the line was estimated as 0.2 pf per emitter times 256, plus about 20 pf of wiring, for a total of about 70 pf. Thus the RC time constant would be about 10 nanoseconds with a 150-ohm termination. This appears to be a reasonable number.
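The recovery-time estimate in this paragraph is a one-line calculation; the sketch below (function and parameter names are ours) reproduces it:

```python
# Data-line recovery estimate from the text: 0.2 pF of emitter
# capacitance per gating transistor times 256 words, plus about 20 pF
# of wiring, driven through the 150-ohm termination.

def line_time_constant(c_per_emitter_pf=0.2, words=256, wiring_pf=20.0,
                       r_term_ohms=150.0):
    """Return (total capacitance in pF, RC time constant in ns)."""
    c_total_pf = c_per_emitter_pf * words + wiring_pf
    tau_ns = r_term_ohms * c_total_pf * 1e-3  # ohm * pF = 1e-3 ns
    return c_total_pf, tau_ns

c_pf, tau_ns = line_time_constant()
print(round(c_pf, 1), round(tau_ns, 2))  # about 71 pF and 10.7 ns
```

The text rounds these figures to 70 pf and 10 nanoseconds.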
With regard to packaging, all circuits are housed in dual in-line packages, with two latches, two data drivers, two μA710's and two buffers per package. In addition, a special interface circuit will be provided so that the memory will be compatible with any logic levels commonly used. Thus it requires five of the dual in-line packages to service two bits of data.
PACKAGE DESIGN
Printed Circuit Boards
All the circuits are packaged in the dual in-line 14- or 16-lead configuration and are, in turn, inserted and flow-soldered into a two-sided printed circuit
board. Fig. 8 is a photograph of the storage board
which is 8" x 10". The boards are standard 1/16"
G-10 material with 1 oz. copper. The plated through
holes are 30 mils in diameter with a 40 mil land. The
copper runs are 20 mils wide with a minimum clearance of 20 mils. The power supply voltages and
ground are interconnected on the two sides of the
Figure 8. Storage printed circuit board (64-72 bit word).
board to form an interleaved grid pattern. This helps
to reduce the self-inductance and resistive drops in the lines.
Interconnection Techniques
The 7 pc cards are interconnected by means of a
two sided mother board which also functions as the
base of the specially developed connector. Fig. 9
shows a photograph of the top side of the mother
board with some of the contacts installed. The signal interconnection pattern between boards is contained on this side of the board. This is also the
side into which the contacts and pc boards are inserted. The other side of the mother board contains
power supply busing and solder contacts.
Fig. 10 shows a photograph of the connector assembly. It consists principally of three parts: the
mother board, the contacts, and a set of cams.
When the cams are in the relaxed position, the pc
cards may be freely inserted or withdrawn since
they do not touch the contact fingers on the board.
After a board is inserted, the cams are rotated, causing the contacts to exert a wiping action on the fingers and thus make contact. The contact pressure is designed to wipe through any oxide films that may be built up on the pc card.
The same mother board and connector design is also used as the equipment interface connector, in order to minimize the hand wiring required
in the memory system. Fig. 11 shows a photograph
of the interface connector card with sub-miniature
coaxial cable connected to it. Two such cards are
required to complete the approximately 160 connections to the system. The cables are secured with
a hood and clamp piece that feeds them out the
back end of the card cage in a bundle.
Figure 9. Mother board, top side.
Card Cage
Figure 12 shows a photograph of the partially
assembled card cage. This mechanical assembly provides two primary functions. They are mechanical
support and guiding of the individual boards into
the mother board and connector assembly, and providing proper cooling for the assembly via a small
fan and a controlled air flow path. In addition, the
card cage will house the right angle drives that allow rotating the cams from the front of the unit.
Thus it will be possible to completely service the
unit from the front, i.e., withdraw and insert cards
or insert an extender board and contact assembly.
Overall Unit and Power Supply
The card cage is bolted to a standard 7" rack
panel. The panel has a cutout to permit servicing
the unit. The depth of the assembly will be about
11", allowing the power supply to be placed directly
behind the memory unit in the same rack space.
The power supply requirements are approximately as follows:

Total d-c power: 132 watts
+12 V: 0.7 amp, +/-10%
+4.5 V: 7 amps, +/-5%
+2 V: 5 amps, +/-5%
-0.8 V: 1 amp, +/-10%
-2 V: 3 amps, +/-10%
-5 V: 15 amps, +/-5%
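As a cross-check, the per-supply figures above multiply out to roughly the quoted total; the dictionary below simply restates the table:

```python
# Power supply cross-check: sum of |V| * I over the six supplies should
# come to roughly the 132-watt total d-c figure quoted in the text.

supplies = {  # volts: amps
    +12.0: 0.7,
    +4.5: 7.0,
    +2.0: 5.0,
    -0.8: 1.0,
    -2.0: 3.0,
    -5.0: 15.0,
}
total_watts = sum(abs(v) * i for v, i in supplies.items())
print(round(total_watts, 1))  # 131.7, consistent with the ~132-watt total
```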
OPERATING CHARACTERISTICS
A pulse program consisting of write one, read one, write zero, read zero, and then repeat has been used for the testing. The oscillographs of Fig. 13
show some of the waveforms for this system. A is
the negative output pulse from the first decoder and
inverter as it appears on one of the decode input
lines on the storage card. These are all essentially
"open" lines so the reflections present are reasonable.
B shows the output of the final decoder and word
driver as it appears on the word select line. The
first pulse is the read zero, followed by write one,
etc. Mid-amplitude delay through the circuit is observed to be about 20 nanoseconds. The long tail
off (about 50 nanoseconds) on the trailing edge of
the pulse is due to stored charge in the base of the
gating transistors. The cycle time shown here is 150
nanoseconds.
C shows the output of the sense amplifier. As expected, a signal is produced by the write "1" operation as well as the read "1" operation. Mid-amplitude delay from word line to data out is about 30 nanoseconds.
Figure 10. Connector assembly.
Figure 11. System interface connector assembly.
Figure 12. Card cage.
To demonstrate possible crosstalk that exists on the bit line, the repetition rate was slowed down and a zero and a one superimposed. Figure 14(a) shows the word line and 14(b) the bit line waveforms. For this case, the two adjacent bit lines on both sides of the central bit had complement data written in and read out at the same time as the central bit line was being operated on. The photo is of the central line, and the small amount of crosstalk at the leading and trailing edges is apparent.
CONCLUSIONS
The results obtained have shown the feasibility of producing integrated circuit memory systems
which offer advanced performance and design simplicity. Such memories may be expected to have an
important place in applications where high speeds
at low costs for moderate storage capacity are needed. Our experience has shown that the engineering
of an integrated semiconductor memory system is
much easier than that required for a thin magnetic
film memory of comparable performance.
As semiconductor fabrication techniques continue to advance, further improvement in semiconductor memory systems is certain.
ACKNOWLEDGMENTS
Many persons have contributed in one way or another to this project, and all cannot possibly be named. However, special thanks are due to the Digital Integrated Circuit Section (R. Seeds) for fabricating the storage array and to the Digital Systems Research Department (R. Rice) for many of the packaging concepts and, in particular, for the design of the mother board connector assembly. We are also indebted to J. Friedrich for the design and fabrication of the word driver circuit, and to H. Zinschlag for the design of the data circuits.
Figure 13. Memory system waveforms (50 nanoseconds/square, 2 volts/square except as noted).
a. Decode input pulse.
b. Word line.
c. Sense amplifier output.
Figure 14. Superimposed one and zero outputs (40 nanoseconds/square).
a. Word line (2 volts/square).
b. Bit line (100 mv/square).
STROBES: SHARED-TIME REPAIR OF BIG ELECTRONIC SYSTEMS*
Jesse T. Quatse
Computation Center
Carnegie Institute of Technology
Pittsburgh, Pennsylvania
INTRODUCTION
The trend towards larger, more elaborate, and
more complex computer systems is gradually modifying many programming and engineering techniques. The sheer bulk of equipment required by
contemporary computer centers, in itself, creates
new and severe problems in many areas. In particular, the maintenance of big systems has been complicated by three important characteristics: redundancy, complexity, and expense.
Redundancy has adversely affected system reliability. Devices such as memory modules are respectably reliable when administered in small doses, but memory banks consisting of 32 or more modules are currently available. In these quantities,
even the most reliable modules can become collectively significant as a source of down-time. One
popular, but expensive, approach to the problem is
the refinement of packaging techniques so that entire modules, including addressing circuits, may be
unplugged and replaced as units.
*This work was supported by the Advanced Research Projects Agency of the Office of the Secretary of Defense (SD-146).
The complexity of a big system imposes a heavy burden on maintenance personnel. As a rough comparison of complexity, a maintenance engineer who
is responsible for a single unit in a big system may
be required to retain an understanding of a device
two orders of magnitude larger, by circuit count,
than an equivalent unit produced six years ago.
Maintenance crews have been growing, in response
to the problem, so that each engineer need be familiar with only a fragment of the entire system.
Again, the approach appears to be more expensive
than effective.
Big systems mean big cost in down-time. Although the maintenance engineer can think no faster than he did six years ago, the hourly cost of
down-time has grown considerably. Proportional increases in maintenance crew size are an unsatisfactory approach to the problem. Thinking speed is not linearly cumulative.
STROBES (Shared-Time Repair Of Big Electronic Systems) is an attempt to ease these problems. It is based upon two assumptions. First, a big
computer system is an unnecessarily overpowered
device for regenerating the trace on a CRT console
-even if the console is a maintenance engineer's
oscilloscope. Flicker-free wave shape display requires no more than 32 uniformly spaced runs of
the maintenance program during each second of
use. Second, the bigger the system, the lower the
probability that any isolated facility is crucial to
system operation. A redundant system facility, such
as a core memory module, a magnetic tape drive, a
telephone channel, etc., can be removed from general use without destroying the integrity of the computer system. The two assumptions lead to the
STROBES approach of isolating the faulty unit, so
that standard time-shared programs are denied its
use, and of blanking the maintenance CRT cathode
whenever standard programs are running.
STROBES thereby achieves a "stroboscopic" effect
in which the maintenance engineer is able to monitor the faulty equipment, at cyclic intervals, during
the time-shared running of his maintenance routines. In typical trouble-shooting situations, the
standard users are hardly aware of the presence of
the maintenance engineer, and he is totally unaware
of them.
TIME-SHARED TROUBLESHOOTING
The STROBES equipment and programs treat
the maintenance engineer as a nonstandard timeshared system user. The equipment consists of a
modified oscilloscope, a small control panel, and
central processor circuits for communicating with
the oscilloscope and control panel. The control panel offers a few basic operating options and is
mounted on the oscilloscope frame. The oscilloscope is modified to accept blanking and unblanking
commands from the STROBES programs.
Three modes of operation are provided by the
STROBES programs. A "log-in" mode is furnished
for identifying the maintenance engineer and the
faulty unit and for initiating the "conversational"
mode of troubleshooting. In conversational mode,
the maintenance engineer can monitor and control
the system by means of tasks called from the permanent library. The tasks can be used to exercise
faulty units, assemble other tasks, inspect and alter
the permanent library, terminate operation, and initiate or terminate a "stroboscopic" mode without
terminating the conversational mode. The stroboscopic mode serves the purpose of regenerating the
trace on the modified oscilloscope. A real-time interrupt, of the frequency selected at the control panel, calls a previously specified maintenance routine
which unblanks the CRT, exercises the faulty unit,
blanks the CRT, and returns control to the interrupted program.
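The interrupt cycle just described can be sketched in modern notation as follows. This is an illustrative model only, not the STROBES implementation; all names (StroboscopicCycle, on_interrupt, and so on) are invented for the sketch, and the 1-millisecond run time and 15-per-second rate are the figures quoted later in the paper.

```python
# Illustrative sketch of the stroboscopic-mode interrupt cycle: at each
# real-time interrupt the maintenance routine unblanks the CRT, exercises
# the faulty unit, blanks the CRT, and returns control to the interrupted
# standard program. All names here are hypothetical.

class StroboscopicCycle:
    def __init__(self, exercise_unit):
        self.exercise_unit = exercise_unit  # routine that drives the faulty unit
        self.crt_unblanked = False
        self.trace_log = []

    def on_interrupt(self):
        """Called by the real-time clock at the panel-selected rate."""
        self.crt_unblanked = True                     # unblank the CRT
        self.trace_log.append(self.exercise_unit())   # exercise the faulty unit
        self.crt_unblanked = False                    # blank the CRT again
        # control then returns to the interrupted standard program

# Each run costs under 1 ms; at 15 interrupts per second the degradation is
run_time_s = 0.001
rate_hz = 15
degradation = run_time_s * rate_hz   # 0.015, i.e. 1.5 percent of system time

cycle = StroboscopicCycle(lambda: "module exercised")
for _ in range(3):
    cycle.on_interrupt()
```

The degradation arithmetic here matches the 1.5 percent figure claimed in the next paragraph.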
The primary advantage of this approach to troubleshooting is the drastically reduced cost of downtime. Each run through a typical stroboscopic mode
program, for example one which reads and writes
into a faulty memory module, requires less than 1
millisecond including interrupt overhead. At 15 repetitions per second, near flicker-free display is obtained for the cost of 1.5 percent degradation in
system speed. (During extended periods of independence from the CRT, the maintenance engineer can
reduce this time degradation to zero by disabling
the STROBES interrupt. A switch on the oscilloscope control panel is provided for this purpose.)
Ignoring for the moment the cost of configuration
degradation, the cost of down-time apparently drops
to a negligible value.
In addition to its more obvious value, virtually
free troubleshooting time should lower the requirements imposed upon maintenance crew sizes, individual skills, and equipment familiarity.
The effect of configuration degradation during
STROBES troubleshooting is not easily related to
the cost of down-time. In special cases (for the example given, those in which the system programs
handle a "virtual memory" which far exceeds the
capacity of the "actual memory") system performance is relatively insensitive to small changes in the
system configuration. System performance degrades
as some function of lost capacity when configuration changes are sufficiently large; but in the example, large changes require multiple faults and are
rare. In general, configuration degradation has a
binary effect on programs. Programs which require
service from the faulty unit cannot be run. For
them, the entire system can be considered inoperative during the repair period. For those which do
not require the faulty unit, the entire system can be
considered fully and continuously operative. The
system is more reliable in effect, if not in fact. This
increase in effective reliability is of particular significance to users who interact with the system in
real time and are therefore acutely aware of inactive periods.
The advantages of the STROBES system are not
obtained without cost. System programs must be
rewritten to accept the maintenance engineer as a
nonstandard user. More important, they must function in a system which can change configuration
upon operator command. System programs having
similar properties are already available from manufacturers who support many configurations of one
basic system. Systems which make use of paging
and segmenting hardware are particularly adaptable
to the STROBES troubleshooting of memory modules. However, a sizable cost can be expected if
these systems are to be adapted to permit the isolation of any nonessential system facility during
STROBES troubleshooting. The obstacles are greater, but not insurmountable, for system programs
which are rigidly fixed to one system configuration.
A worthwhile gain can be obtained by relegating
one facility at a time to the nonessential status. For
example, an essential line-printer can be buffered by
tape or disk so that images are loaded on an intermediate storage during printer down-time, then unloaded and printed later. (Because of the image scrambling problem, elaborate printer buffering is already mandatory in time-shared systems.) Buffered line printers can be efficiently isolated for STROBES troubleshooting because of their normally low duty cycles. At Carnegie Tech, a 900-line-per-minute printer produces an average of about 2 × 10⁵ lines per 24-hour day. At that rate, about 20 hours each day can be devoted to line-printer maintenance without affecting system performance.
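The duty-cycle arithmetic behind that 20-hour figure can be checked directly, using the two numbers quoted above (roughly 2 × 10⁵ lines per day at 900 lines per minute):

```python
# Checking the printer duty-cycle arithmetic quoted above: about
# 2 x 10**5 lines per day, printed at 900 lines per minute.

lines_per_day = 2e5
print_rate_lpm = 900.0

busy_hours = lines_per_day / print_rate_lpm / 60.0   # time actually printing
idle_hours = 24.0 - busy_hours                       # time free for maintenance
```

The printer is busy only about 3.7 hours per day, leaving roughly 20 hours free, as the text states.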
Another important cost of implementing
STROBES results from the rewriting of maintenance programs. They are subject to the same versatility requirements as system programs. They
must be parametric so that they may be confined in
extent, and non-iterative so they can be run in stroboscopic mode. They must be cataloged and assembled in a maintenance library so that small relevant
subroutines may be called conveniently. Finally, the
conversational mode requires an appropriate interpretation and assembly system.
Other costs and disadvantages of the STROBES
system are less tangible. A psychological reaction
can be expected on the part of both the maintenance engineer and the standard time-shared user.
Both object to troubleshooting during production
runs. The objections are based upon the fear that
the maintenance engineer will interfere with the
standard user's program by accidentally touching a
circuit component with an electrically active object.
Such accidents might cause program errors which
are difficult or impossible to detect. Although standard maintenance procedures rarely endanger standard users' programs in this way, insufficient data are available to determine the extent to which the objections are valid for production runs, where mishaps are more costly.
In conclusion, it is worth noting that the disadvantages of STROBES may not survive installation and general user acceptance.
STROBES OPERATION
Mandatory Equipment
The stroboscopic mode requires a nonstandard
oscilloscope, a small control panel, and appropriate
communicating circuitry located in the central processor. The circuit shown in Fig. 1 is used at Carnegie Tech to adapt a Model 535A Techtronix oscilloscope. It is housed in the preamp storage bay of
the oscilloscope and mounted on the control panel
which covers the front of the bay. No modifications
to the oscilloscope proper are required other than
the addition of the CATHODE BLANK and
TRIGGER INHIBIT wires shown in Fig. 1. The
switches, indicators, and connectors shown in the
figure constitute the control panel. Repetition rates
of 1, 15, and 60 per second are selected by the 3-position REP RATE switch. Repetitive traces are
easily visible on the standard 535A CRT at the 1
per second rate. Traces are near flicker-free at 15
per second and entirely free at 60 per second. The
repetition rate selector wires run to AND gates in
the processor which enable interrupts generated by
a real-time clock. The connecting cable is 100 feet
in length so that the oscilloscope may be wheeled to
any unit in the system. The four-position MODE
switch must be positioned at AUTO to obtain automatic trace repetitions at the selected rate. Interrupts are generated manually, for one-shot cycling,
by depressing the ENABLE switch while MODE is
in the MAN position. The manual interrupt line
circumvents the real-time clock at the processor.
INHIBIT TRACE is the only signal received from
the processor. It is a d-c level supplied by a programmable processor flip-flop. In either AUTO or
MAN mode, INHIBIT TRACE blanks the CRT cathode and inhibits further triggering. The TEST
position of MODE substitutes a 1-kc square-wave
oscillator for the inhibit trace signal. The NORM
position returns the oscilloscope to normal operating condition.
PROCEEDINGS - FALL JOINT COMPUTER CONFERENCE, 1965

[Figure 1 is a circuit schematic of the oscilloscope interface: the test oscillator, one-shot light driver, interface circuits, and dual-purpose flip-flop; the MODE switch (NORM, TEST, AUTO, MAN), the manual ENABLE switch, and the repetition-rate selector (1, 15, and 60 per second); the CATHODE BLANK, TRIGGER INHIBIT, and INHIBIT TRACE lines; and the +12 V, -12 V, -3.5 V, and ground connections. Notes: all diodes are 1N252; all resistance values are in kilo-ohms; all capacitance values are in micro-farads.]

Figure 1. The oscilloscope circuits.
The STROBES Assembly Language
A version of the standard machine language has
been augmented by special STROBES macros and
pseudos so that closed relocatable subroutines can
be assembled in conversational mode. Statements
in the language are of the form "LABEL, OPCODE, PARAMETER, PARAMETER; COMMENTS" and are assembled as they are typed. The
relative address of each statement is automatically
typed at the beginning of each input line so that it
may be used as a parameter. Relative addresses are
distinguished from numeric values by a preceding
decimal point.
At the completion of an assembly, a debugging
mode is automatically entered, in which error comments are typed-out and corrections are typed-in.
When errors are corrected to the satisfaction of the
assembler, a request for a unit name is automatically typed-out. After the name is typed, the subroutine is automatically dumped onto a temporary library and classified as a "task" for the specified
unit. The task can be called and executed in either
the conversational or stroboscopic mode only if the
specified unit was reserved during log-in. If no unit
is specified, the task is assigned to all units. A permanent task is available which copies tasks from
the temporary library to a permanent master library
which remains intact when STROBES is terminated.
A task call is of the form "NAME, FIRST PARAMETER, SECOND PARAMETER, ..., Nth PARAMETER, STROBES MODE; COMMENTS" where
NAME can be any alphanumeric string and N ≤ 9.
The task can be run in stroboscopic mode only if
indicated by the STROBES MODE parameter. The
actual parameters are passed by means of a macro called "PAR." The PAR macro requires one parameter: a digit which represents the position, in the task call, of a parameter which is to be brought to the accumulator. For example, the occurrence "PAR, 2;" in the task called by "LOAD, 100, 200, 300;" will cause the assembly of code which brings the number 200 to the accumulator. A list of pseudos and macros is given in Table 1 and Table 2. Angle brackets are used in the tables to denote class names.
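The PAR parameter-passing convention can be modeled in modern notation as follows. This is an illustrative sketch, not the actual STROBES assembler; the names parse_task_call and Machine are invented for the example.

```python
# Illustrative model of the PAR macro's parameter selection: given the
# task call "LOAD, 100, 200, 300;", the occurrence "PAR, 2;" brings the
# second parameter (200) to the accumulator. Names here are hypothetical.

def parse_task_call(text):
    """Split 'NAME, P1, P2, ...;' into a task name and its parameter list."""
    body = text.split(";")[0]
    fields = [f.strip() for f in body.split(",")]
    return fields[0], fields[1:]

class Machine:
    def __init__(self, params):
        self.params = params      # parameters of the current task call
        self.accumulator = None

    def par(self, digit):
        """PAR, <digit>: bring the parameter in call position <digit>."""
        self.accumulator = int(self.params[digit - 1])

name, params = parse_task_call("LOAD, 100, 200, 300;")
machine = Machine(params)
machine.par(2)   # brings 200 to the accumulator
```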
Table 1. The STROBES Assembly
Language Pseudos.
ENT;
Defines the automatically typed relative address
as the entry point of a library task.
DEF, < LABEL >, < INTEGER >;
Defines < INTEGER > as the value of < LABEL >.
END;
Terminates assembly and enters the task being
assembled into the temporary library.
PRT;
Causes each used label and its defined value,
if any, to be typed-out in tabular form.
PRT, < INTEGER >;
Executes the PRT pseudo for labels L0 through
< INTEGER > only.
DMP, < INTEGER >, < LOCATION >;
Types the contents of the < INTEGER >
memory locations beginning at < LOCATION >.
Table 2. The STROBES Assembly
Language Macros.
RTM;
Returns control to the program marked in the
ENT location.
TON;
Turns the oscilloscope trace on (unblanks the
CRT).
TOF;
Turns the oscilloscope trace off (blanks the
CRT).
TPI, < INTEGER >, < LOCATION >;
Enables type-in of < INTEGER > octal numbers beginning at < LOCATION >.
TPO, < INTEGER >, < LOCATION >;
Types-out the octal contents of the < INTEGER > words beginning at < LOCATION>.
1069
PAR, < INTEGER >;
Brings the parameter in call position < INTEGER > to the accumulator.
CLL, < NAME >, < PARAMETER >,
< PARAMETER >, ... ;
Calls the library task specified by < NAME >
and supplies the ensuing parameters.
The Operation Modes
The first phase of the log-in mode identifies the maintenance engineer to the general time-shared system programs and registers his request for STROBES. In the second phase, any tasks accessible in log-in mode may be called. Of the log-in tasks shown in Table 3, only "DIRECTORY" leaves the system in log-in mode.

Table 3. The Log-In Tasks.

DIRECTORY;
Types the unit, name, number of parameters, STROBES mode, and length of each task registered in the permanent library.

DONE;
Terminates STROBES.

RESTORE;
Establishes the conditions which existed when a SAV task was last executed. (See Table 4.)

UNIT, < ALPHA-NUMERIC >, < ALPHABETIC >;
Declares the unit or units specified by the name < ALPHA-NUMERIC >, or the subunit delineated by the < ALPHABETIC > descriptor, as reserved for exclusive use of the maintenance engineer, loads all of the relevant subroutines from the permanent library, creates an appropriate temporary directory, and initiates the conversational mode.

The third phase terminates log-in by returning to a previously aborted
state, by terminating STROBES, or by progressing
to the conversational mode.
Once initiated, the conversational mode remains
in effect until STROBES is terminated. Successful
initiation is verified by the automatic type-out of
the message "STROBES IS NOW PREPARED
FOR" followed by the name of the unit requested.
After an automatic carriage return, the letter "T" is
typed-out to request input of a task call. When the
task call is received, either a standard message, information particular to the called task, or a "T" is
typed-out. In this way, the reception of typed-in information is always verified by a subsequent type-out. The conversational mode tasks are listed in Table 4.
Table 4. Conversational Mode Tasks.
TSK, < NAME >, < INTEGER >,
< MODE >;
Begins assembly of the temporary library task
called < NAME > requiring < INTEGER >
parameters and operable in stroboscopic mode
if < MODE> = 1.
STT, < NAME >, < PARAMETER >,
< PARAMETER >, ... ;
Initiates stroboscopic mode and calls the task
specified by < NAME > with the ensuing
parameters.
STP;
Stops the stroboscopic mode by inhibiting the
STROBES interrupt.
DIRECTORY;
Identical to the log-in mode task by the same
name.
INI;
Clears the temporary library.
DLT, < NAME >;
Deletes the task specified by < NAME >
from the temporary library.
RMV, < NAME >;
Deletes the task specified by < NAME > from the permanent library.
CPY, < NAME >;
Copies the task called < NAME > from the temporary library to the permanent library if it is not already permanent.
MRG, < NAME >;
Merges the entire temporary library with the
permanent library.
TPI, < INTEGER >, < LOCATION >;
Enables type-in of < INTEGER > octal numbers beginning at < LOCATION >.
TPO, < INTEGER >, < LOCATION >;
Types-out the octal contents of the < INTEGER > words beginning at < LOCATION >.
ATI, < INTEGER >, < LOCATION >;
Enables type-in of < INTEGER > alphanumeric characters, one per word, beginning
at < LOCATION >.
ATO, < INTEGER >, < LOCATION >;
Types-out the alpha-numeric contents of the
< INTEGER > words, one character per
word, beginning at < LOCATION >.
SAV;
Completely copies the state of the STROBES
system so that restoration can be made by
the log-in task "RESTORE" and terminates
STROBES.
OUT;
Terminates STROBES unrecoverably.
Entrance to and exit from the stroboscopic mode are controlled by the conversational mode tasks STT and STP. The STP task is equivalent, in effect, to disabling the STROBES interrupt at the control panel. Thus, the stroboscopic mode can be considered to be the "iterative" form of conversational
mode tasks.
AN EXAMPLE OF USE
The coordination of conversational and stroboscopic modes, during the trouble-shooting process, is
best described by example. What follows is a contrived protocol intended to illustrate the effect that
the maintenance engineer is the sole system user
while he absorbs an estimated 2 percent of total
system time. Typing performed by the STROBES
system programs is underlined for clarity.
The maintenance engineer is required to eliminate recurrent parity errors in an 8K memory module having the octal address range 120000 to
140000. He initiates log-in by the standard identification procedure required of all time-shared users.
He then requests all library procedures relevant to
the faulty memory module (module MM11, E) and
reserves it for his own exclusive use.
UNIT, MM11, E;
After a brief pause in which the requested routines
are prepared for use, a type-out informs the maintenance engineer that conversational mode has been
established.
STROBES IS NOW PREPARED FOR MM11 E
The letter "T" is typed-out to request a conversational mode task. The maintenance engineer then
types a small routine, by means of the task "TSK",
which simply clears and then sets every bit in a
word of the defective module. Three parameters are
required by TSK: the task name (MEM), the number of parameters (2), and the mode of operation
desired (1 for stroboscopic). The maintenance engineer intends to use the oscilloscope to trace the
information path to and from the defective module.
T TSK, MEM, 2, 1;
0000 ENT; SPACE FOR THE MARK
0001 PAR, 1; BRING THE FIRST PARAMETER
0004 STL,, L0; STORE IT IN LOCATION L0
0005 PAR, 2; BRING THE SECOND PARAMETER
0010 STL,, L1; STORE IT IN LOCATION L1
0013 STZ, 1, L0; STORE ZEROS IN THE LOCATION SPECIFIED BY L0
0014 CAL,, L1; CLEAR ADD THE PATTERN IN LOCATION L1
0015 STL, 1, L0; STORE THE PATTERN IN THE LOCATION SPECIFIED BY L0
0016 TOF; TURN OFF THE OSCILLOSCOPE TRACE
0017 RTM; RETURN TO MARK
0020 L0, 0; DECLARE L0 AND ITS CONTENTS
0021 L1, 0; DECLARE L1 AND ITS CONTENTS
0022 END; TERMINATE ASSEMBLY
If assembly is unsuccessful, special debugging com-
ments and requests will be typed-out. The maintenance engineer may then debug and correct his
code. When the task is acceptable to the assembler, a
type-out requests the name of the unit for which the
task is intended.
SPECIFY UNIT
MM11
The task is added to the MM11 category of the
temporary library. Successful assembly and library
updating is then reported by a type-out and the next
task is requested. The maintenance engineer enters
stroboscopic mode by calling the newly created
MEM task with STT. He supplies the address
120000 and the pattern 77777777777 as parameters
to MEM.
LIBRARY UPDATED
T STT, MEM, 120000, 77777777777;
T
The stroboscopic mode is now in effect. The maintenance engineer commences trouble-shooting with
the oscilloscope. After an hour of unsuccessful labor, he decides that a call to his supervisor is appropriate. The SAV task will recoverably remove
STROBES from the operating system. After recognizing the call for SAV, STROBES types a recognition message, preserves the temporary library and
the operating state, and terminates.
T SAV
STROBES SAVED
After consulting with his supervisor, the maintenance engineer must log-in once again. He can recover from the SAV operation by means of the RST
task.
RST
STROBES IS NOW PREPARED FOR MM11, E
T STT
T
Troubleshooting with the oscilloscope resumes. As
other tasks are found necessary, they may be written or called from the library.
T TSK, SEQ, 1, 1;
0000 ENT
0001
  { code which sequences the MEM task
0016    through memory
0017 END
SPECIFY UNIT
MM11
T STT, SEQ, 77777777777;
T STP
T
When the faulty component is found, stroboscopic mode is terminated by STP, the memory module is powered down, the component is replaced, the memory module is again powered up, and an acceptance routine MTA is called. The routine reports faultless operation, so the maintenance engineer terminates operation.
T MTA, 'E';
MEMORY MODULE E O.K.
T OUT;
CONCLUSION
The work described in this paper serves more to
introduce than to exploit the STROBES approach.
The central theme has been efficient troubleshoot-
ing in emergency conditions. No mention is made of the many periodic "preventive maintenance" procedures used to sustain system performance. These procedures are particularly adaptable to the STROBES approach and should be a
profitable realm of application. As systems increase
in size, speed, and modularity, they should become
more adaptable to the STROBES approach. An
attempt to incorporate STROBES at the equipment
and program design level may lead to significantly
reduced maintenance costs in future systems.
ACKNOWLEDGMENTS
The author is grateful for the assistance of his
colleagues Arthur Yaffe, John Schlotterer, and Beau
Brinker.
A SELF-DIAGNOSABLE COMPUTER
R. E. Forbes, D. H. Rutherford, C. B. Stieglitz
International Business Machines Corporation
Federal Systems Division
Owego, New York
and
L. H. Tung
International Business Machines Corporation
Systems Development Division
Endicott, New York
INTRODUCTION
A self-diagnosable computer is a computer which
has the capabilities of automatically detecting and
isolating a fault (within itself) to a small number
of replaceable modules.
This paper presents a new concept which, in
mid-1962, resulted in the design of a self-diagnosable computer called the DX-1. The design effort represented one of a number of related and
continual efforts aimed at finding the best way to
achieve maximum operational availability of a
computer to do useful work. The design objectives
were:
1. Maximum capability for self-diagnosis.
2. Minimum mean-time-to-diagnosis (MTTD).
3. Minimum additional hardware to make it self-diagnosable.
4. Minimum hard core (that portion of logic which must function correctly).
The DX-1 was logically implemented and simulated on the IBM 7090 during 1963. The results of
the logic simulation indicated that the basic philosophy of self-diagnosis is technically correct. It also
confirmed two facts, namely: (1) the design of a
self-diagnosable computer must originate with the
system architecture and must be treated as a principal design parameter, and (2) it is mandatory that
the diagnostic programs be automatically generated
by a computer.
This paper describes one way a computer can be
designed to be self-diagnosable. Although much of
what is discussed here has undergone substantial
modification and advancement, it remains fundamental and valid.
Before discussing the design techniques used to
make the DX-1 a self-diagnosable computer, we
present an abstract model as a basis for design and
then describe the DX-1's organizational structure
and operation in its normal mode. We conclude
with certain specific results.
A MODEL OF SELF-DIAGNOSABLE DIGITAL
SYSTEMS
It is well known that a digital system S1 possessing
a certain configuration is capable of diagnosing single
solid faults of another digital system S2, provided
that all the nonredundant inputs and outputs of system S2 are accessible to S1. In particular, where system S2 consists of combinational logic, this method
has been demonstrated for various systems.1,2,3
When a digital system is partitioned under certain restrictions into subsystems it is possible to
achieve self-diagnosis of the system through the
mutual diagnosis of its subsystems.
Diagnosis and Diagnostic Systems
We state briefly, without proof, a method of diagnosis of logical malfunctions of a digital system
and the requirements of a diagnostic system to be
used for such diagnosis.
1. Let S1 and S2 be two examples of a digital
system S, where S1 is the system to be diagnosed and S2 is known to be nonmalfunctioning. The diagnostic method consists of
applying a sequence of stimuli to the inputs
of S1 and S2 and of comparing the outputs
of S1 to the corresponding outputs of S2.
2. When S contains combinational circuits
only, the sequence of stimuli may consist
of all distinct combinations of input signals
or some subset thereof. By utilizing the results of the comparison previously referred
to in (1), it is possible to decide whether
S1 is malfunctioning, if the sequence of
stimuli is properly chosen. In most cases,
it is possible to obtain better diagnostic resolution, in the sense that the malfunction can
be isolated to a relatively small portion of S1.
3. When S contains sequential circuits, a suitable sequence of stimuli may be derived
from the original design information. However, it is difficult to obtain as good a resolution as is possible with combinational circuits.
4. Applying the diagnostic procedure4 and implementing the above-described method, a
digital system R can be used to diagnose
another malfunctioning digital system S,
provided that R has the following capabilities:
F1: Supply the linkage to enable execution of a given sequence consisting of the operations F2, F3, F4, and F5 appearing repeatedly in any order.
F2: Transmit predetermined stimuli to all, or any selected portion, of the inputs of the system S.
F3: Observe the outputs of the system S and compare them with given patterns.
F4: Proceed to the next operation in the sequence or branch to a different point of the sequence of operations, depending on the result of the last F3 operation.
F5: Stop and communicate to the human operator the place in the sequence where the stop occurred, or communicate other chosen information such as the location(s) of replaceable unit(s) that have been diagnosed as being faulty.
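For a purely combinational system, the comparison method of (1) and (2) can be sketched in modern notation: apply every input combination (operation F2) to the suspect copy and the known-good copy, compare outputs (F3), and record the stimuli on which they disagree (F4/F5). The two-input circuits below are invented for illustration.

```python
# Sketch of the comparison method for a combinational system: exercise the
# suspect copy S1 and the known-good copy S2 with all input combinations
# and record every stimulus on which their outputs disagree.
from itertools import product

def diagnose(suspect, reference, n_inputs):
    """Return the input combinations on which the two copies disagree."""
    failures = []
    for stimulus in product((0, 1), repeat=n_inputs):   # F2: apply stimuli
        if suspect(*stimulus) != reference(*stimulus):  # F3: compare outputs
            failures.append(stimulus)                   # F5: report location
    return failures

good_nand = lambda a, b: 1 - (a & b)   # a healthy NAND gate
stuck_nand = lambda a, b: 1            # the same gate with output stuck-at-1

mismatches = diagnose(stuck_nand, good_nand, 2)   # disagrees only on (1, 1)
```

Only the stimulus (1, 1) distinguishes the stuck-at-1 fault, illustrating the remark that resolution depends on choosing the stimulus sequence properly.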
A Model of Self-Diagnosable Digital Systems
A system S is self-diagnosable if S can be partitioned into n mutually exclusive subsystems, S1, S2,
..., Sn for n ≥ 2, and the following conditions hold:
(a) Diagnosability of subsystems Si:
Each Si, i = 1, 2, ..., n, must consist of either combinational circuits only; or, if it contains sequential circuits, a sequence of stimuli for its diagnosis must be possible and known.
(b) Existence of diagnostic subsystems:
Among the subsystems S1, S2, ..., Sn, there exist subsystems, each of which can perform all the five operations F1, F2, ..., F5. These subsystems form the class of the diagnostic subsystems, which we will denote by CDS: CDS = {Sj}, j = i1, i2, ..., im, m ≤ n.
(c) Communication between Si and CDS:
The inputs and outputs of each subsystem Si, i = 1, 2, ..., n, are accessible to at least one of the diagnostic subsystems (other than itself if it is in CDS).
Following the discussion in the preceding section, we assume that a subsystem Si is diagnosable
if its input and output terminals are accessible to a
nonmalfunctioning subsystem Sa ∈ CDS. The word
"diagnosable" implies that the action of diagnosis
described in the preceding section is possible by using
Sa as the diagnostic digital system where Si is the
digital system being diagnosed. We assume also that
S is diagnosable if all the Si are diagnosable. Furthermore, we assume that when a malfunctioning subsystem is diagnosed, it is immediately repaired.
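The accessibility assumption and the immediate-repair assumption together suggest a simple closure computation, sketched here in modern notation: starting from one known-nonmalfunctioning member of CDS, a subsystem becomes diagnosable whenever a trusted subsystem has access to its terminals, and a repaired CDS member then joins the set of trusted diagnosers. The chain topology in the example is invented for illustration.

```python
# Sketch of mutual diagnosis as a closure: a subsystem is diagnosable once
# a trusted (known-nonmalfunctioning or already-repaired) subsystem has
# access to its input and output terminals; repaired CDS members become
# trusted diagnosers in turn. The example data are hypothetical.

def diagnosable(accessible_from, cds, known_good):
    """accessible_from[x] lists the subsystems with access to x's terminals."""
    trusted = {known_good}
    diagnosed = set()
    changed = True
    while changed:
        changed = False
        for unit, diagnosers in accessible_from.items():
            if unit not in diagnosed and trusted & set(diagnosers):
                diagnosed.add(unit)        # a trusted subsystem reaches it
                if unit in cds:
                    trusted.add(unit)      # once repaired, it can diagnose others
                changed = True
    return diagnosed

access = {"S2": ["S1"], "S3": ["S2"], "S4": ["S3"]}
result = diagnosable(access, cds={"S1", "S2", "S3"}, known_good="S1")
```

With S1 known good, S2 is diagnosed and repaired, then diagnoses S3, which in turn diagnoses S4; this is the intuition behind the diagnostic sequences defined in the next subsection.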
Classes of Subsystem Failures for Which the Model
is Self-Diagnosable

Certain conditions of failure are inconsistent with self-diagnosability, the extreme case being simultaneous malfunctioning of all subsystems. We will examine the classes of failures for which the model defined above is self-diagnosable.

1. It is possible to construct a self-diagnosable system S if at least two subsystems among S1, S2, ..., Sn belong to CDS.
2. S is self-diagnosable, independent of the number of malfunctioning Si (i = 1, 2, ..., n), if no more than one of the malfunctioning Si is in CDS.
3. S is self-diagnosable no matter how many of the Si are malfunctioning if, for each of the malfunctioning Si in CDS, there exists at least one known nonmalfunctioning Sj ∈ CDS, and the inputs and outputs of Si are accessible to Sj.
4a. Let Si1, Si2, ..., Sim be a subset of S1, S2, ..., Sn. Si1, Si2, ..., Sim is said to form a diagnostic sequence from Si1 to Sim if the input and output terminals of Sip are accessible to Sip-1 for p = 2, 3, ..., m, while Si1, Si2, ..., Sim-1 ∈ CDS and Si2, Si3, ..., Sim are malfunctioning subsystems. Then, Sim is diagnosable if Si1 is known to be nonmalfunctioning.
4b. S is self-diagnosable no matter how many of the Si (i = 1, 2, ..., n) are malfunctioning, if there exists at least one nonmalfunctioning Sj ∈ CDS and a diagnostic sequence can be formed from a nonmalfunctioning Sj to each malfunctioning Si.

Extensions of the Model of Self-Diagnosable Digital Systems

In this subsection, we will consider the extensions of the self-diagnosable digital system model, taking into account realistic situations as found in actual system design. We shall define S; S1, S2, ..., Sn; and CDS as above (see A Model of Self-Diagnosable Digital Systems).

1. Let Sa be any of the Si ∈ CDS. Let Sa be further partitioned into subsystems Sa1, Sa2, ..., Sak. The partition possesses the following properties:
(a) Sai, i = 1, 2, ..., k, are also subsystems of S.
(b) Sa is nonmalfunctioning if each of the Sai (i = 1, 2, ..., k) is nonmalfunctioning.
(c) When Sa contains no redundant or irrelevant elements, each Sai (i = 1, 2, ..., k) is nonmalfunctioning if Sa is nonmalfunctioning.
2. Forming new diagnostic subsystems during diagnosis: Let Sa1, Sa2, ..., Sak be the subsystems of a diagnostic subsystem Sa resulting from a partitioning of Sa. Let Si1, Si2, ..., Sip be some of the subsystems Si of S.
(a) If a new diagnostic subsystem Sβ can be formed by combining Si1, Si2, ..., Sip and some or all of the Sai, say Sa1, Sa2, ..., Sam, then we can consider Sβ as the result of a new partitioning of S.
(b) If all the Si1, Si2, ..., Sip and Sa1, Sa2, ..., Sam are diagnosable, then Sβ is diagnosable.
3. (a) Si1, Si2, ..., Sim are said to form a chain diagnostic sequence from Si1 to Sim if they possess the properties of a diagnostic sequence and at least one of the Sij is formed by a new partition of S immediately after the Sij-1 is diagnosed and repaired, and each of these Sij is a combination of known nonmalfunctioning subsystems of S in the new partition.
(b) Sim is diagnosable if Si1 is known to be nonmalfunctioning.
(c) S is self-diagnosable no matter how many Si (i = 1, 2, ..., n) are malfunctioning if there exists at least one nonmalfunctioning Sj ∈ CDS, and a chain diagnostic sequence can be formed from a nonmalfunctioning Sj to each malfunctioning Si.
4. Indirect access in diagnosis: Consider Fig. 1, where a diagnostic subsystem Sa has access to the inputs but not the outputs of S1, and to the outputs of S2, S3, and S4, while at the same time the output of S1 can control the inputs of S2, S3, and S4. S1, S2, S3, and S4 can be diagnosed by using a majority technique if only one of these four is assumed to be malfunctioning. This can be done as follows: (a) apply stimuli to S1 and observe the output indirectly through the outputs of S2, S3, and S4 individually; and then (b) accept only the majority of the three results. It is possible to use this technique in a variety of ways, including the joining of diagnostic sequences and chain diagnostic sequences.
5. Partition vs. resolution in diagnosis: Even though each Si (i = 1, 2, ..., n) is diagnosable, the resolution of diagnosis depends on the configuration of Si. However, if S is partitioned such that each of the Si is small enough and preferably packaged in a single replaceable unit, it is not necessary to obtain better resolution than diagnosis to an Si.
6. Hard core: When it is not possible to partition S into mutually exclusive diagnostic subsystems, and additional hardware cannot be added for economic or other reasons, some elements, say H(S), are then necessarily common to two or more diagnostic subsystems, say Si1, Si2, ..., Sim. If the nonmalfunctioning of at least one of these Sij is necessary for S to be self-diagnosable, we must be certain that H(S) is not malfunctioning prior to the commencement of diagnosis. In this case, H(S) is called the "hard core" of S.
When a hard core H(S) exists, S is not completely self-diagnosable. We can, however, design H(S) to utilize self-checking hardware, and we can employ manual methods to diagnose H(S). When H(S) is tested and known to be nonmalfunctioning, we may leave H(S) alone and treat S as a self-diagnosable system.
7. Use of additional hardware during diagnosis: Certain parts of a self-diagnosable digital system might not be necessary for the normal operation of the system. Such parts therefore need not be connected to the system except when the system is to be used for self-diagnosis.

Figure 1. Indirect access.

Diagnostic Procedure
For a given self-diagnosable digital system, there
are various ways of constructing diagnostic procedures. We will present a method which will aid in
determining whether a system is self-diagnosable. If
this is the case, this method will also assist in the
construction of a diagnostic procedure. A simple
example is given in conjunction with this discussion.
1. List all the mutually exclusive subsystems S1, S2, ..., Sn which are the result of all the expected partitions of S. For each subsystem Si (i = 1, 2, ..., n) list the diagnostic subsystem or subsystems (denoted by DSj) by which Si is diagnosable, either by direct or indirect access.
In the example, there are eight subsystems. They are given in Table 1.
Table 1. Subsystems

Subsystems          Diagnosed by
S1 ................ DS2
S2 ................ DS2
S3 ................ DS1
S4 ................ DS4
S5 ................ DS3
S6 ................ DS1
S7 ................ DS2
S8 ................ DS4
1077
A SELF-DIAGNOSABLE COMPUTER
2. List all diagnostic subsystems in the same manner. Furthermore, for each diagnostic subsystem DSj, j = 1, 2, ..., k, list all the subsystems resulting from its partition. There are four diagnostic subsystems in the example. They are given in Table 2.

Table 2. Diagnostic Subsystems

Diagnostic Subsystems          Diagnosed by
DS1 (S1, S2) ................. DS2
DS2 (S3) ..................... DS1
DS3 (S1, S6) ................. DS2, DS1
DS4 (S1, S2, S3, S5) ......... DS2, DS1, DS3

The method of partitioning and the diagnostic system requirements are necessarily interacting. Therefore, Tables 1 and 2 should be constructed simultaneously. Several trials are likely to be required before a satisfactory list can be constructed. If a list cannot be constructed, the system is not self-diagnosable.
3. Denote A → B to mean that subsystem B is diagnosable by subsystem A, and AB → C to mean that subsystem C is diagnosable by the combination of subsystems A and B, etc. By examining Tables 1 and 2, using the relationship that a diagnostic subsystem is diagnosable if all the subsystems resulting from its partition are diagnosable, construct a diagnostic diagram as in Fig. 2. The diagnostic diagram shows how each of the diagnostic subsystems may be diagnosed.
Figure 2. Diagnostic diagram.
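In modern notation, the bootstrapping rule of step 3 — a diagnostic subsystem becomes diagnosable once every subsystem of its partition is diagnosable — can be sketched as a fixed-point computation. The table data below is illustrative only; the example's exact memberships are ambiguous in the original.

```python
# Illustrative partitions and test relations in the style of Tables 1 and 2;
# these memberships are hypothetical, not the paper's exact example.
partition = {                      # DSj -> subsystems composing it
    "DS1": {"S1", "S2"}, "DS2": {"S3"},
    "DS3": {"S5", "S6"}, "DS4": {"S4", "S7", "S8"},
}
diagnosed_by = {                   # Si -> the DSj able to test it
    "S1": {"DS2"}, "S2": {"DS2"}, "S3": {"DS1"}, "S4": {"DS4"},
    "S5": {"DS3"}, "S6": {"DS1"}, "S7": {"DS2"}, "S8": {"DS4"},
}

def diagnosable_closure(verified_ds):
    """Bootstrap from one diagnostic subsystem assumed good (the hard core):
    a subsystem is diagnosable once a proven DS can test it; a DS is proven
    once all subsystems of its partition are diagnosable."""
    ok_ds = {verified_ds}
    ok_s = set()
    changed = True
    while changed:
        changed = False
        ok_s = {s for s, testers in diagnosed_by.items() if testers & ok_ds}
        for ds, parts in partition.items():
            if ds not in ok_ds and parts <= ok_s:
                ok_ds.add(ds)      # all its parts check out
                changed = True
    return ok_s, ok_ds
```

With this particular data, bootstrapping from DS2 also proves DS1 but leaves S4, S5, and S8 untestable — which is why a hard core common to several diagnostic subsystems must be verified first.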
DX-1 AS A COMPUTER
Figure 3 presents the data flow within DX-1. In its normal mode, the DX-1 is a binary, fixed-point, parallel, microprogrammed digital computer. It includes a single bus, six arithmetic registers (R1
Figure 3. DX-1 data flow (ROM, maximum of 256 x 16 = 4096 words; X and Y select; MAR decode; Controls A and B; A bus; parity check; timing, control, and arithmetic registers).
Figure 6. Example of bootstrapping.
(a) B BUS Diagnosis
A diagnostic program (Fig. 7) for detecting a
stuck B BUS bit follows.
1. Load 11111 in R3A.
2. XFER R3A into R2A by way of the B BUS.
3. Compare R2A with 11111.
4. Branch on nonzero to 8. (B BUS stuck to zero.)
5. Repeat 1 through 4, loading and comparing with 00000.
6. Branch on nonzero to 9. (B BUS stuck to one.)
7. Branch to (b) Timing Diagnosis. (B BUS not stuck.)
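In sketch form, the stuck-bus check above writes an all-ones and then an all-zeros pattern through the bus and compares the result; the bus model below is hypothetical, for illustration only.

```python
def xfer_via_bus(value, stuck_zero_mask=0, stuck_one_mask=0):
    """Model a 5-bit transfer over the B BUS with optional stuck bits."""
    return ((value & ~stuck_zero_mask) | stuck_one_mask) & 0b11111

def diagnose_bus(xfer):
    """Return 'stuck-0', 'stuck-1', or 'ok', mirroring steps 1-7."""
    if xfer(0b11111) != 0b11111:   # steps 1-4: ones fail -> stuck to zero
        return "stuck-0"
    if xfer(0b00000) != 0b00000:   # steps 5-6: zeros fail -> stuck to one
        return "stuck-1"
    return "ok"                    # step 7: proceed to timing diagnosis
```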
(b) Timing Diagnosis (of t4B and t5B)
Figure 7. Bus diagnosis.
The sample diagnostic program for this portion of the machine indicates that we cannot distinguish between two failures (namely, a permanent clear condition set up by t4B and a missing signal to open the input gates on t5B); this is the worst resolution obtained by any diagnostic routine. Note that diagnosis of failures assumes explicitly that only a single fault has occurred. A part of the timing diagnosis program follows:
1. Load each RNB (i.e., all B half-registers) with 11111.
2. Compare R1B and R2B with 00000.
3. Branch on both nonzero to 9. (If and only if the branch does not occur, failures 8, 10, 1, 3 are possible.)
4. Compare R3B and R4B with 00000.
5. Branch on both nonzero to 8. (If and only if the branch does not occur, failures 8, (10 and 12), 1, (3 and 5) are possible.)
6. Stop. The failure is number 8 or 1. Write out location of card(s) causing these failures.
7. Branch to 9.
8. Stop. The failure is number 12 or 5. Write out locations of card(s) causing these failures.
9. Compare R5B and R6B with 00000.
1084
PROCEEDINGS -
FALL JOINT COMPUTER CONFERENCE,
10. Continue to diagnose other failure numbers.
It should be noted that the diagnostic program
used a "serial voting" technique to make the tests
insensitive to a single bit of a register being stuck
to 1.
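Under the stated single-fault assumption, "serial voting" amounts to repeating a comparison across several registers and accepting the majority verdict, so that one stuck register bit cannot fake a timing failure. The register values below are hypothetical.

```python
def majority_verdict(observations):
    """Accept the verdict reported by a majority of the test repetitions.

    With at most one solid fault, at most one observation can lie, so the
    majority of three or more repetitions is trustworthy.
    """
    yes = sum(1 for obs in observations if obs)
    return yes > len(observations) / 2

# Three registers, loaded with all ones and read back over the suspect
# path; one register has bit 2 stuck at 0 (the single fault).
reads = [0b11111, 0b11011, 0b11111]
votes = [r == 0b11111 for r in reads]
```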
(c) CONTROL B Diagnosis
Because each BT bit was connected to five AND gates and each BF bit was connected to ten AND gates, a serial voting technique was used to diagnose these fields, independent of a possible failure in the gates or registers they control. We placed a 1 in each of a field's four bit positions successively, each time attempting to execute an XFER micro-operation. If three or more of the XFERs actually transferred data, we knew that the bit position common to the XFERs in that field was stuck to 1. Similarly, we tested a field for a bit position stuck to zero by trying each allowed bit configuration as an address of a XFER. If two or more of these XFERs failed to work, their common bit position must have been at fault.
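The stuck-to-zero half of this technique can be sketched as follows; the fault model and address set are hypothetical, chosen only to show the "common bit of the failures" inference.

```python
# Hypothetical 4-bit control field: an XFER addressed with word `addr`
# actually sees (addr | stuck1) & ~stuck0. An XFER "works" when the
# address the hardware sees equals the address the program intended.
def xfer_works(addr, stuck0=0, stuck1=0):
    return ((addr | stuck1) & ~stuck0 & 0b1111) == addr

def common_stuck_zero_bit(addrs, works):
    """Serial vote for a stuck-to-0 field bit: if two or more XFERs fail,
    their common address bit must be at fault (single-fault assumption)."""
    failed = [a for a, ok in zip(addrs, works) if not ok]
    if len(failed) < 2:
        return None
    common = failed[0]
    for a in failed[1:]:
        common &= a                # intersect the failing addresses
    return common or None

addrs = [0b0001, 0b0010, 0b0100, 0b1000, 0b0011, 0b0101]
works = [xfer_works(a, stuck0=0b0100) for a in addrs]
```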
The diagnostic programming example for the
control portion of the machine diagnoses BT for a
bit stuck to 1.
1. Load each RNB with 11111.
2. XFER R1B into R2A by way of the B BUS.
3. Compare R2A with 00000.
4. Branch to 7 on nonzero.
5. Repeat 2 through 4, taking R2B, R3B, etc., in sequence.
6. Branch to BF one-stuck diagnosis.
7. Write out number of last register read from.
8. Stop if this is the third execution of this instruction, and write out location of card causing this failure.
9. Branch to 2, taking next register in sequence.
Note that if the output record contains the numbers of three registers with a common address bit, then this bit of BT must be stuck to 1 even if one or two failures exist in the registers and gating involved.
(d) Arithmetic Registers Diagnosis
Knowing that the B BUS, control, and timing were operational, we proceeded to test and/or diagnose faults in individual bits of individual arithmetic registers. In the program for diagnosing this portion of the system, "AND R6A with x" means: load x into R5A, and pass R6A and R5A through the A half of the AND GATES on an XFER which puts their logical AND in the OPR. For example, if x is 00001, then AND R6A with x effectively places only the low-order bit of R6A in the OPR. The complete diagnostic program for arithmetic register diagnosis follows.
1. Load each RNB with 11111.
2. XFER R1B into R2A by way of the B BUS.
3. Compare R2A with 11111.
4. Branch to 8 on nonzero. (A bit is stuck to zero.)
5. Repeat 2 through 4 for R2B, R3B, etc., in sequence.
6. Repeat 1 through 5, loading and comparing with 00000.
7. Branch to ROM Diagnosis routine.
8. AND R6A with 11100.
9. Branch to 14 on zero.
10. AND R6A with 11110.
11. Branch to 13 on zero.
12. Stop. The fourth bit of the last register read from is stuck. Write out the location of its card.
13. Stop. The fifth bit of the last register read from is stuck. Write out the location of its card.
14. AND R6A with 11011.
15. Branch to 17 on nonzero.
16. Stop. The third bit of the last register read from is stuck. Write out the location of its card.
17. AND R6A with 01111.
18. Branch to 20 on nonzero.
19. Stop. The second bit of the last register read from is stuck. Write out the location of its card.
20. Stop. The first bit of the last register read from is stuck. Write out the location of its card.
Steps 14 through 20 of this program simply determine which bit of a register is stuck.
(e) Special Gating Diagnosis
The ADDER, AND GATE, and ZERO TEST GATE in SB in our example were checked and diagnosed by exhaustion techniques. Since we were dealing with a four-bit device, this was practical. In the
general case, where the number of tests required by
exhaustion is unreasonable, we would have used tests
which were generated through a process of computer
analysis and simulation.
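Exhaustion is indeed cheap at four bits: a 4-bit adder has only 16 x 16 = 256 input pairs. A sketch, with a deliberately injected fault for illustration (the fault model is invented):

```python
def adder4(a, b, fault=False):
    """Model of a 4-bit adder; `fault` forces sum bit 2 to 0 (illustrative)."""
    s = (a + b) & 0b1111
    return s & ~0b0100 if fault else s

def exhaustive_check(unit):
    """Compare the unit against the arithmetic reference on all 256 inputs."""
    return all(unit(a, b) == (a + b) & 0b1111
               for a in range(16) for b in range(16))
```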
Diagnosis of the parity checking circuitry in SB was
also done by exhaustion. Here, however, the output
of the parity checking circuitry of SB was observable
by DSA. This required that a gate be provided to
direct the parity check output in SB to a bit position
of the A BUS.
SIMULATION RESULTS AND SUMMARY REMARKS
The DX-1 logic was built into a simulator which
was designed to be processed on the IBM 7090.
The simulator is capable of operating the machine
in two distinct modes-normal and self-diagnostic.
The normal mode was used to debug the original computer design, the read-only memory programs, the read-write memory programs, and the simulator design. It was also used to prove that the DX-1 is a computer. The normal mode simulation served its purpose in that it accomplished the above objectives and also proved that the DX-1 can perform all of the operations specified in the earlier sections of this paper. The diagnostic mode was used to evaluate the
philosophy of self-diagnosis and to investigate
problems encountered in the application of this
principle.
It was assumed that only solid single faults can
occur. The diagnostic inputs and procedures were
manually generated. Several changes in the design
resulted from the logic simulation. The simulation
did not consider the failures of ROM and/or RWM
and their corresponding addressing systems.
The results of the simulation are as follows:
1. 94 percent of the DX-1 CPU logic was
self-diagnosable.
2. 14.6 percent of the DX-1 CPU logic was
added to achieve self-diagnosis.
3. 1 percent of the DX-1 CPU logic was hard
core.
4. MTTD was less than 2 minutes for an input rate of 600 cards per minute.
Experience in simulation showed that the manual
effort of preparing a diagnostic program is a long
and tedious task and errors in judgment are always
possible. Moreover, any modification in the program, because of a change in load assignment or control register coding, can become very formidable. Engineering changes could result in the scrapping of a computer program. These problems would
become almost trivial if a computer could be used
to generate the required diagnostic procedures or
programs. Therefore, if the theory of self-diagnosing computers is to become practical for a family of
machines, further study and development of machine generation of diagnostic procedures is necessary.
The degree of discrimination between the diagnostic results is highly dependent upon the ground
rules followed in packaging the equipment. Packaging also has a major influence on the total time
(MTTD) required to process the diagnostic program. This emphasizes the important fact that the
design of a self-diagnosable computer must originate with the machine specification and must be
treated as a principal design parameter.
The simulation study of the DX-1 has shown that
both its design and the basic concept and philosophy of self-diagnosis are technically correct. However, many questions must be answered before the
concepts of self-diagnosis and self-repair can become physical realities.
In conclusion, consider the following questions as
samples:
1. How do we automate the generation of the
diagnostic procedure?
2. How do we automatically initiate the diagnostic procedure?
3. What techniques, applied singly and/or in combination, provide the greatest improvement in reliability and/or availability?
4. How can the design be optimized with respect to performance, cost, speed, and/or
complexity? That is, what tradeoffs are
possible?
5. Can the concept be extended to encompass
other CPU's and systems as related to the
concept of a "Central Diagnostic Computer"?
6. What is the effect on diagnosability, cost,
and maintenance of partitioning into more
than two subsystems?
7. What are the packaging ground rules?
8. What design notation is necessary for automation?
9. What will be the programming requirements?
10. To what extent could our assumptions be
relaxed (e.g., types and number of assumed
failures)?
ACKNOWLEDGMENTS
The authors are indebted to: (1) P. W. Agnew, who was a principal contributor to the DX-1 design effort and coauthor of two of the three IBM technical reports which constitute this paper; (2) J. R. Belford for his active and sustaining support; (3) K. C. Dickerson, J. S. Jephson and V. Y. Lum for their simulation and evaluation of the DX-1; and (4) Prof. D. E. Muller for his illuminating discussions and comments on the theory of self-diagnosing and self-repairing computers.
REFERENCES
1. R. E. Forbes, "The Case for Computers in Automatic Checkout Equipment," Proceedings of a Seminar on Automatic Checkout Techniques, Battelle Memorial Institute, Sept. 5-7, 1962, pp. 51-65.
2. K. Maling and E. L. Allen, "A Computer Organization and Programming System for Automated Maintenance," IEEE Trans. on Electronic Computers, vol. EC-12, no. 5, pp. 887-895 (Dec. 1963).
3. W. C. Carter et al., "Design of Serviceability Features for the IBM System/360," IBM Journal of Research and Development, vol. 8, no. 2, pp. 115-126 (April 1964).
4. R. E. Forbes, D. E. Muller, and C. B. Stieglitz, "Automatic Fault Diagnosis," AIEE Conference on Diagnosis of Failures in Switching Circuits, Michigan State University, May 15-16, 1961.
AN AUTOMATED INTERCONNECT DESIGN SYSTEM
W. E. Pickrell
Automation Systems, Incorporated
INTRODUCTION
This paper describes a system for automatically designing and producing artwork for interconnect surfaces. This system consists of a number of computer programs which can be a subsystem of a general design automation effort. The interconnect design programs deal with the problems of artwork production and interconnection of electronic components frequently experienced in computer system design and construction.
One such system was developed for a former employer and spanned a period of about 18 months. During this period of development, numerous experiments comparing the automatic interconnect design method to the manual method were performed. This paper discusses the results of these experiments along with some details of the programs and their operation. Before a discussion of automated interconnect design can proceed, some preliminary definitions are required.
The laminate of this paper is a series of single-sided etched boards or layers. Each layer has its own series of conductive interconnect paths. Layer-to-layer communication is accomplished through plated holes drilled through the layer. A group of layers is stacked and bonded together under heat and pressure to form a laminated interconnection board. Electronic components which require interconnection are mounted on the surface (or both sides) of the laminated board according to the scheme of the logic.
The "feed-thru" or "drill-thru" implies a second form of layer-to-layer or side-to-side communication in addition to component terminals. The system accepts either fixed or "floating" locations on the interconnect media.
MANUAL INTERCONNECT DESIGN AND
ARTWORK
As with many computer applications, the beginning of the analysis is the "present" manual method
of how things are done, since this situation is usually the one that requires some improvement in
speed, accuracy, or cost. This holds true equally
well for the interconnect design effort. A typical
circuit board design task can be divided into the
following steps:
1. Prepare a schematic wiring diagram.
2. Prepare a cover (master design pattern).
3. Place components in the best pattern.
4. Assign the external pins.
5. Check the artwork design.
6. Tape the artwork.
7. Check the artwork against the design layout.
8. Photographically reproduce (chronoflex) and reduce the artwork to desired size; produce necessary negatives and positives.
9. Deliver finished negatives to laboratory for
fabrication.
Analysis of these steps shows that some of the work is particularly suited to automation with computer programs. If some or all of the functions could be performed at lower cost, in less time, and with greater accuracy by computer programs, a significant gain in computer technology would be realized. The
following section describes the program system and
programs developed to meet that end, concluding
with a section of remarks and statistical results for
review.
THE PROGRAM SYSTEM
The system includes the following general program areas:
1. Generation of the simulated physical composition of the surfaces in digital form
(mapping or cover layer generation).
2. Organization of the data to be interconnected (routed). This may include: selection processes, determination of a minimal
connection tree for a signal net, construction of various sort keys and selected sequences of sorts. These are all designed to
facilitate and increase the yield of the
routing phase which normally follows.
3. Routing of the organized data strings
against the simulated physical environment
of the surfaces taking into account any
special constraints imposed by the hardware system being processed.
4. Editing and generating inputs to graphic
devices such as plotters for covers and
routed surfaces.
5. Auxiliary programs providing secondary
passes or continued processing such as:
(a) routing update - to allow manual intervention,
(b) net change - to pass first run failures
onto alternate paths,
(c) drill through - a subsequent run to the
router phase if further laminate interlayer communications are required.
1965
THE PROGRAMS
This interconnect design system is modular and
independent of other systems and, as such, operates
on a fixed placement of components on the board as
provided by a Logic Assignment and Placement
Subsystem which precedes it. The programs are best described by function, input, and output, in the sequence in which they normally operate.
1. Title: Cover Layer
Function: Simulate the environment in digital
form for interconnection (grid, size
of board, obstacles, number of layers,
external connectors, component pin
arrays).
Apply grid coordinates to all necessary data.
Provide for special requirements such
as clock, voltage singularities, deletion of grounds, etc.
Inputs:
(1) String list (signal nets with component-pin identifications) as assigned and placed.
(2) External connectors available.
(3) Obstacle descriptors, drill-thru locations, module or chip arrays.
(4) Signals to be deleted and signals to special layers.
Outputs:
(1) Cover layer tape (includes all routine obstacles).
(2) Special layers signal nets tape.
(3) General layers signal nets tape.
(4) External connectors mesh.
(5) Printer plot of cover layer; errata list.
2. Title: Organizer
Function: Inspect each signal net. Select an external where required. Produce the minimal tree for interconnect based on straight-line distances. Establish sort keys for data organization for the Router. Compute slope class, distance, number of pins/net, and signal priority. Sort as directed.
Inputs:
(1) Signal nets tape.
(2) Control data for external selection.
(3) Slope class criteria.
(4) Sort criteria (sequences, keys).
Outputs:
(1) Connect input tape(s) in specified sort.
(2) Class statistics, errata, external connection list.
(3) External connection punched cards (optional).
3. Title: Router
Function: Set the environment into memory. Route the interconnect of two points according to the organization, constraints, and bounded routing area. Record the path if successful. Record the event of a failure for further routing trials on subsequent layers or update processing. Set a coordinate tape for plotter editing. Record the path in core as a layer history.
Inputs:
(1) Cover layer tape.
(2) Connect input tape.
(3) Media parameters (grid, number of layers, etc.).
Outputs:
(1) Coordinate lists (for each layer) tape.
(2) Fail list (unconnected pairs of pins) tape.
(3) Layer or side history (cover + paths for each) tape.
(4) Statistics and layer listing.
4. Title: Plot Editors
Function: Translate data from cover layer tape
and coordinate list tape into formats
required for the plotting devices (i.e.,
CALCOMP, GERBER).
Inputs:
(1) Cover layer tape.
(2) Coordinate list tape.
(3) Control data (scaling, conversion factors, etc.).
Outputs:
(1) Plotter tapes (cover + layers), either magnetic or paper, as required.
(2) Signal name tape (optional).
(3) Line length lists (by layer and by signal).
AUXILIARY PROGRAMS
1. Title: Router Update
Function: Reconstruct cover layer and routed
history from the initial routing run.
Process manually derived inputs
through erasure and rerouting to attain total interconnect for plot editing and artwork.
Provide sufficient errata reports on
fails or invalid conditions to allow
feedback through the man-machine
loop.
Provide outputs for checking continuity of signal interconnection.
Inputs:
(1) Cover layer tape.
(2) Coordinate list (1st run) tape.
(3) Layer parameters.
(4) Update list (deletions, insertions).
Outputs:
(1) Coordinate list tape (updated).
(2) Listing of connections updated, fails, and errata.
2. Title: Net Change
Function: Reconstruct layer history (1st run) and try to route from a list of alternate paths. Use only original failures where alternate paths are available to interconnect the signal net.
Inputs:
(1) Layer history tape (1st run).
(2) Coordinate list tape (1st run).
(3) Net prep tape (a list of alternate routes).
Outputs:
(1) Layer history tape (updated).
(2) Coordinate list tape (updated).
(3) Fail list tape.
3. Title: Drill Thru
Function: Apply a subsequent run to the router program with the capability of using "drill thru" locations for layer-to-layer communication.
Inputs:
(1) Layer history tape (1st run).
(2) Coordinate list tape (1st run).
(3) Fail list tape (1st run).
Outputs:
(1) Layer history tape (updated).
(2) Coordinate list tape (updated).
(3) Fail list tape.
RESULTS
The following table illustrates the degree of success in using the design system. Five different laminate boards (in five different hardware systems)
were designed. In all cases, no drill-thru pass was
used. In one case the manual update feature was
employed to attain 100 percent interconnection.
Components    Board Size    Layers    Inputs    Paths    Fails    Percent
Modules       346 x 139     6         593       518      75       87
Modules       268 x 168     7         1006      965      41       96
Chips         70 x 84       6         685       670      15       97
Chips         88 x 105      5         712       659      53       91
Chips         98 x 78       4         312       304      8        97
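In the table, Fails is Inputs minus Paths, and Percent is (approximately) Paths over Inputs. A quick check of the transcription, in modern notation:

```python
# (inputs, paths) per board, transcribed from the results table above
boards = [(593, 518), (1006, 965), (685, 670), (712, 659), (312, 304)]
fails = [i - p for i, p in boards]               # unrouted pin pairs
percent = [100 * p / i for i, p in boards]       # interconnect yield
```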
Direct comparisons of manual design versus the man/machine method for multilayer laminates have shown the following results:
shown the following results:
1. A calendar time compression to 5-10 days
for the design cycle.
2. Cost reductions to the project of 50 percent or more.
3. An accuracy or reliability factor unobtainable by manual approaches.
4. Shorter line lengths for etched paths.
5. By-product provisions for automated tooling for board manufacture.
Figure 1 illustrates in flow form the programs
(input/output) comprising the basics of a design
system.
Additional items of importance derived from this
study are:
1. "Chips" on a board result in a higher yield
of interconnect in a shorter period of time.
This physical configuration offers a better
distribution of pins about the board,
simplifying the interconnect problem.
2. Presentation of the data, pairs of points, in the following sequence is optimal for the routing phase:
(a) Classification by slope of the straight line connecting the two points.
(b) Straight-line distance between the two points.
(c) Minor sorts on signal priority and number of pins in the signal net.
Since the deterministic method used in the routing phase does not yield 100 percent interconnect
design, in most cases, manual intervention is required. This phase involves an analysis of the failures against the plotted layers. Subsequent introduction of the failures into the machine solution is
achieved through the preparation of update and insertion inputs to the Router Update Program.
Figure 1. Program flow of the design system (Cover Layer, Organizer, Router, and Plot Edit programs with their inputs and outputs).
The environment of the interconnect medium must be completely described in the following areas: module or chip pin arrays; drill-thru or via locations and identifications; dimensions of the layer or board (it is highly desirable to design the board to an equally spaced grid in both directions); cutouts, mounting, or fabrication areas where routing is not allowed; and the external connector array where signals enter or leave the board.
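In sketch form, such a cover layer reduces to a routing grid whose cells are marked free or blocked; every name and dimension below is invented for illustration.

```python
# A toy cover layer: a routing grid in which pins, obstacles, and other
# forbidden areas are marked as blocked cells. Dimensions are illustrative.
FREE, BLOCKED = 0, 1

def make_cover_layer(width, height, obstacles, pin_arrays):
    grid = [[FREE] * width for _ in range(height)]
    for (x, y) in obstacles:              # cutouts, mounting areas, etc.
        grid[y][x] = BLOCKED
    for origin_x, origin_y, pins in pin_arrays:
        for (dx, dy) in pins:             # a module or chip pin array
            grid[origin_y + dy][origin_x + dx] = BLOCKED
    return grid

cover = make_cover_layer(
    width=10, height=8,
    obstacles=[(0, 0), (9, 7)],
    pin_arrays=[(2, 2, [(0, 0), (1, 0), (0, 1), (1, 1)])],
)
```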
In general, for a multilayered laminate, the cover
layer (environment) is constant for all cases. However, some newer manufacturing techniques are
being used ("Post" or "build-up") that permit deletion of pins which are net terminals in all layers
below that of the interconnection. The concept here
is that this deletion "opens up" further areas for
interconnect routing as the process proceeds. This
is, of course, highly desirable for boards of high interconnect density and tightly packed pin arrays.
This process requires "customizing" the router program to reflect the deletions. In addition, if one desired pen plots as design or update tools, it would require the generation of n cover layers for an n-layer laminate.
For the two-sided board, one would normally
have two cover layers reflecting the environments
on the respective sides.
The function of a Cover Layer Program is to
generate this environment and append all coordinates to the net input list for subsequent program
processing.
Organization of the data to meet changing requirements of board design, external usage, size,
singularities of environment, and so on can very
well be the key to the degree of success attained in
the routing phase.
The methods employed in the router phase of a
system such as this are not iterative. Thus the sequence of presentation of inputs to the router program is extremely important.
Experience in several different environments indicates that so far a typical, practical approach
might include some of the following:
1. Inspect the environment for singularities
which can prove useful in selecting a set of
criteria. For example, a well-distributed
pin population on a square board might
suggest the use of slope classes in equal
angle segments.
2. Also with respect to slope, for laminates it
is generally good to equate the number of
classes with the number of layers.
3. Specification of pairs of pins selected for
interconnection of an entire signal (net) is
normally the result of a minimal tree calculation based on straight line distances
pin-to-pin.
4. Selection of external connectors available
for signals entering or leaving the board
can also be accomplished during the tree
phase.
5. Other considerations in this stage prior to final sorting are the number of pins in the net (this can be important in an update process) and special priorities to be considered in the routing phase.
At this point, an additional function of an Organization Program is to monitor the sorting of the data for which the various keys mentioned above have been included. For example, a typical sort sequence preparatory to routing might be:
(a) Slope class (numeric).
(b) Distance (in grid units).
(c) Number of pins/net.
(d) Priority.
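The organizer steps above — the minimal connection tree over straight-line pin-to-pin distances, and the preparatory multi-key sort — can be sketched as follows. Prim's algorithm is one way to build such a tree (the original does not name its method), and the connect records are invented.

```python
import math

def minimal_tree(pins):
    """Prim-style sketch of the minimal connection tree for one signal
    net, using straight-line pin-to-pin distances."""
    in_tree, rest, edges = [pins[0]], list(pins[1:]), []
    while rest:
        # shortest edge from the tree so far to an unconnected pin
        a, b = min(((p, q) for p in in_tree for q in rest),
                   key=lambda e: math.dist(*e))
        edges.append((a, b))
        in_tree.append(b)
        rest.remove(b)
    return edges

# Each connect record: (slope_class, distance, pins_in_net, priority);
# tuple comparison applies the keys (a)-(d) in exactly that order.
connects = [(2, 14, 3, 1), (1, 30, 2, 0), (1, 12, 5, 0), (2, 14, 2, 0)]
ordered = sorted(connects)
```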
The routing phase of the design system generally has little freedom in changing the sequence of processing the input data. However, a moderate amount of override capability should exist. For example, statistics from the organizer phase may indicate that several classes should be attempted on layer 1, rather than the single class originally planned. At that point it would be desirable to have the capability to modify the presentation with, say, input cards rather than returning to the organizer phase.
A rather important aspect of a routing program
is the ability to "customize" the routing algorithm
routines to meet particular requirements of a board
or laminate. Certain restrictions or constraints can
be imposed in this area which would be germane to
a certain type of environment. For example, spacing
between etched path and adjacent pad areas (a terminal where an interconnect already exists) might
be critical on a board and require a prohibitive action. To reduce computer processing time, only the
required logical inspections of the routing space are
performed and these are functions of the particular
board.
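A deterministic router of the general kind described — though not necessarily the author's algorithm — can be sketched as a breadth-first search over the free cells of the cover grid, returning either a coordinate chain or a fail-list entry:

```python
from collections import deque

def route(grid, start, goal):
    """Breadth-first search over free grid cells; returns the cell chain
    from start to goal, or None on failure (a 'fail list' entry)."""
    h, w = len(grid), len(grid[0])
    prev = {start: None}
    frontier = deque([start])
    while frontier:
        x, y = frontier.popleft()
        if (x, y) == goal:               # trace the path back to start
            path, cell = [], goal
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < w and 0 <= ny < h and grid[ny][nx] == 0 \
                    and (nx, ny) not in prev:
                prev[(nx, ny)] = (x, y)
                frontier.append((nx, ny))
    return None
```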
Outputs from a router program are normally:
1. Descriptions of the interconnect paths
which can be edited for a plotting device.
In previous systems these were in the form
of a "from-to" terminal or pad identifications accompanied by a coordinate chain
which represented every unit cell in the
interconnect path.
2. A listing of all input pairs which the program failed to lay out.
Finally, programs which perform the necessary
translation of data into plotting device input format
are required. There are several ways to develop the
graphic output; one of these is to edit the cover layer (environment) and router output separately. If a
pen plot is used, the cover and interconnect can be
plotted in superimposition using contrasting colors
for clarity. Size or scale of the plot may be as desired within the limits of the device.
The "ultimate" at the moment is production of
final artwork for the interconnection surfaces. This
requires high resolution and optical capabilities in
the plotting device.
Within the realm of design automation, automatic interconnect design is in its infancy. Many new
things are being done presently, and many more
will follow in the months ahead; some of these will
include: automatic board design, refinements in the
analysis of data organization, and new "customized" router algorithms to accommodate advanced
manufacturing techniques.
SYSTEMATIC DESIGN OF AUTOMATA
J. P. Roth
IBM Watson Research Center
Yorktown Heights, New York.
The subject of this paper is a system of programs
to aid in the logical design of automata. Figure 1
depicts the experimental system as it presently exists within IBM. There are essentially two internal
Figure 1. The logic automation complex (code translation, cubical cover, position assignments).
formats for the system, one called the injective word, as shown in Fig. 3, which in essence specifies all the logic blocks in a circuit and how they are linked together. The other is termed a cubical cover
(or ON-OFF-ARRAYS) which is a means of describing the behavior of the circuit as if it were a
two-level circuit consisting of ANDs followed by
ORs (or vice versa). See references 1 and 2 for a
more detailed description of these notations.
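Roughly speaking, a cubical cover of this kind is a list of cubes over {0, 1, x}; a hypothetical sketch of evaluating an ON-array against an input point:

```python
# A cube is a string over '0', '1', 'x' (don't care), one position per
# circuit input; a cubical cover (ON-array) lists the cubes on which the
# two-level AND/OR circuit's output is 1. Example cover: f = a.b + not-c.
on_array = ["11x", "xx0"]

def covers(cube, point):
    return all(c in ("x", p) for c, p in zip(cube, point))

def evaluate(cover_list, point):
    """OR of ANDs: output is 1 iff some cube of the ON-array covers point."""
    return any(covers(cube, point) for cube in cover_list)
```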
The programs of the system, the principal ones
being shown in boxes (Fig. 1), may be thought of
as transformations of these formats into themselves.
There are two exceptions. They are the Sequence
Chart Analyzer SCA and the program EQIWT, the
equation-to-injective-word translator.
We shall first discuss the Sequence Chart Analyzer; a sequence chart is shown in Fig. 2. This is the sequence chart for the operation of the instructions Floating Point Add, Subtract, or Compare for an early version of the Model 60 of System/360. It will be observed that across the top of the chart appear intervals labelled T1, T2, ..., T10. These refer to time intervals and admit of many interpretations.
Operations are written above horizontal line segments in one of the "time columns"; immediately to the left of each are written the "immediate" conditions necessary for its execution.

Figure 2. Sequence chart for the Floating Point Add, Subtract, and Compare instructions. [Chart detail not reproducible from the scan.]

For example, in column 5 the operation Set Condition Register is performed
when the sequence chart is "operative" in this area, that is, when the condition holds. The given sequence chart becomes operative at "Chart Entry Conditions" such as (on the far left of the chart) FPADD + FPSUB + COMPARE. Then, in time interval T1, some or all of the operations in the T1 column, such as INVERT S SIGN, are performed, provided the immediate condition written to the left of each, such as FPSUB + COMPARE, is satisfied. If no immediate condition is written there, the operation is automatically performed. Thus several operations may be simultaneously executed. Thence one moves to the next time interval, or else to another specified interval: for example, if after performing any of the operations at time T1 the condition SAZERO (Serial Adder is Zero) holds, then one jumps to time interval T4. Whichever is the case, one proceeds into other phases of the chart. For example, after an operation in time T4 has occurred, the next operation and the next time are determined by the immediate conditions written to the left of the line segments emanating from the right half of the T5 column. For example, if LEAD ZERO, -OVERFLOW, -NORM, and ZERO FRACTION are all true, then the operation SET SIGN PLUS [L] is performed. Thus any set of conditions is prescribed a "path of operations" on the sequence chart, ending ultimately in an ENDOP, which means end of operations.
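This control behavior can be sketched as a small interpreter. The chart below is an invented miniature, not the actual Model 60 chart; the condition and operation names are borrowed loosely from the text:

```python
# A toy sequence chart: for each time interval, a list of
# (immediate_condition, operation, next_interval) entries.  An entry
# whose condition is None fires unconditionally.  The chart itself is
# an illustrative assumption, not the Model 60 chart.

CHART = {
    "T1": [(None, "INVERT S SIGN", "T2"),
           ("SAZERO", "SKIP ALIGN", "T4")],
    "T2": [(None, "ALIGN FRACTIONS", "T4")],
    "T4": [(None, "ADD FRACTIONS", "ENDOP")],
}

def run(conditions, start="T1"):
    """Follow a path of operations until ENDOP is reached."""
    path, interval = [], start
    while interval != "ENDOP":
        nxt = None
        for cond, op, succ in CHART[interval]:
            if cond is None or conditions.get(cond, False):
                path.append(op)
                nxt = succ            # last satisfied entry decides the jump
        if nxt is None:               # no entry fired: chart halts
            break
        interval = nxt
    return path

print(run({"SAZERO": True}))    # jumps from T1 straight to T4
print(run({}))                  # takes the T1 -> T2 -> T4 path
```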
It is thus seen that the sequence chart is a kind of sequenced flow chart of machine operations, as described for some particular data-flow structure of a machine.
The Sequence Chart was originally designed by Stanley Pitkowski and subsequently formalized by the Logic Automation Group, mainly by J. M. Galey, P. N. Sholts, and Jere Sanborn. Numerous abbreviations for indirect addressing are employed in the description.
The Sequence Chart Analyzer (SCA) first produces a hard copy of the sequence chart itself, as illustrated by Fig. 2. The form of this chart is subject to updating and can be changed at will. The second output from SCA is an injective word which prescribes, for each gate, the total conditions for which it is to be open or to be closed. Figure 3 is one page of the several pages of injective word produced by SCA for the sequence chart of Fig. 2. The meaning of this form may be understood by the following example: at left is a circuit (over inputs A, -A, B, -B and times T1, T2, T3; the drawing is not reproducible from the scan) and at right is the corresponding injective word:

α* = A, CE      AND
α = α*, T1      AND
β* = B, α*      AND
β = β*, T2      AND
δ* = -A, CE     AND
δ = δ*, T1      AND
ε = δ*, T2      AND
η* = A, -B, CE  AND
η = η*, T2      AND
z = A, β, CE    AND
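Read this way, an injective word can be executed directly. The sketch below simulates the first few lines of the example above, under the assumed convention that a comma denotes AND and a leading minus sign denotes complement; the dictionary layout and function names are illustrative, not the actual program's semantics:

```python
# Each gate output is the AND of its listed inputs; a leading '-' on a
# name denotes complement.  CE is the chart-entry condition; T1, T2 are
# timing signals.  (Names follow the example in the text; the execution
# convention is an assumption.)

GATES = {
    "a*": ["A", "CE"],      # alpha* = A AND CE
    "a":  ["a*", "T1"],     # alpha  = alpha* AND T1
    "b*": ["B", "a*"],      # beta*  = B AND alpha*
    "b":  ["b*", "T2"],     # beta   = beta* AND T2
}

def evaluate(signals):
    """Evaluate gates in definition order (no feedback in this example)."""
    values = dict(signals)
    for name, inputs in GATES.items():
        def val(term):
            return not values[term[1:]] if term.startswith("-") else values[term]
        values[name] = all(val(t) for t in inputs)
    return values

out = evaluate({"A": True, "B": True, "CE": True, "T1": True, "T2": True})
print(out["b"])   # True: B AND (A AND CE) AND T2
```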
Figure 3. One page of the injective word produced by SCA for Floating Point Add, Subtract, Compare (alignment of fractions). [Listing detail not reproducible from the scan.]
Each line or argument is given a label; each line,
for example line 3, emanating from a box is expressed as a function of the lines entering the box,
together with the name of the logical function
which the box performs, and the number of inputs,
e.g.
3 = 4, 5      AND 2
4 = 3, 7, 8   OR 3
5 = 2, 3, 6   NOR 3
This method of description of course allows feedback in the circuit.
The method of analysis for SCA will be illustrated in the next figure.
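The labelled-line description lends itself to simple bookkeeping. The sketch below stores the three example lines and computes fan-in and fan-out; the table layout and function names are assumptions for illustration:

```python
# Netlist in the labelled-line style of the text: each line is a function
# of the lines entering its box, recorded with the box's logical function.
# Fan-in is the number of inputs; feedback (line 3 feeding line 4, which
# feeds line 3) is permitted by the description.

NETLIST = {
    3: ("AND", [4, 5]),       # 3 = 4, 5     AND 2
    4: ("OR",  [3, 7, 8]),    # 4 = 3, 7, 8  OR 3   (feedback through 3)
    5: ("NOR", [2, 3, 6]),    # 5 = 2, 3, 6  NOR 3
}

def fan_in(line):
    """Number of lines entering the box that drives this line."""
    return len(NETLIST[line][1])

def fan_out(line):
    """Lines whose boxes this line feeds."""
    return [dst for dst, (_, srcs) in NETLIST.items() if line in srcs]

print(fan_in(4))      # 3
print(fan_out(3))     # [4, 5]: line 3 feeds the OR and the NOR boxes
```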
The injective word output of SCA is fed directly
into the program CIMPL, an acronym for circuit
implementation. This program accepts an injective word described by any set of logic blocks and converts it into an SLT injective word, i.e., one composed of SLT logic modules and obeying the fan-in, fan-out, powering, etc., constraints of this particular technology (SLT stands for Solid Logic Technology).
This program was outlined in architectural form
by John Earle and programmed by Peter Schneider,
now of the University of Wisconsin; Michael Galey,
IBM San Jose; Jere Sanborn, IBM Poughkeepsie;
and others.
The first operation which CIMPL performs is to
search for identical gates or inverters. It eliminates
all but one of these and fans out from it. It then does
the SLT implementation in the low-, medium-, or high-speed circuit families so as to preserve both the general logical structure specified by the designer on the sequence chart and the delays inherent in the structure. It does this implementation within the SLT circuit constraints, manipulating the logic locally so as to satisfy the fan-in constraints and inserting powering whenever called for by the loading equations. It performs the fan-in and powering manipulation so as to add the minimum amount of additional hardware that satisfies the constraints while preserving the general logical structure. The program then assigns the circuit types in the specified circuit family and produces an output which can feed directly into DRAW, the program which partitions the injective word into pages and assigns print positions for each page; in sum, it draws the ALD sheets (ALD: automated logic diagram) for Solid Logic Design Automation, and in fact generates the SLDA logic master tape.
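The duplicate-gate elimination that begins CIMPL's work can be sketched as follows; the gate table and names are invented, and the real CIMPL of course does far more (fan-in repair, powering, circuit-type assignment):

```python
# Sketch of CIMPL's first step: find gates computing the same function of
# the same inputs, keep one representative, and fan out from it.  The
# circuit and gate names below are illustrative assumptions.

def merge_identical(gates):
    """gates: {name: (function, tuple_of_inputs)} -> (gates, renaming)."""
    seen, keep, rename = {}, {}, {}
    for name, (fn, inputs) in gates.items():
        key = (fn, tuple(sorted(inputs)))     # input order is irrelevant
        if key in seen:
            rename[name] = seen[key]          # duplicate: use the survivor
        else:
            seen[key] = name
            keep[name] = (fn, inputs)
    # rewrite inputs of surviving gates through the renaming
    return {n: (fn, tuple(rename.get(i, i) for i in ins))
            for n, (fn, ins) in keep.items()}, rename

gates = {
    "g1": ("AND", ("a", "b")),
    "g2": ("AND", ("b", "a")),     # identical to g1 up to input order
    "g3": ("OR",  ("g1", "g2")),
}
merged, renamed = merge_identical(gates)
print(renamed)              # {'g2': 'g1'}
print(merged["g3"])         # ('OR', ('g1', 'g1'))
```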
Another input to CIMPL is Boolean equations: a preprocessor called EQIWT (equation-to-injective-word translator) accepts Boolean equations and converts them to the injective word format. This gives the designer the ability to convert his Boolean equations directly into hardware. This path within the logic automation complex was used in Hursley, England, particularly by K. A. Duke, in the design of the Model 40 of System/360; specifically, the ALU (Arithmetic and Logic Unit) was so designed.
An alternate program called Position Draw or PDRAW was written by Galey: this accepts an injective word in which the pages and print positions
have been specified by the designer in a list structure. It generates the ALD sheets in accordance
with the pattern prescribed by the engineer and also
generates the Logic Master Tape of Solid Logic Design Automation. It thus saves the designer from
making careful drawings for the keypunching operation and substantially cuts down on the magnitude
and difficulty of the keypunch operation itself.
Figure 4 shows one page of the 24 ALD sheets produced by SCA, followed by CIMPL, followed by
DRAW. These ALD sheets would have, in the
manual version, been originally computed and
hence drawn by hand for insertion into the Solid
Logic Design Automation System. Total IBM 7094
time for this operation was about 15 minutes.
Another entry into the system of logic automation programs, besides the sequence chart and Boolean equations, is through a code translation, as shown in Figure 5: the translation from NPL to typewriter coding. The code translation program assembles this into a set of Boolean functions, one for each output. This is then formulated as a two-level minimization problem with many outputs.
MIN is a program implementing the extraction algorithm (refs. 2, 3), which works with cubical covers and has a domain of applicability much wider than that of algorithms using canonical terms. For the NPL-typewriter code translation a minimum was obtained. This minimum two-level solution was then converted by a decomposition algorithm, DECO (refs. 2, 4, 5), applied by a human computer (the program was unavailable at the time). This result was then fed into CIMPL and DRAW to produce, on 13 ALD sheets, an SLT design of the code translator. Total machine time did not exceed 10 minutes on the IBM 7094.
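Two-level minimization in the small can be sketched by iterated cube merging. This is a toy, single-output, Quine-McCluskey-style pass (merging only, without the covering step), much weaker than the extraction algorithm the paper refers to:

```python
# Toy two-level minimization: repeatedly merge cubes that differ in one
# literal.  Cubes are strings over {'0', '1', 'x'}, one position per
# variable.  Single-output and without prime-implicant covering; a sketch
# only, not the MIN extraction algorithm.

def merge(a, b):
    """Merge two cubes differing in exactly one position, else None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1:
        i = diff[0]
        return a[:i] + "x" + a[i + 1:]
    return None

def minimize(cubes):
    cubes = set(cubes)
    while True:
        merged = {m for a in cubes for b in cubes if (m := merge(a, b))}
        nxt = merged | {c for c in cubes
                        if not any(merge(c, d) for d in cubes)}
        if nxt == cubes:
            return sorted(nxt)
        cubes = nxt

# f(a, b) true on 10, 11, 01 minimizes to (a OR b): cubes "1x" and "x1"
print(minimize(["10", "11", "01"]))
```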
Two more programs of the system will be mentioned; they can be run in conjunction with the other programs of the system. The first is called π* (refs. 2, 3) and is a program for analyzing circuits: precisely, it accepts an injective word and translates it to a cubical cover, i.e., a normal-form expression, for each output of the circuit in terms of its primary inputs. This program requires that all feedback loops be cut. It enables us to apply minimization procedures to already designed circuits in an effort to achieve cost reduction.
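For a feedback-free circuit, the translation π* performs can be sketched as back-substitution. The symbolic strings below stand in for the cubical-cover computation of the real program, and the two-gate circuit is invented:

```python
# Sketch of the pi*-style translation for a feedback-free circuit:
# express each output in terms of primary inputs by back-substituting
# gate definitions.  The circuit h = (a AND b) OR c is an invented
# example; the real program produces cubical covers, not strings.

GATES = {
    "g": ("AND", ["a", "b"]),
    "h": ("OR",  ["g", "c"]),
}

def expand(name):
    """Recursively rewrite a signal in terms of primary inputs."""
    if name not in GATES:            # primary input
        return name
    fn, inputs = GATES[name]
    op = " & " if fn == "AND" else " | "
    return "(" + op.join(expand(i) for i in inputs) + ")"

print(expand("h"))    # ((a & b) | c)
```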
Figure 4. One page of the 24 ALD sheets produced by SCA, CIMPL, and DRAW. [Sheet detail not reproducible from the scan.]

This approach is especially useful in the design of low-cost circuits of which many copies will be made and wherein the savings in diodes or transistors are particularly significant. These programs have had extensive use at IBM Endicott, e.g., on the MICR (magnetic ink character recognition)
circuitry by Billy N. Carr, on the IBM 1030, a device to transmit data from terminals, and on the
Serial Wire Printer. In each case, π* was followed by MIN, followed by factorization, and thence (in two cases) by CIMPL. Incidentally, mistakes by the engineer were detected in the design and corrected. In each case a very small amount of machine time was used.
Another program which has had extensive usage is a set of diagnostic programs, labelled DIAG in Fig. 1. This program accepts a circuit in injective word format and produces a set of input patterns, termed tests, which when applied to the circuit are capable of determining whether or not a failure of a given variety has occurred. The variety treated by the program is basically a failure for each line wherein the line may be fixed in value, either stuck-at-1 or stuck-at-0. This program was used specifically in the design of the Memory Protect and Relocation (MPR) unit for the 7094 configuration going to M.I.T. By an addition of only 2.5 percent extra hardware it was possible, by means of the tests generated for this piece of hardware, to detect any single failure of the stuck-at-1 or stuck-at-0 variety. Approximately one hour of 7090 time was required to compute the tests. DIAG was a predecessor of the SLT set of programs FLT (Fault Location Technology) used on some models of System/360, written principally by Frank and Martha Evans.
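The stuck-at fault model can be sketched by brute force: a test for "line L stuck at value v" is any input pattern on which the good and faulted circuits disagree at an output. The two-gate circuit below is an invented miniature, nothing like the MPR design; practical test generation (DIAG, and later d-DIAG) avoids this exhaustive search:

```python
# Brute-force stuck-at test generation for y = (a AND b) OR c, an
# invented example circuit.  A fault is a pair (line_name, stuck_value).

from itertools import product

def simulate(a, b, c, stuck=None):
    """Evaluate y = (a AND b) OR c, optionally with one line stuck."""
    def wire(name, value):
        if stuck and stuck[0] == name:
            return stuck[1]       # the faulted line ignores its drivers
        return value
    a, b, c = wire("a", a), wire("b", b), wire("c", c)
    g = wire("g", a and b)        # internal line g
    return g or c

def find_test(stuck):
    """Search all input patterns for one that detects the fault."""
    for a, b, c in product([False, True], repeat=3):
        if simulate(a, b, c) != simulate(a, b, c, stuck):
            return a, b, c
    return None                   # fault is undetectable

print(find_test(("g", False)))    # (True, True, False)
```

A test for g stuck-at-0 must set g = 1 (a = b = 1) and hold c = 0 so the fault is visible at the output, which is exactly the pattern the search finds.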
A new program is being planned (ref. 8) for an improved algorithm, called d-DIAG, for generating diagnostic tests. This algorithm is substantially faster than other known methods and is based essentially on a calculus of injective words rather than on a calculus of cubical covers, as is DIAG. Reference 8 describes this algorithm in considerable detail and makes comparisons with previous methods.
Figure 5. NPL-to-typewriter (BCD) code translation table. [Table detail not reproducible from the scan.]

A profitable way to utilize this complex of design programs is in the redesign mode: a computer system is designed for one technology, and it is desired to redesign it in another. For this purpose a program EXTRACT is needed to take from the SLDA Logic Master Tape the logical essentials to
be fed into the analysis and synthesis programs.
This process has been partially automated; that is to say, the EXTRACT program as it now stands requires another routine before the process is completely automatic. The program is in an experimental state.
FUTURE SYSTEM
The Iverson notation has been used to present a very complete specification of the architecture of the IBM System/360 (ref. 9), and this immediately opens the possibility of adjoining this description to our logic automation complex. The design for the machine would thus be specified initially by means of a program or system of programs written in the Iverson notation. For each model of System/360 the data flow would also be given, essentially as a list of facilities available plus their interconnections. This part of the experimental design system remains to be done.
REFERENCES
1. J. P. Roth, "Minimization over Boolean Trees," IBM J. of Res. & Dev., vol. 4, no. 5, pp. 543-558, Nov. 1960.
2. R. E. Miller, Switching Theory, John Wiley & Sons, Inc., New York, 1965.
3. J. P. Roth, "Algebraic Topological Methods in Synthesis," Proc. of an International Symposium on the Theory of Switching, 2-5 April 1957, Part 1, Harvard University Press, Cambridge, Mass., 1959.
4. J. P. Roth and R. M. Karp, "Minimization over Boolean Graphs," IBM J. of Res. & Dev., vol. 6, no. 2, pp. 227-238, April 1962.
5. R. M. Karp, F. E. McFarlin, et al., "A Computer Program for the Synthesis of Combinational Circuits," Proc. of 2nd Annual Symposium on Switching Circuit Theory and Logical Design, AIEE, Oct. 17-20, 1961, New York.
6. J. M. Galey, R. E. Norby, and J. P. Roth, "Techniques for the Diagnosis of Switching Circuit Failures," IEEE Transactions on Communications and Electronics, vol. 83, no. 74, pp. 509-514, Sept. 1964.
7. W. C. Carter, R. Preiss, et al., "Design of Serviceability Features of IBM System/360," IBM J. of Res. & Dev., vol. 8, no. 2, pp. 115-126, April 1964.
8. J. P. Roth, "Algorithms for the Mechanization of Design I: Diagnosis," IBM Research Paper RC-1924, Oct. 1964; to appear in IBM J. of Res. & Dev.
9. A. D. Falkoff, K. E. Iverson, and E. H. Sussenguth, "A Formal Description of System/360," IBM Systems Journal, vol. 3, no. 3, pp. 198-262, 1964.
AMERICAN FEDERATION OF INFORMATION
PROCESSING SOCIETIES (AFIPS)
211 E. 43rd Street, New York 17, New York
Officers and Board of Directors of AFIPS
Chairman
Treasurer
DR. EDWIN L. HARDER*
1204 Milton Avenue
Pittsburgh 18, Pennsylvania
MR. FRANK E. HEART*
Lincoln Laboratory-MIT
P. O. Box 73
Lexington 73, Massachusetts
Secretary
Chairman-Elect
MR. MAUGHAN S. MASON
Thiokol Chemical Corp.
Wasatch Division-Mail Stop # 150
Brigham City, Utah 84302
DR. BRUCE GILCHRIST*
IBM Corporation
Data Processing Division
112 East Post Road
White Plains, New York
ACM Directors
MR. J. D. MADDEN
ACM Headquarters
211 East 43rd Street
New York 17, New York
MR. HOWARD BROMBERG*
CEIR, Inc.
Benson East, Twp. Line & York Roads
Jenkintown, Pennsylvania
MR. EUGENE H. JACOBS
System Development Corp.
2500 Colorado Avenue
Santa Monica, California
DR. BRUCE GILCHRIST
IBM Corporation
Data Processing Division
112 East Post Road
White Plains, New York
IEEE Directors
MR. WALTER L. ANDERSON*
General Kinetics, Inc.
2611 Shirlington Road
Arlington 6, Virginia
MR. L. C. HOBBS
4701 Surrey Drive
Corona Del Mar, California
Simulation Councils Director
MR. JOHN E. SHERMAN
Lockheed Missiles & Space Corp.
D-59-15-15, B-102
P. O. Box 504
Sunnyvale, California
Association for Machine Translation and
Computational Linguistics-Observer
PROFESSOR WINFRED P. LEHMAN
Department of Germanic Languages
The University of Texas
Box 7939, Austin 12, Texas
* Executive Committee
DR. R. I. TANAKA
3427 Janice Way
Palo Alto, California
MR. T. J. WILLIAMS
Monsanto Chemical Co.
800 N. Lindbergh
St. Louis, Missouri 63166
A merican Documentation
Institute Director
MR. HAROLD BORKO
System Development Corp.
2500 Colorado Avenue
Santa Monica, California
Executive Secretary
MR. H. G. ASMUS
AFIPS Headquarters
211 East 43rd Street
New York, New York 10017
Standing Committee Chairmen
Admissions
Award
MR. WALTER L. ANDERSON
General Kinetics, Inc.
2611 Shirlington Road
Arlington 6, Virginia
MR. SAMUEL LEVINE
Bunker-Ramo Corporation
445 Fairfield Avenue
Stamford, Connecticut
Conference
Constitution and By-Laws
MR. KEITH W. UNCAPHER
The RAND Corporation
1700 M.ain Street
Santa Monica, California
MR. EUGENE H. JACOBS
System Development Corp.
2500 Colorado Avenue
Santa Monica, California
Finance
International Relations
MR. WILLIAM D. ROWE
Sylvania Electronics Systems
189 B. Street
Needham Heights, Massachusetts
PROFESSOR JOHN R. PASTA
Digital Computer Lab.
University of Illinois
Urbana, Illinois
Planning
Publications
DR. JACK MOSHMAN
CEIR, Inc.
One Farragut Square, S.
Washington, D. C.
MR. STANLEY ROGERS
P. O. Box 625
Del Mar, California
Public Relations
Social Implications of Information
Processing Technology
MR. ISAAC SELIGSOHN
IBM Corporation
Old Orchard Road
Armonk, New York
MR. PAUL ARMER
The RAND Corporation
1700 Main Street
Santa Monica, California
Education
Technical Program
DR. DONALD L. THOMSEN, JR.
IBM Corporation
Old Orchard Road
Armonk, New York
MR. BRIAN POLLARD
Burroughs Corporation
41100 Plymouth Road
Plymouth, Michigan
Harry Goode Memorial Award
DR. JERRE D. NOE
Engineering Sciences Division
Stanford Research Institute
Menlo Park, California
General Chairman, '66 SJCC
General Chairman, '66 FJCC
DR. HARLAN E. ANDERSON
Digital Equipment Corp.
146 Main Street
Maynard, Massachusetts
MR. R. GEORGE GLASER
McKinsey & Co.
100 California Street
San Francisco, California
1965 FALL JOINT COMPUTER CONFERENCE
EXHIBITORS
as of October 31, 1965
ADAGE, INC., Boston, Mass.
ADDISON-WESLEY PUBLISHING COMPANY, INC., Reading, Mass.
AMERICAN TELEPHONE & TELEGRAPH
COMPANY, New York, N.Y.
AMPEX CORPORATION, Redwood City, Calif.
ANELEX CORPORATION, Boston, Mass.
APPLIED DYNAMICS, INC., Ann Arbor,
Mich.
BECKMAN INSTRUMENTS, INC., Fullerton,
Calif.
BENSON-LEHNER CORPORATION, Van
Nuys, Calif.
BRYANT COMPUTER PRODUCTS, A Div.
of Ex-Cell-O Corp., Walled Lake, Mich.
CALIFORNIA COMPUTER PRODUCTS,
INC., Anaheim, Calif.
CALMA COMPANY, Los Gatos, Calif.
COMCOR/ASTRODATA, INC., Anaheim,
Calif.
COMPUTER ACCESSORIES CORPORATION, Santa Barbara, Calif.
COMPUTER CONTROL COMPANY, INC.,
Framingham, Mass.
COMPUTER DESIGN, West Concord, Mass.
COMPUTER PRODUCTS, INC., Braintree,
Mass.
COMPUTERS AND AUTOMATION, Newtonville, Mass.
COMPUTER SCIENCES CORPORATION,
El Segundo, Calif.
CONDUCTRON CORPORATION, Ann Arbor,
Mich.
CONRAC, A DIV. OF GIANNINI CONTROLS CORPORATION, Glendora, Calif.
CONSOLIDATED ELECTRODYNAMICS
CORPORATION, Pasadena, Calif.
CONTROL DATA CORPORATION, Minneapolis, Minn.
CORNING GLASS WORKS, Bradford, Pa.
CYBETRONICS, INC., Waltham, Mass.
DATA DISC, INC., Palo Alto, Calif.
DATA EQUIPMENT COMPANY, A DIV. OF
BBN CORPORATION, Santa Ana, Calif.
DATA MACHINES, INC., Newport Beach,
Calif.
DATAMATION/F.D. THOMPSON PUBLICATIONS, New York, N.Y.
DATAMEC CORPORATION, Mountain View,
Calif.
DATA PROCESSING MAGAZINE/NORTH
AMERICAN PUBLISHING CO., Philadelphia, Pa.
DATA PRODUCTS CORPORATION, Culver
City, Calif.
DIGITAL ELECTRONIC MACHINES, INC.,
Kansas City, Mo.
DIGITAL EQUIPMENT CORPORATION,
Maynard, Mass.
DIGITRONICS CORPORATION, Albertson,
L.I., N.Y.
ELCO CORPORATION, Willow Grove, Pa.
ELECTRONIC ASSOCIATES, INC., West
Long Branch, N.J.
ELECTRONIC MEMORIES, INC., Hawthorne, Calif.
FABRI-TEK, INC., Amery, Wis.
FAIRCHILD SEMICONDUCTOR, Mountain
View, Calif.
FERROXCUBE CORPORATION, Saugerties,
N.Y.
GENERAL COMPUTERS, INC., Los Angeles,
Calif.
GENERAL ELECTRIC COMPUTER DEPT.,
Phoenix, Ariz.
GENERAL PRECISION, INC., Librascope
Group, Glendale, Calif.
HEWLETT-PACKARD DYMEC DIVISION,
Palo Alto, Calif.
INDIANA GENERAL CORPORATION, Valparaiso, Ind.
INTERNATIONAL BUSINESS MACHINES
CORPORATION
DATA PROCESSING DIVISION, White
Plains, N.Y.
INDUSTRIAL PRODUCTS DIVISION,
White Plains, N.Y.
ITT
INDUSTRIAL PRODUCTS DIVISION, San
Fernando, Calif.
ITT DATA SERVICES, Paramus, N.J.
KLEINSCHMIDT DIVISION OF SCM CORPORATION, Deerfield, Ill.
LOCKHEED ELECTRONICS COMPANY, A & IP DIVISION, Los Angeles, Calif.
MAI EQUIPMENT CORPORATION, New
York, N.Y.
McGRAW-HILL BOOK COMPANY, New
York, N.Y.
MEMOREX CORPORATION, Santa Clara,
Calif.
MICRO SWITCH-A DIVISION OF HONEYWELL, INC., Freeport, Ill.
MIDWESTERN INSTRUMENTS, INC.,
Tulsa, Okla.
MILGO ELECTRONIC CORP., Miami, Fla.
3M COMPANY MAGNETIC PRODUCTS
DIVISION, St. Paul, Minn.
MONROE DATA/LOG DIVISION OF LITTON INDUSTRIES, Beverly Hills, Calif.
THE NATIONAL CASH REGISTER COMPANY, Dayton, Ohio
OLIVETTI UNDERWOOD CORPORATION,
New York, N.Y.
PHILCO COMMERCIAL ELECTRONICS OPERATION, Willow Grove, Pa.
POTTER INSTRUMENT COMPANY INC.,
Plainview, L.I., N.Y.
PRENTICE HALL, INC., Englewood Cliffs,
N.J.
RADIO CORPORATION OF AMERICA
ELECTRONIC COMPONENTS & DEVICES, Harrison, N.J.
ELECTRONIC DATA PROCESSING, Camden, N.J.
RAYTHEON COMPUTER, Santa Ana, Calif.
RECORDAK CORPORATION, New York,
N.Y.
REEVES INSTRUMENT COMPANY, Garden City, N.Y.
REEVES SOUNDCRAFT DIVISION,
REEVES INDUSTRIES, INC., Danbury,
Conn.
REMEX/RHEEM ELECTRONICS, A DIV.
OF EX-CELL-O CORP., Hawthorne, Calif.
ROTRON MANUFACTURING COMPANY,
INC., Woodstock, N.Y.
ROYAL TYPEWRITER CO.-A DIV. OF
LITTON INDUSTRIES, New York, N.Y.
SANDERS ASSOCIATES, INC., Nashua, N.H.
SCIENTIFIC DATA SYSTEMS, Santa Monica, Calif.
SCM CORPORATION, New York, N.Y.
SOROBAN ENGINEERING, INC., Palm Bay,
Fla.
SPARTAN BOOKS, INC., Washington, D.C.
STROMBERG-CARLSON CORPORATION, DATA PRODUCTS DIV., San Diego, Calif.
SYSTRON-DONNER CORPORATION, Concord, Calif.
TALLY CORPORATION, Seattle, Wash.
TECH-MET INC., Sunnyvale, Calif.
TELETYPE CORPORATION, Skokie, Ill.
TEXAS INSTRUMENTS, INC., Dallas, Texas
TRANSISTOR ELECTRONICS CORPORATION, Minneapolis, Minn.
UGC INSTRUMENTS, INC., Houston, Texas
UPTIME CORPORATION, Golden, Colo.
U.S. MAGNETIC TAPE COMPANY, INC.,
Huntley, Ill.
WEST ELEVEN, INC., Los Angeles, Calif.
JOHN WILEY & SONS, INC., New York,
N.Y.
ZELTEX, INC., Concord, Calif.
REVIEWERS, PANELISTS, AND SESSION CHAIRMEN
SPECIAL ACKNOWLEDGEMENT. The 1965 Conference Committee is indebted to Russell Bennett, Leonard Cotton, Robert Davies,
Ben Ferber, Robert Gray, Robert Kiel, Donal Meier, James Mihalik,
John Piontek, Lynn Yarbrough, and David Yetter for applying themselves so diligently to make the Preprint Volume for the Discuss-Only
Sessions possible.
SESSION CHAIRMEN
D. ACKLEY
L. D. AMDAHL
P. ARMER
D. G. BOBROW
E. E. BOLLES
P. BROCK
J. R. BROWN, JR.
E. BRYAN
J. W. CELLAR
C. W. CLEWLOW
J. S. CRAVER
A. J. CRITCHLOW
C. DAVIDSON
J. B. DENNIS
W. A. FARRAND
M. T. FISCHER
E. GLASER
D. L. GOLDY
M. H. HALSTEAD
D. HARATZ
W. H. HARTWIG
D. G. HAYS
L. C. HOBBS
R. E. KAYLOR
W. B. KEHL
K. KOLENCE
R. L. KOPPEL
W. S. MCCULLOCH
D. MEIER
R. C. MINNICK
G. S. MITCHELL
W. D. ORR
J. B. RHINE
J. A. RICCA
R. RICE
A. ROSENBERG
M. ROSENBERG
W. SANGREN
G. M. SILVERN
I. R. WHITEMAN
1965 FJCC PAPER REVIEWERS
C. W. ADAMS
J. C. ALRICH
L. D. AMDAHL
J. P. ANDERSON
P. L. ANDERSSON
R. M. ANNIS
M. ARBAB
A. A VIZIENIS
L. AYRES
R. S. BARTON
D. BEELER
D. W. BERNARD
D. L. BICKEL
L. BISCOMB
R. M. BLOCH
Q. BONNESS
R. E. BRADLEY
P. BROCK
D. R. BROWN
H. D. BROWN, JR.
E. BRYAN
D. J. CHESAREK
J. G. CLARK
D. CLUTTERHAM
M. COHEN
T. A. CONNOLLY
L. W. COTTON
R. DAVIS
O. R. DEACON, JR.
C. F. DEDO
P. DENNING
D. L. DIETMEYER
S. M. DREZNER
T. J. DUDLEY
A. I. DUMEY
F. G. DUNHAM
H. P. EDMUNDSON
R. A. ELLIOTT
H. S. ENGLANDER
F. Ess
B. FERBER
P. J. FOY
R. J. FREEMAN
S. K. FREEMAN
R. H. FULLER
L. GALLENSON
C. L. GERBERICH
J. GOLDBERG
H. GOODMAN
W. F. GOODYEAR
N. GORCHOW
J. W. GRANHOLM
F. S. GREENE
C. H. GUTZLER
E. HABIB
N. HAUSNER
D. HAYS
W. H. HOWE
C. A. IRVINE
B. JACKSON
W. W. JACOBS
R. KAIN
D. KIELMEYER
P. KIVIAT
J. KLEINBARD
K. W. KOLENCE
N. R. KORNFIELD
I. J. KUBASAK
S. M. LAMB
C. J. LAMPE
E. L. LAWLER
J. A. LEE
1965 FJCC PAPER REVIEWERS-Continued
S. Y. LEVY
D. LIU
P. A. LUNDAY
F. MAGNESS
R. N. MATHUR
R. E. MATTESON
J. J. MIHALIK
M. MINSKY
T. J. MOFFETT
J. MOXLEY
G. G. MUNAY
T. W. MURPHY
E. OSTROWSKY
J. J. PARISER
D. B. PARKER
C. L. PERRY
S. N. PORTER
R. RICHMAN
V. C. RIDEOUT
B. ROOS
A. ROSENBERG
M. ROSENBERG
J. SALTZER
A. D. SCARBROUGH
W. J. SCHART
P. N. SCHEID
I. B. SCHNEIDERMAN
J. SCHWARTZ
B. SEAR
D. SHAEFER
N. SHAPIRO
C. SHELTON
T. L. SIDES
G. M. SILVERN
L. C. SILVERN
J. W. SMITH
P. B. SMITH
F. G. SNYDER
L. M. SPANDORFER
C. F. SPRAGUE, III
T. STOCKHAM
J. F. SWEENEY
D. TOCHER
Q. T. TROSTUND, JR.
F. O. UNDERWOOD
E. VAN HORN
I. WARSHAWSKY
C. WEISSMAN
I. R. WHITEMAN
R. L. WIGINGTON
J. M. WILLARD
1965 FJCC PANELISTS
L. ANDREWS
G. B. BEITZEL
R. W. BEVERIDGE
J. R. BROWN, JR.
A. BURNS
R. W. BURT
E. U. COHLER
V. M. CORRADO
L. W. COTTON
J. F. CUBBAGE
C. DEDO
J. J. DIGIACOMO
J. P. ECKERT
R. ELFANT
D. C. ENGELBART
D. C. EVANS
L. FEIN
A. FELLER
G. W. FENNIMORE
E. FORBES
J. W. FORGIE
K. FREDERIKSEN
H. FULLER
R. FULLER
W. J. GALLAGHER
M. W. GOLDMAN
J. W. GRATIAN
L. C. HOBBS
W. H. HOWE
W. S. HUMPHREY, JR.
W. J. KARPLUS
H. R. KAUPP
A. J. KOLK
D. H. KRAMER
W. KUZMIN
T. G. LESHER
M. H. LEWIN
J. MARKUS
M. MAY
D. F. ORR
R. J. PETSCHAUER
J. A. RAJCHMAN
A. P. SAGE
F. X. SCAFURO
H. SCHMID
R. D. SCHMIDT
P. E. SHAFER
R. SHAHBENDER
P. S. SIDHU
L. SLATTERY
R. W. SMILEY
L. M. SPANDORFER
K. H. SPEIERMAN
T. STEEL
L. M. TERMAN
T. L. THAU
D. TOCHER
K. UNCAPHER
J. A. WARD
W. WINGSTEDT
R. F. ZEITH
Conferences 1 to 19 were sponsored by the National Joint Computer Conference, predecessor of AFIPS. Conferences 20 and up are sponsored by AFIPS.
Copies of volumes 1-26, Part II may be purchased from SPARTAN BOOKS,
scientific and technical division of Books, Inc., 432 Park Avenue South, New
York, N. Y.
Volume          Member Price    List Price
1-3             $11.00          $11.00
4-6             9.00            9.00
7-9             9.00            9.00
10, 11          7.00            7.00
12, 13          7.00            7.00
14, 15          8.00            8.00
16, 17          6.00            6.00
18              3.00            3.00
19              3.00            3.00
20              12.00           12.00
21              6.00            6.00
22              4.00            8.00
23              5.00            10.00
24              8.25            16.50
25              8.00            16.00
26, Part I      9.50            18.75
26, Part II     2.50            4.75
Cumulative Index to Vols. 1-26, Part II: $3.00
Complete Set: List Price $150.00; price per set to members: $100.00
27. 1965 Fall Joint Computer Conference, Las Vegas, Nevada, November 1965.