AFIPS
CONFERENCE
PROCEEDINGS
VOLUME 33
PART TWO

1968
FALL JOINT

COMPUTER
CONFERENCE
December 9-11, 1968
San Francisco, California

The ideas and opinions expressed herein are solely those of the authors and are
not necessarily representative of or endorsed by the 1968 Fall Joint Computer
Conference Committee or the American Federation of Information Processing
Societies.

Library of Congress Catalog Card Number 55-44701
THOMPSON BOOK COMPANY
National Press Building
Washington, D.C. 20004

© 1968 by the American Federation of Information Processing Societies, New York,
New York, 10017. All rights reserved. This book, or parts thereof, may not be
reproduced in any form without permission of the publisher.

Printed in the United States of America

CONTENTS
PART II

PROGRAMMING SYSTEMS II
WRITEACOURSE: An educational programming language .................  923
    E. Hunt, M. Zosel
A table driven compiler for use with automatic test equipment .....  929
    R. L. Mattison, R. T. Mitchell
On the basis for ELF: An extensible language facility .............  937
    T. E. Cheatham, A. Fisher, P. Jorrand

MEMORY TECHNIQUES-HERE TODAY
Associative processing for general purpose computers through the use
  of modified memories ............................................  949
    H. Stone
Addressing patterns and memory handling algorithms ................  957
    S. Sisson, M. Flynn
Design of a 100-nanosecond read cycle NDRO plated wire memory .....  969
    T. Ishidate
High speed, high current word matrix using charge storage diodes for
  rail selection ..................................................  981
    S. Waaben, P. Carmody

AUTOMATED MAINTENANCE AND CHECKOUT OF HYBRID SIMULATION FACILITIES
Automatic checkout of a large hybrid computing system .............  987
    J. C. Richards
Hybrid diagnostic techniques ......................................  997
    T. K. Seehuus, W. Maasberg, W. A. Harmon

DYNAMIC RESOURCE ALLOCATION
Demand paging in perspective ...................................... 1011
    B. Randell, C. Kuehner
Program behavior in a paging environment .......................... 1019
    B. Brawn, F. Gustavson
JANUS: A flexible approach to real-time time-sharing .............. 1033
    J. Kopf, P. Plauger
A parallel process definition and control system .................. 1043
    D. Cohen

HUMAN AUGMENTATION THROUGH COMPUTERS AND TELEOPERATORS
  (A Panel Session-No papers included in this volume)

LABORATORY AUTOMATION
A computer system for automation of the analytical laboratory ..... 1051
    P. J. Friedl, C. H. Sederholm, T. R. Lusebrink
Real-time time-sharing, the desirability and economics ............ 1061
    B. E. F. Macefield
A modular on-line computer system for data acquisition and
  experimental control ............................................ 1065
    H. P. Lie, R. W. Kerr, G. L. Miller, D. A. H. Robinson
A standardised data highway for on-line computer applications ..... 1077
    I. N. Hooton, R. C. M. Barnes
Use of a computer in a molecular biology laboratory ............... 1089
    J. F. W. Mallett, T. H. Gossling
A small computer as an on-line multiparameter analyzer for a neutron
  spectrometer .................................................... 1099
    M. G. Silk, S. B. Wright
Applications of digital computers to the long term measurement of
  blood pressure and the management of patients in intensive care
  situations ...................................................... 1105
    J. L. Corbett

HAND PRINTED CHARACTER RECOGNITION
Some conclusions on the use of adaptive linear decision functions . 1117
    E. R. Ide, C. E. Kiessling, C. J. Tunis
Experiments in the recognition of hand-printed text: Part I-
  Character recognition ........................................... 1125
    J. H. Munson
Experiments in the recognition of hand-printed text: Part II-
  Context analysis ................................................ 1139
    R. O. Duda, P. E. Hart
The design of an OCR system for reading handwritten numerals ...... 1151
    P. J. Hurley, W. S. Rohland, P. J. Traglia

OPERATING SYSTEMS I/OPERATING SYSTEMS II
The dynamic behavior of programs .................................. 1163
    I. F. Freibergs
Resource allocation with interlock detection in a multi-task system 1169
    J. E. Murphy
Cage dual processing .............................................. 1177
    K. C. Smith
An operating system for a central real-time data processing
  computer ........................................................ 1187
    P. Day, H. Krejci

NEW MEMORY TECHNIQUES
Holographic read-only memories accessed by light-emitting diodes .. 1197
    D. H. R. Vilkomerson, R. S. Mezrich, D. I. Bostwick
Semiconductor memory circuits and technology ...................... 1205
    W. B. Sander
2½-D Core search memory ........................................... 1213
    M. W. Rolund, P. A. Harding
Design of a small multi-turn magnetic thin film memory ............ 1219
    W. Simpson

HYBRID SIMULATION TECHNIQUES
An adaptive sampling system for hybrid computation ................ 1225
    G. A. Rahe, W. Karplus
A new solid state electronic iterative differential analyzer making
  maximum use of integrated circuits .............................. 1233
    B. K. Conant
A general method for programming synchronous logic in analog
  computation ..................................................... 1251
    R. A. Moran, E. G. Gilbert

APPLICATIONS OF COMPUTERS TO PROBLEMS OF THE ATMOSPHERE AND GEOPHYSICS
Computer experiments in the global circulation of the earth's
  atmosphere ...................................................... 1259
    A. Kasahara
Computational problems encountered in the study of the earth's
  normal modes .................................................... 1273
    F. Gilbert, G. Backus

PROGRESS IN DISPLAYS (A Panel Session-No papers in this volume)

COMPUTER GENERATED PICTURES-PERILS, PLEASURES, PROFITS
Computer animation and the fourth dimension ....................... 1279
    A. M. Noll
Computer displays in the teaching of physics ...................... 1285
    J. L. Schwartz, E. E. Taylor
Art, computers and mathematics .................................... 1292
    C. Csuri, J. Shaffer
CAMP-Computer assisted movie production ........................... 1299
    J. Whitney, J. Citron
What good is a baby? .............................................. 1307
    N. Winkless, P. Honore
A computer animation movie language ............................... 1317
    D. Weiner, S. E. Anderson

NEW TRENDS IN PROGRAMMING LANGUAGES
CABAL-Environmental design of a compiler-compiler ................. 1321
    R. K. Dove
Cage test language-An interpretive language designed for aerospace  1329
    G. S. Metsker
An efficient system for user extendible languages ................. 1339
    M. C. Newey
Program composition and editing with an on-line display ........... 1349
    H. Bratman, H. G. Martin, E. C. Perstein

BULK MEMORY DEVICES
New horizons for magnetic bulk storage devices .................... 1361
    F. D. Risko
Laser recording unit for high density permanent digital data
  storage ......................................................... 1369
    K. McFarland, M. Hashiguchi
A magnetic random access terabit magnetic memory .................. 1381
    S. Damron, J. Miller, E. Salbu, M. Wildman, J. Lucas
Diagnostics and recovery programs for the IBM 1360 photo-digital
  storage system .................................................. 1389
    D. P. Gustlin, D. D. Prentice

SIMULATION IN THE DESIGN AND EVALUATION OF DIGITAL COMPUTER SYSTEMS
Simulation design of a multiprocessing system ..................... 1399
    R. A. Merikallio, F. C. Holland
A simulation study of resource management in a time-sharing system  1411
    S. L. Rehmann, S. G. Gangwere, Jr.
Performance of a simulated multiprogramming system ................ 1431
    M. M. Lehman, J. L. Rosenfeld

THE COMPUTER FIELD: WHAT WAS PROMISED, WHAT WE HAVE, WHAT WE NEED
  (HARDWARE SESSION)
Hardware design reflecting software requirements .................. 1443
    S. Rosen
What was promised, what we have and what is being promised in
  character recognition ........................................... 1451
    A. W. Holt
High speed logic and memory: Past, present and future ............. 1459
    A. W. Lo

REAL-TIME INFORMATION SYSTEMS AND THE PUBLIC INTEREST
Real-time systems and public information .......................... 1467
    C. W. Churchman
National and international information networks in science and
  technology ...................................................... 1469
    H. Borko
Real-time computer communications and the public interest ......... 1473
    M. M. Gold, L. L. Selwyn
Toward education in real-time ..................................... 1479
    P. E. Rosove
A public philosophy for real-time information systems ............. 1491
    H. Sackman

COMPUTER DESIGN AUTOMATION: WHAT NOW AND WHAT NEXT?
Introduction ...................................................... 1499
    J. M. Kurtzberg
Functional design and evaluation .................................. 1500
    D. F. Gorman
Interface between logic and hardware .............................. 1501
    R. L. Russo
Hardware implementation ........................................... 1502
    W. E. Donath
Hardware fault detection .......................................... 1502
    M. A. Breuer

WRITEACOURSE:
An educational programming language*
by EARL HUNT and MARY ZOSEL
University of Washington
Seattle, Washington

*This research was supported by the Air Force Office of Scientific Research, Office of Aerospace Research, United States Air Force, under AFOSR Grant No. AF-AFOSR-1311-67. Distribution of this document is unlimited.

The problem

Computer applications in education are becoming more and more prevalent. Perhaps the most talked about use of computers in the schools is to control the educational material presented to students ... the Computer Assisted Instruction (CAI) application. CAI requires that two problems be solved. Someone has to decide what material should be sent to a student, and when, and someone has to arrange that the computer actually do what is desired. The first problem, what should be done, is a topic for educators and psychologists. Our concern is with the second. How can we make CAI a convenient tool for the educator?
We will assume that the educator has access to an interactive system, but that system was not specifically designed for computer assisted instruction. Once the educator has determined the form of a lesson, he would like to be able to go to the typewriter, type in the instructions, and then leave the typewriter knowing that when he returns with a student, the computer will be prepared to conduct the lesson. The problem is that the computer "understands" instructions only in a very restricted set of languages. The form of these languages has, for the most part, been dictated either by the internal design of the machine or by the requirements of mathematicians and statisticians who are, after all, the largest group of users of general purpose computers.
The language problem can be solved in several ways. The educator could, himself, become proficient in computer programming. This diverts his time from the problem he wishes to pursue. He could acquire a specially designed computing system which had languages and equipment suitable for his use. This alternative is extremely expensive (the equipment alone would rent for $100,000 a year or better) and is feasible only for large research projects. He could hire a computer programmer and tell him what the computer was supposed to do. This introduces another specialist into the research team, and has the disadvantage that the computer will then act as the programmer thought the educator wanted it to act. The educator may not discover a misunderstanding until after it has been built into the programming system, at which point it is hard to fix.
We advocate another alternative, placing in the general purpose computing system a language which is easy for the educator to use. This is the solution which was taken over ten years ago by mathematicians, when they were faced with the prospect of writing mathematics in a language which was designed for machine execution, rather than for problem statement. The great success of languages such as FORTRAN and ALGOL testifies to the feasibility of the approach. In the next ten years an educator's language may also be needed.
What should the characteristics of such a language be? By far the most important requirement is that the language should be natural for the teacher. Its syntax and semantics should conform to his writing habits. Insofar as possible, and there are limits on this, the form of the language should not be determined by the physical characteristics of the computer on which it will be used.
Readability is a second requirement. It will often be necessary for a person to understand a program he did not write. The structure of the programming language should be such that the basic plan of a program can be communicated without forcing the reader to master the intricacies of each line of code.
A judicious choice of a language can also ensure the

availability of a computer. Any language which is not
tied to the physical characteristics of a computer requires a translator. Pragmatically speaking, then, the
language is defined by the translator program. Thus
the educational language can be "inherited" by any
machine for which its base language translator exists.
We are by no means the first to recognize the need for
an educator's language. Several others have already
been developed. The best known are probably IBM's
COURSEWRITER [5] and System Development Corporation's PLANIT [2]. These languages are admirably
suited for the particular computer configurations for
which they were developed. For a variety of reasons,
however, we believe that they fail to meet the criteria
we have listed. Our principal criticism is that they
either are too much influenced by the way a computer
wishes to receive commands, instead of the way a person
wishes to give them, or that they contain features which,
although quite useful in themselves, would not be available except in specially designed computing systems.
The WRITEACOURSE Language.

We have developed an educational language, called
WRITEACOURSE, which is consciously modeled after
the ALGOL arithmetic programming language [7], which it
resembles in its syntactic structure. The basic unit of
discourse is the statement, corresponding roughly to an
English sentence. Statements are grouped into larger
units called lessons, and lessons into courses, similar to
the way a group of subroutines make up a program.
Statements are composed of instructions. In WRITEACOURSE there are only ten instructions. Physically,
they are English words, such as ADD and PRINT,
which have been chosen to have a meaning as close as
possible to their meaning in the natural language.
Limiting the commands of the language restricts us.
There are actions which can be executed by a computer,
but which are difficult to express in a restricted idiom.
The initial users of WRITEACOURSE have not found
this to be a great problem. They appear to be able to
say almost everything they want to say without extensive training.
The WRITEACOURSE translation program has been
written entirely in the PL/I programming language [6],
which we expect to be widely available in a few years.
We assume that the particular configuration has an
interactive computing capability, in which the user can
exchange messages with a program from a remote station equipped with a typewriter or other keyboard device. By 1970 this sort of capability should be common
in universities, at a price well within the reach of a
modest research budget.
An earlier version of WRITEACOURSE [4] was defined for the Burroughs B5500 computer only, using the

extended ALGOL provided for that machine [1]. Thus the
early version was not machine independent in the sense
that our present program is, although it would be a
fairly straightforward task to adapt it to some other
computer which had an ALGOL compiler.
Our approach should produce an easily maintained
system. This is a very important point. Undoubtedly
there will be errors in any system as complicated as a
programming language. Also, different users will want to
extend the language to suit their own purpose. Since the
translation program is written in a commonly available,
user oriented language, the educator will find that there
are many people who can understand and alter it. This
will be particularly true in universities, where Computer Science departments and computer centers will
regularly offer undergraduate courses in PL/I programming.

A user's view of the language.

The purpose of developing WRITEACOURSE was
to have a language which could be easily understood by
educators. We can test this now by presenting a fragment of a WRITEACOURSE lesson. Hopefully it will
be readable with only a minimal explanation.
The following statements are taken from a fragment of a WRITEACOURSE lesson. They appear
exactly as they would be typed by an instructor, with
the exception of the numbers in parentheses at the beginning of each line. These have been introduced for
ease of reference in explaining the lesson.
(1) 20 PRINT "THE ANGLE OF INCIDENCE IS
       EQUAL TO THE ANGLE OF ..."
(2)    ACCEPT CHECK "REFLECTION" "REFRACTION" IF 1 CHECKS THEN GO TO 5/
(3)    IF 2 CHECKS THEN PRINT "NO, THE
       ANGLE OF REFRACTION DEPENDS ON
(4)    THE TYPE OF LENS"/
(5)    PRINT "TRY AGAIN" ACCEPT CHECK
       "REFLECTION" IF 0 CHECKS THEN
(6)    PRINT "THE CORRECT ANSWER IS REFLECTION"/
(7) 5  PRINT "HERE IS THE NEXT QUESTION"/
What would happen when a student executed this
lesson? The first statement (statement 20) to be executed is the statement beginning on line (1) and extending to the end of statement marker ("/") on line (2).
Statements always begin on a new line; otherwise, they
may be typed in any way convenient. Line (1) would
print the question THE ANGLE OF INCIDENCE IS
EQUAL TO THE ANGLE OF ... on the computer-controlled typewriter. At line (2) the ACCEPT instruction would print an underscore ("_") on the next
line. This would be a signal to the student indicating
that an answer was expected. At this point the paper
in front of the student would look like this

THE ANGLE OF INCIDENCE IS EQUAL TO
THE ANGLE OF ...
The computer would then wait for the student, who
would type whatever he thought was an appropriate reply, then strike the carriage return key of the typewriter,
indicating that he was through with his answer. The
program would ACCEPT this answer, and CHECK it
against indicated possible answers. Suppose the student
had typed
REFRACTION
The CHECK command on line (2) would match this
answer against the quoted statements "REFLECTION" and "REFRACTION." The quoted statements
are called check strings. In this case the answer would
be identical to the second check string, so we say that
"2 CHECKS." At line (2), however, the question asked
is, "DOES 1 CHECK?" This would only be true if the
student had replied REFLECTION (the correct answer), in which case control would have been transferred
to the statement named 5, at line (7) of the lesson,
which continues with a new question.
However, 1 did not check, so the next commands to be
executed are those on line (3), which begins a new, unnamed statement. Lines (3) and (4) are straightforward.
The computer asks if 2 CHECKS, which it does, since
the student's reply was identical to the second check
string. Upon determining this, the computer types out
the correcting response given on lines (3) and (4).
Next the statement beginning on line (5) is executed.
This prints another line, urging the student to try
again, and an underscore (the ACCEPT of line (5))
telling him an answer is expected. The student will now
have in front of him

THE ANGLE OF INCIDENCE IS EQUAL TO
THE ANGLE OF ... REFRACTION
NO, THE ANGLE OF REFRACTION DEPENDS
ON THE TYPE OF LENS
TRY AGAIN

Assume that he replies correctly, printing REFLECTION. This will be read by the ACCEPT statement in
line (5) and the immediately following CHECK statement will determine that 1 CHECKS is true. IF 0
CHECKS tests to see if nothing checked, i.e., 0
CHECKS is true if the student's answer does not match
any of the check strings. In this case, the condition 0
CHECKS would be true for any answer other than
REFLECTION. Looking at the final three lines of the
conversation, we have

TRY AGAIN
REFLECTION
HERE IS THE NEXT QUESTION
But suppose that the student had not been so bright.
The final lines could have read
TRY AGAIN
WHO KNOWS?
THE CORRECT ANSWER IS REFLECTION
HERE IS THE NEXT QUESTION
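The matching rule the walkthrough relies on (a reply is compared against the check strings in order; "n CHECKS" is true when the nth string matched, and "0 CHECKS" when none did) can be sketched in Python. This is a hypothetical model for illustration, not the actual PL/I translator; the function name is invented.

```python
def check(answer, check_strings):
    """Return the 1-based index of the first check string the
    student's answer matches exactly, or 0 if none match."""
    reply = answer.strip()
    for i, s in enumerate(check_strings, start=1):
        if reply == s:
            return i
    return 0

result = check("REFRACTION", ["REFLECTION", "REFRACTION"])
# result is 2, so "2 CHECKS" is true and "1 CHECKS" is false
```

Under this model, the IF 0 CHECKS test on line (5) is simply check(reply, ["REFLECTION"]) == 0.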

More sophisticated programming
The example just given was very simple. Using the
computer's capabilities more fully, WRITEACOURSE
makes possible the specification of a much more complex branching sequence. There is also a limited arithmetical capability. A set of counters (temporary variables) is provided to keep track of intermediate results. Counters can be used either to do arithmetic or to
record the number of times a student takes a particular path through a course. This turns out to be a powerful device. We will give a few examples.
Counters are named by preceding a number with the
symbol "@." Thus @10 means "counter 10." Three
commands are defined for counters: SET (counter number) TO (value), ADD (value) TO (counter number),
and SUBTRACT (value) FROM (counter number).
They have the obvious meaning.

SET @10 TO 0

establishes 0 as the value of counter 10, while

ADD 5 TO @10

sets the value of counter 10 to 5 plus its original value.
It takes little imagination to see that the counters can
be used to keep scores on a student's responses, through
the device exemplified by

IF 1 CHECKS THEN ADD 1 TO @7.

The value of a counter may also be printed. To do this
the name of the counter is included in a PRINT command. When the command is executed, its current value
will be printed. The statement

SET @8 TO 5 PRINT "THE VALUE OF 8 IS @8"

will print

THE VALUE OF 8 IS 5.

The content of a counter is a value, so arithmetic can
be done on counters. ADD @2 TO @3 would set the
value of counter 3 to the original value of counter 2
plus the value of counter 3.
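The three counter commands can be modeled in a few lines of Python; the function names and dictionary storage here are invented for illustration, and stand in for the interpreted lesson code.

```python
# Hypothetical model of the WRITEACOURSE counter commands
# (not the actual translator; names are invented).

counters = {n: 0 for n in range(100)}     # counters @0 through @99

def set_to(counter, value):               # SET @10 TO 0
    counters[counter] = value

def add(value, counter):                  # ADD 5 TO @10
    counters[counter] += value

def subtract(value, counter):             # SUBTRACT 2 FROM @10
    counters[counter] -= value

set_to(10, 0)
add(5, 10)        # counter 10 is now 5

add(1, 7)         # scoring: IF 1 CHECKS THEN ADD 1 TO @7

# Counter values interpolate into PRINT strings:
set_to(8, 5)
print(f"THE VALUE OF 8 IS {counters[8]}")

# Arithmetic on counters: ADD @2 TO @3
set_to(2, 10)
set_to(3, 4)
add(counters[2], 3)   # counter 3 becomes 14
```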
There are actually three groups of counters. Counters

926

Fall Joint Computer Conference, 1968

50-99 are lesson counters, their values are carried over
from one use of a WRITEACQURSE lesson to another.
There are several reasons for doing this. For instance, a
counter can be used to keep track of the number of students executing a lesson, or the number of students who
miss a particular question. Counters 1 to 49 are the
temporary counters. They are set to zero when a student
first signs in for a session with the computer. They are
retained for that student, however, for the duration of
the session even if he switches WRITEACOURSE lessons. Finally, Counter 0 is a special counter set by the
computer's internal clock. It can be used to time a student's responses.
A set of Boolean IF statements are provided to check
the value of a counter against another counter, or some
constant value. The command IF @4 = 7 THEN GO
TO 6 will cause a transfer to statement 6 only if counter
4 contains 7. The normal arithmetical relations of
equality and ordered inequality are permitted.
Counter numbers may also be used for a computed
GO TO. GO TO @2 is an instruction to go to the statement whose number is contained in counter 2. Of
course, the instructor who writes this command must
insure that counter 2 will contain the name of a statement whenever this command is executed.
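A minimal sketch of how an interpreter might treat these two forms, assuming a simple statement-number dispatch (all names here are invented for illustration):

```python
# Hypothetical sketch of the counter test and the computed GO TO.
counters = {2: 6, 4: 7}

def branch_target(current):
    # IF @4 = 7 THEN GO TO 6: transfer only when counter 4 holds 7;
    # otherwise fall through to the next statement.
    return 6 if counters[4] == 7 else current + 1

def computed_goto():
    # GO TO @2: jump to the statement whose number is in counter 2.
    # The lesson writer must guarantee @2 names a real statement.
    return counters[2]
```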
Let us look at an example which uses some of these
more complex commands.
(1)    SET @54, @41 TO 0 PRINT "WHAT DISCOVERY
(2)    LEAD TO LASERS?"/
(3) 3  ACCEPT CHECK "MASER" "QUASER"
(4)    "CANDLES" IF 1 CHECKS THEN GO TO 6/
(5)    ADD 1 TO @41 IF 0 CHECKS THEN GO TO 40/
(6)    IF 2 CHECKS THEN PRINT "THAT IS IN
       ASTRONOMY."
(7)    GO TO 40/
(8)    IF 3 CHECKS THEN PRINT "DO NOT BE
       SILLY."/
(9) 40 IF @41 < 3 THEN PRINT "TRY AGAIN"
       GO TO 3/
(10)   ADD 1 TO @54 PRINT "THE ANSWER IS
       MASER."/
(11) 6 PRINT "HERE IS THE NEXT QUESTION"/

The following exchange might take place between the
student and computer.

WHAT DISCOVERY LEAD TO LASERS?
QUASER
THAT IS IN ASTRONOMY.
TRY AGAIN
MASER
HERE IS THE NEXT QUESTION.

The first statement sets counters 54 and 41 to zero,
then prints the basic question. Statement number 3
through statement number 40 establish a loop, which
checks the student's answer for the correct answer or
two anticipated wrong answers, prints an appropriate
message for a wrong answer, then gives the student
another chance. If the correct answer is detected (if 1
CHECKS in line (4)), the loop is broken by a transfer to
statement 6. If a wrong answer is detected, the question
is reasked. Counter 41 is used to keep track of the number of wrong answers. If three wrong answers are given,
the correct answer is printed, and the program continues on. If this alternative occurs, however, the value
of counter 54 is incremented by 1. Recall that counter
54 is one of the lesson counters, i.e., its value carries
over from one user of the lesson to another. At some
later time, then, an instructor could interrogate the lesson to see how many students had failed to answer this
question in three or fewer tries.
Lessons and courses

Statements are grouped into lessons, and lessons into
courses. Roughly, a lesson can be thought of as the number of WRITEACOURSE statements needed to carry
on the computer's part of a computer-student interaction lasting about half an hour. Another important
functional distinction is that a lesson is the WRITEACOURSE unit to which counters are attached. Thus if
@54 appears in two different statements in the same
lesson, it refers to the same counter. If the two statements are in different lessons, they refer to different counters. Note that this is not true for temporary counters,
since they remain attached to a student for the duration
of a student-computer conversation. Thus if it is anticipated that a student will use more than one lesson during a single session, the results accumulated while the
first lesson is active may be communicated to the second
lesson via the temporary counters.
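One way to picture the three counter groups, assuming per-lesson storage for @50-@99 and per-session storage for @1-@49. This is a sketch under those assumptions, not the actual file layout, and every name is invented.

```python
import time

# Hypothetical sketch: @1-@49 follow the student for a session,
# @50-@99 persist with the lesson, and @0 reads the clock.

lesson_counters = {}    # stands in for the per-lesson file storage

def sign_in(lesson):
    """Start a session: temporary counters are zeroed, lesson
    counters are loaded from (or created in) the lesson's file."""
    lesson_counters.setdefault(lesson, {n: 0 for n in range(50, 100)})
    return {n: 0 for n in range(1, 50)}

def read_counter(n, temporary, lesson):
    if n == 0:
        return int(time.time())        # counter 0: the internal clock
    if n < 50:
        return temporary[n]            # temporary: per student session
    return lesson_counters[lesson][n]  # lesson: carries over between uses

temp = sign_in("LESSON1/LIT47")
lesson_counters["LESSON1/LIT47"][54] += 1   # e.g. one more student missed
```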
Lessons themselves are grouped into courses. Functionally, the chief distinction of a course is that it is possible to activate one lesson from within another, providing that the two lessons are in the same course. Suppose a student signs in, with the intention of taking a
course in Romance Literature. He would begin by indicating that he wanted to work on the first lesson of this
course. He would do this by replying, in response to a
computer question, that he wished to work on LESSON1/LIT47. LIT47 is assumed to be a course name,
and LESSON1 a lesson of the course. Let us suppose
that this lesson is going to discuss the novel Don Quijote.
The instructor might want to check to make sure the
student knew enough Spanish to understand some of the
phrases. This can be accomplished by the following
statement.

(1) 1 PRINT "DO YOU WISH TO REVIEW
      SPANISH?"
(2)   ACCEPT CHECK "YES" IF 1 CHECKS
      THEN CALL SPREVUE/LIT47/

If the last command on the second line is activated, it
will suspend the current lesson now active (LESSON1/
LIT47), and load the lesson SPREVUE/LIT47. Both
lessons must be in the same course. Upon completion of
SPREVUE/LIT47, control would be returned to the
statement in LESSON1/LIT47 immediately after
line (2).
The command CALL (lesson name)/(course name) thus
suspends and resumes. The command LINK (lesson name)/(course name)
will also change a student from one lesson to another
within the same course. In this case, however, there is
no automatic return to the calling lesson after the called
lesson is completed. The normal use of LINK is to
string together several lessons which the instructor
wishes to have executed in sequence.
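The CALL/LINK distinction amounts to keeping, or not keeping, a return point. A sketch in Python under that assumption; the function names and the stack representation are invented.

```python
# Hypothetical sketch: CALL keeps a return point and resumes the
# caller when the called lesson ends; LINK transfers for good.

call_stack = []

def call(current, lesson):      # CALL SPREVUE/LIT47
    call_stack.append(current)  # remember where to resume
    return lesson

def link(current, lesson):      # LINK (lesson)/(course)
    return lesson               # no return address is kept

def lesson_complete(stop=None):
    # resume the CALLing lesson if there is one, otherwise stop
    return call_stack.pop() if call_stack else stop

active = call("LESSON1/LIT47", "SPREVUE/LIT47")
# ... SPREVUE/LIT47 runs to completion ...
active = lesson_complete()      # control returns to LESSON1/LIT47
```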

Using WRITEACOURSE

The steps in using WRITEACOURSE will now be
described. The steps a student must go through to initiate a lesson have been kept to a minimum. He types
XEQ and then supplies the lesson name and course
name when requested. After a lesson is over, he may
type XEQ and go through another lesson, or type
STOP to terminate the session.
When an instructor constructs a lesson, the process
is necessarily more involved. After calling the system
the instructor sends the message ///COMPILE indicating a course is to be established or modified. (In
general, the symbols "///" precede compiler commands.) If a new course is to be written, the order is
sent.

///PROGRAM NEW lesson/course

The translator will then be ready to accept the lesson.
Each statement is checked for syntax errors as it is
received. If there is no error, the next statement is requested. Whenever an error is detected, a message is
printed indicating where it occurred. After determining
the corrected form, the instructor re-enters the statement, from the point of the error to the end. When the
instructor wishes to stop working on the lesson, he types
///END. The lesson will be automatically stored in the
computing system's files. If the instructor desires, he
may order a check for undefined statement numbers referenced by GO TO instructions before the lesson is recorded.
The instructor may modify existing lessons or obtain
a listing of lessons, using the commands ///ADD,
///DELETE and ///LIST.

System implementation.

WRITEACOURSE has been tested on an IBM
360/50 with a remote 2741 terminal. The translator
was written in the RUSH subset of PL/I, provided by
Allen-Babcock Computing (8). The only non-standard
PL/I used is the timer function. WRITEACOURSE
lessons are incrementally compiled into a decimal integer code, which is stored in a data file. The storage file
for each course consists of 64 tracks of fixed format data
with a block size of 252 bytes. The internal code is
edited whenever a teacher makes a modification. The
execution program interprets this code to produce the
sequence of events planned by the instructor. The first
block of code in a course contains the names and locations of the lessons in it. Each lesson occupies 38 blocks
of the file, and is divided into five parts.

1. The instruction table, which contains the compiled
decimal code with approximately one code word
for each instruction in the lesson.
2. The statement number table, which contains the
statement numbers with a pointer to the corresponding instructions.
3. The counters attached to the lesson.
4. The print tables, which contain all of the strings
to be printed.
5. The print table index, which contains a pointer to
the location of each string.
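The five parts might be pictured as the fields of a record. This Python sketch assumes code refers to print strings by index; the class and method names are invented, and the real system packs these parts into fixed-format 252-byte blocks rather than Python objects.

```python
from dataclasses import dataclass, field

@dataclass
class Lesson:
    instructions: list = field(default_factory=list)       # compiled decimal code
    statement_numbers: dict = field(default_factory=dict)  # number -> code index
    counters: dict = field(default_factory=dict)           # lesson counters @50-@99
    print_strings: list = field(default_factory=list)      # every string to print
    print_index: list = field(default_factory=list)        # location of each string

    def add_statement(self, number, code):
        # named statements get an entry pointing at their first code word
        if number is not None:
            self.statement_numbers[number] = len(self.instructions)
        self.instructions.extend(code)

    def add_string(self, s):
        self.print_index.append(len(self.print_strings))
        self.print_strings.append(s)
        return len(self.print_index) - 1   # code refers to strings by index
```

A GO TO would then be resolved by looking its target up in statement_numbers.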
Since the source code is not saved, the compiled code must be used whenever the lesson is changed. To obtain a listing of the lesson, the code is interpreted, as if it were to be executed, and the source code is reconstructed. When a section of a lesson is deleted, the instruction table and the statement number table are closed up to eliminate the deleted portion. The strings in the print tables are marked inactive, for later garbage collection. Code is added to a lesson by opening a hole of the proper length in the instruction table and statement number table, and then inserting the compiled code. New print strings are added to the end of the print tables.
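The editing operations just described, closing up a table after a deletion and opening a hole for an insertion, amount to simple sequence splicing. A Python sketch with hypothetical helper names:

```python
# Sketch of the table-editing operations described above.
# The helper names are illustrative, not taken from the implementation.

def delete_range(table, start, count):
    """Close up the table to eliminate the deleted portion."""
    return table[:start] + table[start + count:]

def insert_code(table, at, new_code):
    """Open a hole of the proper length and insert the compiled code."""
    return table[:at] + new_code + table[at:]
```

Because statement numbers point at instruction-table positions, the real system must also adjust those pointers after a splice.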
A pointer is kept in each table to indicate the last entry in the table, so that new code can be added to the end of a lesson. Source code is input to the compiler one statement at a time. The compiler analyzes the statement instruction by instruction. If it detects any errors, it requests that the user re-input the statement from the instruction containing the error to the end.

WRITEACOURSE is broken into several programs in order to fit within the limited computer space available in a time-shared system. The programs operate as overlay segments, with PL/I external variables used to communicate between them. The modular structure of WRITEACOURSE should facilitate system additions or modifications. Figure 1 shows the basic overlay structure. The functions of each program are indicated in the figure.

FIGURE 1-Basic overlay structure. A monitor (compilation monitor / execution monitor) sets up lessons for execution and handles file operations and lesson modifications; COMPIL accepts lesson statements from the user, checks syntax, and produces internal code; EXCUT interprets the internal code and handles student responses.

Status
The earlier ALGOL version of WRITEACOURSE has been successfully used by people with little programming experience. Although the current version, at the time of this writing, has not yet been put into general use, the programming is completed. A limited number of manuals describing the language details and use of the system are available from the Department of Psychology (Cognitive Capabilities Project), the University of Washington. Listings of the translator and manuals will be provided upon request and at cost.

REFERENCES
1 Burroughs B5500 information processing systems extended Algol language manual
Burroughs Corporation 1966
2 S L FEINGOLD C H FRYE
User's guide to PLANIT
System Development Corporation 1966
3 J FELDMAN D GRIES
Translator writing systems
Comm ACM Vol 11 Feb 1968
4 S HENDRICKSON E HUNT
The WRITEACOURSE language programming manual
Department of Psychology University of Washington 1967
5 IBM 1500 operating system computer-assisted instruction coursewriter II
Form CAI-4036-1 IBM
6 IBM operating system/360 PL/I: language specifications
Form C28-6571-2 IBM 1966
7 P NAUR (Editor)
Revised report on the algorithmic language Algol 60
Comm ACM Vol 6 pp 1-17 Jan 1963
8 RUSH terminal user's manual
Allen-Babcock Computing Inc 1966

FOOTNOTES
1. This research was supported by the Air Force Office of
Scientific Research, Office of Aerospace Research, United
States Air Force, under AFOSR Grant No. AF-AFOSR1311-67. Distribution of this document is unlimited.
2. We wish to express our thanks to Sidney Hendrickson for
his comments and work on an earlier version of the language.
3. There is an unfortunate ambiguity in the word "program",
since it is used by educators to mean a sequence of interchanges
between student and teacher, and by computer scientists to
mean the sequence of commands issued to a computer. We shall
use "lesson" when we mean "sequence of educational steps"
and "program" when we mean "sequence of commands to
be executed by a digital computer."
4. At this point the mind of people not familiar with modern
computer technology tends to swim. It is possible to carry
this process even further (3).
5. The statement had to end at line (2) because of the IF...THEN command. The general rule is that when a question of the form IF condition THEN is asked, the commands between the word THEN and the next I are executed only if the condition is true. If it is false, as it is in this case, the command immediately following the I, i.e., the first command of the next statement, is executed.
6. More complicated matches are possible, which do not require exact identity. For instance, it is possible to ask if a check string is included anywhere in an answer, so that, in this case, 2 would check if the answer had been IT IS REFRACTION.
7. A manual describing the language in detail is available.

A table driven compiler for use with automatic test
equipment
by ROLAND L. MATTISON and ROBERT T. MITCHELL
Radio Corporation of America
Burlington, Massachusetts

INTRODUCTION
When generating compilers for use with automatic test equipment (ATE), a substantial need arises for flexibility in both the source and object languages. Flexibility is desirable for two reasons: (1) the field of ATE construction is rapidly expanding,1 and (2) the hardware and support software design, development, and debug cycles are often going on simultaneously. In earlier, more standard compilers, the modification and/or extension of either language could easily create chaos for the systems programmer.

In an attempt to facilitate compiler implementation and growth, a table driven system, the Universal Test Equipment Compiler (UTEC), has been developed. As in other table driven systems,2 the function of defining a source language has been dissociated from the actual translation mechanism. The source language is specified to the generator, which creates a set of tables for subsequent use by the translator. A dual-purpose meta-language has been created for use in the system. This language is used to specify the syntax of a particular source language and the meaning to be imparted to the various allowable constructs of that language.

A typical ATE system consists of various programmable devices for applying stimuli to, and obtaining measurements from, the unit under test (UUT).1,3

A requirement peculiar to ATE compilers is the creation of a wire list specifying connections between the ATE and the UUT. An equipment designator has been included in the system to handle the wire list and to insure that the wire list remains fixed despite source program recompilations. This is necessary due to the cost incurred in the production of this wiring.

The wide range of computers currently used in ATE dictates that the output of UTEC be a symbolically addressed code which must then proceed through the second pass of a normal two pass assembler. Since this reduced assembler could be different for each type of ATE, it will be excluded from the following discussion.

The flow of information through the UTEC system is depicted in Figure 1. The source language specifications and translation logic are defined to UTEC using the meta-language and are fed into the generator. From this, the generator produces translation tables for use by the translator. The generator also accepts the ATE hardware configuration and produces equipment tables for the equipment designator. When a source program is input to UTEC for translation, the translator uses the translation tables and outputs an intermediate code ready for assembly.

FIGURE 1-System flow block diagram (the equipment specification and language definition enter the generator, which produces equipment tables and translation tables; the engineer's test program passes through the translator and assembler to machine code on paper tape, cards, or mag tape)

Whenever ATE equipment must be specified by the translator, it inserts a symbolic into the intermediate code, and requests the required equipment from an available equipment pool in the equipment tables. The request is tied to the intermediate code by the symbolic. The equipment designator now processes the equipment requests and, using the equipment tables, produces equipment assignments for each symbolic in the form of a symbol table.
The META-language

We now present a language, SYNSEM (SYNtax and SEMantics), for explicitly defining a problem oriented language (POL).4 SYNSEM is itself a twofold problem oriented language which (1) specifies the syntax of the POL's and (2) specifies the semantics of the allowable constructs in a POL. SYNSEM is therefore divided into two sublanguages: SYN for specifying syntax, and SEM for specifying semantics.

Problem oriented languages currently in use with ATE are tabular in format. The reason for this, and examples of such languages, have been previously presented5,7 and therefore will not be considered here. Let it suffice to say that fixed fields are generally adhered to, with one field set aside for the function or verb and the remaining fields for modifiers of various types. Each verb-modifier complex is referred to as a source statement.

The goal of SYN is to allow format-syntax type information to be specified for each verb of the POL. This information is encoded into a table by the generator and will be used by the translator whenever the verb is used in a source program. SYN is comprised of various disjoint subsets of any commonly used character set. Three of the subsets are given below:

NUMBERS = {A, B, C, D, E, F, G, H, I, M, N, O}

LETTERS = {P, Q, R, U, V, W, Y}

MAIN UNITS = {K}

To specify a numeric modifier, a letter is chosen from NUMBERS and repeated so that the number of times the letter appears equals the maximum number of digits the modifier may contain. Alphabetic modifiers are handled in a similar way by choosing from LETTERS.

If desired, a MAIN UNITS modifier may be used with any verb. When the letter K is recognized by the generator, the four characters immediately following the K are taken as the MAIN UNITS and entered into a dictionary with the verb. MAIN UNITS are used to further distinguish the verb when more than one source statement uses the same verb.

As an example, consider the following SYN statement to specify the verb CONNECT with the modifier VDC:

CONNECT   AAA   KVDC   BBB   PPPP

The modifiers may be two numeric (A and B) and one alphabetic (P), and the MAIN UNITS for this form of CONNECT is VDC.
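The way a SYN specification is read off, character class by character class, might be sketched as follows. This is a modern Python illustration; the function name and the returned descriptors are assumptions, and the character subsets are those given in the text:

```python
# Character subsets from the SYN definition in the text.
NUMBERS = set("ABCDEFGHIMNO")   # letters denoting numeric modifiers
LETTERS = set("PQRUVWY")        # letters denoting alphabetic modifiers
MAIN_UNITS = "K"                # introduces a main-units field

def parse_syn(spec):
    """Turn a SYN field specification into (kind, detail) descriptors.

    A minimal sketch: the real generator also builds the format list
    and enters each field's symbolic character into the argument table."""
    fields = []
    for token in spec.split():
        ch = token[0]
        if ch == MAIN_UNITS:
            # The characters following K name the main units (up to 4).
            fields.append(("main_units", token[1:5]))
        elif ch in NUMBERS:
            # Repetition count gives the maximum number of digits.
            fields.append(("numeric", len(token)))
        elif ch in LETTERS:
            fields.append(("alpha", len(token)))
        else:
            raise ValueError("unrecognized SYN field: " + token)
    return fields
```

Applied to the CONNECT example above, `parse_syn("AAA KVDC BBB PPPP")` yields two numeric fields of width 3, the main units VDC, and an alphabetic field of width 4.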
Once SYN has been used to specify a given verb, a SEM "program" is written, later to be executed by the translator, which analyzes the verb-modifier relationship and generates the desired intermediate code for the source statement. The SEM language is composed of a number of semantic instructions, some of which are described below. A maximum of 750 such instructions can be used in any one SEM program. A statement in SEM consists of a semantic instruction followed by its possible modifiers. A three digit label is optional for all statements. A three digit branch is required with some instructions and optional with others. If a branch is given on optional instructions, it is considered unconditional. The SEM instructions are divided into three categories: (1) code producing, (2) modifier handling, and (3) control.

CODE, CVAR, CALPHA, and CSIGN are four of the code generating instructions. CODE tells the translator to output the characters which are literally specified with the CODE instruction.

CODE   3   PS1

will cause the three characters "PS1" to appear in the intermediate code. CVAR, CSIGN, and CALPHA are each used with an identifier which the translator references to find the data to be output. The identifiers with CVAR and CSIGN must be numeric.
CSIGN   NUM1

will generate a + or - depending on the sign of NUM1.

CVAR   NUM1   4   2

will cause the value of NUM1 to be coded using four characters with two implied decimal places. If NUM1 = 46.913, the characters 4691 will be coded.

CALPHA   W1   3

will cause the three leftmost characters of W1 to be coded.
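The CVAR and CALPHA behaviour described above can be sketched in Python. The function names mirror the instruction mnemonics but are otherwise hypothetical, and since the text does not say whether CVAR rounds or truncates, the rounding here is an assumption:

```python
def cvar(value, width, implied_decimals):
    """CVAR sketch: code a numeric value into `width` characters with
    the given number of implied decimal places (rounding is assumed)."""
    scaled = int(round(value * 10 ** implied_decimals))
    return str(scaled).rjust(width, "0")[:width]

def calpha(text, n):
    """CALPHA sketch: code the n leftmost characters of the modifier."""
    return text[:n]
```

Note that `cvar(4.2, 4, 2)` gives "0420", which is exactly the field that appears in the worked CONNECT example later in the paper.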
TEST, RANGE, and the four arithmetic operators ADD, SUB, MUL, and DIV are some of the modifier handling SEM instructions. TEST causes the translator to compare a referenced quantity with a group of characters specified following the instruction. RANGE causes a check of a referenced quantity to see if it is numerically between two limits. Execution of either a TEST or RANGE instruction by the translator can cause a branch in program flow to a labeled SEM statement if the comparison fails.

TEST   A   3   AMP   40

causes a comparison of the three leftmost characters of A with the three characters AMP.

RANGE   B   20.0   30.0   40

causes a comparison of B to see if 20.0 <= B <= 30.0. If the above comparisons are satisfied, the translator executes the next sequential instruction; otherwise, statement 40 will be processed next.
JUMP, SCWL, ROUTINE, and EQUI are each SEM control instructions. JUMP is used with a SEM statement label and causes an unconditional transfer by the translator to the labeled statement. SCWL informs the translator that all of the intermediate code generated for a particular source statement must be saved with a label for future use. The SEM language is provided with a subroutine capability through the ROUTINE instruction. ROUTINE may be followed by a parameter list of from one to seven dummy parameters. The CALL instruction, followed by the actual parameters, is used to invoke a SEM subroutine. The EQUI instruction is used to cause the translator to generate a symbolic equipment request for the equipment designator. Its modifiers must be a unique symbolic, which will be placed in the intermediate code by the ECODE instruction, a type number referencing a particular pool of equipment, and a set of "connections" to which a specific piece of equipment from that pool should be wired.

A control card called REQUIRED is used between the two parts of the SYNSEM language and lists all required modifiers in the SYN portion.

The following example shows the combined use of the SYN and SEM languages to specify the verb CONNECT modified by VDC.
CONNECT    AAAA   KVDC   JC   JD
REQUIRED   AC
NEWN       UN1
CODE       1      S
RANGE      A      0      50     10
EQUI       UN1    5      C      D
ECODE      UN1
CVAR       A      4      2      20
10 RANGE   A      51     120    30
EQUI       UN1    6      C      D
ECODE      UN1
CVAR       A      5      2      20
20 CODE    2      ES     40
30 ERROR   OUT OF RANGE
40 END

The above program is suitable for input to the generator, which would create the necessary table entries for later use by the translator after it sees the CONNECT VDC verb in a source program. For example, suppose the statement

CONNECT   4.2   VDC   J1-4   J3-5

was processed by the translator. The code produced by the previous definition would be

S ;00005 ;0420ES

where 00005 is the equipment symbolic. This would cause the ATE to connect 4.2 volts of direct current between points J1-4 and J3-5. The equipment symbolic number 00005 was produced by the SEM instruction NEWN.

In a compiler for use with ATE, there is another function of the meta-language, other than defining source statement syntax and semantics. It is to specify the equipment available in any ATE configuration. This is done in UTEC by the two instructions SYMBOL and DATA.

All equipment is divided into types, e.g., power supplies, signal generators, voltmeters, and each type is given a number to identify it. One SYMBOL instruction and as many DATA instructions as there are pieces of equipment in a type are used to define that type. The SYMBOL instruction tells how many pieces of equipment are in the type, how many connection terminals each has, and the total table area required to store requests for this type. Each DATA instruction gives an equipment name and the ATE connection terminals for it.
When a new POL, or a modification to an existing POL, is defined to UTEC by means of SYNSEM, the syntax of the language and its semantics are stored into tables by the generator. Since ease of language modification is a requirement, three tables have been implemented as linked lists.6 The format list is used to hold the syntax specification for each verb in the language. The logic list is used to hold one entry for each SEM instruction specified in a verb definition. The logic modifier list holds SEM instruction modifiers which are not suitable for entry in the logic list. A dictionary is also used which contains the name of each function defined by SYNSEM as well as various pointers to the lists. Since UTEC is designed to handle POL's for automatic test equipment, it automatically controls the assignment of equipment and produces wire lists. A pair of equipment tables, the hardware name table and the hardware usage table, are built by the generator to aid in these tasks.

The generator
The generator division of UTEC accepts the definitions of verb syntax and semantics written in the SYNSEM language, and assembles this information into all the necessary tables and lists. It also has the ability to delete and equate verbs in the lists, and to build the equipment tables. The generator is used whenever a DEFINE, EQUATE, DELETE, or EQUIPMENT control card is encountered, and is divided into four corresponding sections as follows:

Define: Following the DEFINE control card, the SYNSEM language is used to define verbs. First the syntax of a verb is given using the SYN language. The verb is placed into the dictionary. The syntax specification is analyzed character by character, determining the type of each argument encountered. The generator counts the number of characters allowed in each argument (or uses a standard count), and thus builds the format list. It also enters the symbolic character of each argument, along with its format list position, into an argument table for later reference by the REQUIRED control card and SEM language instructions. At the completion of analyzing the syntax, the argument table contains the one letter symbolic of each argument, in the order in which they will appear in the source statements. The REQUIRED control card, containing the one letter symbolic of each required argument, follows the SYN syntax specification. Each argument is found in the argument table and its format list position obtained. The format list is thus modified to indicate which arguments in the syntax are required with each usage of the verb. After the REQUIRED control card is processed, the generator must load the SEM program, which gives the semantics of the verb, into the logic and logic modifier lists. There are fourteen different formats for the thirty-six SEM instructions. There are fourteen corresponding routines in the generator to handle the building of the lists. For each instruction, the correct routine is called to set up the list entries. When a modifier of a verb is referenced by an instruction, the one character symbolic of the SYN language is given as a modifier to the SEM instruction. This character is looked up in the argument table and its integer position number is used for the logic list entry. (When a source statement is parsed by the translator, each argument is loaded into a table at the same position as is used for its SYN symbolic character in the generator.) Variables may be established in the SEM language by a symbolic name. This symbolic name is placed into the argument table after the symbolic SYN modifier characters, thus establishing a location for numeric reference in the logic list entries and for storage use by the translator. The argument table provides storage only within a single definition, in that each new source statement starts using this table at its top, destroying symbolic names from previous source statements. The SEM instructions SX and TX provide storage locations for use throughout an entire source program compilation. A table exactly like the argument table is used, except that the location's symbolic name is never destroyed, thus giving each definition access to the same location and providing for exchange of information between definitions. If a SEM instruction requires alphanumeric information, or if one entry in the logic list is not sufficient to contain all the necessary data for the instruction, a pointer to the logic modifier list is placed in the logic list entry, and as much space as necessary is used in the logic modifier list. A two digit op code (1-36), for access by the translator, and a branch and link address are always in a standard location in each logic list entry.
Equate: Many verbs in a particular POL developed for use with automatic test equipment are similar in syntax and semantics. For example, the source statement for connecting a stimulus to deliver volts is very similar to the statement for connecting kilo-volts or milli-volts. The equate section of the generator was therefore developed, whereby two or more verbs may share the same definition, and therefore the same list area. The name of the verb to be equated is placed in the dictionary and all the pointers associated with the equated verb are used with the new one, thereby using the same definition. In order that the small differences between the two verbs can be taken into account, the CHFLG SEM instruction must be used in the original definition. This instruction requires two indicators which are stored in the dictionary. A definition always sets them to zero, but they may be set to any desired value by the language designer using the EQUATE option. The CHFLG op code can test the value of these indicators and thereby set up branching logic in the SEM language program to control the translation.

Delete: The generator section of UTEC maintains a list of available cells to which the DEFINE
section looks as it makes the various list entries
for a given definition. The purpose of the DELETE section is to remove previously defined
verbs from the dictionary and to restore their
various list entries to the list of available cells.
This is done by changing the link at the bottom of
the list of available cells to point to the top of the
list entry for the deleted verb. This makes the
last list entry for the deleted verb the bottom cell
on the list of available cells.
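The splice just described, relinking a deleted verb's cells onto the bottom of the available list, can be sketched in Python. The cell layout and field names are assumptions; the point is that deletion costs two link updates, regardless of how many cells the verb occupied:

```python
# Sketch of the DELETE bookkeeping: list entries live in a pool of cells
# threaded by "next" links. Cell layout and field names are illustrative.

def delete_verb(cells, avail_tail, verb_head, verb_tail):
    """Splice a deleted verb's cell chain onto the available list.

    The bottom of the available list is pointed at the top of the deleted
    entry, making the entry's last cell the new bottom of the free list."""
    cells[avail_tail]["next"] = verb_head   # free-list bottom -> entry top
    cells[verb_tail]["next"] = None         # entry's last cell is new bottom
    return verb_tail                        # new tail of the available list
```

A subsequent DEFINE then draws cells from this list as it makes its entries, so the space is reused.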
Equipment: In this section, the generator builds
the hardware name table and allocates area in the
hardware usage table, both of which are used by
the equipment designator. All equipment of each
type which the system has available is defined in
the SYNSEM language by the SYMBOL and
DATA instructions. The generator calculates the
area required in each table section, and sets up
the pointers in the hardware name table and hardware usage table. The DATA cards contain the name of one piece of equipment along with its terminal connections in the allocated positions in the hardware name table.

Translator
The analysis and translation of source programs, and the subsequent output of intermediate code, are handled by the translator.
When analyzing a source statement, the verb is
first checked against the list of defined verbs in
the dictionary. When a match occurs, an attempt
is made to verify any main units allowed with the
verb. Once a verb and main units match is made,
the associated dictionary entries are used as references to the format and the logic lists where information specified by the SYNSEM program for
this source statement has been stored by the generator.
Using the format list as a guide, the translator
parses each source statement and creates an argument table as it goes. A left to right scan of the
source statement is initiated looking for a modifier
of the type specified in the first format list entry.
If the modifier is found, it is placed in the first
position of the argument table. The scan continues
looking for the next modifier as called for in the
next format list entry and places it in the next
available argument table position. An error condition exists if the scan fails to verify a modifier,
unless that modifier is not required in which case
a dash is placed in the argument table in the next
position. The scan finishes when the entire format
list for this verb has been considered and an argument table entry is present for each item in the
list. Once the format scan has been completed, the
translator turns its attention to the logic list
where the algorithm for generating intermediate
code has been stored for this source statement.
Each entry in the logic list is a numeric representation of one of the SEM instructions. A two digit
op code is extracted from each entry which identifies the particular SEM instruction requested.
Once the instruction is known, the translator is
able to completely dissect the logic list and modifier list entries for this instruction and perform
the desired operation. All references by the instruction to the verb modifiers are made by simply
referencing the argument table position for that
modifier, as the parsing algorithm already has
inserted the modifiers in the table. As an example,
consider the following SYNSEM specification:

934

Fall. Joint Computer Conference, 1968

CONN    AAA   KVDC   JC   JD
RANGE   A     10.0   20.0   100
The generator creates the argument table as
shown in Figure 2.
1   A
2   K
3   C
4   D

FIGURE 2-Argument table of generator

Since the character A is in position 1, this position number is used in the logic list when the
RANGE instruction is processed by the generator.
When the source statement:
CONN 14.6 VDC J101-42 J16-33
is parsed by the translator, the argument table is
filled as shown in Figure 3.
1   14.6
2   VDC
3   J101-42
4   J16-33

FIGURE 3-Argument table of translator

When the translator discovers the RANGE instruction number in the logic list, it decodes a
reference to position one in the argument table for
the number it is to test. In the example, the number 14.6 is checked to determine if it is between
10.0 and 20.0.
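The format-list-driven scan that fills the argument table can be sketched as follows. This is a simplified Python illustration: the descriptors, match rules, and error handling are assumptions, and connection fields are treated as free-form:

```python
def parse_statement(tokens, format_list):
    """Fill the argument table by scanning tokens left to right against
    the format list; a dash marks an optional modifier that failed to
    match, as described in the text."""
    args, pos = [], 0
    for kind, required in format_list:
        if pos < len(tokens) and matches(tokens[pos], kind):
            args.append(tokens[pos])
            pos += 1
        elif required:
            raise ValueError("required modifier missing: " + kind)
        else:
            args.append("-")          # optional modifier absent
    return args

def matches(token, kind):
    """Crude field check: numeric fields must look like numbers;
    other fields accept any token in this sketch."""
    if kind == "numeric":
        return token.replace(".", "", 1).replace("-", "", 1).isdigit()
    return True
```

For the statement CONN 14.6 VDC J101-42 J16-33, the scan produces exactly the argument table of Figure 3, so a SEM instruction referencing position 1 picks up 14.6.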
Each logic list entry provides the translator with the position of the next instruction to be considered; in the case of conditional instructions, the translator must pick the next instruction from two or three choices after it performs the current instruction.

The translator continues through the logic list until the END op code is discovered, at which time it has completed its analysis and code generation for the source statement under consideration. The next statement is read and the entire process repeats. When the translator reads the END verb, it turns the intermediate code generated for the program over to the assembler for final object code production.

Equipment designator
Each time the translator processes a source statement which requires the use of ATE equipment, an entry on a tape is generated by means of the SEM instructions EQUI or PREAS. This tape is called the request tape. The translator itself has no ability to select equipment from the available equipment pool in order to satisfy the needs of the source statement. The SEM language program used to translate these source statements requiring equipment first generates a unique symbolic number which will be used to symbolically refer to an equipment name in the intermediate code it produces. It then determines the type of equipment required by the source statement and generates the request tape entry. Each such entry tells the type of equipment desired, the symbolic number used to identify it, and how the terminals of that equipment should be connected. In the case when a specific piece of equipment must be used in a particular manner, the translator also processes an equipment preassignment by producing a request tape entry which gives the specific name of a device and tells how it is to be connected.

The function of the equipment designator is to read and process the entries on the request tape produced by the translator. It attempts to match an equipment name of the correct type to each symbolic number and produce information describing how each piece of equipment is to be connected. In the assembly of the intermediate code, each symbolic number is replaced by the matching equipment name as provided by the equipment designator.

The equipment designator operates using two tables: the hardware name table and the hardware usage table. The hardware usage table is divided into two sections for each equipment type: the hardware assignment section and the hardware request section.

The hardware assignment section for each equipment type contains one assignment indicator and one row for each piece of equipment of that type. The indicator gives the status of the equipment, while the row contains references to the connections made to this equipment.

The hardware request section for each equipment type can contain a number of requests for equipment of that type. Each request section entry is made up of an indicator and row like those in the assignment section, plus a half-word which is used to hold the unique symbolic number for the request. The number of entries allowed in the request section for a particular type of equipment is specified in the SYNSEM equipment definition.

The requests processed by the designator fall into two classes: (1) those which name specific pieces of equipment, and (2) those which symbolically seek an assignment of any piece of equipment of a specified type. When fulfilling requests, two passes are made over the request tape, with the items in classes one and two being handled on passes one and two respectively.

On pass one, the designator simply reads the requests and, in the hardware assignment section, sets the indicator for the named piece of equipment and fills the rows with the connection references. Before the first pass, all indicators reflect an equipment-available status. After pass one, the indicators of the equipment named in pass one are set to indicate one of two states: (1) Hard preassigned: the specified equipment may only be used as stated. (2) Update preassigned: the specified equipment should be used as stated if possible, but may be used differently if needed. This preassignment is automatically generated at the end of each compilation for each piece of equipment used. It is then submitted on the following run to insure that the same wire connections will be generated whenever possible, even when changes are made in a source program.
On pass two, the designator tries to assign one piece of equipment to each symbolic request. In addition, it creates the matching list to be used by the assembler when processing the symbolic references in the intermediate code.

Each request causes a scan of the hardware assignment section for the type of equipment requested. If the connections of the request match those of a piece of equipment already used, the request is matched with that equipment. If the connections of the request do not match those of any already used, a new piece is assigned to match this request. If all the equipment of the type requested has been used, the request is put into the hardware request section and saved. When the entire request tape has been read in pass two, the designator is finished unless some unfulfilled requests remain in the request section. If unfulfilled requests do exist, the designator scans the assignment section for all equipment which was update preassigned but not used in this compilation. It resets the indicators of these equipments to reflect an available status. An attempt is then made to assign the unfulfilled requests to the equipment made available. If a request still cannot be satisfied, it remains in the hardware request section.

Finally, a wire connection list is produced from the hardware assignment section, giving all the equipment used in the compilation and how it is to be connected. How the equipment was used in relation to a possible previous compilation is also stated. Error conditions are produced based on entries remaining in the hardware request section. New update preassignments are also generated for use if the program is to be changed and recompiled, so that a similar wire list can be produced.
CONCLUSION
At this time, UTEC has been completely written and checked out using FORTRAN IV, and a language developed for use with one type of automatic test equipment (LCSS) currently being produced by RCA has been implemented using UTEC. The implementation of another language for a second type of equipment is being considered at this time.

It is interesting to note that after having defined the language to UTEC, the users could evaluate the quality of the language and its usefulness, and suggest changes and improvements. These changes were easily incorporated into the language, almost daily during a shakedown period, thus allowing them to be tested within days after they were conceived. The overall effect was to stimulate ideas for improvement. Thus, a much more effective language than that originally specified was developed.
REFERENCES
1 B J EVANZIA
Automatic test equipment: a million dollar screwdriver
Electronics August 23 1965
2 P Z INGERMAN
A syntax-oriented translator
Academic Press 1966 ch 1 pp 13-19
3 V MAYPER
Programming for automated checkout-Part I
Datamation April 1965 Vol 11 No 4 pp 28-32
4 V MAYPER
Programming for automated checkout-Part II
Datamation May 1965 Vol 11 No 5 pp 42-46

Fall Joint Computer Conference, 1968

5 B L RYLE
The ATOLL checkout language
Datamation April 1965 Vol 11 No 4 pp 33-35
6 M V WILKES
Lists and why they are useful
Proc ACM 19th Natl Conf August 1964 Phila Pa
7 B H SCHEFF
Simple user oriented compiler source language for programming automatic test equipment
Communications of the ACM April 1966 pp 258-266

On the basis for ELF-An extensible
language facility*
by T. E. CHEATHAM, JR., A. FISCHER and P. JORRAND
Computer Associates, Inc.
Wakefield, Massachusetts

INTRODUCTION
There are two basic premises which underlie the
development of ELF. The first of these is that there
exists a need for a wide variety of programming languages; indeed, our progress in the understanding and application of computers will demand an ever widening variety of languages. There are, in fact, "scientific" problems, "data processing" problems, "information retrieval" problems, "symbol manipulation" problems,
"text handling" problems, and so on. From the point of
view of a computer user who is working in one or more
of these areas there are certain units of data with which
he would like to transact and there are certain unit
operations which he would like to perform on these data.
The user will be able to make effective use of a computer
only when the language facilities provided allow him to
work toward a desired result in terms of data and operations which he chooses as being a natural representation
of his conception of the problem solution. That is, it is
not enough to have a language facility which is formally
sufficient to allow the user to solve his problem; indeed,
most available programming languages are, to within
certain size limitations, universal languages. Rather,
the facility must be natural for him to use in the solution of his particular problem.
The second premise underlying our work is that the
environment in which programs are prepared, debugged,
operated, documented and maintained is changing and
that the language facilities currently available do not
properly reflect these changes. We are speaking, of
course, of the advent of computer-based files and of
interactive computer systems which permit the user to
be more intimately involved with his program than was
possible with a batch system. A modern language system must be developed with this kind of environment
*This work was supported, in part, by the National Aeronautics
and Space Administration under contract No. 12-563


in mind, but should still be adaptable to the older environment.
Let us now explore briefly the implications of these
two premises and examine some alternative approaches
to providing an appropriate language facility.
The "classical" approach to providing a large variety
of languages has been that of developing languages and
their translators-and often even their operating environments-independently. However, it seems clear
that the cost of creating and maintaining an ever
increasing number of language systems is not tolerable.
Somehow we must both provide the variety of facilities but, at the same time, also reduce the number of different systems. It would seem that there are two extreme approaches to the problem of developing a
language facility which provides all things to all men.
These are referred to as the shell approach and the core
approach. The shell approach calls for the construction
of one universal language which contains all the facilities required for every class of users. PL/I with the
"compile-time" facility is probably the best current
example of a shell language. In contrast, the core approach calls for the development of a small "core" language which, by itself, is probably not appropriate for
any class of user, but which contains facilities for selfextension. A particular class of users then extends the
core language to create a language which is appropriate
for their problems. There are, to our knowledge, three
current languages which are, to some extent, core languages: ALGOL-D, GPL, and ALGOL-68.
The shell approach does have a certain appeal. Like
the modern supermarket it promises us a great variety
of both ordinary and sophisticated products. But the
overhead inherent in utilizing such a system is rather
large. As in the supermarket the user must pay for both
the space to contain the products he is not using and the
extra time to access the desired product. Perhaps a more
important difficulty inherent in the shell approach is



that whenever a meaning is prescribed for a construction, that same meaning is forced upon all users, even though the construction might reasonably mean several things. In PL/I this has led to such anomalies as: both of the boolean expressions 5 < 6 < 7 and 7 < 6 < 5 are true; the interpretation of A*B where A and B are matrices is the matrix whose (i,j)th element is the product of the (i,j)th elements of A and B. It is not that these kinds of interpretations are "bad"-the point is that they are built-in and unchangeable. No matter what meaning one might like for 7 < 6 < 5 (we like false) or for the (i,j)th element of A*B (we like the inner product of the ith row of A and the jth column of B), the meaning provided by the designers is now fixed. One must revert to procedures if he wishes to introduce new operators or to detour around the built-in operators when he needs to vary the meaning of those originally provided. And this becomes even more cumbersome when, as in PL/I, procedures can produce only scalar results. We would maintain that our reasons for rejecting the shell approach are not based on speculation; the difficulties currently being experienced with the implementation and utilization of full PL/I provide ample evidence.
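The chained-comparison anomaly can be seen by modelling PL/I's left-to-right evaluation, in which the bit result of '<' is coerced back to a number in a wider arithmetic context. This Python fragment is only an illustration of that coercion, not of full PL/I semantics:

```python
def pli_less(a, b):
    # PL/I '<' yields the bit '1'B or '0'B, which a wider context
    # coerces to the number 1 or 0.
    return 1 if a < b else 0

# 5 < 6 < 7 parses as (5 < 6) < 7 = 1 < 7, which is true...
assert pli_less(pli_less(5, 6), 7) == 1
# ...but 7 < 6 < 5 is (7 < 6) < 5 = 0 < 5, which is also true.
assert pli_less(pli_less(7, 6), 5) == 1
```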
Thus it is our contention that the most reasonable approach to providing the desired variety of language facilities is that of providing an extensible language supported by an appropriate compiling system. We do not, however, suggest that we can now devise a single universal core language which will adequately provide for the needs of the whole programming community; the diversity in "styles" of languages and translation mechanisms will probably always be sufficient to encourage several language facilities. ELF, which is the subject of this paper, provides a facility in the "style" of such languages as ALGOL-60, PL/I, and COBOL.
Now let us discuss the second premise, concerning the
environment in which we envision programming being
done. Our basic assumption here is that the programmer
does not approach the computer with a deck of cards or
magnetic tape which constitute a complete and independent run: a "run" deck which would commence with
control cards, followed by his problem and then by his
data, and which would result in the system accepting
these, compiling his problem, running it against his
data, and finally burying him in dumps or some other
visible output. Rather, the programmer's unit transactions should be thought of as acts of updating some file.
He might insert a few corrections to his program text,
might call for some incremental change to some executable form of his program, and then might let his program run.
If he is working in an interactive system, he might
maintain intimate control over the proceedings, responding to messages as they occur instead of having to wait

for the final results before he can exert any control.
We do not suggest that ELF is a solution to the problem of providing a language for the effective use of a modern time-shared* system with permanent users' files. Indeed, there is really very little experience now accumulated in using such facilities, as most of the language facilities now in use on the available systems were developed as "batch" languages. It is to be hoped that work such as that now underway at Carnegie-Mellon under Perlis' direction will provide some guidance in this area.11
We do suggest, however, that we can now devise an
extensible language facility in such a manner that it is
cognizant of an available filing system and provides for interactive control; we will discuss our point of view on
the relation of the language to the system in a later section.
The remainder of this paper is divided into four sections. In the next section we will discuss the overall design criteria which have guided the development of the
language. Following this, we will present an overview of
the language with the object of providing the reader
with a general feeling for the language as well as for the
translating and executing mechanisms which we envision. Following this we will discuss the kinds of features and facilities which will be in the language; the
purpose of this section is to justify and describe certain
constructions proposed for the programming language
component of ELF. The final section is devoted to a
summary and conclusion.
Design criteria
Perhaps the most eloquent defense of the overall design criteria to which we have tried to adhere was given
in the 1966 Turing lecture by A. J. Perlis.10 There Perlis framed the problem as that of providing for systematic variability in a language. All acceptable languages provide for constant as well as for variable operand values. However, a great deal more variability
must be provided if a language is to be extensible. There
must be means of providing for variability in the
types of quantities with which we deal, in the operations
on these quantities, in programs or procedures, in the
syntax of programs, in regimes of control, in the binding of programs to other programs and data, in the
means of accessing data, in the employment of the
various. storage and input/output resources afforded
by the system, and so on. However, we must provide
for this variability very carefully so that we retain the
necessary control over the efficiency of use of the com*That is, interactive" how the intimacy between the user and
the system is arranged does not concern us.

ON the Basis for ELF
puter, or else our result will be a purely academic exercise.
In our design of ELF we have looked to a number of
"users" as sources of constraint; unless the language
facility is properly matched to its users, it will not be
an effective tool. These' "users" include the programmers who will read and write in the language, the computer which will execute programs, the compiler or
translator which will prepare executable programs, and
the operating system which will provide the environment for the preparation and execution of programs.
In addition, we feel that there are two other important
sources of constraint: the traditions established by
current languages, and the practicality of the language.
Let us now briefly discuss the nature of the constraints
which each of these various sources imposes.

Programmers
People have to learn and use the language. Indeed
we hope that people will even read programs in the language in addition to writing them. However, we find
that different people have rather different ideas about
the form in which a program should be cast. Most
serious programmers adhere to the basic expression
forms where these forms are appropriate-using the infix, prefix, and postfix operators plus parentheses which
have resulted from the years of development of mathematical notation. The form of program text which is
not inherently "expression-like" is, of course, not so well
established. We note here, however, that the usual
"out" for introducing new operations into a language-the use of functions or procedures-does not provide an
adequate notation for the majority of operations. If the
number of arguments required exceeds three or four, the
user has difficulty in associating the "meaning" of an
argument with its position in the argument list and he
might be considerably better off with some keywords to
help him focus on what is what. Also, if the nesting of
function calls gets to be more than two or three deep,
the "LISP-unreadability" problem becomes serious. We
would also note that, for the user, an important criterion is that he should not have to introduce and deal
with constructs which are unnecessary to the solution
of his problems. The arithmetic expression form provides
a facility which is both natural to a large class of users,
and which also very effectively hides the setting up of
temporary storage for intermediate results. Similarly the various renderings of McCarthy's conditional
expressions as well as the iteration or looping facilities
which appear in many programming languages have, as
a secondary effect, that of eliminating the need of
introducing temporaries or labels which are used only
once (see Refs. 2 and 9 for interesting discussions of this
point).


Computers
The abilities of current and projected computers must
also be viewed as a source of constraint. That is, we
should try to "match" the basic types and operations in
the language with those available in "standard" computers (and here we have reference to CPUs, not the whole "system"). Thus, for example, although our
mathematical natures might encourage us to define only
integer quantities and operations as primitive, and obtain
floating point quantities and operations by extension,
this would surely be foolish when we are faced with
computers which by-and-large have floating point
quantities and operations as primitives. Similarly, we
reject the notion of quantities and operations drawn
from set theory as primitive in the language because of
the wide variety of implementation strategies which
might be employed in providing for these. Such quantities and operations should be introduced via extensions.
We must presume that the facilities available in current computers mirror, to some extent, the basic
facilities which the users require.

Compilers
The past several years have witnessed the emergence of a considerable body of experience and technology in compiling programs. Unfortunately, most recent language developments seem to ignore this technology and
demand new and ever more difficult and expensive
translating mechanisms. We have attempted to reverse
this trend and to adhere rather strictly to the technology available-to provide a language which can be
effectively and efficiently translated and for which the
known techniques of code generation and optimization
will apply.

Operating systems
The constraints which might be imposed by the peripheral devices and operating systems which are to be used must be noted. Thus, the means for encoding
messages to and from the computer are rather strictly
dependent upon the devices (and software) which are
available; our adherence to a conventional string language with reasonably conventional characters is dictated by this consideration. The control structure inherent in modern computers must also be kept in mind;
for example, the notion of "interrupt" is basic in most
computer systems and our language facilities should
reflect this. Further, the availability of various kinds of
storage having varying degrees of accessibility plus the
needs of the operating system to allocate the storage
and other resources of the system must not be ignored.



Tradition
The "tradition" which has been established by such languages as ALGOL-60, PL/I, COBOL, and LISP and which is being established by ALGOL-68, GPL, and ALGOL-D should be considered as a source of constraint. That is, it does not seem reasonable to re-invent and re-cast the facilities available in those
languages just to be different. Our departures from the
facilities available there should be well thought out
and well justified. It will be clear that we have in fact departed in more-or-less significant ways from all these
languages; we hope that our arguments for doing this
are convincing.

Practicability
The final source of constraint which we have tried to
observe is that of practicability. It is our intention that
the language be as efficient and usable as any of the
conventional programming languages. In adhering to
this constraint we have failed in many ways to reach
all the goals of variability which Perlis prescribed. Thus,
ELF provides a language which has the kinds of
variability which we can imagine being handled with
reasonable efficiency. Another generation of language
development will be desirable when we better understand other kinds of variability and can devise mechanisms for handling them efficiently.

Overview of the extensible language facility
There are a number of facets of ELF that are of
interest. In this section we will look at three of these.
First we will look briefly at the base language (BASEL).
Following this we will consider a compiler for BASEL,
and the ways in which it might be extended. Finally we
will consider the interface between the language system
and the operating environment.

The base language
BASEL has four kinds of primitive program elements:
a. Names identify the objects that a program manipulates.
b. Operators manipulate the objects. These include
the assignment operator, operators such as plus
and less than, and procedure calls.
c. Control statements are used to specify the order in
which the expressions are executed.
d. Declarations are used to define objects and operators, and give them names.
These program elements are embedded in a block structure which is a generalization of that in ALGOL-60.
In the next section we will discuss the elements of the

language in somewhat more detail; for the moment it
will suffice to think of the language as similar to
ALGOL-60 but with provisions for new data types and
operations (or, if ALGOL-68 is familiar to you, similar
to ALGOL-68).

A BASEL compiler
Second we want to explore the compiling mechanism
which we have in mind. Although one does not conventionally talk about compiling techniques in describing a language, we believe that it is helpful in this case.
Our choice of language constructs, notations, and
mechanisms has been strongly influenced by what we
feel can be readily handled by the current compiling
technology. Thus understanding our view of the kinds
of compiling mechanisms envisioned is rather important. For present purposes we want to think of the compiler for the language as consisting of several "components," including: a lexical analyzer, a syntactic analyzer, a parse interpreter, and a user controlled optimizer, plus other components for generating machine
code and filing it, or for interpretively executing some
"internal" representation of the program text, and so
on. We shall have no particular interest in these latter
components here; let us now consider the other components.

Lexical analyzer
The lexical analyzer will be responsible for isolating,
identifying, and appropriately converting the source
input (e.g., typed characters) thus producing a stream
of "token descriptors" representing constructs at the
level of "identifier," "literal," "operator," "delimiter,"
and so on. We anticipate that, although the lexical
analyzer will be "table driven" by tables derived from a
grammar which specifies the structure of the tokens,
these tables will not ordinarily be changed or extended
by the average user and we will thus think of them as
fixed.
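As a rough illustration of such a table-driven analyzer, consider the sketch below. The token "table" is a stand-in of our own devising; as the text notes, the real tables would be derived from a grammar specifying token structure.

```python
import re

# A hypothetical token table: each entry names a token class and its shape.
TOKEN_TABLE = [
    ("literal",    r"\d+(?:\.\d+)?"),
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("operator",   r"[+\-*/<>=]"),
    ("delimiter",  r"[();,]"),
]
PATTERN = re.compile("|".join(f"(?P<{k}>{p})" for k, p in TOKEN_TABLE))

def tokens(source):
    """Yield (class, text) token descriptors from the source characters."""
    for m in PATTERN.finditer(source):
        yield (m.lastgroup, m.group())
```

For example, `list(tokens("x1 + 3.14"))` produces descriptors at the level of "identifier", "operator", and "literal", as described above.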

Syntactic analyzer and parse interpreter
We intend that the syntactic analyzer be essentially
an operator precedence analyzer. An operator precedence analyzer is, of course, one of the simplest and
most efficient kinds of syntactic analyzers available.
Operator precedence analysis works only on a rather restricted set of languages. However, as Floyd demonstrated in his original paper on this method,4 ALGOL-60 is close to being an operator precedence language;
further, those changes Floyd proposed to the original
syntax rules for ALGOL-60 and to certain constructions
in the language in order to make it operator precedence


did no real violence to the language but actually made
it cleaner and more symmetric. Thus, a language does
not necessarily suffer in richness of style because it was
designed with this method of analysis in mind. Another
important reason for the choice of this method is that
those properties of the operators which, properly encoded, are required to "drive" such an analyzer are
exactly the properties which the user has in mind when
he specifies an operator, namely, the precedence, in the
sense of order of evaluation of that operator relative to
other operators.
It will be convenient to think of the operators available in the language as including binary infix (e.g. '+' or '<'), unary prefix (e.g. '-'), unary suffix (e.g. '!'), unary outfix (e.g. '| ... |'), n-ary "distributed" (e.g. 'if ... then ... else' or 'increment ... by ...'), and "functional" (e.g. 'MAX(...,...)' or 'SIN()'). Each operator (actually
each fixed "part" of each operator) will enjoy one of
four relations with respect to all other operators (or
parts of operators), namely: takes precedence, yields
precedence, has equal precedence, or none. The user
will introduce a new operator (syntactically) by
specifying the precedence of each of its parts relative to the precedence of operators already available. It will generally be the case that a given operator
will be defined for operands of a variety of data types;
the "syntactic analyzer" will isolate a phrase-an
operator plus its operands-by using the precedence relations, and then the parse interpreter will determine the "meaning" or "interpretation" of the phrase in accordance with specifications which are either built-in (e.g. with '+' operating on two integers) or supplied as extensions by the user (e.g. with '+' operating on two quaternions). One of the "dispositions" which the parse
interpreter might make of some phrase is to place the
operands of that phrase into some previously given
(macro) "skeleton" and re-submit the resulting text for
syntactic analysis. This will provide what are essentially
the "lexical macro" and "syntactic macro" facilities
proposed in Ref. 1.
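The division of labor between precedence-driven phrase isolation and mode-driven interpretation might be caricatured as follows. The table layout, the mode names, and the "(+)" extension are all invented for this sketch; they are not ELF or BASEL syntax.

```python
PRECEDENCE = {"+": 1, "*": 2}     # higher numbers bind tighter

MEANINGS = {                      # (operator, left mode, right mode) -> meaning
    ("+", "int", "int"): lambda a, b: a + b,
    ("*", "int", "int"): lambda a, b: a * b,
}

def define_operator(op, prec):
    PRECEDENCE[op] = prec         # the syntactic half of a user extension

def define_meaning(op, modes, fn):
    MEANINGS[(op, *modes)] = fn   # the semantic half

def eval_expr(toks):
    """Precedence-climbing over a list of (mode, value) operands and
    operator strings; the result mode is simplified to the left mode."""
    def climb(min_prec):
        mode, val = toks.pop(0)
        while toks and PRECEDENCE.get(toks[0], -1) >= min_prec:
            op = toks.pop(0)
            rmode, rval = climb(PRECEDENCE[op] + 1)
            val = MEANINGS[(op, mode, rmode)](val, rval)  # interpret the phrase
        return mode, val
    return climb(0)

# A user extension: a new operator over a user-defined 'pair' mode.
define_operator("(+)", 1)
define_meaning("(+)", ("pair", "pair"),
               lambda a, b: (a[0] + b[0], a[1] + b[1]))
```

The point of the sketch is that the analyzer needs only the precedence entry to isolate a phrase involving "(+)", while the interpreter selects a meaning keyed on the operand modes.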

User controlled optimizer
There are certain operators which require a larger
context than the phrase in which they occur for their
interpretation, particularly if one has a goal of producing optimal coding and either does not have, or prefers
not to overburden, a code optimizer. An example here
would be the coding of the multiplication of two conformable matrices in the three contexts:

A*C    (A + B)*C    (A + B)*(C + D)

Thus, it may be that one might desire an algorithm for matrix multiplication which required only one temporary scalar for the first case, a temporary row for the
second, and a full temporary matrix only for the third.
On the other hand, one might use temporary rows for
all three cases, giving up storage efficiency in the first
case and computation efficiency in the third. The point
is that there are cases in which the determination of the
appropriate means for performing some operation depends upon some context. The user controlled optirnization phase provides for this. It would also be in this
stage that the user would have the ability to tinker
with such things as the allocation of storage, the means
of access (e.g., via some hardware or software "paging"
scheme) to certain quantities, and so on.
Briefly, we think of the parse interpreter as constructing what in effect is a computation tree representation
of the analyzed program text; each node of this tree
would be labelled with the data type of the value it
represents. The user controlled optimizer may then be thought of as a mechanism which "walks" over this
computation tree, inspects context as appropriate, and
re-organizes and re-constructs portions of this tree.
The mechanism has certain similarities with those proposed in ALGOL-D, but with a control and sequencing
strategy similar to that of the GSL component of the
CGS system (see Refs. 12, 13 and 15).
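A caricature of such a walk, with invented node and strategy names: each node carries the data type of its value, and a rule for matrix '*' inspects how many of its operands are themselves computed to choose a temporary-storage strategy (scalar, row, or full matrix, matching the A*C, (A + B)*C, (A + B)*(C + D) contexts above).

```python
from dataclasses import dataclass

@dataclass
class Node:
    op: str                # "*", "+", or "leaf"
    mode: str              # data type of the value the node represents
    kids: tuple = ()
    strategy: str = ""

def optimize(node):
    """Walk the computation tree bottom-up, re-labelling matrix products
    with a storage strategy chosen from their context."""
    for k in node.kids:
        optimize(k)
    if node.op == "*" and node.mode == "matrix":
        computed = sum(k.op != "leaf" for k in node.kids)
        # 0 computed operands -> scalar temporary, 1 -> row, 2 -> full matrix
        node.strategy = ("scalar-temp", "row-temp", "matrix-temp")[computed]
    return node
```

Applied to a tree for (A + B)*C, the rule sees one computed operand and selects a temporary row; for (A + B)*(C + D) it selects a full temporary matrix.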

Interface view
Now let us briefly consider the interaction of the
language with the environment in which programs are
constructed, debugged, and executed. First, we want to
emphasize that we would expect the language to include
the means for the kinds of communication with the
operating environment which are typically handled
via "control" or "job" statements as well as the kinds
of communications which have to do with "editing."
We presume that there is a filing system which contains
such things as: (1) program text which we might want
to incorporate into the input stream to the compiler
during some run, (2) specifications of modes of programs
or procedures which our current program might want to
reference, (3) modifications or extensions to the compiler which we might want incorporated for processing
our current program text, (4) an "executor" which can
execute programs as they are represented following the
interpretation of the parse and user-controlled optimization, (5) data which has been previously input or
generated and then filed, (6) and so on. Clearly there
must also be means for placing any of these items in the
filing system. Thus, a "run" or "session" might be one
in which we input a number of extensions to the language including new data types and operations over
them, with the result that they are filed in such a fashion that we can later call the compiler, mentioning
that it is to include these extensions. Another type of



session might be the input of program statements
with the expectation that they be executed directly.
Another might be the input and editing of a program
with the expectation of filing it for later execution. That
is, a "unit transaction" within the ELF system will
typically utilize material previously developed and filed,
and result in material to be filed. There will be a number
of forms which this material might take, ranging from
or certain programs within the compiler at the other.
So long as this philosophy of operation is understood we
will not go into further details here; we intend to spell
out more details of the linguistic forms and possible
system mechanisms to attend to problems in this area
in a subsequent paper.
The base language

In this section we will present the basic concepts and
mechanisms which are introduced in BASEL, the base language component of ELF. It is not our purpose to provide an introduction or primer for the language.
(One must see Ref. 3 for this.) Rather, we hope that our
discussion will, as it were, "soften the blow" and make
the details of BASEL appear reasonable. Thus we will
suggest the kinds and means of variability which
BASEL permits. We might also note that we are not
attempting to justify all the concepts and notions in
the language. This section is divided into two parts:
First we will discuss the fundamental notion of value
and show how it is handled in BASEL, comparing
BASEL to other languages. Then we will present the
declarations and expressions which manipulate values.

Values
One of the basic characteristics of a programming language is the variety of values it can handle and the way in which these may be manipulated (that is, read in, stored, named, declared, operated upon, passed as parameters or returned as results of functions, and written out). The larger the variety of values a language can manipulate, the more "powerful" that language is; the more uniformly these values are treated, the easier that language will be to write, compile and extend.
We will use FORTRAN II to demonstrate what we mean by "a value." FORTRAN handles only real values and integer values in a general way. One may read and write alphabetic values, but must treat them as if they were integers. Also, FORTRAN has arrays, but no concept of an array value which may be treated in all ways as a unit.
We can distinguish two aspects of a value: (We have been strongly influenced here by ALGOL-68.)

- its data type or mode (that is, the "class" it belongs to, that aspect which determines the contexts in which it is meaningful and the operations which apply to it.)

- its meaning (that is, its interpretation, the object which it represents.)

Example:
The number 3.14159 has the mode real and its meaning is an approximation to the number pi.
One must be able to store values, and thus we are led
to think about variables. We see that we can treat
variables like values, having both mode and meaning.
Example:
A real variable has the mode loc real (a location in
which to store a real value) and its meaning is the
address of that location.
We have made a distinction here between the meaning of a variable and its value (that is, the value stored
in it.) We can refer to either (as will be described later)
and speak of the mode of either. This makes possible a
simple and uniform treatment of pointers. We see that a
pointer is simply a variable in which the meaning
(address) of another variable is stored.
Example:
A pointer to a real variable has the mode loc loc real and its meaning is the address of a box in which we can store the meaning of a real variable (a loc real).
Continuing this line of thought we see that we can
describe the mode of any pointer in terms of the mode of
the thing it points to, which permits a compiler to decide
whether it is meaningful to use that pointer in any
given context.
The aggregate-value is another generalization of the
concept of a value to which the notions of mode and
meaning must also apply. We must go further than
PL/I did. The PL/I declaration
DECLARE 1 A,
  2 A1 INTEGER,
  2 A2 FLOAT;

declares an instance of an aggregate, but there is no way
of speaking of the mode of that aggregate independently of this instance. One result of this is that such
aggregates may not be returned as the results of functions. This difficulty also arises with arrays in ALGOL-60 and FORTRAN.
So in addition to a set of "basic modes" (like real) we
need a set of "mode constructors" which can be used to
combine the basic mode descriptors into descriptions of
more complicated values.

Basic modes
We have adopted the following set of modes as basic:
integer value      written int
real value         written real
boolean value      written bool
character value    written char

Mode constructors
Variables:

If m is a mode, then loc m is the mode of a variable which can store a value of mode m. ("loc" in BASEL serves the same purposes as "ref" in ALGOL-68. The differences will be explained later.)
Aggregates:

An aggregate is a sequence of values. BASEL has
three ways of describing these.
tuple: A tuple is an ordered list of values. These are

used primarily as the actual parameter list in a
function call, but they can also be computed,
constructed and stored in variables.
If m1, m2, ... mn are modes, then tuple (m1, m2, ... mn) is the mode of an aggregate-value whose parts are n values of these modes.
Examples:
tuple (real, int, char)
tuple (loc real, real)
row: A row is a homogeneous aggregate. We treat

this case separately for two reasons: first, rows
are particularly ·simple to implement, and second,
this is the only way to describe a variablelength aggregate.
If n is an integer value and m is a mode, then row n of m is the mode of a homogeneous series of n values, each of mode m. These are numbered
from 1 to n and may be accessed by number.
Note that this is a more elementary notion than
the array of ALGOL-60.

943

Examples:
row 3 of real
row 6 of loc char

(This might be used to
describe a character
string variable.)
(This describes a variable length row of
integers.

row any of int

struct: The mode constructor struct attaches names to the elements of an aggregate, permitting these parts to be accessed by name. If mt1, mt2, ..., mtn are modes and e1, e2, ..., en are identifiers, then struct (mt1 e1, mt2 e2, ..., mtn en) is the mode of an aggregate-value with n named parts. (This mode constructor is exactly the same as in ALGOL-68.)
Examples:
struct (real r, real i)
    might be used to describe a complex number.
struct (loc int level, row 50 of loc bool elem)
    might be used to describe a push-down stack which can hold up to 50 boolean values. The integer variable 'level' would store the index of the current top of the stack.
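The struct examples above have natural counterparts in any language with record types. The following Python sketch is an editorial illustration, not BASEL: it renders the complex-number and push-down-stack modes as dataclasses, with hypothetical push/pop helpers added for the stack.

```python
from dataclasses import dataclass, field
from typing import List

# Analogue of the BASEL mode  struct (real r, real i),
# used in the text to describe a complex number.
@dataclass
class Complex:
    r: float
    i: float

# Analogue of  struct (loc int level, row 50 of loc bool elem):
# a push-down stack of up to 50 boolean values, where 'level'
# indexes the current top of the stack.
@dataclass
class BoolStack:
    level: int = 0
    elem: List[bool] = field(default_factory=lambda: [False] * 50)

    def push(self, v: bool) -> None:
        self.elem[self.level] = v
        self.level += 1

    def pop(self) -> bool:
        self.level -= 1
        return self.elem[self.level]

s = BoolStack()
s.push(True)
s.push(False)
```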
Procedures:

A procedure is a parameterized description of a value. (This value is usually specified by giving an algorithm by which to compute it.) In BASEL, procedures are also considered to be values, which may be stored, passed as parameters, etc. Because of this we are interested in the mode of procedures. Since procedure calls are used in expressions, the procedure's mode must embody information about its domain and range, in order for the compiler to ensure the meaningfulness of expressions involving procedures. Therefore we define the mode of a procedure as follows: if mt1, mt2, ..., mtn and mt are modes then proc (mt1, mt2, ..., mtn) mt is the mode of a procedure which takes n parameters of modes mt1, mt2, ..., etc. and returns a result of mode mt. If the procedure takes no parameters, empty parentheses are written. If it returns no result, the word "none" is used in place of the mode of the result.
Examples:

proc (real) real
    describes the mode of the sine function.
proc (int) none
    might be used to describe a procedure which opens the file designated by the integer code.
proc () struct (int, int)
    could describe a procedure which returns the current time of day expressed as two integers representing hours and hundredths of hours.
proc () none
    could be used to describe a DUMP procedure.

Dynamically varying modes

Occasionally it is useful to permit a variable to store values of more than one mode, or to permit an expression to produce a value whose mode depends on the data. The mode-descriptor operator 'union' is used to build the description of a mode which is not completely fixed, but can be one of a fixed finite set of modes. The mode of a value stored in a 'loc union' can vary dynamically among the fixed set of modes specified by the 'union'. The mode of the value of an expression can vary if that expression contains a conditional whose 'then' and 'else' clauses have values of different modes. If mt1, ..., mtn are modes then union (mt1, ..., mtn) describes an object whose meaning may have any one of the given modes; which one may not be known until run time, and may change.

Meanings

Certain values of the basic modes have pre-defined representations. These are:

the non-negative integers, written 0, 1, 2, ...
the non-negative reals, written 0., 3.96, .5, 3.7e-2
the boolean values, written true, false
the character values, written 'A', ..., 'Z', '1', '2', ..., ',', '+', ...

Because these are conventional representations of their meanings, they are sometimes called 'literals'. Through declarations, the programmer can give other names to these values. In fact, it is possible to name any value that can be computed.

Since an unbounded number of modes can be defined in BASEL, the language must provide a consistent way of denoting a value of any mode. The conventions we have chosen are:

A value of mode tuple (mt1, ..., mtn) is denoted by writing [v1, v2, ..., vn]. Each vi denotes a value of mode mti.
A row or a structure is denoted by writing a row or struct mode followed by a tuple with the appropriate number of elements, each of which has the appropriate mode.
A procedure of mode proc (mt1, mt2, ..., mtn) mt is written: proc (mt1 f1, mt2 f2, ..., mtn fn) (E), where the fi are the names given to the dummy parameters, and E denotes an expression whose value has mode mt. If the procedure returns no result, then E must denote an expression with no value.
Finally, if mt is a mode then a mt creates a value of mode mt.

Examples:

Mode                        A Value of that Mode
tuple (int, int, int)       [1, 5 + 7, 4]
tuple (real)                [sine X]
row 4 of bool               row 4 of bool [true, true, true, true]
struct (real r, real i)     complex [0.5, 2.] ("complex" has been declared as a synonym for the mode struct (real r, real i).)
proc (int, int) int         proc (int A, int B) (if A > B then A else B)
loc int                     a loc int (This causes an integer variable to be created. Its contents are not initialized. This expression then denotes the new address.)

Declarations, expressions and programs

Declarations

A declaration attaches a name to a value, thereby creating an object. In BASEL, any value which can be computed or denoted may be named. BASEL has four kinds of declarations, which we will consider in the following order:

mode declarations
data declarations
operator declarations
meaning definitions

ON the Basis for ELF
Mode declarations
A mode declaration defines a name for a mode.
This name then acts as a synonym for that mode.

Examples:
let complex be struct (real re, real im)
    This describes a complex number. Having given a name to the mode expression, the following expressions become exactly equivalent:
        struct (real re, real im) [1.4, .2]
        complex [1.4, .2]
let tree be struct (char top, row 2 of loc tree son)
    This describes a recursive tree structure, in which each node has two sons.

Data declarations

As with mode declarations, a data declaration defines a name for a value. Within the scope of this declaration that name and the value it denotes are completely synonymous.

Examples:
let pi be 3.14
    This declares "pi" to be a name for the real value "3.14". Notice that pi is a name for a constant, not a variable. One would never be able to store another value in pi.
let X be a loc real
    As mentioned above, the expression "a loc real" causes a new loc real to be created, and denotes the resulting address. This declaration makes X a synonym for that address.
let MAX be proc (int i, int j) (if i > j then i else j)
    This declares that MAX is a name for a procedure which computes the maximum of its two integer arguments.

Operators

In BASEL an operator is simply a collection of related procedures plus some information on how a call on that operator is to be written. This syntactic information is declared with an operator declaration, written as follows:

let (operator name) be (shape) prec (precedence relation)

where the (shape) may be

prefix
suffix
infixL (left associative)
infixR (right associative)
infix (non-associative)

and the precedence relation specifies that the precedence of the new operator is either = (equal to), > (just greater than) or < (just less than) that of some previously defined operator.

Examples:
let ↑ be infixR prec > *
let ! be suffix prec > *
let sin be prefix prec > !

If these three declarations were given in this order, the resulting precedence relations would be:

prec of ↑ > prec of sin > prec of ! > prec of *

After an operator has been declared, one or more meaning definitions are used to associate procedures with that operator. These are written:

let (operator name) mean (procedure-value)

Example:
let + mean proc (bool a, bool b) (if a then ¬b else b)
    This definition states that "+" is to be meaningful when written between two boolean values, and is to denote "exclusive or" in that context. A call on this operator would be written: x + y, where x and y are boolean values. When the procedure is executed "x" is bound to dummy parameter "a" and "y" is bound to "b".

The term "generic" is commonly used to describe an operator that can have different meanings, depending on the modes of its operands. A particular meaning is chosen in a given context by matching the modes of the operands to the modes of the formal parameters in one of the procedures attached to the operator. This "match" is made according to the conventions described later for procedure calls.
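The pairing of operator declarations with meaning definitions amounts to dispatch on operand modes. A rough present-day sketch in Python follows; the registry and helper names are our own invention, not BASEL's.

```python
# Each "meaning definition" attaches a procedure to an operator for a
# particular tuple of operand modes; a call dispatches on the modes
# (here, Python types) of the actual operands.
meanings = {}

def let_mean(op, modes, proc):
    meanings[(op, modes)] = proc

def apply_op(op, *args):
    proc = meanings[(op, tuple(type(a) for a in args))]
    return proc(*args)

# let + mean proc (bool a, bool b) (if a then not b else b)
let_mean("+", (bool, bool), lambda a, b: (not b) if a else b)

# the ordinary integer meaning coexists with the boolean one
let_mean("+", (int, int), lambda a, b: a + b)

x = apply_op("+", True, True)    # exclusive or on booleans
y = apply_op("+", 2, 3)          # integer addition
```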

Expressions

Program structure

In BASEL a program is a compound expression.
A compound expression has a body, which is written between "begin" and "end" or between "(" and ")".
The body itself is a series of blocks separated by commas.

946

Fall Joint Computer Conference, 1968

A block begins with a declaration part and ends with an expression part.
The declaration part is a series of declarations each terminated by a ";". This part may be omitted.
The expression part is a series of clauses separated by "exit"s.
A clause is a series of expressions separated by ";"s.
Finally, the syntax of an expression corresponds to conventional usage. Statements such as "go to", "for", "if", the compound expression, and the empty statement are also considered to be expressions. A compound expression may be used anywhere a name is legal.
The evaluation of a compound expression

A compound expression (or a program) is evaluated by serially evaluating its blocks. At most one of these blocks may have a value, and this value, if any, is taken as the value of the compound expression.
A block is evaluated by
(1) Elaborating its declarations, in order. The scope of a declared name or operator meaning is the smallest block which contains that declaration.
(2) Evaluating the expression part until the first "exit" is encountered. The expressions in this part are evaluated in order except, of course, that any "go to" commands are obeyed. The value of a block is the value, if any, of the last expression evaluated in it.
The value of each kind of expression is defined as follows:
The go to statement and the empty statement have no value.
A conditional is evaluated in the conventional way; its value is the value, if any, of the selected expression.
The value of a "for" statement is the series of values produced by the repeated evaluation of its scope.
Since the use of an operator is considered to be a call on one of the procedures associated with it, in all other cases an expression is nothing but a set of procedure calls ordered with respect to the precedence of the operations involved. Its value is the value, if any, returned by the last procedure call.

Procedure calls

A procedure call is written as a procedure-valued expression followed by a tuple-valued expression. The call is evaluated in four steps:
1. The tuple expression is evaluated, resulting in the tuple of values [v1, v2, ..., vn]. Note that these values may be of any mode, including loc modes and proc modes. Thus we may have calls by value, address or procedure.
2. The procedure expression is evaluated, resulting in the procedure: proc (mt1 f1, mt2 f2, ..., mtn fn) (E).
3. The parameters are bound. This has the same effect as replacing the procedure call by the following compound expression:
(let f1 be v1; ...; let fn be vn; E)
4. This compound expression is evaluated.

In order for a procedure call to be legal, the modes of the values vi must "match" the modes of the dummy parameters. For the basic modes, the definition of "match" is the natural one; "match" only becomes complicated when "union" is involved. The precise definition follows:
A mode P of an actual parameter "matches" a mode Q of a dummy parameter (we will write P ⊂ Q) if
P is union (mt1, ..., mtn) and, for all i, mti ⊂ Q
or Q is union (mt1, ..., mtn) and there exists an i such that P ⊂ mti
or P is int, bool, real or char and Q is respectively int, bool, real or char
or P is loc mt and Q is loc mt' with mt ⊂ mt' and mt' ⊂ mt
or P is tuple (mt1, ..., mtn) or struct (mt1 e1, ..., mtn en) and Q is respectively tuple (mt'1, ..., mt'n) or struct (mt'1 e1, ..., mt'n en) where, for all i, mti ⊂ mt'i
or P is proc (mt1, ..., mtn) mt and Q is proc (mt'1, ..., mt'n) mt' where mt ⊂ mt' and, for all i, mti ⊂ mt'i.
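The "match" relation can be transcribed almost directly. In the Python sketch below, modes are encoded as nested tuples; this encoding is invented for the illustration, and struct is omitted since it matches exactly like tuple.

```python
# Modes: basic modes are the strings 'int', 'bool', 'real', 'char';
# compound modes are tuples such as ('union', m1, m2), ('loc', m),
# ('tuple', m1, ..., mn), and ('proc', (m1, ..., mn), result).
BASIC = {'int', 'bool', 'real', 'char'}

def matches(P, Q):
    # P is a union: every alternative of P must match Q
    if isinstance(P, tuple) and P[0] == 'union':
        return all(matches(m, Q) for m in P[1:])
    # Q is a union: P must match some alternative of Q
    if isinstance(Q, tuple) and Q[0] == 'union':
        return any(matches(P, m) for m in Q[1:])
    # basic modes match only themselves
    if P in BASIC or Q in BASIC:
        return P == Q
    # loc modes must match in both directions
    if P[0] == 'loc' and Q[0] == 'loc':
        return matches(P[1], Q[1]) and matches(Q[1], P[1])
    # tuples match componentwise
    if P[0] == Q[0] == 'tuple':
        return len(P) == len(Q) and all(
            matches(p, q) for p, q in zip(P[1:], Q[1:]))
    # procedures match on parameter modes and result mode
    if P[0] == Q[0] == 'proc':
        return (len(P[1]) == len(Q[1]) and matches(P[2], Q[2]) and
                all(matches(p, q) for p, q in zip(P[1], Q[1])))
    return False

ok = matches('int', ('union', 'int', 'real'))
```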

Some other aspects of BASEL

Assignment

The infix operator → is used to denote assignment. The assignment of a value v, of any mode mt, to a variable V of mode loc mt is written v → V.
The assignment I → J, where I and J are both integer variables (loc int), is meaningless, since both I and J stand for their meanings, which are addresses. One must write: val I → J, which explicitly fetches a value from I and stores it in J. In BASEL there is no automatic built-in mechanism to fetch the value of a variable or to do any of the other transformations performed by the "coercion" mechanism of ALGOL-68. Of course, one can freely define extended meanings for "→" and other operators, which would make the expression "I + 1 → I" meaningful.
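The distinction between a variable and its value can be mimicked with an explicit cell object. The following Python sketch is illustrative only (the Cell class and its method names are our own); it shows why an explicit val fetch is needed when no coercion is built in.

```python
# A variable of mode loc int is an address; copying I into J therefore
# requires an explicit fetch, as in BASEL's  val I -> J.
class Cell:
    def __init__(self, contents=None):
        self.contents = contents

    def store(self, v):          # v -> V
        self.contents = v

    def val(self):               # val V
        return self.contents

I = Cell(41)
J = Cell()
J.store(I.val())                 # val I -> J : fetch from I, store in J
J.store(J.val() + 1)             # the effect of an extended  J + 1 -> J
```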
Allocation

We have already mentioned that the expression a mt, where mt is a mode, causes a value of mode mt to be created. Thus allocation is caused when mt is "loc mt'". If mt' is in turn a "loc", a second location is not created. This is a major difference between loc in BASEL and ref in ALGOL-68, where the declaration:

ref ref real Y = ref real 0

causes two locations to be allocated, the first one pointing to the second one, which in turn can store a real.

In addition to the block-entry-time allocation caused by declarations of variables, BASEL also permits dynamic allocation: if V is a value of mode mt, then a loc V stores V in a newly created location, and returns its address as a result. This result, of course, has mode loc mt.
Using a "union"

With "union", BASEL allows dynamically varying modes. But in order to use an object of "union" mode it is often necessary to check the current status of the mode. This is done with a "when" conditional expression, defined as follows:

when (name of the object) is mt then E1 else E2

where mt must "match" the declared mode mt' of the object. In such an expression, all uses of the object in E1 are treated exactly as if the object's mode were mt rather than mt'.
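A present-day analogue of the "when" conditional is a runtime type test on a value of united mode. The helper below is an editorial illustration, not BASEL syntax: Python's runtime types stand in for BASEL's dynamic mode check.

```python
# when X is mt then E1 else E2, with isinstance standing in for the
# run-time mode check.  The when() helper is our own invention.
def when(obj, mode, then_branch, else_branch):
    return then_branch(obj) if isinstance(obj, mode) else else_branch(obj)

# X has declared mode union (int, char); here it happens to hold an int.
X = 7
result = when(X, int, lambda v: v + 1, lambda v: ord(v))
```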
SUMMARY
BASEL derives its simplicity and power from a general
and unifying notion of data objects. Data objects in
BASEL comprise not only the usual integers, reals,
booleans, and character strings, but procedures, pointers, and aggregates as well. Aggregates may be built
out of any objects whatsoever, so that arrays of arrays
and arrays of procedures are but special cases. Similarly,
any object may be pointed at, so that procedure variables and aggregate variables are treated just like
integer variables.


This generality in the treatment of objects greatly decreases the number of special constructs which need
appear in the language. Any object may be named using
a declaration, so that separate statement types are not
needed to name procedures, variables, arrays, and
numeric constants (the latter is not allowed in ALGOL-60). Similarly, a single parameter-passing mechanism
serves to allow call by value, call by address, and call by
name (procedure) for all three are simply BASEL data
objects. Arrays and procedures may be passed to other
procedures as arguments and may be returned by them
as results. Location-valued procedures are likewise permitted, allowing the user, for example, to define new subscripting functions. The left and right sides of an assignment statement may be evaluated using exactly the
same set of rules; the compiler need not distinguish between them.
To enable the wide class of data objects to be used
effectively, BASEL allows the user to extend the
meaning of existing operators as well as to define new
ones. Thus, the user is free to define (or leave undefined)
as he chooses such strange combinations as the concatenation of a real and an integer or the square root of
a character.
Unlike many other generalized languages, BASEL is
designed to be compiled with as little as possible left to
be interpreted at run time. Efficient compilation
is enabled by the mode information which the programmer specifies (as he is required to in most other higher
level languages). Only when complete mode information
is not known until run time is the type checking done
interpretively (as indeed then it must).
We hope that BASEL has helped to illustrate that
power and generality are not necessarily bought at the
cost of complexity-indeed, careful generalization can
provide power and at the same time make a language
much simpler, both to learn and to implement.
REFERENCES

1 T E CHEATHAM JR
  The introduction of definitional facilities into higher level programming languages
  Proceedings of the AFIPS Fall Joint Computer Conference San Francisco November 1966 Vol 29 (second edition) Washington DC Spartan 1966 pp 623-637
2 E W DIJKSTRA
  Letter to the editor
  Communications of the ACM Vol 11 March 1968 pp 147-148
3 A E FISCHER P JORRAND
  BASEL: The base language for an extensible language facility
  Massachusetts Computer Associates Inc CA-6806-1311 June 1968
4 R W FLOYD
  Syntactic analysis and operator precedence
  Journal of the ACM Vol 10 July 1963 pp 316-333
5 J V GARWICK J R BELL L D KRIDER
  The GPL language
  Control Data Corporation Palo Alto California Programming Technical Report TER-05 1967
6 J V GARWICK
  A general purpose language (GPL)
  Forsvarets Forskningsinstitutt Norwegian Defence Research Establishment Intern Report 8-32
7 B A GALLER A J PERLIS
  A proposal for definitions in ALGOL
  Communications of the ACM Vol 10 April 1967 pp 204-219
8 G F LEONARD J R GOODROE
  An environment for an operating system
  Proceedings of the ACM 19th National Conference Philadelphia Pennsylvania 1964 New York ACM 1964 pp E2.3-1 to E2.3-11
9 P J LANDIN
  The next 700 programming languages
  Communications of the ACM Vol 9 March 1966 pp 157-166
10 A J PERLIS
  The synthesis of algorithmic systems
  First ACM Turing Lecture Journal of the ACM Vol 14 January 1967 pp 1-9
11 A J PERLIS
  Private communication
12 R M SHAPIRO S WARSHALL
  A general purpose table driven compiler
  Proceedings of the AFIPS Spring Joint Computer Conference Washington DC April 1964 Baltimore Spartan 1964 pp 59-65
13 R M SHAPIRO L J ZAND
  A description of the compiler generator system
  Massachusetts Computer Associates Inc Wakefield Mass CA-6306-0112 June 1963
14 A VAN WIJNGAARDEN B J MAILLOUX J E L PECK
  A draft proposal for the algorithmic language ALGOL 68
  IFIP Working Group 2.1 MR 92 January 1968
15 N WIRTH H WEBER
  EULER: A generalization of ALGOL and its formal definition
  Part I: Communications of the ACM Vol 9 January 1966 pp 3-9
  Part II: Communications of the ACM Vol 9 February 1966 pp 89-99

Associative processing for general purpose
computers through the use of modified memories*
by HAROLD S. STONE
Stanford Research Institute
Menlo Park, California

INTRODUCTION

The concept of the content-addressable memory has been a popular one for study in recent years,1,2,3,4 but relatively few real systems have used content-addressable memories successfully. This has been partly for economic reasons (the cost of early designs of content-addressable memories has been very high) and partly because it is a difficult problem to embed a content-addressable memory into a processing system to increase system effectiveness for a large class of problems.

In this paper, we describe a relatively inexpensive modification to the memory access circuitry of a general purpose computer that will permit it to perform some of the operations that can be performed in a content-addressable memory. The major restriction is that the memory must be a 2½D memory.5,6 The modification results in a new access mode to memory, one which permits a bit slice from a number of different words in memory to be accessed. Memory can be viewed as a collection of N × N arrays of bits such that an access can be made to either the ith row or the ith column of an array. Although this capability is somewhat limited by comparison to the capability of large content-addressed memories as they are normally conceived, it is ideally suited to that class of problems that requires both conventional and associative processing. A good example of this type of problem is the Gauss-Jordan algorithm for matrix inversion, which involves a search through a matrix for the numerically greatest element.

In the next section we further describe the functional behavior of the modified memory and illustrate its use through a series of examples. The subject of the third section is the modification of a 2½D memory in order to permit row-column access. The last section contains an evaluation of the technique and a summary.

Associative processing with row-column operations

The functional behavior that is described in this section is not novel; it was first described nearly a decade ago. The reason for its resurrection is the ease with which it can be implemented in present technology using techniques described in the next section.
The basic idea is illustrated in Figure 1. Memory is viewed as partitioned into multiple arrays of bits, each of size N × N, where N is a power of two. Memory accesses can be made in one of two modes, row mode or column mode. In either mode, a particular array in memory is selected by the high order bits of an effective address while the least significant log2 N bits select either a row or a column. This behavior is essentially the same as that of the horizontal-vertical computer described by Shooman.7,8 Row selection is equivalent to word selection in a conventional computer. The column selection mode has no counterpart in conventional computers and is the mode that supports a limited form of associative processing.
There is a major difference between Shooman's vertical-horizontal processor and the idea that is developed in this paper. Shooman conceives of two separate collections of registers and processing logic to be used in his processor, one for vertical processing and one for horizontal processing. An implementation of Shooman's idea is described in Ref. 12. What we describe here uses just one collection of registers and logic. This characteristic comes about because in the modified memory all data is transferred between memory and a common data register for both row and column operations.
*This research was supported by the Office of Naval Research, Information Systems Branch, under contract Nonr-4833(00) with Stanford Research Institute.

FIGURE 1-The logical organization of memory. The mode of access is determined by the instruction. (The high order bits of the effective memory address select an array; the least significant log2 N bits select a column or row in the array.)

To be more specific, Figure 2 shows an N × N array of bits and shows the contents of the memory data register both after reading the kth column and after reading the ith row. Note carefully that rows are placed in the register such that the first bit, b(i,0), is at the left end of the register and that the first bit of a column, b(0,k), is placed at the right end of the register.

To illustrate how row-column accessing is useful for associative processing, we now consider several examples. The notation that is used in the examples needs a brief explanation. Symbols written in capital letters, such as X and Y, are symbolic addresses. Subscripted letters such as R3 denote hardware registers which are part of a central processor. The symbols "⇐" and "⇑" denote row and column operations respectively. Symbolic addresses that are followed by bracketed symbols such as X[I] denote indexed addresses such that X[I] is the address obtained by adding the contents of memory location I to the address of X.

Thus, we have the primitive operations:

R1 ⇐ X;      Fetch a row and place it in R1.
Y ⇐ R1;      Store R1 as a row at address Y.
R2 ⇑ X;      Fetch a column addressed by X and store in R2.
Y ⇑ R2;      Store the contents of R2 as a column at address Y.

Let X and Y be the base addresses of two N × N bit 0, 1 matrices in memory. Then rows of X can be stored as columns of Y by iterating the instructions below for different values of the index variables.

R1 ⇐ X[I];
Y[J] ⇑ R1;

If I is equal to J and the loop is repeated for I = 1, 2, ..., N, Y will contain a copy of X rotated a quarter turn. If I = N-J, then Y will contain the transpose of X where the transpose is taken about the minor diagonal.
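The programmer-visible behavior of the row and column primitives, and the quarter-turn effect just described, can be checked with a small simulation. The Python sketch below ignores the physical skewing discussed later and simply follows the register conventions stated above (rows arrive left-to-right, columns arrive with the first bit at the right end).

```python
N = 4

def fetch_row(M, i):
    # row mode: b(i,0) arrives at the left end of the register
    return list(M[i])

def store_row(M, i, R):
    M[i] = list(R)

def fetch_col(M, k):
    # column mode: b(0,k) arrives at the right end of the register
    return [M[N - 1 - p][k] for p in range(N)]

def store_col(M, k, R):
    # inverse of fetch_col
    for p in range(N):
        M[N - 1 - p][k] = R[p]

X = [[(i ^ j) & 1 for j in range(N)] for i in range(N)]
Y = [[0] * N for _ in range(N)]

for I in range(N):
    store_col(Y, I, fetch_row(X, I))   # store row I of X as column I of Y

# Y now holds X rotated a quarter turn: Y[r][c] == X[c][N - 1 - r]
```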
To aid in search operations, we introduce the machine function NORMALIZE. The NORMALIZE instruction left shifts a specified register until a "1" appears in the left-most position, or until N shifts are performed if there are no "1"s in the register. A count of the number of shifts is placed in another designated register. In our symbolic notation we use the form

R2 ← NORMALIZE (R1);

to mean that R1 is normalized and the shift count is placed in R2.
As an illustration of the use of the column operation for searching, consider the problem of scanning a status vector array to find a vector with a "1" in the ith bit position. The pair of instructions that perform the search are

R1 ⇑ STATUS [I];
R2 ← NORMALIZE (R1);

R2 contains N-j where j is the index of a vector with a "1" bit in the ith position. (For programming convenience it would be wise to implement the NORMALIZE command so that the result left in R2 would be j instead of N-j.)
Now consider the problem of searching an array for a vector that contains a particular field matching a specified pattern. The technique that we use is to perform an iterative sequence of column operations on the pattern field. In the instruction sequence below, register R1 holds the pattern, R2 holds columns fetched from memory, and R3 holds a 0, 1 vector that contains a 1 bit in a bit position if the corresponding word in the array has matched the pattern on all preceding operations. The sequence is initialized so that the left-most bit of the pattern lies in the left-most bit of R1, and R3 is initialized to all "1"s. "NOT" and "AND" are full register logical operators.

R2 ⇑ X[I];                       Fetch the next column of the pattern;
IF LEFTBIT(R1) = 0 THEN R2 ← NOT R2;
R3 ← R3 AND R2;                  Mask out bits in R3 that disagree with the pattern for this iteration.

The sequence of instructions must be repeated for values of I ranging over the index field. After the final iteration, R3 contains a "1" in positions that correspond to words with fields that match the pattern. A NORMALIZE command can be used to obtain the addresses of words that satisfy the search criterion.
Note that the search procedure effectively looks at N words but the number of fetches required is equal to the length of the pattern. Hence, for short patterns, considerably fewer than N memory fetches will serve to search N words. Patterns that occupy a full word can be processed about equally well in row mode or in column mode.
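The equality search just described can be simulated directly. In this Python sketch, registers are modelled as lists of N bits with word j's bit at position j (a simplification of the actual register layout); one column fetch is made per pattern bit.

```python
N = 8
WIDTH = 4                              # width of the pattern field
words = [0b1011, 0b0110, 0b1010, 0b1011, 0b0000, 0b1111, 0b1011, 0b0001]

def column(i):
    # bit slice i of all N words, most significant bit first
    return [(w >> (WIDTH - 1 - i)) & 1 for w in words]

def search(pattern):
    R3 = [1] * N                       # 1 means "matched on all bits so far"
    for i in range(WIDTH):             # one column fetch per pattern bit
        R2 = column(i)
        if (pattern >> (WIDTH - 1 - i)) & 1 == 0:
            R2 = [1 - b for b in R2]   # NOT R2: a 0 pattern bit matches 0
        R3 = [a & b for a, b in zip(R3, R2)]
    return R3

hits = [j for j, bit in enumerate(search(0b1011)) if bit]
```

Only WIDTH fetches are made to search all N words, which is the economy the text points out for short patterns.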
Searches need not be made on "equality" matches. A slight modification of the instructions above is all that is required to implement a threshold search. We use one more register, R4, that contains "1"s in positions that correspond to words with fields greater than the pattern. R1, R2 and R3 are used as before. R4 is initialized to all "0"s prior to executing the sequence below.

R2 ⇑ X[I];                       Fetch next column of the pattern;
IF LEFTBIT(R1) = 0 THEN
BEGIN
    R4 ← R4 OR (R3 AND R2);
    R2 ← NOT R2;
END;
R3 ← R3 AND R2;
R1 ← LEFTSHIFT(R1);

The statement immediately after the "BEGIN" updates the vector in R4 such that a word is greater than the pattern if it had been greater on the previous iteration or had been equal before and is greater on the current iteration. At the close of a sequence of iterations of the program steps given above, the vectors in R3 and R4 uniquely identify all words that are either equal to the pattern or greater than the pattern. All of the remaining words are clearly less than the pattern.

Sets that satisfy any of the relations "=", "≠", ">", "≥", "<", and "≤" can be obtained easily by simple operations on the vectors in R3 and R4. "Between limits" or "outside limits" searches can be done by using three registers for each limit, i.e., one register to hold the limit pattern, and two that function as counterparts of R3 and R4.

A number of other operations are also possible with row-column accessing. Many of these are given in Shooman.7 The examples given here and in Reference 7 should suffice to illustrate the power of the idea. It is appropriate at this point to consider the implementation.

The memory modification

The memory modification can be developed from a discussion of the functional requirements of row-column processing and the constraints of conventional memory technology. We begin by examining the N × N matrix in Figure 2.

FIGURE 2-The result of fetching a row and a column from an N × N array. (The figure shows the logical layout of the data in memory and the data appearing in the processor register.)

In a conventional memory, the bits in the matrix are stored so that the columns of the matrix lie on separate and distinct sense lines of the memory. Because of constraints of conventional memory technology, during any memory cycle no more than one bit per sense line can be read or written. For row operations, selection circuitry activates all bits of a specified row, each of which lies on a different sense line. The physical portion of a core memory that is threaded by a sense line will be called a bit plane in the following material. For both 2½D and 3D memory organizations, the entities that we call bit planes correspond to physical planes of memory stacks, but the correspondence is usually not true for 2D memories.
For column operations, the technology constraint is severely restrictive. With physical memory organized

as shown in Figure 2, columns of arrays will lie wholly within bit planes, and thus no more than one bit per column could be accessed during any memory cycle under the assumed constraint. In order to overcome the constraints of memory technology, memory can be organized as shown in Figure 3. Each row in the array corresponds to a row in the array shown in Figure 2, but in Figure 3 the matrix is stored such that the ith row is cyclically shifted to the left by i bit positions. Careful examination of the figure shows that both rows and columns of the matrix have the property that exactly one bit lies in each plane. The bits are scattered and shifted, however, so that the memory access circuitry must be constructed to take this into account.

FIGURE 3-Physical layout of a 4 × 4 array in memory. (Word i holds row i cyclically shifted left by i positions across the memory planes.)

Figure 4 shows how row and column selections must function when data is held in a skewed fashion. In

Figure 4a, the first row is read into a data register, and must be cyclically shifted right one bit to place it in a standard format. In general, an operation on the ith row requires a right cyclical shift of i bits after a read cycle, and a left cyclical shift of i bits before a write cycle.

Column operations are somewhat more complex. Examination of Figure 4b shows that N different words must be accessed to obtain all of the N bits in one column. Specifically, the memory planes must be able to support simultaneous selection of bits in different rows. We shall return to this point later. The planes in Figure 4 are numbered from left to right as 0, N-1, N-2, ..., 1. The selection of the ith column is such that the jth plane must select the bit in word (i+j) mod N. After access, the word must be cyclically shifted to the right by N-i-1 bits. The skewed storage technique has been used in the Illiac IV design,9 where Illiac IV memory modules correspond to bit planes here, and full word operands in Illiac IV correspond to bits.
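To make the skewed-storage bookkeeping concrete, the following toy model in present-day Python (not part of the original paper; the function names and the N = 4 example are illustrative only) stores a 4 X 4 array with row r shifted left r positions, then recovers rows and columns with the shift amounts given above. Note that every column emerges in one fixed format, so a single postshift rule serves all columns.

```python
# Toy model of the skewed storage of Figure 3 (hypothetical names).
N = 4

def store(matrix):
    # planes[p][w] = element of word w held by memory plane p;
    # word r is cyclically shifted left by r positions.
    planes = [[0] * N for _ in range(N)]
    for r in range(N):
        for p in range(N):
            planes[p][r] = matrix[r][(r + p) % N]
    return planes

def read_row(planes, r):
    word = [planes[p][r] for p in range(N)]          # one bit per plane
    return word[-r:] + word[:-r] if r else word      # postshift right r bits

def read_column(planes, i):
    # Plane p selects the bit of word (i - p) mod N, matching the
    # (i + j) mod N rule under the 0, N-1, N-2, ..., 1 plane labeling.
    word = [planes[p][(i - p) % N] for p in range(N)]
    s = (N - i - 1) % N
    return word[-s:] + word[:-s] if s else word      # postshift right N-i-1 bits
```

Under this reading of the convention, rows come back in their original order and every column comes back in one fixed (row-reversed) order, so the processor always sees a single standard format.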
Thus far, the discussion has been functional in nature. Now we must take a closer look at memory technology. Among the conventional memory organizations only one can easily be adapted to permit accesses to different bits in different bit planes. This memory organization is the so-called "2½D" memory, and its organization is shown in Figure 5.

In the figure, it is shown that addresses are split into two components, X and Y, and are separately decoded. There is a single set of X drivers for all planes, but there is an individual set of Y drivers for each bit plane. The several sets of Y drivers act in unison for selection purposes in the normal mode of operation. It is precisely the multiplicity of Y drivers that permits the memory access circuitry to be modified to support the column mode of operation.

FIGURE 4-Examples of a row and column access to a 4 X 4 array
FIGURE 4a-Access to row 1 followed by a right shift of 1 bit
FIGURE 4b-Access to column 1 followed by a right shift of 2 bits

FIGURE 5-The organization of a 2½D memory system

FIGURE 6-Modifications required for the ith memory plane. The modifications are enclosed by the dashed rectangle

For column mode operations, it is necessary to modify the drive circuitry slightly in order to place independent control in each plane for the selection of the Y driver to be energized. The requirements are that each plane energizes either the driver corresponding to the decoded Y address (row mode) or the driver corresponding to the sum of the decoded Y address and the bit plane index (column mode).

Since the plane index is fixed for each plane, it is not necessary to use an adder in each plane to implement the associative access mode. Consider, for example, the schematic diagram of the memory drive circuitry for a single bit plane as shown in Figure 6. In this figure, the X drive lines link all bit planes and effectively select a particular N X N bit array for interrogation. The Y drive lines select either a row or a column from the N X N array. The circuit in Figure 6 is intended to be identical to that in Figure 5 except for the two level logic circuit shown in the dashed rectangle.*

The two level logic circuit has the following property. In row mode, signal A is false, and the output of the Y decoder is fed directly to corresponding drive lines in the selection matrix. Note that all planes act identically in this condition, so that each plane returns a bit from the same word address. When signal A is true, the output of the Y decoder is displaced cyclically by an amount i in the ith plane. Thus, each plane reports a bit that belongs to a different word of the N X N array, and in fact reports the bit indicated by Figure 4. The signal displacement in column mode is "end-around." That is, if the decoded address is the jth address, then line (i + j) mod N will be activated in the ith plane.


In terms of logic circuitry, the two level circuit shown in Figure 6 is the only modification that is necessary for the access circuitry.

One possible adverse effect of the two level logic circuit is a small increase in the total memory cycle time. This increase is expected to be less than one percent in magnetic storage technologies, and is likely to be more than balanced by an increase in memory effectiveness. However, in newer technologies, such as integrated circuit memories, the increase in cycle time may be somewhat larger.

We turn our attention now to the problem of cyclically shifting data prior to entry to the memory and after retrieval from memory. The two shifts are called preshifts and postshifts, respectively. To solve the shifting problem note that the amount of a preshift or a postshift can be derived from the mode and the effective memory address. Let the least significant log2 N bits of an effective address be called the s index (s for shift). Then, the shift direction and amount is given by the table below.
TABLE I

Mode      Preshift amount         Postshift amount
row       s bits left             s bits right
column    N - s - 1 bits left     N - s - 1 bits right

Note that the value of N-s-1 can be computed by taking the 1's complement of s in a register with log2 N bits. Consequently, it is extremely simple to compute the shift amount because it is equal to the lower address bits in row mode or the 1's complement of these bits in column mode.
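In present-day notation the rule reduces to a mask and an exclusive-OR; the sketch below (Python for illustration only, with a hypothetical function name) is a direct transcription of Table I.

```python
def shift_amount(address, log2_n, column_mode):
    """Shift amount for pre/postshifting under row-column addressing.

    s = low log2(N) bits of the address; column mode uses the 1's
    complement of s, which in a log2(N)-bit register equals N - s - 1.
    """
    n = 1 << log2_n
    s = address & (n - 1)
    return s ^ (n - 1) if column_mode else s
```

For N = 8 and an address whose low bits are 5, row mode shifts by 5 and column mode by 8 - 5 - 1 = 2.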
A good candidate for the shift circuit is the barrel
shifter shown in Figure 7. Shifters of this type have been


*Actually, X and Y axis selection circuits of 2½D memories are more complex than is indicated in Figures 5 and 6. In most implementations, the Y address lines are partitioned into two sets which are separately decoded. The decoder outputs are connected in a cross-bar arrangement and the Y lines that thread cores are located at the crosspoints. The modifications shown in Figure 6 are easily adapted to the cross-bar method of selection.

FIGURE 7-A barrel shift register with a detailed view of a
typical cell


implemented in several commercial computers. A shifter with log2 N stages can cyclically shift N bit words by any amount from 0 to N-1. Each stage shifts either 0 or 2^i bits depending on the ith bit in the binary representation of the shift amount. For our purposes, it is most likely that two shifters are needed, one for postshifting and one for preshifting. The two shifters should be placed on the data paths between memory and processor, and not between the memory data register and the memory. The reason for the placement on the data paths is that there is a possibility for overlapping the shift operation with other operations, whereas if the shifters were placed between data register and drivers or sense amplifiers, the effect would be to increase the memory cycle.
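The staged rotation can be sketched as follows (modern Python, purely for illustration; the hardware of Figure 7 operates on all bits in parallel, so every rotation amount costs the same log2 N gate delays):

```python
def barrel_rotate_right(word_bits, amount, log2_n):
    """Cyclic right rotation built from log2(N) stages.

    Stage i rotates by 2**i positions iff bit i of the shift amount
    is set, as in the barrel shifter of Figure 7.
    """
    for i in range(log2_n):
        if (amount >> i) & 1:
            step = 1 << i
            word_bits = word_bits[-step:] + word_bits[:-step]  # rotate right 2^i
    return word_bits
```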
A good example of masking the effect of shifting time through overlap of operations occurs for write cycles. Normally, write cycles are preceded by clear cycles to clear memory in the bit positions that are to receive new data. The clear cycle can proceed as soon as an address is available. While the clear cycle is active, data can be shifted into the proper format and be ready for storage when the write cycle begins. Postshifting after read cycles will tend to increase the time required to access data but it will not increase the memory cycle time.
The barrel shifters thus are used to translate between
a standard data format for the processor and one of N
different storage formats. Because data within the processor are held in a standardized form, manipulation of
data within the processor can be done in conventional
ways. Conventional load and store commands can be
issued from the central processor without regard for
the fact that data might be cyclically rotated during
the transfer. In essence, the physical form of data storage is completely invisible to the processor.
What has been gained, of course, with the barrel shifter, the modified memory, and the modified storage format, is the capability for performing associative operations in a general purpose computer. The net cost of the memory modification is two barrel shift registers and some logic in the Y address circuitry. The cost is undoubtedly a small fraction of the cost of the memory and processor. There may be some impairment of performance because of increased access time, but this is unlikely to be significant because the delays through a microelectronic barrel shift register are very small compared to the cycle time of a core memory.
SUMMARY AND CONCLUSIONS

The performance benefits to be derived from row-column addressing depend greatly on the effectiveness of column mode operations in reducing the number of memory accesses. The greatest benefit of column mode addressing is for those operations in which a small portion of a large number of words in memory must be accessed. Scaling and magnitude searches fall in this category. Take, for example, IBM System/360 long floating point words with 7-bit exponents and 56-bit mantissas. Scaling and magnitude searches over 64-word groups can be done with approximately eight accesses by accessing the exponent field of the groups in column mode. This is essentially how one would conduct the search for a pivot element in a Gauss-Jordan reduction.
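The magnitude search just described is bit-serial across the group; in this hypothetical Python sketch (names and interface are illustrative, not the paper's hardware) each loop iteration stands for one column-mode access to one exponent bit position across all words of the group, so roughly one access per exponent bit suffices to isolate the largest exponent.

```python
def max_exponent_candidates(exponents, bits=7):
    """Bit-serial magnitude search over a group of exponent fields.

    Examines one bit-column per (simulated) memory access, most
    significant bit first, retaining only words that can still hold
    the maximum; about `bits` column accesses replace a full-word
    read of every group member.
    """
    candidates = set(range(len(exponents)))
    for b in reversed(range(bits)):
        ones = {i for i in candidates if (exponents[i] >> b) & 1}
        if ones:                     # a 1 at this bit dominates a 0
            candidates = ones
    return candidates
```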
Since the word length effectively determines the number of different words that are accessed in a column operation, it is desirable to have as large a word length as possible. In present technology, it is feasible to implement memories with up to 64 or 128 bits/word. It appears that reasonably good performance improvement is possible with 64-bit words, so that a useful implementation of row-column accessing is possible within present technology.
Some analysis is required to determine the utility of column accessing for common operations such as symbol table searching and sorting. Hash-addressing used in combination with column mode access is one possible method for performing table searching. Hash-addressing normally involves a search when conflicts occur,10,11 and this search might be speeded with column access. The biggest improvement would come when tables are nearly full, at which time there is a high probability that a search will take place after the computation of the hash-code.
Ultimately, the performance improvement to be derived from the techniques that have been described in this paper is determined by the applications. While we cannot predict at this time what benefits might accrue, the fact that row-column accessing can probably be achieved for little cost within present technology indicates that the benefits of an implementation of the technique will almost certainly outweigh the cost of implementation.

REFERENCES

1 A KAPLAN
A search memory subsystem for a general purpose computer
AFIPS Proc of the 1963 FJCC Vol 24 Spartan Books Baltimore Md pp 193-200
2 R G EWING P M DAVIES
An associative processor
AFIPS Proc of the 1964 FJCC Vol 26 Spartan Books Baltimore Md pp 147-158
3 B T McKEEVER
The associative memory structure
AFIPS Proc of the 1965 FJCC Vol 27 Part 1 Spartan Books Baltimore Md pp 371-388
4 A G HANLON
Content-addressable and associative memory systems-a survey
IEEE TEC Vol EC-15 No 4 pp 509-521 August 1966
5 T J GILLIGAN
2-1/2D high speed memory systems-past present and future
IEEE TEC Vol EC-15 No 4 pp 475-485 August 1966
6 J R BROWN JR
First and second order ferrite memory core characteristics and their relationship to system performance
IEEE TEC Vol EC-15 No 4 pp 485-501 August 1966
7 W SHOOMAN
Parallel computing with vertical data
Proceedings of the EJCC Vol 18 December 1960
8 W SHOOMAN
Orthogonal computer
US Patent 3 277 449 October 4 1966
9 D L SLOTNICK
Unconventional systems
AFIPS Proc of the 1967 SJCC Thompson Book Co Washington DC pp 477-481
10 W D MAURER
An improved hash code for scatter storage
CACM Volume 11 Number 1 pp 35-38 January 1968
11 R MORRIS
Scatter storage techniques
CACM Volume 11 Number 1 pp 38-43 January 1968
12 P A HARDING M W ROLUND
A 2-1/2 D core search memory
Fall Joint Computer Conference December 1968

Addressing patterns and memory handling algorithms

by SHERRY S. SISSON* and MICHAEL J. FLYNN**

Northwestern University
Evanston, Illinois

INTRODUCTION

One of the principal problems facing the designer of a high performance computer system is the efficient handling of memory. In arranging a memory system the designer, lacking knowledge of the programs to be run, usually selects "worst case" assumptions concerning addressing patterns. This paper is an attempt to compare a set of selected disparate programs and analyze their actual addressing traces with respect to the various memory algorithms.

In high performance systems an increase in the bandwidth* of memory must be achieved to insure an ample supply of instructions and operands to the central processing unit. More efficient program storage organizations are needed to decrease effective memory access time to a time compatible with the processor cycle time.

Three methods by which storage bandwidth can be improved are: (1) making use of a high speed immediate storage by transferring blocks of words between local and main storage as required (variations of this method have been called paging), (2) interleaving independent memory units in order to obtain a faster effective access time, and (3) using a high speed virtual memory (look-ahead unit) in the central processing unit in order to obtain an optimum instruction or data flow. This method anticipates the actual information requirements of the system.
*Present Address: Bell Telephone Laboratories, Naperville, Illinois 60540
**Author's time supported in part by U.S. Atomic Energy Commission, Argonne National Laboratory, Argonne, Illinois

The purpose of this project is to evaluate these system organizations by studying recorded address traces of executed programs. Blocking, interleaving, and looking ahead configurations were evaluated and compared with similar results presented in the literature. This work was done in two parts.

1. The first step was to write a simulator which chronologically recorded on tape the storage demands requested by a test program as this program was being executed. This simulator was then used to generate addressing pattern tapes of three representative test programs. The memory requests of each test program were written on tape in the order used by the program and were flagged to indicate whether they were data or instruction requests.

2. The second step consisted of statistical analysis of the data. This study was done by adapting various program organizations to the data generated by Part 1. Three statistical programs were written. All of them gathered information for: a) the instruction stream,** b) the data stream, c) instruction and data stream combined. These statistical programs determined the frequency of run lengths, the number of jumps within a look-ahead unit for various level sizes, and the interference occurring for various interleaving configurations.

The results of the data analysis can be used to give an indication as to how blocking, interleaving, and look-ahead organizations improve the speed of execution of actual programs. First, a

**Stream is defined to be a sequence of data or instructions as seen by the machine during the execution of a program.

*Storage bandwidth is the retrieval rate of words from memory.


FIGURE 1 (M interleaved memory units sharing a data bus and a request bus; a memory cycle consists of access plus regeneration; bandwidth = number of requests per memory cycle)

general review of program storage organizations will be presented, followed by descriptions of the simulator, test programs, and analysis programs. The final sections present the results of the adaptations of program storage organizations to the test program address traces, and conclusions.

Organization of storage (memory) systems

Concept of interleaved storage

The concept of interleaved storage is developed from the idea of using M independent memory units instead of one main memory. The words of a program and its data are distributed successively among the memories modulo M such that memory 1 contains addresses 1, 1 + M, 1 + 2M, etc., as shown in Figure 1. By increasing M for a given computer, the memory bandwidth is increased because the number of accessing conflicts is decreased. Recall that bandwidth is the service rate of the main memory. This bandwidth must process fetching of instructions and operands, storage of results, and input/output demands.

In his paper, Flores4 studied the limiting effect of the number of memory banks on computer response time and developed an equation relating a waiting time factor to relative memory cycle time based on a queuing model. His work was based on the assumption that the demands are independent of response; therefore, the demand distribution is stationary. As a result, the probability of the arrival of a demand request in a small fixed time interval is the same as in any other period. Only the worst case condition of random memory addressing was considered. In reality, programs execute demands somewhere between the extremes of sequential and random. The interleaving results of this project used actual program addressing patterns. From these results, one can get an idea as to the effect of the assumption of random storage demand. The results are presented and contrasted with Flores' predicted results.
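To see why the sequential-versus-random distinction matters, consider the toy model below (present-day Python; the timing model and names are assumptions, not the authors' simulator). Each address maps to bank (address mod M), the processor issues one request per cycle, and a bank stays busy for a full memory cycle after each access.

```python
def interferences(addresses, m, cycle=4):
    """Count requests that find their bank (address mod m) still busy.

    Hypothetical timing: one processor cycle per request; the
    processor stalls until a busy bank frees; each access ties up
    its bank for `cycle` processor cycles.
    """
    busy_until = [0] * m           # time at which each bank becomes free
    conflicts, t = 0, 0
    for a in addresses:
        bank = a % m
        if busy_until[bank] > t:   # an interference: wait for the bank
            conflicts += 1
            t = busy_until[bank]
        busy_until[bank] = t + cycle
        t += 1
    return conflicts
```

With M at least equal to the memory cycle, a purely sequential stream causes no interference at all, while a stream that repeatedly hits one bank waits on every request; real programs fall between these extremes.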
Blocking and look-ahead design

Gibson's block-oriented system

FIGURE 2 (a page is transferred on demand between the large main memory and the local storage that serves the CPU)

Another approach to the bandwidth problem is a block-oriented design presented by Gibson.6 He proposed a block storage method where a request to main memory moves a block of words to a local store. Statistical results are presented on how big a block should be and what the size of local storage should be. Of course, the optimum configuration at any one time depends on the program being executed; thus, one must find an overall best configuration for the system.

Gibson's simulation results show that for a local storage size of 2048 words the number of accesses to main storage is relatively independent of the block size. This result indicates that the block size should be kept as small as possible. If large blocks are used, then the local store size should be increased in order to get good performance compared to conventional storage. The local storage replacement algorithm used does have some effect on the usefulness of local storage, while the size of the program being executed has little effect on local storage characteristics. As a reference the paging results of this project will be compared to Gibson's block storage results.
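The measurement underlying such comparisons can be approximated in a few lines. The sketch below (modern Python; FIFO replacement and all names are assumptions of this illustration, since the replacement algorithm is not specified here) counts the block fetches a reference string causes for a given block size and local store size.

```python
from collections import deque

def block_fetches(addresses, block_size, local_words):
    """Count block moves from main storage into a small local store.

    On a miss, block (address // block_size) is brought in; FIFO
    replacement is used once all local_words // block_size block
    frames of the (hypothetical) local store are occupied.
    """
    frames = local_words // block_size
    resident = deque()             # FIFO order of resident block numbers
    fetches = 0
    for a in addresses:
        b = a // block_size
        if b not in resident:
            fetches += 1
            if len(resident) == frames:
                resident.popleft() # evict the oldest block
            resident.append(b)
    return fetches
```

Sweeping block_size and local_words over a recorded address trace reproduces, in miniature, the kind of trade-off curves Gibson reports.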


ADDRESS TAPE FORMAT (Figure 4): the tapes contain binary records, 24 words per record; the addresses are written on tape in order of use during execution of the test program.

Look-ahead systems

Look-ahead systems are employed in large-scale computers in order to increase program execution speed and to smooth fluctuations in memory demand. The Stretch3 system included a look-ahead unit in its design for these reasons. Other systems incorporating a look-ahead feature are the IBM System/360 Model 91 and the CDC 6600.

Look-ahead, as the name implies, is anticipatory in nature unlike the block transfer method which operates on demand. Instruction look-ahead is driven by the expectation of instructions to lie in a strict sequence. Data look-ahead is driven by the expected instruction stream. Figure 3 shows the types of look-ahead units that were applied to the address traces. A comparison of the performance of these look-ahead models will be presented as part of the results.

Simulator description

Simulator program description
FIGURE 3 (instruction look-ahead models: the forward algorithm holds the contents of *+1 to *+N, the backward algorithm the contents of *-1 to *-N, and the centered algorithm the contents on either side of *, where * = present value of the instruction counter; each unit comprises N instruction look-ahead registers and N data look-ahead registers, C(α) denoting the contents of address α)

FIGURE 4 (record format: each of the 24 words per record holds a flag bit and an address in bits 18-35; the flag is 1 if the address is in the data stream and 0 if the address is in the instruction stream)

This program was written in FAP, the IBM 7094 assembly language, for the purpose of monitoring the execution of other FAP programs. The simulator keeps an instruction location counter for the test program being monitored. This location counter is used by the simulator to read the next instruction of the test program and record the location on tape. If an instruction requires an operand, the absolute address of the operand is recorded on tape following the instruction address. Any index modification of the operand is done before recording the address. One level of indirect addressing can be handled, with all absolute addresses requested during the indirect addressing process recorded on tape. These addresses are written on tape in the order requested by the test program and are flagged to indicate whether each was an instruction stream or data stream address. The format of the tapes is shown in Figure 4.

Some difficulties were encountered in writing the simulator. The execute command (XEC) was used to run a test program under control of the simulator. The major problem with this method of control is that when XEC executes a transfer, the simulator loses control to the transferred location. This problem was partly solved in the following manner: Before execution of a transfer, the simulator inserts at the transferred location a transfer to a location in the simulator program and sets a flag to indicate that this was a transfer

operation. After execution of the transfer instruction (conditional or unconditional), the simulator replaces the original instruction into the test program. If the transfer was taken, the instruction location counter is updated to the transferred location. This method works except when control is transferred to a protected area in core during an input/output sequence.

The input/output problem was resolved by placing any test program I/O code in the simulator and placing an unconditional transfer to this code in the appropriate test program location. As the simulator is executing the test program, it looks for a transfer to I/O code before executing an instruction. When it finds one, the I/O code is executed. The simulator then reads the next instruction of the test program and continues execution. Because the simulator performs the I/O, no addresses used during I/O execution are recorded on tape.

Test program descriptions

This simulator was used to simulate and record the accessing patterns of three test programs. In an attempt to get a representative sample of patterns, programs with varied characteristics were run.

Differential equation solution

The first program was an iteration problem which solved the Van der Pol differential equation using Hamming's method. The Van der Pol equation,

is used to describe non-oscillatory systems. Input
parameters were given to the program for e,
h, and the first four x and y points. The next 96
points were calculated using the input parameters
and Hamming's predictor-corrector equations (see
Ralston 7 p. 189) :
Predictor: yo n4-1 = Yn-3

+

+

2y' n-2)

+ (3h/8) (y n+l)' + 2y' n -

y' n-l)

(4h/3) (2y' n - y' n-l

'+1

Corrector: Yn+! = (1/8) (9Yn - Yn-2)
i

'Where: YOn+! is the estimated initial value of y n+1,

y' n

i~

yin

i:-: the ith approximation to Yn+l.

the derivative of y 11,

The corrector formula was reapplied at each point
until the change was less than 0.0001. Each x,y
point plus the number of times the corrector was
applied at each step were printed at the end of the
program. This program passed through a few distinctive loops many times as each new point was
predicted and the corrector formula applied. The
simulation of this program caused 83856 accessing
requests to be recorded on tape.
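The per-point iteration can be sketched as follows (modern Python; the function name, argument layout, and the generic y' = f(x, y) interface are assumptions of this illustration, not the 1968 FAP code). It mirrors the predictor and corrector equations above and the 0.0001 convergence test.

```python
def hamming_step(f, x, h, ys, dys, tol=1e-4, max_iter=50):
    """One step of Hamming's predictor-corrector method.

    ys = [y_{n-3}, y_{n-2}, y_{n-1}, y_n]; dys holds the matching
    derivatives.  Returns (y_{n+1}, number of corrector applications),
    iterating the corrector until the change falls below tol.
    """
    # Predictor: y0_{n+1} = y_{n-3} + (4h/3)(2y'_n - y'_{n-1} + 2y'_{n-2})
    y = ys[0] + (4.0 * h / 3.0) * (2.0 * dys[3] - dys[2] + 2.0 * dys[1])
    for count in range(1, max_iter + 1):
        dy = f(x + h, y)  # derivative at the current approximation
        # Corrector: (1/8)(9y_n - y_{n-2}) + (3h/8)(y'_{n+1} + 2y'_n - y'_{n-1})
        y_new = (9.0 * ys[3] - ys[1]) / 8.0 \
            + (3.0 * h / 8.0) * (dy + 2.0 * dys[3] - dys[2])
        if abs(y_new - y) < tol:
            return y_new, count
        y = y_new
    return y, max_iter
```

The tight loop over a handful of statements, repeated once per point and per corrector application, is exactly the kind of distinctive looping behavior the recorded address trace exhibits.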
Data processing problem

A data processing program was run as the second test program. Data cards are initially read and stored sequentially in storage. Each card contains an employee name (last name first), number, and codes indicating marital status and sex. After all the cards are read into storage, the following steps are performed.

1. The cards are sorted so that the employee numbers are in ascending order.
2. The input information is used to prefix each employee name by MR., MRS., or MISS. Each prefixed name is reversed to last name last before going to the next employee name. Several logical and shift commands were used to prefix and reverse a name and place it left adjusted in storage.
3. Any leading zeroes on the employee number are replaced with blanks.
4. The employee names, each with the proper prefix and followed by the employee's number, are printed in numerical order.

When executed by the simulator, this program
requested 190608 accesses to main storage.
Machine simulation
The last· test program was a program which
simulated the language of a nonexistent machine
on the IBM 7094. It interpreted control cards ·and
simulated the instructions of the nonexistent machine. The main program read and interpreted
the control cards which included such commands
as IDMP, LOADER, PRINT, SNAP, and START.
After the START control card was read, the program transferred to a decoding routine which
read and decoded the instructions of the nonexistent machine. Each instruction was simulated by

Addressing Patterns and Memory Handling Algorithms

a oommand subroutine. A very small program.,
consisting of a few control cards and instructions,
was written in this new language for the test program to simulate. The test program requested
1392 addresses when it was executed, enough to
get an idea of the characteristics of this program.
There was more input/output in this program
than in the other two test programs. Recall that
memory requests in an input/output section of
code are not recorded by the simulator, thus a portion of the test program access pattern is not considered in the program analysis.

Analysis program descriptions

When executed, the following programs made three passes of an address tape, one for each of the three addressing streams.

Run length program

The first program written recorded run length information. A run length is defined as the number of addresses increasing in value by one over the previous address in an addressing pattern. When an address breaks the sequential pattern, this address is taken as the start of a new run length. A table is printed showing the run lengths that occurred in the program and the number of times each run length occurred. Also recorded are the total number of instruction, data, and combined stream references used by the test program. This information can give an idea of the size and number of loops occurring in a program, and of the sequential nature of a program.
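In modern terms the tabulation might look like this (a Python sketch with illustrative names, not the original analysis program):

```python
from collections import Counter

def run_lengths(addresses):
    """Tabulate run lengths: counts of maximal streaks in which each
    address exceeds its predecessor by exactly one."""
    table = Counter()
    run = 1
    for prev, cur in zip(addresses, addresses[1:]):
        if cur == prev + 1:
            run += 1
        else:
            table[run] += 1   # the pattern breaks: a new run starts here
            run = 1
    if addresses:
        table[run] += 1       # close out the final run
    return table
```

Applied to an instruction-stream trace the table is dominated by long runs; applied to a data-stream trace it collapses toward run length one, as the results below show.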
Blocking and look-ahead program

Blocking and look-ahead characteristics were obtained by the second analysis program. The program was written with the assumption that only one page can reside in local memory (readily available to the processor) at a time. For a given page size, the number of new pages requested by a test program is printed. Along with the page count, the number of addresses used by a test program assuming single word access is printed. It should be noted that the blocking analysis program considered accesses to memory, not distinguishing between fetches and stores. The look-ahead section looked for three situations for a given number of levels (N).

1. It considered that N levels followed the current address (forward look-ahead).
2. N was considered to be the number of sequential addresses preceding the current address (backward look-ahead).
3. The levels were considered to be centered, with N/2 addresses preceding the current address and N/2 following.

The analysis program printed the total number of jumps occurring within N for each situation (forward, backward, and centered). The blocking and look-ahead data were collected first on the combined stream and then on the instruction stream and data stream separately.
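The jump classification for the three situations can be sketched as follows (modern Python; the names are hypothetical, and this version counts references falling outside the unit, the complement of the printed jumps-within-N figure):

```python
def jumps_outside(instr_addrs, n, mode="forward"):
    """Count instruction references falling outside a look-ahead unit.

    The unit is assumed to cover cur+1..cur+n (forward), cur-n..cur-1
    (backward), or n/2 addresses on either side of cur (centered);
    purely sequential flow (next = cur + 1) never counts as a jump.
    """
    lo, hi = {"forward": (1, n),
              "backward": (-n, -1),
              "centered": (-(n // 2), n // 2)}[mode]
    jumps = 0
    for prev, cur in zip(instr_addrs, instr_addrs[1:]):
        d = cur - prev
        if d != 1 and not (lo <= d <= hi):
            jumps += 1   # branch target lies outside the unit
    return jumps
```

Running all three modes over one trace for several values of N yields curves of the kind plotted in Figure 8.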
Interleaved storage program

The last program tabulated information on interleaved storage. It required three input parameters: relative memory cycle time, relative processor cycle time, and the number of memory banks used. Several items were tabulated for each set of input parameters, including the following.

1. A count was made of the number of interferences that occurred. An interference occurs when service is requested of a memory bank which is still busy with a previous request.
2. The total wait time due to memory bank interference was recorded; the wait time being the total time in cpu cycles that the cpu had to wait because the requested memory bank was busy.
3. The total processing time was tabulated. This time included one cpu cycle for each memory access plus the wait time.
4. The number of accesses to memory was printed.
5. The occupancy ratio (busy ratio divided by the number of banks) was calculated. The busy ratio is the ratio of memory cycle time to processor request rate.

Increasing the number of banks for a given relative memory cycle time proportionately decreased the occupancy ratio. Results were obtained on all three addressing streams. When the instruction stream was used alone, the data stream and any of its effects on processing time were ignored. Therefore, the results are worst case interference figures for the given relative processor and memory cycle times.
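A compact sketch of such a tabulation (modern Python; the timing model and names are assumptions of this illustration, not the original program) might read:

```python
def tabulate(addresses, mem_cycle, cpu_cycle, banks):
    """Tabulate interference count, wait time, processing time, access
    count and occupancy for an interleaved memory (toy model).

    Times are in cpu cycles; each access costs one cpu cycle plus any
    wait for its bank (address mod banks) to finish a prior cycle.
    """
    free_at = [0] * banks
    interferences = wait = clock = 0
    for a in addresses:
        b = a % banks
        if free_at[b] > clock:            # item 1: an interference
            interferences += 1
            wait += free_at[b] - clock    # item 2: accumulated wait
            clock = free_at[b]
        free_at[b] = clock + mem_cycle
        clock += cpu_cycle                # item 3: one cpu cycle per access
    busy_ratio = mem_cycle / cpu_cycle
    return {"interferences": interferences, "wait": wait,
            "processing_time": clock, "accesses": len(addresses),
            "occupancy": busy_ratio / banks}
```

Note that doubling `banks` halves the reported occupancy for fixed cycle times, matching the proportionality remarked on above.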


FIGURE 5 (cumulative percent of addresses vs. run length, words)

FIGURE 6 (cumulative percent of addresses vs. run length, words)

Results

Run length results
The run length results in Figures 5-7 show the sequential nature of the three addressing streams for each test program. They are a plot of the percentage of addresses which are of a given run length size or less. The figures show the worst and best case range of results. From observation of these figures, one can see that the instruction stream run lengths are significantly longer than either the data or combined stream run lengths. Because a run length is a number of sequential addresses, this longer run length of the instruction stream indicates that the instruction stream is more sequential than either the data or combined streams. The difference is more marked in program 1 where the mean instruction stream run length is 14.4 words while the data and combined stream mean run lengths are just over 1 word.

FIGURE 7 (cumulative percent of addresses vs. run length, words)

The wide range of instruction stream run lengths generated by programs 1 and 2 is reflected by their large standard deviations of 13.7 and 8.8, respectively.

100

MEAN RUN LENGTH, WORDS

80

COMB

INST

DATA

1.08
1.09
1.33

14.44
4.52
3.78

1.05
1.04
1.38

TEST PROGRAM 1
TEST PROGRAM 2
TEST PROGRAM 3

,
\

::::)

~

CENTERED

\

~

560
(I)

&.J

RUN LENGTH VARIANCE
(STAN pARD DEVIATION IN PARENTHESES)
COMB
TEST PROGRAM 1
TEST PROGRAM 2
TEST PROGRAM 3

INST

0.08(0.29) 187.5(13.7)
0.13(0.37) 77.1( 8.8)
0.73(0.86) 13.7( 3.7)

DATA
0.07(0.27)
0.09(0.30)
0.39(0.63)

~

\

\

\---.

&.J

a::
w
~
a::
... 40
z
w
(.)

'

....

....... ~FORWARD

-~

CENTERED

20

FULL
CENTERED ONLY

OL-----~----~----r_----r_--~r_--~~

o

4

2

16

8

32

FIGURE 8

Results in Figure 8 depict the probability of an
instruction reference to a point outside a lookahead unit for a given size. The best and worst
case results for each look-ahead model are plotted.
Results were also obtained on instruction stream

Another way to observe the sequential nature of program accessing patterns is to investigate their branching characteristics. A look-ahead unit anticipates the sequential nature of the instruction stream alone. An instruction operand is obtained from memory as a part of the look-ahead process and is saved along with the instruction at a single level of the unit. One problem with a look-ahead unit is that its effectiveness is diminished when a program branches to a location outside the unit.
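The branch-containment statistic behind these figures can be sketched as follows. The function, the toy trace, and the treatment of a centered unit as half its levels in each direction are assumptions for illustration, not the paper's exact measurement procedure:

```python
def miss_probability(trace, levels, mode="forward"):
    """Fraction of non-sequential references whose target does not fall
    within `levels` words of the current instruction in the given direction."""
    branches = misses = 0
    for prev, nxt in zip(trace, trace[1:]):
        if nxt == prev + 1:
            continue                    # sequential fetch, not a branch
        branches += 1
        offset = nxt - prev
        if mode == "forward":
            hit = 0 < offset <= levels
        elif mode == "backward":
            hit = -levels <= offset < 0
        else:                           # centered: half the levels each way
            half = levels // 2
            hit = -half <= offset <= half and offset != 0
        if not hit:
            misses += 1
    return misses / branches if branches else 0.0

trace = [0, 1, 2, 5, 6, 3, 4, 5, 40]
print(miss_probability(trace, 4, "forward"))
```

Sweeping `levels` over 2, 4, 8, ... for each configuration produces curves of the Figure 8 and Figure 9 kind.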

Look-ahead results

For all three programs, the data run lengths are short, with means ranging from only 1.04 to 1.38 words. This almost random referencing of data is primarily due to the initial data layout. For example, program 1 sequenced through large tables; but in between references to a location in the table, constants and temporary storage locations were referenced. Perhaps the single data stream should be considered as two streams, one referencing the tables and another referencing constants and temporary storage locations. The short combined stream runs strongly reflect the data characteristics.
In general, one can observe from these figures
that the instruction stream is significantly more
sequential than either the data or combined
stream. This difference gives an indication that
the instruction and data streams should be treated
as separate memory areas in a computer organization.

FIGURE 9—Probability of a data-stream reference falling outside a look-ahead unit vs. number of levels in the unit

Fall Joint Computer Conference, 1968

branching to instructions just previously executed (backward look-ahead) and on branching to instructions in both the forward and backward directions. For all test programs the three different look-ahead configurations are plotted on the same figure so that the configurations may be compared. Similar results are also plotted for the data stream in Figure 9.


Instruction stream
Test program 1 demonstrates the best look-ahead characteristics because of its highly sequential accessing pattern. For any of the look-ahead configurations, program 1 had a probability of less than 0.07 of not finding any given instruction within the look-ahead. However, this program was also the least sensitive to look-ahead size, with the probability of not finding a given instruction within the unit decreasing by only 0.01 up through the 64-level unit size.
The most significant improvement in the probability of finding a reference within a unit came when expanding from 2 to 4 levels for the forward and backward cases. For example, for the best forward unit results the probability of a branch instruction transferring out of 4 forward levels is 0.54, down 0.23 from 2 levels. The same probability at 4 levels in the backward unit was somewhat higher at 0.78, down 0.17 from 2 levels.
Each level plotted on the centered look-ahead reflects the combination of the forward and backward results of the next lower level. The branching improvement decreases rapidly for the first 8 levels, then decreases more slowly for the rest of the range.
Branching conclusions
Based on the results in Figure 8, a centered unit would be most desirable, but this configuration may also be the most difficult to implement. A unit of at least 8 levels is needed before the centered look-ahead performs any better than the forward configuration. If a smaller unit were required, then the forward look-ahead would provide the best performance. A timing simulation study was made for the Stretch computer to get information on the effect of a forward look-ahead unit on relative program execution speed.8 The Stretch results indicate that the biggest improvement in speed came when expanding to 4 levels. The same characteristic is displayed by the

FIGURE 10—Probability of not finding a word in local storage vs. block size in words (instruction, data, and combined streams; 512-word local storage for the combined stream)
test program branching patterns. This comparison indicates that program branching behavior is
a direct indication of look-ahead applicability.
Blocking results
Another method for improving program execution speed is the use of blocking techniques. The first results were obtained by the paging analysis program with the assumption of only a one-page local storage. From previous observations of the run length curves, one would expect paging to be most applicable to the more sequential instruction stream. Figure 10 reveals, as expected, that the instruction stream always has a smaller probability of not finding a word in local storage than either of the other streams. It takes only a 4-word page to bring the instruction stream probability down to less than 0.5, while the data stream needs a 32-word page to achieve the same performance. A 64-word page brings the probability of not finding an instruction reference down to less than 0.1. However, at this large size, the number of references made to local storage represents a small fraction of the number of words transferred to local storage. A large number of words are now being transferred from main memory to local storage.
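The one-page experiment reduces to a few lines of simulation. A sketch under the stated assumption that local storage holds exactly one page and reloads it on every miss (the function name and trace are illustrative):

```python
def miss_ratio(trace, page_size):
    """Fraction of references not found in a one-page local storage."""
    current_page = None
    misses = 0
    for addr in trace:
        page = addr // page_size      # page number containing this address
        if page != current_page:      # miss: fetch the new page
            misses += 1
            current_page = page
    return misses / len(trace)

trace = list(range(16)) + [100, 101, 0, 1]
for size in (1, 4, 16):
    print(size, miss_ratio(trace, size))
```

Larger pages capture more of a sequential run per fetch, which is why the instruction stream benefits first.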

FIGURE 13—Average memory waiting time vs. occupancy ratio, where the occupancy ratio is the number of requests per memory cycle (TM) divided by the number of memory units; curves shown include the test programs, a Model 91 simulation, and an actual trace

data stream causes the greater blocking times. One way to make the data stream wait times more compatible with those of the instruction stream would be to separate the data stream into three operand areas as enumerated in the previous section.
Flores' curve, shown in the figures for comparison purposes, was derived from an open-loop queuing model assuming a uniform random accessing pattern. It should be used only as a limiting factor in considering program execution speeds and not for determining the optimum waiting time for a system. Comparison of the test program addressing patterns with Flores' random accessing pattern reveals that fewer banks are needed to obtain a reasonably low memory waiting time for the test program patterns. With an occupancy ratio of 0.5, the average waiting time is less than one-half of the wait with random accessing. Increasing the occupancy ratio to 1.0 causes an insignificant increase in the waiting time of the test program addressing patterns as


compared to Flores' curve. It should be noted that the test program waiting times are slightly optimistic because interferences generated by input/output sequences were not included in the accessing pattern.
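The bank-interference effect can be illustrated with a toy simulation in the spirit of this comparison. The arrival model (one new request per processor cycle) and all parameter values below are assumptions for illustration, not the paper's queuing model:

```python
def average_wait(trace, n_banks, t_mem):
    """Average extra wait (in processor cycles) per reference when each
    address maps to bank (addr mod n_banks) and a bank stays busy for
    t_mem cycles after accepting a request."""
    ready = [0] * n_banks          # cycle at which each bank becomes free
    time = total_wait = 0
    for addr in trace:
        bank = addr % n_banks
        wait = max(0, ready[bank] - time)
        total_wait += wait
        ready[bank] = time + wait + t_mem
        time += 1                  # one new request per processor cycle
    return total_wait / len(trace)

sequential = list(range(64))
print(average_wait(sequential, 4, 4))   # fully overlapped: 0.0
print(average_wait([0] * 8, 4, 4))      # repeated hits on one bank
```

A sequential stream with interleaving equal to the memory-to-processor cycle ratio incurs no wait at all, while a stream that revisits one bank queues up behind it, which is the qualitative gap between the test-program and random-accessing curves.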
The relative improvement of the accessing rate as interleaving and memory speed are improved can be demonstrated in another manner. Figure 14 depicts the average access time in CPU cycles as a function of interleaving. Notice that increasing the interleaving yields an exponentially diminishing improvement. Similar results were obtained from a simulation of the IBM System/360 Model 91 storage system.2 The Model 91 simulation also used a random addressing pattern, which resulted in longer average access times than those of the test programs.
CONCLUSION
The objective in the design of a general-purpose computer is a system which executes programs as fast as possible and is technically and economically feasible. The analysis of dynamic address traces provides a quantitative measure of system performance which can be used to guide computer system design. This project examined program accessing patterns, looking for characteristics which would lead to more efficient storage utilization by programs. Final evaluation of the optimum method of organization depends on hardware costs for the various configurations, which were not considered in this study.
The major methods of organization are blocking, interleaved storage, and look-ahead. Results indicate that separation of the instruction and data streams improves program performance. The results of a one-page local storage show that the test program performance is much improved with a page size of 4 or more words. However, the results presented using a larger local storage reveal that performance is improved by more than an order of magnitude when more than one page is saved in local storage. Interleaved storage performance is good with the combined stream, but could be significantly improved by using separate streams. Reasonably low access waiting times are produced by using enough memory units to offset the ratio of memory to processor cycle times.
A look-ahead unit would be more applicable to an interleaved storage organization than an organization with local storage. A centered look-ahead handles program branching more efficiently than the other configurations at a size of 8 or more

levels. If a smaller look-ahead is to be considered
for a system, a forward look-ahead would provide
the best performance.
REFERENCES
1 D W ANDERSON F J SPARACIO R M TOMASULO
The Model 91: Machine philosophy and instruction handling
IBM J Res and Dev January 1967
2 L J BOLAND G D GRANITO A M MARCOTTE B U MESSINA J W SMITH
The Model 91 storage system
IBM J Res and Dev January 1967
3 W BUCHHOLZ Ed
Planning a computer system
New York McGraw-Hill 1962
4 I FLORES
Derivation of a waiting time factor for a multiple bank memory
JACM Vol 11 pp 265-282 July 1964
5 M J FLYNN
Very high-speed computing systems
Proc IEEE Dec 1966 pp 1901-1909
6 D GIBSON
Considerations in block-oriented systems design
AFIPS Conf Proc Vol 30 pp 75-80
7 A RALSTON
A first course in numerical analysis
New York McGraw-Hill 1965
8 S SISSON
Statistical analysis of computer accessing systems
Thesis for the MSEE degree Electrical Engr Dept Northwestern Univ Evanston Illinois June 1968

Design of a 100-nanosecond read-cycle NDRO plated-wire memory
by TAKASHI ISHIDATE
Nippon Electric Co., Ltd.,
Kawasaki, Japan

INTRODUCTION
Plated-wire memories are now attaining a promising position among main memories. UNIVAC claims that, with non-destructive readout (NDRO), peripheral-circuit cost can be reduced, since the peripheral circuits can serve multiple words. However, NDRO is most effective only for slower memories with a comparatively small number of interface bits. Main memories such as a 72-bit-per-word, 100-nsec read-cycle memory system cannot be improved by the use of NDRO as recommended by UNIVAC.
This paper deals with the best use of NDRO techniques in the high-speed operation of a memory system.

Reasons for adopting NDRO
The advantages and disadvantages of NDRO operation of a plated-wire memory are as follows.
Advantages:
(1) The NDRO memory is more reliable than the DRO memory against temporary errors. In the DRO memory, stored information will be definitely destroyed if the readout signal is detected in the wrong sense or if there is any malfunction in the recirculation loop.
(2) The read cycle time is short. There is neither rewrite time nor recovery time of digit noises in the NDRO memory.
(3) If the word current for writing is compatible with that for reading, we can store information exclusively into desired bits of a word while reading the remaining bits. This gives rise to efficient use of the memory.
(4) From the similarity between an NDRO memory and an electronically changeable read-only memory (ROM), most of the NDRO techniques can be utilized in ROM.
Disadvantages:
(1) The output signal level obtained in NDRO mode is generally lower than that obtained in DRO mode because, in NDRO operation, the readout word field is kept considerably below the amplitude which brings the magnetic vectors along the hard axis of the thin magnetic films. Low output signals make the sense circuit more complicated.
(2) When the thin film is operated in NDRO mode, the domain wall is apt to creep, and the yield of good plated wires falls, thus increasing the cost of memory planes.
(3) The read cycle time of NDRO is short, but the time required to write information increases due to the limited word current for writing.
(4) The difference between read cycle time and write cycle time is not favorable to the control unit of a processor accessing the memory.

Design objectives
The following items were selected for the design, anticipating the demand for performance characteristics in the 1970's.
(1) Unit capacity of the memory is 16,384 words of 72 bits each.
(2) The memory must have a read cycle time of 100 nanoseconds, a read access time of 70 nanoseconds, and a write cycle time of 200 nanoseconds.
(3) Input and output signal levels must be compatible with those of CMLs.

Timing
The timing should be determined for each functional block before the detailed system design is made. The timing waveforms are shown in Figure 1.
Read operation is repeated at a rate of 10 megacycles per second. The time delay for the address decoding and for the propagation of word current from the current source to the memory plane is estimated as 30 nanoseconds. The peak of the readout signal appears 10 nanoseconds after the mid-point of the rise of the word current. The maximum propagation time of the signal in the memory plane is assumed to be 15 nanoseconds. The delayed signal is shown with broken lines. Each of the circuit delays admitted, in the sense amplifier, in the polarity detecting strobe circuit, and in the output buffer, is 5 nanoseconds.
For write operation, 200 nanoseconds are allotted. The word current has two timings, called "Word Phase I" and "Word Phase II." Digit current consists of a pair of pulses, one positive and the other negative. The first timing is called "Digit Phase I" and the second "Digit Phase II."
The digit current timings shown by broken lines mean that the digit driver advances the digit current by the same amount as the propagation delay in the memory plane. This technique insures the precise time relation of the word and digit currents for any address.
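As a quick arithmetic check, the stage delays stated above sum exactly to the 70-nanosecond read access target; the dictionary below just restates figures from the text:

```python
# Read-access delay budget, in nanoseconds, taken from the figures in the text.
budget = {
    "address decode + word-current propagation": 30,
    "readout peak after word-current rise midpoint": 10,
    "signal propagation in the memory plane (max)": 15,
    "sense amplifier": 5,
    "polarity-detecting strobe circuit": 5,
    "output buffer": 5,
}
total = sum(budget.values())
print(total)   # 70 nanoseconds, the design's read access time
```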


System design
The system design should be accomplished taking engineering feasibility and cost into consideration. Some of the possible combinations of word and bit
Figure 6-Timing waveforms in the digit control system

Figure 7-Timing waveforms in the word control system

Let the memory location nearest to the digit circuitry be Mn and the farthest M, in Figure 5; then an example of the time relation using the digit control system is as shown in Figure 6.
In cases such as write-after-write, read-after-read, and write-after-read, the time separation of succeeding timings may be smaller than the typical value. However, it is not so serious as will be shown in the case of the word control system.
Figure 7 shows another time relation of word and digit timings when the word control system is employed. The timings of the digit circuitry are fixed irrespective of the selected memory location.
As shown in Figure 7, the time separation of succeeding word currents becomes very small in the worst case. The reason why the situation is worse with the word control system is that the time relation is adjusted only by the word current, causing a significant approach of leading and trailing word currents.

The timing generator for the total system is shown in Figure 8. The delay circuits connected with the address decoding logic are used to control the bit timings in accordance with the memory address selected.

Strobe circuit
A two-stage monolithic differential sense amplifier is followed by a modified CML circuit which detects the polarity of readout signals at the strobe timing and also holds it as long as the strobe pulse is present.
As shown in Figure 9, the polarity detecting strobe circuit consists of a current switch made of transistors Q1 and Q2, feedback loops made of transistors Q3 and Q4, steering transistors Q5 and Q6, and an emitter switch Q7 which makes the current switch inactive when the strobe pulse is absent.
At the moment the base level of transistor Q7 is shifted from -0.8 volt to -1.6 volts, i.e., the strobe pulse is


given, the current switch is turned on. Since both transistors Q1 and Q2 cannot conduct simultaneously, one of them assumes the "on" state in response to an imbalance, however small, present between the base levels of the input transistors Q3 and Q4.

Word selection matrix
Diode-steered transformer matrices7 and transistor matrices8 can be selected as word selection matrices. The features of the two methods are compared in Table 4.
Figure 8—Timing pulse generator (Arabic figures show delays in nanoseconds; D = delay lines)

TABLE 4—Comparison of word selection matrices

Problems in the switch lines—
Transistor matrix: (1) The current required is 1/β times the word current. (2) The characteristic impedance of the bus line can be made higher. (3) The voltage swing is a function of transistor parameters; it is generally smaller than the peak amplitude of the back voltages on the word line.
Diode-steered transformer matrix: (1) The current capacity must be large enough to supply the word current. (2) The characteristic impedance of the bus line must be as low as possible to minimize the voltage shift of the bus line; a large part of the power will be wasted in the terminating resistor. (3) The voltage swing must be greater than the peak amplitude of the back voltage on the word line.

Problems in the current switch lines—
Transistor matrix: No particular problems exist.
Diode-steered transformer matrix: Since the bus line is apt to ring, a damping resistor must be added.

Problems in the ground line—
Transistor matrix: Instead of the voltage-source bus line, the ground or power line supplies the current to the word line. Accordingly, the impedance of the ground or power line must be kept as low as possible.
Diode-steered transformer matrix: There is hardly any current in the ground plane, because the transformer prevents common-mode current.

Problems in the half-selected word current—
Transistor matrix: (1) Transistors with less collector-to-base capacity are recommended. (2) The voltage swing of the base driver (voltage switch) must be designed to be small.
Diode-steered transformer matrix: (1) The balance in the transformer windings must be improved to prevent the common-mode voltage on the primary winding from inducing a differential-mode current in the secondary winding. (2) Diodes with less junction capacity must be used.

Figure 9—Polarity detecting strobe circuit


When production is taken into consideration, a transistor matrix is preferred for the NDRO plated-wire memory. A block diagram of the word selection circuits is shown in Figure 10.

Experimental results
A cross-sectional model of the system was constructed, and it worked as expected.
The memory stack consists of four modules, each containing eight planes. Each plane has 128 word lines by 288 plated-wire digit lines.
The memory wire is prepared by electroplating permalloy onto a phosphor-bronze wire 0.13 mm in diameter. The plated wires are spaced at 1.0-mm centers and the word lines are spaced at 1.5-mm centers.
The back EMF of the word line for a 300-milliampere word current with a rise time of 20 nanoseconds is approximately 14 volts. A 40-milliampere digit current gives rise to a typical output signal of 10 millivolts.
The basic logic element used exclusively in the experimental model is a current-mode-logic circuit, the μPB80, which has two four-input gates. Typical propagation delay of the μPB80 is 3 nanoseconds.
A monolithic integrated differential amplifier, the μPC7B, is used for the sense amplifier. It has a bandwidth of 40 MHz and a voltage gain of 40 dB.
Transistor selection matrices and digit drivers are temporarily constructed of discrete components.
The propagation delay and the attenuation per 1K words were measured. The results obtained are 14 nanoseconds and 1.5 dB.
Adjustment of terminating resistors was necessary to minimize the digit noises. The use of diodes, connected back-to-back in series with the digit line, was satisfactory. The minority carrier storage in the diodes and the junction capacity gave rise to a slight ringing on the digit line.

Figure 10—Block diagram of word selection circuits

CONCLUSION
It has been pointed out that the delays in the address decoders, sense amplifiers, and strobe circuits are the principal factors determining the read cycle time of the memory.
Small-scale transistor selection matrices in combination with CML decoders were employed, and a decoding delay of 30 nanoseconds was obtained.
A polarity detecting strobe circuit was developed
with a modified CML circuit. The strobing delay observed was less than 5 nanoseconds.
The method of compensating the propagation delays in the memory plane was discussed. It was shown that it is desirable to control the timing of the digit circuitry.

ACKNOWLEDGMENT
This paper is a part of work done jointly by many research engineers of the Central Research Laboratories of Nippon Electric Co., Ltd.
The author would like to express his cordial thanks to Dr. I. Someya and Dr. Y. Sasaki for their support and guidance, and also to his colleagues, particularly Messrs. T. Furuoya and H. Murakami, for their cooperation in the construction of this memory system.


REFERENCES
1 C F CHONG R MOSENKIS D K HANSON
Engineering design of a mass random access plated wire memory
Proc FJCC 363 1967
2 J P McCALLISTER C F CHONG
A 500-nanosecond main computer memory utilizing plated-wire elements
Proc FJCC 305 1966
3 B A KAUFMAN P BELLINGER H J KUNO
A rotationally switched ROD memory with a 100-nanosecond cycle time
Proc FJCC 293 1966
4 S A MEDDAUGH K L PEARSON
A 200-nanosecond thin film main memory system
Proc FJCC 281 1966
5 T ISHIDATE
Circuit techniques for one hundred nanosecond thin film memory
Colloque International sur les Techniques des Memoires Editions Chiron Paris 1966 p 671
6 T ISHIDATE
Delay compensation concept in very high speed memories
NEC Res and Dev 9 129 1967
7 E E BITTMANN
A 16K-word, 2-Mc magnetic thin film memory
Proc FJCC 93 1964

High-speed, high-current word matrix using charge-storage diodes for rail selection
by S. WAABEN and P. CARMODY
Bell Telephone Laboratories, Incorporated
Murray Hill, New Jersey

INTRODUCTION
Diode matrices used to select the path of a unidirectional or bidirectional matrix current are well known.1-3 Conventional matrices use a low-storage diode crosspoint for unidirectional current, and a charge-storage diode crosspoint for bidirectional current. For typical magnetic memory cells the required currents approach 1 ampere. Also, for magnetic thin-film memories the required word current duty cycle is small, typically 30 ns out of a store cycle time of several hundred nanoseconds. To conduct such currents, the required silicon area for a diode is almost one order of magnitude less than that required for a transistor. Since the cost of a semiconductor device is strongly dependent on the silicon area used, diode matrices are therefore commonly used for the economical selective drive of magnetic memory stacks. For many memory system configurations, because of the significant cost of high-current transistor matrix selection switches, the cost per word line of matrix rail selection is comparable to that of the individual word selection diode. Large matrices are therefore commonly employed to share the switch cost among many matrix crosspoints. The penalty is more stray impedance and system noise as well as difficulty of reliable assembly of large arrays. It will be shown how this rail selection switch function can be implemented advantageously by a circuit combination of a low-cost, high-current charge-storage diode and an inexpensive 100-200 mA current transistor. No transformers are needed. Furthermore, the usual number of rail selection transistors can typically be halved by a tandem diode matrix arrangement.
In this paper the basic schemes are presented. The design tradeoffs are then given and discussed. Experimental results are shown.

Basic charge-storage diode rail selection
Figure 1 is the schematic of a diode matrix of 32 word rails and 32 diode rails for a plated-wire store, which will be used as an example. The distributed word rail loading capacitance is 32 × 32 pF ≈ 1000 pF. The distributed diode rail capacitance is 32 × 3 pF ≈ 100 pF. Resistors for biasing the matrix diodes in such a manner that nonselected diodes will remain backbiased are also shown. Note that, inherent in all diode matrix selection schemes, the matrix rail selection switches must carry the current of the selected path.
To select the crosspoint of a word rail and a diode rail, one of the 32 word rails is changed from zero volts to E1 volts by turning the charging current Ich1 on. If Ich1 is constant then the word rail voltage rises linearly to E1 volts. When the rail voltage reaches E1, CSD1 goes into forward conduction and the rail voltage is thus clamped to E1. From this point on, charge is accumulated in CSD1 by the continued flow of Ich1. All matrix diodes remain backbiased since the diode rail voltage is more positive than the clamped word rail voltage. Similarly, the closure of a diode rail selection switch generates a current flow through CSD2. Charge has now been accumulated in the two matrix rail charge-storage diodes, CSD1 and CSD2. The charging currents Ich1 and Ich2 are supplied via two selected low-current transistors. In the reverse conduction phase of the charge-storage diode, the voltage drop across the diode is the junction voltage across the forward-biased junction of a reversely conducting diode minus the IR drop across the body resistance. Therefore, by closing a common control switch transistor to ground potential, current will flow in the selected path as long as there is charge available in the electrically floating charge-storage diodes. Notice that the current handling and turn-on and turn-off requirements on the matrix rail selection switches are relaxed and decoupled from the high-speed, high-current requirements of the selected matrix path. Rise time, amplitude and duration of the current pulse are determined by the circuit parameters and the common control circuitry described below.


QNDRO = ∫ i dt = ½ × 0.4 × 20 × 10⁻⁹ = 4 nC
QDRO = ½ × 1 × 50 × 10⁻⁹ = 25 nC
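These charge figures are just triangular pulse areas (one-half the peak current times the base time); a quick check using the nominal 0.4 A/20 ns NDRO and 1 A/50 ns write pulses of Figure 3:

```python
def pulse_charge(peak_amps, base_seconds):
    # area of a triangular current pulse: ½ · I_peak · t_base
    return 0.5 * peak_amps * base_seconds

print(pulse_charge(0.4, 20e-9))   # ≈ 4 nC  (NDRO pulse)
print(pulse_charge(1.0, 50e-9))   # ≈ 25 nC (DRO write pulse)
```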

This amount of charge must be supplied by CSD1, CSD2 and the common control circuit. The following arithmetic indicates the relative sensitivities of the parameters involved. For a desired word current peak Ip and a constant rising current slope s = Ip/t_rise, the required charge Q is given by Ip = √(2Qs). Also, since s = E/L, where E is the fixed driving voltage minus the semiconductor junction drops of a selected path, and L is the driven word loop inductance, it follows that

Ip = √(2QE/L)
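The square-root dependence is easy to verify numerically. In the sketch below, Q is the 25 nC write charge from the text, while E and L are illustrative assumed values (chosen so the nominal peak comes out at 1 ampere), not measured circuit parameters:

```python
from math import sqrt

def peak_current(Q, E, L):
    # I_p = sqrt(2·Q·E/L), from Q = I_p²/(2s) with slope s = E/L
    return sqrt(2.0 * Q * E / L)

Q = 25e-9       # coulombs (25 nC write charge)
E = 15.0        # volts (assumed)
L = 0.75e-6     # henries, assumed word-loop inductance

Ip = peak_current(Q, E, L)
change = (peak_current(Q, 1.1 * E, L) / Ip - 1.0) * 100.0
print(Ip)                  # 1.0 ampere with these values
print(round(change, 2))    # ≈ 4.88 percent for a 10 percent increase in E
```

This is the reduced sensitivity noted later in the text: a 10 percent variation of E or L moves the peak by only about 5 percent.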

--i>tNORMAL COMPUTER
DIODE

~
LONG T
CHARGE STORAGE
DIODE

BASIC
COMMON
CONTROL
WORD
CURRENT - - t - - - - i
TIMING

R

FIGURE 1—A basic 32×32 diode matrix with charge-storage diodes for matrix rail selection

As an extension of the basic scheme, the number of rail selection switch drivers can be reduced by a circuit arrangement where groups of the rail selection diodes are shared by common switch drivers. Figure 2 illustrates this principle as applied to the diode rails. A 16 word rail by 64 diode rail matrix is shown selected by 16 + 8 + 8 = 32 medium-current transistor switches. This should be compared to 80 high-current switches for a classical 16 × 64 matrix. Alternatively, 64 switches would be required for a square matrix covering the same 1024 matrix crosspoints. It can be seen that the charge-storage diodes are also arranged in matrix form, which we shall refer to as being in tandem with the original matrix.
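The switch-count arithmetic generalizes: a classical R × C matrix needs R + C rail switches, while grouping the C diode rails into g groups needs R + g + C/g. A sketch (function names are illustrative):

```python
def classical_switches(rows, cols):
    # one switch per word rail plus one per diode rail
    return rows + cols

def tandem_switches(rows, cols, groups):
    # diode rails arranged in a tandem matrix of `groups` groups
    assert cols % groups == 0
    return rows + groups + cols // groups

print(classical_switches(16, 64))      # 80
print(tandem_switches(16, 64, 8))      # 16 + 8 + 8 = 32
```

Choosing g ≈ √C minimizes g + C/g, which is why eight groups of eight is the natural split for 64 diode rails.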
The practicality of the basic scheme presented above
depends on a tradeoff between particular circuit parameters and requirements. A brief analysis of typical
circuit performance will be presented next.

Required charges

Word current charge area
The nominal plated-wire store word current pulses shown in Fig. 3 require the amounts of charge computed above. Consequently, for a fixed available Q, a 10 percent variation of E or L will result in a 5 percent variation of I peak. This reduced sensitivity to variations in L is significant because L may vary from word loop to word loop, while Q and E are more readily controlled in a memory system environment. The charges of the matrix capacitance will be discussed next.
Matrix capacitance charges
To change the voltages at the matrix terminals, charge must be supplied to or drained from the matrix rail capacitances. Here only the charges at the word rail will be discussed.
The word rail terminal of the word line must, for pulses with equal rise and fall times, supply twice the basic 25 nC for the DRO and 4 nC for NDRO. At the word rail terminal there are three possible sources of charge available to supply the word current (see Figures 1 and 4):
(a) The rail selection switch.
(b) The charge-storage diode CSD1.
(c) The word rail capacitance.
To attain a linearly rising word current, the voltage on the word rail must remain constant during the rise time of the word current pulse. Therefore the role of CSD1 is twofold: (I) to limit and clamp the charging of the selected word rail capacitance to a well-defined voltage, and (II) to implement a low-impedance charge reservoir at a fixed voltage level during the rising portion of the word current. This buffer can in principle be either the CSD or the large word rail capacitance of 1000 pF.
However, such a charge drain from the word rail capaci-

High Speed, High Current Word-Matrix

983

1000mA
WRITE

mA

400mA
NORO
WORD RAILS-I6, SWITCHES-16
DIODE RAILS- 64, SWITCHES- 8+8

FIGURE 2-Tandem diode matrix ",'ith charge-storagediodes. There are 16 word rails and 64 diode rails. The word rails
are operated as indicated in Fig. 1. The diode rail charge-storagediodes are arranged in a diode matrix of eight groups of diodes
(A, B, ... , H) each of eight diodes. After charging, a selected
crosspoint is driven by the closure of the common control circuit.

20

Charge-storage diode properties
The continuity equation for charge describes the charge flow through a diode:

dQ/dt + Q/T = i(t)

The term Q/T is the amount of charge disappearing by recombination in the diode and the other terms express the conservation of charge. Assume a current IF is conducted in the forward direction for a period of time tF. Charge is accumulated in the diode. The efficiency E of this charge reservoir measured at time tF is:


tance will produce access noise during read. It follows implicitly from the discussion above that in the case of the 8 nC necessary at the word rail terminal for NDRO, the charge storage feature of the word rail diode is in some ways incidental. For NDRO operation in particular, the matrix access time is dominated by the charging time, say 75 ns, of the word rail capacitance of 1000 pF charged to 15 volts at the corresponding charging current of 0.2 A. The finite lifetime of the carriers in the CSD does, however, set a practical limit to how small a rail selection current one can realize and still achieve a 1 ampere DRO pulse. The next section will summarize the charge-storage-diode phenomena briefly.

QNDRO = ∫ i dt = (1/2)(0.4 A)(20 ns) = 4 nC

QWRITE = ∫ i dt = (1/2)(1.0 A)(50 ns) = 25 nC

FIGURE 3-Current, time and charge for nominal word currents
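The nominal charges above follow from triangular-pulse arithmetic: for equal rise and fall times the delivered charge is the area under the triangle, Q = (1/2)·Ipeak·t. A quick check (a modern sketch, not part of the original paper):

```python
def triangle_charge(i_peak_amps, duration_s):
    # Charge delivered by a triangular current pulse with equal rise and
    # fall times: the area under the triangle, Q = (1/2) * I * t.
    return 0.5 * i_peak_amps * duration_s

q_ndro  = triangle_charge(0.4, 20e-9)    # 400 mA NDRO pulse over 20 ns
q_write = triangle_charge(1.0, 50e-9)    # 1.0 A write (DRO) pulse over 50 ns
print(q_ndro, q_write)                   # about 4e-9 and 25e-9 coulombs
```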

It follows that the longer the lifetime T the higher the efficiency. The necessary charging time is given by:

[equation lost in reproduction]

Figure 5 shows typical calculated tradeoffs using this expression. Notice that limiting the peak current with a common control charge-storage diode eases the charge uniformity requirement on the many rail selection diodes. The rail diode requirement is therefore only single ended; namely, more charge is required by the rail diodes than the amount in the common control diode. A simple common control circuit for a 400 mA peak NDRO pulse is implemented using a short lifetime diode, thus securing a maximum peak word current which is insensitive to the duration of the charging pulse. The 1.0 A peak DRO pulse is conveniently implemented using a longer lifetime diode for common control.
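The efficiency expression given in the text and its lifetime tradeoff can be sketched directly. This is a modern illustration; the lifetimes are assumed values, not the paper's:

```python
import math

def csd_efficiency(t_f, tau):
    # Charge efficiency of a charge-storage diode after forward charging
    # for time t_f, with carrier lifetime tau:
    #   E = charge available / charge supplied
    #     = tau * (1 - exp(-t_f/tau)) / t_f
    return tau * (1.0 - math.exp(-t_f / tau)) / t_f

# Tradeoff in the spirit of Figure 5; lifetimes are illustrative values.
t_f = 100e-9                                  # a 100 ns charging pulse
efficiencies = [csd_efficiency(t_f, tau) for tau in (50e-9, 200e-9, 1000e-9)]
print([round(e, 3) for e in efficiencies])    # longer lifetime -> higher efficiency
```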

E = Charge Available / Charge Supplied = IF T (1 − e^(−tF/T)) / (IF tF)

Rail selection switches

The rail selection switches can be implemented in a variety of ways. The "optimum" choice depends heavily

984

Fall Joint Computer Conference, 1968
11 P J DENNING
The working set model for program behavior
Comm ACM 11 5 1968 pp 323-333
12 L W COMEAU
A study of the effect of user program optimization in a paging system
ACM Symposium on Operating System Principles Gatlinburg Tennessee Oct 1-4 1967
13 A C McKELLAR E G COFFMAN
The organization of matrices and matrix operations in a paged multiprogramming environment
Technical Report No 59 Dept of Elec Eng Computer Sciences Laboratory Princeton University February 1968
14 J A COHEN
Use of fast and slow memories in list processing languages
Comm ACM 10 2 1967 pp 82-86
15 D G BOBROW D L MURPHY
Structure of a LISP system using two-level storage
Comm ACM 10 3 1967 pp 155-159
16 B BRAWN F GUSTAVSON
An evaluation of program performance on the M44/44X system Part 1
Report RC 2083 IBM T J Watson Research Center Yorktown Heights New York May 1968
17 R L PATRICK
Time-sharing tally sheet
Datamation 13 11 1967 pp 42-47
18 G OPPENHEIMER N WEIZER
Resource management for a medium scale time sharing operating system
Comm ACM 11 5 1968 pp 313-322
19 L A BELADY

A study of replacement algorithms for a virtual storage computer
IBM Systems J 5 2 1966 pp 78-101
20 B RANDELL
A note on storage fragmentation and program segmentation
Report RC 2102 IBM T J Watson Research Center Yorktown Heights New York May 1968

Program behavior in a paging environment
by BARBARA S. BRAWN and FRANCES G. GUSTAVSON
IBM Watson Research Center
Yorktown Heights, New York

Study objectives

This paper is the result of a study conducted on the M44/44X system, an experimental time-shared paging system designed and implemented at IBM Research in Yorktown, New York. The system was in operation serving up to sixteen users simultaneously from early 1966 until May 1968. Conceived as a research project to implement the virtual machine concept, the system has provided a good deal of information relating to the feasibility of that concept.1 The aim of this study is to investigate the concept more thoroughly from a user's viewpoint and to try to answer some important questions related to program behavior in a paging environment. As an experimental system, the M44/44X provided an excellent vehicle for the purposes of this study, and the study itself forms some basis for an evaluation of the system.

It is recognized by the authors that the results and conclusions presented here are to a great extent characterized by a particular configuration of a particular paging system, and as such do not constitute an exhaustive evaluation of paging systems or the virtual machine concept. Nonetheless, we feel that the implications of the conclusions reached here are of consequence to other system implementations involving paging.
Conventional vs automatic memory management
There has been much written about the benefits and/or disadvantages of paging machines and the virtual machine concept.2,3,4 However, little data have been obtained which shed a realistic light on the relative merits of such a system compared to a conventionally designed system. From a programming point of view there is little question that any technique which obviates the necessity for costly pre-planning of memory management is inherently desirable. The question that arises is: given such a technique, how efficiently is the automatic management carried out?

From a user's point of view this can simply mean: how long does it take to run a program which relies on the automatic memory management, and is this time comparable to the time it would take to run the program if it were written in a conventional way, where the burden of memory management is the programmer's responsibility? It is this user's viewpoint that forms one focal point for this study.
The role of the programmer
Perhaps the most important aspect of the study concerns the role of the programmer. How does the role of the virtual machine programmer differ from that of the conventional programmer? For a conventional system the role of the programmer is well defined: the performance (i.e., running time) of his program is usually a direct result of his ability to make efficient use of system resources. How much he is willing to compromise efficiency for the sake of ease of programming may depend on how often the program is to be run. In any case, the decision rests with him. (There of course exist many applications where his choice of programming style or ability has little effect on performance; this case is of little interest to our study.)

When faced with the problem of insufficient machine resources to accommodate a direct solution of his problem, the conventional programmer is left with no choice but to use some procedure which is inherently a more complex programming task. The quality of the procedure he chooses may


have a dramatic effect on performance, but it is at least a consistent effect and often quantifiable in advance. In any event, because conventional systems have been around a long time, there are many guidelines available to the programmer for achieving acceptable performance if he should wish to do so.

The role of the virtual machine programmer is not nearly so well defined. One of the original attributes claimed for the virtual machine concept was that it relieved the programmer from consideration of the environment on which his program was to be run. Thus he need not concern himself with machine limitations. As was pointed out previously, the question is: given that the programmer does in fact ignore all environmental considerations, what kind of efficiency results? Assuming that the answer to this question is sometimes undesirable, that is, running time is unacceptably long, another question arises. Can the programmer do anything about it? Clearly it is difficult to conceive of his being able to reorganize his program in such a way as to assure improved performance if he has no knowledge of the environment nor takes it into consideration when effecting such changes. Thus if the premise of freedom from environmental considerations is to be strictly adhered to, there can be no way for the programmer to consciously improve performance.

Should this premise be compromised to allow the programmer to influence performance through exercising knowledge of the system environment? This study assumes that this should indeed be the case, and shows that there is much to be gained and little to be lost. It should be emphasized, however, that the original premise need not be compromised at all, inasmuch as it would, of course, not be necessary for the programmer to ever assume the responsibility of having knowledge of the environment (unlike in the case of the conventional programmer faced with insufficient machine resources). It would only enable him to have better assurance of acceptable performance if he chose to do so.

Clearly, the many interesting questions concerning the role of the virtual machine programmer and his effect on performance are worthy of pursuit. We feel that the measurements obtained in this study of program behavior in a paged environment provide valuable insight into such questions and serve as motivation for further consideration of them.

Test environment
Before discussing the results of the study we feel it is advisable to describe the environment in which they were obtained. Thus included herein are brief descriptions of the M44/44X system and the methods employed to obtain and measure the test load programs. (More complete information is available in References 1, 5 and 7.) It is assumed throughout this discussion that the reader is generally familiar with the concepts of virtual machines, paging, time-sharing and related topics; however, a short general discussion on paging characteristics of programs is included in order to establish an appropriate reference frame for the presentation of the experimental data.
The experimental M44/44X system
To the user a virtual computer appears to be a real computer having a precise, fixed description and an operating system which provides various user facilities and links him to the virtual machine in the same way as the operating system of a conventional system links him to the real machine (Figure 1). Supporting the virtual machine definition is a transformation (control) program
[Figure 1: side-by-side diagrams of a conventional system (a real machine alone) and a virtual system (a virtual machine layered on the real machine).]

FIGURE 1-Conventional and virtual systems

which runs on the real machine. This program, together with special mapping hardware, "creates" the virtual machine as it appears to the user. Implementation of multi-programming within the framework of the virtual machine concept permits the transformation program to define the simultaneous existence of several separate and distinct virtual machines.

The virtual machine programmer may write programs without knowledge of the transformation program or the configuration of the real machine, his concern being the virtual machine description, which is unaffected by changes in the real hardware configuration or the transformation program. In the M44/44X system the real machine is called the M44, the transformation program is MOS, the virtual machine is the 44X and the virtual machine operating system is the 44X Operating System (Figure 2).
The real computer
Figure 3 shows the hardware configuration of the M44 computer. It is an IBM 7044 with 32K 36-bit words of 2 μsec core which has been modified to accommodate an additional 184K words of 8 μsec core and a mapping device. The resident control program, together with the mapping device and its associated 16K, 2 μsec mapping memory, implements the 44X virtual machines on a demand paging basis in the 8 μsec store. The backup store of the M44/44X system, which is used for both paging and permanent file storage, consists of two IBM 1301 II disks.

FIGURE 3-M44/44X hardware configuration (Channel A omitted; non-overlapped)

The page size (a
variable parameter on this system) used for our tests was 1024 words (1K). The average time required to seek and transmit one page from the disk to core is 0.21 second for that page size (computed from our data). The IBM 7750 serves as a message switching device, connecting a number of IBM 1050 terminals and Teletype 33's to the system. To facilitate measurement, our tests were not run from terminals (foreground) but as background jobs from tape. (The system makes no distinction between the two for the single programmed case, nor for the multi-programmed case as long as all jobs on the system are of the same type, i.e., all background or all foreground.)
The control program

FIGURE 2-M44/44X Multi-programming system

MOS, the control program, resides in the non-paged 2 μsec store. This M44 program "creates" and maintains each virtual 44X machine and enables several 44X's to run simultaneously, allocating the M44 resources among them. All 44X I/O is monitored by MOS, and all error checking and error recovery is performed by MOS. Some of the design parameters of MOS are easily changed to facilitate experimentation. The variable parameters include the page size, the size of execution store (real core) made available to the system, the page replacement algorithm, the time slice and the

scheduling discipline (via a load leveling facility). The last two parameters mentioned are applicable only in the multi-programmed case. As previously stated, the page size used throughout the study was 1024 words. The size of real core was, of course, one of the most important parameters and was varied to investigate paging properties of the programs (in both the single and multi-programmed environments).

For the single programmed part of the study the page replacement algorithm employed was FIFO (First In-First Out). If a page in real core must be overwritten, the page selected by FIFO is the one which has been in core for the longest period of time. Data were also obtained for single programmed paging behavior under a minimum page replacement algorithm developed by L. Belady.6 A non-viable algorithm, MIN computes the minimum number of page pulls required by examining the entire sequence of program address references.

For the multi-programming part of the study, a time slice of 0.1 second was used. Runs were made using three different page replacement algorithms to determine the effect of this design parameter on system performance. (Available real store is competed for freely by all the 44X's.) The three algorithms were FIFO; BIFO, a biased version of FIFO which favors (on a round robin basis) one 44X by choosing not to overwrite the pages associated with it for a preselected interval of time; and AR, a hardware supported algorithm which chooses a candidate for replacement from the set of pages which have not been recently referenced. (These algorithms are described more fully in Refs. 1 and 6.)
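The two single-programmed disciplines can be sketched as follows. This is an illustrative modern reconstruction, not the MOS code: FIFO evicts the longest-resident page, while Belady's MIN looks ahead in the full reference string and evicts the resident page whose next use lies farthest in the future.

```python
def fifo_pulls(refs, frames):
    # Count page pulls under FIFO: evict the page resident longest.
    core, queue, pulls = set(), [], 0
    for page in refs:
        if page not in core:
            pulls += 1
            if len(core) == frames:
                core.discard(queue.pop(0))
            core.add(page)
            queue.append(page)
    return pulls

def min_pulls(refs, frames):
    # Count page pulls under Belady's MIN: evict the resident page
    # whose next reference lies farthest in the future.
    core, pulls = set(), 0
    for i, page in enumerate(refs):
        if page not in core:
            pulls += 1
            if len(core) == frames:
                def next_use(p):
                    # never referenced again -> infinitely far away
                    for j in range(i + 1, len(refs)):
                        if refs[j] == p:
                            return j
                    return float("inf")
                core.discard(max(core, key=next_use))
            core.add(page)
    return pulls

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_pulls(refs, 3), min_pulls(refs, 3))  # FIFO needs 9 pulls where MIN needs 7
```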
The virtual machine
Each virtual 44X machine is defined to have 2²¹ words of addressable store. The virtual memory speed of a 44X is 10 μsec (44X programs are executed in 8 μsec store and a 2 μsec mapping cycle is added to a memory cycle); the CPU speed is 2 μsec. The user communicates with the 44X virtual machine through the 44X Operating System, a 44X program which permits continuous processing of a stack of 44X jobs; it contains a command language, debugging facilities, a FORTRAN IV compiler, an assembly program, a relocatable and absolute loader facility, routines for handling a user's permanent disk files and a subroutine library.

Test load problems
Test problems were chosen from the scientific, commercial, list processing and systems areas of computer applications. The problems chosen involved large data bases which required the programmer of a conventional machine to concern himself with memory management. The problems discussed in this paper include matrix inversion and data correlation from the scientific area and sorting from the commercial area. (A complete report on the entire study can be found in Ref. 7.)

Programs were initially coded for each problem in two ways:
i) a conventional manner where the burden of memory management is assumed by the programmer (conventional code), and
ii) a straightforward manner utilizing the large virtual memory ("casual" virtual code).

Simple modifications were then made to the "casual" virtual codes to produce programs better tailored to the paged environment. Our interest lay in comparing the performance of the different versions of the virtual codes under variable paging constraints in both single and multi-programming environments. We were also interested in comparing the conventionally coded program performance with that of the virtual (i.e., automatic memory management) codes given the same real memory constraints.

It should perhaps be noted here that for our purposes a program's performance is directly related to its elapsed run time. Thus in a paging environment, where this elapsed time includes the time necessary to accomplish the required paging activity, poor paging characteristics are reflected by increased run time and thus degraded performance.
Measurement techniques
A non-disruptive hardware monitoring device capable of measuring time spent in up to ten phases of program execution was used for all 7044 runs and relevant single-programmed 44X runs. In addition, for 44X runs (both single and multi-programmed), a software measurement routine in MOS was utilized. This routine collects data while the system is running (using the clock and a special high-speed hardware counter) and on system termination produces a summary of the data including: total time, idle time, time spent in MOS (including idle time), number of page exceptions, page pulls, page pushes and other pertinent run data.
All programs were run in binary object form as background jobs residing on a system input tape; all output was written on tape. For the multi-programmed runs, a facility of MOS was used which permits several background jobs to be started simultaneously. For the single programmed study the 44X programs were first run and measured on the system with sufficient real core available to eliminate the need for paging; these same programs were then run (and measured) in a "squeezed core" environment, i.e., with insufficient real memory available, thus necessitating paging.
Program behavior under paging
Program performance on any paging system is directly related to its page demand characteristics. A program which behaves poorly accomplishes little on the CPU before making a reference to a page of its virtual address space that is not in real core, and thus spends a good deal of time in page wait. A program which behaves well references storage in a more acceptable fashion, utilizing the CPU more effectively before referencing a page which must be brought in from back-up store. This characteristic of storage referencing is often referred to as a program's "locality of reference."6 A program having "good" locality of reference is one whose storage reference pattern in time is more local than global in nature. For example, although a program in the course of its execution may reference a large number of different pages, if in any reasonable interval of (virtual) time references are confined to only a small set of pages (not necessarily contiguous in the virtual address space), then it exhibits a desirable locality of reference. If, on the other hand, the size of the set is large, then the locality of reference is poor and paging behavior is correspondingly poor. (The "set" of pages referred to in the above example corresponds roughly to Denning's8 notion of a "working set.")

All programs typical of real problems exhibit badly deteriorated paging characteristics when run in some limited real space environment. What is of interest is the extent to which the space can be limited without seriously degrading performance. Clearly, the size of this space is related to

the program's locality and provides some indication of the size of what might be called the program's critical or characteristic working set. As the single programmed results presented below show, the effects of programming style on the relative size of this space can be enormous.
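Denning's working-set notion can be made concrete with a small sketch: at each instant of virtual time, the working set is the set of distinct pages touched in a trailing window of references. This is a modern illustration, not the paper's measurement code; the reference strings are invented:

```python
def working_set_sizes(refs, window):
    # Denning-style working set: at each (virtual) time t, the number of
    # distinct pages referenced during the last `window` references.
    return [len(set(refs[max(0, t - window + 1):t + 1]))
            for t in range(len(refs))]

# A local reference pattern keeps re-touching a small set of pages,
# while a global pattern touches many pages in every interval.
local_refs  = [1, 2, 1, 2, 1, 2, 3, 3, 2, 3]
global_refs = [1, 4, 7, 2, 5, 8, 3, 6, 9, 1]
print(max(working_set_sizes(local_refs, 5)),
      max(working_set_sizes(global_refs, 5)))   # 3 vs 5
```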

Single programmed measurement results
We first measured the behavior of the 44X programs in a controlled single programmed environment. The results obtained are discussed in terms of the relative effects of programming style on performance for three problems: T1-Matrix Inversion, T2-Data Correlation, and T4-Sorting. In each case we are concerned with showing how even simple differences in programming technique can make a substantial difference in performance. Unquestionably there are further improvements which could be made in the algorithms employed; however, we feel that our point is best illustrated by the very simplicity of the changes made.

Timing and paging overhead data are given for actual runs made on the system employing a FIFO page replacement algorithm. Also, in order to establish that these results were not unduly influenced by that page replacement algorithm, corresponding computed minimum paging overhead data are given (obtained through interpretive program execution and application of L. Belady's6 MIN algorithm).

The data collected for the comparison of the automatic and manual methods of memory management are also discussed in this section.
Problem T1 ... Matrix inversion

The virtual machine codes for this program were written in FORTRAN IV and are intended to handle matrices of large order. They all employ an "in-core" technique, since the large addressable virtual store permits the accommodation of large arrays (the burden of real memory management being assumed by the system through the automatic facility of paging). The curves in Figure 4 give the respective program run times as a function of real core size for the three different versions which were written for the virtual machine. These times are for inverting a matrix of order 100 (which is admittedly not an unusually large array, but sufficiently large to illustrate our point without requiring an impractical amount of CPU time).


[Figure 4 plots run time in seconds against real core size (8K-48K; K = 1024 words) for three versions: T1.1X, 42 pages; T1.1X**, 35 pages; T1.1X*, 35 pages. 1K page size, FIFO replacement algorithm, single programmed. An annotated point shows 7335 seconds at 24K.]

FIGURE 4-Effects of real core size
T1-Matrix inversion (100x100)

All three programs employ the same algorithm, a Gaussian procedure utilizing a maximum pivotal condensation technique to order successive transformations. The differences in the three versions are extremely simple. The "casual" version, T1.1X, stores the matrix in a FORTRAN double subscripted array of fixed dimensions (storage allocated columnwise to accommodate a matrix of up to order 150), reads the input array by rows and prints out the inverted array by rows. The innermost computation loop traverses elements within a column. Version T1.1X** is the same as T1.1X except that variable dimension capability was employed (thus insuring the most compacted allocation of storage for any given input array). Version T1.1X* is the same as T1.1X** except that the input and output is columnwise instead of rowwise. Obviously neither of these changes is complicated or of any consequence in a conventional environment; however, as clearly shown in Figure 4, they make a considerable difference in a paging environment.
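The effect of traversal order on a FORTRAN column-major array can be sketched by counting how often successive references cross a page boundary. This is a modern illustration (the 150x150 fixed dimensions follow the allocation described for T1.1X; the code is not the study's):

```python
def page_switches(nrows, ncols, order, page_words=1024):
    # Count how often successive references land on a different page while
    # scanning a FORTRAN-style column-major array; element (i, j) occupies
    # word i + j*nrows (0-based).
    if order == "columnwise":
        walk = ((i, j) for j in range(ncols) for i in range(nrows))
    else:                                   # "rowwise"
        walk = ((i, j) for i in range(nrows) for j in range(ncols))
    switches, prev = 0, None
    for i, j in walk:
        page = (i + j * nrows) // page_words
        if page != prev:
            switches += 1
        prev = page
    return switches

# A fixed-dimension 150x150 array: a column-wise scan moves sequentially
# through storage, while a row-wise scan jumps nrows words per reference.
print(page_switches(150, 150, "columnwise"))   # 22 page-boundary events
print(page_switches(150, 150, "rowwise"))      # 3300
```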
The paging overhead data are shown in Figure 5 for the casual (T1.1X) and the most improved (T1.1X*) versions for both the FIFO algorithm (corresponding to the time curves of Figure 4)

[Figure 5 plots page transmissions against real core size (8K-48K; K = 1024 words) for T1.1X (42 pages, casual code) and T1.1X* (35 pages, most improved code) under the MIN (dashed) and FIFO (solid) replacement algorithms; 1K page size, single programmed.]

FIGURE 5-Effects of page replacement algorithm
T1-Matrix inversion (100x100)

and the MIN algorithm. This paging overhead is given in terms of the number of page transmissions required during execution of the respective program when run with a given amount of real core available under the discipline of the particular page replacement algorithm. (Each reference to a page not currently residing in real core requires a page to be transmitted from backup store into real core [a "pull"] and often also requires a page to be copied from real core onto backup store [a "push"]. The total number of pulls and pushes is the number of page transmissions. Given a particular real core size, the MIN algorithm employed gives the theoretical minimum number of pulls required. Belady has shown that the number of page transmissions obtained by this algorithm differs insignificantly from the number obtainable by minimizing both pulls and pushes.)
As can be seen in Figure 5, there is no great disparity between the paging overhead sustained under FIFO and the theoretical minimum possible (under MIN) for either of the programs. In particular it should be noted that the paging behavior of the well-coded program under FIFO is considerably better than that exhibited by the casual program under the optimum of page replacement schemes. Certainly these data support the argument that improvement in programming style is advantageous to performance, irrespective of what page replacement scheme is used.
Clearly there are modifications which could be made to the algorithm itself which would further improve performance through improved locality of reference. McKellar and Coffman9 have indeed shown that for very large arrays, storing (and subsequently referencing) the array in sub-matrix form (one sub-matrix to a page) is superior to the more conventional storage/reference procedure employed in our programs. (For the 100x100 array, however, the difference is not significant.)
Problem T2 ... Data correlation

For the other problem in the scientific area an existing conventional FORTRAN program, which required intermediate tape I/O facilities because of memory capacity limitations, was modified to be an "in-core" procedure for the virtual machine. The problem, essentially a data correlation procedure, involves reconstructing the most probable tracks of several ships participating in a joint exercise, given a large input data set consisting of reported relative and absolute position measurements. The solution implemented is a maximum likelihood technique; the likelihood functions relating the independent parameters are Taylor expanded to yield a set of simultaneous equations with approximate coefficients. The equations are solved (using the inversion procedure of problem T1), the solutions are used to recompute new approximate coefficients, and the process is reiterated until a convergent solution is reached. (Each iteration involves a single pass of the large data set.) The measured position data, together with the accepted solution, are used to compute the reconstructed ships' tracks. (This final step requires one pass of the data set for each ship.)

For the first (or "casual") version, T2.1X, the conventional code was modified for the large virtual store in the most apparent way. The large data set, a mixture of fixed and floating point variables stored on tape for the conventional version, was stored in core in several single-subscripted fixed dimension arrays, one for each variable in the record format. As the curve for this program in Figure 6 shows, the performance is rather poor. This is accounted for in part by the fact that the

[Figure 6 plots run time in seconds against real core size (8K-54K; K = 1024 words) for T2.1X (54 pages) and T2.1X* (45 pages); 1K page size, FIFO replacement algorithm, single programmed.]

FIGURE 6-Effects of real core size
T2-Data correlation

manner in which the data are stored causes a global reference pattern to occur due to the program's logical use of those data. Version T2.1X* attempts to improve the locality by storing the data compactly in one single-subscripted floating point array, such that all of the parameters comprising a single logical tape record in the conventional code are in sequential locations. (The conversions necessitated by assigning both fixed and floating point variables to the same array name increased the CPU time slightly.) The curves in Figure 6 clearly show that this modification resulted in a significant improvement.

The same ordered relationship exhibited under FIFO holds for the casual and improved versions under the MIN algorithm (Figure 7), although in the case of the poorly behaving code the MIN algorithm does appreciably better than FIFO given a core size of 32K, where FIFO performance has already deteriorated badly. The improvement is short lived, however, since deterioration under MIN occurs with any further decrease in real core size.
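The difference between the two storage arrangements can be sketched by counting page-boundary crossings during a record-by-record scan. This is a modern illustration; the field count and the oversized array dimension are assumptions, not figures from the paper (only the 240 reports are):

```python
def record_scan_page_switches(n_records, n_fields, layout, page_words=1024):
    # Page-boundary events while reading every field of each record in turn.
    #   "parallel": one oversized fixed-dimension array per variable
    #               (T2.1X style); field f of record r at word f*ARRAY_DIM + r.
    #   "packed":   all fields of a record in sequential locations
    #               (T2.1X* style); field f of record r at word r*n_fields + f.
    ARRAY_DIM = 2048                     # illustrative fixed dimension
    switches, prev = 0, None
    for r in range(n_records):
        for f in range(n_fields):
            if layout == "parallel":
                addr = f * ARRAY_DIM + r
            else:
                addr = r * n_fields + f
            page = addr // page_words
            if page != prev:
                switches += 1
            prev = page
    return switches

# 240 reports as in the text, with an assumed 10 variables per record.
print(record_scan_page_switches(240, 10, "parallel"))   # 2400: every field on a new page
print(record_scan_page_switches(240, 10, "packed"))     # 3
```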


It should be noted that the actual data set used for these runs was not exceptionally large (as the total number of pages referenced indicates). Again, practicality demands that we settle for a data case of reasonable size. The case at hand involved six ships (resulting in 26 equations) and a rather small data base of only 240 reports. The data base storage requirements in the case of the well coded program, T2.1X*, were satisfied by four pages. In the case of T2.1X, however, the several large fixed dimension arrays used to store the data in that program required 13 pages; thus not only was the data ordering poor but a great deal of space was wasted as well.

Once again, there are probably other improvements that could be made. For example, because the program is divided into several subroutines (17) of reasonable length, a change in the order of loading the routines could improve (or degrade) performance. We have illustrated here only the effects of a change in the manner of storing the data base.

[Figure 7 plots page transmissions against real core size (16K-56K; K = 1024 words) for T2.1X (54 pages) and T2.1X* (45 pages) under the MIN (dashed) and FIFO (solid) replacement algorithms; 1K page size, single programmed.]

FIGURE 7-Effects of page replacement algorithm
T2-Data correlation

Problem T4-Sorting

Sorting, a classical example of the necessity for
introducing complicated programming techniques
to accommodate a problem on a conventional memory bound computer, also affords an excellent example of how drastically programming style can
affect performance in a paging environment.
Ideally, if memory capacity were sufficient for the
entire file to be in core, the sort programmer
would only need to concern himself with the internal sorting algorithm and never be bothered
with the other plaguing procedures involved with
doing the job piecemeal. This was the approach
taken, programming the virtual machine codes
assuming that the file could be accommodated in
virtual store.

Initially, two different algorithms were coded: the
Binary Replacement algorithm (basically a
binary search/insertion technique employed in a
generalized sorting program in the Basic Programming Support for IBM System 360) and the
Quicksort10 algorithm (a partitioning exchange
procedure). When the completed programs were
run with a reasonably long data set, it became immediately apparent that the Binary Replacement
algorithm was exceptionally bad for large lists
because of the amount of CPU time required.
(Note that this characteristic presents little problem for the internal sort phase of a conventional
code, which never deals with a very large list.)
We will, of course, acknowledge that someone
more knowledgeable in the field of sorting than
we would have recognized this characteristic of
the algorithm beforehand. Our experience nonetheless pointed out rather dramatically that an accepted technique for a conventional machine need
not be acceptable when translated to a virtual machine environment, irrespective of its paging behavior! Because of its unacceptable CPU characteristics, the algorithm was discarded and our
efforts were concentrated on Quicksort since that
algorithm is efficient for either small or large
lists.
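As a reminder of the partitioning exchange idea, here is a minimal in-place Quicksort in the general style of Hoare's scheme; details such as the pivot choice are our own, not taken from the programs measured here:

```python
def quicksort(a, lo=0, hi=None):
    """In-place partitioning exchange sort (Hoare-style scan from both ends)."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    pivot = a[(lo + hi) // 2]
    i, j = lo, hi
    while i <= j:
        while a[i] < pivot:          # scan right for an element >= pivot
            i += 1
        while a[j] > pivot:          # scan left for an element <= pivot
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]  # exchange the out-of-place pair
            i, j = i + 1, j - 1
    quicksort(a, lo, j)              # sort the two partitions
    quicksort(a, i, hi)
    return a

print(quicksort([5, 3, 8, 1, 9, 2, 7]))  # [1, 2, 3, 5, 7, 8, 9]
```

Because each partitioning pass sweeps two pointers inward over a contiguous region, the method's storage references are comparatively local, which is part of why it survives in a paging environment where the binary insertion method did not.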
Program Behavior in Paging Environment

Four versions were ultimately coded for the
virtual machine, each of which is described below.
All of the changes made to get from one version
to another were simple and required little programmer time. None of these changes altered the
total number of pages referenced; they simply improved the locality of reference. The time curves
in Figure 8 and the paging curves in Figure 9
present the results. (The size of the file in the case shown is 100,000 words,
occupying 100 pages in virtual memory.)
[Figure 8: run time vs real core size for the four sort versions: T4.1XQ (129 pages; 4730 seconds at 64K), T4.1XQ* (130 pages), T4.1XQR (129 pages), T4.1XQR* (129 pages); 1K page size, FIFO replacement algorithm, single programmed.]

T4.1XQ, the "casually" coded version, reads in
the entire file, performs a non-detached keysort
utilizing the Quicksort algorithm and a table of
key address pointers, then retrieves the records
for output by using the rearranged table of pointers. The records themselves are not reordered during the sort; thus storage references are random
and global during both sort and retrieval, making
locality of reference poor. Deprived of only a
small amount of its required store, this program
behaves very badly. Note that although the MIN
curve in Figure 9 does show some improvement in
paging behavior over FIFO, the improvement is
of no consequence since performance is still quite
unacceptable.

T4.1XQ* treats the file as N sublists; each is
read in, then sorted using the non-detached keysort
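The non-detached keysort used by these versions can be sketched as follows. The record layout is hypothetical; the point is that every comparison follows a pointer back into the unsorted record area, and the retrieval pass then jumps through the file in sorted-key order:

```python
# Sketch of a non-detached keysort: sort a table of record pointers by
# the key found *in the record itself* (keys are not copied out), then
# retrieve records through the rearranged table.
records = [("delta", 4), ("alpha", 1), ("charlie", 3), ("bravo", 2)]

# Table of record indices ("key address pointers").
pointers = list(range(len(records)))

# Each comparison dereferences a pointer into the record area, so the
# references scatter over the whole file: the locality problem noted above.
pointers.sort(key=lambda p: records[p][0])

output = [records[p] for p in pointers]   # retrieval pass
print(output)
```

A detached keysort would instead copy the keys into a compact table first, confining the comparison traffic to a few pages; the non-detached form shown here is what makes T4.1XQ's references random and global.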

The data indicate that, if reasonable programming techniques are employed, the automatic paging facility compares reasonably well (even favorably in some instances) with programmer controlled methods. While not spectacular, these
results nonetheless look good in view of the substantial savings in programmer time and debugging time that can still be realized even when
constrained to employing reasonable virtual machine programming methods.

Multi-programming measurements

The importance of programming style to paging
behavior was clearly demonstrated in the single
programmed part of this study. We were interested in learning if it would have similarly dramatic effects on performance in the domain more
common to paging systems, i.e., multi-programming. Because the most notable changes in behavior were observed in the sorting area, we
decided to plan our measurement efforts around
these programs. An extensive measurement program was undertaken which was designed to give
us insight into the relative effects on performance
of the following: programming style, page replacement algorithm, size of real core, number of
users and scheduling. It should be noted that the
question of performance in a multi-programmed
environment involves both the individual user
response and total system thruput capability. Although the study addressed both of these aspects,
the results discussed here pertain only to the
latter. (A complete in-depth report on the entire
multi-programmed measurement study is given
in Ref. 7, Part III.)

The effects of programming style

The two versions of the sort program used for
this study were the "casually coded" version,
T4.1XQ, and the "most improved" version, T4.1XQR*. Multiple copies of a given program were
run simultaneously (as background jobs) on the
system with the full real core (184K) available.
(No more than 5 background jobs can be run
simultaneously because of tape drive limitations.)
The curves in Figure 10 compare the multi-programming efficiency obtained with the two different programming styles. These curves are plots
of Time/Job vs the number of (identical) jobs
run simultaneously on the system (Multi-programming Level).

Clearly the efficiency of the system is nearly
identical whether multi-programmed at the two-level or the five-level in the case of the well-coded
program, T4.1XQR*, but is substantially degraded
for each additional job in the case of the casually
coded version, T4.1XQ. In fact, multi-programming at even the two level for that program is
worse than running sequentially. (For T4.1XQR*
multi-programming is consistently more advantageous than running sequentially up through the
five-level.)

[Figure 10: Time/Job vs Multi-programming Level for T4.1XQ (casual code) and T4.1XQR* (most improved code), each against its optimum; BIFO replacement algorithm, 1K page size, 0.1 second time slice, real core size 184K, page requirements 129 pages/job, no load leveler.]

FIGURE 10-Effects of programming style
T4-Sorting (10,000 10-word items)
The effects of load leveling

One of the capabilities available on the M44/44X system aimed at improving efficiency is that
of dynamically adjusting the load on the system
in order to attempt to avoid the overload condition which is characterized by excessive paging
coupled with low CPU utilization. When this load
leveling function is activated, the system periodically samples paging rate and CPU utilization,
compares them with pre-set criteria to determine
if a condition of overload or underload exists, and
then takes action appropriately to adjust the system load by either setting aside a user, i.e., removing him temporarily from the CPU queue, or restoring to the queue a user who was previously
set aside. The function of the load leveler is thus
essentially one which affects scheduling.

The extremely poor behavior exhibited by the
casual code when multi-programmed made this
case a likely candidate for studying the effects of
load leveling. Figure 11 shows the remarkable
improvement which the load leveler achieved
when there were three or more jobs involved. Unfortunately, the efficiency is still substantially
worse than in the sequential case. We nonetheless
feel that the potential for improved performance
achieved through the use of an automatic dynamic
facility such as this is promising and indicative
that it would be well worth implementing, in particular if it can be kept simple and efficient as is
the case with the M44/44X load leveler.

[Figure 11: Time/Job vs Multi-programming Level for T4.1XQ (casual code) with and without the load leveler, and T4.1XQR* (most improved code) with no load leveler; BIFO replacement algorithm, 1K page size, 0.1 second time slice, real core size 184K, page requirements 129 pages/job.]

FIGURE 11-Effects of load leveling
T4-Sorting (10,000 10-word items)

The effects of page replacement algorithm

As might have been suspected from the single-programmed MIN study, the role of the page replacement algorithm appears to be of relatively
little significance. In the case of T4.1XQ, runs
were made using the more sophisticated AR algorithm but the data collected differed little from
that obtained for the BIFO algorithm. Similarly,
in the case of T4.1XQR* the difference in the results is inconsequential for those runs made where
all of real core (184K) was available. (Figures
10 and 11 show the BIFO data.) However, when
the same T4.1XQR* runs were made with the real
core size restricted to 64K there was some change
in performance for the different replacement algorithms. The curves in Figure 12 compare the
effects of using the different algorithms for T4.1XQR* multi-programmed (up to the five-level)
with only 64K of real memory available to the entire system, i.e., shared by all the users. The level
of multi-programming for which the efficiency is
optimum is in all cases three; however, in the case
of the AR algorithm, multi-programming at the
five-level with only 64K of real memory is still
more advantageous than running the five jobs sequentially (with the same 64K of real memory).
Note that this is also true when running under
the other algorithms with load leveling.

[Figure 12: Time/Job vs Multi-programming Level for T4.1XQR* under the BIFO, AR, and FIFO algorithms, with and without the load leveler; real core size 64K, 1K page size, 0.1 second time slice, page requirements 129 pages/job.]

FIGURE 12-Effects of page replacement algorithm
T4.1XQR*
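The load-leveling rule described above (sample paging rate and CPU utilization, set a user aside on overload, restore one on underload) can be sketched as a sampling routine. The thresholds and the data structures below are invented for illustration; they are not the M44/44X criteria:

```python
# Minimal sketch of a load leveler: one decision per sampling period.
PAGING_HIGH, CPU_LOW = 50.0, 0.40   # hypothetical overload criteria

def level_load(paging_rate, cpu_util, run_queue, set_aside):
    """One sampling period of the leveler; mutates the two lists."""
    if paging_rate > PAGING_HIGH and cpu_util < CPU_LOW and len(run_queue) > 1:
        set_aside.append(run_queue.pop())      # overload: shed one user
    elif paging_rate < PAGING_HIGH / 2 and set_aside:
        run_queue.append(set_aside.pop())      # underload: restore one user

queue, aside = ["job1", "job2", "job3"], []
level_load(80.0, 0.25, queue, aside)    # heavy paging, idle CPU
print(queue, aside)                     # one job set aside
level_load(10.0, 0.90, queue, aside)    # load has eased
print(queue, aside)                     # job restored
```

The essential property, as the text notes, is that this is purely a scheduling mechanism: it changes which users compete for real core at once, not how pages are replaced.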

The effects of real core size

Performance is so poor for the T4.1XQ program
given the full 184K of real memory that it is obviously unnecessary to show how bad things would
be given an even smaller memory! In the case of
T4.1XQR*, however, performance for the system
is so close to optimum that we were curious to
learn just how small the real core size could be
before performance would be worse than in the
single-programmed case (for the same size of
real memory). The curves in Figure 13 compare
Time/Job for the single-programmed case with
multi-programming at the three and five levels for
different real core sizes. Runs were also made
multi-programmed at the five level with the load
leveler activated and real core sizes of 48K and
32K. As can be seen in Figure 13, while improving
performance, the load leveler was not able to improve it sufficiently to compare favorably with the
single-programmed sequential case.

When viewed in the perspective of page requirements per job, the performance of the system is
remarkable for the well coded program. Five jobs,
each requiring 129 pages, shared a 32K memory
and still behaved reasonably well! (The time per
job is even a few seconds less than that required
for the overlapped 2-way merge conventional
code.) On the other hand, the performance for the
casual code given the full memory capability of
184K is at best (load leveled) quite a lot worse
than sequential and at worst (not load leveled) a
minor disaster.

The data which we have presented here on
multi-programming represent only part of that
collected for the study. The cases chosen are obviously the extreme ends of the spectrum. One
would not (hopefully) encounter all "bad" programs running at the same time on a system under
real time-sharing conditions, nor (regretfully) is
one likely to encounter all "good" programs. The
real situation lies somewhere in between, and
most likely, so does the characteristic performance of the system. We have not directly addressed
the question of individual thruput (or response)
time in the data shown here; however, we have
shown that total system thruput is most certainly
affected by the programming style employed by
the users on the system. We have shown in our
other work (Ref. 7) that this is also true for individual response time (often even if system thruput is unaffected).

[Figure 13: Time/Job vs real core size (32K-184K; K = 1024 words) for T4.1XQR* single-programmed and multi-programmed at the three and five levels; AR replacement algorithm, 1K page size, 0.1 second time slice, page requirements 129 pages/job, no load leveler.]

FIGURE 13-Effects of real core size
T4.1XQR*

SUMMARY

The single programmed data presented in this
paper give strong support to the conclusion that
the effects of programming style are of significant
consequence to the question of good performance
in a paging system. Indeed, as the MIN results
indicate, the basically external consideration of
programming style can be considerably more important than the internal systems design consideration of replacement algorithm. We feel that data
obtained for the multi-programmed case, some of
which were presented in the previous section, further support our conclusions. In view of these
results, we feel that this aspect of performance
must not be disregarded in future endeavors to
implement paging systems. Programming techniques should be developed at both the user and
system levels which are aimed at achieving acceptable performance on such systems. For example,
higher level language processors such as FORTRAN should be designed for paging systems to
produce good code for the environment as well as
to perform well themselves in that environment.

While we support the stand that paging and
virtual machines are inherently desirable concepts
with much potential, we strongly feel that in order
to fully realize that potential in terms of practical
performance characteristics, the notion of programming with complete unconcern for the environment must be discarded. Our data have
shown, however, that one can often realize acceptable performance by employing even simple
techniques which acknowledge the paging environment. Their simplicity leads us to feel that the
programming advantages inherent to the concept
of virtual systems can, to a great extent, still be
preserved.
ACKNOWLEDGMENT
We would like to acknowledge E. S. Mankin for
his extensive contribution in preparing the test
load programs for the sort area.


REFERENCES
1 R W O'NEILL
Experience using a time-shared multiprogramming system with
dynamic address relocation hardware
SJCC Proceedings Vol 30 1967 pp 611-621
2 P WEGNER
Machine organization for multiprogramming
Proceedings of 22nd ACM National Conference Washington
DC 1967 ACM Publication P-67 pp 135-150
3 G H FINE C W JACKSON P V McISAAC
Dynamic program behavior under paging
Proceedings of 21st ACM National Conference Washington
DC 1966 ACM Publication P-66 pp 223-228
4 Adding computers-Virtually
Computing Report for the Scientist and Engineer Vol III No
2 March 1967 pp 12-15
5 The M44/44X user's guide and the 44X reference manual
IBM Corp T J Watson Research Center Yorktown Heights
New York September 1967
6 L A BELADY
A study of replacement algorithms for a virtual storage computer
IBM Systems Journal Vol 5 No 2 1966 pp 78-101
7 B S BRAWN F G GUSTAVSON
An evaluation of program performance on the M44/44X system
Parts I II III
RC 2083 IBM T J Watson Research Center Yorktown
Heights May 1968
8 P J DENNING
Working set model for program behavior
CACM Vol 11 No 5 May 1968 pp 323-333
9 A C McKELLAR E G COFFMAN
The organization of matrices and matrix operations in a paged
multiprogramming environment
Princeton University Technical Report No 59 February 1968
10 C A R HOARE
Quicksort
Computer Journal Vol 5 April 1962 to January 1963 pp 10-15

JANUS: A flexible approach to realtime timesharing*
by J. O. KOPF and P. J. PLAUGER
Michigan State University
East Lansing, Michigan

INTRODUCTION

Motivation

A third generation computer seems to cause as
many problems as it solves; not because it is difficult to program or too problem directed, quite
the contrary. The problems arise because such a
computer lends itself so willingly to all applications: realtime data acquisition, process control,
scientific calculations, bookkeeping and conversational time-sharing. In a nuclear physics laboratory, there are enough imaginative people interested in each of these subjects that eventually all
are implemented with some success. The central
problem, then, is to develop an operating environment compatible with open-ended development of
any or all types of computer usage. Ideally, one
seeks a standard operating system providing the
framework and resources to aid all such development.

In our case, the desired priority of computer
usage was to be: 1. realtime data acquisition and
control; 2. interactive on-line operations, especially data analysis; 3. background operation. In a
nuclear physics experiment, realtime operation
is normally characterized by immense buffers,
which are updated at each input event rather
than transmitted as sequential data, indicating
the desirability of a small resident monitor, and
non-permanently dedicated interrupt routines and
buffers. Event rates as high as 50,000 events/
second may be expected, implying the need for a
powerful computer to perform quickly the operations necessary to each event.1,2

The Michigan State University Cyclotron Laboratory installed a Scientific Data Systems
Sigma-7 computer in January 1967. We have constructed operating system JANUS for the Sigma-7
to meet the goals outlined above. JANUS has
proved to be far more powerful than we originally
expected.

*Supported by the National Science Foundation.

The SDS Sigma-7

The SDS Sigma-7 is a high-speed, integrated
circuit machine with sophisticated timesharing
hardware.3 It features a 32-bit word, with displacement indexing by 8-bit bytes, half-words,
words and doublewords, and direct addressing to
128k words. Timesharing hardware includes master/slave modes, rapid context switching (Exchange- and Load-Program-Status-Doubleword instructions), a powerful interrupt structure with
certain functions inhibitable under program control, program traps which are independent of the
interrupt structure, and mapping hardware.

The Sigma makes extensive use of scratch-pad
memories: integrated flip-flop registers whose access time is insignificant compared with core memory. Thus there are 16 distinct registers, effectively accumulators. Instructions normally reference one or more of these registers. In addition,
the computer treats these registers as the first
16 locations of memory: all instructions are valid
for register-register operations. The computer is
thus effectively a two-address machine, where one
address space is a subset of the other. Furthermore, four registers may be used in a block as a
decimal accumulator (31 digits plus sign), seven
others may be used as index registers (post indexing), and any even-odd register pair may be used
for double precision work.

The hardware also makes use of hard-wired
table look-up and translation for certain functions. An example is the map. Memory is naturally
divided into 512-word pages. The map consists of
a scratch-pad memory of 256 bytes, one for each
page of virtual memory (virtual memory is the
full address space of the machine, independent of


the actual core memory available). When the map
is in operation, the first byte of the effective virtual address is used as an index to look up a translation byte from the map, which replaces the original byte to form the actual address used to make
the memory reference. As a result, contiguous
virtual pages need not be in contiguous actual
memory; under a properly initialized map they
act as though they are. Associated with each page
is a two-bit access process code which can inhibit
slavemode from writing, executing, or even reading words in the page. In conjunction with a rapid
access disc (RAD), this hardware provides the
swapping control needed for efficient timesharing.

In addition, the computer has two major means
of communicating with the external world. The
Input-Output Processor (IOP, of which there may
be up to 8) is designed for sequential transmission
of data asynchronously with the operation of the
computer. The Direct I/O (DIO) provides for the
transmission of one word at a time to or from the
registers, under program control. JANUS normally uses the IOP for conventional I/O operations; the DIO for acquisition and control.
Figure 1 details the resources available on the
MSU Sigma-7.

FIGURE 1-Hardware resources of the MSU Sigma-7 system.
Items labelled "J" are handled by JANUS. Those labelled "S"
are shared by all users on a cyclic basis. All others are loaned out
on a first come, first served basis for exclusive use.

Other approaches

Before going into the details of JANUS, we
should perhaps explain why we felt existing approaches were inadequate for our needs.

Conventional realtime systems are usually
geared for one application, or one set of applications. One cannot randomly start and stop arbitrary functions, even though the particular resources needed may be standing idle. In particular, one cannot "batch" process (i.e. compile, load
and run a series of purely computational programs) to take advantage of the usually large
CPU time available between interrupts.

By dividing memory into foreground and background areas, it is possible to operate a batch
system in conjunction with one or more realtime
operations. Aside from the fact that either of
these areas is frequently a) unused for long
periods of time or b) inadequate for many jobs
that could be run in full memory, there is a more
sophisticated drawback. Since realtime operations
must often use the same resources as the batch, a
large resident monitor is needed to handle common
operations and to prevent conflicts. Furthermore,
since realtime operations occur on an interrupt
basis, the monitor must either be reentrant to
several levels or must inhibit interrupts while it
is active (or a little of both). The former solution makes the resident even larger and slower;
the latter interferes with fast response to realtime events.

Conventional time-sharing systems4 can be
geared to provide the random stop/start of realtime which we desire, and are better geared to
adapt efficiently to dynamically changing memory
availability. But the usual approach has been to
take an already large foreground/background type
monitor and to add a swapper, job scheduler and
elaborate I/O queuing routines to the resident. The
dedicable memory left over can be vanishingly
small.

Figures 2 and 3 caricaturize the distinction we
made between what we saw in conventional approaches to realtime timesharing and what we
envisioned for JANUS.

There still remains the problem, not yet mentioned, of the hybrid job. It is often desirable to

FIGURE 2-Caricature of a "conventional" realtime timesharing system. It is characterized by a large resident monitor and
rather rigid frame-at-a-time use of remaining memory.

construct programs having a small realtime part
and a large problem-solving part that could happily be timeshared. The question is raised as to
specifying such an animal. How is communication
between the two parts effected?
We found the answers to these questions, and
the inspiration for answering many more, in the
definition of PL/I.5

Design of JANUS
Terms and philosophy
The PL/I language definition provides a vocabulary and a philosophy (we have probably corrupted both). In PL/I a piece of code containing
all routines needed to perform a job, or distinct
part of a job, is a task. A task can start one or
more subtasks to perform asynchronous operations; it can go into a wait state until certain
events, signalled by other tasks, occur. One can
specify on-units to be activated when conditions
are raised (interrupts or traps). These are the
terms needed to describe what JANUS does.

FIGURE 3-Caricature of JANUS system, with small resident
and intertwined tasks having both dedicated realtime and
timeshared parts.
Equally important is the philosophy. PL/I is
a modular language with the built in attitude, "If
you don't know about this option, it isn't there. I
will do what you most likely want done." With
only 16k words of core memory, one must necessarily begrudge the presence of excess code. This
is the philosophy by which JANUS works.
It should be emphasized that JANUS does not
require the PL/I compiler to operate, nor do we
plan to write one. The concepts are quite useful
without the compiler.
Timeshared monitors
A monitor is used to provide certain functions,
such as control and I/O, which a user either does
not want to implement himself, or cannot be
trusted with. However, these functions may be
made modular in. form, and can thus be loaded
from a library.


The amount of code that truly must be resident
in a timesharing system is really quite small. A
scheduler (JOB CHANGER) and RAD handler
(SWAPPER) are, of course, required. A console
teletype handler is rather important to initiate
system actions and to provide a common voice-ear
for all users. Other I/O handlers are not required.
If one extends the concept of a task to include
all the mastermode routines required for its execution, many interesting by-products result. First,
the resident requirements are drastically reduced.
If no one is reading cards, no card reader handler
is in memory. Second, each task can have a "tailornlade monitor." If task A never does realtime
operations, it has no such handlers in its mastermode end. Third, and perhaps most important,'
monitor routines need not be re-entrant. Each
talks only to one user.
All the resident must do is act as referee.
Peripherals, indeed any nonshareable system resources (interrupts, extra register blocks, disc
space) are handed out on a first come, first served
basis, and passed on to the next requestor when
released by the current user.
Naturally, such a scheme requires that all
mastermode routines be "honest," and completely
debugged, but this is a rather ordinary requirement. A malicious or naive mastermode routine
can cause damage any number of ways, so JANUS
assumes that no such routine exists and performs
virtually no checking. While it may appear that
spreading the responsibility for system integrity
over many routines increases the programming
load, the fact that each routine is a module with
clearly defined rules of construction actually
makes the coding job much easier.
Thus, JANUS is a system that timeshares
monitors, in the ordinary sense, and is itself
only a referee. Each monitor essentially patches
together the small computer it needs to perform
a specific operation, leaving all unused system
resources available for other tasks. Since, in general, no one job requires more than a few private
resources, another subsystem can be constructed
from the remainder, and then another, until the
entire system is active.
In the MSU implementation, the JANUS resident occupies 3.5k words (7 pages).
Resident and system tasks
The RAD is arbitrarily divided into approximately 680 diskpages, each holding 512 words of
useable information. Each distinct page of a task
or file has a unique diskname (16-bit) by which
JANUS can locate it. The first page of every
task begins at location X'C00' under the map and
is called the task control page (TCP). All information required by JANUS to bring in a task and
start it up is stored in fixed locations in the first
34 to 283 words of the TCP, depending on the
size of the task. Thus JANUS needs only know
the diskname of the TCP, called the taskname, to
bring in the whole task. With 16 bits of status
information, the taskname becomes a one-word
entry on the resident ring of active tasks.

Note that a task sees only 3.0k of JANUS.
(Under the map, each task is executing in its
own unique address space of up to 128k words,
generally independent of the available core
memory. Task address spaces diverge after 6
pages: the lower 3k are common to all tasks
and contain the major portion of JANUS.) The
remaining resident page is the TCP of the
HOUSEKEEPER task, the nonresident supervisor. Rather than maintain complete lists of
resources and requests in resident, JANUS maintains only token lists, to meet short-term needs,
and invokes the HOUSEKEEPER as needed to
tidy up lower core.
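The address mapping that gives each task its own address space (512-word pages, one translation byte per virtual page, as described in the hardware section) can be sketched as follows; the sample map contents are invented:

```python
# Sketch of the Sigma-7 style map: the virtual page number indexes a
# 256-entry map, and the byte found there replaces the page number to
# form the actual address.
PAGE = 512  # words per hardware page

def map_address(virtual, page_map):
    """Translate a virtual word address through the page map."""
    vpage, offset = divmod(virtual, PAGE)
    # Contiguous virtual pages need not be contiguous in core.
    return page_map[vpage] * PAGE + offset

page_map = list(range(256))        # 256 pages x 512 words = 128k words
page_map[0], page_map[1] = 7, 3    # virtual pages 0 and 1 live elsewhere

print(map_address(10, page_map))          # page 0 maps to actual page 7
print(map_address(PAGE + 10, page_map))   # page 1 maps to actual page 3
```

Because every task's map can place virtual page 0 (location X'C00' onward for the TCP) wherever its frames happen to sit, JANUS can load a task into any free pages of real core.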
JANUS also serves as intermediary between
interrupt routines, which run unmapped in dedicated memory, and their controlling tasks, which
are mapped and not necessarily in memory at
any given time. The most important service provided is the MESSAGE CENTER, the only multiply re-entrant resident routine in the system. It
will accept a one word signal (consisting of a 16-bit
taskname and an 8-bit identifier) from any source
and pass it on to the specified task, pulling it
out of a wait state if necessary. Great pains have
been taken to ensure that no signals are lost,
duplicated or rejected by JANUS.

Communication in the other direction (alerting interrupts) is generally possible through the
hardware. Interrupts can be triggered under program control. But since the I/O interrupt has a
software fan out to individual device handlers,
JANUS provides a software device-directed
trigger, or "kick," to aid communication. A
pseudo-acknowledgment status word is added to
a queue and the interrupt is triggered. The resident interrupt routine, after acknowledging all
real interrupts, empties the queue, passing the
contents to the individual routines. This feature
simplifies coding of I/O handlers, since all control operations can be confined to the interrupt
end of the handler.
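The one-word signal accepted by the MESSAGE CENTER (a 16-bit taskname plus an 8-bit identifier) can be sketched as a pack/unpack pair. The field ordering within the word is our assumption; the paper gives only the widths:

```python
# Sketch of a one-word signal: 16-bit taskname plus 8-bit identifier.
def pack_signal(taskname, ident):
    """Pack a signal into a single word (field order assumed)."""
    assert 0 <= taskname < 1 << 16 and 0 <= ident < 1 << 8
    return (taskname << 8) | ident

def unpack_signal(word):
    """Recover (taskname, identifier) from a packed signal word."""
    return word >> 8, word & 0xFF

word = pack_signal(0xBEEF, 0x2A)
print(unpack_signal(word) == (0xBEEF, 0x2A))  # round-trips losslessly
```

Keeping the whole signal in one word is what lets an unmapped interrupt routine hand it off atomically, with no shared buffer for JANUS to lose or duplicate.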
The last major component of the resident is
the console teletype handler, used by all tasks
to communicate with the operator. Output is
sequential in order of request and may interrupt
an input. All console teletype I/O is prefix directed to the appropriate task. A prefix pool is
available to provide unique prefixes upon request.
Prefixes consist of a byte count, plus up to 3
characters.

One prefix, the ampersand (&), is reserved for
directing messages to JANUS. A special task,
the AMPERSCANNER, decodes these messages
and takes appropriate action. The AMPERSCANNER knows where to find, or how to generate, all tasks built into the system, and thus
is responsible for initiating most processes.

A third system task is the MORTICIAN, which
dissects dead tasks and files, and returns their
component pages to the system diskpage pool.

Other system tasks are an open-ended set of
symbionts, used to perform an I/O operation with
a disk file. Tasks may either do their own I/O
or generate a file, and let the appropriate symbiont perform the necessary I/O operations. Symbionts do not take up any memory space unless
they are actively working on a file. Only the
system tasks are immortal under JANUS. All
other tasks are brought into existence for a
specific application and are interred when no
longer needed.

Operation
Job changer

Figure 4 is a flow chart of the JOB CHANGER, which is responsible for entering and leaving tasks between time slices, and for delivering messages (signals) to tasks. The lowest priority interrupt is the job-changing interrupt. It fires when its corresponding count pulse interrupt counts to zero (or when triggered by a CPU instruction), and transfers control directly to the task monitor to perform register saving and any other slice-end functions needed before branching to the JOB CHANGER.
Similarly, at slice-start, the JOB CHANGER gives the task monitor an opportunity to process signals and restore registers before continuing. A special purpose task thus has the option of streamlining both ends of the process. Note that a task also has the option of putting itself on


high priority, a status such that it must be the next task executed, although this practice is discouraged.
No attempt is made to compute an "optimal" slicetime. This interval is fixed by an empirical tuning process and is currently 0.1 second.
Swapper

A flow chart of the SWAPPER algorithm is shown in Figure 5. Once the TCP of the next task is in memory, the SWAPPER works from a variable-length table in the TCP to determine what pages must be brought in (or located in memory) to permit the task to proceed.
Since it takes an average of 28 milliseconds to read or write one page of memory to the RAD, great effort is made to avoid actually swapping whenever possible. Some of the techniques used are:
1) Storage areas are grouped together, beginning on a page boundary, so that all other pages can be flagged "read only." Thus, the SWAPPER knows not to write these pages back to the RAD.
2) Slavemode storage areas are at first write-protected, so that the task monitor can inform the SWAPPER (via the TCP table) whether or not a page has been modified.
3) Only those pages of a task which are known to be needed (or for which usage cannot be monitored) during the next timeslice are flagged "must be in next time" or "must be in every time."
4) A four-level "usage priority" is maintained for each page of real memory (see Figures 4 and 5). After each timeslice it is set to a high value if used, reduced toward zero if not. As a result, pages are "turned" according to a weighted LRU (Least Recently Used) algorithm. A page which must be written back has a higher weight than one which does not. Other weight criteria may also be established. This simple trick helps the JOB CHANGER and SWAPPER "learn" the best way to use memory as the system load changes.
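The usage-priority scheme can be sketched in modern terms. The following Python fragment is our illustrative reconstruction, not the original Sigma-7 code; the level values and the dirty-page boost are assumptions chosen to match the description (a page that must be written back weighs more than one that need not).

```python
# Sketch of the four-level weighted-LRU page priority described above.
# Assumptions (not from the paper): priority levels 0..3, and a dirty
# page is boosted one level higher than a clean one when touched.

def end_of_slice(pages):
    """Update each page's usage priority after a timeslice."""
    for page in pages:
        if page["used"]:                      # referenced this slice
            page["priority"] = 3 if page["dirty"] else 2
        elif page["priority"] > 0:            # decay toward zero
            page["priority"] -= 1
        page["used"] = False                  # reset for the next slice

def pick_victim(pages):
    """Choose the page to turn out: lowest usage priority wins."""
    return min(pages, key=lambda p: p["priority"])

pages = [
    {"name": "A", "used": True,  "dirty": True,  "priority": 0},
    {"name": "B", "used": False, "dirty": False, "priority": 1},
    {"name": "C", "used": True,  "dirty": False, "priority": 0},
]
end_of_slice(pages)
print(pick_victim(pages)["name"])   # B: not used this slice, clean
```

Decaying the priority rather than clearing it is what lets the SWAPPER "learn": a page idle for one slice still outranks a page idle for three.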
Such great pains were taken to improve swap efficiency, in fact, that a wholly unexpected byproduct was created. Slavemode memory usage could be monitored so closely that the slave portion of a task need never be all in memory at once, but instead pages would be brought into core upon demand. Thus a problem-solving program could be written using the full 128k address space


Fall Joint Computer Conference, 1968
FIGURE 4-Flow chart for JOB CHANGER (scheduler). It operates at lowest priority interrupt level.


FIGURE 5-Flow chart for SWAPPER. It operates at I/O interrupt level whenever RAD operation completes or software
"kick" is administered by JOB CHANGER.


of the machine and, with no alteration, run on a Sigma-7 with as little as 8k of actual memory, although less efficiently than in the full machine.
Demand paging

Figure 6 is a flow chart of the demand paging algorithm. It is implemented as an optional task monitor module (a pure process-control task would have no use for this feature), and occupies 250 words. Whenever an access-protect violation is signalled via the Nonallowed Operation trap, the routine is entered to determine a) what page reference caused the trap and b) whether it was indeed an error or a valid reference. If the latter, the page is made available and usage is noted.
The routine is time consuming because the access protects were not originally designed for this purpose, and hardware deficiencies must be made up in software. But each page traps at most twice during each 0.1-second timeslice, and the possible gain in information weighs heavily against 28 milliseconds per page swap time.
Except for the Execute instruction, which is infinite level, no Sigma-7 instruction can access more than five pages before reaching an exit point. With special handling for Execute, this five-page limit applies to all cases. Consequently, timesharing can be maintained as long as there are sufficient non-dedicated pages of memory to hold the largest task monitor, plus five slavemode pages. The MSU Sigma-7 thus has over half its 16k memory dedicable for I/O and other realtime processes.
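The trap-handling decision described above can be sketched as follows. This Python fragment is a hypothetical paraphrase of the demand-paging module's logic; the class, names, and structures are ours, not the original 250-word routine.

```python
# Sketch of the demand-paging trap handler: on an access-protect trap,
# decide whether the reference is a genuine error or a valid reference
# to a page not yet in core; if valid, map the page in and note usage.

class Task:
    def __init__(self, valid_pages):
        self.valid_pages = set(valid_pages)   # pages of the virtual space
        self.resident = {}                    # virtual page -> state

    def trap(self, page):
        """Called when the Nonallowed Operation trap fires for `page`."""
        if page not in self.valid_pages:
            return "error"                    # a real addressing error
        # Valid reference: read the page in (from the RAD) and map it.
        self.resident[page] = {"used": True}  # usage noted for the LRU
        return "mapped"

t = Task(valid_pages=[0, 1, 5])
print(t.trap(5))    # mapped
print(t.trap(9))    # error
```

Because a page faults at most twice per slice, this per-fault bookkeeping cost is bounded, which is the trade the paper weighs against the 28 ms swap time.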

FIGURE 6-Flow chart of Demand Paging algorithm, a 250-word routine which may be included in a task monitor to permit a task to occupy the full 128k virtual memory. It operates whenever a Nonallowed Operation trap occurs in the task.

FIGURE 7-Typical realtime (I/O) driver: structure and internal communication under JANUS. The figure illustrates the optional three-part division of labor (an interrupt, or realtime, routine; mastermode control routines; and slavemode computation routines, each with its associated unique storage) and the modes of communication between the parts (storage common to both realtime and control functions, common storage between control and computation, signals, CAL traps, and privileged instructions).

Efficiency
It is difficult to measure the overall efficiency of the JANUS system, because loading changes so rapidly with even the simplest uses. Several extreme cases can be cited, however, to provide a feel for overall parameters:
1) In the case where all active tasks fit in memory at once, the SWAPPER "learns" in about 1 second that all codes should be in memory. Afterward, the time spent in the JOB CHANGER and SWAPPER (overhead) drops to less than 5% of total CPU time.
2) The FORTRAN optical model code GIBELOMP, which requires approximately 35k words of slave code in a Sigma-7, registers a swap overhead (the fraction of time not spent in slave problem-solving mode) of 20% when running alone in the machine.
3) While it is possible to create pathological cases where swap overhead would exceed 99%, system processors and compiled code seldom burst over 50% overhead.
These figures are, of course, degraded as available memory decreases or external interrupt load increases.

CONCLUSION
JANUS has proved to be a viable solution to the needs outlined in the introduction. By extending the concept of realtime operations to include all I/O operations, a small resident is possible, permitting the use of most of memory for production. Whether that production is realtime data acquisition, computation, or a mix is irrelevant; it can proceed if the necessary resources are available. Any realtime process can interrupt the system or background processes at all times, permitting high data rates. The ability to run large computation codes with reasonable efficiency, and in a timesharing environment, is an unexpected boon.
While the concept of timesharing monitors does permit a minimal resident supervisor, it would appear to suffer from the need of writing specialized monitors for the library. However, certain facts should be kept in mind. Each monitor is no more complex than a corresponding stand-alone system would be; only the rules are different. Each monitor need contain only those functions necessary and sufficient for the task in question. The various functions are modular; once a specific function has been coded, it may be placed in a library for future use, rather than having to be recoded again and again. Given such a library of monitor functions, it is possible to define a system loader (actually a task maker) which can generate a tailor-made monitor. This sort of operation is not uncommon in single-user monitors, where, for example, a magnetic tape handler is included only if the programmer references it implicitly or explicitly. It is not an overly restrictive requirement that the possible set of operations to be performed be known before execution is started.
Demand paging requires some form of map with protection, as well as an external storage medium. Although there is some controversy as to the value of demand paging in a timesharing environment, JANUS demonstrates the value of the concept in situations of single-user, limited-memory machines. While overhead for problems which fit into core is minimal, larger problems can also be run, although at lower efficiency, at less monetary cost than would be incurred by an increase in core memory. Given demand paging, additional memory increases efficiency. Comparing an open-shop, small demand-paged computer with a closed-shop computation center, turnaround time is significantly reduced, even if the actual computer running time is increased by orders of magnitude.
The Sigma-7 is an admirable machine for timesharing, but was not designed with demand paging in mind. Certain features necessary for demand paging, while capable of being implemented by software, would be much more efficient if implemented in hardware. These include:
1. Scratchpad registers, readable and clearable under computer control, which would automatically keep track of all pages referenced and modified.
2. A more powerful hierarchy of operation modes and a different sequence of protection. A better sequence of protection might be: write permitted, write permitted from master mode, no write permitted, and no access permitted. Master-slave modes of operation should be expanded to the point where the protection could optionally apply to master mode, thus permitting demand-paging operation of such code.
3. A privileged instruction which would interpretively execute any instruction, and return the reference which would cause a trap.
Other features, such as a readable map, would be of great use.
In conclusion, we feel that JANUS may well serve as a model for future computer systems, due to its flexibility and lack of constraints. As much as possible, limitations on computer usage are set by hardware, rather than by system conventions.
REFERENCES
1 J A JONES
On-line computers for research
Nucleonics p 34 Jan 1967 A survey of the field containing many references which will not be duplicated here
2 J BIRNBAUM M M SACHS
Computer and nuclear physics
Physics Today p 43 July 1968
3 SDS Sigma-7 computer reference manual (#90009-506)
Scientific Data Systems Santa Monica California May 1967
4 B W LAMPSON
A scheduling philosophy for multiprocessing systems
Communications of the ACM p 347 May 1968
5 IBM System/360 operating system-PL/I language specifications (#C28-6571-4)
IBM Corporation New York New York December 1966

A parallel process definition and control system
by DAN COHEN
Harvard University
Cambridge, Massachusetts

INTRODUCTION

General

The purpose of this work is to supply a simple method for definition and efficient control of a network for asynchronous parallel processing.
The system is able to compile a definition of a network of processes, which run partially in parallel and partially sequentially, into a set of control instructions for some monitoring process, to be executed at run time.
The defined network is composed of a set of interrelated processes and a monitoring process which initiates processes according to some dynamic conditions. Typical resources for processes are processors, I/O devices, memory banks, and bulk storage.
The system discussed below is concerned with two tasks: the handling of raw input statements (the network definition language and its parsing algorithms), and the network control process, which initiates and inhibits processes as they become needed or unnecessary.

The states of the processes

Each process at any time can be in one of the following states:
(a) reset (not initiated)
(b) initiated
(c) successfully completed
(d) failed
(e) not needed
However, there is no need to distinguish between states (d) and (e). Each process has three binary variables f, g and h to indicate its state.
f = 1 : the process was initiated.
g = 1 : the process was successfully completed.
h = 1 : the process failed or was found not needed.
The f variable is set externally to initiate a process. The g variable is set internally by the process to indicate its completion. The h variable may be set internally to indicate failure, or externally to indicate that this process is not needed.
The combinations of f, g, h may be interpreted as:

f g h
0 0 0   reset
0 0 1   the process is not needed
0 1 0   illegal
0 1 1   illegal
1 0 0   initiated
1 0 1   failed or not needed
1 1 0   successfully completed
1 1 1   successfully completed

The control language

Each process, p, has two propositions associated with it:
b(p), which is 'TRUE' if p has already started, and
c(p), which is 'TRUE' if p was successfully completed.
A network is defined by a set of statements {Si} of the form:

(Si)  fi => {pij}

where the {fi} are propositional functions, which contain AND and OR. The meaning of Si above is that when fi is 'TRUE' then b(pij) is set to 'TRUE' (which initiates the process pij if it was not already initiated).1 For a more convenient notation, we write pk for c(pk) on the left hand side, and write pk for b(pk) on the right hand side.
Statements with identical propositional functions are combined by writing their propositional functions on the left, and writing the set of all their right hand sides on the right. Example:

1 => means logical implication, e.g. x => y means that x = 1 implies y = 1.


c(A) + c(B) & c(C) => b(D)
c(A) + c(B) & c(C) => b(E)

is written as:

A + B & C => D, E

An example for definition of a network

s => A, B       (read: start with processes A and B; s for start)
B & C => D, E   (read: when B and C are successfully completed, initiate D and E)
A + B => C, D   (read: when A or B is successfully completed, initiate C and D)
C & D + E => e  (read: when (C and D) or E is successfully completed, end; e for end)
Let L be the left hand side of a statement (i.e., the propositional function) and let R be the set of the processes which are initiated when L is 'TRUE.' Hence each statement has the form Li => Ri (read: Li implies Ri). We say that p is used in L => R if L contains p.4
Any process which is used by the network, but is not initiated by it, like s, is called a source-process, and its beginning is a source of the network.
Any process which is initiated by the network, but not used by it, like e, is called a sink-process, and its end is a sink of the network.
Each network must have at least one source. A circuit-free network has at least one source and one sink.
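To make the notation concrete, the four statements of the example network above can be represented as predicates and fired by a toy monitoring loop. This Python sketch is our illustration only; the actual system compiles the statements into control instructions rather than interpreting them at run time.

```python
# Sketch of a monitor interpreting network definition statements.
# Each statement is (propositional function f over the completed set,
# processes to initiate); f encodes AND (&) and OR (+) over c(p).

statements = [
    (lambda c: "s" in c,                            ["A", "B"]),  # s => A, B
    (lambda c: "B" in c and "C" in c,               ["D", "E"]),  # B & C => D, E
    (lambda c: "A" in c or "B" in c,                ["C", "D"]),  # A + B => C, D
    (lambda c: ("C" in c and "D" in c) or "E" in c, ["e"]),       # C & D + E => e
]

def monitor(completed):
    """Fire every statement whose propositional function is TRUE,
    setting b(p) for each process on its right hand side."""
    started = set()
    for f, right in statements:
        if f(completed):
            started.update(right)
    return started

print(sorted(monitor({"s", "A", "B", "C"})))
```

Note that AND takes precedence over OR, so the last statement reads (C & D) + E, matching the paper's convention.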

(3.1) The control language processor

The input statements are compiled into a set of control instructions to be executed at run time. This compilation has two stages. The first stage checks validity (and outputs diagnostics) and reduces the input statements to some "graphable" form.5 The second stage produces control instructions according to the reduced statements.
(a) check validity (3.2.1)
(b) simplification (I), if possible (3.2.2)
(c) transform the network so that each process is initiated only once (3.2.3)
(d) separate AND from OR (3.2.4)
(e) transform the network so that each process is used only once (3.2.5)
(f) simplification (II), if possible (3.2.6)
Each simplification (I) step is applied as soon as it is applicable. This ordering of the steps insures that no reduction steps except simplifications are needed as a result of a later step.

(3.2) The reduction procedure

After this procedure the input statements are reduced to a "graphable" form, where the definition statements correspond to vertices, and the processes correspond to edges. Because of the AND/OR separation (3.2.4) each node can be represented by an AND or an OR gate.
During the reduction dummy processes are created, which can be recognized as such thereafter. The role of these dummy processes is similar to the role of temporaries in program compilation. All dummy processes are created and named only by the compiler. We will name all dummy processes hereafter by qi. Dummy processes have no duration, and do not need any time or resources to be executed. They serve to eliminate ambiguities in the network definition. The nature of these processes will be made clear in (3.2.3), (3.2.4) and (3.2.5) below. Note that this reduction is not a complete optimization.
(3.2.1): Validity checking
There must be at least one source to any network.
(3.2.2): Simplification (I)
The simplification is not essential; however, it may save time and space and therefore is desirable.
(a) If s is the only source of the network, then replace s & p by p, and s + p by s, in any L. (Note that we drop subscripts for processes, for simplicity.)
(b) If e is the only sink of the network, then if {e, p} ∈ R, delete p from R. If p is initiated only in this statement, replace c(p) by 'FALSE' wherever p is used.
(c) Apply any available method of propositional calculus simplification to L (i.e., replace A + A by A, and A + A & B by A).
(d) Delete L => ∅.
(e) Replace L => R1 and L => R2 by L => R1 ∪ R2.
(f) Delete q => R (where q is a dummy process), and replace q by R where q was initiated. See Figure 1.
(g) Delete p => q (where q is a dummy process, and p a process) and replace q by p wherever q is used. See Figure 2.
(h) Delete statements of the form 'FALSE' => R. If some process p ∈ R is initiated only in this statement, replace c(p) by 'FALSE' wherever p is used.

FIGURE 1-An example for rule (3.2.2.f)

2 + and & mean logical OR and AND. The AND takes precedence over the OR.
3 D, E means the set {D, E}.
4 i.e., c(p) appears in the propositional function, L.
5 The meaning of a "graphable" form is explained later, in (3.2).
(3.2.3): Uniqueness of initiation
If any process is initiated more than once, then there are L1 => R1 and L2 => R2 such that R1 ∧ R2 ≠ ∅,6 i.e., there are common terms in the sets R1 and R2. In this case replace L1 => R1 and L2 => R2 by:

L1 => q1, R1 − (R1 ∧ R2)
L2 => q2, R2 − (R1 ∧ R2)
q1 + q2 => R1 ∧ R2

where q1 and q2 are new dummy processes.
Example: L1 => A, B, C and L2 => B, C, D are replaced by:

L1 => q1, A
L2 => q2, D
q1 + q2 => B, C

This rule is illustrated in Figure 3.

FIGURE 2-An example for rule (3.2.2.g)
FIGURE 3-An example for rule (3.2.3)

(3.2.4): OR/AND separation
Any statement which contains both AND and OR is converted to several statements which use only AND or OR. Some dummy processes are used to represent subcombinations of processes (the same way compilers use temporaries). A simple precedence reduction analysis suffices. For simplicity here, let us assume that L is converted to, or is given as, a sum of products. (Sums of products may be simplified by using the idempotent and absorption rules.7) In this case each product term Ti of L is replaced by a new dummy process qi, so that L => R becomes the AND-statements Ti => qi together with the OR-statement q1 + q2 + ... => R, where the {qi} are new dummy processes. Repeat this step as necessary, until all the statements are AND-statements or OR-statements.

6 ∅ is the empty set.
7 The absorption rule: x + x & y = x. The idempotent rule: x + x = x and x & x = x.
The similarity between dummy processes and temporaries in program compilation may be illustrated by the following example. Consider the statement:

A + B * C * (D + E) => F

As an arithmetic statement it may be compiled as:

D + E => T1
B * C * T1 => T2
A + T2 => F

(T1 and T2 are temporaries)

As a process statement it may be compiled as:

D + E => q1
B & C & q1 => q2
A + q2 => F

(q1 and q2 are dummy processes)

See Figure 4.

(3.2.5): Uniqueness of usage
If some process, p, is used m > 1 times, then add p => {q1, q2, ..., qm}, and replace p in its nth usage by qn (these {qn} are new dummy processes). See Figure 5.

FIGURE 5-An example for rule (3.2.5)

(3.2.6): Simplification (II)
Simplification (II), like simplification (I), is not essential, but may save time and space. It may be done only after (3.2.3), uniqueness of initiation, and (3.2.4), OR/AND separation, are completed.
(a) p + E1 => q and q + E2 => R, where q is a dummy process, is replaced by p + E1 + E2 => R. See Figure 6.
(b) p & E1 => q and q & E2 => R, where q is a dummy process, is replaced by p & E1 & E2 => R. See Figure 7.

(3.3): Example
(a) The input statements:
(S1) s => A, B

FIGURE 4-An example for rule (3.2.4)

(S2) A + B => C, D
(S3) B => D, E
(S4) C & D + E => e

(b) D is initiated twice, in S2 and in S3. Apply (3.2.3):
(S1) s => A, B
(S2) A + B => C, q1
(S3) B => q2, E
(S4) C & D + E => e
(S5) q1 + q2 => D

(c) S4 contains both AND and OR. Apply (3.2.4):
(S1) s => A, B
(S2) A + B => C, q1
(S3) B => q2, E
(S4) q3 + q4 => e
(S5) q1 + q2 => D
(S6) C & D => q3
(S7) E => q4

(d) Step (g) of simplification I can be applied to S7 and S4:
(S1) s => A, B
(S2) A + B => C, q1
(S3) B => q2, E
(S4) q3 + E => e
(S5) q1 + q2 => D
(S6) C & D => q3

(e) B is used twice, in S2 and in S3. Apply (3.2.5):
(S1) s => A, B
(S2) A + q5 => C, q1
(S3) q6 => q2, E
(S4) q3 + E => e
(S5) q1 + q2 => D
(S6) C & D => q3
(S7) B => q5, q6

(f) Step (f) of simplification I can be applied to S3 and S7:
(S1) s => A, B
(S2) A + q5 => C, q1
(S3) deleted
(S4) q3 + E => e
(S5) q1 + q2 => D
(S6) C & D => q3
(S7) B => q5, q2, E

No more reduction rules can be applied. The network is now in a "graphable" form, as shown in Figure 8.

FIGURE 6-An example for rule (3.2.6.a)
FIGURE 7-An example for rule (3.2.6.b)
FIGURE 8-The network of example (3.3)

(4) Compiling the input statements to control instructions

Now we assume that the input statements have been reduced to their "graphable" form, as described before. The next goal is getting instructions for updating the process variables f, g and h. These instructions are executed at run time by some monitoring process. Let {ℓi} be the processes used in L, and {rj} the processes in R.
(4.1) A set of rules for compiling ℓ1 + ℓ2 + ... + ℓn => {rj} can be written:
(4.1.1) If any ℓi in L succeeds, initiate R (success forward):8

g(ℓ1) + g(ℓ2) + ... + g(ℓn) → {f(rj)}

8 → means replacing (setting), e.g. x → y means replace y by x + y.

Example (Figure 9): if A, or B, or C succeeds, initiate D and E.
(4.1.2) If all {ℓi} in L fail, then R is not needed (failure forward):

Π h(ℓi) → {h(rj)}

Example (Figure 9): if A, and B, and C fail, so do D and E.
(4.1.3) If R is not needed, then L is not needed either (failure backward):

Π h(rj) → {h(ℓi)}

Example (Figure 9): if D and E are not needed, neither are A, B and C.
(4.1.4) If any ℓi in L succeeds, then its brothers are not needed (inhibit brothers):10

g(ℓi) → {h(ℓj), j ≠ i}

Example (Figure 9): if A succeeds, B and C are not needed. If B succeeds, A and C are not needed, etc.
(4.2) A set of rules for compiling ℓ1 & ℓ2 & ... & ℓn => {rj} can be written:
(4.2.1) If all ℓi in L succeed, initiate R (success forward):

Π g(ℓi) → {f(rj)}

Example (Figure 10): if A and B succeed, then initiate C, D, and E.
(4.2.2) If any ℓi in L fails, then R is not needed (failure forward):

h(ℓ1) + h(ℓ2) + ... + h(ℓn) → {h(rj)}

Example (Figure 10): if A or B fails, so do C, D and E.
(4.2.3) If R is not needed, then L is not needed either (failure backward):

Π h(rj) → {h(ℓi)}

Example (Figure 10): if C and D and E are not needed, neither are A and B.
(4.2.4) If any ℓi in L fails, then its brothers are not needed (inhibit brothers):
Example (Figure 10): if A fails, B is not needed. If B fails, A is not needed.

FIGURE 9-An example for (4.1)
FIGURE 10-An example for (4.2)

The statement p => R can be compiled according to either set of rules; however, we consider a one-input node as an AND node. Note that (4.2.4) is not necessary, as it is implied by (4.2.2) and (4.2.3).
For each dummy process, q, the setting instruction of f(q) is replaced by the setting instruction of g(q).
The purpose of the rules set forth in (4.1) and (4.2) is to inhibit the execution of processes which are not needed, and to initiate the execution of processes which are needed. A network with one sink only is always inhibited upon arrival at it.

9 The overbar means logical complementation, e.g. 0̄ = 1 and 1̄ = 0.
10 Processes which terminate at a common vertex are called "brothers" here.
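The forward and backward rules of (4.1) and (4.2) amount to propagating the g and h bits through the graph. The following Python sketch is our paraphrase of one propagation step at a single node, not the compiled instruction stream the paper describes; the names and structures are assumed for illustration.

```python
# Sketch of rules (4.1)/(4.2) at one node: left-side processes feed a
# right-side set R through an OR node or an AND node.
# State bits: f = initiated, g = succeeded, h = failed or not needed.

def propagate(kind, left, right):
    """Apply success-forward, failure-forward, and failure-backward once."""
    if kind == "OR":
        if any(p["g"] for p in left):        # (4.1.1) success forward
            for r in right: r["f"] = 1
        if all(p["h"] for p in left):        # (4.1.2) failure forward
            for r in right: r["h"] = 1
    else:  # AND node
        if all(p["g"] for p in left):        # (4.2.1) success forward
            for r in right: r["f"] = 1
        if any(p["h"] for p in left):        # (4.2.2) failure forward
            for r in right: r["h"] = 1
    if all(r["h"] for r in right):           # (4.1.3)/(4.2.3) backward
        for p in left: p["h"] = 1

def proc():
    return {"f": 0, "g": 0, "h": 0}

A, B, D = proc(), proc(), proc()
B["h"] = 1                      # B fails
propagate("AND", [A, B], [D])   # D is inhibited, then A backward too
print(D["h"], A["h"])
```

Run to a fixed point over all nodes, this propagation inhibits every process whose success can no longer contribute to the sink, which is the behavior example (4.3) walks through.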

(4.3) Example:

FIGURE 11-The network for example (4.3)

I. Consider the time-slice B-D-C, as marked on the graph in Figure 11. Say that D fails there at that time. Then (4.2.2) inhibits G and H, (4.2.3) or (4.2.4) inhibits E, (4.2.2) inhibits J, and (4.2.3) inhibits F and B. This means that process B, already initiated, is told to quit, since its success is not necessary to produce "e." This leaves only A, C and I not inhibited.
II. Consider process I completed. Process J is now deemed useless, and is turned off, as are its "parents," F and H.

A computer system for automation of a laboratory
by P. J. FRIEDL, C. H. SEDERHOLM,
T. R. LUSEBRINK and C. J. JENNY
IBM Scientific Center
Palo Alto, California

INTRODUCTION
In the past, scientists have applied the digital computer in an off-line fashion to the problems of analyzing the voluminous data produced by their laboratory instruments. The use of computer techniques for digital filtering, peak finding, and spectral decomposition has greatly increased the rate at which experimental data can be analyzed. More recently, many instrument manufacturers and users have also begun to apply computers on-line to their instruments in an effort to increase the rate at which useful experimental data can be obtained. This paper describes a laboratory automation computer system which simultaneously supports multiple closed-loop experiments and data analysis programs.
Let us first distinguish between instrument automation and laboratory automation. In the former case a single computer, usually small, is devoted to a single instrument, or at least to one instrument at a time; whereas in laboratory automation, a group of instruments in a laboratory is automated using one central computer system. A discussion of the relative advantages of these two modes of automation follows.
There are several advantages to a dedicated computer. The most important of these is that of isolation. Each individual user prefers to concern himself with his own problems. He does not want malfunctions of other instruments or their interfaces to jeopardize his experiment. Programming considerations associated with his own instrument are sufficiently complex that he doesn't want to worry about other users' programming problems too. He is rightfully afraid of being forced to factor parts of his programming requirements into general purpose programs which serve the entire laboratory (programming by committee).

Immediate accessibility to the computer system can be guaranteed in a devoted computer configuration. This fact also encourages one to favor multiple computers, one per instrument; however, if a time-shared computer system could almost always be made available in a period of a few seconds, or at most one or two minutes, this would not be intolerable. The prospect, however, of having to schedule one's experiments and schedule the use of the computer facilities long in advance is very unpalatable to most users. In general, users prefer fewer facilities which are routinely available to larger facilities available by appointment only. Hence, the desire for immediate computer access tends to favor multiple instrument automation over laboratory automation.
Relative costs should also be considered. A computer system which is capable of expanding to handle the requirements of an entire laboratory will, of necessity, be more expensive, even in a minimal configuration, than a computer system capable of automating a single laboratory instrument. Therefore, when one is taking the first step toward laboratory automation, that is, the automation of a single instrument, there is a very strong tendency to automate that instrument using one small computer, since the initial cost is considerably lower. Quite understandably, an organization is reluctant to commit large capital and manpower resources to a project in which it has little or no experience. However, if the ultimate goal of completely automating the laboratory is considered, many of the considerations discussed below imply that a single, shared laboratory computer would provide more performance per dollar.
To realize all of the potential of automation, one must not only acquire data but also control the instrument during the data acquisition step,


process the data which has been acquired, standardize it, compare it against known parameters (e.g., compare an unknown spectrum against a file of spectra of known compounds for identification), and finally present the results in a form which is usable to the experimenter. Data acquisition and control steps can very often be adequately performed by a small computer. However, the data reduction steps, the comparison with data files, and the presentation of results in usable form often require much larger computer capabilities. One solution to this is to record the raw data, which has been acquired with a small stand-alone computer, on a recording medium such as magnetic tape or punched paper tape. This data may then be processed on a large computer when time is available. However, if the raw data is at all voluminous and magnetic tape must be used, the cost of the magnetic tape drive relative to the cost of the small computer can become very high. Hence, this approach tends to encourage one to minimize the quantity of data taken, often resulting in less precise results. Furthermore, the turnaround time on very large computer facilities still is much longer than the individual investigator would like to wait between epochs of his experiment. That is, if he could have the data from the last epoch back quickly, these results could be used to determine conditions for his next experiment.
A larger shared computer has several advantages for the automation of a laboratory. It can have sufficient core and processing capabilities to do a large portion of the data reduction required in most laboratories without resorting to a large central computer. By dynamic allocation of system facilities one may take advantage of the fact that most analytical and spectroscopic instruments have low duty cycles (i.e., much of the time is spent in sample preparation or with the instrument completely idle). This dynamic allocation of system facilities will allow one to reduce the total size of the required system below the sum total of each experiment's requirements. In addition, under such a system, background data reduction tasks may be carried out by utilizing excess capacity that exists at any moment.
Another advantage of a shared computer is that more sophisticated input/output devices are available to all users. The advantages of having access to large disk files, line printer, card reader/punch, and magnetic tape units are obvious. Only the largest devoted computer/instrument combination could justify the most modest of such devices.
As a result of the above considerations, we have designed and implemented a monitor system. Our goal was to provide a system which could serve a laboratory containing a group of analytical and/or spectrographic instruments operating in a dynamic mode, i.e., a research or development environment. This system, the Palo Alto Laboratory System (PALS), features complete program independence and complete system independence of each instrument from all others. An application program for one instrument, either at the time it is written or executed, need in no way take into consideration other programs running in the system simultaneously. Furthermore, application programs need not be modified as a result of a change in the total instrument configuration attached to the computer system. Each instrument may have its own individual data path to the core of the computer via data channels, so that there need be no sharing of the interfaces between various instruments. Data acquisition associated with the various instruments is completely asynchronous. Closed-loop control capabilities are provided with a response time of the order of 50 ms. The system dynamically allocates core, disk space, and I/O devices to provide maximum usage of these facilities.
This system uses an IBM 1800 computer. It is
possible to operate this system on a computer with
16K words of core; however, it is more useful if
the laboratory to be automated is sufficiently large
to support a 24K or 32K word machine.
A new device, a digital multiplexer, has been
designed and built, specifically to support this
laboratory automation system. This digital multiplexer channel provides up to 32 discrete data
paths between the laboratory and the core of the
computer. Via cycle stealing, it allows data to be
acquired from, or presented to, the individual instruments in a demand/response mode with a
minimum of computer overhead. The reduction in
computer overhead increases the allowable total
data acquisition rate from all instruments by
more than an order of magnitude. Data acquisition is in a demand/response mode which is of
great value in that it allows the instrument to
indicate when data is available rather than letting
the computer determine when data should be presented. An example of the usefulness of demand/
response data acquisition is in acquiring data
from an infrared spectrometer which has automatic scan suppression. Since the scan rate is a

Computer Systems for Automation of a Laboratory
function of the first derivative of the absorption, data acquired at equal intervals of time would not be at equal increments in wavelength or wave number. However, with a demand/response interface, the instrument could be run with scan suppression, and demands to take data could be made by the instrument at equal wavelength or wave-number increments.

The monitor system
The PALS monitor system is made up of a
group of relocatable modules which are initially
stored on the disk. Modules either serve input/output devices or are responsible for initiating program loading, linking to the Job Control Language (JCL), controlling multi-task communications, and monitoring time-slicing operations.
The only portion of the system which is not relocatable is a 200-word area in low core which contains information related to hardware wiring, such as an interrupt transfer vector and the word-count and address registers for the various subchannels of the multiplexer.
The various modules making up the system
communicate with each other by means of task
control blocks and task-complete control blocks.
When one module has a task to be performed by a
second module, a task control block is generated
by the originating module. The address of that
block is passed to the second module which performs the task as soon as it is able. When the
task has been completed, the second module generates a task-complete control block which is returned to the originating module. The task-complete control block indicates whether the task has
been completed successfully or unsuccessfully, and
the reason for any unsuccessful completion if
appropriate.
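The handshake between modules can be sketched in outline. The Python below is our illustration of the scheme, not PALS code; every name in it (TaskControlBlock, ServerModule, and so on) is an assumed stand-in for the 1800 assembler structures the paper describes.

```python
from collections import deque

class TaskControlBlock:
    """Request generated by an originating module for a second module."""
    def __init__(self, origin, operation):
        self.origin = origin        # module that will receive the task-complete block
        self.operation = operation  # callable standing in for the requested work

class OriginModule:
    """Issues tasks and collects the returned task-complete control blocks."""
    def __init__(self):
        self.completions = []

class ServerModule:
    """Performs tasks passed to it, as soon as it is able."""
    def __init__(self):
        self.queue = deque()

    def accept(self, tcb):
        self.queue.append(tcb)      # only the address of the block is passed

    def run(self):
        while self.queue:
            tcb = self.queue.popleft()
            try:
                tcb.operation()
                status = ("ok", None)
            except Exception as err:            # unsuccessful completion,
                status = ("error", str(err))    # with the reason attached
            # return a task-complete control block to the originator
            tcb.origin.completions.append((tcb, status))

origin = OriginModule()
server = ServerModule()
server.accept(TaskControlBlock(origin, lambda: None))    # will succeed
server.accept(TaskControlBlock(origin, lambda: 1 / 0))   # will fail
server.run()
print([status for _, status in origin.completions])
# → [('ok', None), ('error', 'division by zero')]
```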
Two monitor modules, a real-time module
(foreground) and a non-real-time module (middleground) are responsible for the execution of
all users' programs and for communication between users' programs and the rest of the system.
A second non-real-time module (background) is
not part of the system's modules, but is read from
disk upon request.
Input/output device modules service disk files,
all the subchannels of the digital multiplexer,
printer, card reader, the typewriter terminals, etc.
There may be several modules stored on the disk
servicing the same input/output device at various
levels of complexity. That is, one may be a very
sophisticated package, having many operations included in it, while another services the device at a minimal level, having only one or two very simple operations implemented in it.
The system is configured by reading in a pack
of control cards which indicate which modules
should be loaded and link-edited together. The
reconfiguration of the system, depending on the
various users' needs, is very easy and requires
only a few seconds. At this time the operator may
choose, for example, whether he wishes to have
full printer support, requiring over a thousand words of core, or minimal printer support, requiring only a couple of hundred words of core. It is also possible to eliminate devices which are
not to be used. After this deck of cards is read
in, the system is cold-started with a one-card
cold-start routine which loads and link-edits the
appropriate system modules in low core.

Dynamic core allocation
All of core not taken up by the system is divided into pages of 512 words and all of these
[FIGURE 1-Core allocation map: relocatable modules making up the system, system subroutines, and variable core in 512-word pages, with a partition at the high end of variable core for the assembler]
pages are put into a pool of free pages (Figure 1). The loader module is responsible for allocation of the pages of variable core. When a task is given to the loader to load and set a user's program into execution, the loader places that user's program into any pages of core which are presently available.

Fall Joint Computer Conference, 1968

These pages need not be contiguous, since the loader takes care of altering the relocatable addresses within the user's program at load time, so that it may execute out of non-contiguous pages of core. In addition, the first and last words on each page of a user's program and all words which are not modified during execution of that program are storage-protected at load time. This provides a high degree of protection of one user's program from another.
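The page-pool bookkeeping can be sketched as follows. CorePool and its method names are hypothetical; the relocation of addresses, done by the real loader in 1800 assembler, is only noted in a comment.

```python
PAGE_SIZE = 512  # words per page, as in the paper

class CorePool:
    """Pool of free 512-word pages; pages handed out need not be contiguous."""
    def __init__(self, num_pages):
        self.free = set(range(num_pages))

    def load_program(self, length_words):
        needed = -(-length_words // PAGE_SIZE)   # ceiling division
        if needed > len(self.free):
            return None                          # request must wait for free pages
        # any available pages will do; the loader would now relocate the
        # program's addresses so it can execute out of these pages
        return [self.free.pop() for _ in range(needed)]

    def release(self, pages):
        self.free.update(pages)                  # pages go back to the pool

pool = CorePool(num_pages=16)    # e.g., 8K words of variable core
a = pool.load_program(1300)      # needs 3 pages
b = pool.load_program(600)       # needs 2 pages
print(len(a), len(b), len(pool.free))
# → 3 2 11
```

Because pages are handed out from an unordered pool, two programs loaded in succession may occupy interleaved pages, which is exactly why load-time relocation is needed.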
Systems subroutines, such as floating addition, multiplication, division, sine, etc., are shared among individual users' programs. These subroutines are stored on the disk in groups which are assembled into a block that will occupy one page of variable core. At load time, when the loader encounters a call to a systems subroutine from a user's program, the loader checks whether the page on which that systems subroutine exists has previously been loaded into core. If it has, the loader links the program being loaded to that systems subroutine. If the systems subroutine has not previously been loaded, the loader loads the page which contains the called systems subroutine from the disk and links it with the program being loaded.
When a program has completed its execution and has called EXIT, a task is given to the loader to return all the pages associated with that user's program to the pool of free variable core. The loader clears all the storage-protection bits from these pages and interrogates to see if the terminating program was the only program using any of the systems subroutines. All systems-subroutine pages which are not being used by other programs are also returned to variable core. However, those subroutine pages which contain subroutines being used by other application programs still in execution are not affected.
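The interrogation of whether a terminating program was the last user of a subroutine page amounts to reference counting per shared page. A minimal sketch under that assumption (class and page names are illustrative, not from PALS):

```python
class SharedPages:
    """System-subroutine pages stay resident while any program uses them."""
    def __init__(self):
        self.refcount = {}    # page name -> number of programs linked to it

    def link(self, page):
        # Load the page from disk only if no resident copy already exists.
        loaded = page not in self.refcount
        self.refcount[page] = self.refcount.get(page, 0) + 1
        return loaded

    def exit_program(self, pages):
        """Called at EXIT; returns the pages freed back to variable core."""
        freed = []
        for page in pages:
            self.refcount[page] -= 1
            if self.refcount[page] == 0:    # last user has terminated
                del self.refcount[page]
                freed.append(page)
        return freed

shared = SharedPages()
shared.link("FLOAT_MATH")                    # program 1: page loaded from disk
shared.link("FLOAT_MATH")                    # program 2: linked to resident copy
print(shared.exit_program(["FLOAT_MATH"]))   # program 1 exits: page still in use
print(shared.exit_program(["FLOAT_MATH"]))   # program 2 exits: page freed
# → []
# → ['FLOAT_MATH']
```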


Disk allocation

Many of the problems encountered in a laboratory require random access to individual data or groups of data within large files; hence the file structure on a peripheral disk is of considerable importance. It is usually impossible to define the ultimate size of a data file before it is generated, so that dynamic file allocation is highly desirable. Disk files in the PALS system are organized as logical tapes which are automatically expanded or contracted according to the present length of the data table written thereon. These logical tapes are defined by name and allocated by the system one cylinder of the disk file (2560 words) at a time. Such files may be used in much the same way as a physical magnetic tape; that is, one can read, write, backspace any number of words, rewind, etc. In addition, the inherent advantage of the disk file is preserved, i.e., reading, writing, or altering single words anywhere on the logical tape in the direct-access mode is still possible. The system allows this by keeping an internal word counter which points to the location where the last access was made. Alteration of this pointer to any value is easily done by giving an instruction to the system. If the data table exceeds the length of the first cylinder, the system automatically adds cylinders to the logical tape, restricted only by the maximum value the word-count pointer may attain, 32767 words. The tape may be closed at any value to retain the file for future use. If the length of the tape upon closure is less than a previous length, the excess cylinders are returned to the system, providing automatic contraction of the file. For example, if a previous data file required six cylinders and the present one only requires three and a half cylinders, the fifth and sixth cylinders will be returned to the pool of empty cylinders for reallocation by the system.
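The expand-on-write, contract-on-close behavior can be sketched as below. LogicalTape and its methods are invented names, and the sketch tracks only cylinder bookkeeping, not actual disk I/O; the worked example reproduces the six-cylinder-to-four-cylinder case from the text.

```python
CYLINDER_WORDS = 2560   # one disk cylinder, per the paper
MAX_WORDS = 32767       # maximum word-count pointer value

class LogicalTape:
    """Disk file that grows and shrinks one cylinder at a time."""
    def __init__(self, name, free_cylinders):
        self.name = name
        self.free = free_cylinders            # shared pool of empty cylinders
        self.cylinders = [self.free.pop()]    # start with a single cylinder
        self.pointer = 0                      # internal word counter

    def write(self, nwords):
        end = self.pointer + nwords
        if end > MAX_WORDS:
            raise ValueError("word-count pointer limited to 32767")
        while end > len(self.cylinders) * CYLINDER_WORDS:
            self.cylinders.append(self.free.pop())   # automatic expansion
        self.pointer = end

    def close(self):
        needed = max(1, -(-self.pointer // CYLINDER_WORDS))
        while len(self.cylinders) > needed:          # automatic contraction
            self.free.append(self.cylinders.pop())

pool = list(range(100))
tape = LogicalTape("RUN1", pool)
tape.write(6 * CYLINDER_WORDS)          # previous file: six cylinders
tape.pointer = 0                        # direct-access pointer alteration
tape.write(int(3.5 * CYLINDER_WORDS))   # present file: three and a half
tape.close()                            # fifth and sixth cylinders returned
print(len(tape.cylinders))
# → 4
```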
Multitasking

Allocation of the CPU is implemented using the multilevel interrupt structure of the 1800. Therefore, if a task is being processed and a task of higher priority is initiated, the low-priority task is suspended and the higher-priority task is processed immediately.
The entire PALS system is oriented toward performing a variety of tasks, many being associated with input/output. In general, it takes 12 words of a user's program to specify a task for the system. These tasks are accepted by the system and performed as soon as possible. Multiple tasks may be queued for a single input/output device; e.g., the line printer may have a current task in execution while five other tasks are waiting for the line printer to be free. A given application program may have several tasks outstanding simultaneously. For instance, a given application program may instruct the system to (1) acquire a block of data from a given instrument, (2) read a card in the card reader, (3) print a line on the line printer, (4) light a light at the user interface, and (5) write a block of data on a logical tape. These tasks would be given to the system sequentially; but all five tasks would be set into execution before the first one was completed.
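The per-device queueing just described can be sketched as follows (illustrative names; tasks are modeled as callables so the sketch runs, and completion is driven by an explicit service loop rather than hardware interrupts):

```python
from collections import deque

class Device:
    """An I/O device with a current task plus a queue of waiting tasks."""
    def __init__(self, name):
        self.name = name
        self.pending = deque()

    def issue(self, task):
        self.pending.append(task)   # control returns to the program at once

    def service(self):
        """One completion cycle: run the task at the head of the queue."""
        if self.pending:
            self.pending.popleft()()

printer = Device("line printer")
done = []
for i in range(6):                  # one active task plus five waiting,
    printer.issue(lambda i=i: done.append(i))   # as in the text's example
print(len(printer.pending))         # all six queued before any completes
while printer.pending:
    printer.service()
print(done)
# → 6
# → [0, 1, 2, 3, 4, 5]
```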

If the user's program wishes to initiate an I/O operation, it does so by giving specifications of the task to the user's monitor, which in turn generates a task control block and gives it to the system module responsible for executing that task. As soon as the task has been started or queued, and before the task is completed, control is returned to the user's program. From this point the user's program may send out additional tasks to the system, each time control being returned to the user's program. Two of the arguments in a specified task are entry points to the user's program. One is associated with a normal return address, and one is associated with an abnormal return address. If the task is completed successfully, the user will regain control at his normal return address, whereas if the task is completed unsuccessfully, control will be returned to the abnormal return address.
A priority list for allocation of CPU cycles is given in Figure 2.

[FIGURE 2-Priority list for CPU cycle allocation: cycle stealing; machine check; answer interrupts / start input/output; real-time users monitor; loader; non-real-time users monitor; housekeeping; job control language; assembler; wait loop]

The highest priority is cycle stealing for the transfer of data between the various I/O devices, including the instrument interfaces, and memory.
The next highest priority is for servicing machine check conditions.
Below that, hardware interrupts are answered and I/O devices are started. Each of the system modules which is responsible for an I/O device may queue, to an indefinite length, tasks to be performed in conjunction with its particular I/O device. When a hardware interrupt occurs, indicating the completion of a task or a subtask, the next task or subtask is initiated as a portion of the interrupt handling routine. In addition, when a task has been completed as a portion of the interrupt handling routine, a task-complete control block is returned to the system module which originally issued the task.
Next is the real-time user's monitor under which all real-time programs are executed. A user's program in execution under the auspices of the real-time user's monitor may be in one of three states: passive, queued, or active (Figure 3).

[FIGURE 3-Flow diagram of monitor algorithm: transitions among the active, queued, and passive program states]

When a program originally goes into execution it is placed at the bottom of the queue. As programs are taken from the top of the queue and placed in execution, the program at the bottom of the queue works its way up through the queue. Finally, the program is taken from the top of the queue into active status and the user's monitor transfers control to the user's program. The user's program maintains control for doing processing.
After giving out a number of tasks and having done a certain amount of processing, one of two things can happen which will take a user out of the active status. The first possibility is that he has completed all processing and has given out all the tasks he desires at that time and can do nothing more until one of his I/O tasks has been completed. At this point the user executes a relinquish operation which then causes the user's monitor to remove him from the active status and place him in the passive status. Control then passes to the next
program in the active queue. The user in the passive status does not relinquish core, only his position in the active queue.
The other possibility is that of a "time-out." Each time a user's program is put into execution, the user's monitor sets an interval timer for a nominal five milliseconds, and at the expiration of this time the user's monitor takes control away from the user's program, saves all status and registers, and puts the program from the active status to the bottom of the queue (time slicing). The user then must work his way up through the queue to the top again to resume processing. Using five-millisecond time slices, it has been our observation that a real-time user usually relinquishes before he is timed out.
When a user is put into the passive status, the only way he can return to the active queue is as a result of a task-complete control block being returned to the user's monitor, indicating that one of the user's I/O operations has been completed, either successfully or unsuccessfully. At this point the user's monitor takes the user's program out of the passive status and places it at the bottom of the active queue. A guaranteed response time, i.e., the time from completion of an I/O operation until the time that a user's program may act upon that completion, may be implemented by limiting the number of users allowable in the real-time execution and by implementing five-millisecond time slices. If, for instance, the real-time user's monitor limits the number of users to five, a worst-case condition would be four users in the queue. When an I/O operation has been completed or a hardware interrupt occurs, the maximum length of time necessary to answer would be five milliseconds for each of the people in the active queue, plus not more than 20 milliseconds overhead associated with the higher-priority operations (if overhead were 100 percent). This would mean a total of 40 milliseconds maximum from the time that the interrupt condition occurred until the time that the user's program gained control of the CPU for a five-millisecond time slice. Note that interrupt servicing and queue manipulation do not interfere with on-line data acquisition, which proceeds via CPU cycle-stealing.

Immediately below the real-time user's monitor in priority is the module controlling loading of programs and dynamic allocation of core. It was given a lower priority than the real-time monitor in order to avoid interference with operational real-time programs. Because the loader module can accept tasks from other modules, operations such as having a real-time program load a non-real-time program, and vice versa, are possible. A typical example of load time is 1.7 seconds for a seven-page program, during which time all of the operations previously discussed under core allocation are performed. This overhead occurs just once at the time each program is initially loaded, and it is negligible compared to the manual set-up times needed to ready an instrument or experiment.
Next is the non-real-time user's monitor. Programs executing in a non-real-time status are executed under the control of this monitor, which has the same algorithm for scheduling time to the various users' programs as does the real-time monitor, except that the time-out period in the non-real-time monitor is a nominal one hundred milliseconds.
Below the non-real-time monitor in priority are various housekeeping modules. These modules are in general responsible for code conversion and spooling operations between various I/O devices. For instance, a user's program may give a task to the system to print an entire logical tape. This task would be executed by a housekeeping module, which in turn would give out subtasks to read sectors of the logical tape and to print the individual lines. This module would also be responsible for the code conversion from EBCDIC to printer code.
Below the housekeeping modules in priority is the module which deals with job control language. This module offers a fairly high level of conversational interaction between the operator of the system and the system. From the console typewriter the operator may load a program or set a program into execution, may cancel a program, may get a dynamic dump of a program while it is in execution or a dynamic dump of any area of core, may get a dump of a logical tape, may define or scratch a logical tape, or may get a status of the entire system.

The lowest priorities are devoted to the language assembler and a wait loop.

The PALS language

Language requirements for a laboratory automation system include the ability to easily program multiple on-line data acquisition and control tasks as well as off-line data reduction or
analysis tasks. Data acquisition and control tasks
demand programming of a number of input/output interactions with sensor-based devices. Logical operations are required for the various control
functions. Past experience indicates that data reduction and analysis tasks are best served by the
FORTRAN or PL/I type of language.

The approach chosen for the PALS system was a macro language with statements natural to the laboratory environment. There are statements for easy handling of sensor-based input/output (e.g., multiplexer channel commands, analog inputs, and a series of special logical statements used to set up bit patterns for control of instrument interfaces, etc.). FORTRAN-like statements are available for data analysis, and they require very little relearning for users familiar with FORTRAN.
Some examples of the various types of PALS statements may serve to indicate the salient features of the language.

I/O OPERATIONS

SUBCHANNEL OPERATE
SCOP  SC1, 3, DTNAM1, NRET, ARET
Write the contents of table DTNAM1 out over subchannel 1 in demand/response mode (operation code 3). After successful completion of the operation, return control to normal entry point NRET; if not successful, to abnormal entry point ARET.

READ LOGICAL TAPE
DRTP  WC, NAME, BUF
Read the number of words contained in location WC from logical tape NAME and place them in core starting at location BUF.

READ CARD INTO CARD BUFFER
CRDR

CONVERSION

CARD BUFFER TO INTEGER VECTOR
CBIV  N, M, VEC, AI
The contents of address N is the number of card columns to be used for each element; location M contains the number of elements per card; VEC is the starting address of the vector; and location AI contains the subscript of the first vector element to be filled by the present card buffer.

INTEGER TO EXTENDED PRECISION FLOATING POINT
FLOT  I, E
Convert the integer at location I to an extended-precision floating point number at location E.

INTEGER TO EBCDIC
IEBC  I, CH, EBCDIC
Convert the integer at location I to CH number of characters starting at location EBCDIC.

MATHEMATICS

STANDARD FLOATING ADD
FADD  A, B, C, ERR
Add the standard-precision floating point numbers located at A and B and put the result in location C. ERR is the entry point of a user error-correction routine.

MULTIPLY VECTOR ELEMENTS
IMPX  A, I, B, J, C, K, ERR
Perform a floating point multiply of the Ith element of the vector located at A by the Jth element of the vector located at B and place the result in the location of the Kth element of the vector located at C. Branch to ERR if any error.

SQUARE ROOT FUNCTION
FSQT  A, B, ERR
Take the square root in floating point of the number at A and place the result in location B. Branch to ERR if any error.

The macro language permits programmers to mix assembler language with macro statements. It is at the user's discretion to define new statements to meet the needs and level of programming experience of the scientists writing applications.
The PALS macro processor is treated somewhat as if it were an application program, except that it is executed in background mode. It is not paged but is loaded into a partition at the high end of variable core. When the loader is asked to load the assembler, the request is queued until the 12 pages at the high end of core are all free. At that point, the limits of variable core are lowered about 6000 words from the top end of core and the assembler is loaded into this partition. When the assembler has completed its operation, its core block is returned to variable core. This means that several minutes may be required from the time it is requested to load the assembler until the assembler is actually loaded. However, since this system is built on the basis that assembly should be a background operation, this allocation scheme appears to be satisfactory.

Application programs

A joint study was carried out with Varian Associates in order to investigate and demonstrate the usefulness of a time-shared, laboratory automation computer system. A simulated research
laboratory, containing an M-66 medium-resolution mass spectrometer, an A-60 NMR spectrometer, two Aerograph gas chromatography columns, and a Statos recorder, was linked to an on-line IBM 1800 computer (Figure 4). Each of the instruments was interfaced with the computer via prototype Varian interfaces, an example of which is diagrammed in Figure 5.
Each instrument interface provided facilities which allowed the spectroscopist to have complete control over his use of the computer from his remote spectrometer. These facilities consisted of backlighted push buttons, lights, and thumb switches. Prompting lights, operated under program control, served to indicate the present status of the operating program. Functions executable from the remote console included loading and aborting application programs, controlling branch points within the application programs, and entering parameters during the execution of programs. In general, the recorder associated with a particular instrument was used as the graphical output device, the Varian Statos recorder being used for the two chromatographs. Tabular reports were printed on the shared line printer.
A disk-resident master program was associated with each instrument. When an experiment was to be performed, the master program was loaded into the core memory by depressing a


pushbutton on the interface. The function of the
master program was to initialize the spectrometer
and to provide a choice of the available application programs to the user. After the master program was loaded, it armed appropriate pushbuttons for program selection. This was indicated
by lights behind the several pushbuttons. When
[FIGURE 4-An automated analytical laboratory]

[FIGURE 5-Typical control panel of instrument interface: program-select pushbuttons 1-4, Enter Integer, Enter Fraction, Select, Error, Program Stop, and Manual Entry indicators and buttons, and a Scan Count thumb-switch register with binary weights 256 through 1]

DATA MANIPULATION

TRANSFER VECTOR ELEMENT
TVEI  A, J, B, K
Transfer integer element AJ into integer element BK.

INCREMENT VARIABLE
INC  I, J
Increment the integer at I by the integer J, where -128 ≤ J ≤ 127.

FIND MAXIMUM ELEMENT OF INTEGER VECTOR
IMAX  N, YDATA, I, YMAX, IMAX
Search N elements of the vector starting with the Ith element of vector YDATA and place the maximum value of YDATA and its index I into locations YMAX and IMAX respectively.

PROGRAM SWITCHING

IF
IF  I, J, A, B, C
Branch to A if the integer quantity (I-J) < 0, to B if = 0, and to C if > 0.

COMPUTED GO TO
GOTO  A, B, C, D, POINT
Branch to A, B, C, D if location POINT contains 0, 1, 2, 3 respectively.

REPEAT LOOP
REPT  LOC, N, I, K
Execute all statements starting at location LOC through the REPT statement the number of times contained in N. Each time through the loop, increment the integer at I by the number K.


one of these programs was selected, it was loaded and the master program would exit (release its core). After the subprogram completed its function, and before it exited, it requested that the master program be reloaded. This mode of operation minimized the core requirements of a single user.
Several application subprograms, which performed some of the more elementary instrument
functions, were jointly specified and written by
Varian and IBM. With the A-60 NMR it was possible to: acquire data in a demand/response mode
by sweeping the magnetic field; time average the
data by repeating the scans any desired number
of times; digitally smooth the acquired data; and
control the magnetic field homogeneity. The mass
spectrometer programs acquired data in a demand/response mode; replotted any desired portion of the data on the instrument recorder; found
all peaks in the spectrum and normalized their
intensities; and found the five highest peaks and
identified the compound by comparison with a
table of spectra of known compounds on the disk
file. The gas chromatography programs acquired
data, detected and resolved peaks, calculated their

areas and wrote a report giving retention times
with peak areas.
Actual experience with the above system was quite good. Programs were easily and quickly written using the macro language. No noticeable interference between users occurred, even when all instruments and data processing I/O devices were running simultaneously.
In conclusion, we believe that we have been able to produce a system for use in laboratory automation which provides each individual user the isolation, the availability, the real-time control responsiveness, and the price per instrument associated with multiple computers, one per instrument. In addition, this system provides powerful input/output devices, disk files, and a large amount of support for the I/O devices and the files, which is not normally found on a small dedicated computer. Each user, then, has the impression that he has a large computer attached to his instrument and completely at his disposal. The 1800 PALS program, excluding the instrument application programs, is available from the IBM Type III library (PID #5778).

INSTRUMENTATION CONTROL

READ DATA AND WRITE LOGICAL TAPE
RDLT  SCPNT, WCPNT, LTPNT, NPRA, APRA
From the subchannel number contained in SCPNT, read the number of words contained in WCPNT (1 ≤ WC ≤ 32767), and store on a logical tape whose name resides in LTPNT. Return control to normal entry point NPRA for successful completion of the read, and to APRA for abnormal conditions.

ALTER SUBCHANNEL BIT
ASCB  SC, BIT, N
Alter one bit (number of bit contained in location BIT) of a subchannel register (number of subchannel contained in location SC) to a new value N = 0 or N = 1.

CONVERT THUMB SWITCH TO FLOATING POINT NUMBER
CTTF  F
Read a set of manual switches on the instrument interface and convert the setting to a floating point number at location F.

ASSIGN PROCESS INTERRUPTS
ASPI  PINO, ENT
Assign a program entry point ENT to the process interrupt whose number is contained in location PINO.

Real time time sharing, the desirability and economics

by B. E. F. MACEFIELD
University of Oxford
Oxford, England

INTRODUCTION
The thoughts expressed in the following contribution
have arisen largely as a result of work at the University
of Oxford, Nuclear Physics Laboratory. In Oxford, we
have had a demand for two real time users and one
background user. The achievement of this goal is described in the first reference.1 It has been clear, however, that many institutions have a similar problem, and it was the fact that the solution was not impossibly difficult that prompted this plea for more efficient use of the large capital investments represented by on-line computers.
I think most of us would agree that the battle of the stored-programme computer versus the fixed-wired kicksorter is over in the field of nuclear data collection. The victory has certainly gone to the stored-programme device. But once the euphoria had died down many people, the author included, have found that a certain amount of fast front-end hardware has been necessary to supplement the activities of the central processor.2 This requirement will probably be less prominent in the future with the current increase in the number of sub-one-microsecond cycle-time computers. I might point out that front-end hardware is usually used to make decisions on the admissibility of the incoming data faster than the computer.
The proposition

It is the utilization of the computer used in such a
data col1ection system that I wish to exanline. The increasing speed of even the smallest computers has given
rise to some interesting consequences. In very general
terms the speed of processing the incoming data in a
real time environmenl. is inversely proportional to the
memory cycle time. The add one to memory or for very
sophisticated data the programme manipulation time
are both very obviously dependent on cycle time. As a
result we can get to a stage where the C.P.U. is infre1061

quently used because the data is serviced in a time considerably shorter than the time between events. Let us
examine this further. Assume we have a 1 ,usec cycle
time machine.
The add one facility from an external source should
not take longer than 2 µsec, giving a mean rate of 500,000
events/sec. Now there are few detectors and no analogue to digital converters that will perform at this rate.
The best the latter can do is probably 50,000 events/sec
at the present time. Thus for a physically realisable configuration we can see 10% utilization in the simple case.
For the complex data system we may require programmes which take, say, a millisecond to sort an event.
Now complicated events usually occur at very slow
rates, typically 100-200/sec, so our multiparameter
data analysis takes at most 20% of the time. Even
allowing for a factor of 2 upwards in these estimates we
see a sizeable fraction of the total processor time is
unused.
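The arithmetic behind these estimates can be set out in a few lines. The following sketch (Python; all figures are the ones quoted above) computes both utilization fractions:

```python
# Back-of-envelope C.P.U. utilization for the two cases discussed above.
# All rates and times are those quoted in the text.

# Simple case: "add one to memory" storage of an event.
service_us = 2.0                          # at most 2 us per stored event
max_rate = 1e6 / service_us               # events/sec the machine could absorb
adc_rate = 50_000                         # best A.D.C. rate quoted above
simple_util = adc_rate * service_us / 1e6 # fraction of C.P.U. time used

# Complex case: programmed sorting of multiparameter events.
sort_us = 1000.0                          # ~1 ms to sort one event
event_rate = 200                          # complicated events at 100-200/sec
complex_util = event_rate * sort_us / 1e6

print(f"simple storage uses {simple_util:.0%} of C.P.U. time")
print(f"multiparameter sorting uses {complex_util:.0%} of C.P.U. time")
```

Even doubling both fractions, as the text allows, leaves more than half the processor idle.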
But, you will be asking, what about the display and
other user requirements? These user requests, often for
data manipulation, are very infrequent on a computer
time scale and in consequence need not affect the overall
argument even though they may take a substantial
time to complete. The display, however, is another
matter.
Most people working in real time with computers require some form of immediate access to the data. The
most usual form for this to take is a visual display. I
would distinguish, very obviously, two types: the
C.P.U. controlled variety and the direct access to memory type. The former obviously ties up the C.P.U. continuously for display and in my opinion is so bad an investment (regardless of its price) as to be completely
disregarded from the current discussion. The latter
comes in forms ranging from simple analogue scopes to
sophisticated vector and character generators. The
major problem about C.P.U. controlled scopes is that
once installed, economic arguments will be advanced


Fall Joint Computer Conference, 1968

forcing them to be used and in consequence be a large
inhibiting factor on the application of the computer.
(This comment may be toned down somewhat by the
advent of large diameter storage tubes and the fact that
it is a comparatively trivial modification to convert such
a device to a direct access system.) But, given we have a
scope that will display a frame without C.P.U. intervention, save for start up, then it is unlikely that the
total increase in used time would be another 5%, giving
at most 40-45% total time used.
Having dealt in general terms let me give an example
from our work in Oxford. We have several A.D.C.'s in
use, each connected to front end hardware.2 Figure 1
shows the time left over when they are in operation at
the event rates indicated. Two sets of curves are shown,
a software (API) storage system and hardware storage
(DES). We see for the software case less than 10%
time left over if 6 A.D.C.'s are working at rates of 2000/sec
each. This, however, is not a physically sensible rate
for such a system since Figure 2 shows that at this rate
A.D.C.'s 1 and 6 have vastly different storage efficiencies
as seen by the outside world, making usage very difficult. The solution is thus either to reduce the input
rate to get the efficiencies more nearly equal, say to
1000/sec, when the time left free is a usable 30%, or
build front end hardware and get similar efficiencies and
30% free time at 20,000/sec. The derivation of these
curves and explanation of the different shapes of the
DES and API efficiency curves is set out in reference 3.
If either of the above approaches were adopted we obtain some free C.P.U. time and we must therefore ask
why this time is not used.
Nevertheless the question needs some justification
even though once put it seems obvious enough.

Figure 1-Free C.P.U. time for various input rates on
each A.D.C.

Figure 2-Efficiency of each A.D.C. for various input rates

Modus operandi

When buying a computer the expenditure must be
justified to some funding body, as in the purchase of an
accelerator facility. In the latter case, however, one of
the earliest questions is: what is the percentage utilization? If we asked for an accelerator to be used only 50%
of the time, it would be completely unacceptable to the
financial experts and physicists alike. So why do we not
get such questions from the computer users who have
most to gain from efficient machine use?
Another point that arises is that a given institution,
be it a small laboratory or a government enterprise, buys
the computer it can afford and this, I propose, results in
the computing expenditure being a constant fraction of
the total expenditure on the project. In other words all
users have the same percentage to gain, be they large or
small. So on this premise we should be hearing a clamour
for time sharing on even the smallest machines, but we do
not. Why not? Because we have to some extent been
brainwashed into thinking that only with the largest
installations is this at all possible.
Where do we start in attempting to achieve this
utopia of 100% computer utilization? We start by delimiting exactly our goals. We are not asking for 10-20
users but thinking in terms of 1-3 or maybe 4. In Oxford, we
already have 3 simultaneous users on a 24k PDP-7; why
would others not do the same? The one single problem
with most on-line nuclear structure computer systems is
the inability to get at the computer after taking the experimental data. Obviously, therefore, the extension of
the users from 1 to 2 produces an infinite improvement.
To attain this end we have to split the machine in
some way, either by swapping programmes in core or
memory allocation. In Oxford we have a mixture of
these approaches:1 memory allocation to separate background from foreground and swapping for the two on-line
users. At this stage we ran into the first problem.
The system software as then supplied for our machine
was not designed for multiple users and moreover was
not organized on a modular concept using bulk storage
peripherals. (I would point out that in the latter case
this has since been changed with the PDP-9.)
In a small system it is not essential to get two background users but the background system must function
in the presence of unusual device flag combinations.
This implies a flag mask to allow only specified flags to
cause interrupts and an efficient automatic priority interrupt system (API). This should be an interrupt chain
with successive channels having non-consecutive memory locations. In this way the low priority channels can
be assigned to the background system and the higher
channels to the foreground system in a completely different and possibly protected memory area. (To my
knowledge no company has such an API system.)
The concept of memory protection necessarily entails
some mode of trapping the incorrect memory references,
obviously to a location different from the API channels
and programme interrupt. Since we are in a small
system we will need to share peripherals, on the grounds
of expense, and thus IO instructions need to be trapped
also. Along with IO trapping we will need a supervisor
programme to oversee this concurrent peripheral use.
This programme need not be extensive; we have an
800-word programme to share Dectape, punch and
plotter.
We have now got round to the question of IO. In a
real time environment it is evident that all synchronous
devices must have a higher priority than random data
from a nuclear experiment. By synchronous I include
magnetic tapes of all types together with discs, but not
most paper tape readers, which stop between characters.
To guarantee this priority sequence we must have all
such synchronous IO by data channel.

Necessary hardware configuration

Perhaps it might now be possible to see what we are
asking of the computer manufacturers:
(i) A versatile API system
(ii) An effective IO trap system
(iii) Synchronous IO by data channel
(iv) Flag mask.
Hopefully some may conclude that many machines
seem to have most of these facilities already. I think this
is true, so why are there not more foreground-background systems in existence, for given the above four
requirements, time-sharing in that context is straightforward.
There seem to be three possible reasons for the present state of the art:
(i) the expense of memory for a memory allocation
system
(ii) the cost of fast swapping discs
and (iii) we are persuaded that we need to buy a bigger
machine to achieve these results.
The costs of the first two items are continuously falling and from this point of view we may expect an increasing number of multi-user applications. In the third,
we, as computer users, must be more critical of what is
provided by the manufacturers and not succumb to their
advertising.

Economics

Of the three major hardware requirements the ones
most likely to be missing are the split API system, IO
trap and flag mask. Given that the machine runs asynchronously, the cost of putting these into an existing machine is unlikely to exceed $2400. The duplication of the
smallest system is unlikely to be less than $15,000. The
latter cost obviously depends on the system while the
former is almost independent of the system context.

CONCLUSION

In conclusion then, it is evident that to achieve more
than 50% real time utilization in nuclear structure work
we require more than one real time user. It is also evident that this can be done with a minimum of effort
given the right hardware configuration, which is not so
very different from that existing on most small to
medium processors. It is thus much more economic to
satisfy a demand by a small extension to an existing
machine than to duplicate the system.

REFERENCES
1 G L MURRAY B E F MACEFIELD
  Nucl Inst & Meth 51 (1967) 229
2 G L MURRAY B E F MACEFIELD
  Nucl Inst & Meth 57 (1967) 37
3 G L MURRAY B E F MACEFIELD
  Nucl Inst & Meth 62 (1968) 122

A modular analog digital input output system
(ADIOS) for on-line computers
by R. W. KERR, H. P. LIE, G. L. MILLER,

and D. A. H. ROBINSON
Bell Telephone Laboratories, Incorporated
Murray Hill, New Jersey

INTRODUCTION

The most important single feature that allows a
computer to be employed in a broad range of calculations is the fact that one can, by programming, in effect restructure the machine to perform the desired computational task.
It is not possible to retain the same degree of
flexibility in on-line systems because of their
need to be connected to specialized external hardware. Primarily for this reason the majority of
on-line computer systems that have been constructed in the past have been designed to perform a pre-defined class of specialized operations.
This situation is analogous to that which existed
before the invention of the stored program machine, when a computing device would be constructed to perform each new special task.
The system described here is the result of
an effort to obtain a reasonable degree of flexibility
in on-line computer controlled environments and
is based on careful considerations of the factors
that tend to limit such flexibility. The consequences of such problems in previous systems have
been evidenced by difficulties of adding or reconfiguring hardware and interface equipment. Such
difficulties have reduced the potential versatility
of many existing systems, in which the effort
required to implement useful changes is uneconomic and such changes are therefore only rarely
made.
It is possible, however, by the use of appropriately designed modular units interconnected
by a common two-way analog and digital data
bus to obtain the desired degree of flexibility and
power. The next section of this paper outlines
the general considerations involved in the design
of such data-bus systems, while the remainder of
the paper describes the implementation of these
ideas for a specific small computer, namely a
Digital Equipment Corporation PDP-8.
Design considerations

Of the many schemes whereby equipment may
be connected to a computer perhaps the simplest
division is between "radial" and "bus" systems.
In the former, the interconnecting cables can be
thought of as radiating like the spokes of a wheel
to connect the computer to each external unit,
while in the latter each external unit is connected to a common "highway," "party line," or
"bus" cable system. In a certain sense this distinction is artificial since in the last resort even
a radial connection is handled on a bus basis
once the signals enter the computer hardware
proper. However, the distinction is a valid one
for the domain of equipment external to the
computer itself and can serve as a starting point
for comparing different systems.
The greatest single advantage in a radial system, from the user's point of view, is the fact
that external equipment need only be plugged
into a suitable connector to be on-line with the
computer. The outstanding disadvantage, however, is that such systems are relatively inflexible
and can become expensive if large numbers of
external units are required. Furthermore, the user
may become overly dependent on the computer
vendor and his instrument division since only
their equipment is automatically interfaced with
the machine. In bus systems, on the other hand,
the organization is different in that each external
unit is connected to a common set of cables which

FIGURE 1-Logic tree indicating the major decisions involved
in the design of a data bus system

carry address, data, and status information to
and from the computer.
A feature of such bus systems that is worth
noting is that the interfacing operation between
the computer and the external world occurs only
once, namely between the computer and the data
bus. It is therefore possible to design such systems so that the same, or different, collections of
on-line equipment can be connected to many different computers by changing only the bus-to-computer interface.
The differentiation into radial and bus systems is
indicated by the node labeled 1 in Figure 1. Other
important decisions follow at other nodes in this
diagram and it is the purpose of this section of
the paper briefly to indicate the major considerations involved at each branch. In order to forestall
any misunderstanding it may be well to point out
that though a particular design path was followed
in the PDP-8 system described in the remainder
of this paper, it is by no means claimed that the
resulting system is ideal for every application.
As will become clear, the design of any system
is dependent on a number of factors, major ones

being, for example, the total size of the system
envisaged (i.e., number of input and output units
together with information on their spatial separation), and whether the system is to be employed
by a single user or time shared by several non-interacting
users simultaneously. Another important consideration is of course input/output speed
and data-rate. Interestingly enough, however, in
a number of actual on-line experimental environments that we have considered, it turns out that
the bus approach imposes only a small time burden on the system. In the last resort this arises
from two causes, first because operations proceed sequentially inside the computer, requiring a
certain time to service each external unit, and
second because external units are themselves often
quite slow (e.g., ~50 µs conversion time for a
typical 12 bit nuclear physics ADC). The result
of this is that if reasonable care is exercised in
the design of the bus system the access and I/O
time for external units can become quite small
compared with the sum of the device operation
and computer servicing time. Obviously it is
always possible to envisage situations where this
is not the case, but we believe them to be only a
small subset of most on-line situations. In those
cases where I/O speed becomes an unavoidable
limitation, e.g., CRT displays, it is usually worthwhile to consider the use of a separate dedicated
piece of equipment (such as a disc with suitable
DAC's, etc., for display) to perform the critical
function at all times.
Returning to the general considerations indicated
in Figure 1, the next question is one of system
size. It is taken for granted that the user's hardware will consist of some form of modules which
plug into bins (see for instance the European
standard IANUS system or the ADIOS system
described here), and the question is whether there
will be one bin or several. This decision involves
questions of how multiple bins are to be addressed,
and what will be their physical separation and
distance from the computer. This latter point is
highly important though easily overlooked. Its importance can be seen in the following way. If the
cable length is long then the cables must be terminated to avoid reflection. The logic levels employed for modern microcircuits are typically 0
and +4 volts. For 50 ohm terminated cable this
means 80 mA/bit. Since on-line systems usually
employ common grounds between analog and digital hardware it follows that the transfer of a 16
bit word can involve a current pulse at over 1.2

amperes in the ground return. The noise and
crosstalk implications of this are obvious. (A
new data bus system presently being designed at
Bell Telephone Laboratories for a multi-user environment using SDS Sigma computers circumvents this problem by using balanced current-driven
twisted pair. No such extreme steps were
necessary for the relatively small PDP-8 system
described in this paper.)
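The current figures above follow directly from Ohm's law; the sketch below (Python, using only the values quoted in the text) reproduces them:

```python
# Ground-return current for an unbalanced, terminated data bus,
# using the figures quoted above. Illustrative arithmetic only.

logic_swing_v = 4.0     # 0 to +4 volt microcircuit logic levels
termination_ohm = 50.0  # 50 ohm terminated cable
word_bits = 16

current_per_bit_a = logic_swing_v / termination_ohm  # amperes per driven bit
worst_case_word_a = word_bits * current_per_bit_a    # all 16 bits switching

print(f"{current_per_bit_a * 1e3:.0f} mA per bit")
print(f"{worst_case_word_a:.2f} A in the ground return for a 16 bit word")
```

A balanced twisted-pair scheme avoids this because the two conductors carry equal and opposite currents, so the net ground-return current is near zero.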
If multiple simultaneous users are envisaged it
is advantageous to employ a buffer unit between
each bin and the data bus. This has a number of
advantages for large systems, not least being the
ability to provide logical buffering between bins
to prevent one user from wiping out another by,
for instance, unplugging a module. This decision
is shown at node 3 in Figure 1.
Again in large multi-bin environments it can
be advantageous to provide a bin address with
unit sub-addresses within each bin (this is the
route followed in both the European IANUS and
BTL Sigma system designs).
At node 5 a difficult choice must be made regarding the extent to which the bus cables are
shared by time or other multiplexing arrangements. The advantage of multiplexing lies in its
ability to reduce the number of cables in the system. The disadvantages are reduced I/O speed
and added complications to unit hardware and
system programming.
The way in which external units signal the
computer via the interrupt system is also one of
central importance. In this connection the major
choices lie, as indicated at node 6, between using
a single common interrupt line, or of employing
a hierarchical or priority system. The latter can
be organized two ways, either by using a separate physical interrupt wire from each external
unit to the computer, or by connecting the external unit interrupts in head-to-tail fashion whereby priority is defined by position in the chain.
Neither of the latter systems is well suited to a
flexible data bus system designed to accommodate
a wide variety and number of external units,
since each time a change is made in the configuration of modules, numbers of separate physical
wires must also be re-routed.
It will be appreciated that the foregoing discussion of general questions is of necessity superficial, though we believe it to indicate most of the
major hardware considerations involved. Without
going too deeply into details of software and logical design one other question regarding module
addressing must be raised. This is the issue of
what we term "generic" addressing and its importance can be seen with a simple example.
Suppose the on-line system involves a number of
external devices which must be turned on and
off in exact time synchronism. Since any bus system is by definition sequential, in that only one
set of address lines is used, it is not at first clear
how this can be achieved. A solution to the problem can be provided by allowing units to recognize more than one address. Each unit or module
recognizes its own unique address and, having
been so addressed, one of the commands to which
it can then respond is to enable recognition of another address. Such other addresses are termed
generic addresses and they can be common to
many different units. In this way the computer
can issue generic commands which apply simultaneously to any subset of external modules, allowing them to operate in exact time synchronism.
By way of concluding this section on general
design considerations it may be illuminating to
consider a number of questions that can be asked
regarding the logical organization of any on-line
computer system. Does it, for example, require special timing, logic and drive circuits to
be added to an external unit before it can talk to
the computer? If the answer is yes then the
chances are that considerably less experimental
innovation will be carried out with the on-line
hardware than would otherwise be the case.
Another important point to bear in mind is the
ability of the system to check itself. Can the computer tell what units are connected and whether
they are operating? This feature can be very important in systems employing many modules.
An area that is outside the scope of this account
is that of programming, but it is obvious that
the hardware and software of any on-line system
must be harmoniously designed. Less frequently
considered from the outset, however, is the question of the ease of debugging the operation of
the entire on-line system. Our experience with
the present system has shown the extreme desirability of being able to "force" external equipment
to well defined conditions, by hand, as a check in
debugging programs and hardware. This also is,
therefore, a point to consider in comparing system configurations: how difficult is it to debug the
hardware-software interaction in preparing programs for the on-line system?
A corollary to this point is the related one of
investigating the degree to which the computer
is able to exercise external equipment. In this
connection it has been found extremely useful
to prepare programs which operate all the computer-accessible features of a module sequentially.
This approach allows convenient debugging of
modules as they are produced, since an operator
can examine repetitive waveforms at his leisure,
proceeding sequentially through a series of test
conditions under programmatic control.
A final point involves the provision of an analog
measurement capability within the data bus system. This has been found to be most useful in
the PDP-8 system described here, and comprises
a shielded twisted pair in the data bus cable
connected to a central 12 bit ADC at the computer. In conjunction with suitable external
modules this furnishes the ability to both measure and provide analog levels. Together with the
digital capabilities of the system this provides a
combined digital and analog capability which encompasses a broad range of applications.

The data bus

A diagram of the data bus system chosen for a
PDP-8 and a single-user environment is shown in

FIGURE 2-Simplified block diagram showing the computer
interface and data bus cable

Figure 2. The logic of the operation of the data
bus closely parallels that of the PDP-8 computer.
Commands are sent to a particular module by
placing the module address on the address lines
and activating the instruction lines. The three instruction pulses in this system occur at 1 µs intervals and are 0.4 µs in width. Because one of the
system rules is that the modules operate with
instruction pulses of any length greater than
0.2 µs, there is no restriction on the maximum
length of the cycle. (Thus the system may also
be used with other computers of lower speed than
the PDP-8 provided a suitable computer to data-bus
interface is constructed.)
The type of operation performed with each instruction pulse has been standardized as follows.

Instruction Pulse    Operation
IOP1                 Augmented Instructions and I/O Skip
IOP2                 Input to computer
IOP4                 Output to modules
Since it was useful to have many more than
three instructions, a system for deriving a set of
augmented instructions during IOP1 is used
wherein the six low order bits of the computer
output lines are each interpreted as a separate
instruction. The six high order bits have been reserved to be used in coincidence to obtain 64 additional augmented instructions, should such a
need arise in the future. IOP2 is used to generate
all inputs so as to relax the requirement on the
fall time of the input pulses, which must be clear
before the next instruction is executed.
A module requests attention from the computer by energizing a common interrupt line
until it is serviced by the computer. The computer then interrogates the modules to determine
which one is requesting service. (Since a priority
interrupt system might be desirable when the
system is interfaced to a different computer, four
additional lines in the data bus have been provided, which may be used in this manner.)
The I/O skip line provides a means for a
module to respond to interrogation by the computer. An affirmative response is signalled by
energizing this line, which causes the PDP-8 to
skip an instruction. If the system were connected
to a different computer, the I/O skip line could
set a status bit in the machine.
The distribution of the analog input lines to
the module bins is performed with shielded
twisted pair. The interface contains a parametric
amplifier in a configuration which converts the
single ended 0 to -10V range of the analog to
digital converter in the PDP-8 to a +10V to
-10V differential system with good dynamic
common mode rejection.
Since the logic of the bus system is compatible
with the PDP-8, the interface is used simply to
provide the level shifting and buffering that is
necessary to communicate directly with the integrated circuits in the modules.
The interface unit converts the negative logic
levels of the PDP-8 to standard microcircuit
levels as used in the data bus system. In addition
the interface input buffers provide noise filtering
and an input threshold which can be varied in
order to investigate noise margins. Tests have
shown the data bus system capable of operating
with cable lengths of more than 100 feet.
The data bus consists physically of 48 miniature coaxial cables, together with one shielded
twisted pair, which interconnect the required
number of module bins. Within the bins, the
data bus loops through twelve 50 pin connectors
into which the modules connect upon insertion
into the bins.
The plug-in modules

Four general purpose modules were designed
to operate in conjunction with the computer to
assist in the operation and control of experiments and in the acquisition of the resulting
data. Figure 3 shows one of each type of module
installed in a module bin. A modified NIM power
supply located at the rear of the bin provides
local power for the modules.
The construction of all the modules is similar to

FIGURE 4-Photograph of a scaler module, showing the location of the plug-in address cards at the lower center of the printed
circuit

that of the scaler, which is shown in Figure 4.
The use of integrated circuits on a single special
purpose board results in considerable reduction
of cost and size over the more usual technique of
using general purpose logic boards. The cost of
the more complex modules is approximately $300
each.
In order to simplify programming and debugging, the module addresses are defined by small
plug-in cards, visible in Figure 4, which may be
changed at will. Removal of a unit for repair
may thus be performed by switching its address
card to a new module. A brief discussion of the
structure and operation of each of the modules
is given in the following sections.
Register

FIGURE 3-Photograph of one modified NIM bin containing
one of each of the four modules

The register provides a general purpose interface for digital devices. It is capable of accepting
a 12 bit word from an external device and inputting the word to the computer. It can also accept a word from the computer and present it,
with buffering, to the outside world. The block
diagram of this unit is shown in Figure 5. Table I
lists the commands for the unit, most of which
require no explanation.
The voltage levels for input and output are
standard NIM and integrated circuit levels. The
use of jam transfer makes resetting unnecessary,
and allows alteration of selected bits of the
register without even a momentary change in the
other bits.
The use of master-slave flip-flops as buffers


FIGURE 6-Simplified block diagram of relay driver module

Scaler

This module comprises a 12 bit binary ripple
counter and the necessary logic to permit the
module to function on the PDP-8 data bus system. In operation the module sends an interrupt
to the computer for every 4096 input pulses. The
system records the number of these interrupts
and therefore functions as a scaler modulo 4096.
At the end of the counting period the fractional
count remaining in the scaler is added to the
previously recorded total. Figure 7 is a logic
block diagram of this unit. A unique feature in
this design is the use of a two address command
structure. The first of these is the generic address and the second the unit address. By use of


Relay module

The logic of this module is shown in block diagram in Figure 6. It contains twelve single pole
double throw high speed relays, each capable of
switching 3 amps. It is used in conjunction with
a register module to handle signal and power
levels which are inconvenient to handle electronically.
The relay driver commands are shown in Table
II. The Test Unit Ready command allows the
computer to test for the presence of the module.


FIGURE 5-Simplified block diagram of register module

permits the register to be used as a hardware
bit-reformatting device by connecting the outputs
of the register to the inputs in the desired sequence. The word to be reformatted is sent out
to the register and then read in as external data.
This feature also permits use of some bits of the
register as input and some as output, since by
connecting a bit output line to the corresponding
bit input line one makes the value of the bit independent of the Load Register from External
Unit command.
The most serious stumbling block in interfacing an external device to a computer is not the
compatibility of the input/output levels, but the
necessity of establishing logical communication
between the devices and the computer in a simple
way. In the register the "freeze" circuitry allows
the computer to command a device to remain
stable while being read. The interrupt circuitry
allows the device to request service from the
computer. A busy line informs the unit of the
status of its request.

FIGURE 7-Simplified block diagram of scaler module

the generic address the computer can execute any
one of three commands and have all scalers sharing that address respond simultaneously. The
command structure for the unit is shown in
Table III. Most of the commands are self-explanatory. While Increment operates in front
of the input gate, Preset, which also increments,
operates at all times. Enable Generic and Disable
Generic permit the generic commands to be
obeyed or ignored.
The overflow and saturate logic allow the computer to serve as the high-order portion of the scaler in the following manner. When the high-order bit of the scaler overflows, its overflow flip-flop is set and a program interrupt is sent to the computer. The computer then initiates a search using the Test Overflow instruction and thereby ascertains which module interrupted. Should a second overflow occur in a scaler before the previous one has been recorded by the computer, then the saturate flip-flop is set. The computer can test this flip-flop, and thus either ascertain that the scaling has been performed without error or take appropriate action to insure correct scaling.
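The overflow and saturate bookkeeping can be sketched as follows; the 12-bit width and all names are assumptions made for illustration, not taken from the paper:

```python
class Scaler:
    """Sketch of a 12-bit scaler whose high-order half lives in the computer."""
    WIDTH = 12

    def __init__(self):
        self.count = 0
        self.overflow = False   # overflow flip-flop (raises a program interrupt)
        self.saturate = False   # set if a second overflow arrives unrecorded

    def pulse(self):
        self.count = (self.count + 1) % (1 << self.WIDTH)
        if self.count == 0:     # high-order bit overflowed
            if self.overflow:   # previous overflow not yet recorded
                self.saturate = True
            self.overflow = True

def service_overflow(scaler, high_order):
    """Interrupt service: record one overflow in the computer's word."""
    if scaler.overflow:         # Test Overflow
        scaler.overflow = False
        high_order += 1
    return high_order
```

Testing the saturate flip-flop afterwards tells the program whether any counts were lost between interrupt services.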
A rear panel switch connects the output of the most significant bit to either the overflow detecting circuitry or to a front panel connector, thus allowing the use of a second scaler module to form a 24-bit scaler if desired.
A discriminator is located at the input to the module and its level is adjustable from -5 volts to +5 volts. A front panel lamp indicates the status of the input gate.
A three-position switch allows manual setting of the unit in either the start or stop condition, or returns this control to the computer.
Programmable power supply

The primary purpose of this module is to allow the computer to supply adjustable voltages to external devices. As shown in Figure 8, the computer controls the power supply voltage by causing rotation of a motor-driven ten-turn potentiometer which serves as an inexpensive analog memory. The series regulators, which operate by re-regulating the bin power, are built either as positive or negative supplies, and furnish 0 to 10 volts with overcurrent protection from 10 to 250 mA.
The control commands for this unit are shown in Table IV. The computer controls the unit by operating the potentiometer while simultaneously


FIGURE 8-Simplified block diagram of power supply module

monitoring its output voltage, thus becoming part of a servo loop.
The front panel dial attached to the potentiometer indicates the supply voltage directly while also permitting manual setting of the voltage. In addition, the module may be used as an analog input from the operator, since the computer can, in effect, read the dial setting. Connection of the potentiometer shaft to other rotating equipment could also permit the computer to cause controlled motion in the external equipment should this be desired.
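The servo action, stepping the potentiometer motor while monitoring the supply output, can be sketched like this; the step size, tolerance, and callback names are illustrative assumptions:

```python
def servo_to_voltage(target, read_output, step_motor,
                     tolerance=0.05, max_steps=10000):
    """Step the motor-driven potentiometer up or down until the measured
    supply output is within tolerance of the target voltage."""
    for _ in range(max_steps):
        v = read_output()                      # monitor the output voltage
        if abs(v - target) <= tolerance:
            return v
        step_motor(+1 if v < target else -1)   # raise or lower the voltage
    raise RuntimeError("supply failed to settle on the target voltage")
```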
Sample applications

The system has been used to control, and process data from, space experiments; to run nuclear analysis displays using DAC's; and as the data acquisition and control center for an automatic Hall-effect measuring system.
Figure 9 is a block diagram showing how experiments are connected into the system. Note that the computer output can control the experiment via the data bus. Computer input can store digital outputs from the experiment via the data bus, and make analog measurements via the analog bus and computer ADC.
Testing of a satellite charged particle
spectrometer

Fall Joint Computer Conference, 1968

FIGURE 9-Simplified block diagram showing three bins connected to the data bus

FIGURE 10-Simplified block diagram of a satellite experiment (detector thicknesses: (1) 50 MU, (2) 100 MU, (3) 2 MM, (4) 2 MM)

A simplified block diagram of a satellite particle detector experiment is shown in Figure 10.
Particles incident on the semiconductor detector assembly deposit their charge in one or more detectors. Coincidence logic applied to detector outputs determines the particle type, while the linear system sorts the energy of each particle type into one of five consecutive energy ranges.
Sixteen different particle identifying modes and the five channel energy ranges are controlled by the digital outputs from the spacecraft sequence clock in such a way that each mode lasts for approximately 10 seconds. For in-flight calibration the experiment contains a test pulse generator and two internal sources, each activated by certain states of the sequence clock once every six hours.
When tested in thermal vacuum in the laboratory by the computer system, outputs from a register unit simulated the sequence clock and thereby controlled the experiment modes. Calibration modes were arranged to alternate between the test pulser and internal source modes for every complete sequence of experiment modes, interspersed with complete sequences of no excitation. In this way a calibration cycle was repeated about once every 25 minutes, so large amounts of calibration data were processed in a relatively short time. The sequences of no excitation were useful for the observation of noise counts.
Scalers were used to accumulate the five channel outputs and to transfer the counts into the computer for printout.
Temperatures were also recorded. The outputs
of temperature sensors were switched onto the
analog bus using a relay driver and register combination, and were measured by the computer
ADC.
Although not used in this particular test, it
would be appropriate to use programmable power
supply units in a test of this kind to investigate
the effect of varying power supply voltages on
circuit performance.

Automated Hall-effect measurements

An ion implantation laboratory is in operation and many implanted diode samples will require extensive electrical testing. Each sample is expected to go through several stages of annealing, and following each stage measurements will be made to evaluate Hall coefficients, specific conductivity, carrier concentration and carrier mobility, over the temperature range 2°K to 300°K. Figure 11 shows a simplified block diagram of the electrical system.
A large number of voltage measurements from
contact to contact are required at each temperature of interest, and the temperature stability at
each measurement point must be carefully controlled.
Figure 12 is a flow chart showing the main steps in the computer controlled system. After the sample is mounted and ready to be lowered into the cryostat, the program starts with a comprehensive check of the hardware and a "FAULT" printout is generated indicating the
nature of any malfunction. An "OK" printout indicates the satisfactory completion of each test.

FIGURE 11-Electrical block diagram of Hall effect measurement system
When the hardware test is completed the sample is manually lowered into the cryostat. The program continues by welding the sample contacts to ensure good connections and then checking that the voltage drop across each is within acceptable limits. The weld current and the contact test current paths are selected, by using relay units, so that they flow in the appropriate direction (depending on the diode junction type) and through any desired contact.
The next step is the measurement of a complete set of diode characteristics. These are made
at several values of current, taken from a table
in memory. A first set of Hall measurements is
then made. Coefficients are processed and printed
out. A manual decision is then made whether the
Hall properties exhibited by the sample make
continuation of the measurements worthwhile.
In continuing the test, the operator types in
the upper and lower limits of temperature range
and the increments at which measurements are

FIGURE 12-Simplified flow-graph of Hall effect measurement

to be made, and opens the liquid helium valve on the cryostat. The computer selects the temperature values by interpolation from a table in memory, starting with the lowest specified temperature. The required temperature control is provided by the setting of a programmable power supply whose output controls the power applied to the sample heaters.
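Selecting a setting by interpolation from a stored table might look like the following sketch; the table contents and function name are invented for illustration:

```python
def interpolate_setting(temp, table):
    """Linearly interpolate a control setting from a (temperature, setting)
    table held in memory, as the program does for the heater supply."""
    table = sorted(table)
    for (t0, s0), (t1, s1) in zip(table, table[1:]):
        if t0 <= temp <= t1:
            return s0 + (s1 - s0) * (temp - t0) / (t1 - t0)
    raise ValueError("temperature outside the table range")
```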
The computer proceeds in a similar way to
control and check the remaining experimental
conditions, as indicated in Figure 12.
The measurement of each of the many voltages is accomplished by using relay units as multiplexers at the input of a digital voltmeter. Two register modules are used to receive the DVM digital data and transfer it to the computer via the data bus. One of these registers is used to trigger, and also to recognize the end of, each DVM measurement, signaling to the computer that data is ready for input.
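The trigger-and-wait handshake with the DVM reduces to a loop of this shape; the callback names and polling limit are assumptions made for the sketch:

```python
def read_dvm(trigger, measurement_done, read_registers, max_polls=100000):
    """Trigger one DVM conversion, wait for the end-of-measurement flag,
    then read the data registers, mirroring the two-register handshake."""
    trigger()                      # start the measurement
    for _ in range(max_polls):
        if measurement_done():     # register flags end of measurement
            return read_registers()
    raise TimeoutError("DVM did not signal end of measurement")
```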


Other applications
Many uses other than those already described have been considered. An example is that of automated sample liquid scintillation counting, in which programmable power supplies can provide levels defining pulse-height windows. In conjunction with scalers this provides pulse height analysis, while relay/register combinations can exercise electromechanical control.
The system may also serve as a versatile, economical alternative to large multiparameter pulse height analyzers when used in conjunction with pulse analog-to-digital converters, and fast digital-to-analog converters for CRT displays.
When connected to an engineering breadboard, the system has been used as a versatile programmed pulser and circuit tester.
DISCUSSION
The point can be made, and with justification, that computer manufacturers realized years ago that peripheral hardware was best handled on a bus basis, which is all that is being achieved with the system described here. This is quite true. The differences that arise with on-line systems are primarily those of degree (with the exception of analog bus facilities) rather than those of kind. One example will suffice to make the point. The present system might be required to handle 60 scalers all counting at ~1 MHz (i.e., a data rate of 6 × 10⁷ bits/second) with the subsidiary requirement that various subsets of them be gated on and off in exact time synchronism. Such requirements are not encountered with standard computer peripherals, for which supervisory control and timing can always be exercised in a logical sequential manner.
The major point being made here is really a different one, namely how to design a modular system with a small number of different types of modules to encompass a large number of on-line tasks. While examples of such tasks are endless, it is hoped the outline of the rationale of the design, together with the sample applications given, demonstrates the flexibility that such an on-line modular analog-digital system can provide.
ACKNOWLEDGMENTS
It is a pleasure to acknowledge the many contributions of others to the work presented here. Notable among these have been E. H. Cooke-Yarborough and his associates at AERE Harwell in discussions of design philosophy and in providing information on their IANUS system, R. Stensgaard of the University of Aarhus in all phases of the work on Hall effect measurements, and W. L. Brown of Bell Telephone Laboratories for constant encouragement and support.

IOP  DATA BIT  COMMAND
1    6         Test Interrupt
1    7         Generate Interrupt from Computer
1    8         Disable Interrupt
1    9         Enable Interrupt
1    10        Set Freeze Output
1    11        Load Register from External Unit
2    -         Load Computer from Register
4    -         Load Register from Computer

TABLE I-Register module commands

IOP  COMMAND
1    Test Unit Ready
2    Enable Relay Drivers
4    Disable Relay Drivers

TABLE II-Relay module commands

A. UNIT ADDRESS COMMANDS

IOP  DATA BIT  COMMAND
1    6         Test Overflow
1    7         Test Saturate
1    8         Disable Generic
1    9         Enable Generic
1    10        Increment
1    11        Clear
2    -         Load Computer from Scaler
4    -         Preset

B. GENERIC ADDRESS COMMANDS

IOP  COMMAND
1    Start Scaling
2    Clear
4    Stop Scaling

TABLE III-Scaler module commands

IOP  DATA BIT  COMMAND
1    11        Motor Off
1    10        Measurement Off
1    9         Measure Current
1    8         Measure Voltage
2    -         Motor Counterclockwise (Lower Voltage)
4    -         Motor Clockwise (Raise Voltage)

TABLE IV-Power supply module commands

A standardized data highway for on-line computer
applications
by I. N. HOOTON and R. C. M. BARNES
Atomic Energy Research Establishment
Harwell, England

INTRODUCTION
In nuclear experiments the quantity and complexity of the data have led to the widespread adoption of automatic processing equipment. This equipment may be divided generally into two categories. The first consists of data gathering and converting devices intimately involved with the experiment, e.g., radiation detectors, scalers and analogue-to-digital converters. The second category consists of devices for storing and processing the data, and for controlling the experiment. Increasing use is now being made of small general purpose computers to perform these latter functions, in place of the special purpose devices previously employed.
Each experimental situation requires a unique system of equipment, although in principle the functions to be performed are very similar. This has led, historically, to the creation of modular instrumentation systems to satisfy the experimental requirements in the first category above. For example, the Harwell 2000 Series,1 the ESONE System2 and the U.S.A.E.C. N.I.M. System3 provide standard hardware which allows up to 5, 8 and 12 modular units, respectively, to be held in a 19 inch rack-mounted crate. Within each system mechanical and electrical compatibility is achieved by specifying standards and codes of practice. None of these systems however incorporates a means of communicating with computers, and individual laboratories have adopted ad hoc
arrangements. As a consequence the representatives of
major European nuclear laboratories (see Appendix)
have collaborated under the auspices of the ESONE
(European Standard of Nucleonic Equipment) Committee to draw up recommendations for a modular system incorporating a standardised data highway. The
basic features of this system, known provisionally as
IANUS, may be summarised as follows:
a) It is a modular system so that any combination of
functional units may be assembled easily by the
experimenter.


b) Each module makes direct connection to a highway
which conveys digital data, control signals and power.
The highway standards are independent of the type
of module or computer used.
c) The mechanical structure is designed to exploit the
high component packing density possible with integrated circuit packages and similar devices.
d) The data transfer highway and modules are kept as
simple as possible. Any system complexity is introduced in the interface between the computer and the
highway.
e) Although the recommendations for the IANUS system were drawn up by representatives of nuclear
laboratories it was designed as a generalised data
handling system for use in any field of instrumentation.
f) IANUS is a non-proprietary specification which is
freely available to all.
g) The IANUS system incorporates the experience
gained with previous modular systems and aims to
augment rather than replace these systems.
The system has provisionally been given the name
IANUS after the Graeco-Roman god Ianus (or Janus)
Gemini who had two faces. The system looks to the
experimental environment and to the computer. It also
looks back to previous modular instrumentation and
forward to the fully automated laboratory.
This paper gives an informal description of the IANUS
system and then outlines the way it is used at the
Atomic Energy Research Establishment, Harwell, as a
standard interface between digital computers and
peripheral devices.
The IANUS system

IANUS is a modular instrumentation system which
links transducers or other devices with digital controllers or computers. Figure 1 illustrates the use of the
system in a generalised experiment such as is commonly


TABLE I-Standard dataway usage

Title                    Designation      Pins   Use at a Module

Command
Station Number           N                1      Selects the module (individual line from control station).
Sub-Address              A1, 2, 4, 8      4      Selects a section of the module.
Function                 F1, 2, 4, 8, 16  5      Defines the function to be performed in the module.

Timing
Strobe 1                 S1               1      Controls first phase of operation (Dataway signals must not change).
Strobe 2                 S2               1      Controls second phase (Dataway signals may change).

Data
Write                    W1 - W24         24     Bring information to the module.
Read                     R1 - R24         24     Take information from the module.

Status
Look-at-Me               L                1      Indicates request for service (individual line to control station).
Response                 Q                1      Indicates status of feature selected by command.
Busy                     B                1      Inhibits change in Dataway signals (except as permitted at S2).

Non-Addressed (operate on all features connected to them, no command required)
Initialise               Z                1      Sets module to a defined state (accompanied by S2).
Inhibit                  I                1      Disables features for duration of signal.
Clear                    C                1      Clears registers (accompanied by S2).

Reserved                 X                1      Reserved for future allocation.
Reserved Bus             Y1, Y2           2      May be used as a patch bus.
Private Wiring
Patch Points             P1 - P5          5      Free for unspecified interconnections.

Mandatory Power Lines (the IANUS crate is wired for these supplies)
+24V D.C.                +24              1
+6V D.C.                 +6               1
-6V D.C.                 -6               1
-24V D.C.                -24              1
0V                       0                2      Main power return.

Additional Power Lines (lines are reserved for the following supplies)
+200V D.C.               +200             1      Low current for indicators etc.
+12V D.C.                +12              1
-12V D.C.                -12              1
117V A.C. (Live)         ACL              1
117V A.C. (Neutral)      ACN              1
Clean Earth              E                1      Reference for circuits requiring clean earth. No Dataway lines.

TOTAL                                     86

Figure 1-A generalised experimental system

Figure 2-Critical dimensions of the IANUS module (all dimensions in millimetres; 86 printed contacts, 43 per side; pitch of edge connectors 17.2)

encountered in R. and D. establishments. Experimental data produced by transducers may be taken directly to peripheral modules (P) where, for example, it is digitised or buffered before being presented in standard form to the control and data highway or 'Dataway.' Alternatively, use may be made of equipment in other standards with an adapter module to produce compatible signals. Other peripheral modules may be used to provide communication between the experimenter and the computer or to provide control signals to the experiment. Previous experience suggests that the experimental parameters should be set up via the computer so that they may be checked and recorded. A controller (C) supervises the operation of the Dataway and provides connection between the IANUS system and the computer. It will be noted that the peripheral modules are isolated from the computer and are hence independent of its input/output standards.

Mechanical standards
A standard IANUS crate mounts in a 19 inch rack and is fitted with 25 edge-connector sockets. Each socket, together with upper and lower guides, constitutes a 'station' into which a module may be inserted. A module consists basically of a printed card and a front panel. Runners on the module engage with the guides in the crate, and 86 printed plug contacts on the card mate with the edge-connector socket. Figure 2 shows the critical dimensions of the module which define the position of the printed contacts in relation to the runners and the front panel. Any method of construction which conforms to these dimensions will be compatible with a standard crate. A module may occupy as many crate positions as are required. Units in the U.S.A.E.C. N.I.M. format will fit into the guidance system, with each NIM single width occupying two basic IANUS widths and a simple adapter completing the connection from the AMP connector on the NIM unit to the edge connector socket.

The Dataway
The Dataway is a standard highway which conveys digital data, control signals and power. It consists mainly of bus-lines joining corresponding pins on the 86-way edge-connector sockets within a standard crate. One station, known as the 'control station,' has individual lines to each other station. The module occupying the control station acts as a 'controller' for its crate. During each Dataway operation the controller generates a command on the control lines of the Dataway. It is implicit in all data transfer operations that one participant is the controller and the other is the module or modules specified by the command. In practice the various features which constitute the controller may be distributed among several physical units. It is convenient to distinguish between the controlling and controlled parts of a system by the generic terms 'controller' and 'modules.' The Dataway includes lines by which modules can demand attention and a line by which the controller can test the status of a module.
The Dataway lines (see Table I) may be divided into seven categories as follows:
1. Commands: A command consists of signals which select a specified module or modules within a crate (by station number lines), a particular section of the module (by sub-address lines) and the function to be performed (by function lines). Each normal station is addressed individually by a signal on one of the station number lines (N) which link the control station separately to each other station (see Tables II and III). There is no restriction on the number of modules that may be addressed simultaneously, so that the same command may be given to any desired selection of modules in the same operation. The duration of the signal on the N line defines the Dataway operation period.
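As a sketch, a command can be modelled as the set of Dataway lines it asserts. The helper below follows the line names of Table I but is an illustration, not part of the IANUS specification:

```python
def command_lines(stations, subaddress, function):
    """Return the set of asserted lines for one Dataway command:
    station number lines N, sub-address lines A, function lines F."""
    assert all(1 <= n <= 24 for n in stations)
    assert 0 <= subaddress < 16 and 0 <= function < 32
    lines = {"N%d" % n for n in stations}          # any selection of modules
    lines |= {"A%d" % b for b in (1, 2, 4, 8) if subaddress & b}
    lines |= {"F%d" % b for b in (1, 2, 4, 8, 16) if function & b}
    return lines
```

Because each station has its own N line, addressing several modules at once is simply a matter of asserting several N lines in the same operation.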

TABLE II-Pin allocation at normal station viewed from front of crate

Individual patch points: P1, P2, P3, P4, P5
Bus line with patch point (Reserved): X
Bus lines with patch points: Inhibit (I), Clear (C)
Individual lines with patch points: Station Number (N), Look-at-Me (L)
Bus lines: Strobe 1 (S1), Strobe 2 (S2), Busy (B), Function (F16, F8, F4, F2, F1), Sub-Address (A8, A4, A2, A1), Initialise (Z), Response (Q)
24 Write bus lines: W1 (least significant bit) to W24 (most significant bit)
24 Read bus lines: R1 (least significant bit) to R24 (most significant bit)
Power: +24 volts D.C. (+24), +6 volts D.C. (+6), -6 volts D.C. (-6), -24 volts D.C. (-24), 0 volts power return (0, two pins)
Reserved: -12 volts D.C. (-12), +12 volts D.C. (+12), +200 volts D.C. (+200), 117 volts A.C. Live (ACL), 117 volts A.C. Neutral (ACN), Clean Earth (E), Y1, Y2

Signals on the sub-address bus lines (A8, A4, A2 and A1) specify one of the 16 sub-addresses in the module. This sub-address may be used to select a specific register, to define which of sixteen different flags controls the Response signal, or to direct functions such as 'enable,' 'disable' or 'clear' to the required section of the module.
Signals on the five function bus lines (F16, F8, F4, F2 and F1) specify one of 32 functions. Sixteen of these functions are fully defined within the IANUS recommendations. This permits the same command to be interpreted correctly in modules from different designers and assists in the design of table-driven software. The remaining sixteen codes are available for special functions at the discretion of individual designers. The standardisation of codes is discussed more fully below in the section on 'Function Codes.'
2. Timing: Each Dataway operation occurs asynchronously, and is timed by two strobes S1 and S2 which are generated in sequence on separate bus lines. The times, T1 to T5, shown in Figure 3 may each have any value greater than a prescribed minimum. This minimum is currently defined as 200 ns. The signals conveying the command are maintained throughout each Dataway operation and the other signals are set

Figure 3-Dataway timing


up as soon as the command has been interpreted. For example, a module instructed to transmit will establish its data on the 'Read' lines in response to the command. Sufficient time is allowed for the data to settle before the first strobe (S1) admits this data to the controller. S1 is used for actions which do not change the state of signals on the Dataway. The second strobe S2 may be used to initiate actions which change the state of Dataway signals, for example, clearing a register which has just been read and whose output is connected to the Dataway.
3. Data: Up to 24 bits may be transferred in parallel to or from the selected module. Independent lines ('Read' and 'Write') are provided for the two directions of transfer. All information carried on these data lines is regarded as 'data' although it may in specific instances be concerned with the status or control features of the modules. The 24 parallel lines in each direction set an upper limit to the word length but shorter words may be transferred.
As shown in Tables II and III, the data lines are not taken to the control station, where the corresponding pins are used for N and L lines. A controller therefore requires connection to the control station and one normal station.

TABLE III-Pin allocation at control station viewed from front of crate

Patch points: P1, P2, P3, P4, P5, P6, P7
Bus line with patch point (Reserved): X
Bus lines with patch points: Inhibit (I), Clear (C)
Bus lines: Strobe 1 (S1), Strobe 2 (S2), Busy (B), Function (F16, F8, F4, F2, F1), Sub-address (A8, A4, A2, A1), Initialise (Z), Response (Q)
24 individual Look-at-Me lines: L1 to L24
24 individual station lines: N1 to N24
Power: +24 volts D.C. (+24), +6 volts D.C. (+6), -6 volts D.C. (-6), -24 volts D.C. (-24), 0 volts power return (0, two pins)
Reserved: -12 volts D.C. (-12), +12 volts D.C. (+12), +200 volts D.C. (+200), 117 volts A.C. Live (ACL), 117 volts A.C. Neutral (ACN), Clean Earth (E), Y1, Y2
4. Status: Individual Dataway lines from each station to the control station are used for 'Look at Me' signals (L) by which modules can demand attention. The action taken in response to such a signal is a property of the controller and/or computer. The L signals may be cleared, enabled and disabled by appropriate commands.
The status of the Dataway itself is indicated by a Busy signal (B). While the N signal specifies the duration of a Dataway operation to the modules involved, the B signal indicates to all modules that an operation is in progress. It is generally used to staticise the conditions on the Dataway so as to reduce crosstalk.
While B is present all signals, including for example L signals, must remain constant unless their state is modified at S2.
The status of any specific feature of a module may be tested by a command to transmit a signal on a common Response (Q) bus line. The appropriate signal is set up on the Q line by the module as soon as the command is recognised; it is strobed into the controller at S1 and may be changed at S2 if the feature being tested is set or cleared then. One bit of status information may thus be sent to the controller with every Dataway operation and may, for example, be held in a one-bit register for testing by the computer. The module designer is free to decide which feature, if any, is tested.
An obvious application of the Response signal is to test for the source of a 'Look at Me' demand in a system which has only a single-level interrupt. In more complex installations it may identify which of several possible demands has originated from the same module.
5. Common Controls: Common control signals operate on all modules connected to them without requiring individual addressing signals.
The Initialise signal (Z) is generally connected to all modules and forces them to a basic state by resetting all data and control registers. It also clears all L signals and where possible disables them. It provides a quick and sure method of dealing with the situation at switch-on when registers and control bistables may, in principle, have assumed unpredicted states.
A common time control is provided by an Inhibit signal (I). This may be connected to any modules which require accurate timing of activities, such as data taking, independent of computer access times. The Inhibit signal may be controlled by any appropriate signal from the computer, the IANUS system or elsewhere.
A common Clear signal (C) may be connected to any selection of modules in which data registers require to be cleared.
As a protection against spurious signals both the Initialise (Z) and Clear (C) signals are accompanied by the S2 strobe.
6. Private Wiring: Five ways on the 86-way socket are
not connected to bus lines but may be brought out to
individual local pins on the Dataway wiring. They
are available for unspecified patch connections subject
only to the restriction that the signals must not interfere with standard Dataway signals and must be able
to tolerate some crosstalk from the Dataway lines.
Highly sensitive signals or those that require coaxial
connection are more appropriately located in the

space provided above the edge connector socket.
7. Power: The mandatory power supplies, which may be used by any module, are +24V, +6V, -6V and -24V d.c. The maximum current loading for a crate is 6A for each 24 volt line and 25A for each 6 volt line. The recommended total power dissipation is 200W per crate. Two pins are provided in parallel on the edge-connector as a heavy current 0 volt return for digital circuits.
Lines are provided for +12V and -12V d.c. optional supplies. Low current lines are allocated for a +200V d.c. supply (primarily for neon indicators) and for 117V a.c. (two lines). There are also two 'reserved' power lines which may be allocated in the future. A return, independent of and isolated from the digital 0 volt line, is provided for low current operations that require a 'Clean Earth.'

Signal standards
The signal standards are derived from Compatible Current-Sinking Logic (CCSL), that is, Diode-Transistor Logic (DTL) and Transistor-Transistor Logic (TTL). In all situations where more than one module may feed onto a line an intrinsic (or wired) OR operation is specified. This requirement makes it appropriate that the 'Low' state (short to ground) be interpreted as the significant or '1' state while the 'High' state (open circuit) is the non-significant or '0' state.

Function codes
While it is desirable for table-driven software, for
autonomous operation, and for multiple addressing to
have standardised functions, it is impossible to legislate
for all possible modules. Half of the 32 functions (see
Table IV) are therefore fully specified and the other half
are available for special applications. For software convenience a single bit (F4) distinguishes between standard (i.e., universal) codes and non-standard (i.e., local)
codes. Similarly, bit F16 specifies the direction of data
transfer. This is an important feature with some computers when the controller must interpret the function in
order to set up the required data path.
Registers within a module may be divided into two
groups which have independent commands. It is therefore simple to distinguish between control and data registers. This is particularly useful in a system which
makes large scale use of autonomous transfers or in
which it is desired to 'step through' registers on the subaddress codes.
The incrementing functions are provided mainly as a
test facility; code 27 in particular permitting a group of
preselected registers within a module to be incremented
simultaneously.
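The two classifying bits can be tested directly. The sketch below (taking F16 as the most significant bit of the 5-bit code and F1 the least, consistent with the code values in Table IV) is illustrative only:

```python
# Sketch of testing the two classifying bits of a 5-bit function code.

F4, F16 = 4, 16   # bit weights within the 5-bit code

def is_standard(code):
    return (code & F4) == 0      # F4 = 0: fully specified (universal) code

def is_write_direction(code):
    return (code & F16) != 0     # F16 = 1: data passes controller -> module

assert is_standard(0) and is_standard(27)   # Read; Increment Preselected
assert not is_standard(4)                   # a non-standard code
assert not is_write_direction(2)            # Read and Clear: module -> controller
assert is_write_direction(16)               # Overwrite: controller -> module
```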

Standardized Data Highway for On-line Computer


TABLE IV-IANUS function codes

No.   F16 F8 F4 F2 F1   Function
 0     0   0  0  0  0   Read Group 1 Register
 1     0   0  0  0  1   Read Group 2 Register
 2     0   0  0  1  0   Read and Clear Group 1 Register
 3     0   0  0  1  1   Read Complement of Group 1 Register
 4     0   0  1  0  0   Non-standard
 5     0   0  1  0  1   Reserved
 6     0   0  1  1  0   Non-standard
 7     0   0  1  1  1   Reserved
 8     0   1  0  0  0   Test Flag
 9     0   1  0  0  1   Clear Group 1 Register
10     0   1  0  1  0   Clear Flag
11     0   1  0  1  1   Clear Group 2 Register
12     0   1  1  0  0   Non-standard
13     0   1  1  0  1   Reserved
14     0   1  1  1  0   Non-standard
15     0   1  1  1  1   Reserved
16     1   0  0  0  0   Overwrite Group 1 Register
17     1   0  0  0  1   Overwrite Group 2 Register
18     1   0  0  1  0   Selective Overwrite Group 1 Register
19     1   0  0  1  1   Selective Overwrite Group 2 Register
20     1   0  1  0  0   Non-standard
21     1   0  1  0  1   Reserved
22     1   0  1  1  0   Non-standard
23     1   0  1  1  1   Reserved
24     1   1  0  0  0   Disable
25     1   1  0  0  1   Increment Group 1 Register
26     1   1  0  1  0   Enable
27     1   1  0  1  1   Increment Preselected Register
28     1   1  1  0  0   Non-standard
29     1   1  1  0  1   Reserved
30     1   1  1  1  0   Non-standard
31     1   1  1  1  1   Reserved

The table also specifies the state of the signals and registers after Strobes S1 and S2 for each standard code: for example, Read (code 0) gives Q = FA and Ci = 1Ai after S1, with LA = 0 after S2; Read Complement (code 3) gives Ci = the complement of 1Ai; Selective Overwrite (codes 18, 19) sets 1Ai = Ci (or 2Ai = Ci) where Ki = 1 and leaves bits unchanged where Ki = 0; code 25 gives RA = RA + 1 and code 27 gives Rp = Rp + 1.

Q is the state of the Q line.
FA is the state of the Flag associated with the selected subaddress.
LA is the state of the Look-at-Me associated with the selected subaddress.
Ci is the state of the ith bit of the register in the controller.
1Ai is the state of the ith bit of the register, selected by subaddress, from a first group of registers in the module.
2Ai is the state of the ith bit of the register, selected by subaddress, from a second group of registers in the module.
Ki is the state of the ith bit in a mask register.
RA is the content of a register selected by subaddress in the module.
Rp is the content of a register preselected in the module.
Q = FA indicates that the state of the Flag at the selected subaddress is tested before the Flag is cleared.
Data transfers 1Ai = Ci and 2Ai = Ci may, alternatively, occur at Strobe S2.

The term Overwrite (codes 16-19) is synonymous with 'Jam Transfer'-a term which has led to some confusion, particularly in translation.
Application of the system

The IANUS specification sets out in detail the logical
and electrical features of a data and control highway
within a single crate system. It also defines the mechanical dimensions necessary to ensure that modules from different sources are mutually interchangeable. These recommendations have been accepted by major nuclear laboratories throughout Europe.
In practical applications of computer-based data processing it is also necessary to specify the computer-to-crate and crate-to-crate interconnections. The following sections describe the local standards and techniques
adopted at A.E.R.E., Harwell, for computer-based
IANUS systems.

External representation of the command
A command is represented within a IANUS crate by a 5-bit function code, a 4-bit sub-address code, and signals on the appropriate station number (N) lines. When
the command is transmitted externally, e.g., between a
crate and a computer, it is generally preferable to use a
5-bit code for N instead of the internal 24-bit form. The
decoded values 1 through to 24 are used to address the
corresponding stations directly. The remaining codes
(25 through to 31) may have special applications. For
example Code 31 may be interpreted as 'address all
modules' and Code 30 as 'address preselected modules.'
This makes it possible to address the same command to
a selection of modules simultaneously.
The command to a single crate requires 14 bits. In
multi-crate systems the command is extended to include
a binary coded crate address, e.g., a 16-bit command
will control a 4-crate system.
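The field widths above (5-bit F, 4-bit A, 5-bit coded N, plus a binary coded crate address) fix the command size but not the bit layout; the sketch below assumes one plausible ordering for illustration.

```python
# Sketch of an external IANUS command word. Field widths come from the
# text; the bit ordering is an assumption made for illustration.

def pack_command(f, a, n, crate=0, crate_bits=0):
    assert 0 <= f < 32 and 0 <= a < 16 and 0 <= n < 32
    assert 0 <= crate < (1 << crate_bits)
    return (crate << 14) | (n << 9) | (a << 5) | f

def unpack_command(word, crate_bits=0):
    return (word & 0x1F,                              # F
            (word >> 5) & 0xF,                        # A
            (word >> 9) & 0x1F,                       # N
            (word >> 14) & ((1 << crate_bits) - 1))   # crate address

# A 4-crate system needs 2 crate bits, giving the 16-bit command of the text.
word = pack_command(f=16, a=3, n=24, crate=2, crate_bits=2)
assert unpack_command(word, crate_bits=2) == (16, 3, 24, 2)
assert word < (1 << 16)
```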

Multi-crate operation
In a single-crate system the controller has to perform
logical and level conversions between the specific
computer and the IANUS standards. This 'Master


Fall Joint Computer Conference, 1968

Controller' is given the crate address '0' so that it can
operate with 14-bit commands. The number of crates
may be extended by providing a 'Line Driver' module
in the master crate. This buffers the Dataway bus-lines
onto an external highway which feeds a 'Slave Controller' in each added crate. The line driver and slave
controller modules are independent of the computer
type. The master crate may contain more than one line
driver module, so that the connections to slave crates
can be arranged as a star rather than as a highway (see
Figure 4). Some connections, in addition to the Dataway, are required between the master controller and
line driver modules, e.g., for the coded crate and station
addresses. These are provided by a multiway socket
above the Dataway at the rear of the crate.

Program interrupt demands
The individual Look-at-Me (L) signals within a single
crate are brought to the controller via the Dataway.
There are three general methods by which the computer
may identify the source of the demand. Firstly, the L
signals may be taken individually from the controller to
a multi-level priority interrupt option on the computer.
Secondly, the demands may be combined into a single
program interrupt request, leaving the computer to
identify the highest priority demand by a search algorithm. This may operate on either the individual sources
of L signals by 'Test Flag' commands, or on a 24-bit
status word giving the pattern of demands on the L
lines. Thirdly, the controller complex may include a

Figure 4-Multi-crate interconnection

'priority sorter' module to select the highest priority demand and identify it to the computer. These three
methods may be combined in various degrees.
In a multi-crate configuration a single status flag is
generated within each crate to indicate that a demand
is present. The intercrate connection provides a common line for these flags, a response line by which the master controller may staticise the crate demands, and lines on which the highest priority crate may indicate its binary coded address. In such an arrangement each additional crate automatically adds a crate priority level to the system. Alternatively the individual crate flags could be assembled into a status word in the master controller. When the crate has been identified by the computer any of the previously described methods
may be used to locate the individual module making the
request.
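The second identification method above can be sketched as a simple search over the 24-bit status word whose bits mirror the L lines. Treating station 1 (bit 0) as the highest priority is an assumption; the ordering is left to the individual system.

```python
# Sketch of locating the highest-priority demand in a 24-bit status word.
# The bit-to-station mapping and priority order are assumptions.

def highest_priority_station(status):
    for station in range(1, 25):            # stations 1..24 on bits 0..23
        if status & (1 << (station - 1)):
            return station
    return None                             # no demand present

assert highest_priority_station(0) is None
assert highest_priority_station(0b100000000100) == 3   # stations 3 and 12 demand
```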

Autonomous operations
Autonomous operations typically transfer data to or
from the computer store via direct access I/O channels
in response to a Look-at-Me signal, without requiring a direct command from the computer program. Each demand may result in one Dataway operation (Simple Autonomous Operation) or a sequence of Dataway operations (Complex Autonomous Operation).
Simple autonomous operations: Demands for simple autonomous operations originate in precisely the same way as demands for program interrupts, i.e., as Look-at-Me signals generated in modules and transmitted along
the individual L lines to the crate controller. Here they
are not combined, as demands for program interrupt
may be, but are routed as individual signals to a priority
sorter module in the same crate. This module generates
a request for autonomous operation on a common line to
the master controller, which then freezes the priority
sorters and reads a channel number code corresponding
to the highest priority demand present. The channel
number specifies the type and direction of transfer required. The controller, in conjunction with a direct-access I/O channel of the computer, initiates either a
Dataway Read operation followed by an input to the
computer, or an output from the computer followed by
a Dataway Write operation. The simplest priority sorter generates a command consisting of one function bit
associated with the channel code (giving a choice of
function 0 for Read transfers or function 16 for Write
transfers), and a station number derived from the identity of the L line. The sorter can, however, be elaborated to generate a full command with choice of station
address, sub-address and function codes. On completion
of the transfer the priority sorters are released. If a buffer area limit ('End of Record,' 'Word Count Overflow') in the computer has been reached, a program interrupt
is generated.
Multi-word records may be transmitted from a module by maintaining the demand flag after the completion
of each transfer until the record is complete. However,
since the system operates in multiplexer rather than selector mode, a higher priority demand will interrupt the sequence.
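The command generated by the simplest priority sorter described above can be sketched as follows; function 0 (Read) or 16 (Write) is chosen by the single function bit tied to the channel code, and the station number comes from the identity of the L line. Sub-address 0 is an assumption here, since the text leaves full command generation to elaborated sorters.

```python
# Sketch of the simplest priority sorter's output for a granted demand.
# Sub-address 0 is a hypothetical default.

def simple_sorter_command(l_line, write):
    assert 1 <= l_line <= 24                # one L line per Dataway station
    return {"N": l_line, "A": 0, "F": 16 if write else 0}

assert simple_sorter_command(7, write=False) == {"N": 7, "A": 0, "F": 0}
assert simple_sorter_command(12, write=True)["F"] == 16
```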

Complex autonomous operations: These consist of a defined sequence of different Dataway operations initiated
by a single demand. One complex operation may include, for example, reading some registers, writing into
others and generating a program interrupt. One or more
such operations are controlled by a 'programmer' module which holds a list of extended commands, each consisting of the normal command together with a channel code (as described above). There is also a tag bit indicating the last extended command in the sequence. A Look-at-Me signal demanding a complex operation starts the appropriate sequence in the programmer, which then competes with any other requests for autonomous operations. When its request is granted the programmer sets up the first extended command, and the master controller initiates the appropriate I/O and Dataway transfers. The programmer repeats its request and generates successive extended commands until the sequence is completed. If a higher-priority demand intervenes the programmer remains locked at its current state until it is able to resume the sequence.
The programmer may have a patchboard on which the list of commands is set up manually, or a scratchpad memory which is loaded by the computer. Alternatively it may have fixed, prewired commands to suit a
specific application.
Module-to-module transfers

Information may be transmitted between modules
via the computer in two Dataway operations. A direct
module-to-module transfer in a single operation may be
performed under program control or autonomously. In the latter case it is identified by a specific autonomous channel code. Since two modules are involved, one Reading and the other Writing, it follows that one of them must interpret the Dataway command in a non-standard manner. During this operation the master
controller couples the Read and Write lines in order to
complete the data path.

Error checking
The IANUS system incorporates facilities for checking data transfers within its own boundaries. In a Write operation a data word is established on the Write
lines of the Dataway by the controller and strobed into


the module by S1. The data signals are maintained but
the function code is then forced to 'Read Complement'
(Code 3). The content of the module register is unaffected by this command but the complement of its content is put out on the Read lines. A separate module
checks that corresponding bits on the Read and Write
lines are in fact complementary. The original function
code is then allowed to return and the Dataway operation completed.
With a Read operation the same procedure is followed
except that, after the data have been transferred from
the module to the controller, the 'Read Complement'
function causes the module to put the complement of its
register on the Read lines and the controller to put the
content of its data register on the Write lines. The comparison then continues as before.
This technique is more rigorous than a simple parity
check and requires no additional lines on the Dataway.
The hardware to generate complements need only be
added in those modules which require error checking.
The technique is equally applicable to program controlled and autonomous transfers. Only one comparison module is required in a multi-crate system and may
be installed in any crate.
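The comparison performed by the checking module reduces to verifying that every Read/Write line pair is complementary. A minimal sketch, assuming the 24-bit Dataway data width:

```python
# Sketch of the Read Complement check: after the transfer, function code 3
# makes the module drive the complement of its register onto the Read lines,
# and a comparison module verifies bitwise complementarity.

WIDTH = 24
MASK = (1 << WIDTH) - 1

def transfer_is_valid(write_lines, read_lines):
    # Corresponding bits must differ everywhere, i.e. XOR gives all ones.
    return ((write_lines ^ read_lines) & MASK) == MASK

word = 0b101101                                   # word held on the Write lines
assert transfer_is_valid(word, ~word & MASK)      # fault-free transfer
assert not transfer_is_valid(word, (~word ^ 0b100) & MASK)   # one faulty bit
```

As the text notes, this catches any single stuck or dropped bit on either set of lines, which a simple parity check would not always do.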

Sub-standard systems
Simple laboratory experiments often make use of
small computers which may have only a 12-bit word
length, program controlled input/output, and a single-level program interrupt. Cost is a major consideration in these systems.
Many modules transfer 12-bits or less per operation.
When more bits are essential they may be accessed in
12-bit byte mode by separate transfers at different subaddresses. The computer word length makes it desirable
that the command is also limited to 12-bits. This may
be achieved with 4-bits for the coded station number
(N), 4-bits for the sub-address (A) and 4-bits for the
function (F). The restricted address code permits only
16 stations to be addressed. However, the controller
itself occupies a number of stations, and certain modules
such as fixed gain amplifiers do not require addressing.
The reduced function code permits the use of a 16 function subset containing all the standard codes shown in
Table IV, or an eight function subset of standard codes
(such as Codes 0, 2, 8, 10, 16, 18, 24 and 26) with an
additional eight non-standard codes.
The simple interrupt facilities of the computer are
used by combining all the L signals as a common interrupt request. With a short connection between the
crate and the computer it may also be possible to join
the Response line (Q) directly to the computer 'Skip' or
'Branch' facility.
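The 12-bit command described above can be sketched in the same style as the full command. The bit layout and the mapping of the reduced function field onto full 5-bit codes are assumptions; the mapping shown realises the eight-standard-code subset (0, 2, 8, 10, 16, 18, 24, 26) quoted in the text.

```python
# Sketch of the 12-bit sub-standard command: 4 bits each for coded station
# number N, sub-address A and function F. Layout and mapping are assumptions.

EIGHT_CODE_SUBSET = [0, 2, 8, 10, 16, 18, 24, 26]

def pack_short_command(n, a, f):
    assert 0 <= n < 16 and 0 <= a < 16 and 0 <= f < 16
    return (n << 8) | (a << 4) | f            # fits a 12-bit machine word

def expand_standard_function(f):
    # f values 0..7 select the standard subset; the remaining eight values
    # are left free for locally assigned non-standard codes.
    assert 0 <= f < 8
    return EIGHT_CODE_SUBSET[f]

assert pack_short_command(n=5, a=2, f=3) == (5 << 8) | (2 << 4) | 3
assert expand_standard_function(3) == 10      # Clear Flag in the full code set
```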

IANUS is particularly suitable for simple applications since the compatibility problems are restricted to the controller. Any peripherals developed specifically for simple systems are fully compatible with larger systems.
Such simple systems may be expanded by the addition of a data buffer register to give the controller 24-bit capacity. Data transfers will now take two I/O operations for modules utilising more than 12-bits. However, virtually the entire range of standard modules becomes
available to the experimenter.
System expansion

The expansion of a system to provide any of the facilities outlined above is achieved by additions to the controller and associated units. This capability therefore
neither requires modifications to standard modules and
the Dataway nor raises their basic cost.
The growth of a minimum program-controlled system
into a complex system with autonomous and programmed transfers will be given as an example. Conversion from a sub-standard system to one with the full range of function codes and station addresses involves the replacement of the controller. All further developments are achieved by the addition of appropriate units.
Extension from a single-crate to a multi-crate system requires a line driver module in the master crate and a slave controller in each additional crate. Systems with autonomous transfers need priority sorting modules for simple operations and programmer modules for complex operations, the number of each depending on the system configuration. The same peripheral modules may be used in both programmed and autonomous modes. An error-checking module may be added to any standard system. Since a standard command structure is used, system expansion does not destroy program compatibility. At any stage a change to a different type of computer involves only the replacement of that portion of the master controller which adapts the logical and electrical signal standards. A highly simplified outline of a
multi-crate system is shown in Figure 5.
CONCLUSIONS
The pooled resources of major European nuclear laboratories have established the specification for a new
range of modular instrumentation as an international
standard. This includes the connection of modules to a
data and control highway within a single crate. At
A.E.R.E., Harwell, this work has been extended to

Figure 5-Schematic representation of a complex system
include multi-crate systems with programmed and autonomous transfers to and from on-line computers.

ACKNOWLEDGMENTS
The development of the IANUS system would have been impossible without the active collaboration and good will of representatives from laboratories throughout Europe. The authors hope that this attempt to present the results of that collaboration will indicate their gratitude.
The work at the Atomic Energy Research Establishment, Harwell, has been conducted in the Electronics and Applied Physics Division and the authors must express their thanks to their many colleagues on this project.


APPENDIX
Organisations which took part in the specification of the IANUS system

International   CERN, European Organisation for Nuclear Research, Geneva   Switzerland
                Centro Comune di Ricerca (Euratom), Ispra                  Italy
                Bureau Central de Mesures Nucleaires (Euratom), Geel       Belgium
Austria         Studiengesellschaft fur Atomenergie                        Vienna
Belgium         Centre d'Etude de l'Energie Nucleaire (CEN)                Mol
Britain         Atomic Energy Research Establishment                       Harwell
                Rutherford High Energy Laboratory (SRC)                    Chilton
                Daresbury Nuclear Physics Laboratory (SRC)                 Daresbury
France          Centre d'Etudes Nucleaires                                 Saclay
                Centre d'Etudes Nucleaires                                 Grenoble
Germany         Physikalisches Institut der Universitat                    Marburg
                Deutsches Elektronen Synchrotron                           Hamburg
                Hahn Meitner Institut                                      West Berlin
                Kernforschungsanlage                                       Julich
                Kernforschungszentrum                                      Karlsruhe
                Physikalisches Institut der Universitat                    Frankfurt
Holland         Reactor Centrum Nederland                                  Petten
Italy           Laboratori Nazionali (CNEN)                                Frascati
                Centro Studi Nucleari (CNEN)                               Cassaccia
                Centro Studi Nucleari Enrico Fermi (CESNEF)                Milan
                Centro Informazioni Studi Esperienze                       Milan
                Instituto di Fisica                                        Bari
Yugoslavia      Boris Kidric Institute                                     Belgrade
Switzerland     Institut fur Angewandte Physik der Universitat             Basle
Secretariat     Dr. W. Becker, CCR, Euratom, Ispra                         Italy

Use of computers in a molecular biology laboratory
by T. H. GOSSLING and J. F. W. MALLETT
M.R.C. Laboratory of Molecular Biology
Cambridge, England

INTRODUCTION
On-line computers have been making considerable progress in recent years, to the point where they represent
a sizeable proportion of the computing field. Only lately
however, have they started to make their mark in the
laboratory as a research tool. In this paper, we shall
illustrate this application from our experience in molecular biology.
In our laboratory at Cambridge, we have been using a
time-sharing on-line computer (a Ferranti Argus 300)
for a little over three years. We hope that the techniques
that we have developed will be of interest to others, and
also that our experience in running this kind of computer
in a specialized environment may be illuminating. We
shall also say a little about possible future developments.
Molecular biology is a science in which computers
have a central role. This is particularly true in the study of protein structure, simply because of the enormous quantity of data to be processed; one of the final processes, a three-dimensional Fourier transformation, may typically contain some 10^6 terms, and have required as many measurements. Until a few years ago this was entirely off-line computing, i.e., data in numerical form had to be prepared by hand, and the output, also numerical, required considerable further hand processing to get it into a useable state. Certain parts of the
process, particularly data collection, have gradually become automated to the extent that punched tape or
cards can be directly generated. From now on, however,
we shall see more computers being used for on-line data
collection and reduction, for control of experiments, and
presentation of comprehensible results.
Since we shall be concentrating mainly on protein
structure, it might be as well to summarize briefly what
is already known on this subject. A protein is simply a
very large organic molecule-of the order of thousands
of atoms. Fortunately, this apparent complexity is reduced by the fact that the molecule is a folded chain
of about 200 amino-acids, and there are only twenty
amino-acids to choose from. To understand how a protein works, we need to know both the sequence of amino-acids in the chain and the physical structure-the way
in which the chain is folded. There are thus two parts to
the study, chemical and physical, and these are complementary.
In Figure 1, we show these two approaches, with an
indication of the places where computers come into the
picture, or might do so in the future. The computers
have been classified as "small," "medium" and "large";
this classification is deliberately vague, but roughly
the dividing lines may be taken in this context as about
20,000 and 100,000 bytes (of 8 bits) of core store, with
appropriate backing store in each case. The many computers shown do not have to be separate, of course. In
our case, we make do with two: the on-line computer already mentioned in the laboratory, and an IBM 360 a
few miles away. The two machines communicate with
each other by magnetic tape, on a daily courier basis.
Use in crystallography

Let us begin with the physical approach-the study of
structure by means of X-ray crystallography. To use
this technique, the protein must be grown as a crystal,
which is a regular lattice of identical molecules in the
same orientation. If this is placed in a monochromatic
X-ray beam, it acts as a diffraction grating, in that, for
certain specific orientations, diffracted beams, called "reflections," are produced. From the intensities and
phases of the reflections, it is possible to reconstruct the
internal form of the molecule, by Fourier synthesis.
Unfortunately, phases are not directly measurable; they
can, however, be inferred by repeating the measurements for the same protein, with a heavy atom chemically attached at a known point-this process is similar
to holography.
Usually two or three such "heavy-atom derivatives"
are needed, and this adds to the amount of data that
must be collected. Typically, a total of 10^6 reflections
would resolve a protein to about 2 Angstrom Units,
which is not fine enough to see individual atoms, but al-

1089

1090

Fall Joint Computer Conference, 1968

lows identification of the amino-acids, given the chemical sequence.
There are two basic methods of collecting the intensities of reflections from a crystal: diffractometers and
X-ray film. A diffractometer (Figure 2) is an instrument for orienting the crystal and an X-ray quantum
detector accurately-to about one minute of arc-orientations being changed from reflection to reflection.
Computation of the setting angles involves a certain
amount of trigonometry, and the accuracy required im-

plies that it must be done digitally. X-ray quantum counting 1S also inherently a digital process, and most diffractometers are now therefore automated to the extent of
taking in settings from pre-computed ~ards or paper tape,
and punching out ~he measurements in a similar form.
From here it is a small step to connect the diffractometer directly to an on-line computer. This offers two
advantages. First, the data can be compressed as they
are collected, so that not only is the convenience increased but also the errors introduced by electro-

FIGURE 1-General diagram of the stages in solving a protein structure

FIGURE 2-A typical diffractometer

mechanical' devices are reduced. Secondly, it becomes
possible to monitor the data collection process in a
"closed-loop" mode, and correct for minor error~ as they
arise.
The control of diffractometer setting and output of
basic measurements can be carried out on quite a small
computer. If it is larger-somewhere in our "medium"
range-then it is possible to do some of the preliminary
processing of data, and add the monitoring function.
The final assembly of complete data, involving scaling
of measurements from a number of crystals and determination of phases, must in any case be carried out on
a large off-line machine.
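The Fourier synthesis performed on the large machine can be illustrated in one dimension; the structure factors below are invented, and the real calculation is three-dimensional with of the order of 10^6 terms.

```python
# Illustrative sketch (not the authors' program): a one-dimensional Fourier
# synthesis of electron density from structure factors F(h). The structure
# factors are invented for the example.

import cmath

def density(x, structure_factors):
    # rho(x) = sum over h of F(h) * exp(-2*pi*i*h*x), x in fractional coordinates
    total = sum(F * cmath.exp(-2j * cmath.pi * h * x)
                for h, F in structure_factors.items())
    return total.real

# Hermitian pairs F(-h) = conj(F(h)) make the synthesised density real.
F = {0: 1.0, 1: 0.5 + 0.2j, -1: 0.5 - 0.2j, 2: 0.1j, -2: -0.1j}
assert abs(density(0.0, F) - 2.0) < 1e-9
```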

Presentation of results
Before turning to the other method of data collection,
X-ray film, we must digress to the far end of our subject, namely the presentation of results. After the measurements have been assembled in a large computer, a Fourier transformation is applied to them, the result being a three-dimensional map of electron density in the molecule. This map is in the form of spot values of density on a regular grid. Traditionally, this was listed on the output printer, section by section, and contoured by hand; if the crystal axes were not orthogonal,
then the numbers had first to be transferred to suitably
laid out paper, also by hand. In all, the process took
about a day and a half per section-two to three months
for a complete molecule.
When we purchased our on-line computer, for 

FIGURE 5-Amplitude-frequency spectrum of blood pressure waveform (From Patel, 1965)

"Problem" waveforms
While the classical shape of the blood pressure
waveform is that illustrated in Figure 4, other
shapes are frequently encountered. A sample of such shapes is shown in Figure 6, the records being taken from 6 different patients. All of the patterns shown are accurate recordings made with the same critically damped system as described above
and the differences are therefore not due to artefacts, although their explanation lies outside the
purpose of this paper. Their importance lies in the
fact that short periods of the overall wave during
which a sign reversal occurs must not be recognized by the programme as new heart beats. Difficulties arise particularly when detailed predictions
based on the exact shape of the waveform are to
be made, especially since the pattern may not be
constant even in one patient. This is demonstrated
in the records shown in Figure 7 where the shape,
DC level, pulse amplitude and rate of the blood
pressure wave all change rapidly in response to a
voluntary temporary increase in the pressure inside the subject's chest.41,42 During this manoeu-

FIGURE 6-Varying blood pressure waveforms

FIGURE 7-The effect on blood pressure and heart rate of a voluntary temporary increase in the pressure within the chest

vre, even in normal subjects, there may occur
pulse pressure variations between successive heart
beats of up to 20% and heart rate variations of up to 45%, while the mean pressure may vary by up to approximately 80% of its original value depending on how hard the subject "blows." Similar but less marked changes in the wave shape also occur with changes of posture and may be compounded by the presence of disease. A further difficulty is the frequent presence of noise (Figure 8) which has been interpreted here in a very wide sense. An example of a tremor in the patient, coughing, movement, interference with the catheter and an abnormal heart rhythm are
shown, all of these events being common in clinical situations. It will be apparent that the frequency and amplitude of these events will usually
distinguish them from a normal recording.


Programme for analysis of individual
waveforms
Detection of the start of contraction of the heart
(systole) has been found possible with a method
based on the speed and duration of the rise of
pressure during this period-probably the most

FIGURE 8-Noise in blood pressure recordings

constant feature of the blood pressure waveform. The computer is programmed to recognize this event by searching for a number of monotonically increasing sample values, the number chosen depending on the sampling rate being used. The diastolic pressure is then taken as the immediately preceding minimum and the systolic pressure as the highest subsequent maximum occurring within a specified time. If this time is properly chosen, the dicrotic notch, the "notch" seen during the downstroke of the wave in a number of the illustrations in Figures 6 and 7, will be ignored. Once systolic and diastolic pressures have been determined it is a simple matter to calculate the mean and pulse pressures and heart rate. Constraints are built into the programme to detect

Applications of Digital Computers to Long Term Measurement of Blood Pressure
high frequency noise and off-scale values, and limits are set for the maximum acceptable percentage variation between successive amplitudes and rates. When an unacceptable value is found the programme will either stop or register a fault count depending on which choice the user has made in a programme "option." Other features of the programme are an automatic calibration search, an initial scan to check that the signal conforms to the user parameters inserted, simple 5-point smoothing to remove high frequency noise, and options for the type of output and the duration of analysis.
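The detection logic described above (a monotonic rise marking systole, the preceding minimum as diastolic pressure, and a bounded search for the systolic maximum) can be sketched as follows. This is a hypothetical Python reconstruction, not the authors' Fortran and machine-code programme; the function name and parameter defaults are illustrative only.

```python
# Hypothetical reconstruction of the beat-analysis logic described above:
# systole is flagged by n_rising monotonically increasing samples, the
# diastolic pressure is the immediately preceding minimum, and the systolic
# pressure is the highest maximum within a fixed search window, which also
# lets the dicrotic notch be ignored. Parameter defaults are assumptions.

def analyse_beats(samples, rate_hz, n_rising=5, search_s=0.25):
    beats = []
    i = n_rising
    while i < len(samples):
        window = samples[i - n_rising:i + 1]
        if all(b > a for a, b in zip(window, window[1:])):
            # immediately preceding minimum = diastolic pressure
            dias = min(samples[max(0, i - n_rising - 3):i - n_rising + 1])
            # highest maximum within the search window = systolic pressure
            end = min(len(samples), i + int(search_s * rate_hz))
            syst = max(samples[i:end])
            beats.append({"index": i, "systolic": syst, "diastolic": dias,
                          "pulse": syst - dias})
            i = end  # resume the search beyond this beat
        else:
            i += 1
    return beats
```

Beat-to-beat heart rate then follows from the spacing of successive "index" values divided into the sampling rate, and the mean and pulse pressures from the two extremes, as in the text.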
The set of user parameters in the present version of this programme is indicated in Table I. The programme has been written in Fortran and also in machine code to increase its speed of operation. It will reliably accept samples at a rate of 600 per second to produce an analysis of the derivatives indicated in Figure 4 with an accuracy within 2%, enabling the speed of replay of the tape recorded signals to be increased by a factor of 16. When the tape speed is increased to 30 times the original and the sampling rate is increased to a thousand per second, which is the highest of which the available hardware is capable, the accuracy of analysis decreases by a further 1 to 2.5%, depending on heart rate. It will be apparent that the amount of data will exceed available core store during analysis of long recordings, since one track of a 16 inch spool can accept a continuous 20 hr recording at a speed of 1⅞ inches per second. The programme is therefore designed to dump data automatically on the disc store after every 320 heart beats and then return to the analysis. Incoming data is lost during this period.
TABLE I-User parameters for blood pressure analysis programme

1. Sampling rate of analogue-to-digital converter.
2. Values of calibration signals (in mm Hg).
3. Number of cycles, or period of time, to be analysed.
4. Permissible variation in (a) amplitude and (b) heart rate between successive cycles.
5. Number of monotonically increasing points required to indicate systole.
6. Fall in amplitude after systolic point before beginning search for diastolic point.

At the end of the entire analysis further programmes are instituted for averaging set periods of the data and determining auto- and cross-covariances. Depending on the latter findings and on the nature and duration of the initial recording, one or more selected derivatives are output on punched tape and occasionally on cards for cumulative sum or other subsequent analysis. For certain tests, an example of which is given later, a "marker" signal is also recorded on the tape during the initial recording. Two types of marker have been employed, viz., a DC signal on an adjacent (unused) track which is then sampled alternately with the blood pressure signal during processing, and an AC signal of fixed amplitude and duration, superimposed on the data track itself. The latter method has been found more efficient for most purposes and the marker signal is detected by a separate loop in the programme. A similar method is used to indicate the end of a record.
Cumulative sum analysis
The beat-to-beat values derived from the foregoing blood pressure analysis constitute in statistical terms a time series of nonstationary data in which the serial values are highly dependent and in which both the mean and root mean square vary with time.40 If the degree of auto-covariance is known, the initial derived series can be sampled at sufficiently infrequent intervals to convert it into a time series of independent samples. Present experience indicates that the degree of auto-correlation in the data varies considerably at different times in the one patient, and between patients.
For the present illustration (Figure 10), serial half-hour-average values of heart rate, derived from a fifteen-day continuous blood pressure recording, have been used. The method of time series analysis applied has been developed for another application by Woodward and Goldsmith,43 and is illustrated in block form in Figure 9. Its purpose is to detect changes in the average level between groups of data in a time series and to determine the point of onset of such changes. The programme causes the computer to read in a time series, to calculate the cumulative sums of the series using the grand mean as a reference value, and to point out the occurrence of significant changes of slope in the cumulative sum chart. These changes can be determined at different probability levels, and the standard deviations of significantly different stages in the series are calculated. The output includes a graph of the

Fall Joint Computer Conference, 1968

FIGURE 9-Outline of programme for cumulative sum analysis. The block diagram comprises: Chapter 0-input raw data; detect and replace freaks; print table of time series and their cumulative sums. Chapter 1-select significant corners in cumulative sum chart in forward direction; print table of significantly different stages in series. Chapter 2-calculate within-stage means and within-stage S.D.; scale and plot graph on same axes of original time series, cumulative sums, and Manhattan diagram.

original data, its cumulative sum or "cusum," and a "Manhattan" diagram, a term used to describe the graph of significantly different stages in the series.
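The core of the cusum calculation just described, deviations from the grand mean accumulated along the series, can be sketched briefly. This is an illustrative fragment, not the Woodward and Goldsmith programme itself; corner selection and significance testing are omitted.

```python
# Minimal cusum sketch (a reconstruction, not the published programme):
# deviations from the grand mean are accumulated along the series, so a
# sustained shift in the average level appears as a change of slope in
# the resulting chart.

def cusum(series):
    grand_mean = sum(series) / len(series)
    sums, running = [], 0.0
    for x in series:
        running += x - grand_mean
        sums.append(running)
    return sums
```

For the step series [1, 1, 1, 3, 3, 3] the cusum runs -1, -2, -3, -2, -1, 0: the slope reverses exactly at the level change, which is the "corner" the full programme tests for significance.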
It is obvious from Figure 10 that the initial series (a) is highly irregular and that trends of variation in the mean level are not very obvious. The cusum chart (b), however, gives a very clear indication of the overall trend, and visual inspection confirms the presence of a pattern in the original data which could easily have been overlooked. The Manhattan diagram (c) has condensed the scattered original data into a relatively small number of groups whose difference from their neighbors is significant, in this case at the 1% level. The within-group standard deviations, although available from the analysis, have not been included in this illustration. Differences produced by treatment of the patient with two anaesthetic agents, nitrous oxide and halothane, are clearly shown in both the cusum and Manhattan plots, the mean having decreased significantly. This method has also been used to test objectively the effects of various other treatments (to be published).
A further important application is shown in Figure 11, where an assessment has been made of the variability of the heart rate in the same patient over the same period. The difference between the highest and lowest values of instantaneous heart rate (excluding "freaks" produced by an abnormal rhythm) has been measured for each successive half hour period, and used as a primary measure of variability. These figures have then been processed in a fashion identical to that described above.

FIGURE 10-Cumulative sum analysis of half-hourly-mean heart rates during a 10-day period in a patient with tetanus (see text)

FIGURE 11-Cumulative sum analysis of the variability of heart rates during a 10-day period in a patient with tetanus (see text)
The general comments already made on the type of output resulting from this analysis again hold true, and it is also clear that the changes in variability detected by this method do not closely parallel the changes in the average heart rate, although treatment with an anaesthetic agent has produced a clear decrease in the variability.
DISCUSSION
Blood pressure and heart rate are usually measured and recorded by a nurse. She is relatively cheap, easily understood, replaceable, reliable and compact. An automatic monitoring system must offer significant advantages over her to justify the increased capital cost and increased complexity. There is little doubt that most automated systems are indeed superior, but there is equally little doubt that most currently available systems are much more effective in producing large volumes of unprocessed data than in efficiently compressing results; they tend to act rather as a team of nurses taking measurements more frequently. The presence of large volumes of potentially useful but unprocessed data in intensive care and research units is a growing problem, and blood pressure records of this type are in fact largely unprocessable, due to the time required for manual analysis. For instance, this may require up to 5-6 weeks for analysis of a continuous record lasting 7-10 days, and even after this time the analysis is limited. The need for computing is very real and will grow with time.
Reference has already been made to previous applications of digital computers to these problems. The present tendency is to make automatic comprehensive measurements and thereafter to compress the data simply by averaging over varying periods of time. Unfortunately, this is not always a particularly sensitive method of compressing the data without losing its information content. The point is illustrated in Figure 12, where five sections of a blood pressure record taken from the same patient at different stages of a disease process are shown. The duration of each of the traces is about 25 minutes, and a calibration signal is shown for each record. On the right hand side an approximate mean level is written, and the highest and lowest pressure during the period is indicated by a mark. It is immediately apparent to the eye that in the upper

FIGURE 12-Blood pressure recordings taken on a pen recorder
with a slow time base (see text)

record, there is great lability of blood pressure, while this decreases in the lower records. Neither the mean pressure, however, nor the blood pressure range adequately reflects the various stages between a very labile and a very stable pressure, although these differences are very important in describing and interpreting changes brought about by the disease process.33,44 The method of cumulative sum analysis described appears to be a more sensitive method for automatically and objectively describing and assessing such results and also approximates to what one normally attempts to estimate by eye. The in-built application of statistical
methods of testing the significance of changes provides one means of assessing treatments which is
free from observer bias and this may well prove
to be one of the most important applications, as
well as being a critical test of the method. Such
assessments must at present be made with caution
since a comprehensive analysis of the auto-correlation in such biological parameters as heart
rate and blood pressure is not yet available.
It is to some extent a disadvantage of this method that the analysis is necessarily retrospective, since prior knowledge of the grand mean and overall standard deviation is required to determine the cusums. The analysis is, however, rapid, and may prove convenient for many units which lack on-line computing facilities. It may also prove possible to develop a method based on the use of provisional estimates of the mean and variance or, in time, to define normal limits of these derivatives.
The methods described for blood pressure analysis should have important applications to the study of other physiological phenomena, the principal difference for many applications being that a lower sampling rate is required (Table II). The variables shown in the table are all very commonly monitored in clinical situations and in research, and the important frequencies and derivatives are shown. Comments made above on the near impossibility of measuring the variability and quantitating trends of blood pressure with standard methods may be made again for some of these functions. Harmonics of the basic frequencies do not at present appear to require analysis for these signals, so that the frequencies to be analyzed are much lower than with blood pressure recordings. In addition, it is only the rate or the mean level which needs to be derived in most cases. The similarity lies in the fact that, each function being a time series in which the degree of autocorrelation is high, they are statistically similar, and are all probably suitable for analysis and presentation by similar techniques, with the addition of cross-correlation analysis. The table is by no means exhaustive, and measurements could easily be extended to include the study of bladder and alimentary tract pressures and motility, some aspects of locomotor activity, and probably other body functions as well.
Research is also required to determine which primary derivatives of the blood pressure waveform have physiological significance and therefore need processing. The mean pressure, mean square

TABLE II-Frequencies and sampling rates for physiological variables suitable for study by techniques similar to those for blood pressure analysis

APPROXIMATE SAMPLING RATES REQUIRED FOR DIGITAL COMPUTER ANALYSIS OF COMMON PHYSIOLOGICAL SIGNALS

Physiological Signal          Frequency range/min   Derivatives usually        Approximate sampling rate/sec
                              in adults             required                   required for computer analysis
                                                                               in real time

Arterial Blood Pressure       45-200                Systolic pressure,         120
                                                    diastolic pressure,
                                                    (mean pressure),
                                                    (pulse pressure),
                                                    (heart rate)
Central Venous Pressure       45-200                Mean pressure              10
Heart Rate (from              45-200                Heart rate                 10 for instantaneous rate;
instantaneous ratemeter)                                                       1 for average rate
Respiration                   8-60                  Respiration rate           10
Temperature                   0 (slowly varying     Temperature                1
                              DC level)

pressure, and heart rate may provide as much information alone as they do in combination with systolic, diastolic and pulse pressures. The mean square pressure in particular is likely to be a powerful derivative, though its derivation has not previously been proposed. For certain applications it could also prove more suitable to derive such primary derivatives with analogue techniques. Both in these research applications and in clinical practice, cumulative sum techniques appear to have important applications. Above all, the necessity for arithmetic processing of records must be affirmed. It is only in the critical analysis of a record that its worth or otherwise becomes apparent, and a great deal of information is likely to be lost if complete reliance is placed on simple visual inspection.
ACKNOWLEDGMENTS
I am grateful to Dr. D. Clarke and Mrs. H. Somner for substantial help with programming, Dr. A. Barr for advice on statistics, Dr. J. M. K. Spalding and Miss R. Williams for other support, and the Oxford University Systems Engineering Group for the use of their facilities. The investigations were supported by grants from the National Fund for Research into Crippling Diseases, the Nuffield Trust and the Wellcome Trust.
The programme for cumulative sum analysis has been kindly made available by Dr. P. L. Goldsmith, M.A., D.I.C. of Imperial Chemical Industries Limited.


REFERENCES
1 W H LEWIS JR
Procedures in measurement of blood pressure: a historical note
Practitioner 184 243 1960
2 W A SPENCER C VALLBONA
Application of computers in clinical practice
JAMA 191 917 1965
3 S RIVA-ROCCI
Un nuovo sfigmomanometro (A new sphygmomanometer)
Gazz Med di Torino 47 981 1896
4 L HILL H BARNARD
Simple and accurate form of sphygmomanometer or arterial pressure gauge contrived for clinical use
Brit Med J 2 904 1897
5 N S KOROTKOFF
Concerning methods of study of blood pressure
Tr Imp Mil Med Akad St Petersburg 11 365 1905
6 W E GILSON H GOLDBERG H C SLOCUM
Automatic device for periodically determining and recording both systolic and diastolic blood pressure in man
Science NY 94 194 1941
7 M B RAPPAPORT A A LUISADA
Indirect sphygmomanometry: physical and physiologic analysis and new procedure for estimation of blood pressure
J Lab Clin Med 29 638 1944
8 J C ROSE S R GILFORD H P BROIDA A SOLER E A PARTENOPE E D FREIS
Clinical and investigative application of a new instrument for continuous recording of blood pressure and heart rate
New Eng J Med 249 615 1953
9 J H GREEN
Blood pressure follower for continuous blood pressure recording in man
J Physiol London 130 37P 1955
10 J H CURRENS G L BROWNELL S ARONOW
An automatic blood pressure recording machine
New Eng J Med 256 780 1957
11 R A JOHNSON
Model 16 automatic blood pressure measuring instrument
USAF Wright Air Dev Ctr Dayton Ohio Tech Rept 59-429 1 1959
12 T VON UEXKULL F KILLING
Ein Apparat zur fortlaufenden unblutigen Registrierung von Puls und Blutdruck (An apparatus for continuous bloodless recording of pulse and blood pressure)
Munch Med Wschr 101 380 1959
13 R W WARE A R KAHN
Automatic indirect blood pressure determination in flight
J Appl Physiol 18 210 1963
14 A KAHN R W WARE O SIAHAYA
A digital readout technique for aerospace biomedical monitoring
Am J Med Electron 2 152 1963
15 R JONNARD chairman
Symposium on patient monitoring: 15th annual conference on engineering in medicine and biology
The Instrument Publishing Company Inc Pittsburgh Pennsylvania 1963
16 L A GEDDES H E HOFF C VALLBONA G HARRISON W A SPENCER J CANZONERI
Numerical indication of indirect systolic blood pressure heart rate and respiratory rate
Anesthesiology 25 861 1964
17 L A GEDDES H E HOFF W A SPENCER C VALLBONA
Acquisition of physiological data at the bedside: a progress report
Ann NY Acad Sci 115 1091 1964
18 B L STEINBERG S B LONDON
Automated blood pressure monitoring during surgical anaesthesia
Anesthesiology 27 686 1966
19 W W HOLLAND S HUMERFELT
Measurement of blood pressure: comparison of intra-arterial and cuff values
Brit Med J 2 1241 1964
20 F H VAN BERGEN D S WEATHERHEAD A E TRELOAR A B DOBKIN J J BUCKLEY
Comparison of indirect and direct methods of measuring arterial blood pressure
Circulation 10 481 1954
21 L N ROBERTS J R SMILEY G W MANNING
A comparison of direct and indirect blood pressure determination
Circulation 8 232 1953
22 J M STEELE
Measurements of arterial pressure in man
J Mt Sinai Hosp 8 1049 1941-2
23 W F HAMILTON R A WOODBURY H J HARPER JR
Physiologic relationships between intra-thoracic, intraspinal and arterial pressure readings
J Am Med Ass 106 853 1936
24 R E JENSEN H SHUBIN P F MEAGHER M H WEIL
On-line computer monitoring of the seriously ill patient
Med Biol Engin 4 265 1966
25 H SHUBIN M H WEIL
Efficient monitoring with a digital computer of cardiovascular function in seriously ill patients
Ann Intern Med 65 453 1966
26 M H WEIL H SHUBIN W RAND
Experience with a digital computer for study and improved management of the critically ill
JAMA 198 1011 1966
27 S H TAYLOR H R MACDONALD R P SAPRU M C ROBINSON
Computers in cardiovascular investigation
Brit Heart J 29 352 1967
28 H SHUBIN M H WEIL M A ROCKWELL JR
Automated measurement of arterial pressure in patients by use of a digital computer
Med Biol Engin 5 361 1967
29 S I SELDINGER
Catheter replacement of the needle in percutaneous arteriography: a new technique
Acta Radiol 39 368 1953
30 F D STOTT
Medium term direct blood pressure measurement
Bio-medical Engineering 1 457 1966a
31 F D STOTT
Methods of assessment of variations of blood pressure
Bio-medical Engineering 1 544 1966b
32 A L MACMILLAN F D STOTT
Continuous intra-arterial blood pressure measurement
Bio-medical Engineering 1 20 1968
33 J L CORBETT
Long-term measurements of intra-arterial pressure in man
In preparation
34 A T HANSEN E WARBURG
Acta Physiol Scand 19 306 1949
35 A T HANSEN
Pressure measurement in the human organism
Teknisk Forlag Copenhagen 1949
36 D L FRY F W NOBLE A J MALLOS
An evaluation of modern pressure recording systems
Circulat Res 5 40 1957
37 H W SHIRER
Blood pressure measuring methods
IRE Trans BME 116 1962
38 D J PATEL J C GREENFIELD W G AUSTEN A C MORROW D L FRY
J Appl Physiol 20 459 1965
39 R B BLACKMAN J W TUKEY
The measurement of power spectra
Dover Publications Inc New York 1959
40 J S BENDAT A G PIERSOL
Measurement and analysis of random data
John Wiley & Sons Inc New York 1966
41 A M VALSALVA
De aure humana tractatus (Treatise on the human ear)
G van de Water Utrecht 1707
42 G DE J LEE M B MATTHEWS E P SHARPEY-SCHAFER
Brit Heart J 16 311 1954
43 R H WOODWARD P L GOLDSMITH
Cumulative sum techniques
Oliver and Boyd Ltd Edinburgh 1964
44 J L CORBETT C PRYS-ROBERTS J H KERR
Cardiovascular disturbances in severe tetanus due to overactivity of the sympathetic nervous system
Submitted for publication

Some conclusions on the use of adaptive linear decision functions
by E. R. IDE, C. E. KIESSLING and C. J. TUNIS
International Business Machines Corporation
Endicott, N.Y.

INTRODUCTION

Any pattern recognition system can be considered to have three sections:

1) a transducer section,
2) a measurement section, and
3) a decision section.

The first two sections transform each pattern to be recognized into one point in a multidimensional space. The axes of this space are the measurements or characteristics of the pattern. The decision section must assign regions of the measurement space to particular classes of pattern. One common and convenient decision surface is the linear boundary or hyperplane; much work has been done with "adaptively derived" hyperplanes.1,2 Algorithmic procedures have been developed for positioning the linear decision boundaries in the measurement space on the basis of statistically meaningful samples of each pattern class.3

Theoretical studies of the classification capability of linear decision boundaries can be performed by assuming a particular statistical distribution of the measurements in the measurement space for each class. For certain assumed distributions in the two-class case, it has been shown that a linear decision boundary is the optimum one.4 Of course, this boundary would not be the optimum one for many real recognition problems. However, the linear boundary has been the subject of many experimental investigations because it is:

1) optimum in certain idealized cases,
2) a convenient boundary to implement in hardware, and
3) convergent adaptive algorithms5 exist for this boundary.

The coding problem

In order to have a pattern classifier we must position a sufficient number of linear boundaries in the measurement space; this separates the measurements arising from one pattern class from the measurements arising from all other pattern classes.
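The "adaptively derived" positioning mentioned above can be illustrated with the classic fixed-increment error-correction (perceptron) rule. This sketch is a generic illustration in Python, not necessarily the algorithm of the cited references; the function name and epoch limit are assumptions.

```python
# Sketch of one convergent adaptive procedure for positioning a hyperplane:
# the fixed-increment error-correction rule. The weight vector is nudged
# whenever a training pattern falls on the wrong side of the boundary; on
# linearly separable data this is guaranteed to converge.

def train_hyperplane(patterns, labels, epochs=100):
    """patterns: measurement vectors; labels: +1 or -1 per pattern."""
    w = [0.0] * (len(patterns[0]) + 1)  # trailing element acts as the threshold
    for _ in range(epochs):
        errors = 0
        for x, y in zip(patterns, labels):
            s = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
            if y * s <= 0:  # pattern on the wrong side of (or on) the plane
                for i, xi in enumerate(x):
                    w[i] += y * xi
                w[-1] += y
                errors += 1
        if errors == 0:  # training set linearly separated: stop early
            break
    return w
```

When the assigned dichotomy is not linearly separable, the loop simply exhausts its epochs, which is the situation the coding assignment problem below is concerned with.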
There are a variety of ways of positioning linear boundaries. Figure 1 shows four pattern classes (in a two-dimensional measurement space) separated by only two planes. (The contour in the figure can be taken to mean that 99% of all patterns of this class will give rise to measurements that fall within this boundary.) Note that these two planes, given the dichotomies they are assigned (i.e., plane "A" must separate classes 1 and 2 from 3 and 4), can indeed be positioned in such a way as to perform the separations. If, on the other hand, we had intended to use a single plane to separate classes 1 and 3 from 2 and 4, the separation would not have been possible. This situation, i.e., nonlinear separability, is referred to as the coding assignment problem and is described in a previous paper by the authors.6
Figure 2 shows another way of separating each class in the measurement space. This particular method uses a significantly greater number of planes. Here, one plane separates each class from one other class. This will be referred to as the class-pair plane coding assignment. This should be the best linear-boundary classifier since each plane only separates one pair of patterns. Note

FIGURE 1-Coded boundaries (planes designated by A and B; arrows designate the "on" side of the planes)

that the regions assigned to each class are determined by different numbers of planes. For example, class 3 is determined by three planes while class 4 only needs one plane (planes E and C are redundant with respect to class 4). The disadvantage of this coding assignment is that a large number of planes is required. For example, if there are n classes then n(n-1)/2 planes are required.
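The n(n-1)/2 count follows from taking one plane per unordered pair of classes, which a two-line illustration makes concrete (the helper name is hypothetical):

```python
# Illustration of the plane count stated above: the class-pair assignment
# needs one plane per unordered pair of classes, i.e., n(n-1)/2 planes.
from itertools import combinations

def class_pair_planes(n_classes):
    return list(combinations(range(n_classes), 2))

# For the four classes of Figure 2 this gives 6 dichotomies (planes A-F).
assert len(class_pair_planes(4)) == 4 * 3 // 2
```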
Another popular method of separating classes with linear boundaries is shown in Figure 3. Here we have assigned one plane to separate each class from all others. This is sometimes referred to as the 1-out-of-n code, because there are as many planes as classes (n) and only one plane will be "on" for one pattern presentation. (By "on" we refer to the state of the threshold circuit corresponding to the plane; a classifiable measurement will be in those regions of the measurement space delineated by the positive side of one plane and the negative side of all other planes.) The problem here, of course, is that it may not be possible to separate one class from all others by means of a single plane.
A variant of the method shown in Figure 3,

FIGURE 2-Class-pair boundaries (planes designated by A, B, C, D, E and F; arrows designate the "on" side of the planes; numbers in parentheses designate class separation, e.g., A(1,2) means plane A separates class 1 from class 2)

called the matched filter approach, uses as many planes (or linear functionals) as the previous case (i.e., only n) but the actual decision boundaries are the bisectors of the various pairs of planes2 (see dotted lines in Figure 3). The threshold circuit network corresponding to this classifier is organized to allow only one plane to be "on" at any one time.
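The matched filter decision amounts to an argmax over the n linear functionals; the following is a hedged sketch of that decision only, deliberately omitting the rejection behaviour of the threshold-circuit network described above.

```python
# Sketch of the matched-filter (1-out-of-n) decision: one linear functional
# per class, and the class whose functional responds most strongly wins.
# This models only the argmax decision, not the hardware rejection logic.

def classify(measurement, weight_vectors):
    scores = [sum(w * x for w, x in zip(ws, measurement))
              for ws in weight_vectors]
    return max(range(len(scores)), key=scores.__getitem__)
```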
The purpose of our experiment was to compare the classification performance of all class-pair planes (in one particular problem) to the classification performance of the matched filter approach. In addition, we tried to develop a classifier that would provide the performance of the class-pair classifier but would have fewer planes. The method started with the class-pair classifier and attempted to eliminate geometrically redundant planes and also to utilize other dichotomies performed by the class-pair boundaries. We show that the performance of the matched filter classifier is surprisingly close to the performance obtained by using all class-pair planes. Our method
of eliminating some of the class-pair planes quickly brought performance levels that were inferior to those of the "1-out-of-n" code, even though we were using more decision boundaries.

FIGURE 3-Matched filter boundaries (planes designated by A, B, C and D; arrows designate the "on" side of the planes)
The classification problem used for this set of experiments was the recognition of a small number of spoken words. In our experiment 15 words or classes were handled. The measurements were obtained by a spectrum analysis of each word; this provided a binary representation of the energy peaks in the frequency domain as a function of time. The speech analyzer and some initial recognition experiments on this data set were reported in an IBM Journal article.7 The data from Reference 7 were used in this experiment.
Our data base is divided into two parts:

1) one set of sample utterances (i.e., the analysis sample), used for adapting or positioning the hyperplanes in the measurement space, and
2) an equal size sample (i.e., the test sample), used to test the performance of the classifying hyperplanes.

It is the performance of the hyperplanes on this hitherto unseen sample that is reported as the recognition performance of the system.
The initial step of the experiment was to adaptively position all class-pair hyperplanes on the basis of the training sample. Since the number of classes was 15, we positioned 105 hyperplanes, each hyperplane having the assignment of separating one class of spoken words from one other class. After the class-pair hyperplanes were positioned, we did a recognition run on this same training set. We were not concerned with how well the class-pair plane separated the two classes for which it was designed (it generally does very well, since this is the training set), but rather we were concerned with what other classes of pattern each class-pair plane separated. Recognize that here the word "separated" means degree of separation; for example, a particular class-pair hyperplane may correctly separate 90% of the members of class 7 from those of class 9 and only 75% of members of class 3 from those of class 5. Thus, we compiled a number of tables indicating not only which classes each class-pair plane separates but also to what level, i.e., 95% separation, 90% separation, etc.
The second step in our experiment was to choose a subset of all class-pair hyperplanes that still separated all patterns, one from the other. From the table that shows what classes each class-pair plane separates to a 95% level, we had to select as many as 43 hyperplanes to partition the space. Each pattern class was described by its position with respect to the 43 planes. This set of 43 binary symbols ("1" denoting that the region is the "on" side of a plane, "0" denoting the "off" side) constituted a code word. Looking at the table that shows what classes each class-pair plane separates to a 90% confidence level, we had to select only 23 planes that would separate all the classes. We then trained the chosen subset of the class-pair planes on the training sample, taking into account the other classes of pattern that we wanted each plane to separate. Once these new planes were chosen and adapted, we determined their performance on the test sample. By using the test sample, we also determined the performance of the classifiers consisting of all class-pair planes, the 1-out-of-n, and the matched filter. This performance data is tabulated in Table I.
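Classification against such code words amounts to comparing a pattern's plane responses with each class's code word. A minimal sketch follows; the nearest-code-word rule by Hamming distance is a plausible but assumed decision rule, not a detail taken from the paper.

```python
# Assumed sketch of classification by code word: each class is a binary
# word over the chosen planes ("1" = "on" side), and an unknown pattern is
# assigned to the class whose word is nearest, in Hamming distance, to the
# pattern's own plane responses.

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def decode(responses, code_words):
    # code_words maps class name -> list of 0/1 plane responses
    return min(code_words, key=lambda cls: hamming(responses, code_words[cls]))
```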
TABLE I-Coding of adaptive linear classifiers

Code                         No. of       Rejects (R)  Substitutions (S)  viS
                             Hyperplanes  (%)          (%)
Class Pair                   105          4.8          0.4                1.4
Reduced Class Pair           43           5.1          1.3                2.6
Reduced Class Pair           25           6.2          1.1                2.6
Reduced Class Pair           22           5.8          1.2                2.6
1-out-of-n (Matched Filter)  15           4.1          1.4                2.4
1-out-of-n (Thresh)          15           11.0         0.3                1.8

Note: This table was prepared by sampling 1050 words.

Table I shows us that the class-pair classifier was the best linear classifier. The results are surprising because the 1-out-of-n coded classifier, requiring only 15 hyperplanes, is exceedingly close to the performance of the class-pair classifier.
Also, we observe the failure of our method of
eliminating some of the class-pair planes in arriving at a c'Ode that is only nearly as good as the
simple l-out-of-n code. We cannot claim that
every recognition pr'Oblem can be handled so well
by this obvious coding assignment, but we have
shown an example here of a practical pattern recogniti'On problem where this was indeed the case.
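The coding scheme described above can be sketched in a few lines. The sketch below is ours, not the authors' implementation: each hyperplane contributes one binary symbol to a pattern's code word, and a pattern is assigned to the class whose stored code word is nearest in Hamming distance (all names and the toy data are hypothetical).

```python
import numpy as np

def code_word(x, planes, thresholds):
    """Binary code word of pattern x: one "1"/"0" per hyperplane,
    depending on which side of the plane x falls."""
    return (planes @ x + thresholds > 0).astype(int)

def classify(x, planes, thresholds, class_codes):
    """Assign x to the class whose stored code word is nearest
    (in Hamming distance) to the code word of x."""
    w = code_word(x, planes, thresholds)
    distances = {c: int(np.sum(w != code)) for c, code in class_codes.items()}
    return min(distances, key=distances.get)

# Toy example: two hyperplanes in the plane, three classes.
planes = np.array([[1.0, 0.0],   # separates left/right half-planes
                   [0.0, 1.0]])  # separates bottom/top half-planes
thresholds = np.array([0.0, 0.0])
class_codes = {"A": np.array([1, 1]),   # "on" side of both planes
               "B": np.array([0, 1]),
               "C": np.array([0, 0])}

print(classify(np.array([2.0, 3.0]), planes, thresholds, class_codes))  # A
```

A reject rule (refuse to decide when the nearest code word is still many bits away) could be added on top of the same distance computation.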
The effect of weight quantization

The implementation of a linear decision function
requires that there be a variable weighting of each
of the binary inputs or measurements. The question we seek to answer in this section is, "How
does the categorization performance of the decision function vary with the number of discrete
weight levels available?" This, of course, has
tremendous implications for the ease of implementing the linear decision function in hardware.
If a resistor array is used, then the tolerance on
each resistor is eased. If the linear decision function is implemented by storing the weights in a
digital memory and simulating the summation
effect of the network, then the amount of storage
required is affected significantly. We thus describe
an experiment where adaptively derived linear
decision boundaries are obtained and then the
effect of different available weight quantization
levels on the recognition performance of the network is determined.

The application considered here was the recognition of the ten numeric characters and one
"special" class (period, comma, dollar sign, minus
sign) from approximately forty typewritten fonts.
The transducer and pre-processor were similar
to those described in Liu and Kamentsky.8 Characters were scanned by a flying-spot CRT optical
scanner; a raster image was produced in a shifting
register. Position-invariant measurements were
made on a character as it passed through the shift
register to produce a 100-bit (approximate) measurement vector.

This 100-bit measurement was the input to the
network of linear functionals explored here. There
are eleven linear functionals, one assigned to each
class of character (in the "matched filter" assignment of the previous part). The total sample obtained for this experiment consisted of 55,000
characters. Approximately one-half of these were
used for the adaptation of the weights; the other
half were used as a test sample.

We were interested in obtaining the weights by
first simulating the adaptive process on the training sample on a digital computer and then storing
the weights in our recognition machine. The recognition portion of the machine was a special-purpose digital computer which "emulates" the set
of linear functionals. Thus, the number of bits
required to store the set of weights was of interest.

After adaption on the training sample, the
weights were all within the range ±256; thus, we
require 9 bits to store each weight. Table 2 shows
the recognition performance on a test sample
when these finely quantized weights were used. It
also shows the deterioration in the substitution
and reject rates as the levels of quantization in
the weights were successively reduced. Also
shown is the square root of the product of the
substitution and reject rates (sometimes used as
a Figure of Merit* for a recognition system).
This is used because there is a "trade-off" between
these two rates for any recognition system, but
their product is approximately constant. Note
that when using even three bits per weight (which
corresponds to only eight distinct weight levels),
the recognition rate has not yet degraded by a
factor of two, while the substitution rate has remained the same. This behavior is a complex function of the nature of the adaptive procedure, the
shape of the decision boundary, and the distribution of characters in the measurement space. With
assumed statistical distributions, it would be possible to "compute" this functional relationship,
but it is interesting to observe it in this real-world application.

*The larger this figure, however, the less the merit of the recognition system.
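The quantization step can be sketched as follows. The paper states only that 9-bit weights in the range ±256 were reduced to fewer levels; the uniform rounding scheme below is our assumption, and the figure-of-merit helper simply computes the √RS quantity quoted in Tables I and II.

```python
import numpy as np

def quantize(weights, bits):
    """Quantize real-valued weights to 2**bits uniform levels spanning
    the observed weight range (our assumed scheme, not the paper's)."""
    levels = 2 ** (bits - 1)                      # signed range of levels
    step = np.max(np.abs(weights)) / levels       # size of one quantization step
    return np.clip(np.round(weights / step), -levels, levels - 1) * step

def figure_of_merit(reject_rate, substitution_rate):
    """sqrt(R*S), the Figure of Merit used in Tables I and II."""
    return (reject_rate * substitution_rate) ** 0.5

rng = np.random.default_rng(0)
w = rng.uniform(-256.0, 256.0, size=100)          # one functional's 100 weights
w3 = quantize(w, bits=3)                          # only eight distinct levels
print(len(np.unique(w3)))                         # at most 8
```

With 3-bit weights, the 11 functionals of 100 weights each need only 3 x 100 x 11 = 3300 bits of storage, which is the figure quoted below.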

Conclusions on Use of Adaptive Linear Decision Functions
Thus, if this decision procedure were implemented by storing the individual weights in a digital machine, only 3300 bits** of storage would
produce a quite respectable classifier. This storage
requirement is about 1/20 of that required by a
familiar method9 (that of storing one, or more,
ternary references for each class and computing
the "distance" of the unknown from the stored
references), but achieves a comparable recognition rate. This advantage of linear decision functions has not, to the authors' knowledge, been
noted in the literature.

Unsupervised learning

The literature of pattern recognition, signal detection, adaptive systems, and self-organizing systems has treated the subject of unsupervised learning (learning without a teacher) from both the
theoretical and empirical points of view.10

We now summarize an experimental investigation of an unsupervised adaptive pattern recognition algorithm previously reported elsewhere.11
Our adaptive linear classifier undergoes an early
training period in which it is presented with a
number of labeled samples, followed by a later
period (which may be indefinitely long) during
which it continues to adapt its parameters based
on its own decisions.

An unsupervised system can begin with parameters generated from a small labeled sample, and
then can use a large sample of unlabeled patterns
to design accurate class boundaries. In a typewritten character problem, Nagy12 used starting
parameters generated from nine type fonts. His
unsupervised system (quite different from that
used here) designed decision criteria for each of
twelve other fonts from a five-hundred-character
sample of each. Using no labeled patterns from
these twelve fonts, he achieved "essentially single-font, single-machine performance" on each.

It is intended that, during normal use of a recognition system, unsupervised adaption will take
place and will allow the system to follow gradual
changes in the class distributions due to data
changes or hardware degradation. Such "tracking"
was attempted by Koford and Mays13 with a supervised algorithm which, without data repeating
(without using a given input more than once),
will track changing statistics and remain close to
**Three bits for each weight; 100 weights/functional; 11 linear
functionals, 1 functional assigned to each class.

optimal, if the changes are not too rapid. Cooper
and Cooper14 suggested an unsupervised algorithm for tracking time-variant statistics in a very
special case that does not generalize.

The supervised algorithm

We repeat the supervised algorithm: We construct a vector Wi and a constant ti for each
class in the problem, such that each pattern vector
X is assigned to a class as follows:

If Wi·X + ti > Wj·X + tj + ε for all j ≠ i,  (1)

the pattern X is assigned to the class i. This type
of classifier is sometimes referred to as a trainable matched filter.

If there is no i such that Equation (1) is true,
the pattern is assigned to no class; it is rejected.
For this reason, ε is often called the reject
threshold.

We obtain the set of vectors and constants by
presenting iteratively to the network a sequence
of sample measurements from each class to be
identified, where the class of each sample is
known. Our sample must be statistically representative of the individual classes.
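Decision rule (1) can be transcribed directly. In this sketch (our own; the toy weights are hypothetical), W is a matrix whose rows are the vectors Wi, t holds the constants ti, and eps is the reject threshold ε.

```python
import numpy as np

def decide(x, W, t, eps):
    """Return the index i satisfying Wi.X + ti > Wj.X + tj + eps for all
    j != i, or None (reject) if no class wins by the required margin."""
    scores = W @ x + t
    order = np.argsort(scores)
    best, runner_up = order[-1], order[-2]
    if scores[best] > scores[runner_up] + eps:
        return int(best)
    return None  # rejected

# Toy usage: two classes in two dimensions.
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])
t = np.array([0.0, 0.0])
print(decide(np.array([3.0, 1.0]), W, t, eps=0.5))  # 0
print(decide(np.array([1.0, 1.0]), W, t, eps=0.5))  # None (within eps)
```

Note that only the best score and the runner-up matter: the rule holds for all j ≠ i exactly when it holds against the second-highest score.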
The algorithm has advantages noted elsewhere.
It is general in application and is relatively easy
to implement in hardware. It adjusts the weights
(Wi and ti) after each pattern, eliminating the
need to save patterns or sums of patterns in some
sort of storage. In previous experimental work
with both spoken word and optically sensed typewritten patterns (done by one of the authors), it
has displayed somewhat better results than those
of several other current methods.15

The unsupervised algorithm

To the advantages mentioned above, the unsupervised algorithm adds the benefits of unsupervised adaption: labeling elimination and tracking
ability. It differs from the supervised algorithm
in that the class of X is unknown. Therefore,
the adaption must proceed as if the class assigned
by the decision rule, Equation (1), is in fact the
class of X.

Specifically, the algorithm proceeds in the following way: If the pattern is rejected or strongly
assigned to a class, the vectors are not changed.
If the pattern is weakly assigned to a class m,
Wm and tm are incremented as though X belonged to class m, and the vector of the nearest
other class (n) is decremented. The details of the
algorithm are presented in (11).
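One unsupervised step can be sketched as below. This is our reading of the text, not the authors' code: the margin defining a "strong" assignment and the unit step size are our assumptions, since the paper defers those details to reference (11).

```python
import numpy as np

def unsupervised_step(x, W, t, eps, strong_margin):
    """One unsupervised adaption step: no change on a reject or a strong
    assignment; on a weak assignment to class m, reinforce (W_m, t_m) and
    decrement the nearest other class n."""
    scores = W @ x + t
    order = np.argsort(scores)
    m, n = order[-1], order[-2]        # decided class, nearest other class
    margin = scores[m] - scores[n]
    if margin <= eps or margin > strong_margin:
        return W, t                    # rejected or strongly assigned
    W[m] += x; t[m] += 1.0             # as though x belonged to class m
    W[n] -= x; t[n] -= 1.0             # decrement the nearest other class
    return W, t

# Toy usage: a weakly assigned pattern moves the decided class toward it.
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])
t = np.zeros(2)
W, t = unsupervised_step(np.array([2.0, 1.0]), W, t, eps=0.5, strong_margin=3.0)
print(W[0], t[0])   # class 0 reinforced
```

After a supervised start, repeating this step on unlabeled patterns lets the weights drift with the incoming statistics, which is the tracking behavior studied below.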

Data

The performance of an adaptive linear classifier
designed (trained) using the unsupervised algorithm just described was extensively tested in two
distinct pattern recognition problems: spoken
word recognition (the same data base as in Part
I) and handprinting recognition. The two problems clearly had different statistical properties,
although they had roughly the same number of
pattern classes. The measurement space of the
spoken word problem was 320-dimensional; that
of the handprinting recognition problem was 180-dimensional.

We shall repeat the detailed results concerning
the word recognition data, with the fact noted
that substantially the same kind of performance
was obtained in the other, rather different, application. This fact tends to indicate that the algorithm has general applicability.

Results

Generalization

The classifier was first trained in the supervised
mode (i.e., on a labeled set) of thirty alphabets.
(An alphabet is defined as a set of single utterances of each word in the vocabulary.) Then the
performance of the classifier (weight vectors)
was tested on the identification sample after
each additional ten alphabets of unsupervised
training. The results, shown in Figure 4, show
a decrease in the reject rate from 10.5% to 1.6%
as the classifier is "trained" on additional alphabets. There was also a decrease in the substitution rate.

FIGURE 5-Performance of starting weights and of the unsupervised system during each ten alphabets of training
Performance during training
However, the normal mode of an unsupervised
classifier would be described by its "performance
during training"; i.e., its recognition rate as it is
adapting itself on incoming patterns. To obtain
a rate, we show the recognition rate for each successive 10 alphabets (150 words) as the classifier
is exposed to these alphabets. Figure 5 shows the
reject and substitution rates (for each 10 alphabets) as the system is undergoing unsupervised
training. In addition, we show the comparable
recognition performance of the fixed classifier
using the starting weights, i.e., the weights obtained after supervised learning on thirty labeled
alphabets. The unsupervised weight classifier
fluctuates much less than the fixed classifier and
generally improves as its experience increases.

TABLE II-Performance of a set of weights quantized into a varying number of levels

Weight Size                     Rejects (R)  Substitutions (S)  √RS
                                (%)          (%)

9-bit weights as
originally generated            0.443        0.111              0.219
6-bit weights
derived from 1                  0.426        0.110              0.215
5-bit weights                   0.462        0.111              0.226
4-bit weights                   0.523        0.136              0.226
3-bit weights                   0.634        0.111              0.265
2-bit weights
("large" reject zone)           4.07         0.154              0.791
2-bit weights
("small" reject zone)           1.06         0.555              0.767

Note: This table was prepared from approximately 55,000 characters obtained from a CRT flying-spot scanner; the characters were then preprocessed to about 100 measurements. Also, 27,930 characters were used for the test sample and 27,830 were used for adaptation of the weights (training sample).

FIGURE 4-Unsupervised generalization on the ID sample after each ten alphabets

Tracking pattern shifts

To test "tracking capability" under more severe
conditions, a systematic and severe change in the
measurement statistics was artificially created.
This change was a right shift by one column of all
the bits in the pattern matrix. We felt that performance of the unsupervised algorithm in tracking such a distortion would be generally indicative of its capability of tracking a variety of
changes in the statistics due to other malfunctions.

A sample of 100 alphabets was used in this experiment. An initial supervised learning was done
on these alphabets in their nominal position. These
initial classifier weights were preserved; their
recognition performance on the patterns of the
subsequently shifted 100-alphabet sample is
shown in column one of Table 3. Each pattern of
the 100-alphabet sample was then shifted right
one column, and a classifier was trained on them
(1 pass) in the unsupervised mode, using the
previously mentioned supervised weights as the
starting weights. For comparison, a supervised
(labeled) training was also done on the shifted
patterns under the same conditions. Four successive column shifts were performed. For each shift
the performance of the unsupervised and supervised classifier was measured (see columns two
and three of Table 3).
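The artificial distortion used here is simple to state precisely. The sketch below (ours, with a hypothetical toy pattern) shifts every bit of a pattern matrix one column to the right, with zeros entering at the left edge:

```python
import numpy as np

def shift_right_one_column(pattern):
    """Shift all bits of a binary pattern matrix right by one column;
    zeros fill the vacated leftmost column."""
    shifted = np.zeros_like(pattern)
    shifted[:, 1:] = pattern[:, :-1]
    return shifted

p = np.array([[1, 0, 0],
              [0, 1, 0]])
print(shift_right_one_column(p))
```

Applying this repeatedly reproduces the four successive column shifts of the experiment.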

TABLE III-Shifting of patterns-Percent correct recognition (forced decision) of fixed weight, unsupervised, and supervised tracking classifiers

Shift in  Fixed Weight  Unsupervised  Supervised Tracking
Bits      Classifier    Classifier    Classifier

0         100.0         -             -
16        92.5          99.2          99.3
32        55.0          97.4          99.1
48        28.0          96.2          99.3
64        18.0          93.1          98.9

It can be seen (from column one, Table 3) that
after the four shifts, the original weights are indeed useless in a fixed classifier, for their recognition rate is less than 20%. However, the unsupervised classifier has tracked the changing data and
is still getting 93% correct recognition.

This experiment clearly indicates that unsupervised training allows the recognition system to
survive (i.e., continue to perform well) even a
relatively rapid change in measurement statistics.
Such changes may occur for a variety of reasons,
such as degradation of hardware or changes in the
data input environment.
SUMMARY

Based on the experimental work reported here
(and other work reported elsewhere), the authors
conclude that the use of adaptively derived linear
decision boundaries in practical pattern recognition systems deserves serious consideration. It has
been shown to be more than competitive with
other classification methods.15 We have further
shown here that the simple, trainable matched
filter represents a powerful coding scheme in the
practical cases investigated here. Undoubtedly,
there are applications where piecewise linear decision boundaries will have to be used.3

Once the linear decision boundaries have been
derived, surprisingly few discrete weight levels
have been shown to be usable without
significant sacrifice in performance. This allows
either the digital storage requirements to be relaxed, if a special-purpose digital processor simulating the effect of the linear functionals is used,
or allows the use of analog implementations with
relaxed tolerances on the stored weights. Hitherto,
the tolerance requirement has been the main barrier to an economical analog hardware implementation of linear decision functions.

The experiments reported here show that unsupervised learning has potential utility. It has
been demonstrated, in two applications, that a
classifier in the unsupervised mode can follow
changing statistics of the input pattern set. This
may indeed be the most useful aspect of adaptively derived linear decision boundaries.
ACKNOWLEDGMENT
The authors wish to acknowledge the assistance
of Mitchell P. Marcus.


REFERENCES
1 N J NILSSON
Learning machines: Foundations of trainable pattern classifying systems
New York McGraw-Hill 1965
2 J S GRIFFIN JR J H KING JR C J TUNIS
A pattern-identification device using linear decision functions
In Computer and Information Sciences
J T Tou and R H Wilcox Eds Washington D C Spartan pp 169-193 1964
3 R O DUDA H FOSSUM
Pattern classification by iteratively determined linear and piecewise linear discriminant functions
IEEE Transactions on Electronic Computers vol EC-15 pp 220-232 April 1966
4 J S KOFORD G F GRONER
The use of an adaptive threshold element to design a linear optimal pattern classifier
IEEE Transactions on Information Theory vol 12 No 1 pp 42-50 January 1966
5 N J NILSSON
Op cit Chapters 4 and 5
6 C E KIESSLING C J TUNIS
Linearly separable codes for adaptive threshold networks
IEEE Transactions on Electronic Computers Vol EC-14 No 6 pp 935-936 December 1965
7 J H KING JR C J TUNIS
Some experiments in spoken word recognition
IBM Journal of Research and Development January 1966
8 L A KAMENTSKY C N LIU
Computer-automated design of multifont print recognition logic
IBM Journal of Research and Development Vol 7 pp 2-13 January 1963
9 C N LIU G L SHELTON JR
An experimental investigation of a mixed-font print recognition system
IEEE Transactions on Electronic Computers vol EC-15 pp 916-925 December 1966
10 J SPRAGINS
Learning without a teacher
IEEE Transactions on Information Theory Vol IT-12 No 2 pp 223-230 April 1966
11 E R IDE C J TUNIS
An experimental investigation of a nonsupervised adaptive algorithm
IEEE Transactions on Electronic Computers Dec 1967
12 G NAGY G L SHELTON JR
Self-corrective character recognition system
IBM Yorktown Research Report RC 1475 1965
13 J S KOFORD C H MAYS
Adaption of a linear classifier without data repeating
Record of the 1965 International Space Electronics Symposium November 2-4 1965
14 D B COOPER P W COOPER
Nonsupervised adaptive signal detection and pattern recognition
IEEE Transactions on Information Theory vol IT-12 No 2 pp 215-222 April 1966
15 R G CASEY Editor
An experimental comparison of several design algorithms used in pattern recognition
IBM Yorktown Research Report RC 1500 1965

Experiments in the recognition of hand-printed text:
Part I-Character recognition
by JOHN H. MUNSON
Stanford Research Institute
Menlo Park, California

INTRODUCTION AND BACKGROUND
Among the many subject areas in the field of pattern
recognition, the recognition of machine-printed and
hand-printed alphanumeric characters has perhaps been
the classic example to which people have referred in
exemplifying the field. Interest in character recognition
has long run high; an extensive literature in hand-printed character recognition alone dates back to at
least 1955.1-36
In recent years, the recognition of machine printing
has become a commercial reality. Following the introduction of the highly controlled E13B magnetic font
by the banking industry, several advances in optical
character recognition (OCR) capability have been
brought to the marketplace. The trend of these advances
is toward the acceptance of broader and less controlled classes of input: from single, stylized fonts to
multi-font capability; from high-quality copy to
ordinary inked-ribbon impressions, and even to multipart carbons of surprisingly poor quality. Still, in
contrast to hand printing, the approaches to OCR have
been able to rely on the lack of gross spatial distortions
in the character images, and to make considerable use
of templates.
Progress in the off-line recognition of hand printing
has been slower. The problem is intrinsically harder
than that of OCR, as reflected in the fact that the
human recognition error rate for isolated, hand-printed
characters is many times higher than for machine
printing. The great spatial variability of hand-printed
characters has led many researchers to explore nontemplate methods for recognition.
Thus, the major effort of many researchers has been
the exploration of unique methods of preprocessing, or
feature extraction, applied to the hand-printed character images. Dinneen,1 in one of the earliest papers,
investigated local averaging and smoothing operations
to improve the quality of the character image. Similar

operations have appeared as a part of many other
approaches.4,7 Lewis,15 Uyehara,21 Stern and Shen,23
and Rabinow Electronics31 have used schemes in which
the sequence of intersections of a slit scan with the
character image, or the equivalent, gave rise to features
for classification. Lewis15 was one of the relatively few
to emphasize the use of multiple-valued rather than
binary-valued features, an ingredient we have found
important in our own work.
Singer12 and Minneman30 employed a circular raster,
which can facilitate size normalization and rotation
invariance. Unger,7 Doyle,9 and Glucksman27 have
emphasized features derived from shape attributes
such as lakes, bays, and profiles. The building up of a
character representation from component elements
matched to the image, such as short line segments or
portions of the boundary, has been attempted by
Bomba,4 Grimsdale et al.,6 Kuhl,19 and Spinrad.26 Correlation techniques have been tried by Highleyman13
and Minneman.30 Contour-following with a captive
flying-spot scan or its simulated equivalent has appeared
in the work of Greanias et al.,20 Bradshaw,22 and
Clemens.28 The work of Greanias et al.20 is especially
significant because it led to the method used in the IBM
1287 character reader.
Other workers have placed greater relative emphasis
on classification techniques and on the selection of features from a feature set or pool. Chow16,29 has long
worked with statistical classification methods. Bledsoe
and Browning3 and Roberts8 applied adaptive procedures to features obtained from more or less random
connections with the image raster. Uhr and Vossler11
performed an important pioneering study of a program
that "generates, evaluates, and adjusts" its own
parameters. Not surprisingly, however, the automatically generated features were confined to simple,
local templates.
The recognition of characters printed subject to
specific constraints (such as guide markers appearing
in the printing area) has been studied by Dimond,2
Kamentsky,14 and Masterson.18
It may be said of most of these investigations that
they were in the academic, rather than the practical,
realm. In general, the methods were never tested
against a body of real-world data large enough to give
some estimate of their performance in a practical
situation. This probably reflects a common emphasis
on checking out a preprocessing scheme rather than
attacking a particular application problem; it certainly also reflects the labor and equipment requirements involved in collecting and controlling a significant body of data. An exception to this general statement is the work of Highleyman and Kamentsky in
the early 1960's, in which they used data files numbering
in the thousands of characters.13,14 Also, several files
each containing many thousands of characters of
graded quality were gathered in conjunction with the
development of the IBM 1287 character reader and are
currently in use at IBM and in our group. Bakis et al.35
describe these data, on which they and others at IBM
have performed extensive experiments.
The use of context to improve recognition performance, which figures prominently in our own work,
was discussed briefly by Bledsoe and Browning,3 but
otherwise has received scant attention in the past.
Some studies have been carried out under simplifying
assumptions such as Markov dependence in digrams
and trigrams.

Chodrow et al.31 surveyed hand-printed character-recognition techniques in 1965 and discussed at some
length the procedures of Clemens,28 Greanias et al.,20
and Rabinow Electronics. The book Pattern Recognition
by Uhr32 reprints a number of the important source
papers3,6,8,11 and contains a well written survey. An early
progress report on the work described herein was given
by Munson.36
Recently, commercial organizations have announced
the capability to read off-line hand printing. At the
date of this writing (early 1968), one system (the IBM
1287 optical reader) has achieved pilot production
operation. The 1287 reader can read the ten numerals
and five letters. Another system is announced to have
full alphanumeric capability.

A common characteristic of the announced systems
is that they are intended to work with hand printing of
very high quality, produced by coders who have undergone training in the skill of printing for machine
recognition. If individual characters must be recognized
with, say, better than 99.9% accuracy in order to
yield usable document acceptance rates, this type of
training is clearly required. Some experiments that will
be described in the next section show that humans
cannot recognize isolated characters printed by an
untutored population with any rate approaching the
required accuracy.
In our work, we have taken the alternative approach:
Given text from an untutored coder, in which the
individual characters cannot be recognized (by man or
machine) with high accuracy, contextual analysis is
used to reduce the error rate. Every form of text has
its own contextual structure, which is utilized by
humans in a complex, largely unconscious process. We
have therefore emphasized the following points in our
research: the establishment of large hand-printed data
files of known quality; the choice of a well defined
character alphabet and textual situation (FORTRAN
program texts) as a vehicle for study and the reporting
of results; the use of multiple approaches to preprocessing; context analysis to improve recognition; and
the preservation of non-binary confidence information
between the preprocessor and classifier and between
the classifier and the context analyzer.

In a companion paper,37 Duda and Hart describe the
use of programmed contextual analysis in the recognition of FORTRAN program texts. The present paper
will therefore concern itself only with the problem of
recognizing individual characters.
Problem definition

In a recent paper, the author has argued that there is
an infinity of character-recognition problems, and that
recognition results are meaningless as they are often
reported in the literature, without an adequate description of the problem being treated.38 Accordingly,
we shall try to describe the two recognition problems
dealt with in this paper thoroughly enough that the
reader can form an intuitive opinion of the difficulty of
the problems.

We must first distinguish between off-line character
recognition from a printed page, and on-line recognition,
in which the characters are generated by a light pen,
RAND tablet, or similar device.24,33,34 On-line recognition is much simpler because the data provide a nearly
exact trace of the path of the writing instrument and
give accurate stroke-position and time-sequence information. Furthermore, an error rate of as much as 5%
may be considered acceptable, because each character
can be classified, displayed, and corrected immediately
by the writer if it is wrong.

The recognition of hand-printed characters should
also be distinguished from that of cursive (connected)
script.25 The separation of the printed characters and
the fact that each belongs in a well-specified category
obviate the "segmentation problem" that makes cursive-script recognition much more difficult.

Within the framework of off-line block hand printing,
the difficulty of a particular problem is still affected by
many variables: the size of the alphabet; the "standard" forms of the individual characters and the degree
of constraint placed on their formation; the size, spacing, and arrangement of text on the page; the writing
instrument(s); the number of writers; their training
and motivation; and the (fixed and time-varying)
characteristics of each individual writer. To illustrate
the variability of hand printing, we may cite several
instances of human recognition rates on samples of
hand printing. Neisser and Weene reported a 4.1%
average error rate on characters printed by visitors at
the front gate at Lincoln Laboratory.10 With all subjects
voting together, the error rate was 3.2%. We have
reported an error rate of 11% on the well-known
quantized character set collected by Highleyman,
which suffers from crude quantization of the characters.38

On the multiple-coder data file used in our experiments and described below, the error rate was 4.5%;
on the single-coder file, 0.7%. Finally, present commercial systems are intended to operate with character
error and reject rates on the order of 0.1% to 0.01%.
The most significant determinants of hand-printing
quality are the training and the motivation of the
printing population. Our choice in the work described
in this paper was to treat data from an essentially untutored, moderately motivated population, represented
by computer users who hand-code program texts for
keypunching. Such a coder has typically received no
instruction in printing, beyond a few rules about
slashing or crossing characters to avoid such confusions
as I-1, O-zero, and 2-Z. He does receive feedback of the
results from prior keypunching jobs, which motivates
him to maintain (perhaps grudgingly) a certain level of
legibility. Thus, while this printing is far sloppier than
that allowed by presently announced recognition systems, it is more legible than that produced by the
general public while, for example, addressing mail.

Two files of data were used in the experiments reported in this paper, a multiple-coder file and a single-coder file. The characters in both files were hand-printed on standard general-purpose coding sheets
obtained from the Stanford Research Institute computer center. The cells on these sheets measured 1/4
inch high by 3/16 inch wide, with no extra spacing
between cells. A thin-lead mechanical pencil with an
HB (soft) lead was used, after brief experimentation
indicated that no other conventional writing instrument gave crisper images when viewed through our
input system. (A pencil is the preferred instrument
because it facilitates erasure.) The coder was free to
use whatever character size he found natural.


The 10 numerals, the 26 uppercase letters, and the
symbols [ ] = * / + - . , $ comprised the alphabet of 46
characters. This is the basic FORTRAN alphabet, with
brackets substituted for parentheses in accordance with
the convention associated with our computer system at
the time. The blank was not treated as a character
category, the recognition of blanks being more a function of a document-scanning subsystem than a pattern-recognition problem. We instructed the coders to print
zero with a diagonal slash and Z with a midline slash,
and to put crossbars on the letter I. Numeral 1 was to
be without serifs; several coders, however, added serifs.
Other choices were left to the individual, such as open
versus closed 4, the crossbar on J, and the number of
verticals in $.
Multiple-coder file

Printed data from 49 individuals were included in
the multiple-coder file. Each person was asked to print
several 46-character alphabets on a coding sheet (at
one sitting), and the first 3 alphabets from each sheet
were taken for the file. The data from the first 32 persons (96 alphabets, 4416 characters) were used as
training or design data during the experiments, and the
data from the remaining 17 persons (51 alphabets,
2346 characters) for test. The coders of the training data
were all personnel of the author's laboratory and the
computer center at SRI. The coders of the test data
were 8 from SRI and 9 from the US Army Electronics
Command, Fort Monmouth, N.J. Any cross-country
bias in printing styles is probably small compared with
individual differences.
Portions of several of the test alphabets are shown in
Figure 1. The coders were asked to print naturally,
being neither especially casual nor especially meticulous. However, it is obvious that data gathered this
way are not candid; they are probably better than data
from actual coding sheets prepared for keypunching.
Unfortunately, it was not feasible for us to process
candid data from a number of people using a variety of
coding forms and languages.
Five human subjects were asked to classify the
characters in 17 of the test alphabets (one from each
coder), viewing the quantized images (see the section
on scanning) in isolation and in random order on a
cathode-ray tube display. The error rates ranged from
3.0% to 6.4%, with an average of 4.5%. Taking a
plurality vote among the five responses, the error rate
was 3.2%.
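The plurality vote used to pool the judges' responses is ordinary majority voting over category labels; a minimal Python sketch (the function name is ours, not the paper's):

```python
from collections import Counter

def plurality_vote(labels):
    """Return the most common label among several judges' responses.

    Ties are broken arbitrarily (first-encountered order), matching
    the paper's no-reject scoring convention.
    """
    return Counter(labels).most_common(1)[0][0]

# Five judges classify one quantized character image; the plurality
# wins even though no single judge is authoritative.
print(plurality_vote(["Z", "2", "Z", "Z", "7"]))  # -> Z
```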
Single-coder file

Experiments were also performed with a single-

1128 Fall Joint Computer Conference, 1968

FIGURE 2-A sample of the single-coder test data

coder file, in order to investigate the improvement in
performance resulting from allowing the recognition
system to specialize in the printing of a single individual.
This file contained 1727 training characters and 1042
test characters. The training set included 15 alphabets
(690 characters) of the type collected for the multiple-coder file. The remaining 1037 training characters were
taken from FORTRAN text on coding sheets, as were
the 1042 test characters. The 15 alphabets were included
in the training set to ensure adequate representation of
all the character categories, since their appearance in
actual text was haphazard.
The text characters were taken from FORTRAN
coding sheets prepared by the author in the course of
actual program development, some months before the
recognition experiments were performed. The coder
corrected major malformations of characters as he
noticed them, but avoided printing with unnatural
care. Thus, while these data are not candid, it is felt that
they closely model a realistic situation that would be
obtained if one tried to serve a coder who was making
a minimal effort to assist the system.
A sample of the test data is shown in Figure 2. We
may describe these characters as being quite legible to
humans but not highly regular. Ten human subjects
were asked to classify the test characters. The average
error rate was 0.7%. Taking a plurality vote among the
10 responses, the error rate was 0.2% (2 errors in 1042
characters).

FIGURE 1-Portions of several multiple-coder test alphabets

Scanning
The hand-printed characters were scanned from the
source documents (the coding sheets) by a vidicon television camera fitted with a close-up lens and operated
under the control of an SDS 910 computer. Each document was mounted in a concave cylindrical holder so
that, as the camera panned across the document, the
viewing distance and hence the image scale remained
constant. The field of view was approximately one inch
square. The camera generated a standard closed-circuit
television waveform, which was quantized to two
levels (black/white) by a Schmitt trigger and sampled
in a raster of 120 X 120 points.
The document was illuminated by four floodlights
mounted around the TV camera. A colored filter was
placed over the camera lens, to suppress the colored
coding-sheet guidelines appearing on the document.
The guidelines could have been used for locating the
characters, but we preferred to strive for a free-field
character-locating procedure that could ultimately handle between-the-lines corrections or coding on a blank
sheet of paper. Also, without a color-sensitive input
system, separating the guidelines from the characters
where they crossed or coincided could be a major
problem.
The field of view was chosen so that a single character image was usually a little less than 24 points high
and about 15 points wide. The computer began the
scanning by reading in a 120 X 120 picture containing,
in general, several character images.

Experiments in Recognition of Hand-Printed Text

A scanning routine then proceeded approximately horizontally through the
picture, finding and isolating character images. Provisions were included for tracking a line of text, and for
accepting multi-part character images such as equals
signs and characters with unconnected crossbars.
When the scanning routine got to the right of the
120 X 120 picture, it requested the camera to move to
the right and input another picture.
As each character was isolated, it was placed in a
standard 24 X 24 raster format (Figure 3). No corrections for magnification or rotation were applied.
The BCD code of the character was entered manually

FIGURE 3-A hand-printed character in the standard 24 X 24 format


at the console typewriter and attached to the character
record, for subsequent use in the training and testing
procedures. The two files (single-coder and multiple-coder) of quantized 24 X 24 black/white character
images served as the starting point for all subsequent
processing. We hope to make these files available to
other researchers through the efforts of the Subcommittee on Reference Data Sets of the Committee on Pattern Recognition of the IEEE Computer Group.
Our scanning setup was "strictly experimental." It
was an inexpensive substitute for the sophisticated
optical scanner and mechanical transport required for
a high-volume production system. Although the scanning routine enabled us to gather the thousands of
quantized characters in our data files, it was never
capable of running without an attendant to rescue it
from its errors. These were due to badly non-uniform
sensitivity across the field of view (common in vidicon
tubes), which made it impossible to set a single quantization threshold valid throughout the field, and to the
lack of precise knowledge of the position of the TV
camera. (Incidentally, by solving these problems, it
should be possible to create a low-speed, inexpensive
automatic scanning system along the lines of the one
described above.)
Other files of digitized hand-printed data, supplied
through the courtesy of W. Highleyman and researchers
at IBM Corporation and Recognition Equipment,
Inc., have been processed merely by converting them
to our standard 24 X 24 format. In some cases, this
has required changing the size of the character raster
by copying or deleting rows and columns.

Preprocessing
The term "preprocessing" has acquired a variety of
meanings. We use it here to refer to the specific activity
of feature extraction: The calculation, from the
(quantized) character image, of a set of numerical feature values that form the basis of subsequent pattern
classification.
Two preprocessing methods were used in these
experiments. The first, embodied in a computer program called PREP, was a simulation of a previously
constructed optical preprocessor capable of extracting,
in parallel, 1024 optical correlations between a character image and a set of photographic templates, or
masks.40 The second, a program called TOPO, extracted a large number of topological and geometric
features of the character image.

The PREP preprocessor
The PREP program performed edge detection on
the 24 X 24 quantized images through the use of


FIGURE 4-Edge-detecting masks in PREP
(a) Quantized character image
(b) An edge mask
(c) Character and mask together

edge-detecting mask pairs, or templates. Each mask
pair consisted of two 2 X 8 rectangles of points, adjacent to each other along their long edges. One of the
masks was given positive weight, the other, negative,
and a threshold was set such that if the positive mask
encountered six more figure points than the negative
one, the binary response of the mask pair was ON
(Figure 4).
To provide a limited degree of translation invariance,
the responses of five such mask pairs were OR-ed together to give a single binary component of the output
feature vector. The five mask pairs in a group had the
same orientation and were in the same region of the
24 X 24 field. Nine regions were allotted to each of the
four major compass directions, and six regions were
allotted to each of the eight secondary directions (at
30° intervals). Thus, the complete feature vector consisted of 84 binary components, and the significance of
a typical component was, "An edge oriented north-of-west has been detected in the left central region of the
field." Figure 5 shows a computer display in which the
lines are normal to edges detected in a sample of the
numeral 2. The lines emanate from 15 loci representing
the allotted regions.
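The mask-pair response rule lends itself to a short sketch. This is an illustrative Python rendering of the logic, not the optical-preprocessor simulation itself: only the two axis-aligned orientations are shown, the placement coordinates are invented, and only the figure-point threshold of six comes from the text.

```python
import numpy as np

def mask_pair_response(img, row, col, horizontal=True, threshold=6):
    """Binary response of one edge-detecting mask pair.

    `img` is a 24x24 0/1 array. The pair is two 2x8 rectangles adjacent
    along their long (8-point) edges; the pair fires when the positive
    mask covers at least `threshold` more figure points than the
    negative one. Only axis-aligned orientations are sketched here,
    although PREP also used masks at 30-degree steps.
    """
    if horizontal:                       # long edges run horizontally
        pos = img[row:row + 2, col:col + 8]
        neg = img[row + 2:row + 4, col:col + 8]
    else:                                # rotated 90 degrees
        pos = img[row:row + 8, col:col + 2]
        neg = img[row:row + 8, col + 2:col + 4]
    return int(pos.sum() - neg.sum() >= threshold)

def feature_component(img, placements):
    """OR together several mask-pair responses (five per group in PREP)
    that share an orientation and region, for translation invariance."""
    return int(any(mask_pair_response(img, r, c, h) for r, c, h in placements))

# A synthetic horizontal edge: a bar of figure points with ground below.
img = np.zeros((24, 24), dtype=int)
img[6:8, 4:12] = 1                       # 2x8 bar of figure points
print(mask_pair_response(img, 6, 4, horizontal=True))   # -> 1
print(mask_pair_response(img, 10, 4, horizontal=True))  # -> 0
```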
Each quantized image was presented to the PREP
preprocessor nine different times, first in the center of
the 24 X 24 field, then in the eight positions formed
by translating it vertically and/or horizontally by two
units. Thus, for each pattern, a set of nine 84-bit feature

FIGURE 5-Responses of the PREP edge-detecting mask groups
to a numeral "2"

vectors was formed. The use of these multiple-view
feature vectors to improve classification performance
is described below.

The TOPO preprocessor
The TOPO preprocessor was a sizable collection of
computer routines assembled to extract topological
and geometric features from the character image. In
general, these features described the presence, size,
location, and orientation of such entities as enclosures
and concavities (lakes and bays) and stroke tips in
the character.
TOPO began with a single connected character image
in the 24 X 24 field. (The equals sign was sought out
in advance, and treated as a special case. Other unconnected figures were forcibly joined by growing a
bridge between the individual connected regions. If
this failed, the lesser region(s) were discarded.) The
perimeter of the figure was first found (Figure 6). The
perimeter was defined as a list of figure points, beginning
with the bottommost of the leftmost points of the
character figure, found by stepping along the edge of
the figure and keeping the figure always at the right
hand and the ground (non-figure) at the left.


FIGURE 6-Hand-printed character with perimeter points marked X

The perimeter has the property of including all figure points
that are 8-adjacent (adjacent horizontally, vertically,
or diagonally) to ground points outside.
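That defining property translates directly into code. A minimal Python sketch with our own function name and array conventions; the paper's routine additionally orders these points by walking the boundary with the figure kept at the right hand:

```python
import numpy as np

def perimeter_points(img):
    """Figure points of a 0/1 image that are 8-adjacent to a ground
    point (or to the field edge), i.e., the perimeter as defined in
    the text, here returned in scan order rather than boundary order.
    """
    rows, cols = img.shape
    pts = []
    for r in range(rows):
        for c in range(cols):
            if not img[r, c]:
                continue
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < rows and 0 <= cc < cols) or not img[rr, cc]:
                        pts.append((r, c))
                        break
                else:
                    continue
                break        # a ground neighbor was found; next point
    return pts

# A filled 4x4 square: all 16 points except the 2x2 interior lie on
# the perimeter.
img = np.zeros((8, 8), dtype=int)
img[2:6, 2:6] = 1
print(len(perimeter_points(img)))  # -> 12
```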
Next, the convex hull boundary (CHB) of the figure
was found (Figure 7). The CHB of a two-dimensional
figure may be thought of as the outline of a rubber band
stretched around the figure. In the case of a quantized
figure, some arbitrariness is required in the specification
of the CHB, because a straight line between two points
on the image grid does not generally fall on exact grid
locations. We defined the CHB to include all the extremal points of the character image, represented by letters


FIGURE 7-Character with convex hull boundary (CHB)

other than "X" or "O" in Figure 7. In between these
extremal points, the CHB was to follow as straight a
path as possible, but never falling outside of the
theoretical straight line connecting the extremal
points. Keeping the CHB to the inside reduced the number of small, insignificant concavities found subsequently. To find the extremal points in the CHB, it
was only necessary to search among those perimeter
points at which the perimeter turned to the right.
After the CHB was obtained, the concavities and
enclosures of the character image could be found quite
readily using computer routines that simulated Boolean
and connectivity operations performed in parallel over


the entire 24 X 24 field. Let the border consist of those
ground points in the outermost rows and columns of
the 24 X 24 field. Let an image be formed consisting
of the ground, minus the CHB. The portion of this image that is not connected to the border lies within the
CHB and consists of the concavities and enclosures
of the character. Those regions that are connected to
the border by a path of ground points (including ground
points in the CHB) are concavities; those regions that
are not are enclosures within the figure. The character
in Figure 7 contains two concavities and two enclosures.
Multiple concavities and/or enclosures were extracted all at once in a single 24 X 24 array by the parallel
operations. They were then separated (again using the
connectivity operations) and sorted by size for subsequent use.
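Serially, the parallel Boolean-and-connectivity procedure reduces to a flood fill of the ground from the border. A small Python sketch under our own naming; a faithful implementation would use the parallel 24 X 24 operations described:

```python
from collections import deque

def concavities_and_enclosures(img, hull):
    """Split the ground inside the convex hull into concavities and
    enclosures.

    `img` and `hull` are same-sized 2-D 0/1 lists, `hull` marking every
    point on or inside the convex hull boundary (CHB). Ground regions
    inside the hull that can reach the border through ground points are
    concavities; those that cannot are enclosures.
    """
    rows, cols = len(img), len(img[0])
    ground = [[not img[r][c] for c in range(cols)] for r in range(rows)]

    # Flood-fill the ground from the border (4-connectivity for ground).
    reach = [[False] * cols for _ in range(rows)]
    q = deque((r, c) for r in range(rows) for c in range(cols)
              if ground[r][c] and (r in (0, rows - 1) or c in (0, cols - 1)))
    for r, c in q:
        reach[r][c] = True
    while q:
        r, c = q.popleft()
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < rows and 0 <= cc < cols and ground[rr][cc] and not reach[rr][cc]:
                reach[rr][cc] = True
                q.append((rr, cc))

    concav = [(r, c) for r in range(rows) for c in range(cols)
              if ground[r][c] and hull[r][c] and reach[r][c]]
    enclos = [(r, c) for r in range(rows) for c in range(cols)
              if ground[r][c] and hull[r][c] and not reach[r][c]]
    return concav, enclos

# A closed ring (an "O"): its inside is an enclosure, not a concavity.
ring = [[0] * 7 for _ in range(7)]
for r in range(1, 6):
    for c in range(1, 6):
        if r in (1, 5) or c in (1, 5):
            ring[r][c] = 1
hull = [[1 if 1 <= r <= 5 and 1 <= c <= 5 else 0 for c in range(7)]
        for r in range(7)]
concav, enclos = concavities_and_enclosures(ring, hull)
print(len(concav), len(enclos))  # -> 0 9
```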
The spurs of a character are those strokes that end in
an isolated tip. Ideally, the letter X has four spurs, the
letter O, none, and the letter S, one spur with the special property of having a tip at each end. The list of
perimeter points was used to find the spurs. Consider
two pointers moving down the list of perimeter points,
with one pointer ahead of the other by, say, 15 places.
As the pointers moved, we calculated the Euclidean
distance between the two perimeter points indicated by
the pointers. Some of these distances are represented by
arrows in Figure 8(a). Most of the time this distance
would be approximately 15 units. A sudden decrease of
the distance between the two points to a minimum that
was less than half its usual value indicated that the
perimeter had gone around a sharp bend-i.e., had
gone around the tip of a spur. The position of the spur
tip, indicated by the perimeter point halfway on the
list between the two minimum-separation points, was
the primary attribute of the spur used for forming features.
Once a spur was found, it could be traced by the
"caliper method" [Figure 8(b)]. Imagine that the legs
of a pair of calipers are placed at the two minimum-separation points. The calipers are then "slid" along
the spur by stepping the legs of the calipers along the
perimeter, away from the tip. The calipers are moved as
far as they can go without having to be spread by more
than, say, seven units. In some cases, such as the numeral "6," the calipers will be obstructed by the body of
the figure and must stop. In other cases, such as the
letter "S," the legs of the calipers will travel all the way
along the figure and meet at the far end, indicating a
"single-stroke" figure. The midpoint of the moving
calipers traces out the backbone of the spur, and a list
of the midpoint positions can be stored to represent the
spur (the heavy line in Figure 8b).
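The two-pointer tip-finding step can be sketched compactly. The lag of 15 places and the below-half-the-usual-value criterion come from the text; the tie-breaking details and function name are our own assumptions:

```python
import math

def spur_tips(perimeter, lag=15, ratio=0.5):
    """Find spur tips from an ordered, cyclic list of perimeter points.

    Two pointers run along the list `lag` places apart; a local minimum
    of their Euclidean separation below `ratio * lag` marks a sharp
    bend, and the point midway between the pointers is taken as the
    spur tip.
    """
    n = len(perimeter)
    if n <= lag:
        return []
    d = []
    for i in range(n):
        (r1, c1), (r2, c2) = perimeter[i], perimeter[(i + lag) % n]
        d.append(math.hypot(r1 - r2, c1 - c2))
    tips = []
    for i in range(n):
        prev, nxt = d[(i - 1) % n], d[(i + 1) % n]
        if d[i] < ratio * lag and d[i] <= prev and d[i] <= nxt:
            tips.append(perimeter[(i + lag // 2) % n])
    return tips

# Perimeter of a 2x20 horizontal bar, ordered with the figure on the
# right: the perimeter doubles back sharply at both ends of the bar.
bar = [(0, c) for c in range(20)] + [(1, 19)] + [(1, c) for c in range(18, -1, -1)]
print(spur_tips(bar))  # -> [(0, 19), (1, 0)], the two ends of the bar
```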
Another set of character attributes found in TOPO

FIGURE 8-Spur-finding
(a) Finding the spur tip
(b) Tracing the spur

and used for feature generation were the profiles of the
character image. The profiles were four lists, of 24
entries each, specifying the first row (or column) in
which a figure point was encountered in each successive
column (or row) as seen from the top, bottom, left, and
right. The profiles were the basis of a number of specialized feature calculations, designed to discriminate
among particular categories, that evaluated such properties as the width of the character at various levels,
the number of reversals of direction in a profile, and
discontinuities in the profiles.
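The four profiles, and one of the specialized properties computed from them, can be sketched as follows; the function names are ours, and `None` marks an empty row or column (a convention the paper does not specify):

```python
def profiles(img):
    """Top/bottom/left/right profiles of a 0/1 character image: per
    column (or row), the index of the first figure point seen from
    that side, or None where none exists."""
    rows, cols = len(img), len(img[0])
    top = [next((r for r in range(rows) if img[r][c]), None) for c in range(cols)]
    bottom = [next((r for r in range(rows - 1, -1, -1) if img[r][c]), None)
              for c in range(cols)]
    left = [next((c for c in range(cols) if img[r][c]), None) for r in range(rows)]
    right = [next((c for c in range(cols - 1, -1, -1) if img[r][c]), None)
             for r in range(rows)]
    return top, bottom, left, right

def reversals(profile):
    """Count reversals of direction in a profile, one of the specialized
    properties evaluated for discrimination among categories."""
    vals = [v for v in profile if v is not None]
    diffs = [b - a for a, b in zip(vals, vals[1:]) if b != a]
    return sum(1 for a, b in zip(diffs, diffs[1:]) if (a > 0) != (b > 0))

# A "V"-like top profile descends and then rises: one reversal.
print(reversals([4, 3, 2, 1, 2, 3, 4]))  # -> 1
```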

Numerical feature calculation in TOPO
After the topological and geometric components of
the character image-concavities, enclosures, spurs,
profiles, etc.-were extracted, it remained to convert
them to numerical components of a feature vector
suitable for subsequent classification by an adaptive
machine. This task was beset with several conceptual
and practical difficulties that may not be obvious at
first.
In TOPO, the task was carried out in two steps.
First, descriptors (individual numerical quantities) were
derived from the information at hand. Second, features
in a standard form were calculated from the descriptors.
Each descriptor had to be chosen so that it always
represented a unique characteristic of the character
image. For example, suppose that one descriptor were to
represent the vertical position of the rightmost spur
tip. Such a descriptor would help to discriminate, for
example, between T and L. But this descriptor would
give unpredictable results for characters such as C, E,
and [, depending on which spur extended farther to the

right, and would probably be detrimental to the classification of characters in these categories. In addition,
there is the problem of vacuous descriptors: What value
do we assign to the above descriptor in the case of a
letter O?
In TOPO, these problems were countered by a careful
choice of the definition of the descriptors. In many
cases, it was possible to devise a descriptor that was
always well defined. For example, if a spur-descriptor
is put in the form, "To what extent is there a spur in the
upper right-hand corner," it is defined for any number
of spurs and can properly be given its minimum value
for a figure with no spurs at all. In addition, this form
of definition (unlike the preceding one) has the important property of continuity: Deformations of the character image that move the spurs by small amounts
always cause small changes in the value of the descriptor. In another paper, the author has argued that
the preservation of continuity is important throughout
the various stages of the pattern-recognition process.38
As the final step in TOPO, the actual features (the
numerical components of the feature vector for classification) were calculated from the descriptors. A first
requirement on the features was that they be of comparable magnitudes, so that none would dominate the
sums formed in the pattern classifier. Thus, the features were all given a standard range of zero to 100.
(Note that these features were multiple-valued, whereas
those from the PREP preprocessor were binary.)
A second, heuristic requirement on the features was
that they emphasize the significant differences among
character classes. In a two-category classification problem, it is feasible to analyze the discriminating power
of a feature statistically (or even by inspection) and to
adjust the transformation from descriptor to feature so
as to maximize this power. In our 46-category problem,
we could only guess at reasonable transformations. In
any case, one should not expect the feature to be a
simple linear function of a descriptor.
The derivation of features in TOPO may be indicated
by an example. Consider a descriptor, MCONC(up),
which is a measure of the presence of an upward-facing
concavity in the character. For a flat-topped or round-topped character, such as T or O, MCONC(up) should
have the value zero. For a character such as U or V,
MCONC(up) should have a value of eight or greater.
For a Y or an open-topped 4, however, we should only
expect values of five or greater. Owing to the linear
nature of the dot-product units used in the pattern
classifier, it is impossible for a single feature proportional to MCONC(up) to discriminate between Y and
T, for example, without treating U as a "super-Y."
We actually require two features: one that "switches"
in the range 0 to 5 and one that does so at a higher
range.

FIGURE 9-Two transformations that derive features from a
concavity descriptor
The two transformations that derived features from
MCONC(up) in TOPO are shown in Figure 9. There
were two such features corresponding to each spur
descriptor and concavity descriptor in TOPO. In all,
TOPO produced 68 features: 16 for the spurs, 16 for
the concavities, 8 for the enclosures, 6 for overall
character size and shape, and 22 resulting from special
calculations about the width of the character at various
levels, discontinuities in the profiles, etc. Each feature
was calculated from a numerical descriptor by a transformation arrived at by inspection.
It should be evident from the foregoing description
that the development of TOPO was a cut-and-try
affair. The extraction of topological entities and the
generation of descriptors and features were continued
only as far as patience permitted. For example, a feature to look for structure within an enclosure and help
discriminate between O and Q was never implemented.
It is the author's opinion that the generation and selection of features for pattern classification, especially in
the multi-category case, is the greatest problem area
in pattern recognition at the present.38
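The Figure 9 transformations are saturating ramps over two ranges of one descriptor. A sketch with illustrative breakpoints; the paper's exact transformation shapes were arrived at by inspection and are not reproduced here:

```python
def switch_feature(descriptor, lo, hi, scale=100):
    """Map a descriptor into a 0..`scale` feature that "switches" over
    the interval [lo, hi] and saturates outside it, preserving the
    continuity property discussed in the text."""
    if descriptor <= lo:
        return 0
    if descriptor >= hi:
        return scale
    return round(scale * (descriptor - lo) / (hi - lo))

# Two features from one concavity descriptor MCONC(up): one switching
# over 0..5 (separating Y and open-topped 4 from T) and one over 5..10
# (separating U and V), so that U is not treated as a "super-Y."
mconc_up = 8
print(switch_feature(mconc_up, 0, 5))   # -> 100 (fully on)
print(switch_feature(mconc_up, 5, 10))  # -> 60
```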
Classification

An adaptive pattern classifier, or learning machine,
was used to classify the characters on the basis of the
feature vectors generated by a preprocessor, either
PREP or TOPO. The learning machine was of the
piecewise linear (PWL) type, described by Nilsson.41


The learning machine for these experiments was implemented by a computer program called CALM (Collected Algorithms for Learning Machines),42 running on
the SDS 910 computer, which simulated the action of
the MINOS II hardware learning machine constructed
earlier during this project.40,43,44
Briefly, a learning machine embodies a set of Dot
Product Units (DPU's) that form the dot product (also
called the inner product or scalar product) between
the incoming pattern, or feature vector, and a set of
stored weights. The jth DPU of the machine forms
the dot product

s_j = X · W_j = Σ_i x_i w_ij

between the pattern vector X and the weight vector W_j
associated with the jth DPU. In a PWL learning machine, a small number of DPU's are assigned to each of
the 46 character categories. The largest dot product
formed among the DPU's assigned to a category is
taken as the response for that category.
The category responses may be utilized in two ways.
If it is desired to explicitly categorize a character, the
character is assigned to the category with the largest
response. A testing margin or dead zone may be employed, so that any character for which the largest response does not exceed the second largest by the margin
is classed as a reject. In the performance results listed
below, the reject margin is not used. The performance
scores are thus of the simplest possible type: percentage
of successful classifications with no rejects allowed
(response ties are broken arbitrarily).
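The PWL response and classification rules can be sketched as follows, with hypothetical category names and weights; the `margin` parameter corresponds to the testing margin or dead zone described above:

```python
import numpy as np

def pwl_responses(x, weights):
    """Category responses of a piecewise-linear learning machine.

    `weights[k]` is a list of weight vectors, one per dot-product unit
    (DPU) assigned to category k; the category response is the largest
    dot product among its DPUs.
    """
    return {k: max(float(np.dot(x, w)) for w in ws) for k, ws in weights.items()}

def classify(x, weights, margin=0.0):
    """Assign to the category with the largest response, or reject when
    the winner does not beat the runner-up by `margin`."""
    resp = pwl_responses(x, weights)
    ranked = sorted(resp, key=resp.get, reverse=True)
    if len(ranked) > 1 and resp[ranked[0]] - resp[ranked[1]] < margin:
        return None                      # reject
    return ranked[0]

# Two categories, two DPUs each, on a 3-component feature vector.
w = {"O": [np.array([1., 0., 0.]), np.array([0., 1., 0.])],
     "Q": [np.array([0., 0., 1.]), np.array([1., 0., 1.])]}
x = np.array([2., 1., 3.])
print(classify(x, w))              # -> Q    (response 5 vs 2)
print(classify(x, w, margin=4.0))  # -> None (rejected)
```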
Alternatively, if the goal is not to achieve a succinct performance measure but rather to use the character-classification information for contextual analysis,
the responses may be used to obtain confidence information. The simplest confidence measure is the set of
46 responses from the learning machine, with a higher
response indicating a higher confidence that the character belonged to the category in question.
To adapt a learning machine, a training pattern is
presented, and the responses to that pattern are obtained. If the response in the true category of the
training pattern does not exceed the largest response
among the other categories by a value called the
training margin, the DPU yielding the response in the
true category is marked to be incremented, and that
yielding the competing response is marked to be
decremented. This is done by setting

a_true = 1 ; a_competing = −1

in the adapt vector A, and setting all the other components of A to zero. Adaptation of the weights is then

performed according to the fixed-increment error
correction rule:

W_j ← W_j + a_j · D · X,    for all j.

In other words, the pattern vector is added to or
subtracted from the jth weight vector, depending
on a_j. D is an overall multiplying factor called the
adapt step size, usually set to a small integer throughout
a block of training. (Other methods of determining the
responses and A and D lead to learning machines other
than the PWL machine.)41,42
The adaptation causes the subsequent dot product
between the pattern vector and the weight vector to
be changed by an amount

Δs_j = X · (a_j · D · X) = a_j D |X|².

Since |X|² and D are always positive, the sign of a_j
automatically determines whether the response (i.e.,
the dot product) of the jth DPU with the pattern X is
enhanced or reduced. Through this means, appropriate
DPU's can be made to respond to certain patterns and
ultimately to classes of patterns.
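The adapt rule reduces to a few lines once the responsible DPUs are identified. A minimal sketch under our own naming, with a_true = +1 and a_competing = −1 applied implicitly as the add and subtract:

```python
import numpy as np

def train_step(x, true_cat, weights, margin=0.0, step=1.0):
    """One fixed-increment error-correction adaptation (in place).

    If the response in the true category does not beat the best
    competing response by `margin`, the pattern (times the adapt step
    size D = `step`) is added to the responding true-category DPU's
    weights and subtracted from the competing DPU's weights.
    """
    best = {k: max(range(len(ws)), key=lambda j: np.dot(x, ws[j]))
            for k, ws in weights.items()}
    resp = {k: float(np.dot(x, weights[k][best[k]])) for k in weights}
    competing = max((k for k in weights if k != true_cat), key=resp.get)
    if resp[true_cat] - resp[competing] <= margin:
        weights[true_cat][best[true_cat]] += step * x    # a_true = +1
        weights[competing][best[competing]] -= step * x  # a_competing = -1

# Starting from zero weights, one step makes the machine respond
# correctly to the training pattern.
w = {"A": [np.zeros(3)], "B": [np.zeros(3)]}
x = np.array([1., 2., 0.])
train_step(x, "A", w)
print(np.dot(x, w["A"][0]), np.dot(x, w["B"][0]))  # -> 5.0 -5.0
```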
To perform a learning-machine experiment, the
adaptive weights w_ij are initialized, usually to zero.
The training patterns are then presented sequentially.
The responses to each training pattern are formed, and
if the classification is incorrect the machine is trained.
One pass through the training patterns is called an
iteration. Typically, repeated iterations through the
training set are performed until the classification performance on the training patterns ceases to improve. At
that time, the test patterns may be presented, and the
classification performance on them recorded. This performance is generally taken as the measure of success of
the learning machine on the task represented by the
training and test patterns.
In dealing with the nine-view sets of feature vectors
produced by the PREP preprocessor, the running procedure was modified slightly (Figure 10). During training, one of the nine feature vectors representing a
training pattern was selected quasi-randomly at each
iteration. Thus, it took nine iterations for the machine
to encounter each view of each pattern. The use of
multiple views had the effect of "broadening" the
training experience of the learning machine. During
"nine-view testing," all nine views of each test pattern
were presented and the nine responses in each category
were added together to form cumulative responses that
were used as the basis for classification. It will be seen
that the redundancy achieved by accumulating the nine
responses led to a significant improvement in performance. This technique has also been used successfully

by Darling and Joseph in the processing of satellite
photographs.46

FIGURE 10-Multiple-view testing procedure

Experimental results

A series of experiments were performed on the single-coder and multiple-coder data files, using the preprocessors and learning machine described above. Experiments were run under four conditions.
In Condition 1, the characters were preprocessed by
PREP, but only the one feature vector representing
the central view of each pattern was used for training
and testing the learning machine. Only the single-coder
file was run under Condition 1.
In Condition 2, the characters were preprocessed by
PREP in all nine views, and nine-view training and
testing were performed as described above. A PWL
learning machine with two DPU's per category was
used in Conditions 1 and 2.
In Condition 3, the characters were preprocessed by
TOPO, and the single feature vectors produced by
TOPO were used for training and testing. Owing to
computer restrictions, a learning machine with only
one DPU per category was used. This is generally called
a linear rather than a PWL learning machine, after the
form of the discriminant functions in feature space.41
In Condition 4, the responses of the learning machines in Conditions 2 and 3 for each test pattern were
added together and taken as a new basis for classification. This procedure was a way of harnessing the
preprocessor-classifier systems of Conditions 2 and 3
"in tandem" in order to improve classification performance, in a manner analogous to the nine-view testing
of the PREP feature vectors.
The results of the experiments are presented in
Table I. The results show a significant improvement in
performance for the case of nine-view training and
testing over single-view training and testing, and further improvement with the combined system of Condition 4. The most important results can be summarized as follows:

Final Classification Scores

Using the combined system, a correct character-classification rate of 97% (with no rejects) was
obtained on independent test of relatively unconstrained hand printing in the 46-character FORTRAN alphabet, when the learning machine was
allowed to specialize on data from a single coder.
When the learning machine was trained on the
printing of 32 coders and tested on the printing of 17
others, the correct classification rate was only 85%.
These rates are for the isolated characters, without
context.
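Both nine-view testing and the Condition 4 tandem system classify on the sum of several response vectors before taking the maximum; a minimal sketch with invented response values:

```python
def combined_classification(response_sets):
    """Combine several response vectors by category-wise summation and
    classify on the cumulative responses.

    The same rule serves nine-view testing (nine response sets from
    translated views of one pattern) and the Condition 4 "tandem"
    system (response sets from the PREP and TOPO classifiers).
    """
    total = {}
    for resp in response_sets:
        for cat, s in resp.items():
            total[cat] = total.get(cat, 0.0) + s
    return max(total, key=total.get)

# One of three views misclassifies on its own, but the cumulative
# responses still favor the correct category.
views = [{"D": 2.0, "O": 1.9},
         {"O": 2.1, "D": 1.0},
         {"D": 1.5, "O": 0.4}]
print(combined_classification(views))  # -> D  (4.5 vs 4.4)
```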

TABLE I-Experimental results on two files of hand-printed
alphanumeric characters

Single-Coder File

Condition   Preprocessor    Number of   Training   Test
                            Iterations  Patterns   Patterns
1           PREP, 1 view    10          99%        88%
2           PREP, 9 views   27          89%*       96%
3           TOPO            10          94%        91%
4           Combined                               97%

Multiple-Coder File

2           PREP, 9 views   18          65%*       78%
3           TOPO                        84%        77%
4           Combined                               85%

*Single-view classification scores

A well known set of quantized hand-printed character images (letters and numerals only) collected by
Highleyman were also processed under Condition 2,
yielding a test classification score of 68%. Previously
reported classification methods, not employing preprocessing, had achieved scores of 58% or less. These
characters are of very poor quality, being only 86% to
89% classifiable by humans. These results are described in Ref. 39.
A large number of preliminary and auxiliary experiments, not described in this paper, were performed.
In particular, during the development of the TOPO
preprocessor, an attempt was made to use the features
produced by TOPO in a binary decision-tree classifier.
The results of this effort were very poor, because it was
impossible to find features reliable enough to serve for
dichotomization of the character classes. For example,

Fall Joint Computer Conference, 1968

the presence of an enclosure was a useless feature,
because quantization noise introduced some spurious
enclosures, and other expected ones were lost because
they were filled in or not completely formed. It thus
appears to us that, for patterns with the variability of
hand printing, an approach that considers all the
features in parallel is a necessity.
The development of the TOPO preprocessor, the
exploration of variations of the PREP preprocessor,
and the running of classification experiments with
different learning-machine configurations and different
data files were all severely restricted by system limitations.

XX++X

If the first plus sign were chosen as the delimiter we
would have the two simple legal subexpressions XX and
X as first choices of each segment of the P-list, but the
concatenation of the two with another plus sign is
illegal. This pitfall can be avoided by making a final
legality check which, if not satisfied, forces a new
selection of potential delimiters. Thus although the …

[Figure: a hand-printed FORTRAN coding sheet, ruled into fields COLS. 1-10, 11-20, 21-30, 31-40, and 41-50; the handwriting itself is not legible in this transcription.]

FIGURE 1-A hand-printed FORTRAN program

Experiments in Recognition of Hand-Printed Text

      C0MM0N AG~.WE~GHT.AGEMEAN.WTMCAN.C0V
      DIMENSI0N AGECI001.UEIGHTCIOOj
I     READ 1001IFLAG.M0KC
100   F0RM/TC2ISl
      -IF[ MER= 160. S. 5
5     R~AD-IOI.AGE.WEIGHT
101   FQRMATCFIO.21
      G~ T0 [IN.=01301.IFLAG
10    01 I I 1=1.100
to!   Wr:IGH.:.CIl=AGECSl
      G~ T0 10
20    -00 21 1=1.100
21    AGECI1=WEIGHTCI1
7.0   CALL AVECAGE.IOO.AGEMEANl
      tALL AVECWEIGHT.IOO.WTMEAN1
      [0V=0.
      GO SQ 1=1.100
so    C0V ; C0V+[AGECI1-AG[~CAN1*CWEIGHTCIl-UTMEANl
      C0V=C0V/IOO.
      TYPE 102.IFLDG.C0V
1~2   F0RMATCIS.FIO.S1
60    G0 TQ I
      =T0F'£ND-

TABLE I-First choice of classifier

based on the syntactic requirements of the respective
statement types. Each subpart consists of one of a
small number of constructs, e.g., an expression or a list
of integers. The analysis of each construct often begins
with a match against the table of identifiers. If the
resulting P-list is combinatorially simple it is analyzed
exhaustively by dynamic programming. Otherwise, the
P-list is further partitioned into segments by means of
delimiters and the segments are examined. Before any
exhaustive analysis is made the P-list is compressed to
reduce the number of alternatives. The final answer is
reconstructed from the compressed answer by comparison with the uncompressed P-list.
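The "combinatorially simple" test implied above can be sketched as follows, taking a P-list to be one list of alternative characters per character position; the threshold is an assumed parameter, not a value from the paper.

```python
from math import prod

# Sketch of the exhaustive-vs-partition decision described above: a
# P-list is analyzed exhaustively only if it represents few enough
# alternative strings; otherwise it would be partitioned at delimiters
# and compressed. The limit below is an assumption.

def combinations(p_list):
    """Number of distinct strings the P-list represents."""
    return prod(len(alts) for alts in p_list)

def analyze_exhaustively(p_list, limit=1000):
    """True if the P-list is simple enough for exhaustive analysis."""
    return combinations(p_list) <= limit
```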


products between a feature vector and 46 stored weight
vectors, one for each of the 46 categories. The first
choice response is the category corresponding to the largest dot product. The context-directed analyzer uses
these dot products to determine alternative choices for
each character, together with a measure of confidence for
each alternative. The confidences C i are obtained by
normalizing the dot products according to
    Ci = (Si - Smax) / Smax ,    i = 1, …, 46,

where Si is the ith dot product, and Smax is the largest
dot product observed for all of the characters in the
source program.
Since the correct category for the character is usually
included among the choices having high confidence, it is
not necessary to consider every alternative for every
character. An empirical study showed that almost invariably the correct category was among those alternatives whose dot products were at least half of the
maximum dot product for that character. Thus, only
those characters whose confidences met this condition
were included in the list of alternatives. This typically
reduced the number of alternatives from 46 to 4 or 5,
with occasionally only 1, and never more than 10. The
price for this simplification was an occasional failure
to include the correct category, which was the case for
the five doubly-underlined characters in Table 1. Although this introduces extra problems, the reduction in
combinatorial complexity is worth the price.
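The normalization and pruning rules just described can be rendered as a short sketch; this is an illustration of the stated rules, not the original implementation, and the sample dot products are invented.

```python
# Confidence normalization and alternative pruning as described above:
# C_i = (S_i - S_max) / S_max, where S_max is the largest dot product
# observed over the source program, and only alternatives whose dot
# products are at least half of the character's maximum are retained.

def confidences(dots, s_max):
    """Normalized confidences for one character's dot products; s_max
    is the largest dot product seen over all characters."""
    return [(s - s_max) / s_max for s in dots]

def prune_alternatives(dots):
    """(category index, dot product) pairs meeting the half-maximum
    condition for this character."""
    best = max(dots)
    return [(i, s) for i, s in enumerate(dots) if s >= best / 2]
```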

Statement identification
Example

Source data
This section describes an experiment illustrating the
operation of the context-directed analyzer. The source
data for this example came from the hand-printed
FORTRAN program shown in Figure 1. The characters on
the coding sheet shown were scanned, preprocessed, and
classified by the methods described by Munson.1 Because we purposely wanted data with a moderate error
rate, we chose to use only topologically-derived
features. Using these features, Munson had previously
obtained a nine percent error rate on other data by the
same writer. The results on this data were very similar,
with 38 out of the 410 characters misclassified for an
error rate of 9.3 percent. The particular errors made are
shown underlined in Table 1. It is interesting to note
that about one-third of these classification errors would
not have been detected by purely syntactic methods.
These error rates correspond to the first choice responses of the character classifier, a linear machine. The
linear machine classifies a character by computing dot

As mentioned previously, the analyzer is organized as
a two-pass program. During the first pass, the type of
each statement is determined and variable names are
collected for the construction of the identifier table.
Statement identification was done by comparing the beginning of each statement with the "control" words,
IF, DO, READ, etc. For example, the alternatives and
the corresponding confidences for the first part of the
sixth statement were as follows:
R K A D    -25 -65 -28 -42
A [ R 0    -49 -65 -62 -52
V          -61
Z E        -62 -68
U          -69
R          -77
L          -81
Z          -81

The average confidence for the first choice selection
RKAD was -40. The average confidence for READ was
-41, which was sufficiently high to identify the statement as a READ-statement. This matching procedure
correctly identified all but one of the 24 statements, including two cases in which the correct category of a control word character was not included in the list of alternatives. However, absence of the correct category
caused the ninth statement, a DO-statement, to be
erroneously identified as the arithmetic-assignment
statement D711I = 1,100. Subsequent analysis failed
to resolve this as a legal arithmetic-assignment-statement, however, and the result of this failure condition
was that first choice decisions from the classifier were
accepted as the final output.
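The control-word matching step can be sketched like this; the data layout (one dictionary of character alternatives and confidences per leading position) and the word list are assumptions made for illustration.

```python
# Sketch of first-pass statement identification: each control word is
# scored by the average confidence of its letters over the leading
# character positions, and the best-scoring word (if any) identifies
# the statement type. Illustrative, not the original program.

CONTROL_WORDS = ["IF", "DO", "READ", "TYPE", "FORMAT"]

def score_word(word, alts):
    """Average confidence of `word`, or None if some letter is absent
    from its position's alternatives. `alts` is a list of
    {character: confidence} dicts, one per leading position."""
    if len(word) > len(alts):
        return None
    total = 0.0
    for ch, position in zip(word, alts):
        if ch not in position:
            return None
        total += position[ch]
    return total / len(word)

def identify_statement(alts):
    scored = [(s, w) for w in CONTROL_WORDS
              if (s := score_word(w, alts)) is not None]
    return max(scored)[1] if scored else None
```

Applied to the sixth statement's alternatives above, READ scores about -41 while the other control words fail to match at all.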

The identifier table

During the first pass, all COMMON, DIMENSION,
and input/output statements were inspected to collect
potential variable names. This operation was allowed to
be somewhat liberal, since the inclusion of spurious
identifiers is less harmful than the exclusion of actual
identifiers. For example, in the last TYPE-statement
the input/output list had the following alternatives:

[Matrix of character alternatives for the first choices IFLDG and C0V; the individual columns are garbled in this transcription.]

Because the fifth-choice comma had a fairly high confidence, the program found IFL and G as well as
IFLDG and COY as possible variable names. While
there is a danger that these fragments might have accidentally matched similar names elsewhere in the program, no such matches occurred. One reason is that long
names are tried before short names when the identifier
table is used, and this prevents the premature discovery
of erroneous matches with short fragments. Another is
that completely accidental matches involving names of
length greater than three or four are highly unlikely.
The search for possible variable names yielded the
following (first choice) possibilities:

G  AGC  MOK[  IFLAG  WEIGHT  AGEMEAN
IFLDG  WTMCAN  COY  AGE  UEIGHT
AGE  WEIGHT  IFL  COY

Even in this simple example the need to cluster the
identifier table is clear, since (a) four names were found
more than once, and (b) three of these appeared with
different first choice spellings. Clustering reduced the
identifier table to the following first choice possibilities:

G  AGE  M0KE  IFLAG  WEIGHT  AGEMEAN
COY  WTMCAN  IFL

Of these names, two were spurious (G and IFL), but
caused no trouble. Two were wrong (M0KE and
WTMCAN), but since only one representative of each
was found, they could not be fixed. The remainder
(AGE, COY, IFLAG, WEIGHT, and AGEMEAN)
were correctly clustered.

Statement analysis

During the second pass, each statement was resolved
in turn. Since each different type of statement had to be
treated differently, a complete description of how this
was accomplished would be tedious. However, the spirit
of our procedures can be conveyed by considering the
resolution of the long arithmetic-assignment statement.
For this statement, the first choices of the classifier were
50 COY = COy + [AGE[I] - AG[HCAN] * [WEIGHT[I] - UTMEAN]

As with all statements, the label field (columns 1
to 5) was inspected first. Its resolution was trivial, since
the first choices were legal. Attention then shifted to
the statement field. Starting in column 7, a search was
begun for a possible equals sign to be used to break the
statement into a tentative variable and a tentative expression. (Had later procedures failed to resolve either
of these parts, the search would have been resumed for
a possible equals sign further to the right.)
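The equals-sign search can be sketched as follows, again representing the statement field as one set of alternative characters per position; this is an illustration, not the program's actual code.

```python
# Sketch of the tentative split of an arithmetic-assignment statement:
# scan left to right for characters having "=" among their
# alternatives; each such position splits the field into a tentative
# variable and a tentative expression, tried in turn until one
# resolves.

def split_candidates(p_list):
    """Yield (variable part, expression part) for every position whose
    alternative set contains '='."""
    for i, alts in enumerate(p_list):
        if "=" in alts:
            yield p_list[:i], p_list[i + 1:]
```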
The first character found having an equals sign for an
alternative was, in fact, the correct equals sign. At this
point, the first step was to resolve the left-hand side of
the statement. Since the tentative variable, COV, could
have been either a simple identifier (scalar variable) or
an identifier followed by a bracketed list of expressions
(array variable), a search was begun for a string of the
form "alphanumeric, left-bracket." No such string was
found, of course, and the tentative variable was declared to be just an identifier of length three. A search
through the corresponding part of the identifier table
produced a match, and COY was accepted for the name.
The next step was to resolve the expression. Here an
exhaustive search of the expression for candidate identifiers was begun at once. Each candidate found was
matched against appropriate length entries in the
identifier table. This procedure produced five matches,
and changed the first choices for the expression from
bOY + [AGE[I] - AG[HCAN] * [WEIGHT[I] - UTMEAN]

to

are much more difficult to implement. For example, the
determination of statement type is currently made by
matching the leading portion of the P-list against the
various control words. A short control word results in a
greater risk that the statement type will be misidentified, yet a human easily identifies statement types by
their general appearance or gross structure, as well as by
the (possibly misclassified) control word. This appreciation of global structure has been one of the more difficult
abilities to give the analyzer.
Another observation is that the basic strategy employed by the analyzer should change with variations in
the error rate of the input data. The support for this observation rests on intuitive, rather than experimental,
grounds, but it seems clear that elaborate procedures
that may be required for very poor data are unnecessarily inefficient on very good data. While the present
analyzer can cope with a certain amount of error-rate
variability by automatic adjustment of thresholds,
there is no provision to change the basic nature of the
operations as a function of the quality of the input data.
A third observation is that there will always exist
FORTRAN programs that are unlikely to be resolved
successfully. One need only consider the contrary programmer who defines three separate variables as
SS5S5, S5S5S, and S5SS5 to appreciate this. Whenever
there can be errors in the input, there is a chance of
errors in the output. In a practical system, one would
want to provide the user with more than the final decision of the recognition system. For example, diagnostic messages could be given to aid the user in finding and
correcting errors, whether they were committed by the
classifier, the analyzer, or the user himself.
It is difficult to assess the usefulness of our techniques on the basis of an exploratory investigation. A

COY + [AGE[I] - AGEMEAN] * [WEIGHT[I] - WTMCAN1,

which, even though it contains the error in WTMEAN, is
a syntactically valid expression. Thus, in this case,
the expression was resolved by the first operation, the
use of the identifier table. The remaining operations,
which have been very useful in other instances, were not
needed, and hence were not performed. Since both the
variable and the expression were now resolved, these
parts were joined by an equals sign and appended to the
results of the label-field analysis to yield the final resolution of the statement.
When similar procedures were applied to the other 23
statements, 28 of the 38 errors were corrected, reducing
the error rate from 9.3 percent to 2.4 percent. The final
output of the analyzer is shown in Table 2, where the
10 remaining errors are underlined. Three of these errors
were due to the appearance of WTMCAN rather than
WTMEAN in the identifier table, and three more were
due to other problems with identifiers: M0KE, MERE,
and S. A better method of using the identifier table, in
which a final determination of variable names is postponed until all matches are made, would no doubt yield
improved results.
Of the remaining four errors, one was in a FORMAT-statement, one in the DO-statement control word, and
two involved labels. The FORMAT error was due to the
fact that we have yet to implement that part of the program that resolves FORMAT statements. The DO
error was caused by the missing alternative, and its correction would require the use of much more sophisticated methods for identifying statement types. Both
label errors, however, could easily be cured by using a
table of labels similar to the table of identifiers. Thus,
roughly half of the 10 uncorrected errors could be resolved by relatively straightforward additions to our
present program; the remainder would be difficult
indeed to fix.
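One straightforward rendering of the longest-first identifier matching mentioned earlier is sketched below; the function and its behavior are illustrative assumptions, not the program's actual logic.

```python
# Sketch of longest-first matching against the identifier table: longer
# fragments are tried before shorter ones, so a fragment such as IFL
# cannot pre-empt the longer name IFLDG that contains it. Hypothetical.

def resolve_names(fragments, table):
    """Table names matched by the fragments; a fragment contained in an
    already-accepted longer match is skipped."""
    matched, accepted = [], []
    for frag in sorted(fragments, key=len, reverse=True):
        if any(frag in longer for longer in accepted):
            continue  # already explained by a longer accepted name
        if frag in table:
            matched.append(frag)
            accepted.append(frag)
    return matched
```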

DISCUSSION

This paper has been concerned with techniques for using context to detect and correct character recognition
errors. A few concluding remarks and observations
about these techniques and their implementation are in
order.
Our first observation is that the addition of new techniques to the context-analyzer program can continue
virtually without limit. Many of these additions are
straightforward. For example, tables of library subroutine names or statement labels could be incorporated and used in an obvious fashion. Other strategies, which humans employ with remarkably little effort,

      C0MM0N AGE,WEIGHT,AGEMEAN,WTMCAN,C0V
      DIMENSI0N AGE[100l,WEIGHT[100]
1     READ 100,IFLAG,M0KE
100   F0RMAT[2I8l
      In MERE 160,5,5
5     READ-I0l,AGE,WEIGHT
101   F0RMATCFI0.21
      G0 T0 (10,201301,IFLAG
10    D7 11 1=1,105
11    WEIGHT[I1=AGE[Sl
      G0 T0 30
20    D0 21 1=1,100
21    AGE[11=WEIGHT(I1
10    CALL AVE(AGE,100,AGEMEAN1
      CALL AVE[WEIGHT,100,WTMCANl
      C0V=O
      D0 50 1=1,100
50    C0V=C0V+[AGE(I1-AGEMEANJ*(WEIGHT(IJ-WTMCANJ
      C0V=C0V/I00.
      TYPE 102,IFLAG,C0V
102   F0RMAT(I8,FI0.5J
60    G0 T0 1
      STep
      END

TABLE II-Final output of analyzer


thorough evaluation of the performance of the analyzer
can be made only by testing it on a large number of
FORTRAN programs produced by a variety of authors.
Unfortunately, we were unable to undertake a data-processing project of this magnitude.
An equally difficult question concerns the extendability of the reported techniques to other problem domains. These techniques can be characterized by three
qualities: risk-spreading in decision making, partitioning of a large decision problem into a hierarchy of subproblems, and continual checking of internal consistency. It seems clear that our basic approach applies
more or less directly to other programming languages,
and perhaps could be used with natural language in
tightly constrained situations. The conjecture that the
general approach, at least, can be applied in more
general problems has a certain piquancy, but it remains only a conjecture.
We have, however, been able to achieve a substantial
reduction in error rate for a particular application. In
our opinion, it would have been difficult to obtain a
comparable improvement by applying more conventional context analysis methods which do not take
advantage of the special nature of the problem.
ACKNOWLEDGMENTS
The authors wish to thank their colleagues at Stanford
Research Institute, and most particularly to thank
Dr. John H. Munson for many stimulating and fruitful discussions.
This work has been supported by the United States
Army Electronics Command, Fort Monmouth, New
Jersey under Contract DA 28-043 AMC-01901(E).
APPENDIX
In this Appendix we derive the decision rule that
classifies strings of alphanumeric characters in an optimal (minimum probability of error) fashion. By appropriately interpreting our result, we arrive at the decision rule described in the text.
Suppose that we are given some string of n characters
to classify. Our problem is to determine a string of n
categories that minimizes the probability of misclassification. If we let the vector Xi denote the set of
measurements made on the ith character and θi denote
the category selected for the ith character, then it is well
known from Bayesian decision theory that the minimum
probability of error is achieved by the following rule:
Select the categories θ1, …, θn which maximize the
posterior probability P(θ1, …, θn | X1, …, Xn).
In other words, the posterior probability is computed

for every possible assignment of θ1 through θn, and the
most probable assignment is taken as the decision.
By Bayes' law of inverse probabilities we can write
the posterior probability as

    P(θ1, …, θn | X1, …, Xn)
        = p(X1, …, Xn | θ1, …, θn) P(θ1, …, θn) / p(X1, …, Xn).    (1)

This shows that the computation of the posterior probability depends upon both the prior probability
P(θ1, …, θn) and the conditional probability p(X1, …,
Xn | θ1, …, θn). To simplify the computation of the conditional density, we make the reasonable assumption that the manner in which a character is formed
depends only upon the category of the character
and not on the categories or the measurements of any
surrounding character. This assumption is equivalent to
an assumption of conditional independence, namely
that
    p(X1, …, Xn | θ1, …, θn) = ∏_{i=1}^{n} p(Xi | θi).    (2)

At this point we must make some assumptions about
the performance of the character classifier that provides the input to the context-directed analyzer. If this
classifier were designed for the optimal classification of
characters without regard to context, it would compute,
for every θi, the posterior probability

    p(θi | Xi) = p(Xi | θi) P(θi) / p(Xi).

During the design of the classifier it was tacitly assumed that all classes are equally likely a priori, so that
P(θi) = 1/46. We therefore make the bold assumption
that the classifier computes, for all 46 values of θi,

    p*(θi | Xi) = p(Xi | θi) / (46 p(Xi)).    (3)
Substituting (3) and (2) into (1), we obtain

    P(θ1, …, θn | X1, …, Xn)
        = [P(θ1, …, θn) / p(X1, …, Xn)] ∏_{i=1}^{n} 46 p(Xi) p*(θi | Xi).

Now for given measurements X1, …, Xn we are interested
in maximizing this quantity over θ1, …, θn, so we can
ignore constants and factors depending solely on Xi and
obtain the following optimal compound decision rule:
Select the categories θ1, …, θn for which

    P(θ1, …, θn) ∏_{i=1}^{n} p*(θi | Xi)    is maximum.

We can, of course, take any monotonic function of this
quantity and maximize it instead. Taking logarithms,
we can select the θ1, …, θn which maximize

    log P(θ1, …, θn) + ∑_{i=1}^{n} log p*(θi | Xi).

If we define log p*(θi | Xi) as being the confidence that
the measurements Xi indicate class θi, then we may
reasonably define the confidence of the string to be

    (1/n) ∑_{i=1}^{n} log p*(θi | Xi).

The optimal decision rule, then, computes the confidence of each string of length n, biases each string confidence by adding the logarithm of the prior probability
of the string, and selects as the answer that string
having the highest biased confidence.
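Numerically, the optimal compound rule amounts to the following sketch; the candidate strings and probabilities are invented for illustration.

```python
import math

# Sketch of the optimal compound decision rule derived above: each
# candidate string is scored by the log prior probability of the string
# plus the sum of per-character log confidences, and the highest-scoring
# string is selected. Candidate set and probabilities are invented.

def best_string(candidates):
    """`candidates` maps string -> (prior, [p*(theta_i | X_i), ...]).
    Returns the string with the highest biased confidence."""
    def score(item):
        prior, per_char = item[1]
        return math.log(prior) + sum(math.log(p) for p in per_char)
    return max(candidates.items(), key=score)[0]
```

A string with slightly lower per-character confidences but a far larger prior (e.g. READ versus RKAD) wins under this rule.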

REFERENCES
1 J MUNSON
Experiments in the recognition of hand-printed text: Part I-Character recognition
In this volume
2 B GOLD
Machine recognition of hand-sent Morse code
IRE Trans on Information Theory Vol IT-5 pp 17-24 March
1959
3 W W BLEDSOE I BROWNING
Pattern recognition and reading by machine
Proc EJCC pp 225-232 Dec 1959 Also in Pattern Recognition
L Uhr Ed pp 301-316 Wiley New York 1966
4 L D HARMON
Automatic reading of cursive script
In Optical Character Recognition Fischer et al Eds pp 151-152
Spartan Washington DC 1962
5 A W EDWARDS R L CHAMBERS
Can a priori probabilities help in character recognition
JACM Vol 11 pp 465-470 October 1964
6 G CARLSON
Techniques for replacing characters that are garbled on input
AFIPS Conf Proc Vol 28 pp 189-192 Spring Joint Computer
Conference 1966
7 C K McELWAIN M B EVENS
The degarbler-A program for correcting machine-read Morse
code
Information and Control Vol 5 pp 368-384 1962
8 C M VOSSLER N M BRANSTON
The use of context for correcting garbled English text
Proc ACM 19th National Conference paper D2.4-1 to D2.4-13
1964
9 K ABEND
Compound decision procedures for pattern recognition
Proc NEC Vol 22 pp 777-780 1966
10 K ABEND
Compound decision procedures for unknown distributions and
for dependent states of nature
In Pattern Recognition L Kanal Ed Thompson Book Co
Washington DC 1968
11 J RAVIV
Decision making in Markov chains applied to the problem of
pattern recognition
IEEE Trans on Info Thy Vol IT-13 pp 536-551 October 1967
12 R O DUDA P E HART J H MUNSON
Graphical-data-processing research study and experimental
investigation
Fourth Quarterly Report Contract DA 28-043 AMC-01901
(E) SRI Project ESU 5864 Stanford Research Institute Menlo
Park Calif March 1967

The design of an OCR system for reading handwritten
numerals
by WILLIAM S. ROHLAND, PATRICK J. TRAGLIA
and PATRICK J. HURLEY
International Business Machines Corporation
Rochester, Minnesota

INTRODUCTION
The problem of transcribing computer input data
from a human-sensible form to a machine-sensible
form has grown with the computer industry. The
"input gap" has also grown as the time to process
a given batch of data in the CPU (central processing unit) has become smaller and smaller when
compared to the time needed to manually transcribe that same batch of data. In short, today's
data processing systems are capable of accepting,
using, and disposing of data at much faster rates
than available input devices are capable of providing, without some form of preliminary offline conversion or transcribing process.
Optical character recognition (OCR) has been
given much credit for helping bridge the "input
gap." This is because optical character readers
have been used for several years for the automatic transcription to computers of data printed
in certain specified type fonts on certain specified
types of input devices such as high-speed line
printers, typewriters, credit-card imprinters, and
a variety of printing presses, cash registers and
adding machines. This would include sales checks;
airline, bus, and train tickets; utility bills; premium notices; charge-account statements, and
countless others too numerous to mention. One
input device for character recognition that had
not been successfully exploited during the early
years of OCR was the common lead pencil.
This paper traces the steps taken in exploring
the practicality of an input system consisting of
a usually unconcerned, intractable human being;
an always too sharp or too dull pencil; and an
optical reader designed to accommodate many of
these variables. The initial objectives, as defined
by Market Planning, will be discussed. The technology choices will then be examined, and the
limitations and problems of the system discussed.
Finally, relative performance as a function of
training, motivation of, and feedback to, the
writer will be presented.
This paper deals with only one aspect of the
IBM 1287 Optical Reader (Figure 1). Although
this machine possesses the capability of reading
many type fonts printed by a variety of input
devices, only the Numeric Handwriting Feature
(NHW) will be discussed. The discussion presented here also applies to the numeric handwriting feature of the recently announced IBM
1288 Optical Page Reader.

Initial objectives
Several initial design objectives for developing
a hardware capability for the reading of numeric
handwriting were set forth.

• The minimum character set should be the ten
numerics plus five control symbols, preferably
alphabetic characters. (C, S, T, X and Z were
chosen)
Although some applications require, or
prefer, full alphanumeric capability, a
definite majority of the applications can
be handled with the stated minimum set.
Dollar amounts, stock numbers, merchandise classes, departments, dates, clerks,
customers and creditors can all be identified by numbers. The limited number of
control symbols are necessary for tagging
various categories of numeric information,
i.e., negative balance, sub-total, taxable
amount, etc.
FIGURE 1-IBM 1287 optical reader

• Acceptable character shapes should be those
naturally formed by the majority of the population.
There is some disagreement as to whether
one "writes" or "prints" numeric characters. On the other hand, the difference between handwritten and hand-printed alphabetic characters is generally recognized.
For clarification, the character shapes are
defined as handwritten numerals and hand-printed alphabetic symbols (not cursive
script).
Although it is highly desirable through
training techniques to limit the distribution of acceptable character shapes, these
techniques should not force the writing of
unnatural character shapes.

• Within the limits of good human engineering,
the constraints printed on the document to
guide the writer relative to the location, size,
and aspect ratio of the character, shall be minimum.
This is based on the premise that the simplest constraint will tend to result in the
simplest training and maximum effectiveness in writing.

• No special printing devices must be required,
other than a common No. 2 pencil.
There should be no requirement for special
inks, pencil leads, or for papers other than
OCR bond papers.

• The objectives for recognition rates (characters
per second), recognition performance (character reject and substitution rates) and product
cost are interrelated to a high degree, with
many trade-offs possible among these three
factors. General limitations, however, must be
set for each factor.

The recognition rates must be such that,
when effective throughput rates are calculated, with provision for transport time,
format control, and reject rescans, these
effective throughput rates represent a good
payoff to the customer.
Reject and substitution rates should be
the same or lower than equivalent error
rates of equal impact in the manual transcription process. For example, the character substitution rate should be no higher
than keypunching error rates, which are
comprised primarily of transposition errors. Customer satisfaction with a given
level of rejects and substitutions will depend to some extent on the effectiveness
of his system checking and correction routines.
The product cost, after all trade-offs
have been made, must obviously be compatible with a marketable price.
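A back-of-envelope throughput model along the lines suggested above might look like this; the formula and its parameters are assumptions for illustration, not the manufacturer's actual calculation.

```python
# Assumed effective-throughput model: per-document time is recognition
# time plus transport time, and rejected documents are charged one
# extra pass (rescan). Not from the paper.

def effective_throughput(chars_per_doc, chars_per_sec, transport_sec,
                         reject_rate):
    """Documents per second under the stated assumptions."""
    read_sec = chars_per_doc / chars_per_sec
    per_doc_sec = (read_sec + transport_sec) * (1.0 + reject_rate)
    return 1.0 / per_doc_sec
```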

Technical description-IBM 1287 Optical Reader
The numeric handwriting function of the IBM
1287 Optical Reader will be described in three
parts:
• Scanner and Format Control,
• Data Extraction Process, and
• Recognition Decision.
Scanner and format control (Figure 2)
A CRT (cathode ray tube) flying spot scanner
was chosen for the 1287 primarily because of its
format and scan pattern versatility. This versatility was required to accommodate a wide range
of document sizes and formats, and to provide
the scan patterns for mark reading and for a
variety of stylized type fonts, in addition to those
required for NHW.
The scanner is composed of a 5-in. PS6 CRT
(PS6 is the phosphor type), two mirrors, an f/2.8
2:1 magnifier lens, a 1½-in. S11-response monitor PMT (photomultiplier tube), a 5-in. S11-response document PMT, a PMT power supply, and
a high-voltage power supply which provides both
the focus potential and post-accelerator anode potential for the CRT. Mechanically, the light-tight

Design of OCR System for Reading Handwritten Numerals

[Figure: schematic of the scanner and format control; legible labels include PMT, Document, and To CPU]

FIGURE 2-Schematic of scanner and format control

scanner housing is mated at the document transport plane to a vacuum bed plate. A clutched belting arrangement transports the document to the
vacuum bed plate through a pair of light-seal
brushes.
Within the scanner housing, the CRT spot of
light is reflected 90 deg. by the first mirror, and
magnified and focused at the document plane by
the lens via the second mirror. Light reflected
from the document is sensed by the document
PMT which, in turn, provides an intensity-modulated signal to the video amplifier.
The CRT spot is constantly and directly monitored by the monitor PMT. The state of the art
of phosphor deposition on CRT screens precludes
a perfect screen; therefore, variations in spot
intensity at the CRT screen, which are caused by
phosphor graininess, are detected by the monitor
PMT and the output from this PMT is subtracted
(in the video amplifier) from the output of the
document PMT.
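The compensation described above is a simple subtraction of the two PMT signals, sketched here with invented sampled values.

```python
# Sketch of phosphor-grain compensation: the monitor PMT's measurement
# of CRT spot intensity is subtracted, sample by sample, from the
# document PMT's signal in the video amplifier. Signal values invented.

def compensate(document_signal, monitor_signal):
    """Pointwise difference of document and monitor PMT samples."""
    return [d - m for d, m in zip(document_signal, monitor_signal)]
```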
The video circuitry amplifies the video signal,
compensates for long and short term drift in
scanner components, and stores peak electrical
excursions for clipping functions and other analog-to-digital conversions. The output of the video
circuitry goes to beam control.
Beam control is that portion of the logic which
maintains control of the CRT beam at all times.
This block of circuitry receives input from the
format control circuitry and from the recognition
logic, as well as from the video circuitry. The
beam control provides input to the CRT deflection
drivers and to the location-direction generators.
The feedback of beam control data to the CRT
deflection system completes the optical-electrical
flying-spot scanner loop. The CRT beam action,
therefore, is really dependent upon what the
document PMT had seen as a result of beam action a half-microsecond earlier.
The location-direction generators, which provide basic vital data to the recognition logic, will
be described in detail in connection with data
extraction.
The beam control circuitry can be controlled to
generate sine and cosine functions in such a manner as to produce a circle scan (Figure 3). In addition, the circle size can be selectively attenuated
while in black so that scanning progresses along
a black edge. If the circle is allowed to proceed
unattenuated at diameter X until it strikes black,
and is then attenuated to a diameter of X/2 for
FIGURE 3-Circle scan method. (Figure labels: direction of beam travel; black hit detected, attenuate to small circle; non-reflective line; direction of travel of locus of centers.)


Fall Joint Computer Conference, 1968

180 deg., and the process repeated, the scanning
along a line will progress at the rate of X/2 per
circle.
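The advance rate claimed above can be checked with a line of arithmetic; this toy model is an illustration only, since the 1287 generates the scan with analog circuitry.

```python
# Each complete circle of the attenuated scan advances the locus of
# centers by half the unattenuated diameter X, so n circles advance the
# scan by n * X/2 along the black edge.

def scan_advance(x_diameter, n_circles):
    """Distance advanced along the edge after n attenuated circles."""
    return n_circles * x_diameter / 2

print(scan_advance(8, 5))  # 20.0 -- five circles at diameter 8 advance 4 each
```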
Two modes of circle scan are utilized by the
1287 in the NHW mode. The search for the first
character in the field is accomplished by forcing
a horizontal right-to-left circle scan. As real black
video is encountered, curve following is initiated
with the full flying-spot scanner feedback loop in
operation. The result is that the beam progresses
generally clockwise around the outer contour of
the character.
Many of the 1287 functions are under program
control. These include the run control, the scanning format, direction of scan, type fonts to be
read, reading modes to be employed, and special
format conditions. Two-way communication between the 1287 and the CPU (normally System/
360 Models 25 through 50) is handled by the CPU
interface circuitry. This control may best be illustrated by following a sequence of events in the
processing of a document.
When the CPU has determined that the 1287
is in a "ready" status, as indicated by the run
control (via the CPU interface), a command is
issued to the run control to feed a document. The
run control then takes over and feeds, separates,
and aligns the next document and properly registers it in the light-tight scan station. The CPU is
then advised by the run control that a document
is in the read station and the 1287 is prepared to
scan the document.
The CPU then issues a series of format words,
one for each field of data to be scanned, as required
by the 1287. Each format word contains four
hexadecimal bytes. The first 3 bytes contain the
vertical and horizontal start, and the horizontal
stop respectively. (The CRT beam can be addressed to any set of coordinates within a 256 X
256 grid.) The fourth byte identifies the type font
to be read in that field, determines the scan direction, how blanks should be transmitted, and
whether on-line or off-line correction is to be effected should a reject character appear.
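The four-byte layout just described might be modeled as below. The bit-packing order within the word and the parameter names are assumptions made for the sketch; the text specifies only that byte 4 encodes font, scan direction, blank handling, and reject-correction mode.

```python
# Hypothetical model of the 1287 format word: bytes 1-3 carry vertical
# start, horizontal start, and horizontal stop (each 0..255 on the
# 256 x 256 addressable grid); byte 4 carries the control flags.

def pack_format_word(v_start, h_start, h_stop, control):
    """Pack bytes 1-4 into one 32-bit word."""
    for b in (v_start, h_start, h_stop, control):
        assert 0 <= b <= 255
    return (v_start << 24) | (h_start << 16) | (h_stop << 8) | control

def unpack_format_word(word):
    """Recover (vertical start, horizontal start, horizontal stop, control)."""
    return ((word >> 24) & 0xFF, (word >> 16) & 0xFF,
            (word >> 8) & 0xFF, word & 0xFF)

w = pack_format_word(0x12, 0xF0, 0x10, 0x07)
print([hex(b) for b in unpack_format_word(w)])  # ['0x12', '0xf0', '0x10', '0x7']
```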
It should be noted that the first format word
transmitted for a new document results in addressing the document reference mark. A special
hardware routine is initiated to locate the edges
of the reference mark with respect to the coordinates defined in bytes 1 and 2, and to generate a
registration error voltage which is used to modify
the coordinates of all fields subsequently addressed on this document.
As this information is being transmitted from

the CPU, it is stored in the format control circuitry-the first three bytes in a digital-to-analog
converter and the fourth byte in digital storage.
At the conclusion of the transmission, the scanner
is ready to begin a seek operation to a point to
the right of the field to be scanned, as defined by
bytes 1 and 2.
When the CPU replies with a signal indicating
that it is ready to accept the transmission of recognition information, the format control signals
the beam control to seek to the XY coordinate indicated by the first two bytes. When the beam
reaches that point, a right-to-left artificial follow
is initiated by the beam control to search for the
first character in the field. When the last character
in the field has been scanned and the beam has
progressed horizontally to the location indicated
in the byte 3 of the format word, the beam is
forced into a large raster or idle mode.
The beam remains in large raster until the CPU
transmits the format word for the next field to
be scanned. This process is repeated until all fields
on the document have been scanned. At that time,
the CPU transmits a document eject command.
The run control executes the command by moving
the document just scanned into the stacker transport and by moving the next document into the
scanning station.
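The document-processing handshake above can be sketched as a driver loop in modern notation. The class and method names here are hypothetical stand-ins for illustration; the real interface is the 1287 channel protocol, not a programming API.

```python
# Hypothetical sketch of the CPU/1287 sequence: check ready status, feed
# a document, scan field by field (one format word each), then eject.
# SimulatedReader stands in for the run control.

class SimulatedReader:
    def __init__(self, fields):
        self.fields = fields          # field id -> recognized characters
        self.ejected = False
    def is_ready(self):
        return True
    def feed_document(self):
        pass                          # feed, separate, align, register
    def scan_field(self, format_word):
        return self.fields[format_word]
    def eject_document(self):
        self.ejected = True           # move to stacker, load next document

def process_document(reader, format_words):
    """One document: check ready, feed, scan each field, eject."""
    assert reader.is_ready()
    reader.feed_document()
    results = [reader.scan_field(w) for w in format_words]
    reader.eject_document()
    return results

reader = SimulatedReader({1: "215", 2: "041735"})
print(process_document(reader, [1, 2]))  # ['215', '041735']
```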
Data extraction process
Because of the wide variability of handwritten
numerals, it is highly desirable to extract a relatively small number of measurements that are
highly insensitive to minor variations in the shape
of the same character. On the other hand, these
measurements must dichotomize the character set
in order to have discrimination value. There is a
very practical limit, however, on how far a given
measurement can go in meeting these two requirements. As we shall see later, if the variation in shape for a given character is allowed to
degrade beyond a reasonable limit, it can no longer
be distinguished from the degraded form of another character.
A form of measurement that generally meets
these requirements is a sequential arrangement of
combinations of locations and vectors (or directions) generated as a result of contour following.
The process of contour following with a circle
scan and selectively attenuated circles has been
described.
FIGURE 4-Contour following a character

The handwritten numeral is found on the document by means of the artificial follow search routine (Figure 4). When real video data is encountered, the video-deflection feedback loop causes
the beam to follow the outer contour of the character in a clockwise direction. The beam is permitted to make a "first follow" or complete trip
around the character to exactly locate the character, normalize the measurement circuitry to the
size of the character, and optimize threshold circuitry. The follow circle scan frequencies are filtered out of the deflection signals thereby providing signals representing the locus of circle centers
for subsequent recognition measurements.
During the "first follow," the filtered horizontal
and vertical deflection signal excursions about a
zero reference voltage are tracked, and their extremes are stored. The stored voltages, representing the horizontal and vertical extremities of the
character, are applied across resistor divider networks which divide the horizontal dimension into
four zones (A, B, C and D) and the vertical dimension into five zones (1, 2, 3, 4, and 5) (Figure
5A). During the "second follow," location information is provided by comparing the current referenced deflection voltages to the levels stored in
the four by five matrix.
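The zone classification above can be sketched numerically as follows. This is a plain-arithmetic stand-in for the resistor-divider comparator network; the function and the small epsilon guard are assumptions of the sketch.

```python
# Sketch of the zone lookup: the "first follow" extremes divide the
# character's bounding box into four horizontal zones A-D and five
# vertical zones 1-5; each "second follow" sample is classified against
# those stored levels.

def zone(x, y, x_min, x_max, y_min, y_max):
    """Return (column, row) for a point inside the stored extremes.
    Row 1 is the top of the character, row 5 the bottom."""
    col = min(int(4 * (x - x_min) / (x_max - x_min + 1e-9)), 3)
    row = min(int(5 * (y_max - y) / (y_max - y_min + 1e-9)), 4)
    return "ABCD"[col], row + 1

print(zone(0.0, 1.0, 0.0, 1.0, 0.0, 1.0))  # ('A', 1) -- top left
print(zone(1.0, 0.0, 0.0, 1.0, 0.0, 1.0))  # ('D', 5) -- bottom right
```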
The instantaneous direction of travel, with an
angular resolution of 30 deg., of the circle center
is obtained as follows: The filtered XY deflection
signals are differentiated to generate the signals
+dx/dt, -dx/dt, and -dy/dt. Each of these
three signals is then effectively multiplied by the
tangent of 15 deg. Voltage comparisons of appropriate pairs of these six signals yield six digital
signals which represent six semi-circles, each
one displaced 30 deg. from the next. For example,
the output of the voltage comparator comparing

FIGURE 5-(A) Four by five matrix; (B) voltage comparator; (C) "ANDing" Sem 1 and Sem 2

the amplitudes of -dx/dt and -dy/dt (tan 15°)
is active whenever the direction of circle center
travel is in the semi-circle bounded by 15 deg. west
of due south to 15 deg. east of due north (refer
to the shaded area in Figure 5B). Any particular
30 deg. direction segment, or any combination of
30 deg. segments, can be generated by "ANDing"
two of the 12 signals composed of the six semicircles signals and their inverses. The 30 deg. segment representing "north" is generated by "ANDing" together + (semi-circle 1) and - (semicircle 2) as illustrated in Figure 5C.
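The quantizer just described can be modeled as below. The model is illustrative: angles are measured counterclockwise from east (an assumption made for the sketch; the paper names directions compass-style), and the trigonometric comparisons stand in for the hardware's tan 15 deg. voltage comparators.

```python
import math

# Six comparators each test whether the circle-center velocity has a
# positive component along an axis; the axes are spaced 30 deg. apart.
# ANDing two of the twelve signals (the six comparators and their
# inverses) isolates a single 30 deg. segment.

def semicircles(dx, dy):
    """Comparator j is true when travel has a positive component along
    the axis at 30*j degrees (j = 0..5)."""
    return [dx * math.cos(math.radians(30 * j)) +
            dy * math.sin(math.radians(30 * j)) > 0 for j in range(6)]

def segment(dx, dy):
    """Index k (0..11) of the 30 deg. segment (30k .. 30k+30 degrees)
    containing the direction of travel."""
    s = semicircles(dx, dy)
    twelve = s + [not b for b in s]   # six signals plus their inverses
    for k in range(12):
        if twelve[(k - 2) % 12] and twelve[(k + 3) % 12]:
            return k

# Travel at 75 deg. falls in segment 2 (60-90 deg.):
print(segment(math.cos(math.radians(75)), math.sin(math.radians(75))))  # 2
```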
In order that information about the duration
of travel in a particular direction might be utilized for characters of varying size, the time to
follow around a second time is normalized by making the follow circle scan diameter linearly dependent on the amplitude of the stored voltage
representing the character height.
Two types of measurements, known as feature
tests and supplementary scans, are made on the
character.
(W + WNW)·(5), T = x μsec (sets first 3BN latch)
(1 + 2) (resets first 3BN latch)
(E + ESE)·(4 + 5), T = y μsec (sets second 3BN latch)
(W + WNW)·(5) (resets second 3BN latch)
(WSW + W)·(5)·(C + D) (sets first 2BT latch)
(A + B)·(NNE + ENE + E + ESE + SSE + S + SSW), T = z μsec (resets first 2BT latch)
(NNW + N), T = y μsec (sets second 2BT latch)
(NNE + ENE + E + ESE + SSE + S + SSW), T = f μsec (resets second 2BT latch)
(WSW + W)·(3 + 4 + 5)·(A + B) (sets third 2BT latch, which resets second 3BN latch)
(1) (sets final 3BN latch)

FIGURE 6-Typical feature test

Feature tests make use of the parameters of location, direction, time, and sequence while following the outer contour of the character to identify character feature shapes. These feature
shapes are subordinate to an entire character
shape and many are common to several characters.
A typical feature test will be described in detail
to illustrate the manner in which position and
direction information is used (Figure 6). This
feature test is called 3BN-"three bottom, normal"-because it is intended to recognize the most
common variations of the bottom horizontal stroke
of the numeral 3 (or 5). Three sequential conditions are required to satisfy 3BN. They must all
occur during "second follow." First, the circle
center must travel west or west-northwest in
matrix row 5 for x microseconds. Next, there
must be y microseconds of travel east or east-southeast in matrix rows 4 or 5. If the beam
travels into matrix rows 1 or 2 before satisfying the second condition, the first condition is reset, and must be satisfied again to reinitiate the
sequence. This "reset" condition prevents such
shapes as "open topped" zeros or ones with a tick
on the right side from passing the test. The final
requirement is that the beam travel into matrix
row 1.
There are two events that will reset the second
condition if they occur before the final requirement is met. They are, 1) if the beam travels

west or west-northwest again in row five, which
prevents the test being prematurely passed during "first follow," and, 2) if the third condition
of the feature test "two bottom-like-three" (2BT)
is met. This last reset condition makes use of a
portion of the feature test designed to recognize
two bottoms that droop down toward the lower
right, somewhat like the upper right hand portion
of a normal three bottom. This sort of thing is
done fairly often throughout the feature test
logics. Sometimes, instead of using part of another feature test, the reset condition is itself
sequential. The only reason for asking for row 1,
as the last requirement for 3BN, was to make it
possible to implement the last two reset conditions.
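The 3BN sequence above can be sketched as a small state machine. This is a much-simplified illustration: the x and y microsecond duration thresholds and the 2BT cross-reset are omitted, and the (direction, row) event encoding is invented for the sketch.

```python
# Simplified 3BN ("three bottom, normal") sequential test. Each sample
# is a (direction, row) pair taken during "second follow".

def feature_3bn(samples):
    first = second = False
    for direction, row in samples:
        if direction in ("W", "WNW") and row == 5:
            first = True        # condition 1: west travel in row 5
            second = False      # west again in row 5 resets condition 2
        elif first and direction in ("E", "ESE") and row in (4, 5):
            second = True       # condition 2: east travel in rows 4-5
        elif row in (1, 2) and not second:
            first = False       # rows 1-2 too early reset condition 1
        if second and row == 1:
            return True         # condition 3: beam reaches row 1
    return False

# A normal 3 bottom passes; an "open-topped zero" path does not:
print(feature_3bn([("W", 5), ("E", 4), ("N", 3), ("N", 1)]))  # True
print(feature_3bn([("W", 5), ("N", 2), ("E", 4), ("N", 1)]))  # False
```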
Nearly all recognition decisions can be made on
the basis of only feature test data. Only when the
feature tests do not yield a high-confidence recognition decision are supplementary scans employed
to "look inside the character" for additional information. Some of the conflict pairs that can be
resolved by supplementary scans are the skinny
zero vs the fat one; a nine vs an eight with a narrow bottom loop; a zero vs an eight without dimples on the sides; and, a fat-topped seven vs a
thin-topped nine. The first two examples would
be resolved by a horizontal supplemental
scan looking for two black hits vs one hit. Similarly, the third and fourth examples would be resolved by a vertical supplementary scan.
The recognition decision

Compared to what has been described thus far
concerning beam and format control, trigonometric scan patterns, generation of locations and directions, and the development of feature tests, the
recognition decision logic may appear to be somewhat trivial. Basically, it is a combinatorial logic
statement for each character in the set, operating
on the output of the feature test circuitry, that
provides the decision directly (unless a conflict
must be broken by the use of a supplemental scan). Based on the characters involved in the conflict, the recognition logic will select the type of
supplemental scan required. The logic statement
for any given character usually consists of a number of paths representing the most commonly encountered variations of that character's shape. For
example, the logic tree for the nine logic includes
a path for nines with open tops (4:j), a different
path for nines with an opening on the left side
(Q), another path for nines with a three-like
bottom (9), as well as paths for the more conventional closed-loop, straight-stemmed nines.
Figure 7 shows a well-formed character 3 and some of the feature tests which it satisfies. A fairly horizontal top start produces 2TN (2 Top Normal), which is commonly found in the characters 2, 3, and 7. A number of other feature tests, representing a variety of other top start conditions, would be just as acceptable. The RNS (Right Notch Straight) is a strong characteristic in the character 3, and to a lesser degree in the 8. The 9B3 (9 Bottom Like 3) is very common for the character 3 and some forms of the 9. The 3BN (3 Bottom Normal) ending is typical for the five and some forms of the nine. All other normal 3 endings also yield acceptable feature tests in this area. The 3CN (3 Cusp Normal) adds considerable weight to the decision for the 3.
This diagram shows only those feature tests satisfied by this particular shape of the character 3. There must obviously be a number of feature tests that remain unsatisfied in order to prevent other character shapes from substituting (or satisfying the 3 logic). In the same manner, some of the sequence tests that are strongly satisfied by the 3 should show up as inhibit conditions in the logics for other characters.

FIGURE 7-Feature tests satisfied by "3"

When a character has been recognized, the beam is directed to a point immediately to the right of right center (referred to as the initial point). Right-to-left artificial follow is then initiated with the video blanked until the beam has passed the area occupied by the character just recognized. Artificial follow then continues until the next character is encountered. If the recognition logic is unable to make a decision based on available measurement data, the beam is returned to the initial point and a second recognition attempt is made. Up to two rescans are made, with clipping level adjustments made in instances where specific degradations of the character are indicated. If the character cannot be confidently identified after two rescans, the CPU is so informed and the recognition process is continued on the next character.

Document format and writing implements

This section presents a typical handwriting document format showing the field and character locators determined to be optimum by a series of
extensive human factors experiments and field
studies, and discusses the proper choice of writing
implements.
Document format
Figure 8 is one possible format for a retail sales
check. The document reference mark, which can
be located anywhere on the document within the
scanning area, is used to provide the link between all field locations on the document and
the format control system, and to compensate
for variables such as paper cutting, printing registration, and document stopping tolerance. On
this document, the reference mark is the dark
"L-shaped" mark in the upper right-hand corner.
All the fields on this document are horizontally
aligned, but fields may be both horizontal and vertical
on the same document if this is desired in the application. The only requirement on the placement
of data fields is a 0.125-in. clear band between any edge of the field and the edge of the
document or another field. Each field is comprised
of one or more character locators, which are rectangular boxes printed in a reflective ink (shaded
area in Figure 8). As determined by human factors studies on how people normally and comfortably write, the aspect ratio of the boxes should
be three-to-four (width to height), with the height
ranging from 0.240 to 0.320 in.
A space of 0.020 in. between boxes is recommended in order to minimize linking of adjacent

FIGURE 8-Possible format for retail sales check

characters. The lower left-hand field of Figure 8
illustrates the confusion that can result both to
the human reader and the OCR reader when handwritten characters are linked. The field was intended to contain the digits 2 1 5 1. Without the
preconditioning that the boxes provide, the field
could easily be interpreted as 4 5 1, or even 4 3 7.
Constraints other than a rectangular box were
considered and tested early in the development of
the current technology, but these were discarded
because of the additional burden placed on the
writer and their interference with normal writing
habits. Generally, it was concluded that the fewer
rules the writer must remember about the formation of each character, the greater will be his
acceptance of recommendations for good writing
discipline.
Writing implements
The ideal writing implement should provide a
continuous stroke of uniform width and density
throughout the character. The written character
should be largely independent of the pressure applied to the writing implement, should be highly
insensitive to smudge and smear, and should have
a minimum of extraneous non-reflective matter.
A number of writing implements were investigated for their suitability in providing the desired characteristics. No known implement studied

consistently provided all of these characteristics,
and therefore the selection of the implement was
a matter of a best compromise.
Table I presents the conclusions of an evaluation of a variety of fine-line pencils and pens in
terms of the characteristics desired in a writing
implement. A plus (+) in the matrix should be
interpreted as an advantage for that implement,
a minus (-) as a disadvantage, and a zero (0) as
no net advantage or disadvantage. Note that it is
next to impossible to assign a value to each of
the pluses and minuses, and that they have widely
varying values. Inasmuch as a minus in only one
characteristic could nearly rule out a given implement, the scores are not directly additive. It
should also be noted that the applicability of a
given implement varies with the application, the
writing environment, the training emphasis of
the writer, and other factors.
A primary conclusion is that the best overall
writing implement is the fine-line mechanical pencil with a medium lead (No. 2-2½). All mechanical pencils have a minor maintenance disadvantage in that they can run out of lead, but normally the lead can be quickly replaced by a spare
carried inside the pencil.
One runner-up is the wood pencil with a medium
lead. Its disadvantage is that the point breaks or
becomes dull (reliability), and needs to be sharpened (maintenance). A second runner-up is the
fine-line mechanical pencil with a hard lead
(harder than No. 2½). Its disadvantage is its
lighter stroke density and the range of density,
particularly under the condition of lightly applied
pressure. This factor can be overcome to some extent, however, with proper emphasis on training.
Soft lead pencils have the disadvantage of susceptibility to smudge and smear and to leaving
extraneous deposits of graphite, compared to medium and hard lead pencils.
Ball-point pens are characterized by their uniform stroke width and insensitivity to applied
pressure on the one hand, and by skipping (lack
of stroke continuity), smear susceptibility, extraneous ink deposits, and running out of ink (reliability) on the other hand.
The advantages of the fountain pen are its continuity of stroke and uniform stroke density. However, it tends to smear wet ink, and at times leaves
a double track. It also runs out of ink (reliability)
and needs to be filled (maintenance).
The felt pen has some key advantages, such as
continuous uniform stroke (both width and density) and insensitivity to pressure. It has some


equally key disadvantages. Its (absolute) stroke
width relative to normal character size is too
large, causing a general blobbing of the character.
It leaves extraneous deposits of ink, and the pen
tip dries quickly unless it is covered.
In most applications the ordinary medium wood
pencil will satisfy the need for a writing implement. A medium lead mechanical pencil adds reliability and convenience. In applications where a
more permanent recording is required, certain


types of ball-point pens can be successfully used
under proper conditions.

TABLE I-Suitability of writing implements for NHW
Characteristics rated (one row per characteristic): uniform stroke width (constant pressure); absolute stroke width; continuous stroke (constant pressure); uniform stroke density; pressure insensitivity; smudge/smear resistance; lack of extraneous deposits; lack of maintenance; reliability; good stroke starting.

Human factors

Motivation

In most endeavors, motivation can make the
difference between success and failure, or at least
the difference between highly successful results
and moderately successful results. Case studies

on the application of numeric handwriting as direct input to data processing systems have verified some interesting data. These case studies
were conducted in a variety of .industries to determine the effectiveness of motivation, training,
and feedback, and the interrelation of these factors. In all cases, the gauge of effectiveness was
actual performance on a prototype model of the
IBM 1287 Optical Reader in reading documents
generated in the case studies. Only the conclusions
from the studies will be presented.
Motivation must first be instilled at the user's
top management level. If the enthusiasm for and
confidence in a program is generated there, and
the need for positive involvement is communicated
downward through all levels to the operational
level, the probability of success is greatly enhanced. In the case studies where management interest and involvement were obvious to the lower
level employees, the response was one of equal
interest and enthusiasm. In post-test interviews
that were conducted among participating highly
unionized employees, comments such as "If it's
good for the company, I want to do well" were
common.
In general, the case studies demonstrated that
highly motivated employees working under poor
environmental conditions yielded results that
were at least as successful as those produced by
moderately motivated employees working under
excellent environmental conditions.
Some employees were motivated by the mere
fact that management was interested enough in
the program to provide a training session on the
writing of proper shapes, particularly when those
shapes looked like the characters they had always
written. Many others were motivated through
properly applied feedback where they knew that
the quality of their output was important to the
success of the system.
Most frequently, the motivation was in some
intangible form, such as the sincere desire to comply with a management plan, when that plan had
been presented to the employees in terms of the
benefit to be derived. Today, monetary rewards in
the form of bonuses and discounts are offered as
motivation in at least one application. However,
most forms of motivation discussed here cost only
a small amount of management time and effort,
but they produce handsome payoffs.

Training
Because of the limitless variety of character
shapes encountered in numeric handwriting, some

FIGURE 9-Examples of character degradation sequences

limit must be imposed on the number of variations of a given character shape that will be allowed to define that character. Needless to say,
the greater the number of variations allowed for
any particular character, the more difficult and
costly it becomes to maintain sufficient separation
between the decision logic for that character and
the logics for all other characters in the set.
Figure 9 illustrates some common examples of
character degradation sequences, in which progressively more degraded forms of one character pass through a "twilight zone" and emerge as
progressively less degraded forms of another character. The few shapes on either end of the sequence are both human and machine sensible, but
those falling toward the center of the. sequence
cannot be confidently recognized, particularly
when they do not appear within a sequence that
limits the choice. The only way to limit the variations in handwritten character shapes that must
be handled by an optical reader is to motivate and
train the writer to exercise a reasonable degree
of care in writing.
In developing the 1287, a practical compromise
was sought between recognition hardware complexity and the degree of care required of the
writer. Happily, a compromise was found which
requires relatively little conscious attention to
writing rules. The "model" character set chosen
as a handwriting "font" does not differ from the
normal character shapes taught in most grammar
schools. Furthermore, considerable effort was devoted to making allowances for the most com-

Design of OCR System for Reading Handwritten Numerals

1161

• Avoid linking characters.
• Avoid leaving gaps in line strokes.
• Write simple shapes without fancy strokes or
curls.
• Make loops closed and rounded.

monly ep.countered reasonable deviations from the
"model" shapes. For example, although the
"model" shape of the numeral (l.) does not show
a loop at the junction of the bottom stroke with
the upright stroke, a common variation of this
shape which does have a loop at the junction (~)
is accepted by the recognition logic.
Human factors' studies indicated that some
training would be beneficial to writers to help
them achieve an optimum writing style. The small
degree of constraint imposed, and the ease of
learning the "model" shapes (which involved little
more than a review of writing habits learned in
grammar school) led to the development of a
short, easy to understand self-instruction manual.
This training is easily given, including supervised
practice if desired, in less than 30 minutes.
The model shapes are shown on the top line of
the Any Store document in Figure 8. It may appear unnecessary to require employees, who will
be writing for direct entry into data processing
systems, to take instructions on how to form
handwritten numerals that are no different than
were learned in elementary school. However, the
primary purpose of the instruction was to impress
the writer with these five basic rules:
.

Feedback

The instruction session provides the. writer with
these basic rules and a feel for allowable deviations. Through immediate feedback during the instruction session he also receives an indication of
his initial performance. As in all servomechanisms
with feedback removed, the writer tends to become lax and sloppy if occasionally he does not
receive some word about his performance. As indicated earlier, the feedback provides a motivating effect in demonstrating to the writer that his
output has a bearing on the success of the total
system. ,From a practical standpoint, feedback in
the form of helpful hints from his supervisor will
tend to correct "open-loop eights," "thin-topped
nines," a tendency to write very light characters,
or any other individual quirk.
Another type of feedback is the computer-generated report card, a listing of error documents by
clerk or employee number. Table II is a relative
performance chart illustrating the importance. of
motivation, training, and feedback for machIne
recognition of handwritten numerals. This chart
shows only relative performance under the various

• Write the character, just filling the box (the
1287 can handle characters that are less than
half the height of the smallest recommended
box height) .

TABLE II-Relative reader performance as a function of motivation, training, and feedback
Training levels: (A) detailed initial training and continuing feedback; (B) detailed initial training, early feedback only, probably problem oriented; (C) initial training by manual, little or no feedback. Columns are grouped by level of supervision (very close; loose to moderate; little, if any) and by motivation level (1. very high; 2. average; 3. low); the entries are reject (r) and substitution (s) rate multipliers, ranging from 1 under the most favorable conditions to 75 under the least favorable.

combination of these factors. Each number may be
considered a reject and substitution rate multiplier in relation to the rates obtainable under the
most favorable conditions.
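The use of these multipliers amounts to a single multiplication; the baseline figures in the sketch below are invented purely for illustration, not taken from Table II or from measured 1287 performance.

```python
# The numbers in Table II act as multipliers on the reject (r) and
# substitution (s) rates obtainable under the most favorable conditions.

def scaled_rates(base_rejects, base_substitutions, r_mult, s_mult):
    """Apply Table II multipliers to baseline per-10,000-character counts."""
    return base_rejects * r_mult, base_substitutions * s_mult

# An assumed baseline of 20 rejects and 2 substitutions per 10,000
# characters, under a condition whose multipliers are r = 10 and s = 15:
print(scaled_rates(20, 2, 10, 15))  # (200, 30)
```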
CONCLUSION
Any system involving the reading of handwritten characters is subject to a myriad of human
variables. It is impossible for even the most
sophisticated optical reader to compensate for all
of them if no control is exercised. As indicated in
Table II, the user frequently has the degree of
freedom to provide the environment compatible
with the system performance he desires.
Customer satisfaction with reading performance is highly variable with the application and the degree of control that the customer has over the quality of his input. Character reject rates of a fraction of a percent are achievable from input prepared by unskilled workers on a routine job in a controlled environment.
Figure 10 illustrates the range of character shapes beyond the "model" shapes that are read correctly by the IBM 1287 Optical Reader. Its significance is that the success of a handwriting reader application can be greatly enhanced if the reader is designed to read a broader range of character shapes than what the writer is asked to write.
In conclusion, it must be stated that the practicality of reading handwritten numerals by machine in a variety of industry applications is a demonstrated fact by virtue of wide customer acceptance. Its application in the future seems almost without limit.

ACKNOWLEDGMENTS
The authors would like to acknowledge Mr.
Douglas C. Antonelli for his Human Factors contributions, and the many others from the Optical
Reader Development departments of the IBM
Systems Development Division Laboratory, Rochester, Minnesota, who helped develop the 1287
Optical Reader.
REFERENCES


1 D C ANTONELLI
Optical character recognition of numeric handprinting
Case Studies and Results IBM Rochester Minnesota (unpublished report)
2 T L DIMOND
Devices for reading handwritten characters
Proceedings of EJCC pp 232-237 1957
3 M N CROOK D S KELLOG
Experimental study of human factors for a handwritten numeral reader
IBM J of Research and Development 7 No 1 January 1963
4 E C GREANIAS et al
The recognition of handwritten numerals by contour analysis
IBM J of Research and Development 7 No 1 pp 14-21 January 1963
5 N SEZAKI H KATIGIRI
Character recognition by the follow method
IEEE Proceedings 510 May 1965

FIGURE 10-Range of character shapes beyond "model" shapes that are read correctly by the 1287

The dynamic behavior of programs*
by I. F. FREIBERGS
McGill University

Montreal, Canada

INTRODUCTION
A computer system consists of several resources for which users' programs are competing: CPU time, primary and secondary storage, and input-output devices. More and more computer systems currently being marketed allow for resource sharing among several jobs, in order to obtain better utilization of all available equipment, or to provide better service to the users by reducing their job "turnaround" time, or both.
Some of the approaches tried so far in order to achieve these aims are:
multiprogramming, as in the Univac 1108 and IBM 360, i.e., the sharing of core memory by several users' programs;
remote-access computing via teletype-like terminals, as in the RAX system on the IBM 360, or Dartmouth College's time-sharing system on the GE265, i.e., "slicing" of the CPU's time between the terminal users;
on-demand paging, as in the CDC 3300, SDS 940, IBM 360/67, and GE 645. Only the active parts of programs, or pages, are kept in core memory. This is achieved by dividing up the available core memory into blocks of a certain size. Blocks of storage might then be allocated to program pages on an "on-demand" basis.
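The "on-demand" allocation just described can be sketched in a few lines. This is an illustrative model, not code from any of the systems cited; the 1024-word block size and the FIFO eviction rule are assumptions made only for the sketch.

```python
# Illustrative sketch: fixed-size core blocks are bound to program pages
# only when a page is first referenced ("on demand"). FIFO eviction is
# an assumption of this sketch, not a strategy from the paper.
from collections import deque

PAGE_SIZE = 1024  # words per block


def count_page_faults(addresses, num_blocks):
    """Replay a word-address reference string and count page demands."""
    resident = set()
    fifo = deque()
    faults = 0
    for addr in addresses:
        page = addr // PAGE_SIZE
        if page not in resident:
            faults += 1  # page demanded: allocate a block on demand
            if len(resident) == num_blocks:
                resident.discard(fifo.popleft())  # evict the oldest page
            resident.add(page)
            fifo.append(page)
    return faults


# A tight loop touching only two pages fits in two blocks: 2 faults total.
refs = [0, 4, 1028, 8, 1032] * 100
print(count_page_faults(refs, num_blocks=2))  # 2
```

With fewer blocks than the program's working set, the same reference string would fault continuously, which is the behavior the later sections of the paper quantify.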
The performance of the above three types of systems
becomes difficult to estimate, as compared to batch processing systems, since each program can no longer be
considered per se, but is heavily dependent on the
priority and scheduling rules of the operating environment, as well as on the type and volume of other programs to be processed. In time-shared systems, the performance will also be influenced by the size of the time-slices and by the resulting systems overhead for program swapping between core and auxiliary storage. In paged systems additional performance parameters are introduced by page size and page turning strategies.

*This research was supported by the National Research Council of Canada through Grant A-4081.
Such systems are generally considered to be too complex and non-deterministic in nature for analytical study methods. The alternative for studying these systems is simulation, validated by subsequent direct measurement.
As an exception, it must be pointed out that remote-access low speed terminal systems have been studied by queueing theory. A summary of the models and the measures used is available.1 Simulation models of remote-access systems have also been developed, at MIT,2 and at SDC.3
Multiprogramming on the Univac 1107,4 as well as an IBM 7094/44 directly-coupled system,5 have been simulated.
The IBM 360/67 system has been simulated at Stanford.6 Paging strategies have been investigated at IBM, both experimentally using the M44/44X system,7 and by simulation.8
In all such models, and particularly in ones dealing with page-turning, crucial assumptions have to be made about the dynamic behavior of programs under execution, in particular about the frequency of references to data in memory, and about the distance between successive instruction and data fetches. It has been said that page-turning can be either very useful or disastrous, depending on the type of program to which it is applied.9 Up to now, few direct investigations have been carried out, and performance estimates of systems, such as the IBM 360/67, were based on certain theoretical assumptions about the behavior of compilers and of the users' programs of different classes. Recently, at SDC a study of actual memory requirements and simulated page demand rates has been carried out for a fixed page size of 1024 words on some interpretive type programs (LISP, META5), with some interesting conclusions.10
In the Spring of 1966 it was decided to obtain empirical data about the dynamic behavior of programs, typical of the McGill University computing environment, from the point of view of CPU time and memory utilization. The data thus obtained would also be used as input to subsequent simulation studies of various paging and scheduling strategies.

Method
A trace-like program was developed for the IBM 7044
which executes interpretively any program, instruction
by instruction, recording for each the operation code
and the actual address of the instruction. For instructions referencing data in memory, the actual address of
the data word was also recorded, together with an indicator as to whether it was a data fetch or a data store.
Supervisor calls (SVC's) were not executed interpretively, since it was felt that the design philosophies behind supervisor routines would differ too widely, so that the IBM 7044 supervisor routines would not have a sufficient degree of generality for subsequent study of other systems. Instead, the entry point in the supervisor was noted, and the time spent in the supervisor was measured in clock pulses of 1/60th of a second each. From the entry point it was then possible to determine the type of SVC which caused the suspension of program execution. Most of the SVC's occurred for input or output (I/O) operations.
Since the programs investigated were not specifically organized for execution with paging, dividing the core memory into 32 equal-size pages of 1024 words regardless of program organization would result in an overestimation of page boundary crossings. For this reason a memory map, or layout, was taken for each program investigated, indicating its functional parts, namely the regions occupied by the supervisor, the input-output control blocks and buffers, the problem program, its subroutines, system-supplied subroutines, common data areas, etc. Memory was then subdivided into a variable number of pages of 1024 words or less, making sure that each functional part begins on a new page boundary. Table I shows a typical memory layout with page allocation. Since any program would consist of the above functional parts, regardless of the computer system used, it was felt that the results of this investigation would be of more general value, beyond the IBM 7044 memory organization.
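The subdivision rule just described can be sketched as follows. The function name and the sample layout (symbols, start addresses, lengths) are hypothetical, chosen only to echo the style of Table I; the rule itself — each functional part starts a new page, parts longer than 1024 words are cut into 1024-word pages — is the one stated above.

```python
# Sketch of the variable-size page subdivision described above.
PAGE_LIMIT = 1024  # maximum page size in words


def paginate(layout):
    """layout: list of (symbol, start_address, length_in_words).
    Returns one (symbol, page_start, page_size) tuple per page; every
    functional part begins on a fresh page boundary."""
    pages = []
    for symbol, start, length in layout:
        offset = 0
        while offset < length:
            size = min(PAGE_LIMIT, length - offset)
            pages.append((symbol, start + offset, size))
            offset += size
    return pages


# Hypothetical layout: F gets one 101-word page, L splits into
# 1024 + 1024, and P splits into 1024 + 476.
layout = [("F", 0, 101), ("L", 101, 2048), ("P", 2149, 1500)]
for page in paginate(layout):
    print(page)
```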
The output of the trace-like program was recorded on
magnetic tape, and was subsequently analyzed and
summarized by an analyzer program.
After various attempts, the best way to summarize in

TABLE I-Memory layout of an in-core Fortran compiler

PAGE                               PAGE ADDRESS      PAGE SIZE
No.   DESCRIPTION      SYMBOL      (IN OCTAL)        (WORDS)
                                   START    END
  1   FILE CONTROL       F         14323    14467       101
  2   CONTROL PGM        C         16113    20112      1024
  3   CONTROL PGM        C         20113    21201       567
  4   LOADER             L         21202    23201      1024
  5   LOADER             L         23202    25201      1024
  6   LOADER             L         25202    27201      1024
  7   LOADER             L         27202    30004       387
  8   PROGRAM            P         30005    32004      1024
  9   PROGRAM            P         32005    34004      1024
 10   PROGRAM            P         34005    36004      1024
 11   PROGRAM            P         36005    40004      1024
 12   PROGRAM            P         40005    42004      1024
 13   PROGRAM            P         42005    44004      1024
 14   PROGRAM            P         44005    46004      1024
 15   PROGRAM            P         46005    50004      1024
 16   PROGRAM            P         50005    52004      1024
 17   PROGRAM            P         52005    54004      1024
 18   PROGRAM            P         54005    56004      1024
 19   PROGRAM            P         56005    60000      1020
 20   BUFFER             B         76122    77776       941

pictorial form the information obtained was found to be
a series of "Snapshots" of memory between successive
supervisor calls. Typical output of the analyzer program
is shown in Table II, which shows the sequence of steps
for an in-core Fortran compiler.

TABLE II-"Snapshots" showing memory utilization between successive SVC's for an in-core Fortran compiler

F C C L L L L P P P P P P P P P P P P B*   TIME   SVC    CLOCK
                                           (MSEC) TYPE   PULSES
P F . . . . S . . . . . . I . . . . . .       0   LOAD    165
. S G . . . . P P P . . P S P . P . . .      16   I/O       0
P S S . . . . P S S . . P S . . G . . .       1   GET       0
P S S . . . . P F S P . P S . . G . . G       2   PUT       1
P S S . . . . P F . . . . . . . . . . P       1   PUT       0
P S S . . . . P I S . . . . . . G . . P       1   PUT       0
P S S . . . . . . F . . . . . . . . . P       1   GET       0
P S S . . . . P I S . . . . . . G . . G       1   PUT       0
P S S . . . . . . F . . . . . . . . . P       1   GET       0
P S S . . . . P S S S P P S . . P . . G       7   PUT       1
P S S . . . . . . F . . . . . . . . . P       1   GET       1
P S S . . . . P S S S . S . . . G . . G       4   PUT       0
P S S . . . . . . F . . . . . . . . . P       1   GET       0
P S S . . . . P S S S S S P . . P . . G       9   PUT       1
P S S . . . . . . F . . . . . . . . . P       1   GET       0
P S S . . . . P S S S S S P . . P . . G      10   PUT       0
P S S . . . . . . F . . . . . . . . . P       1   GET       0
P S S . . . . P S S S . S S . . P . . G      11   PUT       0
P S S . . . . . . F . . . . . . . . . P       1   GET       0
P S S . . . . P S S S S S S . . P . . G      19   PUT       0
P S S . . . . . . F . . . . . . . . . P       1   GET       0
P S S . . . . P S S S I S . . . P . . G       2   PUT       0
P S S . . . . P S S F P P S S S S . . P      62   PUT       1
. P . . . . . . . . . G G P . S S . . .      34   PUT       0
. G . . . . . . . . . G G G . S S . . .      18   PUT       0
. G . . . . . . . . . G G G . S S . . .       5   PUT       0
. P . . . . . . . . . . . G . S S . . .       2   PUT       0
P S S . . . . . . . . . . . . S F . . .       1   GET       0
P S S . . . . . . . . . . . . . . . . G       1   TYPE     23

* SYMBOLS FOR PAGE IDENTIFICATION AS PER TABLE I.

A header line identifies the pages of each functional part of the program. Each snapshot line begins with a layout of memory, split up into pages as described above. The meanings of the symbols used in the snapshots are (for each page):
.  page not used during this interval
I  instructions fetched                                    )
F  instructions and data words fetched                     ) page content unchanged
G  data words fetched                                      )
P  data words stored and fetched                           ) page content changed
S  instructions fetched and data words stored and fetched  )
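The legend amounts to a small decision rule; the sketch below states it as code. The helper name and the three-boolean summary of a page's interval are assumptions made for illustration, not part of the paper's analyzer program.

```python
# Sketch: derive the snapshot symbol for one page from the accesses
# observed between two SVC's, following the legend above.
def snapshot_symbol(ifetch, dfetch, dstore):
    """ifetch: any instructions fetched from the page; dfetch: any data
    words fetched; dstore: any data words stored."""
    if dstore:                 # any store means the page content changed
        return "S" if ifetch else "P"
    if ifetch and dfetch:
        return "F"
    if ifetch:
        return "I"
    if dfetch:
        return "G"
    return "."                 # page not used during this interval


print(snapshot_symbol(True, True, True))    # S
print(snapshot_symbol(False, True, False))  # G
```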

Next the time (in msec.) spent in uninterrupted computation before the SVC occurred is given. The average instruction time for the IBM 7044 was found to be 4.4 microseconds. Finally, the reason for the supervisor call is indicated with the number of clock pulses spent in the supervisor before program execution was resumed.
Snapshots were chosen between successive SVC's
rather than for a fixed number of instructions, because
at each SVC the scheduling algorithm has the option of
deciding whether to continue with the present program
or to switch to a different one.
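The timing figures above can be checked by direct arithmetic: at the measured average of 4.4 microseconds per IBM 7044 instruction, a burst of 100 to 1000 instructions corresponds to well under the 5 msec cited in the Results section. A minimal worked sketch (the function name is hypothetical):

```python
# Worked check of the timing claim: instructions per SVC interval
# converted to milliseconds at the measured 4.4 us average.
AVG_INSTRUCTION_US = 4.4  # microseconds, from the measurements above


def compute_time_msec(instructions):
    """Uninterrupted compute time for a burst of instructions."""
    return instructions * AVG_INSTRUCTION_US / 1000.0


print(compute_time_msec(100))   # ~0.44 msec
print(compute_time_msec(1000))  # ~4.4 msec, i.e. less than 5 msec
```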
Results

A summary of the classes of programs investigated so far is shown in Table III. This table also shows a percentage breakdown by type of instruction within each class of programs. Of interest here is the 34% average proportion of branch instructions, which may result in references to instructions on a different page in memory. Also of interest is the 51% average ratio of data words to instruction words.
TABLE III-Classes of programs investigated

                                    TOTAL No. OF    PERCENTAGE OF DIFFERENT    RATIO OF DATA
                                    INSTRUCTIONS    TYPES OF INSTRUCTIONS      WORDS TO
CLASS OF PROGRAM                    TRACED                                     INSTRUCTION
                                    (IN MILLIONS)   G(%)  P(%)  R(%)  B(%)     WORDS (%)**
FORTRAN EXECUTION (11 PROGRAMS)          4.02        36    20    11    33          57
LIST PROCESSING (SLIP)                   1.13        23    15    24    38          61
FORTRAN COMPILATION,
  IN-CORE COMPILER (7 PROGRAMS)          1.74        28    22    14    36          54
COBOL EXECUTION                          1.89        28    15    14    43          45
STRING PROCESSING                         .28        27    20    25    28          55
SIMULATION (GPSS)                        1.29        23    19    32    26          49
AVERAGE                                              29    22    15    34          51

G: instructions requiring a data word fetch in memory
P: instructions requiring a data word store
R: instructions referring to registers only
B: branch, or jump, instructions
** Data words = G + P + move instructions

The percentages of Table III will vary somewhat, depending on the instruction repertoire and on the number of registers available in a particular computer. For instance in the IBM 360, with 16 general purpose registers as opposed to 2 arithmetic registers and 3 index registers in the IBM 7044, the proportion of instructions referencing data in memory was expected to be reduced in favor of more register-to-register type instructions. To verify this some IBM 7044 programs were rerun with 360 Fortran G and Cobol F. The percent increase in the proportion of R type instructions was found to be between 40% (Fortran) and 340% (Cobol). The percentage of B type instructions was decreased between 56% (Fortran) and 64% (Cobol) to compensate for this. The proportion of branch instructions for which the branch is taken was found to be around 90% on the 360. The instruction times obtained in this investigation can then be adjusted accordingly.
By comparing the "snapshots" of the various programs investigated, several results emerge.
A program does not execute very many instructions between successive SVC's. This is illustrated in Figure 1, which shows the cumulative percentage of the number of SVC's vs. the number of instructions executed between successive SVC's. For most classes of programs, 50% of the time this number lies somewhere between 100 and 1000 instructions, i.e. less than 5 msec.
Figure 2 shows the cumulative percentage of SVC's vs. the number of instruction pages required between successive SVC's. It can be seen that this stabilizes around 2 to 3 pages, except for the in-core Fortran compiler which requires 5 pages most of the time.
Data page requirements are higher than those for instruction pages and practically coincide with total page requirements. This is illustrated in Figure 3. Most programs need 4 to 6 data pages between successive SVC's, with the exception again of the in-core Fortran compiler, which requires between 5 and 12 data pages.

FIGURE 1-Cumulative frequency of the number of instructions executed between supervisor calls

FIGURE 2-Cumulative frequency of the number of instruction pages required between supervisor calls

FIGURE 3-Cumulative frequency of the total number of pages required between supervisor calls

FIGURE 4-Cumulative frequency of the total number of pages required for a 127 job sample at McGill University

In order to obtain an estimate of the total memory requirements of the class of jobs processed by the McGill Computing Centre, all the jobs submitted during a certain 24 hour period were examined. For 127 of these jobs which reached the execution phase, a memory layout was obtained. The results of the analysis are shown in Figure 4, i.e., the cumulative percentage of the number of jobs vs. their total memory requirements (for the entire duration of the job).
It can be seen that 50% of the jobs require less than 7 pages, made up as follows:
Problem program including its subroutines    3 pages
System supplied subroutines                  2 pages
I/O buffer areas                             1 page
Common data area (for larger programs)       1 page
DISCUSSION
Contrary to the popular view, long compute sequences without any interruption seem to be the exception rather than the rule. The number of instructions executed between supervisor calls is of the order of 10² to 10³. Since most of the SVC's are due to I/O requests, assuming a blocking factor of about 10 with a fully overlapped I/O channel operation, return to the interrupted program could be immediate in 9 out of 10 cases. This would result in a sequence of 10³ to 10⁴ instructions (5 to 10 msec) about 50% of the time.
On the other hand, the corresponding memory requirements appear to be larger than generally expected, in agreement with the SDC study.10 They are of the order of 2 to 3 instruction pages and 4 to 6 total pages. An exception here is the Fortran compiler, whose page requirements are about twice as large.
From these considerations it seems clear that the systems overhead for a "one-page-on-demand" strategy would be prohibitive. To avoid this, at least 3 pages for instructions and data, plus one page for I/O buffers, plus 2 pages for systems supplied subroutines (which could be shared code) should be made available to any program at the outset, as well as during each successive period of activity.
From Figure 4 it can be seen that such a memory assignment policy would accommodate the entire program of 50% of all jobs entering the execution phase, so that there would be no further paging overhead for the entire duration of these jobs.

Another question worth raising at this point is whether the additional cost of implementing a fixed page size hardware system is warranted, as opposed to a multiprogrammed system where the entire program is brought into memory.
These costs should also be weighed against those of acquiring more Large Capacity Storage, especially in view of the large memory requirements of compilers.

SUMMARY
In this paper the results obtained from an interpretive instruction by instruction execution of different classes of programs (Fortran compilation and execution, Cobol, GPSS, SLIP) on an IBM 7044 have been presented.
Memory and CPU time requirements between successive supervisor calls have been analyzed, with the outcome that most of the time instruction sequences are rather short and more than one page (1024 words) of memory is required.
The data obtained can be used as realistic input to simulation models of multiprogrammed or fixed page size computer systems.
REFERENCES
1 G ESTRIN L KLEINROCK
Measures models and measurements for time-shared computer utilities
Proc 22nd ACM Nat Conf pp 85-96 August 1967
2 A L SCHERR
An analysis of time-shared computer systems
Research Monograph No 36 The MIT Press Cambridge Mass 1967
3 G M FINE P V McISSAC
Simulation of a time-sharing system
Mgt Sci 12 6 pp B180-B194 February 1966
4 G K HUTCHINSON J N MAGUIRE
Computer systems design and analysis through simulation
Proc of the 1965 Fall Joint Computer Conf pp 161-167 1965
5 J H KATZ
Simulation of a multiprocessor computer system
Proc 1966 Spring Joint Computer Conf pp 127-139 1966
6 N R NIELSEN
The simulation of time sharing systems
Comm ACM 10 7 pp 397-412 July 1967
7 R W O'NEILL
Experience using a time sharing multiprogramming system with dynamic address relocation hardware
Proc 1967 Spring Joint Computer Conf pp 611-622
8 L A BELADY
A study of replacement algorithms for a virtual-storage computer
IBM Sys J 5 2 pp 78-101 1966
9 J B DENNIS E L GLASER
The structure of on-line information processing systems
Proc of the 2nd Congress on the Information System Sciences p 6 1964
10 G M FINE C W JACKSON P V McISSAC
Dynamic program behavior under paging
Proc of the 21st ACM Nat Conf pp 223-228 August 1966

Resource allocation with interlock detection
in a multi-task system
by JAMES E. MURPHY
International Business Machines Corporation
Poughkeepsie, New York

INTRODUCTION
In a multiprogramming environment, special care must
be taken to insure that, for a given physical record on a
data set, only one task at a time is involved in the following sequence:
1) Read the physical record (with update in mind);
2) Find the logical record to be changed;
3) Change the logical record;
4) Write the "new" physical record.

For example, suppose there exists on a (direct access) data set a physical record, R, consisting of the logical records A, B, C, and D, and that there are two programs, P1 and P2, contending for the resources of the system.
Suppose further that the following sequence of events is allowed to take place:
1) P1 reads R and obtains A, B, C, D in its buffer;
2) P2 reads R and obtains A, B, C, D in its buffer;
3) P1 changes A so that its buffer contains A1, B, C, D;
4) P2 changes D so that its buffer contains A, B, C, D1;
5) P1 writes the "new" physical record so that R1 = A1, B, C, D;
6) P2 writes the "new" physical record so that R1 = A, B, C, D1.
As a result of P2's action, the update to the data set made by P1 has been erased as though it never existed. Any recovery via an audit trail at a later time would duplicate this error. This example demonstrates the general need for protecting the integrity of system resources and insuring validity of the work done by the resources' users.

There are three degrees of protection required by resources and their users in a multiprogramming environment. These are:
1) If a resource is to be altered by a task, then the entire sequence of events essential to the alteration
of the resource must be protected from any interference which would affect the completion of the
change.
2) If a task is using a resource but not changing it, then that task requires protection against any changes in the resource which would affect the validity of the information the task receives from the resource.
3) If a task uses a resource without changing it, then no protection is required when changes to the resource do not affect the ultimate use of information received from it, and the task in no way interferes with other users of the resource.
All tasks wishing to use a resource seek control of it by name through a resource management facility. Requests for a given resource are queued on a first-in, first-out (FIFO) basis. The concepts of shared (S) vs. exclusive (E) control are used. Any number of tasks may use a resource simultaneously if they have all requested shared control. A task requesting exclusive control will be the only task using the resource once it gains control (except for those tasks falling into category (3) above). Tasks coming under category (1) would request exclusive control; tasks in category (2) would request shared control; and tasks in category (3) would use the resource without requesting control.
Embedded in the typical approach to resource allocation, however, are certain rigid constraints designed to avoid all possibility of system interlock, i.e., the situation in which two or more tasks place each other into a permanent wait state.
Suppose that a task is allowed to request control of a resource without releasing control of another resource it has previously obtained. Here the expression "enqueue" (ENQ) will refer to the requesting of a resource, and "dequeue" (DEQ) will mean the releasing of a resource (both actions performed through queue management).
Assume the following sequence of events, where P1 and P2 are programs running concurrently in a multi-task system and R1, R2 are resources:
a) P1 ENQ's R1 for E (exclusive control);
b) P2 ENQ's R2 for E;
c) P2 ENQ's R1 for S (shared control);
(At this point P2 is placed in a wait state);
d) P1 ENQ's R2 for S (Now P1 is placed in a wait state).
The queues for R1 and R2 would appear thus:

FIGURE 1

Now P1 and P2 are each waiting for a resource which the other holds. Under these conditions both programs would remain inactive until one or both were terminated by control programming, either through a time-out mechanism or other such safeguard.
Interlocks can occur only if a task is allowed to wait for a resource without releasing all resources it previously controlled. However, this is only a necessary and not a sufficient condition for interlock. Usually, resource control is handled as though the two conditions were equivalent, and tasks must request multiple resources in parallel or in some pre-defined sequence.
This paper discusses a more flexible resource management which examines each request in the context of all currently pending requests and determines the existence or absence of an interlock.
Queue management

For each resource, maintain a queue containing the name, mode, and rank for each task which has requested control of the resource but which has not yet released it. Within this queue:
TASK NAME identifies the task requesting use of the resource.
MODE refers to the type of control desired-"E" for exclusive, "S" for shared.
RANK indicates the task's relative position in the queue. This is assigned as follows:
A) Enqueuing
1) The first task in the queue is assigned control of the resource in the desired mode, and is assigned a rank of 0.
2) If the new task requests mode S and the mode of the last task in the queue was also S, then the rank of the new task is equal to that of the old. Otherwise, the rank of the new is one greater than that of the old.
3) If its rank in the queue is 0, then the task may use the resource without waiting for another task to finish with it.
4) If its rank is not 0, the task's request for the resource must be set aside for analysis of interlocks.
5) This same procedure is followed for each of the resources requested simultaneously by the task.
6) Interlock analysis is now performed. The algorithm and the procedures followed when an interlock is detected are described later in this paper.
7) If the task is allowed to wait for the requested resources without releasing those it already controls, its name, rank, and mode are entered in the resource queues. In this case, the wait count of the task is increased by one for each resource for which it has a rank greater than zero.

B) Dequeuing
If a task releases control of a resource, then:
1) The name of the task requesting a DEQ is removed, along with its mode and rank, from the queue for the specified resource.
2) If there are other tasks in the queue having rank zero, or if the queue is empty, the dequeuing process is finished.
3) The rank of each task in the queue is decremented by one, and the wait count is decremented by one for each task whose rank is now zero.
4) When the wait count for a task is zero, it is removed from the wait state and proceeds sharing or controlling resources in the requested mode.
Example: Let us suppose the following sequence of requests occurs for resource R.
1. Task B requests exclusive control
2. Task C requests shared control
3. Task A requests shared control
4. Task D requests exclusive control
5. Task E requests exclusive control

At the end of this time, the entries on the queue for resource R would be:

NAME   MODE   RANK
B      E      0
C      S      1
A      S      1
D      E      2
E      E      3

FIGURE 2
When task B releases R, the entries become:

NAME   MODE   RANK
C      S      0
A      S      0
D      E      1
E      E      2

FIGURE 3
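The enqueue and dequeue rules above can be sketched in a few lines. This is an illustrative model, not the paper's implementation; `ResourceQueue`, `enq`, and `deq` are hypothetical names. Replaying the example for resource R reproduces the entries of Figures 2 and 3.

```python
# Sketch of the queue-management rank rules described above.
class ResourceQueue:
    """FIFO queue of [task, mode, rank] entries for one resource."""

    def __init__(self):
        self.entries = []

    def enq(self, task, mode):
        """Rule A: the first task gets rank 0; an S request following an
        S entry shares the last rank; otherwise rank = last rank + 1."""
        if not self.entries:
            rank = 0
        else:
            last_mode, last_rank = self.entries[-1][1], self.entries[-1][2]
            rank = last_rank if (mode == "S" and last_mode == "S") else last_rank + 1
        self.entries.append([task, mode, rank])
        return rank  # rank > 0: the task must wait (pending interlock analysis)

    def deq(self, task):
        """Rule B: remove the task; if no rank-0 holder remains,
        decrement every remaining rank by one."""
        self.entries = [e for e in self.entries if e[0] != task]
        if self.entries and all(e[2] != 0 for e in self.entries):
            for e in self.entries:
                e[2] -= 1


# Replaying the example for resource R:
q = ResourceQueue()
for task, mode in [("B", "E"), ("C", "S"), ("A", "S"), ("D", "E"), ("E", "E")]:
    q.enq(task, mode)
print(q.entries)  # ranks 0, 1, 1, 2, 3 as in Figure 2
q.deq("B")
print(q.entries)  # C and A now share control at rank 0, as in Figure 3
```

Note how rule A.2 lets C and A share rank 1 (both requested S), so a single dequeue of B promotes both to control simultaneously.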
"Waits-for" as an order relation
Definition: Let the symbol Pij represent the rank of task Tj on the queue for resource Ri.
Definition: If Pai > Paj for some a, 1 ≤ a ≤ m, where m is the number of resources being used in the system, then task Ti waits for task Tj to finish with resource Ra. The expression Ti >a Tj is defined to symbolize this relationship.
If Ti waits for Tj to finish with resource "a" and Tj waits for Tk to finish with resource "b", then we have Ti >a Tj and Tj >b Tk, or Ti >a Tj >b Tk. Clearly Ti waits for Tk to finish with resource Rb, because the minimum length of time Ti must wait for resource Ra is the length of time during which Tj controls resource Ra plus the length of time during which Tk controls resource Rb. This might be expressed in the form Ti >ab Tk.
However, in this paper we are only indirectly concerned with the resources for which a task waits. Our main interest here is that a task must wait. To determine for which other tasks a given task is waiting requires knowledge of the resources involved. In line with this, make the following definition:
Let Ti and Tj be any two tasks currently resident in a multi-task system. Then say that Ti waits for Tj, expressed Ti > Tj, if and only if there exists a series of resources, rk, 1 ≤ k ≤ m, and a series of tasks, Th, 1 ≤ h ≤ m-1, such that Ti >r1 T1 >r2 ... >rm Tj.
Lemma: The relation "waits-for", ">", is transitive and is an order relation.
Note: Transitivity of the "waits-for" relation may be proven formally by introducing the variable, T, for time. Of significance here are the times when a task gains control of a given resource, when it releases the resource, and when the task gains control of the last resource for which it is waiting.
Definition: A (system) interlock is said to occur if there is a task, T0, related by competition for resources to other tasks, Ti, in a system such that a chain of waits-for relations T0 > Ti > ... > Tm > T0 exists.
Matrix representation of "waits-for"

The ordering of the set, T, of the n tasks in a system by the relation ">" may be portrayed in an n x n matrix, W, the Wait Matrix, constructed as follows:
Let the n columns and the n rows be named by the tasks in the system, T0 - Tn-1. Let the element Wij be assigned the value 1 if Ti has a rank higher than that of Tj on the queue for any resource R, i.e. Ti >R Tj. If there exists no resource, R, for which Ti >R Tj, then let Wij = 0. Note that the elements along the main diagonal are all zeros.
Construction of the precedence matrix

For a more complete, and more complex, model of the task-resource system, construct a matrix, P (the precedence matrix), as follows. Let the n columns be named by the tasks in the system, T0 - Tn-1. Let the rows be named by the m different resources currently being used, R0 - Rm-1. Let the m x n elements of the matrix, Pij = (Ri, Tj), each contain the rank (or place) on the queue for Ri which has been assigned to task Tj. Clearly many of the elements will be empty (blank or null, as opposed to zero).
Going across any row i, corresponding to resource Ri, we find an entry in column j corresponding to every task Tj which is in the queue for Ri. A zero in column k of row i indicates that task Tk has control of resource Ri. Multiple zero entries in row i indicate that several tasks share control of that resource. Duplicate non-zero (and non-null) values indicate tasks which will share control at a later time. Similarly Pik < Pij indicates that Tk is ahead of Tj in line and that Tj will have to wait for Tk.
Going down any column j, corresponding to task Tj, we find an entry in rows i1, ..., ie corresponding to every resource Ri which Tj controls or for which it is queued and waiting. An entry (rank) of zero in row k of column j indicates that task Tj now has the use of resource Rk. Non-zero (and non-null) entries in column j indicate those resources for which task Tj is waiting.
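The relationship between the two matrices can be sketched as follows; this is a minimal model, not the paper's implementation, assuming the precedence matrix is held as a mapping from (resource, task) pairs to ranks, with null entries simply absent. Replaying the state described in the Illustrations section yields the wait matrix of Figure 6.

```python
# Sketch: derive the wait matrix W from a precedence matrix P, per the
# definitions above. P is a dict keyed by (resource, task) holding
# ranks; missing keys are the "null" entries.
def wait_matrix(P, tasks, resources):
    n = len(tasks)
    W = [[0] * n for _ in range(n)]
    for r in resources:
        for i, ti in enumerate(tasks):
            for j, tj in enumerate(tasks):
                pi, pj = P.get((r, ti)), P.get((r, tj))
                # Ti waits for Tj if Ti sits behind Tj on some queue.
                if pi is not None and pj is not None and pi > pj:
                    W[i][j] = 1
    return W


# State after the five requests of the Illustrations section
# (tasks T0-T3, resources R0-R5, T0 having released R5):
P = {("R0", "T0"): 0, ("R0", "T1"): 1,
     ("R1", "T2"): 0, ("R2", "T2"): 0,
     ("R3", "T1"): 0, ("R3", "T2"): 0, ("R3", "T3"): 1,
     ("R4", "T3"): 0,
     ("R5", "T1"): 0, ("R5", "T3"): 1}
tasks = ["T0", "T1", "T2", "T3"]
resources = ["R0", "R1", "R2", "R3", "R4", "R5"]
print(wait_matrix(P, tasks, resources))  # rows of Figure 6
```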

      T0   T1   T2   T3
T0     0    0    0    0
T1     1    0    0    0
T2     0    0    0    0
T3     0    1    1    0

FIGURE 6

Illustrations

Assume a system with four tasks, T0 - T3, and six resources, R0 - R5, with all queues initially empty, and that the following sequence of requests has been processed by queue management:
1) T0 requests E for R0 and S for R5;
2) T1 requests S for R0, R3, and R5;
3) T2 requests E for R1 and R2, and S for R3;
4) T3 requests E for R3, R4, and R5;
5) T0 releases R5.
At this point, the queues for the six resources are
shown in Figure 4.
FIGURE 4-Resource queues after the request sequence (task, mode, rank on each queue):
  R0: T0(E, 0), T1(S, 1)
  R1: T2(E, 0)
  R2: T2(E, 0)
  R3: T1(S, 0), T2(S, 0), T3(E, 1)
  R4: T3(E, 0)
  R5: T1(S, 0), T3(E, 1)
Figure 5 summarizes the state of the system in a precedence matrix. In Figure 5, two tasks, T0 and T2, are
not waiting for any resources and are able to process.
T1 is able to process when T0 releases R0. T3, however,
must wait to process until both T1 and T2 release R3
and T1 releases R5.

FIGURE 5-Precedence matrix for the state of Figure 4 (rows R0-R5, columns T0-T3; blank = null):

       T0  T1  T2  T3
  R0    0   1
  R1            0
  R2            0
  R3        0   0   1
  R4                0
  R5        0       1

The wait matrix is shown in Figure 6.
Interlock detection

To demonstrate the interlock detection procedure,
continue with the system described in the previous section, using the state of the system shown in Figures 4
and 5.
Starting at this point, we will examine two cases as
T0 requests a resource. In each case, T0 must wait for a
resource to become available while another task is able
to continue processing, and T0 and the remaining two
tasks are in a wait state. In the first case, as resources are released by the task which is running, subsequent tasks
are able to process. In the second case, if the request
were to be honored, the remaining three tasks would still
be in a wait state once the task which was still running
released its resources. In this situation, only a new task
could run, and it could only continue so long as it demanded no resources held by any of the three totally
interlocked tasks.
Case 1: Suppose T0 asked to share R2. Figure 7 shows
the resulting resource queues, and Figure 8 shows the
new precedence matrix. ("A" indicates the affected
entries.) The significant steps in determining whether the request may be honored and T0 allowed to wait for R2 are
as follows:
a) T0 > T2;
b) T2 ≯ Ti for any value of i;
c) T0 ≯ Ti for any i ≠ 2;
d) Tj ≯ T0 for any j ≠ 2.

Thus there exists no sequence T0 > ... > T0, and
the request may be honored without fear of interlock.

FIGURE 7-Resource queues after T0's shared request for R2 (task, mode, rank on each queue):
  R0: T0(E, 0), T1(S, 1)
  R1: T2(E, 0)
  R2: T2(E, 0), T0(S, 1)
  R3: T1(S, 0), T2(S, 0), T3(E, 1)
  R4: T3(E, 0)
  R5: T1(S, 0), T3(E, 1)

Resource Allocation with Interlock Detection

FIGURE 8-Precedence matrix after T0's request ("A" marks the affected entry; rows R0-R5, columns T0-T3; blank = null):

       T0  T1  T2  T3
  R0    0   1
  R1            0
  R2    1A      0
  R3        0   0   1
  R4                0
  R5        0       1
Case 2: Suppose that, instead of R2, T0 requests
shared control of R4. The resulting resource queues and
the precedence matrix are described in Figures 9 and 10.
("Δ" indicates those entries which are new.)
Significant steps in concluding that the request cannot
be honored follow, keyed to the arrows in
Figure 10:
a) T0 > T3;
b) T3 is waiting for control of R3;
c) T3 > T1;
d) T1 is waiting for control of R0;
e) T1 > T0.


Thus there exists a sequence T0 > ... > T0 and the
request must be denied. In this case the wait-chain is
T0 > T3 > T1 > T0. If T0 were allowed to wait for R4
without first releasing R0, the tasks T0, T1, and T3
would be in a permanently interlocked wait state.
Since there is only one new resource requested and
one old resource being held, no further paths need be
followed. Note that there is an alternate branch which
parallels the first path, but which does not lead to any
new knowledge.
For a simpler view of the interlock, use the wait
matrix shown in Figure 11. Scanning across the row
from T0, we find that T0 > T3. Scanning across T3's
row, we find T3 > T1. Scanning T1's row, we see that
T1 > T0. Thus T0 > T3 > T1 > T0, and an interlock
exists.
It is apparent from even this simple example that an
algorithm using the wait matrix would be far simpler to
implement and use than would one based on the precedence matrix. However, since the information which
can be provided with the wait matrix (i.e., whether or
not an interlock exists) is only a subset of the information available by using the precedence matrix, this discussion will concentrate on the latter.
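The row scan just described amounts to following wait-for successors until the starting task reappears. A minimal sketch (assuming Python, not the paper's implementation; W[i][j] = 1 means Ti must wait for Tj):

```python
def interlocked(W, start):
    """Return True if following wait-for links from task `start`
    leads back to `start` (an interlock)."""
    n = len(W)
    seen = set()
    frontier = {j for j in range(n) if W[start][j]}
    while frontier:
        if start in frontier:
            return True
        seen |= frontier
        frontier = {k for j in frontier for k in range(n) if W[j][k]} - seen
    return False

# Wait matrix of Figure 11: T0 > T3, T1 > T0, T3 > T1.
W = [[0, 0, 0, 1],
     [1, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 1, 0, 0]]
assert interlocked(W, 0)        # T0 > T3 > T1 > T0
assert not interlocked(W, 2)    # T2 waits for no one
```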


The detection algorithm

FIGURE 9-Resource queues after T0's shared request for R4 ("Δ" marks the new entry; task, mode, rank on each queue):
  R0: T0(E, 0), T1(S, 1)
  R1: T2(E, 0)
  R2: T2(E, 0)
  R3: T1(S, 0), T2(S, 0), T3(E, 1)
  R4: T3(E, 0), T0(S, 1)Δ
  R5: T1(S, 0), T3(E, 1)
The algorithm to trace through the chains of wait-relations in the precedence matrix may be regarded as
making a series of horizontal and vertical movements
through the elements of the matrix. If a valid series of
movements leads back to the column from which the
series began, then an interlock has been detected.
The rules governing movements through the matrix
are as follows:
1) Searching for positions to which to move is done

down columns from the top and from left to
right along rows.
2) Movement alternates in direction. A position
occupied via a horizontal move must be vacated
by a vertical move and vice versa.

FIGURE 10-Precedence matrix for Case 2, with arrows (a)-(e) tracing the wait-chain (rows R0-R5, columns T0-T3; blank = null):

       T0  T1  T2  T3
  R0    0   1
  R1            0
  R2            0
  R3        0   0   1
  R4    1Δ          0
  R5        0       1

FIGURE 11-Wait matrix for Case 2 (a 1 in row i, column j means Ti must wait for Tj):

       T0  T1  T2  T3
  T0    0   0   0   1
  T1    1   0   0   0
  T2    0   0   0   0
  T3    0   1   0   0

3) Movement along a row (within the queue for a
resource) can be made only to a position having
a (non-null) place-value (rank) less than that
of the position the algorithm currently occupies.
This locates a task in the queue ahead of the
task whose column is now being occupied.
4) Movement down a column through the list of resources contended for by a task can be made to
any position having a non-zero (and non-null)
place-value. This locates resources for which
the task whose column is occupied must wait.
5) If, during a horizontal move, the column corresponding to the original task is reentered, an interlock has been found.
6) A sequence of moves stops when an interlock is
found, or when no position is open in the specified
direction.
7) A new sequence is started at those points from
which multiple paths lead.
To insure that all moves are taken before vacating the
spot currently held, all positions which may be occupied
next are selected and their coordinates saved for later
use. In the list with the coordinates of these "next"
positions is placed the direction in which movement will
be made.
When all the "next" moves from a position have been
listed, the position is vacated and a new one taken from
the list of "next" positions. The job is complete when
the "next" position list is empty.
Describe the precedence matrix as a two-dimensional
array indexed by the variables R and T, with each element of the matrix consisting of three entries:
1) P, the rank of task T on the queue for resource R;
2) H, the row flag, turned on to show that this element has been accessed horizontally;
3) V, the column flag, turned on to show that this
element has been accessed vertically.

Define the "NEXT" list as a last-in, first-out stack,
indexed by the variable i, having the entries:
1) RNEXT, to save the R-coordinate of selected
elements of the precedence matrix;
2) TNEXT, to save the T-coordinate of selected
elements of the precedence matrix;
3) NEXTMOVE, containing "H" or "V", depending on whether the direction of movement from this
entry is to be horizontal or vertical when its coordinates are selected from the NEXT list.
With all indexes, assume a value of 0 to reference the
first element.
It is assumed that, prior to entering the analysis
routine, queue management has:

1) Checked that analysis is really needed, in that:
a) At least one resource was already controlled by
the task; and,
b) At least one of the newly requested resources
has been assigned a rank higher than zero;
2) Stored the number of tasks in TMAX;
3) Stored the number of resources in RMAX;
4) Constructed the matrix;
5) Stored the column number corresponding to the
requesting task in TINIT (0 ≤ TINIT ≤ TMAX - 1);
6) Stored the row number corresponding to any one
of the resources already controlled by the requesting task in RINIT (0 ≤ RINIT ≤ RMAX - 1).
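The movement rules and the NEXT list can be sketched as follows. This is a Python approximation, not the paper's implementation: the matrix uses None for null elements, the V flag is kept as a set of scanned columns, and the H flag is left implicit in the LIFO discipline. The matrix P13 corresponds to the Figure 13 state worked through below.

```python
# Sketch (assumption: Python; illustrative names) of the detection
# algorithm.  P[r][t] is the rank of task t on resource r's queue, or
# None (null).  The "next" list is a LIFO stack of (row, column,
# direction) triples, as in the paper's RNEXT/TNEXT/NEXTMOVE stack.
def find_interlock(P, tinit):
    """Return the rows (resources) whose wait-chains lead back to
    column tinit; an empty list means the request may be honored."""
    nrows, ncols = len(P), len(P[0])
    msg, nxt, v_done = [], [], set()

    def scan_col(t):                   # vertical scan: resources t waits for
        v_done.add(t)
        for r in range(nrows):
            if P[r][t] not in (None, 0):
                nxt.append((r, t, "H"))

    scan_col(tinit)                    # start in the requester's column
    while nxt:
        r, t, move = nxt.pop()         # LIFO, like the "next" list
        if move == "H":                # horizontal scan: tasks ahead on r
            for t2 in range(ncols):
                rank = P[r][t2]
                if rank is not None and t2 != t and rank < P[r][t]:
                    if t2 == tinit:
                        msg.append(r)  # chain re-entered tinit: interlock
                    else:
                        nxt.append((r, t2, "V"))
        elif t not in v_done:          # V flag already set: discard entry
            scan_col(t)
    return msg

# Figure 13 (rows R0-R5, columns T0-T3), after T2's request for R4:
P13 = [[0,    1,    None, None],
       [None, None, 0,    None],
       [1,    None, 0,    None],
       [None, 0,    0,    1],
       [None, None, 1,    0],
       [None, 0,    None, 1]]
assert find_interlock(P13, 2) == [2, 3]   # R2 and R3 must be released
```

Run on the Figure 13 matrix, the sketch visits the same positions as the trace in Figure 14 and reports R2 and R3, matching steps 20, 23, and 27 below.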
FIGURE 12-Resource queues after T2's shared request for R4 ("Δ" marks the new entry; task, mode, rank on each queue):
  R0: T0(E, 0), T1(S, 1)
  R1: T2(E, 0)
  R2: T2(E, 0), T0(S, 1)
  R3: T1(S, 0), T2(S, 0), T3(E, 1)
  R4: T3(E, 0), T2(S, 1)Δ
  R5: T1(S, 0), T3(E, 1)
Example

Suppose the request in Case 1 of Section VII has been
made (and validated) and the state of the system is
as shown in Figures 7 and 8.
Assume that T2 now requests shared control of R4.
The new queues and precedence matrix are described in
Figures 12 and 13.

FIGURE 13-Precedence matrix after T2's request ("Δ" marks the new entry; rows R0-R5, columns T0-T3; blank = null):

       T0  T1  T2  T3
  R0    0   1
  R1            0
  R2    1       0
  R3        0   0   1
  R4            1Δ  0
  R5        0       1
In this case it is obvious that an interlock exists,
because no task remains in a run condition. However, as
has been shown in previous examples, an interlock can
exist even though another task is running.
Since T2 must wait for a resource and already controls
other resources, analysis is needed.

FIGURE 14-Maintenance of the "next" list (parenthesized numbers are the steps, keyed to the text, at which each entry was added and removed):

  WHEN ADDED   RNEXT   TNEXT   NEXTMOVE   WHEN REMOVED
     (2)         4       2        H          (4)
     (5)         4       3        V          (7)
     (9)         3       3        H          (21)
     (10)        5       3        H          (11)
     (12)        5       1        V          (13)
     (14)        0       1        H          (15)
     (16)        0       0        V          (17)
     (18)        2       0        H          (19)
     (22)        3       1        V          (24)
Now queue management sets TINIT = 2, TMAX =
4, and RMAX = 6. Figure 14 traces the maintenance of
the "next" list. The numbers in Figure 14 are keyed to
the following text, which outlines the algorithm as it
processes the matrix (Figure 13):
1) The column under T2 is scanned.
2) An open position is found in the row with R4.
3) No further entries in column 2.
4) Latest entry taken from "next" list.
5) Row 4 scanned, with an entry found in column 3.
6) End of row 4.
7) Latest entry taken from "next" list.
8) Column 3 scanned.
9) Entry found in row 3.
10) Entry found in row 5, then end of row.
11) Latest entry taken from "next" list.
12) Row 5 scanned, with the only entry in column 1.
13) Latest entry taken from "next" list.
14) Column 1 scanned, with the only eligible position in
row 0.
15) Latest entry taken from "next" list.
16) Row 0 scanned, with an eligible position in
column 0.
17) Latest entry taken from "next" list.
18) Column 0 scanned, with an eligible position
found in row 2.
19) Latest entry taken from "next" list.
20) Row 2 scanned, with the only open position
found in column 2, corresponding to TINIT. The
routine enters "2" in the message list for queue
management.


21) Latest entry taken from "next" list. (In this case,
the only one left in the list is the entry placed
during step 9.)
22) Row 3 is scanned, and the first valid next position found in column 1. This is entered in the
"next" list.
23) Continuing the scan of row 3, another valid
next position is found in column 2. Since this
corresponds to the requesting task, the row number (3, for R3) is placed in the message list for
queue management.
24) The next entry is taken from the "next" list.
25) This entry indicates that the position in row 3,
column 1 is to be occupied and a vertical scan
made. However, the vertical flag in this position
was set during the process implied in step 14.
Therefore the entry is discarded.
26) The "next" list is empty and the process is complete.
27) R2 and R3 ("2" and "3" in the message list) must
both be released before the request for R4 can be
honored.

Representation as a directed graph
Alternatively, interlock detection may be viewed as
isolating loops (cycles) in a directed graph. This is so because a system of tasks and resources, inter-related
through queues, may be represented as a directed graph.
Let the tasks in such a system be represented by the
vertices (or nodes) of a directed graph. Let an arc be
drawn from node Ti to node Tj, oriented toward Tj, if
and only if Ti must wait in the queue behind Tj for resource rk. Associate with the arc TiTj the symbol rk to
complete the representation.
The total set of these arcs, connected into a minimal
network, illustrates the inter-relationships of all the
tasks in the system with respect to their competition for
available resources.
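This graph view can be sketched directly. The following Python fragment is illustrative only (function and variable names are assumptions): arcs carry the contested resource as a label, and a depth-first search with the usual three-color marking detects a cycle.

```python
# Hypothetical sketch: the wait-for graph as labelled arcs, with a DFS
# cycle check.  An arc (i, j, r) means Ti waits behind Tj for resource r.
def has_cycle(arcs, n):
    adj = {i: [] for i in range(n)}
    for i, j, _r in arcs:
        adj[i].append(j)
    WHITE, GREY, BLACK = 0, 1, 2       # unvisited / on the stack / finished
    color = [WHITE] * n

    def dfs(u):
        color[u] = GREY
        for v in adj[u]:
            if color[v] == GREY or (color[v] == WHITE and dfs(v)):
                return True            # back edge: a cycle (interlock)
        color[u] = BLACK
        return False

    return any(color[u] == WHITE and dfs(u) for u in range(n))

# Base state of the example: T1 waits for T0 (R0); T3 waits for T1
# (R3, R5) and for T2 (R3).  No cycle: T0 and T2 are sinks and may run.
base = [(1, 0, 0), (3, 1, 3), (3, 1, 5), (3, 2, 3)]
assert not has_cycle(base, 4)

# Add T0 waiting for T3 on R4 and the chain T0 > T3 > T1 > T0 closes.
assert has_cycle(base + [(0, 3, 4)], 4)
```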
Using the illustration in an earlier section, first construct the set of individual arcs for each "wait pair,"
e.g., an arc from T1 to T0 labelled R0 (T1 waits for T0 to release R0), and similarly for the remaining pairs.
Construct the minimal graph, retaining duplicate
arcs between the same pair of nodes for completeness.

Note that the tasks which are still able to process are
the sinks of the graph. If T0 releases R5, the graph becomes:

If T0 were now to request R2, the graph is:

Even though T2 is now the only task able to process,
no interlock exists. When T2 completes, T0 may run.
When T0 completes, T1 may run, and then T3.
Suppose, however, that instead of R2, T0 requested
R4 (S or E). The graph would then be:

T2 is still free to process. However, an interlock
exists: for when T2 completes, no other task may run.
Each is waiting for a resource held by another of them.
There are several techniques from graph theory for
the detection of cycles in a directed graph,1 some
oriented toward matrix representation. In view of the
relatively small number of elements in this application,
adaptations of these might be highly efficient.
Also of potential usefulness from Graph Theory is the
idea of the length of the longest path between any pair
of vertices. This could be used as a rough estimate of the
length of time a task would have to wait for a resource.
The task could decide whether to use an alternate, even temporary, resource, if appropriate.
CONCLUSIONS
This paper has presented a queue management technique which detects interlocking requests for system resources. Two variations of matrix manipulation were
shown, offering a choice between completeness and compactness.
Either approach offers more flexibility and security
than traditional resource control. Conventionally, the
possibility of system interlock constrains a task to:
1) Release all currently-controlled resources prior to
requesting others;
2) Request multiple resources in parallel; and/or
3) Use resources in a strictly-defined order.
The wait matrix, with elements consisting of a single
binary digit, requires very little storage. It yields only a
yes/no answer as to whether an interlock exists, but it
provides the information with a minimum of programming. When a task's request for a resource is denied, it
has to release all those it already controls and then re-request control of them in parallel with the new resource.
The precedence matrix is bulky, especially in complex
systems with many resources defined. However, the algorithm based upon it can isolate the minimal set of resources which must be released by the task.
In addition, the algorithm is easily modified to reverse the direction of search through the wait-chains.
This permits it to identify the minimal set of resources for which the request for control must be denied.
Topics for further investigation are suggested by
showing that the set of tasks in a system is a partially
ordered set under the relation "wait-for"; and that the
system of tasks and their requests for resources may be
represented as a directed graph.
REFERENCES
1 C C YUNG
The connectedness of directed graphs and applications
PhD Thesis Columbia University New York 1966 p 71

A dual processor checkout system
by KENNETH C. SMITH
Martin Marietta Corporation
Denver, Colorado

"The brain 'Of the higher animals, including
man, is a dQuble 'Organ, cQnsisting 'Of right and
left hemispheres cQnnected by an isthmus 'Of nerve
tissue. It IQQks as thQugh in animals with an intact
calQsum (nerve tissue) a CQPy 'Of the visual wQrld
as seen in 'One hemisphere is sent 'Over tQ the 'Other,
with the result that bQth hemispheres can learn
tQgether a discriminatiQn presented tQ just 'One
hemisphere. When this CQnnectQr between the
tWQ halves 'Of the cerebrum was cut, each hemisphere functiQned independently as if it was a
cQmplete brain. . . ."
"KnQwing that the answer was wrQng, the right
hemisphere precipitated a frQwn and a shake 'Of
the head, which in turn cued in the left hemisphere
tQ the fact that the answer was wrQng and that
it had better CQrrect itself! ... Taken tQgether,
'Our studies seem tQ demQnstrate cQnclusively that
in a split-brain situatiQn we are really dealing,
with tWQ brains, each separately capable 'Of mental
functiQns 'Of a high 'Order...." 1
As human sciences probe deeper into the physical and psychological makeup of man, and as
other sciences extend man's technical creations,
there frequently seems to exist a parallel between
man himself and his inventions. Not only do
these parallels occur often, but they frequently
occur to an extent much more gross than could
have been postulated. Such is the case with a
"Dual Processor Checkout System." Although
seemingly a new technical creation, it is closely
parallel to its creators in its decision and controlling mechanisms.
Control and decision making is achieved by two
Sigma 7 computers which act independently in
monitoring test article data but coordinate efforts
to diagnose data and transmit commands. Functions performed by only half the brain and results of tests are shared such that each computer
may learn from the other. Now we may proceed to
a more technical description of the checkout system.

General discussion
Two Scientific Data Systems Sigma 7 processors and five 16K-word, 32-bit memory banks were
chosen for the basic computer components. Connected directly into the memory, with control from
direct I/O lines, are two PCM links, analog channels, and two identical discrete data links. Command links to the test article are provided redundantly. This configuration provides a high
probability of completing a test should a checkout
equipment malfunction occur, and a high probability of test article error detection to guarantee
human safety.
The software consists of three main elements:
the Executive Subsystem, the Data Monitoring Subsystem, and the Command/Control Subsystem. The Data
Monitoring Subsystem processes the PCM data links,
the discrete data links, and data-associated interrupts, and
notifies the remainder of the subsystems of anomalies which may have occurred in the expected
data flow. Specially built hardware helps to filter
a raw discrete data rate of 1 MHz and a PCM
data rate of 384 kHz per link. All data is monitored at all times to provide a complete test article profile.

FIGURE 1-Checkout system online software

The Command/Control Subsystem interprets
output from a user-oriented test language translator (the subject of another paper for this conference) which provides the test engineer with full
control over vehicle testing in a language he may
easily use. Tests may be written which send commands to the test article and set criteria in the
Data Monitoring Subsystem as to what results are
expected. The Command/Control Subsystem then
acts on the recovered data as to how to proceed
further.
Executive software fundamentally consists of a
modified version of the SDS Basic Control Monitor,
herein described as BCM. Each CPU is normally
administered by BCM, which handles all the checkout system processing functions. Processor work
loads are divided into modular software elements
called tasks; scheduling these tasks, core allocation, I/O processing, trap processing, self tests,
etc., become service functions of BCM. The Executive Subsystem provides these services and overall
control of processing to the other subsystems.
Reliability and integrity are fundamental to the
design philosophy and cannot be compromised.
All data coming into the checkout system must be
processed by both CPU's, as are commands transmitted to the test article. This minimizes any possible data error or incongruity between CPU's, to
the extent that a disagreement between CPU's will
stem from a discernible hardware malfunction.
The important points are, first, to detect an error
at the earliest possible opportunity and, second, to
take proper recovery action for continuation of the
test with one or two CPU's on line.
A self-test program is the basic element of integrity which guarantees the success of the system;
it exists as the lowest priority task in the scheduler. This means that whenever a CPU is idling
it is in fact testing itself and its associated memory
for operability. Several types of tests are conducted on the computer processor and memory
banks: instruction tests are executed on the CPU,
read/write tests are performed on the memory
modules, and periodic sum-checks are made on software programs. Self test is also used in processing
Sigma 7 traps, and in this way a CPU may determine which particular part of the hardware has
caused a failure.
Responsibility rests with each CPU to cross-examine the validity flags which, if they are set,
indicate a critical failure in the CPU whose flags
are being tested. The validity flag is set whenever a possible malfunction may have occurred
and is reset when the proper recovery has been
executed. The weight of making a command decision rests upon the good CPU, which must learn
the major computer component failure of the other
computer system. This may result in new processing functions or a small modification of the test
presently being conducted.
The functions which will be described consist
of that functional hierarchy which manages these
learning and controlling processes between CPU's.
With the internal computer programs and hardware being tested as they are, all that remains to
maintain integrity is to control the data inputs
and outputs. The methods of doing this will be
described in detail later. Such is the spectrum of
programs which guarantee system integrity.
Figure 2 depicts a generalized control and error
flow for the Executive Subsystem and portrays
the hierarchy of the Executive Programs. CPU/
CPU communication is processed through the Dual
Processing Controller, with the exception of testing the validity flag. This and CPU/CPU synchronization tests are conducted on a periodic
basis so that an inter-CPU link will not be required to detect a faulty processor. The Checkout
System receives a BCD range time which is required for data synchronization purposes. A
binary doubleword image of this time is maintained in each CPU for basic clocking purposes.
These three sources must be in agreement for the
Checkout System to be fully operational, and any
disagreement is indicative of a computer or data
error. In this case a computer may be taken off
line or a related checkout system function discontinued.

FIGURE 2-Checkout system executive subsystem (control and error flow among the Task Delegation Executive, Dual Processor Executive, Dual Processing Controller, CPU Reset, BCM, the tasks, and self tests)
Errors resulting from any data disagreements
are passed to the Dual Processor Executive, which
diagnoses inconsistencies on a detailed level.
Whenever the computer systems disagree upon a
data input, a program failure, or some hardware
failure, exhaustive testing must be initiated to uncover the error. Because this is difficult, it is imperative that programs compare data wherever
the eventuality may arise that different functions
may be performed as a result of the data change.
Should a discrete error be detected in one Data
Monitoring Subsystem but not in the other, that
inconsistency will be reported immediately
to the Dual Processor Executive; this indicates a
hardware malfunction if the criteria for the data
are pre-checked. It may be parenthetically added
that this particular error seldom arises, since all
critical discretes (those which are crucial to the
test article's successful performance) are triply
voted in hardware before being evaluated by the
software, so that a single discrete failure
will not be detected as an error, only as a dissenting
vote.
The Dual Processor Executive isolates the malfunctions in the Checkout System and determines
the proper posture to continue testing. Any decisive errors, hardware or software, from all executive subsystem programs are sent to the Task
Delegation Executive, which reconfigures the software to be compatible with the hardware failures.
If a PCM link failed, then all programs pertaining
to that link would be disarmed, a message would
be printed, and processing would continue. CPU
Reset is entered if the Task Delegation Executive
has determined that the functions assigned to this
Executive Subsystem cannot be completed. This
obviously assumes that the computer memory and
processor are operational to a large degree, so that
it may turn itself off. Memory or CPU failures
would be observed in the verification flags, which
help to alleviate this problem.
Executive programs reflect the Checkout System redundancy philosophy. Stated simply, all
critical command/control loops are triply redundant. Whenever a command is to be sent out to the
test article, that command is first voted by both
CPU's; the command is transmitted only if both
CPU's agree. The command is sent down three
separate transmission links to the pad equipment
and ultimately to the test article. Critical data are
monitored in an analogous manner. Commands
will generally result in a discrete event change
through the Data Monitoring Subsystem, where
the discrete events are again voted by the CPU's.
Should the CPU's disagree upon an anomaly,
control is passed to the
Dual Processor Executive, where such inconsistencies are processed. Each CPU is self-reliant to
the extent that it can fully complete a test independently and run asynchronously with periodic
data and timing checks. This helps to minimize
the data crossover, simplifies analysis in case of
an error, and ultimately reduces the complexity
of the software programs.
Thus far, little uniqueness has been suggested
for software implementation in the checkout system. Functions such as dynamic allocation of memory, program scheduling (with a hardware priority interrupt algorithm), high-speed mass storage
buffer techniques, and on-line data analysis and display have seldom been seen in one real-time system but are generally known and understood. It
is not the intent of this discussion to cover the
total Checkout System Executive uniqueness, but
it must be mentioned (even if trite) that the
Sigma 7 dictated some unique implementation by
virtue of its real-time characteristics. Now we
may proceed to a more specific treatment of the
dual system.

Task delegation executive
Ultimate hierarchal control resides with the
Task Delegation Executive. All noteworthy information of system status is available here, and
all decisions to be based on failures will be made
here. A failure may result in two general actions:
(1) elimination of that particular function from
the processing chain in that CPU; (2) elimination of a CPU. Each action that is to be taken is
fully dependent on the present system state. Fortunately, however, there are only two basic states:
Redundant Mode, with both CPU's executing the
same test to provide reliability; and Test Mode, with
each CPU operating independently. This minimizes the number of switch positions.
Switching is accomplished on three program
levels. At the first level is the arming and disarming of tasks (scheduled entries) and interrupts; both these areas contribute to the change
of data flow (input and output). At the second
level is a gross trap vector change affecting all
Executive Services such as I/O, inter-CPU communications, etc. The third level is more refined
and lies within the service trap processors, as a
trap may branch to one of several programs for
processing. The Task Delegation Executive must
weigh each of the failures and turn the switches
to the proper position. Notification of the other
CPU and operator/test conductor notification are
other functions completed.

Dual processing controller: Transmit/receive logic

Communication between the CPU's is accomplished using interrupts and common memory. Implementation incorporates both a simplex and a duplex interrupt structure. The former technique
is used where the ultimate action will result in a processor being taken off-line; this is a high-priority
communique for highly significant messages. The
remainder of communication is implemented
through a duplex algorithm where an acknowledgement is always required to complete the loop.
In both cases messages are placed in mailboxes
which are scrutinized when inter-processor interrupts occur. For the duplex system, timers are
set up upon initiation of a transmission which determine the greatest allowable time that may
elapse before receiving an acknowledgement.
The inter-processor time delays, as set by the
timers, are dependent only upon interrupt priorities and their processing. These times are not
difficult to determine, and are not dependent upon
total processor loads. A second set of timers is
used where necessary to guarantee that queuing
functions are ultimately processed; a failure to
do so implies a serious system non-agreement.
Later examples will demonstrate the technique
where these timers are used. A disadvantage of a
duplexed system is that it implies a loss of speed
and lost processor time due to bookkeeping requirements. This lost time is felt to be less important than the integrity gained, since the number
of command transmissions to the vehicle is not
great over a one-second interval. There are also
frequent occasions where the CPU's must be synchronized, which implies CPU time lost by forcing one CPU to wait for another. The result of
the technique is a synchronization of events rather
than of a timing window. A communication technique of this type allows each CPU to run asynchronously most of the time, which is important
for this system since the loads on the processors
are not identical. Areas of non-redundancy such
as CRT displays, PCM evaluation and recording,
and analog data collection can cause processing
imbalances.
Redundant functions, such as critical anomalies
and commands, are always agreed upon by the
CPU's so that any subsequent action will be the
same in both CPU's. This must therefore be done
in a synchronous manner.

Dual processing controller: Inter-processor services

Three basic types of services are provided:
memory check, hardware register check, and much
of the intra-CPU service linkage. These services
may be put together into several combinations to
process any particular circumstance which may
arise. All these services are requested through a
trap processor that a task may initiate, this being the only manner in which a slave program
(one that does not have full access to the instruction repertoire) may enter the master mode and
subsequently do I/O, trigger interrupts, etc. The
tool used to provide this linkage has been discussed
above in general and will now be given a more
specific treatment.
As has been previously mentioned, the CPU's
run asynchronously, and for this reason queuing
techniques are used so as not to inhibit the
processors operating upon other tasks. To compare data between processors, then, some memory
allocation scheme must be implemented so that
data may be passed from CPU to CPU. Inherent in
the design is a large segment of memory which has
been designated for dynamic allocation, herein
termed "Free Storage." A free storage block
(FSB) may be assigned to a program for an indefinite length of time.
Inter-CPU communication is separated into
three priorities. These priorities are assigned to
a service function and dictate some of the implementation. At the highest priority level are the
intra-CPU services to be allowed between CPU's.
Expected processing delay and acknowledgement
times are short, and no synchronization is required;
therefore, no queuing is used.
The second transmit/receive pair is assigned to
command functions. Execution delay time due to
interrupting functions is again expected to be nonexistent, but because of a synchronous processing
characteristic some queuing is required. Data
cross-checking is executed at the lowest level of
priority, with normally expected delay times in
the milliseconds region; therefore extensive queuing is required.
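The duplex mailbox-and-acknowledgement discipline can be sketched with threads standing in for the two CPU's and their interrupts. Everything here is illustrative (Python, invented names and timeout value), not the Sigma 7 software: a message is placed in the peer's mailbox, a watchdog timer bounds the wait for the acknowledgement, and an expired timer signals a non-agreement.

```python
import queue
import threading

ACK_TIMEOUT = 0.5        # watchdog: longest allowable wait for an ACK

class DuplexLink:
    """Illustrative duplex inter-CPU discipline: every message placed
    in the peer's mailbox must be acknowledged before the watchdog
    expires."""
    def __init__(self):
        self.mailbox = queue.Queue()     # common-memory "mailbox"
        self.acks = queue.Queue()

    def transmit(self, msg):
        self.mailbox.put(msg)            # set message, "send interrupt"
        try:                             # watchdog timer on the ACK
            return self.acks.get(timeout=ACK_TIMEOUT)
        except queue.Empty:
            raise RuntimeError("no acknowledgement: serious non-agreement")

    def receive_loop(self, handler):
        while True:
            msg = self.mailbox.get()     # "interrupt": scrutinize mailbox
            if msg is None:
                break
            handler(msg)
            self.acks.put(("ACK", msg))  # complete the duplex loop

link = DuplexLink()
seen = []
cpu2 = threading.Thread(target=link.receive_loop, args=(seen.append,))
cpu2.start()
assert link.transmit("display message") == ("ACK", "display message")
link.mailbox.put(None)                   # shut the receiver down
cpu2.join()
assert seen == ["display message"]
```

With no receiver running, the watchdog in transmit() expires and the non-agreement is raised, which is the condition the text says must be escalated.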
EXAMPLE 1
Most inter-CPU services do not require a FSB;
in fact, this is the rule rather than the exception. Traps which provide scheduler services between CPU's are of this nature, and no additional linkage is needed beyond that already provided within one CPU. No queuing is necessary
because the message may be acted on immediately
requiring no synchronization at all between
CPU's. A task such as PCM processing may be
active in CPU 2 and desire to communicate an
anomaly to CPU 1. The task would form all the
normal setup required to activate a task within
CPU 2 but would execute an inter-CPU trap instead. The scheduler in CPU 1 will in turn activate the requested task depending on its level in
the task priority structure. Any needed address modification is done in the trap processor. This is a serious consideration, since the memory banks are numbered oppositely from each CPU, with bank 3 being addressed the same from both CPU's. A
buffer area relative to CPU 2 will be displaced by
a constant (for that location) so that CPU 1 may
address the correct buffer.
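The constant-displacement scheme can be illustrated with a small translation function. The bank count and bank size below are assumptions (not stated in the paper), chosen so that the banks are numbered oppositely from the two CPUs and a middle bank maps to itself, as the text describes for bank 3:

```python
# Sketch of inter-CPU address modification. Assumption: seven memory banks
# of 512 words, numbered 0..6 by one CPU and 6..0 by the other, so bank k
# as seen by CPU 2 is bank 6-k as seen by CPU 1, and bank 3 is addressed
# identically from both sides.

BANK_WORDS = 512  # assumed bank size

def translate(addr):
    """Displace an address by a per-bank constant for the other CPU."""
    bank, offset = divmod(addr, BANK_WORDS)
    return (6 - bank) * BANK_WORDS + offset

addr_cpu2 = 2 * BANK_WORDS + 17    # a buffer in "bank 2" as CPU 2 sees it
addr_cpu1 = translate(addr_cpu2)   # the same cell as CPU 1 must address it
```

Applying the translation twice returns the original address, which is what lets either CPU perform the modification in its trap processor.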
FIGURE 3-Alternate service processor

Figure 3 depicts typical inter-CPU scheduler
service processing. A task being processed by
CPU 1 desires to transmit a display message in
CPU 2 and therefore initiates a trap to perform
that function. The message is deciphered by the
Dual Processing Controller. A special code is
placed in CPU 2's mailbox which will direct it to
perform the desired function. Argument addresses are modified for CPU 2 usage. Then, an interrupt is generated for CPU 2 to look in a specific

mailbox location. Before CPU 1 exits it must set
a timer to be certain that CPU 2 has received the
message. Control is returned back to the requesting task to continue processing.
Now CPU 2 comes into operation through the
interrupt, immediately performs the desired function, sets the acknowledge code in the mailbox,
and triggers an acknowledge interrupt to CPU 1.

FIGURE 4-CPU/CPU memory crosscheck

When the interrupt routine is exited, the scheduler will consider the new request and service it
at the proper priority level. A CPU 1 interrupt
routine next comes into activity, turns off the
timer, and exits back to the interrupted program.
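The Example 1 handshake can be sketched in a few lines of Python (the class and method names, the message format, and the synchronous method calls standing in for hardware interrupts are all illustrative assumptions):

```python
# Sketch of the Example 1 handshake: CPU 1 places a service code in CPU 2's
# mailbox, raises an interrupt, and arms a watchdog timer; CPU 2's interrupt
# routine performs the function and acknowledges, which disarms the timer.

class Cpu:
    def __init__(self, name):
        self.name = name
        self.mailbox = None
        self.watchdog_armed = False
        self.log = []                    # record of services performed

    def request_service(self, other, code):
        other.mailbox = code             # set message in the other mailbox
        self.watchdog_armed = True       # set watchdog timer
        other.take_interrupt(self)       # send interrupt, then resume task

    def take_interrupt(self, sender):
        code = self.mailbox              # fetch code
        self.mailbox = None
        self.log.append(code)            # "perform" the requested function
        sender.acknowledge()             # acknowledge interrupt back

    def acknowledge(self):
        self.watchdog_armed = False      # turn off the timer

cpu1, cpu2 = Cpu("CPU 1"), Cpu("CPU 2")
cpu1.request_service(cpu2, "display message")
```

If the acknowledge never arrived, the still-armed watchdog would eventually fire, which is how a lost message is detected.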
EXAMPLE 2
The case above is the more frequent of the
types of inter-CPU services but far from the most
interesting. Data compares demonstrate a greater
level of complexity and require a FSB to be accomplished. Following Figure 4 will assist in
comprehending the design implementation of the
inter-CPU data compare service. Since the work
scheduled for the processors is asynchronous, we
may assume that the scheduler activates a task in
CPU 1 which requires a check of data in memory
(the synchronous case of both CPU's executing
the same task at the same time cannot be ignored
in implementation, but is done so here for simplicity). The proper trap is set and control is
given to the Dual Processing Controller which
first suspends the task. By suspending execution
of this task, it may be observed that the mailbox
entry to be made is unique as to the task which
requested service. All queued entries are keyed
to a task; since the task has been temporarily ignored by
the scheduler, no new service can be requested
from that task. The memory address of the FSB
is modified to be read by CPU 2 and the message
is set in the mailbox for CPU 2. CPU 2 is notified
of the pending message by an interrupt from
CPU 1 and then CPU 1 returns to normal execution with the service requesting task suspended.
CPU 2 receives the interrupt, looks into the
mailbox, and decides to queue this message for
later reference. An interrupt is returned to the
initiating CPU to complete one duplex loop. Again,
CPU 1 is initiated and cleans up any bookwork
that may be pending, such as turning off one timer.
In this type of service an additional clock is required to assure that the request to compare data
will not be ignored indefinitely. This is precisely
the type of error the checkout system is designed
to process, and timer runout could be indicative of
a major malfunction. More will be said on this
area later in the discussion of the Dual Processing
Executive. It is observed that except for the initial memory compare request all processing proceeds asynchronously under interrupt control.
The second half of the sequence may now take
place. CPU 2 has the same task activated which
also wishes to compare data. The queue is
searched, an entry found, and the data cells are
compared. The results of this compare are sent
back to CPU 1 which continues execution of the
suspended task if the results agree. If they do
not, an error mode results which sends control to
the Dual Processing Executive to diagnose the
problem. Acknowledgement is sent back to CPU
2 from CPU 1 to complete the duplex loop. Control is returned to the task if it is still the highest
active task in the schedule and if the data agree.
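The crosscheck exchange can be sketched as follows (a toy model: the data representation, the function names, and the dictionary-based queue are illustrative assumptions, though the queue keyed by task follows the text):

```python
# Sketch of the Example 2 data crosscheck: the first CPU to request a
# compare suspends its task and queues the request, keyed by task; when the
# same task on the other CPU arrives, the queue is searched, the cells are
# compared, and the suspended task is resumed on agreement or the Dual
# Processing Executive is entered on disagreement.

queue = {}          # task name -> data value from the first CPU to arrive
suspended = set()   # tasks waiting on the partner CPU
errors = []         # disagreements handed to the Dual Processing Executive

def crosscheck(task, value):
    """Called by either CPU when `task` wants its data cell compared."""
    if task not in queue:
        queue[task] = value
        suspended.add(task)          # suspend: wait for the other CPU
        return None
    other = queue.pop(task)          # search queue, entry found
    suspended.discard(task)          # continue task execution
    if other == value:
        return "agree"
    errors.append(task)              # error mode: diagnose the problem
    return "disagree"

crosscheck("telemetry", 42)              # CPU 1 arrives first: queued
result = crosscheck("telemetry", 42)     # CPU 2 arrives: cells compared
```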
EXAMPLE 3
Implicit in describing the CPU/CPU service
functions is a method to compare commands that
are to be sent to the vehicle. Although simple in
concept, the actual implementation is somewhat
more formidable than the example previously described for data cross-checking. Three distinct
steps are required to transmit a command and
these commands are intimately connected to the
redundant philosophy that both processors participate in transmitting commands. This condition may be overridden, but only if one processor
is on line; otherwise agreement is mandatory in
the command/control loop. This is not to say that
a program may not be generated to test the test
article up to checkout system processing limits,
such as generating a frequency of 10 kHz or
better to determine phase shift in a vehicle subsystem. In this situation no inter-CPU communication is required at the command level.
Implementation has been in such a way that a
minimal amount of inter-CPU communication is
required for synchronization and is somewhat
similar to that demonstrated in the previous example. Following Figure 5, a task which is to
send a command causes a trap to the special I/O
handler routine. The Task Delegation Executive
has previously set this trap processor to respond
for a Redundant Mode. In this case the task
requesting service does not know how many
CPU's are on line or which CPU is executing the
program. The proper processing path is automatically taken. An interrupt is sent to CPU 2
which queues the message and transmits the acknowledge back. At a later time CPU 2 has the
same task activated as in CPU 1 which determines
to send the same command. CPU 2 observes CPU
1 is already waiting and therefore proceeds with


command execution by loading the proper hardware command registers. A message is placed in
the mailbox for CPU 1 to check and send the command to the test article, which CPU 1 does. The
task which was suspended is continued and a
transmission complete or incomplete is sent to
CPU 2. If the command is sent, then control is
returned back to the task when the interrupt
routine finally releases control. Control is sent to
the Dual Processor Executive should this not be
the case, if the command should be bad or the
CPU's do not agree on register content. CPU 2
continues the suspended task when it gets an interrupt which indicates the transmission has been
sent. Control is then returned to the task which
was suspended.
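The agreement rule at the heart of this command path can be sketched as follows (a toy model: the register image is reduced to a plain value, and all names are illustrative assumptions):

```python
# Sketch of the Example 3 redundant command path: both processors must
# agree on the hardware command register contents before a command goes to
# the test article. The first CPU to decide queues its command; when the
# second arrives, the register contents are checked and, on agreement, the
# command is sent.

pending = {}     # command id -> register image from the first CPU to decide
sent = []        # commands actually transmitted to the test article
faults = []      # disagreements handed to the Dual Processor Executive

def issue_command(cmd_id, registers):
    if cmd_id not in pending:
        pending[cmd_id] = registers      # suspend task, queue the message
        return "waiting"
    expected = pending.pop(cmd_id)
    if expected == registers:            # check hardware register content
        sent.append(cmd_id)              # send command to the test article
        return "sent"
    faults.append(cmd_id)                # CPU's do not agree: error mode
    return "fault"

issue_command("valve-open", 0o1234)            # CPU 1 decides to issue it
status = issue_command("valve-open", 0o1234)   # CPU 2 agrees; command sent
```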

Dual processing executive
The function of this program is to determine
which computer system has malfunctioned, not an
obviously simple assignment with only two voters.
We must recall, at this time, that all data inputs
and outputs are agreed between CPU's before
they are allowed to occur. Also important is the
awareness a CPU must have of its peripheral

FIGURE 5-Command comparison service

equipment (peripheral to the processor and memory, that is). To this end all Martin-built hardware has extensive error detecting capability
which complements the SDS equipment.
Using this knowledge, that the only malfunction
left must be in one of the computers themselves,
a self-test is initiated in both computer systems. A period of time is passed to allow completion of the test and then the validity flags are
tested, each CPU testing the other. Should an
error be detected then the malfunctioning CPU
will be taken off line. The responsibility for this
function resides with the good CPU.
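A minimal sketch of the mutual test, assuming a single boolean validity flag per CPU (the flag representation and function name are assumptions):

```python
# Two-voter diagnosis: after a disagreement, both systems run a self-test;
# each CPU then tests the other's validity flag, and the good CPU takes
# the malfunctioning one off line.

def diagnose(cpus):
    """cpus: dict of name -> self-test validity flag (True = passed).
    Returns the set of CPUs left on line."""
    online = set(cpus)
    for name, valid in cpus.items():
        if not valid:            # the *other* CPU observes the failed flag
            online.discard(name)   # and removes the bad CPU from line
    return online

online = diagnose({"CPU 1": True, "CPU 2": False})
```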
Far, far down the processing path lies the possibility that an error will not be detected. To
realize how far down, it is necessary to point out
that all critical data, that which can force a processing decision, are majority voted; and that the
programs for testing a test article have been frequently run before. Therefore, it is not expected
to find software errors in this environment. All
data loads are well analyzed and programmed for.
However, should this eventuality arise then the
test article will be "safed" by both systems to assure that nothing and no one will be harmed.

Reset
This program attempts to clean up a bad CPU
and place itself off line. At the same time the good
CPU also performs a reset function on the bad
CPU which will prohibit that CPU from taking
any further action by isolating the processor from
all memory. This will assure that no further interaction will be allowed by an uncontrollable
computer.

Technique effectiveness
One method of evaluating the effectiveness of
the implementation technique is to examine the
trade-offs available within the implementation and
make this information available for future use.
Core used for the total dual processing control
and error processing scheme for all the processes
described amounts to a total of 1200 locations.
Sigma 7 processors must have unique first memory
modules to process traps and interrupts which accounts for the reverse addressing scheme previously mentioned. From this same consideration,
and because of the reliability factor, CPU's are
not allowed to access these modules in a dual processor configuration. Therefore, mailboxes reside

1185

in the second two memory modules, and any message sent out is also recorded so that lost messages
can be recovered to the fullest extent. Eight interrupts are assigned for inter-CPU linkage and each
interrupt has two mailbox locations with the exception of one which is not used for duplex operation. Six of the interrupts are tied in pairs with
the seventh available for interrupt error recovery.
A timing chart would also be in order. Two basic
considerations are made: (1) elimination of the
duplex system, (2) elimination of queuing.
Nominal Total Time (usec)

CPU Service              W/O Duplex    W/O Queuing    Normal
Crosscheck                  369.3         469.3        639.3
Command Transmission        320.0         370.0        520.0
Alternate Service           186.0         246.0        246.0

Times shown do not take into consideration expected delays due to asynchronous operation, nor
is an individual processor's share of the time shown.
In general, execution times may be divided in half
to obtain these times. Therefore, to process a
crosscheck service each CPU would require about
320 usec processor time for a queued, fully duplex technique.
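The halving rule in the last sentence can be restated mechanically (a trivial Python restatement; the figures come from the timing chart above):

```python
# Per-processor share of a fully duplex, queued ("Normal") service is
# roughly half the total time: e.g. a crosscheck totals 639.3 usec,
# about 320 usec of processor time per CPU.

normal_times = {"Crosscheck": 639.3,
                "Command Transmission": 520.0,
                "Alternate Service": 246.0}

per_cpu = {svc: t / 2 for svc, t in normal_times.items()}
```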

Observations
The computerized dual processor solution to
automatic checkout is not a concept that has
evolved overnight, but over years of experience
in the field. A pilot study was undertaken to aid
in assessing feasibility, which resulted in basic checkout
system design criteria. Data, dynamics, and interfaces are all well defined, and the problems left became those of fitting a computer into this environment. Uniqueness of the checkout system stems
from its reliability requirements and the method
of implementation. The Dual Computer System
with its asynchronous characteristics allows either
computer to control the article being tested. Moreover, system information must be made available
for automatic switchover to occur. It is also recalled that both computers take an active part in
test execution. In this light, the similarity with
the human counterpart can be observed.
Software and hardware design and implementation optimization are observed on two levels: (1)
design criteria for the checkout system where
trade-offs were made, (2) frequent testing of very
well defined processes. Implementation errors can
show up during preliminary article testing since
many tests are dynamic enough to exercise the
checkout system to its fullest capability. This is
in addition to normal program checkout.
The Dual Processor Checkout System has been
implemented with the idea that integrity cannot
be compromised and at the same time system dynamics must not be restricted. The system must
respond to 95 discrete changes in 10 milliseconds,
of which not all changes are critical in nature.
However, this will result in only a few commands
being generated. Asynchronous processing is the
answer to the overall problem, allowing each computer the greatest latitude possible.
The Dual Processor Checkout System, with its
two-brained system, requires close coordination
to achieve the desired ends. This is done through
an isthmus of memory which is stimulated by interrupts. A capability exists for each processor
to gain information and commands from each
other. Should one half of the checkout system
fail, the other half is fully capable of carrying on
normally but must re-learn menial, non-critical tasks. Specifically, PCM monitoring may
have to be re-initialized so that correct test criteria
may be set, which indicates a learning phase. This
is also true of analog testing. Finally, the connection between the processors may be severed, in
which case they may operate independently.
BIBLIOGRAPHY
1 M S GAZZANIGA
The split brain in man
Scientific American August 1967

An operating system for a central real-time
data-processing computer *
by PAUL DAY and HENRY KREJCI
Argonne National Laboratory
Argonne, Illinois

INTRODUCTION
A detailed study of the laboratory data-processing
needs of our Chemistry Division has shown that
about 25 unrelated experiments (see Table I)
will require or greatly benefit from real-time
computer service. An assessment of the nature of
these requirements and a careful study of the
capabilities of the Sigma 7** computer clearly indicate that this work load could be handled by
such a central computer. This load will use up
to 10% of the I/O capacity of the system and
require about 40% of the Central Processor time
to satisfy the real-time needs of all the experiments if they are running simultaneously. An additional 30% of the Central Processor capability
would be used to perform final processing and
analysis of the data at the end of each experiment. Further examination of the problem indicated that while a comparable amount of money
would have to be spent whether we purchased individual computers for each experiment or a
single Sigma 7 system, the service that could be
provided by the central system would be far superior to the individual small computer service. In
addition to the usual features found in third generation computers, the Sigma 7 has true independent Input/Output processors, independent
memory modules and a high speed random access
device (3 × 10⁶ bytes/sec transfer rate), all of
which are essential for the efficient operation of
a large data-collection and processing system.
Analog information from the various laboratory instruments will be digitized at the remote
*Work performed under the auspices of the U.S. Atomic
Energy Commission.
**Manufactured by Scientific Data Systems, Santa Monica,
Calif.

site and transmitted directly to the central computer memory. The data from each of these remote instruments will be processed by a separate
program residing in the central computer. In
the initial stages of our installation, most of the
experimentalists will require this collection and
storage of data for processing at the termination of a run which may last anywhere from
seconds to days. On the other hand, some of these
experiments, such as nuclear detector multiparameter analyzers, will be partially controlled
by the computer. These experiments will be provided with amplifier gain stabilization and live
displays of computer modified spectra. There is
little doubt that as the experimentalists become
more cognizant of the capabilities of a central
computer, more and more experiments will be
modified to become increasingly interactive with
the computer during an experiment.
The very nature of laboratory research requires that the experimentalist be allowed to
change his operating parameters during an experiment and make interrogations regarding its
progress. Therefore, in addition to the data lines,
each remote site will contain at least a teletypewriter which will always be interactive with the
computer. Using this device, the experimentalist
will be able to initiate the loading of his program
and provide the relevant program parameters
by responding to a series of questions posed by
the computer.
Upon termination of an experiment, the accumulated data will be completely processed by
a user-selected processing program. This processing will be done in a low priority background
mode. Additional background processing will be
run on an open-shop basis and small batch jobs
will run one at a time during the many milliseconds per second that the Central Processor is not


handling the real-time processing. In addition,
there are some very long-term (hundreds of hours)
routine computations that are of important theoretical interest which will be carried out as a "subbackground" job. While these calculations could
be done more readily on a larger and faster
computer, the time available on larger systems
is better spent on doing more advanced research
in this computational area.
Design goal
The specific requirements of the users in a
typical chemistry research facility cover a broad
spectrum: (a) Data rates vary from one byte/sec
to 100K bytes/sec during millisecond periods. (b)
Real-time operations, except for data storage requirements, vary from zero to sophisticated
analysis requiring about 10% of the Central
Processing Unit's (CPU) time throughout the
operation of an experiment. However, there is no
instance in which a high request rate and a large
computational load combine to require more than
0.1 sec per second of CPU usage. (c) While
some experiments are completely invalidated by
the loss of a single data point, others would not
be degraded significantly even if 20% of the data
were lost. (d) Also, a few experiments must have
a sophisticated response from the computer in less
than a second and others could wait all day, the
loss being the experimentalist's time and patience.
The computer will be used as a data buffer by
each of the experiments, some of which will require moderate calculations on a real-time basis.
Most of the real-time programs require the
evaluation of mathematical functions and the use
of many common sub-routines. For example, over
half of these programs require a rather large
spectrum-analysis routine. Considerable memory
can be saved by coding these routines as "pure
procedures" (re-entrant code) and having one
copy resident in memory shared by many resident programs. As a result, in the majority of
the cases (20), the data buffer area will require
about four times more core than the specialized
portion of each program. Since most of the experiments are transmitting data in an asynchronous manner, these large data areas must
be dedicated for the duration of an experiment;
only a 20% saving in core would be realized if
"overlay techniques"¹ were employed on the code
area. Therefore, at the expense of buying sufficient memory to keep these real-time programs
resident during an experiment, the Operating System is relieved of the burden of re-establishing
appropriate memory-residence for the majority
of the programs. A few (5) of the experiments
will transmit their data in bursts at intervals of
30 seconds or longer and then require a rather
large program for analysis and a non-critical response (time-wise) back to the laboratory. These
service requests are handled using overlay techniques and are processed one at a time on a first-come first-served basis in a common memory area
on a low priority basis.
A simulation program has been written to
assess the compatibility of the CPU requirements of the various users. This program depicts
on a graphical plot (Figure 1) the dedication of
the CPU as a function of time. The input parameters to the program include the switching overhead time, data rates, buffer size, computation
time per service request, number of I/O requests
per service, and the elapsed time between an I/O
request and its completion. Included in the program is a routine which simulates the randomness of the individual events in those experiments where this is applicable. Table II summarizes the results of simulating 21 minutes of
Sigma 7 operation. As can be seen, each service
request was satisfied before the succeeding request was signalled (zeros in the column labeled "No.
Ints. Lost"). This gives one assurance that the
demands of all of the proposed users will be
satisfied by the CPU when they are running
simultaneously. The simulation also shows that it
is practical, when a broad spectrum of requirements is to be satisfied, to assign a priority to
each task and let it run to completion or until it
is interrupted by a service request from a job of
higher priority. Upon completion of the higher
priority task, the interrupted task is resumed.
When a real-time program is to be loaded, its
priority will be assigned based on the current
computer work load, the service frequency, computational time, and its required response time.
Inasmuch as the sequence of tasks carried out
by the CPU will be determined by the priority
assignments, some means must be undertaken to
insure that a high priority program does not
exceed its allotted running time (preventing programs of lower priority from running) due to a
program error or an unusual data sequence. Also,
all of the real-time programs will be associated
with research projects whose data-processing requirements change from time to time, implying
that these programs will require modification on


USER          INSTRUMENT              INVESTIGATION

Abraham       Digital potentiometer   Low temp. heat capacity
Atoji         Neutron spectrometer    Crystallography
Carnall       Multi-channel anal.     Optical absorption spectra
Diamond       Atomic beam             Nuclear spins
Engelkemeir   Multi-parameter anal.   Heavy element decay studies
Fields        Multi-channel anal.     Decay scheme studies
Friedman      Multi-parameter anal.   Proton reaction studies
Glendenin     Multi-parameter anal.   Fission studies
Hines         Multi-channel anal.     Decay scheme studies
Holt          Mass spectrometers      Routine mass analysis
Katz          Mass spectrometer       Deuterated compounds
Katz          Cryogenic NMR           Deuterated compounds
Osborne       Digital potentiometer   Low temp. heat capacity
Peterson      Neutron spectrometer    Crystal structure
Siegel        X-ray spectrometer      Crystal structure
Steinberg     36-det. count lab       Decay and spectra studies
Steinberg     Multi-parameter anal.   Proton-nuclei reactions
Studier       Time-of-flight m. s.    Carbonaceous chondrites
Wexler        Mass spectrometer       Charge states of mol. frag.
Unik          Multi-parameter anal.   High res. fission studies
Weil          Nucl. mag. res.         Structural studies

TABLE I-Work load summary
FIGURE 1-Simulation of CPU usage as a function of time. Increasing time along horizontal axis and tasks of decreasing
priority along vertical axis.

occasion. With 25 such programs resident in
memory at a time, some of which have not been
thoroughly tested, it is absolutely necessary that
these programs be prevented from writing over
each other.
In addition to being required for the Operating
System, auxiliary storage must be used for several other purposes. The real-time processing for
a single multi-parameter experiment requires
the maintenance of eight updated spectra. Each
spectrum (2048 words) must be updated during
a data buffer processing which takes place at
intervals of about one second. Limiting core
residency to one table at a time requires the reading and writing of the eight tables once each
second. The simultaneous running of the four
planned multi-parameter experiments will require an aggregate data transfer rate of 512K
bytes/sec between memory and .mass storage.
Also, the raw data from the multi-parameter experiments must be written on magnetic tape.
The remainder of the remote sites will generate
considerably less data (less than 65K bytes per experimental run), which can be readily handled by
a mass storage device with a capacity of several
million bytes. To facilitate remote loading of
real-time programs, their object forms must reside in mass storage.
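The scheduling policy the simulation program models, run to completion unless preempted by a service request from a higher priority job, then resume the interrupted task, can be sketched as a toy unit-time simulator (in modern Python; the task names, durations, and unit-step model are invented for illustration):

```python
# Toy priority-preemptive scheduler: at each time step the most urgent
# ready task runs; a newly arrived higher-priority request preempts it,
# and the interrupted task resumes when the urgent work completes.

import heapq

def simulate(arrivals, horizon):
    """arrivals maps time step -> [priority, name, duration]; a lower
    priority number is more urgent. Returns names in completion order."""
    ready = []                        # heap of [priority, name, remaining]
    finished = []
    for now in range(horizon):
        if now in arrivals:
            prio, name, dur = arrivals[now]
            heapq.heappush(ready, [prio, name, dur])
        if ready:
            task = ready[0]           # most urgent task runs this step;
            task[2] -= 1              # lower-priority work stays queued
            if task[2] == 0:
                finished.append(heapq.heappop(ready)[1])
    return finished

# a background job is preempted by a crosscheck request, then resumed
order = simulate({0: [2, "background", 5], 2: [1, "crosscheck", 2]}, 10)
```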

21 MIN. TOTAL RUN TIME                    .32 MSEC./SWAP OVERHEAD

Columns: priority and line description; time req. to service an int.;
ave. time between interrupts; percent of total time; no. ints. serviced;
no. ints. lost; times interrupted.

Tasks in priority order: 1 WEXLER 1, 2 WEXLER 2, 3 STNBRG 1, 4 FRDMAN,
5 UNIK 1, 6 STNBRG, 7 GLNDIN, 8 ENGLKMR, 9 WEIL, 10 KATZ 1, 11 HOLT 1,
12 HOLT 2, 13 ATOJI, 14 SEIGEL, 15 OSBRN 1, 16 OSBRN 2, 17 KATZ 2,
18 ABRHM, 19 STUDIER, 20 FIELDS, 21 STNBRG 2, 22 CARNALL, 23 HINES,
24 UNIK 2.

Every task shows zero interrupts lost. Totals: foreground 37.6 percent of
total time, background 57.0 percent, overhead 5.4 percent.

TABLE II-Real time simulation-summary
While the most efficient code is written in assembly language, the experimentalists should not
be required to learn a particular machine language, but be permitted to write their own programs in the more familiar FORTRAN language,
if they prefer. Therefore, FORTRAN capability
should be provided for the real-time user as well
as for the batch processing users.
Although immediate final analysis is useful
to the experimentalist, it does not have the urgency of the real-time operations and therefore
should be executed in a "background mode" which
will operate at a low priority level. The actual
final processing requests will be placed in a
background queue by the data-collection phase
program upon completion of an experimental
run. The remaining' CPU time should be used for
"open shop" batch processing and long-term computations.
The Supervisor offered by Scientific Data Systems (Batch Processing Monitor) for the Sigma
7 was designed to handle a number of real-time
tasks in the foreground mode (high priority)
while running a sizable batch-processing operation in the background mode (low priority). The
principal shortcomings of their system are that
the practical number of foreground tasks is considerably less than ten, there is inadequate memory protection, no provision is made for foreground program "overrun time" traps, and that
it contains an inefficient Rapid Access Device
(mass storage) handler. Rather than attempt to
modify a Supervisor that was designed without
large-scale real-time operation as the primary
goal, it was decided that a fresh start, with the
detailed requirements of the experimentalist in
mind, would be a better approach.

Hardware configuration
The Sigma 7 is a third-generation machine:!
with an 85.0 nano-second memory and an instruction set whose .execution times range from
1.2 J.L sec for an "add immediate" to 25 J.L sec for a
"long form floating divide" (16 decimal digit).
The Central Processing Unit (CPU) has sixteen
programmable registers, of which seven may be
used as index registers. The instruction counter,
"memory write key," and other information germane to the currently running program reside in
a 64 bit CPU register called the Program Status
Doubleword (PSD). When a program is to be
interrupted, the 16 registers and the current PSD
must be saved, and the PSD of the interrupting


program must be set in the CPU. This can be
accomplished with a series of three instructions (24 μsec). When the interrupted program is to be resumed, the 16 registers and the PSD must be reloaded. This is also accomplished with three instructions (22 μsec). The CPU also has a
priority interrupt structure which responds to
signals from internal clocks, I/O terminations and
up to 240 external sources. A memory "write protection" feature associated with the CPU compares the "write protect key" (two bits in the PSD) with the portion of a 512-bit "write-protect lock" image (private CPU register) associated with the page of memory (512 words) being referenced. An attempt to modify memory with an improper "lock and key" match results in a fault
trap. This checking operation proceeds in parallel
with instruction interpretation and thus does not
lengthen instruction execution time.
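The lock-and-key check can be sketched as follows. This is an illustrative model, not SDS code; in particular, the treatment of key 0 and lock 0 as unrestricted cases is our assumption about the usual convention, not a detail given in the text.

```python
# Sketch of the Sigma 7 write-protection check described above.
# Assumption: key 0 acts as a "skeleton key" and lock 0 marks an
# unprotected page, following common practice of the period.

PAGE_WORDS = 512          # protection granularity: one 2-bit lock per 512-word page

def write_allowed(key: int, locks: list, address: int) -> bool:
    """key: 2-bit write key from the PSD; locks: per-page lock image."""
    lock = locks[address // PAGE_WORDS]
    if key == 0 or lock == 0:     # assumed master-key / unprotected cases
        return True
    return key == lock            # otherwise lock and key must match

def store(key, locks, address):
    """A store with an improper lock-and-key match results in a fault trap."""
    if not write_allowed(key, locks, address):
        raise MemoryError("write-protect fault trap")
```

As in the hardware, the check is a pure comparison against a small table, so it can proceed in parallel with instruction interpretation.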
The hardware configuration for our system is
shown in Figure 2. For the remote terminals
and the standard peripherals, data transfer with
the main memory is accomplished through one of
the two "Multiplexor Input/Output Processors,"
each of which can service up to 32 simultaneous
users with a throughput of about 400 K bytes/sec.
For high-speed data exchange with mass storage
at speeds approaching full memory speed, one
"Selector Input/Output Processor" will be used.
Both types of I/O processors operate independently of the CPU once an I/O operation has been
initiated. They also contain internal hardware
which permits them to execute a succession of
I/O commands from core memory ("command
chaining") without CPU intervention once they
have the address of the first command in a
chain.
The advantages of independent input-output
processors would tie obviated if there were only
a single set of addressing hardware for all of
memory. Our configuration will contain four
separate memory banks of 16K words each. Each
bank of memory is accessed through a port, one
port being dedicated to the CPU and one to each
of the I/O Processors. Each of the banks contains its own reading and writing hardware.
Therefore, the CPU can operate at full computational speed when instructions are being executed from one bank of memory while a mass
storage device is transferring data through the
"Selector I/O Processor" at near full memory
speed into another bank.
The mass storage device will be an SDS "Rapid Access Device" (RAD), which consists of a rotating disk (35 msec rotation period) with fixed heads that transfer data at 3 × 10⁶ bytes/sec. The 64 bands of 82 sectors (1024 bytes each) have a total storage capacity of about 5 × 10⁶ bytes.
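The quoted RAD figures are easy to verify from the band and sector counts:

```python
# Arithmetic behind the RAD capacity and transfer rate quoted above.
bands = 64
sectors_per_band = 82
bytes_per_sector = 1024

rad_capacity = bands * sectors_per_band * bytes_per_sector
assert rad_capacity == 5_373_952      # about 5 x 10^6 bytes, as stated

# One band (82 sectors) passes under its head every 35 ms rotation,
# roughly consistent with the quoted ~3 x 10^6 bytes/sec transfer rate:
rate = sectors_per_band * bytes_per_sector / 35e-3
assert 2e6 < rate < 3e6
```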

Operating system structure
Overview

Our system is not a generalized data-acquisition system, but an Operating System in the fullest sense of the term as recently described by Wood, consisting of a Supervisor and a set of Processors. The Supervisor was written to provide a well-defined multi-program environment in which a large number of programs may operate without mutual interference; the Processors (assembler, compiler and loader), which run under the Supervisor, are the current SDS versions. Each program is fully protected from all other programs in both space and time. By implementing the Sigma 7 "write-protection" feature, a program is prevented from writing in a portion of memory which is not assigned to it.
In addition, all I/O is executed via I/O handlers (part of the Supervisor) which fully check the validity of each request. Using the CPU internal clock, the system also keeps control of the time that it spends on each program and can abort a job which exceeds its allotted time (or proceed to the next program if a time-slice has been consumed in a time-sharing situation). The Supervisor is structured in such a manner that it is event-driven from three types of signals: a request for service from a remote site, the completion of an I/O operation, or the run-out of an internal clock. With this structure it is then quite feasible to have a mix of tasks running simultaneously which includes: real-time data-collection and/or processing programs requesting service via an I/O completion or external signal; a group of time-sharing conversational terminals which are time-sliced via an internal clock; and a modest batch-processing operation.

FIGURE 2-Hardware configuration


Operating System for Central Real-Time Data-Processing Computer
Data buffering

As indicated in an earlier section, the variety of tasks that must be performed on a real-time basis is of such a nature that they may be assigned a priority and performed on a run-to-completion basis (interruptable only by a higher priority task). The task that each program must perform generally consists of transferring the data that accumulates in memory to auxiliary storage and/or performing some modest computations and returning the results to the remote site. Since response times of less than a second are not required in most of the cases, and in order to keep the program switching rate down to a reasonable level (about a hundred per second), the data will be accumulated in blocks (under control of a Multiplexor IOP) into dedicated portions of memory associated with each program. The size of these blocks will depend on the average data rate, the peak rate, the amount of processing required and the urgency for computer analysis. In most of the programs under consideration, the experiment can proceed before a block of data is analyzed or transferred to auxiliary storage. Thus, to facilitate an uninterrupted flow of data, the data buffer areas are divided up into a minimum of two equal blocks and the "command chaining" feature of the I/O Processors is incorporated in the following manner. The first I/O command fills half of the buffer, and upon filling this area the command-chaining flag directs the input/output processor to get the next command, which initiates input into the other half of the buffer. Simultaneously with the completion of the filling of the first half, the I/O interrupt of the CPU is triggered, signalling the Supervisor that service is needed. With the expected mix of tasks, the associated program should obtain sufficient CPU usage to process the buffer and extend command chaining in a circular manner before the second half of the buffer fills. As a result, the experiment can continue to transmit its data to the computer independently of the detailed needs of the other users. In the event the task is not completed before the second half of the buffer is full, the I/O for this terminal will halt because command chaining had not yet been set; the processing program will have to start the I/O operation again when the appropriate task has been completed on the other half of the buffer. For data inputs that vary a great deal in rate during the course of an experiment, this double-buffering technique can be extended to any level. Typically, these buffers are a few hundred words long with a filling time of about one second.
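The circular double-buffering discipline can be sketched as follows. This is a simplified software model for illustration only; in the actual system the mechanism lives in IOP command doublewords and their chaining flags, and the class and method names here are ours.

```python
# Sketch of circular double buffering with a command-chaining flag,
# as described above. Illustrative model, not the SDS implementation.

class DoubleBuffer:
    """Two half-buffers filled alternately by the I/O processor."""
    def __init__(self, half_size):
        self.halves = [bytearray(half_size), bytearray(half_size)]
        self.filling = 0        # half the IOP is currently filling
        self.chained = True     # chaining flag on the current I/O command
        self.halted = False     # input halts if a half fills with no chain set

    def half_filled(self):
        """Models the I/O interrupt raised when the current half completes."""
        done = self.filling
        if self.chained:
            self.filling ^= 1     # IOP chains to the command for the other half
            self.chained = False  # program must re-extend the chain in time
        else:
            self.halted = True    # data flow stops until I/O is restarted
        return done               # index of the half now ready for processing

    def processing_done(self):
        """Called when the program finishes a half: extend the chain again."""
        self.chained = True
        if self.halted:           # too late: restart the transfer explicitly
            self.halted = False
            self.filling ^= 1
```

As long as each half is processed before the other fills, `half_filled` keeps alternating and the experiment's data flow is never interrupted; the `halted` branch models the restart case described in the text.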
Software priority structure

The CPU task sequence can be readily related to the priority chain in one of two ways. The rapid switching capabilities of the external priority interrupt structure could be used if the requirements of the tasks can be estimated with sufficient accuracy and remain constant for extended periods. However, if the real-time needs change, the priority assignments must be modified. Although changing hardware priority assignments is a relatively simple changing of cables, it could often mean stopping some experiments while the cables were changed. This would be intolerable. To avoid this problem, a software priority interrupt system was designed and implemented which allows priority alteration in microseconds and has the potential for Supervisor-controlled dynamic priority re-assignment. Figure 3 depicts the priorities as they are established in the present Supervisor. Any task of higher priority can gain the services of the CPU (via an interrupt) after the request has been processed by the "Interrupt Supervisor" portion of the Supervisor.
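A software priority chain of this kind can be sketched as an ordered scan. The list below stands in for the linked PDT entries of Figure 3, and all names are illustrative:

```python
# Sketch of software priority dispatch: entries are kept highest
# priority first, and the "Interrupt Supervisor" runs the first ready one.

def dispatch(chain):
    """Return the highest-priority ready program, or None to idle."""
    for program in chain:
        if program["ready"]:
            return program["name"]
    return None

def set_priority(chain, name, new_index):
    """Re-prioritizing is just a list reorder -- microseconds,
    versus recabling a hardware interrupt chain."""
    program = next(p for p in chain if p["name"] == name)
    chain.remove(program)
    chain.insert(new_index, program)
```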
Program description table

The Supervisor runs, services and controls the various resident programs by referring to a Program Description Table (PDT) associated with each program. These tables contain the Program Status Doublewords, the "write-protection" lock image, the current and allotted running times, space for saving the register contents when interrupted, and various control bytes which assist the Supervisor in keeping track of the current status of the program. All of the I/O commands and the memory bounds are also stored in the table. The I/O handlers use this information in processing I/O requests. The integrity of these tables is insured by making them inaccessible to the user programs ("write-protected").
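The fields listed above suggest a record along these lines. The field names and Python types are ours; the real table is packed machine words, and the time fields would be internal-clock ticks:

```python
# Illustrative layout of a Program Description Table entry,
# following the fields described in the text.
from dataclasses import dataclass, field

@dataclass
class ProgramDescriptionTable:
    psd: int                    # Program Status Doubleword (64 bits)
    write_lock_image: int       # "write-protection" lock image for its pages
    time_used: float = 0.0      # current running time
    time_allotted: float = 0.0  # allotted running time
    saved_registers: list = field(default_factory=lambda: [0] * 16)
    io_commands: list = field(default_factory=list)  # checked by I/O handlers
    memory_bounds: tuple = (0, 0)                    # used to validate requests
    status: int = 0             # control bytes tracking current program status

    def over_time(self) -> bool:
        """The Supervisor can abort a job which exceeds its allotted time."""
        return self.time_used > self.time_allotted
```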
Input/output handlers

All input-output operations are effected by appropriate handlers which not only facilitate data buffering but also permit programs of lower priority to run while a program of higher priority is waiting for the completion of an I/O

FIGURE 3-Software priority structure (I/O interrupt level, real-time levels, end of job; the first three words of each PDT hold the ID of the next-lower-priority PDT and the device address)
where θ_max is the maximum angle, as seen from the hologram, over which the image is played out, λ is the mean wavelength of the illumination, and Δλ is the wavelength spread. As examples, with a mean wavelength of 8500 Å, a Δλ of 30 Å, and a θ_max of 45° (all values for lasing diodes), the number of bits that can be stored is 5 × 10⁴. If now Δλ is 300 Å (the value for non-lasing diodes), the capacity is reduced to 500 bits per hologram.
The third, and perhaps most important, limitation arises from the spatial coherence of conventional light emitting diodes. The GaAs diode is an extended source of light, each segment of the source emitting light independently of every other segment. As a result there is no unique, time-invariant representation of the emitted radiation that can be made that relates one part of the wavefront with another. This is what is meant by spatially incoherent illumination. Such illumination imposes a severe limitation on the capacity of a holographic store, for it can be demonstrated that if spatially incoherent light is used to illuminate a hologram, and
store, for it can be demonstrated that if spatially incoherent light is used to illuminate a hologram, and
if the source as seen from the hologram is of a given
angular extent, then the smallest image from the
hologram will have the same angular extent. As
a result the number of bits per hologram (the number of resolvable spots in the hologram's image),
which is inversely proportional to the smallest resolvable spot in the image, is reduced. There are a
number of ways to overcome this limitation. The first is
by the use of a lens to reduce the angular extent of the
source as seen by the hologram. The second, and more
elegant, is the use of small, lasing GaAs diodes. These
diodes, as will be discussed below, have a high surface brightness, near monochromatic output, and if
made very small have spatially coherent light emission.
There are other factors that could be considered, for
example the effects of emulsion shrinkage. However,
this and other effects are not peculiar to this application
of holography and so will not be discussed here. Complete discussion can be found in the literature. See for
example Ref. 7.

Light emitting diodes
There are actually two types of light emitting diodes
to consider, incoherent and coherent (lasing) diodes.
We will first consider incoherent light emitting diodes
and then turn our attention to coherent (lasing) diodes. Incoherent diodes emit light, when forward biased, at a mean wavelength of 9100 Å, with a spectral width of 300 Å. The diodes typically have a low power efficiency, about 2%, requiring large currents to achieve high light output. In our experiments, using a commercially available diode, four amps are required to obtain 40 mW of emitted radiation. The diode response time, current in to light out, is relatively long, typically 100-150 ns. It is suspected that the slow speed of response is due to a
mechanism involving deep traps which must be filled
before light of appreciable intensity will be emitted.
Surface brightness, a figure of merit commonly used
to compare different sources of illumination, is 40 mW/mm², which is large when compared to conventional
sources but low, as will be seen, when compared to a
lasing source. Surface brightness is important in that it
ultimately determines, regardless of the kind of optics
used, the power per point in the image. A low surface
brightness therefore, will result in a low storage capacity
in the hologram memory.
As mentioned previously, such a source is spatially
incoherent. Furthermore, due to its low surface brightness, the source size must be large to achieve reasonable
power levels. We can see therefore, that such a source
will always impose a severe limitation on the capacity of
such a memory (especially if a high speed memory is desired).
Coherent (lasing) diodes emit light at a mean wavelength of 8500 Å, with a spectral width of less than 30 Å. Above threshold (approximately 50 ma) they have a high differential power efficiency, about 30%, allowing low drive currents. The speed of response is very high; the diodes typically turn on in less than ten nanoseconds. Surface brightness is very high, typically 8 × 10⁸ mW/mm². Thus even with incoherent illumination, small source sizes can be used and still yield appreciable power levels, implying high storage capacities. Encouragingly, light emission has been achieved for small (~1/2 mil) lasing diodes which is completely coherent, implying that capacities of up to 30 × 10³ bits per hologram can be achieved. The small size of these diodes also allows one to fabricate an array of light sources which is quite compact. Arrays of 1 mil diodes on 10 mil centers are contemplated for future models of the memory.


The main disadvantage of coherent (lasing) GaAs diodes is that presently they must be used in a liquid nitrogen environment. With the availability of reasonably low cost closed-cycle liquid nitrogen refrigerators this is not a real disadvantage, mainly a psychological one.

The photodetecting array

Figure 6 shows a word-organized photodetecting array suitable for the light-emitting diode holographic memory system. The total analysis of this matrix will be published elsewhere; in this section we will outline how this matrix works and its characteristics.

A word is selected by turning on the switching diodes of a particular (horizontal, in the case of Figure 6) line by closing its switch (a transistor). Then the photocurrent generated by light falling on the back-biased photodiodes (which can have typically 70% quantum efficiency) on the vertical lines has a path through the sense amplifier, which has low input impedance, through ground and the conducting word line. Photocurrents generated in the photodiodes associated with an unselected word line see a high impedance path and so do not contribute to the sense signal. The risetime of such an array has been calculated to be

R.T. = 5 × w_n × R_switching diode × C_photodiode

where 5 represents the ratio of a selected word sense signal to worst case unselected word signal, w_n the number of bits in the detecting matrix, C_photodiode the capacity of a photodiode, and R_switching diode the on resistance of the switching diode. Putting in typical values of 5 pF for the photodiode, 1 ohm resistance switching diodes and a 1000 bit array, a 25 nanosecond risetime is indicated.

One of the features of this array is the isolation of the selection current, which flows horizontally through the switching diodes, from the signal currents, which flow through the vertical lines. Since the large currents associated with driving the GaAs diodes are electrically separate, due to the optical coupling, high speeds without drive noise spikes in the sense output are achievable. Moreover, this simple diode structure is fairly easy to integrate, and the photodetecting array can be built up of smaller subsections. Such photodetecting arrays have been built at our laboratories and operated successfully.

FIGURE 6-A word-organized photodetecting array

Discussion of experiments

Experiments were performed to demonstrate the implications of the previous statements. To illustrate the salient features of the concept, a small working model of a holographic read-only memory accessed by light emitting diodes was built.* The model consisted of four holograms accessed by four GaAs light emitting diodes. The hologram stored 26 bits, with the output of the hologram projected onto a 26 bit photodetector array as described in the previous section. The outputs of the photodetector array were amplified to a suitable level, threshold detected, and strobed out. Each component will be examined in turn.

To compensate for the wavelength shift, as discussed previously, each hologram stored a predistorted array of spots as shown in Figure 7, with the output of each hologram a square array. The hologram was made with the configuration schematically shown in Figure 8, using light from a He-Ne laser (λ = 6328 Å). The reference beam diverged from a point 1" in front of the hologram surface and at an angle of 2° to the hologram normal.

FIGURE 7-Input spot array, showing reference beam position. Notice lack of squareness

* For ADP/ECOM, Fort Monmouth, New Jersey.
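The 25-nanosecond risetime quoted for the 1000-bit detecting array follows directly from the risetime expression; working in ohms and picofarads gives the answer in picoseconds. The function form below is ours:

```python
# Risetime of the photodetecting array, R.T. = ratio * w_n * R * C.
# With R in ohms and C in picofarads the product comes out in picoseconds.
def array_risetime_ps(ratio, n_bits, r_switch_ohms, c_photodiode_pf):
    return ratio * n_bits * r_switch_ohms * c_photodiode_pf

# 5:1 signal ratio, 1000-bit array, 1-ohm switching diodes, 5-pF photodiodes:
assert array_risetime_ps(5, 1000, 1, 5) == 25_000   # 25,000 ps = 25 ns
```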
The sources of illumination for hologram reconstruction were commercial GaAs light emitting diodes. The diodes were driven by a pulse amplifier capable of delivering 4 ampere pulses with a 50% duty cycle at 10 MHz. The pulser was able to deliver the four ampere pulse within 25 nanoseconds of the initiate pulse, including delay plus current risetime.
The desired diode was selected by means of a switch.
A simple lens was used to collect the light emitted by
the diode to provide the convergent beam needed for
undistorted hologram reconstruction. As discussed
earlier, the playout beam converged to a point at the plane of the image 1" from the hologram.
The image of each hologram was projected onto a
photodetector array, of the type discussed previously,
with each hologram imaging on the same photodetector
array. The light sensitive elements of the array were
hpa 4207 diodes. The word diodes were hpa 1006
high conductance diodes. Separate word lines could be
selected either mechanically, by means of switches, or
electronically.
The output of each bit line was fed into a low input
impedance, high gain, wide bandwidth, amplifier. The
amplifier had a total delay plus rise time of about 50
nanoseconds, an input impedance of 50 ohms, and a
transfer ratio of 1.5 volts per microwatt of illumination
incident on a photodetector. The output of the amplifier was strobed and then applied to a threshold detector.
Sense amplifier output of two bits read in parallel is
shown in Figure 9. There are a number of things of interest to be seen in the figure. The first is the high one to
zero ratio obtained from this kind of memory system.

FIGURE 8-Recording arrangement


The second is the long rise time of the light emitting diodes. Figure 10 illustrates the access time of the system. The top trace is the input to the light emitting diodes' pulse amplifier (the address pulse), the second trace is the sense amplifier output, and the bottom trace is the processed output. The access time shown is 200 ns; again it can be seen that almost all of the delay is caused by the slow turn-on time of the light emitting diodes, as seen in the second trace. As shown, the pulse duration is 500 ns. This length was chosen for pictorial reasons only, and is not representative of the pulse durations achievable. As we have said above, the main speed limitation is the time of response of the light emitting diodes, which with incoherent diodes is about 150 ns, while with coherent light emitting diodes it is about 10 ns, or less.
The minimum detectable signal, the value of which was a prime objective of this study, for it ultimately determines the speeds and capacities that can be achieved in future systems, was found to be 2.5 × 10⁻¹⁴ watt-seconds. (This corresponds to a 1/4 μwatt signal of 100 nanoseconds duration; we give the result in terms of minimum energy for reasons of generality.) This value would allow a 10:1 signal to noise ratio, or an error rate of 10⁻⁸, which is minimally acceptable. With this figure it is possible to determine the energy needed

FIGURE 9-The sense amplifier output for a one (a) and a zero (b). Vertical scale 1/2 volt/cm, horizontal scale 100 ns/cm

FIGURE 10-Operating waveforms of hologram read-only memory using incoherent light-emitting diodes: (a) initiate pulse; (b) sense amplifier output, showing rise time of light-emitting diode; (c) strobed output. Horizontal scale 100 ns/cm


to access a hologram memory system, for if N is the number of bits and E_H is the hologram efficiency, then the energy needed is

E_TOT = (N × 2.5 × 10⁻¹⁴) / E_H watt-seconds.

The hologram used in this model had an efficiency of 5%, which is typical for the type of holograms used. This means the minimum energy needed in a hologram memory is

E_TOT = N × 5 × 10⁻¹³ watt-seconds.
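The two energy figures above can be checked in a few lines; the function form is ours:

```python
# Access energy for an N-bit hologram of efficiency eff:
# E_tot = N * E_min / eff, with E_min the minimum detectable signal energy.
E_MIN_WATT_SECONDS = 2.5e-14

def access_energy(n_bits, efficiency):
    return n_bits * E_MIN_WATT_SECONDS / efficiency

# At the 5% efficiency measured for these holograms, E_tot = N * 5e-13 W-s:
assert abs(access_energy(1, 0.05) - 5e-13) < 1e-25
```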

The storage capacity of the holograms used has been shown to be most severely limited by the low surface brightness and lack of spatially coherent radiation of the light emitting diodes. The maximum number of bits that can be stored with such a source is below 200 bits per hologram if access times of 0.5 μsec are desired. The large physical size of the light emitting diodes does not allow compact spacing of the array of sources (and thus of holograms) and thus limits the capacity of a complete system, for the conditions of a maximum hologram output angle of 45° and a hologram array size larger than the photodetector array are not compatible. As a result, if the holograms can not be spaced compactly, only a small number of holograms can be used, limiting the system capacities. All of the above indicate that coherent (lasing) diodes must be used to make the concept realizable.

CONCLUSIONS

From the discussion in this paper, the advantages of a hologram read-only memory are manifest. Digital information can be stored with high bit density and with great redundancy. The need for fine, highly corrected lenses is obviated; accordingly, the need for precise mechanical placement does not exist. Easy replacement of the storage medium is possible since the only link between the medium and the rest of the system is light, requiring no electrical or mechanical connections. The most important advantage, however, is that through the use of holography large information stores can be achieved with relatively few electric components. In a properly made system, the number of components needed in a conventional memory is essentially divided by the number of holograms used: each hologram projects its images onto the same detector array, reducing the need for extensive electronics.

While these advantages have been apparent to us and others for some time, a total system has not been implemented due to a lack of an efficient means of accessing the hologram at high speed. It has been the purpose of this paper to describe a technique to achieve such a means, the use of an array of light emitting diodes to access the holograms.

The use of light-emitting diodes as a hologram selector severely reduces the capacity of this type of hologram memory when compared with that using a gas laser. However, the use of these diodes allows very rapid access to the memory store, something not yet achievable with the gas laser systems.

This is the philosophy of this class of read-only memories: sacrifice capacity for the speed and easy changeability of contents. Experimentally we have demonstrated the feasibility of making holograms for such systems and devised photodetecting arrays of requisite sensitivity and speed. As both our analyses and experiments have indicated, the critical element in this type of memory is the accessing array. Incoherent diodes possess neither the surface brightness nor the short turn-on time required for this memory system. With lasing diodes, capacities of 10⁵-10⁶ bits at cycle times below 100 nanoseconds appear achievable with present hologram and photodetecting array techniques. However, all present lasing diodes need to be cooled to 70-150°K. With further improvement in the device characteristics, an order of magnitude larger capacities become possible. Such memory systems can be evaluated on the basis of the analysis presented in this paper.

REFERENCES

1 R J POTTER
Optical processing of information
Spartan Books Baltimore 1963
2 D GABOR
Microscopy by reconstructed wave-fronts
Proc Roy Soc London A197 454 1949
3 H FLEISHER
Application of interference photography to optical information storage
Presented at Holography Seminar Colorado State University January 1967
4 J K ANDERSON et al
A high-capacity semipermanent optical memory
Presented at Conference on Laser Engineering and Applications Washington DC June 1967
5 A OPLER
Fourth generation software
Datamation 13 22 1967
6 R CHAPMAN M FISHER
A new technique for removable media read-only memories
Proc of 1967 Fall Joint Computer Conference Anaheim Calif 1967
7 D BOSTWICK D H R VILKOMERSON R S MEZRICH
Techniques for removal of distortions in hologram images caused by a change in playback wavelength
Presented at Spring Meeting of Optical Society of America March 1968

Semiconductor memory circuits and technology
by WENDELL B. SANDER
Fairchild Semiconductor
Palo Alto, California

INTRODUCTION

In the past few years the use of semiconductor flip-flops for a significant portion of the memory of a computing system has been approaching a practical reality. The pressures and commitments within semiconductor laboratories toward achieving a competitive edge in the main frame memory market are increasing. The memory field offers a new market area as opposed to the displacement of existing products in a new form. Furthermore, the manufacturing process for semiconductor memories is characterized by the mass production of similar items; this process is in exact accordance with the present production methods in the semiconductor industry and is far less painful than that of the LSI logic field, which threatens to lead to an endless product proliferation at small volumes. Although the memory field is in its infancy and largely speculative, it is possible to make a brief survey of the technology and the concepts being pursued in the developmental laboratories. A further review of semiconductor memory can be found in Reference 1.

Chip technologies

At the present time three distinct chip technologies are being considered: bipolar, p-channel MOS and complementary MOS. The bipolar process used for semiconductor memory is similar to conventional processing, including multi-layer metal. Memory circuits are repetitive and can be relatively non-critical of component value, permitting exploitation of the bipolar process in new ways to achieve a marked improvement in component density. An excellent example of this is a cell described by BTL² at the 1967 ISSCC. The p-channel MOS technology offers a conceptually simple technology capable of producing memory cells at good density.

Progress in MOS technology is proceeding more or less independently from the particular needs of semiconductor memory, and the major effort in p-channel MOS memory circuit techniques is simply to exploit the technology to its fullest. For example, it is desirable to use bipolar circuits to provide high level drive and low level sensing to achieve the best memory system performance from the MOS cells.³,⁴,⁵

Complementary MOS is the integration of both p-channel and n-channel MOS devices on the same chip. It is the least well developed of the three technologies discussed in terms of production capability. There is a wide variety of fundamental approaches to complementary MOS being developed, including thin film field effect transistors,⁶ silicon on sapphire,⁷ and simple diffused silicon. Due to the need for production compatibility and low cost, the all-silicon system is the most likely near-term production process and, in fact, is the only process with standard products of any kind presently available. Complementary MOS is viewed by many (but not all) in the semiconductor field as being applicable primarily to special memory applications where the inherently low power is essential. It is not likely to be cost competitive in memory applications with either bipolar or p-channel MOS for the foreseeable future.


Packaging technologies

There are three basic technological factors in interconnection of the monolithic chips into a memory module: interconnections from the chip, second level interconnections, and chip sealing against contamination. There is a maximum size of chip for any given technology that can be fabricated with 100% yield over the chip. This number is fairly small, on the order of 64 to 256
bits/chip with present technology. This represents a level of testing required before proceeding
to interconnecting the chips (or wafer region)
with a higher level interconnection.
The most straightforward way to handle these good chips is to test the wafer, ink the bad chips, dice the wafer, and throw away the bad die. However, at least one manufacturer, Texas Instruments, has taken an alternative approach, wherein
the good areas are mapped and a special interconnection mask for the wafer is made and applied to interconnect these regions.⁸,⁹ This approach will not be discussed in detail here. The following discussion assumes the handling of small chips acquired by wafer sort and dicing. The attachment of the chip to the next level interconnect can be handled by individual lead bonding of pads on the die to a next level interconnect or may be formed by any one of several batch attachment techniques. Since memory chips tend to have a large number of leads, batch attachment is most attractive.
Some of the batch connection techniques are:
1) Ultrasonic bonding of aluminum bumps on the chip to the substrate metalization
2) Thermocompression bonding of aluminum bumps on the chip to the substrate metalization
3) Solder reflow of solder bumps on the chip to the substrate metalization (Figure 1)
4) Forming solder-coated beams extending beyond the edge of the chip which are bonded to the substrate with the die either face up or face down.
A comparative evaluation of methods 1, 2 and 3
may be found in Reference 10. Thebeam lead approach of 4) was developed by Bell Telephone
Laboratories and is described in reference. 11
The attachment is usually made with the die
mounted face down on the substrate, however,
techniques for face up mounting are being developed for both beam lead structures and by batch
interconnection of face up chips after die attachment.12 The major factors to consider in the attachment systems are the economics, attachment yield, repairability and heat dissipation. Ease of repair will probably be a dominant factor if many chips are attached to a single substrate. In general, the face down bonding technologies are easier to repair, but the face up technologies allow better inspection of bond quality and better heat dissipation.

FIGURE 1-Chip with solder bumps
The substrate (Figure 2) for next level interconnection can be anything from a single chip
package to dozens of uncased chips on a multilayer substrate. The best economic potential lies
in multiple attachment of uncased chips to the
substrate. Single layer metalization on the substrate is highly desirable from an economic standpoint but may not be adequate in all cases. The
substrate material is usually either alumina with

FIGURE 2-Single and double layer substrate

Semiconductor Memory Circuits and Technology
gold plated moly-manganese interconnections or silicon with aluminum or moly-gold interconnections. The alumina is a common ceramic, well understood for the purpose, whereas silicon is more fragile but can provide very high interconnection densities.
Chip sealing is required to prevent surface
contamination which can degrade the circuits.
Chip sealing can be accomplished by a sealing
cap on every chip, by sealing the entire multichip
substrate (Figure 3), or by a sealing coat of silicon nitride on the chip. The nitride sealing method
is by far the most attractive but is not yet a well
developed production technology for integrated
circuits. The most common and practical alternative is sealing the entire module. This requires a
large area seal that is not easy but can be done.
Memory circuits

Read-write cells
Bipolar
Figure 4 illustrates the most common bipolar
memory cell in linear select and coincident select
form. This cell operates with the word line at a
low potential for standby. When the word line is
raised the information in the cell is read out by
sensing the current in the bit line, or is written
into by holding down one bit line, thus forcing the
transistor on the held down side to be turned on.
The major disadvantage of this cell is that the
standby power dissipation is higher than the
power dissipation when addressed. This problem

FIGURE 4-Bipolar cells

can be alleviated for the linear select cell by treating both the illustrated word line and the +V line as word lines. Both lines can be raised for addressing, and minimum supply voltage for standby operation can be utilized.8
Other cell circuits can be used13,14,15 and four layer devices have been proposed,16 but the cell of Figure 4 will probably be the most common bipolar cell for the near future in large memory arrays due to its simplicity.
MOS cells
Figure 5 illustrates the most commonly used MOS cell17,12,13,14 in both linear select and coincident select form. The cell is operated by turning on the MOS transistors to connect the cell to the bit lines. The bit lines can then be sensed to determine which state the cell is in. For writing into the cell, a differential voltage is impressed across the bit lines to force the cell into the desired state. To achieve the best performance from this cell the word lines (or x and y lines) are driven from powerful bipolar drive circuits, thus providing very fast addressing to the cell. If the

FIGURE 3-Multichip module assembly

FIGURE 5-MOS cells


bit lines are held negative during the sense time, then a differential current will appear on the bit lines as soon as the word line is on. This differential current can be sensed by a bipolar differential thresholding circuit with very little voltage change on the bit lines. Thus, the poor capability of MOS devices to drive large capacitance has been overcome by using bipolar drive and sensing. The resistance of the MOS resistors is of little importance in this cell since signal current is drawn through the ON switch and the write voltages (again bipolar) will force both sides of the cell to the correct potential. Therefore, these load resistors need only provide leakage current or be supplied by asynchronously pulsing the -V line. In this way, standby cell power can easily be in the microwatts. Peripheral circuit power and transient power are the major sources of power dissipation in the MOS memory.

Complementary MOS

Figure 6 illustrates a complementary MOS memory cell. This is not suggested as an optimum cell but is representative of the complexity of complementary MOS cells.5,6 In this cell transmission gate A is normally on to close the flip-flop loop. To read the cell, transmission gate B connects the cell to the bit line and the state of the cell can be sensed. To write into the cell, transmission gate A is turned off and transmission gate B is turned on so that the flip-flop will be forced to the state impressed on the bit line.

FIGURE 6-Complementary MOS cell

Comparative evaluation

In all three technologies the memory organization will be repetitive arrays of dense cells and specialized peripheral circuitry. Only the complementary MOS is likely to be treated as having logic compatible signal levels; the MOS and bipolar will be used with very specialized bipolar interface circuits. Table I is a brief comparison of the three cell technologies assuming present pilot line technology. The designs are assumed optimized for cost with speed a secondary factor. The speed here means the cycle time in a complete memory system of the order of 10^5 bits.

TABLE I-Cell technology comparison

                   Cell Area    Speed    Power (cell
                   (sq. mil)    (nsec)   only) (mW)
Bipolar                10        100        0.5
p-channel MOS          15        250        0.01
Compl. MOS             80        200        10-0

The cell densities of bipolar and pMOS are close enough that cell cost will not be a determining factor. The complementary MOS density, however, is poor enough to shut it out of a raw cost race. Furthermore, it offers no major advantage over p-channel MOS in speed; therefore, the first conclusion that can be drawn is that complementary MOS is most suited to micropower applications where very low standby power is essential. This market area is sufficiently large to assure development of complementary MOS memories.
The bipolar vs pMOS trade-off is more involved. The bipolar has a speed advantage with no direct cell cost penalty; however, the 0.5 mW power level causes definite thermal problems. A 4K memory module could be assembled on less than 1 sq. inch of substrate but would dissipate 2 watts of power. Thus, simple air convection cooling is inadequate, although other cooling methods can easily handle this power density. The pMOS cell does not have this problem and can be packaged to the mechanical limit with simple cooling.
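The thermal figure quoted above is easy to check; a quick sketch using only the numbers from the text (4096 cells at 0.5 mW each on roughly one square inch):

```python
# Sanity check on the bipolar thermal figures quoted in the text.
cells = 4096                 # 4K memory module
power_per_cell_mw = 0.5      # bipolar cell standby power, from Table I
area_in2 = 1.0               # "less than 1 sq. inch of substrate"

total_w = cells * power_per_cell_mw / 1000.0
print(f"~{total_w:.1f} W dissipated over {area_in2} sq. inch")
```

This lands at about 2 W, matching the text's claim that simple convection cooling is inadequate at that density.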
The big market is in the low cost/good performance area. There is no clear cut winner between

bipolar and pMOS for memory cells as the tradeoffs are too close. Both will be developed and a
final edge of one over the other will require some
dramatic development in one of the technologies.
Bipolar technology will always have an edge in
speed, however, because speeds much faster than
100 nsec system cycle time are achievable. Therefore, bipolar cell technology is assured a niche in
the memory market.
On-chip decoding

As cell density increases there is increasing difficulty in getting the interconnections off the chip.
For example, a 4 mil by 4 mil linear select cell will
have leads on 4 mil centers in one direction and 2
mil centers in the other. These leads can be
brought off on alternate sides around the chip
giving 8 mil pad to pad centers on two sides and
4 mil centers on the other two. Coincident select
cells require only 8 mil centers on all sides but the
cells will be larger with a given technology.
These lead centers are manageable in terms of
actually making the connections to the substrate
but the substrate problem can be serious. It is
very difficult to place 4 mil centers on a ceramic
substrate, although such placement is not too bad
on silicon. Unfortunately, the unscrambling around the chip will be a maze and will almost certainly require complex multilayer substrate interconnection.
If the lead density can be significantly relieved,
then single layer substrates can be considered
since part of the interconnections can be reflected
onto the chips if more pads are available. This all
leads to the consideration of on-chip decoding. On-chip decoding must be very simple since it must
be duplicated on many chips. The simplest decoding is diode networks in bipolar and series gating
in pMOS. In both cases on-chip address inversion
is prohibitive. In bipolar the complexity is too
severe and in pMOS the speed penalty is too severe. Assuming on-chip decoding with non-inverting logic, three choices of input are available:
1) True and complement binary signals. For 2^n bits, 2n lines are required. The internal decoding gates are n-input gates.
2) Multi-dimensional decoding. For 2^n bits, two dimensional decoding requires 2(2^(n/2)) lines (n even) and a decoding gate fan-in of 2. Three dimensional decoding requires as little as 3(2^(n/3)) lines with a gate fan-in of three. Multidimensional decoding can be continued to the point that each dimension is only 4 lines wide (plus possibly one 2 line dimension), where the number of lines required is only 2n and the number of gate inputs is n/2 (n even) or (n+1)/2 (n odd). Thus, the number of inputs is the same as for true-complement inputs but the decoding gates are simpler.
3) Combinatorial decoding. If there are m input lines and the internal decoding gates have n inputs, then C(m,n) independent signals can be decoded. For example, with 12 inputs and 3-input gates the number of lines that can be selected on chip is C(12,3) = 220, whereas the multidimensional case would also use 12 inputs and 3-input gates but only select 2^6 = 64 lines.
Of the three input structures the multidimensional decoding is superior to true-complement input and the input code is very easy to construct.
The combinatorial decoding is the most efficient
on the chip but the input code is very complex to
construct and the number of select lines is not
inherently a power of 2. Since the multidimensional decoding is sufficient to solve the lead problem (for example 16 lines for 256 bits vs 48 lines
for the undecoded linear select cell), the complication of combinatorial decoding is not required.
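The line counts for the three input structures can be tabulated with a short script. This is a sketch built on the text's own worked example (n = 6, i.e. 64 on-chip lines, 12 input lines, 3-input gates); the variable names are mine:

```python
from math import comb

n = 6                                  # select one of 2**n = 64 on-chip lines

# 1) true and complement binary: 2n lines, n-input decode gates
tc_lines, tc_fanin = 2 * n, n

# 2) multidimensional, 4 lines per dimension: n/2 dimensions of 4 lines each
md_lines, md_fanin = 4 * (n // 2), n // 2

# 3) combinatorial: the same 12 lines with 3-input gates could select C(12,3) signals
combo = comb(md_lines, md_fanin)

print(f"{tc_lines} lines, {tc_fanin}-input gates (true/complement)")
print(f"{md_lines} lines, {md_fanin}-input gates select {2**n} (multidimensional)")
print(f"{md_lines} lines, {md_fanin}-input gates select {combo} (combinatorial)")
```

As the text notes, multidimensional decoding needs no more input lines than true/complement signalling but gets by with much simpler gates, while combinatorial decoding buys selectivity at the price of a complex input code.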
Figure 7 illustrates decoding networks for the
bipolar cell. Both word line decoding and bit line
decoding are possible. On the word line all diode
inputs must be high to permit the selected word
line to rise. On the bit lines all inputs must be high
on the selected bit line pairs to permit the cell
current to be coupled through the output diodes to
the bit line bus. Note that by using both word line
and bit line decoding the decoding is broken into
two networks, so that for 256 bits, two 4 × 4 decoding networks can be used, requiring only 2-input gates.
Figure 8 illustrates decoding for the pMOS
structure. Series gating is the most desirable
form of decoding since the series gates can be
made with low on impedance and driven by bipolar drivers. Word line decoding is again possible; however, the drive voltage of the series gate
must be somewhat higher than the highest potential expected on the line and since the word

FIGURE 7-Bipolar cell decoding

line is already driving a series gate, the input voltages must be quite high. On the bit lines, however, the series decoding gate drive required is the same as the normal word line, so that bit line decoding is more desirable since the voltage swings are somewhat smaller.

FIGURE 8-MOS cell decoding

Peripheral circuit consideration

The peripheral circuitry for semiconductor memory has many of the attributes of peripheral circuitry for magnetic memory. In particular, the largest possible fan-outs and fan-ins are desirable such that powerful drivers and sensitive sense circuits are used. This provides a minimum peripheral circuit overhead cost.
The pMOS word lines and bit lines are generally capacitive, requiring drive voltages in the 5 to 20 volt range with sense currents in the order of 0.1 mA. The high voltage capacitance drive required leads to active pull-up drivers. Transient power at these voltages can be the dominant power requirement of the system.
Bipolar cell drive circuits require a low voltage swing (in the order of 1 volt) but currents in the range of 0.1 to 0.2 mA per bit; therefore, bipolar cells use a low voltage current driver, and driver power is small compared to cell power. Bipolar sense circuits must sense 0.1 mA signals with small bit line perturbation, so the sense circuit requirements are quite similar between bipolar and pMOS cells.

Other memory forms
Other memory structures than simple read-write are possible. The most commonly considered are associative memories,19,20 multiport memories and read-only memories. Of these three, read-only memories are making the biggest splash in the market place.
Both bipolar and pMOS read-only memories are available from semiconductor manufacturers today. Both contain complete logic compatible decoding on the chip and use a fixed connection pattern applied during fabrication. Thus, there is a tooling charge to get the pattern desired. The bipolar version is somewhat faster but the pMOS version is somewhat lower cost. The pMOS memory is stored by selectively creating or connecting to a dense array of MOS transistors. The bipolar memory is stored by selectively connecting to an array of diodes or emitter followers. The actual array densities can be quite similar; however, the logic compatible pMOS decoding is much more dense than the logic compatible bipolar decoding.

An economic example
Table II provides an illustration of the economic factors in semiconductor memory. This is an example of potential 1972 costs in high volume production. A 4096 bit module is assumed, using 16 256-bit bipolar memory chips, 4 decode-drive chips and 2 sense-digit drive chips. The packaging

is assumed as a single layer interconnect ceramic substrate with repairable upside-down chip attachment. The module would have about 70 nsec read or write cycle time and would dissipate about 2.5 watts.
TABLE II-Module cost

Cost Center          Cost Each   Total/Module   Cost/Bit (cents)
256 bit chip           $1.00        $16.00           0.4
Decode/drive chip       0.75          3.00           0.075
Sense/digit chip        1.00          2.00           0.05
Substrate              10.00         10.00           0.25
Attach cost             0.50         11.00           0.275
Test                    2.00          2.00           0.05
                                     44.00           1.1
25% yield loss                       11.00           0.275
Repair                                1.00           0.025
Retest                                1.00           0.025
                                     57.00           1.43
5% yield loss                         3.00           0.075
                                     60.00           1.5

Table II illustrates module cost alone and does not include system packaging or power. The potential for system packaging compatibility and at least partial power supply compatibility are favorable factors in semiconductor memories.
CONCLUSIONS
Semiconductor memory is looming as a contender for a major portion of the computer main frame memory market. Complementary MOS and bipolar technologies are assured of a niche in the micropower and very high speed areas respectively. The major battle will be between bipolar and p-channel MOS for the low cost (and high volume) segment of the market.
The semiconductor and packaging technologies required are in an advanced state of development, and a major impact should be seen within the next few years.
REFERENCES
1 D A HODGES
Large-capacity semiconductor memory
Proceedings of the IEEE Vol 56 No 7 July 1968
2 J E IVERSON J H WOURINEN JR B T MURPHY D J DSTEFAN
Beam-lead sealed-junction semiconductor memory with minimal cell complexity
IEEE Journal of Solid-State Circuits Vol SC-2 No 4 pp 196-201 December 1967
3 P PLESHKO L M TERMAN
An investigation of the potential of MOS transistor memories
IEEE Transactions on Electronic Computers Vol EC-15 No 4 pp 423-427 August 1966
4 D E BREWER S NISSIM G V PODRAZA
Low power computer memory system
AFIPS Conference Proceedings Vol 31 pp 301-393 FJCC 1967
5 J H FRIEDRICH
A coincident-select MOS storage array
Digest of Papers pp 104-105 ISSCC 1968
6 J R BURNS J J GIBSON A HAREL K C HU R A POWLUS
Integrated memory using complementary field-effect transistors
Digest of Papers pp 118-119 ISSCC 1966
7 J F ALLISON F P HEIMAN J R BURNS
Silicon on sapphire complementary MOS memory cells
IEEE Journal of Solid-State Circuits Vol SC-2 No 4 pp 208-212 December 1967
8 R S DUNN G E JEANSONNE
Active memory design using discretionary wiring for LSI
Digest of Papers ISSCC 1967 pp 48-49
9 R S DUNN
The case for bipolar semiconductor memories
AFIPS Conference Proceedings Vol 31 pp 596-598 FJCC 1967
10 P SCHARLT T COLEMAN K AVELLAR
Flip component technology
Proceedings 1967 Electronic Components Conference pp 269-275
11 M P LEPSELTER
Beam lead technology
Bell System Technical Journal Vol 45 pp 233-253 February 1966
12 J MARLEY J H MORGAN
Direct interconnection of uncased silicon integrated circuit chips
Proceedings 1967 Electronic Components Conference pp 283-290
13 G B POTTER J MENDELSON S SIRKIN
Integrated scratch pads sire new generation of computers
Electronics Vol 39 No 7 pp 119-126 April 4 1966
14 H A PERKINS J D SCHMIDT
An integrated semiconductor memory system
Proceedings Fall Joint Computer Conference 1965 pp 1053-1064
15 I CATT E C GARTH D E MURRAY
A high speed integrated circuit scratch pad memory
Proceedings Fall Joint Computer Conference 1966 pp 315-331
16 R P SHIRELEY
SMID A new memory element
Proceedings Fall Joint Computer Conference 1965 pp 637-647
17 J D SCHMIDT
Integrated MOS transistor random access memory
Solid State Design January 1965 pp 21-25
18 A W BIDWELL
A high speed associative memory
Digest of Technical Papers ISSCC 1967 pp 78-79
19 R IGARASHI T KUROSAWA T YAITA
A 150-nanosecond associative memory using integrated MOS transistors
Digest of Technical Papers ISSCC 1966 pp 104-105

2-1/2D core search memory
by PHILIP A. HARDING and MICHAEL W. ROLUND
Bell Telephone Laboratories, Inc.
Naperville, Illinois

Usual memories allow addressing of individual word lines, with each word line containing an assemblage of bits. Bit detectors which can sense all words are utilized to read the word contents. Associative memories allow bit lines to be addressed, as well as word lines, to determine which set of words match the input bit states. Such memories are useful because they eliminate time consuming word hunting in table look up operations, and because they allow easy access to specific words highlighted by activity or flag bits.
Unfortunately, most associative memories proposed are expensive. Possibly circuits based on large scale integration may allow low cost associative operations, but they are not economically feasible today. Core associative memories have rarely been proposed; those described have complex memory mat structures.1 In most cases, the cost penalties far outweigh the usefulness of the true associative array.
The semiassociative solution, one in which a segment of any single bit line in an array can be addressed to read out the corresponding bit locations for a number of words, may be the compromise that finds a wide range of applications.2 This type of memory, defined as a single bit search memory, is symbolically illustrated in Figure 1. Such a memory may be operated in the ordinary sense; one of the "n" unique addresses can be selected to read the "m" bit word contents. Similarly, a unique address may be chosen and the m bits can be independently written. The figure illustrates the selection of address "Ai" either for reading or writing the "m" bit contents.
In the search mode, a single block of K address locations associated with any single word bit, Bj, may be selectively read or written. The figure highlights the Bjth bit of words 1 through K in block 1 and the Bjth bit of words SK through (S+1)K in block S as possible search words. The entire word field of n can be arranged in n/K blocks of K words each to facilitate the search mode. It has been found that a modified 2-wire or 3-wire 2-1/2D core memory can economically

I-"- BLOCK 1 - .

r-'

WORD',
WORD
BLOCK S --,
BLOCK

%

1

III

BIT

u

4

3

2
1

l1li
1234

1\

SK

,

(S+I)K

n

AODRESS

FIGURE l-Definition of a single bit search memory

achieve the single bit search memory function if only
one m-bit normal address word or one K-bit search word
is selected during one memory cycle.

2-1/2D, 2-wire memory
A single bit search memory can be thought of as an
extension of the conventional, 2-1/2D, 2-wire core
memory shown in Figure 2, which was reported on in an
earlier paper.3 In a 2-wire, coincident current memory,
the readout voltage from a particular core must obviously be sensed across one of the two drive wires intersecting the core. In the memory of Figure 2, the core array is composed of a front and rear plane, with the readout voltage being sensed differentially across a pair
of selected bit wires, one in each plane. The centertapped conneetion of the bit readout transformer also
forces the bit drive current to divide equally between
the front and rear planes. The group selection cirouit
provides a virtual ground to the selected pair of memory
wires in each bit while simultaneously isolating the nODselected wires, whioh are multipled to the readout transformers at the top of the array. Hence, any noise voltages induced on the nonselected wires are not coupled
into the readout. The group selection circuit is formed
of diode rails conneoted in a driven bridge configuration


FIGURE 2-2-1/2D core memory

FIGURE 3-A single bit search core memory

of the type described in Reference 4.
The word access consists of a diode matrix driving folded word loops, each of which intersects two cores on a given bit line. Since the word and bit currents will add in one of the cores and cancel in the other, the direction of word current is used to select one core or the other, thus reducing the number of word access circuits required by a factor of two.4 The looping of the word wire has a number of additional advantages. The cores can be oriented in-line, rather than in a diamond pattern, increasing packing density. The shuttle voltages due to word current tend to cancel. And finally, no more than half of the cores on a bit line can be disturbed by word current into a worst case delta noise state.4
A word is read out of this memory in the following fashion. Bit read current is applied first, causing a large noise spike in the readout. When this noise has expired, the word current is applied, reading out the state of the m bits of the word. The word is rewritten into memory by reversing the word current and applying reverse bit write current to those locations where a "one" is to be stored. This timing is indicated in Figure 4.

Single bit search memory
The word access of Figure 2 supplies current to only
one selected word loop. At little additional cost, we can
perform the same selection process with an access that
is virtually identical to that used in the bit dimension,
as shown in Figure 3. We can select a pair of word lines
(rather than a single loop) by energizing the appropriate
word driver and word group selection circuit. Note that
the cores on the front and rear planes are oppositely
phased with respect to the word current so that only one

FIGURE 4-Memory address sequence: bit driver current, word driver current, bit readout

core is selected per bit. However, we now have the added flexibility of being able to read out in the word dimension. That is, rather than energizing m bit drivers and then a single word driver and finally sensing the readout on m pairs of bit wires, we energize all K word drivers first and then a single bit driver while sensing the readout on K pairs of word wires. Thus, we can read out the state of K different word locations of a given bit in a "search" mode, or all m bits of a word in the normal or "address" mode. The timing for the search mode is illustrated in Figure 5. The readout waveform is the same in either mode, but appears across the bit transformers in the address mode and across the word transformers in the search mode.

4-plane single bit search
The word wiring scheme of Figure 3 lacks the shuttle

FIGURE 5-Memory search sequence: bit driver current, word driver current, word readout

and delta noise reduction advantages of the word line
looping of Figure 2. These advantages can be regained
by using the four-plane configuration of Figure 6. The
word and bit lines are wired so that only one plane receives a simultaneous word and bit current. This is indicated in Table 1.
TABLE 1-4 memory plane wiring

Core Selected    Bit Lines     Word Lines
Plane 1          Planes 1,2    Planes 1,4
Plane 2          Planes 1,2    Planes 2,3
Plane 3          Planes 3,4    Planes 2,3
Plane 4          Planes 3,4    Planes 1,4

FIGURE 6-Improved search memory access

Search memory operation

Let us consider the operation of a search memory of n words, having m bits per word, and with the word access divided into K segments (each segment having an independent driver, detector, and register cell). The block diagram of such a memory is shown in Figure 7.

FIGURE 7-Single bit search memory block diagram


In a normal or address mode, the address to be interrogated is supplied in two parts; log2(n/K) address bits are supplied to the address register, which controls the bit group and word group selector circuits. As an example, let us assume equal bit and word group selector circuits. Then the bit group selector circuit decodes 1/2 log2(n/K) inputs and selects one of the √(n/K) groups of m bits each. The word group selector decodes the remaining 1/2 log2(n/K) inputs and selects one of the √(n/K) groups of K bits each.
The remaining 1 out of K address selection is performed via the word data input, which energizes the appropriate 1 out of K word drivers. All m bit drivers are activated and readout is accomplished via the m bit detectors.
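As a concrete illustration of this address split, consider a hypothetical sizing of n = 8192 words, m = 24 bits and K = 32 words per search block (the 24 bits/word and 32 words/search figures are borrowed from the cost curves of Figure 12; the sizing itself is an assumption):

```python
import math

n, m, K = 8192, 24, 32    # words, bits/word, words per search block

group_bits = int(math.log2(n // K))   # bits supplied to the address register
half = group_bits // 2                # split equally between the two selectors
groups = 2 ** half                    # groups seen by each selector circuit

print(f"{group_bits} address-register bits")
print(f"bit group selector: 1-of-{groups} groups of {m} bits")
print(f"word group selector: 1-of-{groups} groups of {K} bits")
print(f"final 1-of-{K} selection via the word data input")
```

For this sizing, 8 address bits reach the register, each selector decodes 4 of them into a 1-of-16 choice, and the last 1-of-32 choice rides in on the word data lines.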

In a search read mode, log2(n/K) address bits are again supplied to the address register to control the group selector circuits. However, now the roles of the word data and bit data inputs are interchanged. All K word drivers are energized, corresponding to the K addresses over which the search is to be made, while only a single bit driver is energized, dependent on which bit is to be searched. Readout is accomplished via the K word detectors.

Flag bit memory

As mentioned earlier, one of the attractive applications of a search memory is in the reduction of hunting for an active or "flagged" word. If a single bit of the word is reserved to indicate activity in the remainder of the word, that particular bit can be called a "flag" bit. If one performs a search read on the flag bit, then those locations in the word register which are set will correspond to the words which are "active," and subsequent normal reads can be used, with the active word locations automatically stored in the word register, to determine the entire contents of the active words. If multiple words are flagged in a word block, some form of priority selection may be necessary. In the case of few active words, the flag bit search can reduce the hunting time by a factor of K. In practice, the reduction can be as great as one or two orders of magnitude.

Ripple search

In the case where more than a single flag bit is required to locate a desired address in memory, a "ripple search" technique can be employed. This simply involves sequentially searching through the flag bits, eliminating all addresses which mismatch on any of the desired flag bits. The matching operation is readily accomplished by using a word register such as the one shown in Figure 8. The register is initially set to the "1's" state by a timing pulse. If the readout from a given word detector mismatches the match bit Bj, the corresponding flip-flop in the word register is reset. On the subsequent search read, the match bit becomes the second desired flag bit Bj+1, and those readouts which mismatch will again reset their corresponding word register flip-flops. Thus, any word flip-flop which remains set after all of the flag bits have been searched corresponds to an address whose flag bits match all of the desired flag bits.
The following example further illustrates the ripple search technique. Suppose that the desired flag bits Bj through Bj+3 are 1001 and that the memory contents

FIGURE 8-Ripple search word register

MEMORY CONTENTS OF WORD BLOCK "S"
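The ripple search procedure can also be sketched in software. The reset-on-mismatch register model follows the Figure 8 description above; the block size K and the flag-bit contents are made up for illustration, while the desired pattern 1001 is the one used in the text's example:

```python
# Software sketch of the ripple search (hypothetical block contents).
K = 8                                  # words per search block
desired = [1, 0, 0, 1]                 # desired flag bits Bj..Bj+3

# flag-bit contents of the block: flags[i][w] = bit B(j+i) of word w
flags = [
    [1, 1, 0, 1, 1, 0, 1, 0],          # Bj
    [0, 1, 0, 0, 1, 0, 0, 0],          # Bj+1
    [0, 0, 0, 1, 0, 0, 1, 0],          # Bj+2
    [1, 0, 0, 1, 1, 0, 1, 1],          # Bj+3
]

register = [1] * K                     # word register preset to all 1's
for bit, want in zip(flags, desired):  # one search read per flag bit
    for w in range(K):
        if bit[w] != want:             # mismatch resets that flip-flop
            register[w] = 0

matches = [w for w in range(K) if register[w]]
print("words matching 1001:", matches)
```

Only the words whose flip-flops survive every search read remain set, which is exactly the matching condition described for Figure 8.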

for large sized memories.
The memory described is basically a 2-1/2D 2-wire core store; it therefore has the speed, size and cost limitations inherent to such systems. Microsecond operation for a few million bits is entirely feasible. However, it is also possible to extend the scheme to 3-wire systems for increased speed capability.

FIGURE 12-2-1/2D core address and search memory costs: relative cost/bit vs number of search bits for an 800,000 and 6,400,000 bit memory

REFERENCES
1 W L McDERMID H E PETERSON
A magnetic associative memory system
IBM Journal of Research and Development Vol 5 No 1 January 1961 pp 59-62
2 H S STONE
Associative processing for general purpose computers through the use of modified memories
1968 Fall Joint Computer Conference
3 P A HARDING M W ROLUND
Novel low cost design for 2-1/2D storage systems
1967 Solid State Circuits Conference Digest of Technical Papers Vol X IEEE Catalog No 4C-49 pp 82-83
4 P A HARDING M W ROLUND
Bit access problems in 2-1/2D 2-wire memories
1967 Fall Joint Computer Conference Proceedings Vol 31 pp 353-362

Design of a small multiturn magnetic thin film memory
by WILLIAM O. SIMPSON
Texas Instruments, Incorporated
Dallas, Texas

INTRODUCTION
Since Pohm* introduced the concept of multiturn windings as a means for improving the efficiency of planar thin film memory elements and thereby lowering the cost, very little work has been published on memories using this technique. Planar thin film memory elements typically require word currents on the order of 500 mA and bit currents on the order of 150 mA, while signals are in the range of 1 to 2 mV. Multiturn windings can be used for sense lines and/or word lines to improve the efficiency of planar film elements because the drive requirements are inversely proportional to the number of turns while the signal output is theoretically directly proportional to the number of turns. This paper describes the design of a 74000 bit planar thin film memory using a multiturn sense-digit structure but using single lines for word drive.
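The scaling argument can be made concrete with a rough, first-order calculation; the single-turn figures come from the paragraph above, the 4-turn case matches the sense-digit coils described below, and resistance and coupling losses are ignored:

```python
# First-order multiturn scaling: drive current ~ 1/N, signal ~ N.
single_turn_bit_current_ma = 150.0   # typical single-turn bit drive (from text)
single_turn_signal_mv = 1.5          # mid-range of the 1 to 2 mV signal quoted

for n_turns in (1, 2, 4):
    drive = single_turn_bit_current_ma / n_turns
    signal = single_turn_signal_mv * n_turns
    print(f"{n_turns} turn(s): ~{drive:.1f} mA bit drive, ~{signal:.1f} mV signal")
```

Under these assumptions a 4-turn sense-digit line brings the bit drive down toward integrated-circuit current levels while multiplying the sense signal, which is the compatibility goal stated in the next section.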

Design goal

The prime purpose for the design of this system was to obtain a planar thin film memory concept which would be compatible with standard integrated circuits and yet retain relatively high performance characteristics. Cost, power, and reliability, as always, were also important considerations.

Film characteristics

The magnetic films used in this system are continuous sheets of 1100 A Ni-Fe-Co film evaporated on electro-chemically polished aluminum substrates approximately 3 in. square (Figure 1). A layer of SiO is deposited over the film for protection. Typical values for Hc and Hk are 5.0 and 4.8 respectively. The aluminum substrate serves as a ground plane and also plays an important role in obtaining virtually creep-free films.

*Pohm, A. V., "Magnetic Film Scratch-Pad Memories," IEEE Transactions on Electronic Computers, Vol. EC-15, No. 4, August 1966.

FIGURE 1-Film plane

System characteristics

The memory is organized into 1024 words of 72 bits each, using 24 film planes in two 3 x 4 back-to-back arrays as shown in Figure 2. Word


Fall Joint Computer Conference, 1968

FIGURE 4-Bit geometry cross section

FIGURE 2-Film plane array

lines are etched from copper-Mylar** laminate and
are 10 mils wide on 20 mil centers, approximately
10 inches long. Sense-digit lines are made from
insulated 4 mil round wire wound into 4-turn flat
coils approximately 13 inches long held in place
by Mylar cladding. These flat coils shown in Figure 3 are on 80 mil centers. Since there are two
**Trademark of E. I. du Pont de Nemours & Co.


crossings of each coil for each word line, there are necessarily two memory elements per bit. Furthermore, since the flat coil serves both as the sense signal pickup coil and the bit drive line, the two elements receive opposing easy-direction fields and are always magnetized in opposite directions. Thus, the signals from the two elements, switched by rotation in the conventional DRO mode, are additive in the coil and appear in differential mode at the coil output terminals. As Pohm indicated, the sense line does not behave like a transmission line but like a lossy inductive pickup coil. If this were not the case, the signals would not be additive at the output terminals because of time delays in the 13 in. long coil.
The memory cell cross section, shown in Figure 4, consists of the aluminum plate with film on the upper surface, an overlay of word lines followed by an overlay of sense-digit lines, the word


FIGURE 3-Memory element geometry

FIGURE 5-Sense-digit channel

Design of Small Multiturn Magnetic Thin Film Memory 1221

Diodes are included on both ends of the word lines to prevent "half select" currents in unselected lines during word current transients.
All of the logic circuitry is implemented using standard series 74 networks. System timing is achieved through the use of tapped delay lines, so that no one-shots or other adjustable devices are required. In fact, there are no controls or other variable elements in the entire memory.

Physical specifications
The memory package shown in Figure 10 is designed for 19 in. rack mounting with a panel height of 3 1/2 in. and a total depth of 20 in. The weight is approximately 35 lbs. The power dissipation of the memory is less than 60 watts. This relatively low power level is primarily due to the saturated-stage nature of the TTL networks used throughout.

Memory performance characteristics
The sense amplifier analog output is shown in Figure 11. Signals for both 1s and 0s are shown superimposed for one particular bit channel with all 1024 words of the memory operating. Note that the recovery pattern indicates a minimum cycle time of 350 ns can be achieved using this technique, although the memory was originally designed for application as a 500 ns cycle time system. Similarly, the design goal for an access time of 250 ns was also achieved.
CONCLUSION

FIGURE 10-Memory photograph

A planar magnetic thin film memory has been designed and built by Texas Instruments using all integrated circuits for electronics, achieving a cycle time faster than 500 ns and an access time of 250 ns. The memory is organized as 1024 words by 72 bits in order to balance the costs of the word drive circuits against the sense-digit circuits. The inherent advantage of this particular organization is that the computer can achieve a speed advantage not only because of a fast repetition rate, but also because four 18 bit words are accessed simultaneously. (Comparable core memory designs are ordinarily organized as 4096 words of 18 bits each.) The outlook is for higher speed (faster than 150 ns) memories in similar organizations to be developed in planar magnetic films. The cost of these memories will be competitive with 2-1/2D core memories of the same capacity, but the organization and speed can be considered to offer at least a 4:1 improvement in multiple word accessing and a 3:1 improvement in speed. As a result, more computers will be designed to take advantage of the long word, either by extending the word length of the computer itself or by ordering instructions and data in such a manner that sequential addressing will be required a large percentage of the time.

An adaptive sampling system for hybrid computation
by GEORGE A. RAHE
Naval Postgraduate School
Monterey, California

and
WALTER KARPLUS
University of California
Los Angeles, California

INTRODUCTION

The concept of sampling is central to the operation of all systems in which analog information is to be processed by a digital computer. In conventional hybrid computing and data-processing systems the continuous analog signal is represented by an amplitude-modulated pulse train in which the pulses occur at fixed intervals of time. Such synchronous sampling facilitates control by the digital computer clock and requires a minimum amount of equipment. In many applications, however, it is important to minimize the number of samples employed to represent the analog signal. For example, in telemeter applications it is important to economize transmitter power by limiting the number of samples transmitted over long communication links.

In hybrid computation, power conservation is not a primary objective, but high sampling rates often severely tax available digital computer memory capacity and the band-width of data channels. Not infrequently, high sampling rates limit the number of analog channels which can be accommodated by a given analog-digital interface. Accordingly, a variety of so-called data-compression techniques have been proposed, techniques which are designed to reduce the number of samples which must be transmitted across an analog-digital interface without exceeding specified error bounds.

The system described in this paper represents a novel approach to this problem. It differs from conventional data-compression techniques in that the analog signal is modified or subjected to an approximation prior to sampling. The theory underlying this method is first briefly developed below, followed by a description of a hybrid computer mechanization of the data compression system.

Redundancy in synchronous sampling

In most data processing and hybrid computing systems, the sampling rate is dictated by specified error bounds upon the reconstructed signal. If the continuous analog signal is sampled and processed by a digital computer, it must be possible to reconstruct a continuous signal from the samples so that the maximum difference between the reconstructed and the original signal nowhere exceeds a specified tolerance. In accordance with the well-known sampling theorem, this sampling rate is based on the largest magnitude and the highest frequency components the signal is expected to attain. Actually, the analog signal may never attain these maximum magnitudes or frequencies, or it may attain them for only brief periods of time. Therefore, the utilization of fixed sampling usually leads to a large number of unnecessary samples, samples which can be eliminated without deteriorating the quality of the reconstructed signal. In essence, the synchronous train of samples contains redundant information, and the various data-compression schemes are intended to minimize this redundancy. A number of proposed data-compression methods involve the suppression of redundant samples. In that case the analog signal is sampled, and the sample is analyzed to determine whether samples can safely be omitted from the signal transmitted to the digital computer. An alternate approach, proposed in this paper, involves the utilization of an adaptive sampling system so that the analog signal is sampled only as often as necessary, but all samples actually taken are transmitted to the digital computer. The sampling interval is therefore continuously and automatically controlled as a function of the analog signal activity.


Accuracy constraints

The accuracy demanded of a sampled data system of the type considered in this paper is dictated first of all by the characteristics of available hardware. Thus it is unreasonable to attempt to reduce the reconstruction error below the combined magnitudes of the various error sources inherent in the hybrid system. These error sources include particularly the drift, zero-offset, phase-shift, and noise in the analog portion, and the quantization error (related to the word-length) in the digital computer.

The purpose to which the data is to be put also dictates the form of reconstruction and therefore the control laws for the sampling operation. Consider for example the operation of graphic CRT terminals in hybrid computation, where analog signals are to be displayed on one or more such terminals. Commercially available terminals represent a function by a series of straight line segments to an accuracy of from 1% to .5%. Line segment generators require only the origin and terminus of a line segment to produce the required line. For the purposes of this type of application, a sampling control law is required which will provide a reconstruction by linear interpolation to a predefined maximum error, with the minimum number of line segments, and in real time. In addition, the approximation must be continuous at the end points of the line segments in order not to be objectionable to the user.
Definition of the approximation - continuous secant

The nature of the reconstruction to be considered and some of the properties of the approximation will now be described. Consider that the function φ(x), continuous on a closed interval (a,b), is to be approximated by line segments P_i(x) over subintervals Δ_i, i = 1, 2, ..., n, such that1

Max |φ(x) - P_i(x)| ≤ E,   x ∈ Δ_i          (1)

where

Δ_i = (x_{i+1} - x_i)   and   h_i = φ(x_{i+1}) - φ(x_i)

The set of line segments P_i(x) which form the best approximation to φ(x) are defined to be those which minimize the number of segments n for a given predefined tolerance E. The requirement that φ(x) be continuous in the mathematical sense will of course present no restriction on mechanizable functions.

It can be readily shown3 that a minimal (non-unique) number of line segments on a closed interval a ≤ x ≤ b is made up of the set determined by the maximum line segment with origin at the point (a), maximizing the length of each succeeding line segment, adjoining them at their end points, until a segment is determined which contains the point (b).

The problem of finding the optimum sampling points is thus reduced to determining the largest value of Δ_i in any interval which satisfies the predefined error.

Determination of approximation interval

Before proceeding with the derivation of the control laws for determining the maximum approximation interval, it will be advantageous to consider the nature of the function φ(x) to be approximated. The continuous function φ(x) to be approximated on a certain finite interval has, in most cases, a fixed direction of concavity (upward or downward) which changes only a finite number of times. Such functions will be referred to here as "piecewise convex or concave."

Without loss of generality, the study of a concave function φ(x) is reduced to that for the convex function -φ(x), defined as follows. Definition: A function φ(x) is convex on the interval (a,b) provided that2

φ(μβ + (1 - μ)α) ≤ μφ(β) + (1 - μ)φ(α),   (α, β) ⊂ (a, b),   μ ∈ (0, 1)          (2)

The plot of a convex function φ(x) in cartesian coordinates, therefore, is characterized by the property that any arc of the plot has all of its points located not higher than the secant chord that joins its end points. For the purposes of exposition only, consider φ(x) twice differentiable on the interior interval x_i < x < x_{i+1}; then P_i''(x) = 0, where P_i(x) = a_i + b_i x, and

Σ (i=1 to n) Δ_i = (a, x_1) + (x_1, x_2) + ... + (x_{n-1}, b) = (a, b)          (3)

A value of x = x_{i+1} is sought such that the maximum deviation of φ(x) from the secant P(x) is equal to E in the interval x_i ≤ x ≤ x_{i+1}, where

P(x) = φ(x_i) + [(φ(x_{i+1}) - φ(x_i)) / (x_{i+1} - x_i)] (x - x_i)          (4)

Then there exists an x_{i+1} such that

P(x) - φ(x) = E,   x_i < x < x_{i+1}          (5)

Substituting equation 4 into equation 5,

φ(x_i) + [(φ(x_{i+1}) - φ(x_i)) / (x_{i+1} - x_i)] (x - x_i) - φ(x) = E          (6)

Defining

M3(x) = (φ(x) - [φ(x_i) - E]) / (x - x_i)          (7)

and

M2(x) = (φ(x) - φ(x_i)) / (x - x_i)          (8)

and taking the derivative of M3(x),

dM3(x)/dx = 0          (9)

from which the minimum of M3(x) occurs where

dφ(x)/dx = (φ(x) - [φ(x_i) - E]) / (x - x_i) = M3(x)          (10)

It is seen from Figure 1 that the slope of the tangent line UV is equal to the minimum value of M3(x), so that the value of x = x_{i+1} is that value of x for which

M2(x) ≥ Min M3(x),   x ∈ Δ_i          (11)

as is shown in Figure 2a.

In an entirely parallel development for the case where φ(x) is concave, defining

M1(x) = (φ(x) - [φ(x_i) + E]) / (x - x_i)          (12)

then x_{i+1} is that value of x for which

M2(x) ≤ Max M1(x),   x ∈ Δ_i          (13)

as is shown in Figure 2b.

FIGURE 1-The continuous secant method.

FIGURE 2-Behavior of the secant method for (a) convex interval; (b) concave interval; (c) concave-convex interval; (d) convex-concave interval.


Consider now the case where φ(x) is not strictly concave or convex in an interval. If the curve is concave-convex as in Figure 2c, the maximum excursion of φ(x) from Max M1(x) must have been less than E in the concave portion. Since M2(x) at x = x_{i+1} must be greater than Min M3(x) in the interval, the error in approximation over the concave portion must be less than E, which is to say that the interval is effectively convex to within the predefined error E. A similar argument follows directly for the case where φ(x) is convex-concave, as shown in Figure 2d.


Summary of control laws for the determination of sampling points

The general control laws for determination of the optimum sample points may now be summarized as follows. Defining

M1(x) = (φ(x) - [φ(x_i) + E]) / (x - x_i)

M2(x) = (φ(x) - φ(x_i)) / (x - x_i)          (14)

M3(x) = (φ(x) - [φ(x_i) - E]) / (x - x_i)

with x > x_i, the sample point x_{i+1} is given by the minimum value of x for which either of the following logical equations is satisfied:

I.  M2(x) ≥ Min M3(x),   x ∈ Δ_i          (15)

II. M2(x) ≤ Max M1(x),   x ∈ Δ_i          (16)

FIGURE 3-Mechanization of the control laws for the secant method.
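The control laws above can be sketched as a simple discrete-scan procedure. This is an illustrative digital rendering, not the authors' analog mechanization; the scan step dx stands in for the continuous variable, and the function and parameter names are ours:

```python
def secant_sample(phi, a, b, E, dx=1e-3):
    """Choose sample points so that linear interpolation between them
    stays within E of phi: from each x_i, advance x while tracking the
    running Min M3 and Max M1 until law I or II is first satisfied."""
    xs = [a]
    xi = a
    while xi < b:
        min_m3 = float("inf")
        max_m1 = float("-inf")
        x = xi + dx
        xi_next = b                      # default: run out to the end point
        while x < b:
            d = x - xi
            m1 = (phi(x) - (phi(xi) + E)) / d
            m2 = (phi(x) - phi(xi)) / d
            m3 = (phi(x) - (phi(xi) - E)) / d
            min_m3 = min(min_m3, m3)
            max_m1 = max(max_m1, m1)
            if m2 >= min_m3 or m2 <= max_m1:   # laws I and II
                xi_next = x
                break
            x += dx
        xs.append(xi_next)
        xi = xi_next
    return xs

# For phi(x) = x*x with E = 0.01 each segment comes out near 2*sqrt(E) = 0.2:
pts = secant_sample(lambda x: x * x, 0.0, 1.0, 0.01)
```

Each chord between consecutive returned points then deviates from φ by no more than about E, and the segment count is near the minimum for the chosen tolerance.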

Mechanization of the hybrid interpolator

A primary concern in the development of any sampling control law is ease of mechanization. A functional diagram of the mechanization of the proposed hybrid interpolator is given in Figure 3. While an all-analog mechanization encounters certain difficulties, modern hybrid multipliers are suited to divide by a parameter as restricted as time. The equipment will allow updating of time at a 500 kHz rate. The suitability of this method was verified by simulation on a hybrid computer system.
FIGURE 4-A digital linear interpolation.

By contrast, the operation of a pure digital interpolator is illustrated in Figure 4. The interpolator

forms a straight line with f(0) as the origin and f(2) as the terminus, and computes a value for the intervening sample f(1). If that value is within tolerance, a new line is formed with f(3) as terminus, and both intervening values must be computed and compared with the actual values. If either value is not within tolerance, f(2) is transmitted and becomes the origin of the next line. This procedure has two significant failings: first, the operation requires at least one subsequent value of the function in order to make an approximation, and second, each intervening point must be reapproximated at every new interval. It follows then that for a line length of n intervals, s = n/2 (1 + n) intermediate points must be calculated and compared with the actual sample values. Even for relatively small values of n the computation time prohibits its use in most applications.
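The retrospective procedure just described can be sketched as follows (an illustrative reconstruction of the Figure 4 interpolator; the names are ours). The check counter makes the quadratic cost visible: a line that grows to n intervals costs on the order of n(n + 1)/2 comparisons.

```python
def digital_interpolator(samples, E):
    """Grow a trial line from the last transmitted sample; every time the
    terminus advances, all intervening samples are re-checked against the
    line. Returns the indices transmitted and the number of checks made."""
    sent = [0]                  # indices of transmitted samples
    checks = 0
    origin, term = 0, 1
    while term < len(samples) - 1:
        cand = term + 1
        ok = True
        for j in range(origin + 1, cand):      # re-check interior points
            checks += 1
            t = (j - origin) / (cand - origin)
            interp = samples[origin] + t * (samples[cand] - samples[origin])
            if abs(interp - samples[j]) > E:
                ok = False
                break
        if ok:
            term = cand                        # extend the trial line
        else:
            sent.append(term)                  # transmit and restart the line
            origin, term = term, term + 1
    sent.append(len(samples) - 1)
    return sent, checks

# A pure ramp is fit by a single line, yet still costs 1 + 2 + ... + 8 checks:
sent, checks = digital_interpolator(list(range(10)), 0.5)
print(sent, checks)  # [0, 9] 36
```

The continuous secant method avoids both failings: it needs no subsequent sample and never revisits a point once scanned.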
A determination of efficacy

Choice of test signal

In order to compare various sampling systems it is necessary to define an appropriate test signal. Two considerations prompt the selection of a signal which is a worst case for adaptive sampling: a) the resulting compression provides a lower bound on system performance; and b) it provides a measure of system susceptibility to the generation of more samples than a suitably formulated synchronous system.

Since the adaptive system samples on the basis of both amplitude and frequency, a worst case would be one which is characterized by a flat power spectral density over the total predicted bandwidth (a maximum information signal).

Such a signal, which is easily generated and easily described in both statistical and deterministic terms, can be constructed from a sum of sine waves:

φ(t) = Σ (n=1 to N) A_n sin(ω_n t + θ_n)          (17)

For eight or more non-harmonically related sine waves the probability density function becomes indistinguishable from the Gaussian one.

Source noise

The operation of a sampling system in the presence of source noise is often neglected, since noise over the entire predicted bandwidth works to reduce sample reduction and also precludes the successful operation of many proposed systems which rely on measurement of the derivatives of the function to be sampled.

In the absence of noise, adaptive sampling can be expected to reduce the sampling rate by an additional factor of ten when the signal occupies only one tenth of the predicted bandwidth. However, noise can be expected to occupy the entire band, and the effect of the presence of this noise is paramount to the evaluation of an adaptive system.

In order to evaluate this effect, noise with a flat spectral density, a Gaussian distribution, and an RMS level equal to half the predefined tolerance, cut off at 18 db per octave at a frequency ten times the highest signal frequency, was added to the signal. The effective sampling rate even at this excessive noise level was increased by only twenty percent over the rates determined for the noise-free case.

Comparison of sampling rates

The sampling rate for a fixed maximum error for a number of sampling methods was determined on an IBM 7094 computer. The results of that study are presented in Table I. This summary indicates that even under worst case conditions the proposed system demonstrates a marked reduction in the number of samples required for reconstruction to a fixed predefined error. Certain familiar reconstructions were included for the same relative RMS error for comparison purposes.

TABLE I-Relative sampling rate and RMS error vs. predefined tolerance for test signal input

SAMPLING/RECONSTRUCTION METHOD      SAMPLES/CYCLE OF   RMS ERROR   PREDEFINED TOLERANCE
                                    HIGHEST FREQ.      (%)         (% FULL SCALE)
Butterworth 4-stage filter          8.0                2.0
Linear phase 4-stage filter         15.0               2.0
Zero-order hold                     628                            ±.5
First-order hold                    62.8                           ±.5
First-order (synch.) interpolator   22.2                           ±.5
Continuous secant                   6.7                1.76        ±.5

CONCLUSIONS

An algorithm has been presented for the determination of a continuous polygonal approximation which results in the least number of samples for a given predefined maximum error, where the end points of the line segments are restricted to lie on the function. The method is suitable for general application since it requires no a priori knowledge of the properties of the function to be sampled. The method has been shown to provide a reduction in the sampling rate even under worst case conditions and to operate effectively with little degradation in performance when the signal is corrupted by noise.

ACKNOWLEDGMENT

The studies described in this paper were sponsored in part by the National Science Foundation under a grant to the Department of Engineering, University of California at Los Angeles, and by the U. S. Naval Ships Systems Command under a contract with the Department of Electrical Engineering, Naval Postgraduate School, Monterey, California. The authors also gratefully acknowledge the courtesy extended to them by the Electronic Associates Inc. Computing Center, El Segundo, California, and by Mr. J. Magnall, formerly the director of that center.


BIBLIOGRAPHY
1 DE LA VALLEE-POUSSIN
Leçons sur l'approximation des fonctions d'une variable réelle
Paris 1919
2 J W YOUNG
General theory of approximation by functions involving a given number of arbitrary parameters
Trans American Math Soc 23:331-334 July 1907
3 G A RAHE
Adaptive sampling
PhD Dissertation UCLA 1965
4 L W GARDENHIRE
Redundancy reduction the key to adaptive telemetry
Nat Telemetering Conf Los Angeles California June 1964
5 R K SISKIND
Probability distributions of sums of independent sinusoids
Technical Memo No 83 Systems Technology Laboratories Inc Inglewood California March 1961
6 D D McRAE
Interpolation errors
Radiation Inc Reports 1 pt 1 May 1961


APPENDIX I

FIGURE A.1-Zero and first-order holds.

Synchronous sampling in hybrid computation

The reconstruction of signals in hybrid computation has been restricted in general to the simple zero and first-order holds shown in Figure A.1. Since the error in computer systems is generally required to remain within a predefined tolerance E, the synchronous sampling rate is dictated by the smallest interval in which this tolerance can be reached.

From Figure A.1, the reconstruction by a zero-order hold is seen to be given by the value of the function at the last sampling instant nT. The output of the zero-order hold φ_ZOH(t) is given by:

φ_ZOH(t) = φ(nT),   nT < t < (n + 1)T          (A.1)

for which the reconstruction error e_ZOH(t) is given by

e_ZOH(t) = φ(t) - φ_ZOH(t)          (A.2)

For an input signal φ(t)

φ(t) = A sin ωt          (A.3)

the maximum full scale relative error becomes

e_max = Max e_ZOH(t) / 2A          (A.4)

where n = the number of samples/cycle and T = the sampling period. Since

nωT = 2π          (A.5)

then

e_max = AωT / 2A = π/n          (A.6)

and for a predefined tolerance E, the sampling rate n is given by

n > π / (E/2A)          (A.7)

Choosing the value of relative error which is used to evaluate the secant method, e_max < .5 percent, the sampling rate (n) for the zero-order hold is found to be:

n > 628 samples/cycle          (A.8)

The number of samples required when reconstruction is performed by a first-order hold is derived in much the same fashion. The output of the first-order hold can be written as follows:

φ_FOH(t) = 2φ(t - T) - φ(t - 2T)          (A.9)

and the error for a first-order hold e_FOH becomes

e_FOH = φ(t) - 2φ(t - T) + φ(t - 2T)          (A.10)

In the case of the sine wave input

φ(t) = A sin ωt

the maximum full scale relative error is given by

e_max = Max e_FOH(t) / 2A ≤ ω²T²/2          (A.11)

and the sampling rate for e_max < .5 percent is therefore

n ≥ 62.8 samples/cycle          (A.12)

Similarly, the error for a linear interpolator can be shown to be bounded by

e_max = Max e_LI(t) / 2A ≤ ω²T²/16 = π²/4n²          (A.13)

so that

n ≥ π / (2 √(E/2A))

Again, for a maximum relative error of .5 percent,

n ≥ 22.2 samples/cycle          (A.14)
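The appendix arithmetic can be checked directly. A sketch (e below denotes the relative tolerance E/2A; the function names are ours):

```python
import math

# Samples per cycle of the highest frequency for each synchronous
# reconstruction, per equations A.7, A.11-A.12, and A.13-A.14.
def zoh_rate(e):
    return math.pi / e                      # zero-order hold: n > pi/e

def foh_rate(e):
    return 2 * math.pi / math.sqrt(2 * e)   # first-order hold: e ~ (wT)^2 / 2

def linear_rate(e):
    return math.pi / (2 * math.sqrt(e))     # linear interpolator: e ~ (wT)^2 / 16

e = 0.005   # 0.5 percent of full scale
print(round(zoh_rate(e)), round(foh_rate(e), 1), round(linear_rate(e), 1))
# 628 62.8 22.2 samples/cycle, matching A.8, A.12, and A.14
```

These are the zero-order hold, first-order hold, and first-order interpolator entries of Table I; the continuous secant method's 6.7 samples/cycle in the same table shows the gain from adaptive sampling.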

A new solid state electronic iterative differential analyzer making maximum use of integrated circuits
by BRIAN K. CONANT
University of Arizona
Tucson, Arizona

INTRODUCTION
The feasibility of really fast hybrid computation was demonstrated by the development and application of The University of Arizona's ASTRAC II.4,8,10 But no machine commercially available to date has the required mode-control switching speed and low-impedance computing networks; and most computers do not have the required amplifier bandwidth. The development of the LOCUST system represents an attempt to design a truly producible very fast computer at moderate cost.
The new machine was developed as a project-group Ph.D. dissertation, a somewhat novel concept in engineering education: the writer acted as project engineer on the overall design, test, and application of the LOCUST computer and helped to supervise four M.S. thesis projects,14,17,19,21 plus several term-paper projects,1,2,5,20 which contributed significant components. The computer, including all printed circuit cards, was built by undergraduate student technicians, using Motorola and Fairchild integrated circuit modules and discrete components.
The LOCUST system is an all solid-state iterative differential analyzer making maximum use of integrated circuits (Figure 1). The machine comprises 34 free amplifiers, of which 16 can be used as integrator/track-hold circuits, plus 18 amplifiers permanently committed to 6 high-speed multiplier/dividers, and 4 comparators (56 amplifiers total). The new machine is capable of solving a sixteenth-order linear or non-linear differential-equation system up to 2000 times per second for iterative and random-process computations under digital control.4 Linear errors are within 0.2 percent up to 10 kHz. Special "slow" summing networks also permit operation as a slow analog computer. The following design features are of special interest:
1. Maximum use of both linear and digital monolithic integrated circuits enhances computer performance and still reduces parts and assembly costs.

Figure 1-The LOCUST computer, built in The University of Arizona's Analog/Hybrid Computer Laboratory. Linear and digital integrated circuits enhance computer performance and still reduce parts and assembly costs

2. New mounting and shielding techniques, including a technique for shielding low cost unshielded patchbays, were developed (Figure 2).
3. Low-level current-mode digital logic modules (Motorola MECL integrated circuits) eliminated digital noise in the analog portion of the computer. The low-level logic swing (0.8 V) along with the balanced-current nature of the non-saturating current-mode logic serves to reduce radiation and, more importantly, computer ground-system disturbances.


Figure 2-Analog-circuit boxes and digital-circuit cards plug directly into the rear of inexpensive low-leakage plastic patchbay receivers. Rows and columns of patchbay springs used for grounds, power connections, and logic also serve as analog-patchbay shields. Simple metal patchboards with shielded patchcords have patch holes only for the actual analog-computer terminations for an uncluttered appearance. Summer-integrator patching is logic controlled and does not require cumbersome bottle plugs

the gain/bandwidth requirements of fast analog computation with the aid of a new high-performance amplifier developed at The University of Arizona as an M.S. thesis project;14 the reference describes its design in great detail. The performance of the new amplifiers is summarized in Table 1a.
The block diagram of Figure 3 shows the basic amplifier design, which employs a three-channel feed-forward circuit. Beginning from the amplifier input, the channels are: the high-frequency channel directly to the wide-band class AB output stage, the intermediate-frequency channel through a Motorola MC1433 integrated-circuit operational amplifier, and the low-frequency channel through a Fairchild µA726C hot-substrate preamplifier.
The bandwidth and output-current limitations of the MC1433 integrated-circuit amplifier are overcome by cascading it with a high-current output stage and by feeding forward the high-frequency signals directly to the output stage from the summing junction. Because