CONFERENCE
PROCEEDINGS
VOLUME 22
FALL JOINT
COMPUTER
CONFERENCE
SPARTAN BOOKS, 6411 Chillum Place, N.W., Washington 12, D.C.
List of Joint Computer Conferences
1. 1951 Joint AIEE-IRE Computer Conference,
Philadelphia, December 1951
2. 1952 Joint AIEE-IRE-ACM Computer Conference, New York, December 1952
3. 1953 Western Computer Conference, Los
Angeles, February 1953
4. 1953 Eastern Joint Computer Conference,
Washington, December 1953
5. 1954 Western Computer Conference, Los
Angeles, February 1954
6. 1954 Eastern Joint Computer Conference,
Philadelphia, December 1954
7. 1955 Western Joint Computer Conference,
Los Angeles, March 1955
8. 1955 Eastern Joint Computer Conference,
Boston, November 1955
9. 1956 Western Joint Computer Conference, San
Francisco, February 1956
10. 1956 Eastern Joint Computer Conference, New
York, December 1956
11. 1957 Western Joint Computer Conference, Los
Angeles, February 1957
12. 1957 Eastern Joint Computer Conference,
Washington, December 1957
13. 1958 Western Joint Computer Conference,
Los Angeles, May 1958
14. 1958 Eastern Joint Computer Conference,
Philadelphia, December 1958
15. 1959 Western Joint Computer Conference,
San Francisco, March 1959
16. 1959 Eastern Joint Computer Conference,
Boston, December 1959
17. 1960 Western Joint Computer Conference,
San Francisco, May 1960
18. 1960 Eastern Joint Computer Conference,
New York, December 1960
19. 1961 Western Joint Computer Conference,
Los Angeles, May 1961
20. 1961 Eastern Joint Computer Conference,
Washington, December 1961
21. 1962 Spring Joint Computer Conference, San
Francisco, May 1962
22. 1962 Fall Joint Computer Conference, Philadelphia, December 1962
Conferences 1 to 19 were sponsored by the National Joint Computer Committee,
predecessor of AFIPS. Back copies of the proceedings of these conferences may
be obtained, if available, from:
Association for Computing Machinery, 14 E. 69th St., New York 21, N. Y.
American Institute of Electrical Engineers, 345 E. 47th St., New York 17, N. Y.
Institute of Radio Engineers, 1 E. 79th St., New York 21, N. Y.
Conferences 20 and up are sponsored by AFIPS. Copies of AFIPS Conference
Proceedings may be ordered from the publishers as available at the prices indicated below. Members of societies affiliated with AFIPS may obtain copies
at the special "Member Price" shown.
Volume   List Price   Member Price   Publisher
  20       $12.00        $7.20       Macmillan Co., 60 Fifth Ave., New York 11, N. Y.
  21         6.00         6.00       National Press, 850 Hansen Way, Palo Alto, Calif.
  22         8.00         4.00       Spartan Books, 6411 Chillum Place, NW, Washington 12, D. C.
The ideas and opinions expressed herein are solely those of the authors and are not necessarily representative of or endorsed by the 1962 Fall Joint Computer Conference Committee or the American Federation of Information Processing Societies.
Library of Congress Catalog Card Number: 55-44701
Copyright © 1962 by American Federation of Information Processing Societies,
P.O. Box 1196, Santa Monica, California. Printed in the United States of
America. All rights reserved. This book, or parts thereof, may not be reproduced in any form without permission of the publishers.
Manufactured by McGregor & Werner, Inc.
Washington, D. C.
CONTENTS

Preface .... v
Processing Satellite Weather Data - A Status Report Part I (Charles L. Bristor) .... 1
Processing Satellite Weather Data - A Status Report Part II (Laurence I. Miller) .... 19
Design of A Photo Interpretation Automaton (W. S. Holmes, H. R. Leland, G. E. Richmond) .... 27
Experience with Hybrid Computation (E. M. King, R. Gelman) .... 36
Data Handling at an AMR Tracking Station (K. M. Hoglund, P. L. Phipps) .... 44
Information Processing for Interplanetary Exploration (E. J. Block, R. A. Schnaith, J. A. Young) .... 56
EDP As A National Resource (T. B. Steel, Jr.) .... 71
Planning the 3600 (Charles T. Casale) .... 73
D825 - A Multiple-Computer System for Command & Control (James P. Anderson, Samuel A. Hoffman, Joseph Shifman, Robert J. Williams) .... 86
The Solomon Computer (Daniel L. Slotnick, W. Carl Borck, Robert C. McReynolds) .... 97
The KDF.9 Computer System (A. C. D. Haley) .... 108
A Common Language for Hardware, Software, and Applications (Kenneth E. Iverson) .... 121
Intercommunicating Cells, Basis for a Distributed Logic Computer (C. Y. Lee) .... 130
On the Use of the Solomon Parallel-Processing Computer (J. R. Ball, R. C. Bollinger, T. A. Jeeves, R. C. McReynolds, D. H. Shaffer) .... 137
Data Processing for Communication Network Monitoring and Control (D. I. Caplan) .... 147
Design of ITT 525 "Vade" Real-Time Processor (Dr. D. R. Helman, E. E. Barrett, R. Hayum, F. O. Williams) .... 154
On the Reduction of Turnaround Time (H. S. Bright, B. F. Cheydleur) .... 161
Remote Operation of a Computer by High Speed Data Link (G. L. Baldwin, N. E. Snow) .... 170
Standardization in Computers and Information Processing (C. A. Phillips, R. E. Utman) .... 177
High-Speed Ferrite Memories (H. Amemiya, H. P. Lemaire, R. L. Pryor, T. R. Mayhew) .... 184
Microaperture High-Speed Ferrite Memory (R. Shahbender, T. Nelson, R. Lochinger, J. Walentine, C. Chong) .... 197
Magnetic Films - Revolution in Computer Memories (G. Fedde) .... 213
Hurry, Hurry, Hurry (Howard Campaigne) .... 225
The Case for Cryotronics? (W. B. Ittner, III) .... 229
Cryotronics - Problems and Promise (Martin L. Cohen) .... 232
Some Experiments in the Generation of Word and Document Associations (Gerard Salton) .... 234
A Logic Design Translator (D. F. Gorman, J. P. Anderson) .... 251
Comprotein: A Computer Program to Aid Primary Protein Structure Determination (Margaret Oakley Dayhoff, Robert S. Ledley) .... 262
Using Gifs in the Analysis and Design of Process Systems (William H. Dodrill) .... 275
A Data Communications and Processing System for Cardiac Analysis (M. D. Balkovic, C. A. Steinberg, P. C. Pfunke, C. A. Caceres) .... 280
Cluster Formation and Diagnostic Significance in Psychiatric Symptom Evaluation (Gilbert Kaskey, Paruchuri R. Krishnaiah, Anthony Azzari, Robert W. Waltz) .... 285
Spacetracking Man-Made Satellites and Debris (B. M. Jackson) .... 304
List of Reviewers .... 310
1962 Fall Joint Computer Conference Committee .... 311
American Federation of Information Processing Societies (AFIPS) .... 313
PREFACE
The theme of the 1962 Fall Joint Computer Conference is Computers in the Space Age. Today there is a two-way street in which computing equipment has contributed vitally to the success of space-age technology, while space-age demands have had their major effects on the design of computers. Of these we can readily discern three outstanding results: (1) development of more efficient interfacing between man and machine, (2) radical reduction of the size of systems, and (3) the maturing of the theory and implementation of cooperative systems, including multi-point operating complexes.
Naturally these achievements are irrevocably to be reflected in
the stationary equipment that benefits business and science. We already know that for the purposes of the Space Age, computing equipment is to provide facility for command-decision and for control of a
new order of complexity. But we are just becoming aware of the products of this progress. The social implications of advances in the precise selection of information via recursive interplay between man and
machine, though barely perceptible at the present time, are rapidly
assuming major influence on the structure of the near future.
Altogether, the interaction of the space age and computer technologies has brought about a rich growth in new and potent national resources. Indeed, the record of the United States in the field of information and data processing systems is pre-eminent in the present
world. It is helping therefore very directly to give us pre-eminence in
space.
J. Wesley Leas
Chairman
1962 Fall Joint Computer Conference
v
PROCESSING SATELLITE WEATHER DATA - A STATUS REPORT - PART I
Charles L. Bristor
U. S. Weather Bureau
Washington, D. C.
SUMMARY
Less than 500 radiosonde observations are available for the current twice-daily three-dimensional weather analysis over the Northern Hemisphere, a coverage far less than is required for short term advices and for input to numerical prediction computations. Global observations from operational satellites as a complement to existing data networks show promise of filling this need. TIROS computer programs now being used for production of perspective geographic locator grids for cloud photos, and other programs being used to calibrate, edit, locate and map infrared radiation sensor measurements, have provided a background of experience and have indicated the potentialities of a more automated satellite data processing system.

The tremendous volume of data expected from the Nimbus weather satellite indicates the need for automatic data processing. Each pass around the earth will produce ninety-nine high resolution cloud pictures covering about ten percent of the earth from pole to pole, and infrared sensors will provide lower resolution information but on a similar global basis. Indications are that machine processing of the 280-odd million binary bits of data from each orbit can materially reduce the human work load in producing analyzed products for real time use. The main programming packages in support of the presently developing automatic data processing systems are explained under ingestive, digestive, and productive headings. Tasks under these headings are explained for both the photo and infrared data. The individual program modules and subroutines are discussed further in an appendix. Reference is made to the second part of this report, which expands on the logical design of the digital and non-digital data handling system complex and extends the discussion into data rates, command and control concepts, and the executive program which manages the overall process.
INTRODUCTION
The need for more meteorological data is
an old refrain which is almost constantly
being revived. Why do we always desire
more data? Among the many very good answers to this question are some which are
pertinent to the subject of meteorological
satellites. A most generalized answer might
be expressed in two parts:
1. because as the scope of human activities increases, new applications of weather
information arise and new needs for meteorological advice are generated and
2. because potential economic gains provide a tremendous impetus for attempting to
improve the quality and scope of our present
weather services.
Within the category of the first answer
one may cite the expansion of global air
travel over routes that are practically devoid of weather observations of any kind and
2 / Processing Satellite Weather Data - A Status Report - Part I
the similar deployment of air and sea defense forces to remote areas. Even the man
in space program is generating a need for
global weather information. In the thirties
and into World War II a marked expansion of
weather observing networks took place, mainly through expansion of weather communications to communities where observing facilities could be installed. Because
of communications and logistics costs, this
type of expansion cannot take place indefinitely to fulfill the ever growing need for detailed observations on a global scale. However, within the scope of the first answer,
such a global network would be extremely
valuable merely as a means of providing current weather information and very short term
warnings and advisories.
Beyond immediate operational advice is
the need implied by the second answer: the
problem of weather prediction. The American Meteorological Society (1962) has recently restated its estimate of current skills
in weather forecasting.
" ... For periods extending to about 72
hours, weather forecasts of moderate
skill and usefulness are possible. Within
this interval, useful predictions of general trends and weather changes can be
made ...."
Few would deny the economic importance
and increased application of more precise
3-day forecasts.
Since the mid-fifties numerical weather
prediction has had a significant influence on
the level of skill in weather forecasting generally. The method involves a mathematical
description of the atmosphere in three dimensions utilizing the hydrodynamic equations of motion and the laws of thermodynamics. The partial differential equations of
such a "model" are arranged in a prognostic
mode such that only time dependent partials
remain on the left side. The finite difference
version of such an equation set is then integrated in short time steps to produce prognostic images of the various data fields which
served to describe the initial state of the
fluid. Phillips (1960) has summarized the
current view which delimits the potential of
numerical weather prediction-to the extent
that lack of observations prevents adequate
description of the atmosphere on a global
basis. Figure 1 indicates the present network of observing stations which provide the
current three dimensional description of the
atmosphere together with a grid overlay indicating intersections at which information
is required concerning the current state of
the fluid in order that the finite difference
equations may be integrated. Obviously a
poorly distributed collection of less than 500
observations can not adequately establish
values for nearly two thousand grid points.
Areas the size of the United States are indicated without any upper air soundings whatsoever. The situation in the Southern Hemisphere is much worse.
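The time-stepping procedure described above can be illustrated with a minimal sketch: a single one-dimensional advection equation with the time-dependent partial isolated on the left side, integrated by short upwind finite-difference steps on a periodic grid. This is a stand-in for the full three-dimensional hydrodynamic model, not the Weather Bureau's actual program.

```python
# Minimal sketch of finite-difference prognostic integration:
# du/dt = -c * du/dx, stepped forward in short time increments,
# standing in for the full hydrodynamic equation set.

def integrate(u, c, dx, dt, steps):
    """Advance the field u by `steps` upwind time steps (periodic grid)."""
    n = len(u)
    for _ in range(steps):
        # u_new[i] = u[i] - (c*dt/dx) * (u[i] - u[i-1]); for i = 0 the
        # index u[-1] wraps to the last point, giving periodic boundaries.
        u = [u[i] - c * dt / dx * (u[i] - u[i - 1]) for i in range(n)]
    return u

field = [0.0] * 20        # initial state: a single unit disturbance
field[5] = 1.0
result = integrate(field, c=1.0, dx=1.0, dt=0.5, steps=10)
```

The step size must satisfy c·dt/dx ≤ 1 for stability (the Courant condition), which is why the integration proceeds in "short time steps" as the text describes.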
This brief discussion of the meteorological data problem points up the need for a
detailed global observational network and
offers the real challenge to meteorological
satellites. Can indirect sensing via satellite
fill the need for global weather data? D. S.
Johnson (1962) has summarized the meteorological measurements carried out thus far
by satellites and discussed others planned
and suggested for the future.
Indications are that, whereas satellite observations will likely never supplant other
data networks, they hold great promise in
providing complementary data on a truly
global basis. Limited experience with satellite weather data already obtained is very
encouraging.
The following is a description of current
efforts in processing the ever growing volume
of this data. First, limited computer processing of TIROS data is discussed. The latter portion of this report and the second
paper in this two part series describe in
some detail the current status of computer
programming in support of the truly automated real time data processing systems
under construction for the Nimbus satellite
system.
EXPERIENCES WITH TIROS
Since April, 1960 cloud photos from TIROS
satellites have been made available to the
meteorological community on an intermittent
operational basis. Details of the satellite's
construction, including its slow scan cameras,
have been given elsewhere along with an account of certain difficulties in geographically
locating the cloud photos because of meandering in the spin axis (NASA - USWB, 1960). A
cloud photo sample is presented in Figure 2.
Even without a meteorological background,
one would likely concede, on the basis of intuition, that such cloud patterns could provide
Proceedings-Fall Joint Computer Conference, 1962 / 3
Figure 1. Northern Hemisphere map showing upper air reporting stations and computation grid used in objective weather map analysis and numerical prediction. The
Weather Bureau's National Meteorological Center uses a somewhat denser grid of
more than 2300 points. Less than 500 of these reports are routinely available for
specification of quantities at the grid points.
valuable observational evidence concerning
the state of the atmosphere. A considerable
research effort is now going on in an effort
to extract quantitative information from such
images (NASA - USWB, 1961). For the present, computer processing has been confined
largely to the production of geographic locator grids as an aid to further interpretation
of the cloud patterns. The locator grid superimposed on the picture in Figure 2 and the
sample grids shown in Figure 3 are produced
at a rate of 10 seconds per grid on the IBM
7090 (Frankel & Bristor, 1962). Line drawn
output is produced on an Electronic Associates Data Plotter or, alternately, by General
Dynamics High Speed Microfilm Recorder.
Input for each grid includes latitude and longitude of the sub-satellite point, altitude of
the satellite as well as azimuth and nadir
and spin angles which describe the attitude
and radial position of the camera with respect to the earth. An auxiliary program is
required for the production of image-to-object
ray distortion tables. These tables correct
for symmetric and asymmetric distortions
due to the lens and the electronics of the
system and are produced from pre-launch
calibration target photos taken through the
entire camera system. An additional feature
of the gridding program is the large dictionary of coastline locations from which transformations to the perspective of the image
Figure 2. Sample cloud picture with perspective geographic locator grid. This photo,
taken by TIROS III, shows hurricane Anna
near 12°N, 64°W (lower left) on July 20, 1961
together with large streamers projecting toward another vortex pattern to the east (right).
are made as an aid in mating the cloud image
and grid. Some 10,000 such grids have been
produced thus far for selected cloud photos
taken by TIROS I and TIROS III and are available in an archive, along with the pictures,
for research applications. A somewhat less
detailed but similar gridding procedure is
being utilized on a smaller Bendix G-15
computer at the TIROS readout sites for the
current real time hand processing of the
picture data (Dean, 1961). A typical example
of such a nephanalysis (cloud chart) composed
from a group of photos is shown in Figure 4.
Features from the several images are replotted in outline form or reduced to symbolic
form on a standard map base for facsimile
transmission to the weather analysts and
forecasters.
Starting with TIROS II in November, 1960,
infrared sensors have furnished experimental
radiation measurements in five selected
wavelength intervals (NASA - USWB, 1961
and Bandeen, 1962). Although these data
have not been available in real time, an extensive 7090 program has been produced for
their reduction to a usable form. The IR information has been utilized in a quantitative
manner in several research studies. Fritz
and Winston (1962) have demonstrated its
usefulness in cloud top determinations and
Winston and Rao (1962) have used it in connection with energy transformation investigations on the planetary sc ale.
The data reduction program accepts raw
digitized sensor values read out from the
satellite, rejects space viewed samples, converts the earth viewed responses to proper
physical quantities through a calibration procedure and finally combines the data with
orbit and attitude information to create a
final meteorological radiation tape (FMRT).
Data from one orbit is thus reduced to an
archivable file on magnetic tape by the 7090
in less than twenty minutes. This tape becomes the data source for other programs
which have been produced for the purpose of
mapping selected samples of such data on
standard maps for use with other meteorological charts. A sample is shown in Figure 5.
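The reduction sequence just described (reject space-viewed samples, convert earth-viewed responses through calibration, meld with orbit and attitude information) can be sketched as a simple pipeline. The names and the calibration and location functions below are hypothetical stand-ins; the actual 7090 program is far more elaborate.

```python
def reduce_orbit(raw_samples, calibrate, orbit_attitude):
    """Hypothetical sketch of the radiation-data reduction pipeline.

    raw_samples:    list of (sensor_value, earth_viewed_flag) pairs
    calibrate:      maps a raw sensor value to a physical quantity
    orbit_attitude: maps a sample index to a (lat, lon) earth location
    Returns records for a final meteorological radiation tape (FMRT).
    """
    fmrt = []
    for i, (value, earth_viewed) in enumerate(raw_samples):
        if not earth_viewed:           # reject space-viewed samples
            continue
        physical = calibrate(value)    # calibration to physical units
        lat, lon = orbit_attitude(i)   # meld with orbit/attitude data
        fmrt.append((lat, lon, physical))
    return fmrt

# toy usage with stand-in calibration and location functions
records = reduce_orbit(
    [(10, True), (0, False), (12, True)],
    calibrate=lambda v: 0.5 * v,              # hypothetical linear calibration
    orbit_attitude=lambda i: (40.0, -75.0 + i),
)
```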
The above discussion indicates the nature
of the data obtained thus far by meteorological satellites and the kinds of computer support provided. Experience gained in programming the earth location of sensor
measurements obtained from satellites, the
conversion to standard maps, the calibration
and logical sorting of raw data and the experience gained with distortion and attitude
programs have all provided background for
programs now being produced for direct application in an automatic system. Meanwhile
research with TIROS data is suggesting new
uses which are likely to lead to a requirement for more kinds of products and interpretations. Experience from past efforts is
thereby supporting present efforts in developing an automated, real time system for the
processing of global data coverage which
will be coming from the Nimbus satellite
series.
THE NIMBUS DATA PROCESSING TASK
The Nimbus satellite represents a significant advancement over TIROS as an operational satellite. The spacecraft system
(Stampfl and Press, 1962) provides more
camera coverage of higher resolution, and
earth stabilization assures maximum photo
coverage. One downward and two oblique
looking cameras will view a broad strip of
the earth athwart the vehicle's path as shown
in Figure 6. The three views overlap slightly.
Figure 3. TIROS grids with familiar coastline features. The set of digits bracketing
a central intersection indicate the latitude (left-hand number, plus for North) and longitude (right-hand, plus for East) of that point. A zero is plotted along the meridian
at the next intersection to the South. Legend in the lower right indicates orbit and
frame number for the matching photo (top line, from left) as well as readout station,
mode (taped or direct), and camera (single digits, from left). Horizon arc is indicated beyond the truncated grid pattern at the top where appropriate.
The extremely foreshortened region near the
horizon is not viewed. Thirty-three such
photo clusters will be obtained from each
pass around the earth. Considerable overlap
in the wings is obtained from cluster to cluster as shown in Figure 7. The near polar
orbit will assure global coverage daily.
Overlap from orbit to orbit is minimal at the
equator but is very great near the poles (Figure 8). During the polar summer one would
expect to see a view such as is covered in
Figure 9 on every orbit. The slight inclination of the orbit in a retro sense (injection
into orbit with a westerly direction component) will provide controlled illumination for
the pictures in that local sun time will remain unchanged from orbit to orbit. Each
slow scan TV camera (1" diameter Vidicon
tube) contains 833 lines of picture information giving a maximum image resolution of
about 1/2 mile when looking directly downward from a nominal orbit of 500 nautical
miles. Such a picture will thus contain nearly
700,000 picture elements. If each of these
scan spots is converted on a 16 segment gray
scale into a 4 bit binary number, then the 99
pictures obtained from each 108 minute orbit
will produce almost 275 million bits.
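The picture-data arithmetic above can be checked directly. The square 833 × 833 raster is an assumption, made only because it reproduces the quoted figure of nearly 700,000 picture elements.

```python
# Checking the stated picture-data volume for one Nimbus orbit.
lines = 833                  # scan lines per slow scan TV frame
spots = 833                  # assumed square raster, ~833 spots per line
bits_per_spot = 4            # 16-segment gray scale -> 4-bit binary number
pictures = 99                # 33 three-camera clusters per orbit

elements = lines * spots                          # elements per picture
bits_per_orbit = elements * bits_per_spot * pictures
print(elements)        # 693,889: "nearly 700,000 picture elements"
print(bits_per_orbit)  # 274,780,044: "almost 275 million bits"
```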
Scanning radiometers will provide IR information as does TIROS but again will obtain optimum scans from horizon to horizon
athwart the vehicle's track. One narrow
angle high resolution sensor (HRIR) will respond in a water vapor "window" portion of
the infrared spectrum and will effectively
provide cloud top temperatures or, in cloudless areas, surface temperatures. A mosaic
of such scans on the dark portion of each
pass will provide a night time cloud cover
picture from pole to pole.
The first such HRIR sensor with a .5 degree viewing cone will provide maximum
resolution of about 5 nautical miles. Since
the earth will be viewed about one third of
each scan revolution, 240 non-overlapping
measurements can be obtained from each
scan. Approximately 2800 non-overlapping
scan swaths will be required to cover the
dark half of the orbit. Since these sensors
have a wider usable response range, each
scan spot will occupy a 7 bit binary number.
The HRIR response from each orbit will
Figure 4. Nephanalyses (cloud charts) prepared by TIROS readout station meteorologists.
Features of the cloud patterns from two successive orbits are extracted in outline form and
placed on a standard polar stereographic map base for facsimile transmission to weather
analysts and forecasters. Vortex centers are located along with other distinctive features.
Figure 5. TIROS II infrared analysis. Part of the 8-12 micron water vapor "window" data read out on orbit 578 has been summarized in grid squares on a polar stereographic map base. Radiation coming essentially from cloud tops or from the surface is expressed in watts per square meter.
therefore contain more than 4.7 million
bits.
Another 5 channel medium resolution infrared scanner (MRIR) will provide additional
information throughout each orbit. The five
degree view of the MRIR sensors will provide about 42 separate earth measurements
per scan revolution from each channel. Approximately 700 non-overlapping scans are
required for a full orbit so that (again using
7 bits per measurement) the MRIR response
from each orbit will contain more than 1
million bits.
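The same check applies to the infrared figures quoted above, using the stated scan and bit counts:

```python
# Checking the stated infrared data volumes for one Nimbus orbit.
bits = 7                       # each scan spot occupies a 7-bit binary number

# HRIR: 240 earth-viewing measurements per scan, ~2800 scans (dark half)
hrir_bits = 240 * 2800 * bits
print(hrir_bits)               # 4,704,000: "more than 4.7 million bits"

# MRIR: 5 channels, ~42 earth measurements per scan per channel, ~700 scans
mrir_bits = 5 * 42 * 700 * bits
print(mrir_bits)               # 1,029,000: "more than 1 million bits"
```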
The volume of information expected from
each pass is indeed impressive especially
when one realizes that this information is to
come night and day on a continuous basis for
immediate real time utilization. A marked
increase in the present number of TIROS
data analysts and helpers is indicated for
Nimbus data processing if present semi-hand procedures continue. With plans for
higher resolution sensors of increasing variety, automatic processing of satellite weather
data is becoming a necessity.
STATUS REPORT ON NIMBUS DATA
PROCESSING PROGRAM
The automatic data processing system
under construction will be located at the
Weather Bureau's National Weather Satellite
Center (NWSC), Washington, D. C., and will
receive its input data from the command and
data acquisition (CDA) facilities at Fairbanks, Alaska through multiple broad band
microwave communication facilities. The
system at NWSC contains a complex of
components in addition to the digital computers. A detailed explanation of the system
is beyond the scope of this report although a
brief description from the computer oriented
viewpoint is given in the second part. Let it
suffice at this point to say that the system is
evolutionary in design in that computations
will continue in support of semi-hand processing procedures. For this purpose the
system's IBM 7094 with attached 1401 will
be utilized to produce a picture gridding
tape. Information in the form of override
signals at specified Vidicon scan line and
scan spot numbers, when melded with the
analogue picture signals, will produce a
kinescope recording of the original cloud
photo with a super-imposed dotted line locator grid such as Figure 10. A small CDC
160A computer, interruptable by Vidicon
synch pulses, will synchronously meld the
digital information from the gridding tape
onto spare tracks of an analogue picture
tape. Other non-digital devices will then
combine the synchronized information on
this tape as it is fed into the kinescope
recorder.
The 7094 program is being produced essentially as an extension of present TIROS
programs. A simulated output of this program has provided check out facility for the
160A program which now awaits the unique
non-digital hardware complex for final checkout. A number of supporting programs
are involved in this effort as indicated in the
appendix which briefly describes each program module. This effort will permit a
TIROS type semi-hand processing of the
photo data but with hand melding of grid and
picture now automated.
The far greater task of the system involves duplication of the semi-hand processing by automatic means. In the beginning
these efforts must be experimental in that
application of the data is still exploratory.
Methods of presentation, quantities to be extracted from the basic data, the scale of
atmospheric phenomena to be described
(resolution) are all in exploratory stages. A
Figure 6. Perspective grids and mapped coverage of Nimbus camera cluster as seen from a
500 nautical mile orbit. The central camera looks directly downward at the sub-satellite
point. Side cameras are tilted 35 degrees to either side of the track.
Figure 7. Geographic coverage to come from Nimbus showing overlap between
adjacent three camera clusters.
major effort is underway to create a hierarchy of data processing programs to activate the system and produce a variety of
outputs in a flexible manner. These may be
grouped as ingestive, digestive, and productive.
The ingestive programs are more than
simple input routines in that some preprocessing of the data is accomplished. In
the case of picture data, the entire volume
mentioned previously is to be fed into storage in the computer. Some sorting is required before storage so that separate disk
files are created containing data from each
of the three cameras. As time permits,
other pre-processing activities will also be
accomplished. Light intensity signals over
the face of each photo require normalization
for angle of view before quantitative comparisons are valid. Also, for the same reason, solar aspect variation from equator to
pole must be removed.
In the case of the incoming MRIR and
HRIR data, the ingestive process is partly
one of data editing. By recognition of pulses
which provide knowledge of scanner shaft
angle, almost two thirds of the incoming data
which is non-earth viewing can be eliminated.
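The editing step described above can be sketched as follows. The 120-degree earth-viewing sector and the sample layout are illustrative assumptions, since the text states only that roughly two thirds of each scan revolution views space rather than the earth.

```python
def earth_viewing(samples_per_rev, samples):
    """Hypothetical sketch of ingestive editing by scanner shaft angle.

    Keeps only samples whose shaft angle falls in the (assumed) third of
    each revolution that views the earth; the rest are discarded.
    """
    kept = []
    for i, value in enumerate(samples):
        angle = (i % samples_per_rev) / samples_per_rev * 360.0
        if angle < 120.0:            # earth in view for ~1/3 of the revolution
            kept.append(value)
    return kept

data = list(range(720))              # one synthetic revolution of 720 samples
survivors = earth_viewing(720, data)
print(len(survivors))                # 240 of 720 samples survive the edit
```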
Other raw housekeeping input information
such as attitude error signals and sensor
environment temperatures must be unpacked
and translated through calibration in the ingestive process before they can be used in
processing the meteorological data.
Final checkout of these programs must
await activation of the complete hardware
complex since only limited simulation is
possible.
The digestive process takes the pertinent
incoming data and converts it to a meteorologically usable form. A major task is the
melding of this data with the orbit and attitude
information to geographically locate the sensor information elements. In the case of the
photos, part of this work is accomplished as
an adjunct to the earlier mentioned program
which produces the picture gridding tape. An
open lattice of points selected by scan row
and spot number is geographically located
within each image. From these location
"bench marks" the digestive program transforms the foreshortened, perspective photo
image into a rectified equivalent on a standard map base. Figure 11 is an experimental
Figure 8. Geographic coverage envelopes to come from Nimbus showing
overlap from orbit to orbit.
Figure 9. Sample perspective grid showing
the polar region to be viewed by Nimbus.
example. The rectified image appears on a
Mercator map projection, in one view as a
replotting of the original picture elements
only. It demonstrates the futility of extending this process into extremely foreshortened
image areas where a realistic rectified image
would consist largely of interpolated filler.
After this step the rectified images are
fitted together into a mosaic strip which is
then available as a product source.
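The rectification step can be sketched as an inverse remapping: for each cell of the target map grid, take the value of the nearest geographically located picture element. Everything here, including the `locate` function standing in for the bench-mark lattice, is a hypothetical illustration rather than the actual digestive program.

```python
def rectify(image, locate, map_points):
    """Nearest-neighbor remap of a perspective image onto map grid points.

    image:      dict {(row, spot): brightness}
    locate:     (row, spot) -> (lat, lon), standing in for the lattice of
                geographic "bench marks"
    map_points: list of (lat, lon) target grid locations
    """
    # geographic location of every picture element (via the lattice)
    located = [(locate(r, s), v) for (r, s), v in image.items()]
    out = []
    for lat, lon in map_points:
        # nearest located picture element; no interpolated filler
        (plat, plon), value = min(
            located, key=lambda p: (p[0][0] - lat) ** 2 + (p[0][1] - lon) ** 2
        )
        out.append(value)
    return out

img = {(0, 0): 10, (0, 1): 20, (1, 0): 30, (1, 1): 40}
vals = rectify(img, lambda r, s: (10.0 + r, 50.0 + s), [(10.0, 50.0), (11.0, 51.0)])
print(vals)  # [10, 40]
```

A production version would interpolate between bench marks and optionally fill gaps, as Figure 11's filled and unfilled versions suggest.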
The digestive infrared data program is
being patterned after that mentioned earlier
which has been produced for the processing
of TIROS radiation data. The calibrated and
earth located data will similarly provide a
product source through the archivable final
meteorological radiation tape.
Programs for production of usable output
material present the most problems. Full
resolution photo mosaics rectified to polar
stereographic or Mercator maps are expected to find application over limited regions in connection with hurricane detection
and tracking, for example. For other broadscale analysis problems, products having
reduced resolution may be adequate. This
implies searching these images by machine,
editing and summarizing them as to percent
cloud cover, brightness and pattern. Some
interesting patterns are revealed in the
TIROS photos of Figure 12. Although, as
mentioned earlier, quantitative interpretations are only gradually emerging, the rings,
spirals and streets seen in these photos will
likely be subjects for identification through
pattern recognition techniques. Cloud heights,
provided indirectly from the IR data through
Figure 10. Cloud photo with melded grid (simulated): Original Hugo rocket photo (left)
looking toward Bermuda from Wallops Island, Virginia at 85 miles altitude and 100 scan
line digitization of the original picture (right) played back through a digital CRT (SC-4020)
with 15 unit gray scale produced by programmed time modulation. Certain picture elements have been replaced by grid signals before playback to produce latitude/longitude
lines.
Figure 11. Rectified cloud photo: Digitized
picture elements from figure 10 replotted on
a Mercator map base without filler (below)
and with filler (above) to produce a rectified
pictorial image.
cloud top temperatures, present an added
output. The MRIR package will yield other
derived products such as maps of the net
radiation flux. There is thus a family of derived products available from the digested
material. A variety of output equipments, including prototype cathode ray tube photo recording devices driven by digital tapes and somewhat similar photo-quality facsimile machines, requires additional conditioning of the output products to suit the specified formats.
The variety of production-type programs is indicated in the appendix. It is likely
that all such production varieties cannot be
produced in real time from the data received
on all passes of the satellite. The intent is
that these products will be available for experimental utilization and that variations and
modifications of those which prove to be
most useful will assume an operational role.
CONCLUSION
This has been a brief attempt to present
a background to the non-meteorologist explaining the need for more weather data, and
the present and likely future role of weather
satellites. The need for computers and automatic data processing is explained in terms
of the kinds of data involved. Computer support of semi-hand methods is discussed
along with current efforts toward a truly
automated effort for Nimbus satellite data.
As the variety of sensors and the volume of
such data increases, a maximum degree of
automatic processing and utilization of the
data is indicated.
The scope has been limited to the data
processing job as seen from the computer
programmer's viewpoint. Other groups within
the Goddard Space Flight Center of NASA,
the Weather Bureau's NWSC and their contractors have vital roles in the design, launch,
command and readout of the satellite and the
supplying of other important data in the form
of sensor calibration and orbital information
from tracking station data before the sensor
data can be rendered meteorologically useful.
Only scant mention has been made of the
entire data processing system. The second
paper in this series will give additional details of the digital and non-digital data processing machine complex-again from the
standpoint of the computer programmer.
The role of the computer as manager of the
process will be amplified in terms of command and control.
APPENDIX
The main program modules are listed
below together with some details concerning
each subroutine portion. The main section
of each program module is indicated by an
asterisk. Status of various portions is indicated as of September, 1962.
Executive Program
Details of the Executive Program are
provided as part of the text of Part 2 of this
paper.
Time-Attitude-Calibration Ingestion Program
*Time/Attitude Sort: Engineering housekeeping data on "A" channel including pitch,
roll and yaw attitude signals and certain
vehicle temperatures used in IR sensor calibration are transmitted as pulse code modulation (PCMA). Shutter times from the
Figure 12. Sample TIROS cloud patterns. Convective clouds over Lower California
(upper left) August 21, 1961. Classic hurricane symbol from cloud pattern of Hurricane Betsy (upper right) near 36°N, 59°W on September 8, 1961. Field of cellular
clouds (lower left) near 25°S, 10°E on July 31, 1961. Cirrus cloud streamers off
the Andes (lower right) passing eastward off the Argentine coast, August 3, 1961.
Advanced Vidicon Camera System (AVCS)
are sent in similar format on another channel.
This program will accept such information
and sort it from an intermixed input format.
PCMA Unpack and Monitor: Unpacks the
separate 7-bit raw count measurements and
translates selected quantities into meaningful
temperatures or angles. Items to be used
are examined for quality and format with
optional outputs for visual inspection.
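In modern notation the unpacking and translation steps might look like the sketch below. The field order within the packed word and the piecewise-linear calibration table are assumptions for illustration; the actual word layout awaited the final PCMA format:

```python
def unpack_7bit(word, counts_per_word=5, width=7):
    """Split a packed telemetry word into raw 7-bit counts, most
    significant field first (layout assumed; five 7-bit counts
    occupy 35 of a 36-bit word)."""
    mask = (1 << width) - 1
    return [(word >> (k * width)) & mask
            for k in range(counts_per_word - 1, -1, -1)]

def calibrate(count, table):
    """Translate a raw count into a temperature or angle by piecewise
    linear interpolation in a sorted (count, value) calibration table."""
    for (c0, v0), (c1, v1) in zip(table, table[1:]):
        if c0 <= count <= c1:
            return v0 + (v1 - v0) * (count - c0) / (c1 - c0)
    raise ValueError("raw count outside calibration table")
```

A count failing the table's range check is the kind of item that would be flagged for the optional visual-inspection output.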
PCMA Output: Organizes attitude, calibration temperature, and picture time information into tables and issues the information
in a form suitable for use by the main data
processing programs.
Time/Attitude Editor: Optionally accomplishes some of the above duties as required
in the event that this information is made
available in semi-processed form as a direct
digital message.
This section is in an active design status
awaiting final format of PCMA data and decision on items to be transmitted from Fairbanks, Alaska.
Picture Grid Tape and Rectification Program
Orbit: Based upon a specified time request, this subroutine supplies satellite altitude and latitude/longitude of the subsatellite point. The information is generated as a
prediction based upon periodically updated
fundamental orbital elements which are supplied by the main NASA orbital determination through Minitrack data.
Picture Attitude: Converts pitch, roll and
yaw error signals into nadir and azimuth
angles of each camera's principal line and
also provides a radial displacement correction to the orientation of each raster.
Distortion: On the basis of prelaunch target photos, produces radial and tangential
distortion corrections for a pre-selected
family of image raster points so that, through
interpolation, any image X, Y point can be
expressed in terms of two component angles
in object space.
Geography: Provides a large catalog of
latitude/longitude points along all major
coastlines of the world. The subroutine provides ordered groups of such points in short
segments for quick selection. Such coastlines are optionally included with latitude/
longitude lines in grids melded to the photos.
*Grid Meld and Rectification Locator:
This is the main program segment. It includes the basic calculations which produce
latitude and longitude from an X, Y image
point. The subroutines above serve as input
support. The primary output is approximately 1000 latitude/longitude locations from
a pre-selected open lattice of image locations. These locations are available in table
form for later interpolative rectification of
the entire picture raster.
Grid Meld Output: For every sixth scan
line of each picture raster, the locations of
latitude/longitude line crossings are calculated. This information from one simultaneous three picture cluster is logically
combined into a set of binary tape records
containing a series of three-bit code groups
and nine-bit count groups which tell where
over-ride signals are to replace the picture
signal and produce a dotted line grid.
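A sketch of such packing follows. The paper specifies only the 3-bit code and 9-bit count groups; the three-groups-per-36-bit-word, left-justified arrangement and the code values are assumptions for illustration:

```python
def pack_overrides(crossings):
    """Pack (code, run_length) pairs as 3-bit code + 9-bit count groups,
    three 12-bit groups to a 36-bit record word (three-per-word packing
    and left-to-right order are assumptions)."""
    groups = [((code & 0o7) << 9) | (count & 0o777)
              for code, count in crossings]
    words, w, n = [], 0, 0
    for g in groups:
        w, n = (w << 12) | g, n + 1
        if n == 3:
            words.append(w)
            w, n = 0, 0
    if n:
        words.append(w << (12 * (3 - n)))   # left-justify a short word
    return words
```

Each count can address up to 511 picture elements, ample for runs between grid-line crossings on a raster scan line.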
One such orbit routine has been produced
for TIROS. Revision awaits coordination with
NASA orbital computation group as to mathematical model to be used for Nimbus. Geographic coastline tables from TIROS have been
expanded to global coverage and are available
for Nimbus. Other portions are active.
Line Drawn Grid Program
*Grid Line Locator: A program similar
to the above but intended primarily for
emergency use. It computes X, Y image
points from pre-selected latitude/longitude
intersections.
Line Output: Generates a special format
tape for a model 3410 Electronic Associates
Data Plotter.
Cathode Ray Tube Grid Program
*CRT Grid Locator: Essentially a duplicate of the Grid Line Locator above.
CRT Output: Generates a special format
tape to guide the cathode ray tube beam to
produce grids recorded on microfilm from devices such as the Stromberg-Carlson Model
4020 Microfilm Recorder.
Both the line drawn and CRT grid programs have been completed as generalized
versions of TIROS packages and are being
used experimentally.
Digital Picture Ingestion
*Picture Sort: Digitized pictures arriving from the analogue to digital converter
through the external format control unit will
enter the computer in packed words. Each
36-bit word will contain 4-bit intensity measurements from nine consecutive scan spots
all from the same picture. A cyclic commutation intermixes such words from the three
cameras. This program sorts the information for output into separate files each containing information from only one camera.
The following subroutines support this task
and carry on added preprocessing functions.
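The sort can be sketched as follows. The spot order within the 36-bit word and the word-per-camera commutation pattern are assumptions consistent with, but not specified by, the text:

```python
def unpack_intensities(word):
    # Nine 4-bit scan-spot intensities per 36-bit word,
    # earliest spot in the high-order bits (assumed order).
    return [(word >> (4 * k)) & 0xF for k in range(8, -1, -1)]

def sort_cameras(words, n_cameras=3):
    """Undo the cyclic commutation: successive words belong to
    successive cameras, repeating every n_cameras words."""
    files = [[] for _ in range(n_cameras)]
    for k, w in enumerate(words):
        files[k % n_cameras].extend(unpack_intensities(w))
    return files
```

The result is one growing intensity file per camera, ready for the monitoring and output-to-storage steps below.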
Picture External Communicator: Picture
data is being recorded at 7-1/2 inches per
second into a bin tape recorder and the digital
conversion process consults this tape intermittently at 30 inches per second. The external communicator is really an extension
of the executive routine which sends out commands to stop and start the read capstan on
the bin tape recorder.
Picture Monitor: Provides superficial
checks to see that a signal is present, that
raster line synch marks are clear, etc.
Unrectified Print: Produced on the IBM 1401 printer, this provides a visual check of the raster and its relationship to the fiducial marks, with a single character corresponding to each scan spot.
Solar Ephemeris: With time of photo,
provides the latitude/longitude of the subsolar point from which usable sun angles may
be generated for later interpretation of brightness, reflection properties and other attributes of the image.
Sun Glint: Used in conjunction with the Solar Ephemeris routine, this earmarks that part of any image where the response is primarily caused by sun glint.
Output to Storage: Will consist of routine output commands to the two disk channels, but efficient positioning of the write arm is important since a maximum net transfer rate is required.
Most parts of this module are active. The
Solar Ephemeris has been completed as a
more efficient version of a similar TIROS
package. Input format and means of detecting ends of scan lines are being worked out
in conjunction with final design specifications of the Format Control Unit.
Picture Digestion and Production
*Picture Rectification: Utilizes the output
of the rectification locator program. Separate picture scan spots are repositioned in
sub-blocks of storage according to grid
squares on a standard map base. The following supporting packages are utilized.
Picture Selector: Provides input/output
selection capability. A picture will be specified by exposure time and as left, right or
center camera. A specification of core buffer
location and picture segment will result in
movement of the required item to or from
disk storage.
Brightness Normalizer: Adjusts the image
response for variations due to the scan electronics and also adjusts for pole to equator
illumination differences.
Background: Provides an updated background response from which current responses will be treated as anomalies. In this
way partial discrimination between cloud and
background will be possible.
Interpolate: Provides an efficient quadratic interpolation within a two-dimensional
array. This package will be used extensively
in connection with transformations from x, y
image locations to i, j map grids.
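A minimal sketch of such a routine, assuming successive three-point (quadratic Lagrange) interpolation in each dimension; the actual 7094 routine is not described in this detail:

```python
def quad1(f0, f1, f2, t):
    # Three-point Lagrange quadratic; f0, f1, f2 sampled at t = -1, 0, +1.
    return f1 + 0.5 * t * (f2 - f0) + 0.5 * t * t * (f2 - 2.0 * f1 + f0)

def quad2d(a, x, y):
    """Quadratic interpolation in 2-D array `a` at fractional (x, y):
    three row interpolations in x, then one in y across the results."""
    i = min(max(round(y), 1), len(a) - 2)     # nearest interior node
    j = min(max(round(x), 1), len(a[0]) - 2)
    tx, ty = x - j, y - i
    rows = [quad1(a[i + d][j - 1], a[i + d][j], a[i + d][j + 1], tx)
            for d in (-1, 0, 1)]
    return quad1(rows[0], rows[1], rows[2], ty)
```

By construction the scheme reproduces any field that is quadratic in each coordinate exactly, which is what makes it attractive for the x, y to i, j transformations.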
Indexing: A flexible subroutine which permits identification of storage location as a
function of i, j location in square mesh grid
which is to be superimposed on a map projection.
Mosaicker: A routine which will combine
rectified, summarized data in an overlap
region based on priority selection rules.
Cloud Cover: Some 400 picture elements
falling in a ten nautical mile grid square will
be ranked as background, cloud or doubtful.
Percentage cloud cover and average cloud
cover brightness will be expressed as edited
output.
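An illustrative summarization, assuming a simple brightness-anomaly threshold against the background response (the threshold logic and the margin parameter are assumptions, not the operational discrimination rules):

```python
def summarize_square(samples, background, margin=2):
    """Rank each of the ~400 elements in a ten-mile grid square
    against the background response, then edit to percent cloud
    cover and mean cloud brightness."""
    cloud, doubtful = [], 0
    for brightness in samples:
        anomaly = brightness - background
        if anomaly > margin:
            cloud.append(brightness)    # bright anomaly: cloud
        elif anomaly > -margin:
            doubtful += 1               # near background: doubtful
    percent = 100.0 * len(cloud) / len(samples)
    mean = sum(cloud) / len(cloud) if cloud else None
    return percent, mean, doubtful
```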
Disjunction: Further interpretation of the
data used for cloud cover analysis will express the areal variability of cloud cover
thus distinguishing between scattered or
broken cloud arrays in large contiguous
masses as compared to other cases similar
in net cloud cover but distributed in a more
specular array.
Orientation: By comparing profiles of
response within a ten mile square using
samples taken from different radial orientations, certain streakiness and other features
of the pattern can be deduced.
Stereo Map: Computes i, j coordinates on a specified square mesh grid on a polar
stereographic map base for a given latitude/
longitude point on earth.
Mercator Map: Similar to above but using
a Mercator map base.
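The two mappings can be sketched as follows; the grid origin, mesh parameters, and the north polar stereographic form are illustrative assumptions:

```python
import math

def mercator_ij(lat_deg, lon_deg, lon0_deg=0.0, mesh_deg=1.0):
    """i, j on a square-mesh Mercator grid; mesh_deg is the spacing
    in equatorial degrees (origin and parameters assumed)."""
    x = (lon_deg - lon0_deg) % 360.0
    y = math.degrees(math.log(math.tan(math.radians(45.0 + lat_deg / 2.0))))
    return round(y / mesh_deg), round(x / mesh_deg)

def stereo_ij(lat_deg, lon_deg, mesh=1.0):
    """i, j on a north polar stereographic square mesh
    (pole at the origin; scaling is illustrative)."""
    r = 2.0 * math.tan(math.radians(90.0 - lat_deg) / 2.0) / mesh
    lam = math.radians(lon_deg)
    return round(r * math.cos(lam)), round(r * math.sin(lam))
```

Either function serves as the to-map step of the rectification and mosaicking packages: a latitude/longitude pair in, a square-mesh i, j index out.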
Grid Print Output: Prints out on standard IBM printer the various summarizations discussed above by using a character
for each 10 mile mesh interval (square type
and ten line per inch carriage control are
desirable). By coding character selection,
both quantitative and pictorial output can be
obtained.
Line Drawn Output: Contoured fields are
produced from magnetic tape on an Electronic
Associates Data Plotter, Model 3410. Cloud
height analyses will likely be produced by
this device.
CRT Output: Similar to grid print output
but utilizing a device such as the SC 4020
microfilm recorder.
Fax Output: Similar to the above but utilizing digital tape directly to drive a facsimile
scan device.
Most program segments are active. The
interpolation routine is in operation. The
background package will be self generating
after Nimbus launch in that clear air earth
views will be accumulated as background information. Stereo and Mercator mappers
have been produced. An experimental unrectified print package has also been produced.
MRIR Ingestion and Digestion
Programs
Scan Rate: The scan shaft angle corresponding to a specific sensor sample can be
deduced from a shaft angle reference pulse
but is also dependent on knowledge of scan
shaft spin rate and sampling frequency. This
subroutine will be available on an optional
basis to compute the spin rate by counting
shaft reference pulses over a given number
of cloud pulses.
*MRIR Ingestion: Manages input, partial
processing and places raw product in intermediate storage.
Scanner Attitude: Similar to picture attitude routine but supplies a series of nadir
and azimuth angles along a scan swath.
Space Cropper: From height supplied
by orbit routine and roll correction, provides identification of IR samples with respect to scan shaft reference pulse thus
permitting rejection of all but earth viewing
sample.
Earth Locator: An adaptation of the picture locator package which furnishes latitude/
longitude information from input provided by
orbit and attitude routines.
Solar Sector: By using the solar ephemeris and location of viewed spot, provides
solar angles for interpretation of data.
MRIR Data/Format Monitor: Inspects the
raw data to detect format errors and to judge
the general quality of the data (noise). Failure to pass acceptance tests causes visual
output for further inspection.
*MRIR Format and Output: Creates the
archivable intermediate source tape from
which various output products are derived.
This main portion utilizes the routines below
and some of those above which cannot be
utilized for want of time during the ingestive
phase.
Calibration: A step-wise two-dimensional array interpolation which produces effective black body temperatures from raw sensor counts as a function of environmental temperatures adjacent to the sensors and in the electronic data transmission equipment.
Documentation: Places appropriate identification on the archivable product including
orbit number, date, time, etc.
Parts of this package that are also used
with HRIR are active. Earth Locator and
Calibration will be minor revisions of TIROS
routines.
MRIR Production Programs
*MRIR Mapper: Consults the final Meteorological Radiation tape produced by the
digestive programs and generates fields of
derived quantities as indicated below. Also
supervises the various output packages.
Cloud Height Analyzer: With the aid of a
temperature height analysis based upon existing observations and climatology, provides a
map of height information based on water
vapor window measurements. This information is now available in consort with cloud
photo information for further interpretation.
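A sketch of the height assignment, assuming a simple inverse linear interpolation in a temperature-height profile built from existing observations and climatology (the linear inversion is an illustrative assumption):

```python
def cloud_top_height(window_temp, profile):
    """Invert a temperature-height analysis: `profile` is a list of
    (height_km, temperature_K) pairs with temperature decreasing
    upward; returns the height whose analyzed temperature matches
    the water-vapor-window measurement."""
    for (h0, t0), (h1, t1) in zip(profile, profile[1:]):
        if t1 <= window_temp <= t0:
            return h0 + (h1 - h0) * (t0 - window_temp) / (t0 - t1)
    return None  # window temperature outside the profile
```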
Limb Darkening: Provides corrections to
sensor response as a function of viewing
angle (path length).
Net Flux: Creates a map indicating the
net radiative flux (incoming short wave vs.
outgoing long wave) through a functional combination of sensor responses.
Albedo: Produces a map of reflectivity
of the cloud patterns.
MRIR Print Output: These output programs are minor revisions of those mentioned for cloud photos.
MRIR Line Drawn Output:
MRIR Fax Output:
MRIR CRT Output:
This portion is generally not active pending decisions on availability of portions of
data in real time.
HRIR Ingestion and Digestion Programs
*HRIR Ingestion and Format: A CDC
160A computer program which accepts packed
raw count information, unpacks and edits the
data with the help of the two routines below.
HRIR Space Cropper: A preliminary separation of earth and space viewing response
is accomplished without specific height or
attitude input in order to eliminate unwanted
response without using a highly complex program on a small computer.
HRIR Format Monitor: Detects unsatisfactory quality of input data and optionally
generates output for visual inspection (see
similar MRIR routine).
*HRIR Digestion: Provides intermediate
calibrated and geographically located data as
indicated above for MRIR. Many of the subroutines cited above for MRIR are also applied directly to HRIR.
HRIR Calibration: A simplified version
of the similar MRIR routine.
HRIR Format and Output: Generates the
archivable product source tape. Single channel sensor output is arranged in a format
somewhat different from that used for the
multi-channel MRIR.
This module is active. The ingestive portion using the 160A is being carried out
by contract with National Computer Analysts
(NCA), Princeton, N. J. An internal segment
of the HRIR Digestion package which precisely defines the earth viewed data sample
is in check out.
HRIR Production Programs
These programs borrow heavily from the
MRIR cloud height analysis and the photo
cloud cover routines described above. Output routines will also be minor variations of
those discussed.
Some output routines await word format
specifications and instruction sets for prototype output hardware. Special character
chains for computer printer output are being
considered.
Picture Grid Melding Program
*CDC 160A Grid Meld: Provides synchronous recording of digital grid signals produced by the IBM 7094 and the analog picture
raster.
Time Check: Insures correspondence between gridding signals and pictures by input
of PCM time groups direct from the analog
picture tape and the comparable time information which accompanies the gridding
signals.
Panel Documentation: Provides documentation information from the 7094 produced tape in proper format for output to the
multitrack analogue picture tape such that a
documentation panel is activated as the
gridded picture is produced for film recording.
This segment is completed and awaiting
non-digital equipment for final checkout.
Details of Panel Documentation await final
design specification of panel display device.
Simulation Support Programs
Certain non-operational programs are
useful as feasibility and timing experiments
while others produce interface input or output product samples which serve to check
out segments of operational programs. Some
of these have been produced:
AVCS Photo Rectification Study
HRIR FMRT Output Simulation
MRIR Raw Data Simulation
Executive Routine Test
Various phases of the photo rectification
study have been completed including gray
scale experiments on a digital CRT, filler
experiments and obtaining timing figures.
Other Simulation Programs Test Hardware:
Passive Switching Exerciser (7094)
Active Switching Exerciser (7094)
Control Logic Communicator (for 7094
and 160A)
Format Control Test (for 7094 and 160A)
Analog to Digital Test (7094)
AVCS Picture Tape Test (160A)
These routines are awaiting final design
specifications and specific control formats.
REFERENCES
American Meteorological Society, 1962: Statement on Weather Forecasting. Bulletin A.M.S., Vol. 43, No. 6, June 1962, 251.
Bandeen, W. R., 1962: TIROS II Radiation Data User's Manual Supplement. A & M Div., GSFC, NASA, May 15, 1962.
Dean, C., 1961: Grid Program for TIROS II Pictures. Allied Research Associates, Inc., Contract No. Cwb 10023, Final Report, March 1961.
Frankel, M. and C. L. Bristor, 1962: Perspective Locator Grids for TIROS Pictures. Meteorological Satellite Laboratory Report No. 11, U. S. Weather Bureau, 1962.
Fritz, S. and J. S. Winston, 1962: Synoptic Use of Radiation Measurements from TIROS II. Monthly Weather Review, 90 (1), January 1962.
Johnson, D. S., 1962: Meteorological Measurements from Satellites. Bulletin A.M.S., Vol. 43, No. 9, September 1962.
National Aeronautics and Space Administration and U. S. Weather Bureau, 1962: Final Report on the TIROS I Meteorological Satellite System. NASA Tech. Report No. R-131.
National Aeronautics and Space Administration and U. S. Weather Bureau, 1961a: Abstracts and Figures of Lectures and Reprints of Reference Papers. The International Meteorological Satellite Workshop, Washington, D. C., Nov. 13-22, 1961.
National Aeronautics and Space Administration and U. S. Weather Bureau, 1961b: TIROS II Radiation Data User's Manual, August 1961.
Phillips, N. A., 1960: Numerical Weather Prediction. Advances in Computers, Vol. I, edited by Franz L. Alt, Academic Press, 1960, 43-51.
Stampfl, R. A. and H. Press, 1962: The Nimbus Spacecraft System. To be published in Aerospace Engineering, 21 (7).
Winston, J. S. and P. K. Rao, 1962: Preliminary Study of Planetary Scale Outgoing Long Wave Radiation as Derived from TIROS II Measurements. Monthly Weather Review, 90, August 1962.
PROCESSING SATELLITE WEATHER DATA - A STATUS REPORT - PART II
Laurence I. Miller
U. S. Weather Bureau
Washington, D. C.

SUMMARY
Experience gained from earlier meteorological satellites provides a firm background for the basic design of the data processing center. Nevertheless, the almost limitless nature of the sampled data and some uncertainty as to the optimum forms of the final products dictate the need for providing the basic system with extreme flexibility and good growth potential. To achieve the desired versatility, the various portions of the system are being designed so that their functions are almost entirely programmable, facilitating rapid conversion to handle new types of data and cope with changing situations.

Maximum utilization of the computer's logical capabilities is stressed to avoid redundant construction of analog hardware and/or special "black boxes." An executive monitor program is designed to provide the necessary link between computer and external hardware. Emphasis is placed on the centralization of control and the modular design of the main programming packages.

INTRODUCTION
In Part I of this report, reference was made to the site of the data-processing center, with only passing comment on the communication network and the system being designed to manage, edit, process and output the enormous volume of data. The data processing plan for the operational meteorological satellite, Nimbus, is the result of a continuing research and development program begun after World War II with German and American rockets and continued more recently with the highly successful TIROS satellites. It is beyond the scope of this report to provide a detailed description of the TIROS satellites; however, Table 1 provides a ready comparison between some of the more salient features of the two systems and furnishes a foundation for the ensuing more detailed description of the Nimbus data-processing system.

Limited computer processing of TIROS data was discussed in Part I, and details of the difficulty of "real-time" computer processing of the information have been given elsewhere, along with an engineering description of the first TIROS satellite and a meteorological analysis of some of the data [4]. An equally important consideration in the decision not to prepare elaborate data-processing codes to handle the TIROS data was the limitation in speed and storage capacity of the digital computers in existence when the TIROS design was considered. The time required to compute a reprojected image of one complete photograph approached the elapsed time of one entire orbit [5]. Although attention will be given to this problem in a subsequent section, it hardly seems redundant to point out that computers of the present generation are still barely adequate to this task.
Table 1
Comparison of Nimbus and TIROS

                                    TIROS            Nimbus
Height (inches)                     19               118
Diameter (inches)                   42               57
Weight (pounds)                     300              650
Orbital Altitude (nautical miles)   380              500
Orbital Inclination                 48° Equatorial   80° Polar
Stabilization                       Spin-Stabilized  Earth-Seeking (3 axes)
Earth Coverage (%)                  10-25            100
Camera Raster (lines per frame)     500              833
TV Resolution (miles)               1                1/2
Maximum Power Available (watts)     20               400
IR Sensors (resolution, miles)      MRIR (30)        MRIR (30)-HRIR (5)
Period (minutes)                    App. 100         App. 100
No. of Cameras                      2                3
Command Stations                    2                1
The second part of this paper serves three
purposes: to examine the logical layout of
the central computer with associated peripheral equipment and external hardware; to
describe the functioning of the data processing system, emphasizing the logical capabilities of the computer; to discuss the vital link
between computer and external hardware
provided by an executive monitor program.
DATA TRANSMISSION
Figure 1 is a generalized schematic representation of the flow of data from Nimbus
to the National Weather Satellite Center
(NWSC), Suitland, Md., via the command and
data acquisition (CDA) station at Fairbanks,
Alaska. The proposed transmission facility
between Alaska and Suitland will utilize two
48 Kc lines, known commercially as Telpak
B. The telemetry aboard the satellite provides information on the spacecraft environment and attitude as well as information from
the three meteorological experiments. Data
recorded on magnetic tape recorders aboard
the vehicle are telemetered to the ground
station using an FM-FM system to accommodate the considerable information bandwidth.
Somewhat different considerations apply
to each of the multiple sensor and environmental signals as they are initially recorded
on the spacecraft, telemetered to the ground
and finally received at the transmission terminal equipment. These features are summarized as follows:
Figure 1. Schematic representation of the flow of data from Nimbus to the National Weather Satellite Center.

AVCS
Each of the three video cameras is simultaneously exposed for 40 milliseconds, scanned for 6.75 seconds and recorded on magnetic tape at 30 i.p.s. Although the exposures of the thirty-three frames (three-picture sets) are 108 seconds apart, only 3.7
minutes of actual recording time is required.
Playback to ground is maintained at 30 i.p.s.
but is recorded, still in FM form, at 60 i.p.s.
Since the long line bandwidths are not sufficient to accommodate the frequency range,
Proceedings-Fall Joint Computer Conference, 1962 / 21
the ground tape is rewound and then relayed
to the NWSC in 30.85 minutes at 7.5 i.p.s.
HRIR
The narrow angle high resolution radiation
sensor is active only during the dark southbound portion of the orbit of approximately
64 minutes. During this time data is recorded at 3.75 i.p.s. and then telemetered to
the CDA station in 8.1 minutes at 30 i.p.s.
The transmission is received at Suitland in
8.1 minutes; however, the data are recorded
at 60 i.p.s.
MRIR
Five medium resolution radiation sensors
scan from horizon to horizon during the entire orbit. An endless tape loop records the
data continuously (except during readout) at
0.4 i.p.s.; increasing the playback speed by
a factor of 30 reduces the readout time to
3.6 minutes. The data is recorded at Suitland
at 30 i.p.s.
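The readout times quoted for HRIR and MRIR follow directly from the record-to-playback tape-speed ratio; a quick arithmetic check (the function name is illustrative):

```python
def readout_minutes(record_minutes, record_ips, playback_ips):
    # Readout time scales by the record-to-playback tape speed ratio.
    return record_minutes * record_ips / playback_ips

hrir = readout_minutes(64.8, 3.75, 30.0)   # dark-side HRIR pass, about 8.1 min
mrir = readout_minutes(108.0, 0.4, 12.0)   # full-orbit MRIR loop, about 3.6 min
```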
PCM
Spacecraft environmental signals, including attitude signals, vehicle temperatures
and other housekeeping data, are transmitted
as pulse code modulation (PCM). This information is also recorded during the entire
orbit in a similar manner to the MRIR, discussed earlier.
The "real-time" aspects of the operation
are accentuated by the undelayed transmission of the PCM and infrared data directly to
the NWSC computers over the leased microwave facilities. The total time required for
complete satellite interrogation is 8+ minutes; therefore, all but three to four orbits a day can be recorded at Fairbanks.* Transmission of the video data to Suitland is delayed about 10 minutes while the computers convert the raw PCM data to useful parameters; therefore, all the data is not received at the center until approximately 40 minutes after the start of interrogation. Direct access of the data to the IBM computer is accomplished by means of a Direct Data Connection (DDC), which permits real-time transmission between 7094 storage and external devices at rates up to 152,777 words per second.

*The east coast of North America is being considered as a site for a second CDA station.
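In bit terms the quoted DDC ceiling amounts to roughly 5.5 million bits per second for 36-bit 7094 words:

```python
WORD_BITS = 36                   # IBM 7094 word length
DDC_WORDS_PER_SECOND = 152_777   # DDC ceiling quoted above
DDC_BITS_PER_SECOND = DDC_WORDS_PER_SECOND * WORD_BITS  # 5,499,972
```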
The NASA Space Computing Center at the
Goddard Space Flight Center supplies a set
of orbital elements, which are periodically
updated by information received from the
world-wide Minitrack network. Prior to
satellite interrogation these elements are
converted to satellite latitude, longitude and
height as a function of the orbit time.
INPUT DATA
Before turning to a consideration of the
high data rates as they pertain to the "realtime" system, let us briefly outline the presently proposed computer complex. The primary computer will be a 32,000 word core
memory IBM 7094 equipped with the following elements: fourteen MOD V magnetic tape
drives shared between two channels, two 1301
disk files each connected to a separate channel, one DDC attached to a tape channel, a core storage clock and interval timer, an on-line printer and card-reader. Two smaller-scale computers will also be available: an IBM 1401 to serve primarily as an input-output device to the 7094, and a CDC 160A to
be used in the picture-gridding program and
to some degree as a preprocessor for the
less voluminous MRIR and HRIR data.
Table 2 provides a summary of the volume
and real-time rates (equivalent to 60 i.p.s.
playback) of the experimental data, and
Table 3 provides the data rates of the 7094
input-output equipment. From a consideration of the simultaneous input-output computing abilities of the 7094, and the effective
use of optimum buffering techniques, it appears at first that the severest constraint to
operational use of the data is imposed by the
acceptance rates of the DDC and the temporary storage devices. However, closer examination of the basic machine cycle time
(2.0 microseconds) and the frequency of main
frame cycles borrowed by the input-output
equipment reveals that insufficient editing,
buffering and operational programming time
would be available even if the basic acceptance and transfer rates could be appreciably
increased.†
†The 7094 was selected as the result of a
staff study which considered among other
things delivery dates, performance and reliability, software, user groups, and especially speed and storage capacity.
22 / Processing Satellite Weather Data - A Status Report - Part II
Table 2
Satellite and Station Recording Rates

[The tabulation is garbled in this copy. For each data type (AVCS, HRIR, MRIR, PCM-A) it lists the satellite record time (min.) and tape speed (i.p.s.), the playback time and speed, and the corresponding playback times and speeds at the Fairbanks and NWSC stations; AVCS is recorded at 7.5 i.p.s. and played back in batches at 30 i.p.s., while some of the data reach the computer directly.]

Table 3
Volume and Real Time Data Rates

                                 Binary Bits     Bits/second
AVCS                             275,000,000     App. 1,402,920
HRIR                              14,700,000     59,000
MRIR                               3,600,000     134,400
IBM 729 Mod IV (high density)                    375,000
IBM 729 Mod VI (high density)                    540,000
IBM 1301 Disk                                    App. 500,000
IBM DDC                                          App. 1,000,000

The required high rate of data transmission is obtained by maintaining a continuous flow between the transmission line and the computers. The analog signals are detected, multiplexed and introduced to an analog-to-digital converter which encodes the sampled values in digital form while preserving the integrity and rate of the data. In the case of the video signals the data are recorded at 7.5 i.p.s. in a special bin storage recorder which permits the information to be read into the computer in batches at 30 i.p.s., well within the data handling capabilities of the computer.

DESIGN CONSIDERATIONS
During all phases of the system design it has been vital for us to consider both the high degree of flexibility and growth potential inherent in the Nimbus Research and Development program and the implications of future programs of international cooperation in weather satellites. Further, as the system passes from the experimental phase to the truly operational stage the degree of automation will increase and eventually replace manually performed functions. The required balance between these practical considerations and the need to assume an immediate operational posture has been achieved by designing the structure of the combined digital-analog complex as machine, not hardware, oriented.

To achieve the desired versatility, the operation of the various portions of the system is being designed so that their functions are almost entirely programmable, to facilitate rapid conversions to handle new types of data and cope with changing situations. Emphasis has been placed on the modular concept so that substitution of one package for another does not have ramifications throughout the entire system. Maximum utilization of the computer's logical capabilities has been stressed to avoid redundant construction of analog or special hardware. Wherever possible, major hardware units are standard, dependable general purpose equipment; and where it has been necessary to build special equipment, these are of the patch-board variety.
CONTROL PHILOSOPHY
The Nimbus system has a common base
with many other complex systems where
computers are employed for such vital functions as information storage, retrieval and
display. Inherent in most of these systems
(e.g., BMEWS, SAGE, MERCURY) is a complex information processing problem which
Proceedings-Fall Joint Computer Conference, 1962 / 23
requires intervention of skilled personnel to
make the ultimate decision. These systems
serve to provide a broad basis of facts on
which the dominant information processor,
man, can make his decision. Whereas these
systems have been designed because it is
possible to differentiate between the normal
and abnormal, no such clear-cut definition
exists in our weather system. Logical uses
of pattern recognition theory and meteorological research may well negate this last
remark, but such techniques are beyond the
state-of-the-art at this time.
A second difference arises when we consider that the ingestive program is not engaged throughout the entire processing cycle,
i.e., the time between successive readouts.
During the ingestive phase (phase I) the external hardware may be completely active or passive or any combination of the two; during the non-ingestive phase (phase II) the external hardware is predominantly passive.
At any time during a processing cycle both
diagnostic and management interrupts may
occur, but the type of program control invoked must be considered in light of these
two phases. Management interrupts which
may occur at any time are caused by the
normal transfer of data through the computer
and must be given immediate priority. A
component of the external hardware which
monitors the system to prevent loss of quality
or integrity of the information may also provide a diagnostic interrupt at any phase;
however, the right to take action is reserved
to the computer. During phase I the computer must be programmed to take immediate action; however, during phase II the suspected malfunction may be beyond the present logical flow of information, and the computer may merely advise a supervisor and
refuse to disturb the present operation. The
monitoring and diagnostic control programs
must be optimized as a function of the two
phases.
At the time of initial launch when complete understanding of all possible system
malfunctions is lacking, problems may arise
which have not been anticipated. To cope
with this situation a special manual mode of
operation is provided which permits human
intervention to apply recovery techniques.
As a further "guard to the guards" a real
time programmable clock senses the status
of each phase and signals the present mode
of operation.
It appears that the regularity of the data
and uniformity of time scale should best be
served by an automated system with minimum human intervention. This philosophy is
controlled by an executive program which
also provides the link between the computer
and external equipment.
EXECUTIVE PROGRAM
The actual machine program consists of
five main sections:
1. Internal Control: Coordinates and ties
together the other portions of the executive
monitor. It also requests other program
modules from the system file and provides
for operator override.
2. Schedule: Accepts pre-readout information concerning the data to come and establishes the time schedule and sequence of
program modules to be consulted for that
orbit.
3. Interrupt Interpreter: Diagnoses the
interrupt from the standpoint of source and
reason and directs the computer to the appropriate action. Interrupts may come from
the clock, from the direct data connection
interrupt wire, from the external interrupt
or from regular channel commands.
4. Logical External Communicator: On
the basis of clock alarms or otherwise, sends
commands to control the mode of operation
of the nondigital hardware. This routine is
linked to the interrupt interpreter.
5. Clock Manager: Provides the means
for setting the interval timer and causing
clock interrupts and also fulfills program
requests for time information.
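As a rough sketch only, the five sections might cooperate as below; the class, the module names, and the dispatch rules are illustrative assumptions, not details of the actual executive program.

```python
# Minimal skeleton of the five executive sections described above.
# All names and the dispatch logic are invented for illustration.

INTERRUPT_SOURCES = ("clock", "ddc_wire", "external", "channel")

class Executive:
    def __init__(self):
        self.orbit_schedule = []          # (minute, module-name) pairs

    # 2. Schedule: accept pre-readout information for the coming
    # orbit and establish the time sequence of program modules.
    def build_schedule(self, pre_readout):
        self.orbit_schedule = sorted(pre_readout)

    # 3. Interrupt Interpreter: diagnose the source and direct the
    # computer to the appropriate action.
    def interpret(self, source):
        if source not in INTERRUPT_SOURCES:
            raise ValueError("unknown interrupt source")
        if source == "clock":
            # 4. Clock alarms go to the Logical External Communicator,
            # which commands the mode of the non-digital hardware.
            return "external_communicator"
        return "data_transfer_handler"

    # 1. Internal Control: consult the scheduled modules in turn.
    def run_orbit(self):
        return [module for _, module in self.orbit_schedule]

exec_prog = Executive()
exec_prog.build_schedule([(8, "mode_2"), (0, "mode_1"), (15, "mode_3")])
print(exec_prog.run_orbit())            # modules in time order
print(exec_prog.interpret("clock"))
```

The Clock Manager (5) would sit behind the "clock" source here, setting the interval timer and answering program requests for time.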
However, the executive program is more
than a series of machine instructions which
controls the flow of information through the
computer and the interaction between the
main program modules. It is, in fact, the
guiding philosophy of the entire data processing system. The program consists of a
rigid set of rules and controls which determine the manner in which the various resources available are utilized in the satisfaction of the system design characteristics.
At first glance, it seems paradoxical that the Nimbus system, always on the side of growth and flexibility, should make such precise demands at the very heart of the system. Nevertheless, without such a firm foundation our system would be at best unstable and at worst completely unable to meet the specific
requirements of growth and flexibility from
within the physical and environmental constraints imposed by the system. The original
form of the executive program will be overlaid by many accretions, some of which may
be major before we are through. The executive program will be the subject of a future
paper.
SYSTEM DESCRIPTION
The functioning of the data processing system is best illustrated through a description
of the events that occur during one orbital
cycle. As the information is received at the
common carrier terminal equipment, the
data are directed into three main channels.
1. Into a monitor tape recorder which at
all times records the input from the transmission lines at appropriate speeds. All
data are stored as received providing a safeguard against loss of data in case of breakdown of the processing equipment. This tape
also .serves as an archive copy until replaced
by the CDA master tape.
2. Into a picture gridding and reproduction
branch, in which the analog AVCS signals can
be directly reproduced in pictorial form,
with the insertion of computed latitude, longitude and geographic boundary grids and
appropriate legends.
3. Into a digitizing subsystem where the
incoming data are formatted, converted from
analog to digital and transferred to the computers.
Twelve separate modes of operation appear at least once during each complete
cycle. * Modes 1 through 11 occur (with considerable overlap) during the data ingestion
phase when the primary role of the master
computer is one of system command and
control and only editing and minimum computations are performed. Mode 12 represents the time allocated to the main data
processing programs which are described in
the appendix to this report. During this
phase the executive program continues to
provide the link between the program modules. However, the master computer relinquishes control of the external hardware and
*An extra burden is placed on the system
when two orbits are stored aboard the
spacecraft and simultaneously acquired by
the CDA station. An alternate mode is provided but will not be treated in this pape r.
peripheral computers to allow manual control for special functions, e.g., archival operations, preventative maintenance. Thus, it
can be seen that approximately 60% of each
cycle is available for computation during
which system software is minimized so as
not to interfere with the program's capacity
to perform the basic function. The modes
for this system are shown in Figure 2 and
are as follows:
Mode 1. Initial load - receive and process
preinterrogation message and compute orbital track. Activate monitor tape recorder
and generate modes 2, 3, 4.
Mode 2. Receive HRIR picture and time
data and switch this information to tape bin
recorder #1. This mode is terminated by
the computer upon receipt of an end of transmission code.
Mode 3. Receive PCM-A data and switch
to demodulator and decoding circuits which
convert the data to digital form. Transfer
the information through the Format Control
Unit (FCU) to the 7094. The computer senses
the end of data to terminate the mode.
Mode 4. Receive AVCS time and direct
the information to demodulator and decoding
circuits which convert the amplitude modulated time information to digital form. The
data is switched to the computer via the FCU.
Upon receipt of all pictures the computer
ends this mode.
Mode 5. Playback the HRIR data as soon
as it is recorded and dumped into the bin
(Mode 2). The information is converted to
digital form and routed through the 160A
computer to produce an edited digital tape.
Mode 6. Receive MRIR data and record
on bin recorder.
Mode 7. Playback MRIR data to digitizing
sub-system.
Mode 8. Receive AVCS data and record
on bin recorder at 7.5 i.p.s. allowing tape to
fill up bin.
Mode 9. Playback AVCS data from bin
recorder in short bursts at 30 i.p.s. through
digitizing sub-system to 7094.
Mode 10. Receive AVCS data and switch
information into picture gridding and reproduction branch.
Mode 11. Transfer HRIR digital tape
from 160A to 7094.
Mode 12. Relinquish automatic control of
the external equipment. Process the data.
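The statement that roughly 60% of each cycle remains for computation can be checked against numbers given earlier in the text: the roughly 100-minute cycle spanned by Figure 2 and the approximately 40 minutes that elapse before all the data are received. This is a back-of-envelope sketch, not a measured breakdown.

```python
# Back-of-envelope check of the "approximately 60% of each cycle"
# claim. The 100-minute cycle and 40-minute ingestion figure come
# from the text; the clean split is an approximation.

CYCLE_MINUTES = 100        # one data processing cycle (Figure 2 axis)
INGESTION_MINUTES = 40     # all data received ~40 min after interrogation

computation_fraction = (CYCLE_MINUTES - INGESTION_MINUTES) / CYCLE_MINUTES
print(f"{computation_fraction:.0%} of the cycle free for Mode 12")
```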
[Figure 2 is a timing chart of the twelve functional modes across the 100-minute data processing cycle (time axis in minutes, 0 to 100): loading of orbital data (A), AVCS time (B), MRIR (C), HRIR time (D), and PCM-A (E); preparation of the gridding tape for AVCS (F); loading of AVCS pictures; and a closing block for computation of data.]

Figure 2. Functional modes and computer usage diagram.

Although the limited goals of data storage and display are accomplished in quasi-real time, this represents only a partial fulfillment of the system design. The most important function of the computer will be the summarization of the data into convenient products for use in weather analysis and forecasting. The meteorological information will be disseminated as photographs, maps and charts, and coded teletypewriter analyses over domestic and international networks. True justification for the system design is possible only if we include this capability to obtain, upon programmed demand, these desired outputs, properly formatted, and to communicate this information to the outside world.
It is planned to distribute data to meteorologists in the following forms:
1. Gridded photographs (gridded meaning
having latitude and longitude lines).
2. Mosaics, one for each orbital swath
on a scale of about 1:10,000,000. The
resolution of these mosaics will be about 10
miles. Probably three base maps will be
made: polar stereographic for northern and
southern hemispheres and Mercator for
equatorial regions.
3. Mosaics similar to above for North
Atlantic, North America and possibly other
areas. This item has lower priority than 2
and it may prove easier for stations to prepare their own mosaics. Scale may possibly
be 1:20,000,000.
4. Infrared maps similar to 2, from HRIR
data.
5. Infrared maps showing cloud heights
from MRIR data. Scale possibly 1:20,000,000.
6. Graphical nephanalyses for stations
lacking capability of receiving more detailed
data.
7. Coded nephanalyses for stations having
only radio telegraph or radio teletype.
Distribution will be on a selective basis
so that to the greatest extent possible each
user will receive only the data he desires.
Although additional communication links will
be provided, distribution to many overseas
sites will necessarily be limited to radio,
including radio facsimile.
REFERENCES
1. Davis, R. M., "Methodology of System
Design."
2. Gass, S. I., Scott, M. B., Hoffman, R.,
Green, W. K., and Peckar, A., "Project
Mercury Real-Time. Computational and
Data Flow System," Proceedings of the
Eastern Joint Computer Conference, Dec.
1961.
3. Hosier, W. A., "Pitfalls and Safeguards
in Real-Time Digital Systems," Datamation, April, May 1962.
4. National Aeronautics and Space Administration, U. S. Weather Bureau, "Final Report on the TIROS I Meteorological Satellite System," NASA Tech. Report No. R-131, 1962.
5. Frankel, M. H., and Bristor, C. L., "Perspective Locator Grids for TIROS Pictures," Meteorological Satellite Laboratory Report No. 11, U. S. Weather Bureau, 1962.
6. Hall, F., "Weather Bureau Preliminary
Processing Plan."
7. Ess/Gee, Inc., "Nimbus Data Digitizing
and Gridding Sub-System Design Study,"
U. S. Weather Bureau Cwb 10264.
DESIGN OF A PHOTO
INTERPRETATION AUTOMATON*
W. S. Holmes
Head, Computer Research Department
H. R. Leland
Head, Cognitive Systems Section
G. E. Richmond
Principal Engineer
Cornell Aeronautical Laboratory, Inc.
Buffalo 21, New York
INTRODUCTION

The extremely large volume of photographic material now being provided by reconnaissance and surveillance systems, coupled with limited, but significant, successes in designing machinery to recognize patterns, has caused serious consideration to be given to the automation of certain portions of the photo interpretation task. While there is little present likelihood of successfully designing machines to interpret aerial photographs in a complete sense, there is ample evidence to support the conjecture that simple objects, and even some complex objects, in aerial photographs might be isolated and classified automatically. Even if machinery produced in the near future can only perform a preliminary sorting to rapidly winnow the input volume and to reduce human boredom and fatigue on simple recognition tasks, the development of such machinery may well be justified.

The supporting evidence for the conjecture that simple objects can be identified in aerial photographs is based on work which has shown experimentally that present pattern-recognition machinery-indeed that which existed several years ago-can be applied to the recognition of silhouetted, stylized objects which are militarily interesting. Murray has reported just such a capability for a simple linear discriminator.† Since the information required to design more capable recognition machines is readily available, it might seem that there is no problem of real interest remaining to make a rudimentary photo-interpretation machine an accomplished fact.

This, unfortunately, is not so. One of the most difficult problems is that which is referred to as the segmentation problem. The problem of pattern segmentation appears in almost all interesting pattern recognition problems, and is simply stated as the problem of determining where the pattern of interest begins and ends (as in speech recognition problems) or how one defines those precise regions or areas in a photo which constitute the patterns of interest. The problem exists whenever there is more than one

*This work was sponsored by the Geography Branch of the Office of Naval Research and by the Bureau of Naval Weapons.
†See "Perceptron Applications in Photo Interpretation," A. E. Murray, Photogrammetric Engineering, September 1961.
simple object in the entire field of consideration of the pattern recognizer. The situation appears almost hopeless when one finds
patterns of widely varying sizes, connected
to one another (in fact or by shadow), enclosed within other patterns, or having only
vaguely defined outlines.
This paper constitutes a report on a system which has been conceived to solve some
of these problems. It is being tested by
general-purpose computer implementation.
The system discussed represents one of several possible approaches to the problem and
had its design focused towards the use of
presently known capabilities in pattern recognizers. No special consideration has been
given, at this time, to methods of implementing the device; however, the entire system
can be built in at least one way.
System Principles
Figure 1 is the basic block diagram for
the system. It has evolved from evaluation of possible approaches suggested by research conducted at CAL, pattern recognition work of others, and techniques successfully used in other problems.

Figure 1. Photointerpretation system block diagram.
As is evident from Figure 1, objects of
interest have been categorized in two different ways. First, simple objects, such as
buildings, aircraft, ships, and tanks have
been distinguished from complexes, or complex objects. Second, simple objects have been categorized, according to their length-to-width ratios, as being either blobs (aircraft, storage tanks, buildings, runways) or
ribbons (roads, rivers, railroad tracks). As
shown, the detection of simple objects is accomplished separately for ribbons and for
blobs. In the work reported here the blob
channel-from the input end through the identification of a few complex objects-is receiving the major attention.
The preprocessing which is carried out
in the first portion of the system solves several of the problems inherent in the use of a
simple pattern-recognition device to aid in
the photo interpretation problem. Briefly,
objects are to be detected, isolated, and
standardized so that they can be presented
separately (not necessarily sequentially) for
identification.
The function performed at the object identification level is that of identifying the blobs
which have previously been detected, isolated, and standardized. The input material
to this level or stage consists of black-on-white objects. As has been previously indicated, existing devices are fundamentally capable of accomplishing the identification task.
At the complex object level, the location
and identification information available from
the simple object-level outputs is combined
and appropriately weighted to identify objects
at a higher level of complexity. An illustrative example is the combination of aircraft
(simple objects) near a runway (another
simple object) and a group of buildings (each
a simple object) to determine the existence
of an airfield.
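The airfield example lends itself to a toy sketch of weighted combination. The object types, weights, and threshold below are invented for illustration; the paper does not specify them.

```python
# Toy weighted-evidence combiner for the complex-object level.
# Weights and threshold are illustrative assumptions only.

AIRFIELD_WEIGHTS = {"aircraft": 2.0, "runway": 5.0, "building": 1.0}
AIRFIELD_THRESHOLD = 8.0

def airfield_score(simple_objects):
    """Sum weighted votes from co-located simple-object identifications."""
    return sum(AIRFIELD_WEIGHTS.get(kind, 0.0) for kind in simple_objects)

def is_airfield(simple_objects):
    return airfield_score(simple_objects) >= AIRFIELD_THRESHOLD

# Two aircraft near a runway and a group of buildings:
detections = ["aircraft", "aircraft", "runway", "building", "building"]
print(is_airfield(detections))    # True: 2 + 2 + 5 + 1 + 1 = 11 >= 8
```

A real combiner would also weight by the location and identification confidence of each simple object, as the text indicates.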
In the following sections the basic steps
in the preprocessing sequence will be described in more detail and some illustrations from current computer studies will be
discussed. The most difficult part of the
problem, by far, is that of detection.
Object Detection
A study of sample aerial photography suggests three ways in which images of objects
of interest differ from their backgrounds:
a. points on objects may differ in intensity from the intensity characterizing
the background.
b. objects may be (perhaps incompletely)
outlined by sharp edges, even though
the interior of the image has the same
characteristic intensity as the background.
c. objects may differ from background
only in texture, or two dimensional
frequency content.
Examples of the first two kinds of objects
are shown encircled in Figure 2. There
Figure 2. Examples of objects defined by intensity contrast (0) and by edges (~).
seem to be many fewer examples of objects
which differ from background solely by texture. This class of objects would be much
larger if our definition of object were
broader, including, for example, corn fields.
Perhaps the most useful area in which spatial frequency content can be put into use is
that of terrain classification. Terrain classification, as will be noted again later, can
play a significant role in the final identification of our narrower class of objects.
For detection of objects in classes a. and
b., we have been proceeding experimentally
to determine the capabilities of simple,
two-dimensional numerical filters, some
nonlinear and some linear.
For initial experimentation,* the object
filters for discrimination based on intensity
contrast (class a objects) were designed as
shown in Figure 3. Square apertures ("picture frame" regions) were used to compute
intensity information which was then compared with the intensity of the point at the
center of the square, A, to determine if the
central point differed sufficiently in intensity
from its background to qualify as being a
point of an object.
A computing method equivalent to the following was used. Each point in the input photograph was surrounded by a frame one point thick, and of width d (Figure 3). The mean, m, and standard deviation, σ, of the intensity of the points in the frame were then computed. If

A > m +
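The decision rule is truncated in this copy of the text. A common form of such a test flags the central point A when it differs from the frame mean by more than a fixed multiple t of the frame's standard deviation; the sketch below assumes that form, with d, t, and all intensity values chosen purely for illustration.

```python
import statistics

# Sketch of the "picture frame" contrast detector. The rule
# |A - m| > t * sigma is an assumed completion of the truncated test.

def frame_points(image, row, col, d):
    """Intensities on a square frame one point thick, of half-width d,
    centred at (row, col)."""
    pts = []
    for r in range(row - d, row + d + 1):
        for c in range(col - d, col + d + 1):
            if max(abs(r - row), abs(c - col)) == d:    # frame only
                pts.append(image[r][c])
    return pts

def is_object_point(image, row, col, d=2, t=3.0):
    """Does the central point differ enough from its background?"""
    frame = frame_points(image, row, col, d)
    m = statistics.mean(frame)
    sigma = statistics.pstdev(frame)
    return abs(image[row][col] - m) > t * sigma

# Uniform background of intensity 10 with one bright central point:
img = [[10] * 7 for _ in range(7)]
img[3][3] = 50
print(is_object_point(img, 3, 3))    # True: frame is uniform, centre is not
```

On a noisy background the frame's σ grows, so the same contrast must be proportionally larger to qualify, which is the appeal of a statistical rather than fixed threshold.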
[Figure 5 is a block diagram of the maximum simple 3600 system: housekeeping modules and data channels attached to storage of up to 262,144 51-bit words.]
Figure 5. Maximum 3600 simple system.
Figure 6A. Partially expanded 3600 system.
Figure 6B. Two 3600 systems sharing common storage.
82 / Planning the 3600
Figure 6C. Two-computer system with additional common storage.
awareness of the existence of the initial computer and input-output complex. The same holds for the initial system. Thus, each system, from an operating point of view, is completely independent of the other. The effects of mutual existence are detectable, however, though only indirectly. If both systems are using the same storage modules extensively, to the exclusion of their own private modules (if any), over-all operations may be slowed. For such operations, a more reasonable approach would give each computer-input-output complex its own storage module or modules and reserve the common storage
In a similar manner, other multi-computer
complexes can be constructed within the
interconnecting limitations imposed on individual modules.
Real-Time Multiplexed Systems
In multiplexed real-time systems, a high
degree of control is required over the entire
system. The modular approach planned permits this as an extension of the multicomputer complexes mentioned. In addition
to two or more completely independent systems sharing a common storage pool, a system is used whereby the independent systems
are also interconnected via their data channels. Each data channel is designed so it
may be connected directly, without intervening black boxes or cable adapters, into any
Figure 6D. Two-computer system with private and common storage.
other data channel. These other data channels may be on the same or different systems. The interconnecting linkage supplies
data paths and coupling information. In addition, each interconnecting link permits interruption in either direction, and presents
major fault information such as parity error
in storage or illegal operation code. Thus,
one computer may run in real time with data
and an initial reference program in a common storage pool. Another computer may
be in standby status and processing low
priority problems. The on-line computer
may interrupt the standby computer at any
time and request it to resume the problem.
The new on-line computer, depending upon
the status of the remainder of the system,
may merely serve as a substitute computational unit or as a total computational facility.
Either option is under program control. The
central controlling program would be stored
in a common storage unit with possible interchange with standby units. The computer
units are so designed that any one of a group
may act as the dominant force with option to
transfer the responsibility at any time. Figure 7 shows a typical multiplexed system.
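Under assumed names and a deliberately simplified handshake, the on-line/standby takeover described above might be sketched as follows; none of these states or methods are drawn from the actual 3600 design.

```python
# Illustrative on-line/standby takeover for the multiplexed system.
# States, method names, and the handshake are assumptions.

class Computer:
    def __init__(self, name, status):
        self.name = name
        self.status = status            # "on-line" or "standby"
        self.task = ("real-time program" if status == "on-line"
                     else "low-priority problems")

    def interrupt_takeover(self, standby):
        """The on-line computer interrupts the standby unit and
        requests it to resume the real-time problem, whose data and
        reference program sit in the common storage pool."""
        assert self.status == "on-line" and standby.status == "standby"
        standby.status, standby.task = "on-line", "real-time program"
        self.status, self.task = "standby", "low-priority problems"

a = Computer("A", "on-line")
b = Computer("B", "standby")
a.interrupt_takeover(b)
print(b.status, "/", a.status)       # on-line / standby
```

Because the controlling program lives in common storage, the roles swap without moving the real-time data at all, which is the point of the shared pool.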
Future Expansion
One of the initial planning goals required
long life through addition of newer equipment.
With simple in t e r f ace s chosen between
modules, it is a relatively simple matter to
change the system by the substitution of a
new module type for an older one. Newer
storage modules or special purpose computing modules may be easily added to the system. With the five access positions on each
storage module, new and unrelated module
types may be added to the system and controlled via the storage medium.
As an aid in adding new features to the
present design without the necessity of disrupting current or future units, a limited
micro-programming facility was designed
into the computer module. All control elements which the logical designer has at his
disposal when he designs instruction algorithms are available for further use. Very
high speed transmission lines terminate in
each of these many control elements. The other ends of the transmission lines are attached to a timing and control device which selectively pulses the lines at the same frequency as the basic computer instructions do. Thus, the cable is a logical extension of the computer module control circuitry. Instruction algorithms performed via this method perform at the same rate of speed as if they were incorporated into the original computer module design itself. Algorithms under active consideration are a square root and a generalized polynomial evaluator.

Figure 7. A multiplexed 3600 system.
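The square-root instruction mentioned above is exactly the sort of algorithm such a unit could sequence. The classical shift-and-subtract integer square root uses only the shifts, additions, subtractions, and tests a control sequencer provides; the sketch below is illustrative and bears no relation to the 3600's actual microcode.

```python
# Shift-and-subtract integer square root: the kind of hardware
# subroutine the micro-programming facility could realize without
# repeated storage references.

def isqrt(n):
    """Largest integer r with r * r <= n."""
    if n < 0:
        raise ValueError("negative operand")
    root, bit = 0, 1
    while (bit << 2) <= n:        # find the highest power of four <= n
        bit <<= 2
    while bit:
        if n >= root + bit:       # trial subtraction succeeds
            n -= root + bit
            root = (root >> 1) + bit
        else:                     # trial fails: just shift the root
            root >>= 1
        bit >>= 2
    return root

print(isqrt(152_777))             # 390, since 390 * 390 = 152,100
```

One result digit emerges per iteration, so the whole routine runs in a fixed, short number of cycles, which is why it pays to wire it in rather than call a stored subroutine.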
This facility permits later inclusion of
new instructions, hardware subroutines, or
modification of existing instructions without
modification to the computer system. This
facility is used by attaching an external unit
to the computer module. The external unit
contains timing elements, a function translator, and a small command structure. When
connected to the computer module, the external unit is considered an integral part of
the computer module. It is referenced by a
special instruction which gives total control
of the external unit. The unit then performs
the specified instruction or subroutine, and
then gives control of the computer back to
the main program.
The advantages of this facility are many: it is possible to include specialized instructions where very heavy usage is encountered; subroutines such as square root can be constructed, avoiding a multiplicity of storage references; and instructions can be given to
special equipment attached to the computer
module.
SUMMARY
A large-scale computer system is planned
in which modular and flexible system design
assures a reasonably long machine life. Relationships between modules are shown to be
highly important, particularly for future
expansion.
D825 - A MULTIPLE-COMPUTER SYSTEM
FOR COMMAND & CONTROL
James P. Anderson, Samuel A. Hoffman, Joseph Shifman, and Robert J. Williams
Burroughs Corporation
Burroughs Laboratories
Paoli, Pennsylvania
INTRODUCTION

The D825 Modular Data Processing System is the result of a Burroughs study, initiated several years ago, of the data processing requirements for command and control systems. The D825 has been developed for operation in the military environment. The initial system, constructed for the Naval Research Laboratory with the designation AN/GYK-3(V), has been completed and tested. This paper reviews the design criteria analysis and design rationale that led to the system structure of the D825. The implementation and operation of the system are also described. Of particular interest is the role that developed for an operating system program in coordinating the system components.

Functional Requirements of Command
and Control Data Processing

By "command and control system" is meant a system having the capacity to monitor and direct all aspects of the operation of a large man and machine complex. Until now, the term has been applied exclusively to certain military complexes, but could as well be applied to a fully integrated air traffic control system or even to the operation of a large industrial complex. Operation of command and control systems is characterized by an enormous quantity of diverse but interrelated tasks, generally arising in real time, which are best performed by automatic data-processing equipment, and are most effectively controlled in a fully integrated central data processing facility. The data processing functions alluded to are those typical of data processing, plus special functions associated with servicing displays, responding to manual insertion (through consoles) of data, and dealing with communications facilities. The design implications of these functions will be considered here.

Availability Criteria: The primary requirement of the data-processing facility, above all else, is availability. This requirement, essentially a function of hardware reliability and maintainability, is, to the user, simply the percentage of available, on-line operation time during a given time period. Every system designer must trade off the costs of designing for reliability against those incurred by unavailability, but in no other application are the costs of unavailability so high as those presented in command and control. Not only is the requirement for hardware reliability greater than that of commercial systems, but downtime for the complete system for preventive maintenance cannot be permitted. Depending upon the application, some greater or lesser portion of the complete system must always be available for primary system functions, and all of the system must be available most of the time.

The data processing facility may also be called upon, except at the most critical times, to take part in exercising and evaluating the operation of some parts of the system, or, in fact, in actual simulation of system functions. During such exercises and simulations, the system must maintain some
Proceedings-Fall Joint Computer Conference, 1962/87
(although perhaps partially and temporarily
degraded) real-life and real-time capability,
and must be able to return quickly to full operation. An implication here, of profound
significance in system design, is, again, the
requirement that most of the system be always available; there must be no system elements (unsupported by alternates) performing
functions so critical that failure at these
points could compromise the primary system functions.
Adaptability Criteria: Another requirement, equally difficult to achieve, is that the
computer system must be able to analyze the
demands being made upon it at any given
time, and determine from this analysis the
attention and emphasis that should be given
to the individual tasks of the problem mix
presented. The working configuration of the
system must be completely adaptable so as
to accommodate the diverse problem mixes,
and, moreover, must respond quickly to important changes, such as might be indicated
by external alarms or the results of internal
computations (exceeding of certain thresholds, for example), or to changes in the hardware configuration resulting from the failure
of a system component or from its intentional
removal from the system. The system must
have the ability to be dynamically and automatically restructured to a working configuration that is responsive to the problem-mix
environment.
Expansibility Criteria: The requirement
of expansibility is not unique to command and
control, but is a desirable feature in any application of data processing equipment. However, the need for expansibility is more acute
in command and control because of the dependence of much of the efficacy of the system upon an ability to meet the changing requirements brought on by the very rapidly
changing technology of warfare. Further, it
must be possible to incorporate new functions
in such a way that little or no transitional
downtime results in any hardware area.
Expansion should be possible without incurring the costs of providing more capability
than is needed at the time. This ability of
the system to grow to meet demands should
apply not only to the conventionally expansible areas of memory and I/O but to computational devices as well.
Programming Criteria: Expansion of the
data-processing facility should require no reprogramming of old functions, and programs
for new functions should be easily incorporated into the overall system. To achieve
this capability, programs must be written in
a manner which is independent of system
configuration or problem mix, and should
even be interchangeable between sites performing like tasks in different geographic
locales. Finally, because of the large volume of routines that must be written for a
command and control system, it should be
possible for many different people, in different locations and of different areas of
responsibility, to write portions of programs, and for the programs to be subsequently linked together by a suitable operating system.
Concomitant with the latter requirement
and with that of configuration-independent
programs is the desirability of orienting system design and operation toward the use of a
high-level procedure-oriented language. The
language should have the features of the usual
algorithmic languages for scientific computations, but should also include provisions
for maintaining large files of data sets which
may, in fact, be ill-structured. It is also
desirable that the language reflect the special nature of the application; this is especially true when the language is used to direct
the storage and retrieval of data.
Design Rationale for the Data-Processing Facility
The three requirements of availability,
adaptability, and expansibility were the motivating considerations in developing the
D825 design. In arriving at the final systems design, several existing and proposed
schemes for the organization of data processing systems were evaluated in light of
the requirements listed above. Many of the
same conclusions regarding these and other
schemes in the use of computers in command and control were reached independently
in a more recent study conducted for the
Department of Defense by the Institute for
Defense Analysis [1].
The Single-Computer System: The most
obvious system scheme, and the least acceptable for command and control, is the
single-computer system. This scheme fails
to meet the availability requirement simply
because the failure of any part (computer,
memory, or I/O control) disables the entire
system. Such a system was not given serious
consideration.
Replicated Single-Computer Systems: A
system organization that had been well known
at the time these considerations were active
involves the duplication (or triplication, etc.)
of single-computer systems to obtain availability and greater processing rates. This
approach appears initially attractive, inasmuch as programs for the application may
be split among two or more independent
single-computer systems, using as many
such systems as needed to perform all of the
required computation. Even the availability
requirement seems satisfied, since a redundant system may be kept in idle reserve as
backup for the main function.
On closer examination, however, it was
perceived that such a system had many disadvantages for command and control applications. Besides requiring considerable
human effort to coordinate the operation of
the systems, and considerable waste of available machine time, the replicated single computers were found to be ineffective because
of the highly interrelated way in which data
and programs are frequently used in command and control applications. Further, the
steps necessary to have the redundant or
backup system take over the main function,
should the need arise, would prove too cumbersome, particularly in a time-critical application where constant monitoring of events
is required.
Partially Shared Memory Schemes: It was
seen that if the replicated computer scheme
were to be modified by the use of partially
shared memory, some important new capabilities would arise. A partially shared
memory can take several forms, but provides principally for some shared storage
and some storage privately allotted to individual computers. The shared storage may
be of any kind-tapes, discs, or core-but
frequently is core. Such a system, by providing a direct path of communication between computers, goes a long way toward
satisfying the requirements listed above.
The one advantage to be found in having
some memory private to each computer is
that of data protection. This advantage vanishes when it is necessary to exchange data
between computers, for if a computer failure
were to occur, the contents of the private
memory of that computer would be lost to
the system. Furthermore, many tasks in the
command and control application require access to the same data. If, for example, it
were desired to make some privately stored
data available to the fully shared memory or
to some other private memory, considerable
time would be lost in
transferring the data. It is also clear that a
certain amount of utilization efficiency is
lost, since some private memory may be
unused, while another computer may require
more memory than is directly available, and
may be forced to transfer other blocks of
data back to bulk storage to make way for
the necessary storage. It might be added in
passing that if private I/O complements are
considered, the same questions of decreased
overall availability and decreased efficiency
arise.
Master/Slave Schemes: Another aspect
of the partially shared memory system is
that of control. A number of such systems
employ a master/slave scheme to achieve
control, a technique wherein one computer,
designated the master computer, coordinates
the work done by the others. The master
computer might be of a different character
than the others, as in the PILOT system,
developed by the National Bureau of Standards [2], or it may be of the same basic design, differing only in its prescribed role, as
in the Thompson Ramo Wooldridge TRW400
(AN/FSQ-27) [3]. Such a scheme does recognize the importance, for multicomputer systems, of the problem of coordinating the processing effort; the master computer is an
effective means of accomplishing the coordination. However, there are several difficulties in such a design. The loss of the
master computer would down the whole system, and the command and control availability
requirement could not, consequently, be met.
If this weakness is countered by providing
the ability for the master control function to
be automatically switched to another processor, there still remains an inherent inefficiency. If, for example, the workload of
the master computer becomes very large,
the master becomes a system bottleneck
resulting in inefficient use of all other system elements; and, on the other hand, if the
workload fails to keep the master busy, a
waste of computing power results. The conclusion is then reached that a master should
be established only when needed; this is what
has been done in the design of the D825.
The Totally Modular Scheme: As a result
of these analyses, certain implications became clear. The availability requirement
dictated a decentralization of the computing
function-that is, a multiplicity of computing
units. However, the nature of the problem
required that data be freely communicable
among these several computers. It was decided, therefore, that the memory system
would be completely shared by all processors.
And, from the point of view of availability
and efficiency, it was also seen to be undesirable to associate I/O with a particular
computer; the I/O control was, therefore,
also decoupled from the computers.
Furthermore, a system with several computers, totally shared memory, and decoupled
I/O seemed a perfect structure for satisfying
the adaptability requirements of command
and control. Such a structure resulted in a
flexibility of control which was a fine match
for the dynamic, highly variable, processing
requirements to be encountered.
The major problem remaining to realize
the computational potential represented by
such a system was, of course, that of coordinating the many system elements to behave,
at any given time, like a system specifically
designed to handle the set of tasks with which
it was faced at that time. Because of the
limitations of previously available equipment,
an operating system program had always
been identified with the equipment running
the program. However, in the proposed design, the entire memory was to be directly
accessible to all computer modules, and the
operating system could, therefore, be decoupled from any specific computer. The
operation of the system could be coordinated
by having any processor in the complement
run the operating system only as the need
arose. It became clear that the master computer had actually become a program stored
in totally shared memory, a transformation
which was also seen to offer enhanced programming flexibility.
Up to this point, the need for identical
computer modules had not been established.
The equality of responsibility among computing units, which allowed each computer to
perform as the master when running the operating system, led finally to the design
specification of identical computer modules.
These were freely interconnected to a set of
identical memory modules and a set of identical I/O control modules, the latter, in turn,
freely interconnected to a highly variable
and diverse I/O device complement. It was
clear that the complete modularity of system
elements was an effective solution to the
problem of expansibility, inasmuch as expansion could be accomplished simply by
adding modules identical to those in the
existing complement. It was also clear that
important advantages and economies resulting from the manufacture, maintenance, and
spare parts provisioning for identical modules also accrue to such a system. Perhaps
the most important result of a totally modular organization is that redundancy of the
required complement of any module type, for
greater reliability, is easily achieved by incorporating as little as one additional module
of that type in the system. Furthermore,
the additional module of each type need not
be idle; the system may be looked upon as
operating with active spares.
Thus, a design structure based upon complete modularity was set. Two items remained to weld the various functional modules into a coordinated system-a device to
electronically interconnect the modules, and
an operating system program with the effect
of a master computer, to coordinate the activities of the modules into fully integrated
system operation.
In the D825, these two tasks are carried
out by the switching interlock and the Automatic Operating and Scheduling Program
(AOSP), respectively. Figure 1 shows how
the various functional modules are interconnected via the interlock in a matrix-like
fashion.
System Implementation
Most important in the design implementation of the D825 were studies toward practical realization of the switching interlock
and the AOSP. The computer, memory, and
I/O control modules permitted more conventional solutions, but were each to incorporate
some unusual features, while many of the I/O
devices were selected from existing equipment. With the exception of the latter, all
of these elements are discussed here briefly.
(A summary of D825 characteristics and
specifications is included at the end of the
paper.)
Switching Interlock: Having determined
that only a completely shared memory system
would be adequate, it was necessary to find
some way to permit access to any memory
by any processor, and, in fact, to permit
sharing of a memory module by two or more processors or I/O control modules.

Figure 1. System Organization, Burroughs D825 Modular Data Processing System.
A function distributed physically through
all of the modules of a D825 system, designated in aggregate the switching interlock, effects electronically each of
the many brief interconnections by which all
information is transferred among computer,
memory, and I/O control modules. In addition
to the electronic switching function, the
switching interlock has the ability to detect
and resolve conflicts such as occur when two
or more computer modules attempt access
to the same memory module.
The switching interlock consists functionally of a crosspoint switch matrix which
effects the actual switching of bus interconnections, and a bus allocator which resolves
all time conflicts resulting from simultaneous requests for access to the same bus or
system module. Conflicting requests are
queued up according to the priority assigned
to the requestors. Priorities are pre-emptive
in that the appearance of a higher priority
request will cause service of that request
before service of a lower priority request
already in the queue. Analyses of queueing
probabilities have shown that queues longer
than one are extremely unlikely.
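The flavor of such a queueing analysis can be suggested with a simple collision estimate. The following modern sketch is not the original analysis; it assumes each active requestor addresses one of the memory modules uniformly and independently within a cycle, an idealization:

```python
def p_conflict(n_requestors: int, n_modules: int) -> float:
    """Probability that at least two of n_requestors address the same
    one of n_modules in a cycle, under uniform independent choice."""
    p_distinct = 1.0
    for k in range(n_requestors):
        # k requestors have already chosen; the next must avoid them all
        p_distinct *= (n_modules - k) / n_modules
    return 1.0 - p_distinct

# Four computer modules contending for sixteen memory modules:
print(round(p_conflict(4, 16), 3))  # roughly one cycle in three sees some conflict
```

Even when a conflict occurs under these assumptions, it almost always involves exactly two requestors, one of which is served immediately, consistent with queues longer than one being rare.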
The priority scheduling function is performed by the bus allocator, essentially a
set of logical matrices. The conflict matrix
detects the presence of conflicts in requests
for interconnection. The priority matrix
resolves the priority of each request. The
logical product of the states of the conflict
and priority matrices determines the state
of the queue matrix, which in turn governs
the setting of the crosspoint switch, unless
the requested module is busy.
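One cycle of this conflict-detection and priority-resolution logic might be sketched in software as follows. All names and structures here are illustrative; the actual bus allocator is combinational hardware realized as logical matrices, not a program:

```python
def allocate(requests, busy, priority):
    """One arbitration cycle. `requests` maps requestor -> wanted module,
    `busy` is the set of modules already in use, and `priority` ranks
    requestors (lower number = higher priority).
    Returns (grants, queued)."""
    grants, queued = {}, []
    # Conflict detection: group requestors by the module they want.
    by_module = {}
    for who, module in requests.items():
        by_module.setdefault(module, []).append(who)
    for module, contenders in by_module.items():
        contenders.sort(key=lambda w: priority[w])  # priority resolution
        winner, losers = contenders[0], contenders[1:]
        if module in busy:
            queued.extend(contenders)   # whole group waits for the module
        else:
            grants[winner] = module     # crosspoint switch would be set here
            queued.extend(losers)
    # Requests still waiting are held in priority order for the next cycle.
    queued.sort(key=lambda w: priority[w])
    return grants, queued

grants, queued = allocate(
    {"C1": "M3", "C2": "M3", "IO1": "M7"},
    busy=set(),
    priority={"C1": 0, "C2": 1, "IO1": 2})
```

A newly arrived higher-priority request naturally sorts ahead of those already waiting, giving the pre-emptive queue behavior described above.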
The AOSP: An Operating System Program: The AOSP is an operating system
program stored in totally shared memory
and therefore available to any computer.
The program is run only as needed to exert
control over the system. The AOSP includes
its own executive routine, an operating system for an operating system, as it were,
calling out additional routines, as required.
The configuration of the AOSP thus permits
variation from application to application,
both in sequence and quantity of available
routines and in disposition of AOSP storage.
The AOSP operates effectively on two
levels, one for system control, the other for
task processing.
The system control function embodies all
that is necessary to call system programs
and associated data from some location in
the I/O complement, and to ready the programs for execution by finding and allocating
space in memory, and initiating the processing. Most of the system control function (as
well as the task processing function) consists
of elaborate bookkeeping for: programs being
run, programs that are active (that is, occupy
memory space), I/O commands being executed, other I/O commands waiting, external
data blocks to be received and decoded, and
activation of the appropriate programs to
handle such external data. It would be inappropriate here to discuss the myriad details
of the AOSP; some idea of its scope, however,
can be obtained from the following list of
some of its major functions:
1. configuration determination,
2. memory allocation,
3. scheduling,
4. program readying and end-of-job
cleanup,
5. reporting and logging,
6. diagnostics and confidence checking,
7. external interrupt processing.
The task processing function of the AOSP
is to execute all program I/O requests in
order to centralize scheduling problems and
to protect the system from the possibility of
data destruction by ill-structured or conflicting programs.
AOSP Response to Interrupts: The AOSP
function depends heavily upon the comprehensive set of interrupts incorporated in the
D825. All interrupt conditions are transmitted to all computer modules in the system, and each computer module can respond
to all interrupt conditions. However, to make
it possible to distribute the responsibility
for various interrupt conditions, both system
and local, each computer module has an
interrupt mask register that controls the
setting of individual bits of the interrupt
register. The occurrence of any interrupt
causes one of the system computer modules
to leave the program it has been running and
branch to the suitable AOSP entry, entering
a control mode as it branches. The control
mode differs from the normal mode of operation in that it locks out the response to some
low-priority interrupts (although recording
them) and enables the execution of some additional instructions reserved for AOSP use
(such as setting an interrupt mask register
or memory protection registers, or transmitting an I/O instruction to an I/O control
module).
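The interaction of the interrupt register and the mask register can be sketched as follows. The bit assignments, the 16-bit width, and the class names are invented for illustration and are not the D825's:

```python
# Illustrative condition-line bit positions (not D825 assignments).
EXTERNAL_REQ, IO_COMPLETE, CLOCK_OVERFLOW, POWER_FAIL = 0, 1, 2, 3

class ComputerModule:
    def __init__(self, mask_bits):
        self.mask = 0
        for b in mask_bits:               # enable only assigned conditions
            self.mask |= 1 << b
        self.interrupt_register = 0

    def broadcast(self, condition_bit):
        """A system-wide interrupt line fires; every module sees it, but a
        module latches the bit only if its mask register enables it."""
        self.interrupt_register |= (1 << condition_bit) & self.mask

    def pending(self):
        return [b for b in range(16) if self.interrupt_register >> b & 1]

# Module A is assigned external requests; module B, I/O completions.
a = ComputerModule({EXTERNAL_REQ, POWER_FAIL})
b = ComputerModule({IO_COMPLETE, POWER_FAIL})
for m in (a, b):
    m.broadcast(EXTERNAL_REQ)   # only A latches the external request
```

This is how responsibility for the various interrupt conditions is distributed among computer modules even though every condition is transmitted to all of them.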
In responding to an interrupt, the AOSP
transfers control to the appropriate routine
handling the condition designated by the interrupt. When the interrupt condition has
been satisfied, control is returned to the
original object program. Interrupts caused
by normal operating conditions include:
1. 16 different types of external requests,
2. completion of an I/O operation,
3. real-time clock overflow,
4. array data absent,
5. computer-to-computer interrupts,
6. control mode entry (normal mode halt).
Interrupts related to abnormalities of either
program or equipment include:
1. attempt by program to write out of
bounds,
2. arithmetic overflow,
3. illegal instruction,
4. inability to access memory, or an internal parity error; parity error on an
I/O operation causes termination of
that operation with suitable indication
to the AOSP,
5. primary power failure,
6. automatic restart after primary power
failure,
7. I/O termination other than normal
completion.
While the reasons for including most of the
interrupts listed above are evident, a word
of comment on some of them is in order.
The array-data-absent interrupt is initiated when a reference is made to data that
is not present in the memory. Since all array
references such as A[k] are made relative to
the base (location of the first element) of the
array, it is necessary to obtain this address
and to index it by the value k. When the base
of array A is fetched, hardware sensing of a
presence bit either allows the operation to
continue, or initiates the array-data-absent
interrupt. In this way, keeping track of data
in use by interacting programs can be simplified, as may the storage allocation problem.
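The presence-bit mechanism resembles what would later be called demand paging, and can be sketched as follows. The descriptor layout and the exception standing in for the hardware interrupt are invented for illustration:

```python
class ArrayDataAbsent(Exception):
    """Stands in for the array-data-absent interrupt."""

def fetch_element(memory, base_addr, k):
    base_word = memory[base_addr]          # descriptor word for array A
    if not base_word["present"]:           # hardware-sensed presence bit
        raise ArrayDataAbsent(base_addr)   # AOSP would bring the data in
    return memory[base_word["base"] + k]   # A[k], indexed from the base

memory = {
    100: {"present": True, "base": 500},   # descriptor of a resident array
    500: 11, 501: 22, 502: 33,
    200: {"present": False, "base": None}, # array not currently in memory
}
```

Because every reference A[k] goes through the base word, the AOSP can move or evict an array and need only update one descriptor, which is what simplifies storage allocation.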
The primary power failure interrupt is
highest priority, and always pre-emptive.
This interrupt causes all computer and I/O
control modules to terminate operations, and
to store all volatile information either in
memory modules or in magnetic thin-film
registers. (The latter are integral elements
of computer modules.) This interrupt protects the system from transient power failure,
and is initiated when the primary power
source voltage drops below a predetermined
limit.
The automatic restart after primary power
failure interrupt is provided so that the previous state of the system can be reconstructed.
A description of how an external interrupt
is handled might clarify the general interrupt
procedure. Upon the presence of an external
interrupt, the computer which has been assigned responsibility to handle such interrupts
automatically stores the contents of those
registers (such as the program counter) necessary to subsequently reconstitute its state,
enters the control mode, and goes to a standard (hardware-determined) location where a
branch to the external request routine is
located. This routine has the responsibility
of determining which external request line
requires servicing, and, after consulting a
table of external devices (teletype buffers,
console keyboards, displays, etc.) associated
with the interrupt lines, the computer constructs and transmits an input instruction to
the requesting device for an initial message.
The computer then makes an entry in the
table of the I/O complete program (the program that handles I/O complete interrupts)
to activate the appropriate responding routine when the message is read in. A check
is then made for the occurrence of additional
external requests. Finally, the computer
restores the saved register contents and
returns in normal mode to the interrupted
program.
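The sequence just described can be condensed into a sketch. This is a modern rendering; the table names and message conventions are invented:

```python
def service_external_requests(cpu, request_lines, device_table):
    """Service every raised external request line, returning the input
    instructions issued and the I/O-complete table entries made."""
    saved = dict(cpu)                   # save registers needed to resume
    cpu["mode"] = "control"             # enter control mode
    issued, completions = [], []
    for line, raised in enumerate(request_lines):
        if raised:
            device = device_table[line]   # teletype buffer, console, display...
            issued.append(("read_initial_message", device))
            # When the message arrives, the I/O complete program will
            # activate the appropriate responding routine.
            completions.append((device, "activate_responding_routine"))
    cpu.update(saved)                   # restore registers, normal mode
    return issued, completions

cpu = {"pc": 4096, "mode": "normal"}
issued, completions = service_external_requests(
    cpu, [False, True, False, True],
    {0: "teletype", 1: "console_keyboard", 2: "display", 3: "teletype_2"})
```

Checking all raised lines before returning mirrors the final step above, in which additional external requests are handled before the saved registers are restored.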
AOSP Control of I/O Activity: As mentioned above, control of all I/O activity is
also within the province of the AOSP. Records are kept on the condition and availability of each I/O device. The locations of
all files within the computer system, whether
on magnetic tape, drum, disc file, card, or
represented as external inputs, are also
recorded. A request for input by file name
is evaluated, and, if the device associated
with this name is readily available, the action
is initiated. If for any reason the request
must be deferred, it is placed in a program
queue to await conditions which permit its
initiation. Typical conditions which would
cause deferral of an I/O operation include:
1. no available I/O control module or
channel,
2. the device in which the file is located
is presently in use,
3. the file does not exist in the system.
In the latter case, typically, a message would
be typed out on the supervisory printer, asking for the missing file.
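The evaluation of a request for input by file name might be sketched as follows; the catalog and device structures are invented for illustration:

```python
def request_input(file_name, catalog, devices, deferred):
    """Evaluate a request for input by file name, as the AOSP does.
    Returns 'initiated', 'deferred', or 'missing'."""
    if file_name not in catalog:
        # The file does not exist in the system: ask the operator.
        print(f"supervisory printer: please supply missing file {file_name}")
        return "missing"
    device = catalog[file_name]          # tape, drum, disc, card, ...
    if devices.get(device) == "busy":
        deferred.append(file_name)       # queued until conditions permit
        return "deferred"
    devices[device] = "busy"             # action initiated at once
    return "initiated"

catalog = {"TRACKS": "drum-1", "LOG": "tape-3"}
devices = {"drum-1": "idle", "tape-3": "busy"}
deferred = []
```

On each I/O complete interrupt, the AOSP would mark the device idle again and retry the deferred queue, which is how waiting operations "which can now proceed are initiated."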
The I/O complete interrupt signals the
completion of each I/O operation. Along with
this interrupt, an I/O result descriptor is
deposited in an AOSP table. The status
relayed in this descriptor indicates whether
or not the operation was successful. If not
successful, what went wrong (a parity error,
tape break, or card jam, for example) is
indicated so that the AOSP may initiate the
appropriate action. If the operation was successful, any waiting I/O operations which
can now proceed are initiated.
AOSP Control of Program Scheduling:
Scheduling in the D825 relies upon a job table
maintained by the AOSP. Each entry is identified with a name, priority, precedence
requirements, and equipment requirements.
Priority may be dynamic, depending upon
time, external requests, other programs, or
a function of many variable conditions. Each
time the AOSP is called upon to select a program to be run, whether as a result of the
completion of a program or of some other
interrupt condition, the job table is evaluated. In a real-time system, situations
occur wherein there is no system program
to be run, and machine time is available for
other uses. This time could be used for
auxiliary functions, such as confidence routines.
The AOSP provides the capability for program segmentation at the discretion of the
programmer. Control macros embedded in
the program code inform the AOSP that
parallel processing with two or more computers is possible at a given point. In addition, the programmer must specify where
the branches indicated in this manner will
join following the parallel processing.
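Job-table selection can be sketched as follows. Field names are invented; the essential points are that priority is re-evaluated on every call (so it may depend on time or external conditions) and that precedence and equipment requirements gate eligibility:

```python
def select_next_job(job_table, finished, available_equipment):
    """Pick the runnable job of highest current priority, or None if
    machine time is free for auxiliary uses such as confidence routines."""
    runnable = [
        job for job in job_table
        if job["name"] not in finished
        and job["precedence"] <= finished             # prerequisites done
        and job["equipment"] <= available_equipment   # modules on line
    ]
    if not runnable:
        return None
    # priority is a callable: dynamic, re-evaluated at every selection
    return max(runnable, key=lambda job: job["priority"]())["name"]

job_table = [
    {"name": "track-update", "priority": lambda: 9,
     "precedence": set(), "equipment": {"computer"}},
    {"name": "display-refresh", "priority": lambda: 5,
     "precedence": set(), "equipment": {"computer", "display"}},
    {"name": "post-mission-log", "priority": lambda: 7,
     "precedence": {"track-update"}, "equipment": {"computer", "tape"}},
]
```

Because the table is re-evaluated on every program completion or interrupt, a failed module simply shrinks `available_equipment` and the selection adapts without reprogramming.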
Computer Module: The computer modules
of the D825 system are identical, general-purpose arithmetic and control units. In
determining the internal structure of the
computer modules, two considerations were
uppermost. First, all programs and data
had to be arbitrarily relocatable to simplify
the storage allocation function of the AOSP;
secondly, programs would not be modified
during execution. The latter consideration
was necessary to minimize the amount of
work required to pre-empt a program, since
all that would have to be saved to reinstate
the interrupted program at a later time would
be the data for that program and the register
contents of the computer module running the
program at the time it was dumped.
The D825 computer modules employ a
variable-length instruction format made up
of quarter-word syllables. Zero-, one-, two-,
or three-address syllables, as required, can
be associated with each basic command syllable. An implicitly addressed accumulator
stack is used in conjunction with the arithmetic unit. Indexing of all addresses in a
command is provided, as well as arbitrarily
deep indirect addressing for data.
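The zero- and one-address syllables operating on an implicit accumulator stack can be illustrated with a toy interpreter. The mnemonics are invented, not D825 syllables:

```python
def run(syllables, memory):
    """Execute a syllable string against an implicit accumulator stack."""
    stack = []
    for op, *arg in syllables:
        if op == "LOAD":        # one-address syllable: push a memory word
            stack.append(memory[arg[0]])
        elif op == "ADD":       # zero-address: operands come from the stack
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "STORE":     # one-address: pop the result to memory
            memory[arg[0]] = stack.pop()
    return memory

mem = {0: 3, 1: 4, 2: 5, 3: 0}
# Compute (3 + 4) * 5 and store the result in word 3.
run([("LOAD", 0), ("LOAD", 1), ("ADD",), ("LOAD", 2), ("MUL",), ("STORE", 3)], mem)
```

The zero-address arithmetic syllables need no operand fields at all, which is what makes the variable-length quarter-word format compact.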
Each computer module includes a 128-position thin-film memory used for the stack,
and also for many of the registers of the machine, such as the program base register,
data base register, the index registers, limit
registers, and the like.
The instruction complement of the D825
includes the usual fixed-point, floating-point, logical, and partial-field commands
found in any reasonably large scientific data
processor.
Memory Module: The memory modules
consist of independent units storing 4096
words, each of 48 bits. Each unit has an
individual power supply and all of the necessary electronics to control the reading,
writing, and transmission of data. The size
of the memory modules was established as a
compromise between a module size small
enough to minimize conflicts wherein two or
more computer or I/O modules attempt access to the same memory module, and a size
large enough to keep the cost of duplicated
power supplies and addressing logic within
bounds. It might be noted that for a larger
modular processor system, these trade-offs
might indicate that memory modules of 8192
words would be more suitable. Modules
larger than this (of 16,384 or 32,768 words,
for example) would make construction of
relatively small equipment complements
meeting the requirements set forth above
quite difficult. The cost of smaller units of
memory is offset by the lessening of catastrophe in the event of failure of a module.
I/O Control Module: The I/O control
module executes I/O operations defined and
initiated by computer module action. In
keeping with the system objectives, I/O control modules are not assigned to any particular computer module, but rather are treated
in much the same way as memory modules,
with automatic resolution of conflicting attempted accesses via the switching interlock
function. Once an I/O operation is initiated,
it proceeds independently until completion.
I/O action is initiated by the execution of
a transmit I/O instruction in one of the computer modules, which delivers an I/O descriptor word from the addressed memory
location to an inactive I/O control module.
The I/O descriptor is an instruction to the
I/O control module that selects the device
and determines the direction of data flow, the
address of the first word, and the number of
words to be transferred.
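An I/O descriptor of this kind can be sketched as bit fields packed into one word. The D825 word is 48 bits, but this particular layout and these field widths are invented; the paper does not give them:

```python
# Illustrative field widths: device number, direction, start address, count.
DEVICE_BITS, DIR_BITS, ADDR_BITS, COUNT_BITS = 6, 1, 16, 16

def pack_descriptor(device, read, start_addr, count):
    word = device
    word = (word << DIR_BITS) | (1 if read else 0)
    word = (word << ADDR_BITS) | start_addr
    word = (word << COUNT_BITS) | count
    return word

def unpack_descriptor(word):
    count = word & ((1 << COUNT_BITS) - 1); word >>= COUNT_BITS
    addr = word & ((1 << ADDR_BITS) - 1);   word >>= ADDR_BITS
    read = bool(word & 1);                  word >>= DIR_BITS
    return word, read, addr, count

d = pack_descriptor(device=5, read=True, start_addr=0x1000, count=256)
```

A computer module's transmit I/O instruction would deliver such a word from memory to an inactive I/O control module, which then runs the transfer to completion on its own.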
Interposed between the I/O control modules and the physical external devices is another crossbar switch designated the I/O
exchange. This automatic exchange, similar
in function to the switching interlock, permits two-way data flow between any I/O
control module and any I/O device in the system. It further enhances the flexibility of the
system by providing as many possible external data transfer paths as there are I/O control modules.
Equipment Complements: A D825 system
can be assembled (or expanded) by selection
of appropriate modules in any combination of:
one to four computer modules, one to 16
memory modules, one to ten I/O control
modules, one or two I/O exchanges, and one
to 64 I/O devices per I/O exchange in any
combination selected from: operating (or
system status) consoles, magnetic tape transports, magnetic drums, magnetic disc files,
card punches and readers, paper tape perforators and readers, supervisory printers,
high-speed line printers, selected data converters, special real-time clocks, and intersystem data links.
Figure 2. Typical D825 Equipment Array.

Figure 2 is a photograph of some of the
hardware of a completed D825 system. The
equipment complement of this system includes
two computer modules, four memory modules
(two per cabinet), two I/O control modules
(two per cabinet), one status display console,
two magnetic tape units, two magnetic drums,
a card reader, a card punch, a supervisory
printer, and an electrostatic line printer.
D825 characteristics are summarized in
Table 1.
SUMMARY AND CONCLUSION
It is the belief of the authors that modular
systems (in the sense discussed above) are a
natural solution to the problem of obtaining
greater computational capacity-more natural than simply to build larger and faster
machines. More specifically, the organizational structure of the D825 has been shown
to be a suitable basis for the data processing
Proceedings-Fall Joint Computer Conference, 1962 / 95
facility for command and control. Although
the investigation leading toward this structure proceeded as an attack upon a number
of diverse problems, it has become evident
that the requirements peculiar to this area
of application are, in effect, aspects of a
single characteristic, which might be called
structural freedom. Furthermore, it is now
clear that the most distinctive characteristic of
the structure realized-integrated operation
of freely intercommunicating, totally modular
elements-provides the means for achieving
structural freedom.
For example, one requirement is that
some specified minimum of data processing
capability be always available, or that, under
any conditions of system degradation due to
failure or maintenance, the equipment remaining on line be sufficient to perform primary system functions. In the D825, module
failure results in a reduction of the on-line
equipment configuration but permits normal
operation to continue, perhaps at a reduced
rate. The individual modules are designed
to be highly reliable and maintainable, but
system availability is not derived solely from
this source, as is necessarily the case with
more conventional systems. The modular
configuration permits operation, in effect,
with active spares, eliminating the need for
total redundancy.
A second requirement is that the working
configuration of the system at a given moment
be instantly reconstructable to new forms
more suited to a dynamically and unpredictably changing work load. In the D825, all
communication routes are public, all modules are functionally decoupled, all assignments are scheduled dynamically, and assignment patterns are totally fluid. The system
of interrupts and priorities controlled by the
AOSP and the switching interlock permits
instant adaptation to any work load, without
destruction of interrupted programs.
The requirement for expansibility calls
simply for adaptation on a greater time scale.
Since all D825 modules are functionally decoupled, modules of any types may be added
to the system simply by plugging into the
switching interlock or the I/O exchange.
Expansion in all functional areas may be
pursued far beyond that possible with conventional systems.
It is clear, however, that the D825 system
would have fallen far short of the goals set
for it if only the hardware had been considered. The AOSP is as much a part of the
D825 system structure as is the actual hardware. The concept of a "floating" AOSP as
the force that molds the constituent modules
of an equipment complement into a system
is an important notion having an effect beyond
the implementation of the D825. One interesting by-product of the design effort for the
D825 has, in fact, been a change of perspective; it has become abundantly clear that
computers do not run programs, but that
programs control computers.
ACKNOWLEDGMENTS
The authors wish to acknowledge the outstanding efforts of their many colleagues at
Burroughs Laboratories who have contributed
so well and in so many ways to all stages of
D825 design, development, fabrication, and
programming. It would be impossible to cite
all of these efforts. The authors also wish
to acknowledge the contributions of Mr.
William R. Slack and Mr. William W. Carver,
also of Burroughs Laboratories. Mr. Slack
has been closely associated with the D825
from its original conception to its implementation in hardware and software. Mr.
Carver made important contributions to the
writing and editing of this paper.
REFERENCES
1. Marlin G. Kroger et al., "Computers in
Command and Control" (TR61-12, prepared for DOD:ARPA by Digital Computer
Application Study, Institute for Defense
Analyses, Research and Engineering Support Division), November 1961.
2. A. L. Leiner, W. A. Notz, J. L. Smith,
and A. Weinberger, "Organizing a Network of Computers to Meet Deadlines,"
Proceedings, Eastern Joint Computer Conference, December 1957.
3. R. E. Porter, "The RW-400 - A New Polymorphic Data System," Datamation, Vol.
6, No. 1, January/February 1960, pp. 8-14.
Table 1. Specifications, D825 Modular Data Processing System
Computer module:
4, maximum complement
Computer module, type:
Digital, binary, parallel, solid-state
Word length:
48 bits including sign (8 characters, 6 bits each) plus
parity
Index registers:
(in each computer module)
15
Magnetic thin-film registers:
(in each computer module)
128 words, 16 bits per word, 0.33-µsec read/write
cycle time
Real-time clock:
(in each computer module)
10 msec resolution
Binary add:
1.67 µsec (average)
Binary multiply:
36.0 µsec (average)
Floating-point add:
7.0 µsec (average)
Floating-point multiply:
34.0 µsec (average)
Logical AND:
0.33 µsec
Memory type:
Homogeneous, modular, random-access, linear-select,
ferrite-core
Memory capacity:
65,536 words (16 modules maximum, 4096 words each)
I/O exchanges per system:
1 or 2
I/O control modules:
10 per exchange, maximum
I/O devices:
64 per exchange, maximum
Access to I/O devices:
All I/O devices available to every I/O control module
in exchange
Transfer rate per I/O exchange:
2,000,000 characters per second
I/O device complement:
All standard I/O types, including 67 kc magnetic
tapes, magnetic drums and discs, card and paper tape
punches and readers, character and line printers,
communications and display equipment
THE SOLOMON COMPUTER*
Daniel L. Slotnick, W. Carl Borck, and Robert C. McReynolds
Air Arm Division
Westinghouse Electric Corporation
Baltimore, Maryland
INTRODUCTION AND SUMMARY

The SOLOMON (Simultaneous Operation Linked Ordinal MOdular Network), a parallel network computer, is a new system involving the interconnections and programming, under the supervision of a central control unit, of many identical processing elements (as few or as many as a given problem requires), in an arrangement that can simulate directly the problem being solved.

The parallel network computer shows great promise in aiding progress in certain critically important areas limited by the capabilities of current computing systems. Many of these technical areas possess the common mathematical denominator of involving calculations with a matrix or mesh of numerical values, or more generally involving operations with sets of variables which permit simultaneous independent operation on each individual variable within the set. This group is typified by the solution of linear systems, the calculation of inverses and eigenvalues of matrices, correlation and autocorrelation, and numerical solution of systems of ordinary and partial differential equations. Such calculations are encountered throughout the entire spectrum of problems in data reduction, communication, character recognition, optimization, guidance and control, orbit calculations, hydrodynamics, heat flow, diffusion, radar data processing, and numerical weather forecasting.

An example of the type of problem permitting the use of the parallelism is the numerical solution of partial differential equations. Assuming the value of a function, u, is known on the boundary, r, of a region, the solution of the Laplace equation† can be calculated at each mesh point, x, y in the region as illustrated in Figure 1.

Since the iteration formula is identical for each mesh point in the region, the arithmetic capability provided by a processing element corresponding to each point will enable one calculation of the equation, i.e., a single program execution, to improve the approximation at each of the mesh points simultaneously.

Figure 2 illustrates a basic array of processing elements. Each of these elements possesses 4096 bits of core storage, and the arithmetic capabilities to perform serial-by-bit arithmetic and logic. An additional capability possessed by each processing element is that of communication with other processing elements. Processing element E can
*The applied research reported in this document has been made possible through support and
sponsorship extended by the U.S. Air Force Rome Air Development Center and the U.S. Army
Signal Research and Development Laboratory under Contract Number AF30(602)2724: Task
730J. It is published for technical information only, and does not necessarily represent
recommendations or conclusions of the sponsoring agency.
† Jeeves, T. A., et al., "On the Use of the SOLOMON Parallel-Processing Computer," Proceedings of the Eastern Joint Computer Conference, Philadelphia, December 1962.
98 / The Solomon Computer
Figure 1. Iterative Solution of Laplace's Equation: with u(x, y) known on the boundary r,
u(n+1)(x, y) = 1/4 [u(n)(x+h, y) + u(n)(x-h, y) + u(n)(x, y+h) + u(n)(x, y-h)]
transmit and receive data serially from its
four nearest neighbors: the processing elements immediately to the right, A; left, C; above,
B; and below, D.
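The iteration of Figure 1 maps directly onto this neighbor scheme. A serial sketch in Python of one Jacobi sweep, computing at every interior mesh point the update that the processing elements would perform simultaneously (boundary values held fixed):

```python
# One Jacobi sweep for Laplace's equation on a rectangular mesh.
# u is a 2-D list of floats; boundary entries hold the values known on r.
def jacobi_sweep(u):
    rows, cols = len(u), len(u[0])
    new = [row[:] for row in u]  # copy, so every update reads old values
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            # u(n+1)(x,y) = 1/4 [u(n)(x+h,y) + u(n)(x-h,y)
            #                    + u(n)(x,y+h) + u(n)(x,y-h)]
            new[x][y] = 0.25 * (u[x + 1][y] + u[x - 1][y]
                                + u[x][y + 1] + u[x][y - 1])
    return new
```

On SOLOMON the two inner loops disappear: one broadcast instruction sequence updates every mesh point in the network at once, each element fetching its operands from its four neighbors.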
A fifth source of input data is available to
the processing element matrix through the
"broadcast input" option. This option utilizes
a register in the central control to supply
Figure 2. Basic Array of Processing Elements
constants when needed by an arbitrary number of the processing elements during the
same operation cycle. This constant is
treated as a normal operand by the processing elements and results in the central control unit becoming a "fifth" nearest neighbor
to all processing elements.
The processing element array is the core
of the system concept; however, it is the
method of controlling the array which turns
this concept into a viable machine design.
This method of control is the simplest possible in that the processing elements contain a
minimum of control logic - the "multimodal"
logic described below.
Figure 3 illustrates how the processing
element array, a 32 x 32 network, is controlled by a single central control unit.
Multimodal control permits the processing
elements to alter control signals to the processing element network according to values
of internal data. They are individually permitted to execute or ignore these central
control signals.
Basically, the central control unit contains
program storage (large-capacity random-access memory), has the means to retrieve
and interpret the stored instructions, and has
the capability, subject to multimodal logic,
to cause execution of these instructions within
the array. Thus, at any given instant, each
processing element in the system is capable
of performing the same operation on the
operands stored in the same memory location
of each processing element. These operands,
however, may all be different. The flow of
control information from the control unit to
the processing elements is indicated in Figure 3 by light lines. An instruction is retrieved from the program storage and transmitted to a register in central control. Within
central control, the instruction is interpreted
and the information contained is translated
into a sequence of signals and transmitted
from central control to the processing elements. Since this information must be provided to 1024 processing elements, it is necessary to branch this information and provide
the necessary amplification and power. This
is accomplished by transmission through
branching levels, which provide the necessary
power for transmission.
As described above, each processing element in the network possesses the capability
of communicating with its four adjacent elements. The "edge" processing elements,
Figure 3. PE Array Under Central Control
however, do not possess a full complement
of neighbors. The resulting free connections
are used for input-output application. This
makes possible very high data exchange rates
between the central computer and external
devices through the input-output subsystem.
These rates could be still further increased
by providing longer "edges"; i.e., by the use
of a nonsquare network array.
Two input-output exchange systems are
used by the input-output equipment. The primary exchange system is a high-speed system operating at a data rate near that of the
processing element network. This system
consists of magnetic tapes and rotating magnetic memories and serves the network with
data storage during large net problems.
The secondary exchange system provides
the user with communication with the primary
exchange system through conventional high-speed printers and tape transports. The
data at the output of this system is compatible
with most conventional devices.
The Processing Element
The processing element (PE) logic, illustrated in Figure 4, basically consists of two
parts: the processing element memory, and
the arithmetic and multimodal control logic.
The multimodal control within each processing element provides the capability for
individual elements to alter the program flow
as a function of the data which they are currently
processing. This capability permits the
processing element to classify data and make
judgments on the course of programming
which it should follow. Whenever individual
elements are in a different mode of operation
than specified by central control, they will
not execute the specified command.
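This mode gating behaves like the predicate masking of later array machines. A minimal sketch, with invented field names, of elements executing or ignoring a broadcast command according to their mode:

```python
# Sketch of mode-gated execution. Each processing element carries a
# mode setting and a local accumulator; only elements whose mode
# matches the mode specified by central control obey the broadcast
# operation, and the rest ignore it.
def broadcast(pes, required_mode, operation):
    for pe in pes:
        if pe["mode"] == required_mode:
            pe["acc"] = operation(pe["acc"])

pes = [{"mode": 1, "acc": 10},
       {"mode": 2, "acc": 20},   # wrong mode: will not execute
       {"mode": 1, "acc": 30}]
broadcast(pes, required_mode=1, operation=lambda a: a + 1)
```

Because the test is made locally in each element, a single broadcast program can classify data and follow different effective courses in different parts of the array.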
During each arithmetic operation, one
word will be read serially from each of the
two memory frames associated with a unique
processing element. The operand in frame
one will be transmitted by central control
command, either to the internal adder or to
that of one of the four adjacent elements
which are its nearest neighbors in the network array. The five gates labeled A in
Figure 4 control the routing of information
from frame one. Since only one of these may
be activated during a single operation, a word
in frame one can be entered in the operation
select logic of only one of the five processing
elements. The frame-two operand can be
routed only into the unit's full adder.
Each PE in the system will communicate
with a corresponding unit, thereby producing
a flow of information between processing
elements during network operations.
Word addressing of the memory is performed by the matrix switches in the central
control unit. These switches convert the
address from the binary form of the instruction to the one-of-n form required for addressing the memory frame. Provision is
made for special addressing of specific
memory locations for temporary storage of
multiplier and quotient during multiplication
and division. Successive bits are shifted into
the PE logic by two digit counters in central
control.
Three different types of storage are permitted: (1) the sum can be routed into frame
one while the original word in frame two is
rewritten; (2) the sum can be routed into
frame two while the word in frame one is
rewritten; and (3) information can be interchanged between frames. Note that in the
first two operations, the word which was
located in the memory frame into which the
sum is routed is destroyed. No information
is altered during the third type operation.
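A minimal model of the three routings, with the two memory frames reduced to single words for illustration:

```python
# Illustrative model of the three result routings between the two
# memory frames of a processing element; each function returns the
# new (frame-one, frame-two) contents.
def route_sum_to_frame_one(f1, f2):
    # (1) the sum replaces the frame-one word; frame two is rewritten
    # intact (the original frame-one word is destroyed)
    return f1 + f2, f2

def route_sum_to_frame_two(f1, f2):
    # (2) the sum replaces the frame-two word; frame one is rewritten
    # intact (the original frame-two word is destroyed)
    return f1, f1 + f2

def interchange(f1, f2):
    # (3) the frames exchange words; no information is altered
    return f2, f1
```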
Multimodal Operation: Multimodal operation gives the processing element the additional capability for altering program flow
and tagging information on the basis of internal data. Any command given by the
Figure 4. Processing Element Logic
A forthcoming paper* describes in detail the application of the SOLOMON system to problems in partial differential equations and certain matrix calculations. Results to date† establish a performance advantage of between 60 and 200 for the SOLOMON Computer compared to currently available large-scale digital systems.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the
assistance received through many discussions
with their associates during the conception
and development of the SOLOMON system,
and, in particular, E. R. Higgins, W. H. Leonard, and Dr. J. C. Tu.
* Slotnick, D. L., W. C. Borck, and R. C. McReynolds, "Numerical Analysis Considerations for the SOLOMON Computer," Proceedings of the Air Force Rome Air Development Center-Westinghouse Electric Corporation Air Arm Division Workshop in Computer Organization. To appear.
† Jeeves, T. A., op. cit.
THE KDF.9 COMPUTER SYSTEM
A. C. D. Haley
English Electric Co., Ltd.,
Kidsgrove,
Stoke-on-Trent, England
SUMMARY
ideal for evaluation of problems expressible
in algebraic form.
A number of other significant advantages
arise directly from the nesting store principle, chief among them being a striking reduction in the program storage space required. This is due to elimination of
unnecessary references to main store addresses and to the implicit addressing of
operands in the nesting store itself. Many
instructions are therefore concerned with
specifying only a function, requiring many
fewer bits than those instructions involving
a main store address. Instructions are
therefore of variable length to suit the information content necessary and on average
three instructions may occupy a single machine word (48 bits). This again reduces the
number of main store references, allowing
the use of a store of modest speed while still
allowing adequate time for simultaneous operation of a number of peripheral devices on
an autonomous main store interrupt basis.
Fast parallel arithmetic facilities are
associated with the nesting store, both fixed
and floating-point operations being provided.
A further nesting store system facilitates the
use of subroutines, and a third set of special
stores is associated with a particularly comprehensive set of instruction modification and
counting procedures.
Operation of the machine is normally under
the control of a "Director" program. A number of different Directors cover a variety of
operating conditions. Thus a simple version
is used when only a single program is to be
executed, and more sophisticated versions
The English Electric KDF.9 computing
system has a number of unusual features
whose origins are to be found in certain decisions reached at an early stage in the planning of the system. At this time (1958-59)
simplified and automatic programming procedures were becoming established as desirable for all programming purposes,
whereas previously they had been regarded
as appropriate only to rapid programming of
"one-off" problems because of the drastic
reductions of machine efficiency which
seemed inevitable.
Many early interpretive programming
schemes aimed to provide an external threeaddress language, and for a time it appeared
that a machine with this type of internal coding approached the ideal. Increasing interest
in translating programs, particularly for problem languages such as ALGOL and FORTRAN,
showed the fallacy of this assumption. It became evident that efficient translation could
only be achieved on a computer whose internal structure is adapted to handle lengthy
algebraic formulae rather than the artificially
divided segments of a three-address machine.
The solution to the difficulty was found in
the use of a "nesting store" system of working
registers. This consists of a number of
associated storage positions forming a magazine in which information is stored on a
"last in, first out" basis. It is shown that
this basic idea leads to development of a
computer having an order code near to the
may be used, for example, to control pseudooff-line transcription operations in parallel
with a main program, operation of several
programs simultaneously on a priority basis,
etc.
INTRODUCTION
For almost a decade after the first computers were put into service, developments
in system specifications were almost exclusively a by-product of local engineering
progress. No radical changes took place in
machine structure, except insofar as engineering changes led directly to operational
ones. Thus the emergence of the ferrite core
store as the most reliable and economic rapid
access store for any significant quantity of
information led to the abandonment, now virtually complete, of the various types of delay
line store. In consequence, "optimum programming" is now used only in certain systems which exchange speed for economy and
use a magnetic drum for the working store.
The majority of systems continue to be
basically simple single address order code
machines, in which most orders implicitly
specify an arithmetic register as one participant in every operation. It was the subsequent proliferation of transfers between
this register and the main store, purely to
allow its use for intermediate arithmetic operations on instructions, which led to the introduction at Manchester University of the
"B-tube." This in turn has been extended and
elaborated to provide the automatic instruction modification and/or counting features
which are now universal.
A further reduction in housekeeping operations, with a consequent increase in speed,
can be obtained by providing not a single
register associated with the arithmetic unit,
but a number of registers. Sometimes all
facilities are available on all registers, and
in other machines the facilities are divided.
Changes and elaborations such as these
have, of course, had the primary aim of increasing the effective speed of the machine.
The penalty to be paid is the complexity of
programming. The increased quantity of
hardware required is, of course, more than
compensated by increased computing power.
There is no corresponding compensation
for the increased programming costs, particularly for problems of infrequent occurrence. This factor was the provocation for
much of the early work on simplified programming schemes. Major programs and
those to be used repeatedly were written in
machine code, but the remainder were written
in problem-oriented languages either obeyed
interpretively (as if a set of subroutines) or
translated into the necessary machine code
program by highly sophisticated translator
routines. Typical of the former approach
are English Electric "Alphacode" [1] and
Ferranti/Manchester University "Autocode"
[2, 3] and of the latter method I.B.M. "Fortran" [4] and English Electric "Alphacode
Translator" [5, 6]. The penalties of using
these schemes are again different in nature.
Interpretive routines lead to inefficient use
of machine time during every run, factors of
10 to 100 covering most of the systems in
common use. The translator routines, in
contrast, may ideally produce a 100% efficient program, although a loss of speed by a
factor of 1-1/2 to 5 is more usual. The
translation operation, performed only once,
may however occupy long periods and may
swamp the subsequent gain over the interpretive method for a rarely used program.
In the late nineteen-fifties it was evident
that a successful new general-purpose system should ideally have two programming
features:
(a) It should have an order code designed
to allow easy efficient coding in machine
language after a minimum training period:
(b) The language should be such as to allow the preparation of a number of rapid
translators (for many categories of work)
from problem -oriented languages into efficient machine programs.
Studies based on these fundamental requirements culminated in the machine organization selected for KDF.9.
Choice of an Order Code
Many interpretive schemes used up to this
time had been of three-address type, where
the typical instruction is of the form
C = A (function) B,
and A,B,C are main store addresses. Experience had shown that this type of code was
well adapted for design calculations and scientific usage, especially if a wide range of fixed and floating-point functions and well-chosen loop counting and instruction modifying facilities were made available.
110 / The KDF.9 Computer System
At a time when potential arithmetic speeds
were overtaking storage access rates the use
of a three-address machine code had the
drawback of requiring four main store references for each operation (including extraction of an instruction), and with the constant
demand for stores of 32,000 or more words
the necessary instruction length approaches
60 bits when provision for a large number of
functions and modifiers is made.
A single address system, on the other
hand, requires greater care in programming
and a multi-accumulator machine is perhaps
worse still from this standpoint.
None of these conventional structures is
particularly well suited to preparation of efficient translation routines, and a study was
therefore made of a special form of multi-accumulator system using automatic allocation of addresses for the working registers.
This depends on presentation of data to the
system in a form closely analogous to the
"reverse Polish" notation. (This notation
merely involves writing the sign of any arithmetic operation after, rather than before, the
two operands.
Thus
a + b is written ab +
a ÷ b is written ab ÷, etc.)
Fundamentally the procedure is to arrange
the machine structure in such a way as to allow operations to be performed in a natural
order. Thus, in carrying out the operations
involved in calculating E = A.B + C.D, the
natural sequence is
obtain A:
obtain B:
multiply & retain A.B:
obtain C:
obtain D:
multiply & retain C.D:
add:
store E.
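The sequence above is exactly a last-in, first-out discipline. A sketch in Python, with an explicit list standing in for the nesting store:

```python
# Evaluate E = A.B + C.D with a last-in, first-out store, mirroring
# the natural operation sequence of the nesting-store machine.
def evaluate(a, b, c, d):
    nest = []
    nest.append(a)                        # obtain A
    nest.append(b)                        # obtain B
    nest.append(nest.pop() * nest.pop())  # multiply & retain A.B
    nest.append(c)                        # obtain C
    nest.append(d)                        # obtain D
    nest.append(nest.pop() * nest.pop())  # multiply & retain C.D
    nest.append(nest.pop() + nest.pop())  # add
    return nest.pop()                     # store E
```

Note that no intermediate result ever needs an explicit address: each arithmetic step consumes the most recently named operands and leaves its result in their place.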
Given a suitable "user code" (i.e., a convenient form in which to prepare a program
of instructions), it was evident that such a
machine structure, if attainable at reasonable
cost, would meet requirement (a) above. The
belief in its appropriateness for simplicity
of coding was coincidentally supported by the
appearance of an interpretive scheme [7],
producing a similar problem-language, written for DEUCE. This proposal, however,
attracted little attention, mainly because the
multi-level storage system of DEUCE prevented attainment of a reasonable working
efficiency.
The potential virtues of this type of problem language were again supported by a subsequent proposal to use a reverse Polish
notation as a computer common language
(APT). For a number of reasons this proposal was not adopted, but two of the reasons
advanced in its favor are worthy of note in
the present context. They were:
(a) That the language is one in which
problems can be formulated with little effort
after a short period of training:
(b) That the process of automatic or
manual translation from any conventional
problem language to most of the existing machine languages requires an effective expression in reverse Polish notation as an
intermediate step in the procedure. Using
such a machine code therefore eliminates a
substantial part of the process.
The Nesting Store
The evaluation of a formula above can
be seen to consist of successive operations either of fetching new operands (or
storing results) or of performing operations
on these most recently named. An intermediate result is immediately reused (as in the
seventh step) or temporarily pushed "below
the surface," as when the product A.B is
superseded by the fetching from store of C.
A mechanical analogue of such a system
is shown in Figure 1. It consists of a magazine, spring loaded to maintain the contents
at the top, with only one point of entry and
exit. Objects stored can therefore only enter
and leave on a "last in, first out" basis.
The electronic equivalent is shown in Figure 2. There are n separate registers, each
capable of accommodating one machine word,
and corresponding bit poSitions of each register are connected so that the assembly can
also be treated as anumber of n-bitreversible shifting registers. Information can be
transferred to the top register N1 from a conventional main store buffer register (a parallel machine is assumed), and simultaneously with any such transfer a shift pulse causes a downward shift of any existing information. Thus the new word appears in N1, the previous content of N1 in N2, etc. This process is called "nesting down." Reversal of the process causes "nesting up" with transfer of the content of N1 towards the main store. "Nesting up" and "nesting down" can occur in any sequence, provided, of course, that not more than n words are in the store at any time.
Associated with the top two registers is a "mill" or arithmetic unit [8]. Serial transfers to and from the mill are shown for clarity, but in a fast machine parallel working is again used.
To allow the evaluation of a formula such as the simple one of section 2, the mill must be made capable of the two arithmetic operations needed. One of these is addition, and at this point it must be noted that in general no operand is considered as necessary after being used by the mill. The operation "add" therefore uses the words in N1 and N2, and places the sum back in N1. Since N2 is now unoccupied, "nesting up" occurs, the content of N3 moving to N2, etc.
The natural effect of most of the conventional arithmetic operations can now be visualized. Thus, "multiply" produces a double-length product in N1 and N2 (the more significant half in N1 for obvious reasons). No nesting is involved in this operation, but the more elaborate "multiply and round" produces a single-length result in N1 and therefore requires nesting up.
Figure 1. Mechanical Analogue of the Nesting Store
Figure 2. Electronic Equivalent of the Nesting Store
Two important points emerge at this stage. The first is that an operation sequence written as
YA, YB, x, YC, YD, x, +, = YE
evaluates the formula given earlier (YA is
to be interpreted as "fetch from the main
store address containing A into N 1, "and = YE
as a converse operation). This sequence is
the one involving a reduction to a minimum
number of main store operations.
The second important point is that arithmetic operations have implied addresses, so
that only the function need be specified. On
the other hand, all instructions referring to
a main store address require many bits for
this purpose unless flexibility and convenience are sacrificed by addreSSing relative
to a local datum.
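The nesting-up and nesting-down behavior described above can be modelled with a bounded stack; the class and method names here are illustrative, not taken from KDF.9 documentation:

```python
# Model of an n-register nesting store. A "nest down" shifts every
# word one register deeper and places the new word in N1; a "nest up"
# shifts every word one register shallower, yielding the word from N1.
class NestingStore:
    def __init__(self, n):
        self.regs = []   # index 0 is N1, the top of the nest
        self.n = n
    def nest_down(self, word):
        # not more than n words may be in the store at any time
        assert len(self.regs) < self.n, "nesting store full"
        self.regs.insert(0, word)
    def nest_up(self):
        assert self.regs, "nesting store empty"
        return self.regs.pop(0)

ns = NestingStore(16)
ns.nest_down(7)   # 7 in N1
ns.nest_down(9)   # 9 in N1, 7 shifted down to N2
```

Any sequence of nesting up and nesting down is valid so long as the depth limit is respected, which is exactly the constraint the programmer must observe on the real store.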
Variable Length Instructions
It is clearly advantageous to economize in
instruction storage space, and hence, for
reasons stated above, to allow instructions
to vary in length according to their function
and the additional information they require.
Obviously any instruction must carry within
itself a definition of the number of bits included in it. Analysis shows that complete
variability is unprofitable (as well as complicating the hardware), since five bits would
be needed to specify the length. Three possible lengths of 8, 16 and 24 bits are therefore
made available, including in each case bits
to designate length. In connection with instruction length, a unit of eight bits is referred to as a "syllable." Most arithmetic
operations are therefore one-syllable instructions; memory fetch and store operations together with jumps are three-syllable,
while two-syllable instructions include a
number requiring parameters of less length
than memory addresses (shift instructions,
input/output instructions, etc.).
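Fetching such a stream can be sketched as follows; the placement of the length bits in the top two bits of the first syllable is an assumption for illustration, not the actual KDF.9 encoding:

```python
# Decode a stream of 8-bit syllables into variable-length
# instructions of one, two, or three syllables. Assumed encoding:
# the top two bits of the first syllable give the length class
# (0 -> 1 syllable, 1 -> 2 syllables, 2 or 3 -> 3 syllables).
def decode(syllables):
    LENGTHS = {0: 1, 1: 2, 2: 3, 3: 3}
    out, i = [], 0
    while i < len(syllables):
        n = LENGTHS[syllables[i] >> 6]
        out.append(tuple(syllables[i:i + n]))  # one complete instruction
        i += n   # advance by syllables; 48-bit word boundaries may be crossed
    return out
```

Because every instruction carries its own length, the decoder never needs word alignment, which is what allows two- and three-syllable instructions to overlap word boundaries.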
The word length of the computer is 48 bits,
and this is the smallest unit in which information can be extracted from the main store.
Instructions are stored continuously and
obeyed serially; i.e., the store area concerned is regarded as a continuous series of
eight-bit syllables, rather than of 48-bit
words, and two- or three-syllable instructions may overlap from one word to the next.
Associated with the main control there is
112 / The KDF.9 Computer System
therefore a two-word register. This at any
time holds the word containing the current
instruction together with that from the next
higher main store position (if the current instruction overlaps a word boundary both
words are of course in use). As soon as all
syllables in one word have been used, this
word is replaced at the first available main
store cycle by a further word in sequence
from the main store.
Because of the economy in instruction
storage space achieved by these means, it is
frequently possible to contain important inner
program loops in two words of instructions.
Provision is made to mark such loops by a
special jump instruction, whose effect is to
inhibit the extraction of a new instruction
word until the condition for leaving the loop
is satisfied. This saves two main store
cycles (12 microseconds) for extracting instruction words on every circuit of the loop,
and incidentally saves a syllable since again
no main store address needs specifying completely.
Some additional complexity arises due to
the separation of control into two parts, one
associated primarily with arithmetic operations (known as "Mill Control") and the other,
which runs normally at least one instruction
in advance, controlling most other internal
operations and in particular controlling access to the main store (this is called "Main
Control"). The object of this is, of course,
to increase effective speed by allowing, for
example, a new instruction word to be extracted while an arithmetic operation proceeds in the nesting store. Such overlaps are
completely automatic and no special actions
are required of the programmer.
Further Consideration of
Nesting Store
At this stage it is convenient to examine
the nesting store and the consequence of its
use in a little more detail.
It should first be stated that the representation of Figure 2, while possible, is uneconomic if more than two or three words of
storage capacity are required. Examination
of a large number of programmes shows that
only occasionally is storage for more than
about eight words needed in the evaluation of
an expression, and that a sixteen word limit
to capacity will cause inconvenience only on
very rare occasions.
[Figure 2: the nesting store represented as a single column of 48-bit registers (N1, N2, N3, ...), with N1 connected to the mill and to the main store buffer register, and shift-up/shift-down paths between successive positions.]
Figure 2
A set of 16 registers of 48 bits each is an
expensive assemblage and it is natural to
examine the possibility of using core storage
in some form. The use of a core store for
all positions carries penalties in speed of
operation (this remains true as long as the
speed is similar to the main store speed),
and a satisfactory solution is reached by the
use of the configuration shown in Figure 3.
The top three registers are now conventional flip-flop registers, and a 16-word core
plane makes up the remainder of the store.
This gives a total of 19 words which is advantageous for reasons shown below. The
flip-flop registers are inter-connected by a
number of gated transfer paths (each for
parallel transfer of 48 bit words) which also
include transfers to and from the arithmetic
unit. An unconventional transformer coupling
system allows these paths to be provided
economically while permitting the transfer
of a word between any two registers in under
half a microsecond. Below the registers the
core store operates in a manner differing in
form but identical in principle with the lower
registers of Figure 2. No shifting of words
between store positions occurs as nesting up
or down is called. Instead successive positions are emptied or filled by successively
Proceedings--Fall Joint Computer Conference, 1962 / 113
[Figure 3: the main store connected through buffer registers B1 and B2 to the mill and to the top registers of the nesting store, with a core plane (16 x 48) below.]
Figure 3
addressing levels in the plane. Suppose, for
instance, that the whole system is empty, and
successive words are fed into N1 (by a series of main store fetch instructions). The transfer paths between N1, N2, N3 and the mill are
gated as required by control.
The only complication arising here is that
the transfer system does not allow N2, for
example, to send to N3 in the same half
microsecond period in which it receives from
N1. The two buffer registers of the mill are
therefore used, and the transfers are, in the
first half microsecond, N1 to B1 and N2 to
B2.
The write address counter is at zero and
the read counter at 15 for a reason which will
appear shortly. In the first half microsecond
N3 is also allowed to set up bias currents in
the vertical core lines where digits are to be
written.
The next half microsecond sees the word
in N3 written into the top core plane position
by operation of the write drive, and the subsequent period covers the operations needed
to complete the first fetch. These are main
store to N1, B1 to N2, and B2 to N3. Simultaneously the read and write address counters are advanced by one, to zero and one respectively.
A second and third fetch may now be executed in exactly the same way, but, as the
system was assumed empty originally, blanks
(as distinct from data zeroes) are written
into levels 0, 1, 2 of the core plane, leaving
the cores cleared. Only on the fourth and
subsequent fetches does a genuine word appear in level 3.
Continuation of these operations will on
the sixteenth occasion fill the lowest level of
the core plane, and the write counter has now
carried round to address zero while the read
counter is at 15. The programme is interrupted if further entries are attempted (except under circumstances mentioned in
a later section).
It will have been noted that the read counter is always one position behind the write
counter, so that if a nesting down operation
is followed by nesting up, the read drive is
used and no correction is needed to the counter position in order to read out the last word
inserted.
Note also that all operations are either to
read out, leaving a clear storage position, or
to write into an already clear position. The
normal read/write cycle is therefore unnecessary and would be wasteful of time.
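The counter behaviour described above amounts to a small circular buffer. The sketch below is an illustrative software model, not the hardware: three register cells on top of a 16-word plane, a write counter starting at 0, a read counter starting at 15 and kept one position behind, blanks distinct from data zeroes, and reads that leave a clear position. The model admits the full 19-word physical capacity; in the machine a program is interrupted once its 16 cells are used, the remaining three being reserved.

```python
# Illustrative model of the Figure 3 nesting store: three fast registers
# N1-N3 above a 16-word core plane addressed by write/read counters.

BLANK = object()   # a "blank", as distinct from a data zero

class NestingStore:
    def __init__(self):
        self.regs = [BLANK, BLANK, BLANK]   # N1, N2, N3
        self.plane = [BLANK] * 16
        self.write = 0                      # write counter starts at 0
        self.read = 15                      # read counter starts at 15

    def nest_down(self, word):
        """Push: N3 drops into the core plane, registers shift, word -> N1."""
        if self.plane[self.write] is not BLANK:
            raise OverflowError("nesting store full: program interrupted")
        self.plane[self.write] = self.regs[2]
        self.write = (self.write + 1) % 16
        self.read = (self.read + 1) % 16    # read stays one behind write
        self.regs = [word, self.regs[0], self.regs[1]]

    def nest_up(self):
        """Pop: N1 leaves, the plane refills N3 via the read counter."""
        word, n2, n3 = self.regs
        refill = self.plane[self.read]
        self.plane[self.read] = BLANK       # reading leaves a clear position
        self.read = (self.read - 1) % 16
        self.write = (self.write - 1) % 16
        self.regs = [n2, n3, refill]
        return word

store = NestingStore()
for v in range(5):
    store.nest_down(v)
print([store.nest_up() for _ in range(5)])   # -> [4, 3, 2, 1, 0]
```

Note that the first three pushes write blanks into levels 0, 1 and 2, so a genuine word first reaches level 3 on the fourth push, exactly as the text describes.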
Special Nesting Store Instructions
The simple example given earlier illustrates the general nature of the instructions
provided. It is soon discovered, however,
that a few instructions are desirable which
have no real counterpart in more conventional
systems.
Straightforward evaluation of an expression in algebraic form will usually produce
the desired result without special manipulation. Occasionally it will be found that two
operands, for example prior to a division, are
in reverse order. An instruction "REVERSE"
has the effect of interchanging the contents
of Nl and N2.
The instruction "DUPLICATE," which nests down and leaves a copy of N1 in both N1 and N2, has many uses. Followed by MULTIPLY it produces the square of N1, and
it also allows an operand shortly to be lost
in some other operation to be preserved for
later use without requiring a main store reference.
Instructions ZERO and ERASE bring an all-zero word into N1, nesting down, and erase N1, nesting up, respectively.
Many instructions are also available in
double-length form. Thus double-length addition (mnemonic +D) treats N1 and N2 as
one 96 bit number, and N3, N4 as a second.
114 / The KDF.9 Computer System
It produces a double-length sum in N1, N2,
nesting up two places.
Single and double precision floating point
representation is also permitted, the corresponding mnemonic forms for addition being
+F and +DF.
With these instructions it is possible to
introduce an example showing the power of
the system and incidentally the speed and the
economy in instruction storage space.
The example to be given is perhaps somewhat artificial, but it has been selected to
illustrate not only the essential simplicity of
evaluation of any expression from an algebraic statement, but to illustrate also that
even in a sophisticated system there is scope
for an occasional elegant twist. The formula
to be evaluated is

    f = a(a³b² + 1) / (b + 2c²d²)

where a, b, c, d are single length fixed point
numbers stored at Ya, Yb, Yc, Yd, nonadjacent addresses in the main store.
The table shows the successive steps in
the calculation, together with the contents
after each step of the top few cells of the
nesting store.
The following points should be noted:
(a) Again main store references are reduced to the minimum practicable.
(b) A count shows that only 30 syllables,
or five instruction words, are used (there
are no two-syllable instructions in this particular example).
(c) It is advantageous to evaluate the numerator in expanded form in this particular
case.
(d) The use of DOUBLE DUPLICATE at
step three neatly anticipates future requirements for the parameters. An automatically
programmed version would fetch these parameters again or could perhaps use a less
satisfactory temporary storage process to
avoid this.
The complete evaluation on KDF. 9 takes
less than 170 microseconds.
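The whole Table 1 sequence can be replayed on an ordinary list standing in for the nesting store, confirming that it computes f = a(a³b² + 1)/(b + 2c²d²). One assumption is needed: DIVIDE is taken to divide N2 by N1, which is the operand order the sequence requires, since no REVERSE precedes it.

```python
# Replay of the Table 1 instruction sequence on a Python list used as the
# nesting store (s[-1] plays the part of N1). Exact fractions stand in
# for 48-bit fixed-point arithmetic.
from fractions import Fraction as F

def evaluate(a, b, c, d):
    s = []
    def mul():  s.append(s.pop() * s.pop())     # MULTIPLY
    def add():  s.append(s.pop() + s.pop())     # ADD
    def dup():  s.append(s[-1])                 # DUPLICATE
    def ddup(): s.extend(s[-2:])                # DOUBLE DUPLICATE
    def rev():  s[-1], s[-2] = s[-2], s[-1]     # REVERSE
    def div():                                  # DIVIDE: assumed N2 / N1
        d_ = s.pop(); s.append(s.pop() / d_)

    s.append(b); s.append(a)                    # FETCH b; FETCH a
    ddup(); dup(); mul(); mul()                 # a^2 b, then a^2 b on top
    dup(); mul(); add()                         # numerator a^4 b^2 + a
    rev()                                       # tuck NUM under b
    s.append(d); s.append(c)                    # FETCH d; FETCH c
    mul(); dup(); mul(); dup(); add(); add()    # denominator b + 2 c^2 d^2
    div()
    return s.pop()                              # STORE would write N1 back

a, b, c, d = F(2), F(3), F(1), F(2)
direct = a * (a**3 * b**2 + 1) / (b + 2 * c**2 * d**2)
print(evaluate(a, b, c, d) == direct)   # -> True
```

At every step the stack contents match the corresponding row of Table 1, and the nesting store is empty again at the end, the point exploited for sub-routines in the next section.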
It is interesting and instructive to compare this performance with that of a conventional one- or three-address machine. The
same actual arithmetic and main store
speeds, and a single accumulator only for
the one-address machine are assumed. It is
also assumed that instructions are packed
two to a word for the single address system
and one to a word for the three address
type.
The single-address programme then occupies eight words of storage and takes about
250 microseconds (increases of 50% in both
factors). A three-address system uses 9
words of storage and takes 340 microseconds.
This example requires few operations of
a "housekeeping" nature and the savings arising from use of a nesting store are less
prominent than is frequently the case. On
the other hand, it is also possible to find
cases where there is little difference between the various systems.
The very poor relative speed of the three-address machine is accentuated because of
the assumption of the same basic internal
speeds as for KDF.9 (typically 6 microsecond store cycle, 15 microsecond multiplication). It is, however, true that these
represent speeds attainable at corresponding
levels of economy. A substantially faster
store with 1 microsecond cycle time could be
used, at a significant cost penalty, and would
bring down the problem time to around 160
microseconds. There is, of course, no corresponding saving in storage space.
Treatment of Sub-Routines
It will have been observed that in the example above the condition of the nesting store
at the end was identical with that at the start.
A system of preparing sub-routines can be
based on this fact. At any stage in a program
there may be information in one or more
cells of the nesting store. At this point the
parameters required by the sub-routine must
be planted by the main program (or of course
by a lower order sub-routine). A note must
also be made of the next instruction in the
program for re-entry purposes. These two
points will be considered separately.
Just as an instruction such as MULTIPLY
expects to find operands in the top cells of
the nesting store and terminates leaving the
product in place of the operands (nesting as
necessary), so a sub-routine can also be arranged to function. It then becomes in effect
a machine instruction in that it does not influence the nesting store contents below the
level of the operands made ready for it.
It will also be obvious, as soon as multi-level sub-routines are considered, that the
operations involved in storing the return
instruction address for use after each
Table 1

OPERATION      N1          N2      N3     N4    N5
FETCH b        b           -       -      -     -
FETCH a        a           b       -      -     -
DOUBLE DUP.    a           b       a      b     -
DUPLICATE      a           a       b      a     b
MULTIPLY       a²          b       a      b     -
MULTIPLY       a²b         a       b      -     -
DUPLICATE      a²b         a²b     a      b     -
MULTIPLY       a⁴b²        a       b      -     -
ADD            a⁴b² + a    b       -      -     -
REVERSE        b           NUM     -      -     -
FETCH d        d           b       NUM    -     -
FETCH c        c           d       b      NUM   -
MULTIPLY       cd          b       NUM    -     -
DUPLICATE      cd          cd      b      NUM   -
MULTIPLY       c²d²        b       NUM    -     -
DUPLICATE      c²d²        c²d²    b      NUM   -
ADD            2c²d²       b       NUM    -     -
ADD            DENOM       NUM     -      -     -
DIVIDE         f           -       -      -     -
STORE (=Yf)    -           -       -      -     -

(NUM = a⁴b² + a; DENOM = b + 2c²d²)
sub-routine in turn are again of a last in,
first out nature. Another nesting store is
therefore provided for this purpose, but there
is here, of course, no necessity for a number
of inter-connected registers or any arithmetic facility. This "sub-routine jump nesting store" (S.J.N.S.) therefore has one register as its most accessible cell, with again a
sixteen-word core plane below. As in the
case of the main nesting store, only a total
of 16 words, not 17 as might have been expected in this case, are available. Since one
is reserved for a special purpose (see below), only 15 may be used by the programmer.
The instruction "Jump to Sub-routine" has
the effect of planting its own address in the
top cell of S.J.N.S. and of causing an unconditional jump to the entry point of the required sub-routine. It will be noted that
S.J.N.S. must receive not only the word address but also the syllabic address of the
return point.
Any sub-routine is terminated by one of
the variants of the basic EXIT instruction,
which transfers control to the instruction
whose address is in the top cell of S.J.N.S.
(nesting it up one cell). These variants
augment this address by units of three
syllables before performing the basic EXIT
operation, and thus return control to the instruction immediately following the "Jump to
sub-routine" instruction (itself three syllables long) or to one of the succeeding three-syllable instructions. These are usually a
string of unconditional jump instructions,
corresponding to failure exits or multiple
exit points preceding the normal return instruction.
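The S.J.N.S. mechanism can be sketched as a plain last-in, first-out list of (word, syllable) return addresses, with EXIT variants stepping the popped address forward in three-syllable units. The variant numbering below is an illustrative convention, not the machine's actual mnemonics.

```python
# Sketch of the sub-routine jump nesting store (S.J.N.S.): a LIFO of
# (word, syllable) return addresses, 15 cells usable by the programmer.

SYLLABLES_PER_WORD = 6   # 48-bit words divided into 8-bit syllables

def advance(addr, syllables):
    """Step a (word, syllable) address forward by some syllables."""
    word, syl = addr
    syl += syllables
    return (word + syl // SYLLABLES_PER_WORD, syl % SYLLABLES_PER_WORD)

class SJNS:
    def __init__(self):
        self.cells = []

    def jump_to_subroutine(self, own_address):
        """Plant the jump instruction's own (word, syllable) address."""
        if len(self.cells) >= 15:        # one cell reserved, per the paper
            raise OverflowError("S.J.N.S. full")
        self.cells.append(own_address)

    def exit(self, variant=1):
        """EXIT variant n: resume n * 3 syllables past the planted address,
        i.e. past the three-syllable jump itself and any failure exits."""
        return advance(self.cells.pop(), 3 * variant)

s = SJNS()
s.jump_to_subroutine((10, 4))   # jump instruction at word 10, syllable 4
print(s.exit(1))                # -> (11, 1): the instruction after the jump
```

Because both the word and the syllable of the return point are kept, the three-syllable step can legitimately carry the return address across a word boundary, as in the example.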
The Q-Stores
The computer organization is completed
by addition of a further set of storage registers (not of nesting type) known as the Q-stores, and by provision of an input/output
system.
The first of these features is based on
conventional practices, and will be treated
briefly. A set of fifteen registers (formally
16, one of which is identically zero) is used
for address modification, counting and a number of other purposes. Each register is a
full 48-bit word, but for address modification
is considered as having three 16-bit independent sections for modifier, increment and
counter.
Any main store reference has associated
with it a Q-store number (or an implied Q0),
and refers to the address specified, augmented by the content of the modifier section.
If the letter Q is added to the mnemonic form
of the instruction, then after the address has
been calculated the modifier is changed by
addition of the increment and the counter is
reduced by one. Jump instructions testing
the counters are, of course, provided.
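The modify-then-step behaviour of a Q-store can be sketched as below. The paper specifies three independent 16-bit sections for modifier, increment and counter; their ordering within the 48-bit word, and wrap-around at 16 bits, are assumptions of this model.

```python
# Sketch of one Q-store: a 48-bit word viewed as three independent
# 16-bit fields (modifier, increment, counter; field layout assumed).

class QStore:
    def __init__(self, modifier=0, increment=0, counter=0):
        self.modifier, self.increment, self.counter = modifier, increment, counter

    def effective_address(self, address, update=False):
        """Return address + modifier. With the 'Q' suffix on the mnemonic
        (update=True), the modifier then grows by the increment and the
        counter drops by one, ready for the next pass of a loop."""
        ea = (address + self.modifier) & 0xFFFF
        if update:
            self.modifier = (self.modifier + self.increment) & 0xFFFF
            self.counter = (self.counter - 1) & 0xFFFF
        return ea

# Stepping through a ten-word vector starting at address 100:
q = QStore(modifier=0, increment=1, counter=10)
addresses = [q.effective_address(100, update=True) for _ in range(10)]
print(addresses[:3], q.counter)   # -> [100, 101, 102] 0
```

A jump instruction testing the counter would end the loop once it reaches zero, so the address arithmetic above replaces any programmed instruction modification.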
The Q-stores may also be used if desired
as 48-bit registers, or as independent 16-bit
registers, in each case with accumulative or
direct input. The "counter" part of any Q-store may be used to hold the amount of shift
(positive or negative) in shift instructions.
There are sufficient facilities of this kind in
the machine to remove the need for any kind
of programmed instruction modification.
The other main use of the Q-stores is in
connection with input/output operations, outlined in the next section.
Input/Output Operations
Provision is made for use of the normal
peripheral devices, each one operating completely autonomously and simultaneously with
other peripherals and with computer operations. In the standard system up to 16 peripheral devices, with a total transfer rate in
excess of a million characters per second,
may be handled.
To call any peripheral transfer, an instruction specifies the nature of the operation, referring also to a Q-store in which the
other required parameters have been planted.
These are the device number, and the limiting addresses of the main store area concerned, since peripheral transfers are of variable length.
Assuming that the device is available (i.e.,
not already busy), a check is now carried out
by the input/output control system that the
main store area specified does not overlap
that involved in any peripheral transfer already in progress. The parameters are stored within the control, so that the Q-store
register is freed for further use by the program.
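The overlap check performed by the input/output control can be sketched as a simple interval test over the areas of all transfers in progress; the error raised here stands in for the lock-out interruption described later.

```python
# Sketch of the transfer-area check: a new transfer's main store area
# must not overlap the area of any transfer already in progress.

def overlaps(a, b):
    """Inclusive address ranges (lo, hi) overlap unless disjoint."""
    return a[0] <= b[1] and b[0] <= a[1]

def start_transfer(area, in_progress):
    """Admit the transfer, or report the conflict that would lock out
    the program until the prior transfer completes."""
    for busy in in_progress:
        if overlaps(area, busy):
            raise RuntimeError(f"lock-out: {area} overlaps busy area {busy}")
    in_progress.append(area)

busy = [(0, 99)]                    # a transfer already owns words 0-99
start_transfer((200, 299), busy)    # disjoint area: admitted at once
print(len(busy))                    # -> 2
```

It is this automatic check, applied at every transfer call, that relieves the programmer of guarding against conflicting peripheral operations.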
The transfer now proceeds in a manner
which, for economic reasons, differs depending on whether the device concerned is fast
(magnetic tape, for example) or slow (punched
card or paper tape, etc.).
In the case of the former, six-bit characters are assembled (taking a read operation
by way of illustration) into machine words.
When a word is complete, it is placed in a
single word buffer and a signal to the main
control system seizes a store cycle as soon
as possible to transfer the word into store.
Such calls have priority on the time of the
main store, but anyone peripheral device
may have to wait until other devices have
been dealt with.
Since such a buffering system uses a substantial quantity of equipment, more economical procedures are adopted. for the
slower devices. A common unit is shared by
all these, and no assembly into words takes
place outside the main store. Instead, as
any device has a character ready, the required main store word is extracted, the
character inserted in the next character
location and the word is returned to
store.
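The read-modify-write assembly for slow devices can be sketched as below: a 48-bit word holds eight six-bit characters, and each arriving character is slotted into the next position by fetching the word, masking, and returning it. Taking character 0 as the most significant six bits is an assumption; the paper does not state the packing order.

```python
# Sketch of the shared slow-device path: a 6-bit character is slotted
# into character position char_index (0..7) of a 48-bit word held in the
# main store (modelled as a list), by a read-modify-write cycle.

def insert_character(store, word_addr, char_index, char6):
    shift = 42 - 6 * char_index            # char 0 assumed most significant
    word = store[word_addr]                # extract the word from store ...
    word &= ~(0o77 << shift)               # ... clear the six-bit slot ...
    word |= (char6 & 0o77) << shift        # ... insert the character ...
    store[word_addr] = word                # ... and return it to store

store = [0]
for i, ch in enumerate([0o13, 0o25, 0o37]):
    insert_character(store, 0, i, ch)
print(oct(store[0]))   # -> 0o1325370000000000
```

Each character costs a full main store cycle, which is the prodigality the text mentions, but at a few kilocharacters per second the cost is negligible.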
This process is somewhat prodigal of main
store time, but the penalty of handling devices
with character rates up to a few kilocycles
per second in this way is quite negligible.
An attempt by the computer to make use
of a peripheral device or any part of a main
store area concerned in an uncompleted peripheral transfer results in a "lock-out."
The programme operation is interrupted until
the prior operation is completed. The user
is thus freed from any obligation to check and
guard in his program against such conflicting
operations.
KDF.9 "User Code"
Up to this point it has been implied that
programs are written directly in machine
code, albeit in mnemonic form. A very simple compiler routine can clearly translate
the mnemonics into instructions. This would,
however, leave the programmer with certain
tedious tasks, such as calculation of addresses
for jump instructions. The syllabic instruction form increases this problem.
The opportunity is therefore taken to incorporate in the compiler a number of additional features to eliminate such difficulties.
One or two specific facilities will be mentioned.
The mnemonic code handled by the compiler is called "User Code" and has the following important characteristic: every User
Code instruction is either a directive to the
compiler, not appearing explicitly in the compiled version of the program, or is compiled
into one machine instruction proper. Thus
there are no macro-instructions in User Code
which become sequences of instructions in
basic machine code, the correspondence between User Code and machine code instructions being essentially one to one.
Calculation of jump addresses is handled
by the compiler, the user being required only
to label the entry point with an arbitrary reference number and to specify "jump to reference number ... if ..." (for example JrCqZ
is the mnemonic meaning "jump to reference
r if the counter section of Q-store q is zero").
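How a compiler can take over the jump-address arithmetic is sketched below with a conventional two-pass scheme: pass one records the syllabic address at which each reference label falls, pass two patches the jumps. The program representation here is hypothetical; only the instruction lengths (jumps being three-syllable) follow the paper.

```python
# Sketch of compiler label resolution for "jump to reference r".
# program items are either ('label', r) markers, which occupy no space,
# or (mnemonic, length_in_syllables, target_reference_or_None).

def assemble(program):
    labels, addr = {}, 0
    for item in program:                  # pass 1: record label addresses
        if item[0] == 'label':
            labels[item[1]] = addr
        else:
            addr += item[1]               # advance by length in syllables
    out, addr = [], 0
    for item in program:                  # pass 2: patch jump targets
        if item[0] == 'label':
            continue
        mnem, length, target = item
        out.append((addr, mnem, labels[target] if target is not None else None))
        addr += length
    return out

# A one-instruction loop: a one-syllable MULTIPLY at label 1, then a
# three-syllable conditional jump back to it.
prog = [('label', 1), ('MULTIPLY', 1, None), ('JrCqZ', 3, 1)]
print(assemble(prog))   # -> [(0, 'MULTIPLY', None), (1, 'JrCqZ', 0)]
```

The programmer thus never sees a syllabic address; the reference number is the whole of the bookkeeping.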
Further elaboration of this process allows
a similar labelling of sub-routines. When it
encounters an instruction JSLi, the compiler
ensures that a copy of the library sub-routine
i will be made from the library tape and
incorporated into the program. It also writes
the appropriate entry address into the jump
instruction in the main program.
It will be remembered that in section 3
an instruction YA was used as a mnemonic for
"Fetch into N1 the word in main store address A." It is clearly possible for the programmer to treat as his data store a continuous main store area to be addressed as
Y0, Y1, etc. The compiler is again used to
convert these relative addresses to absolute
addresses, allocating as the starting point
the first word available after assembly of
the programme.
An obvious extension is to allow the programmer to use a number of groups designated YA, YB, etc., each having an area
commencing at zero. Thus YA0, YB75 are
permissible addresses.
Other areas of store are similarly allocated by the compiler. Constants appear
to the user as a set of stores known as V-stores. To incorporate a required constant
for a program requires only the user code
statement that, for example, V27 = F3.14159.
This has the effect of converting the numeric
value in standard floating binary form and
allocating a store word to it. Any subsequent
program reference to V27 fetches this constant into the nesting store.
Similarly the programmer can refer to
main store areas reserved by the compiler
for working store (storage of intermediate
results or data). Exactly as for Y and V
stores these are referred to with prefix W.
When the compilation is performed, the
resultant program commences with the transcribed main program, followed by any subroutines used. Areas for V and W stores
(and for other functions of this type) are provided up to the highest numbered of each type
in the original program. Finally the remaining area becomes the Y store area, thus allowing maximum generality.
Clearly it is possible to prepare special
versions of the compiler with any degree of
elaboration or desired characteristics. The
incorporation of a routine for conversion of
constants (which may be expressed in a number of ways) is a requirement for all uses,
but it is evident that the principle can be extended as desired.
Thus the standard compiler, which is appropriate for preparation of programs for a
variety of applications, may be supplemented
by additional special-purpose compilers.
Interruption and the Use of a
Director Routine
KDF.9, like other computers of its generation, has built-in interrupt facilities. Specifically, this means that at virtually any time
the normal sequence of instructions may be
suspended and control transferred to a fixed
address (syllable 0 of word 0 for obvious
practical reasons). Such a transfer of control
is called an Interruption. It is arbitrary only
in the sense that it is generally outside the
control of the programmer. It will take place
only when one of a certain number of quite
clearly defined situations arises in the machine. When an Interruption occurs, the interrupted program is left in such a state that
it may be subsequently resumed and will then
continue exactly as if nothing had happened unless the reason for the interruption was
some obvious abuse by the program of the
facilities of the machine.
The purpose of the Interruption facility is
to make "Time-sharing" possible. Here it is
necessary to distinguish between "parallel
operation" and "time-sharing." The former
implies that the machine is doing more than
one operation at a particular moment. On
KDF.9 such occasions include the ability of
one or more peripheral transfer operations
to proceed at the same time as computation
as long as no lock-out violation occurs (see
above). If such a violation does occur, a
transfer of control is necessary in order to
enter some other sequence of instructions
which can proceed unaffected by the lock-out.
This switch of control is automatic, and
so implies the necessity of interruption, in
order that the programmer shall not be required to anticipate and take action over lockout violations. The ability to switch from one
instruction sequence to another is called
"Time-sharing"; it will be evident that time-sharing and parallel operation go hand in hand, the former enabling the most efficient
use to be made of the latter.
There are two versions of the KDF.9 system. One of them has an elaborate time
sharing facility which enables up to four independent programs to be stored within the
machine at once (together with a supervisory
program or "Director"). They are obeyed on
a time-shared priority basis - that is, each
program is allocated a priority, and the hardware and the supervisory program together
ensure that the program of highest priority
is always operating subject only to peripheral
lockout conditions.
The other version of the KDF.9 system
only permits one program to be stored, in
addition to a supervisory program. This
version can be converted to "Full Time-sharing" by addition of equipment. The extra
equipment includes:
(a) Extra core planes which enable each
program to have its own Nesting Store, Sub-routine Jump Nesting Store and Q-Store. To switch Nesting Stores, for instance, it is necessary only to nest down three places, so that
the entire contents of the nesting store are
in the bottom sixteen cells. These are all
in the core plane, which can then be disconnected from the top registers and replaced
by another core plane. This is a very satisfactory compromise between loss of time
during changeover and volume of extra equipment required.
(b) A register corresponding to each
priority level, which is set whenever that
program is held up by a peripheral lock-out
and which records the details of the lock-out.
Interruptions are caused whenever a hold-up
occurs and also whenever a lock-out is
cleared which was holding up a program of
higher priority than the one currently operating. For this purpose a register noting
the current priority level is also necessary.
Features common to both types of machine provide for relative addressing and for
address checking. The relative addressing
feature causes the contents of a "Base Address" register to be added to any main store
address used by a program before access to
the main store is made. This allows programs to be coded always as if stored at
location 0 onwards, but to be stored and
obeyed in any segment of the store.
The address checking actually precedes
augmentation of the relative address by the
base address, and includes a check that the
relative address is not negative and does not
exceed the size of the store area allocated
to that particular program. (Core storage
allocation is completely flexible and is in the
hands of the supervisory program. Programs
may be moved about bodily in the store and
may have their priorities interchanged).
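The check-then-add order the text describes can be sketched directly: the relative address is validated against the program's allocation before the base is added, so a program can never form an absolute address outside its own segment. The exception type here is merely illustrative of the interruption that follows a violation.

```python
# Sketch of relative addressing with the address check applied before
# the Base Address is added, as the paper describes.

def absolute_address(relative, base, limit):
    """Map a program's relative main store address to an absolute one,
    or raise the Lock-in Violation that would cause an interruption.
    limit is the size of the store area allocated to the program."""
    if not 0 <= relative < limit:
        raise PermissionError("Lock-in Violation: address outside allocation")
    return base + relative

# A program coded as if stored from location 0, actually held at 4096:
print(absolute_address(10, base=4096, limit=1024))   # -> 4106
```

Because only the supervisory program sets the base and limit, programs can be moved bodily in the store, or have priorities interchanged, without recoding.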
If a program tries to go outside its allocated storage area, a "Lock-in Violation" is
said to have occurred, and an interruption
follows. The same thing happens if a program tries to use a peripheral device which
has not been allocated to it - a register of
currently allocated peripheral devices is automatically referred to every time a peripheral transfer is called.
In addition, interruption will occur if an
ordinary program tries to use one of a number of instructions which are reserved for
use by the Director. Such instructions provide access to the various hidden registers
concerned with the interruption facilities.
The over-riding objective is to ensure that
no program is capable of doing anything which
can upset the operation of any other.
Other interruptions can be instigated by the
machine operator, in order to allow input of
control messages on the console typewriter.
The program itself may also include instructions which cause entry to the Director in
order, for example, to ask for allocation of
peripheral units.
One other reason for interruption, of particular significance, is that which occurs
whenever a peripheral transfer called by the
Director itself comes to an end. This enables
a certain amount of "programmed time-sharing" within the Director. There are many
interesting and valuable possibilities. Thus,
the feature is used to allow programs to output "lines of print" which the Director can
send direct to a printer or to a magnetic tape
for subsequent off- or on-line printing - the
latter again Director controlled. This allows
standard programs to be written so as to be
capable of running on systems having different
output device configurations.
The only limitation on the facilities offered
by such supervisory routines lies in the size
of program which can be allowed without restricting the "main program" unduly. It is
therefore expected that in addition to the
"standard Directors" a number of others will
appear for various ranges of use.
Any Director must satisfy a number of requirements. It will be stored throughout normal operation of the machine in word 0 onwards of the main store and will be entered
at this point by any interruption. On each
such entry, it must:
(a) preserve, as far as is necessary, the
state of the interrupted program;
(b) discover the reason for interruption
and take appropriate action;
(c) return to program.
These requirements will be considered in
turn.
(a) This involves very little work on the
Director's part. The address to which control must be returned when the program is
resumed is automatically planted in the
S.J.N.S. by the interruption process. This
makes use of one of the "spare" cells of this
store, leaving one for use by the Director itself.
Similarly the Director makes use of the
three spare cells of the Nesting Store and
therefore does not have to do anything to
preserve the sixteen cells used by programs.
It must, however, store away the contents of
any Q-stores which it is going to use itself
(usually three) and of two special registers
provided to record arithmetic overflow and
peripheral device states.
(b) The Director has access to a "Reason
for Interrupt" register, different digits of
which are used to indicate the different reasons. It is, in fact, the setting and subsequent
detection of non-zero bits in this register,
as various situations arise, which triggers
off the interruption sequence. Once interruption has occurred it cannot occur again
until control has been returned to program;
however, digits corresponding to reasons for
interruption may appear in the register at
any time (they are cleared out as soon as the
register is read by the Director).
(c) In the case of a KDF.9 system with
full time-sharing facilities, this requires the
Director first to determine the priority level
to which control will be returned, and then to
arrange for connection of the appropriate
Nesting Stores, etc.
On any machine, the Director must also
restore any Q-stores which were temporarily
parked away, and must reset the two special
registers mentioned in (a) above. The Base
Address register, which was at zero while
the Director was in operation, must then be
set before finally using a special version of
the EXIT instruction to return to the main
program (this special instruction removes
the inhibition preventing interruption while
the Director operates).
CONCLUSIONS
In the space available it has been possible
only to outline some of the distinctive features of the KDF.9 system. Many of these
arise from the novel arrangement of working
registers, whereas others are unspectacular
in terms of hardware but are the product of
close investigation of operational needs. No
attempt has been made to catalogue the performance factors of the KDF.9 system or
even to include a specification. It is, however, confidently anticipated that it may point
the way to even more striking improvements
in the ratio of performance to cost.
Apart from the standard versions of the Director already mentioned, efficient use of a system of the power of KDF.9 presupposes the availability of an adequate
"software package." Prominent among the
individual items associated with the system
are the following:
(a) A fast-compiling "load and go" ALGOL
Compiler [9].
(b) A fast object-program ALGOL translator [10, 11].
(c) ALGOL program testing and operating
systems.
(d) An ALGOL procedure library.
(e) Translators for FORTRAN, COBOL
and other languages.
Detailed descriptions of some of these
items appear elsewhere.
ACKNOWLEDGMENTS
In preparing this paper, the author is
privileged to report the work of enthusiastic
teams of designers and users led by R. H.
Allmark and C. Robinson respectively. Most
of the members of these groups have made
Significant original contributions and mention of individuals in such a team enterprise
is out of place. The writer wishes to express
his thanks to them and also to the Manager of
the Data Processing Division of the English
Electric Company, Ltd., for permission to
publish this paper.
REFERENCES

1. S. J. M. Denison, E. N. Hawkins and C. Robinson: "DEUCE Alphacode." DEUCE Program News, No. 20, January 1958 (The English Electric Co. Ltd.).
2. R. A. Brooker: "The Autocode Programs developed for the Manchester University Computers." Computer Journal, 1, 1958, p. 15.
3. R. A. Brooker, B. Richards, E. Berg and R. Kerr: "The Manchester Mercury Autocode System." (University of Manchester, 1959).
4. FORTRAN Manual (International Business Machines Corporation).
5. F. G. Duncan and E. N. Hawkins: "Pseudo-code Translation on Multi-level Storage Machines." Proceedings of the International Conference on Information Processing, p. 144 (UNESCO, Paris, June 1959).
6. F. G. Duncan and H. R. Huxtable: "The DEUCE Alphacode Translator." Computer Journal, 3, 1960, p. 98.
7. C. L. Hamblin: "GEORGE: A Semi-translation Programming Scheme for DEUCE." Programming and Operation Manual. University of New South Wales, Kensington, N.S.W.
8. R. H. Allmark and J. A. Lucking: "Design of an Arithmetic Unit Incorporating a Nesting Store." Proceedings of the I.F.I.P. Congress, Munich, August 1962.
9. B. Randell: "The Whetstone KDF.9 ALGOL Translator." Proceedings of the Programming Systems Symposium, London School of Economics, July 1962.
10. E. N. Hawkins and D. H. R. Huxtable: "A Multi-pass Translation Scheme for ALGOL 60 for KDF.9." Annual Review of Automatic Programming, Vol. 3, 1962, Pergamon Press.
11. F. G. Duncan: "Implementation of ALGOL 60 for KDF.9." Computer Journal, 5, 1962, p. 130.
A COMMON LANGUAGE FOR HARDWARE,
SOFTWARE, AND APPLICATIONS
Kenneth E. Iverson
Thomas J. Watson Research Center, IBM
Yorktown Heights, New York
INTRODUCTION

Algorithms commonly used in automatic data processing are, when considered in terms of the sequence of individual physical operations actually executed, incredibly complex. Such algorithms are normally made amenable to human comprehension and analysis by expressing them in a more compact and abstract form which suppresses systematic detail. This suppression of detail commonly occurs in several fairly well defined stages, providing a hierarchy of distinct descriptions of the algorithm at different levels of detail. For example, an algorithm expressed in the FORTRAN language may be transformed by a compiler to a machine code description at a greater level of detail, which is in turn transformed by the "hardware" of the computer into the detailed algorithm actually executed.

Distinct and independent languages have commonly been developed for the various levels used. For example, the operations and syntax of the FORTRAN language show little resemblance to the operations and syntax of the computer code into which it is translated, and neither FORTRAN nor the machine language resembles the circuit diagrams and other descriptors of the processes eventually executed by the machine. There are, nevertheless, compelling reasons for attempting to use a single "universal" language applicable to all levels, or at least a single core language which may be slightly augmented in different ways at the various levels.

First, it is difficult, and perhaps undesirable, to make a precise separation into a small number of levels. For example, the programmer or analyst operating at the highest (least detailed) level may find it convenient or necessary to revert to a lower level to attain greater efficiency in eventual execution or to employ convenient operations not available at the higher level. Programming languages such as FORTRAN commonly permit the use of lower levels, frequently of a lower-level "assembly language" and of the underlying machine language. However, the employment of disparate languages on the various levels clearly complicates their use in this manner.

Second, it is not even possible to make a clear separation of level between the software (metaprograms which transform the higher-level algorithms) and the hardware, since the hardware circuits may incorporate permanent or semipermanent memory which determines their action and hence the computer language. If this special memory can itself be changed by the execution of a program, its action may be considered that of software, but if the memory is "read-only" its action is that of hardware, leaving a rather tenuous distinction between software and hardware which is likely to be further blurred in the future.

Finally, in the design of a data processing system it is imperative to maintain close communication between the programmers (i.e., the ultimate users), the software designers, and the hardware designers, not to mention the communication required among the various groups within any one of these levels. In particular, it is desirable to be able to describe the metaprograms of the software and the microprograms of the hardware in a common language accessible to all.
The language presented in Reference 1
shows promise as a universal language, and
the present paper is devoted to illustrating its
use at a variety of levels, from microprograms, through metaprograms, to "applications" programs in a variety of areas.
To keep the treatment within reasonable
bounds, much of the illustration will be
limited to reference to other published
material. For the same reason the presentation of the language itself will be
limited to a summary of that portion required
for microprogramming (Table 1), augmented
by brief definitions of further operations as
required.
MICROPROGRAMS
In the so-called "systems design" of a
computer it is perhaps best to describe the
computer at a level suited to the machine
language programmer. This type of description has been explored in detail for a single
machine (the IBM 7090) in Reference 1, and
more briefly in Reference 2. Attention will
therefore be restricted to problems of description on a more detailed level, to specialized equipment such as associative memory, and to the practical problem of keying and printing microprograms which arises from their use in design automation and simulation.
The need for extending the detail in a microprogram may arise from restrictions on the operations permitted (e.g., logical "or" and negation provided, but not "and"), from restrictions on the data paths provided, and from a need to specify the overall "control" circuits which (by controlling the data paths) determine the sequence in the microprogram, to name but a few. For example, the basic "instruction fetch" operation of fetching from a memory (i.e., a logical matrix) M the word (i.e., row) M^i selected according to the base two value of the instruction location register s (that is, i = ⊥s), and transferring it to the command register c, may be described as

    c ← M^⊥s.

Suppose, however, that the base two value operation (i.e., address decoding) is not provided on the register s directly, but only on a special register a to which s may be transferred. The fetch then becomes

    a ← s
    c ← M^⊥a.
Suppose, moreover, that all communication with memory must pass through a buffer register b, that each transfer out of a memory word M^i is accompanied by a subsequent reset to zero of that word, that every transfer from a register (or word of memory) x to a register (or word of memory) y must be of the form

    y ← y ∨ x,

and that any register may be reset to zero; then the instruction fetch becomes

    1   a ← 0
    2   a ← a ∨ s
    3   b ← 0
    4   b ← b ∨ M^⊥a   }
    5   M^⊥a ← 0       }
    6   M^⊥a ← M^⊥a ∨ b
    7   c ← 0
    8   c ← c ∨ b
In this final form, the successive statements correspond directly (except for the
bracketed pair 4 and 5 which together comprise an indivisible operation) to individual
register-to-register transfers. Each statement can, in fact, be taken as the "name" of
the corresponding set of data gates, and the
overall control circuits need only cycle
through a set of states which activate the
data gates in the sequence indicated.
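The eight-step sequence can be checked by executing it. The sketch below is a modern Python rendering (the function names are ours), observing the stated constraints: the only transfer form is y ← y ∨ x, reading a memory word clears it, and step 6 restores the word from the buffer.

```python
def base_two(bits):
    """Base two value of a bit vector, as applied to register a."""
    v = 0
    for bit in bits:
        v = 2 * v + bit
    return v

def ior(y, x):
    """The only permitted transfer form: y <- y or x, component-wise."""
    return tuple(p | q for p, q in zip(y, x))

def fetch(M, s):
    """Steps 1-8 of the instruction fetch; M is a list of bit-tuples."""
    w = len(M[0])
    a = (0,) * len(s)        # 1  reset a
    a = ior(a, s)            # 2  a <- a or s
    b = (0,) * w             # 3  reset buffer b
    i = base_two(a)
    b = ior(b, M[i])         # 4  b <- b or M[i]  } indivisible pair:
    M[i] = (0,) * w          # 5  M[i] <- 0       } reading clears the word
    M[i] = ior(M[i], b)      # 6  restore the memory word from b
    c = (0,) * w             # 7  reset command register c
    c = ior(c, b)            # 8  c <- c or b
    return c
```

After the call the fetched word is in c and the memory word is intact, as required.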
Proceedings-Fall Joint Computer Conference, 1962 / 123

The sequence indicated in a microprogram such as the above is more restrictive than necessary, and certain of the statements (such as 1 and 3, or 6 and 7) could be executed concurrently without altering the overall result. Such overlapping is normally employed to increase the speed of execution of microprograms. The corresponding relaxation of sequence constraints complicates their specification; e.g., execution of statement k might be permitted to begin as soon as statements h, i, and j were completed. Senzig (Reference 3) proposes some useful techniques and conventions for this purpose.
The "tag" portion of an associative memory can, as shown in Reference 2, be characterized as a memory M, an argument vector x, and a sense vector s related by the expression

    s = ∧/(M = x),

or by the equivalent expression

    s̄ = ∨/(M ≠ x),
obtained by applying De Morgan's law. Falkoff (Reference 4) has used microprograms of the type discussed here in a systematic exploration of schemes for the realization of associative memories for a variety of functions including exact match, largest tag, and nearest larger tag.
Because the symbols used in the language
have been chosen for their mnemonic properties rather than for compatibility with the
character sets of existing keyboards and
printers, transliteration is required in entering microprograms into a computer, perhaps for processing by a simulation or design
automation metaprogram. For that portion
of the language which is required in microprogramming, Reference 5 provides a simple
and mnemonic solution of the transliteration
problem. It is based upon a two-character representation of each symbol in which the second character need be specified but rarely. Moreover, it provides a simple representation of the index structure (permitting subscripts and superscripts to an arbitrary number of levels) based upon a Lukasiewicz or parenthesis-free representation of the corresponding tree.
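The two equivalent tag-match expressions given earlier for the associative memory can be checked mechanically. A small sketch (ours, not Falkoff's), with the memory as a list of bit-tuples:

```python
def sense(M, x):
    """Sense vector s: s_i = 1 iff row i of the tag memory M agrees with
    the argument x in every position (the and-over-equals form)."""
    and_form = [int(all(m == xi for m, xi in zip(row, x))) for row in M]
    # De Morgan equivalent: not (some position differs)
    or_form = [int(not any(m != xi for m, xi in zip(row, x))) for row in M]
    assert and_form == or_form   # the two canonical forms always agree
    return and_form
```

For M with rows (1,0,1), (0,0,1), (1,0,1) and argument (1,0,1), the sense vector is (1, 0, 1).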
METAPROGRAMS
Just as a microprogram description of a
computer (couched at a suitable level) can
provide a clear specification of the corresponding computer language, so can a program
description of a compiler or other metaprogram give a clear specification of the
"macro-language" which it accepts as input.
No complete description of a compiler expressed in the present language has been
published, but several aspects have been
treated. Brooks and Iverson (Chapter 8,
Reference 6) treat the SOAP assembler in
some detail, particularly the use of the open
addressing system and "availability" indicators in the construction and use of symbol
tables, and also treat the problem of generators. Reference 1 treats the analysis of compound statements in compilers, including the optimization of a parenthesis-free statement and the translations between parenthesis and parenthesis-free forms. The latter has also been treated (using an emasculated form of the language) by Oettinger (Reference 7) and by Huzino (Reference 8); it will also be used for illustration here.
Consider a vector c representing a compound statement in complete* parenthesis form employing operators drawn from a set p (e.g., p = (+, ×, -, ÷)). Then Program 1 shows an algorithm for translating any well-formed statement c into the equivalent statement l in parenthesis-free (i.e., Lukasiewicz) form.

Program 1 - The components of c are examined, and deleted from c, in order from left to right (steps 4, 5). According to the decisions on steps 6, 7, and 8, each component is discarded if it is a left parenthesis, appended at the head of the resulting vector l if it is a variable (step 9), appended at the head of the auxiliary stack vector s if it is an operator (step 10), and initiates a transfer of the leading component of the stack s to the head of the result l if it is a right parenthesis (steps 11, 12). The behavior is perhaps best appreciated by tracing the program for a given case, e.g., if c = ([, [, x, +, y, ], ×, [, p, +, q, ], ]), then l = (×, +, q, p, +, y, x).

APPLICATIONS

Areas in which the programming language has been applied include search and sorting procedures, symbolic logic, linear programming, information retrieval, and music

*In complete parenthesis form all implied parentheses are explicitly included, e.g., the statement ((x + y) × (p + q)).
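The scan of Program 1 can be restated as a modern sketch; list insertion at position 0 plays the role of appending at the head of a vector, and the bracket symbols follow the trace above.

```python
def to_lukasiewicz(c, operators):
    """Translate a well-formed complete-parenthesis statement (a list of
    symbols) into the equivalent Lukasiewicz, parenthesis-free form."""
    l = []   # result vector; components are appended at the head
    s = []   # auxiliary stack vector
    for comp in c:
        if comp == '[':            # left parenthesis: discard
            continue
        elif comp == ']':          # right parenthesis: move stack head to result
            l.insert(0, s.pop(0))
        elif comp in operators:    # operator: append at head of stack
            s.insert(0, comp)
        else:                      # variable: append at head of result
            l.insert(0, comp)
    return l
```

Tracing the example (with * standing for the times operator), to_lukasiewicz(list('[[x+y]*[p+q]]'), {'+', '*'}) yields ['*', '+', 'q', 'p', '+', 'y', 'x'].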
Table 1

Basic Operations for Microprogramming (selected from Iverson, A Programming Language, Wiley, 1962)

Floor:            ⌊x⌋          largest integer k such that k ≤ x;  ⌊3.14⌋ = 3, ⌊-3.14⌋ = -4
Ceiling:          ⌈x⌉          smallest integer k such that k ≥ x;  ⌈3.14⌉ = 4, ⌈-3.14⌉ = -3
Residue mod m:    m|n          k such that n = mq + k, 0 ≤ k < m;  7|19 = 5, 7|21 = 0
And:              w ← u ∧ v    w = 1 iff u = 1 and v = 1
Or:               w ← u ∨ v    w = 1 iff u = 1 or v = 1
Negation:         w ← ū        w = 1 iff u = 0
Proposition:      w ← (x R y)  w = 1 iff x stands in relation R to y;  (u ≠ v) is the exclusive-or of u and v
Full vector:      ε(n)         all 1's;  ε(5) = (1,1,1,1,1)
Unit vector:      e^j(n)       1 in position j;  e^0(5) = (1,0,0,0,0),  e^3(5) = (0,0,0,1,0)
Prefix vector:    α^j(n)       j leading 1's;  α^3(5) = (1,1,1,0,0),  α^2(4) = (1,1,0,0)
Suffix vector:    ω^j(n)       j final 1's;  ω^3(5) = (0,0,1,1,1)
Infix vector:     i↑α^j(n)     j 1's after i 0's;  2↑α^3(9) = (0,0,1,1,1,0,0,0,0)
Interval vector:  ι^j(n)       integers from j;  ι^0(3) = (0,1,2)
Full matrix:      E(m×n)       all 1's
Identity matrix:  I(m×n)       diagonal 1's;  w^i_j = (i = j)

Dimensions may be omitted when clear from context. All basic operations are extended component-by-component to vectors and matrices, e.g., w ← u ∨ v. The maximum length prefix (suffix) of 1's in u is denoted α/u (ω/u); for u = (1,1,0,1,0), α/u = (1,1,0,0,0) and ω/u = (0,0,0,0,1). A character also has a vector representation in a given code; in 8421 code the digit 1 is represented as (0,0,0,1).
 1   l ← ε(0)
 2   s ← ε(0)
 3   ν(c) : 0               (done when c is exhausted)
 4   x ← c1                 (examine the leading component of c)
 5   c ← ᾱ1/c               (delete the leading component of c)
 6   x : [                  (left parenthesis: discard, return to step 3)
 7   x : ]                  (right parenthesis: go to step 11)
 8   x ∈ p                  (operator: go to step 10)
 9   l ← x ⊕ l              (variable: append at head of result; return to step 3)
10   s ← x ⊕ s              (append at head of stack; return to step 3)
11   l ← s1 ⊕ l             (move the stack head to the head of the result)
12   s ← ᾱ1/s               (delete the stack head; return to step 3)

PROGRAM 1
Translation from complete parenthesis statement c to equivalent Lukasiewicz statement l

theory. The first two are treated extensively in Reference 1, and Reference 9 illustrates the application to linear programming by a 13-step algorithm for the simplex method. The areas of symbolic logic and matrix algebra illustrate particularly the utility of the formalism provided. Salton (Reference 10) has treated some aspects of information retrieval, making particular use of the notation for trees. Kassler's use in music concerns the analysis of Schoenberg's 12-tone system of composition (Reference 11). Three applications will be illustrated here: search procedures, the relations among the canonical forms of symbolic logic, and matrix inversion.

Search Algorithms. Figure 1 shows search programs and examples (taken from Reference 1) for five methods of "hash addressing" (cf. Peterson, Reference 12), wherein the functional correspondent of an argument x is determined by using some key transformation function t which maps each argument x into an integer i in the range of the indices of some table (i.e., matrix) which contains the arguments and their correspondents. The key transformation t is, in general, a many-to-one function, and the index i is merely used as the starting point in searching the table for the given argument x. Figure 1 is otherwise self-explanatory. Method (e) is the widely used open addressing system described by Peterson (Reference 12).

Symbolic Logic. If x is a logical vector and T is a logical matrix of dimension 2^ν(x) × ν(x) such that the base two value of T^i is i (that is, ⊥T = ι^0(2^ν(x))), then the rows of T define the domain of the argument x, and any logical function f(x) defined on x can be completely specified by the intrinsic vector i(f) such that i_j(f) = f(T^j).

Expansion of the expression p = ∧/(T = x) shows that p is the vector of minterms in x, and consequently

    f(x) = ∨/(i(f) ∧ p),        (1)
    f(x) = ≠/(χ(f, x) ∧ p),     (2)

and the relation between the intrinsic vector and the exclusive disjunctive characteristic vector χ(f, x) (first derived by Muller, Reference 13) is given directly by the square matrix S = T ○ T. The properties of the matrix S are easily derived from this formulation. Moreover, the formal application of De Morgan's laws to equations (1) and (2) yields the two remaining canonical forms directly (Reference 1, Chapter 7).
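The intrinsic vector and the minterm evaluation of equation (1) can be made concrete in a short sketch (ours, not the paper's); the rows of the domain matrix T are generated in base-two order, so the base two value of row i is i.

```python
from itertools import product

def intrinsic_vector(f, nvars):
    """i(f): the value of f on each row of the domain matrix T."""
    return [f(*row) for row in product((0, 1), repeat=nvars)]

def evaluate(iv, x):
    """f(x) as the or-reduction of i(f) and the minterm vector p, where
    p_i = 1 exactly when x equals row i of T."""
    p = [int(tuple(x) == row) for row in product((0, 1), repeat=len(x))]
    return int(any(a and b for a, b in zip(iv, p)))
```

For f(u, v) = u exclusive-or v, intrinsic_vector gives (0, 1, 1, 0), and evaluate recovers f at any argument.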
Matrix Inversion. The method of matrix inversion using Gauss-Jordan (complete) elimination with pivoting, and restricting the total major storage to a single square matrix augmented by one column (described in References 14 and 15), involves enough selection, permutation, and decision type operations to render its complete description by classical
Figure 1. Programs and examples for methods of scanning equivalence classes defined by a 1-origin key transformation t: (a) overflow, (b) overflow with chaining, (c) single table with chaining, (d) single table with chaining and mapping vector, and (e) open addressing system (construction and use of table). The data of the examples are the keys k = (Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday) with t(k_i) = 1 + (6|n_i), where n_i is the rank in the alphabet of the first letter of k_i; thus n = (19, 13, 20, 23, 20, 6, 19) and t(k) = (2, 2, 3, 6, 3, 1, 2).
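Method (e), open addressing, can be sketched as follows. The key transformation is the one used in the examples (1 plus the alphabet rank of the first letter, modulo 6); 0-origin indexing replaces the 1-origin of Figure 1, and the bounded probe loop is our addition, guarding lookup of an absent key in a full table.

```python
def t(key):
    """Key transformation of the examples: 1 + (6 | rank of first letter)."""
    rank = ord(key[0].upper()) - ord('A') + 1
    return 1 + rank % 6

def build(keys, values, n):
    """Open addressing: place each entry at t(key), probing forward
    cyclically past occupied slots."""
    table = [None] * n
    for k, v in zip(keys, values):
        i = t(k) % n
        while table[i] is not None:
            i = (i + 1) % n
        table[i] = (k, v)
    return table

def lookup(table, key):
    n = len(table)
    i = t(key) % n
    for _ in range(n):           # probe at most n slots
        if table[i] is None:
            return None
        if table[i][0] == key:
            return table[i][1]
        i = (i + 1) % n
    return None

days = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]
table = build(days, range(7), 7)
```

The index t(key) is only the starting point of the search, exactly as the text notes for the many-to-one key transformation.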
matrix notation rather awkward. Program 2
describes the entire process. The starred
statements perform the pivoting operations
and their omission leaves a valid program
without pivoting; exegesis of the program
will first be limited to the abbreviated program without pivoting.
Program 2 - Step 1 initializes the counter k which limits (step 11) the number of iterations, and step 3 appends to the given square matrix M a final column of the form (1, 0, 0, ..., 0) which is excised (step 12) only after completion of the main inversion process. Step 7 divides the pivot (i.e., the first) row by the pivot element, and step 8 subtracts from M the outer product of its first row (except that the first element is replaced by zero) with its first column.* The result of steps 7 and 8 is to reduce the first column of M to the form (1, 0, 0, ..., 0) as required in Jordan elimination.
  1   k ← ν(M)
* 2   p ← ι1(k)
  3   append to M a final column (1, 0, 0, ..., 0)
* 4   j ← first index of the maximum among the magnitudes of the first k elements of the first column of M
* 5   interchange p1 and pj
* 6   interchange the first row and row j of M, except for their final components
  7   divide the first row of M by its first element
  8   subtract from M the outer product of its first row (leading element replaced by zero) with its first column
  9   rotate the rows of M up by one place and the columns to the left by one place
*10   p ← 1 ↑ p
 11   k ← k - 1; repeat from step 4 while k > 0
 12   excise the final column of M
*13   reorder the columns of M by the permutation inverse to p

PROGRAM 2
Matrix inversion by Gauss-Jordan elimination

The net result of step 9 is to bring the next pivot row to first position and the next pivot element to first position within it by (1) cycling the old pivot row to last place and the remaining rows up by one place, and (2) cycling the leading column (1, 0, 0, ..., 0) to last place (thus restoring the configuration produced by step 3) and the remaining columns one place to the left. The column rotation rotates all columns upward by one place, and the subsequent row rotation rotates all rows to the left by one place.*

The pivoting provided by steps 2, 4, 5, 6, 10, and 13 proceeds as follows. Step 4 determines the index j of the next pivot row by selecting the maximum† over the magnitudes of the first k elements of the first column, where k = ν(M) + 1 - q on the q-th iteration. Step 6 interchanges the first row with the selected pivot row, except for their final components. Step 5 records the interchange in the permutation vector‡ p, which is itself initialized (step 2) to the value of the identity permutation vector ι1(k) = (1, 2, ..., k). The rotation of p on step 10 is necessitated by the corresponding rotations of the matrix M (which it indexes) on step 9. Step 13 performs the appropriate inverse reordering§ among the columns of the resulting inverse M.

*The row rotation of a matrix X by a vector k is a row-by-row extension of the rotation k ↑ x: row i of the result is k_i ↑ X^i, and the column rotation is defined similarly, column j of the result being k_j ↑ X_j. The outer product Z of vector x by vector y is the "column by row product" Z ← x ∘ y, defined by Z^i_j = x_i × y_j.

†The expression u ← y⌈x denotes a logical vector u such that u_i = 1 if and only if x_i is a maximum among those components x_k for which y_k = 1. Clearly, u/ι1 is the vector of indices of the (restricted) maxima, and the first component of u/ι1 is the first among them.

‡A permutation vector is any vector p whose components take on all the values 1, 2, ..., ν(p). The expression y = x_p denotes that y is a permutation of x defined by y_i = x_(p_i).

§Permutation is extended to matrices column by column: N = M_p if and only if N_j = M_(p_j). The expression u ⍳ x is called the u-index of x and is defined by the relation u ⍳ x = j, where x = u_j; if x is a vector, then j = u ⍳ x is defined by j_i = u ⍳ x_i. Consequently, p ⍳ ι1 denotes the permutation inverse to p.
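For comparison, here is a conventional restatement of Gauss-Jordan (complete) elimination with partial pivoting. It uses a separate identity matrix rather than the single-array scheme with rotations described above, so it is a sketch of the underlying method, not of Program 2 itself.

```python
def invert(M):
    """Gauss-Jordan elimination with partial pivoting; returns the inverse."""
    n = len(M)
    A = [row[:] for row in M]                       # working copy
    I = [[float(i == j) for j in range(n)] for i in range(n)]
    for col in range(n):
        # pivot: largest magnitude in this column, at or below the diagonal
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        I[col], I[piv] = I[piv], I[col]
        d = A[col][col]
        A[col] = [a / d for a in A[col]]            # normalize the pivot row
        I[col] = [b / d for b in I[col]]
        for r in range(n):                          # clear the column elsewhere
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [a - f * p for a, p in zip(A[r], A[col])]
                I[r] = [b - f * q for b, q in zip(I[r], I[col])]
    return I
```

Complete elimination clears the pivot column both above and below the diagonal, so no back-substitution pass is needed.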
A description in the ALGOL language of
matrix inversion by the Gauss-Jordan method
is provided in Reference 16.
REFERENCES

1. Iverson, K. E., A Programming Language, Wiley, 1962.
2. Iverson, K. E., "A Programming Language," Spring Joint Computer Conference, San Francisco, May 1962.
3. Senzig, D. N., "Suggested Timing Notation for the Iverson Notation," Research Note NC-120, IBM Corporation.
4. Falkoff, A. D., "Algorithms for Parallel-Search Memories," J.A.C.M., October 1962.
5. Iverson, K. E., "A Transliteration Scheme for the Keying and Printing of Microprograms," Research Note NC-79, IBM Corporation.
6. Brooks, F. P., Jr., and Iverson, K. E., Automatic Data Processing, Wiley (in press).
7. Oettinger, A. G., "Automatic Syntactic Analysis of the Pushdown Store," Proceedings of the Twelfth Symposium in Applied Mathematics, April 1960, published by American Mathematical Society, 1961.
8. Huzino, S., "On Some Applications of the Pushdown Store Technique," Memoirs of the Faculty of Science, Kyushu University, Ser. A, Vol. XV, No. 1, 1961.
9. Iverson, K. E., "The Description of Finite Sequential Processes," Fourth London Symposium on Information Theory, August 1960, Colin Cherry, Ed., Butterworth and Company.
10. Salton, G. A., "Manipulation of Trees in Information Retrieval," Communications of the ACM, February 1961, pp. 103-114.
11. Kassler, M., "Decision of a Musical System," Research Summary, Communications of the ACM, April 1962, p. 223.
12. Peterson, W. W., "Addressing for Random Access Storage," IBM Journal of Research and Development, Vol. 1, 1957, pp. 130-146.
13. Muller, D. E., "Application of Boolean Algebra to Switching Circuit Design and to Error Detection," Transactions of the IRE, Vol. EC-3, 1954, pp. 6-12.
14. Iverson, K. E., "Machine Solutions of Linear Differential Equations," Doctoral Thesis, Harvard University, 1954.
15. Rutishauser, H., "Zur Matrizeninversion nach Gauss-Jordan," Zeitschrift für Angewandte Mathematik und Physik, Vol. X, 1959, pp. 281-291.
16. Cohen, D., "Algorithm 58, Matrix Inversion," Communications of the ACM, May 1961, p. 236.
INTERCOMMUNICATING CELLS,
BASIS FOR A DISTRIBUTED LOGIC COMPUTER
C. Y. Lee
Bell Telephone Laboratories, Inc.
Holmdel, New Jersey
The purpose of this paper is to describe
an information storage and retrieval system
in which logic is distributed throughout the
system. The system is made up of cells.
Each cell is a small finite state machine
which can communicate with its neighboring
cells. Each cell is also capable of storing a
symbol.
There are several differences between
this cell memory and a conventional system.
With logic distributed throughout the cell
memory, there is no need for counters or
addressing circuitry in the system. The
flow of information in the cell memory is to
a large extent guided by the intercommunicating cells themselves. Furthermore, because retrieval no longer involves scanning,
it becomes possible to retrieve symbols from
the cell memory at a rate independent of the
size of the memory.
Information to be stored and processed is
normally presented to the cells in the form
of strings of symbols. Each string consists
of a name and an arbitrary number of parameters. When a name string is given as its
input, the cell memory is expected to give
as its output all of the parameter strings
associated with the name string. This is
called direct retrieval. On the other hand,
given a parameter string, the cell network
is also expected to give as its output the
name string associated with that parameter
string. This is called cross-retrieval.
The principal aim of our design is a cell
memory which satisfies the following criteria:
1. The cells are logically indistinguishable from each other.
2. The amount of time required for direct
retrieval is independent of the size of
the cell memory.
3. The amount of time required for crossretrieval is independent of the size of
the cell memory.
4. There is a simple uniform procedure
for enlarging or reducing the size of
the cell memory.
Aim and Motivation
We are primarily concerned here with the
design of the memory system of a computer
in which memory and logic are closely interwoven. The motivation stems from our contention that in the present generation of
machines the scheme of locating a quantity
of information by its "address" is fundamentally a weak one, and furthermore, the
constraint that a memory "word" may communicate only with the central processor (in
most cases the accumulator) has no intrinsic
appeal. This motivation led us to the design
of a cell memory compatible with these contentions.
The association of an address with a
quantity of information is very much the result of the type of computer organization we
now have. Writing a program in machine
language, one rarely deals with the quantities
of information themselves. A programmer
normally must know where these quantities
of information are located. He then manipulates their addresses according to the problem at hand. An address in this way often
assumes the role of the name of a piece of
information.
There are two ways to look at this situation. Because there is usually an ordering relationship among addresses, referring to contents by addresses has its merits, provided a programmer has sufficient foresight at the time the contents were stored. On the other hand, a location, other than being its address, can have but the most superficial relation to the information which happens to be stored there. The assignment of a location to a quantity of information is therefore necessarily artificial. In many applications the introduction of an address as an additional characteristic of a quantity of information serves only to compound the complexity of the issue. In any event the assignment of addresses is a local problem, and as such should not occupy people's time and may even be a waste of the machine's time.
A macroscopic approach to information
storage and retrieval is to distinguish information by its attributes. A local property,
"350 Fifth Avenue," means little to most
people. The attribute, "the tallest building
in the world," does. The macroscopic approach requires only that we be able to discern facts by contents. Whatever means are
needed for addressing, for counting, for
scanning and the like are not essential and
should be left to local considerations.
Doing away with addressing, counting, and
scanning means a different approach to machine organization. The underlying new concept is however simple and direct: The work
of information storage and retrieval should
not be assumed by a central processor, but
should be shared by the entire cell memory.
The physical implementation of this concept
is intercommunicating cells.
Although one of the principal aims of an
intercommunicating cell organization is to
make the rate of retrieval independent of the
amount of information stored in the cell
memory, a number of other engineering criteria are no less important. Uniformity of
cell design makes mass production possible.
Ease of attaching or detaching cells from the
cell memory simplifies the growth problem.
Also, facilities for simultaneous matching of
symbols make complex preprocessing such
as sorting unnecessary.
A few intuitive remarks on cell memory
retrieval may be appropriate here. Information stored in the cell memory is in the form of strings of symbols. Normally, such a string is made up of a name and the parameters which describe its attributes. Each cell is capable of storing one symbol. A string of information is therefore stored in a
corresponding string of cells. In the cell
memory each cell is given enough logic circuitry so that it can give us a yes or no
answer to a simple question we ask. If we
think of the symbol, say s, contained in a cell
as its name, then the question we ask is merely
whether the cell's name is s or is not s.
In retrieval, for example, we may wish to
find all of the parameters (attributes) of a
strategy whose name is XYZ. As a first
step we would simultaneously ask each cell
whether its name is X. If a cell gives us an
answer no, then we are no longer interested
in that cell. If a cell gives us an answer yes,
however, we know it may lead us to the name
of the strategy we are looking for. Therefore, we also want each cell to have enough
logic circuitry so that it can signal its neighboring cell to be ready to respond to the next
question. We then simultaneously ask each
of these neighboring cells whether its name
is Y. Those cells whose names are Y in
turn signal their neighboring cells to be
ready to respond to the final question:
whether a cell's name is Z. The cell which
finally responds is now ready to signal its
nearest neighbor to begin giving out parameters of the strategy.
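The question-and-propagate retrieval just described can be sketched directly. In the sketch below the '*' marker separating strings is our own convention, not part of the paper's design, and the simultaneous questioning of all cells is modeled by a sweep over an activity vector.

```python
def retrieve(cells, name):
    """Direct retrieval in a linear cell memory.  cells is a list of symbols;
    activity propagates rightward through cells matching the name symbols,
    and the parameters following a full match are read out up to the next
    string marker '*' (an assumed convention for this sketch)."""
    active = [True] * len(cells)      # every cell answers the first question
    for symbol in name:
        nxt = [False] * len(cells)
        for i, cell_on in enumerate(active):
            # a matching active cell signals its right-hand neighbor
            if cell_on and cells[i] == symbol and i + 1 < len(cells):
                nxt[i + 1] = True
        active = nxt
    results = []
    for i, cell_on in enumerate(active):
        if cell_on:
            j = i
            while j < len(cells) and cells[j] != '*':
                results.append(cells[j])
                j += 1
    return results
```

With the memory *XYZab*PQRcd, asking for XYZ yields the parameters a, b without ever computing where the string is stored, and the number of cells answering yes shrinks with each question.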
The process of cell memory retrieval
provides a particularly good example of letting the cells themselves guide the flow of
information. By a progressive sequence of
questions, we home in on the information we
are looking for in the cell memory, although
we have no idea just where the information
itself is physically stored. Because generally
most cells contain information which is of no
use to us, the number of cells which give yes
answers at any moment is quite small. We
may, if we wish, think of the retrieving
process therefore as a process of getting
rid of useless information rather than a
searching process for the useful information.
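The question-and-signal sequence just described can be simulated in a few lines. The following is a minimal sequential sketch (the function name and the list representation of the memory are ours, not the paper's); the simultaneous match that the hardware performs in every cell at once is modeled by a comprehension over all cells.

```python
# Sketch of retrieval by simultaneous matching: each cell holds one
# symbol; a match question is put to every cell at once, and matching
# cells signal their right neighbors to answer the next question.

def find_after(memory, name):
    """Return indices of the cells just past each occurrence of `name`."""
    n = len(memory)
    # First question (e.g. "is your name X?") goes to every cell at once.
    active = [memory[i] == name[0] for i in range(n)]
    # Each later question is answered only by cells whose left neighbor
    # answered yes -- the neighbor-signaling step described in the text.
    for symbol in name[1:]:
        active = [i > 0 and active[i - 1] and memory[i] == symbol
                  for i in range(n)]
    # Cells still active hold the last symbol of the name; their right
    # neighbors are the ones ready to give out the parameters.
    return [i + 1 for i, yes in enumerate(active) if yes]
```

Note that, as in the text, cells holding useless information simply drop out of the active set at each step; no cell address is ever computed.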
Cell Configuration
Each cell in the cell memory is made up
of components called cell elements. Each
cell element is a bistable device such as a
132 / Intercommunicating Cells, Basis for a Distributed Logic Computer
relay or a flip-flop. The cell elements are
divided into two kinds: the cell state elements and the cell symbol elements. In
the design of the cell memory to be described here, each cell will have a single
cell state element so that each cell has two
logical states. A cell may either be in an
active state or in a quiescent state. There
may be any number of cell symbol elements,
depending upon the size of the symbol alphabet.
The over-all structure of a cell memory
is shown in Figure 1. Each cell in the cell
memory, say cell i, is controlled by four
types of control leads: the input signal lead
(IS lead), the output signal lead (OS lead), the
match signal lead (MS lead), and the propagation lead (P lead). The input signal lead
is active for the duration of the input process.
The input symbol itself is carried on a separate set of input leads. When a cell is in
an active state, and the input signal lead is
activated, whatever symbol is carried on the
input leads is then stored in that cell.
The output signal lead controls the flow of output from the cell memory. Each symbol read out from a cell is carried by a separate set of output leads, and is also stored in a buffer called the Output Symbol Buffer. When a cell is in an active state, a pulse on the output signal lead causes that cell to read out its contents to the set of output leads.
An important function performed by the cell memory is the operation of simultaneous matching of the contents of each of the cells with some fixed contents. This operation is controlled by the match signal lead. During matching, the contents of each cell, say cell i, is compared with the contents carried on the input leads. If the comparison is successful, an internal signal, m_i, is generated in cell i. The signal, m_i, is transmitted to one of the neighboring cells of cell i, causing that cell to become active.
The propagation lead controls the propagation of activity in a cell memory. When a cell is in an active state, a pulse on the propagation lead causes it to pass its activity to one of its neighboring cells. The direction of propagation is controlled by two separate direction leads: R and L.
Figure 1. Overall diagram of the cell memory.
The circuits of a cell employing flip-flops and gates are shown in Figure 2.
Figure 2. More detailed circuit of cell i.
An Example of Cross-Retrieval
Let there be three separate strings of information stored in the cell memory. Let these strings have the following names and parameters:

Name    Parameter
B       XY
AB      XW
AC      U
Proceedings--Fall Joint Computer Conference, 1962 / 133
The strings are stored in the cell memory in the form of a single composite string. There must be some way, therefore, by which the name and the parameter strings can be told apart, and also a means for distinguishing among the three strings of information themselves. To do this we introduce two tag symbols, α and β. Every name string is preceded by a tag of α, and every parameter string is preceded by a tag of β. The string stored in the cell memory therefore has the form:
αB βXY αAB βXW αAC βU α ...

We have found it convenient to use the diagram [p|q] to represent a cell. In this diagram, p stands for the symbol stored in the cell, and q for the state of the cell. Also, q is 1 if the cell is quiescent, and is 2 if the cell is active. Using such diagrams, Figure 3 shows the manner in which the composite string is stored in the cell memory.
Figure 3. Cell memory configuration at the start of retrieval.
Let us suppose that we wish to retrieve the name of a string whose parameter is XW. We call such a process, in which a parameter string input causes a name string output, the process of cross-retrieval. The process of retrieving a parameter knowing its name is called direct retrieval.
Initially, we want all of the cells to match their individual contents against the fixed information [β|1]. Furthermore, we want every cell whose contents happen to be [β|1] to send a signal to the neighboring cell on its right. We then have the situation shown in Figure 4, where each arrow indicates that a signal is being transmitted by a cell whose contents are [β|1].
Figure 4. Signals being transmitted by [β|1] after matching.
Now every cell which receives a signal from its neighboring cell, whether from the left or from the right, will change from a quiescent state to an active state. Also, the signals transmitted by the cells to their neighbors should be thought of as pulses, so that they disappear after they have caused the neighboring cells to become active. The next stable situation is shown in Figure 5; each of the active cells is represented by double boundary lines.
Figure 5. Some of the cells become active after receiving pulses from the neighboring cells.
During the next match cycle, we want all of the cells to match their individual contents against the fixed contents [X|2].
As before, every cell whose contents are [X|2] sends a signal to its right neighboring cell, causing that cell to become active. At this point, each previously active cell is made to restore itself to the quiescent state. The stable situation is illustrated in Figure 6.
Figure 6. Previously active cells restore themselves to the quiescent state as new cells become active.
During the following match cycle, the cells are made to match their individual contents against the fixed contents [W|2]. In this example, there is now only one cell whose contents are [W|2]. That cell first signals the cell on its right, causing that neighboring cell to become active, and then restores itself to the quiescent state.
During the next match cycle, all of the cells are made to match against [α|2]. The presence of the symbol α shows that the matching process is at an end, and that the output part of the retrieval process is to begin. The cell whose contents are [α|2] is the only cell which is active at the moment.
During the output phase, a number of actions take place. First of all, two successive propagate-left signals are sent to the cell memory. The result is a transfer of activity from the cell whose contents are [α|2] to the cell whose contents are [X|2], as shown in Figure 7.
Figure 7. Transfer of activity from cell [α|2] to cell [X|2].
An output signal is now supplied to all of the cells. The cell which is active then reads out its symbol to the output leads. That symbol is now compared with the fixed symbol α in an external match circuit associated with the control leads. If there is no match, a propagate-left signal is sent to every cell and the external comparison process is repeated. This process eventually terminates when the cell containing α is reached (Figure 8). The purpose of this phase of the output process is strictly to locate the beginning of the information string which is being retrieved.
Figure 8. Reaching the initial symbol in the string to be retrieved.
The actual read-out process begins with a propagate-right signal. The cell which contains the first letter of the name string AB is activated. The symbol A is now read out and matched externally with the symbol β. The read-out process continues until β is reached and is then terminated.
Figure 9 gives a general picture of the read-out part of the cross-retrieval process.
Figure 9. A general picture of the read-out part of the cross-retrieval process. [The figure annotates the successive external comparisons: (1) output symbol A, no match with α; (2) output symbol B, no match with α; (3) output symbol β, no match with α; (4) output symbol X, no match with α; (5) output symbol W, no match with α; (6) output symbol α, match, and stop.]

Storage and Cross-Retrieval

The storage and the retrieval of symbols in the cell memory are both accomplished by letting the cells pass their activities to their neighboring cells and, in this way, guide the flow of information in the cell memory. The process of storing symbols in the cell memory provides a good illustration of the dependence of the cell memory upon the propagation of activity among the cells.
Prior to storing the first symbol, the first cell in the cell memory is made active. When the first symbol appears on the set of input leads, the first cell, being active, becomes the only cell prepared to receive that symbol. The first cell, like all of the cells, plays a dual role, however. After taking in the symbol, it also passes its activity to the right neighboring cell. The neighboring cell then becomes the only active cell in the cell memory, and hence becomes the only cell prepared to receive the next symbol when it appears on the input leads.
When we examine the many kinds of information strings which make retrieval difficult, we find that a string is much more likely to have several parameters than a single parameter. For such strings the tag system which we have used in the example in the last section is inadequate.
If a string consists of a name N and k-1 parameters P1, P2, ..., Pk-1, we will now assign to it a set of k+2 tags: α0, α1, ..., αk, and β. The string will be stored in the cell memory in the following form:

α0 α1 N β α2 P1 β ... αk Pk-1 β

The symbol α0 indicates the beginning of a string, and the symbol β indicates the end of a component (that is, a name or a parameter). The symbols α1, α2, ..., αk are the tags associated with the components N, P1, ..., Pk-1. Furthermore, it should be noted that a tag is always associated with a given attribute. For example, α1 is the name tag and should be used as a name tag for all information strings.
Consider now the cross-retrieval problem where the cell memory is given as its input a component string together with its tags:

αj Pj-1 β

The cell memory, for the purpose of cross-retrieval, must give as its output the entire string:

α0 α1 N β α2 P1 β ... αk Pk-1 β

In describing a procedure for cross-retrieval, we shall assume for the purpose of this presentation that every component stored in the cell memory is unique. This means that if an input string αj Pj-1 β is presented to the cell memory, we can be sure that either (1) there is no string in the cell memory which has αj Pj-1 β as one of its components, or (2) there is exactly one string in the cell memory which has αj Pj-1 β as one of its components. Under this assumption, therefore, no two strings could compete with each other during retrieval.
The basic cross-retrieval procedure is the following. The string αj Pj-1 β is first matched with all of the strings stored in the cell memory. When a match has occurred, the cell in which the symbol αj+1 is stored (αj+1 being, in this case, the symbol immediately following αj Pj-1 β in the cell memory) would be activated. Because αj Pj-1 β is unique in the cell memory, the cell in which the symbol αj+1 is stored becomes the only active cell in the cell memory. This
activity is now propagated towards the left until the symbol α0, which is the beginning of the string, is reached. Symbols are then retrieved from the cell memory to the right until finally the symbol α0, which is the beginning of the next string, is reached.
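The whole procedure can be sketched sequentially. In the sketch below the tags α0, αj, and β are spelled "a0", "a1"/"a2", and "b", the memory is an ordinary Python list, and the function name is ours; a real cell memory would perform the component match in every cell simultaneously rather than by scanning.

```python
# Sketch of generalized cross-retrieval under the unique-component
# assumption: match the tagged component, propagate left to the
# string-beginning tag a0, then read out rightward to the next a0.

def cross_retrieve(memory, component):
    """Given one tagged component (e.g. [a2, X, W, b]), return the
    entire string it belongs to, name included; None if absent."""
    n, m = len(memory), len(component)
    # Simultaneous match of the component against every position.
    starts = [i for i in range(n - m + 1) if memory[i:i + m] == component]
    if len(starts) != 1:          # components are assumed unique
        return None
    # Propagate activity left until the string-beginning tag a0.
    j = starts[0]
    while memory[j] != "a0":
        j -= 1
    # Read out to the right until the a0 of the next string (or the end).
    k = j + 1
    while k < n and memory[k] != "a0":
        k += 1
    return memory[j:k]
```

With two strings stored as a0 a1 A B b a2 X W b and a0 a1 A C b a2 U b, querying the parameter component a2 X W b returns the first whole string, name AB included, which is exactly the cross-retrieval of the worked example.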
Outlook

We wanted to present here the basic ideas of a distributed logic system without going into many related problems and other technical considerations. The most obvious asset of such an organization is the tremendous speed it offers for retrieval. Suitable programs can also be developed to make the organization extremely flexible. In addition, we believe the macroscopic concept of logical design away from scanning, from searching, from addressing, and from counting is equally important. We must, at all cost, free ourselves from the burdens of detailed local problems which only befit a machine low on the evolutionary scale of machines.
On the other hand, the emphasis on distributed logic introduces a number of physical problems. If a cell memory is to be practically useful, it must have many thousands, or perhaps millions, of cells. Each cell must therefore be made of physical components which are less than miniature in size, and which must each consume extremely tiny amounts of power. Furthermore, because the cells are all identical, mass production techniques should be developed in which a whole block of circuitry can be formed at once.
Because the coordination of vast amounts of information is essential to scientific, economic, and military progress, the type of organization exemplified by the cell memory needs to be explored and explored extensively. The research on machine organization, however, cannot stand alone; the success of this research will depend also on the success in other fields of research: microminiaturization, integrated logic, and hyper-reliable circuit design.

ACKNOWLEDGEMENT

The writer is indebted to Messrs. S. H. Washburn, C. A. Lovell, and W. Keister for their support in this work. The writer also wishes to acknowledge the benefit he received from many discussions with Mr. M. C. Paull and with his other colleagues.

REFERENCES
1. Albert E. Slade, The Woven Cryotron Memory, Proc. Int. Symp. on the Theory of Switching, Harvard Univ. Press, 1959, p. 326.
2. R. R. Seeber and A. B. Linquist, Associative Memory with Ordered Retrieval, IBM Jour. of Res. and Dev., 5, 1962, p. 126.
3. R. S. Barton, A New Approach to the Functional Design of a Digital Computer, Proc. of the Western Joint Computer Conference, May 9 to 11, 1961, p. 393.
4. S. H. Unger, A New Type of Computer Oriented Towards Spatial Problems, Proc. of the IRE, 46, 1958, p. 1744.
5. P. M. Davis, A Superconductive Associative Memory, Proc. Spring Joint Computer Conference, May 1 to 3, 1962, p. 79.
6. V. L. Newhouse and R. E. Fruin, A Cryogenic Data Address Memory, Proc. Spring Joint Computer Conference, May 1 to 3, 1962, p. 89.
7. J. W. Crichton and J. H. Holland, A New Method of Simulating the Central Nervous System Using an Automatic Digital Computer, Tech. Report, Univ. of Mich., March 1959.
8. H. Blum, An Associative Machine for Dealing with the Visual Field and Some of its Biological Implications, Tech. Report, Air Force Cambridge Research Labs., February 1962.
9. R. F. Rosin, An Organization of an Associative Cryogenic Computer, Proc. Spring Joint Computer Conf., May 1 to 3, 1962, p. 203.
10. R. J. Segal and H. P. Guerber, Four Advanced Computers - Key to Air Force Digital Data Comm. Syst., Proc. Eastern Joint Computer Conference, December 1961, p. 264.
ON THE USE OF THE SOLOMON
PARALLEL-PROCESSING COMPUTER*
J. R. Ball,† R. C. Bollinger,† T. A. Jeeves,‡ R. C. McReynolds,× D. H. Shaffer†
Westinghouse Electric Corporation
Pittsburgh, Pa.

*The applied research reported in this document has been made possible through support and sponsorship extended by the U.S. Air Force Rome Air Development Center and the U.S. Army Signal Research and Development Laboratory.
†Westinghouse Research Laboratories, Pittsburgh, Pa.
×Westinghouse Air Arm Division, Baltimore, Md.
‡Presently at Pennsylvania State University, State College, Pa. This work was done while at Westinghouse Research Laboratories, Pittsburgh, Pa.

SUMMARY

The SOLOMON computer has a novel design which is intended to give it unusual capabilities in certain areas of computation. The arithmetic unit of SOLOMON is constructed with a large number of simple processing elements suitably interconnected, and hence differs from that of a conventional computer by being capable of a basically parallel operation. The initial development and study of this new computer has led to considerable scientific and engineering inquiry in three problem areas:
1. The design, development, and construction of the hardware necessary to make SOLOMON a reality.
2. The identification and investigation of numerical problems which most urgently need the unusual features of SOLOMON.
3. The generation of computational techniques which make the most effective use of SOLOMON's particular parallel construction.
This paper is an early report on some work which has been done in the second and third of these areas.
SOLOMON has certain inherent speed advantages as a consequence of its design. In the first place, computers conventionally require two memory cycles per simple command: one cycle to obtain the instruction and one cycle to obtain the operand. Although SOLOMON has the same basic requirement, it handles up to 1024 operands with each instruction. Consequently, the time per operand spent in obtaining the instruction is negligible. This results in increasing the speed by a factor of two. In the second place, the fact that the processing elements handle 1024 operands at once greatly increases the effective speed. The factor is not 1024, however. Since the processors are serial-by-bit, they require n memory references to add an n-bit word. If n is taken nominally to be 32, then the resulting net speed advantage is 1024/n, that is, 1024/32 = 32. These two factors result in a fundamental speed increase on the order of 64 to 1 for comparable memory cycle times.
In addition to these concrete factors, there are other factors whose contribution to speed cannot be as easily measured. Among these are (i) the advantages due to the intercommunication paths between the processing elements, (ii) the advantage of using variable word length operations, (iii) the net effect
resulting from either eliminating conventional indexing operations or else superseding them by mode operations, and (iv) the loss in effectiveness resulting from the inability to utilize all processors in every operation. The net speed advantage can only be determined by detailed analysis of individual specific problems.
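The two concrete factors above multiply out as follows; this is only a restatement of the paper's own arithmetic, with illustrative variable names.

```python
# The two speed factors described in the text, as a quick check.
# 1024 processing elements and a nominal 32-bit word are the figures
# the paper uses; serial-by-bit adds cost n memory references.
elements = 1024                       # operands handled per instruction
n = 32                                # nominal word length in bits
fetch_factor = 2                      # instruction fetch amortized away
parallel_factor = elements / n        # 1024 / 32 = 32
advantage = fetch_factor * parallel_factor
print(advantage)                      # fundamental advantage: 64.0
```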
The task of evaluating the feasibility of
the SOLOMON computer has led to investigations of problems which primarily involve
elementary, simultaneous computations of
an iterative nature. In particular, the solution of linear systems and the maintenance
of real-time multi-dimensional control and
surveillance situations have been considered.
Within these very broad areas two special
problems have been rather thoroughly studied
and are presented here to demonstrate the
scope and application of SOLOMON. The first
of these is a problem from partial differential equations, namely the discrete analogue
of Dirichlet's problem on a rectangular grid.
The second is the real-time problem of satellite tracking and the computations which
attend it. These problems are discussed
here individually and are followed by a brief
summation of other work.
Partial Differential Equations
Introduction: Among current scientific
computational problems, the numerical solution of partial differential equations has in
recent years made the most severe demands
on the computational speed and storage capacity of computing machines. It is therefore
natural to investigate the capability and performance of the SOLOMON computer in this
area. This discussion describes such an investigation in the area of partial differential
equations of elliptic type. In particular, the
problem chosen to serve as a standard for
comparison with various methods and conventional machines is that of solving Laplace's
equation over a square with Dirichlet boundary conditions. Since estimates of rates of
convergence of various methods are obtainable for the Dirichlet problem on the square,
and since running-time estimates for conventional machines also can be easily made
for this problem, it seems a reasonable
criterion.
The discussion which follows assumes
familiarity with the concept of the SOLOMON
Parallel-Processing Computer [1]. There
are three parts: (1) an outline of the problem and the numerical methods to be used;
(2) a discussion of the organization of the
computations using the parallel-processing
ability; (3) a presentation of the results of
some comparisons of SOLOMON with conventional machines on the basis of the problem mentioned previously.
Finite Difference Approximations: In this
section, the main features of the numerical
methods to be considered are summarized.
A complete exposition is given in the book of
Forsythe and Wasow [2]. The statement of
the problem given below is by now fairly
standard in the literature.
Suppose the region to be considered is
the open region, R, of the xy-plane, and suppose R has a boundary, C, which is a simple
closed curve. It is required to find the solution u = u(x,y) of the Laplacian boundary value problem

Δu = 0 in R,  u prescribed on C,  (1)

where Δu denotes the Laplacian of u, u_xx + u_yy.
For purposes of numerical computation,
the problem (1) is approximated as follows.
We first replace the xy-plane by a net of square meshes of side h. For the given mesh constant, h > 0, the net consists of the lines x = μh, y = νh, μ, ν = 0, 1, 2, .... The points (μh, νh) are called nodes, and the nodes within R form the net region R_h, assumed to be connectable by line segments of the net within R. A node of R_h is said to be a regular interior point if each of its four (nearest) neighbors (μh ± h, νh) and (μh, νh ± h) is in R ∪ C. All other points of R_h are called irregular interior points. The set of boundary points, denoted by C_h, contains those points which are the points of intersection of C with the lines of the net; these may or may not be nodes. Since the principal purpose of this paper is to illustrate the SOLOMON concept, for purposes of exposition the simplifying assumption will be made that all points of C_h are nodes. That is, it will be assumed henceforth that the region of interest is a bounded plane region which is the connected union of squares and half-squares of the net imposed, so that all interior points are regular. For a detailed treatment of irregular interior points, and errors of discretization and approximation incurred by the use of an approximate boundary C_h rather than C, and by approximations
to the partial derivatives, the references
should be consulted, especially [2].
From well-known finite difference approximations for the partial derivatives occurring in (1), a formal difference approximation, defined on the net, may be obtained as follows. For convenience, the neighbors of a point P are designated N, E, S, and W (the nodes immediately above, to the right of, below, and to the left of P), and the net function will be denoted by U to avoid excessive subscripting. In terms of this designation, the replacements are:

u_xx by (1/h²)[U(W) - 2U(P) + U(E)]
u_yy by (1/h²)[U(N) - 2U(P) + U(S)].

Then (1) becomes

Δ_h U = (1/h²)[U(N) + U(E) + U(S) + U(W) - 4U(P)] = 0 in R_h,  U prescribed on C_h.  (2)

The formula for the Laplacian operator in (2) is the well-known five-point formula or "star" for the numerical solution of field equations. By using the formula (2) to replace the operator in (1) at each point P in R_h, the conditions of (1) may be approximated by a system of algebraic equations. Using the boundary conditions and applying (2) at each interior point P, the problem (1) is replaced by a system of N simultaneous equations for the N unknown values of U interior to C_h. Under appropriate conditions (see [2], for example), it can be shown that for P in R_h, U(P) → u(P) as h → 0; the discussion here, however, is limited to a brief outline of some iterative procedures used for solving the approximate problem on SOLOMON, and the references should be consulted for the mathematical basis.
Simultaneous Displacements: One straightforward method which comes readily to mind as a candidate for a SOLOMON program, because of its use of nearest neighbors, is the method of simultaneous displacements. This well-known iterative method makes use of the five-point formula mentioned above, and goes as follows. A trial solution U_0(P), P in R_h, is chosen. Suppose that we are at the k-th stage in the iteration, i.e., U_{k-1} has already been determined. For each point P, the value of the new solution U_k(P) at that point is obtained by averaging the values of the (k-1)st solution at the four nearest neighbors of P, i.e.,

U_k(P) = (1/4)[U_{k-1}(N) + U_{k-1}(E) + U_{k-1}(S) + U_{k-1}(W)].

This process is continued until the change in the values at every point does not exceed some prescribed tolerance. Sufficient conditions for the process to converge are known, and are given in [2, section 21.4]. This method has much to recommend it for SOLOMON in terms of simplicity and ease of programming. Its major defect, and one which encouraged investigation of other methods, is its inferior rate of convergence; this will be discussed later.
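The method of simultaneous displacements translates directly into code. The sketch below is plainly sequential (SOLOMON would perform every interior update in parallel, one per processing element), and the function names are ours; boundary values are held fixed as prescribed on C_h.

```python
# Sketch of simultaneous displacements (Jacobi iteration) for the
# five-point Laplace star: all new values are computed from the old
# solution, exactly as in the formula U_k(P) = (1/4)[sum of neighbors].

def jacobi_sweep(U):
    """One simultaneous-displacement sweep; U is a list of rows."""
    rows, cols = len(U), len(U[0])
    V = [row[:] for row in U]            # new values built from old ones
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            V[i][j] = 0.25 * (U[i-1][j] + U[i+1][j]
                              + U[i][j-1] + U[i][j+1])
    return V

def solve(U, tol=1e-6):
    """Iterate until no point changes by more than the tolerance."""
    while True:
        V = jacobi_sweep(U)
        change = max(abs(V[i][j] - U[i][j])
                     for i in range(len(U)) for j in range(len(U[0])))
        if change < tol:
            return V
        U = V
```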
Optimum Successive Overrelaxation: There is a modification of the method of simultaneous displacements in which a new value is used as soon as it has been computed. That is, when solving for a new value, U(P), at a point P, one always employs the latest values of all other components involved in the formula for the new value at P. The procedure is called the method of successive displacements (or successive relaxation). As might be inferred from the brief description, it depends critically on the order σ in which the unknowns are determined by relaxation. Only cyclic orders are considered, in which the new values are obtained in the order U_σ(1), U_σ(2), ..., U_σ(N), and repeat, where {σ(j)} is some permutation of the first N integers.
Early users of relaxation often found it profitable to overrelax, that is, to change a component by some real factor ω, ω > 1,
times the change needed to satisfy the equation exactly. Overrelaxation was originally employed primarily in hand computation, and was not usually employed in the regular or cyclic order which is found most convenient when the computation is done on a digital computer. The question was then raised as to whether overrelaxation was profitable when used with a fixed, cyclic order of determining the unknowns. Although it is known [2, section 21.4] that overrelaxation is not profitable in solving the Dirichlet problem by simultaneous displacements, it is true that overrelaxation is highly profitable for many elliptic difference operators in connection with the method of successive displacements (successive relaxation). The theory of overrelaxation in successive displacements is due to Young and Frankel; a detailed exposition is given in [2, section 22], where the original papers and later developments are also referred to.
The practical application of the method is as follows. As before, we use the five-point difference formula. The nodes in R_h are scanned repeatedly in a cyclic order. At each point, the residual, Δ_hU(P), is formed, where Δ_hU(P) = U(N) + U(E) + U(S) + U(W) - 4U(P). Then, a new value of U(P) is computed by the formula

U(P) ← U(P) + (ω/4) Δ_hU(P),

where ω is the overrelaxation factor. (For ω = 1, this is the method of successive displacements.) The theory of the method shows that it will converge for 0 < ω < 2, and that the most rapid convergence occurs at a value called ω_opt, with 1 < ω_opt < 2. A good estimate of ω_opt is important, and it is known [2, section 22] that a value of ω somewhat larger than ω_opt is less costly in computer time than one somewhat smaller than ω_opt. It is also important that the time required to obtain a good estimate of ω_opt be small compared to the time required to solve the problem using merely a reasonable guess for ω_opt.
The problem of determining ω_opt is a very substantial one. A few methods are suggested in [3, section 25.5], but the results at present are inconclusive, and it is conjectured in [2] that finding ω_opt may in some cases be as hard as solving the original boundary value problem. For this reason, the program developed here assumes an acceptable value of ω has been determined, and proceeds from that point.
One other point should be mentioned in
connection with the cyclic order in which the
new values at the nodes of the mesh are found.
To be acceptable an order must be what is
called consistent in the literature [2,4]; the
details and definitions require a somewhat
lengthy preparation and will not be given
here, but are fully discussed in the references just cited. Forsythe and Wasow [2,
p. 245] do state a simple criterion (due to
Young) for consistency of five-point formulas, which may be reproduced here:
"Of each pair P, Q of adjacent nodes of the net, one, say P, precedes the other, say Q, in the order σ. If so, draw an arrow from P to Q. Do this for every pair of adjacent nodes. Then, according to David Young (private communication), the order σ is consistent if and only if each elementary square mesh of the net is bounded by four arrows with a zero circulation, i.e., if a directed path enclosing the square travels with the arrows on two sides of the square, and against the arrows on two sides."
In order to make as much use as possible of the parallel nature of SOLOMON, and at the same time observe the requirement that the order be consistent, we have chosen the following scheme. We partition the nodes, P_ij, of the mesh into two sets S1, S2, by letting S1 be the set of all points with odd parity (i + j odd) and S2 be the set of all points with even parity (i + j even). Then an order consistent with this partition is to solve for all components in S1, and then for all those in S2. This leads to a checkerboard-like arrangement which can be easily obtained on SOLOMON.
Rates of Convergence: A few words about rates of convergence are in order here to illustrate the reason that successive overrelaxation was chosen for SOLOMON rather than simultaneous displacements, even though the latter method is so simple to program. By "rate of convergence" is meant the average asymptotic number of base-e digits by which the error is decreased per iterative step. Forsythe and Wasow [2, p. 283] list some approximate rates of convergence for various methods for solving the Dirichlet problem on a π x π square with n points on a side. For simultaneous displacements the approximate rate of convergence is given as h²/2; for optimum successive overrelaxation it is 2h. For the π x π square, these rates will then be h²/2 = π²/2n² and 2π/n, respectively, and the error will be decreased by about one base-e digit per 2n²/π² iterations for simultaneous displacements and per n/2π iterations for successive overrelaxation. In [2, p. 374], it is shown that to reduce the initial error by a factor of 10⁻⁶ (approximately e⁻¹³·⁸) takes 13.8n/(2π) ≈ 2.2n sweeps for successive overrelaxation; a similar calculation gives 2.8n² sweeps for simultaneous displacements. Some running-time computations based on these figures indicate that because of the large number of iterations required with the method of simultaneous displacements, SOLOMON can only show an order-of-magnitude improvement over conventional machines for n small. For successive overrelaxation, however, the improvement is substantial; this will be discussed later.
The Computational Scheme: The effectiveness of SOLOMON in solving problems will
depend to a large extent on the representation of the problem in SOLOMON memory.
The particular representation chosen for the
Dirichlet problem and presented here permits all 1024 processing elements to be fully utilized. That is, no processing element is required to remain inactive for one or more instruction times, except when that processor
contains a boundary or exterior point of
the net.
Consider the rectangular net shown in Figure 1 which contains No x Mo net points. The
nodes of the net are partitioned into 1024 rectangular groups of dimension N x M. This
is shown in Figure 2. In order to apply the
iteration formula with a consistent ordering,
it is convenient to restrict N and M to be
even integers.
Each group of net points must now be represented in the corresponding processing element memory. As shown in Figure 3, the net points in an N x M group are ordered and
a list is constructed in each processing element memory to contain the net function
evaluation at each net point. Since the application of the iterative formula requires
the four nearest neighbors of each net point,
a net point ordering will be selected which
allows the neighbor net functions to be easily
obtained.
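The grouping of the No x Mo net into 1024 rectangular N x M blocks can be sketched as an index mapping. A 32 x 32 processing-element array and row-major local ordering are assumptions made here for illustration:

```python
# Sketch of the storage mapping: the No x Mo net is cut into 1024
# N x M groups, one per processing element (PE). Global point (i, j)
# lands in PE (pe_row, pe_col) at position local_index in that PE's list.
def locate(i, j, N, M):
    """Return (pe_row, pe_col, local_index) for global net point (i, j)."""
    pe_row, li = divmod(i, N)     # which PE row, and row within the group
    pe_col, lj = divmod(j, M)     # which PE column, and column within it
    return pe_row, pe_col, li * M + lj
```

With N and M even, each group splits evenly between the two parity classes, which keeps the checkerboard ordering consistent across group boundaries.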
Region composed of squares and half squares
Figure 1
Partition mesh points into 1024 equal groups.
Figure 2
Order mesh points for each group in a corresponding PE memory.
Figure 3
142 / On the Use of the Solomon Parallel-Processing Computer
The distinction among boundary, interior, and exterior net points is facilitated by the multi-modal operation of the processing elements. Associated with each net point in each processing element will be mode bits identifying that point. These mode bits will then be used to set the mode state of the processing elements before applying the iteration formula at a net point. Then, by operating only on processing elements which are in the mode assigned to interior net points, both boundary and exterior points remain unchanged.
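The mode-bit scheme amounts to a masked update. A minimal serial sketch, with assumed mode tags and an assumed update rule:

```python
# Illustrative mode tags; on SOLOMON these would be the 2 mode bits
# stored with each net point.
INTERIOR, BOUNDARY, EXTERIOR = 0, 1, 2

def masked_update(values, modes, new_value):
    """Apply new_value(i) only where the mode marks an interior point;
    boundary and exterior points pass through unchanged."""
    return [new_value(i) if modes[i] == INTERIOR else v
            for i, v in enumerate(values)]
```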
Storage Requirements: Each SOLOMON
processing element contains 4096 bits of core
memory. This memory is addressable by bit
number and the word size may be set and
changed by the program. This variable word
length structure of the processing element
permits net point values to be listed without
any wasted bits. All statements about storage capacity for data words will then depend directly on the accuracy needed for the data. A word in SOLOMON memory will be necessary for each of the No x Mo net points. In addition, 2 mode bits are used to identify the points as boundary, interior, or exterior. Therefore, if p is the number of bits in a net point word, the total storage capacity, c, of SOLOMON core memory is:
c = [4096/(p + 2)] x 1024.

For various p this is:

p:  18        24        30        36
c:  ~200,000  ~160,000  ~130,000  ~100,000

These figures can be extended through the auxiliary storage system of SOLOMON.

Running-Time Estimates: In order to measure the effectiveness of the SOLOMON parallel organization on the Dirichlet problem, an estimate of the time required to process each net point on SOLOMON will be compared to a similar estimate for a conventional computer.*

A program has been written for SOLOMON which applies the Young-Frankel iteration formula. This program requires time Tr given by

Tr = (14.5 p + 3) x 1.2 μs

for one iteration on p-bit words. Since one iteration revises the net point value at 1024 nodes simultaneously, the time, T, required to process a single net point is

T = Tr/1024 = (14.5 p + 3)/853 μs.

The time required to test the results of an iteration for a solution within tolerance is very small and may be neglected.

The time required to process one mesh point on a conventional computer is estimated at

Tc = 70 μs.

This estimate is based on the arithmetic instructions necessary to evaluate the formula, which include

6 Additions
1 Multiply
1 Shift
2 Load and Store

plus an assumed 3 index-instruction times to distinguish each point as interior or boundary.

These two estimates may now be compared to obtain an approximate SOLOMON time-advantage ratio, Tc:T. For various p this is:

p:     18      24      30      36
Tc:    70 μs   70 μs   70 μs   70 μs
T:     .31 μs  .41 μs  .52 μs  .60 μs
Tc:T:  225:1   167:1   133:1   117:1

*The computer chosen for comparison is of conventional contemporary design with a multiplication-division time that is one to seven times as long as its 4 μs addition time.
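The capacity and timing figures can be reproduced directly from the two formulas above; the function names here are illustrative:

```python
# c = floor(4096 / (p + 2)) * 1024 words of p-bit net-point storage;
# T = (14.5 p + 3) / 853 microseconds per net point on SOLOMON,
# against the estimated Tc = 70 microseconds conventionally.
def capacity(p):
    return (4096 // (p + 2)) * 1024

def time_per_point_us(p):
    return (14.5 * p + 3) / 853.0

for p in (18, 24, 30, 36):
    print(p, capacity(p), round(time_per_point_us(p), 2),
          f"{70 / time_per_point_us(p):.0f}:1")
```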
These figures are conservative since they
include neither the time to test for a solution
nor the time required for a conventional computer to buffer the large number of net points.
In addition, it appears to be very convenient
and quick for SOLOMON to solve a reduced
problem with approximated boundary values
in order to provide an initial guess for the
full problem. In this manner, it is hoped to
further increase the SOLOMON speed advantages given above.
Satellite Tracking
The computing and data processing problem for satellite surveillance is receiving increasing attention as the tempo of the space
program increases. The increase in satellite
densities over the next few years will increase
the computing and data processing problem
by several orders of magnitude. Since the
presently existing problem of simultaneously
tracking relatively few objects requires, in
general, high-speed, large capacity computing systems, it is evident that the future
problem of satellite surveillance will require
highly complex and sophisticated computing
systems having capabilities far greater than
those of contemporary systems.
The SOLOMON computer with its parallel
computational capabilities, large capacity
memory system and novel system organization features is capable of meeting the advanced computing requirements for satellite
surveillance with respect to both speed and
memory requirements. To illustrate applicability of SOLOMON to the problem, the
major functions that must be performed in a
satellite surveillance system are discussed
in some detail. Certain aspects of the problem will be omitted in order to keep the text
of this paper unclassified.
The satellite surveillance problem is, in
general terms, that of receiving and performing the required processing of input data from
a radar system maintaining continuous surveillance over a specified area of coverage.
The raw radar data needs to be converted
to digital form and fed into the computing
system. The computing system must then
establish and maintain track files on all satellites passing through its coverage sector.
The establishment of track files on each
target within the coverage involves the elimination of false alarms, resolution of ambiguity, etc., from the input data. Once firm
tracks have been established, the computer
must ascertain the status of each detected
satellite; that is, whether the satellite is a
known satellite following a predicted orbit,
a known satellite not within its predicted
course, or a satellite not included in the
computer's available identification data.
The status of each satellite will determine
the subsequent processing of the relevant
data. If the satellite is known and following
a predicted course, it will be tagged and no
subsequent processing will be required. If
the satellite is known but not within its prescribed course, it will probably be necessary
to perform the orbital correction calculations
required to update the orbital elements maintained on each known satellite. If a particular satellite is unknown, orbital calculations
must be performed, resulting in a set of
orbital elements defining the orbit of the
newly detected satellite.
The SOLOMON computer, although capable of performing the bulk of the computing
required in the system, will probably not
solve the entire computing problem. One
might visualize the entire computing complex as consisting of a conventionally organized tactical computer and a radar control
unit in addition to SOLOMON. The tactical
computer would be capable of performing
the refined processing on a relatively small
number of satellites that require additional
processing, a task that would be an inefficient
use of SOLOMON. The radar control computer would probably be a special purpose
machine capable of performing the specific
control function required by the radar system (Le., frequency control, antenna beam
steering control, etc.).
Data correlation is a function having a
substantial bearing on computing system
requirements. Correlation will be required
in two phases of the problem. The first of
these is scan-to-scan correlation; i.e., correlation of radar input data on successive
radar scans. The second is the problem of
matching radar returns on observed satellites with known or predicted satellite orbital positions.
The problem of scan-to-scan correlation
becomes most critical when the false alarm
rate of the input data is high, because then
the computing system must process a much
larger number of returns than the number of
actual targets. Due to stringent requirements
on the radar system, e.g., long range tracking, high accuracy, low signal-to-noise ratio,
etc., the false alarm rate will probably be
high. It should be pointed out that a tradeoff exists between the computing capabilities
required and the overall radar performance.
With a computing system possessing the inherent capabilities of SOLOMON the radar
system performance will be vastly increased.
This increased system performance will result from SOLOMON's ability to accept and
process a much higher density of return
(including false alarms and ambiguous returns
as well as legitimate returns) than could be
realized otherwise. In addition to eliminating false alarms the computing system must
discard returns from non-orbiting objects
and recognize multiple returns from a single
object. The most reliable techniques for
eliminating false alarms and ambiguous
returns are based on persistence of the returns from scan-to-scan. This necessarily
means that the computing system must store
all returns for several radar scans (on the
order of 5) before any unique returns can be
eliminated as false alarms. Such a technique
demands large memory capacity such as that
provided by the SOLOMON processing element memories.
SOLOMON is particularly well adapted to
perform the scan-to-scan correlation functions as illustrated by the following discussion. Figure 4 illustrates the scan-to-scan
correlation of 4 parameters: range (R), range rate (Ṙ), azimuth (β), and elevation (ε).
The parameters of new returns are compared with those from previous scans. The
number of comparisons that can be made
simultaneously by SOLOMON is equal to the
total number of processing elements in the
network, assuming of course that each unique
return is routed to a specified processing
element in the desired manner. Once the
program has cycled through the correlation
subroutine there will undoubtedly be returns
in various processing elements that did not
match the returns with which they were being
compared. In these cases, by the use of
multi-modal control, the returns that did not
match are set to a specified mode by the
programmer. The subroutine can now be
repeated, acting only on the processing elements in this mode and comparing the unmatched returns with other returns either
within the same processing elements or in
adjacent processing elements. This process
utilizes the interconnection between processing elements. The number of comparisons that must be made (i.e., the number of
iterations of the subroutine required to perform the scan-to-scan correlation function)
to assure that a return has been compared
with all possible matches is a function of
target density, false alarm rates, target
geometry, precorrelation techniques, and
the actual SOLOMON configuration.
Correlation on Four Parameters
Figure 4

When a return has been firmly established as a return from a satellite, then that return must be compared with the predicted
satellite orbit data to determine if this is an
unknown satellite, a known satellite following a predicted course, or a known satellite
not within its predicted course. The autocorrelation subroutine required for this function is almost identical to that required for
scan-to-scan correlation.
Other systems that have been proposed
for the performance of satellite surveillance
have relied upon a separate catalog memory
containing the orbital elements for each known
satellite. Because of the irregular order in
which satellites might appear within the radar
coverage at any given time, it is probably
not feasible to order the catalog of tracks in
the machine. In SOLOMON the orbital elements in the catalog memory would be distributed throughout the processing elements'
memories as a function of total satellite density. That is, if the total satellite count is
five times the total number of SOLOMON
processing elements, each processing element memory would contain 5 satellite tracks.
The correlation is therefore simplified and
solvable within SOLOMON in a manner almost identical to that of the scan-to-scan
correlation.
To establish the magnitude of the advantages of SOLOMON over conventional computers in the performance of such correlations, the problem is outlined in detail as
follows: a computer has in its memory m
total track files. Each track file, consisting
of several words, defines a particular target.
At any given time the computer can receive
n new inputs from the radar systems. The
problem is (1) to associate each new input
with an established track file, (2) establish
new track files where required, or (3) eliminate spurious or ambiguous returns. The
approach to this problem to date has been
quite straightforward. Since existing computers are sequential and only one operation
can be performed at a time, each new return
is sequentially called from memory and a
search of the established tracks is made.
The total number of discrete steps required
to make the search in a sequential computer
on n new returns through m established track
files to determine if any n has a match among
m is
m + (m - 1) + (m - 2) + ... + (m - n) = (n + 1)(2m - n)/2
where m > n.
In SOLOMON the total number of discrete
steps required to perform the same function,
as previously pointed out, is simply m, since
in any given step the computer would be making n comparisons and, at most, comparison
of each n with all possible m's would take
place in m steps (see Figure 4).
In the satellite surveillance problem,
where both m and n will be exceptionally
large, the advantages of SOLOMON over conventionally organized computers are obvious.
The problem of coordinate conversion will
pose additional requirements on the computing systems employed for this problem. Since
the track files as established from the raw
radar input will be in radar coordinates and
the tracks in the catalog will probably be in
the form of orbital elements, some form of
conversion must be done prior to the correlation of the detected satellites with the
known satellites. These computations, although not especially complex, are quite
time-consuming. While in a conventionally
organized computer each track would be
converted in a sequential manner, in the
SOLOMON computer the execution of one
conversion subroutine would perform the
conversions for all satellites (assuming that
the total number of conversions required is
not greater than the total number of processing elements). In order to compute new satellite orbits and update existing data on established satellites, these satellites must be
tracked and sufficient data on the actual
orbits must be gathered. Of the total number
of satellites detected in the radar search
mode, only a small number will require detailed orbital calculations. In a typical satellite tracking system this function would not
be performed by SOLOMON, but by a high
speed conventionally organized computer.
Other Problem Areas
In the course of studies such as those
reported here it has become apparent that
the computational techniques most suitable
for SOLOMON are not necessarily those
which have been popular for conventional
high-speed machines. To make proper use
of the parallel design of SOLOMON it sometimes seems necessary to employ methods
of computation that would be quite cumbersome for a conventional machine.
A case in point can be found in the problem of solving a system of linear equations.
To be specific, consider the problem of
solving a system of 15 equations in 15 unknowns. Because of the unique construction of
SOLOMON, it is possible to solve up to 64
such systems simultaneously for the same
cost in time and effort. The activities of the
computer will be the same whether one or
sixty-four systems are being solved. Consequently, SOLOMON will perform better
when many systems of linear equations can
be solved at the same time.
We wish to stress that the proper computational scheme must be employed for the
most efficient use of SOLOMON's capabilities since our mission is not to establish
SOLOMON as the ultimate in computers, but
rather to explore those areas in which it is
superior.
The problem areas in which SOLOMON
has been found to be especially capable are
constantly expanding under the impetus of
the present investigations. Work is now in
process on several other sample problems
to demonstrate the general applicability of
SOLOMON. We are studying multidimensional functional optimization, where an entirely new methodology for finding absolute
maxima may result. We are studying communication and transportation problems, a
sorting problem, a problem in handling a
Boolean form, multiple integration, and a
sound detection problem. Additional systems studies being considered include photo-reconnaissance, numerical weather forecasting, cryptanalysis, nuclear reactor calculations, and air traffic control.
REFERENCES

1. Slotnick, D. L., Borck, W. C., and McReynolds, R. C., "The SOLOMON Computer," Proceedings of the Fall 1962 Joint Computer Conference.

2. Forsythe, G. E., and Wasow, W. R., Finite-Difference Methods for Partial Differential Equations, Wiley, N.Y., 1960.

3. Forsythe, G. E., "Difference Methods on a Digital Computer for Laplacian Boundary Value Problems," Transactions of the Symposium on Partial Differential Equations, N. Aronszajn, A. Douglis, C. B. Morrey, Jr. (eds.), Interscience, N.Y., 1955.

4. Sheldon, J. W., "Iterative Methods for the Solution of Elliptic Partial Differential Equations," Mathematical Methods for Digital Computers, Wiley, N.Y., 1960.
DATA PROCESSING FOR COMMUNICATION
NETWORK MONITORING AND CONTROL
D. I. Caplan
Surface Communications Division
Radio Corporation of America
Camden 2, New Jersey
The long-haul communications network is
the backbone of military communications. It
provides the coordination necessary for global
military operations and logistic support.
For maximum network effectiveness, a central monitoring and control function is necessary. System studies described in this
paper have shown that automatic data processing is applicable to network monitoring
and control, and provides rapid and efficient
network reaction to natural and man-made
disturbances.
The basic elements of the long-haul communication network are the switching centers,
the trunks connecting them, and the subscribers. Three types of service are provided to
the user of the long-haul network. These are:
1. Direct - A direct connection is made
on demand between two subscribers, and
broken down when the call is completed.
2. Store and Forward - A message is
transferred through the network. It is stored
at each switching center, and then passed
along to the next center until it reaches its
destination.
3. Allocated Service - Allocated service
is a direct subscriber-to-subscriber connection which remains in effect full time. This
"hot-line" service differs from direct service
in that the connection is not broken down at
the end of a call.
The long-haul network must provide service in the face of many operational difficulties. These difficulties include wide
variations in the traffic load and "outages"
of equipment due to acts of nature or enemy
action. The hot-line service must be restored immediately when any of the hot-line
channels are affected by outages. To complicate the situation, peak traffic loads will
usually occur at the very time outages are
caused by enemy action or severe storms.
There are basically four actions which
can be taken to alleviate operating difficulties. These are:
1. Alternate Routes: Backed-up store-and-forward traffic can be sent via other
routes. Direct calls can also be handled over
routes other than the preferred route.
2. Spare Channels: Spare channels can be
put into service, either to replace down channels or to add transmission capacity.
3. Preemption: Circuits or facilities can
be reassigned from low priority users to high
priority users.
4. Service Limitations: Maximum message length or call time can be specified,
service can be denied to certain classes of
subscribers, or other limitations can be
placed on the subscribers.
The actions listed above can be taken on a
local or global level. Local action will be
taken at an individual switching center, while
regional or global action will require cooperative performance at a number of switching centers. Obviously the effectiveness of
regional or global measures depends on coordination of the switching centers, which
148 / Data Processing for Communication Network Monitoring and Control
must be achieved through a central control
facility.
Based on analysis of the problems involved,
a system study of the control center functions
and possible implementation has been performed. Automatic data processing was found
to be applicable both at the switching centers
and at the control center. The results of the
system study, described in this paper, are
applicable to many long-haul systems, and
provide an insight into the network control
complex of the future.
Network Monitoring and Control Concept
To provide effective network reaction to
varying traffic loads and equipment outages,
a closed loop network monitoring and control
system is necessary. As shown in Figure 1,
the status of the network is monitored at the
switching centers and transferred back to the
network control center. Network operation
is analyzed at the control center and control
actions are initiated there. These control
actions are carried out through command
messages sent to the switching centers.
• MESSAGE BACKLOG
• CHANNEL OR TRUNK OUTAGE
• ALLOCATED CIRCUIT OUTAGE
• ALTERNATE ROUTING
• SPARE CHANNEL UTILIZATION
• PREEMPTION
• SERVICE LIMITATION
Figure 1. Network Monitoring and Control Concept.
Like any closed loop system, the monitoring and control system can be made ineffective by delay or by inaccuracy caused by data
errors. To reduce these two problems to a
minimum, automatic data processing should
be used at the switching centers for composition of the status messages and at the control
center for network status analysis and display.
Communication between the switching centers and the network control center can be
accomplished in several ways. First, allocated channels could be provided between the
switching centers and the network control
centers. Second, direct or store and forward
communications can be initiated either periodically on a preassigned schedule or when
required. In order to achieve effective use
of communication facilities, a common practice is to use store and forward messages
for both monitoring and control information,
with direct calls used only under emergency
conditions. In the case of status messages
which are usually long and which contain
routine information for the most part, a preassigned schedule of reports is used. Generally, an hourly report is frequent enough
for satisfactory reaction time. Emergency
reports can be entered at any time.
Status Message Composition
The simplest approach to station status
monitoring is manual. The technical controller at the station records the message
backlogs, channel outages, and other pertinent
data. He then composes a teletype status
message which he sends to the network control center.
From the network control center viewpoint,
the manually prepared status messages are
a special problem. Because of human error,
mistakes in format are common. Messages
having incorrect formats will be rejected at
an automated network control center, and
manual intervention and correction will be
required. An automated status message composed at the switching center is therefore
desirable.
Data ordering is another problem in manually prepared status messages. Each event
at the station, such as a channel outage or
restoration, has a time of occurrence which
must form a part of the status messages.
These events are usually recorded by the
station personnel in order of occurrence.
However, the status message format will
normally require grouping by channel or
trunk, so the events must be sorted into a
prescribed sequence before they are transmitted. This operation is time consuming
and subject to error when performed manually.
The Status Message Composer concept,
shown in Figure 2, was developed to provide
automatic status message generation from
manual inputs. A block diagram of the proposed unit is shown in Figure 3.
Figure 2. Status Message Composer, Artist's Concept.
The operator types variable data, such as
reason for outage, into the keyboard on the
left of the Status Message Composer. The
panel on the right of the keyboard indicates
to the operator what information is required,
and provides overall controls such as unit
power.
As shown in the block diagram, the
operator-entered information is stored in a
core memory. The time of data entry is read
into the core memory automatically from a
real-time clock. The location of the stored
information is predetermined, so that all information about a particular trunk or channel always goes into the same group of words
in the memory. Therefore, the status data
are always stored in the proper sequence,
and will be properly grouped when read out
of the memory.
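The fixed-slot storage scheme above can be sketched as follows; the class, slot table, and record layout are assumptions for illustration, not the actual composer design:

```python
import time

# Every entry for a given trunk or channel goes to a predetermined slot,
# so a straight read-out is already in the prescribed report order, and
# the entry time is recorded automatically with each status.
class ComposerMemory:
    def __init__(self, slot_of):
        self.slot_of = slot_of                  # channel name -> slot index
        self.slots = [None] * len(slot_of)

    def enter(self, channel, status, now=None):
        stamp = now if now is not None else time.time()
        self.slots[self.slot_of[channel]] = (channel, status, stamp)

    def report(self):
        """Read out all recorded status data, grouped by slot order."""
        return [s for s in self.slots if s is not None]
```

Entries may arrive in any order of occurrence; the read-out is nonetheless grouped by trunk or channel, which is the point of the predetermined layout.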
The initial setup of the core memory is
performed by reading in a punched paper tape
which designates the core locations to be used
for each trunk, channel, etc. The paper tape
also designates the display and entry module
corresponding to each trunk or channel.
Changes in station trunks or channels can be
accommodated by simply changing the paper
tape and relabeling one or more display and
entry modules.
A status report can be generated either
periodically under clock control or at any
time under manual control. The report is
generated by a memory read-out which transfers all stored status data to both the paper
tape punch and the page printer. The punched
paper tape is entered into the communications network for store-and-forward transmission to the network control center. The
copy produced by the page printer becomes
the station operating log.
Figure 3. Status Message Composer, Block Diagram.

The display and entry panel at the top of the Status Message Composer provides the means for manual entry of trunk or channel outages. Each of the small display and entry modules corresponds to a single trunk or channel. The color of the display module indicates the last inserted status, and serves as a station status display.

Network Control Center Functions

The functions to be performed at the network control center are described in this section. They can be handled manually or automatically. The next section describes an automatic implementation of the network control center.
The network control center operation is shown in the information flow diagram, Figure 4. Status messages from the switching centers are received at the network control center and recorded, either manually or automatically. The information in the incoming messages is checked for errors, using whatever redundancy is available in the messages (i.e., parity bits or format). The status data are then recorded in the control center file.

Figure 4. Network Control Center Information Flow Diagram.

The status information is summarized for display to the network controller. It is important that the degree of summarization be optimized. Too much detail will swamp the controller; too little will limit his understanding of network status. Also, new information must be flagged in some way to attract the controller's attention and indicate the need for action on his part. The controller will acknowledge recognition of the new information by resetting the flag.

Using the data provided by the display, the controller is able to solve most network problems. On occasion, however, he may need additional detailed data. Any information in the control center files will be on-call, as required. To get additional data, the controller will initiate a query, designating the desired data. This operation is shown in Figure 4 as the query-reply loop.

Automated Network Control Center

To provide the network control center functions automatically, the system concept shown in Figure 5 has been developed. A general purpose digital computer is proposed, together with special equipment and displays, to provide status message processing, storage, and display. The Information Processor suggested for this application is an RCA 304 computer with a record file for bulk storage.

As shown in Figure 5, the data input section of the control center provides terminations for incoming channels. Two channels are used for locally generated data: queries from the network controller on one channel and manually entered data on the other. The other channels carry status messages from network switching centers.
Figure 5. Data Flow in Automated Network Control Center.
Temporary storage for all incoming channels is provided by paper tape. Data can be
handled on all channels simultaneously at 100
words per minute. When a complete message
has been recorded on tape in one channel, it
is read into the Information Processor at
1000 words per minute. The Information
Processor checks the incoming messages for
correct format. Messages with format errors
are rejected and printed out. Control center
personnel correct the format errors and reenter the corrected status messages at the
manual entry keyboard.
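The format-checking and reject path described above can be sketched as follows; the three-field message layout is an invented example, not the actual status-message format:

```python
# Messages failing the expected format are rejected for manual correction
# and re-entry; the rest update the network status file.
def process_messages(messages, status_file, rejects):
    for msg in messages:
        fields = msg.split("/")
        if len(fields) == 3 and fields[0].isalnum():
            station, channel, status = fields
            status_file[(station, channel)] = status
        else:
            rejects.append(msg)       # printed out for manual correction
```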
In addition to checking the incoming status
messages, the Information Processor maintains a complete file of network status. It
continually provides an updated summary of
Proceedings-Fall Joint Computer Conference, 1962 / 151
network status, in suitable code and format,
for transfer via the display buffer to the wall
map display. New display data are indicated
to the controller by flashing lights, which the
controller can reset when he desires.
Both trunk equipment status and traffic
backlog would be provided in a single geographic type display. These two factors are
closely related and form the basis for intelligent system control decisions. Above the
geographic display would be a tabular display
of the status of lines allocated to high priority
users. The wall display is shown in Figure 6.
Figure 6. Network Control Room.
An artist's concept of the proposed network
control room is shown in Figure 6. The wall
map displays the status of all switching centers, traffic backlogs, and trunks in the network. The tabular display at the top of the
map shows the status of allocated channels.
The controller's console is simple and
functional. It contains a page printer and keyboard for query entry and reply and a second
keyboard for composition of control messages. A small panel is provided for display
illumination controls, and for indicators
showing status of equipment in the next room.
If the controller wants data other than that
shown by the display, he will enter a query to
the Information Processor. The Processor
prepares the reply data and transfers it back
to the controller. By entering a query, the
controller can get any data recorded in the
Processor files.
Although one man can operate the console,
working space for an observer is provided.
A possible layout of the automated control
center is shown in the artist's concept of
Figure 7. The equipment room is located
next to the network control room. The separating wall has been removed for clarity.
152 / Data Processing for Communication Network Monitoring and Control
Figure 7. Artist's Concept of Automated Control Center.
At the wall to the right is an RCA 304 Information Processor, with the record file and
paper tape inputs located in front. The smaller
cabinets contain the paper tape punches and
readers for terminating the incoming channels. Two teletype operator positions are
provided; one for manual entry and the other
for channel coordination with the incoming
and outgoing channels. The large racks in the
rear house the display buffers and other special equipment.
Data Processing in the Automated Control
Center
The Information Processor in the Automated Network Control Center must perform
four data processing functions:
1. Incoming Message Check: The processor will check the format of incoming messages and reject those having format errors.
The messages in error will be printed out,
together with an indication of the detected
error.
2. Station File Maintenance: A file of the
current status and recent history of each station (switching center) will be kept in the
processor memory. As the station status
reports come in, the status information will
be posted to the station file.
3. Display Data Output: The processor
will automatically provide updated status in
a form suitable for the control center display.
4. Process Queries: The processor will
accept queries from the keyboard at the controller's console. The data requested will be
retrieved from the station file and output to
the page printer at the controller's console.
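Functions 2 and 4 above amount to posting reports into a per-station file and answering controller queries from it; a minimal sketch (the field names `current` and `history` are illustrative assumptions, not from the paper):

```python
# Per-station file keeping current status plus recent history (function 2),
# with query retrieval for the controller's page printer (function 4).
station_file = {}

def post_report(station, status):
    # As station status reports come in, post them to the station file,
    # pushing the previous status into the recent-history list.
    entry = station_file.setdefault(station, {"history": []})
    if "current" in entry:
        entry["history"].append(entry["current"])
    entry["current"] = status

def query(station):
    # Retrieve the requested data for output at the controller's console.
    entry = station_file.get(station)
    return None if entry is None else entry["current"]
```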
Each of these tasks must be done on a real-time basis, to avoid system delays which
would reduce effectiveness. The operating
speed of the processor complex should be
designed to keep up with the peak work load,
and to catch up after periods of scheduled
maintenance.
The data processing operations in the
proposed control center system are based on
the use of a Data Record File for storage of
station status. The RCA Data Record File,
Model 361, stores information on both sides
of 128 magnetic coated records, with 18,000
characters stored on a side. The status of
each station is stored on one side of a record
in the Data Record File. Included in the stored
data are the status of every trunk terminating
at the station, broken down into traffic backlog, status of the trunk, status of the channels
in the trunk, and status of the users having
allocated channels in the trunk.
As previously described, the incoming
status reports are initially stored on paper
tape, and then read into the Processor as
complete messages. As soon as the Processor recognizes the station identity referred
to in a particular tape, it selects the record
containing the status of that station and reads
the complete station file into the core memory. When the station file is in the core memory and the status message has been read in,
the Processor performs the updating operation. After the entire station status has been
updated, it is rewritten in the record file as
a unit.
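The record-file capacity and the read-modify-write update cycle just described can be sketched as follows; the `apply_update` callback is an illustrative stand-in for the posting logic, which the paper does not detail.

```python
# Capacity arithmetic for the RCA Data Record File, Model 361:
# 128 magnetic-coated records, both sides used, 18,000 characters a side.
RECORDS = 128
SIDES = 2
CHARS_PER_SIDE = 18_000
total_chars = RECORDS * SIDES * CHARS_PER_SIDE   # total file capacity
sides_available = RECORDS * SIDES                # one station per side

# Update cycle from the text: read the complete station record into core,
# post the status message, then rewrite the record as a unit.
def update_station(record_file, station, status_message, apply_update):
    record = record_file[station]                  # read whole record
    record = apply_update(record, status_message)  # update in core memory
    record_file[station] = record                  # rewrite as a unit
    return record
```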
As the Processor updates the station file,
it abstracts the data that must be displayed.
These data are translated into the proper
code and format for driving the display, and
transferred to the display buffer.
Query processing is done on a station basis.
Each query is analyzed by the processor to
determine the stations involved. The station
file records are then scanned and the appropriate information is extracted and converted
to a format suitable for print-out.
SUMMARY
A system study of long-haul network control center requirements, functions, and operation has resulted in a network monitoring
and control concept which includes data processing at both the switching centers and the
control center. Such high speed data processing will substantially increase effectiveness of the present worldwide long-haul
system by reducing the reaction time to
natural and man-made disturbances. It is,
therefore, a valuable tool in meeting the
increased traffic loads and the vulnerability
of communication channels to modern weapons.
ACKNOWLEDGEMENT
Many RCA engineers contributed to the
network control center concepts described
above. The author wishes to particularly
acknowledge the efforts of A. Coleman, S.
Kaplan, J. Karroll, M. Masonson, D.
O'Rourke, and E. Simshouser.
DESIGN OF ITT 525 "VADE" REAL-TIME PROCESSOR
Dr. D. R. Helman, E. E. Barrett, R. Hayum and F. O. Williams
ITT Federal Laboratories
Nutley, New Jersey
SUMMARY
INTRODUCTION
The ITT 525 VADE (Versatile Automatic
Data Exchange) is a medium-scale communications processor capable of handling 128
duplexed teletype lines and 16 high speed
data lines. The processor is of the single-address, parallel binary type utilizing a two-microsecond-cycle-time core memory and
operating at a single-phase clock rate of four
megacycles. The fundamental design approach of the machine is to trade the intrinsic
speed of high performance hardware for a
reduction in total equipment, through time-sharing. The memory is shared between
stored program and input/output functions
without the use of a complicated "interrupt"
feature. Serial data transfers between the
memory and communication lines are performed on a "bit-at-a-time" basis requiring
a minimum of per-line buffering. The central processor hardware is largely conventional but has been reduced as much as possible without impairing the power of a basic
communications processing instruction repertoire, which includes indexing and character mode operations but not, as yet, multiplication or division. Instruction time is six
microseconds and the number of instructions
performed per second varies from 63,500 to
81,000 depending on the existing input/output
traffic load. Duplexing of the machine is accomplished by a "shadow" system whereby
the off-line processor is continuously updated
by the on-line processor through one of the
normal high speed data links.
At the present time the design of real-time processors is following the multiprogramming or multisequencing philosophy.
Multiprogramming is usually defined as the
time sharing of a single central processing
unit. The processor responds to the realtime channels by activating a stored program
which in many cases is unique to that particular channel. The memory must store not only
the individual programs, but also the addresses of the programs for the corresponding channels. Furthermore, the processor
must provide some type of priority-interrupt
system which will respond to the various service requests of the real-time channels.
Real-time processors designed upon the
multiprogramming basis usually provide perline equipment which converts the serial
binary stream to a parallel character. After
this conversion, the character enters the
Input-Output system where character or word
buffering occurs. Finally, the data enters the
Main Memory for processing, after the channel service-request has been recognized.
To implement such a processing system much
special purpose hardware and programming
is required. Program, index and supervisory
memories may be utilized in conjunction with
special purpose priority interrupt hardware.
The ITT 525 (Versatile Automatic Data Exchange) is a real-time processor designed
upon a radically new philosophy. The objective of the design is to trade-off high internal
processing speed with hardware, such that
effective utilization is made of the machine
capability. The ITT 525 processor serves as
a real-time store and forward message processor, which may serve as many as 16 high
speed duplexed data lines and 128 teletype
lines operating at a 100% line utilization.
The unique features of the ITT 525 include
the sharing of one core memory for input,
output and processing functions; the serial
bit at a time assembly and disassembly of
messages in the shared core memory using
some simple in-out hardware; the storage of
instruction micro-function logic instead of
the standard operation decoder and logic; a
minimum register central processor utilizing
direct data transfers and providing the facilities for indexing and character mode; a
powerful instruction repertoire for the implementation of the operational, utility and diagnostic programs.
System Design
The objective of the ITT 525 design was to
produce a versatile message processor, at a
minimum cost per line, to perform the function of a local area center handling a reasonable amount of data and teletype lines. It was
decided to implement a system capable of
interfacing with 128 duplexed teletype lines
plus 16 duplexed data lines.
In order to achieve minimum cost per line
the first design philosophy established was
to make optimum use of the common equipment. Thus, it was decided to time-share
one core memory and the major control circuits, between the In-Out unit and the central
processor. This was made possible because
of the high speed core memory, operating at
a 2 microsecond read-write speed, and 4
megacycle logic circuits. If it is assumed
that two memory cycles are required to perform a machine instruction, a maximum of
250,000 instructions per second may be executed by the ITT 525.
Since the ITT 525 has such a high internal
speed, it was further decided to deviate from
conventions and accept data from the realtime lines a bit at a time per line into the
one core memory with no per line buffering.
The messages are, thus, taken directly from
the serial bit stream into the core memory
where they are completely assembled. The
message remains in this storage area while
it is being processed and analyzed by the
stored program and finally becomes disassembled one bit at a time for the output line
transmittal. This line scanning or bit sampling of the input and output lines requires a
total of 62% of the total machine time for the
128 teletype and 16 data line configuration.
Thus, a total of 95,000 instructions per second are available for the central processing
functions.
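The instruction-rate figures above follow directly from the stated timings and can be checked arithmetically:

```python
# Peak rate: a 2-microsecond read-write memory cycle and two memory
# cycles per machine instruction give the quoted 250,000 instructions
# per second. Line scanning then consumes 62% of machine time, leaving
# the quoted 95,000 instructions per second for central processing.
memory_cycle_us = 2
cycles_per_instruction = 2
peak_rate = 1_000_000 // (memory_cycle_us * cycles_per_instruction)
scan_fraction = 0.62
central_processing_rate = peak_rate * (1 - scan_fraction)
```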
The system analysis of the ITT 525 determined that to completely process the assembled messages, with an input line utilization
of 100%, would require between 35,000 and
45,000 instructions per second. This processing consists of message validity checking,
message decoding for destination and priority,
message filing, message journalling, message code conversion and finally output queueing. Approximately 1500 instructions per
message are needed to perform the complete
processing functions. This processing estimate of 45,000 instructions/sec is rather
conservative, since the probability of continuous 100% line utilization is very remote.
Thus, the average processing time will be
much smaller than the 45,000 instructions per
second. However, the total available central
processing time for the ITT 525 is 95,000 instructions per second so that, obviously,
50,000 instructions per second remains for
future expansion or a further trade-off of
time for hardware or flexibility.
To make further use of the extra machine
time, it was decided to employ the concept of
stored micro-operations or microfunctions.
A reserved area of memory contains the
microoperations for each machine instruction. This word is retrieved for each instruction before the execution of the instruction
can proceed. This extra memory retrieval
per instruction uses an equivalent of 31,500
instructions per second, so that a total of
63,500 instructions per second remain for
message processing. The stored microfunction logic replaces the conventional wired
logic operation decoder and some corresponding microoperation logic. However, the primary advantage of this approach is not the
reduction of hardware obtained, but in increased instruction flexibility and speed of
machine check-out. Each microfunction may
be tested independently either by a diagnostic
program or from the operator's console. This
facility greatly reduces the time required to
isolate and repair machine failure.
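A minimal sketch of the stored-microfunction idea, plus the time budget quoted above. The operation codes and microfunction names here are illustrative assumptions; the paper gives only examples such as "Reset Accumulator."

```python
# Each instruction's control word is fetched from a reserved memory area;
# each bit of the word enables one microfunction directly.
MICROFUNCTIONS = ("RESET_ACC", "XFER_INDEX_TO_MB", "PARTIAL_ADD", "CARRY")

# Reserved memory area: one microfunction control word per op code
# (hypothetical contents).
control_store = {
    0o01: 0b0011,   # reset accumulator, transfer index register
    0o02: 0b1100,   # partial add, then carry
}

def active_microfunctions(op_code):
    word = control_store[op_code]   # the extra retrieval per instruction
    return [name for bit, name in enumerate(MICROFUNCTIONS)
            if word >> bit & 1]

# Cost of that extra memory retrieval, per the figures in the text:
available = 95_000
remaining = available - 31_500      # instructions/second left over
```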
In summary, the ITT 525 system design
has resulted in the development of a stored
program processor in which the memory is
time-shared between Input/Output wired logic
and the program control logic. The processor
operates internally on parallel binary words
each consisting of thirty-two bits. The instruction cycle, consisting of 6 microseconds,
performs single address instructions with an
available rate of at least 63,000 instructions
per second.
The block diagram of ITT 525 VADE (Figure 1) illustrates the machine configuration
consisting of a Central Processor and In-Out
time-sharing the core memory.
Figure 1. ITT 525 VADE (block diagram: core memory time-shared by the Central Processor and the In/Out Unit).

DATA WORD

BITS    DEFINITION
0-7     CHARACTER 0 (C0)
8-15    CHARACTER 1 (C1)
16-23   CHARACTER 2 (C2)
24-31   CHARACTER 3 (C3)

INSTRUCTION WORD

BITS    DEFINITION
0       CHARACTER MODE (C)
1       INDEX TAG (X)
2-7     SENSIBLE DEVICE CODE (S)
8-13    OPERATION CODE (OP)
14-29   WORD ADDRESS (W)
14-31   CHARACTER ADDRESS (A)

Figure 2. ITT 525 Processor Words.

Machine Organization
Central Processor
The central processor of the ITT 525 is a
single address, binary, one's complement
processor employing stored logic for instruction decoding, one index register and character mode operation. The design is based
upon a minimum register configuration, with
maximum time sharing, and direct transfer
between registers.
The instruction cycle of the processor is
six microseconds. This cycle is broken down
into three memory accesses: one unload-load to fetch the instruction; one unload-load
to obtain the stored microfunction control
word; and finally one unload-load to obtain
the operand and execute the instruction. The
processor word consists of thirty-two bits
which may take the form of four eight-bit characters or an instruction word divided into six
fields (Figure 2).
The Memory Unit consists of a high speed
linear selection core memory operating at a
speed of 2 microseconds per complete cycle.
The size of the core modules varies from
4096 words at 33 bits to 32,768 words. An
extra bit (33) is furnished, which enables
parity checking during the unload cycle and
parity generation during the load cycle.
The design of the processor registers may
be considered conventional for the instruction counter, index register and memory
address register. However, several unique
features were employed in the use and design
of the Memory Buffer, Accumulator and Control Buffer.
The Memory Buffer is a time-shared
register which is concerned with normal
memory functions plus other functions such
as arithmetic unit buffering, character mode
gating, In-Out mode buffering and behaving
like a pseudo bus. During arithmetic and
logical operation the arithmetic unit may be
considered as the Memory Buffer Register
and the Accumulator. The reasoning behind
this approach is that the contents of the
memory buffer may be utilized as the arithmetic unit "B" register, while the data is
being loaded into memory. For example, in
addition the Accumulator contains the augend
and the Memory Buffer contains the addend.
These operations may, furthermore, be performed in either the word or character mode.
Special character gating between accumulator
and memory buffer enables the programmer
either to perform the operation on 32 bits or
one of the eight bit characters. In the transfer of data from register to register, the
Memory Buffer acts as a pseudo-bus, through
which all data must pass. This configuration
reduces redundant paths and allows one to
form any data transfer path as desired. This
capability is especially useful in developing
new instructions consisting of several register transfers.
The accumulator register of the ITT 525
is the heart of the arithmetic unit. This
register in conjunction with the memory
buffer performs a parallel, two-step addition
and subtraction. One of the unique features
of the Accumulator is the carry chain configuration, which has a maximum delay of 425
nanoseconds.
The standard, simple, carry chain configuration consists of a single gate per flip-flop
stage. The delay encountered for this arrangement is the number of stages times the
gate delay, which for the ITT 525 would have
been 32 x 35 or 1120 nanoseconds. In order
to take full advantage of the four megacycle
clock it was determined that a carry chain
delay of less than 500 nanoseconds would be
desirable for the ITT 525. One technique
available to speed the carry chain is the pass-carry or grouping-carry idea. In this case,
several stages are combined to form one
large carry gate, thus, reducing the overall
carry chain delay. However, in the ITT 525
Accumulator the carry chain design is based
upon the group hierarchy principle. This concept makes optimum use of the recursive
nature of the carry equation by first combining flip-flops into groups and groups into
sections. In this way, if a carry has to be
passed for 32 bits, it will skip not only the
groups of flip-flops, but also the sections of
groups.
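The grouping idea can be sketched in software as one level of the hierarchy: compute a generate/propagate pair per group, then chain carries group-by-group instead of bit-by-bit. The group size of 4 is illustrative; the paper states only the 35-nanosecond gate delay and the 425-nanosecond total.

```python
def ripple_carry_out(a, b, width=32):
    """Reference: bit-at-a-time ripple carry, one gate per stage."""
    carry = 0
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        carry = (x & y) | (carry & (x ^ y))
    return carry

def group_gp(a, b, lo, hi):
    """Generate/propagate pair for bit positions lo..hi-1."""
    g, p = 0, 1
    for i in range(lo, hi):
        x, y = (a >> i) & 1, (b >> i) & 1
        gi, pi = x & y, x | y
        g = gi | (pi & g)   # group generates a carry internally
        p = p & pi          # group propagates an incoming carry
    return g, p

def hierarchical_carry_out(a, b, width=32, group=4):
    """Carry-out chained across groups: one step per group, not per bit."""
    carry = 0
    for lo in range(0, width, group):
        g, p = group_gp(a, b, lo, lo + group)
        carry = g | (p & carry)
    return carry
```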
The functions that the accumulator may
perform upon data are as follows:
1. Partial Add (Exclusive Or)
2. Carry
3. Inclusive Or
4. Reset
5. Complement
6. And
7. Cycle Left
The accumulator may be sensed by program for the following conditions:
1. Minus Zero
2. Plus Zero
3. Overflow
4. Any bit of Character 3 (24-31)
5. Plus or Minus Zero
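The two-step addition can be sketched from the listed primitives: Partial Add is an exclusive-or, and the Carry step folds the carries back in, repeating until none remain. Since the machine is one's complement, a carry out of bit 31 is carried end-around, and a result of all ones is the "Minus Zero" condition the accumulator can sense. This is an illustrative software model, not the actual hardware sequencing.

```python
MASK = 0xFFFFFFFF   # 32-bit word

def ones_complement_add(a, b):
    while b:
        partial = (a ^ b) & MASK    # 1. Partial Add (Exclusive Or)
        carry = (a & b) << 1        # 2. Carry
        if carry > MASK:
            # End-around carry: carry out of bit 31 re-enters at bit 0.
            carry = (carry & MASK) | 1
        a, b = partial, carry
    return a
```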
The Control Buffer contains the instruction micro-operations obtained during the
Processor's "Stored Logic" Memory cycle.
Each bit of this register is assigned a specific microfunction, such as "Reset Accumulator," "Transfer Index Register to Memory
Buffer," etc. If a particular instruction requires that microfunction, a "One" appears in
that bit position. High fan-out drivers distribute these microfunctions to the various
register input gates. In addition to the increased flexibility and some cost reduction,
the use of the stored logic technique provides
a powerful tool for checkout and maintenance.
INPUT/OUTPUT UNIT
With the single exception of a direct input
from a paper tape reader on the console, all
processor inputs and outputs are handled by
the Input/Output Unit, including transfers
between core memory and secondary storage
devices. The initial implementation of the
ITT 525 system has the following traffic-handling capability:
1. 16 duplexed high speed data lines operating at any speed up to 2400 bits per second (8-bit code).
2. 128 duplexed teletype lines operating
at speeds of 60, 75 or 100 words per minute
(5-bit code).
3. Block transfers of computer words to
one of eight magnetic tape units operating at
a transfer rate of 2500 computer words per
second.
This is a maximum capability configuration with regard to teletype and data lines.
Smaller machine capabilities are implemented in any combination of modular blocks
of 4 data or 16 teletype lines. Also, individual line speeds are completely independent
and may be changed without incurring hardware changes.
Although the stated capabilities conform
only to the task of communications processing, the unique features of the Input/Output
Unit are applicable to other tasks and configuration requirements with a moderate
amount of hardware change. The "bit-at-a-time" technique is easily adapted to various
forms of serial bit streams, regardless of
framing or synchronization details, and the
method used for tape word transfers is directly applicable to any block transfer process, even if the "blocks" are degenerate ones
of only a few words or characters.
Teletype and High Speed Data Lines
Incoming serial bits on these lines are
transferred directly into core memory. Outgoing serial bits are transferred directly
from the memory to one or two per-line output flip-flops. Each output line requires one
flip-flop for pulse-stretching and data output
lines require an additional flip-flop to reduce
bit jitter. The total line storage required
is 160 flip-flops, which compares favorably
with the 1536 flip-flops required if each line
(input and output) were to terminate in a one-character buffer.
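The 160-versus-1536 comparison follows from the stated line counts and codes. The assumption below that per-line buffers would be 5 bits wide for teletype and 8 bits for data lines is inferred from the codes given earlier in the text.

```python
# One pulse-stretching flip-flop per output line, plus one extra per
# data output line for jitter reduction; input lines need none.
TTY_LINES, DATA_LINES = 128, 16
line_ffs = TTY_LINES * 1 + DATA_LINES * 2

# Per-line one-character buffers would instead need a character register
# on every input AND output line: 5 bits (teletype) or 8 bits (data).
buffer_ffs = 2 * (TTY_LINES * 5 + DATA_LINES * 8)
```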
Several fixed core memory locations are
permanently assigned to each input and each
output line which are used by the input/output
logic. These per-line locations contain space
for character assembly/disassembly, program flag bits, control bits and timing information. The stored program exercises control over the Input/Output Unit by performing
regular scans of these words and changing
their contents when necessary, thus modifying the operations of the wired logic of the
Input/Output Unit. Specifically, in addition
to noting the end of incoming messages or
initiating output for outgoing messages, the
program must make "bin" assignments to
active lines. It is unfeasible to reserve for
each line a space in memory adequate for the
largest possible message. Alternatively,
"bins" of 75 words, or 300 characters, are
assigned to active lines as they are required.
The Input/Output Unit logic notifies the program of such needs by flag bits and can store
temporarily, in the fixed memory locations,
as many as twelve incoming characters during
the interim between bin assignments. In normal input operation, after a bin assignment
is received, characters are transferred to
memory soon after completion, independent
of the stored program.
In reference to the block diagram of Figure 1, the basic operation of the I/O Unit is
rather simple. A "Scan Generator" controls
line selection and memory addressing (for
control words) according to a fixed cycle of
operation. Then, for each line scan, the most
important control word for the line, the "status
word," is unloaded to the "Status Word Buffer"
where it remains for one or two more memory
cycles to control operations on the line information through use of the Memory Buffer
for examination and modification of other
words. Finally, two counters are used for
timing purposes indicated below.
The Input/Output Unit obtains control of
the memory and performs a "scan cycle"
every 280 microseconds. This interval is
compatible in two different ways, with the bit
periods of the lines. A 2400 bit-per-second
data line has a bit length of 417 microseconds
and a 100 word-per-minute teletype line has
a bit length of 13.46 milliseconds. By scanning all data input and output lines each scan
cycle but only one-fourth the teletype input
lines and one-sixteenth the teletype output
lines, the following rates are obtained:
1. data lines are scanned at least 1.49
times per bit.
2. teletype input lines are scanned 12, 16,
or 20 times per bit for 100, 75 and 60 word-per-minute lines, respectively.
3. teletype output lines are scanned 3, 4,
or 5 times per bit for 100, 75 and 60 word-per-minute lines, respectively.
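These scan rates can be reproduced from the 280-microsecond scan cycle: data lines are visited every cycle, teletype input lines every fourth cycle, and teletype output lines every sixteenth. The 75- and 60-wpm bit lengths are scaled from the 100-wpm figure given in the text.

```python
SCAN_US = 280                        # interval between scan cycles
data_bit_us = 1e6 / 2400             # ~417 microseconds per data bit
tty_bit_us = {100: 13_460,           # 13.46 ms, from the text
              75: 13_460 * 100 / 75,
              60: 13_460 * 100 / 60}

data_scans_per_bit = data_bit_us / SCAN_US                       # ~1.49
tty_in_scans = {w: us / (SCAN_US * 4) for w, us in tty_bit_us.items()}
tty_out_scans = {w: us / (SCAN_US * 16) for w, us in tty_bit_us.items()}
```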
These rates permit the sampling of teletype
input lines within ±8% of the nominal bit-center, or better, to minimize the effects of
distortion. Actual sampling is accomplished
on the basis of predicted bit-center sampling
times established when the "stop" pulse to
"start" pulse transition is detected and stored
in the fixed memory space for the line for
later coincidence comparison with a real-time
(based) counter. Data input lines are sampled on the basis of timing provided by their
associated synchronizing signals. The synchronizing signal used has a frequency of one-half the bit rate of the line. The value of this
signal (zero or one) is stored on each scan
and the line value is not sampled unless the
stored and present values of the synchronizing signal differ. Output lines are handled in
exactly the same manner as input lines except
that teletype output lines do not need the high
scan rate provided for input teletype lines
since the output process itself controls the
waveform distortion.
During the I/O scan operations, one memory cycle is required to scan each line and
two additional memory cycles are required
for each character transfer between the fixed
memory locations for the line and the message bin elsewhere in memory. By limiting
the number of character transfers allowed in
each scan, minimum and maximum I/O scan
time requirements of 144 and 173 microseconds, respectively, are obtained. Since the
interval between scans is 280 microseconds,
52% to 62% of total processor time is spent
in input/output operations and 38% to 48% remains for stored program use (63,500 to
81,000 instructions per second).
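The time-budget figures above follow from the 6-microsecond instruction time and the 280-microsecond scan interval (the quoted 63,500 and 81,000 appear to be rounded):

```python
# I/O scan consumes 144-173 us of every 280 us interval; the remainder
# is available to the stored program at 6 us per instruction.
SCAN_INTERVAL_US = 280
scan_min_us, scan_max_us = 144, 173
io_share_min = scan_min_us / SCAN_INTERVAL_US   # ~52%
io_share_max = scan_max_us / SCAN_INTERVAL_US   # ~62%
instr_per_sec = 1e6 / 6                         # ~166,667 at full tilt
rate_min = instr_per_sec * (1 - io_share_max)   # ~63,500 instr/sec
rate_max = instr_per_sec * (1 - io_share_min)   # ~81,000 instr/sec
```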
Magnetic Tape Block Transfers
Block transfers of computer words between
the memory and the Magnetic Tape Module
of the 525 are initiated by the stored program
and executed in detail by the Input/Output
Unit. A single fixed location in memory holds
a block address and a count of the number of
words to be transferred. During a block
transfer, the MTM sends requests for word
transfers to the I/O Unit at approximately 400
microsecond intervals and these requests
must result in a word transfer within 67
microseconds. It is possible, then, that a
word transfer must be made when the I/O
Unit is not in control of the memory. In this
case, the I/O Unit gains control of the memory
only for the transfer time and then relinquishes control until the next regular line
scan cycle time. While it is in control of the
memory, the I/O Unit uses the fixed location
MTM word to control the transfer of a word
between the specified address and the MTM
buffer. Then the address is incremented,
the block transfer count decremented and the
MTM word is transferred back to its fixed
location. Much of the logic performing these
operations within the I/O Unit is common to
the logic required for teletype and data line
operations, since character transfers and bin
counts for these lines are handled on the same
general basis. The I/O Unit is easily expanded to include similar block transfer
provisions for magnetic drums, card readers,
punches, printers and displays.
Except for the implementation of the block
transfer process, there is nothing unusual
about the operation of the magnetic tapes.
One tape at a time may be selected to read,
write, backspace one record, advance one
record, write end-of-file or rewind, the rewind operation being performed in a quasi-off-line state so that other units may be
selected during this operation.
Input/Output Program Requirements
Since the ITT 525 has no interrupt feature,
the stored program-input/output interface
is an unusual one. Regular scanning of the
fixed-location input/output control words is
essential to the bin assignment task of the
program. Input teletype words must be
scanned at least once every 800 milliseconds
and input data words at least once every 40
milliseconds. These figures represent the
amount of time required for incoming information to fill the twelve-character per-line
temporary storage space. Output words are
scanned (for bin assignment needs) at
whatever speeds the programmer desires
since no information can be lost and the only
consideration is for efficiency in transmitting
messages which are more than one bin in
length.
A more complicated problem arises when
the program must exert control over I/O Unit
operations by modifying the contents of fixed
location control words. An input/output line
scan cycle may interrupt the program and
change the contents of a control word at the
same time that the program is preparing to
modify the word. Since the program has no
natural means of knowing an interruption has
occurred, it would tend to force obsolete data
into the control word. To circumvent this
problem, a flip-flop is provided which can be
sensed by the program and which, if on,
guarantees the program that the six instruction times immediately following the sense
instruction will be free of I/O Unit interruption; this number of instructions is sufficient to perform the control word modification.
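The guard just described can be modeled as a sense-then-modify loop. The class and the 46-instruction scan period (roughly 280 us / 6 us) are illustrative assumptions used only to make the mechanism concrete.

```python
class IOScanClock:
    """Models instruction times remaining before the next I/O scan."""
    PERIOD = 46   # ~280 us scan interval / 6 us instruction time

    def __init__(self, remaining):
        self.remaining = remaining

    def sense(self):
        # The guard flip-flop reads "on" only when the six instruction
        # times after the sense are guaranteed free of I/O interruption.
        return self.remaining >= 6

    def step(self):
        # One instruction time elapses; wrap at the scan cycle.
        self.remaining = self.remaining - 1 if self.remaining else self.PERIOD

def modify_control_word(clock, memory, addr, new_value):
    # Spin on the sense instruction, then modify the shared control word
    # inside the guaranteed interruption-free window.
    while not clock.sense():
        clock.step()
    memory[addr] = new_value
```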
Duplexing
To meet the reliability of many real-time
problems, the ITT 525 may be utilized in a
duplexed configuration. The duplexing design
of the ITT 525 has been selected on the basis
of maximum reliability and minimum special
purpose duplexing hardware.
The Duplexed System configuration is illustrated in Figure 3. System A is in control
of the magnetic tape and communication links,
all input lines and output lines are accepting
and sending data. The standby machine B
also has the input lines connected to it and
accepts all input messages. Furthermore,
machine B assembles, processes the message and sets up the output queue. Machine
A regularly sends data to machine B via a
Figure 3. Duplex Configuration.
normal high speed data line concerning the
disposition of messages. Once machine A has
outputted the message, machine B erases the
message and updates its own output queue.
Machine B does not file, journal or overflow,
or output any messages. Using one of the high
speed data links for the regular communication between machines A and B insures a
smooth cutover with no loss of data. The
worst condition that might occur is that after
cutover machine B might output a given message again since it had not received the last
disposition data.
Machine A, if in control, may by program
relinquish control to Machine B and vice
versa. Also, manual means are available to
establish the duplex configuration via the
Operator's Console. Output lines and the
magnetic tapes are automatically switched
into the proper machine for each configuration.
CONCLUSION
The ITT 525 VADE is intended to do a
medium-scale job using only a small-scale
amount of hardware. Although testing and
program debugging will not be complete for
another two or three months, there is little
doubt that the system will satisfy this aim.
Further extensions of the VADE approach
have been planned which will improve the
speed, line-handling capacity and versatility
of the machine by modular additions of hardware at selectively increased cost.
ON THE REDUCTION OF TURNAROUND TIME
H. S. Bright and B. F. Cheydleur
Computer Division
Philco Corporation, a Subsidiary of Ford Motor Company
Willow Grove, Pennsylvania
SUMMARY

Objective: To Reduce Delays. It is the
intent of this work to permit small computing
jobs to run with typical delays of minutes
rather than hours, while no jobs, including
the largest ones, become appreciably worse
in turnaround time than at present.

Background. The basic idea of multiple-break-in
operation from many input/output
stations is not new. Most authors, however,
have proposed either dramatic advances in
hardware or software, or computation complexes
of conventional hardware so large as
to be economically unattractive. McCarthy,
for example, proposed serving some dozens
of stations for simultaneous on-line debugging
of as many programs, by using perhaps a
million words of slow magnetic core memory
with a very fast computer.

Resources:
(a) Flagging, in the procedure-oriented
language, of the permissible break-in points
on large programs.
(b) Sequential, rather than concurrent,
operation of programs, by means of fast exchange of core contents with disc.
(c) Use of main core as input/output
buffer for short communications with multiple
remote stations.

Approach. In principle, the operator system
is to be increased in capability for minimizing
delays, through application of currently available hardware, together with
planned interruption of long runs by short
ones. For small jobs, break in on large jobs at
selected interrupt points. For large jobs,
stack input/output on disc. Sequence primarily by estimated run time, with some
consideration of priority, and with less attention given to arrival chronology.

Sources of Delay. "Legitimate" delays,
for jobs to be run as entities in the sequence
in which received, consist primarily of queue
development during periods in which the average workload acquired exceeds the computation rate capacity of the facility. "Illegitimate" causes of delay result mainly from
manual job stacking. Artificial delays are
inserted at several places in a typical facility, including the sign-in desk, the card-to-tape facility, the on-line input tape stack,
the on-line tape units (gross operational
delays from the mounting of file tapes while
the computer system idles), and the printer
tape stack.

The paper describes a typical large relaxation calculation, giving operation parameters as executed on a modern computer,
showing for this rather formidable example
a break-in-point interval on the order of
several seconds. In jobs of such size, total tape and disc
traffic can be comparable in volume to internal data flow. In contrast, the concurrent
buffering of set-up information for many
problems constitutes a relatively minor contribution to total data flow. Thus the initial
input and final output for many jobs may
proceed concurrently, although calculations
will in all cases be executed sequentially.

Conclusion. The paper marshals arguments supporting the practicality of greatly
reducing turnaround delays without using huge
memory or very costly types of communication facilities.
The productivity of all direct users of
large- scale general-purpose digital computing centers, and to a lesser extent the productivity of the entire organizations they
serve, are significantly affected by the typical
time delay between request for and delivery
of computer service, which we shall call
"turnaround time."
For this reason, reduction of turnaround
time has in recent years become recognized
as having major economic importance. As
machines have become faster, individual
problem setup time has assumed larger
significance in computing center logistics.
Some of the delays for clerical work at setup
time have been taken over by operator programs. Much of the effort on delay reduction
has been applied to attempts to increase the
effective throughput capacity of the computing
systems themselves, either by concurrent
operations or through increase in sheer speed.
One approach that has been widely used,
as a matter of absolute necessity, for the handling of real-time problems within general-purpose facilities, has had surprisingly little
attention in "unreal-time" applications. The
intent of this paper is to direct attention to
the technique of short-run break-in by programmed interrupt, and to show how modern
hardware, without the costly special facilities
often required for prompt interrupt, can make
this method attractive for general-purpose
applications.
We believe that the method, which uses
only off-the-shelf hardware and software,
can permit many short jobs to run with typical delays measured in minutes rather than
in hours, while no jobs (including the longest
ones) become drastically worse in turnaround
time than in conventional first-in, first-out
operation.
Throughput Increase
Before proceeding with our discussion, it
will be useful to review some of the steps that
have been taken to increase the effective capacity of general-purpose computing facilities:
1. Concurrent Schemes (cohabiting programs)
   1.1 Micro-segmentation by commutating hardware
   1.2 Decentralization by input/output autonomy
   1.3 Macro-segmentation (program segment merging by hardware interrupt)
       1.31 Merged input/output, sequential execute (several concurrent I/O streams permitted, but only one computation at a time has control)
       1.32 Merged input/output and execute (full-blown "multi-programming")
       1.33 Sequential input/output, merged execute (early real-time operations on unbuffered machines)
2. Sequential Schemes (programs alone in memory)
   2.1 Multi-phase operation (batched input, execute, output ("I, E, O")) [1].*
   2.2 Faster machines
       2.21 Sequential integral programs
       2.22 Short-run break-in by program interrupt
3. Multiple Independent Machines
The concurrent schemes suffer from the
serious disadvantage that, even in multiple-computer-unit complexes (whether or not all
of the available memory space is accessible
by all processors) sufficient main-memory
space must be available for all of the programs or program segments that are to be
operated together, if the operation is to be
economically feasible. This often means that
either the program multiplexing is limited
to jobs that require very little memory space,
or that memory sizes are required that are
economically unattractive at present.†
Method 1.1, in present realizations, has
the additional disadvantages that both time
and memory space segmenting must be
*Batching of input, execute, and output phases
of all jobs on an input tape is discussed in
detail in reference [1].
†One proposal [4] called for a single computer system with one million words of
magnetic-core memory.
Proceedings-Fall Joint Computer Conference, 1962 / 163
relatively simple and inflexible. It increases
turnaround time for all processor-limited
problems that are run concurrently, since
the single central processor must be time-shared and all such jobs must take longer
than when run seriatim.
All of the Concurrent Schemes shown
above permit efficient use of multiple on-line input/output devices, but the sharing of
a single I/O device by several problems is
at present feasible only if the "device" is
actually a large random-access auxiliary
memory element or if Data Select * hardware
facilities are available.
This is particularly significant when
scheduling multi-tape problems on a large
machine; many of those jobs for which concurrent operation would be most attractive
require the use of half or more of the total
number of tape units available, especially
when tape-oriented operator systems are
used. This limitation on time-sharing of a
single I/O device is, alas, almost as frustrating for the modest scheme proposed by
the present paper as for the most sophisticated time-and-core-space-merging scheme
discussed. Its effect is to impose serious
limits on the permissible assignments of tape
units or other I/O devices, for jobs that are
to be run concurrently in any system, and in
most cases to prohibit reassignment of a
given I/O device until completion of the job
to which it was last assigned. The noteworthy
exception is the case of tape units used for
"scratch" storage of intermediate results;
such units may be reassigned as soon as the
last Read operation upon a given data string
has been completed, although the reassignment problem is a difficult one in the important case when the number of rereads is dependent upon calculation results and must be
determined at run time.
In the proposed scheme, the effect of this
tape assignment restriction is merely to hold
back the start of an interrupting job until
adequate I/O facilities can be assigned.
Thus, when an interruptable job that has
*This feature permits individual data records
on magnetic tape to be tagged with control
marks so that they can be processed or
skipped without detailed examination; it is
most commonly used for the writing of
multiple reports on a single tape by a single
program, so that report selection may be
made at the time of off-line printing.
extensive I/O unit requirements is running,
only those jobs that can be accommodated on
the remaining I/O devices can be permitted
to get to the head of the interrupting job stack.
(A comment on semantics is in order
here. Many of the early papers on multiprogramming, and a few recent ones, seem
to consider concurrency of I/O with computation (Method 1.31 with the restriction that
all of the I/O activity relates to the execution of one program) to constitute multiprogramming. We feel that "buffering" is the
accepted term to be applied to such concurrency, as long as a single main program
(which of course may be controlled by an
operator program and from time to time by
an arbitrary number of subprograms) has
control of the machine. We consider multiprogramming to mean "concurrency of the
execution phases of two or more unrelated
programs".)
Among the sequential schemes, Method
2.1 was clearly not designed with turnaround
time in mind, since it normally increases
turnaround time for all jobs in a batch.
This scheme, commonly known as "three-phase" operation, was intended to save time
by avoiding repeated loading of large input
and output routines. It operates by first preparing ("I" Phase) the input from all jobs on
an input tape; then ("E" Phase) executing all
of these jobs; and finally ("O" Phase) preparing output (printer tape) for all jobs. The
usual operating option of assigning special
output tape(s) at run time, in order to take
care of priority situations or to take advantage of a temporarily short printer job queue,
is not available without a complete change of
operating procedure back to "single-phase"
operation, the normal scheme in which a single job is processed from start to finish.
Thus, in true three-phase operation, the output from all jobs on a given input tape is delayed until the last job has been completed.
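The delayed-output property of three-phase operation described above can be made concrete with a toy sketch (the job names and phase ordering below are illustrative, not taken from any actual operator system):

```python
# Toy illustration of "three-phase" batch operation: input for every job on
# the tape is prepared first, then all executions, then all printer output,
# so the first job's output waits for the last job to finish executing.
jobs = ["job_a", "job_b", "job_c"]
log = []
for j in jobs:                # "I" phase: prepare input for all jobs
    log.append(("input", j))
for j in jobs:                # "E" phase: execute all jobs
    log.append(("execute", j))
for j in jobs:                # "O" phase: prepare printer output for all jobs
    log.append(("output", j))

# job_a's output is emitted only after job_c has executed:
print(log.index(("output", "job_a")) > log.index(("execute", "job_c")))  # True
```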
Method 2.21 is the one that comes under
fire in the classical justification for effort
to be expended upon development of operator
programs. It is clear that, as times for Input, Execute, and Output become vanishingly
small on a well-balanced very-high power
machine, one could conceive of a facility in
which most of the time was spent in program
setup, startup, and wrapup. In a typical
large-scale present-day facility, in fact,
brute speed alone cannot accomplish much
reduction of turnaround time. The scheme
proposed below permits setup information
accession to be concurrent for many jobs,
while startup and wrapup can occur very
quickly under program control.
Because of the remarkable advances that
have been made recently in machine power
and relative economy, a few words in retrospect will serve to underline the significance
of the preceding paragraph. A typical high-power modern machine has 2000% to 5000%
more computation capability, and 2500% to
7000% more input/output capability, than the
vacuum-tube machines (circa 709 and 1105)
that ushered in the concept of the integrated
data processing facility using parallel binary
arithmetic and buffered magnetic tape.
Clearly, such huge increases in capacity,
and in computation-per-dollar if the machines
can be kept occupied, call for serious reexamination of our operating methods.
Method 3, which is the addition of entire
computer systems, has represented sound
management practice during the first two
generations of large machines (the last of
the vacuum tube machines and the first of
the transistor machines), at least in those
installations where unscheduled delays of
more than a few hours might be prohibitively
costly; if not having on-site backup hardware
can be more expensive than that hardware
would be, then the extra hardware is justified irrespective of capacity considerations.
Because third-generation hardware will
be much more predictable as to its ready-willing-able condition (i.e., unscheduled downtime will be greatly reduced), and because
maintenance experience on second-generation
machines has taught design lessons that
should dramatically reduce time required
for scheduled maintenance, it seems reasonable that hardware unpredictability will in the
few years to come offer less justification
for parallel facilities. The large user who
has had the advantages of more than one
machine will, thus, in many cases consider
conversion to a single, more-powerful machine in which overall hardware economy
(computation per hardware dollar) can be
better. There will be, from this class of
user, intense interest in means for achieving
the excellent traffic-handling behavior of the
multiple-machine facility in a larger single-machine facility. Since this user will not be
willing (and in many cases will not be able)
to submit to the restrictiveness of the concurrent schemes, we feel that only Method
2.2 will meet his needs with any degree of
success.
Macro-Segmentation in Practice
In many installations, the basic hardware
configuration is determined by the requirements of a single class of "bread-and-butter"
problems. With such problems running in
the system, there will not be much excess
memory space or processor capacity available. The macro-segmentation scheme is a
means for scheduling the available excess
capacity.
If problems could be segmented so precisely that the onset and duration of memory,
processor(s), and I/O device availability
could always be matched precisely with the
demands of other problem segments, then
parallel operation could permit complete
use of the entire machine. This does not
appear to be workable in the real world.
In practice, a useful degree of approximation to that ideal can be achieved if the
problems, major and minor, are macro-segmented at compilation time so that the
incidence of spare capacities in various subsystems, instead of being pre-computed,
may be continually tested by control hardware and assigned at execute time, in vivo.
Such a stratagem requires the prescripting to every macro-segment of a précis of
the I/O and memory requirements of that
segment, a signalling of activity-completion
from each I/O device to the executive program, and the continued monitoring of the
problem programs at the macro-segment
level.
This scheme is being implemented for
several machines in the U.S. and in England,
notably in the English Electric KDF.9 which
is described elsewhere in these proceedings.
Particular notice should be taken of the
recent work reported in reference [6], by
Corbato, et al, describing an application of
Method 1.32. Their thoughtful comments on
several aspects of multiprogramming system requirements and planning have inspired
much of the work reported here, and the
reader is referred to that paper for valuable
background information on program time-sharing of hardware. Their algorithm for
run queue control will be discussed briefly
below, and several references will be made
to observations in that paper.
Short-Run Break-In
The basic method proposed here is much
simpler conceptually, and offers advantages
shared by none of the other methods listed
except those that utilize sheer power alone.
In a sense, it is a simplification of the macro-segmenting concept outlined above.
The segmenting is to be performed in large
programs only, under control of flags planted
by the programmer. This will require establishment of a programming convention for
maximum on-line time interval between flags,
which for large machines might be chosen on
the order of a few seconds to a few minutes.
Jobs whose maximum machine time requirement is smaller than the maximum permitted interval between flags will not be segmented, and it is these jobs that can be called
in by the operator program whenever a
break-in flag is encountered.
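The break-in discipline just described can be sketched as follows. This is a minimal illustration under stated assumptions; the function and job names are invented here, not taken from the paper:

```python
# Sketch of short-run break-in: a large job yields at programmer-planted
# flag points, and at each flag the operator program runs any waiting short
# job (one whose whole run fits inside the permitted inter-flag interval).
from collections import deque

short_jobs = deque()          # queue of unsegmented short jobs

def break_in_flag():
    """Called by a large program at each permissible interrupt point."""
    while short_jobs:
        job = short_jobs.popleft()
        job()                 # short job runs to completion, unsegmented

completed = []
short_jobs.append(lambda: completed.append("tiny_compile"))

for sweep in range(3):        # large job: one flag per mesh sweep
    pass                      # ... long computation section ...
    break_in_flag()

print(completed)  # ['tiny_compile'] — the short job ran at the first flag
```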
While we do not have the temerity to essay
a rigorous proof that any particular break-in
time limit is a reasonable one for all circumstances, it will be helpful to consider one
example of a notably forbidding class of problem in which data flow to and from auxiliary
memory proceeds concurrently with rather
involved calculation and indexing. In the inversion by relaxation methods of large sparse
matrices, it is prohibitively expensive of
restart time to interrupt the calculation during a mesh sweep. As each new sweep starts,
however, a substantial amount of initialization is performed; it is not unreasonable to
request that auxiliary memory data flow be
organized for efficient interruption at these
points. For instance, tape data can start
new blocks, so that, at worst, tapes may
require simple backspacing in the event of
interrupt at such a point; disc data flow may
start a new I/O order at these points.
Consider a tridiagonal matrix of order
100,000 that represents an array of difference
equations, calculation for each point to consist of eight accumulative-floating-multiply
operations together with a few housekeeping
operations. On a typical modern large-scale
computer,* the floating-load-multiply-add sequence may take 7 microseconds. Ignoring
the brief housekeeping operations, the time
for this mesh sweep would be 5.6 seconds.
Thus, imposition of the one-minute rule would
*e.g., the Philco 212.
afford no hardship to the programmer of this
large problem. Clearly, in the formalization
of problems that are even larger than this
one, sectionalization into relatively autonomous parts is a sine qua non of rational construction and rational problem checkout. The
run duration for these sections will tend to
be far less than a minute.
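The mesh-sweep figure quoted above follows directly from the paper's stated parameters, as this back-of-envelope check shows (all three figures are the paper's; the variable names are ours):

```python
# Check of the mesh-sweep timing: 100,000 mesh points, eight
# accumulative-floating-multiply operations per point, and a
# 7-microsecond floating-load-multiply-add sequence.
points = 100_000
ops_per_point = 8
op_time_us = 7

sweep_seconds = points * ops_per_point * op_time_us / 1_000_000
print(sweep_seconds)  # 5.6 — seconds per mesh sweep, well under one minute
```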
Thus the feasibility of the stratagem (2.22),
wherein a major problem occupying most of
one critical facility must be displaced, sectionally, to reduce turnaround time for a
minor problem, is assumed to be dependent
only on the means for dumping core memory,
etc., into a high-data-rate "scratch medium"
such as drums or discs.
From the standpoint of turnaround time,
the availability of modern discs suggests that
the complete loading of a number of problems
can be made from cards and tape to discs
well in advance of actual processing. All
short-run segments, and all problems that
typically require five minutes of machine-room
set-up time and one or two seconds of run
time, can now be processed ambulando. It
should be noted that the loading of information in advance of processing each problem
segment can be effected automatically from
discs more rapidly and with less entailment
of control equipment, via Method 2.22, than
would be the case with the macro-segmenting
(Method 1.3), for the passage-time of macro-segments is not well matched with the access
time of discs and is even more badly matched
with the access times of tapes.
Altogether, from considerations of simplicity of Method 2.22 and of the tanking
advantages of discs, it seems quite practical to permit small jobs to interrupt large
ones and to thereby implement a first level
of priority for jobs of short estimated run
time.
Memory-Protect Considerations
With regard to the ubiquitous problem of
memory protection (which let us discuss in
the limited context of protection of the Operator System program from being overwritten
by a not-yet-debugged user program), Corbató [ibid.] suggested dynamic relocation of
all memory accesses that pick up instructions
or data words. This, in a true multiprogramming system, would consume significant
machine time on a computer that did not
have rather extensive specialized control
hardware. With the straightforward scheme
proposed here, memory protection can be
adequately provided by the addition to a conventional machine of simple boundary registers. For the protection of I/O unit assignments, Corbató [ibid.] suggested the trapping
of all I/O instructions issued by user programs; under the scheme suggested here,
this would be necessary only for the interrupting (small) programs.
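The boundary-register check is simple enough to sketch directly. The following is a hypothetical illustration, not the ITT or Philco implementation; the 4096-word operator area and function name are invented for the example:

```python
# Sketch of boundary-register memory protection: every address touched by a
# user program is compared against a lower and an upper bound set by the
# operator program, and out-of-bounds accesses would trap to it.
def access_ok(addr, lower=4096, upper=32768):
    """True if the user-program access falls inside its assigned region."""
    return lower <= addr < upper

print(access_ok(5000), access_ok(100))  # True False — 100 is in the operator area
```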
Control of Precedence
Assuming that overall review and authorization of problems provides all the filtering
needed in a facility except for within-shift
scheduling, and further assuming that short-run break-in is adopted and discs are utilized, it becomes necessary to answer the
question, "how much running-time should be
allowed for small problems before automatic
program-reversion to large problems is
permitted?" Should the parameter be fixed
or variable? Should it be some interval that
is greater than a few seconds and perhaps
less than five minutes? Is this range too
large?
Using modern high-flushing-rate auxiliary
memory equipment, one can save and replace
the contents of main memory in less than
one second, even on a fairly large computer.
Consider the example of a 32,000-word core
memory machine equipped with a disc backup
memory that positions in a maximum of 100
milliseconds and communicates data at the
rate of 119,000 words per second (8,192 words
per 68-millisecond revolution), with angular
delay of no more than one word-time when
data words are moved in groups of 8,192 or
more. Assume that the disc heads have been
prepositioned to a "home" position by convention at a flagged break-in point in a large
program. Time to refresh main memory
would then be, at most:

T = 32,768/119,000 second + 0.1 second + 32,768/119,000 second = 0.65 second, approximately.
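The refresh-time estimate can be verified numerically from the figures given in the text (32,768-word core, 119,000 words per second disc rate, 100-millisecond maximum positioning delay; dump, reposition, then reload):

```python
# Numerical check of the worst-case main-memory refresh time:
# save core to disc, reposition the heads, and reload core.
words = 32_768
rate_wps = 119_000      # disc transfer rate, words per second
position_s = 0.100      # maximum head-positioning delay

t_refresh = words / rate_wps + position_s + words / rate_wps
print(round(t_refresh, 2))  # 0.65 — seconds, matching the text
```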
Among the parameters of priority, those
that are dependent on equipment required for
each problem segment become less important in a computer complex in which high
capacity discs are included, because the
tape-drives that would be required to serve
as scratch media, in classical complexes,
are now replaceable with areas on the discs.
Likewise in file updating, even the current changes may be kept on discs, as well as the
problem-program and the library. Thus,
except for the propriety of using tapes as a
medium for large history files or reference
files, the functions of tapes are apt to be supplemental to those of discs, rather than vice
versa. One such supplemental preference
for tapes with respect to discs inheres in the
two or three millisecond access time to the
beginning of information blocks that are
already in position for the next reading or
writing action. On the whole, however, the
criticality of I/O availability is considerably
reduced when modern discs are available,
and the number of essential availability
parameters that must be used in a scheduling
calculation is very small.
When there are several problems loaded
into the tape or disc stack, the selection of
the next one to be processed can be based on
a calculation that takes into account the estimated run time. An early, perhaps whimsical, scheme that considered e.r.t., among
other variables, was the North American
(Aviation) "Precedence Program," circa
1955.
NAPP controlled the job stack on an IBM
701 by considering the four factors U =
Urgency, W = Wait time (since problem submitted), B = Business this customer gives
the computing center per month, and r = run
time estimated for this problem. For each
waiting problem, the program calculated
priority and chose the problem having the
highest value of P to be run next, according
to:

P = WUB / r
The value of U was set by reference to a
table established by laboratory management
and changed from day to day or perhaps from
hour to hour. The parameter B was inserted
in order to provide an appropriate indication
of the loudness with which this customer was
able to knock on the computing center door.
The usual first-come, first-served sequencing convention may be looked upon as a degenerate form of this formula, with U, B,
and r held constant.
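The NAPP formula can be exercised on invented job figures; a short job with any urgency at all quickly outranks a long one (the factor values below are illustrative, not from the original program):

```python
# Sketch of the NAPP precedence calculation: P = W*U*B / r,
# and the waiting problem with the highest P runs next.
def napp_priority(urgency, wait_hours, business, est_run_minutes):
    """P = W*U*B / r; larger P means sooner."""
    return wait_hours * urgency * business / est_run_minutes

jobs = {
    "small_urgent": napp_priority(urgency=3, wait_hours=2, business=5, est_run_minutes=1),
    "large_routine": napp_priority(urgency=1, wait_hours=2, business=5, est_run_minutes=60),
}
print(max(jobs, key=jobs.get))  # small_urgent — the short urgent job wins
```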
We propose that two of the above four
factors, Wait Time and Estimated Run Time,
be considered in addition to I/O units required, in establishing run precedence. We
do not propose to consider memory space
requirements, since this scheme does not
require cohabitation of running programs in
main memory. We also propose to provide
some weighting other than linear for the two
times, thus:
P = n log W - m log r,

where n and m are weights given to Wait
Time and Run Time respectively.
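The effect of the logarithmic weighting can be seen in a small numerical sketch (the weights and job times below are illustrative assumptions, chosen to show a short job outranking a long one):

```python
# Sketch of the proposed precedence function P = n*log(W) - m*log(r),
# with wait time W and estimated run time r weighted by n and m.
import math

def precedence(wait, est_run, n=1.0, m=1.0):
    return n * math.log(wait) - m * math.log(est_run)

# Favoring short jobs (heavier m): a 1-minute job that has waited 10 minutes
# outranks a 100-minute job that has waited 60 minutes.
short = precedence(wait=10, est_run=1, n=1.0, m=2.0)
long_ = precedence(wait=60, est_run=100, n=1.0, m=2.0)
print(short > long_)  # True
```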
So far as turnaround time is concerned,
the responsibilities of the Executive Program
can be summarized into three classes:
(a) The computation in advance for each
problem in the input stack of a precedence
number, taking into account the parameters
of priority.
(b) The anticipation, through survey of
the estimates of running time, of the status
of the queue, giving advance notice to operators of when the backlog of little and big
problems is to be replenished.
(c) The providing of advance notice to
operators of need to set up reference file
tapes, during the advance of a large problem
from segment to segment, in accordance with
the computations of (a) and the intra-segment
directions to operators provided by the compiler.
For the case of a true multiprogramming
operation, with some scheme for sequencing
small time-segments of user programs, system efficiency can approach zero for heavy
workload when for some large programs the
loading time becomes large compared to the
run time segment length. Corbató [ibid.]
proposed a scheduling algorithm that guaranteed an operating efficiency of at least 50%
by keeping segment operate time equal to or
greater than load time, and pointed out that
one may determine the longest loading delay
among a number of competing program segments and that, for a given "segment delay,"
the number of users must be limited. Unfortunately, in a typical computing center
environment, it is the completion of a job
rather than the start of its execution that is
of interest; completion time does not seem
to us to be predictable in the general case.
Under the proposed scheme, on the contrary, provided good discipline is maintained
with regard to insertion of flags in interruptable programs and to limitation on the duration of interrupting jobs, it is possible to
permit prediction of worst completion delay,
or turnaround time limit, in terms of the
number of short jobs waiting, by merely
assigning a weight of zero to the coefficient
m in (b) above, as executed by the operator
program. This limit, for j jobs waiting,
would be simply j(tf + ti), where tf is the
maximum permitted time between flags and
ti is the maximum permitted time for any
interrupting job.
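The worst-case bound is trivially computable; with the one-minute figures discussed earlier (used here as illustrative assumptions), five waiting short jobs bound the delay at ten minutes:

```python
# Worst-case turnaround bound: with j short jobs queued, delay <= j*(tf + ti),
# where tf is the maximum interval between break-in flags and ti the maximum
# permitted length of an interrupting job (both in minutes here).
def worst_delay_minutes(j, tf=1.0, ti=1.0):
    return j * (tf + ti)

print(worst_delay_minutes(5))  # 10.0 — minutes, for five waiting short jobs
```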
We feel intuitively that it would be desirable to experiment with weights for the Wait and
Run time coefficients n and m. For the facility that serves only a few dozen short-run
users, it might be best to weight Wait time
much more heavily than Run time, thereby
approximating what Corbató calls "round
robin" service; for the facility that serves
a very large number of users, mean turnaround time must be greater and it will be
desirable to favor short jobs by weighting
Run time heavier.
Since, in this system, the criticality of
I/O equipment scheduling (so far as tapes
are concerned) is relaxed, some of the complexities that would enter into a general
scheduling model are not present. Thus, for
any problem in the stack, it becomes feasible to automatically examine the remaining
factors, complying with the residue of considerations in a model such as given by J.
Heller [2].
In this system, no special reprocessing of
object languages is required in order to
conform memory allocation to the ongoing
problem-mix decisions. Furthermore, we
require no real-time solution of linear models of flow or loading such as that reported
by Totschek and Wood [3], nor are surrogates
for solutions of these models needed when
disc storage is present to provide cushioning.
The Executive Routine is relieved of responsibility for micro-monitoring of error
concatenations that can thread across a set
of problems and a complex of equipment, for
only one large problem segment is processed
at a time; problem independence is fostered.
Thus, in modern computers, the interdependence of hardware within any problem segment
can be kept at a reasonable level while maintenance and problem debugging are simplified.
In partitioned and buffered memory computers, i.e., those incorporating several
memory modules having independent data
and address registers, the inherent parallel
capability is thereby conserved so as to
contribute to speed of processing. This is
in sharp contrast to the usual situation in
micro-segmentation schemes, where memory
partitioning complicates inter-job control
and contributes to program control ricochet.
Implications
The basic idea of multiple-break-in operation from many input/output stations is not
new. Most proposers, however, have advocated either dramatic advances in hardware
or software, or computation complexes of
conventional hardware so large as to be· economically unattractive. McCarthy and associates [4], for example, proposed serving
some dozens of stations for simultaneous
on-line debugging of as many programs, by
using an enormous slow magnetic core memory with a very fast processor complex. One
of their principal concerns was to provide
on-line responsiveness in the system to any
set of queries or inputs emanating from the
array of program-development stations, so
that it seemed that all problem materials
must be immediately accessible in directly-addressable memory. We believe that a
variation of the stratagem of (2.22), adopted
for high-speed processors and discs, could
serve most of these requirements, particularly when individual groupings of problems
can be controlled by individual executive
routines, with occasional call-out from one
group to another.
In this connection, the peak memory traffic load reached when transmitting ten
characters per second per station to or from
100 stations simultaneously is a character rate of only 1000 per second, or 1/250
of the load that a modern tape unit imposes
on a single I/O channel. In a current model*
commercially available computer with each
of four independent memory modules operating at one microsecond full cycle, and
assuming that one full word of memory
would be accessed twice for each character
incoming from these control stations, this
loading would entail 2/250/4 = 1/500 or 0.2%
of the full memory capability. Clearly, the
on-line query of raw information from a
hundred or so stations is not a time-consuming
process for this simple memory complex.
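The fractional arithmetic quoted in the preceding paragraph can be reproduced directly (this simply restates the text's own ratios; the variable names are ours):

```python
# Reproducing the memory-loading arithmetic: station traffic is 1/250 of a
# tape channel's load, doubled for two memory accesses per character and
# spread over four independent memory modules: 2/250/4 = 1/500.
station_fraction_of_tape = 1 / 250   # 1000 chars/s vs. a tape unit's load
loading = 2 * station_fraction_of_tape / 4
print(f"{loading:.4f}")  # 0.0020 — i.e. 0.2% of the full memory capability
```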
The Many-Short-Jobs Workload
Clearly, a production-computation workload that consists entirely of one-minute jobs
is not going to be expedited by a traffic-handling scheme that emphasizes short jobs
at the expense of long ones. This is a real
limitation, for there are many organizations
in which much of the daytime workload consists of brief compile-and-execute jobs.
Even for such organizations, however,
there may be a powerful advantage in the
method of short-run break-in. As discussed
in the previous section, the economic soundness of use of main memory as a buffer for
a multiplicity of input stations appears to be
evident.
Most large computing centers would be
capable of serving an enormous number of
additional users if the minimum time per job
were sharply reduced. In a typical present-day operation, jobs much shorter than one
minute in extent are relatively few in number because people having such work to do
can get it done more promptly in other ways.
With the possibility of achieving reasonably good efficiency for jobs requiring one
second or less of large-scale machine time,
a whole new class of user becomes vulnerable to the wiles of the numerical mountain-mover. It is not difficult to conceive of several thousand jobs per day being done for the
technical staff of a large laboratory, provided
there are a large number of input stations
conveniently located in the manner of reference [4].
In passing, it should be noted that one second (net, ignoring setup, startup, and wrap-up)
of machine time in this age is an item of not
inconsiderable potential value. When used
for such a mundane task as generation of a
table of values for an implicit function, for
example, it could accomplish the equivalent
of months of hand calculation.
Perhaps more to the point, the availability on a few minutes' notice of a tool of such
awesome power can encourage "calculation,
not guesstimation" for problem sizes which
would otherwise not be served at all.
CONCLUSION
We have endeavored to show that the
conceptually simple scheme of short-run
Proceedings-Fall Joint Computer Conference, 1962 / 169
break-in can permit turnaround time for
brief computing jobs to be reduced drastically
without substantial increase in time for any
jobs, including the longest ones. In particular, we have pointed out how one of the basic
objectives of the McCarthy et al. proposal,
to make feasible nearly simultaneous access
by many people to a large computer, can be
met through the application of presently-available hardware and presently-designable
software.
REFERENCES
1. Mock, Owen and Swift, Charles, J., "The
SHARE 709 System: Programmed I/O
Buffering," J. ACM 6, 2, April 1959.
2. Heller, J., "Sequencing Aspects of Multiprogramming," J. ACM, 8, 3, July, 1961.
3. Totschek, R. and Wood, R. C., "An Investigation of Real-Time Solution of the
Transportation Problem," J. ACM, 8, 2,
April, 1961.
4. McCarthy, John, et al., "Report of the
Long Range Computation Study Group,"
(private communication), Massachusetts
Institute of Technology, Cambridge, Massachusetts, April, 1961.
5. Greenfield, Martin N., "Fact Segmentation," Proc. 1962 SJCC, pp. 307-315.
6. Corbato, F. J., et al., "An Experimental
Time-Sharing System," Proc. 1962 SJCC,
pp. 335-344.
REMOTE OPERATION OF A COMPUTER BY HIGH
SPEED DATA LINK
G. L. Baldwin
Bell Telephone Laboratories, Incorporated
Murray Hill, New Jersey
N. E. Snow
Bell Telephone Laboratories, Incorporated
Holmdel, New Jersey
INTRODUCTION
One promising means of attaining data
transmission speeds high enough to be effective with present day computer operation is
the use of wide band facilities provided by the
Bell System TELPAK service offerings.
Almost every industry large enough to make
use of a computer also has need of large numbers of voice telephone circuits between centers of operation. Quite often these circuits
are provided by a TELPAK channel. Alternate use of the entire channel in a continuous
spectrum data transmission system not only
makes high speeds possible, but in many
cases economically attractive.
With the establishment of a new installation at Holmdel, New Jersey, Bell Telephone
Laboratories had an excellent opportunity to
make use of and evaluate an experimental
data transmission service, using a TELPAK
A channel. A TELPAK A service, with appropriate terminal equipment, can be used as
twelve voice circuits or as an equivalent continuous spectrum wide band channel.
Installation of the system was completed
and routine operation begun in February, 1962,
giving the Holmdel Laboratories rapid access
to an IBM 7090 computer at the Murray Hill,
New Jersey, Laboratories.
This paper presents first, a description
of the system, and second, the concepts
under which it was devised with an evaluation
of the operational results. As a part of the
evaluation, an attempt is made to point out
the limitations in usefulness of such a system and the inherent qualities, both good and
bad.
Description of Experimental System
A functional block diagram of the data
transmission system is shown in Figure 1.
Basically it is a magnetic core to magnetic
core system used primarily for tape-to-tape
transmission. IBM input-output and transmission control equipment are utilized with
Bell System experimental data sets, prototype N-2 telephone carrier terminals, and a
specially engineered type N-1 carrier repeatered line facility. A discussion of each of
these system components follows.
[Figure 1. Block Diagram-Experimental Murray Hill-Holmdel Data Link. At each end, an IBM 729 tape drive, an IBM 1401 computer, and an IBM 7287 control connect through a local loop (15000 feet of 19 gauge non-loaded pairs at Holmdel) to an N-2 carrier terminal at the Murray Hill and Holmdel carrier central offices, which are joined by a repeatered N-1 carrier line (11 repeaters).]
Data Link Input-Output Equipment
Tape drives are of the IBM 729 type operating under control of IBM 1401 computers.
Both computers and tape drives are used in
routine data processing operations when not
connected in the data transmission configuration. While transmission is in progress, the
tape drive at either end of the system is under
control of the local 1401 and subsequently
the transmission control unit, an IBM 7287
Data Communication Unit.
The system is designed for transmission
of data records or "blocks" with error detection and reply between transmissions. Data
is read from magnetic tape one record at a
time. Parity checks are made as the data is
read into the 1401 core storage, from where
it may be clocked at a synchronous rate for
transmission.
The IBM 7287 transmission control unit is
arranged in the system to clock data from
the 1401 storage under control of the timing
signal supplied by the data set (timing could
be supplied by the 7287 if not supplied by the
data set). Upon receiving data from storage
the 7287 performs parity check, code translation from seven bit to four-out-of-eight
fixed count, serializes the data and delivers
a binary dc signal acceptable at the data set
interface. In addition, for each record transmitted, it generates and adds record identification, start-of-record, end-of-record, and
longitudinal redundancy check characters.
The receiving 7287 receives serial data
from the data set, performs character and
longitudinal error detection, code translates
back to seven bit characters and delivers in
parallel to the receiving 1401. Upon completion of receiving and checking a record, a
digital control signal is returned via the
reverse direction of transmission to the
transmitting 7287. The control signal identifies the record received and indicates that
the record passed all error checks or failed
and should be retransmitted. At this point
the transmitting end may be in either of two
conditions. In the first operating mode the
1401 may have already read the next record
from tape and transmission may continue if
no error is indicated. If an error is indicated,
it is then necessary to back the tape up two
records and read the record again for retransmission. The second operating mode (determined by choice of 1401 program) holds each
record in 1401 storage and continues
retransmitting until a "no error" reply is
received, and then the next record is read
from tape. The more efficient mode of operation is, of course, dependent upon the transmission error rate.
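The tradeoff between the two operating modes can be sketched as a simple expected-time model. The timing parameters and the backspace-penalty model below are illustrative assumptions, not measured figures from the link.

```python
def expected_time_hold(t_tx, t_read, p_err):
    # Mode 2: hold the record in 1401 storage and retransmit until a
    # "no error" reply arrives. The tape is read once; the expected
    # number of transmissions is 1/(1 - p_err).
    return t_read + t_tx / (1.0 - p_err)

def expected_time_readahead(t_tx, t_read, t_backspace, p_err):
    # Mode 1: read the next record from tape while transmitting; on an
    # error reply, back the tape up two records and re-read before
    # retransmitting. The penalty model is a hypothetical simplification.
    retries = p_err / (1.0 - p_err)            # expected failed transmissions
    return max(t_tx, t_read) + retries * (t_backspace + t_read + t_tx)

# At a low error rate, read-ahead wins by overlapping tape reads with
# transmission; at a high error rate, holding the record wins because it
# avoids the tape-backspace penalty on every retransmission.
for p in (0.0003, 0.3):
    hold = expected_time_hold(t_tx=0.2, t_read=0.05, p_err=p)
    ahead = expected_time_readahead(t_tx=0.2, t_read=0.05, t_backspace=0.1, p_err=p)
    print(p, round(hold, 4), round(ahead, 4))
```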
When a record passes all 7287 error
checks and parity checks in the receiving
1401, the record is delivered to the receiving
729 tape drive, completing the tape-to-tape
transmission.
In order to achieve maximum efficiency
it is necessary to maintain character synchronization continuously in both directions
of transmission, rather than re-establish
synchronism for each record or reply transmission. To accomplish this, the 7287 transmits periodically (once each 500 milliseconds) a short interval (approximately 10
milliseconds) of a character synchronization
pattern either between record transmissions
or between reply transmissions. Continuous
repetition of a synchronization pattern must
receive careful design consideration (as will
be explained) for transmission interference
reasons in this or similar systems.
Data Set
Experimental Bell System X301A (M-1)
Data Sets used in the system are designed for
serial transmission at a synchronous rate of
42,000 bits per second. The principles of
operation are the same as in the Bell System
201A Data Set (commonly referred to as a
"four phase data set") currently providing.
DATA-PHONE service on voice circuits.
The data set employs quaternary phase
modulation with differential synchronous
detection. Data delivered serially to the
transmitter is encoded two bits (a "dibit") at
a time into a phase shift of an 84 KC carrier.
For the four possible dibits (11,00,01,10) the
phase of the carrier transmitted during a dibit
time interval is shifted by 1, 3, 5, or 7 times
π/4 radians with respect to the carrier phase
during the previous dibit time interval. The
21,000 dibit per second modulation results
in a line signal spectrum symmetrical about
the carrier frequency in the 63 KC to 105 KC
band.
At the receiver, dibit timing is recovered
directly from sideband components of the line
signal. This timing is then used in the demodulation process. Data is recovered by
detecting and decoding the phase relationship
between the previous dibit interval of line
172 / Remote Operation of a Computer by High Speed Data Link
signal (available from a one dibit delay line)
and the present dibit interval of line signal.
The recovered data with a synchronized bit
timing signal (generated from the recovered
dibit timing) is then delivered at the receiver
output.
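The differential dibit encoding and decoding described above can be sketched as follows. The particular assignment of dibits to phase shifts is an assumption for illustration; the text specifies only the set of shifts (1, 3, 5, or 7 times π/4) for the four dibits.

```python
import math

# Hypothetical dibit-to-phase-shift table: each dibit advances the
# carrier phase by an odd multiple of pi/4, as the text describes.
DIBIT_SHIFT = {(1, 1): 1, (0, 0): 3, (0, 1): 5, (1, 0): 7}

def modulate(bits):
    """Return the cumulative carrier phase (radians) after each dibit."""
    phase, phases = 0.0, []
    for i in range(0, len(bits), 2):
        dibit = (bits[i], bits[i + 1])
        phase = (phase + DIBIT_SHIFT[dibit] * math.pi / 4) % (2 * math.pi)
        phases.append(phase)
    return phases

def demodulate(phases):
    """Recover bits by decoding phase differences between successive dibit
    intervals, as the differential synchronous detector does."""
    inv = {v: k for k, v in DIBIT_SHIFT.items()}
    bits, prev = [], 0.0
    for ph in phases:
        step = round(((ph - prev) % (2 * math.pi)) / (math.pi / 4))
        bits.extend(inv[step])
        prev = ph
    return bits

bits = [1, 1, 0, 0, 0, 1, 1, 0]
print(demodulate(modulate(bits)))   # round-trips to the original bits
```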
Data, timing, and control circuits all appear on one ganged coaxial connector on the
rear of the data set chassis. Interface circuits are designed to drive low impedance
(90-120 ohm) loads or to terminate similar
circuits.
Operation of the X301A (M-1) Data Set
differs somewhat from that of voice band sets
using the same modulation technique. It is
undesirable, in the idle condition, to transmit
a signal corresponding to a repeated bit pattern. Under this condition the line signal
spectrum contains high level single frequency
components which may result in crosstalk
into other telephone carrier systems operating in the same cable. At the same time it is
desirable to maintain continuous bit synchronization, requiring continuous transmission
of a line signal. A compromise solution is
necessary.
The data set interface provides Send Request and Clear to Send control circuits. A
Send Request "on" signal presented to the
data set results in a Clear to Send "on" signal being returned to the data source device,
and data will then be accepted on the Send
Data interface circuit. When the Send Request
and Clear to Send control circuits are in the
"off" condition, the data set will not accept
data on the Send Data circuit but generates a
line signal automatically, corresponding to a
repeated "1000" bit pattern correctly related
to the dibit timing signal. This results in the
most desirable (lowest level single frequency
components) "idling" line signal possible.
Continuous transmission also enables the
receiver to maintain bit synchronization between data, reply, or character synchronization transmissions.
N-2 Carrier Terminal
Telephone carrier terminals used in the
system are prototype models of the type N-2
transistorized system designed to provide
twelve two-way voice channels. A prototype
N-2 data channel unit replaces plug-in voice
channel units. The system then handles a
two-way wide band data channel. The data
channel unit serves to adjust signal levels
and modulate the data signal from the data
set into the frequency band vacated by the
voice channels removed. The data channel
spectrum is then modulated by the N-2 terminal group circuitry into the proper frequency band for transmission to the carrier
line.
N-1 Carrier Line
The type N-1 carrier line utilized for this
system is of the same type widely used in
providing carrier telephone circuits throughout the Bell System, insofar as equipment and
cable facilities are concerned. This particular line is specially designed to minimize
noise (e.g., short repeater sections). The
same design is required in many military
services. Not all N-1 carrier lines in service meet the necessary noise requirements,
but can be made to do so with additional engineering and construction effort.
The N-1 line between Murray Hill and
Holmdel, New Jersey, is approximately thirty
miles in length, short enough so that no phase
or amplitude equalization is required. It is
estimated that equalization will become a
necessity in the order of one hundred miles
of repeatered line.
Transmitted signal levels into the N-2
terminal and N-1 carrier line must, on a
particular system, be a compromise between
deriving a signal-to-noise ratio yielding
satisfactory data error rates and keeping
interference into adjacent systems in the
same cable at a minimum.
Local Loops
Data signals are transmitted from and received at the data set over non-loaded telephone cable pairs. The experimental system
described here utilizes 3000 feet of 26 gauge
pairs between the Murray Hill Laboratories
and the Murray Hill central office. At Holmdel
15000 feet of 19 gauge pairs are used. Although it is not expected to be universally
true, it proved necessary in this system to
use double shielded pairs between the building entrance cable termination and the computing center data set location to avoid inductive pickup of interfering signals.
Need for System-Initial Concepts of Design
and Usage
In late 1959, when we were doing the initial
planning for computing equipment at Holmdel,
we had been told that within about six months
of initial occupancy, the Holmdel buildings
would house a total of some twenty-five hundred people. Since these people were to be
transferred primarily from our New Jersey
Laboratories at Murray Hill and Whippany,
many of them at the time of relocation would
be in the middle of projects requiring use of
a computer. In order to provide a capability
roughly equal to that at Murray Hill, we considered the following alternatives:
Installation of a 7090 at Holmdel
This we realized to be an ultimate requirement, and a 7090 is presently scheduled for
installation in the latter part of the year.
However, the total anticipated load during the
first six months or so of occupancy did not
justify earlier installation. A computing facility of this size costs of the order of $100
thousand per month, and it is almost out of
the question from an economic point of view
to install one without a nearly full prime shift
load.
Install a Smaller Machine in the Interim
Period
This is an alternative which we dismissed
at once. Our programming costs are of the
same order of magnitude as the computer
operating costs, and we just could not afford
the reprogramming effort entailed.
Use a Station-Wagon Data Link
This was the most attractive alternative
from a point of view of economy, and we have
even gone so far as to provide backup for the
automatic data link by a truck which makes
several regularly scheduled round trips per
day between Murray Hill and Holmdel in the
event of data link failure. However, the truck
scheduling problems here were such that the
service, in terms of turn-around time on a
typical job, would not be good enough for a continuing operation.
A Voice-Bandwidth Data Link
There were several commercially available automatic data transmission facilities
which operated at speeds up to 2400 bits per
second on voice bandwidth lines. However,
these were all too slow to give adequate
service under heavy load conditions. Since we
were sure of several hours of 7090 usage per
day, the delays in such a facility would result in about the same grade of service as the
station-wagon.
A Tape-Speed Data Link
At the time we were making our plans for
Holmdel, a microwave tape-to-tape data link
was in operation on the West Coast. Such a
system would provide higher speeds and some
operational benefits. Since, however, we were
really interested in coverage only over a
fairly brief interval, installation of a microwave transmission system made this alternative unattractive, from a cost viewpoint
alone.
The TELPAK A Data Link
This, of course, was our final choice. Its
main attractions were that it operates over
transmission facilities which are available in
fairly large quantity in the Bell System toll
plant throughout the country, its cost is reasonable, and its operating speed such as to
increase the total job processing time by only
about twenty-five percent. Furthermore,
being in the business of communication, we
felt this to be an extremely worthwhile experiment in the field of data transmission.
In forming our initial concepts of the
TELPAK A data link, we knew that at the
time of its installation the Murray Hill Computation Center would include a 7090 supported by three 1401 computers as peripheral
equipment. At Holmdel we needed at least
one 1401 to read and write the tapes to be
transmitted over the data link and to read
cards, print, and punch. Since the signalling
rate of the data link was limited to the order
of 40 kilobits per second, we required a buffer
at each end to compensate for the difference
between tape and transmission speeds. It was
quite natural then to examine, together with
people from IBM, the possibility of using the
1401 computer at each location for both buffer
storage and control.
Since we have stored program computers
at each end of the data link, there is a great
deal of flexibility in matters of tape format,
block size, coding scheme and the like. However, virtually all of our operating experience
has been with tapes written in the format
peculiar to our 7090 monitor program,
BE-SYS-4. In reference to the particular
options possible with IBM tape transports,
this format involves binary (odd parity), high
density (556 characters per inch) records of
variable length which does not exceed
1000 characters. As far as the data link is
concerned, our only limitation on block size
is the buffer space we have reserved in the
1401 memory. We have 8000 characters of
core storage in all of our 1401's and the storage required for the transmission program
itself is about 1100 characters, so that it is
possible for us to work with a much larger
block size.
It is typical of a buffered transmission
scheme that the average effective data rate
is low for both very short and very long
records. With very short records a large
fraction of time is spent in "overhead": starting and stopping the tapes and transmitting acknowledgements. Conversely, for
very long records, there is a higher probability of a parity error in transmission of a
record and high penalty in time for retransmission. In between these two extremes there
is a fairly broad optimum. In terms of our
particular experience a block of 1000 characters is the optimum length for a fairly high
transmission noise level, say, one error in
fifty thousand characters. On the other hand,
even in a noise-free system it gives an average transmission rate of about 90% of the
maximum possible. In this sense, then, the
block size associated with our normal tape
format is quite satisfactory. Secondly, the
uniformity of the physical appearance of all
information on tape has simplified the 1401
program normally used for operation of the
data link to the point where we have made it
a part of our standard 1401 program used for
card-to-tape and tape-to-card/print operations. This in turn allows us to switch the
1401 from local operation to transmission
without loading a separate program.
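The broad optimum in block size can be illustrated with a simple throughput model. The 100-character overhead figure and the stop-and-wait retransmission model are assumptions; the character error rate is taken from the text's "one error in fifty thousand characters" example.

```python
def effective_rate(block_chars, char_error_rate, overhead_chars=100):
    # Fraction of line time carrying payload in a stop-and-wait scheme:
    # each block carries a fixed overhead (start/end-of-record, check
    # characters, acknowledgement turnaround), and any character error
    # forces retransmission of the whole block. The overhead figure is
    # an illustrative assumption, not a measured value.
    p_block = 1.0 - (1.0 - char_error_rate) ** block_chars
    payload_fraction = block_chars / (block_chars + overhead_chars)
    return payload_fraction * (1.0 - p_block)

# Very short blocks lose to overhead; very long blocks lose to
# retransmission; 1000-character blocks sit near the broad optimum.
for size in (100, 1000, 10000, 100000):
    print(size, round(effective_rate(size, 1 / 50000), 3))
```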
Operation of System
Let us now look at our method of operation of the data link in more detail. First,
at Holmdel, a "batch" of programs is loaded
into the 1401 using our standard card-to-tape
program, thereby producing a 7090 input tape.
This is rewound, and the 1401 program is
altered (by sense switches) to transmit this
tape to the receiving 1401 at Murray Hill.
The tape is then sent to Murray Hill, and at
the end of transmission the duplicate tape is
rewound and the tape transport is switched
electrically from the 1401 to the 7090. When
the computer becomes available, the monitor
program on the 7090 reads the tape, executing the various programs as they appear and
generating results, again batched on a single
output tape. At the completion of all jobs the
output tape is rewound, switched electrically
to the 1401, and transmitted to Holmdel.
Finally the tape received at Holmdel is rewound and processed by our standard 1401
program to produce printed output and punched
cards. Although operation over the data link
in this manner involves the extra tape spinning (twice forward, two rewinds) required
for transmission, the entire process can be
carried out without mounting or dismounting
a tape. In practice, we do move the data link
output tape from its tape transport at Holmdel
to one on an adjacent 1401, to allow printing
and transmission to proceed simultaneously.
One very nice feature of the buffered transmission scheme is that we have complete
freedom in the choice of 1401 input/output
equipment to be used. At the time of writing
we are using 729 Model II tape transports at
Holmdel and 729 Model IV transports at
Murray Hill; these differ in their operating
speeds (75 and 112 inches per second, respectively). It is quite possible, for example, to
go directly from cards at one end of the link
to tape at the other, although we do not normally do so because of the added time this
ties up both 1401's. Indeed our standard program in the receiving 1401 scans the data for
certain 7090 monitor control cards and prints
these immediately on the 1403 printer while
transcribing them as well on tape. This then
provides the computer operators with a summary of the jobs to be processed as well as
any unusual instructions as to how they should
be treated.
In a typical day's use, we transmit over
20 input tapes from Holmdel to Murray Hill,
on a twice-an-hour schedule between 9:15 and
4:45, with transmissions in the evening shifts
as required by the load. At the time of writing our daily load from Holmdel is about five
hours of 7090 time, during which time we
process typically 130 separate jobs. Roughly
one hundred of these jobs are processed during the prime shift (9 a.m. to 5 p.m.); during
this same eight hour period we typically
process 150 jobs which originate at Murray
Hill. The turn-around time (time from
submission of a program to the Holmdel
Computation Center to delivery of printed
output) is generally between two and four
hours on jobs which require five minutes or
less on the 7090. This is limited primarily
by the printing capacity of the Holmdel 1401's,
since the groups which were transferred to
Holmdel are working on jobs which generate
a considerable amount of output. We have
handled some high priority runs for one part
of the Telstar project giving less than half-hour turn-around at Holmdel without disrupting our flow of work through the 7090 (aside,
of course, from the 7090 time used for actual
execution of the program).
Operating Reliability and Results
As far as reliability is concerned, the link
has gone down, in the period February 15
through July 15, a total of about eight times: once for the data set, once for the IBM
translator and a half dozen times for the
transmission facilities. These facility failures were isolated to one cable section and
the problem was eliminated by changing to
different pairs in the same cable. When the
data link is working correctly, we have experienced an average retransmission rate
due to noise of one error per three thousand
records (see Figure 2).
[Figure 2. Error Rate Vs. Time of Day. Curves show errors per million characters in each direction of transmission (MH TO HO and HO TO MH) as a function of time of day.]
Preliminary Conclusions-Areas of Usefulness, Qualities
Our primary use of the Data Link was to
provide rapid yet economical access to a
large computer from a remote location. The
desirability of the data link for this use depends on a number of factors:
The Load at the Remote Location
Since this use of the data link requires a
1401 at the remote location, not only to transmit data, but also to do normal card-to-tape
and tape-to-print/punch operation, it is expensive. A remote facility, including a 1401,
a tape unit, auxiliary keypunches and associated equipment, furniture, staff and space,
would cost of the order of $10 thousand per
month. Reasonable economy dictates that
this cost be spread over a load of at least
thirty or forty hours of 7090 usage. On the
other hand, a monthly load of over 150 hours
would justify the installation of a separate
7090. Therefore, this use of the data link
depends strongly on bounding the load within
a fairly critical range.
Anticipated Growth of the Load
Clearly, one of the competitors of remote
operation of a large computer over a data
link is the installation of an on-premises
machine which is smaller and less expensive.
The desirability of this course of action depends on what future loads are antiCipated.
That is, if we expect the load to be fairly
constant with time the separate smaller computer is probably cheaper and more attractive. On the other hand, if the load is expected
to grow to the point where a large machine
will be justified within one or two years, then
the saving in reprogramming and re-training
of programmers might well repay the added
dollar cost of the data link many times over.
This certainly has been true in our own case.
It might be added here that competition from
small machines is purely on an economic
basis. As soon as the industry can develop
really inexpensive peripheral printers,
readers, and punches, the data link will be
much more attractive.
The Need for Good Service
Certainly the least expensive remote use
of a computer, with today's technology, involves the use of cars, trucks, telephones
and the U. S. Mail. These, however, give a
grade of service whereby it takes a day or
more to return results on any given run. In
a situation where the computer is being used
primarily for production runs on a predictable
schedule, this grade of service is often more
than satisfactory. However, when a large
part of the load consists of checkout and running of programs which are required in scientific and engineering projects, such economies in operating costs are far outweighed
by ineffective use of technical personnel and
by project delays.
Upon installation of a 7090 at Holmdel, we
plan to continue use of the data link for purposes of load balancing as well as for protection of both computation centers in the event
of temporary unavailability of one of the
7090's. Although this is a far less compelling motivation than that of remote computer
operation, the price comes down at the same
time, for now we have 1401 computers, operating personnel and everything else at both
locations anyway.
As with anything in this world except good
bourbon, the data link does have some recognized deficiencies. In the first place, because
the operation is essentially tape-to-tape, we
find ourselves winding and rewinding tapes
twice more than we ordinarily do when operating a 7090 locally. These added operations,
although quite simple, are enough to encourage us to do more batching of jobs than otherwise. This, in turn, increases the turn-around
time and requires more attention of the operators.
As far as speed is concerned, it would be
nice to have it a good deal faster, for we find
ourselves tying up the two 1401's for roughly
thirty hours a month just transmitting tapes.
On the other hand, this is not a really serious
deficiency, since for each minute we spend
transmitting data we generally spend five
more in printing it.
The major limitation on the data link as
we have used it is its cost, and this is really
not a reflection on the cost of data transmission, but rather on the cost of the supporting equipment and personnel at the remote location. If we could bring the total
price of a remote location down to one or two
thousand dollars per month and yet retain the
input-output speed of the 1401 card reader
and printer, then we would be able to justify
a remote operation with a monthly load of
well under ten hours of 7090 time. It is felt
to be but a matter of time until the proper
equipment is developed, but it certainly is
not possible today.
STANDARDIZATION IN COMPUTERS AND
INFORMATION PROCESSING
C. A. Phillips and R. E. Utman
Business Equipment Manufacturers Association
New York, New York
The data processing standardization program is a comparatively new effort since it
had its genesis in an action by the International Organization for Standards (ISO) in
late 1959. On a recommendation from Sweden, ISO decided there was a need for a
standards program in connection with computers and information processing. It seems
rather remarkable that the need for a standards program was recognized so early in the
state of an art whose principal tool, the electronic computer, is only fifteen years old
this year. Before getting into the specifics
of this program, let us first consider very
briefly, standardization as a process, and its
effect upon our lives. The Encyclopaedia
Britannica describes standardization as a
continuing process to establish measurable
or recognizable degrees of uniformity, accuracy or excellence, or an accepted state
in that process. It goes on to point out that
man's accomplishments in this direction
pale into insignificance when compared with
standards in nature, without which we would
be unable to recognize and classify within a
species, the many kinds of plants, fishes,
birds or animals. Without such standardization in the human body, physicians would
not know whether an individual possessed
certain organs, where to look for them, or
how to diagnose or treat disease. To further
quote the Encyclopaedia "without nature's
standards there could be no organized
society, no education and no physicians;
each depends upon underlying comparable
similarities." Although we are inclined to
think of man-made standards as relating
principally to such things as weights and
measures, money, energy, power or other
material commodities, you will also find
standards in social customs, in codes, procedures, specifications and time, to name a
few. Standardization is important to geography, photography, chemistry, pharmacy,
safety, education, games, sports, music,
ethics and religion. The profession of accounting, for example, is largely dependent
upon standards-which are generally referred to as "accepted practice."
In fact, it is "accepted practice" that usually generates standards-many of which may
be unwritten, simple and crude, while at the
other end we have standards that are specified in great detail, nationally accepted and
used, and, in many cases, subject to legal
definition.
There is no question that industrial activity thrives on standardization. It has been
argued with strong support, that industrial
standardization is the dynamic force that, in
a sense, created our modern Western economy. There is no question that industrial
standardization is the cornerstone of our
mass-production methods, which, in turn, are
such a vital part of our American economy.
All of the industrially advanced countries of
the world have their own national standards,
and in many of them, standardization is
whatever the government decrees-at least
that is the case in Soviet Russia and its
178 / Standardization in Computers and Information Processing
satellites. Russia has over 8,500 standards
in effect; Germany, over 11,000; France
about 4,500. The United Kingdom has approximately 4,000 British Standards available for use. The United States has approximately 2,000 American Standards approved
by our voluntary national standards body, the
American Standards Association.
The multiplicity of standards-making
groups and the frequent duplication of effort
by several groups having a kindred problem
led to the founding of the American Standards
Association (ASA) in 1918. During World
War I, the need for eliminating conflicting
standards and duplication of work became
urgent. Several engineering societies, together with the War, Navy and Commerce
Departments, established the American Engineering Standards Committee which was
reorganized in 1928 and renamed the American Standards Association.
Today, the ASA federation consists of 126
national organizations, supported by approximately 2,200 companies. Over the years,
ASA has evolved a set of procedures that
apply checks and balances to assure that a
national consensus supports every standard
approved as an American Standard by ASA.
By the terms of its constitution, ASA is not
permitted to develop standards, but instead,
acts as a catalyst by aiding the different
elements of the economy to obtain a desired
standards action through the established
procedures.
In 1946 some one hundred top leaders in
business and industry entered into a formal
agreement with the Secretary of Commerce
to broaden the scope and activities of the
ASA. Along with the American Society for
Testing Materials, ASA is now reviewing
federal specifications to bring them into line
with the best industry practice. Today the
federal government is following a policy of
using industry standards rather than writing
its own and ASA has become a focal point of
cooperation in standards work between government and industry.
The Department of Defense, by specific
Directive, has authorized its personnel to
participate in ASA activities as voting members and, at present, no fewer than 25
federal agencies and in excess of 600 government representatives are participating in
the work of ASA committees. The National
Bureau of Standards accounts for many of
these committee posts.
As a means of avoiding or eliminating differences among national standards, which
sometimes may be an even greater trade barrier than import quotas or high tariffs, 44
nations have joined together in a world-wide
non-government federation of national standards bodies known as the International Organization for Standards, or ISO. The objective is to coordinate the national standards
in various fields by means of ISO Recommendations, which are then available for
voluntary adoption by all countries. In the
electrical field, international standardization
is conducted through the International Electrotechnical Commission (IEC) which is an
independent division of ISO, made up of
national committees in 34 countries. The
American Standards Association is the USA
member of ISO and the U. S. National Committee of the IEC is an arm of ASA.
With this very general background of
standards practices and organization, let us
next look at the standards program in the
field of computers and information processing under three general headings:
1st - the relationship of this program to
the international and national standards organizations and the manner
in which the effort has been organized and is being directed;
2nd - the membership of the various
groups participating in the program;
3rd - the scope of the overall program
and its various subdivisions, along
with the approach in each case and
a brief report on progress.
Coming out of the 1959 meeting of ISO,
previously referred to, was the assignment
by ISO to the United States of overall responsibility for the program's conduct. A chart
reflecting the overall organizational structure would show at the top two parent
organizations: ISO, from the international
level, and ASA from the national level. Following established procedures, ASA assigned
the program to a sponsor, which is usually a
trade association with a direct interest in
the subject and a willingness to undertake
the effort. In this case, the Office Equipment Manufacturers Institute, which later
became the Business Equipment Manufacturers Association or BEMA, was the logical organization to be given the responsibility as sponsoring activity. Under ASA
procedures, the sponsor organizes the project, subdividing it as necessary, and finances
Proceedings-Fall Joint Computer Conference, 1962 / 179
the full time staff and other direct costs incident to the program.
Each major project under a sponsor is
referred to as a Sectional Committee. ASA
has an identification system of letters and
numbers for these Sectional Committees,
and under this system the data processing
standards project became known as the X-3
Sectional Committee.
It should be mentioned that a concurrent
project that is concerned with standards in
the office machines area was also assigned
to BEMA as sponsor and is identified as the
X-4 Sectional Committee. This breakdown
into the X-3 and X-4 Sectional Committees
coincides with the organization of BEMA into
semi-autonomous groups known as the Data
Processing Group, with responsibility for
X-3, and the Office Machines Group, with
responsibility for the X-4 Project.
Closely related to the X-3 Sectional Committee is another Sectional Committee identified as X-6, which was established by ASA
under the sponsorship of the Electronic Industries Association (EIA) for consideration
of those aspects of the standards program in
data processing which are purely electrical
as distinguished from the logical or other
physical characteristics, which are the responsibility of X-3.
It would be well at this point to consider
further the role of ASA in relation to the
Sectional Committees. As the various Sectional Committees develop recommendations
through various sub-committees, they go
through an approval process at the Sectional
Committee level and are then submitted to
ASA. The ASA will review the proposed
standards and the supporting data and reach
a judgement as to whether or not a consensus exists for such a standard. It may be
refused on a single negative vote or approved
with several dissents.
The X-3 Sectional Committee is made up
of three major groups with approximately
the same number of members in each group.
These groups are known as the Users Group,
the General Interest Group and the Manufacturers Group and are made up, for the most
part, of representatives of trade associations, professional or technical societies or
other bodies having a direct interest in the
subject. The members of the Manufacturers
Group are selected from the BEMA membership by the Engineering Committee of the
Data Processing Group/BEMA, which is
charged with direct responsibility (within
BEMA) for general direction of the standards
program. At the present time, the X-3 Sectional Committee is chaired by a staff member of the Data Processing Group of BEMA.
The General Interest Group of the X-3
Sectional Committee is made up, for the most
part, of organizations or societies related by
professional background or interest. They
include the Association for Computing Machinery (ACM), the American Management
Association (AMA), the Electronic Industries
Association (EIA), the Engineers Joint Council (EJC), the Institute of Radio Engineers
(IRE), the Association of Consulting Management Engineers (ACME), the National Machine Accountants Association (NMAA) and the Telephone
Group. The Department of Defense is also
represented in the General Interest Group.
The Users Group is made up of associations that have a common interest as to type
of business. They include the Air Transport
Association (ATA), the American Bankers
Association (ABA), the American Petroleum
Institute (API), the Insurance Accounting and
Statistical Association, the Joint Users Group
(JUG), the Life Office Management Association (LOMA), the National Retail Merchants
Association (NRMA) with the American Gas
Association and the Edison Electric Institute
holding a joint membership. The General
Services Administration represents the Federal Government in the Users Group.
Representing the Manufacturers Group
are ten companies, some manufacturing complete data processing systems, while others
manufacture devices used in conjunction with
data processing systems. The companies
representing BEMA are: Burroughs Corporation, International Business Machines
Corporation, Minneapolis-Honeywell EDPD,
Monroe Calculating Machine Company, National Cash Register Company, Pitney-Bowes
Inc., Radio Corporation of America, Remington Rand Division of Sperry Rand, Royal
McBee Corporation, and Standard Register
Company.
ASA Procedures require that the terms
of reference under which a Sectional Committee operates shall be clearly set forth in
a statement of scope, which might be called
a "charter." The language used to describe
the scope of the X-3 Sectional Committee is
as follows:
"Standardization of the terminology,
program description, programming
languages, communication characteristics, and physical (non-electrical) characteristics of computers and data processing devices, equipments and systems."
You will note the specific exclusion of electrical characteristics which, as previously
mentioned, have been assigned to the X-6
Committee under sponsorship of EIA.
The very broad scope of the X-3 Sectional
Committee has been subdivided into seven
parts or subcommittees, all having one thing
in common-they are dealing with problems
of communication. The first subcommittee,
X-3.1, is concerned with Optical Character
Recognition, X-3.2 is concerned with Coded
Character Sets and Data Format, and X-3.7
is concerned with Magnetic Ink Character
Recognition. You will note that these three
deal primarily with input and output problems which might be described as communications between men and machines. Another
concerned with communications between men
and machines is X-3.4 on Common Problem
Oriented Programming Language. The X-3.3
subcommittee is concerned with data transmission problems which might be described
as communications between machines. X-3.5,
concerned with Terminology and Glossary,
and X-3.6, concerned with Problem Definition
and Analysis, represent problems of communication between men about machines. These
seven subcommittees are chaired by representatives of the companies that comprise the
Manufacturers Group together with one chairman from the Navy Department and one from
the General Interest Group representing the
Association for Computing Machinery.
In numeric order, let us next examine
the scope, the approach and the progress of
each of the subcommittees of X-3.
The scope of the X-3.1 subcommittee has
been defined as the development of humanly
legible character sets for use as input/output
for data processing systems and the interchange of information between data processing and associated equipment. Considerable
work has been done over the past few years
in this field by the Retail Merchants Association and others, which has been followed up
and expanded upon by X-3.1. Initially, the
work of this group has been concentrated in
the numeric area-which is so badly needed,
and at the same time is probably easier to
develop. As you probably know, there are
several optical readers on the market today,
all using their own unique character font,
and a standard font, both for numbers and
letters, could do much to advance the state
of the art. This group has a two-pronged
problem-if the standards are set low in
quality as to format, size, density or other
printing characteristics, the optical reader
will be comparatively expensive to produce.
If, on the other hand, the standards are set
high, the reader may be cheaper, but the
printing devices and imaging media may be
higher in cost. Achieving a proper balance
is the big problem confronting this subcommittee.
The X-3.1 subcommittee approached their
problem by dividing the work among three
task groups. The first group will determine
the proposed measurements, specifications
and terminology for the font; the second
group is concerned primarily with printing
capabilities and the parameters of printing
devices; while the third group will study retail requirements and priorities and will
evaluate other requirements with similar or
different problems. Interest and participation in the X-3.1 effort has been very high
with about 50 people working actively. They
have been holding meetings every 4 to 6
weeks and are confident of measurable progress within the current year. Hopefully, they
will have a numeric font ready for consideration soon.
The scope of the X-3.2 subcommittee provides that they will develop standards for
coded character sets and data record formats to facilitate communications both within
and between data processing systems. Here
the problem is primarily machine to machine
communication, rather than man to machine,
as with X-3.1. Today there are over 65 different machine codes used world-wide, with
over 50 different ones used in the United
States. Frequently the difference between
these codes may appear to be minor, although such differences may have a major
impact. One example is the order in which the
code sequence places the numbers and
letters. Some codes put the alphabetic characters first, followed by the numbers, then
followed by the machine-function codes such
as: carriage return, back-space, uppercase, lower-case, etc. Other codes change
this order or reverse it. There is also the
problem of a standard representation of the
characters in the bit structure of the magnetic tape, or the punched holes in cards or
tape.
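The practical effect of such code-order differences can be sketched with two hypothetical code assignments; neither corresponds to any actual machine code of the period, and the sample data are invented for illustration:

```python
# Two hypothetical code assignments that differ only in whether the
# alphabetics or the numerics come first in the collating sequence.
code_a = {ch: i for i, ch in enumerate("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")}
code_b = {ch: i for i, ch in enumerate("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")}

data = ["B7", "7B", "A1", "1A"]

# A machine sort simply compares code values, so the same data collate
# differently under the two assignments.
sort_a = sorted(data, key=lambda s: [code_a[c] for c in s])
sort_b = sorted(data, key=lambda s: [code_b[c] for c in s])

print(sort_a)  # alphabetics collate ahead of numerics
print(sort_b)  # numerics collate ahead of alphabetics
```

A file sequenced under one code and processed under the other thus appears out of order, which is one way a seemingly minor difference between codes can have a major impact.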
Obviously, if standards proposed by this
country are to be accepted internationally,
they must make provision for alphabets other
than English, and must not differ too greatly
from codes used by European or other
countries.
The approach adopted by the X-3.2 subcommittee is in three phases-the first task
group is to determine the alphanumeric characters, symbols and control characters desired; the second group will develop the
detailed code representation for such characters, symbols and functions; and the third
group will develop a standard format for
utilizing the standard coding. The first two
of these groups have met their targets-with
what accuracy is yet to be determined-and
the third group is now working actively with
a joint input/output group.
The X-3.2 subcommittee has completed
the development of a recommended American Standard code for information interchange and has submitted it to the X-3 Committee for processing. In turn, X-3 has
submitted the recommendation for balloting
by the X-3 members. In the meantime,
through conferences with groups in Europe,
the X-3.2 subcommittee is considering revisions or modifications that would make the
proposed standard more acceptable as an
international code. The pros and cons are
being actively discussed and we may or may
not have an American Standard within this
area within the current calendar year.
The X-3.3 subcommittee on Data Transmission is concerned with the determination
and definition of parameters governing the
operational action and reaction between communications systems and the digital generating and receiving systems utilized in data
processing. From an overall business standpoint, this is a relatively new problem that
has been under active development for only
four to six years although the military have
been working on it for much longer. Previously, translations from data processing
codes to codes suitable for transmission
have been done manually. Today there are
standards in the field of voice and telegraph
transmission, but very little has been done
beyond this. Radio and television raised
questions on line or channel quality and
width and the advent of the computer raised
questions of relative costs. Many users of
data processing equipment believe that the
economical use of data processing equipment
requires the centralization of the data processing activity. This immediately imposes
requirements for economical data transmission and the need for an effective interface
with data communication.
Although X-3.3 was somewhat slow in
getting under way, they have now organized
their effort under five groups. The first
group will handle liaison with other interested groups; the second group will develop
a glossary of special terms relating to this
subject; the third group will document error
detection and control techniques; the fourth
group will try to specify the system aspects
of data processing terminal equipment to be
connected with communications equipment,
and the fifth group will do research into systems performance characteristics.
In spite of a slow start, X-3.3 has made
good progress, largely because of excellent
cooperation with the related subcommittee
under the X-6 Sectional Committee sponsored by EIA. For several years EIA has
had a group known as EIA TR/27.6, working
on problems in this area. Through this cooperative approach X-3.3 now has had its
proposed standard on signalling speeds for
data transmission equipment approved by the
X-3 Sectional Committee and the ASA as an
American Standard. This is the first American Standard to result from the X-3 program.
The scope of X-3.4 has been described as
follows: "Standardization and specification
of common programming languages of broad
utility, with provision for revision, expansion and strengthening, and for definition and
approval of test problems." This area is
one of the most difficult and probably of the
greatest concern to the data processing community because of the increasing costs of
"software." It will be noted that the scope
encompasses both the business-type languages and the scientific-engineering type
languages. In fact, it probably also includes
the so-called "command and control" languages of the military under the title of
"problem-oriented." There have been many
users groups active in the development of
programming languages and it is expected
that X-3.4 will utilize much of this work that
has gone before. At the present time the
subcommittee is considering COBOL, ALGOL
and FORTRAN, the three most widely used
programming languages.
The X-3.4 subcommittee has been organized into six working groups, the titles of
which will suggest the areas of work assigned: Working Group 1 is concerned with
language theory and structure, WG 2 with
ALGOL and specification languages, WG 3
with FORTRAN, WG 4 with COBOL and language processors, WG 5 with international
problems, and WG 6 with programming language terminology. The complexity of this
area makes it probable that there will be
some overlap within the subcommittee and
with other groups and that the progress may
be somewhat slower in spite of the most dedicated and sincere effort of the participants.
The X-3.5 subcommittee is concerned with
problems of man-to-man communications
under the following two-part scope: (1) to
recommend a general glossary of information
processing terms, and (2) to coordinate and
advise the subcommittees of ASA X-3 in the
establishment of definitions required for their
proposed standards. This group has recognized an overlap with work done in this field
by others; for example, a joint glossary has
been compiled by the ACM, the AIEE and the
IRE, consisting of over 2,800 words or terms.
X-3.5 expects to use this as a base and to
coordinate their efforts with those of other
groups, including one in the Federal Government under the aegis of the Interagency Data
Processing Committee, and a similar joint
effort in Great Britain.
X-3.5 is a comparatively small group with
strong participation from the User and General Interest groups on X-3. Much work has
already been done in this field and it is now
largely a matter of collating, refining and
editing and of putting the product in prescribed form for submission as a proposed
American Standard.
New users, or prospects for electronic
data processing systems, are frequently surprised to find that there are still no standard,
accepted ways of defining data processing applications. This is the field in which X-3.6
has defined its scope as: (1) the development of standardized survey techniques,
(2) the standardization of flow charting symbols, and (3) the development of narrative-symbolic-quantitative methods of presenting
results to top management, data processing
customers and data processing operators.
Work along this line has been done by the
separate companies and a good bit by government agencies. The Federal Government,
through the Interagency Data Processing
Committee, has developed guidelines and
criteria for feasibility studies, application
studies, and flow charting techniques and
symbolization. There is good reason to believe that these efforts will have a strong
influence on the work of the X-3.6 subcommittee. It is also hoped that active interest
and participation by educational institutions
can be stimulated.
X-3.6 has subdivided the work into four
task group assignments. The first is concerned with methodology, the second with
input/output data and file description, the third
with data transformation, and the fourth with
nomenclature and flow charting. So far, the
greatest tangible results have been in the
work of Group 4, which is hopeful of having a
proposed standard for flow charting symbols
ready soon for consideration. In a recent
indication of the dynamic nature of industrial
standardization, ASA abolished the obsolete
X-2 Sectional Committee on office standards
and assigned its project on charting paperwork procedures to X-3 and X-3.6.
The latest of the subcommittees is X-3.7,
which is concerned with magnetic ink character recognition. Work in this field is well
along, and this is recognized in the scope,
which is described as follows: (1) development of standards for magnetic ink character recognition (MICR) for present and future
use, and (2) resolution of problems arising
in industry and the market place involving
manufacturers and printers. Under the aegis
of the American Bankers Association and interested manufacturers, the magnetic ink
character recognition font, known as E 13 B,
has been adopted by the American banking
industry. Therefore, the X-3.7 subcommittee is following an approach which might be
thought of as a maintenance program rather
than a development program. They propose
to (1) determine the best common method for
handling miscoded documents, (2) resolve a
standard location for check serial numbers,
and (3) eliminate extraneous magnetic printing on the clear band of checks.
The X-3.7 subcommittee is hopeful of
getting the de facto standard MICR font processed and accepted as an American Standard
and thereafter submitted for consideration as
an International Standard. It has been approved by X-3 and is now submitted for processing through ASA.
By design this structure of the X-3 Sectional Committee closely resembles the organization of ISO Technical Committee 97
on Computers and Information Processing.
There are six subcommittees and one Working
Group of TC97 at the international level with
titles and scopes quite similar to those of the
X-3 subcommittees: SC1 for multi-lingual
glossaries; SC2 for coded character sets; SC3
on both optical and magnetic character recognition; SC4 on input/output media standards;
SC5 on programming languages; SC6 on data
transmission, and WG1 on Problem Definition
and Analysis, including flowcharts.
It should be emphasized that although organizationally similar, the character of national and international standardization differs considerably. National standardization
activity can involve development of standards where need exists and accepted practice or appropriate developmental facility
does not, as in the aforementioned character
code case. International standardization, on
the other hand, tends to be legislative in
nature, with the work of TC97 and its groups
devoted to the processing of national proposals that represent local standards or
practice. Little if any standards development is foreseen or considered in the international activities of TC97. In addition to
considering national standards proposed for
ISO consideration, TC97 also accepts documented proposals from official liaison organizations of an international nature such
as the European Computer Manufacturers
Association and the International Federation
for Information Processing.
Proposed international standards involving electrical characteristics are processed
by the IEC Technical Committee 53 - Computers and Information Processing, and its
four subcommittees: A for Input/output
Equipments; B for Data Communications; C
for Analog Equipments in Digital Systems;
and D for Input/Output Media.
Counterpart interests are included in
the scope of the ASA X-6 Sectional Committee.
Where logical, physical and electrical
factors in a standardization proposal cannot
be isolated, as in such input/output media as
magnetic tape, TC97, TC95 on Office Machines, and TC53 have made provision internationally for joint WG D and SC 53D work.
Nationally, X-3 and X-6 have joined forces in
three joint task groups for consideration in
input/output media standards for magnetic
tape, perforated tape, and punched cards.
Other cooperative efforts are provided for as
need arises, such as in programming languages and character sets, and between all
special technical areas and the glossary
activities.
Two final comments on the basic interests and intent of the national and international standardization efforts. First, there
is no desire to develop or establish standards for the sake of standardizing. The only
justifiable reason for standards in information processing is need, as expressed by the
users and manufacturers of computers and
devices, and those affected by such equipment.
Second, in order to assure that such need is
properly expounded when it exists, and that
resultant standards represent acceptable solutions to such needs, the industry and users
and general interests must participate fully
and with qualified, active representation.
Standards must accurately represent the
predominant practices or wishes of the entire information processing community.
Those of you who are active in the data
processing community are probably familiar
with the magazine DATAMATION and may
have read the feature article on the ASA X-3
Sectional Committee in the February 1962
issue. Although the article is rather critical
of the lack of progress that has been made
through the end of 1961, it is generally factual
and, with minor exceptions, gives a good picture of this first year of the Standards Program. In spite of its rather critical tenor,
the article concludes with this statement:
"A more realistic point of view is that
standards activities are, by their very
nature, methodical, plodding and, subsequently, quite permanent in their effect.
It should be clearly understood that the
biases, politics and frictions which come
to play and may seem to impede the effort
are, in fact, expressions of legitimate
interests which comprise one of the most
important aspects of the deliberations
involved in setting and maintaining a
standard."
HIGH-SPEED FERRITE MEMORIES
H. Amemiya, H. P. Lemaire*, R. L. Pryor, T. R. Mayhew
Radio Corporation of America
Camden 8, N. J.
*RCA, Needham, Mass.

INTRODUCTION

For several years ferrite cores [1,2] have constituted the mainstay of computer storage memories. The typical computer today [3] has a transistor-driven core memory which operates in a coincident-current mode [4] with a 5-to-10-μsec cycle time. Although a coincident-current storage unit with a cycle time approaching 2 μsec has been built, higher speeds have been attained by exploiting so-called partial-switching modes of operation. Word-address memory systems [5] with cycle times less than 1 μsec and as low as 0.7 μsec have been reported [6,7]. To attain these speeds with conventional 50/30 cores (that is, cores of 0.050-inch outer diameter, 0.030-inch inner diameter) drive requirements are necessarily high (approximately 1 ampere-turn), and particular attention must be given to the physical arrangements of conductors, sense windings, and storage elements.

Actually, short cycle times have been realized by the development of fast-switching storage elements which operate in impulse switching modes using high drives, and by minimizing such factors as propagation time, field transients, and mutual-coupling effects to reduce the duration of the unproductive phases of the memory cycle. The reduction of these phases has assumed increasing importance as the operating speeds of memories have increased. Array geometry, timing operation, and the relative positioning of components have become prime factors in the determination of the minimum access time of a memory.

The use of permalloy thin films as high-speed storage elements has recently received a great deal of attention [8,9,10]. A small memory with a cycle time of less than 0.5 μsec has been operated [11], and cycle times shorter than 1 μsec appear generally feasible in memories of larger capacity [12]. In addition to their high-speed potentialities, thin-film memories can be fabricated in large sheet arrays at relatively low cost. (This fabrication technique has not been fully developed at the present time.) Disadvantages of thin-film memories include high drive-current requirements (0.5 to 1 ampere) and low bit outputs (less than 5 mv) [9]. Large arrays operating at high speeds may, in fact, be impracticable, because the discrimination between the low output signal and the stack noise becomes increasingly difficult as memory capacity is increased.

Ferrite pieces utilizing closed magnetic paths of miniature dimensions offer obvious advantages for high-speed memories [13]. Outputs and switching speeds can be kept high, while at the same time drive currents kept low. This paper presents the results of a program aimed at developing ferrites capable of operation in memories with cycle times less than 0.5 μsec and at bit costs competitive with those of slower, more conventional arrays. These goals have necessitated the solution of two associated problems: first,
the development of low-drive, fast-switching
memory cells; and second, the development
of methods for the assembly of these cells
into arrays economically.
Memory Organization
High speeds have been achieved utilizing
a word-address, two-core-per-bit memory
organization.
Linear selection (word-address) memory schemes are well established as a means of obtaining increased
memory speeds since, in contrast to coincident-current methods, readout currents of
unlimited magnitude can be used. (Currents
are then determined by transistor driver
limitations rather than by core or memory
organization.) Linear selection provides a
second, important means of attaining high
speeds by making it possible to use high
amplitude, short duration write pulses. Narrow pulses such as these switch a minimum
of flux by themselves although their amplitude is substantially greater than the normal
switching threshold; when added to the exciter
or digit pulses, however, they are capable of
switching a significant amount of flux [14].
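The partial-switching mechanism described above can be caricatured with a short numerical sketch. This is an illustration only: the threshold, switching coefficient, and drive values are invented, and the standard switching-time relation 1/ts = (H - H0)/Sw is used in idealized form.

```python
# Caricature of impulse (partial) switching in a ferrite core: the
# switching time varies inversely with the drive above threshold
# (1/ts = (H - H0)/Sw), so a narrow pulse switches only the fraction
# duration/ts of the core's flux. Adding the digit pulse to the
# partial-write pulse raises the drive, shortens ts, and switches
# substantially more flux in the same pulse width. Values are invented.

H0 = 0.5    # switching threshold (arbitrary drive units)
SW = 200.0  # switching coefficient (drive units times ns)

def fraction_switched(drive, duration_ns):
    """Fraction of total flux switched by a rectangular drive pulse."""
    if drive <= H0:
        return 0.0
    ts = SW / (drive - H0)             # switching time at this drive
    return min(1.0, duration_ns / ts)  # partial switching for narrow pulses

write_alone = fraction_switched(1.0, 50.0)       # narrow write pulse only
write_plus_digit = fraction_switched(1.6, 50.0)  # write plus digit pulse
print(write_alone, write_plus_digit)
```

In this toy model the write pulse alone, although above threshold, switches only a small fraction of the flux because of its short duration, while the superimposed digit pulse more than doubles the flux switched in the same pulse width.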
As memory cycles are reduced, a point is reached where two-core-per-bit operation becomes a necessity if practicable signal-to-noise ratios are to be maintained. This comes about for two reasons: first, as the write pulse is made increasingly narrow, a very small fraction of the core is being switched so that upon readout the difference between a 1 and a 0 is very small; and second, as the read pulse is made narrower and the rise time decreased, the contribution of reversible magnetization changes becomes an increasingly significant fraction of the total output. In addition, the peaking time of the core rapidly approaches the time at which the reversible flux peak occurs so that the two
peaks merge into one. Figure 1 illustrates qualitatively the output waveforms for two cases, each involving partial switching modes differing essentially in the amount of flux switched and the switching times. Figure 1a illustrates the performance of a core with a switching time of 200 nsec, usable in a memory with approximately a 1 μsec cycle time. Figure 1b illustrates the case for higher speed situations. Here, less flux was written into the core, the read duration and rise times were decreased, and the switching time reduced to 50 nsec. When the switching time is short, even with precise strobing, the difficulty of obtaining sufficient discrimination is evident from Figure 1b.
Two-core-per-bit operation provides a
means of cancelling out the reversible flux
contribution to the total output. Figure 2
illustrates four possible two-core-per-bit
schemes which differ only in the way the digit
pulses are applied to the bit. In Figure 2a, bidirectional digit pulses pass through both cores in the bit. In Figure 2b, a digit pulse passes through both cores to write a 1; there is no digit pulse when writing a 0. In both 2a and 2b, digit pulses "add" to the partial-write pulse in one core and "subtract" from the partial-write pulse in the other. In 2c and 2d, digit pulses have but one direction, which is the same as that of the partial-write pulse. In cases 2a and 2c, the read outputs are bipolar, whereas in 2b and 2d the outputs are unipolar. Only two of these schemes (2c and 2d) were in fact used in this system.

Figure 1. Output Waveforms vs. Switching Times. (a) Switching time ts = 200 ns; (b) ts = 50 ns.

Figure 2. Four Digit Drive Techniques and Sense Signal Waveforms.

Unidirectional bit drives were used exclusively when it became evident that bidirectional digiting led to an increase in digit-disturb sensitivity. This effect was particularly evident when each core in the bit was individually examined. Test results showed that a core which received a digit pulse opposite in direction to the partial-write pulse had a lower disturb threshold than a core for which write and digit pulses were in the same direction. This effect and similar phenomena have been reported elsewhere [6]. Henceforth, in this article, Figure 2c will be referred to as the bipolar sensing scheme. The situation drawn in Figure 2d, which differs from 2c in that only one core is digited (output of single polarity), will be termed an amplitude sensing scheme.

186 / High-Speed Ferrite Memories

Bit Characteristics

Bit evaluation was performed using a two-wire system with common sense-digit wires and common read-write wires. A schematic illustration of the test setup is indicated in Figure 3. In bipolar operation, digit driver "one" is turned on to write a 1 and driver "zero" to write a 0. In the amplitude sensing mode, driver "one" is again turned on to write a 1; driver "zero" is not needed and is simply turned off.

Figure 3. Test Set-Up (Bit Evaluation).
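The cancellation that two-core-per-bit sensing provides can be sketched with a toy numerical model. The function names and amplitudes below are illustrative assumptions, not measured values from this memory: each core's readout is modeled as a data component plus a reversible-flux spike common to both cores, and difference sensing removes the common part.

```python
# Toy model of two-core-per-bit difference sensing. Each core's readout
# contains a reversible-magnetization component common to both cores of
# the bit, plus an irreversible (data) component present only in the core
# that was partially switched. Subtracting the two outputs cancels the
# reversible term, leaving only the data signal. Values are illustrative.

def core_output(data_mv, reversible_mv):
    """Readout voltage of one core: data component plus reversible spike."""
    return data_mv + reversible_mv

def sense_bit(core_a_data_mv, core_b_data_mv, reversible_mv=30.0):
    """Difference-sense the two cores of a bit; the reversible term cancels."""
    return (core_output(core_a_data_mv, reversible_mv)
            - core_output(core_b_data_mv, reversible_mv))

# Writing a 1 leaves the data flux in core A; writing a 0 leaves it in core B.
print(sense_bit(40.0, 0.0))   # reading a 1: positive output
print(sense_bit(0.0, 40.0))   # reading a 0: negative output
```

Note that the sensed output is independent of the size of the reversible spike, which is the point of the scheme.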
Amplitude Sensing: Cores which were tested with the amplitude sensing mode were subjected to the series of input pulses as shown in Figure 4a. This sequence was chosen to produce the worst signal-to-noise ratio; in this case, this ratio is the ratio of the amplitude of the undisturbed 1 to the amplitude of the disturbed 0 (uV1:dV0). The particular order of the sequence (undisturbed 0 voltage (uV0), undisturbed 1 voltage (uV1), disturbed 1 voltage (dV1), disturbed 0 voltage (dV0)) was intended to bring out instability in the remanent state on readout, if instability existed. If readout were incomplete, the dV0 which follows a dV1 would have its highest value, and a uV1 following a uV0 would have its lowest value.
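The worst-case ratio defined above can be checked with a one-line computation; the four amplitudes are hypothetical values chosen only to illustrate the uV1:dV0 definition.

```python
# Worst-case signal-to-noise ratio for amplitude sensing, following the
# definition in the text: the amplitude of the undisturbed 1 (uV1) divided
# by the amplitude of the disturbed 0 (dV0). Amplitudes are hypothetical.
outputs_mv = {"uV0": 3.0, "uV1": 45.0, "dV1": 40.0, "dV0": 5.0}

def worst_case_snr(outputs):
    """Ratio of undisturbed-1 amplitude to disturbed-0 amplitude."""
    return outputs["uV1"] / outputs["dV0"]

print(worst_case_snr(outputs_mv))  # 9.0, i.e., a 9-to-1 ratio
```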
The four read outputs were superimposed to obtain a simultaneous oscilloscope display of the type shown in Figure 5. The waveforms shown were obtained using cores of 0.050-inch outer diameter, 0.010-inch inner diameter, with the drive pulse characteristic shown
Figure 4. Pulse Test Programs and Outputs. (a) Amplitude sensing: 62 read-write cycles followed by 8 to 64 disturb pulses; read outputs uV0, uV1, dV1, dV0. (b) Bipolar sensing: a series of 62 uV's followed by the disturbed outputs.

Figure 5. Output Waveforms (Amplitude Sensing). Temperature = 25°C; core = 50-mil O.D., 10-mil I.D. Drive pulses (amplitude, duration, rise time): read 325 ma, 100 ns, 25 ns; write 250 ma, 50 ns, 25 ns; digit 60 ma, 90 ns, 25 ns. Bit output plotted in mv, with the switching time taken at the 10-percent points.
in each figure. The switching time, taken at
the 10-percent points, is about 70 nsec. The
relatively small difference between disturbed
and undisturbed signals is an indication that
the digit-disturb pulse has minimum disturbing effect on the core.
Figure 6a shows the effects of variations in the digit and partial-write current amplitudes for fixed current-read conditions. The 1 and 0 outputs are actually undisturbed 1's (uV1) and disturbed 0's (dV0), respectively. Figure 6b shows the signal-to-noise ratios which serve as a guide in the determination of optimum drive-pulse characteristics. For the particular durations and rise times used in this case, a range of workable digit and partial-write current levels is available. In Figure 6, for example, it is apparent that a digit current in the 60- to 70-ma range and a partial-write current of 200 to 220 ma would give a 1 output of 40 to 65 mv at signal-to-noise ratios of 9 or 10 to 1. Switching times in this case are in the order of 80 nsec.
The drive pulse and output characteristics indicated in Figures 5 and 6 are typical of cores with a 10-mil inner diameter and 50-mil outer diameter. Switching times and read-write cycle times can be shortened by increasing the read pulse amplitude and decreasing rise times and durations. Shortened rise times, however, bring about increased back voltages per bit during readout with little change in net signal output. This happens, of course, because as the rise time is decreased the reversible flux contribution to the total output is increased. Figure 7 illustrates the effect of decreasing the rise time of the read pulse while maintaining its amplitude fixed. As is evident from the diagram, the total back voltage per bit decreases from 250 mv to 150 mv when the rise time is increased from 13 to 40 nsec. The decrease in signal is in comparison very small; in fact, there is no change in signal when the rise time is changed from 30 to 40 nsec. These facts are of obvious importance if long memory words come into consideration, since one can bargain for lower back voltages on the word lines by increasing rise times and memory cycle times but with very little loss in bit output.

Figure 6a. Changes in Bit Output with Variations of Digit and Partial-Write Current Amplitudes.

Figure 6b. Signal to Noise Ratio vs. Digit and Partial-Write Current Amplitudes. (Core = 50-mil O.D., 10-mil I.D.; read 350 ma fixed; write and digit amplitudes variable.)

Figure 7. Effect of Read Pulse Rise Time on Core and Bit Output.

Bipolar Sensing: In the bipolar-sensing method (Figure 4b), core A is digited to write a 1, core B to write a 0. Thus, in the test program shown, the first series of read outputs are undisturbed 1's. The last readout
on the right is a disturbed 0 which is termed
the "lowest 0." In this case, although core B
is raised to the higher flux state by superimposing digit and partial-write pulses, the
disturb pulses are applied to core A. (If the
pulse patterns in the two cores were reversed,
the initial series of readouts would be undisturbed O's and the last readout a disturbed 1.)
Actually, the pulse sequence is a variation
of that used for amplitude sensing and was
chosen to give the worst-case conditions (i.e.,
by promoting the lowest 0 or the lowest 1).
As before, the reasoning was that if the read-pulse amplitude or duration was insufficient to bring the core back to a stable remanent state, a 0 would have its lowest magnitude if it followed a series of 1's. A disturbed 0 obtained under these conditions, then, should be the lowest 0.
This particular pulse sequence also aggravates the worst-case condition, because the
high pulse repetition rate (2 mc) increases
core temperature. In one of the materials
examined, both cores become heated to about
20°C above room temperature. Core A, however, undergoes more flux reversals per
cycle than core B, and becomes slightly
warmer than core B. The temperature difference between the cores was found to be in the order of 5 to 7°C. This temperature tends to increase the 1 output and decrease the 0
output. Because core A is warmer, it not
only has an increased output, but also a reduced coercivity which tends to lower the
digit-disturb threshold.
The temperature effect was easily observed by switching from the program shown
to one producing a series of undisturbed 0's
followed by a disturbed 1. For about 3 seconds after the program change, the output
amplitudes are unstable. Immediately after
the program change, the undisturbed 0 is low
and gradually increases to a higher stable
value. Initially, the disturbed 1 is relatively
high and stabilizes at a lower value. If the
pulse program is again reversed, the undisturbed 1 is low initially, and gradually increases; the disturbed 0 is high initially and
gradually decreases.
Figure 8 shows the effect of digit-current
amplitude on output for a given material and
for a given set of partial-write and read conditions. The upper curve is a plot of undisturbed 1 (or undisturbed 0) outputs as a
function of digit-current amplitudes. The
lower curve shows the effect of digit amplitude
Figure 8. Bit Output vs. Digit Current (Bipolar Sensing). Core = 50-mil O.D., 10-mil I.D.; rise times: read 24 ns, write 20 ns, digit 20 ns; bit output in mv plotted against digit current of 50 to 100 ma.
on a disturbed 0 or disturbed 1. As expected, the undisturbed output gradually increases as the digit current level is increased. The disturbed output, on the other hand, goes through a maximum (in this case at about 90 ma). The decrease in 0 output beyond this point occurs because the disturb threshold of the core has been grossly exceeded at this digit-current level. This is particularly evident from the figure in that the disturbed curve is now "split," depending upon whether 8 or 62 disturb pulses were in the pulse program. Point E in Figure 8 is the e generated.
Ideally it might be desirable to display
complete encyclopedic knowledge about all
items appearing in an association map, including, for example, information about the
environment, property lists, homographic
uses, and so on. In the absence of a complete
semantic dictionary, a compromise is usually
made in actual systems, either by restricting the relations to be recognized automatically to certain very specific types [8-10],
or else by providing extensive lists of relationships which are not, however, recognized
fully automatically [11,12].
In a practical automatic system, however,
it would seem necessary on the one hand to
distinguish more than a few specific types of
relations, and on the other to perform the
recognition procedure automatically without
benefit of extensive dictionaries. These requirements would suggest that a small number of restricted dictionaries be used together with other indications provided by the
linguistic context. The following linguistic indicators might provide important information:
a) prepositions and other function words;
b) affixes of various kinds;
c) special quantifiers such as "many,"
"some," "every";
d) logical connectives such as "and," "or,"
"not";
e) special referents such as "like," "these," "those";
f) special linguistic units such as "the
fact is-," "it is claimed," "it is hoped."
Prepositions and other function words have been used in the past for the extraction of semantic indications [13-15]. More general forms of syntactic analysis may be used similarly for the recognition of word relations [15-17]. In the next section, a method is presented for the recognition of word associations by means of a simple form of syntactic analysis.
A Simple Syntactic Method for the
Recognition of Word Associations
Dictionary Look-up and Suffix Analysis:
The method to be described takes ordinary
236 / Some Experiments in the Generation of Word and Document Associations
English text and assigns a unique part-of-speech indicator and a unique syntactic function indicator to each word. These indicators are then used to assemble words into phrases and phrases into clauses. The complete program is described in the flowchart of Figure 1.

Figure 1. Simplified Syntactic Analysis Program. (Type and verify input text; itemize text words and assign serial numbers; look up items in dictionary and assign syntactic indicators; for items not found in the dictionary, use suffixes to assign syntactic indicators; match predicted syntactic functions with syntactic indicators and assign complete syntactic information; for items without a recognizable suffix, assign syntactic information corresponding to the most probable prediction; use syntactic information to determine word associations.)

The input text is first transferred onto magnetic tape, and the individual words are separated and provided with a serial number. A small dictionary is then used to assign syntactic part-of-speech indicators to those words which are included in the dictionary. A dictionary excerpt is shown in Figure 2. It may be noted that each dictionary item is provided with as many syntactic indicators as there are possible applicable parts of speech. For example, the word "weekly," reproduced as the first item in Figure 2, is furnished with three indicators (HO, AO, N1), representing respectively adverb, adjective, and noun-singular indications. Certain semantic indicators are also included in the dictionary. The "TO" semantic indicator, for example, represents a time indication. Other semantic indications, also included, denote motion, location, direction, dimension, value, duration, and so on.

Figure 2. Dictionary Excerpt. (Each entry carries a serial number, the input item, syntactic indicators, and semantic indicators; for example, WEEKLY: HO, AO, N1, semantic indicator TO; WEEKS: N2, TO; WENT: V1, MV; WHENEVER: C1, HO.)

The dictionary presently used contains about four hundred function words, such as prepositions, conjunctions, adverbs, articles, and so on, and about five hundred common nouns, verbs, and adjectives. The dictionary composition is shown in detail in Table 1. Experimental evidence indicates that for the technical texts tested approximately fifty-five percent of the words are found in the dictionary, so that forty-five percent of the words are still left without any syntactic indication after dictionary look-up.
In order to generate part-of-speech indications for the words not found in the dictionary, a suffix analysis program is used. Specifically, an attempt is made to detect a recognizable suffix for each word by comparing the word endings with a list of suffixes contained in a suffix table. Suffixes of seven letters are tested before suffixes of six letters, and so on, down to suffixes of one letter. Special provisions are made to detect plural noun forms and third person singular present forms for verbs. When a suffix match is obtained, the part-of-speech indicators included in the suffix table are attached to the corresponding text words. An excerpt from the suffix table is shown in Table 2.
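The longest-suffix-first look-up just described can be sketched as follows. The table contents and exception list are small invented stand-ins for the paper's suffix table; only the control flow (exceptions first, then suffixes of seven letters down to one) follows the description.

```python
# Sketch of the suffix look-up: try 7-letter suffixes first, then shorter
# ones, assigning the part-of-speech indicators stored in the suffix table.
# Dictionary "exception" words (king, wing, sing, ...) are checked first so
# that false suffixes like "-ing" are never consulted for them.
# Table contents are illustrative; indicator codes follow the paper's style.

SUFFIX_TABLE = {          # suffix -> syntactic indicators
    "ential": ["ADJ", "NN1"],
    "ing": ["PRP"],       # gerund / present participle
    "tion": ["NN1"],
    "s": ["NN2", "VB3"],  # plural noun or third-person-singular verb
}
EXCEPTIONS = {"king": ["NN1"], "wing": ["NN1"], "sing": ["VB1"]}

def suffix_indicators(word):
    word = word.lower()
    if word in EXCEPTIONS:             # exceptions bypass suffix analysis
        return EXCEPTIONS[word]
    for n in range(7, 0, -1):          # longest suffix first, 7 down to 1
        candidate = word[-n:]
        if len(word) > n and candidate in SUFFIX_TABLE:
            return SUFFIX_TABLE[candidate]
    return []                          # no grammatical information found

print(suffix_indicators("essential"))  # matches -ential before -s
print(suffix_indicators("king"))       # handled by the exception list
```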
It should be noted that many words include
"false" suffixes which, when looked up in the
suffix table, would provide incorrect information. For example, the words "king,"
"wing," and "sing" would be provided with
"gerund" or "present participle" indicators
because of the "ing" ending. Such words are
considered to be exceptions; they are included for the most part in the regular dictionary so as to eliminate the suffix analysis. Experience has shown that the suffix
analysis provides syntactic information for an additional thirty percent of the words found in technical texts. After the suffix analysis, less than fifteen percent of the words are thus left without any grammatical information. A small piece of suffix-analyzed output is shown as an example in Figure 3. It may be noticed that all items found in the dictionary, or furnished with a recognizable suffix, are provided with syntactic indicators as a result of the table look-up procedures. The words "atmosphere" and "alarm" in the excerpt of Figure 3 could not be classified by any of the available methods, and are therefore left without grammatical indication.

Table 1. Dictionary Composition. (For each syntactic indication the table lists the number of words in the dictionary and the percentage of the total; for example, nouns with no verb usage (NN) account for 266 words, or 27.7 percent, of the 963 total entries. Items exhibiting more than one syntactic indicator are listed several times. Footnotes: VBEF, verb predictions take precedence over noun complement predictions; IOBJ, verbs may take an indirect object; V3TI, verbs may take an infinitive verb complement (ought, need, dare, used); NAMS, cannot fulfill ADJ MAS or NADJ CM predictions (this, those).)

Table 2. Suffix Table Excerpt. (Each suffix is listed with its syntactic indicators; for example, -SORBING and -TURBING carry PRP; -DENTIAL, -NENTIAL, -RENTIAL, and -SENTIAL carry ADJ; -TENTION and -ANSION carry NN1; -ENTIAL carries ADJ, NN1. Exceptions appear in the dictionary.)

Figure 3. Suffix-Analyzed Output. (Each item is listed with its serial number, syntactic indicators, and a symbol identifying the procedure used to generate the indicators: found in dictionary, syntax generated by suffix analysis, or no syntactic information available.)

Predictive Syntactic Analysis: Following the suffix analysis, a simplified predictive syntactic analysis is used to assign to each word a unique part-of-speech indicator and a unique syntactic function [18,19]. Specifically, each sentence is scanned from left to right, one word at a time. At each point, predictions are made concerning the syntactic structures to be found later in the same sentence. When a new word is considered, its associated grammatical indicators are tested against the unfulfilled predictions available at that time. If a match is found between any of the predicted grammatical functions and the available syntactic indicators, the first matching grammatical function and the associated part-of-speech indication are assigned to the corresponding word in the input text. The accepted prediction is then deleted, and further predictions are made for new structures to be expected later in the same sentence. If no match can be found between any unfulfilled prediction and available grammatical indication, as is the case when words are tested which have no associated grammatical information, the most likely prediction is used to generate acceptable grammatical indicators for the corresponding input items.

The predictive analysis makes use of information stored in two separate tables, the grammar table and the prediction table. The grammar table lists predicted syntactic functions against the syntactic indicators which can fulfill each prediction, and the prediction table lists accepted syntactic indicators
against new syntactic functions to be predicted as a result of a match. Excerpts from
the grammar and prediction tables are shown
in Tables 3 and 4 respectively.
Table 3. Excerpt from Grammar Table

SYNTACTIC FUNCTION PREDICTED: SYNTACTIC INDICATORS SATISFYING THE PREDICTION
SUBJECT: ART, ADJ, NN1, NN2, PRP
ADJ MAS: NN1, NN2, ADJ
OBJT VB: ART, ADJ, NN1, NN2
PRED HD: VB1, VB2, VB3, VB4
NADJ CM: NN1, NN2
VERB CM: VB1, VB3, VB4, PRP, PAP, PRI
ADVB MS: ADJ
PREP CM: ART, ADJ, NN1, NN2, PRP
ADVB ES: ADV
PREP ES: PRE
END SEN: PCT
Table 4. Excerpt from Prediction Table (Syntactic Functions Predicted by Accepted Syntactic Indicators)

ACCEPTED SYNTACTIC INDICATOR: SYNTACTIC FUNCTIONS PREDICTED
ART: ADJ MAS, ADVB ES
ADJ: ADJ MAS, NADJ CM, PREP ES
NN1: NADJ CM, PREP ES
NN2: NADJ CM, PREP ES
PO1: NADJ CM, PREP ES
PO2: NADJ CM, PREP ES
PO3: NADJ CM, PREP ES
PRP: (NADJ CM)1, VERB CM, PREP ES, ADVB ES, OBJT VB
COM: last prediction satisfied by certain essential items (SCL SUB, PRED HD, INFINTY)2
CON: SUBJECT, PRED HD, (DUP CNJ)3, INFINTY, (ADVB MS)4
PCT: SUBJECT, PRED HD, INFINTY, REL CLS, END SEN

Notes: 1) when accepted as "PREP CM"; 2) when not occurring before "CON"; 3) when exhibiting supplementary "DUPC" code; 4) when accepted as "DUP CNJ" function.
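The table-driven matching cycle described above can be sketched as follows. The two dictionaries are tiny invented excerpts in the spirit of Tables 3 and 4, and the analyze function approximates the pushdown behavior by inserting new predictions at the position of the prediction they replace.

```python
# Minimal sketch of the predictive analysis loop: a pushdown list of
# predicted functions is matched, word by word, against each word's
# syntactic indicators using the grammar table; an accepted prediction is
# erased and replaced by the new predictions named in the prediction table.

GRAMMAR = {  # predicted function -> indicators satisfying it
    "SUBJECT": {"ART", "ADJ", "NN1", "NN2", "PRP"},
    "ADJ MAS": {"NN1", "NN2", "ADJ"},
    "ADVB ES": {"ADV"},
    "PRED HD": {"VB1", "VB2", "VB3", "VB4"},
    "END SEN": {"PCT"},
}
PREDICTIONS = {  # accepted indicator -> new functions pushed on the list
    "ART": ["ADJ MAS", "ADVB ES"],
    "ADJ": ["ADJ MAS"],
    "NN1": [],
    "VB2": [],
    "PCT": [],
}

def analyze(indicator_lists):
    stack = ["SUBJECT", "PRED HD", "END SEN"]   # initial predictions
    result = []
    for indicators in indicator_lists:
        for i, function in enumerate(stack):
            match = [x for x in indicators if x in GRAMMAR[function]]
            if match:
                accepted = match[0]
                result.append((function, accepted))
                del stack[i]                        # erase fulfilled prediction
                stack[i:i] = PREDICTIONS[accepted]  # push new predictions
                break
    return result

# Indicators for a four-word sentence (article, noun, verb, period):
print(analyze([["ART"], ["NN1"], ["VB2"], ["PCT"]]))
```

After the first word is accepted as an article, the unfulfilled list in this sketch is exactly the one described in the text: "adjective master," "adverb essence," "predicate," and "end of sentence."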
Consider, for example, the conditions which obtain at the beginning of a sentence. The functions predicted initially are "subject," "predicate," and "end of sentence," in that order. Suppose that the first word in the sentence has an associated "article" indicator. The grammar table is then consulted to find out whether the first prediction (subject) matches the available part-of-speech indication (article). The first entry in Table 3 shows that the subject prediction is fulfilled by "article," "adjective," "noun singular," "noun plural," and "present participle" indicators. A match is therefore found, and "subject" and "article" are assumed to be the correct grammatical function and part-of-speech indication for the first word in the sentence. The "subject" prediction is then erased from the list of unfulfilled predictions, and a number of new predictions, as indicated by the prediction table, are added to the list.

Since the "article" indicator was accepted as correct for the first word, Table 4 shows that "adjective master" and "adverb essence" predictions are to be added to the list of predictions. The list of predictions operates, with minor modifications, as a pushdown store, so that new predictions are always added at the top of the list. When the second text word is about to be processed, the list of unfulfilled predictions therefore contains "adjective master," "adverb essence," "predicate," and "end of sentence" predictions in that order. Subsequent words are processed in the same manner until the end of the sentence is reached, at which point the list of predictions is cleared and initial conditions are restored.

Table 5. Analysis of a Sample Sentence. (The sentence "On a suitably scaled graph, any rising exponential curve tends to produce an hypnotic effect of impending crisis." is listed word by word with a generation code (D = dictionary, S = suffix analysis, NI = no information), serial number (0238 to 0257), assigned syntactic function, syntactic indicator, and predictor serial; an asterisk marks an initial prediction. The comma after "graph," item 0243, leads to an erroneous analysis of the subject phrase.)

The analysis of a short sentence is shown as an example in Table 5. The generation code indicates whether any syntactic indicators were available at the start of the predictive analysis, and, if so, how they were generated. The column labelled "predictor serial" contains the serial number of the item which was originally used to generate
the prediction fulfilled by the current word.
Thus the word "tends" in Table 5 was recognized as a verb fulfilling the "predicate head"
prediction. The predictor serial (237) shows
that "predicate head" was an initial prediction originally predicted at the start of the
sentence. The "syntactic function" and "predictor serial" columns of Table 5 are used
later in the bracketting procedure which generates phrases and clauses.
By checking the syntactic functions of
Table 5, it may be seen that an error was
made in the analysis. In particular, the
subject phrase "any rising exponential curve"
was not properly recognized because the
comma (item 0243) predicted a parallel construction consisting of another prepositional
phrase. This particular error can be caught
since the subject prediction, which is considered to be "essential," is left unfulfilled
at the end of the sentence. However, other
mistakes are made in the course of a normal
analysis which cannot be detected so easily.
These errors are due to three principal
causes:
a) the absence of grammatical information for words not found in the dictionary or in the suffix table;
b) the inadequacy in certain cases of the
grammar and prediction tables used;
c) the large number of syntactic ambiguities present in the language.
Clearly, nothing can be done about the last
of the three listed causes. In fact, even the
most elaborate automatic methods for syntactic analysis, including, in particular, those
which make use of complete syntactic dictionaries of the language, produce erroneous
output in cases of linguistic ambiguity. The
question arises whether the performance of
the crude analysis here described is much
less reliable than other more ambitious
programs.
A list of principal error types found in
the current analysis appears in Table 6. The analysis of the output indicates an error percentage in the assignment of syntactic functions of 7 to 8 percent primary errors, and 6 percent induced errors. The latter category consists of errors which arise as a result of some preceding error in the same sentence. If, for example, the word "profound" in the phrase "many profound questions ..." is not found in the dictionary, it will be assigned a noun indicator satisfying the subject prediction. When the word "questions" is taken up, the subject prediction will already have been erased, and "questions" is therefore recognized as the third person present singular form of the verb "to question" satisfying the predicate prediction. The latter error is thus induced by the false classification of the adjective "profound."
In general, local structures such as subject phrases, noun complement phrases,
preposition complement phrases, participial
Table 6. Principal Error Types

I. Items not entered in dictionary:
1. Adjective ↔ noun: beneficial, remedial, annual, scientific, real, obscure.
2. Noun ↔ verb (words in -s): expenditures, trends, problems; tends, insists.
3. Noun ↔ verb (various): information, education; generate, assimilate; needed, outstripped, punished.
4. Adverb ↔ noun, adjective: simply, regardless, indeed; absurd, aware.

II. Items in dictionary with incorrectly assigned syntactic function:
1. Conjunction ↔ adjective ("that").
2. Conjunction ↔ relative subject: that, who.
3. Pronoun ↔ adverb ("so") (extra object prediction left in pool): so far as, so verbose.
phrases, verb complement phrases, and so on, are almost always properly analyzed. Errors arise most frequently in the correct recognition of coordinated and subordinated structures, and, in particular, in the accurate prediction of parenthetic and parallel constructions. For example, in the sentence "The conceptual, rather than equipment, aspects ..." the first comma correctly predicts a parenthetic expression. The word "than," however, generates among other predictions a subclause prediction which dominates the other unfulfilled predictions. When the second comma is reached, it is interpreted in accordance with the latest prediction as signalling a parallel construction, thus repeating for "aspects" the same assignment as for "equipment."
In order to judge the severity of the handicap arising from the absence of a complete
word dictionary, a test run was made with a
special dictionary, which included all relevant words occurring in the sample texts.
The error rate was found to be reduced by
less than forty percent, resulting in a primary error rate of 4.5 percent and an induced
rate of about 4 percent. These figures are
comparable with error rates produced by
other far more complicated syntactic analysis programs. The absence of the dictionary is therefore largely compensated by
the effectiveness of the suffix analysis and of
the modified predictive techniques. This
conclusion is further confirmed by the fact
that the bracketting program which produces
word associations performs almost perfectly for local structures.
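The quoted improvement can be checked arithmetically, taking the midpoint of the 7-to-8-percent primary range:

```python
# Error-rate comparison from the text: the small-dictionary run (7-8%
# primary plus 6% induced errors) versus the run with a special dictionary
# covering all relevant words (4.5% primary plus about 4% induced). The
# midpoint figures give a reduction of roughly 37 percent, consistent with
# the statement that the error rate was "reduced by less than forty percent."
small_dictionary = 7.5 + 6.0   # percent, midpoint of the primary range
full_dictionary = 4.5 + 4.0
reduction = (small_dictionary - full_dictionary) / small_dictionary
print(round(100 * reduction))  # 37
```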
Bracketting Program: A number of computer programs are in existence which use
the output of an automatic syntactic analysis
to produce dependency structures in tree
form [14,20,21]. The list of all dependent
word groups can be obtained from a dependency tree by following all branches of the
tree, and generating groups of words, located
on the same branch. Alternatively, if the
tree structure is not available, the same
result can be achieved by using the information provided by the predictive analysis.
Specifically, the syntactic function indicators and predictor serial numbers (columns 4 and 6 of Table 5) are used to produce
for each item a chain serial number and a
chain function. The chain serial numbers
for "subject," "subclause subject," "predicate head" and "preposition essence" are the
same as the corresponding predictor serials;
for other syntactic functions, the chain serials
are given by the predictor serials for the
embedding structure. The chain serial is
then effectively defined as the serial of the
first item in the same phrase, although discontinuous constituents will cause items with
the same chain serial to be separated by
other extraneous items.
For example, preposition complements
are assigned the chain serial corresponding
to the predictor serial of the preceding preposition. Verb complements are similarly
assigned the chain serial for the predictor
of the preceding predicate head. Comparable rules apply to the remaining syntactic
functions. The chain function assigned to a
given word is the syntactic function associated with the predictor of the embedding
structure, except that "subject," "subclause
subject," "predicate head," "preposition essence," "preposition complement" and "infinitive" functions keep their present function
indicator.
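These assignment rules can be sketched roughly as follows (the function names, arguments, and set representation are paraphrased from the description above, not taken from the original program):

```python
# Functions whose items keep their own indicator as chain function.
KEEP_FUNCTIONS = {"subject", "subclause subject", "predicate head",
                  "preposition essence", "preposition complement", "infinitive"}

# Functions whose chain serial is their own predictor serial.
OWN_SERIAL = {"subject", "subclause subject",
              "predicate head", "preposition essence"}

def chain_assignment(function, predictor_serial,
                     embedding_serial, embedding_function):
    """Return the (chain serial, chain function) pair for one item.

    Other syntactic functions inherit both the serial and the function
    of the embedding structure's predictor.
    """
    serial = predictor_serial if function in OWN_SERIAL else embedding_serial
    chain_fn = function if function in KEEP_FUNCTIONS else embedding_function
    return serial, chain_fn
```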
In order to bring together all members of
the same structure, it is only necessary to
sort the items in chain serial number order,
while at the same time preserving the word
order inside each substructure. The structures generated for the sample sentence of
Table 5 are represented by brackets in column 4 of the Table. The chain functions assigned to the five brackets are, respectively,
preposition complement, preposition complement, predicate head, object of verb, and
preposition complement. The chain function
assigned to the second bracket is in error,
since the subject of the sample sentence was
not correctly recognized, as previously explained. Higher-order brackets can similarly be produced by combining subject, predicate, and object brackets. The bracketting
procedure is generally performed without
difficulty, and the substructures are almost
always correctly recognized. In fact, many
of the more trivial errors in the syntactic
analysis are not reflected by any bracketting
errors.
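The sorting step can be sketched as follows (the item representation is invented for illustration; a stable sort preserves word order within each chain, as required, even when discontinuous constituents separate the members of a structure):

```python
from itertools import groupby

def bracket(items):
    """Group word items into bracketed chains.

    `items` is a list of (chain_serial, word) pairs in sentence order;
    a stable sort on the chain serial brings together all members of
    each structure while keeping the original word order inside it.
    """
    ordered = sorted(items, key=lambda t: t[0])  # Python's sort is stable
    return [(serial, [word for _, word in group])
            for serial, group in groupby(ordered, key=lambda t: t[0])]
```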
Following the bracketting procedure, it is
easy to generate word groups of various
kinds together with their associated functions. For example, the brackets of Table 5
will yield four noun phrases ("suitably scaled
graph," "rising exponential curve," "hypnotic
effect," "impending crisis"), two prepositional phrases ("on graph," "effect of crisis"),
and one verb phrase ("tends to produce").
Similarly, a subject-verb-object grouping
would yield, assuming proper recognition of
the subject, the phrase "curve produce effect."
The output produced by the syntactic
bracketting procedure can be incorporated
in a word association map in two different
ways. First, it is possible to replace each
word associated with a node of the map by a
complete phrase, and, second, relations between nodes, represented by branches in the
map, can be provided with relationship indicators, such as prepositions, conjunctions,
verbs, and so on, as determined by the syntactic analysis program. The replacement
of individual words by phrases is relatively
straightforward. The problem created by
the addition of relational indications is more
complicated, particularly for those relations
which are indicated by the context rather than
by specific function words [15]. However,
even if a standard set of relational indications is not available, and relational information provided by the context is not explicitly identified, the addition of word groupings and of some simple indicators of relation
(e.g., prepositions), furnishes a more accurate description, and permits a more profitable comparison of document content.
The Use of Bibliographic Citations for the
Generation of Document Associations
In the preceding section, information was
extracted from written texts to obtain word
associations. It may be expected that when
these association programs are added to
Proceedings-Fall Joint Computer Conference, 1962 / 241
other methods for the identification of document content, similarities and differences
between documents will be easier to detect.
Another possible approach to the generation
of document associations consists in using
bibliographic references to determine document content. Specifically, if it could be
shown that a similarity in the bibliographic
references or citations attached to two documents also implies a similarity in subject
matter, then citations might be helpful for
the automatic assignment of index terms to
new documents, and for the generation of
document associations.
An experiment is described in the present
section which uses citation tables (listing
with each cited document the set of all those
documents which cite the original ones) and
reference tables (listing with each citing
document the set of all documents cited by
the original ones) to compute similarity
coefficients between documents. The coefficient sets derived from the citations are
then compared with another set of similarity
coefficients derived from index terms attached to the documents, and an attempt is
made to determine whether large coefficients
in one set correspond to large coefficients
in the other, and vice-versa [22].
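A reference table can be inverted mechanically to obtain the corresponding citation table; the sketch below assumes documents are identified by arbitrary keys (the identifiers are invented for illustration):

```python
def invert(reference_table):
    """Turn a reference table (citing document -> set of cited documents)
    into a citation table (cited document -> set of citing documents)."""
    citation_table = {}
    for citing, cited_set in reference_table.items():
        for cited in cited_set:
            # Record that `citing` is one of the documents citing `cited`.
            citation_table.setdefault(cited, set()).add(citing)
    return citation_table
```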
The proposed use of citations exhibits the
same advantages as the syntactic method previously described, namely that the required
input information (in this case bibliographic
references and citations) is directly available
with most documents, and need not therefore
be generated by more or less unreliable
methods. Moreover, references can be processed automatically nearly as easily as ordinary running text. Use can also be made,
at least for experimental purposes, of existing manually indexed collections for which
the assignment of index terms is made under
controlled conditions.
Measure of Similarity: Consider first a
given collection of n documents. Let X^1, X^2,
..., X^n be a collection of n vectors, such that
each vector element X^i_j of a given vector X^i
represents some property relating to the ith
document. Specifically, if the n documents
are characterized by m properties each,
each document is represented by one m-dimensional vector. To relate the properties which identify two specific documents i
and j, it is then only necessary to compare
the elements of the two vectors X^i and X^j;
moreover, if each document is to be related
to each other document in the collection, all
possible vector pairs must be compared.
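Relating every document to every other one amounts to evaluating a similarity function over all n(n-1)/2 distinct vector pairs; a minimal sketch (the function name is invented):

```python
from itertools import combinations

def all_pair_similarities(vectors, sim):
    """Apply similarity function `sim` to every distinct pair of
    document vectors, keyed by the pair of document indices."""
    return {(i, j): sim(vectors[i], vectors[j])
            for i, j in combinations(range(len(vectors)), 2)}
```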
In most cases, it is convenient to restrict
the values of the vector elements to 0 and 1,
in such a way that element X^i_j = 1 if and only
if property j holds for document i, and X^i_j = 0
otherwise. This restriction is, however, in
no way essential. Two documents i and j are
then assumed to be closely related if they
share a number of common properties, or,
alternatively, if the vectors Xi and X j have
their nonzero elements in corresponding
positions. The desired measure of relatedness must also be a function of the total number of properties applying to each of the
documents, that is, of the total number of
nonzero elements in each vector, since one
shared property out of a possible total of one
hundred is clearly less significant than one
out of a possible total of two.
A simple measure which exhibits the desired behavior and permits a comparison of
magnitudes is obtained by considering the
vectors of document properties as vectors in
m-space, and by taking as a distance function
the cosine of the angle between each vector
pair. Specifically,
s = \frac{X^i \cdot X^j}{|X^i| |X^j|} = \frac{1}{|X^i| |X^j|} \sum_{k=1}^{m} X^i_k X^j_k
If one or both of the vectors is identically
equal to zero, s is defined as zero, following the intuitive notion that if a document
has no identifiable properties it cannot share
any properties with other documents.
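The measure, with the zero-vector convention just stated, can be sketched directly:

```python
def cosine_similarity(x, y):
    """s = (x . y) / (|x| |y|), with s defined as 0 when either
    vector is identically zero."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sum(a * a for a in x) ** 0.5
    norm_y = sum(b * b for b in y) ** 0.5
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)
```

For logical (0/1) vectors this reduces to the number of shared properties divided by the geometric mean of the two property counts.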
If the vector elements take on values between 0 and 1, the angle between any two
vectors cannot exceed ninety degrees; the
value of s then ranges from 0 to 1. If, moreover, the document properties are represented
by strictly logical vectors, the expression for
s can be simplified, since the magnitude of a logical vector equals the square root of the number of its nonzero elements.
Table 10. Typical Cross-Correlation Coefficients for Recent Documents (Published 1961)

DOCUMENTS   CITNG   CNG 2   CNG 3   CNG 4   CITED   CTD 2   CTD 3   CTD 4
FA 603      .077    .416    .404    .433    .000    .000    .000    .000
FR 612      .481    .526    .505    .478    .000    .000    .000    .000
GR 604      .142    .285    .338    .279    .000    .000    .000    .000
JO 607      .572    .605    .510    .443    .000    .000    .000    .000
JO 609      .362    .411    .485    .452    .000    .000    .000    .000
JO 611      .000    .000    .172    .196    .000    .000    .000    .000
KU 605      .133    .207    .264    .274    .000    .000    .000    .000
(each entry is the cross-correlation of the given citation measure with TDCMP)

[Figure 9. Comparison of maximum cross-correlation coefficients by document rank, for the curves CNG 4 - TDCMP, RT CNG - RHO, RT CN2 - RHO, RT CN3 - RHO, and RT CN4 - RTTD; the plot itself is not recoverable from the scan.]
Consider now the sample list of cross-correlation coefficients shown in Tables 8,
9, and 10, for early documents published in
1957-1958, for middle-range documents published in 1959-1960, and for recent documents published in 1961, respectively. It is
seen in Table 8 that the large values of the
coefficients for the early documents occur
as expected by comparing cited documents.
Similarly, Table 10 shows that large values
of the coefficients for the recent documents
Table 8. Typical Cross-Correlation Coefficients for Early Documents (Published 1957-1958)

DOCUMENTS   CITNG   CNG 2   CNG 3   CNG 4   CITED   CTD 2   CTD 3   CTD 4
CO 208      .000    .000    .000    .563    .000    .469    .433    .643
FO 496      .299    .459    .000    .000    .587    .610    .424    .523
FO 506      .000    .000    .000    .000    .000    .000    .292    .552
FR 205      .436    .000    .483    .000    .496    .000    .210    .615
GI 209      .109    .635    .134    .000    .542    .455    .487    .426
GI 491      .000    .000    .559    .000    .428    .000    .628    .622
GI 495      .273    .000    .306    .000    .615    .217    .447    .556
(each entry is the cross-correlation of the given citation measure with TDCMP)
occur as expected by comparing citing documents. These data confirm the expected
fact that in a closed collection the amount of
citing is not a good content indicator for
early documents, and the citedness is not a
good indicator for recent documents. Table 9
shows that for those documents which can
both cite and be cited, the maximum values
of the coefficients occur sometimes by comparing cited documents and sometimes by
using citing documents. The data of Table 9
thus confirm those previously exhibited in
Table 7, to the effect that no criterion was
found to exist for consistently preferring
either the amount of citing or the amount of
being cited as a content indication.
Turning now to the actual values of the
coefficients obtained, it appears that for the
majority of the documents the values of the
cross-correlation coefficients lie between
0.450 and 0.700, when citation links of length
two are used, indicating a substantial similarity between overlapping citations and overlapping index terms. For some documents,
the value of the cross-correlation coefficient for links of length two was found to be
smaller than 0.400. However, in almost
every case the failure to obtain adequate
agreement between citations and index terms
was due to the total, or almost total, lack of
citations. The exact data for 34 documents
published in 1959-1960 are shown in Table 11.
The documents with coefficients below 0.400
are listed explicitly together with the number
of citation links in each case. The following
results are apparent for the correlations between CNG 2 - TDCMP and CTD 2 - TDCMP
respectively:
a) number of documents with coefficient
above 0.400, exhibiting substantial
agreement between overlapping citations and index terms - 23 and 20;
b) number of documents with coefficient
below 0.400, exhibiting fewer than three
citation links of length one or two - 11
and 13;
c) number of documents with coefficient
below 0.400 exhibiting three or more
citation links - 1 and 1.
While there exists no scientific reason
for asserting that a threshold value of 0.400
for the cross-correlation coefficients is
necessarily significant as a similarity indication, it does appear that for nearly all documents which exhibit a reasonable number
of citations (more than two), the coefficients
obtained reveal considerable agreement between overlapping citations and overlapping
index terms. Further experimentation is
needed to evaluate more precisely the significance of the numeric results, and to determine to what extent citations can actually
be used for the automatic generation of
index terms.
CONCLUSION
Two experiments have been described
which use information directly provided with
written texts to determine associations between words and documents. The syntactic
experiment has shown that it is possible to
extract information from function words and
word suffixes to generate word groups and
relations between word groups. These, in
turn, may be used to obtain a more effective
method for comparing the content of documents.
The citation experiment has shown that
for documents which exhibit an adequate
number of citations, similarities in citations
seem to provide some indication of similarities in content. For early documents,
citedness furnishes a better indication than
the amount of citing, and vice-versa for
recent documents; for documents which can
both cite and be cited, equally good indications were obtained by comparing citing and
cited documents.
Table 11. Range of Coefficients for 34 Middle-Range Documents

CNG 2 - TDCMP:
  .600 or over: 7    .500-.599: 8    .400-.499: 8
  .300-.399: 3    .200-.299: 1    .100-.199: 0    .001-.099: 0    .000: 7

[The corresponding CTD 2 - TDCMP distribution, the selected document codes, and the number of links of length 2 (with the number of direct links in parentheses) for each document below .400 are not reliably recoverable from the scan.]
It is suggested that methods of this type,
which require neither extensive input dictionaries nor complicated computer processes, may eventually form the basis for
practical automatic programs for content
analysis. While no simple method will be
completely successful on its own, a combination of many techniques of the kind described may well offer not only a speedier,
but also a more satisfactory, solution to the
content analysis problem than the construction of semantic encyclopedias which are so
far away in the future.
SUMMARY
The solution of most problems in automatic information dissemination and retrieval
is dependent on the availability of methods
for the automatic analysis of information
content. In most proposed automatic systems, this analysis is based on a counting
procedure which uses the frequency of occurrence of certain words or word classes
to generate sets of index terms, to prepare
automatic abstracts or extracts, to deter-
mine certain word groupings, and to extend
or modify in various ways sets of terms
originally given. Unfortunately, it is not
possible to perform completely effective
subject analyses solely by frequency counting techniques.
Two automatic methods are presented to
aid in an effective subject analysis. The
first makes use of a simplified form of syntactic analysis to determine associations
between words in a text, and the second uses
bibliographic citations to classify documents
into subject areas. Neither method requires
extensive dictionaries or tables of the type
normally used for automatic classification
schemes; instead, information is extracted
from certain function words, from suffixes
provided with many words in the language,
and from bibliographic citations already
available with most documents.
Specifically, the syntactic analysis makes
use of a small dictionary of a few hundred
function words such as prepositions, conjunctions, articles, and certain nouns. Word
suffixes are then isolated, and a suffix table
is used to obtain additional grammatical
indicators. A type of predictive analysis is
then used to assign syntactic function indicators to all words in a sentence by matching
predicted syntactic structures against the
available grammatical information for the
various words. If no grammatical information is available, the most likely prediction
is used to classify the given word. The syntactic function indicators are used to group
words into phrases of certain types, and
phrases into clauses, and to determine certain word associations. Experimental evidence indicates that the error rate is not
substantially higher than that found in other
more complicated syntactic analysis programs which require full syntactic word
dictionaries.
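The suffix lookup described above might be sketched as follows (the suffix table entries here are invented examples, not the program's actual table):

```python
# Hypothetical suffix table; the real table maps many more
# endings to grammatical indicators.
SUFFIX_TABLE = {
    "tion": "noun", "ment": "noun", "ness": "noun",
    "ing": "verb/participle", "ly": "adverb",
    "ous": "adjective", "ical": "adjective",
}

def grammatical_indicator(word):
    """Return the indicator for the longest matching suffix, if any."""
    for n in range(len(word) - 1, 0, -1):  # try longest endings first
        indicator = SUFFIX_TABLE.get(word[-n:])
        if indicator is not None:
            return indicator
    return "unknown"
```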
The citation matching program uses bibliographic citations to determine document
similarities. A similarity coefficient is first
calculated for all document pairs as a function of the number of overlapping citations
between them. A second similarity coefficient is then derived using this time the
number of overlapping index terms as a
criterion. The index terms may be generated
by hand, or may be derived by means of word
frequency analyses. Finally, similarity coefficients derived from overlapping citations
are compared with those derived from overlapping index terms.
The coefficients, computed for a sample
document collection, are analyzed to verify
the hypothesis that when a closeness exists
in the subject matter of certain documents,
as reflected by overlapping index terms,
there exists a corresponding closeness in
the citation sets. It is found that the computed similarity coefficients are much larger
than those obtained by assuming a random
assignment of citations and index terms.
Suggestions are made for using citation sets
as an aid to the automatic generation of
index terms.
REFERENCES
1. H. P. Luhn, Auto-encoding of Documents
for Information Retrieval Systems, IBM
Corporation, ASDD Report, 1958.
2. H. P. Luhn, "Automatic Creation of Literature Abstracts," IBM Journal of Research and Development, Vol. 2, No. 2,
April 1958.
3. H. P. Luhn, "The Automatic Derivation of
Information Retrieval Encodements from
Machine-Readable Texts," in Information
Retrieval and Machine Translation, Part
2, A. Kent, ed., Interscience Publishers,
New York, 1961.
4. L. B. Doyle, Indexing and Abstracting by
Association, Report SP-718/001/00,
System Development Corporation, April
1962.
5. H. Borko, "The Construction of an Empirically Based Mathematically Derived
Classification System," Proceedings of
the AFIPS Spring Computer Conference,
San Francisco, May 1962.
6. L. B. Doyle, "Semantic Road Maps for
Literature Searchers," Journal of the
Association for Computing Machinery,
Vol. 8, No.4, October 1961.
7. V. E. Giuliano, "A Linear Method for
Word Sentence Association," private
communication.
8. B. F. Green, Jr., A. K. Wolf, C. Chomsky,
and K. Laughery, "Baseball: An Automatic Question Answerer," Proceedings
WJCC, Los Angeles, May 1961.
9. R. K. Lindsay, "The Reading Machine
Problem," Doctoral Thesis, Carnegie
Institute of Technology, September 1960.
10. N. S. Prywes and H. J. Gray, Jr., "A
Report on the Development of a List
Type Processor - I," University of Pennsylvania, Moore School of Electrical Engineering, 1961.
11. M. Detant and A. Leroy, "Elaboration d'un
Programme d'Analyse de la Signification," Rapport GRISA No. 11, EURATOM-CETIS, June 1961.
12. P. J. Stone, R. F. Bales, J. Z. Namenwirth, and D. M. Ogilvie, "The General
Inquirer: A Computer System for Content Analysis and Retrieval based on the
Sentence as a Unit for Information," Laboratory of Social Relations, Harvard University, November 1961.
13. P. Baxendale, "An Empirical Model for
Machine Indexing," Third Institute on Information Storage and Retrieval, February 1961.
14. G. Salton, "The Manipulation of Trees
in Information Retrieval," Communications of the Association for Computing
Machinery, Vol. 5, No. 2, February 1962.
15. G. Salton, "The Identification of Document Content: A Problem in Automatic
Information Retrieval," Proceedings of
a Harvard Symposium on Digital Computers and their Applications, Annals of
the Computation Laboratory of Harvard University, Vol. 31 (to be published, 1962).
16. S. Klein and R. F. Simmons, "Automatic
Analysis and Coding of English Grammar for Information Processing Systems," Report SP-490, System Development Corporation, February 1962.
17. S. Klein and R. F. Simmons, "A Computational Approach to Grammatical
Coding of English Words," Report SP-701, System Development Corporation,
February 1962.
18. I. Rhodes, "A New Approach to the Mechanical Translation of Russian," National Bureau of Standards, Report No.
6295, 1959.
19. M. Sherry, "Comprehensive Report on
Predictive Syntactic Analysis," Mathematical Linguistics and Automatic
Translation, Report No. NSF-7, Section I, Harvard Computation Laboratory, 1961.
20. W. Plath, "Automatic Sentence Diagramming," First National Conference
on Machine Translation and Applied
Language Analysis, National Physical
Laboratory, Teddington, 1961.
21. D. G. Hays, "Grouping and Dependency
Theories," Report P-1910, Rand Corporation, Santa Monica, 1960.
22. G. Salton, "The Use of Citations as an
Aid to Automatic Content Analysis," Information Storage and Retrieval, Report
ISR-2, Harvard Computation Laboratory, in preparation, June 1962.
23. F. E. Hohn, S. Seshu, and D. D. Aufenkamp, "The Theory of Nets," IRE Transactions on Electronic Computers, Vol.
EC-6, 1957, pp. 154-161.
A LOGIC DESIGN TRANSLATOR
D. F. Gorman and J. P. Anderson
Burroughs Corporation
Burroughs Laboratories
Paoli, Pennsylvania
INTRODUCTION
It is envisioned that a translator such as
this would be incorporated within a complete
design automation system, and, as such,
should be expected to provide input for minimization programs, card layout and assignment programs, backplane wiring programs,
and the like. For this reason, a canonical
form is essential to a completely automated
system and, in addition, will provide a convenient input form for follow-up programs.
From the canonical form, it is possible
to obtain an estimate of the relative cost of
a system, based upon component costs. As
a result, the cost and effects of modifications and changes can be quickly and accurately determined. The canonical form also
provides a means of comparing and evaluating various designs by examining the hardware structure. Previously, such comparisons could not be made, because of the
prohibitive number of man-hours involved in
the detailed logical design. Thus, those operations which are difficult to implement, in
terms of complexity or cost, are easily pinpointed for further study. Logical inconsistencies and timing conflicts are eliminated. Thus, the translator will provide a
tool for system designers to more accurately measure their systems, and, hopefully,
will promote better descriptions of future
systems.
The process of logic design of a computer
is analogous to programming, using hardware for commands. Logic designers must
frequently take imprecise narrative descriptions of computer systems, and, applying
experience, ingenuity, inventiveness, and
considerable perseverance, transform the
description into a prescription for a running
system. This process takes an inordinate
amount of time. Considerable attention has
been devoted to the reduction of programming
time through the use of higher programming
languages, such as ALGOL and COBOL, and
conversion to machine code through translators. In a similar manner, translation of
higher languages for logic and system design
would greatly reduce the effort now needed
to specify and design computer systems.
Just as in the case of programming,
changes to the description are tolerated up
to a certain point, then are prohibited, or are
made with great reluctance and less than
perfect efficiency. This paper offers systems and logic designers a means to automate
the repetitive and otherwise error-prone
detail associated with much machine design,
as well as to provide a means for making
systems changes acceptable for a longer
period of time, without encountering the
logic design equivalent of program patches.
An additional motivation is to be able to
rapidly obtain a canonical description of a
computer system in the form of application
equations of all registers within the machine.
System Description Language
A system can be described by giving the
functional interrelationships of the various
252 / A Logic Design Translator
parts of the system. The description is usually presented in a narrative form, with the
drawbacks of ambiguities and otherwise imprecise meanings. Upon examining the notion
of system design, as opposed to logic design,
it is often found that the emphasis is upon
the control structure of a processor, with
details on the number of registers to be included, the structure of each, the manner of
connection in each case, and the parts to be
played by each in the execution of various
commands. Particular attention is paid to
the selection and description of a command
list, especially those commands which are
the "particular features" of a machine.
These considerations suggested that a language that was a concise and unambiguous
description of these objects would, in fact,
describe the computer system.
The system descriptive language devised
is based upon the informal programmatic
notation frequently used to replace clumsy
narrative [1], such as replacing the statement
The contents of A and X are interchanged; the contents of A are then
stored in memory at the place addressed
by the contents of MA.
by
EXCHANGE A AND X
A → MA*
In general, the language assumes the existence of such hardware entities as counters,
adders, registers, clocking systems, and the
like. The forms of the language are not unlike constructs found in algebraic languages
such as ALGOL [2].
The basic language construct devised to
describe the interaction of the various registers is a statement. The principal statement
type is associated with interregister transfers. In addition, counting statements, conditional statements, shift statements, memory
access statements, and the like are provided
for the most frequently invoked system
functions.
Examples of some of the various statements are shown below:

transfer:       A → B
conditional:    IF G(13) = 0 THEN (5. G → B) ELSE (6. B → G)
shift:          A MOVE RIGHT OFF 6 → B
arithmetic:     A + B → R
counting:       C + 1 → C
subroutine:     INSTFETCH
memory access:  MA* → L
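A recognizer dispatching on such statement types might look roughly like this (an ASCII `->` stands for the transfer arrow, the classification criteria are invented for illustration, and only a few of the 12 statement types are distinguished):

```python
import re

def recognize(stmt):
    """Classify a statement by its surface pattern (sketch only)."""
    if stmt.startswith("IF "):
        return "conditional"
    if "MOVE RIGHT" in stmt or "MOVE LEFT" in stmt:
        return "shift"
    if re.match(r"\w+ \+ 1 ->", stmt):
        return "counting"
    # A remaining "+" or "-" (other than the arrow itself) marks arithmetic.
    if ("+" in stmt or "-" in stmt.replace("->", "")) and "->" in stmt:
        return "arithmetic"
    if "->" in stmt:
        return "transfer"
    return "subroutine"
```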
Statements are freely combined to form
functional descriptions of instructions and
control. These larger constructs are the
microsequences used to describe the machine
functions. The complete functional description of a computer system is given by the set
of microsequences for the instruction set and
control (such as instruction fetch and data
fetch), plus the declarations describing the
register sizes, and their interconnections.
As an example, the following is a microsequence for the acquisition of the next instruction in some machine:
INSTFETCH
1. PC → MA
2. MA* → B
3. PC + 1 → PC
Declarations are used to specify details
of the system. They are used to name registers and describe their size and structure as
well as to provide characteristics of various
equipment assumed, such as arithmetic units,
logical units, etc. In addition, declarations
describe permissible data paths in the system as a check on the transfers specified in
the microsequences. In general, the microsequences make reference to substructure
names for the apparent reason of readability
and mnemonic value. As a canonical representation, however, it was deemed desirable
to use the parent name of a register, since,
in general, the substructure is artificially
imposed by the system designer and has no
a priori meaning in hardware. Typical of
the register declarations is the example of
an instruction register description in Figure 1.
It is the intention to draw upon the advances achieved in recent years in the design
of system elements, such as adders, counters,
comparators, etc., by providing libraries of
designs of these specialized units. These
designs would reflect various compromises
between speed and cost, and would represent
diversified implementations of computational
algorithms. These units are declared as
special registers, and have a format similar
[Figure 1. An Instruction Register Description: the declaration names the register (C), its size fields, and its substructure names (OP, INDX(7, 9), ADDR); the remaining field boundaries are not recoverable from the scan.]
to storage registers, differing only by the
inclusion of cost and speed parameters.
Translator Structure
The overall structure of the translator is
that shown in Figure 2.
Briefly, the translator accepts systems
descriptions in the language provided, and
transforms the microsequences into an intermediate language known as the design table.
After suitable manipulation, the design table
may be converted to application equations.
The translator is coded as a set of recursive
procedures, using the Burroughs 220 ALGOL
translator [3]. Each procedure corresponds
(more or less) to one of the syntactic definitions of the language. In addition, since the
language unit most frequently invoked is a
statement, a procedure is provided to array
statement elements into a vector, and associate with them a type designator (for
example, an identifier, a number, an operator,
a delimiter, etc.). A set of link list procedures is used to handle the associative storage for the register and transfer lists, and
for various other lists used in the translation process.
The logic design translator has three major
parts: the scan and recognizer, the design
table analysis, and the equation generator.
The Scan and Recognizer: This segment
of the translator takes microsequences, in
the system descriptive language, converts
them into a series of information transfers,
and enters them into the design table. As
mentioned above, the design table can be
thought of as an intermediate language used
during the translation process as a convenient
form in which to store and manipulate data.
The design table is represented as an m × 12
matrix, where m is the number of operations to be performed within a microsequence
and the 12 columns indicate the following
information:
Column 1 - source register
2 - most significant bit of the field of the source register under consideration
3 - least significant bit of the field of the source register under consideration
4 - destination register
5 - most significant bit of the field of the destination register under consideration
6 - least significant bit of the field of the destination register under consideration
7 - equipment used for the transfer of information
8 - most significant bit of the field of the equipment under consideration
9 - least significant bit of the field of the equipment under consideration
10 - control conditions to be satisfied during this operation
11 - relative time at which this microstep is to occur
12 - delay of the destination register

[Figure 2. A Logic Design Translator: microsequences (in the system description language) together with the system structure (registers and special registers, the list of legitimate transfers, transfer paths, and hardware decisions covering transfer type, timing information, logical blocks, and register type) are converted into the design table, an ordered sequence of legitimate transfers, from which the equations of the entire system are produced.]
The design table is set up to accommodate
information transfers; that is, its basic
structure embodies a source and a destination of information, provisions for indicating
other equipment which may be used, and the
control conditions which must be fulfilled
for each transfer. That all routines to be
performed by the specified machine can be
decomposed into a series of information
transfers is fundamental in the design of
digital computers.
Each microstep must be individually analyzed in the order of its appearance within
the microsequence. The microstep is first
examined by the recognizer in order to determine which type of statement it is. At the
present time, 12 basic statement types have
been defined within the system. Depending
upon the statement type, a subroutine is
chosen which scans the statement, determining which registers or substructures of registers of the machine are invoked by this
microstep. In general, substructure names
are used in the microsteps, mainly for their
mnemonic value. These substructure names
are now replaced by the name of the parent
register along with the appropriate bit fields.
It is at this time that the basic statement is
broken down into a series of information
transfers between the specified registers.
The parent register names and bit fields
are now entered into the design table in the
"source" and "destination" columns, each
transfer in a separate row. Since each information transfer between registers may
use other equipment in the system (busses,
memory address register, etc.), these transfers are compared to the list of data paths
which was initially entered as declarations
concerning the system, and all additional
equipment which is used in this transfer is
entered in the "equipment used" column of
the design table. The operators appearing
in the statement are entered in the "control"
column. Flags, to be used in the design table
analysis, are entered in the "time" column,
if needed. A "0" flag is used to indicate that
the following row is part of the operation;
thus, the two rows are to be considered as
one. A "1" flag indicates that the design table
is partitioned below that row.
The Design Table Analysis: This section
of the translator determines the length of
time each operation will take, the time when
it may begin, and sets up the control for the
next instruction fetch.
The type of transfer associated with each
operation in the design table is determined
by the destination register. If the destination
register accepts information in parallel, one
delay will be required for the transfer. If a
serial transfer is required, then the number
of delays required for the operation is equal
to the number of bits in the register involved.
A delay is the time that the destination register requires to process one bit of information.
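The delay count thus depends only on whether the destination accepts bits in parallel or serially. As a one-line sketch (ours, not the paper's):

```python
def transfer_delays(register_bits: int, parallel: bool) -> int:
    # A parallel destination register absorbs the whole word in one delay;
    # a serial one needs one delay per bit it must process.
    return 1 if parallel else register_bits
```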
Since the GO TO statement effectively
alters the order in which operations are to
be performed in the machine, the design table
must be partitioned so that the timing of each
branch of the program may be determined
independently.
The clock time at which an operation may
begin is determined as follows:
1. The first operation in the design table
can begin at the first clock pulse, denoted by t1.
2. The first operation in any partition of
the design table can begin at tmax + 1,
where t max is the highest clock time
assigned to any previous operation in
the design table.
3. The clock time of the nth operation is
determined by comparing all registers
and other equipment used by an operation against previous operations, within
the same partition, beginning with the
(n - 1)st operation, until the conflict
with the maximum time is found. A
conflict is said to occur between two
operations if any equipment is used in
common by both operations. The nth
operation is then begun at a time one
greater than the time of the conflicting
operation with the highest time. If no
conflict is found within the partition,
the starting time of the first operation
within the partition is also the starting
time of the nth operation.
This method of assigning clock times to
operations ensures that, for the order of
microsteps as declared, equipment does not
remain idle needlessly.
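Under the simplifying assumptions that every operation takes one delay (the parallel-transfer case) and that each operation is described only by the set of equipment it touches, the three rules can be sketched as follows; this is our rendering, and the row-grouping "0" flag is not modelled:

```python
def assign_clock_times(partitions):
    """Assign a starting clock time to each operation by the three rules.

    `partitions` is a list of partitions (created by GO TO statements);
    each partition is an ordered list of operations, and each operation
    is modelled simply as the set of registers/equipment it uses.
    Returns the start times as integers, with t1 = 1.
    """
    all_times = []
    t_max = 0
    for part in partitions:
        part_start = t_max + 1                 # rules 1 and 2
        part_times = []
        for n, op in enumerate(part):
            start = part_start                 # default: no conflict in partition
            for m in range(n - 1, -1, -1):     # rule 3: scan earlier operations
                if op & part[m]:               # shared equipment -> conflict
                    start = max(start, part_times[m] + 1)
            part_times.append(start)
        all_times.extend(part_times)
        if part_times:
            t_max = max(part_times)
    return all_times
```

For example, three operations where the second shares BUS1 with the first are scheduled at t1, t2, t1; a second partition always starts after the highest time already assigned.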
An implicit subroutine, called microsequence completion, must be added to each
microsequence so that, as each microsequence is executed, control is transferred to
the following microsequence. This is accomplished by entering in the (i + 1)st row of
the design table the following:
a) "0" in column 1
b) CLOCK in column 4
c) RESET in column 10
d) tmax + 1 in column 11
e) DelayCLOCK in column 12
The Equation Generator: After the operation of the design table analysis, the application equations for all the data storage registers in the machine are generated. Each
operation in the design table will form a
term of the application equation for each bit
of the destination register. This term is
formed by taking the conjunction of the microsequence name, the corresponding bit of the
source register, all variables in the control
column, and the clock time. After all the
microsequences to be performed by the specified machine have been processed, the total
equation for a particular bit of a register is
formed in a separate sort run by taking the
disjunction of all terms in which the bit under
consideration was the destination bit.
Likewise, the application equations for
the control flip-flops are derived from the
design table based upon the microsequence
name and the clock times associated with
each control variable.
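The two steps, per-row term formation and the later sort run, can be sketched as follows. This is a loose modern rendering: the dictionary keys and tuple encoding are our own, and bit-field bookkeeping is omitted.

```python
from collections import defaultdict

def generate_terms(microseq_name, rows):
    # One term per design-table operation: the conjunction of the
    # microsequence name, the source register, the control-column
    # variables, and the clock time.
    terms = []
    for r in rows:
        conj = (microseq_name, r["source"], *r.get("control", ()), "t%d" % r["time"])
        terms.append((r["dest"], conj))
    return terms

def collect_equations(all_terms):
    # The later "sort run": group terms by destination and take the
    # disjunction (represented here as a list) of all terms that
    # drive each destination.
    eqs = defaultdict(list)
    for dest, conj in all_terms:
        eqs[dest].append(conj)
    return dict(eqs)
```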
An Example
Two microsequences will now be used to
illustrate the functioning of the translator.
These microsequences do not represent a
useful routine in a computer system but were
constructed to illustrate at least one example
of each type of statement. Assume that all
information transfers occur in parallel. The
resulting design tables are shown in Figs. 3
and 5. The two microsequences are as
follows:
MICROSEQUENCE WIN
1. P → Q
2. K + 1 → P
3. S + T → W
4. X MOVE RIGHT·OFF 3 → X
5. IF K = P
   THEN (6. W → X)
   ELSE (7. W → Y
         8. R6-10 → W2-6
         9. K - 1 → P)

MICROSEQUENCE LOSE
1. (K' W ∨ R) → X
2. WIN
3. M* → T
4. EXCHANGE S AND T
5. T → M*
6. GO TO 2
For microsequence WIN, microsteps 1
and 2 are straightforward entries into the
table. The fact that BUS1 was used during
these operations was determined by consulting declarations about the system assumed
to have been previously entered via the translator. In microstep 3, row 7, the 0 in column
11 is used as a flag to indicate that both
operands (S and T) must be brought to the
arithmetic unit at the same time. Microstep
5 is a conditional statement and, as such,
imposes control conditions upon following
statements, as indicated by the FF1 and
-FF1 entries in column 10.
Microsequence LOSE illustrates some
different statement types and another feature
of the translator. In evaluating the Boolean
expression in microstep 1, the translator
finds that temporary storage is necessary.
The translator assigns a name to this register (TEMP 1) and notifies the designer that
this register is necessary if the microstep
is to be executed in the prescribed manner.
The analysis section of the translator determines the clock times for the operations
indicated in the design table. Since parallel
transfers were assumed, only one delay will
be required for each transfer; the results are
shown in Figures 4 and 6. In accord with the
flags in column 11 of microsequence WIN,
the beginning times for the operations of
rows 1 and 2, 3 and 4, etc., are the same.
[Figure 3. Design Table for Microsequence WIN.]

[Figure 4. Design Table for Microsequence WIN after Timing Analysis.]

[Figure 5. Design Table for Microsequence LOSE. (The entire microsequence WIN is entered in its middle rows.)]

[Figure 6. Design Table for Microsequence LOSE after Timing Analysis. (The entire microsequence WIN is entered in its middle rows.)]

An interesting situation occurs in rows 19,
20, and 21. Since the control conditions for
operations 20 and 21 are the inverse of the
conditions for operation 19, operation 19 is
ignored during the determination of the
starting time for operations 20 and 21.
The equation generator will convert the
design tables into terms which will be combined in a later sort run to form the application equations. Terms generated by microsequence WIN are:
BUS1[1-n] = WIN · P[1-n] · t1
Q[1-n] = WIN · BUS1[1-n] · t1
BUS1[1-n] = WIN · K[1-n] · t2
COUNT[1-n] = WIN · BUS1[1-n] · UP · t2
BUS1[1-n] = WIN · COUNT[1-n] · t3
P[1-n] = WIN · BUS1[1-n] · t3
ARITH(A)[1-n] = WIN · S[1-n] · ADD · t1
ARITH(A)[1-n] = WIN · S[1-n] · ADD · t2
ARITH(A)[1-n] = WIN · S[1-n] · ADD · t3
ARITH(A)[1-n] = WIN · S[1-n] · ADD · t4
ARITH(A)[1-n] = WIN · S[1-n] · ADD · t5
ARITH(B)[1-n] = WIN · T[1-n] · ADD · t1
ARITH(B)[1-n] = WIN · T[1-n] · ADD · t2
ARITH(B)[1-n] = WIN · T[1-n] · ADD · t3
ARITH(B)[1-n] = WIN · T[1-n] · ADD · t4
ARITH(B)[1-n] = WIN · T[1-n] · ADD · t5
W[1-n] = WIN · ARITH[1-n] · t6
SHIFT[1-n] = WIN · X[1-n] · RIGHT·OFF · t1
SHIFT[1-n] = WIN · SHIFT[1-n] · RIGHT·OFF · t2
SHIFT[1-n] = WIN · SHIFT[1-n] · RIGHT·OFF · t3
X[1-n] = WIN · SHIFT[1-n] · t4
BUS1[1-n] = WIN · K[1-n] · t4
LOGICAL(A)[1-n] = WIN · BUS1[1-n] · EQL · t4
BUS2[1-n] = WIN · P[1-n] · t4
LOGICAL(B)[1-n] = WIN · BUS2[1-n] · EQL · t4
FF1[1] = WIN · LOGICAL[1] · t7
X[1-n] = WIN · W[1-n] · FF1 · t8
Y[1-n] = WIN · W[1-n] · -FF1 · t8
W[2-6] = WIN · R[6-10] · -FF1 · t9
BUS1[1-n] = WIN · K[1-n] · t4
COUNT[1-n] = WIN · BUS1[1-n] · DOWN · t4
BUS1[1-n] = WIN · COUNT[1-n] · t7
P[1-n] = WIN · BUS1[1-n] · t7
CLOCK = WIN · "0" · RESET · t10
Proceedings-Fall Joint Computer Conference, 1962 / 261
After all microsequences have been processed, the terms from all microsequences are collected and sorted; and the equations for the system are formed. For example, the equation for
the first bit of the X register will have the following form:
X[1] = (WIN · SHIFT[1] · t4 ∨ WIN · W[1] · FF1 · t8)   (from microsequence WIN)
     ∨ (LOSE · LOGICAL[1] · t8 ∨ LOSE · SHIFT[1] · t12 ∨ LOSE · W[1] · FF1 · t16) ∨ …   (from microsequence LOSE)
CONCLUSION
The foregoing outlines a programming
system that is another tool for the computer
designer. The emphasis has been on the
representation of a computer system in a
form suitable for further processing by an
automated design system, or by programs to
evaluate the cost of the machine under consideration. In general, it is envisioned that
the user of this system would be the designer
concerned principally with the overall control structure of a computer.
It should be pointed out that the algorithms devised are, to a large degree, a
function of the class of machines being designed (parallel-synchronous, in the example), and, to a lesser degree, a function of
assumptions regarding characteristics of
the system elements available to the designer. (For example, the translation algorithm employed would have to be changed
to accommodate "shifting type" registers.)
Further changes would have to be incorporated to give the designer explicit control
over the concurrences in microsequences
rather than to arbitrarily exploit the concurrences (based upon utilization of system
elements) in the translator.
The system described herein will be used
as a vehicle for extending the notation to cover
wider classes of designs, and to study the
implications of the notational devices to canonical representation of computer systems.
It has not escaped the authors that the
language described in this paper is essentially the form necessary for functional
simulation of computers, and that it would
be a relatively simple task to write a translator that would generate a simulation program representing a proposed machine design, either on a functional level or on an
individual clock pulse basis.
ACKNOWLEDGEMENT
The contributions of Roy Proctor, of the
Burroughs Laboratories Computation Center, both for programming and for several
suggestions in connection with this work,
are acknowledged with pleasure.
REFERENCES
1. Barton, R. S., "A New Approach to the
Functional Design of a Digital Computer,"
Proceedings, 1961 WJCC, pp. 393-396;
May 9-11, 1961.
2. Naur, P., et al., "Report on the Algorithmic Language ALGOL 60," Communications of the ACM, vol. 3, no. 5, pp. 299-314; May 1960.
3. Burroughs Algebraic Compiler, Bulletin
220-21011-P (Detroit, Michigan: Equipment and Systems Marketing Division,
Burroughs Corporation) January 1961.
COMPROTEIN: A COMPUTER PROGRAM TO AID
PRIMARY PROTEIN STRUCTURE DETERMINATION*
Margaret Oakley Dayhoff and Robert S. Ledley
National Biomedical Research Foundation
Silver Spring, Maryland
INTRODUCTION

Among the main chemical constituents of
the human body-and, in fact, of all living
things-are proteins. In addition to serving
as component structural parts of many types
of living tissues, the proteins are enzymes
that are necessary in order that the chemical
reactions which comprise the life processes
may occur. The protein enzymes act to
"decode" the message of the genes, interpreting this message in terms of specific
chemical reactions which determine the physical and functional characteristics of the
organism. Thus proteins play a uniquely vital
role in the evolution, ontogeny, and maintenance of living organisms. It therefore becomes important when studying the basis of
life processes to know the structure of the
proteins themselves.

In spite of their highly complex role, the
molecular structure of proteins is, in principle, relatively simple: they are long chains
of only twenty different types of smaller
molecular "links" called amino acids (see
Figure 1). Each type of protein is characterized by a particular ordering of the amino
acid links, and a major problem in finding
the exact structure of a protein is to obtain
the ordering of the amino acids in the chain.
This ordering is of great interest because it
is the order of the amino acids in a protein
that is determined by the gene. Thus, according to current biological theory, the gene determines which proteins will be made by
determining the order of the amino acids in
the protein chain, and it is these proteins in
turn, acting as enzymes, that control the
chemical processes that determine the physical and functional characteristics of the
organism.

Finding the amino acid order of a protein
chain has proved a time-consuming process
for the biochemist; in fact, only about 6 complete or almost complete protein orderings
have been found so far, namely those of insulin [1], hemoglobin [2], ribonuclease [3],
tobacco mosaic virus protein [4], myoglobin [5], and cytochrome C [6]. The basic
technique used on all these proteins (with the
exception of myoglobin) was to break down the
long chain chemically into smaller fragment
chains at several different points, to analyze
the amino acids in each fragment chemically,
and then to try to reconstruct the entire protein chain by a logical and combinatorial examination of overlapping fragments from the
different breakdowns. It is in this reconstruction of the protein that the computer finds its
application.

*The research reported in this paper has been supported by Grant GM 08710 from the National
Institutes of Health, Division of General Medical Sciences, to the National Biomedical Research
Foundation.
No.  Name            Abbreviation        No.  Name            Abbreviation
1    alanine         ALA                 11   leucine         LEU
2    arginine        ARG                 12   lysine          LYS
3    asparagine      ASN                 13   methionine      MET
4    aspartic acid   ASP                 14   phenylalanine   PHE
5    cysteine        CYS                 15   proline         PRO
6    glycine         GLY                 16   serine          SER
7    glutamine       GLN                 17   threonine       THR
8    glutamic acid   GLU                 18   tyrosine        TYR
9    histidine       HIS                 19   tryptophane     TRY
10   isoleucine      ILU                 20   valine          VAL
Figure 1. A listing of the amino acids with their abbreviations is shown in the upper section
and the lower indicates part of the protein ribonuclease which actually comprises a chain of
some 124 amino acids.
As a trivial example, suppose that for a
protein one chemical breakdown produced the
fragment chains of known ordering,

Breakdown P: AB, CD, and E

where A, B, C, D, and E represent amino
acids, and where each occurs once and
only once in the protein. Let us call this a
complete breakdown, and let another breakdown, this time incomplete, produce the fragments

Breakdown Q: BC and DE

Here fragment BC in Breakdown Q
(see Figure 2a) clearly overlaps the two fragments AB and CD of Breakdown P, and DE
overlaps CD and E of Breakdown P, giving as
the reconstructed protein

ABCDE.

As another example, consider the more common case
where the amino acid components
of a fragment are known, but the order of
these within the fragment is unknown. Let
parentheses indicate that the order they enclose is unknown [e.g., (A, B, C) represents
the six permutations of A, B, C; (D, E) represents either DE or ED; (A, B, C)(D, E)
represents the 6 × 2 = 12 fragments of each
of the six permutations in (A, B, C) followed
by DE or ED, etc.], and suppose that one complete breakdown is

Breakdown P: (A, B, C) and (D, E)

and that another, incomplete, breakdown is

Breakdown Q: (A, B) and (C, D)

Clearly (C, D) of breakdown Q overlaps (A,
B, C) and (D, E) of breakdown P, but (A, B)
is contained within (A, B, C). Hence, since
each amino acid has distinct "left" and "right"
ends, two possible protein reconstructions
result, namely (see Figure 2b)

(A, B)(C)(D)(E) and (E)(D)(C)(A, B)

where in each possibility the order of A, B
still remains unknown. Such partial reconstructions frequently occur, and pinpoint for
the biochemist that portion of the molecule
on which further effort is required.
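For the fully ordered case of the first example, the reconstruction amounts to chaining the complete-breakdown fragments along the overlap evidence. A sketch of that idea (ours, not part of COMPROTEIN):

```python
def chain_fragments(p_fragments, q_overlaps):
    """Reconstruct the order of complete-breakdown fragments for the
    fully ordered case (as in the AB/CD/E example).  Each pair in
    `q_overlaps` records that an incomplete-breakdown fragment
    straddles two P fragments, i.e. the first is immediately followed
    by the second.  Unordered compositions would need the parenthesis
    logic described in the text."""
    follows = dict(q_overlaps)
    # The leftmost fragment never appears as the right member of a pair.
    start = (set(p_fragments) - set(follows.values())).pop()
    order = [start]
    while order[-1] in follows:
        order.append(follows[order[-1]])
    return "".join(order)
```

With Breakdown P = {AB, CD, E} and the overlaps implied by BC and DE, this chains the fragments back into ABCDE.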
Unfortunately, however, the problems involved in reconstructing proteins are not as
simple as in the examples just given. The
largest protein analyzed so far, the tobacco
mosaic virus protein, has only 158 amino
acids whereas proteins usually have chains
of many hundreds of amino acid links. Since
the number of combinations of n things taken
r at a time (1 < r < n) increases more rapidly
than does n itself, it is to be expected that the
difficulties in piecing together the fragments
of a protein will increase proportionally faster
than the number of amino acids in the protein.
In addition, there may be occasional errors
in the fragments reported by the biochemist,
Figure 2. Illustration of two different breakdowns of a protein into amino acid fragments
(see text).
as well as other aberrations in the data.
Hence the logical and combinatorial problems
can become severe, and a computer is then
required to assist in the analysis.
The advantage of computer aid is that it
may help significantly to extend the current
chemical analysis methods of determining
the amino acid sequences of proteins to many
more and much larger proteins. By exhaustively analyzing the possibilities of protein
reconstructions, the computer may assist in
determining the best next step to try in the
chemical analysis processes. In addition, it
should be noted that presently the chemical
analysis is carefully planned to produce results that will be logically simple for mental
analysis. The use of a computer to perform
the logical analysis may thus allow significant
simplification and further systematization of
the chemical experimental procedures by
placing more of the burden on the automated
logical and combinatorial analysis and less
on the experimental procedures.
In this paper we shall describe a completed computer program for the IBM 7090,
which to our knowledge is the first successful
attempt at aiding the analysis of the amino
acid chain structure of protein [7]. The idea
was conceived by us in 1958, but actual programming was not initiated until late 1960.
D. F. Bradley, S. A. Bernhard, and W. L.
Duda have independently reported, in an as-yet unpublished paper [8], progress in approaching a similar problem. R. Eck has
reported on a system for using marginal-punch cards to aid in certain aspects of the
logical analysis problem [9].
A SIMPLIFIED ILLUSTRATION
Discussion of the programming methods
utilized will be clarified if we first consider
a simple illustration. Suppose a complete
breakdown P is made by the biochemist as in
Figure 3, and that another breakdown Q is
also known but not complete (see Figure 3).
Complete breakdown P LIST          Incomplete breakdown Q LIST
p1  (R)(A,B)                       q1  (A)(B,B,D)
p2  (D)(B)(C,A)                    q2  (A)(C,A,C)
p3  (A)(C)(D)(X,A)(C)              q3  (X)(A,C,B)
p4  (B)(D)(B)(D)(A,B,D)            q4  (B)(A,A,C,D)
p5  (C)(A)(Z)                      q5  (A)(B,C,D)
Figure 3. Breakdowns of protein fragments
for the illustration in the text.
In Figure 4 we show how such breakdowns P
and Q can occur from our hypothetical protein, but the problem is to reconstruct this
from the fragments given in Figure 3. Since
each fragment qi of breakdown Q must either
overlap several fragments of P, or else be
included within some fragment of P, let us
start by making a list for each qi of all possible such associated P fragments; Figure 5
shows such lists for our illustration. As an
example of how each entry in a list is found,
consider the test of whether or not q4 overlaps p4p5, where

q4 is (B)(A,A,C,D)

and where

p4 is (B)(D)(B)(D)(A,B,D),  p5 is (C)(A)(Z)

The problem is to determine if each acid of
q4 can be accounted for in p4 and p5. First
note that the maximum overlap between q4
Figure 4. Illustration of sources of peptide fragments from protein molecule
illustrated in the text.
q1 list: p1p2, p1p4, p2p4, p5p4, p4p2
q2 list: p2p3, p2p5, p3p5
q3 list: p3p4
q4 list: p1p3, p4p3, p4p5
q5 list: p1p2, p2p4, p3p2

Figure 5. The q lists for the illustration
in the text.

and p4 is (B,A,D), on the right of p4. This
leaves (A,C) of q4 "hanging over" on the right
of q4, to be accounted for in p5. This is
clearly possible, resulting in the overlap:

         p4                  p5
(B)(D)(B)(D)(B)(A,D)   (C)(A)(Z)
            (B)(A,D)   (C)(A)
                   q4

In order to determine all the entries in all
the lists, such trials must be made by the
computer for every pair of fragments pipj
for each qk.

However, just forming the lists is but the
first step in reconstructing the protein chain.
The next step is an elimination process to
leave only the consistent possibilities. For
instance, q3 can only arise from p3p4; hence
p3 must be followed by p4, and p4 must be
preceded by p3, and therefore all other possibilities in other lists involving p3 and p4
can be eliminated, such as p1p4, p2p4 and
p5p4 in the q1 list of Figure 5, p3p5 in the q2
list, p4p3 in the q4 list, and p2p4 and p3p2 in the
q5 list.* This leaves in the list for q1 only
p1p2 and p4p2. If we first assume that p1p2
is overlapped by q1, then in the q4 list only
p4p5 remains, and hence in the q2 list only
p2p3 remains, giving altogether these adjacent fragments,

p1p2, p2p3, p3p4, p4p5

which determine the structure as p1 p2 p3 p4 p5.
On utilizing p1, p2, p3 and p4 we find as the
final structure:

(R)(A)(B)(D)(B)(A)(C)(A)(C)(D)(X)
(A)(C)(B)(D)(B)(D)(B)(A,D)(C)(A)(Z)

Returning to the second possibility in the
q1 overlap list, namely p4p2, this leaves only
p1p3 in the q4 list, which in turn leaves only
p2p5 in the q2 list. Hence a second possibility for adjacent fragments is

p1p3, p3p4, p4p2, p2p5

which gives the structure

(R)(B)(A)(A)(C)(D)(X)(A)(C)(B)(D)
(B)(D)(D)(A)(B)(D)(B)(A)(C)(C)(A)(Z)

*Actually, it is also necessary to eliminate
the occurrence of p4 in the lists by replacing it with p3*, which stands for p3p4. This
is to insure that an impossible succession
of conditions such as p1p2, p2p3, p3p1 is not
produced.
COMPUTER HANDLING OF BIOCHEMICAL
INFORMATION
The problems involved in writing a computer program, however, are not as straightforward as the above illustration might indicate. The biochemist utilizes enzymes to
break up (hydrolyze) the protein into the
fragments that we have been considering;
these fragments are called peptides by the
biochemist. The enzymes commonly used,
such as subtilisin and chymotrypsin, produce
an assortment of peptides which may overlap
each other. Hence a problem arises in actually arriving at a complete set of peptide
fragments, as illustrated above in the breakdown P. In addition, for several reasons, the
biochemical experiments very often do not
result in integer values for the number of
amino acids of a particular kind that occur in
a peptide fragment. This second uncertainty
problem must also be taken into account by
the computer program. Furthermore, there
may be experimental errors in the amino
acid composition and ordering 0 f . some
peptides.
In the case of overlapping peptide fragments from a hydrolytic breakdown, the computer program tries to reconstruct a complete
set of fragments from overlapping subsets of
fragments. The procedure of accomplishing
this is to look for every group of two, three,
or four acids known to be adjacent in some
peptide. Then, for each such group, the probability that this particular group will occur
again in the protein chain is computed from
the amino acid frequency data. For instance,
if the ordered pair LYS-PHE occurs, and it
is known that there are five LYS residues and
four PHE residues in the entire protein chain
of, say, 150 amino acid links altogether, then
the probability that another such LYS-PHE
pair will occur is approximately 4 × 3/150.
If the probability is small that another
such group occurs, it is most likely that all
of the peptides containing this group should
arise from the same part of the protein; hence
these peptides are sorted out. All possible
fragments that can be reconstructed from
these (overlapping) peptides are then determined.
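The recurrence estimate sketched for LYS-PHE generalizes to any adjacent group. A hypothetical rendering (COMPROTEIN's actual bookkeeping is not specified at this level of detail):

```python
def recurrence_probability(residue_counts, group, chain_length):
    # Approximate probability that the adjacent group occurs again
    # elsewhere in the chain.  One copy of each residue is already
    # committed to the observed group, so for 5 LYS and 4 PHE in a
    # 150-link protein the LYS-PHE estimate is 4 * 3 / 150.
    prob = 1.0 / chain_length
    for residue in group:
        prob *= residue_counts[residue] - 1
    return prob
```

Groups whose recurrence probability is small are the ones worth sorting on, since all peptides containing such a group most likely come from the same part of the protein.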
It may happen, however, that all these
peptides cannot "fit" into any reconstructed
fragment; this indicates either that some
peptide must arise from a different place
on the protein or that there may be an
experimental error. In such a case the experimental results are reconsidered from a
chemical point of view.
Of course, there is a small but finite probability that a misinterpretation can be made
at this point and an erroneous peptide constructed, such as could occur if a highly unlikely configuration actually occurred more
than once in the protein, or if there were lost
peptides from a particular region but the
existing peptides fortuitously fit. However,
it is likely that in any case, in later building
up of the protein, an inconsistency would
arise, leading to the rejection of this erroneously constructed peptide.
Some peptides may contain two or more
groups on which searches are made. Reconstructed fragments containing these peptides
must be joined together themselves. Hence
the program merges these fragments to obtain all possible larger fragments. Such procedures will fix the relationships of many
amino acids beyond that given in the initial
data. These new relationships change the
probabilities of occurrence of the two, three,
or four amino acid groups. The probabilities
are accordingly recalculated, and once more
searches on improbable groups are made,
leading to further merges of fragments into
even larger fragments. This process is repeated by an iterative program until less than
about 20 fragments remain.
Further details must be taken into consideration by the program: the set of fragments
may still not be complete; there may exist
alternative possibilities for a fragment; and
there may be gaps in the chain where all the
peptides were lost.
After obtaining a complete, or almost
complete, set of fragments by iteration of the
searching procedure, the program can continue toward reconstructing the entire protein utilizing the remainingpeptides not used
in the building up of the complete set P as
the Q set of peptides (see example above).
It should be noted that in the various phases
of the reconstruction of the protein the assumption is made that the total amino acid
content of the entire protein and of the fragments is known. There is always, however,
some experimental uncertainty in the number
of amino acids of each type in the protein, to
within a fraction of one amino acid. As· a
rule, the larger number of amino acids is
al ways chosen initially. If an extra acid is
thereby included in the computations, it may
be eliminated at the end, by a procedure described below.
Completing the final reconstruction of the
protein again can present further details
which must be taken into consideration by
the computer program. Some peptides may
appear to overlap at only one amino acid. If
this occurs it would be unwise to conclude
definitely that this represented a true overlap. Hence "pseudofragments" are used
which consist of each overlapping fragment
without the common amino acid.
Single amino acids to complete the P set,
as required by the amino acid constitution of
the protein, are considered with the larger
fragments.
If extra amino acids are so included, the
final answer showing which fragments must
be attached may place no attachment restrictions on these extra acids. In this case, if
the acid arose from a fractional experimental
result (see above), one may presume that it
does not actually occur in the molecule.
Otherwise further experimentation may be
required. For example, if amino acid X is
added to the P list in Figure 3, the Q lists
will be unaffected and the resulting answers
will be unchanged. One might conclude that
either X really didn't belong in the P list or
it could be at either end of the molecule. It
is sometimes known which amino acids are
on the right and left ends of the protein itself,
and this information can further reduce the
final possibilities.
DESCRIPTION OF COMPUTER PROGRAMMING SYSTEM
The computer programming system to aid
protein analysis has been written in a flexible
manner. The computer input and output is in
terms of three letter abbreviations for the
amino acids, with the parentheses notation
for ordered and unordered sets as described
above. Intermediate results are printed out
for examination by the biochemist; in fact the
entire process is geared for a close cooperative effort between the computer and the biochemist during the entire analysis. This is
necessary in order to take advantage of the
special conditions presented by any particular
protein and type of chemical experimental
procedures. For example, it might be convenient to omit all prolines from the peptides,
or not to consider a distinction between Glu
and Gln. Special rules might be introduced
regarding end-groups from hydrolyses by
certain enzymes, etc. Such special considerations can be handled by the programming
system, and make it easier to spot errors in
the experimental data.
The programming system is based on the
following six programs:
(1) MAXLAP: Program to find the maximum possible overlap between any two peptides with any amount of ordering information known.
(2) MERGE: Program to find all
possible overlapping configurations of two
peptides.
(3) PEPT: Program to find all possible
fragments that are consistent with the overlapping of any number of peptides.
(4) SEARCH: Program to search on
probabilistic considerations all peptides
which contain an unusual group of amino
acids.
(5) QLIST: Program to generate the Q-lists of possible associated sets of P peptides
over which each qi fragment can fit.
(6) LOGRED: Program to perform the
logical reduction of the Q-lists to obtain all
possible protein structures that are consistent with the data.
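All six programs operate on peptides written in the three-letter, parentheses notation described above: an ordered sequence of unordered amino-acid groups. The paper does not reproduce the actual 7090 code; as a minimal sketch (the function names are ours), such input might be represented as:

```python
import re

def parse_peptide(text):
    """Parse the parentheses notation, e.g. "(ALA)(ALA,LYS)(PHE)",
    into an ordered list of unordered groups (each group stored as a
    sorted tuple of three-letter amino-acid abbreviations)."""
    groups = re.findall(r"\(([^)]*)\)", text)
    return [tuple(sorted(a.strip() for a in g.split(","))) for g in groups]

def format_peptide(groups):
    """Inverse of parse_peptide: print a peptide back in the notation."""
    return "".join("(" + ",".join(g) + ")" for g in groups)
```

A fully ordered peptide is then simply a sequence of one-element groups, and a wholly unordered fragment is a single many-element group.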
Since detailed flow diagrams would consume
too much space and not be appropriate for
the present discussion, we have included here
only gross overall flow diagrams of these
programs. Each of the six programs will
now be described, and simple examples illustrating some of the methods involved will be
given. (For further details see "Sequencing
of Amino Acids in Proteins Using Computer
Aids," Report No. 62072/8710, National Biomedical Research Foundation, Silver Spring,
Md., July 1962.)
Program MAXLAP (Figure 6). In this
program p and q are peptide fragments,
PCOM and QCOM are lists of acids from
these peptides respectively which may provide the maximum overlap. After setting up
a tentative maximum number of positions
(i.e., amino acids) of overlap, three cases
may be distinguished as illustrated by the
three examples of Figure 7. In the first example there is the successful maximum overlap situation, where all of p or q is overlapped. Here MV is the list of all the acids
from PCOM and QCOM which match; with this
maximum overlap, the complete maximum
overlap protein fragment is shown. In
268 / Comprotein: A Computer Program to Aid Primary Protein Structure Determination
Figure 6. Flow Chart for Program MAXLAP.
Example I
  p: (A,B)(C,D)
  q: (C,B,A)(F,G,D)
  MV: (A,B)(C)(D)
  Protein fragment with maximum overlap: (A,B)(C)(D)(F,G)

Example II
  p: (C)(D,E,F)
  q: (D,C,E)(G,H)
  New tentative maximum overlap structures:
    p: (C)(F,D,E)
    q: (D,C,E)(G,H)
  Protein fragment with maximum overlap: (C)(F)(D,E)(C)(G,H)

Example III
  p: (A,B)(B,E,A)(D)
  q: (B)(A,D,E)(F)
  New tentative maximum overlap structures:
    p: (A,B)(B,E,A)(D)
    q: (B)(A,D,E)(F)
  Protein fragment with maximum overlap: (A,B)(B)(A,E)(D)(F)

Figure 7. Examples of the three cases considered in the flow chart of program MAXLAP.
the second example all of p cannot be overlapped by q, and hence new tentative overlap
positions must be assumed. Here F is the
limiting acid since it does not appear in q.
In the third example, even though D appears
in both p and q, new tentative overlap positions must be assumed, because in p the D
acid is restricted in position at the right.
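The content-matching step of the successful case (Example I) can be sketched as follows. This is a deliberate simplification with names of our own choosing: it checks only total amino-acid content against the leading groups of q, and does not reproduce the shifting logic of examples II and III or the ordering refinement that reports MV as (A,B)(C)(D):

```python
from collections import Counter

def full_overlap(p, q):
    """Crude check of MAXLAP's successful case: can all of the acids of
    peptide p be found within the leading groups of peptide q?  Peptides
    are lists of unordered groups (lists of amino-acid names).  Returns
    the multiset of matched acids (a content-only MV) or None."""
    need = sum((Counter(g) for g in p), Counter())  # all acids of p
    have = Counter()
    for group in q:
        have += Counter(group)                      # acids of q so far
        if all(have[a] >= n for a, n in need.items()):
            return need                             # p fully covered
    return None
```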
Program MERGE (Figure 8). Once the maximum overlap has been determined, all other
possible overlaps can be determined. Several
cases can occur. First are the essentially trivial
cases: p and q entirely disjoint, as in
q: (D,E)
p: (A,B,C)
or else q a subset of p as in
Figure 8. Flow Chart for Program MERGE.
q: (B,C)
p: (A,B,C ,D)
If neither of these is the case, as for example
in
q: (B)(B,D)(G)
p: (A,B,B,D)
the first overlapping group of p is considered,
which for our latter example is
(B,B,D)
If this first overlapping group contains not
more than five nor less than two acids, it
warrants further consideration. A list is
made of all singles, pairs, triplets and quadruplets of acids that can be formed from this
overlapping group, which for our example is
(B)
(D)
(B,D)
(B,B)
Next each of these is examined to see if it
can overlap with q. For our example we have,
respectively,
(A,B,D)(B)(B,D)(G), none,
(A,B)(B)(D)(B)(G), and (A,D)(B)(B)(D)(G)
where we have underlined the overlapping
group to the left of which p fits and to the
right of which q fits. Finally the peptide is
re-formed starting with the next group, and so
forth, until all the overlapping groups of p
and q have been considered.
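The enumeration of singles, pairs, triplets, and quadruplets from an unordered group might be sketched as below. Excluding the full group itself is our assumption, based on the worked example, which lists only (B), (D), (B,D), and (B,B) for the group (B,B,D):

```python
from itertools import combinations

def unique_subgroups(group, max_size=4):
    """All unique proper sub-multisets of an unordered amino-acid group,
    up to quadruplets: the candidate overlapping groups tried by MERGE."""
    out = []
    for n in range(1, min(max_size, len(group) - 1) + 1):
        seen = set()
        for combo in combinations(sorted(group), n):
            if combo not in seen:    # repeated acids give duplicate combos
                seen.add(combo)
                out.append(combo)
    return out
```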
Program PEPT (Figure 9). This program
extends the previously discussed program
MERGE in that it finds all of the possible
structures of the protein chain consistent with
the overlapping of all the peptides obtained
from a search for all experimental peptides
with a rare configuration of amino acids. The
overlapping portion of these acids must contain the group of rare amino aCids, called
SAA in the flow chart. The list resulting from
the search is called MCO in the flow chart.
Program SEARCH (Figure 10). This program systematically looks at each pair, triplet, and quadruplet of amino acids that are
known to occur together from the experimental
peptide fragment data. For each such group,
the probability of its occurrence is computed
from the amino acid frequency data as described above. A list, called Num(L) in the
flow chart, is made of these groups of amino
acids known to occur together which are improbable of occurrence (i.e., less probable
than some chosen value). The letter L is used
to index the elements of the list Num(L).

Figure 9. Flow Chart for Program PEPT.
Finally the program utilizes PEPT to generate and print out all possible merged
structures. For example suppose a search
was made on the ordered pair
LYS-PHE
and there resulted the following fragments:
(ALA)(ALA, ALA, LYS)(PHE)
(ALA)(ALA, LYS)(PHE)
(ALA)(LYS, PHE, GLU, ARG)(GLU)
(LYS)(PHE)
The merged structure becomes
(ALA)(ALA)(ALA)(LYS)(PHE)(GLU,ARG)(GLU)
Figure 10. Flow Chart for Program SEARCH.
Program QLIST (Figure 11). This program forms the lists of peptides related by
each fragment qi. It is to be noted that each
element of a Q list may contain up to five p
fragments (although in our example of section 2 only two peptides appeared in each
element of the Q lists). The input to this
program is a list P of peptides which in some
order will reconstruct the original protein
and a list Q of peptides which give additional
ordering information about the protein. In
the flow chart P' is a hypothetical peptide
constructed by the juxtaposition of up to five
P peptides.

Figure 11. Flow Chart for Program QLIST.
Program LOGRED (Figure 12). In this
program the Q lists are given. Calling each
term of a Q list a condition, the flow chart
involves the lists: MQ(M) of conditions on
the assumption of the Mth tentative condition;
MQ1 of conditions being considered; and IR(M)
of tentative conditions which determine a
possible protein structure. The symbol
MT1(J) is the last condition considered in the
Jth Q list. The program follows a tree structure of possibilities, keeping track of tentative conditions until a branch is eliminated
or comes to a successful conclusion. The
program follows, with greater generalization,
the method of the "simplified illustration" of
the second section of this paper.

Figure 12. Flow Chart for Program LOGRED.
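A drastically simplified sketch of this logical reduction, under a representation of our own: each Q list is a set of alternative conditions, a condition being a tuple of (left, right) adjacent-peptide pairs, and a selection of one condition per list is consistent when no peptide acquires two different neighbours on the same side. LOGRED's actual tree search (shortest list first, with the MQ(M) and IR(M) bookkeeping) prunes far more economically than the brute-force enumeration shown here:

```python
from itertools import product

def consistent(conditions):
    """True when the chosen adjacency pairs give each peptide at most
    one right neighbour and at most one left neighbour."""
    right, left = {}, {}
    for cond in conditions:
        for a, b in cond:
            if right.get(a, b) != b or left.get(b, a) != a:
                return False
            right[a], left[b] = b, a
    return True

def logred(q_lists):
    """Enumerate every consistent selection of one condition per Q list;
    each such selection determines a possible protein structure."""
    return [sel for sel in product(*q_lists) if consistent(sel)]
```

For instance, given one Q list forcing p1-p2 adjacency and a second offering p1-p3 or p2-p3, only the selection pairing p1-p2 with p2-p3 survives, since p1 cannot have two different right neighbours.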
SUMMARY AND CONCLUSION
The computer program described in this
paper becomes useful when there is a large
number of small peptide fragments resulting
from the breakdowns which are to be woven
into a consistent and unique structure. This
is a long tedious process when carried out
by hand, and is subject to careless errors
and impatient overlooking of all alternative
possibilities.* The completed IBM 7090
computer program has been successfully
tested on a hypothetical subtilisin breakdown
of ribonuclease into over eighty fragments.
Just as the proteins are composed of
chains of the same types of molecules, the
genetic substances desoxyribonucleic acid
(DNA) and ribonucleic acid (RNA) are composed of chains of only four different types
of molecules called the nucleotide bases. It
is possible that the order of the molecules
in these substances can also be determined
by the aid of this computer program and some
computer experiments in this direction have
been made. However, application of these
*Dr. Wm. Dreyer, of the National Institutes of
Health, has developed chemical techniques
for isolating a large percentage of the peptides formed in a subtilisin hydrolysis, and
for determining the total amino acid content
and identifying the right and left ends of the
peptide fragments. This experimental
technique is very rapid and can be mechanized to a large extent; thus data taking
should be reduced to months instead of years.
The computer program is ideally suited for
analysing this type of data.
The computations in this paper were done
in the Computing Center of the Johns Hopkins Medical Institutions, which is supported
by Research Grant RG 9058 from the
National Institutes of Health and by educational contributions from the International
Business Machines Corporation.
techniques to DNA and RNA still awaits further development in the chemical experimental methods.
REFERENCES
1. Sanger, F., Science, 129, 1340 (1959).
Hirs, C. H. W., Moore, S., Stein, W. H., J.
Biol. Chem., 235, 633 (1960). Spackman,
D. H., Stein, W. H., Moore, S., J. Biol.
Chem., 235, 638 (1960).
2. Braunitzer, G., Gehring-Mueller, R.,
Hilschmann, N., Hilse, K., Hobom, G.,
Rudloff, V., and Wittmann-Liebold, B.,
Hoppe-Seyler's Z. Physiol. Chem., 325, pp.
283-6, Sept. 20, 1961.
3. Hirs, C. H. W., Moore, S., and Stein, W.
H., J. Biol. Chem., 235, 633 (1960).
4. Tsugita, A., Gish, D., Young, J., Fraenkel-Conrat, H., Knight, C. A., and Stanley, W.
M., Proc. Natl. Acad. Sci., Vol. 46, pp.
1463-9, 1960.
5. Kendrew, J., Watson, H., Strandberg, B.,
Dickerson, R., Phillips, D., and Shore, V.,
Nature, Vol. 190, pp. 666-70, May 20, 1961.
6. Margoliash, E., Smith, E., Kreil, G., and
Tuppy, H., Nature, 192, 1121-7 (Dec. 1961).
7. Ledley, R. S., Report on the Use of Computers in Biology and Medicine, Natl. Acad.
of Scis.-Natl. Research Council, May 1960.
8. Bradley, D. F., Bernhard, S. A., Duda, W.
L., unpublished work.
9. Eck, R., Nature, 241-243, Jan. 20, 1962.
USING GIFS IN THE ANALYSIS AND
DESIGN OF PROCESS SYSTEMS
William H. Dodrill
Service Bureau Corporation
Subsidiary of International Business Machines
A process system may be defined as an
integrated combination of equipment which
functions to produce one or more products,
and possibly byproducts, by altering the
physical and/or chemical nature of raw
materials. Even though there is wide diversity in products produced, raw materials
used, and equipment combinations employed,
all process systems exhibit certain fundamental similarities. Characteristically, any
process system may be subdivided into three
operational phases. These are: preparation
of raw materials, conversion to products, and
recovery and purification. All three phases
may not necessarily be included in a specific
process system, nor is there always clear
distinction among them. Still, there is universality in practice in that only a limited
number of types of equipment are used to
perform specific types of operations.
As an example of a relatively simple, but
fairly typical, process system, consider the
flow diagram for high-temperature isomerization reproduced in Figure 1. This is an
operation performed in petroleum refining
to improve the octane rating of certain gasoline constituents. It consists of a reactor
for conversion, a flash drum and three
distillation columns for recovery and purification, and several pieces of auxiliary
Figure 1. High-Temperature Isomerization
276 / Using Gifs in the Analysis and Design of Process Systems
equipment. Distillation 1 also serves in preliminary preparation of the feed. Let us
suppose that a unit similar to this is to be
built for a particular refinery. Before individual pieces of equipment can be ordered or
fabricated, it is necessary that their sizes
and capacities be specified. This type of
analysis is termed process design.
Prior to the general availability of computer methods, the process design engineer
was forced to rely almost exclusively on information obtained from physical systems.
Thus, if information about a similar unit,
which was operating satisfactorily, was
available, the design engineer merely gave
the same or slightly modified specifications.
For new processes, or processes for which
comparison information was not available, he
had to resort to scaleup from a pilot unit,
construction of the pilot unit, in turn, being
based on scaleup from a bench or laboratory
unit. Thus, information required for construction of each process system was derived
from a bench or laboratory model, through
pilot unit, to production unit. This procedure
is costly and time consuming. Further, the
difficulties involved in experimenting with
physical systems, especially of plant scale,
all but prohibits investigation of anything
more than minor design modifications.
As an alternate method of design, many
computational methods are used. However,
rigorous computations are too laborious for
practical hand solution. As a consequence,
only shortcut and approximate methods are
used regularly, and their primary application is limited to supplementing plant and
pilot information. Now that economical and
reliable machine computations are readily
available, lengthy, as well as complex, computational procedures are increasing in utility. DeSign is still ultimately based on comparison information, but this dependence is
becoming less critical.
One of the many applications of computers
in process design is in implementing the use
of mathematical models. A mathematical
model may be defined as a series of arithmetical operations performed to compute
numerical values which represent certain
performance characteristics of a physical
system. The required input information consists of numerical values for design and operating conditions. A wide range in complexity is
exhibited by the mathematical models which
are programmed for computer solution.
Some, such as for the mixing of two streams,
are very simple, while others are very complex, requiring many hours of high-speed
computer time for solution. The general
concept, though, is the same. That is, given
a numerical evaluation of operating and design conditions, the objective is to compute an
estimate of performance characteristics for a
corresponding physical system. Actually, this
describes the simulation of a physical system using a mathematical model. An alternate type of mathematical model can be
formulated for direct design. Accordingly,
given numerical values for operating parameters and performance restrictions, numerical values for required design can be
computed directly. Formulation of this type
of model is normally much more difficult.
Many simulations for commonly used
pieces of industrial equipment are being
processed daily. However, this often requires theoretical isolation from the process system of which the piece of equipment
is an integral part, and so may result in
significant error. That is, the functional interrelationship of pieces of process equipment may be so significant, even with respect to the piece being" studied, that gross
errors in analysis may result when the effects of proposed modifications on the system
are neglected. This is one of the primary
motivations behind the development of Generalized Interrelated Flow Simulation, GIFS.
That is, even more important than providing
a convenient means of using preprogrammed
mathematical models, GIFS provides a means
of analyzing a process system as an integrated network.
The development of GIFS (Generalized Interrelated Flow Simulation) is based on study
of a wide variety of industries. Representative among these are: petroleum refining,
metals production, and manufacture of chemicals, pharmaceuticals, pigments, plastics,
and pulp and paper. In essence, all of these
represent special cases of continuous flow
systems. Therefore, the design of GIFS is
based not on the analysis of specific production systems, but, rather, on the much broader
scope of flow systems analysis in general.
The development of a generalized flow network method, which is easy to use as well as
applicable to a wide variety of process systems, poses a number of difficulties. Outstanding among these is the ever-present
possibility of system over- or under-specification. That is, there is a finite number of
variables which, when fixed, completely determine a system, and all other pertinent
variables can be computed. There is always
the possibility that an engineer will attempt
to set either more or fewer variables than
are required for complete specification of a
process system. In order to insure against
this possibility, GIFS has been purposely restricted to analysis of cases which correspond to physical systems that are completely
determined and which are not overspecified.
Thus, GIFS is applicable only to the generation of a mathematical model to simulate the
functioning of an entire integrated process
system. No attempt has yet been made to incorporate direct design aspects in this model.
In short, the method can be used only for the
performance evaluation of a fixed process
system at fixed operating conditions and with
fixed feeds. This is really not a drastic limitation since design as well as optimization
can be approached through multiple case
studies.
In application, each system under consideration is represented as a network of interrelated stream flows. Figure 1 is an example.
Each stream is identified by assigning to it a
unique stream number as indicated. Numerical values which are used to describe the
nature of a stream are termed stream properties. Stream properties include total flow
rate, composition, temperature, pressure,
heat content, and phase state. Restrictive
relationships among streams, such as are
simulated by preprogrammed mathematical
models for specific types of process equipment, are indicated by what is termed unit
computations. Examples of unit computations
include the simulation of process equipment,
such as reactors, distillation columns, heat
exchangers, etc., as well as factors which
have no direct equipment counterparts (pressure drops in lines and ambient heat exchange). The currently available library includes 21 unit computations. These· are
summarized in Table 1. Complete descriptions are given in the GIFS user's manual.~:~
They represent a collection which is basic in
nature and highly versatile. Further, GIFS is
constructed so as not to be limited to the
library of unit computations available at any
one time. At present, four additional unit
computations are being prepared for
inclusion in the library, and others can be
added as specific needs arise.
Figure 2 is a reproduction of one page of
the input data sheet which describes Figure 1
in terms of unit computations. Each unit
computation is specified by giving the unit
computation type, associated stream number
or numbers, and arguments if any. A unit
computation number is included for identification only. It is usually convenient to
number unit computations sequentially. The
first unit computation in Figure 2 indicates
the addition of streams 1 and 19 to obtain
stream 3. The unit computation type is STAD
(Stream Add), and the associated stream
numbers are 1, 19, and 3. The second unit
computation indicates the function of distillation column number 1, the associated
streams being 3, 12, and 4. The unit computation used is CRSEP (Component Ratio
Separation). This is an approximate simulation of a distillation column which requires,
as arguments, component recoveries for all
components present in the feed (ratio of
component flow rate recovered in the overhead product, stream 12, to component flow
rate in the feed stream, stream 3). Since
there are eleven components for the illustrative system, values for eleven arguments
are entered. Additional unit computations
necessary to describe the entire system are
entered sequentially. Detailed information on
the preparation of input data sheets is contained in the user's manual. ~:~
The objective of each computation is an
evaluation of all interrelated stream properties. These are computed as a function of
the properties of the given feed streams.
Each system is computed as an integrated
network and in a manner which satisfies
fundamental material balance relationships
as well as the restrictions defined by the
specified unit computations. Systems are
normally non-linear, and a method of solution by iteration is employed. As an example
of a computer output, just one portion of the
report for the illustrative example is reproduced in Figure 3. This is the portion
which describes computed properties for
stream 9.
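The iterative solution can be illustrated on a miniature recycle loop (a hypothetical two-unit system of our own, not taken from the paper): a feed is mixed with a recycle stream by a STAD-like computation, a splitter returns a fixed fraction of the mixture, and successive substitution is repeated until the stream flows settle. For a 25 per cent split the recycle converges to one third of the feed:

```python
def stad(s1, s2):
    """Stream Add: component-wise sum of two streams (flow rates)."""
    return {c: s1.get(c, 0.0) + s2.get(c, 0.0) for c in set(s1) | set(s2)}

def split_stream(s, fraction):
    """Stream Split: divide a stream into two by a fixed flow fraction."""
    part = {c: f * fraction for c, f in s.items()}
    rest = {c: f * (1.0 - fraction) for c, f in s.items()}
    return part, rest

feed = {"I-C4": 40.0, "N-C4": 60.0}
recycle = {c: 0.0 for c in feed}          # initial guess for the recycle
for _ in range(100):                      # successive substitution
    mixed = stad(feed, recycle)
    recycle, product = split_stream(mixed, 0.25)
# At steady state recycle = 0.25 * (feed + recycle), i.e. feed / 3,
# and the product stream equals the feed (overall material balance).
```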
*GIFS, Generalized Interrelated Flow Simulation, user's manual is available through
any SBC office or local sales representative.
Figure 2. Sample Input Data Sheet
Figure 3. Computer Report
Proceedings-Fall Joint Computer Conference, 1962 / 279
Table 1
Unit Computations Summary

Title                     Type    Streams        Arguments
Stream Add                STAD    ST1 ST2 ST3
Stream Subtract and Zero  STSBZ   ST1 ST2 ST3
Stream Split              SPLT    ST1 ST2 ST3    F12
Stream Equate             STEQ    ST1 ST2
Stream Zero               STZ     ST
Temperature Add           TAD     ST             DT
Heat Add                  QAD     ST             DQ
Pressure Add              PAD     ST             DP
Temperature Set           TSET    ST             T
Heat Set                  QSET    ST             Q
Pressure Set              PSET    ST             P
Vapor Ratio Set           RSET    ST             R
Bubble Point              BPT     ST
Dew Point                 DPT     ST
Isothermal Flash          TFLSH   ST
Adiabatic Flash           QFLSH   ST
Constant R Flash          RFLSH   ST
Stream Heat               STQ     ST
Equilibrium Separation    EQSEP   ST1 ST2 ST3    H1 H2
Component R Separation    CRSEP   ST1 ST2 ST3    R1 --- RN
Reactor                   REACT   ST1 ST2        NR NC1 NK1 CNV1,
                                                 S1 --- SNC1,
                                                 C1 --- CNC1 --- CNCNR
GIFS has been developed, and is now being
offered, as one of SBC's preprogrammed
computer services. Characteristics of these
services include accurate, inexpensive, and
rapid processing for a wide variety of problems, use of convenient input data sheets, and
the presentation of computed values in clear,
comprehensive reports. Cases which have
been processed to date represent applications
in such fields as petroleum refining, inorganic
and organic chemicals manufacture, and pulp
and paper production. These have been processed in a routine manner, and have not indicated any unforeseen complications in the
approach or the computer implementation.
From all indications, companies who are using the service are highly satisfied, and
expect to continue using it. Future developmental efforts will depend on industrial
response. SBC is already contemplating
the implementation of a number of additional features.
ACKNOWLEDGMENTS
Appreciation is expressed to Service
Bureau Corporation and to IBM for supplying the materials, equipment, and encouragement needed for this project, and to the
SBC personnel who assisted in its completion. Appreciation is also expressed to
Prof. L. M. Naphtali for his original developmental efforts, and to Dr. J. E. Brock
for his assistance in the preparation of this
documentation.
A DATA COMMUNICATIONS AND PROCESSING
SYSTEM FOR CARDIAC ANALYSIS
M. D. Balkovic
Bell Telephone Laboratories
Holmdel, New Jersey
P. C. Pfunke*
A. T. & T. Company
New York City
C. A. Caceres, M. D.
Chief, Instrumentation Unit
Heart Disease Control Program
U. S. Department of Health, Education,
and Welfare
Washington 25, D. C.
C. A. Steinberg
Department of Medical and
Biological Physics
Airborne Instruments Laboratory
Deer Park, Long Island, New York
Many aspects of medical diagnoses involve extensive, tedious procedures in which
the capabilities of a digital computer can
provide the physician an invaluable aid.† Because of the complexity and cost of modern
computers, systems to aid in diagnostic procedures must generally be centrally located
and be capable of serving many physicians.
Thus, data communication links between the
physicians and the computer location are required. Such a data communication and processing system for cardiac analysis is now in
operation and is described in this paper.‡
The cardiac data processing system discussed in this paper comprises data acquisition units located throughout the country, data communication links which can be established as required, a data processing unit, and print-out devices.
The data acquisition unit, as shown in Figure 1, provides the capability of recording an electrocardiographic signal simultaneously on a graphical recorder and on a magnetic tape recorder.

Figure 1. Data Acquisition Unit.

*Will present paper.
†Reference #1.
‡This is a pilot project facility of the Instrumentation Unit, Heart Disease Control Program, Division of Chronic Diseases, U. S. Dept. of Health, Education and Welfare, Washington 25, D. C.

Proceedings--Fall Joint Computer Conference, 1962 / 281

Prior to recording the electrocardiogram, an 8-digit, serial, binary-coded decimal number is generated and recorded
on magnetic tape. This 8-digit number is
used to identify the particular data that is
recorded. Two digits indicate the place
where a recording is taken; four digits indicate the patient's number, and the last two digits indicate the particular electrocardiographic lead that is recorded. The electrocardiographic signal is modulated using pulse
repetition frequency modulation prior to recording on magnetic tape in order to achieve
a frequency response down to 0.1 cycles per
second. The upper frequency response extends to 200 cycles per second.
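The two-digit/four-digit/two-digit identification scheme just described can be sketched in a few lines. This is an illustrative reconstruction only, not the original equipment's logic; the function names are invented here.

```python
# Hypothetical sketch of the 8-digit identification number described in
# the text: two digits for the recording location, four for the patient's
# number, and two for the electrocardiographic lead.

def encode_id(location: int, patient: int, lead: int) -> str:
    """Pack the three fields into the 8-digit decimal identifier."""
    assert 0 <= location <= 99 and 0 <= patient <= 9999 and 0 <= lead <= 99
    return f"{location:02d}{patient:04d}{lead:02d}"

def decode_id(number: str) -> tuple:
    """Split an 8-digit identifier back into (location, patient, lead)."""
    assert len(number) == 8 and number.isdigit()
    return int(number[:2]), int(number[2:6]), int(number[6:8])

print(encode_id(12, 345, 7))   # 12034507
print(decode_id("12034507"))   # (12, 345, 7)
```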
Figure 2 is a block diagram of the data acquisition unit. The input electrocardiogram is amplified prior to recording. A coder generates the 8-digit number corresponding to the location, patient, and lead. The code number followed by the amplified electrocardiogram is fed to a pulse repetition frequency modulator whose center frequency is 3600 cycles per second. The output of the modulator is recorded on magnetic tape. The output of a second head on a tape recorder is demodulated and the demodulated signal is fed to a graphical recorder and/or a DATA-PHONE data set for transmission to the data processing unit. The magnetic tape records generated by the data acquisition unit contain completely identified electrocardiographic signals. These magnetic tape records can be sent via DATA-PHONE service to the data processing unit for subsequent analysis and processing.

Figure 2. Data Acquisition Unit. (Block diagram: lead selector with RA, LA, RL, LL, and PC inputs; identification numbers; ECG amplifier; tape identification unit; modulator; tape transport; demodulator; frequency compensation; outputs to oscilloscope display, graphical recorder, and data set.)

Communication between the data acquisition units and the data processing unit is achieved over DATA-PHONE service on switched telephone facilities. Specially designed and constructed analog transmitting data sets are located at the data acquisition units while receiving data sets are located at the data processing unit. These analog data sets incorporate integrated telephone sets which allow, when in the voice mode of operation, the dialing of calls and normal voice communication over the regular switched telephone network. When in the data mode of operation, the data set provides modulation and demodulation circuitry to allow the transmission of analog signals with frequencies from 0 to 200 cps over dialed-up voice telephone facilities. Input-to-output linearity from transmitting to receiving data sets is maintained to better than 1%.

Figure 3 shows a block diagram of the electrocardiogram data set transmitter. The input circuitry provides a load impedance of 1000 ohms and accepts analog signals which can range from -3 volts to +3 volts and can contain frequency components between 0 and 200 cps. The input circuitry acts to adjust the DC level and gain of the input signal such that it provides the proper bias range for the astable multivibrator which follows the input circuit. This astable multivibrator accomplishes the actual frequency modulation.

Figure 3. Data Set Transmitter. (Block diagram: -3 to +3 volt, 0 to 200 cps input into 1000 ohms; gain and D.C. level adjust; multivibrator producing 1000 to 1500 cps F.M.; filter and level control delivering -6 dbm into 900 ohms.)

A
given voltage delivered to the input of the
data set represents a particular bias on the
multivibrator and determines the multivibrator's frequency of oscillation. For the
indicated input voltage range, the astable
multivibrator will oscillate at frequencies
ranging from 1000 cps to 1500 cps when the
input circuit level and gain controls are
properly adjusted. The square wave output
of the multivibrator is filtered and attenuated by the transmitter output circuitry. The
output circuitry provides a 900 ohm source
impedance to the telephone line. The signal
transmitted to the telephone line is essentially
a sine wave with a fundamental frequency between 1000 cps and 1500 cps at a level of
-6 dbm. A bandpass filter removes harmonics
above 2500 cps.
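The transmitter's voltage-to-frequency characteristic can be written down directly from the figures just given: -3 volts maps to 1000 cps, +3 volts to 1500 cps, with 1250 cps at the center. A straight-line mapping between the stated endpoints is the assumption in this sketch; the text itself specifies only the endpoints and a better-than-1% linearity.

```python
# Sketch of the assumed linear voltage-to-frequency characteristic of the
# astable multivibrator: -3 V -> 1000 cps, 0 V -> 1250 cps, +3 V -> 1500 cps.

def oscillator_freq(volts: float) -> float:
    """Oscillation frequency (cps) for an input in the -3 to +3 V range."""
    assert -3.0 <= volts <= 3.0, "input must lie in the -3 to +3 volt range"
    return 1250.0 + (volts / 3.0) * 250.0

print(oscillator_freq(-3.0))  # 1000.0
print(oscillator_freq(0.0))   # 1250.0
print(oscillator_freq(3.0))   # 1500.0
```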
Figure 4 shows a block diagram of the
electrocardiogram data set receiver. The
input bandpass filter terminates the telephone
line in 900 ohms at all transmitting frequencies. It has a slope of 12 db/octave to either
side of the 1250 cps center frequency of the
telephone line signal. Following the bandpass filter is a combined amplifier and
limiter, the gain of which determines the
sensitivity of the receiver, -38 dbm. The
output of this stage is a 1.5-volt square wave with a fundamental frequency equal to that of the received signal.

Figure 4. Data Set Receiver. (Block diagram: 900-ohm bandpass filter accepting the 1000 to 1500 cps F.M. signal at -6 to -38 dbm; amplifier-limiter; mono-pulser; filter; level adjust restoring the -3 to +3 volt, 0 to 200 cps baseband.)

The square wave is differentiated and full-wave rectified to give a
sequence of pulses which correspond to the
zero crossings of the received signal. These
pulses are used to trigger a mono-pulser
which delivers a pulse of fixed width each
time it is triggered. Thus, the mono-pulser
delivers an output pulse for every zero
crossing of the input signal from the telephone line and provides an average output
voltage which is proportional to the received
signal frequency. By passing the output of
the mono-pulser through a low pass filter,
the original baseband signal is reconstructed.
The output circuitry serves to amplify the
signal and restore the correct DC signal
level. This output circuitry is designed to
drive a 1000 ohm load impedance.
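The zero-crossing demodulation just described has a simple digital analogue, sketched below on synthetic samples. The sampling rate, window length, and tone are invented for illustration; the point is only that the averaged zero-crossing pulse train tracks the received frequency.

```python
import math

def demodulate(samples, window=200):
    """Crude digital analogue of the receiver: mark every zero crossing
    (the mono-pulser's fixed-width pulse), then low-pass the pulse train
    with a moving average.  The averaged output is proportional to the
    instantaneous frequency of the input tone."""
    crossings = [1.0 if samples[n - 1] * samples[n] < 0 else 0.0
                 for n in range(1, len(samples))]
    out = []
    for n in range(len(crossings)):
        lo = max(0, n - window + 1)
        out.append(sum(crossings[lo:n + 1]) / (n + 1 - lo))
    return out

fs = 20000.0  # assumed sampling rate for the simulation
tone = [math.sin(2 * math.pi * 1250.0 * n / fs) for n in range(2000)]
# A 1250 cps tone has 2 * 1250 zero crossings per second, i.e. about
# 2500 / 20000 = 0.125 crossings per sample; the output settles near that.
print(demodulate(tone)[-1])
```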
The telephone line signal spectrum (1000
cps to 1500 cps) is chosen to avoid falling
within the same band used by various kinds
of single-frequency signalling equipment employed in Bell System switching facilities.
If this were not true, automatic circuit disconnect could occur when certain signal patterns are transmitted. The input sensitivity
of the data set receiver is chosen to be appropriate for the range of typical dialed-up
telephone connections.
Figure 5 is a block diagram of the equipment in the data processing unit. This consists of three basic units: an input console, a digital computer, and an output unit. The input console can accept data that is transmitted over the telephone line or data obtained from playing back magnetic tape records made in the field. The data transmitted over the telephone line is recorded on one of the two magnetic tape recorders. If desired, the data transmitted over the telephone line can be recorded on magnetic tape and fed into the computer simultaneously.
The input console is capable of automatically searching the magnetic tape for any pre-selected set of identification numbers.*
Once finding the proper set of identification
numbers, these numbers are decoded and fed
into the computer and serve to identify the
result of the data processing. Alternatively,
the system can accept any electrocardiogram and set of identification numbers,
whether transmitted over the telephone line
or played back from a magnetic tape recording, decode the identification and feed the
identification number into the digital computer.
An oscilloscope and photographic recorder
are incorporated into the system to monitor
the electrocardiogram while recording and
playing back. The electrocardiogram is
filtered with a bandpass filter to eliminate
high frequency noise. The derivative of the
filtered electrocardiogram is then obtained
using conventional analog techniques.

*Reference #2.

Figure 5. The Data Processing Center. (Inputs from telephone lines; identification numbers.)

The filtered electrocardiogram and the derivative of the electrocardiogram are converted to digital form at a rate of 500 samples per second using a successive approximation
analog-to-digital converter. The digital representations of these two signals, along with the decoded identification numbers, are then fed directly to the digital computer for subsequent processing. Figure 6 is a photograph
of the system. The input console is at the left
and the computer is at the right.
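The digitizing step above can be sketched as follows. The first-difference derivative here is a stand-in assumption for the analog differentiator actually used; the toy waveform values are invented.

```python
# Sketch of the digitizing step: the filtered ECG and its derivative are
# each sampled 500 times per second.  The "derivative" channel is
# approximated digitally by a first difference, standing in for the
# conventional analog differentiation described in the text.

RATE = 500  # samples per second, as stated in the text

def digitize(ecg, dt=1.0 / RATE):
    """Return (signal samples, derivative samples) for one lead."""
    deriv = [(ecg[n + 1] - ecg[n]) / dt for n in range(len(ecg) - 1)]
    return ecg[:-1], deriv

sig = [0.0, 0.1, 0.4, 0.9, 0.4, 0.1, 0.0]   # toy waveform values
samples, deriv = digitize(sig)
print(len(samples), len(deriv))  # 6 6
```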
Figure 6. Data Processing Unit.
The digital computer is a Control Data
Corporation 160A computer (CDC 160A) and
is programmed to automatically recognize
and measure the waveforms of the electrocardiogram.* The cardiologist uses these measurements to interpret the electrocardiogram. The measurements are of the amplitudes and durations of the waves, and the intervals between certain of them. The program recognizes whether some waves are diphasic, bifid, or trifid. The measurements made by the computer were programmed based on conventional EKG criteria for wave onset, termination, and slope.†
The computer processes an electrocardiographic lead and arrives at the desired
measurements in 12 seconds. The entire
input, process, and write-out time is 52 seconds. The output of the computer is printed
on an automatic typewriter or punched on
paper tape that can be used for telephone
transmission of the results back to the
sender. Although at present verbal telephone
communication is used for return of results, a
completely electronic system has been tried.‡
*Reference #3.
†Reference #4.
‡Reference #5.
The data communication link used for output
data from the data processing unit has consisted of DATA-PHONE service over switched
telephone facilities. Bell System Data Sets
402A and 402B have been used in conjunction
with Tally Register Corporation Model 51
tape-to-tape equipment. This communication
link is capable of transmitting 8-level parallel digital data at a rate of 75 characters
per second.
The approach taken in the system described in this paper can be extended to other
medical diagnostic processes such as the
analysis of phono-cardiograms, electroencephalograms, etc. There is great potential
here for electronics to assist the medical
profession.
REFERENCES
1. Rikli, Arthur E., and Caceres, Cesar A. The Use of Computers by Physicians as a Diagnostic Aid. Reprinted from The New York Academy of Sciences, Ser. II, Volume 23, No. 3, Pages 237-239, 1960.
2. Paine, L. W., and Steinberg, C. A. A Medical Magnetic Tape Coding and Searching System. Presented at the Spring 1962 IRE Convention.
3. Steinberg, C. A., Abraham, S., and Caceres, C. A. Pattern Recognition in the Clinical Electrocardiogram. Presented at the Fourth International Conference on Medical Electronics, New York, New York, July 1961.
4. Caceres, Cesar A. How Can the Waveforms of a Clinical Electrocardiogram be Measured Automatically by a Computer? Presented at a meeting of the Professional Group on Bio-Medical Electronics (local chapter), New York, New York, February 1960.
5. Rikli, Arthur E., Caceres, Cesar A., and Steinberg, C. A. Electrocardiography with Electronics: An Experimental Study. Exhibit, American Heart Association National Meeting, October 1962.
CLUSTER FORMATION AND DIAGNOSTIC SIGNIFICANCE
IN PSYCHIATRIC SYMPTOM EVALUATION*
Gilbert Kaskey
Engineering Director, Systems Design and Applications Division
Remington Rand Univac
Paruchuri R. Krishnaiah
Senior Statistician, Applied Mathematics Department
Remington Rand Univac
Anthony Azzari
Senior Programmer, Applied Mathematics Department
Remington Rand Univac
SUMMARY

The tremendous variability in symptom constellations associated with specific psychiatric diagnoses makes the use of statistical techniques a virtual necessity in the rigid formulation of any symptom-disease model. In addition, the large number of symptoms associated with psychiatric disorders suggests that an electronic computer can be used to considerable advantage in certain aspects of psychiatric diagnosis, e.g., in the correlation analysis between symptoms, in the determination of quantitative criteria for diagnosis, and in the evaluation of the reliability of symptom assignment by different personnel.

This paper reports on the results obtained from the analysis of data on 199 subjects collected by the Children's Unit of the Eastern Pennsylvania Psychiatric Institute. The types of information available, the method of collection, and the statistical methodology and techniques are all discussed, along with the results obtained using the tools of correlation analysis and simultaneous multivariate analysis of variance.
INTRODUCTION
The fact that electronic computers can be used effectively in certain phases of medical diagnosis has been recognized for some time.
It has been only recently, however, that applications have been extended to include
research investigations in psychotherapy
generally, and symptom pattern formation
specifically. Although the present study has
not extended beyond its initial phase, preliminary findings suggest that the merger of
computer techniques with clinical observations can do a great deal to shed more light
on the problems of emotionally disturbed
children.
The extreme variability in symptom constellations associated with specific
*The authors are indebted to Mrs. Janice Schulman, Research Associate, and Dr. Robert Prall,
Director of the Children's Unit of the Eastern Pennsylvania Psychiatric Institute for their invaluable aid in formulating the problem areas and in interpreting the results.
psychiatric diagnoses makes the use of statistical techniques a virtual necessity in the
rigid formulation of any symptom-disease
model. In addition, the large number of
symptoms associated with psychological and
psychophysiological disorders suggests that
an electronic computer can be used to considerable advantage in many aspects of this
diagnosis. For example, the objective determination of symptom clusters using correlation analysis and the statistical testing of
significance among alternative diagnoses require an excessive amount of computation
which is only feasible when handled by a
computer.
This report on the initial phase of the investigation describes the clinical information
collection techniques, the psychiatric categories assigned, and the statistical methodology applied in the determination of the preliminary results.
The ultimate goal of the project is to
evolve, through analysis of the data, a picture of how the various symptoms tend to
form patterns or "clusters." It is hoped that
these clusters can be related directly to the
relevant category as an aid in the problem of
differential diagnosis. Associated with this
investigation is an evaluation of the diagnostic
categories themselves and the determination, from a statistical standpoint, of the significance among the several categories using
the "symptom vectors" as criteria. The
mathematical details of the technique used in
this phase-Simultaneous Multivariate Analysis of Variance-are given in the Appendix.
Data Description
The Children's Unit at the Eastern Pennsylvania Psychiatric Institute has been collecting data regarding symptom formation in
emotionally disturbed children since 1956.
An enumeration of the emotional problems
exhibited by each child seen in the Clinic (which is both an inpatient and outpatient treatment center) is obtained in a highly structured interview with each parent, who is seen
individually. In addition to whether the child
possesses any of the 130 symptoms commonly
encountered in clinical practice (see Appendix, Table A-1), information with regard
to (1) duration of the symptom; (2) whether
the symptom is currently characteristic of
the child or existed only in the past; and (3)
whether the parent considers it as "serious"
or "not so serious," is also obtained.
Several assumptions were necessary in
order to make the data amenable to computational techniques. Since additional evidence
is now being collected on "normal controls,"
i.e., on non-patient children of various ages,
and since consistency checks between parents
are being made, it is expected that the validity
of these assumptions will become apparent
in later studies; in any case they are such
that the final results will not be seriously
affected by their acceptance.
A symptom is considered to be present if
either parent says it is present. If only one
parent sees a particular symptom, the designation as to the severity and duration is determined by the response of the parent who
sees it even though it may be listed by the
other as not characteristic of the child.
If the information elicited indicates that
both parents agree on the relevance of the
given symptom, but disagree as to the specific designation, the following rules determine how the response is counted:
a. If one parent says the symptom exists
at present while the other says it existed only in the past, the symptom is
listed at present.
b. If one parent says present always and
the other says present recent, the former designation is used.
c. If one parent considers the symptom
serious but the other does not, it is
listed as serious.
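The combination rules above can be expressed compactly. The record layout (field names and value strings) is assumed here purely for illustration; the paper does not describe the actual data format.

```python
# Sketch of the stated rules for merging two parents' reports of one
# symptom.  A report is a dict with assumed fields "timing" ("present" or
# "past"), "duration" ("always" or "recent"), and "severity" ("serious"
# or "not so serious"); None means the parent does not report the symptom.

def combine(parent_a, parent_b):
    """Merge two parents' reports of one symptom per the stated rules."""
    if parent_a is None and parent_b is None:
        return None                      # symptom absent: neither reports it
    if parent_a is None or parent_b is None:
        # only one parent sees the symptom: that parent's designation wins
        return parent_a or parent_b
    return {
        # rule (a): "present" outranks "existed only in the past"
        "timing": "present" if "present" in (parent_a["timing"], parent_b["timing"]) else "past",
        # rule (b): "present always" outranks "present recent"
        "duration": "always" if "always" in (parent_a["duration"], parent_b["duration"]) else "recent",
        # rule (c): "serious" outranks "not so serious"
        "severity": "serious" if "serious" in (parent_a["severity"], parent_b["severity"]) else "not so serious",
    }

merged = combine(
    {"timing": "present", "duration": "recent", "severity": "not so serious"},
    {"timing": "past", "duration": "always", "severity": "serious"},
)
print(merged)  # {'timing': 'present', 'duration': 'always', 'severity': 'serious'}
```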
Other basic sociological data are available
on each of the 199 cases being investigated
but no attempt at analysis has been made.
Presumably such information as age, sex,
religion, race, and number of parents interviewed, will provide fruitful areas for future
study.
Only four of the fifteen diagnostic categories are being studied in detail since in many instances the data available in the remainder represent too few cases from which
to draw meaningful results. Further, the
categories finally selected are those of
greatest interest to the research personnel
at the Children's Unit with whom the study
is being made. Descriptive definitions* are
given below to enable the reader to relate
the symptom patterns, as indicated by the
correlation analysis described in the next
*See Diagnostic and Statistical Manual of Mental Disorders, The American Psychiatric Association, Washington, D. C.
section, to the emotional disturbance classifications tested in the section entitled "Diagnostic Significance."
Childhood Schizophrenia - this category, also known as childhood psychosis, is characterized by varying
degrees of personality disintegration
and failure to test and evaluate external reality correctly. Children in this
group fail in their ability to relate themselves effectively to other people and to
such tasks as school work. They may
exhibit unpredictable behavior, regression to earlier childhood forms of behavior, and uneven rates of development
of any area of bodily and mental development. The onset of this condition
may be from birth or may come later
on in childhood and is often gradual.
Psychophysiological Disorders - such
disorders are sometimes referred to
as psychosomatic disorders and represent organic disturbances which are
induced by mental or emotional stimuli.
Psychoneurotic Disorders - these are
characterized chiefly by "anxiety"
which may either be felt and expressed
directly or may be unconsciously and
automatically controlled by various
psychological defense mechanisms.
Personality Trait Disturbances - disorders of this sort are characterized
by developmental defects or pathological 'trends in the personality structure,
with minimal subjective anxiety, and
little or no sense of distress. Usually
it is manifested by long standing patterns of action or behavior, rather than
mental or emotional symptoms seen in
neuroses and psychoses.
The 199 cases under consideration have each
been studied by professional personnel at
the Clinic prior to having a final diagnosis
assigned.
Symptom Cluster Formation
A major purpose of the investigation has
been the objective determination of symptom
clusters associated with each individual diagnostic group and also with all groups together.
The analysis has been aimed, then, at answering the two questions:
1. Is there a tendency for specific symptoms to be more frequently accompanied by other symptoms?
2. Are there specific symptoms or clusters of symptoms associated with certain diagnostic categories?
The intercorrelations between symptoms
have been examined statistically and classed
into overlapping and non-overlapping clusters.
A cluster is defined to be overlapping if
all the symptoms in the cluster are correlated significantly with one another.
A cluster is defined to be non-overlapping
if each symptom in the cluster is correlated
significantly with one or more, but not all,
of the other symptoms.
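In graph terms, the non-overlapping clusters defined above are the connected components of the graph whose edges are the significant correlations (an overlapping cluster, with every pair significant, is a clique of that graph). The following is a sketch of the component search; the significance predicate and the symptom numbers are illustrative, not taken from the paper's data.

```python
# Sketch: non-overlapping clusters as connected components of the
# significance graph.  "significant" is a hypothetical predicate that
# would, in practice, come from the t-test on each pair's correlation.

def connected_components(symptoms, significant):
    """Group symptoms so each is significantly correlated with at least
    one other member of its group (but not necessarily with all)."""
    remaining, clusters = set(symptoms), []
    while remaining:
        stack, comp = [remaining.pop()], set()
        while stack:
            s = stack.pop()
            comp.add(s)
            linked = {t for t in remaining if significant(s, t)}
            remaining -= linked
            stack.extend(linked)
        clusters.append(sorted(comp))
    return clusters

# Illustrative significant pairs (invented for this example):
pairs = {(11, 13), (13, 15), (11, 14), (2, 50), (50, 64)}
sig = lambda a, b: (min(a, b), max(a, b)) in pairs
print(connected_components([2, 11, 13, 14, 15, 50, 64], sig))
# two components: [2, 50, 64] and [11, 13, 14, 15] (component order may vary)
```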
The correlation between any two symptoms is given by

    r_ij = (N N_ij - N_i N_j) / sqrt[ N_i (N - N_i) N_j (N - N_j) ]

where

    N    = total number of subjects
    N_i  = number of subjects with symptom i
    N_j  = number of subjects with symptom j
    N_ij = number of subjects with symptoms i and j.

The hypothesis that no correlation exists between symptom i and symptom j can be easily tested by using the fact that

    t = r_ij sqrt(N - 2) / sqrt(1 - r_ij^2)

is distributed as Student's "t" with (N - 2) degrees of freedom. We accept or reject the hypothesis according as |t| <= t_α or |t| > t_α, where t_α = the appropriate "t" table value at the α significance level.
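As a concrete sketch of this test: for 0/1 symptom data the correlation reduces to the four counts, and the t statistic follows directly. The phi-coefficient form of r_ij is assumed here, since the printed formula is partially illegible in the scan, and the counts below are invented for illustration.

```python
import math

def phi(N, Ni, Nj, Nij):
    """Correlation between two binary (0/1) symptoms from their counts,
    assuming the standard phi-coefficient form of r_ij."""
    num = N * Nij - Ni * Nj
    den = math.sqrt(Ni * (N - Ni) * Nj * (N - Nj))
    return num / den

def t_statistic(r, N):
    """Statistic distributed as Student's t with N - 2 degrees of
    freedom under the hypothesis of no correlation."""
    return r * math.sqrt(N - 2) / math.sqrt(1 - r * r)

# Invented example: 199 subjects, symptom i in 80 of them, symptom j in
# 60, and both together in 40.
r = phi(199, 80, 60, 40)
t = t_statistic(r, 199)
print(round(r, 3), round(t, 2))
```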
It is important to note that the application
of this significance test is valid only when
the distribution of individuals for a given
symptom is normal. Although the symptom
data are discrete and binomial-since each
individual is assigned a "zero" or a "one"
depending on whether the symptom is absent
or present-the sample size would seem to
be large enough to take advantage of the fact
that the binomial tends to normality as the
sample size increases. Moreover, the discreteness of the data is due more to the lack
of precision in measuring than to any inherent lack of continuity in the underlying scale.
A simple frequency count of the 20 most
frequently occurring symptoms in each diagnostic group and also in all diagnostic groups
taken as a single entity has been made with
the following results:

Diagnostic group 1 (Psychological Disorders): 2, 9, 11, 13, 14, 15, 18, 25, 26, 39, 45, 47, 50, 52, 60, 61, 64, 65, 68, 79.

Diagnostic group 2 (Personality Trait Disturbances): 1, 10, 11, 13, 14, 15, 26, 28, 33, 37, 39, 40, 43, 47, 56, 57, 60, 61, 66, 68.

Diagnostic group 3 (Childhood Schizophrenia): 10, 11, 13, 16, 17, 18, 23, 25, 26, 28, 31, 33, 53, 59, 60, 64, 68, 73, 76, 80.

Diagnostic group 4 (Psychophysiological Disorders): 1, 9, 10, 11, 13, 14, 15, 18, 21, 26, 39, 40, 47, 56, 57, 60, 61, 66, 68, 69.
All groups: 1, 9, 10, 11, 13, 14, 15, 17, 18, 21, 26, 28, 33, 39, 47, 56, 57, 60, 61, 68.

The above listing clearly indicates that virtually all symptoms within each group are indirectly interrelated, with the exception of those in diagnostic group 1. In an attempt to study the direct relationships which exist, the overlapping clusters (as defined above) were determined.

Overlapping Clusters

Group 1: 2, 50, 64; 9, 26; 11, 13; 13, 15; 14, 15; 18, 26; 25, 39; 25, 45; 25, 50; 25, 65; 39, 45; 39, 50; 39, 65; 45, 50; 45, 65; 50, 65; 26, 45; 39, 47; 47, 60; 47, 65; 60, 65.

Group 2: (listing garbled in the scanned source.)

Group 3: (listing garbled in the scanned source.)

Group 4: 1, 9, 47; 1, 47, 56, 60; 9, 10, 11, 13; 9, 10; 11, 40, 47; 9, 11, 15; 9, 47, 57; 9, 61; 10, 11, 13, 21; 10, 11, 21, 47; 10, 11, 21, 68; 11, 15, 18; 11, 15, 21; 13, 14, 21; 13, 26; 14, 21, 47; 14, 40, 47; 15, 18, 26; 26, 66; 39, 40; 39, 57; 47, 57; 47, 66; 68, 69.

All Groups: 1, 9, 39; 1, 17, 18; 1, 9, 47; 1, 47, 56, 60; 9, 10, 14, 15; 9, 14, 47; 9, 15, 61; 9, 68; 10, 11, 13, 15, 17; 10, 13, 14, 15, 17, 21; 10, 15, 17, 21, 33; 10, 21, 33, 60; 14, 17, 21, 28; 14, 47; 15, 17, 18, 33; 15, 18, 26; 15, 56; 15, 61; 17, 21, 28, 33; 21, 28, 33, 60; 21, 28, 33, 68; 28, 39; 28, 60, 61; 57, 60, 61.

The correlation between each pair of symptoms listed within the indicated category is shown in Tables I through V, where a star in a particular cell of the matrix indicates significance at the 5% level. The critical values for diagnostic groups 1, 2, 3, and 4 are 0.514, 0.344, 0.269, and 0.250, respectively; for all diagnostic groups the figure is 0.138. The non-overlapping symptom clusters, determined on the basis of significant correlations, are formed as indicated below.

Non-Overlapping Clusters

Group 1, Cluster 1: 11, 13, 14, 15
Group 1, Cluster 2: 2, 9, 18, 25, 26, 39, 45, 47, 50, 60, 64, 65
Group 2, Cluster 1: 10, 11, 13, 14, 15, 26, 28, 33, 37, 39, 40, 43, 56, 57, 60, 61, 66, 68
Group 3, Cluster 1: 10, 11, 13, 16, 17, 18, 23, 25, 31, 33, 59, 60, 68, 73
Group 4, Cluster 1: 1, 9, 10, 11, 13, 14, 15, 18, 21, 26, 39, 40, 47, 56, 57, 60, 61, 66, 68, 69
All Groups, Cluster 1: 1, 9, 10, 11, 13, 14, 15, 17, 18, 21, 26, 28, 33, 39, 47, 56, 57, 60, 61, 68
TABLE I
Correlation Matrix for Diagnostic Group 1
2
9
11
13
14
15
18
25
26
39
45
47
50
52
60
61
64
65
68
79
2
9
11
13
14
15
18
25
26
39
45
47
50
52
60
61
64
65
68
79
+1.00
+.354
-.200
-.200
+.100
-.426
+.139
+.100
+.378
+.213
+.000
- .200
+.577*
- .107
+,.000
+.289
-.577*
+.000
- .354
-.200
+.354
+1.00
-.354
+.000
+.354
+.075
+.294
+.000
+.535*
+.075
+'.167
+.000
+.272
-.302
+.272
+.272
- .,408
+.272
-.250
-.354
.;. .200
- .354
+1.00
+.700*
+.100
+.213
-.277
-.200
- .189
-.426
+.000
+.100
-.289
+.213
+.000
-.289
+.289
+.000
+.000
+,100
-.200
+.000
+.700*
+1.00
+.400
+.533*
-.277
- .200
-.189
- .426
+.000
+.100
- .289
+.213
+.000
-.289
+'.000
+.289
+.000
-.200
+.100
+.354
+.100
+.400
+1.00
+.533*
+.139
+.100
+.378
-.107
+.354
+'100
+.000
- .107
-.289
- .289
+.000
+.289
-.354
-.500
- .426
+.075
+.213
+.533*
+.533*
+1.00
- .237
- .107
-.161
+.139
+.294
-.277
- .277
+.139
- .237
+1. 00
+.139
+.681*
+.207
+.294
+.139
+.080
+.207
+.080
+.480
+.080
+.080
+.294
+.139
+.100
t.OOO
-.200
-.200
+.100
'-.107
+.139
+1.00
+.378
+.533*
+.707*
+.400
+.577*
+.213
+.000
+.289
- .289
+.,577*
+.000
+.100
+.378
+.535*
- .189
-.189
+.378
-.161
+.681*
+.378
+1.00
+.443
+.535*
+.378
+.327
- .161
+.327
+.327
-.218
t.327
-.134
-.189
+.213
+.075
-.426
- .426
- .107
- .364
+.207
+.533*
+.,443
+1.00
+,075
+.533*
+.431
- .023
+.,431
+.431
- .185
+.431
+.075
- .107
+.000
+.167
+.000
+.000
+.354
+.075
+.294
+.707*
+.535*
+.075
+1.00
+.354
+.272
+.075
- .068
- .068
-.068
+.,272
- .250
+.000
-.200
+.000
+.100
+.100
+'.100
+.213
+.139
+.400
+.378
+.533*
+'.354
+1.00
+.000
+.213
+.577*
+.,289
+.000
+.577*
+.000
-.500
+.577*
+.272
-.289
-.289
+.000
-.492
+.080
+.577*
+.327
+.431
+,.272
+.000
+1.00
-.185
+.167
+.444
-.667*
+.167
- .068
+.000
-,107
-.302
+.213
+.213
-.107
+.318
+.207
+.213
- .161
- .,023
+.075
+.213
-.185
+1.00
-.185
+.431
+.123
+.431
+.075
+.213
+.000
+.272
+.000
+.000
- .289
-.185
+.080
+.000
+,.,327
+.,431
-.068
+.577*
+.167
- .185
+1.00
+.444
-.111
+.167
+.272
-.289
+.289
+'.272
-.289
-.289
-.289
-.185
+.480
+.289
+.327
+.431
-.068
+.289.
+.444
+.431
+.444
+1.00
-.389
+.444
+.272
+.000
-.577*
- .408
+.289
+.000
+.000
+.123
+,080
-.289
-.218
-.185
- .068
+.000
-.667*
+.123
- .111
-.389
+1.00
- .389
+.272
+.289
+.000
+.272
+.000
+.289
+.289
+.431
+.080
+.577*
+.327
+-.431
+.272
+.577*
+.167
+.,431
+.167
+.444
-.389
+1. 00
- .068
-.289
-.354
-.250
+.000
+.000
- .354
-.302
+.294
+.000
- .134
+.075
-.250
+.000
-.068
+,075
+.272
+.272
+.272
- .068
+1.00
+.354
-.200
-.354
+.100
-.200
-.500
- .426
+.139
+.100
- .189
-.107
+.000
- .500
+.000
+.213
- .289
+,000
+.289
-.289
+.354
+1.00
-~364
+.075
+.213
-.492
+.318
-.185
-.185
+.123
+.431
-.302
- .426
2
9
11
13
14
15
18
25
26
39
45
47
50
52
60
61
64
65
68
79
'"d
t;
o
o
('!)
('!)
p..
"".
::I
(Jq
00
I
~
e......
c:...
o
"".
a
(i
.go
a
('!)
t;
(i
a
Cl)
t;
Cl)
::I
.......o
Cl)
co
0)
N
"-.....
N
00
co
~
CO
o
...........
(j
E'
fJl
~
CD
~
~
o
~
[.....
1
1
10
11
13
14
15
26
28
33
37
39
40
43
47
56
57
60
61
66
68
+1.00
+.000
+.160
+.067
+.021
+.021
+.124,
- .020
+.021
- .020
+.250
+.199
+.280
-.134
+'.250
+,043
+.160
+ 04i
+:021
+.067
10
11
13
14
+.160 +.067 +.021
+.418* +.800* +.373*
+1~00 +.353* +.273
+.353* +1.00 +.242
+.273 +.242 +1.00
+.273 +.089 +.283
+,.239 - .083 +.373*
-.144 +.160 +.187
+.089 +.242 +.283
+.385* +.454* +.324
+.089 - .,219 -.004
+.383* +.013 +.187
+.500* +.289 +.336
+.057 +.155 +.089
+.457* +.396* +.570*
+.,311 +.130 +.336
+.057 +.155 +.457*
+.311 , -.029 +.336
+~093 +.089 +,242 +.139
-.100 +.155 +.010 +.242
+~OOO
+1.00
+.418*
+.800*
+.373*
+.233
+.167
+.000
+.233
+.535*
-.326
+.000
+.289
+.060
+.461*
+.000
+.060
+.000
15
TABLE II
8
Correlation Matrix for Diagnostic Group 2
§
26
28
+.021 +.124 -.020
+.233 +.167 +.000
+.273 +.239 - .144
+.089 - .083 , +.160
+.283 +.373* +.187
+1.00 +,.373* +.187
+.373* +1.00 +.134
+.187 +.134 +1.00
+.,426*' +.373* +.461*
+'.461* +.297 +.083
-.004 +~202 +,.050
+.187 +.297 -,.048
+.188 +.241 -.039
+.089 - .199 +.032
+.283 +.202 +.187
+.188 +.064 +'.244
+.089 +.020 +.559*
+.485* +.417* +.244
+'.139 -.140 +.324
+.242 +.100 +.307
s;:l.
33
37
39
40
43
47
56
57
60
61
66
68
+.021
+.233
+.089
+.242
+.283
+.426*
+'.373*
+.461*
+1.00
+.187
- .004
+.050
+.040
+.089
-.004
+,336
+.089
+.485*
+.139
+.242
-.020
+.535*
+.383*
+.454*
+.324
+.461*
+.297
+.083
+.250
-.326
+.089,
- .219
-·004
-.004
+.202
+.050
-.004
-.087
+1.00
+.735*
-.108
+.089
-.004
+.188
+.089
+.188
+.139
+.089
+.199
+.000
+.383*
+;013'
+.187
+.1:87
+'.297
-.048
+.050
+.214
+.735*
+1.00
+,t03
+.208
+.187
+.103
+.032
+.,386*
+.187
+.013
+.280
+.289
+.500*
+.289
+.336
+.188
+.241
-.039
+.040
+.386*
- .108
+.103
+1.00
+.121
+.633*
+: 389*
+.121
+.083
+.188
+.447*
-.134
+.060
+.057
+.155
+:.089
+'.089
-.199
+.032
+.089
- .144
+.089
+.208
+.121
+1.00
+.089
+.311
+.057
+.121
+.273
-.042
+.250
+.373*
+.457*
+,396*
+.570*
+.283
+.202
+.187
-.004'
+.461*
-.004
+.187
+.633*
+.089
+1.00
+.336
+.457*
+.188
+.139
+.396*
+.043
+.000
+.311
+.130
+.336
+.188
+·.064
+.244
+.336
+.103
+.188
+.103
+;389*
+.311
+.336
+1.00
+.311
+.389*
+.188
+.289
+.160
+.060
+.057
+.155
+,.457'"
+.089
+.020
+.559*
+.089
+.208
+.089
+.032
+.121
+.057
+.457*
+.311
+1.00
+.311
+.273
+.353*
+.043
+.000
+.311
- .029
+.336
+.,485*
+.417*
+'.244
+.485*
+.103
+.188
+.386*
+.083
+.121
+.188
+.389*
+.311
+1.00
+.188
+.130
+.021
+.093
+.089
+.242
+'.139
+.139
-.140
+.324:
+.139
+.050
+.139
+.187
+.188
+.273
+.139
+.188
+.273
+.188
+1.,00
+.396*
+.067
-.100
+.155
+~187
+1.00
-.087
+.214
+.386*
-.144
+.461*
+.103
+,208
+.103
+'.050
+.307
+~O10
+.242
+.242
+.100
+.307
+.242
+.307
+'.089
+.013
+.447*
-.042
+.396*
+'.289
+.353*
+.130
+.396*
+1.00
t:1
~.
1
10
11
13
14
15
26
28
33
37
39
40
43
47
56
57
60
61
66
68
o
fJl
.....
~
~
til
.....
I§.....
.....
§
~
~
~
CD
.....
='
~
fJl
~
t:r
.....
~
~
.....
~
til
~
~o
9
trl
<
~
~
.....
8
TABLE III
Correlation Matrix for Di~gnostic Group 3
[Variables 10, 11, 13, 16, 17, 18, 23, 25, 26, 28, 31, 33, 53, 59, 60, 64, 68, 73, 76, 80; the matrix entries are not legibly recoverable from the scanned text and are omitted. Entries marked with an asterisk are so marked in the original.]
TABLE IV
Correlation Matrix for Diagnostic Group 4
[Variables 1, 9, 10, 11, 13, 14, 15, 18, 21, 26, 39, 40, 47, 56, 57, 60, 61, 66, 68, 69; the matrix entries are not legibly recoverable from the scanned text and are omitted.]
TABLE V
Correlation Matrix for All Diagnostic Groups
[Variables 1, 9, 10, 11, 13, 14, 15, 17, 18, 21, 26, 28, 33, 39, 47, 56, 57, 60, 61, 68; the matrix entries are not legibly recoverable from the scanned text and are omitted.]
294 / Cluster Formation and Diagnostic Significance in Psychiatric Symptom Evaluation
The clustering of these groups suggests that for the various diagnoses, and also for all cases regarded as a single entity, certain specific constellations do have a tendency to reappear; in other words, the appearance of a specific symptom indicates the probable appearance of other symptoms associated with it.
Diagnostic Significance
A typical experiment usually involves taking one or more measurements on the experimental units under test. The experiment is called univariate, or uniresponse, when only one measurement is taken on each unit, and multivariate, or multiresponse, otherwise. Since the symptom study data have many determinations associated with each sample item, i.e., individual, the tools of multivariate analysis are required for the analysis.
TABLE VI
Diagnostic Group Means by Symptom Type

Symptom    Group 1    Group 2    Group 3    Group 4
   1       0.7581     0.3519     0.9091     0.5333
   9       0.6774     0.4444     0.5758     0.8000
  10       0.6935     0.8148     0.6667     0.5333
  11       0.8226     0.8704     0.8485     0.6667
  13       0.7742     0.8704     0.7576     0.6667
  14       0.6452     0.6852     0.6970     0.6667
  15       0.7419     0.6852     0.6970     0.7333
  17       0.6290     0.9074     0.6061     0.5333
  18       0.6774     0.7037     0.5152     0.8667
  21       0.7258     0.6111     0.6364     0.4667
  26       0.8387     0.8148     0.8182     0.9333
  28       0.6129     0.7037     0.6364     0.5333
  33       0.6290     0.8333     0.6970     0.4667
  39       0.8065     0.6296     0.6970     0.7333
  47       0.8065     0.5185     0.8485     0.6667
  56       0.6935     0.4630     0.6970     0.4667
  57       0.6452     0.6481     0.7273     0.5333
  60       0.6935     0.7407     0.8485     0.6000
  61       0.7419     0.6667     0.7273     0.6000
  68       0.7097     0.8519     0.7576     0.8000

*Diagnostic Group 1: Psychoneurotic Disorders; 2: Childhood Schizophrenia; 3: Personality Trait Disturbances; 4: Psychophysiological Disorders.
Table VI shows the mean proportion of individuals in each diagnostic group having the particular symptom under consideration. Differences among the diagnostic groups are to be analyzed on the basis of these 20 symptoms - the 20 most frequently occurring symptoms in the entire sample.

Using the notation of the Appendix, we let X_{ijt} adopt the value one or zero depending on whether the jth individual in the ith diagnostic group does or does not have the tth symptom. For this particular analysis "i" takes the values 1 through 4; "j" takes the values 1 through η_i, where η_1 = 62, η_2 = 54, η_3 = 33, and η_4 = 15; and t can adopt any of the 20 symptom numbers listed in Table VI.

The symbol x̄_{i.t} refers to the value in each cell of the table, i.e., the sample proportion of individuals in the ith group exhibiting symptom t. For example, x̄_{3.13} has the value 0.7576, indicating that almost 76% of the study group classified as having personality trait disturbances exhibit definite signs of rebelliousness. We let μ_{it} denote the corresponding population mean proportion.
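The cell proportions x̄_{i.t} of Table VI are simply column means of 0/1 indicator matrices, one matrix per diagnostic group. A minimal sketch (numpy assumed; the random 0/1 data and seed are hypothetical stand-ins for the study records):

```python
import numpy as np

# Hypothetical 0/1 symptom data: groups maps diagnostic group i -> array of
# shape (eta_i, 20), where entry [j, t] is X_ijt (1 if the jth individual in
# group i shows symptom t, else 0). Group sizes follow the paper:
# eta_1 = 62, eta_2 = 54, eta_3 = 33, eta_4 = 15.
rng = np.random.default_rng(0)
sizes = {1: 62, 2: 54, 3: 33, 4: 15}
groups = {i: (rng.random((n, 20)) < 0.7).astype(int) for i, n in sizes.items()}

# x_bar_{i.t}: sample proportion of individuals in group i exhibiting
# symptom t, i.e. the column mean of that group's 0/1 matrix -- one column
# of Table VI per group.
means = {i: X.mean(axis=0) for i, X in groups.items()}
```

Each `means[i]` is a vector of 20 proportions in [0, 1], matching one column of Table VI in form.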
The total hypothesis of no difference among diagnostic groups (see Appendix) is denoted by

H: μ_1 = μ_2 = μ_3 = μ_4,

where

μ'_i = (μ_{i1}, μ_{i2}, ..., μ_{i,20})

and μ_i is the transpose of μ'_i. The hypothesis of no difference between the ith and lth diagnostic groups is denoted by

H_{il}: μ_i = μ_l (i ≠ l = 1, 2, 3, 4).
The Within Groups SP matrix (S_e) and its inverse are given in Tables VII and VIII. The sizes of the samples drawn from the several groups are η_1 = 62, η_2 = 54, η_3 = 33, η_4 = 15. The number of variates (i.e., symptoms) is p = 20, and η, the total sample size, is 164. The "error degrees of freedom,"

v = η - p - K + 1 = 164 - 20 - 4 + 1 = 141.

Then

T²_{il} = v η_{il} (μ̂_i - μ̂_l)' S_e⁻¹ (μ̂_i - μ̂_l),

where

η_{il} = η_i η_l / (η_i + η_l)

and

μ̂'_i = (x̄_{i.1}, ..., x̄_{i.20}), i = 1, 2, 3, 4.
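The pairwise statistic above can be sketched numerically. This is a minimal illustration on synthetic 0/1 data (numpy assumed; the data and seed are hypothetical, not the study's), using the paper's group sizes so that v comes out to 141:

```python
import numpy as np

rng = np.random.default_rng(1)
p, sizes = 20, [62, 54, 33, 15]            # variates and group sizes from the paper
groups = [(rng.random((n, p)) < 0.7).astype(float) for n in sizes]

# Within Groups SP matrix S_e: sum over groups of centered cross products.
Se = sum((g - g.mean(0)).T @ (g - g.mean(0)) for g in groups)
eta, K = sum(sizes), len(sizes)
v = eta - p - K + 1                        # error degrees of freedom (141 here)

Se_inv = np.linalg.inv(Se)

def T2(i, l):
    """Pairwise statistic T2_il = v * eta_il * d' Se^{-1} d."""
    d = groups[i].mean(0) - groups[l].mean(0)
    eta_il = sizes[i] * sizes[l] / (sizes[i] + sizes[l])
    return v * eta_il * d @ Se_inv @ d

pairs = [(i, l) for i in range(K) for l in range(i + 1, K)]
T2max = max(T2(i, l) for i, l in pairs)    # analogue of T2_max in the text
```

With K = 4 groups there are six pairwise statistics, and the test compares the largest of them to the critical value T²_α.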
TABLE VII
Within Groups Sums of Squares and Cross Products (SP) Matrix
[The entries of the 20 x 20 Within Groups SP matrix are not legibly recoverable from the scanned text and are omitted.]

TABLE VIII
Inverse of Within Groups Sums of Squares and Cross Products (SP) Matrix

[The entries of the inverse matrix are not legibly recoverable from the scanned text and are omitted. Each element of the matrix is to be multiplied by 10⁻².]
Proceedings-Fall Joint Computer Conference, 1962 / 297
The data on symptoms yield the following values for the T²'s:

T²_{12} = 77.3215,
T²_{13} = 15.8515,
T²_{14} = 18.7329,
T²_{23} = 53.7181,
T²_{24} = 36.4148,
T²_{34} = 27.0456.

It is apparent that T²_max = T²_{12} = 77.3215. If we are interested in testing the hypothesis at the 6% level, the critical value T²_α is found in the following way:

Probability [largest T²_{il} ≤ T²_α | H] = 1 - Σ_{i≠l=1}^{4} P_{il} = 0.94,

where

P_{il} = Probability [T²_{il} > T²_α | H_{il}].

Since all P_{il}'s are equal to (say) P, we have 1 - 6P = 0.94 and therefore P = 0.01.

Now each T²_{il} is distributed as the F distribution with (p, v), that is (20, 141), degrees of freedom. The F tables yield a value of T²_α = 1.88 when P = 0.01.

Hence T²_max > T²_α and the hypothesis of no difference among diagnostic groups is rejected. In fact, each T²_{il} > T²_α, which indicates that each group differs significantly from all the others.

Although the application of this particular test is justified only when the data are multivariate normal, it seems reasonable to assume that (1) the data for each diagnostic group do approximate normality, since the sample sizes are moderately large; and (2) the test itself is not particularly sensitive to deviations from normality.

The results, while preliminary, are extremely encouraging in that they suggest the existence of reasonably clear-cut criteria for diagnostic grouping. Obviously much work remains to be done, but the pilot investigation to date gives every indication that future study will provide revealing information and meaningful results for differential diagnosis of emotional disturbances.

APPENDIX

Tests for Equality of Mean Vectors

Consider K multivariate normal populations with a common unknown covariance matrix Σ and mean vectors μ_1, ..., μ_K, where μ'_i = (μ_{i1}, ..., μ_{ip}) and μ_{it} denotes the ith population mean on the tth variate. Since a multivariate normal is completely specified by its mean vector and covariance matrix, the above populations are homogeneous if their mean vectors are equal. The hypothesis of equality of mean vectors is denoted by

H: μ_1 = ... = μ_K.

Various test procedures are available in the literature to test the total hypothesis H and to make a decision relative to the acceptance or rejection of its subhypotheses in the event H is rejected. For a detailed discussion of the relative merits of these procedures, the reader is referred to [5]. Two of these procedures are briefly reviewed here. The first procedure is known as the "Multivariate Analysis of Variance (MANOVA) test based on the largest root." This test was proposed by S. N. Roy [7], and the test procedure is to accept or reject H according as

c_L(S_H S_e⁻¹) ≤ or > λ_α,

where c_L(A) denotes the maximum characteristic root of A and λ_α is chosen such that

Prob [c_L(S_H S_e⁻¹) ≤ λ_α | H] = (1 - α).

In the above equation, S_H and S_e respectively denote the sums of squares and cross products (SP) matrices due to H and "error." (In the univariate case, where p = 1, S_H and S_e respectively denote the sums of squares due to H and error.) The value of λ_α for any given value of α can be obtained from the tables of D. L. Heck [1]. If the total hypothesis is rejected, we can make multiple decisions on the acceptance or rejection of the various subhypotheses by examining the confidence intervals on the "parametric functions" which measure departures from these subhypotheses. These confidence intervals were derived by S. N. Roy and R. Gnanadesikan [8, 9]. For an illustration of the MANOVA test with biochemical data, the reader is referred to [13].

The second method is based on the maximum of the K(K-1)/2 Hotelling T²'s. This procedure is applicable to test H and to make multiple decisions on the acceptance or rejection of its subhypotheses of the form H_{il}: μ_i = μ_l, (i ≠ l = 1, 2, ..., K). This procedure was formulated by S. N. Roy and R. C. Bose [6] and is a multivariate analogue of Tukey's multiple comparison test [14]. The technique (with trivial modification) is described below.

Let

T²_{il} = v η_{il} (μ̂_i - μ̂_l)' S_e⁻¹ (μ̂_i - μ̂_l), η_{il} = η_i η_l / (η_i + η_l), (i ≠ l = 1, 2, ..., K),

where v is the error degrees of freedom, η_i is the size of the ith sample, and μ̂_i (i = 1, 2, ..., K) is the maximum likelihood estimate of μ_i. Then we accept or reject H according as

largest T²_{il} out of the K(K-1)/2 pairs ≤ or > T²_α,

where T²_α is chosen such that

Prob [largest T²_{il} out of the K(K-1)/2 pairs ≤ T²_α | H] = (1 - α).

If the total hypothesis H is rejected, we accept or reject the subhypotheses H_{il} according as

T²_{il} ≤ or > T²_α.

Recently, it was shown [5] that the above test is better (in the sense of shortness of the lengths of the confidence intervals) than the MANOVA test, but the nature of the exact distribution of the "largest Hotelling T²" is not known. M. Siotani [10-12] has suggested some approximations to this distribution which, for moderately large samples, seem satisfactory. When the error degrees of freedom are very large, the following approximation can be used:

Prob [largest T²_{il} ≤ T²_α | H] ≅ 1 - Σ_{i≠l=1}^{K} P_{il},

where

P_{il} = Prob [T²_{il} > T²_α | H_{il}].

But, when H_{il} is true, T²_{il} is distributed as the F distribution with (p, η - p - K + 1) degrees of freedom. So, the values of the P_{il}'s can be obtained from F tables (or incomplete Beta function tables) for any given value of T²_α. Here, we note that M. Siotani suggested an approximation similar to the above as a first approximation for upper percentage points.

In one-way classification, S_H = (s_{tu}) and S_e = (s_{e,tu}) are respectively called the "Between Groups" and "Within Groups" SP matrices, where

s_{tu} = Σ_i η_i (x̄_{i.t} - x̄_{..t})(x̄_{i.u} - x̄_{..u})

and

x̄_{i.t} = (Σ_j X_{ijt}) / η_i,   x̄_{..t} = (Σ_i Σ_j X_{ijt}) / η.

Here X_{ijt} denotes the observed value associated with the jth individual in the ith group on the tth variate; x̄_{i.u} and x̄_{..u} can be defined similarly.
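The one-way SP matrices and Roy's largest-root statistic described above can be sketched as follows. This is a minimal numpy illustration on hypothetical data (the dimensions, seed, and normal draws are invented); the critical value λ_α itself would come from Heck's charts [1]:

```python
import numpy as np

rng = np.random.default_rng(2)
p, sizes = 5, [30, 25, 20]                 # hypothetical one-way layout
groups = [rng.normal(size=(n, p)) for n in sizes]
X = np.vstack(groups)
grand = X.mean(0)                          # grand mean vector x_bar_..

# Between Groups SP matrix S_H:
#   s_tu = sum_i eta_i (xbar_i.t - xbar_..t)(xbar_i.u - xbar_..u)
SH = sum(n * np.outer(g.mean(0) - grand, g.mean(0) - grand)
         for g, n in zip(groups, sizes))

# Within Groups SP matrix S_e: pooled centered cross products per group.
Se = sum((g - g.mean(0)).T @ (g - g.mean(0)) for g in groups)

# Roy's largest-root statistic: maximum characteristic root of S_H Se^{-1};
# H would be rejected when it exceeds the tabulated critical value.
c_L = np.linalg.eigvals(SH @ np.linalg.inv(Se)).real.max()
```

A useful check on the computation is the one-way identity S_H + S_e = total SP matrix about the grand mean.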
TABLE A-1
List of Symptoms

[The ORIGINAL NUMBER and REASSIGNED NUMBER columns of this table are not legibly recoverable from the scanned text and are omitted; the symptom descriptions, by category, follow.]

School Maladjustment:
Arithmetic disability
Reading disability
Unsatisfactory school work in general
Truancy from school
Frequent absences from school
Fear or extreme dislike of school

Asocial Behavior:
Destructiveness
Lying
Stealing
Cheating
Pre-occupation with matches, fire-setting
Running away from home

Negative Attitudes and Behavior:
Sullenness, sulkiness
Disobedience
Stubbornness
Negativism
Rebelliousness
Resentment
Easily aroused anger
Unprovoked anger
Temper tantrums
Excessive whining or crying
Jealousy, envy of others
Bearing grudges
Teasing behavior
Attacking behavior
Fighting
Bullying
Cruelty

Other Interferences in Social Relationships:
Seclusiveness
Shyness
Excessive passivity, non-assertiveness
Over-sensitivity
Over-conformity
Over-dependency
Excessive independency
Clinging, exaggerated display of affection
General indifference or disinterest in people
Inability to form strong attachments or relationships
Inability to get along with other children his own age
Unpopularity with children
Inability to get along with adults
Preference for younger children
Preference for older children
Suspiciousness, distrust of people
Excessive assertiveness
Excessive competitiveness
"Sissy-like" behavior (boys)
"Tom-boyish" behavior (girls)

Attitudes Toward the Self:
Lack of self-confidence
Self-depreciation
Feelings of bodily inadequacy or defects
Attempts at physical self-injury
Disregard for his own possessions
Carelessness
Messiness
Worrying
Emphasis on sameness
Dawdling, procrastination
Over-conscientiousness, need for perfection
Specific fears, phobias
General fearfulness, timidity
Unrealistic fearlessness
General feeling of dissatisfaction, lack of enthusiasm
Specific compulsions
Bragging, grandiosity
Accident proneness
Exaggerated unconcern

Interference with Thought Activity:
Day-dreaming
Verbalized fantasies
Delusions, hallucinations
Complete self-absorption
Inability to concentrate
Boredom, lack of interest
Specific obsessions
Memory disturbances - underdeveloped
Memory disturbances - overdeveloped

Motor Disturbances:
Tiredness
Laziness
Poor coordination, awkwardness
Restlessness
Hyperactivity
Stereotyped mannerisms
Tics
Facial grimaces
Head-banging

Disruption in Developmental Pattern:
Speech disorders: Doesn't talk at all; Slow in learning to speak; Doesn't speak clearly; Limited vocabulary; Repetitive speech; Use of 3rd person; Other speech disorders
Eating problems: Won't eat at all; Doesn't eat enough, "poor eater"; Likes only a limited number of foods; Won't try new foods; Eats only strained foods; Food fads; Allergies to certain foods; Excessive appetite; Eats with fingers, poor table manners; Other eating problems
Wetting: Day; Night
Soiling: Day; Night
Excessive masturbation
Sex play: Exhibitionism; Voyeurism; Sexual activities with others; Sexual attitude; Sexual approach; Fearful of sex; Other sexual problems
Thumb-sucking
Nail-biting
Mouthing, licking, sucking, biting objects
Fetishism
Sleep disturbances: Refusal to go to bed; Fear of the dark; Difficulty falling asleep; Nightmares; Talking or calling out in sleep, crying in sleep; Sleep-walking; Light sleeper, waking up frequently at night; Prowling at night through the house; Wanting to sleep with one or both parents; Excessive sleep; Other sleep disturbances
Other problems not mentioned above
BIBLIOGRAPHY
1. Heck, D. L., "Charts of Some Upper Percentage Points of the Distribution of the Largest Characteristic Root," Annals of Mathematical Statistics, Vol. 31 (1960), pp. 625-642.
2. Johnson, P. O., Statistical Methods in Research, Prentice-Hall, Inc., New York (1950).
3. Kaskey, Gilbert, and Krishnaiah, P. R., "Statistical Routines for Univac Computers," Transactions of Middle Atlantic Conference of American Society for Quality Control (1962), pp. 289-304.
4. Krishnaiah, P. R., Simultaneous Tests and the Efficiency of Generalized Balanced Incomplete Block Designs (unpublished manuscript).
5. Krishnaiah, P. R., "Multiple Comparison Tests in Multi-Response Experiments," presented at the Annual Meeting of the Institute of Mathematical Statistics, September 1962.
6. Roy, S. N., and Bose, R. C., "Simultaneous Confidence Interval Estimation," Annals of Mathematical Statistics, Vol. 24 (1953), pp. 513-536.
7. Roy, S. N., Some Aspects of Multivariate Analysis, John Wiley and Sons, Inc., New York (1957).
8. Roy, S. N., and Gnanadesikan, R., "Further Contributions to Multivariate Confidence Bounds," Biometrika, Vol. 44 (1957), pp. 399-410.
9. Roy, S. N., and Gnanadesikan, R., "A Note on Further Contributions to Multivariate Confidence Bounds," Biometrika, Vol. 45 (1958), p. 581.
10. Siotani, M., "The Extreme Value of the Generalized Distances of the Individual Points in the Multivariate Normal Sample," Annals of the Institute of Statistical Mathematics, Vol. 10 (1959), pp. 183-203.
11. Siotani, M., "On the Range in Multivariate Case," Proceedings of the Institute of Statistical Mathematics, Vol. 6 (1959), pp. 155-165.
12. Siotani, M., "Notes on Multivariate Confidence Bounds," Annals of the Institute of Statistical Mathematics, Vol. 11 (1960), pp. 167-182.
13. Smith, H., Gnanadesikan, R., and Hughes, J. B., "Multivariate Analysis of Variance (MANOVA)," Biometrics, Vol. 18 (1962), pp. 22-41.
14. Tukey, J. W., The Problems of Multiple Comparisons (unpublished notes), Princeton University.
SPACETRACKING MAN-MADE SATELLITES AND DEBRIS
Robert W. Waltz, Colonel, USAF
Commander 9th Aerospace Def. Div.
and
B. M. Jackson, Captain, USAF
Ent Air Force Base, Colorado
There are many types of satellites in orbit. One revolution requires a minimum of about ninety minutes. This varies, of course, and can be several times this quantity. The only organization in the free world that has responsibility for tracking all man-made objects in space is the 1st Aerospace Control Squadron.
AEROSPACE EVALUATION
The United States got into the aerospace business in 1957 when Russia launched its first Sputnik. At that time there was no operational skill developed in the art of detecting or tracking satellites. The Research and Development Command of the Air Force was assigned the responsibility to develop such a capability. This organization was established at Hanscom Field. In addition to research and development responsibility, they were tasked with the operational requirements of those days. In November 1960, the Air Force began its training of a group of Air Force personnel to establish a completely operational organization. In February of 1961, Headquarters USAF directed that the 1st Aerospace Control Squadron be in position by 1 July at Ent Air Force Base with an operational capability of detecting and tracking satellites in support of NORAD. At that time we had neither trained personnel, nor a physical location to go into, nor communications facilities, nor a computer. In the next four months our organization increased from the original twelve people to something in the order of over 100. A building was identified. Generals were moved from their offices and the facilities were completely rehabilitated. A computer was designated and installed by the middle of April of 1961. Personnel began arriving on site about that time, and by the 10th of June all of the facilities were ready. Unexpectedly, the computer at our research and development facility had a failure due to air conditioning. The requirement was placed upon our organization to commence operations 20 days ahead of schedule. We accomplished this without much difficulty.
Since that time we have taken on the name of Spacetrack Center. This is the portion of the 1st Aerospace Squadron that has the satellite mission. We were established as a contingent of the NORAD Combat Operations Center. The entire space detection and tracking system, which is the responsibility of NORAD, has been dubbed SPADATS.
SATELLITE BACKUP
In order to provide reliability in the system and to provide our research organization with the equipment and tools to further develop the state of the art of satellite detection and tracking, a backup facility was established at Hanscom Field. Equipment identical to that installed at Ent has been provided. The backup installation is called the Spacetrack R&D Facility and is under the control of the Air Force Systems Command. As do we, they have other names, the prime one being 496L Systems Project Office (SPO).
RELATIONSHIPS WITH OTHER
ORGANIZATIONS
The 1st Aerospace Control Squadron was established to support the North American Air Defense Command (NORAD). The United States Air Force, through its Air Defense Command, is responsible for the support and technical operation of all the facilities of the NORAD Combat Operations Center (COC) located at Ent Air Force Base. The 1st Aerospace Control Squadron is responsible to the Air Defense Command (through the 9th Aerospace Defense Division). NORAD is a unified command composed of all the various military services of the United States and Canada.
Radars are of two basic types. One is the fixed fan, which has a stationary antenna. It radiates electrical energy in a horizontal or vertical fan-shaped plane. The other type of radar is the tracker. It provides a pencil beam pattern and must be pointed at the satellite. The antenna is parabolic and is steerable either by manual control or automatically by computer program. See Fig. 1.
SPACETRACK MISSION
The mission of the Spacetrack Center is to
detect and track all man-made space objects;
maintain an information catalogue on all space
objects; determine orbits and ephemerides
of all space objects; provide system status
and satellite displays; and provide satellite
data to NORAD and other military and scientific agencies as required.
SATELLITE SENSORS
To support the mission of Spacetrack, there
are sensors located all around the world.
There are three basic types of sensors that
are used to detect and track satellites. These
are optical, radar and radiometric. Some of
these are controlled by military organizations; others are scientific instruments which
support Spacetrack on a cooperative basis.
An example of the optical type sensor is
the Baker-Nunn camera. A Baker-Nunn
c'amera is very similar to a telescope and is
equipped for taking pictures of satellites
against a star background. The mount for
the Baker-Nunn camera is steerable and can
be programmed to track the path of the satellite in accordance with predictions provided
by the Spacetrack Center. Techniques of the
astronomer are used to determine the position of the satellite with respect to reference
stars. In this manner, we obtain the right
ascension and declination of the satellite's
position.
Radars provide our most useful information in that they determine all of the quantities required to fix the satellite's position as one point in space. From the radar we get azimuth, elevation, range, range rate, doppler, and what we call a signature of the satellite. By signature we imply its tumble rate and radar cross section, from which we can, in general, determine the size of the object, length, width and general configuration.
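As an illustration in modern notation (not the operational Spacetrack code), a single radar fix of azimuth, elevation and range can be reduced to a point in space relative to the sensor site with elementary trigonometry; the East-North-Up frame and the example numbers below are assumptions for the sketch.

```python
import math

def radar_to_enu(azimuth_deg, elevation_deg, range_km):
    """Convert one radar observation (azimuth, elevation, range)
    into East-North-Up coordinates relative to the sensor site.
    Azimuth is measured clockwise from north; elevation up from
    the horizon."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    east = range_km * math.cos(el) * math.sin(az)
    north = range_km * math.cos(el) * math.cos(az)
    up = range_km * math.sin(el)
    return east, north, up

# A hypothetical satellite due north of the site, 45 degrees above
# the horizon, at 1000 km slant range:
e, n, u = radar_to_enu(0.0, 45.0, 1000.0)
```

Transforming this site-relative vector into an Earth-centered position is then a fixed rotation determined by the sensor's latitude and longitude.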
Figure 1. Aerial View of Thule-This aerial view of the BMEWS, Thule, Greenland radar site shows the four huge detection radar antennas and the one radome which protects the movable tracking radar antenna. The stationary detection antennas measure 165 feet in height and 400 feet across the top.
A third and last major sensor that contributes significantly to our mission is the radiometric type. It is a passive type sensor that depends upon transmissions from the satellite itself. It is, in effect, a simple but highly directional radio receiver. With this device we can determine the azimuth of the satellite, the time of its closest approach to the facility, and the doppler.
COMMUNICATIONS AND SENSOR
LOCATIONS
Teletype or high-speed data link communications connect Spacetrack to all of the
major sensors or centers that collect satellite observations. Direct circuits are provided to:
a. The Ballistic Missile Early Warning fixed fan radars at Clear, Alaska, and Thule, Greenland (Thule also has a tracking radar).
b. Shemya Island (Alaska)-both a fixed
fan and tracking radar.
c. Laredo AFB, Texas-tracking radar.
d. Moorestown, New Jersey-tracking
radar.
e. Spacetrack R&D Facility, Mass.-for computer backup. Data from a radiometric sensor and a tracking radar are received from this area.
f. SPASUR, operated by the U. S. Navy-consists of a series of vertical radar fans across the southern United States.
g. Patrick AFB, Florida-provides data
from the Atlantic Missile Range sensors including all types.
h. Sunnyvale and Point Mugu, Calif.-for
data from the Pacific Missile Range.
i. National Aeronautics & Space Administration (NASA) at Greenbelt, Maryland-links Spacetrack with the scientific sensors around the world, including those of foreign governments.
SATELLITE DATA INPUTS
Information received from the sensors by
100 wpm teletype circuits is processed to the
computer in two modes. In the manual mode
a paper tape and hard copy of each message
received by the Communications Center is
delivered to the Data Conversion Room. If
the message is in the proper format, it can
be immediately converted from paper tape to
punched cards by using an IBM 047 tape-to-card converter. On the other hand, for certain few messages, the observations must be processed manually from the hard copy by entering the information in a converted form onto a standard observation sheet. This sheet, in turn, is hand-punched to provide cards. The cards are then consolidated and read on magnetic tape in an off-line mode.
The second mode is called a semiautomatic mode of operation. Due to equipment and program difficulties, this system
has never been used operationally. However,
the hardware is designed to allow satellite
observations received by teletype to be fed
directly to the computer through electrical
circuits. The system input starts with the
sensor looking at the satellite and determining its position in azimuth, elevation, range,
range rate, and any other quantities it can
obtain. This information is transmitted over
100 wpm teletype circuits and received in the
Communications Center on a page printer
which provides a hard copy. If the format is
coded beginning with five $ signs, switching
action routes the message to the computer.
The switching unit is activated by the first
three $ signs and connects the incoming circuits to a tape punch. The remainder of the
message, which includes two $ signs, the text
and the clOSing signal, is punched on paper
tape and stored until ready to be transmitted
to the computer electrically. To end the
message, a signal of five right-hand parentheses are provided. Three of these activate
the switching unit to return it to its normal
pOSition of a local tape punch in the C omm
Center. Messages not in this coded format
are received only as a hard copy and local
paper tape in the Comm Center. As mentioned
in the manual mode, these two items are
delivered to Data Conversion for processing
by hand. Upon command from the computer,
messages stored in the Comm Center's semiautomatic input equipment can be transmitted
through the DMNI and Real-Time System to
the Philco 2000. Twenty-four low-speed
teletype and three high-speed data link circuits can be connected to the DMNI at present.
To provide for storage of data during periods
when the 2000 is inoperative, a DMNI recorder
will be installed. This will consist of a 410
processor and a magnetic tape unit.
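The framing just described, with five $ signs opening a computer-bound message and five right-hand parentheses closing it, can be sketched as a simple scanner over the incoming character stream. This is an illustration only: the actual switching unit acted electromechanically on the first three framing characters, and the message text shown is invented.

```python
def route_messages(stream):
    """Split an incoming teletype character stream into computer-bound
    messages using the framing described above: a message begins with
    five '$' characters and ends with five ')' characters.  Traffic
    outside that framing is left for local hard-copy handling."""
    messages = []
    local = []
    i = 0
    while i < len(stream):
        if stream[i:i + 5] == "$$$$$":
            end = stream.find(")))))", i + 5)
            if end == -1:
                break                      # unterminated message: stop
            messages.append(stream[i + 5:end])
            i = end + 5                    # skip past the closing signal
        else:
            local.append(stream[i])
            i += 1
    return messages, "".join(local)

# Hypothetical traffic with one coded observation message embedded:
msgs, local = route_messages("ROUTINE TRAFFIC $$$$$OBS 1962 ALPHA 1)))))MORE")
```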
SATELLITE PROGRAM (A-1)
The A-1 program system (see Figure 2)
provides for the processing of nearly all of
the routine satellite data. It contains its own
executive and is completely independent of
Figure 2. Satellite Computer Program (A-1). [Flow diagram: Inputs -> Observation Conversion -> Report Tape -> Report Association -> Differential Correction -> New Elements (RA, i, w, e, To & P) -> Outputs to sensors, NORAD and other users. Outputs shown: equatorial crossing time and longitude, with a table to correct for any location; position of satellite for one sensor (azimuth, elevation, range and time).]
the Philco SYS. The inputs to this system
are from magnetic tape and consist of (1) the
70TTY IN tape which contains the manual
data provided from punched cards and (2) the
DMNI tape created through the semiautomatic mode of operation.
These two tapes are the input to the observation conversion (ORCON) routine, where
the observations are converted on a report
(R) tape to one common standard computer
format. The R tape becomes an input to the
report association (RASSN) routine. In
RASSN, the observations (reports) are compared against the entire satellite inventory
and are identified according to their association with the predicted satellite positions.
Those that meet very narrow limits established in the program are called fully associated reports (Ra's). Those that do not associate within this narrow tolerance, or that associate with more than one satellite, are called doubtfully associated (Rd). The third category contains those observations which do not associate with any objects in the satellite inventory and are called unassociated reports (Ru). Most of the unassociated reports are the results of electrical noise or
inaccurate observations. The SRADU tape
(sorted reports associated, doubtful and unassociated) is the input to the Simplified General Perturbation Differential Correction
(SGPDC) routine. Here the fully associated
observations are used to correct the elements
of each satellite.
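The three-way classification performed in RASSN can be sketched as follows. The sketch is one-dimensional and the tolerances, satellite names and positions are invented for illustration; the operational limits were multidimensional comparisons against predicted positions.

```python
def associate(report_pos, predictions, tight=0.5, loose=3.0):
    """Classify one converted observation against predicted satellite
    positions, in the spirit of the three RASSN categories."""
    tight_hits = [s for s, p in predictions.items()
                  if abs(report_pos - p) <= tight]
    loose_hits = [s for s, p in predictions.items()
                  if abs(report_pos - p) <= loose]
    if len(tight_hits) == 1:
        return "Ra"   # fully associated: one satellite, narrow limits
    if loose_hits:
        return "Rd"   # doubtful: several candidates, or only a loose match
    return "Ru"       # unassociated: matches nothing in the inventory

# A hypothetical three-satellite inventory of predicted positions:
preds = {"1962 Alpha": 10.0, "1962 Beta": 10.3, "1962 Gamma": 50.0}
```

Only the Ra reports go forward to correct the elements; the Ru residue is what the text attributes mostly to noise and bad observations.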
Elements are the variables required to
fully describe a satellite orbit. These consist of the right ascension, the inclination,
the argument of perigee, the eccentricity, the
time of epoch, and the period of the satellite.
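The six elements named above can be collected in a small record, and the period ties directly to the orbit's size through Kepler's third law; the worked example recovers the "minimum of about ninety minutes" cited earlier. The structure and constants are modern illustrations, not the A-1 program's representation.

```python
import math
from dataclasses import dataclass

MU_EARTH = 398_600.4  # km^3/s^2, Earth's gravitational parameter

@dataclass
class Elements:
    """The six quantities named in the text (angles in degrees)."""
    right_ascension: float      # of the ascending node
    inclination: float
    argument_of_perigee: float
    eccentricity: float
    epoch: float                # time of epoch
    period: float               # minutes

def semi_major_axis_km(period_min):
    """Kepler's third law: a = (mu * (P / 2*pi)^2)^(1/3)."""
    p_sec = period_min * 60.0
    return (MU_EARTH * (p_sec / (2 * math.pi)) ** 2) ** (1.0 / 3.0)

# A 90-minute period corresponds to a semi-major axis of roughly
# 6650 km, i.e. an orbit only a few hundred km above the Earth:
a = semi_major_axis_km(90.0)
```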
The outputs produced from the new elements
are forwarded to the sensor in order that they
may acquire the satellites on future passes.
In addition, outputs are forwarded to NORAD
and other users for tactical and scientific
purposes.
The primary outputs consist of the bulletin
and the look angle. The bulletin is a listing
of equatorial crossing times and longitude
for each revolution of the satellite. Also included with the bulletin is a table to correct
the equatorial crossing to any location in the
world. Look angles provide the position of
one satellite for one sensor. This information is presented in the form of azimuth, elevation, range, and time for the specific sensor
and for the specific satellite to be observed.
Other outputs from our system include periodic reports required by NORAD and the
other users of our information.
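A bulletin-style listing of crossing longitudes follows from the fact that the Earth rotates eastward beneath the orbit, so each successive crossing falls westward of the last by roughly the period times the Earth's rotation rate. The sketch below is illustrative only; the operational bulletin was computed from the corrected elements, and the starting longitude and period here are invented.

```python
def crossing_longitudes(first_lon_deg, period_min, revolutions):
    """Successive equatorial-crossing longitudes (degrees East,
    wrapped to [-180, 180)) for a bulletin-style listing.  The Earth
    turns ~0.2507 deg/min relative to the stars, so each revolution
    shifts the crossing westward by period * rate."""
    EARTH_RATE = 360.0 / 1436.07   # deg per minute (sidereal day)
    lons = []
    lon = first_lon_deg
    for _ in range(revolutions):
        lons.append(round(lon, 1))
        lon = (lon - period_min * EARTH_RATE + 180.0) % 360.0 - 180.0
    return lons

# A hypothetical 95-minute orbit whose first crossing is at 30 deg E
# drifts westward about 23.8 degrees per revolution:
lons = crossing_longitudes(30.0, 95.0, 3)
```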
In addition to the A-1 system, there are
about 20 other programs in use for satellite
computations. These were originally written
in FORTRAN and IBM machine language and
were later converted to ALTAC for use with
Philco SYS. These are now being converted
to TAC for attachment to the A-1 executive.
SPACETRACK CONTROL ROOM
The computer outputs are manually
checked for accuracy prior to release to the
tactical or scientific communities. This is
accomplished in our Spacetrack Control
Room.
Compared to the operational displays used for aircraft surveillance operations, the Spacetrack Control Room is quite an unspectacular facility. (See Figure 3.) To date, no dynamic
displays have been created which are entirely
satisfactory for satellite purposes. In order
to provide satellite data to CINCNORAD,
static menu boards and a random access
slide projector are used. Included are a
Figure 3. Space Scoreboards-Status boards
in North American Air Defense Command's
Space Detection and Tracking System Operations Control Center at Colorado Springs,
Colorado, display timely tracking information
on all man-made objects in earth orbit.
summary of the satellite population, detailed
information regarding the payload of each
launch, and sensor data. A world map shows
the location of the major sensors and centers
that collect observations and the connecting
communications circuits. These are displayed to NORAD thru a closed-circuit television system.
BMEWS BACKUP
In addition to the satellite mission, the
Philco 2000 has been programmed to provide
a backup function for the BMEWS Display Information Processor located in the NORAD
COC. The mission of BMEWS (Ballistic Missile Early Warning System) is to display to
NORAD impact and launch data on missiles
detected by the BMEWS which has radars at
Clear, Alaska, and Thule, Greenland.
BMEWS RADARS AND FUNCTIONS
Figure 4. BMEWS Data Flow. [Diagram: data from the BMEWS forward-site radars flows to the NORAD COC displays.]
The BMEWS site at Clear consists of several fixed fan type radars. The site at Thule
consists of a tracker radar in addition to several fixed fan radars. A future site being installed at Fylingdales, England, will contain
all tracker type radars. These "forward"
sites provide the coverage required to detect
hostile missiles that may be launched against
the North American Continent.
With computers located at these sites, predicted missile launch and impact points and
the time of impacts are computed. This information is provided to the computer facilities within the NORAD COC for generation of
displays.
BMEWS PROGRAM (B-1)
Data flows to the Philco 2000 from the forward site radars over high-speed data link
circuits and thru the DMNI and Real-Time
System. The B-1 program (see Figure 4)
processes this information for display in the
COC. Like the A-1 program, B-1 is completely independent of SYS. The B-1 program
converts the forward site information into
data suitable for drawing slides for a projection system and for driving numerical displays. The output is provided through the
DMNO (Device for Multiplexing Non-Synchronous Outputs) to four devices: the impact
and launch display decoder for projection
system, the threat summary panel, the DMNO
flexowriter for status data, and the remote
transmitters that provide information similar
to that displayed in the NORAD COC to the
Strategic Air Command and the Joint Chiefs
of Staff.
The BMEWS data are presented in the
Combat Operations Center (COC) on two projection systems and a numerical display. The
projection systems consist of two maps: one
is of the North American Continent on which
predicted impacts are displayed, and the other
is a polar projection of the European-Asian
Continent on which computed launch points
are indicated. Both launch and impact locations are represented by ellipses. Included
with each ellipse is a letter-number reference that identifies the forward site which
detected the missile and a serial number for
correlation between the two maps.
The Threat Summary Panel is a numerical
display. "Five-minute windows" provide a
measure of the size of the missile raid during the past five minutes. In another portion
of the panel are shown the total numbers, for
each site, of missiles detected and predicted
to impact on the North American Continent.
The time of next missile impact is displayed.
Lastly, an "Alarm Level" is produced which
is a combined measure of the missile raid
and summarizes the credence of the threat.
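The "five-minute window" amounts to a trailing count of detections, which can be sketched with a simple expiring queue. This is a modern illustration of the idea, not the panel's implementation; the timestamps are invented and given in seconds.

```python
from collections import deque

class FiveMinuteWindow:
    """Count missile detections in the trailing five minutes, in the
    spirit of the Threat Summary Panel's 'five-minute windows'."""
    def __init__(self, span_sec=300):
        self.span = span_sec
        self.events = deque()        # detection times, oldest first

    def record(self, t):
        """Log a detection at time t (seconds) and drop stale ones."""
        self.events.append(t)
        self._expire(t)

    def count(self, now):
        """Detections within the last span_sec seconds before 'now'."""
        self._expire(now)
        return len(self.events)

    def _expire(self, now):
        while self.events and self.events[0] <= now - self.span:
            self.events.popleft()

# Four hypothetical detections; by t=400 the ones at t=0 and t=60
# have aged out of the trailing window:
w = FiveMinuteWindow()
for t in (0, 60, 250, 400):
    w.record(t)
```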
FUTURE
In the near future the B-2 program will be
integrated into the system. It combines the
functions of both A-1 (for satellite) and B-1
(for BMEWS) under one executive. This will
allow for full-time backup of the DIP computer and for processing satellite data simultaneously.
Under study is hardware which will completely automate the satellite inputs and
outputs. The present manual teletype system
will be replaced by a computerized communications center.
CURRENT SATELLITE SITUATION
The satellites in orbit as of 24 September 1962 were as follows:

                                    USA    UK    USSR
Payloads in Earth Orbit              38     1       4
Debris in Earth Orbit              *185     0       4
Deep Space Probes
  (Heliocentric Orbit)                4     0       2
Total in Orbit                      227     1      10
Objects Decayed                     109            60

*Includes 61 Omicron, which exploded and produced 139 known objects.
LIST OF REVIEWERS
Mr. S. N. Alexander
Mr. James P. Anderson
Mr. W. L. Anderson
Miss Dorothy P. Armstrong
Dr. M. M. Astrahan
Mr. Robert C. Baron
Mr. W. D. Bartlett
Mr. R. S. Barton
Mr. Aaron Batchelor III
Mr. J. V. Batley
Mr. M. A. Belsky
Mr. Eric Bittman
Mr. Erich Bloch
Dr. Edward K. Blum
Mr. Theodore H. Bonn
Mr. Arthur Bridgman
Mr. Herbert S. Bright
Mr. Edward A. Brown
Mr. L. E. Brown
Mr. W. Brunner
Dr. Werner Buchholz
Mr. James H. Burrows
Mr. R. V. D. Campbell
Mr. B. F. Cheydleur
Mr. C. K. Chow
Mr. Robert H. Courtney
Mr. Robert P. Crago
Mr. L. Jack Craig
Dr. T. H. Crowley
Mr. James A. Cunningham
Miss Ruth M. Davis
Dr. Douglas C. Engelbart
Mr. Howard R. Fletcher
Dr. Ivan Flores
Miss Margaret R. Fox
Mr. R. F. Garrard
Mr. Ezra Glaser
Mr. Jack Goldberg
Mr. Geoffrey Gordon
Mr. Joseph K. Hawkins
Mr. George G. Heller
Mr. George Heller
Mr. W. H. Highleyman
Mr. S. A. Hoffman
Mr. Arthur Holt
Mr. E. Hopner
Dr. Grace Murray Hopper
Mr. Richard A. Hornweth
Mr. Paul W. Howerton
Mr. Morton A. Hyman
Mr. George T. Jacobi
Mr. Robert Jayson
Dr. Laveen Kanal
Mr. Herbert R. Koller
Dr. R. A. Kudlich
Mr. J. Kurtzberg
Mr. Michael R. Lackner
Dr. Herschel W. Leibowitz
Prof. C. T. Leondes
Mr. Harry Loberman
Mr. J. D. Madden
Miss Ethel C. Marden
Mr. Walter A. Marggraf
Dr. R. E. Meagher
Mr. Philip W. Metzger
Mr. Albert Meyerhoff
Dr. Robert C. Minnick
Mrs. Betty S. Mitchell
Mr. Ralph E. Mullendore
Mr. Simon M. Newman
Mr. J. P. Nigro
Mr. Glenn A. Oliver
Mr. James L. Owings
Mr. G. W. Petrie
Mr. C. A. Phillips
Dr. Arthur V. Pohm
Mr. Jack Raffel
Mr. M. J. Relis
Mr. A. E. Rogers
Mr. David Rosenblatt
Mr. Arthur I. Rubin
Dr. Morris Rubinoff
Mr. Bruce Rupp
Prof. Norman R. Scott
Mr. I. Seligsohn
Mr. Donald Seward
Mr. J. E. Sherman
Mr. William Shooman
Mr. R. A. Sibley
Mr. Q. W. Simkins
Mr. R. F. Stevens
Dr. Richard I. Tanaka
Mr. Lionel E. Thibodeau
Mr. Robert Tillman
Mr. R. A. Tracy
Mr. R. L. Van Horn
Mr. Kenneth W. Webb
Mr. Gerard P. Weeg, Prof.
Mr. Thomas J. Welch
Mr. Stanley Winkler
Mr. Hugh Winn
Mr. H. Witsenhausen
Mr. William W. Youden
1962 FALL JOINT COMPUTER CONFERENCE COMMITTEE
General Committee
J. Wesley Leas, Chairman
RCA Building 201-3
Camden 8, N. J.
E. Everet Minett, Vice-Chairman
Remington Rand Univac
Blue Bell, Pa.
T. T. Patterson, Secretary
RCA Building 13-2
Blue Bell, Pa.
Finance Committee
Herman A. Affel, Chairman
Auerbach Corporation
Phila. 3, Pa.
Arthur D. Hughes, Vice Chairman
Auerbach Corporation
Phila. 3, Pa.
Public Relations Committee
Thomas D. Anglim, Chairman
Remington Rand Univac
Blue Bell, Pa.
Joseph Hoffman
General Electric Co.
Missile & Space Division
Valley Forge, Pa.
Thomas I. Bradshaw
RCA Building 202-2
Camden 8, N. J.
E. C. Bill
Remington Rand Univac
Blue Bell, Pa.
Proceedings Committee
Joseph D. Chapline, Chairman
Philco Computer Division
Willow Grove, Pa.
Walter Grabowsky, Vice Chairman
Auerbach Corporation
Phila. 3, Pa.
Program Committee
E. Gary Clark, Chairman
Burroughs Corporation
Paoli, Pa.
Arnold Shafritz
Auerbach Corporation
Phila. 3, Pa.
Aaron Batchelor, Vice Chairman
Burroughs Corporation
Paoli, Pa.
Robert S. Barton
1981 E. Meadowbrook Rd.
Altadena, Cal.
T. H. Bonn
Remington Rand Univac
Blue Bell, Pa.
Dr. Stanley Winkler
IBM, Fed. System Div.
Rockville, Md.
B. F. Cheydleur
Philco Computer Division
Willow Grove, Pa.
James L. Owings
RCA Building 82-1
Camden, N. J.
Dr. Hugh Winn
General Electric Co.
Missile & Space Division
Valley Forge, Pa.
Arrangements Committee
Peter E. Raffa, Chairman
Technitrol, Inc.
Phila. 34, Pa.
Robert A. Hollinger
Vice Chairman
Technitrol, Inc.
Phila. 34, Pa.
William McBlain
Minneapolis-Honeywell
Regulator Co., Pottstown, Pa.
Ladies Activity Committee
Miss Mary Nagle, Chairman
RCA Building 204-2
Camden 8, N. J.
Mrs. John M. Bailey
1032 Wayne Rd.
Haddonfield, N. J.
Mrs. John W. Mauchly, Vice-Chairman
Mauchly Associates
Fort Washington, Pa.
Miss Josephine Schiazza
RCA Building 204-2
Camden 8, N. J.
Printing and Mailing Committee
Norman A. Miller, Chairman
Remington Rand Univac
Blue Bell, Pa.
John Coston
Remington Rand Univac
Blue Bell, Pa.
Mrs. Ethel Levinson
Remington Rand Univac
Blue Bell, Pa.
Registration Committee
Louis F. Cimino, Chairman
General Electric Co.
Missile & Space Div.
Valley Forge, Pa.
Richard D. Burke
Vice-Chairman
IBM Corp.
Phila. 2, Pa.
John Schafer
General Electric Co.
Missile & Space Div.
Valley Forge, Pa.
Miss Eleanor Gardosh
General Electric Co.
Missile & Space Div.
Valley Forge, Pa.
Jack Armstrong
General Electric Co.
Missile & Space Div.
Valley Forge, Pa.
Miss Liz Gunson
IBM Corp.
230 S. 15th St.
Phila. 2, Pa.
Sol Steingard
General Electric Co.
Missile & Space Div.
Valley Forge, Pa.
Exhibits Committee
R. A. C. Lane, Chairman
RCA Building 204-1
Camden 8, N. J.
Lowell Bensky
Rese Engineering
A & Courtland St.
Phila., Pa.
W. P. Hogan, Vice Chairman
Leeds & Northrup
North Wales, Pa.
Special Events Committee
Herbert S. Bright, Chairman
Philco Computer Division
Willow Grove, Pa.
Dr. Louis R. Lavine
Philco Computer Division
Willow Grove, Pa.
B. F. Cheydleur, Vice Chairman
Philco Computer Division
Willow Grove, Pa.
R. Paul Chinitz
Remington Rand Univac
Blue Bell, Pa.
Edward H. Nutter
Philco Computer Division
Willow Grove, Pa.
John W. Mauchly
Mauchly Associates
Fort Washington, Pa.
Daniel Ashler
Auerbach Corporation
1634 Arch Street
Phila. 3, Pa.
Dr. Morris Rubinoff
Moore School of Electrical Engrg.
University of Pennsylvania
Phila., Pa.
Harry Bortz
IBM Corp.
230 S. 15th St.
Phila. 2, Pa.
Mrs. Margery League
Remington Rand Univac
Blue Bell, Pa.
Administration Committee
T. T. Patterson, Chairman
RCA Building 13-2
Camden 2, N. J.
John P. Brennan, Jr., Vice Chairman
RCA Building 82-1
Camden, N. J.
Technical Advisor
Dr. Morris Rubinoff
Moore School of Electrical Engineering
University of Pennsylvania
Philadelphia 4, Pa.
AMERICAN FEDERATION OF INFORMATION
PROCESSING SOCIETIES (AFIPS)
AFIPS, P. O. Box 1196, Santa Monica, California
Chairman
Dr. Willis H. Ware
The RAND Corporation
1700 Main Street
Santa Monica, Calif.
Executive Committee
Dr. Arnold A. Cohen, IRE
Dr. Harry D. Huskey, ACM
Dr. Morris Rubinoff, AIEE
Secretary
Miss Margaret R. Fox
National Bureau of Standards
Data Processing Systems Div.
Washington 25, D. C.
Treasurer
Mr. Frank E. Heart
Lincoln Laboratory
P. O. Box 73
Lexington 73, Mass.
AIEE Directors
ACM Directors
IRE Directors
Mr. G. L. Hollander
Hollander Associates
P. O. Box 2276
Fullerton, Calif.
Mr. H. S. Bright
Secretary, ACM
Philco Computer Division
Willow Grove, Pa.
Mr. W. L. Anderson
General Kinetics, Inc.
2611 Shirlington Road
Arlington 6, Va.
Mr. C. A. R. Kagan
Western Electric Co.
P. O. Box 900
Princeton, N. J.
Mr. W. M. Carlson
E. I. duPont deNemours & Co.
Mechanical Research Lab.
101 Beech St.
Wilmington 98, Del.
Dr. Werner Buchholz
IBM ·Development Lab.
P. O. Box 390
Poughkeepsie, N. Y.
Mr. H. T. Marcy
IBM Corporation
1000 Westchester Ave.
White Plains, N. Y.
Dr. H. D. Huskey
Computer Center
University of Calif.
Berkeley 4, Calif.
Dr. Arnold A. Cohen
Remington Rand Univac
Univac Park
St. Paul 16, Minn.
Dr. Morris Rubinoff
Moore School of Elec. Engr.
200 South 33rd St.
Philadelphia 4, Pa.
Mr. J. D. Madden
System Development Corp.
2500 Colorado Ave.
Santa Monica, Calif.
Mr. Frank E. Heart
Lincoln Laboratory
P. O. Box 73
Lexington, Mass.
AFIPS Representative to IFIP
Mr. I. L. Auerbach
Auerbach Corporation
1634 Arch Street
Philadelphia 3, Pa.
Simulation Council Observer
Mr. J. E. Sherman
Simulation Council, Inc.
Sunnyvale, Calif.
STANDING COMMITTEE CHAIRMEN
Finance
Dr. R. R. Johnson
General Electric Co.
P. O. Drawer 270
Phoenix, Ariz.
Planning
Dr. Morris Rubinoff
Moore School of Elec. Engrg.
200 South 33rd St.
Philadelphia 4, Pa.
Admissions
Dr. Bruce Gilchrist
IBM Corporation
590 Madison Ave.
New York 22, N. Y.