CONFERENCE
PROCEEDINGS
VOLUME 22

FALL JOINT
COMPUTER
CONFERENCE

SPARTAN BOOKS
6411 Chillum Place, N. W. • Washington 12, D. C.


List of Joint Computer Conferences
1. 1951 Joint AIEE-IRE Computer Conference,
Philadelphia, December 1951
2. 1952 Joint AIEE-IRE-ACM Computer Conference, New York, December 1952
3. 1953 Western Computer Conference, Los
Angeles, February 1953
4. 1953 Eastern Joint Computer Conference,
Washington, December 1953
5. 1954 Western Computer Conference, Los
Angeles, February 1954
6. 1954 Eastern Joint Computer Conference,
Philadelphia, December 1954
7. 1955 Western Joint Computer Conference,
Los Angeles, March 1955
8. 1955 Eastern Joint Computer Conference,
Boston, November 1955
9. 1956 Western Joint Computer Conference, San
Francisco, February 1956
10. 1956 Eastern Joint Computer Conference, New
York, December 1956
11. 1957 Western Joint Computer Conference, Los
Angeles, February 1957

12. 1957 Eastern Joint Computer Conference,
Washington, December 1957
13. 1958 Western Joint Computer Conference,
Los Angeles, May 1958
14. 1958 Eastern Joint Computer Conference,
Philadelphia, December 1958
15. 1959 Western Joint Computer Conference,
San Francisco, March 1959
16. 1959 Eastern Joint Computer Conference,
Boston, December 1959
17. 1960 Western Joint Computer Conference,
San Francisco, May 1960
18. 1960 Eastern Joint Computer Conference,
New York, December 1960
19. 1961 Western Joint Computer Conference,
Los Angeles, May 1961
20. 1961 Eastern Joint Computer Conference,
Washington, December 1961
21. 1962 Spring Joint Computer Conference, San
Francisco, May 1962
22. 1962 Fall Joint Computer Conference, Philadelphia, December 1962

Conferences 1 to 19 were sponsored by the National Joint Computer Committee,
predecessor of AFIPS. Back copies of the proceedings of these conferences may
be obtained, if available, from:
• Association for Computing Machinery, 14 E. 69th St., New York 21, N. Y.
• American Institute of Electrical Engineers, 345 E. 47th St., New York 17, N. Y.
• Institute of Radio Engineers, 1 E. 79th St., New York 21, N. Y.

Conferences 20 and up are sponsored by AFIPS. Copies of AFIPS Conference
Proceedings may be ordered from the publishers as available at the prices indicated below. Members of societies affiliated with AFIPS may obtain copies
at the special "Member Price" shown.
Volume   List Price   Member Price   Publisher
20       $12.00       $7.20          Macmillan Co., 60 Fifth Ave., New York 11, N. Y.
21       $6.00        $6.00          National Press, 850 Hansen Way, Palo Alto, Calif.
22       $8.00        $4.00          Spartan Books, 6411 Chillum Place, NW, Washington 12, D. C.

The ideas and opinions expressed herein are solely
those of the authors and are not necessarily representative of or endorsed by the 1962 Fall Joint Computer Conference Committee or the American Federation of Information Processing Societies.
Library of Congress Catalog Card Number: 55-44701
Copyright © 1962 by American Federation of Information Processing Societies,
P.O. Box 1196, Santa Monica, California. Printed in the United States of
America. All rights reserved. This book, or parts thereof, may not be reproduced in any form without permission of the publishers.
Manufactured by McGregor & Werner, Inc.
Washington, D. C.

CONTENTS
Page

v     Preface
1     Processing Satellite Weather Data - A Status Report Part I
19    Processing Satellite Weather Data - A Status Report Part II
27    Design of A Photo Interpretation Automaton
36    Experience with Hybrid Computation
44    Data Handling at an AMR Tracking Station
56    Information Processing for Interplanetary Exploration
71    EDP As A National Resource
73    Planning the 3600
86    D825 - A Multiple-Computer System for Command & Control
97    The Solomon Computer
108   The KDF.9 Computer System
121   A Common Language for Hardware, Software, and Applications
130   Intercommunicating Cells, Basis for a Distributed Logic Computer
137   On the Use of the Solomon Parallel-Processing Computer
147   Data Processing for Communication Network Monitoring and Control
154   Design of ITT 525 "Vade" Real-Time Processor

iii

v

Charles L. Bristor

1

Laurence I. Miller

19

W. S. Holmes
H. R. Leland
G. E. Richmond
E. M. King
R. Gelman
K. M. Hoglund
P. L. Phipps
E. J. Block
R. A. Schnaith
J. A. Young
T. B. Steel, Jr.

27

Charles T. Casale
James P. Anderson
Samuel A. Hoffman
Joseph Shifman
Robert J. Williams
Daniel L. Slotnick
W. Carl Borck
Robert C. McReynolds
A. C. D. Haley
Kenneth E. Iverson

36
44

56
71
73
86

97
108
121

C. Y. Lee

130

J. R. Ball
R. C. Bollinger
T. A. Jeeves
R. C. McReynolds
D. H. Shaffer
D. I. Caplan

137

Dr. D. R. Helman
E. E. Barrett
R. Hayum
F. O. Williams

154

147

Page

161   On the Reduction of Turnaround Time
170   Remote Operation of a Computer by High Speed Data Link
177   Standardization in Computers and Information Processing
184   High-Speed Ferrite Memories
197   Microaperture High-Speed Ferrite Memory
213   Magnetic Films - Revolution in Computer Memories
225   Hurry, Hurry, Hurry
229   The Case for Cryotronics?
232   Cryotronics - Problems and Promise
234   Some Experiments in the Generation of Word and Document Associations
251   A Logic Design Translator
262   Comprotein: A Computer Program to Aid Primary Protein Structure Determination
275   Using Gifs in the Analysis and Design of Process Systems
280   A Data Communications and Processing System for Cardiac Analysis
285   Cluster Formation and Diagnostic Significance in Psychiatric Symptom Evaluation
304   Spacetracking Man-Made Satellites and Debris
310   List of Reviewers
311   1962 Fall Joint Computer Conference Committee
313   American Federation of Information Processing Societies (AFIPS)

iv

H. S. Bright
B. F. Cheydleur
G. L. Baldwin
N. E. Snow
C. A. Phillips
R. E. Utman
H. Amemiya
H. P. Lemaire
R. L. Pryor
T. R. Mayhew
R. Shahbender
T. Nelson
R. Lochinger
J. Walentine
C. Chong
G. Fedde
Howard Campaigne
W. B. Ittner, III
Martin L. Cohen
Gerard Salton

161
170
177
184

197

213
225
229
232
234

D. F. Gorman
J. P. Anderson
Margaret Oakley Dayhoff
Robert S. Ledley
William H. Dodrill

251

M. D. Balkovic
C. A. Steinberg
P. C. Pfunke
C. A. Caceres
Gilbert Kaskey
Paruchuri R. Krishnaiah
Anthony Azzari
Robert W. Waltz
B. M. Jackson

280

262
275

285
304
310
311
313

PREFACE
The theme of the 1962 Fall Joint Computer Conference is Computers in the Space Age. Today there is a two-way street in which
computing equipment has contributed vitally to the success of space age
technology, but the space-age demands have had their major effects on
the design of computers. Of these we can readily discern three outstanding results: (1) development of more efficient interfacing between man and machine, (2) radical reduction of the size of systems,
and (3) the maturing of the theory and implementation of cooperative
systems, including multi-point operating complexes.
Naturally these achievements are irrevocably to be reflected in
the stationary equipment that benefits business and science. We already know that for the purposes of the Space Age, computing equipment is to provide facility for command-decision and for control of a
new order of complexity. But we are just becoming aware of the products of this progress. The social implications of advances in the precise selection of information via recursive interplay between man and
machine, though barely perceptible at the present time, are rapidly
assuming major influence on the structure of the near future.
Altogether, the interaction of the space age and computer technologies has brought about a rich growth in new and potent national resources. Indeed, the record of the United States in the field of information and data processing systems is pre-eminent in the present
world. It is helping therefore very directly to give us pre-eminence in
space.

J. Wesley Leas
Chairman
1962 Fall Joint Computer Conference

v

PROCESSING SATELLITE WEATHER DATA - A STATUS REPORT - PART I
Charles L. Bristor
U. S. Weather Bureau
Washington, D. C.
SUMMARY

Less than 500 radiosonde observations
are available for the current twice daily
three dimensional weather analysis over the
Northern Hemisphere, a coverage far less
than is required for short term advices and
for input to numerical prediction computations. Global observations from operational
satellites as a complement to existing data
networks show promise of filling this need.
TIROS computer programs now being used
for production of perspective geographic locator grids for cloud photos, and other programs being used to calibrate, edit, locate
and map infrared radiation sensor measurements, have provided a background of experience and have indicated the potentialities
of a more automated satellite data processing
system. The tremendous volume of data expected from the Nimbus weather satellite
indicates the need for automatic data processing. Each pass around the earth will
produce ninety-nine high resolution cloud
pictures covering about ten percent of the
earth from pole to pole and infrared sensors
will provide lower resolution information but
on a similar global basis. Indications are
that machine processing of the 280-odd million
binary bits of data from each orbit can materially reduce the human work load in producing analyzed products for real time use. The
main programming packages in support of the
presently developing automatic data processing systems are explained under ingestive,
digestive, and productive headings. Tasks
under these headings are explained for both
the photo and infrared data. The individual
program modules and subroutines are discussed further in an appendix. Reference is
made to the second part of this report which
expands on the logical design of the digital
and non-digital data handling system complex and extends the discussion into data
rates, command and control concepts and the
executive program which manages the overall process.

INTRODUCTION
The need for more meteorological data is
an old refrain which is almost constantly
being revived. Why do we always desire
more data? Among the many very good answers to this question are some which are
pertinent to the subject of meteorological
satellites. A most generalized answer might
be expressed in two parts:
1. because as the scope of human activities increases, new applications of weather
information arise and new needs for meteorological advice are generated and
2. because potential economic gains provide a tremendous impetus for attempting to
improve the quality and scope of our present
weather services.
Within the category of the first answer
one may cite the expansion of global air
travel over routes that are practically devoid of weather observations of any kind and

1

2 / Processing Satellite Weather Data - A Status Report - Part I
the similar deployment of air and sea defense forces to remote areas. Even the man
in space program is generating a need for
global weather information. In the thirties
and into World War II a marked expansion of
weather observing networks took place, mainly through expansion of weather communications to communities where observation facilities could be installed. Because
of communications and logistics costs, this
type of expansion cannot take place indefinitely to fulfill the ever growing need for detailed observations on a global scale. However, within the scope of the first answer,
such a global network would be extremely
valuable merely as a means of providing current weather information and very short term
warnings and advisories.
Beyond immediate operational advice is
the need implied by the second answer: the
problem of weather prediction. The American Meteorological Society (1962) has recently restated its estimate of current skills
in weather forecasting.
" ... For periods extending to about 72
hours, weather forecasts of moderate
skill and usefulness are possible. Within
this interval, useful predictions of general trends and weather changes can be
made ... "
Few would deny the economic importance
and increased application of more precise
3-day forecasts.
Since the mid-fifties numerical weather
prediction has had a significant influence on
the level of skill in weather forecasting generally. The method involves a mathematical
description of the atmosphere in three dimensions utilizing the hydrodynamic equations of motion and the laws of thermodynamics. The partial differential equations of
such a "model" are arranged in a prognostic
mode such that only time dependent partials
remain on the left side. The finite difference
version of such an equation set is then integrated in short time steps to produce prognostic images of the various data fields which
served to describe the initial state of the
fluid. Phillips (1960) has summarized the
current view which delimits the potential of
numerical weather prediction, to the extent
that lack of observations prevents adequate
description of the atmosphere on a global
basis. Figure 1 indicates the present network of observing stations which provide the
current three dimensional description of the

atmosphere together with a grid overlay indicating intersections at which information
is required concerning the current state of
the fluid in order that the finite difference
equations may be integrated. Obviously a
poorly distributed collection of less than 500
observations can not adequately establish
values for nearly two thousand grid points.
Areas the size of the United States are indicated without any upper air soundings whatsoever. The situation in the Southern Hemisphere is much worse.
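
The time-stepping idea described above can be illustrated with a minimal sketch. It is not the Weather Bureau's model: it assumes a single one-dimensional advection equation, du/dt = -c du/dx, on a periodic grid, differenced with a first-order upwind scheme. The names `step`, `integrate`, `c`, `dx`, and `dt` are illustrative and do not come from the paper.

```python
def step(u, c, dx, dt):
    """Advance u one time step with a first-order upwind difference (c > 0)."""
    n = len(u)
    # u[i - 1] wraps around for i = 0, which makes the domain periodic.
    return [u[i] - c * dt / dx * (u[i] - u[i - 1]) for i in range(n)]


def integrate(u0, c, dx, dt, n_steps):
    """Repeat short time steps to carry the initial field u0 forward."""
    u = list(u0)
    for _ in range(n_steps):
        u = step(u, c, dx, dt)
    return u


# Initial state: a single disturbance on a 10-point grid.
initial = [0.0] * 10
initial[2] = 1.0
# With c * dt / dx == 1 this scheme moves the field one grid point per step.
forecast = integrate(initial, c=1.0, dx=1.0, dt=1.0, n_steps=3)
```

The sketch also shows why initial values are needed at every grid point: each step reads the neighboring value, so a gap in the initial field spreads through the forecast.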
This brief discussion of the meteorological data problem points up the need for a
detailed global observational network and
offers the real challenge to meteorological
satellites. Can indirect sensing via satellite
fill the need for global weather data? D. S.
Johnson (1962) has summarized the meteorological measurements carried out thus far
by satellites and discussed others planned
and suggested for the future.
Indications are that, whereas satellite observations will likely never supplant other
data networks, they hold great promise in
providing complementary data on a truly
global basis. Limited experience with satellite weather data already obtained is very
encouraging.
The following is a description of current
efforts in processing the ever growing volume
of this data. First, limited computer processing of TIROS data is discussed. The latter portion of this report and the second
paper in this two part series describe in
some detail the current status of computer
programming in support of the truly automated real time data processing systems
under construction for the Nimbus satellite
system.
EXPERIENCES WITH TIROS
Since April, 1960 cloud photos from TIROS
satellites have been made available to the
meteorological community on an intermittent
operational basis. Details of the satellites'
construction including its slow scan cameras
have been given elsewhere along with an account of certain difficulties in geographically
locating the cloud photos because of meandering in the spin axis (NASA - USWB, 1960). A
cloud photo sample is presented in Figure 2.
Even without a meteorological background,
one would likely concede, on the basis of intuition, that such cloud patterns could provide

Proceedings-Fall Joint Computer Conference, 1962 / 3

Figure 1. Northern Hemisphere map showing upper air reporting stations and computation grid used in objective weather map analysis and numerical prediction. The
Weather Bureau's National Meteorological Center uses a somewhat denser grid of
more than 2300 points. Less than 500 of these reports are routinely available for
specification of quantities at the grid points.

valuable observational evidence concerning
the state of the atmosphere. A considerable
research effort is now going on in an effort
to extract quantitative information from such
images (NASA - USWB, 1961). For the present, computer processing has been confined
largely to the production of geographic locator grids as an aid to further interpretation
of the cloud patterns. The locator grid superimposed on the picture in Figure 2 and the
sample grids shown in Figure 3 are produced
at a rate of 10 seconds per grid on the IBM
7090 (Frankel & Bristor, 1962). Line drawn
output is produced on an Electronic Associates Data Plotter or, alternately, by General
Dynamics High Speed Microfilm Recorder.

Input for each grid includes latitude and longitude of the sub-satellite point, altitude of
the satellite as well as azimuth and nadir
and spin angles which describe the attitude
and radial position of the camera with respect to the earth. An auxiliary program is
required for the production of image to object
ray distortion tables. These tables correct
for symmetric and asymmetric distortions
due to the lens and the electronics of the
system and are produced from pre-launch
calibration target photos taken through the
entire camera system. An additional feature
of the gridding program is the large dictionary of coastline locations from which transformations to the perspective of the image

Figure 2. Sample cloud picture with perspective geographic locator grid. This photo,
taken by TIROS III, shows hurricane Anna
near 12°N, 64°W (lower left) on July 20, 1961
together with large streamers projecting toward another vortex pattern to the east (right).

are made as an aid in mating the cloud image
and grid. Some 10,000 such grids have been
produced thus far for selected cloud photos
taken by TIROS I and TIROS III and are available in an archive, along with the pictures,
for research applications. A somewhat less
detailed but similar gridding procedure is
being utilized on a smaller Bendix G-15
computer at the TIROS readout sites for the
current real time hand processing of the
picture data (Dean, 1961). A typical example
of such a nephanalysis (cloud chart) composed
from a group of photos is shown in Figure 4.
Features from the several images are replotted in outline form or reduced to symbolic
form on a standard map base for facsimile
transmission to the weather analysts and
forecasters.
Starting with TIROS II in November, 1960,
infrared sensors have furnished experimental
radiation measurements in five selected
wavelength intervals (NASA - USWB, 1961
and Bandeen, 1962). Although these data
have not been available in real time, an extensive 7090 program has been produced for
their reduction to a usable form. The IR information has been utilized in a quantitative
manner in several research studies. Fritz

and Winston (1962) have demonstrated its
usefulness in cloud top determinations and
Winston and Rao (1962) have used it in connection with energy transformation investigations on the planetary scale.
The data reduction program accepts raw
digitized sensor values read out from the
satellite, rejects space viewed samples, converts the earth viewed responses to proper
physical quantities through a calibration procedure and finally combines the data with
orbit and attitude information to create a
final meteorological radiation tape (FMRT).
Data from one orbit is thus reduced to an
archivable file on magnetic tape by the 7090
in less than twenty minutes. This tape becomes the data source for other programs
which have been produced for the purpose of
mapping selected samples of such data on
standard maps for use with other meteorological charts. A sample is shown in Figure 5.
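
The reduction steps described above (reject space-viewed samples, calibrate, attach earth location) can be sketched as a small pipeline. This is an illustration only, not the 7090 program: the record layout, the `SPACE_THRESHOLD` cutoff, and the linear calibration coefficients `GAIN` and `OFFSET` are all assumptions, since the paper does not give them.

```python
from dataclasses import dataclass

SPACE_THRESHOLD = 5       # assumed: raw counts at or below this are space views
GAIN, OFFSET = 0.5, -2.0  # assumed linear calibration: value = GAIN * raw + OFFSET


@dataclass
class Located:
    lat: float
    lon: float
    value: float  # calibrated physical quantity


def reduce_orbit(raw_counts, locations):
    """Reject space-viewed samples, calibrate the rest, attach earth location.

    raw_counts: digitized sensor values, one per sample.
    locations: (lat, lon) per sample, standing in for orbit and attitude data.
    """
    records = []
    for count, (lat, lon) in zip(raw_counts, locations):
        if count <= SPACE_THRESHOLD:
            continue  # space-viewed sample: discard
        records.append(Located(lat, lon, GAIN * count + OFFSET))
    return records


fmrt = reduce_orbit([3, 40, 100, 0], [(0, 0), (10, 20), (11, 21), (0, 0)])
```

In this toy input two of the four samples are treated as space views and dropped; the remaining two become located, calibrated records.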
The above discussion indicates the nature
of the data obtained thus far by meteorological satellites and the kinds of computer support provided. Experience gained in programming the earth location of sensor
measurements obtained from satellites, the
conversion to standard maps, the calibration
and logical sorting of raw data and the experience gained with distortion and attitude
programs have all provided background for
programs now being produced for direct application in an automatic system. Meanwhile
research with TIROS data is suggesting new
uses which are likely to lead to a requirement for more kinds of products and interpretations. Experience from past efforts is
thereby supporting present efforts in developing an automated, real time system for the
processing of global data coverage which
will be coming from the Nimbus satellite
series.
THE NIMBUS DATA PROCESSING TASK
The Nimbus satellite represents a significant advancement over TIROS as an operational satellite. The spacecraft system
(Stampfl and Press, 1962) provides more
camera coverage of higher resolution, and
earth stabilization assures maximum photo
coverage. One downward and two oblique
looking cameras will view a broad strip of
the earth athwart the vehicle's path as shown
in Figure 6. The three views overlap slightly.

Figure 3. TIROS grids with familiar coastline features. The set of digits bracketing
a central intersection indicate the latitude (left-hand number, plus for North) and longitude (right-hand, plus for East) of that point. A zero is plotted along the meridian
at the next intersection to the South. Legend in the lower right indicates orbit and
frame number for the matching photo (top line, from left) as well as readout station,
mode (taped or direct), and camera (single digits, from left). Horizon arc is indicated beyond the truncated grid pattern at the top where appropriate.

The extremely foreshortened region near the
horizon is not viewed. Thirty-three such
photo clusters will be obtained from each
pass around the earth. Considerable overlap
in the wings is obtained from cluster to cluster as shown in Figure 7. The near polar
orbit will assure global coverage daily.
Overlap from orbit to orbit is minimal at the
equator but is very great near the poles (Figure 8). During the polar summer one would
expect to see a view such as is covered in

Figure 9 on every orbit. The slight inclination of the orbit in a retro sense (injection
into orbit with a westerly direction component) will provide controlled illumination for
the pictures in that local sun time will remain unchanged from orbit to orbit. Each
slow scan TV camera (1" diameter Vidicon
tube) contains 833 lines of picture information giving a maximum image resolution of
about 1/2 mile when looking directly downward from a nominal orbit of 500 nautical

miles. Such a picture will thus contain nearly
700,000 picture elements. If each of these
scan spots is converted on a 16 segment gray
scale into a 4 bit binary number, then the 99
pictures obtained from each 108 minute orbit
will produce almost 275 million bits.
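
The arithmetic behind these figures can be checked with a short sketch. The only assumption beyond the text is that each of the 833 scan lines carries 833 scan spots (a square raster), which reproduces the "nearly 700,000" figure. The variable names are illustrative.

```python
LINES_PER_PICTURE = 833   # scan lines per Vidicon picture (from the text)
BITS_PER_ELEMENT = 4      # 16-level gray scale needs 4 bits
PICTURES_PER_ORBIT = 99   # 33 clusters of 3 cameras (from the text)

# Assumption: as many scan spots per line as there are lines.
elements_per_picture = LINES_PER_PICTURE ** 2
picture_bits_per_orbit = (
    elements_per_picture * BITS_PER_ELEMENT * PICTURES_PER_ORBIT
)

print(elements_per_picture)    # 693889
print(picture_bits_per_orbit)  # 274780044
```

Under that assumption the per-orbit picture volume comes to about 274.8 million bits, in line with the "almost 275 million" quoted above.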
Scanning radiometers will provide IR information as does TIROS but again will obtain optimum scans from horizon to horizon
athwart the vehicle's track. One narrow
angle high resolution sensor (HRIR) will respond in a water vapor "window" portion of
the infrared spectrum and will effectively
provide cloud top temperatures or, in cloudless areas, surface temperatures. A mosaic

of such scans on the dark portion of each
pass will provide a night time cloud cover
picture from pole to pole.
The first such HRIR sensor with a 0.5 degree viewing cone will provide maximum
resolution of about 5 nautical miles. Since
the earth will be viewed about one third of
each scan revolution, 240 non-overlapping
measurements can be obtained from each
scan. Approximately 2800 non-overlapping
scan swaths will be required to cover the
dark half of the orbit. Since these sensors
have a wider usable response range, each
scan spot will occupy a 7 bit binary number.
The HRIR response from each orbit will

Figure 4. Nephanalyses (cloud charts) prepared by TIROS readout station meteorologists.
Features of the cloud patterns from two successive orbits are extracted in outline form and
placed on a standard polar stereographic map base for facsimile transmission to weather
analysts and forecasters. Vortex centers are located along with other distinctive features.

Figure 5. TIROS II infrared analysis. Part of the 8-12 micron water vapor "window" data
read out on orbit 578 has been summarized in grid squares on a polar stereographic map base.
Radiation coming essentially from cloud tops or from the surface is expressed in watts per
square meter.

therefore contain more than 4.7 million
bits.
Another 5 channel medium resolution infrared scanner (MRIR) will provide additional
information throughout each orbit. The five
degree view of the MRIR sensors will provide about 42 separate earth measurements
per scan revolution from each channel. Approximately 700 non-overlapping scans are
required for a full orbit so that (again using
7 bits per measurement) the MRIR response
from each orbit will contain more than 1
million bits.
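
The infrared volumes quoted above follow from the stated scan counts, as the following sketch shows. All inputs are taken from the text except the picture total, which rests on the assumption of a square 833 by 833 raster. The variable names are illustrative.

```python
BITS_PER_MEASUREMENT = 7

# HRIR: 240 measurements per scan, about 2800 scans over the dark half-orbit.
hrir_bits = 240 * 2800 * BITS_PER_MEASUREMENT

# MRIR: 5 channels, about 42 measurements per scan per channel, about 700 scans.
mrir_bits = 5 * 42 * 700 * BITS_PER_MEASUREMENT

# Picture data: 99 pictures of 833 lines at 4 bits per element,
# assuming 833 scan spots per line.
picture_bits = 833 * 833 * 4 * 99

total_bits = picture_bits + hrir_bits + mrir_bits

print(hrir_bits)   # 4704000
print(mrir_bits)   # 1029000
print(total_bits)  # 280513044
```

The total of roughly 280.5 million bits per orbit agrees with the "280-odd million" figure given in the Summary.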
The volume of information expected from
each pass is indeed impressive especially
when one realizes that this information is to
come night and day on a continuous basis for
immediate real time utilization. A marked
increase in the present number of TIROS

data analysts and helpers is indicated for
Nimbus data processing if present semihand procedures continue. With plans for
higher resolution sensors of increasing variety, automatic processing of satellite weather
data is becoming a necessity.
STATUS REPORT ON NIMBUS DATA
PROCESSING PROGRAM
The automatic data processing system
under construction will be located at the
Weather Bureau's National Weather Satellite
Center (NWSC), Washington, D. C. and will
receive its input data from the command and
data acquisition (CDA) facilities at Fairbanks, Alaska through multiple broad band
microwave communication facilities. The
system at NWSC contains a complex of

components in addition to the digital computers. A detailed explanation of the system
is beyond the scope of this report although a
brief description from the computer oriented
viewpoint is given in the second part. Let it
suffice at this point to say that the system is
evolutionary in design in that computations
will continue in support of semi-hand processing procedures. For this purpose the
system's IBM 7094 with attached 1401 will
be utilized to produce a picture gridding
tape. Information in the form of override
signals at specified Vidicon scan line and
scan spot numbers, when melded with the
analogue picture signals, will produce a
kinescope recording of the original cloud
photo with a super-imposed dotted line locator grid such as Figure 10. A small CDC
160A computer, interruptable by Vidicon
synch pulses, will synchronously meld the
digital information from the gridding tape
onto spare tracks of an analogue picture
tape. Other non-digital devices will then
combine the synchronized information on

this tape as it is fed into the kinescope
recorder.
The 7094 program is being produced essentially as an extension of present TIROS
programs. A simulated output of this program has provided check out facility for the
160A program which now awaits the unique
non-digital hardware complex for final checkout. A complex of supporting programs
is involved in this effort, as indicated in the
appendix which briefly describes each program module. This effort will permit a
TIROS type semi-hand processing of the
photo data but with hand melding of grid and
picture now automated.
The far greater task of the system involves duplication of the semi-hand processing by automatic means. In the beginning
these efforts must be experimental in that
application of the data is still exploratory.
Methods of presentation, quantities to be extracted from the basic data, the scale of
atmospheric phenomena to be described
(resolution) are all in exploratory stages. A

Figure 6. Perspective grids and mapped coverage of Nimbus camera cluster as seen from a
500 nautical mile orbit. The central camera looks directly downward at the sub-satellite
point. Side cameras are tilted 35 degrees to either side of the track.

Figure 7. Geographic coverage to come from Nimbus showing overlap between
adjacent three camera clusters.

major effort is underway to create a hierarchy of data processing programs to activate the system and produce a variety of
outputs in a flexible manner. These may be
grouped as ingestive, digestive, and productive.
The ingestive programs are more than
simple input routines in that some preprocessing of the data is accomplished. In
the case of picture data, the entire volume
mentioned previously is to be fed into storage in the computer. Some sorting is required before storage so that separate disk
files are created containing data from each

of the three cameras. As time permits,
other pre-processing activities will also be
accomplished. Light intensity signals over
the face of each photo require normalization
for angle of view before quantitative comparisons are valid. Also, for the same reason, solar aspect variation from equator to
pole must be removed.
In the case of the incoming MRIR and
HRIR data, the ingestive process is partly
one of data editing. By recognition of pulses
which provide knowledge of scanner shaft
angle, almost two thirds of the incoming data
which is non-earth viewing can be eliminated.

Other raw housekeeping input information
such as attitude error signals and sensor
environment temperatures must be unpacked
and translated through calibration in the
ingestive process before they can be used in
processing the meteorological data.
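
The editing of infrared data by scanner shaft angle can be sketched as a simple filter. This is a hedged illustration, not the ingestive program itself. It assumes each sample arrives tagged with a shaft angle in degrees, that 0 degrees points at nadir, and that the earth fills a window of 60 degrees to either side (about one third of a revolution, consistent with the figures given earlier). The name `earth_viewing` and the constant are hypothetical.

```python
EARTH_HALF_WIDTH_DEG = 60.0  # assumed: earth fills about one third of a revolution


def earth_viewing(samples):
    """Keep samples whose shaft angle lies within the assumed earth window.

    samples: iterable of (shaft_angle_deg, value); 0 degrees is taken as nadir.
    """
    kept = []
    for angle, value in samples:
        # Fold the angle into the range [-180, 180) around nadir.
        offset = (angle + 180.0) % 360.0 - 180.0
        if abs(offset) <= EARTH_HALF_WIDTH_DEG:
            kept.append((angle, value))
    return kept


# One revolution sampled every 10 degrees: 36 samples.
revolution = [(float(a), a) for a in range(0, 360, 10)]
kept = earth_viewing(revolution)
```

With these assumed numbers 13 of the 36 samples survive, so close to two thirds of the incoming stream is discarded before any further processing.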
Final checkout of these programs must
await activation of the complete hardware
complex since only limited simulation is
possible.
The digestive process takes the pertinent
incoming data and converts it to a meteorologically usable form. A major task is the

melding of this data with the orbit and attitude
information to geographically locate the sensor information elements. In the case of the
photos, part of this work is accomplished as
an adjunct to the earlier mentioned program
which produces the picture gridding tape. An
open lattice of points selected by scan row
and spot number are geographically located
within each image. From these location
"bench marks" the digestive program transforms the foreshortened, perspective photo
image into a rectified equivalent on a standard map base. Figure 11 is an experimental

Figure 8. Geographic coverage envelopes to come from Nimbus showing
overlap from orbit to orbit.

Figure 9. Sample perspective grid showing
the polar region to be viewed by Nimbus.

example. The rectified image appears on a
Mercator map projection, in one view as a
replotting of the original picture elements
only. It demonstrates the futility of extending this process into extremely foreshortened
image areas where a realistic rectified image
would consist largely of interpolated filler.

After this step the rectified images are
fitted together into a mosaic strip which is
then available as a product source.
The digestive infrared data program is
being patterned after that mentioned earlier
which has been produced for the processing
of TIROS radiation data. The calibrated and
earth located data will similarly provide a
product source through the archivable final
meteorological radiation tape.
Programs for production of usable output
material present the most problems. Full
resolution photo mosaics rectified to polar
stereographic or mercator maps are expected to find application over limited regions in connection with hurricane detection
and tracking, for example. For other broadscale analysis problems, products having
reduced resolution may be adequate. This
implies searching these images by machine,
editing and summarizing them as to percent
cloud cover, brightness and pattern. Some
interesting patterns are revealed in the
TIROS photos of Figure 12. Although, as
mentioned earlier, quantitative interpretations are only gradually emerging, the rings,
spirals and streets seen in these photos will
likely be subjects for identification through
pattern recognition techniques. Cloud heights,
provided indirectly from the IR data through

Figure 10. Cloud photo with melded grid (simulated): Original Hugo rocket photo (left)
looking toward Bermuda from Wallops Island, Virginia at 85 miles altitude and 100 scan
line digitization of the original picture (right) played back through a digital CRT (SC-4020)
with 15 unit gray scale produced by programmed time modulation. Certain picture elements have been replaced by grid signals before playback to produce latitude/longitude
lines.


Figure 11. Rectified cloud photo: Digitized
picture elements from figure 10 replotted on
a Mercator map base without filler (below)
and with filler (above) to produce a rectified
pictorial image.

cloud top temperatures, present an added
output. The MRIR package will yield other
derived products such as maps of the net
radiation flux. There is thus a family of derived products available from the digested
material. A variety of output equipment, including prototype cathode ray tube photo recording devices which are driven by digital
tapes and somewhat similar photo quality
facsimile machines, requires additional conditioning of the output products to suit the
formats specified.
The variety of production-type programs
is indicated in the appendix. It is likely
that not all such production varieties can be
produced in real time from the data received
on all passes of the satellite. The intent is
that these products will be available for experimental utilization and that variations and
modifications of those which prove to be
most useful will assume an operational role.
CONCLUSION
This has been a brief attempt to present
a background to the non-meteorologist explaining the need for more weather data, and

the present and likely future role of weather
satellites. The need for computers and automatic data processing is explained in terms
of the kinds of data involved. Computer support of semi-hand methods is discussed
along with current efforts toward a truly
automated effort for Nimbus satellite data.
As the variety of sensors and the volume of
such data increases, a maximum degree of
automatic processing and utilization of the
data is indicated.
The scope has been limited to the data
processing job as seen from the computer
programmer's viewpoint. Other groups within
the Goddard Space Flight Center of NASA,
the Weather Bureau's NWSC and their contractors have vital roles in the design, launch,
command and readout of the satellite and the
supplying of other important data in the form
of sensor calibration and orbital information
from tracking station data before the sensor
data can be rendered meteorologically useful.
Only scant mention has been made of the
entire data processing system. The second
paper in this series will give additional details of the digital and non-digital data processing machine complex-again from the
standpoint of the computer programmer.
The role of the computer as manager of the
process will be amplified in terms of command and control.
APPENDIX
The main program modules are listed
below together with some details concerning
each subroutine portion. The main section
of each program module is indicated by an
asterisk. Status of various portions is indicated as of September, 1962.
Executive Program
Details of the Executive Program are
provided as part of the text of Part 2 of this
paper.
Time-Attitude-Calibration Ingestion Program
*Time/Attitude Sort: Engineering housekeeping data on "A" channel including pitch,
roll and yaw attitude signals and certain
vehicle temperatures used in IR sensor calibration are transmitted as pulse code modulation (PCMA). Shutter times from the


Figure 12. Sample TIROS cloud patterns. Convective clouds over Lower California
(upper left) August 21, 1961. Classic hurricane symbol from cloud pattern of Hurricane Betsy (upper right) near 36°N, 59°W on September 8, 1961. Field of cellular
clouds (lower left) near 25°S, 100°E on July 31, 1961. Cirrus cloud streamers off
the Andes (lower right) passing eastward off the Argentine coast, August 3, 1961.

Advanced Vidicon Camera System (AVCST)
are sent in similar format on another channel.
This program will accept such information
and sort it from an intermixed input format.
PCMA Unpack and Monitor: Unpacks the
separate 7-bit raw count measurements and
translates selected quantities into meaningful
temperatures or angles. Items to be used
are examined for quality and format with
optional outputs for visual inspection.
PCMA Output: Organizes attitude, calibration temperature, and picture time information into tables and issues the information
in a form suitable for use by the main data
processing programs.

Time/Attitude Editor: Optionally accomplishes some of the above duties as required
in the event that this information is made
available in semi-processed form as a direct
digital message.
This section is in an active design status
awaiting final format of PCMA data and decision on items to be transmitted from Fairbanks, Alaska.
Picture Grid Tape and Rectification Program
Orbit: Based upon a specified time request, this subroutine supplies satellite altitude and latitude/longitude of the subsatellite
point. The information is generated as a
prediction based upon periodically updated
fundamental orbital elements which are supplied by the main NASA orbital determination through Minitrack data.
Picture Attitude: Converts pitch, roll and
yaw error signals into nadir and azimuth
angles of each camera's principal line and
also provides a radial displacement correction to the orientation of each raster.
Distortion: On the basis of prelaunch target photos, produces radial and tangential
distortion corrections for a pre-selected
family of image raster points so that, through
interpolation, any image X, Y point can be
expressed in terms of two component angles
in object space.
Geography: Provides a large catalog of
latitude/longitude points along all major
coastlines of the world. The subroutine provides ordered groups of such points in short
segments for quick selection. Such coastlines are optionally included with latitude/
longitude lines in grids melded to the photos.
*Grid Meld and Rectification Locator:
This is the main program segment. It includes the basic calculations which produce
latitude and longitude from an X, Y image
point. The subroutines above serve as input
support. The primary output is approximately 1000 latitude/longitude locations from
a pre-selected open lattice of image locations. These locations are available in table
form for later interpolative rectification of
the entire picture raster.
Grid Meld Output: For every sixth scan
line of each picture raster, the locations of
latitude/longitude line crossings are calculated. This information from one simultaneous three picture cluster is logically
combined into a set of binary tape records
containing a series of three-bit code groups
and nine-bit count groups which tell where
over-ride signals are to replace the picture
signal and produce a dotted line grid.
One such orbit routine has been produced
for TIROS. Revision awaits coordination with
NASA orbital computation group as to mathematical model to be used for Nimbus. Geographic coastline tables from TIROS have been
expanded to global coverage and are available
for Nimbus. Other portions are active.
Line Drawn Grid Program
*Grid Line Locator: A program similar
to the above but intended primarily for

emergency use. It computes X, Y image
points from pre-selected latitude/longitude
intersections.
Line Output: Generates a special format
tape for a model 3410 Electronic Associates
Data Plotter.
Cathode Ray Tube Grid Program
*CRT Grid Locator: Essentially a duplicate of the Grid Line Locator above.
CRT Output: Generates a special format
tape to guide the cathode ray tube beam to
produce grids recorded on microfilm from
devices such as the Stromberg Carlson Model
4020 Microfilm Recorder.
Both the line drawn and CRT grid programs have been completed as generalized
versions of TIROS packages and are being
used experimentally.
Digital Picture Ingestion

*Picture Sort: Digitized pictures arriving from the analogue to digital converter
through the external format control unit will
enter the computer in packed words. Each
36-bit word will contain 4-bit intensity measurements from nine consecutive scan spots
all from the same picture. A cyclic commutation intermixes such words from the three
cameras. This program sorts the information for output into separate files each containing information from only one camera.
The following subroutines support this task
and carry on added preprocessing functions.
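The packing and commutation just described can be sketched as follows. This is a minimal illustration, not the actual Picture Sort code. The bit order within a word (first scan spot in the most significant bits) and the camera order within the commutation cycle are assumptions, and all function names are hypothetical.

```python
from typing import Dict, List

BITS_PER_SPOT = 4      # 4-bit intensity per scan spot (from the text)
SPOTS_PER_WORD = 9     # nine spots fill one 36-bit word (from the text)
CAMERAS = 3            # three cameras, cyclically commutated


def unpack_word(word: int) -> List[int]:
    """Split one 36-bit word into nine 4-bit intensities.

    Assumes the first scan spot occupies the most significant bits.
    """
    spots = []
    for k in range(SPOTS_PER_WORD):
        shift = BITS_PER_SPOT * (SPOTS_PER_WORD - 1 - k)
        spots.append((word >> shift) & 0xF)
    return spots


def sort_pictures(words: List[int]) -> Dict[int, List[int]]:
    """Demultiplex a commutated word stream into one list per camera.

    Assumes word 0 belongs to camera 0, word 1 to camera 1, and so on.
    """
    files: Dict[int, List[int]] = {cam: [] for cam in range(CAMERAS)}
    for index, word in enumerate(words):
        files[index % CAMERAS].extend(unpack_word(word))
    return files


def pack_word(spots: List[int]) -> int:
    """Inverse of unpack_word, used here only to build test input."""
    word = 0
    for value in spots:
        word = (word << BITS_PER_SPOT) | (value & 0xF)
    return word


# Two commutation cycles; camera n sends spots that all have value n + 1.
stream = [pack_word([cam + 1] * SPOTS_PER_WORD) for cam in range(CAMERAS)] * 2
per_camera = sort_pictures(stream)
print({cam: spots[:3] for cam, spots in per_camera.items()})
```

In the operational program each per-camera list would be written to its own file rather than held in memory.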
Picture External Communicator: Picture
data is being recorded at 7-1/2 inches per
second into a bin tape recorder and the digital
conversion process consults this tape intermittently at 30 inches per second. The external communicator is really an extension
of the executive routine which sends out commands to stop and start the read capstan on
the bin tape recorder.
Picture Monitor: Provides superficial
checks to see that a signal is present, that
raster line synch marks are clear, etc.
Unrectified Print: Output on the IBM 1401
printer will provide a visual check of the
raster and its relationship to the fiducial
marks, with a single character corresponding to
each scan spot.
Solar Ephemeris: With time of photo,
provides the latitude/longitude of the subsolar point from which usable sun angles may

be generated for later interpretation of brightness, reflection properties and other attributes of the image.
Sun Glint: Used in conjunction with the
Solar Ephemeris routine, this routine will earmark that
part of any image where the response is primarily caused by sun glint.
Output to Storage: Will consist of routine
output commands to the two disk channels
but output of information is important insofar as efficient positioning of the write arm
is concerned since a maximum net transfer
rate is required.
Most parts of this module are active. The
Solar Ephemeris has been completed as a
more efficient version of a similar TIROS
package. Input format and means of detecting ends of scan lines are being worked out
in conjunction with final design specifications of the Format Control Unit.
Picture Digestion and Production
*Picture Rectification: Utilizes the output
of the rectification locator program. Separate picture scan spots are repositioned in
sub-blocks of storage according to grid
squares on a standard map base. The following supporting packages are utilized.
Picture Selector: Provides input/output
selection capability. A picture will be specified by exposure time and as left, right or
center camera. A specification of core buffer
location and picture segment will result in
movement of the required item to or from
disk storage.
Brightness Normalizer: Adjusts the image
response for variations due to the scan electronics and also adjusts for pole to equator
illumination differences.
Background: Provides an updated background response from which current responses will be treated as anomalies. In this
way partial discrimination between cloud and
background will be possible.
Interpolate: Provides an efficient quadratic interpolation within a two dimensional
array. This package will be used extensively
in connection with transformations from x, y
image locations to i, j map grids.
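One common way to realize a quadratic interpolation within a two-dimensional array is a tensor product of three-point Lagrange formulas over the 3 x 3 block of table entries nearest the requested point. The sketch below shows that technique. Whether the Interpolate package uses this particular scheme is not stated in the text, so it should be read as an assumption, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def _lagrange_weights(t: float) -> Tuple[float, float, float]:
    """Three-point Lagrange weights for nodes at offsets -1, 0, +1."""
    return (0.5 * t * (t - 1.0), 1.0 - t * t, 0.5 * t * (t + 1.0))


def interpolate_quadratic_2d(table: Sequence[Sequence[float]],
                             x: float, y: float) -> float:
    """Interpolate table[row][col] at fractional position (x, y).

    x runs along rows and y along columns, both in units of the table
    spacing. The 3 x 3 stencil is centred on the nearest interior node.
    """
    rows, cols = len(table), len(table[0])
    i0 = min(max(int(round(x)), 1), rows - 2)
    j0 = min(max(int(round(y)), 1), cols - 2)
    wx = _lagrange_weights(x - i0)
    wy = _lagrange_weights(y - j0)
    total = 0.0
    for a in range(3):
        for b in range(3):
            total += wx[a] * wy[b] * table[i0 - 1 + a][j0 - 1 + b]
    return total


# A test surface that is quadratic in each variable is reproduced exactly.
grid = [[i * i + 2.0 * j + i * j for j in range(5)] for i in range(5)]
print(interpolate_quadratic_2d(grid, 1.4, 2.3))
```

Applied to the table of located lattice points, a routine of this kind would be called once for latitude and once for longitude at each image location.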
Indexing: A flexible subroutine which permits identification of storage location as a
function of i, j location in square mesh grid
which is to be superimposed on a map projection.

Mosaicker: A routine which will combine
rectified, summarized data in an overlap
region based on priority selection rules.
Cloud Cover: Some 400 picture elements
falling in a ten nautical mile grid square will
be ranked as background, cloud or doubtful.
Percentage cloud cover and average cloud
cover brightness will be expressed as edited
output.
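As a rough illustration of the Cloud Cover summarization, the sketch below ranks each picture element in one grid square by how far its brightness exceeds the background response. The two margins and the decision to report doubtful elements separately are assumptions made for the example. The text does not give the actual ranking rules, and the function name is hypothetical.

```python
from typing import Sequence, Tuple


def summarize_square(brightness: Sequence[int],
                     background: Sequence[int],
                     cloud_margin: int = 3,
                     doubtful_margin: int = 1) -> Tuple[float, float, float]:
    """Return (percent cloud, percent doubtful, mean cloud brightness).

    An element is 'cloud' when it exceeds its background value by at
    least cloud_margin counts, 'doubtful' when the excess lies between
    doubtful_margin and cloud_margin, and 'background' otherwise.
    """
    cloud = []
    doubtful = 0
    for value, base in zip(brightness, background):
        excess = value - base
        if excess >= cloud_margin:
            cloud.append(value)
        elif excess >= doubtful_margin:
            doubtful += 1
    count = len(brightness)
    percent_cloud = 100.0 * len(cloud) / count
    percent_doubtful = 100.0 * doubtful / count
    mean_cloud = sum(cloud) / len(cloud) if cloud else 0.0
    return percent_cloud, percent_doubtful, mean_cloud


# 400 elements in one square: 300 bright cloud spots over a dark background.
elements = [2] * 100 + [9] * 300
clear_view = [2] * 400
print(summarize_square(elements, clear_view))
```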
Disjunction: Further interpretation of the
data used for cloud cover analysis will express the areal variability of cloud cover
thus distinguishing between scattered or
broken cloud arrays in large contiguous
masses as compared to other cases similar
in net cloud cover but distributed in a more
specular array.
Orientation: By comparing profiles of
response within a ten mile square using
samples taken from different radial orientations, certain streakiness and other features
of the pattern can be deduced.
Stereo Map: Computes i, j coordinates
on a specified square mesh grid on a polar
stereographic map base for a given latitude/
longitude point on earth.
Mercator Map: Similar to above but using
a Mercator map base.
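The two mapping routines can be sketched with the standard spherical-earth projection formulas. The grid origin, reference longitude, mesh length and earth radius below are illustrative assumptions, as are the function names. The actual routines may use different scale conventions (for example, a standard parallel other than the pole or the equator).

```python
import math

EARTH_RADIUS_NMI = 3440.0  # approximate, spherical earth (assumption)


def stereo_ij(lat_deg: float, lon_deg: float, mesh_nmi: float = 10.0,
              ref_lon_deg: float = 0.0):
    """North polar stereographic grid coordinates, pole at (0, 0).

    Scale is true at the pole, so mesh_nmi is exact only there.
    """
    lat = math.radians(lat_deg)
    dlon = math.radians(lon_deg - ref_lon_deg)
    r = 2.0 * EARTH_RADIUS_NMI * math.tan(math.pi / 4.0 - lat / 2.0)
    x = r * math.sin(dlon)
    y = -r * math.cos(dlon)
    return x / mesh_nmi, y / mesh_nmi


def mercator_ij(lat_deg: float, lon_deg: float, mesh_nmi: float = 10.0,
                ref_lon_deg: float = 0.0):
    """Mercator grid coordinates, origin at the equator and ref_lon_deg.

    Scale is true at the equator, so mesh_nmi is exact only there.
    """
    lat = math.radians(lat_deg)
    dlon = math.radians(lon_deg - ref_lon_deg)
    x = EARTH_RADIUS_NMI * dlon
    y = EARTH_RADIUS_NMI * math.log(math.tan(math.pi / 4.0 + lat / 2.0))
    return x / mesh_nmi, y / mesh_nmi


print(stereo_ij(60.0, 30.0))
print(mercator_ij(30.0, 45.0))
```

The returned i, j values are fractional. An Indexing routine of the kind described above would truncate or round them to select a storage location.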
Grid Print Output: Prints out on standard IBM printer the various summarizations discussed above by using a character
for each 10 mile mesh interval (square type
and ten line per inch carriage control are
desirable). By coding character selection,
both quantitative and pictorial output can be
obtained.
Line Drawn Output: Contoured fields are
produced from magnetic tape on an Electronic
Associates Data Plotter, Model 3410. Cloud
height analyses will likely be produced by
this device.
CRT Output: Similar to grid print output
but utilizing a device such as the SC 4020
microfilm recorder.
Fax Output: Similar to the above but utilizing digital tape directly to drive a facsimile
scan device.
Most program segments are active. The
interpolation routine is in operation. The
background package will be self generating
after Nimbus launch in that clear air earth
views will be accumulated as background information. Stereo and Mercator mappers
have been produced. An experimental unrectified print package has also been produced.

MRIR Ingestion and Digestion
Programs
Scan Rate: The scan shaft angle corresponding to a specific sensor sample can be
deduced from a shaft angle reference pulse
but is also dependent on knowledge of scan
shaft spin rate and sampling frequency. This
subroutine will be available on an optional
basis to compute the spin rate by counting
shaft reference pulses over a given number
of clock pulses.
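The Scan Rate calculation amounts to simple counting arithmetic, sketched below. The counting interval is taken here as a number of clock pulses at a known frequency, which is an assumption, and the numerical values are invented for illustration. The actual MRIR spin rate and sampling frequency are not given in this section.

```python
def spin_rate_rps(reference_pulses: int, clock_pulses: int,
                  clock_hz: float) -> float:
    """Shaft revolutions per second from pulse counts.

    One shaft reference pulse is assumed per revolution.
    """
    elapsed_seconds = clock_pulses / clock_hz
    return reference_pulses / elapsed_seconds


def shaft_angle_deg(samples_since_reference: int, sample_hz: float,
                    spin_rps: float) -> float:
    """Shaft angle of a sensor sample, measured from the reference pulse."""
    seconds = samples_since_reference / sample_hz
    return (360.0 * spin_rps * seconds) % 360.0


# Hypothetical numbers: 8 reference pulses during 16000 clock pulses at 200 Hz.
rate = spin_rate_rps(8, 16000, 200.0)
print(rate, shaft_angle_deg(50, 200.0, rate))
```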
*MRIR Ingestion: Manages input, partial
processing and places raw product in intermediate storage.
Scanner Attitude: Similar to picture attitude routine but supplies a series of nadir
and azimuth angles along a scan swath.
Space Cropper: From height supplied
by orbit routine and roll correction, provides identification of IR samples with respect to scan shaft reference pulse thus
permitting rejection of all but earth viewing
sample.
Earth Locator: An adaptation of the picture locator package which furnishes latitude/
longitude information from input provided by
orbit and attitude routines.
Solar Sector: By using the solar ephemeris and location of viewed spot, provides
solar angles for interpretation of data.
MRIR Data/Format Monitor: Inspects the
raw data to detect format errors and to judge
the general quality of the data (noise). Failure to pass acceptance tests causes visual
output for further inspection.
*MRIR Format and Output: Creates the
archivable intermediate source tape from
which various output products are derived.
This main portion utilizes the routines below
and some of those above which cannot be
utilized for want of time during the ingestive
phase.
Calibration: A step-wise two dimensional
array interpolation which produces effective
black body temperatures from raw sensor
counts as a function of environmental temperatures adjacent to the sensors and in the
electronic data transmission equipment.
Documentation: Places appropriate identification on the archivable product including
orbit number, date, time, etc.
Parts of this package that are also used
with HRIR are active. Earth Locator and
Calibration will be minor revisions of TIROS
routines.

MRIR Production Programs
*MRIR Mapper: Consults the final Meteorological Radiation tape produced by the
digestive programs and generates fields of
derived quantities as indicated below. Also
supervises the various output packages.
Cloud Height Analyzer: With the aid of a
temperature height analysis based upon existing observations and climatology, provides a
map of height information based on water
vapor window measurements. This information is now available in consort with cloud
photo information for further interpretation.
Limb Darkening: Provides corrections to
sensor response as a function of viewing
angle (path length).
Net Flux: Creates a map indicating the
net radiative flux (incoming short wave vs.
outgoing long wave) through a functional combination of sensor responses.
Albedo: Produces a map of reflectivity
of the cloud patterns.
MRIR Print Output: These output programs are minor revisions of those mentioned for cloud photos.
MRIR Line Drawn Output:
MRIR Fax Output:
MRIR CRT Output:
This portion is generally not active pending decisions on availability of portions of
data in real time.
HRIR Ingestion and Digestion Programs
*HRIR Ingestion and Format: A CDC
160A computer program which accepts packed
raw count information, unpacks and edits the
data with the help of the two routines below.
HRIR Space Cropper: A preliminary separation of earth and space viewing response
is accomplished without specific height or
attitude input in order to eliminate unwanted
response without using a highly complex program on a small computer.
HRIR Format Monitor: Detects unsatisfactory quality of input data and optionally
generates output for visual inspection (see
similar MRIR routine).
*HRIR Digestion: Provides intermediate
calibrated and geographically located data as
indicated above for MRIR. Many of the subroutines cited above for MRIR are also applied directly to HRIR.
HRIR Calibration: A simplified version
of the similar MRIR routine.

HRIR Format and Output: Generates the
archivable product source tape. Single channel sensor output is arranged in a format
somewhat different from that used for the
multi-channel MRIR.
This module is active. The ingestive
portion using the 160A is being carried out
by contract with National Computer Analysts
(NCA), Princeton, N. J. An internal segment
of the HRIR Digestion package which precisely defines the earth viewed data sample
is in check out.
HRIR Production Programs
These programs borrow heavily from the
MRIR cloud height analysis and the photo
cloud cover routines described above. Output routines will also be minor variations of
those discussed.
Some output routines await word format
specifications and instruction sets for prototype output hardware. Special character
chains for computer printer output are being
considered.
Picture Grid Melding Program
*CDC 160A Grid Meld: Provides synchronous recording of digital grid signals produced by the IBM 7094 and the analog picture
raster.
Time Check: Insures correspondence between gridding signals and pictures by input
of PCM time groups direct from the analog
picture tape and the comparable time information which accompanies the gridding
signals.
Panel Documentation: Provides documentation information from the 7094 produced tape in proper format for output to the
multitrack analogue picture tape such that a
documentation panel is activated as the
gridded picture is produced for film recording.
This segment is completed and awaiting
non-digital equipment for final checkout.
Details of Panel Documentation await final
design specification of panel display device.
Simulation Support Programs
Certain non-operational programs are
useful as feasibility and timing experiments
while others produce interface input or output product samples which serve to check

out segments of operational programs. Some
of these have been produced:
AVCS Photo Rectification Study
HRIR FMRT Output Simulation
MRIR Raw Data Simulation
Executive Routine Test
Various phases of the photo rectification
study have been completed including gray
scale experiments on a digital CRT, filler
experiments and obtaining timing figures.
Other Simulation Programs Test Hardware:
Passive Switching Exerciser (7094)
Active Switching Exerciser (7094)
Control Logic Communicator (for 7094
and 160A)
Format Control Test (for 7094 and 160A)
Analog to Digital Test (7094)
AVCS Picture Tape Test (160A)
These routines are awaiting final design
specifications and specific control formats.
REFERENCES
American Meteorological Society, 1962:
Statement on Weather Forecasting. Bulletin A.M.S., Vol. 43, N. 6, June 1962, 251.
Bandeen, W. R., 1962: TIROS II Radiation
Data User's Manual Supplement. A & M
Div., GSFC, NASA, May 15, 1962.
Dean, C., 1961: Grid Program for TIROS II
Pictures. Allied Research Associates,
Inc. Contract No. Cwb 10023, Final Report, March 1961.
Frankel, M. and C. L. Bristor, 1962: Perspective Locator Grids for TIROS Pictures. Meteorological Satellite Laboratory
Report No. 11, U. S. Weather Bureau, 1962.
Fritz, S. and J. S. Winston, 1962: Synoptic
Use of Radiation Measurements from
TIROS II. Monthly Weather Review, 90 (1),
January 1962.
Johnson, D. S., 1962: Meteorological Measurements from Satellites. Bulletin A.M.S.,
Vol. 43, N. 9, September 1962.
National Aeronautics and Space Administration and U. S. Weather Bureau, 1962: Final
Report on the TIROS I Meteorological Satellite
System. NASA Tech. Report No.
R-131.
National Aeronautics and Space Administration and U. S. Weather Bureau, 1961; a:
Abstracts and figures of Lectures and
Reprints of Reference Papers. The International Meteorological Satellite Workshop. Washington, D. C., Nov. 13-22, 1961.

National Aeronautics and Space Administration and U. S. Weather Bureau, 1961; b:
TIROS II Radiation Data User's Manual,
August 1961.
Phillips, N. A., 1960: Numerical Weather
Prediction. Advances in Computers, Vol.
I, edited by Franz L. Alt, Academic Press,
1960, 43-51.

Stampfl, R. A. and H. Press, 1962: The
Nimbus Spacecraft System, to be published in Aerospace Engineering, 21 (7).
Winston, J. S. and P. K. Rao, 1962: Preliminary Study of Planetary Scale Outgoing Long Wave Radiation as Derived
from TIROS II Measurements. Monthly
Weather Review, 90, August 1962.

PROCESSING SATELLITE WEATHER DATA - A STATUS REPORT - PART II
Laurence I. Miller
U. S. Weather Bureau
Washington, D. C.
SUMMARY
Experience gained from earlier meteorological satellites provides a firm background
for the basic design of the data processing
center. Nevertheless, the almost limitless
nature of the sampled data and some uncertainty as to the optimum forms of the final
products dictate the need for providing the
basic system with extreme flexibility and
good growth potential. To achieve the desired versatility, the various
portions of the system are being designed so
that their functions are almost entirely programmable to facilitate rapid conversions to
handle new types of data and cope with changing situations.
Maximum utilization of a computer's logical capabilities is stressed to avoid redundant construction of analog hardware and/or
special "black boxes." An executive monitor
program is designed to provide the necessary link between computer and external
hardware. Emphasis is placed on the centralization of control and the modular design
of the main programming packages.
INTRODUCTION
In Part I of this report reference has
been made to the site of the data-processing
center with only passing comment on the
communication network and the system being
designed to manage, edit, process and output
the enormous volume of data. The data
processing plan for the operational meteorological satellite, Nimbus, is the result of a
continuing research and development program begun after World War II with German
and American rockets and more recently includes the highly successful TIROS satellites. It is beyond the scope of this report to
provide a detailed description of the TIROS
satellites; however, Table 1 provides a ready
comparison between some of the more salient
features of the two systems and furnishes a
foundation for the ensuing more detailed description of the Nimbus data-processing
system.
Limited computer processing of TIROS
data was discussed in Part I, and details of
the difficulty of "real-time" computer processing of the information have been given
elsewhere, along with an engineering description of the first TIROS satellite and a meteorological analysis of some of the data [4].
An equally important consideration in not
preparing elaborate data-processing codes
to handle the TIROS data was the limitation
in speed and storage capacity of existent
digital computers when the TIROS design
was considered. The time required to compute a reprojected image of one complete
photograph approached the elapsed time of
one entire orbit [5]. Although attention will
be given to this problem in a subsequent
section, it hardly seems redundant to point
out that computers of the present generation
are still barely adequate to this task.

20 / Processing Satellite Weather Data - A Status Report - Part II
Table 1
Comparison of Nimbus and TIROS

                                    TIROS              Nimbus
Height (inches)                     19                 118
Diameter (inches)                   42                 57
Weight (pounds)                     300                650
Orbital Altitude (nautical miles)   380                500
Orbital Inclination                 48° Equatorial     80° Polar
Stabilization                       Spin-Stabilized    Earth-Seeking (3 axes)
Earth Coverage (%)                  10-25              100
Camera Raster (lines per frame)     500                833
TV Resolution (miles)               1                  1/2
Maximum Power Available (watts)     20                 400
IR Sensors (resolution, miles)      MRIR (30)          MRIR (30)-HRIR (5)
Period (minutes)                    App. 100           App. 100
No. of Cameras                      2                  3
Command Stations                    2                  1

The second part of this paper serves three
purposes: to examine the logical layout of
the central computer with associated peripheral equipment and external hardware; to
describe the functioning of the data processing system, emphasizing the logical capabilities of the computer; to discuss the vital link
between computer and external hardware
provided by an executive monitor program.
DATA TRANSMISSION
Figure 1 is a generalized schematic representation of the flow of data from Nimbus
to the National Weather Satellite Center
(NWSC), Suitland, Md., via the command and
data acquisition (CDA) station at Fairbanks,
Alaska. The proposed transmission facility
between Alaska and Suitland will utilize two
48 Kc lines, known commercially as Telpak
B. The telemetry aboard the satellite provides information on the spacecraft environment and attitude as well as information from
the three meteorological experiments. Data
recorded on magnetic tape recorders aboard
the vehicle are telemetered to the ground
station using an FM-FM system to accommodate the considerable information bandwidth.
Somewhat different considerations apply
to each of the multiple sensor and environmental signals as they are initially recorded
on the spacecraft, telemetered to the ground
and finally received at the transmission terminal equipment. These features are summarized as follows:

Figure 1. Schematic representation of the
flow of data from Nimbus to the National
Weather Satellite Center.

Each of the three video cameras is
simultaneously exposed for 40 milliseconds,
scanned for 6.75 seconds and recorded on
magnetic tape at 30 i.p.s. Although the exposures of the thirty-three frames (three-picture sets) are 108 seconds apart, only 3.7
minutes of actual recording time is required.
Playback to ground is maintained at 30 i.p.s.
but is recorded, still in FM form, at 60 i.p.s.
Since the long line bandwidths are not sufficient to accommodate the frequency range,

the ground tape is rewound and then relayed
to the NWSC in 30.85 minutes at 7.5 i.p.s.
HRIR
The narrow angle high resolution radiation
sensor is active only during the dark southbound portion of the orbit of approximately
64 minutes. During this time data is recorded at 3.75 i.p.s. and then telemetered to
the CDA station in 8.1 minutes at 30 i.p.s.
The transmission is received at Suitland in
8.1 minutes; however, the data are recorded
at 60 i.p.s.
MRIR
Five medium resolution radiation sensors
scan from horizon to horizon during the entire orbit. An endless tape loop records the
data continuously (except during readout) at
0.4 i.p.s.; increasing the playback speed by
a factor of 30 reduces the readout time to
3.6 minutes. The data is recorded at Suitland
at 30 i.p.s.
PCM
Spacecraft environmental signals, including attitude signals, vehicle temperatures
and other housekeeping data, are transmitted
as pulse code modulation (PCM). This information is also recorded during the entire
orbit in a similar manner to the MRIR, discussed earlier.
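The recording and readout times quoted above follow from the fact that a fixed length of tape takes less time to play back in proportion to the speed increase. The short sketch below reproduces the arithmetic. The recording times of 64.8 and 108 minutes are taken from Table 2, and the function name is hypothetical.

```python
def playback_minutes(record_minutes: float, record_ips: float,
                     playback_ips: float) -> float:
    """Time to replay a tape recorded at one speed and played at another."""
    tape_inches_per_minute_ratio = record_ips / playback_ips
    return record_minutes * tape_inches_per_minute_ratio


# AVCS: thirty-three frames, each scanned for 6.75 seconds.
avcs_record_minutes = 33 * 6.75 / 60.0

# HRIR: 64.8 minutes recorded at 3.75 i.p.s., read out at 30 i.p.s.
hrir_readout_minutes = playback_minutes(64.8, 3.75, 30.0)

# MRIR: 108 minutes recorded at 0.4 i.p.s., read out 30 times faster.
mrir_readout_minutes = playback_minutes(108.0, 0.4, 0.4 * 30)

print(round(avcs_record_minutes, 1))   # AVCS recording time, minutes
print(round(hrir_readout_minutes, 1))  # HRIR readout time, minutes
print(round(mrir_readout_minutes, 1))  # MRIR readout time, minutes
```

The three results round to the 3.7, 8.1 and 3.6 minutes stated in the text.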
The "real-time" aspects of the operation
are accentuated by the undelayed transmission of the PCM and infrared data directly to
the NWSC computers over the leased microwave facilities. The total time required for
complete satellite interrogation is 8+ minutes; therefore, all but three to four orbits a
day can be recorded at Fairbanks.* Transmission of the video data to Suitland is delayed about 10 minutes while the computers
convert the raw PCM data to useful parameters; therefore, the complete data set is not received
at the center until approximately 40 minutes
after the start of interrogation. Direct access
of the data to the IBM computer is accomplished by means of a Direct Data Connection
(DDC), which permits real-time transmission
between 7094 storage and external devices at
rates up to 152,777 words per second.
*The east coast of North America is being
considered as a site for a second CDA
station.
The NASA Space Computing Center at the
Goddard Space Flight Center supplies a set
of orbital elements, which are periodically
updated by information received from the
world-wide Minitrack network. Prior to
satellite interrogation these elements are
converted to satellite latitude, longitude and
height as a function of the orbit time.
INPUT DATA
Before turning to a consideration of the
high data rates as they pertain to the "real-time" system, let us briefly outline the presently proposed computer complex. The primary computer will be a 32,000 word core
memory IBM 7094 equipped with the following elements: fourteen MOD V magnetic tape
drives shared between two channels, two 1301
disk files each connected to a separate channel, one DDC attached to a tape channel, a
core storage clock and interval timer, an on-line printer and card-reader. Two smaller-scale
computers will also be available, an
IBM 1401 to serve primarily as an input-output device to the 7094, and a CDC 160A to
be used in the picture-gridding program and
to some degree as a preprocessor for the
less voluminous MRIR and HRIR data.
Table 2 provides a summary of the volume
and real-time rates (equivalent to 60 i.p.s.
playback) of the experimental data, and
Table 3 provides the data rates of the 7094
input-output equipment. From a consideration of the simultaneous input-output computing abilities of the 7094, and the effective
use of optimum buffering techniques, it appears at first that the severest constraint to
operational use of the data is imposed by the
acceptance rates of the DDC and the temporary storage devices. However, closer examination of the basic machine cycle time
(2.0 microseconds) and the frequency of main
frame cycles borrowed by the input-output
equipment reveals that insufficient editing,
buffering and operational programming time
would be available even if the basic acceptance and transfer rates could be appreciably
increased.†
†The 7094 was selected as the result of a
staff study which considered among other
things delivery dates, performance and reliability, software, user groups, and especially speed and storage capacity.
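
The cycle-time argument above can be made concrete with a rough budget. The sketch below is illustrative only: it assumes that each word transferred through the DDC borrows exactly one main-frame cycle, which the text does not state, and the function name is hypothetical.

```python
# Rough main-frame cycle budget for the 7094 during data ingestion.
# Assumption (not from the paper): every transferred word borrows one cycle.

CYCLE_TIME_US = 2.0          # basic machine cycle time quoted in the text
DDC_WORDS_PER_SEC = 152_777  # peak DDC transfer rate quoted in the text


def cycles_left_per_word(words_per_sec, cycle_time_us=CYCLE_TIME_US,
                         cycles_stolen_per_word=1):
    """Return the main-frame cycles left for programs per transferred word."""
    cycles_per_sec = 1_000_000 / cycle_time_us
    stolen = words_per_sec * cycles_stolen_per_word
    return (cycles_per_sec - stolen) / words_per_sec


if __name__ == "__main__":
    left = cycles_left_per_word(DDC_WORDS_PER_SEC)
    print(f"{left:.2f} cycles per incoming word remain for editing and buffering")
```

Under that assumption only a little over two cycles per incoming word remain at the peak rate, which is in line with the point made above that programming time, rather than acceptance rate, is the tighter constraint.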

22 / Processing Satellite Weather Data - A Status Report - Part II
Table 2
Satellite and Station Recording Rates
Satellite
AVCS
Record
(min.)
Speed
(i.p.s.)
Playback
(min.)
Speed
(i.p.s.)
3.7
HRIR
MRIR PCMA
64.8
108
108
30
3.7
30
3.75
0.4
0.4
8.1
3.6
3.6
30
12
12
60
Fairbanks
Speed
(i.p.s.)
Playback
(min.)
Speed
60
60
60
30.85
7.5
Direct
3.6 Direct
60
-
-
NWSC
Speed
(i.p.s.)
7.5
60
Playback
(min.) 30 (Batch) 8
Speed
(i.p.s.)
30
60
7.5
-
.5
-
60

Table 3
Volume and Real Time Data Rates

                                 Binary Bits      Bits/second
AVCS                             275,000,000      App. 1,402,920
HRIR                              14,700,000              59,000
MRIR                               3,600,000             134,400
IBM 729 Mod IV (high density)                            375,000
IBM 729 Mod VI (high density)                            540,000
IBM 1301 DISC                                       App. 500,000
IBM DDC                                           App. 1,000,000

The required high rate of data transmission is obtained by maintaining a continuous
flow between the transmission line and the
computers. The analog signals are detected,
multiplexed and introduced to an analog-to-digital converter which encodes the sampled
values in digital form while preserving the
integrity and rate of the data. In the case of
the video signals the data are recorded at 7.5
i.p.s. in a special bin storage recorder which
permits the information to be read into the
computer in batches at 30 i.p.s., well within
the data handling capabilities of the computer.

DESIGN CONSIDERATIONS
During all phases of the system design it
has been vital for us to consider both the high
degree of flexibility and growth potential inherent in the Nimbus Research and Development program and the implications of future
programs of international cooperation in
weather satellites. Further, as the system
passes from the experimental phase to the
truly operational stage the degree of automation will increase and eventually replace
manually performed functions. The required
balance between these practical considerations and the need to assume an immediate
operational posture has been achieved by designing the structure of the combined digital-analog complex as machine, not hardware,
orientated.
To achieve the desired versatility, the
operation of the various portions of the system is being designed so that their functions are almost entirely programmable to
facilitate rapid conversions to handle new
types of data and cope with changing situations. Emphasis has been placed on the
modular concept so that substitution of one
package for another does not have ramifications throughout the entire system. Maximum utilization of the computer's logical
capabilities has been stressed to avoid redundant construction of analog or special
hardware. Wherever possible, major hardware units are standard, dependable general
purpose equipment; and where it has been
necessary to build special equipment, it
is of the patch board variety.
CONTROL PHILOSOPHY
The Nimbus system has a common base
with many other complex systems where
computers are employed for such vital functions as information storage, retrieval and
display. Inherent in most of these systems
(e.g., BMEWS, SAGE, MERCURY) is a complex information processing problem which

Proceedings-Fall Joint Computer Conference, 1962 / 23
requires intervention of skilled personnel to
make the ultimate decision. These systems
serve to provide a broad basis of facts on
which the dominant information processor,
man, can make his decision. Whereas these
systems have been designed because it is
possible to differentiate between the normal
and abnormal, no such clear-cut definition
exists in our weather system. Logical uses
of pattern recognition theory and meteorological research may well negate this last
remark, but such techniques are beyond the
state-of-the-art at this time.
A second difference arises when we consider that the ingestive program is not engaged throughout the entire processing cycle,
i.e., the time between successive readouts.
During the ingestive phase (phase I) the external hardware may be completely active or
passive or any combination of the two; during
the non-ingestive process (phase II) the external hardware is predominantly passive.
At any time during a processing cycle both
diagnostic and management interrupts may
occur, but the type of program control invoked must be considered in light of these
two phases. Management interrupts which
may occur at any time are caused by the
normal transfer of data through the computer
and must be given immediate priority. A
component of the external hardware which
monitors the system to prevent loss of quality
or integrity of the information may also provide a diagnostic interrupt at any phase;
however, the right to take action is reserved
to the computer. During phase I the computer must be programmed to take immediate action; however, during phase II the suspected malfunction may be beyond the present
logical flow of information, and the computer may merely advise a supervisor and
refuse to disturb the present operation. The
monitoring and diagnostic control programs
must be optimized as a function of the two
phases.
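
The phase-dependent handling of interrupts described above can be sketched as a small decision function. This is only an illustration: the action labels, the class names, and the `affects_current_flow` flag are hypothetical, and the rule for a phase II fault that does touch the current flow of information is an assumption, since the text covers only the case in which it does not.

```python
from enum import Enum


class Phase(Enum):
    INGESTIVE = 1      # phase I: data are being read in
    NON_INGESTIVE = 2  # phase II: external hardware predominantly passive


class Interrupt(Enum):
    MANAGEMENT = "management"  # normal transfer of data through the computer
    DIAGNOSTIC = "diagnostic"  # raised by the monitoring hardware


def choose_action(interrupt, phase, affects_current_flow=True):
    """Pick a response; the returned strings are illustrative labels only."""
    if interrupt is Interrupt.MANAGEMENT:
        # Management interrupts get immediate priority in either phase.
        return "service_immediately"
    if phase is Phase.INGESTIVE:
        # During ingestion the computer must act at once.
        return "take_immediate_action"
    # Phase II (assumed rule): act only if the fault touches current processing.
    if affects_current_flow:
        return "take_immediate_action"
    return "advise_supervisor"
```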
At the time of initial launch when complete understanding of all possible system
malfunctions is lacking, problems may arise
which have not been anticipated. To cope
with this situation a special manual mode of
operation is provided which permits human
intervention to apply recovery techniques.
As a further "guard to the guards" a real
time programmable clock senses the status
of each phase and signals the present mode
of operation.

It appears that the regularity of the data
and uniformity of time scale should best be
served by an automated system with minimum human intervention. This philosophy is
controlled by an executive program which
also provides the link between the computer
and external equipment.

EXECUTIVE PROGRAM
The actual machine program consists of
five main sections:
1. Internal Control: Coordinates and ties
together the other portions of the executive
monitor. It also requests other program
modules from the system file and provides
for operator override.
2. Schedule: Accepts pre-readout information concerning the data to come and establishes the time schedule and sequence of
program modules to be consulted for that
orbit.
3. Interrupt Interpreter: Diagnoses the
interrupt from the standpoint of source and
reason and directs the computer to the appropriate action. Interrupts may come from
the clock, from the direct data connection
interrupt wire, from the external interrupt
or from regular channel commands.
4. Logical External Communicator: On
the basis of clock alarms or otherwise, sends
commands to control the mode of operation
of the nondigital hardware. This routine is
linked to the interrupt interpreter.
5. Clock Manager: Provides the means
for setting the interval timer and causing
clock interrupts and also fulfills program
requests for time information.
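
As a rough illustration of the Schedule section, the sketch below turns a pre-readout message into a time-ordered sequence of program modules. The message layout, the module names in `MODULE_FOR`, and the function name are all hypothetical; the paper does not describe the actual formats.

```python
from dataclasses import dataclass

# Hypothetical mapping from data type to the program module that ingests it.
MODULE_FOR = {
    "HRIR": "hrir_ingest",
    "PCM-A": "pcm_ingest",
    "AVCS": "avcs_ingest",
    "MRIR": "mrir_ingest",
}


@dataclass(frozen=True)
class ScheduledModule:
    start_min: float  # minutes after the start of interrogation
    name: str         # program module to request from the system file


def build_schedule(pre_readout):
    """pre_readout: iterable of (data_type, expected_start_min) pairs."""
    entries = [ScheduledModule(start, MODULE_FOR[kind])
               for kind, start in pre_readout]
    # The executive consults modules in order of their expected start time.
    return sorted(entries, key=lambda entry: entry.start_min)
```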
However, the executive program is more
than a series of machine instructions which
controls the flow of information through the
computer and the interaction between the
main program modules. It is, in fact, the
guiding philosophy of the entire data processing system. The program consists of a
rigid set of rules and controls which determine the manner in which the various resources available are utilized in the satisfaction of the system design characteristics.
At first glance, it seems paradoxical that the
Nimbus system, always on the side of growth
and flexibility, should make such precise demands at the very heart of the system. Nevertheless, without such a firm foundation our
system would be at best unstable and at worst
completely unable to meet the specific
requirements of growth and flexibility from
within the physical and environmental constraints imposed by the system. The original
form of the executive program will be overlaid by many accretions, some of which may
be major before we are through. The executive program will be the subject of a future
paper.
SYSTEM DESCRIPTION
The functioning of the data processing system is best illustrated through a description
of the events that occur during one orbital
cycle. As the information is received at the
common carrier terminal equipment, the
data are directed into three main channels.
1. Into a monitor tape recorder which at
all times records the input from the transmission lines at appropriate speeds. All
data are stored as received providing a safeguard against loss of data in case of breakdown of the processing equipment. This tape
also serves as an archive copy until replaced
by the CDA master tape.
2. Into a picture gridding and reproduction
branch, in which the analog AVCS signals can
be directly reproduced in pictorial form,
with the insertion of computed latitude, longitude and geographic boundary grids, and
appropriate legends.
3. Into a digitizing subsystem where the
incoming data are formatted, converted from
analog to digital and transferred to the computers.
Twelve separate modes of operation appear at least once during each complete
cycle.* Modes 1 through 11 occur (with considerable overlap) during the data ingestion
phase when the primary role of the master
computer is one of system command and
control and only editing and minimum computations are performed. Mode 12 represents the time allocated to the main data
processing programs which are described in
the appendix to this report. During this
phase the executive program continues to
provide the link between the program modules. However, the master computer relinquishes control of the external hardware and
peripheral computers to allow manual control for special functions, e.g., archival operations, preventative maintenance.

*An extra burden is placed on the system
when two orbits are stored aboard the
spacecraft and simultaneously acquired by
the CDA station. An alternate mode is provided but will not be treated in this paper.

Thus, it
can be seen that approximately 60% of each
cycle is available for computation during
which system software is minimized so as
not to interfere with the program's capacity
to perform the basic function. The modes
for this system are shown in Figure 2 and
are as follows:
Mode 1. Initial load - receive and process
preinterrogation message and compute orbital track. Activate monitor tape recorder
and generate modes 2, 3, 4.
Mode 2. Receive HRIR picture and time
data and switch this information to tape bin
recorder #1. This mode is terminated by
the computer upon receipt of an end of transmission code.
Mode 3. Receive PCM-A data and switch
to demodulator and decoding circuits which
convert the data to digital form. Transfer
the information through the Format Control
Unit (FCU) to the 7094. The computer senses
the end of data to terminate the mode.
Mode 4. Receive AVCS time and direct
the information to demodulator and decoding
circuits which convert the amplitude modulated time information to digital form. The
data is switched to the computer via the FCU.
Upon receipt of all pictures the computer
ends this mode.
Mode 5. Playback the HRIR data as soon
as it is recorded and dumped into the bin
(Mode 2). The information is converted to
digital form and routed through the 160A
computer to produce an edited digital tape.
Mode 6. Receive MRIR data and record
on bin recorder.
Mode 7. Playback MRIR data to digitizing
sub-system.
Mode 8. Receive AVCS data and record
on bin recorder at 7.5 i.p.s. allowing tape to
fill up bin.
Mode 9. Playback AVCS data from bin
recorder in short bursts at 30 i.p.s. through
digitizing sub-system to 7094.
Mode 10. Receive AVCS data and switch
information into picture gridding and reproduction branch.
Mode 11. Transfer HRIR digital tape
from 160A to 7094.
Mode 12. Relinquish automatic control of
the external equipment. Process the data.
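
The relationship between Modes 8 and 9 is simple tape arithmetic: tape laid into the bin at 7.5 i.p.s. is read back at 30 i.p.s., so the playback side needs to run only part of the time. The sketch below works this out; the function names and the 30-minute recording used in the example are illustrative assumptions.

```python
def playback_minutes(record_minutes, record_ips=7.5, playback_ips=30.0):
    """Minutes of actual tape motion needed to read back a recording."""
    tape_inches = record_minutes * 60.0 * record_ips
    return tape_inches / playback_ips / 60.0


def reader_duty_cycle(record_ips=7.5, playback_ips=30.0):
    """Fraction of the time the playback side must run to keep pace."""
    return record_ips / playback_ips


if __name__ == "__main__":
    # Illustrative case: a 30-minute recording at 7.5 i.p.s.
    print(playback_minutes(30.0), reader_duty_cycle())
```

With these speeds the bursts in Mode 9 occupy one quarter of the recording time, leaving the remainder for the computer to handle the other ingestion modes.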
Although the limited goals of data storage
and display are accomplished in quasi-real
time, this represents only a partial fulfillment of the system design. The most important function of the computer will be the
summarization of the data into convenient
products for use in weather analysis and
forecasting. The meteorological information
will be disseminated as photographs, maps
and charts, and coded teletypewriter analyses
over domestic and international networks.
True justification for the system design is
possible only if we include this capability to
obtain upon programmed demand these desired outputs, properly formatted, and to communicate this information to the outside
world.

Figure 2. Functional modes and computer usage diagram. (Recoverable labels: modes 1 through 12 and computer functions plotted against time in minutes, 0 to 100, over one data processing cycle. Computer functions: A, load orbital data; B, load AVCS time; C, load MRIR; D, load HRIR time; E, load PCMA; F, prepare gridding tape for AVCS; load AVCS pictures; computation of data.)

It is planned to distribute data to meteorologists in the following forms:
1. Gridded photographs (gridded meaning
having latitude and longitude lines).
2. Mosaics, one for each orbital swath
on a scale of about 1:10,000,000. The
resolution of these mosaics will be about 10
miles. Probably three base maps will be
made: polar stereographic for northern and
southern hemispheres and Mercator for
equatorial regions.
3. Mosaics similar to above for North
Atlantic, North America and possibly other
areas. This item has lower priority than 2
and it may prove easier for stations to prepare their own mosaics. Scale may possibly
be 1:20,000,000.
4. Infrared maps similar to 2, from HRIR
data.
5. Infrared maps showing cloud heights
from MRIR data. Scale possibly 1:20,000,000.
6. Graphical nephanalyses for stations
lacking capability of receiving more detailed
data.
7. Coded nephanalyses for stations having
only radio telegraph or radio teletype.
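
To give a feel for the scales quoted in the list above, the following sketch converts a ground distance to its length on the map. It assumes statute miles, which the text does not specify, and the function name is hypothetical.

```python
METERS_PER_STATUTE_MILE = 1609.344  # assumption: "miles" means statute miles


def map_length_mm(ground_miles, scale_denominator):
    """Length on the map, in millimetres, of a ground distance in miles."""
    ground_mm = ground_miles * METERS_PER_STATUTE_MILE * 1000.0
    return ground_mm / scale_denominator


if __name__ == "__main__":
    for scale in (10_000_000, 20_000_000):
        print(f"1:{scale:,}: 10 miles is {map_length_mm(10, scale):.2f} mm on the map")
```

On this reading, a 10-mile resolution element is roughly 1.6 mm wide on a 1:10,000,000 mosaic and about half that at 1:20,000,000.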

Distribution will be on a selective basis
so that to the greatest extent possible each
user will receive only the data he desires.
Although additional communication links will
be provided, distribution to many overseas
sites will necessarily be limited to radio,
including radio facsimile.
REFERENCES
1. Davis, R. M., "Methodology of System
Design."
2. Gass, S. I., Scott, M. B., Hoffman, R.,
Green, W. K., and Peckar, A., "Project
Mercury Real-Time Computational and
Data Flow System," Proceedings of the
Eastern Joint Computer Conference, Dec.
1961.
3. Hosier, W. A., "Pitfalls and Safeguards
in Real-Time Digital Systems," Datamation, April, May 1962.
4. National Aeronautics and Space Administration, U. S. Weather Bureau, "Final
Report on the TIROS I Meteorological
Satellite System," NASA Tech. Report
No. R-131, 1962.
5. Frankel, M. H., and Bristor, C. L., "Perspective Locator Grids for TIROS Pictures," Meteorological Satellite Laboratory Report No. 11, U. S. Weather Bureau,
1962.
6. Hall, F., "Weather Bureau Preliminary
Processing Plan."
7. Ess/Gee, Inc., "Nimbus Data Digitizing
and Gridding Sub-System Design Study,"
U. S. Weather Bureau Cwb 10264.

DESIGN OF A PHOTO
INTERPRETATION AUTOMATON*
W. S. Holmes
Head, Computer Research Department
H. R. Leland
Head, Cognitive Systems Section
G. E. Richmond
Principal Engineer
Cornell Aeronautical Laboratory, Inc.
Buffalo 21, New York
INTRODUCTION
The extremely large volume of photographic material now being provided by reconnaissance and surveillance systems, coupled with limited, but significant, successes
in designing machinery to recognize patterns
has caused serious consideration to be given
to the automation of certain portions of the
photo interpretation task. While there is
little present likelihood of successfully designing machines to interpret aerial photographs in a complete sense, there is ample
evidence to support the conjecture that simple
objects, and even some complex objects, in
aerial photographs might be isolated and
classified automatically. Even if machinery,
produced in the near future, can only perform a preliminary sorting to rapidly winnow the input volume and to reduce human
boredom and fatigue on simple recognition
tasks, the development of such machinery
may well be justified.
The supporting evidence for the conjecture that simple objects can be identified in
aerial photographs is based on work which has
shown experimentally that present pattern-recognition machinery, indeed that which
existed several years ago, can be applied to
the recognition of silhouetted, stylized objects
which are militarily interesting. Murray has
reported just such a capability for a simple
linear discriminator.† Since the information
required to design more capable recognition
machines is readily available, it might seem
that there is no problem of real interest remaining to make a rudimentary photo-interpretation machine an accomplished fact.

*This work was sponsored by the Geography Branch of the Office of Naval Research and by the
Bureau of Naval Weapons.
†See "Perceptron Applications in Photo Interpretation," A. E. Murray, Photogrammetric Engineering, September 1961.

This, unfortunately, is not so. One of the
most difficult problems is that which is referred to as the segmentation problem. The
problem of pattern segmentation appears in
almost all interesting pattern recognition
problems, and is simply stated as the problem of determining where the pattern of interest begins and ends (as in speech recognition problems) or how one defines those
precise regions or areas in a photo which
constitute the patterns of interest. The problem exists whenever there is more than one
simple object in the entire field of consideration of the pattern recognizer. The situation appears almost hopeless when one finds
patterns of widely varying sizes, connected
to one another (in fact or by shadow), enclosed within other patterns, or having only
vaguely defined outlines.
This paper constitutes a report on a system which has been conceived to solve some
of these problems. It is being tested by
general-purpose computer implementation.
The system discussed represents one of several possible approaches to the problem and
had its design focused towards the use of
presently known capabilities in pattern recognizers. No special consideration has been
given, at this time, to methods of implementing the device; however, the entire system
can be built in at least one way.
System Principles
Figure 1 is the basic block diagram for
the system. It has evolved from evaluation
of possible approaches suggested by research
conducted at CAL, pattern recognition work
of others, and techniques successfully used
in other problems.

Figure 1. Photointerpretation system
block diagram.

As is evident from Figure 1, objects of
interest have been categorized in two different ways. First, simple objects, such as
buildings, aircraft, ships, and tanks have
been distinguished from complexes, or complex objects. Second, simple objects have
been categorized, according to their length-to-width ratios, as being either blobs (aircraft, storage tanks, buildings, runways) or
ribbons (roads, rivers, railroad tracks). As
shown, the detection of simple objects is accomplished separately for ribbons and for
blobs. In the work reported here the blob
channel, from the input end through the identification of a few complex objects, is receiving the major attention.
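
A minimal sketch of the blob/ribbon split by length-to-width ratio follows. The threshold value is purely an assumption made for illustration; the paper gives no figure, and since it lists runways among the blobs, the criterion actually used evidently differs from any single simple cut-off.

```python
def length_to_width_ratio(length, width):
    """Ratio of the longer dimension to the shorter one (always >= 1)."""
    longer, shorter = max(length, width), min(length, width)
    if shorter <= 0:
        raise ValueError("dimensions must be positive")
    return longer / shorter


def classify_simple_object(length, width, ribbon_ratio=10.0):
    """Return 'ribbon' or 'blob'; ribbon_ratio is a hypothetical threshold."""
    if length_to_width_ratio(length, width) >= ribbon_ratio:
        return "ribbon"
    return "blob"
```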
The preprocessing which is carried out
in the first portion of the system solves several of the problems inherent in the use of a
simple pattern-recognition device to aid in
the photo interpretation problem. Briefly,
objects are to be detected, isolated, and
standardized so that they can be presented
separately (not necessarily sequentially) for
identification.
The function performed at the object identification level is that of identifying the blobs
which have previously been detected, isolated, and standardized. The input material
to this level or state consists of black-on-white objects. As has been previously indicated, existing devices are fundamentally capable of accomplishing the identification task.
At the complex object level, the location
and identification information available from
the simple object-level outputs is combined
and appropriately weighted to identify objects
at a higher level of complexity. An illustrative example is the combination of aircraft
(simple objects) near a runway (another
simple object) and a group of buildings (each
a simple object) to determine the existence
of an airfield.
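
The airfield example can be sketched as a weighted combination of nearby simple-object detections. Everything numeric here is an assumption made for illustration: the weights, the search radius, and the rule of counting each object type once are not taken from the paper.

```python
import math

# Hypothetical evidence weights for each simple-object type.
WEIGHTS = {"runway": 0.5, "aircraft": 0.3, "building": 0.2}


def airfield_score(objects, radius=2.0):
    """objects: list of (label, x, y). Score evidence near the first runway."""
    runways = [(x, y) for label, x, y in objects if label == "runway"]
    if not runways:
        return 0.0
    rx, ry = runways[0]
    score = WEIGHTS["runway"]
    counted = set()
    for label, x, y in objects:
        if label in ("aircraft", "building") and label not in counted:
            # Location information: only objects close to the runway count.
            if math.hypot(x - rx, y - ry) <= radius:
                score += WEIGHTS[label]
                counted.add(label)
    return score
```

A decision threshold on the score would then declare an airfield present; choosing that threshold is outside what the text specifies.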
In the following sections the basic steps
in the preprocessing sequence will be described in more detail and some illustrations from current computer studies will be
discussed. The most difficult part of the
problem, by far, is that of detection.
Object Detection
A study of sample aerial photography suggests three ways in which images of objects
of interest differ from their backgrounds:
a. points on objects may differ in intensity from the intensity characterizing
the background.
b. objects may be (perhaps incompletely)
outlined by sharp edges, even though
the interior of the image has the same
characteristic intensity as the background.
c. objects may differ from background
only in texture, or two dimensional
frequency content.
Examples of the first two kinds of objects
are shown encircled in Figure 2. There
seem to be many fewer examples of objects
which differ from background solely by texture. This class of objects would be much
larger if our definition of object were
broader, including, for example, corn fields.
Perhaps the most useful area in which spatial frequency content can be put into use is
that of terrain classification. Terrain classification, as will be noted again later, can
play a significant role in the final identification of our narrower class of objects.

Figure 2. Examples of objects defined by intensity contrast (0) and by edges (~).

For detection of objects in classes a. and
b., we have been proceeding experimentally
to determine the capabilities of simple,
two-dimensional numerical filters, some
nonlinear and some linear.
For initial experimentation,* the object
filters for discrimination based on intensity
contrast (class a objects) were designed as
shown in Figure 3. Square apertures ("picture frame" regions) were used to compute
intensity information which was then compared with the intensity of the point at the
center of the square, A, to determine if the
central point differed sufficiently in intensity
from its background to qualify as being a
point of an object.
A computing method equivalent to the following was used: Each point in the input
photograph was surrounded by a frame one
point thick, and of width d (Figure 3). The
mean, m, and standard deviation, σ, of the
intensity of the points in the frame were
then computed.
If

A

>m +

A
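
The picture-frame computation can be sketched as follows. Because the threshold expression is cut off in this copy of the text, the form `A > m + k * sigma` and the multiplier `k` are assumptions, as are the function names, the restriction to odd frame widths, and the clipping of the frame at the image border.

```python
import math


def frame_points(image, row, col, d):
    """Intensities on the one-point-thick square frame of width d around (row, col)."""
    half = d // 2  # assumes d is odd so the frame is centred on the point
    values = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            on_frame = abs(r - row) == half or abs(c - col) == half
            inside = 0 <= r < len(image) and 0 <= c < len(image[0])
            if on_frame and inside:
                values.append(image[r][c])
    return values


def is_object_point(image, row, col, d=5, k=2.0):
    """True if the centre intensity A exceeds m + k * sigma of its frame."""
    values = frame_points(image, row, col, d)
    m = sum(values) / len(values)
    sigma = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return image[row][col] > m + k * sigma
```

Only the brighter-than-background test is shown; a darker-than-background test would mirror it with `A < m - k * sigma`, again as an assumption.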



Figure 5. Maximum 3600 simple system. (Recoverable diagram labels: housekeeping modules; data channels; 262,144 51-bit words maximum.)

Figure 6A. Partially expanded 3600 system.
Figure 6B. Two 3600 systems sharing common storage.

82 / Planning the 3600
Figure 6C. Two-computer system with additional common storage.

awareness of the existence of the initial computer and input-output complex. The same
holds for the initial system. Thus, each system, from an operating point of view, is completely independent of the other. The effects
of mutual existence are detectable, however,
but only indirectly. If both systems
are using the same storage modules extensively, to the exclusion of their own private
modules (if any), over-all operations may be
slowed. For such operations, a more reasonable approach would give each computer-input-output complex its own storage module
or modules and reserve the common storage
area for data of more permanent nature.
In a similar manner, other multi-computer
complexes can be constructed within the
interconnecting limitations imposed on individual modules.

Figure 6D. Two-computer system with private and common storage.

Real-Time Multiplexed Systems
In multiplexed real-time systems, a high
degree of control is required over the entire
system. The modular approach planned permits this as an extension of the multi-computer complexes mentioned. In addition
to two or more completely independent systems sharing a common storage pool, a system is used whereby the independent systems
are also interconnected via their data channels. Each data channel is designed so it
may be connected directly, without intervening black boxes or cable adapters, into any
other data channel. These other data channels may be on the same or different systems. The interconnecting linkage supplies
data paths and coupling information. In addition, each interconnecting link permits interruption in either direction, and presents
major fault information such as parity error
in storage or illegal operation code. Thus,
one computer may run in real time with data
and an initial reference program in a common storage pool. Another computer may
be in standby status and processing low
priority problems. The on-line computer
may interrupt the standby computer at any
time and request it to resume the problem.
The new on-line computer, depending upon

the status of the remainder of the system,
may merely serve as a substitute computational unit or as a total computational facility.
Either option is under program control. The
central controlling program would be stored
in a common storage unit with possible interchange with standby units. The computer
units are so designed that any one of a group
may act as the dominant force with option to
transfer the responsibility at any time. Figure 7 shows a typical multiplexed system.
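
The hand-over between an on-line computer and a standby unit can be modelled in a few lines. This is a toy sketch: the class and method names are hypothetical, and the real mechanism (interrupts over the data-channel link, with the reference program held in common storage) is reduced here to a change of role.

```python
class Computer:
    def __init__(self, name):
        self.name = name
        self.role = "standby"  # standby units may run low priority problems


class MultiplexedSystem:
    def __init__(self, computers):
        self.computers = list(computers)
        self.computers[0].role = "on-line"  # first unit starts as dominant

    @property
    def on_line(self):
        return next(c for c in self.computers if c.role == "on-line")

    def hand_over(self, successor_name):
        """The on-line unit interrupts a standby unit and passes it the problem."""
        current = self.on_line
        successor = next(c for c in self.computers if c.name == successor_name)
        if successor is current:
            return current
        current.role = "standby"
        successor.role = "on-line"
        return successor
```

Any unit can become the dominant one, which mirrors the statement that responsibility may be transferred at any time under program control.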
Future Expansion
One of the initial planning goals required
long life through addition of newer equipment.
With simple interfaces chosen between
modules, it is a relatively simple matter to
change the system by the substitution of a
new module type for an older one. Newer
storage modules or special purpose computing modules may be easily added to the system. With the five access positions on each
storage module, new and unrelated module
types may be added to the system and controlled via the storage medium.

Figure 7. A multiplexed 3600 system.

As an aid in adding new features to the
present design without the necessity of disrupting current or future units, a limited
micro-programming facility was designed
into the computer module. All control elements which the logical designer has at his
disposal when he designs instruction algorithms are available for further use. Very
high speed transmission lines terminate in
each of these many control elements. The
other ends of the transmission lines are attached to a timing and control device which
selectively pulses the lines at the same frequency as the basic computer instructions
do. Thus, the cable is a logical extension of
the computer module control circuitry. Instruction algorithms performed via this
method perform at the same rate of speed
as if they were incorporated into the original
computer module design itself. Algorithms
under active consideration are a square root
and a generalized polynomial evaluator.
This facility permits later inclusion of
new instructions, hardware subroutines, or
modification of existing instructions without
modification to the computer system. This
facility is used by attaching an external unit
to the computer module. The external unit
contains timing elements, a function translator, and a small command structure. When
connected to the computer module, the external unit is considered an integral part of
the computer module. It is referenced by a
special instruction which gives total control
of the external unit. The unit then performs
the specified instruction or subroutine, and
then gives control of the computer back to
the main program.
The advantages of this facility are many:
it is possible to include specialized instructions where very heavy usage is encountered;
subroutines such as square root can be constructed avoiding a multiplicity of storage
references; and instructions can be given to
special equipment attached to the computer
module.
SUMMARY
A large-scale computer system is planned
in which modular and flexible system design
assures a reasonably long machine life. Relationships between modules are shown to be
highly important, particularly for future
expansion.

D825 - A MULTIPLE-COMPUTER SYSTEM
FOR COMMAND & CONTROL
James P. Anderson, Samuel A. Hoffman, Joseph Shifman, and Robert J. Williams
Burroughs Corporation
Burroughs Laboratories
Paoli, Pennsylvania
INTRODUCTION

central data processing facility. The data
processing functions alluded to are those
typical of data processing, plus special functions associated with servicing displays,
responding to manual insertion (through
consoles) of data, and dealing with communications facilities. The design implications
of these functions will be considered here.
Availability Criteria: The primary requirement of the data-processing facility,
above all else, is availability. This requirement, essentially a function of hardware reliability and maintainability, is, to the user,
simply the percentage of available, on-line,
operation time during a given time period.
Every system designer must trade off the
costs of designing. for reliability against
those incurred by unavailability, but in no
other application are the costs of unavailability so high as those presented in command and control. Not only is the requirement for hardware reliability greater than
that of commercial systems, but downtime
for the complete system for preventive maintenance cannot be permitted. Depending
upon the application, some greater or lesser
portion of the complete system must always
be available for primary system functions,
and all of the system must be available most
of thetime.
-The data processing facility may also be
called upon, except at the most critical
times, to take part in exerCising and evaluating the operation of some parts of the system, or, in fact, in actual simulation of system functions. During such exercises and
simulations, the system must maintain some

The D825 Modular Data Processing System is the result of a Burroughs study, initiated several years ago, of the data processing requirements for command and control
systems. The D825 has been developed for
operation in the military environment. The
initial system,. constructed for the Naval
Research Laboratory with the designation
AN/GYK-3(V), has been completed and tested.
This paper reviews the design criteria analysis and design rationale that led to the system structure of the D825. The implementation and operation of the system are also
described. Of particular interest is the role
that developed for an operating system program in coordinating the system components.
Functional Requirements of Command
and Control Data Processing
By "command and control system" is
meant a system having the capacity to monitor and direct all aspects of the operation of
a large man and machine complex. Until
now, the term has been applied exclusively
to certain military complexes, but could as
well be applied to a fully integrated air traffic control system or even to the operation
of a large industrial complex. Operation of
command and control systems is characterized by an enormous quantity of diverse but
interrelated tasks-generally ariSing in real
time-which are best performed by automatic
data-processing equipment, and are most
effectively controlled in a fully integrated


Proceedings-Fall Joint Computer Conference, 1962/87
(although perhaps partially and temporarily
degraded) real-life and real-time capability,
and must be able to return quickly to full operation. An implication here, of profound
significance in system design, is, again, the
requirement that most of the system be always available; there must be no system elements (unsupported by alternates) performing
functions so critical that failure at these
points could compromise the primary system functions.
Adaptability Criteria: Another requirement, equally difficult to achieve, is that the
computer system must be able to analyze the
demands being made upon it at any given
time, and determine from this analysis the
attention and emphasis that should be given
to the individual tasks of the problem mix
presented. The working configuration of the
system must be completely adaptable so as
to accommodate the diverse problem mixes,
and, moreover, must respond quickly to important changes, such as might be indicated
by external alarms or the results of internal
computations (exceeding of certain thresholds, for example), or to changes in the hardware configuration resulting from the failure
of a system component or from its intentional
removal from the system. The system must
have the ability to be dynamically and automatically restructured to a working configuration that is responsive to the problem-mix
environment.
Expansibility Criteria: The requirement
of expansibility is not unique to command and
control, but is a desirable feature in any application of data processing equipment. However, the need for expansibility is more acute
in command and control because of the dependence of much of the efficacy of the system upon an ability to meet the changing requirements brought on by the very rapidly
changing technology of warfare. Further, it
must be possible to incorporate new functions
in such a way that little or no transitional
downtime results in any hardware area.
Expansion should be possible without incurring the costs of providing more capability
than is needed at the time. This ability of
the system to grow to meet demands should
apply not only to the conventionally expansible areas of memory and I/O but to computational devices, as well.
Programming Criteria: Expansion of the
data-processing facility should require no reprogramming of old functions, and programs

for new functions should be easily incorporated into the overall system. To achieve
this capability, programs must be written in
a manner which is independent of system
configuration or problem mix, and should
even be interchangeable between sites performing like tasks in different geographic
locales. Finally, because of the large volume of routines that must be written for a
command and control system, it should be
possible for many different people, in different locations and of different areas of
responsibility, to write portions of programs, and for the programs to be subsequently linked together by a suitable operating system.
Concomitant with the latter requirement
and with that of configuration-independent
programs is the desirability of orienting system design and operation toward the use of a
high-level procedure-oriented language. The
language should have the features of the usual
algorithmic languages for scientific computations, but should also include provisions
for maintaining large files of data sets which
may, in fact, be ill-structured. It is also
desirable that the language reflect the special nature of the application; this is especially true when the language is used to direct
the storage and retrieval of data.
Design Rationale for the Data-Processing Facility
The three requirements of availability,
adaptability, and expansibility were the motivating considerations in developing the
D825 design. In arriving at the final systems design, several existing and proposed
schemes for the organization of data processing systems were evaluated in light of
the requirements listed above. Many of the
same conclusions regarding these and other
schemes in the use of computers in command and control were reached independently
in a more recent study conducted for the
Department of Defense by the Institute for
Defense Analysis [1].
The Single-Computer System: The most
obvious system scheme, and the least acceptable for command and control, is the
single-computer system. This scheme fails
to meet the availability requirement simply
because the failure of any part-computer,
memory, or I/O control-disables the entire
system. Such a system was not given serious
consideration.

Replicated Single-Computer Systems: A
system organization that had been well known
at the time these considerations were active
involves the duplication (or triplication, etc.)
of single-computer systems to obtain availability and greater processing rates. This
approach appears initially attractive, inasmuch as programs for the application may
be split among two or more independent
single-computer systems, using as many
such systems as needed to perform all of the
required computation. Even the availability
requirement seems satisfied, since a redundant system may be kept in idle reserve as
backup for the main function.
On closer examination, however, it was
perceived that such a system had many disadvantages for command and control applications. Besides requiring considerable
human effort to coordinate the operation of
the systems, and considerable waste of available machine time, the replicated single computers were found to be ineffective because
of the highly interrelated way in which data
and programs are frequently used in command and control applications. Further, the
steps necessary to have the redundant or
backup system take over the main function,
should the need arise, would prove too cumbersome, particularly in a time-critical application where constant monitoring of events
is required.
Partially Shared Memory Schemes: It was
seen that if the replicated computer scheme
were to be modified by the use of partially
shared memory, some important new capabilities would arise. A partially shared
memory can take several forms, but provides principally for some shared storage
and some storage privately allotted to individual computers. The shared storage may
be of any kind-tapes, discs, or core-but
frequently is core. Such a system, by providing a direct path of communication between computers, goes a long way toward
satisfying the requirements listed above.
The one advantage to be found in having
some memory private to each computer is
that of data protection. This advantage vanishes when it is necessary to exchange data
between computers, for if a computer failure
were to occur, the contents of the private
memory of that computer would be lost to
the system. Furthermore, many tasks in the
command and control application require access to the same data. If, for example, it

would be desirable to permit some privately
stored data to be made available to the fully
shared memory or to some other private
memory, considerable time would be lost in
transferring the data. It is also clear that a
certain amount of utilization efficiency is
lost, since some private memory may be
unused, while another computer may require
more memory than is directly available, and
may be forced to transfer other blocks of
data back to bulk storage to make way for
the necessary storage. It might be added in
passing that if private I/O complements are
considered, the same questions of decreased
overall availability and decreased efficiency
arise.
Master/Slave Schemes: Another aspect
of the partially shared memory system is
that of control. A number of such systems
employ a master/slave scheme to achieve
control, a technique wherein one computer,
designated the master computer, coordinates
the work done by the others. The master
computer might be of a different character
than the others, as in the PILOT system,
developed by the National Bureau of Standards [2], or it may be of the same basic design, differing only in its prescribed role, as
in the Thompson Ramo Wooldridge TRW400
(AN/FSQ-27) [3]. Such a scheme does recognize the importance, for multicomputer systems, of the problem of coordinating the processing effort; the master computer is an
effective means of accomplishing the coordination. However, there are several difficulties in such a design. The loss of the
master computer would down the whole system, and the command and control availability
requirement could not, consequently, be met.
If this weakness is countered by providing
the ability for the master control function to
be automatically switched to another processor, there still remains an inherent inefficiency. If, for example, the workload of
the master computer becomes very large,
the master becomes a system bottleneck
resulting in inefficient use of all other system elements; and, on the other hand, if the
workload fails to keep the master busy, a
waste of computing power results. The conclusion is then reached that a master should
be established only when needed; this is what
has been done in the design of the D825.
The Totally Modular Scheme: As a result
of these analyses, certain implications became clear. The availability requirement

dictated a decentralization of the computing
function-that is, a multiplicity of computing
units. However, the nature of the problem
required that data be freely communicable
among these several computers. It was decided, therefore, that the memory system
would be completely shared by all processors.
And, from the point of view of availability
and efficiency, it was also seen to be undesirable to associate I/O with a particular
computer; the I/O control was, therefore,
also decoupled from the computers.
Furthermore, a system with several computers, totally shared memory, and decoupled
I/O seemed a perfect structure for satisfying
the adaptability requirements of command
and control. Such a structure resulted in a
flexibility of control which was a fine match
for the dynamic, highly variable, processing
requirements to be encountered.
The major problem remaining to realize
the computational potential represented by
such a system was, of course, that of coordinating the many system elements to behave,
at any given time, like a system specifically
designed to handle the set of tasks with which
it was faced at that time. Because of the
limitations of previously available equipment,
an operating system program had always
been identified with the equipment running
the program. However, in the proposed design, the entire memory was to be directly
accessible to all computer modules, and the
operating system could, therefore, be decoupled from any specific computer. The
operation of the system could be coordinated
by having any processor in the complement
run the operating system only as the need
arose. It became clear that the master computer had actually become a program stored
in totally shared memory, a transformation
which was also seen to offer enhanced programming flexibility.
Up to this point, the need for identical
computer modules had not been established.
The equality of responsibility among computing units, which allowed each computer to
perform as the master when running the operating system, led finally to the design
specification of identical computer modules.
These were freely interconnected to a set of
identical memory modules and a set of identical I/O control modules, the latter, in turn,
freely interconnected to a highly variable
and diverse I/O device complement. It was
clear that the complete modularity of system

elements was an effective solution to the
problem of expansibility, inasmuch as expansion could be accomplished simply by
adding modules identical to those in the
existing complement. It was also clear that
important advantages and economies resulting from the manufacture, maintenance, and
spare parts provisioning for identical modules also accrue to such a system. Perhaps
the most important result of a totally modular organization is that redundancy of the
required complement of any module type, for
greater reliability, is easily achieved by incorporating as little as one additional module
of that type in the system. Furthermore,
the additional module of each type need not
be idle; the system may be looked upon as
operating with active spares.
Thus, a design structure based upon complete modularity was set. Two items remained to weld the various functional modules into a coordinated system-a device to
electronically interconnect the modules, and
an operating system program with the effect
of a master computer, to coordinate the activities of the modules into fully integrated
system operation.
In the D825, these two tasks are carried
out by the switching interlock and the Automatic Operating and Scheduling Program
(AOSP), respectively. Figure 1 shows how
the various functional modules are interconnected via the interlock in a matrix-like
fashion.
System Implementation
Most important in the design implementation of the D825 were studies toward practical realization of the switching interlock
and the AOSP. The computer, memory, and
I/O control modules permitted more conventional solutions, but were each to incorporate
some unusual features, while many of the I/O
devices were selected from existing equipment. With the exception of the latter, all
of these elements are discussed here briefly.
(A summary of D825 characteristics and
specifications is included at the end of the
paper.)
Switching Interlock: Having determined
that only a completely shared memory system
would be adequate, it was necessary to find
some way to permit access to any memory
by any processor, and, in fact, to permit

Figure 1. System Organization, Burroughs D825 Modular Data Processing System. (Diagram not reproduced; surviving labels include magnetic tape transport, magnetic disc file, printers, punch, readers, special real-time clocks, selected data converters, input/output control modules, and intersystem data links.)

sharing of a memory module by two or more
processors or I/O control modules.
A function distributed physically through
all of the modules of a D825 system, but which
has been designated in aggregate the switching interlock, effects electronically each of
the many brief interconnections by which all
information is transferred among computer,
memory, and I/O control modules. In addition
to the electronic switching function, the
switching interlock has the ability to detect

and resolve conflicts such as occur when two
or more computer modules attempt access
to the same memory module.
The switching interlock consists functionally of a crosspoint switch matrix which
effects the actual switching of bus interconnections, and a bus allocator which resolves
all time conflicts resulting from simultaneous requests for access to the same bus or
system module. Conflicting requests are
queued up according to the priority assigned

to the requestors. Priorities are pre-emptive
in that the appearance of a higher priority
request will cause service of that request
before service of a lower priority request
already in the queue. Analyses of queueing
probabilities have shown that queues longer
than one are extremely unlikely.
The priority scheduling function is performed by the bus allocator, essentially a
set of logical matrices. The conflict matrix
detects the presence of conflicts in requests
for interconnection. The priority matrix
resolves the priority of each request. The
logical product of the states of the conflict
and priority matrices determines the state
of the queue matrix, which in turn governs
the setting of the crosspoint switch, unless
the requested module is busy.
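The conflict-resolution rule just described can be illustrated with a small sketch. It is not the D825 logic, which the paper describes only as a set of logical matrices; the function name `allocate`, the request tuples, the module names, and the convention that a lower number means higher priority are all assumptions made for illustration. Re-sorting the waiting requests on every cycle stands in for the pre-emptive ordering of the queue.

```python
from collections import defaultdict


def allocate(requests, busy=frozenset()):
    """Pick one requestor per memory module for the next cycle.

    requests: list of (requestor, module, priority) tuples; a lower number
              means higher priority (an assumed convention).
    busy:     modules that cannot accept a connection this cycle.
    Returns (granted, queued): granted maps module -> requestor, queued maps
    module -> requestors still waiting, highest priority first.
    """
    by_module = defaultdict(list)
    for requestor, module, priority in requests:
        by_module[module].append((priority, requestor))

    granted, queued = {}, {}
    for module, waiting in by_module.items():
        waiting.sort()  # highest priority (lowest number) first
        names = [name for _, name in waiting]
        if module in busy:
            queued[module] = names  # nobody connects to a busy module
        else:
            granted[module] = names[0]
            if names[1:]:
                queued[module] = names[1:]
    return granted, queued


# Hypothetical cycle: two requestors conflict on mem-0, one wants mem-3.
granted, queued = allocate(
    [("computer-1", "mem-0", 2), ("io-ctl-1", "mem-0", 1), ("computer-2", "mem-3", 2)]
)
```

Here the conflict on `mem-0` is resolved in favor of the requestor with the higher assumed priority, and the other requestor waits in a queue of length one, the case the queueing analyses suggest is the longest likely to arise.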
The AOSP: An Operating System Program: The AOSP is an operating system
program stored in totally shared memory
and therefore available to any computer.
The program is run only as needed to exert
control over the system. The AOSP includes
its own executive routine, an operating system for an operating system, as it were,
calling out additional routines, as required.
The configuration of the AOSP thus permits
variation from application to application,
both in sequence and quantity of available
routines and in disposition of AOSP storage.
The AOSP operates effectively on two
levels, one for system control, the other for
task processing.
The system control function embodies all
that is necessary to call system programs
and associated data from some location in
the I/O complement, and to ready the programs for execution by finding and allocating
space in memory, and initiating the processing. Most of the system control function (as
well as the task processing function) consists
of elaborate bookkeeping for: programs being
run, programs that are active (that is, occupy
memory space), I/O commands being executed, other I/O commands waiting, external
data blocks to be received and decoded, and
activation of the appropriate programs to
handle such external data. It would be inappropriate here to discuss the myriad details
of the AOSP; some idea of its scope, however,
can be obtained from the following list of
some of its major functions:
1. configuration determination,
2. memory allocation,
3. scheduling,

4. program readying and end-of-job
cleanup,
5. reporting and logging,
6. diagnostics and confidence checking,
7. external interrupt processing.
The task processing function of the AOSP
is to execute all program I/O requests in
order to centralize scheduling problems and
to protect the system from the possibility of
data destruction by ill-structured or conflicting programs.
AOSP Response to Interrupts: The AOSP
function depends heavily upon the comprehensive set of interrupts incorporated in the
D825. All interrupt conditions are transmitted to all computer modules in the system, and each computer module can respond
to all interrupt conditions. However, to make
it possible to distribute the responsibility
for various interrupt conditions, both system
and local, each computer module has an
interrupt mask register that controls the
setting of individual bits of the interrupt
register. The occurrence of any interrupt
causes one of the system computer modules
to leave the program it has been running and
branch to the suitable AOSP entry, entering
a control mode as it branches. The control
mode differs from the normal mode of operation in that it locks out the response to some
low-priority interrupts (although recording
them) and enables the execution of some additional instructions reserved for AOSP use
(such as setting an interrupt mask register
or memory protection registers, or transmitting an I/O instruction to an I/O control
module).
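The way mask registers distribute responsibility for broadcast interrupts can be sketched as follows. The bit assignments, class name, and module names are hypothetical; the paper does not give the register layout. The sketch shows only the stated principle: every module sees every condition, and the mask decides which bits of its interrupt register may be set.

```python
# Bit positions are hypothetical; the paper does not give the register layout.
EXTERNAL_REQUEST = 1 << 0
IO_COMPLETE = 1 << 1
CLOCK_OVERFLOW = 1 << 2


class ComputerModule:
    def __init__(self, name, mask):
        self.name = name
        self.mask = mask      # interrupt mask register
        self.interrupts = 0   # interrupt register

    def present(self, condition):
        """Offer a broadcast condition; set a bit only if the mask allows it."""
        accepted = condition & self.mask
        self.interrupts |= accepted
        return bool(accepted)


def broadcast(modules, condition):
    """Send one condition to every module; return the names that latch it."""
    return [m.name for m in modules if m.present(condition)]


modules = [
    ComputerModule("computer-1", EXTERNAL_REQUEST | CLOCK_OVERFLOW),
    ComputerModule("computer-2", IO_COMPLETE),
]
responders = broadcast(modules, IO_COMPLETE)
```

With these assumed masks, only `computer-2` latches the I/O complete condition, so only that module would branch to the corresponding AOSP entry.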
In responding to an interrupt, the AOSP
transfers control to the appropriate routine
handling the condition designated by the interrupt. When the interrupt condition has
been satisfied, control is returned to the
original object program. Interrupts caused
by normal operating conditions include:
1. 16 different types of external requests,
2. completion of an I/O operation,
3. real-time clock overflow,
4. array data absent,
5. computer-to-computer interrupts,
6. control mode entry (normal mode halt).
Interrupts related to abnormalities of either
program or equipment include:
1. attempt by program to write out of
bounds,
2. arithmetic overflow,
3. illegal instruction,

4. inability to access memory, or an internal parity error; parity error on an
I/O operation causes termination of
that operation with suitable indication
to the AOSP,
5. primary power failure,
6. automatic restart after primary power
failure,
7. I/O termination other than normal
completion.
While the reasons for including most of the
interrupts listed above are evident, a word
of comment on some of them is in order.
The array-data-absent interrupt is initiated when a reference is made to data that
is not present in the memory. Since all array
references such as A[k] are made relative to
the base (location of the first element) of the
array, it is necessary to obtain this address
and to index it by the value k. When the base
of array A is fetched, hardware sensing of a
presence bit either allows the operation to
continue, or initiates the array-data-absent
interrupt. In this way, keeping track of data
in use by interacting programs can be simplified, as may the storage allocation problem.
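The presence-bit check can be mimicked in software to show the control flow. In this sketch an exception stands in for the array-data-absent interrupt; the `Descriptor` class, its fields, and the memory layout are assumptions for illustration and are not taken from the D825 word format.

```python
class ArrayDataAbsent(Exception):
    """Stands in for the array-data-absent interrupt."""


class Descriptor:
    """Hypothetical array descriptor: a base address plus a presence bit."""

    def __init__(self, base=None):
        self.base = base
        self.present = base is not None


def fetch(memory, descriptor, k):
    """Return element k of the array, or raise if the array is not in memory."""
    if not descriptor.present:
        raise ArrayDataAbsent("array must be brought into memory first")
    return memory[descriptor.base + k]  # base indexed by k


memory = [0] * 16
memory[4:8] = [10, 20, 30, 40]  # array A loaded at base address 4
A = Descriptor(base=4)
B = Descriptor()                # array B not yet loaded

value = fetch(memory, A, 2)
try:
    fetch(memory, B, 0)
    absent = False
except ArrayDataAbsent:
    absent = True
```

The reference to `A[2]` proceeds normally, while the reference to the absent array `B` is intercepted before any memory access is made, which is the point at which an operating system could load the data and retry.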
The primary power failure interrupt is
highest priority, and always pre-emptive.
This interrupt causes all computer and I/O
control modules to terminate operations, and
to store all volatile information either in
memory modules or in magnetic thin-film
registers. (The latter are integral elements
of computer modules.) This interrupt protects the system from transient power failure,
and is initiated when the primary power
source voltage drops below a predetermined
limit.
The automatic restart after primary power
failure interrupt is provided so that the previous state of the system can be reconstructed.
A description of how an external interrupt
is handled might clarify the general interrupt
procedure. Upon the presence of an external
interrupt, the computer which has been assigned responsibility to handle such interrupts
automatically stores the contents of those
registers (such as the program counter) necessary to subsequently reconstitute its state,
enters the control mode, and goes to a standard (hardware-determined) location where a
branch to the external request routine is
located. This routine has the responsibility
of determining which external request line
requires servicing, and, after consulting a
table of external devices (teletype buffers,

console keyboards, displays, etc.) associated
with the interrupt lines, the computer constructs and transmits an input instruction to
the requesting device for an initial message.
The computer then makes an entry in the
table of the I/O complete program (the program that handles I/O complete interrupts)
to activate the appropriate responding routine when the message is read in. A check
is then made for the occurrence of additional
external requests. Finally, the computer
restores the saved register contents and
returns in normal mode to the interrupted
program.
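The sequence of steps in the external interrupt procedure can be summarized in a short sketch. All data structures here (the `cpu` dictionary, the device table, the list of issued instructions, and the routine names) are hypothetical stand-ins chosen to make the order of operations visible; they do not represent AOSP tables.

```python
def handle_external_requests(cpu, pending_lines, device_table,
                             io_complete_table, issued):
    """Service every pending external request line, then resume the program."""
    saved = dict(cpu["registers"])        # save the state needed to resume
    cpu["mode"] = "control"
    while pending_lines:                  # also covers additional requests
        line = pending_lines.pop(0)
        device = device_table[line]       # which device raised this line
        issued.append(("input", device))  # input instruction for the initial message
        # Entry telling the I/O complete program which routine to activate.
        io_complete_table[device] = f"respond_to_{device}"
    cpu["registers"] = saved              # restore the saved register contents
    cpu["mode"] = "normal"


cpu = {"mode": "normal", "registers": {"program_counter": 0o1234}}
device_table = {0: "teletype_buffer", 5: "console_keyboard"}
io_complete_table, issued = {}, []
handle_external_requests(cpu, [5, 0], device_table, io_complete_table, issued)
```

After the call, an input instruction has been recorded for each requesting device, the table for the I/O complete program holds one entry per device, and the computer is back in normal mode with its registers unchanged.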
AOSP Control of I/O Activity: As mentioned above, control of all I/O activity is
also within the province of the AOSP. Records are kept on the condition and availability of each I/O device. The locations of
all files within the computer system, whether
on magnetic tape, drum, disc file, card, or
represented as external inputs, are also
recorded. A request for input by file name
is evaluated, and, if the device associated
with this name is readily available, the action
is initiated. If for any reason the request
must be deferred, it is placed in a program
queue to await conditions which permit its
initiation. Typical conditions which would
cause deferral of an I/O operation include:
1. no available I/O control module or
channel,
2. the device in which the file is located
is presently in use,
3. the file does not exist in the system.
In the latter case, typically, a message would
be typed out on the supervisory printer, asking for the missing file.
The I/O complete interrupt signals the
completion of each I/O operation. Along with
this interrupt, an I/O result descriptor is
deposited in an AOSP table. The status
relayed in this descriptor indicates whether
or not the operation was successful. If not
successful, what went wrong (such as a parity error, or tape break, card jams, etc.) is
indicated so that the AOSP may initiate the
appropriate action. If the operation was successful, any waiting I/O operations which
can now proceed are initiated.
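The deferral and retry behavior described above can be sketched as a small bookkeeping class. The class and method names, the file and device names, and the wording of the printed message are hypothetical. For brevity the sketch treats every completed operation as successful and does not model the result descriptor.

```python
from collections import deque


class IOControl:
    """Toy bookkeeping for file requests; all names here are hypothetical."""

    def __init__(self, files, free_channels):
        self.files = files                  # file name -> device holding it
        self.free_channels = free_channels  # idle I/O control modules
        self.busy_devices = set()
        self.deferred = deque()             # requests awaiting better conditions
        self.messages = []                  # supervisory printer output

    def request(self, name):
        """Start the transfer if possible, else defer it. True if started."""
        device = self.files.get(name)
        if device is None:
            self.messages.append(f"please supply file {name}")
        elif self.free_channels > 0 and device not in self.busy_devices:
            self.free_channels -= 1
            self.busy_devices.add(device)
            return True
        self.deferred.append(name)
        return False

    def complete(self, device):
        """Handle an I/O complete interrupt, then retry deferred requests."""
        self.busy_devices.discard(device)
        self.free_channels += 1
        for _ in range(len(self.deferred)):
            self.request(self.deferred.popleft())


io = IOControl({"tracks": "tape-1", "orders": "tape-1"}, free_channels=1)
first = io.request("tracks")    # starts at once
second = io.request("orders")   # deferred: device in use, no free channel
io.complete("tape-1")           # the deferred request can now start
```

The second request is held in the queue until the I/O complete condition frees both the device and a control module, at which point it is initiated without further action by the requesting program.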
AOSP Control of Program Scheduling:
Scheduling in the D825 relies upon a job table
maintained by the AOSP. Each entry is identified with a name, priority, precedence
requirements, and equipment requirements.
Priority may be dynamic, depending upon

time, external requests, other programs, or
a function of many variable conditions. Each
time the AOSP is called upon to select a program to be run, whether as a result of the
completion of a program or of some other
interrupt condition, the job table is evaluated. In a real-time system, situations
occur wherein there is no system program
to be run, and machine time is available for
other uses. This time could be used for
auxiliary functions, such as confidence routines.
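A job table evaluation of the kind described might look like the following sketch. The field names, the job names, and the rule that a larger number means higher priority are assumptions; the paper does not specify how priorities are encoded or how ties are broken.

```python
def select_job(job_table, finished, free_equipment):
    """Return the name of the best runnable job, or None if nothing can run."""
    runnable = [
        job for job in job_table
        if job["name"] not in finished
        and set(job["after"]) <= finished        # precedence requirements met
        and set(job["needs"]) <= free_equipment  # equipment requirements met
    ]
    if not runnable:
        return None  # idle time, usable for confidence routines
    return max(runnable, key=lambda job: job["priority"])["name"]


# Hypothetical entries: name, priority, precedence, equipment.
job_table = [
    {"name": "track_update", "priority": 9, "after": ["radar_input"], "needs": ["drum"]},
    {"name": "radar_input", "priority": 5, "after": [], "needs": ["data_link"]},
    {"name": "daily_report", "priority": 1, "after": [], "needs": ["printer"]},
]
free = {"drum", "data_link", "printer"}
first = select_job(job_table, set(), free)
second = select_job(job_table, {"radar_input"}, free)
```

The highest-priority job is passed over on the first evaluation because its precedence requirement is not yet met, and is selected on the second once the job it depends on has finished.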
The AOSP provides the capability for program segmentation at the discretion of the
programmer. Control macros embedded in
the program code inform the AOSP that
parallel processing with two or more computers is possible at a given point. In addition, the programmer must specify where
the branches indicated in this manner will
join following the parallel processing.
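The branch-and-join pattern can be illustrated with threads standing in for computer modules. This is only an analogy: the D825 control macros are not shown in the paper, and the two branch functions below are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def total(track):
    return sum(track)


def peak(track):
    return max(track)


track = [480, 510, 495]

# Branch point: the two computations are independent of each other,
# so they may proceed in parallel.
with ThreadPoolExecutor(max_workers=2) as pool:
    total_future = pool.submit(total, track)
    peak_future = pool.submit(peak, track)
    # Join point: continue only when both branches have finished.
    result = (total_future.result(), peak_future.result())
```

The programmer marks both where the work divides and where it must come together again; the scheduler remains free to run the branches on one processor or several.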
Computer Module: The computer modules
of the D825 system are identical, general-purpose, arithmetic and control units. In
determining the internal structure of the
computer modules, two considerations were
uppermost. First, all programs and data
had to be arbitrarily relocatable to simplify
the storage allocation function of the AOSP;
secondly, programs would not be modified
during execution. The latter consideration
was necessary to minimize the amount of
work required to pre-empt a program, since
all that would have to be saved to reinstate
the interrupted program at a later time would
be the data for that program and the register
contents of the computer module running the
program at the time it was dumped.
The D825 computer modules employ a
variable-length instruction format made up
of quarter-word syllables. Zero-, one-, two-,
or three-address syllables, as required, can
be associated with each basic command syllable. An implicitly addressed accumulator
stack is used in conjunction with the arithmetic unit. Indexing of all addresses in a
command is provided, as well as arbitrarily
deep indirect addressing for data.
Each computer module includes a 128-position thin-film memory used for the stack,
and also for many of the registers of the machine, such as the program base register,
data base register, the index registers, limit
registers, and the like.
The instruction complement of the D825
includes the usual fixed-point, floating-point, logical, and partial-field commands
found in any reasonably large scientific data
processor.
Memory Module: The memory modules
consist of independent units storing 4096
words, each of 48 bits. Each unit has an
individual power supply and all of the necessary electronics to control the reading,
writing, and transmission of data. The size
of the memory modules was established as a
compromise between a module size small
enough to minimize conflicts wherein two or
more computer or I/O modules attempt access to the same memory module, and a size
large enough to keep the cost of duplicated
power supplies and addressing logic within
bounds. It might be noted that for a larger
modular processor system, these trade-offs
might indicate that memory modules of 8192
words would be more suitable. Modules
larger than this-of 16,384 or 32,768 words,
for example-would make construction of
relatively small equipment complements
meeting the requirements set forth above
quite difficult. The cost of smaller units of
memory is offset by the lessening of catastrophe in the event of failure of a module.
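The effect of module size on the severity of a single failure can be shown with simple arithmetic. The installation size of 32,768 words is an assumed figure chosen for illustration; the module sizes are those mentioned in the text.

```python
def loss_on_failure(total_words, module_words):
    """Fraction of total memory lost when one module fails."""
    modules = total_words // module_words
    return 1 / modules


total = 32768  # an assumed installation size, for illustration only
losses = {size: loss_on_failure(total, size) for size in (4096, 8192, 16384, 32768)}
```

For this assumed installation, the loss of one 4096-word module removes one eighth of memory, whereas with 16,384-word modules the same failure removes half of it.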
I/O Control Module: The I/O control
module executes I/O operations defined and
initiated by computer module action. In
keeping with the system objectives, I/O control modules are not assigned to any particular computer module, but rather are treated
in much the same way as memory modules,
with automatic resolution of conflicting attempted accesses via the switching interlock
function. Once an I/O operation is initiated,
it proceeds independently until completion.
I/O action is initiated by the execution of
a transmit I/O instruction in one of the computer modules, which delivers an I/O descriptor word from the addressed memory
location to an inactive I/O control module.
The I/O descriptor is an instruction to the
I/O control module that selects the device,
determines the direction of data flow, the
address of the first word, and the number of
words to be transferred.
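The four items carried by an I/O descriptor can be written down as a record. The field names, types, and example values are assumptions; the paper lists the content of the descriptor but not its bit layout within the 48-bit word.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IODescriptor:
    """Hypothetical record of the four items the text attributes to a descriptor."""
    device: int       # which device to select
    is_input: bool    # direction of data flow
    first_word: int   # memory address of the first word
    word_count: int   # number of words to be transferred

    def addresses(self):
        """Memory addresses touched by the transfer, in order."""
        return range(self.first_word, self.first_word + self.word_count)


d = IODescriptor(device=3, is_input=True, first_word=0o4000, word_count=4)
touched = list(d.addresses())
```

Because the descriptor is complete in itself, the I/O control module that receives it needs no further attention from the computer module until the operation ends.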
Interposed between the I/O control modules and the physical external devices is another crossbar switch designated the I/O
exchange. This automatic exchange, similar
in function to the switching interlock, permits two-way data flow between any I/O
control module and any I/O device in the system. It further enhances the flexibility of the

system by providing as many possible external data transfer paths as there are I/O control modules.
Equipment Complements: A D825 system
can be assembled (or expanded) by selection
of appropriate modules in any combination of:
one to four computer modules, one to 16
memory modules, one to ten I/O control
modules, one or two I/O exchanges, and one
to 64 I/O devices per I/O exchange in any
combination selected from: operating (or
system status) consoles, magnetic tape transports, magnetic drums, magnetic disc files,
card punches and readers, paper tape perforators and readers, supervisory printers,
high-speed line printers, selected data converters, special real-time clocks, and intersystem data links.
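The limits on an equipment complement can be expressed as a simple check. The sketch below uses only the ranges stated in the text and the 4096-word module size listed in Table 1; the function names and the way problems are reported are illustrative assumptions.

```python
# Ranges stated in the text for a D825 equipment complement.
LIMITS = {
    "computer modules": (1, 4),
    "memory modules": (1, 16),
    "I/O control modules": (1, 10),
    "I/O exchanges": (1, 2),
}
MAX_DEVICES_PER_EXCHANGE = 64
WORDS_PER_MEMORY_MODULE = 4096  # from Table 1


def check_complement(computer_modules, memory_modules,
                     io_control_modules, devices_per_exchange):
    """Return a list of problems; an empty list means the complement is allowed.

    devices_per_exchange holds one device count per I/O exchange.
    """
    counts = {
        "computer modules": computer_modules,
        "memory modules": memory_modules,
        "I/O control modules": io_control_modules,
        "I/O exchanges": len(devices_per_exchange),
    }
    problems = []
    for name, count in counts.items():
        low, high = LIMITS[name]
        if not low <= count <= high:
            problems.append(f"{name}: {count} is outside {low}..{high}")
    for number, devices in enumerate(devices_per_exchange, start=1):
        if not 1 <= devices <= MAX_DEVICES_PER_EXCHANGE:
            problems.append(f"I/O exchange {number}: {devices} devices")
    return problems


def memory_words(memory_modules):
    """Total memory capacity in words for a given number of modules."""
    return memory_modules * WORDS_PER_MEMORY_MODULE
```

For instance, a system with two computer modules, four memory modules, two I/O control modules, and nine devices on a single exchange passes the check and has 16,384 words of memory.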
Figure 2 is a photograph of some of the
hardware of a completed D825 system. The
equipment complement of this system includes
two computer modules, four memory modules
(two per cabinet), two I/O control modules
(two per cabinet), one status display console,
two magnetic tape units, two magnetic drums,
a card reader, a card punch, a supervisory
printer, and an electrostatic line printer.

Figure 2. Typical D825 Equipment Array.

D825 characteristics are summarized in
Table 1.
SUMMARY AND CONCLUSION
It is the belief of the authors that modular
systems (in the sense discussed above) are a
natural solution to the problem of obtaining
greater computational capacity, more natural than simply to build larger and faster
machines. More specifically, the organizational structure of the D825 has been shown
to be a suitable basis for the data processing
facility for command and control. Although
the investigation leading toward this structure proceeded as an attack upon a number
of diverse problems, it has become evident
that the requirements peculiar to this area
of application are, in effect, aspects of a
single characteristic, which might be called
structural freedom. Furthermore, it is now
clear that the most distinctive characteristic of
the structure realized-integrated operation
of freely intercommunicating, totally modular
elements-provides the means for achieving
structural freedom.
For example, one requirement is that
some specified minimum of data processing
capability be always available, or that, under
any conditions of system degradation due to
failure or maintenance, the equipment remaining on line be sufficient to perform primary system functions. In the D825, module
failure results in a reduction of the on-line
equipment configuration but permits normal
operation to continue, perhaps at a reduced
rate. The individual modules are designed
to be highly reliable and maintainable, but
system availability is not derived solely from
this source, as is necessarily the case with
more conventional systems. The modular
configuration permits operation, in effect,
with active spares, eliminating the need for
total redundancy.
A second requirement is that the working
configuration of the system at a given moment
be instantly reconstructable to new forms
more suited to a dynamically and unpredictably changing work load. In the D825, all
communication routes are public, all modules are functionally decoupled, all assignments are scheduled dynamically, and assignment patterns are totally fluid. The system
of interrupts and priorities controlled by the
AOSP and the switching interlock permits
instant adaptation to any work load, without
destruction of interrupted programs.
The requirement for expansibility calls
simply for adaptation on a greater time scale.
Since all D825 modules are functionally decoupled, modules of any types may be added
to the system simply by plugging into the
switching interlock or the I/O exchange.
Expansion in all functional areas may be
pursued far beyond that possible with conventional systems.
It is clear, however, that the D825 system
would have fallen far short of the goals set
for it if only the hardware had been considered. The AOSP is as much a part of the
D825 system structure as is the actual hardware. The concept of a "floating" AOSP as
the force that molds the constituent modules
of an equipment complement into a system
is an important notion having an effect beyond
the implementation of the D825. One interesting by-product of the design effort for the
D825 has, in fact, been a change of perspective; it has become abundantly clear that
computers do not run programs, but that
programs control computers.
ACKNOWLEDGMENTS
The authors wish to acknowledge the outstanding efforts of their many colleagues at
Burroughs Laboratories who have contributed
so well and in so many ways to all stages of
D825 design, development, fabrication, and
programming. It would be impossible to cite
all of these efforts. The authors also wish
to acknowledge the contributions of Mr.
William R. Slack and Mr. William W. Carver,
also of Burroughs Laboratories. Mr. Slack
has been closely associated with the D825
from its original conception to its implementation in hardware and software. Mr.
Carver made important contributions to the
writing and editing of this paper.
REFERENCES
1. Marlin G. Kroger et al., "Computers in
Command and Control" (TR61-12, prepared for DOD:ARPA by Digital Computer
Application Study, Institute for Defense
Analyses, Research and Engineering Support Division), November 1961.
2. A. L. Leiner, W. A. Notz, J. L. Smith,
and A. Weinberger, "Organizing a Network of Computers to Meet Deadlines,"
Proceedings, Eastern Joint Computer Conference, December 1957.
3. R. E. Porter, "The RW-400: A New Polymorphic Data System," Datamation, Vol.
6, No. 1, January/February 1960, pp. 8-14.

Table 1. Specifications, D825 Modular Data Processing System

Computer modules:                 4, maximum complement
Computer module, type:            Digital, binary, parallel, solid-state
Word length:                      48 bits including sign (8 characters, 6 bits each)
                                  plus parity
Index registers                   15
  (in each computer module):
Magnetic thin-film registers      128 words, 16 bits per word, 0.33-μsec read/write
  (in each computer module):      cycle time
Real-time clock                   10 msec resolution
  (in each computer module):
Binary add:                       1.67 μsec (average)
Binary multiply:                  36.0 μsec (average)
Floating-point add:               7.0 μsec (average)
Floating-point multiply:          34.0 μsec (average)
Logical AND:                      0.33 μsec
Memory type:                      Homogeneous, modular, random-access, linear-select,
                                  ferrite-core
Memory capacity:                  65,536 words (16 modules maximum, 4096 words each)
I/O exchanges per system:         1 or 2
I/O control modules:              10 per exchange, maximum
I/O devices:                      64 per exchange, maximum
Access to I/O devices:            All I/O devices available to every I/O control module
                                  in exchange
Transfer rate per I/O exchange:   2,000,000 characters per second
I/O device complement:            All standard I/O types, including 67 kc magnetic
                                  tapes, magnetic drums and discs, card and paper tape
                                  punches and readers, character and line printers,
                                  communications and display equipment

THE SOLOMON COMPUTER*
Daniel L. Slotnick, W. Carl Borck, and Robert C. McReynolds
Air Arm Division
Westinghouse Electric Corporation
Baltimore, Maryland
*The applied research reported in this document has been made possible through support and
sponsorship extended by the U.S. Air Force Rome Air Development Center and the U.S. Army
Signal Research and Development Laboratory under Contract Number AF30(602)2724: Task
730J. It is published for technical information only, and does not necessarily represent
recommendations or conclusions of the sponsoring agency.
† Jeeves, T. A., et al., "On the Use of the SOLOMON Parallel-Processing Computer." Proceedings of the Eastern Joint Computer Conference, Philadelphia, December 1962.

INTRODUCTION AND SUMMARY
The SOLOMON (Simultaneous Operation
Linked Ordinal MOdular Network), a parallel
network computer, is a new system involving
the interconnections and programming, under
the supervision of a central control unit, of
many identical processing elements (as few
or as many as a given problem requires), in
an arrangement that can simulate directly
the problem being solved.
The parallel network computer shows
great promise in aiding progress in certain
critically important areas limited by the
capabilities of current computing systems.
Many of these technical areas possess the
common mathematical denominator of involving calculations with a matrix or mesh
of numerical values, or more generally involving operations with sets of variables
which permit simultaneous independent operation on each individual variable within the
set. This group is typified by the solution of
linear systems, the calculation of inverses
and eigenvalues of matrices, correlation and
autocorrelation, and numerical solution of
systems of ordinary and partial differential
equations. Such calculations are encountered
throughout the entire spectrum of problems
in data reduction, communication, character
recognition, optimization, guidance and control, orbit calculations, hydrodynamics, heat
flow, diffusion, radar data processing, and
numerical weather forecasting.
An example of the type of problem permitting the use of the parallelism is the numerical solution of partial differential equations. Assuming the value of a function, u, is
known on the boundary, Γ, of a region, the
solution of the Laplace equation† can be calculated at each mesh point (x, y) in the region
as illustrated in Figure 1.

$$u^{(n+1)}(x,y) = \frac{1}{4}\left\{ u^{(n)}(x-h,y) + u^{(n)}(x+h,y) + u^{(n)}(x,y+h) + u^{(n)}(x,y-h) \right\}$$

Figure 1. Iterative Solution of
Laplace's Equation
(Mesh diagram: u(x, y) known in Γ; mesh point (x, y) with neighbors (x-h, y), (x+h, y), (x, y+h), and (x, y-h).)

Since the iteration formula is identical
for each mesh point in the region, the arithmetic capability provided by a processing
element corresponding to each point will
enable one calculation of the equation; i.e., a
single program execution, to improve the
approximation at each of the mesh points
simultaneously.
Figure 2 illustrates a basic array of processing elements. Each of these elements
possesses 4096 bits of core storage, and the
arithmetic capabilities to perform serial-by-bit arithmetic and logic. An additional capability possessed by each processing element
is that of communication with other processing elements. Processing element E can
transmit and receive data serially from its
four nearest neighbors: the processing elements immediately to right, A; left, C; above,
B; and below, D.
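The nearest-neighbor iteration can be sketched in ordinary sequential code. The example below assumes a square mesh whose outer ring of points carries the known boundary values, and it replaces every interior value by the mean of its four neighbors on each sweep. It only imitates, with loops, what the array would do at all mesh points at once; the function names are illustrative and this is not a SOLOMON program.

```python
def jacobi_sweep(u):
    """Return a new grid in which every interior point holds the mean of
    its four nearest neighbors; boundary points are copied unchanged."""
    rows, cols = len(u), len(u[0])
    new = [row[:] for row in u]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Right, left, above, and below neighbors of point (i, j).
            new[i][j] = 0.25 * (u[i][j + 1] + u[i][j - 1]
                                + u[i - 1][j] + u[i + 1][j])
    return new


def solve_laplace(u, sweeps):
    """Apply the sweep a fixed number of times and return the final grid."""
    for _ in range(sweeps):
        u = jacobi_sweep(u)
    return u


def make_grid(n):
    """Test grid: boundary value equals the column index, interior starts at 0."""
    return [[float(j) if i in (0, n - 1) or j in (0, n - 1) else 0.0
             for j in range(n)]
            for i in range(n)]
```

With the boundary chosen as the column index, the exact mesh solution is the column index everywhere, so `solve_laplace(make_grid(5), 200)` should bring each interior point very close to its own column number. Each sweep here costs one pass over all interior points, whereas with one processing element per mesh point it corresponds to a single program execution.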
A fifth source of input data is available to
the processing element matrix through the
"broadcast input" option. This option utilizes
a register in the central control to supply

Figure 2. Basic Array of Processing
Elements

constants when needed by an arbitrary number of the processing elements during the
same operation cycle. This constant is
treated as a normal operand by the processing elements and results in the central control unit becoming a "fifth" nearest neighbor
to all processing elements.
The processing element array is the core
of the system concept; however, it is the
method of controlling the array which turns
this concept into a viable machine design.
This method of control is the simplest possible in that the processing elements contain a
minimum of control logic - the "multimodal"
logic described below.
Figure 3 illustrates how the processing
element array, a 32 x 32 network, is controlled by a single central control unit.
Multimodal control permits the processing
elements to alter control signals to the processing element network according to values
of internal data. They are individually permitted to execute or ignore these central
control signals.
Basically, the central control unit contains
program storage (large capacity random-access memory), has the means to retrieve
and interpret the stored instructions, and has
the capability, subject to multimodal logic,
to cause execution of these instructions within
the array. Thus, at any given instant, each
processing element in the system is capable
of performing the same operation on the
operands stored in the same memory location
of each processing element. These operands,
however, may all be different. The flow of
control information from the control unit to
the processing elements is indicated in Figure 3 by light lines. An instruction is retrieved from the program storage and transmitted to a register in central control. Within
central control, the instruction is interpreted
and the information contained is translated
into a sequence of signals and transmitted
from central control to the processing elements. Since this information must be provided to 1024 processing elements, it is necessary to branch this information and provide
the necessary amplification and power. This
is accomplished by transmission through
branching levels, which provide the necessary
power for transmission.
As described above, each processing element in the network possesses the capability
of communicating with its four adjacent elements. The "edge" processing elements,
however, do not possess a full complement
of neighbors. The resulting free connections
are used for input-output application. This
makes possible very high data exchange rates
between the central computer and external
devices through the input-output subsystem.
These rates could be still further increased
by providing longer "edges"; i.e., by the use
of a nonsquare network array.

Figure 3. PE Array Under Central Control

Two input-output exchange systems are
used by the input-output equipment. The primary exchange system is a high speed system operating at a data rate near that of the
processing element network. This system
consists of magnetic tapes and rotating magnetic memories and serves the network with
data storage during large net problems.
The secondary exchange system provides
the user with communication with the primary
exchange system through conventional high
speed printers, and tape transports. The
data at the output of this system is compatible
with most conventional devices.
The Processing Element
The processing element (PE) logic, illustrated in Figure 4, basically consists of two
parts: the processing element memory, and
the arithmetic and multimodal control logic.
The multimodal control within each processing element provides the capability for
individual elements to alter the program flow
as a function of the data which each is currently
processing. This capability permits the
processing element to classify data and make
judgments on the course of programming
which it should follow. Whenever individual
elements are in a different mode of operation
than specified by central control, they will
not execute the specified command.
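The effect of mode matching can be illustrated with a small sketch. It assumes each processing element holds one data value and one integer mode, and that central control names the mode in which a command is to be obeyed. The mode numbers, the operations, and all names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProcessingElement:
    value: int
    mode: int = 0  # every element starts in mode 0 (an assumption)


def broadcast(elements, required_mode, operation):
    """Issue one command to the whole array.

    Only elements whose mode equals required_mode execute it; the rest
    ignore it. Returns the number of elements that executed.
    """
    executed = 0
    for pe in elements:
        if pe.mode == required_mode:
            operation(pe)
            executed += 1
    return executed


def flag_negative(pe):
    # The element changes its own mode on the basis of its internal data.
    if pe.value < 0:
        pe.mode = 1


def negate(pe):
    pe.value = -pe.value


def absolute_values(values):
    """Two broadcast commands that leave the absolute value in every element."""
    elements = [ProcessingElement(v) for v in values]
    broadcast(elements, 0, flag_negative)  # obeyed by all elements
    broadcast(elements, 1, negate)         # obeyed only by flagged elements
    return [pe.value for pe in elements]
```

The same two commands are sent to every element, yet only the elements that classified their own data as negative act on the second one, which is how a single instruction stream can follow data-dependent paths.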
During each arithmetic operation, one
word will be read serially from each of the
two memory frames associated with a unique
processing element. The operand in frame
one will be transmitted by central control
command, either to the internal adder or to
that of one of the four adjacent elements
which are its nearest neighbors in the network array. The five gates labeled A in
Figure 4 control the routing of information
from frame one. Since only one of these may
be activated during a single operation, a word
in frame one can be entered in the operation
select logic of only one of the five processing
elements. The frame-two operand can be
routed only into the unit's full adder.
Each PE in the system will communicate
with a corresponding unit, thereby producing
a flow of information between processing
elements during network operations.
Word addressing of the memory is performed by the matrix switches in the central
control unit. These switches convert the
address from the binary form of the instruction to the one-of-n form required for addressing the memory frame. Provision is
made for special addressing of specific
memory locations for temporary storage of
multiplier and quotient during multiplication
and division. Successive bits are shifted into
the PE logic by two digit counters in central
control.
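The binary to one-of-n conversion performed by the matrix switches can be shown in a few lines. This is only a functional sketch under the assumption that exactly one select line is energized for each address; the function name and the list representation of the select lines are illustrative.

```python
def one_of_n(address, address_bits):
    """Convert a binary address into a one-of-n selection.

    Returns a list of n = 2**address_bits values in which exactly one
    entry (the addressed line) is 1 and all others are 0.
    """
    n = 1 << address_bits
    if not 0 <= address < n:
        raise ValueError("address does not fit in the given number of bits")
    return [1 if line == address else 0 for line in range(n)]
```

For a three-bit address, `one_of_n(5, 3)` selects line 5 of 8. Because the conversion is done once in central control, the same selected line addresses the same word in every processing element.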
Three different types of storage are permitted: (1) the sum can be routed into frame
one while the original word in frame two is
rewritten; (2) the sum can be routed into
frame two while the word in frame one is
rewritten; and (3) information can be interchanged between frames. Note that in the
first two operations, the word which was
located in the memory frame into which the
sum is routed is destroyed. No information
is altered during the third type operation.
Multimodal Operation: Multimodal operation gives the processing element the additional capability for altering program flow
and tagging information on the basis of internal data. Any command given by the

[Figure 4. Processing element logic. Diagram not recoverable; legible labels: CENTRAL CONTROL, MATRIX SWITCH, FRAME.]

[...] describes in detail the
application of the SOLOMON system to problems in partial differential equations and
certain matrix calculations. Results to date†
establish a performance advantage of between
60 and 200 for the SOLOMON Computer compared to currently available large-scale digital systems.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the
assistance received through many discussions
with their associates during the conception
and development of the SOLOMON system,
and, in particular, E. R. Higgins, W. H. Leonard, and Dr. J. C. Tu.

‡ Slotnick, D. L., W. C. Borck, and R. C. McReynolds, "Numerical Analysis Considerations for
the SOLOMON Computer." Proceedings of the Air Force Rome Air Development Center-Westinghouse Electric Corporation Air Arm Division Workshop in Computer Organization.
To appear.
† Jeeves, T. A., op. cit.

THE KDF.9 COMPUTER SYSTEM
A. C. D. Haley
English Electric Co., Ltd.,
Kidsgrove,
Stoke-on-Trent, England
SUMMARY
The English Electric KDF.9 computing
system has a number of unusual features
whose origins are to be found in certain decisions reached at an early stage in the planning of the system. At this time (1958-59)
simplified and automatic programming procedures were becoming established as desirable for all programming purposes,
whereas previously they had been regarded
as appropriate only to rapid programming of
"one-off" problems because of the drastic
reductions of machine efficiency which
seemed inevitable.
Many early interpretive programming
schemes aimed to provide an external three-address language, and for a time it appeared
that a machine with this type of internal coding approached the ideal. Increasing interest
in translating programs, particularly for problem languages such as ALGOL and FORTRAN,
showed the fallacy of this assumption. It became evident that efficient translation could
only be achieved on a computer whose internal structure is adapted to handle lengthy
algebraic formulae rather than the artificially
divided segments of a three-address machine.
The solution to the difficulty was found in
the use of a "nesting store" system of working
registers. This consists of a number of
associated storage positions forming a magazine in which information is stored on a
"last in, first out" basis. It is shown that
this basic idea leads to development of a
computer having an order code near to the
ideal for evaluation of problems expressible
in algebraic form.
A number of other significant advantages
arise directly from the nesting store principle, chief among them being a striking reduction in the program storage space required. This is due to elimination of
unnecessary references to main store addresses and to the implicit addressing of
operands in the nesting store itself. Many
instructions are therefore concerned with
specifying only a function, requiring many
fewer bits than those instructions involving
a main store address. Instructions are
therefore of variable length to suit the information content necessary and on average
three instructions may occupy a single machine word (48 bits). This again reduces the
number of main store references, allowing
the use of a store of modest speed while still
allowing adequate time for simultaneous operation of a number of peripheral devices on
an autonomous main store interrupt basis.
Fast parallel arithmetic facilities are
associated with the nesting store, both fixed
and floating-point operations being provided.
A further nesting store system facilitates the
use of subroutines, and a third set of special
stores is associated with a particularly comprehensive set of instruction modification and
counting procedures.
Operation of the machine is normally under
the control of a "Director" program. A number of different Directors cover a variety of
operating conditions. Thus a simple version
is used when only a single program is to be
executed and more sophisticated versions
may be used, for example, to control pseudo-off-line transcription operations in parallel
with a main program, operation of several
programs simultaneously on a priority basis,
etc.
INTRODUCTION
For almost a decade after the first computers were put into service, developments
in system specifications were almost exclusively a by-product of local engineering
progress. No radical changes took place in
machine structure, except insofar as engineering changes led directly to operational
ones. Thus the emergence of the ferrite core
store as the most reliable and economic rapid
access store for any significant quantity of
information led to the abandonment, now virtually complete, of the various types of delay
line store. In consequence, "optimum programming" is now used only in certain systems which exchange speed for economy and
use a magnetic drum for the working store.
The majority of systems continue to be
basically simple single address order code
machines, in which most orders implicitly
specify an arithmetic register as one participant in every operation. It was the subsequent proliferation of transfers between
this register and the main store, purely to
allow its use for intermediate arithmetic operations on instructions, which led to the introduction at Manchester University of the
"B-tube." This in turn has been extended and
elaborated to provide the automatic instruction modification and/or counting features
which are now universal.
A further reduction in housekeeping operations, with a consequent increase in speed,
can be obtained by providing not a single
register associated with the arithmetic unit,
but a number of registers. Sometimes all
facilities are available on all registers, and
in other machines the facilities are divided.
Changes and elaborations such as these
have, of course, had the primary aim of increasing the effective speed of the machine.
The penalty to be paid is the complexity of
programming. The increased quantity of
hardware required is, of course, more than
compensated by increased computing power.
There is no corresponding compensation
for the increased programming costs, particularly for problems of infrequent occurrence. This factor was the provocation for
much of the early work on simplified programming schemes. Major programs and
those to be used repeatedly were written in
machine code, but the remainder were written
in problem-oriented languages either obeyed
interpretively (as if a set of subroutines) or
translated into the necessary machine code
program by highly sophisticated translator
routines. Typical of the former approach
are English Electric "Alphacode" [1] and
Ferranti/Manchester University "Autocode"
[2, 3] and of the latter method I.B.M. "Fortran" [4] and English Electric "Alphacode
Translator" [5, 6]. The penalties of using
these schemes are again different in nature.
Interpretive routines lead to inefficient use
of machine time during every run, factors of
10 to 100 covering most of the systems in
common use. The translator routines, in
contrast, may ideally produce a 100% efficient program, although a loss of speed by a
factor of 1-1/2 to 5 is more usual. The
translation operation, performed only once,
may however occupy long periods and may
swamp the subsequent gain over the interpretive method for a rarely used program.
In the late nineteen-fifties it was evident
that a successful new general-purpose system should ideally have two programming
features:
(a) It should have an order code designed
to allow easy efficient coding in machine
language after a minimum training period:
(b) The language should be such as to allow the preparation of a number of rapid
translators (for many categories of work)
from problem-oriented languages into efficient machine programs.
Studies based on these fundamental requirements culminated in the machine organization selected for KDF.9.
Choice of an Order Code
Many interpretive schemes used up to this
time had been of three-address type, where
the typical instruction is of the form
C = A (function) B,
and A, B, C are main store addresses. Experience had shown that this type of code was
well adapted for design calculations and scientific usage, especially if a wide range of
fixed and floating point functions and well-chosen loop counting and instruction modifying facilities were made available.

At a time when potential arithmetic speeds
were overtaking storage access rates the use
of a three-address machine code had the
drawback of requiring four main store references for each operation (including extraction of an instruction), and with the constant
demand for stores of 32,000 or more words
the necessary instruction length approaches
60 bits when provision for a large number of
functions and modifiers is made.
A single address system, on the other
hand, requires greater care in programming
and a multi-accumulator machine is perhaps
worse still from this standpoint.
None of these conventional structures is
particularly well suited to preparation of efficient translation routines, and a study was
therefore made of a special form of multiaccumulator system using automatic allocation of addresses for the working registers.
This depends on presentation of data to the
system in a form closely analogous to the
"reverse Polish" notation. (This notation
merely involves writing the sign of any arithmetic operation after, rather than before, the
two operands.
Thus

a + b is written ab +
a ÷ b is written ab ÷, etc.)
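The rewriting into reverse Polish form can be sketched as a post-order walk over an expression. The example assumes the expression has already been parsed into nested tuples of the form (operator, left, right); that representation and the function name are illustrative and are not part of any KDF.9 translator.

```python
def to_reverse_polish(expr):
    """Return the reverse Polish token list for a parsed expression.

    An expression is either an operand name (a string) or a tuple
    (operator, left, right). Both operands are emitted first, then the
    operator sign.
    """
    if isinstance(expr, str):
        return [expr]
    operator, left, right = expr
    return to_reverse_polish(left) + to_reverse_polish(right) + [operator]
```

For example, `to_reverse_polish(("+", "a", "b"))` gives the tokens a, b, + and the nested expression `("+", ("*", "A", "B"), ("*", "C", "D"))` gives A, B, *, C, D, *, +. No brackets are needed in the result, because the order of the tokens alone fixes the order of evaluation.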

Fundamentally the procedure is to arrange
the machine structure in such a way as to allow operations to be performed in a natural
order. Thus, in carrying out the operations
involved in calculating E = A.B + C.D, the
natural sequence is
obtain A:
obtain B:
multiply & retain A.B:
obtain C:
obtain D:
multiply & retain C.D:
add:
store E.
Given a suitable "user code" (i.e., a convenient form in which to prepare a program
of instructions), it was evident that such a
machine structure, if attainable at reasonable
cost, would meet requirement (a) above. The
belief in its appropriateness for simplicity
of coding was coincidentally supported by the
appearance of an interpretive scheme [7],
producing a similar problem-language, written for DEUCE. This proposal, however,
attracted little attention, mainly because the
multi-level storage system of DEUCE prevented attainment of a reasonable working
efficiency.
The potential virtues of this type of problem language were again supported by a subsequent proposal to use a reverse Polish
notation as a computer common language
(APT). For a number of reasons this proposal was not adopted, but two of the reasons
advanced in its favor are worthy of note in
the present context. They were:
(a) That the language is one in which
problems can be formulated with little effort
after a short period of training:
(b) That the process of automatic or
manual translation from any conventional
problem language to most of the existing machine languages requires an effective expression in reverse Polish notation as an
intermediate step in the procedure. Using
such a machine code therefore eliminates a
substantial part of the process.
The Nesting Store
The evaluation of the formula above can
be seen to consist of successive operations either of fetching new operands (or
storing results) or of performing operations
on those most recently named. An intermediate result is immediately reused (as in the
seventh step) or temporarily pushed "below
the surface," as when the product A.B is
superseded by the fetching from store of C.
A mechanical analogue of such a system
is shown in Figure 1. It consists of a magazine, spring loaded to maintain the contents
at the top, with only one point of entry and
exit. Objects stored can therefore only enter
and leave on a "last in, first out" basis.
The electronic equivalent is shown in Figure 2. There are n separate registers, each
capable of accommodating one machine word,
and corresponding bit positions of each register are connected so that the assembly can
also be treated as a number of n-bit reversible shifting registers. Information can be
transferred to the top register N1 from a
conventional main store buffer register (a
parallel machine is assumed), and simultaneously with any such transfer a shift pulse
causes a downward shift of any existing information. Thus the new word appears in
N1, the previous content of N1 in N2, etc.
This process is called "nesting down." Reversal of the process causes "nesting up"
with transfer of the content of N1 towards the
main store. "Nesting up" and "nesting down"
can occur in any sequence, provided, of
course, that not more than n words are in the
store at any time.
Associated with the top two registers is a
"mill" or arithmetic unit [8]. Serial transfers
to and from the mill are shown for clarity,
but in a fast machine parallel working is
again used.
To allow the evaluation of a formula such
as the simple one of section 2, the mill must
be made capable of the two arithmetic operations needed. One of these is addition, and at
this point it must be noted that in general no
operand is considered as necessary after being used by the mill. The operation "add"
therefore uses the words in N1 and N2, and
places the sum back in N1. Since N2 is now
unoccupied, "nesting up" occurs, the content
of N3 moving to N2, etc.
The natural effect of most of the conventional arithmetic operations can now be visualized. Thus, "multiply" produces a double-length product in N1 and N2 (the more
significant half in N1 for obvious reasons).
No nesting is involved in this operation, but
the more elaborate "multiply and round"
produces a single length result in N1 and
therefore requires nesting up.
Two important points emerge at this
stage. The first is that an operation sequence written as

YA, YB, ×, YC, YD, ×, +, =YE

evaluates the formula given earlier (YA is
to be interpreted as "fetch from the main
store address containing A into N1," and =YE
as a converse operation). This sequence is
the one involving a reduction to a minimum
number of main store operations.
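The behavior of the nesting store under this sequence can be sketched with a small simulation. The sketch is a software analogy only: the default depth of 16, the use of a Python list, the ASCII letter x for the multiply sign, and the token spelling are all assumptions, and the multiply shown is the single-length "multiply and round" form applied to integers.

```python
class NestingStore:
    """Last in, first out store; cells[0] plays the part of N1."""

    def __init__(self, depth=16):  # depth n is an arbitrary assumption here
        self.depth = depth
        self.cells = []

    def nest_down(self, word):
        """Enter a word in N1, pushing existing contents down."""
        if len(self.cells) >= self.depth:
            raise OverflowError("more than n words in the nesting store")
        self.cells.insert(0, word)

    def nest_up(self):
        """Remove and return N1; lower contents move up."""
        if not self.cells:
            raise IndexError("nesting store is empty")
        return self.cells.pop(0)

    def add(self):
        # Uses N1 and N2, leaves the sum in N1, and nests up.
        n1, n2 = self.nest_up(), self.nest_up()
        self.nest_down(n2 + n1)

    def multiply(self):
        # Single-length product in N1, so nesting up is required.
        n1, n2 = self.nest_up(), self.nest_up()
        self.nest_down(n2 * n1)


def run(program, main_store, depth=16):
    """Obey a token sequence such as YA, YB, x, YC, YD, x, +, =YE."""
    nest = NestingStore(depth)
    for token in program:
        if token == "+":
            nest.add()
        elif token == "x":
            nest.multiply()
        elif token.startswith("=Y"):
            main_store[token[2:]] = nest.nest_up()   # store N1
        elif token.startswith("Y"):
            nest.nest_down(main_store[token[1:]])    # fetch into N1
        else:
            raise ValueError(f"unknown token {token!r}")
    return main_store
```

Running `run(["YA", "YB", "x", "YC", "YD", "x", "+", "=YE"], {"A": 2, "B": 3, "C": 4, "D": 5})` leaves 26 in E, with the product A.B held below the surface while C and D are fetched. Only five of the eight tokens refer to the main store.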
The second important point is that arithmetic operations have implied addresses, so
that only the function need be specified. On
the other hand, all instructions referring to
a main store address require many bits for
this purpose unless flexibility and convenience are sacrificed by addressing relative
to a local datum.
Variable Length Instructions
It is clearly advantageous to economize in
instruction storage space, and hence, for
reasons stated above, to allow instructions
to vary in length according to their function
and the additional information they require.
Obviously any instruction must carry within
itself a definition of the number of bits included in it. Analysis shows that complete
variability is unprofitable (as well as complicating the hardware), since five bits would
be needed to specify the length. Three possible lengths of 8, 16 and 24 bits are therefore
made available, including in each case bits
to designate length. In connection with instruction length, a unit of eight bits is referred to as a "syllable." Most arithmetic
operations are therefore one-syllable instructions; memory fetch and store operations together with jumps are three-syllable,
while two-syllable instructions include a
number requiring parameters of less length
than memory addresses (shift instructions,
input/output instructions, etc.).
The word length of the computer is 48 bits,
and this is the smallest unit in which information can be extracted from the main store.
Instructions are stored continuously and
obeyed serially; i.e., the store area concerned is regarded as a continuous series of
eight-bit syllables, rather than of 48-bit
words, and two or three-syllable instructions may overlap from one word to the next.
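The syllabic packing can be illustrated with a short Python sketch. It assumes only what is stated above (six 8-bit syllables per 48-bit word, instructions of one, two or three syllables stored continuously); the function name and the sample instruction lengths are hypothetical.

```python
# Sketch: place instructions of 1, 2 or 3 syllables end to end and report
# where each one starts and whether it runs over a word boundary.

SYLLABLES_PER_WORD = 6  # 48-bit word / 8-bit syllable


def layout(lengths, start_word=0):
    placed = []
    position = start_word * SYLLABLES_PER_WORD
    for n in lengths:
        if n not in (1, 2, 3):
            raise ValueError("instruction length must be 1, 2 or 3 syllables")
        word, syllable = divmod(position, SYLLABLES_PER_WORD)
        crosses = syllable + n > SYLLABLES_PER_WORD
        placed.append((word, syllable, crosses))
        position += n
    return placed


for entry in layout([3, 1, 3, 1]):
    print(entry)
```

In this sample the second three-syllable instruction starts at syllable 4 of word 0 and so occupies the first syllable of word 1 as well.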
Associated with the main control there is
therefore a two-word register. This at any
time holds the word containing the current
instruction together with that from the next
higher main store position (if the current instruction overlaps a word boundary both
words are of course in use). As soon as all
syllables in one word have been used, this
word is replaced at the first available main
store cycle by a further word in sequence
from the main store.
Because of the economy in instruction
storage space achieved by these means, it is
frequently possible to contain important inner
program loops in two words of instructions.
Provision is made to mark such loops by a
special jump instruction, whose effect is to
inhibit the extraction of a new instruction
word until the condition for leaving the loop
is satisfied. This saves two main store
cycles (12 microseconds) for extracting instruction words on every circuit of the loop,
and incidentally saves a syllable since again
no main store address needs specifying completely.
Some additional complexity arises due to
the separation of control into two parts, one
associated primarily with arithmetic operations (known as "Mill Control") and the other,
which runs normally at least one instruction
in advance, controlling most other internal
operations and in particular controlling access to the main store (this is called "Main
Control"). The object of this is, of course,
to increase effective speed by allowing, for
example, a new instruction word to be extracted while an arithmetic operation proceeds in the nesting store. Such overlaps are
completely automatic and no special actions
are required of the programmer.
Further Consideration of Nesting Store
At this stage it is convenient to examine
the nesting store and the consequence of its
use in a little more detail.
It should first be stated that the representation of Figure 2, while possible, is uneconomic if more than two or three words of
storage capacity are required. Examination
of a large number of programmes shows that
only occasionally is storage for more than
about eight words needed in the evaluation of
an expression, and that a sixteen word limit
to capacity will cause inconvenience only on
very rare occasions.

[Figure 2: the nesting store drawn as a column of registers N1, N2, N3, ... connected to the mill, with a path to the main store buffer register.]

A set of 16 registers of 48 bits each is an
expensive assemblage and it is natural to
examine the possibility of using core storage
in some form. The use of a core store for
all positions carries penalties in speed of
operation (this remains true as long as the
speed is similar to the main store speed),
and a satisfactory solution is reached by the
use of the configuration shown in Figure 3.
The top three registers are now conventional flip-flop registers, and a 16-word core
plane makes up the remainder of the store.
This gives a total of 19 words which is advantageous for reasons shown below. The
flip-flop registers are inter-connected by a
number of gated transfer paths (each for
parallel transfer of 48 bit words) which also
include transfers to and from the arithmetic
unit. An unconventional transformer coupling
system allows these paths to be provided
economically while permitting the transfer
of a word between any two registers in under
half a microsecond. Below the registers the
core store operates in a manner differing in
form but identical in principle with the lower
registers of Figure 2. No shifting of words
between store positions occurs as nesting up
or down is called. Instead successive positions are emptied or filled by successively
addressing levels in the plane.

[Figure 3: block diagram showing the main store, buffer registers B1 and B2, the mill, and the core plane (16 x 48).]

Suppose, for
instance, that the whole system is empty, and
successive words are fed into N1 (by a series
of main store fetch instructions). The transfer paths between N1, N2, N3 and the mill are
gated as required by control.
The only complication arising here is that
the transfer system does not allow N2, for
example, to send to N3 in the same half
microsecond period in which it receives from
N1. The two buffer registers of the mill are
therefore used, and the transfers are, in the
first half microsecond, N1 to B1 and N2 to
B2.
The write address counter is at zero and
the read counter at 15 for a reason which will
appear shortly. In the first half microsecond
N3 is also allowed to set up bias currents in
the vertical core lines where digits are to be
written.
The next half microsecond sees the word
in N3 written into the top core plane position
by operation of the write drive, and the subsequent period covers the operations needed
to complete the first fetch. These are main
store to N1, B1 to N2, and B2 to N3. Simultaneously the read and write address counters
are advanced by one to zero and one respectively.
A second and third fetch may now be executed in exactly the same way, but, as the
system was assumed empty originally, blanks
(as distinct from data zeroes) are written
into levels 0, 1, 2 of the core plane, leaving
the cores cleared. Only on the fourth and
subsequent fetches does a genuine word appear in level 3.
Continuation of these operations will on
the sixteenth occasion fill the lowest level of
the core plane, and the write counter has now
carried round to address zero while the read
counter is at 15. The programme is interrupted if further entries are attempted (except under circumstances mentioned in
a later section).
It will have been noted that the read counter is always one position behind the write
counter, so that if a nesting down operation
is followed by nesting up, the read drive is
used and no correction is needed to the counter position in order to read out the last word
inserted.
Note also that all operations are either to
read out, leaving a clear storage position, or
to write into an already clear position. The
normal read/write cycle is therefore unnecessary and would be wasteful of time.
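The counter arrangement described above can be sketched as follows. This is a simplified model, not a description of the hardware: a blank is represented by None, both counters are assumed to step back by one after a nesting-up read, and the `used` count is a modelling convenience for detecting the full condition.

```python
# Sketch of the 16-level core plane below N1..N3, addressed by a write
# counter and a read counter that stays one position behind it.

class CorePlane:
    LEVELS = 16

    def __init__(self):
        self.cells = [None] * self.LEVELS
        self.write = 0            # next level to be written
        self.read = 15            # level holding the last word written
        self.used = 0             # modelling aid: levels currently occupied

    def nest_down(self, word):
        """Accept the word leaving N3 (None stands for a blank)."""
        if self.used == self.LEVELS:
            raise OverflowError("nesting store full: programme interrupted")
        self.cells[self.write] = word
        self.write = (self.write + 1) % self.LEVELS
        self.read = (self.read + 1) % self.LEVELS
        self.used += 1

    def nest_up(self):
        """Return the last word written, leaving its level clear."""
        if self.used == 0:
            raise IndexError("core plane is empty")
        word = self.cells[self.read]
        self.cells[self.read] = None
        self.write = (self.write - 1) % self.LEVELS
        self.read = (self.read - 1) % self.LEVELS
        self.used -= 1
        return word


plane = CorePlane()
plane.nest_down("first")
print(plane.write, plane.read)
```

Because every access is either a read that clears a level or a write into a clear level, the model never needs a combined read/write step.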
Special Nesting Store Instructions
The simple example given earlier illustrates the general nature of the instructions
provided. It is soon discovered, however,
that a few instructions are desirable which
have no real counterpart in more conventional
systems.
Straightforward evaluation of an expression in algebraic form will usually produce
the desired result without special manipulation. Occasionally it will be found that two
operands, for example prior to a division, are
in reverse order. An instruction "REVERSE"
has the effect of interchanging the contents
of N1 and N2.
The instruction "DUPLICATE," which
nests down and leaves a copy of N1 in both
N1 and N2, has many uses. Followed by
MULTIPLY it produces the square of N1, and
it also allows an operand shortly to be lost
in some other operation to be preserved for
later use without requiring a main store reference.
Instructions ZERO and ERASE bring an
all-zero word into N1, nesting down, and
erase N1, nesting up, respectively.
Many instructions are also available in
double-length form. Thus double-length addition (mnemonic +D) treats N1 and N2 as
one 96-bit number, and N3, N4 as a second.
It produces a double-length sum in N1, N2,
nesting up two places.
Single and double precision floating point
representation is also permitted, the corresponding mnemonic forms for addition being
+F and +DF.
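As a rough illustration of double-length addition, the Python sketch below joins pairs of 48-bit cells into 96-bit values. Two points are assumptions rather than statements from the text: the more significant half is taken to be in the upper cell of each pair (by analogy with the double-length product), and the numbers are treated as unsigned with overflow simply discarded.

```python
# Sketch of +D: (N1, N2) + (N3, N4) -> (N1, N2), nesting up two places.
# Assumptions: upper cell is the more significant half; unsigned values.

WORD_BITS = 48
WORD_MASK = (1 << WORD_BITS) - 1
DOUBLE_MASK = (1 << (2 * WORD_BITS)) - 1


def join(high, low):
    return (high << WORD_BITS) | low


def split(value):
    return (value >> WORD_BITS) & WORD_MASK, value & WORD_MASK


def add_double(nest):
    first = join(nest[0], nest[1])
    second = join(nest[2], nest[3])
    high, low = split((first + second) & DOUBLE_MASK)
    return [high, low] + nest[4:]      # four cells replaced by two


print(add_double([0, WORD_MASK, 0, 1, 7]))
```

The sample shows a carry out of the less significant half propagating into the more significant half, with the fifth cell moving up to N3.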
With these instructions it is possible to
introduce an example showing the power of
the system and incidentally the speed and the
economy in instruction storage space.
The example to be given is perhaps somewhat artificial, but it has been selected to
illustrate not only the essential simplicity of
evaluation of any expression from an algebraic statement, but to illustrate also that
even in a sophisticated system there is scope
for an occasional elegant twist. The formula
to be evaluated is
$$f = \frac{a\,(a^{3} b^{2} + 1)}{b + 2 c^{2} d^{2}}$$

where a, b, c, d are single length fixed point
numbers stored at Ya, Yb, Yc, Yd, nonadjacent addresses in the main store.
The table shows the successive steps in
the calculation, together with the contents
after each step of the top few cells of the
nesting store.
The following points should be noted:
(a) Again main store references are reduced to the minimum practicable.
(b) A count shows that only 30 syllables,
or five instruction words, are used (there
are no two-syllable instructions in this particular example).
(c) It is advantageous to evaluate the numerator in expanded form in this particular
case.
(d) The use of DOUBLE DUPLICATE at
step three neatly anticipates future requirements for the parameters. An automatically
programmed version would fetch these parameters again or could perhaps use a less
satisfactory temporary storage process to
avoid this.
The complete evaluation on KDF.9 takes
less than 170 microseconds.
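The steps of the table can be replayed with a small Python simulation, which also confirms the syllable count. The sketch makes several simplifying assumptions: the nesting store is an unbounded list with N1 first, products are single-length, DIVIDE forms N2/N1 (as the table implies), exact fractions stand in for fixed-point arithmetic, and the data values are invented.

```python
# Sketch: replay the evaluation of f = a(a^3 b^2 + 1) / (b + 2 c^2 d^2).
from fractions import Fraction

PROGRAM = [
    ("FETCH", "b"), ("FETCH", "a"), ("DOUBLE DUP", None), ("DUPLICATE", None),
    ("MULTIPLY", None), ("MULTIPLY", None), ("DUPLICATE", None),
    ("MULTIPLY", None), ("ADD", None), ("REVERSE", None),
    ("FETCH", "d"), ("FETCH", "c"), ("MULTIPLY", None), ("DUPLICATE", None),
    ("MULTIPLY", None), ("DUPLICATE", None), ("ADD", None), ("ADD", None),
    ("DIVIDE", None), ("STORE", "f"),
]


def evaluate(store):
    nest, syllables = [], 0
    for name, operand in PROGRAM:
        # Main store references take three syllables, the rest one.
        syllables += 3 if name in ("FETCH", "STORE") else 1
        if name == "FETCH":
            nest.insert(0, store[operand])
        elif name == "STORE":
            store[operand] = nest.pop(0)
        elif name == "DUPLICATE":
            nest.insert(0, nest[0])
        elif name == "DOUBLE DUP":
            nest[0:0] = nest[0:2]              # copy N1, N2 on top
        elif name == "REVERSE":
            nest[0], nest[1] = nest[1], nest[0]
        else:
            n1, n2 = nest.pop(0), nest.pop(0)
            if name == "ADD":
                nest.insert(0, n2 + n1)
            elif name == "MULTIPLY":
                nest.insert(0, n2 * n1)
            else:                              # DIVIDE: N2 / N1
                nest.insert(0, Fraction(n2, n1))
    return store["f"], syllables, nest


result, count, left = evaluate({"a": 2, "b": 3, "c": 1, "d": 2})
print(result, count, left)
```

With these sample values the numerator is 146 and the denominator 11; the count comes to 30 syllables and the nesting store finishes empty, as it started.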
It is interesting and instructive to compare this performance with that of a conventional one or three-address machine. The
same actual arithmetic and main store
speeds, and a single accumulator only for
the one-address machine are assumed. It is
also assumed that instructions are packed
two to a word for the single-address system
and one to a word for the three-address
type.
The single-address programme then occupies eight words of storage and takes about
250 microseconds (increases of 50% in both
factors). A three-address system uses 9
words of storage and takes 340 microseconds.
This example requires few operations of
a "housekeeping" nature and the savings arising from use of a nesting store are less
prominent than is frequently the case. On
the other hand, it is also possible to find
cases where there is little difference between the various systems.
The very poor relative speed of the three-address machine is accentuated because of
the assumption of the same basic internal
speeds as for KDF.9 (typically 6 microsecond store cycle, 15 microsecond multiplication). It is, however, true that these
represent speeds attainable at corresponding
levels of economy. A substantially faster
store with 1 microsecond cycle time could be
used, at a significant cost penalty, and would
bring down the problem time to around 160
microseconds. There is, of course, no corresponding saving in storage space.
Treatment of Sub-Routines
It will have been observed that in the example above the condition of the nesting store
at the end was identical with that at the start.
A system of preparing sub-routines can be
based on this fact. At any stage in a program
there may be information in one or more
cells of the nesting store. At this point the
parameters required by the sub-routine must
be planted by the main program (or of course
by a lower order sub-routine). A note must
also be made of the next instruction in the
program for re-entry purposes. These two
points will be considered separately.
Just as an instruction such as MULTIPLY
expects to find operands in the top cells of
the nesting store and terminates leaving the
product in place of the operands (nesting as
necessary), so a sub-routine can also be arranged to function. It then becomes in effect
a machine instruction in that it does not influence the nesting store contents below the
level of the operands made ready for it.
Table 1

OPERATION     N1             N2        N3     N4     N5
FETCH b       b              -         -      -      -
FETCH a       a              b         -      -      -
DOUBLE DUP.   a              b         a      b      -
DUPLICATE     a              a         b      a      b
MULTIPLY      a^2            b         a      b      -
MULTIPLY      a^2 b          a         b      -      -
DUPLICATE     a^2 b          a^2 b     a      b      -
MULTIPLY      a^4 b^2        a         b      -      -
ADD           a^4 b^2 + a    b         -      -      -
REVERSE       b              NUM       -      -      -
FETCH d       d              b         NUM    -      -
FETCH c       c              d         b      NUM    -
MULTIPLY      cd             b         NUM    -      -
DUPLICATE     cd             cd        b      NUM    -
MULTIPLY      c^2 d^2        b         NUM    -      -
DUPLICATE     c^2 d^2        c^2 d^2   b      NUM    -
ADD           2c^2 d^2       b         NUM    -      -
ADD           DENOM          NUM       -      -      -
DIVIDE        NUM/DENOM = f  -         -      -      -
STORE         -              -         -      -      -

It will also be obvious, as soon as multi-level sub-routines are considered, that the
operations involved in storing the return
instruction address for use after each
sub-routine in turn are again of a last in,
first out nature. Another nesting store is
therefore provided for this purpose, but there
is here, of course, no necessity for a number
of inter-connected registers or any arithmetic facility. This "sub-routine jump nesting store" (S.J.N.S.) therefore has one register as its most accessible cell, with again a
sixteen-word core plane below. As in the
case of the main nesting store, only a total
of 16 words, not 17 as might have been expected in this case, are available. Since one
is reserved for a special purpose (see below), only 15 may be used by the programmer.
The instruction "Jump to Sub-routine" has
the effect of planting its own address in the
top cell of S.J.N.S. and of causing an unconditional jump to the entry point of the required sub-routine. It will be noted that
S.J.N.S. must receive not only the word address but also the syllabic address of the
return point.
Any sub-routine is terminated by one of
the variants of the basic EXIT instruction,
which transfers control to the instruction
whose address is in the top cell of S.J.N.S.
(nesting it up one cell). These variants
augment this address by units of three
syllables before performing the basic EXIT
operation, and thus return control to the instruction immediately following the "Jump to
sub-routine" instruction (itself three syllables long) or to one of the succeeding three-syllable instructions. These are usually a
string of unconditional jump instructions,
corresponding to failure exits or multiple
exit points preceding the normal return instruction.
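The linkage can be sketched in Python as below. Addresses are represented as (word, syllable) pairs with six syllables to the word; the class and method names are hypothetical, and the 16-cell capacity limit is not modelled.

```python
# Sketch of the sub-routine jump nesting store: the jump plants its own
# address, and an EXIT variant returns a whole number of three-syllable
# instructions beyond that address.

SYLLABLES_PER_WORD = 6


def advance(address, syllables):
    word, syllable = address
    return divmod(word * SYLLABLES_PER_WORD + syllable + syllables,
                  SYLLABLES_PER_WORD)


class Sjns:
    def __init__(self):
        self.cells = []                      # cells[0] is the top cell

    def jump_to_subroutine(self, own_address, entry_point):
        self.cells.insert(0, own_address)    # plant the jump's own address
        return entry_point                   # control passes to the entry

    def exit(self, units=1):
        link = self.cells.pop(0)             # nest up one cell
        return advance(link, 3 * units)


links = Sjns()
links.jump_to_subroutine((10, 4), (200, 0))
print(links.exit(1))
```

Here the jump occupies syllables 4 and 5 of word 10 and syllable 0 of word 11, so an exit of one unit resumes at syllable 1 of word 11; an exit of two units would skip one further three-syllable instruction.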
The Q-Stores
The computer organization is completed
by addition of a further set of storage registers (not of nesting type) known as the Q-stores, and by provision of an input/output
system.
The first of these features is based on
conventional practices, and will be treated
briefly. A set of fifteen registers (formally
16, one of which is identically zero) is used
for address modification, counting and a number of other purposes. Each register is a
full 48-bit word, but for address modification
is considered as having three 16-bit independent sections for modifier, increment and
counter.
Any main store reference has associated
with it a Q-store number (or an implied Q0),
and refers to the address specified, augmented by the content of the modifier section.
If the letter Q is added to the mnemonic form
of the instruction then after the address has
been calculated the modifier is changed by
addition of the increment and the counter is
reduced by one. Jump instructions testing
the counters are, of course, provided.
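A minimal Python sketch of this address modification follows. The three sections are held as separate fields; treating them as unsigned 16-bit quantities that wrap around is an assumption, as are the names used.

```python
# Sketch of Q-store address modification with optional update (the "Q" form).
from dataclasses import dataclass

MASK16 = 0xFFFF


@dataclass
class QStore:
    counter: int = 0
    increment: int = 0
    modifier: int = 0


def effective_address(address, q, update=False):
    effective = address + q.modifier
    if update:
        # The update happens after the address has been calculated.
        q.modifier = (q.modifier + q.increment) & MASK16
        q.counter = (q.counter - 1) & MASK16
    return effective


q = QStore(counter=3, increment=2, modifier=10)
addresses = []
while q.counter != 0:                    # stands in for a counter-testing jump
    addresses.append(effective_address(100, q, update=True))
print(addresses)
```

The loop steps through every second word of an area without any instruction being altered, which is the point made in the text about avoiding programmed instruction modification.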
The Q-stores may also be used if desired
as 48-bit registers, or as independent 16-bit
registers, in each case with accumulative or
direct input. The "counter" part of any Q-store may be used to hold the amount of shift
(positive or negative) in shift instructions.
There are sufficient facilities of this kind in
the machine to remove the need for any kind
of programmed instruction modification.

The other main use of the Q-stores is in
connection with input/output operations, outlined in the next section.
Input/Output Operations
Provision is made for use of the normal
peripheral devices, each one operating completely autonomously and simultaneously with
other peripherals and with computer operations. In the standard system up to 16 peripheral devices, with a total transfer rate in
excess of a million characters per second,
may be handled.
To call any peripheral transfer, an instruction specifies the nature of the operation, referring also to a Q-store in which the
other required parameters have been planted.
These are the device number, and the limiting addresses of the main store area concerned, since peripheral transfers are of
variable length.
Assuming that the device is available (i.e.,
not already busy), a check is now carried out
by the input/output control system that the
main store area specified does not overlap
that involved in any peripheral transfer already in progress. The parameters are restored within the control, so that the Q-store
register is freed for further use by the program.
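The overlap check amounts to an interval comparison, sketched here in Python. Each area is given by its two limiting addresses, taken to be inclusive (an assumption); the function names are hypothetical.

```python
# Sketch: may a new transfer start, given the areas of transfers in progress?

def overlaps(area_a, area_b):
    (low_a, high_a), (low_b, high_b) = area_a, area_b
    return low_a <= high_b and low_b <= high_a


def may_start(new_area, active_areas):
    return not any(overlaps(new_area, area) for area in active_areas)


in_progress = [(0, 99), (200, 299)]
print(may_start((100, 199), in_progress))
print(may_start((90, 120), in_progress))
```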
The transfer now proceeds in a manner
which, for economic reasons, differs depending on whether the device concerned is fast
(magnetic tape, for example) or slow (punched
card or paper tape, etc.).
In the case of the former, six-bit characters are assembled (taking a read operation
by way of illustration) into machine words.
When a word is complete, it is placed in a
single word buffer and a signal to the main
control system seizes a store cycle as soon
as possible to transfer the word into store.
Such calls have priority on the time of the
main store, but any one peripheral device
may have to wait until other devices have
been dealt with.
Since such a buffering system uses a substantial quantity of equipment, more economical procedures are adopted for the
slower devices. A common unit is shared by
all these, and no assembly into words takes
place outside the main store. Instead, as
any device has a character ready, the required main store word is extracted, the
character inserted in the next character
location and the word is returned to
store.
This process is somewhat prodigal of main
store time, but the penalty of handling devices
with character rates up to a few kilocycles
per second in this way is quite negligible.
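The slow-device procedure can be pictured with the Python sketch below: for each character the store word is extracted, the character is inserted, and the word is returned. Eight six-bit characters per 48-bit word follows from the figures in the text; placing the first character at the most significant end is an assumption, and the names are hypothetical.

```python
# Sketch: assemble six-bit characters directly in the main store.

CHAR_BITS = 6
CHARS_PER_WORD = 8        # 48 / 6
CHAR_MASK = 0o77


def insert_character(word, index, char):
    # Assumption: character 0 occupies the most significant six bits.
    shift = (CHARS_PER_WORD - 1 - index) * CHAR_BITS
    cleared = word & ~(CHAR_MASK << shift)
    return cleared | ((char & CHAR_MASK) << shift)


def receive(store, address, index, char):
    word = store[address]                                  # extract the word
    store[address] = insert_character(word, index, char)   # insert and return
    index += 1
    if index == CHARS_PER_WORD:                            # word now complete
        return address + 1, 0
    return address, index


store = [0, 0]
address, index = 0, 0
for ch in range(1, 10):
    address, index = receive(store, address, index, ch)
print(oct(store[0]), oct(store[1]), address, index)
```

Each character costs a full extract-and-return of the word, which is why the text calls the process prodigal of main store time yet acceptable at low character rates.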
An attempt by the computer to make use
of a peripheral device or any part of a main
store area concerned in an uncompleted peripheral transfer results in a "lock-out."
The programme operation is interrupted until
the prior operation is completed. The user
is thus freed from any obligation to check and
guard in his program against such conflicting
operations.
KDF.9 "User Code"
Up to this point it has been implied that
programs are written directly in machine
code, albeit in mnemonic form. A very simple compiler routine can clearly translate
the mnemonics into instructions. This would,
however, leave the programmer with certain
tedious tasks, such as calculation of addresses
for jump instructions. The syllabic instruction form increases this problem.
The opportunity is therefore taken to incorporate in the compiler a number of additional features to eliminate such difficulties.
One or two specific facilities will be mentioned.
The mnemonic code handled by the compiler is called "User Code" and has the following important characteristic: every User
Code instruction is either a directive to the
compiler, not appearing explicitly in the compiled version of the program, or is compiled
into one machine instruction proper. Thus
there are no macro-instructions in User Code
which become sequences of instructions in
basic machine code, the correspondence between User Code and machine code instructions being essentially one to one.
Calculation of jump addresses is handled
by the compiler, the user being required only
to label the entry point with an arbitrary reference number and to specify "jump to reference number ... if ..." (for example JrCqZ
is the mnemonic meaning "jump to reference
r if the counter section of Q-store q is zero").
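The kind of bookkeeping the compiler takes over can be sketched as a two-pass process in Python. This is not the User Code compiler itself: the program representation, the instruction lengths supplied with each entry, and the choice of (word, syllable) pairs for jump addresses are all assumptions made for illustration.

```python
# Sketch: resolve reference numbers to syllabic addresses in two passes.

SYLLABLES_PER_WORD = 6


def resolve(program):
    """program holds ("label", r) directives and (mnemonic, length, target)."""
    labels = {}
    position = 0
    for item in program:                       # pass 1: locate each label
        if item[0] == "label":
            labels[item[1]] = divmod(position, SYLLABLES_PER_WORD)
        else:
            position += item[1]
    resolved = []
    for item in program:                       # pass 2: fill in jump targets
        if item[0] == "label":
            continue                           # directives emit no code
        mnemonic, _length, target = item
        address = labels[target] if target is not None else None
        resolved.append((mnemonic, address))
    return resolved


program = [
    ("label", 1),
    ("YA", 3, None), ("+", 1, None), ("=YE", 3, None),
    ("J1C2Z", 3, 1),
    ("label", 2),
    ("J2C3Z", 3, 2),
]
for line in resolve(program):
    print(line)
```

Label directives produce no instructions of their own, in keeping with the one-to-one correspondence between User Code and machine instructions.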
Further elaboration of this process allows
a similar labelling of sub-routines. When it
encounters an instruction JSLi, the compiler
ensures that a copy of the library sub-routine
i will be made from the library tape and
incorporated into the program. It also writes
the appropriate entry address into the jump
instruction in the main program.
It will be remembered that in section 3
an instruction YA was used as a mnemonic for
"Fetch into N1 the word in main store address A." It is clearly possible for the programmer to treat as his data store a continuous main store area to be addressed as
Y0, Y1, etc. The compiler is again used to
convert these relative addresses to absolute
addresses, allocating as the starting point
the first word available after assembly of
the programme.
An obvious extension is to allow the programmer to use a number of groups designated YA, YB, etc., each having an area
commencing at zero. Thus YA0, YB75 are
permissible addresses.
Other areas of store are similarly allocated by the compiler. Constants appear
to the user as a set of stores known as V-stores. To incorporate a required constant
for a program requires only the user code
statement that, for example, V27 = F3.14159.
This has the effect of converting the numeric
value in standard floating binary form and
allocating a store word to it. Any subsequent
program reference to V27 fetches this constant into the nesting store.
Similarly the programmer can refer to
main store areas reserved by the compiler
for working store (storage of intermediate
results or data). Exactly as for Y and V
stores these are referred to with prefix W.
When the compilation is performed, the
resultant program commences with the transcribed main program, followed by any subroutines used. Areas for V and W stores
(and for other functions of this type) are provided up to the highest numbered of each type
in the original program. Finally the remaining area becomes the Y store area, thus allowing maximum generality.
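The resulting store layout can be sketched as a simple running allocation in Python. The sizes are invented, V and W stores are assumed to be numbered from zero (so the highest number n needs n + 1 words), and placing V before W is an assumption; the text fixes only that the program comes first and the Y area last.

```python
# Sketch: allocate program, sub-routines, V and W stores, then give the
# remainder of the available store to the Y area.

def allocate(program_words, subroutine_words, highest_v, highest_w, store_size):
    layout = {}
    next_free = 0
    areas = (
        ("program", program_words),
        ("subroutines", subroutine_words),
        ("V", highest_v + 1),
        ("W", highest_w + 1),
    )
    for name, size in areas:
        layout[name] = (next_free, size)       # (first word, length)
        next_free += size
    layout["Y"] = (next_free, store_size - next_free)
    return layout


for name, area in allocate(40, 20, 27, 9, 1024).items():
    print(name, area)
```

Leaving the Y area until last means its extent need not be declared in advance, which is the generality the text refers to.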
Clearly it is possible to prepare special
versions of the compiler with any degree of
elaboration or desired characteristics. The
incorporation of a routine for conversion of
constants (which may be expressed in a number of ways) is a requirement for all uses,
but it is evident that the principle can be extended as desired.
Thus the standard compiler, which is appropriate for preparation of programs for a
variety of applications, may be supplemented
by additional special purpose compilers.

Interruption and the Use of a Director Routine
KDF.9, like other computers of its generation, has built-in interrupt facilities. Specifically, this means that at virtually any time
the normal sequence of instructions may be
suspended and control transferred to a fixed
address (syllable 0 of word 0 for obvious
practical reasons). Such a transfer of control
is called an Interruption. It is arbitrary only
in the sense that it is generally outside the
control of the programmer. It will take place
only when one of a certain number of quite
clearly defined situations arises in the machine. When an Interruption occurs, the interrupted program is left in such a state that
it may be subsequently resumed and will then
continue exactly as if nothing had happened, unless the reason for the interruption was
some obvious abuse by the program of the
facilities of the machine.
The purpose of the Interruption facility is
to make "Time-sharing" possible. Here it is
necessary to distinguish between "parallel
operation" and "time-sharing." The former
implies that the machine is doing more than
one operation at a particular moment. On
KDF.9 such occasions include the ability of
one or more peripheral transfer operations
to proceed at the same time as computation
as long as no lock-out violation occurs (see
above). If such a violation does occur, a
transfer of control is necessary in order to
enter some other sequence of instructions
which can proceed unaffected by the lock-out.
This switch of control is automatic, and
so implies the necessity of interruption, in
order that the programmer shall not be required to anticipate and take action over lock-out violations. The ability to switch from one
instruction sequence to another is called
"Time-sharing"; it will be evident that time-sharing and parallel operation go hand-in-hand, the former enabling the most efficient
use to be made of the latter.
There are two versions of the KDF.9 system. One of them has an elaborate time
sharing facility which enables up to four independent programs to be stored within the
machine at once (together with a supervisory
program or "Director"). They are obeyed on
a time-shared priority basis - that is, each
program is allocated a priority, and the hardware and the supervisory program together
ensure that the program of highest priority
is always operating subject only to peripheral
lock-out conditions.
The other version of the KDF.9 system
only permits one program to be stored, in
addition to a supervisory program. This
version can be converted to "Full Timesharing" by addition of equipment. The extra
equipment includes:
(a) Extra core planes which enable each
program to have its own Nesting Store, Subroutine Jump Nesting Store and Q-Store. To
switch Nesting Stores, for instance, it is necessary only to nest down three places, so that
the entire contents of the nesting store are
in the bottom sixteen cells. These are all
in the core plane, which can then be disconnected from the top registers and replaced
by another core plane. This is a very satisfactory compromise between loss of time
during changeover and volume of extra equipment required.
(b) A register corresponding to each
priority level, which is set whenever that
program is held up by a peripheral lock-out
and which records the details of the lock-out.
Interruptions are caused whenever a hold-up
occurs and also whenever a lock-out is
cleared which was holding up a program of
higher priority than the one currently operating. For this purpose a register noting
the current priority level is also necessary.
Features common to both types of machine provide for relative addressing and for
address checking. The relative addressing
feature causes the contents of a "Base Address" register to be added to any main store
address used by a program before access to
the main store is made. This allows programs to be coded always as if stored at
location 0 onwards, but to be stored and
obeyed in any segment of the store.
The address checking actually precedes
augmentation of the relative address by the
base address, and includes a check that the
relative address is not negative and does not
exceed the size of the store area allocated
to that particular program. (Core storage
allocation is completely flexible and is in the
hands of the supervisory program. Programs
may be moved about bodily in the store and
may have their priorities interchanged).
If a program tries to go outside its allocated storage area a "Lock-in Violation" is
said to have occurred, and an interruption
follows. The same thing happens if a program tries to use a peripheral device which
has not been allocated to it - a register of
currently allocated peripheral devices is automatically referred to every time a peripheral transfer is called.
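The order of operations, check first and then add the base address, can be sketched in Python. The exception class stands in for the interruption and is hypothetical; valid relative addresses are taken to run from 0 to one less than the allocated size, which is an interpretation of "does not exceed."

```python
# Sketch of relative addressing with the address check applied first.

class LockInViolation(Exception):
    """Stands in for the interruption caused by a lock-in violation."""


def absolute_address(relative, base, allocated_size):
    # The check precedes augmentation by the base address.
    if relative < 0 or relative >= allocated_size:
        raise LockInViolation("relative address %d outside area" % relative)
    return base + relative


print(absolute_address(25, base=4096, allocated_size=1000))
try:
    absolute_address(1000, base=4096, allocated_size=1000)
except LockInViolation as error:
    print("interruption:", error)
```

Since only the base address changes when a program is moved, the supervisory program can relocate a program bodily without altering any of its instructions.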
In addition, interruption will occur if an
ordinary program tries to use one of a number of instructions which are reserved for
use by the Director. Such instructions provide access to the various hidden registers
concerned with the interruption facilities.
The over-riding objective is to ensure that
no program is capable of doing anything which
can upset the operation of any other.
Other interruptions can be instigated by the
machine operator, in order to allow input of
control messages on the console typewriter.
The program itself may also include instructions which cause entry to the Director in
order, for example, to ask for allocation of
peripheral units.
One other reason for interruption, of particular significance, is that which occurs
whenever a peripheral transfer called by the
Director itself comes to an end. This enables
a certain amount of "programmed time-sharing" within the Director. There are many
interesting and valuable possibilities. Thus,
the feature is used to allow programs to output "lines of print" which the Director can
send direct to a printer or to a magnetic tape
for subsequent off-line or on-line printing - the
latter again Director controlled. This allows
standard programs to be written so as to be
capable of running on systems having different
output device configurations.
The only limitation on the facilities offered
by such supervisory routines lies in the size
of program which can be allowed without restricting the "main program" unduly. It is
therefore expected that in addition to the
"standard Directors" a number of others will
appear for various ranges of use.
Any Director must satisfy a number of requirements. It will be stored throughout normal operation of the machine in word 0 onwards of the main store and will be entered
at this point by any interruption. On each
such entry, it must:
(a) preserve, as far as is necessary, the
state of the interrupted program;
(b) discover the reason for interruption
and take appropriate action;
(c) return to program.
These requirements will be considered in
turn.

(a) This involves very little work on the
Director's part. The address to which control must be returned when the program is
resumed is automatically planted in the
S.J.N.S. by the interruption process. This
makes use of one of the "spare" cells of this
store, leaving one for use by the Director itself.
Similarly the Director makes use of the
three spare cells of the Nesting Store and
therefore does not have to do anything to
preserve the sixteen cells used by programs.
It must, however, store away the contents of
any Q-stores which it is going to use itself
(usually three) and of two special registers
provided to record arithmetic overflow and
peripheral device states.
(b) The Director has access to a "Reason
for Interrupt" register, different digits of
which are used to indicate the different reasons. It is, in fact, the setting and subsequent
detection of non-zero bits in this register,
as various situations arise, which triggers
off the interruption sequence. Once interruption has occurred it cannot occur again
until control has been returned to program;
however, digits corresponding to reasons for
interruption may appear in the register at
any time (they are cleared out as soon as the
register is read by the Director).
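The read-and-clear behaviour of this register can be sketched in Python. The bit positions used in the example are purely illustrative; the text does not give the assignment of digits to reasons.

```python
# Sketch of a "Reason for Interrupt" register that is cleared when read.

class ReasonForInterrupt:
    def __init__(self):
        self.bits = 0

    def set(self, bit):
        # A reason may be recorded at any time, even while the Director runs.
        self.bits |= 1 << bit

    def read(self):
        # Reading by the Director returns the digits and clears them.
        value, self.bits = self.bits, 0
        return value


def reasons(value):
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


register = ReasonForInterrupt()
register.set(0)        # hypothetical bit numbers
register.set(3)
print(reasons(register.read()), register.read())
```

A single read can thus report several reasons at once, and the Director deals with each before returning to program.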
(c) In the case of a KDF.9 system with
full time-sharing facilities, this requires the
Director first to determine the priority level
to which control will be returned, and then to
arrange for connection of the appropriate
Nesting Stores, etc.
On any machine, the Director must also
restore any Q-stores which were temporarily
parked away, and must reset the two special
registers mentioned in (a) above. The Base
Address register, which was at zero while
the Director was in operation, must then be
set before finally using a special version of
the EXIT instruction to return to the main
program (this special instruction removes
the inhibition preventing interruption while
the Director operates).
CONCLUSIONS
In the space available it has been possible
only to outline some of the distinctive features of the KDF.9 system. Many of these
arise from the novel arrangement of working
registers, whereas others are unspectacular
in terms of hardware but are the product of
close investigation of operational needs. No
attempt has been made to catalogue the performance factors of the KDF.9 system or
even to include a specification. It is, however, confidently anticipated that it may point
the way to even more striking improvements
in the ratio of performance to cost.
Apart from the standard versions of the
Director already mentioned, efficient use
of a system of the proper KDF.9 presupposes the availability of an adequate
"software package." Prominent among the
individual items associated with the system
are the following:
(a) A fast-compiling "load and go" ALGOL
Compiler [9].
(b) A fast object-program ALGOL translator [10, 11].
(c) ALGOL program testing and operating
systems.
(d) An ALGOL procedure library.
(e) Translators for FORTRAN, COBOL
and other languages.
Detailed descriptions of some of these
items appear elsewhere.
ACKNOWLEDGMENTS
In preparing this paper, the author is
privileged to report the work of enthusiastic
teams of designers and users led by R. H.
Allmark and C. Robinson respectively. Most
of the members of these groups have made
significant original contributions and mention of individuals in such a team enterprise
is out of place. The writer wishes to express
his thanks to them and also to the Manager of
the Data Processing Division of the English
Electric Company, Ltd., for permission to
publish this paper.

REFERENCES
1. S. J. M. Denison, E. N. Hawkins and
C. Robinson: "DEUCE Alphacode."
DEUCE Program News, No. 20, January
1958 (The English Electric Co. Ltd.).
2. R. A. Brooker: "The Autocode Programs
developed for the Manchester University
Computers." Computer Journal, 1, 1958,
p. 15.
3. R. A. Brooker, B. Richards, E. Berg and
R. Kerr: "The Manchester Mercury
Autocode System." (University of Manchester, 1959).
4. FORTRAN Manual: (International Business Machines Corporation).
5. F. G. Duncan and E. N. Hawkins:
"Pseudo-code Translation on Multilevel Storage Machines." Proceedings
of the International Conference on Information Processing, p. 144 (UNESCO,
Paris, June 1959).
6. F. G. Duncan and H. R. Huxtable: "The
DEUCE Alphacode Translator." Computer Journal, 3, 1960, p. 98.
7. C. L. Hamblin: "GEORGE: A Semi-translation Programming Scheme for
DEUCE." Programming and Operation
Manual. University of New South Wales,
Kensington, N.S.W.
8. R. H. Allmark and J. A. Lucking: "Design of an Arithmetic Unit Incorporating
a Nesting Store." Proceedings of the
I.F.I.P. Congress, Munich, August 1962.
9. B. Randell: "The Whetstone KDF.9
ALGOL Translator." Proceedings of the
Programming Systems Symposium, London School of Economics, July 1962.
10. E. N. Hawkins and D. H. R. Huxtable:
"A Multi-pass Translation Scheme for
ALGOL 60 for KDF.9." Annual Review of
Automatic Programming, Vol. 3, 1962,
Pergamon Press.
11. F. G. Duncan: "Implementation of
ALGOL 60 for KDF.9." Computer Journal, 5, 1962, p. 130.

A COMMON LANGUAGE FOR HARDWARE,
SOFTWARE, AND APPLICATIONS
Kenneth E. Iverson
Thomas J. Watson Research Center, IBM
Yorktown Heights, New York
INTRODUCTION
Algorithms commonly used in automatic
data processing are, when considered in
terms of the sequence of individual physical
operations actually executed, incredibly complex. Such algorithms are normally made
amenable to human comprehension and analysis by expressing them in a more compact
and abstract form which suppresses systematic detail. This suppression of detail
commonly occurs in several fairly well defined stages, providing a hierarchy of distinct descriptions of the algorithm at different
levels of detail. For example, an algorithm
expressed in the FORTRAN language may be
transformed by a compiler to a machine code
description at a greater level of detail which
is in turn transformed by the "hardware" of
the computer into the detailed algorithm actually executed.
Distinct and independent languages have
commonly been developed for the various
levels used. For example, the operations
and syntax of the FORTRAN language show
little semblance to the operations and syntax
of the computer code into which it is translated, and neither FORTRAN nor the machine
language resembles the circuit diagrams and
other descriptors of the processes eventually
executed by the machine. There are, nevertheless, compelling reasons for attempting
to use a single "universal" language applicable to all levels, or at least a single core
language which may be slightly augmented in
different ways at the various levels.
First, it is difficult, and perhaps undesirable, to make a precise separation into a
small number of levels. For example, the
programmer or analyst operating at the
highest (least detailed) level, may find it
convenient or necessary to revert to a lower
level to attain greater efficiency in eventual
execution or to employ convenient operations
not available at the higher level. Programming languages such as FORTRAN commonly
permit the use of lower levels, frequently of
a lower level "assembly language" and of the
underlying machine language. However, the
employment of disparate languages on the
various levels clearly complicates their use
in this manner.
Second, it is not even possible to make a
clear separation of level between the software
(metaprograms which transform the higher
level algorithms) and the hardware, since the
hardware circuits may incorporate permanent
or semipermanent memory which determines
its action and hence the computer language.
If this special memory can itself be changed
by the execution of a program, its action may
be considered that of software, but if the
memory is "read-only" its action is that of
hardware, leaving a rather tenuous distinction between software and hardware which is
likely to be further blurred in the future.
Finally, in the design of a data processing
system it is imperative to maintain close
communication between the programmers
(i.e., the ultimate users), the software designers, and the hardware designers, not to
mention the communication required among
the various groups within any one of these
levels. In particular, it is desirable to be
able to describe the metaprograms of the
software and the microprograms of the hardware in a common language accessible to all.
The language presented in Reference 1
shows promise as a universal language, and
the present paper is devoted to illustrating its
use at a variety of levels, from microprograms, through metaprograms, to "applications" programs in a variety of areas.
To keep the treatment within reasonable
bounds, much of the illustration will be
limited to reference to other published
material. For the same reason the presentation of the language itself will be
limited to a summary of that portion required
for microprogramming (Table 1), augmented
by brief definitions of further operations as
required.
MICROPROGRAMS
In the so-called "systems design" of a
computer it is perhaps best to describe the
computer at a level suited to the machine
language programmer. This type of description has been explored in detail for a single
machine (the IBM 7090) in Reference 1, and
more briefly in Reference 2. Attention will
therefore be restricted to problems of description on a more detailed level, to specialized equipment such as associative memory, and to the practical problem of keying
and printing microprograms which arises from
their use in design automation and simulation.
The need for extending the detail in a
microprogram may arise from restrictions
on the operations permitted (e.g., logical or
and negation, but not and), from restrictions
on the data paths provided, and from a need
to specify the overall "control" circuits
which (by controlling the data paths) determine the sequence in the microprogram, to
name but a few. For example, the basic "instruction fetch" operation of fetching from a
memory (i.e., a logical matrix) $M$ the word
(i.e., row) $M^i$ selected according to the base
two value of the instruction location register
$\mathbf{s}$ (that is, $i = \perp\mathbf{s}$), and transferring it to the
command register $\mathbf{c}$, may be described as

$$\mathbf{c} \leftarrow M^{\perp\mathbf{s}}.$$

Suppose, however, that the base two value
operation (i.e., address decoding) is not provided on the register $\mathbf{s}$ directly, but only on
a special register $\mathbf{a}$ to which $\mathbf{s}$ may be transferred. The fetch then becomes

$$\mathbf{a} \leftarrow \mathbf{s}$$
$$\mathbf{c} \leftarrow M^{\perp\mathbf{a}}.$$

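Suppose, moreover, that all communication
with memory must pass through a buffer
register $\mathbf{b}$, that each transfer out of a memory word $M^i$ is accompanied by a subsequent
reset to zero of that word (that is, $M^i \leftarrow \bar{\boldsymbol{\epsilon}}$),
that every transfer from a register (or word
of memory) $\mathbf{x}$ to a register (or word of memory) $\mathbf{y}$ must be of the form

$$\mathbf{y} \leftarrow \mathbf{x} \vee \mathbf{y},$$

and that any register may be reset to zero,
then the instruction fetch becomes

$$\begin{array}{rl}
1 & \mathbf{a} \leftarrow \bar{\boldsymbol{\epsilon}} \\
2 & \mathbf{a} \leftarrow \mathbf{s} \vee \mathbf{a} \\
3 & \mathbf{b} \leftarrow \bar{\boldsymbol{\epsilon}} \\
4 & \mathbf{b} \leftarrow M^{\perp\mathbf{a}} \vee \mathbf{b} \\
5 & M^{\perp\mathbf{a}} \leftarrow \bar{\boldsymbol{\epsilon}} \\
6 & M^{\perp\mathbf{a}} \leftarrow \mathbf{b} \vee M^{\perp\mathbf{a}} \\
7 & \mathbf{c} \leftarrow \bar{\boldsymbol{\epsilon}} \\
8 & \mathbf{c} \leftarrow \mathbf{b} \vee \mathbf{c}.
\end{array}$$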

In this final form, the successive statements correspond directly (except for the
bracketed pair 4 and 5 which together comprise an indivisible operation) to individual
register-to-register transfers. Each statement can, in fact, be taken as the "name" of
the corresponding set of data gates, and the
overall control circuits need only cycle
through a set of states which activate the
data gates in the sequence indicated.
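The eight-statement fetch can be traced with a short Python sketch. It assumes that registers and memory words are lists of bits, that the base two value is taken most significant bit first, and that the only operations available are reset to zero and the transfer "destination becomes source OR destination". The function names are illustrative and do not appear in the paper.

```python
def base_two_value(bits):
    """Base two value of a logical vector, most significant bit first."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value


def reset(register):
    """Reset every component of a register (or memory word) to zero."""
    for j in range(len(register)):
        register[j] = 0


def or_transfer(dest, src):
    """The only permitted transfer: dest <- src OR dest, component by component."""
    for j in range(len(dest)):
        dest[j] = src[j] | dest[j]


def instruction_fetch(M, s, a, b, c):
    reset(a)                      # 1: clear the decoding register
    or_transfer(a, s)             # 2: a now holds the instruction location
    reset(b)                      # 3: clear the buffer
    word = M[base_two_value(a)]   # the selected memory word (row)
    or_transfer(b, word)          # 4: read the word into the buffer
    reset(word)                   # 5: the read-out leaves the word at zero
    or_transfer(word, b)          # 6: restore the word from the buffer
    reset(c)                      # 7: clear the command register
    or_transfer(c, b)             # 8: deliver the word to the command register


M = [[0, 0, 1], [0, 1, 0], [1, 1, 0], [1, 1, 1]]
s = [1, 0]                        # selects word 2
a, b, c = [1, 1], [1, 0, 1], [0, 0, 1]   # arbitrary stale contents
instruction_fetch(M, s, a, b, c)
print(c)                          # [1, 1, 0]
```

Each call in `instruction_fetch` corresponds to one numbered statement, so the function body plays the part of the control circuit that activates the data gates in sequence.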
The sequence indicated in a microprogram
such as the above is more restrictive than
necessary and certain of the statements (such
as 1 and 3 or 6 and 7) could be executed

concurrently without altering the overall
result. Such overlapping is normally employed to increase the speed of execution of
microprograms. The corresponding relaxation of sequence constraints complicates their
specification, e.g., execution of statement k
might be permitted to begin as soon as statements h, i, and j were completed. Senzig
(Reference 3) proposes some useful techniques and conventions for this purpose.
The "tag" portion of an associative memory can, as shown in Reference 2, be characterized as a memory $M$, an argument vector $\mathbf{x}$, and a sense vector $\mathbf{s}$ related by the
expression

$$\mathbf{s} = M \overset{\wedge}{=} \mathbf{x},$$

or by the equivalent expression

$$\bar{\mathbf{s}} = M \overset{\vee}{\neq} \mathbf{x},$$

obtained by applying De Morgan's law. Falkoff
(Reference 4) has used microprograms of the
type discussed here in a systematic exploration of schemes for the realization of associative memories for a variety of functions
including exact match, largest tag, and nearest larger tag.
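The two expressions for the sense vector can be checked on a small case. The sketch below assumes the tag memory is a list of rows of bits. The first function forms the and-over-equality product and the second forms the or-over-inequality product, which by De Morgan's law gives the complement of the sense vector. The function names are illustrative only.

```python
def sense(M, x):
    """s_i = 1 iff every component of row i equals the argument (and over =)."""
    return [int(all(m == xj for m, xj in zip(row, x))) for row in M]


def sense_complement(M, x):
    """Complement of s: 1 iff some component of row i differs (or over !=)."""
    return [int(any(m != xj for m, xj in zip(row, x))) for row in M]


M = [[1, 0, 1],
     [0, 1, 1],
     [1, 0, 1]]
x = [1, 0, 1]
s = sense(M, x)
s_bar = sense_complement(M, x)
print(s, s_bar)   # [1, 0, 1] [0, 1, 0]
```

Rows 0 and 2 hold the argument exactly, so only their sense bits are set; every bit of the second result is the negation of the corresponding bit of the first.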
Because the symbols used in the language
have been chosen for their mnemonic properties rather than for compatibility with the
character sets of existing keyboards and
printers, transliteration is required in entering microprograms into a computer, perhaps for processing by a simulation or design
automation metaprogram. For that portion
of the language which is required in microprogramming, Reference 5 provides a simple
and mnemonic solution of the transliteration
problem. It is based upon a two-character
representation of each symbol in which the
second character need be specified but rarely.
Moreover, it provides a simple representation of the index structure (permitting subscripts and superscripts to an arbitrary
number of levels) based upon a Lukasiewicz
or parenthesis-free representation of the
corresponding tree.
METAPROGRAMS
Just as a microprogram description of a
computer (couched at a suitable level) can
provide a clear specification of the corresponding computer language, so can a program
description of a compiler or other metaprogram give a clear specification of the

"macro-language" which it accepts as input.
No complete description of a compiler expressed in the present language has been
published, but several aspects have been
treated. Brooks and Iverson (Chapter 8,
Reference 6) treat the SOAP assembler in
some detail, particularly the use of the open
addressing system and "availability" indicators in the construction and use of symbol
tables, and also treat the problem of generators. Reference 1 treats the analysis of
compound statements in compilers, including the optimization of a parenthesis-free
statement and the translations between parenthesis and parenthesis-free forms. The latter
has also been treated (using an emasculated
form of the language) by Oettinger (Reference
7) and by Huzino (Reference 8); it will also
be used for illustration here.
Consider a vector $\mathbf{c}$ representing a compound statement in complete* parenthesis
form employing operators drawn from a set
$\mathbf{p}$ (e.g., $\mathbf{p} = (+, \times, -, \div)$). Then Program 1
shows an algorithm for translating any well-formed statement $\mathbf{c}$ into the equivalent statement $\mathbf{l}$ in parenthesis-free (i.e., Lukasiewicz)
form.
Program 1 - The components of $\mathbf{c}$ are
examined, and deleted from $\mathbf{c}$, in order from
left to right (steps 4, 5). According to the
decisions on steps 6, 7, and 8, each component is discarded if it is a left parenthesis,
appended at the head of the resulting vector
$\mathbf{l}$ if it is a variable (step 9), appended at the
head of the auxiliary stack vector $\mathbf{s}$ if it is
an operator (step 10), and initiates a transfer
of the leading component of the stack $\mathbf{s}$ to
the head of the result $\mathbf{l}$ if it is a right parenthesis (steps 11, 12). The behavior is perhaps best appreciated by tracing the program
for a given case, e.g., if $\mathbf{c} = ([, [, x, +, y, ], \times, [, p, +, q, ], ])$,
then $\mathbf{l} = (\times, +, q, p, +, y, x)$.
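The translation described for Program 1 can be followed in a short Python sketch. It is a transcription of the prose description rather than of the program itself: the statement, the result, and the stack are Python lists whose head is index 0, the square brackets stand for the parentheses as in the example, and the operator set is an assumed example.

```python
OPERATORS = {"+", "-", "×", "÷"}   # an example operator set p


def to_lukasiewicz(statement):
    """Translate a complete parenthesis statement to parenthesis-free form."""
    c = list(statement)   # components still to be examined
    l = []                # resulting vector; new components go on at the head
    s = []                # auxiliary stack vector; head is index 0
    while c:
        x = c.pop(0)              # examine and delete the leading component
        if x == "[":
            continue              # a left parenthesis is discarded
        if x == "]":
            l.insert(0, s.pop(0)) # move the leading stack component to the result
        elif x in OPERATORS:
            s.insert(0, x)        # an operator goes to the head of the stack
        else:
            l.insert(0, x)        # a variable goes to the head of the result
    return l


result = to_lukasiewicz(list("[[x+y]×[p+q]]"))
print(result)   # ['×', '+', 'q', 'p', '+', 'y', 'x']
```

The printed result agrees with the case traced in the text.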

*In complete parenthesis form all implied parentheses are explicitly included.

PROGRAM 1. Translation from complete parenthesis statement $\mathbf{c}$ to equivalent Lukasiewicz statement $\mathbf{l}$

$$\begin{array}{rl}
1 & \mathbf{l} \leftarrow \boldsymbol{\epsilon}(0) \\
2 & \mathbf{s} \leftarrow \boldsymbol{\epsilon}(0) \\
3 & \nu(\mathbf{c}) : 0 \\
4 & x \leftarrow \mathbf{c}_1 \\
5 & \mathbf{c} \leftarrow \bar{\boldsymbol{\alpha}}^1 / \mathbf{c} \\
6, 7 & x \text{ compared with the left and right parentheses} \\
8 & x \in \mathbf{p} \\
9 & \mathbf{l} \leftarrow x \oplus \mathbf{l} \\
10 & \mathbf{s} \leftarrow x \oplus \mathbf{s} \\
11 & \mathbf{l} \leftarrow \mathbf{s}_1 \oplus \mathbf{l} \\
12 & \mathbf{s} \leftarrow \bar{\boldsymbol{\alpha}}^1 / \mathbf{s}
\end{array}$$

Table 1. Basic Operations for Microprogramming (selected from Iverson, A Programming Language, Wiley, 1962)
Floor: $k \leftarrow \lfloor x \rfloor$, where $k \le x < k + 1$; e.g., $\lfloor 3.14 \rfloor = 3$, $\lfloor -3.14 \rfloor = -4$.
Ceiling: $k \leftarrow \lceil x \rceil$, where $k \ge x > k - 1$; e.g., $\lceil 3.14 \rceil = 4$, $\lceil -3.14 \rceil = -3$.
Residue mod m: $k \leftarrow m \mid n$, where $n = mq + k$, $0 \le k < m$ ($k$, $m$, $n$, $q$ integers); e.g., $7 \mid 19 = 5$, $7 \mid 21 = 0$, $7 \mid -3 = 4$.
And: $w \leftarrow u \wedge v$; $w = 1$ iff $u = 1$ and $v = 1$.
Or: $w \leftarrow u \vee v$; $w = 1$ iff $u = 1$ or $v = 1$.
Negation: $w \leftarrow \bar{u}$; $w = 1$ iff $u = 0$.
Proposition: $w \leftarrow (x \, R \, y)$; $w = 1$ iff $x$ stands in relation $R$ to $y$; e.g., $(i = j) = \delta_{ij}$, and $(u \neq v)$ is the exclusive-or of $u$ and $v$.
Full vector: $\mathbf{w} \leftarrow \boldsymbol{\epsilon}(n)$; $\mathbf{w}_i = 1$ (all 1's); e.g., $\boldsymbol{\epsilon}(5) = (1, 1, 1, 1, 1)$.
Unit vector: $\mathbf{w} \leftarrow \boldsymbol{\epsilon}^j(n)$; $\mathbf{w}_i = (i = j)$ (1 in position $j$); e.g., $\boldsymbol{\epsilon}^0(5) = (1, 0, 0, 0, 0)$, $\boldsymbol{\epsilon}^3(5) = (0, 0, 0, 1, 0)$.
Prefix vector: $\mathbf{w} \leftarrow \boldsymbol{\alpha}^j(n)$; $\mathbf{w}_i = (i < j)$ ($j$ leading 1's); e.g., $\boldsymbol{\alpha}^3(5) = (1, 1, 1, 0, 0)$, $\boldsymbol{\alpha}^2(4) = (1, 1, 0, 0)$.
Suffix vector: $\mathbf{w} \leftarrow \boldsymbol{\omega}^j(n)$; $\mathbf{w}_i = (i \ge n - j)$ ($j$ final 1's); e.g., $\boldsymbol{\omega}^3(5) = (0, 0, 1, 1, 1)$.
Infix vector: $\mathbf{w} \leftarrow i \downarrow \boldsymbol{\alpha}^j(n)$ ($j$ 1's after $i$ 0's; see Rotation); e.g., $2 \downarrow \boldsymbol{\alpha}^3(9) = (0, 0, 1, 1, 1, 0, 0, 0, 0)$.
Interval vector: $\mathbf{w} \leftarrow \boldsymbol{\iota}^j(n)$; $\mathbf{w}_i = i + j$ (integers from $j$).
Full matrix: $W \leftarrow E(m \times n)$; $W^i_j = 1$ (all 1's).
Identity matrix: $W \leftarrow I(m \times n)$; $W^i_j = (i = j)$ (diagonal 1's).
Rows and columns: $A^i$ is the $i$-th row vector and $A_j$ is the $j$-th column vector of the matrix $A$.
Character representation: $\boldsymbol{\rho}(x)$ is the vector representation of the character $x$; in 7090 code, $\boldsymbol{\rho}(d) = (0, 1, 0, 1, 0, 0)$ and $\boldsymbol{\rho}(e) = (0, 1, 0, 1, 0, 1)$; in 8421 code, $\boldsymbol{\rho}(0) = (0, 0, 0, 0)$ and $\boldsymbol{\rho}(1) = (0, 0, 0, 1)$.
The special vectors are of dimension $\nu(\mathbf{w}) = n$; the $(n)$ may be omitted if clear from context. All basic operations are extended component-by-component to vectors and matrices.
[Remaining rows of Table 1 are illegible in the source.]

APPLICATIONS
Areas in which the programming language has been applied include search and sorting procedures, symbolic logic, linear programming, information retrieval, and music theory. The first two are treated extensively in Reference 1, and Reference 9 illustrates the application to linear programming by a 13-step algorithm for the simplex method. The areas of symbolic logic and matrix algebra illustrate particularly the utility of the formalism provided. Salton (Reference 10) has treated some aspects of information retrieval, making particular use of the notation for trees. Kassler's use in music concerns the analysis of Schoenberg's 12-tone system of composition (Reference 11).

Three applications will be illustrated here: search procedures, the relations among the canonical forms of symbolic logic, and matrix inversion.

Search Algorithms. Figure 1 shows search programs and examples (taken from Reference 1) for five methods of "hash addressing" (cf. Peterson, Reference 12), wherein the functional correspondent of an argument x is determined by using some key transformation function t which maps each argument x into an integer i in the range of the indices of some table (i.e., matrix) which contains the arguments and their correspondents. The key transformation t is, in general, a many-to-one function and the index i is merely used as the starting point in searching the table for the given argument x. Figure 1 is otherwise self-explanatory. Method (e) is the widely used open addressing system described by Peterson (Reference 12).

Figure 1. Programs and examples for methods of scanning equivalence classes defined by a 1-origin key transformation t. Panels: (a) Overflow; (b) Overflow with chaining; (c) Single table with chaining; (d) Single table with chaining and mapping vector; (e) Open addressing system, construction and use of table. Data of examples: $\mathbf{k}$ = (Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday); $t(\mathbf{k}_i) = 1 + (6 \mid_0 \mathbf{n}_i)$, where $\mathbf{n} = (19, 13, 20, 23, 20, 6, 19)$ ($\mathbf{n}_i$ is the rank in the alphabet of the first letter of $\mathbf{k}_i$); $\mathbf{z} = (2, 2, 3, 6, 3, 1, 2)$, where $\mathbf{z}_i = t(\mathbf{k}_i)$; $\mathbf{d} = (1, 2, 3, 6)$, and $s = 6$.

Symbolic Logic. If $\mathbf{x}$ is a logical vector and $T$ is a logical matrix of dimension $2^{\nu(\mathbf{x})} \times \nu(\mathbf{x})$ such that the base two value of $T^i$ is $i$ (that is, $\perp T = \boldsymbol{\iota}^0(2^{\nu(\mathbf{x})})$), then the rows of $T$ define the domain of the argument $\mathbf{x}$ and any logical function $f(\mathbf{x})$ defined on $\mathbf{x}$ can be completely specified by the intrinsic vector $\mathbf{i}(f)$ such that $\mathbf{i}_j(f) = f(T^j)$. Expansion of the expression $\mathbf{p} = T \overset{\wedge}{=} \mathbf{x}$ shows that $\mathbf{p}$ is the vector of minterms in $\mathbf{x}$, and consequently

$$f(\mathbf{x}) = \mathbf{i}(f) \overset{\vee}{\wedge} \mathbf{p}. \qquad (1)$$

[Equation (2) is illegible in the source.] The relation between the intrinsic vector and the exclusive disjunctive characteristic vector $\boldsymbol{\chi}(\neq, f)$ (first derived by Muller, Reference 13) is given directly by a square matrix $S$ formed from $T$ (its defining expression is illegible in the source). The properties of the matrix $S$ are easily derived from this formulation. Moreover, the formal application of De Morgan's laws to equations (1) and (2) yields the two remaining canonical forms directly (Reference 1, Chapter 7).

Matrix Inversion. The method of matrix inversion using Gauss-Jordan (complete) elimination with pivoting and restricting the total major storage to a single square matrix augmented by one column (described in References 14 and 15) involves enough selection, permutation, and decision type operations to render its complete description by classical matrix notation rather awkward. Program 2 describes the entire process. The starred statements perform the pivoting operations and their omission leaves a valid program without pivoting; exegesis of the program will first be limited to the abbreviated program without pivoting.

PROGRAM 2. Matrix inversion by Gauss-Jordan elimination. [Program listing largely illegible in the source; steps 2, 4, 5, 6, 10, and 13 are starred. The legible statements are step 7, $M^1 \leftarrow M^1 \div M^1_1$; step 8, $M \leftarrow M - (\bar{\boldsymbol{\alpha}}^1 \times M_1) \overset{\circ}{\times} M^1$; and step 11, $k \leftarrow k - 1$.]

Program 2 - Step 1 initializes the counter k which limits (step 11) the number of iterations and step 3 appends to the given square matrix M a final column of the form (1, 0, 0, ..., 0) which is excised (step 12) only after completion of the main inversion process. Step 7 divides the pivot (i.e., the first) row by the pivot element, and step 8 subtracts from M the outer product of its first column (except that the first element is replaced by zero) with its first row.* The result of steps 7 and 8 is to reduce the first column of M to the form (1, 0, 0, ..., 0) as required in Jordan elimination.

*The outer product $Z$ of vector $\mathbf{x}$ by vector $\mathbf{y}$ is the "column by row product" denoted by $Z \leftarrow \mathbf{x} \overset{\circ}{\times} \mathbf{y}$ and defined by $Z^i_j = \mathbf{x}_i \times \mathbf{y}_j$.

The net result of step 9 is to bring the next pivot row to first position and the next pivot element to first position within it by (1) cycling the old pivot row to last place and the remaining rows up by one place, and (2) cycling the leading column (1, 0, 0, ..., 0) to last place (thus restoring the configuration produced by step 3) and the remaining columns one place to the left. The column rotation* rotates all columns save the first upward by one place, and the subsequent row rotation rotates all rows to the left by one place.

*The row rotation $\mathbf{k} \uparrow X$ is a row-by-row extension of the rotation $k \uparrow \mathbf{x}$, that is, $(\mathbf{k} \uparrow X)^i = \mathbf{k}_i \uparrow X^i$. Similarly, $(\mathbf{k} \Uparrow X)_j = \mathbf{k}_j \uparrow X_j$.

The pivoting provided by steps 2, 4, 5, 6, 10, and 13 proceeds as follows. Step 4 determines the index j of the next pivot row by selecting the maximum† over the magnitudes of the first k elements of the first column, where $k = \nu(M) + 1 - q$ on the q-th iteration. Step 6 interchanges the first row with the selected pivot row, except for their final components. Step 5 records the interchange in the permutation vector‡ $\mathbf{p}$ which is itself initialized (step 2) to the value of the identity permutation vector $\boldsymbol{\iota}^1(k) = (1, 2, ..., k)$. The rotation of $\mathbf{p}$ on step 10 is necessitated by the corresponding rotations of the matrix M (which it indexes) on step 9. Step 13 performs the appropriate inverse reordering§ among the columns of the resulting inverse M.

†The expression $\mathbf{u} = \mathbf{v} \lceil \mathbf{x}$ denotes a logical vector $\mathbf{u}$ such that $\mathbf{u}_i = 1$ if and only if $\mathbf{x}_i$ is a maximum among those components $\mathbf{x}_k$ such that $\mathbf{v}_k = 1$. Clearly, $\mathbf{u}/\boldsymbol{\iota}^1$ is the vector of indices of the (restricted) maxima, and $(\mathbf{u}/\boldsymbol{\iota}^1)_1$ is the first among them.

‡A permutation vector is any vector $\mathbf{p}$ whose components take on all the values 1, 2, ..., $\nu(\mathbf{p})$. The expression $\mathbf{y} = \mathbf{x}_{\mathbf{p}}$ denotes that $\mathbf{y}$ is a permutation of $\mathbf{x}$ defined by $\mathbf{y}_i = \mathbf{x}_{\mathbf{p}_i}$.

§Permutation is extended to matrices as follows: $N = M_{\mathbf{p}} \Longleftrightarrow N_j = M_{\mathbf{p}_j}$. The expression $\mathbf{y} \, \iota \, x$ is called the $\mathbf{y}$ index of $x$ and is defined by the relation $\mathbf{y} \, \iota \, x = j$, where $x = \mathbf{y}_j$. If $\mathbf{x}$ is a vector, then $\mathbf{j} = \mathbf{y} \, \iota \, \mathbf{x}$ is defined by $\mathbf{j}_i = \mathbf{y} \, \iota \, \mathbf{x}_i$. Consequently, $\mathbf{p} \, \iota \, \boldsymbol{\iota}^1$ denotes the permutation inverse to $\mathbf{p}$.

A description in the ALGOL language of matrix inversion by the Gauss-Jordan method is provided in Reference 16.

REFERENCES
1. Iverson, K. E., A Programming Language, Wiley, 1962.
2. Iverson, K. E., "A Programming Language," Spring Joint Computer Conference, San Francisco, May 1962.
3. Senzig, D. N., "Suggested Timing Notation for the Iverson Notation," Research Note NC-120, IBM Corporation.
4. Falkoff, A. D., "Algorithms for Parallel-Search Memories," J.A.C.M., October 1962.
5. Iverson, K. E., "A Transliteration Scheme for the Keying and Printing of Microprograms," Research Note NC-79, IBM Corporation.
6. Brooks, F. P., Jr., and Iverson, K. E., "Automatic Data Processing," Wiley (in press).
7. Oettinger, A. G., "Automatic Syntactic Analysis of the Pushdown Store," Proceedings of the Twelfth Symposium in Applied Mathematics, April 1960, published by American Mathematical Society, 1961.
8. Huzino, S., "On Some Applications of the Pushdown Store Technique," Memoirs of the Faculty of Science, Kyushu University, Ser. A, Vol. XV, No. 1, 1961.
9. Iverson, K. E., "The Description of Finite Sequential Processes," Fourth London Symposium on Information Theory, August 1960, Colin Cherry, Ed., Butterworth and Company.
10. Salton, G. A., "Manipulation of Trees in Information Retrieval," Communications of the ACM, February 1961, pp. 103-114.
11. Kassler, M., "Decision of a Musical System," Research Summary, Communications of the ACM, April 1962, page 223.
12. Peterson, W.
W., "Addressing for Random Access Storage," IBM Journal of Research and Development, Vol. 1, 1957, pp. 130-146.
13. Muller, D. E., "Application of Boolean Algebra to Switching Circuit Design and to Error Detection," Transactions of the IRE, Vol. EC-3, 1954, pp. 6-12.
14. Iverson, K. E., "Machine Solutions of Linear Differential Equations," Doctoral Thesis, Harvard University, 1954.
15. Rutishauser, H., "Zur Matrizeninversion nach Gauss-Jordan," Zeitschrift für Angewandte Mathematik und Physik, Vol. X, 1959, pp. 281-291.
16. Cohen, D., "Algorithm 58, Matrix Inversion," Communications of the ACM, May 1961, p. 236.

INTERCOMMUNICATING CELLS, BASIS FOR A
DISTRIBUTED LOGIC COMPUTER
C. Y. Lee
Bell Telephone Laboratories, Inc.
Holmdel, New Jersey

The purpose of this paper is to describe an information storage and retrieval system in which logic is distributed throughout the system. The system is made up of cells. Each cell is a small finite state machine which can communicate with its neighboring cells. Each cell is also capable of storing a symbol.

There are several differences between this cell memory and a conventional system. With logic distributed throughout the cell memory, there is no need for counters or addressing circuitry in the system. The flow of information in the cell memory is to a large extent guided by the intercommunicating cells themselves. Furthermore, because retrieval no longer involves scanning, it becomes possible to retrieve symbols from the cell memory at a rate independent of the size of the memory.

Information to be stored and processed is normally presented to the cells in the form of strings of symbols. Each string consists of a name and an arbitrary number of parameters. When a name string is given as its input, the cell memory is expected to give as its output all of the parameter strings associated with the name string. This is called direct retrieval.
On the other hand, given a parameter string, the cell network is also expected to give as its output the name string associated with that parameter string. This is called cross-retrieval.

The principal aim of our design is a cell memory which satisfies the following criteria:
1. The cells are logically indistinguishable from each other.
2. The amount of time required for direct retrieval is independent of the size of the cell memory.
3. The amount of time required for cross-retrieval is independent of the size of the cell memory.
4. There is a simple uniform procedure for enlarging or reducing the size of the cell memory.

Aim and Motivation
We are primarily concerned here with the design of the memory system of a computer in which memory and logic are closely interwoven. The motivation stems from our contention that in the present generation of machines the scheme of locating a quantity of information by its "address" is fundamentally a weak one, and furthermore, the constraint that a memory "word" may communicate only with the central processor (in most cases the accumulator) has no intrinsic appeal. This motivation led us to the design of a cell memory compatible with these contentions.

The association of an address with a quantity of information is very much the result of the type of computer organization we now have. Writing a program in machine language, one rarely deals with the quantities of information themselves. A programmer normally must know where these quantities of information are located. He then manipulates their addresses according to the problem at hand. An address in this way often assumes the role of the name of a piece of information.

There are two ways to look at this situation. Because there is usually an ordering relationship among addresses, referring to contents by addresses has its merits provided a programmer has sufficient foresight at the time the contents were stored.
On the other hand, a location, other than being its address, can have but the most superficial relation to the information which happens to be stored there. The assignment of a location to a quantity of information is therefore necessarily artificial. In many applications the introduction of an address as an additional characteristic of a quantity of information serves only to compound the complexity of the issue. In any event the assignment of addresses is a local problem, and as such should not occupy people's time and may even be a waste of the machine's time.

A macroscopic approach to information storage and retrieval is to distinguish information by its attributes. A local property, "350 Fifth Avenue," means little to most people. The attribute, "the tallest building in the world," does. The macroscopic approach requires only that we be able to discern facts by contents. Whatever means are needed for addressing, for counting, for scanning and the like are not essential and should be left to local considerations.

Doing away with addressing, counting, and scanning means a different approach to machine organization. The underlying new concept is however simple and direct: The work of information storage and retrieval should not be assumed by a central processor, but should be shared by the entire cell memory. The physical implementation of this concept is intercommunicating cells.

Although one of the principal aims of an intercommunicating cell organization is to make the rate of retrieval independent of the amount of information stored in the cell memory, a number of other engineering criteria are no less important. Uniformity of cell design makes mass production possible. Ease of attaching or detaching cells from the cell memory simplifies the growth problem. Also, facilities for simultaneous matching of symbols make complex preprocessing such as sorting unnecessary.

A few intuitive remarks on cell memory retrieval may be appropriate here.
Information stored in the cell memory is in the form of strings of symbols. Normally, such a string is made up of a name and the parameters which describe its attributes. Each cell is capable of storing one symbol. A string of information is therefore stored in a corresponding string of cells. In the cell memory each cell is given enough logic circuitry so that it can give us a yes or no answer to a simple question we ask. If we think of the symbol, say s, contained in a cell as its name, then the question we ask is merely whether the cell's name is s or is not s.

In retrieval, for example, we may wish to find all of the parameters (attributes) of a strategy whose name is XYZ. As a first step we would simultaneously ask each cell whether its name is X. If a cell gives us an answer no, then we are no longer interested in that cell. If a cell gives us an answer yes, however, we know it may lead us to the name of the strategy we are looking for. Therefore, we also want each cell to have enough logic circuitry so that it can signal its neighboring cell to be ready to respond to the next question. We then simultaneously ask each of these neighboring cells whether its name is Y. Those cells whose names are Y in turn signal their neighboring cells to be ready to respond to the final question: whether a cell's name is Z. The cell which finally responds is now ready to signal its nearest neighbor to begin giving out parameters of the strategy.

The process of cell memory retrieval provides a particularly good example of letting the cells themselves guide the flow of information. By a progressive sequence of questions, we home in on the information we are looking for in the cell memory, although we have no idea just where the information itself is physically stored. Because generally most cells contain information which is of no use to us, the number of cells which give yes answers at any moment is quite small.
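The sequence of questions just described can be imitated in a few lines of Python. The sketch makes several simplifying assumptions that are not part of the design: the cell memory is a plain list of symbols, the neighbor signalled by a cell is the one to its right, and the simultaneous questioning of all cells is imitated by a loop. The function name is illustrative.

```python
def locate_parameters(cells, name):
    """Return the positions of the cells that follow a complete match of name."""
    # The first question is put to every cell in the memory.
    ready = set(range(len(cells)))
    for symbol in name:
        # Every ready cell answers yes or no to "is your name this symbol?"
        yes = [i for i in ready if cells[i] == symbol]
        # Each cell answering yes signals its right-hand neighbor to be
        # ready for the next question; all other cells drop out.
        ready = {i + 1 for i in yes if i + 1 < len(cells)}
    return sorted(ready)


cells = list("ABXYZPQXYAB")
start = locate_parameters(cells, "XYZ")
print(start)   # [5]
```

In this run two cells answer yes to X and to Y, but only one survives the question Z, and its neighbor at position 5 is the cell from which the parameters would be given out. No address is used at any point.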
We may, if we wish, think of the retrieving process therefore as a process of getting rid of useless information rather than a searching process for the useful information.

Cell Configuration

Each cell in the cell memory is made up of components called cell elements. Each cell element is a bistable device such as a relay or a flip-flop. The cell elements are divided into two kinds: the cell state elements and the cell symbol elements. In the design of the cell memory to be described here, each cell will have a single cell state element so that each cell has two logical states. A cell may either be in an active state or in a quiescent state. There may be any number of cell symbol elements, depending upon the size of the symbol alphabet. The over-all structure of a cell memory is shown in Figure 1.

Figure 1. Overall diagram of the cell memory.

Each cell in the cell memory, say cell i, is controlled by four types of control leads: the input signal lead (IS lead), the output signal lead (OS lead), the match signal lead (MS lead), and the propagation lead (P lead).

The input signal lead is active for the duration of the input process. The input symbol itself is carried on a separate set of input leads. When a cell is in an active state, and the input signal lead is activated, whatever symbol is carried on the input leads is then stored in that cell.

The output signal lead controls the flow of output from the cell memory. Each symbol read out from a cell is carried by a separate set of output leads, and is also stored in a buffer called the Output Symbol Buffer. When a cell is in an active state, a pulse on the output signal lead causes that cell to read out its contents to the set of output leads.

An important function performed by the cell memory is the operation of simultaneous matching of the contents of each of the cells with some fixed contents. This operation is controlled by the match signal lead. During matching, the contents of each cell, say cell i, are compared with the contents carried on the input leads. If the comparison is successful, an internal signal, $m_i$, is generated in cell i. The signal $m_i$ is transmitted to one of the neighboring cells of cell i, causing that cell to become active.

The propagation lead controls the propagation of activity in a cell memory. When a cell is in an active state, a pulse on the propagation lead causes it to pass its activity to one of its neighboring cells. The direction of propagation is controlled by two separate direction leads: R and L. The circuits of a cell employing flip-flops and gates are shown in Figure 2.

Figure 2. More detailed circuit of cell i.

An Example of Cross-Retrieval

Let there be three separate strings of information stored in the cell memory. Let these strings have the following names and parameters:

Name    Parameter
B       XY
AB      XW
AC      U

The strings are stored in the cell memory in the form of a single composite string. There must be some way, therefore, by which the name and the parameter strings can be told apart and also a means for distinguishing among the three strings of information themselves. To do this we introduce two tag symbols, α and β.
Every name string is preceded by a tag of α, and every parameter string is preceded by a tag of β. The string stored in the cell memory therefore has the form:

αBβXY αABβXW αACβU α ...

We have found it convenient to use the diagram [p | q] to represent a cell. In this diagram, p stands for the symbol stored in the cell, and q for the state of the cell. Also, q is 1 if the cell is quiescent, and is 2 if the cell is active. Using such diagrams, Figure 3 shows the manner in which the composite string is stored in the cell memory.

Figure 3. Cell memory configuration at the start of retrieval.

Let us suppose that we wish to retrieve the name of a string whose parameter is XW. We call such a process in which a parameter string input causes a name string output the process of cross-retrieval. The process of retrieving a parameter knowing its name is called direct retrieval.

Initially, we want all of the cells to match their individual contents against the fixed information [β | 1]. Furthermore, we want every cell whose contents happen to be [β | 1] to send a signal to the neighboring cell on its right. We then have the situation shown in Figure 4, where each arrow indicates that a signal is being transmitted by a cell whose contents are [β | 1].

Figure 4. Signals being transmitted by [β | 1] after matching.

Now every cell which receives a signal from its neighboring cell, whether from the left or from the right, will change from a quiescent state to an active state. Also, the signals transmitted by the cells to their neighbors should be thought of as pulses so that they disappear after they have caused the neighboring cells to become active. The next stable situation is shown in Figure 5; each of the active cells is represented by double boundary lines.

Figure 5. Some of the cells become active after receiving pulses from the neighboring cells.

During the next match cycle, we want all of the cells to match their individual contents against the fixed contents [X | 2]. As before, every cell whose contents are [X | 2] sends a signal to its right neighboring cell, causing that cell to become active. At this point, each previously active cell is made to restore itself to the quiescent state. The stable situation is illustrated in Figure 6.

Figure 6. Previously active cells restore themselves to quiescent state as new cells become active.

During the following match cycle, the cells are made to match their individual contents against the fixed contents [W | 2]. In this example, there is now only one cell whose contents are [W | 2]. That cell first signals the cell on its right, causing that neighboring cell to become active, and then restores itself to the quiescent state. During the next match cycle, all of the cells are made to match against [α | 2]. The presence of the symbol α shows that the matching process is at an end, and that the output part of the retrieval process is to begin. The cell whose contents are [α | 2] is the only cell which is active at the moment.

During the output phase, a number of actions take place. First of all, two successive propagate-left signals are sent to the cell memory. The result is a transfer of activity two cells to the left, as shown in Figure 7.

Figure 7. Transfer of activity between cells.

An output signal is now supplied to all of the cells. The cell which is active then reads out its symbol to the output leads. That symbol is now compared with the fixed symbol α in an external match circuit associated with the control leads. If there is no match, a propagate-left signal is sent to every cell and the external comparison process is repeated. This process eventually terminates when the cell containing α is reached (Figure 8). The purpose of this phase of the output process is strictly to locate the beginning of the information string which is being retrieved.

Figure 8. Reaching the initial symbol in the string to be retrieved.

The actual read out process begins with a propagate-right signal. The cell which contains the first letter of the name string AB is activated. The symbol A is now read out and matched externally with the symbol β. The read out process continues until β is reached and is then terminated. Figure 9 gives a general picture of the read out part of the cross-retrieval process.

1. Output symbol A. No match with α.
2. Output symbol B. No match with α.
3. Output symbol β. No match with α.
4. Output symbol X. No match with α.
5. Output symbol W. No match with α.
6. Output symbol α. Match and stop.

Figure 9. A general picture of the read-out part of the cross-retrieval process.

Storage and Cross-Retrieval

The storage and the retrieval of symbols in the cell memory are both accomplished by letting the cells pass their activities to their neighboring cells and, in this way, guide the flow of information in the cell memory.

The process of storing symbols in the cell memory provides a good illustration of the dependence of the cell memory upon the propagation of activity among the cells. Prior to storing the first symbol, the first cell in the cell memory is made active. When the first symbol appears on the set of input leads, the first cell, being active, becomes the only cell prepared to receive that symbol. The first cell, like all of the cells, plays a dual role however. After taking in the symbol, it also passes its activity to the right neighboring cell. The neighboring cell then becomes the only active cell in the cell memory, and hence becomes the only cell prepared to receive the next symbol when it appears on the input leads.

When we examine the many kinds of information strings which make retrieval difficult, we find that a string is much more likely to have several parameters rather than a single parameter. For such strings the tag system which we have used in the example in the last section is inadequate.

If a string consists of a name N and k-1 parameters $P_1, P_2, \ldots, P_{k-1}$, we will now assign to it a set of k+2 tags: $\alpha_0, \alpha_1, \ldots, \alpha_k$, and β. The string will be stored in the cell memory in the following form:

$\alpha_0 \; \alpha_1 N \beta \; \alpha_2 P_1 \beta \; \cdots \; \alpha_k P_{k-1} \beta$

The symbol $\alpha_0$ indicates the beginning of a string, and the symbol β indicates the end of a component (that is, a name or a parameter). The symbols $\alpha_1, \alpha_2, \ldots, \alpha_k$ are the tags associated with the components $N, P_1, \ldots, P_{k-1}$. Furthermore, it should be noted that a tag is associated always with a given attribute. For example, $\alpha_1$ is the name tag and should be used as a name tag for all information strings.

Consider now the cross-retrieval problem where the cell memory is given as its input a component string together with its tags:

$\alpha_j P_{j-1} \beta$

The cell memory, for the purpose of cross-retrieval, must give as its output the entire string:

$\alpha_0 \; \alpha_1 N \beta \; \alpha_2 P_1 \beta \; \cdots \; \alpha_k P_{k-1} \beta$

In describing a procedure for cross-retrieval, we shall assume for the purpose of this presentation that every component stored in the cell memory is unique. This means that if an input string $\alpha_j P_{j-1} \beta$ is presented to the cell memory, we can be sure that either (1) there is no string in the cell memory which has $\alpha_j P_{j-1} \beta$ as one of its components, or (2) there is exactly one string in the cell memory which has $\alpha_j P_{j-1} \beta$ as one of its components.
Under this assumption therefore, no two strings could compete with each other during retrieval.

The basic cross-retrieval procedure is the following. The string $\alpha_j P_{j-1} \beta$ is first matched with all of the strings stored in the cell memory. When a match has occurred, the cell in which the symbol $\alpha_{j+1}$ is stored ($\alpha_{j+1}$ being, in this case, the symbol immediately following $\alpha_j P_{j-1} \beta$ in the cell memory) would be activated. Because $\alpha_j P_{j-1} \beta$ is unique in the cell memory, the cell in which the symbol $\alpha_{j+1}$ is stored becomes the only active cell in the cell memory. This activity is now propagated towards the left until the symbol $\alpha_0$, which is the beginning of the string, is reached. Symbols are then retrieved from the cell memory to the right until finally the symbol $\alpha_0$, which is the beginning of the next string, is reached.

Outlook

We wanted to present here the basic ideas of a distributed logic system without going into many related problems and other technical considerations. The most obvious asset of such an organization is the tremendous speed it offers for retrieval. Suitable programs can also be developed to make the organization extremely flexible. In addition, we believe the macroscopic concept of logical design, away from scanning, from searching, from addressing, and from counting, is equally important. We must, at all cost, free ourselves from the burdens of detailed local problems which only befit a machine low on the evolutionary scale of machines.

On the other hand, the emphasis on distributed logic introduces a number of physical problems. If a cell memory is to be practically useful, it must have many thousands, or perhaps millions of cells. Each cell must therefore be made of physical components which are less than miniature in size, and which must each consume extremely tiny amounts of power. Furthermore, because the cells are all identical, mass production techniques should be developed in which a whole block of circuitry can be formed at once.

Because the coordination of vast amounts of information is essential to scientific, economic, and military progress, the type of organization exemplified by the cell memory needs to be explored and explored extensively. The research on machine organization, however, cannot stand alone; the success of this research will depend also on the success in other fields of research: microminiaturization, integrated logic, and hyper-reliable circuit design.

ACKNOWLEDGEMENT

The writer is indebted to Messrs. S. H. Washburn, C. A. Lovell, and W. Keister for their support in this work. The writer also wishes to acknowledge the benefit he received from many discussions with Mr. M. C. Paull and with his other colleagues.

REFERENCES

1. Albert E. Slade, The Woven Cryotron Memory, Proc. Int. Symp. on the Theory of Switching, Harvard Univ. Press, 1959, p. 326.
2. R. R. Seeber and A. B. Linquist, Associative Memory with Ordered Retrieval, IBM Jour. of Res. and Dev., 5, 1962, p. 126.
3. R. S. Barton, A New Approach to the Functional Design of a Digital Computer, Proc. of the Western Joint Computer Conference, May 9 to 11, 1961, p. 393.
4. S. H. Unger, A New Type of Computer Oriented Towards Spatial Problems, Proc. of the IRE, 46, 1958, p. 1744.
5. P. M. Davis, A Superconductive Associative Memory, Proc. Spring Joint Computer Conference, May 1 to 3, 1962, p. 79.
6. V. L. Newhouse and R. E. Fruin, A Cryogenic Data Address Memory, Proc. Spring Joint Computer Conference, May 1 to 3, 1962, p. 89.
7. J. W. Crichton and J. H. Holland, A New Method of Simulating the Central Nervous System Using an Automatic Digital Computer, Tech. Report, Univ. of Mich., March, 1959.
8. H. Blum, An Associative Machine for Dealing with the Visual Field and Some of its Biological Implications, Tech. Report, Air Force Cambridge Research Labs., February, 1962.
9. R. F. Rosin, An Organization of an Associative Cryogenic Computer, Proc. Spring Joint Computer Conf., May 1 to 3, 1962, p. 203.
10. R. J. Segal and H. P. Guerber, Four Advanced Computers - Key to Air Force Digital Data Comm. Syst., Proc. Eastern Joint Computer Conference, December, 1961, p. 264.

ON THE USE OF THE SOLOMON PARALLEL-PROCESSING COMPUTER*

J. R. Ball†, R. C. Bollinger†, T. A. Jeeves†, R. C. McReynolds×, D. H. Shaffer†
Westinghouse Electric Corporation
Pittsburgh, Pa.

* The applied research reported in this document has been made possible through support and sponsorship extended by the U.S. Air Force Rome Air Development Center and the U.S. Army Signal Research and Development Laboratory.
† Westinghouse Research Laboratories, Pittsburgh, Pa.
× Westinghouse Air Arm Division, Baltimore, Md.
‡ Presently at Pennsylvania State University, State College, Pa. This work was done while at Westinghouse Research Laboratories, Pittsburgh, Pa.

SUMMARY

The SOLOMON computer has a novel design which is intended to give it unusual capabilities in certain areas of computation. The arithmetic unit of SOLOMON is constructed with a large number of simple processing elements suitably interconnected, and hence differs from that of a conventional computer by being capable of a basically parallel operation. The initial development and study of this new computer has led to considerable scientific and engineering inquiry in three problem areas:

1. The design, development, and construction of the hardware necessary to make SOLOMON a reality.
2. The identification and investigation of numerical problems which most urgently need the unusual features of SOLOMON.
3. The generation of computational techniques which make the most effective use of SOLOMON's particular parallel construction.

This paper is an early report on some work which has been done in the second and third of these areas.

SOLOMON has certain inherent speed advantages as a consequence of its design. In the first place, computers conventionally require two memory cycles per simple command: one cycle to obtain the instruction and one cycle to obtain the operand. Although SOLOMON has the same basic requirement it handles up to 1024 operands with each instruction. Consequently, the time per operand spent in obtaining the instruction is negligible. This results in increasing the speed by a factor of two. In the second place, the fact that the processing elements handle 1024 operands at once greatly increases the effective speed. The factor is not 1024, however. Since the processors are serial-by-bit they require n memory references to add an n-bit word. If n is taken nominally to be 32, then the resulting net speed advantage is 1024/n, that is 1024/32 = 32. These two factors result in a fundamental speed increase on the order of 64 to 1 for comparable memory cycle times.

In addition to these concrete factors, there are other factors whose contribution to speed cannot be as easily measured. Among these are i) the advantages due to the intercommunication paths between the processing elements, ii) the advantage of using variable word length operations, iii) the net effect resulting from either eliminating conventional indexing operations or else superseding them by mode operations, and iv) the loss in effectiveness resulting from the inability of utilizing all processors in every operation. The net speed advantage can only be determined by detailed analysis of individual specific problems.

The task of evaluating the feasibility of the SOLOMON computer has led to investigations of problems which primarily involve elementary, simultaneous computations of an iterative nature.
In particular, the solution of linear systems and the maintenance of real-time multi-dimensional control and surveillance situations have been considered. Within these very broad areas two special problems have been rather thoroughly studied and are presented here to demonstrate the scope and application of SOLOMON. The first of these is a problem from partial differential equations, namely the discrete analogue of Dirichlet's problem on a rectangular grid. The second is the real-time problem of satellite tracking and the computations which attend it. These problems are discussed here individually and are followed by a brief summation of other work.

Partial Differential Equations

Introduction: Among current scientific computational problems, the numerical solution of partial differential equations has in recent years made the most severe demands on the computational speed and storage capacity of computing machines. It is therefore natural to investigate the capability and performance of the SOLOMON computer in this area. This discussion describes such an investigation in the area of partial differential equations of elliptic type. In particular, the problem chosen to serve as a standard for comparison with various methods and conventional machines is that of solving Laplace's equation over a square with Dirichlet boundary conditions. Since estimates of rates of convergence of various methods are obtainable for the Dirichlet problem on the square, and since running-time estimates for conventional machines also can be easily made for this problem, it seems a reasonable criterion.

The discussion which follows assumes familiarity with the concept of the SOLOMON Parallel-Processing Computer [1].
There are three parts: (1) an outline of the problem and the numerical methods to be used; (2) a discussion of the organization of the computations using the parallel-processing ability; (3) a presentation of the results of some comparisons of SOLOMON with conventional machines on the basis of the problem mentioned previously.

Finite Difference Approximations: In this section, the main features of the numerical methods to be considered are summarized. A complete exposition is given in the book of Forsythe and Wasow [2]. The statement of the problem given below is by now fairly standard in the literature.

Suppose the region to be considered is the open region, R, of the xy-plane, and suppose R has a boundary, C, which is a simple closed curve. It is required to find the solution u = u(x, y) of the Laplacian boundary value problem

$$\Delta u = 0 \text{ in } R, \qquad u \text{ prescribed on } C, \tag{1}$$

where $\Delta u$ denotes the Laplacian of u, $u_{xx} + u_{yy}$.

For purposes of numerical computation, the problem (1) is approximated as follows. We first replace the xy-plane by a net of square meshes of side h. For the given mesh constant, $h > 0$, the net consists of the lines $x = \mu h$, $y = \nu h$, $\mu, \nu = 0, 1, 2, \ldots$. The points $(\mu h, \nu h)$ are called nodes, and the nodes within R form the net region $R_h$, assumed to be connectable by line segments of the net within R. A node of $R_h$ is said to be a regular interior point if each of its four (nearest) neighbors $(\mu h \pm h, \nu h)$ and $(\mu h, \nu h \pm h)$ is in $R \cup C$. All other points of $R_h$ are called irregular interior points. The set of boundary points, denoted by $C_h$, contains those points which are the points of intersection of C with the lines of the net; these may or may not be nodes. Since the principal purpose of this paper is to illustrate the SOLOMON concept, for purposes of exposition the simplifying assumption will be made that all points of $C_h$ are nodes.
That is, it will be assumed henceforth that the region of interest is a bounded plane region which is the connected union of squares and half-squares of the net imposed, so that all interior points are regular. For a detailed treatment of irregular interior points, and errors of discretization and approximation incurred by the use of an approximate boundary $C_h$ rather than C, and by approximations to the partial derivatives, the references should be consulted, especially [2].

From well-known finite difference approximations for the partial derivatives occurring in (1), a formal difference approximation, defined on the net, may be obtained as follows. For convenience, the neighbors of a point P are designated as shown below, and the net function will be denoted by U to avoid excessive subscripting.

        N
    W   P   E
        S

In terms of this diagram, the replacements are:

$u_{xx}$ by $\frac{1}{h^2}\,[U(W) - 2U(P) + U(E)]$,

$u_{yy}$ by $\frac{1}{h^2}\,[U(N) - 2U(P) + U(S)]$.

Then (1) becomes

$$\Delta_h U = \frac{1}{h^2}\,[U(N) + U(E) + U(S) + U(W) - 4U(P)] = 0 \text{ in } R_h, \qquad U \text{ prescribed on } C_h. \tag{2}$$

The formula for the Laplacian operator in (2) is the well-known five-point formula or "star" for the numerical solution of field equations. By using the formula (2) to replace the operator in (1) at each point P in $R_h$, the conditions of (1) may be approximated by a system of algebraic equations. Using the boundary conditions and applying (2) at each interior point P, the problem (1) is replaced by a system of N simultaneous equations for the N unknown values of U at the points P interior to $C_h$. Under appropriate conditions (see [2], for example), it can be shown that for P in $R_h$, $U(P) \to u(P)$ as $h \to 0$; the discussion here, however, is limited to a brief outline of some iterative procedures used for solving the approximate problem on SOLOMON, and the references should be consulted for the mathematical basis.

Simultaneous Displacements: One straightforward method which comes readily to mind as a candidate for a SOLOMON program because of its use of nearest neighbors is the method of simultaneous displacements. This well-known iterative method makes use of the five-point formula mentioned above, and goes as follows: A trial solution $U_0(P)$, P in $R_h$, is chosen. Suppose that we are at the k-th stage in the iteration, i.e., $U_{k-1}$ has already been determined. For each point P, the value of the new solution $U_k(P)$ at that point is obtained by averaging the values of the (k-1)st solution at the four nearest neighbors of P, i.e.,

$$U_k(P) = \frac{1}{4}\,[U_{k-1}(N) + U_{k-1}(E) + U_{k-1}(S) + U_{k-1}(W)].$$

This process is continued until the change in the values at every point does not exceed some prescribed tolerance. Sufficient conditions for the process to converge are known, and are given in [2, section 21.4]. This method has much to recommend it for SOLOMON in terms of simplicity and ease of programming. Its major defect, and one which encouraged investigation of other methods, is its inferior rate of convergence; this will be discussed later.

Optimum Successive Overrelaxation: There is a modification of the method of simultaneous displacements in which a new value is used as soon as it has been computed. That is, when solving for a new value, U(P), at a point P, one always employs the latest values of all other components involved in the formula for the new value at P. The procedure is called the method of successive displacements (or successive relaxation). As might be inferred from the brief description, it depends critically on the order σ in which the unknowns are determined by relaxation. Only cyclic orders are considered, in which the new values are obtained in the order $U_{\sigma(1)}, U_{\sigma(2)}, \ldots, U_{\sigma(N)}$, and repeat, where $\{\sigma(j)\}$ is some permutation of the first N integers.
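The method of simultaneous displacements described above can be sketched in a few lines. This is a minimal serial illustration in Python, not a SOLOMON program; the function names, the list-of-lists grid, the `interior` list of index pairs, and the stopping parameters `tol` and `max_sweeps` are assumptions introduced for the sketch. Every interior value is recomputed from the previous iterate only, which is the property that makes the method attractive for parallel hardware.

```python
def jacobi_sweep(u, interior):
    """One step of simultaneous displacements on a grid u (list of lists).

    Each interior value is replaced by the average of its four nearest
    neighbors, all taken from the previous iterate.
    """
    new = [row[:] for row in u]  # boundary values are carried over unchanged
    for i, j in interior:
        new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
    return new


def solve_simultaneous(u, interior, tol=1e-8, max_sweeps=10_000):
    """Repeat sweeps until no interior value changes by more than tol."""
    for sweep in range(1, max_sweeps + 1):
        new = jacobi_sweep(u, interior)
        change = max(abs(new[i][j] - u[i][j]) for i, j in interior)
        u = new
        if change <= tol:
            return u, sweep
    return u, max_sweeps


# Example: boundary values U = i (a linear function, which satisfies the
# five-point equations exactly), interior started at zero.
n = 6
grid = [[float(i) if i in (0, n - 1) or j in (0, n - 1) else 0.0
         for j in range(n)] for i in range(n)]
inner = [(i, j) for i in range(1, n - 1) for j in range(1, n - 1)]
solution, sweeps = solve_simultaneous(grid, inner)
```

Because `jacobi_sweep` reads only the old grid, the loop over `interior` could be carried out for all points at once, one point per processing element.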
Early users of relaxation often found it profitable to overrelax, that is, to change a component by some real factor ω, ω > 1, times the change needed to satisfy the equation exactly. Overrelaxation was originally employed primarily in hand computation, and was not usually employed in the regular or cyclic order which is found most convenient when the computation is done on a digital computer. The question was then raised as to whether overrelaxation was profitable when used with a fixed, cyclic order of determining the unknowns. Although it is known [2, section 21.4] that overrelaxation is not profitable in solving the Dirichlet problem by simultaneous displacements, it is true that overrelaxation is highly profitable for many elliptic difference operators in connection with the method of successive displacements (successive relaxation). The theory of overrelaxation in successive displacements is due to Young and Frankel; a detailed exposition is given in [2, section 22], where the original papers and later developments are also referred to.

The practical application of the method is as follows. As before, we use the five-point difference formula. The nodes in $R_h$ are scanned repeatedly in a cyclic order. At each point, the residual, $\Delta_h U(P)$, is formed, where

$$\Delta_h U(P) = U(N) + U(E) + U(S) + U(W) - 4U(P).$$

Then, a new value of U(P) is computed by the formula

$$U^{\text{new}}(P) = U(P) + \frac{\omega}{4}\,\Delta_h U(P),$$

where ω is the overrelaxation factor. (For ω = 1, this is the method of successive displacements.) The theory of the method shows that it will converge for 0 < ω < 2, and that the most rapid convergence occurs at a value called $\omega_{opt}$, with $1 < \omega_{opt} < 2$. A good estimate of $\omega_{opt}$ is important, and it is known [2, section 22] that a value of ω somewhat larger than $\omega_{opt}$ is less costly in computer time than one somewhat smaller than $\omega_{opt}$.
It is also important that the time required to obtain a good estimate of $\omega_{opt}$ be small compared to the time required to solve the problem using merely a reasonable guess for $\omega_{opt}$. The problem of determining $\omega_{opt}$ is a very substantial one. A few methods are suggested in [3, section 25.5], but the results at present are inconclusive, and it is conjectured in [2] that finding $\omega_{opt}$ may in some cases be as hard as solving the original boundary value problem. For this reason, the program developed here assumes an acceptable value of ω has been determined, and proceeds from that point.

One other point should be mentioned in connection with the cyclic order in which the new values at the nodes of the mesh are found. To be acceptable an order must be what is called consistent in the literature [2, 4]; the details and definitions require a somewhat lengthy preparation and will not be given here, but are fully discussed in the references just cited. Forsythe and Wasow [2, p. 245] do state a simple criterion (due to Young) for consistency of five-point formulas, which may be reproduced here:

"Of each pair P, Q of adjacent nodes of the net, one, say P, precedes the other, say Q, in the order σ. If so, draw an arrow from P to Q. Do this for every pair of adjacent nodes. Then, according to David Young (private communication), the order σ is consistent if and only if each elementary square mesh of the net is bounded by four arrows with a zero circulation, i.e., if a directed path enclosing the square travels with the arrows on two sides of the square, and against the arrows on two sides."

In order to make as much use as possible of the parallel nature of SOLOMON, and at the same time observe the requirement that the order be consistent, we have chosen the following scheme. We partition the nodes, $P_{ij}$, of the mesh into two sets $S_1$, $S_2$, by letting $S_1$ be the set of all points with odd parity (i + j odd) and $S_2$ be the set of all points with even parity (i + j even).
Then an order consistent with this partition is to solve for all components in $S_1$, and then for all those in $S_2$. This leads to a checkerboard-like arrangement which can be easily obtained on SOLOMON.

Rates of Convergence: A few words about rates of convergence are in order here to illustrate the reason that successive overrelaxation was chosen for SOLOMON rather than simultaneous displacements, even though the latter method is so simple to program. By "rate of convergence" is meant the average asymptotic number of base-e digits by which the error is decreased per iterative step. Forsythe and Wasow [2, p. 283] list some approximate rates of convergence for various methods for solving the Dirichlet problem on a π × π square with n points on a side. For simultaneous displacements the approximate rate of convergence is given as $h^2/2$; for optimum successive overrelaxation it is $2h$. For the π × π square, these rates will then be $h^2/2 = \pi^2/(2n^2)$ and $2\pi/n$, respectively, and the error will be decreased by about one base-e digit per $2n^2/\pi^2$ iterations for simultaneous displacements and per $n/(2\pi)$ iterations for successive overrelaxation. In [2, p. 374], it is shown that to reduce the initial error by a factor of $10^{-6}$ (approximately $e^{-13.8}$) takes $13.8n/(2\pi) \approx 2.2n$ sweeps for successive overrelaxation; a similar calculation gives $2.8n^2$ sweeps for simultaneous displacements. Some running-time computations based on these figures indicate that because of the large number of iterations required with the method of simultaneous displacements, SOLOMON can only show an order-of-magnitude improvement over conventional machines for n small. For successive overrelaxation, however, the improvement is substantial; this will be discussed later.

The Computational Scheme: The effectiveness of SOLOMON in solving problems will depend to a large extent on the representation of the problem in SOLOMON memory.
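The checkerboard ordering chosen for successive overrelaxation can be sketched as follows. This is a serial Python illustration under stated assumptions, not the SOLOMON program: the function names, the grid representation, the tolerance, and the sample value ω = 1.3 are all hypothetical choices for the example. The update uses the residual $U(N) + U(E) + U(S) + U(W) - 4U(P)$ scaled by ω/4, applied first to every point with i + j odd and then to every point with i + j even.

```python
def checkerboard_sor_sweep(u, interior, omega):
    """One overrelaxation sweep in checkerboard order, updating u in place.

    All four neighbors of a point have the opposite parity, so the points
    of one parity do not depend on each other within a half-sweep and could
    be updated at the same time.
    """
    for parity in (1, 0):  # odd-parity set first, then even-parity set
        for i, j in interior:
            if (i + j) % 2 == parity:
                residual = (u[i - 1][j] + u[i + 1][j]
                            + u[i][j - 1] + u[i][j + 1] - 4.0 * u[i][j])
                u[i][j] += 0.25 * omega * residual
    return u


def solve_sor(u, interior, omega, tol=1e-8, max_sweeps=10_000):
    """Repeat checkerboard sweeps until no value changes by more than tol."""
    for sweep in range(1, max_sweeps + 1):
        old = [row[:] for row in u]
        checkerboard_sor_sweep(u, interior, omega)
        change = max(abs(u[i][j] - old[i][j]) for i, j in interior)
        if change <= tol:
            return u, sweep
    return u, max_sweeps


# Example: boundary values U = i, interior started at zero.
n = 6
grid = [[float(i) if i in (0, n - 1) or j in (0, n - 1) else 0.0
         for j in range(n)] for i in range(n)]
inner = [(i, j) for i in range(1, n - 1) for j in range(1, n - 1)]
solution, sweeps = solve_sor(grid, inner, omega=1.3)
```

With ω = 1 the same routine reduces to successive displacements in checkerboard order. The value of ω is passed in by the caller, in keeping with the assumption that an acceptable value has already been determined.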
The particular representation chosen for the Dirichlet problem and presented here permits all 1024 processing elements to be fully utilized. That is, no processing element is required to remain inactive for one or more instruction times, except when that processor contains a boundary or exterior point of the net.

Consider the rectangular net shown in Figure 1 which contains $N_0 \times M_0$ net points. The nodes of the net are partitioned into 1024 rectangular groups of dimension N x M. This is shown in Figure 2. In order to apply the iteration formula with a consistent ordering, it is convenient to restrict N and M to be even integers.

Each group of net points must now be represented in the corresponding processing element memory. As shown in Figure 3, the net points in an N x M group are ordered and a list is constructed in each processing element memory to contain the net function evaluation at each net point. Since the application of the iterative formula requires the four nearest neighbors of each net point, a net point ordering will be selected which allows the neighbor net functions to be easily obtained.

Region composed of squares and half squares
Figure 1

Partition mesh points into 1024 equal groups.
Figure 2

Order mesh points for each group in a corresponding PE memory (PE memory: list of mesh point values).
Figure 3

The distinction among boundary, interior, and exterior net points is facilitated by the multi-modal operation of the processing elements. Associated with each net point in each processing element will be mode bits identifying that point. These mode bits will then be used to set the mode state of the processing elements before applying the iteration formula at a net point. Then by only operating on processing elements which are in the mode assigned to interior net points, both boundary and exterior points remain unchanged.

Storage Requirements: Each SOLOMON processing element contains 4096 bits of core memory. This memory is addressable by bit number and the word size may be set and changed by the program. This variable word length structure of the processing element permits net point values to be listed without any wasted bits. All statements about storage capacity for data words will then depend directly on the accuracy needed for the data. A word in SOLOMON memory will be necessary for each of the $N_0 M_0$ net points. In addition, 2 mode bits are used to identify the points as boundary, interior, or exterior. Therefore, if p is the number of bits in a net point word, the total storage capacity, c, of SOLOMON core memory is:

$$c = \left[ \frac{4096}{p + 2} \right] \times 1024.$$

The SOLOMON program for the iteration formula requires time $T_r$, given by

$$T_r = (14.5p + 3) \times 1.2\ \mu s$$

for one iteration on p-bit words. Since one iteration revises the net point value at 1024 nodes simultaneously, the time, T, required to process a single net point is

$$T = \frac{T_r}{1024} = \frac{14.5p + 3}{853}\ \mu s.$$

The time required to test the results of an iteration for a solution within tolerance is very small and may be neglected.
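The storage and timing expressions above can be evaluated for the word lengths considered in the paper. This is a minimal sketch; reading the bracket in the capacity formula as "integer part" is an assumption, and the function names are illustrative.

```python
def storage_capacity(p, bits_per_pe=4096, mode_bits=2, num_pe=1024):
    """Net points that fit in core memory when each point needs p data bits plus mode bits.

    Assumption: the bracket in the capacity formula denotes the integer part.
    """
    return (bits_per_pe // (p + mode_bits)) * num_pe


def solomon_time_per_point_us(p, num_pe=1024):
    """Time per net point in microseconds: T = T_r / 1024 with T_r = (14.5 p + 3) * 1.2."""
    t_r = (14.5 * p + 3) * 1.2
    return t_r / num_pe


for p in (18, 24, 30, 36):
    print(p, storage_capacity(p), round(solomon_time_per_point_us(p), 3))
```

For p = 30 the capacity is exactly 128 x 1024 points, since 4096 / 32 leaves no remainder; the other word lengths leave a few unused bits in each processing element.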
For various p, the storage capacity c is:

p:   18         24         30         36
c:   ~200,000   ~160,000   ~130,000   ~100,000

These figures can be extended through the auxiliary storage system of SOLOMON.

Running-Time Estimates: In order to measure the effectiveness of the SOLOMON parallel organization on the Dirichlet problem, an estimate of the time required to process each net point on SOLOMON will be compared to a similar estimate for a conventional computer.* A program has been written for SOLOMON which applies the Young-Frankel iteration formula.

*The computer chosen for comparison is of conventional contemporary design with a multiplication-division time that is one to seven times as long as its 4 μs addition time.

The time required to process one mesh point on a conventional computer is estimated at $T_c = 70\ \mu s$. This estimate is based on the arithmetic instructions necessary to evaluate the formula, which include

6 Additions
1 Multiply
1 Shift
2 Load and Store

plus an assumed 3 index-instruction times to distinguish each point as interior or boundary. These two estimates may now be compared to obtain an approximate SOLOMON time-advantage ratio, (T:Tc):

p      18       24       30       36
Tc     70 μs    70 μs    70 μs    70 μs
T      .31 μs   .41 μs   .52 μs   .60 μs
T:Tc   225:1    167:1    133:1    117:1

These figures are conservative since they include neither the time to test for a solution nor the time required for a conventional computer to buffer the large number of net points. In addition, it appears to be very convenient and quick for SOLOMON to solve a reduced problem with approximated boundary values in order to provide an initial guess for the full problem. In this manner, it is hoped to further increase the SOLOMON speed advantages given above.

Satellite Tracking

The computing and data processing problem for satellite surveillance is receiving increasing attention as the tempo of the space program increases.
The increase in satellite densities over the next few years will increase the computing and data processing problem by several orders of magnitude. Since the presently existing problem of simultaneously tracking relatively few objects requires, in general, high-speed, large capacity computing systems, it is evident that the future problem of satellite surveillance will require highly complex and sophisticated computing systems having capabilities far greater than those of contemporary systems. The SOLOMON computer with its parallel computational capabilities, large capacity memory system and novel system organization features is capable of meeting the advanced computing requirements for satellite surveillance with respect to both speed and memory requirements. To illustrate applicability of SOLOMON to the problem, the major functions that must be performed in a satellite surveillance system are discussed in some detail. Certain aspects of the problem will be omitted in order to keep the text of this paper unclassified. The satellite surveillance problem is, in general terms, that of receiving and performing the required processing of input data from a radar system maintaining continuous surveillance over a specified area of coverage. The raw radar data needs to be converted to digital form and fed into the computing system. The computing system must then establish and maintain track files on all satellites passing through its coverage sector. The establishment of track files on each target within the coverage involves the elimination of false alarms, resolution of ambiguity, etc., from the input data. Once firm tracks have been established, the computer must ascertain the status of each detected satellite; that is, whether the satellite is a known satellite following a predicted orbit, a known satellite not within its predicted course, or a satellite not included in the computer's available identification data. 
The status of each satellite will determine the subsequent processing of the relevant data. If the satellite is known and following a predicted course, it will be tagged and no subsequent processing will be required. If the satellite is known but not within its prescribed course, it will probably be necessary to perform the orbital correction calculations required to update the orbital elements maintained on each known satellite. If a particular satellite is unknown, orbital calculations must be performed, resulting in a set of orbital elements defining the orbit of the newly detected satellite.

The SOLOMON computer, although capable of performing the bulk of the computing required in the system, will probably not solve the entire computing problem. One might visualize the entire computing complex as consisting of a conventionally organized tactical computer and a radar control unit in addition to SOLOMON. The tactical computer would be capable of performing the refined processing on a relatively small number of satellites that require additional processing, a task that would be an inefficient use of SOLOMON. The radar control computer would probably be a special purpose machine capable of performing the specific control function required by the radar system (i.e., frequency control, antenna beam steering control, etc.).

Data correlation is a function having a substantial bearing on computing system requirements. Correlation will be required in two phases of the problem. The first of these is scan-to-scan correlation, i.e., correlation of radar input data on successive radar scans. The second is the problem of matching radar returns on observed satellites with known or predicted satellite orbital positions. The problem of scan-to-scan correlation becomes most critical when the false alarm rate of the input data is high, because then the computing system must process a much larger number of returns than the number of actual targets.
Due to stringent requirements on the radar system, e.g., long range tracking, high accuracy, low signal-to-noise ratio, etc., the false alarm rate will probably be high. It should be pointed out that a tradeoff exists between the computing capabilities required and the overall radar performance. With a computing system possessing the inherent capabilities of SOLOMON the radar system performance will be vastly increased. This increased system performance will result from SOLOMON's ability to accept and process a much higher density of returns (including false alarms and ambiguous returns as well as legitimate returns) than could be realized otherwise.

In addition to eliminating false alarms the computing system must discard returns from non-orbiting objects and recognize multiple returns from a single object. The most reliable techniques for eliminating false alarms and ambiguous returns are based on persistence of the returns from scan to scan. This necessarily means that the computing system must store all returns for several radar scans (on the order of 5) before any unique returns can be eliminated as false alarms. Such a technique demands large memory capacity such as that provided by the SOLOMON processing element memories.

SOLOMON is particularly well adapted to perform the scan-to-scan correlation functions, as illustrated by the following discussion. Figure 4 illustrates the scan-to-scan correlation of 4 parameters: range ($R$), range rate ($\dot{R}$), azimuth ($\beta$) and elevation ($\alpha$). The parameters of new returns are compared with those from previous scans. The number of comparisons that can be made simultaneously by SOLOMON is equal to the total number of processing elements in the network, assuming of course that each unique return is routed to a specified processing element in the desired manner.
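The four-parameter comparison described above can be sketched as follows. This is an illustration only: the paper gives no matching tolerances, so the `TOLERANCE` values, the `RadarReturn` type, and the function names are all hypothetical, and the Python loop stands in for comparisons that SOLOMON would make in every processing element at once.

```python
from dataclasses import dataclass


@dataclass
class RadarReturn:
    rng: float        # range
    rng_rate: float   # range rate
    azimuth: float
    elevation: float


# Hypothetical matching tolerances; the paper does not state any.
TOLERANCE = RadarReturn(rng=1.0, rng_rate=0.1, azimuth=0.5, elevation=0.5)


def correlates(new, established, tol=TOLERANCE):
    """True when all four parameters of the new return agree with the established data."""
    return (
        abs(new.rng - established.rng) <= tol.rng
        and abs(new.rng_rate - established.rng_rate) <= tol.rng_rate
        and abs(new.azimuth - established.azimuth) <= tol.azimuth
        and abs(new.elevation - established.elevation) <= tol.elevation
    )


def correlate_scan(established, new_returns):
    """One correlation step: element k holds one established return and one new return.

    The result marks which elements found a match; unmatched elements are the ones
    that would be placed in a separate mode for further comparison.
    """
    return [correlates(n, e) for n, e in zip(new_returns, established)]


established = [RadarReturn(100.0, 2.0, 45.0, 10.0), RadarReturn(250.0, -1.0, 90.0, 30.0)]
new_returns = [RadarReturn(100.4, 2.05, 45.2, 10.1), RadarReturn(400.0, 3.0, 12.0, 60.0)]
print(correlate_scan(established, new_returns))
```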
Once the program has cycled through the correlation subroutine there will undoubtedly be returns in various processing elements that did not match the returns with which they were being compared. In these cases, by the use of multi-modal control, the returns that did not match are set to a specified mode by the programmer. The subroutine can now be repeated, acting only on the processing elements in this mode and comparing the unmatched returns with other returns either within the same processing elements or in adjacent processing elements. This process utilizes the interconnection between processing elements. The number of comparisons that must be made (i.e., the number of iterations of the subroutine required to perform the scan-to-scan correlation function) to assure that a return has been compared with all possible matches is a function of target density, false alarm rates, target geometry, precorrelation techniques, and the actual SOLOMON configuration.

Correlation on Four Parameters (figure labels: established data from previous scans; new data from present scan; set to specified mode)
Figure 4

When a return has been firmly established as a return from a satellite then that return must be compared with the predicted satellite orbit data to determine if this is an unknown satellite, a known satellite following a predicted course, or a known satellite not within its predicted course. The autocorrelation subroutine required for this function is almost identical to that required for scan-to-scan correlation. Other systems that have been proposed for the performance of satellite surveillance have relied upon a separate catalog memory containing the orbital elements for each known satellite. Because of the irregular order in which satellites might appear within the radar coverage at any given time, it is probably not feasible to order the catalog of tracks in the machine.
In SOLOMON the orbital elements in the catalog memory would be distributed throughout the processing element memories as a function of total satellite density. That is, if the total satellite count is five times the total number of SOLOMON processing elements, each processing element memory would contain 5 satellite tracks. The correlation is therefore simplified and solvable within SOLOMON in a manner almost identical to that of the scan-to-scan correlation.

To establish the magnitude of the advantages of SOLOMON over conventional computers in the performance of such correlations, the problem is outlined in detail as follows: a computer has in its memory m total track files. Each track file, consisting of several words, defines a particular target. At any given time the computer can receive n new inputs from the radar systems. The problem is (1) to associate each new input with an established track file, (2) to establish new track files where required, or (3) to eliminate spurious or ambiguous returns.

The approach to this problem to date has been quite straightforward. Since existing computers are sequential and only one operation can be performed at a time, each new return is sequentially called from memory and a search of the established tracks is made. The total number of discrete steps required to make the search in a sequential computer on n new returns through m established track files to determine if any n has a match among m is

$$m + (m - 1) + (m - 2) + \cdots + (m - n) = \frac{(n + 1)(2m - n)}{2},$$

where m > n. In SOLOMON the total number of discrete steps required to perform the same function, as previously pointed out, is simply m, since in any given step the computer would be making n comparisons and, at most, comparison of each n with all possible m's would take place in m steps (see Figure 4).
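The step counts above can be compared directly. This short sketch evaluates the sequential sum term by term, checks it against the closed form, and sets it beside the m steps quoted for SOLOMON; the values of m and n are arbitrary examples, not figures from the paper.

```python
def sequential_steps(m, n):
    """Term-by-term sum m + (m - 1) + ... + (m - n) for a sequential search."""
    return sum(m - k for k in range(n + 1))


def sequential_steps_closed_form(m, n):
    """Closed form (n + 1)(2m - n) / 2; the product is always even, so // is exact."""
    return (n + 1) * (2 * m - n) // 2


def solomon_steps(m, n):
    """Steps quoted for SOLOMON: n comparisons per step, at most m steps."""
    return m


m, n = 1000, 100  # arbitrary example sizes with m > n
print(sequential_steps(m, n), sequential_steps_closed_form(m, n), solomon_steps(m, n))
```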
In the satellite surveillance problem, where both m and n will be exceptionally large, the advantages of SOLOMON over conventionally organized computers are obvious.

The problem of coordinate conversion will pose additional requirements on the computing systems employed for this problem. Since the track files as established from the raw radar input will be in radar coordinates and the tracks in the catalog will probably be in the form of orbital elements, some form of conversion must be done prior to the correlation of the detected satellites with the known satellites. These computations, although not especially complex, are quite time-consuming. While in a conventionally organized computer each track would be converted in a sequential manner, in the SOLOMON computer the execution of one conversion subroutine would perform the conversions for all satellites (assuming that the total number of conversions required is not greater than the total number of processing elements).

In order to compute new satellite orbits and update existing data on established satellites these satellites must be tracked and sufficient data on the actual orbits must be gathered. Of the total number of satellites detected in the radar search mode, only a small number will require detailed orbital calculations. In a typical satellite tracking system this function would not be performed by SOLOMON, but by a high speed conventionally organized computer.

Other Problem Areas

In the course of studies such as those reported here it has become apparent that the computational techniques most suitable for SOLOMON are not necessarily those which have been popular for conventional high-speed machines. To make proper use of the parallel design of SOLOMON it sometimes seems necessary to employ methods of computation that would be quite cumbersome for a conventional machine. A case in point can be found in the problem of solving a system of linear equations.
To be specific, consider the problem of solving a system of 15 equations in 15 unknowns. Because of the unique construction of SOLOMON, it is possible to solve up to 64 such systems simultaneously for the same cost in time and effort. The activities of the computer will be the same whether one or sixty-four systems are being solved. Consequently, SOLOMON will perform better when many systems of linear equations can be solved at the same time. We wish to stress that the proper computational scheme must be employed for the most efficient use of SOLOMON's capabilities since our mission is not to establish SOLOMON as the ultimate in computers, but rather to explore those areas in which it is superior.

The problem areas in which SOLOMON has been found to be especially capable are constantly expanding under the impetus of the present investigations. Work is now in process on several other sample problems to demonstrate the general applicability of SOLOMON. We are studying multidimensional functional optimization, where an entirely new methodology for finding absolute maxima may result. We are studying communication and transportation problems, a sorting problem, a problem in handling a Boolean form, multiple integration, and a sound detection problem. Additional systems studies being considered include photo-reconnaissance, numerical weather forecasting, cryptoanalysis, nuclear reactor calculations, and air traffic control.

REFERENCES

2. Forsythe, G. E., and Wasow, W. R. "Finite-Difference Methods for Partial Differential Equations," Wiley, N.Y., 1960.
3. Forsythe, G. E. "Difference Methods on a Digital Computer for Laplacian Boundary Value Problems." Transactions of the Symposium on Partial Differential Equations. N. Aronszajn, A. Douglis, C. B. Morrey, Jr. (eds.), Interscience, N.Y., 1955.
1. Slotnick, D. L., Borck, W. C., McReynolds, R. C.
"The SOLOMON Computer," Proceedings of the Fall 1962 Joint Computer Conference. 4. Sheldon, J. W. "Iterative Methods for the Solution of Elliptic Partial Differential Equations," Mathematical Methods for Digital Computers, Wiley, N.Y., 1960. DATA PROCESSING FOR COMMUNICATION NETWORK MONITORING AND CONTROL D. I. Caplan Surface Communications Division Radio Corporation of America Camden 2, New Jersey The long-haul communications network is the backbone of military communications. It provides the coordination necessary for global military operations and logistic support. For maximum network effectiveness, a central monitoring and control function is necessary. System studies described in this paper have shown that automatic data processing is applicable to network monitoring and control, and provides rapid and efficient network reaction to natural and man-made disturbances. The basic elements of the long-haul communication network are the switching centers, the trunks connecting them, and the subscribers. Three types of service are provided to the user of the long-haul network. These are: 1. Direct - A direct connection is made on demand between two subscribers, and broken down when the call is completed. 2. Store and Forward - A message is transferred through the network. It is stored at each switching center, and then passed along to the next center until it reaches its destination. 3. Allocated Service - Allocated service is a direct subscriber-to-subscriber connection which remains in effect full time. This "hot-line" service differs from direct service in that the connection is not broken down at the end of a call. The long-haul network must provide service in the face of many operational difficulties. The s e difficulties inc Iud e wide variations in the traffic load and "outages" of equipment due to acts of nature or enemy action. The hot-line service must be restored immediately when any of the hot-line channels are effected by outages. 
To complicate the situation, peak traffic loads will usually occur at the very time outages are caused by enemy action or severe storms.

There are basically four actions which can be taken to alleviate operating difficulties. These are:

1. Alternate Routes: Backed up store-and-forward traffic can be sent via other routes. Direct calls can also be handled over routes other than the preferred route.
2. Spare Channels: Spare channels can be put into service, either to replace down channels or to add transmission capacity.
3. Preemption: Circuits or facilities can be reassigned from low priority users to high priority users.
4. Service Limitations: Maximum message length or call time can be specified, service can be denied to certain classes of subscribers, or other limitations can be placed on the subscribers.

The actions listed above can be taken on a local or global level. Local action will be taken at an individual switching center, while regional or global action will require cooperative performance at a number of switching centers. Obviously the effectiveness of regional or global measures depends on coordination of the switching centers, which must be achieved through a central control facility.

Based on analysis of the problems involved, a system study of the control center functions and possible implementation has been performed. Automatic data processing was found to be applicable both at the switching centers and at the control center. The results of the system study, described in this paper, are applicable to many long-haul systems, and provide an insight into the network control complex of the future.

Network Monitoring and Control Concept

To provide effective network reaction to varying traffic loads and equipment outages, a closed loop network monitoring and control system is necessary.
As shown in Figure 1, the status of the network is monitored at the switching centers and transferred back to the network control center. Network operation is analyzed at the control center and control actions are initiated there. These control actions are carried out through command messages sent to the switching centers.

Figure 1. Network Monitoring and Control Concept. (Figure labels: message backlog, channel or trunk outage, allocated circuit outage; alternate routing, spare channel utilization, preemption, service limitation.)

Like any closed loop system, the monitoring and control system can be made ineffective by delay or by inaccuracy caused by data errors. To reduce these two problems to a minimum, automatic data processing should be used at the switching centers for composition of the status messages and at the control center for network status analysis and display.

Communication between the switching centers and the network control center can be accomplished in several ways. First, allocated channels could be provided between the switching centers and the network control centers. Second, direct or store and forward communications can be initiated either periodically on a preassigned schedule or when required. In order to achieve effective use of communication facilities, a common practice is to use store and forward messages for both monitoring and control information, with direct calls used only under emergency conditions. In the case of status messages, which are usually long and which contain routine information for the most part, a preassigned schedule of reports is used. Generally, an hourly report is frequent enough for satisfactory reaction time. Emergency reports can be entered at any time.

Status Message Composition

The simplest approach to station status monitoring is manual. The technical controller at the station records the message backlogs, channel outages, and other pertinent data.
He then composes a teletype status message which he sends to the network control center.

From the network control center viewpoint, the manually prepared status messages are a special problem. Because of human error, mistakes in format are common. Messages having incorrect formats will be rejected at an automated network control center, and manual intervention and correction will be required. An automated status message composed at the switching center is therefore desirable.

Data ordering is another problem in manually prepared status messages. Each event at the station, such as a channel outage or restoration, has a time of occurrence which must form a part of the status messages. These events are usually recorded by the station personnel in order of occurrence. However, the status message format will normally require grouping by channel or trunk, so the events must be sorted into a prescribed sequence before they are transmitted. This operation is time consuming and subject to error when performed manually.

The Status Message Composer concept, shown in Figure 2, was developed to provide automatic status message generation from manual inputs. A block diagram of the proposed unit is shown in Figure 3.

Figure 2. Status Message Composer, Artist's Concept.

The operator types variable data, such as reason for outage, into the keyboard on the left of the Status Message Composer. The panel on the right of the keyboard indicates to the operator what information is required, and provides overall controls such as unit power. As shown in the block diagram, the operator-entered information is stored in a core memory. The time of data entry is read into the core memory automatically from a real-time clock.
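The data-ordering step described above, in which events logged in order of occurrence must be regrouped by trunk or channel, can be sketched as follows. The event fields, the channel names, and the four-digit time strings are hypothetical; the paper does not specify a status message format.

```python
from collections import defaultdict


def group_events(events, channel_order):
    """Regroup station events for a status message.

    events: (time, channel, status) tuples in order of occurrence, with time
            given as a hypothetical four-digit "HHMM" string.
    channel_order: the prescribed sequence of trunks or channels in the message.
    Returns one (channel, [(time, status), ...]) entry per channel that has events,
    in the prescribed sequence, with each channel's events in time order.
    """
    by_channel = defaultdict(list)
    for time, channel, status in events:
        by_channel[channel].append((time, status))
    return [(ch, sorted(by_channel[ch])) for ch in channel_order if ch in by_channel]


log = [
    ("0915", "CH-7", "OUT"),
    ("0920", "CH-2", "OUT"),
    ("0948", "CH-7", "RESTORED"),
]
print(group_events(log, ["CH-1", "CH-2", "CH-7"]))
```

Keeping one slot per channel, rather than sorting the whole log afterward, mirrors the idea of giving each trunk or channel a fixed place in storage.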
The location of the stored information is predetermined, so that all information about a particular trunk or channel always goes into the same group of words in the memory. Therefore, the status data are always stored in the proper sequence, and will be properly grouped when read out of the memory.

The initial setup of the core memory is performed by reading in a punched paper tape which designates the core locations to be used for each trunk, channel, etc. The paper tape also designates the display and entry module corresponding to each trunk or channel. Changes in station trunks or channels can be accommodated by simply changing the paper tape and relabeling one or more display and entry modules.

A status report can be generated either periodically under clock control or at any time under manual control. The report is generated by a memory read-out which transfers all stored status data to both the paper tape punch and the page printer. The punched paper tape is entered into the communications network for store-and-forward transmission to the network control center. The copy produced by the page printer becomes the station operating log.

The display and entry panel at the top of the Status Message Composer provides the means for manual entry of trunk or channel outages. Each of the small display and entry modules corresponds to a single trunk or channel. The color of the display module indicates the last inserted status, and serves as a station status display.

Figure 3. Status Message Composer, Block Diagram.

Network Control Center Functions

The functions to be performed at the network control center are described in this section. They can be handled manually or automatically. The next section describes an automatic implementation of the network control center. The network control center operation is shown in the information flow diagram, Figure 4.
Status messages from the switching centers are received at the network control center and recorded, either manually or automatically. The information in the incoming messages is checked for errors, using whatever redundancy is available in the messages (i.e., parity bits or format). The status data are then recorded in the control center file.

The status information is summarized for display to the network controller. It is important that the degree of summarization be optimized. Too much detail will swamp the controller; too little will limit his understanding of network status. Also, new information must be flagged in some way to attract the controller's attention and indicate the need for action on his part. The controller will acknowledge recognition of the new information by resetting the flag.

Using the data provided by the display, the controller is able to solve most network problems. On occasion, however, he may need additional detailed data. Any information in the control center files will be on call, as required. To get additional data, the controller will initiate a query, designating the desired data. This operation is shown in Figure 4 as the query-reply loop.

Figure 4. Network Control Center Information Flow Diagram. (Figure labels: incoming status message; message checking, format and/or parity; control center file; summary; display, new data flagged; query, request for additional data; reply, additional data; outgoing control messages to switching centers.)

The Information Processor suggested for this application is an RCA 304 computer with a record file for bulk storage. As shown in Figure 5, the data input section of the control center provides terminations for incoming channels. Two channels are used for locally generated data: queries from the network controller on one channel and manually entered data on the other channel. The other channels carry status messages from network switching centers.
Automated Network Control Center

To provide the network control center functions automatically, the system concept shown in Figure 5 has been developed. A general purpose digital computer (the Information Processor) is proposed, together with special equipment and displays, to provide status message processing, storage, and display.

Figure 5. Data Flow in Automated Network Control Center.

Temporary storage for all incoming channels is provided by paper tape. Data can be handled on all channels simultaneously at 100 words per minute. When a complete message has been recorded on tape in one channel, it is read into the Information Processor at 1000 words per minute. The Information Processor checks the incoming messages for correct format. Messages with format errors are rejected and printed out. Control center personnel correct the format errors and reenter the corrected status messages at the manual entry keyboard.

In addition to checking the incoming status messages, the Information Processor maintains a complete file of network status. It continually provides an updated summary of network status, in suitable code and format, for transfer via the display buffer to the wall map display. New display data are indicated to the controller by flashing lights, which the controller can reset when he desires. Both trunk equipment status and traffic backlog would be provided in a single geographic type display. These two factors are closely related and form the basis for intelligent system control decision. Above the geographic display would be a tabular display of the status of lines allocated to high priority users. The wall display is shown in Figure 6.

An artist's concept of the proposed network control room is shown in Figure 6. The wall map displays the status of all switching centers, traffic backlogs, and trunks in the network.
The tabular display at the top of the map shows the status of allocated channels. The controller's console is simple and functional. It contains a page printer and keyboard for query entry and reply, and a second keyboard for composition of control messages. A small panel is provided for display illumination controls and for indicators showing the status of equipment in the next room.

Figure 6. Network Control Room.

If the controller wants data other than that shown by the display, he will enter a query to the Information Processor. The Processor prepares the reply data and transfers it back to the controller. By entering a query, the controller can get any data recorded in the Processor files. Although one man can operate the console, working space for an observer is provided.

A possible layout of the automated control center is shown in the artist's concept of Figure 7. The equipment room is located next to the network control room. The separating wall has been removed for clarity.

Figure 7. Artist's Concept of Automated Control Center.

At the wall to the right is an RCA 304 Information Processor, with the record file and paper tape inputs located in front. The smaller cabinets contain the paper tape punches and readers for terminating the incoming channels. Two teletype operator positions are provided: one for manual entry and the other for channel coordination with the incoming and outgoing channels. The large racks in the rear house the display buffers and other special equipment.

Data Processing in the Automated Control Center

The Information Processor in the Automated Network Control Center must perform four data processing functions:

1. Incoming Message Check: The processor will check the format of incoming messages and reject those having format errors. The messages in error will be printed out, together with an indication of the detected error.
2.
Station File Maintenance: A file of the current status and recent history of each station (switching center) will be kept in the processor memory. As the station status reports come in, the status information will be posted to the station file.
3. Display Data Output: The processor will automatically provide updated status in a form suitable for the control center display.
4. Process Queries: The processor will accept queries from the keyboard at the controller's console. The data requested will be retrieved from the station file and output to the page printer at the controller's console.

Each of these tasks must be done on a real time basis, to avoid system delays which would reduce effectiveness. The operating speed of the processor complex should be designed to keep up with the peak work load, and to catch up after periods of scheduled maintenance.

The data processing operations in the proposed control center system are based on the use of a Data Record File for storage of station status. The RCA Data Record File, Model 361, stores information on both sides of 128 magnetic coated records, with 18,000 characters stored on a side. The status of each station is stored on one side of a record in the Data Record File. Included in the stored data are the status of every trunk terminating at the station, broken down into traffic backlog, status of the trunk, status of the channels in the trunk, and status of the users having allocated channels in the trunk.

As previously described, the incoming status reports are initially stored on paper tape, and then read into the Processor as complete messages. As soon as the Processor recognizes the station identity referred to in a particular tape, it selects the record containing the status of that station and reads the complete station file into the core memory.
When the station file is in the core memory and the status message has been read in, the Processor performs the updating operation. After the entire station status has been updated, it is rewritten in the record file as a unit.

As the Processor updates the station file, it abstracts the data that must be displayed. These data are translated into the proper code and format for driving the display, and transferred to the display buffer.

Query processing is done on a station basis. Each query is analyzed by the processor to determine the stations involved. The station file records are then scanned and the appropriate information is extracted and converted to a format suitable for print-out.

SUMMARY

A system study of long-haul network control center requirements, functions, and operation has resulted in a network monitoring and control concept which includes data processing at both the switching centers and the control center. Such high speed data processing will substantially increase the effectiveness of the present world wide long-haul system by reducing the reaction time to natural and man made disturbances. It is, therefore, a valuable tool in meeting the increased traffic loads and the vulnerability of communication channels to modern weapons.

ACKNOWLEDGEMENT

Many RCA engineers contributed to the network control center concepts described above. The author wishes particularly to acknowledge the efforts of A. Coleman, S. Kaplan, J. Karroll, M. Masonson, D. O'Rourke, and E. Simshouser.

DESIGN OF ITT 525 "VADE" REAL-TIME PROCESSOR

Dr. D. R. Helman, E. E. Barrett, R. Hayum and F. O. Williams
ITT Federal Laboratories
Nutley, New Jersey

SUMMARY

The ITT 525 VADE (Versatile Automatic Data Exchange) is a medium-scale communications processor capable of handling 128 duplexed teletype lines and 16 high speed data lines.
The processor is of the single-address, parallel binary type, utilizing a two-microsecond-cycle-time core memory and operating at a single-phase clock rate of four megacycles. The fundamental design approach of the machine is to trade the intrinsic speed of high performance hardware for a reduction in total equipment, through time-sharing. The memory is shared between stored program and input/output functions without the use of a complicated "interrupt" feature. Serial data transfers between the memory and communication lines are performed on a "bit-at-a-time" basis requiring a minimum of per-line buffering. The central processor hardware is largely conventional but has been reduced as much as possible without impairing the power of a basic communications processing instruction repertoire, which includes indexing and character mode operations but not, as yet, multiplication or division. Instruction time is six microseconds, and the number of instructions performed per second varies from 63,500 to 81,000 depending on the existing input/output traffic load. Duplexing of the machine is accomplished by a "shadow" system whereby the off-line processor is continuously updated by the on-line processor through one of the normal high speed data links.

INTRODUCTION

At the present time the design of real-time processors is following the multiprogramming or multisequencing philosophy. Multiprogramming is usually defined as the time sharing of a single central processing unit. The processor responds to the real-time channels by activating a stored program which in many cases is unique to that particular channel. The memory must store not only the individual programs, but also the addresses of the programs for the corresponding channels. Furthermore, the processor must provide some type of priority-interrupt system which will respond to the various service requests of the real-time channels.
Real-time processors designed upon the multiprogramming basis usually provide per-line equipment which converts the serial binary stream to a parallel character. After this conversion, the character enters the Input-Output system, where character or word buffering occurs. Finally, the data enters the Main Memory for processing, after the channel service request has been recognized. To implement such a processing system, much special purpose hardware and programming is required. Program, index and supervisory memories may be utilized in conjunction with special purpose priority interrupt hardware.

The ITT 525 VADE (Versatile Automatic Data Exchange) is a real-time processor designed upon a radically new philosophy. The objective of the design is to trade off high internal processing speed against hardware, so that effective use is made of the machine capability. The ITT 525 processor serves as a real-time store and forward message processor, which may serve as many as 16 high speed duplexed data lines and 128 teletype lines operating at 100% line utilization.

The unique features of the ITT 525 include the sharing of one core memory for input, output and processing functions; the serial bit-at-a-time assembly and disassembly of messages in the shared core memory using some simple in-out hardware; the storage of instruction micro-function logic instead of the standard operation decoder and logic; a minimum register central processor utilizing direct data transfers and providing the facilities for indexing and character mode; and a powerful instruction repertoire for the implementation of the operational, utility and diagnostic programs.

System Design

The objective of the ITT 525 design was to produce a versatile message processor, at a minimum cost per line, to perform the function of a local area center handling a reasonable number of data and teletype lines.
It was decided to implement a system capable of interfacing with 128 duplexed teletype lines plus 16 duplexed data lines.

In order to achieve minimum cost per line, the first design philosophy established was to make optimum use of the common equipment. Thus, it was decided to time-share one core memory and the major control circuits between the In-Out unit and the central processor. This was made possible by the high speed core memory, operating at a 2 microsecond read-write speed, and by 4 megacycle logic circuits. If it is assumed that two memory cycles are required to perform a machine instruction, a maximum of 250,000 instructions per second may be executed by the ITT 525.

Since the ITT 525 has such a high internal speed, it was further decided to deviate from convention and accept data from the real-time lines a bit at a time per line into the one core memory with no per-line buffering. The messages are thus taken directly from the serial bit stream into the core memory, where they are completely assembled. The message remains in this storage area while it is being processed and analyzed by the stored program, and finally is disassembled one bit at a time for transmittal on the output line. This line scanning or bit sampling of the input and output lines requires 62% of the total machine time for the 128 teletype and 16 data line configuration. Thus, a total of 95,000 instructions per second are available for the central processing functions.

The system analysis of the ITT 525 determined that to completely process the assembled messages, with an input line utilization of 100%, would require between 35,000 and 45,000 instructions per second. This processing consists of message validity checking, message decoding for destination and priority, message filing, message journalling, message code conversion and finally output queueing. Approximately 1500 instructions per message are needed to perform the complete processing functions.
This processing estimate of 45,000 instructions per second is rather conservative, since the probability of continuous 100% line utilization is very remote. Thus, the average processing load will be much smaller than 45,000 instructions per second. However, the total available central processing capacity of the ITT 525 is 95,000 instructions per second, so that 50,000 instructions per second remain for future expansion or for a further trade-off of time for hardware or flexibility.

To make further use of the extra machine time, it was decided to employ the concept of stored micro-operations or microfunctions. A reserved area of memory contains the micro-operations for each machine instruction. This word is retrieved for each instruction before the execution of the instruction can proceed. This extra memory retrieval per instruction uses the equivalent of 31,500 instructions per second, so that a total of 63,500 instructions per second remain for message processing.

The stored microfunction logic replaces the conventional wired logic operation decoder and some corresponding micro-operation logic. However, the primary advantage of this approach lies not in the reduction of hardware obtained but in increased instruction flexibility and speed of machine check-out. Each microfunction may be tested independently, either by a diagnostic program or from the operator's console. This facility greatly reduces the time required to isolate and repair machine failures.

In summary, the ITT 525 system design has resulted in the development of a stored program processor in which the memory is time-shared between Input/Output wired logic and the program control logic. The processor operates internally on parallel binary words, each consisting of thirty-two bits. The instruction cycle, consisting of 6 microseconds, performs single address instructions with an available rate of at least 63,000 instructions per second.
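The instruction-rate figures quoted in this section can be checked with a few lines of arithmetic. The sketch below is only a rough model: it assumes the 2 microsecond memory cycle and the 62% input/output share stated in the text, and the function and variable names are illustrative, not taken from the paper. Its results agree with the paper's rounded figures (95,000, 63,500 and 31,500) to within a few hundred instructions per second.

```python
MEMORY_CYCLE_US = 2.0   # core memory read-write cycle, from the text
IO_SHARE = 0.62         # worst-case share of machine time used by line scanning


def instructions_per_second(cycles_per_instruction, io_share):
    """Instruction rate left for the stored program after I/O scanning."""
    instruction_time_us = cycles_per_instruction * MEMORY_CYCLE_US
    peak_rate = 1_000_000 / instruction_time_us
    return peak_rate * (1.0 - io_share)


# Two memory cycles per instruction: fetch and operand only.
two_cycle = instructions_per_second(2, IO_SHARE)
# Three memory cycles per instruction: the extra stored-microfunction fetch.
three_cycle = instructions_per_second(3, IO_SHARE)

print(round(two_cycle))                # about 95,000
print(round(three_cycle))              # about 63,300
print(round(two_cycle - three_cycle))  # cost of the stored-logic fetch
```

Under these assumptions the stored-logic fetch costs roughly one third of the program's instruction rate, which is the trade the authors describe making in exchange for flexibility and easier check-out.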
The block diagram of the ITT 525 VADE (Figure 1) illustrates the machine configuration, consisting of a Central Processor and an In-Out unit time-sharing the core memory.

Figure 1. ITT 525 VADE. (Blocks shown: central processor, in/out unit, core memory.)

Figure 2. ITT 525 Processor Words. (Data word: bits 0-7, character 0 (C0); bits 8-15, character 1 (C1); bits 16-23, character 2 (C2); bits 24-31, character 3 (C3). Instruction word fields, in the order listed in the figure: character mode (C), index tag (X), sensible device code (S), operation code (OP), word address (W), and character address (A), against bit positions 0, 1, 2-7, 8-13, 14-29, and 14-31.)

Machine Organization

Central Processor

The central processor of the ITT 525 is a single address, binary, one's complement processor employing stored logic for instruction decoding, one index register and character mode operation. The design is based upon a minimum register configuration, with maximum time sharing and direct transfer between registers.

The instruction cycle of the processor is six microseconds. This cycle is broken down into three memory accesses: one unload-load to fetch the instruction; one unload-load to obtain the stored microfunction control word; and finally one unload-load to obtain the operand and execute the instruction. The processor word consists of thirty-two bits, which may take the form of four eight-bit characters or an instruction word divided into six fields (Figure 2).

The Memory Unit consists of a high speed linear selection core memory operating at a speed of 2 microseconds per complete cycle. The size of the core modules varies from 4096 words of 33 bits to 32,768 words. An extra bit (33) is furnished, which enables parity checking during the unload cycle and parity generation during the load cycle.

The design of the processor registers may be considered conventional for the instruction counter, index register and memory address register. However, several unique features were employed in the use and design of the Memory Buffer, Accumulator and Control Buffer.
The Memory Buffer is a time-shared register which is concerned with normal memory functions plus other functions such as arithmetic unit buffering, character mode gating, In-Out mode buffering and behaving like a pseudo-bus. During arithmetic and logical operations the arithmetic unit may be considered to be the Memory Buffer Register and the Accumulator. The reasoning behind this approach is that the contents of the memory buffer may be utilized as the arithmetic unit "B" register while the data is being loaded into memory. For example, in addition the Accumulator contains the augend and the Memory Buffer contains the addend. These operations may, furthermore, be performed in either the word or the character mode. Special character gating between the accumulator and the memory buffer enables the programmer to perform the operation either on 32 bits or on one of the eight-bit characters.

In the transfer of data from register to register, the Memory Buffer acts as a pseudo-bus, through which all data must pass. This configuration reduces redundant paths and allows one to form any data transfer path desired. This capability is especially useful in developing new instructions consisting of several register transfers.

The accumulator register of the ITT 525 is the heart of the arithmetic unit. This register, in conjunction with the memory buffer, performs a parallel, two-step addition and subtraction. One of the unique features of the Accumulator is the carry chain configuration, which has a maximum delay of 425 nanoseconds. The standard, simple carry chain configuration consists of a single gate per flip-flop stage. The delay encountered for this arrangement is the number of stages times the gate delay, which for the ITT 525 would have been 32 x 35, or 1120 nanoseconds.
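The two-step addition and the simple carry chain delay described above can be illustrated with a small software model. This is a sketch of one's complement addition on a 32-bit word, written as a partial add (exclusive or) followed by carry propagation with end-around carry; it is not a description of the ITT 525 circuitry, where propagation is done by the carry chain rather than by a loop. The names `negate` and `ones_complement_add` are hypothetical.

```python
WORD_BITS = 32
MASK = (1 << WORD_BITS) - 1
GATE_DELAY_NS = 35  # per-stage gate delay quoted in the text


def negate(value):
    """One's complement negation: invert every bit of the word."""
    return ~value & MASK


def ones_complement_add(a, b):
    """Add two 32-bit one's complement words."""
    partial = a ^ b   # step 1: partial add (exclusive or)
    carry = a & b     # positions that generate a carry
    while carry:
        # Step 2: move each carry one place toward the most significant bit;
        # a carry out of the top bit re-enters at the bottom (end-around).
        shifted = ((carry << 1) | (carry >> (WORD_BITS - 1))) & MASK
        partial, carry = partial ^ shifted, partial & shifted
    return partial


# Delay of a simple one-gate-per-stage carry chain, as computed in the text.
ripple_delay_ns = WORD_BITS * GATE_DELAY_NS

print(ripple_delay_ns)                         # 1120
print(ones_complement_add(5, negate(3)))       # 2
print(hex(ones_complement_add(5, negate(5))))  # 0xffffffff, i.e. minus zero
```

The last line shows why the accumulator can be sensed for both plus zero and minus zero: in one's complement arithmetic a number added to its own negation gives a word of all ones.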
In order to take full advantage of the four megacycle clock, it was determined that a carry chain delay of less than 500 nanoseconds would be desirable for the ITT 525. One technique available to speed the carry chain is the pass-carry or grouping-carry idea. In this case, several stages are combined to form one large carry gate, thus reducing the overall carry chain delay. In the ITT 525 Accumulator, however, the carry chain design is based upon the group hierarchy principle. This concept makes optimum use of the recursive nature of the carry equation by first combining flip-flops into groups and then groups into sections. In this way, if a carry has to be passed for 32 bits, it will bypass not only the groups of flip-flops but also the sections of groups.

The functions that the accumulator may perform upon data are as follows:

1. Partial Add (Exclusive Or)
2. Carry
3. Inclusive Or
4. Reset
5. Complement
6. And
7. Cycle Left

The accumulator may be sensed by program for the following conditions:

1. Minus Zero
2. Plus Zero
3. Overflow
4. Any bit of Character 3 (24-31)
5. Plus or Minus Zero

The Control Buffer contains the instruction micro-operations obtained during the Processor's "Stored Logic" memory cycle. Each bit of this register is assigned a specific microfunction, such as "Reset Accumulator," "Transfer Index Register to Memory Buffer," etc. If a particular instruction requires that microfunction, a "One" appears in that bit position. High fan-out drivers distribute these microfunctions to the various register input gates. In addition to the increased flexibility and some cost reduction, the use of the stored logic technique provides a powerful tool for checkout and maintenance.

INPUT/OUTPUT UNIT

With the single exception of a direct input from a paper tape reader on the console, all processor inputs and outputs are handled by the Input/Output Unit, including transfers between core memory and secondary storage devices.
The initial implementation of the ITT 525 system has the following traffic-handling capability:

1. 16 duplexed high speed data lines operating at any speed up to 2400 bits per second (8-bit code).
2. 128 duplexed teletype lines operating at speeds of 60, 75 or 100 words per minute (5-bit code).
3. Block transfers of computer words to one of eight magnetic tape units operating at a transfer rate of 2500 computer words per second.

This is a maximum capability configuration with regard to teletype and data lines. Smaller machine capabilities are implemented in any combination of modular blocks of 4 data or 16 teletype lines. Also, individual line speeds are completely independent and may be changed without incurring hardware changes.

Although the stated capabilities conform only to the task of communications processing, the unique features of the Input/Output Unit are applicable to other tasks and configuration requirements with a moderate amount of hardware change. The "bit-at-a-time" technique is easily adapted to various forms of serial bit streams, regardless of framing or synchronization details, and the method used for tape word transfers is directly applicable to any block transfer process, even if the "blocks" are degenerate ones of only a few words of characters.

Teletype and High Speed Data Lines

Incoming serial bits on these lines are transferred directly into core memory. Outgoing serial bits are transferred directly from the memory to one or two per-line output flip-flops. Each output line requires one flip-flop for pulse-stretching, and data output lines require an additional flip-flop to reduce bit jitter. The total line storage required is 160 flip-flops, which compares favorably with the 1536 flip-flops required if each line (input and output) were to terminate in a one-character buffer.
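The two flip-flop totals quoted above follow directly from the line counts and code lengths given in this section. The short calculation below reproduces them; the constant and function names are illustrative only, and the breakdown is an inference from the stated figures rather than something spelled out in the paper.

```python
TELETYPE_LINES = 128      # duplexed teletype lines
DATA_LINES = 16           # duplexed high speed data lines
TELETYPE_CODE_BITS = 5    # 5-bit teletype code
DATA_CODE_BITS = 8        # 8-bit data code


def bit_at_a_time_flip_flops():
    """One pulse-stretching flip-flop per output line, plus one extra
    anti-jitter flip-flop on each data output line."""
    output_lines = TELETYPE_LINES + DATA_LINES
    return output_lines + DATA_LINES


def character_buffer_flip_flops():
    """A one-character buffer on every input line and every output line."""
    bits_per_direction = (TELETYPE_LINES * TELETYPE_CODE_BITS
                          + DATA_LINES * DATA_CODE_BITS)
    return 2 * bits_per_direction


print(bit_at_a_time_flip_flops())     # 160
print(character_buffer_flip_flops())  # 1536
```

On these figures the bit-at-a-time scheme needs roughly one tenth of the per-line storage of a conventional character-buffered design.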
Several fixed core memory locations are permanently assigned to each input and each output line for use by the input/output logic. These per-line locations contain space for character assembly/disassembly, program flag bits, control bits and timing information. The stored program exercises control over the Input/Output Unit by performing regular scans of these words and changing their contents when necessary, thus modifying the operations of the wired logic of the Input/Output Unit.

Specifically, in addition to noting the end of incoming messages or initiating output for outgoing messages, the program must make "bin" assignments to active lines. It is not feasible to reserve for each line a space in memory adequate for the largest possible message. Instead, "bins" of 75 words, or 300 characters, are assigned to active lines as they are required. The Input/Output Unit logic notifies the program of such needs by flag bits, and can store temporarily, in the fixed memory locations, as many as twelve incoming characters during the interim between bin assignments. In normal input operation, after a bin assignment is received, characters are transferred to memory soon after completion, independent of the stored program.

With reference to the block diagram of Figure 1, the basic operation of the I/O Unit is rather simple. A "Scan Generator" controls line selection and memory addressing (for control words) according to a fixed cycle of operation. Then, for each line scan, the most important control word for the line, the "status word," is unloaded to the "Status Word Buffer," where it remains for one or two more memory cycles to control operations on the line information through use of the Memory Buffer for examination and modification of other words. Finally, two counters are used for the timing purposes indicated below.

The Input/Output Unit obtains control of the memory and performs a "scan cycle" every 280 microseconds.
This interval is compatible, in two different ways, with the bit periods of the lines. A 2400 bit-per-second data line has a bit length of 417 microseconds, and a 100 word-per-minute teletype line has a bit length of 13.46 milliseconds. By scanning all data input and output lines each scan cycle, but only one-fourth of the teletype input lines and one-sixteenth of the teletype output lines, the following rates are obtained:

1. Data lines are scanned at least 1.49 times per bit.
2. Teletype input lines are scanned 12, 16, or 20 times per bit for 100, 75 and 60 word-per-minute lines, respectively.
3. Teletype output lines are scanned 3, 4, or 5 times per bit for 100, 75 and 60 word-per-minute lines, respectively.

These rates permit the sampling of teletype input lines within ±8% of the nominal bit center, or better, to minimize the effects of distortion. Actual sampling is accomplished on the basis of predicted bit-center sampling times, which are established when the "stop" pulse to "start" pulse transition is detected and are stored in the fixed memory space for the line for later coincidence comparison with a real-time counter.

Data input lines are sampled on the basis of timing provided by their associated synchronizing signals. The synchronizing signal used has a frequency of one-half the bit rate of the line. The value of this signal (zero or one) is stored on each scan, and the line value is not sampled unless the stored and present values of the synchronizing signal differ.

Output lines are handled in exactly the same manner as input lines, except that teletype output lines do not need the high scan rate provided for input teletype lines, since the output process itself controls the waveform distortion.

During the I/O scan operations, one memory cycle is required to scan each line, and two additional memory cycles are required for each character transfer between the fixed memory locations for the line and the message bin elsewhere in memory.
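The scan rates listed above for data lines and for 100 word-per-minute teletype lines can be reproduced from the 280 microsecond scan cycle and the two bit lengths given in the text. The sketch below assumes that a line visited in only a fraction of the scan cycles is revisited at a correspondingly longer interval; the function name is hypothetical. The 75 and 60 word-per-minute cases are omitted because the text does not state their bit lengths.

```python
SCAN_CYCLE_US = 280.0          # interval between I/O scan cycles
DATA_BIT_US = 417.0            # 2400 bit-per-second data line
TELETYPE_BIT_US = 13_460.0     # 100 word-per-minute teletype line


def scans_per_bit(bit_length_us, fraction_of_lines_per_scan):
    """Number of times one line is visited during a single bit period."""
    revisit_interval_us = SCAN_CYCLE_US / fraction_of_lines_per_scan
    return bit_length_us / revisit_interval_us


data_rate = scans_per_bit(DATA_BIT_US, 1.0)              # every scan cycle
teletype_in = scans_per_bit(TELETYPE_BIT_US, 1 / 4)      # one-fourth per cycle
teletype_out = scans_per_bit(TELETYPE_BIT_US, 1 / 16)    # one-sixteenth per cycle

print(round(data_rate, 2))   # 1.49
print(round(teletype_in))    # 12
print(round(teletype_out))   # 3
```

Because a data line is visited more than once per bit, every change of its half-rate synchronizing signal is seen on at least one scan, which is what the sampling rule for data lines relies on.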
By limiting the number of character transfers allowed in each scan, minimum and maximum I/O scan time requirements of 144 and 173 microseconds, respectively, are obtained. Since the interval between scans is 280 microseconds, 52% to 62% of total processor time is spent in input/output operations and 38% to 48% remains for stored program use (63,500 to 81,000 instructions per second).

Magnetic Tape Block Transfers

Block transfers of computer words between the memory and the Magnetic Tape Module (MTM) of the 525 are initiated by the stored program and executed in detail by the Input/Output Unit. A single fixed location in memory holds a block address and a count of the number of words to be transferred. During a block transfer, the MTM sends requests for word transfers to the I/O Unit at intervals of approximately 400 microseconds, and each request must result in a word transfer within 67 microseconds. It is possible, then, that a word transfer must be made when the I/O Unit is not in control of the memory. In this case, the I/O Unit gains control of the memory only for the transfer time and then relinquishes control until the next regular line scan cycle time.

While it is in control of the memory, the I/O Unit uses the fixed location MTM word to control the transfer of a word between the specified address and the MTM buffer. Then the address is incremented, the block transfer count is decremented, and the MTM word is transferred back to its fixed location. Much of the logic performing these operations within the I/O Unit is common to the logic required for teletype and data line operations, since character transfers and bin counts for these lines are handled on the same general basis. The I/O Unit is easily expanded to include similar block transfer provisions for magnetic drums, card readers, punches, printers and displays.
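The block transfer bookkeeping described above (a control word holding an address and a count, updated once per word) can be sketched as follows. This is a simplified model of the memory-to-tape direction only, under the assumption that the control word behaves as a plain address/count pair; `BlockControlWord` and `transfer_word` are hypothetical names, and the timing constraints (requests about every 400 microseconds, service within 67 microseconds) are not modeled.

```python
from dataclasses import dataclass


@dataclass
class BlockControlWord:
    """Model of the fixed-location MTM control word."""
    address: int  # memory address of the next word to transfer
    count: int    # number of words still to be transferred


def transfer_word(memory, control):
    """Service one word-transfer request from the tape module.

    Returns the word moved to the tape buffer, or None once the block
    is complete. The control word is updated in place, mirroring the
    increment-address, decrement-count step in the text.
    """
    if control.count == 0:
        return None
    word = memory[control.address]
    control.address += 1
    control.count -= 1
    return word


memory = [10, 11, 12, 13, 14, 15]
control = BlockControlWord(address=2, count=3)

written = []
while (word := transfer_word(memory, control)) is not None:
    written.append(word)

print(written)                          # [12, 13, 14]
print(control.address, control.count)   # 5 0
```

Keeping the whole transfer state in one memory word is what lets the I/O Unit pick up a transfer at any request and give the memory back immediately afterwards.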
Except for the implementation of the block transfer process, there is nothing unusual about the operation of the magnetic tapes. One tape at a time may be selected to read, write, backspace one record, advance one record, write end-of-file or rewind, the rewind operation being performed in a quasi-off-line state so that other units may be selected during this operation.

Input/Output Program Requirements

Since the ITT 525 has no interrupt feature, the interface between the stored program and the input/output system is an unusual one. Regular scanning of the fixed-location input/output control words is essential to the bin assignment task of the program. Input teletype words must be scanned at least once every 800 milliseconds and input data words at least once every 40 milliseconds. These figures represent the amount of time required for incoming information to fill the twelve-character per-line temporary storage space. Output words are scanned (for bin assignment needs) at whatever speed the programmer desires, since no information can be lost and the only consideration is efficiency in transmitting messages which are more than one bin in length.

A more complicated problem arises when the program must exert control over I/O Unit operations by modifying the contents of fixed location control words. An input/output line scan cycle may interrupt the program and change the contents of a control word at the same time that the program is preparing to modify the word. Since the program has no natural means of knowing that an interruption has occurred, it would tend to force obsolete data into the control word. To circumvent this problem, a flip-flop is provided which can be sensed by the program and which, if on, guarantees the program that the six instruction times immediately following the sense instruction will be free of I/O Unit interruption, this number of instructions being sufficient to perform the control word modification.
Duplexing

To meet the reliability requirements of many real-time problems, the ITT 525 may be utilized in a duplexed configuration. The duplexing design of the ITT 525 has been selected on the basis of maximum reliability and minimum special purpose duplexing hardware.

The Duplexed System configuration is illustrated in Figure 3. System A is in control of the magnetic tape and communication links; all of its input and output lines are accepting and sending data. The standby machine B also has the input lines connected to it and accepts all input messages. Furthermore, machine B assembles and processes each message and sets up the output queue. Machine A regularly sends data to machine B, via a normal high speed data line, concerning the disposition of messages. Once machine A has outputted a message, machine B erases the message and updates its own output queue. Machine B does not file, journal, overflow, or output any messages.

Figure 3. Duplex Configuration.

Using one of the high speed data links for the regular communication between machines A and B insures a smooth cutover with no loss of data. The worst condition that might occur is that, after cutover, machine B might output a given message again because it had not received the last disposition data. Machine A, if in control, may by program relinquish control to Machine B, and vice versa. Also, manual means are available to establish the duplex configuration via the Operator's Console. Output lines and the magnetic tapes are automatically switched into the proper machine for each configuration.

CONCLUSION

The ITT 525 VADE is intended to do a medium-scale job using only a small-scale amount of hardware. Although testing and program debugging will not be complete for another two or three months, there is little doubt that the system will satisfy this aim.
Further extensions of the VADE approach have been planned which will improve the speed, line-handling capacity and versatility of the machine by modular additions of hardware at selectively increased cost.

ON THE REDUCTION OF TURNAROUND TIME

H. S. Bright and B. F. Cheydleur
Computer Division
Philco Corporation, a Subsidiary of Ford Motor Company
Willow Grove, Pennsylvania

SUMMARY

Resources:
(a) Flagging, in the procedure oriented language, of the permissible break-in points on large programs.
(b) Sequential, rather than concurrent, operation of programs, by means of fast exchange of core contents with disc.
(c) Use of main core as input/output buffer for short communications with multiple remote stations.

Approach. In principle, the operator system is to be increased in capability for minimizing delays, through application of currently available hardware, together with planned interruption of long runs by short ones. For small jobs, break in on large jobs at selected interrupt points. For large jobs, stack input/output on disc. Sequence primarily by estimated run time, with some consideration of priority, and with less attention given to arrival chronology.

The paper describes a typical large relaxation calculation, giving operation parameters as executed on a modern computer, and showing for this rather formidable example a break-in-point interval on the order of several seconds. In jobs of such size, total tape and disc traffic can be comparable in volume to internal data flow. In contrast, the concurrent buffering of set-up information for many problems constitutes a relatively minor contribution to total data flow. Thus the initial input and final output for many jobs may

Objective: To Reduce Delays. It is the intent of this work to permit small computing jobs to run with typical delays of minutes rather than hours, while no jobs, including the largest ones, become appreciably worse in turnaround time than at present.

Background.
The basic idea of multiplebreak-in operation from many input/output stations is not new. Most authors, however, have proposed either dramatic advances in hardware or software, or computation complexes of conventional hardware so large as to be economically unattractive. McCarthy, for example, proposed serving some dozens of stations for simultaneous on-line debugging of as many programs, by using perhaps a million words of slow magnetic core memory with a very fast computer. Sources of Delay. "Legitimate" delays, for jobs to be run as~ntities in the sequence in which received, consIst primarily of queue development during periods in which the average workload acquired exceeds the computation rate capacity of the facility. "Illegitimate" causes of delay result mainly from manual job stacking. Artificial delays are inserted at several places in a typical facility, including the sign-in-desk, the cardto-tape facility, the on-line input tape stack, the on-line tape units (gross operational delays from the mounting of file tapes while the computer system idles), and the printer tape stack. 161 162 / On the Reduction of Turnaround Time proceed concurrently, although calculations will in all cases be executed sequentially. Conclusion. The paper marshals arguments supporting the practicality of greatly reducing turnaround delays without using huge memory or very costly types of communication facilities. The productivity of all direct users of large- scale general-purpose digital computing centers, and to a lesser extent the productivity of the entire organizations they serve, are significantly affected by the typical time delay between request for and delivery of computer service, which we shall call "turnaround time." For this reason, reduction of turnaround time has in recent years become recognized as having major economic importance. As machines have become faster, individual problem setup time has assumed larger significance in computing center lOgistics. 
Some of the delays for clerical work at setup time have been taken over by operator programs. Much of the effort on delay reduction has been applied to attempts to increase the effective throughput capacity of the computing systems themselves, either by concurrent operations or through increase in sheer speed. One approach that has been widely used, as a matter of absolute necessity, for the handling of real-time problems within general-purpose facilities, has had surprisingly little attention in "unreal-time" applications.

The intent of this paper is to direct attention to the technique of short-run break-in by programmed interrupt, and to show how modern hardware, without the costly special facilities often required for prompt interrupt, can make this method attractive for general-purpose applications. We believe that the method, which uses only off-the-shelf hardware and software, can permit many short jobs to run with typical delays measured in minutes rather than in hours, while no jobs (including the longest ones) become drastically worse in turnaround time than in conventional first-in, first-out operation.

Throughput Increase

Before proceeding with our discussion, it will be useful to review some of the steps that have been taken to increase the effective capacity of general-purpose computing facilities:

1. Concurrent Schemes (cohabiting programs)
   1.1 Micro-segmentation by commutating hardware
   1.2 Decentralization by input/output autonomy
   1.3 Macro-segmentation (program segment merging by hardware interrupt)
       1.31 Merged input/output, sequential execute (several concurrent I/O streams permitted, but only one computation at a time has control)
       1.32 Merged input/output and execute (full-blown "multi-programming")
       1.33 Sequential input/output, merged execute (early real-time operations on unbuffered machines)
2. Sequential Schemes (programs alone in memory)
   2.1 Multi-phase operation (batched input, execute, output ("I, E, O")) [1].*
   2.2 Faster machines
       2.21 Sequential integral programs
       2.22 Short-run break-in by program interrupt
3. Multiple Independent Machines

* Batching of input, execute, and output phases of all jobs on an input tape is discussed in detail in reference [1].

The concurrent schemes suffer from the serious disadvantage that, even in multiple-computer-unit complexes (whether or not all of the available memory space is accessible by all processors), sufficient main-memory space must be available for all of the programs or program segments that are to be operated together, if the operation is to be economically feasible. This often means that either the program multiplexing is limited to jobs that require very little memory space, or that memory sizes are required that are economically unattractive at present.†

† One proposal, [4], called for a single computer system with one million words of magnetic-core memory.

Method 1.1, in present realizations, has the additional disadvantages that both time and memory space segmenting must be relatively simple and inflexible. It increases turnaround time for all processor-limited problems that are run concurrently, since the single central processor must be time-shared and all such jobs must take longer than when run seriatim.

All of the Concurrent Schemes shown above permit efficient use of multiple on-line input/output devices, but the sharing of a single I/O device by several problems is at present feasible only if the "device" is actually a large random-access auxiliary memory element or if Data Select* hardware facilities are available.
This is particularly significant when scheduling multi-tape problems on a large machine; many of those jobs for which concurrent operation would be most attractive require the use of half or more of the total number of tape units available, especially when tape-oriented operator systems are used.

* This feature (Data Select) permits individual data records on magnetic tape to be tagged with control marks so that they can be processed or skipped without detailed examination; it is most commonly used for the writing of multiple reports on a single tape by a single program, so that report selection may be made at the time of off-line printing.

This limitation on time-sharing of a single I/O device is, alas, almost as frustrating for the modest scheme proposed by the present paper as for the most sophisticated time-and-core-space-merging scheme discussed. Its effect is to impose serious limits on the permissible assignments of tape units or other I/O devices, for jobs that are to be run concurrently in any system, and in most cases to prohibit reassignment of a given I/O device until completion of the job to which it was last assigned. The noteworthy exception is the case of tape units used for "scratch" storage of intermediate results; such units may be reassigned as soon as the last Read operation upon a given data string has been completed, although the reassignment problem is a difficult one in the important case when the number of rereads is dependent upon calculation results and must be determined at run time.

In the proposed scheme, the effect of this tape assignment restriction is merely to hold back the start of an interrupting job until adequate I/O facilities can be assigned. Thus, when an interruptable job that has extensive I/O unit requirements is running, only those jobs that can be accommodated on the remaining I/O devices can be permitted to get to the head of the interrupting job stack.

(A comment on semantics is in order here. Many of the early papers on multiprogramming, and a few recent ones, seem to consider concurrency of I/O with computation (Method 1.31 with the restriction that all of the I/O activity relates to the execution of one program) to constitute multiprogramming. We feel that "buffering" is the accepted term to be applied to such concurrency, as long as a single main program (which of course may be controlled by an operator program and from time to time by an arbitrary number of subprograms) has control of the machine. We consider multiprogramming to mean "concurrency of the execution phases of two or more unrelated programs".)

Among the sequential schemes, Method 2.1 was clearly not designed with turnaround time in mind, since it normally increases turnaround time for all jobs in a batch. This scheme, commonly known as "three-phase" operation, was intended to save time by avoiding repeated loading of large input and output routines. It operates by first preparing ("I" Phase) the input from all jobs on an input tape; then ("E" Phase) executing all of these jobs; and finally ("O" Phase) preparing output (printer tape) for all jobs. The usual operating option of assigning special output tape(s) at run time, in order to take care of priority situations or to take advantage of a temporarily short printer job queue, is not available without a complete change of operating procedure back to "single-phase" operation, the normal scheme in which a single job is processed from start to finish. Thus, in true three-phase operation, the output from all jobs on a given input tape is delayed until the last job has been completed.

Method 2.21 is the one that comes under fire in the classical justification for effort to be expended upon development of operator programs.
It is clear that, as times for Input, Execute, and Output become vanishingly small on a well-balanced very-high-power machine, one could conceive of a facility in which most of the time was spent in program setup, startup, and wrapup. In a typical large-scale present-day facility, in fact, brute speed alone cannot accomplish much reduction of turnaround time. The scheme proposed below permits setup information accession to be concurrent for many jobs, while startup and wrapup can occur very quickly under program control.

Because of the remarkable advances that have been made recently in machine power and relative economy, a few words in retrospect will serve to underline the significance of the preceding paragraph. A typical high-power modern machine has 2000% to 5000% more computation capability, and 2500% to 7000% more input/output capability, than the vacuum-tube machines (circa 709 and 1105) that ushered in the concept of the integrated data processing facility using parallel binary arithmetic and buffered magnetic tape. Clearly, such huge increases in capacity, and in computation-per-dollar if the machines can be kept occupied, call for serious reexamination of our operating methods.

Method 3, which is the addition of entire computer systems, has represented sound management practice during the first two generations of large machines (the last of the vacuum tube machines and the first of the transistor machines), at least in those installations where unscheduled delays of more than a few hours might be prohibitively costly; if not having on-site backup hardware can be more expensive than that hardware would be, then the extra hardware is justified irrespective of capacity considerations.
Because third-generation hardware will be much more predictable as to its ready-willing-able condition (i.e., unscheduled downtime will be greatly reduced), and because maintenance experience on second-generation machines has taught design lessons that should dramatically reduce time required for scheduled maintenance, it seems reasonable that hardware unpredictability will in the few years to come offer less justification for parallel facilities. The large user who has had the advantages of more than one machine will, thus, in many cases consider conversion to a single, more-powerful machine in which overall hardware economy (computation per hardware dollar) can be better. There will be, from this class of user, intense interest in means for achieving the excellent traffic-handling behavior of the multiple-machine facility in a larger single-machine facility. Since this user will not be willing (and in many cases will not be able) to submit to the restrictiveness of the concurrent schemes, we feel that only Method 2.2 will meet his needs with any degree of success.

Macro-Segmentation in Practice

In many installations, the basic hardware configuration is determined by the requirements of a single class of "bread-and-butter" problems. With such problems running in the system, there will not be much excess memory space or processor capacity available. The macro-segmentation scheme is a means for scheduling the available excess capacity. If problems could be segmented so precisely that the onset and duration of memory, processor(s), and I/O device availability could always be matched precisely with the demands of other problem segments, then parallel operation could permit complete use of the entire machine. This does not appear to be workable in the real world.
In practice, a useful degree of approximation to that ideal can be achieved if the problems, major and minor, are macro-segmented at compilation time so that the incidence of spare capacities in various subsystems, instead of being pre-computed, may be continually tested by control hardware and assigned at execute time, in vivo. Such a stratagem requires the prescripting to every macro-segment of a precis of the I/O and memory requirements of that segment, a signalling of activity-completion from each I/O device to the executive program, and the continued monitoring of the problem programs at the macro-segment level. This scheme is being implemented for several machines in the U.S. and in England, notably in the English Electric KDF.9, which is described elsewhere in these proceedings.

Particular notice should be taken of the recent work reported in reference [6], by Corbató, et al., describing an application of Method 1.32. Their thoughtful comments on several aspects of multiprogramming system requirements and planning have inspired much of the work reported here, and the reader is referred to that paper for valuable background information on program time-sharing of hardware. Their algorithm for run queue control will be discussed briefly below, and several references will be made to observations in that paper.

Short-Run Break-In

The basic method proposed here is much simpler conceptually, and offers advantages shared by none of the other methods listed except those that utilize sheer power alone. In a sense, it is a simplification of the macro-segmenting concept outlined above. The segmenting is to be performed in large programs only, under control of flags planted by the programmer. This will require establishment of a programming convention for maximum on-line time interval between flags, which for large machines might be chosen on the order of a few seconds to a few minutes.
Jobs whose maximum machine time requirement is smaller than the maximum permitted interval between flags will not be segmented, and it is these jobs that can be called in by the operator program whenever a break-in flag is encountered.

While we do not have the temerity to essay a rigorous proof that any particular break-in time limit is a reasonable one for all circumstances, it will be helpful to consider one example of a notably forbidding class of problem in which data flow to and from auxiliary memory proceeds concurrently with rather involved calculation and indexing.

In the inversion by relaxation methods of large sparse matrices, it is prohibitively expensive of restart time to interrupt the calculation during a mesh sweep. As each new sweep starts, however, a substantial amount of initialization is performed; it is not unreasonable to request that auxiliary memory data flow be organized for efficient interruption at these points. For instance, tape data can start new blocks, so that, at worst, tapes may require simple backspacing in the event of interrupt at such a point; disc data flow may start a new I/O order at these points.

Consider a tridiagonal matrix of order 100,000 that represents an array of difference equations, calculation for each point to consist of eight accumulative-floating-multiply operations together with a few housekeeping operations. On a typical modern large-scale computer,* the floating-load-multiply-add sequence may take 7 microseconds. Ignoring the brief housekeeping operations, the time for this mesh sweep would be 5.6 seconds. Thus, imposition of the one-minute rule would afford no hardship to the programmer of this large problem.

* e.g., the Philco 212.

Clearly, in the formalization of problems that are even larger than this one, sectionalization into relatively autonomous parts is a sine qua non of rational construction and rational problem checkout. The run duration for these sections will tend to be far less than a minute.
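As a check on the mesh-sweep estimate above, the arithmetic can be written out in a few lines. This is only an illustrative sketch: the function name, the parameter names, and the use of 60 seconds as the flag interval (the "one-minute rule") are assumptions introduced here, and housekeeping operations are ignored, as in the text.

```python
def mesh_sweep_seconds(order, ops_per_point, microseconds_per_op):
    """Estimated time for one relaxation sweep, ignoring housekeeping.

    order               -- number of mesh points (matrix order)
    ops_per_point       -- floating multiply-add sequences per point
    microseconds_per_op -- time for one such sequence
    """
    return order * ops_per_point * microseconds_per_op / 1_000_000


# Figures quoted in the text: order 100,000, eight operations, 7 microseconds.
sweep = mesh_sweep_seconds(order=100_000, ops_per_point=8, microseconds_per_op=7)

# Assumed flag-interval limit (the "one-minute rule"), in seconds.
one_minute_rule = 60.0

print(f"sweep time: {sweep:.1f} s")
print("fits between break-in flags:", sweep <= one_minute_rule)
```

Under these figures a flag planted at the start of each sweep is reached roughly every 5.6 seconds, well inside the assumed one-minute limit.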
Thus the feasibility of the stratagem (2.22), wherein a major problem occupying most of one critical facility must be displaced, sectionally, to reduce turnaround time for a minor problem, is assumed to be dependent only on the means for dumping core memory, etc., into a high-data-rate "scratch medium" such as drums or discs.

From the standpoint of turnaround time, the availability of modern discs suggests that the complete loading of a number of problems can be made from cards and tape to discs well in advance of actual processing. All short-run segments, and all problems such as usually require five minutes of machine room set-up time and one or two seconds of run time, can now be processed ambulando.

It should be noted that the loading of information in advance of processing each problem segment can be effected automatically from discs more rapidly and with less entailment of control equipment, via Method 2.22, than would be the case with the macro-segmenting (Method 1.3), for the passage-time of macro-segments is not well matched with the access time of discs and is even more badly matched with the access times of tapes. Altogether, from considerations of simplicity of Method 2.22 and of the tanking advantages of discs, it seems quite practical to permit small jobs to interrupt large ones and to thereby implement a first level of priority for jobs of short estimated run time.

Memory-Protect Considerations

With regard to the ubiquitous problem of memory protection (which let us discuss in the limited context of protection of the Operator System program from being overwritten by a not-yet-debugged user program), Corbató [ibid.] suggested dynamic relocation of all memory accesses that pick up instructions or data words. This, in a true multiprogramming system, would consume significant machine time on a computer that did not have rather extensive specialized control hardware. With the straightforward scheme proposed here, memory protection can be adequately provided by the addition to a conventional machine of simple boundary registers.

For the protection of I/O unit assignments, Corbató [ibid.] suggested the trapping of all I/O instructions issued by user programs; under the scheme suggested here, this would be necessary only for the interrupting (small) programs.

Control of Precedence

Assuming that overall review and authorization of problems provides all the filtering needed in a facility except for within-shift scheduling, and further assuming that short-run break-in is adopted and discs are utilized, it becomes necessary to answer the question, "how much running-time should be allowed for small problems before automatic program-reversion to large problems is permitted?" Should the parameter be fixed or variable? Should it be some interval that is greater than a few seconds and perhaps less than five minutes? Is this range too large?

Using modern high-flushing-rate auxiliary memory equipment, one can save and replace the contents of main memory in less than one second, even on a fairly large computer. Consider the example of a 32,000-word core memory machine equipped with a disc backup memory that positions in a maximum of 100 milliseconds and communicates data at the rate of 119,000 words per second (8,192 words per 68-millisecond revolution), with angular delay of no more than one word-time when data words are moved in groups of 8,192 or more. Assume that the disc heads have been prepositioned to a "home" position by convention at a flagged break-in point in a large program. Time to refresh main memory would then be, at most:

$$T = \frac{32{,}768}{119{,}000}\ \text{second} + 0.1\ \text{second} + \frac{32{,}768}{119{,}000}\ \text{second} = 0.65\ \text{second, approximately.}$$

Among the parameters of priority, those that are dependent on equipment required for each problem segment become less important in a computer complex in which high capacity discs are included, because the tape-drives that would be required to serve as scratch media, in classical complexes, are now replaceable with areas on the discs. Likewise in file updating, even the current changes may be kept on discs, as well as the problem-program and the library. Thus, except for the propriety of using tapes as a medium for large history files or reference files, the functions of tapes are apt to be supplemental to those of discs, rather than vice versa. One such supplemental preference for tapes with respect to discs inheres in the two or three millisecond access time to the beginning of information blocks that are already in position for the next reading or writing action. On the whole, however, the criticality of I/O availability is considerably reduced when modern discs are available, and the number of essential availability parameters that must be used in a scheduling calculation is very small.

When there are several problems loaded into the tape or disc stack, the selection of the next one to be processed can be based on a calculation that takes into account the estimated run time (e.r.t.). An early, perhaps whimsical scheme that considered e.r.t., among other variables, was the North American (Aviation) "Precedence Program," circa 1955. NAPP controlled the job stack on an IBM 701 by considering the four factors U = Urgency, W = Wait time (since problem submitted), B = Business this customer gives the computing center per month, and r = run time estimated for this problem. For each waiting problem, the program calculated priority and chose the problem having the highest value of P to be run next, according to:

$$P = \frac{W\,U\,B}{r}$$

The value of U was set by reference to a table established by laboratory management and changed from day to day or perhaps from hour to hour.
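The NAPP selection rule just described can be illustrated with a short sketch. Only the formula P = WUB/r comes from the text; the record layout, the function names, the two sample jobs, and all of their numeric values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class WaitingJob:
    """One entry in the job stack (all field values are illustrative)."""
    name: str
    urgency: float    # U, as read from the management table
    wait: float       # W, time since the problem was submitted
    business: float   # B, customer's monthly business with the center
    run_time: float   # r, estimated run time (must be positive)


def napp_priority(job):
    """P = W * U * B / r, the precedence formula quoted in the text."""
    return job.wait * job.urgency * job.business / job.run_time


def choose_next(stack):
    """Return the waiting job with the highest value of P."""
    return max(stack, key=napp_priority)


# Hypothetical job stack: one long, urgent job and one short, routine job.
stack = [
    WaitingJob("long-run", urgency=2.0, wait=4.0, business=3.0, run_time=60.0),
    WaitingJob("short-run", urgency=1.0, wait=1.0, business=1.0, run_time=0.5),
]

for job in stack:
    print(job.name, round(napp_priority(job), 3))
print("run next:", choose_next(stack).name)
```

With these made-up values the short job is chosen despite its lower urgency, because the division by r rewards a small estimated run time.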
The parameter B was inserted in order to provide an appropriate indication of the loudness with which this customer was able to knock on the computing center door. The usual first-come, first-served sequencing convention may be looked upon as a degenerate form of this formula, with U, B, and r held constant.

We propose that two of the above four factors, Wait Time and Estimated Run Time, be considered in addition to I/O units required, in establishing run precedence. We do not propose to consider memory space requirements, since this scheme does not require cohabitation of running programs in main memory. We also propose to provide some weighting other than linear for the two times, thus:

$$P = n \log W - m \log r,$$

where n and m are weights given to Wait Time and Run Time respectively.

So far as turnaround time is concerned, the responsibilities of the Executive Program can be summarized into three classes:

(a) The computation in advance, for each problem in the input stack, of a precedence number, taking into account the parameters of priority.
(b) The anticipation, through survey of the estimates of running time, of the status of the queue, giving advance notice to operators of when the backlog of little and big problems is to be replenished.
(c) The providing of advance notice to operators of need to set up reference file tapes, during the advance of a large problem from segment to segment, in accordance with the computations of (a) and the intrasegment directions to operators provided by the compiler.

For the case of a true multiprogramming operation, with some scheme for sequencing small time-segments of user programs, system efficiency can approach zero for heavy workload when for some large programs the loading time becomes large compared to the run time segment length. Corbató [ibid.] proposed a scheduling algorithm that guaranteed an operating efficiency of at least 50% by keeping segment operate time equal to or greater than load time, and pointed out that one may determine the longest loading delay among a number of competing program segments and that, for a given "segment delay," the number of users must be limited. Unfortunately, in a typical computing center environment, it is the completion of a job rather than the start of its execution that is of interest; completion time does not seem to us to be predictable in the general case.

Under the proposed scheme, on the contrary, provided good discipline is maintained with regard to insertion of flags in interruptable programs and to limitation on the duration of interrupting jobs, it is possible to permit prediction of worst completion delay, or turnaround time limit, in terms of the number of short jobs waiting, by merely assigning a weight of zero to the coefficient m in (b) above, as executed by the operator program. This limit, for j jobs waiting, would be simply

$$j\,(t_f + t_i),$$

where $t_f$ is the maximum permitted time between flags and $t_i$ is the maximum permitted time for any interrupting job.

We feel intuitively that it would be desirable to experiment with weights for Wait and Run time coefficients n and m. For the facility that serves only a few dozen short-run users, it might be best to weight Wait time much more heavily than Run time, thereby approximating what Corbató calls "round robin" service; for the facility that serves a very large number of users, mean turnaround time must be greater and it will be desirable to favor short jobs by weighting Run time heavier.

Since, in this system, the criticality of I/O equipment scheduling (so far as tapes are concerned) is relaxed, some of the complexities that would enter into a general scheduling model are not present.
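The proposed precedence number and the worst-case turnaround limit can be sketched together as follows. The two formulas are the ones given above; everything else is an assumption made for illustration: the function names, the choice of natural logarithms (the text does not specify a base, and the ordering of jobs is the same for any single base), the sample wait and run times, and the one-minute values used for the flag interval and the interrupting-job limit.

```python
import math


def precedence(wait, run, n=1.0, m=1.0):
    """P = n log W - m log r for one waiting job.

    wait and run must be positive and expressed in the same time unit;
    n and m are the weights on Wait Time and Run Time respectively.
    """
    return n * math.log(wait) - m * math.log(run)


def worst_case_delay(jobs_waiting, t_flag, t_interrupt):
    """Turnaround limit j * (t_f + t_i) for j short jobs waiting."""
    return jobs_waiting * (t_flag + t_interrupt)


# Hypothetical jobs: (minutes waited, estimated run minutes).
short_job = (10.0, 1.0)
long_job = (60.0, 30.0)

# Equal weights favor the short job even though it has waited less.
print(precedence(*short_job) > precedence(*long_job))

# With m = 0 only wait time counts, so the longest-waiting job goes first.
print(precedence(*long_job, m=0.0) > precedence(*short_job, m=0.0))

# Five short jobs waiting, with assumed 60-second limits on both the
# interval between flags and the duration of an interrupting job.
print(worst_case_delay(5, t_flag=60.0, t_interrupt=60.0), "seconds")
```

Raising n relative to m moves the ordering toward first-come, first-served; raising m favors short jobs, which is the experiment with weights the authors suggest.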
Thus, for any problem in the stack, it becomes feasible to automatically examine the remaining factors, complying with the residue of considerations in a model such as given by J. Heller [2]. In this system, no special reprocessing of object languages is required in order to conform memory allocation to the ongoing problem-mix decisions. Furthermore, we require no real-time solution of linear models of flow or loading such as that reported by Totschek and Wood [3], nor are surrogates for solutions of these models needed when disc storage is present to provide cushioning.

The Executive Routine is relieved of responsibility for micro-monitoring of error concatenations that can thread across a set of problems and a complex of equipment, for only one large problem segment is processed at a time; problem independence is fostered. Thus, in modern computers, the interdependence of hardware within any problem segment can be kept at a reasonable level while maintenance and problem debugging are simplified.

In partitioned and buffered memory computers, i.e., those incorporating several memory modules having independent data and address registers, the inherent parallel capability is thereby conserved so as to contribute to speed of processing. This is in sharp contrast to the usual situation in micro-segmentation schemes, where memory partitioning complicates inter-job control and contributes to program control ricochet.

Implications

The basic idea of multiple-break-in operation from many input/output stations is not new. Most proposers, however, have advocated either dramatic advances in hardware or software, or computation complexes of conventional hardware so large as to be economically unattractive. McCarthy and associates [4], for example, proposed serving some dozens of stations for simultaneous on-line debugging of as many programs, by using an enormous slow magnetic core memory with a very fast processor complex.
One of their principal concerns was to provide on-line responsiveness in the system to any set of queries or inputs emanating from the array of program-development stations, so that it seemed that all problem materials must be immediately accessible in directly-addressable memory. We believe that a variation of the stratagem of (2.22), adopted for high-speed processors and discs, could serve most of these requirements, particularly when individual groupings of problems can be controlled by individual executive routines, with occasional call-out from one group to another.

In this connection, the peak memory traffic load, when transmitting ten characters per second per station to or from 100 stations simultaneously, reaches a character rate of only 1000 per second, or 1/250 of the load that a modern tape unit imposes on a single I/O channel. In a current-model commercially available computer with each of four independent memory modules operating at one microsecond full cycle, and assuming that one full word of memory would be accessed twice for each character incoming from these control stations, this loading would entail 2/250/4 = 1/500, or 0.2%, of the full memory capability. Clearly, the on-line query of raw information from a hundred or so stations is not a time-consuming process for this simple memory complex.

The Many-Short-Jobs Workload

Clearly, a production-computation workload that consists entirely of one-minute jobs is not going to be expedited by a traffic-handling scheme that emphasizes short jobs at the expense of long ones. This is a real limitation, for there are many organizations in which much of the daytime workload consists of brief compile-and-execute jobs.

Even for such organizations, however, there may be a powerful advantage in the method of short-run break-in. As discussed in the previous section, the economic soundness of use of main memory as a buffer for a multiplicity of input stations appears to be evident.
Most large computing centers would be capable of serving an enormous number of additional users if the minimum time per job were sharply reduced. In a typical present-day operation, jobs much shorter than one minute in extent are relatively few in number because people having such work to do can get it done more promptly in other ways. With the possibility of achieving reasonably good efficiency for jobs requiring one second or less of large-scale machine time, a whole new class of user becomes vulnerable to the wiles of the numerical mountain-mover. It is not difficult to conceive of several thousand jobs per day being done for the technical staff of a large laboratory, provided there are a large number of input stations conveniently located in the manner of reference [4].

In passing, it should be noted that one second (net, ignoring setup, startup, and wrapup) of machine time in this age is an item of not inconsiderable potential value. When used for such a mundane task as generation of a table of values for an implicit function, for example, it could accomplish the equivalent of months of hand calculation. Perhaps more to the point, the availability on a few minutes' notice of a tool of such awesome power can encourage "calculation, not guesstimation" for problem sizes which would otherwise not be served at all.

CONCLUSION

We have endeavored to show that the conceptually simple scheme of short-run break-in can permit turnaround time for brief computing jobs to be reduced drastically without substantial increase in time for any jobs, including the longest ones. In particular, we have pointed out how one of the basic objectives of the McCarthy, et al. proposal, to make feasible nearly simultaneous access by many people to a large computer, can be met through the application of presently-available hardware and presently-designable software.

REFERENCES

1. Mock, Owen and Swift, Charles J., "The SHARE 709 System: Programmed I/O Buffering," J. ACM, 6, 2, April, 1959.
2. Heller, J., "Sequencing Aspects of Multiprogramming," J. ACM, 8, 3, July, 1961.
3. Totschek, R. and Wood, R. C., "An Investigation of Real-Time Solution of the Transportation Problem," J. ACM, 8, 2, April, 1961.
4. McCarthy, John, et al., "Report of the Long Range Computation Study Group," (private communication), Massachusetts Institute of Technology, Cambridge, Massachusetts, April, 1961.
5. Greenfield, Martin N., "Fact Segmentation," Proc. 1962 SJCC, pp. 307-315.
6. Corbato, F. J., et al., "An Experimental Time-Sharing System," Proc. 1962 SJCC, pp. 335-344.

REMOTE OPERATION OF A COMPUTER BY HIGH SPEED DATA LINK

G. L. Baldwin
Bell Telephone Laboratories, Incorporated
Murray Hill, New Jersey

N. E. Snow
Bell Telephone Laboratories, Incorporated
Holmdel, New Jersey

INTRODUCTION

One promising means of attaining data transmission speeds high enough to be effective with present day computer operation is the use of wide band facilities provided by the Bell System TELPAK service offerings. Almost every industry large enough to make use of a computer also has need of large numbers of voice telephone circuits between centers of operation. Quite often these circuits are provided by a TELPAK channel. Alternate use of the entire channel in a continuous spectrum data transmission system not only makes high speeds possible, but in many cases makes them economically attractive.

With the establishment of a new installation at Holmdel, New Jersey, Bell Telephone Laboratories had an excellent opportunity to make use of and evaluate an experimental data transmission service, using a TELPAK A channel. A TELPAK A service, with appropriate terminal equipment, can be used as twelve voice circuits or as an equivalent continuous spectrum wide band channel. Installation of the system was completed and routine operation begun in February, 1962, giving the Holmdel Laboratories rapid access to an IBM 7090 computer at the Murray Hill, New Jersey, Laboratories.

This paper presents first, a description of the system, and second, the concepts under which it was devised with an evaluation of the operational results. As a part of the evaluation, an attempt is made to point out the limitations in usefulness of such a system and the inherent qualities, both good and bad.

Description of Experimental System

A functional block diagram of the data transmission system is shown in Figure 1. Basically it is a magnetic core to magnetic core system used primarily for tape-to-tape transmission. IBM input-output and transmission control equipment are utilized with Bell System experimental data sets, prototype N-2 telephone carrier terminals, and a specially engineered type N-1 carrier repeatered line facility. A discussion of each of these system components follows.

[Figure 1. Block Diagram: Experimental Murray Hill-Holmdel Data Link. Blocks shown at each of the two locations (Bell Telephone Laboratories, Murray Hill, N.J., and Holmdel, N.J.): IBM 729 tape drive, IBM 1401 computer, IBM 7287 control, local loop, and N-2 carrier terminal at the central office. The central offices are joined by an N-1 carrier line with 11 repeaters. The Holmdel local loop is labeled 15000 feet of 19 gauge non-loaded pairs.]

Data Link Input-Output Equipment

Tape drives are of the IBM 729 type operating under control of IBM 1401 computers. Both computers and tape drives are used in routine data processing operations when not connected in the data transmission configuration. While transmission is in progress, the tape drive at either end of the system is under control of the local 1401 and subsequently the transmission control unit, an IBM 7287 Data Communication Unit.
The system is designed for transmission of data records or "blocks" with error detection and reply between transmissions. Data is read from magnetic tape one record at a time. Parity checks are made as the data is read into the 1401 core storage, from where it may be clocked at a synchronous rate for transmission. The IBM 7287 transmission control unit is arranged in the system to clock data from the 1401 storage under control of the timing signal supplied by the data set (timing could be supplied by the 7287 if not supplied by the data set). Upon receiving data from storage the 7287 performs a parity check, translates the code from seven-bit to four-out-of-eight fixed count, serializes the data, and delivers a binary dc signal acceptable at the data set interface. In addition, for each record transmitted, it generates and adds record identification, start-of-record, end-of-record, and longitudinal redundancy check characters.

The receiving 7287 receives serial data from the data set, performs character and longitudinal error detection, translates the code back to seven-bit characters, and delivers them in parallel to the receiving 1401. Upon completion of receiving and checking a record, a digital control signal is returned via the reverse direction of transmission to the transmitting 7287. The control signal identifies the record received and indicates either that the record passed all error checks or that it failed and should be retransmitted.

At this point the transmitting end may be in either of two conditions. In the first operating mode the 1401 may have already read the next record from tape, and transmission may continue if no error is indicated. If an error is indicated, it is then necessary to back the tape up two records and read the record again for retransmission. The second operating mode (determined by choice of 1401 program) holds each record in 1401 storage and continues retransmitting until a "no error" reply is received, and then the next record is read from tape.
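The error checks and the second (hold-and-retransmit) operating mode described above can be illustrated with a minimal Python sketch. It is not the 7287's actual logic: the form of the longitudinal check (a bitwise XOR over the record), the function names, and the test channel are all assumptions made for illustration, and the check character is assumed to arrive undamaged. The only property taken from the text is that each transmitted character is a four-out-of-eight fixed-count code word, of which there are 70.

```python
def is_four_of_eight(code: int) -> bool:
    """Fixed-count check: an 8-bit code word must contain exactly four 1 bits."""
    return 0 <= code <= 0xFF and bin(code).count("1") == 4


def lrc(record):
    """Longitudinal check character, ASSUMED here to be the XOR of all code words."""
    acc = 0
    for code in record:
        acc ^= code
    return acc


def passes_checks(received, check):
    """Receiver side: 'no error' only if every character and the LRC pass."""
    return all(is_four_of_eight(c) for c in received) and lrc(received) == check


def send_with_retransmit(record, channel, max_tries=10):
    """Hold the record in storage and resend until a 'no error' reply.

    `channel` is a hypothetical callable that returns the record as received.
    Returns the number of transmissions that were needed.
    """
    check = lrc(record)
    for attempt in range(1, max_tries + 1):
        if passes_checks(channel(record), check):
            return attempt
    raise RuntimeError("record not accepted after max_tries transmissions")


def make_flaky_channel(bad_transmissions):
    """Hypothetical test channel: flips one bit in the first N transmissions."""
    remaining = [bad_transmissions]

    def channel(record):
        received = list(record)
        if remaining[0] > 0:
            remaining[0] -= 1
            received[0] ^= 0b1  # a single-bit error breaks the fixed count
        return received

    return channel
```

With `make_flaky_channel(2)`, the first two transmissions fail the fixed-count check and the record is accepted on the third. The first operating mode differs only in that the next record has already been read, so an error reply costs a tape backspace of two records before the resend.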
The more efficient mode of operation is, of course, dependent upon the transmission error rate. When a record passes all 7287 error checks and parity checks in the receiving 1401, the record is delivered to the receiving 729 tape drive, completing the tape-to-tape transmission.

In order to achieve maximum efficiency it is necessary to maintain character synchronization continuously in both directions of transmission, rather than re-establish synchronism for each record or reply transmission. To accomplish this, the 7287 transmits periodically (once each 500 milliseconds) a short interval (approximately 10 milliseconds) of a character synchronization pattern either between record transmissions or between reply transmissions. Continuous repetition of a synchronization pattern must receive careful design consideration (as will be explained) for transmission interference reasons in this or similar systems.

Data Set

Experimental Bell System X301A (M-1) Data Sets used in the system are designed for serial transmission at a synchronous rate of 42,000 bits per second. The principles of operation are the same as in the Bell System 201A Data Set (commonly referred to as a "four phase data set") currently providing DATA-PHONE service on voice circuits. The data set employs quaternary phase modulation with differential synchronous detection. Data delivered serially to the transmitter is encoded two bits (a "dibit") at a time into a phase shift of an 84 KC carrier. For the four possible dibits (11, 00, 01, 10) the phase of the carrier transmitted during a dibit time interval is shifted by 1, 3, 5, or 7 times π/4 radians with respect to the carrier phase during the previous dibit time interval. The 21,000 dibit per second modulation results in a line signal spectrum symmetrical about the carrier frequency in the 63 KC to 105 KC band. At the receiver, dibit timing is recovered directly from sideband components of the line signal.
This timing is then used in the demodulation process. Data is recovered by detecting and decoding the phase relationship between the previous dibit interval of line signal (available from a one dibit delay line) and the present dibit interval of line signal. The recovered data with a synchronized bit timing signal (generated from the recovered dibit timing) is then delivered at the receiver output. Data, timing, and control circuits all appear on one ganged coaxial connector on the rear of the data set chassis. Interface circuits are designed to drive low impedance (90-120 ohm) loads or to terminate similar circuits.

Operation of the X301A (M-1) Data Set differs somewhat from that of voice band sets using the same modulation technique. It is undesirable, in the idle condition, to transmit a signal corresponding to a repeated bit pattern. Under this condition the line signal spectrum contains high level single frequency components which may result in crosstalk into other telephone carrier systems operating in the same cable. At the same time it is desirable to maintain continuous bit synchronization, requiring continuous transmission of a line signal. A compromise solution is necessary.

The data set interface provides Send Request and Clear to Send control circuits. A Send Request "on" signal presented to the data set results in a Clear to Send "on" signal being returned to the data source device, and data will then be accepted on the Send Data interface circuit. When the Send Request and Clear to Send control circuits are in the "off" condition, the data set will not accept data on the Send Data circuit but automatically generates a line signal corresponding to a repeated "1000" bit pattern correctly related to the dibit timing signal. This results in the most desirable (lowest level single frequency components) "idling" line signal possible.
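The differential four-phase scheme described in this section can be sketched in a few lines of Python. This is an idealized model of phases only, not of the data set's circuitry. The pairing of dibits with phase shifts is an assumption: the text lists the dibits (11, 00, 01, 10) and the multipliers 1, 3, 5, 7 of π/4 in that order, and the sketch pairs them respectively. The function and constant names are hypothetical.

```python
import math

BIT_RATE = 42_000                 # bits per second, from the text
DIBIT_RATE = BIT_RATE // 2        # 21,000 dibits per second
CARRIER_KC = 84
BAND_KC = (CARRIER_KC - DIBIT_RATE // 1000, CARRIER_KC + DIBIT_RATE // 1000)

# ASSUMED pairing, taken in the order the text lists dibits and multipliers.
SHIFT = {(1, 1): 1, (0, 0): 3, (0, 1): 5, (1, 0): 7}   # units of pi/4
DIBIT = {mult: bits for bits, mult in SHIFT.items()}


def encode(bits, start_phase=0.0):
    """Return the carrier phase (radians) for each dibit interval."""
    if len(bits) % 2:
        raise ValueError("bit stream must contain whole dibits")
    phases, phase = [], start_phase
    for i in range(0, len(bits), 2):
        # Each dibit advances the phase relative to the previous interval.
        phase = (phase + SHIFT[(bits[i], bits[i + 1])] * math.pi / 4) % (2 * math.pi)
        phases.append(phase)
    return phases


def decode(phases, start_phase=0.0):
    """Recover bits by comparing each interval with the previous one."""
    bits, previous = [], start_phase
    for phase in phases:
        delta = (phase - previous) % (2 * math.pi)
        quadrant = min(3, int(delta // (math.pi / 2)))   # nearest odd multiple of pi/4
        bits.extend(DIBIT[2 * quadrant + 1])
        previous = phase
    return bits
```

Because only phase differences carry information, the receiver needs no absolute phase reference, only the one-dibit delay the text mentions. Under the assumed pairing, the idle pattern "1000" alternates shifts of 7 and 3 times π/4, so the carrier phase changes in every interval, which is consistent with the text's remark that continuous transmission maintains synchronization.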
Continuous transmission also enables the receiver to maintain bit synchronization between data, reply, or character synchronization transmissions.

N-2 Carrier Terminal

Telephone carrier terminals used in the system are prototype models of the type N-2 transistorized system designed to provide twelve two-way voice channels. A prototype N-2 data channel unit replaces plug-in voice channel units. The system then handles a two-way wide band data channel. The data channel unit serves to adjust signal levels and modulate the data signal from the data set into the frequency band vacated by the voice channels removed. The data channel spectrum is then modulated by the N-2 terminal group circuitry into the proper frequency band for transmission to the carrier line.

N-1 Carrier Line

The type N-1 carrier line utilized for this system is of the same type widely used in providing carrier telephone circuits throughout the Bell System, insofar as equipment and cable facilities are concerned. This particular line is specially designed to minimize noise (e.g., short repeater sections). The same design is required in many military services. Not all N-1 carrier lines in service meet the necessary noise requirements, but with additional engineering and construction effort they can be made to do so. The N-1 line between Murray Hill and Holmdel, New Jersey, is approximately thirty miles in length, short enough so that no phase or amplitude equalization is required. It is estimated that equalization will become a necessity in the order of one hundred miles of repeatered line. Transmitted signal levels into the N-2 terminal and N-1 carrier line must, on a particular system, be a compromise between deriving a signal-to-noise ratio yielding satisfactory data error rates and keeping interference into adjacent systems in the same cable at a minimum.

Local Loops

Data signals are transmitted from and received at the data set over non-loaded telephone cable pairs.
The experimental system described here utilizes 3000 feet of 26 gauge pairs between the Murray Hill Laboratories and the Murray Hill central office. At Holmdel 15000 feet of 19 gauge pairs are used. Although it is not expected to be universally true, it proved necessary in this system to use double shielded pairs between the building entrance cable termination and the computing center data set location to avoid inductive pickup of interfering signals.

Need for System: Initial Concepts of Design and Usage

In late 1959, when we were doing the initial planning for computing equipment at Holmdel, we had been told that within about six months of initial occupancy, the Holmdel buildings would house a total of some twenty-five hundred people. Since these people were to be transferred primarily from our New Jersey Laboratories at Murray Hill and Whippany, many of them at the time of relocation would be in the middle of projects requiring use of a computer. In order to provide a capability roughly equal to that at Murray Hill, we considered the following alternatives:

Installation of a 7090 at Holmdel

This we realized to be an ultimate requirement, and a 7090 is presently scheduled for installation in the latter part of the year. However, the total anticipated load during the first six months or so of occupancy did not justify earlier installation. A computing facility of this size costs of the order of $100 thousand per month, and it is almost out of the question from an economic point of view to install one without a nearly full prime shift load.

Install a Smaller Machine in the Interim Period

This is an alternative which we dismissed at once. Our programming costs are of the same order of magnitude as the computer operating costs, and we just could not afford the reprogramming effort entailed.
Use a Station-Wagon Data Link

This was the most attractive alternative from a point of view of economy, and we have even gone so far as to provide backup for the automatic data link by a truck which makes several regularly scheduled round trips per day between Murray Hill and Holmdel in the event of data link failure. However, the truck scheduling problems here were such that the service, in terms of turn-around time on a typical job, would not be good enough for a continuing operation.

A Voice-Bandwidth Data Link

There were several commercially available automatic data transmission facilities which operated at speeds up to 2400 bits per second on voice bandwidth lines. However, these were all too slow to give adequate service under heavy load conditions. Since we were sure of several hours of 7090 usage per day, the delays in such a facility would result in about the same grade of service as the station-wagon.

A Tape-Speed Data Link

At the time we were making our plans for Holmdel, a microwave tape-to-tape data link was in operation on the West Coast. Such a system would provide higher speeds and some operational benefits. Since, however, we were really interested in coverage only over a fairly brief interval, installation of a microwave transmission system made this alternative unattractive, from a cost viewpoint alone.

The TELPAK A Data Link

This, of course, was our final choice. Its main attractions were that it operates over transmission facilities which are available in fairly large quantity in the Bell System toll plant throughout the country, that its cost is reasonable, and that its operating speed is such as to increase the total job processing time by only about twenty-five percent. Furthermore, being in the business of communication, we felt this to be an extremely worthwhile experiment in the field of data transmission.
In forming our initial concepts of the TELPAK A data link, we knew that at the time of its installation the Murray Hill Computation Center would include a 7090 supported by three 1401 computers as peripheral equipment. At Holmdel we needed at least one 1401 to read and write the tapes to be transmitted over the data link and to read cards, print, and punch. Since the signalling rate of the data link was limited to the order of 40 kilobits per second, we required a buffer at each end to compensate for the difference between tape and transmission speeds. It was quite natural then to examine, together with people from IBM, the possibility of using the 1401 computer at each location for both buffer storage and control.

Since we have stored program computers at each end of the data link, there is a great deal of flexibility in matters of tape format, block size, coding scheme and the like. However, virtually all of our operating experience has been with tapes written in the format peculiar to our 7090 monitor program, BE-SYS-4. In reference to the particular options possible with IBM tape transports, this format involves binary (odd parity), high density (556 characters per inch) records of variable length not exceeding 1000 characters.

As far as the data link is concerned, our only limitation on block size is the buffer space we have reserved in the 1401 memory. We have 8000 characters of core storage in all of our 1401's and the storage required for the transmission program itself is about 1100 characters, so that it is possible for us to work with a much larger block size. It is typical of a buffered transmission scheme that the average effective data rate is low for both very short and very long records. With very short records a large fraction of time is spent in "overhead": starting and stopping the tapes and transmitting acknowledgements.
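The dependence of effective data rate on record length can be put into a rough model, sketched below in Python. This is not the authors' calculation. It assumes the hold-and-retransmit mode, independent character errors, and a fixed per-record overhead expressed in character-times. The overhead value of 111 character-times is hypothetical, chosen only so that a noise-free 1000-character record comes out near the 90% figure quoted in this discussion; the model should not be expected to reproduce the authors' optimum exactly.

```python
def efficiency(block_chars, overhead_chars, char_error_rate):
    """Fraction of the raw line rate delivered as good data.

    Each attempt costs (block + overhead) character-times, and a record is
    accepted with probability (1 - p) ** block, so on average 1 / p_ok
    attempts are needed per record delivered.
    """
    p_ok = (1.0 - char_error_rate) ** block_chars
    return p_ok * block_chars / (block_chars + overhead_chars)


# ASSUMED figures for illustration only.
OVERHEAD = 111            # character-times of start/stop and reply per record
NOISY = 1.0 / 50_000      # "one error in fifty thousand characters"

if __name__ == "__main__":
    for size in (10, 100, 1000, 10_000, 100_000):
        print(size, round(efficiency(size, OVERHEAD, NOISY), 3))
```

Short records lose most of their time to the fixed overhead, and very long records lose it to retransmission, with a broad maximum in between. The first operating mode would add a tape backspace to the cost of each failed attempt, which this sketch does not model.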
Conversely, for very long records there is a higher probability of a parity error in transmission of a record and a high penalty in time for retransmission. In between these two extremes there is a fairly broad optimum. In terms of our particular experience a block of 1000 characters is the optimum length for a fairly high transmission noise level, say, one error in fifty thousand characters. On the other hand, even in a noise-free system it gives an average transmission rate of about 90% of the maximum possible. In this sense, then, the block size associated with our normal tape format is quite satisfactory.

Secondly, the uniformity of the physical appearance of all information on tape has simplified the 1401 program normally used for operation of the data link to the point where we have made it a part of our standard 1401 program used for card-to-tape and tape-to-card/print operations. This in turn allows us to switch the 1401 from local operation to transmission without loading a separate program.

Operation of System

Let us now look at our method of operation of the data link in more detail. First, at Holmdel, a "batch" of programs is loaded into the 1401 using our standard card-to-tape program, thereby producing a 7090 input tape. This is rewound, and the 1401 program is altered (by sense switches) to transmit this tape to the receiving 1401 at Murray Hill. The tape is then sent to Murray Hill, and at the end of transmission the duplicate tape is rewound and the tape transport is switched electrically from the 1401 to the 7090. When the computer becomes available, the monitor program on the 7090 reads the tape, executing the various programs as they appear and generating results, again batched on a single output tape. At the completion of all jobs the output tape is rewound, switched electrically to the 1401, and transmitted to Holmdel.
Finally the tape received at Holmdel is rewound and processed by our standard 1401 program to produce printed output and punched cards. Although operation over the data link in this manner involves the extra tape spinning (twice forward, two rewinds) required for transmission, the entire process can be carried out without mounting or dismounting a tape. In practice, we do move the data link output tape from its tape transport at Holmdel to one on an adjacent 1401, to allow printing and transmission to proceed simultaneously.

One very nice feature of the buffered transmission scheme is that we have complete freedom in the choice of 1401 input/output equipment to be used. At the time of writing we are using 729 Model II tape transports at Holmdel and 729 Model IV transports at Murray Hill; these differ in their operating speeds (75 and 112 inches per second, respectively). It is quite possible, for example, to go directly from cards at one end of the link to tape at the other, although we do not normally do so because of the added time this ties up both 1401's. Indeed our standard program in the receiving 1401 scans the data for certain 7090 monitor control cards and prints these immediately on the 1403 printer while transcribing them as well on tape. This then provides the computer operators with a summary of the jobs to be processed as well as any unusual instructions as to how they should be treated.

In a typical day's use, we transmit over 20 input tapes from Holmdel to Murray Hill, on a twice-an-hour schedule between 9:15 and 4:45, with transmissions in the evening shifts as required by the load. At the time of writing our daily load from Holmdel is about five hours of 7090 time, during which time we process typically 130 separate jobs. Roughly one hundred of these jobs are processed during the prime shift (9 a.m. to 5 p.m.); during this same eight hour period we typically process 150 jobs which originate at Murray Hill.
The turn-around time (time from submission of a program to the Holmdel Computation Center to delivery of printed output) is generally between two and four hours on jobs which require five minutes or less on the 7090. This is limited primarily by the printing capacity of the Holmdel 1401's, since the groups which were transferred to Holmdel are working on jobs which generate a considerable amount of output. We have handled some high priority runs for one part of the Telstar project giving less than half-hour turn-around at Holmdel without disrupting our flow of work through the 7090 (aside, of course, from the 7090 time used for actual execution of the program).

Operating Reliability and Results

As far as reliability is concerned, the link has gone down, in the period February 15 through July 15, a total of about eight times: once for the data set, once for the IBM translator and a half dozen times for the transmission facilities. These facility failures were isolated to one cable section and the problem was eliminated by changing to different pairs in the same cable. When the data link is working correctly, we have experienced an average retransmission rate due to noise of one error per three thousand records (see Figure 2).

[Figure 2. Error Rate vs. Time of Day. Vertical axis: errors per million characters (scale 10 to 50). Horizontal axis: time of day (12, 1, 2, 3, 4, 5, 6, other). Two curves: MH to HO and HO to MH.]

Preliminary Conclusions: Areas of Usefulness, Qualities

Our primary use of the Data Link was to provide rapid yet economical access to a large computer from a remote location.
The desirability of the data link for this use depends on a number of factors:

The Load at the Remote Location

Since this use of the data link requires a 1401 at the remote location, not only to transmit data, but also to do normal card-to-tape and tape-to-print/punch operation, it is expensive. A remote facility, including a 1401, a tape unit, auxiliary keypunches and associated equipment, furniture, staff and space, would cost of the order of $10 thousand per month. Reasonable economy dictates that this cost be spread over a load of at least thirty or forty hours of 7090 usage. On the other hand, a monthly load of over 150 hours would justify the installation of a separate 7090. Therefore, this use of the data link depends strongly on bounding the load within a fairly critical range.

Anticipated Growth of the Load

Clearly, one of the competitors of remote operation of a large computer over a data link is the installation of an on-premises machine which is smaller and less expensive. The desirability of this course of action depends on what future loads are anticipated. That is, if we expect the load to be fairly constant with time the separate smaller computer is probably cheaper and more attractive. On the other hand, if the load is expected to grow to the point where a large machine will be justified within one or two years, then the saving in reprogramming and re-training of programmers might well repay the added dollar cost of the data link many times over. This certainly has been true in our own case. It might be added here that competition from small machines is purely on an economic basis. As soon as the industry can develop really inexpensive peripheral printers, readers, and punches, the data link will be much more attractive.

The Need for Good Service

Certainly the least expensive remote use of a computer, with today's technology, involves the use of cars, trucks, telephones and the U. S. Mail.
These, however, give a grade of service whereby it takes a day or more to return results on any given run. In a situation where the computer is being used primarily for production runs on a predictable schedule, this grade of service is often more than satisfactory. However, when a large part of the load consists of checkout and running of programs which are required in scientific and engineering projects, such economies in operating costs are far outweighed by ineffective use of technical personnel and by project delays.

Upon installation of a 7090 at Holmdel, we plan to continue use of the data link for purposes of load balancing as well as for protection of both computation centers in the event of temporary unavailability of one of the 7090's. Although this is a far less compelling motivation than that of remote computer operation, the price comes down at the same time, for now we have 1401 computers, operating personnel and everything else at both locations anyway.

As with anything in this world except good bourbon, the data link does have some recognized deficiencies. In the first place, because the operation is essentially tape-to-tape, we find ourselves winding and rewinding tapes twice more than we ordinarily do when operating a 7090 locally. These added operations, although quite simple, are enough to encourage us to do more batching of jobs than otherwise. This, in turn, increases the turn-around time and requires more attention of the operators. As far as speed is concerned, it would be nice to have it a good deal faster, for we find ourselves tying up the two 1401's for roughly thirty hours a month just transmitting tapes. On the other hand, this is not a really serious deficiency, since for each minute we spend transmitting data we generally spend five more in printing it.
The major limitation on the data link as we have used it is its cost, and this is really not a reflection on the cost of data transmission, but rather on the cost of the supporting equipment and personnel at the remote location. If we could bring the total price of a remote location down to one or two thousand dollars per month and yet retain the input-output speed of the 1401 card reader and printer, then we would be able to justify a remote operation with a monthly load of well under ten hours of 7090 time. It is felt to be but a matter of time until the proper equipment is developed, but it certainly is not possible today.

STANDARDIZATION IN COMPUTERS AND INFORMATION PROCESSING

C. A. Phillips and R. E. Utman
Business Equipment Manufacturers Association
New York, New York

The data processing standardization program is a comparatively new effort since it had its genesis in an action by the International Organization for Standards (ISO) in late 1959. On a recommendation from Sweden, ISO decided there was a need for a standards program in connection with computers and information processing. It seems rather remarkable that the need for a standards program was recognized so early in the state of an art whose principal tool, the electronic computer, is only fifteen years old this year.

Before getting into the specifics of this program, let us first consider very briefly standardization as a process, and its effect upon our lives. The Encyclopaedia Britannica describes standardization as a continuing process to establish measurable or recognizable degrees of uniformity, accuracy or excellence, or an accepted state in that process. It goes on to point out that man's accomplishments in this direction pale into insignificance when compared with standards in nature, without which we would be unable to recognize and classify within a species, the many kinds of plants, fishes, birds or animals.
Without such standardization in the human body, physicians would not know whether an individual possessed certain organs, where to look for them, or how to diagnose or treat disease. To further quote the Encyclopaedia, "without nature's standards there could be no organized society, no education and no physicians; each depends upon underlying comparable similarities."

Although we are inclined to think of man-made standards as relating principally to such things as weights and measures, money, energy, power or other material commodities, you will also find standards in social customs, in codes, procedures, specifications and time, to name a few. Standardization is important to geography, photography, chemistry, pharmacy, safety, education, games, sports, music, ethics and religion. The profession of accounting, for example, is largely dependent upon standards, which are generally referred to as "accepted practice." In fact, it is "accepted practice" that usually generates standards, many of which may be unwritten, simple and crude, while at the other end we have standards that are specified in great detail, nationally accepted and used, and, in many cases, subject to legal definition.

There is no question that industrial activity thrives on standardization. It has been argued with strong support that industrial standardization is the dynamic force that, in a sense, created our modern Western economy. There is no question that industrial standardization is the cornerstone of our mass-production methods, which, in turn, are such a vital part of our American economy.

All of the industrially advanced countries of the world have their own national standards, and in many of them, standardization is whatever the government decrees; at least that is the case in Soviet Russia and its satellites. Russia has over 8,500 standards in effect; Germany, over 11,000; France about 4,500.
The United Kingdom has approximately 4,000 British Standards available for use. The United States has approximately 2,000 American Standards approved by our voluntary national standards body, the American Standards Association.

The multiplicity of standards-making groups and the frequent duplication of effort by several groups having a kindred problem led to the founding of the American Standards Association (ASA) in 1918. During World War I, the need for eliminating conflicting standards and duplication of work became urgent. Several engineering societies, together with the War, Navy and Commerce Departments, established the American Engineering Standards Committee, which was reorganized in 1928 and renamed the American Standards Association. Today, the ASA federation consists of 126 national organizations, supported by approximately 2,200 companies.

Over the years, ASA has evolved a set of procedures that apply checks and balances to assure that a national consensus supports every standard approved as an American Standard by ASA. By the terms of its constitution, ASA is not permitted to develop standards, but instead acts as a catalyst by aiding the different elements of the economy to obtain a desired standards action through the established procedures.

In 1946 some one hundred top leaders in business and industry entered into a formal agreement with the Secretary of Commerce to broaden the scope and activities of the ASA. Along with the American Society for Testing Materials, ASA is now reviewing federal specifications to bring them into line with the best industry practice. Today the federal government is following a policy of using industry standards rather than writing its own, and ASA has become a focal point of cooperation in standards work between government and industry.
The Department of Defense, by specific Directive, has authorized its personnel to participate in ASA activities as voting members and, at present, no fewer than 25 federal agencies and in excess of 600 government representatives are participating in the work of ASA committees. The National Bureau of Standards accounts for many of these committee posts. As a means of avoiding or eliminating differences among national standards, which sometimes may be an even greater trade barrier than import quotas or high tariffs, 44 nations have joined together in a world-wide non-government federation of national standards bodies known as the International Organization for Standardization, or ISO. The objective is to coordinate the national standards in various fields by means of ISO Recommendations, which are then available for voluntary adoption by all countries. In the electrical field, international standardization is conducted through the International Electrotechnical Commission (IEC), which is an independent division of ISO, made up of national committees in 34 countries. The American Standards Association is the USA member of ISO, and the U. S. National Committee of the IEC is an arm of ASA. With this very general background of standards practices and organization, let us next look at the standards program in the field of computers and information processing under three general headings: first, the relationship of this program to the international and national standards organizations and the manner in which the effort has been organized and is being directed; second, the membership of the various groups participating in the program; third, the scope of the overall program and its various subdivisions, along with the approach in each case and a brief report on progress. Coming out of the 1959 meeting of ISO, previously referred to, was the assignment by ISO to the United States of overall responsibility for the program's conduct.
A chart reflecting the overall organizational structure would show at the top two parent organizations: ISO, at the international level, and ASA, at the national level. Following established procedures, ASA assigned the program to a sponsor, which is usually a trade association with a direct interest in the subject and a willingness to undertake the effort. In this case, the Office Equipment Manufacturers Institute, which later became the Business Equipment Manufacturers Association or BEMA, was the logical organization to be given the responsibility as sponsoring activity. Under ASA procedures, the sponsor organizes the project, subdividing it as necessary, and finances the full-time staff and other direct costs incident to the program. Each major project under a sponsor is referred to as a Sectional Committee. ASA has an identification system of letters and numbers for these Sectional Committees, and under this system the data processing standards project became known as the X-3 Sectional Committee. It should be mentioned that a concurrent project concerned with standards in the office machines area was also assigned to BEMA as sponsor and is identified as the X-4 Sectional Committee. This breakdown into the X-3 and X-4 Sectional Committees coincides with the organization of BEMA into semi-autonomous groups known as the Data Processing Group, with responsibility for X-3, and the Office Machines Group, with responsibility for the X-4 Project. Closely related to the X-3 Sectional Committee is another Sectional Committee identified as X-6, which was established by ASA under the sponsorship of the Electronic Industries Association (EIA) for consideration of those aspects of the standards program in data processing which are purely electrical, as distinguished from the logical or other physical characteristics which are the responsibility of X-3.
It would be well at this point to consider further the role of ASA in relation to the Sectional Committees. As the various Sectional Committees develop recommendations through various subcommittees, these recommendations go through an approval process at the Sectional Committee level and are then submitted to ASA. The ASA will review the proposed standards and the supporting data and reach a judgement as to whether or not a consensus exists for such a standard. A proposal may be refused on a single negative vote or approved with several dissents. The X-3 Sectional Committee is made up of three major groups with approximately the same number of members in each group. These groups are known as the Users Group, the General Interest Group and the Manufacturers Group and are made up, for the most part, of representatives of trade associations, professional or technical societies or other bodies having a direct interest in the subject. The members of the Manufacturers Group are selected from the BEMA membership by the Engineering Committee of the Data Processing Group/BEMA, which is charged with direct responsibility (within BEMA) for general direction of the standards program. At the present time, the X-3 Sectional Committee is chaired by a staff member of the Data Processing Group of BEMA. The General Interest Group of the X-3 Sectional Committee is made up, for the most part, of organizations or societies related by professional background or interest. They include the Association for Computing Machinery (ACM), the American Management Association (AMA), the Electronic Industries Association (EIA), the Engineers Joint Council (EJC), the Institute of Radio Engineers (IRE), the Association of Management Engineers (ACME), the National Machine Accountants Association (NMAA) and the Telephone Group. The Department of Defense is also represented in the General Interest Group. The Users Group is made up of associations that have a common interest as to type of business.
They include the Air Transport Association (ATA), the American Bankers Association (ABA), the American Petroleum Institute (API), the Insurance Accounting and Statistical Association, the Joint Users Group (JUG), the Life Office Management Association (LOMA), and the National Retail Merchants Association (NRMA), with the American Gas Association and the Edison Electric Institute holding a joint membership. The General Services Administration represents the Federal Government in the Users Group. Representing the Manufacturers Group are ten companies, some manufacturing complete data processing systems, while others manufacture devices used in conjunction with data processing systems. The companies representing BEMA are: Burroughs Corporation, International Business Machines Corporation, Minneapolis-Honeywell EDPD, Monroe Calculating Machine Company, National Cash Register Company, Pitney-Bowes Inc., Radio Corporation of America, Remington Rand Division of Sperry Rand, Royal McBee Corporation, and Standard Register Company. ASA procedures require that the terms of reference under which a Sectional Committee operates shall be clearly set forth in a statement of scope, which might be called a "charter." The language used to describe the scope of the X-3 Sectional Committee is as follows: "Standardization of the terminology, program description, programming languages, communication characteristics, and physical (non-electrical) characteristics of computers and data processing devices, equipments and systems." You will note the specific exclusion of electrical characteristics which, as previously mentioned, have been assigned to the X-6 Committee under sponsorship of EIA. The very broad scope of the X-3 Sectional Committee has been subdivided into seven parts or subcommittees, all having one thing in common: they are dealing with problems of communication.
The first subcommittee, X-3.1, is concerned with Optical Character Recognition; X-3.2 is concerned with Coded Character Sets and Data Format; and X-3.7 is concerned with Magnetic Ink Character Recognition. You will note that these three deal primarily with input and output problems, which might be described as communications between men and machines. Another subcommittee concerned with communications between men and machines is X-3.4, on Common Problem Oriented Programming Language. The X-3.3 subcommittee is concerned with data transmission problems, which might be described as communications between machines. X-3.5, concerned with Terminology and Glossary, and X-3.6, concerned with Problem Definition and Analysis, represent problems of communication between men about machines. These seven subcommittees are chaired by representatives of the companies that comprise the Manufacturers Group, together with one chairman from the Navy Department and one from the General Interest Group representing the Association for Computing Machinery. In numeric order, let us next examine the scope, the approach and the progress of each of the subcommittees of X-3. The scope of the X-3.1 subcommittee has been defined as the development of humanly legible character sets for use as input/output for data processing systems and the interchange of information between data processing and associated equipment. Considerable work has been done over the past few years in this field by the Retail Merchants Association and others, and this work has been followed up and expanded upon by X-3.1. Initially, the work of this group has been concentrated in the numeric area, where a standard is so badly needed and, at the same time, is probably easier to develop. As you probably know, there are several optical readers on the market today, each using its own unique character font, and a standard font, both for numbers and letters, could do much to advance the state of the art.
This group has a two-pronged problem. If the standards are set low in quality as to format, size, density or other printing characteristics, the optical reader will be comparatively expensive to produce. If, on the other hand, the standards are set high, the reader may be cheaper, but the printing devices and imaging media may be higher in cost. Achieving a proper balance is the big problem confronting this subcommittee. The X-3.1 subcommittee approached its problem by dividing the work among three task groups. The first group will determine the proposed measurements, specifications and terminology for the font; the second group is concerned primarily with printing capabilities and the parameters of printing devices; while the third group will study retail requirements and priorities and will evaluate other requirements with similar or different problems. Interest and participation in the X-3.1 effort have been very high, with about 50 people working actively. They have been holding meetings every 4 to 6 weeks and are confident of measurable progress within the current year. Hopefully, they will have a numeric font ready for consideration soon. The scope of the X-3.2 subcommittee provides that they will develop standards for coded character sets and data record formats to facilitate communications both within and between data processing systems. Here the problem is primarily machine-to-machine communication, rather than man-to-machine, as with X-3.1. Today there are over 65 different machine codes used world-wide, with over 50 different ones used in the United States. Frequently the difference between these codes may appear to be minor, although such differences may have a major impact. One example is the order in which the code sequence places the numbers and letters. Some codes put the alphabetic characters first, followed by the numbers, then followed by the machine-function codes such as carriage return, back-space, upper-case, lower-case, etc.
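The practical impact of code ordering can be illustrated with a small sketch. The two code tables below are hypothetical (they are not any actual machine code of the period) and differ only in whether letters or digits receive the lower code values; sorting the same records by code value then yields two different sequences.

```python
import string

# Two HYPOTHETICAL code tables, invented for illustration only.
LETTERS = string.ascii_uppercase
DIGITS = string.digits


def make_code(groups):
    """Assign consecutive code values to characters, group by group."""
    table = {}
    for group in groups:
        for ch in group:
            table[ch] = len(table)
    return table


CODE_LETTERS_FIRST = make_code([LETTERS, DIGITS])  # A..Z then 0..9
CODE_DIGITS_FIRST = make_code([DIGITS, LETTERS])   # 0..9 then A..Z


def collate(records, code):
    """Sort records by the numeric code values of their characters."""
    return sorted(records, key=lambda rec: [code[ch] for ch in rec])


records = ["A10", "9Z", "B2", "1A"]
by_letters_first = collate(records, CODE_LETTERS_FIRST)
by_digits_first = collate(records, CODE_DIGITS_FIRST)
print(by_letters_first)  # ['A10', 'B2', '1A', '9Z']
print(by_digits_first)   # ['1A', '9Z', 'A10', 'B2']
```

A file sorted on one machine would therefore appear out of sequence on another, which is one way a seemingly minor difference between codes can have a major effect.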
Other codes change this order or reverse it. There is also the problem of a standard representation of the characters in the bit structure of the magnetic tape, or the punched holes in cards or tape. Obviously, if standards proposed by this country are to be accepted internationally, they must make provision for alphabets other than English, and must not differ too greatly from codes used by European or other countries. The approach adopted by the X-3.2 subcommittee is in three phases: the first task group is to determine the alphanumeric characters, symbols and control characters desired; the second group will develop the detailed code representation for such characters, symbols and functions; and the third group will develop a standard format for utilizing the standard coding. The first two of these groups have met their targets (with what accuracy is yet to be determined) and the third group is now working actively with a joint input/output group. The X-3.2 subcommittee has completed the development of a recommended American Standard code for information interchange and has submitted it to the X-3 Committee for processing. In turn, X-3 has submitted the recommendation for balloting by the X-3 members. In the meantime, through conferences with groups in Europe, the X-3.2 subcommittee is considering revisions or modifications that would make the proposed standard more acceptable as an international code. The pros and cons are being actively discussed, and we may or may not have an American Standard in this area within the current calendar year. The X-3.3 subcommittee on Data Transmission is concerned with the determination and definition of parameters governing the operational action and reaction between communications systems and the digital generating and receiving systems utilized in data processing.
From an overall business standpoint, this is a relatively new problem that has been under active development for only four to six years, although the military have been working on it for much longer. Previously, translations from data processing codes to codes suitable for transmission have been done manually. Today there are standards in the field of voice and telegraph transmission, but very little has been done beyond this. Radio and television raised questions on line or channel quality and width, and the advent of the computer raised questions of relative costs. Many users of data processing equipment believe that the economical use of such equipment requires the centralization of the data processing activity. This immediately imposes requirements for economical data transmission and the need for an effective interface with data communication. Although X-3.3 was somewhat slow in getting under way, it has now organized its effort under five groups. The first group will handle liaison with other interested groups; the second group will develop a glossary of special terms relating to this subject; the third group will document error detection and control techniques; the fourth group will try to specify the system aspects of data processing terminal equipment to be connected with communications equipment; and the fifth group will do research into systems performance characteristics. In spite of a slow start, X-3.3 has made good progress, largely because of excellent cooperation with the related subcommittee under the X-6 Sectional Committee sponsored by EIA. For several years EIA has had a group, known as EIA TR/27.6, working on problems in this area. Through this cooperative approach X-3.3 has now had its proposed standard on signalling speeds for data transmission equipment approved by the X-3 Sectional Committee and the ASA as an American Standard. This is the first American Standard to result from the X-3 program.
The scope of X-3.4 has been described as follows: "Standardization and specification of common programming languages of broad utility, with provision for revision, expansion and strengthening, and for definition and approval of test problems." This area is one of the most difficult and probably of the greatest concern to the data processing community because of the increasing costs of "software." It will be noted that the scope encompasses both the business-type languages and the scientific-engineering type languages. In fact, it probably also includes the so-called "command and control" languages of the military under the title of "problem-oriented." There have been many users groups active in the development of programming languages, and it is expected that X-3.4 will utilize much of this work that has gone before. At the present time the subcommittee is considering COBOL, ALGOL and FORTRAN, the three most widely used programming languages. The X-3.4 subcommittee has been organized into six working groups, the titles of which will suggest the areas of work assigned: Working Group 1 is concerned with language theory and structure, WG 2 with ALGOL and specification languages, WG 3 with FORTRAN, WG 4 with COBOL and language processors, WG 5 with international problems, and WG 6 with programming language terminology. The complexity of this area makes it probable that there will be some overlap within the subcommittee and with other groups, and that progress may be somewhat slower in spite of the most dedicated and sincere effort of the participants. The X-3.5 subcommittee is concerned with problems of man-to-man communications under the following two-part scope: (1) to recommend a general glossary of information processing terms, and (2) to coordinate and advise the subcommittees of ASA X-3 in the establishment of definitions required for their proposed standards.
This group has recognized an overlap with work done in this field by others; for example, a joint glossary consisting of over 2,800 words or terms has been compiled by the ACM, the AIEE and the IRE. X-3.5 expects to use this as a base and to coordinate its efforts with those of other groups, including one in the Federal Government under the aegis of the Interagency Data Processing Committee, and a similar joint effort in Great Britain. X-3.5 is a comparatively small group with strong participation from the User and General Interest groups on X-3. Much work has already been done in this field, and it is now largely a matter of collating, refining and editing, and of putting the product in prescribed form for submission as a proposed American Standard. New users of, or prospects for, electronic data processing systems are frequently surprised to find that there are still no standard, accepted ways of defining data processing applications. This is the field in which X-3.6 has defined its scope as: (1) the development of standardized survey techniques, (2) the standardization of flow charting symbols, and (3) the development of narrative, symbolic and quantitative methods of presenting results to top management, data processing customers and data processing operators. Work along this line has been done by the separate companies and a good bit by government agencies. The Federal Government, through the Interagency Data Processing Committee, has developed guide lines and criteria for feasibility studies, application studies, and flow charting techniques and symbolization. There is good reason to believe that these efforts will have a strong influence on the work of the X-3.6 subcommittee. It is also hoped that active interest and participation by educational institutions can be stimulated. X-3.6 has subdivided the work into four task group assignments.
The first is concerned with methodology, the second with input/output data and file description, the third with data transformation, and the fourth with nomenclature and flow charting. So far, the greatest tangible results have been in the work of Group 4, which is hopeful of having a proposed standard for flow charting symbols ready soon for consideration. In a recent indication of the dynamic nature of industrial standardization, ASA abolished the obsolete X-2 Sectional Committee on office standards and assigned its project on charting paperwork procedures to X-3 and X-3.6. The latest of the subcommittees is X-3.7, which is concerned with magnetic ink character recognition. Work in this field is well along, and this is recognized in the scope, which is described as follows: (1) development of standards for magnetic ink character recognition (MICR) for present and future use, and (2) resolution of problems arising in industry and the market place involving manufacturers and printers. Under the aegis of the American Bankers Association and interested manufacturers, the magnetic ink character recognition font, known as E-13B, has been adopted by the American banking industry. Therefore, the X-3.7 subcommittee is following an approach which might be thought of as a maintenance program rather than a development program. They propose to (1) determine the best common method for handling miscoded documents, (2) resolve a standard location for check serial numbers, and (3) eliminate extraneous magnetic printing on the clear band of checks. The X-3.7 subcommittee is hopeful of getting the de facto standard MICR font processed and accepted as an American Standard and thereafter submitted for consideration as an International Standard. It has been approved by X-3 and is now submitted for processing through ASA.
By design, this structure of the X-3 Sectional Committee closely resembles the organization of ISO Technical Committee 97 on Computers and Information Processing. There are six subcommittees and one working group of TC97 at the international level, with titles and scopes quite similar to those of the X-3 subcommittees: SC1 for multi-lingual glossaries; SC2 for coded character sets; SC3 on both optical and magnetic character recognition; SC4 on input/output media standards; SC5 on programming languages; SC6 on data transmission; and WG1 on Problem Definition and Analysis, including flowcharts. It should be emphasized that although they are organizationally similar, national and international standardization differ considerably in character. National standardization activity can involve development of standards where need exists and accepted practice or appropriate developmental facility does not, as in the aforementioned character code case. International standardization, on the other hand, tends to be legislative in nature, with the work of TC97 and its groups devoted to the processing of national proposals that represent local standards or practice. Little if any standards development is foreseen or considered in the international activities of TC97. In addition to considering national standards proposed for ISO consideration, TC97 also accepts documented proposals from official liaison organizations of an international nature, such as the European Computer Manufacturers Association and the International Federation for Information Processing. Proposed international standards involving electrical characteristics are processed by the IEC Technical Committee 53, Computers and Information Processing, and its four subcommittees: A for Input/Output Equipments; B for Data Communications; C for Analog Equipments in Digital Systems; and D for Input/Output Media.
Counterpart interests are included in the scope of the ASA X-6 Sectional Committee. Where logical, physical and electrical factors in a standardization proposal cannot be isolated, as in such input/output media as magnetic tape, TC97, TC95 on Office Machines, and TC53 have made provision internationally for joint WG D and SC 53D work. Nationally, X-3 and X-6 have joined forces in three joint task groups for consideration of input/output media standards for magnetic tape, perforated tape, and punched cards. Other cooperative efforts are provided for as need arises, such as in programming languages and character sets, and between all special technical areas and the glossary activities. Two final comments concern the basic interests and intent of the national and international standardization efforts. First, there is no desire to develop or establish standards for the sake of standardizing. The only justifiable reason for standards in information processing is need, as expressed by the users and manufacturers of computers and devices, and those affected by such equipment. Second, in order to assure that such need is properly expounded when it exists, and that resultant standards represent acceptable solutions to such needs, the industry, the users and the general interests must participate fully and with qualified, active representation. Standards must accurately represent the predominant practices or wishes of the entire information processing community. Those of you who are active in the data processing community are probably familiar with the magazine DATAMATION and may have read the feature article on the ASA X-3 Sectional Committee in the February 1962 issue. Although the article is rather critical of the lack of progress made through the end of 1961, it is generally factual and, with minor exceptions, gives a good picture of this first year of the Standards Program.
In spite of its rather critical tenor, the article concludes with this statement: "A more realistic point of view is that standards activities are, by their very nature, methodical, plodding and, subsequently, quite permanent in their effect. It should be clearly understood that the biases, politics and frictions which come to play and may seem to impede the effort are, in fact, expressions of legitimate interests which comprise one of the most important aspects of the deliberations involved in setting and maintaining a standard."

HIGH-SPEED FERRITE MEMORIES

H. Amemiya, H. P. Lemaire*, R. L. Pryor, T. R. Mayhew
Radio Corporation of America
Camden 8, N. J.

*RCA, Needham, Mass.

INTRODUCTION

For several years ferrite cores [1,2] have constituted the mainstay of computer storage memories. The typical computer today [3] has a transistor-driven core memory which operates in a coincident-current mode [4] with a 5-to-10-μsec cycle time. Although a coincident-current storage unit with a cycle time approaching 2 μsec has been built, higher speeds have been attained by exploiting so-called partial-switching modes of operation. Word-address memory systems [5] with cycle times less than 1 μsec and as low as 0.7 μsec have been reported [6,7]. To attain these speeds with conventional 50/30 cores (that is, cores of 0.050-inch outer diameter, 0.030-inch inner diameter) drive requirements are necessarily high (approximately 1 ampere-turn), and particular attention must be given to the physical arrangements of conductors, sense windings, and storage elements. Actually, short cycle times have been realized by the development of fast-switching storage elements which operate in impulse switching modes using high drives, and by minimizing such factors as propagation time, field transients, and mutual-coupling effects to reduce the duration of the unproductive phases of the memory cycle. The reduction of these phases has assumed increasing importance as the operating speeds of memories have increased. Array geometry, timing operation, and the relative positioning of components have become prime factors in the determination of the minimum access time of a memory.

The use of permalloy thin films as high-speed storage elements has recently received a great deal of attention [8,9,10]. A small memory with a cycle time of less than 0.5 μsec has been operated [11], and cycle times shorter than 1 μsec appear generally feasible in memories of larger capacity [12]. In addition to their high-speed potentialities, thin-film memories can be fabricated in large sheet arrays at relatively low cost. (This fabrication technique has not been fully developed at the present time.) Disadvantages of thin-film memories include high drive-current requirements (0.5 to 1 ampere) and low bit outputs (less than 5 mv) [9]. Large arrays operating at high speeds may, in fact, be impracticable, because the discrimination between the low output signal and the stack noise becomes increasingly difficult as memory capacity is increased.

Ferrite pieces utilizing closed magnetic paths of miniature dimensions offer obvious advantages for high-speed memories [13]. Outputs and switching speeds can be kept high, while at the same time drive currents are kept low.

This paper presents the results of a program aimed at developing ferrites capable of operation in memories with cycle times less than 0.5 μsec and at bit costs competitive with those of slower, more conventional arrays. These goals have necessitated the solution of two associated problems: first, the development of low-drive, fast-switching memory cells; and second, the development of methods for the assembly of these cells into arrays economically.
Memory Organization

High speeds have been achieved utilizing a word-address, two-core-per-bit memory organization. Linear selection (word-address) memory schemes are well established as a means of obtaining increased memory speeds since, in contrast to coincident-current methods, readout currents of unlimited magnitude can be used. (Currents are then determined by transistor driver limitations rather than by core or memory organization.) Linear selection provides a second, important means of attaining high speeds by making it possible to use high-amplitude, short-duration write pulses. Narrow pulses such as these switch a minimum of flux by themselves, although their amplitude is substantially greater than the normal switching threshold; when added to the exciter or digit pulses, however, they are capable of switching a significant amount of flux [14]. As memory cycles are reduced, a point is reached where two-core-per-bit operation becomes a necessity if practicable signal-to-noise ratios are to be maintained. This comes about for two reasons: first, as the write pulse is made increasingly narrow, a very small fraction of the core is being switched, so that upon readout the difference between a 1 and a 0 is very small; and second, as the read pulse is made narrower and the rise time decreased, the contribution of reversible magnetization changes becomes an increasingly significant fraction of the total output. In addition, the peaking time of the core rapidly approaches the time at which the reversible flux peak occurs, so that the two peaks merge into one. Figure 1 illustrates, qualitatively, the output waveforms for two cases, each involving partial switching modes differing essentially in the amount of flux switched and the switching times. Figure 1a illustrates the performance of a core with a switching time of 200 nsec, usable in a memory with approximately a 1 μsec cycle time. Figure 1b illustrates the case for higher speed situations.
Here, less flux was written into the core, the read duration and rise times were decreased, and the switching time was reduced to 50 nsec. When the switching time is short, even with precise strobing, the difficulty of obtaining sufficient discrimination is evident from Figure 1b. Two-core-per-bit operation provides a means of cancelling out the reversible flux contribution to the total output.

Figure 1. Output Waveforms vs. Switching Times. (a) Switching time t_s = 200 ns; (b) t_s = 50 ns.

Figure 2 illustrates four possible two-core-per-bit schemes which differ only in the way the digit pulses are applied to the bit. In Figure 2a, bidirectional digit pulses pass through both cores in the bit. In Figure 2b, a digit pulse passes through both cores to write a 1; there is no digit pulse when writing a 0. In both 2a and 2b, digit pulses "add" to the partial-write pulse in one core and "subtract" from the partial-write pulse in the other. In 2c and 2d, digit pulses have but one direction, which is the same as that of the partial-write pulse. In cases 2a and 2c, the read outputs are bipolar, whereas in 2b and 2d the outputs are unipolar.

Figure 2. Four Digit Drive Techniques and Sense Signal Waveforms.

Only two of these schemes (2c and 2d) were in fact used in this system. Unidirectional bit drives were used exclusively when it became evident that bidirectional digiting led to an increase in digit-disturb sensitivity. This effect was particularly evident when each core in the bit was individually examined. Test results showed that a core which received a digit pulse opposite in direction to the partial-write pulse had a lower disturb threshold than a core for which write and digit pulses were in the same direction. This effect and similar phenomena have been reported elsewhere [6]. Henceforth, in this article, Figure 2c will be referred to as the bipolar sensing scheme. The situation drawn in Figure 2d, which differs from 2c in that only one core is digited (output of single polarity), will be termed an amplitude sensing scheme.

186 / High-Speed Ferrite Memories

Bit Characteristics

Bit evaluation was performed using a two-wire system with common sense-digit wires and common read-write wires. A schematic illustration of the test setup is indicated in Figure 3. In bipolar operation, digit driver "one" is turned on to write a 1 and driver "zero" to write a 0. In the amplitude sensing mode, driver "one" is again turned on to write a 1; driver "zero" is not needed and is simply turned off.

Figure 3. Test Set-Up (Bit Evaluation).

Amplitude Sensing: Cores which were tested with the amplitude sensing mode were subjected to the series of input pulses as shown in Figure 4a. This sequence was chosen to produce the worst signal-to-noise ratio; in this case, this ratio is the ratio of the amplitude of the undisturbed 1 to the amplitude of the disturbed 0 (uV1 : dVz). The particular order of the sequence, undisturbed 0 voltage (uVz), undisturbed 1 voltage (uV1), disturbed 1 voltage (dV1), disturbed 0 voltage (dVz), was intended to bring out instability in the remanent state on readout, if instability existed. If readout were incomplete, the dVz which follows a dV1 would have its highest value, and a uV1 following a uVz would have its lowest value. The four read outputs were superimposed to obtain a simultaneous oscilloscope display of the type shown in Figure 5.
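As a small illustration of the ratio just defined, the sketch below computes uV1 : dVz from the four read outputs of one test sequence. The function name, the dictionary layout, and the millivolt values are hypothetical and are not measurements from the paper.

```python
def worst_case_ratio(readouts_mv):
    """Return the amplitude-sensing signal-to-noise ratio uV1 : dVz.

    readouts_mv maps the four read outputs of the test sequence
    (uVz, uV1, dV1, dVz) to amplitudes in millivolts.
    """
    disturbed_zero = readouts_mv["dVz"]
    if disturbed_zero <= 0:
        raise ValueError("disturbed 0 amplitude must be positive")
    return readouts_mv["uV1"] / disturbed_zero


# Hypothetical amplitudes, chosen only to show the arithmetic.
example = {"uVz": 5.0, "uV1": 50.0, "dV1": 45.0, "dVz": 5.0}
print(worst_case_ratio(example))
```

Only uV1 and dVz enter the ratio; the other two readouts are carried along because the test sequence produces all four.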
The waveforms shown were obtained using cores of 0.050-inch outer diameter and 0.010-inch inner diameter, with the drive pulse characteristic shown in each figure. The switching time, taken at the 10-percent points, is about 70 nsec. The relatively small difference between disturbed and undisturbed signals is an indication that the digit-disturb pulse has minimum disturbing effect on the core.

Figure 4. Pulse Test Programs and Outputs. (a) Amplitude Sensing; (b) Bipolar Sensing.

Figure 5. Output Waveforms (Amplitude Sensing). Bit output in mv; 10% point marked. Temperature = 25°C; core = 50 mil O.D., 10 mil I.D. Drive pulses: read 325 ma, 100 ns duration, 25 ns rise time; write 250 ma, 50 ns, 25 ns; digit 60 ma, 90 ns, 25 ns.

Figure 6a shows the effects of variations in the digit and partial-write current amplitudes for fixed current-read conditions. The 1 and 0 outputs are actually undisturbed 1's (uV1) and disturbed 0's (dVz), respectively. Figure 6b shows the signal-to-noise ratios which serve as a guide in the determination of optimum drive-pulse characteristics. For the particular durations and rise times used in this case, a range of workable digit and partial-write current levels is available. In Figure 6, for example, it is apparent that a digit current in the 60- to 70-ma range and a partial-write current of 200 to 220 ma would give a 1 output of 40 to 65 mv at signal-to-noise ratios of 9 or 10 to 1. Switching times in this case are in the order of 80 nsec.

Figure 6a. Changes in Bit Output with Variations of Digit and Partial-Write Current Amplitudes. Axes: bit output (mv) vs. digit current (ma).

Figure 6b. Signal-to-Noise Ratio vs. Digit and Partial-Write Current Amplitudes. Axes: signal-to-noise ratio vs. digit current (ma).

The drive pulse and output characteristics indicated in Figures 5 and 6 are typical of cores with a 10 mil inner diameter and 50 mil outer diameter. Switching times and read-write cycle times can be shortened by increasing the read pulse amplitude and decreasing rise times and durations. Shortened rise times, however, bring about increased back voltages per bit during readout with little change in net signal output. This happens, of course, because as the rise time is decreased the reversible flux contribution to the total output is increased. Figure 7 illustrates the effect of decreasing the rise time of the read pulse while maintaining its amplitude fixed. As is evident from the diagram, the total back voltage per bit decreases from 250 mv to 150 mv when the rise time is increased from 13 to 40 nsec. The decrease in signal is in comparison very small; in fact there is no change in signal when the rise time is changed from 30 to 40 nsec. These facts are of obvious importance if long memory words come into consideration, since one can bargain for lower back voltages on the word lines by increasing rise times and memory cycle times but with very little loss in bit output.

Figure 7. Effect of Read Pulse Rise Time on Core and Bit Output.

Bipolar Sensing: In the bipolar-sensing method (Figure 4b), core A is digited to write a 1, core B to write a 0. Thus, in the test program shown, the first series of read outputs are undisturbed 1's. The last readout on the right is a disturbed 0 which is termed the "lowest 0."
In this case, although core B is raised to the higher flux state by superimposing digit and partial-write pulses, the disturb pulses are applied to core A. (If the pulse patterns in the two cores were reversed, the initial series of readouts would be undisturbed 0's and the last readout a disturbed 1.) Actually, the pulse sequence is a variation of that used for amplitude sensing and was chosen to give the worst-case conditions (i.e., by promoting the lowest 0 or the lowest 1). As before, the reasoning was that if the read pulse amplitude or duration was insufficient to bring the core back to a stable remanent state, a 0 would have its lowest magnitude if it followed a series of 1's. A disturbed 0 obtained under these conditions, then, should be the lowest 0.

This particular pulse sequence also aggravates the worst-case condition, because the high pulse repetition rate (2 mc) increases core temperature. In one of the materials examined, both cores become heated to about 20°C above room temperature. Core A, however, undergoes more flux reversals per cycle than core B, and becomes slightly warmer than core B. The temperature difference between the cores was found to be in the order of 5 to 7°C. This temperature difference tends to increase the 1 output and decrease the 0 output. Because core A is warmer, it not only has an increased output, but also a reduced coercivity which tends to lower the digit-disturb threshold.

The temperature effect was easily observed by switching from the program shown to one producing a series of undisturbed 0's followed by a disturbed 1. For about 3 seconds after the program change, the output amplitudes are unstable. Immediately after the program change, the undisturbed 0 is low and gradually increases to a higher stable value. Initially, the disturbed 1 is relatively high and stabilizes at a lower value.
If the pulse program is again reversed, the undisturbed 1 is low initially, and gradually increases; the disturbed 0 is high initially and gradually decreases.

Figure 8 shows the effect of digit-current amplitude on output for a given material and for a given set of partial-write and read conditions. The upper curve is a plot of undisturbed 1 (or undisturbed 0) outputs as a function of digit-current amplitudes. The lower curve shows the effect of digit amplitude on a disturbed 0 or disturbed 1. As expected, the undisturbed output gradually increases as the digit current level is increased. The disturbed output, on the other hand, goes through a maximum (in this case at about 90 ma). The decrease in 0 output beyond this point occurs because the disturb threshold of the core has been grossly exceeded at this digit-current level. This is particularly evident from the figure in that the disturbed curve is now "split," depending upon whether 8 or 62 disturb pulses were in the pulse program. Point E in Figure 8 is the

Figure 8. Bit Output vs. Digit Current (Bipolar Sensing). Axes: bit output vs. digit current (ma). Core = 50 mil O.D., 10 mil I.D.

e generated. Ideally it might be desirable to display complete encyclopedic knowledge about all items appearing in an association map, including, for example, information about the environment, property lists, homographic uses, and so on. In the absence of a complete semantic dictionary, a compromise is usually made in actual systems, either by restricting the relations to be recognized automatically to certain very specific types [8-10], or else by providing extensive lists of relationships which are not, however, recognized fully automatically [11,12].
In a practical automatic system, however, it would seem necessary on the one hand to distinguish more than a few specific types of relations, and on the other to perform the recognition procedure automatically without benefit of extensive dictionaries. These requirements would suggest that a small number of restricted dictionaries be used together with other indications provided by the linguistic context. The following linguistic indicators might provide important information: a) prepositions and other function words; b) affixes of various kinds; c) special quantifiers such as "many," "some," "every"; d) logical connectives such as "and," "or," "not"; e) special referents such as "like," "these," "those"; f) special linguistic units such as "the fact is," "it is claimed," "it is hoped."

Prepositions and other function words have been used in the past for the extraction of semantic indications [13-15]. More general forms of syntactic analysis may be used similarly for the recognition of word relations [15-17]. In the next section, a method is presented for the recognition of word associations by means of a simple form of syntactic analysis.

236 / Some Experiments in the Generation of Word and Document Associations

A Simple Syntactic Method for the Recognition of Word Associations

Dictionary Look-up and Suffix Analysis: The method to be described takes ordinary English text and assigns a unique part-of-speech indicator and a unique syntactic function indicator to each word. These indicators are then used to assemble words into phrases and phrases into clauses. The complete program is described in the flowchart of Figure 1.

Figure 1. Simplified Syntactic Analysis Program. Flowchart steps: type and verify input text; itemize text words and assign serial numbers; look up items in dictionary and assign syntactic indicators; for items not found in dictionary, use suffixes to assign syntactic indicators; for items found in dictionary or with a recognizable suffix, match predicted syntactic functions with syntactic indicators and assign complete syntactic information; for items without suffix, assign syntactic information corresponding to most probable prediction; use syntactic information to determine word associations.

The input text is first transferred onto magnetic tape, and the individual words are separated and provided with a serial number. A small dictionary is then used to assign syntactic part-of-speech indicators to those words which are included in the dictionary. A dictionary excerpt is shown in Figure 2. It may be noted that each dictionary item is provided with as many syntactic indicators as there are possible applicable parts of speech. For example, the word "weekly" reproduced as the first item in Figure 2 is furnished with three indicators (HO, AO, N1), representing respectively adverb, adjective, and noun-singular indications. Certain semantic indicators are also included in the dictionary. The "TO" semantic indicator, for example, represents a time indication. Other semantic indications also included denote motion, location, direction, dimension, value, duration, and so on.

Figure 2. Dictionary Excerpt. Columns: input item; semantic indicator (2nd syntactic class); serial number; input item and syntactic indicators (1st and 3rd syntactic classes). Items shown: WEEKLY, WEEKS, WELL, WENT, WERE, WEST, WHAT, WHEN, WHENCE, WHENEVER.

The dictionary presently used contains about four hundred function words, such as prepositions, conjunctions, adverbs, articles, and so on, and about five hundred common nouns, verbs, and adjectives. The dictionary composition is shown in detail in Table 1. Experimental evidence indicates that for the technical texts tested approximately fifty-five percent of the words are found in the dictionary, so that forty-five percent of the words are still left without any syntactic indication after dictionary look-up.

In order to generate part-of-speech indications for the words not found in the dictionary, a suffix analysis program is used. Specifically, an attempt is made to detect a recognizable suffix for each word by comparing the word endings with a list of suffixes contained in a suffix table. Suffixes of seven letters are tested before suffixes of six letters, and so on, down to suffixes of one letter. Special provisions are made to detect plural noun forms and third person singular present forms for verbs. When a suffix match is obtained, the part-of-speech indicators included in the suffix table are attached to the corresponding text words. An excerpt from the suffix table is shown in Table 2. It should be noted that many words include "false" suffixes which, when looked up in the suffix table, would provide incorrect information. For example, the words "king," "wing," and "sing" would be provided with "gerund" or "present participle" indicators because of the "ing" ending. Such words are considered to be exceptions; they are included for the most part in the regular dictionary so as to eliminate the suffix analysis. Experience has shown that the suffix
analysis provides syntactic information for an additional thirty percent of the words found in technical texts. After the suffix analysis, less than fifteen percent of the words are thus left without any grammatical information. A small piece of suffix analyzed output is shown as an example in Figure 3. It may be noticed that all items found in the dictionary, or furnished with a recognizable suffix, are provided with syntactic indicators as a result of the table look-up procedures. The words "atmosphere" and "alarm" in the excerpt of Figure 3 could not be classified by any of the available methods, and are therefore left without grammatical indication.

Table 1. Dictionary Composition
Columns: syntactic indications; no. of words in dictionary*; percentage of total. Total words: 963.
1) VBEF: verb predictions take precedence over noun complement predictions.
2) IOBJ: verbs may take indirect object.
3) V3TI: verbs may take an infinitive verb complement (ought, need, dare, used).
4) NAMS: cannot fulfill ADJ MAS or NADJ CM predictions (this, those).
* Items exhibiting more than one syntactic indicator are listed several times.

Table 2. Excerpt from Suffix Table

NO. OF CHARACTERS   SUFFIX    SYNTACTIC INDICATOR*
7                   SORBING   PRP
7                   DENTIAL   ADJ
7                   NENTIAL   ADJ
7                   RENTIAL   ADJ
7                   TENTION   NN1
7                   TURBING   PRP
7                   SENTIAL   ADJ
6                   ANSION    NN1
6                   ANTIAL    ADJ
6                   ENSION    NN1
6                   ENTIAL    ADJ, NN1
* Exceptions appear in the dictionary.

Figure 3. Suffix Analyzed Output. Columns: input item; serial number; input item and syntactic indicators; symbols identifying the procedure used to generate the syntactic indicators (item found in dictionary, syntax generated by suffix analysis, no syntactic information available).

Predictive Syntactic Analysis: Following the suffix analysis, a simplified predictive syntactic analysis is used to assign to each word a unique part-of-speech indicator and a unique syntactic function [18,19]. Specifically, each sentence is scanned from left to right, one word at a time. At each point, predictions are made concerning the syntactic structures to be found later in the same sentence. When a new word is considered, its associated grammatical indicators are tested against the unfulfilled predictions available at that time. If a match is found between any of the predicted grammatical functions and the available syntactic indicators, the first matching grammatical function and the associated part-of-speech indication are assigned to the corresponding word in the input text. The accepted prediction is then deleted, and further predictions are made for new structures to be expected later in the same sentence. If no match can be found between any unfulfilled prediction and available grammatical indication, as is the case when words are tested which have no associated grammatical information, the most likely prediction is used to generate acceptable grammatical indicators for the corresponding input items.

The predictive analysis makes use of information stored in two separate tables, the grammar table and the prediction table. The grammar table lists predicted syntactic functions against the syntactic indicators which can fulfill each prediction, and the prediction table lists accepted syntactic indicators against new syntactic functions to be predicted as a result of a match. Excerpts from the grammar and prediction tables are shown in Tables 3 and 4 respectively.

238 / Some Experiments in the Generation of Word and Document Associations

Table 3. Excerpt from Grammar Table

SYNTACTIC FUNCTION PREDICTED   SYNTACTIC INDICATORS SATISFYING THE PREDICTION
SUBJECT                        ART, ADJ, NN1, NN2, PRP
ADJ MAS                        NN1, NN2, ADJ
OBJT VB                        ART, ADJ, NN1, NN2
PRED HD                        VB1, VB2, VB3, VB4
NADJ CM                        NN1, NN2
VERB CM                        VB1, VB3, VB4, PRP, PAP, PRI
ADVB MS                        ADJ
PREP CM                        ART, ADJ, NN1, NN2, PRP
ADVB ES                        ADV
PREP ES                        PRE
END SEN                        PCT

Table 4. Excerpt from Prediction Table Generated by Accepted Syntactic Indicators

ACCEPTED SYNTACTIC INDICATOR   SYNTACTIC FUNCTIONS PREDICTED
ART                            ADJ MAS, ADVB ES
ADJ                            ADJ MAS, NADJ CM, PREP ES
NN1                            NADJ CM, PREP ES
NN2                            NADJ CM, PREP ES
PO1                            NADJ CM, PREP ES
PO2                            NADJ CM, PREP ES
PO3                            NADJ CM, PREP ES
PRP                            (NADJ CM) 1), VERB CM, PREP ES, ADVB ES, OBJT VB
COM                            last prediction satisfied by function; certain essential items (SCL SUB, PRED HD, INFINTY) 2)
CON                            SUBJECT, PRED HD, (DUP CNJ) 3), INFINTY, (ADVB MS) 4)
PCT                            SUBJECT, PRED HD, INFINTY, REL CLS, END SEN
1) When accepted as "PREP CM."
2) When not occurring before "CON."
3) When exhibiting supplementary "DUPC" code.
4) When accepted as "DUP CNJ."

Consider, for example, the conditions which obtain at the beginning of a sentence. The functions predicted initially are "subject," "predicate," and "end of sentence," in that order. Suppose that the first word in the sentence has an associated "article" indicator. The grammar table is then consulted to find out whether the first prediction (subject) matches the available part-of-speech indication (article). The first entry in Table 3 shows that the subject prediction is fulfilled by "article," "adjective," "noun singular," "noun plural," and "present participle" indicators. A match is therefore found, and "subject" and "article" are assumed to be the correct grammatical function and part-of-speech indication for the first word in the sentence. The "subject" prediction is then erased from the list of unfulfilled predictions, and a number of new predictions, as indicated by the prediction table, are added to the list. Since the "article" indicator was accepted as correct for the first word, Table 4 shows that "adjective master" and "adverb essence" predictions are to be added to the list of predictions. The list of predictions operates, with minor modifications, as a pushdown store, so that new predictions are always added at the top of the list. When the second text word is about to be processed, the list of unfulfilled predictions therefore contains "adjective master," "adverb essence," "predicate," and "end of sentence" predictions in that order. Subsequent words are processed in the same manner until the end of the sentence is reached, at which point the list of predictions is cleared and initial conditions are restored.

The analysis of a short sentence is shown as an example in Table 5. The generation code indicates whether any syntactic indicators were available at the start of the predictive analysis, and, if so, how they were generated. The column labelled "predictor serial" contains the serial number of the item which was originally used to generate the prediction fulfilled by the current word. Thus the word "tends" in Table 5 was recognized as a verb fulfilling the "predicate head" prediction. The predictor serial (237) shows that "predicate head" was an initial prediction originally predicted at the start of the sentence. The "syntactic function" and "predictor serial" columns of Table 5 are used later in the bracketting procedure which generates phrases and clauses.

Table 5. Analysis of a Sample Sentence

INPUT TEXT     GENERATION CODE   SERIAL NO.   SYNTACTIC FUNCTION   SYNTACTIC INDICATOR   PREDICTOR SERIAL
ON             D                 0238         PREP ES              PRE                   *237
A              D                 0239         PREP CM              ART                   238
SUITABLY       D                 0240         ADVB ES              ADV                   239
SCALED         S                 0241         ADJ MAS              PRP                   239
GRAPH          NI                0242         NADJ CM              NN1                   241
,              D                 0243         [illegible]          COM                   *237
ANY            D                 0244         'PREP CM             ADJ                   243
RISING         D                 0245         ADJ MAS              PRP                   244
EXPONENTIAL    S                 0246         'OBJT VB             ADJ                   245
CURVE          NI                0247         ADJ MAS              NN1                   246
TENDS          D                 0248         PRED HD              VB2                   *237
TO             D                 0249         VERB CM              PRI                   248
PRODUCE        D                 0250         INF BSE              VB1                   249
AN             D                 0251         OBJT VB              ART                   250
HYPNOTIC       S                 0252         ADJ MAS              ADJ                   251
EFFECT         D                 0253         ADJ MAS              NN1                   252
OF             D                 0254         PREP ES              PRE                   253
IMPENDING      S                 0255         PREP CM              PRP                   254
CRISIS         S                 0256         NADJ CM              NN2                   255
.              D                 0257         END SEN              PCT                   *237
D: dictionary; S: suffix analysis; NI: no information; * initial prediction; ' erroneous analysis.

By checking the syntactic functions of Table 5, it may be seen that an error was made in the analysis. In particular, the subject phrase "any rising exponential curve" was not properly recognized because the comma (item 0243) predicted a parallel construction consisting of another prepositional phrase. This particular error can be caught since the subject prediction, which is considered to be "essential," is left unfulfilled at the end of the sentence. However, other mistakes are made in the course of a normal analysis which cannot be detected so easily. These errors are due to three principal causes: a) the absence of grammatical information for words not found in the dictionary or in the suffix table; b) the inadequacy in certain cases of the grammar and prediction tables used; c) the large number of syntactic ambiguities present in the language.
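The predictive scan described in this section can be sketched as a small stack machine. The following is a minimal illustration only, not the authors' program: it uses a subset of the entries of Tables 3 and 4, treats the list of predictions as a plain pushdown store (ignoring the "minor modifications" the text mentions), and assumes that the "most likely prediction" for a word with no usable indicators is simply the topmost one. The function names and the sample sentence are hypothetical.

```python
# Subset of the grammar table (Table 3): prediction -> indicators fulfilling it.
GRAMMAR = {
    "SUBJECT": ["ART", "ADJ", "NN1", "NN2", "PRP"],
    "ADJ MAS": ["NN1", "NN2", "ADJ"],
    "PRED HD": ["VB1", "VB2", "VB3", "VB4"],
    "NADJ CM": ["NN1", "NN2"],
    "ADVB ES": ["ADV"],
    "PREP ES": ["PRE"],
    "END SEN": ["PCT"],
}

# Subset of the prediction table (Table 4): accepted indicator -> new predictions.
PREDICTIONS = {
    "ART": ["ADJ MAS", "ADVB ES"],
    "ADJ": ["ADJ MAS", "NADJ CM", "PREP ES"],
    "NN1": ["NADJ CM", "PREP ES"],
    "NN2": ["NADJ CM", "PREP ES"],
}

# Predictions present at the start of every sentence, top of the store first.
INITIAL = ["SUBJECT", "PRED HD", "END SEN"]


def step(stack, indicators):
    """Process one word: return (function, indicator) and update stack in place."""
    if not stack:
        return None, None
    # Scan the unfulfilled predictions from the top (index 0) downward and
    # accept the first one that any of the word's indicators can fulfill.
    for position, predicted in enumerate(stack):
        for indicator in indicators:
            if indicator in GRAMMAR.get(predicted, []):
                del stack[position]                          # erase accepted prediction
                stack[:0] = PREDICTIONS.get(indicator, [])   # push new ones on top
                return predicted, indicator
    # No match, e.g. a word without grammatical information.
    # Assumption: the topmost prediction is taken as the most likely one.
    predicted = stack.pop(0)
    indicator = GRAMMAR[predicted][0]
    stack[:0] = PREDICTIONS.get(indicator, [])
    return predicted, indicator


def analyze(sentence):
    """sentence: list of (word, [indicators]); returns (word, function, indicator)."""
    stack = list(INITIAL)
    return [(word,) + step(stack, indicators) for word, indicators in sentence]


# Hypothetical input, with indicators as they might come from look-up.
sample = [("the", ["ART"]), ("small", ["ADJ"]), ("dogs", ["NN2", "VB3"]),
          ("run", ["VB1"]), (".", ["PCT"])]
for row in analyze(sample):
    print(row)
```

After the first word is accepted as an article fulfilling the subject prediction, the store holds "ADJ MAS", "ADVB ES", "PRED HD", "END SEN", which is the state the text gives for the moment the second word is about to be processed.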
Clearly, nothing can be done about the last of the three listed causes. In fact, even the most elaborate automatic methods for syntactic analysis, including, in particular, those which make use of complete syntactic dictionaries of the language, produce erroneous output in cases of linguistic ambiguity. The question arises whether the performance of the crude analysis here described is much less reliable than other more ambitious programs.

A list of principal error types found in the current analysis appears in Table 6. The analysis of the output indicates an error percentage in the assignment of syntactic functions of 7 to 8 percent primary errors, and 6 percent induced errors. The latter category consists of errors which arise as a result of some preceding error in the same sentence. If, for example, the word "profound" in the phrase "many profound questions ..." is not found in the dictionary, it will be assigned a noun indicator satisfying the subject prediction. When the word "questions" is taken up, the subject prediction will already have been erased, and "questions" is therefore recognized as the third person present singular form of the verb "to question" satisfying the predicate prediction. The latter error is thus induced by the false classification of the adjective "profound."

Table 6. Principal Error Types

ERROR TYPE                                      EXAMPLES
I. Items not entered in dictionary
  1. Adjective <-> noun                         beneficial, remedial, annual, scientific, real, obscure
  2. Noun <-> verb (words in -s)                expenditures, trends, problems; tends, insists
  3. Noun <-> verb (various)                    information, education; generate, assimilate; needed, outstripped, punished
  4. Adverb <-> noun, adjective                 simply, regardless, indeed; absurd, aware
II. Items in dictionary with incorrectly assigned syntactic function
  1. Conjunction <-> adjective, pronoun ("that")
  2. Conjunction <-> adverb ("so")              so far as, so verbose
  3. Relative subject (extra object prediction left in pool)   that, who

In general, local structures such as subject phrases, noun complement phrases, preposition complement phrases, participial phrases, verb complement phrases, and so on, are almost always properly analyzed. Errors arise most frequently in the recognition of coordinated and subordinated structures, and, in particular, in the prediction of parenthetic and parallel constructions. For example, in the sentence "The conceptual, rather than equipment, aspects ..." the first comma correctly predicts a parenthetic expression. The word "than," however, generates among other predictions a subclause prediction which dominates the other unfulfilled predictions. When the second comma is reached, it is interpreted in accordance with the latest prediction as signalling a parallel construction, thus repeating for "aspects" the same assignment as for "equipment."

In order to judge the severity of the handicap arising from the absence of a complete word dictionary, a test run was made with a special dictionary, which included all relevant words occurring in the sample texts. The error rate was found to be reduced by less than forty percent, resulting in a primary error rate of 4.5 percent and an induced rate of about 4 percent. These figures are comparable with error rates produced by other far more complicated syntactic analysis programs. The absence of the dictionary is therefore largely compensated by the effectiveness of the suffix analysis and of the modified predictive techniques. This conclusion is further confirmed by the fact that the bracketting program which produces word associations performs almost perfectly for local structures.
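The suffix analysis credited here can be pictured as a longest-match look-up that runs only for words missing from the dictionary. The sketch below is an illustration under stated assumptions, not the original program: the longer suffix entries are taken from the suffix table excerpt, while the "ING" entry, the dictionary entries for the exception words, and the function name are hypothetical. The special provisions for plural nouns and third person singular verbs are not modeled.

```python
# Suffix table: a few 6- and 7-letter entries from the excerpt, plus a
# hypothetical 3-letter "ING" entry standing in for the shorter suffixes.
SUFFIX_TABLE = {
    "SORBING": ["PRP"],
    "DENTIAL": ["ADJ"],
    "TENTION": ["NN1"],
    "TURBING": ["PRP"],
    "ANSION": ["NN1"],
    "ENTIAL": ["ADJ", "NN1"],
    "ING": ["PRP"],          # hypothetical entry
}

# Words with "false" suffixes are kept in the regular dictionary so that the
# suffix analysis never sees them. The indicators given here are hypothetical.
DICTIONARY = {
    "KING": ["NN1"],
    "WING": ["NN1"],
    "SING": ["VB1"],
}

MAX_SUFFIX_LENGTH = 7


def classify(word):
    """Return (indicators, generation code): D, S, or NI as in Table 5."""
    word = word.upper()
    if word in DICTIONARY:
        return DICTIONARY[word], "D"
    # Test seven-letter suffixes first, then six-letter ones, down to one letter.
    for length in range(min(MAX_SUFFIX_LENGTH, len(word)), 0, -1):
        indicators = SUFFIX_TABLE.get(word[-length:])
        if indicators:
            return indicators, "S"
    return [], "NI"


for w in ["disturbing", "potential", "rising", "king", "graph"]:
    print(w, classify(w))
```

Testing the longest suffixes first means that "disturbing" is matched by "TURBING" before the shorter "ING" is ever tried, and looking in the dictionary first keeps "king" from being classed as a participle.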
Bracketting Program: A number of computer programs are in existence which use the output of an automatic syntactic analysis to produce dependency structures in tree form [14,20,21]. The list of all dependent word groups can be obtained from a dependency tree by following all branches of the tree, and generating groups of words located on the same branch. Alternatively, if the tree structure is not available, the same result can be achieved by using the information provided by the predictive analysis.

Specifically, the syntactic function indicators and predictor serial numbers (columns 4 and 6 of Table 5) are used to produce for each item a chain serial number and a chain function. The chain serial numbers for "subject," "subclause subject," "predicate head" and "preposition essence" are the same as the corresponding predictor serials; for other syntactic functions, the chain serials are given by the predictor serials for the embedding structure. The chain serial is then effectively defined as the serial of the first item in the same phrase, although discontinuous constituents will cause items with the same chain serial to be separated by other extraneous items. For example, preposition complements are assigned the chain serial corresponding to the predictor serial of the preceding preposition. Verb complements are similarly assigned the chain serial for the predictor of the preceding predicate head. Comparable rules apply to the remaining syntactic functions. The chain function assigned to a given word is the syntactic function associated with the predictor of the embedding structure, except that "subject," "subclause subject," "predicate head," "preposition essence," "preposition complement" and "infinity" functions keep their present function indicator.
In order to bring together all members of the same structure, it is only necessary to sort the items in chain serial number order, while at the same time preserving the word order inside each substructure. The structures generated for the sample sentence of Table 5 are represented by brackets in column 4 of the Table. The chain functions assigned to the five brackets are, respectively, preposition complement, preposition complement, predicate head, object of verb, and preposition complement. The chain function assigned to the second bracket is in error, since the subject of the sample sentence was not correctly recognized, as previously explained. Higher order brackets can similarly be produced by combining subject, predicate, and object brackets.

The bracketting procedure is generally performed without difficulty, and the substructures are almost always correctly recognized. In fact, many of the more trivial errors in the syntactic analysis are not reflected by any bracketting errors. Following the bracketting procedure, it is easy to generate word groups of various kinds together with their associated functions. For example, the brackets of Table 5 will yield four noun phrases ("suitably scaled graph," "rising exponential curve," "hypnotic effect," "impending crisis"), two prepositional phrases ("on graph," "effect of crisis"), and one verb phrase ("tends to produce"). Similarly, a subject-verb-object grouping would yield, assuming proper recognition of the subject, the phrase "curve produce effect."

The output produced by the syntactic bracketting procedure can be incorporated in a word association map in two different ways. First, it is possible to replace each word associated with a node of the map by a complete phrase, and, second, relations between nodes, represented by branches in the map, can be provided with relationship indicators, such as prepositions, conjunctions, verbs, and so on, as determined by the syntactic analysis program.
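The sorting step at the start of this passage, ordering items by chain serial number while keeping the word order inside each substructure, amounts to a stable sort followed by grouping. A minimal sketch follows; the serial numbers, chain serials, and function name are hypothetical, chosen only to show a discontinuous constituent being reunited, and the rules that assign chain serials are not modeled.

```python
from itertools import groupby
from operator import itemgetter


def bracket(items):
    """Group words by chain serial number.

    items: list of (serial, word, chain_serial) tuples in text order.
    Returns one list of words per chain serial, in chain serial order.
    """
    # sorted() is stable, so words sharing a chain serial keep their text order.
    ordered = sorted(items, key=itemgetter(2))
    return [[word for _, word, _ in group]
            for _, group in groupby(ordered, key=itemgetter(2))]


# Hypothetical items: the verb group (chain serial 1) is interrupted in the
# text by a prepositional group (chain serial 2).
items = [
    (1, "tends", 1),
    (2, "on", 2),
    (3, "graph", 2),
    (4, "to", 1),
    (5, "produce", 1),
]
print(bracket(items))
```

A stable sort is the one requirement: an unstable sort could reorder "tends," "to," and "produce" within their group.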
The replacement of individual words by phrases is relatively straightforward. The problem created by the addition of relational indications is more complicated, particularly for those relations which are indicated by the context rather than by specific function words [15]. However, even if a standard set of relational indications is not available, and relational information provided by the context is not explicitly identified, the addition of word groupings and of some simple indicators of relation (e.g., prepositions) furnishes a more accurate description, and permits a more profitable comparison of document content.

The Use of Bibliographic Citations for the Generation of Document Associations

In the preceding section, information was extracted from written texts to obtain word associations. It may be expected that when these association programs are added to other methods for the identification of document content, similarities and differences between documents will be easier to detect.

Another possible approach to the generation of document associations consists in using bibliographic references to determine document content. Specifically, if it could be shown that a similarity in the bibliographic references or citations attached to two documents also implies a similarity in subject matter, then citations might be helpful for the automatic assignment of index terms to new documents, and for the generation of document associations. An experiment is described in the present section which uses citation tables (listing with each cited document the set of all those documents which cite the original ones) and reference tables (listing with each citing document the set of all documents cited by the original ones) to compute similarity coefficients between documents.
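The two tables just described hold the same information viewed from opposite sides, so one can be derived from the other. The sketch below inverts a reference table into a citation table; the document identifiers and the function name are hypothetical, and the tables are modeled as plain dictionaries of sets rather than in whatever form the experiment actually used.

```python
from collections import defaultdict


def citation_table(reference_table):
    """Invert a reference table (citing -> cited documents)
    into a citation table (cited -> citing documents)."""
    inverted = defaultdict(set)
    for citing, cited_documents in reference_table.items():
        for cited in cited_documents:
            inverted[cited].add(citing)
    return dict(inverted)


# Hypothetical collection: A cites C and D, B cites C.
references = {"A": {"C", "D"}, "B": {"C"}}
print(citation_table(references))
```

Documents A and B share the reference C, and document C collects both of them as citing documents; either view can serve as a set of properties for comparing documents.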
The coefficient sets derived from the citations are then compared with another set of similarity coefficients derived from index terms attached to the documents, and an attempt is made to determine whether large coefficients in one set correspond to large coefficients in the other, and vice-versa [22].

The proposed use of citations exhibits the same advantages as the syntactic method previously described, namely that the required input information (in this case bibliographic references and citations) is directly available with most documents, and need not therefore be generated by more or less unreliable methods. Moreover, references can be processed automatically nearly as easily as ordinary running text. Use can also be made, at least for experimental purposes, of existing manually indexed collections for which the assignment of index terms is made under controlled conditions.

Measure of Similarity: Consider first a given collection of n documents. Let $X^1, X^2, \ldots, X^n$ be a collection of n vectors, such that each vector element $X^i_j$ of a given vector $X^i$ represents some property relating to the ith document. Specifically, if the n documents are characterized by m properties each, each document is represented by one m-dimensional vector. To relate the properties which identify two specific documents i and j, it is then only necessary to compare the elements of the two vectors $X^i$ and $X^j$; moreover, if each document is to be related to each other document in the collection, all possible vector pairs must be compared. In most cases, it is convenient to restrict the values of the vector elements to 0 and 1, in such a way that element $X^i_j = 1$ if and only if property j holds for document i, and $X^i_j = 0$ otherwise. This restriction is, however, in no way essential. Two documents i and j are then assumed to be closely related if they share a number of common properties, or, alternatively, if the vectors $X^i$ and $X^j$ have their nonzero elements in corresponding positions.
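One way to make "nonzero elements in corresponding positions" concrete is to count the shared properties of two 0/1 vectors and then normalize that count by the vector lengths, which gives the cosine of the angle between the vectors. The sketch below is an illustration only; the function names and sample vectors are hypothetical, and the convention of returning zero when a vector has no nonzero elements is written in as an explicit case.

```python
import math


def shared_properties(x, y):
    """Number of positions in which both 0/1 vectors have a 1."""
    return sum(a * b for a, b in zip(x, y))


def similarity(x, y):
    """Cosine of the angle between two property vectors; 0.0 if either is all zeros."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return shared_properties(x, y) / (norm_x * norm_y)


# Two documents described by m = 5 properties (values invented for illustration).
doc_i = [1, 1, 0, 1, 0]
doc_j = [1, 0, 0, 1, 1]

overlap = shared_properties(doc_i, doc_j)   # two properties in common
coefficient = similarity(doc_i, doc_j)      # 2 / (sqrt(3) * sqrt(3))
```

The raw overlap alone would rate two documents sharing one property out of two the same as two documents sharing one out of a hundred; dividing by the vector lengths removes that bias.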
The desired measure of relatedness must also be a function of the total number of properties applying to each of the documents, that is, of the total number of nonzero elements in each vector, since one shared property out of a possible total of one hundred is clearly less significant than one out of a possible total of two. A simple measure which exhibits the desired behavior and permits a comparison of magnitudes is obtained by considering the vectors of document properties as vectors in m-space, and by taking as a distance function the cosine of the angle between each vector pair. Specifically,

$$ s = \frac{X^i \cdot X^j}{|X^i|\,|X^j|} = \frac{\sum_{k=1}^{m} X^i_k X^j_k}{|X^i|\,|X^j|} $$

If one or both of the vectors is identically equal to zero, s is defined as zero, following the intuitive notion that if a document has no identifiable properties it cannot share any properties with other documents. If the vector elements take on values between 0 and 1, the angle between any two vectors cannot exceed ninety degrees; the value of s then ranges from 0 to 1. If, moreover, the document properties are represented by strictly logical vectors, the expression for s can be simplified since [...]

[Table; caption not recovered. All columns are cross-correlations with TDCMP.]

DOCUMENTS   CITNG   CNG 2   CNG 3   CNG 4   CITED   CTD 2   CTD 3   CTD 4
FO 603      .077    .416    .404    .433    .000    .000    .000    .000
FR 612      .481    .526    .505    .478    .000    .000    .000    .000
GR 604      .142    .285    .338    .279    .000    .000    .000    .000
JO 607      .572    .605    .510    .443    .000    .000    .000    .000
JO 609      .362    .411    .485    .452    .000    .000    .000    .000
JO 611      .000    .000    .172    .196    .000    .000    .000    .000
KU 605      .133    .207    .264    .274    .000    .000    .000    .000

[Figure 9. Comparison of Maximum Cross-Correlation Coefficients. Horizontal axis: document rank. The curve legends are only partly legible (RT CNG - RHO, RT CN2 - RHO, RT CN3 - RHO).]

Consider now the sample list of cross-correlation coefficients shown in Tables 8, 9, and 10, for early documents published in 1957-1958, for middle-range documents published in 1959-1960, and for recent documents published in 1961, respectively. It is seen in Table 8 that the large values of the coefficients for the early documents occur as expected by comparing cited documents. Similarly, Table 10 shows that large values of the coefficients for the recent documents occur as expected by comparing citing documents. These data confirm the expected fact that in a closed collection the amount of citing is not a good content indicator for early documents, and the citedness is not a good indicator for recent documents.

Table 8. Typical Cross-Correlation Coefficients for Early Documents (Published 1957-1958)
Columns (each correlated with TDCMP): CITNG, CNG 2, CNG 3, CNG 4, CITED, CTD 2, CTD 3, CTD 4
Documents: CO 208, FO 496, FO 506, FR 205, GI 209, GI 491, GI 495
Entries in the order in which they survive (row and column alignment uncertain): .000 .299 .000 .436 .109 .000 .273 .000 .459 .000 .000 .635 .000 .000 .000 .000 .000 .483 .134 .559 .306 .563 .000 .000 .000 .000 .000 .000 .000 .587 .000 .496 .542 .428 .615 .469 .610 .000 .000 .455 .000 .217 .433 .424 .292 .210 .487 .628 .447 .643 .523 .552 .615 .426 .622 .556

Table 9 shows that for those documents which can both cite and be cited, the maximum values of the coefficients occur sometimes by comparing cited documents and sometimes by using citing documents. The data of Table 9 thus confirm those previously exhibited in Table 7, to the effect that no criterion was found to exist for consistently preferring either the amount of citing or the amount of being cited as a content indication.
Turning now to the actual values of the coefficients obtained, it appears that for the majority of the documents the values of the cross-correlation coefficients lie between 0.450 and 0.700, when citation links of length two are used, indicating a substantial similarity between overlapping citations and overlapping index terms. For some documents, the value of the cross-correlation coefficient for links of length two was found to be smaller than 0.400. However, in almost every case the failure to obtain adequate agreement between citations and index terms was due to the total, or almost total, lack of citations.

The exact data for 34 documents published in 1959-1960 are shown in Table 11. The documents with coefficients below 0.400 are listed explicitly together with the number of citation links in each case. The following results are apparent for the correlations between CNG 2 - TDCMP and CTD 2 - TDCMP respectively:

a) number of documents with coefficient above 0.400, exhibiting substantial agreement between overlapping citations and index terms - 23 and 20;
b) number of documents with coefficient below 0.400, exhibiting fewer than three citation links of length one or two - 11 and 13;
c) number of documents with coefficient below 0.400 exhibiting three or more citation links - 1 and 1.

While there exists no scientific reason for asserting that a threshold value of 0.400 for the cross-correlation coefficients is necessarily significant as a similarity indication, it does appear that for nearly all documents which exhibit a reasonable number of citations (more than two), the coefficients obtained reveal considerable agreement between overlapping citations and overlapping index terms. Further experimentation is needed to evaluate more precisely the significance of the numeric results, and to determine to what extent citations can actually be used for the automatic generation of index terms.
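The three-way tabulation used above is easy to reproduce for any list of coefficients. The sketch below is an illustration only: the document labels, coefficients, and link counts are invented, the names `tabulate`, `THRESHOLD`, and `MIN_LINKS` are hypothetical, and a coefficient of exactly 0.400 is assumed to count as agreement.

```python
THRESHOLD = 0.400   # threshold discussed in the text
MIN_LINKS = 3       # "three or more citation links"

# document -> (cross-correlation coefficient, number of citation links)
# All values are invented for illustration.
documents = {
    "A": (0.612, 5),
    "B": (0.455, 3),
    "C": (0.000, 0),
    "D": (0.310, 1),
    "E": (0.250, 4),
}


def tabulate(documents):
    """Count documents in categories (a), (b), and (c) of the tabulation."""
    counts = {"a_agreement": 0, "b_few_links": 0, "c_disagreement": 0}
    for coefficient, links in documents.values():
        if coefficient >= THRESHOLD:
            counts["a_agreement"] += 1
        elif links < MIN_LINKS:
            counts["b_few_links"] += 1
        else:
            counts["c_disagreement"] += 1
    return counts


counts = tabulate(documents)
```

Only category (c) counts against the hypothesis, since a low coefficient for a document with almost no citations says little about whether citations reflect content.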
Table 11. Range of Coefficients for 34 Middle-Range Documents

TYPE OF CROSS-CORRELATION   RANGE OF COEFFICIENTS   NUMBER OF DOCUMENTS
CNG 2 - TDCMP               .600 or over            7
                            .500-.599               8
                            .400-.499               8
                            .300-.399               3
                            .200-.299               1
                            .100-.199               0
                            .001-.099               0
                            .000                    7

[The corresponding entries for CTD 2 - TDCMP, and the columns listing selected document codes with their numbers of links of length 2 (and of direct links), are not legible in the source.]

CONCLUSION

Two experiments have been described which use information directly provided with written texts to determine associations between words and documents. The syntactic experiment has shown that it is possible to extract information from function words and word suffixes to generate word groups and relations between word groups. These, in turn, may be used to obtain a more effective method for comparing the content of documents.

The citation experiment has shown that for documents which exhibit an adequate number of citations, similarities in citations seem to provide some indication of similarities in content. For early documents, citedness furnishes a better indication than the amount of citing, and vice-versa for recent documents; for documents which can both cite and be cited, equally good indications were obtained by comparing citing and cited documents.

It is suggested that methods of this type, which require neither extensive input dictionaries nor complicated computer processes, may eventually form the basis for practical automatic programs for content analysis.
While no simple method will be completely successful on its own, a combination of many techniques of the kind described may well offer not only a speedier, but also a more satisfactory, solution to the content analysis problem than the construction of semantic encyclopedias which are so far away in the future.

SUMMARY

The solution of most problems in automatic information dissemination and retrieval is dependent on the availability of methods for the automatic analysis of information content. In most proposed automatic systems, this analysis is based on a counting procedure which uses the frequency of occurrence of certain words or word classes to generate sets of index terms, to prepare automatic abstracts or extracts, to determine certain word groupings, and to extend or modify in various ways sets of terms originally given. Unfortunately, it is not possible to perform completely effective subject analyses solely by frequency counting techniques.

Two automatic methods are presented to aid in an effective subject analysis. The first makes use of a simplified form of syntactic analysis to determine associations between words in a text, and the second uses bibliographic citations to classify documents into subject areas. Neither method requires extensive dictionaries or tables of the type normally used for automatic classification schemes; instead, information is extracted from certain function words, from suffixes provided with many words in the language, and from bibliographic citations already available with most documents.

Specifically, the syntactic analysis makes use of a small dictionary of a few hundred function words such as prepositions, conjunctions, articles, and certain nouns. Word suffixes are then isolated, and a suffix table is used to obtain additional grammatical indicators.
A type of predictive analysis is then used to assign syntactic function indicators to all words in a sentence by matching predicted syntactic structures against the available grammatical information for the various words. If no grammatical information is available, the most likely prediction is used to classify the given word. The syntactic function indicators are used to group words into phrases of certain types, and phrases into clauses, and to determine certain word associations. Experimental evidence indicates that the error rate is not substantially higher than that found in other more complicated syntactic analysis programs which require full syntactic word dictionaries.

The citation matching program uses bibliographic citations to determine document similarities. A similarity coefficient is first calculated for all document pairs as a function of the number of overlapping citations between them. A second similarity coefficient is then derived using this time the number of overlapping index terms as a criterion. The index terms may be generated by hand, or may be derived by means of word frequency analyses. Finally, similarity coefficients derived from overlapping citations are compared with those derived from overlapping index terms.

The coefficients, computed for a sample document collection, are analyzed to verify the hypothesis that when a closeness exists in the subject matter of certain documents, as reflected by overlapping index terms, there exists a corresponding closeness in the citation sets. It is found that the computed similarity coefficients are much larger than those obtained by assuming a random assignment of citations and index terms. Suggestions are made for using citation sets as an aid to the automatic generation of index terms.

REFERENCES

1. H. P. Luhn, Auto-encoding of Documents for Information Retrieval Systems, IBM Corporation, ASDD Report, 1958.
2. H. P. Luhn, "Automatic Creation of Literature Abstracts," IBM Journal of Research and Development, Vol. 2, No. 2, April 1958.
3. H. P. Luhn, "The Automatic Derivation of Information Retrieval Encodements from Machine-Readable Texts," in Information Retrieval and Machine Translation, Part 2, A. Kent, ed., Interscience Publishers, New York, 1961.
4. L. B. Doyle, Indexing and Abstracting by Association, Report SP-718/001/00, System Development Corporation, April 1962.
5. H. Borko, "The Construction of an Empirically Based Mathematically Derived Classification System," Proceedings of the AFIPS Spring Computer Conference, San Francisco, May 1962.
6. L. B. Doyle, "Semantic Road Maps for Literature Searchers," Journal of the Association for Computing Machinery, Vol. 8, No. 4, October 1961.
7. V. E. Giuliano, "A Linear Method for Word Sentence Association," private communication.
8. B. F. Green, Jr., A. K. Wolf, C. Chomsky, and K. Laughery, "Baseball: An Automatic Question Answerer," Proceedings WJCC, Los Angeles, May 1961.
9. R. K. Lindsay, "The Reading Machine Problem," Doctoral Thesis, Carnegie Institute of Technology, September 1960.
10. N. S. Prywes and H. J. Gray, Jr., "A Report on the Development of a List Type Processor - I," University of Pennsylvania, Moore School of Electrical Engineering, 1961.
11. M. Detant and A. Leroy, "Elaboration d'un Programme d'Analyse de la Signification," Rapport GRISA No. 11, EURATOM CETIS, June 1961.
12. P. J. Stone, R. F. Bales, J. Z. Namenwirth, and D. M. Ogilvie, "The General Inquirer: A Computer System for Content Analysis and Retrieval based on the Sentence as a Unit for Information," Laboratory of Social Relations, Harvard University, November 1961.
13. P. Baxendale, "An Empirical Model for Machine Indexing," Third Institute on Information Storage and Retrieval, February 1961.
14. G. Salton, "The Manipulation of Trees in Information Retrieval," Communications of the Association for Computing Machinery, Vol. 5, No. 2, February 1962.
15. G. Salton, "The Identification of Document Content: A Problem in Automatic Information Retrieval," Proceedings of a Harvard Symposium on Digital Computers and their Applications, Annals of the Computation Laboratory of Harvard University, Vol. 31 (to be published, 1962).
16. S. Klein and R. F. Simmons, "Automatic Analysis and Coding of English Grammar for Information Processing Systems," Report SP-490, System Development Corporation, February 1962.
17. S. Klein and R. F. Simmons, "A Computational Approach to Grammatical Coding of English Words," Report SP-701, System Development Corporation, February 1962.
18. I. Rhodes, "A New Approach to the Mechanical Translation of Russian," National Bureau of Standards, Report No. 6295, 1959.
19. M. Sherry, "Comprehensive Report on Predictive Syntactic Analysis," Mathematical Linguistics and Automatic Translation, Report No. NSF-7, Section I, Harvard Computation Laboratory, 1961.
20. W. Plath, "Automatic Sentence Diagramming," First National Conference on Machine Translation and Applied Language Analysis, National Physical Laboratory, Teddington, 1961.
21. D. G. Hays, "Grouping and Dependency Theories," Report P-1910, Rand Corporation, Santa Monica, 1960.
22. G. Salton, "The Use of Citations as an Aid to Automatic Content Analysis," Information Storage and Retrieval, Report ISR-2, Harvard Computation Laboratory, in preparation, June 1962.
23. F. E. Hohn, S. Seshu, and D. D. Aufenkamp, "The Theory of Nets," IRE Transactions on Electronic Computers, Vol. EC-6, 1957, pp. 154-161.

A LOGIC DESIGN TRANSLATOR

D. F. Gorman and J. P.
Anderson
Burroughs Corporation
Burroughs Laboratories
Paoli, Pennsylvania

INTRODUCTION

The process of logic design of a computer is analogous to programming, using hardware for commands. Logic designers must frequently take imprecise narrative descriptions of computer systems, and, applying experience, ingenuity, inventiveness, and considerable perseverance, transform the description into a prescription for a running system. This process takes an inordinate amount of time. Considerable attention has been devoted to the reduction of programming time through the use of higher programming languages, such as ALGOL and COBOL, and conversion to machine code through translators. In a similar manner, translation of higher languages for logic and system design would greatly reduce the effort now needed to specify and design computer systems.

Just as in the case of programming, changes to the description are tolerated up to a certain point, then are prohibited, or are made with great reluctance and less than perfect efficiency. This paper offers systems and logic designers a means to automate the repetitive and otherwise error-prone detail associated with much machine design, as well as to provide a means for making systems changes acceptable for a longer period of time, without encountering the logic design equivalent of program patches. An additional motivation is to be able to rapidly obtain a canonical description of a computer system in the form of application equations of all registers within the machine.

It is envisioned that a translator such as this would be incorporated within a complete design automation system, and, as such, should be expected to provide input for minimization programs, card layout and assignment programs, backplane wiring programs, and the like. For this reason, a canonical form is essential to a completely automated system and, in addition, will provide a convenient input form for follow-up programs. From the canonical form, it is possible to obtain an estimate of the relative cost of a system, based upon component costs. As a result, the cost and effects of modifications and changes can be quickly and accurately determined.

The canonical form also provides a means of comparing and evaluating various designs by examining the hardware structure. Previously, such comparisons could not be made, because of the prohibitive number of man-hours involved in the detailed logical design. Thus, those operations which are difficult to implement, in terms of complexity or cost, are easily pinpointed for further study. Logical inconsistencies and timing conflicts are eliminated. Thus, the translator will provide a tool for system designers to more accurately measure their systems, and, hopefully, will promote better descriptions of future systems.

System Description Language

A system can be described by giving the functional interrelationships of the various parts of the system. The description is usually presented in a narrative form, with the drawbacks of ambiguities and otherwise imprecise meanings. Upon examining the notion of system design, as opposed to logic design, it is often found that the emphasis is upon the control structure of a processor, with details on the number of registers to be included, the structure of each, the manner of connection in each case, and the parts to be played by each in the execution of various commands. Particular attention is paid to the selection and description of a command list, especially those commands which are the "particular features" of a machine. These considerations suggested that a language that was a concise and unambiguous description of these objects would, in fact, describe the computer system.
The system descriptive language devised is based upon the informal programmatic notation frequently used to replace clumsy narrative [1], such as replacing the statement

    The contents of A and X are interchanged; the contents of A are then stored in memory at the place addressed by the contents of MA.

by

    EXCHANGE A AND X
    A → MA*

In general, the language assumes the existence of such hardware entities as counters, adders, registers, clocking systems, and the like. The forms of the language are not unlike constructs found in algebraic languages such as ALGOL [2].

The basic language construct devised to describe the interaction of the various registers is a statement. The principal statement type is associated with interregister transfers. In addition, counting statements, conditional statements, shift statements, memory access statements, and the like are provided for the most frequently invoked system functions. Examples of some of the various statements are shown below:

    Statement        Example
    transfer         A → B
    conditional      IF G(13) = 0 THEN (5. G → B) ELSE (6. B → G)
    shift            A MOVE RIGHT OFF 6 → B
    arithmetic       A + B → R
    counting         C + 1 → C
    subroutine       INSTFETCH
    memory access    MA* → L

Statements are freely combined to form functional descriptions of instructions and control. These larger constructs are the microsequences used to describe the machine functions. The complete functional description of a computer system is given by the set of microsequences for the instruction set and control (such as instruction fetch and data fetch), plus the declarations describing the register sizes and their interconnections. As an example, the following is a microsequence for the acquisition of the next instruction in some machine:

    INSTFETCH
    1. PC → MA
    2. MA* → B
    3. PC + 1 → PC

Declarations are used to specify details of the system.
They are used to name registers and describe their size and structure as well as to provide characteristics of various equipment assumed, such as arithmetic units, logical units, etc. In addition, declarations describe permissible data paths in the system as a check on the transfers specified in the microsequences.

In general, the microsequences make reference to substructure names for the apparent reason of readability and mnemonic value. As a canonical representation, however, it was deemed desirable to use the parent name of a register, since, in general, the substructure is artificially imposed by the system designer and has no a priori meaning in hardware. Typical of the register declarations is the example of an instruction register description in Figure 1.

[Figure 1. An Instruction Register Description. An annotated register declaration with substructures OP, INDX(7, 9), and ADDR; the annotations label the register name, size, fields, and substructure names. The remaining details are not legible in the source.]

It is the intention to draw upon the advances achieved in recent years in the design of system elements, such as adders, counters, comparators, etc., by providing libraries of designs of these specialized units. These designs would reflect various compromises between speed and cost, and would represent diversified implementations of computational algorithms. These units are declared as special registers, and have a format similar to storage registers, differing only by the inclusion of cost and speed parameters.

Translator Structure

The overall structure of the translator is that shown in Figure 2. Briefly, the translator accepts systems descriptions in the language provided, and transforms the microsequences into an intermediate language known as the design table. After suitable manipulation, the design table may be converted to application equations. The translator is coded as a set of recursive procedures, using the Burroughs 220 ALGOL translator [3].
Each procedure corresponds (more or less) to one of the syntactic definitions of the language. In addition, since the language unit most frequently invoked is a statement, a procedure is provided to array statement elements into a vector, and associate with them a type designator (for example, an identifier, a number, an operator, a delimiter, etc.). A set of link list procedures is used to handle the associative storage for the register and transfer lists, and for various other lists used in the translation process. The logic design translator has three major parts: the scan and recognizer, the design table analysis, and the equation generator.

[Figure 2. A Logic Design Translator. Block diagram; legible labels: Microsequences (in the system description language); System Structure; Registers and Special Registers; List of Legitimate Transfers; Transfer Paths; Hardware Decisions; Transfer type; Timing information; Logical blocks and register type; Microsequences as an ordered sequence of legitimate transfers; Design Table; Equations of entire system.]

The Scan and Recognizer: This segment of the translator takes microsequences, in the system descriptive language, converts them into a series of information transfers, and enters them into the design table. As mentioned above, the design table can be thought of as an intermediate language used during the translation process as a convenient form in which to store and manipulate data. The design table is represented as an m x 12 matrix, where m is the number of operations to be performed within a microsequence and the 12 columns indicate the following information:

Column 1 - source register
Column 2 - most significant bit of the field of the source register under consideration
Column 3 - least significant bit of the field of the source register under consideration
Column 4 - destination register
Column 5 - most significant bit of the field of the destination register under consideration
Column 6 - least significant bit of the field of the destination register under consideration
Column 7 - equipment used for the transfer of information
Column 8 - most significant bit of the field of the equipment under consideration
Column 9 - least significant bit of the field of the equipment under consideration
Column 10 - control conditions to be satisfied during this operation
Column 11 - relative time at which this microstep is to occur
Column 12 - delay of the destination register

The design table is set up to accommodate information transfers; that is, its basic structure embodies a source and a destination of information, provisions for indicating other equipment which may be used, and the control conditions which must be fulfilled for each transfer. That all routines to be performed by the specified machine can be decomposed into a series of information transfers is fundamental in the design of digital computers.

Each microstep must be individually analyzed in the order of its appearance within the microsequence. The microstep is first examined by the recognizer in order to determine which type of statement it is. At the present time, 12 basic statement types have been defined within the system. Depending upon the statement type, a subroutine is chosen which scans the statement, determining which registers or substructures of registers of the machine are invoked by this microstep. In general, substructure names are used in the microsteps, mainly for their mnemonic value. These substructure names are now replaced by the name of the parent register along with the appropriate bit fields. It is at this time that the basic statement is broken down into a series of information transfers between the specified registers.
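The row layout and the replacement of substructure names by parent registers can be illustrated with a small sketch. Everything named here is an assumption made for the example: the register widths, the placement of INDX in bits 7 to 9 of a parent register C, and the names `DesignRow`, `resolve`, and `transfer` are hypothetical, and the equipment bit-field columns (8 and 9) are left out for brevity.

```python
from dataclasses import dataclass

# Hypothetical declarations: register widths and substructure bit fields.
REGISTERS = {"C": 24, "X": 3}
SUBSTRUCTURES = {"INDX": ("C", 7, 9)}  # INDX assumed to be bits 7-9 of register C


@dataclass
class DesignRow:
    source: str          # column 1
    src_msb: int         # column 2
    src_lsb: int         # column 3
    dest: str            # column 4
    dst_msb: int         # column 5
    dst_lsb: int         # column 6
    equipment: str = ""  # column 7 (columns 8 and 9 omitted in this sketch)
    control: str = ""    # column 10
    time: int = 0        # column 11, filled in by the later analysis
    delay: int = 0       # column 12


def resolve(name):
    """Return (parent register, most significant bit, least significant bit)."""
    if name in SUBSTRUCTURES:
        return SUBSTRUCTURES[name]
    return (name, 1, REGISTERS[name])  # a whole register: bits 1 through its width


def transfer(source, dest):
    """Build one design table row for the microstep `source -> dest`."""
    return DesignRow(*resolve(source), *resolve(dest))


row = transfer("INDX", "X")
```

Here the microstep is written with the mnemonic name INDX, but the row that reaches the table names only the parent register C and its bit field, which is the canonical form the paper asks for.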
The parent register names and bit fields are now entered into the design table in the "source" and "destination" columns, each transfer in a separate row. Since each information transfer between registers may use other equipment in the system (busses, memory address register, etc.), these transfers are compared to the list of data paths which was initially entered as declarations concerning the system, and all additional equipment which is used in this transfer is entered in the "equipment used" column of the design table. The operators appearing in the statement are entered in the "control" column. Flags, to be used in the design table analysis, are entered in the "time" column, if needed. A "0" flag is used to indicate that the following row is part of the operation; thus, the two rows are to be considered as one. A "1" flag indicates that the design table is partitioned below that row.

The Design Table Analysis: This section of the translator determines the length of time each operation will take, the time when it may begin, and sets up the control for the next instruction fetch. The type of transfer associated with each operation in the design table is determined by the destination register. If the destination register accepts information in parallel, one delay will be required for the transfer. If a serial transfer is required, then the number of delays required for the operation is equal to the number of bits in the register involved. A delay is the time that the destination register requires to process one bit of information.

Since the GO TO statement effectively alters the order in which operations are to be performed in the machine, the design table must be partitioned so that the timing of each branch of the program may be determined independently. The clock time at which an operation may begin is determined as follows:

1. The first operation in the design table can begin at the first clock pulse, denoted by t_1.
2. The first operation in any partition of the design table can begin at t_max + 1, where t_max is the highest clock time assigned to any previous operation in the design table.
3. The clock time of the nth operation is determined by comparing all registers and other equipment used by an operation against previous operations, within the same partition, beginning with the (n - 1)st operation, until the conflict with the maximum time is found. A conflict is said to occur between two operations if any equipment is used in common by both operations. The nth operation is then begun at a time one greater than the time of the conflicting operation with the highest time. If no conflict is found within the partition, the starting time of the first operation within the partition is also the starting time of the nth operation.

This method of assigning clock times to operations ensures that, for the order of microsteps as declared, equipment does not remain idle needlessly.

An implicit subroutine, called microsequence completion, must be added to each microsequence so that, as each microsequence is executed, control is transferred to the following microsequence. This is accomplished by entering in the (i + 1)st row of the design table the following:

a) "0" in column 1
b) CLOCK in column 4
c) RESET in column 10
d) t_max + 1 in column 11
e) Delay_CLOCK in column 12

The Equation Generator: After the operation of the design table analysis, the application equations for all the data storage registers in the machine are generated. Each operation in the design table will form a term of the application equation for each bit of the destination register. This term is formed by taking the conjunction of the microsequence name, the corresponding bit of the source register, all variables in the control column, and the clock time.
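The clock-time rules and the formation of equation terms can be sketched together. This is a simplified illustration rather than the translator's actual procedure, and it makes several assumptions: every operation takes one clock period, each operation is a single table entry with its equipment attached (no "0" flags joining rows), terms are formed per register instead of per bit, and terms for the same destination are simply OR-ed together. The names `Op`, `assign_clock_times`, and `application_equations`, and the microsequence name DEMO, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Op:
    source: str
    dest: str
    equipment: tuple = ()   # other equipment used, e.g. ("BUS1",)
    control: tuple = ()     # control variables, e.g. ("FF1",)
    partition: int = 0
    time: int = 0

    def resources(self):
        """All registers and equipment this operation touches."""
        return {self.source, self.dest, *self.equipment}


def assign_clock_times(ops):
    """Apply rules 1 to 3 of the design table analysis, assuming unit delays."""
    t_max = 0
    start = {}  # partition -> clock time of its first operation
    for n, op in enumerate(ops):
        if op.partition not in start:
            # Rules 1 and 2: t_1 for the first operation, t_max + 1 for a new partition.
            start[op.partition] = t_max + 1
            op.time = start[op.partition]
        else:
            # Rule 3: one later than the latest conflicting operation in the partition.
            clashes = [p.time for p in ops[:n]
                       if p.partition == op.partition
                       and p.resources() & op.resources()]
            op.time = max(clashes) + 1 if clashes else start[op.partition]
        t_max = max(t_max, op.time)
    return ops


def application_equations(microsequence, ops):
    """One conjunction per operation; terms with the same destination are OR-ed."""
    terms = {}
    for op in ops:
        factors = [microsequence, op.source, *op.control, f"t{op.time}"]
        terms.setdefault(op.dest, []).append(" AND ".join(factors))
    return {dest: " OR ".join(ts) for dest, ts in terms.items()}


ops = assign_clock_times([
    Op("P", "Q", equipment=("BUS1",)),
    Op("K", "P", equipment=("BUS1",)),   # shares BUS1 and P with the first operation
    Op("S", "W"),                        # no conflict: starts with the partition
    Op("W", "X", control=("FF1",)),      # must wait for W
])
equations = application_equations("DEMO", ops)
```

In this example the third operation shares no equipment with the earlier ones and therefore starts at the first clock time, which is the sense in which the rules keep equipment from remaining idle needlessly.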
After all the microsequences to be performed by the specified machine have been processed, the total equation for a particular bit of a register is formed in a separate sort run by taking the disjunction of all terms in which the bit under consideration was the destination bit. Likewise, the application equations for the control flip-flops are derived from the design table based upon the microsequence name and the clock times associated with each control variable.

An Example

Two microsequences will now be used to illustrate the functioning of the translator. These microsequences do not represent a useful routine in a computer system but were constructed to illustrate at least one example of each type of statement. Assume that all information transfers occur in parallel. The resulting design tables are shown in Figs. 3 and 5. The two microsequences are as follows:

    MICROSEQUENCE WIN
    1. P → Q
    2. K + 1 → P
    3. S + T → W
    4. X MOVE RIGHT OFF 3 → X
    5. IF K = P THEN (6. W → X)
                ELSE (7. W → Y
                      8. R6-10 → W2-6)
    9. K - 1 → P

    MICROSEQUENCE LOSE
    1. (K · W ∨ R) → X
    2. WIN
    3. M* → T
    4. EXCHANGE S AND T
    5. T → M*
    6. GO TO 2

For microsequence WIN, microsteps 1 and 2 are straightforward entries into the table. The fact that BUS1 was used during these operations was determined by consulting declarations about the system assumed to have been previously entered via the translator. In microstep 3, row 7, the 0 in column 11 is used as a flag to indicate that both operands (S and T) must be brought to the arithmetic unit at the same time. Microstep 5 is a conditional statement and, as such, imposes control conditions upon following statements, as indicated by the FF1 and -FF1 entries in column 10.

Microsequence LOSE illustrates some different statement types and another feature of the translator. In evaluating the Boolean expression in microstep 1, the translator finds that temporary storage is necessary.
The translator assigns a name to this register (TEMP1) and notifies the designer that this register is necessary if the microstep is to be executed in the prescribed manner.

The analysis section of the translator determines the clock times for the operations indicated in the design table. Since parallel transfers were assumed, only one delay will be required for each transfer; the results are shown in Figures 4 and 6. In accord with the flags in column 11 of microsequence WIN, the beginning times for the operations of rows 1 and 2, 3 and 4, etc., are the same. An interesting situation occurs in rows 19, 20, and 21. Since the control conditions for operations 20 and 21 are the inverse of the conditions for operation 19, operation 19 is ignored during the determination of the starting time for operations 20 and 21.

[Figure 3. Design Table for Microsequence WIN.]

[Figure 4. Design Table for Microsequence WIN after Timing Analysis.]

[Figure 5. Design Table for Microsequence LOSE.]

[Figure 6. Design Table for Microsequence LOSE after Timing Analysis.]

The equation generator will convert the design tables into terms which will be combined in a later sort run to form the application equations. Terms generated by microsequence WIN are:

\[
\begin{aligned}
\mathrm{BUS1}_{1-n} &= \mathrm{WIN} \cdot P_{1-n} \cdot t_1 \\
Q_{1-n} &= \mathrm{WIN} \cdot \mathrm{BUS1}_{1-n} \cdot t_1 \\
\mathrm{BUS1}_{1-n} &= \mathrm{WIN} \cdot K_{1-n} \cdot t_2 \\
\mathrm{COUNT}_{1-n} &= \mathrm{WIN} \cdot \mathrm{BUS1}_{1-n} \cdot \mathrm{UP} \cdot t_2 \\
\mathrm{BUS1}_{1-n} &= \mathrm{WIN} \cdot \mathrm{COUNT}_{1-n} \cdot t_3 \\
P_{1-n} &= \mathrm{WIN} \cdot \mathrm{BUS1}_{1-n} \cdot t_3 \\
\mathrm{ARITH(A)}_{1-n} &= \mathrm{WIN} \cdot S_{1-n} \cdot \mathrm{ADD} \cdot t_j \qquad (j = 1, 2, 3, 4, 5;\ \text{one term per clock time}) \\
\mathrm{ARITH(B)}_{1-n} &= \mathrm{WIN} \cdot T_{1-n} \cdot \mathrm{ADD} \cdot t_j \qquad (j = 1, 2, 3, 4, 5;\ \text{one term per clock time}) \\
W_{1-n} &= \mathrm{WIN} \cdot \mathrm{ARITH}_{1-n} \cdot t_6 \\
\mathrm{SHIFT}_{1-n} &= \mathrm{WIN} \cdot X_{1-n} \cdot \mathrm{RIGHT{\cdot}OFF} \cdot t_1 \\
\mathrm{SHIFT}_{1-n} &= \mathrm{WIN} \cdot \mathrm{SHIFT}_{1-n} \cdot \mathrm{RIGHT{\cdot}OFF} \cdot t_2 \\
\mathrm{SHIFT}_{1-n} &= \mathrm{WIN} \cdot \mathrm{SHIFT}_{1-n} \cdot \mathrm{RIGHT{\cdot}OFF} \cdot t_3 \\
X_{1-n} &= \mathrm{WIN} \cdot \mathrm{SHIFT}_{1-n} \cdot t_4 \\
\mathrm{BUS1}_{1-n} &= \mathrm{WIN} \cdot K_{1-n} \cdot t_4 \\
\mathrm{LOGICAL(A)}_{1-n} &= \mathrm{WIN} \cdot \mathrm{BUS1}_{1-n} \cdot \mathrm{EQL} \cdot t_4 \\
\mathrm{BUS2}_{1-n} &= \mathrm{WIN} \cdot P_{1-n} \cdot t_4 \\
\mathrm{LOGICAL(B)}_{1-n} &= \mathrm{WIN} \cdot \mathrm{BUS2}_{1-n} \cdot \mathrm{EQL} \cdot t_4 \\
\mathrm{FF1}_{1} &= \mathrm{WIN} \cdot \mathrm{LOGICAL}_{1} \cdot t_7 \\
X_{1-n} &= \mathrm{WIN} \cdot W_{1-n} \cdot \mathrm{FF1} \cdot t_8 \\
Y_{1-n} &= \mathrm{WIN} \cdot W_{1-n} \cdot \overline{\mathrm{FF1}} \cdot t_8 \\
W_{2-6} &= \mathrm{WIN} \cdot R_{6-10} \cdot \overline{\mathrm{FF1}} \cdot t_9 \\
\mathrm{BUS1}_{1-n} &= \mathrm{WIN} \cdot K_{1-n} \cdot t_4 \\
\mathrm{COUNT}_{1-n} &= \mathrm{WIN} \cdot \mathrm{BUS1}_{1-n} \cdot \mathrm{DOWN} \cdot t_4 \\
\mathrm{BUS1}_{1-n} &= \mathrm{WIN} \cdot \mathrm{COUNT}_{1-n} \cdot t_7 \\
P_{1-n} &= \mathrm{WIN} \cdot \mathrm{BUS1}_{1-n} \cdot t_7 \\
\mathrm{CLOCK} &= \mathrm{WIN} \cdot \text{"0"} \cdot \mathrm{RESET} \cdot t_{10}
\end{aligned}
\]

After all microsequences have been processed, the terms from all microsequences are collected and sorted, and the equations for the system are formed. For example, the equation for the first bit of the X register will have the following form, in which the first bracket comes from microsequence WIN and the second from microsequence LOSE:

\[
X_1 = \left[\mathrm{WIN} \cdot \mathrm{SHIFT}_1 \cdot t_4 \;\lor\; \mathrm{WIN} \cdot W_1 \cdot \mathrm{FF1} \cdot t_8\right] \;\lor\; \left[\mathrm{LOSE} \cdot \mathrm{LOGICAL}_1 \cdot t_8 \;\lor\; \mathrm{LOSE} \cdot \mathrm{SHIFT}_1 \cdot t_{12} \;\lor\; \mathrm{LOSE} \cdot W_1 \cdot \mathrm{FF1} \cdot t_{16}\right] \;\lor\; \cdots
\]

CONCLUSION

The foregoing outlines a programming system that is another tool for the computer designer. The emphasis has been on the representation of a computer system in a form suitable for further processing by an automated design system, or by programs to evaluate the cost of the machine under consideration. In general, it is envisioned that the user of this system would be the designer concerned principally with the overall control structure of a computer.

It should be pointed out that the algorithms devised are, to a large degree, a function of the class of machines being designed (parallel-synchronous, in the example), and, to a lesser degree, a function of assumptions regarding characteristics of the system elements available to the designer. (For example, the translation algorithm employed would have to be changed to accommodate "shifting type" registers.)
Further changes would have to be incorporated to give the designer explicit control over the concurrences in microsequences rather than to arbitrarily exploit the concurrences (based upon utilization of system elements) in the translator.

The system described herein will be used as a vehicle for extending the notation to cover wider classes of designs, and to study the implications of the notational devices to canonical representation of computer systems.

It has not escaped the authors that the language described in this paper is essentially the form necessary for functional simulation of computers, and that it would be a relatively simple task to write a translator that would generate a simulation program representing a proposed machine design, either on a functional level or on an individual clock pulse basis.

ACKNOWLEDGEMENT

The contributions of Roy Proctor, of the Burroughs Laboratories Computation Center, both for programming and for several suggestions in connection with this work, are acknowledged with pleasure.

REFERENCES

1. Barton, R. S., "A New Approach to the Functional Design of a Digital Computer," Proceedings, 1961 WJCC, pp. 393-396; May 9-11, 1961.
2. Naur, P., et al., "Report on the Algorithmic Language ALGOL 60," Communications of the ACM, vol. 3, no. 5, pp. 299-314; May 1960.
3. Burroughs Algebraic Compiler, Bulletin 220-21011-P (Detroit, Michigan: Equipment and Systems Marketing Division, Burroughs Corporation); January 1961.

COMPROTEIN: A COMPUTER PROGRAM TO AID PRIMARY PROTEIN STRUCTURE DETERMINATION*

Margaret Oakley Dayhoff and Robert S. Ledley
National Biomedical Research Foundation
Silver Spring, Maryland

*The research reported in this paper has been supported by Grant GM 08710 from the National Institutes of Health, Division of General Medical Sciences, to the National Biomedical Research Foundation.

INTRODUCTION

Among the main chemical constituents of the human body, and in fact of all living things, are proteins. In addition to serving as component structural parts of many types of living tissues, the proteins are enzymes that are necessary in order that the chemical reactions which comprise the life processes may occur. The protein enzymes act to "decode" the message of the genes, interpreting this message in terms of specific chemical reactions which determine the physical and functional characteristics of the organism. Thus proteins play a uniquely vital role in the evolution, ontogeny, and maintenance of living organisms. It therefore becomes important when studying the basis of life processes to know the structure of the proteins themselves.

In spite of their highly complex role, the molecular structure of proteins is, in principle, relatively simple: they are long chains of only twenty different types of smaller molecular "links" called amino acids (see Figure 1). Each type of protein is characterized by a particular ordering of the amino acid links, and a major problem in finding the exact structure of a protein is to obtain the ordering of the amino acids in the chain. This ordering is of great interest because it is the order of the amino acids in a protein that is determined by the gene. Thus, according to current biological theory, the gene determines which proteins will be made by determining the order of the amino acids in the protein chain, and it is these proteins in turn, acting as enzymes, that control the chemical processes that determine the physical and functional characteristics of the organism.

      Name            Abbreviation         Name            Abbreviation
   1  alanine         ALA              11  leucine         LEU
   2  arginine        ARG              12  lysine          LYS
   3  asparagine      ASN              13  methionine      MET
   4  aspartic acid   ASP              14  phenylalanine   PHE
   5  cysteine        CYS              15  proline         PRO
   6  glycine         GLY              16  serine          SER
   7  glutamine       GLN              17  threonine       THR
   8  glutamic acid   GLU              18  tyrosine        TYR
   9  histidine       HIS              19  tryptophane     TRY
  10  isoleucine      ILU              20  valine          VAL

Figure 1. A listing of the amino acids with their abbreviations is shown in the upper section and the lower indicates part of the protein ribonuclease which actually comprises a chain of some 124 amino acids.

Finding the amino acid order of a protein chain has proved a time-consuming process for the biochemist; in fact, only about 6 complete or almost complete protein orderings have been found so far, namely those of insulin [1], hemoglobin [2], ribonuclease [3], tobacco mosaic virus protein [4], myoglobin [5], and cytochrome C [6]. The basic technique used on all these proteins (with the exception of myoglobin) was to break down the long chain chemically into smaller fragment chains at several different points, to analyze the amino acids in each fragment chemically, and then to try to reconstruct the entire protein chain by a logical and combinatorial examination of overlapping fragments from the different breakdowns. It is in this reconstruction of the protein that the computer finds its application.

As a trivial example, suppose that for a protein one chemical breakdown produced the fragment chains of known ordering,

    Breakdown P: AB, CD, and E

where A, B, C, D and E each occur once and only once in the protein. Let us call this a complete breakdown, and let another breakdown, this time incomplete, produce the fragments

    Breakdown Q: BC and DE

where A, B, C, D, and E represent amino acids. Here fragment BC in Breakdown Q (see Figure 2a) clearly overlaps the two fragments AB and CD of Breakdown P, and DE overlaps CD and E of breakdown P, giving as the reconstructed protein ABCDE.

As another example, consider the more common case where the amino acid components of a fragment are known, but the order of these within the fragment is unknown. Let parentheses indicate that the order they enclose is unknown [e.g., (A, B, C) represents the six permutations of A, B, C; (D, E) represents either DE or ED; (A, B, C)(D, E) represents the 6 x 2 = 12 fragments of each of the six permutations in (A, B, C) followed by DE or ED, etc.], and suppose that one complete breakdown is

    Breakdown P: (A, B, C) and (D, E)

and that another, incomplete, breakdown is

    Breakdown Q: (A, B) and (C, D)

Clearly (C, D) of breakdown Q overlaps (A, B, C) and (D, E) of breakdown P, but (A, B) is contained within (A, B, C). Hence, since each amino acid has distinct "left" and "right" ends, two possible protein reconstructions result, namely (see Figure 2b)

    (A, B)(C)(D)(E) and (E)(D)(C)(A, B)

where in each possibility the order of A, B still remains unknown. Such partial reconstructions frequently occur, and pinpoint for the biochemist that portion of the molecule on which further effort is required.

Unfortunately, however, the problems involved in reconstructing proteins are not as simple as in the examples just given. The largest protein analyzed so far, the tobacco mosaic virus protein, has only 158 amino acids, whereas proteins usually have chains of many hundreds of amino acid links. Since the number of combinations of n things taken r at a time (1 < r < n) increases more rapidly than does n itself, it is to be expected that the difficulties in piecing together the fragments of a protein will increase proportionally faster than the number of amino acids in the protein.
In addition, there may be occasional errors in the fragments reported by the biochemist, as well as other aberrations in the data. Hence the logical and combinatorial problems can become severe, and a computer is then required to assist in the analysis.

[Figure 2. Illustration of two different breakdowns of a protein into amino acid fragments (see text).]

The advantage of computer aid is that it may help significantly to extend the current chemical analysis methods of determining the amino acid sequences of proteins to many more and much larger proteins. By exhaustively analyzing the possibilities of protein reconstructions, the computer may assist in determining the best next step to try in the chemical analysis processes. In addition, it should be noted that presently the chemical analysis is carefully planned to produce results that will be logically simple for mental analysis. The use of a computer to perform the logical analysis may thus allow significant simplification and further systematization of the chemical experimental procedures by placing more of the burden on the automated logical and combinatorial analysis and less on the experimental procedures.

In this paper we shall describe a completed computer program for the IBM 7090, which to our knowledge is the first successful attempt at aiding the analysis of the amino acid chain structure of protein [7]. The idea was conceived by us in 1958, but actual programming was not initiated until late 1960. D. F. Bradley, S. A. Bernhard, and W. L. Duda have independently reported, in an as yet unpublished paper [8], progress in approaching a similar problem. R. Eck has reported on a system for using marginal punch cards to aid in certain aspects of the logical analysis problem [9].
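The parenthesis notation used in the examples of this section, where a parenthesized group stands for every ordering of its members, can be made concrete with a short sketch. The representation below (a fragment as a list of tuples, one tuple per parenthesized group) and the function name `expand` are illustrative assumptions, not part of the program described in this paper.

```python
from itertools import chain, permutations, product


def expand(fragment):
    """List every fully ordered chain that a partially ordered fragment allows.

    fragment: list of groups; each group is a tuple of residues whose
              internal order is unknown, e.g. [("A", "B", "C"), ("D", "E")]
              for (A, B, C)(D, E).
    Returns a list of tuples of residues, one per distinct ordering.
    """
    # Distinct orderings of each group; set() removes duplicates that
    # arise when a residue is repeated, as in (B, B, D).
    per_group = [sorted(set(permutations(group))) for group in fragment]
    # Groups themselves stay in their given left-to-right order.
    return [tuple(chain.from_iterable(combo)) for combo in product(*per_group)]


if __name__ == "__main__":
    orderings = expand([("A", "B", "C"), ("D", "E")])
    print(len(orderings))          # the 6 x 2 count quoted in the text
    for ordering in orderings:
        print("".join(ordering))
```

Because the count is a product of factorials of the group sizes, exhaustive expansion is only practical for small groups, which is consistent with the growth in difficulty noted above.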
A SIMPLIFIED ILLUSTRATION

Discussion of the programming methods utilized will be clarified if we first consider a simple illustration. Suppose a complete breakdown P is made by the biochemist as in Figure 3, and that another breakdown Q is also known but not complete (see Figure 3).

    Complete breakdown P LIST            Incomplete breakdown Q LIST
    p1  (R)(A,B)                         q1  (A)(B,B,D)
    p2  (D)(B)(C,A)                      q2  (A)(C,A,C)
    p3  (A)(C)(D)(X,A)(C)                q3  (X)(A,C,B)
    p4  (B)(D)(B)(D)(A,B,D)              q4  (B)(A,A,C,D)
    p5  (C)(A)(Z)                        q5  (A)(B,C,D)

Figure 3. Breakdowns of protein fragments for the illustration in the text.

In Figure 4 we show how such breakdowns P and Q can occur from our hypothetical protein, but the problem is to reconstruct this from the fragments given in Figure 3.

[Figure 4. Illustration of sources of peptide fragments from protein molecule illustrated in the text.]

Since each fragment qi of breakdown Q must either overlap several fragments of P, or else be included within some fragment of P, let us start by making a list for each qi of all possible such associated P fragments; Figure 5 shows such lists for our illustration.

[Figure 5. The q lists for the illustration in the text.]

As an example of how each entry in a list is found, consider the test of whether or not q4 overlaps p4p5, where q4 is (B)(A,A,C,D) and where p4 is (B)(D)(B)(D)(A,B,D) and p5 is (C)(A)(Z). The problem is to determine if each acid of q4 can be accounted for in p4 and p5. First note that the maximum overlap between q4 and p4 is (B,A,D), on the right of p4. This leaves (A,C) of q4 "hanging over" on the right of p4, to be accounted for in p5. This is clearly possible, resulting in the overlap:

    p4                        |  p5
    (B)(D)(B)(D)(B)(A,D)      |  (C)(A)(Z)
                (B)(A,D)      |  (C)(A)          q4

In order to determine all the entries in all the lists, such trials must be made by the computer for every pair of fragments pipj for each qk.

However, just forming the lists is but the first step in reconstructing the protein chain. The next step is an elimination process to leave only the consistent possibilities. For instance, q3 can only arise from p3p4; hence p3 must be followed by p4, and p4 must be preceded by p3, and therefore all other possibilities in other lists involving p3 and p4 can be eliminated, such as p1p4, p2p4 and p5p4 in the q1 list of Figure 5, p3p5 in the q2 list, p4p3 in the q4 list, and p2p4 and p3p2 in the q5 list.* This leaves in the list for q1 only p1p2 and p4p2. If we first assume that p1p2 is overlapped by q1, then in the q4 list only p4p5 remains, and hence in the q2 list only p2p3 remains, giving altogether these adjacent fragments,

    p1p2, p2p3, p3p4, p4p5

which determine the structure as p1p2p3p4p5. On utilizing p1, p2, p3 and p4 we find as the final structure:

    (R)(A)(B)(D)(B)(A)(C)(A)(C)(D)(X)(A)(C)(B)(D)(B)(D)(B)(A,D)(C)(A)(Z)

Returning to the second possibility in the q1 overlap list, namely p4p2, this leaves only p1p3 in the q4 list, which in turn leaves only p2p5 in the q2 list. Hence a second possibility for adjacent fragments is

    p1p3, p3p4, p4p2, p2p5

which gives the structure

    (R)(B)(A)(A)(C)(D)(X)(A)(C)(B)(D)(B)(D)(D)(A)(B)(D)(B)(A)(C)(C)(A)(Z)

*Actually, it is also necessary to eliminate the occurrence of p4 in the lists by replacing it with p3*, which stands for p3p4. This is to insure that an impossible succession of conditions such as p1p2, p2p3, p3p1 is not produced.

COMPUTER HANDLING OF BIOCHEMICAL INFORMATION

The problems involved in writing a computer program, however, are not as straightforward as the above illustration might indicate.
The biochemist utilizes enzymes to break up (hydrolyze) the protein into the fragments that we have been considering; these fragments are called peptides by the biochemist. The enzymes commonly used, such as subtilisin and chymotrypsin, produce an assortment of peptides which may overlap each other. Hence a problem arises in actually arriving at a complete set of peptide fragments, as illustrated above in the breakdown P. In addition, for several reasons, the biochemical experiments very often do not result in integer values for the number of amino acids of a particular kind that occur in a peptide fragment. This second uncertainty problem must also be taken into account by the computer program. Furthermore, there may be experimental errors in the amino acid composition and ordering of some peptides.

In the case of overlapping peptide fragments from a hydrolytic breakdown, the computer program tries to reconstruct a complete set of fragments from overlapping subsets of fragments. The procedure for accomplishing this is to look for every group of two, three, or four acids known to be adjacent in some peptide. Then, for each such group, the probability that this particular group will occur again in the protein chain is computed from the amino acid frequency data. For instance, if the ordered pair LYS-PHE occurs, and it is known that there are five LYS residues and four PHE residues in the entire protein chain of, say, 150 amino acid links altogether, then the probability that another such LYS-PHE pair will occur is approximately 4 x 3/150, since one LYS and one PHE are already accounted for by the observed pair. If the probability is small that another such group occurs, it is most likely that all of the peptides containing this group should arise from the same part of the protein; hence these peptides are sorted out. All possible fragments that can be reconstructed from these (overlapping) peptides are then determined.
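The LYS-PHE estimate above can be written out as a small calculation. The sketch below covers only the ordered-pair case quoted in the text; the function names `repeat_pair_probability` and `is_rare`, and the cutoff value 0.1, are assumptions chosen for illustration, since the text says only that a group is treated as improbable when it is "less probable than some chosen value".

```python
def repeat_pair_probability(count_a, count_b, chain_length):
    """Approximate chance that an observed ordered pair a-b occurs again.

    One residue of each kind is already used by the observed pair, so
    (count_a - 1) * (count_b - 1) candidate pairings remain among
    chain_length positions, as in the 4 x 3 / 150 example. This is a
    rough estimate and is only meaningful when the result is small.
    """
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    remaining_a = max(count_a - 1, 0)
    remaining_b = max(count_b - 1, 0)
    return remaining_a * remaining_b / chain_length


def is_rare(count_a, count_b, chain_length, threshold=0.1):
    """True when a repeat of the pair is less probable than the cutoff.

    The default threshold is a hypothetical value, not one from the paper.
    """
    return repeat_pair_probability(count_a, count_b, chain_length) < threshold


if __name__ == "__main__":
    # Five LYS and four PHE residues in a chain of 150 links.
    print(repeat_pair_probability(5, 4, 150))
    print(is_rare(5, 4, 150))
```

Peptides sharing a group that passes such a rarity test are the ones the program sorts out and attempts to merge.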
It may happen, however, that all these peptides cannot "fit" into any reconstructed fragment; this indicates either that some peptide must arise from a different place on the protein or that there may be an experimental error. In such a case the experimental results are reconsidered from a chemical point of view. Of course, there is a small but finite probability that a misinterpretation can be made at this point and an erroneous peptide constructed, such as could occur if a highly unlikely configuration actually occurred more than once in the protein or if there were lost peptides from a particular region but the existing peptides fortuitously fit. However, it is likely that in any case, in later building up of the protein, an inconsistency would arise, leading to the rejection of this erroneously constructed peptide.

Some peptides may contain two or more groups on which searches are made. Reconstructed fragments containing these peptides must be joined together themselves. Hence the program merges these fragments to obtain all possible larger fragments. Such procedures will fix the relationships of many amino acids beyond that given in the initial data. These new relationships change the probabilities of occurrence of the two, three, or four amino acid groups. The probabilities are accordingly recalculated, and once more searches on improbable groups are made, leading to further merges of fragments into even larger fragments. This process is repeated by an iterative program until fewer than about 20 fragments remain.

Further details must be taken into consideration by the program: the set of fragments may still not be complete; there may exist alternative possibilities for a fragment; and there may be gaps in the chain where all the peptides were lost.
After obtaining a complete, or almost complete, set of fragments by iteration of the searching procedure, the program can continue toward reconstructing the entire protein utilizing the remaining peptides not used in the building up of the complete set P as the Q set of peptides (see example above).

It should be noted that in the various phases of the reconstruction of the protein the assumption is made that the total amino acid content of the entire protein and of the fragments is known. There is always, however, some experimental uncertainty in the number of amino acids of each type in the protein, to within a fraction of one amino acid. As a rule, the larger number of amino acids is always chosen initially. If an extra acid is thereby included in the computations, it may be eliminated at the end, by a procedure described below.

Completing the final reconstruction of the protein again can present further details which must be taken into consideration by the computer program. Some peptides may appear to overlap at only one amino acid. If this occurs it would be unwise to conclude definitely that this represented a true overlap. Hence "pseudofragments" are used which consist of each overlapping fragment without the common amino acid. Single amino acids to complete the P set, as required by the amino acid constitution of the protein, are considered with the larger fragments. If extra amino acids are so included, the final answer showing which fragments must be attached may place no attachment restrictions on these extra acids. In this case, if the acid arose from a fractional experimental result (see above), one may presume that it does not actually occur in the molecule. Otherwise further experimentation may be required. For example, if amino acid X is added to the P list in Figure 3, the Q lists will be unaffected and the resulting answers will be unchanged.
One might conclude that either X really didn't belong in the P list or it could be at either end of the molecule. It is sometimes known which amino acids are on the right and left ends of the protein itself, and this information can further reduce the final possibilities.

DESCRIPTION OF COMPUTER PROGRAMMING SYSTEM

The computer programming system to aid protein analysis has been written in a flexible manner. The computer input and output is in terms of three-letter abbreviations for the amino acids, with the parentheses notation for ordered and unordered sets as described above. Intermediate results are printed out for examination by the biochemist; in fact the entire process is geared for a close cooperative effort between the computer and the biochemist during the entire analysis. This is necessary in order to take advantage of the special conditions presented by any particular protein and type of chemical experimental procedures. For example it might be convenient to omit all prolines from the peptides, or not to consider a distinction between Glu and Gln. Special rules might be introduced regarding end-groups from hydrolyses by certain enzymes, etc. Such special considerations can be handled by the programming system, and make it easier to spot errors in the experimental data.

The programming system is based on the following six programs:

(1) MAXLAP: Program to find the maximum possible overlap between any two peptides with any amount of ordering information known.
(2) MERGE: Program to find all possible overlapping configurations of two peptides.
(3) PEPT: Program to find all possible fragments that are consistent with the overlapping of any number of peptides.
(4) SEARCH: Program to search on probabilistic considerations all peptides which contain an unusual group of amino acids.
(5) QLIST: Program to generate the Q lists of possible associated sets of P peptides over which each qi fragment can fit.
(6) LOGRED: Program to perform the logical reduction of the Q lists to obtain all possible protein structures that are consistent with the data.

Since detailed flow diagrams would consume too much space and not be appropriate for the present discussion, we have included here only gross overall flow diagrams of these programs. Each of the six programs will now be described, and simple examples illustrating some of the methods involved will be given. (For further details see "Sequencing of Amino Acids in Proteins Using Computer Aids," Report No. 62072/8710, National Biomedical Research Foundation, Silver Spring, Md., July 1962.)

Program MAXLAP (Figure 6). In this program p and q are peptide fragments, and PCOM and QCOM are lists of acids from these peptides respectively which may provide the maximum overlap. After setting up a tentative maximum number of positions (i.e., amino acids) of overlap, three cases may be distinguished as illustrated by the three examples of Figure 7. In the first example there is the successful maximum overlap situation where all of p or q is overlapped. Here MV is the list of all the acids from PCOM and QCOM which match; with this maximum overlap, the complete maximum overlap protein fragment is shown. In the second example all of p cannot be overlapped by q, and hence new tentative overlap positions must be assumed. Here F is the limiting acid since it does not appear in q. In the third example even though D appears in both p and q, new tentative overlap positions must be assumed, because in p the D acid is restricted in position at the right.

[Figure 6. Flow Chart for Program MAXLAP. The boxes of the chart read:]
  - START.
  - Assume tentative maximum number of overlap positions of p and q.
  - Clear PCOM, QCOM, and MV.
  - Enter first group of p in PCOM. Enter first group of q in QCOM.
  - Is p or q used up? If yes: maximum overlap is found in MV (see example I).
  - Add next group from peptide list to empty list PCOM or QCOM.
  - Match elements of PCOM and QCOM, and move matching acids into MV.
  - Is PCOM or QCOM empty?
  - Are there more elements in QCOM than in PCOM?
  - Shift q to right the minimum amount so that all these PCOM elements can appear in the nonoverlapping first portion of p. Deduce new number of positions in the tentative maximum overlap and the content of the first p group (see example II).
  - Shift q to right until all QCOM elements can match elements of p or extend past the end of p. Deduce the new number of positions in the tentative maximum overlap and the content of the first p group (see example III).
  - Are there any elements in the tentative p structure? If no: no overlap of p and q.
  - DONE; RETURN.

    Example I
      p:                                          (A,B)(C,D)
      q:                                          (C,B,A)(F,G,D)
      MV:                                         (A,B)(C)(D)
      Protein fragment with maximum overlap:      (A,B)(C)(D)(F,G)

    Example II
      p:                                          (C)(D,E,F)
      q:                                          (D,C,E)(G,H)
      New tentative maximum overlap structures:   (C)(F,D,E)
                                                  (D,C,E)(G,H)
      Protein fragment with maximum overlap:      (C)(F)(D,E)(C)(G,H)

    Example III
      p:                                          (A,B)(B,E,A)(D)
      q:                                          (B)(A,D,E)(F)
      New tentative maximum overlap structures:   (A,B)(B,E,A)(D)
                                                  (B)(A,D,E)(F)
      Protein fragment with maximum overlap:      (A,B)(B)(A,E)(D)(F)

Figure 7. Examples of the three cases considered in flow chart of program MAXLAP.

Program MERGE (Figure 8). Once the maximum overlap has been determined, all other possible overlaps can be determined. Several cases can occur. First there are the essentially trivial cases of p and q entirely disjoint, as in

    q: (D,E)
    p: (A,B,C)

or else q a subset of p, as in

    q: (B,C)
    p: (A,B,C,D)

If neither of these is the case, as for example in

    q: (B)(B,D)(G)
    p: (A,B,B,D)

the first overlapping group of p is considered, which for our later example is

    (B,B,D)

If this first overlapping group contains not more than five nor less than two acids, it warrants further consideration. A list is made of all singles, pairs, triplets and quadruplets of acids that can be formed from this overlapping group, which for our example is

    (B)    (D)    (B,D)    (B,B)

Next each of these is examined to see if it can overlap with q. For an example we have respectively

    (A,B,D)[(B)](B,D)(G), none, (A,B)[(B)(D)](B)(G), and (A,D)[(B)(B)](D)(G)

where square brackets mark the overlapping group (underlined in the original) to the left of which p fits and to the right of which q fits. Finally the peptide is re-formed starting with the next group, and so forth, until all the overlapping groups of p and q have been considered.

[Figure 8. Flow Chart for Program MERGE. The boxes of the chart read:]
  - Find the structure of maximum overlap of p and q.
  - p and q are derived from separate portions of the molecule.
  - q is wholly contained in p, but not in a unique position.
  - p and q can be derived from the same portion of the protein chain. List the structure.
  - Does the first overlapping group of p contain one amino acid?
  - Does the first overlapping group of p contain more than five acids? [...]mation to include at this time. Return.
  - Generate all unique single acids, pairs of acids, triplets or quadruplets from this first overlapping group of p.
  - Can q overlap each whole peptide formed from this first group and the rest of the p peptide?
  - Reform p peptide starting with the next p group.

Program PEPT (Figure 9). This program extends the previously discussed program MERGE in that it finds all of the possible structures of the protein chain consistent with the overlapping of all the peptides obtained from a search for all experimental peptides with a rare configuration of amino acids. The overlapping portion of these acids must contain the group of rare amino acids, called SAA in the flow chart. The list resulting from the search is called MCO in the flow chart.

[Figure 9. Flow Chart for Program PEPT. The boxes of the chart read:]
  - START.
  - For one MCO(N) and for one MCO(J) where J > N, find all the overlaps that contain the elements of SAA. Continue through J and N until successful.
  - Form a list of merged peptides.
  - For all MCO(K) where K > J, find all the overlaps of the MCO(K) peptide with each peptide of the previous merged list.
  - Print out all the peptides that could not merge. Print out all the peptides too ambiguous in structure to consider at this time. Print out all the peptides that are merged.
  - Print out the merged structures.
  - END.

Program SEARCH (Figure 10). This program systematically looks at each pair, triplet, and quadruplet of amino acids that are known to occur together from the experimental peptide fragment data. For each such group, the probability of its occurrence is computed from the amino acid frequency data as described above. A list, called Num(L) in the flow chart, is made of these groups of amino acids known to occur together which are improbable of occurrence (i.e., less probable than some chosen value). The letter L is used to index the elements of the list Num(L). Finally the program utilizes PEPT to generate and print out all possible merged structures. For example, suppose a search was made on the ordered pair LYS-PHE and there resulted the following fragments:

    (ALA)(ALA, ALA, LYS)(PHE)
    (ALA)(ALA, LYS)(PHE)
    (ALA)(LYS, PHE, GLU, ARG)(GLU)
    (LYS)(PHE)

The merged structure becomes

    (ALA)(ALA)(ALA)(LYS)(PHE)(GLU, ARG)(GLU)

[Figure 10. Flow Chart for Program SEARCH. The boxes of the chart read:]
  - For N = 2, 3, or 4 search all the peptides for N amino acids known together.
  - Compute probability that these N acids might occur together in two places in the protein chain, from the amino acid frequency data.
  - Is part or all of this amino acid group already listed in Num(L)?
  - List group of acids in Num(L). List peptide number.
  - For each input peptide, can it contain the amino acids Num(L) adjacent to each other?
  - Call PEPT to determine all of the possible protein chain configurations consistent with these peptides all overlapping by the amount of the searched-on acids, Num(L).
  - Print out the sorted input peptides and the possible merged structures.

Program QLIST (Figure 11). This program forms the lists of peptides related by each fragment qi. It is to be noted that each element of a Q list may contain up to five p fragments (although in our example of section 2 only two peptides appeared in each element of the Q lists). The input to this program is a list P of peptides which in some order will reconstruct the original protein and a list Q of peptides which give additional ordering information about the protein. In the flow chart P' is a hypothetical peptide constructed by the juxtaposition of up to five P peptides.

[Figure 11. Flow Chart for Program QLIST. The boxes of the chart read:]
  - START.
  - For all q peptides.
  - For all p peptides.
  - Form new P' by adding the next peptide in the P list to the right end of P'.
  - Can q overlap some of each p peptide in P'?
  - Does q still extend to the right of P'?
  - Does P' contain 5 p peptides?
  - Print out a list of P peptide numbers in P'. This is a possible configuration of the protein, from which q was derived.
  - Remove the last p peptide in P'.
  - Was this the last peptide in the P list?
  - Is P' empty?
  - Was this the last q peptide in Q list?
  - END.

Program LOGRED (Figure 12). In this program the Q lists are given.
Calling each term of a Q list a condition, the flow chart involves the lists: MQ(M) of conditions on the assumption of the Mth tentative condition; MQI of conditions being considered; and IR(M) of tentative conditions which determine a possible protein structure. The symbol MTI(J) is the last condition considered in the Jth Q list. The program follows a tree structure of possibilities, keeping track of tentative conditions until a branch is eliminated or comes to a successful conclusion. The program follows, with greater generalization, the method of the "simplified illustration" of the second section of this paper.

Figure 12. Flow Chart for Program LOGRED. Box text:
  START
  Let M = 1, MTI(J) = 0.
  Find the shortest Q list of those remaining to be considered.
  Any more possible conditions to be considered in this list?
  MTI(J) = MTI(J) + 1.
  Assume that condition number MTI(J) is true.
  Replace by first peptide of tentative condition any other peptides of this condition which occur in the other conditions.
  Any lists vanish?
  Transfer MQI lists to MQ(M). Store tentative condition in IR(M).
  M = M + 1.
  All Q conditions satisfied?
  Print out tentative condition list IR. These P peptide associations determine a possible protein structure.
  Clear IR(M). Transfer MQ(M) lists to MQI.
  Clear Mth lists of conditions MQ(M) and tentative condition IR(M).
  M = M - 1.
  M = 0?
  Transfer Mth reduced lists from MQ(M) to Q lists.
  Exit

SUMMARY AND CONCLUSION

The computer program described in this paper becomes useful when there is a large number of small peptide fragments resulting from the breakdowns which are to be woven into a consistent and unique structure.
This is a long, tedious process when carried out by hand, and is subject to careless errors and impatient overlooking of all alternative possibilities.* The completed IBM 7090 computer program has been successfully tested on a hypothetical subtilisin breakdown of ribonuclease into over eighty fragments.

Just as the proteins are composed of chains of the same types of molecules, the genetic substances desoxyribonucleic acid (DNA) and ribonucleic acid (RNA) are composed of chains of only four different types of molecules called the nucleotide bases. It is possible that the order of the molecules in these substances can also be determined by the aid of this computer program, and some computer experiments in this direction have been made. However, application of these techniques to DNA and RNA still awaits further development in the chemical experimental methods.

*Dr. Wm. Dreyer, of National Institutes of Health, has developed chemical techniques for isolating a large percentage of the peptides formed in a subtilisin hydrolysis, and for determining the total amino acid content and identifying the right and left ends of the peptide fragments. This experimental technique is very rapid and can be mechanized to a large extent; thus data taking should be reduced to months instead of years. The computer program is ideally suited for analysing this type of data.

The computations in this paper were done in the Computing Center of the Johns Hopkins Medical Institutions, which is supported by Research Grant, RG 9058, from the National Institutes of Health and by Educational Contributions from the International Business Machines Corporation.

REFERENCES

1. Sanger, F., Science, 129, 1340 (1959). Hirs, C. H. W., Moore, S., Stein, W. H., J. Biol. Chem. 235, 633 (1960). Spackman, D. H., Stein, W. H., Moore, S., J. Biol. Chem. 235, 638 (1960).
2. Braunitzer, G., Gehring-Mueller, R., Hilschmann, N., Hilse, K., Hobom, G., Rudloff, V., and Wittmann-Liebold, B., Hoppe Seyler Z. Physiol. Chem., 325, pp. 283-6, Sept. 20, 1961.
3. Hirs, C. H. W., Moore, S., and Stein, W. H., J. Biol. Chem. 235, 633 (1960).
4. Tsugita, A., Gish, D., Young, J., Fraenkel-Conrat, H., Knight, C. A. and Stanley, W. M., Proc. Natl. Acad. Sci., Vol. 46, pp. 1463-9, 1960.
5. Kendrew, J., Watson, H., Strandberg, B., Dickerson, R., Phillips, D. and Shore, V., Nature, Vol. 190, pp. 666-70, May 20, 1961.
6. Margoliash, E., Smith, E., Kreil, G., and Tuppy, H., Nature 192, 1121-7 (Dec. 1961).
7. Ledley, R. S., Report on the Use of Computers in Biology and Medicine, Natl. Acad. of Scis.-Natl. Research Council, May 1960.
8. Bradley, D. F., Bernhard, S. A., Duda, W. L., unpublished work.
9. Eck, R., Nature, Jan. 20, 1962, 241-243.

USING GIFS IN THE ANALYSIS AND DESIGN OF PROCESS SYSTEMS

William H. Dodrill
Service Bureau Corporation
Subsidiary of International Business Machines

A process system may be defined as an integrated combination of equipment which functions to produce one or more products, and possibly byproducts, by altering the physical and/or chemical nature of raw materials. Even though there is wide diversity in products produced, raw materials used, and equipment combinations employed, all process systems exhibit certain fundamental similarities. Characteristically, any process system may be subdivided into three operational phases. These are: preparation of raw materials, conversion to products, and recovery and purification. All three phases may not necessarily be included in a specific process system, nor is there always clear distinction among them. Still, there is universality in practice in that only a limited number of types of equipment are used to perform specific types of operations.
As an example of a relatively simple, but fairly typical, process system, consider the flow diagram for high-temperature isomerization reproduced in Figure 1. This is an operation performed in petroleum refining to improve the octane rating of certain gasoline constituents. It consists of a reactor for conversion, a flash drum and three distillation columns for recovery and purification, and several pieces of auxiliary equipment. Distillation 1 also serves in preliminary preparation of the feed.

Figure 1. High-Temperature Isomerization

Let us suppose that a unit similar to this is to be built for a particular refinery. Before individual pieces of equipment can be ordered or fabricated, it is necessary that their sizes and capacities be specified. This type of analysis is termed process design.

Prior to the general availability of computer methods, the process design engineer was forced to rely almost exclusively on information obtained from physical systems. Thus, if information about a similar unit, which was operating satisfactorily, was available, the design engineer merely gave the same or slightly modified specifications. For new processes, or processes for which comparison information was not available, he had to resort to scaleup from a pilot unit, construction of the pilot unit, in turn, being based on scaleup from a bench or laboratory unit. Thus, information required for construction of each process system was derived from a bench or laboratory model, through pilot unit, to production unit. This procedure is costly and time consuming. Further, the difficulties involved in experimenting with physical systems, especially of plant scale, all but prohibit investigation of anything more than minor design modifications.
As an alternate method of design, many computational methods are used. However, rigorous computations are too laborious for practical hand solution. As a consequence, only shortcut and approximate methods are used regularly, and their primary application is limited to supplementing plant and pilot information. Now that economical and reliable machine computations are readily available, lengthy, as well as complex, computational procedures are increasing in utility. Design is still ultimately based on comparison information, but this dependence is becoming less critical.

One of the many applications of computers in process design is in implementing the use of mathematical models. A mathematical model may be defined as a series of arithmetical operations performed to compute numerical values which represent certain performance characteristics of a physical system. The required input information consists of numerical values for design and operating conditions. A wide range in complexity is exhibited by the mathematical models which are programmed for computer solution. Some, such as for the mixing of two streams, are very simple, while others are very complex, requiring many hours of high-speed computer time for solution. The general concept, though, is the same. That is, given a numerical evaluation of operating and design conditions, the objective is to compute an estimate of performance characteristics for a corresponding physical system. Actually, this describes the simulation of a physical system using a mathematical model. An alternate type of mathematical model can be formulated for direct design. Accordingly, given numerical values for operating parameters and performance restrictions, numerical values for required design can be computed directly. Formulation of this type of model is normally much more difficult.

Many simulations for commonly used pieces of industrial equipment are being processed daily.
However, this often requires theoretical isolation from the process system of which the piece of equipment is an integral part, and so may result in significant error. That is, the functional interrelationship of pieces of process equipment may be so significant, even with respect to the piece being studied, that gross errors in analysis may result when the effects of proposed modifications on the system are neglected. This is one of the primary motivations behind the development of Generalized Interrelated Flow Simulation, GIFS. That is, even more important than providing a convenient means of using preprogrammed mathematical models, GIFS provides a means of analyzing a process system as an integrated network.

The development of GIFS (Generalized Interrelated Flow Simulation) is based on study of a wide variety of industries. Representative among these are: petroleum refining, metals production, and manufacture of chemicals, pharmaceuticals, pigments, plastics, and pulp and paper. In essence, all of these represent special cases of continuous flow systems. Therefore, the design of GIFS is based not on the analysis of specific production systems, but rather, on the much broader scope of flow systems analysis in general.

The development of a generalized flow network method, which is easy to use as well as applicable to a wide variety of process systems, poses a number of difficulties. Outstanding among these is the ever-present possibility of system over- or under-specification. That is, there is a finite number of variables which, when fixed, completely determine a system, and all other pertinent variables can be computed. There is always the possibility that an engineer will attempt to set either more or fewer variables than are required for complete specification of a process system.
In order to insure against this possibility, GIFS has been purposely restricted to analysis of cases which correspond to physical systems that are completely determined and which are not over specified. Thus, GIFS is applicable only to the generation of a mathematical model to simulate the functioning of an entire integrated process system. No attempt has yet been made to incorporate direct design aspects in this model. In short, the method can be used only for the performance evaluation of a fixed process system at fixed operating conditions and with fixed feeds. This is really not a drastic limitation since design as well as optimization can be approached through multiple case studies.

In application, each system under consideration is represented as a network of interrelated stream flows. Figure 1 is an example. Each stream is identified by assigning to it a unique stream number as indicated. Numerical values which are used to describe the nature of a stream are termed stream properties. Stream properties include total flow rate, composition, temperature, pressure, heat content, and phase state. Restrictive relationships among streams, such as are simulated by preprogrammed mathematical models for specific types of process equipment, are indicated by what is termed unit computations. Examples of unit computations include the simulation of process equipment, such as reactors, distillation columns, heat exchangers, etc., as well as factors which have no direct equipment counterparts (pressure drops in lines and ambient heat exchange).

The currently available library includes 21 unit computations. These are summarized in Table 1. Complete descriptions are given in the GIFS user's manual.* They represent a collection which is basic in nature and highly versatile. Further, GIFS is constructed so as not to be limited to the library of unit computations available at any one time.
At present, four additional unit computations are being prepared for inclusion in the library, and others can be added as specific needs arise.

Figure 2 is a reproduction of one page of the input data sheet which describes Figure 1 in terms of unit computations. Each unit computation is specified by giving the unit computation type, associated stream number or numbers, and arguments if any. A unit computation number is included for identification only. It is usually convenient to number unit computations sequentially. The first unit computation in Figure 2 indicates the addition of streams 1 and 19 to obtain stream 3. The unit computation type is STAD (Stream Add), and the associated stream numbers are 1, 19, and 3. The second unit computation indicates the function of distillation column number 1, the associated streams being 3, 12, and 4. The unit computation used is CRSEP (Component Ratio Separation). This is an approximate simulation of a distillation column which requires, as arguments, component recoveries for all components present in the feed (ratio of component flow rate recovered in the overhead product, stream 12, to component flow rate in the feed stream, stream 3). Since there are eleven components for the illustrative system, values for eleven arguments are entered. Additional unit computations necessary to describe the entire system are entered sequentially. Detailed information on the preparation of input data sheets is contained in the user's manual.*

The objective of each computation is an evaluation of all interrelated stream properties. These are computed as a function of the properties of the given feed streams. Each system is computed as an integrated network and in a manner which satisfies fundamental material balance relationships as well as the restrictions defined by the specified unit computations. Systems are normally non-linear, and a method of solution by iteration is employed.
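The idea of computing a network by iteration can be illustrated with a small sketch. It is not the GIFS method, which this paper does not spell out: the successive-substitution loop, the closing of the recycle through a split of stream 4, the extra stream number 5, and all numerical values are assumptions made for illustration. Only the roles of STAD and CRSEP and the stream numbers 1, 19, 3, 12, and 4 follow the description above, and the toy problem is linear where real systems normally are not.

```python
# Toy flow network solved by successive substitution (an assumed method;
# the paper states only that a method of solution by iteration is employed).
# A stream is a list of component flow rates, e.g. in moles/hr.


def stad(stream_a, stream_b):
    """STAD-like step: add two streams component by component."""
    return [a + b for a, b in zip(stream_a, stream_b)]


def crsep(feed, recoveries):
    """CRSEP-like step: split a feed using per-component overhead recoveries."""
    overhead = [f * r for f, r in zip(feed, recoveries)]
    bottoms = [f - o for f, o in zip(feed, overhead)]
    return overhead, bottoms


def splt(feed, fraction):
    """SPLT-like step (hypothetical argument meaning): send a fixed fraction
    of every component to the first outlet and the rest to the second."""
    first = [f * fraction for f in feed]
    second = [f - x for f, x in zip(feed, first)]
    return first, second


def solve_network(feed, recoveries, recycle_fraction, tol=1e-10, max_iter=1000):
    """Iterate on the recycle stream until it stops changing."""
    recycle = [0.0] * len(feed)  # initial guess for stream 19
    for _ in range(max_iter):
        s3 = stad(feed, recycle)                      # 1 + 19 -> 3
        s12, s4 = crsep(s3, recoveries)               # 3 -> 12 (overhead), 4
        new_recycle, s5 = splt(s4, recycle_fraction)  # 4 -> 19 (recycle), 5
        change = max(abs(n - o) for n, o in zip(new_recycle, recycle))
        recycle = new_recycle
        if change < tol:
            return {1: feed, 3: s3, 4: s4, 5: s5, 12: s12, 19: recycle}
    raise RuntimeError("recycle stream did not converge")


if __name__ == "__main__":
    streams = solve_network([100.0, 50.0], [0.9, 0.2], 0.5)
    for number in sorted(streams):
        print(number, [round(x, 4) for x in streams[number]])
```

At convergence the overhead (stream 12) and the hypothetical product (stream 5) together account for the feed, which is the material balance requirement mentioned above.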
As an example of a computer output, just one portion of the report for the illustrative example is reproduced in Figure 3. This is the portion which describes computed properties for stream 9.

*GIFS, Generalized Interrelated Flow Simulation, user's manual is available through any SBC office or local sales representative.

Figure 2. Sample Input Data Sheet (GIFS unit computations form; columns for unit computation number, type, streams, and arguments).

Figure 3. Computer Report (stream no. 9)

No.  Component   Moles/hr.  Mole fr.    Lbs./hr.   Wt. fr.
 1   H2            1.2926    0.0139      2.5853    0.0003
 2   C1            0.3115    0.0033      4.9845    0.0007
 3   C2            0.1001    0.0011      3.0124    0.0004
 4   C3            0.0682    0.0007      3.0062    0.0004
 5   I-C4          0.0472    0.0005      2.7415    0.0004
 6   N-C4          0.2389    0.0026     13.8804    0.0018
 7   I-C5         22.2530    0.2389   1604.4429    0.2125
 8   N-C5          3.4096    0.0366    245.8286    0.0326
 9   I-C6         18.3820    0.1973   1584.5274    0.2098
10   N-C6         44.9652    0.4827   3875.9987    0.5133
11   C7+           2.0943    0.0225    209.8512    0.0278
     Totals       93.1626    1.0000   7550.8590    1.0000

Pressure: 300.00 psia. Temperature: 120.00 deg. F. Liquid flow: 25.04 GPM. Fraction vapor: 0. Vapor flow (Z = 1.0): 0. Heat content: -0. BTU/hr. Vapor moles/hr. are zero for every component; liquid moles/hr. equal the total moles/hr. listed above.

Table 1
Unit Computations Summary

Title                        Type    Streams       Arguments
Stream Add                   STAD    ST1 ST2 ST3
Stream Subtract and Zero     STSBZ   ST1 ST2 ST3
Stream Split                 SPLT    ST1 ST2 ST3   F12
Stream Equate                STEQ    ST1 ST2
Stream Zero                  STZ     ST
Temperature Add              TAD     ST            DT
Heat Add                     QAD     ST            DQ
Pressure Add                 PAD     ST            DP
Temperature Set              TSET    ST            T
Heat Set                     QSET    ST            Q
Pressure Set                 PSET    ST            P
Vapor Ratio Set              RSET    ST            R
Bubble Point                 BPT     ST
Dew Point                    DPT     ST
Isothermal Flash             TFLSH   ST
Adiabatic Flash              QFLSH   ST
Constant R Flash             RFLSH   ST
Stream Heat                  STQ     ST
Equilibrium Separation       EQSEP   ST1 ST2 ST3
Component R Separation       CRSEP   ST1 ST2 ST3   R1 R2 ... RN
Reactor                      REACT   ST1 ST2       NR NC1 NK1 CNV1 S1 ... SNC1 C1 ... CNC1 ... CNCNR

GIFS has been developed, and is now being offered, as one of SBC's preprogrammed computer services. Characteristics of these services include accurate, inexpensive, and rapid processing for a wide variety of problems, use of convenient input data sheets, and the presentation of computed values in clear, comprehensive reports. Cases which have been processed to date represent applications in such fields as petroleum refining, inorganic and organic chemicals manufacture, and pulp and paper production. These have been processed in a routine manner, and have not indicated any unforeseen complications in the approach or the computer implementation. From all indications, companies who are using the service are highly satisfied, and expect to continue using it. Future developmental efforts will depend on industrial response. SBC is already contemplating the implementation of a number of additional features.

ACKNOWLEDGMENTS

Appreciation is expressed to Service Bureau Corporation and to IBM for supplying the materials, equipment, and encouragement needed for this project, and to the SBC personnel who assisted in its completion. Appreciation is also expressed to Prof. L. M.
Naphtali for his original developmental efforts, and to Dr. J. E. Brock for his assistance in the preparation of this documentation.

A DATA COMMUNICATIONS AND PROCESSING SYSTEM FOR CARDIAC ANALYSIS

M. D. Balkovic
Bell Telephone Laboratories
Holmdel, New Jersey

P. C. Pfunke*
A. T. & T. Company
New York City

C. A. Caceres, M. D.
Chief, Instrumentation Unit
Heart Disease Control Program
U. S. Department of Health, Education, and Welfare
Washington 25, D. C.

C. A. Steinberg
Department of Medical and Biological Physics
Airborne Instruments Laboratory
Deer Park, Long Island, New York

Many aspects of medical diagnoses involve extensive, tedious procedures in which the capabilities of a digital computer can provide the physician an invaluable aid.† Because of the complexity and cost of modern computers, systems to aid in diagnostic procedures must generally be centrally located and be capable of serving many physicians. Thus, data communication links between the physicians and the computer location are required. Such a data communication and processing system for cardiac analysis is now in operation and is described in this paper.‡

*Will present paper.
†Reference #1.
‡This is a pilot project facility of the Instrumentation Unit, Heart Disease Control Program, Division of Chronic Diseases, U. S. Dept. of Health, Education and Welfare, Washington 25, D. C.

The cardiac data processing system discussed in this paper comprises data acquisition units located throughout the country, data communication links which can be established as required, a data processing unit, and print-out devices.

Figure 1. Data Acquisition Unit.

The data acquisition unit, as shown in Figure 1, provides the capability of recording an electrocardiographic signal simultaneously on a graphical recorder and on a magnetic tape recorder. Prior to recording the electrocardiogram, an 8-digit, serial, binary-coded
decimal number is generated and recorded on magnetic tape. This 8-digit number is used to identify the particular data that is recorded. Two digits indicate the place where a recording is taken; four digits indicate the patient's number, and the last two digits indicate the particular electrocardiographic lead that is recorded. The electrocardiographic signal is modulated using pulse repetition frequency modulation prior to recording on magnetic tape in order to achieve a frequency response down to 0.1 cycles per second. The upper frequency response extends to 200 cycles per second.

Figure 2 is a block diagram of the data acquisition unit. The input electrocardiogram is amplified prior to recording. A coder generates the 8-digit number corresponding to the location, patient and lead. The code number followed by the amplified electrocardiogram is fed to a pulse repetition frequency modulator whose center frequency is 3600 cycles per second. The output of the modulator is recorded on magnetic tape. The output of a second head on the tape recorder is demodulated and the demodulated signal is fed to a graphical recorder and/or a DATA-PHONE data set for transmission to the data processing unit.

Figure 2. Data Acquisition Unit. (Blocks: electrode inputs RA, LA, RL, LL, PC; lead selector; ECG amplifier; identification numbers; tape identification unit; modulator; tape transport; demodulator; frequency compensation; oscilloscope display; graphical recorder; output to data set.)

The magnetic tape records generated by the data acquisition unit contain completely identified electrocardiographic signals. These magnetic tape records can be sent via DATA-PHONE service to the data processing unit for subsequent analysis and processing.

Communication between the data acquisition units and the data processing unit is achieved over DATA-PHONE service on switched telephone facilities. Specially designed and constructed analog transmitting data sets are located at the data acquisition units while receiving data sets are located at the data processing unit. These analog data sets incorporate integrated telephone sets which allow, when in the voice mode of operation, the dialing of calls and normal voice communication over the regular switched telephone network. When in the data mode of operation, the data set provides modulation and demodulation circuitry to allow the transmission of analog signals with frequencies from 0 to 200 cps over dialed-up voice telephone facilities. Input to output linearity from transmitting to receiving data sets is maintained to better than 1%.

Figure 3 shows a block diagram of the electrocardiogram data set transmitter. The input circuitry provides a load impedance of 1000 ohms and accepts analog signals which can range from -3 volts to +3 volts and can contain frequency components between 0 and 200 cps. The input circuitry acts to adjust the DC level and gain of the input signal such that it provides the proper bias range for the astable multivibrator which follows the input circuit. This astable multivibrator accomplishes the actual frequency modulation. A given voltage delivered to the input of the data set represents a particular bias on the multivibrator and determines the multivibrator's frequency of oscillation.

Figure 3. Data Set Transmitter. (Input: -3 to +3 volts, 0 to 200 cps, 1000 ohms. Blocks: gain and D.C. level adjust; multivibrator; filter and level control. Output: 1000 to 1500 cps F.M., -6 dbm, 900 ohms.)

For the indicated input voltage range, the astable multivibrator will oscillate at frequencies ranging from 1000 cps to 1500 cps when the input circuit level and gain controls are properly adjusted.
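The relation between input voltage and multivibrator frequency can be sketched numerically. The linear law below is an assumption: the paper gives only the end points (-3 to +3 volts, 1000 to 1500 cps) and an overall linearity of better than 1%, and it does not say which voltage polarity maps to the higher frequency. The function names are hypothetical.

```python
# Assumed linear voltage-to-frequency law for the transmitter multivibrator.
# End points are taken from the text; the straight line between them and the
# choice that +3 volts gives 1500 cps are assumptions.
V_MIN, V_MAX = -3.0, 3.0        # input range, volts
F_MIN, F_MAX = 1000.0, 1500.0   # oscillation range, cps


def input_to_frequency(volts):
    """Map an input voltage to the multivibrator frequency in cps."""
    if not V_MIN <= volts <= V_MAX:
        raise ValueError("input outside the -3 to +3 volt range")
    return F_MIN + (volts - V_MIN) * (F_MAX - F_MIN) / (V_MAX - V_MIN)


def frequency_to_input(cps):
    """Inverse mapping, as an ideal receiver would apply it."""
    return V_MIN + (cps - F_MIN) * (V_MAX - V_MIN) / (F_MAX - F_MIN)


if __name__ == "__main__":
    for volts in (-3.0, -1.5, 0.0, 1.5, 3.0):
        print(volts, input_to_frequency(volts))
```

Under this assumption a zero-volt input sits at 1250 cps, the middle of the transmitted band.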
The square wave output of the multivibrator is filtered and attenuated by the transmitter output circuitry. The output circuitry provides a 900 ohm source impedance to the telephone line. The signal transmitted to the telephone line is essentially a sine wave with a fundamental frequency between 1000 cps and 1500 cps at a level of -6 dbm. A bandpass filter removes harmonics above 2500 cps.

Figure 4 shows a block diagram of the electrocardiogram data set receiver. The input bandpass filter terminates the telephone line in 900 ohms at all transmitting frequencies. It has a slope of 12 db/octave to either side of the 1250 cps center frequency of the telephone line signal. Following the bandpass filter is a combined amplifier and limiter, the gain of which determines the sensitivity of the receiver, -38 dbm. The output of this stage is a 1.5 volt square wave whose fundamental frequency is that of the received signal. The square wave is differentiated and full-wave rectified to give a sequence of pulses which correspond to the zero crossings of the received signal. These pulses are used to trigger a mono-pulser which delivers a pulse of fixed width each time it is triggered. Thus, the mono-pulser delivers an output pulse for every zero crossing of the input signal from the telephone line and provides an average output voltage which is proportional to the received signal frequency. By passing the output of the mono-pulser through a low pass filter, the original baseband signal is reconstructed. The output circuitry serves to amplify the signal and restore the correct DC signal level. This output circuitry is designed to drive a 1000 ohm load impedance.

Figure 4. Data Set Receiver. (Input: -6 to -38 dbm, 1000 to 1500 cps F.M., 900 ohms. Blocks include: pulser; filter; level adjust. Output: -3 to +3 volts, 0 to 200 cps.)
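The statement that the mono-pulser's average output is proportional to the received frequency can be checked with a short numerical sketch. It simulates a steady tone, counts zero crossings, and averages fixed-width pulses over the observation time. The sampling rate, pulse width, pulse height, starting phase, and function names are all assumptions for illustration; none of them comes from the paper.

```python
import math

# Pulse-counting demodulator sketch: one fixed-width pulse per zero crossing,
# averaged over the observation interval. All numeric settings are assumed.


def count_zero_crossings(frequency, duration=0.1, sample_rate=100_000, phase=0.5):
    """Count sign changes of a sampled sine tone of the given frequency (cps)."""
    samples = round(duration * sample_rate)
    crossings = 0
    previous = math.sin(phase)
    for n in range(1, samples):
        current = math.sin(2.0 * math.pi * frequency * n / sample_rate + phase)
        if (previous < 0.0) != (current < 0.0):
            crossings += 1
        previous = current
    return crossings


def average_output(frequency, pulse_width=0.0002, pulse_height=1.0, duration=0.1):
    """Average level of a train of fixed-width pulses, one per zero crossing.

    The pulse width must be shorter than half a period of the highest
    frequency so that successive pulses do not overlap.
    """
    crossings = count_zero_crossings(frequency, duration)
    return crossings * pulse_width * pulse_height / duration


if __name__ == "__main__":
    for cps in (1000.0, 1250.0, 1500.0):
        print(cps, average_output(cps))
```

With these settings the averages for 1000, 1250, and 1500 cps stand in the ratio 4 : 5 : 6, so a low pass filter followed by a level shift recovers a signal that tracks the transmitted frequency.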
The telephone line signal spectrum (1000 cps to 1500 cps) is chosen to avoid falling within the same band used by various kinds of single-frequency signalling equipment employed in Bell System switching facilities. If this were not true, automatic circuit disconnect could occur when certain signal patterns are transmitted. The input sensitivity of the data set receiver is chosen to be appropriate for the range of typical dialed-up telephone connections.

Figure 5 is a block diagram of the equipment in the data processing unit. This consists of three basic units: an input console, a digital computer, and an output unit. The input console can accept data that is transmitted over the telephone line or data obtained from playing back magnetic tape records made in the field. The data transmitted over the telephone line is recorded on one of the two magnetic tape recorders. If it is desired, the data transmitted over the telephone line can be recorded on magnetic tape and fed into the computer simultaneously.

The input console is capable of automatically searching the magnetic tape for any pre-selected set of identification numbers.* Once the proper set of identification numbers is found, these numbers are decoded and fed into the computer and serve to identify the result of the data processing. Alternately, the system can accept any electrocardiogram and set of identification numbers, whether transmitted over the telephone line or played back from a magnetic tape recording, decode the identification and feed the identification number into the digital computer.

An oscilloscope and photographic recorder are incorporated into the system to monitor the electrocardiogram while recording and playing back. The electrocardiogram is filtered with a bandpass filter to eliminate high frequency noise. The derivative of the filtered electrocardiogram is then obtained using conventional analog techniques.
The filtered electrocardiogram and the derivative of the electrocardiogram are converted to digital form at a rate of 500 samples per ~rence #2. Proceedings-Fall Joint Computer Conference, 1962 / 283 I"PUTS FROM 'b.fPHONE LINES IDENTIFICATiON { NUMBERS Figure 5. The Data Processing Center. second using a successive approximation analog to digital converter. The digital representation of these two signals, along with the decoded identification numbers, are then fed directly to the digital computer for subsequent processing. Figure 6 is a photograph of the system. The input console is at the left and the computer is at the right. Figure 6. Data Processing Unit. The digital computer is a Control Data Corporation 160A computer (CDC 160A) and is programmed to automatically recognize and measure the wave forms of the electrocardiogram. >:~ The cardiologist uses these measurements to interpret the electrocardiogram. The measurements are of amplitudes and duration of the waves, and the interval between certain of them. The program recognizes whether some waves are diphasic, bi-fed or tri-fed. The measurements made by the computer were programmed based on conventional EKG criteria for wave onset, termination and slope. t The computer processes an electrocardiographic lead and arrives at the desired measurements in 12 seconds. The entire input, process, and write-out time is 52 seconds. The output of the computer is printed on an automatic typewriter or punched on paper tape that can be used for telephone transmission of the results back to the sender. Although at present verbal telephone communication is used for return of results, a completely electronic system has been tried. t *Reference #3. #4. #5. t Reference t Reference 284 / A Data Communications and Processing System for Cardiac Analysis The data communication link used for output data from the data processing unit has consisted of DAT A-PHONE service over switched telephone facilities. 
Bell System Data Sets 402A and 402B have been used in conjunction with Tally Register Corporation Model 51 tape-to-tape equipment. This communication link is capable of transmitting 8-level parallel digital data at a rate of 75 characters per second.

The approach taken in the system described in this paper can be extended to other medical diagnostic processes such as the analysis of phonocardiograms, electroencephalograms, etc. There is great potential here for electronics to assist the medical profession.

REFERENCES

1. Rikli, Arthur E., and Caceres, Cesar A., 1960. The Use of Computers by Physicians as a Diagnostic Aid. Reprinted from The New York Academy of Sciences, Ser. II, Volume 23, No. 3, Pages 237-239.
2. Paine, L. W., and Steinberg, C. A. A Medical Magnetic Tape Coding and Searching System. Presented to the Spring, 1962 IRE Convention.
3. Steinberg, C. A., Abraham, S., and Caceres, C. A. Pattern Recognition in the Clinical Electrocardiogram. Presented at The Fourth International Conference on Medical Electronics, New York, New York, July, 1961.
4. Caceres, Cesar A. How Can the Waveforms of a Clinical Electrocardiogram be Measured Automatically by a Computer? Presented at a meeting of the Professional Group on Bio-Medical Electronics (local chapter), New York, New York, February, 1960.
5. Rikli, Arthur E., Caceres, Cesar A., and Steinberg, C. A. Electrocardiography with Electronics: An Experimental Study (Exhibit). American Heart Association National Meeting, October, 1962.

CLUSTER FORMATION AND DIAGNOSTIC SIGNIFICANCE IN PSYCHIATRIC SYMPTOM EVALUATION*

Gilbert Kaskey
Engineering Director, Systems Design and Applications Division
Remington Rand Univac

Paruchuri R. Krishnaiah
Senior Statistician, Applied Mathematics Department
Remington Rand Univac

Anthony Azzari
Senior Programmer, Applied Mathematics Department
Remington Rand Univac

*The authors are indebted to Mrs. Janice Schulman, Research Associate, and Dr. Robert Prall, Director of the Children's Unit of the Eastern Pennsylvania Psychiatric Institute, for their invaluable aid in formulating the problem areas and in interpreting the results.

SUMMARY

The tremendous variability in symptom constellations associated with specific psychiatric diagnoses makes the use of statistical techniques a virtual necessity in the rigid formulation of any symptom-disease model. In addition, the large number of symptoms associated with psychiatric disorders suggests that an electronic computer can be used to considerable advantage in certain aspects of psychiatric diagnosis, e.g., in the correlation analysis between symptoms, in the determination of quantitative criteria for diagnosis, and in the evaluation of the reliability of symptom assignment by different personnel. This paper reports on the results obtained from the analysis of data on 199 subjects collected by the Children's Unit of the Eastern Pennsylvania Psychiatric Institute. The types of information available, the method of collection, and the statistical methodology and techniques are all discussed, along with the results obtained using the tools of correlation analysis and simultaneous multivariate analysis of variance.

INTRODUCTION

The fact that electronic computers can be used effectively in certain phases of medical diagnosis has been recognized for some time. It has been only recently, however, that applications have been extended to include research investigations in psychotherapy generally, and symptom pattern formation specifically. Although the present study has not extended beyond its initial phase, preliminary findings suggest that the merger of computer techniques with clinical observations can do a great deal to shed more light on the problems of emotionally disturbed children.

The extreme variability in symptom constellations associated with specific psychiatric diagnoses makes the use of statistical techniques a virtual necessity in the rigid formulation of any symptom-disease model. In addition, the large number of symptoms associated with psychological and psychophysiological disorders suggests that an electronic computer can be used to considerable advantage in many aspects of this diagnosis. For example, the objective determination of symptom clusters using correlation analysis and the statistical testing of significance among alternative diagnoses require an excessive amount of computation which is only feasible when handled by a computer.

This report on the initial phase of the investigation describes the clinical information collection techniques, the psychiatric categories assigned, and the statistical methodology applied in the determination of the preliminary results.

The ultimate goal of the project is to evolve, through analysis of the data, a picture of how the various symptoms tend to form patterns or "clusters." It is hoped that these clusters can be related directly to the relevant category as an aid in the problem of differential diagnosis. Associated with this investigation is an evaluation of the diagnostic categories themselves and the determination, from a statistical standpoint, of the significance among the several categories using the "symptom vectors" as criteria. The mathematical details of the technique used in this phase, Simultaneous Multivariate Analysis of Variance, are given in the Appendix.

Data Description

The Children's Unit at the Eastern Pennsylvania Psychiatric Institute has been collecting data regarding symptom formation in emotionally disturbed children since 1956.
An enumeration of the emotional problems exhibited by each child seen in the Clinic (which is both an inpatient and outpatient treatment center) is obtained in a highly structured interview with each parent, who is seen individually. In addition to whether the child possesses any of the 130 symptoms commonly encountered in clinical practice (see Appendix, Table A-1), information with regard to (1) duration of the symptom; (2) whether the symptom is currently characteristic of the child or existed only in the past; and (3) whether the parent considers it as "serious" or "not so serious," is also obtained.

Several assumptions were necessary in order to make the data amenable to computational techniques. Since additional evidence is now being collected on "normal controls," i.e., on non-patient children of various ages, and since consistency checks between parents are being made, it is expected that the validity of these assumptions will become apparent in later studies; in any case they are such that the final results will not be seriously affected by their acceptance.

A symptom is considered to be present if either parent says it is present. If only one parent sees a particular symptom, the designation as to the severity and duration is determined by the response of the parent who sees it, even though it may be listed by the other as not characteristic of the child. If the information elicited indicates that both parents agree on the relevance of the given symptom, but disagree as to the specific designation, the following rules determine how the response is counted:

a. If one parent says the symptom exists at present while the other says it existed only in the past, the symptom is listed at present.
b. If one parent says present always and the other says present recent, the former designation is used.
c. If one parent considers the symptom serious but the other does not, it is listed as serious.
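The counting rules above amount to a small merge procedure over the two parents' reports. The Python sketch below is one way to express them. The record layout (the keys `timing`, `duration`, and `serious`, and the use of `None` for a parent who does not report the symptom) is an assumption made for illustration and does not describe the coding sheets actually used in the study.

```python
def merge_parent_reports(first, second):
    """Combine two parents' reports of one symptom into a single record.

    Each report is None (symptom not reported by that parent) or a dict with
    hypothetical keys:
      'timing'   - 'present' or 'past'
      'duration' - 'always' or 'recent'
      'serious'  - True or False
    """
    if first is None and second is None:
        return None  # neither parent sees the symptom: it is absent
    if first is None:
        return dict(second)  # only one parent sees it: use that parent's designation
    if second is None:
        return dict(first)
    return {
        # Rule a: "present" outranks "past".
        "timing": "present" if "present" in (first["timing"], second["timing"]) else "past",
        # Rule b: "always" outranks "recent".
        "duration": "always" if "always" in (first["duration"], second["duration"]) else "recent",
        # Rule c: "serious" outranks "not so serious".
        "serious": first["serious"] or second["serious"],
    }


mother = {"timing": "past", "duration": "always", "serious": False}
father = {"timing": "present", "duration": "recent", "serious": True}
print(merge_parent_reports(mother, father))
```

In every case the merge resolves a disagreement toward the stronger designation, so a symptom is never understated relative to either parent's report.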
Other basic sociological data are available on each of the 199 cases being investigated, but no attempt at analysis has been made. Presumably such information as age, sex, religion, race, and number of parents interviewed will provide fruitful areas for future study.

Only four of the fifteen diagnostic categories are being studied in detail, since in many instances the data available in the remainder represent too few cases from which to draw meaningful results. Further, the categories finally selected are those of greatest interest to the research personnel at the Children's Unit with whom the study is being made. Descriptive definitions* are given below to enable the reader to relate the symptom patterns, as indicated by the correlation analysis described in the next section, to the emotional disturbance classifications tested in the section entitled "Diagnostic Significance."

*See Diagnostic and Statistical Manual of Mental Disorders, The American Psychiatric Association, Washington, D. C.

Childhood Schizophrenia - this category, also known as childhood psychosis, is characterized by varying degrees of personality disintegration and failure to test and evaluate external reality correctly. Children in this group fail in their ability to relate themselves effectively to other people and to such tasks as school work. They may exhibit unpredictable behavior, regression to earlier childhood forms of behavior, and uneven rates of development of any area of bodily and mental development. The onset of this condition may be from birth or may come later on in childhood and is often gradual.

Psychophysiological Disorders - such disorders are sometimes referred to as psychosomatic disorders and represent organic disturbances which are induced by mental or emotional stimuli.
Psychoneurotic Disorders - these are characterized chiefly by "anxiety" which may either be felt and expressed directly or may be unconsciously and automatically controlled by various psychological defense mechanisms.

Personality Trait Disturbances - disorders of this sort are characterized by developmental defects or pathological trends in the personality structure, with minimal subjective anxiety, and little or no sense of distress. Usually it is manifested by long standing patterns of action or behavior, rather than mental or emotional symptoms seen in neuroses and psychoses.

The 199 cases under consideration have each been studied by professional personnel at the Clinic prior to having a final diagnosis assigned.

Symptom Cluster Formation

A major purpose of the investigation has been the objective determination of symptom clusters associated with each individual diagnostic group and also with all groups together. The analysis has been aimed, then, at answering the two questions:

1. Is there a tendency for specific symptoms to be more frequently accompanied by other symptoms?
2. Are there specific symptoms or clusters of symptoms associated with certain diagnostic categories?

The intercorrelations between symptoms have been examined statistically and classed into overlapping and non-overlapping clusters. A cluster is defined to be overlapping if all the symptoms in the cluster are correlated significantly with one another. A cluster is defined to be non-overlapping if each symptom in the cluster is correlated significantly with one or more, but not all, of the other symptoms.

The correlation between any two symptoms is given by

$$r_{ij} = \frac{N N_{ij} - N_i N_j}{\sqrt{N_i N_j (N - N_i)(N - N_j)}}$$

where
$N$ = total number of subjects
$N_i$ = number of subjects with symptom $i$
$N_j$ = number of subjects with symptom $j$
$N_{ij}$ = number of subjects with symptoms $i$ and $j$.
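As a worked illustration of the correlation just defined, the sketch below computes it from the four counts, assuming the usual product-moment form for two 0/1 variables. It also anticipates the significance test the text describes next by converting a correlation to a t value and a tabled t value to the smallest correlation that would be judged significant. The function names and the example counts are hypothetical; the t values 2.160 (13 degrees of freedom) and 2.000 (60 degrees of freedom) are the standard two-sided 5% table entries.

```python
import math


def symptom_correlation(n_total, n_i, n_j, n_ij):
    """Product-moment correlation between two 0/1 symptom indicators."""
    numerator = n_total * n_ij - n_i * n_j
    denominator = math.sqrt(n_i * n_j * (n_total - n_i) * (n_total - n_j))
    return numerator / denominator


def t_statistic(r, n_total):
    """t value with (n_total - 2) degrees of freedom for a correlation r."""
    return r * math.sqrt(n_total - 2) / math.sqrt(1.0 - r * r)


def critical_correlation(t_alpha, n_total):
    """Smallest |r| whose t value reaches the tabled value t_alpha."""
    return t_alpha / math.sqrt(t_alpha ** 2 + n_total - 2)


# Hypothetical counts: 15 subjects, 6 with symptom i, 5 with symptom j, 4 with both.
r = symptom_correlation(15, 6, 5, 4)
print(round(r, 3), round(t_statistic(r, 15), 3))

# With 15 subjects this lands on 0.514, and with 62 subjects on 0.250.
print(round(critical_correlation(2.160, 15), 3))
print(round(critical_correlation(2.000, 62), 3))
```

Working with a critical correlation rather than a t value for every pair is convenient when a whole matrix is screened at once, since each cell is then compared against one number per group.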
The hypothesis that no correlation exists between symptom $i$ and symptom $j$ can be easily tested by using the fact that

$$t = \frac{r_{ij}\sqrt{N - 2}}{\sqrt{1 - r_{ij}^2}}$$

is distributed as Student's "t" with $(N - 2)$ degrees of freedom. We accept or reject the hypothesis according as $|t| \le t_\alpha$ or $|t| > t_\alpha$, where $t_\alpha$ = the appropriate "t" table value at the $\alpha$ significance level.

It is important to note that the application of this significance test is valid only when the distribution of individuals for a given symptom is normal. Although the symptom data are discrete and binomial (since each individual is assigned a "zero" or a "one" depending on whether the symptom is absent or present), the sample size would seem to be large enough to take advantage of the fact that the binomial tends to normality as the sample size increases. Moreover, the discreteness of the data is due more to the lack of precision in measuring than to any inherent lack of continuity in the underlying scale.

A simple frequency count of the 20 most frequently occurring symptoms in each diagnostic group and also in all diagnostic groups taken as a single entity has been made with the following results:

Symptoms
Diagnostic group 1 (Psychological Disorders): 2, 9, 11, 13, 14, 15, 18, 25, 26, 39, 45, 47, 50, 52, 60, 61, 64, 65, 68, 79.
Diagnostic group 2 (Personality Trait Disturbances): 1, 10, 11, 13, 14, 15, 26, 28, 33, 37, 39, 40, 43, 47, 56, 57, 60, 61, 66, 68.
Diagnostic group 3 (Childhood Schizophrenia): 10, 11, 13, 16, 17, 18, 23, 25, 26, 28, 31, 33, 53, 59, 60, 64, 68, 73, 76, 80.
Diagnostic group 4 (Psychophysiological Disorders): 1, 9, 10, 11, 13, 14, 15, 18, 21, 26, 39, 40, 47, 56, 57, 60, 61, 66, 68, 69.
All groups: 1, 9, 10, 11, 13, 14, 15, 17, 18, 21, 26, 28, 33, 39, 47, 56, 57, 60, 61, 68.

The correlation between each pair of symptoms listed within the indicated category is shown in Tables I through V, where a star in a particular cell of the matrix indicates significance at the 5% level. The critical values for diagnostic groups 1, 2, 3, and 4 are 0.514, 0.344, 0.269, and 0.250, respectively; for all diagnostic groups the figure is 0.138. The non-overlapping symptom clusters, determined on the basis of significant correlations, are formed as indicated below.

Non-Overlapping Clusters
Group 1, Cluster 1: 11, 13, 14, 15
Group 1, Cluster 2: 2, 9, 18, 25, 26, 39, 45, 47, 50, 60, 64, 65
Group 2, Cluster 1: 10, 11, 13, 14, 15, 26, 28, 33, 37, 39, 40, 43, 56, 57, 60, 61, 66, 68
Group 3, Cluster 1: 10, 11, 13, 16, 17, 18, 23, 25, 31, 33, 59, 60, 68, 73
Group 4, Cluster 1: 1, 9, 10, 11, 13, 14, 15, 18, 21, 26, 39, 40, 47, 56, 57, 60, 61, 66, 68, 69
All Groups, Cluster 1: 1, 9, 10, 11, 13, 14, 15, 17, 18, 21, 26, 28, 33, 39, 47, 56, 57, 60, 61, 68

The above listing clearly indicates that virtually all symptoms within each group are indirectly interrelated, with the exception of those in diagnostic group 1. In an attempt to study the direct relationships which exist, the overlapping clusters (as defined above) were determined.

Overlapping Clusters
Group 1: 2, 50, 64; 9, 26; 11, 13; 13, 15; 14, 15; 18, 26; 25, 39; 25, 45; 25, 50; 25, 65; 39, 45; 39, 50; 39, 65; 50; 45, 65; 50, 65; 26, 45; 39, 47, 60; 47, 65; 60, 65; 45, 47.
Group 2: 11, 56, 33; 68; 68. 10, 37, 60; 28, 43, 11, 43, 15, 60; 57; 13, 37, 56; 11, 26, 33, 39, 40; 56, 60, 56; 40; 61; 40, 68;
Group 3: 10, 13, 16, 59; 13, 33, 68; 23, 17, 60; 16, 31, 10, 23, 73; 11, 17; 60, 68; 16, 17, 33; 18, 33; 23, 25, 31, 73.
33; 13, 73; 59, 10, 14, 15, 61; 57, 14, 26; 37; 43, 61; 56; 14, 28, 56, 66;
Group 4: 1, 9, 47; 1, 47, 56, 60; 9, 10, 11, 13; 9, 10, 11, 40, 47; 9, 11, 15; 9, 47, 57; 9, 61; 10, 11, 13, 21; 10, 11, 21, 47; 10, 11, 21, 68; 11, 15, 18; 11, 15, 21; 13, 14, 21; 13, 26; 14, 21, 47; 14, 40, 47; 15, 18, 26; 26, 66; 39, 40; 39, 57; 47, 57; 47, 66; 68, 69.
All Groups: 1, 9, 39; 1, 17, 18; 1, 9, 47; 1, 47, 56, 60; 9, 10, 14, 15; 9, 14, 47; 9, 15, 61; 9, 68; 10, 11, 13, 15, 17; 10, 13, 14, 15, 17, 21; 10, 15, 17, 21, 33; 10, 21, 33, 60; 14, 17, 21, 28; 14, 47; 15, 17, 18, 33; 15, 18, 26; 15, 56; 15, 61; 17, 21, 28, 33; 21, 28, 33, 60; 21, 28, 33, 68; 28, 39; 28, 60, 61; 57, 60, 61.

TABLE I
Correlation Matrix for Diagnostic Group 1

     2       9       11      13      14      15      18      25      26      39      45      47      50      52      60      61      64      65      68      79
2   +1.00   +.354   -.200   -.200   +.100   -.426   +.139   +.100   +.378   +.213   +.000   -.200   +.577*  -.107   +.000   +.289   -.577*  +.000   -.354   -.200
9   +.354   +1.00   -.354   +.000   +.354   +.075   +.294   +.000   +.535*  +.075   +.167   +.000   +.272   -.302   +.272   +.272   -.408   +.272   -.250   -.354
11  -.200   -.354   +1.00   +.700*  +.100   +.213   -.277   -.200   -.189   -.426   +.000   +.100   -.289   +.213   +.000   -.289   +.289   +.000   +.000   +.100
13  -.200   +.000   +.700*  +1.00   +.400   +.533*  -.277   -.200   -.189   -.426   +.000   +.100   -.289   +.213   +.000   -.289   +.000   +.289   +.000   -.200
14  +.100   +.354   +.100   +.400   +1.00   +.533*  +.139   +.100   +.378   -.107   +.354   +.100   +.000   -.107   -.289   -.289   +.000   +.289   -.354   -.500
15  -.426   +.075   +.213   +.533*  +.533*  +1.00   -.237   -.107   -.161   -.364   +.075   +.213   -.492   +.318   -.185   -.185   +.123   +.431   -.302   -.426
18  +.139   +.294   -.277   -.277   +.139   -.237   +1.00   +.139   +.681*  +.207   +.294   +.139   +.080   +.207   +.080   +.480   +.080   +.080   +.294   +.139
25  +.100   +.000   -.200   -.200   +.100   -.107   +.139   +1.00   +.378   +.533*  +.707*  +.400   +.577*  +.213   +.000   +.289   -.289   +.577*  +.000   +.100
26  +.378   +.535*  -.189   -.189   +.378   -.161   +.681*  +.378   +1.00   +.443   +.535*  +.378   +.327   -.161   +.327   +.327   -.218   +.327   -.134   -.189
39  +.213   +.075   -.426   -.426   -.107   -.364   +.207   +.533*  +.443   +1.00   +.075   +.533*  +.431   -.023   +.431   +.431   -.185   +.431   +.075   -.107
45  +.000   +.167   +.000   +.000   +.354   +.075   +.294   +.707*  +.535*  +.075   +1.00   +.354   +.272   +.075   -.068   -.068   -.068   +.272   -.250   +.000
47  -.200   +.000   +.100   +.100   +.100   +.213   +.139   +.400   +.378   +.533*  +.354   +1.00   +.000   +.213   +.577*  +.289   +.000   +.577*  +.000   -.500
50  +.577*  +.272   -.289   -.289   +.000   -.492   +.080   +.577*  +.327   +.431   +.272   +.000   +1.00   -.185   +.167   +.444   -.667*  +.167   -.068   +.000
52  -.107   -.302   +.213   +.213   -.107   +.318   +.207   +.213   -.161   -.023   +.075   +.213   -.185   +1.00   -.185   +.431   +.123   +.431   +.075   +.213
60  +.000   +.272   +.000   +.000   -.289   -.185   +.080   +.000   +.327   +.431   -.068   +.577*  +.167   -.185   +1.00   +.444   -.111   +.167   +.272   -.289
61  +.289   +.272   -.289   -.289   -.289   -.185   +.480   +.289   +.327   +.431   -.068   +.289   +.444   +.431   +.444   +1.00   -.389   +.444   +.272   +.000
64  -.577*  -.408   +.289   +.000   +.000   +.123   +.080   -.289   -.218   -.185   -.068   +.000   -.667*  +.123   -.111   -.389   +1.00   -.389   +.272   +.289
65  +.000   +.272   +.000   +.289   +.289   +.431   +.080   +.577*  +.327   +.431   +.272   +.577*  +.167   +.431   +.167   +.444   -.389   +1.00   -.068   -.289
68  -.354   -.250   +.000   +.000   -.354   -.302   +.294   +.000   -.134   +.075   -.250   +.000   -.068   +.075   +.272   +.272   +.272   -.068   +1.00   +.354
79  -.200   -.354   +.100   -.200   -.500   -.426   +.139   +.100   -.189   -.107   +.000   -.500   +.000   +.213   -.289   +.000   +.289   -.289   +.354   +1.00
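The two cluster definitions given earlier can be read as operations on a graph whose edges are the starred cells of a correlation matrix. The Python sketch below takes that reading, which is an interpretation rather than the authors' stated procedure: non-overlapping clusters are treated as connected groups of symptoms, and overlapping clusters as maximal sets in which every pair is significantly correlated. The example uses the Table I correlations among symptoms 11, 13, 14, and 15 with the critical value 0.514 quoted for diagnostic group 1; the function names are hypothetical.

```python
def build_graph(correlations, critical_value):
    """Link two symptoms when the magnitude of their correlation reaches the critical value."""
    graph = {}
    for (a, b), r in correlations.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if abs(r) >= critical_value:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_groups(graph):
    """Groups of two or more symptoms joined by chains of significant correlations."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node not in members:
                members.add(node)
                stack.extend(graph[node] - members)
        seen |= members
        if len(members) > 1:
            groups.append(sorted(members))
    return groups


def fully_correlated_sets(graph):
    """Maximal sets in which every pair is significantly correlated (Bron-Kerbosch)."""
    found = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            if len(current) > 1:
                found.append(sorted(current))
            return
        for node in sorted(candidates):
            expand(current | {node}, candidates & graph[node], excluded & graph[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(graph), set())
    return sorted(found)


# Correlations among symptoms 11, 13, 14, and 15 as printed in Table I.
table_1_subset = {
    (11, 13): 0.700, (11, 14): 0.100, (11, 15): 0.213,
    (13, 14): 0.400, (13, 15): 0.533, (14, 15): 0.533,
}
graph = build_graph(table_1_subset, 0.514)
print(connected_groups(graph))       # [[11, 13, 14, 15]]
print(fully_correlated_sets(graph))  # [[11, 13], [13, 15], [14, 15]]
```

For these four symptoms the connected group matches Group 1, Cluster 1 of the non-overlapping listing, and the three pairs match the corresponding entries of the Group 1 overlapping listing.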
TABLE II
Correlation Matrix for Diagnostic Group 2

     1       10      11      13      14      15      26      28      33      37      39      40      43      47      56      57      60      61      66      68
1   +1.00   +.000   +.160   +.067   +.021   +.021   +.124   -.020   +.021   -.020   +.250   +.199   +.280   -.134   +.250   +.043   +.160   +.043   +.021   +.067
10  +.000   +1.00   +.418*  +.800*  +.373*  +.233   +.167   +.000   +.233   +.535*  -.326   +.000   +.289   +.060   +.461*  +.000   +.060   +.000   +.093   -.100
11  +.160   +.418*  +1.00   +.353*  +.273   +.273   +.239   -.144   +.089   +.383*  +.089   +.383*  +.500*  +.057   +.457*  +.311   +.057   +.311   +.089   +.155
13  +.067   +.800*  +.353*  +1.00   +.242   +.089   -.083   +.160   +.242   +.454*  -.219   +.013   +.289   +.155   +.396*  +.130   +.155   -.029   +.242   +.010
14  +.021   +.373*  +.273   +.242   +1.00   +.283   +.373*  +.187   +.283   +.324   -.004   +.187   +.336   +.089   +.570*  +.336   +.457*  +.336   +.139   +.242
15  +.021   +.233   +.273   +.089   +.283   +1.00   +.373*  +.187   +.426*  +.461*  -.004   +.187   +.188   +.089   +.283   +.188   +.089   +.485*  +.139   +.242
26  +.124   +.167   +.239   -.083   +.373*  +.373*  +1.00   +.134   +.373*  +.297   +.202   +.297   +.241   -.199   +.202   +.064   +.020   +.417*  -.140   +.100
28  -.020   +.000   -.144   +.160   +.187   +.187   +.134   +1.00   +.461*  +.083   +.050   -.048   -.039   +.032   +.187   +.244   +.559*  +.244   +.324   +.307
33  +.021   +.233   +.089   +.242   +.283   +.426*  +.373*  +.461*  +1.00   +.187   -.004   +.050   +.040   +.089   -.004   +.336   +.089   +.485*  +.139   +.242
37  -.020   +.535*  +.383*  +.454*  +.324   +.461*  +.297   +.083   +.187   +1.00   -.087   +.214   +.386*  -.144   +.461*  +.103   +.208   +.103   +.050   +.307
39  +.250   -.326   +.089   -.219   -.004   -.004   +.202   +.050   -.004   -.087   +1.00   +.735*  -.108   +.089   -.004   +.188   +.089   +.188   +.139   +.089
40  +.199   +.000   +.383*  +.013   +.187   +.187   +.297   -.048   +.050   +.214   +.735*  +1.00   +.103   +.208   +.187   +.103   +.032   +.386*  +.187   +.013
43  +.280   +.289   +.500*  +.289   +.336   +.188   +.241   -.039   +.040   +.386*  -.108   +.103   +1.00   +.121   +.633*  +.389*  +.121   +.083   +.188   +.447*
47  -.134   +.060   +.057   +.155   +.089   +.089   -.199   +.032   +.089   -.144   +.089   +.208   +.121   +1.00   +.089   +.311   +.057   +.121   +.273   -.042
56  +.250   +.373*  +.457*  +.396*  +.570*  +.283   +.202   +.187   -.004   +.461*  -.004   +.187   +.633*  +.089   +1.00   +.336   +.457*  +.188   +.139   +.396*
57  +.043   +.000   +.311   +.130   +.336   +.188   +.064   +.244   +.336   +.103   +.188   +.103   +.389*  +.311   +.336   +1.00   +.311   +.389*  +.188   +.289
60  +.160   +.060   +.057   +.155   +.457*  +.089   +.020   +.559*  +.089   +.208   +.089   +.032   +.121   +.057   +.457*  +.311   +1.00   +.311   +.273   +.353*
61  +.043   +.000   +.311   -.029   +.336   +.485*  +.417*  +.244   +.485*  +.103   +.188   +.386*  +.083   +.121   +.188   +.389*  +.311   +1.00   +.188   +.130
66  +.021   +.093   +.089   +.242   +.139   +.139   -.140   +.324   +.139   +.050   +.139   +.187   +.188   +.273   +.139   +.188   +.273   +.188   +1.00   +.396*
68  +.067   -.100   +.155   +.010   +.242   +.242   +.100   +.307   +.242   +.307   +.089   +.013   +.447*  -.042   +.396*  +.289   +.353*  +.130   +.396*  +1.00

TABLE III
Correlation Matrix for Diagnostic Group 3

     10      11      13      16      17      18      23      25      26      28      31      33      53      59      60      64      68      73      76      80
10  +1.00   +.242   +.384*  +.087   +.341*  +.213   +.341*  +.232   +.141   +.004   +.018   +.426*  +.044   +.141   +.153   -.101   +.070   +.293*  -.064   -.015
11  +.242   +1.00   +.179   +.095   +.447*  -.009   +.257   -.058   +.100   -.009   -.042   +.123   -.103   -.042   -.103   +.112   -.006   -.024   -.103   +.095
13  +.384*  +.179   +1.00   +.213   +.447*  +.233   -.123   -.058   +.242   +.233   -.042   +.271*  +.149   +.100   +.275*  +.112   +.305*  +.095   +.023   +.213
16  +.087   +.095   +.213   +1.00   +.334*  +.259   +.059   -.046   +.190   +.084   -.118   +.339*  +.145   -.118   +.145   -.265   +.279*  -.288*  -.037   +.056
17  +.341*  +.447*  +.447*  +.334*  +1.00   +.073   +.118   -.162   +.177   +.212   -.152   +.371*  -.043   +.012   +.103   -.207   +.226   +.059   +.103   -.079
18  +.213   -.009   +.233   +.259   +.073   +1.00   +.073   +.175   +.213   +.112   +.004   +.472*  +.264   +.004   +.079   -.155   +.186   -.091   -.106   +.171
23  +.341*  +.257   -.123   +.059   +.118   +.073   +1.00   +.473*  -.152   -.067   +.341*  +.200   +.103   +.341*  -.043   -.067   -.133   +.334*  -.043   +.196
25  +.232   -.058   -.058   -.046   -.162   +.175   +.473*  +1.00   -.004   +.175   +.351*  +.267   +.120   +.469*  +.120   -.127   -.091   +.251   +.016   +.053
26  +.141   +.100   +.242   +.190   +.177   +.213   -.152   -.004   +1.00   +.108   -.227   +.171   +.262   -.105   +.153   -.101   +.204   -.015   +.044   -.015
28  +.004   -.009   +.233   +.084   +.212   +.112   -.067   +.175   +.108   +1.00   +.108   +.145   -.014   +.108   +.171   -.243   +.072   -.003   +.171   +.171
31  +.018   -.042   -.042   -.118   -.152   +.004   +.341*  +.351*  -.227   +.108   +1.00   +.043   +.262   +.509*  -.064   +.108   -.065   +.498*  +.262   -.015
33  +.426*  +.123   +.271*  +.339*  +.371*  +.472*  +.200   +.267   +.171   +.145   +.043   +1.00   +.189   +.043   +.302*  -.073   +.093   +.018   +.076   -.089
53  +.044   -.103   +.149   +.145   -.043   +.264   +.103   +.120   +.262   -.014   +.262   +.189   +1.00   +.153   +.036   +.079   +.110   +.145   +.229   -.037
59  +.141   -.042   +.100   -.118   +.012   +.004   +.341*  +.469*  -.105   +.108   +.509*  +.043   +.153   +1.00   +.262   +.108   +.070   +.498*  +.153   -.015
60  +.153   -.103   +.275*  +.145   +.103   +.079   -.043   +.120   +.153   +.171   -.064   +.302*  +.036   +.262   +1.00   +.079   +.348*  +.054   +.132   +.054
64  -.101   +.112   +.112   -.265   -.207   -.155   -.067   -.127   -.101   -.243   +.108   -.013   +.079   +.108   +.079   +1.00   -.042   -.091   +.171   +.171
68  +.070   -.006   +.305*  +.279*  +.226   +.186   -.133   -.081   +.204   +.072   -.065   +.093   +.110   +.070   +.348*  -.042   +1.00   +.054   -.126   -.058
73  +.293*  -.024   +.095   -.288*  +.059   -.091   +.334*  +.251   -.015   -.003   +.498*  +.018   +.145   +.498*  +.054   -.091   +.054   +1.00   +.145   -.030
76  -.064   -.103   +.023   -.037   +.103   -.106   -.043   +.016   +.044   +.171   +.262   +.076   +.229   +.153   +.132   +.171   -.128   +.145   +1.00   -.037
80  -.015   +.095   +.213   +.056   -.079   +.171   +.196   +.053   -.015   +.171   -.015   -.089   -.037   -.015   +.054   +.171   -.058   -.030   -.037   +1.00

TABLE IV
Correlation Matrix for Diagnostic Group 4

     1       9       10      11      13      14      15      18      21      26      39      40      47      56      57      60      61      66      68      69
1   +1.00   +.255*  +.196   +.132   +.145   -.025   +.097   -.148   +.075   +.059   +.105   +.190   +.295*  +.360*  +.132   +.441*  +.183   +.115   +.219   -.049
9   +.255*  +1.00   +.439*  +.312*  +.287*  +.209   +.460*  +.188   +.117   +.073   +.186   +.399*  +.361*  +.065   +.281*  +.065   +.460*  +.215   +.091   -.159
10  +.196   +.439*  +1.00   +.516*  +.478*  +.238   +.248   +.215   +.376*  +.184   +.117   +.286*  +.294*  +.089   +.165   +.165   +.168   +.165   +.346*  +.013
11  +.132   +.312*  +.516*  +1.00   +.456*  +.185   +.498*  +.312*  +.472*  +.141   +.093   +.255*  +.307*  -.034   +.185   -.126   +.209   +.149   +.261*  -.034
13  +.145   +.287*  +.478*  +.456*  +1.00   +.325*  +.210   +.205   +.360*  +.288*  +.028   +.224   +.126   +.059   +.164   +.059   +.122   +.059   +.164   +.059
14  -.025   +.209   +.238   +.185   +.325*  +1.00   +.179   +.065   +.300*  +.133   -.022   +.268*  +.319*  -.054   +.155   +.092   +.179   +.238   +.194   -.054
15  +.097   +.460*  +.248   +.498*  +.210   +.179   +1.00   +.460*  +.381*  +.343*  -.009   +.234   +.178   -.152   +.179   +.008   +.242   +.168   +.110   -.072
18  -.148   +.188   +.215   +.312*  +.205   +.065   +.460*  +1.00   +.195   +.354*  +.186   +.113   +.099   +.065   +.137   +.065   +.224   +.290*  +.243   +.065
21  +.075   +.117   +.376*  +.472*  +.360*  +.300*  +.381*  +.195   +1.00   +.222   -.027   +.202   +.339*  +.062   +.149   +.140   -.032   +.219   +.324*  +.219
26  +.059   +.073   +.154   +.141   +.288*  +.133   +.343*  +.354*  +.222   +1.00   +.118   +.208   +.007   -.006   +.225   +.184   +.042   +.279*  +.106   +.184
39  +.105   +.186   +.117   +.093   +.028   -.022   -.009   +.186   -.027   +.118   +1.00   +.300*  +.173   +.117   +.319*  +.117   -.009   +.117   +.046   -.237
40  +.190   +.399*  +.286*  +.255*  +.224   +.268*  +.234   +.113   +.202   +.208   +.300*  +1.00   +.384*  +.069   +.128   +.141   +.005   +.214   +.244   +.141
47  +.295*  +.361*  +.294*  +.307*  +.126   +.319*  +.178   +.099   +.339*  +.007   +.173   +.384*  +1.00   +.294*  +.405*  +.383*  +.084   +.560*  +.226   +.029
56  +.360*  +.065   +.089   -.034   +.059   -.054   -.152   +.065   +.062   -.006   +.117   +.069   +.294*  +1.00   +.092   +.393*  +.088   +.241   +.037   +.089
57  +.132   +.281*  +.165   +.185   +.164   +.155   +.179   +.137   +.149   +.225   +.319*  +.128   +.405*  +.092   +1.00   +.238   +.102   +.238   +.194   -.054
60  +.441*  +.065   +.165   -.126   +.059   +.092   +.008   +.065   +.140   +.184   +.117   +.141   +.383*  +.393*  +.238   +1.00   +.168   +.165   +.191   +.241
61  +.183   +.460*  +.168   +.209   +.122   +.179   +.242   +.224   -.032   +.042   -.009   +.006   +.084   +.088   +.102   +.168   +1.00   +.168   +.110   -.072
66  +.115   +.215   +.165   +.149   +.059   +.238   +.168   +.290*  +.219   +.279*  +.117   +.214   +.560*  +.241   +.238   +.165   +.168   +1.00   +.114   +.013
68  +.219   +.091   +.346*  +.261*  +.164   +.194   +.110   +.243   +.324*  +.106   +.046   +.244   +.226   +.037   +.194   +.191   +.110   +.114   +1.00   +.346*
69  -.049   -.159   +.013   -.034   +.059   -.054   -.072   +.065   +.219   +.184   -.237   +.141   +.029   +.089   -.054   +.241   -.072   +.013   +.346*  +1.00

TABLE V
Correlation Matrix for All Diagnostic Groups

     1       9       10      11      13      14      15      17      18      21      26      28      33      39      47      56      57      60      61      68
1   +1.00   +.235*  -.001   +.045   +.017   +.063   +.039   -.086*  -.124*  +.127   +.052   +.130   +.065   +.233*  +.410*  +.378*  +.133   +.317*  +.119   -.005
9   +.235*  +1.00   +.214*  +.191   +.140   +.291*  +.326*  +.083   +.166   +.001   +.165   +.153   +.097   +.218*  +.257*  +.055   +.089   +.028   +.285*  -.082*
10  -.001   +.214*  +1.00   +.375*  +.515*  +.205*  +.246*  +.260*  +.105   +.296*  +.115   +.191   +.276*  -.002   +.100   +.049   +.110   +.221*  +.141   +.173
11  +.045   +.191   +.375*  +1.00   +.335*  +.162   +.328*  +.332*  +.083   +.233   +.128   +.121   +.119   +.065   +.116   +.130   +.153   -.035   +.154   +.150
13  +.017   +.140   +.515*  +.335*  +1.00   +.285*  +.262*  +.392*  +.079   +.291*  +.119   +.183   +.200   +.022   +.153   +.084   +.079   +.153   +.101   +.139
14  +.063   +.291*  +.205*  +.162   +.285*  +1.00   +.270*  +.234*  +.123   +.221*  +.111   +.248*  +.179   +.071   +.222*  +.107   +.077   +.080   +.141   +.107
15  +.039   +.326*  +.246*  +.328*  +.262*  +.270*  +1.00   +.447*  +.286*  +.247*  +.303*  +.210   +.222*  +.061   +.189   -.093*  +.061   +.115   +.302*  +.061
17  -.086*  +.083   +.260*  +.332*  +.392*  +.234*  +.447*  +1.00   +.248*  +.325*  +.219   +.240*  +.282*  +.054   +.061   -.013   +.114   +.085   +.173   +.234
18  -.124*  +.166   +.105   +.083   +.079   +.123   +.286*  +.248*  +1.00   +.076   +.343*  +.125   +.217*  +.157   +.076   -.080   +.064   +.006   +.180   +.225
21  +.127   +.001   +.298*  +.233   +.291*  +.221*  +.247*  +.325*  +.076   +1.00   +.180   +.265*  +.247*  +.050   +.223   +.189   +.073   +.223*  +.096   +.402*
26  +.052   +.165   +.115   +.128   +.119   +.111   +.303*  +.219   +.343*  +.180   +1.00   +.042   +.185   +.219   +.126   -.004   +.100   +.155   +.190   +.131
28  +.130   +.153   +.191   +.121   +.183   +.248*  +.210   +.240*  +.125   +.265*  +.042   +1.00   +.276*  +.240*  +.210   +.123   +.179   +.256*  +.218*  +.222*
33  +.065   +.097   +.276*  +.119   +.200   +.179   +.222*  +.282*  +.217*  +.247*  +.185   +.276*  +1.00   -.024   +.030   +.089   +.149   +.198*  +.212   +.226*
39  +.233*  +.218*  -.002   +.065   +.022   +.071   +.061   +.054   +.157   +.050   +.219   +.240*  -.024   +1.00   +.206   +.031   +.158   +.133   +.125   +.034
47  +.410*  +.257*  +.100   +.116   +.153   +.222*  +.189   +.061   +.076   +.223   +.126   +.210   +.030   +.206   +1.00   +.273*  +.174   +.312*  +.181   +.061
56  +.378*  +.055   +.049   +.130   +.084   +.107   -.093*  -.013   -.080   +.189   -.004   +.123   +.089   +.031   +.273*  +1.00   +.123   +.318*  +.076   +.054
57  +.133   +.089   +.110   +.153   +.079   +.077   +.061   +.114   +.064   +.073   +.100   +.179   +.149   +.158   +.174   +.125   +1.00   +.243*  +.225*  +.165
60  +.317*  +.028   +.221*  -.035   +.153   +.080   +.115   +.085   +.006   +.223*  +.155   +.256*  +.198*  +.133   +.312*  +.318*  +.243*  +1.00   +.206*  +.188
61  +.119   +.285*  +.141   +.154   +.101   +.141   +.302*  +.173   +.180   +.096   +.190   +.218*  +.212   +.125   +.181   +.076   +.225*  +.206*  +1.00   +.134
68  -.005   -.082*  +.173   +.150   +.139   +.107   +.061   +.234   +.225   +.402*  +.131   +.222*  +.226*  +.034   +.061   +.054   +.165   +.188   +.134   +1.00

The clustering of these groups suggests that for the various diagnoses, and also for all cases regarded as a single entity, certain specific constellations do have a tendency to reappear; in other words, the appearance of a specific symptom indicates the probable appearance of other symptoms associated with the former.

Diagnostic Significance

A typical experiment usually involves taking one or more measurements on the experimental units under test. The experiment is called univariate, or uniresponse, as opposed to multivariate, or multiresponse, when only one measurement is taken on each unit. Since the symptom study data have many determinations associated with each sample item, i.e., individual, the tools of multivariate analysis are required for the analysis.
TABLE VI
Diagnostic Group Means by Symptom Type

                 Diagnostic Group*
Symptom     1         2         3         4
1           0.7581    0.3519    0.9091    0.5333
9           0.6774    0.4444    0.5758    0.8000
10          0.6935    0.8148    0.6667    0.5333
11          0.8226    0.8704    0.8485    0.6667
13          0.7742    0.8704    0.7576    0.6667
14          0.6452    0.6852    0.6970    0.6667
15          0.7419    0.6852    0.6970    0.7333
17          0.6290    0.9074    0.6061    0.5333
18          0.6774    0.7037    0.5152    0.8667
21          0.7258    0.6111    0.6364    0.4667
26          0.8387    0.8148    0.8182    0.9333
28          0.6129    0.7037    0.6364    0.5333
33          0.6290    0.8333    0.6970    0.4667
39          0.8065    0.6296    0.6970    0.7333
47          0.8065    0.5185    0.8485    0.6667
56          0.6935    0.4630    0.6970    0.4667
57          0.6452    0.6481    0.7273    0.5333
60          0.6935    0.7407    0.8485    0.6000
61          0.7419    0.6667    0.7273    0.6000
68          0.7097    0.8519    0.7576    0.8000

*Diagnostic Group 1: Psychoneurotic Disorders
 2: Childhood Schizophrenia
 3: Personality Trait Disturbances
 4: Psychophysiological Disorders.

Table VI shows the mean proportion of individuals in each diagnostic group having the particular symptom under consideration. Differences among the diagnostic groups are to be analyzed on the basis of these 20 symptoms - the 20 most frequently occurring symptoms in the entire sample.

Using the notation of the Appendix, we let $x_{ijt}$ adopt the value one or zero depending on whether the jth individual in the ith diagnostic group does or does not have the tth symptom. For this particular analysis, i takes the values 1 through 4; j takes the values 1 through $\eta_i$, where $\eta_1 = 62$, $\eta_2 = 54$, $\eta_3 = 33$, and $\eta_4 = 15$; and t can adopt any of the 20 symptoms listed in Table VI. The symbol $x_{i \cdot t}$ refers to the value in each cell of the table, i.e., the sample proportion of individuals in the ith group exhibiting symptom t. For example, $x_{3 \cdot 13}$ has the value 0.7576, indicating that almost 76% of the study group classified as having personality trait disturbances exhibit definite signs of rebelliousness. We let $\mu_{it}$ denote the population mean proportion.
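The cell values of Table VI follow directly from the 0/1 indicator notation just introduced. The sketch below is only illustrative: the function name `group_proportions` and the toy data are assumptions, not part of the original study, and the count of 25 individuals is inferred from the reported proportion 0.7576 and the group size of 33 rather than stated in the text.

```python
def group_proportions(indicators):
    """Return the sample proportion x_{i.t} for each symptom t.

    indicators[j][t] is 1 if individual j of the group shows symptom t, else 0
    (the x_{ijt} of the text, for one fixed group i).
    """
    n = len(indicators)  # group size, eta_i in the text
    return [sum(column) / n for column in zip(*indicators)]


# Hypothetical toy group: 4 individuals, 3 symptoms.
toy_group = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]
toy_props = group_proportions(toy_group)

# Consistency check on the worked example in the text: 25 of the 33
# individuals in group 3 showing symptom 13 gives 0.7576 to four places.
implied_proportion = round(25 / 33, 4)
```

Every entry of Table VI is a count divided by the group size, which is why the tabulated values for each group fall on multiples of 1/62, 1/54, 1/33, and 1/15 respectively.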
The total hypothesis of no difference among diagnostic groups (see Appendix A) is denoted by $H: \mu_1 = \mu_2 = \mu_3 = \mu_4$, where $\mu_i' = (\mu_{i1}, \mu_{i2}, \ldots, \mu_{i20})$ and $\mu_i$ is the transpose of $\mu_i'$. The hypothesis of no difference between the ith and lth diagnostic groups is denoted by $H_{il}: \mu_i = \mu_l$ $(i \neq l = 1, 2, 3, 4)$. The Within Group SP matrix ($S_e$) and its inverse are given in Tables VII and VIII. The sizes of the samples drawn from the several groups are $\eta_1 = 62$; $\eta_2 = 54$; $\eta_3 = 33$; $\eta_4 = 15$. The number of variates (i.e., symptoms) is p = 20 and $\eta$, the total sample size, is 164. The "error degrees of freedom" is $\nu = \eta - p - K + 1 = 164 - 20 - 4 + 1 = 141$. Then

$T_{il}^2 = \nu \, \eta_{il} \, (x_i - x_l)' \, S_e^{-1} \, (x_i - x_l)$,

where [...] and $x_i' = (x_{i \cdot 1}, \ldots, x_{i \cdot 20})$, i = 1, 2, 3, 4.

TABLE VII
Within Groups Sums of Squares and Cross Products (SP) Matrix

      1         9         10
1     +30.146   +7.0441   +0.6551
9     +7.0441   +37.342   +10.249
10    +0.6551   +10.249   +32.392
11    +2.0138   +7.4415   +10.333
13    +2.0153   +5.2010   +15.413
14    +1.4165   +11.216   +7.4432
15    +1.3348   +12.352   +6.7486
17    -0.2537   +3.8877   +7.4257
18    -2.5970   +6.4716   +2.6413
21    +5.4517   -0.8414   +11.168
26    +2.0870   +4.4732   +3.6170
28    +5.4656   +7.8783   +5.4155
33    +4.9597   +6.7382   +9.2183
39    +8.3581   +6.9755   -0.5811
47    +8.4570   +6.5634   +3.5078
56    +10.965   +1.9174   +1.7404
57    +3.2778   +6.1295   +3.4729
60    +10.075   +1.7720   +5.1182
61    +4.8442   +10.821   +3.9634
68    +3.3327   -2.2448   +5.9357

11 13 +2.0138 +7.4415 +10.383 +22.717 +9.7299 +4.7113 +10.109 +9.9682 +2.2866 +6.7768 +2.6871 +3.5163 +2.5709 +2.4299 +4.0764 +3.6880 +3.9368 -1.9434 +4.4643 +2.0153 +5.2010 +15.413 +9.7299 +26.325 +8.7376 +8.4258 +11.673 +1.8643 +10.863 +3.6578 +5.2641 +6.5489 -2.0598 +5.0412 +3.8595 +2.0541 +3.6827 +3.5573 14 +1.4165 +11.216 +7.4432 +4.
711~ +8.7376 +36.145 +8.6071 +7.9919 +4.3510 +9.0536 +5.1519 +9.4771 +7.3084 +2.0820 +7.3749 +3.4315 +3.1515 +2.3355 +~.8719 +5.9286 +3.9590 +4.6701 15 +1.3348 +12.352 +6.7486 +10.109 +8.4258 +8.6071 +33.422 +14.684 +9.4199 17 -0.2537 +3.8877 +7.4257 +9.9682 t11.673 +7.9919 +14.684 +30.617 +6.8628 :1:8.2321 +13.288 +8.1864 +4.5341 +6.2664 +7.6214 +8.0675 +~.9617 +1.5100 +0.8905 +6.8696 +3.8379 -4.1965 +1. 5937 +1.7472 +2.2673 +1..5742 +1. 8856 +9.8770 +5.0524 +1.6121 +7.0303 18 21 26 28 33 39 47 56 57 60 61 68 -2.5970 +6.4716 +2.6413 +2.2866 +1.8643 +5.4517 -0.8414 +11.168 +6.7768 +10.863 +9.0536 +8.2321 +13.288 +2.4091 +36.542 +6.6540 +10.100 +10.291 +3.1622 +7.1137 +6.6095 +1. 5728 +9.3277 +2.1402 +14;444 +2.0870 +4.4732 +3.6170 +2.6871 +3.6578 +5.1519 +8.1864 +5.4656 +7.8783 +5.4155 +3.5163 +5.2641 +9.4771 +6.2664 +7.6214 +3.7658 +10.100 +1. 5176 +37.339 +10.060 +7.9259 +8;4996 +4.6829 +5.3148 +7.8788 +8.4004+6.3528 +4.9597 +6: 7382 +9.2183 +2.5709 +6.5489 +7.30Q4 +8.0675 +8.9617 +8.9988 +10.291 +5.2721 +10.06'0 +32.671 +3.0514 +4.0332 +3.8213 +7.2114 +6.908i +8.1372 +4.9650 +8.3581 +6.9755 -0.5811 +2.4299 -2.0598 +2.0820 tl. 5100 +0.8905 +4.8213 +3.1622 +5,2760 +7.9259 +3.0514 +32.173 +6.1993 +2.4182 +5.1110 +3.0222 +4.9093 +1. 3289 +8.4570 +6.5634 +3.507.8 +4.0764 +5.0412 +7.3749 +6.8696 +3.8379 +4.3344 +7;1137 +4.0073 +8.4996 +4.0332 +6.1993 +30.735 +6.1778 +4.8968 +9.8243 +5.8729 +5.4522 +10.965 +1. 9174 +1. 7404 +3.6880 +3.8595 +3.4315 +3.2778 +6 .. 1295 +3; 4729 +3.9368 +2.0541 +3.1515 +1. 7472 +2.2673 +2.9766 +1. 5728 +2.2301 +5.3148 +7.2114 +5.1110 +4.8968 +3.5938 +3~. 7.87 +8.1685 +8.7347 +4.2163 +10.075 +1. 7720 +5.1182 -1. 9434 +3.6827 +2.3355 +1. 5742 +1.8856 +2.4986 +9.3277 +4.0338 +7.8788 +6.9031 +3.0222 +9.8243 +10.944 +8.1685 +31. 
390 +7.6665 +7.9977 +4.8442 +10.821 +3,9634 +4.4643 +2.8719 +5.9286 +9.8770 +5.0524 +7.3417 +2;1402 +6.0497 +8.4004 +8.1372 +4.909'3 +5.8729 +3.3327 -2.2448 +5.9357 +3,5573 +3.9590 +4.6701 +1.6121 +7.0303' +5.5444 +14.444 +2.9607 +6 .. 3528 +4.9650 +1. 3289 +5.4522 +4.1633 .+4.2163 +7.9977 +3.3064 +28.050 +4~3510 +9.4199 +ti.8628 +34.783 +2.4091 +8.7688 +3.7658 +8.9988 +.4,;8213 +4.3344 +0.3632 +2.9766 +2;4986 +7.3417 +5.5444 +4.5~41 +~. 7688 +6.6540 +22.378 +1; 5176 +5.2721 +5.2760 +4.0073 +1. 2136 +2.8301 +4.0338 +6.0497 +2.9607 -4~1965 +i. 5937 +0.3632 +6.6095 +1. 2136 +4.6829 +3.8213 +2.4182 +6.1778 +37.306 +3.59.38 +10.944 +1. 5028 +4.1633 +~.5028 +8.7347 +7.6665 +34.01E +3.3064 1 9 10 11 13 14 15 17 18 21 26 28 33 39 471 56 57 60 61 68j '"d "1 o ~ CD CD 0.. ~. =' aq 00 I ~ e..... C-t o a ~. n o ~ aCD "1 n o a CD "1 CD ='~ ...CD ...co 0) ~ ........... ~ co CJ1 N CO 0) ........... (') ..... s:: 00 ~ ('D "i ~ o "i 1 1 +4.7895 9 -.36387 10. +.65889 11 - .19965 13 ,..29130. 14 +.37448 15, -.0.8430. 17 +;45582 18 +.870.13 21 -.27892 26 ... 0.1717 28 +.0.7960. 33 -.54182 39 .,..88753 47, -.58243 56 -.89158 57 +.22999 60. -.87236 61 - .1t.i321 68 -.310.69 9 10. 11 TABLE VIII· 9 ~ """. o Inverse of Within Groups Sums of Squares and Cross Products (SP) Matrix § 13 15 1.4 17 18 21 26 = 28 33 39 47 56 p. 57 60. 61 68 +.65889 -.19965 -.29130. +; 37448 -.0.8430. +.45582 +.870.13 -.27882 -.0.1717 +.07960. -.54182 -.88753 -.58243 -.89158 +.22999 -.87236 -.16321 -.310.69 1 9 +5.3770. ".1. 4410. -2.280.7 +.0.640.7 +.45554 +.535:59, +~ 14944 -.71938 - .0.50.0.8 +.0.8471 -.82438 +.210.25 +.2360.3 +.41460. +.11938 -.71326 +.12864 -.45911 10. -1. 4410. +7.1461 -1. 0.30.2 +;31518 -1.3250. -1. 0.364 +.20.638 -•.19892 +.18313 +.12727 +.75995 -.43834 ~" 27961 -.93619 -.62546 +1. 6151 -.35149 ".40.905 11 -2.280.7 -1. 0.30.2 +6.8221 -.62666 -.15958 -1. 3232 +.13984 -.42110. 
-.24723 -.090.88 +.0.0.0.66 +.69187 -.35779 -.22772 +.0.2469 -~13696 +.15264 +.50.533 13 +.0.640.7 +.31548 -.62866 +3.60.94 -.13688 -.15679 +.0.9965 ~.42777 -.33215 -.41733 -.170.96 +.10.552 ,..43472 -.22695 -.0.150.7 +.36342 -.10.962, -.2490.0. 14 +;45654 -1. 3250. -.15933 -.13688 +5.3675 -1. 5236 -.56668 -.50.976 "..870.40. -.0.4387 - .-25906 +.50.80.8 -.52870. +1. 0.774, +.33740. -.210.28 -:.5750.3 +.63519 15 +.53359 -1.0.364 -1.3232 -.13679 -1. 5236 +5.4720. -.32928 -.89612 +.20521 -.29967 -.55140. -.0.4372 +.26372 -.10.442 +.0.1144 +.15340. -.0.2996 -.0.4912 17 +'.14944 +.20.538 +.13984 +.0.9965 -.56668 -.32923 +3.8491 -1:; 640.48 -1. 0.10.9 +.0.8759 -.78134 -.42223 -.24533 -.20.975 +.12182 -.0.1175 -.22791 -.8140.7 18 -.71938 -.19692 -.42110. -.42777 -.50976 -.89812 +.54048 +4.9229 .... 74180. -.43416 -.68443 -.31915 -.10.817 .;,.27262 +.35766 -.51180. +.47157 -1. 5637 21 -.0.50.0.8 +.18313, -.24728 -.33215 -.870.49 +.20.521 -1.0.10.9 "".74180. +5.780.3 +.61250. -.0.8773 -.73685 -.00.683 -.0.3233 -.0.9158 -.28369 -.43446 +.11646 26 +.0.8471, +.12727 -,,0.90.98 -.41733 -.0.4387 -.29967 +.0.8759 -.43416 +.61250. +3.560.2 -.53728 -.64626 -.35438 -.0.6220. -.0.2278 -.32820. -.38625 -.22586 28 -.82438 +.759.95 +.0.0.0.66 -.170.96 -.2590.6 -.55140. -.78134 -.58448 -.0.8773 -.53728 +4.2980. +.11190. +.27977 "..12855 -.5590.0. -.0.3859 ,-.34554 +.240.0.6 33 +.210.26 -.43834 +.69187 +.10.552 +.50.80.8 -.0.4372 -.42223 -.31915 -.73685 ,-.64526 +.11190. -1:3.9297 -.36335 +.20.168 -.28987 +.18533 -.0.4978 +.25118 39 +.2360.3 ... 21961 -.36780, -.43472 -.52870. +.2,6372 -.24533 - .10.617 -.0.0683 -.35438 +.27977 -.36335 +4.2623 -.15616 -.11542 -.84641 +.0.0.732 -.26424 47 +.41460. -.93619 -.22772 -.22695 +1. 0.774 -.10.442 -.20.975 -.27262 -.0.3233 -.0.6220., -.12655 +.20.168 -.15516 +3.5394 +.0.2394 -.97478 +.16777 +.13447 56 +.11938 .:..62346 '+.0.2469, -.0.150.7 +.33740. +.0.1144 +.12182 +.35765 -.0.9158 -.0.2278 -.53699 ... 
28987 - .11542 +;0.2394 +3.2273 -.70.979 -.45211 -.31327 57 -.71326 +1.6151 -.13696 +.36542 -.21028 +.15340. -.0.1175 -.51180. -.28369 -.323.20. -.0.3659 +.18533 - .,84541 -.97478. -.76979 +5.0.131 -.670.9,3 -.5590.8 60. +.12864 .. ;,35149 +.15~64 -.10.952 -.5750.3 -.0.2996 -,,22791 +.47157 -.43446 -.38625 -.34664 -.0.4978 +.0.0.732 +.15777 -.45211 -.670.93 +3.8961 -.19339 61 -. ~5911 -.40.90.6 +.60.0.38 -.2490.0 ' +.63519 -.49125 -.8140.7 -1. 5637 +.11646 ... 22686 +.240.0.5 +.25116 -.26424 +.13447 -.31327 -.55968 -.19359 +5.0.80.8 68

* Each element of the matrix is to be multiplied by $10^{-2}$.

-.86387 +4.2844 -1.2788 -,.33975 +.29l52 -.88422 .,1.0.461 +.25134 -.35453 +.8680.7 +.0.3680. -.32662 -.0.1124 -.43655 -.13543 -.12110. -.27392 +.30.606 -.51847 +.63948 -1. 2788 -.33975 +.29152 -,,33422 .. 1. 0.461 +.25134 -.35483 '+.3680.7 +.0.3580. -.32662 -.0.1724 -.43655 - .13843 - .12170. -.27392 +.30.606 -.51847 +.63943

The data on symptoms yield the following values for the $T^2$'s:

$T_{12}^2 = 77.3215$, $T_{13}^2 = 15.8515$, $T_{14}^2 = 18.7329$,
$T_{23}^2 = 53.7181$, $T_{24}^2 = 36.4148$, $T_{34}^2 = 27.0456$.

It is apparent that $T_{\max}^2 = T_{12}^2 = 77.3215$. If we are interested in testing the hypothesis at the 6% level, the critical value, $T_\alpha^2$, is found in the following way.

$\mathrm{Probability}[\text{largest } T_{il}^2 \le T_\alpha^2 \mid H] \cong 1 - \sum_{i \neq l = 1}^{4} P_{il} = 0.94$,

where $P_{il} = \mathrm{Probability}[T_{il}^2 > T_\alpha^2 \mid H_{il}]$. Since all $P_{il}$'s are equal to (say) P, we have 1 - 6P = 0.94 and therefore P = 0.01. Now each $T_{il}^2$ is distributed as the F distribution with $(p, \nu)$, that is, (20, 141) degrees of freedom. The F tables yield a value of $T_\alpha^2 = 1.88$ when P = 0.01. Hence $T_{\max}^2 > T_\alpha^2$ and the hypothesis of no difference among diagnostic groups is rejected. In fact, each $T_{il}^2 > T_\alpha^2$, which indicates that each group differs significantly from all others.

Although the application of this particular test is justified only when the data are multivariate normal, it seems reasonable to assume that (1) the data for each diagnostic group do approximate normality since the sample sizes are moderately large; and (2) the test itself is not particularly sensitive to deviations from normality.

The results, while preliminary, are extremely encouraging in that they suggest the existence of reasonably clear-cut criteria for diagnostic grouping. Obviously much work remains to be done, but the pilot investigation to date gives every indication that future study will provide revealing information and meaningful results for differential diagnosis of emotional disturbances.

APPENDIX

Tests for Equality of Mean Vectors

Consider K multivariate normal populations with a common unknown covariance matrix $\Sigma$ and mean vectors $\mu_1, \ldots, \mu_K$, where $\mu_i' = (\mu_{i1}, \ldots, \mu_{ip})$ and $\mu_{it}$ denotes the ith population mean on the tth variate. Since a multivariate normal is completely specified by its mean vector and covariance matrix, the above populations are homogeneous if their mean vectors are equal. The hypothesis of equality of mean vectors is denoted by

$H: \mu_1 = \cdots = \mu_K$.

Various test procedures are available in the literature to test the total hypothesis H and to make a decision relative to the acceptance or rejection of its subhypotheses in the event H is rejected. For a detailed discussion on the relative merits of these procedures, the reader is referred to [5]. Two of these procedures are briefly reviewed here.

The first procedure is known as the "Multivariate Analysis of Variance (MANOVA) test based on the largest root." This test was proposed by S. N. Roy [7] and the test procedure is to accept or reject H according as

$c_L(S_H S_e^{-1}) \lessgtr \lambda_\alpha$,

where $c_L(A)$ denotes the maximum characteristic root of A and $\lambda_\alpha$ is chosen such that

$\mathrm{Prob}[c_L(S_H S_e^{-1}) \le \lambda_\alpha \mid H] = (1 - \alpha)$.

In the above equation, $S_H$ and $S_e$ respectively denote the sums of squares and cross products (SP) matrices due to H and "error." (In the univariate case, where p = 1, $S_H$ and $S_e$ denote the sums of squares due to H and error respectively.) The value of $\lambda_\alpha$ for any given value of $\alpha$ can be obtained from the tables of D. L. Heck [1]. If the total hypothesis is rejected, we can make multiple decisions on the acceptance or rejection of the various subhypotheses by examining the confidence intervals on the "parametric functions" which measure departures from these subhypotheses. These confidence intervals were derived by S. N. Roy and R. Gnanadesikan [8,9]. For an illustration of the MANOVA test with biochemical data, the reader is referred to [13].

The second method is based on the maximum of the $\binom{K}{2}$ Hotelling $T^2$'s. This procedure is applicable to test H and make multiple decisions on the acceptance or rejection of its subhypotheses of the form $H_{il}: \mu_i = \mu_l$ $(i \neq l = 1, 2, \ldots, K)$. This procedure was formulated by S. N. Roy and R. C. Bose [6] and is a multivariate analogue of Tukey's multiple comparison test [14]. The technique (with trivial modification) is described below:

Let

[...] $(i \neq l = 1, 2, \ldots, K)$,

where $\nu$ is the error degrees of freedom, $\eta_i$ is the size of the ith sample and $\hat{\mu}_i$ $(i = 1, 2, \ldots, K)$ is the maximum likelihood estimate of $\mu_i$. Then we accept or reject H according as

largest $T_{il}^2$ out of $\binom{K}{2}$ pairs $\lessgtr T_\alpha^2$,

where $T_\alpha^2$ is chosen such that

$\mathrm{Prob}[\text{largest } T_{il}^2 \text{ out of } \binom{K}{2} \text{ pairs} \le T_\alpha^2 \mid H] = (1 - \alpha)$.

If the total hypothesis H is rejected, we accept or reject the subhypotheses $H_{il}$ according as

$T_{il}^2 \lessgtr T_\alpha^2$.

Recently, it was shown [5] that the above test is better (in the sense of shortness of the lengths of the confidence intervals) than the MANOVA test, but the nature of the exact distribution of the "largest Hotelling $T^2$" is not known. M. Siotani [10-12] has suggested some approximations to this distribution which, for moderately large samples, seem satisfactory. When the error degrees of freedom are very large, the following approximation can be used.

$\mathrm{Prob}[\text{largest } T_{il}^2 \le T_\alpha^2 \mid H] \cong 1 - \sum_{i \neq l = 1}^{K} P_{il}$,

where

$P_{il} = \mathrm{Prob}[T_{il}^2 > T_\alpha^2 \mid H_{il}]$.

But, when $H_{il}$ is true, $T_{il}^2$ is distributed as the F distribution with $(p, \eta - p - K + 1)$ degrees of freedom, where

$\eta = \sum_{i=1}^{K} \eta_i$.

So, the values of the $P_{il}$'s can be obtained from F tables (or incomplete Beta function tables) for any given value of $T_\alpha^2$. Here, we note that M. Siotani suggested an approximation similar to the above as a first approximation for upper percentage points.

In one-way classification, $S_H = (s_{tu})$ and $S_e = (s_{etu})$ are respectively called "Between Groups" and "Within Groups" SP matrices, where

$s_{tu} = \sum_i \eta_i (x_{i \cdot t} - x_{\cdot \cdot t})(x_{i \cdot u} - x_{\cdot \cdot u})$,

$s_{etu} = \sum_i \sum_j (x_{ijt} - x_{i \cdot t})(x_{iju} - x_{i \cdot u})$,

and

$x_{i \cdot t} = \left(\sum_j x_{ijt}\right) / \eta_i$, $x_{\cdot \cdot t} = \left(\sum_i \sum_j x_{ijt}\right) / \eta$.

Here $x_{ijt}$ denotes the observed value associated with the jth individual in the ith group on the tth variate. $x_{i \cdot u}$ and $x_{\cdot \cdot u}$ can be defined similarly.
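The bookkeeping in this Appendix can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Univac routines: all function names and the toy data are assumptions, and the Within Groups sum is written in its standard one-way form. The sketch covers the Between Groups and Within Groups SP matrices, the error degrees of freedom, and the equal split of the overall level across the pairwise tests; the F-table lookup itself is left out.

```python
import math


def error_degrees_of_freedom(n_total, p, k):
    """nu = eta - p - K + 1, as used in the text."""
    return n_total - p - k + 1


def per_pair_tail_probability(alpha, k):
    """Equal-split P satisfying 1 - C(K, 2) * P = 1 - alpha."""
    return alpha / math.comb(k, 2)


def sp_matrices(groups):
    """Return (S_H, S_e) for one-way classified data.

    groups[i][j][t] is x_{ijt}: variate t for individual j in group i.
    """
    p = len(groups[0][0])
    n_total = sum(len(g) for g in groups)
    # Grand means x_{..t} over all individuals.
    grand = [sum(row[t] for g in groups for row in g) / n_total for t in range(p)]
    s_h = [[0.0] * p for _ in range(p)]
    s_e = [[0.0] * p for _ in range(p)]
    for g in groups:
        mean = [sum(col) / len(g) for col in zip(*g)]  # group means x_{i.t}
        for t in range(p):
            for u in range(p):
                s_h[t][u] += len(g) * (mean[t] - grand[t]) * (mean[u] - grand[u])
                s_e[t][u] += sum((row[t] - mean[t]) * (row[u] - mean[u]) for row in g)
    return s_h, s_e


# Hypothetical toy data: two groups, two 0/1 variates.
toy_groups = [
    [[1, 0], [1, 1], [0, 1]],
    [[0, 0], [1, 0]],
]
toy_s_h, toy_s_e = sp_matrices(toy_groups)
nu = error_degrees_of_freedom(164, 20, 4)   # values quoted in the text
p_pair = per_pair_tail_probability(0.06, 4)  # six pairs share the 6% level
```

With the figures quoted in the paper (total sample 164, 20 variates, 4 groups, 6% overall level) this gives 141 error degrees of freedom and a per-pair probability of 0.01. A useful sanity check on any implementation is that the two matrices add up to the total SP matrix computed about the grand mean.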
Proceedings-Fall Joint Computer Conference, 1962 / 299 TABLE A-1 List of Symptoms DESCRIPTION ORIGINAL NUMBER REASSIGNED NUMBER School Maladjustment: Arithmetic disability Reading disability Unsatisfactory school work in general Truancy from school Frequent absences from school Fear or extreme dislike of school n 1 4 5 6 Delete 2 3 7 8 9 4 5 6 10 11 12 7 8 Delete 13 14 15 16 17 18 19 20 21 22 23 24 25 9 10 11 12 13 14 15 16 17 18 19 20 21 2;} 22 Asocial Behavior: Destructiveness Lying Stealing Cheating Pre-occupation with matches, fire-setting Running away from home Negative Attitudes and Behavior: Sullenness, Sulkiness Disobedience Stubbornness Negativism Rebelliousness Resentment Easily aroused anger Unprovoked anger Temper tantrums Excessive whining or crying Jealousy, envy of others Bearing grudges Teasing behavior Attacking behavior Fighting Bullying Cruelty 27 28 29 Delete 30 31 32 33 34 35 36 37 38 23 24 25 26 27 28 29 30 31 Other Interferences in Social Relationships: Seclusiveness Shyness Excessive passivity, non-assertiveness Over-sensitivity Over-conformity Over-dependency Excessive independency Clinging exaggerated display of affection General indifference or disinterest in people 300 / Cluster Formation and Diagnostic Significance in Psychiatric Symptom Evaluation TABLE A-1 (Continued) DESCRIPTION ORIGINAL NUMBER REASSIGNED NUMBER 39 40} 41 42 43 44 45 46 47 48 48 32 33 97 34 35 36 37 38 Delete Delete 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 39 40 41 Delete 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 Other Interferences in Social Relationships (Continued) Inability to form strong attachments or relationships Inability to get along with other children his own age Unpopularity with children Inability to get along with adults Preference for younger children Preference for older children Suspiciousness, distrust of people Excessive assertiveness Excessive competitiveness "Sissy-like" behavior (boys) "Tom-boyish" behavior (girls) Attitudes 
Toward The Self: Lack of self-confidence Self-depreciation Feelings of bodily inadequacy or defects Attempts at physical self-injury Disregard for his own possessions Carelessness Messiness Worrying Emphasis on sameness Dawdling, procrastination Over- conscientiousness, need for perfection Specific fears, phobias General fearfulness, timidity Unrealistic fearlessness General feeling of dissatisfaction, lack of enthusiasm Specific compulSions Bragging, grandiosity Accident proneness Exaggerated unconcern Interference with Thought Activity: Day-dreaming Verbalized fantasies Delusions, hallucinations Complete self-absorption Inability to concentrate Boredom, lack of interest Specific obsessions Memory disturbances - underdeveloped Memory disturbances - overdeveloped 68 69} 70 71 72 73 74 75 76 57 59 60 61 62 63 64 77 65 58 Motor Disturbances: Tiredness Proceedings-Fall Joint Computer Conference, 1962 /301 TABLE A-1 (Continued) ORIGINAL NUMBER REASSIGNED NUMBER 78 79 80 81 82 83 84 85 66 67 68 69 70 Delete 71 Delete 86 87 88 89 90 91 92 72 73 74 75 76 77 Delete 93 94 95 96 97 98 99 100 101 102 78 79 80 81 Delete 82 Delete Delete Delete Delete 103} 104 83 Day Night 105} 106 84 Excessive masturbation Sex Play 107 Delete Exhibitionism Voyeurism 108} 109 85 DESCRIPTION Motor Disturbances (Continued) Laziness Poor coordination, awkwardness Restlessness Hyperactivity Stereotyped mannerisms Tics Facial grimaces Head-banging Disruption in Developmental Pattern: Speech disorders Doesn't talk at all Slow in learning to speak Doesn't speak clearly Limited vocabulary Repetitive speech Use of 3rd person Other speech disorders Eating problems Won't eat at all Doesn't eat enough, "poor eater" Likes only a limited number of foods Won't try new foods Eats only strained foods Food fads Allergies to certain foods Excessive appetite Eats with fingers, poor table manners Other eating problems Wetting Day Night Soiling 302 / Cluster Formation and Diagnostic Significance in Psychiatric 
Symptom Evaluation

TABLE A-1 (Continued)

DESCRIPTION                                            ORIGINAL    REASSIGNED
                                                       NUMBER      NUMBER

Disruption in Developmental Pattern (Continued)
Sex Play (Continued)
  Sexual activities with others                        110         Delete
  Sexual attitude                                      111         Delete
  Sexual approach                                      112         Delete
  Fearful of sex                                       113         Delete
  Other sexual problems                                114         Delete
  Thumb-sucking                                        115         86
  Nail-biting                                          116         87
  Mouthing, licking, sucking, biting objects           117         88
  Fetishism                                            118         89
Sleep disturbances
  Refusal to go to bed                                 119         90
  Fear of the dark                                     120         91
  Difficulty falling asleep                            121         92
  Nightmares                                           122         93
  Talking or calling out in sleep, crying in sleep     123         94
  Sleep-walking                                        124         Delete
  Light sleeper, waking up frequently at night         125         Delete
  Prowling at night through the house                  126         Delete
  Wanting to sleep with one or both parents            127         95
  Excessive sleep                                      128         96
  Other sleep disturbances                             129         Delete
  Other problems not mentioned above                   130         Delete

BIBLIOGRAPHY

1. Heck, D. L., "Charts of Some Upper Percentage Points of the Distribution of the Largest Characteristic Root," Annals of Mathematical Statistics, Vol. 31 (1960), pp. 625-642.
2. Johnson, P. O., Statistical Methods in Research, Prentice Hall, Inc., New York (1950).
3. Kaskey, Gilbert and Krishnaiah, P. R., "Statistical Routines for Univac Computers," Transactions of Middle Atlantic Conference of American Society for Quality Control (1962), pp. 289-304.
4. Krishnaiah, P. R., Simultaneous Tests and the Efficiency of Generalized Balanced Incomplete Block Designs (Unpublished Manuscript).
5. Krishnaiah, P. R., "Multiple Comparison Tests in Multi-Response Experiments." Presented at the Annual Meeting of the Institute of Mathematical Statistics, September 1962.
6. Roy, S. N., and Bose, R. C., "Simultaneous Confidence Interval Estimation," Annals of Mathematical Statistics, Vol. 24 (1953), pp. 513-536.
7. Roy, S. N., Some Aspects of Multivariate Analysis, John Wiley and Sons, Inc., New York, 1957.
8. Roy, S. N., and Gnanadesikan, R., "Further Contributions to Multivariate Confidence Bounds," Biometrika, Vol. 44 (1957), pp. 399-410.
9. Roy, S. N., and Gnanadesikan, R., "A Note on Further Contributions to Multivariate Confidence Bounds," Biometrika, Vol. 45 (1958), p. 581.
10. Siotani, M., "The Extreme Value of the Generalized Distances of the Individual Points in the Multivariate Normal Sample," Annals of the Institute of Statistical Mathematics, Vol. 10 (1959), pp. 183-203.
11. Siotani, M., "On the Range in Multivariate Case," Proceedings of the Institute of Statistical Mathematics, Vol. 6 (1959), pp. 155-165.
12. Siotani, M., "Notes on Multivariate Confidence Bounds," Annals of the Institute of Statistical Mathematics, Vol. 11 (1960), pp. 167-182.
13. Smith, H., Gnanadesikan, R., and Hughes, J. B., "Multivariate Analysis of Variance (MANOVA)," Biometrics, Vol. 18 (1962), pp. 22-41.
14. Tukey, J. W., The Problems of Multiple Comparisons (Unpublished Notes), Princeton University.

SPACETRACKING MAN-MADE SATELLITES AND DEBRIS

Robert W. Waltz, Colonel, USAF
Commander
9th Aerospace Def. Div.
and
B. M. Jackson, Captain, USAF
Ent Air Force Base, Colorado

There are many types of satellites in orbit. One revolution requires a minimum of about ninety minutes. This varies, of course, and can be several times this quantity. The only organization in the free world that has responsibility for tracking all man-made objects in space is the 1st Aerospace Control Squadron.

AEROSPACE EVALUATION

The United States got into the aerospace business in 1957 when Russia launched its first Sputnik. At that time there was no operational skill developed in the art of detecting or tracking satellites. The Research and Development Command of the Air Force was assigned the responsibility to develop such a capability. This organization was established at Hanscom Field. In addition to research and development responsibility, they were tasked with the operational requirements of those days.

In November 1960, the Air Force began its training of a group of Air Force personnel to establish a completely operational organization. In February of 1961, Headquarters USAF directed that the 1st Aerospace Control Squadron be in position by 1 July at Ent Air Force Base with an operational capability of detecting and tracking satellites in support of NORAD. At that time we had neither trained personnel nor a physical location to go into, communications facilities, nor a computer. In the next four months our organization increased from the original twelve people to something in the order of over 100. A building was identified. Generals were moved from their offices and the facilities were completely rehabilitated. A computer was designated and installed by the middle of April of 1961. Personnel began arriving on site about that time, and by the 10th of June all of the facilities were ready. Unexpectedly, the computer at our research and development facility had a failure due to air conditioning. The requirement was placed upon our organization to commence operations 20 days ahead of schedule. We accomplished this without much difficulty.

Since that time we have taken on the name of Spacetrack Center. This is the portion of 1st Aerospace Squadron that has the satellite mission. We were established as a contingent of the NORAD Combat Operations Center. The entire space detection and tracking system, which is the responsibility of NORAD, has been dubbed SPADATS.

SATELLITE BACKUP

In order to provide reliability in the system and to provide our research organization with the equipment and tools to further develop the state of the art of satellite detection and tracking, a backup facility was established at Hanscom Field. Equipment identical to that installed at Ent has been provided. The backup installation is called the Spacetrack R&D Facility and is under the control of the Air Force Systems Command. As do we, they have other names, the prime one being "496L Systems Project Office" (SPO).
RELATIONSHIPS WITH OTHER ORGANIZATIONS

The 1st Aerospace Control Squadron was established to support the North American Air Defense Command (NORAD). The United States Air Force, through its Air Defense Command, is responsible for the support and technical operation of all the facilities of the NORAD Combat Operations Center (COC) located at Ent Air Force Base. The 1st Aerospace Control Squadron is responsible to the Air Defense Command (through the 9th Aerospace Defense Division). NORAD is a unified command composed of all the various military services of the United States and Canada.

SPACETRACK MISSION

The mission of the Spacetrack Center is to detect and track all man-made space objects; maintain an information catalogue on all space objects; determine orbits and ephemerides of all space objects; provide system status and satellite displays; and provide satellite data to NORAD and other military and scientific agencies as required.

SATELLITE SENSORS

To support the mission of Spacetrack, there are sensors located all around the world. There are three basic types of sensors that are used to detect and track satellites. These are optical, radar and radiometric. Some of these are controlled by military organizations; others are scientific instruments which support Spacetrack on a cooperative basis.

An example of the optical type sensor is the Baker-Nunn camera. A Baker-Nunn camera is very similar to a telescope and is equipped for taking pictures of satellites against a star background. The mount for the Baker-Nunn camera is steerable and can be programmed to track the path of the satellite in accordance with predictions provided by the Spacetrack Center. Techniques of the astronomer are used to determine the position of the satellite with respect to reference stars. In this manner, we obtain the right ascension and declination of the satellite's position.

Radars provide our most useful information in that they determine all of the quantities required to fix the satellite's position as one point in space. From the radar we get azimuth, elevation, range, range rate, doppler, and what we call a signature of the satellite. By signature we imply its tumble rate and radar cross section from which we can, in general, determine the size of the object, length, width and general configuration.

Radars are of two basic types. One is the fixed fan, which has a stationary antenna. It radiates electrical energy in a horizontal or vertical fan-shaped plane. The other type of radar is the tracker. It provides a pencil beam pattern and must be pointed at the satellite. The antenna is parabolic and is steerable either by manual control or automatically by computer program. See Fig. 1.

Figure 1. Aerial View of Thule - This aerial view of the BMEWS, Thule, Greenland radar site shows the four huge detection radar antennas and the one radome which protects the movable tracking radar antenna. The stationary detection antennas measure 165 feet in height and 400 feet across the top.

A third and last major sensor that contributes significantly to our mission is the radiometric type. It is a passive type sensor that depends upon transmissions from the satellite itself. It is, in effect, a simple but highly directional radio receiver. With this device we can determine the azimuth of the satellite, the time of its closest approach to the facility, and the doppler.

COMMUNICATIONS AND SENSOR LOCATIONS

Teletype or high-speed data link communications connect Spacetrack to all of the major sensors or centers that collect satellite observations. Direct circuits are provided to:

a. The Ballistic Missile Early Warning fixed fan radars at Clear, Alaska, and Thule, Greenland (Thule also has a tracking radar).
b. Shemya Island (Alaska) - both a fixed fan and tracking radar.
c. Laredo AFB, Texas - tracking radar.
d. Moorestown, New Jersey - tracking radar.
e. Spacetrack R&D Facility, Mass. - for computer backup. Data from a radiometric sensor and a tracking radar are received from this area.
f. SPASUR operated by the U. S. Navy - consists of a series of vertical radar fans across southern United States.
g. Patrick AFB, Florida - provides data from the Atlantic Missile Range sensors including all types.
h. Sunnyvale and Point Mugu, Calif. - for data from the Pacific Missile Range.
i. National Aeronautics & Space Administration (NASA) at Greenbelt, Maryland - links Spacetrack with the scientific sensors around the world including those of foreign governments.

SATELLITE DATA INPUTS

Information received from the sensors by 100 wpm teletype circuits is processed to the computer in two modes. In the manual mode a paper tape and hard copy of each message received by the Communications Center is delivered to the Data Conversion Room. If the message is in the proper format, it can be immediately converted from paper tape to punched cards by using an IBM 047 tape-to-card converter. On the other hand, for certain few messages, the observations must be processed manually from the hard copy by entering the information in a converted form onto a standard observation sheet. This sheet, in turn, is hand-punched to provide cards. The cards are then consolidated and read on magnetic tape in an off-line mode.

The second mode is called a semiautomatic mode of operation. Due to equipment and program difficulties, this system has never been used operationally. However, the hardware is designed to allow satellite observations received by teletype to be fed directly to the computer through electrical circuits. The system input starts with the sensor looking at the satellite and determining its position in azimuth, elevation, range, range rate, and any other quantities it can obtain.
This information is transmitted over 100 wpm teletype circuits and received in the Communications Center on a page printer which provides a hard copy. If the format is coded beginning with five $ signs, switching action routes the message to the computer. The switching unit is activated by the first three $ signs and connects the incoming circuits to a tape punch. The remainder of the message, which includes two $ signs, the text and the clOSing signal, is punched on paper tape and stored until ready to be transmitted to the computer electrically. To end the message, a signal of five right-hand parentheses are provided. Three of these activate the switching unit to return it to its normal pOSition of a local tape punch in the C omm Center. Messages not in this coded format are received only as a hard copy and local paper tape in the Comm Center. As mentioned in the manual mode, these two items are delivered to Data Conversion for processing by hand. Upon command from the computer, messages stored in the Comm Center's semiautomatic input equipment can be transmitted through the DMNI and Real-Time System to the Philco 2000. Twenty-four low-speed teletype and three high-speed data link circuits can be connected to the DMNI at present. To provide for storage of data during periods when the 2000 is inoperative, aDMNI recorder will be installed. This will consist of a 410 processor and a magnetic tape unit. SATELLITE PROGRAM (A-1) The A-1 program system (see Figure 2) provides for the processing of nearly all of the routine satellite data. It contains its own executive and is completely independent of INPUTS OBSERVATION CONVERSION Report Tape REPORT ASSOCIATION DIFFERENTIAL CORRECTION New Elements (RA,i,w,e,To 8t P) OUTPUTS TO SENSORS NOR AD AND OTHER USERS Equatorial Crossing Time It Longitude Table to Correct for any Locat ion Position of Satellite for one Sensor. Azimuth, Elevation, Range &. Time Figure 2. Satellite Computer Program (A-I). 
the Philco SYS. The inputs to this system are from magnetic tape and consist of (1) the 70TTY IN tape, which contains the manual data provided from punched cards, and (2) the DMNI tape created through the semiautomatic mode of operation. These two tapes are the input to the observation conversion (ORCON) routine, where the observations are converted to one common standard computer format on a report (R) tape. The R tape becomes the input to the report association (RASSN) routine. In RASSN, the observations (reports) are compared against the entire satellite inventory and are identified according to their association with the predicted satellite positions. Those that meet very narrow limits established in the program are called fully associated reports (Ra). Those that do not associate within this narrow tolerance, or that associate with more than one satellite, are called doubtfully associated reports (Rd). The third category contains those observations which do not associate with any object in the satellite inventory; these are called unassociated reports (Ru). Most of the unassociated reports are the result of electrical noise or inaccurate observations.

The SRADU tape (sorted reports: associated, doubtful, and unassociated) is the input to the Simplified General Perturbation Differential Correction (SGPDC) routine. Here the fully associated observations are used to correct the elements of each satellite. Elements are the variables required to fully describe a satellite orbit. These consist of the right ascension, the inclination, the argument of perigee, the eccentricity, the time of epoch, and the period of the satellite. The outputs produced from the new elements are forwarded to the sensors in order that they may acquire the satellites on future passes. In addition, outputs are forwarded to NORAD and other users for tactical and scientific purposes. The primary outputs consist of the bulletin and the look angle.
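The three-way sorting that RASSN performs on each report can be illustrated with a short modern sketch in Python. It is not the Spacetrack routine: the paper gives neither the comparison metric nor the tolerance values, so the straight-line miss distance in kilometers, the two thresholds NARROW_KM and WIDE_KM, and all function and type names below are hypothetical choices made only for illustration.

```python
import math
from enum import Enum


class Association(Enum):
    FULL = "Ra"          # fully associated report
    DOUBTFUL = "Rd"      # doubtfully associated report
    UNASSOCIATED = "Ru"  # unassociated report


# Hypothetical tolerances; the paper says only that the limits are "very narrow".
NARROW_KM = 10.0
WIDE_KM = 100.0


def associate(observed, predicted):
    """Classify one observed position against the predicted satellite inventory.

    observed  -- (x, y, z) position in km (assumed representation)
    predicted -- dict mapping satellite id to its predicted (x, y, z) in km
    Returns (category, list of candidate satellite ids).
    """
    misses = {sat: math.dist(observed, pos) for sat, pos in predicted.items()}
    narrow = sorted(sat for sat, d in misses.items() if d <= NARROW_KM)
    wide = sorted(sat for sat, d in misses.items() if d <= WIDE_KM)

    # Exactly one satellite inside the narrow limit and no other candidate.
    if len(narrow) == 1 and len(wide) == 1:
        return Association.FULL, narrow
    # Outside the narrow limit, or more than one satellite is a candidate.
    if wide:
        return Association.DOUBTFUL, wide
    # Nothing in the inventory is close: likely noise or a bad observation.
    return Association.UNASSOCIATED, []


inventory = {"A": (7000.0, 0.0, 0.0), "B": (0.0, 7000.0, 0.0)}
category, candidates = associate((7002.0, 0.0, 0.0), inventory)
print(category.value, candidates)
```

Under these assumptions only the reports returned as Association.FULL would be passed on to the differential correction step; the other two categories would be set aside for review.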
The bulletin is a listing of equatorial crossing times and longitudes for each revolution of the satellite. Also included with the bulletin is a table to correct the equatorial crossing to any location in the world. Look angles provide the position of one satellite for one sensor. This information is presented in the form of azimuth, elevation, range, and time for the specific sensor and the specific satellite to be observed. Other outputs from our system include periodic reports required by NORAD and the other users of our information.

In addition to the A-1 system, there are about 20 other programs in use for satellite computations. These were originally written in FORTRAN and IBM machine language and were later converted to ALTAC for use with Philco SYS. They are now being converted to TAC for attachment to the A-1 executive.

SPACETRACK CONTROL ROOM

The computer outputs are manually checked for accuracy prior to release to the tactical or scientific communities. This is accomplished in our Spacetrack Control Room. Compared to operational displays for aircraft surveillance operations, the Spacetrack Control Room is quite an unspectacular facility. (See Figure 3.) To date, no dynamic displays have been created which are entirely satisfactory for satellite purposes.

Figure 3. Space Scoreboards. Status boards in North American Air Defense Command's Space Detection and Tracking System Operations Control Center at Colorado Springs, Colorado, display timely tracking information on all man-made objects in earth orbit.

In order to provide satellite data to CINCNORAD, static menu boards and a random access slide projector are used. Included are a summary of the satellite population, detailed information regarding the payload of each launch, and sensor data. A world map shows the location of the major sensors and centers that collect observations and the connecting communications circuits.
These are displayed to NORAD thru a closed-circuit television system.

BMEWS BACKUP

In addition to the satellite mission, the Philco 2000 has been programmed to provide a backup function for the BMEWS Display Information Processor (DIP) located in the NORAD COC. The mission of BMEWS (Ballistic Missile Early Warning System) is to display to NORAD impact and launch data on missiles detected by BMEWS, which has radars at Clear, Alaska, and Thule, Greenland.

BMEWS RADARS AND FUNCTIONS

Figure 4. BMEWS Data Flow.

The BMEWS site at Clear consists of several fixed fan type radars. The site at Thule consists of a tracker radar in addition to several fixed fan radars. A future site being installed at Fylingdales, England, will contain all tracker type radars. These "forward" sites provide the coverage required to detect hostile missiles that may be launched against the North American Continent. With computers located at these sites, predicted missile launch and impact points and the times of impact are computed. This information is provided to the computer facilities within the NORAD COC for generation of displays.

BMEWS PROGRAM (B-1)

Data flows to the Philco 2000 from the forward site radars over high-speed data link circuits and thru the DMNI and Real-Time System. The B-1 program (see Figure 4) processes this information for display in the COC. Like the A-1 program, B-1 is completely independent of SYS. The B-1 program converts the forward site information into data suitable for drawing slides for a projection system and for driving numerical displays. The output is provided through the DMNO (Device for Multiplexing Non-Synchronous Outputs) to four devices: the impact and launch display decoder for the projection system, the threat summary panel, the DMNO flexowriter for status data, and the remote transmitters that provide information similar to that displayed in the NORAD COC to the Strategic Air Command and the Joint Chiefs of Staff.

NORAD COC DISPLAYS
The BMEWS data are presented in the Combat Operations Center (COC) on two projection systems and a numerical display. The projection systems consist of two maps: one is of the North American Continent, on which predicted impacts are displayed, and the other is a polar projection of the European-Asian Continent, on which computed launch points are indicated. Both launch and impact locations are represented by ellipses. Included with each ellipse is a letter-number reference that identifies the forward site which detected the missile and a serial number for correlation between the two maps.

The Threat Summary Panel is a numerical display. "Five-minute windows" provide a measure of the size of the missile raid during the past five minutes. In another portion of the panel are shown the total numbers, for each site, of missiles detected and predicted to impact on the North American Continent. The time of next missile impact is displayed. Lastly, an "Alarm Level" is produced, which is a combined measure of the missile raid and summarizes the credence of the threat.

FUTURE

In the near future the B-2 program will be integrated into the system. It combines the functions of both A-1 (for satellite) and B-1 (for BMEWS) under one executive. This will allow for full-time backup of the DIP computer and for processing satellite data simultaneously. Under study is hardware which will completely automate the satellite inputs and outputs. The present manual teletype system will be replaced by a computerized communications center.

CURRENT SATELLITE SITUATION

The satellites in orbit as of 24 September 1962 were as follows:

                                            USA     UK   USSR
Payloads in Earth Orbit                      38      1      4
Debris in Earth Orbit                      *185      0      4
Deep Space Probes (Heliocentric Orbit)        4      0      2
Total                                       227      1     10
Objects Decayed                             109            60

*Includes 61 Omicron, which exploded and produced 139 known objects.

LIST OF REVIEWERS

Mr. S. N. Alexander, Mr. James P. Anderson, Mr. W. L.
Anderson, Miss Dorothy P. Armstrong, Dr. M. M. Astrahan, Mr. Robert C. Baron, Mr. W. D. Bartlett, Mr. R. S. Barton, Mr. Aaron Batchelor III, Mr. J. V. Batley, Mr. M. A. Belsky, Mr. Eric Bittman, Mr. Erich Bloch, Dr. Edward K. Blum, Mr. Theodore H. Bonn, Mr. Arthur Bridgman, Mr. Herbert S. Bright, Mr. Edward A. Brown, Mr. L. E. Brown, Mr. W. Brunner, Dr. Werner Buchholz, Mr. James H. Burrows, Mr. R. V. D. Campbell, Mr. B. F. Cheydleur, Mr. C. K. Chow, Mr. Robert H. Courtney, Mr. Robert P. Crago, Mr. L. Jack Craig, Dr. T. H. Crowley, Mr. James A. Cunningham, Miss Ruth M. Davis, Dr. Douglas C. Engelbart, Mr. Howard R. Fletcher, Dr. Ivan Flores, Miss Margaret R. Fox, Mr. R. F. Garrard, Mr. Ezra Glaser, Mr. Jack Goldberg, Mr. Geoffrey Gordon, Mr. Joseph K. Hawkins, Mr. George G. Heller, Mr. George Heller, Mr. W. H. Highleyman, Mr. S. A. Hoffman, Mr. Arthur Holt, Mr. E. Hopner, Dr. Grace Murray Hopper, Mr. Richard A. Hornweth, Mr. Paul W. Howerton, Mr. Morton A. Hyman, Mr. George T. Jacobi, Mr. Robert Jayson, Dr. Laveen Kanal, Mr. Herbert R. Koller, Dr. R. A. Kudlich, Mr. J. Kurtzberg, Mr. Michael R. Lackner, Dr. Herschel W. Leibowitz, Prof. C. T. Leondes, Mr. Harry Loberman, Mr. J. D. Madden, Miss Ethel C. Marden, Mr. Walter A. Marggraf, Dr. R. E. Meagher, Mr. Philip W. Metzger, Mr. Albert Meyerhoff, Dr. Robert C. Minnick, Mrs. Betty S. Mitchell, Mr. Ralph E. Mullendore, Mr. Simon M. Newman, Mr. J. P. Nigro, Mr. Glenn A. Oliver, Mr. James L. Owings, Mr. G. W. Petrie, Mr. C. A. Phillips, Dr. Arthur V. Pohm, Mr. Jack Raffel, Mr. M. J. Relis, Mr. A. E. Rogers, Mr. David Rosenblatt, Mr. Arthur I. Rubin, Dr. Morris Rubinoff, Mr. Bruce Rupp, Prof. Norman R. Scott, Mr. I. Seligsohn, Mr. Donald Seward, Mr. J. E. Sherman, Mr. William Shooman, Mr. R. A. Sibley, Mr. Q. W. Simkins, Mr. R. F. Stevens, Dr. Richard I. Tanaka, Mr. Lionel E. Thibodeau, Mr. Robert Tillman, Mr. R. A. Tracy, Mr. R. L. Van Horn, Mr. Kenneth W. Webb, Prof. Gerard P. Weeg, Mr. Thomas J. Welch, Mr. Stanley Winkler, Mr. Hugh Winn, Mr. H. Witsenhausen, Mr. William W.
Youden

1962 FALL JOINT COMPUTER CONFERENCE COMMITTEE

General Committee
J. Wesley Leas, Chairman, RCA Building 201-3, Camden 8, N. J.
E. Everet Minett, Vice-Chairman, Remington Rand Univac, Blue Bell, Pa.
T. T. Patterson, Secretary, RCA Building 13-2, Blue Bell, Pa.

Finance Committee
Herman A. Affel, Chairman, Auerbach Corporation, Phila. 3, Pa.
Arthur D. Hughes, Vice Chairman, Auerbach Corporation, Phila. 3, Pa.

Public Relations Committee
Thomas D. Anglim, Chairman, Remington Rand Univac, Blue Bell, Pa.
Joseph Hoffman, General Electric Co., Missile & Space Division, Valley Forge, Pa.
Thomas I. Bradshaw, RCA Building 202-2, Camden 8, N. J.
E. C. Bill, Remington Rand Univac, Blue Bell, Pa.

Proceedings Committee
Joseph D. Chapline, Chairman, Philco Computer Division, Willow Grove, Pa.
Walter Grabowsky, Vice Chairman, Auerbach Corporation, Phila. 3, Pa.

Program Committee
E. Gary Clark, Chairman, Burroughs Corporation, Paoli, Pa.
Arnold Shafritz, Auerbach Corporation, Phila. 3, Pa.
Aaron Batchelor, Vice Chairman, Burroughs Corporation, Paoli, Pa.
Robert S. Barton, 1981 E. Meadowbrook Rd., Altadena, Cal.
T. H. Bonn, Remington Rand Univac, Blue Bell, Pa.
Dr. Stanley Winkler, IBM, Fed. System Div., Rockville, Md.
B. F. Cheydleur, Philco Computer Division, Willow Grove, Pa.
James L. Owings, RCA Building 82-1, Camden, N. J.
Dr. Hugh Winn, General Electric Co., Missile & Space Division, Valley Forge, Pa.

Arrangements Committee
Peter E. Raffa, Chairman, Technitrol, Inc., Phila. 34, Pa.
Robert A. Hollinger, Vice Chairman, Technitrol, Inc., Phila. 34, Pa.
William McBlain, Minneapolis-Honeywell Regulator Co., Pottstown, Pa.

Ladies Activity Committee
Miss Mary Nagle, Chairman, RCA Building 204-2, Camden 8, N. J.
Mrs. John M. Bailey, 1032 Wayne Rd., Haddonfield, N. J.
Mrs. John W. Mauchly, Vice-Chairman, Mauchly Associates, Fort Washington, Pa.
Miss Josephine Schiazza, RCA Building 204-2, Camden 8, N. J.

Printing and Mailing Committee
Norman A. Miller, Chairman, Remington Rand Univac, Blue Bell, Pa.
John Coston, Remington Rand Univac, Blue Bell, Pa.
Mrs. Ethel Levinson, Remington Rand Univac, Blue Bell, Pa.

Registration Committee
Louis F. Cimino, Chairman, General Electric Co., Missile & Space Div., Valley Forge, Pa.
Richard D. Burke, Vice-Chairman, IBM Corp., Phila. 2, Pa.
John Schafer, General Electric Co., Missile & Space Div., Valley Forge, Pa.
Miss Eleanor Gardosh, General Electric Co., Missile & Space Div., Valley Forge, Pa.
Jack Armstrong, General Electric Co., Missile & Space Div., Valley Forge, Pa.
Miss Liz Gunson, IBM Corp., 230 S. 15th St., Phila. 2, Pa.
Sol Steingard, General Electric Co., Missile & Space Div., Valley Forge, Pa.

Exhibits Committee
R. A. C. Lane, Chairman, RCA Building 204-1, Camden 8, N. J.
Lowell Bensky, Rese Engineering, A & Courtland St., Phila., Pa.
W. P. Hogan, Vice Chairman, Leeds & Northrup, North Wales, Pa.

Special Events Committee
Herbert S. Bright, Chairman, Philco Computer Division, Willow Grove, Pa.
Dr. Louis R. Lavine, Philco Computer Division, Willow Grove, Pa.
B. F. Cheydleur, Vice Chairman, Philco Computer Division, Willow Grove, Pa.
R. Paul Chinitz, Remington Rand Univac, Blue Bell, Pa.
Edward H. Nutter, Philco Computer Division, Willow Grove, Pa.
John W. Mauchly, Mauchly Associates, Fort Washington, Pa.
Daniel Ashler, Auerbach Corporation, 1634 Arch Street, Phila. 3, Pa.
Dr. Morris Rubinoff, Moore School of Electrical Engrg., University of Pennsylvania, Phila., Pa.
Harry Bortz, IBM Corp., 230 S. 15th St., Phila. 2, Pa.
Mrs. Margery League, Remington Rand Univac, Blue Bell, Pa.

Administration Committee
T. T. Patterson, Chairman, RCA Building 13-2, Camden 2, N. J.
John P. Brennan, Jr., Vice Chairman, RCA Building 82-1, Camden, N. J.

Technical Advisor
Dr. Morris Rubinoff, Moore School of Electrical Engineering, University of Pennsylvania, Philadelphia 4, Pa.

AMERICAN FEDERATION OF INFORMATION PROCESSING SOCIETIES (AFIPS)

AFIPS, P. O. Box 1196, Santa Monica, California

Chairman
Dr. Willis H. Ware, The RAND Corporation, 1700 Main Street, Santa Monica, Calif.

Executive Committee
Dr. Arnold A. Cohen, IRE
Dr. Harry D. Huskey, ACM
Dr. Morris Rubinoff, AIEE

Secretary
Miss Margaret R. Fox, National Bureau of Standards, Data Processing Systems Div., Washington 25, D. C.

Treasurer
Mr. Frank E. Heart, Lincoln Laboratory, P. O. Box 73, Lexington 73, Mass.

AIEE Directors
Mr. G. L. Hollander, Hollander Associates, P. O. Box 2276, Fullerton, Calif.
Mr. C. A. R. Kagan, Western Electric Co., P. O. Box 900, Princeton, N. J.
Mr. H. T. Marcy, IBM Corporation, 1000 Westchester Ave., White Plains, N. Y.
Dr. Morris Rubinoff, Moore School of Elec. Engr., 200 South 33rd St., Philadelphia 4, Pa.

ACM Directors
Mr. H. S. Bright, Secretary, ACM, Philco Computer Division, Willow Grove, Pa.
Mr. W. M. Carlson, E. I. duPont deNemours & Co., Mechanical Research Lab., 101 Beech St., Wilmington 98, Del.
Dr. H. D. Huskey, Computer Center, University of Calif., Berkeley 4, Calif.
Mr. J. D. Madden, System Development Corp., 2500 Colorado Ave., Santa Monica, Calif.

IRE Directors
Mr. W. L. Anderson, General Kinetics, Inc., 2611 Shirlington Road, Arlington 6, Va.
Dr. Werner Buchholz, IBM Development Lab., P. O. Box 390, Poughkeepsie, N. Y.
Dr. Arnold A. Cohen, Remington Rand Univac, Univac Park, St. Paul 16, Minn.
Mr. Frank E. Heart, Lincoln Laboratory, P. O. Box 73, Lexington, Mass.

AFIPS Representative to IFIP
Mr. I. L. Auerbach, Auerbach Corporation, 1634 Arch Street, Philadelphia 3, Pa.

Simulation Council Observer
Mr. J. E. Sherman, Simulation Council, Inc., Sunnyvale, Calif.

STANDING COMMITTEE CHAIRMEN

Finance
Dr. R. R. Johnson, General Electric Co., P. O. Drawer 270, Phoenix, Ariz.

Planning
Dr. Morris Rubinoff, Moore School of Elec. Engrg., 200 South 33rd St., Philadelphia 4, Pa.

Admissions
Dr. Bruce Gilchrist, IBM Corporation, 590 Madison Ave., New York 22, N. Y.

